PRENTICE-HALL PHYSICS SERIES
Vector and Tensor Analysis, by A. P. Wills.
A Survey Course in Physics, by Carl F. Eyring.
Elements of Nuclear Physics, by Franco Rasetti.
Principles of Electric and Magnetic Measurements, by P. Vigoureux
and C. E. Webb.
Atomic Spectra and Atomic Structure, by Gerhard Herzberg.
Electricity and Magnetism, by A. W. Hirst.
Properties of Matter, by F. C. Champion and N. Davy.
Thermodynamics, by Enrico Fermi.
College Physics, by Henry A. Perkins.
College Physics, Abridged, by Henry A. Perkins.
Introductory Quantum Mechanics, by Vladimir Rojansky.
Procedures in Experimental Physics, by John Strong, in collaboration
with H. Victor Neher, Albert E. Whitford, C. Hawley Cartwright, and
Roger Hayward.
Fundamental Principles of Physics, by Herman G. Heil and Willard H.
Bennett.
The Structure of Steel Simply Explained, by Eric N. Simons and Edwin
Gregory.
Molecular Spectra and Molecular Structure: Part I, Diatomic Molecules,
by Gerhard Herzberg.
Physical Meteorology, by John G. Albright.
Vibrational Spectra and Structure of Polyatomic Molecules, by Ta-You Wu.
An Introduction to the Theory of Relativity, by Peter G. Bergmann.
INTRODUCTION
TO THE
THEORY OF RELATIVITY
by
PETER GABRIEL BERGMANN
MEMBER, INSTITUTE FOR ADVANCED STUDY, 1936-1941
ASSISTANT PROFESSOR OF PHYSICS
BLACK MOUNTAIN COLLEGE
WITH A FOREWORD
BY
ALBERT EINSTEIN
New York
PRENTICE-HALL, INC.
COPYRIGHT, 1942, BY
PRENTICE-HALL, INC.
70 Fifth Avenue, New York
ALL RIGHTS RESERVED. NO PART OF THIS BOOK
MAY BE REPRODUCED IN ANY FORM, BY MIMEOGRAPH
OR ANY OTHER MEANS, WITHOUT PERMISSION
IN WRITING FROM THE PUBLISHERS.
First Printing, May 1942
Second Printing, January 1946
Third Printing, March 1947
Fourth Printing, August [year illegible]
Foreword by Albert Einstein
Although a number of technical expositions of the theory of relativity
have been published, Dr. Bergmann's book seems to me to satisfy a
definite need. It is primarily a textbook for students of physics and
mathematics, which may be used either in the classroom or for
individual study. The only prerequisites for reading the book are a
familiarity with calculus and some knowledge of differential
equations, classical mechanics, and electrodynamics.
This book gives an exhaustive treatment of the main features of the
theory of relativity which is not only systematic and logically
complete, but also presents adequately its empirical basis. The student
who makes a thorough study of the book will master the mathematical
methods and physical aspects of the theory of relativity and will be
in a position to interpret for himself its implications. He will also be
able to understand, with no particular difficulty, the literature of the
field.
I believe that more time and effort might well be devoted to the
systematic teaching of the theory of relativity than is usual at present
at most universities. It is true that the theory of relativity,
particularly the general theory, has played a rather modest role in the
correlation of empirical facts so far, and it has contributed little to
atomic physics and our understanding of quantum phenomena. It is
quite possible, however, that some of the results of the general theory of
relativity, such as the general covariance of the laws of nature and their
nonlinearity, may help to overcome the difficulties encountered at
present in the theory of atomic and nuclear processes. Apart from this,
the theory of relativity has a special appeal because of its inner
consistency and the logical simplicity of its axioms.
Much effort has gone into making this book logically and pedagogically
satisfactory, and Dr. Bergmann has spent many hours with me
which were devoted to this end. It is my hope that many students
will enjoy the book and gain from it a better understanding of the
accomplishments and problems of modern theoretical physics.
A. EINSTEIN
The Institute for Advanced Study
PRINTED IN THE UNITED STATES OF AMERICA
Preface
This book presents the theory of relativity for students of physics
and mathematics who have had no previous introduction to the
subject and whose mathematical training does not go beyond the
fields which are necessary for studying classical theoretical physics.
The specialized mathematical apparatus used in the theory of
relativity, tensor calculus and Ricci calculus, is, therefore, developed
in the book itself. The main emphasis of the book is on the development
of the basic ideas of the theory of relativity; it is these basic ideas
rather than special applications which give the theory its importance
among the various branches of theoretical physics.
The material has been divided into three parts, the special theory of
relativity, the general theory of relativity, and a report on unified
field theories. The three parts form a unit. The author realizes that
many students are interested in the theory of relativity mainly for its
applications to atomic and nuclear physics. It is hoped that these
readers will find in the first part, on the special theory of relativity, all
the information which they require. Those readers who do not intend
to go beyond the special theory of relativity may omit one section of
Chapter V (p. 67) and all of Chapter VIII; these passages contain
material which is needed only for the development of the general theory
of relativity.
The second part treats the general theory of relativity, including the
work by Einstein, Infeld, and Hoffmann on the equations of motion.
The third part deals with several attempts to overcome defects in the
general theory of relativity. None of these theories has been
completely satisfactory. Nevertheless, the author believes that this
report rounds out the discussion of the general theory of relativity by
indicating possible directions of future research. However, the third
part may be omitted without destroying the unity of the remainder.
The author wishes to express his appreciation for the help of
Professor Einstein, who read the whole manuscript and made many valuable
suggestions. Particular thanks are due to Dr. and Mrs. Fred Fender,
who read the manuscript carefully and suggested many stylistic and
other improvements. The figures were drawn by Dr. Fender. Margot
Bergmann read the manuscript, suggested improvements, and did
almost all of the technical work connected with the preparation of the
manuscript. The friendly cooperation of the Editorial Department of
Prentice-Hall, Inc. is gratefully acknowledged.
P. G. B.
Contents
Foreword by Albert Einstein
Preface
Introduction
Part I
THE SPECIAL THEORY OF RELATIVITY
CHAPTER
I. Frames of Reference, Coordinate Systems, and Coordinate
Transformations
Coordinate transformations not involving time. Coordinate
transformations involving time.
II. Classical Mechanics
The law of inertia, inertial systems. Galilean transformations.
The force law and its transformation properties.
III. The Propagation of Light
The problem confronting classical optics. The corpuscular
hypothesis. The transmitting medium as the frame of reference.
The absolute frame of reference. The experiment of Michelson and
Morley. The ether hypothesis.
IV. The Lorentz Transformation
The relative character of simultaneity. The length of scales.
The rate of clocks. The Lorentz transformation. The "kinematic"
effects of the Lorentz transformation. The proper time
interval. The relativistic law of the addition of velocities. The
proper time of a material body. Problems.
V. Vector and Tensor Calculus in an n Dimensional Continuum
Orthogonal transformations. Transformation determinant.
Improved notation. Vectors. Vector analysis. Tensors. Tensor
analysis. Tensor densities. The tensor density of Levi-Civita.
Vector product and curl. Generalization. n dimensional
continuum. General transformations. Vectors. Tensors. Metric
tensor, Riemannian spaces. Raising and lowering of indices.
Tensor densities, Levi-Civita tensor density. Tensor analysis.
Geodesic lines. Minkowski world and Lorentz transformations.
Paths, world lines. Problems.
VI. Relativistic Mechanics of Mass Points
Program for relativistic mechanics. The form of the conservation
laws. A model example. Lorentz covariance of the new conservation
laws. Relation between energy and mass. The Compton
effect. Relativistic analytical mechanics. Relativistic force.
Problems.
VII. Relativistic Electrodynamics
Maxwell's field equations. Preliminary remarks on transformation
properties. The representation of four dimensional tensors
in three plus one dimensions. The Lorentz invariance of Maxwell's
field equations. The physical significance of the transformation
laws. Gauge transformations. The ponderomotive equations.
VIII. The Mechanics of Continuous Matter
Introductory remarks. Nonrelativistic treatment. A special
coordinate system. Tensor form of the equations. The stress-energy
tensor of electrodynamics. Problem.
IX. Applications of the Special Theory of Relativity
Experimental verifications of the special theory of relativity.
Charged particles in electromagnetic fields. The field of a rapidly
moving particle. Sommerfeld's theory of the hydrogen fine
structure. De Broglie waves. Problems.
Part II
THE GENERAL THEORY OF RELATIVITY
X. The Principle of Equivalence
Introduction. The principle of equivalence. Preparations for a
relativistic theory of gravitation. On inertial systems.
Einstein's "elevator." The principle of general covariance. The
nature of the gravitational field.
XI. The Riemann-Christoffel Curvature Tensor
The characterization of Riemannian spaces. The integrability of
the affine connection. Euclidicity and integrability. The criterion
of integrability. The commutation law for covariant
differentiation, the tensor character of $R_{ikl}{}^{n}$. Properties of the
curvature tensor. The covariant form of the curvature tensor.
Contracted forms of the curvature tensor. The contracted Bianchi
identities. The number of algebraically independent components of
the curvature tensor.
XII. The Field Equations of the General Theory of Relativity
The ponderomotive equations of the gravitational field. The
representation of matter in the field equations. The differential
identities. The field equations. The linear approximation and
the standard coordinate conditions. Solutions of the linearized
field equations. The field of a mass point. Gravitational waves.
The variational principle. The combination of the gravitational
and electromagnetic fields. The conservation laws in the general
theory of relativity.
XIII. Rigorous Solutions of the Field Equations of the General
Theory of Relativity
The solution of Schwarzschild. The "Schwarzschild singularity."
The field of an electrically charged mass point. The solutions
with rotational symmetry.
XIV. The Experimental Tests of the General Theory of Relativity
The advance of the perihelion of Mercury. The deflection of light
in a Schwarzschild field. The gravitational shift of spectral lines.
XV. The Equations of Motion in the General Theory of Relativity
Force laws in classical physics and in electrodynamics. The law
of motion in the general theory of relativity. The approximation
method. The first approximation and the mass conservation
law. The second approximation and the equations of motion.
Conclusion. Problem.
Part III
UNIFIED FIELD THEORIES
XVI. Weyl's Gauge-Invariant Geometry
The geometry. Analysis in gauge invariant geometry. Physical
interpretation of Weyl's geometry. Weyl's variational principle.
The equations $G_{\mu\nu} = 0$.
XVII. Kaluza's Five Dimensional Theory and the Projective Field
Theories
Kaluza's theory. A four dimensional formalism in a five
dimensional space. Analysis in the p-formalism. A special type of
coordinate system. Covariant formulation of Kaluza's theory.
Projective field theories.
XVIII. A Generalization of Kaluza's Theory
Possible generalizations of Kaluza's theory. The geometry of the
closed, five dimensional world. Introduction of the special
coordinate system. The derivation of field equations from a
variational principle. Differential field equations.
Index
Introduction
Almost all the laws of physics deal with the behavior of certain objects
in space in the course of time. The position of a body or the location
of an event can be expressed only as a location relative to some other
body suitable for that purpose. For instance, in an experiment with
Atwood's machine, the velocities and accelerations of the weights are
referred to the machine itself, that is, ultimately to the earth. An
astronomer may refer the motion of the planets to the center of gravity
of the sun. All motions are described as motions relative to some
reference body.
We imagine that conceptually, at least, a framework of rods which
extends into space can be rigidly attached to the reference body. Using
this conceptual framework as a Cartesian coordinate system in three
dimensions, we characterize any location by three numbers, the
coordinates of that space point. Such a conceptual framework, rigidly
connected with some material body or other well-defined point, is often
called a frame of reference.
Some bodies may be suitable as reference bodies, others may not.
Even before the theory of relativity was conceived, the problem of
selecting suitable frames of reference played an important part in the
development of science. Galileo, the father of post-medieval physics,
considered the choice of the heliocentric frame to be so important that
he risked imprisonment and even death in his efforts to have the new
frame of reference accepted by his contemporaries. In the last analysis,
it was the choice of the reference body which was the subject of his
dispute with the authorities.
Later, when Newton gave a comprehensive presentation of the physics
of his time, the heliocentric frame of reference had been generally
accepted. Still, Newton felt that further discussion was necessary. To
show that some frames of reference were more suitable for the
description of nature than others, he devised the famous pail experiment:
He filled a pail with water. By twisting the rope which supported the
pail, he made it rotate around its axis. As the water gradually began to
participate in the rotation, its surface changed from a plane to a
paraboloid. After the water had gained the same speed of rotation as the
pail, he stopped the pail. The water slowed down and eventually came
to complete rest. At the same time, its surface resumed the shape of
a plane.
The description given above is based on a frame of reference
connected with the earth. The law governing the shape of the water's
surface could be formulated thus: The surface of the water is a plane
whenever the water does not rotate. It is a paraboloid when the water
rotates. The state of motion of the pail has no influence on the shape
of the surface.
Now let us describe the whole experiment in terms of a frame of
reference rotating relatively to the earth with a constant angular velocity
equal to the greatest velocity of the pail. At first, the rope, the pail,
and the water "rotate" with a certain constant angular velocity with
respect to our new frame of reference, and the surface of the water is a
plane. Then the rope, and in turn the pail, is "stopped," and the water
gradually "slows down," while its surface becomes a paraboloid. After
the water has come to a "complete rest," its surface still a paraboloid,
the rope, and in turn the pail, is again made to "rotate" relatively to
our frame of reference (that is, stopped with respect to the earth); the
water gradually begins to participate in the "rotation," while its surface
flattens out. In the end, the whole apparatus is "rotating" with its
former angular velocity, and the surface of the water is again a plane.
With respect to this frame of reference, the law would have to be
formulated like this: Only when the water "rotates" with a certain angular
velocity, is its surface a plane. The deviation from a plane increases
with the deviation from this particular state of motion. The state of
rest produces also a paraboloid. Again the rotation of the pail is
immaterial.
Newton's pail experiment brings out very clearly what is meant by
"suitable" frame of reference. We can describe nature and we can
formulate its laws using whatever frame of reference we choose. But
there may exist a frame or frames in which the laws of nature are
fundamentally simpler, that is, in which the laws of nature contain fewer
elements than they would otherwise. Take the instance of Newton's
rotating pail. If our description of nature were based on the frame of
reference connected with the pail, many physical laws would have to
contain an additional element, the angular velocity ω of the pail relative
to a "more suitable" frame of reference, let us say to the earth.
The laws of motion of the planets become basically simpler when they
are expressed in terms of the heliocentric frame of reference instead of
the geocentric frame. That is why the description of Copernicus and
Galileo won out over that of Ptolemy, even before Kepler and Newton
succeeded in formulating the underlying laws.
Once it was clearly recognized that the choice of a frame of reference
determined the form of a law of nature, investigations were carried out
which established the effect of this choice in a mathematical form.
Mechanics was the first branch of physics to be expressed in a
complete system of mathematical laws. Among all the frames of reference
conceivable, there exists a set of frames with respect to which the law
of inertia takes its familiar form: In the absence of forces, the space
coordinates of a mass point are linear functions of time. These frames
of reference are called inertial systems. It was found that all of the
laws of mechanics take the same form when stated in terms of any one
of these inertial systems. Another frame of reference necessitates a
more involved physical and mathematical description, for example, the
frame of reference connected with Newton's rotating pail. The
characterization of the motions of mass points not subject to forces is
possible in terms of this frame of reference, but the mathematical form of
the law of inertia is involved. The space coordinates are not linear
functions of time.
Since the laws of mechanics take the same form in all frames of
reference which are inertial systems, all inertial systems are equivalent
from the point of view of mechanics. We can find out whether a given body
is "accelerated" or "unaccelerated" by comparing its motion with that
of some mass point which is not subject to any forces. But whether a
body is "at rest" or "in uniform motion" depends entirely on the inertial
system used for the description; the terms "at rest" and "in uniform
motion" have no absolute meaning. The principle that all inertial
systems are equivalent for the description of nature is called the
principle of relativity.
When Maxwell developed the equations of the electromagnetic field,
these equations were apparently incompatible with the principle of
relativity. For, according to this theory, electromagnetic waves in empty
space should propagate with a universal, constant velocity c of about
$3 \times 10^{10}$ cm/sec, and this, it appeared, could not be true with respect
to both of two different inertial systems which were moving relatively
to each other. The one frame of reference with respect to which the
speed of electromagnetic radiation would be the same in all directions
could be used for the definition of "absolute rest" and of "absolute
motion." A number of experimenters tried hard to find this frame of
reference and to determine the earth's motion with respect to it.
All these attempts, however, were unsuccessful. On the contrary, all
experiments seemed to suggest that the principle of relativity applied
to the laws of electrodynamics as well as to those of mechanics. H. A.
Lorentz proposed a new theory, in which he accepted the existence of
one privileged frame of reference, and at the same time explained why
this frame could not be discovered by experimental methods. But he
had to introduce a number of assumptions which could not have been
checked by any conceivable experiment. To this extent his theory was
not very satisfactory. Einstein finally recognized that only a revision
of our fundamental ideas about space and time would resolve the
impasse between theory and experiment. Once this revision had been
made, the principle of relativity was extended to the whole of physics.
This is now called the special theory of relativity. It establishes the
fundamental equivalence of all inertial systems. It preserves fully their
privileged position among all conceivable frames of reference. The
so-called general theory of relativity analyzes and thereby destroys this
privileged position and is able to give a new theory of gravitation.
In this book we shall first discuss the role of different frames of
reference, from a classical point of view, in mechanics and to some extent
in electrodynamics. Only when the student understands fully the
deadlock between theoretical conclusions and experimental results in
classical electrodynamics can he appreciate the necessity of revising
classical physics along relativistic lines. Once the new ideas of space
and time are grasped, "relativistic mechanics" and "relativistic
electrodynamics" are easily understood.
The second part of this book is devoted to the general theory of
relativity, while the third part discusses some recent attempts to
extend the theory of gravitation to the field of electrodynamics.
PART I
The Special Theory of Relativity
CHAPTER I
Frames of Reference, Coordinate Systems,
and Coordinate Transformations
We have spoken of frames of reference and have mentioned Cartesian
coordinate systems. In this chapter we shall examine more closely the
relationships between different frames of reference and different
coordinate systems.
Coordinate transformations not involving time. As a specific
instance, let us consider a frame of reference connected rigidly with the
earth, that is, a geocentric frame of reference. In order to express
quantitatively the location of a point relative to the earth, we introduce
a coordinate system. We choose a point of origin, let us say the center
of the earth, and directions for the three axes; for instance, the X-axis
may go from the earth's center through the intersection of the equator
and the Greenwich meridian, the Y-axis through the intersection of the
equator and the 90°E meridian, and the Z-axis through the North Pole.
The location of any point is then given by three real numbers, the
coordinates of that point. The motion of a point is completely described
if we express the three point coordinates as functions of time. A point
is at rest relatively to our frame of reference if these three functions are
constant.
Without abandoning the earth as the body with which our frame of
reference is rigidly connected, we can introduce another coordinate
system. We may, for instance, choose as the point of origin some
well-defined point on the earth's surface, let us say one of the markers of
the United States Coast and Geodetic Survey; and as the direction for the
X-axis the direction due East; for the Y-axis, the direction due North;
and for the Z-axis, the direction straight up, away from the earth's
center, the earth assumed to be a sphere.
The relationship between the two coordinate systems is completely
determined if the coordinates of any given point with respect to one
coordinate system are known functions of its coordinates with respect to
the other coordinate system. Let us call the first coordinate system S
and the second coordinate system S', and the coordinates of a certain
point P with respect to S (x, y, z) and the coordinates of the same point
with respect to S' (x', y', z'). Then x', y', and z' are connected with
x, y, and z by equations of the form:
$$
\begin{aligned}
x' &= c_{11}x + c_{12}y + c_{13}z + x'_0,\\
y' &= c_{21}x + c_{22}y + c_{23}z + y'_0,\\
z' &= c_{31}x + c_{32}y + c_{33}z + z'_0.
\end{aligned}
\tag{1.1}
$$
$(x'_0, y'_0, z'_0)$ are the coordinates of the point of origin of S with
respect to S'. The constants $c_{ik}$ are the cosines of the angles between
the axes of S and S', $c_{11}$ referring to the angle between the X- and the
X'-axis, $c_{12}$ to the angle between the Y- and the X'-axis, $c_{21}$ to the
angle between the X- and the Y'-axis, and so forth.
The transition from one coordinate system to another is called a
coordinate transformation, and the equations connecting the point
coordinates of the two coordinate systems are called transformation
equations.
A coordinate system is necessary not only for the description of
locations, but also for the representation of vectors. Let us consider some
vector field, for example, an electrostatic field, in the neighborhood of
the point P. The value and direction of the field strength E at P is
completely determined when we know the components of E with respect
to some stated coordinate system S. Let us call the components of E
at P with respect to S, $E_x$, $E_y$, and $E_z$. The components of E at P
with respect to another system, for instance S', can be computed if we
know the transformation equations defining the coordinate
transformation S into S'. These new components, $E'_x$, $E'_y$, and $E'_z$, are
independent of the translation of the point of origin, that is, the
constants $x'_0$, $y'_0$, and $z'_0$ of (1.1). $E'_x$ is the sum of the projections
of $E_x$, $E_y$, and $E_z$ on the X'-axis, and $E'_y$ and $E'_z$ are determined
similarly,
$$
\begin{aligned}
E'_x &= c_{11}E_x + c_{12}E_y + c_{13}E_z,\\
E'_y &= c_{21}E_x + c_{22}E_y + c_{23}E_z,\\
E'_z &= c_{31}E_x + c_{32}E_y + c_{33}E_z.
\end{aligned}
$$
A law which expresses the components of a certain quantity at a point
in terms of the components of the same quantity at the same point
with respect to another coordinate system is called a transformation law.
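The difference between the transformation equations (1.1) for point coordinates and the transformation law for vector components can be illustrated numerically. The following is only a minimal sketch under stated assumptions: S' is taken to arise from S by a rotation of the axes about the common Z-axis, and the function names, the rotation angle, and the origin shift are hypothetical choices made for illustration, not part of the text.

```python
import math

def rotation_about_z(angle):
    """Direction cosines c_ik for axes rotated by `angle` (radians) about Z.

    c[i][k] is the cosine of the angle between the k-th axis of S
    and the i-th axis of S' (assumed special case of eq. (1.1)).
    """
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0],
            [-s, c, 0.0],
            [0.0, 0.0, 1.0]]

def transform_point(c, origin_shift, p):
    """Eq. (1.1): x'_i = sum_k c_ik x_k + (x'_0)_i."""
    return [sum(c[i][k] * p[k] for k in range(3)) + origin_shift[i]
            for i in range(3)]

def transform_vector(c, e):
    """Vector components: E'_i = sum_k c_ik E_k, with no translation term."""
    return [sum(c[i][k] * e[k] for k in range(3)) for i in range(3)]

c = rotation_about_z(math.radians(30.0))
shift = [1.0, -2.0, 0.5]          # hypothetical (x'_0, y'_0, z'_0)
p1 = [2.0, 0.0, 1.0]
p2 = [3.0, 4.0, 1.0]

# A vector, represented as the difference of two points in S ...
e = [p2[k] - p1[k] for k in range(3)]
# ... transforms without the origin shift, which cancels in the difference.
p1_prime = transform_point(c, shift, p1)
p2_prime = transform_point(c, shift, p2)
e_from_points = [p2_prime[k] - p1_prime[k] for k in range(3)]
e_direct = transform_vector(c, e)

print(e_from_points)
print(e_direct)
```

Both printed lists should agree to rounding error, which is the sense in which the vector components are independent of the translation of the point of origin.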
Coordinate transformations involving time. We have thus far
considered only transformations which lead from one coordinate system to
another one rigidly connected with the same reference body, such as the
earth. But coordinate transformation offers an important method for
investigating the relationship between two different frames of reference
which move relatively to each other. In such a case, we represent each
of the two frames by one coordinate system.
Let us compare a frame of reference rigidly connected with the earth
and another one connected with Newton's pail, which we assume is
rotating with constant angular velocity. We can introduce two
coordinate systems which enable us to describe quantitatively the location
of any point with respect to either frame of reference. Let us call these
two coordinate systems S (this S is not identical with the former S)
and S*, respectively, and let us choose the points of origin so that they
both lie on the axis of the pail and coincide; the Z-axis and the Z*-axis
may be identical and pointing straight up. If the pail rotates with a
constant angular velocity ω relative to the earth, and if at the time
t = 0 the X-axis is parallel to the X*-axis, the coordinate
transformation equations take the form
$$
\begin{aligned}
x^* &= \cos\omega t \cdot x + \sin\omega t \cdot y,\\
y^* &= -\sin\omega t \cdot x + \cos\omega t \cdot y,\\
z^* &= z.
\end{aligned}
\tag{1.2}
$$
(Fig. 1)
Eqs. (1.2) have a form similar to eqs. (1.1), except that the cosines are
no longer constant, but functions of time. The relative motion of the
two frames of reference expresses itself in the functional dependence of
the $c_{ik}$ on time.
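The time dependence of the coefficients in eqs. (1.2) can be made concrete with a short computation. This is a sketch only; the function name `to_rotating`, the value of the angular velocity, and the sample point are assumptions chosen for illustration.

```python
import math

def to_rotating(x, y, z, t, omega):
    """Eqs. (1.2): coordinates in S* of the point (x, y, z) of S at time t."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    return (c * x + s * y, -s * x + c * y, z)

omega = 0.5                       # rad/sec, hypothetical value
point_in_S = (1.0, 0.0, 2.0)      # a point at rest in S

# The same point, at rest in S, has time-dependent coordinates in S*.
for t in (0.0, 1.0, 2.0, math.pi):
    xs, ys, zs = to_rotating(*point_in_S, t, omega)
    radius = math.hypot(xs, ys)   # distance from the common Z-axis
    print(f"t = {t:5.3f}  x* = {xs:+.4f}  y* = {ys:+.4f}  "
          f"z* = {zs:.1f}  radius = {radius:.4f}")
```

At t = 0 the two systems coincide; at later times the point describes a circle about the Z*-axis in S*, while its distance from the axis and its z-coordinate remain unchanged.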
Eqs. (1.2) express the relationship between two frames of reference
which are rotating relatively to each other. Very often we are interested
in the relationship between two frames of reference which are in a state
of uniform, translatory motion relative to each other. In that case, it
is convenient to choose the two coordinate systems S and S* so that
their corresponding axes are parallel to each other and so that the points
of origin coincide at the time t = 0. The transformation equations
have the form:
$$
\begin{aligned}
x^* &= x - v_x t,\\
y^* &= y - v_y t,\\
z^* &= z - v_z t,
\end{aligned}
\tag{1.3}
$$
where $v_x$, $v_y$, and $v_z$ are the components of the velocity of S*
relatively to S.
The form of the transformation equations (1.2) and (1.3) depends, of
course, on the relative motion of the two frames of reference, but it
also depends on certain assumptions regarding the nature of time and
space: We assume that it is possible to define a time t independently of
any particular frame of reference, or, in other words, that it is possible
to build clocks which are not affected by their state of motion. This
assumption is expressed in our transformation equations by the absence
of a transformation equation for t. If we wish, we can add the equation
$$
t^* = t, \tag{1.4}
$$
expressing the universal character of time explicitly.
Fig. 1. The coordinate system S* with the coordinates (x*, y*, z*) rotates
relatively to the coordinate system S with the coordinates (x, y, z) with the
angular velocity ω.
The other assumption concerns length measurements. We assume
that the distance between two points, which may be particles, at a
given time is quite independent of any particular frame of reference;
that is, we assume that we can construct rigid measuring rods whose
length is independent of their state of motion. Eqs. (1.3) show with
particular clarity how this assumption is expressed by the form of the
transformation equations. For the distance between two points $P_1$ and
$P_2$ with the coordinates $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$ is
$$
s_{12} = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}, \tag{1.5}
$$
and obviously
$$
\sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}
= \sqrt{(x^*_2 - x^*_1)^2 + (y^*_2 - y^*_1)^2 + (z^*_2 - z^*_1)^2}
$$
is satisfied for any time t.
We shall have to consider these assumptions again at a later time.
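The statement that the distance (1.5) is unchanged by the transformation (1.3) at every time t can be checked numerically. The sketch below assumes arbitrary sample values for the relative velocity and for the two points; the function names are hypothetical.

```python
import math

def galilean(p, t, v):
    """Eqs. (1.3): starred coordinates of the point p of S at time t,
    where v is the velocity of S* relative to S."""
    return tuple(p[i] - v[i] * t for i in range(3))

def distance(p, q):
    """Eq. (1.5): Euclidean distance between two points."""
    return math.sqrt(sum((q[i] - p[i]) ** 2 for i in range(3)))

v = (3.0, -1.0, 0.5)        # hypothetical velocity of S* relative to S
p1 = (0.0, 0.0, 0.0)
p2 = (1.0, 2.0, 2.0)        # distance from p1 is 3

for t in (0.0, 1.0, 10.0):
    # Both points are transformed at the same time t, in accordance with (1.4).
    s_star = distance(galilean(p1, t, v), galilean(p2, t, v))
    print(t, distance(p1, p2), s_star)
```

The term $v_x t$ cancels in each coordinate difference, so the two distances agree for every t. Note that the check relies on evaluating both points at the same universal time, which is precisely the first assumption stated above.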
CHAPTER II
Classical Mechanics
The law of inertia, inertial systems. The branch of physics which
from the first was most consistently developed as an experimental science
was Galilean-Newtonian mechanics. The first law to be formulated
was the law of inertia: Bodies when removed from interaction with other
bodies will continue in their states of rest or straight-line uniform motion.
In other words, the motion of such bodies is unaccelerated.
To express the law of inertia in mathematical form, we designate the
location of a body by its three coordinates, x, y, and z. When a body is
not at rest, its coordinates are functions of time. According to the law
of inertia, the second time derivatives of these three functions, the
accelerations, vanish when the body is not subjected to forces, that is,
$$
\ddot{x} = 0, \qquad \ddot{y} = 0, \qquad \ddot{z} = 0. \tag{2.1}
$$
We use the usual notation $\dot{x}$ for $dx/dt$ and $\ddot{x}$ for $d^2x/dt^2$. The
first integral of eqs. (2.1) expresses the constancy of the three velocity
components,
$$
\dot{x} = u_x, \qquad \dot{y} = u_y, \qquad \dot{z} = u_z. \tag{2.2}
$$
The equations expressing the law of inertia contain coordinates and
refer, therefore, to a certain coordinate system. As long as this
coordinate system is not specified, the italicized statement does not have
a precise meaning. For, given any body, we can always introduce a
frame of reference with respect to which it is at rest and, therefore,
unaccelerated. The real assertion is, rather: There exists a coordinate
system (or coordinate systems) with respect to which all bodies not subjected
to forces are unaccelerated. Coordinate systems with this property and
the frames of reference represented by them are called inertial systems.
Of course, not all frames of reference are inertial systems. For
instance, let us start out from an inertial coordinate system S, and carry
out a transformation (1.2), leading to S*, a system rotating with a
constant angular velocity ω relative to S. In order to obtain the
transformation laws of eqs. (2.1) and (2.2), we differentiate the
transformation equations (1.2) once and then a second time with respect
to t. The resulting equations contain x, y, z, x*, y*, z*, and the first
and second time derivatives of these quantities.
We assumed that the coordinate system S is an inertial system. We
substitute, therefore, for $\ddot{x}$, $\ddot{y}$, and $\ddot{z}$ and for $\dot{x}$, $\dot{y}$, and $\dot{z}$ the
expressions (2.1) and (2.2), respectively. Thus, we obtain, for the starred
coordinates and their derivatives
$$
\begin{aligned}
\dot{x}^* &= \omega y^* + u_x \cos\omega t + u_y \sin\omega t,\\
\dot{y}^* &= -\omega x^* + u_y \cos\omega t - u_x \sin\omega t,\\
\dot{z}^* &= u_z,
\end{aligned}
\tag{2.3}
$$
and
$$
\begin{aligned}
\ddot{x}^* &= \omega^2 x^* + 2\omega \dot{y}^*,\\
\ddot{y}^* &= \omega^2 y^* - 2\omega \dot{x}^*,\\
\ddot{z}^* &= 0.
\end{aligned}
\tag{2.4}
$$
It turns out that in the coordinate system $* the second tune deriva
tives do not all vanish. Occasionally it is desirable to work with frames
of reference in which accelerations occur which are not caused by real
interactions between bodies. These accelerations, multiplied by the
masses, are treated like real forces, often called "transport forces,"
"inertial forces," and so forth. In spite of these names, these expres
sions are not actual forces; they merely appear in the equations formally
in the same way as forces do. In our ease, the first terms, ui'x*, </y*,
multiplied by the mass, arc called "centrifugal forces," and the last
terms, also multiplied by the mass, arc the socalled Coriolis forces.
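The appearance of the centrifugal and Coriolis terms in eqs. (2.4) can be checked numerically. The following Python sketch is only an illustration, not part of the original argument. It assumes that transformation (1.2) is a rotation about the z-axis, x* = x cos ωt + y sin ωt, y* = -x sin ωt + y cos ωt, a form that is consistent with eqs. (2.3) but is not restated in this chapter. All function names and numerical values are hypothetical choices.

```python
import math

def rotating_coords(x, y, omega, t):
    """Assumed form of transformation (1.2): rotation about the z-axis."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    return x * c + y * s, -x * s + y * c

def free_particle_star(t, x0, y0, ux, uy, omega):
    """Force-free motion in S (eqs. 2.1, 2.2), expressed in S*."""
    return rotating_coords(x0 + ux * t, y0 + uy * t, omega, t)

def derivatives(f, t, h=1e-4):
    """Central finite differences for the first and second time derivatives."""
    (xm, ym), (x0, y0), (xp, yp) = f(t - h), f(t), f(t + h)
    vel = ((xp - xm) / (2 * h), (yp - ym) / (2 * h))
    acc = ((xp - 2 * x0 + xm) / h**2, (yp - 2 * y0 + ym) / h**2)
    return (x0, y0), vel, acc

omega = 0.7  # arbitrary angular velocity of S* relative to S
path = lambda t: free_particle_star(t, 1.0, -2.0, 0.3, 0.5, omega)
(xs, ys), (vxs, vys), (axs, ays) = derivatives(path, t=1.3)

# Right-hand sides of eqs. (2.4): centrifugal plus Coriolis terms.
ax_expected = omega**2 * xs + 2 * omega * vys
ay_expected = omega**2 * ys - 2 * omega * vxs

print(abs(axs - ax_expected) < 1e-5, abs(ays - ay_expected) < 1e-5)
```

The body is unaccelerated in S, yet its numerically differentiated acceleration in S* is different from zero and agrees with the right-hand sides of eqs. (2.4) to within the accuracy of the finite differences.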
On the other hand, there are also types of coordinate transformations
which leave the form of the law of inertia (2.1) unchanged. As a case
in point, we shall consider first a transformation which involves no
transition to a new frame of reference, of the type (1.1). The
differentiation of eqs. (1.1) with substitution of $\ddot{x}$, $\dot{x}$,
and so forth, from eqs. (2.1) and (2.2) produces the equations
$$
\begin{aligned}
\dot{x}' &= c_{11} u_x + c_{12} u_y + c_{13} u_z = u'_x, \\
\dot{y}' &= c_{21} u_x + c_{22} u_y + c_{23} u_z = u'_y, \\
\dot{z}' &= c_{31} u_x + c_{32} u_y + c_{33} u_z = u'_z,
\end{aligned}
\tag{2.5}
$$
and
$$ \ddot{x}' = \ddot{y}' = \ddot{z}' = 0. \tag{2.6} $$
The velocity components transform just as we would expect a vector
to transform, and eqs. (2.1) are reproduced in the new coordinates
without change.
Another transformation preserving the law of inertia is the type (1.3).
It corresponds to the transition from one frame of reference to another
one which is in a state of straight-line, uniform motion relative to the
first frame. Taking the second derivatives of eqs. (1.3), we obtain
$$ \ddot{x}^* = \ddot{x}, \qquad \ddot{y}^* = \ddot{y}, \qquad \ddot{z}^* = \ddot{z}; \tag{2.7} $$
and if the motion of the body satisfies the law of inertia (2.1) in the
coordinate system S, we have also:
$$ \ddot{x}^* = \ddot{y}^* = \ddot{z}^* = 0, \tag{2.8} $$
while the first derivatives of the starred coordinates (if eqs. (2.2) apply
to the unstarred coordinates) are
$$
\begin{aligned}
\dot{x}^* &= u_x - v_x = u^*_x, \\
\dot{y}^* &= u_y - v_y = u^*_y, \\
\dot{z}^* &= u_z - v_z = u^*_z.
\end{aligned}
\tag{2.9}
$$
Eq. (2.8) shows that the law of inertia holds in the new system as well
as in the old one. Eqs. (2.9) express the fact that the velocity
components in the new coordinate system S* are equal to those in the old
system minus the components of the relative velocity of the two
coordinate systems themselves. This law is often referred to as the
(classical) law of the addition of velocities.
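The classical law of the addition of velocities can be illustrated with a few lines of Python. This is a minimal sketch which assumes that transformation (1.3) has the form x* = x - v_x t, and so forth, as suggested by eqs. (2.9). The names `free_particle`, `boost`, and `star_position` and all numbers are hypothetical.

```python
def free_particle(t, x0, u):
    """Force-free motion in S: the integral of eqs. (2.2)."""
    return tuple(xi + ui * t for xi, ui in zip(x0, u))

def boost(position, t, v):
    """Assumed form of transformation (1.3): x* = x - v_x t, and so forth."""
    return tuple(xi - vi * t for xi, vi in zip(position, v))

x0 = (0.0, 1.0, 2.0)   # initial position in S (arbitrary)
u = (3.0, -1.0, 0.5)   # velocity of the body in S
v = (1.0, 2.0, 0.0)    # velocity of S* relative to S

def star_position(t):
    return boost(free_particle(t, x0, u), t, v)

p0, p1, p2 = star_position(0.0), star_position(1.0), star_position(2.0)

# The motion is linear in t, so unit-step differences give exact derivatives.
u_star = tuple(b - a for a, b in zip(p0, p1))
acceleration = tuple(c - 2 * b + a for a, b, c in zip(p0, p1, p2))

print(u_star)        # (2.0, -3.0, 0.5), that is, u - v as in eqs. (2.9)
print(acceleration)  # (0.0, 0.0, 0.0), as in eq. (2.8)
```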
Frames of reference and coordinate systems in which the law (2.1) is
valid are inertial systems. All Cartesian coordinate systems which are
at rest relative to an inertial coordinate system are themselves inertial
systems. Cartesian coordinate systems belonging to a frame of reference
which is in a state of straight-line, uniform motion relative to an
inertial system are also inertial systems. On the other hand, when we
carry out a transition to a new frame of reference which is in some state
of accelerated motion relative to the first one, the corresponding
coordinate transformation does not reproduce eqs. (2.1) in terms of the
new coordinates. The acceleration of the new frame of reference relative
to an inertial system manifests itself in apparent accelerations of
bodies not subject to real forces.
Galilean transformations. If the form of a law is not changed by
certain coordinate transformations, that is, if it is the same law in
terms of either set of coordinates, we call that law invariant or covariant
with respect to the transformations considered. The law of inertia (2.1)
is covariant with respect to transformations (1.1) and (1.3), but not
with respect to (1.2).
Transformations (1.1) and (1.3) are of the greatest importance for our
further discussions. They are usually referred to as Galilean
transformations. According to classical physics, any two inertial systems
are connected by a Galilean transformation.
The force law and its transformation properties. We shall now discuss
the transformation properties of the basic laws of classical mechanics.
These laws may be formulated thus:
When bodies are subject to forces, their accelerations do not vanish, but
are proportional to the forces acting on them. The ratio of force to
acceleration is a constant, different for every individual body; this
constant is called the mass of the body.
The total force acting on one body is the vector sum of all the forces caused
by every other body of the mechanical system. In other words, the total
interaction among a number of bodies is the combination of interactions of
pairs. The forces which two bodies exert on each other lie in their
connecting straight line and are equal except that they point in opposite
directions; that is, two bodies can either attract or repel each other. The
magnitude of these forces is a function of their distance only; neither
velocities nor accelerations have any influence.
These laws apply to such phenomena as gravitation, electrostatics,
and Van der Waals forces, but electrodynamics is not included because
the interaction between magnetic fields and electric charges produces
forces whose direction is not in the connecting straight line, and which
depend on the velocity of the charged body as well as on its position.
But whenever the italicized conditions are satisfied, the forces can be
represented by the negative derivatives of the potential energy. The
latter is the sum of the potential energies characterizing the interaction
of any two bodies or "mass points,"
$$
V = \sum_{i=1}^{n} \sum_{k=1}^{n} V_{ik}(s_{ik}), \quad i < k, \qquad
s_{ik} = \sqrt{(x_i - x_k)^2 + (y_i - y_k)^2 + (z_i - z_k)^2}.
\tag{2.10}
$$
The indices i and k refer to the interaction between the ith and the kth
mass points, and $s_{ik}$ is the distance between them. The functions
$V_{ik}(s_{ik})$ are given by the special nature of the problem, for example,
Coulomb's law, Newton's law of gravitation, and so forth.
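The force acting on the ith mass point is given by
$$
\begin{aligned}
f_{ix} &= -\frac{\partial V}{\partial x_i} = -\sum_{k \neq i} \frac{dV_{ik}}{ds_{ik}} \, \frac{x_i - x_k}{s_{ik}}, \\
f_{iy} &= -\frac{\partial V}{\partial y_i} = -\sum_{k \neq i} \frac{dV_{ik}}{ds_{ik}} \, \frac{y_i - y_k}{s_{ik}}, \\
f_{iz} &= -\frac{\partial V}{\partial z_i} = -\sum_{k \neq i} \frac{dV_{ik}}{ds_{ik}} \, \frac{z_i - z_k}{s_{ik}}.
\end{aligned}
\tag{2.11}
$$
The set of equations (2.11), by its form, implies that the force
components due to the interaction of the ith and the kth bodies alone are
equal, except for opposite signs, that is,
$$ \frac{\partial V_{ik}}{\partial x_i} = -\frac{\partial V_{ik}}{\partial x_k}, \quad \text{and so forth.} $$
Therefore, the sum of all forces acting on all n mass points vanishes,
$$ \sum_{i=1}^{n} f_{ix} = \sum_{i=1}^{n} f_{iy} = \sum_{i=1}^{n} f_{iz} = 0. \tag{2.12} $$
The differential equations governing the motions of the bodies are
$$ m_i \ddot{x}_i = f_{ix}, \qquad m_i \ddot{y}_i = f_{iy}, \qquad m_i \ddot{z}_i = f_{iz}, \tag{2.13} $$
where $m_i$ is the mass of the ith body.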
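The statement (2.12) that the forces derived from a potential of the type (2.10) add up to zero can be tried out numerically. The sketch below evaluates the force on each mass point separately from eqs. (2.11) and then sums the results. The pair potential $V_{ik} = -m_i m_k / s_{ik}$ (Newtonian attraction in units with unit gravitational constant), the masses, and the positions are illustrative assumptions. The function names are hypothetical.

```python
import math

masses = [1.0, 2.0, 3.5]                      # arbitrary masses
positions = [(0.0, 0.0, 0.0), (1.0, 2.0, 2.0), (-3.0, 0.5, 4.0)]

def newton_slope(i, k, s):
    """dV_ik/ds for the assumed pair potential V_ik = -m_i m_k / s."""
    return masses[i] * masses[k] / s**2

def force_on(i, positions, slope):
    """Force on the ith mass point according to eqs. (2.11)."""
    total = [0.0, 0.0, 0.0]
    for k in range(len(positions)):
        if k == i:
            continue
        diff = [a - b for a, b in zip(positions[i], positions[k])]
        s = math.sqrt(sum(d * d for d in diff))
        for axis in range(3):
            total[axis] -= slope(i, k, s) * diff[axis] / s
    return total

forces = [force_on(i, positions, newton_slope) for i in range(len(positions))]
net_force = [sum(f[axis] for f in forces) for axis in range(3)]

print(all(abs(component) < 1e-12 for component in net_force))
```

Each force is computed independently of the others. The cancellation in `net_force` therefore reflects the antisymmetry of the pair terms in eqs. (2.11) and is not built into the code.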
We are now going to show that the system of equations determining the
behavior of a mechanical system, (2.10), (2.11), and (2.13), is covariant
with respect to Galilean transformations.
Let us start with eq. (2.10). V depends on the distances $s_{ik}$ of the
various mass points from each other. How do the $s_{ik}$ change
(transform) when a coordinate transformation (1.1) or (1.3) is carried out?
In order to answer that question, it must be kept in mind that the
coordinates of the ith and of the kth body are to be taken at the same
time; in other words, that the distance between the two bodies is itself
a function of time. Of course, the coordinates of the various mass
points transform independently of each other, each set $(x_i, y_i, z_i)$ by
itself, according to the transformation equations (1.1) or (1.3),
respectively.
Considering these points, it is seen immediately that transformation
(1.3), corresponding to the straight-line, uniform motion, leaves the
coordinate differences, for example, $x_i - x_k$, unchanged, or,
$$ x^*_i - x^*_k = x_i - x_k. \tag{2.14} $$
Therefore, the $s_{ik}$ themselves take the same form in the new coordinate
system S* which they have in S.
The transformation equations (1.1) express the relationship between
two coordinate systems which are at rest relative to each other and whose
axes are not parallel. Obviously, the distance between two points is
expressed in the same way in either coordinate system, so that
$$
\begin{aligned}
\sqrt{(x_i - x_k)^2 + (y_i - y_k)^2 + (z_i - z_k)^2}
&= \sqrt{(x'_i - x'_k)^2 + (y'_i - y'_k)^2 + (z'_i - z'_k)^2}, \\
s_{ik} &= s'_{ik}.
\end{aligned}
\tag{2.15}
$$
A quantity which does not change its value (at a given point) when a
coordinate transformation is carried out is called an invariant with
respect to that transformation. The distance between two points is an
invariant.
We have seen that the arguments of the function V, the $s_{ik}$, are
invariant with respect to Galilean coordinate transformations. Therefore,
the function V itself, the total potential energy of the mechanical
system, is an invariant, too; expressed in terms of the new coordinates,
it has the same form and takes the same values as in the original
coordinate system. Eq. (2.10) is covariant with respect to Galilean
transformations.
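The invariance of the distance $s_{ik}$ can be illustrated with a short numerical experiment. The sketch assumes a rotation about the z-axis as one example of an orthogonal transformation of type (1.1), and the form x* = x - v_x t for type (1.3). Both forms, the function names, and the sample points are assumptions made for the illustration.

```python
import math

def rotate_about_z(p, angle):
    """One example of an orthogonal transformation of type (1.1)."""
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = p
    return (c * x + s * y, -s * x + c * y, z)

def boost(p, t, v):
    """Assumed form of transformation (1.3), applied at the common time t."""
    return tuple(xi - vi * t for xi, vi in zip(p, v))

p_i, p_k = (1.0, 2.0, 3.0), (-2.0, 0.5, 4.0)   # two mass points at the same time
s_ik = math.dist(p_i, p_k)

s_rotated = math.dist(rotate_about_z(p_i, 0.8), rotate_about_z(p_k, 0.8))

t, v = 5.0, (10.0, -4.0, 2.5)
s_boosted = math.dist(boost(p_i, t, v), boost(p_k, t, v))

print(abs(s_rotated - s_ik) < 1e-12, abs(s_boosted - s_ik) < 1e-12)
```

Both points are transformed at the same time t, as the text requires. Using different times for the two points would change their coordinate differences under (1.3).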
Let us proceed to eqs. (2.11) and again begin with transformation
(1.3). The right-hand sides of eqs. (2.11) contain the derivatives of a
quantity which we already know is an invariant. These derivatives
with respect to the two sets of coordinates are related to each other by
the equations
$$
\frac{\partial V}{\partial x^*_i} = \frac{\partial V}{\partial x_i}, \qquad
\frac{\partial V}{\partial y^*_i} = \frac{\partial V}{\partial y_i}, \qquad
\frac{\partial V}{\partial z^*_i} = \frac{\partial V}{\partial z_i},
\tag{2.16}
$$
and therefore, the right-hand side of eqs. (2.11) is invariant with respect
to transformations (1.3). Whether the same holds true for the left-hand
side, we shall be able to decide after discussing the transformation
properties of eqs. (2.13). It is clear, however, that the equation remains
valid in the new coordinate system only if both sides transform the
same way. Otherwise, it is not covariant with respect to the
transformation considered. We shall have to find out whether the
transformation properties of the right-hand side of eqs. (2.11) are compatible
with those of the left-hand side of eqs. (2.13), as both determine the
transformation properties of the forces, $f_i$.
Let us first transform the right-hand side of eqs. (2.11) by a
transformation (1.1). Applying the rules of partial differentiation, we obtain
$$
\begin{aligned}
\frac{\partial V}{\partial x_i} &= \frac{\partial V}{\partial x'_i} c_{11} + \frac{\partial V}{\partial y'_i} c_{21} + \frac{\partial V}{\partial z'_i} c_{31}, \\
\frac{\partial V}{\partial y_i} &= \frac{\partial V}{\partial x'_i} c_{12} + \frac{\partial V}{\partial y'_i} c_{22} + \frac{\partial V}{\partial z'_i} c_{32}, \\
\frac{\partial V}{\partial z_i} &= \frac{\partial V}{\partial x'_i} c_{13} + \frac{\partial V}{\partial y'_i} c_{23} + \frac{\partial V}{\partial z'_i} c_{33}.
\end{aligned}
\tag{2.17}
$$
The 3n equations (2.17) can be separated into n groups of 3 equations
each, these groups being identical save for the value of i. Each group
transforms as the components of a vector, that is, each component in
one system is equal to the sum of the projections upon it of the three
components in the other system.
Whether the left-hand sides of eqs. (2.11) also have vector character
must be decided after discussion of the transformation properties of
eqs. (2.13).
The left-hand sides of eqs. (2.13) are products of masses and
accelerations. We have already stated that in classical physics the mass is
considered to be a constant of a body, independent of its state of motion
and invariant with respect to coordinate transformations.
That the accelerations of a body are invariant with respect to
transformation (1.3) we have already seen in eq. (2.7). Therefore, the
left-hand sides of eqs. (2.13) transform with respect to (1.3) in the same way
as the right-hand sides of eqs. (2.11).
Turning to transformations (1.1), we know that
$$ \ddot{x}'_i = c_{11} \ddot{x}_i + c_{12} \ddot{y}_i + c_{13} \ddot{z}_i, \quad \text{and so forth,} \tag{2.18} $$
but because the $c_{ab}$ have the significance of cosines of angles, and because
the value of a cosine does not depend on the sign of the angle,
$$ \cos \alpha = \cos(-\alpha), $$
it is also true that
$$ \ddot{x}_i = c_{11} \ddot{x}'_i + c_{21} \ddot{y}'_i + c_{31} \ddot{z}'_i, \quad \text{and so forth.} \tag{2.18a} $$
Again, the left-hand sides of eqs. (2.13) transform in exactly the same
way as the right-hand sides of (2.11), in this case as n vectors.
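The step from eq. (2.18) to eq. (2.18a) amounts to the statement that the inverse of an orthogonal coefficient matrix is its transpose. The following sketch checks this for one set of coefficients $c_{ab}$, built here from two successive rotations. The particular angles, the acceleration components, and the helper names are hypothetical.

```python
import math

def matmul(a, b):
    return [[sum(a[r][m] * b[m][c] for m in range(3)) for c in range(3)]
            for r in range(3)]

def matvec(m, vec):
    return [sum(m[r][c] * vec[c] for c in range(3)) for r in range(3)]

def transpose(m):
    return [[m[c][r] for c in range(3)] for r in range(3)]

def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]

coeffs = matmul(rot_x(0.4), rot_z(1.1))   # an example of orthogonal c_ab
acc = [0.3, -1.2, 2.0]                    # accelerations in the unprimed system

acc_primed = matvec(coeffs, acc)                   # eq. (2.18)
acc_back = matvec(transpose(coeffs), acc_primed)   # eq. (2.18a)

print(all(abs(a - b) < 1e-12 for a, b in zip(acc, acc_back)))
```

The first row of the transposed matrix is $(c_{11}, c_{21}, c_{31})$, which is exactly the combination of coefficients appearing in eq. (2.18a).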
Eqs. (2.13) can be considered as the equations defining the forces
$f_i$. We conclude, therefore, that the forces themselves transform so
that both eqs. (2.11) and (2.13) are covariant. With respect to spatial,
orthogonal transformations of the coordinate system, the forces are vectors,
and they are invariant with respect to transformations representing a
straight-line uniform motion of one system relative to the other. These
relations can be expressed in a slightly different form. By eliminating the
quantities $f_i$ from eqs. (2.11) and (2.13) we can combine them into new
equations of the form
$$
\begin{aligned}
\frac{\partial V}{\partial x_i} + m_i \ddot{x}_i &= 0, \\
\frac{\partial V}{\partial y_i} + m_i \ddot{y}_i &= 0, \\
\frac{\partial V}{\partial z_i} + m_i \ddot{z}_i &= 0.
\end{aligned}
\tag{2.19}
$$
These equations contain the essential physical statements of eqs. (2.11)
and (2.13), but do not bring out so clearly the force concept.
The result of the above consideration is that the two sides of each of
equations (2.11) and (2.13) transform in the same way, and that,
therefore, these equations remain valid when arbitrary Galilean
transformations are carried out.
Equations which do not change at all with the transformation (that
is, the terms of which are invariants) are called invariant. Equations
which remain valid because their terms, though not invariant, transform
according to identical transformation laws (such as the terms
$\partial V / \partial x_i$ and $m_i \ddot{x}_i$, and so forth, in eqs. (2.19))
are called covariant.
The covariance of equations is the mathematical property which
corresponds to the existence of a relativity principle for the physical
laws expressed by those equations. In fact, the relativity principle of
classical mechanics is equivalent to our result, that the laws of
mechanics take the same form in all inertial systems, that is, in all those
coordinate systems which can be obtained by subjecting any one inertial
system to arbitrary Galilean transformations.
The other branches of mechanics, such as the treatment of continuous
matter (the theory of elastic bodies and hydrodynamics) or the
mechanics of rigid bodies, can be deduced from the mechanics of free mass
points by introducing suitable interaction energies of the type (2.10),
and by carrying out certain limiting processes. It is evident, even
without a detailed treatment of these branches of mechanics, that the
results obtained apply to them as well as to the laws of motion of free
mass points.
CHAPTER III
The Propagation of Light
The problem confronting classical optics. During the nineteenth
century, a new branch of physics was developed which could not be brought
within the realm of mechanics. That branch was electrodynamics. As
long as only electrostatic and magnetostatic effects were known, they
could be treated within the framework of mechanics by the introduction
of electrostatic and magnetostatic potentials which depended only on
the distance of the electric charges or magnetic poles from each other.
The interaction of electric and magnetic fields required a different
treatment. This was brought out clearly by Oersted's experiment. He
found that a magnetized needle was deflected from its normal
North-South direction by a current flowing through an overhead North-South
wire. The sign of the deflection was reversed when the direction of the
current was reversed. Obviously, the magnetic actions produced by
electric currents, that is, by moving charges, depend not only on the
distance but also on the velocity of these charges. Furthermore, the
force does not have the direction of the connecting straight line. The
concepts of Newton's mechanics are no longer applicable.
Maxwell succeeded in formulating the laws of electromagnetism by
introducing the new concept of "field." As we have seen in the
preceding chapter, in mechanics a system is completely described when the
locations of the constituent mass points are known as functions of time.
In Maxwell's theory, we encounter a certain number of "field variables,"
such as the components of the electric and magnetic field strengths.
While the point coordinates of mechanics are defined as functions of the
time coordinate alone, the field variables are defined for all values both
of the time coordinate and of the three space coordinates, and are thus
functions of four independent variables.
1 In the mechanics of continuous media, we find variables which resemble field
variables: the mass density, momentum density, stress components, and so forth;
but they have only statistical significance. They are the total mass of the
average number of particles per unit volume, and so forth. In electrodynamics,
however, the field variables are assumed to be the basic physical quantities.
In Maxwell's field theory it is further assumed that the change of the
field variables with time at a given space point depends only on the
immediate neighborhood of the space point. A disturbance of the field
at a point induces a change of the field in its immediate neighborhood,
this in turn causes a change farther away, and thus the original
disturbance has a tendency to spread with a finite velocity and to make itself
felt eventually over a great distance. "Action at a distance" may thus
be produced by the field, but always in connection with a definite lapse
of time. The laws of a field theory have the form of partial differential
equations containing the partial derivatives of the field variables with
respect to the space coordinates and with respect to time.
The force acting upon a mass point is determined by the field in the
immediate neighborhood of the mass point. Conversely, the presence
of the mass point may and usually does modify the field.
Since the structure of Maxwell's theory of electromagnetism is so
different from Newtonian mechanics, the validity of the relativity
principle in mechanics by no means implies its extension to electrodynamics.
Whether or not this principle applies to the laws of the electromagnetic
field must be the subject of a new investigation.
A complete investigation of this kind would have to establish the
transformation laws of the electric and magnetic field intensities with
respect to Galilean transformations, and then determine whether the
transformed quantities obey the same laws with respect to the new
coordinates. Such investigations were carried out by various scientists,
among them H. Hertz and H. A. Lorentz. But we can obtain the most
important result of these investigations by much simpler considerations.
Instead of treating Maxwell's field equations themselves, we shall
confine ourselves at present to one of their aspects, the propagation of
electromagnetic waves.
Maxwell himself recognized that electromagnetic disturbances, such
as those produced by oscillating charges, propagate through space with
a velocity which depends on the electric nature of the matter present in
space. In the absence of matter, the velocity of propagation is
independent of its direction, and equal to about $3 \times 10^{10}$ cm sec$^{-1}$.
This is equal to the known speed of light; Maxwell assumed, therefore, that
light was a type of electromagnetic radiation. When Hertz was able to
produce electromagnetic radiation by means of an electromagnetic
apparatus, Maxwell's theory of the electromagnetic field and his
electromagnetic theory of light were accepted as an integral part of our
physical knowledge.
Electromagnetic radiation propagates in empty space with a uniform,
constant velocity (hereafter denoted by c). This conclusion can be
[...] we can study the transformation properties [of this law with
respect to Galilean] transformations. Let us consider a coordinate
[system S with respect to which there is a] uniform speed of light. If we
[carry out a] Galilean transformation of the type (1.3), corresponding to
[a] translatory motion of the new system [S*, the speed of light in S*
cannot] be equal to c in all directions. If the direction of a light ray
[has the direction cosines $\cos\alpha$, $\cos\beta$, $\cos\gamma$, its]
velocity components with respect to S are
$$ u_x = c \cos\alpha, \qquad u_y = c \cos\beta, \qquad u_z = c \cos\gamma, \tag{3.1} $$
with
$$ \cos^2\alpha + \cos^2\beta + \cos^2\gamma = 1. \tag{3.2} $$
[According to eqs. (2.9), the velocity components with] respect to the
[new system S* are]
$$
\begin{aligned}
u^*_x &= c \cos\alpha - v_x, \\
u^*_y &= c \cos\beta - v_y, \\
u^*_z &= c \cos\gamma - v_z,
\end{aligned}
\tag{3.3}
$$
and the speed of light depends on its direction as indicated by the
[expression]
$$
\sqrt{u^{*2}_x + u^{*2}_y + u^{*2}_z}
= \sqrt{c^2 + v^2 - 2c\,(v_x \cos\alpha + v_y \cos\beta + v_z \cos\gamma)}.
\tag{3.4}
$$
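The direction dependence expressed by eqs. (3.3) and (3.4) can be made concrete with a small calculation. The sketch below applies the classical law of the addition of velocities, eqs. (2.9), to a light ray and compares the result with the closed form (3.4). The relative velocity of $3 \times 10^{6}$ cm sec$^{-1}$ along the x-axis and the function names are illustrative assumptions.

```python
import math

C = 3.0e10  # speed of light in S, cm/sec

def light_speed_in_moving_frame(direction_cosines, v, c=C):
    """Speed of a light ray in S*, from the components of eqs. (3.3)."""
    components = [c * n - vi for n, vi in zip(direction_cosines, v)]
    return math.sqrt(sum(u * u for u in components))

def closed_form(direction_cosines, v, c=C):
    """The same speed, written as in eq. (3.4)."""
    v_squared = sum(vi * vi for vi in v)
    projection = sum(n * vi for n, vi in zip(direction_cosines, v))
    return math.sqrt(c * c + v_squared - 2.0 * c * projection)

v = (3.0e6, 0.0, 0.0)  # assumed velocity of S* relative to S, cm/sec

along = light_speed_in_moving_frame((1.0, 0.0, 0.0), v)     # c - v
against = light_speed_in_moving_frame((-1.0, 0.0, 0.0), v)  # c + v
across = light_speed_in_moving_frame((0.0, 1.0, 0.0), v)    # sqrt(c^2 + v^2)

print(along, against, across)
```

According to the Galilean transformation, the three speeds differ from one another, so that the speed of light could be uniform in at most one of the two systems.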
[...] It appears, thus, that [...] the laws of electromagnetic radiation,
[...] electromagnetic fields. If confidence in Maxwell's [...] must [...]
system with respect to which the field equations take their standard
form. Any frame of reference which is in a state of motion relative to
[...] would apply only to mechanics, not to the whole of physics.
[...] made to overcome these difficulties. We shall now consider the more
important ones.
[...] and be referred to a frame connected [...] field theory of [...]
a corpuscular theory of the type [...] is consistent with the principle
of relativity [...]. But it contains explicitly the velocity [...]
also have comparatively great velocities [...] distance
divided by the speed of light. If v is the variation of velocity of
one component of the double star, one would have
$$ t = \frac{d}{c}, \qquad \frac{\Delta t}{\Delta c} \sim \frac{d}{c^2}, \qquad \Delta t \sim \frac{v d}{c^2} $$
(d is the distance between double star and earth, c is the speed of light,
and t is the mean time for light to reach the earth). Reasonable
assumptions for the order of magnitude of the quantities v, d, and c are:
$$ c \sim 3 \times 10^{10} \ \text{cm sec}^{-1}, \qquad v \gtrsim 10^{6} \ \text{cm sec}^{-1}, \qquad d \gtrsim 10^{21} \ \text{cm}; $$
therefore,
$$ \Delta t \gtrsim 10^{6} \ \text{sec}. $$
As there are many double star systems for which d exceeds $10^{21}$ cm and
which have periods less than $10^{6}$ sec, the resulting effects could not
escape observation.
However, no trace of any such effect has ever been observed. This is
sufficiently conclusive to rule out further consideration of this hypothesis.
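The order-of-magnitude estimate for the double star can be retraced with the values quoted above. This is only a sketch of the arithmetic. The function name `arrival_time_spread` is hypothetical, and the numbers are the lower bounds stated in the text.

```python
c = 3.0e10   # speed of light, cm/sec
v = 1.0e6    # variation of velocity of one component of the double star, cm/sec
d = 1.0e21   # distance between double star and earth, cm

def arrival_time_spread(v, d, c):
    """Delta t ~ v d / c^2, the spread in arrival times if the speed of
    light depended on the velocity of its source."""
    return v * d / c**2

delta_t = arrival_time_spread(v, d, c)

print(delta_t)          # roughly 1.1e6 sec
print(delta_t / 86400)  # the same interval expressed in days
```

With these values the spread is somewhat more than $10^{6}$ sec, about thirteen days, which is comparable with or longer than the orbital periods mentioned in the text.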
The transmitting medium as the frame of reference. Another
hypothesis was that, whenever light was transmitted through a material
medium, this medium was the "local" privileged frame of reference.
Within the atmosphere of the earth, the speed of light should be uniform
with respect to a geocentric frame of reference.
This hypothesis, too, is unsatisfactory in many respects. Let us
assume, for the sake of the argument, that it is the transmitting medium
and its state of motion which determine the speed of light. Suppose,
now, that electromagnetic radiation goes from one medium in a certain
state of motion to a second medium in a different state of motion. The
speed of light would be bound to change, this change depending on the
relative velocity of the two media, and on the direction of the radiation
(also, of course, on the difference of indices of refraction). If this
experiment should be carried out with increasingly rarefied media, the
interaction between matter and radiation would become less and less, as far
as refraction, scattering, and so forth, are concerned, but the change of u
would remain the same. In the case of infinite dilution, that is, of a
vacuum, we should have a finite jump in u without apparent cause.
There is also experimental evidence bearing on this hypothesis. In
order to obtain information on the influence of a moving medium on the
speed of light, Fizeau carried out the following experiment. He sent a
ray of light through a pipe filled with a flowing liquid, and measured the
speed of light in both the negative and positive directions of flow. He
determined these speeds accurately by measuring the position of
interference fringes.
The experiment showed that the speed of light does depend on the
velocity of the flowing liquid, but not to the extent that the velocities of
the light and of the medium could simply be added. If we denote the
speed of light by c, the velocity of the liquid by v, and the index of
refraction by n, we should expect, according to our present assumption,
that the observed speed of light is
$$ u = \frac{c}{n} \pm v, \tag{3.5} $$
the sign depending on the relative directions of the light and the flow.
The actual result was that the change of speed of light is, within the
limits of experimental error, given by
$$ u = \frac{c}{n} \pm v \left( 1 - \frac{1}{n^2} \right). \tag{3.6} $$
This experimental result is consistent with the first objection. For,
as the medium is increasingly rarefied, the index of refraction n
approaches the value 1, the dependence of u on v becomes negligible, and,
in the limiting case of infinite dilution, u becomes simply c.
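The difference between the expected result (3.5) and the observed result (3.6) can be seen in a short numerical comparison. In the sketch below the indices of refraction (1.33, roughly that of water, and 1.0003, roughly that of air) and the flow velocity are illustrative assumptions, not data from Fizeau's experiment. The function names are hypothetical.

```python
def expected_speed(c, n, v):
    """Eq. (3.5): the velocities of light and medium simply added."""
    return c / n + v

def observed_speed(c, n, v):
    """Eq. (3.6): the result found by Fizeau, light travelling with the flow."""
    return c / n + v * (1.0 - 1.0 / n**2)

def drag_coefficient(n):
    """Factor multiplying v in eq. (3.6)."""
    return 1.0 - 1.0 / n**2

c = 3.0e10   # speed of light in empty space, cm/sec
v = 700.0    # assumed velocity of the liquid, cm/sec

for n in (1.33, 1.0003, 1.0):
    print(n, drag_coefficient(n), observed_speed(c, n, v) - c / n)
```

For n = 1.33 only about 43 per cent of the velocity of the liquid is added to the speed of light, and for n = 1 the factor vanishes, in agreement with the limiting case discussed in the text.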
Another effect which indicates that the speed of light does not depend
on the motion of a rarefied medium of transmission is that of aberration.
Fixed stars at a great distance change their relative positions in the sky
in a systematic way with a period of one year. Their paths are ellipses
around fixed centers, with the major axis in all cases approximately 41"
of arc. Stars near the celestial pole carry out movements that are
approximately circles, while stars near the ecliptic have paths which are
nearly straight lines.
Fig. 2 illustrates the way the star is seen away from its "normal"
position (the center of the ellipse) at two typical points of the path of
the earth around the sun.
Aberration can be explained thus (Fig. 3): As the telescope is rigidly
connected with the earth, it goes through space at an approximate rate
of $3 \times 10^{6}$ cm sec$^{-1}$. Therefore, when a light ray enters the
telescope, let us say from straight above, the telescope must be inclined in
the indicated manner, so that the lower end will have arrived straight below
the former position of the upper end by the time that the light ray has
arrived at the lower end. The tangent of the angle of aberration, $\alpha$,
must be the ratio between the distance traversed by the earth and the
distance traversed by the light ray during the same time interval, or the
ratio between the speed of the earth and the speed of light. This
ratio is
$$ \frac{v_{\text{earth}}}{c} \sim \frac{3 \times 10^{6}}{3 \times 10^{10}} = 10^{-4}; $$
the angle corresponding to that tangent is 20".5, the amount of greatest
aberration from the center of the ellipse.
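The angle of aberration follows from the two rounded speeds by a one-line computation, sketched here in Python. The variable names are hypothetical and the input values are the rounded figures used in the text.

```python
import math

v_earth = 3.0e6   # orbital speed of the earth, cm/sec (rounded)
c = 3.0e10        # speed of light, cm/sec (rounded)

alpha = math.atan(v_earth / c)                # angle of aberration, radians
alpha_arcsec = math.degrees(alpha) * 3600.0   # the same angle in seconds of arc

print(round(alpha_arcsec, 1))  # 20.6
```

The rounded inputs give about 20".6, close to the value of 20".5 quoted above. Twice this angle corresponds to the major axis of approximately 41" of the yearly ellipse.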
Fig. 2. The apparent change in position of a fixed star during a year
(aberration). This change is exaggerated in the figure and amounts to not
more than about 20".5.
This explanation of aberration again contradicts the assumption that
the transmitting medium is decisive for the speed of light. For if this
assumption were true, the light rays, upon entering our atmosphere,
would be "swept" along, and no aberration would take place.
The absolute frame of reference. All these arguments suggested the
independence of the electromagnetic laws from the motion of either the
source of radiation or the transmitting medium. The other alternative,
it appeared, was to give up the principle of relativity and to assume that
there existed a universal frame of reference with respect to which the
speed of light was independent of the direction of propagation. As
mentioned before, the equations of the electromagnetic field would have
taken their standard form with respect to that frame. As the
accelerations of charged particles are proportional to the field, this frame
could be expected to be an inertial system, so that the accelerations would
tend to zero with the field.
The experiment of Michelson and Morley. On the basis of this
assumption, Michelson and Morley devised an experiment which was
designed to determine the motion of the earth with respect to the
privileged frame of reference in which the speed of light was to be
uniform. The essential idea of their experiment was to compare
the apparent speed of light in two different directions.
Before studying their experimental setup, let us discuss the
expected results from the standpoint of this new hypothesis. The earth
itself cannot be the privileged frame of reference with respect to
which the equations of Maxwell hold, for it is continually subject
to the gravitational action of the sun; and in a frame of reference
connected with the center of gravitation of our solar system, the
velocity of the earth is of the order of $3 \times 10^{6}$ cm sec$^{-1}$. It
changes, therefore, about $6 \times 10^{6}$ cm sec$^{-1}$ in the course of one
half year relatively to a frame of reference which approximates an inertial
system better than the earth does. Therefore, even if the earth
could at any time be identified with the state of motion of the
privileged frame of reference, it would have a speed of
$6 \times 10^{6}$ cm sec$^{-1}$ half a year later relative to the privileged
system.
In any case, it has a speed of at least $3 \times 10^{6}$ cm sec$^{-1}$
relative to any inertial system through 6 months of the year. The speed of
light is about $3 \times 10^{10}$ cm sec$^{-1}$. If it is possible to compare
the speed of light in two orthogonal directions with a relative accuracy
better than $10^{-4}$, and if the experiments are carried out over a period
exceeding 6 months, the effects of the motion of the earth would become
noticeable.
Fig. 3. Explanation of aberration.
We proceed now to a description of Michelson and Morley's
experiment (Fig. 4). Light from a terrestrial source L is separated into two
parts by a thinly silvered glass plate P. At nearly equal distances
from P, and at right angles to each other, two mirrors $S_1$ and $S_2$ are
placed which reflect the light back to P. There, a part of each of the
two rays reflected by $S_1$ and $S_2$, respectively, are reunited and are
observed through a telescope F. Since the light emanating from L has
travelled almost equal distances, $L - P - S_1 - P - F$ and
$L - P - S_2 - P - F$, respectively, interference fringes are observed,
and their exact location depends on the difference between the
distances $l_1$ and $l_2$.
Fig. 4. The Michelson-Morley apparatus.
So far, we have assumed that the speed of light is the same in all
directions. If this assumption is dropped, the position of the
interference fringes in F will also depend on the difference in the speeds
along $l_1$ and $l_2$. Let us assume that the earth, and with it the
apparatus, is moving, relatively to the "absolute" frame of reference, along
the direction of $l_1$ at the rate of speed v. With respect to the apparatus,
the speed of light along the path $P - S_1$ equals (c - v), and along the path
$S_1 - P$ it is (c + v). The time required to travel the path $P - S_1 - P$
will be
$$ t_1 = \frac{l_1}{c - v} + \frac{l_1}{c + v} = \frac{2 l_1 / c}{1 - v^2/c^2}. \tag{3.7} $$
The relative speed of the light travelling along the path $P - S_2 - P$
will also be modified. While the light travels from P to $S_2$, the whole
apparatus is moving sideways a distance $\delta$,
$$ \delta = \frac{v}{c} \sqrt{l_2^2 + \delta^2}, \tag{3.8} $$
and the actual distance travelled by the light is
$$ \sqrt{l_2^2 + \delta^2} = \frac{l_2}{\sqrt{1 - v^2/c^2}}. \tag{3.9} $$
On the way back, the light has to travel an equal distance. The total
time required by the light for the path $P - S_2 - P$ is, therefore,
$$ t_2 = \frac{2 l_2 / c}{\sqrt{1 - v^2/c^2}}. \tag{3.10} $$
After the apparatus has been swung 90° about its axis, the times
required to travel the paths $P - S_1 - P$ and $P - S_2 - P$ are,
respectively,
$$
\bar{t}_1 = \frac{2 l_1 / c}{\sqrt{1 - v^2/c^2}}, \qquad
\bar{t}_2 = \frac{2 l_2 / c}{1 - v^2/c^2}.
\tag{3.11}
$$
The time differences between the two alternative paths are, therefore,
before and after the apparatus has been swung around,
$$
\Delta t = t_1 - t_2 = \frac{2/c}{\sqrt{1 - v^2/c^2}} \left( \frac{l_1}{\sqrt{1 - v^2/c^2}} - l_2 \right)
\tag{3.12}
$$
and
$$
\overline{\Delta t} = \bar{t}_1 - \bar{t}_2 = \frac{2/c}{\sqrt{1 - v^2/c^2}} \left( l_1 - \frac{l_2}{\sqrt{1 - v^2/c^2}} \right).
$$
The change in $\Delta t$ which is brought about by rotating the apparatus is,
therefore,
$$
\overline{\Delta t} - \Delta t = \frac{2/c}{\sqrt{1 - v^2/c^2}} \, (l_1 + l_2) \left( 1 - \frac{1}{\sqrt{1 - v^2/c^2}} \right).
\tag{3.13}
$$
As $(v/c)^2$ is of the order of magnitude of $10^{-8}$, we shall expand the
right-hand side of eq. (3.13) into a power series in $(v/c)^2$ and consider
only the first nonvanishing term. We obtain the approximate expression
$$ \overline{\Delta t} - \Delta t \approx -\frac{1}{c} \, (l_1 + l_2) \, \frac{v^2}{c^2}. \tag{3.14} $$
We should expect the interference fringes in the telescope to shift
because of this change in the time difference $\Delta t$. The amount of this
shift, expressed in terms of the width of one fringe, would be equal to
$(\overline{\Delta t} - \Delta t)$ divided by the time of one period of
oscillation, $1/\nu$,

\[ \frac{\Delta s}{a} = -\frac{\nu}{c} (l_1 + l_2) \frac{v^2}{c^2}. \tag{3.15} \]

$v$ is the velocity of the earth relative to the "absolute" frame of
reference, presumably at least of the order $3 \times 10^6$ cm sec$^{-1}$;
$(v/c)^2$ is, therefore, of the order $10^{-8}$. $\nu/c$ is the wave number,
and for visible light, about $2 \times 10^4$ cm$^{-1}$. We have, therefore,

\[ \frac{\Delta s}{a} \approx -\frac{l_1 + l_2}{5 \times 10^3 \ \text{cm}}. \tag{3.15a} \]
By using multiple reflection, Michelson and Morley were able to work
with effective lengths $l_1$ and $l_2$ of several meters. Any effect should
have been clearly observable after all the usual sources of error, such as
stresses, temperature effects, and so forth, had been eliminated.
Nevertheless, no effect was observed.
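As a rough numerical check of the estimate above, the following Python sketch evaluates the exact change in the time difference, (2/c)(l1 + l2)(1 - 1/sqrt(1 - v^2/c^2))/sqrt(1 - v^2/c^2), its leading term -(l1 + l2)v^2/c^3, and the corresponding fringe shift. The arm length of 500 cm, the rounded value of c, and all function names are illustrative assumptions, not values taken from the text.

```python
import math

C = 3.0e10  # speed of light in cm/sec (rounded, assumed for this sketch)


def time_shift_exact(l1, l2, v, c=C):
    """Exact change of the time difference caused by the 90-degree rotation."""
    root = math.sqrt(1.0 - (v / c) ** 2)
    return (2.0 / c) / root * (l1 + l2) * (1.0 - 1.0 / root)


def time_shift_approx(l1, l2, v, c=C):
    """First nonvanishing term of the expansion in (v/c)^2."""
    return -(l1 + l2) / c * (v / c) ** 2


def fringe_shift(l1, l2, v, wave_number, c=C):
    """Expected shift in units of one fringe width; wave_number = nu/c in 1/cm."""
    return -wave_number * (l1 + l2) * (v / c) ** 2


if __name__ == "__main__":
    arm = 500.0   # hypothetical effective arm length in cm
    v = 3.0e6     # assumed speed of the earth in cm/sec
    print(time_shift_exact(arm, arm, v))
    print(time_shift_approx(arm, arm, v))
    print(fringe_shift(arm, arm, v, 2.0e4))
```

With these assumed numbers the predicted shift is a sizable fraction of a fringe width, which is why the null result was so significant.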
An impasse was at hand: No consistent theory would agree with the
results of Fizeau's experiment, the Michelson-Morley experiment, and
the effect of aberration. A great number of additional experiments were
performed along similar lines. Their discussion can be omitted here,
because they did not change the situation materially. What was needed
was not more experiments, but some new theory which would explain
the apparent contradictions.
The ether hypothesis. Before launching into an explanation of that
new theory, the theory of relativity, we mention a hypothesis which
today has only historical significance. Physicists had been accustomed
to think largely in terms of mechanics. When Faraday, Maxwell, and
Hertz created the first field theory, it was only natural that attempts
were made by many physicists to explain the new fields in terms of
mechanical concepts. Maxwell and Hertz themselves contributed to
these efforts. Within the realm of mechanics itself, there existed a
branch which used concepts and methods resembling those of field
physics, namely, the mechanics of continuous media. So the
electromagnetic fields were explained as the stresses of a hypothetical
material medium, the so-called ether.
There are many reasons why this interpretation of the electromagnetic
field finally had to be abandoned. Among them are: the ether
would have to be endowed with properties not shared by any known
medium; it would have to penetrate all matter without exhibiting any
frictional resistance; and it would have no mass and would not be
affected by gravitation. Also, Maxwell's equations are different in
many ways from the equations to which elastic waves are subject.
There exists, for instance, no analogy in electrodynamics to the
"longitudinal" elastic waves.
At the end of the nineteenth century, however, the ether was regarded
as a most promising and even necessary hypothesis. Naturally,
attempts were made to apply this concept to the problem discussed in this
chapter, namely, to find that coordinate system in which the speed of
light is equal to c in all directions. The idea of the ether suggested that
it might be the coordinate system in which the ether is at rest. That
theory, however, does little to solve the fundamental difficulty. All
that it does is reword the problem; for, in order to find out what the state
of motion of the ether really is, we would have no other means than to
measure the speed of light. The outcome of the Michelson-Morley
experiment would, therefore, suggest that the ether is dragged along
with the earth, as far as the immediate neighborhood of the earth is
concerned. The motion of small masses, such as in Fizeau's experiment,
would carry the ether along, but not completely. But these hypotheses
could not account for aberration. The existence of the aberration effect
would be consistent with an ether hypothesis only if the earth could
glide through the ether without carrying it along, even right on its
surface, where our telescope picks up the light.
CHAPTER IV
The Lorentz Transformation
Several decades of experimental research showed that there was no
way of determining the state of motion of the earth through the "ether."
All the evidence seemed to point toward the existence of a "relativity
principle" in optics and electrodynamics, even though the Galilean
transformation equations ruled that out.
Nevertheless, Fitzgerald and especially H. A. Lorentz tried to preserve
the traditional transformation equations and still account
theoretically for the experimental results. Lorentz was able to show that
the motion of a frame of reference through the ether with a velocity $v$
would produce only "second-order effects"; that is, all observable
deviations from the laws which were valid with respect to the frame connected
with the ether itself would be proportional not to $v/c$, but to $(v/c)^2$.
One of these expected second-order effects was that, in a system
moving relatively to the ether, a light ray would take longer to go out
and back over a fixed distance parallel to the direction of the motion
than over an equal distance perpendicular to the motion. The
Michelson-Morley experiment was designed to measure that effect. In order
to explain the negative outcome of the experiment, Fitzgerald and
Lorentz assumed that scales and other "rigid" bodies moving through
the ether contracted in the direction of the motion just sufficiently to
offset this effect. This hypothesis preserved fully the privileged
character of one frame of reference (the ether). The negative result of the
Michelson-Morley experiment was not explained by the existence of an
"optical relativity principle," but was attributed to an unfortunate
combination of effects which made it impossible to determine experimentally
the motion of the earth through the ether.
Einstein, on the contrary, accepted the experiments as conclusive
evidence that the relativity principle was valid in the field of
electrodynamics as well as in mechanics. Therefore, his efforts were directed
toward an analysis and modification of the Galilean transformation
equations so that they would become compatible with the relativity
principle in optics. We shall now retrace this analysis in order to derive
the new transformation laws.
In writing down transformation equations, we always made two
assumptions, although we did not always stress them: that there exists
a universal time $t$ which is defined independently of the coordinate
system or frame of reference, and that the distance between two
simultaneous events is an invariant quantity, the value of which is
independent of the coordinate system used.
The relative character of simultaneity. Let us take up the first
assumption. As soon as we set out to define a universal time, we are
confronted with the necessity of defining simultaneity. We can compare
and adjust time-measuring devices in a unique way only if the
statement "The two events A and B occurred simultaneously" can be
given a meaning independent of a frame of reference. That this can be
done is one of the most important assumptions of classical physics; and
this assumption has become so much a part of our way of thinking, that
almost everyone has great difficulty in analyzing its factual basis.
To examine this hypothesis, we must devise an experimental test
which will decide whether two events occur simultaneously. Without
such an experiment (which can be performed, at least in principle), the
statement "The two events A and B occurred simultaneously" is devoid of
physical significance.
When two events occur close together in space, we can set up a
mechanism somewhat like the coincidence counters used in the
investigation of cosmic rays. This mechanism will react only if the two events
occur simultaneously.
If the two events occur a considerable distance apart, the coincidence
apparatus is not adequate. In such a case, signals have to transmit
the knowledge that each event has occurred, to some location where
the coincidence apparatus has been set up. If we had a method of
transmitting signals with infinite velocity, no great complication would
arise. By "infinite velocity" we mean that the signal transmitted
from a point $P_1$ to another point $P_2$ and then back to $P_1$ would return
to $P_1$ at the same time as it started from there.
Unfortunately, no signal with this property is known. All actual
signals take a finite time to travel out and back to the point of origin,
and this time increases with the distance traversed. In choosing the
type of signal, we should naturally favor a signal where the speed of
transmission depends on as few factors as possible. Electromagnetic
waves are most suitable, because their transmission does not require
the presence of a material medium, and because their speed in empty
space does not depend on their direction, their wave length, or their
intensity. As the recording device, we can use a coincidence circuit
with two photon counters.
To account for the finite time lost in transmission, we set up our
apparatus at the midpoint of the straight line connecting the sites of
the two events A and B. Each event, as it occurs, emits a light signal,
and we shall call the events simultaneous if the two light signals arrive
simultaneously at the midpoint. This experiment has been designed
to determine the simultaneity of two events without the use of specific
time-measuring devices. It is assumed that simultaneity as defined by
this experiment is "transitive," meaning that if two events A and B
occur simultaneously (by our definition), and if the two events A and C
also occur simultaneously, then B and C are simultaneous. It must
be understood that this assumption is a hypothesis concerning the
behavior of electromagnetic signals.
Granted that this hypothesis is correct, we still have no assurance
that our definition of simultaneity is independent of the frame of
reference to which we refer our description of nature. Locating two events
and constructing the point midway on the connecting straight line
necessarily involves a particular frame and its state of motion.
Is our definition invariant with respect to the transition from one
frame to another frame in a different state of motion? To answer this
question, we shall consider two frames of reference: one connected
with the earth (S), the other with a very long train (S*) moving along
a straight track at a constant rate of speed. We shall have two
observers, one stationed on the ground alongside the railroad track, the
other riding on the train. Each of the two observers is equipped with
a recording device of the type described and a measuring rod. Their
measuring rods need not be the same length; it is sufficient that each
observer be able to determine the point midway between two points
belonging to his reference body, ground or train.
Let us assume now that two thunderbolts strike, each hitting the
train as well as the ground and leaving permanent marks. Also
suppose that each observer finds afterwards that his recording apparatus
was stationed exactly midway between the marks left on his reference
body. In Fig. 5, the marks are denoted by A, B, A*, and B*, and the
coincidence apparatus by C and C*. Is it possible that the light
signals issuing from A, A* and from B, B* arrive simultaneously at C and
also simultaneously at C*?
At the instant that the thunderbolt strikes at A and A*, these two
points coincide. The same is true of B and B*. If eventually it turns
out that the two bolts struck simultaneously as observed by the ground
observer, then C* must coincide with C at the same time that A
coincides with A* and B with B* (that is, when the two thunderbolts
strike).¹ It is understood that all these simultaneities are defined with
respect to the frame S.
During the finite time needed by the light signals to reach C and
C*, C* travels to the left (Fig. 5, stages b, c, d). The signal issuing
from A, A* reaches C, therefore, only after passing C* (stages b, c);
while the light signal from B, B* reaches C before it gets to C* (stages
c, d). As a result, the train observer finds that the signal from A, A*
reaches his coincidence apparatus sooner than the signal from B, B*
(stages b, d).
Fig. 5. The two events occurring at A, A* and at B, B*, respectively, appear
simultaneous to an observer at rest relative to the ground (S), but not to an
observer who is at rest relative to the train (S*). At (a) the two events occur,
(b) the light signal from A, A* arrives at C*, (c) the light signals from both
events arrive at C, and (d) the light signal from B, B* arrives at C*.
This does not imply that the ground has a property not possessed
by the train. It is possible for the thunderbolts to strike so that the
light signals reach C* simultaneously. But then the signal from A, A*
will arrive at C after the signal from B, B*. In any case, it is
impossible for both recording instruments, at C and at C*, to indicate that
the two thunderbolts struck simultaneously.
¹ Otherwise, the distances A*C* and B*C* would not appear equal from the
point of view of the ground observer; we shall explain later why we do not assume
anything of this kind.
We conclude, therefore, that two events which are simultaneous with
respect to one frame of reference are in general not simultaneous with
respect to another frame.
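The thunderbolt argument can be followed numerically using only ground-frame (S) kinematics. The sketch below is a minimal illustration under stated assumptions: both bolts strike at t = 0 in S, A lies at x = -half_length and B at x = +half_length, C rests at x = 0, and C* moves toward A with speed v. The function name and the choice of units with c = 1 are hypothetical.

```python
def arrival_times(half_length, v, c=1.0):
    """Ground-frame times at which the two flashes reach C and C*.

    Assumed setup: the bolts strike at t = 0 at x = -half_length (A, A*)
    and x = +half_length (B, B*). C stays at x = 0, while C* starts at
    x = 0 and moves toward A, so that its position is x = -v * t.
    Returns (time both signals reach C,
             time the signal from A reaches C*,
             time the signal from B reaches C*).
    """
    if not 0 <= v < c:
        raise ValueError("this sketch assumes 0 <= v < c")
    t_at_c = half_length / c               # both signals meet at C
    t_a_at_cstar = half_length / (c + v)   # C* runs toward the signal from A
    t_b_at_cstar = half_length / (c - v)   # C* runs away from the signal from B
    return t_at_c, t_a_at_cstar, t_b_at_cstar


if __name__ == "__main__":
    print(arrival_times(1.0, 0.5))
```

For any v greater than zero the three times are ordered as in stages (b), (c), (d) of Fig. 5: the signal from A reaches C* first, then both signals reach C, and only then does the signal from B reach C*.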
The length of scales. Our conclusion affects the evaluation of length
measurements. We have assumed that the ground observer and the
train observer are able to carry out length measurements in their
respective frames of reference. Two rods which are at rest relatively to
the same frame of reference are considered equal in length if they can
be placed alongside each other so that their respective end points E,
E* and F, F* coincide. Two distances which are marked off on two
different reference bodies moving relatively to each other can be
compared by the same method, provided these distances are parallel to
each other and perpendicular to the direction of the relative motion.
However, if the two distances are parallel to the direction of relative
motion, and if they are travelling along the same straight line, their
respective end points will certainly coincide at certain times. The two
distances EF and E*F* are considered equal if the two coincidences
occur simultaneously. But whether they occur simultaneously
depends on the frame of reference of the observer. Thus, in the case of
the thunderbolts, the two distances AB and A*B* appear equal to the
ground observer; the train observer, on the other hand, finds that A
coincides with A* before B coincides with B*, and concludes that A*B*
is longer than AB. In other words, not only the simultaneity of events,
but also the result of length measurements, depends on the frame of
reference.
The rate of clocks. The frame of reference of the observer also
determines whether two clocks at a considerable distance from each other
agree (that is, whether their hands assume equivalent positions
simultaneously). Moreover, if the two clocks are in different states of
motion, we cannot even compare their rates independently of the frame of
reference. To illustrate this, let us consider two clocks D and D*, one
stationed alongside the track and the other on the train. Let us assume
that the two clocks happen to agree at the moment when D* passes D.
We can say that D* and D go at the same rate if they continue to agree.
But after a while, D* and D will be a considerable distance apart; and,
as we know from our earlier considerations, their hands cannot assume
equivalent positions simultaneously from the points of view both of
the ground observer and of the train observer.

² Our definition of simultaneity is, of course, to a certain degree arbitrary.
However, it is impossible to devise an experiment by means of which simultaneity
could be defined independently of a frame of reference. From the outcome of
the Michelson-Morley experiment, we conclude that the law of propagation of
light takes the same form in all inertial systems. Had the outcome of the
Michelson-Morley experiment been positive, in other words, if it were possible
to determine the state of motion of the "ether," we should naturally have based
our definition of simultaneity on the frame of reference connected with the ether,
and thereby have given it absolute significance.
The Lorentz transformation. The above considerations help us to
remove the apparent contradiction between the law of the propagation
of electromagnetic waves and the principle of relativity. If it is
impossible to define a universal time, and if the length of rigid rods cannot
be defined independently of the frame of reference, it is quite conceivable
that the speed of light is actually the same with respect to different
frames of reference which are moving relatively to each other. We
are now in a position to show that the classical transformations
connecting two inertial systems (Galilean transformation equations) can be
replaced by new equations which are not based on the assumptions of
a universal time and the invariant length of scales, but which assume
at the outset the invariant character of the speed of light.
In the derivation of these new transformation equations, we shall
accept the principle of relativity as fundamental; that is, the
transformation equations must contain nothing which would give one of the two
coordinate systems a preferred position as compared with the other
system. In addition, we shall assume that the transformation
equations preserve the homogeneity of space; all points in space and time
shall be equivalent from the point of view of the transformation. The
equations must, therefore, be linear transformation equations. This is
why we considered the two distances A*C* and B*C* equal in terms of
S-coordinates as well as in S*-coordinates (see page 31).
Let us consider two inertial coordinate systems, S and S*. S* moves
relatively to S at the constant rate $v$ along the X-axis; at the S-time
$t = 0$, the points of origin of S and S* coincide. The X*-axis is parallel
to the X-axis and, in fact, coincides with it. Points which are at rest
relative to S* will move with speed $v$ relative to S in the X-direction.
The first of our transformation equations will, thus, take the form

\[ x^* = \alpha (x - vt), \tag{4.1} \]

where $\alpha$ is a constant to be determined later.
It is not quite obvious that a straight line which is perpendicular to
the X-axis should also be perpendicular to the X*-axis (the angles to
be measured by observers in S and S*, respectively). But if we did
not assume that it was, the left-right symmetry with respect to the
X-axis would be destroyed by the transformation. For similar reasons,
we shall assume that the Y- and the Z-axes are orthogonal to each
other, as observed from either system, and that the same is true of the
Y*- and Z*-axes.
As mentioned before, we can compare the lengths of rods in different
states of motion in an invariant manner if they are parallel to each
other and orthogonal to the direction of relative motion. If their
respective end points coincide, it follows from the principle of relativity
that they are the same length. Otherwise, the relationship between S
and S* would not be reciprocal.
On the basis of this, we can formulate two further transformation
equations,

\[ y^* = y, \qquad z^* = z. \tag{4.2} \]

To complete this set of equations, we have to formulate an equation
connecting $t^*$, the time measured in S*, with the time and space
coordinates of S. $t^*$ must depend on $t$, $x$, $y$, and $z$ linearly,
because of what we have called the "homogeneity" of space and time. For
reasons of symmetry, we assume further that $t^*$ does not depend on $y$
and $z$. Otherwise, two S*-clocks in the Y*Z*-plane would appear to disagree
as observed from S. Choosing the point of time origin so that the
inhomogeneous (constant) term in the transformation equation vanishes,
we have

\[ t^* = \beta t + \gamma x. \tag{4.3} \]

Finally, we must evaluate the constants $\alpha$ of eq. (4.1) and $\beta$ and
$\gamma$ of eq. (4.3). We shall find that they are determined by the two
conditions that the speed of light be the same with respect to S and S*,
and that the new transformation equations go over into the classical
equations when $v$ is small compared with the speed of light, $c$.
Let us assume that at the time $t = 0$ an electromagnetic spherical
wave leaves the point of origin of S, which coincides at that moment
with the point of origin of S*. The speed of propagation of the wave
is the same in all directions and equal to $c$ in terms of either set of
coordinates. Its progress is therefore described by either of the two
equations

\[ x^2 + y^2 + z^2 = c^2 t^2, \tag{4.4} \]

\[ x^{*2} + y^{*2} + z^{*2} = c^2 t^{*2}. \tag{4.5} \]
By applying eqs. (4.1), (4.2), and (4.3), we can replace the starred
quantities in eq. (4.5) completely by unstarred quantities,

\[ c^2 (\beta t + \gamma x)^2 = \alpha^2 (x - vt)^2 + y^2 + z^2. \tag{4.6} \]
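By rearranging the terms, we obtain

\[ (c^2 \beta^2 - v^2 \alpha^2) t^2 = (\alpha^2 - c^2 \gamma^2) x^2 + y^2 + z^2 - 2 (v \alpha^2 + c^2 \beta \gamma) x t. \tag{4.7} \]

This equation goes over into eq. (4.4) only if the coefficients of $t^2$ and
$x^2$ are the same in eq. (4.7) as in eq. (4.4), and if the coefficient of
$xt$ in eq. (4.7) vanishes. Therefore,

\[ c^2 \beta^2 - v^2 \alpha^2 = c^2, \qquad \alpha^2 - c^2 \gamma^2 = 1, \qquad v \alpha^2 + c^2 \beta \gamma = 0. \tag{4.8} \]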
We solve these three equations for the three unknowns $\alpha$, $\beta$, and
$\gamma$ by first eliminating $\alpha$. We obtain the equations

\[ \beta (\beta + v \gamma) = 1, \qquad c^2 \gamma (\beta + v \gamma) = -v. \tag{4.9} \]

Then we eliminate $\gamma$ and obtain for $\beta^2$ the expression

\[ \beta^2 = \frac{1}{1 - v^2/c^2}. \tag{4.10} \]

$\beta$ is not equal to unity, as it is in the classical transformation theory.
But by choosing the positive root of (4.10), we can make it nearly equal
to unity for small values of $v/c$; its deviation from unity is of the second
order. $\gamma$ is given by the equation

\[ \gamma = \frac{1 - \beta^2}{v \beta} = -\frac{v}{c^2} \beta, \tag{4.11} \]

and finally, $\alpha$ is obtained from the equation

\[ \alpha^2 = -c^2 \beta \gamma / v = \beta^2. \tag{4.12} \]

Again we choose the positive sign of the root.
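The solution for the three constants can be checked numerically. The sketch below assumes the form of the three conditions implied by the derivation, namely c^2 beta^2 - v^2 alpha^2 = c^2, alpha^2 - c^2 gamma^2 = 1, and v alpha^2 + c^2 beta gamma = 0, and evaluates their residuals for the positive roots. Note that alpha, beta, gamma are the constants of this chapter (beta here is not the ratio v/c); the function names and units with c = 1 are illustrative assumptions.

```python
import math


def transformation_constants(v, c=1.0):
    """Positive-root solution for the constants alpha, beta, gamma."""
    beta = 1.0 / math.sqrt(1.0 - (v / c) ** 2)   # beta^2 = 1 / (1 - v^2/c^2)
    gamma = -(v / c ** 2) * beta                 # gamma = -(v/c^2) * beta
    alpha = beta                                 # alpha^2 = beta^2
    return alpha, beta, gamma


def condition_residuals(alpha, beta, gamma, v, c=1.0):
    """Left side minus right side of each of the three conditions."""
    return (
        c ** 2 * beta ** 2 - v ** 2 * alpha ** 2 - c ** 2,
        alpha ** 2 - c ** 2 * gamma ** 2 - 1.0,
        v * alpha ** 2 + c ** 2 * beta * gamma,
    )


if __name__ == "__main__":
    a, b, g = transformation_constants(0.6)
    print(a, b, g)
    print(condition_residuals(a, b, g, 0.6))
```

All three residuals vanish up to rounding error, and for v = 0 the constants reduce to the classical values alpha = beta = 1, gamma = 0.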
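By substituting all these values into eqs. (4.1), (4.3), we get the new
transformation equations,

\[ \begin{aligned}
x^* &= \frac{x - vt}{\sqrt{1 - v^2/c^2}}, \\
y^* &= y, \\
z^* &= z, \\
t^* &= \frac{t - (v/c^2)\, x}{\sqrt{1 - v^2/c^2}}.
\end{aligned} \tag{4.13} \]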
These equations are the so-called Lorentz transformation equations. For
small values of $v/c$, they are approximated by the Galilean
transformation equations,

\[ \begin{aligned}
x^* &= x - vt, \\
y^* &= y, \\
z^* &= z, \\
t^* &= t.
\end{aligned} \tag{4.14} \]
The deviations are all of the second order in $v/c$ (or $x/ct$). We can
therefore test the Lorentz transformation equations experimentally only
if we are able to increase $(v/c)^2$ beyond the probable experimental error.
Michelson and Morley, in their famous experiment, were able to
increase the accuracy to such an extent that they could measure a
second-order effect and prove experimentally the inadequacy of the Galilean
transformation equations.
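When we solve the equations (4.13) with respect to $x$, $y$, $z$, and $t$, we
obtain

\[ \begin{aligned}
x &= \frac{x^* + vt^*}{\sqrt{1 - v^2/c^2}}, \\
y &= y^*, \\
z &= z^*, \\
t &= \frac{t^* + (v/c^2)\, x^*}{\sqrt{1 - v^2/c^2}}.
\end{aligned} \tag{4.15} \]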
Comparing eqs. (4.15) with eqs. (4.13), we conclude that S has the
relative velocity $(-v)$ with respect to S*. This is not a trivial conclusion,
for neither the unit length nor the unit time is directly comparable
in S and S*.
The velocity of a light signal emanating from any point at any time
is equal to c with respect to any one system if it is equal to c in the
other system, for the coordinate and time differences of two events
transform exactly like x, y, z, and t themselves.
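A short Python sketch can make the last two statements concrete: the inverse transformation is obtained by replacing v with -v, and an event on a light signal through the origin (x = ct) has x* = ct* as well. The function names and the choice of units with c = 1 are assumptions made for illustration.

```python
import math


def lorentz(x, y, z, t, v, c=1.0):
    """Transform an event from S to S*, S* moving along X with speed v."""
    root = math.sqrt(1.0 - (v / c) ** 2)
    return ((x - v * t) / root, y, z, (t - v * x / c ** 2) / root)


def lorentz_inverse(xs, ys, zs, ts, v, c=1.0):
    """Transform an event from S* back to S: the same formulas with -v."""
    return lorentz(xs, ys, zs, ts, -v, c)


if __name__ == "__main__":
    v = 0.6
    event_on_light_ray = (2.0, 0.0, 0.0, 2.0)   # x = c * t with c = 1
    starred = lorentz(*event_on_light_ray, v)
    print(starred)                               # x* and t* come out equal
    print(lorentz_inverse(*starred, v))          # recovers the original event
```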
The Lorentz transformation equations do away with the classical
notions regarding space and time. They extend the validity of the
relativity principle to the law of propagation of light.
So far, we have fashioned our transformation theory to fit the
outcome of the Michelson-Morley experiment. How does this new theory
account for aberration? We have to compare the direction of the
incoming light with respect to two frames of reference, that of the sun and
that of the earth. The amount of aberration depends on the angle
between the incoming light and the relative motion of these two frames
of reference. We shall call that angle $\alpha$. Both coordinate systems are
to be arranged so that their relative motion is along their common
X-axis, and that the path of the light ray lies entirely within the
XY-plane. With respect to the sun, the path of the light ray is given by

\[ x = ct \cos \alpha, \qquad y = ct \sin \alpha. \tag{4.16} \]
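With respect to the moving earth, we find the equations of motion by
applying the inverted equations of the Lorentz transformation, (4.15).
Eq. (4.16) takes the form

\[ \begin{aligned}
x^* + v t^* &= c \left( t^* + \frac{v}{c^2} x^* \right) \cos \alpha, \\
y^* \sqrt{1 - v^2/c^2} &= c \left( t^* + \frac{v}{c^2} x^* \right) \sin \alpha.
\end{aligned} \tag{4.17} \]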
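By solving these equations with respect to $x^*$ and $y^*$, we get

\[ \begin{aligned}
x^* &= c t^* \, \frac{\cos \alpha - v/c}{1 - (v/c) \cos \alpha} = c t^* \cos \alpha^*, \\
y^* &= c t^* \sqrt{1 - v^2/c^2} \, \frac{\sin \alpha}{1 - (v/c) \cos \alpha} = c t^* \sin \alpha^*.
\end{aligned} \tag{4.18} \]

The cotangent of the new direction is

\[ \operatorname{ctg} \alpha^* = \frac{\operatorname{ctg} \alpha - (v/c) \operatorname{cosec} \alpha}{\sqrt{1 - v^2/c^2}}. \tag{4.19} \]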
According to the classical explanation given on page 21, the angle
would turn out to be

\[ \operatorname{ctg} \alpha^* = \operatorname{ctg} \alpha - (v/c) \operatorname{cosec} \alpha. \tag{4.20} \]
If we wish to compare eq. (4.19) with eq. (4.20), we have to keep in
mind that $v/c$ is a small quantity (about $10^{-4}$). Therefore, we expand
both formulas into power series with respect to $v/c$. We get

\[ \operatorname{ctg} \alpha^*_{\text{rel}} = \operatorname{ctg} \alpha - (v/c) \operatorname{cosec} \alpha + \tfrac{1}{2} (v/c)^2 \operatorname{ctg} \alpha + \cdots \tag{4.19a} \]

and

\[ \operatorname{ctg} \alpha^*_{\text{class}} = \operatorname{ctg} \alpha - (v/c) \operatorname{cosec} \alpha. \tag{4.20a} \]

The observed effect is the first-order effect, while the relativistic
second-order effect is far below the attainable accuracy of observation. The
relativistic equation (4.19) is, therefore, in agreement with the observed
facts.
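To see how small the second-order difference is, the following sketch compares the relativistic cotangent, (ctg a - (v/c) cosec a)/sqrt(1 - v^2/c^2), with the classical one, ctg a - (v/c) cosec a. The angle of 1 radian and the function names are arbitrary illustrative choices; v/c = 1e-4 is the order of magnitude quoted in the text.

```python
import math


def cot_classical(alpha, v_over_c):
    """Classical aberration: ctg(alpha*) = ctg(alpha) - (v/c) cosec(alpha)."""
    return 1.0 / math.tan(alpha) - v_over_c / math.sin(alpha)


def cot_relativistic(alpha, v_over_c):
    """Relativistic aberration: the classical value over sqrt(1 - v^2/c^2)."""
    return cot_classical(alpha, v_over_c) / math.sqrt(1.0 - v_over_c ** 2)


if __name__ == "__main__":
    alpha, b = 1.0, 1.0e-4                     # angle in radians, v/c
    first_order = cot_classical(alpha, b) - cot_classical(alpha, 0.0)
    second_order = cot_relativistic(alpha, b) - cot_classical(alpha, b)
    print(first_order)    # of order v/c
    print(second_order)   # of order (v/c)^2, close to 0.5 * b**2 * ctg(alpha)
```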
We can explain Fizeau's experiment by connecting the coordinate
system S with the earth and S* with the flowing liquid. With respect
to S*, the liquid is at rest, and the equation of the light rays must be
of the form

\[ x^* = \frac{c}{n} (t^* - t_0^*). \tag{4.21} \]

Applying the Lorentz transformation equations (4.13), we obtain

\[ x - vt = \frac{c}{n} \left[ \left( t - \frac{v}{c^2} x \right) - t_0^* \sqrt{1 - v^2/c^2} \right]. \tag{4.22} \]

We obtain the velocity of the light ray with respect to S by solving this
equation with respect to $x$,

\[ x = \frac{c/n + v}{1 + v/(nc)} \, t + \text{const.} \approx \left[ \frac{c}{n} + v \left( 1 - \frac{1}{n^2} \right) \right] t + \text{const.} \tag{4.23} \]

Again the observable first-order effect is in agreement with the
experiment.
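As an illustration, the sketch below compares the exact speed of the light ray relative to S, (c/n + v)/(1 + v/(nc)), with its first-order form c/n + v(1 - 1/n^2). The refractive index 1.33 and the flow speed are hypothetical sample values; the function names and units with c = 1 are likewise assumptions.

```python
def speed_in_lab(n, v, c=1.0):
    """Exact speed, relative to S, of light in a liquid flowing with speed v."""
    return (c / n + v) / (1.0 + v / (n * c))


def first_order_speed(n, v, c=1.0):
    """The same speed kept only to first order in v/c."""
    return c / n + v * (1.0 - 1.0 / n ** 2)


if __name__ == "__main__":
    n, v = 1.33, 1.0e-6   # hypothetical index of refraction and flow speed (c = 1)
    print(speed_in_lab(n, v))
    print(first_order_speed(n, v))
    print(speed_in_lab(1.0, 0.3))   # n = 1: the speed stays equal to c
```

The two expressions differ only by terms of order v^2/c, and for n = 1 the exact formula returns c for every v, as the invariance of the speed of light requires.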
The "kinematic" effects of the Lorentz transformation. We shall
now study in more detail the effect of the Lorentz equations on length
and time measurements in different frames of reference.
Let us consider a clock that is rigidly connected with the starred
frame of reference, stationed at some point $(x_0^*, y_0^*, z_0^*)$. Let us
compare the time indicated by that clock with the time $t$ measured in the
unstarred system. According to eq. (4.15), the unstarred time
coordinate of the clock is given by

\[ t = \frac{(v/c^2)\, x_0^* + t^*}{\sqrt{1 - v^2/c^2}}. \]

An S-time interval, $(t_2 - t_1)$, is therefore related to the readings
$t_2^*$ and $t_1^*$ of the clock as follows:

\[ t_2 - t_1 = \frac{t_2^* - t_1^*}{\sqrt{1 - v^2/c^2}}. \tag{4.24} \]
Thus, the rate of the clock appears slowed down, from the point of view
of S, by the factor $\sqrt{1 - v^2/c^2}$. But not only that. Observed from
the unstarred frame of reference, different S*-clocks go at the same
rate, but with a phase constant depending on their position. The
farther away an S*-clock is stationed from the point of origin along
the positive X*-axis, the slower (that is, the farther behind) it appears
to be. Two events that occur simultaneously with respect to S are not in
general simultaneous with respect to S*, and vice versa.
We can reverse our setup and compare an S-clock with S*-time.
The clock may be located at the point $(x_1, y_1, z_1)$, and the starred time
is connected with the time indicated by the S-clock through the
equation

\[ t^* = \frac{t - (v/c^2)\, x_1}{\sqrt{1 - v^2/c^2}}. \]

Again the readings of the S-clock are related to an S*-time interval
as follows:

\[ t_2^* - t_1^* = \frac{t_2 - t_1}{\sqrt{1 - v^2/c^2}}. \tag{4.25} \]

It appears that the S-clock is slowed down, measured in terms of S*-time,
and that it is ahead of an S-clock placed at the origin, if its own
x-coordinate is positive.
How is it that an observer connected with either frame of reference
finds the rate of the clocks in the other system slow? To measure the
rate of a clock T which is not at rest relatively to his frame of reference,
an observer compares it with all the clocks in his system which T
passes in the course of time. That is to say, an S-observer compares
one S*-clock with a succession of S-clocks, while an S*-observer
compares one S-clock with several S*-clocks. The S*-clock passes, in the
course of time, S-clocks which are farther and farther along the positive
X-axis and therefore increasingly fast with respect to S*; consequently,
the rate of the S*-clock appears slow in comparison. Conversely, an
S-clock passes S*-clocks farther and farther along the negative X*-axis
and therefore increasingly fast with respect to S. The rate of the
S-clock appears slow compared with S*-clocks.
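The two effects described above, the common slow rate and the position-dependent phase, can be read off from the inverse transformation solved for t*: an S*-clock at rest at x0* shows t* = t sqrt(1 - v^2/c^2) - (v/c^2) x0* when the S-time is t. The sketch below tabulates this under the assumption c = 1; the function name and sample values are hypothetical.

```python
import math


def star_clock_reading(t, x0_star, v, c=1.0):
    """Reading t* of an S*-clock resting at x0*, observed at S-time t.

    Obtained by solving t = (t* + (v/c^2) x0*) / sqrt(1 - v^2/c^2) for t*.
    """
    return t * math.sqrt(1.0 - (v / c) ** 2) - v * x0_star / c ** 2


if __name__ == "__main__":
    v = 0.6
    for x0_star in (0.0, 5.0, 10.0):
        # All clocks advance at the same rate, but those farther along
        # the positive X*-axis lag behind at any given S-time.
        print(x0_star, star_clock_reading(10.0, x0_star, v))
```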
In the case of length measurements, conditions are somewhat more
involved, because the transformation equations contain $y$ and $z$ in a
different way than $x$, the direction of relative motion. A rigid scale
that is perpendicular to the direction of relative motion has the same
length in either coordinate system. However, when the scale is parallel
to the X-axis and the X*-axis, we have to distinguish whether the scale
is at rest relative to one coordinate system or to the other. Let us
first consider a rod rigidly connected with S*, the end points of which
have the coordinates $(x_1^*, 0, 0)$ and $(x_2^*, 0, 0)$. Its length in its
own system is

\[ l^* = x_2^* - x_1^*. \tag{4.26} \]

An observer connected with S will consider as the length of the rod the
coordinate difference $(x_2 - x_1)$ of its end points at the same time, $t$.
The coordinates $x_2^*$ and $x_1^*$ are related to $x_2$, $x_1$, and $t$ by
equations (4.13), yielding

\[ x_1^* = \frac{x_1 - vt}{\sqrt{1 - v^2/c^2}}, \qquad x_2^* = \frac{x_2 - vt}{\sqrt{1 - v^2/c^2}}. \tag{4.27} \]

Therefore, the coordinate differences are

\[ x_2^* - x_1^* = \frac{x_2 - x_1}{\sqrt{1 - v^2/c^2}}. \tag{4.28} \]

If we denote the length $(x_2 - x_1)$ by $l$, we obtain

\[ l = \sqrt{1 - v^2/c^2} \; l^*. \tag{4.29} \]

The rod appears contracted by the factor $\sqrt{1 - v^2/c^2}$. This effect is
called the Lorentz contraction.
A calculation that reverses the roles of the two coordinate systems
shows that a rod at rest in the unstarred system appears contracted in
the starred system.
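The essential step in the length measurement is that both end points are located at the same S-time t. A minimal sketch, assuming c = 1 and hypothetical function names: solving x* = (x - vt)/sqrt(1 - v^2/c^2) for x gives the S-position of each end point at time t, and the difference of the two positions is the measured length.

```python
import math


def endpoint_position_in_s(x_star, t, v, c=1.0):
    """S-coordinate, at S-time t, of a point resting at x* in S*."""
    return x_star * math.sqrt(1.0 - (v / c) ** 2) + v * t


def measured_length(x1_star, x2_star, t, v, c=1.0):
    """Length assigned by the S-observer: both end points taken at the same t."""
    return (endpoint_position_in_s(x2_star, t, v, c)
            - endpoint_position_in_s(x1_star, t, v, c))


if __name__ == "__main__":
    # A rod of rest length 10 moving with v = 0.6 (units with c = 1).
    print(measured_length(0.0, 10.0, t=3.0, v=0.6))
    print(measured_length(0.0, 10.0, t=7.0, v=0.6))   # independent of t
```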
Thus, we have the rules: Every clock appears to go at its fastest rate
when it is at rest relatively to the observer. If it moves relatively to the
observer with the velocity $v$, its rate appears slowed down by the factor
$\sqrt{1 - v^2/c^2}$. Every rigid body appears to be longest when at rest
relatively to the observer. When it is not at rest, it appears contracted in
the direction of its relative motion by the factor $\sqrt{1 - v^2/c^2}$, while
its dimensions perpendicular to the direction of motion are unaffected.
The proper time interval. In contrast to the classical transformation
theory, we no longer consider length and time intervals as invariants.
But the invariant character of the speed of light gives rise to the
existence of another invariant. Let us return to equations (4.1), (4.2), and
(4.3), and conditions (4.8). We shall consider two events having the
space and time coordinates $(x_1, y_1, z_1, t_1)$ and $(x_2, y_2, z_2, t_2)$,
respectively. The difference between the squared time interval and the
squared distance, divided by $c^2$, shall be called $\tau_{12}^2$, or

\[ \tau_{12}^2 = (t_2 - t_1)^2 - \frac{1}{c^2} \left[ (x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2 \right]. \tag{4.30} \]

Correspondingly, we define a similar quantity with respect to S*,

\[ \tau_{12}^{*2} = (t_2^* - t_1^*)^2 - \frac{1}{c^2} \left[ (x_2^* - x_1^*)^2 + (y_2^* - y_1^*)^2 + (z_2^* - z_1^*)^2 \right]. \tag{4.31} \]
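Now we express $\tau_{12}^{*2}$ in terms of S-quantities, according to eqs.
(4.1), (4.2), and (4.3), just as we did in the discussion of eq. (4.5), and
obtain

\[ \begin{aligned}
\tau_{12}^{*2} = {} & (\beta^2 - \alpha^2 v^2/c^2)(t_2 - t_1)^2 - \frac{1}{c^2} \left[ (\alpha^2 - c^2 \gamma^2)(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2 \right] \\
& + 2 (\alpha^2 v/c^2 + \beta \gamma)(x_2 - x_1)(t_2 - t_1).
\end{aligned} \tag{4.32} \]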
Because the constants $\alpha$, $\beta$, and $\gamma$ satisfy conditions (4.8),
we find that $\tau_{12}^2$ is an invariant with respect to the transformation
equations (4.13), or,

\[ \tau_{12}^{*2} = \tau_{12}^2. \tag{4.33} \]
It is also invariant with respect to spatial orthogonal transformations
(14).
Hereafter, we shall call all the linear transformations with respect to
which $\tau_{12}$ is invariant, Lorentz transformations, regardless of whether the
relative motion of the two systems takes place along the common
X-axis or not. Obviously, the invariance of $\tau_{12}^2$ implies the invariance
of the speed of light, for the path of a light ray is characterized by the
vanishing of $\tau_{12}^2$ for all pairs of points along its path.
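The invariance can also be checked numerically. The sketch below assumes the familiar form of the special Lorentz transformation along the X-axis and units in which c = 1 by default; the function names and the sample events are illustrative assumptions only.

```python
import math


def boost_x(event, v, c=1.0):
    """Transform an event (x, y, z, t) to a frame moving with velocity v along X."""
    x, y, z, t = event
    k = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return (k * (x - v * t), y, z, k * (t - v * x / c**2))


def tau_squared(e1, e2, c=1.0):
    """Squared proper time interval between two events, in the form of eq. (4.30)."""
    dx, dy, dz, dt = (b - a for a, b in zip(e1, e2))
    return dt**2 - (dx**2 + dy**2 + dz**2) / c**2


if __name__ == "__main__":
    e1, e2 = (0.0, 0.0, 0.0, 0.0), (1.0, 2.0, 3.0, 5.0)  # hypothetical events
    for v in (0.0, 0.3, 0.6, -0.9):
        # The printed value should agree for every v, up to rounding.
        print(v, tau_squared(boost_x(e1, v), boost_x(e2, v)))
```

For two events on the path of a light ray, `tau_squared` vanishes in every frame, which is the statement made above about the speed of light.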
What is the physical significance of this quantity $\tau_{12}$? If there
exists a frame of reference with respect to which both events take place
at the same space point, then $\tau_{12}$ (the positive square root of $\tau_{12}^2$) is the
time recorded by a clock at rest in that frame of reference. $\tau_{12}$ is
therefore called the proper time interval (or eigen time interval).
Does there always exist a frame of reference with respect to which
two events take place at the same space point? If we were dealing
with the classical transformation equations, the answer would be yes,
unless the two events took place "simultaneously." But the equations
of the Lorentz transformation, (4.13), become singular when v, the
relative velocity of the two frames, becomes equal to the speed of
light. For values of v greater than c, equations (4.13) would lead to
imaginary values of x* and t*. The Lorentz transformation equations
are, thus, defined only for relative velocities of the two frames of
reference smaller than c. Therefore, if two events occur in such rapid
succession that the time difference is equal to or less than the time needed
by a light ray to traverse the spatial distance between the two events,
no frame of reference exists with respect to which the two events occur
at the same spot.
Whenever the two events can be just connected by a light ray which
leaves the site of one event at the time it occurs and arrives at the site
of the other event as it takes place, the proper time interval $\tau_{12}$ between
them vanishes. Whenever the sequence of two events is such that a
light ray coming from either event arrives at the site of the other only
after it has occurred, $\tau_{12}^2$ is negative. Then we introduce instead of
$\tau_{12}$ the invariant $\sigma_{12}^2 = -c^2\tau_{12}^2$,

$$\sigma_{12}^{2} = (x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2 - c^2(t_2 - t_1)^2. \tag{4.34}$$

Either $\tau_{12}$ or $\sigma_{12}$ is real for any two events. Whenever $\sigma_{12}$ is real, we can
carry out a Lorentz transformation so that $t_2^* - t_1^*$ vanishes. In other
words, there exists a frame of reference with respect to which the events
occur simultaneously. In that frame of reference, the spatial distance
between the two events is simply $\sigma_{12}$.
Frequently, either $\tau_{12}$ or $\sigma_{12}$ is referred to as the space-time interval
between the two events. The interval is called timelike when $\tau_{12}$ is
real, and spacelike when $\sigma_{12}$ is real. Whether the interval between the
two events is timelike or spacelike does not depend on the frame of
reference or the coordinate system used, but is an invariant property
of the two events.
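The classification can be sketched in a few lines of Python for two events on the X-axis. The function name, the label "light ray" for a vanishing interval, and the default c = 1 are our own assumptions. Alongside the kind of interval, the sketch returns the velocity of the frame in which the events coincide in space (timelike case) or in time (spacelike case).

```python
import math


def describe_interval(dx, dt, c=1.0):
    """Classify the interval between two events separated by dx and dt on the X-axis.

    Returns (kind, magnitude, v): tau_12 and v = dx/dt for a timelike interval,
    sigma_12 and v = c^2 dt/dx for a spacelike one.
    """
    tau2 = dt**2 - dx**2 / c**2
    if tau2 > 0:
        # A frame moving with v = dx/dt finds both events at the same place.
        return "timelike", math.sqrt(tau2), dx / dt
    if tau2 < 0:
        # A frame moving with v = c^2 dt/dx finds both events simultaneous.
        return "spacelike", math.sqrt(-(c**2) * tau2), c**2 * dt / dx
    return "light ray", 0.0, None


if __name__ == "__main__":
    print(describe_interval(3.0, 5.0))
    print(describe_interval(5.0, 3.0))
    print(describe_interval(2.0, 2.0))
```

In both nontrivial cases the returned velocity is smaller than c in magnitude, in keeping with the restriction on Lorentz transformations discussed above.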
We mentioned before that the Lorentz transformation is defined only
for relative velocities smaller than the speed of light. If a frame of
reference could move as fast as or faster than light, it would be, indeed,
impossible for light to propagate at all in the forward direction, much
less with the speed c.
The relativistic law of the addition of velocities. Is it possible to find
two frames of reference which are moving relatively to each other with
a velocity greater than c by carrying out a series of successive Lorentz
transformations? To answer this question, we shall study the
superposition of two (or more) Lorentz transformations. We shall introduce
three frames of reference, S, S*, and S**. S* has the velocity v relative
to S, and S** has the velocity w relative to S*. We want to find the
transformation equations connecting S** with S. Starting with the
equations

$$\begin{aligned}
x^* &= \frac{x - vt}{\sqrt{1 - v^2/c^2}}, & y^* &= y, & z^* &= z, & t^* &= \frac{t - (v/c^2)\,x}{\sqrt{1 - v^2/c^2}}, \\
x^{**} &= \frac{x^* - wt^*}{\sqrt{1 - w^2/c^2}}, & y^{**} &= y^*, & z^{**} &= z^*, & t^{**} &= \frac{t^* - (w/c^2)\,x^*}{\sqrt{1 - w^2/c^2}},
\end{aligned} \tag{4.35}$$
we have to substitute the first set of equations in the second set. The
result of the straightforward calculation is

$$x^{**} = \frac{x - ut}{\sqrt{1 - u^2/c^2}}, \qquad y^{**} = y, \qquad z^{**} = z, \qquad t^{**} = \frac{t - (u/c^2)\,x}{\sqrt{1 - u^2/c^2}}, \tag{4.36}$$

with

$$u = \frac{v + w}{1 + vw/c^2}. \tag{4.37}$$
Thus, two Lorentz transformations, carried out one after the other,
are equivalent to one Lorentz transformation. But the relative velocity
of S** with respect to S is not simply the sum of v and w. As long as
both v/c and w/c are small compared with unity, u is very nearly equal
to v + w, but as one of the two velocities approaches c, the deviation
becomes important. Eq. (4.37) can be written in the form

$$u = c\left[1 - \frac{(1 - v/c)(1 - w/c)}{1 + vw/c^2}\right]. \tag{4.37a}$$

In this form, it is obvious that u cannot become equal to or greater
than c, as long as both v and w are smaller than c. Therefore, it is
impossible to combine several Lorentz transformations in one involving
a relative velocity greater than c.
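A short Python sketch makes the point concrete: however many boosts are combined, the resultant velocity stays below c. The function names and the sample speed 0.9c are illustrative assumptions; velocities are measured in units with c = 1 unless another value is passed.

```python
def add_velocities(v, w, c=1.0):
    """Resultant velocity u in the form of eq. (4.37)."""
    return (v + w) / (1.0 + v * w / c**2)


def add_velocities_alt(v, w, c=1.0):
    """The same resultant, written in the form of eq. (4.37a)."""
    return c * (1.0 - (1.0 - v / c) * (1.0 - w / c) / (1.0 + v * w / c**2))


if __name__ == "__main__":
    u = 0.0
    for step in range(1, 6):
        u = add_velocities(u, 0.9)  # five successive boosts of 0.9c each
        print(step, u)
```

For small v and w the result is close to v + w, and the two forms agree up to rounding.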
Eq. (4.37) can be interpreted in a slightly different way, for a body
which has the velocity w with respect to S* has the velocity u with
respect to S. Then eq. (4.37) can be regarded as the transformation
law for velocities (in the X-direction). In this case, it would be
preferable to write it

$$u = \frac{u^* + v}{1 + u^* v/c^2}, \tag{4.38}$$

where w has been replaced by u*. We conclude that a body has a
velocity smaller than c in every inertial system if its velocity is less
than c with respect to one inertial system.
The Lorentz transformation equations imply that no material body
can have a velocity greater than c with respect to any inertial system.
For each material body can be used as a frame of reference; and if it
is removed from interaction with other bodies and does not rotate
around its own center of gravity, it defines a new inertial system.
Then, if the body could assume a velocity greater than c with respect
to any inertial system, this system and the one connected with the
body would have a relative velocity greater than c.
The proper time of a material body. We have spoken before of the
space-time interval between two events. The application of this
concept to the motion of a material body and to the space-time points
along its path is particularly important for the development of
relativistic mechanics. Since the velocity of a material body remains
below c at all times, such an interval is always timelike. If the
motion of the body is not straight-line and uniform, we can still define the
parameter $\tau$ along its path by the differential equation

$$d\tau^2 = dt^2 - \frac{1}{c^2}\left(dx^2 + dy^2 + dz^2\right) = dt^2\left\{1 - \frac{1}{c^2}\left[\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2 + \left(\frac{dz}{dt}\right)^2\right]\right\}. \tag{4.39}$$

$\tau$ is the time shown by a clock rigidly connected with the moving body,
really its "proper time" (its own time). When eq. (4.39) is divided by
$dt^2$ and the root is taken, we obtain the relation between coordinate
time and proper time,

$$\frac{d\tau}{dt} = \sqrt{1 - u^2/c^2}, \tag{4.40}$$

where u is the velocity of the body. This relation is valid for
accelerated as well as unaccelerated bodies.
Both $d\tau$ and $\tau$, which is defined by the integral

$$\tau = \int d\tau = \int \sqrt{1 - u^2/c^2}\; dt, \tag{4.40a}$$

are invariant with respect to Lorentz transformations, though dt and
u are not.
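Since eq. (4.40a) holds for accelerated bodies as well, the integral usually has to be evaluated numerically. The following Python sketch uses a simple midpoint rule; the function name, the step count, and the two sample velocity histories are assumptions made for illustration, with c = 1 by default.

```python
import math


def proper_time(u_of_t, t_start, t_end, steps=10_000, c=1.0):
    """Approximate the integral of sqrt(1 - u^2/c^2) dt with the midpoint rule.

    u_of_t is a function returning the speed of the body at coordinate time t.
    """
    dt = (t_end - t_start) / steps
    total = 0.0
    for n in range(steps):
        u = u_of_t(t_start + (n + 0.5) * dt)
        total += math.sqrt(1.0 - (u / c) ** 2) * dt
    return total


if __name__ == "__main__":
    # Uniform motion at 0.6c for 10 units of coordinate time.
    print(proper_time(lambda t: 0.6, 0.0, 10.0))
    # A hypothetical accelerated motion whose speed oscillates below c.
    print(proper_time(lambda t: 0.8 * math.sin(t), 0.0, 10.0))
```

In either case the proper time comes out no larger than the elapsed coordinate time, since the integrand never exceeds unity.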
PROBLEMS
1. On page 39 we have discussed one method of measuring the length
of a moving rod. We could also define that length as the product of
the velocity of the moving rod by the time interval between the instant
when one end point of the moving rod passes a fixed marker and the
instant when the other end point passes the same marker.
Show that this definition leads also to the Lorentz contraction
formula, equation (4.29).
2. Two rods which are parallel to each other move relatively to each
other in their length directions. Explain the apparent paradox that
either rod may appear longer than the other, depending on the state
of motion of the observer.
3. Suppose that the frequency of a light ray is $\nu$ with respect to a
frame of reference S. Its frequency $\nu^*$ in another frame of reference,
S*, depends on the angle $\alpha$ between the direction of the light ray and
the direction of relative motion of S and S*. Derive both the classical
and the relativistic equations stating how $\nu^*$ depends on $\nu$ and the
angle $\alpha$.
For this purpose, the light may be treated as a plane scalar wave
moving with the velocity c.
Answer:

$$\nu^*_{\text{class.}} = \nu\left(1 - \cos\alpha \cdot v/c\right),$$

$$\nu^*_{\text{rel.}} = \nu\,\frac{1 - \cos\alpha \cdot v/c}{\sqrt{1 - v^2/c^2}} = \nu\left(1 - \cos\alpha \cdot v/c + \tfrac{1}{2}(v/c)^2 + \cdots\right).$$

The first-order effect common to both formulas is the "classical" Doppler
effect; the second-order term is called the "relativistic" Doppler effect.
It is independent of the angle $\alpha$.
4. H. A. Lorentz created a theory which was the forerunner to the
relativity theory as we know it today. Instead of trying to extend the
relativity principle to electrodynamics, he assumed that there exists one
privileged frame of reference, with respect to which the ether was to be
at rest. In order to account for the outcome of the Michelson-Morley
experiment, he assumed that the ether affects scales and clocks which
are moving through it. According to this hypothesis, clocks are slowed
down and scales are contracted in the direction of their motion. It is
possible to derive the quantitative expressions for the factors of time
and length contraction with the help of these notions.
(a) Assuming that the Galilean transformation equations are
applicable, derive the rigorous expression for the time that a light ray needs
to travel a measured distance l in both directions along a straight path
in a Michelson-Morley apparatus, provided that the velocity of the
apparatus relative to the privileged system is v and that the angle
between the path and the direction of v is $\alpha$.
Answer:

$$t = \frac{2l}{c}\,\frac{\sqrt{1 - (v^2/c^2)\sin^2\alpha}}{1 - v^2/c^2}. \tag{4.p1}$$
(b) Now we introduce Lorentz' hypothesis and assume that equation
(4.p1) holds for the true, contracted length l and the true, distorted
angle $\alpha$. The time indicated by the observer's clock is not the real
time t, but the clock time, t*. Furthermore, we measure the length with
scales that are contracted themselves; that is, what we measure is not
the true, contracted length l, but the apparent, uncontracted length l*.
The relation between the clock time t* and the apparent length l* is

$$t^* = \frac{2l^*}{c}, \tag{4.p2}$$

according to the outcome of the Michelson-Morley experiment. We
call the factor of time contraction $\theta$ and the factor of length contraction
in the direction of v, $\Lambda$. Derive the relations between t and t*, l and l*,
and determine $\Lambda$ and $\theta$ so that eqs. (4.p1) and (4.p2) become equivalent.
Answer:

$$t^* = \theta t, \qquad l^* = l\sqrt{\sin^2\alpha + \Lambda^{-2}\cos^2\alpha}, \qquad \Lambda = \theta = \sqrt{1 - v^2/c^2}. \tag{4.p3}$$

(c) In order to obtain the complete Lorentz transformation equations
(4.13), introduce two coordinate systems, one at rest and one moving
through the ether (S and S*). Determine the apparent distances of
points on the starred coordinate axes from the starred point of origin.
Finally, find out how moving clocks must be adjusted so that a signal
spreading in all directions from the starred point of origin and starting
at the time t = t* = 0 has the apparent speed c in all directions.
CHAPTER V
Vector and Tensor Calculus in an n-Dimensional
Continuum
The classical transformation theory draws a sharp dividing line
between space and time coordinates. The time coordinate is always
transformed into itself, because time intervals are considered in classical
physics to be invariant.
The relativistic transformation theory destroys this detached position
of the time coordinate in that the time coordinate of one coordinate
system depends on both the time and space coordinates of another
system whenever the two systems considered are not at rest relative to
each other.
The laws of classical physics are always formulated so that the time
coordinate is set apart from the spatial coordinates, and this is quite
appropriate because of the character of the transformations with respect
to which these laws are covariant. It is possible to formulate relativistic
physics so that the time coordinate retains its customary special
position, but we shall find that in this form the relativistic laws are
cumbersome and often difficult to apply.
A proper formalism must be adapted to the theory which it is to
represent. The Lorentz transformation equations suggest the uniform
treatment of the four coordinates x, y, z, and t. How this might be done
was shown by H. Minkowski. We shall find that the application of his
formalism will simplify many problems, and that with its help many
relativistic laws and equations turn out to be more lucid than their
nonrelativistic analogues.
Classical physics is characterized by the invariance of length and
time. We can formally characterize relativistic physics by the
invariance of the expression

$$\tau_{12}^{2} = (t_2 - t_1)^2 - \frac{1}{c^2}\left[(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2\right]. \tag{5.1}$$
The invariance of this quadratic form of the coordinate differences
restricts the group of all conceivable linear transformations of the four
coordinates x, y, z, and t to that of the Lorentz transformations, just as
the invariance of the expression

$$s_{12}^{2} = (x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2 \tag{5.2}$$

defines the group of three dimensional orthogonal transformations. The
four dimensional continuum (x, y, z, t), with its invariant form $\tau_{12}^2$, can
be treated as a four dimensional "space," in which $\tau_{12}$ is the "distance"
between the two "points" $(x_1, y_1, z_1, t_1)$ and $(x_2, y_2, z_2, t_2)$. This
procedure permits the development of a sort of generalized vector calculus
in the "Minkowski world," and the formulation of all invariant
relations in a clear and concise way.
We shall begin the study of this mathematical method with a
recapitulation of elementary vector calculus, focusing our attention on its formal
aspects. Then we shall generalize the formalism so that it becomes
applicable to the space-time continuum.
Orthogonal transformations. Let us start with a rectangular
Cartesian coordinate system and call its three coordinates $x_1$, $x_2$, and $x_3$
(instead of x, y, and z). Call the coordinate differences between two
points P and P', $\Delta x_1$, $\Delta x_2$, and $\Delta x_3$. The distance between the two
points is given by

$$s^2 = \sum_{i=1}^{3} \Delta x_i^{2}. \tag{5.2a}$$

If we carry out a linear coordinate transformation,

$$x'_i = \sum_{k=1}^{3} c_{ik}x_k + \mathring{x}_i, \qquad i = 1, 2, 3, \tag{5.3}$$

the new coordinate differences are

$$\Delta x'_i = \sum_{k=1}^{3} c_{ik}\,\Delta x_k, \qquad i = 1, 2, 3. \tag{5.4}$$

These equations can be solved with respect to the $\Delta x_k$:

$$\Delta x_k = \sum_{i=1}^{3} c'_{ki}\,\Delta x'_i, \qquad k = 1, 2, 3. \tag{5.5}$$

Eq. (5.2a) expresses itself in terms of the new coordinates thus:

$$s^2 = \sum_{i,k,l=1}^{3} c'_{ik}c'_{il}\,\Delta x'_k\,\Delta x'_l. \tag{5.6}$$
The new coordinate system is a rectangular Cartesian system only if
eq. (5.6) is formally identical with eq. (5.2a), that is, if

$$\sum_{i=1}^{3} c'_{ik}c'_{il} = \begin{cases} 0 & \text{if } k \neq l, \\ 1 & \text{if } k = l. \end{cases} \tag{5.7}$$

These equations take a more concise form if we use the so-called
Kronecker symbol $\delta_{kl}$, which is defined by the equations

$$\delta_{kl} = 0, \quad k \neq l; \qquad \delta_{kl} = 1, \quad k = l. \tag{5.8}$$

Eq. (5.7) takes the form

$$\sum_{i=1}^{3} c'_{ik}c'_{il} = \delta_{kl}, \qquad k, l = 1, 2, 3. \tag{5.7a}$$

Eq. (5.7a) is the condition which must be satisfied if the transformation
equations (5.3) are to represent the transition from one Cartesian
coordinate system to another.
We can easily formulate the condition to be satisfied by the $c_{ik}$
themselves. By substituting eqs. (5.4) in eqs. (5.5) we obtain

$$\Delta x_k = \sum_{i,l=1}^{3} c'_{ki}c_{il}\,\Delta x_l, \qquad k = 1, 2, 3, \tag{5.9}$$

and because this equation holds for arbitrary $\Delta x_k$, we find

$$\sum_{i=1}^{3} c'_{ki}c_{il} = \delta_{kl}, \qquad k, l = 1, 2, 3. \tag{5.10}$$

Now we can multiply eqs. (5.7a) by $c_{lm}$ and sum over the three possible
values of l. We obtain, because of (5.10),

$$\sum_{i,l=1}^{3} c'_{ik}c'_{il}c_{lm} = c'_{mk} = \sum_{l=1}^{3} \delta_{kl}c_{lm} = c_{km}. \tag{5.11}$$

By substituting $c_{ki}$ for $c'_{ik}$, and so forth, in eqs. (5.7a), we obtain

$$\sum_{i=1}^{3} c_{ki}c_{li} = \delta_{kl}, \qquad k, l = 1, 2, 3, \tag{5.7b}$$

and eqs. (5.10) take the form

$$\sum_{i=1}^{3} c_{ik}c_{il} = \delta_{kl}, \qquad k, l = 1, 2, 3. \tag{5.10a}$$

Either eqs. (5.7b) or (5.10a), together with eqs. (5.3), define the group
of orthogonal transformations.
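The two orthogonality conditions are easy to test for a given set of coefficients. The Python sketch below does so for a rotation of the axes about the third coordinate axis and for a simple stretching, which is linear but not orthogonal. All function names and the sample matrices are illustrative assumptions; indices run from 0 to 2 in the code rather than from 1 to 3.

```python
import math


def rotation_about_x3(angle):
    """Coefficients c_ik for a rotation of the axes about the x3-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]


def kronecker(k, l):
    """Kronecker symbol of eq. (5.8)."""
    return 1.0 if k == l else 0.0


def satisfies_5_7b(c, tol=1e-12):
    """Check sum_i c_ki c_li = delta_kl for all k, l."""
    n = len(c)
    return all(
        abs(sum(c[k][i] * c[l][i] for i in range(n)) - kronecker(k, l)) <= tol
        for k in range(n)
        for l in range(n)
    )


def satisfies_5_10a(c, tol=1e-12):
    """Check sum_i c_ik c_il = delta_kl for all k, l."""
    n = len(c)
    return all(
        abs(sum(c[i][k] * c[i][l] for i in range(n)) - kronecker(k, l)) <= tol
        for k in range(n)
        for l in range(n)
    )


if __name__ == "__main__":
    rotation = rotation_about_x3(0.7)
    stretch = [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    print(satisfies_5_7b(rotation), satisfies_5_10a(rotation))
    print(satisfies_5_7b(stretch), satisfies_5_10a(stretch))
```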
Transformation determinant. We shall now investigate the
transformations (5.3) and (5.7b) a little further.
The determinant of the coefficients $c_{ik}$,

$$|c_{ik}| = \begin{vmatrix} c_{11} & c_{12} & c_{13} \\ c_{21} & c_{22} & c_{23} \\ c_{31} & c_{32} & c_{33} \end{vmatrix},$$

is equal to ±1. To prove this statement, we make use of the
multiplication law of determinants, which states that the product of two
determinants $|a_{ik}|$ and $|b_{ik}|$ is equal to the determinant $\left|\sum_{l=1}^{3} a_{il}b_{lk}\right|$. Now
we form the determinant of both sides of eqs. (5.7b),

$$\left|\sum_{i=1}^{3} c_{ki}c_{li}\right| = |\delta_{kl}|. \tag{5.12}$$

According to the above-mentioned multiplication law, the left-hand side
can be written

$$\left|\sum_{i=1}^{3} c_{ki}c_{li}\right| = |c_{ki}| \cdot |c_{li}| = |c_{ki}|^2. \tag{5.13}$$

The value of the right-hand side of eq. (5.12) is equal to unity, since

$$|\delta_{kl}| = \begin{vmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{vmatrix} = 1. \tag{5.14}$$

Therefore, we really have

$$|c_{ki}| = \pm 1. \tag{5.15}$$

The value +1 of the determinant belongs to the "proper" rotations,
while the value −1 belongs to orthogonal transformations involving a
reflection.
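The two cases can be seen in a small Python sketch that evaluates the determinant for a proper rotation and for a reflection of the first coordinate axis. The helper `det3` and the sample coefficients are our own illustrative assumptions.

```python
import math


def det3(c):
    """Determinant of a 3x3 array of coefficients, expanded along the first row."""
    return (
        c[0][0] * (c[1][1] * c[2][2] - c[1][2] * c[2][1])
        - c[0][1] * (c[1][0] * c[2][2] - c[1][2] * c[2][0])
        + c[0][2] * (c[1][0] * c[2][1] - c[1][1] * c[2][0])
    )


angle = 0.4  # hypothetical rotation angle in radians
rotation = [
    [math.cos(angle), math.sin(angle), 0.0],
    [-math.sin(angle), math.cos(angle), 0.0],
    [0.0, 0.0, 1.0],
]
reflection = [[-1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

if __name__ == "__main__":
    print("proper rotation:", det3(rotation))
    print("reflection:", det3(reflection))
```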
Improved notation. In the great majority of equations occurring in
three dimensional vector (and tensor) calculus, every literal index which
occurs once in a product assumes any of the three values 1, 2, 3, and
every literal index which occurs twice in a product is a summation index.
From now on, therefore, we shall omit all summation signs and all
remarks of the type (i, k = 1, 2, 3), and it shall be understood that:
(1) Each literal index which occurs once in a product assumes all its
possible values;
(2) Each literal index which occurs twice in a product is a summation
index, where the summation is to be carried out over all possible values.
Thus, we write eqs. (5.3) and (5.6) like this:

$$x'_i = c_{ik}x_k + \mathring{x}_i,$$

$$s^2 = c'_{ik}c'_{il}\,\Delta x'_k\,\Delta x'_l.$$

Summation indices are often called dummy indices or simply dummies.
The significance of an expression is not changed if a pair of dummies is
replaced by some other letter, for example,

$$c_{ik}x_k = c_{il}x_l.$$
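In a program the convention has to be spelled out again: a free index becomes the index of the result, and a repeated index becomes a loop or a sum. The following Python sketch, with names of our own choosing and indices counted from 0, writes out eq. (5.3) and the squared distance in this way.

```python
def transform(c, x, shift):
    """x'_i = c_ik x_k + shift_i: i is a free index, k is summed over."""
    n = len(x)
    return [sum(c[i][k] * x[k] for k in range(n)) + shift[i] for i in range(n)]


def squared_distance(dx):
    """s^2 = dx_i dx_i, summed over the repeated index i."""
    return sum(dx[i] * dx[i] for i in range(len(dx)))


if __name__ == "__main__":
    # Hypothetical coefficients: an exchange of the first two axes plus a shift.
    c = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]
    print(transform(c, [2, 3, 4], [10, 20, 30]))
    print(squared_distance([1, 2, 2]))
```

Renaming the loop variable `k` inside `transform` changes nothing, which is the programming counterpart of renaming a pair of dummies.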
Vectors. The transformation law of the $\Delta x_k$, (5.4), is the general
transformation law of vectors with respect to orthogonal
transformations, or, rather: A vector is defined as a set of three quantities which
transform like coordinate differences:

$$a'_i = c_{ik}a_k. \tag{5.16}$$

When the vector components are given with respect to any one Cartesian
coordinate system, they can be computed with respect to every other
Cartesian coordinate system.
The norm of a vector is defined as the sum of the squared vector
components.
We shall prove that the norm is an invariant with respect to orthogonal
transformations, or that

$$a'_ia'_i = a_ia_i. \tag{5.17}$$

Substituting for $a'_i$ its expression (5.16), and making use of eq. (5.10a),
we obtain

$$a'_ia'_i = c_{ik}a_k\,c_{il}a_l = \delta_{kl}a_ka_l = a_ka_k,$$

which proves that eq. (5.17) holds for orthogonal transformations.
The scalar product of two vectors is defined as the sum of the products
of corresponding vector components,

$$(\mathbf{a}\mathbf{b}) = a_ib_i. \tag{5.18}$$

That this expression is an invariant with respect to orthogonal
transformations is shown by a computation analogous to the proof of eq.
(5.17). The norm of a vector is the scalar product of the vector by
itself.
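The invariance of the scalar product and of the norm can be confirmed numerically. In the Python sketch below the rotation coefficients and the two sample vectors are illustrative assumptions, and indices are counted from 0.

```python
import math


def transform_vector(c, a):
    """a'_i = c_ik a_k, the transformation law (5.16)."""
    n = len(a)
    return [sum(c[i][k] * a[k] for k in range(n)) for i in range(n)]


def scalar_product(a, b):
    """(ab) = a_i b_i, as in eq. (5.18)."""
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    """Sum of the squared components, the norm in the sense used in the text."""
    return scalar_product(a, a)


if __name__ == "__main__":
    angle = 0.9  # hypothetical rotation angle
    c = [
        [math.cos(angle), math.sin(angle), 0.0],
        [-math.sin(angle), math.cos(angle), 0.0],
        [0.0, 0.0, 1.0],
    ]
    a, b = [1.0, 2.0, 3.0], [-4.0, 0.5, 2.0]
    print(scalar_product(a, b), scalar_product(transform_vector(c, a), transform_vector(c, b)))
    print(norm(a), norm(transform_vector(c, a)))
```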
The word scalar is frequently used in vector and tensor calculus
instead of invariant. "Scalar product" means "invariant product."
Sums and differences of vectors are, again, vectors,

$$a_i + b_i = s_i, \qquad a_i - b_i = d_i. \tag{5.19}$$

That the new quantities $s_i$ and $d_i$ really transform according to eq.
(5.16) follows from the linear, homogeneous character of that
transformation law.
The product of a vector and a scalar (invariant) is a vector,

$$s\,a_i = b_i. \tag{5.20}$$

The proof is left to the reader.
The discussion of the remaining algebraic vector operation, the vector
product, must be deferred until later in this chapter, because its
transformation properties are not quite like those of a vector.
Vector analysis. We are now ready to go on to the simplest
differential operations, the gradient and the divergence. In the three
dimensional space of the three coordinates $x_i$, let us take a scalar field V,
that is, a function of the three coordinates $x_i$ which is invariant with
respect to coordinate transformations. The form of the function V of
the coordinates will, of course, depend on the coordinate system used,
but in such a way that its value at a fixed point P is not changed by the
transformation.
What is the transformation law of the derivatives of V with respect
to the three coordinates,

$$V_{,i} \equiv \frac{\partial V}{\partial x_i}\,? \tag{5.21}$$

We must express the derivatives with respect to $x'_k$ in terms of the
derivatives with respect to $x_i$,

$$\frac{\partial V}{\partial x'_k} = \frac{\partial x_i}{\partial x'_k}\,\frac{\partial V}{\partial x_i}. \tag{5.22}$$

According to eq. (5.3), the $x'_k$ are linear functions of the $x_i$, and vice
versa. Therefore, the $\partial x_i/\partial x'_k$ are constants, and they are the constants
$c'_{ik}$ defined by eqs. (5.5). We have, therefore,

$$V_{,k'} = c'_{ik}V_{,i},$$

and, according to eq. (5.11),

$$V_{,k'} = c_{ki}V_{,i}. \tag{5.23}$$
The three quantities $V_{,i}$ transform according to eq. (5.16); therefore,
they are the components of a vector, which is called the gradient of the
scalar field V.
Three functions of the coordinates, $V_i(x_1, x_2, x_3)$, are the components of a
vector field if at each space point they transform as the components of a
vector. The functions $V'_i$ of the coordinates $x'_s$ are, thus, given by the
equations

$$V'_i(x'_s) = c_{ik}V_k(x_s), \tag{5.11a}$$

where the $x'_s$ are connected with the $x_s$ by the transformation equations.
The gradient creates a vector field out of a scalar field.
The divergence does the opposite. Given a vector field $V_i$, we form
the sum of the three derivatives of each component with respect to the
coordinate with the same index,

$$\operatorname{div} \mathbf{V} = V_{i,i}. \tag{5.24}$$

We have to show that this expression is an invariant (or scalar),

$$V'_{k,k'} = V_{i,i}. \tag{5.25}$$

The procedure is exactly the same as before. We replace the primed
quantities and derivatives by the unprimed quantities,

$$V'_{k,k'} = c'_{ik}\,(c_{kl}V_l)_{,i} = c'_{ik}c_{kl}V_{l,i}. \tag{5.26}$$

Because of eq. (5.10), this last expression is equal to the right-hand side
of eq. (5.25).
The divergence of a gradient of a scalar field is the Laplacian of that
scalar field and, of course, is itself a scalar field,

$$\operatorname{div} \operatorname{grad} V = V_{,ii} \equiv \nabla^2 V. \tag{5.27}$$
Tensors. In many parts of physics we encounter quantities whose
transformation laws are somewhat more involved than those of vectors.
As an example, let us consider the so-called "vector gradient." When
a vector field $V_i$ is given, we can obtain a set of quantities which
determine the change of each component of $V_i$ as we proceed from a point
with the coordinates $x_i$ in an arbitrary direction to the infinitesimally
near point with the coordinates $x_i + \delta x_i$. The increments of the three
quantities $V_i$ are

$$\delta V_i = V_{i,k}\,\delta x_k, \tag{5.28}$$

and the nine quantities $V_{i,k}$ are called the vector gradient of $V_i$. We
can easily derive its transformation law in the usual manner:

$$V'_{m,n'} = c'_{kn}\,(c_{mi}V_i)_{,k} = c_{mi}c_{nk}V_{i,k}. \tag{5.29}$$
The vector gradient is one example of the new class of quantities
which we are now going to treat, the tensors. In general, a tensor has
N indices, all of which take all values 1 to 3. The tensor has, therefore,
$3^N$ components. These $3^N$ components transform according to the
transformation law

$$T'_{mn\cdots} = c_{mi}c_{nk}\cdots T_{ik\cdots}. \tag{5.30}$$

The number of indices, N, is called its rank. The vector gradient is a
tensor of rank 2, vectors are tensors of rank 1, and scalars may be
called tensors of rank 0.
One very important tensor is the Kronecker symbol. Its values in
one coordinate system, when substituted into eq. (5.30), yield the same
values in another coordinate system,

$$\delta'_{kl} = c_{ki}c_{lm}\delta_{im} = c_{ki}c_{li} = \delta_{kl}, \tag{5.31}$$

according to eq. (5.7b).
The sum or difference of two tensors of equal rank is a tensor of the
same rank. We formulate this law for tensors of rank 3:

$$T_{ikl} + U_{ikl} = V_{ikl}, \tag{5.32}$$

$$T_{ikl} - U_{ikl} = W_{ikl}. \tag{5.33}$$

The proof is the same as for the corresponding law for vectors, eq. (5.19).
The product of two tensors of ranks M and N is a new tensor of rank
(M + N),

$$T_{ik\cdots}U_{lm\cdots} = V_{ik\cdots lm\cdots}. \tag{5.34}$$

The rank of a tensor may be lowered by 2 (or by any even number)
by an operation called "contraction." Any two indices are converted
into a pair of dummy indices. For instance, we can contract the tensor
$T_{ikl\cdots}$ to obtain the tensors $T_{iil\cdots}$ or $T_{ikk\cdots}$. The proof that these
new contracted tensors are again tensors is very simple. For the first
example given here, it runs as follows. According to eq. (5.30),

$$T'_{rrl\cdots} = c_{ri}c_{rk}c_{lm}\cdots T_{ikm\cdots}.$$

Because of eq. (5.10a), the right-hand side is equal to

$$T'_{rrl\cdots} = \delta_{ik}c_{lm}\cdots T_{ikm\cdots} = c_{lm}\cdots T_{iim\cdots}. \tag{5.35}$$

When we contract the vector gradient (tensor of rank 2), we obtain the
divergence (tensor of rank 0). The operations product, (5.34), and
contraction can be combined, so that they yield tensors such as $T_{ik}U_{ik}$,
$T_{ik}V_{km}$, $T_{ikl}U_{ik}$, $T_{ikl}U_{ikl}$.
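As a concrete check of the contraction rule, the Python sketch below transforms a tensor of rank 2 according to eq. (5.30) and compares its contraction before and after. The function names, the rotation coefficients, and the sample components are illustrative assumptions; indices are counted from 0.

```python
import math


def transform_rank2(c, t):
    """T'_mn = c_mi c_nk T_ik, the law (5.30) for a tensor of rank 2."""
    n = len(t)
    return [
        [
            sum(c[m][i] * c[p][k] * t[i][k] for i in range(n) for k in range(n))
            for p in range(n)
        ]
        for m in range(n)
    ]


def contract(t):
    """T_ii: contraction of a tensor of rank 2 over its two indices."""
    return sum(t[i][i] for i in range(len(t)))


if __name__ == "__main__":
    angle = 1.1  # hypothetical rotation angle
    c = [
        [math.cos(angle), 0.0, math.sin(angle)],
        [0.0, 1.0, 0.0],
        [-math.sin(angle), 0.0, math.cos(angle)],
    ]
    t = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 10.0]]
    print(contract(t), contract(transform_rank2(c, t)))
```

The individual components change under the rotation, while the contracted quantity, a tensor of rank 0, does not.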
Tensors may have symmetry properties with respect to their indices.
If a tensor is not changed when two or more indices are exchanged,
then it is symmetric in these indices. Instances are

$$t_{ikl} = t_{kil},$$

$$t_{iklm} = t_{ilkm} = t_{kilm} = t_{lkim} = t_{klim} = t_{likm}.$$

The first tensor is symmetric in its first two indices; the second tensor is
symmetric in its first three indices.
When a tensor remains the same or changes the sign of every
component upon the permutation of certain indices, the sign depending on
whether it is an even or an odd permutation, we say that the tensor is
antisymmetric (also skewsymmetric or alternating) with respect to these
indices. Instances are

$$t_{ikl} = -t_{kil},$$

$$t_{iklm} = t_{klim} = t_{likm} = -t_{ilkm} = -t_{kilm} = -t_{lkim}.$$

All such symmetry properties of a tensor are invariant. The proof
is extremely simple and shall be left to the student.
The Kronecker tensor is symmetric in its two indices.
Tensor analysis. When a tensor is differentiated with respect to the
coordinates, a new tensor is obtained, the rank of which is greater by 1.
The proof again consists of simple computation:

$$T'_{mn\cdots,s'} = c'_{ls}\,(c_{mi}c_{nk}\cdots T_{ik\cdots})_{,l} = c_{mi}c_{nk}\cdots c_{sl}\,T_{ik\cdots,l}. \tag{5.36}$$

When the resulting tensor is contracted with respect to the index of
differentiation and another index, for example, $T_{ik\cdots,k}$, it is often called
a divergence.
Tensor densities. The "vector product" of two vectors a and b is
usually defined as a vector which is perpendicular to a and to b and
which has the magnitude |a| · |b| sin (a, b). As there are always two
vectors satisfying these conditions, viz., $P_1$ and $P_2$ in Fig. 6, a choice
is made between these two vectors by the further condition that a, b,
and P shall form a "screw" of the same type as the coordinate axes in
the sequence x, y, z. In Fig. 6, the vector $P_1$ satisfies this condition,
but only because the chosen coordinate system is a "right-handed"
coordinate system. If we carry out a "reflection" (for example, give
the positive X-axis the direction to the rear of the figure instead of to
the front), $P_2$ becomes automatically the vector product of a and b.
The vector product is, thus, not an ordinary vector, but changes its
sign when we transform a right-handed coordinate system into a
left-handed system, or vice versa. Such quantities are called "axial
vectors," while ordinary vectors are called "polar vectors."
Fig. 6. The vector product. In a right-handed coordinate system, $P_1$ represents
the vector product of a by b.
With respect to a Cartesian coordinate system, the components of P
are given by the expressions

$$P_1 = a_2b_3 - a_3b_2, \qquad P_2 = a_3b_1 - a_1b_3, \qquad P_3 = a_1b_2 - a_2b_1. \tag{5.37}$$

Similarly, the curl of a vector field $V_i$ is defined as an "axial vector"
with the components

$$C_1 = V_{3,2} - V_{2,3}, \qquad C_2 = V_{1,3} - V_{3,1}, \qquad C_3 = V_{2,1} - V_{1,2}. \tag{5.38}$$
From the point of view of tensor calculus, we can avoid the concept
of "axial vector" by introducing vector product and curl as
skewsymmetric tensors of rank 2,

$$P_{ik} = a_ib_k - a_kb_i, \tag{5.37a}$$

and

$$C_{ik} = V_{k,i} - V_{i,k}. \tag{5.38a}$$

It can be shown that all equations in which "axial vectors" appear can
be written in the covariant manner with the help of such skewsymmetric
tensors. Nevertheless, this treatment does not show very clearly the
connection between the transformation law of a skewsymmetric tensor
of rank 2 and that of an "axial vector." We can conform closely to
the methods of elementary vector calculus by introducing in addition
to tensors a new type of quantity, the "tensor densities."
The tensor densities transform like tensors, except that they are also
multiplied by the transformation determinant (5.15). As long as this
determinant equals +1, that is, when the transformation is a "proper
orthogonal transformation" without reflection, there is no difference
between a tensor and a tensor density. But a density undergoes a
change of sign (compared with a tensor) when a reflection of the
coordinate system is carried out. The tensor densities have, thus, the same
relationship to tensors as the "axial vectors" have to the "polar vectors."
Their transformation law can be written thus:

$$\mathfrak{T}'_{mn\cdots} = |c_{ab}|\,c_{mi}c_{nk}\cdots\mathfrak{T}_{ik\cdots}. \tag{5.39}$$

The laws of tensor density algebra and calculus are: The sum or
difference of two tensor densities of equal rank is again a tensor density
of the same rank. The product of a tensor and a tensor density is a
tensor density. The product of two tensor densities is a tensor. The
contraction of a tensor density yields a new tensor density of lower rank.
The derivatives of the components of a tensor density are the
components of a new tensor density, the rank of which is greater by 1 than
the rank of the original density.
The tensor density of Levi-Civita. We found that the Kronecker
symbol is a tensor, the components of which take the same constant
values in every coordinate system. Likewise, there exists a constant
tensor density of rank 3, the Levi-Civita tensor density, defined as
follows. $\delta_{ikl}$ is skewsymmetric in its three indices; therefore, all those
components which have at least two indices equal vanish. The values of the
nonvanishing components are ±1, the sign depending on whether (i, k, l)
is an even or an odd permutation of (1, 2, 3).
We have yet to show that the $\delta_{ikl}$ are really the components of a tensor
density. To do that, let us consider a tensor density $\mathfrak{D}_{ikl}$ which has the
components $\delta_{ikl}$ in one coordinate system. If it turns out that its
components in some other coordinate system are again $\delta_{ikl}$, our assertion
is proved.
The components of $\mathfrak{D}_{ikl}$ in another coordinate system are

$$\mathfrak{D}'_{mns} = |c_{ab}|\,c_{mi}c_{nk}c_{sl}\,\delta_{ikl}. \tag{5.40}$$

As the skewsymmetry of $\mathfrak{D}_{ikl}$ is preserved by the coordinate
transformation, we know that all components $\mathfrak{D}'_{mns}$ with at least two equal
indices vanish. We have to compute only components with all three
indices different. The component $\mathfrak{D}'_{123}$ is given by the expression

$$\mathfrak{D}'_{123} = |c_{ab}|\,c_{1i}c_{2k}c_{3l}\,\delta_{ikl}. \tag{5.41}$$

The right-hand side is simply the square of $|c_{ab}|$, and equal to unity.
For $\delta_{ikl}$ is defined so that $c_{1i}c_{2k}c_{3l}\delta_{ikl}$ is just the determinant $|c_{ab}|$.
Now that we know that $\mathfrak{D}'_{123}$ is equal to unity, the remaining
components are obtained simply by using the symmetry properties.
They are

$$\mathfrak{D}'_{123} = \mathfrak{D}'_{231} = \mathfrak{D}'_{312} = -\mathfrak{D}'_{132} = -\mathfrak{D}'_{213} = -\mathfrak{D}'_{321} = 1. \tag{5.42}$$

In other words, the $\mathfrak{D}'_{mns}$ are again equal to $\delta_{mns}$, and the proof is
completed.
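The same result can be verified by brute force. The Python sketch below builds the Levi-Civita symbol, applies the transformation law (5.40) with the determinant factor included, and compares all 27 components with the original values, once for a proper rotation and once for a transformation involving a reflection. The closed formula used in `delta3`, the helper names, and the sample coefficients are our own assumptions; indices are counted from 0.

```python
import math


def delta3(i, k, l):
    """Levi-Civita symbol for indices taken from (0, 1, 2)."""
    return (i - k) * (k - l) * (l - i) // 2


def det3(c):
    """Determinant |c_ab|, written as c_0i c_1k c_2l delta_ikl."""
    r = range(3)
    return sum(c[0][i] * c[1][k] * c[2][l] * delta3(i, k, l) for i in r for k in r for l in r)


def transform_density(c):
    """D'_mns = |c_ab| c_mi c_nk c_sl delta_ikl, the law (5.40)."""
    r = range(3)
    d = det3(c)
    return [
        [
            [
                d * sum(
                    c[m][i] * c[n][k] * c[s][l] * delta3(i, k, l)
                    for i in r for k in r for l in r
                )
                for s in r
            ]
            for n in r
        ]
        for m in r
    ]


def is_unchanged(c, tol=1e-12):
    """True if every transformed component equals the Levi-Civita value."""
    d = transform_density(c)
    r = range(3)
    return all(abs(d[m][n][s] - delta3(m, n, s)) <= tol for m in r for n in r for s in r)


if __name__ == "__main__":
    a = 0.5  # hypothetical rotation angle
    rotation = [[math.cos(a), math.sin(a), 0.0], [-math.sin(a), math.cos(a), 0.0], [0.0, 0.0, 1.0]]
    reflected = [[-x for x in rotation[0]], rotation[1], rotation[2]]
    print(is_unchanged(rotation), is_unchanged(reflected))
```

Without the determinant factor, the reflected case would reproduce the symbol with the opposite sign, which is the difference between a tensor and a tensor density.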
Vector product and curl. With the help of the LeviOivita tensor
density, we can associate skewsymmetrie tensors of rank 2 with vector
densities :
ft; = i&qjs&li!
(5,13)
(5.44)
The converge relation is
iPfti = Sii.ro; .
Applying eg. (5,13) to the vector product and to the curl, defined by
eqs. (5.37) and (5.38), respectively, we obtain
%h = fkkid&i , ( 5  37b )
Because these two vectps densities & and £,■ transform, like vectors
except for the change of sign in the case of coordinate reflections, they
are treated as vectors in vector calculus, but they are referred to as
"axial ;r vectors, implying that they have something to do with
"rotation."
They really do have something to do with rotation. The angular
momentum, for instance, is the vector product of the radius vector and
the ordinary momentum,
$$\mathfrak{J}_i = \delta_{ikl}\, x_k p_l. \tag{5.45}$$
In the case of a reflection, it transforms as an ordinary vector would,
except for its sign. Assume that of the $x_k$ only $x_1$ does not vanish,
and that p has only the component $p_2$. Then the angular momentum has
only the component $\mathfrak{J}_3$. We can carry out a reflection in three
different ways: We can replace $x_1$ by $(-x_1)$, or we can do the same
thing with $x_2$ or with $x_3$, the other two coordinates remaining
unchanged in every case. $\mathfrak{J}_3$ changes its sign in the first two
cases, and it remains unchanged when $x_3$ is replaced by $(-x_3)$. A
genuine vector would change its sign only when $x_3$ is replaced by
$(-x_3)$.
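The three reflections can be tried out directly. This is a small sketch under the assumptions of the example in the text ($x$ along the first axis, $p$ along the second, both of unit size); the helper names are hypothetical. For each reflection it prints the third component of the angular momentum computed from the reflected $x$ and $p$, next to the third component an ordinary vector would have.

```python
def levi_civita(i, k, l):
    """delta_{ikl} for indices 0, 1, 2."""
    return (i - k) * (k - l) * (l - i) // 2

def vector_product(x, p):
    """J_i = delta_{ikl} x_k p_l, eq. (5.45)."""
    return tuple(sum(levi_civita(i, k, l) * x[k] * p[l]
                     for k in range(3) for l in range(3))
                 for i in range(3))

def reflect(v, axis):
    """Change the sign of one component, as an ordinary vector transforms."""
    return tuple(-c if n == axis else c for n, c in enumerate(v))

x = (1, 0, 0)   # only x_1 differs from zero
p = (0, 1, 0)   # only p_2 differs from zero
J = vector_product(x, p)

for axis in range(3):
    axial = vector_product(reflect(x, axis), reflect(p, axis))
    polar = reflect(J, axis)
    print("reflect x_%d: J_3 = %d, ordinary vector would give %d"
          % (axis + 1, axial[2], polar[2]))
```

The two columns disagree in all three cases, which is the sign difference between an axial vector and a genuine vector described above.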
Generalization. Now that we have reviewed briefly vector and tensor
calculus in three dimensions with respect to orthogonal transformations,
we are in a position to generalize the concepts obtained so that they
will be applicable to the problems we shall discuss later. The
generalization is to be carried out in two steps. First, we have to extend
the formalism, so that it applies to any positive integral number of
dimensions; second, we shall have to consider coordinate transformations
other than orthogonal transformations.
n dimensional continuum. The first generalization is almost trivial.
Instead of three coordinates $x_1, x_2, x_3$, we have n coordinates,
$x_1 \cdots x_n$, describing an n dimensional manifold. We assume, again,
that there exists an invariant distance between two points,
$$s^2 = \Delta x_i\, \Delta x_i, \tag{5.2b}$$
where the summation is to be carried out over all n values of the
index i. Eq. (5.2b) is invariant with respect to the group of n
dimensional, orthogonal transformations,
$$x'_i = c_{ik} x_k + \mathring{x}_i, \tag{5.3a}$$
where the $c_{ik}$ have to satisfy the conditions
$$c_{ik} c_{il} = \delta_{kl}. \tag{5.10b}$$
60
VECTOR AND TENSOR CALCULUS
Chap. V
All indices take all values $1 \cdots n$, and summations are to be carried
out from 1 to n. The determinant $|c_{ab}|$ is again equal to $\pm 1$.
Vectors are defined by the transformation law
$$a'_k = c_{ki} a_i, \tag{5.16a}$$
and their algebra and analysis are identical with the algebra and analysis
of three dimensional vectors.
Tensors and tensor densities are defined as in three dimensional
space, except that all indices run from 1 to n. $\delta_{ik}$ is again a
symmetric tensor. The Levi-Civita tensor density is defined as follows.
$\delta_{ik \cdots s}$ is a tensor density with n indices (of rank n),
skewsymmetric in all of them. The nonvanishing components are $\pm 1$, the
sign depending on whether $(i, k, \cdots, s)$ is an even or an odd
permutation of $(1, 2, \cdots, n)$. The "vector product" is no longer a
vector density. With the help of the Levi-Civita tensor density, we can
form from a skewsymmetric tensor of rank m ($m \le n$) a skewsymmetric
tensor density of rank $(n - m)$. Only when n is 3 is the "conjugate"
tensor density to a tensor of rank 2 a vector density.
General transformations. The "length" defined in the Minkowski
space, (5.1), does not have the form (5.2b). We shall, therefore, no
longer restrict ourselves to transformations which leave eq. (5.2b)
invariant but shall take up general coordinate transformations. Since
the Lorentz transformations are much less general than the coordinate
transformations which we are about to consider, it may appear that we
are deviating from our main purpose. But we shall need the general
coordinate transformations in the general theory of relativity; and
since they are as simple in most respects as the more restricted group of
Lorentz transformations, we shall thus avoid needless repetition.
Let us consider a space in which we can introduce Cartesian coordinate
systems so that the length is defined by eq. (5.2b). Then let us pass
from a Cartesian coordinate system to another coordinate system which
is not Cartesian. The new coordinates may be called
$\xi^1, \xi^2, \cdots, \xi^n$ (the superscripts are not to be mistaken for
power exponents). We have, then,
$$\xi^i = f^i(x_1, \cdots, x_n), \qquad i = 1 \cdots n, \tag{5.46}$$
where the n functions $f^i$ are arbitrary, except that we shall assume that
their derivatives exist up to the order needed in any discussion; that the
Jacobian of the transformation, $\det |\partial \xi^i / \partial x_k|$,
vanishes nowhere; and that the $\xi^i$ are real for all real values of the
$x_1 \cdots x_n$.
$s^2$ is not, in general, a quadratic form of the $\Delta\xi^i$, as it is
of the $\Delta x_i$. But the square of the distance between two
infinitesimally near points remains a quadratic form of the coordinate
differentials. In terms of Cartesian coordinates, this infinitesimal
distance is given by
$$ds^2 = dx_k\, dx_k, \tag{5.47}$$
and $dx_k$ can be expressed in terms of the $d\xi^i$,
$$dx_k = \frac{\partial x_k}{\partial \xi^i}\, d\xi^i. \tag{5.48}$$
Substitution into eq. (5.47) yields
$$ds^2 = \frac{\partial x_k}{\partial \xi^i} \frac{\partial x_k}{\partial \xi^l}\, d\xi^i d\xi^l. \tag{5.49}$$
$ds^2$ is a quadratic form of the $d\xi^i$, regardless of the coordinate
system used. This suggests that the coordinate differentials $d\xi^i$ and
the distance differential ds will, in the field of general coordinate
transformations, take the place of the coordinate differences $\Delta x_i$
and the distance s, which are adapted to Cartesian coordinates and
orthogonal transformations.
Vectors. Let us see how the coordinate differentials transform in the
case of a general coordinate transformation. Let $\xi^i$, $\xi'^i$ be two
sets of non-Cartesian coordinates. Then the coordinate differentials are
connected by the equations
$$d\xi'^i = \frac{\partial \xi'^i}{\partial \xi^k}\, d\xi^k. \tag{5.50}$$
The transformed coordinate differentials $d\xi'^i$ are linear, homogeneous
functions of the $d\xi^k$; but the transformation coefficients
$(\partial \xi'^i / \partial \xi^k)$ are not constant, but functions of the
$\xi^i$. Neither is their determinant, $|\partial \xi'^a / \partial \xi^b|$,
a constant. We shall use them for the purpose of defining a type of
geometrical quantity, the "contravariant vector": A contravariant
vector has n components, which transform like coordinate differentials,
$$a'^i = \frac{\partial \xi'^i}{\partial \xi^k}\, a^k. \tag{5.51}$$
The sum or the difference of contravariant vectors, and the product of a
contravariant vector and a scalar are also contravariant vectors.
It is impossible to form scalar products of contravariant vectors alone.
To find out what corresponds to the scalar product in our formalism,
we shall consider a scalar field $V(\xi^1, \cdots, \xi^n)$. The change of V
along an infinitesimal displacement $\delta\xi^i$ is given by
$$\delta V = V_{,i}\, \delta\xi^i. \tag{5.52}$$
The left-hand side is obviously invariant. The right-hand side has the
form of an inner product; one factor is the contravariant vector
$\delta\xi^i$, the other is the gradient of V,
$V_{,i} = \partial V / \partial \xi^i$.
The components of the gradient of V transform according to the law
$$V_{,k'} = \frac{\partial \xi^i}{\partial \xi'^k}\, V_{,i}. \tag{5.53}$$
The $V_{,k'}$ are linear, homogeneous functions of the $V_{,i}$. The
transformation law (5.53) is not that of a contravariant vector. We call
the gradient of a scalar field a covariant vector and define in general a
covariant vector as a set of n quantities which transform according to the
law
$$a'_k = \frac{\partial \xi^i}{\partial \xi'^k}\, a_i. \tag{5.54}$$
The sum or difference of covariant vectors, and the product of a
covariant vector and a scalar are, again, covariant vectors.
In order to distinguish between contravariant and covariant vectors,
we shall always write contravariant vectors with superscripts, and
covariant vectors with subscripts.
The transformation coefficients of contravariant and covariant vectors
are different, but they are related. The
$(\partial \xi'^i / \partial \xi^k)$ of eq. (5.51) and the
$(\partial \xi^i / \partial \xi'^k)$ of eq. (5.54) are connected by the
$n^2$ equations
$$\frac{\partial \xi'^i}{\partial \xi^k} \frac{\partial \xi^k}{\partial \xi'^l} = \delta^i_l, \tag{5.55}$$
where $\delta^i_l$ is, again, the Kronecker symbol. Because of eq. (5.55),
the inner product of a covariant and a contravariant vector is an
invariant,
$$a'_i b'^i = a_i b^i. \tag{5.56}$$
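A short numerical illustration of eqs. (5.51), (5.54), and (5.56) follows. It is only a sketch: the linear, non-orthogonal change of coordinates and the sample components are assumptions chosen so that all numbers stay exact, and the function names are hypothetical.

```python
from fractions import Fraction

# d(xi'^i)/d(xi^k) for a linear, non-orthogonal change of coordinates
# (an assumed example; its determinant is 1).
forward = [[Fraction(2), Fraction(1)],
           [Fraction(1), Fraction(1)]]
# d(xi^i)/d(xi'^k): the inverse matrix, so that eq. (5.55) holds.
backward = [[Fraction(1), Fraction(-1)],
            [Fraction(-1), Fraction(2)]]

def transform_contravariant(b):
    """b'^i = (d xi'^i / d xi^k) b^k, eq. (5.51)."""
    return [sum(forward[i][k] * b[k] for k in range(2)) for i in range(2)]

def transform_covariant(a):
    """a'_k = (d xi^i / d xi'^k) a_i, eq. (5.54)."""
    return [sum(backward[i][k] * a[i] for i in range(2)) for k in range(2)]

def inner(a, b):
    """Sum over the common index of two component lists."""
    return sum(x * y for x, y in zip(a, b))

a = [Fraction(3), Fraction(-5)]   # covariant components a_i
b = [Fraction(7), Fraction(2)]    # contravariant components b^i

print(inner(a, b))
print(inner(transform_covariant(a), transform_contravariant(b)))
# Treating both sets of components as contravariant does not give an invariant:
print(inner(transform_contravariant(a), transform_contravariant(b)))
```

The first two printed values agree, while the third differs, which is the content of the remark that scalar products cannot be formed from contravariant vectors alone.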
Let us return for a moment to the orthogonal transformations. Their
transformation coefficients, $c_{ik}$, satisfy the equations (5.7b) and
(5.10a). In the case of orthogonal transformations, the coordinate
transformation derivatives are
$$\frac{\partial x'_i}{\partial x_k} = c_{ik};$$
and, because eq. (5.55) holds for all transformations, it follows from eq.
(5.7b) that $(\partial x_k / \partial x'_i)$, too, is
$$\frac{\partial x_k}{\partial x'_i} = c_{ik}. \tag{5.57}$$
That is why the distinction between contravariant and covariant vectors
does not exist in the realm of orthogonal transformations.
Tensors. Tensors are defined as forms with $n^N$ components (N being
the rank of the tensor), which transform with respect to each index like a
vector. They may be covariant in all indices, contravariant in all indices,
or mixed, that is, contravariant in some indices, covariant in others. The
contravariant indices are written as superscripts, the covariant indices
as subscripts. The example of a mixed tensor of rank three may illustrate
the definition:
$$t'^{\,i}{}_{kl} = \frac{\partial \xi'^i}{\partial \xi^m} \frac{\partial \xi^n}{\partial \xi'^k} \frac{\partial \xi^s}{\partial \xi'^l}\, t^{m}{}_{ns}. \tag{5.58}$$
Symmetry properties of tensors are invariant with respect to coordinate
transformations only if they exist with respect to indices of the
same type (covariant or contravariant).
The Kronecker symbol is a mixed tensor,
$$\delta'^{\,i}_{k} = \frac{\partial \xi'^i}{\partial \xi^l} \frac{\partial \xi^m}{\partial \xi'^k}\, \delta^l_m = \frac{\partial \xi'^i}{\partial \xi^l} \frac{\partial \xi^l}{\partial \xi'^k} = \delta^i_k. \tag{5.59}$$
The product of two tensors of ranks M and N is a tensor of rank
(M + N), and every index of either factor keeps its character as a
contravariant or covariant index. Furthermore, just as the scalar
product of a covariant and a contravariant vector is a scalar, so any two
indices of different position can be used as a pair of dummies, and the
result is a lowering of the tensor rank by 2. Examples of such products
are:
«, t . I.) .
a ik . b. . .
ait. b..i
When the corresponding components of two tensors of equal rank are
added or subtracted, the sums or differences are components of a new
tensor, provided that the two original tensors have equal numbers of
indices of the same type.
These are the simple rules of tensor algebra. They indicate how new
quantities may be formed which transform according to laws of the
pattern (5.58).
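Metric tensor, Riemannian spaces. The expression which occurs in
eq. (5.49),
$g_{ik} = \dfrac{\partial x_s}{\partial \xi^i} \dfrac{\partial x_s}{\partial \xi^k}$,
transforms as a tensor when we change from a coordinate system $\xi^i$ to
$\xi'^m$,
$$g'_{mn} = \frac{\partial \xi^i}{\partial \xi'^m} \frac{\partial \xi^k}{\partial \xi'^n}\, g_{ik}. \tag{5.60}$$
In other words, $g_{ik}$ is a covariant symmetric tensor of rank 2. It is
called the metric tensor.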
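As a concrete instance of the expression in eq. (5.49), the sketch below evaluates $g_{ik}$ for plane polar coordinates. The choice of polar coordinates, $x_1 = r\cos\theta$, $x_2 = r\sin\theta$, is an assumption made for illustration and is not an example taken from the text; the function names are hypothetical.

```python
import math

def jacobian(r, theta):
    """d x_s / d xi^i for x_1 = r cos(theta), x_2 = r sin(theta).

    Rows are the Cartesian index s, columns the index i of (r, theta).
    """
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta),  r * math.cos(theta)]]

def metric(r, theta):
    """g_ik = (d x_s / d xi^i)(d x_s / d xi^k), summed over s."""
    j = jacobian(r, theta)
    return [[sum(j[s][i] * j[s][k] for s in range(2)) for k in range(2)]
            for i in range(2)]

g = metric(2.0, 0.3)
for row in g:
    print(row)
```

Up to rounding, the result is the diagonal form $ds^2 = dr^2 + r^2 d\theta^2$, with $g_{ik}$ symmetric as stated above.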
There exist "spaces" where it is not possible to introduce a Cartesian
coordinate system. Two dimensional "spaces" of that kind include the
surface of a sphere. If we introduce as a coordinate system the latitude
and longitude, $\varphi$ and $\vartheta$, it is possible to express the
distance between two infinitesimally near points on the spherical surface
in terms of their coordinate differentials,
$$ds^2 = R^2 \left( d\varphi^2 + \cos^2\varphi\, d\vartheta^2 \right).$$
In order to include such continuous manifolds within the scope of our
investigations, we shall consider spaces with a metric tensor, without
raising the question, for the time being, of whether a Cartesian
coordinate system can be introduced or not. Whenever a "squared
infinitesimal distance," that is, an invariant homogeneous quadratic
function of the coordinate differentials, is defined, we call the manifold
a "metric space" or a "Riemannian space." If it is possible to introduce in
a Riemannian space a coordinate system with respect to which the
components of the metric tensor assume the values $\delta_{ik}$ at every
point, such a coordinate system is a Cartesian coordinate system, and the
space is called a Euclidean space. Euclidean spaces are, therefore, a
special case of Riemannian spaces.
Whenever the infinitesimal distance is given by equations of the form
$$ds^2 = g_{ik}\, d\xi^i d\xi^k, \tag{5.61}$$
where $ds^2$ is an invariant, $g_{ik}$ is a covariant tensor. Our previous
proof was based on the assumption that the $g_{ik}$ were given by the
expression
$\dfrac{\partial x_s}{\partial \xi^i} \dfrac{\partial x_s}{\partial \xi^k}$;
in other words, we implied the possibility of introducing Cartesian
coordinates. To show that the transformation properties of $g_{ik}$
are independent of this assumption, we shall consider the equation
$$g_{ik}\, d\xi^i d\xi^k = g'_{mn}\, d\xi'^m d\xi'^n, \tag{5.62}$$
which expresses the invariance of $ds^2$. When we replace the $d\xi^i$ on
the left-hand side by
$(\partial \xi^i / \partial \xi'^m)\, d\xi'^m$, we obtain
$$g_{ik} \frac{\partial \xi^i}{\partial \xi'^m} \frac{\partial \xi^k}{\partial \xi'^n}\, d\xi'^m d\xi'^n = g'_{mn}\, d\xi'^m d\xi'^n. \tag{5.63}$$
Because the $d\xi'^m$ are arbitrary, it follows that the coefficients on
both sides are equal, that is, eq. (5.60) is satisfied.
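If the determinant of the components of $g_{ik}$ does not vanish, it is
possible to define new quantities $g^{kl}$ by the equations
$$g_{ik}\, g^{kl} = \delta_i^l. \tag{5.64}$$
In order to determine the transformation law of these quantities, we
transform first the $g_{ik}$. We replace them by the expression
$$g_{ik} = \frac{\partial \xi'^m}{\partial \xi^i} \frac{\partial \xi'^n}{\partial \xi^k}\, g'_{mn},$$
so that we get
$$\frac{\partial \xi'^m}{\partial \xi^i} \frac{\partial \xi'^n}{\partial \xi^k}\, g'_{mn}\, g^{kl} = \delta_i^l, \tag{5.65}$$
then multiply the latter equation by
$\dfrac{\partial \xi^i}{\partial \xi'^r} \dfrac{\partial \xi'^s}{\partial \xi^l}$.
We know from eq. (5.59) that the right-hand side becomes $\delta_r^s$; and
the left-hand side becomes
$$\frac{\partial \xi^i}{\partial \xi'^r} \frac{\partial \xi'^m}{\partial \xi^i} \frac{\partial \xi'^n}{\partial \xi^k} \frac{\partial \xi'^s}{\partial \xi^l}\, g'_{mn}\, g^{kl} = \delta_r^m\, g'_{mn} \frac{\partial \xi'^n}{\partial \xi^k} \frac{\partial \xi'^s}{\partial \xi^l}\, g^{kl} = g'_{rn} \frac{\partial \xi'^n}{\partial \xi^k} \frac{\partial \xi'^s}{\partial \xi^l}\, g^{kl},$$
so that we get
$$g'_{rn} \left( \frac{\partial \xi'^n}{\partial \xi^k} \frac{\partial \xi'^s}{\partial \xi^l}\, g^{kl} \right) = \delta_r^s. \tag{5.66}$$
By comparing eq. (5.66) with eq. (5.64), we find that
$$g'^{\,ns} = \frac{\partial \xi'^n}{\partial \xi^k} \frac{\partial \xi'^s}{\partial \xi^l}\, g^{kl}, \tag{5.67}$$
that is, the $g^{kl}$ are the components of a contravariant tensor. This
tensor is symmetric. We can show this by multiplying eq. (5.64) by
$g_{sl}\, g^{ir}$. The left-hand side becomes, because $g_{ik}$ is
symmetric,
$$g_{ik}\, g^{kl}\, g_{sl}\, g^{ir} = (g_{ki}\, g^{ir})\, g^{kl}\, g_{sl} = \delta_k^r\, g^{kl}\, g_{sl} = g_{sl}\, g^{rl},$$
while the right-hand side is equal to
$$\delta_i^l\, g_{sl}\, g^{ir} = g_{si}\, g^{ir} = \delta_s^r.$$
We obtain
$$g_{sl}\, g^{rl} = \delta_s^r. \tag{5.68}$$
Comparing this equation with eq. (5.64), we find that
$$g^{kl} = g^{lk}.$$
$g^{kl}$ is called the contravariant metric tensor. The values of its
components are equal to the minors of the $g_{ik}$, divided by the full
determinant $g = |g_{ab}|$,
$$g^{ik} = g^{-1} \cdot \mathrm{minor}(g_{ik}). \tag{5.69}$$
In the case of Cartesian coordinate systems, $g^{ik}$ equals $\delta_{ik}$.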
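Eq. (5.69) translates into a few lines of exact arithmetic. The sketch below assumes that "minor" means the signed minor (cofactor), assumes at least two dimensions, and uses an arbitrary symmetric sample matrix for $g_{ik}$; the function names are hypothetical. It then checks the defining relation (5.64) and the symmetry of $g^{ik}$.

```python
from fractions import Fraction

def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def contravariant_metric(g):
    """g^{ik} = (signed minor of g_ik) / g for a symmetric g_ik, eq. (5.69)."""
    n, full = len(g), det(g)

    def signed_minor(i, k):
        # Delete row i and column k, then attach the sign (-1)^(i+k).
        sub = [row[:k] + row[k + 1:] for r, row in enumerate(g) if r != i]
        return (-1) ** (i + k) * det(sub)

    return [[Fraction(signed_minor(i, k)) / full for k in range(n)]
            for i in range(n)]

# Arbitrary symmetric sample with nonvanishing determinant (an assumption).
g_lower = [[2, 1, 0],
           [1, 2, 1],
           [0, 1, 2]]
g_upper = contravariant_metric(g_lower)

# g_ik g^kl should be the Kronecker symbol, eq. (5.64).
product_matrix = [[sum(g_lower[i][k] * g_upper[k][l] for k in range(3))
                   for l in range(3)] for i in range(3)]
print(product_matrix)
```

Laplace expansion is used here only because it mirrors the "minor divided by determinant" wording; it is not an efficient way to invert larger matrices.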
Raising and lowering of indices. A covariant vector can be obtained
from a contravariant vector by multiplying it by the covariant metric
tensor and summing over a pair of indices,
$$a_i = g_{ik}\, a^k. \tag{5.70}$$
This process can be reversed by multiplying $a_k$ by the contravariant
metric tensor,
$$a^i = g^{ik}\, a_k. \tag{5.71}$$
From the definition of the contravariant metric tensor, (5.64), it follows
that eq. (5.71) is equivalent to eq. (5.70); in other words, eq. (5.71)
leads back to the same contravariant vector which appears in eq. (5.70).
The two vectors $a_i$ and $a^i$ can, therefore, be properly considered as
two equivalent representations of the same geometrical object. The
operations (5.70) and (5.71) are called lowering and raising of indices.
It is possible, of course, to raise and lower any tensor index in the same
way.
The norm of a vector is defined as the scalar
$$a^2 = g_{ik}\, a^i a^k = g^{ik}\, a_i a_k = a_i a^i, \tag{5.72}$$
while the scalar product of two vectors can be written in the alternative
forms
$$a_i b^i = a^i b_i = g_{ik}\, a^i b^k = g^{ik}\, a_i b_k. \tag{5.73}$$
Tensor densities, Levi-Civita tensor density. We call a tensor density
a quantity which transforms according to the law
$$\mathfrak{T}'^{\,i \cdots}{}_{k \cdots} = \left| \frac{\partial \xi^a}{\partial \xi'^b} \right|^{W} \frac{\partial \xi'^i}{\partial \xi^m} \cdots \frac{\partial \xi^n}{\partial \xi'^k} \cdots\, \mathfrak{T}^{m \cdots}{}_{n \cdots}. \tag{5.74}$$
W is a constant the value of which is characteristic for any given tensor
density; this constant is called the weight of the tensor density. Tensors
are tensor densities of weight zero. Depending on the number of
indices, we speak also of scalar densities and vector densities.
The sum of two densities with the same numbers of like indices and the
same weight is a density with the same characteristics. In the
multiplication, the weights are added.
The Levi-Civita symbols $\delta_{i_1 \cdots i_n}$ and
$\delta^{i_1 \cdots i_n}$ are densities of weight $(-1)$ and $(+1)$,
respectively. (n is the number of dimensions.) The proof is simple and
analogous to the one given for orthogonal transformations.
The determinant of the covariant metric tensor,
$$g = |g_{ik}|, \tag{5.75}$$
is a scalar density of weight 2.
We shall have very little occasion to work with tensor densities.
Tensor analysis.¹ The consideration of tensor densities completes the
discussion of tensor algebra as far as it is needed in this book. As for
tensor analysis, we have already found that the ordinary derivatives of
a scalar field are the components of a covariant vector field. In general,
however, the derivatives of a tensor field do not form a new tensor field.
Let us take the derivative of a vector. The derivative compares the
value of a vector at one point with its value at another infinitesimally
near point, in a given direction. In the case of a coordinate
transformation, the vectors at the two points do not transform with the
same transformation coefficients, for the coefficients of the
transformation are themselves functions of the coordinates. Therefore, the
derivatives of the transformation coefficients enter into the
transformation law of the derivatives of the vector.
However, there is a way which enables us to obtain new tensors by
differentiation. The method is suggested by our experience in the realm
of Cartesian coordinates. There we can describe the derivative of a
vector or tensor thus: The vector is first carried to the "neighboring"
point without changing the values of its components: it is displaced
parallel to itself. (As long as we use Cartesian coordinates, that
statement has an invariant meaning, because the transformation
coefficients are the same at both points.) Then this parallel displaced
vector is compared with the value of the vector (as a function of the
coordinates) at the same point. The difference is given by
$a_{i,k}\, \delta x_k$. If it were possible to define "the same vector" or
"the parallel vector" at a neighboring point, the difference between the
parallel displaced vector and the actual vector at a neighboring point
would be subject only to the transformation law at that point.

¹ Tensor analysis with respect to general coordinates is discussed here
because it is necessary to an understanding of the general theory of
relativity. It is not needed anywhere in the special theory of relativity,
and may be omitted by those who are not interested in the general theory
of relativity.
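A definition of parallel displacement is actually possible in a
comparatively simple way. Of course, the value of the displaced vector
depends on the original vector itself and on the direction of
displacement. Let us first consider a Euclidean space, where we can
introduce a Cartesian coordinate system. With respect to such a coordinate
system, the law of parallel displacement takes the form
$$a_{i,k}\, \delta x_k = 0, \tag{5.76}$$
where $\delta x_k$ represents the infinitesimal displacement. Let us now
introduce an arbitrary coordinate transformation (5.46). The vector
components with respect to that new coordinate system may be denoted
by a prime. Then we have
$$a_{i,k} = \frac{\partial a_i}{\partial x_k} = \frac{\partial}{\partial x_k} \left( \frac{\partial x_i}{\partial \xi^r}\, a'^r \right) = \frac{\partial \xi^s}{\partial x_k} \left( \frac{\partial x_i}{\partial \xi^r} \frac{\partial a'^r}{\partial \xi^s} + \frac{\partial^2 x_i}{\partial \xi^r \partial \xi^s}\, a'^r \right).$$
$\delta x_k$ transforms according to eq. (5.48), and we obtain
$$0 = a_{i,k}\, \delta x_k = \left( \frac{\partial x_i}{\partial \xi^r} \frac{\partial a'^r}{\partial \xi^s} + \frac{\partial^2 x_i}{\partial \xi^r \partial \xi^s}\, a'^r \right) \delta\xi^s. \tag{5.76a}$$
$\dfrac{\partial a'^r}{\partial \xi^s}\, \delta\xi^s$ is the actual
increment of $a'^r$ as a result of the displacement, and shall be denoted
by $\delta a'^r$. Multiplying the right-hand side of eq. (5.76a) by
$\dfrac{\partial \xi^l}{\partial x_i}$, we get finally
$$\delta a'^l = -\frac{\partial \xi^l}{\partial x_i} \frac{\partial^2 x_i}{\partial \xi^r \partial \xi^s}\, a'^r\, \delta\xi^s = \frac{\partial x_i}{\partial \xi^r} \frac{\partial x_k}{\partial \xi^s} \frac{\partial^2 \xi^l}{\partial x_i \partial x_k}\, a'^r\, \delta\xi^s, \tag{5.77}$$
where the second form follows by differentiating
$\dfrac{\partial \xi^l}{\partial x_i} \dfrac{\partial x_i}{\partial \xi^r} = \delta^l_r$
with respect to $\xi^s$.
When no Cartesian coordinate system can be introduced, we shall
retain the linear form of the last equation and assume that, because of
a parallel displacement, the infinitesimal changes of the vector
components are bilinear functions of the vector components and the
components of the infinitesimal displacement,
$$\delta a^i = -\overset{\mathrm{I}}{\Gamma}{}^{i}_{kl}\, a^k\, \delta\xi^l, \tag{5.78}$$
$$\delta a_k = +\overset{\mathrm{II}}{\Gamma}{}^{i}_{kl}\, a_i\, \delta\xi^l. \tag{5.79}$$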
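The coefficients $\overset{\mathrm{I}}{\Gamma}{}^{i}_{kl}$ and
$\overset{\mathrm{II}}{\Gamma}{}^{i}_{kl}$ of these new tentative laws are,
so far, entirely unknown quantities. But we can determine their
transformation laws. $\delta a^i$ is the difference between two vectors at
two points, characterized by the coordinate values $\xi^i$ and
$\xi^i + \delta\xi^i$. In the case of a coordinate transformation, the new
$\delta a'^k$ are given by the following expressions:
$$\delta a'^k = \left( \frac{\partial \xi'^k}{\partial \xi^i} \right)_{\xi + \delta\xi} (a^i + \delta a^i) - \left( \frac{\partial \xi'^k}{\partial \xi^i} \right)_{\xi} a^i = \frac{\partial^2 \xi'^k}{\partial \xi^i \partial \xi^l}\, a^i\, \delta\xi^l + \frac{\partial \xi'^k}{\partial \xi^i}\, \delta a^i. \tag{5.80}$$
As stated before, $\delta a^i$ does not transform like a vector; this was
the original source of our difficulty.
We substitute the expression (5.80) into the left-hand side of the
equation
$$\delta a'^k = -\overset{\mathrm{I}}{\Gamma}{}'^{\,k}_{mn}\, a'^m\, \delta\xi'^n$$
and replace $a'^m$ and $\delta\xi'^n$ by the expressions from eqs. (5.51)
and (5.50). We obtain
$$\frac{\partial^2 \xi'^k}{\partial \xi^i \partial \xi^l}\, a^i\, \delta\xi^l + \frac{\partial \xi'^k}{\partial \xi^i}\, \delta a^i = -\overset{\mathrm{I}}{\Gamma}{}'^{\,k}_{mn} \frac{\partial \xi'^m}{\partial \xi^i} \frac{\partial \xi'^n}{\partial \xi^l}\, a^i\, \delta\xi^l.$$
Substituting $\delta a^i$ from eq. (5.78), we get
$$\left( \frac{\partial^2 \xi'^k}{\partial \xi^i \partial \xi^l} - \frac{\partial \xi'^k}{\partial \xi^s}\, \overset{\mathrm{I}}{\Gamma}{}^{s}_{il} \right) a^i\, \delta\xi^l = -\overset{\mathrm{I}}{\Gamma}{}'^{\,k}_{mn} \frac{\partial \xi'^m}{\partial \xi^i} \frac{\partial \xi'^n}{\partial \xi^l}\, a^i\, \delta\xi^l.$$
As both $a^i$ and $\delta\xi^l$ are arbitrary quantities, their
coefficients on both sides must be equal,
$$\frac{\partial \xi'^m}{\partial \xi^i} \frac{\partial \xi'^n}{\partial \xi^l}\, \overset{\mathrm{I}}{\Gamma}{}'^{\,k}_{mn} = \frac{\partial \xi'^k}{\partial \xi^s}\, \overset{\mathrm{I}}{\Gamma}{}^{s}_{il} - \frac{\partial^2 \xi'^k}{\partial \xi^i \partial \xi^l}.$$
The transformation law of the $\overset{\mathrm{I}}{\Gamma}{}^{i}_{kl}$ is
obtained by multiplication by
$\dfrac{\partial \xi^i}{\partial \xi'^a} \dfrac{\partial \xi^l}{\partial \xi'^b}$,
$$\overset{\mathrm{I}}{\Gamma}{}'^{\,k}_{ab} = \frac{\partial \xi^i}{\partial \xi'^a} \frac{\partial \xi^l}{\partial \xi'^b} \left( \frac{\partial \xi'^k}{\partial \xi^s}\, \overset{\mathrm{I}}{\Gamma}{}^{s}_{il} - \frac{\partial^2 \xi'^k}{\partial \xi^i \partial \xi^l} \right). \tag{5.81}$$
The last term on the right-hand side can be written in a slightly different
way, by shifting the second derivative. It is
$$-\frac{\partial \xi^i}{\partial \xi'^a} \frac{\partial \xi^l}{\partial \xi'^b} \frac{\partial^2 \xi'^k}{\partial \xi^i \partial \xi^l} = -\frac{\partial}{\partial \xi'^b} \left( \frac{\partial \xi^i}{\partial \xi'^a} \frac{\partial \xi'^k}{\partial \xi^i} \right) + \frac{\partial \xi'^k}{\partial \xi^i} \frac{\partial^2 \xi^i}{\partial \xi'^a \partial \xi'^b}.$$
The first term on the right-hand side vanishes, because the parenthesis
in that term is constant. Eq. (5.81) thus becomes
$$\overset{\mathrm{I}}{\Gamma}{}'^{\,k}_{ab} = \frac{\partial \xi'^k}{\partial \xi^s} \left( \frac{\partial \xi^i}{\partial \xi'^a} \frac{\partial \xi^l}{\partial \xi'^b}\, \overset{\mathrm{I}}{\Gamma}{}^{s}_{il} + \frac{\partial^2 \xi^s}{\partial \xi'^a \partial \xi'^b} \right). \tag{5.82}$$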
By carrying out the same computation with eq. (5.79), we obtain the
transformation law for $\overset{\mathrm{II}}{\Gamma}{}^{i}_{kl}$. It is
identical with that of $\overset{\mathrm{I}}{\Gamma}{}^{i}_{kl}$.
We can now subject $\overset{\mathrm{I}}{\Gamma}{}^{i}_{kl}$ and
$\overset{\mathrm{II}}{\Gamma}{}^{i}_{kl}$ to conditions which are compatible
with their transformation law. The transformation law consists of
two terms. One term depends on the $\Gamma^{s}_{il}$ in the old coordinate
system and has exactly the same form as the transformation law of a tensor.
The second term does not depend on the $\Gamma^{s}_{il}$, and adds an
expression which is symmetric in the two subscripts. So, even though the
$\Gamma^{k}_{ab}$ may vanish in one coordinate system, they do not vanish in
other systems. But if the $\Gamma^{k}_{ab}$ were symmetric in their
subscripts in one coordinate system, they would be symmetric in every other
coordinate system as well. This would be particularly true if the
$\Gamma^{k}_{ab}$ were to vanish in one system. Furthermore, if the
$\overset{\mathrm{I}}{\Gamma}{}^{i}_{kl}$ were equal to the
$\overset{\mathrm{II}}{\Gamma}{}^{i}_{kl}$ in one coordinate system, this
equality would be preserved by arbitrary coordinate transformations. We
shall find that geometrical considerations lead us to treat only systems of
$\Gamma^{i}_{kl}$ which satisfy both these conditions.
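Let us displace two vectors $a_i$ and $b^i$ parallel to themselves along
an infinitesimal path, $\delta\xi^l$. The change of their scalar product,
$a_i b^i$, is given by
$$\delta(a_i b^i) = a_i\, \delta b^i + b^i\, \delta a_i = a_i b^k \left( \overset{\mathrm{II}}{\Gamma}{}^{i}_{kl} - \overset{\mathrm{I}}{\Gamma}{}^{i}_{kl} \right) \delta\xi^l. \tag{5.83}$$
When two vectors are displaced parallel to themselves, their scalar product
always remains constant if and only if the
$\overset{\mathrm{I}}{\Gamma}{}^{i}_{kl}$ are equal to the corresponding
$\overset{\mathrm{II}}{\Gamma}{}^{i}_{kl}$.
Actually, the assumption that the two types of $\Gamma^{i}_{kl}$ are equal
is strongly suggested not only because in Cartesian coordinates the inner
product of two constant vectors is itself a constant, but because of
another consideration which does not refer to Cartesian coordinates.
By extending the law or definition of "parallel" displacement (5.78),
(5.79) from vectors to tensors, we can displace any tensor "parallel"
to itself according to the rule
$$\delta t^{i}{}_{kl} = \left( \overset{\mathrm{II}}{\Gamma}{}^{r}_{ks}\, t^{i}{}_{rl} + \overset{\mathrm{II}}{\Gamma}{}^{r}_{ls}\, t^{i}{}_{kr} - \overset{\mathrm{I}}{\Gamma}{}^{i}_{rs}\, t^{r}{}_{kl} \right) \delta\xi^s. \tag{5.84}$$
This rule is derived from the postulate that the "parallel displacement"
of a product is given by the same law that applies to the differentiation
of products,
$$\delta(abc) = ab\, \delta c + ac\, \delta b + bc\, \delta a. \tag{5.85}$$
Applying the law (5.84) to the parallel displacement of the Kronecker
tensor, we obtain
$$\delta(\delta^i_k) = \left( \overset{\mathrm{II}}{\Gamma}{}^{r}_{ks}\, \delta^i_r - \overset{\mathrm{I}}{\Gamma}{}^{i}_{rs}\, \delta^r_k \right) \delta\xi^s = \left( \overset{\mathrm{II}}{\Gamma}{}^{i}_{ks} - \overset{\mathrm{I}}{\Gamma}{}^{i}_{ks} \right) \delta\xi^s. \tag{5.86}$$
Now apply eq. (5.86) and the product rule (5.85) to the "parallel
displacement" of the product $a^i \delta^k_i$. The result is
$$\delta(a^i \delta^k_i) = \delta^k_i\, \delta a^i + a^i\, \delta(\delta^k_i) = \delta a^k + a^i\, \delta(\delta^k_i).$$
On the other hand, the product $a^i \delta^k_i$ is equal to $a^k$. We have,
therefore,
$$\delta a^k = \delta a^k + a^i\, \delta(\delta^k_i).$$
Therefore, $\delta(\delta^k_i)$ must vanish. Accordingly, we have
$$\overset{\mathrm{II}}{\Gamma}{}^{i}_{kl} = \overset{\mathrm{I}}{\Gamma}{}^{i}_{kl}. \tag{5.87}$$
Henceforth, we shall omit the distinguishing marks I and II.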
As mentioned before, the $\Gamma^{i}_{kl}$ are symmetric in their
subscripts if it is possible to introduce a coordinate system in which they
vanish at least locally. From now on, let it be understood that we shall
consider only symmetric $\Gamma^{i}_{kl}$. The $\Gamma^{i}_{kl}$ are still,
to a high degree, arbitrary. They are, however, uniquely determined if we
connect them with the metric tensor $g_{ik}$ by the following condition:
The result of the parallel displacement of a vector a shall not depend on
whether we apply the law of parallel displacement (5.78) to its
contravariant representation or the law (5.79) to its covariant
representation. The two representations $a^i$ and $a_k$ have, at the point
$(\xi^s + \delta\xi^s)$, the components $(a^i + \delta a^i)$ and
$(a_k + \delta a_k)$, respectively, where $\delta a^i$ and $\delta a_k$ are
given by equations (5.78) and (5.79). That these two vectors are again to be
representations of the same vector $(a + \delta a)$, at the point
$(\xi^s + \delta\xi^s)$, is expressed by the equation
$$a_k + \delta a_k = (g_{ik} + \delta g_{ik})(a^i + \delta a^i), \tag{5.88}$$
where $\delta g_{ik}$ is
$$\delta g_{ik} = g_{ik,l}\, \delta\xi^l.$$
Eq. (5.88) must be satisfied up to linear terms in the differentials and
for arbitrary $a^i$ and $\delta\xi^l$. If we multiply out the right-hand
side of eq. (5.88), we obtain
$$\delta a_k = g_{ik}\, \delta a^i + a^i\, g_{ik,l}\, \delta\xi^l.$$
Substituting $\delta a_k$ and $\delta a^i$ from eqs. (5.78) and (5.79), we
obtain
$$\Gamma^{r}_{kl}\, g_{ri}\, a^i\, \delta\xi^l = -g_{rk}\, \Gamma^{r}_{il}\, a^i\, \delta\xi^l + a^i\, g_{ik,l}\, \delta\xi^l,$$
or
$$a^i\, \delta\xi^l \left( g_{rk}\, \Gamma^{r}_{il} + g_{ir}\, \Gamma^{r}_{kl} - g_{ik,l} \right) = 0. \tag{5.88a}$$
$a^i$ and $\delta\xi^l$ are arbitrary; therefore, the contents of the
parenthesis vanish.
Now we make use of the symmetry condition and write the vanishing
bracket down three times, with different index combinations:
$$\Gamma^{r}_{is}\, g_{kr} + \Gamma^{r}_{ks}\, g_{ir} - g_{ik,s} = 0,$$
$$\Gamma^{r}_{ki}\, g_{sr} + \Gamma^{r}_{si}\, g_{kr} - g_{ks,i} = 0,$$
$$\Gamma^{r}_{sk}\, g_{ir} + \Gamma^{r}_{ik}\, g_{sr} - g_{si,k} = 0.$$
We subtract the first of these equations from the sum of the other
two equations. Several terms cancel, and we obtain the equation
$$g_{rs}\, \Gamma^{r}_{ik} = \tfrac{1}{2} \left( g_{is,k} + g_{ks,i} - g_{ik,s} \right). \tag{5.89}$$
We multiply this equation by $g^{sl}$ to obtain the final expression for
$\Gamma^{l}_{ik}$,
$$\Gamma^{l}_{ik} = \tfrac{1}{2}\, g^{ls} \left( g_{is,k} + g_{ks,i} - g_{ik,s} \right). \tag{5.90}$$
This expression is usually referred to as the Christoffel three-index
symbol of the second kind, and it is denoted by the symbol
$$\left\{ {l \atop ik} \right\} = \tfrac{1}{2}\, g^{ls} \left( g_{is,k} + g_{ks,i} - g_{ik,s} \right). \tag{5.90a}$$
The left-hand side of eq. (5.89) is called the Christoffel symbol of the
first kind. It is denoted by the sign [ik, s],
$$[ik, s] = \tfrac{1}{2} \left( g_{is,k} + g_{ks,i} - g_{ik,s} \right). \tag{5.89a}$$
In the case of Cartesian coordinates, both kinds of Christoffel
three-index symbols vanish.
The concept of parallel displacement is independent of the existence
of a metric tensor. We call a space with a law of parallel displacement
an affinely connected space and the $\Gamma^{l}_{ik}$ the components of the
affine connection. When a metric is defined, covariant and contravariant
vectors become equivalent, and the $\Gamma^{l}_{ik}$ must take the values
$\left\{ {l \atop ik} \right\}$ so that the parallel displacement of a
vector does not depend on which of the two representations has been chosen.
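Eq. (5.90) can be evaluated mechanically once the $g_{ik}$ are known as functions of the coordinates. The sketch below does this for the spherical surface with latitude $\varphi$ and longitude $\vartheta$ mentioned earlier, assuming unit radius and approximating the derivatives $g_{ik,s}$ by central differences; the function names and the step size are assumptions of this example.

```python
import math

R = 1.0  # radius of the sphere (an assumption of this example)

def metric(xi):
    """g_ik for ds^2 = R^2 (dphi^2 + cos^2(phi) dtheta^2); xi = (phi, theta)."""
    phi, _theta = xi
    return [[R * R, 0.0], [0.0, (R * math.cos(phi)) ** 2]]

def inverse2(g):
    """Contravariant metric g^{ik} of a two dimensional metric."""
    d = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / d, -g[0][1] / d], [-g[1][0] / d, g[0][0] / d]]

def metric_derivative(xi, s, h=1e-5):
    """g_{ik,s}, approximated by a central difference in the coordinate xi^s."""
    up, down = list(xi), list(xi)
    up[s] += h
    down[s] -= h
    gu, gd = metric(up), metric(down)
    return [[(gu[i][k] - gd[i][k]) / (2 * h) for k in range(2)] for i in range(2)]

def christoffel(xi):
    """gamma[l][i][k] = 1/2 g^{ls} (g_{is,k} + g_{ks,i} - g_{ik,s}), eq. (5.90)."""
    ginv = inverse2(metric(xi))
    dg = [metric_derivative(xi, s) for s in range(2)]  # dg[s][i][k] = g_{ik,s}
    return [[[0.5 * sum(ginv[l][s] * (dg[k][i][s] + dg[i][k][s] - dg[s][i][k])
                        for s in range(2))
              for k in range(2)] for i in range(2)] for l in range(2)]

phi = 0.4
gamma = christoffel((phi, 1.0))
print(gamma[0][1][1], math.sin(phi) * math.cos(phi))  # {phi; theta theta}
print(gamma[1][0][1], -math.tan(phi))                 # {theta; phi theta}
```

For this metric the only nonvanishing symbols are $\left\{ {\varphi \atop \vartheta\vartheta} \right\} = \sin\varphi \cos\varphi$ and $\left\{ {\vartheta \atop \varphi\vartheta} \right\} = -\tan\varphi$, which the printed pairs reproduce to within the finite-difference error.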
We shall return now to our original program, the formation of new
tensors by differentiation. Consider a "tensor field," that is, a tensor
the components of which are functions of the coordinates. Now, we
can take the value of this tensor at a point $(\xi^s)$ and then displace it
parallel to itself to the point $(\xi^s + \delta\xi^s)$. The value of the
tensor field at the point $(\xi^s + \delta\xi^s)$, minus the value of the
parallel displaced tensor, is itself a tensor. In the case of a mixed
tensor of rank two, the value of the tensor at the point
$(\xi^s + \delta\xi^s)$ is
$$t_i{}^k + t_i{}^k{}_{,s}\, \delta\xi^s,$$
and the value of the parallel displaced tensor
$$t_i{}^k + \left( \left\{ {r \atop is} \right\} t_r{}^k - \left\{ {k \atop rs} \right\} t_i{}^r \right) \delta\xi^s;$$
the difference between these two expressions is
$$\left( t_i{}^k{}_{,s} - \left\{ {r \atop is} \right\} t_r{}^k + \left\{ {k \atop rs} \right\} t_i{}^r \right) \delta\xi^s. \tag{5.91}$$
This expression is a tensor because of the way we have obtained it.
As $\delta\xi^s$ is itself a vector and arbitrary, the expression in the
parenthesis is a tensor. This tensor is called the covariant derivative of
$t_i{}^k$ with respect to $\xi^s$. It is identical with the ordinary
derivative when the $\left\{ {l \atop ik} \right\}$ vanish, and, therefore,
particularly in the case of Cartesian coordinates. Two of the more usual
notations of the covariant derivative are $\nabla_s t_i{}^k$ and
$t_i{}^k{}_{;s}$. We shall use the latter.
The covariant derivatives of an arbitrary tensor are formed by adding
to the ordinary derivatives for each index of the differentiated
tensor a further term, which for contravariant indices takes the form
$$t^{\cdots i \cdots}{}_{;s} = t^{\cdots i \cdots}{}_{,s} + \cdots + \left\{ {i \atop rs} \right\} t^{\cdots r \cdots} + \cdots,$$
while for covariant indices it is
$$t_{\cdots k \cdots ;s} = t_{\cdots k \cdots ,s} - \cdots - \left\{ {r \atop ks} \right\} t_{\cdots r \cdots} - \cdots.$$
This definition satisfies the rule for product differentiation,
$$(AB \cdots)_{;s} = A_{;s} B \cdots + A B_{;s} \cdots + \cdots,$$
regardless of whether some of the indices of A, B, $\cdots$, are dummies.
The covariant derivatives of the metric tensor vanish, because of the
vanishing parenthesis of eq. (5.88a). And since the covariant
differentiation obeys the law of product differentiation, indices can be
raised and lowered under the differentiation,
$$a_{i;s} = g_{ik}\, a^{k}{}_{;s}, \tag{5.92}$$
and so forth.
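Geodesic lines. The Christoffel symbols appear not only in connection
with covariant differentiation and parallel displacement, but also in
connection with a problem which is more directly related to the metric
of a space, that is, the setting up of the differential equations of
straight lines or shortest lines in terms of general coordinates. In a
Euclidean space, the shortest line connecting two points is a straight
line. In the case of general Riemannian spaces, there may not exist lines
having all the properties of straight lines, but there is, in general, a
uniquely determined shortest connecting line between two points. In the
case of the surface of a sphere, for instance, these lines are great
circles. Such shortest connecting lines are called geodesics. The length of
an arbitrary line connecting two points $P_1$ and $P_2$ is
$$s_{12} = \int_{P_1}^{P_2} ds = \int_{P_1}^{P_2} \sqrt{g_{ik}\, d\xi^i d\xi^k} = \int_{P_1}^{P_2} \sqrt{g_{ik} \frac{d\xi^i}{dp} \frac{d\xi^k}{dp}}\; dp, \tag{5.93}$$
where p is an arbitrary parameter.
In order to find the minimum value of $s_{12}$ with fixed end points of
integration, one has to carry out the variation according to the
Euler-Lagrange equations,
$$\delta \int L(y_\alpha, y'_\alpha)\, dx = \int \left( \frac{\partial L}{\partial y_\alpha} - \frac{d}{dx} \frac{\partial L}{\partial y'_\alpha} \right) \delta y_\alpha\, dx, \tag{5.94}$$
and the extremals are given by the equations
$$\frac{\partial L}{\partial y_\alpha} - \frac{d}{dx} \left( \frac{\partial L}{\partial y'_\alpha} \right) = 0. \tag{5.95}$$
In our case, the Lagrangian is the integrand of the last expression of
eq. (5.93), while the variables $y_\alpha$ are the coordinates $\xi^i$. The
derivatives $\dfrac{\partial L}{\partial \xi^l}$ and
$\dfrac{\partial L}{\partial \xi'^l}$, where $\xi'^l$ stands for
$\dfrac{d\xi^l}{dp}$, are given by
$$\frac{\partial L}{\partial \xi^l} = \frac{1}{2} \frac{g_{ik,l}\, \xi'^i \xi'^k}{\sqrt{g_{rs}\, \xi'^r \xi'^s}} = \frac{1}{2}\, g_{ik,l}\, \xi'^i \xi'^k\, \frac{dp}{ds},$$
$$\frac{\partial L}{\partial \xi'^l} = \frac{g_{lk}\, \xi'^k}{\sqrt{g_{rs}\, \xi'^r \xi'^s}} = g_{lk}\, \xi'^k\, \frac{dp}{ds},$$
$$\frac{d}{dp} \left( \frac{\partial L}{\partial \xi'^l} \right) = \frac{dp}{ds} \left[ g_{lk,i}\, \xi'^i \xi'^k + g_{lk}\, \xi''^k + g_{lk}\, \xi'^k\, \frac{d^2 p / ds^2}{(dp/ds)^2} \right].$$
We have, thus,
$$\delta s_{12} = -\int \left[ g_{lk}\, \xi''^k + \left( g_{lk,i} - \tfrac{1}{2}\, g_{ik,l} \right) \xi'^i \xi'^k + g_{lk}\, \xi'^k\, \frac{d^2 p / ds^2}{(dp/ds)^2} \right] \frac{dp}{ds}\, \delta\xi^l\, dp. \tag{5.96}$$
Because the parenthesis is multiplied by an expression symmetric with
respect to i and k, we symmetrize the parenthesis itself and write:
$$\delta s_{12} = -\int \left[ g_{lk}\, \xi''^k + \tfrac{1}{2} \left( g_{li,k} + g_{lk,i} - g_{ik,l} \right) \xi'^i \xi'^k + g_{lk}\, \xi'^k\, \frac{d^2 p / ds^2}{(dp/ds)^2} \right] \frac{dp}{ds}\, \delta\xi^l\, dp. \tag{5.97}$$
The parenthesis is now an expression encountered before, eq. (5.89a).
The differential equations for the geodesic lines are, thus,
$$g_{lk}\, \xi''^k + [ik, l]\, \xi'^i \xi'^k + g_{lk}\, \xi'^k\, \frac{d^2 p / ds^2}{(dp/ds)^2} = 0,$$
or, multiplied by $g^{ls}$,
$$\xi''^s + \left\{ {s \atop ik} \right\} \xi'^i \xi'^k + \xi'^s\, \frac{d^2 p / ds^2}{(dp/ds)^2} = 0. \tag{5.98}$$
If we choose as the parameter the arc length s itself, the last term
vanishes, and we have:
$$\frac{d^2 \xi^s}{ds^2} + \left\{ {s \atop ik} \right\} \frac{d\xi^i}{ds} \frac{d\xi^k}{ds} = 0. \tag{5.99}$$
When the coordinate system is Cartesian, the second term vanishes
identically, and eq. (5.99) simply states that the $\xi^s$ must be linear
functions of s.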
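Eq. (5.99) can also be integrated numerically in coordinates that are not Cartesian. The following sketch assumes plane polar coordinates $(r, \theta)$, for which the nonvanishing Christoffel symbols are $\left\{ {r \atop \theta\theta} \right\} = -r$ and $\left\{ {\theta \atop r\theta} \right\} = 1/r$, and uses a classical fourth-order Runge-Kutta step; the starting values and step count are arbitrary choices of this example. Since the plane is Euclidean, the resulting geodesic should be a straight line when converted back to Cartesian coordinates.

```python
import math

def derivative(state):
    """First-order form of eq. (5.99) for ds^2 = dr^2 + r^2 dtheta^2.

    state = (r, theta, dr/ds, dtheta/ds). The second derivatives are
    d2r/ds2 = r (dtheta/ds)^2 and d2theta/ds2 = -(2/r)(dr/ds)(dtheta/ds).
    """
    r, _theta, dr, dtheta = state
    return (dr, dtheta, r * dtheta * dtheta, -2.0 * dr * dtheta / r)

def rk4_step(state, h):
    """One classical Runge-Kutta step of length h in the arc length s."""
    def shifted(k, f):
        return tuple(s + f * d for s, d in zip(state, k))
    k1 = derivative(state)
    k2 = derivative(shifted(k1, h / 2))
    k3 = derivative(shifted(k2, h / 2))
    k4 = derivative(shifted(k3, h))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def integrate(state, length, steps=1000):
    """Follow the geodesic for the given arc length."""
    h = length / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state

# Start at r = 1, theta = 0, moving in the theta direction with unit speed.
r, theta, _, _ = integrate((1.0, 0.0, 0.0, 1.0), 1.0)
x, y = r * math.cos(theta), r * math.sin(theta)
print(x, y)
```

With these starting values the curve is the straight line $x = 1$, $y = s$, so after unit arc length the point lies near $(1, 1)$.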
Minkowski world and Lorentz transformations. We can now return
to our starting point, Minkowski's treatment of the theory of relativity.
He considered the ordinary, three dimensional space plus the time as
a four dimensional continuum, the "world," with the invariant "length"
or "interval" defined by eq. (5.1). A "world point" is an ordinary
point at a certain time, its four coordinates x, y, z, and t, which we
shall often denote from now on by $x^1$, $x^2$, $x^3$, and $x^4$. By
introducing a "metric tensor" $\eta_{\mu\nu}$ with the components
$$\eta_{\mu\nu} = \begin{pmatrix} -\dfrac{1}{c^2} & 0 & 0 & 0 \\ 0 & -\dfrac{1}{c^2} & 0 & 0 \\ 0 & 0 & -\dfrac{1}{c^2} & 0 \\ 0 & 0 & 0 & +1 \end{pmatrix}, \tag{5.100}$$
we can write eq. (5.1) in the form
$$\tau_{12}^2 = \eta_{\mu\nu}\, \Delta x^\mu\, \Delta x^\nu. \tag{5.101}$$
The Lorentz transformations are those linear coordinate transformations
which carry the metric tensor $\eta_{\mu\nu}$ over into itself. The
inertial systems of the special relativity theory are, thus, analogous to
the Cartesian coordinate systems of ordinary three dimensional Euclidean
geometry, and the Lorentz transformations correspond to the three
dimensional orthogonal transformations. Their transformation
coefficients are also subject to conditions similar to (5.10a).
When we carry out any linear transformation (not necessarily a
Lorentz transformation), the transformation equations are of the form
$$x^{*\mu} = \gamma^{\mu}{}_{\iota}\, x^{\iota} + \mathring{x}^{\mu}, \tag{5.102}$$
and the coordinate differences transform as contravariant vectors,
$$\Delta x^{*\mu} = \gamma^{\mu}{}_{\iota}\, \Delta x^{\iota}. \tag{5.103}$$
The conditions for Lorentz transformations are that
$$\eta_{\mu\nu}\, \Delta x^{*\mu}\, \Delta x^{*\nu} = \eta_{\iota\kappa}\, \Delta x^{\iota}\, \Delta x^{\kappa} \tag{5.104}$$
for arbitrary $\Delta x^{\iota}$. By substituting $\Delta x^{*\mu}$ from eq.
(5.103), we get
$$\eta_{\mu\nu}\, \gamma^{\mu}{}_{\iota}\, \gamma^{\nu}{}_{\kappa}\, \Delta x^{\iota}\, \Delta x^{\kappa} = \eta_{\iota\kappa}\, \Delta x^{\iota}\, \Delta x^{\kappa}, \tag{5.105}$$
and, because the $\Delta x^{\iota}$ are arbitrary,
$$\eta_{\mu\nu}\, \gamma^{\mu}{}_{\iota}\, \gamma^{\nu}{}_{\kappa} = \eta_{\iota\kappa}. \tag{5.106}$$
These are the conditions for the transformation coefficients of Lorentz
transformations, corresponding to the conditions (5.10a) for orthogonal
transformations.
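Condition (5.106) is easy to test for a given set of coefficients. The sketch below assumes the standard Lorentz transformation for relative motion along $x^1$, namely $x^{*1} = \Gamma (x^1 - v x^4)$, $x^{*4} = \Gamma (x^4 - v x^1 / c^2)$ with $\Gamma = 1/\sqrt{1 - v^2/c^2}$; this explicit form is brought in as an assumption and is not derived in this passage. The values $c = 5$ and $v = 3$ are arbitrary units chosen so that $\Gamma = 5/4$ is exact. A Galilean transformation is included for contrast.

```python
from fractions import Fraction

c = Fraction(5)             # speed of light in arbitrary units (assumption)
v = Fraction(3)             # relative velocity along x^1, so v/c = 3/5
big_gamma = Fraction(5, 4)  # 1 / sqrt(1 - v^2/c^2) for v/c = 3/5

# eta[mu][nu]: the metric tensor (5.100), with x^4 = t stored at index 3.
eta = [[Fraction(0)] * 4 for _ in range(4)]
for i in range(3):
    eta[i][i] = -1 / (c * c)
eta[3][3] = Fraction(1)

def identity():
    return [[Fraction(int(m == n)) for n in range(4)] for m in range(4)]

# boost[mu][iota] = gamma^mu_iota of the assumed Lorentz transformation.
boost = identity()
boost[0][0] = big_gamma
boost[0][3] = -big_gamma * v
boost[3][0] = -big_gamma * v / (c * c)
boost[3][3] = big_gamma

# Galilean transformation x* = x - v t, t* = t, for comparison.
galilean = identity()
galilean[0][3] = -v

def transformed_metric(gamma):
    """eta_{mu nu} gamma^mu_iota gamma^nu_kappa, the left side of eq. (5.106)."""
    return [[sum(eta[m][n] * gamma[m][i] * gamma[n][k]
                 for m in range(4) for n in range(4))
             for k in range(4)] for i in range(4)]

print(transformed_metric(boost) == eta)
print(transformed_metric(galilean) == eta)
```

Only the first comparison holds: the assumed Lorentz coefficients carry $\eta_{\mu\nu}$ over into itself, while the Galilean coefficients do not.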
The difference between a Euclidean four dimensional space and the
Minkowski world is that in the latter the invariant $\tau_{12}^2$ is not
positive definite. That is why no real coordinate transformation can carry
eq. (5.101) over into eq. (5.2b), page 59. We have, therefore, to
distinguish between contravariant and covariant indices.
In order to recognize coordinates and tensors of the Minkowski
world as such, we shall adopt the convention of using in general the
Latin alphabet for indices belonging to the ordinary space, running
from 1 to 3, and of using Greek letters for Minkowski indices, running
from 1 to 4. We shall call vectors and tensors of the Minkowski world
"world vectors" and "world tensors."
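The contravariant metric tensor has the components
$$\eta^{\mu\nu} = \begin{pmatrix} -c^{2} & 0 & 0 & 0 \\ 0 & -c^{2} & 0 & 0 \\ 0 & 0 & -c^{2} & 0 \\ 0 & 0 & 0 & +1 \end{pmatrix}. \tag{5.107}$$
When contravariant tensors transform with the coefficients $\gamma^{\rho}{}_{\sigma}$ which
satisfy the conditions (5.106), the coefficients of the covariant
transformation law are the solutions of the equations
$$\gamma^{\rho}{}_{\sigma}\,\gamma_{\rho}{}^{\tau} = \delta_{\sigma}^{\tau}. \tag{5.108}$$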
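In order to find an explicit expression for these coefficients $\gamma_{\rho}{}^{\tau}$, one can
multiply the transformation equations of a covariant vector,
$$v^{*}_{\mu} = \gamma_{\mu}{}^{\nu}\,v_{\nu}, \tag{5.109}$$
by $\eta^{\mu\rho}$ and replace $v_{\nu}$ by $\eta_{\nu\sigma}v^{\sigma}$. The result is
$$\eta^{\mu\rho}\,v^{*}_{\mu} = \eta^{\mu\rho}\,\gamma_{\mu}{}^{\nu}\,\eta_{\nu\sigma}\,v^{\sigma}.$$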
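Now, the left-hand side is equal to $v^{*\rho}$, and, therefore, to $\gamma^{\rho}{}_{\sigma}v^{\sigma}$,
$$\gamma^{\rho}{}_{\sigma}\,v^{\sigma} = \eta^{\mu\rho}\,\gamma_{\mu}{}^{\nu}\,\eta_{\nu\sigma}\,v^{\sigma}.$$
Further, since $v^{\sigma}$ is arbitrary, the coefficients must be equal,
$$\gamma^{\rho}{}_{\sigma} = \eta^{\rho\mu}\,\eta_{\sigma\nu}\,\gamma_{\mu}{}^{\nu}. \tag{5.110}$$
Finally, by multiplying by $\eta_{\rho\alpha}\,\eta^{\sigma\beta}$ and switching sides, we obtain
$$\gamma_{\alpha}{}^{\beta} = \eta_{\alpha\rho}\,\eta^{\beta\sigma}\,\gamma^{\rho}{}_{\sigma}. \tag{5.111}$$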
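The relation (5.111) can be checked numerically. The sketch below builds the covariant coefficients from a boost along $x^{1}$ and verifies that they solve eq. (5.108). The helper names, the value c = 2 (arbitrary units), and the storage convention (list index 3 for the fourth coordinate) are assumptions of this example, not notation from the text.

```python
import math

def boost_x(v, c):
    """Coefficients gam[rho][sigma] of a Lorentz boost with speed v along x^1."""
    g = 1.0 / math.sqrt(1.0 - v * v / (c * c))
    gam = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    gam[0][0], gam[0][3] = g, -g * v
    gam[3][0], gam[3][3] = -g * v / c**2, g
    return gam

def covariant_coeffs(gam, c):
    """cov[alpha][beta] = eta_{alpha rho} eta^{beta sigma} gam^rho_sigma, eq. (5.111).

    Both metrics are diagonal, so each double sum reduces to a single term.
    """
    lo = [-1.0 / c**2] * 3 + [1.0]   # diagonal of eta_{mu nu}, eq. (5.100)
    up = [-c**2] * 3 + [1.0]         # diagonal of eta^{mu nu}, eq. (5.107)
    return [[lo[a] * up[b] * gam[a][b] for b in range(4)] for a in range(4)]

def contraction(gam, cov):
    """Matrix of sum_rho gam^rho_sigma cov_rho^tau, the left side of (5.108)."""
    return [[sum(gam[r][s] * cov[r][t] for r in range(4)) for t in range(4)]
            for s in range(4)]

c = 2.0
gam = boost_x(1.2, c)
for row in contraction(gam, covariant_coeffs(gam, c)):
    print([round(x, 12) for x in row])   # rows of the unit matrix, up to rounding
```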
All the algebraic operations of the general tensor calculus can be applied
to world tensors. As in the case of orthogonal transformations, the
determinant of the transformation coefficients takes only the values
±1. Thus, the densities of even weight transform like tensors, and
the densities of odd weight transform similarly to the "axial" vectors
of the three dimensional orthogonal formalism.
The components of the metric tensor $\eta_{\mu\nu}$ are constant; therefore, the
Christoffel symbols vanish, and the covariant derivatives are simply
the ordinary derivatives. We shall denote such ordinary derivatives
by the comma,
$$\frac{\partial\,\cdots}{\partial x^{\rho}} = \cdots{}_{,\rho}. \tag{5.112}$$
We shall now demonstrate how the transformation coefficients of a
Lorentz transformation are related to the relative velocity of the two
coordinate systems S and S*. When a point is in a state of straight-line,
uniform motion, its velocity is described by the ratios of the
coordinate differences of any two world points along its path,
$$u^{i} = \frac{\Delta x^{i}}{\Delta x^{4}}. \tag{5.113}$$
The velocity of the system S relative to the system S* is the velocity
of a particle P, which is at rest in S, relative to S*. The first three
coordinate differences of P with respect to S, $\Delta x^{i}$, vanish. The
coordinate differences with respect to S* are, therefore, given by
$$\Delta x^{*\mu} = \gamma^{\mu}{}_{4}\,\Delta x^{4}.$$
S has, therefore, relative to S*, the velocity
$$v^{*i} = \frac{\gamma^{i}{}_{4}}{\gamma^{4}{}_{4}}. \tag{5.114}$$
Conversely, we can compute the velocity of S* with respect to S
by employing the "inverse" transformation coefficients $\gamma_{\mu}{}^{\nu}$, given by
eq. (5.108). Because they are the transformation coefficients of the
transformation $S^{*} \rightarrow S$, we can write, referring to eq. (5.114),
$$v^{i} = \frac{\gamma_{4}{}^{i}}{\gamma_{4}{}^{4}}. \tag{5.115}$$
By making use of eqs. (5.111), (5.100), and (5.107), we can write the
right-hand side in the form
$$v^{i} = -c^{2}\,\frac{\gamma^{4}{}_{i}}{\gamma^{4}{}_{4}}. \tag{5.116}$$
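Now it is easy to show that in general $v^{*2}$ is equal to $v^{2}$. Let us first
form the three dimensional norm of $v^{*i}$, eq. (5.114). We have
$$v^{*2} = \frac{\sum_{i=1}^{3} (\gamma^{i}{}_{4})^{2}}{(\gamma^{4}{}_{4})^{2}}.$$
Because of eq. (5.106) the numerator can be rewritten,
$$\sum_{i=1}^{3} (\gamma^{i}{}_{4})^{2} = c^{2}\left[(\gamma^{4}{}_{4})^{2} - 1\right],$$
and we find, therefore,
$$v^{*2} = c^{2}\,\frac{(\gamma^{4}{}_{4})^{2} - 1}{(\gamma^{4}{}_{4})^{2}}. \tag{5.117}$$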
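Now we treat eq. (5.116) in an analogous way. It is
$$v^{2} = c^{4}\,\frac{\sum_{i=1}^{3} (\gamma^{4}{}_{i})^{2}}{(\gamma^{4}{}_{4})^{2}}.$$
Again we have
$$\sum_{i=1}^{3} (\gamma^{4}{}_{i})^{2} = \frac{1}{c^{2}}\left[(\gamma^{4}{}_{4})^{2} - 1\right],$$
and $v^{2}$ is
$$v^{2} = c^{2}\,\frac{(\gamma^{4}{}_{4})^{2} - 1}{(\gamma^{4}{}_{4})^{2}}. \tag{5.118}$$
Incidentally, we find that $\gamma^{4}{}_{4}$ is always given by the expression
$$\gamma^{4}{}_{4} = \frac{1}{\sqrt{1 - v^{2}/c^{2}}}. \tag{5.119}$$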
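The equality of the two relative speeds and the expression (5.119) can be illustrated numerically. The sketch below uses a pure boost in an arbitrary direction as the Lorentz transformation; the standard boost formula, the function names, and the sample velocity with c = 2 (arbitrary units) are assumptions of the example. For a pure boost the two velocities are exactly opposite; for a general Lorentz transformation only their magnitudes need agree.

```python
import math

def boost(vx, vy, vz, c):
    """Boost coefficients gam[mu][iota] for S* moving with (vx, vy, vz) relative to S."""
    v = (vx, vy, vz)
    v2 = sum(a * a for a in v)
    g = 1.0 / math.sqrt(1.0 - v2 / c**2)
    gam = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        for k in range(3):
            gam[i][k] = 1.0 if i == k else 0.0
            if v2 > 0.0:
                gam[i][k] += (g - 1.0) * v[i] * v[k] / v2
        gam[i][3] = -g * v[i]
        gam[3][i] = -g * v[i] / c**2
    gam[3][3] = g
    return gam

def velocity_of_S_in_Sstar(gam):
    """v*^i = gam^i_4 / gam^4_4, eq. (5.114)."""
    return [gam[i][3] / gam[3][3] for i in range(3)]

def velocity_of_Sstar_in_S(gam, c):
    """v^i = -c^2 gam^4_i / gam^4_4, eq. (5.116)."""
    return [-c**2 * gam[3][i] / gam[3][3] for i in range(3)]

def norm(w):
    return math.sqrt(sum(a * a for a in w))

c = 2.0
gam = boost(0.6, -0.8, 0.5, c)
v_star = velocity_of_S_in_Sstar(gam)
v = velocity_of_Sstar_in_S(gam, c)
print(v_star, v)                      # opposite vectors for a pure boost
print(norm(v_star), norm(v))          # equal magnitudes, eqs. (5.117) and (5.118)
print(gam[3][3], 1.0 / math.sqrt(1.0 - norm(v)**2 / c**2))   # both sides of (5.119)
```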
Paths, world lines. Ordinarily, the motion of a particle along its path
is described by stating the functional dependence of the three space
coordinates on the time t,
$$x^{i} = f^{i}(t). \tag{5.120}$$
The components of the velocity are given as the derivatives,
$$u^{i} = \frac{dx^{i}}{dt}. \tag{5.121}$$
This kind of description is, of course, possible in the theory of relativity
as well as in nonrelativistic physics. However, it is often useful to
choose a description in which the time is not set apart from the spatial
coordinates as in eqs. (5.120) and (5.121).
The motion of a mass point is, in terms of the Minkowski world,
represented by a line, a "world line," which we can advantageously
describe by a parameter representation,
$$x^{\mu} = f^{\mu}(p), \tag{5.122}$$
where p is an arbitrary parameter defined along the world line. Such
parameter descriptions are commonly used in analytical geometry.
In three dimensional geometry, the arc length is often chosen as the
parameter. In the Minkowski world, we can use as the parameter the
proper time $\tau$ along a world line. Just as the arc length s in ordinary
geometry is defined as the line integral
$$s = \int \sqrt{dx^{2} + dy^{2} + dz^{2}},$$
$\tau$ is given as the integral of the differential $d\tau$ along the world line of
the particle,
$$\tau = \int d\tau = \int \sqrt{dt^{2} - \frac{1}{c^{2}}\,(dx^{2} + dy^{2} + dz^{2})} = \int \sqrt{\eta_{\mu\nu}\,dx^{\mu}\,dx^{\nu}} = \int \sqrt{\eta_{\mu\nu}\,\frac{dx^{\mu}}{d\tau}\,\frac{dx^{\nu}}{d\tau}}\;d\tau. \tag{5.123}$$
We can, therefore, describe the path by equations of the form
$$x^{\mu} = f^{\mu}(\tau). \tag{5.124}$$
$\tau$ is related to the $x^{\mu}$ by a differential equation. When we divide the
integrand of eq. (5.123) by dt and take account of eq. (5.121), we obtain
$$\frac{d\tau}{dt} = \sqrt{1 - u^{2}/c^{2}}. \tag{5.125}$$
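The integral (5.123) lends itself to a direct numerical check. The sketch below approximates the proper time along a world line by summing $\sqrt{\Delta t^{2} - \Delta l^{2}/c^{2}}$ over short chords and compares the result, for uniform circular motion, with the closed form that eq. (5.125) gives for constant speed. The function names, the circular path, the step count, and c = 2 (arbitrary units) are assumptions chosen for illustration.

```python
import math

def proper_time(path, t0, t1, c, steps=20000):
    """Approximate eq. (5.123) by summing over small chords of the world line.

    `path(t)` returns the spatial position (x, y, z) at coordinate time t.
    """
    dt = (t1 - t0) / steps
    tau = 0.0
    prev = path(t0)
    for n in range(1, steps + 1):
        cur = path(t0 + n * dt)
        dl2 = sum((a - b) ** 2 for a, b in zip(cur, prev))
        tau += math.sqrt(dt * dt - dl2 / c**2)
        prev = cur
    return tau

def circle(t, radius=1.0, omega=1.0):
    """Uniform circular motion in the x-y plane with speed u = radius * omega."""
    return (radius * math.cos(omega * t), radius * math.sin(omega * t), 0.0)

c = 2.0
T = 2.0 * math.pi
print(proper_time(circle, 0.0, T, c))
print(T * math.sqrt(1.0 - (1.0 / c) ** 2))   # closed form from eq. (5.125) with u = 1
```

For a body at rest the same sum returns the elapsed coordinate time, as it should.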
The velocity of a body in terms of the usual description (5.120),
(5.121) is replaced by the direction of the world line in the four dimensional
description. When the body is at rest, the line is parallel to the
$X^{4}$-axis; and when the body is moving, the line will run at an angle
relative to the $X^{4}$-axis. We can describe its direction in terms of its
tangential vector,
$$U^{\mu} = \frac{dx^{\mu}}{d\tau}. \tag{5.126}$$
The four quantities $U^{\mu}$ are the components of a contravariant unit
vector,
$$\eta_{\mu\nu}\,U^{\mu}\,U^{\nu} = 1. \tag{5.127}$$
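This can be verified easily by replacing $d\tau$ in eq. (5.126) by its definition,
$$d\tau = \sqrt{\eta_{\mu\nu}\,dx^{\mu}\,dx^{\nu}}.$$
The $U^{\mu}$ are related very simply to the velocity components $u^{i}$ of
eq. (5.121); making use of eq. (5.125), we have
$$U^{4} = \frac{dt}{d\tau} = \frac{1}{\sqrt{1 - u^{2}/c^{2}}}, \qquad U^{i} = \frac{dx^{i}}{dt}\,\frac{dt}{d\tau} = \frac{u^{i}}{\sqrt{1 - u^{2}/c^{2}}}. \tag{5.128}$$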
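A short numerical sketch of eqs. (5.127) and (5.128): it builds the four components $U^{\mu}$ from an ordinary velocity $u^{i}$ and confirms that the Minkowski norm is 1. The function names, the sample velocity, and c = 2 (arbitrary units) are assumptions made for the example.

```python
import math

def world_velocity(u, c):
    """Return [U^1, U^2, U^3, U^4] from the ordinary velocity u, eq. (5.128)."""
    u2 = sum(a * a for a in u)
    if u2 >= c * c:
        raise ValueError("speed must be below c")
    k = 1.0 / math.sqrt(1.0 - u2 / c**2)   # dt/dtau, from eq. (5.125)
    return [k * a for a in u] + [k]

def minkowski_norm(U, c):
    """eta_{mu nu} U^mu U^nu with the metric of eq. (5.100)."""
    return U[3] ** 2 - sum(a * a for a in U[:3]) / c**2

c = 2.0
U = world_velocity((0.3, -1.1, 0.4), c)
print(U)
print(minkowski_norm(U, c))                  # equals 1 up to rounding, eq. (5.127)
print(world_velocity((0.0, 0.0, 0.0), c))    # body at rest: only U^4 is nonzero
```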
In the following chapters, we shall use $u^{i}$ and $U^{\mu}$ consistently in the
way they were introduced here. $v^{i}$ shall be used exclusively to denote
the relative motion of two coordinate systems.
When a body moves without being accelerated, its direction in the
Minkowski world is constant and its world line is a straight line. The
law of inertia takes a very simple form in our new description:
$$U^{\mu} = \text{const.} \tag{5.129}$$
PROBLEMS
1. Prove that the right-hand side of eq. (5.90a) transforms according
to eq. (5.82).
2. Prove that the symmetry properties with respect to indices of the
same transformation character are invariant.
3. For three dimensional Riemannian space, define the differential
operations gradient (of a scalar), divergence, and curl, and prove that
the relation holds:
curl grad V = 0.
Treating "axial" vectors as skew symmetric tensors of rank 2, define
also the divergence of an axial vector and prove the relation
div curl A = 0,
where A is a polar vector.
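4. Prove that the following relations hold in three dimensional space:
$$\delta_{ikl}\,\delta^{mkl} = 2\,\delta_{i}^{m}; \qquad \delta_{ikl}\,\delta^{imn} = \delta_{k}^{m}\,\delta_{l}^{n} - \delta_{k}^{n}\,\delta_{l}^{m}.$$
State similar relations for the Minkowski world.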
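5. With the help of Problem 4, prove that in the Cartesian, three
dimensional geometry the relation holds:
$$\operatorname{curl}\,\operatorname{curl}\,\mathbf{A} = \operatorname{grad}\,\operatorname{div}\,\mathbf{A} - \nabla^{2}\mathbf{A},$$
A being either a polar vector or an axial vector.
6. Compute in three dimensional space the two triple products of
the three polar vectors:
$$[\mathbf{A} \times [\mathbf{B} \times \mathbf{C}]], \qquad (\mathbf{A}\,[\mathbf{B} \times \mathbf{C}]).$$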
7. In a plane, introduce polar coordinates.
(a) Compute the components of the metric tensor.
(b) State the differential equations of the straight lines.
8. On the surface of a sphere of radius R, introduce the so-called
Riemannian homogeneous coordinates, which are characterized by an
expression for the infinitesimal length of the form
$$ds^{2} = f(\xi^{2} + \eta^{2})\,(d\xi^{2} + d\eta^{2}), \qquad \text{with } f(0) = 1.$$
(a) State the function $f(\xi^{2} + \eta^{2})$ and the transformation equations
between Riemannian coordinates and the usual coordinates of longitude
and latitude.
Answer:
$$f = \left[\frac{R^{2}}{R^{2} + \tfrac{1}{4}(\xi^{2} + \eta^{2})}\right]^{2}, \qquad \xi = \frac{2R\cos\varphi}{1 + \sin\varphi}\,\cos\vartheta, \qquad \eta = \frac{2R\cos\varphi}{1 + \sin\varphi}\,\sin\vartheta.$$
(b) Compute the differential equations for the great circles in either
coordinate system.
Remark: The Riemannian coordinates are obtained by a conformal
transformation leading from a plane to a spherical surface, familiar
from the theory of complex functions.
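9. The Laplacian operator in n dimensions is defined as the divergence
of the gradient of a scalar, in general coordinates:
$$\nabla^{2}V = g^{rs}\,V_{;rs}.$$
(a) Using the $\begin{Bmatrix} l \\ ik \end{Bmatrix}$, write out the right-hand side.
(b) Introduce a coordinate system the coordinates of which are
everywhere orthogonal to each other, in other words, where the line
element takes the form
$$ds^{2} = \sum_{i=1}^{n} e_{i}\,h_{i}^{2}\,(d\xi^{i})^{2}, \qquad e_{i} = \pm 1.$$
The $h_{i}$ are functions of the $\xi^{s}$.
Express the Laplacian in terms of the $h_{i}$.
Answer:
$$\nabla^{2}V = \frac{1}{H}\sum_{i=1}^{n}\left(\frac{e_{i}\,H}{h_{i}^{2}}\,V_{,i}\right)_{,i}, \qquad H = h_{1}h_{2}\cdots h_{n}.$$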
Remark: This expression is frequently used in order to obtain the
Schroedinger equation in other than Cartesian coordinates. The
student can easily derive the expression for $\nabla^{2}V$ in three dimensions
for spherical coordinates, cylinder coordinates, and so forth.
10. (a) Show that this relation holds in n dimensions:
$$\begin{Bmatrix} s \\ ks \end{Bmatrix} = \frac{1}{\sqrt{g}}\,(\sqrt{g})_{,k}.$$
(b) Show that the following expressions are a scalar and a vector,
respectively, when $V^{i}$ is a vector and $F^{ik}$ a skew-symmetric, contravariant
tensor:
$$\frac{1}{\sqrt{g}}\,(\sqrt{g}\,V^{i})_{,i}; \qquad \frac{1}{\sqrt{g}}\,(\sqrt{g}\,F^{ik})_{,k}.$$
(c) In general coordinates, bring $\nabla^{2}V$ into a form which is a generalization
of that given under 9(b).
11. $V_{i}$ is a vector, $F_{ik}$ a skew-symmetric tensor. Show that the following
expressions are tensors with respect to arbitrary transformations,
even though the derivatives are ordinary derivatives,
$$V_{i,k} - V_{k,i}; \qquad F_{ik,l} + F_{kl,i} + F_{li,k}.$$
12. Schwarz's inequality,
$$\Bigl(\sum_{i} a_{i}b_{i}\Bigr)^{2} \le \sum_{i} a_{i}^{2}\,\sum_{k} b_{k}^{2},$$
states in effect that, in an n dimensional Euclidean space, any side of a
triangle is shorter than the sum of the two others. For the latter
statement can be written in the form
$$|\mathbf{a}| + |\mathbf{b}| \ge |\mathbf{a} + \mathbf{b}|.$$
When both sides are squared,
$$a^{2} + b^{2} + 2\,|\mathbf{a}|\,|\mathbf{b}| \ge a^{2} + b^{2} + 2\,(\mathbf{a}\mathbf{b}),$$
the squares on either side cancel; and when the remaining terms are
squared once more, Schwarz's inequality is obtained.
By introducing a suitable Cartesian coordinate system, prove
Schwarz's inequality, and thereby that the above statement about the
sides of triangles is true regardless of the number of dimensions.
Another method of proof uses the positive norm of the skew-symmetric
product,
$$\sum_{i<k} (a_{i}b_{k} - a_{k}b_{i})^{2} = [\mathbf{a} \times \mathbf{b}]^{2}.$$
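13. In an n dimensional Euclidean space, m unit vectors may be
defined, $v^{(1)}_{i}, \ldots, v^{(m)}_{i}$, $m \le n$, which are mutually orthogonal to each
other,
$$\sum_{i} v^{(k)}_{i}\,v^{(l)}_{i} = \delta_{kl}, \qquad k, l = 1, \ldots, m.$$
Show that for any vector $a_{i}$, Bessel's inequality holds:
$$\sum_{i} a_{i}^{2} - \sum_{k=1}^{m}\Bigl(\sum_{i} a_{i}\,v^{(k)}_{i}\Bigr)^{2} \ge 0.$$
The inequality goes over into an equality if m = n.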
14. Prove that a vector field $V_{i}$ in an n dimensional space can be
represented as the gradient field of a scalar function if and only if its
skew-symmetric derivatives vanish,
$$V_{i,k} - V_{k,i} = 0.$$
CHAPTER VI
Relativistic Mechanics of Mass Points
Program for relativistic mechanics. In Chapter IV, we have laid the
foundation for the special theory of relativity. But so far we have
dealt only with uniform, straight-line motion. The clocks and scales
which we used for the determination of coordinate values were not
accelerated. We replaced the Galilean transformation equations connecting
two inertial systems, (4.11), by the Lorentz equations (4.13).
The Lorentz equations are linear transformation equations, just as the
Galilean equations were; that is, the new coordinate values (space and
time) are linear functions of the old ones. Therefore, an unaccelerated
motion in one inertial system will remain unaccelerated when a Lorentz
transformation is carried out.
The law of inertia (2.1) is invariant with respect to Lorentz transformations.
The remaining chapters of Part I will discuss accelerated motion, in
other words, will develop a relativistic mechanics. This will be more
involved than the development of classical mechanics. The difficulties
are twofold: In the first place, the equations of classical mechanics are
covariant with respect to Galilean transformations, but not with respect
to Lorentz transformations. Therefore, we shall have to develop
a Lorentz invariant formalism so that our statements may be independent
of the coordinate system used. The second difficulty is more
profound: In classical mechanics, the force which acts on a body at a
given time is determined by the positions of the other interacting
bodies at the same time. An "action-at-a-distance" force law can be
formulated only if it is meaningful to speak of the "positions of the
other interacting bodies at the same time"; that is, if the "same time"
is independent of the frame of reference used. This condition, we know,
runs counter to the theory of relativity.
It is, therefore, impossible to transform automatically every conceivable
classical law of force into a Lorentz covariant law. We can
treat only those theories from which the concept of action at a distance
can be eliminated. This possibility exists in the theory of collisions,