
Full text of "Introduction to the Theory of Relativity: Part 1"













Vector and Tensor Analysis, by A. P. Wills, 
A Survey Course in Physics, by Carl F. 

Elements of Nuclear Physics, by Franco 


urements, by P. Vigoureux and C. E. Webb.

Atomic Spectra and Atomic Structure, by 
Gerhard Herzberg. 

Electricity and Magnetism, by A. W. Hirst. 

Properties of Matter, by F. C. Champion
and N. Davy. 

Thermodynamics, by Enrico Fermi. 

College Physics, by Henry A. Perkins. 

College Physics, Abridged, by Henry A.
Perkins.

Introductory Quantum Mechanics, by Vlad-
imir Rojansky.

Procedures in Experimental Physics, by
John Strong, in collaboration with H. Victor
Neher, Albert E. Whitford, C. Hawley
Cartwright, and Roger Hayward.


Herman G. Heil and Willard H. Bennett.

The Structure of Steel Simply Explained,
by Eric N. Simons and Edwin Gregory.

Molecular Spectra and Molecular Struc- 
ture: Part I — Diatomic Molecules, by 
Gerhard Herzberg. 

Physical Meteorology, by John G. Albright.

Vibrational Spectra and Structure of 
Polyatomic Molecules, by Ta-You Wu. 

An Introduction to the Theory of Rela-
tivity, by Peter G. Bergmann.











New York 



70 Fifth Avenue, New York

ALL RIGHTS RESERVED. NO PART OF THIS BOOK


First Printing May 1942

Second Printing January 1946

Third Printing March 1947

Fourth Printing August 1948

Foreword by Albert Einstein 

Although a number of technical expositions of the theory of rela-
tivity have been published, Dr. Bergmann's book seems to me to
satisfy a definite need. It is primarily a textbook for students of
physics and mathematics, which may be used either in the classroom
or for individual study. The only prerequisites for reading the book
are a familiarity with calculus and some knowledge of differential 
equations, classical mechanics, and electrodynamics. 

This book gives an exhaustive treatment of the main features of the
theory of relativity which is not only systematic and logically com-
plete, but also presents adequately its empirical basis. The student
who makes a thorough study of the book will master the mathematical
methods and physical aspects of the theory of relativity and will be
in a position to interpret for himself its implications. He will also be
able to understand, with no particular difficulty, the literature of the
field.

I believe that more time and effort might well be devoted to the 
systematic teaching of the theory of relativity than is usual at present 
at most universities. It is true that the theory of relativity, par- 
ticularly the general theory, has played a rather modest role in the 
correlation of empirical facts so far, and it has contributed little to 
atomic physics and our understanding of quantum phenomena. It is 
quite possible, however, that some of the results of the general theory of 
relativity, such as the general covariance of the laws of nature and their
nonlinearity, may help to overcome the difficulties encountered at
present in the theory of atomic and nuclear processes. Apart from this,
the theory of relativity has a special appeal because of its inner con- 
sistency and the logical simplicity of its axioms. 

Much effort has gone into making this book logically and pedagog-
ically satisfactory, and Dr. Bergmann has spent many hours with me
which were devoted to this end. It is my hope that many students 
will enjoy the book and gain from it a better understanding of the ac- 
complishments and problems of modern theoretical physics. 

The Institute for Advanced Study 



Preface

This book presents the theory of relativity for students of physics
and mathematics who have had no previous introduction to the 
subject and whose mathematical training does not go beyond the 
fields which are necessary for studying classical theoretical physics. 
The specialized mathematical apparatus used in the theory of rela-
tivity, tensor calculus, and Ricci calculus, is, therefore, developed in the
book itself. The main emphasis of the book is on the development of the
basic ideas of the theory of relativity; it is these basic ideas rather than
special applications which give the theory its importance among the 
various branches of theoretical physics. 

The material has been divided into three parts, the special theory of 
relativity, the general theory of relativity, and a report on unified 
field theories. The three parts form a unit. The author realizes that
many students are interested in the theory of relativity mainly for its 
applications to atomic and nuclear physics. It is hoped that these 
readers will find in the first part, on the special theory of relativity, all 
the information which they require. Those readers who do not intend 
to go beyond the special theory of relativity may omit one section of 
Chapter V (p. 67) and all of Chapter VIII; these passages contain
material which is needed only for the development of the general theory 
of relativity. 

The second part treats the general theory of relativity, including the 
work by Einstein, Infeld, and Hoffmann on the equations of motion. 
The third part deals with several attempts to overcome defects in the
general theory of relativity. None of these theories has been com-
pletely satisfactory. Nevertheless, the author believes that this report
rounds out the discussion of the general theory of relativity by indi- 
cating possible directions of future research. However, the third part 
may be omitted without destroying the unity of the remainder.

The author wishes to express his appreciation for the help of Pro-
fessor Einstein, who read the whole manuscript and made many valuable
suggestions. Particular thanks are due to Dr. and Mrs. Fred Fender, 
who read the manuscript carefully and suggested many stylistic and 
other improvements. The figures were drawn by Dr. Fender. Margot 
Bergmann read the manuscript, suggested improvements, and did 
almost all of the technical work connected with the preparation of the 
manuscript. The friendly co-operation of the Editorial Department of 
Prentice-Hall, Inc. is gratefully acknowledged.

P. G. B. 



Foreword by Albert Einstein

Preface

Introduction

Part I


I. Frames of Reference, Coordinate Systems, and Coordinate Transformations

Coordinate transformations not involving time. Coordinate
transformations involving time.

II. Classical Mechanics

The law of inertia, inertial systems. Galilean transformations.
The force law and its transformation properties.

III. The Propagation of Light

The problem confronting classical optics. The corpuscular hypothesis.
The transmitting medium as the frame of reference. The absolute frame
of reference. The experiment of Michelson and Morley. The ether
hypothesis.

IV. The Lorentz Transformation

The relative character of simultaneity. The length of scales.
The rate of clocks. The Lorentz transformation. The "kinematic"
effects of the Lorentz transformation. The proper time interval.
The relativistic law of the addition of velocities. The proper time
of a material body. Problems.

V. Vector and Tensor Calculus in an n Dimensional Continuum

Orthogonal transformations. Transformation determinant. Improved
notation. Vectors. Vector analysis. Tensors. Tensor analysis.
Tensor densities. The tensor density of Levi-Civita. Vector product
and curl. Generalization. n dimensional continuum. General
transformations. Vectors. Tensors. Metric tensor, Riemannian spaces.
Raising and lowering of indices. Tensor densities, Levi-Civita tensor
density. Tensor analysis. Geodesic lines. Minkowski world and Lorentz
transformations. Paths, world lines. Problems.

VI. Relativistic Mechanics of Mass Points

Program for relativistic mechanics. The form of the conservation
laws. A model example. Lorentz covariance of the new conservation
laws. Relation between energy and mass. The Compton effect.
Relativistic analytical mechanics. Relativistic force.

VII. Relativistic Electrodynamics

Maxwell's field equations. Preliminary remarks on transformation
properties. The representation of four dimensional tensors in three
plus one dimensions. The Lorentz invariance of Maxwell's field
equations. The physical significance of the transformation laws.
Gauge transformations. The ponderomotive equations.

VIII. The Mechanics of Continuous Matter

Introductory remarks. Nonrelativistic treatment. A special
coordinate system. Tensor form of the equations. The stress-energy
tensor of electrodynamics. Problem.

IX. Applications of the Special Theory of Relativity

Experimental verifications of the special theory of relativity.
Charged particles in electromagnetic fields. The field of a rapidly
moving particle. Sommerfeld's theory of the hydrogen fine structure.
De Broglie waves. Problems.

Part II


X. The Principle of Equivalence

Introduction. The principle of equivalence. Preparations for a
relativistic theory of gravitation. On inertial systems. Einstein's
"elevator." The principle of general covariance. The nature of the
gravitational field.

XI. The Riemann-Christoffel Curvature Tensor

The characterization of Riemannian spaces. The integrability of
the affine connection. Euclidicity and integrability. The criterion
of integrability. The commutation law for covariant differentiation,
the tensor character of $R_{ikl}{}^{n}$. Properties of the curvature
tensor. The covariant form of the curvature tensor. Contracted
forms of the curvature tensor. The contracted Bianchi identities.
The number of algebraically independent components of the curvature
tensor.

XII. The Field Equations of the General Theory of Relativity

The ponderomotive equations of the gravitational field. The
representation of matter in the field equations. The differential
identities. The field equations. The linear approximation and
the standard coordinate conditions. Solutions of the linearized
field equations. The field of a mass point. Gravitational waves.
The variational principle. The combination of the gravitational
and electromagnetic fields. The conservation laws in the general
theory of relativity.

XIII. Rigorous Solutions of the Field Equations of the General Theory of Relativity

The solution of Schwarzschild. The "Schwarzschild singularity."
The field of an electrically charged mass point. The solutions
with rotational symmetry.

XIV. The Experimental Tests of the General Theory of Relativity

The advance of the perihelion of Mercury. The deflection of light
in a Schwarzschild field. The gravitational shift of spectral lines.

XV. The Equations of Motion in the General Theory of Relativity

Force laws in classical physics and in electrodynamics. The law
of motion in the general theory of relativity. The approximation
method. The first approximation and the mass conservation
law. The second approximation and the equations of motion.
Conclusion. Problem.

Part III

XVI. Weyl's Gauge-Invariant Geometry

The geometry. Analysis in gauge invariant geometry. Physical
interpretation of Weyl's geometry. Weyl's variational principle.
The equations $G_{\mu\nu} = 0$.

XVII. Kaluza's Five Dimensional Theory and the Projective Field Theories

Kaluza's theory. A four dimensional formalism in a five dimensional
space. Analysis in the p-formalism. A special type of coordinate
system. Covariant formulation of Kaluza's theory. Projective field
theories.

XVIII. A Generalization of Kaluza's Theory

Possible generalizations of Kaluza's theory. The geometry of the
closed, five dimensional world. Introduction of the special
coordinate system. The derivation of field equations from a
variational principle. Differential field equations.

Index


Introduction

Almost all the laws of physics deal with the behavior of certain objects
in space in the course of time. The position of a body or the location
of an event can be expressed only as a location relative to some other
body suitable for that purpose. For instance, in an experiment with
Atwood's machine, the velocities and accelerations of the weights are
referred to the machine itself, that is, ultimately to the earth. An 
astronomer may refer the motion of the planets to the center of gravity 
of the sun. All motions are described as motions relative to some 
reference body. 

We imagine that conceptually, at least, a framework of rods which 
extends into space can be rigidly attached to the reference body. Using 
this conceptual framework as a Cartesian coordinate system in three 
dimensions, we characterize any location by three numbers, the coordi- 
nates of that space point. Such a conceptual framework, rigidly con-
nected with some material body or other well-defined point, is often
called a frame of reference. 

Some bodies may be suitable as reference bodies, others may not. 
Even before the theory of relativity was conceived, the problem of 
selecting suitable frames of reference played an important part in the 
development of science. Galileo, the father of post-medieval physics, 
considered the choice of the heliocentric frame to be so important that
he risked imprisonment and even death in his efforts to have the new 
frame of reference accepted by his contemporaries. In the last analysis,
it was the choice of the reference body which was the subject of his 
dispute with the authorities. 

Later, when Newton gave a comprehensive presentation of the physics 
of his time, the heliocentric frame of reference had been generally 
accepted. Still, Newton felt that further discussion was necessary. To 
show that some frames of reference were more suitable for the descrip- 
tion of nature than others, he devised the famous pail experiment: He 
filled a pail with water. By twisting the rope which supported the pail, 
he made it rotate around its axis. As the water gradually began to
participate in the rotation, its surface changed from a plane to a para- 
boloid. After the water had gained the same speed of rotation as the 



pail, he stopped the pail. The water slowed down and eventually came 
to complete rest. At the same time, its surface resumed the shape of 
a plane. 

The description given above is based on a frame of reference con-
nected with the earth. The law governing the shape of the water's
surface could be formulated thus: The surface of the water is a plane 
whenever the water does not rotate. It is a paraboloid when the water 
rotates. The state of motion of the pail has no influence on the shape 
of the surface. 

Now let us describe the whole experiment in terms of a frame of refer-
ence rotating relatively to the earth with a constant angular velocity
equal to the greatest velocity of the pail. At first, the rope, the pail, 
and the water "rotate" with a certain constant angular velocity with 
respect to our new frame of reference, and the surface of the water is a 
plane. Then the rope, and in turn the pail, is "stopped," and the water 
gradually "slows down," while its surface becomes a paraboloid. After
the water has come to a "complete rest," its surface still a paraboloid, 
the rope, and in turn the pail, is again made to "rotate" relatively to 
our frame of reference (that is, stopped with respect to the earth); the 
water gradually begins to participate in the "rotation," while its surface
flattens out. In the end, the whole apparatus is "rotating" with its
former angular velocity, and the surface of the water is again a plane.
With respect to this frame of reference, the law would have to be formu-
lated like this: Only when the water "rotates" with a certain angular
velocity, is its surface a plane. The deviation from a plane increases
with the deviation from this particular state of motion. The state of
rest produces also a paraboloid. Again the rotation of the pail is
immaterial.

Newton's pail experiment brings out very clearly what is meant by
"suitable" frame of reference. We can describe nature and we can 
formulate its laws using whatever frame of reference we choose. But 
there may exist a frame or frames in which the laws of nature are funda- 
mentally simpler, that is, in which the laws of nature contain fewer 
elements than they would otherwise. Take the instance of Newton's 
rotating pail. If our description of nature were based on the frame of 
reference connected with the pail, many physical laws would have to 
contain an additional element, the angular velocity ω of the pail relative
to a "more suitable" frame of reference, let us say to the earth.

The laws of motion of the planets become basically simpler when they
are expressed in terms of the heliocentric frame of reference instead of
the geocentric frame. That is why the description of Copernicus and 



Galileo won out over that of Ptolemy, even before Kepler and Newton
succeeded in formulating the underlying laws. 

Once it was clearly recognized that the choice of a frame of reference 
determined the form of a law of nature, investigations were carried out 
which established the effect of this choice in a mathematical form. 

Mechanics was the first branch of physics to be expressed in a com- 
plete system of mathematical laws. Among all the frames of reference 
conceivable, there exists a set of frames with respect to which the law 
of inertia takes its familiar form: In the absence of forces, the space
coordinates of a mass point are linear functions of time. These frames
of reference are called inertial systems. It was found that all of the
laws of mechanics take the same form when stated in terms of any one
of these inertial systems. Another frame of reference necessitates a
more involved physical and mathematical description, for example, the 
frame of reference connected with Newton's rotating pail. The charac- 
terization of the motions of mass points not subject to forces is possible 
in terms of this frame of reference, but the mathematical form of the 
law of inertia is involved. The space coordinates are not linear func- 
tions of time. 

Since the laws of mechanics take the same form in all frames of refer-
ence which are inertial systems, all inertial systems are equivalent from
the point of view of mechanics. We can find out whether a given body
is "accelerated" or "unaccelerated" by comparing its motion with that
of some mass point which is not subject to any forces. But whether a
body is "at rest" or "in uniform motion" depends entirely on the inertial
system used for the description; the terms "at rest" and "in uniform
motion" have no absolute meaning. The principle that all inertial sys-
tems are equivalent for the description of nature is called the principle
of relativity.

When Maxwell developed the equations of the electromagnetic field,
these equations were apparently incompatible with the principle of rela-
tivity. For, according to this theory, electromagnetic waves in empty
space should propagate with a universal, constant velocity c of about
$3 \times 10^{10}$ cm/sec, and this, it appeared, could not be true with respect
to both of two different inertial systems which were moving relatively 
to each other. The one frame of reference with respect to which the 
speed of electromagnetic radiation would be the same in all directions
could be used for the definition of "absolute rest" and of "absolute 
motion." A number of experimenters tried hard to find this frame of 
reference and to determine the earth's motion with respect to it. 

All these attempts, however, were unsuccessful. On the contrary, all 


experiments seemed to suggest that the principle of relativity applied 
to the laws of electrodynamics as well as to those of mechanics. H. A. 
Lorentz proposed a new theory, in which he accepted the existence of 
one privileged frame of reference, and at the same time explained why 
this frame could not be discovered by experimental methods. But he 
had to introduce a number of assumptions which could not have been 
checked by any conceivable experiment. To this extent his theory was
not very satisfactory. Einstein finally recognized that only a revision
of our fundamental ideas about space and time would resolve the im-
passe between theory and experiment. Once this revision had been
made, the principle of relativity was extended to the whole of physics.
This is now called the special theory of relativity. It establishes the
fundamental equivalence of all inertial systems. It preserves fully their
privileged position among all conceivable frames of reference. The so-
called general theory of relativity analyzes and thereby destroys this
privileged position and is able to give a new theory of gravitation.

In this book we shall first discuss the role of different frames of refer-
ence, from a classical point of view, in mechanics and to some extent in
electrodynamics. Only when the student understands fully the dead-
lock between theoretical conclusions and experimental results in classical
electrodynamics can he appreciate the necessity of revising classical
physics along relativistic lines. Once the new ideas of space and time
are grasped, "relativistic mechanics" and "relativistic electrodynamics"
are easily understood.

The second part of this book is devoted to the general theory of 
relativity, while the third part discusses some recent attempts to ex- 
tend the theory of gravitation to the field of electrodynamics. 

The Special Theory of Relativity


Frames of Reference, Coordinate Systems,
and Coordinate Transformations 

We have spoken of frames of reference and have mentioned Cartesian 
coordinate systems. In this chapter we shall examine more closely the 
relationships between different frames of reference and different co-
ordinate systems.

Coordinate transformations not involving time. As a specific in-
stance, let us consider a frame of reference connected rigidly with the
earth, that is, a geocentric frame of reference. In order to express
quantitatively the location of a point relative to the earth, we introduce
a coordinate system. We choose a point of origin, let us say the center
of the earth, and directions for the three axes; for instance, the X-axis
may go from the earth's center through the intersection of the equator
and the Greenwich meridian, the Y-axis through the intersection of the
equator and the 90°E-meridian, and the Z-axis through the North Pole.
The location of any point is then given by three real numbers, the co-
ordinates of that point. The motion of a point is completely described
if we express the three point coordinates as functions of time. A point
is at rest relatively to our frame of reference if these three functions are
constant.

Without abandoning the earth as the body with which our frame of 
reference is rigidly connected, we can introduce another coordinate 
system. We may, for instance, choose as the point of origin some well- 
defined point on the earth's surface, let us say one of the markers of the 
United States Coast and Geodetic Survey; and as the direction for the 
X-axis the direction due East; for the Y-axis, the direction due North;
and for the Z-axis, the direction straight up, away from the earth's 
center, the earth assumed to be a sphere. 

The relationship between the two coordinate systems is completely 
determined if the coordinates of any given point with respect to one co- 
ordinate system are known functions of its coordinates with respect to 
the other coordinate system. Let us call the first coordinate system S 



[ Chap. I 

and the second coordinate system S', and the coordinates of a certain
point P with respect to S (x, y, z) and the coordinates of the same point
with respect to S' (x', y', z'). Then x', y', and z' are connected with
x, y, and z by equations of the form:

$$
\begin{aligned}
x' &= c_{11}x + c_{12}y + c_{13}z + x'_0, \\
y' &= c_{21}x + c_{22}y + c_{23}z + y'_0, \\
z' &= c_{31}x + c_{32}y + c_{33}z + z'_0.
\end{aligned}
\tag{1.1}
$$


($x'_0$, $y'_0$, $z'_0$) are the coordinates of the point of origin of S with respect
to S'. The constants $c_{ik}$ are the cosines of the angles between the axes
of S and S', $c_{11}$ referring to the angle between the X- and the X'-axis,
$c_{12}$ to the angle between the Y- and the X'-axis, $c_{21}$ to the angle between
the X- and the Y'-axis, and so forth.
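
As a hedged numerical illustration of a transformation of the form (1.1), the following Python sketch applies a table of direction cosines and an origin shift to the coordinates of a point. The function names `rotation_about_z` and `transform_point`, and the choice of a system S' whose axes are turned about the Z-axis of S, are assumptions made for this example only; they are not part of the text.

```python
import math

def rotation_about_z(angle):
    """Direction cosines c[i][k] for axes of S' turned by `angle` (radians)
    about the Z-axis of S. Hypothetical helper for illustration."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0],
            [-s, c, 0.0],
            [0.0, 0.0, 1.0]]

def transform_point(c, origin_shift, point):
    """Primed coordinates: x'_i = sum_k c[i][k] * x_k + (shift)_i.
    `origin_shift` holds the coordinates of the origin of S with respect to S'."""
    return [sum(c[i][k] * point[k] for k in range(3)) + origin_shift[i]
            for i in range(3)]

# Axes turned by 90 degrees; origin of S sits at (1, 0, 0) in S'.
c = rotation_about_z(math.pi / 2)
p_prime = transform_point(c, [1.0, 0.0, 0.0], [1.0, 0.0, 0.0])
origin_prime = transform_point(c, [1.0, 0.0, 0.0], [0.0, 0.0, 0.0])
```

Transforming the origin of S returns the shift itself, which agrees with the stated meaning of the additive constants.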

The transition from one coordinate system to another is called a 
coordinate transformation, and the equations connecting the point co-
ordinates of the two coordinate systems are called transformation
equations.

A coordinate system is necessary not only for the description of loca- 
tions, but also for the representation of vectors. Let us consider some 
vector field, for example, an electrostatic field, in the neighborhood of
the point P. The value and direction of the field strength E at P is
completely determined when we know the components of E with respect
to some stated coordinate system S. Let us call the components of E
at P with respect to S, $E_x$, $E_y$, and $E_z$. The components of E at P
with respect to another system, for instance S', can be computed if we
know the transformation equations defining the coordinate transforma-
tion S into S'. These new components, $E'_x$, $E'_y$, and $E'_z$, are inde-
pendent of the translation of the point of origin, that is, the constants
$x'_0$, $y'_0$, and $z'_0$ of (1.1). $E'_x$ is the sum of the projections of $E_x$, $E_y$,
and $E_z$ on the X'-axis, and $E'_y$ and $E'_z$ are determined similarly,

$$
\begin{aligned}
E'_x &= c_{11}E_x + c_{12}E_y + c_{13}E_z, \\
E'_y &= c_{21}E_x + c_{22}E_y + c_{23}E_z, \\
E'_z &= c_{31}E_x + c_{32}E_y + c_{33}E_z.
\end{aligned}
$$

A law which expresses the components of a certain quantity at a point
in terms of the components of the same quantity at the same point
with respect to another coordinate system is called a transformation law.
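
The transformation law for vector components can be sketched numerically as follows. This is only an illustration under stated assumptions: the helper names `transform_vector` and `magnitude`, the 30° turn about the Z-axis, and the sample field components are all hypothetical. The sketch shows that the new components are obtained from the direction cosines alone, without any origin shift, and that the magnitude of the vector is unchanged.

```python
import math

def transform_vector(c, vec):
    """New components: E'_i = sum_k c[i][k] * E_k (no translation term)."""
    return [sum(c[i][k] * vec[k] for k in range(3)) for i in range(3)]

def magnitude(vec):
    """Length of a vector from its Cartesian components."""
    return math.sqrt(sum(v * v for v in vec))

# Direction cosines for axes turned by 30 degrees about the Z-axis (assumed example).
theta = math.radians(30.0)
c = [[math.cos(theta), math.sin(theta), 0.0],
     [-math.sin(theta), math.cos(theta), 0.0],
     [0.0, 0.0, 1.0]]

E = [3.0, 4.0, 12.0]            # sample components with respect to S
E_prime = transform_vector(c, E)  # components with respect to S'
```

The individual components change, but `magnitude(E_prime)` agrees with `magnitude(E)` to rounding error, as expected for a quantity that does not depend on the coordinate system.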

Coordinate transformations involving time. We have thus far con- 
sidered only transformations which lead from one coordinate system to 



another one rigidly connected with the same reference body, such as the 
earth. But coordinate transformation offers an important method for 
investigating the relationship between two different frames of reference 
which move relatively to each other. In such a case, we represent each 
of the two frames by one coordinate system. 

Let us compare a frame of reference rigidly connected with the earth 
and another one connected with Newton's pail, which we assume is
rotating with constant angular velocity. We can introduce two co- 
ordinate systems which enable us to describe quantitatively the location 
of any point with respect to either frame of reference. Let us call these 
two coordinate systems S (this S is not identical with the former S)
and S*, respectively, and let us choose the points of origin so that they
both lie on the axis of the pail and coincide; the Z-axis and the Z*-axis
may be identical and pointing straight up. If the pail rotates with a
constant angular velocity ω relative to the earth, and if at the time
t = 0 the X-axis is parallel to the X*-axis, the coordinate transforma-
tion equations take the form (Fig. 1)

$$
\begin{aligned}
x^* &= \cos\omega t \cdot x + \sin\omega t \cdot y, \\
y^* &= -\sin\omega t \cdot x + \cos\omega t \cdot y, \\
z^* &= z.
\end{aligned}
\tag{1.2}
$$

Eqs. (1.2) have a form similar to eqs. (1.1), except that the cosines are 
no longer constant, but functions of time. The relative motion of the 
two frames of reference expresses itself in the functional dependence of 
the $c_{ik}$ on time.
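
A small Python sketch may make the time dependence of the cosines concrete. It assumes the rotation takes place about the common Z-axis with angular velocity ω, so that x* = cos ωt · x + sin ωt · y, y* = −sin ωt · x + cos ωt · y, z* = z; the function name `to_rotating_frame` and the sample numbers are assumptions for illustration only.

```python
import math

def to_rotating_frame(omega, t, x, y, z):
    """Coordinates with respect to S*, which rotates about the Z-axis of S
    with constant angular velocity `omega` (axes parallel at t = 0)."""
    wt = omega * t
    x_star = math.cos(wt) * x + math.sin(wt) * y
    y_star = -math.sin(wt) * x + math.cos(wt) * y
    return x_star, y_star, z

# A point at rest in S, at (1, 0, 0), observed from S* at several times.
omega = math.pi / 2           # a quarter turn per unit time (assumed value)
at_start = to_rotating_frame(omega, 0.0, 1.0, 0.0, 0.0)
after_quarter_turn = to_rotating_frame(omega, 1.0, 1.0, 0.0, 0.0)
```

The point is at rest with respect to S, yet its starred coordinates change with time: it describes a circle with respect to S*.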

Eqs. (1.2) express the relationship between two frames of reference
which are rotating relatively to each other. Very often we are interested
in the relationship between two frames of reference which are in a state
of uniform translatory motion relative to each other. In that case, it
is convenient to choose the two coordinate systems S and S* so that
their corresponding axes are parallel to each other and so that the points
of origin coincide at the time t = 0. The transformation equations
have the form:

$$
\begin{aligned}
x^* &= x - v_x t, \\
y^* &= y - v_y t, \\
z^* &= z - v_z t,
\end{aligned}
\tag{1.3}
$$

where $v_x$, $v_y$, and $v_z$ are the components of the velocity of S* relatively
to S.

The form of the transformation equations (1.2) and (1.3) depends, of 



course, on the relative motion of the two frames of reference, but it 
also depends on certain assumptions regarding the nature of time and 
space: We assume that it is possible to define a time t independently of 
any particular frame of reference, or, in other words, that it is possible 
to build clocks which are not affected by their state of motion. This 


Fig. 1. The coordinate system S* with the coordinates (x*, y*, z*) rotates
relatively to the coordinate system S with the coordinates (x, y, z) with the
angular velocity ω.

assumption is expressed in our transformation equations by the absence 

of a transformation equation for t. If we wish, we can add the equation

$$
t^* = t, \tag{1.4}
$$

expressing the universal character of time explicitly.

The other assumption concerns length measurements. We assume 
that the distance between two points (they may be particles) at a
given time is quite independent of any particular frame of reference; 



that is, we assume that we can construct rigid measuring rods whose 
length is independent of their state of motion. Eqs. (1.3) show with 
particular clarity how this assumption is expressed by the form of the
transformation equations. For the distance between two points $P_1$ and
$P_2$ with the coordinates ($x_1$, $y_1$, $z_1$) and ($x_2$, $y_2$, $z_2$) is

$$
s_{12} = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}, \tag{1.5}
$$

and obviously

$$
\sqrt{(x_2^* - x_1^*)^2 + (y_2^* - y_1^*)^2 + (z_2^* - z_1^*)^2}
= \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}
$$

is satisfied for any time t.
We shall have to consider these assumptions again at a later time. 
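
The statement about length measurements can be checked numerically. The sketch below assumes a uniform translation of the form x* = x − v_x t, y* = y − v_y t, z* = z − v_z t applied to both points at the same time t; the function names `galilean` and `distance` and all sample values are assumptions chosen for illustration.

```python
import math

def galilean(v, t, point):
    """Starred coordinates of `point` at time `t` for a system S* moving
    with constant velocity `v` relative to S (origins coincide at t = 0)."""
    return tuple(p - vi * t for p, vi in zip(point, v))

def distance(p1, p2):
    """Euclidean distance between two points given by Cartesian coordinates."""
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(p1, p2)))

v = (3.0, -2.0, 5.0)          # assumed relative velocity of S*
t = 7.0                        # both points are located at the same time t
p1, p2 = (0.0, 0.0, 0.0), (1.0, 2.0, 2.0)

s12 = distance(p1, p2)
s12_star = distance(galilean(v, t, p1), galilean(v, t, p2))
```

Because the same term v t is subtracted from both points, the coordinate differences, and therefore the distance, are the same in S and S*. The check relies on both points being taken at the same time, which is where the assumption of a universal time enters.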


Classical Mechanics 

The law of inertia, inertial systems. The branch of physics which 
from the first was most consistently developed as an experimental science
was Galilean-Newtonian mechanics. The first law to be formulated 
was the law of inertia: Bodies when removed from interaction with other 
bodies will continue in their states of rest or straight-line uniform motion.
In other words, the motion of such bodies is unaccelerated.

To express the law of inertia in mathematical form, we designate the 
location of a body by its three coordinates, x, y, and z. When a body is 
not at rest, its coordinates are functions of time. According to the law 
of inertia, the second time derivatives of these three functions, the ac- 
celerations, vanish when the body is not subjected to forces, that is, 

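$$
\ddot{x} = 0, \qquad \ddot{y} = 0, \qquad \ddot{z} = 0. \tag{2.1}
$$

We use the usual notation $\dot{x}$ for $\frac{dx}{dt}$, with two dots denoting the
second time derivative. The first integral of eqs. (2.1) ex-
presses the constancy of the three velocity components,

$$
\dot{x} = u_x, \qquad \dot{y} = u_y, \qquad \dot{z} = u_z. \tag{2.2}
$$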

The equations expressing the law of inertia contain coordinates and 
refer, therefore, to a certain coordinate system. As long as this co- 
ordinate system is not specified, the italicized statement does not have 
a precise meaning. For, given any body, we can always introduce a 
frame of reference with respect to which it is at rest and, therefore, 
unaccelerated. The real assertion is, rather: There exists a coordinate
system (or coordinate systems) with respect to which all bodies not subjected 
to forces are unaccelerated. Coordinate systems with this property and 
the frames of reference represented by them are called inertial systems. 

Of course, not all frames of reference are inertial systems. For in- 
stance, let us start out from an inertial coordinate system S, and carry 
out a transformation (1.2), leading to S*, a system rotating with a
constant angular velocity ω relative to S. In order to obtain the
transformation laws of eqs. (2.1) and (2.2), we differentiate the trans- 



formation equations (1.2) once and then a second time with respect 

to i. The resulting equations contain x, y, z, x*, ?/*, 2*, and the first 
and second time derivatives of these quantities. 

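We assumed that the coordinate system S is an inertial system. We, therefore,
substitute for $\ddot{x}$, $\ddot{y}$, and $\ddot{z}$ and for $\dot{x}$, $\dot{y}$, and $\dot{z}$ the expressions
(2.1) and (2.2), respectively. Thus, we obtain, for the starred
coordinates and their derivatives,

$$\begin{aligned}
\dot{x}^* &= \omega y^* + u_x \cos\omega t + u_y \sin\omega t,\\
\dot{y}^* &= -\omega x^* + u_y \cos\omega t - u_x \sin\omega t,\\
\dot{z}^* &= u_z,
\end{aligned} \tag{2.3}$$

$$\begin{aligned}
\ddot{x}^* &= \omega^2 x^* + 2\omega \dot{y}^*,\\
\ddot{y}^* &= \omega^2 y^* - 2\omega \dot{x}^*,\\
\ddot{z}^* &= 0.
\end{aligned} \tag{2.4}$$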


It turns out that in the coordinate system S* the second time derivatives
do not all vanish. Occasionally it is desirable to work with frames
of reference in which accelerations occur which are not caused by real
interactions between bodies. These accelerations, multiplied by the
masses, are treated like real forces, often called "transport forces,"
"inertial forces," and so forth. In spite of these names, these expressions
are not actual forces; they merely appear in the equations formally
in the same way as forces do. In our case, the first terms, $\omega^2 x^*$, $\omega^2 y^*$,
multiplied by the mass, are called "centrifugal forces," and the last
terms, also multiplied by the mass, are the so-called Coriolis forces.
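
The appearance of centrifugal and Coriolis terms can be checked numerically. The following Python sketch is only an illustration under stated assumptions: it assumes that the rotation (1.2) has the form x* = x cos ωt + y sin ωt, y* = -x sin ωt + y cos ωt (eq. (1.2) is not reproduced in this section), and all function names and numerical values are hypothetical. It follows a force-free particle, computes its acceleration in S* by finite differences, and compares it with the sum of the centrifugal and Coriolis terms.

```python
import math

def starred_position(t, x0, y0, ux, uy, omega):
    """Coordinates in the rotating system S* of a force-free particle.

    In S the particle moves uniformly: x = x0 + ux*t, y = y0 + uy*t.
    ASSUMPTION: the rotation is x* = x cos(wt) + y sin(wt),
                                y* = -x sin(wt) + y cos(wt).
    """
    x, y = x0 + ux * t, y0 + uy * t
    c, s = math.cos(omega * t), math.sin(omega * t)
    return c * x + s * y, -s * x + c * y

def starred_velocity(t, x0, y0, ux, uy, omega):
    """Analytic first time derivatives of the starred coordinates."""
    xs, ys = starred_position(t, x0, y0, ux, uy, omega)
    c, s = math.cos(omega * t), math.sin(omega * t)
    return omega * ys + ux * c + uy * s, -omega * xs + uy * c - ux * s

def apparent_acceleration(t, x0, y0, ux, uy, omega):
    """Centrifugal term (omega^2 * position) plus Coriolis term (2 * omega * velocity)."""
    xs, ys = starred_position(t, x0, y0, ux, uy, omega)
    vx, vy = starred_velocity(t, x0, y0, ux, uy, omega)
    return omega**2 * xs + 2 * omega * vy, omega**2 * ys - 2 * omega * vx

def numerical_acceleration(t, x0, y0, ux, uy, omega, h=1e-3):
    """Second time derivative of the starred coordinates by central differences."""
    before = starred_position(t - h, x0, y0, ux, uy, omega)
    now = starred_position(t, x0, y0, ux, uy, omega)
    after = starred_position(t + h, x0, y0, ux, uy, omega)
    return tuple((a - 2 * n + b) / h**2 for a, n, b in zip(after, now, before))

if __name__ == "__main__":
    args = (1.0, 0.5, 0.3, -0.2, 0.7)  # x0, y0, ux, uy, omega: arbitrary values
    print(apparent_acceleration(2.0, *args))
    print(numerical_acceleration(2.0, *args))
```

The two printed pairs should agree to within the finite-difference error, although no force acts on the particle.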

On the other hand, there are also types of coordinate transformations 
which leave the form of the law of inertia (2.1) unchanged. As a case
in point, we shall consider first a transformation which involves no
transition to a new frame of reference, of the type (1.1). The differentiation
of eqs. (1.1) with substitution of $\dot{x}$, $\ddot{x}$, and so forth, from eqs. (2.1)
and (2.2) produces the equations

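$$\begin{aligned}
\dot{x}' &= c_{11}u_x + c_{12}u_y + c_{13}u_z = u_x',\\
\dot{y}' &= c_{21}u_x + c_{22}u_y + c_{23}u_z = u_y',\\
\dot{z}' &= c_{31}u_x + c_{32}u_y + c_{33}u_z = u_z',
\end{aligned} \tag{2.5}$$

$$\ddot{x}' = \ddot{y}' = \ddot{z}' = 0. \tag{2.6}$$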

The velocity components transform just as we would expect a vector
to transform, and eqs. (2.1) are reproduced in the new coordinates
without change.

Another transformation preserving the law of inertia is the type (1.3).
It corresponds to the transition from one frame of reference to another
one which is in a state of straight-line, uniform motion relative to the
first frame. Taking the second derivatives of eqs. (1.3), we obtain

$$\ddot{x}^* = \ddot{x}, \qquad \ddot{y}^* = \ddot{y}, \qquad \ddot{z}^* = \ddot{z}; \tag{2.7}$$

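and if the motion of the body satisfies the law of inertia (2.1) in the
coordinate system S, we have also:

$$\ddot{x}^* = \ddot{y}^* = \ddot{z}^* = 0, \tag{2.8}$$

while the first derivatives of the starred coordinates (if eqs. (2.2) apply
to the unstarred coordinates) are

$$\dot{x}^* = u_x - v_x, \qquad \dot{y}^* = u_y - v_y, \qquad \dot{z}^* = u_z - v_z. \tag{2.9}$$
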
Eq. (2.8) shows that the law of inertia holds in the new system as well
as in the old one. Eqs. (2.9) express the fact that the velocity components
in the new coordinate system S* are equal to those in the old
system minus the components of the relative velocity of the two
coordinate systems themselves. This law is often referred to as the
(classical) law of the addition of velocities.
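
As a small illustration of the classical addition of velocities, the following Python sketch transforms the path of a force-free body into a uniformly moving system and compares the resulting velocity with the difference of the two velocities. It assumes that transformation (1.3) has the form x* = x - v_x t, and so forth (eq. (1.3) is not reproduced in this section); the function names and numbers are hypothetical.

```python
def free_particle(t, start, u):
    """Position in S of a body moving uniformly with velocity u."""
    return tuple(s + ui * t for s, ui in zip(start, u))

def to_moving_system(position, t, v):
    """Coordinates in S*, which moves with constant velocity v relative to S.

    ASSUMPTION: the transformation reads x* = x - v_x * t, and so forth.
    """
    return tuple(p - vi * t for p, vi in zip(position, v))

def velocity_in_moving_system(u, v):
    """Classical law of the addition of velocities: u* = u - v."""
    return tuple(ui - vi for ui, vi in zip(u, v))

if __name__ == "__main__":
    start, u, v = (0.0, 1.0, 2.0), (3.0, 0.0, -1.0), (1.0, 2.0, 0.5)
    p0 = to_moving_system(free_particle(0.0, start, u), 0.0, v)
    p1 = to_moving_system(free_particle(2.0, start, u), 2.0, v)
    # Average velocity in S* over the interval from t = 0 to t = 2.
    average = tuple((b - a) / 2.0 for a, b in zip(p0, p1))
    print(average)
    print(velocity_in_moving_system(u, v))
```

Because the motion is uniform in both systems, the average velocity in S* coincides with u - v for any choice of the time interval.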

Frames of reference and coordinate systems in which the law (2.1) is 
valid are inertial systems. All Cartesian coordinate systems which are
at rest relative to an inertial coordinate system are themselves inertial
systems. Cartesian coordinate systems belonging to a frame of refer- 
ence which is in a state of straight-line, uniform motion relative to an 
inertial system are also inertial systems. On the other hand, when we 
carry out a transition to a new frame of reference which is in some state 
of accelerated motion relative to the first one, the corresponding co- 
ordinate transformation does not reproduce eqs. (2.1) in terms of the 
new coordinates. The acceleration of the new frame of reference rela- 
tive to an inertial system manifests itself in apparent accelerations of 
bodies not subject to real forces. 

Galilean transformations. If the form of a law is not changed by
certain coordinate transformations, that is, if it is the same law in
terms of either set of coordinates, we call that law invariant or covariant
with respect to the transformations considered. The law of inertia (2.1)
is covariant with respect to transformations (1.1) and (1.3), but not
with respect to (1.2).

Transformations (1.1) and (1.3) are of the greatest importance for our 
further discussions. They are usually referred to as Galilean transformations.
According to classical physics, any two inertial systems are
connected by a Galilean transformation.

The force law and its transformation properties. We shall now discuss 
the transformation properties of the basic laws of classical mechanics. 
These laws may be formulated thus.

When bodies are subject to forces, their accelerations do not vanish, but 
are proportional to the forces acting on them. The ratio of force to accelera- 
tion is a constant, different for every individual body: this constant is called 
the mass of the body. 

The total force acting on one body is the vector sum of all the forces caused 
by every other body of the mechanical system. In other words, the total 
interaction among a number of bodies is the combination of interactions of 
pairs. The forces which two bodies exert on each other lie in their connect- 
ing straight line and are equal except that they point in opposite directions; 
that is, two bodies can either attract or repel each other. The magnitude of 
these forces is a function of their distance only; neither velocities nor ac- 
celerations have any influence. 

These laws apply to such phenomena as gravitation, electrostatics, 
and van der Waals forces, but electrodynamics is not included because
the interaction between magnetic fields and electric charges produces 
forces whose direction is not in the connecting straight line, and which 
depend on the velocity of the charged body as well as on its position. 

But whenever the italicized conditions are satisfied, the forces can be 
represented by the negative derivatives of the potential energy. The 
latter is the sum of the potential energies characterizing the interaction 
of any two bodies or "mass points,"

$$V = \sum_{i=1}^{n} \sum_{k=i+1}^{n} V_{ik}(s_{ik}), \qquad
s_{ik} = \sqrt{(x_i - x_k)^2 + (y_i - y_k)^2 + (z_i - z_k)^2}. \tag{2.10}$$


The indices i and k refer to the interaction between the ith and the kth
mass points, and $s_{ik}$ is the distance between them. The functions
$V_{ik}(s_{ik})$ are given by the special nature of the problem, for example,
Coulomb's law, Newton's law of gravitation, and so forth.



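The force acting on the ith mass point is given by

$$\begin{aligned}
f_{ix} &= -\frac{\partial V}{\partial x_i} = -\sum_{k \neq i} \frac{dV_{ik}}{ds_{ik}}\,\frac{x_i - x_k}{s_{ik}},\\
f_{iy} &= -\frac{\partial V}{\partial y_i} = -\sum_{k \neq i} \frac{dV_{ik}}{ds_{ik}}\,\frac{y_i - y_k}{s_{ik}},\\
f_{iz} &= -\frac{\partial V}{\partial z_i} = -\sum_{k \neq i} \frac{dV_{ik}}{ds_{ik}}\,\frac{z_i - z_k}{s_{ik}}.
\end{aligned} \tag{2.11}$$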

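The set of equations (2.11), by its form, implies that the force components
due to the interaction of the ith and the kth bodies alone are
equal, except for opposite signs, that is,

$$\frac{\partial V_{ik}}{\partial x_i} = -\frac{\partial V_{ik}}{\partial x_k}.$$

Therefore, the sum of all forces acting on all n mass points vanishes,

$$\sum_{i=1}^{n} f_{ix} = \sum_{i=1}^{n} f_{iy} = \sum_{i=1}^{n} f_{iz} = 0.$$

The differential equations governing the motions of the bodies are

$$m_i \ddot{x}_i = f_{ix}, \qquad m_i \ddot{y}_i = f_{iy}, \qquad m_i \ddot{z}_i = f_{iz}, \tag{2.13}$$

where $m_i$ is the mass of the ith body.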

We are now going to show that the system of equations determining the 
behavior of a mechanical system, (2.10), (2.11), and (2.13), is covariant
with respect to Galilean transformations. 

Let us start with eq. (2.10). V depends on the distances $s_{ik}$ of the
various mass points from each other. How do the $s_{ik}$ change (transform)
when a coordinate transformation (1.1) or (1.3) is carried out?
In order to answer that question, it must be kept in mind that the
coordinates of the ith and of the kth body are to be taken at the same
time; in other words, that the distance between the two bodies is itself
a function of time. Of course, the coordinates of the various mass
points transform independently of each other, each set $(x_i, y_i, z_i)$ by
itself, according to the transformation equations (1.1) or (1.3), respectively.

Considering these points, it is seen immediately that transformation
(1.3), corresponding to the straight-line, uniform motion, leaves the
coordinate differences, for example, $x_i - x_k$, unchanged, or,

$$x_i^* - x_k^* = x_i - x_k. \tag{2.14}$$

Therefore, the $s_{ik}$ themselves take the same form in the new coordinate
system S* which they have in S.

The transformation equations (1.1) express the relationship between 
two coordinate systems which are at rest relative to each other and whose 
axes are not parallel. Obviously, the distance between two points is 
expressed in the same way in either coordinate system; so that 

$$\begin{aligned}
\sqrt{(x_i - x_k)^2 + (y_i - y_k)^2 + (z_i - z_k)^2}
&= \sqrt{(x_i' - x_k')^2 + (y_i' - y_k')^2 + (z_i' - z_k')^2},\\
s_{ik} &= s_{ik}'.
\end{aligned} \tag{2.15}$$

A quantity which does not change its value (at a given point) when a
coordinate transformation is carried out is called an invariant with
respect to that transformation. The distance between two points is an
invariant.

We have seen that the arguments of the function V, the $s_{ik}$, are invariant
with respect to Galilean coordinate transformations. Therefore,
the function V itself, the total potential energy of the mechanical
system, is an invariant, too; expressed in terms of the new coordinates,
it has the same form and takes the same values as in the original
coordinate system. Eq. (2.10) is covariant with respect to Galilean
transformations.
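
The invariance of the distance between two mass points can be illustrated with a short Python sketch. It is a check under assumptions, not part of the original argument: a rotation about the z-axis followed by a shift of the origin stands in for a transformation of type (1.1), and x* = x - v_x t, and so forth, stands in for type (1.3); all names and numbers are hypothetical.

```python
import math

def rotate_and_shift(p, angle, shift):
    """A special case of a change of axes: rotation about z, then a shift of origin."""
    x, y, z = p
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * y + shift[0], -s * x + c * y + shift[1], z + shift[2])

def uniform_motion(p, t, v):
    """Transition to a system in straight-line, uniform motion (assumed form)."""
    return tuple(a - vi * t for a, vi in zip(p, v))

def distance(p, q):
    """Distance between two mass points taken at the same time."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

if __name__ == "__main__":
    p, q = (1.0, 2.0, 3.0), (-2.0, 0.5, 4.0)
    t, v = 5.0, (10.0, -3.0, 2.0)
    print(distance(p, q))
    print(distance(rotate_and_shift(p, 0.8, (1.0, 1.0, 1.0)),
                   rotate_and_shift(q, 0.8, (1.0, 1.0, 1.0))))
    print(distance(uniform_motion(p, t, v), uniform_motion(q, t, v)))
```

Both coordinates must be transformed at the same time t; this is the point made in the text about taking the coordinates of the two bodies simultaneously.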

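Let us proceed to eqs. (2.11) and again begin with transformation
(1.3). The right-hand sides of eqs. (2.11) contain the derivatives of a
quantity which we already know is an invariant. These derivatives
with respect to the two sets of coordinates are related to each other by
the equations

$$\frac{\partial V}{\partial x_i^*} = \frac{\partial V}{\partial x_i}, \qquad
\frac{\partial V}{\partial y_i^*} = \frac{\partial V}{\partial y_i}, \qquad
\frac{\partial V}{\partial z_i^*} = \frac{\partial V}{\partial z_i}, \tag{2.16}$$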


and therefore, the right-hand side of eqs. (2.11) is invariant with respect
to transformations (1.3). Whether the same holds true for the left-hand
side, we shall be able to decide after discussing the transformation
properties of eqs. (2.13). It is clear, however, that the equation remains
valid in the new coordinate system only if both sides transform the
same way. Otherwise, it is not covariant with respect to the
transformation considered. We shall have to find out whether the
transformation properties of the right-hand side of eqs. (2.11) are compatible
with those of the left-hand side of eqs. (2.13), as both determine the
transformation properties of the forces, $f_i$.

Let us first transform the right-hand side of eqs. (2.11) by a trans- 
formation (1.1). Applying the rules of partial differentiation, we obtain 






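$$\begin{aligned}
\frac{\partial V}{\partial x_i} &= c_{11}\frac{\partial V}{\partial x_i'} + c_{21}\frac{\partial V}{\partial y_i'} + c_{31}\frac{\partial V}{\partial z_i'},\\
\frac{\partial V}{\partial y_i} &= c_{12}\frac{\partial V}{\partial x_i'} + c_{22}\frac{\partial V}{\partial y_i'} + c_{32}\frac{\partial V}{\partial z_i'},\\
\frac{\partial V}{\partial z_i} &= c_{13}\frac{\partial V}{\partial x_i'} + c_{23}\frac{\partial V}{\partial y_i'} + c_{33}\frac{\partial V}{\partial z_i'}.
\end{aligned} \tag{2.17}$$

The 3n equations (2.17) can be separated into n groups of 3 equations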
each, these groups being identical save for the value of i. Each group 
transforms as the components of a vector, that is, each component in 
one system is equal to the sum of the projections upon it of the three 
components in the other system. 

Whether the left-hand sides of eqs. (2.11) also have vector character 
must be decided after discussion of the transformation properties of 

eqs. (2.13). 

The left-hand sides of eqs. (2.13) are products of masses and accelerations.
We have already stated that in classical physics the mass is considered
to be a constant of a body, independent of its state of motion
and invariant with respect to coordinate transformations.

That the accelerations of a body are invariant with respect to
transformation (1.3) we have already seen in eq. (2.7). Therefore, the
left-hand sides of eqs. (2.13) transform with respect to (1.3) in the same way
as the right-hand sides of eqs. (2.11).

Turning to transformations (1.1), we know that

$$\ddot{x}_i' = c_{11}\ddot{x}_i + c_{12}\ddot{y}_i + c_{13}\ddot{z}_i, \quad \text{and so forth,} \tag{2.18}$$

but because the $c_{ab}$ have the significance of cosines of angles, and because
the value of a cosine does not depend on the sign of the angle,

$$\cos\alpha = \cos(-\alpha),$$

it is also true that

$$\ddot{x}_i = c_{11}\ddot{x}_i' + c_{21}\ddot{y}_i' + c_{31}\ddot{z}_i', \quad \text{and so forth.} \tag{2.18a}$$

Again, the left-hand sides of eqs. (2.13) transform in exactly the same
way as the right-hand sides of (2.11), in this case as n vectors.

Eqs. (2.13) can be considered as the equations defining the forces
$f_i$. We conclude, therefore, that the forces themselves transform so
that both eqs. (2.11) and (2.13) are covariant. With respect to spatial,
orthogonal transformations of the coordinate system, the forces are vectors,
and they are invariant with respect to transformations representing a
straight-line uniform motion of one system relative to the other. These
relations can be expressed in a slightly different form. By eliminating the
quantities $f_i$ from eqs. (2.11) and (2.13) we can combine them into new
equations of the form




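$$\begin{aligned}
\frac{\partial V}{\partial x_i} + m_i \ddot{x}_i &= 0,\\
\frac{\partial V}{\partial y_i} + m_i \ddot{y}_i &= 0,\\
\frac{\partial V}{\partial z_i} + m_i \ddot{z}_i &= 0.
\end{aligned} \tag{2.19}$$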



These equations contain the essential physical statements of eqs. (2.11)
and (2.13), but do not bring out so clearly the force concept.

The result of the above consideration is that the two sides of each of
equations (2.11) and (2.13) transform in the same way, and that,
therefore, these equations remain valid when arbitrary Galilean
transformations are carried out.
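
The transformation behavior of the forces can also be checked numerically for a single pair of bodies. The Python sketch below is an illustration under assumptions: it uses an inverse-square attraction, V(s) = -k/s, as one example of a pair potential, a rotation about the z-axis as a special orthogonal transformation, and x* = x - v_x t as the assumed form of the uniform motion; all names and values are hypothetical.

```python
import math

def pair_force(r_i, r_k, dV_ds):
    """Force on body i due to body k: minus the gradient of V(s) with respect to r_i."""
    s = math.sqrt(sum((a - b) ** 2 for a, b in zip(r_i, r_k)))
    slope = dV_ds(s)
    return tuple(-slope * (a - b) / s for a, b in zip(r_i, r_k))

def rotate_about_z(p, angle):
    """Components of a point or vector in axes rotated about z by `angle`."""
    x, y, z = p
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * y, -s * x + c * y, z)

def uniform_motion(p, t, v):
    """Coordinates in a system moving uniformly with velocity v (assumed form)."""
    return tuple(a - vi * t for a, vi in zip(p, v))

def inverse_square_slope(s, k=2.0):
    """dV/ds for the example potential V(s) = -k / s."""
    return k / s**2

if __name__ == "__main__":
    r1, r2 = (1.0, 0.0, 2.0), (4.0, 4.0, 2.0)
    t, v, angle = 3.0, (5.0, -1.0, 0.5), 0.6
    f = pair_force(r1, r2, inverse_square_slope)
    # Uniform motion: the force components are unchanged (invariant).
    print(f, pair_force(uniform_motion(r1, t, v), uniform_motion(r2, t, v),
                        inverse_square_slope))
    # Rotation of the axes: the force components transform like a vector.
    print(rotate_about_z(f, angle),
          pair_force(rotate_about_z(r1, angle), rotate_about_z(r2, angle),
                     inverse_square_slope))
```

In each printed pair the two force vectors should agree up to rounding, which is the content of the statement that the forces are invariant under uniform motion and vectors under rotations of the axes.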

Equations which do not change at all with the transformation (that
is, the terms of which are invariants) are called invariant. Equations
which remain valid because their terms, though not invariant, transform
according to identical transformation laws (such as the terms $\partial V/\partial x_i$ and
$m_i \ddot{x}_i$, and so forth, in eqs. (2.19)) are called covariant.

The covariance of equations is the mathematical property which
corresponds to the existence of a relativity principle for the physical
laws expressed by those equations. In fact, the relativity principle of
classical mechanics is equivalent to our result, that the laws of mechanics
take the same form in all inertial systems, that is, in all those
coordinate systems which can be obtained by subjecting any one inertial
system to arbitrary Galilean transformations.

The other branches of mechanics, such as the treatment of continuous 
matter (the theory of elastic bodies and hydrodynamics) or the mechanics
of rigid bodies, can be deduced from the mechanics of free mass
points by introducing suitable interaction energies of the type (2.10), 
and by carrying out certain limiting processes. It is evident, even 
without a detailed treatment of these branches of mechanics, that the 
results obtained apply to them as well as to the laws of motion of free 
mass points. 

The Propagation of Light 

The problem confronting classical optics. During the nineteenth cen- 
tury, a new branch of physics was developed which could not be brought 
within the realm of mechanics. That branch was electrodynamics. As 
long as only electrostatic and magnetostatic effects were known, they
could be treated within the framework of mechanics by the introduction
of electrostatic and magnetostatic potentials which depended only on
the distance of the electric charges or magnetic poles from each other.

The interaction of electric and magnetic fields required a different 
treatment. This was brought out clearly by Oersted's experiment. He 
found that a magnetized needle was deflected from its normal North-South
direction by a current flowing through an overhead North-South
wire. The sign of the deflection was reversed when the direction of the 
current was reversed. Obviously, the magnetic actions produced by 
electric currents, that is, by moving charges, depend not only on the 
distance but also on the velocity of these charges. Furthermore, the 
force does not have the direction of the connecting straight line. The 
concepts of Newton's mechanics are no longer applicable. 

Maxwell succeeded in formulating the laws of electromagnetism by
introducing the new concept of "field." As we have seen in the preceding
chapter, in mechanics a system is completely described when the
locations of the constituent mass points are known as functions of time.
In Maxwell's theory, we encounter a certain number of "field variables,"
such as the components of the electric and magnetic field strengths.
While the point coordinates of mechanics are defined as functions of the
time coordinate alone, the field variables are defined for all values both
of the time coordinate and of the three space coordinates, and are thus
functions of four independent variables.

1 In the mechanics of continuous media, we find variables which resemble field
variables: the mass density, momentum density, stress components, and so forth;
but they have only statistical significance. They are the total mass of the av- 
erage number of particles per unit volume, and so forth. In electrodynamics, 
however, the field variables are assumed to be the basic physical quantities. 


In Maxwell's field theory it is further assumed that the change of the
field variables with time at a given space point depends only on the 
immediate neighborhood of the space point. A disturbance of the field 
at a point induces a change of the field in its immediate neighborhood, 
this in turn causes a change farther away, and thus the original disturb- 
ance has a tendency to spread with a finite velocity and to make itself 
felt eventually over a great distance. "Action at a distance" may thus 
be produced by the field, but always in connection with a definite lapse 
of time. The laws of a field theory have the form of partial differential
equations containing the partial derivatives of the field variables with 
respect to the space coordinates and with respect to time. 

The force acting upon a mass point is determined by the field in the 
immediate neighborhood of the mass point. Conversely, the presence 
of the mass point may and usually does modify the field. 

Since the structure of Maxwell's theory of electromagnetism is so 
different from Newtonian mechanics, the validity of the relativity prin- 
ciple in mechanics by no means implies its extension to electrodynamics. 
Whether or not this principle applies to the laws of the electromagnetic
field must be the subject of a new investigation.

A complete investigation of this kind would have to establish the 
transformation laws of the electric and magnetic field intensities with 
respect to Galilean transformations, and then determine whether the
transformed quantities obey the same laws with respect to the new
coordinates. Such investigations were carried out by various scientists,
among them H. Hertz and H. A. Lorentz. But we can obtain the most
important result of these investigations by much simpler considerations. 
Instead of treating Maxwell's field equations themselves, we shall con- 
fine ourselves at present to one of their aspects, the propagation of 
electromagnetic waves. 

Maxwell himself recognized that electromagnetic disturbances, such 
as those produced by oscillating charges, propagate through space with 
a velocity which depends on the electric nature of the matter present in 
space. In the absence of matter, the velocity of propagation is independent
of its direction, and equal to about $3 \times 10^{10}$ cm sec$^{-1}$. This is
equal to the known speed of light; Maxwell assumed, therefore, that 
light was a type of electromagnetic radiation. When Hertz was able to 
produce electromagnetic radiation by means of an electromagnetic appa- 
ratus, Maxwell's theory of the electromagnetic field and his electro- 
magnetic theory of light were accepted as an integral part of our physical 

Electromagnetic radiation propagates in empty space with a uniform,
constant velocity (hereafter denoted by c). This conclusion can be [...]
we can study the transformation properties of [...] with respect to
Galilean transformations. Let us consider a coordinate system S with
respect to which light has the uniform speed c. If we carry out a Galilean
transformation of the type (1.3), corresponding to a straight-line, uniform
translatory motion of the new system S* relative to S, the speed of light
in S* will not be equal to c in all directions. If the direction of a light
ray has the direction cosines cos α, cos β, and cos γ with respect to S,
its velocity components with respect to S are

$$c_x = c\cos\alpha, \qquad c_y = c\cos\beta, \qquad c_z = c\cos\gamma, \qquad
\cos^2\alpha + \cos^2\beta + \cos^2\gamma = 1. \tag{3.1}$$

According to eqs. (2.9), the velocity components with respect to the new
system S* are

$$\begin{aligned}
c_x^* &= c\cos\alpha - v_x,\\
c_y^* &= c\cos\beta - v_y,\\
c_z^* &= c\cos\gamma - v_z,
\end{aligned} \tag{3.2}$$

and the speed of light depends on its direction as indicated by the
expression

$$c^* = \sqrt{c^2 + v^2 - 2c\,(v_x\cos\alpha + v_y\cos\beta + v_z\cos\gamma)}. \tag{3.3}$$

[...] It appears, thus, that [...] the laws of electromagnetic radiation, [...] of
electromagnetic fields. If confidence in Maxwell's [...], there must exist [...]
system with respect to which the field equations take their standard
form. Any frame of reference which is in a state of motion relative to [...]
chapter would apply only to mechanics, not to the whole of physics.

[...] made to overcome these difficulties. We shall now consider the more
important ones.
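
The direction dependence of the speed of light that follows from the classical addition of velocities can be sketched numerically. The Python example below is an illustration only: it applies the subtraction of the relative velocity to a light ray with given direction and compares the result with the closed expression c*² = c² + v² - 2c (v_x cos α + v_y cos β + v_z cos γ). The function names and the chosen velocity (of the order of the earth's orbital speed) are assumptions made for the example.

```python
import math

C = 3.0e10  # speed of light in S, cm/sec

def speed_from_components(direction, v, c=C):
    """Speed of a light ray in S*, obtained by subtracting v from its velocity in S."""
    norm = math.sqrt(sum(d * d for d in direction))
    cosines = [d / norm for d in direction]          # direction cosines in S
    components = [c * k - vi for k, vi in zip(cosines, v)]
    return math.sqrt(sum(x * x for x in components))

def speed_from_formula(direction, v, c=C):
    """The same speed from c*^2 = c^2 + v^2 - 2 c (v . direction cosines)."""
    norm = math.sqrt(sum(d * d for d in direction))
    cosines = [d / norm for d in direction]
    v_squared = sum(vi * vi for vi in v)
    projection = sum(k * vi for k, vi in zip(cosines, v))
    return math.sqrt(c * c + v_squared - 2.0 * c * projection)

if __name__ == "__main__":
    v = (3.0e6, 0.0, 0.0)  # relative velocity of S*, cm/sec (illustrative)
    for direction in [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (1, 2, 2)]:
        print(direction,
              speed_from_components(direction, v),
              speed_from_formula(direction, v))
```

Along the relative velocity the speed is c - v, against it c + v, and at right angles the square root of c² + v²; only in a system at rest relative to S would the speed be c in all directions.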

[...] be referred to a frame connected [...]

[...] field theory of [...]

a corpuscular theory of the type [...]

is consistent with the principle of relativity [...]
contains explicitly the velocity [...]

also have comparatively great velocities [...] distance
divided by the speed of light. If v is the variation of velocity of
one component of the double star, one would have

$$t = \frac{d}{c}, \qquad \frac{\Delta t}{\Delta c} \approx -\frac{d}{c^2}, \qquad |\Delta t| \approx \frac{d}{c^2}\,v$$

(d is the distance between double star and earth, c is the speed of light,
and t is the mean time for light to reach the earth). Reasonable
assumptions for the order of magnitude of the quantities v, d, and c are:

10 cm sec , 

■ 10 B em sec" 1 , 

d > Iff cm; 


M > 10* sec. 

As there are many double star systems for which d exceeds $10^{21}$ cm and
which have periods less than $10^{6}$ sec, the resulting effects could not
escape observation.

However, no trace of any such effect has ever been observed. This is
sufficiently conclusive to rule out further consideration of this hypothesis.

The transmitting medium as the frame of reference. Another hy- 
pothesis was that, whenever light was transmitted through a material 
medium, this medium was the "local" privileged frame of reference.
Within the atmosphere of the earth, the speed of light should be uniform 
with respect to a geocentric frame of reference. 

This hypothesis, too, is unsatisfactory in many respects. Let us 
assume, for the sake of the argument, that it is the transmitting medium 
and its state of motion which determine the speed of light. Suppose,
now, that electromagnetic radiation goes from one medium in a certain 
state of motion to a second medium in a different state of motion. The 
speed of light would be bound to change, this change depending on the 
relative velocity of the two media, and on the direction of the radiation 
(also, of course, on the difference of indices of refraction). If this experiment
should be carried out with increasingly rarefied media, the interaction
between matter and radiation would become less and less, as far
as refraction, scattering, and so forth, are concerned, but the change of u
would remain the same. In the case of infinite dilution, that is, of a 
vacuum, we should have a finite jump in u without apparent cause. 

There is also experimental evidence bearing on this hypothesis. In 
order to obtain information on the influence of a moving medium on the
speed of light, Fizeau carried out the following experiment. He sent a
ray of light through a pipe filled with a flowing liquid, and measured the 
speed of light in both the negative and positive directions of flow. He 
determined these speeds accurately by measuring the position of inter- 
ference fringes. 

The experiment showed that the speed of light does depend on the
velocity of the flowing liquid, but not to the extent that the velocities of
the light and of the medium could simply be added. If we denote the
speed of light by c, the velocity of the liquid by v, and the index of
refraction by n, we should expect, according to our present assumption,
that the observed speed of light is

$$u = \frac{c}{n} \pm v,$$

the sign depending on the relative directions of the light and the flow.
The actual result was that the observed speed of light is, within the
limits of experimental error, given by

$$u = \frac{c}{n} \pm v\left(1 - \frac{1}{n^2}\right).$$

This experimental result is consistent with the first objection. For,
as the medium is increasingly rarefied, the index of refraction n approaches
the value 1, the dependence of u on v becomes negligible, and,
in the limiting case of infinite dilution, u becomes simply c.
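
The difference between the two expressions for u, and the limit n → 1, can be made concrete with a few lines of Python. This is a sketch with assumed illustrative values (an index of refraction of 1.33, roughly that of water, and a flow speed of 700 cm/sec are not taken from the text); v is counted positive when the light travels with the flow.

```python
C = 3.0e10  # speed of light in vacuum, cm/sec

def speed_if_fully_carried(n, v, c=C):
    """Speed expected if the velocities of light and medium simply added."""
    return c / n + v

def speed_observed(n, v, c=C):
    """Speed with the partial dependence on v found by Fizeau."""
    return c / n + v * (1.0 - 1.0 / n**2)

if __name__ == "__main__":
    v = 700.0  # flow speed, cm/sec (illustrative)
    for n in (1.33, 1.1, 1.001, 1.0):
        carried = speed_if_fully_carried(n, v) - C / n
        observed = speed_observed(n, v) - C / n
        print(n, carried, observed)
```

The change of speed due to the flow is always smaller than v and goes to zero as n approaches 1, whereas the simple addition of velocities would leave a finite change v even in a vacuum.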

Another effect which indicates that the speed of light does not depend 
on the motion of a rarefied medium of transmission is that of aberration. 
Fixed stars at a great distance change their relative positions in the sky 
in a systematic way with a period of one year. Their paths are ellipses 
around fixed centers, with the major axis in all cases approximately 41" 
of arc. Stars near the celestial pole carry out movements that are
approximately circles, while stars near the ecliptic have paths which are
nearly straight lines. 

Fig. 2 illustrates the way the star is seen away from its "normal" 
position (the center of the ellipse) at two typical points of the path of 
the earth around the sun. 

Aberration can be explained thus (Fig. 3): As the telescope is rigidly
connected with the earth, it goes through space at an approximate rate
of $3 \times 10^{6}$ cm sec$^{-1}$. Therefore, when a light ray enters the telescope,
let us say from straight above, the telescope must be inclined in the
indicated manner, so that the lower end will have arrived straight below
the former position of the upper end by the time that the light ray has
arrived at the lower end. The tangent of the angle of aberration, α,
must be the ratio between the distance traversed by the earth and the
distance traversed by the light ray during the same time interval, or the
ratio between the speed of the earth and the speed of light. This
ratio is

$$\tan\alpha = \frac{3 \times 10^{6}}{3 \times 10^{10}} = 10^{-4};$$

the angle corresponding to that tangent is 20.5", the amount of greatest
aberration from the center of the ellipse.
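
The conversion from the ratio of speeds to an angle in seconds of arc can be reproduced with a short Python sketch. The function name is hypothetical, and the input values are the rounded figures used in the text; with these rounded values the result comes out near 20.6", slightly above the quoted 20.5".

```python
import math

def aberration_arcsec(v_earth, c):
    """Angle of aberration, in seconds of arc, from tan(alpha) = v_earth / c."""
    alpha_radians = math.atan(v_earth / c)
    return math.degrees(alpha_radians) * 3600.0

if __name__ == "__main__":
    # Rounded values in cm/sec: orbital speed of the earth and speed of light.
    print(aberration_arcsec(3.0e6, 3.0e10))
```

Twice this angle gives the major axis of the apparent ellipse, in agreement with the value of approximately 41" stated for it.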

Fig. 2. The apparent change in position of a fixed star during a year
(aberration). This change is exaggerated in the figure and amounts to not
more than about 20".5.

This explanation of aberration again contradicts the assumption that 

the transmitting medium is decisive for the speed of light. For if this
assumption were true, the light rays, upon entering our atmosphere, 
would be "swept" along, and no aberration would take place. 

The absolute frame of reference. All these arguments suggested the 
independence of the electromagnetic laws from the motion of either the 
source of radiation or the transmitting medium. The other alternative, 
it appeared, was to give up the principle of relativity and to assume that
there existed a universal frame of reference with respect to which the
speed of light was independent of the direction of propagation. As
mentioned before, the equations of the electromagnetic field would have
taken their standard form with respect to that frame. As the accelerations
of charged particles are proportional to the field, this frame could
be expected to be an inertial system, so that the accelerations would tend
to zero with the field.

The experiment of Michelson and Morley. On the basis of this
assumption, Michelson and Morley devised an experiment which was
designed to determine the motion of the earth with respect to the
privileged frame of reference in which the speed of light was to be
uniform. The essential idea of their experiment was to compare
the apparent speed of light in two different directions.

Before studying their experimental set-up, let us discuss the expected
results from the standpoint of this new hypothesis. The earth
itself cannot be the privileged frame of reference with respect to
which the equations of Maxwell hold, for it is continually subject
to the gravitational action of the sun; and in a frame of reference
connected with the center of gravitation of our solar system, the
velocity of the earth is of the order of $3 \times 10^{6}$ cm sec$^{-1}$. It
changes, therefore, about $6 \times 10^{6}$ cm sec$^{-1}$ in the course of one
half-year relatively to a frame of reference which approximates an inertial
system better than the earth does. Therefore, even if the earth
could at any time be identified with the state of motion of the privileged
frame of reference, it would have a speed of $6 \times 10^{6}$ cm sec$^{-1}$
half a year later relative to the privileged system.

In any case, it has a speed of at least $3 \times 10^{6}$ cm sec$^{-1}$ relative to any
inertial system through 6 months of the year. The speed of light is
about $3 \times 10^{10}$ cm sec$^{-1}$. If it is possible to compare the speed of
light in two orthogonal directions with a relative accuracy better than
$10^{-4}$, and if the experiments are carried out over a period exceeding 6
months, the effects of the motion of the earth would become noticeable.

Fig. 3. Explanation of aberration.

We proceed now to a description of Michelson and Morley's experiment
(Fig. 4). Light from a terrestrial source L is separated into two

parts by a thinly silvered glass plate P. At nearly equal distances
from P, and at right angles to each other, two mirrors $S_1$ and $S_2$ are
placed which reflect the light back to P. There, a part of each of the
two rays reflected by $S_1$ and $S_2$, respectively, are reunited and are
observed through a telescope F. Since the light emanating from L has
travelled almost equal distances, $L - P - S_1 - P - F$ and
$L - P - S_2 - P - F$, respectively, interference fringes are observed,
and their exact location depends on the difference between the distances
$l_1$ and $l_2$.

Fig. 4. The Michelson-Morley apparatus.

So far, we have assumed that the speed of light is the same in all
directions. If this assumption is dropped, the position of the interference
fringes in F will also depend on the difference in the speeds along
$l_1$ and $l_2$. Let us assume that the earth, and with it the apparatus, is
moving, relatively to the "absolute" frame of reference, along the direction
of $l_1$ at the rate of speed v. With respect to the apparatus, the
speed of light along the path $P - S_1$ equals (c - v), and along the path
$S_1 - P$ it is (c + v). The time required to travel the path $P - S_1 - P$
will be

$$t_1 = \frac{l_1}{c - v} + \frac{l_1}{c + v} = \frac{2l_1/c}{1 - v^2/c^2}.$$

The relative speed of the light travelling along the path $P - S_2 - P$
will also be modified. While the light travels from P to $S_2$, the whole
apparatus is moving sideways a distance δ,

$$\delta = \frac{v}{c}\,s,$$

and the actual distance travelled by the light is

$$s = \sqrt{l_2^2 + \delta^2} = \frac{l_2}{\sqrt{1 - v^2/c^2}}.$$

On the way back, the light has to travel an equal distance. The total
time required by the light for the path $P - S_2 - P$ is, therefore,

$$t_2 = \frac{2l_2/c}{\sqrt{1 - v^2/c^2}}.$$

After the apparatus has been swung 90° about its axis, the times
required to travel the paths $P - S_1 - P$ and $P - S_2 - P$ are, respectively,

$$\bar{t}_1 = \frac{2l_1/c}{\sqrt{1 - v^2/c^2}}, \qquad
\bar{t}_2 = \frac{2l_2/c}{1 - v^2/c^2}.$$

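The time differences between the two alternative paths are, therefore,
before and after the apparatus has been swung around,

$$\begin{aligned}
\Delta t &= t_1 - t_2 = \frac{2/c}{\sqrt{1 - v^2/c^2}}\left(\frac{l_1}{\sqrt{1 - v^2/c^2}} - l_2\right),\\
\overline{\Delta t} &= \bar{t}_1 - \bar{t}_2 = \frac{2/c}{\sqrt{1 - v^2/c^2}}\left(l_1 - \frac{l_2}{\sqrt{1 - v^2/c^2}}\right).
\end{aligned}$$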


The change in Δt which is brought about by rotating the apparatus is,
therefore,

$$\Delta t - \overline{\Delta t} = \frac{2/c}{\sqrt{1 - v^2/c^2}}\,(l_1 + l_2)\left(\frac{1}{\sqrt{1 - v^2/c^2}} - 1\right). \tag{3.13}$$

As $(v/c)^2$ is of the order of magnitude of $10^{-8}$, we shall expand the
right-hand side of eq. (3.13) into a power series in $(v/c)^2$ and consider only the
first nonvanishing term. We obtain the approximate expression

$$\Delta t - \overline{\Delta t} \approx \frac{1}{c}\,(l_1 + l_2)\,\frac{v^2}{c^2}. \tag{3.14}$$


We should expect the interference fringes in the telescope to shift
because of this change in the time difference Δt. The amount of this
shift, expressed in terms of the width of one fringe, would be equal to
$(\Delta t - \overline{\Delta t})$ divided by the time of one period of oscillation, $1/\nu$,

$$\nu\,(\Delta t - \overline{\Delta t}) = \frac{\nu}{c}\,(l_1 + l_2)\,\frac{v^2}{c^2}. \tag{3.15}$$

v is the velocity of the earth relative to the "absolute" frame of
reference, presumably at least of the order $3 \times 10^{6}$ cm sec$^{-1}$; $(v/c)^2$ is,
therefore, of the order $10^{-8}$. $\nu/c$ is the wave number, and for visible light,
about $2 \times 10^{4}$ cm$^{-1}$. We have, therefore,

$$\nu\,(\Delta t - \overline{\Delta t}) \sim \frac{l_1 + l_2}{5 \times 10^{3}\ \mathrm{cm}}. \tag{3.15a}$$

By using multiple reflection, Michelson and Morley were able to work
with effective lengths l_1 and l_2 of several meters. Any effect should
have been clearly observable after all the usual sources of error, such as
stresses, temperature effects, and so forth, had been eliminated.
Nevertheless, no effect was observed.
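
The size of the expected shift can be checked numerically. The following is a minimal sketch, not part of the original argument: it evaluates the exact change in the time difference and its first nonvanishing term in (v/c)^2, and divides each by the period 1/ν. The function names and the arm lengths of 5 m each are assumptions chosen only for illustration; the values of v and ν/c are the rounded ones quoted in the text.

```python
import math

C = 3.0e10  # speed of light in cm/sec (rounded)

def fringe_shift_exact(l1, l2, v, wave_number, c=C):
    """Exact change in the time difference, divided by the period 1/nu."""
    root = math.sqrt(1.0 - (v / c) ** 2)
    delta = (2.0 / c) * (l1 + l2) / root * (1.0 - 1.0 / root)
    nu = wave_number * c  # frequency, since nu/c is the wave number
    return delta * nu

def fringe_shift_approx(l1, l2, v, wave_number, c=C):
    """First nonvanishing term of the expansion in (v/c)^2."""
    return -wave_number * (l1 + l2) * (v / c) ** 2

# Illustrative values: two arms of 5 m each, v = 3e6 cm/sec, nu/c = 2e4 per cm.
exact = fringe_shift_exact(500.0, 500.0, 3.0e6, 2.0e4)
approx = fringe_shift_approx(500.0, 500.0, 3.0e6, 2.0e4)
print(exact, approx)  # both close to -(l1 + l2) / (5e3 cm) = -0.2
```

With these assumed lengths the expected shift is about one fifth of a fringe width, which is the order of magnitude the experiment was designed to detect.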

An impasse was at hand: No consistent theory would agree with the
results of Fizeau's experiment, the Michelson-Morley experiment, and
the effect of aberration. A great number of additional experiments were
performed along similar lines. Their discussion can be omitted here,
because they did not change the situation materially. What was needed
was not more experiments, but some new theory which would explain
the apparent contradictions.

The ether hypothesis. Before launching into an explanation of that 
new theory, the theory of relativity, we mention a hypothesis which
today has only historical significance. Physicists had been accustomed
to think largely in terms of mechanics. When Faraday, Maxwell, and 
Hertz created the first field theory, it was only natural that attempts 
were made by many physicists to explain the new fields in terms of 
mechanical concepts. Maxwell and Hertz themselves contributed to 
these efforts. Within the realm of mechanics itself, there existed a 
branch which used concepts and methods resembling those of field 
physics, namely, the mechanics of continuous media. So the electro- 
magnetic fields were explained as the stresses of a hypothetical material 
medium, the so-called ether. 

There are many reasons why this interpretation of the electromagnetic
field finally had to be abandoned. Among them are: the ether
would have to be endowed with properties not shared by any known
medium; it would have to penetrate all matter without exhibiting any
frictional resistance; and it would have no mass and would not be
affected by gravitation. Also, Maxwell's equations are different in
many ways from the equations to which elastic waves are subject.
There exists, for instance, no analogy in electrodynamics to the
"longitudinal" elastic waves.

At the end of the nineteenth century, however, the ether was regarded
as a most promising and even necessary hypothesis. Naturally, attempts
were made to apply this concept to the problem discussed in this
chapter, namely, to find that coordinate system in which the speed of
light is equal to c in all directions. The idea of the ether suggested that
it might be the coordinate system in which the ether is at rest. That
theory, however, does little to solve the fundamental difficulty. All
that it does is reword the problem; for, in order to find out what the state
of motion of the ether really is, we would have no other means than to
measure the speed of light. The outcome of the Michelson-Morley
experiment would, therefore, suggest that the ether is dragged along
with the earth, as far as the immediate neighborhood of the earth is
concerned. The motion of small masses, such as in Fizeau's experiment,
would carry the ether along, but not completely. But these hypotheses
could not account for aberration. The existence of the aberration effect
would be consistent with an ether hypothesis only if the earth could
glide through the ether without carrying it along, even right on its
surface, where our telescope picks up the light.


The Lorentz Transformation 

Several decades of experimental research showed that there was no 
way of determining the state of motion of the earth through the "ether." 
All the evidence seemed to point toward the existence of a "relativity 
principle" in optics and electrodynamics, even though the Galilean
transformation equations ruled that out. 

Nevertheless, Fitzgerald and especially H. A. Lorentz tried to preserve
the traditional transformation equations and still account theoretically
for the experimental results. Lorentz was able to show that
the motion of a frame of reference through the ether with a velocity v
would produce only "second-order effects"; that is, all observable
deviations from the laws which were valid with respect to the frame
connected with the ether itself would be proportional not to v/c, but
to (v/c)^2.

One of these expected second-order effects was that, in a system
moving relatively to the ether, a light ray would take longer to go out
and back over a fixed distance parallel to the direction of the motion
than over an equal distance perpendicular to the motion. The
Michelson-Morley experiment was designed to measure that effect. In order
to explain the negative outcome of the experiment, Fitzgerald and
Lorentz assumed that scales and other "rigid" bodies moving through
the ether contracted in the direction of the motion just sufficiently to
offset this effect. This hypothesis preserved fully the privileged
character of one frame of reference (the ether). The negative result of the
Michelson-Morley experiment was not explained by the existence of an
"optical relativity principle," but was attributed to an unfortunate
combination of effects which made it impossible to determine experimentally
the motion of the earth through the ether.
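
How the assumed contraction offsets the effect can be seen in a short numerical sketch. It uses the ether-frame reasoning of the preceding chapter for the two round-trip times and assumes that the contraction factor is \sqrt{1 - v^2/c^2}. The function names and the values c = 1, v = 0.6, l = 1 are illustrative assumptions, not part of the original text.

```python
import math

def round_trip_parallel(l, v, c):
    """Out-and-back time over a length l parallel to the motion."""
    return l / (c - v) + l / (c + v)

def round_trip_perpendicular(l, v, c):
    """Out-and-back time over a length l perpendicular to the motion."""
    return 2.0 * l / (c * math.sqrt(1.0 - (v / c) ** 2))

c, v, l = 1.0, 0.6, 1.0  # illustrative units with c = 1

# Without contraction the parallel trip takes longer.
t_parallel = round_trip_parallel(l, v, c)
t_perpendicular = round_trip_perpendicular(l, v, c)

# With the parallel arm shortened by sqrt(1 - v^2/c^2) the two times agree.
contracted = l * math.sqrt(1.0 - (v / c) ** 2)
t_parallel_contracted = round_trip_parallel(contracted, v, c)

print(t_parallel, t_perpendicular, t_parallel_contracted)
```

The speed v = 0.6c is chosen only to make the difference visible; for the earth the same cancellation occurs at the level of 10^-8.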

Einstein, on the contrary, accepted the experiments as conclusive evi- 
dence that the relativity principle was valid in the field of electro- 
dynamics as well as in mechanics. Therefore, his efforts were directed 
toward an analysis and modification of the Galilean transformation 
equations so that they would become compatible with the relativity 
principle in optics. We shall now retrace this analysis in order to derive 
the new transformation laws. 


Chap. IV ] 



In writing down transformation equations, we always made two
assumptions, although we did not always stress them: that there exists
a universal time t which is defined independently of the coordinate
system or frame of reference, and that the distance between two
simultaneous events is an invariant quantity, the value of which is
independent of the coordinate system used.

The relative character of simultaneity. Let us take up the first
assumption. As soon as we set out to define a universal time, we are
confronted with the necessity of defining simultaneity. We can compare
and adjust time-measuring devices in a unique way only if the
statement "The two events A and B occurred simultaneously" can be
given a meaning independent of a frame of reference. That this can be
done is one of the most important assumptions of classical physics; and
this assumption has become so much a part of our way of thinking that
almost everyone has great difficulty in analyzing its factual basis.

To examine this hypothesis, we must devise an experimental test
which will decide whether two events occur simultaneously. Without
such an experiment (which can be performed, at least in principle), the
statement "The two events A and B occurred simultaneously" is devoid of
physical significance.

When two events occur close together in space, we can set up a 
mechanism somewhat like the coincidence counters used in the investi- 
gation of cosmic rays. This mechanism will react only if the two events 
occur simultaneously. 

If the two events occur a considerable distance apart, the coincidence
apparatus is not adequate. In such a case, signals have to transmit
the knowledge that each event has occurred to some location where
the coincidence apparatus has been set up. If we had a method of
transmitting signals with infinite velocity, no great complication would
arise. By "infinite velocity" we mean that the signal transmitted
from a point P_1 to another point P_2 and then back to P_1 would return
to P_1 at the same time as it started from there.

Unfortunately, no signal with this property is known. All actual
signals take a finite time to travel out and back to the point of origin,
and this time increases with the distance traversed. In choosing the
type of signal, we should naturally favor a signal where the speed of
transmission depends on as few factors as possible. Electromagnetic
waves are most suitable, because their transmission does not require
the presence of a material medium, and because their speed in empty
space does not depend on their direction, their wave length, or their
intensity. As the recording device, we can use a coincidence circuit
with two photon counters.




To account for the finite time lost in transmission, we set up our 
apparatus at the midpoint of the straight line connecting the sites of 
the two events A and B. Each event, as it occurs, emits a light signal, 
and we shall call the events simultaneous if the two light signals arrive 
simultaneously at the midpoint. This experiment has been designed 
to determine the simultaneity of two events without the use of specific 
time-measuring devices. It is assumed that simultaneity as defined by 
this experiment is "transitive," meaning that if two events A and B 
occur simultaneously (by our definition), and if the two events A and C
also occur simultaneously, then B and C are simultaneous. It must 
be understood that this assumption is a hypothesis concerning the 
behavior of electromagnetic signals. 

Granted that this hypothesis is correct, we still have no assurance 
that our definition of simultaneity is independent of the frame of refer- 
ence to which we refer our description of nature. Locating two events 
and constructing the point midway on the connecting straight line 
necessarily involves a particular frame and its state of motion. 

Is our definition invariant with respect to the transition from one
frame to another frame in a different state of motion? To answer this
question, we shall consider two frames of reference: one connected
with the earth (S), the other with a very long train (S*) moving along
a straight track at a constant rate of speed. We shall have two
observers, one stationed on the ground alongside the railroad track, the
other riding on the train. Each of the two observers is equipped with
a recording device of the type described and a measuring rod. Their
measuring rods need not be the same length; it is sufficient that each
observer be able to determine the point midway between two points
belonging to his reference body, ground or train.

Let us assume now that two thunderbolts strike, each hitting the
train as well as the ground and leaving permanent marks. Also suppose
that each observer finds afterwards that his recording apparatus
was stationed exactly midway between the marks left on his reference
body. In Fig. 5, the marks are denoted by A, B, A*, and B*, and the
coincidence apparatus by C and C*. Is it possible that the light signals
issuing from A, A* and from B, B* arrive simultaneously at C and
also simultaneously at C*?

At the instant that the thunderbolt strikes at A and A*, these two
points coincide. The same is true of B and B*. If eventually it turns
out that the two bolts struck simultaneously as observed by the ground
observer, then C* must coincide with C at the same time that A coincides
with A* and B with B* (that is, when the two thunderbolts strike).¹ It is
understood that all these simultaneities are defined with respect to the
frame S.

Because of the finite time needed by the light signals to reach C and
C*, C* travels to the left (Fig. 5, stages b, c, d). The signal issuing
from A, A* reaches C, therefore, only after passing C* (stages b, c);
while the light signal from B, B* reaches C before it gets to C* (stages
c, d). As a result, the train observer finds that the signal from A, A*
reaches his coincidence apparatus sooner than the signal from B, B*
(stages b, d).



Fig. 5. The two events occurring at A, A* and at B, B*, respectively, appear
simultaneous to an observer at rest relative to the ground (S), but not to an
observer who is at rest relative to the train (S*). At (a) the two events occur,
(b) the light signal from A, A* arrives at C*, (c) the light signals from both
events arrive at C, and (d) the light signal from B, B* arrives at C*.
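
The order of the stages in Fig. 5 can be sketched numerically, working entirely in the ground frame S. The coordinates are assumptions made for illustration: C sits at x = 0, the bolts strike at t = 0 at A (x = -L) and B (x = +L), and C* coincides with C at t = 0 and then moves toward negative x with speed v. The function name is hypothetical.

```python
def arrival_times(L, v, c):
    """Ground-frame times of stages (b), (c), (d) of Fig. 5.

    Assumed setup: the signal from A follows x = -L + c t, the signal
    from B follows x = L - c t, and C* follows x = -v t.
    """
    t_b = L / (c + v)  # stage (b): signal from A, A* meets C*
    t_c = L / c        # stage (c): both signals reach C
    t_d = L / (c - v)  # stage (d): signal from B, B* meets C*
    return t_b, t_c, t_d

t_b, t_c, t_d = arrival_times(L=1.0, v=0.5, c=1.0)
print(t_b, t_c, t_d)
```

For any v between 0 and c the three times are strictly ordered, so the signals that coincide at C cannot also coincide at C*.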

This does not imply that the ground has a property not possessed
by the train. It is possible for the thunderbolts to strike so that the
light signals reach C* simultaneously. But then the signal from A, A*
will arrive at C after the signal from B, B*. In any case, it is impossible
for both recording instruments, at C and at C*, to indicate that
the two thunderbolts struck simultaneously.

1 Otherwise, the distances A*C* and B*C* would not appear equal from the
point of view of the ground observer; we shall explain later why we do not assume
anything of this kind.




We conclude, therefore, that two events which are simultaneous with 
respect to one frame of reference are in general not simultaneous with 
respect to another frame. 

The length of scales. Our conclusion affects the evaluation of length
measurements. We have assumed that the ground observer and the
train observer are able to carry out length measurements in their
respective frames of reference. Two rods which are at rest relatively to
the same frame of reference are considered equal in length if they can
be placed alongside each other so that their respective end points E,
E* and F, F* coincide. Two distances which are marked off on two
different reference bodies moving relatively to each other can be
compared by the same method, provided these distances are parallel to
each other and perpendicular to the direction of the relative motion.
However, if the two distances are parallel to the direction of relative
motion, and if they are travelling along the same straight line, their
respective end points will certainly coincide at certain times. The two
distances EF and E*F* are considered equal if the two coincidences
occur simultaneously. But whether they occur simultaneously
depends on the frame of reference of the observer. Thus, in the case of
the thunderbolts, the two distances AB and A*B* appear equal to the
ground observer; the train observer, on the other hand, finds that A
coincides with A* before B coincides with B*, and concludes that A*B*
is longer than AB. In other words, not only the simultaneity of events,
but also the result of length measurements, depends on the frame of
reference.
The rate of clocks. The frame of reference of the observer also
determines whether two clocks at a considerable distance from each other
agree (that is, whether their hands assume equivalent positions
simultaneously). Moreover, if the two clocks are in different states of
motion, we cannot even compare their rates independently of the frame of
reference. To illustrate this, let us consider two clocks D and D*, one
stationed alongside the track and the other on the train. Let us assume
that the two clocks happen to agree at the moment when D* passes D.
We can say that D* and D go at the same rate if they continue to agree.

2 Our definition of simultaneity is, of course, to a certain degree arbitrary.
However, it is impossible to devise an experiment by means of which simultaneity
could be defined independently of a frame of reference. From the outcome of
the Michelson-Morley experiment, we conclude that the law of propagation of
light takes the same form in all inertial systems. Had the outcome of the
Michelson-Morley experiment been positive, in other words, if it were possible
to determine the state of motion of the "ether," we should naturally have based
our definition of simultaneity on the frame of reference connected with the ether,
and thereby have given it absolute significance.



But after a while, D* and D will be a considerable distance apart; and,
as we know from our earlier considerations, their hands cannot assume
equivalent positions simultaneously from the points of view both of
the ground observer and of the train observer.

The Lorentz transformation. The above considerations help us to
remove the apparent contradiction between the law of the propagation
of electromagnetic waves and the principle of relativity. If it is
impossible to define a universal time, and if the length of rigid rods cannot
be defined independently of the frame of reference, it is quite conceivable
that the speed of light is actually the same with respect to different
frames of reference which are moving relatively to each other. We
are now in a position to show that the classical transformations
connecting two inertial systems (Galilean transformation equations) can be
replaced by new equations which are not based on the assumptions of
a universal time and the invariant length of scales, but which assume
at the outset the invariant character of the speed of light.

In the derivation of these new transformation equations, we shall
accept the principle of relativity as fundamental; that is, the
transformation equations must contain nothing which would give one of the two
coordinate systems a preferred position as compared with the other
system. In addition, we shall assume that the transformation equations
preserve the homogeneity of space; all points in space and time
shall be equivalent from the point of view of the transformation. The
equations must, therefore, be linear transformation equations. This is
why we considered the two distances A*C* and B*C* equal in terms of
S-coordinates as well as in S*-coordinates (see page 31).

Let us consider two inertial coordinate systems, S and S*. S* moves
relatively to S at the constant rate v along the X-axis; at the S-time
t = 0, the points of origin of S and S* coincide. The X*-axis is parallel
to the X-axis and, in fact, coincides with it. Points which are at rest
relative to S* will move with speed v relative to S in the X-direction.
The first of our transformation equations will, thus, take the form

    x^* = \alpha (x - vt),                              (4.1)

where α is a constant to be determined later.

It is not quite obvious that a straight line which is perpendicular to
the X-axis should also be perpendicular to the X*-axis (the angles to
be measured by observers in S and S*, respectively). But if we did
not assume that it was, the left-right symmetry with respect to the
X-axis would be destroyed by the transformation. For similar reasons,
we shall assume that the Y- and the Z-axes are orthogonal to each
other, as observed from either system, and that the same is true of the
Y*- and Z*-axes.

As mentioned before, we can compare the lengths of rods in different
states of motion in an invariant manner if they are parallel to each
other and orthogonal to the direction of relative motion. If their
respective end points coincide, it follows from the principle of relativity
that they are the same length. Otherwise, the relationship between S
and S* would not be reciprocal.

On the basis of this, we can formulate two further transformation
equations,

    y^* = y,
    z^* = z.                                            (4.2)

To complete this set of equations, we have to formulate an equation
connecting t*, the time measured in S*, with the time and space
coordinates of S. t* must depend on t, x, y, and z linearly, because of what
we have called the "homogeneity" of space and time. For reasons of
symmetry, we assume further that t* does not depend on y and z.
Otherwise, two S*-clocks in the Y*Z*-plane would appear to disagree
as observed from S. Choosing the point of time origin so that the
inhomogeneous (constant) term in the transformation equation vanishes,
we have

    t^* = \beta t + \gamma x.                           (4.3)

Finally, we must evaluate the constants α of eq. (4.1) and β and γ of
eq. (4.3). We shall find that they are determined by the two conditions
that the speed of light be the same with respect to S and S*,
and that the new transformation equations go over into the classical
equations when v is small compared with the speed of light, c.

Let us assume that at the time t = 0 an electromagnetic spherical
wave leaves the point of origin of S, which coincides at that moment
with the point of origin of S*. The speed of propagation of the wave
is the same in all directions and equal to c in terms of either set of
coordinates. Its progress is therefore described by either of the two
equations

    x^2 + y^2 + z^2 = c^2 t^2,                          (4.4)
    x^{*2} + y^{*2} + z^{*2} = c^2 t^{*2}.              (4.5)

By applying eqs. (4.1), (4.2), and (4.3), we can replace the starred
quantities in eq. (4.5) completely by unstarred quantities,

    c^2 (\beta t + \gamma x)^2 = \alpha^2 (x - vt)^2 + y^2 + z^2.   (4.6)



By rearranging the terms, we obtain

    (c^2 \beta^2 - v^2 \alpha^2) t^2 = (\alpha^2 - c^2 \gamma^2) x^2 + y^2 + z^2 - 2 (v \alpha^2 + c^2 \beta \gamma) x t.   (4.7)

This equation goes over into eq. (4.4) only if the coefficients of t^2 and
x^2 are the same in eq. (4.7) as in eq. (4.4), and if the coefficient of xt
in eq. (4.7) vanishes. Therefore,

    c^2 \beta^2 - v^2 \alpha^2 = c^2,
    \alpha^2 - c^2 \gamma^2 = 1,
    v \alpha^2 + c^2 \beta \gamma = 0.                  (4.8)

We solve these three equations for the three unknowns α, β, and γ by
first eliminating α^2. We obtain the equations

    c^2 \beta (\beta + v \gamma) = c^2,
    c^2 \gamma (\beta + v \gamma) = -v.

Then we eliminate γ and obtain for β^2 the expression

    \beta^2 = \frac{1}{1 - v^2/c^2}.                    (4.10)




β is not equal to unity, as it is in the classical transformation theory.
But by choosing the positive root of (4.10), we can make it nearly equal
to unity for small values of v/c; its deviation from unity is of the second
order. γ is given by the equation

    \gamma = \frac{1 - \beta^2}{v \beta} = -\frac{v}{c^2} \beta,   (4.11)

and finally, α is obtained from the equation

    \alpha^2 = -\frac{c^2 \beta \gamma}{v} = \beta^2.   (4.12)

Again we choose the positive sign of the root.
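
As a check on the algebra, the following minimal sketch evaluates the three constants for a given v and confirms numerically that they satisfy the three conditions (4.8). It assumes the closed forms β = 1/\sqrt{1 - v^2/c^2}, γ = -(v/c^2) β, and α = β; the function names and the choice c = 1, v = 0.6 are illustrative assumptions.

```python
import math

def lorentz_constants(v, c):
    """Positive-root solution for alpha, beta, gamma (assumed closed forms)."""
    beta = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    gamma = -(v / c**2) * beta
    alpha = beta
    return alpha, beta, gamma

def residuals_4_8(alpha, beta, gamma, v, c):
    """Left side minus right side of each condition in (4.8)."""
    return (
        c**2 * beta**2 - v**2 * alpha**2 - c**2,
        alpha**2 - c**2 * gamma**2 - 1.0,
        v * alpha**2 + c**2 * beta * gamma,
    )

v, c = 0.6, 1.0
alpha, beta, gamma = lorentz_constants(v, c)
residuals = residuals_4_8(alpha, beta, gamma, v, c)
print(alpha, beta, gamma, residuals)
```

All three residuals vanish to rounding error, and for small v/c the value of β differs from unity only by a term of order (v/c)^2.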

By substituting all these values into eqs. (4.1), (4.3), we get the new
transformation equations,

    x^* = \frac{x - vt}{\sqrt{1 - v^2/c^2}},
    y^* = y,
    z^* = z,
    t^* = \frac{t - (v/c^2) x}{\sqrt{1 - v^2/c^2}}.     (4.13)

These equations are the so-called Lorentz transformation equations. For
small values of v/c, they are approximated by the Galilean transformation
equations,

    x^* = x - vt,
    y^* = y,
    z^* = z,
    t^* = t.

The deviations are all of the second order in v/c (or x/ct). We can
therefore test the Lorentz transformation equations experimentally only
if we are able to increase (v/c)^2 beyond the probable experimental error.
Michelson and Morley, in their famous experiment, were able to increase
the accuracy to such an extent that they could measure a second-order
effect and prove experimentally the inadequacy of the Galilean
transformation equations.

When we solve the equations (4.13) with respect to x, y, z, and t, we
get

    x = \frac{x^* + vt^*}{\sqrt{1 - v^2/c^2}},
    y = y^*,
    z = z^*,
    t = \frac{t^* + (v/c^2) x^*}{\sqrt{1 - v^2/c^2}}.   (4.15)

Comparing eqs. (4.15) with eqs. (4.13), we conclude that S has the
relative velocity (-v) with respect to S*. This is not a trivial conclusion,
for neither the unit length nor the unit time is directly comparable
in S and S*.
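
The relation between the transformation and its inverse can be illustrated with a short sketch. It writes the Lorentz transformation as x* = (x - vt)/\sqrt{1 - v^2/c^2}, t* = (t - vx/c^2)/\sqrt{1 - v^2/c^2}, with y and z unchanged, obtains the inverse by replacing v with -v, and compares the x*-coordinate with the Galilean value for a small velocity. The function names, the sample event, and the units with c = 1 are assumptions made for illustration.

```python
import math

def lorentz(x, y, z, t, v, c=1.0):
    """Coordinates in S* of an event given in S."""
    root = math.sqrt(1.0 - (v / c) ** 2)
    return ((x - v * t) / root, y, z, (t - v * x / c**2) / root)

def lorentz_inverse(xs, ys, zs, ts, v, c=1.0):
    """Coordinates in S of an event given in S*: same formulas with -v."""
    return lorentz(xs, ys, zs, ts, -v, c)

def galilean(x, y, z, t, v):
    """Classical transformation for comparison."""
    return (x - v * t, y, z, t)

event = (2.0, 1.0, -3.0, 5.0)

# Transforming to S* and back recovers the original event.
back = lorentz_inverse(*lorentz(*event, 0.6), 0.6)

# For v/c = 1e-4 the x*-coordinate agrees with the Galilean value
# up to a relative correction of order (v/c)^2.
slow = lorentz(*event, 1.0e-4)
classical = galilean(*event, 1.0e-4)
print(back, slow[0] - classical[0])
```

Writing the inverse as the same function with the sign of v reversed makes the reciprocity of the two systems explicit.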

The velocity of a light signal emanating from any point at any time 
is equal to c with respect to any one system if it is equal to c in the 
other system, for the coordinate and time differences of two events 
transform exactly like x, y, z, and t themselves. 

The Lorentz transformation equations do away with the classical
notions regarding space and time. They extend the validity of the
relativity principle to the law of propagation of light.

So far, we have fashioned our transformation theory to fit the outcome
of the Michelson-Morley experiment. How does this new theory
account for aberration? We have to compare the direction of the incoming
light with respect to two frames of reference, that of the sun and
that of the earth. The amount of aberration depends on the angle
between the incoming light and the relative motion of these two frames
of reference. We shall call that angle α. Both coordinate systems are
to be arranged so that their relative motion is along their common
X-axis, and that the path of the light ray lies entirely within the
XY-plane. With respect to the sun, the path of the light ray is given by

    x = ct \cos\alpha,
    y = ct \sin\alpha.                                  (4.16)


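With respect to the moving earth, we find the equations of motion by
applying the inverted equations of the Lorentz transformation, (4.15).
Eq. (4.16) takes the form

    x^* + vt^* = c \left( t^* + \frac{v}{c^2} x^* \right) \cos\alpha,
    y^* \sqrt{1 - v^2/c^2} = c \left( t^* + \frac{v}{c^2} x^* \right) \sin\alpha.

By solving these equations with respect to x* and y*, we get

    x^* = ct^* \frac{\cos\alpha - v/c}{1 - (v/c) \cos\alpha} = ct^* \cos\alpha^*,
    y^* = ct^* \frac{\sqrt{1 - v^2/c^2} \, \sin\alpha}{1 - (v/c) \cos\alpha} = ct^* \sin\alpha^*.

The cotangent of the new direction is

    \operatorname{ctg} \alpha^* = \frac{\operatorname{ctg} \alpha - (v/c) \operatorname{cosec} \alpha}{\sqrt{1 - v^2/c^2}}.   (4.19)

According to the classical explanation given on page 21, the angle
would turn out to be

    \operatorname{ctg} \alpha^* = \operatorname{ctg} \alpha - (v/c) \operatorname{cosec} \alpha.   (4.20)

If we wish to compare eq. (4.19) with eq. (4.20), we have to keep in
mind that v/c is a small quantity (about 10^-4). Therefore, we expand
both formulas into power series with respect to v/c. We get

    \operatorname{ctg} \alpha^*_{rel} = \operatorname{ctg} \alpha - (v/c) \operatorname{cosec} \alpha + \tfrac{1}{2} (v/c)^2 \operatorname{ctg} \alpha + \cdots,   (4.19a)

    \operatorname{ctg} \alpha^*_{class} = \operatorname{ctg} \alpha - (v/c) \operatorname{cosec} \alpha.   (4.20a)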

The observed effect is the first-order effect, while the relativistic
second-order effect is far below the attainable accuracy of observation. The
relativistic equation (4.19) is, therefore, in agreement with the observed
facts.
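
The relative size of the two terms can be illustrated numerically. The sketch below assumes the relativistic cotangent is the classical expression ctg α - (v/c) cosec α divided by \sqrt{1 - v^2/c^2}, takes v/c = 10^-4 as in the text, and uses an arbitrary angle of 60°. The function names and the angle are illustrative assumptions.

```python
import math

def ctg_relativistic(alpha, b):
    """Relativistic cotangent of the new direction, with b = v/c."""
    return (1.0 / math.tan(alpha) - b / math.sin(alpha)) / math.sqrt(1.0 - b * b)

def ctg_classical(alpha, b):
    """Classical cotangent of the new direction, with b = v/c."""
    return 1.0 / math.tan(alpha) - b / math.sin(alpha)

alpha = math.radians(60.0)  # illustrative angle
b = 1.0e-4                  # v/c for the earth, as quoted in the text

first_order = -b / math.sin(alpha)
second_order = ctg_relativistic(alpha, b) - ctg_classical(alpha, b)
print(first_order, second_order)
```

The second-order difference is smaller than the first-order term by a factor of the order of v/c, in line with the statement that it lies below the accuracy of observation.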




We can explain Fizeau's experiment by connecting the coordinate
system S with the earth and S* with the flowing liquid. With respect
to S*, the liquid is at rest, and the equation of the light rays must be
of the form

    x^* = \frac{c}{n} (t^* - t_0^*).                    (4.21)

Applying the Lorentz transformation equations (4.13), we obtain

    x - vt = \frac{c}{n} \left[ \left( t - \frac{v}{c^2} x \right) - t_0^* \sqrt{1 - v^2/c^2} \right].   (4.22)

We obtain the velocity of the light ray with respect to S by solving this
equation with respect to x.

    x = \frac{c/n + v}{1 + v/(nc)} \, t + \text{const}.

Again the observable first-order effect is in agreement with the
experiment.
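
The first-order content of this result can be made explicit with a short sketch. It takes the coefficient of t in the solved equation, (c/n + v)/(1 + v/(nc)), and compares it with its expansion to first order in v, which is c/n + v(1 - 1/n^2). The function names and the values n = 1.33, v = 700 cm/sec are assumptions chosen only for illustration.

```python
def speed_in_moving_liquid(n, v, c):
    """Speed of the light ray relative to S: coefficient of t after solving for x."""
    return (c / n + v) / (1.0 + v / (n * c))

def speed_first_order(n, v, c):
    """Expansion of the same expression to first order in v."""
    return c / n + v * (1.0 - 1.0 / n**2)

c = 3.0e10  # cm/sec (rounded)
n = 1.33    # illustrative refractive index
v = 700.0   # illustrative flow speed in cm/sec

exact = speed_in_moving_liquid(n, v, c)
approx = speed_first_order(n, v, c)
print(exact - c / n, approx - c / n)
```

Both differences are close to v(1 - 1/n^2), so the light is carried along by the liquid, but not completely.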

The "kinematic" effects of the Lorentz transformation. We shall
now study in more detail the effect of the Lorentz equations on length
and time measurements in different frames of reference.

Let us consider a clock that is rigidly connected with the starred
frame of reference, stationed at some point (x_0*, y_0*, z_0*). Let us
compare the time indicated by that clock with the time t measured in the
unstarred system. According to eq. (4.15), the unstarred time
coordinate of the clock is given by

    t = \frac{(v/c^2) x_0^* + t^*}{\sqrt{1 - v^2/c^2}}.

An S-time interval, (t_2 - t_1), is therefore related to the readings t_1*
and t_2* of the clock as follows:

    t_2 - t_1 = \frac{t_2^* - t_1^*}{\sqrt{1 - v^2/c^2}}.

Thus, the rate of the clock appears slowed down, from the point of view
of S, by the factor \sqrt{1 - v^2/c^2}. But not only that. Observed from
the unstarred frame of reference, different S*-clocks go at the same
rate, but with a phase constant depending on their position. The
farther away an S*-clock is stationed from the point of origin along
the positive X*-axis, the slower it appears to be. Two events that occur
simultaneously with respect to S are not in general simultaneous with
respect to S*, and vice versa.

We can reverse our setup and compare an S-clock with S*-time.
The clock may be located at the point (x_1, y_1, z_1), and the starred time
is connected with the time indicated by the S-clock through the
equation

    t^* = \frac{t - (v/c^2) x_1}{\sqrt{1 - v^2/c^2}}.

Again the readings of the S-clock are related to an S*-time interval
as follows:

    t_2^* - t_1^* = \frac{t_2 - t_1}{\sqrt{1 - v^2/c^2}}.

It appears that the S-clock is slowed down, measured in terms of S*-time,
and that it is ahead of an S-clock placed at the origin, if its own
x-coordinate is positive.

How is it that an observer connected with either frame of reference
finds the rate of the clocks in the other system slow? To measure the
rate of a clock T which is not at rest relatively to his frame of reference,
an observer compares it with all the clocks in his system which T
passes in the course of time. That is to say, an S-observer compares
one S*-clock with a succession of S-clocks, while an S*-observer
compares one S-clock with several S*-clocks. The S*-clock passes, in the
course of time, S-clocks which are farther and farther along the positive
X-axis and therefore increasingly fast with respect to S*; consequently,
the rate of the S*-clock appears slow in comparison. Conversely, an
S-clock passes S*-clocks farther and farther along the negative X*-axis
and therefore increasingly fast with respect to S. The rate of the
S-clock appears slow compared with S*-clocks.
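
The reciprocity of the two comparisons can be illustrated with a minimal sketch. It assumes the two clock relations in the form t = ((v/c^2) x_0* + t*)/\sqrt{1 - v^2/c^2} for a clock at rest in S* and t* = (t - (v/c^2) x_1)/\sqrt{1 - v^2/c^2} for a clock at rest in S. The function names, the clock positions, and the units with c = 1 are illustrative assumptions.

```python
import math

def s_time_of_starred_clock(t_star, x0_star, v, c=1.0):
    """S-time at which an S*-clock resting at x0_star reads t_star."""
    return (v / c**2 * x0_star + t_star) / math.sqrt(1.0 - (v / c) ** 2)

def starred_time_of_s_clock(t, x1, v, c=1.0):
    """S*-time at which an S-clock resting at x1 reads t."""
    return (t - v / c**2 * x1) / math.sqrt(1.0 - (v / c) ** 2)

v = 0.6  # illustrative relative speed in units with c = 1

# One unit of time elapses on a single S*-clock: the S-time interval is longer.
dt_in_s = s_time_of_starred_clock(1.0, 2.0, v) - s_time_of_starred_clock(0.0, 2.0, v)

# One unit of time elapses on a single S-clock: the S*-time interval is longer.
dt_in_s_star = starred_time_of_s_clock(1.0, 2.0, v) - starred_time_of_s_clock(0.0, 2.0, v)

print(dt_in_s, dt_in_s_star)
```

Both intervals are stretched by the same factor 1/\sqrt{1 - v^2/c^2}, and the position of the clock drops out of the interval, entering only as a constant offset.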

In the case of length measurements, conditions are somewhat more
involved, because the transformation equations contain y and z in a
different way than x, the direction of relative motion. A rigid scale
that is perpendicular to the direction of relative motion has the same
length in either coordinate system. However, when the scale is parallel
to the X-axis and the X*-axis, we have to distinguish whether the scale
is at rest relative to one coordinate system or to the other. Let us
first consider a rod rigidly connected with S*, the end points of which
have the coordinates (x_1*, 0, 0) and (x_2*, 0, 0). Its length in its own
system is

    l^* = x_2^* - x_1^*.

An observer connected with S will consider as the length of the rod the
coordinate difference (x_2 - x_1) of its end points at the same time, t.
The coordinates x_1* and x_2* are related to x_1, x_2, and t by equations
(4.13), yielding

    x_1^* = \frac{x_1 - vt}{\sqrt{1 - v^2/c^2}},
    x_2^* = \frac{x_2 - vt}{\sqrt{1 - v^2/c^2}}.

Therefore, the coordinate differences are

    x_2^* - x_1^* = \frac{x_2 - x_1}{\sqrt{1 - v^2/c^2}}.

If we denote the length (x_2 - x_1) by l, we obtain

    l = \sqrt{1 - v^2/c^2} \; l^*.

The rod appears contracted by the factor \sqrt{1 - v^2/c^2}. This effect is
called the Lorentz contraction.

A calculation that reverses the roles of the two coordinate systems 
shows that a rod at rest in the unstarred system appears contracted in
the starred system. 
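
The measurement prescription can be sketched numerically: the S-observer records both end points of the moving rod at one and the same S-time t. The sketch assumes the relation x* = (x - vt)/\sqrt{1 - v^2/c^2} solved for x; the function name, the end-point coordinates, and the units with c = 1 are illustrative assumptions.

```python
import math

def s_coordinate(x_star, t, v, c=1.0):
    """S-coordinate at S-time t of a point resting at x_star in S*."""
    return x_star * math.sqrt(1.0 - (v / c) ** 2) + v * t

v, t = 0.6, 3.0                # illustrative speed and common S-time
x1_star, x2_star = 1.0, 2.0    # end points of the rod in its rest system

rest_length = x2_star - x1_star
measured_length = s_coordinate(x2_star, t, v) - s_coordinate(x1_star, t, v)
print(rest_length, measured_length)
```

The term vt is the same for both end points and cancels in the difference, so the measured length does not depend on the instant chosen, only on the requirement that both readings be simultaneous in S.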

Thus, we have the rules: Every clock appears to go at its fastest rate
when it is at rest relatively to the observer. If it moves relatively to the
observer with the velocity v, its rate appears slowed down by the factor
\sqrt{1 - v^2/c^2}. Every rigid body appears to be longest when at rest
relatively to the observer. When it is not at rest, it appears contracted in
the direction of its relative motion by the factor \sqrt{1 - v^2/c^2}, while
its dimensions perpendicular to the direction of motion are unaffected.

The proper time interval. In contrast to the classical transformation
theory, we no longer consider length and time intervals as invariants.
But the invariant character of the speed of light gives rise to the
existence of another invariant. Let us return to equations (4.1), (4.2), and
(4.3), and conditions (4.8). We shall consider two events having the
space and time coordinates (x_1, y_1, z_1, t_1) and (x_2, y_2, z_2, t_2),
respectively. The difference between the squared time interval and the
squared distance, divided by c^2, shall be called τ_12^2, or

    \tau_{12}^2 = (t_2 - t_1)^2 - \frac{1}{c^2} \left[ (x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2 \right].   (4.30)

Correspondingly, we define a similar quantity with respect to S*,

    \tau_{12}^{*2} = (t_2^* - t_1^*)^2 - \frac{1}{c^2} \left[ (x_2^* - x_1^*)^2 + (y_2^* - y_1^*)^2 + (z_2^* - z_1^*)^2 \right].   (4.31)

Now we express τ*_12^2 in terms of S-quantities, according to eqs. (4.1),
(4.2), and (4.3), just as we did in the discussion of eq. (4.5), and obtain

    \tau_{12}^{*2} = (\beta^2 - \alpha^2 v^2/c^2)(t_2 - t_1)^2
        - \frac{1}{c^2} \left[ (\alpha^2 - c^2 \gamma^2)(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2 \right]
        + 2 (\alpha^2 v/c^2 + \beta \gamma)(x_2 - x_1)(t_2 - t_1).   (4.32)

Because the constants α, β, and γ satisfy conditions (4.8), we find that
τ_12^2 is an invariant with respect to the transformation equations
(4.13), or,

    \tau_{12}^{*2} = \tau_{12}^2.





It is also invariant with respect to spatial orthogonal transformations 

Hereafter, we shall call all the linear transformations with respect to 
which Tia is invariant, Lorentz transformations, regardless of whether the 
relative motion of the two systems takes place along the common 
X-axis or not. Obviously, the invarianee of r}! implies the in variance 
of the speed of light, for the path, of a light ray is characterized by the 
vanishing of rj for all pairs of points along its path. 
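
The invariance can be checked numerically for a particular pair of events.
The sketch below assumes that eqs. (4.13) are the usual Lorentz
transformation for relative motion along the common X-axis and works in
units where $c = 1$; the function names and the sample events are chosen
only for this illustration.

```python
import math

def lorentz_boost(event, v, c=1.0):
    """Map an event (t, x, y, z) in S to (t*, x*, y*, z*) in S*.

    Assumes relative motion along the common X-axis with |v| < c.
    """
    t, x, y, z = event
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return (gamma * (t - v * x / c**2), gamma * (x - v * t), y, z)

def tau_squared(e1, e2, c=1.0):
    """Squared time difference minus squared distance divided by c^2."""
    dt = e2[0] - e1[0]
    dist2 = sum((b - a) ** 2 for a, b in zip(e1[1:], e2[1:]))
    return dt**2 - dist2 / c**2

event_1 = (0.0, 0.0, 0.0, 0.0)
event_2 = (5.0, 3.0, 1.0, 2.0)
v = 0.6  # relative velocity of S* in units where c = 1

tau2_in_S = tau_squared(event_1, event_2)
tau2_in_S_star = tau_squared(lorentz_boost(event_1, v), lorentz_boost(event_2, v))
print(tau2_in_S, tau2_in_S_star)  # the two values agree up to rounding
```

The time difference and the distance each change under the transformation;
only the combination (4.30) keeps its value.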

What is the physical significance of this quantity $\tau_{12}$? If there
exists a frame of reference with respect to which both events take place
at the same space point, then $\tau_{12}$ (the positive square root of
$\tau_{12}^2$) is the time recorded by a clock at rest in that frame of
reference. $\tau_{12}$ is therefore called the proper time interval (or
eigen time interval).

Does there always exist a frame of reference with respect to which
two events take place at the same space point? If we were dealing
with the classical transformation equations, the answer would be yes,
unless the two events took place "simultaneously." But the equations
of the Lorentz transformation, (4.13), become singular when $v$, the
relative velocity of the two frames, becomes equal to the speed of
light. For values of $v$ greater than $c$, equations (4.13) would lead to
imaginary values of $x^*$ and $t^*$. The Lorentz transformation equations
are, thus, defined only for relative velocities of the two frames of
reference smaller than $c$. Therefore, if two events occur in such rapid
succession that the time difference is equal to or less than the time needed
by a light ray to traverse the spatial distance between the two events,
no frame of reference exists with respect to which the two events occur
at the same spot.

Whenever the two events can be just connected by a light ray which
leaves the site of one event at the time it occurs and arrives at the site
of the other event as it takes place, the proper time interval $\tau_{12}$ between
them vanishes. Whenever the sequence of two events is such that a
light ray coming from either event arrives at the site of the other only
after it has occurred, $\tau_{12}^2$ is negative. Then we introduce instead of
$\tau_{12}$ the invariant $\sigma_{12}$, with $\sigma_{12}^2 = -c^2\tau_{12}^2$,

$$\sigma_{12}^2 = (x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2 - c^2(t_2 - t_1)^2. \tag{4.34}$$

Either $\tau_{12}$ or $\sigma_{12}$ is real for any two events. Whenever $\sigma_{12}$ is real, we can
carry out a Lorentz transformation so that $t^*_2 - t^*_1$ vanishes. In other
words, there exists a frame of reference with respect to which the events
occur simultaneously. In that frame of reference, the spatial distance
between the two events is simply $\sigma_{12}$.

Frequently, either $\tau_{12}$ or $\sigma_{12}$ is referred to as the space-time interval
between the two events. The interval is called time-like when $\tau_{12}$ is
real, and space-like when $\sigma_{12}$ is real. Whether the interval between the
two events is time-like or space-like does not depend on the frame of
reference or the coordinate system used, but is an invariant property
of the two events.
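
As a small illustration, the classification can be written as a function of
the two events. The function name and the label "light-like" for a
vanishing interval are assumptions of this sketch, as is the choice of
units with $c = 1$.

```python
import math

def classify_interval(e1, e2, c=1.0):
    """Classify the interval between two events (t, x, y, z).

    Returns ("time-like", tau), ("space-like", sigma), or ("light-like", 0.0).
    The exact comparison with zero is adequate only for simple test values.
    """
    dt = e2[0] - e1[0]
    dist2 = sum((b - a) ** 2 for a, b in zip(e1[1:], e2[1:]))
    tau2 = dt**2 - dist2 / c**2
    if tau2 > 0:
        return "time-like", math.sqrt(tau2)
    if tau2 < 0:
        return "space-like", math.sqrt(-(c**2) * tau2)
    return "light-like", 0.0

origin = (0.0, 0.0, 0.0, 0.0)
print(classify_interval(origin, (5.0, 3.0, 0.0, 0.0)))  # time-like, tau = 4
print(classify_interval(origin, (3.0, 5.0, 0.0, 0.0)))  # space-like, sigma = 4
print(classify_interval(origin, (1.0, 1.0, 0.0, 0.0)))  # connected by a light ray
```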

We mentioned before that the Lorentz transformation is defined only 
for relative velocities smaller than the speed of light. If a frame of 
reference could move as fast as or faster than light, it would be, indeed, 
impossible for light to propagate at all in the forward direction, much 
less with the speed c. 

The relativistic law of the addition of velocities. Is it possible to find
two frames of reference which are moving relatively to each other with
a velocity greater than $c$ by carrying out a series of successive Lorentz
transformations? To answer this question, we shall study the
superposition of two (or more) Lorentz transformations. We shall introduce
three frames of reference, $S$, $S^*$, and $S^{**}$. $S^*$ has the velocity $v$ relative
to $S$, and $S^{**}$ has the velocity $w$ relative to $S^*$. We want to find the
transformation equations connecting $S^{**}$ with $S$. Starting with the
equations

$$x^* = \frac{x - vt}{\sqrt{1 - v^2/c^2}}, \qquad y^* = y, \qquad z^* = z, \qquad t^* = \frac{t - (v/c^2)\,x}{\sqrt{1 - v^2/c^2}}$$

and

$$x^{**} = \frac{x^* - wt^*}{\sqrt{1 - w^2/c^2}}, \qquad y^{**} = y^*, \qquad z^{**} = z^*, \qquad t^{**} = \frac{t^* - (w/c^2)\,x^*}{\sqrt{1 - w^2/c^2}},$$

we have to substitute the first set of equations in the second set. The
result of the straightforward calculation is

$$x^{**} = \frac{x - ut}{\sqrt{1 - u^2/c^2}}, \qquad y^{**} = y, \qquad z^{**} = z, \qquad t^{**} = \frac{t - (u/c^2)\,x}{\sqrt{1 - u^2/c^2}},$$

where

$$u = \frac{v + w}{1 + vw/c^2}. \tag{4.37}$$



Thus, two Lorentz transformations, carried out one after the other,
are equivalent to one Lorentz transformation. But the relative velocity
of $S^{**}$ with respect to $S$ is not simply the sum of $v$ and $w$. As long as
both $v/c$ and $w/c$ are small compared with unity, $u$ is very nearly equal
to $v + w$, but as one of the two velocities approaches $c$, the deviation
becomes important. Eq. (4.37) can be written in the form

$$u = c\left[1 - \frac{(1 - v/c)(1 - w/c)}{1 + vw/c^2}\right].$$

In this form, it is obvious that $u$ cannot become equal to or greater
than $c$, as long as both $v$ and $w$ are smaller than $c$. Therefore, it is
impossible to combine several Lorentz transformations in one involving
a relative velocity greater than $c$.
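
A short numerical sketch of eq. (4.37) shows both limits discussed above.
The function names, the sample velocities, and the units with $c = 1$ are
assumptions made for the illustration.

```python
def add_velocities(v, w, c=1.0):
    """Relativistic composition u = (v + w) / (1 + v*w/c^2), eq. (4.37)."""
    return (v + w) / (1.0 + v * w / c**2)

def add_velocities_alt(v, w, c=1.0):
    """The same law written as u = c [1 - (1 - v/c)(1 - w/c) / (1 + v*w/c^2)]."""
    return c * (1.0 - (1.0 - v / c) * (1.0 - w / c) / (1.0 + v * w / c**2))

# For small velocities the result is very nearly v + w.
print(add_velocities(1e-6, 2e-6))

# Two velocities of 0.5 c combine to 0.8 c, not to c.
print(add_velocities(0.5, 0.5))

# Repeated composition with 0.9 c approaches c but never reaches it.
u = 0.0
for _ in range(5):
    u = add_velocities(u, 0.9)
print(u)
```

The second form makes the bound visible in the code as well: the subtracted
fraction stays positive as long as both velocities are smaller than $c$.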

Eq. (4.37) can be interpreted in a slightly different way, for a body
which has the velocity $w$ with respect to $S^*$ has the velocity $u$ with
respect to $S$. Then eq. (4.37) can be regarded as the transformation
law for velocities (in the X-direction). In this case, it would be
preferable to write it

$$u = \frac{v + u^*}{1 + vu^*/c^2},$$

where $w$ has been replaced by $u^*$. We conclude that a body has a
velocity smaller than $c$ in every inertial system if its velocity is less
than $c$ with respect to one inertial system.

The Lorentz transformation equations imply that no material body
can have a velocity greater than $c$ with respect to any inertial system.
For each material body can be used as a frame of reference; and if it
is removed from interaction with other bodies and does not rotate
around its own center of gravity, it defines a new inertial system.
Then, if the body could assume a velocity greater than $c$ with respect
to any inertial system, this system and the one connected with the
body would have a relative velocity greater than $c$.

The proper time of a material body. We have spoken before of the
space-time interval between two events. The application of this concept
to the motion of a material body and to the space-time points
along its path is particularly important for the development of
relativistic mechanics. Since the velocity of a material body remains
below $c$ at all times, such an interval is always time-like. If the motion
of the body is not straight-line and uniform, we can still define the
parameter $\tau$ along its path by the differential equation

$$d\tau^2 = dt^2 - \frac{1}{c^2}\left(dx^2 + dy^2 + dz^2\right) = \left\{1 - \frac{1}{c^2}\left[\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2 + \left(\frac{dz}{dt}\right)^2\right]\right\} dt^2. \tag{4.39}$$

$\tau$ is the time shown by a clock rigidly connected with the moving body,
really its "proper time" (its own time). When eq. (4.39) is divided by
$dt^2$ and the root is taken, we obtain the relation between coordinate
time and proper time,

$$\frac{d\tau}{dt} = \sqrt{1 - u^2/c^2},$$

where $u$ is the velocity of the body. This relation is valid for
accelerated as well as unaccelerated bodies.

Both $d\tau$ and $\tau$, which is defined by the integral

$$\tau = \int d\tau = \int \sqrt{1 - u^2/c^2}\; dt,$$

are invariant with respect to Lorentz transformations, though $dt$ and
$u$ are not.
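
The integral can be evaluated numerically for any given velocity history.
The following sketch uses a simple midpoint rule; the function name, the
number of steps, and the two sample velocity functions are assumptions of
the example, and units with $c = 1$ are used.

```python
import math

def proper_time(speed, t_start, t_end, steps=10_000, c=1.0):
    """Approximate the integral of sqrt(1 - u(t)^2/c^2) dt (midpoint rule).

    `speed` is a function giving the velocity u at coordinate time t;
    it must stay below c in magnitude.
    """
    dt = (t_end - t_start) / steps
    total = 0.0
    for i in range(steps):
        t_mid = t_start + (i + 0.5) * dt
        u = speed(t_mid)
        total += math.sqrt(1.0 - (u / c) ** 2) * dt
    return total

# Uniform motion with u = 0.6 c during 10 units of coordinate time.
tau_uniform = proper_time(lambda t: 0.6, 0.0, 10.0)

# An accelerated body whose velocity varies as 0.8 c sin(t).
tau_accelerated = proper_time(lambda t: 0.8 * math.sin(t), 0.0, math.pi)

print(tau_uniform, tau_accelerated)
```

For uniform motion the result reduces to the elapsed coordinate time
multiplied by $\sqrt{1 - u^2/c^2}$; for the accelerated body the proper time
is again smaller than the coordinate time.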


Problems

1. On page 39 we have discussed one method of measuring the length
of a moving rod. We could also define that length as the product of
the velocity of the moving rod by the time interval between the instant
when one end point of the moving rod passes a fixed marker and the
instant when the other end point passes the same marker.

Show that this definition leads also to the Lorentz contraction
formula, equation (4.29).

2. Two rods which are parallel to each other move relatively to each
other in their length directions. Explain the apparent paradox that
either rod may appear longer than the other, depending on the state
of motion of the observer.

3. Suppose that the frequency of a light ray is $\nu$ with respect to a
frame of reference $S$. Its frequency $\nu^*$ in another frame of reference,
$S^*$, depends on the angle $\alpha$ between the direction of the light ray and
the direction of relative motion of $S$ and $S^*$. Derive both the classical
and the relativistic equations stating how $\nu^*$ depends on $\nu$ and the
angle $\alpha$.

For this purpose, the light may be treated as a plane scalar wave
moving with the velocity $c$.

Answer: The classical equation is

$$\nu^* = \nu\,(1 - \cos\alpha \cdot v/c),$$

and the relativistic equation is

$$\nu^* = \nu\,\frac{1 - \cos\alpha \cdot v/c}{\sqrt{1 - v^2/c^2}} \approx \nu\left[1 - \cos\alpha \cdot v/c + \tfrac{1}{2}(v/c)^2 + \cdots\right].$$

The first-order effect common to both formulas is the "classical" Doppler
effect; the second-order term is called the "relativistic" Doppler effect.
It is independent of the angle $\alpha$.

4. H. A. Lorentz created a theory which was the forerunner to the
relativity theory as we know it today. Instead of trying to extend the
relativity principle to electrodynamics, he assumed that there exists one
privileged frame of reference, with respect to which the ether was to be
at rest. In order to account for the outcome of the Michelson-Morley
experiment, he assumed that the ether affects scales and clocks which
are moving through it. According to this hypothesis, clocks are slowed
down and scales are contracted in the direction of their motion. It is
possible to derive the quantitative expressions for the factors of time
and length contraction with the help of these notions.

(a) Assuming that the Galilean transformation equations are
applicable, derive the rigorous expression for the time that a light ray needs
to travel a measured distance $l$ in both directions along a straight path
in a Michelson-Morley apparatus, provided that the velocity of the
apparatus relative to the privileged system is $v$ and that the angle
between the path and the direction of $v$ is $\alpha$.

Answer:

$$t = \frac{2l}{c}\,\frac{\sqrt{1 - v^2/c^2 \cdot \sin^2\alpha}}{1 - v^2/c^2}. \tag{4.p1}$$



(b) Now we introduce Lorentz' hypothesis and assume that equation
(4.p1) holds for the true, contracted length $l$ and the true, distorted
angle $\alpha$. The time indicated by the observer's clock is not the real
time $t$, but the clock time, $t^*$. Furthermore, we measure the length with
scales that are contracted themselves; that is, what we measure is not
the true, contracted length $l$, but the apparent, uncontracted length $l^*$.
The relation between the clock time $t^*$ and the apparent length $l^*$ is

$$t^* = \frac{2l^*}{c}, \tag{4.p2}$$

according to the outcome of the Michelson-Morley experiment. We
call the factor of time contraction $\theta$ and the factor of length contraction
in the direction of $v$, $\lambda$. Derive the relations between $t$ and $t^*$, $l$ and $l^*$,
and determine $\lambda$ and $\theta$ so that eqs. (4.p1) and (4.p2) become equivalent.

Answer:

$$t^* = \theta\,t, \qquad l^* = l\,\sqrt{\sin^2\alpha + \lambda^{-2}\cos^2\alpha}, \qquad \lambda = \theta = \sqrt{1 - v^2/c^2}.$$


(c) In order to obtain the complete Lorentz transformation equations
(4.13), introduce two coordinate systems, one at rest and one moving
through the ether ($S$ and $S^*$). Determine the apparent distances of
points on the starred coordinate axes from the starred point of origin.
Finally, find out how moving clocks must be adjusted so that a signal
spreading in all directions from the starred point of origin and starting
at the time $t = t^* = 0$ has the apparent speed $c$ in all directions.


Vector and Tensor Calculus in an n Dimensional Continuum


The classical transformation theory draws a sharp dividing line
between space and time coordinates. The time coordinate is always
transformed into itself, because time intervals are considered in classical
physics to be invariant.

The relativistic transformation theory destroys this detached position
of the time coordinate in that the time coordinate of one coordinate
system depends on both the time and space coordinates of another
system whenever the two systems considered are not at rest relative to
each other.

The laws of classical physics are always formulated so that the time
coordinate is set apart from the spatial coordinates, and this is quite
appropriate because of the character of the transformations with respect
to which these laws are covariant. It is possible to formulate relativistic
physics so that the time coordinate retains its customary special
position, but we shall find that in this form the relativistic laws are
cumbersome and often difficult to apply.

A proper formalism must be adapted to the theory which it is to
represent. The Lorentz transformation equations suggest the uniform
treatment of the four coordinates $x$, $y$, $z$, and $t$. How this might be done
was shown by H. Minkowski. We shall find that the application of his
formalism will simplify many problems, and that with its help many
relativistic laws and equations turn out to be more lucid than their
nonrelativistic analogues.

Classical physics is characterized by the invariance of length and
time. We can formally characterize relativistic physics by the
invariance of the expression

$$\tau_{12}^2 = (t_2 - t_1)^2 - \frac{1}{c^2}\left[(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2\right]. \tag{5.1}$$

The invariance of this quadratic form of the coordinate differences
restricts the group of all conceivable linear transformations of the four
coordinates $x$, $y$, $z$, and $t$ to that of the Lorentz transformations, just as
the invariance of the expression

$$s_{12}^2 = (x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2$$

defines the group of three dimensional orthogonal transformations. The
four dimensional continuum $(x, y, z, t)$, with its invariant form $\tau_{12}^2$, can
be treated as a four dimensional "space," in which $\tau_{12}$ is the "distance"
between the two "points" $(x_1, y_1, z_1, t_1)$ and $(x_2, y_2, z_2, t_2)$. This
procedure permits the development of a sort of generalized vector calculus
in the "Minkowski world," and the formulation of all invariant
relations in a clear and concise way.

We shall begin the study of this mathematical method with a
recapitulation of elementary vector calculus, focusing our attention on its formal
aspects. Then we shall generalize the formalism so that it becomes
applicable to the space-time continuum.

Orthogonal transformations. Let us start with a rectangular Cartesian
coordinate system and call its three coordinates $x_1$, $x_2$, and $x_3$
(instead of $x$, $y$, and $z$). Call the coordinate differences between two
points $P$ and $P'$, $\Delta x_1$, $\Delta x_2$, and $\Delta x_3$. The distance between the two
points is given by

$$s^2 = \sum_{i=1}^{3} \Delta x_i^2. \tag{5.2a}$$

If we carry out a linear coordinate transformation,

$$x'_i = \sum_{k=1}^{3} c_{ik}x_k + \mathring{x}_i, \qquad i = 1, 2, 3, \tag{5.3}$$

the new coordinate differences are

$$\Delta x'_i = \sum_{k=1}^{3} c_{ik}\,\Delta x_k, \qquad i = 1, 2, 3. \tag{5.4}$$

These equations can be solved with respect to the $\Delta x_k$:

$$\Delta x_k = \sum_{i=1}^{3} c'_{ki}\,\Delta x'_i, \qquad k = 1, 2, 3. \tag{5.5}$$

Eq. (5.2a) expresses itself in terms of the new coordinates thus:

$$s^2 = \sum_{i,k,l=1}^{3} c'_{ik}c'_{il}\,\Delta x'_k\,\Delta x'_l. \tag{5.6}$$

The new coordinate system is a rectangular Cartesian system only if
eq. (5.6) is formally identical with eq. (5.2a), that is, if

$$\sum_{i=1}^{3} c'_{ik}c'_{il} = \begin{cases} 0 & \text{if } k \neq l, \\ 1 & \text{if } k = l. \end{cases} \tag{5.7}$$

These equations take a more concise form if we use the so-called
Kronecker symbol $\delta_{kl}$, which is defined by the equations

$$\delta_{kl} = 0, \quad k \neq l; \qquad \delta_{kl} = 1, \quad k = l.$$

Eq. (5.7) takes the form

$$\sum_{i=1}^{3} c'_{ik}c'_{il} = \delta_{kl}, \qquad k, l = 1, 2, 3. \tag{5.7a}$$

Eq. (5.7a) is the condition which must be satisfied if the transformation
equations (5.3) are to represent the transition from one Cartesian
coordinate system to another.

We can easily formulate the condition to be satisfied by the $c_{ik}$
themselves. By substituting eqs. (5.4) in eqs. (5.5) we obtain

$$\Delta x_k = \sum_{i,l=1}^{3} c'_{ki}c_{il}\,\Delta x_l, \qquad k = 1, 2, 3, \tag{5.9}$$

and because this equation holds for arbitrary $\Delta x_k$, we find

$$\sum_{i=1}^{3} c'_{ki}c_{il} = \delta_{kl}, \qquad k, l = 1, 2, 3. \tag{5.10}$$

Now we can multiply eqs. (5.7a) by $c_{lm}$ and sum over the three possible
values of $l$. We obtain, because of (5.10),

$$\sum_{i,l=1}^{3} c'_{ik}c'_{il}c_{lm} = c'_{mk} = \sum_{l=1}^{3} \delta_{kl}c_{lm} = c_{km}. \tag{5.11}$$

By substituting $c_{ki}$ for $c'_{ik}$, and so forth, in eqs. (5.7a), we obtain

$$\sum_{i=1}^{3} c_{ki}c_{li} = \delta_{kl}, \qquad k, l = 1, 2, 3, \tag{5.7b}$$

and eqs. (5.10) take the form

$$\sum_{i=1}^{3} c_{ik}c_{il} = \delta_{kl}, \qquad k, l = 1, 2, 3. \tag{5.10a}$$

Either eqs. (5.7b) or (5.10a), together with eqs. (5.3), define the group
of orthogonal transformations.
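
The two conditions can be tested for a given set of coefficients. In the
sketch below, the rotation about the third axis and the shear are
illustrative choices, and the function names are assumptions of the
example; indices run from 0 to 2, as is customary in Python.

```python
import math

def rotation_about_x3(angle):
    """Coefficients c_ik of a rotation about the x3-axis (illustrative choice)."""
    co, si = math.cos(angle), math.sin(angle)
    return [[co, si, 0.0], [-si, co, 0.0], [0.0, 0.0, 1.0]]

def kronecker(k, l):
    return 1.0 if k == l else 0.0

def satisfies_5_7b(c, tol=1e-12):
    """True if sum_i c_ki c_li = delta_kl for all k, l."""
    n = len(c)
    return all(
        abs(sum(c[k][i] * c[l][i] for i in range(n)) - kronecker(k, l)) < tol
        for k in range(n) for l in range(n)
    )

def satisfies_5_10a(c, tol=1e-12):
    """True if sum_i c_ik c_il = delta_kl for all k, l."""
    n = len(c)
    return all(
        abs(sum(c[i][k] * c[i][l] for i in range(n)) - kronecker(k, l)) < tol
        for k in range(n) for l in range(n)
    )

rotation = rotation_about_x3(0.3)
shear = [[1.0, 1.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

print(satisfies_5_7b(rotation), satisfies_5_10a(rotation))  # both conditions hold
print(satisfies_5_7b(shear), satisfies_5_10a(shear))        # neither condition holds
```

The shear is a linear transformation of the form (5.3), but it does not
lead from one Cartesian coordinate system to another.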




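Transformation determinant. We shall now investigate the
transformations (5.3) and (5.7b) a little further.
The determinant of the coefficients $c_{ik}$,

$$|c_{ik}| = \begin{vmatrix} c_{11} & c_{12} & c_{13} \\ c_{21} & c_{22} & c_{23} \\ c_{31} & c_{32} & c_{33} \end{vmatrix},$$

is equal to $\pm 1$. To prove this statement, we make use of the
multiplication law of determinants, which states that the product of two
determinants $|a_{ik}|$ and $|b_{ik}|$ is equal to the determinant $\left|\sum_l a_{il}b_{lk}\right|$. Now
we form the determinant of both sides of eqs. (5.7b),

$$\left|\sum_i c_{ki}c_{li}\right| = |\delta_{kl}|. \tag{5.12}$$

According to the above-mentioned multiplication law, and because a
determinant is not changed when its rows and columns are interchanged,
the left-hand side can be written

$$\left|\sum_i c_{ki}c_{li}\right| = |c_{ki}| \cdot |c_{li}| = |c_{ki}|^2.$$

The value of the right-hand side of eq. (5.12) is equal to unity, since

$$|\delta_{kl}| = \begin{vmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{vmatrix} = 1.$$

Therefore, we really have

$$|c_{ki}| = \pm 1. \tag{5.15}$$

The value $+1$ of the determinant belongs to the "proper" rotations,
while the value $-1$ belongs to orthogonal transformations involving a
reflection.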
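
Both cases can be illustrated with explicit coefficients. The helper
functions and the sample rotation and reflection below are assumptions made
for this sketch.

```python
import math

def det3(c):
    """Determinant of a 3x3 array of coefficients given as nested lists."""
    return (c[0][0] * (c[1][1] * c[2][2] - c[1][2] * c[2][1])
            - c[0][1] * (c[1][0] * c[2][2] - c[1][2] * c[2][0])
            + c[0][2] * (c[1][0] * c[2][1] - c[1][1] * c[2][0]))

def matmul(a, b):
    """Product with the elements sum_l a_il b_lk."""
    n = len(a)
    return [[sum(a[i][l] * b[l][k] for l in range(n)) for k in range(n)]
            for i in range(n)]

co, si = math.cos(0.3), math.sin(0.3)
rotation = [[co, si, 0.0], [-si, co, 0.0], [0.0, 0.0, 1.0]]     # proper rotation
reflection = [[-1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # x1 -> -x1
combined = matmul(reflection, rotation)

print(det3(rotation))    # very nearly +1
print(det3(reflection))  # -1
print(det3(combined))    # product of the two determinants, very nearly -1
```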

Improved notation. In the great majority of equations occurring in
three dimensional vector (and tensor) calculus, every literal index which
occurs once in a product assumes any of the three values 1, 2, 3, and
every literal index which occurs twice in a product is a summation index.
From now on, therefore, we shall omit all summation signs and all
remarks of the type $(i, k = 1, 2, 3)$, and it shall be understood that:

(1) Each literal index which occurs once in a product assumes all its
possible values;

(2) Each literal index which occurs twice in a product is a summation
index, where the summation is to be carried out over all possible values.

Thus, we write eqs. (5.3) and (5.6) like this:

$$x'_i = c_{ik}x_k + \mathring{x}_i,$$

$$s^2 = c'_{ik}c'_{il}\,\Delta x'_k\,\Delta x'_l.$$

Summation indices are often called dummy indices or simply dummies.
The significance of an expression is not changed if a pair of dummies is
replaced by some other letter, for example,

$$c_{ik}x_k = c_{il}x_l.$$

Vectors. The transformation law of the $\Delta x_k$, (5.4), is the general
transformation law of vectors with respect to orthogonal
transformations, or, rather: A vector is defined as a set of three quantities which
transform like coordinate differences:

$$a'_i = c_{ik}a_k. \tag{5.16}$$

When the vector components are given with respect to any one Cartesian
coordinate system, they can be computed with respect to every other
Cartesian coordinate system.

The norm of a vector is defined as the sum of the squared vector
components. We shall prove that the norm is an invariant with respect to
orthogonal transformations, or that

$$a'_i a'_i = a_i a_i. \tag{5.17}$$

Substituting for $a'_i$ its expression (5.16), and making use of eq. (5.10a),
we obtain

$$a'_i a'_i = c_{ik}a_k\,c_{il}a_l = \delta_{kl}a_k a_l = a_k a_k,$$

which proves that eq. (5.17) holds for orthogonal transformations.

The scalar product of two vectors is defined as the sum of the products
of corresponding vector components,

$$(\mathbf{a} \cdot \mathbf{b}) = a_i b_i. \tag{5.18}$$

That this expression is an invariant with respect to orthogonal
transformations is shown by a computation analogous to the proof of eq.
(5.17). The norm of a vector is the scalar product of the vector by
itself.

The word scalar is frequently used in vector and tensor calculus
instead of invariant. "Scalar product" means "invariant product."
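
The invariance of the norm and of the scalar product can be checked with
explicit components. In the following sketch the summation over the dummy
index is written out as a Python `sum`; the function names, the rotation
angle, and the sample vectors are assumptions of the example.

```python
import math

def transform_vector(c, a):
    """a'_i = c_ik a_k, with the summation over the dummy index k written out."""
    return [sum(c[i][k] * a[k] for k in range(len(a))) for i in range(len(c))]

def scalar_product(a, b):
    """(a . b) = a_i b_i."""
    return sum(a_i * b_i for a_i, b_i in zip(a, b))

co, si = math.cos(0.7), math.sin(0.7)
c = [[co, si, 0.0], [-si, co, 0.0], [0.0, 0.0, 1.0]]  # an orthogonal transformation

a = [1.0, 2.0, 3.0]
b = [-2.0, 0.5, 4.0]
a_new = transform_vector(c, a)
b_new = transform_vector(c, b)

print(scalar_product(a, a), scalar_product(a_new, a_new))  # norm in both systems
print(scalar_product(a, b), scalar_product(a_new, b_new))  # scalar product in both systems
```

The individual components of the two vectors change, while the two sums
keep their values up to rounding.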

Sums and differences of vectors are, again, vectors,

$$a_i + b_i = s_i, \qquad a_i - b_i = d_i. \tag{5.19}$$

That the new quantities $s_i$ and $d_i$ really transform according to eq.
(5.16) follows from the linear, homogeneous character of that
transformation law.

The product of a vector and a scalar (invariant) is a vector,

$$\alpha\,a_i = b_i. \tag{5.20}$$

The proof is left to the reader.

The discussion of the remaining algebraic vector operation, the vector 
product, must be deferred until later in this chapter, because its trans- 
formation properties are not quite like those of a vector. 

Vector analysis. We are now ready to go on to the simplest
differential operations, the gradient and the divergence. In the three
dimensional space of the three coordinates $x_i$, let us take a scalar field $V$,
that is, a function of the three coordinates $x_i$ which is invariant with
respect to coordinate transformations. The form of the function $V$ of
the coordinates will, of course, depend on the coordinate system used,
but in such a way that its value at a fixed point $P$ is not changed by the
transformation.

What is the transformation law of the derivatives of $V$ with respect
to the three coordinates,

$$V_{,i} \equiv \frac{\partial V}{\partial x_i}\,?$$

We must express the derivatives with respect to $x'_k$ in terms of the
derivatives with respect to $x_i$,

$$\frac{\partial V}{\partial x'_k} = \frac{\partial x_i}{\partial x'_k}\,\frac{\partial V}{\partial x_i}. \tag{5.22}$$

According to eq. (5.3), the $x'_k$ are linear functions of the $x_i$, and vice
versa. Therefore, the $\partial x_i/\partial x'_k$ are constants, and they are the constants
$c'_{ik}$ defined by eqs. (5.5). We have, therefore (a primed index after the
comma denoting differentiation with respect to the primed coordinate),

$$V_{,k'} = c'_{ik}V_{,i}$$

and, according to eq. (5.11),

$$V_{,k'} = c_{ki}V_{,i}. \tag{5.23}$$

The three quantities $V_{,i}$ transform according to eq. (5.16); therefore,
they are the components of a vector, which is called the gradient of the
scalar field $V$.

Three functions of the coordinates, $V_i(x_1, x_2, x_3)$, are the components of a
vector field if at each space-point they transform as the components of a
vector. The functions $V'_i$ of the coordinates $x'_s$ are, thus, given by the
equations

$$V'_i(x'_s) = c_{ik}V_k(x_s), \tag{5.16a}$$

where the $x'_s$ are connected with the $x_s$ by the transformation equations.
The gradient creates a vector field out of a scalar field.

The divergence does the opposite. Given a vector field $V_i$, we form
the sum of the three derivatives of each component with respect to the
coordinate with the same index,

$$\operatorname{div} \mathbf{V} = V_{i,i}. \tag{5.24}$$

We have to show that this expression is an invariant (or scalar),

$$V'_{i,i'} = V_{i,i}. \tag{5.25}$$

The procedure is exactly the same as before. We replace the primed
quantities and derivatives by the unprimed quantities,

$$V'_{k,k'} = c_{kl}\,(c_{ki}V_i)_{,l} = c_{kl}c_{ki}V_{i,l}.$$

Because of eq. (5.10a), this last expression is equal to the right-hand side
of eq. (5.25).

The divergence of a gradient of a scalar field is the Laplacian of that
scalar field and, of course, is itself a scalar field,

$$\operatorname{div} \operatorname{grad} V = V_{,ii} \equiv \nabla^2 V. \tag{5.27}$$

Tensors. In many parts of physics we encounter quantities whose
transformation laws are somewhat more involved than those of vectors.
As an example, let us consider the so-called "vector gradient." When
a vector field $V_i$ is given, we can obtain a set of quantities which
determine the change of each component of $V_i$ as we proceed from a point
with the coordinates $x_i$ in an arbitrary direction to the infinitesimally
near point with the coordinates $x_i + \delta x_i$. The increments of the three
quantities $V_i$ are

$$\delta V_i = V_{i,k}\,\delta x_k, \tag{5.28}$$

and the nine quantities $V_{i,k}$ are called the vector gradient of $V_i$. We
can easily derive its transformation law in the usual manner:

$$V'_{m,n'} = c_{nk}\,(c_{mi}V_i)_{,k} = c_{mi}c_{nk}V_{i,k}.$$

The vector gradient is one example of the new class of quantities
which we are now going to treat, the tensors. In general, a tensor has
$N$ indices, all of which take all values 1 to 3. The tensor has, therefore,
$3^N$ components. These $3^N$ components transform according to the
transformation law

$$T'_{mn\cdots} = c_{mi}c_{nk}\cdots T_{ik\cdots}. \tag{5.30}$$

The number of indices, $N$, is called its rank. The vector gradient is a
tensor of rank 2, vectors are tensors of rank 1, and scalars may be
called tensors of rank 0.

One very important tensor is the Kronecker symbol. Its values in
one coordinate system, when substituted into eq. (5.30), yield the same
values in another coordinate system,

$$\delta'_{kl} = c_{ki}c_{lm}\delta_{im} = c_{ki}c_{li} = \delta_{kl},$$

according to eq. (5.7b).

The sum or difference of two tensors of equal rank is a tensor of the
same rank. We formulate this law for tensors of rank 3:

$$T_{ikl} + U_{ikl} = V_{ikl}, \qquad T_{ikl} - U_{ikl} = W_{ikl}.$$

The proof is the same as for the corresponding law for vectors, eq. (5.19).
The product of two tensors of ranks $M$ and $N$ is a new tensor of rank
$(M + N)$,

$$T_{i\cdots}\,U_{k\cdots} = V_{i\cdots k\cdots}. \tag{5.34}$$

The rank of a tensor may be lowered by 2 (or by any even number)
by an operation called "contraction." Any two indices are converted
into a pair of dummy indices. For instance, we can contract the tensor
$T_{ikl\cdots}$ to obtain the tensors $T_{iil\cdots}$ or $T_{ikk\cdots}$. The proof that these
new contracted tensors are again tensors is very simple. For the first
example given here, it runs as follows.

$$T'_{ssn\cdots} = c_{si}c_{sk}c_{nl}\cdots T_{ikl\cdots}.$$

Because of eq. (5.10a), the right-hand side is equal to

$$T'_{ssn\cdots} = \delta_{ik}c_{nl}\cdots T_{ikl\cdots} = c_{nl}\cdots T_{iil\cdots}. \tag{5.35}$$

When we contract the vector gradient (tensor of rank 2), we obtain the
divergence (tensor of rank 0). The operations product, (5.34), and
contraction can be combined, so that they yield tensors such as
$T_{ikl}U_{ik}$, $T_{ikl}V_{km}$, and so on.
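
For a tensor of rank 2 the contraction is a single number, and its
invariance can be verified directly. The function names, the rotation, and
the sample components below are assumptions made for the illustration.

```python
import math

def transform_rank2(c, t):
    """T'_mn = c_mi c_nk T_ik for a tensor of rank 2."""
    r = range(len(c))
    return [[sum(c[m][i] * c[n][k] * t[i][k] for i in r for k in r)
             for n in r] for m in r]

def contract(t):
    """Contraction T_ii of a tensor of rank 2."""
    return sum(t[i][i] for i in range(len(t)))

co, si = math.cos(0.5), math.sin(0.5)
c = [[co, si, 0.0], [-si, co, 0.0], [0.0, 0.0, 1.0]]  # an orthogonal transformation

T = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 10.0]]
T_new = transform_rank2(c, T)

print(contract(T), contract(T_new))  # the same value in both coordinate systems
```

The nine components are mixed by the transformation, but the contracted
quantity behaves as a tensor of rank 0.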

Tensors may have symmetry properties with respect to their indices.
If a tensor is not changed when two or more indices are exchanged,
then it is symmetric in these indices. Instances are

$$t_{ikl} = t_{kil},$$

$$t_{iklm} = t_{ilkm} = t_{kilm} = t_{lkim} = t_{klim} = t_{likm}.$$

The first tensor is symmetric in its first two indices; the second tensor is
symmetric in its first three indices.

When a tensor remains the same or changes the sign of every
component upon the permutation of certain indices, the sign depending on
whether it is an even or an odd permutation, we say that the tensor is
antisymmetric (also skewsymmetric or alternating) with respect to these
indices. Instances are

$$t_{ikl} = -t_{kil},$$

$$t_{iklm} = t_{klim} = t_{likm} = -t_{ilkm} = -t_{kilm} = -t_{lkim}.$$

All such symmetry properties of a tensor are invariant. The proof
is extremely simple and shall be left to the student.
The Kronecker tensor is symmetric in its two indices.

Tensor analysis. When a tensor is differentiated with respect to the
coordinates, a new tensor is obtained, the rank of which is greater by 1.
The proof again consists of simple computation:

$$T'_{mn\cdots,s'} = c_{sl}\,(c_{mi}c_{nk}\cdots T_{ik\cdots})_{,l} = c_{mi}c_{nk}\cdots c_{sl}\,T_{ik\cdots,l}. \tag{5.36}$$

When the resulting tensor is contracted with respect to the index of
differentiation and another index, for example, $T_{ik\cdots,k}$, it is often called
a divergence.

Tensor densities. The "vector product" of two vectors $\mathbf{a}$ and $\mathbf{b}$ is
usually defined as a vector which is perpendicular to $\mathbf{a}$ and to $\mathbf{b}$ and
which has the magnitude $|\mathbf{a}| \cdot |\mathbf{b}| \cdot \sin(\mathbf{a}, \mathbf{b})$. As there are always two
vectors satisfying these conditions, viz., $\mathbf{P}_1$ and $\mathbf{P}_2$ in Fig. 6, a choice
is made between these two vectors by the further condition that $\mathbf{a}$, $\mathbf{b}$,
and $\mathbf{P}$ shall form a "screw" of the same type as the coordinate axes in
the sequence $x$, $y$, $z$. In Fig. 6, the vector $\mathbf{P}_1$ satisfies this condition,
but only because the chosen coordinate system is a "right-handed"
coordinate system. If we carry out a "reflection" (for example, give
the positive X-axis the direction to the rear of the figure instead of to
the front), $\mathbf{P}_2$ becomes automatically the vector product of $\mathbf{a}$ and $\mathbf{b}$.

The vector product is, thus, not an ordinary vector, but changes its
sign when we transform a right-handed coordinate system into a
left-handed system, or vice versa. Such quantities are called "axial
vectors," while ordinary vectors are called "polar vectors."

Fig. 6. The vector product. In a right-handed coordinate system, $\mathbf{P}_1$ represents
the vector product of $\mathbf{a}$ by $\mathbf{b}$.

With respect to a Cartesian coordinate system, the components of $\mathbf{P}$
are given by the expressions

$$P_1 = a_2b_3 - a_3b_2, \qquad P_2 = a_3b_1 - a_1b_3, \qquad P_3 = a_1b_2 - a_2b_1. \tag{5.37}$$

Similarly, the curl of a vector field $V_i$ is defined as an "axial vector"
with the components

$$C_1 = V_{3,2} - V_{2,3}, \qquad C_2 = V_{1,3} - V_{3,1}, \qquad C_3 = V_{2,1} - V_{1,2}. \tag{5.38}$$

From the point of view of tensor calculus, we can avoid the concept
of "axial vector" by introducing vector product and curl as
skewsymmetric tensors of rank 2,

$$P_{ik} = a_ib_k - a_kb_i,$$

$$C_{ik} = V_{k,i} - V_{i,k}.$$

It can be shown that all equations in which "axial vectors" appear can
be written in the covariant manner with the help of such skewsymmetric
tensors. Nevertheless, this treatment does not show very clearly the
connection between the transformation law of a skewsymmetric tensor
of rank 2 and that of an "axial vector." We can conform closely to
the methods of elementary vector calculus by introducing in addition
to tensors a new type of quantity, the "tensor densities."

The tensor densities transform like tensors, except that they are also
multiplied by the transformation determinant (5.15). As long as this
determinant equals $+1$, that is, when the transformation is a "proper
orthogonal transformation" without reflection, there is no difference
between a tensor and a tensor density. But a density undergoes a
change of sign (compared with a tensor) when a reflection of the
coordinate system is carried out. The tensor densities have, thus, the same
relationship to tensors as the "axial vectors" have to the "polar vectors."
Their transformation law can be written thus:

$$\mathfrak{T}'_{mn\cdots} = |c_{ab}|\,c_{mi}c_{nk}\cdots\mathfrak{T}_{ik\cdots}.$$

The laws of tensor density algebra and calculus are: The sum or
difference of two tensor densities of equal rank is again a tensor density
of the same rank. The product of a tensor and a tensor density is a
tensor density. The product of two tensor densities is a tensor. The
contraction of a tensor density yields a new tensor density of lower rank.
The derivatives of the components of a tensor density are the
components of a new tensor density, the rank of which is greater by 1 than
the rank of the original density.

The tensor density of Levi-Civita. We found that the Kronecker
symbol is a tensor, the components of which take the same constant
values in every coordinate system. Likewise, there exists a constant
tensor density of rank 3, the Levi-Civita tensor density, defined as
follows. $\delta_{ikl}$ is skewsymmetric in its three indices; therefore, all those
components which have at least two indices equal vanish. The values of the
nonvanishing components are $\pm 1$, the sign depending on whether $(i, k, l)$
is an even or an odd permutation of $(1, 2, 3)$.

We have yet to show that the $\delta_{ikl}$ are really the components of a tensor
density. To do that, let us consider a tensor density $D_{ikl}$ which has the
components $\delta_{ikl}$ in one coordinate system. If it turns out that its
components in some other coordinate system are again $\delta_{ikl}$, our assertion
is proved.

The components of $D_{ikl}$ in another coordinate system are

$$D'_{mns} = |c_{ab}|\,c_{mi}c_{nk}c_{sl}\,\delta_{ikl}.$$

As the skewsymmetry of $D_{ikl}$ is preserved by the coordinate
transformation, we know that all components $D'_{mns}$ with at least two equal
indices vanish. We have to compute only components with all three
indices different. The component $D'_{123}$ is given by the expression

$$D'_{123} = |c_{ab}|\,c_{1i}c_{2k}c_{3l}\,\delta_{ikl}.$$

The right-hand side is simply the square of $|c_{ab}|$, and equal to unity.
For $\delta_{ikl}$ is defined so that $c_{1i}c_{2k}c_{3l}\delta_{ikl}$ is just the determinant $|c_{ab}|$.

Now that we know that $D'_{123}$ is equal to unity, the remaining
components are obtained simply by using the symmetry properties.
They are

$$D'_{123} = D'_{231} = D'_{312} = -D'_{132} = -D'_{213} = -D'_{321} = 1.$$

In other words, the $D'_{mns}$ are again equal to $\delta_{mns}$, and the proof is
completed.
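
The steps of this proof can be retraced numerically. In the sketch below
the indices run from 0 to 2 instead of 1 to 3; the function names, the
closed formula used for the components, and the two sample transformations
are assumptions of the example.

```python
import math

def levi_civita(i, k, l):
    """delta_ikl for indices 0, 1, 2: +1 (even), -1 (odd), 0 (repeated index)."""
    return (i - k) * (k - l) * (l - i) // 2

def determinant(c):
    """|c_ab| written as c_1i c_2k c_3l delta_ikl (indices shifted to 0..2)."""
    r = range(3)
    return sum(c[0][i] * c[1][k] * c[2][l] * levi_civita(i, k, l)
               for i in r for k in r for l in r)

def transform_density(c, d):
    """D'_mns = |c_ab| c_mi c_nk c_sl D_ikl for a tensor density of rank 3."""
    r = range(3)
    det = determinant(c)
    return [[[det * sum(c[m][i] * c[n][k] * c[s][l] * d[i][k][l]
                        for i in r for k in r for l in r)
              for s in r] for n in r] for m in r]

def max_deviation(c, d):
    """Largest difference between the transformed and the original components."""
    r = range(3)
    d_new = transform_density(c, d)
    return max(abs(d_new[m][n][s] - d[m][n][s]) for m in r for n in r for s in r)

delta = [[[levi_civita(i, k, l) for l in range(3)] for k in range(3)]
         for i in range(3)]

co, si = math.cos(0.4), math.sin(0.4)
rotation = [[co, si, 0.0], [-si, co, 0.0], [0.0, 0.0, 1.0]]
with_reflection = [[-co, -si, 0.0], [-si, co, 0.0], [0.0, 0.0, 1.0]]

print(determinant(rotation), determinant(with_reflection))  # nearly +1 and -1
print(max_deviation(rotation, delta), max_deviation(with_reflection, delta))
```

In both cases the transformed components agree with the original ones up
to rounding; without the factor $|c_{ab}|$ the second transformation would
reverse every sign.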

Vector product and curl. With the help of the Levi-Oivita tensor 
density, we can associate skewsymmetrie tensors of rank 2 with vector 
densities : 

ft; = i&qjs&li! 



The converge relation is 

iPfti = Sii.-ro; . 

Applying eg. (5,13) to the vector product and to the curl, defined by 
eqs. (5.37) and (5.38), respectively, we obtain 

%h = fkkid&i , ( 5 - 37b ) 

Because these two vectps densities |& and £,■ transform, like vectors 
except for the change of sign in the case of coordinate reflections, they 
are treated as vectors in vector calculus, but they are referred to as 
"axial ;r vectors, implying that they have something to do with 

They really do have something to do with rotation. The angular
momentum, for instance, is the vector product of the radius vector and
the ordinary momentum,

$$\mathfrak{J}_i = \delta_{ikl} x_k p_l.$$

In the case of a reflection, it transforms as an ordinary vector would,
except for its sign. Assume that of the $x_k$ only $x_1$ does not vanish,
and that $p$ has only the component $p_2$. Then the angular momentum has
only the component $\mathfrak{J}_3$. We can carry out a reflection in three
different ways: We can replace $x_1$ by $(-x_1)$, or we can do the same
thing with $x_2$ or with $x_3$, the other two coordinates remaining
unchanged in every case. $\mathfrak{J}_3$ changes its sign in the first two
cases, and it remains unchanged when $x_3$ is replaced by $(-x_3)$. A
genuine vector would change its sign only when $x_3$ is replaced by
$(-x_3)$.
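
The behavior of the angular momentum under the three reflections can be checked numerically. The following is a minimal sketch in plain Python; the function names and the sample values of $x_k$ and $p_l$ are illustrative assumptions, not part of the text. It forms $\delta_{ikl} x_k p_l$ and prints, for each reflected axis, the ratio of the new third component to the old one.

```python
def levi_civita3(i, k, l):
    """Levi-Civita symbol for indices taken from 0, 1, 2."""
    return (i - k) * (k - l) * (l - i) // 2


def vector_product(x, p):
    """J_i = delta_ikl x_k p_l, summed over k and l."""
    return [sum(levi_civita3(i, k, l) * x[k] * p[l]
                for k in range(3) for l in range(3))
            for i in range(3)]


def reflect(v, axis):
    """Reverse the sign of one Cartesian component of v."""
    return [-c if n == axis else c for n, c in enumerate(v)]


x = [2.0, 0.0, 0.0]   # assumed sample: only x_1 differs from zero
p = [0.0, 3.0, 0.0]   # assumed sample: only p_2 differs from zero
J = vector_product(x, p)

for axis in range(3):
    J_reflected = vector_product(reflect(x, axis), reflect(p, axis))
    # ratio -1 means the component changed sign, +1 means it did not
    print("reflect x_%d:" % (axis + 1), J_reflected[2] / J[2])
```

A polar (genuine) vector with only a third component would show the opposite pattern: unchanged for the first two reflections and reversed for the third.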

Generalization. Now that we have reviewed briefly vector and tensor
calculus in three dimensions with respect to orthogonal transformations,
we are in a position to generalize the concepts obtained so that they
will be applicable to the problems we shall discuss later. The
generalization is to be carried out in two steps. First, we have to extend
the formalism, so that it applies to any positive integral number of
dimensions; second, we shall have to consider coordinate transformations
other than orthogonal transformations.

n dimensional continuum. The first generalization is almost trivial.
Instead of three coordinates $x_1$, $x_2$, $x_3$, we have $n$ coordinates,
$x_1 \cdots x_n$, describing an $n$ dimensional manifold. We assume, again,
that there exists an invariant distance between two points,

$$s^2 = \Delta x_i \, \Delta x_i, \qquad (5.2b)$$

where the summation is to be carried out over all $n$ values of the
index $i$. Eq. (5.2b) is invariant with respect to the group of $n$
dimensional, orthogonal transformations,

$$x'_i = c_{ik} x_k + \mathring{x}_i,$$

where the $c_{ik}$ have to satisfy the conditions

$$c_{ik} c_{il} = \delta_{kl}.$$




All indices take all values $1 \cdots n$, and summations are to be carried
out from 1 to $n$. The determinant $|c_{ab}|$ is again equal to $\pm 1$.
Vectors are defined by the transformation law

$$a'_k = c_{ki} a_i,$$

and their algebra and analysis are identical with the algebra and analysis
of three dimensional vectors.

Tensors and tensor densities are defined as in three dimensional
space, except that all indices run from 1 to $n$. $\delta_{ik}$ is again a
symmetric tensor. The Levi-Civita tensor density is defined as follows.
$\delta_{ik \cdots s}$ is a tensor density with $n$ indices (of rank $n$),
skewsymmetric in all of them. The nonvanishing components are $\pm 1$, the
sign depending on whether $(i, k, \cdots, s)$ is an even or an odd
permutation of $(1, 2, \cdots, n)$. The "vector product" is no longer a
vector density. With the help of the Levi-Civita tensor density, we can
form from a skewsymmetric tensor of rank $m$ ($m \le n$) a skewsymmetric
tensor density of rank $(n - m)$. Only when $n$ is 3 is the "conjugate"
tensor density to a tensor of rank 2 a vector density.
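
The definition of the $n$ dimensional Levi-Civita symbol translates directly into a parity count. The sketch below is an illustration in plain Python (the function names are assumptions made here); it returns $\pm 1$ for even or odd permutations of $(1, 2, \cdots, n)$ and 0 otherwise, and uses the symbol to evaluate a determinant as a sum over permutations, which is the $n$ dimensional analogue of the statement that $c_{1i} c_{2k} c_{3l} \delta_{ikl}$ is the determinant $|c_{ab}|$.

```python
from itertools import permutations
from math import prod


def levi_civita(*indices):
    """+1 / -1 for even / odd permutations of 1..n, 0 for anything else."""
    n = len(indices)
    if sorted(indices) != list(range(1, n + 1)):
        return 0                      # a repeated or out-of-range index
    sign = 1
    for a in range(n):
        for b in range(a + 1, n):
            if indices[a] > indices[b]:
                sign = -sign          # each inversion flips the parity
    return sign


def determinant(c):
    """|c_ab| = delta_{ik...s} c_{1i} c_{2k} ... c_{ns}, summed over all indices."""
    n = len(c)
    return sum(levi_civita(*perm) * prod(c[row][perm[row] - 1] for row in range(n))
               for perm in permutations(range(1, n + 1)))


print(levi_civita(1, 2, 3, 4), levi_civita(2, 1, 3, 4), levi_civita(1, 1, 3, 4))
print(determinant([[0, 1], [1, 0]]))
```

Only permutations contribute to the sum, so the loop runs over the $n!$ nonvanishing components rather than over all $n^n$ index combinations.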

General transformations. The "length" defined in the Minkowski
space, (5.1), does not have the form (5.2b). We shall, therefore, no
longer restrict ourselves to transformations which leave eq. (5.3b)
invariant but shall take up general coordinate transformations. Since
the Lorentz transformations are much less general than the coordinate
transformations which we are about to consider, it may appear that we
are deviating from our main purpose. But we shall need the general
coordinate transformations in the general theory of relativity; and
since they are as simple in most respects as the more restricted group of
Lorentz transformations, we shall thus avoid needless repetition.

Let us consider a space in which we can introduce Cartesian coordinate
systems so that the length is defined by eq. (5.2b). Then let us pass
from a Cartesian coordinate system to another coordinate system which
is not Cartesian. The new coordinates may be called $\xi^1, \xi^2, \cdots,
\xi^n$ (the superscripts are not to be mistaken for power exponents). We
have

$$\xi^i = f^i(x_1, \cdots, x_n), \qquad i = 1 \cdots n, \qquad (5.46)$$

where the $n$ functions $f^i$ are arbitrary, except that we shall assume
that their derivatives exist up to the order needed in any discussion;
that the Jacobian of the transformation,

$$\left| \frac{\partial \xi^i}{\partial x_k} \right|,$$

vanishes nowhere; and that the $\xi^i$ are real for all real values of the
$x_1 \cdots x_n$.

$s^2$ is not, in general, a quadratic form of the $\Delta \xi^i$, as it is
of the $\Delta x_i$. But the square of the distance between two
infinitesimally near points remains a quadratic form of the coordinate
differentials. In terms of Cartesian coordinates, this infinitesimal
distance is given by

$$ds^2 = dx_k \, dx_k, \qquad (5.47)$$

and $dx_k$ can be expressed in terms of the $d\xi^i$,

$$dx_k = \frac{\partial x_k}{\partial \xi^i} \, d\xi^i. \qquad (5.48)$$

Substitution into eq. (5.47) yields

$$ds^2 = \frac{\partial x_k}{\partial \xi^i} \frac{\partial x_k}{\partial \xi^l} \, d\xi^i \, d\xi^l. \qquad (5.49)$$

$ds^2$ is a quadratic form of the $d\xi^i$, regardless of the coordinate
system used. This suggests that the coordinate differentials $d\xi^i$ and
the distance differential $ds$ will, in the field of general coordinate
transformations, take the place of the coordinate differences $\Delta x_i$
and the distance $s$, which are adapted to Cartesian coordinates and
orthogonal transformations.
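
As a concrete instance of eq. (5.49), the coefficients of the quadratic form can be computed for plane polar coordinates. The sketch below is only an illustration: the choice of polar coordinates $\xi^1 = r$, $\xi^2 = \theta$ with $x_1 = r\cos\theta$, $x_2 = r\sin\theta$, and the function names, are assumptions made here and do not come from the text. The expected coefficients are 1 for $dr^2$ and $r^2$ for $d\theta^2$, with no mixed term.

```python
import math


def jacobian_polar(r, theta):
    """Entries jac[k][i] = d x_k / d xi^i for x_1 = r cos(theta), x_2 = r sin(theta)."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta),  r * math.cos(theta)]]


def quadratic_form_coefficients(jac):
    """Coefficients (dx_k/dxi^i)(dx_k/dxi^l) of d xi^i d xi^l, summed over k."""
    n = len(jac[0])
    return [[sum(row[i] * row[l] for row in jac) for l in range(n)]
            for i in range(n)]


g = quadratic_form_coefficients(jacobian_polar(2.0, 0.7))
for row in g:
    print([round(value, 12) for value in row])
```

The result does not depend on $\theta$, which reflects the rotational symmetry of the Euclidean plane.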

Vectors. Let us see how the coordinate differentials transform in the
case of a general coordinate transformation. Let $\xi^i$, $\xi'^k$ be two
sets of non-Cartesian coordinates. Then the coordinate differentials are
connected by the equations

$$d\xi'^i = \frac{\partial \xi'^i}{\partial \xi^k} \, d\xi^k. \qquad (5.50)$$

The transformed coordinate differentials $d\xi'^i$ are linear, homogeneous
functions of the $d\xi^k$; but the transformation coefficients
$(\partial \xi'^i / \partial \xi^k)$ are not constant, but functions of the
$\xi^k$. Neither is their determinant, $|\partial \xi'^a / \partial \xi^b|$,
a constant. We shall use them for the purpose of defining a type of
geometrical quantity, the "contravariant vector": A contravariant vector
has $n$ components, which transform like coordinate differentials,

$$a'^i = \frac{\partial \xi'^i}{\partial \xi^k} \, a^k. \qquad (5.51)$$



The sum or the difference of contravariant vectors, and the product of a 

contravariant vector and a scalar are also contravariant vectors. 
It is impossible to form scalar products of contravariant vectors alone. 



To find out what corresponds to the scalar product in our formalism,
we shall consider a scalar field $V(\xi^1, \cdots, \xi^n)$. The change of
$V$ along an infinitesimal displacement $\delta \xi^i$ is given by

$$\delta V = V_{,i} \, \delta \xi^i, \qquad V_{,i} = \frac{\partial V}{\partial \xi^i}. \qquad (5.52)$$

The left-hand side is obviously invariant. The right-hand side has the
form of an inner product; one factor is the contravariant vector
$\delta \xi^i$, the other is the gradient of $V$, $V_{,i}$.

The components of the gradient of $V$ transform according to the law

$$V_{,i'} = \frac{\partial \xi^k}{\partial \xi'^i} \, V_{,k}. \qquad (5.53)$$

The $V_{,i'}$ are linear, homogeneous functions of the $V_{,k}$. The
transformation law (5.53) is not that of a contravariant vector. We call
the gradient of a scalar field a covariant vector and define in general a
covariant vector as a set of $n$ quantities which transform according to
the law

$$a'_i = \frac{\partial \xi^k}{\partial \xi'^i} \, a_k. \qquad (5.54)$$


The sum or difference of covariant vectors, and the product of a co- 
variant vector and a scalar are, again, covariant vectors. 

In order to distinguish between contravariant and covariant vectors,
we shall always write contravariant vectors with superscripts, and
covariant vectors with subscripts.

The transformation coefficients of contravariant and covariant vectors
are different, but they are related. The $(\partial \xi'^i / \partial \xi^k)$
of eq. (5.51) and the $(\partial \xi^k / \partial \xi'^i)$ of eq. (5.54) are
connected by the $n^2$ equations

$$\frac{\partial \xi'^i}{\partial \xi^k} \frac{\partial \xi^l}{\partial \xi'^i} = \delta_k^l, \qquad (5.55)$$

where $\delta_k^l$ is, again, the Kronecker symbol. Because of eq. (5.55),
the inner product of a covariant and a contravariant vector is an
invariant,

$$a'_i b'^i = a_i b^i.$$
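
The invariance of the inner product can be verified with a small numerical example. The sketch below assumes, for simplicity, a linear but non-orthogonal change of coordinates in two dimensions, so that the coefficients $\partial \xi'^i / \partial \xi^k$ are the constant entries of a matrix; the matrix, the sample vectors, and the function names are illustrative assumptions.

```python
def inverse_2x2(m):
    """Inverse of a 2 x 2 matrix with nonvanishing determinant."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def transform_contravariant(jac, vec):
    """a'^i = (d xi'^i / d xi^k) a^k, with jac[i][k] = d xi'^i / d xi^k."""
    return [sum(jac[i][k] * vec[k] for k in range(2)) for i in range(2)]


def transform_covariant(jac, vec):
    """a'_i = (d xi^k / d xi'^i) a_k, using the inverse coefficients."""
    inv = inverse_2x2(jac)            # inv[k][i] = d xi^k / d xi'^i
    return [sum(inv[k][i] * vec[k] for k in range(2)) for i in range(2)]


def inner(cov, contra):
    """a_i b^i, summed over i."""
    return sum(c * v for c, v in zip(cov, contra))


jac = [[2.0, 1.0], [0.0, 3.0]]        # assumed sample coefficients
a_cov = [1.0, -2.0]
b_contra = [0.5, 4.0]

before = inner(a_cov, b_contra)
after = inner(transform_covariant(jac, a_cov),
              transform_contravariant(jac, b_contra))
print(before, after)
```

Transforming both vectors with the same coefficients (both as contravariant vectors, say) would in general change the sum, which is why the two kinds of vector must be kept apart.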


Let us return for a moment to the orthogonal transformations. Their
transformation coefficients, $c_{ik}$, satisfy the equations (5.7b) and
(5.10a). In the case of orthogonal transformations, the coordinate
transformation derivatives are

$$\frac{\partial x'_i}{\partial x_k} = c_{ik};$$

and, because eq. (5.55) holds for all transformations, it follows from eq.
(5.7b) that $(\partial x_k / \partial x'_i)$, too, is

$$\frac{\partial x_k}{\partial x'_i} = c_{ik}.$$

That is why the distinction between contravariant and covariant vectors
does not exist in the realm of orthogonal transformations.

Tensors. Tensors are defined as forms with $n^N$ components ($N$ being
the rank of the tensor), which transform with respect to each index like a
vector. They may be covariant in all indices, contravariant in all indices,
or mixed, that is, contravariant in some indices, covariant in others. The
contravariant indices are written as superscripts, the covariant indices
as subscripts. The example of a mixed tensor of rank three may illustrate
the definition:

$$t'^{\,i}{}_{kl} = \frac{\partial \xi'^i}{\partial \xi^m} \frac{\partial \xi^n}{\partial \xi'^k} \frac{\partial \xi^r}{\partial \xi'^l} \, t^{m}{}_{nr}. \qquad (5.58)$$

Symmetry properties of tensors are invariant with respect to coordinate
transformations only if they exist with respect to indices of the
same type (covariant or contravariant).

The Kronecker symbol is a mixed tensor,

$$\delta'^{\,i}_k = \frac{\partial \xi'^i}{\partial \xi^m} \frac{\partial \xi^n}{\partial \xi'^k} \, \delta^m_n = \frac{\partial \xi'^i}{\partial \xi^m} \frac{\partial \xi^m}{\partial \xi'^k} = \delta^i_k. \qquad (5.59)$$

The product of two tensors of ranks $M$ and $N$ is a tensor of rank
$(M + N)$, and every index of either factor keeps its character as a
contravariant or covariant index. Furthermore, just as the scalar
product of a covariant and a contravariant vector is a scalar, so any two
indices of different position can be used as a pair of dummies, and the
result is a lowering of the tensor rank by 2. Examples of such products
are:

«,- t . I.) . 

a ik . b. . . 

ait. b..i 

When the corresponding components of two tensors of equal rank are
added or subtracted, the sums or differences are components of a new
tensor, provided that the two original tensors have equal numbers of
indices of the same type.

These are the simple rules of tensor algebra. They indicate how new
quantities may be formed which transform according to laws of the
pattern (5.58).

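Metric tensor, Riemannian spaces. The expression which occurs in
eq. (5.49),

$$g_{il} = \frac{\partial x_k}{\partial \xi^i} \frac{\partial x_k}{\partial \xi^l},$$

transforms as a tensor when we change from a coordinate system $\xi^i$ to
$\xi'^m$,

$$g'_{mn} = \frac{\partial \xi^i}{\partial \xi'^m} \frac{\partial \xi^l}{\partial \xi'^n} \, g_{il}. \qquad (5.60)$$

In other words, $g_{il}$ is a covariant symmetric tensor of rank 2. It is
called the metric tensor.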

There exist "spaces" where it is not possible to introduce a Cartesian
coordinate system. Two dimensional "spaces" of that kind include the
surface of a sphere. If we introduce as a coordinate system the latitude
and longitude, $\varphi$ and $\vartheta$, it is possible to express the
distance between two infinitesimally near points on the spherical surface
in terms of their coordinate differentials,

$$ds^2 = R^2 (d\varphi^2 + \cos^2 \varphi \, d\vartheta^2).$$

In order to include such continuous manifolds within the scope of our
investigations, we shall consider spaces with a metric tensor, without
raising the question, for the time being, of whether a Cartesian
coordinate system can be introduced or not. Whenever a "squared
infinitesimal distance," that is, an invariant homogeneous quadratic
function of the coordinate differentials, is defined, we call the manifold
a "metric space" or a "Riemannian space." If it is possible to introduce
in a Riemannian space a coordinate system with respect to which the
components of the metric tensor assume the values $\delta_{ik}$ at every
point, such a coordinate system is a Cartesian coordinate system, and the
space is called a Euclidean space. Euclidean spaces are, therefore, a
special case of Riemannian spaces.

Whenever the infinitesimal distance is given by equations of the form

$$ds^2 = g_{il} \, d\xi^i \, d\xi^l,$$

where $ds^2$ is an invariant, $g_{il}$ is a covariant tensor. Our previous
proof was based on the assumption that the $g_{il}$ were given by the
expression $\dfrac{\partial x_k}{\partial \xi^i} \dfrac{\partial x_k}{\partial \xi^l}$.
In other words, we implied the possibility of introducing Cartesian
coordinates. To show that the transformation properties of $g_{il}$
are independent of this assumption, we shall consider the equation

$$g_{il} \, d\xi^i \, d\xi^l = g'_{mn} \, d\xi'^m \, d\xi'^n, \qquad (5.62)$$

which expresses the invariance of $ds^2$. When we replace the $d\xi^i$ on
the left-hand side by $(\partial \xi^i / \partial \xi'^m) \, d\xi'^m$, we
obtain

$$g_{il} \frac{\partial \xi^i}{\partial \xi'^m} \frac{\partial \xi^l}{\partial \xi'^n} \, d\xi'^m \, d\xi'^n = g'_{mn} \, d\xi'^m \, d\xi'^n.$$

Because the $d\xi'^m$ are arbitrary, it follows that the coefficients on
both sides are equal, that is, eq. (5.60) is satisfied.

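If the determinant of the components of $g_{ik}$ does not vanish, it is
possible to define new quantities $g^{kl}$ by the equations

$$g_{ik} \, g^{kl} = \delta_i^l. \qquad (5.64)$$

In order to determine the transformation law of these quantities, we
transform first the $g_{ik}$. We replace them by the expression

$$g_{ik} = \frac{\partial \xi'^m}{\partial \xi^i} \frac{\partial \xi'^n}{\partial \xi^k} \, g'_{mn},$$

so that we get

$$\frac{\partial \xi'^m}{\partial \xi^i} \frac{\partial \xi'^n}{\partial \xi^k} \, g'_{mn} \, g^{kl} = \delta_i^l;$$

then multiply the latter equation by
$\dfrac{\partial \xi^i}{\partial \xi'^r} \dfrac{\partial \xi'^s}{\partial \xi^l}$.
We know from eq. (5.59) that the right-hand side becomes $\delta_r^s$; and
the left-hand side becomes

$$\frac{\partial \xi^i}{\partial \xi'^r} \frac{\partial \xi'^m}{\partial \xi^i} \frac{\partial \xi'^n}{\partial \xi^k} \frac{\partial \xi'^s}{\partial \xi^l} \, g'_{mn} \, g^{kl} = \delta_r^m \, g'_{mn} \frac{\partial \xi'^n}{\partial \xi^k} \frac{\partial \xi'^s}{\partial \xi^l} \, g^{kl} = g'_{rn} \frac{\partial \xi'^n}{\partial \xi^k} \frac{\partial \xi'^s}{\partial \xi^l} \, g^{kl},$$

so that we get

$$g'_{rn} \frac{\partial \xi'^n}{\partial \xi^k} \frac{\partial \xi'^s}{\partial \xi^l} \, g^{kl} = \delta_r^s. \qquad (5.66)$$

By comparing eq. (5.66) with eq. (5.64), we find that

$$g'^{ns} = \frac{\partial \xi'^n}{\partial \xi^k} \frac{\partial \xi'^s}{\partial \xi^l} \, g^{kl},$$

that is, the $g^{kl}$ are the components of a contravariant tensor. This
tensor is symmetric. We can show this by multiplying eq. (5.64) by
$g_{ls} g^{ir}$. The left-hand side becomes

$$g_{ik} g^{kl} g_{ls} g^{ir} = (g_{ki} g^{ir}) \, g^{kl} g_{ls} = \delta_k^r \, g^{kl} g_{ls} = g^{rl} g_{ls},$$

while the right-hand side is equal to

$$\delta_i^l \, g_{ls} \, g^{ir} = g_{is} g^{ir} = g_{si} g^{ir} = \delta_s^r.$$

We obtain

$$g^{rl} g_{ls} = \delta_s^r.$$

Comparing this equation with eq. (5.64), we find that

$$g^{rl} = g^{lr}.$$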

$g^{kl}$ is called the contravariant metric tensor. The values of its
components are equal to the minors of the $g_{kl}$, divided by the full
determinant $g = |g_{ab}|$,

$$g^{kl} = g^{-1} \cdot (\text{minor of } g_{kl}). \qquad (5.69)$$

In the case of Cartesian coordinate systems, $g^{kl}$ equals $\delta_{kl}$.
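
Eq. (5.69) can be turned into a short computation. The sketch below assumes that "minor" is to be read as the signed minor (cofactor), which is what the defining relation (5.64) requires; the function names and the sample metric are illustrative assumptions. Exact rational arithmetic is used so that the check $g_{ik} g^{kl} = \delta_i^l$ holds without rounding.

```python
from fractions import Fraction


def submatrix(m, row, col):
    """Matrix m with one row and one column struck out."""
    return [[m[r][c] for c in range(len(m)) if c != col]
            for r in range(len(m)) if r != row]


def det(m):
    """Determinant by expansion along the first row (fine for small n)."""
    if not m:
        return 1
    return sum((-1) ** c * m[0][c] * det(submatrix(m, 0, c))
               for c in range(len(m)))


def contravariant_metric(g):
    """g^{kl} = (signed minor of g_{lk}) / |g_ab|."""
    n = len(g)
    full = det(g)
    return [[(-1) ** (k + l) * det(submatrix(g, l, k)) / full
             for l in range(n)] for k in range(n)]


# assumed sample: a symmetric 3 x 3 metric with nonvanishing determinant
g = [[Fraction(v) for v in row] for row in [[2, 1, 0], [1, 3, 1], [0, 1, 4]]]
g_inv = contravariant_metric(g)
product = [[sum(g[i][k] * g_inv[k][l] for k in range(3)) for l in range(3)]
           for i in range(3)]
print(product)
```

For larger $n$ a cofactor expansion becomes expensive, and an elimination method would be the more practical choice; the expansion is used here only because it mirrors eq. (5.69).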

Raising and lowering of indices. A covariant vector can be obtained
from a contravariant vector by multiplying it by the covariant metric
tensor and summing over a pair of indices,

$$a_i = g_{ik} \, a^k. \qquad (5.70)$$

This process can be reversed by multiplying $a_i$ by the contravariant
metric tensor,

$$a^k = g^{ki} \, a_i. \qquad (5.71)$$

From the definition of the contravariant metric tensor, (5.64), it follows
that eq. (5.71) is equivalent to eq. (5.70); in other words, eq. (5.71)
leads back to the same contravariant vector which appears in eq. (5.70).
The two vectors $a_i$ and $a^i$ can, therefore, be properly considered as
two equivalent representations of the same geometrical object. The
operations (5.70) and (5.71) are called lowering and raising of indices.
It is possible, of course, to raise and lower any tensor index in the same
way. The norm of a vector is defined as the scalar

$$a^2 = g_{ik} a^i a^k = g^{ik} a_i a_k = a^i a_i, \qquad (5.72)$$

while the scalar product of two vectors can be written in the alternative
forms

$$a_i b^i = a^i b_i = g_{ik} a^i b^k = g^{ik} a_i b_k.$$
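
A short numerical illustration of lowering and raising an index follows. It assumes the metric of plane polar coordinates, $g_{11} = 1$, $g_{22} = r^2$, evaluated at $r = 2$; this choice, the sample vector, and the function names are assumptions made for the example only.

```python
def lower_index(g, a_contra):
    """a_i = g_ik a^k."""
    n = len(g)
    return [sum(g[i][k] * a_contra[k] for k in range(n)) for i in range(n)]


def raise_index(g_inv, a_cov):
    """a^k = g^{ki} a_i."""
    n = len(g_inv)
    return [sum(g_inv[k][i] * a_cov[i] for i in range(n)) for k in range(n)]


def norm_squared(g, a_contra):
    """a^2 = g_ik a^i a^k."""
    n = len(g)
    return sum(g[i][k] * a_contra[i] * a_contra[k]
               for i in range(n) for k in range(n))


r = 2.0
g = [[1.0, 0.0], [0.0, r * r]]            # covariant metric, polar coordinates
g_inv = [[1.0, 0.0], [0.0, 1.0 / (r * r)]]  # contravariant metric

a_contra = [3.0, 0.5]
a_cov = lower_index(g, a_contra)
print(a_cov)                               # covariant components
print(raise_index(g_inv, a_cov))           # back to the contravariant ones
print(norm_squared(g, a_contra),
      sum(lo * up for lo, up in zip(a_cov, a_contra)))
```

The last line evaluates the norm in two of the forms given in eq. (5.72); both must agree.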


Tensor densities, Levi-Civita tensor density. We call a tensor density
a quantity which transforms according to the law

$$\mathfrak{T}'^{\,i \cdots}{}_{k \cdots} = \left| \frac{\partial \xi^a}{\partial \xi'^b} \right|^{W} \frac{\partial \xi'^i}{\partial \xi^m} \cdots \frac{\partial \xi^n}{\partial \xi'^k} \cdots \, \mathfrak{T}^{m \cdots}{}_{n \cdots}.$$

$W$ is a constant the value of which is characteristic for any given tensor
density; this constant is called the weight of the tensor density. Tensors
are tensor densities of weight zero. Depending on the number of indices,
we speak also of scalar densities and vector densities.

The sum of two densities with the same numbers of like indices and the
same weight is a density with the same characteristics. In the
multiplication, the weights are added.

The Levi-Civita symbols $\delta_{i_1 \cdots i_n}$ and
$\delta^{i_1 \cdots i_n}$ are densities of weight $(-1)$ and $(+1)$,
respectively ($n$ is the number of dimensions). The proof is simple and
analogous to the one given for orthogonal transformations.

The determinant of the covariant metric tensor,

$$g = |g_{ik}|,$$

is a scalar density of weight 2.

We shall have very little occasion to work with tensor densities. 

Tensor analysis.¹ The consideration of tensor densities completes the
discussion of tensor algebra as far as it is needed in this book. As for
tensor analysis, we have already found that the ordinary derivatives of
a scalar field are the components of a covariant vector field. In general,
however, the derivatives of a tensor field do not form a new tensor field.

Let us take the derivative of a vector. The derivative compares the
value of a vector at one point with its value at another infinitesimally
near point, in a given direction. In the case of a coordinate
transformation, the vectors at the two points do not transform with the
same transformation coefficients, for the coefficients of the
transformation are themselves functions of the coordinates. Therefore, the
derivatives of the transformation coefficients enter into the
transformation law of the derivatives of the vector.

However, there is a way which enables us to obtain new tensors by
differentiation. The method is suggested by our experience in the realm
of Cartesian coordinates. There we can describe the derivative of a
vector or tensor thus: The vector is first carried to the "neighboring"
point without changing the values of its components: it is displaced
parallel to itself. (As long as we use Cartesian coordinates, that
statement has an invariant meaning, because the transformation coefficients
are the same at both points.) Then this parallel displaced vector is
compared with the value of the vector (as a function of the coordinates)
at the same point. The difference is given by $a_{i,k} \, \delta x_k$. If
it were possible to define "the same vector" or "the parallel vector" at a
neighboring point, the difference between the parallel displaced vector
and the actual vector at a neighboring point would be subject only to the
transformation law at that point.

¹ Tensor analysis with respect to general coordinates is discussed here
because it is necessary to an understanding of the general theory of
relativity. It is not needed anywhere in the special theory of relativity,
and may be omitted by those who are not interested in the general theory
of relativity.

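A definition of parallel displacement is actually possible in a
comparatively simple way. Of course, the value of the displaced vector
depends on the original vector itself and on the direction of
displacement. Let us first consider a Euclidean space, where we can
introduce a Cartesian coordinate system. With respect to such a coordinate
system, the law of parallel displacement takes the form

$$a_{i,k} \, \delta x_k = 0, \qquad (5.76)$$

where $\delta x_k$ represents the infinitesimal displacement. Let us now
introduce an arbitrary coordinate transformation (5.46). The vector
components with respect to that new coordinate system may be denoted
by a prime. Then we have

$$a_{i,k} = \frac{\partial \xi^s}{\partial x_k} \frac{\partial}{\partial \xi^s} \left( \frac{\partial \xi^r}{\partial x_i} \, a'_r \right) = \frac{\partial \xi^s}{\partial x_k} \frac{\partial \xi^r}{\partial x_i} \frac{\partial a'_r}{\partial \xi^s} + \frac{\partial^2 \xi^r}{\partial x_i \, \partial x_k} \, a'_r.$$

$\delta x_k$ transforms according to eq. (5.48), and we obtain

$$0 = a_{i,k} \, \delta x_k = \left( \frac{\partial \xi^r}{\partial x_i} \frac{\partial a'_r}{\partial \xi^s} + \frac{\partial x_k}{\partial \xi^s} \frac{\partial^2 \xi^r}{\partial x_i \, \partial x_k} \, a'_r \right) \delta \xi^s. \qquad (5.76a)$$

$\dfrac{\partial a'_r}{\partial \xi^s} \, \delta \xi^s$ is the actual
increment of $a'_r$ as a result of the displacement, and shall be denoted
by $\delta a'_r$. Multiplying the right-hand side of eq. (5.76a) by
$\dfrac{\partial x_i}{\partial \xi^l}$, we get finally

$$\delta a'_l = - \frac{\partial x_i}{\partial \xi^l} \frac{\partial x_k}{\partial \xi^s} \frac{\partial^2 \xi^r}{\partial x_i \, \partial x_k} \, a'_r \, \delta \xi^s.$$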

When no Cartesian coordinate system can be introduced, we shall
retain the linear form of the last equation and assume that, because of
a parallel displacement, the infinitesimal changes of the vector
components are bilinear functions of the vector components and the
components of the infinitesimal displacement,

$$\delta a^i = - \overset{\mathrm{I}}{\Gamma}{}^{i}_{kl} \, a^k \, \delta \xi^l, \qquad (5.78)$$

$$\delta a_k = + \overset{\mathrm{II}}{\Gamma}{}^{i}_{kl} \, a_i \, \delta \xi^l. \qquad (5.79)$$

The coefficients $\overset{\mathrm{I}}{\Gamma}{}^{i}_{kl}$ and
$\overset{\mathrm{II}}{\Gamma}{}^{i}_{kl}$ of these new tentative laws are,
so far, entirely unknown quantities. But we can determine their
transformation laws. $\delta a^i$ is the difference between two vectors at
two points, characterized by the coordinate values $\xi^i$ and
$\xi^i + \delta \xi^i$. In the case of a coordinate transformation, the
new $\delta a'^k$ are given by the following expression:

$$\delta a'^k = \left( \frac{\partial \xi'^k}{\partial \xi^i} \, a^i \right)_{\xi + \delta \xi} - \left( \frac{\partial \xi'^k}{\partial \xi^i} \, a^i \right)_{\xi} = \frac{\partial^2 \xi'^k}{\partial \xi^i \, \partial \xi^l} \, a^i \, \delta \xi^l + \frac{\partial \xi'^k}{\partial \xi^i} \, \delta a^i. \qquad (5.80)$$

As stated before, $\delta a^i$ does not transform like a vector; this was
the original source of our difficulty.

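We substitute the expression (5.80) into the left-hand side of the
equation

$$\delta a'^k = - \overset{\mathrm{I}}{\Gamma}{}'^{\,k}_{mn} \, a'^m \, \delta \xi'^n$$

and replace $a'^m$ and $\delta \xi'^n$ by the expressions from eqs. (5.51)
and (5.50). We obtain

$$\frac{\partial^2 \xi'^k}{\partial \xi^i \, \partial \xi^l} \, a^i \, \delta \xi^l + \frac{\partial \xi'^k}{\partial \xi^i} \, \delta a^i = - \overset{\mathrm{I}}{\Gamma}{}'^{\,k}_{mn} \frac{\partial \xi'^m}{\partial \xi^i} \frac{\partial \xi'^n}{\partial \xi^l} \, a^i \, \delta \xi^l.$$

Substituting $\delta a^i$ from eq. (5.78), we get

$$\left( \frac{\partial^2 \xi'^k}{\partial \xi^i \, \partial \xi^l} - \frac{\partial \xi'^k}{\partial \xi^r} \, \overset{\mathrm{I}}{\Gamma}{}^{r}_{il} \right) a^i \, \delta \xi^l = - \overset{\mathrm{I}}{\Gamma}{}'^{\,k}_{mn} \frac{\partial \xi'^m}{\partial \xi^i} \frac{\partial \xi'^n}{\partial \xi^l} \, a^i \, \delta \xi^l.$$

As both $a^i$ and $\delta \xi^l$ are arbitrary quantities, their
coefficients on both sides must be equal,

$$\frac{\partial \xi'^m}{\partial \xi^i} \frac{\partial \xi'^n}{\partial \xi^l} \, \overset{\mathrm{I}}{\Gamma}{}'^{\,k}_{mn} = \frac{\partial \xi'^k}{\partial \xi^r} \, \overset{\mathrm{I}}{\Gamma}{}^{r}_{il} - \frac{\partial^2 \xi'^k}{\partial \xi^i \, \partial \xi^l}.$$

The transformation law of the $\overset{\mathrm{I}}{\Gamma}{}^{r}_{il}$ is
obtained by multiplication by
$\dfrac{\partial \xi^i}{\partial \xi'^a} \dfrac{\partial \xi^l}{\partial \xi'^b}$,

$$\overset{\mathrm{I}}{\Gamma}{}'^{\,k}_{ab} = \frac{\partial \xi^i}{\partial \xi'^a} \frac{\partial \xi^l}{\partial \xi'^b} \left( \frac{\partial \xi'^k}{\partial \xi^r} \, \overset{\mathrm{I}}{\Gamma}{}^{r}_{il} - \frac{\partial^2 \xi'^k}{\partial \xi^i \, \partial \xi^l} \right). \qquad (5.81)$$

The last term on the right-hand side can be written in a slightly different
way, by shifting the second derivative. It is

$$- \frac{\partial \xi^i}{\partial \xi'^a} \frac{\partial \xi^l}{\partial \xi'^b} \frac{\partial^2 \xi'^k}{\partial \xi^i \, \partial \xi^l} = - \frac{\partial \xi^i}{\partial \xi'^a} \frac{\partial}{\partial \xi'^b} \left( \frac{\partial \xi'^k}{\partial \xi^i} \right) = - \frac{\partial}{\partial \xi'^b} \left( \frac{\partial \xi^i}{\partial \xi'^a} \frac{\partial \xi'^k}{\partial \xi^i} \right) + \frac{\partial \xi'^k}{\partial \xi^i} \frac{\partial^2 \xi^i}{\partial \xi'^a \, \partial \xi'^b}.$$

The first term on the right-hand side vanishes, because the parenthesis
in that term is constant. Eq. (5.81) thus becomes

$$\overset{\mathrm{I}}{\Gamma}{}'^{\,k}_{ab} = \frac{\partial \xi'^k}{\partial \xi^r} \left( \frac{\partial \xi^i}{\partial \xi'^a} \frac{\partial \xi^l}{\partial \xi'^b} \, \overset{\mathrm{I}}{\Gamma}{}^{r}_{il} + \frac{\partial^2 \xi^r}{\partial \xi'^a \, \partial \xi'^b} \right). \qquad (5.82)$$

By carrying out the same computation with eq. (5.79), we obtain the
transformation law for $\overset{\mathrm{II}}{\Gamma}{}^{k}_{ab}$. It is
identical with that of $\overset{\mathrm{I}}{\Gamma}{}^{k}_{ab}$.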
i 11 

We can now subject $\overset{\mathrm{I}}{\Gamma}{}^{k}_{ab}$ and
$\overset{\mathrm{II}}{\Gamma}{}^{k}_{ab}$ to conditions which are compatible
with their transformation law. The transformation law consists of
two terms. One term depends on the $\Gamma^{k}_{ab}$ in the old coordinate
system and has exactly the same form as the transformation law of a tensor.
The second term does not depend on the $\Gamma^{k}_{ab}$, and adds an
expression which is symmetric in the two subscripts. So, even though the
$\Gamma^{k}_{ab}$ may vanish in one coordinate system, they do not vanish
in other systems. But if the $\Gamma^{k}_{ab}$ were symmetric in their
subscripts in one coordinate system, they would be symmetric in every
other coordinate system as well. This would be particularly true if the
$\Gamma^{k}_{ab}$ were to vanish in one system. Furthermore, if the
$\overset{\mathrm{I}}{\Gamma}{}^{k}_{ab}$ were equal to the
$\overset{\mathrm{II}}{\Gamma}{}^{k}_{ab}$ in one coordinate system, this
equality would be preserved by arbitrary coordinate transformations. We
shall find that geometrical considerations lead us to treat only systems
of $\Gamma^{k}_{ab}$ which satisfy both these conditions.

Let us displace two vectors $a_i$ and $b^i$ parallel to themselves along
an infinitesimal path, $\delta \xi^l$. The change of their scalar product,
$a_i b^i$, is given by

$$\delta (a_i b^i) = a_i \, \delta b^i + b^i \, \delta a_i = a_i b^k \left( \overset{\mathrm{II}}{\Gamma}{}^{i}_{kl} - \overset{\mathrm{I}}{\Gamma}{}^{i}_{kl} \right) \delta \xi^l.$$

When two vectors are displaced parallel to themselves, their scalar product
always remains constant if and only if the
$\overset{\mathrm{I}}{\Gamma}{}^{i}_{kl}$ are equal to the corresponding
$\overset{\mathrm{II}}{\Gamma}{}^{i}_{kl}$.

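Actually, the assumption that the two types of $\Gamma^{i}_{kl}$ are equal
is strongly suggested not only because in Cartesian coordinates the inner
product of two constant vectors is itself a constant, but because of
another consideration which does not refer to Cartesian coordinates.

By extending the law or definition of "parallel" displacement (5.78),
(5.79) from vectors to tensors, we can displace any tensor "parallel"
to itself according to the rule

$$\delta t^{i}{}_{kl} = \left( \overset{\mathrm{II}}{\Gamma}{}^{r}_{ks} \, t^{i}{}_{rl} + \overset{\mathrm{II}}{\Gamma}{}^{r}_{ls} \, t^{i}{}_{kr} - \overset{\mathrm{I}}{\Gamma}{}^{i}_{rs} \, t^{r}{}_{kl} \right) \delta \xi^s. \qquad (5.84)$$

This rule is derived from the postulate that the "parallel displacement"
of a product is given by the same law that applies to the differentiation
of products,

$$\delta (abc) = ab \, \delta c + ac \, \delta b + bc \, \delta a. \qquad (5.85)$$

Applying the law (5.84) to the parallel displacement of the Kronecker
tensor, we obtain

$$\delta (\delta^i_k) = \left( \overset{\mathrm{II}}{\Gamma}{}^{r}_{ks} \, \delta^i_r - \overset{\mathrm{I}}{\Gamma}{}^{i}_{rs} \, \delta^r_k \right) \delta \xi^s = \left( \overset{\mathrm{II}}{\Gamma}{}^{i}_{ks} - \overset{\mathrm{I}}{\Gamma}{}^{i}_{ks} \right) \delta \xi^s. \qquad (5.86)$$

Now apply eq. (5.86) and the product rule (5.85) to the "parallel
displacement" of the product $a^i \delta^k_i$. The result is

$$\delta (a^i \delta^k_i) = \delta^k_i \, \delta a^i + a^i \, \delta (\delta^k_i) = \delta a^k + a^i \, \delta (\delta^k_i).$$

On the other hand, the product $a^i \delta^k_i$ is equal to $a^k$. We have,
therefore,

$$\delta a^k = \delta a^k + a^i \, \delta (\delta^k_i).$$

Therefore, $\delta (\delta^k_i)$ must vanish. Accordingly, we have

$$\overset{\mathrm{II}}{\Gamma}{}^{i}_{kl} = \overset{\mathrm{I}}{\Gamma}{}^{i}_{kl}.$$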


Henceforth, we shall omit the distinguishing marks I and II.

As mentioned before, the $\Gamma^{i}_{kl}$ are symmetric in their
subscripts if it is possible to introduce a coordinate system in which
they vanish at least locally. From now on, let it be understood that we
shall consider only symmetric $\Gamma^{i}_{kl}$. The $\Gamma^{i}_{kl}$ are
still, to a high degree, arbitrary. They are, however, uniquely determined
if we connect them with the metric tensor $g_{ik}$ by the following
condition: The result of the parallel displacement of a vector $a$ shall
not depend on whether we apply the law of parallel displacement (5.78) to
its contravariant representation or the law (5.79) to its covariant
representation. The two representations $a^i$ and $a_k$ have, at the point
$(\xi^s + \delta \xi^s)$, the components $(a^i + \delta a^i)$ and
$(a_k + \delta a_k)$, respectively, where $\delta a^i$ and $\delta a_k$ are
given by equations (5.78) and (5.79). That these two vectors are again to
be representations of the same vector $(a + \delta a)$, at the point
$(\xi^s + \delta \xi^s)$, is expressed by the equation

$$a_k + \delta a_k = (g_{ik} + \delta g_{ik})(a^i + \delta a^i), \qquad (5.88)$$

where $\delta g_{ik}$ is

$$\delta g_{ik} = g_{ik,s} \, \delta \xi^s.$$

Eq. (5.88) must be satisfied up to linear terms in the differentials and
for arbitrary $a^i$ and $\delta \xi^s$. If we multiply out the right-hand
side of eq. (5.88), we obtain

$$\delta a_k = g_{ik} \, \delta a^i + a^i \, g_{ik,s} \, \delta \xi^s.$$

Substituting $\delta a_k$ and $\delta a^i$ from eqs. (5.78) and (5.79), we
obtain

$$a^i \, \delta \xi^s \left( g_{rk} \Gamma^{r}_{is} + g_{ri} \Gamma^{r}_{ks} - g_{ik,s} \right) = 0. \qquad (5.88a)$$

$a^i$ and $\delta \xi^s$ are arbitrary; therefore, the contents of the
parenthesis vanish.

Now we make use of the symmetry condition and write the vanishing
bracket down three times, with different index combinations:

$$g_{rk} \Gamma^{r}_{is} + g_{ri} \Gamma^{r}_{ks} - g_{ik,s} = 0,$$
$$g_{ri} \Gamma^{r}_{sk} + g_{rs} \Gamma^{r}_{ik} - g_{si,k} = 0,$$
$$g_{rs} \Gamma^{r}_{ki} + g_{rk} \Gamma^{r}_{si} - g_{ks,i} = 0.$$

We subtract the first of these equations from the sum of the other
two equations. Several terms cancel, and we obtain the equation

$$g_{rs} \Gamma^{r}_{ik} = \tfrac{1}{2} (g_{is,k} + g_{ks,i} - g_{ik,s}). \qquad (5.89)$$

We multiply this equation by $g^{ls}$ to obtain the final expression for
$\Gamma^{l}_{ik}$,

$$\Gamma^{l}_{ik} = \tfrac{1}{2} \, g^{ls} (g_{is,k} + g_{ks,i} - g_{ik,s}).$$

This expression is usually referred to as the Christoffel three-index
symbol of the second kind, and it is denoted by the symbol

$$\begin{Bmatrix} l \\ ik \end{Bmatrix} = \tfrac{1}{2} \, g^{ls} (g_{is,k} + g_{ks,i} - g_{ik,s}).$$

The left-hand side of eq. (5.89) is called the Christoffel symbol of the
first kind. It is denoted by the sign $[ik, s]$,

$$[ik, s] = \tfrac{1}{2} (g_{is,k} + g_{ks,i} - g_{ik,s}). \qquad (5.89a)$$

In the case of Cartesian coordinates, both kinds of Christoffel
three-index symbols vanish.
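
The Christoffel symbols can be evaluated numerically from a given metric. The following sketch is an illustration only: it assumes plane polar coordinates $(r, \theta)$ with $g_{11} = 1$, $g_{22} = r^2$, approximates the derivatives $g_{ik,s}$ by central differences, and uses function names chosen here. In these coordinates the nonvanishing symbols of the second kind are known to be $-r$ (upper index $r$, lower indices $\theta\theta$) and $1/r$ (upper index $\theta$, lower indices $r\theta$), although the space is Euclidean.

```python
def metric_polar(xi):
    """Covariant metric of plane polar coordinates xi = (r, theta)."""
    r = xi[0]
    return [[1.0, 0.0], [0.0, r * r]]


def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def metric_derivative(metric, xi, s, h=1e-5):
    """Central-difference estimate of g_{ik,s} at the point xi."""
    up, down = list(xi), list(xi)
    up[s] += h
    down[s] -= h
    g_up, g_down = metric(up), metric(down)
    n = len(xi)
    return [[(g_up[i][k] - g_down[i][k]) / (2 * h) for k in range(n)]
            for i in range(n)]


def christoffel_first_kind(metric, xi):
    """first[i][k][s] = [ik, s] = (g_is,k + g_ks,i - g_ik,s) / 2."""
    n = len(xi)
    dg = [metric_derivative(metric, xi, s) for s in range(n)]  # dg[s][i][k] = g_ik,s
    return [[[0.5 * (dg[k][i][s] + dg[i][k][s] - dg[s][i][k])
              for s in range(n)] for k in range(n)] for i in range(n)]


def christoffel_second_kind(metric, xi):
    """second[l][i][k] = g^{ls} [ik, s] (two dimensions only in this sketch)."""
    n = len(xi)
    g_inv = inverse_2x2(metric(xi))
    first = christoffel_first_kind(metric, xi)
    return [[[sum(g_inv[l][s] * first[i][k][s] for s in range(n))
              for k in range(n)] for i in range(n)] for l in range(n)]


gamma = christoffel_second_kind(metric_polar, [2.0, 0.3])
print(gamma[0][1][1], gamma[1][0][1])   # approximately -r and 1/r at r = 2
```

Replacing `metric_polar` by a constant metric gives symbols that vanish identically, in agreement with the remark on Cartesian coordinates.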

The concept of parallel displacement is independent of the existence
of a metric tensor. We call a space with a law of parallel displacement
an affinely connected space and the $\Gamma^{l}_{ik}$ the components of the
affine connection. When a metric is defined, covariant and contravariant
vectors become equivalent, and the $\Gamma^{l}_{ik}$ must take the values
$\begin{Bmatrix} l \\ ik \end{Bmatrix}$, so that the parallel displacement
of a vector does not depend on which of the two representations has been
chosen.

We shall return now to our original program, the formation of new
tensors by differentiation. Consider a "tensor field," that is, a tensor
the components of which are functions of the coordinates. Now, we
can take the value of this tensor at a point $(\xi^s)$ and then displace it
parallel to itself to the point $(\xi^s + \delta \xi^s)$. The value of the
tensor field at the point $(\xi^s + \delta \xi^s)$, minus the value of the
parallel displaced tensor, is itself a tensor. In the case of a mixed
tensor of rank two, the value of the tensor at the point
$(\xi^s + \delta \xi^s)$ is

$$t_i{}^k + t_i{}^k{}_{,s} \, \delta \xi^s,$$

and the value of the parallel displaced tensor

$$t_i{}^k + \left( \begin{Bmatrix} r \\ is \end{Bmatrix} t_r{}^k - \begin{Bmatrix} k \\ rs \end{Bmatrix} t_i{}^r \right) \delta \xi^s;$$

the difference between these two expressions is

$$\left( t_i{}^k{}_{,s} - \begin{Bmatrix} r \\ is \end{Bmatrix} t_r{}^k + \begin{Bmatrix} k \\ rs \end{Bmatrix} t_i{}^r \right) \delta \xi^s.$$

This expression is a tensor because of the way we have obtained it.
As $\delta \xi^s$ is itself a vector and arbitrary, the expression in the
parenthesis is a tensor. This tensor is called the covariant derivative of
$t_i{}^k$ with respect to $\xi^s$. It is identical with the ordinary
derivative when the $\begin{Bmatrix} k \\ rs \end{Bmatrix}$ vanish, and,
therefore, particularly in the case of Cartesian coordinates. Two of the
more usual notations of the covariant derivative are $\nabla_s t_i{}^k$ and
$t_i{}^k{}_{;s}$. We shall use the latter.

The covariant derivatives of an arbitrary tensor are formed by adding
to the ordinary derivatives for each index of the differentiated
tensor a further term, which for contravariant indices takes the form

$$t^{\cdots k \cdots}{}_{;s} = \cdots + \begin{Bmatrix} k \\ rs \end{Bmatrix} t^{\cdots r \cdots} + \cdots,$$

while for covariant indices it is

$$t_{\cdots i \cdots ;s} = \cdots - \begin{Bmatrix} r \\ is \end{Bmatrix} t_{\cdots r \cdots} + \cdots.$$

This definition satisfies the rule for product differentiation,

$$(AB \cdots)_{;s} = A_{;s} B \cdots + A B_{;s} \cdots + \cdots,$$

regardless of whether some of the indices of $A$, $B$, $\cdots$ are
dummies. The covariant derivatives of the metric tensor vanish, because of
the vanishing parenthesis of eq. (5.88a). And since the covariant
differentiation obeys the law of product differentiation, indices can be
raised and lowered under the differentiation,

$$g_{ik} \, a^k{}_{;s} = a_{i;s},$$

and so forth.


Geodesic lines. The Christoffel symbols appear not only in connection
with covariant differentiation and parallel displacement, but also in
connection with a problem which is more directly related to the metric
of a space, that is, the setting up of the differential equations of
straight lines or shortest lines in terms of general coordinates. In a
Euclidean space, the shortest line connecting two points is a straight
line. In the case of general Riemannian spaces, there may not exist lines
having all the properties of straight lines, but there is, in general, a
uniquely determined shortest connecting line between two points. In the
case of the surface of a sphere, for instance, these lines are great
circles. Such shortest connecting lines are called geodesics. The length
of an arbitrary line connecting two points $P_1$ and $P_2$ is

$$s_{12} = \int_{P_1}^{P_2} ds = \int_{P_1}^{P_2} \sqrt{g_{ik} \, d\xi^i \, d\xi^k} = \int_{P_1}^{P_2} \sqrt{g_{ik} \frac{d\xi^i}{dp} \frac{d\xi^k}{dp}} \; dp, \qquad (5.93)$$

where $p$ is an arbitrary parameter.

In order to find the minimum value of $s_{12}$ with fixed end points of
integration, one has to carry out the variation according to the
Euler-Lagrange equations,

$$\delta \int_{P_1}^{P_2} L(y_a, y'_a) \, dx = \int_{P_1}^{P_2} \left[ \frac{\partial L}{\partial y_a} - \frac{d}{dx} \left( \frac{\partial L}{\partial y'_a} \right) \right] \delta y_a \, dx,$$

and the extremals are given by the equations

$$\frac{\partial L}{\partial y_a} - \frac{d}{dx} \left( \frac{\partial L}{\partial y'_a} \right) = 0.$$

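In our case, the Lagrangian is the integrand of the last expression of
eq. (5.93), while the variables $y_a$ are the coordinates $\xi^i$. The
derivatives $\dfrac{\partial L}{\partial \xi^l}$ and
$\dfrac{\partial L}{\partial \xi'^l}$, where $\xi'^l$ stands for
$\dfrac{d\xi^l}{dp}$, are given by

$$\frac{\partial L}{\partial \xi^l} = \frac{1}{2} \frac{g_{ik,l} \, \xi'^i \xi'^k}{\sqrt{g_{rs} \xi'^r \xi'^s}} = \frac{1}{2} \, g_{ik,l} \, \xi'^i \xi'^k \, \frac{dp}{ds},$$

$$\frac{\partial L}{\partial \xi'^l} = \frac{g_{lk} \, \xi'^k}{\sqrt{g_{rs} \xi'^r \xi'^s}} = g_{lk} \, \xi'^k \, \frac{dp}{ds},$$

$$\frac{d}{dp} \left( \frac{\partial L}{\partial \xi'^l} \right) = \frac{dp}{ds} \left[ g_{lk} \, \xi''^k + g_{lk,i} \, \xi'^i \xi'^k + g_{lk} \, \xi'^k \, \frac{d^2 p / ds^2}{(dp/ds)^2} \right].$$

We have, thus,

$$\frac{\partial L}{\partial \xi^l} - \frac{d}{dp} \left( \frac{\partial L}{\partial \xi'^l} \right) = - \frac{dp}{ds} \left[ g_{lk} \, \xi''^k + \left( g_{lk,i} - \tfrac{1}{2} g_{ik,l} \right) \xi'^i \xi'^k + g_{lk} \, \xi'^k \, \frac{d^2 p / ds^2}{(dp/ds)^2} \right].$$

Because the parenthesis is multiplied by an expression symmetric with
respect to $i$ and $k$, we symmetrize the parenthesis itself and write:

$$\frac{\partial L}{\partial \xi^l} - \frac{d}{dp} \left( \frac{\partial L}{\partial \xi'^l} \right) = - \frac{dp}{ds} \left[ g_{lk} \, \xi''^k + \tfrac{1}{2} \left( g_{lk,i} + g_{li,k} - g_{ik,l} \right) \xi'^i \xi'^k + g_{lk} \, \xi'^k \, \frac{d^2 p / ds^2}{(dp/ds)^2} \right].$$

The parenthesis is now an expression encountered before, eq. (5.89a).
The differential equations for the geodesic lines are, thus,

$$g_{lk} \, \xi''^k + [ik, l] \, \xi'^i \xi'^k + g_{lk} \, \xi'^k \, \frac{d^2 p / ds^2}{(dp/ds)^2} = 0,$$

or, multiplied by $g^{ls}$,

$$\xi''^s + \begin{Bmatrix} s \\ ik \end{Bmatrix} \xi'^i \xi'^k + \xi'^s \, \frac{d^2 p / ds^2}{(dp/ds)^2} = 0.$$

If we choose as the parameter the arc length $s$ itself, the last term
vanishes, and we have:

$$\frac{d^2 \xi^s}{ds^2} + \begin{Bmatrix} s \\ ik \end{Bmatrix} \frac{d\xi^i}{ds} \frac{d\xi^k}{ds} = 0. \qquad (5.99)$$

When the coordinate system is Cartesian, the second term vanishes
identically, and eq. (5.99) simply states that the $\xi^s$ must be linear
functions of $s$.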
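
Eq. (5.99) can be integrated numerically to confirm that its solutions are straight lines even when the coordinates are curvilinear. The sketch below is an illustration under stated assumptions: it uses plane polar coordinates $(r, \theta)$, for which the nonvanishing Christoffel symbols are $-r$ and $1/r$, a classical fourth-order Runge-Kutta step, and function names and starting values chosen here. Starting at $r = 1$, $\theta = 0$ with unit velocity in the $\theta$ direction, the geodesic should be the Cartesian line $x_1 = 1$, $x_2 = s$.

```python
import math


def geodesic_rhs(state):
    """Eq. (5.99) in polar coordinates, written as a first-order system.

    state = (r, theta, dr/ds, dtheta/ds)
    r''     = r * theta'^2            (symbol with indices r; theta theta is -r)
    theta'' = -(2 / r) * r' * theta'  (symbol with indices theta; r theta is 1/r)
    """
    r, _theta, dr, dtheta = state
    return [dr, dtheta, r * dtheta ** 2, -2.0 * dr * dtheta / r]


def rk4_step(state, h):
    """One classical Runge-Kutta step of size h in the arc length."""
    def shifted(k, factor):
        return [s + factor * ki for s, ki in zip(state, k)]

    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(k1, h / 2))
    k3 = geodesic_rhs(shifted(k2, h / 2))
    k4 = geodesic_rhs(shifted(k3, h))
    return [s + h / 6 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


def integrate_geodesic(state, arc_length, steps=1000):
    h = arc_length / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state


start = [1.0, 0.0, 0.0, 1.0]            # r, theta, dr/ds, dtheta/ds
r_end, theta_end, _, _ = integrate_geodesic(start, 1.0)
print(r_end * math.cos(theta_end), r_end * math.sin(theta_end))
```

The printed Cartesian coordinates should be close to $(1, 1)$, the point reached after unit arc length along the straight line.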

Minkowski world and Lorentz transformations. We can now return
to our starting point, Minkowski's treatment of the theory of relativity.
He considered the ordinary, three dimensional space plus the time as
a four dimensional continuum, the "world," with the invariant "length"
or "interval" defined by eq. (5.1). A "world point" is an ordinary
point at a certain time; its four coordinates are $x$, $y$, $z$, and $t$,
which we shall often denote from now on by $x^1$, $x^2$, $x^3$, and $x^4$.
By introducing a "metric tensor" $\eta_{\mu\nu}$ with the components

$$\eta_{\mu\nu} = \begin{pmatrix} -\dfrac{1}{c^2} & 0 & 0 & 0 \\ 0 & -\dfrac{1}{c^2} & 0 & 0 \\ 0 & 0 & -\dfrac{1}{c^2} & 0 \\ 0 & 0 & 0 & +1 \end{pmatrix},$$

we can write eq. (5.1) in the form

$$\tau^2 = \eta_{\mu\nu} \, \Delta x^\mu \, \Delta x^\nu. \qquad (5.101)$$

The Lorentz transformations are those linear coordinate transformations
which carry the metric tensor $\eta_{\mu\nu}$ over into itself. The
inertial systems of the special relativity theory are, thus, analogous to
the Cartesian coordinate systems of ordinary three dimensional Euclidean
geometry, and the Lorentz transformations correspond to the three
dimensional orthogonal transformations. Their transformation coefficients
are also subject to conditions similar to (5.10a).

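When we carry out any linear transformation (not necessarily a
Lorentz transformation), the transformation equations are of the form

$$x^{*\mu} = \gamma^{\mu}{}_{\nu} \, x^\nu + \mathring{x}^{*\mu},$$

and the coordinate differences transform as contravariant vectors,

$$\Delta x^{*\mu} = \gamma^{\mu}{}_{\nu} \, \Delta x^\nu. \qquad (5.103)$$

The conditions for Lorentz transformations are that

$$\eta_{\mu\nu} \, \Delta x^{*\mu} \, \Delta x^{*\nu} = \eta_{\iota\kappa} \, \Delta x^\iota \, \Delta x^\kappa$$

for arbitrary $\Delta x^\iota$. By substituting $\Delta x^{*\mu}$ from eq.
(5.103), we get

$$\eta_{\mu\nu} \, \gamma^{\mu}{}_{\iota} \, \gamma^{\nu}{}_{\kappa} \, \Delta x^\iota \, \Delta x^\kappa = \eta_{\iota\kappa} \, \Delta x^\iota \, \Delta x^\kappa,$$

and, because the $\Delta x^\iota$ are arbitrary,

$$\eta_{\mu\nu} \, \gamma^{\mu}{}_{\iota} \, \gamma^{\nu}{}_{\kappa} = \eta_{\iota\kappa}. \qquad (5.106)$$

These are the conditions for the transformation coefficients of Lorentz
transformations, corresponding to the conditions (5.10a) for orthogonal
transformations.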
The difference between a Euclidean four dimensional space and the
Minkowski world is that in the latter the invariant $\tau^2$ is not
positive definite. That is why no real coordinate transformation can carry
eq. (5.101) over into eq. (5.2b), page 59. We have, therefore, to
distinguish between contravariant and covariant indices.

In order to recognize coordinates and tensors of the Minkowski
world as such, we shall adopt the convention of using in general the
Latin alphabet for indices belonging to the ordinary space, running
from 1 to 3, and of using Greek letters for Minkowski indices, running
from 1 to 4. We shall call vectors and tensors of the Minkowski world
"world vectors" and "world tensors."

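The contravariant metric tensor has the components

$$\eta^{\mu\nu} = \begin{pmatrix} -c^{2} & 0 & 0 & 0 \\ 0 & -c^{2} & 0 & 0 \\ 0 & 0 & -c^{2} & 0 \\ 0 & 0 & 0 & +1 \end{pmatrix} \tag{5.107}$$

When contravariant tensors transform with the coefficients $\gamma^{\rho}{}_{\sigma}$ which
satisfy the conditions (5.106), the coefficients of the covariant
transformation law are the solutions of the equations

$$\gamma^{\mu}{}_{\nu}\,\gamma_{\mu}{}^{\rho} = \delta_{\nu}^{\rho}. \tag{5.108}$$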


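In order to find an explicit expression for these coefficients $\gamma_{\mu}{}^{\nu}$, one can
multiply the transformation equations of a covariant vector,

$$v^{*}_{\mu} = \gamma_{\mu}{}^{\nu}\,v_{\nu},$$

by $\eta^{\mu\rho}$ and replace $v_{\nu}$ by $\eta_{\nu\sigma}v^{\sigma}$. The result is

$$\eta^{\mu\rho}\,v^{*}_{\mu} = \eta^{\mu\rho}\,\gamma_{\mu}{}^{\nu}\,\eta_{\nu\sigma}\,v^{\sigma}.$$

Now, the left-hand side is equal to $v^{*\rho}$, and, therefore, to $\gamma^{\rho}{}_{\sigma}v^{\sigma}$,

$$\gamma^{\rho}{}_{\sigma}\,v^{\sigma} = \eta^{\rho\mu}\,\eta_{\sigma\nu}\,\gamma_{\mu}{}^{\nu}\,v^{\sigma}.$$

Further, since $v^{\sigma}$ is arbitrary, the coefficients must be equal,

$$\gamma^{\rho}{}_{\sigma} = \eta^{\rho\mu}\,\eta_{\sigma\nu}\,\gamma_{\mu}{}^{\nu}.$$

Finally, by multiplying by $\eta_{\rho\alpha}\eta^{\sigma\beta}$ and switching sides, we obtain

$$\gamma_{\alpha}{}^{\beta} = \eta_{\alpha\rho}\,\eta^{\beta\sigma}\,\gamma^{\rho}{}_{\sigma}. \tag{5.111}$$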
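The result (5.111) can be tried out on a concrete transformation. The sketch below is an illustration under stated assumptions rather than part of the original derivation: it takes a boost along x, lowers and raises the indices of its coefficients with the two metric tensors, and checks that the contraction with the original coefficients gives the Kronecker symbol. The helper names, the value c = 2, and the index order (x, y, z, t) are hypothetical choices.

```python
import math

C = 2.0  # speed of light in arbitrary units (assumed value)

ETA_LOW = [[0.0] * 4 for _ in range(4)]  # covariant metric, diag(-1/c^2, ..., +1)
ETA_UP = [[0.0] * 4 for _ in range(4)]   # contravariant metric, diag(-c^2, ..., +1)
for k in range(3):
    ETA_LOW[k][k] = -1.0 / C**2
    ETA_UP[k][k] = -(C**2)
ETA_LOW[3][3] = ETA_UP[3][3] = 1.0


def boost_x(v):
    """Coefficients gamma[mu][nu] of a Lorentz boost with speed v along x (hypothetical helper)."""
    g = 1.0 / math.sqrt(1.0 - v**2 / C**2)
    return [
        [g, 0.0, 0.0, -g * v],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [-g * v / C**2, 0.0, 0.0, g],
    ]


def covariant_coefficients(gamma):
    """result[a][b] = eta_{a r} eta^{b s} gamma[r][s], the form of eq. (5.111)."""
    return [
        [
            sum(ETA_LOW[a][r] * ETA_UP[b][s] * gamma[r][s]
                for r in range(4) for s in range(4))
            for b in range(4)
        ]
        for a in range(4)
    ]


def contract(gamma, gamma_cov):
    """result[n][r] = sum over m of gamma[m][n] * gamma_cov[m][r]."""
    return [
        [sum(gamma[m][n] * gamma_cov[m][r] for m in range(4)) for r in range(4)]
        for n in range(4)
    ]


gamma = boost_x(0.6 * C)
gamma_cov = covariant_coefficients(gamma)
delta = contract(gamma, gamma_cov)
is_identity = all(
    abs(delta[n][r] - (1.0 if n == r else 0.0)) < 1e-12
    for n in range(4) for r in range(4)
)
print(is_identity)
```

For the boost, the covariant coefficients differ from the contravariant ones only in the sign and the powers of c of the mixed space-time components, which is exactly the inverse boost.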

All the algebraic operations of the general tensor calculus can be applied
to world tensors. As in the case of orthogonal transformations, the
determinant of the transformation coefficients takes only the values
±1. Thus, the densities of even weight transform like tensors, and
the densities of odd weight transform similarly to the "axial" vectors
of the three dimensional orthogonal formalism.

The components of the metric tensor $\eta_{\mu\nu}$ are constant; therefore, the
Christoffel symbols vanish, and the covariant derivatives are simply
the ordinary derivatives. We shall denote such ordinary derivatives
by the comma,


$$\frac{\partial t^{\cdots}{}_{\cdots}}{\partial x^{\rho}} = t^{\cdots}{}_{\cdots,\rho}.$$


We shall now demonstrate how the transformation coefficients of a
Lorentz transformation are related to the relative velocity of the two
coordinate systems S and S*. When a point is in a state of straight-line,
uniform motion, its velocity is described by the ratios of the
coordinate differences of any two world points along its path,

$$u^{i} = \frac{\Delta x^{i}}{\Delta x^{4}}.$$

The velocity of the system S relative to the system S* is the velocity
of a particle P, which is at rest in S, relative to S*. The first three
coordinate differences of P with respect to S, $\Delta x^{i}$, vanish. The
coordinate differences with respect to S* are, therefore, given by

$$\Delta x^{*\mu} = \gamma^{\mu}{}_{4}\,\Delta x^{4}.$$

S has, therefore, relative to S*, the velocity

$$v^{*i} = \frac{\gamma^{i}{}_{4}}{\gamma^{4}{}_{4}}. \tag{5.114}$$





Conversely, we can compute the velocity of S* with respect to S
by employing the "inverse" transformation coefficients $\gamma_{\mu}{}^{\nu}$, given by
eq. (5.108). Because they are the transformation coefficients of the
transformation $S^{*} \rightarrow S$, we can write, referring to eq. (5.114),

$$v^{i} = \frac{\gamma_{4}{}^{i}}{\gamma_{4}{}^{4}}.$$

By making use of eqs. (5.111), (5.100), and (5.107), we can write the
right-hand side in the form

$$v^{i} = -c^{2}\,\frac{\gamma^{4}{}_{i}}{\gamma^{4}{}_{4}}. \tag{5.116}$$



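Now it is easy to show that in general $v^{*2}$ is equal to $v^{2}$. Let us first
form the three dimensional norm of $v^{*i}$, eq. (5.114). We have

$$v^{*2} = \frac{\sum_{i=1}^{3} (\gamma^{i}{}_{4})^{2}}{(\gamma^{4}{}_{4})^{2}}.$$

Because of eq. (5.106) the numerator can be rewritten,

$$\sum_{i=1}^{3} (\gamma^{i}{}_{4})^{2} = c^{2}\left[(\gamma^{4}{}_{4})^{2} - 1\right],$$

and we find, therefore,

$$v^{*2} = c^{2}\,\frac{(\gamma^{4}{}_{4})^{2} - 1}{(\gamma^{4}{}_{4})^{2}}.$$

Now we treat eq. (5.116) in an analogous way. It is

$$v^{2} = c^{4}\,\frac{\sum_{i=1}^{3} (\gamma^{4}{}_{i})^{2}}{(\gamma^{4}{}_{4})^{2}}.$$

Again we have

$$\sum_{i=1}^{3} (\gamma^{4}{}_{i})^{2} = \frac{1}{c^{2}}\left[(\gamma^{4}{}_{4})^{2} - 1\right],$$

and $v^{2}$ is

$$v^{2} = c^{2}\,\frac{(\gamma^{4}{}_{4})^{2} - 1}{(\gamma^{4}{}_{4})^{2}}.$$

Incidentally, we find that $\gamma^{4}{}_{4}$ is always given by the expression

$$\gamma^{4}{}_{4} = \frac{1}{\sqrt{1 - v^{2}/c^{2}}}.$$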
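The equality of the two relative speeds is easy to check for a transformation in which the two velocity vectors do not simply differ by a sign. The sketch below is a hedged illustration: it composes a boost along x with a spatial rotation about z (both hypothetical helpers, with c = 2 and the index order (x, y, z, t) assumed), reads off the velocity of S relative to S* and of S* relative to S from the coefficients, and compares their norms and the time-time coefficient.

```python
import math

C = 2.0  # speed of light in arbitrary units (assumed value)


def boost_x(v):
    """Coefficients gamma[mu][nu] of a Lorentz boost with speed v along x (hypothetical helper)."""
    g = 1.0 / math.sqrt(1.0 - v**2 / C**2)
    return [
        [g, 0.0, 0.0, -g * v],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [-g * v / C**2, 0.0, 0.0, g],
    ]


def rotation_z(angle):
    """Coefficients of a purely spatial rotation about the z-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [
        [c, -s, 0.0, 0.0],
        [s, c, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def matmul(a, b):
    """Product of two 4x4 coefficient arrays (apply b first, then a)."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def velocity_of_s_in_s_star(gamma):
    """v*^i = gamma^i_4 / gamma^4_4."""
    return [gamma[i][3] / gamma[3][3] for i in range(3)]


def velocity_of_s_star_in_s(gamma):
    """v^i = -c^2 gamma^4_i / gamma^4_4."""
    return [-(C**2) * gamma[3][i] / gamma[3][3] for i in range(3)]


def norm(v):
    return math.sqrt(sum(x * x for x in v))


speed = 0.6 * C
gamma = matmul(boost_x(speed), rotation_z(math.pi / 6))
v_star = velocity_of_s_in_s_star(gamma)
v = velocity_of_s_star_in_s(gamma)

print(v_star, v)                # different directions
print(norm(v_star), norm(v))    # same magnitude
print(gamma[3][3], 1.0 / math.sqrt(1.0 - norm(v) ** 2 / C**2))
```

Because of the rotation, the components of the two velocities are not simply opposite, yet the two norms and the value of the time-time coefficient agree with the expressions derived above.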




Paths, world lines. Ordinarily, the motion of a particle along its path
is described by stating the functional dependence of the three space
coordinates on the time t,

$$x^{i} = f^{i}(t). \tag{5.120}$$

The components of the velocity are given as the derivatives,

$$u^{i} = \frac{dx^{i}}{dt}. \tag{5.121}$$

This kind of description is, of course, possible in the theory of relativity
as well as in non-relativistic physics. However, it is often useful to
choose a description in which the time is not set apart from the spatial
coordinates as in eqs. (5.120) and (5.121).

The motion of a mass point is, in terms of the Minkowski world,
represented by a line, a "world line," which we can advantageously
describe by a parameter representation,

$$x^{\mu} = f^{\mu}(p), \tag{5.122}$$

where p is an arbitrary parameter defined along the world line. Such
parameter descriptions are commonly used in analytical geometry.

In three dimensional geometry, the arc length is often chosen as the
parameter. In the Minkowski world, we can use as the parameter the
proper time $\tau$ along a world line. Just as the arc length s in ordinary
geometry is defined as the line integral

$$s = \int \sqrt{dx^{2} + dy^{2} + dz^{2}},$$

$\tau$ is given as the integral of the differential $d\tau$ along the world line of
the particle,

$$\tau = \int d\tau = \int \sqrt{dt^{2} - \frac{1}{c^{2}}\left(dx^{2} + dy^{2} + dz^{2}\right)} = \int \sqrt{\eta_{\mu\nu}\,dx^{\mu}\,dx^{\nu}} = \int \sqrt{\eta_{\mu\nu}\,\frac{dx^{\mu}}{dp}\,\frac{dx^{\nu}}{dp}}\;dp. \tag{5.123}$$

We can, therefore, describe the path by equations of the form

$$x^{\mu} = f^{\mu}(\tau). \tag{5.124}$$

$\tau$ is related to the $x^{\mu}$ by a differential equation. When we divide the
integrand of eq. (5.123) by dt and take account of eq. (5.121), we obtain

$$\frac{d\tau}{dt} = \sqrt{1 - \frac{u^{2}}{c^{2}}}. \tag{5.125}$$
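The relation between proper time and coordinate time, $d\tau/dt = \sqrt{1 - u^{2}/c^{2}}$, can be compared with a direct evaluation of the line integral for $\tau$. The following sketch is only an illustration: the world line (uniform circular motion), the function names, the step count, and the value c = 2 are assumptions chosen for the example.

```python
import math

C = 2.0  # speed of light in arbitrary units (assumed value)


def proper_time(world_line, t_start, t_end, steps=10_000):
    """Approximate the integral of sqrt(dt^2 - (dx^2 + dy^2 + dz^2)/c^2)
    by summing over short chords of the world line."""
    dt = (t_end - t_start) / steps
    tau = 0.0
    previous = world_line(t_start)
    for n in range(1, steps + 1):
        current = world_line(t_start + n * dt)
        dl2 = sum((b - a) ** 2 for a, b in zip(previous, current))
        tau += math.sqrt(dt**2 - dl2 / C**2)
        previous = current
    return tau


def circle(speed, radius=1.0):
    """Space coordinates (x, y, z) as functions of t for uniform circular motion."""
    omega = speed / radius
    return lambda t: (radius * math.cos(omega * t),
                      radius * math.sin(omega * t),
                      0.0)


u = 0.6 * C
period = 2.0 * math.pi / u  # one revolution on a circle of radius 1
tau_numeric = proper_time(circle(u), 0.0, period)
tau_formula = period * math.sqrt(1.0 - u**2 / C**2)  # constant speed, so the factor is constant

print(tau_numeric, tau_formula)
```

Since the speed is constant along this world line, the integral reduces to the coordinate time multiplied by the square root factor, and the chord sum should reproduce that value up to the discretization error.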



The velocity of a body in terms of the usual description (5.120),
(5.121) is replaced by the direction of the world line in the four
dimensional description. When the body is at rest, the line is parallel to the
$X^{4}$-axis; and when the body is moving, the line will run at an angle
relative to the $X^{4}$-axis. We can describe its direction in terms of its
tangential vector,

$$U^{\mu} = \frac{dx^{\mu}}{d\tau}. \tag{5.126}$$



The four quantities $U^{\mu}$ are the components of a contravariant unit
vector,

$$\eta_{\mu\nu}\,U^{\mu}\,U^{\nu} = 1. \tag{5.127}$$

This can be verified easily by replacing $d\tau$ in eq. (5.126) by its
definition,

$$d\tau = \sqrt{\eta_{\mu\nu}\,dx^{\mu}\,dx^{\nu}}.$$

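The $U^{\mu}$ are related very simply to the velocity components $u^{i}$ of
eq. (5.121); making use of eq. (5.125), we have

$$U^{4} = \frac{dt}{d\tau} = \frac{1}{\sqrt{1 - u^{2}/c^{2}}}, \qquad U^{i} = \frac{dx^{i}}{d\tau} = \frac{dx^{i}}{dt}\,\frac{dt}{d\tau} = \frac{u^{i}}{\sqrt{1 - u^{2}/c^{2}}}.$$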
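The passage from the ordinary velocity components to the tangential world vector can be written out in a few lines. This is a minimal sketch under assumptions: the function names, the sample velocity, the value c = 2, and the component order (U¹, U², U³, U⁴) are choices made for the illustration.

```python
import math

C = 2.0  # speed of light in arbitrary units (assumed value)

# Diagonal of the covariant metric tensor, index order x, y, z, t.
ETA_DIAG = (-1.0 / C**2, -1.0 / C**2, -1.0 / C**2, 1.0)


def four_velocity(u):
    """Return (U^1, U^2, U^3, U^4) for an ordinary velocity u = (u_x, u_y, u_z)."""
    u2 = sum(x * x for x in u)
    dt_dtau = 1.0 / math.sqrt(1.0 - u2 / C**2)  # U^4 = dt/d tau
    return tuple(x * dt_dtau for x in u) + (dt_dtau,)


def norm_squared(U):
    """eta_{mu nu} U^mu U^nu."""
    return sum(e * x * x for e, x in zip(ETA_DIAG, U))


U = four_velocity((0.6, -0.8, 1.0))
print(U)
print(norm_squared(U))  # should equal 1 up to rounding
```

For a body at rest the function returns (0, 0, 0, 1), the unit vector along the time axis.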


In the following chapters, we shall use $u^{i}$ and $U^{\mu}$ consistently in the
way they were introduced here. $v^{i}$ shall be used exclusively to denote
the relative motion of two coordinate systems.

When a body moves without being accelerated, its direction in the
Minkowski world is constant and its world line is a straight line. The
law of inertia takes a very simple form in our new description:

$$U^{\mu} = \text{const.}$$


1. Prove that the right-hand side of eq. (5.90a) transforms according 
to eq. (5.82). 

2. Prove that the symmetry properties with respect to indices of the 
same transformation character are invariant. 

3. For three dimensional Riemannian space, define the differential 
operations gradient (of a scalar), divergence, and curl, and prove that 
the relation holds: 

curl grad V = 0.



Treating "axial" vectors as skew symmetric tensors of rank 2, define 
also the divergence of an axial vector and prove the relation 

div curl A = 0, 

where A is a polar vector. 

4. Prove that the following relations hold in three dimensional space: 

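$$\delta_{ikl}\,\delta^{ikm} = 2\,\delta_{l}^{m}\,; \qquad \delta_{ikl}\,\delta^{imn} = \delta_{k}^{m}\,\delta_{l}^{n} - \delta_{k}^{n}\,\delta_{l}^{m}.$$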

State similar relations for the Minkowski world. 

5. With the help of Problem 4, prove that in the Cartesian three
dimensional geometry the relation holds:

curl curl A = grad div A - $\nabla^{2}$A,

A being either a polar vector or an axial vector.

6. Compute in three dimensional space the two triple products of
the three polar vectors:

[A × [B × C]], (A · [B × C]).

7. In a plane, introduce polar coordinates. 

(a) Compute the components of the metric tensor. 

(b) State the differential equations of the straight lines. 

8. On the surface of a sphere of radius R, introduce the so-called
Riemannian homogeneous coordinates, which are characterized by an
expression for the infinitesimal length of the form

$$ds^{2} = f(\xi^{2} + \eta^{2})\,(d\xi^{2} + d\eta^{2}), \qquad \text{with } f(0) = 1.$$

(a) State the function $f(\xi^{2} + \eta^{2})$ and the transformation equations
between Riemannian coordinates and the usual coordinates of longitude
and latitude.

Answer:

$$f = \frac{1}{\left[1 + \dfrac{1}{4R^{2}}\,(\xi^{2} + \eta^{2})\right]^{2}}, \qquad \xi = \frac{2R\cos\varphi}{1 + \sin\varphi}\,\cos\vartheta, \qquad \eta = \frac{2R\cos\varphi}{1 + \sin\varphi}\,\sin\vartheta.$$

(b) Compute the differential equations for the great circles in either 
coordinate system. 




Remark: The Riemannian coordinates are obtained by a conformal 
transformation leading from a plane to a spherical surface, familiar 
from the theory of complex functions. 

9. The Laplacian operator in n dimensions is defined as the divergence
of the gradient of a scalar, in general coordinates:

$$\nabla^{2} V = g^{rs}\,V_{;rs}.$$

(a) Using the Christoffel symbols $\left\{ {t \atop rs} \right\}$, write out the right-hand side.

(b) Introduce a coordinate system the coordinates of which are
everywhere orthogonal to each other, in other words, where the line
element takes the form

$$ds^{2} = \sum_{i=1}^{n} e_{i}\,h_{i}^{2}\,(d\xi^{i})^{2}, \qquad e_{i} = \pm 1.$$

The $h_{i}$ are functions of the $\xi^{k}$.

Express the Laplacian in terms of the $h_{i}$.

Answer:

$$\nabla^{2} V = \frac{1}{H}\sum_{i=1}^{n} e_{i}\left(\frac{H}{h_{i}^{2}}\,V_{,i}\right)_{,i}, \qquad H = h_{1}\cdot h_{2}\cdots h_{n}.$$

Remark: This expression is frequently used in order to obtain the
Schroedinger equation in other than Cartesian coordinates. The
student can easily derive the expression for $\nabla^{2}V$ in three dimensions
for spherical coordinates, cylinder coordinates, and so forth.

10. (a) Show that this relation holds in n dimensions: 

$$\left\{ {s \atop is} \right\} = \frac{1}{\sqrt{g}}\,(\sqrt{g})_{,i}, \qquad g = |g_{mn}|.$$






(b) Show that the following expressions are a scalar and a vector,
respectively, when $V^{i}$ is a vector and $F^{ik}$ a skewsymmetric, contravariant
tensor:

$$\frac{1}{\sqrt{g}}\,(\sqrt{g}\,V^{i})_{,i}\,; \qquad \frac{1}{\sqrt{g}}\,(\sqrt{g}\,F^{ik})_{,k}.$$

(c) In general coordinates, bring $\nabla^{2}V$ into a form which is a
generalization of that given sub 9(b).

11. $V_{i}$ is a vector, $F_{ik}$ a skewsymmetric tensor. Show that the
following expressions are tensors with respect to arbitrary transformations,
even though the derivatives are ordinary derivatives:

$$V_{i,k} - V_{k,i}\,; \qquad F_{ik,l} + F_{kl,i} + F_{li,k}.$$


12. Schwarz's inequality,

$$\left(\sum_{i} a_{i}\,b_{i}\right)^{2} \le \sum_{i} a_{i}^{2}\;\sum_{k} b_{k}^{2},$$

states in effect that, in an n dimensional Euclidean space, any side of a
triangle is shorter than the sum of the two others. For the latter
statement can be written in the form

$$|\mathbf{a}| + |\mathbf{b}| \ge |\mathbf{a} + \mathbf{b}|.$$

When both sides are squared,

$$\mathbf{a}^{2} + \mathbf{b}^{2} + 2\,|\mathbf{a}|\cdot|\mathbf{b}| \ge \mathbf{a}^{2} + \mathbf{b}^{2} + 2\,(\mathbf{a}\cdot\mathbf{b}),$$

the squares on either side cancel; and when the remaining terms are
squared once more, Schwarz's inequality is obtained.

By introducing a suitable Cartesian coordinate system, prove
Schwarz's inequality, and thereby that the above statement about the
sides of triangles is true regardless of the number of dimensions.

Another method of proof uses the positive norm of the skewsymmetric
product,

$$\frac{1}{2}\sum_{i,k} (a_{i}\,b_{k} - a_{k}\,b_{i})^{2} = [\mathbf{a} \times \mathbf{b}]^{2}.$$


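13. In an n dimensional Euclidean space, m unit vectors may be
defined, $v^{(1)}_{i}, \ldots, v^{(m)}_{i}$, $m \le n$, which are mutually orthogonal to each
other,

$$\sum_{i=1}^{n} v^{(k)}_{i}\,v^{(l)}_{i} = \delta_{kl}.$$

Show that for any vector $a_{i}$, Bessel's inequality holds:

$$\sum_{i=1}^{n} a_{i}^{2} - \sum_{k=1}^{m}\left(\sum_{i=1}^{n} a_{i}\,v^{(k)}_{i}\right)^{2} \ge 0.$$

The inequality goes over into an equality if m = n.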

14. Prove that a vector field $V_{i}$ in an n dimensional space can be
represented as the gradient field of a scalar function if and only if its
skewsymmetric derivatives vanish,

$$V_{i,k} - V_{k,i} = 0.$$

Relativistic Mechanics of Mass Points

Program for relativistic mechanics. In Chapter IV, we have laid the 
foundation for the special theory of relativity. But so far we have 
dealt only with uniform, straight-line motion. The clocks and scales
which we used for the determination of coordinate values were not
accelerated. We replaced the Galilean transformation equations con- 
necting two inertial systems, (4..H), by the Lorentz equations (4.13). 
The Lorentz equations are linear transformation equations, just as the 
Galilean equations were; that is, the new coordinate values (space and 
time) are linear functions of the old ones. Therefore, an unaccelerated
motion in one inertial system will remain unaccelerated when a Lorentz
transformation is carried out. 
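That a linear transformation keeps an unaccelerated motion unaccelerated can be illustrated numerically. In the sketch below, world points of a uniformly moving particle are transformed by a boost along x, and the velocity computed from different pairs of transformed world points is compared. The helper names, the sample motion, the value c = 2, and the index order (x, y, z, t) are assumptions made for the example.

```python
import math

C = 2.0  # speed of light in arbitrary units (assumed value)


def boost_x(v):
    """Coefficients gamma[mu][nu] of a Lorentz boost with speed v along x (hypothetical helper)."""
    g = 1.0 / math.sqrt(1.0 - v**2 / C**2)
    return [
        [g, 0.0, 0.0, -g * v],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [-g * v / C**2, 0.0, 0.0, g],
    ]


def apply(gamma, point):
    """Transform a world point (x, y, z, t) with the coefficients gamma."""
    return [sum(gamma[m][n] * point[n] for n in range(4)) for m in range(4)]


def world_point(t, start, u):
    """World point of a particle moving uniformly from `start` with velocity u."""
    return [start[i] + u[i] * t for i in range(3)] + [t]


def velocity_between(p, q):
    """Ratios of coordinate differences, dx^i / dx^4, between two world points."""
    return [(q[i] - p[i]) / (q[3] - p[3]) for i in range(3)]


gamma = boost_x(0.6 * C)
start, u = (1.0, 2.0, 3.0), (0.5, 0.3, -0.2)
points = [apply(gamma, world_point(t, start, u)) for t in (0.0, 1.0, 2.5, 7.0)]
velocities = [velocity_between(points[0], p) for p in points[1:]]

for vel in velocities:
    print(vel)  # the same three components each time
```

The transformed velocity differs from u, but it is the same for every pair of world points, so the motion is again uniform and rectilinear in the new system.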

The law of inertia (2.1) is invariant with respect to Lorentz transformations.

The remaining chapters of Part I will discuss accelerated motion, in
other words, will develop a relativistic mechanics. This will be more
involved than the development of classical mechanics. The difficulties
are twofold: In the first place, the equations of classical mechanics are
covariant with respect to Galilean transformations, but not with respect
to Lorentz transformations. Therefore, we shall have to develop
a Lorentz invariant formalism so that our statements may be independent
of the coordinate system used. The second difficulty is more
profound: In classical mechanics, the force which acts on a body at a
given time is determined by the positions of the other interacting
bodies at the same time. An "action-at-a-distance" force law can be
formulated only if it is meaningful to speak of the "positions of the
other interacting bodies at the same time"; that is, if the "same time"
is independent of the frame of reference used. This condition, we know,
runs counter to the theory of relativity.

It is, therefore, impossible to transform automatically every conceivable
classical law of force into a Lorentz covariant law. We can
treat only those theories from which the concept of action at a distance
can be eliminated. This possibility exists in the theory of collisions,