MATHEMATICS OF RELATIVITY
LECTURE NOTES
BY
G. Y. RAINICH
(All Rights Reserved)
(Printed in U. S. A.)
EDWARDS BROTHERS, INC.
Lithoprinters and Publishers
ANN ARBOR, MICHIGAN
CONTENTS
Introduction
Chapter I. OLD PHYSICS
1. Motion of a Particle. The Inverse Square Law
2. Two Pictures of Matter
3. Vectors, Tensors, Operations
4. Maxwell's Equations
5. The Stress-Energy Tensor
6. General Equations of Motion. The Complete Tensor
Chapter II. NEW GEOMETRY
7. Analytic Geometry of Four Dimensions
8. Axioms of Four-Dimensional Geometry
9. Tensor Analysis
10. Complications Resulting From Imaginary Coordinate
11. Are the Equations of Physics Invariant
12. Curves in the New Geometry
Chapter III. SPECIAL RELATIVITY
13. Equations of Motion
14. Lorentz Transformations
15. Addition of Velocities
16. Light Corpuscles, or Photons
17. Electricity and Magnetism in Special Relativity
Chapter IV. CURVED SPACE
18. Curvature of Curves and Surfaces
19. Generalizations
20. The Riemann Tensor
21. Vectors in General Coordinates
22. Tensors in General Coordinates
23. Covariant and Contravariant Components
24. Physical Coordinates as General Coordinates
25. Curvilinear Coordinates in Curved Space
26. New Derivation of the Riemann Tensor
27. Differential Relations for the Riemann Tensor
28. Geodesics
Chapter V. GENERAL RELATIVITY
29. The Law of Geodesics
30. Solar System, Symmetry Conditions
31. Solution of the Field Equations
32. Equations of Geodesics
33. Newtonian Motion of a Planet
34. Relativity Motion of a Planet
35. Deflection of Light
36. Shift of Spectral Lines
INTRODUCTION
Since we are going to deal with applied
Mathematics, or Mathematics applied to Physics
we have to state in the beginning the general
point of view we take on that subject. A math-
ematical theory consists of statements or prop-
ositions, some of which are written as formulas.
Some of these propositions are proved, that
means deduced from others, and some are not;
the latter are called definitions and axioms.
Furthermore most mathematical theories, in par-
ticular those in which we are interested, deal
with quantities, so that the propositions take
the form of relations between quantities.
Physical experiments also deal with quan-
tities which are measured in definite prescribed
ways, and then empirical relations are estab-
lished between these measured quantities.
In an application of Mathematics to Phys-
ics a correspondence is established between some
mathematical quantities and some physical quan-
tities in such a way that the same relationship
exists (as a result of the mathematical theory)
between mathematical quantities as the experi-
mentally established relation between the cor-
responding physical quantities. This view is
not new, it was emphatically formulated by H.
Hertz in the introduction to his Mechanics, and
then emphasized again by A. S. Eddington in ap-
plication to Relativity. The process of estab-
lishing the correspondence between the physical
and the mathematical quantities we shall, fol-
lowing Eddington, call identification. An iden-
tification is successful, if the condition men-
tioned above is fulfilled, viz., if the rela-
tions deduced for the mathematical quantities
are experimentally proved to exist between the
physical quantities with which they have been
identified. From this point of view we do not
speak of true or false theories, still less of
absolute truth, etc.; truth for us is nothing
but a successful identification, and it is nec-
essary to say expressly that there may exist at
the same time two successful identifications,
two theories, each of which may be applied with-
in experimental errors to the known experimental
results; and that there may be times when no
such theory has been found; and also that an
identification which is successful at one time
may cease to be so later, when the experimental
precision will be increased.
Very often it happens that the quantities
of a theory are compared not with quantities which
are direct results of experiment but with quan-
tities of another, less comprehensive, theory
whose identification with experimental quanti-
ties has proved successful. In fact, it sel-
dom happens that we have to deal with direct
results of experiment, since even an experimen-
tal paper usually contains a great amount of
theory.
We may consider Geometry as a first attempt
at a study of the outside world. It may be con-
sidered as a deductive system which reflects
(in the sense explained above, that is of the
existence of a correspondence, etc.) very well
our experiences with some features of the out-
side world, namely features connected with the
displacements of what we call rigid bodies. We
see at once how much is left out in such a study;
in the first place, time is almost entirely
left out: in trying to bring into coincidence
two triangles we are not interested in whether
we move one slowly or rapidly; in describing a
circle we are not concerned with uniformity of
motion. Another important feature that is left
out is the distinction between vertical and
horizontal lines although we know that this
distinction is a very real one. Then optical
and electromagnetic phenomena are left out. We
see thus that Geometry is not a complete theory
of the outside world; one method of building a
complete theory would be to introduce correc-
tions into geometry, to introduce one by one
time, gravitation, ether, electricity, to patch
it up every time we discover a hole; this is a
disrespectful but roughly correct description
of the actual development of Mathematical Phys-
ics. Another method would be to scrap Geometry,
and to build instead a new theory which would be
an organic whole, embracing the displacement
phenomena as a (very important, to be sure) spe-
cial case. The purpose of the following dis-
cussion is to exhibit such a theory. Before we
come to the systematic exposition we want to
say something about the plan we are going to
follow. It is possible to start with a com-
plete statement of the general theory, and then
show how special features may be obtained by
specializations and approximations; instead, on
didactical grounds, we shall begin with special
cases and work up by modifications (these being
counterparts of approximations) and generaliza-
tions. The essential difference between this
procedure and the development of classical Phys-
ics is that in the latter Geometry was consid-
ered as a fixed basis, not to be affected by
the upper structure, and we shall feel free to
modify geometry when necessary.
Chapter I.
OLD PHYSICS
The purpose of this chapter is to reformu-
late some of the fundamental equations of mech-
anics and electrodynamics, and to write them in
a new form which constitutes an appropriate basis
for the discussion that follows. The contents of
the chapter are classical; the modifications which
are characteristic of the Relativity theory have
not been introduced, but, as was mentioned, the
form is decidedly new.
1. Motion of a Particle.
The Inverse Square Law.
The fundamental equations of Mechanics of
a particle are usually written in the form
1.1   $m\,d^2x/dt^2 = X$, $m\,d^2y/dt^2 = Y$, $m\,d^2z/dt^2 = Z$.
Here m denotes the mass of the particle, x,y,z
are functions of the time t whose values are the
coordinates of the particle at the corresponding
time, and X,Y,Z are functions of the coordinates
whose values are the components of the
force at the corresponding point. This system
of equations was the first example of what we
may call mathematical physics, and much that is
now mathematical physics may be conveniently
considered as a result of a development whose
germ is the system 1.1. This chapter will be
devoted to tracing out some lines of this de-
velopment.
We begin by writing the equations 1.1 in
the form
1.11   $d(mu)/dt = X$, $d(mv)/dt = Y$, $d(mw)/dt = Z$,
where
1.2 u = dx/dt, v = dy/dt, w = dz/dt
are the velocity components. The quantities
mu, mv, mw
are called the momentum components, and in this
form our fundamental equations express the
statement that the time rate of change of the
momentum is equal to the force, the original
statement of Newton. Equations 1.11 are seen
to be equivalent to 1.1 if we use the notations
1.2 and the fact, usually tacitly assumed, that
the mass of a particle does not change with time
or in symbols
1.12   $dm/dt = 0$.
In the equations 1.1, x,y,z are usually un-
known functions of the time and X,Y,Z are given
functions of the coordinates. The situation is
then this: first the field has to be described
by giving the forces X,Y,Z, and then the motion
in the given field is determined by solving
equations 1.1 (with some additional initial conditions).
We shall first discuss fields of a certain
simple type. One of the simplest fields of
force is the so-called inverse square field.
The field has a center, which is a singularity
of the field; in it the field is not determin-
ed; in every other point of the field the force
is directed toward the center (or away from it)
and the magnitude of the force is inversely
proportional to the square of the distance from the center.
As the most common realization of such a field
of force we may consider the gravitational field
of a mass particle or of a sphere. If cartesi-
an coordinates with the origin at the center
are introduced the force components are
1.3   $X = cx/r^3$, $Y = cy/r^3$, $Z = cz/r^3$,
where c is a coefficient of proportionality,
negative, when we have attraction and positive
in the case of repulsion, and
1.4   $r^2 = x^2 + y^2 + z^2$;
taking the sum of squares of X,Y,Z we easily
find $c^2/r^4$, so that the magnitude of the force
is $|c|/r^2$; the force is inversely proportional to the
square of the distance. If the field is produced
by several attracting particles the force
at every point (outside of the points where the
particles are located) is considered to be giv-
en by the sum of the forces due to the separate
particles. In this case the expressions become
quite complicated and it is easier to study the
general properties of such fields by using cer-
tain differential equations to which the force-
components are subjected rather than by study-
ing the explicit expressions.
These differential equations are as fol-
lows:
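1.51   $\partial X/\partial x + \partial Y/\partial y + \partial Z/\partial z = 0$
1.52   $\partial Y/\partial z - \partial Z/\partial y = 0$, $\partial Z/\partial x - \partial X/\partial z = 0$,
$\partial X/\partial y - \partial Y/\partial x = 0$.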
The fact that the functions X,Y,Z given by
the formulas 1.3 satisfy these equations may be
proved by direct substitution; to facilitate
calculation we may notice that differentiation
of 1.4 gives
1.6   $r \cdot \partial r/\partial x = x$, $r \cdot \partial r/\partial y = y$, $r \cdot \partial r/\partial z = z$.
Differentiating the first of 1.3, we have now
$\partial X/\partial x = c/r^3 - 3cx/r^4 \cdot \partial r/\partial x = c/r^3 - 3cx^2/r^5$.
Substituting this and two analogous expressions
into 1.51 we easily verify it. The verification
of 1.52 hardly presents any difficulty.
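The substitution can also be checked numerically. The following is a minimal sketch in Python, not part of the original notes: it estimates the derivatives in 1.51 and 1.52 by central differences for the field 1.3. The function names, the value c = -1.0, the step h and the test point are all arbitrary choices made for the illustration.

```python
import math

def force(x, y, z, c=-1.0):
    """Inverse square field of 1.3; c = -1.0 is an arbitrary attractive strength."""
    r = math.sqrt(x * x + y * y + z * z)
    return (c * x / r**3, c * y / r**3, c * z / r**3)

def partial(field, point, component, axis, h=1e-5):
    """Central-difference estimate of d(field[component]) / d(coordinate[axis])."""
    plus, minus = list(point), list(point)
    plus[axis] += h
    minus[axis] -= h
    return (field(*plus)[component] - field(*minus)[component]) / (2 * h)

def lhs_1_51(field, point):
    """Left hand side of 1.51: dX/dx + dY/dy + dZ/dz."""
    return sum(partial(field, point, i, i) for i in range(3))

def lhs_1_52(field, point):
    """The three left hand sides of 1.52."""
    return (
        partial(field, point, 1, 2) - partial(field, point, 2, 1),  # dY/dz - dZ/dy
        partial(field, point, 2, 0) - partial(field, point, 0, 2),  # dZ/dx - dX/dz
        partial(field, point, 0, 1) - partial(field, point, 1, 0),  # dX/dy - dY/dx
    )

point = (1.0, 2.0, -0.5)        # any point other than the center
print(lhs_1_51(force, point))   # close to zero
print(lhs_1_52(force, point))   # three values close to zero
```

The printed values differ from zero only by the error of the finite-difference approximation; at the center itself the field is not determined and the check does not apply.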
It is known that equations 1.52 give a necessary
and sufficient condition for the existence
of a function $\varphi$ of which X,Y,Z are partial
derivatives. The derivative $\partial X/\partial x$ is then the
second derivative of this function, and the system
1.51-1.52 may be replaced by the equivalent
system
1.53   $X = \partial\varphi/\partial x$, $Y = \partial\varphi/\partial y$, $Z = \partial\varphi/\partial z$
1.54   $\partial^2\varphi/\partial x^2 + \partial^2\varphi/\partial y^2 + \partial^2\varphi/\partial z^2 = 0$.
The last equation is known as the equation
of Laplace.
In the particular case where X,Y,Z are given
by the formulas 1.3 a function $\varphi$ of which
X,Y,Z are partial derivatives is (as it is easy
to verify)
$\varphi = -c/r$.
We may say now that the field of force given
by 1.3 satisfies the differential equations
1.51, 1.52 or 1.53, 1.54 which express the same
thing. It is easy to show that these equations
are satisfied not only by the field produced by
one particle at the origin, but also by that due
to any number of particles: first we notice
that if a particle is not at the origin this
results only in additive constants in the coor-
dinates, and so does not affect partial deriva-
tives which appear in the equations 1.51-1.52
which therefore remain true in this case; sec-
ondly, these equations are linear and homogene-
ous, as a consequence of which the sum of two
solutions of these equations necessarily is a
solution, or if two fields satisfy these equa-
tions their sum also satisfies them; if then, as
is generally assumed, the field produced by sev-
eral particles is the sum of the fields due to
the individual particles, such a field also sat-
isfies equations 1.51, 1.52.
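The two remarks of this paragraph (shifting the center and adding solutions) may be illustrated by a small numerical experiment. The sketch below is an illustration only: the list PARTICLES, with its positions and strengths, is invented for the purpose, and the Laplacian of 1.54 is estimated by second differences with an arbitrarily chosen step.

```python
import math

# Hypothetical particles: (x, y, z, c), with c the coefficient of 1.3 for each.
PARTICLES = [(0.0, 0.0, 0.0, -1.0), (3.0, 0.0, 0.0, -2.0), (0.0, -2.0, 1.0, 0.5)]

def potential(point, particles=PARTICLES):
    """Sum of the potentials -c/r of the separate particles."""
    return sum(-c / math.dist(point, (px, py, pz)) for (px, py, pz, c) in particles)

def laplacian(phi, point, h=1e-3):
    """Second-difference estimate of the left hand side of 1.54."""
    centre = phi(tuple(point))
    total = 0.0
    for axis in range(3):
        plus, minus = list(point), list(point)
        plus[axis] += h
        minus[axis] -= h
        total += (phi(tuple(plus)) + phi(tuple(minus)) - 2.0 * centre) / h**2
    return total

print(laplacian(potential, (1.0, 1.5, 2.0)))                      # close to zero
print(laplacian(lambda p: sum(v * v for v in p), (1.0, 1.5, 2.0)))  # close to 6, for contrast
```

The second call uses the function x^2 + y^2 + z^2, which is not a solution of 1.54, to show that the test does distinguish solutions from other functions.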
Conversely, it can be proved that any field
satisfying the differential equations 1.51-1.52
may be produced by a - finite or infinite - set
of particles each of which acts according to the
inverse square law. We shall not prove here
this fact (the proof is given in Potential The-
ory), but we shall show that these equations
furnish us back the inverse square law, if we
add the condition that the field must be symmet-
ric with respect to one point.
A situation of this character, a situation
where we have to solve a system of partial dif-
ferential equations with the "additional condi-
tion" of symmetry, will appear again later, and
in order to be clear about its significance
then, it is desirable to treat here this spe-
cial case in detail.
To begin with we take the system 1.51, 1.52
in the form 1.53, 1.54, i.e., we state that
X,Y,Z are derivatives of a function $\varphi$. Then from
the condition of symmetry of the X,Y,Z field it
follows that the field represented by the function
$\varphi$ also must be symmetric, i.e., that this
function may depend only on the distance from
the origin, because two points which are equidistant
from the origin are symmetric with respect
to it and, therefore, $\varphi$ must have the
same value in two such points.
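We have thus to solve equation 1.54 with
the additional condition that $\varphi$ depends on x,
y,z, only through r. Indicating differentiation
with respect to r by ' we have
1.56   $\partial\varphi/\partial x = \varphi' \cdot \partial r/\partial x$, etc.
and
1.57   $\partial^2\varphi/\partial x^2 = \varphi'' \cdot (\partial r/\partial x)^2 + \varphi' \cdot \partial^2 r/\partial x^2$, etc.
Squaring each of the formulas 1.6 and taking
the sum we have
$r^2[(\partial r/\partial x)^2 + (\partial r/\partial y)^2 + (\partial r/\partial z)^2] = x^2 + y^2 + z^2 = r^2$
or
$(\partial r/\partial x)^2 + (\partial r/\partial y)^2 + (\partial r/\partial z)^2 = 1$;
on the other hand differentiating each formula
1.6 we have
$(\partial r/\partial x)^2 + r \cdot \partial^2 r/\partial x^2 = 1$, etc.;
summing these we obtain
$r \cdot (\partial^2 r/\partial x^2 + \partial^2 r/\partial y^2 + \partial^2 r/\partial z^2) = 2$.
Using now 1.57 we can give the equation 1.54
the form
$\varphi'' + 2\varphi'/r = 0$
or
$\varphi''/\varphi' + 2/r = 0$
whence
$\varphi' = c/r^2$ (c being a constant of integration),
$\varphi = -c/r$ (up to an additive constant), which substituted in 1.56 together
with 1.6 gives 1.3. We see thus that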
the general equations 1.51-1.52 give us all the
general information we need about the fields of
force in question; we shall call this system
the system of equations of a Newtonian field or
simply the Newtonian system, although Newton
never considered the differential equations
that make it up.
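The last step of the derivation, in which Laplace's equation for a function of r alone yields the inverse square law, can be followed numerically. The sketch below is only an illustration under stated assumptions: it writes p for the radial derivative of the potential, integrates p' = -2p/r by the classical Runge-Kutta method, and compares the result with c/r^2. The starting value, the interval and the number of steps are arbitrary.

```python
def integrate_radial(p0, r0, r1, steps=1000):
    """Integrate p' = -2 p / r from r0 to r1 (p stands for the radial
    derivative of the potential) with the classical fourth-order Runge-Kutta rule."""
    h = (r1 - r0) / steps
    r, p = r0, p0

    def slope(r_, p_):
        return -2.0 * p_ / r_

    for _ in range(steps):
        k1 = slope(r, p)
        k2 = slope(r + h / 2, p + h * k1 / 2)
        k3 = slope(r + h / 2, p + h * k2 / 2)
        k4 = slope(r + h, p + h * k3)
        p += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        r += h
    return p

c = -1.5                              # value of p at r = 1, chosen arbitrarily
p_at_3 = integrate_radial(c, 1.0, 3.0)
print(p_at_3, c / 3.0**2)             # the two numbers agree closely
```

Starting from any value of p at r = 1, the computed p at r = 3 is one ninth of it, which is the statement that the force falls off as the inverse square of the distance.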
We may comment briefly on the mathematical
character of the magnitudes and equations we
have been dealing with in this section. At ev-
ery point X,Y,Z may be considered as the compo-
nents of a vector, the force vector; we have
thus a vector at every point of space, and this
constitutes what is called a vector field. The
function $\varphi$ is an example of a scalar field.
These two fields are in the particular relation
that the first is derived from the second by
differentiation. The vector field satisfies
equation 1.51, the left hand side of which is
called the divergence of the vector field. To find
the divergence of a vector field we take the sum
of the derivatives of the components of the vec-
tor with respect to the corresponding coordin-
ates, or we differentiate each component with
respect to the corresponding coordinate and add
up the results. The formula becomes more ex-
pressive if we number the coordinates by writ-
ing
1.71   $x = x_1$, $y = x_2$, $z = x_3$
and also the vector components, viz.,
1.72   $X = X_1$, $Y = X_2$, $Z = X_3$.
The expression for the divergence may then be
written as
1.8   $\sum_{i=1}^{3} \partial X_i/\partial x_i$.
The operation of forming a divergence is of
fundamental importance in what follows.
We abandon now for a while the study of
force fields and direct our attention to the
left hand sides of the equations of motion.
2. Two Pictures of Matter.
Our fundamental equations 1.1 connect mat-
ter as represented by the left hand sides with
forces as represented by the right hand sides.
There seems to be a fundamental difference in
the mathematical aspects of matter and force.
The quantities characterizing matter, the mo-
mentum components, for example, are functions
of one variable t, and are subjected to ordin-
ary differential equations, whereas quantities
characterizing force, X,Y,Z are functions of
three variables x,y,z and, as a consequence, are
subjected to partial differential equations;
they are field quantities, whereas the matter
components are not; another way of saying this
is to say that force seems to be distributed
continuously through space but matter seems to
be connected with discrete points. This dis-
tinction is, however, not as essential as it
looks; it is merely the result of the point of
view we take. We could very well consider matter
to be distributed continuously through
space; each of the two theories, the discrete
theory, according to which matter consists of
discrete particles, or material points, each of
which carries a finite mass, and the continuous
theory, according to which matter is distributed
continuously through space, or certain portions
of space, may be considered as the limiting
case of the other. We may start with material
points, then increase their number at the
same time decreasing the mass of each and so
approximate with any degree of precision a given
continuous distribution; or we may start with
a continuous distribution, then make the density
decrease everywhere except in the constantly
decreasing neighborhoods of a discrete number
of points, and thus approximate, with any
precision, a given discrete distribution. It is
clear that there cannot be any question as to
which of the two theories is correct, since the
difference between the two can be made as small
as we please, and therefore the predictions
based on the two theories can be made to agree
as closely as we may wish, so that if one iden-
tification is successful within experimental
error the other will be likewise. Mathemati-
cally the difference will be largely that be-
tween ordinary differential equations, which
are used in treating the motion of discrete
particles, and partial differential equations,
which apply to continuous distributions.
We may remark here that although forces
are usually considered to be continuously dis-
tributed in space, it is possible to introduce
a discrete picture here also; this is being ac-
tually done sometimes in the electromagnetic
theory, when a field of force is represented by
discrete lines of force, and the intensity is
characterized by the number of lines per square
inch; we shall not, however, have occasion to
use this picture.
We may also remark here that in the last
few years still a third point of view has ap-
peared (in Quantum Theory) which in a way oc-
cupies an intermediate position; mathematically
the treatment is that used in the continuous
case (partial differential equations), but the
interpretation is given in terms of discrete
particles, the continuous quantities being con-
sidered as probabilities of a particle being
within a certain volume, and the like. This
point of view also will not be used in what follows,
and is mentioned here only for the sake
of completeness.
We want now to translate the equations of
motion 1.11 into the language of the continuous
theory. Each point of space (or of a certain
portion of space) will be considered as occu-
pied at each moment (or each moment during a
certain period) by a material particle. Here
we also denote by u,v,w the velocity components
of a particle of matter, but here they are also
considered as functions of coordinates as well
as of time; by
u(x,y,z,t), v(x,y,z,t), w(x,y,z,t)
we understand the velocity components of a par-
ticle which at the time t occupies the position
x,y,z. The fundamental quantity in this theory
corresponding to mass of the discrete theory is
density. A particle does not possess any finite
mass, a mass corresponds only to a finite vol-
ume (at a given time). To a point (at a given
time) we assign a density which may be explain-
ed as the limit of the mass of a sphere with
the center at the given point divided by the
volume of that sphere as the radius of the sphere
tends toward zero. A better way of putting it
is to say that we consider a point function
$\rho(x,y,z,t)$ called the density and that the mass
of matter occupying a given volume at a given
time is the integral
$\int \rho(x,y,z,t)\,dx\,dy\,dz$
extended over the given volume. This integral
will, in general, be a function of time; the
mass in a given volume changes with time because
new matter may be coming in and old matter going
out, and they do not exactly balance each other.
But if we consider a certain volume at a given
time, and then consider at other moments the
volume which is occupied by the same matter,
then the mass of matter in that new volume must
be the same. That means that if we consider x,
y,z, as functions of t, namely, as the coordin-
ates of the same particle of matter at differ-
ent times, and if we consider the region of in-
tegration as a variable volume but one that is
occupied by the same particles of matter at all
times, then the integral must be independent of
time, or
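2.1   $\frac{d}{dt}\int \rho(x,y,z,t)\,dx\,dy\,dz = 0$.
This may be written also in a differential form
as
2.2   $\partial(\rho u)/\partial x + \partial(\rho v)/\partial y + \partial(\rho w)/\partial z + \partial\rho/\partial t = 0$.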
We indicate two proofs for this fact; first
an easy but not rigorous proof.
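For an infinitesimal volume $V = dx \cdot dy \cdot dz$ we
may consider density as the same in all points
of the volume, so that mass will be the product
$V \cdot \rho$ and the derivative of this product will be
$dV/dt \cdot \rho + V \cdot d\rho/dt$.
Then again, considering V as a product $dx \cdot dy \cdot dz$
we find
$dV/dt = d(dx)/dt \cdot dy\,dz + dx \cdot d(dy)/dt \cdot dz + dx\,dy \cdot d(dz)/dt$.
Now set $dx = x_2 - x_1$, where $x_1$ and $x_2$ are here the
x coordinates of the two ends of the edge dx; then
$d(dx)/dt = d(x_2 - x_1)/dt = dx_2/dt - dx_1/dt = u_2 - u_1 = \partial u/\partial x \cdot dx$,
$u_1$ and $u_2$ being the values of u at the two ends;
substituting this and the two analogous expressions
in the preceding relation we find
$dV/dt = V(\partial u/\partial x + \partial v/\partial y + \partial w/\partial z)$,
and since
$d\rho/dt = \partial\rho/\partial x \cdot dx/dt + \partial\rho/\partial y \cdot dy/dt + \partial\rho/\partial z \cdot dz/dt + \partial\rho/\partial t$
$= \partial\rho/\partial x \cdot u + \partial\rho/\partial y \cdot v + \partial\rho/\partial z \cdot w + \partial\rho/\partial t$,
the expression for the derivative of mass gives
$dm/dt = V \cdot (\partial(\rho u)/\partial x + \partial(\rho v)/\partial y + \partial(\rho w)/\partial z + \partial\rho/\partial t)$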
so that constancy of mass is expressed by the
condition 2.2.
A rigorous proof would be based on expressing
the integral for the moment t', which may be
written as $\int \rho(x',y',z',t')\,dx'\,dy'\,dz'$, using as
variables of integration the coordinates of the
corresponding particles at the moment t. The
formulas of transformation would be
2.3   $x' = f(x,y,z,t')$, $y' = g(x,y,z,t')$,
$z' = h(x,y,z,t')$
where f(x,y,z,t') is the x coordinate at the
moment t' of the particle which at the moment
t was at x,y,z, etc. Using the formulas of
transformation of a multiple integral we would
obtain
$\int \rho(f,g,h,t') \cdot J\,dx\,dy\,dz$
where J is the Jacobian of the functions 2.3
and the integration is over the volume occupied
at the moment t. Setting the derivative of
this integral with respect to t' equal to zero,
and then making t' = t and noticing that $\partial f/\partial t'$
for t' = t is the velocity component u, etc.,
we would find the same equation 2.2.
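Both statements, the constancy of the mass of a volume that moves with the matter and the differential equation 2.2, can be tried on a simple example. The sketch below is an illustration and not part of the original notes; it assumes a motion in one dimension only, with velocity u = x/t and density exp(-(x/t)^2)/t, chosen because each particle then moves according to x = s t with constant s. The step sizes and sample values are arbitrary.

```python
import math

def rho(x, t):
    """Illustrative density: exp(-(x/t)^2) / t (an assumption of this sketch)."""
    s = x / t
    return math.exp(-s * s) / t

def u(x, t):
    """Illustrative velocity: each particle moves as x = s * t with s fixed."""
    return x / t

def continuity_lhs(x, t, h=1e-5):
    """One-dimensional form of 2.2: d(rho u)/dx + d(rho)/dt, by central differences."""
    d_flux_dx = (rho(x + h, t) * u(x + h, t) - rho(x - h, t) * u(x - h, t)) / (2 * h)
    d_rho_dt = (rho(x, t + h) - rho(x, t - h)) / (2 * h)
    return d_flux_dx + d_rho_dt

def mass_between(a, b, t, n=2000):
    """Midpoint-rule estimate of the mass contained in the interval [a, b] at time t."""
    dx = (b - a) / n
    return sum(rho(a + (k + 0.5) * dx, t) for k in range(n)) * dx

print(continuity_lhs(0.7, 1.3))                 # close to zero

# Interval occupied by the same particles (s between 0.5 and 2) at two moments:
print(mass_between(0.5 * 1.0, 2.0 * 1.0, 1.0), mass_between(0.5 * 2.5, 2.0 * 2.5, 2.5))

# A fixed interval, by contrast, does not keep its mass:
print(mass_between(0.5, 2.0, 1.0), mass_between(0.5, 2.0, 2.0))
```

The second line of output shows two equal masses, the third two different ones, in agreement with the remark that matter may come in and go out of a fixed volume.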
This equation is called the "continuity
equation of matter" or the "equation of conser-
vation of matter." The corresponding equation
in the discrete theory is the equation (1.12)
dm/dt = 0 which is not usually included among
the fundamental equations of mechanics. The
continuity equation may be written in a very
simple form if we use the index notations for
the coordinates introduced before (1.71), intro-
duce analogous notations for the velocity com-
ponents, viz.,
2.4   $u = u_1$, $v = u_2$, $w = u_3$,
and in addition write
2.5   $t = x_4$
and agree, when it is convenient, to write $u_4$
for unity so that
2.41   $1 = u_4$.
With these notations the continuity equa-
tion becomes
2.21   $\sum_{i=1}^{4} \partial(\rho u_i)/\partial x_i = 0$.
Noticing the analogy of this equation with equation
1.8 we are tempted to say that the continuity
equation expresses the fact that the
"divergence" of the "vector" of components $\rho u_i$
is zero. This involves, of course, a generalization
of the conceptions divergence and vector,
because the summation here goes from one to 4
instead of to 3 as in the above formula. This
generalization will be of extreme importance in
what follows. In the meantime we may notice
that the divergence of the vector $\rho u_i$ plays the
same role in the continuous theory as the time
derivative of the number m played in the discrete
theory.
We now continue the translation of the
equations of the discrete theory into the con-
tinuous language. The equations of motion ex-
press the fact that the time derivatives of the
momentum components are equal to the force com-
ponents; limiting our consideration to the left
hand sides of the equations we have therefore
to consider the time derivatives of the momen-
tum components; in the first place the time de-
rivative of mu; without repeating the reasoning
which led us to the continuity equation of mat-
ter, and noticing that the only change consists
in replacing m by mu, we find that the time
rate of change of the first component of the mo-
mentum vector will be here
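2.61   $\partial(\rho uu)/\partial x + \partial(\rho uv)/\partial y + \partial(\rho uw)/\partial z + \partial(\rho u)/\partial t$
and the analogous expressions for the other components
will be
2.62   $\partial(\rho vu)/\partial x + \partial(\rho vv)/\partial y + \partial(\rho vw)/\partial z + \partial(\rho v)/\partial t$
2.63   $\partial(\rho wu)/\partial x + \partial(\rho wv)/\partial y + \partial(\rho ww)/\partial z + \partial(\rho w)/\partial t$.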
These expressions will have to be set equal to
the force components (or, rather, components of
the force density) in order to obtain the equations
of motion. Such equations have been obtained
by Euler for the motion of a fluid and are referred
to as Euler's hydrodynamic equations; but
at present we are not so much interested in the
equations of motion as in the mathematical structure
of the expressions involving matter components
that we have written down. An attentive
inspection will help to discover a far-reaching
symmetry which again finds its best expression
if we use the index notations introduced above
(1.71, 2.4, 2.5). We may write, in fact, for
the last three expressions
2.6   $\sum_{i=1}^{4} \partial(\rho u_i u_j)/\partial x_i$,   j = 1,2,3
and we note furthermore that if we let j here
take the value 4 we obtain the expression appearing
on the left hand side of the equation
of continuity (2.21). We come thus to the idea
of considering the quantities
2.7   $M_{ij} = \rho u_i u_j$
and we see that the expression
2.65   $\sum_{i=1}^{4} \partial M_{ij}/\partial x_i$,   j = 1,2,3,4
plays a very important part in our theory. The
first three components, i.e., the expressions
obtained for j = 1,2,3, give the time rate of
change of the momentum components, and the last
one, obtained by setting j = 4, gives the ex-
pression whose vanishing expresses conservation
of mass. The expressions 2.65 appear as a gen-
eralization of what we call a divergence, and
we shall call it divergence also, but it is
clear that the whole structure of our expres-
sions deserves a closer study to which we shall
devote our next section.
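The double role of the expression 2.65 may be seen on an example. The sketch below is an illustration under assumptions that are not in the original: the matter is supposed to rotate rigidly about the z axis with angular velocity OMEGA, and its density is carried along by the motion. For such a motion the rate of change of momentum per unit volume is the density times the acceleration toward the axis, so the first two components of 2.65 should come out as -rho*OMEGA^2*x and -rho*OMEGA^2*y, and the third and fourth as zero.

```python
import math

OMEGA = 0.8  # angular velocity of the illustrative rotation (an assumption)

def velocity4(x):
    """u_1, u_2, u_3 for a rigid rotation about the z axis, and u_4 = 1 as in 2.41."""
    x1, x2, x3, x4 = x
    return (-OMEGA * x2, OMEGA * x1, 0.0, 1.0)

def density(x):
    """A density carried along by the rotation (illustrative choice)."""
    x1, x2, x3, x4 = x
    c, s = math.cos(OMEGA * x4), math.sin(OMEGA * x4)
    a = c * x1 + s * x2      # position of the same particle at t = 0
    b = -s * x1 + c * x2
    return math.exp(-((a - 1.0) ** 2 + b ** 2 + x3 ** 2))

def M(x):
    """The sixteen quantities M_ij = rho u_i u_j of 2.7 (indices 0..3 in the code)."""
    rho, u = density(x), velocity4(x)
    return [[rho * u[i] * u[j] for j in range(4)] for i in range(4)]

def divergence_of_M(x, h=1e-5):
    """Expression 2.65: for each j, the sum over i of dM_ij/dx_i (central differences)."""
    result = [0.0] * 4
    for i in range(4):
        plus, minus = list(x), list(x)
        plus[i] += h
        minus[i] -= h
        m_plus, m_minus = M(plus), M(minus)
        for j in range(4):
            result[j] += (m_plus[i][j] - m_minus[i][j]) / (2 * h)
    return result

point = (0.4, -0.3, 0.2, 1.1)            # x, y, z, t
print(divergence_of_M(point))
print(-density(point) * OMEGA**2 * 0.4, -density(point) * OMEGA**2 * (-0.3))
```

The last component of the first printed list is close to zero (conservation of mass), and the first two agree with the values printed on the second line.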
3. Vectors, Tensors, Operations.
We shall later treat the fundamental con-
cepts of vector and tensor analysis in a syste-
matic way. At present we shall show how the
language of this theory which for ordinary
space has been partly introduced in section 1
can be applied to the case of four independent
variables and extended so as to furnish a sim-
ple way of describing the relations introduced
in the preceding section.
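A quantity like $\rho$ which depends on the independent
variables x,y,z,t, we shall call a
scalar field. The four quantities $u_1$, $u_2$, $u_3$,
$u_4$, we shall consider as the components of a
vector (or of a vector field; the latter, if
we want to emphasize the dependence on the independent
variables). The sixteen quantities
$\rho u_i u_j$ furnish an example of a tensor (or tensor
field). A convenient way to arrange the components
of a tensor is in a square array; for
instance,
3.1
$\begin{matrix}
\rho u_1u_1 & \rho u_1u_2 & \rho u_1u_3 & \rho u_1u_4 \\
\rho u_2u_1 & \rho u_2u_2 & \rho u_2u_3 & \rho u_2u_4 \\
\rho u_3u_1 & \rho u_3u_2 & \rho u_3u_3 & \rho u_3u_4 \\
\rho u_4u_1 & \rho u_4u_2 & \rho u_4u_3 & \rho u_4u_4
\end{matrix}$
3.2
$\begin{matrix}
\partial u_1/\partial x & \partial u_1/\partial y & \partial u_1/\partial z & \partial u_1/\partial t \\
\partial u_2/\partial x & \partial u_2/\partial y & \partial u_2/\partial z & \partial u_2/\partial t \\
\partial u_3/\partial x & \partial u_3/\partial y & \partial u_3/\partial z & \partial u_3/\partial t \\
\partial u_4/\partial x & \partial u_4/\partial y & \partial u_4/\partial z & \partial u_4/\partial t
\end{matrix}$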
We want to mention here a very important tensor
the array of whose components is
3.4
$\begin{matrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{matrix}$
its components are usually denoted by $\delta_{ij}$ so that
$\delta_{ij}$ is one if the indices have the same value,
and zero, when they are different. The $\delta_{ij}$ are
often referred to as the Kronecker symbols.
A tensor has been obtained above from a
vector by differentiation; the same process can
also be applied to a scalar in order to obtain
a vector. From the scalar $\rho$, we would thus obtain
a vector whose components are $\partial\rho/\partial x$, $\partial\rho/\partial y$,
$\partial\rho/\partial z$, $\partial\rho/\partial t$; this vector is often called the gradient of
the scalar $\rho$. On the other hand the same process
may be applied to a tensor. For instance,
differentiating each of the sixteen components,
$M_{ij} = \rho u_i u_j$ introduced in 2.7 with respect to
each of the independent variables $x_k$ we obtain
the 16 x 4 numbers (or functions) $\partial M_{ij}/\partial x_k$. We
call these numbers the components of a tensor
of rank three, and we may call now what we
called simply a tensor, a tensor of rank two, a
vector - a tensor of rank one, and a scalar - a
tensor of rank zero. The operation of differentiation
leads then from a tensor (or, better,
a tensor field) to a tensor of the next higher
rank.
It is also convenient to introduce another
operation, the operation of contraction; it can
be applied to a tensor of at least rank two and
it lowers the rank by two; for a tensor of rank
two it consists in forming the sum of all the
components whose indices are equal, or, if the
components are arranged as explained before, in
taking the sum of all the components in the
main diagonal.
The operation of taking the divergence of
a vector (field) may be stated now to consist
of the operation of differentiation followed by
the operation of contraction applied to the re-
sulting tensor of rank two.
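The decomposition of the divergence into differentiation followed by contraction can be written out as two separate steps. The following sketch is only an illustration: the vector field (xy, yz, zx), the sample point and the step h are arbitrary choices, and the derivatives are estimated by central differences.

```python
def derivative_tensor(field, point, h=1e-6):
    """Differentiation: the array T[i][j] = d(field_i)/d(x_j), a tensor of rank two."""
    n = len(point)
    T = [[0.0] * n for _ in range(n)]
    for j in range(n):
        plus, minus = list(point), list(point)
        plus[j] += h
        minus[j] -= h
        f_plus, f_minus = field(plus), field(minus)
        for i in range(n):
            T[i][j] = (f_plus[i] - f_minus[i]) / (2 * h)
    return T

def contract(T):
    """Contraction of a tensor of rank two: the sum of the main diagonal."""
    return sum(T[i][i] for i in range(len(T)))

def field(p):
    """Illustrative vector field (x*y, y*z, z*x); its divergence is y + z + x."""
    x, y, z = p
    return (x * y, y * z, z * x)

T = derivative_tensor(field, (1.0, 2.0, 3.0))
print(contract(T))   # close to 2 + 3 + 1 = 6

kronecker = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
print(contract(kronecker))   # contraction of the Kronecker symbols in four dimensions
```

The same function `contract` applied to the array 3.4 gives the number of dimensions, since every diagonal component is one.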
A tensor of rank three may be contracted,
in general, in three different ways; in general,
a tensor of higher rank in as many ways as
there are pairs of indices. To contract a tensor
with respect to two of its indices means to
take the sum of those components in which the
two selected indices have the same values; for
instance, $\partial M_{ij}/\partial x_k$ is a tensor of rank three;
its contraction with respect to the first and
the third indices is the sum $\sum_i \partial M_{ij}/\partial x_i$; the
index j is allowed to take all the four values
1,2,3,4 so that we have four sums which are
considered as the components of a vector - the
divergence of the tensor $M_{ij}$.
We may finally mention the operation of
multiplication which has been applied several
times in what precedes. The vector $\rho u_i = q_i$
has been obtained as a result of multiplication
of the vector $u_i$ by the scalar $\rho$. The tensor
$M_{ij}$ has been obtained by multiplying the vector
$q_i$ by the vector $u_j$ (every component of the
first by every component of the second - that
is why we have to use different indices - i and
j are supposed to take independently of each
other the values 1,2,3,4).
The operation of contraction introduced
above will be performed very often; it is convenient,
therefore, to simplify our notation;
this simplification consists in omitting the
symbol of summation, and in indicating that
summation takes place by using Greek letters
for indices with respect to which we sum. Thus
we shall write
3.5   $\partial(\rho u_\alpha)/\partial x_\alpha = 0$   and   3.6   $\partial M_{\alpha j}/\partial x_\alpha$
for 2.21 and 2.65 respectively. The first gives
an example of a divergence of a vector, the second
of a divergence of a tensor.
The Greek index in the above formulas has
no numerical value; any other Greek letter would
do just as well; in this respect a Greek index
may be compared with the variable of integration
in a definite integral. The only case when
we have to pay some attention to the particular
Greek letters we are using is when two (or more)
summations occur in one expression - in such a
case different Greek letters have to be used
for every summation. If we have to write, for
example, $(\sum x_i y_i)^2$ using Greek indices we could
write it as $(x_\alpha y_\alpha)^2$, but if we want to write
out the two factors instead of using the exponent
we have to write $x_\alpha y_\alpha x_\beta y_\beta$ because
$x_\alpha y_\alpha x_\alpha y_\alpha$ would have meant $\sum (x_i y_i)^2$.
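The difference between the two ways of writing may be made concrete with numbers. In the sketch below, which is only an illustration with arbitrarily chosen components, the two Greek indices become two independent loops, whereas a single repeated index becomes a single loop.

```python
x = [1, 2, 3]
y = [4, 5, 6]

def square_of_sum(x, y):
    """Two independent summation indices: every term of the first sum
    is multiplied by every term of the second."""
    n = len(x)
    return sum(x[a] * y[a] * x[b] * y[b] for a in range(n) for b in range(n))

def sum_of_squares(x, y):
    """What one index repeated four times would mean: the sum of the squares
    of the separate products."""
    return sum((x[i] * y[i]) ** 2 for i in range(len(x)))

print(square_of_sum(x, y))    # (4 + 10 + 18)^2 = 1024
print(sum_of_squares(x, y))   # 16 + 100 + 324 = 440
```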
The operation of contraction is used quite
often. The formation of the scalar product of
two vectors $u_i$ and $v_i$ may be considered as resulting
from their multiplication followed by
contraction; the multiplication gives the tensor
of the second rank $u_i v_j$, and contracting
this we get $u_\alpha v_\alpha = u_1v_1 + u_2v_2 + u_3v_3$, which is
the scalar product; the scalar product of two
vectors could be also called the contracted
product of the two vectors. In an analogous
fashion we can form a contracted product of two
tensors of the second rank. If the tensors are
$a_{ij}$ and $b_{ij}$ the contracted product will be
$a_{i\alpha}b_{\alpha j}$; it is also a tensor of rank two. It may
be interesting to note that the formation of
the contracted product of two tensors is essentially
the same operation as that of multiplying
two determinants corresponding to the
arrays representing the tensors; to see that,
it will be enough to consider two three row
determinants
$\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}$
and
$\begin{vmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \end{vmatrix}$.
Their product, according to the theorem of multiplication
of determinants, is
$\begin{vmatrix} a_{1\alpha}b_{\alpha 1} & a_{1\alpha}b_{\alpha 2} & a_{1\alpha}b_{\alpha 3} \\ a_{2\alpha}b_{\alpha 1} & a_{2\alpha}b_{\alpha 2} & a_{2\alpha}b_{\alpha 3} \\ a_{3\alpha}b_{\alpha 1} & a_{3\alpha}b_{\alpha 2} & a_{3\alpha}b_{\alpha 3} \end{vmatrix}$
and it is seen that the elements of this determinant
are the components of the tensor of rank
two which arises from the tensors $a_{ij}$ and $b_{ij}$ by
first multiplying them and then contracting with
respect to the two inside indices.
We could also speak of the contracted square
of a tensor, meaning by this the contracted
product of a tensor with itself.
4. Maxwell's Equations.
In section 1 we discussed from a formal
point of view the inverse square law and the
fields of a more general nature which can be
derived from it; and we expressed the laws of
these fields in terms of three-dimensional ten-
sor analysis, i.e., we employed only three in-
dependent variables; after that we found that
matter is best discussed (from the continuous
point of view) by using four-dimensional ten-
sor analysis. We have thus a discrepancy: two
different mathematical tools are used in the
treatment of the two sides of the fundamental
equations of mechanics. This discrepancy will
be removed in what follows; it will be removed
by considering force fields that differ from
those derived by composition of inverse square
laws, by modifying in a sense this law; but the
modifications will be different in the two cases
in which the inverse square law has been applied
in older physics, the two cases which we are
going to mention now.
Originally, the inverse square law was in-
troduced in the time of Newton in application to
gravitational forces; we shall discuss in chap-
ter V the gravitational phenomena, and see what
modifications - radical in nature, but very
slight as far as numerical values are concerned
- the inverse square law will suffer. Later, it
has been recognized that the inverse square law
applies also to the electrostatic and magneto-
static fields produced by one single electric,
resp. magnetic particle. Still later a more
general law for electromagnetic fields has been
introduced by Faraday and Maxwell, which we
shall have to study now.
If X,Y,Z denote the components of electric
force in the static symmetric case, as we just
said, the inverse square law applies, and, as
shown in section 1, it follows under the assumption
of additivity that for a field produced by any
number of particles the divergence vanishes,
4.1
$$\partial X/\partial x + \partial Y/\partial y + \partial Z/\partial z = 0,$$
and the quantities $\partial Y/\partial z - \partial Z/\partial y$, $\partial Z/\partial x - \partial X/\partial z$,
$\partial X/\partial y - \partial Y/\partial x$ also vanish; a static magnetic
field does not interact with the electric field,
but when a changing magnetic field is present
the laws of the electric field are modified;
viz., the quantities just mentioned are not
zero any more but are proportional or, in
appropriately chosen units, equal to the time
derivatives of the components L,M,N of the
magnetic field, so that we have, in addition
4.11
$$\partial Y/\partial z - \partial Z/\partial y = \partial L/\partial t,$$
$$\partial Z/\partial x - \partial X/\partial z = \partial M/\partial t,$$
$$\partial X/\partial y - \partial Y/\partial x = \partial N/\partial t.$$
In a similar fashion, the divergence of
the magnetic force vanishes,
4.2
$$\partial L/\partial x + \partial M/\partial y + \partial N/\partial z = 0,$$
and the expressions $\partial M/\partial z - \partial N/\partial y$, $\partial N/\partial x - \partial L/\partial z$,
$\partial L/\partial y - \partial M/\partial x$ are proportional to the time
derivatives of the electric components; the
factor of proportionality, however, cannot be
reduced to one; by an appropriate choice of units
it can be reduced to minus one, and no changing
of directions or sense of coordinate axes can
permit us to get rid of this minus sign without
introducing a minus sign in the preceding
equations; this minus sign is of extreme importance
in what follows, as we shall have occasion to
observe many times; in the meantime we write
out the remaining equations
4.21
$$\partial M/\partial z - \partial N/\partial y = -\partial X/\partial t,$$
$$\partial N/\partial x - \partial L/\partial z = -\partial Y/\partial t,$$
$$\partial L/\partial y - \partial M/\partial x = -\partial Z/\partial t.$$
Several remarks must be made here concern-
ing these equations, which will be referred to
as Maxwell's equations.
In the first place, these equations cannot
be proved; they have to be regarded as the
fundamental equations of a mathematical theory, whose
justification lies in the fact that its quantities
have been successfully identified with measured
quantities of Physics, in the sense that
for physical quantities the same relationships
have been established experimentally as those
deduced for the corresponding theoretical
quantities from the fundamental equations. In the
second place, the equations as they appear above
present a simplified and idealized form of the
fundamental equations, namely the fundamental
equations for the case of free space, i.e.,
complete absence of matter.
In the third place, the choice of units
which made the above simple form possible
concerns not only units of electric and magnetic
force, but also units of length and time; it
was necessary to choose them in such a way that
the velocity of light, which in ordinary units
is 300,000 kilometers per second, becomes one.
As a result of this, ordinary velocities, those
we observe in everyday life, are expressed by
very small quantities.
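To get a feeling for how small ordinary velocities
become in these units, one may divide them by the
velocity of light. The sketch below uses the rounded
value of 300,000 kilometers per second quoted above;
the function name and the sample speeds are
illustrative assumptions.

```python
# Velocity of light in ordinary units, rounded as in the text.
C_KM_PER_S = 300_000.0

def to_natural_units(speed_km_per_s):
    """Express a velocity as a fraction of the velocity of light."""
    return speed_km_per_s / C_KM_PER_S

train = to_natural_units(0.03)        # 0.03 km/s, i.e. 108 km per hour
earth_orbit = to_natural_units(30.0)  # rough orbital speed of the Earth
light = to_natural_units(C_KM_PER_S)  # becomes one by construction

print(train, earth_orbit, light)
```

Even the orbital velocity of the Earth is expressed
by a number of the order of one ten-thousandth.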
For our purposes it is convenient to
arrange our equations in the following form, where
differentiation with respect to a variable is
indicated by a subscript,
4.3
$$\begin{array}{ll}
N_y - M_z - X_t = 0, & Z_y - Y_z + L_t = 0,\\
L_z - N_x - Y_t = 0, & X_z - Z_x + M_t = 0,\\
M_x - L_y - Z_t = 0, & Y_x - X_y + N_t = 0,\\
X_x + Y_y + Z_z = 0, & L_x + M_y + N_z = 0.
\end{array}$$
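As an illustration of the free-space equations, one
may test a simple field numerically: a wave moving
along the x axis with velocity one, for which only Y
and N differ from zero and both equal f(x - t). The
sketch below is an assumption-laden check, not part
of the theory: the profile f, the step h, the sample
point and all function names are illustrative, and
the derivatives are approximated by central
differences.

```python
import math

def f(s):
    """Arbitrary smooth wave profile (illustrative choice)."""
    return math.sin(s)

# Field components as functions of (x, y, z, t).
def X(x, y, z, t): return 0.0
def Y(x, y, z, t): return f(x - t)
def Z(x, y, z, t): return 0.0
def L(x, y, z, t): return 0.0
def M(x, y, z, t): return 0.0
def N(x, y, z, t): return f(x - t)

def d(field, k, p, h=1e-5):
    """Central-difference derivative of field in variable number k at p."""
    hi, lo = list(p), list(p)
    hi[k] += h
    lo[k] -= h
    return (field(*hi) - field(*lo)) / (2 * h)

def residuals(p):
    """Left hand sides of the eight free-space equations at the point p."""
    x_, y_, z_, t_ = 0, 1, 2, 3
    return [
        d(N, y_, p) - d(M, z_, p) - d(X, t_, p),
        d(L, z_, p) - d(N, x_, p) - d(Y, t_, p),
        d(M, x_, p) - d(L, y_, p) - d(Z, t_, p),
        d(X, x_, p) + d(Y, y_, p) + d(Z, z_, p),
        d(Z, y_, p) - d(Y, z_, p) + d(L, t_, p),
        d(X, z_, p) - d(Z, x_, p) + d(M, t_, p),
        d(Y, x_, p) - d(X, y_, p) + d(N, t_, p),
        d(L, x_, p) + d(M, y_, p) + d(N, z_, p),
    ]

worst = max(abs(r) for r in residuals((0.3, -1.2, 0.7, 0.5)))
print(worst)
```

All eight left hand sides vanish up to rounding
errors, whatever smooth profile f is chosen.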
As mentioned before the above equations
describe the behavior of electric and magnetic
forces in free space, that is in regions where
there is no matter, or where we may neglect mat-
ter. On the discrete theory of matter these
equations still hold everywhere except at points
occupied by matter - in this theory matter ap-
pears as singularity of the field and some nu-
merical characteristics of matter, such as elec-
tric charge, appear as residues corresponding to
these singularities. We shall not discuss this
point of view, although mathematically it is
very interesting. On the continuous theory of
matter some terms which represent matter have
to be added to the preceding equations. The
second set of Maxwell's equations (4.3) remains
unaltered, but the first set is modified; the
left hand sides do not vanish any more but are
proportional to the velocity components of
matter; the coefficient of proportionality is
electric density which we denote by e. The
equations of Maxwell for space with matter are thus
4.31
$$\begin{array}{ll}
N_y - M_z - X_t = eu, & Z_y - Y_z + L_t = 0,\\
L_z - N_x - Y_t = ev, & X_z - Z_x + M_t = 0,\\
M_x - L_y - Z_t = ew, & Y_x - X_y + N_t = 0,\\
X_x + Y_y + Z_z = e, & L_x + M_y + N_z = 0.
\end{array}$$
We thus come across a new scalar quantity -
electric density. However, in most cases this
density is proportional to the mass density $\rho$ we
have considered before, the factor of proportionality
being capable of only two numerical values - one
negative for negative electricity, and the other
positive for positive electricity.
Even these equations are not sufficient for
the description of electromagnetic phenomena;
they correspond to a certain idealization in
which the dielectric constant and magnetic per-
meability are neglected, but we shall not go
beyond this idealization.
In the above equations we have four
independent variables x,y,z,t, as in the discussion
of matter in section 2, and we may try now to
apply to them the same notations which have
been introduced in that section and section
3. The main question here is: how to treat
the six quantities X,Y,Z,L,M,N? The question
was solved by Minkowski in 1907 in the following
way: it is clear that a vector has too few
components to take care of these quantities;
instead of using two vectors, Minkowski proposed
to use a tensor of rank two; of course, a
tensor has too many components; to be exact, it
has, in the general case, 16 components - four
in the main diagonal, six above, and six below;
we set those in the main diagonal zero, and
those under the main diagonal equal with
opposite sign to those above the main diagonal
symmetric to them; in this way we are left with
six essentially different components; the
restriction just introduced is expressed in one
formula
4.4
$$F_{ij} + F_{ji} = 0;$$
in fact, the elements in the main diagonal
correspond to equal indices; if we set j = i the
above formula becomes $F_{ii} + F_{ii} = 0$,
whence $F_{ii} = 0$ as asserted. We try now to
identify the components of this tensor with our
electromagnetic force components in the following
way:
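4.5
$$X = F_{41},\quad Y = F_{42},\quad Z = F_{43},$$
$$L = F_{23},\quad M = F_{31},\quad N = F_{12};$$
using these notations together with 4.4,
according to which, for example, $F_{14} = -X$, we can
write the first set 4.3 in the highly
satisfactory form,
4.6
$$\partial F_{12}/\partial x_2 + \partial F_{13}/\partial x_3 + \partial F_{14}/\partial x_4 = 0,$$
$$\partial F_{23}/\partial x_3 + \partial F_{21}/\partial x_1 + \partial F_{24}/\partial x_4 = 0,$$
$$\partial F_{31}/\partial x_1 + \partial F_{32}/\partial x_2 + \partial F_{34}/\partial x_4 = 0,$$
$$\partial F_{41}/\partial x_1 + \partial F_{42}/\partial x_2 + \partial F_{43}/\partial x_3 = 0,$$
or
$$\partial F_{i\alpha}/\partial x_\alpha = 0.$$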
These four equations show a high degree of
symmetry; moreover they show a very pronounced
similarity to some of the equations we have been
considering in section 2, and for which we
prepared a mathematical theory in section 3; we can
say that the four equations written above
express the fact that the divergence of the
tensor $F_{ij}$ just introduced vanishes in the case of
free space. However, if we apply the same
notations to the second set 4.3 of Maxwell's
equations nothing very simple comes out; the minus
sign mentioned after formula 4.2 above seems to
cause trouble; but there exists a way out also
from this difficulty; it has been indicated
(before Minkowski's paper) in the work of Poincaré
and Marcolongo, and foreshadowed in a private
letter by Hamilton as early as 1845. We can
overcome the difficulty if we allow ourselves
to use imaginary quantities side by side with
real quantities - this ought to cause no worry
provided we know the formal rules of operations,
since our new notations are of an entirely
formal nature anyway; we set now instead of 2.5
4.7
$$x = x_1,\quad y = x_2,\quad z = x_3,\quad it = x_4,$$
and instead of 4.5
4.72
$$iX = F_{41},\quad iY = F_{42},\quad iZ = F_{43},$$
$$L = F_{23},\quad M = F_{31},\quad N = F_{12},$$
and then the first set (4.3) becomes (4.6) as
before, but the second set (4.3) also acquires
a highly satisfactory form, namely
4.61
$$\partial F_{23}/\partial x_4 + \partial F_{34}/\partial x_2 + \partial F_{42}/\partial x_3 = 0,$$
$$\partial F_{34}/\partial x_1 + \partial F_{41}/\partial x_3 + \partial F_{13}/\partial x_4 = 0,$$
$$\partial F_{41}/\partial x_2 + \partial F_{12}/\partial x_4 + \partial F_{24}/\partial x_1 = 0,$$
$$\partial F_{12}/\partial x_3 + \partial F_{23}/\partial x_1 + \partial F_{31}/\partial x_2 = 0,$$
or
$$\partial F_{ij}/\partial x_k + \partial F_{jk}/\partial x_i + \partial F_{ki}/\partial x_j = 0.$$
As mentioned before we consider the
components $F_{ij}$ as the components of a tensor. We may
say that we have sixteen of them: the six which
appear in the relations 4.72, six more which
result from them by interchanging the indices and
whose values differ from those given in 4.72
only in sign, and four more with equal indices;
according to the formula (4.4) these are zero.
We may arrange them in a square array as follows:
4.8
$$\begin{matrix}
F_{11} & F_{12} & F_{13} & F_{14}\\
F_{21} & F_{22} & F_{23} & F_{24}\\
F_{31} & F_{32} & F_{33} & F_{34}\\
F_{41} & F_{42} & F_{43} & F_{44}
\end{matrix}
\;=\;
\begin{matrix}
0 & N & -M & -iX\\
-N & 0 & L & -iY\\
M & -L & 0 & -iZ\\
iX & iY & iZ & 0
\end{matrix}$$
We may compare the property $F_{ij} = -F_{ji}$
which our new tensor has with the property
$M_{ij} = M_{ji}$ possessed by the tensor of matter (2.7)
and which is simply the result of commutativity
of multiplication. These two properties are
manifested in the square arrays (3.1 and 4.8) in
that the components of $M_{ij}$ which are symmetric
with respect to the main diagonal are equal, and
those of $F_{ij}$ which are symmetric with respect to
the main diagonal are opposite. Tensors of the
first type are called symmetric, those of the
second - antisymmetric.
We want to see now whether the notations
which permitted us to write in a nice form
Maxwell's equations would not spoil the nice form
which we previously gave to the hydrodynamical
equations. But since here we have at our
disposal the quantities $u_1, u_2, u_3, u_4$ we can
arrange it so as to offset the i in $x_4$. In
fact, since differentiation with respect to time
occurs always in the presence of a $u_4$ it is
enough to set
4.9
$$u_4 = i$$
instead of 1, as in 2.41, and everything will
be all right, as far as the left hand sides of
the equations are concerned, except that the
left hand side of the continuity equation
becomes imaginary. This, however, does not
matter, since we temporarily disregard the right
hand sides of the equations.
And now we may consider the Maxwell
equations with matter 4.31; the second set is not
affected and may be written as 4.61, but the
first will appear in a form which may be
written simply as
4.62
$$\partial F_{i\alpha}/\partial x_\alpha = e u_i.$$
The equation of continuity of matter is a
consequence of these equations; to obtain it
differentiate, contract and use the property
$F_{ij} = -F_{ji}$; the result gives the continuity
equation of matter if we take into account that
$\rho/e$ is a constant.
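The three steps just named (differentiate, contract,
use antisymmetry) may be written out as follows. This
is only a sketch; it assumes that the first set of
the equations with matter reads
$\partial F_{i\alpha}/\partial x_\alpha = e u_i$ in index notation, with
$u_4 = i$ and $x_4 = it$.

```latex
% Differentiate with respect to x_i and contract (i becomes beta):
\frac{\partial^2 F_{\beta\alpha}}{\partial x_\beta\,\partial x_\alpha}
   = \frac{\partial (e u_\beta)}{\partial x_\beta}.
% Renaming the summation indices and using F_{\alpha\beta} = -F_{\beta\alpha}:
\frac{\partial^2 F_{\beta\alpha}}{\partial x_\beta\,\partial x_\alpha}
   = \frac{\partial^2 F_{\alpha\beta}}{\partial x_\alpha\,\partial x_\beta}
   = -\frac{\partial^2 F_{\beta\alpha}}{\partial x_\beta\,\partial x_\alpha}
   = 0.
% Hence, since rho/e is a constant:
\frac{\partial (e u_\beta)}{\partial x_\beta} = 0
\quad\Longrightarrow\quad
\frac{\partial (\rho u_\beta)}{\partial x_\beta} = 0 .
```

The left hand side equals its own negative and
therefore vanishes; what remains is the continuity
equation in the form 3.5.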
5. The Stress-Energy Tensor.
So far the equations to which we subjected
force components have been linear equations,
whereas operations performed on matter involved
squares and products of matter components; the
similarity which we observed in the mathematical
aspects of force and matter components makes it
seem desirable to subject force components to
operations analogous to those which we applied
to matter, viz., multiplication.
In the static case we return to our
notations (1.72) $X_1, X_2, X_3$ for X,Y,Z and form the
tensor $X_iX_j$; this tensor, slightly modified,
plays an important part in the theory; the
modification consists in subtracting from it
$\tfrac12\delta_{ij}X_\alpha X_\alpha$ where $\delta_{ij}$ are the components of the
tensor introduced in 3.4 and $X_\alpha X_\alpha$ stands for
the contracted square of the vector $X_i$, i.e.,
$X^2 + Y^2 + Z^2$. We consider then the tensor
whose array is
$$\begin{matrix}
\tfrac12(X^2 - Y^2 - Z^2) & XY & XZ\\
XY & \tfrac12(Y^2 - X^2 - Z^2) & YZ\\
XZ & YZ & \tfrac12(Z^2 - X^2 - Y^2).
\end{matrix}$$
Of this tensor we form the divergence
(three-dimensional), and find as its components
$$X(X_x + Y_y + Z_z) + Y(X_y - Y_x) + Z(X_z - Z_x),$$
$$X(Y_x - X_y) + Y(X_x + Y_y + Z_z) + Z(Y_z - Z_y),$$
$$X(Z_x - X_z) + Y(Z_y - Y_z) + Z(X_x + Y_y + Z_z);$$
the connection of these expressions with the
Newtonian equations (1.51, 1.52) is obvious;
the expressions in brackets are the left hand
sides of the Newtonian equations, so that the
divergence of our new tensor $X_iX_j - \tfrac12\delta_{ij}X_\alpha X_\alpha$
vanishes as the result of Newtonian equations.
This again confirms us in our opinion that from
the mathematical point of view force and matter
components are of very similar nature. We have
now in mind electric and magnetic forces; if X,
Y,Z are the components of the electric force
vector the tensor whose array is written out
above is called the "electric stress tensor";
an analogous expression in magnetic components
is called the "magnetic stress tensor"; the sum
of the two, namely
$$\begin{matrix}
\tfrac12(X^2 + L^2 - Y^2 - Z^2 - M^2 - N^2) & XY + LM & XZ + LN\\
XY + LM & \tfrac12(Y^2 + M^2 - X^2 - Z^2 - L^2 - N^2) & YZ + MN\\
XZ + LN & YZ + MN & \tfrac12(Z^2 + N^2 - X^2 - Y^2 - L^2 - M^2)
\end{matrix}$$
is called the "electromagnetic stress tensor";
it has been introduced by Maxwell and plays some
part in electromagnetic theory, for instance, in
the discussion of light pressure; but its main
applications and importance seem to be in the
study of the fundamental questions, as part of
a more general four-dimensional tensor.
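We saw how nicely the system of indices
worked in the case of Maxwell's equations; it
is natural to express in index notation this
tensor also. We assert that the required
expression is given by
5.1
$$E_{ij} = F_{i\rho}F_{\rho j} - \tfrac14\delta_{ij}F_{\sigma\rho}F_{\rho\sigma}$$
where i,j take on values 1,2,3, and the
summations indicated by $\rho$ and $\sigma$ are extended from 1
to 4. In fact,
$$F_{\sigma\rho}F_{\rho\sigma} = 2(-L^2 - M^2 - N^2 + X^2 + Y^2 + Z^2),$$
$$F_{1\rho}F_{\rho 1} = F_{12}F_{21} + F_{13}F_{31} + F_{14}F_{41} = -N^2 - M^2 + X^2,$$
$$E_{11} = -N^2 - M^2 + X^2 + \tfrac12L^2 + \tfrac12M^2 + \tfrac12N^2 - \tfrac12X^2 - \tfrac12Y^2 - \tfrac12Z^2
= \tfrac12(X^2 + L^2 - Y^2 - Z^2 - M^2 - N^2)$$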
and similarly for the other components
corresponding to different combinations of the indices
1,2,3 for i and j. There seems to be an
inconsistency here; the summation indices $\rho$ and $\sigma$ we
let run from 1 to 4, but we consider only the
values 1,2,3 for i and j. It is interesting to
see what comes out if we let i and j take the
value 4. We get four new components, namely,
$$E_{14} = F_{1\rho}F_{\rho 4} = F_{12}F_{24} + F_{13}F_{34} = i(MZ - NY),$$
$$E_{24} = F_{2\rho}F_{\rho 4} = F_{21}F_{14} + F_{23}F_{34} = i(NX - LZ),$$
$$E_{34} = F_{3\rho}F_{\rho 4} = F_{31}F_{14} + F_{32}F_{24} = i(LY - MX),$$
$$E_{44} = F_{4\rho}F_{\rho 4} - \tfrac12(-L^2 - M^2 - N^2 + X^2 + Y^2 + Z^2)
= \tfrac12(X^2 + Y^2 + Z^2 + L^2 + M^2 + N^2).$$
These quantities happen also to have
physical meaning. The first three constitute
(except for the factor -i) the components of the
so-called Poynting vector, and the last one is
the so-called electromagnetic energy (or energy
density). We are thus led by the notations
we have introduced in a purely mathematical way
to some physical quantities; we may say that
the entire tensor with its sixteen components -
it is called "the electromagnetic stress-energy
tensor" - unifies in a single expression all
the second degree quantities appearing in the
electromagnetic theory: the stress components,
the Poynting components, and the energy.
The stress-energy tensor may be written
out in the form of the following square array:
5.2
$$\begin{matrix}
X^2 + L^2 - h & XY + LM & XZ + LN & i(MZ - NY)\\
XY + LM & Y^2 + M^2 - h & YZ + MN & i(NX - LZ)\\
XZ + LN & YZ + MN & Z^2 + N^2 - h & i(LY - MX)\\
i(MZ - NY) & i(NX - LZ) & i(LY - MX) & h
\end{matrix}$$
5.3 where $h = \tfrac12(X^2 + Y^2 + Z^2 + L^2 + M^2 + N^2)$.
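The components of the stress-energy tensor can be
computed directly from the array of the field tensor.
The following sketch assumes the expression
$E_{ij} = F_{i\rho}F_{\rho j} - \tfrac14\delta_{ij}F_{\sigma\rho}F_{\rho\sigma}$ with all
indices running from 1 to 4; the function names and
the sample field values are illustrative.

```python
def field_tensor(X, Y, Z, L, M, N):
    """Square array of F_ij with the imaginary fourth coordinate."""
    i = 1j
    return [[0, N, -M, -i * X],
            [-N, 0, L, -i * Y],
            [M, -L, 0, -i * Z],
            [i * X, i * Y, i * Z, 0]]

def stress_energy(F):
    """E_ij = F_ir F_rj - (1/4) delta_ij F_sr F_rs, sums from 1 to 4."""
    full = sum(F[s][r] * F[r][s] for s in range(4) for r in range(4))
    return [[sum(F[a][r] * F[r][b] for r in range(4))
             - (0.25 * full if a == b else 0)
             for b in range(4)] for a in range(4)]

X, Y, Z, L, M, N = 1.0, 2.0, 3.0, 4.0, 5.0, 6.0
E = stress_energy(field_tensor(X, Y, Z, L, M, N))

# Energy density h and Poynting components, for comparison.
h = 0.5 * (X * X + Y * Y + Z * Z + L * L + M * M + N * N)
poynting = (Y * N - Z * M, Z * L - X * N, X * M - Y * L)

trace = sum(E[a][a] for a in range(4))
print(E[0][0], E[0][1], E[0][3], E[3][3], trace)
```

With the signs of the array of $F_{ij}$ used here, the
components $E_{14}, E_{24}, E_{34}$ come out as the Poynting
components multiplied by -i, the component $E_{44}$
equals h, and the sum of the diagonal components is
zero.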
6. General Equations of Motion.
The Complete Tensor.
We let ourselves be guided once more by
what seems to be natural from the formal point
of view, and form the divergence (four-dimen-
sional divergence) of the new tensor. This can
be done either in components or in index nota-
tion. We show how to do it the latter way, and
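leave it to the reader to write out the
stress-energy tensor as an array and to form the
divergence of the separate lines. Applying
formula 3.6 to the tensor 5.1 we have
$$\frac{\partial E_{i\alpha}}{\partial x_\alpha}
= F_{\gamma\alpha}\frac{\partial F_{i\gamma}}{\partial x_\alpha}
+ F_{i\gamma}\frac{\partial F_{\gamma\alpha}}{\partial x_\alpha}
+ \tfrac12 F_{\beta\alpha}\frac{\partial F_{\beta\alpha}}{\partial x_i};$$
the first term on the right may be split up in
two equal parts, one of which, writing $\beta$ for $\gamma$,
may be written as $\tfrac12F_{\beta\alpha}\,\partial F_{i\beta}/\partial x_\alpha$; the other,
writing $\alpha$ for $\gamma$ and $\beta$ for $\alpha$, and interchanging
indices in both factors, which does not affect
the value because it amounts to changing the
sign twice, takes the form $\tfrac12F_{\beta\alpha}\,\partial F_{\alpha i}/\partial x_\beta$. We thus
have
$$\frac{\partial E_{i\alpha}}{\partial x_\alpha}
= \tfrac12 F_{\beta\alpha}\left(\frac{\partial F_{i\beta}}{\partial x_\alpha}
+ \frac{\partial F_{\alpha i}}{\partial x_\beta}
+ \frac{\partial F_{\beta\alpha}}{\partial x_i}\right)
+ F_{i\gamma}\frac{\partial F_{\gamma\alpha}}{\partial x_\alpha}.$$
Substituting for the second factors their
values from Maxwell's equations 4.61 and 4.62 (in
space with matter) we get
$$\frac{\partial E_{i\alpha}}{\partial x_\alpha} = e\,F_{i\gamma}u_\gamma,$$
or in components without indices
6.1
$$\partial E_{1\alpha}/\partial x_\alpha = e(Nv - Mw + X),$$
$$\partial E_{2\alpha}/\partial x_\alpha = e(Lw - Nu + Y),$$
$$\partial E_{3\alpha}/\partial x_\alpha = e(Mu - Lv + Z),$$
$$\partial E_{4\alpha}/\partial x_\alpha = ei(Xu + Yv + Zw).$$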
These expressions obtained by us in a
purely formal way are known to possess physical
significance: the first three represent the
components of the force exerted by an electro-
magnetic field on a body of electric charge den-
sity e, and the last (if we neglect the factor
i) is the rate at which energy is expended by
the field in moving the body. The first three
expressions obviously give the right hand sides
of our hydrodynamic equations. Since the left
hand sides also (2.61, 2.62) have been obtained
as divergence components (3.6) we may write
these equations in an extremely simple form, if
we introduce a new tensor, the difference of
the two appearing on the left and right
sides, viz.,
6.2   $T_{ij} = M_{ij} - E_{ij}$.
Our hydrodynamical equations are then simply
6.3   $\partial T_{i\alpha}/\partial x_\alpha = 0$   (i = 1,2,3).
This seems to be very satisfactory, but there
is an unpleasant feature about it, namely, that
for i = 4 we do not seem to get a correct
equation: the contribution of the tensor $M_{ij}$ gives
the left hand side of the continuity equation,
and is, therefore, zero; but the contribution of
the stress-energy tensor is the work performed
by the forces on the particle and is, in
general, not zero. The source of this unpleasantness
and the way to remove it will be clear after the
reader becomes acquainted with the contents of
the next chapter.
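The four expressions 6.1 can be compared with the
index form from which they come. The sketch below
assumes that the divergence of the stress-energy
tensor reduces to $e\,F_{i\gamma}u_\gamma$ with the fourth velocity
component equal to i; the sample values of the field,
the velocity and the charge density are illustrative.

```python
def field_tensor(X, Y, Z, L, M, N):
    """Square array of F_ij with the imaginary fourth coordinate."""
    i = 1j
    return [[0, N, -M, -i * X],
            [-N, 0, L, -i * Y],
            [M, -L, 0, -i * Z],
            [i * X, i * Y, i * Z, 0]]

X, Y, Z, L, M, N = 1.0, 2.0, 3.0, 4.0, 5.0, 6.0
u, v, w = 0.1, -0.2, 0.3   # ordinary velocity components
e = 2.0                    # electric density

F = field_tensor(X, Y, Z, L, M, N)
velocity = [u, v, w, 1j]   # fourth component set equal to i

# Index form: e * F_i(gamma) u_(gamma), summed over gamma.
force = [e * sum(F[a][g] * velocity[g] for g in range(4))
         for a in range(4)]

# The same four quantities written out without indices.
expected = [e * (N * v - M * w + X),
            e * (L * w - N * u + Y),
            e * (M * u - L * v + Z),
            e * 1j * (X * u + Y * v + Z * w)]

print(force)
```

The first three components are real and give the
force on the charged matter; the fourth is purely
imaginary and carries the rate of work.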
It is time now to cast a glance on the
situation as it has been worked out till now. We
have ten fundamental quantities, $\rho$,u,v,w,X,Y,Z,
L,M,N. They satisfy certain equations, the
Maxwell equations (4.61, 4.62), the equation of
continuity (3.5), the equations of motion (6.3);
in the last named equations our ten quantities
enter in certain combinations which are the
components of the tensor $T_{ij}$. This tensor appears
then as a very fundamental one. It may be asked
whether it determines the ten quantities which
enter into it; if it does, all the quantities we
have been considering are, in a general sense,
components of one entity, the tensor $T_{ij}$, and
all the equations we have introduced express
properties of this tensor - that part of Physics
which we are discussing in this book, with the
exception of the gravitational field, appears
then as the study of the tensor $T_{ij}$. It can be
proved that $T_{ij}$ with certain restrictions
determines the quantities $\rho$,u,v,w,X,Y,Z,L,M,N, and in
Chapter V it will be shown that the
gravitational phenomena also are taken care of by it.
Chapter II.
NEW GEOMETRY
In the preceding chapter we achieved by
introducing appropriate notations a great simplicity
and uniformity in our formulas. The
notations in which indices take the values from 1
to 4 are modeled after those previously
introduced in ordinary geometry, the two points of
distinction being first that we have four
independent variables instead of the three
coordinates, and second, that the fourth variable is
assigned imaginary values. In spite of these
distinctions the analogy with ordinary geometry
is very great, and we shall profit very much by
pushing this analogy as far as possible, and
using geometrical language, as well as nota-
tions modeled after those of geometry.
Physics seems to require then, a mathema-
tical theory analogous to geometry and differing
from it only in that it must contain four coor-
dinates, one of which is imaginary. The first
purpose of this chapter will be to build a the-
ory to these specifications. The remaining part
of the chapter will be devoted to a more syste-
matic treatment of tensor analysis.
7. Analytic Geometry of Four Dimensions.
In the present section we shall give a
brief outline of properties which we may
expect from a four-dimensional geometry guided
by analogy with two and three-dimensional
geometries; of course, we shall lay stress mainly
on those features which we shall need for the
application to Physics that we have in mind. In
this outline we shall disregard the fact that
our fourth coordinate must be imaginary;
certain peculiarities connected with this
circumstance will be treated in section 10.
The equations of a straight line we
expect to be written in the form
7.1
$$\frac{x_1 - a_1}{v_1} = \frac{x_2 - a_2}{v_2} = \frac{x_3 - a_3}{v_3} = \frac{x_4 - a_4}{v_4}$$
entirely similar to that used in solid analytic
geometry; but we may also use another form; as
written out the equations state that for every
point of the line the four ratios have the same
value; denoting this value by p we may express
the condition that a point belongs to the line
by stating that its coordinates may be written
as
7.11
$$x_1 = a_1 + pv_1,\quad x_2 = a_2 + pv_2,\quad x_3 = a_3 + pv_3,\quad x_4 = a_4 + pv_4;$$
giving here to p different values we obtain (for
given $a_i$, $v_i$) the coordinates of all the
different points of the line. The variable p is
called the parameter, and this whole way of
describing a line is called "parametric
representation". Parametric representation is by no
means peculiar to four dimensions; it may be,
and is, often used in plane and solid analytic
geometry. We present it here because we shall
need it later, and it is not always
sufficiently emphasized.
A straight line is determined by two
points; the equations of the line through the
points $a_i$ and $b_i$ are given by the above
equations (7.1) in which
7.2
$$v_1 = b_1 - a_1,\quad v_2 = b_2 - a_2,\quad \text{etc.}$$
Two points determine a directed segment or
vector, whose components are the differences
between the corresponding coordinates of the
points, so that the $v_i$ are the components of the
vector whose initial point is given by $a_i$ and
whose final point is given by $b_i$.
A vector determined by two points of a
straight line is said to belong to that line,
and we may say that we can use as denominators
in the equations 7.1, or as coefficients of p
in 7.11 the components of any vector belonging
to the straight line.
Two vectors are considered equal if they
have equal components; a vector is multiplied
by a number by multiplying its components by
that number, and two vectors are added by add-
ing their corresponding components.
Two lines are parallel if they contain
equal vectors, and it is easy to see that a
condition for parallelism of line 7.11 and
7.3
$$\frac{x_1 - A_1}{V_1} = \frac{x_2 - A_2}{V_2} = \frac{x_3 - A_3}{V_3} = \frac{x_4 - A_4}{V_4}$$
is
$$v_1/V_1 = v_2/V_2 = v_3/V_3 = v_4/V_4 \quad\text{or}\quad V_i = \alpha v_i$$
where $\alpha$ is a number, so that proportionality of
components of two vectors means parallelism.
A condition for perpendicularity of these
two lines we expect to be the vanishing of the
expression
7.4
$$v_1V_1 + v_2V_2 + v_3V_3 + v_4V_4$$
which is called the scalar product of the two
vectors $v_i$ and $V_i$.
The distance between the points $a_1, a_2, a_3,
a_4$ and $b_1, b_2, b_3, b_4$ is given by the square
root of the expression
7.41
$$(a_1 - b_1)^2 + (a_2 - b_2)^2 + (a_3 - b_3)^2 + (a_4 - b_4)^2;$$
this distance is also considered as the length
of the vector joining $a_i$ and $b_i$. The
expression for the square of the length of a vector
may be considered as a special case of the
expression 7.4. We may say then, that the square
of the length of a vector is the square of the
vector, i.e., the product of the vector with
itself.
We shall often use Roman letters to denote
vectors. The scalar product of the vectors x
and y will be denoted by x.y and the square of
the vector x by $x^2$.
A vector whose length, or whose square, is
unity we shall call a unit vector. Its
components, we would expect, may be considered as
the direction cosines of the line on which the
vector lies. We also expect that the scalar
product of two vectors is equal to the product
of their lengths times the cosine of the angle
between them; but, of course, an angle between
two vectors in four-dimensional space has not
been defined, so that we could simply define
the angle between two vectors by this property,
or by the formula
7.5
$$\cos\varphi = \frac{x.y}{\sqrt{x^2}\,\sqrt{y^2}}.$$
But if we want the angle to be a real quantity
the absolute value of this expression cannot
exceed unity, or
$$(x.y)^2 \le x^2\,y^2 \quad\text{or}\quad (x.y)^2 - x^2y^2 \le 0.$$
If we form the vector $\lambda x + \mu y$ where $\lambda$ and $\mu$ are
two numbers, the square of that vector would be
$$\lambda^2x^2 + 2\lambda\mu\,x.y + \mu^2y^2$$
and the above inequality, which expresses the
fact that the discriminant of this expression is
not positive, is seen to be a consequence of the
assumption that a square of a vector is never
negative.
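The definition of the angle and the inequality on
which it rests can be tried out on two vectors with
four real components. The sketch below is merely an
illustration; the function names and the sample
vectors are assumptions.

```python
import math

def dot(x, y):
    """Scalar product of two vectors given by their components."""
    return sum(p * q for p, q in zip(x, y))

def angle(x, y):
    """Angle defined by cos(phi) = x.y / (sqrt(x.x) * sqrt(y.y))."""
    return math.acos(dot(x, y) / math.sqrt(dot(x, x) * dot(y, y)))

x = [1.0, 0.0, 0.0, 0.0]
y = [1.0, 1.0, 0.0, 0.0]

phi = angle(x, y)

# (x.y)^2 - x^2 y^2, which must not be positive.
discriminant = dot(x, y) ** 2 - dot(x, x) * dot(y, y)

def square_of_combination(lam, mu):
    """Square of the vector lam*x + mu*y."""
    v = [lam * p + mu * q for p, q in zip(x, y)]
    return dot(v, v)

print(phi, discriminant, square_of_combination(2.0, -3.0))
```

For real components the square of every combination
is non-negative, which is what keeps the cosine
between minus one and one.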
A plane we would expect to be determined by
three points not in a line, or by two vectors
with the same initial point or by two lines
through a point. Instead of characterizing a
plane by equations we prefer to give it in
parametric form; limiting ourselves to a plane
through the origin we have
7.12
$$x_1 = pa_1 + qb_1,\quad x_2 = pa_2 + qb_2,\quad x_3 = pa_3 + qb_3,\quad x_4 = pa_4 + qb_4,$$
where $a_i$ and $b_i$ are the coordinates of two
points in the plane or the components of two
vectors of the plane whose initial points may
be considered as at the origin. We shall write
this formula also as
$$x = \alpha a + \beta b$$
where x, a, b stand for vectors whose
components are $x_i$, $a_i$, $b_i$ and where we use Greek
letters for parameters in order to avoid
confusion with vectors which we denote now by
Roman letters.
We always can choose two mutually
perpendicular unit vectors as the two vectors
determining a plane; if we call these vectors i and
j the preceding formula becomes
7.6
$$x = \alpha i + \beta j.$$
The fact that i and j are perpendicular
unit vectors may be written as
$$i.j = 0,\qquad i^2 = j^2 = 1.$$
It is easy to see that in this case $\alpha$ and $\beta$
are the projections of x on the directions of
i and j, or the scalar products of x with i and
j respectively.
Every pair of coordinate axes determines a
plane and since six pairs can be formed from
four objects we have six coordinate planes.
In the same way that the direction of a
straight line is determined by a configuration
of two points on it - a vector, the "orienta-
tion" of a plane may be determined by the con-
figuration of three points on it - a triangle.
A vector is given by its components, which are
the lengths of its projections on the
coordinate axes; in the same way a triangle may be
characterized (to a certain extent) by the areas
of its projections on the coordinate planes. If,
for example, we take a triangle one of whose
vertices is at the origin 0,0,0,0 and the two
others at the points $x_i$ and $y_i$ respectively, the
areas of the projections will be the six
quantities
$$F_{ij} = \tfrac12(x_iy_j - x_jy_i).$$
It is interesting to compare these numbers,
which satisfy the relation
$$F_{ij} + F_{ji} = 0$$
with the components of the tensor $F_{ij}$
(compare 4.4), three of which have been identified
with the electric, and the other three with the
magnetic force components. Our ten fundamental
quantities $\rho u$, $\rho v$, $\rho w$, $\rho$, X, Y, Z, L, M, N seem
to allow thus a geometrical interpretation; the
first four are considered as the four
projections of a part of a straight line on the
coordinate axes, the remaining six, as the
projections of a part of a plane on the coordinate
planes.
A little against our expectations, however,
these six quantities $F_{ij}$ are not independent;
the reader will easily verify, using the above
expressions, that
7.7
$$F_{23}F_{14} + F_{31}F_{24} + F_{12}F_{34} = 0.$$
We have here a relation that exists in the
mathematical theory; at once the question arises:
does a corresponding relation hold for the
corresponding quantities in the physical theory?
According to the formulas 4.5 this would mean
$$LX + MY + NZ = 0,$$
i.e., perpendicularity of the electric and
magnetic force vectors; these vectors are, however,
known not to be necessarily perpendicular to
each other; our identification is therefore
faulty; a slight modification would, however,
help to overcome the difficulty; if instead of
considering the areas of projections of a
triangular contour, we consider the areas of
projections of an arbitrary contour, not
necessarily a flat one, then the six quantities are
independent and the formal analogy holds
perfectly.
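The relation between the six projections of a flat
triangle, and its failure for a contour that is not
flat, may be checked on numbers. The sketch below
takes the signed areas as $\tfrac12(x_iy_j - x_jy_i)$ and the
relation in the form $F_{23}F_{14} + F_{31}F_{24} + F_{12}F_{34} = 0$;
the non-flat contour is imitated, as an assumption of
this example, by adding the projections of two
triangles lying in different coordinate planes.

```python
def projections(x, y):
    """F_ij = (x_i y_j - x_j y_i) / 2 for the triangle 0, x, y."""
    return [[0.5 * (x[i] * y[j] - x[j] * y[i]) for j in range(4)]
            for i in range(4)]

def relation(F):
    """F_23 F_14 + F_31 F_24 + F_12 F_34 (list indices start at 0)."""
    return F[1][2] * F[0][3] + F[2][0] * F[1][3] + F[0][1] * F[2][3]

# A flat triangular contour with one vertex at the origin.
flat = projections([1.0, 2.0, 0.0, -1.0], [3.0, 0.0, 1.0, 2.0])

# Two triangles in the x1x2 and x3x4 coordinate planes, added together.
first = projections([1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0])
second = projections([0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0])
not_flat = [[first[i][j] + second[i][j] for j in range(4)]
            for i in range(4)]

print(relation(flat), relation(not_flat))
```

For the flat triangle the expression vanishes; for
the combined contour it does not, so that the six
quantities are no longer tied together.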
Returning to the plane we may mention that
although it might seem strange at first glance,
we should expect that two planes may have only
one point in common - a situation which never
occurs in three dimensions. An example of two
planes with only one common point is given by
the $x_1x_2$ and the $x_3x_4$ coordinate planes; the
common point is, of course, the origin.
Four points not in a plane, or three
vectors with a common origin, we expect to
determine a "solid" which may be defined as the
totality of points of three kinds: (1) points on
the lines determined by the given vectors;
(2) points on lines joining two points of the
first kind; and (3) points of lines joining two
points of the second kind. In ordinary
geometry a configuration defined in this way exhausts
all points, but not so in our four-dimensional
geometry; as examples may serve the four
coordinate solids, the totalities respectively of
points satisfying the relations $x_1 = 0$, $x_2 = 0$,
$x_3 = 0$, $x_4 = 0$.
A parametric representation of a solid is
analogous to that of a plane. For a solid
through the origin we have as such parametric
representation
7.61
$$x = \alpha i + \beta j + \gamma k$$
where i,j,k have again been chosen as
perpendicular unit vectors so that
$$i.j = j.k = k.i = 0;\qquad i^2 = j^2 = k^2 = 1.$$
For all possible values of $\alpha$, $\beta$, $\gamma$ we
obtain all the points of the solid through the
origin determined by the vectors i,j,k.
Next we consider a configuration
determined by five points not in a solid; we obtain all
the points of our four-dimensional space.
Incidentally, as a generalization of the formulas
7.6 and 7.61 we may write now
7.62
$$x = \alpha i + \beta j + \gamma k + \delta l;$$
this formula gives the expression of every
vector with initial point at the origin in terms
of four mutually perpendicular unit vectors. We
have, of course,
7.8
$$i.j = i.k = i.l = j.k = j.l = k.l = 0,\qquad i^2 = j^2 = k^2 = l^2 = 1.$$
8. Axioms of Four-Dimensional Geometry.
Until now we have been listing some
propositions that we may expect to have in
four-dimensional geometry. But what is four-dimensional
geometry? As an abstract mathematical
theory it is just a collection of statements of
which some may be taken without proof and
considered as axioms and definitions, and the
others are deduced from them as theorems. It is
not difficult then to pass from our expectations
of a four-dimensional geometry to a realization
of such a geometry; all we would have to do
would be to pick out certain of the propositions
listed above and consider them as axioms and to
show that the others can be deduced from them.
But in so doing we do not want to include
among our axioms propositions involving coor-
dinates. In two and three-dimensional geometry
we are accustomed to see analytic geometry based
on the study of elementary geometry that precedes
it. We have an idea of what a straight
line is before we come to coordinate axes, and
we choose three of these pre-existing straight
lines to play the part of coordinate axes. We
may choose these axes to a certain extent arbi-
trarily (this arbitrariness being restricted
only by our desire to have a rectangular system);
the coordinate axes play only an auxiliary
role, and it would be awkward to include any
reference to a particular coordinate system in
the axioms. We look, therefore, among the propositions
mentioned in the preceding section for
some that are independent of a coordinate system
and from which we can reconstruct the whole system.
As our fundamental undefined conception we
choose "vector". The axioms that follow will
in the main be rules of operations on vectors.
Axiom I. Every two vectors have a sum. Addition
is commutative, $a + b = b + a$, and associative,
i.e., $a + (b + c) = (a + b) + c$; subtraction
is unique, i.e., for every two vectors a
and b there exists one and only one vector x
such that $a = b + x$. It follows that there
exists a vector 0 that satisfies the relation
$a + 0 = a$ for every a.
Axiom II. Given a number $\alpha$ and a vector a
there exists a vector $\alpha a$ or $a\alpha$ which is called
their product. The associative law holds in
the sense that $\alpha(\beta a) = (\alpha\beta)a$, and so do the
distributive laws $(\alpha + \beta)a = \alpha a + \beta a$ and
$\alpha(a + b) = \alpha a + \alpha b$.
Axiom III. To every two vectors a and b corresponds
a number $a \cdot b$ or $ab$ called their (scalar)
product. Scalar multiplication is commutative,
$a \cdot b = b \cdot a$; together with multiplication
of vectors by numbers it obeys the associative
law, $\alpha(a \cdot b) = (\alpha a) \cdot b$; and, together with
addition, the distributive law $a \cdot (b + c) = a \cdot b + a \cdot c$.
Before we formulate the next axiom we introduce
a definition: the vectors a, b, c, ...
are called linearly dependent if there exist
numbers $\alpha$, $\beta$, $\gamma$, ..., not all zero, such that
$\alpha a + \beta b + \gamma c + \dots = 0$;
they are called linearly independent when no
such numbers exist.
Axiom IV. There are four independent vectors,
but there are no five independent vectors.
Axiom V. If a vector is not zero its square is
positive.
To these axioms on vectors we have to add
some statements concerning points if we want to
have a geometry, and as such we may take:
Axiom VI. Every two points A, B have as their
difference a vector h; or, in formulas, $B - A = h$,
$B = A + h$.
Axiom VII. $(A - B) + (B - C) = A - C$.
The body of propositions which may be de-
duced from these axioms we call four-dimension-
al geometry.
In order to prove that the whole of geomet-
ry can be deduced from the propositions I-VII we
would have to actually deduce it. We shall not
do it, of course, but we shall indicate how ana-
lytic geometry can be arrived at in the follow-
ing discussion, which is meant to be entirely
formal, i.e., during which it is not intended to
invoke our intuition but only the properties
stated in the axioms.
By length of a vector we mean the positive
square root of the product of the vector by itself:
$|a| = \sqrt{a^2}$. Two vectors are considered perpendicular
if their scalar product is zero. A
vector is called a unit vector if its length
is 1.
Lemma. There exist four mutually perpendicular
unit vectors.
Proof. According to Axiom IV there exist
four independent vectors a, b, c, d; i.e., such
vectors that there are no four numbers $\alpha$, $\beta$, $\gamma$,
$\delta$, not all zero, such that $\alpha a + \beta b + \gamma c + \delta d = 0$.
It follows that $a \neq 0$, because if it were zero,
the numbers $\alpha = 1$, $\beta = 0$, $\gamma = 0$, $\delta = 0$ would
satisfy the above relation. Call a, multiplied
by the reciprocal of its length, i, so that
$i = a/|a|$. It is easy to see that i is a unit
vector, and that i, b, c, d are independent. Consider
the vector $b' = b - i(bi)$. It is easy to
see that it is perpendicular to i; multiply $b'$
by the reciprocal of its length and call the
result j; then i and j are two perpendicular
unit vectors and i, j, c are independent. Call
$c' = c - i(ci) - j(cj)$; this vector is perpendicular
to both i and j, and if we multiply it
by the reciprocal of its length and call the
result k, we have in i, j, k three perpendicular
unit vectors; the fourth vector $\ell$ may be obtained
from d in an analogous way.
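The construction used in this proof may be tried out numerically. The following Python lines are only a sketch and not part of the formal development: they assume that vectors are given by four real components and that the scalar product is the ordinary sum of products, and the names `dot` and `orthonormalize`, as well as the four sample vectors, are chosen here merely for illustration.

```python
import math

def dot(a, b):
    """Scalar product (Axiom III), realized here on lists of components."""
    return sum(p * q for p, q in zip(a, b))

def orthonormalize(vectors):
    """Follow the lemma: subtract from each vector its parts along the unit
    vectors already found, then multiply by the reciprocal of the length."""
    units = []
    for v in vectors:
        w = list(v)
        for u in units:
            coeff = dot(v, u)
            w = [wi - coeff * ui for wi, ui in zip(w, u)]
        length = math.sqrt(dot(w, w))  # not zero when the inputs are independent
        units.append([wi / length for wi in w])
    return units

# Four independent vectors, picked arbitrarily (an assumption of this sketch).
a = [1.0, 1.0, 0.0, 0.0]
b = [1.0, 0.0, 1.0, 0.0]
c = [0.0, 1.0, 1.0, 1.0]
d = [1.0, 0.0, 0.0, 2.0]
i, j, k, l = orthonormalize([a, b, c, d])

# Every vector is recovered from its products with i, j, k, l.
x = [0.3, -1.2, 2.0, 0.7]
rebuilt = [sum(dot(u, x) * u[n] for u in (i, j, k, l)) for n in range(4)]
```

The last two lines check, on one sample vector, that a vector can be put together again from its scalar products with the four unit vectors.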
Lemma II. Given any vector x we have the
identity
8.1    $x = i(ix) + j(jx) + k(kx) + \ell(\ell x)$.
Proof. Since every five vectors are dependent
(Axiom IV), there exist five numbers $\alpha$, $\beta$, $\gamma$,
$\delta$, $\epsilon$, not all zero, such that
$\alpha i + \beta j + \gamma k + \delta \ell + \epsilon x = 0$;
here $\epsilon$ cannot be zero since otherwise i, j, k, $\ell$
would be dependent and we know that they are
not. Dividing by $-\epsilon$ we have
$x = (-\alpha/\epsilon)i + (-\beta/\epsilon)j + (-\gamma/\epsilon)k + (-\delta/\epsilon)\ell$;
multiplying by i, the three last terms on the
right vanish as the result of perpendicularity
of i to j, k and $\ell$, and the first term becomes
$-\alpha/\epsilon$, whereas the left hand side is ix; in the
same way we prove that $-\beta/\epsilon$, $-\gamma/\epsilon$, $-\delta/\epsilon$ have the values
jx, kx, $\ell x$, respectively, and Lemma II is
proved.
A set of four perpendicular unit vectors
we shall call a set of coordinate vectors. We
shall call the quantities $x_1 = ix$, $x_2 = jx$,
$x_3 = kx$, $x_4 = \ell x$ the components of x with respect
to i, j, k, $\ell$. A point O together with a
set of coordinate vectors we call a coordinate
system. Given a coordinate system we can assign
to every point X four coordinates in the
following way: denote the vector $X - O$ by x
(according to Axiom VI), and call the components
of the vector x the coordinates of the
point X. If now we choose another origin $O'$
and the same set of coordinate vectors, the coordinates
of the point X will be the components
of $X - O'$; but since, according to Axiom VII,
$X - O = (X - O') + (O' - O)$, the old coordinates
will be equal to the new coordinates plus
the old coordinates of the new origin, and thus
a connection is established with ordinary analytic
geometry.
Formula 8.1 may be compared with the formula
7.62. It may also be written as
8.2    $x = x_1 i + x_2 j + x_3 k + x_4 \ell$.
9. Tensor Analysis.
We want to substitute now for the prelimin-
ary definitions of tensor analysis that were
suggested by the formal developments in Chapter
I, a definition that is more satisfactory. At
that time our point of view was simply that we
shall consider symbols with two (or more) in-
dices as the tensor components in a way similar
to that of using symbols with one index as vec-
tor components. But in case of vectors (in or-
dinary space) we know what vectors are, and we
consider the components as a method of repre-
senting that known thing. In the case of ten-
sors we seem to have to take representation as
the starting point of our study. The situation
seems complicated since the fact that there is
not one but that there are many different rep-
resentations of the same vector, depending on
the coordinate system we use leads us to think
that the same general situation will obtain in
the case of tensors, and the question arises naturally:
how shall we be able to find out, given
two representations of a tensor, whether it
is the same tensor that is represented in the
two cases; or, given a representation of a ten-
sor in one system of coordinates how to find
the representation of the same tensor in a giv-
en other system. In order to be able to answer
such questions intelligently we want to intro-
duce the idea of the tensor itself, to put it in
the foreground and to consider the components as
something secondary. In the beginning we shall
limit ourselves to the consideration of two di-
mensions.
We look then for some entity, of which the
components will be constituent parts. The first
thing that occurs to our minds in connection
with tensors of rank two is, of course, a deter-
minant. It is a single number determined by its
elements, or components. However, it cannot be
used for our purposes because the determinant
does not, in turn, determine its components.
Another instance where two index symbols
occur in mathematics is the case of quadratic
forms. The equation of a central conic may, for
example, be written in the form $ax^2 + 2bxy + cy^2 = 1$;
or, introducing the notations $x_1$ for x, $x_2$
for y, $a_{11}$ for a, $a_{12}$ and $a_{21}$ for b, and $a_{22}$ for
c, in the form
$a_{11}x_1x_1 + a_{12}x_1x_2 + a_{21}x_2x_1 + a_{22}x_2x_2 = 1$.
Let us consider the left hand side of this equation.
Here the tensor components $a_{ij}$ are combined
(together with the variables $x_1$, $x_2$) into
one expression; and they can, to a certain extent,
be gotten back from that expression. If
we set $x_1 = 1$, $x_2 = 0$, for instance, we get $a_{11}$
as the value of the expression; $a_{22}$ can be gotten
in a similar way, but it would be difficult
to imagine how $a_{12}$ could be obtained; in fact,
it is impossible to get $a_{12}$ from this expression,
because two expressions which differ in
their form but for which $a_{12} + a_{21}$ has the same
value would give the same values for all combinations
of values of $x_1$, $x_2$. A slight generalization
will, however, obviate this difficulty.
This generalization is suggested by the
equation of the tangent to the above conic and
can be written as
9.1    $\varphi = a_{11}x_1y_1 + a_{12}x_1y_2 + a_{21}x_2y_1 + a_{22}x_2y_2$;
here, for instance, setting $x_1 = 1$, $x_2 = 0$,
$y_1 = 0$, $y_2 = 1$ we get $a_{12}$. We shall therefore
consider the bilinear form above as the tensor.
If we do that we may free ourselves of coordinates
easily. The variables $x_1$, $x_2$ and $y_1$,
$y_2$ may be considered as the components of two
vectors, and the above expression 9.1 furnishes
us then a numerical value every time these two
vectors are given; it may be considered as defining
a function $\varphi$; the arguments of that
function are the two vectors and the values are
the numbers calculated by substituting the components
of these vectors in the expression 9.1.
This functional dependence we may consider as
the tensor, so that if we want to use another coordinate
system we shall have the same vectors
given by different components $x'_1$, $x'_2$ and $y'_1$,
$y'_2$ and we expect to find another expression of
the same type as 9.1 involving these new components,
say
9.11    $\varphi = a'_{11}x'_1y'_1 + a'_{12}x'_1y'_2 + a'_{21}x'_2y'_1 + a'_{22}x'_2y'_2$,
which would assign the same values to the two
vectors. The coefficients will, of course, be
different, and these new coefficients we shall
consider as the new components of the same tensor
in the new coordinate system.
Let us perform the calculation. If we rotate
our axes through an angle $\phi$ the old coordinates
are expressed in the new coordinates by
the formulas
9.2    $x_1 = x'_1 c - x'_2 s$,  $x_2 = x'_1 s + x'_2 c$,
where
9.3    $c = \cos\phi$,  $s = \sin\phi$;
the components of the other vector $y_1$, $y_2$ will
be expressed by analogous formulas in terms of
the new components of the same vector. Substituting
these expressions in the above bilinear
form (9.1) we get
$a_{11}(x'_1 c - x'_2 s)(y'_1 c - y'_2 s) + a_{12}(x'_1 c - x'_2 s)(y'_1 s + y'_2 c)$
$+ a_{21}(x'_1 s + x'_2 c)(y'_1 c - y'_2 s) + a_{22}(x'_1 s + x'_2 c)(y'_1 s + y'_2 c)$,
which may be written as 9.11 if we give to
$a'_{ij}$ the values
9.21
$a'_{11} = a_{11}c^2 + a_{12}cs + a_{21}sc + a_{22}s^2$,
$a'_{12} = -a_{11}cs + a_{12}c^2 - a_{21}s^2 + a_{22}sc$,
$a'_{21} = -a_{11}sc - a_{12}s^2 + a_{21}c^2 + a_{22}cs$,
$a'_{22} = a_{11}s^2 - a_{12}sc - a_{21}cs + a_{22}c^2$.
These are the new components of the tensor
whose old components are the $a_{ij}$. The equation
of the conic section in the new coordinates has
the same form as in the old system. We may say
that 9.11 expresses the same functional dependence
on the two vectors using their new components,
as 9.1 does using their old components.
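The statement just made can be checked on numbers. The Python sketch below is an illustration only: the function names, the sample components and the angle are assumptions made here. It computes the new components by formulas 9.21 and compares the value of the form in the two systems.

```python
import math

def phi(a, x, y):
    """Bilinear form of the type 9.1: sum of a[p][q] * x[p] * y[q]."""
    return sum(a[p][q] * x[p] * y[q] for p in range(2) for q in range(2))

def old_from_new(v_new, angle):
    """Formulas 9.2: old components expressed through the new ones."""
    c, s = math.cos(angle), math.sin(angle)
    return [v_new[0] * c - v_new[1] * s, v_new[0] * s + v_new[1] * c]

def transform_components(a, angle):
    """Formulas 9.21: components of the same tensor in the rotated system."""
    c, s = math.cos(angle), math.sin(angle)
    a11, a12, a21, a22 = a[0][0], a[0][1], a[1][0], a[1][1]
    return [
        [a11*c*c + a12*c*s + a21*s*c + a22*s*s,
         -a11*c*s + a12*c*c - a21*s*s + a22*s*c],
        [-a11*s*c - a12*s*s + a21*c*c + a22*c*s,
         a11*s*s - a12*s*c - a21*c*s + a22*c*c],
    ]

a = [[2.0, 1.0], [-3.0, 0.5]]            # sample old components
angle = 0.7                              # sample angle of rotation
x_new, y_new = [1.5, -0.4], [0.3, 2.0]   # two vectors, by new components
a_new = transform_components(a, angle)
value_old = phi(a, old_from_new(x_new, angle), old_from_new(y_new, angle))
value_new = phi(a_new, x_new, y_new)
# value_old and value_new agree up to rounding.
```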
The components of a tensor change, in general,
when we pass from one coordinate system to
another, but there are certain combinations which
do not change; for instance, if we add together
the first and the last of the four above equalities
we obtain, taking into account that
9.4    $c^2 + s^2 = 1$,
the relation
9.5    $a'_{11} + a'_{22} = a_{11} + a_{22}$;
also it is easy to prove that the expression
9.51    $\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}$
is not affected by the substitution of the
primed components for the unprimed ones.
Expressions of this kind are called invariants.
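Both invariants may be verified numerically. In the sketch below (an illustration under assumptions made here, with names and sample values not taken from the text) the new components are formed as the sum of r[p][m] a[p][q] r[q][n], where r is the table of coefficients of formulas 9.2; this is formulas 9.21 written in compact form.

```python
import math

def rotate_components(a, angle):
    """New components a'[m][n] = sum over p, q of r[p][m] * a[p][q] * r[q][n]."""
    c, s = math.cos(angle), math.sin(angle)
    r = [[c, -s], [s, c]]   # coefficients of formulas 9.2
    return [[sum(r[p][m] * a[p][q] * r[q][n] for p in range(2) for q in range(2))
             for n in range(2)] for m in range(2)]

def trace(a):
    """The combination appearing in 9.5."""
    return a[0][0] + a[1][1]

def determinant(a):
    """The determinant of the components."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

a = [[2.0, 1.0], [-3.0, 0.5]]   # sample components
differences = []
for angle in (0.3, 1.1, 2.5):
    a_new = rotate_components(a, angle)
    differences.append((trace(a_new) - trace(a),
                        determinant(a_new) - determinant(a)))
# Every entry of `differences` vanishes up to rounding.
```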
We have thus achieved our purpose; although
we use coordinates in the definition of a tensor
the result is independent of the particular co-
ordinate system used. We can go a step farther,
however, and dispense with coordinates altogether
in the definition of a tensor. Not every
functional dependence of a number $\varphi$ on two vectors
shall we call a tensor; the dependence on
each vector must be linear (and homogeneous); by
this we mean that the expression involves only
first powers of the components of each vector,
and no products of components of the same vector;
our conception of linearity seems thus to
involve components and we are not rid of coordinate
systems yet. We want therefore to define
linearity independently of coordinate representation,
and we shall see that the following
definition, entirely independent of coordinates,
leads to the same results.
We say, in general, that $\varphi(x)$ depends on
its argument x linearly if
9.6    $\varphi(\lambda x + \mu y) = \lambda\varphi(x) + \mu\varphi(y)$.
It is easy to see that linearity defined
in terms of coordinates as dependence involving
only first powers of the components satisfies
this condition. We arrive thus at the following
definition of a tensor:
A tensor of rank r is a function which assigns
to r vector arguments numerical values,
the dependence of the value on each argument
being linear in the sense of 9.6.
We can prove that an expression of a tensor
as a bilinear (or multilinear) form may be
gotten back from this general definition. In
fact, if given, e.g., a tensor of rank two, i.e.,
with two vector arguments, $\varphi(x,y)$, we substitute
for x and y their expressions in terms of components
and unit vectors (see 7.6)
$x = i x_1 + j x_2$,  $y = i y_1 + j y_2$,
we may write, using the above definition of a
tensor and that of linearity (9.6):
$\varphi(x,y) = \varphi(i,i)x_1y_1 + \varphi(i,j)x_1y_2 + \varphi(j,i)x_2y_1 + \varphi(j,j)x_2y_2$.
We see that this expression differs from
that given by 9.1 as a bilinear form only in
that $\varphi(i,i)$, $\varphi(i,j)$, $\varphi(j,i)$, $\varphi(j,j)$ appear instead
of $a_{11}$, $a_{12}$, $a_{21}$, and $a_{22}$. From this
point of view the conception of a tensor is entirely
independent of a coordinate system and
of components. We obtain tensor components
when we introduce a set of coordinate vectors;
and transformation of coordinates corresponds
to replacing of one set of coordinate vectors
by a new set.
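The passage from the tensor as a function to its components can be imitated in a few lines of Python. This is a sketch only; the particular function `phi` below and the helper names are hypothetical and serve merely to show that the components are the values of the function on pairs of coordinate vectors.

```python
def components(phi, basis):
    """Components of a tensor of rank two: its values on pairs of coordinate vectors."""
    return [[phi(u, v) for v in basis] for u in basis]

def from_components(a, x, y):
    """The bilinear form of the type 9.1 built from the components."""
    return sum(a[p][q] * x[p] * y[q] for p in range(len(x)) for q in range(len(y)))

# A hypothetical tensor, given only as a function linear in each of its two arguments.
def phi(x, y):
    return 3.0 * x[0] * y[0] - 2.0 * x[0] * y[1] + 5.0 * x[1] * y[0] + x[1] * y[1]

i, j = [1.0, 0.0], [0.0, 1.0]
a = components(phi, [i, j])     # [[3.0, -2.0], [5.0, 1.0]]
x, y = [0.5, -1.0], [2.0, 4.0]
# phi(x, y) and from_components(a, x, y) give the same number.
```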
We pass now to the consideration of operations
on tensors. We had three such operations:
multiplication, contraction and differentiation.
Multiplication is simple. If we have two
tensors, say $f(x,y)$ and $g(z,u,v)$, we obtain a
tensor of rank five by multiplying these two
together:
$h(x,y,z,u,v) = f(x,y) \cdot g(z,u,v)$;
it is easy to see that the components of h are
obtained from the components of f and g in the
following fashion:
$h_{ijklm} = f_{ij}\,g_{klm}$.
Next comes contraction. We have defined it
in indices; i.e., given the components of a tensor
of rank two in a certain coordinate system
we have a definite rule for obtaining a scalar,
viz., $a_{11} + a_{22}$; the question arises: will we
obtain the same scalar if we use another system
of coordinates, in other words, is the definition
independent of the system of coordinates,
is it invariant? Yes, this invariance has been
proved above by formula 9.5. We have now the
right to use the definition of contraction in
terms of components, knowing that it has an in-
trinsic meaning, that the result is independent
of the system of coordinates used.
We are in a position now to answer a ques-
tion that must have arisen in the mind of the
reader. In the preceding chapter we agreed to
consider a vector as a tensor of rank one. Here
with our new definition of tensor a vector and
a tensor of rank one seem to be two entirely
different things; but we may consider together
with every vector a tensor of rank one which
has the same components. To find a tensor $f(x)$
which has the same components as the vector v
we have to make $f(i) = v_1$, $f(j) = v_2$, and we
have
$f(x) = x_1 f(i) + x_2 f(j) = x_1 v_1 + x_2 v_2$,
so that the value of this tensor $f(x)$ is simply
the scalar product of the vector to which it
corresponds by the vector argument.
Incidentally, this raises a question as to
the nature of the scalar product; if we define
it as $x_\alpha v_\alpha$, is it invariant? It is, since it may be considered
as resulting from two tensors of rank one by
first multiplying them and then contracting the
resulting tensor of rank two, and both operations are invariant.
We also might at this place say a few words
about the symbols of Kronecker $\delta_{ij}$. We may try
to consider these symbols as the components of
a tensor in some coordinate system. The value
of the tensor will then be given by $\delta_{\alpha\beta}x_\alpha y_\beta$, and
this is easily seen to be $x_\alpha y_\alpha$, the scalar product
of the vectors x and y, and thus independent
of the coordinate system. We may then speak of
the tensor $\delta_{ij}$ without mentioning the coordinate
system because its components are the same in
all coordinate systems.
The square of a vector may be defined as
the scalar product of a vector with itself; it
is the sum of the squares of the components in
any coordinate system.
We next take up the operation of differentiation.
Let us begin with a scalar field f; f
is a function of the coordinates which we do not
put in evidence. The coordinates of a point P
are the components of the vector OP which joins
the origin to the point in question and they depend
in the fashion discussed before on the
choice of the coordinate vectors.
After choosing a definite coordinate system
we may assign to f in every point a vector by
agreeing that the components of this vector
should be
9.71    $v_1 = \partial f/\partial x_1$  and  $v_2 = \partial f/\partial x_2$;
given another coordinate system, the relation between
$x_i$ and $x'_i$ being given by formulas 9.2,
we can form the derivatives
9.72    $\partial f/\partial x'_1$  and  $\partial f/\partial x'_2$
and consider them as the components in the new
coordinate system of a vector. The question
arises whether this will be the same vector as
the one introduced above.
In order to settle this question let us
see what the components of the vector whose
components in the old system were $v_i$ should be
in the new coordinate system. According to
formulas 9.2 they are
$v'_1 = v_1 c + v_2 s = \partial f/\partial x_1 \cdot c + \partial f/\partial x_2 \cdot s$,
$v'_2 = -v_1 s + v_2 c = -\partial f/\partial x_1 \cdot s + \partial f/\partial x_2 \cdot c$.
On the other hand
$\partial f/\partial x'_1 = \partial f/\partial x_1 \cdot \partial x_1/\partial x'_1 + \partial f/\partial x_2 \cdot \partial x_2/\partial x'_1$,
and since, by 9.2,
$\partial x_1/\partial x'_1 = c$,  $\partial x_2/\partial x'_1 = s$,
we find that $v'_1 = \partial f/\partial x'_1$, and in the same
way we find that $v'_2 = \partial f/\partial x'_2$, which shows
that the components 9.71 and 9.72 above are
the components in the two coordinate systems
considered of one and the same vector. We
proved then that the operation of obtaining a
vector by taking as its components the derivatives
of a scalar with respect to the coordinate
axes is independent of the particular system
of coordinates used; that means this operation
is invariant.
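A numerical check of this invariance may be sketched as follows. The scalar field `f`, the angle and the point are hypothetical choices made here, and the derivatives are estimated by central differences, so that agreement is to be expected only up to a small error.

```python
import math

def f(x1, x2):
    """A hypothetical scalar field, given in the old coordinates."""
    return x1 * x1 * x2 + math.sin(x2)

def old_from_new(n1, n2, angle):
    """Formulas 9.2: old coordinates expressed through the new ones."""
    c, s = math.cos(angle), math.sin(angle)
    return n1 * c - n2 * s, n1 * s + n2 * c

def gradient(g, p1, p2, h=1e-6):
    """Central-difference estimate of the two partial derivatives of g."""
    return ((g(p1 + h, p2) - g(p1 - h, p2)) / (2 * h),
            (g(p1, p2 + h) - g(p1, p2 - h)) / (2 * h))

angle = 0.6
c, s = math.cos(angle), math.sin(angle)

def f_new(n1, n2):
    """The same field expressed through the new coordinates."""
    return f(*old_from_new(n1, n2, angle))

n1, n2 = 0.8, -0.3                              # a point, in new coordinates
x1, x2 = old_from_new(n1, n2, angle)            # the same point, in old coordinates
v1, v2 = gradient(f, x1, x2)                    # components 9.71
rotated = (v1 * c + v2 * s, -v1 * s + v2 * c)   # 9.71 transformed as a vector
direct = gradient(f_new, n1, n2)                # components 9.72
# `rotated` and `direct` agree up to the error of the difference quotients.
```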
Before passing to differentiation of a
tensor of rank higher than zero we may note
that the components $v_1$, $v_2$ we obtained may be
considered as the components of a tensor of
rank one; denoting the components of the argument
vector by $h_1$, $h_2$ we have as the values of
this tensor
$\partial f/\partial x_1 \cdot h_1 + \partial f/\partial x_2 \cdot h_2$;
this reminds us of a differential and suggests
writing $dx_1$ for $h_1$ and $dx_2$ for $h_2$; we have
then the formula
$df = \partial f/\partial x_\alpha \cdot dx_\alpha$,
which leads to the interpretation of the differential
as a tensor of rank one whose components
are the derivatives of the given function.
We next consider a tensor of rank one whose
components in the old system are $f_i$, these components
being functions of coordinates. Differentiating
with respect to $x_j$ we get
$\partial f_i/\partial x_j$;
can we consider these as the components of some
tensor of rank two? In other words, if we define
a tensor by saying that in the old system
it has these components, will its components in
the new system be obtained by differentiating
with respect to the new coordinates the new
components $f'_i$ of the given tensor? A calculation
analogous to the one preceding will convince
us that this is so.
We have thus introduced an operation which
leads from a tensor of rank zero to one of rank
one, and from a tensor of rank one to one of
rank two, and we could, continuing in the same
way pass from any tensor to one of the next
higher rank. In introducing this operation we
used components of tensors and coordinates of
points, but we proved that the result is the
same no matter what particular coordinate system
we might have used; the operation of differentiation
is thus independent of a coordinate
system.
After this detailed treatment of tensors
and operations on them in plane geometry it
will not be difficult to generalize to higher
dimensions. We consider first three-dimension-
al space - solid geometry - and begin with an
equation of a central quadric surface. Using
notations similar to those introduced at the beginning
of this section it may be written as
9.8    $a_{11}x_1x_1 + a_{12}x_1x_2 + a_{13}x_1x_3$
$+ a_{21}x_2x_1 + a_{22}x_2x_2 + a_{23}x_2x_3$
$+ a_{31}x_3x_1 + a_{32}x_3x_2 + a_{33}x_3x_3 = 1$,
or, using our notations for summation with Greek
indices introduced in Section 3, as
$a_{\rho\sigma}x_\rho x_\sigma = 1$.
For the same reason as before (in the case of a
conic), namely, because not all coefficients of
such an expression can be obtained as its values,
we introduce a slightly more general expression
9.81    $a_{\rho\sigma}x_\rho y_\sigma$
as the tensor; giving in it to the variables the
values $x_i = \delta_{2i}$, $y_i = \delta_{3i}$ (see definition of the
symbol $\delta$ under 3.4) we obtain, for instance, the
coefficient $a_{23}$.
In coordinateless notation we were independent
of the number of dimensions to begin
with, so that we may take over the definition
of linearity 9.6 and the definition of tensor
following it word for word. In order to effectuate
the transition from vector notation
to coordinate notation we write now, instead of
$x = x_1 i + x_2 j$, $x = x_\rho i_\rho$, and substituting this
and an analogous expression for y into $\varphi(x,y)$
we get, using 9.6,
$\varphi(x_\rho i_\rho, y_\sigma i_\sigma) = \varphi(i_\rho, i_\sigma)x_\rho y_\sigma$.
The notation
$\varphi(i_\rho, i_\sigma) = a_{\rho\sigma}$
brings us back to formula 9.81. There is no
difficulty about tensors of higher ranks; quantities
with three indices give rise to trilinear
forms, e.g.,
$a_{\rho\sigma\tau}x_\rho y_\sigma z_\tau$;
those with four indices, to quadrilinear forms,
etc. The definitions of multiplication, contraction
and differentiation hardly present any
difficulty, but we shall devote some time to
the question of transformation of coordinates
for three and four dimensions. In solid analytic
geometry the question is usually treated by
introducing formulas involving all coordinates
at the same time, i.e., formulas permitting us to
pass at once from one system of coordinates to
any other with the same origin; these formulas
are quite complicated, they involve nine con-
stants which are not independent, but are con-
nected by six relations, and the corresponding
thing for four dimensions would be still more
unwieldy; we could handle it by introducing in-
dex notations, but we prefer another method. We
pass from one system of coordinates to another
gradually, in steps, each step involving only
two of the coordinates - and one constant - the
angle through which we rotate in the correspond-
ing plane. Three such steps are enough to pass
from any system to any other in three dimen-
sions. For example, we may first perform a ro-
tation in the xy-plane which brings the x-axis
into the new xy-plane; then a rotation in the
so obtained yz-plane bringing the y-axis into
the new xy-plane, and finally we rotate the so
obtained x and y axes until they coincide with
the new x and y axes.
The advantage of this point of view will
be seen from the following proof of the invari-
ance of the operation of contraction in three
dimensions. Given a tensor of rank two by its
components $a_{11}$, $a_{12}$, $a_{13}$, $a_{21}$, etc., the result
of contraction, according to our definition
in Section 3, is
$a_{\rho\rho} = a_{11} + a_{22} + a_{33}$.
If we pass to another coordinate system the
components will be changed into some components
$a'_{ij}$ and the result of contraction will be
$a'_{\rho\rho} = a'_{11} + a'_{22} + a'_{33}$;
in order to prove that contraction has an intrinsic
meaning, independent of the system of
coordinates, we have to prove that the last two
expressions are equal. If the transformation
involves only the $x_1$ and $x_2$ coordinates, but
does not involve $x_3$, then $a'_{33}$, which is the coefficient
of $x'_3y'_3$, will be $a_{33}$, which is the
coefficient of $x_3y_3$, because $x'_3 = x_3$, $y'_3 = y_3$,
and the other coordinates do not depend on $x_3$
and $y_3$; the coefficients $a'_{11}$, $a'_{22}$, on the
other hand, will be transformed by the same formulas
(9.21) as in the two-dimensional case because
$x_1$, $x_2$, $y_1$, $y_2$ are transformed by the
same formulas (9.2) as before. Therefore, formula
9.5 is applicable, and this together with
the fact that $a'_{33} = a_{33}$ establishes the invariance
of contraction under a transformation of
coordinates involving $x_1$ and $x_2$ only. But the
same reasoning would apply to transformation involving
$x_2$ and $x_3$ only, or $x_1$ and $x_3$ only, and
since we have proved that a general transforma-
tion of coordinates may be replaced by a suc-
cession of transformations involving each only
two coordinates we have proved the invariance of
the operation of contraction of a tensor of rank
two under a general coordinate transformation.
Following the same principle we could prove
the invariance of contraction for tensors of any
rank and also, using the fact that the invari-
ance of the operation of differentiation has
been proved for two dimensions, prove that it
has an intrinsic meaning in three dimensions.
We come now to four dimensions. Here it is
easy to prove that a general transformation can
be effectuated by a succession of six single ro-
tations, i.e., rotations involving only two axes
each; in fact, a rotation in the xt-plane will
bring the x-axis into the new xyz-solid; a ro-
tation in the yt-plane will bring the y-axis in-
to the new xyz-solid, a rotation in the zt-plane
will bring there the z-axis; now the t-axis co-
incides with the new t-axis and the x,y,z axes
are all in the new xyz-solid and can be brought
into coincidence with the new x,y,z axes by
three more rotations as we saw before.
The reasoning indicated for the three-di-
mensional case will, therefore, prove the invar-
iance of the fundamental operations of tensor
analysis also for four dimensions.
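The argument by single rotations can be followed step by step on a numerical example. The Python sketch below is an illustration under assumptions made here: four real coordinates, sample components, and an arbitrarily chosen sequence of six rotations, each involving two axes only. It records the result of contraction after every step.

```python
import math

def plane_rotation(n, p, q, angle):
    """Coefficients of a rotation involving only the axes p and q out of n axes."""
    r = [[1.0 if row == col else 0.0 for col in range(n)] for row in range(n)]
    c, s = math.cos(angle), math.sin(angle)
    r[p][p], r[p][q], r[q][p], r[q][q] = c, -s, s, c
    return r

def transform(a, r):
    """New components a'[m][n] = sum over p, q of r[p][m] * a[p][q] * r[q][n]."""
    size = len(a)
    return [[sum(r[p][m] * a[p][q] * r[q][n] for p in range(size) for q in range(size))
             for n in range(size)] for m in range(size)]

def contraction(a):
    """Sum of the components with equal indices."""
    return sum(a[p][p] for p in range(len(a)))

# Sample components of a tensor of rank two in four dimensions.
a = [[1.0, 2.0, 0.5, -1.0],
     [0.0, 3.0, 1.5, 2.0],
     [4.0, -2.0, 0.5, 1.0],
     [1.0, 1.0, -3.0, 2.5]]
# Six single rotations: (first axis, second axis, angle), chosen arbitrarily.
steps = [(0, 3, 0.4), (1, 3, -0.9), (2, 3, 1.3), (0, 1, 0.2), (1, 2, 2.1), (0, 1, -0.7)]
current = a
results = [contraction(current)]
for p, q, angle in steps:
    current = transform(current, plane_rotation(4, p, q, angle))
    results.append(contraction(current))
# All entries of `results` agree up to rounding.
```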
10. Complications Resulting From
Imaginary Coordinate.
The only new feature of the new geometry
considered in the preceding sections was this
that we have four coordinate axes instead of
three; but we still have another departure from
ordinary geometry (due in the final count to the
"minus sign" in the Maxwell equations), viz.,
the fact that the fourth coordinate is imaginary.
The introduction of imaginaries helped us
to obtain a symmetrical form of Maxwell's equations,
and seems to be beneficial from this
point of view. The formal part of the theory
runs now smoothly; but there is a disadvantage
in this smoothness: it conceals very important
peculiarities, and the present section will be
devoted to the consideration of some of these
peculiarities.
The discussion may be conveniently attached
to the consideration of the expression (comp.
7.41)
$(x_1 - x'_1)^2 + (x_2 - x'_2)^2 + (x_3 - x'_3)^2 + (x_4 - x'_4)^2$,
which defines the square of the distance between
the two points whose coordinates appear
in it; or the square of the vector joining
these two points.
Since in the above formula the quantities
$x_4$ and $x'_4$ and, therefore, their difference are
imaginary, the fourth square is negative, and
the expression may, according to the relative
magnitudes of the terms, be positive, negative,
or zero. There are thus three types of rela-
tive positions of two points, or directions, or
of vectors; there are vectors of positive square,
those of negative square, and those of zero
square. Our geometry is thus more complicated
than what we would expect it to be if it would
differ from the ordinary geometry in the num-
ber of dimensions only. This complication, or
this richness of our geometry, far from being
an undesirable feature, is, as we shall see, an
advantage, because it corresponds to certain
features of the outside world, e.g., the exist-
ence of both matter and light which are going
to be identified with two kinds of vectors. At
present we only mention that the momentum vec-
tor of a material particle, or the vector of
components u,v,w,i are vectors of negative
square; in fact, the square of the latter is
$u^2 + v^2 + w^2 - 1$; the first three terms representing
the square of the velocity of the particle
which, according to the third remark preceding
formula 4.3 is very small compared with unity,
the expression is negative.
In some cases it may be desirable to sac-
rifice the formal advantages accruing from the
use of the imaginary coordinate in order to put
in evidence the peculiarities we are discuss-
ing; it is permissible to go back then to the
old notations x,y,z,t, but it becomes necessary
to modify the formulas accordingly. (We shall
see later how it is possible to use index notations
and still avoid imaginaries.) If we denote
two points by x, y, z, it and x', y', z', it',
instead of by $x_i$ and $x'_i$, the formula for the
square of the distance will be
10.1    $(x-x')^2 + (y-y')^2 + (z-z')^2 - (t-t')^2$,
and the formula for the scalar product (see 7.4)
of two vectors given by a, b, c, id and a', b', c',
id',
10.2    $aa' + bb' + cc' - dd'$.
We may say that the minus sign appearing
in these formulas is the same as the one ap-
pearing in the second set of Maxwell's equa-
tions (4.21), because it may be traced back to
them.
We shall occasionally refer to the quanti-
ties xi and ui which carry indices and involve
the square root of -1 as mathematical coordi-
nates and components, and to the quantities x,
y,z,t and a,b,c,d as physical coordinates and
components. Four-dimensional space of the char-
acter we are studying now, i.e., either charac-
terized by three real and one imaginary coordi-
nates, or with four real coordinates but with
scalar multiplication with a minus sign given
by formula 10.2, is often called four-dimensional
space-time because of the interpretation of
the quantities x,y,z and t in ordinary physics.
Without going into detail we may mention a
few consequences of the "minus sign". According
to our definition, two vectors are considered
perpendicular if their scalar product vanishes.
But now the scalar product of a vector with itself
may vanish, as happens, for instance, in the
case of the vector with physical components 0, 0, 1, 1. We must
say then that such a vector is perpendicular to
itself.
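The three kinds of vectors are easily told apart by computation. In the following Python sketch, an illustration whose function names are chosen here, vectors are given by their physical components a, b, c, d and the scalar product is taken according to formula 10.2.

```python
def scalar_product(p, q):
    """Formula 10.2 for vectors given by physical components (a, b, c, d)."""
    return p[0] * q[0] + p[1] * q[1] + p[2] * q[2] - p[3] * q[3]

def kind(p):
    """Classify a vector by the sign of its square."""
    square = scalar_product(p, p)
    if square > 0:
        return "positive square"
    if square < 0:
        return "negative square"
    return "zero square"

print(kind([1, 0, 0, 0]))   # positive square
print(kind([0, 0, 0, 1]))   # negative square
print(kind([0, 0, 1, 1]))   # zero square: the vector is perpendicular to itself
```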
We also may mention that corresponding to
the existence of three types of directions, those
which correspond to vectors of positive square,
negative square and zero square, there are three
types of orientations or planes. An orientation
may best be characterized by the number of zero
directions it contains and it is easy to prove
that there are orientations containing two, one
or no zero directions.
As the result of existence of vectors of
negative square our proof that the cosine of the
angle between two vectors as defined by formula
7.5 does not exceed unity in absolute value is
not applicable and we would have to consider
imaginary angles or else consider the cosine as
a hyperbolic cosine, but we shall not go into
this question.
The only peculiarity due to the "minus
sign" other than the existence of zero square
vectors that we shall have to use in what fol-
lows is connected with transformation of coordi-
nates.
Formally our transformation formulas remain
the same; we may write, for instance,
$x'_3 = x_3 \cos\phi - x_4 \sin\phi$,
$x'_4 = x_3 \sin\phi + x_4 \cos\phi$;
but here $x_3$ is real and $x_4$ is imaginary and, of
course, we expect the new coordinates to be of
the same character, so that $x'_3$ will be real and
$x'_4$ imaginary. Giving to $x_3$ and $x_4$ the values
1 and 0, respectively, we see that $\cos\phi$ must
be real, and $\sin\phi$ imaginary. We shall, however,
prefer not to use imaginary trigonometric
functions; in order to avoid them we introduce
a new notation as follows:
10.3    $\cos\phi = \sigma$,  $\sin\phi = i\tau$,
where $\sigma$ and $\tau$ are real quantities. The identity
$\cos^2\phi + \sin^2\phi = 1$ gives for $\sigma$ and $\tau$ the relation
10.4    $\sigma^2 - \tau^2 = 1$.
If we prefer to use one number, rather than
two numbers connected by a relation, in describing
different transformations of coordinates in
the $x_3x_4$ plane we may again resort to trigonometry
and interpret $\sigma$ and $\tau$ as the secant and
tangent of a real angle $\psi$:
10.5    $\sigma = \sec\psi$,  $\tau = \tan\psi$.
It must be noted, however, that $\psi$ has, so to say,
no geometrical significance.
If in the above formulas of transformation
we substitute for $x_3$ and $x_4$ their expressions
in terms of z and t (4.7) and for $\cos\phi$ and
$\sin\phi$ their expressions (10.3) in terms of $\sigma$
and $\tau$ we obtain the following formulas of transformation
for the physical coordinates:
10.6    $z' = z\sigma + t\tau$,  $t' = z\tau + t\sigma$.
These formulas are called the Lorentz
transformation formulas and their physical in-
terpretation will be discussed in the next chap-
ter.
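A short numerical sketch may make the formulas of this section more concrete. It assumes that $\sigma$ and $\tau$ are obtained from a real angle as secant and tangent, and the function name `lorentz` and the sample values are chosen here for illustration; the lines check that $\sigma^2 - \tau^2 = 1$ and that the expression $z^2 - t^2$ keeps its value under the transformation $z' = z\sigma + t\tau$, $t' = z\tau + t\sigma$.

```python
import math

def lorentz(z, t, psi):
    """z' = z*sigma + t*tau, t' = z*tau + t*sigma, with sigma = sec(psi), tau = tan(psi)."""
    sigma, tau = 1.0 / math.cos(psi), math.tan(psi)
    return z * sigma + t * tau, z * tau + t * sigma

psi = 0.5                          # any real angle whose cosine is not zero
sigma, tau = 1.0 / math.cos(psi), math.tan(psi)
z, t = 2.0, 3.0                    # sample physical coordinates
z_new, t_new = lorentz(z, t, psi)
relation = sigma**2 - tau**2       # equal to 1 up to rounding
square_old = z**2 - t**2
square_new = z_new**2 - t_new**2   # agrees with square_old up to rounding
```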
Concluding this section we may mention how
our axioms of Section 8 must be modified in or-
der to produce a geometry with the desired pe-
culiarities.
It is clear that our axiom V, according to
which a non-zero vector has positive square, has
to be modified. The proper modification is the
following:
Axiom V'. There are orientations containing
two zero directions, but there are no orientations
containing more than two zero directions.
If we replace axiom V by this axiom and
keep the remaining axioms as they were stated
in Section 8 we obtain a geometry of the kind
desired. In order to show this, we first show
the existence of four mutually perpendicular
vectors three of which have squares equal to 1,
and the fourth one equal to -1. We begin by
picking out a plane with two zero square vectors
a and b; we assert that a' = ½(a + b) and
b' = ½(a - b) are two perpendicular vectors
with squares of opposite sign. In fact, a'b' =
¼(a² - b²), and this is zero because a² = 0 and
b² = 0; then, a = a' + b'; squaring this and
keeping in mind that a'b' = 0, we have a² =
a'² + b'²; since a² = 0 it follows that
the squares of a' and b' are of opposite
sign. Dividing each of the vectors a' and b' by
the square root of the absolute value of its
square we obtain two mutually perpendicular
vectors whose squares are +1 and -1, which we may
call k and ℓ respectively. It is easy now to
pick out two more vectors i and j which together
with k and ℓ constitute a set of four mutually
perpendicular vectors; none of them can have
a zero square, because if, say, i² should be
zero all the vectors of the plane determined by
a and i would have zero squares, contrary to
axiom V'.
Using these four vectors we can, as in the
other case, express every vector in the form
(comp. 8.2)
10.7     x = αi + βj + γk + δℓ.
The numbers α, β, γ, δ will be considered as
the components of this vector. In order to
show that we have what we wanted we shall express
the square of x in terms of its components.
Squaring 10.7 and taking into account that ℓ² =
-1 we obtain
     x² = α² + β² + γ² - δ².
This shows that α, β, γ, δ are what we call
physical components of the vector; the mathematical
components are obtained by setting
     x1 = α,  x2 = β,  x3 = γ,  x4 = iδ,
and we see that we get the kind of geometry we
expect to use in physics.
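The construction of k and ℓ from two zero square vectors can be tried out numerically. The sketch below is only an illustration: it assumes that vectors are given by their physical components and that the product is the one just obtained (three plus signs and one minus sign); the two sample vectors a and b and all function names are ours.

```python
import math

def dot(x, y):
    """Product of two vectors given by physical components (alpha, beta, gamma, delta)."""
    return x[0] * y[0] + x[1] * y[1] + x[2] * y[2] - x[3] * y[3]

def split_zero_pair(a, b):
    """a' = (a + b)/2 and b' = (a - b)/2, as in the text."""
    a1 = [(p + q) / 2 for p, q in zip(a, b)]
    b1 = [(p - q) / 2 for p, q in zip(a, b)]
    return a1, b1

def normalize(v):
    """Divide by the square root of the absolute value of the square."""
    scale = math.sqrt(abs(dot(v, v)))
    return [p / scale for p in v]

# Two sample zero square vectors (hypothetical values): a^2 = 0 and b^2 = 0.
a = [0.0, 0.0, 2.0, 2.0]
b = [0.0, 0.0, 3.0, -3.0]
a1, b1 = split_zero_pair(a, b)
k, l = normalize(a1), normalize(b1)
print(dot(a1, b1), dot(k, k), dot(l, l))
```

For these values a' and b' come out perpendicular, and the normalized vectors have squares +1 and -1.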
11. Are the Equations of Physics
Invariant?
We return now to physics. In Chapter I we
arrived at certain equations that we consider
as fundamental; namely, the equation of continuity
(3.5)
11.1     ∂(ρuα)/∂xα = 0,
two sets of Maxwell's equations (4.61 and 4.62)
11.2     ∂Fij/∂xk + ∂Fjk/∂xi + ∂Fki/∂xj = 0,
11.3     ∂Fiα/∂xα = εui,
and the equations of motion (6.3)
11.4     ∂Tiα/∂xα = 0   (i = 1,2,3)
with (comp. 6.2, 5.1, 2.7)
11.5     Tij = Eij - Mij,
         Eij = FiρFρj + ¼ δij FρσFρσ,
         Mij = ρuiuj.
The fact that the indices run here from
1 to 4 (except in 11.4) suggested four-dimensional
geometry, which we have introduced in
Sections 7 and 8; the fact that x4 in the above
equations is imaginary (comp. 4.7) suggested
the peculiarities discussed in Section 10. Now
that we have followed these suggestions and
built a mathematical theory we have to see to
what results the application of our new theory
leads. In addition to following the suggestions
we have introduced into the theory a feature
that was not directly called for by physics:
we made our theory independent of coordinates.
In order to bring out the importance of
this fact let us consider for a moment the case
of two dimensions and compare plane geometry
with a two-way diagram. Both in plane analytic
geometry and in a diagram we use coordinate ax-
es, but in geometry the axes play an auxiliary
role, we find it convenient to express by refer-
ring to axes properties of configurations which
exist and can be treated independently of the
axes; the same properties can be expressed us-
ing any system of coordinate axes. The situa-
tion is different in the case when we use a
plane as a means of representing a functional
dependence between two quantities of different
kind, when we have a diagram. We may use, for
instance, the two axes to plot temperature and
pressure, or the height of an individual and
the number of individuals of that height. In
the majority of such cases the axes play an es-
sential part in the discussion; if we delete
the axes the diagram loses its meaning, the
question of rotation of axes does not arise.
Returning to physics we have to ask our-
selves what we actually need for it, a diagram
or a geometry; in other words, are the coordi-
nate axes essential or can they be changed at
will, or again, do the equations of physics ex-
press properties independent of the coordinate
axes; are they invariant, or not.
In order to answer this question let us
first consider the formal structure of the
equations 11.1 to 11.5. The fundamental dependent
variables are here a scalar ρ, a vector ui and
an anti-symmetric tensor Fij.
The left hand side of equation 11.1 may be
described as a result of multiplying the scalar
ρ by the vector ui, then differentiating the
resulting vector ρui, then contracting the tensor
so obtained; since the operations of multiplication,
differentiation and contraction have been
shown to be invariant, the scalar ∂(ρuα)/∂xα is
independent of the system of coordinates used,
and if it is zero in one system of coordinates it
is zero in all systems of coordinates. The
continuity equation expresses, therefore, a fact
independent of the system of coordinates employed.
An analogous reasoning applied to 11.3
would show the invariant character of that sys-
tem. The question of invariance of the first
system of Maxwell's equations requires a special
discussion; it can best be treated by introducing
a new anti-symmetric tensor Dij connected
with Fij by the following relations:
11.6     F12 = D34,  F23 = D14,  F31 = D24,
         F34 = D12,  F14 = D23,  F24 = D31.
Before we show how this is going to help us in
connection with our equations we want to prove
that these relations are independent of the
coordinate system; i.e., that if 11.6 hold, relations
of the same form, namely
11.6'    F'12 = D'34, etc.,
will hold in any other coordinate system. Again,
since a general transformation of coordinates
can be achieved in steps it will be enough to
test an x1x2 rotation only. As a result of such
a rotation F'12 becomes (comp. 9.21)
     F'12 = -F11 cs + F12 c² - F21 s² + F22 sc;
using the fact that F is anti-symmetric (4.4)
we find
11.7     F'12 = F12,
and since obviously D'34 = D34 because the x3x4
axes are not affected we see that the first of
the relations 11.6' follows from 11.6. In order
to find F'23 we have, according to the general
rule following 9.31, to substitute δ2i and δ3i
for x'i and y'i respectively in F'ρσ x'ρ y'σ =
Fρσ xρ yσ. As the corresponding values of xi and
yi we find with the aid of 9.2, considering that
x'3 = x3, x'4 = x4,
     x1 = -s,  x2 = c,  x3 = 0,  x4 = 0,
     y1 = 0,  y2 = 0,  y3 = 1,  y4 = 0;
so that
11.71    F'23 = -sF13 + cF23;
in a similar way we obtain
11.72    D'14 = D14 c + D24 s;
taking again into account the anti-symmetric
property of F we come to the conclusion that
the second relation of 11.6' is a consequence of
11.6, and since the same reasoning applies to
the remaining relations we conclude that the
relations 11.6 are independent of the coordinate
system; it is easy to see that they assign to
every tensor Fij a tensor Dij (the tensor Dij,
or, rather, √(-1)·Dij is often referred to as the
dual of Fij). Now if we express, using 11.6,
the Fij in 11.2 in terms of the Dij that set
becomes
11.2'    ∂Diα/∂xα = 0,
and its invariance follows from general
considerations as in the case of 11.3.
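The statement that the relations 11.6 survive a rotation in the x1x2 plane can also be checked numerically. The sketch below is an illustration under stated assumptions: indices are counted from 0 instead of 1, the rotation matrix is taken in the form consistent with the values x1 = -s, x2 = c used above, and the six sample values of Fij are arbitrary real numbers chosen by us.

```python
import math

def antisymmetric(f12, f23, f31, f14, f24, f34):
    """Build an anti-symmetric 4x4 array from six independent components."""
    F = [[0.0] * 4 for _ in range(4)]
    values = {(0, 1): f12, (1, 2): f23, (2, 0): f31,
              (0, 3): f14, (1, 3): f24, (2, 3): f34}
    for (i, j), v in values.items():
        F[i][j], F[j][i] = v, -v
    return F

def dual(F):
    """Tensor D assigned to F by relations 11.6 (0-based indices)."""
    D = [[0.0] * 4 for _ in range(4)]
    # Each entry reads: D[i][j] = F[p][q], e.g. D34 = F12.
    pairs = [((2, 3), (0, 1)), ((0, 3), (1, 2)), ((1, 3), (2, 0)),
             ((0, 1), (2, 3)), ((1, 2), (0, 3)), ((2, 0), (1, 3))]
    for (i, j), (p, q) in pairs:
        D[i][j], D[j][i] = F[p][q], -F[p][q]
    return D

def rotate_12(T, angle):
    """New components T'ij = Rip Rjq Tpq for a rotation in the x1x2 plane."""
    c, s = math.cos(angle), math.sin(angle)
    R = [[c, s, 0, 0], [-s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
    return [[sum(R[i][p] * R[j][q] * T[p][q]
                 for p in range(4) for q in range(4))
             for j in range(4)] for i in range(4)]

F = antisymmetric(1.0, 2.0, 3.0, 4.0, 5.0, 6.0)
lhs = dual(rotate_12(F, 0.7))   # dual formed from the new components of F
rhs = rotate_12(dual(F), 0.7)   # new components of the old dual
print(max(abs(lhs[i][j] - rhs[i][j]) for i in range(4) for j in range(4)))
```

The printed difference should vanish up to rounding, which is the content of 11.6' for this particular rotation.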
Formula 11.5 contains only multiplications,
contractions and additions, so that there is no
doubt concerning its invariance, but the situation
changes when we come to the set 11.4. The
vector ∂Tiα/∂xα has been obtained by invariant
operations but 11.4 states that only three of
its components are zero, a statement which
obviously depends on the choice of coordinate axes
and is not invariant.
We have now two courses open before us:
one is that of resignation, we can say: we see
that physics is not like geometry in this re-
spect, that we can only use four-dimensional
notations, a four-dimensional diagram but not
four-dimensional geometry; the other course is
that of adventure, we may try to play the game
of geometry; let us pretend that we can apply
the formulas of transformation of coordinates
in this case; we know that there will be a dif-
ference between the theory we obtain and the
physics which we undertook to translate into
our language; but it may be that the difference
will amount numerically to very little. Consider
the fourth component of the vector ∂Tiα/∂xα;
we found (comp. remark following 6.3) that one
of the terms of this expression, ∂M4α/∂xα,
vanishes, and the other, ∂E4α/∂xα, gives Xu+Yv+Zw,
where u,v,w are the components of velocity, but
in order to present the Maxwell equations in a
simple form we had to choose our units in such
a way that the velocity of light is unity;
ordinary velocities are of the order of magnitude
of one ten-millionth of the velocity of light,
so that we see that by setting the fourth
component of ∂Tiα/∂xα equal to zero we would
commit an error that is numerically very small.
This encourages us to go on with our adventure
and try to force the geometrical character on
physics. In order to do that let us go beyond
the formal structure of our formulas and recall
what the meaning of our fundamental quantities
was. The components of the vector ui were given
(see 1.2, 2.4, 4.9) as
11.8     u1 = dx/dt,  u2 = dy/dt,  u3 = dz/dt,
         u4 = i.
But this identification is obviously not
independent of the coordinate system, it gives
preference to the fourth coordinate. We may think
that this is the source of our difficulty, and
that this difficulty may be overcome if we find
an invariant identification to take the place
of 11.8. The next section will prepare the way
for this.
12. Curves in the New Geometry.
The root of the difficulty is that our de-
scription of motion was not invariant; motion
was described by giving the dependence of the
coordinates x,y,z on time, that means by giving
three of our coordinates x1, x2, x3 as functions
of the fourth, x4, which thus is given preference.
The situation is analogous to that in
plane analytic geometry where we give y as a
function of x, or that in solid analytic geomet-
ry when we give y and z as functions of x; in
both cases we represent curves; from our four-
dimensional point of view we should then consid-
er motion of a particle as a curve in four di-
mensions (using the word curve in a general
sense so that straight line is a special case) .
What we want then is a representation of
curves in four dimensions which would not give
preference to the fourth coordinate. We begin
by considering representations of curves in two
and three dimensions which give no preference to
one coordinate. In the plane a line may be rep-
resented by
x = ap + b,
y = cp + d,
a circle by
x = r cos p, y = r sin p;
in space a line by
x = ap + b, y = cp + d, z = ep + f ,
a helix by
x = r cos p, y = r sin p, z = kp, etc.
In all these cases to every value of the
"parameter" p corresponds a point of the curve;
in general, if we set
     x = f(p),  y = g(p),  z = h(p)
we have what we call a parametric representation
of a curve (comp. parametric form of equation of
straight line in 7.1). In the same way we may
represent a curve in four dimensions, which we
take to mean motion of a particle, by giving xi
as functions of a parameter p.
The defect of this method is that it con-
tains a certain arbitrariness; we may substitute
for p another parameter q by making p an arbi-
trary increasing function of q. We want now to
standardize our parametric representation. The
usual way is to choose the arc length along the
curve as the parameter. Without going into detail
we shall state that arc length between
points corresponding to values p1 and p2 of the
parameter is given by
12.1     s = ∫ √[(dx/dp)² + (dy/dp)² + (dz/dp)²] dp,
the integral being taken from p1 to p2;
in the special case when s is used as parameter
p we differentiate both sides and obtain
12.2     (dx/dp)² + (dy/dp)² + (dz/dp)² = 1.
We may consider in general dx/dp, dy/dp, dz/dp
as the components of a vector tangent to the
curve; the change of parameter would multiply
these derivatives by the same number, i.e.,
substitute another tangent vector for that one; the
quantity (dx/dp)² + (dy/dp)² + (dz/dp)² gives
the square of the length of the tangent vector;
the above equality 12.2 expresses then the
fact that if we use arc length as the parameter
the length of the tangent vector whose components
are the derivatives of the coordinates
with respect to the parameter is unity.
We come thus to the idea of a unit tangent
vector; it characterizes in every point the di-
rection of the curve; its components are the di-
rection cosines of the tangent.
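For the helix mentioned above the arc length parameter can be written down explicitly, which gives a quick numerical check of 12.2. The sketch below is an illustration only: the radius r = 2 and pitch k = 1 are arbitrary sample values, and the derivatives are estimated by a central difference rather than computed exactly.

```python
import math

def helix_by_arc_length(s, r=2.0, k=1.0):
    """Helix x = r cos p, y = r sin p, z = k p with p = s / sqrt(r^2 + k^2)."""
    p = s / math.sqrt(r * r + k * k)
    return (r * math.cos(p), r * math.sin(p), k * p)

def tangent(curve, s, h=1e-6):
    """Central-difference estimate of dx/ds, dy/ds, dz/ds."""
    before, after = curve(s - h), curve(s + h)
    return tuple((q - p) / (2 * h) for p, q in zip(before, after))

t = tangent(helix_by_arc_length, 1.3)
print(sum(c * c for c in t))  # should be close to 1, as 12.2 requires
```

The substitution p = s/√(r² + k²) works here because (dx/dp)² + (dy/dp)² + (dz/dp)² equals the constant r² + k² along the whole helix.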
We may try to go through an analogous proc-
ess in the case of curves in four-dimensional
space, which as we saw may be taken to repre-
sent motions; if we succeed, the vector at which
we arrive will suggest itself as a natural thing
to identify with the vector of components ui
which appears in our formulas. Starting with
any parametric representation xi = xi(p), where
p may be for example t, we try to change our
parameter by introducing a new variable q and
making p a function of q, choosing this func-
tion in such a way that
     (dx1/dq)² + (dx2/dq)² + (dx3/dq)² + (dx4/dq)² = 1;
but dxi/dq = (dxi/dp)·(dp/dq), so that the
function p(q) must be such that
     dp/dq = 1/√[(dx1/dp)² + (dx2/dp)² + (dx3/dp)² + (dx4/dp)²].
Is this possible? In the case when the original
p is t the expression under the radical
sign will be u² + v² + w² - 1, and for motions
whose velocities are smaller than the velocity
of light (Section 4) this is negative, so that
we would get an imaginary value for dp/dq. In
order to avoid this unpleasantness we decide to
standardize our parameter by requiring
(dxα/dq)·(dxα/dq) to be -1 instead of 1; in
this case we find for dq/dt the expression
√[1 - (u² + v² + w²)] and we may write
12.3     dx1/dq = u/√(1 - β²),   dx2/dq = v/√(1 - β²),
         dx3/dq = w/√(1 - β²),   dx4/dq = i/√(1 - β²),
where β stands for √(u² + v² + w²), i.e., for
what we call speed (the length of the velocity
vector). The quantities just written out we
want to identify with the components of the
vector ui which appears in our formulas. Since
in ordinary cases β is very small, the radical
√(1 - β²) is very near to unity and our new
identification differs from our old identification
(11.8) numerically very little. On the other
hand the new values for ui are according to
their derivation the components of a vector, so
that if we adopt this identification and also
agree to set the fourth component of the divergence
of the tensor Tij equal to zero we obtain
an invariant theory whose statements differ only
very slightly from those accepted in classical
physics. It remains to be seen whether
there are cases in which the discrepancy is
large enough to be tested by experiment.
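The new identification is easy to try out numerically. The following Python sketch forms the four components of 12.3 from an ordinary velocity u, v, w (with the velocity of light taken as unity) and checks that the sum of their squares is -1. The function names and the sample velocities are ours; Python's built-in complex numbers stand in for the imaginary fourth component.

```python
import math

def four_velocity(u, v, w):
    """Components 12.3 for an ordinary velocity (u, v, w); light speed = 1."""
    beta_squared = u * u + v * v + w * w
    if beta_squared >= 1.0:
        raise ValueError("speed must be smaller than the velocity of light")
    root = math.sqrt(1.0 - beta_squared)
    return (u / root, v / root, w / root, 1j / root)

def square(vec):
    """Sum of the squares of the (mathematical) components."""
    return sum(c * c for c in vec)

print(square(four_velocity(0.3, 0.4, 0.0)))   # close to -1
print(four_velocity(1e-4, 0.0, 0.0))          # close to the old values u, v, w, i
```

For a small speed the second line prints values that differ from the old identification 11.8 only in far decimal places, which is the point made in the text.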
Chapter III.
SPECIAL RELATIVITY
Guided by the point of view that the form-
ulas of physics ought to be interpreted in four-
dimensional geometry we were led to the inter-
pretation of the motion of a particle as a curve
in space-time. Following the analogy with a
curve in ordinary space where arc length, s, is
often used as a parameter, we have introduced a
standard parameter, which we may also call arc
length and denote by s, for curves in space-time.
The derivatives dxi/ds of the coordinates
of a point on the curve with respect
to s may be considered as the components of a
vector tangent to the curve (the square of this
vector is -1 in every point - we shall refer to
such a vector also as a unit vector). We have
then at every point x1, x2, x3, x4 of such a
curve a unit vector dxi/ds, and we have agreed
to identify this vector with the vector ui which
appears in our fundamental laws of physics (11.1
to 11.5) so that
     ui = dxi/ds.
In this chapter we want to consider some
consequences of this identification.
13. Equations of Motion.
The one thing that was not satisfactory
about the formulas of physics was the fact that
according to 11.4 only three components of the
vector ∂Tiα/∂xα are equal to zero. In this
section we shall see how this defect is corrected
by the adoption of the new identification.
But before we do that we have to study some im-
mediate consequences of this identification.
Before, a motion of a particle was given
by giving the position of the particle in dif-
ferent moments of time, i.e., the coordinates
x,y,z as functions of t. Given these functions
we can calculate for every moment the velocity
vector of the particle - a vector of components
u = dx/dt, v = dy/dt, w = dz/dt.
Now, the same motion is described by giving
x1, x2, x3, x4 as functions of s; and we have
a vector of components ui = dxi/ds which, due
to the special choice of the parameter, satisfies
the equation
13.1     u1² + u2² + u3² + u4² = -1.
Of course, we have merely two representations
of the same thing. Given the xi(s) we can
express s as a function of t from
     x4(s) = it
and substituting the expression of s so found
into x1(s), x2(s), x3(s) we will have x,y,z
as functions of t. Or given x,y,z as functions
of t we can arrive at the representation
xi(s) as indicated in Section 12.
Also the space-time vector ui and the
space vector u,v,w describe the same thing.
The formulas 12.3 show how to find the components
ui in terms of the velocity vector u,
v,w, and it is easy to find the u,v,w in
terms of components ui. We simply have
     u = dx/dt = (dx/ds)/(dt/ds) = i·u1/u4,
and in a similar fashion
13.2     v = i·u2/u4,   w = i·u3/u4.
We see thus that the vector ui determines the
velocity of motion, and we agree to call it the
four-dimensional velocity vector. On the other
hand, being a unit vector this vector characterizes
the direction in four dimensions of the
curve representing motion; its components ui
may be considered as the direction cosines of
the tangent (compare Section 7, between formulas
7.41 and 7.5).
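The passage from the components ui back to the ordinary velocity can be sketched in a few lines of Python. This is an illustration only: the function name is ours, the sample velocity (0.3, 0.4, 0) is arbitrary, and the four components are first built from 12.3 so that the round trip can be seen.

```python
import math

def ordinary_velocity(u1, u2, u3, u4):
    """u = i*u1/u4, v = i*u2/u4, w = i*u3/u4; u4 is the imaginary fourth component."""
    return tuple((1j * c / u4).real for c in (u1, u2, u3))

# Build the four components from 12.3 for a sample velocity, then go back.
root = math.sqrt(1.0 - (0.3 ** 2 + 0.4 ** 2))
ui = (0.3 / root, 0.4 / root, 0.0, 1j / root)
print(ordinary_velocity(*ui))  # close to (0.3, 0.4, 0.0)
```

Since u4 is purely imaginary for speeds below that of light, the quotients i·u1/u4 and so on come out real, as ordinary velocity components should.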
But a velocity vector does not character-
ize the motion of a particle completely; it gives
only the kinematical characterization; in dy-
namics we need in addition, to know the mass of
the particle, and then we form the momentum vec-
tor (compare beginning of Section 1) whose com-
ponents are mu, mv, mw. By analogy we form
the expressions mui or pui (depending on
whether we use the discrete or the continuous
picture of matter) and consider them as the com-
ponents of the four-dimensional momentum vector.
Using the formulas 12.3 we have for its components
13.3     mu1 = mu/√(1 - β²),   mu2 = mv/√(1 - β²),
         mu3 = mw/√(1 - β²),   mu4 = im/√(1 - β²).
These are what we call the mathematical components
of the momentum vector; its "physical components"
are
13.31    a = mu/√(1 - β²),   b = mv/√(1 - β²),
         c = mw/√(1 - β²),   d = m/√(1 - β²).
We obtain a relation between the momentum
components and mass if we take the sum of the
squares of the components 13.3 or 13.31 and use
13.1, viz.,
     a² + b² + c² - d² = -m²;
in words, the negative square of mass is the
square of the momentum vector, so that mass is
essentially given by the length of the momentum
vector; we see here another advantage of the
four-dimensional representation: the four dynamical
quantities of a particle which in classical
physics are given by the three momentum
components and mass are here represented more
naturally by the four components of a vector.
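The relation between the momentum components and mass can be illustrated as follows. The sketch assumes the physical components 13.31 with the velocity of light equal to unity; the function names and the sample mass and velocity are ours.

```python
import math

def physical_momentum(m, u, v, w):
    """Physical components a, b, c, d of 13.31 (velocity of light = 1)."""
    root = math.sqrt(1.0 - (u * u + v * v + w * w))
    return (m * u / root, m * v / root, m * w / root, m / root)

def mass_from_momentum(a, b, c, d):
    """Mass recovered from a^2 + b^2 + c^2 - d^2 = -m^2."""
    return math.sqrt(d * d - (a * a + b * b + c * c))

p = physical_momentum(2.0, 0.1, 0.5, 0.2)
print(mass_from_momentum(*p))  # close to the mass 2.0 we started from
```

Whatever velocity is chosen (below that of light), the same mass is recovered, which is what is meant by saying that mass is given by the length of the momentum vector.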
As stated many times before, numerically
√(1 - β²) is in most applications very close to
unity so that approximately the first three
components of the four-dimensional momentum vector
are equal to the components of the three-dimensional
momentum vector and the last (physical)
component of the four-dimensional momentum
vector is, in first approximation, equal to
mass.
Let us consider more in detail this fourth
component of the momentum vector. If we want a
better approximation we develop the last of 13.31
according to powers of β and keep only two terms;
we have thus the approximate equality
13.4     d = m + ½mβ²;
the correction represented by the second term is
nothing but kinetic energy; of course, if ordinary
units are used this term has to be written
13.41    ½mV²/c²
where c is the velocity of light, and V is the
velocity of the particle measured in the same
units, because β is the ratio of the velocity of
the particle to that of light. We had better
say then that the correction is kinetic energy
divided by the square of the velocity of light.
Sometimes this fact is expressed by saying (neg-
lecting the other terms, which are very, very
small) that when a body is in motion its mass is
increased by its kinetic energy (divided by the
square of the velocity of light) .
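How good the two-term approximation 13.4 is can be seen by comparing it with the exact fourth component for a few speeds. The sketch below is illustrative only; the unit mass and the list of values of β are arbitrary choices of ours.

```python
import math

def exact_d(m, beta):
    """Last of 13.31: d = m / sqrt(1 - beta^2)."""
    return m / math.sqrt(1.0 - beta * beta)

def approx_d(m, beta):
    """Two-term development 13.4: d = m + (1/2) m beta^2."""
    return m + 0.5 * m * beta * beta

for beta in (0.001, 0.01, 0.1, 0.5):
    exact, approx = exact_d(1.0, beta), approx_d(1.0, beta)
    print(beta, exact, approx, exact - approx)
```

The neglected remainder begins with a term proportional to β⁴, so the difference in the last column shrinks very rapidly as β decreases.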
The interest of this lies in the close re-
lationship which is thus established between
mass and energy - a relationship that plays a
prominent part in present physics.
Sometimes the whole expression m/√(1 - β²)
is referred to as energy of the particle; mass,
from this point of view, appears then as part of
the energy, that part that the particle possesses
even when it is at rest; in other words, mass
appears as the rest-energy of the particle. We
could also call m/√(1 - β²) generalized mass and
say that mass changes as a result of motion
(compare end of this section).
We are ready now to discuss the equations
of motion 11.4 or
13.5     ∂(ρuiuα)/∂xα = ∂Eiα/∂xα.
The left hand sides may be written as
     ∂(ρuα)/∂xα·ui + ρuα·∂ui/∂xα;
the first factor of the first term vanishes
according to the continuity equation 11.1; the
second term may be written, recalling the
definition of ui, as
     ρuα·∂ui/∂xα = ρ·∂ui/∂xα·dxα/ds = ρ·dui/ds;
the right hand side of 13.5, according to our
former calculation (Section 6), is εFiαuα, so
that the equations of motion become
     ρ·dui/ds = εFiαuα
or, if we use the discrete picture, considering
both mass and electric density to be concentrated
in one point, and denoting mass by m, electric
charge by e,
13.51    m·dui/ds = eFiαuα.
These are the equations we are going to
discuss. In applying them to physics we give
preference to time by writing
13.52    m·dui/dt = eFiα·dxα/dt
which spoils the invariant form but does not
change the contents of the statement because
the transition from 13.51 to 13.52 is equivalent
to multiplication by ds/dt. Using 4.72
(or 6.1) the last equations become
13.53    m·du1/dt = e(X + Nv - Mw),
         m·du2/dt = e(Y + Lw - Nu),
         m·du3/dt = e(Z + Mu - Lv),
         m·du4/dt = ie(Xu + Yv + Zw).
Multiplying the left hand side of the first
of these equations by i·u1/u4 and the right
hand side by u (compare 13.2); using in the
same way i·u2/u4 = v on the second, and
i·u3/u4 = w on the third, and adding the results
we get the fourth equation because the
left hand side comes out
     (im/u4)·(u1·du1/dt + u2·du2/dt + u3·du3/dt)
and, differentiating the identity 13.1, we find
that the second factor is -u4·du4/dt. The fourth
equation is thus a consequence of the first
three, a great improvement over the situation
as it was before the new identification. The
fourth equation also has a definite physical
meaning now; the left hand side may be said to
represent the time rate of change of energy
(since the variable part of mu4 has been
recognized as kinetic energy), and the right hand
side has been recognized before (Section 6) as
the rate at which the energy of the field (po-
tential energy) is being expended in moving the
body. The difficulty with the fourth equation
has thus been settled in a most satisfactory
fashion but the system as a whole, or the first
three equations, have to be tested by experi-
ment (the fourth, being a consequence of the
first three, cannot be wrong if these three
will be proved to be "true").
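The step in which the three right hand sides are multiplied by u, v, w and added can be checked numerically: the terms containing L, M, N cancel and only e(Xu + Yv + Zw) remains. The sketch below assumes, as an illustration, that X, Y, Z and L, M, N are the electric and magnetic components of Section 4; the function names and the sample numbers are ours.

```python
def force(e, electric, magnetic, velocity):
    """Right hand sides of the first three equations 13.53."""
    X, Y, Z = electric
    L, M, N = magnetic
    u, v, w = velocity
    return (e * (X + N * v - M * w),
            e * (Y + L * w - N * u),
            e * (Z + M * u - L * v))

def power(e, electric, velocity):
    """e(Xu + Yv + Zw), the right hand side of the fourth equation without the factor i."""
    X, Y, Z = electric
    u, v, w = velocity
    return e * (X * u + Y * v + Z * w)

# Hypothetical sample values.
e = 1.5
electric, magnetic = (0.2, -0.7, 1.1), (0.4, 0.9, -0.3)
velocity = (0.1, 0.25, -0.05)
f = force(e, electric, magnetic, velocity)
print(sum(fi * vi for fi, vi in zip(f, velocity)), power(e, electric, velocity))
```

The two printed numbers agree, whatever values are given to L, M, N.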
Since, if the variation of dt/ds is neglected,
     dui/dt = (d/dt)(dxi/ds) = (dt/ds)·(d/dt)(dxi/dt)
we may write our equations as
13.6     m'·d²x/dt² = e(X + Nv - Mw),
         m'·d²y/dt² = e(Y + Lw - Nu),
         m'·d²z/dt² = e(Z + Mu - Lv),
where
13.7     m' = m/√(1 - β²).
The right hand sides of these equations (as
stated in Section 6) are the components of the
force exerted by the electromagnetic field on
the particle. Comparing the left hand sides with
the classical expressions we see then that the
correction resulting from our identification is
equivalent to the substitution of m' for m in
the classical equations of motion. We may say
then that our theory predicts that motion will
be governed by the old equations in which mass
has been replaced by a corrected mass the cor-
rection being the kinetic energy (divided by the
square of the velocity of light). In the vast
majority of cases the factor 1/√(1 - β²) is very
close to one, but there are a few cases where it
is not, and these cases afford an opportunity to
test the new theory and to see whether it or the
old one is better adapted to give account of ex-
perimental results. In experiments with "ca-
thode ray particles" by Bucherer the predictions
of the new theory seem to have been verified.
14. Lorentz Transformations.
Now that we saw that the new identification
removes the difficulty in connection with the
fourth equation of motion we want to consider
some other consequences, and in the first place
we want to give a discussion of the physical
significance of the transformation of coordinates
promised in Section 10. The new feature
about our coordinate systems is the greater
arbitrariness in their choice. Before we were
free to pass from one system to another with
the same time axis; now we may change the time
axis also (formulas 10.6), and we want to see
what it means.
In general, in one system of coordinates a
geometrical configuration is described by certain
numbers (e.g., coordinates of its points)
and certain equations (e.g., equations of its
straight lines); in another system of coordinates
the same configuration will be characterized
by other numbers and other equations, but
it will be another description of the same
configuration; or, we may say, another identification
which is theoretically just as good as the
first, but may be more, or less, convenient for
practical purposes. In general we make our
choice of a coordinate system guided by the
properties of the object we are studying and our own
position in space. If we study an ellipsoid we
would choose for coordinate axes its principal
axes; or in another case we would choose the di-
rection away from us as the y-axis, the direc-
tion to the right as the x-axis, the vertical
direction as the z-axis; but in principle all
axes are permitted. The same general situation
obtains in physics, considered as four-dimen-
sional geometry. We have many systems of coor-
dinate axes at our disposal, and we want to in-
vestigate now what use we can make of this ar-
bitrariness, how we can adjust the choice of ax-
es to the requirements of a particular situa-
tion. In particular, we are interested in the
choice of the t-axis.
The object we want to study in the first
place is the motion of a particle. We represent
such a motion by a curve in four-space (and
a straight line we consider as a special case of
a curve) . At every point of that curve we have
a unit tangent vector, the four-dimensional ve-
locity vector of components a,b,c,d; or we may
characterize it by the three-dimensional veloc-
ity components u,v,w; if we pass to another co-
ordinate system the components a,b,c,d will be
changed, and so will the components u,v,w. If
the coordinate transformation affects only the
space coordinates x,y,z then the component d will
not be affected, and therefore β will not change;
in other words, u,v,w will be changed but not
u² + v² + w²; the velocity vector will have different
components, but its absolute value, the
speed, will be the same. This is essentially as
in old physics; the new feature is in the exist-
ence of transformations affecting t, and the
most striking result of it is expressed in the
following theorem.
Theorem. For every motion it is possible
for every moment to choose a system of space-
time coordinates in such a way that the speed
be zero.
Proof. Begin with any axes; then, without
changing the time axis, change the space axes
so that the motion, at the moment considered,
takes place along the z-axis; we have then
a = b = 0; now consider a transformation in-
volving z and t; denoting the new components of
the four-dimensional velocity vector a', b', c',
d' we shall have (10.6)
$a' = 0, \quad b' = 0, \quad c' = c\sigma + d\tau, \quad d' = c\tau + d\sigma;$
if we want to make c' = 0 we have to choose the
angle $\varphi$ so that
$\tau/\sigma = \sin\varphi = -c/d;$
if c is in absolute value less than d, and this
is so for all motions of bodies so far observed,
an angle satisfying this relation and therefore
a system of coordinates for which a = b = c = 0
can be found. From formulas 12.2 it follows
that in such a system u = v = w = 0, and the
theorem is proved.
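The construction used in the proof can be tried numerically. The following is only a sketch under stated assumptions: the helper names `boost_zt` and `rest_coefficients` are hypothetical, and the sample four-velocity assumes that formulas 12.2 (not reproduced in this section) give c = βw, d = β with β = 1/√(1 − w²) for a motion along the z-axis.

```python
import math

def boost_zt(z, t, sigma, tau):
    """Transformation 10.6 in the z-t plane: z' = z*sigma + t*tau, t' = z*tau + t*sigma."""
    return z * sigma + t * tau, z * tau + t * sigma

def rest_coefficients(c, d):
    """Return (sigma, tau) with sigma^2 - tau^2 = 1 and tau/sigma = -c/d."""
    ratio = -c / d
    if abs(ratio) >= 1.0:
        raise ValueError("the construction requires |c| < |d|")
    sigma = 1.0 / math.sqrt(1.0 - ratio * ratio)
    return sigma, sigma * ratio

# Illustrative four-velocity for a particle moving along z with speed w = 0.6.
# The form c = beta*w, d = beta is an assumption about formulas 12.2.
w = 0.6
beta = 1.0 / math.sqrt(1.0 - w * w)
c, d = beta * w, beta

sigma, tau = rest_coefficients(c, d)
c_new, d_new = boost_zt(c, d, sigma, tau)
print(c_new, d_new)  # c_new vanishes up to rounding; d_new is close to 1
```

If |c| were not less than |d| the helper raises an error, which corresponds to the condition stated in the proof.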
The theorem just proved is expressed often
by saying that every particle can be transformed
to rest.
After we have found a coordinate system in
which a particle is at rest we can perform any
transformations of space coordinates and the
property will not be destroyed; any transforma-
tion involving time, on the contrary will result
in introducing significant space components of
the velocity vector; we see thus that whether a
particle is at rest in a coordinate system de-
pends exclusively on the choice of the time ax-
is, so that the choice of the time axis is equiv-
alent to the choice of a body which we desire to
consider at rest; in other words, the direction
of the time axis may be characterized by indi-
cating what particle is at rest in the corres-
ponding coordinate system.
What time axis we actually choose depends,
as in geometry, on circumstances; in many cases
we shall want to consider ourselves as being at
rest, or our laboratory, or the earth.
In what precedes we spoke of a motion of a
particle at a given moment; in a given system
of space-time coordinates a particle may be at
rest at one moment and not at rest some other
time; but there exists a class of particles which
if transformed to rest for one moment will be at
rest always; these are those particles whose
representative four-dimensional curves have the
same tangent vector at all points, i.e., are
straight lines; it is clear that if the direc-
tion of such a straight line is taken as the di-
rection of the t-axis the velocity in the so ob-
tained coordinate system is zero. But if we
choose any other (cartesian) coordinate system,
the three-dimensional velocity u,v,w, will be
constant in absolute value and direction, so
that we have a rectilinear uniform motion. From
our point of view then the distinction between
uniform rectilinear motion and rest is a non-es-
sential distinction, this distinction does not
exist until we introduce a coordinate system; it
is of the same nature as the distinction between
lines which are and those which are not parallel
to the x-axis in ordinary analytic geometry.
If a motion is not uniform and rectilinear
then there is no coordinate system in which the
particle is permanently at rest. But rather
than to make a strict distinction between par-
ticles which are and those which are not in un-
iform rectilinear motion or rest, it is more in
keeping with our point of view to speak of par-
ticles which may be (within experimental error)
considered at rest for a sufficiently long peri-
od of time.
We are now in a position to explain the
name and the origin of the theory we are study-
ing. We saw that in this theory there is no
such thing as absolute rest or absolute motion
of a body. If it is at rest with respect to
one system of coordinates it may move with re-
spect to another and vice versa; we can only
speak of relative motion; that is where the
name Relativity comes from.
If we adopt this point of view, we have to
consider as permissible all transformations of
coordinates from one to any other cartesian co-
ordinate system. Later on we shall consider
other, more general, systems of coordinates, and
therefore more general coordinate transforma-
tions; we shall replace our equations by more
general ones which will be invariant under these
more general coordinate transformations; in com-
parison with this situation we may say that we
consider now only special coordinate transfor-
mations and invariance under them; therefore the
present theory is called "Special Relativity
Theory".
It may be mentioned that the historical
order of appearance of the ideas of our sub-
ject - as it happens so often - has been quite
different from the order which seems natural and
in which we have presented them. First the
formulas of transformation involving space co-
ordinates and time have been introduced by Lor-
entz without, however, giving to them the mean-
ing they have now; in Lorentz's theory there
exists one universal time t, and other times t'
play only an auxiliary part. The merit of mak-
ing the decisive step and recognizing the fact
that all these variables are on the same foot-
ing - belongs to Einstein (1905). The four-di-
mensional point of view, after some preliminary
work had been done by Poincaré and Marcolongo,
was most emphatically introduced by Minkowski
in 1908.
15. Addition of Velocities.
As explained in Section 13 we have two ways
of characterizing the velocity of a body: by
means of the three-dimensional velocity vector
and by means of the four-dimensional velocity
vector. We can pass from one representation of
velocity to the other without difficulty and
the two methods are equivalent as long as we do
not change our coordinate system. But if we
come to study the relative motion of one body
with respect to another and want to define the
relative velocity, the four-dimensional point
of view leads to conceptions which are at vari-
ance with commonly accepted ideas and we want to
devote this section to the clarification of this
situation. It is natural to reduce the defin-
ition of relative velocity of a body with re-
spect to another body to the conception of the
velocity of a body in a coordinate system by
saying: By velocity of the body B with respect
to a body A we mean the velocity of B in a sys-
tem of coordinates in which the velocity of A
is zero.
If we want to find the velocity of B with
respect to A we have to transform our coordi-
nates so that in the transformed coordinate sys-
tem A be at rest. It is clear that the meaning
of relative velocity is made to depend by the
preceding definition on what we mean by trans-
formation of coordinates. If by transformation
of coordinates we mean only transformation of
three-dimensional coordinates - transition to
moving axes - we have the old idea of relative
velocity; if, on the other hand, we consider
four-dimensional coordinate axes and our trans-
formation of coordinates involves the coordinate
x4, or t, in the sense of the theorem of Sec-
tion 14, it is clear that we give a new meaning
to relative velocity, and we should not be sur-
prised if the so defined "relativistic" rela-
tive velocity will possess properties different
from those of the "classical" relative velocity.
Consider a body A and a body B that moves
with respect to A uniformly and rectilinearly
with a velocity $V_{BA}$; this means according to our
definition that the velocity of B in a coordi-
nate system in which A is at rest is $V_{BA}$. In-
troduce a coordinate system in which A is at
rest and B moves along the z-axis, and call the
coordinates $x_A, y_A, z_A, t_A$; introduce also a
system of coordinates $x_B, y_B, z_B, t_B$ in which B
is at rest so that (10.6)
15.1    $x_A = x_B, \quad y_A = y_B, \quad z_A = z_B\sigma_{BA} + t_B\tau_{BA}, \quad t_A = z_B\tau_{BA} + t_B\sigma_{BA}.$
Now describe the motion of the body B in each
system neglecting the x and y coordinates. In
the system A the motion of the body B which, we
assume, at the moment t = 0 was at z = 0 is
given by
$z_A = V_{BA}\,t_A;$
in the system B the body B is at all times at
the origin of the coordinate system, so that
$z_B = 0$; substituting this value in the trans-
formation formulas and eliminating $t_B$ we get
$z_A = t_A\,\tau_{BA}/\sigma_{BA}$. Comparing this with the preceding
equation we have (10.5)
15.2    $V_{BA} = \tau_{BA}/\sigma_{BA}.$
Solving the above transformation formulas
for $z_B, t_B$ we also find that $\tau_{AB} = -\tau_{BA}$,
$\sigma_{AB} = \sigma_{BA}$, so that
$V_{AB} = -V_{BA}.$
This result, that the relative velocity of A
with respect to B is the negative of the rela-
tive velocity of B with respect to A, is in keep-
ing with the old ideas.
Now consider three bodies, A, B, C, all mov-
ing in one direction (more precisely B and C
moving in the same direction with respect to A).
Denote the velocities of B and C respectively
with respect to A by $V_{BA}$ and $V_{CA}$, and the ve-
locity of C with respect to B by $V_{CB}$. We have
in addition to the above transformation formu-
las the formulas
15.11   $z_A = z_C\sigma_{CA} + t_C\tau_{CA}, \quad t_A = z_C\tau_{CA} + t_C\sigma_{CA},$
and
15.12   $z_B = z_C\sigma_{CB} + t_C\tau_{CB}, \quad t_B = z_C\tau_{CB} + t_C\sigma_{CB},$
and also
15.21   $V_{CA} = \tau_{CA}/\sigma_{CA}, \quad V_{CB} = \tau_{CB}/\sigma_{CB}.$
Express now $z_A, t_A$ in terms of $z_C, t_C$ by sub-
stituting the values for $z_B, t_B$ given by the
transformation formulas 15.12 into the trans-
formation formulas 15.1; comparing the result
with the transformation formulas 15.11 we get
$\sigma_{CA} = \sigma_{CB}\sigma_{BA} + \tau_{CB}\tau_{BA}, \quad \tau_{CA} = \tau_{CB}\sigma_{BA} + \sigma_{CB}\tau_{BA},$
whence, using the above expressions of veloci-
ties in terms of transformation coefficients,
15.2 and 15.21,
15.3    $V_{CA} = \frac{V_{BA} + V_{CB}}{1 + V_{BA}V_{CB}}.$
This is the Einstein formula for addition of
velocities for the case of two motions in the
same direction. This formula should be com-
pared to the formula of addition of classical-
ly defined relative velocities
15.4
V = V' + V".
Of course, there is no contradiction between
the two formulas because they refer to differ-
ent quantities. Still it is legitimate to ask
which formula is better from the point of view
of experiment, which - if any - is "correct"
for the relative velocities that we actually
measure.
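The derivation of 15.3 from the composition of transformation coefficients can be checked numerically. The sketch below is illustrative only; the function names `coefficients`, `compose` and `einstein_sum` are hypothetical, and velocities are expressed in units in which the velocity of light is one.

```python
import math

def coefficients(v):
    """Return (sigma, tau) with sigma^2 - tau^2 = 1 and tau/sigma = v (15.2)."""
    sigma = 1.0 / math.sqrt(1.0 - v * v)
    return sigma, sigma * v

def compose(cb, ba):
    """Coefficients of the C-to-A transformation from those of C-to-B and B-to-A."""
    sigma_cb, tau_cb = cb
    sigma_ba, tau_ba = ba
    sigma_ca = sigma_cb * sigma_ba + tau_cb * tau_ba
    tau_ca = tau_cb * sigma_ba + sigma_cb * tau_ba
    return sigma_ca, tau_ca

def einstein_sum(v_ba, v_cb):
    """Formula 15.3 for two motions in the same direction."""
    return (v_ba + v_cb) / (1.0 + v_ba * v_cb)

v_ba, v_cb = 0.5, 0.5            # illustrative values
sigma_ca, tau_ca = compose(coefficients(v_cb), coefficients(v_ba))
v_ca = tau_ca / sigma_ca         # relative velocity read off as in 15.21

print(v_ca, einstein_sum(v_ba, v_cb), v_ba + v_cb)
```

For these sample values the composed transformation and formula 15.3 agree on 0.8, while the classical formula 15.4 would give 1.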
In ordinary units the second term in the
denominator in formula 15.3 should be divided
by the square of the velocity of light, so that
for moderate velocities the formulas give re-
sults that differ numerically very little, and
it seems to be difficult to devise an experi-
ment with high enough velocities of material
particles so that the formulas could be tested
directly. In the next section we shall consid-
er the case when one of the velocities is that
of light; in the meantime we may mention that
formula 15.3 is a special case of a more gener-
al formula which corresponds to the case when
the two motions are not in the same directions.
This general formula gained temporary importance
some years ago when it played a decisive role
in the early stages of the application of the
idea of the spinning electron to the explanation
of spectra.
16. Light Corpuscles, or Photons.
In studying curves in four-space represent-
ing motions of particles we succeeded (Section
12) in choosing a standard parameter, s, by con-
sidering the expression
$(dx/dp)^2 + (dy/dp)^2 + (dz/dp)^2 - (dt/dp)^2$
and by setting dp/ds equal to the reciprocal of
the square root of minus the above expression.
This procedure would not work if the above ex-
pression were equal to zero. We can imagine in
our four-dimensional geometry curves and straight
lines for which the above expression is zero
(Section 10), and the question arises: what will
be the physical interpretation of such curves;
in other words: is there anything in physics
that could be identified with such curves in the
same way that motions of particles are identi-
fied with curves for which the above expression
is negative. In order to answer this question
let us calculate the three-dimensional velocity
corresponding to such a curve; if the above ex-
pression is zero for one choice of parameter it
will be zero for all choices; using t as param-
eter, and using physical coordinates we have
then
$(dx/dt)^2 + (dy/dt)^2 + (dz/dt)^2 - 1 = 0$
or $u^2 + v^2 + w^2 = 1$,
i.e., we can say that the curves of zero square
tangent vectors correspond to what we have to
call from the three-dimensional point of view
particles moving with the velocity of light.
This suggests to identify such curves in some
way with propagation of light.
Since the time of Newton and Huygens two
theories of light have been vying for suprem-
acy with variable success; according to one,
the so-called corpuscular theory, light con-
sists (like matter on the discrete theory) of
particles which lately (Wolfers, 1925) have
been called "photons"; according to the other
theory light is a wave phenomenon. For our
present purposes the former view seems to be
better adapted. If we adopt it we can make our
former statement more specific by saying that
we identify curves of zero square tangent vec-
tors with photons, or with motion of photons.
In adopting thus the corpuscular theory of
light we do not in the least mean to say that
the corpuscular theory of light is correct, and
still less that the other theory - the wave the-
ory - is wrong. We simply want to show that
the identification just mentioned permits us to
give account of certain light phenomena; and it
is enough to mention polarization in order to
see that other phenomena are left out.
To begin with we want to point out an ad-
vantage that the Relativity theory has compared
to classical theory in the matter of corpuscu-
lar theory of light. In classical theory dif-
ference in velocity is merely a quantitative
difference, in relativity this means an entire-
ly different kind of curves, and there are oth-
er differences entirely of qualitative nature
that are consequences of our identification,
which is more in keeping with the nature of
light compared to matter as we know it from ex-
periment. This seems to constitute a very
strong argument in favor of the adoption of the
point of view of Relativity in general, and of
the identification we are discussing now in
particular.
We want next to discuss what is usually
referred to as constancy of the velocity of
light. The reader may have noticed that a while
ago when we were calculating the three-dimen-
sional velocity, corresponding to curves with
zero-square tangent vectors, we did not say in
what coordinate system we wanted to calculate
this three-dimensional velocity. As a matter
of fact, the result shows that it is independ-
ent of the coordinate system; i.e., no matter
what bodies we consider as being at rest, we
come out with the same value for the velocity
of light, in our units - one.
This seems surprising; it contradicts the
commonly accepted ideas concerning addition of
velocities; but we have been led to a different
formula for the addition of velocities, and we
can show that the constancy of velocity of light
is in agreement with that formula 15.3. In fact,
if we consider the case that C moves with the
velocity of light (that is one in our units)
with respect to B, that means that $V_{CB} = 1$; sub-
stituting this value in 15.3 we find that $V_{CA}$
is also one; that is, what is motion with veloc-
ity one in one system is motion with velocity
one in another system.
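This consistency can be illustrated in two ways: by substituting velocity one into formula 15.3, and by applying the transformation 10.6 directly to a direction of zero square. The sketch below is only an illustration; the names `einstein_sum` and `boost_zt` and the sample velocities are assumptions made for the example.

```python
import math

def einstein_sum(v_ba, v_cb):
    """Formula 15.3 for two motions in the same direction."""
    return (v_ba + v_cb) / (1.0 + v_ba * v_cb)

def boost_zt(z, t, v):
    """Transformation 10.6 with tau/sigma = v and sigma^2 - tau^2 = 1."""
    sigma = 1.0 / math.sqrt(1.0 - v * v)
    tau = sigma * v
    return z * sigma + t * tau, z * tau + t * sigma

# C moves with velocity one relative to B; B has various velocities relative to A.
speeds = [einstein_sum(v, 1.0) for v in (-0.9, 0.0, 0.3, 0.99)]

# A displacement of zero square (z = t) keeps the ratio z/t = 1 after the transformation.
z_new, t_new = boost_zt(1.0, 1.0, 0.3)
light_speed_new = z_new / t_new

print(speeds, light_speed_new)
```

In every case the resulting velocity is one, in agreement with the statement that motion with velocity one in one system is motion with velocity one in another.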
This discussion, of course, proves nothing
but the inner consistency of the theory.
Another question is whether constancy of
velocity of light, i.e., independence of this
velocity from the choice of the system which is
considered at rest is consistent with experi-
ment. As a matter of fact, it appears that it
is; the weight of experimental evidence seems
to be for it. Historically, results of some
experiments by Michelson and Morley performed
in 1887 and pointing in the same direction play-
ed a great role in the creation of the Theory
of Relativity.
Having considered thus the question of ve-
locity of light we pass to the discussion of an-
other consequence of our identification.
We have decided in a general way to identi-
fy straight lines whose vectors are of zero
square with light or the motion of photons in
the same way that straight lines whose vectors
are of negative square are identified with mat-
ter or uniform rectilinear motion of material
particles. But a straight line (in four dimen-
sions) does not characterize the motion of the
particle completely - it only gives the velocity
of the motion, it characterizes it only kinema-
tically; for a complete dynamical characteriza-
tion we had to introduce (Section 13) the mass
of the particle, and that led us to introduce
the momentum vector, whose square we found to
be $-m^2$; the complete characterization of a mate-
rial particle consists then of a line with a
vector (of negative square) on that line. In
the same way we shall characterize the motion of
a photon by a line with a vector of zero square
on it. We have thus the same picture for a ma-
terial particle and a photon; in both cases we
have a line with a vector on it; only in the
first case it is a vector of negative square;
in the second of zero square; this difference
corresponds to the difference in the speeds of
the particles in the classical theory. But in
the classical theory this is a purely quantita-
tive difference and here, as mentioned before,
it leads to qualitative differences, some of
which we are going to consider.
In the first place a photon cannot be trans-
formed to rest. In fact transforming a photon
to rest would mean finding a coordinate system
such that in it the time axis will have the di-
rection of the photon; but that would mean that
the vector 0,0,0,1 would have zero square which
is impossible.
Then there is this distinction: two mate-
rial particles may differ in mass, that means in
the squares of their momentum vectors, and this
is an essential difference because the square
of a vector is not affected by a transformation
of coordinates; all photons on the other hand
have vectors of the same square, namely zero.
We shall prove that as a consequence of this,
two photons never differ essentially, that is,
given two photons there always exist two systems
of coordinates in which the descriptions of the
two photons are the same. To begin with, we may
choose the origins of the two coordinate sys-
tems on the respective straight lines; next we
may consider the two lines in the respective
z-t planes. The momentum vectors of the cor-
responding photons will have now in their re-
spective coordinate systems the components
$0, 0, q_3, q_4$ and $0, 0, p_3, p_4$ (contrary to our gener-
al agreement we use here subscripts with physi-
cal coordinates) and since both vectors are of
zero square we'll have
$q_3^2 - q_4^2 = 0, \quad p_3^2 - p_4^2 = 0;$
by choosing appropriately the sense on each co-
ordinate axis we can reduce these conditions to
16.1    $q_3 = q_4, \quad p_3 = p_4.$
Now perform in the second system the transfor-
mation 10.6
$z' = z\sigma + t\tau, \quad t' = z\tau + t\sigma,$
which applied to the second vector and taking
into account 16.1 gives
16.2    $p_3' = p_4' = p_3(\sigma + \tau).$
But $\sigma$ and $\tau$ are subject only to the condition
that $\sigma^2 - \tau^2 = 1$, so that we can choose $\sigma + \tau$
arbitrarily; if we make the choice
16.3    $\sigma + \tau = q_3/p_3,$
we shall have $p_3' = q_3$ and the statement is
proved.
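The choice 16.3 can be made explicit. Since $(\sigma + \tau)(\sigma - \tau) = 1$, prescribing $\sigma + \tau = k$ fixes $\sigma - \tau = 1/k$. The following sketch assumes a positive ratio $k = q_3/p_3$ (which the choice of sense on the axes is taken to guarantee); the helper name `match_photons` and the sample components are hypothetical.

```python
def match_photons(p3, q3):
    """Return (sigma, tau) with sigma^2 - tau^2 = 1 and sigma + tau = q3 / p3 (16.3)."""
    k = q3 / p3
    if k <= 0.0:
        raise ValueError("this sketch assumes q3 / p3 > 0")
    # (sigma + tau) * (sigma - tau) = 1, hence sigma - tau = 1 / k
    sigma = 0.5 * (k + 1.0 / k)
    tau = 0.5 * (k - 1.0 / k)
    return sigma, tau

# Illustrative zero-square momentum vectors, already reduced to the form 16.1.
p3 = p4 = 2.0
q3 = q4 = 5.0

sigma, tau = match_photons(p3, q3)
p3_new = p3 * sigma + p4 * tau   # z-component after transformation 10.6
p4_new = p3 * tau + p4 * sigma   # t-component after transformation 10.6

print(sigma, tau, p3_new, p4_new)
```

With these sample values the transformed components of the second photon coincide with those of the first, as the argument requires.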
This theoretical conclusion, that any two
photons are not essentially different from each
other must be confronted with experience. At
first sight it seems to contradict it. We know
that light differs from case to case; it differs
in intensity and color. For difference in in-
tensity we account by assuming that every bean
of light consists of many photons so that in-
tensity (for a given color) is proportional to
the number of photons in the beam. Remains
color. But experiments show that color actual-
ly depends on the state of motion of the ob-
server; when an observer approaches a source of
light, color seemingly changes (Doppler effect)
and so the field is clear for our assertion.
Now let us see how it works out.
Before we treat the situation from the
point of view of the Relativity theory we have
to say a few words about how color appears in
physics as a measurable quantity. From the
point of view of the wave theory to light is
attached a certain measurable quantity $\nu$, the
"frequency", which corresponds to color in the
sense that different colors correspond to dif-
ferent frequencies. On the corpuscular theory
photons are characterized by their energies, E,
and the fundamental relation between frequency
and energy is given by the formula
16.4
$E = h\nu,$
where h is the so-called Planck's quantum con-
stant, which for us appears simply as coeffici-
ent of proportionality establishing the rela-
tion between the values of two quantities which
measure the same thing in different units, much
in the same way that c, the velocity of light,
appears in the formula connecting mass and en-
ergy (compare 15.41). Of the two quantities, $E$
and $\nu$, which can be used to measure color, E
will be the one that is more convenient for our
purposes because we use the corpuscular theory
of light.
The question now is with what quantity in
our theory are we going to identify E. In order
to have a suggestion we notice that E is of the
character of kinetic energy; it plays for light
particles the same role that kinetic energy
plays for material particles. There (Section
13) we identified kinetic energy with the sec-
ond term in the development
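$\frac{m}{\sqrt{1 - u^2 - v^2 - w^2}} = m + \tfrac{1}{2}m(u^2 + v^2 + w^2) + \tfrac{3}{8}m(u^2 + v^2 + w^2)^2 + \dots$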
of the fourth component of the momentum vector;
of the other terms the third and the following
are negligible for material particles, and the
first is a constant so that it plays no part in
these considerations where only differences in
energy are important; besides, the correspond-
ing constant for light is zero; everything leads
us thus to compare E with the fourth component
of the momentum vector of light, or photon. We
arrive in this way to a new identification; we
identify the mathematical quantity "time compo-
nent of the momentum vector of a photon" with
the physical quantity E which, except for a fac-
tor of proportionality, is frequency and mea-
sures color. This identification makes color
dependent on the coordinate system but this de-
pendence, as was said before, is to be expected,
and our next question is whether the character
of this dependence corresponds to experimental
facts.
Suppose that E is the energy of a light
corpuscle in one system of coordinates; what
will it be in another? We have already calcu-
lated how the components of a zero square vec-
tor change under a transformation of coordinates
involving time. Formula 16.3 shows that the
ratio of the fourth components of the two vec-
tors, and according to our identification this
means the ratio of the energies or frequencies,
is
$\nu'/\nu = \sigma + \tau.$
On the other hand, we saw before (15.2) that the
relative velocity of two bodies which are at rest
in the corresponding systems of coordinates is
$V = \tau/\sigma;$
this gives
$\nu'/\nu = \sigma(1 + V),$
and taking into account the identity $\sigma^2 - \tau^2 = 1$
we find
16.4    $\nu'/\nu = \frac{1 + V}{\sqrt{1 - V^2}},$
or, in first approximation, $1 + V$.
Let us try to figure out the predicted
change of frequency on the classical (wave) the-
ory. If we have a wave of frequency $\nu$ that
means that there are $\nu$ vibrations per unit of
time, and since the velocity is unity, there
will be $\nu$ waves per unit of length. Now, if we
move toward the source with a velocity V we
shall travel in a unit of time the distance V
and we shall meet $V\nu$ additional vibrations,
so that the number of vibrations our eye re-
ceives in a unit of time will be $(1 + V)\nu$, and
this will be the frequency for the moving ob-
server. The two theories give then the predic-
tions
$\nu'/\nu = \frac{1 + V}{\sqrt{1 - V^2}} \quad \text{and} \quad \nu'/\nu = 1 + V$
for the change in frequency due to motion of
the observer, and the difference between these
two values is too small to be subjected to an
experimental test; within experimental error
both seem to fit observations equally well.
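To get a feeling for how small the difference between the two predictions is, one may evaluate both ratios for a moderate velocity. The sketch below is illustrative; the function names are hypothetical and the value V = 0.0001 (in units of the velocity of light) is merely an assumed sample, of the order of the orbital velocity of the earth.

```python
import math

def relativistic_ratio(v):
    """Frequency ratio (1 + V) / sqrt(1 - V^2) obtained from sigma + tau."""
    return (1.0 + v) / math.sqrt(1.0 - v * v)

def classical_ratio(v):
    """Frequency ratio 1 + V of the classical wave argument."""
    return 1.0 + v

v = 1.0e-4                       # assumed sample velocity, in units of c
difference = relativistic_ratio(v) - classical_ratio(v)

print(relativistic_ratio(v), classical_ratio(v), difference)
```

The difference is of the order of V²/2, that is, a few parts in a thousand million for this sample velocity.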
17. Electricity and Magnetism
in Special Relativity.
In the preceding sections of this chapter
we have discussed some modifications that are
brought about by the Theory of Relativity in
Kinematics, Mechanics, and Optics. There are
other modifications which have attracted a great
deal of popular attention due to their sensa-
tional and paradoxical character. We shall only
mention the so-called effects of motion on the
shape of bodies, lengths, measure of time, and
the fact that in the Theory of Relativity the
conception of simultaneity loses its absolute
character so that two events which are to be
considered simultaneous in one system of space-
time coordinates need not be simultaneous in an-
other. But we shall say a few words about elec-
tricity and magnetism. Even in the first chap-
ter the components of the electric and the mag-
netic force vectors were combined into one ten-
sor $F_{ij}$, so that electricity and magnetism seem
to be treated as two aspects, or manifestations
of a higher entity. But as long as we limit
ourselves to transformations of space coordi-
nates the components of F corresponding to
electricity are transformed among themselves and
those corresponding to magnetism - among them-
selves, so that their unification in one ten-
sor $F_{ij}$ may be considered as artificial. How-
ever, when we introduce transformations of
space-time coordinates (formulas 10.6) the sit-
uation changes radically.
Following the procedure used in Section 11
when we were proving the invariant character of
the relations between the tensors F and D we
can deduce the following formulas corresponding
to rotation in the x3x4 plane.
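$F'_{41} = F_{31}s + F_{41}c, \quad F'_{31} = F_{31}c - F_{41}s,$
$F'_{23} = F_{23}c - F_{24}s, \quad F'_{42} = F_{42}c - F_{23}s,$
$F'_{43} = F_{43}, \quad F'_{12} = F_{12}.$
From these mathematical formulas we can
pass to formulas involving physical components
and only real quantities by making use of 4.72,
10.3 and the fact that $F_{ij} = -F_{ji}$. We obtain
thus the relations
$X' = \sigma X + \tau M, \quad Y' = \sigma Y - \tau L, \quad Z' = Z,$
$L' = \sigma L - \tau Y, \quad M' = \sigma M + \tau X, \quad N' = N.$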
The interpretation of these formulas is that
if the unprimed letters give the components of
electric and magnetic force in one system the
primed letters will give the components of elec-
tric and magnetic force in a system which moves
with respect to the first with velocity $V = \tau/\sigma$.
These formulas show that the distinction
between electric and magnetic forces is not an
absolute distinction, but depends on the coor-
dinate system used; we might, for example, have
in the old coordinate system a purely electric
field, L = M = N = 0; in the new system the mag-
netic components will be different from zero,
viz., $-Y\tau$, $X\tau$, 0. What is the physical meaning
of this? It means that a field may have elec-
tric effects on one body but electric and mag-
netic effects on a body that moves with respect
to it. This prediction is verified by experi-
ment. The fact may be restated by saying that
an electric charge in motion has magnetic ef-
fects, it may, e.g., deflect a magnetic needle.
As an example, the magnetic field of a moving
electron may be easily calculated. We start
with an electron at rest. Its magnetic field
is supposed to be zero, its electric field is
supposed to be independent of time and symmetric
with respect to the point; under these conditions
Maxwell's equations reduce, as can be seen eas-
ily, to Newton's equations which, as we know,
give the inverse square law for the electric
forces. The field of an electron in motion can
now be obtained by applying the above formulas.
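As a sketch of how the transformation formulas for the electric components X, Y, Z and the magnetic components L, M, N (with primed components in the moving system) are applied, the following code transforms a purely electric field. The function name `transform_field`, the sample field values and the sample velocity are assumptions chosen for illustration only.

```python
import math

def transform_field(electric, magnetic, v):
    """Apply X' = sX + tM, Y' = sY - tL, Z' = Z, L' = sL - tY, M' = sM + tX, N' = N,
    where s = sigma, t = tau, tau/sigma = v and sigma^2 - tau^2 = 1."""
    sigma = 1.0 / math.sqrt(1.0 - v * v)
    tau = sigma * v
    X, Y, Z = electric
    L, M, N = magnetic
    electric_new = (sigma * X + tau * M, sigma * Y - tau * L, Z)
    magnetic_new = (sigma * L - tau * Y, sigma * M + tau * X, N)
    return electric_new, magnetic_new

# Purely electric field in the old system: L = M = N = 0 (sample values).
E_old = (1.0, 2.0, 3.0)
H_old = (0.0, 0.0, 0.0)
E_new, H_new = transform_field(E_old, H_old, 0.6)

print(E_new)  # electric components in the moving system
print(H_new)  # magnetic components -Y*tau, X*tau, 0
```

For velocity 0.6 we have σ = 1.25 and τ = 0.75, so the magnetic components in the new system come out as −1.5, 0.75, 0, in agreement with the expressions −Yτ, Xτ, 0.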
Chapter IV.
CURVED SPACE
The theory that has been developed so far
may be said to consist of two parts: a general
part which may be called Geometry and which, in
addition to material analogous to that treated
in ordinary geometry, includes general rules of
operations on tensors, and a special part which
may be called Physics and which deals with three
definite tensor fields, a scalar field $\rho$, a vec-
tor field $u_i$, an antisymmetric tensor field $F_{ij}$,
which all have been combined into the tensor
field $T_{ij}$, and with special conditions which we
impose on these fields, viz., equations 11.1 to
11.5. The second part is independent of the
first in the sense that we could have built with
the same geometry a different "physics", we could
have chosen another set of tensors instead of
$\rho$, $u_i$, $F_{ij}$. The reason why our physics was in-
dependent of our geometry is because the latter
does not furnish us any tensors, except the ten-
sor $\delta_{ij}$, or the tensor of scalar multiplication
which is, so to say, the same in all points
(and at all times) and therefore cannot be used
to explain the variety and change which are
characteristic of the outside world. In other
words, our geometry does not possess any struc-
ture which seems to be necessary for the inter-
pretation of the outside world and therefore we
had to superimpose on our geometry a certain
arbitrary structure by introducing special ten-
sors, by filling, so to say, the empty space-
time with these tensor fields. Our geometry
does not give us a landscape, it gives us, so
to say, only a frame for one, or only a stage
and the landscape can be constructed on it by
means of stage-settings which do not constitute
an organic part of the stage. Although some
success has been achieved with the theory just
described we may want to accomplish more, we may
want to have a geometry possessing a structure
of its own which might be used in interpreting
the outside world. Such a possibility is sug-
gested by the consideration of curved surfaces.
The space-time we have been working with is of
the same character as a plane (except for the
number of dimensions) it is as devoid of struc-
ture as a plane. A curved surface, on the oth-
er hand, possesses a certain structure; it is
not necessarily the same in all points, there
may be a difference in curvature. We shall in-
vestigate the possibility of a four-dimensional
space which bears the same relationship to our
flat space-time as a curved surface bears to a
plane; we shall expect to find that it possess-
es a certain structure which we'll try to inter-
pret in terms of our physical quantities; more
specifically, since all our physical quantities
have been combined (by formula 11.5) into a sym-
metric tensor of the second rank, viz., $T_{ij}$, we
shall expect to find a tensor of that character
connected with the curvature of our curved four-
dimensional space.
The plan of our study will be to begin with
the simplest case, a case that is even simpler
than that of a surface, viz., with the case of
a curve, and then to work up gradually.
18. Curvature of Curves and Surfaces.
We consider a curve in the plane. We as-
sume that it possesses a tangent at every point,
and, furthermore, that if the origin of coordi-
nates is chosen in any point of the curve, and
the tangent at the origin is chosen as the x-
axis, the curve, in the neighborhood of the ori-
gin may be represented by a function
$y = f(x),$
which can be expanded into a power series con-
verging in the neighborhood of the origin. The
constant term of this expansion vanishes because
the curve passes through the origin so that the
equation must be satisfied for x = 0, y = 0; the
coefficient of the first power of x also van-
ishes, since the slope of the tangent, which is
the x-axis, must be zero; if we write the next
term in the form
$\tfrac{1}{2}a x^2$
the coefficient a is called the curvature of
the curve at the point considered, i.e., the point
chosen for the origin. Since every point can
be chosen for the origin this assigns to every
point of the curve a curvature. We may say
that if we drop all terms of the expansion fol-
lowing the one just written out, i.e., consider
the curve
18.1    $y = \tfrac{1}{2}a x^2,$
this curve (a parabola) is an approximation to
the given curve in the neighborhood of the point
considered.
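As a small numerical illustration, take for the given curve a circle of radius R touching the x-axis at the origin (this particular curve, the helper names and the step sizes are assumptions chosen for the example). The coefficient a is estimated as the second derivative at the origin, and the parabola 18.1 is compared with the curve nearby.

```python
import math

R = 2.0  # radius of the sample circle (illustrative value)

def f(x):
    """Lower arc of a circle of radius R tangent to the x-axis at the origin."""
    return R - math.sqrt(R * R - x * x)

def curvature_at_origin(func, h=1.0e-3):
    """Estimate the coefficient a in y = a*x^2/2 by a central second difference."""
    return (func(h) - 2.0 * func(0.0) + func(-h)) / (h * h)

a = curvature_at_origin(f)

x = 0.1
parabola = 0.5 * a * x * x
error = abs(f(x) - parabola)

print(a, 1.0 / R, error)
```

For the circle the estimated coefficient is close to 1/R, and the parabola differs from the curve at x = 0.1 only by a quantity of the order of the fourth power of x.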
We consider next a surface. Here we as-
sume that it has a tangent plane at every point.
Taking a particular point on the surface as the
origin and the tangent plane at that point as
the xy-plane we may represent the surface by an
equation z = f(x,y). We again assume that for
every point on the surface this function may be
developed into a power series converging in the
neighborhood of that point. The constant term
and the coefficients of the first powers of the
variables will vanish as before. We write out
the next group of terms, those that are quadra-
tic in the variables, in the form
18.2    $z = a_{11}x_1^2 + 2a_{12}x_1x_2 + a_{22}x_2^2,$
where we use $x_1$ for x, and $x_2$ for y.
We may consider the coordinates $x_1$, $x_2$ as
the components of a vector in the tangent plane
which joins the origin to the projection of the
point on the surface, whose coordinates are $x_1$,
$x_2$, $z = f(x_1, x_2)$. The expression 18.2 assigns
thus to every vector in the tangent plane a num-
ber (which may be considered as the ordinate of
a paraboloid approximating the surface) . This
assignment is independent of the coordinate sys-
tem, i.e., if we choose another system of coor-
dinates we shall have the same number assigned
to the same vector although its components will
have changed; in fact in a rotation of the co-
ordinate axes the degree of a polynomial is not
affected so that the group of second degree
terms in the expansion of z is transformed into
the group of second degree terms of that expan-
sion in the new coordinates. We have then a
function whose values are numbers and whose ar-
gument is a vector; is it a tensor? Of course
not; but it is easy to introduce a tensor with
which our function 18.2 is closely connected.
We simply write, as in 9.1
18.3    $a_{11} x_1 y_1 + a_{12} x_1 y_2 + a_{21} x_2 y_1 + a_{22} x_2 y_2 = s(x,y)$
using the coefficients $a_{11}$, $a_{12}$, $a_{22}$ of our
function 18.2 and writing in the third term $a_{21}$
for $a_{12}$ for the sake of symmetry; this is a
(symmetric) tensor of the second rank depending
on two vector arguments x and y, and from which
our function is obtained by setting the vector
arguments equal to each other. We arrived thus
in the case of a surface at a tensor of rank
two which expresses the curvature properties,
i.e., the structure of the surface insofar as it
describes its deviation from its tangent plane
in the neighborhood of the point of contact.
This encourages us in our enterprise: if we
succeed in generalizing this result to higher
dimensions, we may try to identify the general-
ization of this symmetric tensor of rank two
with the symmetric tensor of rank two which, as
we saw, combines in itself matter and electric-
ity. We want to state at this time that we shall
be ultimately successful in our enterprise but
that everything will not run very smoothly, and
we shall have to make an effort in order to ar-
rive in the general case at a tensor of rank
two. The configurations which will present them-
selves immediately will not be exactly tensors,
and even after we shall arrive at a tensor it
will not be a tensor of rank two. We shall have
to overcome these obstacles, and in order to be
able to do that we shall need some preparation,
which we shall make by studying more attentively
the case of a surface before we pass to the con-
sideration of more complicated cases.
The curvature of a curve is characterized
by a number; that of a surface is a more compli-
cated thing and is characterized by a tensor;
we know, however, that there are certain num-
bers connected with a tensor, in an intrinsic
way (that is, independent of the system of co-
ordinates), viz., the numbers given by 9.5 and
9.51, and we may expect that they have geomet-
rical significance. In fact the first one,
$a_{11} + a_{22}$, is known as the mean curvature, and
the second,
18.4    $K = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}$
as the total curvature of the surface at the
point considered. We know that K is independ-
ent of the choice of a system of coordinates,
but we want to show how it can be obtained with-
out the use of any coordinates at all. We have
introduced above a vector notation s(x,y) for
our tensor 18.3; we now write out the expres-
sion (the expressions in coordinates are writ-
ten down for the sake of future references and
may be disregarded at present):
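18.5    $\begin{vmatrix} s(x,u) & s(x,v) \\ s(y,u) & s(y,v) \end{vmatrix} = \begin{vmatrix} a_{\rho\sigma} x_\rho u_\sigma & a_{\rho\sigma} x_\rho v_\sigma \\ a_{\rho\sigma} y_\rho u_\sigma & a_{\rho\sigma} y_\rho v_\sigma \end{vmatrix}$
where x,y,u,v are arbitrary vectors, and we as-
sert that it is equal to
18.6    $K \cdot \begin{vmatrix} x \cdot u & x \cdot v \\ y \cdot u & y \cdot v \end{vmatrix} = K \cdot \begin{vmatrix} x_\rho u_\rho & x_\rho v_\rho \\ y_\rho u_\rho & y_\rho v_\rho \end{vmatrix}$;
to prove this consider the expression
$\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} \cdot \begin{vmatrix} x_1 & x_2 \\ y_1 & y_2 \end{vmatrix} \cdot \begin{vmatrix} u_1 & v_1 \\ u_2 & v_2 \end{vmatrix}$;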
multiplying by the law of multiplication of de-
terminants the second and the third factors and
writing K for the first, we get 18.6; applying
the law of multiplication of determinants first
to the first two factors, then to the resulting
determinant and the third factor we get 18.5;
we may thus write
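18.7    $\begin{vmatrix} s(x,u) & s(x,v) \\ s(y,u) & s(y,v) \end{vmatrix} = K \cdot \begin{vmatrix} x \cdot u & x \cdot v \\ y \cdot u & y \cdot v \end{vmatrix}$;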
we may now obtain K without using any system of
coordinates by dividing the left hand side by
the second factor on the right; the vectors
x,y,u,v, may be any arbitrary four vectors, only
such that the second factor on the right does
not vanish. Setting x = i, u = i, y = j, v = j,
where i,j are two coordinate vectors we get
formula 18.4; now it is seen to hold for any
system of coordinate vectors, so that incident-
ally we have a new proof of the invariance of K.
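The relation between the determinant of the values of s and the determinant of scalar products can be checked numerically. The Python sketch below does this for randomly chosen coefficients and vectors; the helper names (s, dot, det2) and the use of a fixed random seed are assumptions of the example only.

```python
import random

def s(a, x, y):
    """Bilinear form a_pq x_p y_q built on a symmetric 2x2 coefficient table a."""
    return sum(a[p][q] * x[p] * y[q] for p in range(2) for q in range(2))

def dot(x, y):
    return x[0] * y[0] + x[1] * y[1]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

random.seed(0)
a12 = random.uniform(-1, 1)
a = [[random.uniform(-1, 1), a12], [a12, random.uniform(-1, 1)]]
# Four arbitrary vectors of the tangent plane.
x, y, u, v = ([random.uniform(-1, 1) for _ in range(2)] for _ in range(4))

K = det2(a)  # total curvature as the determinant of the coefficients
lhs = det2([[s(a, x, u), s(a, x, v)], [s(a, y, u), s(a, y, v)]])
rhs = K * det2([[dot(x, u), dot(x, v)], [dot(y, u), dot(y, v)]])
print(lhs, rhs)  # the two values should agree up to rounding
```

Dividing lhs by the determinant of scalar products recovers K whenever that determinant does not vanish, which is the coordinate-free recipe described in the text.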
Before we leave the topic of ordinary sur-
faces we want to establish a relation between
curvature of surfaces and curvature of curves.
The points common to our surface and the xz-
plane constitute a plane curve whose equation
in the xz-plane may be obtained from z = f (x,y)
by setting y = 0. It is clear that the x-axis
is a tangent to this curve and that the first
term of the expansion of z as a function f (x,0)
into a power series will be obtained by setting
$x_2 = 0$ in 18.2. We have thus
18.11    $\tfrac{1}{2} a_{11} x_1^2$
as this first non-vanishing term, and comparing
with 18.1 we see that the curvature of the curve
is $a_{11}$, or (18.3) the value s(i,i). We can con-
sider any plane passing through the z-axis as
the xz-plane, or any unit vector in the tangent
plane as the coordinate vector i; we have thus
the result, that the curvature at the point of
contact of a curve, resulting from the inter-
section of the surface with a normal plane is
s(i,i), if i is a unit vector common to the
tangent plane and the normal plane considered.
In other words to every direction in the tan-
gent plane, characterized by a unit vector i
corresponds a normal plane containing it, and
the curvature of the intersection of that plane
with the surface is s(i,i).
We see thus that to every direction in the
tangent plane corresponds a definite number
s(i,i), the curvature in that direction. As an
exercise the reader may try to express the cur-
vature corresponding to a direction in the tan-
gent plane in terms of the angle that direction
makes with the x-axis.
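A small numerical sketch may help with this exercise. Taking the unit vector i with components cos(theta), sin(theta) and evaluating s(i,i) gives the curvature in the corresponding direction; the function name normal_curvature and the sample coefficients below are assumptions made for illustration.

```python
import math

def normal_curvature(a11, a12, a22, theta):
    """Value s(i,i) for the unit vector i = (cos theta, sin theta)."""
    c, sn = math.cos(theta), math.sin(theta)
    return a11 * c * c + 2.0 * a12 * c * sn + a22 * sn * sn

# Sample coefficients of the quadratic terms of a surface.
a11, a12, a22 = 3.0, 0.5, 1.0
along_x = normal_curvature(a11, a12, a22, 0.0)
along_y = normal_curvature(a11, a12, a22, math.pi / 2)
print(along_x, along_y)
```

For any two perpendicular directions the two curvatures add up to the first invariant of the tensor, which gives a quick consistency check on the formula.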
As the next step of our discussion whose
general aim is to arrive at the most general
situation as far as the number of dimensions is
concerned both of the space from which we start
and the configuration in it that we study, we
take a skew curve in ordinary space; first we
studied a curve (n =1) in a plane (N = 2); then
a surface (n = 2) in the ordinary space (N = 3);
now we take up the case n = 1, N = 3. A curve
may be given in general by two equations on the
three coordinates x,y,z. Solving these equa-
tions for y and z, we represent the curve in the
form y = f(x), z = g(x); we again make the as-
sumption that a tangent exists for every point
and that for every point, if we take this point
as the origin and the tangent as the x-axis, it
is possible to solve for y and z, and that the
functions f and g can be developed into power
series; as a result of the choice of the coordi-
nate system, the two power series will begin
with quadratic terms
18.8    $y = \tfrac{1}{2} a_2 x^2 + \ldots, \quad z = \tfrac{1}{2} b_2 x^2 + \ldots$
If we change the y- and z-axes, which fall into
the normal plane to the curve, to other axes in
the same plane, the form of the development
will not be changed, of course, but the coeffi-
cients $a_2$, $b_2$ will assume new values; if, how-
ever, we consider these coefficients as compo-
nents of a vector, the vector represented by
them will be the same in all coordinate sys-
tems. Calling this vector v we may say that
the curvature situation of the curve is charac-
terized by the expression
18.9    $\tfrac{1}{2} v x^2$.
This expression plays the part of the expres-
sions 18.1 and 18.2 which have occurred in the
two preceding situations.
19. Generalizations.
In the preceding section we discussed con-
figurations in the ordinary space, and we could
rely on our intuition; everybody can conceive a
plane curve, a surface, a twisted curve; we
have at our disposal physical objects (drawings,
graphs, models) with measurements on which quan-
tities of our theories may be identified suc-
cessfully. In the investigation we undertake
now, we cannot use our intuition any more, and
the identifications, when they come, will be of
a much less immediate character. We have then
to rely on analogy with the configurations stud-
ied in the preceding section and on mathematical
reasoning supported by formulas.
We begin with what seems the next simplest
case, a surface in four-dimensional space; it
may be considered as a generalization both of a
surface and of a curve in ordinary space. Such
a surface is given, in general, by two equations on
the four coordinates; in other words, we define
as a surface in four-space the totality of
points whose coordinates satisfy two equations
F(x,y,z,t) = 0, G(x,y,z,t) = 0 where F,G are
two functions subjected to certain restrictions
to be imposed presently. We define a plane
in four dimensions as a surface which may be
given by two linear equations (this definition,
although given in terms of coordinates, is in-
variant, because it can be proved that if the
equations are linear in one system of coordi-
nates they remain linear after a transforma-
tion; the equivalence of this definition of the
plane with that given in Section 7 is easily
recognized). In the general case we choose a
point on the surface as the origin of coordi-
nates; we solve the two equations for two of the
coordinates, and we define as the tangent plane
at that point the plane whose equations result
from omitting all but linear terms in the ex-
pansions. We next choose that plane as one of
our coordinate planes; the lowest terms in the
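expansions are then the quadratic ones; denot-
ing the coordinates for which the equations ap-
pear as solved by $x_3$, $x_4$, and the two other
coordinates by $x_1$, $x_2$, we may write the groups
of quadratic terms in the two expansions as
19.1    $\tfrac{1}{2}\left(a_{11} x_1^2 + 2 a_{12} x_1 x_2 + a_{22} x_2^2\right), \quad \tfrac{1}{2}\left(b_{11} x_1^2 + 2 b_{12} x_1 x_2 + b_{22} x_2^2\right)$.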
For every vector in the tangent plane of compo-
nents $x_1$, $x_2$ this gives us two numbers which
may be considered as the components of a vector
in the normal $x_3$, $x_4$ plane; or, we may consider
this vector as given by a vector form
19.2    $\tfrac{1}{2}\left(v_{11} x_1^2 + 2 v_{12} x_1 x_2 + v_{22} x_2^2\right)$,
the coefficients $v_{ij}$ being vectors of the nor-
mal plane whose components are $a_{ij}$ and $b_{ij}$ (along
the $x_3$ and $x_4$ axes respectively). This expres-
sion assigns to every vector of the tangent
plane a vector of the normal plane; we may sub-
stitute for it, as we did in an analogous case
before (compare 18.3), a more general expres-
sion
19.3    $s(x,y) = v_{11} x_1 y_1 + v_{12} x_1 y_2 + v_{21} x_2 y_1 + v_{22} x_2 y_2$,
where $v_{21} = v_{12}$;
but although this is linear in each of the vec-
tors x and y it is not a tensor, because the
values of this expression are not numbers (they
are vectors in the normal plane). We shall not
introduce a special name for such expressions
because we shall not have to deal with them much;
the expression 19.3 we have denoted, as before,
by s(x,y), but we must keep in mind that the
values of s(x,y) are not numbers but vectors of
the normal plane.
We may in this case form the expression
18.5 where it is understood that in the expan-
sion of the determinant scalar products have to
be used where ordinary products were used be-
fore; this change is made necessary by the more
than once mentioned fact that the values of the
elements are vectors. In all other respects we
can apply to the expression the same reasoning
as before and we come to the conclusion that the
relation 18.7 remains true, where K is a number,
independent of the coordinate systems in the tan-
gent and normal planes, but which after such co-
ordinate systems have been chosen can be calcu-
lated in terms of the coefficients of the vector
form 19.3 by means of the formula
19.4    $K = \begin{vmatrix} v_{11} & v_{12} \\ v_{21} & v_{22} \end{vmatrix}$.
The important fact is that, although our expres-
sion 19.3 is a vector expression, and therefore
does not furnish us a tensor, the invariant cor-
responding to 19.4 is still a number. The other
invariant, which could be called mean curvature
and written as $v_{11} + v_{22}$, is not a number in this
case. The number, K, we call, as before, the
total curvature of the surface at the point con-
sidered.
In terms of the coefficients of the numer-
ical forms 19.1 the total curvature K may be
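expressed as follows: expanding the determin-
ant 19.4 we have $K = v_{11} \cdot v_{22} - v_{21} \cdot v_{12}$; the
term $v_{11} \cdot v_{22}$, for instance, is the scalar prod-
uct of the vectors $v_{11}$ and $v_{22}$ whose components
are respectively $a_{11}$, $b_{11}$ and $a_{22}$, $b_{22}$; the sca-
lar product $v_{11} \cdot v_{22}$ is then $a_{11} a_{22} + b_{11} b_{22}$; in
the same way, the term $v_{21} \cdot v_{12}$ in the expression
for K is $a_{12} a_{21} + b_{12} b_{21}$, so that we have for
K in terms of the a's and b's, rearranging terms
and using determinant notation,
19.41    $K = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} + \begin{vmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{vmatrix}$.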
The next generalization is an easy one; we
still consider a surface (n = 2) but instead of
a four-dimensional space we take a space of an
arbitrary number of dimensions N; we denote
N - n = N - 2 by r and we have a tangent plane,
as before, but instead of a normal plane, we
have now a normal r-dimensional space, an r-flat
as we may say. We call the corresponding
coordinates $x_3$, $x_4$, etc., or $x_{2+k}$, where k goes
now from 1 to r instead of only taking the val-
ues 1 and 2; we have here a vector form which
may be written as before (19.2) only the v's
are vectors of the normal flat and have r com-
ponents each; these components we may distin-
guish by upper indices in brackets; if we de-
note by $I_k$ the r coordinate vectors in the nor-
mal flat, and denote by $a_{ij}^{(k)}$ the components cor-
responding to $I_k$ of $v_{ij}$ we may write
19.5    $v_{ij} = \sum_{k=1}^{r} a_{ij}^{(k)} I_k$
and for s(x,y) we may write
19.6    $s(x,y) = \sum_{k=1}^{r} a_{\rho\sigma}^{(k)} x_\rho y_\sigma I_k$;
otherwise there will be no changes. We can form
the expression 19.4 as before; it will be inde-
pendent of the choice of the coordinates $x_{2+k}$
because the scalar products used in the expan-
sion of the determinant are; substituting the
values 19.5 and evaluating we will have
19.42    $K = \sum_{k=1}^{r} \begin{vmatrix} a_{11}^{(k)} & a_{12}^{(k)} \\ a_{21}^{(k)} & a_{22}^{(k)} \end{vmatrix}$,
an obvious generalization of formula 19.41 which
may be obtained from this by taking r = 2, and
writing a for $a^{(1)}$ and b for $a^{(2)}$ with proper
subscripts. We may, if we wish, write out an
expression analogous to 18.5. Substituting for
s(x,u) etc., the expressions 19.6 and using
scalar products in the evaluation of the deter-
minant we shall find
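19.43    $\begin{vmatrix} s(x,u) & s(x,v) \\ s(y,u) & s(y,v) \end{vmatrix} = \sum_{k=1}^{r} \begin{vmatrix} a_{\rho\sigma}^{(k)} x_\rho u_\sigma & a_{\rho\sigma}^{(k)} x_\rho v_\sigma \\ a_{\rho\sigma}^{(k)} y_\rho u_\sigma & a_{\rho\sigma}^{(k)} y_\rho v_\sigma \end{vmatrix}$
where the Greek letter subscripts imply summa-
tion from one to two corresponding to the tan-
gent plane, and the summation with respect to k
corresponding to the r coordinates of the nor-
mal flat is indicated in the usual fashion. Each
of the determinants corresponding to the differ-
ent values of k is exactly of the same nature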
as 18.5 so that the reasoning which led from it
to 18.7 applies to each of the determinants
without change, and it is easy to see that form-
ula 18.7 continues to hold. We may use this
formula to define K which we continue to call
total curvature.
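As a numerical check of this statement, the Python sketch below takes a surface with an r-dimensional normal flat (here r = 3), forms the vector-valued s(x,y) from r symmetric tables of coefficients, expands the determinant with scalar products, and compares the result with K times the determinant of scalar products of the arguments. The helper names and the random sample data are assumptions of the example.

```python
import random

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def dot(p, q):
    return sum(pi * qi for pi, qi in zip(p, q))

def s(forms, x, y):
    """Vector-valued form: the k-th normal component is a^(k)_pq x_p y_q."""
    return [sum(a[p][q] * x[p] * y[q] for p in range(2) for q in range(2))
            for a in forms]

def total_curvature(forms):
    """Sum over the normal directions of the 2x2 coefficient determinants."""
    return sum(det2(a) for a in forms)

rng = random.Random(0)
r = 3
forms = []
for _ in range(r):
    a11, a12, a22 = (rng.uniform(-1, 1) for _ in range(3))
    forms.append([[a11, a12], [a12, a22]])

x, y, u, v = ([rng.uniform(-1, 1) for _ in range(2)] for _ in range(4))
K = total_curvature(forms)
# Determinant of the values of s, with scalar products in place of ordinary products.
lhs = dot(s(forms, x, u), s(forms, y, v)) - dot(s(forms, x, v), s(forms, y, u))
rhs = K * (dot(x, u) * dot(y, v) - dot(x, v) * dot(y, u))
print(lhs, rhs)  # should agree up to rounding
```

The check works term by term: each normal direction contributes one ordinary determinant of the kind met for a surface in ordinary space.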
And now we come to the last generalization.
We consider, in a space of an arbitrary number
of dimensions N a curved space of n dimensions
with n also an arbitrary number < N, which by
definition is the totality of points whose co-
ordinates satisfy N - n = r equations
19.7    $F_k(x_1, x_2, \ldots, x_N) = 0 \quad (k = 1, 2, \ldots, r)$.
We assume that for every point in the curv-
ed space these equations can be solved for r
of the coordinates and that these solutions can
be expanded into power series in the remaining
n coordinates, converging in the neighborhood of
the point selected. By a transformation of co-
ordinates we may arrange it so that these expan-
sions begin with second degree terms so that
we may write
19.71    $x_{n+k} = \tfrac{1}{2} a_{\rho\sigma}^{(k)} x_\rho x_\sigma$ + terms of higher degree
where the summation indicated by the Greek in-
dices now goes from 1 to n. The sub-space de-
fined by the first n coordinate axes we call the
tangent flat space at the point considered, and
the sub-space corresponding to the remaining r
coordinate axes - the normal flat space at that
point.
As before, we use the coefficients $a_{ij}^{(k)}$ to
form the expressions
19.8    $a_{\rho\sigma}^{(k)} x_\rho y_\sigma$
where $x_i$, $y_i$ are two vectors of the tangent flat;
and we combine these expressions into a vector
expression
19.9    $s(x,y) = \sum_{k=1}^{r} a_{\rho\sigma}^{(k)} x_\rho y_\sigma I_k$.
It is natural to try to generalize the concept
of total curvature. We can form the expression
18.5 but, and this is important, the transfor-
mation 18.7 does not apply; it was based essen-
tially on the fact n = 2, and it breaks down
here.
20. The Riemann Tensor.
The way out of this difficulty is very
simple. Although relation 18.7 does not hold
we still may consider its left hand side; it is
a function of the four vectors x,y,u,v, a func-
tion which has numerical values; it is easy to
show that it is linear in each of the vector
arguments (we leave this proof to the reader
because the result will follow later from form-
ula 20.8); it is therefore a tensor, a tensor
of rank four; we call it the Riemann tensor,
denote it by R(x,y;u,v) and write
20.1    $R(x,y;u,v) = \begin{vmatrix} s(x,u) & s(x,v) \\ s(y,u) & s(y,v) \end{vmatrix}$.
We have then at every point of the curved
space a tensor of rank four instead of a number;
it is connected with the second degree terms of
the expansions 19.71 and therefore character-
izes, at least in part, the deviation of the ex-
pressions of the $x_{n+k}$ from linearity, or of our
space from flatness. The Riemann tensor tells
us then something about the curvature of the
curved space, and it is often called the curva-
ture tensor.
The situation we have now reminds us of a
situation in Section 18. The curvature of a
curve was a number; when we passed to a surface
we found that its curvature was characterized
by a tensor; we have succeeded in deriving from
this tensor a number K, so that we could ex-
press (at least partially) the curvature of a
surface by a number. Now passing to higher
curved spaces we again obtain a tensor. In the
preceding situation we succeeded in interpret-
ing the tensor s(x,y) given by formula 18.3 in
terms of curvatures of certain curves on the
surface; we found that the value s(i,i) gives
the curvature of the normal section determined
by the unit vector i and the normal to the sur-
face. Is it possible to interpret the Riemann
tensor in an analogous way as giving the total
curvatures of some surfaces on our curved space?
This is a natural question to ask, and the ans-
wer is affirmative. We shall prove, in fact,
that certain values of the Riemann tensor give
us total curvatures of surfaces situated on the
curved space. Let i,j be two arbitrary mutual-
ly perpendicular unit vectors of the tangent
flat; choose a set of coordinate vectors so
that i,j be two of them. Pass through i,j and
the normal flat, i.e., through the r vectors $I_k$,
a flat space of 2 + r dimensions; its points
will be those points of the N-space whose coor-
dinates $x_3$, $x_4$, ..., $x_n$ vanish; the intersection
of this flat space with the given curved space
will be a surface, i.e., a two-dimensional curv-
ed space, because the coordinates of its points
must satisfy the r equations of the curved space
(19.7) and n - 2 equations
20.2    $x_3 = 0, \; x_4 = 0, \; \ldots, \; x_n = 0$,
which together make N - n + n - 2 = N - 2 equa-
tions. This surface we may consider as a sur-
face of the r + 2 dimensional flat space 20.2;
its equations in that space will be obtained by
setting $x_3 = x_4 = \ldots = x_n = 0$ in the equations
19.71 of the curved space (just as the equation
of the normal section of a surface in the xz-
plane was obtained (preceding formula 18.11) by
setting y = 0 in the equation of the surface);
these equations will then become
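20.3    $x_{n+k} = \tfrac{1}{2}\left(a_{11}^{(k)} x_1^2 + 2 a_{12}^{(k)} x_1 x_2 + a_{22}^{(k)} x_2^2\right)$ + terms of higher degree
and the total curvature of this surface is
$K = \begin{vmatrix} v_{11} & v_{12} \\ v_{21} & v_{22} \end{vmatrix}$ with $v_{ij} = \sum_{k=1}^{r} a_{ij}^{(k)} I_k$;
but $v_{11} = s(i,i)$, $v_{12} = s(i,j)$, $v_{21} = s(j,i)$,
$v_{22} = s(j,j)$, so that we have
$K = \begin{vmatrix} s(i,i) & s(i,j) \\ s(j,i) & s(j,j) \end{vmatrix}$,
which is R(i,j;i,j), and our statement is proved.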
As we saw at the end of Section 18, a unit vec-
tor i in the tangent plane to an ordinary sur-
face determines a direction, a straight line,
which, together with the normal determines a nor-
mal plane, and the intersection of that normal
plane with the surface is a normal section of
curvature s(i,i); here we have the situation
that two unit vectors i,j in the tangent flat to
a curved space determine an orientation, a plane,
which together with the normal r-flat determines
a normal r + 2 flat, and the intersection of
that normal r + 2 flat with the curved space is
a normal section, a surface of curvature
R(i,j;i,j). We see then that the Riemann ten-
sor plays with respect to a curved space a role
analogous to that played by the tensor s(x,y)
with respect to an ordinary surface; our expec-
tations then are fulfilled; we need, it is true,
for the purposes of identification with the com-
plete tensor T, a tensor of the second rank,
but we shall get one of rank two from R(x,y;u,v)
by applying to it the operation of contraction.
In the meantime let us study the Riemann
tensor, or the curvature tensor as it is called
sometimes, as we have it. The Riemann tensor
is not a general tensor of rank four. It sat-
isfies the relations
20.41    $R(x,y;u,v) = -R(y,x;u,v) = -R(x,y;v,u)$,
20.42    $R(x,y;u,v) = R(u,v;x,y)$,
20.43    $R(x,y;u,v) + R(x,u;v,y) + R(x,v;y,u) = 0$,
which are easily verified by using 20.1. The
first of these relations says that R is anti-
symmetric in each of the pairs of the vector
arguments, and the second, that it is symmetric
in the two pairs.
If we introduce a coordinate system in the
tangent flat, by picking four coordinate vectors
i,j,k,l or $i_1$, $i_2$, $i_3$, $i_4$, we may represent the
vector arguments (as in Section 9) in the form
$x = i_\alpha x_\alpha$, etc., substitute these expressions in-
to R(x,y;u,v) and, by using linearity as de-
fined in 9.6, write the Riemann tensor as
20.5    $R(x,y;u,v) = R_{\alpha\beta;\gamma\delta} x_\alpha y_\beta u_\gamma v_\delta$
where
20.6    $R_{ab;cd} = R(i_a, i_b; i_c, i_d)$.
(We use here the first letters of the alphabet
as subscripts, instead of i, etc., as before,
in order to avoid confusion with the coordinate
vectors which we denote by i.) These are the
components of the Riemann tensor in the coordi-
nate system chosen. The relations 20.4 can be
written in components as
20.71    $R_{ab;cd} = -R_{ba;cd} = -R_{ab;dc}$
20.72    $R_{ab;cd} = R_{cd;ab}$
20.73    $R_{ab;cd} + R_{ac;db} + R_{ad;bc} = 0$.
Exercise. Prove that the number of independent
components of a Riemann tensor for four dimen-
sions is 20.
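One way to organize the count asked for in this exercise is sketched below in Python. It relies on a standard counting argument that is assumed here rather than proved: the antisymmetry in each pair leaves n(n-1)/2 distinct index pairs, the symmetry in the two pairs leaves a symmetric table on those pairs, and the cyclic relation is taken to impose one further independent condition for every choice of four distinct indices. The function name is hypothetical.

```python
from math import comb

def riemann_independent_components(n):
    """Count components left free by the pair antisymmetry, pair symmetry and cyclic relation."""
    pairs = comb(n, 2)                            # antisymmetric index pairs ab
    symmetric_in_pairs = pairs * (pairs + 1) // 2  # symmetric table on the pairs
    cyclic_conditions = comb(n, 4)                # assumed: one per four distinct indices
    return symmetric_in_pairs - cyclic_conditions

for n in (2, 3, 4):
    print(n, riemann_independent_components(n))
```

For n = 2 the count reduces to a single component, in agreement with the fact that the curvature of a surface is expressed by the one number K.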
The vectors of the flat spaces tangent to
the curved space may be considered as belonging
to the curved space, they may be characterized
in terms of the space itself, for instance, by giv-
ing direction and length; they are accessible,
as we may say, to beings who live in the space
and for whom points outside the space do not
exist. Normal vectors, the function s(x,y) etc.,
are, on the contrary, not accessible to the in-
habitants of the space. We shall confine our-
selves for the most part to the consideration of
these internal properties, properties accessible
to the inhabitants; but later in the course of
our investigation we shall have to use the
expression of the Riemann tensor in terms of the
coefficients $a_{ij}^{(k)}$ and we shall conclude this sec-
tion by deducing it.
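Substituting the expression 19.9 for s in-
to 20.1 and using 20.5 for the left hand side,
we find
20.8    $R_{\alpha\beta;\gamma\delta} x_\alpha y_\beta u_\gamma v_\delta = \begin{vmatrix} \sum_k a_{\rho\sigma}^{(k)} x_\rho u_\sigma I_k & \sum_k a_{\rho\sigma}^{(k)} x_\rho v_\sigma I_k \\ \sum_k a_{\rho\sigma}^{(k)} y_\rho u_\sigma I_k & \sum_k a_{\rho\sigma}^{(k)} y_\rho v_\sigma I_k \end{vmatrix}$;
this determinant may be presented as the sum of
$r^2$ determinants, of which, however, only r are
different from zero, namely those in which the same
I appears in the two columns, because in the ex-
pansions of the others all terms vanish as in-
volving products of different and therefore mu-
tually perpendicular I's. What remains is (com-
pare 19.43)
$\sum_{k=1}^{r} \begin{vmatrix} a_{\rho\sigma}^{(k)} x_\rho u_\sigma & a_{\rho\sigma}^{(k)} x_\rho v_\sigma \\ a_{\rho\sigma}^{(k)} y_\rho u_\sigma & a_{\rho\sigma}^{(k)} y_\rho v_\sigma \end{vmatrix}$
because the I's are unit vectors and $I_k \cdot I_k = 1$,
or
$\sum_{k=1}^{r} \begin{vmatrix} a_{\alpha\gamma}^{(k)} & a_{\alpha\delta}^{(k)} \\ a_{\beta\gamma}^{(k)} & a_{\beta\delta}^{(k)} \end{vmatrix} x_\alpha y_\beta u_\gamma v_\delta$.
Comparing this to the left hand side of 20.8 we
have the required expression
20.9    $R_{ab;cd} = \sum_{k=1}^{r} \begin{vmatrix} a_{ac}^{(k)} & a_{ad}^{(k)} \\ a_{bc}^{(k)} & a_{bd}^{(k)} \end{vmatrix}$.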
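The expression of the components of the Riemann tensor as a sum of two-rowed determinants of the coefficients lends itself to a direct numerical check of the symmetry relations. In the Python sketch below the components are built from r randomly chosen symmetric tables (n = 4, r = 3) and the antisymmetry, pair symmetry and cyclic relation are tested for all index combinations; a sphere-like case with a single normal direction and coefficients equal to 1/radius on the diagonal is added as a second sample. All names and sample data are assumptions of the example.

```python
import itertools
import random

def random_symmetric(n, rng):
    m = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            m[i][j] = m[j][i] = rng.uniform(-1, 1)
    return m

def riemann_component(forms, a, b, c, d):
    """Sum over k of the determinant with rows (a_ac, a_ad) and (a_bc, a_bd)."""
    return sum(f[a][c] * f[b][d] - f[a][d] * f[b][c] for f in forms)

rng = random.Random(1)
n, r = 4, 3
forms = [random_symmetric(n, rng) for _ in range(r)]

def R(a, b, c, d):
    return riemann_component(forms, a, b, c, d)

max_violation = 0.0
for a, b, c, d in itertools.product(range(n), repeat=4):
    max_violation = max(
        max_violation,
        abs(R(a, b, c, d) + R(b, a, c, d)),                 # antisymmetry in first pair
        abs(R(a, b, c, d) + R(a, b, d, c)),                 # antisymmetry in second pair
        abs(R(a, b, c, d) - R(c, d, a, b)),                 # symmetry in the two pairs
        abs(R(a, b, c, d) + R(a, c, d, b) + R(a, d, b, c)),  # cyclic relation
    )
print(max_violation)

# Sphere-like sample: one normal direction, coefficients 1/radius on the diagonal.
radius = 2.0
sphere = [[[1.0 / radius if i == j else 0.0 for j in range(3)] for i in range(3)]]
sectional = riemann_component(sphere, 0, 1, 0, 1)
print(sectional)
```

In the sphere-like sample the value with indices 1,2;1,2 is the total curvature of the section through the first two coordinate vectors, and it comes out as the reciprocal of the squared radius.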
21. Vectors in General Coordinates.
In the last section we learned how to as-
sociate with every point of a curved space a ten-
sor of rank four; for our physical interpretation
we need one of rank two; but we know how to ob-
tain from a tensor of rank four one of rank two;
we have to apply the operation of contraction.
The result we shall call the "contracted Riemann
tensor" and we shall expect to identify it with
the tensor T. The first question we have to ask
ourselves in this connection is whether the con-
tracted Riemann tensor satisfies the equation
11.4, viz., $\partial T_{i\alpha}/\partial x_\alpha = 0$. But before we do that
we have to go through quite a lengthy development
because at the present stage we do not know how
to differentiate tensors on a curved space. In
flat space we could consider the differential of
a vector, or, more exactly, of a vector field,
by (roughly) considering the difference of two
vectors of the field in two neighboring points.
In curved space, or on a curved surface two
vectors in two different points belong to two
different tangent planes and their difference
is not a vector of the surface at all. Or we
could in a flat space adopt a cartesian system
of coordinates and introduce as the components
of the differential the derivatives of the com-
ponents of the given tensor. This method also
is not applicable directly to curved space be-
cause there is no such thing here as cartesian
coordinates. Each method could be so modified
as to apply to curved space - the geometrical
method and the coordinate method. We shall de-
velop here the coordinate method because in ad-
dition to permitting us to introduce differen-
tiation - our immediate concern now - it is in-
dispensable in treating special cases.
As we said before, there is no such thing
as the cartesian system of coordinates in curved
space, because there are no straight lines; so
we shall have to use some other coordinates, let
us say, general coordinates; the main difficulty
in treating curved spaces is just this, viz.,
that rectilinear coordinates are not applicable
here, or we may say: part of the difficulty
lies in the fact that we have to use curvilinear
coordinates (the greater part) and part - in the
fact that the situation itself is so different
from that we encounter in flat space and with
which we are more or less familiar. Or to put
it in a still different form: the difficulty
is two-fold, we have a new material to work on
and we have to use new tools. To obviate the
difficulty we are going to divide it; we already
have studied curved spaces in the preceding sec-
tions; now we shall try to become familiar with
the new tool - the method of general coordinates,
applying it to the old material - ordinary three-
dimensional space; and then - beginning with
Section 25, we shall study curved space by means
of curvilinear coordinates.
The essential thing in the matter of coor-
dinates is that points receive names, the names
being composed of numbers, so that we can handle
numbers, which we can do by means of formulas,
instead of points themselves. There are many
different ways of establishing a correspondence
between points and triples of numbers; in the
one that bears the name of Descartes (Cartesius)
the three numbers which are assigned to a point
are its distances from three mutually perpend-
icular planes; there does not seem to be anything
that can take the place of this method in a gen-
eral curved space because there are no planes
and straight lines; still we may use coordinates;
a system of coordinates on a special curved sur-
face is known to everybody, even to those who
never studied analytic geometry; we mean the sys-
tem of specifying the position on the surface of
the earth by means of latitude and longitude.
Polar coordinates in the plane or in space fur-
nish another example of a coordinate system
which is not cartesian; in what follows we shall
use an entirely arbitrary system of coordinates;
we shall assume that a one-to-one correspondence
is established between the points of a certain
portion of space (which may be the whole space)
and the triples of a certain set of triples of
numbers. We shall call these numbers $u_1$, $u_2$, $u_3$
or $u_i$ and we shall keep the notation $x_i$ for
some definite system of cartesian coordinates.
To every triple $u_1$, $u_2$, $u_3$ corresponds a point
whose cartesian coordinates (in some definite
system) are $x_i$; these numbers $x_i$ are therefore
determined by the $u_i$'s; we have three functions
21.1    $x_1 = x_1(u_1,u_2,u_3), \quad x_2 = x_2(u_1,u_2,u_3), \quad x_3 = x_3(u_1,u_2,u_3)$
which are defined on a certain range of triples.
Conversely, if $x_1$, $x_2$, $x_3$ are the cartesian co-
ordinates of a point of the portion of our space
for which general coordinates have been intro-
duced, they determine three numbers $u_1$, $u_2$, $u_3$,
which therefore are functions of the x's
21.2    $u_1 = u_1(x_1,x_2,x_3), \quad u_2 = u_2(x_1,x_2,x_3), \quad u_3 = u_3(x_1,x_2,x_3)$,
which are defined for a certain range of triples
xi and are the inverse functions of the func-
tions 21.1.
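A concrete instance may make the pair of mutually inverse functions more tangible. The Python sketch below uses cylindrical coordinates as the general coordinates; this choice, the function names, and the restriction to points off the axis with the angle taken between -pi and pi are assumptions made for the example.

```python
import math

def cartesian_from_general(u):
    """Functions of the type 21.1 for cylindrical coordinates u = (rho, phi, z)."""
    rho, phi, z = u
    return (rho * math.cos(phi), rho * math.sin(phi), z)

def general_from_cartesian(x):
    """Inverse functions of the type 21.2, valid off the axis rho = 0."""
    x1, x2, x3 = x
    return (math.hypot(x1, x2), math.atan2(x2, x1), x3)

u = (2.0, 0.7, 1.5)
x = cartesian_from_general(u)
u_again = general_from_cartesian(x)
print(x, u_again)  # u_again should reproduce u up to rounding
```

The one-to-one correspondence holds only on a portion of space: on the axis the angle is undetermined, which is why the example excludes it.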
We have to handle vectors even more often
than we have to handle points, and we want to
have a numerical representation also for vectors.
Together with cartesian coordinates for points
goes a very simple numerical representation for
vectors; we represent a vector by three numbers
which are the differences between the corres-
ponding coordinates of its end-points, and are
called the components of the vector; of course,
in a different system of cartesian coordinates
the same vector will have other components, but
as long as we keep to a definite coordinate sys-
tem, vectors, as well as points have definite
names. The method of representing vectors by
their cartesian components has the great advan-
tage that two equal vectors have equal compo-
nents, that we can add vectors by adding their
components, and multiply a vector by a number
by multiplying the components by that number;
these advantages are peculiar to the cartesian
method and cannot be reproduced in other systems.
The theory of curved space is differential geom-
etry, we cannot handle immediately by its methods
such things as a configuration consisting of two
points at a finite distance; if we do we have
to introduce intermediate points, instead of sub-
traction we have here differentiation. There
are two ways in which vectors arise by differen-
tiation, and each gives rise to a system of no-
tation for vectors associated with a given coor-
dinate system for points - only for the rectan-
gular cartesian system do the two representa-
tions coincide. The two ways in which a vector
appears as a result of differentiation are - the
tangent vector of a curve and the gradient of a
field. In this and the next section we shall
take only the first of these two points of view.
Given a curve in cartesian coordinates in para-
metric form
21.3    $x = x(p), \quad y = y(p), \quad z = z(p)$,
the components of the tangent vector are obtain-
ed by differentiation
21.4
dx/dp, dy/dp, dz/dp.
This vector is determined not by the curve alone,
but by the particular parametric representation
we are using, but in this chapter we are not
going to change the parameter often and we shall
speak of a curve when we mean "curve in a given
parametric representation", and of the tangent
vector, when we mean "tangent vector resulting
from differentiation with respect to that par-
ticular parameter". In cartesian coordinates,
then, the components of the tangent vector are
obtained by differentiating the coordinates of
the points of the curve. This is certainly con-
venient, and we may ask ourselves whether we
could not reproduce this advantage in general
coordinates. Let us try; the parametric repre-
sentation of the curve in the u's can be ob-
tained by substituting $x_i(p)$ into 21.2; the $u_i$'s
become then functions of p, and this gives a
parametric representation of the same curve in
general coordinates; let us agree to represent
the vector which, when we used the cartesian
system had the components 21.4, by the three
numbers
21.5    $du_1/dp, \quad du_2/dp, \quad du_3/dp$.
We have then the required system of repre-
sentation; but it is not necessary, every time
when we want to represent a vector to introduce
a curve to which it is tangent; we shall show
how to find the components 21.5 when we are giv-
en the cartesian components 21.4 without actual-
ly considering the curve.
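We have, considering that $u_1$ depends on
$x_1$, $x_2$, $x_3$ which in turn depend on p,
$\frac{du_1}{dp} = \frac{\partial u_1}{\partial x_1}\frac{dx_1}{dp} + \frac{\partial u_1}{\partial x_2}\frac{dx_2}{dp} + \frac{\partial u_1}{\partial x_3}\frac{dx_3}{dp}$
or using the summation convention and applying
to $u_1$, $u_2$, $u_3$,
21.6    $du_i/dp = \partial u_i/\partial x_\alpha \cdot dx_\alpha/dp$,
so that, if we denote the quantities 21.4 by $v_i$
and the quantities 21.5 by $v^i$, we have
21.7    $v^i = \partial u_i/\partial x_\alpha \cdot v_\alpha$.
Introducing the abbreviation
21.8    $a_{ij} = \partial u_i/\partial x_j$
we may write the last formulas as
21.9    $v^i = a_{i\alpha} v_\alpha$.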
It will be explained later (Section 23) why we
use in the left hand side the index as a super-
script. These are transformation formulas for
vector components which are associated with the
transformation formulas 21.2 for the coordinates
of points; the formula just written out permits
us to find the general components when the carte-
sian components are given. In a similar way we
can find the inverse transformation formulas by
starting with a parametric representation of a
curve in general coordinates, substituting the
$u_i(p)$ into the formulas 21.1 and differentiating;
we arrive thus at
21.10    $v_i = b_{i\alpha} v^\alpha$
where
21.11    $b_{ij} = \partial x_i/\partial u_j$.
Before we go further we shall use the fact
that the formulas 21.9 and 21.10 are inverses
of each other to obtain some relations on the
a's and b's. Substituting 21.9 into 21.10 we
get
21.12    $v_i = b_{i\alpha} a_{\alpha\beta} v_\beta;$
the left hand side may be written as $\delta_{i\beta} v_\beta$ so
that
    $b_{i\alpha} a_{\alpha\beta} v_\beta = \delta_{i\beta} v_\beta$
and since this is an identity (v being arbitrary)
we have
21.13    $b_{i\alpha} a_{\alpha j} = \delta_{ij}.$
In the same way we may obtain
21.14    $a_{i\alpha} b_{\alpha j} = \delta_{ij}.$
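The relations 21.13 and 21.14 can be checked numerically. The sketch below assumes plane polar coordinates $u_1 = r$, $u_2 = \theta$ as the general system, a choice made here only for illustration; the function names are likewise hypothetical.

```python
import math

def a_matrix(x, y):
    """a[i][j] = du_i/dx_j for u_1 = r, u_2 = theta (formula 21.8)."""
    r = math.hypot(x, y)
    return [[x / r, y / r],
            [-y / r**2, x / r**2]]

def b_matrix(r, theta):
    """b[i][j] = dx_i/du_j for x = r cos(theta), y = r sin(theta) (formula 21.11)."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta), r * math.cos(theta)]]

def matmul(p, q):
    """Sum over the repeated inner index, as the summation convention prescribes."""
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

r, theta = 2.0, 0.7
x, y = r * math.cos(theta), r * math.sin(theta)
a, b = a_matrix(x, y), b_matrix(r, theta)
ba = matmul(b, a)   # 21.13: b_{i alpha} a_{alpha j}
ab = matmul(a, b)   # 21.14: a_{i alpha} b_{alpha j}
```

Both products should come out as the unit matrix up to rounding, whatever point is chosen away from the origin.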
We want to be able to operate with general
components of vectors, for instance, find a
scalar product of two vectors given in their gen-
eral components; it is easy to obtain a formula
answering this question by passing to cartesian
components, and then applying the formula for the
scalar product in cartesian components. Let the
general components of two vectors be $V^i$ and $W^i$;
according to the formula 21.10 their cartesian
components are
    $v_\gamma = b_{\gamma\alpha} V^\alpha$  and  $w_\gamma = b_{\gamma\beta} W^\beta,$
where we use another summation letter in the
second expression to avoid complications in
what follows. Now we write the scalar product
using the formula $v_\gamma w_\gamma$ and get
21.15    $b_{\gamma\alpha} b_{\gamma\beta} V^\alpha W^\beta.$
It follows that in order to be able to find
scalar products of vectors given by their
general components we have to know the quantities
21.16    $g_{ij} = b_{\gamma i} b_{\gamma j}.$
The quantities a's and b's help to pass from a
certain cartesian system of coordinates to the
general system; they express a relation between
the general system and that particular
cartesian system; and thus are not of fundamental
importance; the quantities $g_{ij}$, on the contrary,
although they have been obtained by means of
the a's and b's, are independent, as their
significance shows, from any particular system of
cartesian coordinates, they characterize the
system of coordinates we are using in itself
(and, as we shall see later, they characterize
it completely, so that the g's are all we need
to know in order to be able to handle vectors
given by their general components). The a's
as well as the b's may be considered either as
functions of the x's, or as functions of the
u's. The g's always will be considered as
functions of the u's.
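As an illustration of 21.15 and 21.16, the following sketch again assumes plane polar coordinates (an assumption of this example, with hypothetical function names). It builds the g's from the b's and compares the scalar product computed in general components with the one obtained by the detour through cartesian components.

```python
import math

def b_matrix(r, theta):
    """b[g][i] = dx_g/du_i for polar coordinates (formula 21.11)."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta), r * math.cos(theta)]]

def metric(b):
    """g_ij = b_{gamma i} b_{gamma j} (formula 21.16)."""
    n = len(b)
    return [[sum(b[g][i] * b[g][j] for g in range(n)) for j in range(n)]
            for i in range(n)]

def to_cartesian(b, V):
    """v_i = b_{i alpha} V^alpha (formula 21.10)."""
    return [sum(b[i][k] * V[k] for k in range(len(V))) for i in range(len(b))]

r, theta = 2.0, 0.7
b = b_matrix(r, theta)
g = metric(b)
V, W = [1.0, 0.5], [-0.3, 2.0]      # general (polar) components

# Scalar product through the g's (formula 21.15 with 21.16) ...
dot_general = sum(g[i][j] * V[i] * W[j] for i in range(2) for j in range(2))
# ... and through cartesian components.
v, w = to_cartesian(b, V), to_cartesian(b, W)
dot_cartesian = sum(p * q for p, q in zip(v, w))
```

For polar coordinates the g's reduce to $g_{11} = 1$, $g_{22} = r^2$, $g_{12} = 0$, which depend on the u's only, in agreement with the remark that the g's characterize the coordinate system in itself.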
Before we go any farther we note that, as
it immediately follows from the definition
21.16 the g's are symmetric in the indices:
21.16'    $g_{ij} = g_{ji}.$
In order to show the importance of the
g's let us deduce a formula for the length of
a curve given in general coordinates. Let the
curve be given by
21.31    $u_i(p).$
For a curve given in cartesian coordinates we
assume as known the formula (compare 12.1)
21.17    $s = \int \sqrt{(dx/dp)^2 + (dy/dp)^2 + (dz/dp)^2}\, dp$
where s is the arc length between two points.
This formula involves three inverse operations,
that of integration, that of taking the square
root, and that of division. It is not pleasant,
in general, to have to do with these operations,
and so we shall free our formula from them, and
write it as
21.17'    $ds^2 = dx^2 + dy^2 + dz^2.$
The formula as just written is not essentially
different from the one written above, and means
exactly the same thing. The sign d may be taken
to mean differentiation with respect to some
unspecified parameter, since the correctness of
the formula does not depend on what parameter we
are using (provided that the same parameter is
used on both sides). We translate 21.17' now
into general components. Differentiating 21.1
we have
    $dx = b_{1\alpha}\, du^\alpha,\quad dy = b_{2\alpha}\, du^\alpha,\quad dz = b_{3\alpha}\, du^\alpha;$
for $dx^2$ we may write $b_{1\alpha} du^\alpha \cdot b_{1\beta} du^\beta$; using
similar expressions for the other terms of 21.17' we
get
21.18    $ds^2 = g_{\alpha\beta}\, du^\alpha du^\beta,$
using the abbreviation 21.16 introduced before.
We see that the quantities $g_{ij}$ appear again.
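Formula 21.18 lends itself to a direct numerical check. The sketch below assumes polar coordinates, for which $ds^2 = dr^2 + r^2 d\theta^2$, and sums $\sqrt{g_{\alpha\beta}\, du^\alpha du^\beta}$ over small steps of the parameter; the helper names and the step count are illustrative choices, not taken from the text.

```python
import math

def g_polar(u):
    """Metric of plane polar coordinates u = (r, theta)."""
    r = u[0]
    return [[1.0, 0.0], [0.0, r * r]]

def arc_length(curve, p0, p1, steps=2000):
    """Sum ds = sqrt(g_ab du^a du^b) along curve(p), with g taken at step midpoints."""
    h = (p1 - p0) / steps
    total = 0.0
    for k in range(steps):
        ua, ub = curve(p0 + k * h), curve(p0 + (k + 1) * h)
        du = [ub[i] - ua[i] for i in range(2)]
        g = g_polar([(ua[i] + ub[i]) / 2 for i in range(2)])
        ds2 = sum(g[i][j] * du[i] * du[j] for i in range(2) for j in range(2))
        total += math.sqrt(ds2)
    return total

def circle(p):
    """Circle of radius 3 about the origin: r = 3, theta = p."""
    return (3.0, p)

length = arc_length(circle, 0.0, 2 * math.pi)
```

The result should agree with the elementary circumference $2\pi \cdot 3$; a radial segment ($\theta$ constant) gives simply the difference of the two radii.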
22. Tensors in General Coordinates.
We come now to the representation of ten-
sors. We know that a tensor is a function which
assigns numbers to vectors, and the question of
representation will be simply this: given the
general components of the vector arguments how
to find the corresponding value of the tensor.
We have already solved this problem for one par-
ticular tensor, namely for the tensor of the
scalar product which we expressed in the
preceding section by means of the g's, and we shall
use the same method in the general case. Given
the cartesian components of a tensor $f_{ij}$ and the
general components $V^i$ and $W^i$ of two vector
arguments, to find the corresponding value of the
tensor. We pass from general components to
cartesian components by formulas 21.10 and
substitute these expressions into the expression
$f_{\gamma\delta} v_\gamma w_\delta$ for the value of the tensor; the result
is
    $f_{\gamma\delta}\, b_{\gamma\alpha} b_{\delta\beta}\, V^\alpha W^\beta$
and this may be written as
22.1    $F_{\alpha\beta} V^\alpha W^\beta$
if we set
22.2    $F_{ij} = f_{\gamma\delta}\, b_{\gamma i} b_{\delta j};$
we call $F_{ij}$ the general components of the
tensor, whose cartesian components are $f_{ij}$, and we
see that the values of a tensor are expressed by
formula 22.1 in its general components and the
general components of the vector arguments in
the same way as in terms of cartesian components
of the tensor and the vector arguments.
Formula 21.15 for the scalar product of two vectors
may be considered as a special case of 22.1. The
cartesian components of the tensor $g_{ij}$ are, of
course, the $\delta_{ij}$, and substituting $\delta$ for $f$ in
22.2 we get the expressions 21.16 for $g_{ij}$. We
treated here as an example a tensor of rank two;
similar calculations can be performed for a
tensor of any rank; we give the results for rank
one and three, leaving it to the reader to go
through the calculations:
22.11    $F_i = f_\gamma b_{\gamma i},\qquad F_{ijk} = f_{\gamma\delta\epsilon}\, b_{\gamma i} b_{\delta j} b_{\epsilon k}.$
Now naturally the inverse problem presents
itself; given the general components of a
tensor to find its cartesian components. The
problem can be solved by substituting in the
expression for the value of a tensor (we again take a
tensor of rank two as an example) given in terms
of general components, $F_{\gamma\delta} V^\gamma W^\delta$, the expressions
21.9 for the general components of the vector
arguments in terms of the cartesian components:
$V^i = a_{i\alpha} v_\alpha$, $W^i = a_{i\beta} w_\beta$; the result is
    $F_{\gamma\delta}\, a_{\gamma\alpha} a_{\delta\beta}\, v_\alpha w_\beta;$
comparing this to the expression $f_{\alpha\beta} v_\alpha w_\beta$ for
the value of a tensor in cartesian components
we derive the desired transformation formula
for passing from general to cartesian
components; here are these formulas for the first
three ranks
22.3    $f_i = F_\alpha a_{\alpha i},\qquad f_{ij} = F_{\alpha\beta}\, a_{\alpha i} a_{\beta j},\qquad f_{ijk} = F_{\alpha\beta\gamma}\, a_{\alpha i} a_{\beta j} a_{\gamma k}.$
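The passage from cartesian to general components of a tensor of rank two (22.2) and back (22.3) may be tried out numerically. The sketch below assumes polar coordinates at one sample point and an arbitrary set of cartesian components; all names and numbers are illustrative.

```python
import math

def to_general(f, b):
    """F_ij = f_{gamma delta} b_{gamma i} b_{delta j} (formula 22.2)."""
    n = len(f)
    return [[sum(f[g][d] * b[g][i] * b[d][j] for g in range(n) for d in range(n))
             for j in range(n)] for i in range(n)]

def to_cartesian(F, a):
    """f_ij = F_{alpha beta} a_{alpha i} a_{beta j} (formula 22.3)."""
    n = len(F)
    return [[sum(F[p][q] * a[p][i] * a[q][j] for p in range(n) for q in range(n))
             for j in range(n)] for i in range(n)]

r, theta = 2.0, 0.7
c, s = math.cos(theta), math.sin(theta)
b = [[c, -r * s], [s, r * c]]          # b[i][j] = dx_i/du_j
a = [[c, s], [-s / r, c / r]]          # a[i][j] = du_i/dx_j

f = [[1.0, 2.0], [-0.5, 3.0]]          # arbitrary cartesian components
F = to_general(f, b)
f_back = to_cartesian(F, a)            # should reproduce f

delta = [[1.0, 0.0], [0.0, 1.0]]
g = to_general(delta, b)               # general components of the scalar-product tensor
```

Applying 22.2 to the $\delta_{ij}$ gives back the g's of 21.16, here $g_{11} = 1$, $g_{22} = r^2 = 4$.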
We know now how to write tensors in gener-
al components, and we want to find out how to
perform operations on them. Of course, we could
always pass from general components to carte-
sian components, perform the required operations
and then, if the result is a tensor, pass back
to general components; but instead of following
this program in every special case as it pre-
sents itself we shall do it once for all and
derive general formulas whose application in
special cases is much more convenient than ad
hoc calculations.
We begin with the operation of contraction.
Given again a tensor of rank two by its general
components $F_{ij}$ we pass to its cartesian
components by formula 22.3 and now we contract by
taking the sum of components with equal indices
according to the original definition,
22.4    $f_{\gamma\gamma} = F_{\alpha\beta}\, a_{\alpha\gamma} a_{\beta\gamma} = F_{\alpha\beta}\, g^{\alpha\beta},$
where we use the abbreviation
22.5    $a_{i\gamma} a_{j\gamma} = g^{ij}.$
For tensors of higher ranks (contraction is
possible only for tensors whose rank is at least 2)
entirely analogous formulas may be obtained easily;
indices which are not affected by contraction
may be simply disregarded, as it follows from
similar calculations which are left to the
student. For example, the result of contracting
with respect to the second and fourth indices of
a tensor of the fourth rank $F_{ijkl}$ will be
22.41    $F_{i\alpha k\beta}\, g^{\alpha\beta}.$
The quantities $g^{ij}$ introduced a moment ago
play quite an important role comparable to that
of the $g_{ij}$ with lower indices, and they are
connected with them by the formula
22.6    $g^{i\alpha} g_{\alpha j} = \delta^i_j.$
To prove it, it suffices to substitute the
expressions 21.16 and 22.5 for the two kinds of g's
and to apply formula 21.14 twice; we may also
notice that, as it follows from the definition,
22.7    $g^{ij} = g^{ji},$
so that formula 22.6 may also be written as
22.61    $g^{\alpha i} g_{\alpha j} = \delta^i_j.$
Next comes the operation of differentiation.
The result of differentiating a tensor is always
a tensor of rank higher by one than the given
tensor; its components will have one more in-
dex than those of the given tensor; we shall
denote them by simply adding a new index preced-
ed by a comma, to the symbol of the given tensor.
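Because the situation is slightly more
complicated, let us start in translating
differentiation into general components with the
simplest case of a tensor of rank one given in
general components, $F_i$. We follow the same program:
as a first step we pass to cartesian components
by formula 22.3 and get $f_i = F_\alpha a_{\alpha i}$; we next
find the cartesian components of the differential
by simply differentiating with respect to
cartesian coordinates with the result
    $f_{i,j} = \partial F_\alpha/\partial x_j \cdot a_{\alpha i} + F_\alpha \cdot \partial a_{\alpha i}/\partial x_j.$
As a third step we pass back to general
components using the formula 22.2 and arriving at the
result
    $F_{i,j} = (\partial F_\alpha/\partial x_\delta \cdot a_{\alpha\gamma} + F_\alpha \cdot \partial a_{\alpha\gamma}/\partial x_\delta)\, b_{\gamma i} b_{\delta j};$
but $b_{\delta j}$ according to formula 21.11 is $\partial x_\delta/\partial u_j$
so that this expression reduces to
    $\partial F_\alpha/\partial u_j \cdot a_{\alpha\gamma} b_{\gamma i} + F_\alpha \cdot \partial a_{\alpha\gamma}/\partial u_j \cdot b_{\gamma i};$
using relation 21.14 the first term reduces to
$\partial F_i/\partial u_j$, just what we would expect from analogy
with cartesian coordinates as an expression for the
result of differentiation; however, this is not
the whole answer because there is a second term
so that the final result is
22.8    $F_{i,j} = \partial F_i/\partial u_j - \Gamma^\alpha_{ij} F_\alpha,$
where we set
22.81    $\Gamma^k_{ij} = -\partial a_{k\gamma}/\partial u_j \cdot b_{\gamma i};$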
the second term may be considered as being in
the nature of a correction to the expected
result; we call it the correction term, and we
call $\Gamma^k_{ij}$ the correction coefficients. We see
then that in general coordinates the components
of the differential of a tensor of rank one
consist of two parts: the first expresses the
change (or rate of change) of components of the
tensor, the second is due to the change of the
coordinate system from point to point. In the
case of the cartesian system the coordinate
system is, so to say, the same in all points, the
second term is zero (the a's reduce in this
case to constants, and their derivatives vanish);
another extreme example is furnished by a tensor
whose components are constants in some
non-cartesian system of coordinates (for instance,
polar coordinates); the derivatives of the
components with respect to the coordinates are zero
but the components of the differential are not;
their values are given by the correction terms
alone.
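The extreme example just mentioned can be made concrete. The sketch below assumes polar coordinates and a tensor of rank one with the constant general components $F_1 = 1$, $F_2 = 0$; the correction coefficients are computed from 22.81, with the derivatives of the a's estimated by central differences. Names and step size are illustrative.

```python
import math

def a_of_u(r, theta):
    """a[k][g] = du_k/dx_g, written as a function of the polar coordinates."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s / r, c / r]]

def b_of_u(r, theta):
    """b[g][i] = dx_g/du_i."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -r * s], [s, r * c]]

def gamma(u, h=1e-5):
    """G[k][i][j] = -(d a_{k gamma}/d u_j) b_{gamma i} (formula 22.81)."""
    b = b_of_u(*u)
    G = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    for j in range(2):
        up, um = list(u), list(u)
        up[j] += h
        um[j] -= h
        ap, am = a_of_u(*up), a_of_u(*um)
        for k in range(2):
            for i in range(2):
                G[k][i][j] = -sum((ap[k][g] - am[k][g]) / (2 * h) * b[g][i]
                                  for g in range(2))
    return G

u = (2.0, 0.7)
G = gamma(u)
F = [1.0, 0.0]   # constant general components: F_1 = 1, F_2 = 0
# Formula 22.8 with vanishing partial derivatives: F_{i,j} = -Gamma^alpha_{ij} F_alpha
dF = [[-sum(G[al][i][j] * F[al] for al in range(2)) for j in range(2)]
      for i in range(2)]
```

Although every partial derivative of the components vanishes, the component $F_{2,2}$ of the differential comes out equal to $r$ (here 2), entirely from the correction term.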
For tensors of rank higher than the first
the calculations are slightly more complicated,
but the principle is the same; we write out the
results for tensors of rank two and three
22.82    $F_{ij,k} = \partial F_{ij}/\partial u_k - \Gamma^\alpha_{ik} F_{\alpha j} - \Gamma^\alpha_{jk} F_{i\alpha},$
22.83    $F_{ijk,l} = \partial F_{ijk}/\partial u_l - \Gamma^\alpha_{il} F_{\alpha jk} - \Gamma^\alpha_{jl} F_{i\alpha k} - \Gamma^\alpha_{kl} F_{ij\alpha};$
the general rule ought to be clear now; there
are as many correction terms as there are
indices; each correction term corresponds to one
index, the other indices being disregarded in
its formation.
In order to be able to perform the opera-
tion of contraction (and the operation of scal-
ar multiplication is a special case of it) in
general coordinates we have to know, as we saw,
the values of the g's; in order to be able to
perform the operation of differentiation we
have to know the value of the $\Gamma$'s (the
correction coefficients); if we know those we can
perform all the necessary operations in general
coordinates without going back to cartesian
coordinates. We shall show now that the values of the
$\Gamma$'s can be derived from those of the g's.
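The correction coefficients were given
originally by the formula 22.81; we can give to
this another form by using the relation 21.14;
writing it as $a_{k\gamma} b_{\gamma i} = \delta_{ki}$ and differentiating
it with respect to $u_j$ we get, since the $\delta$'s are
constants,
    $\partial a_{k\gamma}/\partial u_j \cdot b_{\gamma i} + a_{k\gamma} \cdot \partial b_{\gamma i}/\partial u_j = 0$
so that we have
    $\Gamma^k_{ij} = a_{k\gamma} \cdot \partial b_{\gamma i}/\partial u_j$
or, recalling the definition 21.11 of the b's,
22.84    $\Gamma^k_{ij} = a_{k\gamma} \cdot \partial^2 x_\gamma/\partial u_i \partial u_j;$
from this expression it follows that $\Gamma$ is not
affected by interchanging the two lower indices,
or
22.71    $\Gamma^k_{ij} = \Gamma^k_{ji}.$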
We may now show how the $\Gamma$'s can be derived
from the g's. We shall do that by using the
following artifice. Consider the tensor of
scalar multiplication, whose cartesian components
are the $\delta_{ij}$ and whose general components were
shown to be $g_{ij}$; the components of the
differential of this tensor in cartesian components are
the derivatives of the $\delta$'s and therefore zero;
the second formula 22.11 shows that the general
components of this tensor of the third rank also
must vanish, so that
22.85    $g_{ij,k} = 0$
(we did not promise that general components of
tensors will always be given by capital letters,
but since heretofore we have been using capital
letters for them it may be well to emphasize
that the g's are intended to represent (follow-
ing the generally accepted custom) general com-
ponents of the scalar multiplication tensor) .
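On the other hand, we can calculate the
components $g_{ij,k}$ by the application of formula
22.82 and so we get
    $\partial g_{ij}/\partial u_k - \Gamma^\alpha_{ik} g_{\alpha j} - \Gamma^\alpha_{jk} g_{i\alpha} = 0;$
this is a system of equations connecting the $\Gamma$'s
with the g's and their derivatives; we want to
solve them for the $\Gamma$'s. For that purpose we
write out the above relation in two more forms
resulting from it by cyclic interchanges of
indices:
    $\partial g_{jk}/\partial u_i - \Gamma^\alpha_{ji} g_{\alpha k} - \Gamma^\alpha_{ki} g_{j\alpha} = 0,$
    $\partial g_{ki}/\partial u_j - \Gamma^\alpha_{kj} g_{\alpha i} - \Gamma^\alpha_{ij} g_{k\alpha} = 0;$
subtracting the last two relations from the first
we notice that as the result of symmetry of the
g's and the $\Gamma$'s in the lower indices (formulas
21.16' and 22.71) four of the terms containing
the $\Gamma$'s cancel and the remaining two are
identical; we thus have
22.9    $\tfrac12 (\partial g_{jk}/\partial u_i + \partial g_{ki}/\partial u_j - \partial g_{ij}/\partial u_k) = \Gamma^\alpha_{ij} g_{\alpha k}.$
We multiply now both sides by $g^{kl}$ and sum with
respect to $k$, writing for it a Greek index, e.g.,
$\beta$. Taking into account 22.61 we have
22.91    $\Gamma^l_{ij} = \tfrac12 g^{\beta l} (\partial g_{j\beta}/\partial u_i + \partial g_{\beta i}/\partial u_j - \partial g_{ij}/\partial u_\beta).$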
This shows how, given the g's, to calculate the
$\Gamma$'s. We see thus that if only we are given
the g's as functions of the u's we can perform
all the required operations on tensors. Very
often the calculation of the $\Gamma$'s is divided
into two parts; first the left hand sides of
22.9 are calculated and listed; they are
denoted by $\Gamma_{k,ij}$; and then the $\Gamma^k_{ij}$ are calculated
using the formulas in the form
22.92    $\Gamma^k_{ij} = g^{k\alpha}\, \Gamma_{\alpha,ij}.$
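The two-step calculation just described can be sketched as follows, assuming again the polar metric $g_{11} = 1$, $g_{22} = r^2$ as a test case. The derivatives of the g's are estimated by central differences; function names and step size are assumptions of this example.

```python
def g_polar(u):
    """Metric of plane polar coordinates u = (r, theta)."""
    return [[1.0, 0.0], [0.0, u[0] ** 2]]

def g_upper(u):
    """The g's with upper indices for the diagonal metric above (formula 22.6)."""
    return [[1.0, 0.0], [0.0, 1.0 / u[0] ** 2]]

def dg(u, k, h=1e-5):
    """Central-difference estimate of d g_ij / d u_k, returned as a matrix."""
    up, um = list(u), list(u)
    up[k] += h
    um[k] -= h
    gp, gm = g_polar(up), g_polar(um)
    return [[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(2)] for i in range(2)]

def gamma_lower(u):
    """GL[k][i][j] = (d_i g_jk + d_j g_ki - d_k g_ij) / 2, the left side of 22.9."""
    d = [dg(u, k) for k in range(2)]          # d[k][i][j] = d g_ij / d u_k
    return [[[0.5 * (d[i][j][k] + d[j][k][i] - d[k][i][j])
              for j in range(2)] for i in range(2)] for k in range(2)]

def gamma_upper(u):
    """G[k][i][j] = g^{k alpha} Gamma_{alpha,ij} (formula 22.92)."""
    GL, gu = gamma_lower(u), g_upper(u)
    return [[[sum(gu[k][a] * GL[a][i][j] for a in range(2))
              for j in range(2)] for i in range(2)] for k in range(2)]

G = gamma_upper((2.0, 0.7))
```

For polar coordinates the only non-vanishing coefficients are $\Gamma^1_{22} = -r$ and $\Gamma^2_{12} = \Gamma^2_{21} = 1/r$, which is what the sketch should reproduce at $r = 2$.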
23. Covariant and Contravariant Components.
We know (Section 9) that a vector is a
tensor of rank one, or, more precisely, that to
every vector $v$ there corresponds a tensor of
rank one $v_i$ which has the same cartesian
components. Now we have introduced general
components for vectors and also for tensors; if we
have cartesian components of a vector $v_i$ to
them correspond (21.9) the general components
    $V^i = a_{i\alpha} v_\alpha;$
also if we consider the $v_i$ as the components of
a tensor to them correspond (22.11) the general
components
    $V_i = v_\gamma b_{\gamma i};$
to the same cartesian components $v_i$ there
correspond thus two different sets of general
components depending on whether we consider the $v_i$
as vector or as tensor components; it was in
anticipation of this situation that we have been
using the index for general vector components
as a superscript. Essentially, a vector and a
tensor of rank one are one and the same thing;
and so we have two different systems of
components for every vector (in a given general
coordinate system); the components with subscripts
are called covariant components, those with the
superscript, contravariant components. It is
clear from what precedes, but it may be worth-
while to repeat that we have here two different
representations of one and the same thing.
It was mentioned in Section 21 that there
are two ways in which a vector results from dif-
ferentiation; one, a vector considered as a tan-
gent vector to a curve, was discussed before,
and is the basis of what we have been doing all
this time; it is interesting to consider now
briefly the other. If we start with a scalar
field $f = f(x,y,z)$ we may derive a vector field
by differentiation, and the cartesian components
of this vector field will be
23.1    $\partial f/\partial x,\quad \partial f/\partial y,\quad \partial f/\partial z;$
this vector is known as the gradient of the field
$f$; now, we may give the same scalar field in
general coordinates
    $f = f[x_1(u_1,u_2,u_3),\ x_2(u_1,u_2,u_3),\ x_3(u_1,u_2,u_3)];$
if we differentiate $f$ with respect to $u_i$, will
we obtain general components of the gradient
vector field? The question is easily answered
by computing these components; we have
23.2    $\partial f/\partial u_i = \partial f/\partial x_\gamma \cdot \partial x_\gamma/\partial u_i = \partial f/\partial x_\gamma \cdot b_{\gamma i};$
comparing this with 22.11 we see that the
partial derivatives $\partial f/\partial u_i$ are the components with
subscripts - the covariant components in
general coordinates of the gradient. The two
representations, the covariant, and the
contravariant, may be thus considered as corresponding to
two ways in which a vector can be arrived at by
differentiation; if we consider a vector as a
gradient we arrive naturally at its
representation by covariant components, if we consider it
as a tangent vector we arrive at its
representation by contravariant components. (The name
covariant, by the way, is intended to indicate
that these components change in the same way as,
or have similar formulas of transformation with,
partial derivatives.) In the case of cartesian
coordinates, of course, covariant and contravar-
iant components of a given vector coincide: in
this case it is not necessary to make any dis-
tinction.
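Formula 23.2 can be verified on a sample field. The sketch below assumes polar coordinates and the arbitrary scalar field $f = x^2 + 3y$ (both chosen only for illustration); it compares the derivatives of $f$ with respect to the u's, estimated by central differences, with the cartesian gradient transformed by the b's.

```python
import math

def f_cart(x, y):
    """An arbitrary scalar field given in cartesian coordinates."""
    return x * x + 3.0 * y

def f_polar(r, theta):
    """The same field expressed through the general coordinates."""
    return f_cart(r * math.cos(theta), r * math.sin(theta))

r, theta = 2.0, 0.7
c, s = math.cos(theta), math.sin(theta)
x, y = r * c, r * s
b = [[c, -r * s], [s, r * c]]          # b[g][i] = dx_g/du_i

grad_cart = [2.0 * x, 3.0]             # df/dx, df/dy (formula 23.1)
# Right-hand side of 23.2: (df/dx_gamma) b_{gamma i}
covariant = [sum(grad_cart[g] * b[g][i] for g in range(2)) for i in range(2)]

# Left-hand side of 23.2: df/du_i by central differences
h = 1e-6
direct = [(f_polar(r + h, theta) - f_polar(r - h, theta)) / (2 * h),
          (f_polar(r, theta + h) - f_polar(r, theta - h)) / (2 * h)]
```

The two lists should agree to within the accuracy of the difference quotient, showing that plain differentiation with respect to the u's yields the covariant components of the gradient.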
We shall have to use covariant as well as
contravariant components, and it is important
to be able to pass from one to the other repre-
sentation; the necessary formulas can be found,
of course, by passing through a cartesian
representation. Let covariant components $F_i$ of a
tensor of rank one (or vector) be given;
formulas 22.3 show us that the corresponding
cartesian components are $f_i = F_\alpha a_{\alpha i}$; in terms of these
the contravariant components are obtained by
formula 21.9 which gives here
23.3    $F^i = a_{i\beta} F_\alpha a_{\alpha\beta} = g^{i\alpha} F_\alpha$
if abbreviation 22.5 is used. In the same way
it is easy to prove the following formula, which
permits us to calculate covariant components when
the contravariant components are given
23.31    $F_i = g_{i\alpha} F^\alpha.$
It may seem that there is a wasteful
redundancy in this double system of notations, that
one representation is enough, and that to have
two, means to indulge in luxury; as a matter of
fact this double notation is a defect from a
didactical point of view: it makes it more
difficult to learn the new language; but once
mastered it makes the calculations much simpler
and the formulas much shorter and more elegant,
if properly used; as an example, we want to
give the formula for the scalar product of two
vectors, one of which is given in covariant,
and the other in contravariant components. This
formula can be obtained by the usual procedure,
i.e., passing through cartesian components, but
we have already reached a stage where we can
dispense to a great extent with the use of
cartesian coordinates. The required formula is
simply
23.4    $V^\alpha W_\alpha$
and it can be proved by simply substituting for
$W_\alpha$ its expression according to formula 23.31,
viz., $g_{\alpha\beta} W^\beta$ and comparing the result to 21.15;
of course, the scalar product could also be
written as $V_\alpha W^\alpha$, and also $g^{\alpha\beta} V_\alpha W_\beta$, as it is
easy to verify.
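The equivalence of the several forms of the scalar product is easy to try out with numbers. The sketch below assumes the polar metric at $r = 2$ and two arbitrary vectors given by contravariant components; the helper names are hypothetical.

```python
def lower(g, V):
    """V_i = g_{i alpha} V^alpha (formula 23.31)."""
    return [sum(g[i][a] * V[a] for a in range(len(V))) for i in range(len(V))]

def raise_index(g_up, V):
    """V^i = g^{i alpha} V_alpha (formula 23.3)."""
    return [sum(g_up[i][a] * V[a] for a in range(len(V))) for i in range(len(V))]

r = 2.0
g = [[1.0, 0.0], [0.0, r * r]]             # g's with lower indices
g_up = [[1.0, 0.0], [0.0, 1.0 / (r * r)]]  # g's with upper indices

V_up, W_up = [1.0, 0.5], [-0.3, 2.0]       # contravariant components
V_dn, W_dn = lower(g, V_up), lower(g, W_up)

p1 = sum(V_up[a] * W_dn[a] for a in range(2))                                  # 23.4
p2 = sum(V_dn[a] * W_up[a] for a in range(2))
p3 = sum(g[a][b] * V_up[a] * W_up[b] for a in range(2) for b in range(2))      # 21.15
p4 = sum(g_up[a][b] * V_dn[a] * W_dn[b] for a in range(2) for b in range(2))
V_again = raise_index(g_up, V_dn)          # raising undoes lowering
```

All four expressions give the same number, and raising the lowered components returns the original contravariant ones.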
In Section 22 we derived a system of repre-
sentation for tensors starting with the contra-
variant representation of vectors; we could do
the same thing starting with the covariant rep-
resentation of vectors, and we shall do it so
as to have a perfectly symmetrical system of
notations.
Suppose we are given covariant components
of two vectors $V_i$, $W_i$ and the components $F_{ij}$ of
a tensor, and we want to find the value of the
tensor corresponding to the given vectors as
arguments; we know how to solve the problem if
the vectors are given by their contravariant
components; therefore, let us calculate first
the contravariant components, viz., $V^i = g^{i\alpha} V_\alpha$,
$W^i = g^{i\beta} W_\beta$, and then substitute them into the
left hand side of expression 22.1 giving the
value of the tensor. The result is
23.4    $F_{\gamma\delta}\, g^{\gamma\alpha} g^{\delta\beta}\, V_\alpha W_\beta$
which may be written as
23.41    $F^{\alpha\beta} V_\alpha W_\beta$
if we introduce the notation
23.42    $F^{ij} = F_{\gamma\delta}\, g^{\gamma i} g^{\delta j}.$
We call $F^{ij}$ the contravariant components of the
tensor $F_{ij}$ and the components with lower indices
(subscripts) which we have been using for
tensors heretofore we call covariant components.
We have thus two representations not only for
tensors of the first rank (vectors) but also for
tensors of all ranks. In one case we have been
using already a symbol with two superscripts,
viz., the $g^{ij}$ (introduced by 22.5); we shall
show now that this notation is in agreement
with the general notation we are introducing now
by proving that these g's with upper indices are
the contravariant components of the tensor of
scalar multiplication. In order to prove that,
we notice that, according to formula 23.42 the
contravariant components of a tensor of
covariant components $g_{ij}$ are
    $g_{\alpha\beta}\, g^{\alpha i} g^{\beta j};$
but according to 22.61 $g_{\alpha\beta} g^{\beta j} = \delta^j_\alpha$ so that we
get $\delta^j_\alpha g^{\alpha i} = g^{ij}$ and the assertion is proved.
Next we want to learn how to differentiate
a tensor given in contravariant components, but
before we do that it seems necessary to
introduce what we call mixed components. Suppose we
are given one vector argument of a tensor of
rank two in contravariant, and the other in
covariant components, and we want to find the
value of the tensor; if the components of the two
given vectors are $V^i$ and $W_i$, and the cartesian
components of the tensor are $f_{ij}$, we pass to
cartesian components of the vector arguments
23.5    $v_i = b_{i\gamma} V^\gamma,\qquad w_j = W_\delta a_{\delta j},$
and express the value of the tensor as
23.51    $f_{\alpha\beta} v_\alpha w_\beta = f_{\alpha\beta}\, b_{\alpha\gamma} a_{\delta\beta}\, V^\gamma W_\delta = F_\gamma{}^\delta V^\gamma W_\delta,$
where the notation
23.52    $F_i{}^j = f_{\alpha\beta}\, b_{\alpha i} a_{j\beta}$
is introduced. The numbers $F_i{}^j$ with one lower
and one upper index are called mixed components
of the tensor of rank two whose cartesian
components are $f_{ij}$. In this same way we may
consider mixed components for a tensor of any rank
with as many of the indices up as we may wish,
and the others down.
We can pass from one kind of component to
any other directly, without going through carte-
sian components. The transition from components
in which a certain index (for example, the third)
is used as a superscript to components in which
the same index is used as a subscript is call-
ed the lowering of that index. This change does
not affect the geometrical meaning of the ten-
sor, it merely corresponds to a transition from
an expression of the tensor in which the
corresponding vector argument (in our example, the
third) was given by its covariant components
to an expression of the same tensor using contra-
variant components for that vector argument. The
formula for the lowering of an index is easily
found to be independent from all other indices,
so that, disregarding them, we always may use
23.31. Formula 23.3 may be considered as a gen-
eral formula for raising an index. Lowering and
raising of indices is sometimes referred to as
juggling with indices.
Again it may seem that the introduction of
mixed components is superfluous, but there are
advantages in using mixed components.
One advantage appears in connection with
contraction. The result of contraction is
given in terms of covariant components by formula
22.4 (or 22.41). But, according to 23.3 we may
write $F^i{}_j$ for $F_{\alpha j}\, g^{i\alpha}$ so that 22.4 may be
written as
23.43    $F^\alpha{}_\alpha$
and for 22.41 we may write $F_i{}^\alpha{}_{k\alpha}$. We see thus
that if a tensor is given by its mixed
components and the two indices with respect to which
we contract appear on different levels (one as
a subscript, the other as a superscript)
contraction is performed (as in cartesian
coordinates) by simply replacing each of the two
indices by the same Greek letter.
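The agreement between contraction in covariant components (22.4) and in mixed components can be illustrated numerically. The sketch below assumes the polar g's with upper indices at $r = 2$ and an arbitrary set of covariant components; the names are hypothetical.

```python
def contract_covariant(F, g_up):
    """F_{alpha beta} g^{alpha beta} (formula 22.4)."""
    n = len(F)
    return sum(F[a][b] * g_up[a][b] for a in range(n) for b in range(n))

def raise_first(F, g_up):
    """F^i_j = g^{i alpha} F_{alpha j} (formula 23.3 applied to the first index)."""
    n = len(F)
    return [[sum(g_up[i][a] * F[a][j] for a in range(n)) for j in range(n)]
            for i in range(n)]

r = 2.0
g_up = [[1.0, 0.0], [0.0, 1.0 / (r * r)]]   # polar coordinates at r = 2
F = [[1.0, 2.0], [-0.5, 3.0]]               # arbitrary covariant components

mixed = raise_first(F, g_up)
trace_mixed = sum(mixed[a][a] for a in range(2))   # F^alpha_alpha
trace_cov = contract_covariant(F, g_up)
```

Once one index has been raised, contraction is the plain sum over equal indices, and it reproduces the value given by 22.4.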
Another case where there is great advan-
tage in using mixed components is that of dif-
ferentiation of a contravariant tensor (as we
say sometimes for: "tensor given by its con-
travariant components"; a tensor in itself is,
of course, neither contravariant nor covariant
- covariance and contravariance are only prop-
erties, or types of representation of tensors);
the components of the differential will have
one more index, and this index as one derived
by differentiation will naturally be a subscript,
whereas the old indices are superscripts; this
does not mean that we cannot pull the new index
up, or the old ones down, but the expressions
resulting from that would be more complicated.
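Suppose the given contravariant tensor is
of rank one (a vector) $V^i$; we pass to cartesian
components:
    $v_i = b_{i\gamma} V^\gamma;$
we differentiate this:
    $v_{i,j} = \partial V^\gamma/\partial x_j \cdot b_{i\gamma} + V^\gamma \cdot \partial b_{i\gamma}/\partial x_j;$
and we pass to mixed components by formula
23.52:
    $V^i{}_{,j} = (\partial V^\gamma/\partial x_\beta \cdot b_{\alpha\gamma} + V^\gamma \cdot \partial b_{\alpha\gamma}/\partial x_\beta)\, a_{i\alpha} b_{\beta j},$
and this, using 21.14 and 22.71 reduces to
23.6    $V^i{}_{,j} = \partial V^i/\partial u_j + \Gamma^i_{\alpha j} V^\alpha.$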
The reader should be able, following the
examples given, to deduce formulas for differ-
entiation of a tensor given in any form. We
just mention, because we will have occasion to
use it later, the formula for differentiation
of a mixed tensor
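23.7    $F^i{}_{j,k} = \partial F^i{}_j/\partial u_k + \Gamma^i_{\alpha k} F^\alpha{}_j - \Gamma^\alpha_{jk} F^i{}_\alpha.$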
We are in possession now of all the for-
mal rules of operations on tensors in general
coordinates. Although these rules were deduced
by means of cartesian coordinates these coordi-
nates and components together with all formulas
involving the a's and the b's form only a kind
of scaffolding that can be removed after the
building has been completed. All we have to
know in order to operate on tensors are the g's.
Using the g's we can lower and raise indices and
contract and, as a special case, find the scalar
product of two vectors; also find the angle
between two vectors (using formula 7.5) and the
length of a curve (using 21.18). Given the g's
we can calculate the $\Gamma$'s (end of Section 22) and
with the aid of the $\Gamma$'s we can differentiate
tensors (formulas 22.8). We see thus that the
g's play a fundamental part in all operations -
the tensor of which they are components is of-
ten called the fundamental tensor.
Before we conclude we might state explicit-
ly that all the formulas we have obtained are
entirely independent of the number of dimen-
sions.
24. Physical Coordinates as General Coordinates.
The principal purpose for the introduction
of general coordinates was to make possible the
treatment of tensors in curved space but it hap-
pens that general coordinates may be used with
great advantage also in Special Relativity The-
ory, namely, in connection with the situation
arising from the "minus sign". We remedied this
situation in Section 4 by introducing imaginary
coordinates and tensor components; we know how,
using these imaginary quantities to write our
formulas in a nice symmetrical form. The system
of notations for general coordinates that we
have introduced permits us now to reintroduce
real quantities, and still to preserve symmetry
in the formulas. We shall express our four
mathematical coordinates $x_1, x_2, x_3, x_4$, of which
the fourth is imaginary in terms of four real
coordinates which we may denote by $u_i$; we may
choose as these four real numbers the physical
coordinates x,y,z,t and consider the formulas
24.1    $x_1 = x,\quad x_2 = y,\quad x_3 = z,\quad x_4 = it$
as the transformation formulas, corresponding to
21.1; and
24.11    $x = x_1,\quad y = x_2,\quad z = x_3,\quad t = -i x_4$
as the inverse formulas corresponding to 21.2.
The $a_{ij}$ and the $b_{ij}$ with different
subscripts are easily seen to be zero, and we have
(compare 21.8 and 21.11)
24.2    $a_{11} = a_{22} = a_{33} = 1,\quad a_{44} = i;$
        $b_{11} = b_{22} = b_{33} = 1,\quad b_{44} = -i;$
from these we obtain using 21.16
24.21    $g_{11} = g_{22} = g_{33} = -g_{44} = 1,$ all others zero
and the same values we obtain for the g's with
upper indices, using 22.5.
The x,y,z,t may be considered as the
contravariant components of the radius vector
leading from the origin to the point P; the
covariant components of the same vector are seen,
applying the formula 23.31, to be x,y,z,-t.
The formula for the square of the distance
from the origin may be obtained either from
21.15 or from 23.4; it is (compare 10.1)
24.3    $x^2 + y^2 + z^2 - t^2.$
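With the g's of 24.21 the lowering of indices and the formula 24.3 reduce to a few lines of arithmetic. The sketch below uses an arbitrary sample point; its coordinates and the function name are assumptions of this example.

```python
g = [[1, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, -1]]        # formula 24.21

def lower(g, V):
    """V_i = g_{i alpha} V^alpha (formula 23.31)."""
    return [sum(g[i][a] * V[a] for a in range(4)) for i in range(4)]

X_up = [3, 4, 12, 5]       # contravariant components x, y, z, t of a sample point
X_dn = lower(g, X_up)      # covariant components x, y, z, -t

# Square of the distance from the origin, formula 23.4: X^alpha X_alpha
interval = sum(X_up[a] * X_dn[a] for a in range(4))
```

No imaginary quantity enters; the minus sign of 24.3 arises solely from lowering the index 4.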
We come next to Maxwell's equations where
the "minus sign trouble" originated. To
conform with the notations of this chapter we
should use for the cartesian components - the
mathematical components of preceding chapters -
small letters, so that formulas 4.72 will have
to be written
    $f_{41} = iX,\quad f_{42} = iY,\quad f_{43} = iZ,\quad f_{23} = L,\quad f_{31} = M,\quad f_{12} = N.$
Using formulas 22.2 and 24.2 we obtain the
covariant components in physical coordinates - and
we use here capital letters - as follows:
24.4    $F_{41} = X,\quad F_{42} = Y,\quad F_{43} = Z,\quad F_{23} = L,\quad F_{31} = M,\quad F_{12} = N.$
Mixed and contravariant components may be
obtained by raising indices - formula 23.3. Due
to the simple character of the g's given by
24.21 it is easy to see that raising one of the
indices 1,2,3 does not change the numerical
value of a component, so that, for instance,
24.5    $F^2{}_1 = g^{2\alpha} F_{\alpha 1} = F_{21},$
and raising of the index 4 just changes the
sign of the component so that
24.6    $F^4{}_1 = g^{4\alpha} F_{\alpha 1} = -F_{41}.$
Which components shall we use in Maxwell's
equations? It is clear that in the first (11.2)
set all the indices must be on the same level,
and since the last one must be a lower index we
write all of them down. In the second set (11.3)
again the one after the comma must be down; but
the one with which we contract it must be on
the other level, and therefore up; the position
of the third index is arbitrary. We have thus,
as the Maxwell equations for free space
24.7    $F_{ij,k} + F_{jk,i} + F_{ki,j} = 0,\qquad F_j{}^\alpha{}_{,\alpha} = 0,$
and in the presence of matter the second set
becomes
24.71    $F_j{}^\alpha{}_{,\alpha} = \epsilon u_j.$
We notice here that no imaginary
quantities appear and in spite of this our formulas
are symmetric. The raising of the index 4 is
equivalent to changing the sign of a component,
and this is how the minus sign is taken care of.
We shall now write out the expressions for
the stress energy tensor and the equations of
motion; it is clear that the formulas 11.4 and
11.5 become
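24.8    $T^{i\alpha}{}_{,\alpha} = 0,$
24.9    $T_{ij} = F_{i\sigma} F^\sigma{}_j - \tfrac14 g_{ij} F^\sigma{}_\tau F^\tau{}_\sigma - \rho u_i u_j$
or
24.91    $T^{ij} = F^i{}_\sigma F^{\sigma j} - \tfrac14 g^{ij} F^\sigma{}_\tau F^\tau{}_\sigma - \rho u^i u^j$
or
24.92    $T^i{}_j = F^i{}_\sigma F^\sigma{}_j - \tfrac14 \delta^i_j F^\sigma{}_\tau F^\tau{}_\sigma - \rho u^i u_j.$
The continuity equation (11.1) will now be
written as
24.10    $(\rho u^\alpha)_{,\alpha} = 0.$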
25. Curvilinear Coordinates in Curved Space.
We want next to apply the general coordi-
nate system that we have introduced for flat
space also to curved spaces. In flat space we
introduce the language of general coordinates
and components by translating from the language
of cartesian coordinates and components; in
curved space we have no cartesian components; we
shall have, therefore, to begin by introducing
something that will play the role of cartesian
coordinates; we shall introduce quasi-cartesian
coordinates, which will take that place; but
whereas cartesian coordinates are universal in
that the same system of coordinates works for
the whole plane, or flat space - the neighbor-
hood of every point in curved space has its own
system of quasi-cartesian coordinates. They are
defined in the following way: Consider at a
given point P the tangent flat; there will be in
general a neighborhood of P such that no two
points of that neighborhood have the same
projection on the tangent flat (for a sphere, e.g.,
we may obtain such a neighborhood by drawing any
small circle around the point of contact); for
such a neighborhood there exists a one-to-one
correspondence between the points of it and the
points of the tangent flat which are their
projections. We introduce now on the tangent flat
a cartesian system of coordinates with origin
at the point of contact, and we use the
coordinates of a point of the flat as the
quasi-cartesian coordinates of the point of the curved
space whose projection it is; if, for instance,
a surface is given by equation
25.1    $z = \tfrac12 (ax^2 + 2bxy + cy^2) + \text{t.h.d.},$
x,y will be the quasi-cartesian coordinates of
the point x,y,z, of the surface for the
neighborhood of 0,0,0; - and in the general case, if
the curved space is given by 19.71
25.11
t.h.d.
the $x_i$ ($i = 1,\dots,n$) will be the quasi-cartesian
coordinates of the point $x_i$ ($i = 1,\dots,N$) for
the neighborhood of the point 0,0,0,...,0.
When we were discussing curved space in
Sections 18, 19, 20 we were speaking of vectors
and tensors; these vectors were vectors of the
tangent plane or tangent flat with initial point
at the point of contact. We shall not consider
any other vectors in connection with curved
spaces and we shall refer to these vectors as
the vectors of the curved space. To make this
idea seem more natural we may remark that a
tangent vector to a curve on a surface (or on a
curved space) is such a vector, that is, a vec-
tor of the tangent plane (or flat) with initial
point at the point of contact. In handling
these vectors we have been using for the vectors
at every point of the curved space a cartesian
coordinate system in the flat tangent at that
point, in fact, we may say the same system that
furnishes us the quasi-cartesian coordinates
for the points of the curved space in the neigh-
borhood of the point of contact. We shall,
therefore, refer to the cartesian components of
the vectors and tensors of the tangent flat when
they are considered as vectors and tensors of
the curved space - as quasi-cartesian components
of these vectors and tensors.
We have thus in connection with every point
P on the curved space a local coordinate system
which gives quasi-cartesian coordinates of the
neighboring points and the quasi-cartesian com-
ponents of the tensors at P, and in some cases
these local coordinate systems are very useful,
but it will be necessary to introduce more gen-
eral, more universal systems and learn how to
represent vectors in them. The necessity of
this last requirement will be clear if we con-
sider that, although a quasi-cartesian system
of a point P may be used to represent points in
the neighborhood of P it is not quasi-cartesian
for these points and cannot be used as such to
represent vectors at such points.
There is no difficulty in introducing a
universal system of coordinates for the points
of a space - what we want is just as in ordinary
space a one-to-one correspondence between the
points and n-ples of numbers. A simple example
is furnished by the so-called geographical co-
ordinates for the surface of a sphere. Another
approach is given by the so-called parametric
representation of a curved space. If the coor-
dinates of a flat space $x_1,\dots,x_N$ are given as
functions of one parameter we have a curve;
when they are given as functions
25.2    $x_i = x_i(u^1,\dots,u^n)$  ($i = 1,\dots,N$)
of n parameters $u^j$ we have what we have called
an n-dimensional curved space because eliminat-
ing these n parameters from the N equations
ing these n parameters from the N equations
which express the coordinates in terms of them
we find that the coordinates must satisfy N - n
= r equations, and this was our definition of
an n-dimensional curved space. Now since to ev-
ery set of values $u^1,\dots,u^n$ of the parameters
there corresponds one point of the curved space
the parameters $u^i$ may be used as coordinates for
the curved space.
Suppose then that we have introduced in
some way a general system of coordinates for the
points on a curved space (the reader may always
think of the special case of a surface). What
will be a natural system of representation of
vectors to go with it? Just as we use a carte-
sian system in the tangent flat to represent
points on the surface we may, so to say, project
the general coordinate system on each tangent
flat and use it to represent vectors and tensors
in that flat, in particular those with initial
points at the point of contact, i.e., the vec-
tors and tensors of the curved space. For the
neighborhood of each point we have thus two co-
ordinate systems: the general and the quasi-
cartesian for that point - and the same two sys-
tems, or, rather their projections, we may use
on the corresponding tangent flat. For the
neighborhood of each point there will be trans-
formation formulas for the coordinates of points,
and from these we can derive transformation form-
ulas for components of vectors and tensors in-
volving the a's, the b's and the g's as intro-
duced in Sections 21, 22 and 23. But since we
consider only vectors with initial points at
the point of contact we shall use the a's, the
b's and the g's calculated from the correspond-
ence between the quasi-cartesian and the gener-
al coordinates at a point only for that point
itself. We know that the a's and b's are nec-
essary only in the building up of a system so
that all we shall need in order to be actually
able to handle tensors and vectors in a given
general coordinate system are the g's.
We want to explain now how to obtain the
g's when a space is given in parametric form
25.2.
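We consider a curve $u^i(t)$ on the space, and
its tangent vector at some point; it may be con-
sidered either as a vector of the curved space,
and then its contravariant components will be
given, if we denote differentiation with respect
to the parameter by a dot placed over a letter,
by $\dot{u}^i$, and the square of its length will be
$g_{\alpha\beta}\,\dot{u}^\alpha \dot{u}^\beta$;
or it may be considered as a vector of the con-
taining space. Its (cartesian) components will
be given by $\dot{x}_i$ ($i = 1,\dots,N$) where the $x_i$ are
functions of t which are obtained from 25.2 by
substituting for the u's the expressions
characterizing our curve; we have thus
$\dot{x}_i = \frac{\partial x_i}{\partial u^\alpha}\,\dot{u}^\alpha$
and for the square of the vector
$\sum_{i=1}^{N} \dot{x}_i^2 = \sum_{i=1}^{N} \frac{\partial x_i}{\partial u^\alpha}\frac{\partial x_i}{\partial u^\beta}\,\dot{u}^\alpha \dot{u}^\beta.$
Equating this to the expression we obtained
above we find
25.3    $g_{\alpha\beta} = \sum_{i=1}^{N} \frac{\partial x_i}{\partial u^\alpha}\frac{\partial x_i}{\partial u^\beta}.$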
This formula ought to be compared to 21.16 of
which it will be seen to be a generalization if
account is taken of the values 21.11 of the b's.
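As an illustration of formula 25.3 we may compute the g's numerically for the surface of a unit sphere given in geographical coordinates. The sketch below is not part of the original derivation: the names `embed_sphere` and `metric_from_embedding`, the choice of latitude and longitude as the two parameters, and the use of central differences in place of exact derivatives are all assumptions made for the sake of the example.

```python
import math

def embed_sphere(u):
    """Cartesian coordinates x_i(u) of the unit sphere; u = (latitude, longitude)."""
    lat, lon = u
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))

def metric_from_embedding(x_of_u, u, h=1e-6):
    """g_ab = sum_i (dx_i/du^a)(dx_i/du^b), with derivatives by central differences."""
    n = len(u)
    columns = []  # columns[a][i] approximates dx_i/du^a
    for a in range(n):
        up = list(u)
        dn = list(u)
        up[a] += h
        dn[a] -= h
        xp, xm = x_of_u(up), x_of_u(dn)
        columns.append([(p - m) / (2 * h) for p, m in zip(xp, xm)])
    return [[sum(ca * cb for ca, cb in zip(columns[a], columns[b]))
             for b in range(n)] for a in range(n)]

# Example: the g's at latitude 0.5, longitude 1.2 (radians)
g = metric_from_embedding(embed_sphere, (0.5, 1.2))
```

For these coordinates the result should be close to g_11 = 1, g_12 = 0 and g_22 equal to the squared cosine of the latitude, i.e. the familiar line element of the sphere.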
The method of giving the curved space by means
of the formulas 25.11 may be considered as a
special case of the one used above; this will
be clear if we write 25.11 in the form
$x_1 = u^1,\ x_2 = u^2,\ \dots,\ x_n = u^n$,
It is seen that the parts of the u's are played
by the first n of the x's. Differentiation of
the formulas just written with respect to these
variables gives
substitution of these expressions into 25.3 gives
z
i=n+l
25.4 = z
Z
k=l
*n
t.h.d
t.h.d.
These formulas give the values for the g's in
pseudocartesian coordinates for a neighborhood
of the point of contact.
For the point of contact itself, i.e., for
the origin of our system of coordinates we have
25.41    $(g_{ij})_0 = \delta_{ij}$
and from the formulas 22.61 we conclude easily
that the g's with the upper indices are also
the δ's:
25.42    $(g^{ij})_0 = \delta_{ij}.$
As a consequence of this the distinction between
covariant and contravariant components vanishes
for quasi-cartesian coordinates at the point of
contact.
We come now to the operation of differen-
tiation. In the case of flat space we were sim-
ply trying to find a system of notations for
some operations that were defined independently.
Here the situation is different; we have not de-
fined differentiation; we cannot define it in
what would seem to be the natural way, as the
rate of change of a vector, for instance, be-
cause this would necessitate the consideration
of the difference between two vectors at two
different points and this conception is not de-
fined for curved space.
Before we come to this definition let us
formulate the situation in flat space as fol-
lows: a tensor field dF has been obtained by
differentiation from a tensor field F if in ev-
ery point the components of dF in a cartesian
system are the derivatives of the components of
F in that system.
In curved space there is no universal car-
tesian system but there is a quasi-cartesian
system for every point; it is natural, there-
fore, to define differentiation in curved space
by substituting in the above statement "quasi-
cartesian system" for "cartesian system"; if we
do that we arrive at the following definition:
Definition of Differentiation. We shall
say that a tensor field dF has been obtained by
differentiation from the tensor field F if at
every point the components of dF, in a system
of coordinates that is quasi-cartesian at that
point, are the derivatives of the components of
F in that system of coordinates.
Although this definition may sound compli-
cated it is the simplest imaginable adaptation
of the idea of differentiation to curved spaces.
The complication arises from the fact that there
is no cartesian system in curved space but when
we apply this definition to flat space we see
that it brings us back to differentiation as we
knew it in flat space.
We shall not have actually to pass from
general coordinates to quasi-cartesian coordi-
nates and then, after differentiation translate
the result back into the language of general co-
ordinates in every special case. We can derive
the formulas in general coordinates once for
all, just as we did it in the case of flat space
in Sections 22 and 23, and we shall obtain ex-
actly the same formulas. The only difference
may be in the derivation of the Γ's from the g's
(end of Section 22) which was based there on
the fact that the derivatives of the g's in car-
tesian coordinates vanish (22.84). Is this true
also in curved space? or, more precisely, do
the derivatives of the g's in quasi-cartesian
coordinates vanish at the point of contact?
In these coordinates the g's are given by
25.4; differentiating these expressions we ob-
tain
25.5
t.h.d.
and for the point of contact, where the x's
vanish, this is zero so that
25.6    $(\partial g_{ij}/\partial x_k)_0 = 0.$
We see thus that formally everything is
just the same as in flat space so that we can
take over into curved space the whole apparatus
of formulas worked out in Sections 21, 22, 23.
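The statements 25.41 and 25.6 can be checked directly on the surface 25.1 if the terms of higher degree are dropped. The sketch below assumes exactly that truncated surface, takes its g's from formula 25.3 with x and y as parameters (so that g_ij = δ_ij + z_i z_j, where z_i are the partial derivatives of z), and approximates derivatives by central differences; the function names are ours and purely illustrative.

```python
def metric(x, y, a, b, c):
    """g's of the surface z = (a x^2 + 2 b x y + c y^2) / 2 in the coordinates x, y."""
    zx = a * x + b * y  # dz/dx
    zy = b * x + c * y  # dz/dy
    return [[1 + zx * zx, zx * zy],
            [zx * zy, 1 + zy * zy]]

def metric_gradient_at_origin(a, b, c, h=1e-5):
    """Central-difference first derivatives of every g_ij at the point of contact."""
    gxp, gxm = metric(h, 0.0, a, b, c), metric(-h, 0.0, a, b, c)
    gyp, gym = metric(0.0, h, a, b, c), metric(0.0, -h, a, b, c)
    d_dx = [[(gxp[i][j] - gxm[i][j]) / (2 * h) for j in range(2)] for i in range(2)]
    d_dy = [[(gyp[i][j] - gym[i][j]) / (2 * h) for j in range(2)] for i in range(2)]
    return d_dx, d_dy

g0 = metric(0.0, 0.0, 2.0, 0.5, 3.0)                  # the g's at the origin
d_dx, d_dy = metric_gradient_at_origin(2.0, 0.5, 3.0)  # their first derivatives
```

At the origin the g's reduce to the δ's and every first derivative comes out zero, whatever values are chosen for a, b, c; this is the situation the quasi-cartesian coordinates are designed to produce.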
Incidentally we may mention that, as it fol-
lows easily from 25.6, the quantities Γ also
vanish in quasi-cartesian coordinates at the
point of contact. For future reference we put
down the formula
25.7    $(\Gamma^i_{jk})_0 = 0.$
In general, the point of contact in quasi-
cartesian coordinates is a place where we have
the closest possible approach to the situation
which obtains in flat space when we use carte-
sian coordinates.
Another formula that can be easily obtain-
ed from 25.5 and that we need later is obtained
by differentiating 25.5 once more and setting
$x_i = 0$. We get thus
25.8
Given a curved space by the formulas 25.2
we know how to find the g's. The question now
arises: suppose we are given $\tfrac{1}{2}n(n + 1)$
functions of the coordinates; is it possible to
find a space for which these functions serve as
the g's? The question is that of solving the
system of partial differential equations (25.3),
and without going into details we shall state
that such a system of equations in general can be
solved if the number of unknown functions is
equal to that of equations; since we have here
$\tfrac{1}{2}n(n + 1)$ equations we must have that many un-
known functions; that means that the number of
dimensions N of the containing space must be in
general $\tfrac{1}{2}n(n + 1)$; in special cases it may, of
course, be less than that. We may say then: a
two dimensional curved space given by its g's
may be always considered as immersed into a 3-
dimensional space; a three-dimensional curved
space may be always considered as part of a six-
dimensional flat space; and a four-dimensional
as part of a ten-dimensional flat.
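The counting used here is easily tabulated. The small sketch below merely evaluates n(n + 1)/2, the number of independent g's of a symmetric system with n indices, for the three cases mentioned in the text; the function name is ours.

```python
def containing_dimension(n):
    """Number of independent g_ij for an n-dimensional space: n(n + 1)/2."""
    return n * (n + 1) // 2

# Dimension N of the containing flat space needed in general, for n = 2, 3, 4
table = {n: containing_dimension(n) for n in (2, 3, 4)}
```

The table reproduces the numbers 3, 6 and 10 quoted above; it says nothing about special cases in which fewer dimensions suffice.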
Another question is, whether for given real
g's the containing space will come out real; and
this is by no means always the case. We know
that for $g_{11} = g_{22} = g_{33} = -g_{44} = 1$, all others
zero, the minimum cartesian containing space is
four-dimensional with one imaginary coordinate
and it is clear that no real cartesian space
can contain it.
Henceforth we may consider the curved space
as given by its g's, and the g's may be consid-
ered as arbitrarily given functions of the u's.
It may seem that we lost from view the
original purpose of introducing curved space,
which was that of obtaining a tensor which we
could identify with $T_{ij}$. We introduced the
Riemann tensor having this in mind, but now we
seem to be immersed in an entirely formal the-
ory and far removed from the Riemann tensor; as
a matter of fact, it is just around the corner;
differentiation, although performed according
to formulas that are formally the same as in
flat space, has, as we shall see, a new content;
in trying to discover the difference we will be
led to the Riemann tensor from a new point of
view.
26. New Derivation of the Riemann Tensor.
We said that the meaning of differentiation
in curved space is different from that in flat
space. To show this difference in one of its
most important manifestations we start out with
a tensor of the first rank given in its contra-
variant components $F^i$; we differentiate it twice
to obtain a tensor of third rank $F^i_{,jk}$; in flat
space this tensor would not differ from $F^i_{,kj}$
because in cartesian components differentiation
of a tensor reduces to ordinary differentiation
of its components, so that the cartesian compo-
nents of the two tensors mentioned are
$\frac{\partial^2 F^i}{\partial x_j\,\partial x_k}$ and $\frac{\partial^2 F^i}{\partial x_k\,\partial x_j}$
respectively, and these are equal be-
cause the result of ordinary differentiation does
not depend on the order; two tensors having equal
components in one system of coordinates would be
equal in all systems of coordinates and so we
have
26.1    $F^i_{,jk} - F^i_{,kj} = 0$ in flat space.
This reasoning does not apply to curved
space; in fact, to find $F^i_{,jk}$ in a point P we
have to differentiate $F^i_{,j}$; and in order to do
that we have to know $F^i_{,j}$ in different points
of the neighborhood of P; the finding of $F^i_{,j}$
in these points involves the use of quasi-carte-
sian systems for each of these points; we do not
have then one quasi-cartesian system in which
we can perform all our operations and the rea-
soning that led us to 26.1 breaks down. In spite
of this the result might still hold. In order
to show that it does not let us calculate the
left hand side of 26.1 using the formulas which
we deduced for flat space in Sections 22 and 23
and which, as we proved in Section 25, apply to
curved space also.
We start with the contravariant components
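$F^i$; we calculate the first differential accord-
ing to formula 23.6 to obtain
26.2    $F^i_{,j} = \frac{\partial F^i}{\partial u^j} + \Gamma^i_{\alpha j} F^\alpha$;
next, we differentiate this again, and get, ac-
cording to formula 23.7
26.21    $F^i_{,jk} = \frac{\partial F^i_{,j}}{\partial u^k} + \Gamma^i_{\alpha k} F^\alpha_{,j} - \Gamma^\alpha_{jk} F^i_{,\alpha}$;
now we form the difference we want to investi-
gate, viz.,
$F^i_{,jk} - F^i_{,kj} = \frac{\partial F^i_{,j}}{\partial u^k} - \frac{\partial F^i_{,k}}{\partial u^j} + \Gamma^i_{\alpha k} F^\alpha_{,j} - \Gamma^i_{\alpha j} F^\alpha_{,k} - (\Gamma^\alpha_{jk} - \Gamma^\alpha_{kj}) F^i_{,\alpha}$;
the last bracket vanishes according to 22.71,
and what remains becomes after the substitution
of the above expression 26.2 for the first dif-
ferential and rearrangement of terms
$\left(\frac{\partial^2 F^i}{\partial u^j \partial u^k} - \frac{\partial^2 F^i}{\partial u^k \partial u^j}\right) + \left(\Gamma^i_{\alpha j}\frac{\partial F^\alpha}{\partial u^k} - \Gamma^i_{\beta j}\frac{\partial F^\beta}{\partial u^k}\right) + \left(\Gamma^i_{\alpha k}\frac{\partial F^\alpha}{\partial u^j} - \Gamma^i_{\beta k}\frac{\partial F^\beta}{\partial u^j}\right)$
$\quad + \left(\frac{\partial \Gamma^i_{\alpha j}}{\partial u^k} - \frac{\partial \Gamma^i_{\alpha k}}{\partial u^j}\right) F^\alpha + \Gamma^i_{\alpha k}\Gamma^\alpha_{\beta j} F^\beta - \Gamma^i_{\alpha j}\Gamma^\alpha_{\beta k} F^\beta.$
Here cancellation takes place in the first three
pairs of terms: in the first as a result of in-
dependence of ordinary differentiation on order,
in the next two pairs as a result of the fact
that the name of the index of summation is im-
material; we come out with
26.3    $F^i_{,jk} - F^i_{,kj} = \left(\frac{\partial \Gamma^i_{\alpha j}}{\partial u^k} - \frac{\partial \Gamma^i_{\alpha k}}{\partial u^j} + \Gamma^i_{\sigma k}\Gamma^\sigma_{\alpha j} - \Gamma^i_{\sigma j}\Gamma^\sigma_{\alpha k}\right) F^\alpha$
or
26.4    $F^i_{,jk} - F^i_{,kj} = B^i_{\alpha jk} F^\alpha$
where
26.5    $B^i_{\alpha jk} = \frac{\partial \Gamma^i_{\alpha j}}{\partial u^k} - \frac{\partial \Gamma^i_{\alpha k}}{\partial u^j} + \Gamma^i_{\sigma k}\Gamma^\sigma_{\alpha j} - \Gamma^i_{\sigma j}\Gamma^\sigma_{\alpha k}.$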
Before we discuss the question whether the
expression vanishes we want to show that the
B's are the components of a tensor. In fact,
multiplying both sides of 26.4 by $X_i Y^j Z^k$, where
$X_i$, $Y^j$, $Z^k$ are components of arbitrary vectors,
and contracting we have
$(F^i_{,jk} - F^i_{,kj})\, X_i Y^j Z^k = B^i_{\alpha jk} F^\alpha X_i Y^j Z^k.$
The left hand side is a scalar that has been ob-
tained by legitimate operations and is, there-
fore, independent from the coordinate system
used; so is therefore the right hand side, and
this proves that the B's are the components of
a tensor. (In order to see that this is an es-
sential point and that not every symbol with
indices may be considered as a tensor, the read-
er might consider the expression $\Gamma^i_{jk} X_i Y^j Z^k$;
this expression is, obviously, not independent
from the choice of coordinates since in carte-
sian coordinates the Γ's vanish, and in other
systems they do not; the Γ's furnish thus an
example of symbols with indices that cannot be
interpreted as components of a tensor).
Now we can settle our question as to the
vanishing of 26.3 by showing that the B's are
mixed components of the Riemann tensor which
has been introduced in Section 20. Since we
have proved that they are components of a ten-
sor we may use any system of coordinates, and
we decide to use a quasi-cartesian system. In
such a system the Γ's vanish at the point of
contact (25.7) so that we are left with the
terms
substituting for the Γ's with one upper index
their expressions in terms of the g's with up-
per indices and the Γ's with all indices down
(22.92) we get
the first two terms vanish again because the Γ's
vanish at the point of contact; taking into ac-
count (25.41) that the g's are for the point of
contact equal to the δ's, and using the expres-
sions 22.9 we find after a few cancellations
(the index i appears here as a subscript because
the distinction between contravariant and covar-
iant quantities vanishes for quasi-cartesian co-
ordinates at the point of contact). Using here
for the second derivatives of the g's the expres-
sions 25.8 we find
Comparing this to the expression for the Riemann
tensor deduced at the end of Section 20 we con-
vince ourselves of the identity of the two ex-
pressions.
This shows that, if the Riemann tensor does
not vanish, the second differential of a vector
field actually may depend on the order of differ-
entiation. This fact is very interesting in it-
self, it confirms our statement that in curved
space differentiation has a new meaning and it
has many important implications, on which, how-
ever, we cannot dwell here. For us it is impor-
tant that we have obtained an expression of the
Riemann tensor in terms of the g's alone; this
means that those properties of the curvature of
space which are expressed in the Riemann tensor
are determined by the metric of the space, i.e.,
if distances along different curves are given,
the curvature (as far as it is expressed in the
Riemann tensor) is determined. According to our
conception, the inhabitants of the space cer-
tainly can measure lengths; it follows that cur-
vature, as expressed by the Riemann tensor, is
accessible to the inhabitants; it is an internal
property of the space. In particular, for N = 3,
n = 2, i.e., for the ordinary surface we obtain
the fact that the total curvature can be calcu-
lated from the expression for the line element;
this is Gauss's Theorema Egregium.
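A small numerical experiment makes the last statement concrete for the surface 25.1 with the terms of higher degree dropped. The sketch below uses only the g's of that surface (g_ij = δ_ij + z_i z_j, as follows from 25.3) and recovers the total curvature from their second derivatives at the point of contact. The formula K = ∂²g_12/∂x∂y - ½(∂²g_11/∂y² + ∂²g_22/∂x²), valid where the g's equal the δ's and their first derivatives vanish, is the standard one and is assumed here rather than taken from Section 20; the function names are illustrative.

```python
def metric(x, y, a, b, c):
    """g's of the surface z = (a x^2 + 2 b x y + c y^2) / 2 in the coordinates x, y."""
    zx = a * x + b * y
    zy = b * x + c * y
    return [[1 + zx * zx, zx * zy],
            [zx * zy, 1 + zy * zy]]

def second_derivative(f, i, j, h=1e-3):
    """Central-difference d^2 f / dx_i dx_j at the origin (i, j are 0 for x, 1 for y)."""
    def at(si, sj):
        p = [0.0, 0.0]
        p[i] += si * h
        p[j] += sj * h
        return f(p[0], p[1])
    return (at(1, 1) - at(1, -1) - at(-1, 1) + at(-1, -1)) / (4 * h * h)

def total_curvature(a, b, c):
    """Curvature at the point of contact, computed from the g's alone."""
    def component(i, j):
        return lambda x, y: metric(x, y, a, b, c)[i][j]
    g12_xy = second_derivative(component(0, 1), 0, 1)
    g11_yy = second_derivative(component(0, 0), 1, 1)
    g22_xx = second_derivative(component(1, 1), 0, 0)
    return g12_xy - 0.5 * (g11_yy + g22_xx)

K = total_curvature(2.0, 0.5, 3.0)
```

The value obtained agrees, up to rounding, with ac - b², the product of the principal curvatures of the surface at the origin, although the computation never refers to the containing space.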
27. Differential Relations for the
Riemann Tensor.
The method of quasi-cartesian coordinates
in proving a relation between tensors that we
used in identifying the B's with the components
of the Riemann tensor can be applied often and
helps to avoid lengthy computations. We shall
use it now to prove certain differential rela-
tions for the Riemann tensor that are very im-
portant for us because we know that the tensor
$T_{ij}$, which we want to identify with the contract-
ed Riemann tensor, satisfies a certain differen-
tial equation, namely, $\partial T_{i\alpha}/\partial x_\alpha = 0$, and, of
course, we expect the tensor in our mathemati-
cal theory with which we are going to identify
T to satisfy the same relations. In order to
deduce differential relations on the contracted
Riemann tensor we have to prove first some re-
lations for the non-contracted tensor. These
relations have been discovered by Ricci and then
rediscovered by Bianchi and bear the latter's
name; they are
27.1    $R_{ijmn,p} + R_{ijnp,m} + R_{ijpm,n} = 0.$
The proof is very simple if we use quasi-
cartesian coordinates. In these coordinates the
Γ's at the point of contact vanish and although
the first derivatives of the Γ's do not, the
components of the tensor obtained by differen-
tiating the B's (formula 26.5) which we have
identified with the R's will contain the second
derivatives only, because the first derivatives
will be multiplied by the Γ's themselves that
do vanish. With this remark in mind the proof
of the Bianchi relations does not present any
difficulty; we simply substitute for each of
the three terms in 27.1 the difference of the
two second order derivatives and find that the
result vanishes identically.
Now, in order to deduce from 27.1 the re-
lations for the contracted tensor we raise in
27.1 the second index so that we have
$R_i{}^j{}_{mn,p} + R_i{}^j{}_{np,m} + R_i{}^j{}_{pm,n} = 0$
and here we contract i with m, and j with n. We
obtain
$R_\alpha{}^\beta{}_{\alpha\beta,p} + R_\alpha{}^\beta{}_{\beta p,\alpha} + R_\alpha{}^\beta{}_{p\alpha,\beta} = 0.$
The second term here may be written as $-R_\alpha{}^\beta{}_{p\beta,\alpha}$
using the fact (20.71) that the Riemann tensor
changes its sign when two indices of the same
pair are interchanged; and the third term is
equal to the second as we can see by interchang-
ing α and β (which does not change the value
of the expression since α and β are summation
indices), and then interchanging the first two
indices, i.e., β and α, and the next two, i.e.,
p and β (each of these interchanges changes the
sign, so that nothing is changed in the result).
We have thus
$R_\alpha{}^\beta{}_{\alpha\beta,p} - 2R_\alpha{}^\beta{}_{p\beta,\alpha} = 0.$
But $R_\alpha{}^\beta{}_{p\beta}$ are the mixed components of the con-
tracted Riemann tensor which we denote by $R^\alpha_p$
so that, dividing by 2 and changing the sign we
have
$R^\alpha_{p,\alpha} - \tfrac{1}{2} R^\alpha_{\alpha,p} = 0.$
Finally, $R^\alpha_\alpha$ is the result of contraction of the
contracted Riemann tensor; we denote this sca-
lar by R (it is called the twice contracted Rie-
mann tensor); then we can write for $R^\alpha_{\alpha,p}$ simply
$R_{,p}$ or $(\delta^\alpha_p R)_{,\alpha}$ and our relation becomes
$(R^\alpha_p - \tfrac{1}{2}\delta^\alpha_p R)_{,\alpha} = 0.$
28. Geodesics.
In concluding this fragmentary development
of the mathematical theory that we intend to ap-
ply to Physics in the next chapter we shall study
briefly a class of curves in curved space which
play an important part in the study of motion.
These curves may be considered as generalizations
of straight lines in flat space, and we shall
begin by considering these.
In agreement with the point of view of dif-
ferential geometry (Section 21) we shall charac-
terize a straight line by differential equations.
If it is given in parametric form (7.11) we ob-
tain by differentiating twice with respect to
the parameter and indicating differentiation by
a dot placed over the letter
28.1    $\ddot{x}_i = 0.$
Since the choice of the parameter is in a
high degree arbitrary, and for another choice of
a parameter the representation may cease to be
linear and the equations (28.1) may not hold any
more - we cannot say that they characterize a
straight line; a complete statement would be: a
straight line is a curve for which there exists
a parametric representation such that 28.1 holds.
Next, we translate 28.1 into the language
of curvilinear coordinates, still keeping to flat
space. We have, as in Section 21, except that we
write now in agreement with Section 23, the index
as a superscript,
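$\dot{x}^i = dx^i/dp = b^i_\alpha \dot{u}^\alpha$,
$d\dot{x}^i/dp = b^i_\alpha \ddot{u}^\alpha + \frac{\partial b^i_\alpha}{\partial u^\beta}\dot{u}^\alpha \dot{u}^\beta = 0.$
Multiplying by $a^\gamma_i$ and summing with respect
to i we get
$a^\gamma_i b^i_\alpha \ddot{u}^\alpha + a^\gamma_i \frac{\partial b^i_\alpha}{\partial u^\beta}\dot{u}^\alpha \dot{u}^\beta = 0$
or, taking into account 21.14 and 22.82
28.2    $\ddot{u}^\gamma + \Gamma^\gamma_{\alpha\beta}\,\dot{u}^\alpha \dot{u}^\beta = 0.$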
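That equation 28.2 describes a straight line when curvilinear coordinates are used in flat space can be verified on a simple case. The sketch below takes the plane in polar coordinates r, θ, for which the only non-vanishing Γ's are the well-known Γ^r_θθ = -r and Γ^θ_rθ = Γ^θ_θr = 1/r, and the straight line x = 1, y = p with p as the parameter. The choice of this particular line and the function name are assumptions made for the example; the derivatives of r(p) and θ(p) have been worked out by hand.

```python
import math

def geodesic_residuals(p):
    """Left-hand sides of 28.2 for the line x = 1, y = p written in polar coordinates."""
    r = math.sqrt(1 + p * p)          # r(p) = sqrt(1 + p^2), theta(p) = atan(p)
    r_dot = p / r
    r_ddot = 1 / r ** 3
    th_dot = 1 / r ** 2
    th_ddot = -2 * p / r ** 4
    gamma_r_thth = -r                 # Gamma^r_{theta theta}
    gamma_th_rth = 1 / r              # Gamma^theta_{r theta} = Gamma^theta_{theta r}
    res_r = r_ddot + gamma_r_thth * th_dot * th_dot
    res_th = th_ddot + 2 * gamma_th_rth * r_dot * th_dot
    return res_r, res_th

residuals = [geodesic_residuals(p) for p in (-2.0, 0.0, 0.7, 5.0)]
```

Both residuals vanish (to rounding error) for every value of the parameter, although neither the second derivative of r nor that of θ vanishes separately; the terms with the Γ's supply exactly what is needed.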
We pass now to curved space; in general, we
have here no straight lines but we may consider
the same equations and investigate the properties
of the curves represented by them. We introduce
as our definition:
Geodesics are curves which satisfy for an
appropriate choice of parameter Equation 28.2.
In studying geodesics it is often more con-
venient to consider not a single geodesic but
a portion of space filled with geodesics, so
that through every point there passes one and
only one geodesic of the bunch. If we have this
situation we have a vector $\dot{u}^i$ in every point of
that portion of space, so that we have a vector
field, and the components $\dot{u}^i$ may be considered
as functions of the coordinates. We may then
write
$\ddot{u}^i = \frac{\partial \dot{u}^i}{\partial u^\alpha}\,\dot{u}^\alpha$
and equation 28.2 becomes
$\left(\frac{\partial \dot{u}^i}{\partial u^\alpha} + \Gamma^i_{\alpha\beta}\,\dot{u}^\beta\right)\dot{u}^\alpha = 0$
or, according to 22.6,
28.3    $\dot{u}^i_{,\alpha}\,\dot{u}^\alpha = 0.$
This form is very convenient in some cases.
We shall use it to prove two properties of ge-
odesics. In the first place we may discuss the
meaning of the parameter that we are using. Con-
sider the square of the tangent vector $\dot{u}^\alpha \dot{u}_\alpha$; we
can prove that this quantity is constant along
a geodesic. In fact, differentiating with re-
spect to the parameter, we have
$\frac{d}{dp}(\dot{u}^\alpha \dot{u}_\alpha) = (\dot{u}^\alpha \dot{u}_\alpha)_{,\beta}\,\dot{u}^\beta = 2\,\dot{u}_\alpha\,\dot{u}^\alpha_{,\beta}\,\dot{u}^\beta$
and this vanishes according to 28.3. Since arc
length is given by the formula
$s = \int \sqrt{\dot{u}^\alpha \dot{u}_\alpha}\; dp$
we see that, as a result of the fact that $\dot{u}^\alpha \dot{u}_\alpha$
is constant, s is proportional to p, or p is
proportional to the arc length s. Since mul-
tiplication of the parameter by a constant fac-
tor will not affect equation 28.2 or 28.3 we
may always consider that the parameter is arc
length. This discussion does not apply, how-
ever, when the quantity $\dot{u}^\alpha \dot{u}_\alpha$ is zero, i.e., when
$\dot{u}^i$ is a zero square vector. If we agree to call
curves whose tangent vectors have zero square
singular curves we may state the following:
Theorem. In case of a non-singular geodesic,
the parameter mentioned in the definition of a
geodesic and used in 28.2 and 28.3 is propor-
tional or equal to arc length.
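The constancy of the square of the tangent vector along a geodesic can be observed numerically. The sketch below integrates equation 28.2 on the unit sphere with colatitude θ and longitude λ as coordinates, for which g_11 = 1, g_22 = sin²θ and the non-vanishing Γ's are the standard Γ^θ_λλ = -sin θ cos θ and Γ^λ_θλ = cot θ. The Runge-Kutta scheme, the step size, the initial values and all function names are assumptions chosen for the illustration and are not part of the text.

```python
import math

def rhs(state):
    """Equation 28.2 on the unit sphere; state = (th, lam, th_dot, lam_dot)."""
    th, lam, th_dot, lam_dot = state
    th_ddot = math.sin(th) * math.cos(th) * lam_dot ** 2
    lam_ddot = -2.0 * math.cos(th) / math.sin(th) * th_dot * lam_dot
    return (th_dot, lam_dot, th_ddot, lam_ddot)

def rk4_step(state, h):
    """One classical fourth-order Runge-Kutta step of size h in the parameter."""
    def shifted(k, f):
        return tuple(s + f * ki for s, ki in zip(state, k))
    k1 = rhs(state)
    k2 = rhs(shifted(k1, h / 2))
    k3 = rhs(shifted(k2, h / 2))
    k4 = rhs(shifted(k3, h))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def squared_tangent(state):
    """g_ab u'^a u'^b for the sphere: th_dot^2 + sin^2(th) * lam_dot^2."""
    th, _, th_dot, lam_dot = state
    return th_dot ** 2 + math.sin(th) ** 2 * lam_dot ** 2

def max_drift(state, h=0.001, steps=2000):
    """Largest change of the squared tangent vector along the integrated curve."""
    start = squared_tangent(state)
    worst = 0.0
    for _ in range(steps):
        state = rk4_step(state, h)
        worst = max(worst, abs(squared_tangent(state) - start))
    return worst

drift = max_drift((1.0, 0.0, 0.3, 0.8))
```

The drift stays at the level of the integration error, so that along the computed curve the parameter is proportional to arc length, as the theorem asserts. The initial values are chosen so that the curve stays away from the poles, where these coordinates fail.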
Next we may give an interpretation to equa-
tion 28.2 which sheds some light on the geomet-
rical nature of geodesics. We may assume now
that in all geodesics of the bunch arc length
has been chosen as the parameter; then the vec-
tors $\dot{u}^j$ are unit vectors and they characterize
in each point the direction of the geodesic.
The derivative $\dot{u}^j_{,i}$ characterizes the change of
direction as we move in the direction given by
the coordinate $u^i$ and $\dot{u}^j_{,\alpha}\dot{u}^\alpha$ gives the change
of direction in the direction of the vector $\dot{u}^i$,
i.e., in the direction of the geodesic itself.
Since, according to 28.2 this quantity is zero
we have proved that the direction of a geodesic
does not change as we move along it. (The above
discussion applies, strictly speaking, only to
non-singular geodesics.)
Chapter V.
GENERAL RELATIVITY
In Chapter I we introduced certain funda-
mental quantities, and we combined them into the
symmetric tensor of rank two, $T_{ij}$. We found
that this tensor satisfies the differential
equation
$\partial T_{i\alpha}/\partial x_\alpha = 0$,
first for i = 1, 2, 3 and then, in Chapter III we
showed that, as a result of the new identifica-
tion introduced there, the fourth equation is
also satisfied. We thought it desirable to build
a mathematical theory in which a tensor of the
same formal properties will appear in a natural
way, and in the preceding Chapter IV we succeed-
ed in actually setting up such a theory - the
theory of curved space-time.
The structure of such a space, we found, is
expressed in a tensor of rank four - the Riemann
tensor, but we obtained from it by contraction a
tensor of rank two - the contracted Riemann ten-
sor. In investigating the differential proper-
ties of the Riemann tensor we found in Section
27 a relation of the type desired; it is satis-
fied by a tensor which differs slightly from the
contracted Riemann tensor, namely, the tensor
$R^i_j - \tfrac{1}{2}\delta^i_j R$ which we may call the corrected con-
tracted Riemann tensor, and this is the tensor
which we are going to identify with the physical
tensor T so that our fundamental assumption will
be
$T^i_j = R^i_j - \tfrac{1}{2}\delta^i_j R.$
Thus we decide to interpret T, and therefore our
fundamental quantities of matter and electricity
ρ, u, v, w, X, Y, Z, L, M, N, which went into it,
in terms of structure of curved space as it is
reflected in the contracted corrected Riemann
tensor. But in doing this we find ourselves be-
fore a radically new situation. As we wanted,
the tensor is now an expression of the proper-
ties of space, i.e., the space is now different
from the one we had before - geometry and phy-
sics is now an organic whole and it is not clear
what changes this brings with it; together with
the desirable feature, namely the fact that T
grew out of space, so to say, we may have brought
in some not desirable and hard to manage fea-
tures. But then there would be no gain if we
could merely say that T is a geometrical thing
now; we expected to gain something essential in
undertaking the merging together of our geometry
and physics; and now we stand before an accom-
plished fact and we have to see what it brought
with it. We conjured up something and we do not
seem to be able to stop, we have to go ahead and
hope that the changes will be beneficial.
It might seem strange that we find a phy-
sical interpretation only for the contracted
Riemann tensor, only for ten combinations of
its twenty components. But this is quite in
order. Should all the components of the Rie-
mann tensor be used up in interpreting matter
and electricity that would mean that where there
is no matter (and electricity) space-time is
flat (as far as internal properties are con-
cerned); that would mean that matter acts only
where it is; but we know that matter makes it-
self felt, for instance, by the gravitational
field that it produces, also outside the region
which it occupies, and this is in accord with
our identification as a result of which only
part of the components of the Riemann tensor
vanish where there is no matter, so that the
remaining components may be interpreted as cor-
responding to gravitational forces.
29. The Law of Geodesics.
In questions of celestial mechanics which
we are going to treat now the effects of the
electromagnetic field are usually negligible
and we shall begin by equating to zero our elec-
tromagnetic tensor. Equation 24.9 becomes then
29.1    $T^i_j = \rho\, u^i u_j.$
According to our fundamental assumption,
this tensor has been identified with the cor-
rected contracted Riemann tensor, and it must
satisfy the equation
29.2    $T^\alpha_{j,\alpha} = 0$
which formally is the same as our old equation
of motion 24.8 but differs from it In that it
has to be interpreted in curved space. The last
two equations impose certain conditions on the
velocity components $u^i$ and we want to find these
conditions or, in other words, we want to elim-
inate density from the equations 29.1, 29.2.
(We have been using in Chapter IV the letter u for
the general coordinates - in this chapter we go
back to our notation of the first three chap-
ters and denote by $u^i$ again the four-dimension-
al velocity vector, and we shall denote the gen-
eral coordinates by $x^i$.)
First of all we shall prove the following
theorem due to Mineur.
Theorem. If the field $u^i$ satisfies the equa-
tions 29.1, 29.2 the vectors $u^i$ may be consider-
ed as tangent unit vectors to a family of geo-
desics filling the space.
Proof. We consider first the case when $u^i$
is a unit vector (and not a zero square vector),
i.e., $u_\rho u^\rho = -1$. Differentiating this relation
we have
29.3    $u_{\rho,i}\, u^\rho = 0.$
Substituting 29.1 into 29.2 we get
29.4    $\frac{\partial \rho}{\partial x^\alpha}\, u^\alpha u_j + \rho\, u^\alpha_{,\alpha}\, u_j + \rho\, u^\alpha u_{j,\alpha} = 0.$
Dividing by ρ and introducing the notation
29.5    $A = \frac{\partial \log \rho}{\partial x^\alpha}\, u^\alpha + u^\alpha_{,\alpha}$
we may write 29.4 as
$A u_j + u^\alpha u_{j,\alpha} = 0.$
Multiplying this by $u^j$ and summing with respect
to j, for which we write ρ, we have
$A\, u_\rho u^\rho + u^\alpha u_{\rho,\alpha}\, u^\rho = 0$
which, according to 29.3 gives A = 0. Substi-
tuting this into the form of 29.4 just written we obtain
29.6    $u^\alpha u_{j,\alpha} = 0$
which, according to 28.3 proves the theorem in
the case considered.
But we also have to consider the case of
propagation of light. In this case we do not
have to consider any density ρ, the momentum vec-
tor being given by $q_i$ with $q_\rho q^\rho = 0$. The preced-
ing proof breaks down in this case, but continu-
ity considerations lead us to the result that, in
this case also, we can find a scalar field ρ such
that $q^i/\rho$ will be tangent vectors to geodesics.
We conclude that in a gravitational field
matter and light particles follow geodesics.
In the present chapter we are going to ap-
ply this result to the investigation of the mo-
tion of a planet and the propagation of light
in the Solar system. We shall see that the
changed significance of differentiation takes
care, in a way, of what is usually accounted
for by gravitational forces.
30. Solar System. Symmetry Conditions.
Our equations 29.1 and 29.2 describe rela-
tions existing between matter and field. We
proved that the motion of matter is character-
ized by the geodesics of the curved space, but
the curvature is in turn determined by matter.
Theoretically, we may have a complete descrip-
tion of the situation, but in practice we do not
know how to handle it, we do not know where to
begin. In such cases we often resort to the
method of successive approximations. Let us try
to apply this method here. In investigating the
motion of a planet around the sun we neglect in
the first place the motion of the sun. Then, in
the first approximation we neglect the mass of
the planet, i.e., assume that there is no mat-
ter outside the sun. Since we have already neg-
lected electromagnetism we have then that out-
side the sun the tensor T is zero so that, ac-
cording to the fundamental assumption,
$R^i_j - \tfrac{1}{2}\delta^i_j R = 0.$
Contracting we get $R - \tfrac{1}{2}\cdot 4R = 0$, so that $R = 0$
and we have simply
30.1    $R^i_j = 0.$
These equations are known as Einstein's equa-
tions. We see that the statement that the cor-
rected contracted Riemann tensor vanishes is
equivalent to the statement that the contracted
Riemann tensor vanishes.
As a result of our first approximation we
derived thus the field equations 30.1. In the
next approximation we introduce the planet and
assume that its action on the field is neglig-
ible but that the field acts on it, i.e., that
the motion of the planet is given by the geo-
desics of the field which has been determined
in the preceding step; the motions will then be
given by the equations 28.2
30.2    $\frac{d^2 x^i}{ds^2} + \Gamma^i_{\alpha\beta}\,\frac{dx^\alpha}{ds}\frac{dx^\beta}{ds} = 0$
in which the Γ's are calculated from the g's
which have been found to satisfy 30.1.
Our problem, therefore, falls in two: first,
to find a field satisfying the equations 30.1,
and second, to find the geodesics of this field.
In this form the problem is comparable to
the problem in Newtonian mechanics as explained
in Section 1. There the field was given by the
potential which had to satisfy the Laplace equa-
tion; here the field is given by the g's which
have to satisfy the equations 30.1.
There the motion, after the field had been
determined, was described by second order or-
dinary equations, differentiation being taken
with respect to time; here motion is also de-
scribed by second order differential equations,
differentiation being with respect to s.
It is possible by making some special as-
sumptions, neglecting certain quantities, for
instance the derivatives of all the g's except
$g_{44}$ and dropping some terms, to obtain the gen-
eral Newtonian equations as a special or limit-
ing case of our equations. The equations 30.1
would thus reduce to the Laplace equation 1.54
for $g_{44}$ and the equations of a geodesic to the
equations of motion 1.1 in which X,Y,Z are giv-
en by 1.53, so that we could consider the gen-
eral Newtonian theory of motion in a gravita-
tional field as a first approximation to the
theory of Relativity, but it is quite difficult
in the general case to estimate what we neglect
and the error we commit, and we prefer to com-
pare the two theories on some concrete special
cases. All these cases will refer to what cor-
responds to a gravitational field produced by a
single attracting center. We found in Section
1 such a field by using the general equations
and, in addition, the condition of symmetry. We
intend to follow an analogous course here. Our
general equations are 30.1 and now we want to
find what will correspond to the conditions of
symmetry. The situation is much more complicated
here. There the field could be characterized
by a scalar $\varphi$ and the condition of symmetry
with respect to a point was simply expressed by
stating that $\varphi$ is a function of distance from
that point; here the field is characterized by
a tensor $g_{ij}$. There, in the second place, we
worked in ordinary space; here we have space-time
which has an additional coordinate, t.
Last, there the space was given and in it distances
were well defined; on this space was superimposed
a field whose symmetry we had to
discuss; here the field is not superimposed on
a space with a given metric, but the metric itself
constitutes a field which has to be determined
by the symmetry condition.
We shall take up these three difficulties
one by one.
In the first place let us consider a ten-
sor field in ordinary space, and let us impose
on it the condition of symmetry with respect to
a point. A tensor we may consider (Section 9)
as the left hand side of an equation of a central
quadric surface (we are interested in a
symmetric tensor here, since the g's are symmetric
in the indices i and j - this symmetry
we must try not to confuse with the symmetry
with respect to a point which we impose on the
field - and a symmetric tensor is sufficiently
characterized by a quadratic form) which we may
consider as an ellipsoid. Our tensor field will
then be represented by an ellipsoid at every
point of space. The field must allow rotations
around a fixed center O, i.e., such a rotation
must bring the field into itself; in other words,
if a rotation brings a point P into a point Q
it must bring the ellipsoid at P into the ellip-
soid at Q. In particular, a rotation, which
leaves P unchanged must not change the ellipsoid
at P. It is clear that every ellipsoid must be
an ellipsoid of revolution and that its axis
must be directed along the radius vector from
O to P.
The ellipsoid at the point (x, 0, 0) will be
seen to be
$A(\xi - x)^2 + B(\eta^2 + \zeta^2) = 1$
and for a general point P, if we use polar coordinates
for P and (considering, if it helps,
the ellipsoid as infinitesimal) their differentials
for the points of the quadric relative
to P,
30.3    $A\,dr^2 + B(d\theta^2 + \sin^2\theta\,d\varphi^2) = 1.$
Since ellipsoids at points equidistant from O
must have the same dimensions, A and B must be
functions of r alone.
The left hand side of this equation gives
a tensor field which satisfies the condition of
symmetry with respect to a point. Next, we
consider the complication resulting from the introduction
of time. In Section 1 time was not
mentioned; this means that the field was considered
as independent of time, or static; we may
say that the field must not be affected by a
change in t or, from the four-dimensional point
of view, by a translation along the t-axis. This
is a requirement of the same character as that
of symmetry with respect to a point; from the
four-dimensional point of view we may combine
the two requirements and say that the field must
be symmetric with respect to a line - the t-axis.
But the field now is a field in four-space; it
will be represented by a quadratic form in dr,
$d\theta$, $d\varphi$ and dt. For dt = 0 it must reduce to the
field given before; the coefficients must be independent
of t corresponding to the requirement
that the field be static; and a change from t
to -t must also not affect the field (reversibility
of time) so that terms of the quadratic
form involving dt to the first power must be absent.
It follows that the addition of the
fourth dimension results in the addition of only
one term to our tensor which now may be written
as
30.4    $A\,dr^2 + B(d\theta^2 + \sin^2\theta\,d\varphi^2) + C\,dt^2$
where C, as well as A and B, is a function of r
alone.
And now we have to overcome the last dif-
ficulty, that connected with the fact that our
space is curved and that we cannot define sym-
metry in terms of rotations because rotation
means a transformation in which distances are
preserved, and distances are defined only by the
field of the g's which we want to determine by
the requirement that it be not affected by ro-
tations. To overcome this difficulty we have to
agree on some other definition of symmetry, and
it seems natural to adopt as such the following:
in order to define a symmetry for a curved space
we shall compare it with a flat space by estab-
lishing a one-to-one correspondence between the
points of the two spaces. Corresponding to ev-
ery transformation of the flat space we will
have then a transformation of the curved space;
and we shall say that the curved space possesses
the same symmetry as a field F in the flat
space if the metric of the curved space - as
given by the g's - is not affected by those
transformations of the curved space which cor-
respond to the transformations in flat space
not affecting the field F.
Suppose now that we have such a curved
space. This implies that we have a one-to-one
correspondence with the flat space, and we may
use for the points of the curved space the same
coordinates that we use for the corresponding
points of the flat space. It is clear that 30.4
will satisfy the requirements, so that we can
take it for our fundamental tensor, or as we
shall say (compare 21.8) for our $ds^2$.
But the quantities r, $\theta$, $\varphi$, t, which have
definite geometrical significance in flat space
lose it in curved space - they are just numbers
which we use to characterize different
points as we use numbers to characterize houses
on a street. There is no reason why we should
not replace them by other numbers, i.e., transform
our coordinates, if it would simplify our
formulas. Now, it is clear that transformations
involving $\theta$, $\varphi$, t will make our expression 30.4
more complicated because they would introduce
these coordinates into the coefficients. But we
could choose a transformation on r alone so as
to simplify that expression; we could, for instance,
reduce any one coefficient to a prescribed
function of the new r. We make this
choice in such a way as to reduce B to $r^2$ because,
in a way, it restores to r a geometrical
meaning as we shall see presently. If we write
$\xi(r)$ and $-\eta(r)$ for the functions of the new r
which now appear instead of A and C, and interpret
30.4 as giving $-ds^2$, in accordance with
the standardization of the parameter adopted in
Section 12, our final formula will be
30.5    $-ds^2 = \xi(r)\,dr^2 + r^2(d\theta^2 + \sin^2\theta\,d\varphi^2) - \eta(r)\,dt^2.$
Letting here r and t have constant values
we have a surface, and a simple calculation
would show that $1/r^2$ is the total curvature of
this surface, which gives a geometrical meaning
to r.
Our task is now accomplished, we have im-
posed on our space the conditions of symmetry
and we have next to impose on it the general
equations 30.1.
31. Solution of the Field Equations.
We are now at a stage which corresponds to
the assumption that the potential $\varphi$ is a function
of r alone in Section 1, and our next task
corresponds to the substitution of $\varphi(r)$ into
Laplace's equation. Instead of one unknown
function $\varphi(r)$ we have here the ten g's determined
by 30.5 which we may write out as
31.1    $g_{11} = \xi(r),\quad g_{22} = r^2,\quad g_{33} = r^2\sin^2\theta,\quad g_{44} = -\eta(r),$ all others zero,
and which involve two unknown functions. In order
to determine these functions we have to
substitute 31.1 into 30.1. In the first place
we have to calculate the g's with the upper indices
from the formulas (22.6)
$g_{i\alpha}\,g^{\alpha j} = \delta_i^j.$
Since the g's with two distinct lower indices
vanish, only those terms on the left are not
zero in which $\alpha = i$ and we have
$g_{ii}\,g^{ij} = \delta_i^j$ (no summation);
for $i \neq j$ the right hand sides are zero, and
since the first factors on the left are not
zero the second must vanish; we see thus that
the g's with two distinct upper indices also
vanish. For j = i we have unity on the right
and thus
31.2    $g^{11} = 1/\xi(r),\quad g^{22} = 1/r^2,\quad g^{33} = 1/(r^2\sin^2\theta),\quad g^{44} = -1/\eta(r),$ all others zero.
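In what follows
$x_1 = r,\quad x_2 = \theta,\quad x_3 = \varphi,\quad x_4 = t.$
Differentiation with respect to r will be denoted,
as in Section 1, by $'$. We next calculate
the $\Gamma$'s with all indices down according to
22.9 and obtain, omitting those that come out
zero,
31.3
$\Gamma_{1,11} = \tfrac12\xi',\quad \Gamma_{1,22} = -r,\quad \Gamma_{1,33} = -r\sin^2\theta,\quad \Gamma_{1,44} = \tfrac12\eta',$
$\Gamma_{2,12} = r,\quad \Gamma_{2,33} = -r^2\sin\theta\cos\theta,$
$\Gamma_{3,13} = r\sin^2\theta,\quad \Gamma_{3,23} = r^2\sin\theta\cos\theta,\quad \Gamma_{4,14} = -\tfrac12\eta'.$
Raising of an index is accomplished in this case
simply by multiplying by the g with the index to
be raised appearing above twice, because the sum
$g^{i\alpha}\Gamma_{\alpha,jk}$ which, according to 23.3, is equal to $\Gamma^i_{jk}$ reduces,
as the result of the vanishing of the g's
with two distinct upper indices, to one term, namely
$g^{ii}\Gamma_{i,jk}$. This permits us to write out easily the
$\Gamma$'s with one index above:
31.31
$\Gamma^1_{11} = \frac{\xi'}{2\xi},\quad \Gamma^1_{22} = -\frac{r}{\xi},\quad \Gamma^1_{33} = -\frac{r\sin^2\theta}{\xi},\quad \Gamma^1_{44} = \frac{\eta'}{2\xi},$
$\Gamma^2_{12} = \frac{1}{r},\quad \Gamma^2_{33} = -\sin\theta\cos\theta,$
$\Gamma^3_{13} = \frac{1}{r},\quad \Gamma^3_{23} = \cot\theta,\quad \Gamma^4_{14} = \frac{\eta'}{2\eta}.$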
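The $\Gamma$'s with all indices down can be spot-checked numerically. The Python sketch below assumes that 22.9 is the usual expression $\Gamma_{i,jk} = \tfrac12(\partial g_{ik}/\partial x_j + \partial g_{ij}/\partial x_k - \partial g_{jk}/\partial x_i)$ and takes the diagonal g's of 31.1. The trial functions xi_trial and eta_trial, the step h and all function names are illustrative choices of ours and do not appear in the text. Indices run from 0 to 3 in the code, so that $\Gamma_{1,44}$ is gamma_down(0, 3, 3, ...).

```python
import math

def xi_trial(r):
    # arbitrary smooth trial function standing in for xi(r) (an assumption)
    return 2.0 + r

def eta_trial(r):
    # arbitrary smooth trial function standing in for eta(r) (an assumption)
    return 1.0 + r * r

def metric_diagonal(x):
    """Diagonal components g_11, g_22, g_33, g_44 of 31.1 at x = (r, theta, phi, t)."""
    r, theta = x[0], x[1]
    return [xi_trial(r), r**2, (r * math.sin(theta))**2, -eta_trial(r)]

def dg(a, b, c, x, h=1e-5):
    """Central-difference derivative of g_ab with respect to the coordinate x_c."""
    if a != b:
        return 0.0  # g's with two distinct indices vanish
    xp, xm = list(x), list(x)
    xp[c] += h
    xm[c] -= h
    return (metric_diagonal(xp)[a] - metric_diagonal(xm)[a]) / (2 * h)

def gamma_down(i, j, k, x):
    """Gamma_{i,jk} = 1/2 (d_j g_ik + d_k g_ij - d_i g_jk), indices 0..3."""
    return 0.5 * (dg(i, k, j, x) + dg(i, j, k, x) - dg(j, k, i, x))

point = [2.0, 0.7, 0.3, 0.0]           # r, theta, phi, t
print(gamma_down(0, 3, 3, point))      # compare with eta'/2, which is r = 2 for the trial eta
print(gamma_down(3, 0, 3, point))      # compare with -eta'/2
print(gamma_down(1, 2, 2, point))      # compare with -r^2 sin(theta) cos(theta)
```

Any other smooth trial functions would serve equally well; the check only tests the dependence of the $\Gamma$'s on the derivatives of the g's.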
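Next we have to calculate those components
of the Riemann tensor which appear in the
expressions for the components of the contracted
Riemann tensor, i.e., those with the first index
equal to the one before last or those of the
type $R^i_{j,ih}$. We do not write those out but state
that the result of the calculation with their
aid of the components of the contracted Riemann
tensor is, that all these components with two
distinct indices vanish and the others are (the
overall sign, which depends on the convention adopted
for the Riemann tensor, is immaterial for what follows)
$R_{11} = \frac{\eta''}{2\eta} - \frac{\eta'}{4\eta}\left(\frac{\xi'}{\xi} + \frac{\eta'}{\eta}\right) - \frac{\xi'}{r\xi},$
$R_{22} = \frac{1}{\xi} - 1 + \frac{r}{2\xi}\left(\frac{\eta'}{\eta} - \frac{\xi'}{\xi}\right),\quad R_{33} = \sin^2\theta\,R_{22},$
$R_{44} = -\frac{\eta''}{2\xi} + \frac{\eta'}{4\xi}\left(\frac{\xi'}{\xi} + \frac{\eta'}{\eta}\right) - \frac{\eta'}{r\xi}.$
It is more convenient to operate with the mixed
components of the contracted Riemann tensor (although
it is not necessary, and the reader might
for the sake of practice go through the same
calculations using covariant components), and
these are obtained from the last formulas by
multiplication by the corresponding g with upper
indices; we obtain thus
31.4
$R^1_1 = \frac{1}{\xi}\left[\frac{\eta''}{2\eta} - \frac{\eta'}{4\eta}\left(\frac{\xi'}{\xi} + \frac{\eta'}{\eta}\right) - \frac{\xi'}{r\xi}\right],$
$R^2_2 = R^3_3 = \frac{1}{r^2}\left[\frac{1}{\xi} - 1 + \frac{r}{2\xi}\left(\frac{\eta'}{\eta} - \frac{\xi'}{\xi}\right)\right],$
$R^4_4 = \frac{1}{\xi}\left[\frac{\eta''}{2\eta} - \frac{\eta'}{4\eta}\left(\frac{\xi'}{\xi} + \frac{\eta'}{\eta}\right) + \frac{\eta'}{r\eta}\right].$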
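We come now to the ten equations $R^i_j = 0$
that we have to satisfy; six of them, namely,
those in which $i \neq j$ are satisfied identically
because our R's as well as the g's vanish for
distinct indices. Of the remaining four equations
the second and the third are identically
the same because of the equality of the corresponding
values of R in 31.4. Three equations
remain, viz.,
31.5
$\frac{\eta''}{2\eta} - \frac{\eta'}{4\eta}\left(\frac{\xi'}{\xi} + \frac{\eta'}{\eta}\right) - \frac{\xi'}{r\xi} = 0,$
$\frac{1}{\xi} - 1 + \frac{r}{2\xi}\left(\frac{\eta'}{\eta} - \frac{\xi'}{\xi}\right) = 0,$
$\frac{\eta''}{2\eta} - \frac{\eta'}{4\eta}\left(\frac{\xi'}{\xi} + \frac{\eta'}{\eta}\right) + \frac{\eta'}{r\eta} = 0.$
Subtracting the last one from the first we have
31.6    $\frac{\xi'}{\xi} + \frac{\eta'}{\eta} = 0,$ whence $\xi\eta = \text{const.} = c.$
By choosing our unit of time appropriately
we can reduce this constant to 1, so that
31.7    $\xi\eta = 1$
or
$\xi = 1/\eta.$
Using 31.6 and 31.7 in the second of the equations
31.5 we obtain
$1 - \eta - r\eta' = 0$
which gives
31.8    $\eta = 1 - \frac{\gamma}{r}$
where $\gamma$ denotes a constant of integration.
Our field then is given by
31.9    $-ds^2 = \frac{dr^2}{\eta} + r^2\,d\theta^2 + r^2\sin^2\theta\,d\varphi^2 - \eta\,dt^2$
where $\eta$ is given by 31.8.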
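That the solution 31.8, together with $\xi = 1/\eta$, satisfies all three remaining equations and not only the two used in deriving it, may be confirmed by direct substitution. The Python sketch below does this numerically. It assumes the three radial equations in one common form (the overall sign convention of the contracted Riemann tensor does not matter, since only its vanishing is used); the function name and the sample values of r and $\gamma$ are hypothetical.

```python
def field_equation_residuals(r, gamma):
    """Left-hand sides of the three radial field equations for
    eta = 1 - gamma/r and xi = 1/eta; all three should come out zero."""
    eta = 1.0 - gamma / r
    d_eta = gamma / r**2             # eta'
    dd_eta = -2.0 * gamma / r**3     # eta''
    xi = 1.0 / eta
    d_xi = -d_eta / eta**2           # xi'
    # part common to the first and the last equation
    common = dd_eta / (2 * eta) - d_eta / (4 * eta) * (d_xi / xi + d_eta / eta)
    first = common - d_xi / (r * xi)
    second = 1.0 / xi - 1.0 + r / (2 * xi) * (d_eta / eta - d_xi / xi)
    third = common + d_eta / (r * eta)
    return first, second, third

for r in (3.0, 10.0, 250.0):
    print(r, field_equation_residuals(r, gamma=1.0))
```

The printed residuals are zero up to rounding for every r larger than $\gamma$, which illustrates that the first equation imposes no further condition once the other two are satisfied.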
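32. Equations of Geodesics.
We first consider the non-zero geodesics
which correspond to a material particle. We
know that in this case arc-length can be taken
as parameter so that the curve in addition to
the equations 30.2 must satisfy the equation
30.5 which we may write (dots denoting differentiation
with respect to the parameter) as
32.1    $\frac{\dot r^2}{\eta} + r^2\dot\theta^2 + r^2\sin^2\theta\,\dot\varphi^2 - \eta\,\dot t^2 = -1;$
we shall, however, make our discussion slightly
more general and write A in the right hand side
with a view of using the results also in the
case corresponding to a light particle. We
shall discuss this equation together with the
equations 30.2 which become here
32.21    $\ddot r - \frac{\eta'}{2\eta}\,\dot r^2 - r\eta\,\dot\theta^2 - r\eta\sin^2\theta\,\dot\varphi^2 + \frac{\eta\eta'}{2}\,\dot t^2 = 0,$
32.22    $\ddot\theta + \frac{2}{r}\,\dot r\,\dot\theta - \sin\theta\cos\theta\,\dot\varphi^2 = 0,$
32.23    $\ddot\varphi + \frac{2}{r}\,\dot r\,\dot\varphi + 2\cot\theta\,\dot\theta\,\dot\varphi = 0,$
32.24    $\ddot t + \frac{\eta'}{\eta}\,\dot r\,\dot t = 0.$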
The choice of the $\theta$ and $\varphi$ coordinates is at our
disposal. We choose them in such a way that the
initial position of the particle be on the equator
and that the tangent be tangent to the equator.
In this case $\theta = \pi/2$ and $\dot\theta = 0$ at the
initial moment and the second equation shows
that $\theta = \pi/2$ always. Now the last two equations
may be integrated once each and they furnish
32.3    $\dot\varphi = h/r^2,$
32.4    $\dot t = k/\eta,$
where h and k are constants. Together with
these two equations we have to consider the one
corresponding to 32.1; viz.,
32.5    $\frac{\dot r^2}{\eta} + r^2\dot\varphi^2 - \eta\,\dot t^2 = A.$
We simplify our system of equations in the
following way: (a) we eliminate $\dot t$ by means of
32.4; (b) we eliminate differentiation with respect
to the parameter by using $\dot r = (dr/d\varphi)\cdot\dot\varphi$
and 32.3; (c) we introduce as a new variable,
as is customary in celestial mechanics, the inverse
distance $u = 1/r$, instead of r, so that
32.6    $\dot r = -h\,\frac{du}{d\varphi};$
and, (d) we substitute the value for $\eta$ from 31.8.
We obtain in this way a differential equation
between u and $\varphi$; viz.,
32.7    $\left(\frac{du}{d\varphi}\right)^2 + u^2 = \lambda - \frac{A\gamma}{h^2}\,u + \gamma u^3,$
where $\lambda$ is a constant. This equation may be
considered as the equation of the orbit of a
planet.
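The elimination steps (a) to (d) can be checked on numbers. The Python sketch below evaluates $\dot r^2$ directly from the first integrals (with $\theta = \pi/2$, $r^2\dot\varphi = h$, $\eta\dot t = k$, $\eta = 1 - \gamma/r$) and compares it with $h^2(du/d\varphi)^2$ taken from the orbit equation written as $(du/d\varphi)^2 = \lambda - (A\gamma/h^2)u + \gamma u^3 - u^2$. The identification $\lambda = (k^2 + A)/h^2$ is our own working and is an assumption here, as are the function names and the sample values.

```python
def rdot_squared(r, h, k, A, gamma):
    """r-dot squared from  rdot^2/eta + r^2 phidot^2 - eta tdot^2 = A."""
    eta = 1.0 - gamma / r
    phidot = h / r**2      # first integral for phi
    tdot = k / eta         # first integral for t
    return eta * (A - r**2 * phidot**2 + eta * tdot**2)

def dudphi_squared(u, h, k, A, gamma):
    """(du/dphi)^2 from the orbit equation, with lambda = (k^2 + A)/h^2 (assumed)."""
    lam = (k**2 + A) / h**2
    return lam - (A * gamma / h**2) * u + gamma * u**3 - u**2

# sample values in arbitrary units; A = -1 corresponds to a material particle
r, h, k, A, gamma = 10.0, 4.0, 1.05, -1.0, 1.0
left = rdot_squared(r, h, k, A, gamma)
right = h**2 * dudphi_squared(1.0 / r, h, k, A, gamma)   # since rdot = -h du/dphi
print(left, right)
```

The two printed numbers agree up to rounding; the same comparison with A = 0 covers the case of a light particle.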
33. Newtonian Motion of a Planet.
Every reader knows, of course, that accord-
ing to the Newtonian theory a planet moves around
the sun on an ellipse in one of whose foci the
sun is situated, although he may not be in the
possession of a proof of that; we shall not give
a proof of that here either, but we shall dis-
cuss in detail only one feature of the situation.
The vertex of the ellipse which is nearest to
the focus in which the sun is located is called
the perihelion, the other vertex - the aphelion;
the line joining the perihelion and the aphelion
is the major axis, and therefore passes through
the sun. Using the coordinates u and $\varphi$ corresponding
to those of the preceding section we may
say that the perihelion corresponds to the maximum
value of u, and the aphelion to the minimum
value of u, and that the transition from the
maximum to the minimum value of u corresponds to
the change of $\varphi$ by the amount $\pi$. It is this last
fact that we shall deduce from the equations of
motion. We may (corresponding to the fact that
we set $\theta = \pi/2$ in the preceding section) consider
a motion in the xy-plane characterized by the
equations (see 1.1 and 1.3)
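33.1    $\frac{d^2x}{dt^2} = -\frac{Mx}{r^3},\quad \frac{d^2y}{dt^2} = -\frac{My}{r^3}.$
We have now to introduce variables corresponding
to those used above in the Relativity treatment,
i.e., to set
$x = \cos\varphi/u,\quad y = \sin\varphi/u.$
We calculate the first and the second derivatives
of x and y with respect to t, substitute
them into 33.1 and combining the terms with $\cos\varphi$
and those with $\sin\varphi$ we get
$\cos\varphi\left[2u^{-3}\left(\frac{du}{dt}\right)^2 - u^{-2}\frac{d^2u}{dt^2} - u^{-1}\left(\frac{d\varphi}{dt}\right)^2 + Mu^2\right] + \sin\varphi\left[2u^{-2}\frac{du}{dt}\frac{d\varphi}{dt} - u^{-1}\frac{d^2\varphi}{dt^2}\right] = 0,$
$\sin\varphi\left[2u^{-3}\left(\frac{du}{dt}\right)^2 - u^{-2}\frac{d^2u}{dt^2} - u^{-1}\left(\frac{d\varphi}{dt}\right)^2 + Mu^2\right] - \cos\varphi\left[2u^{-2}\frac{du}{dt}\frac{d\varphi}{dt} - u^{-1}\frac{d^2\varphi}{dt^2}\right] = 0.$
Multiplying the first of these equalities by
$\cos\varphi$, the second by $\sin\varphi$ and adding the results
we obtain
33.2    $2u^{-3}\left(\frac{du}{dt}\right)^2 - u^{-2}\frac{d^2u}{dt^2} - u^{-1}\left(\frac{d\varphi}{dt}\right)^2 + Mu^2 = 0$
and then easily (multiplying the first by $\sin\varphi$, the
second by $\cos\varphi$ and subtracting)
$2u^{-2}\frac{du}{dt}\frac{d\varphi}{dt} - u^{-1}\frac{d^2\varphi}{dt^2} = 0.$
The last equation may be written as
$\frac{d^2\varphi/dt^2}{d\varphi/dt} = \frac{2\,du/dt}{u}$
whence
$\log\frac{d\varphi}{dt} = 2\log u + \text{const.}$
and then
33.3    $d\varphi/dt = Hu^2$
where H is a constant of integration. We next
want to eliminate t from 33.2 with the help of
the last formula. Differentiating it we have
$\frac{d^2\varphi}{dt^2} = 2Hu\,\frac{du}{dt};$ further
$\frac{du}{dt} = \frac{du}{d\varphi}\cdot\frac{d\varphi}{dt} = \frac{du}{d\varphi}\,Hu^2,$
$\frac{d^2u}{dt^2} = \frac{d^2u}{d\varphi^2}\left(\frac{d\varphi}{dt}\right)^2 + \frac{du}{d\varphi}\cdot\frac{d^2\varphi}{dt^2} = \frac{d^2u}{d\varphi^2}\,H^2u^4 + 2H^2u^3\left(\frac{du}{d\varphi}\right)^2.$
Substituting into 33.2 we arrive at
$2H^2u\left(\frac{du}{d\varphi}\right)^2 - u^{-2}\left[\frac{d^2u}{d\varphi^2}\,H^2u^4 + 2H^2u^3\left(\frac{du}{d\varphi}\right)^2\right] - u^{-1}H^2u^4 + Mu^2 = 0$
and, after two terms cancel, at
33.4    $\frac{d^2u}{d\varphi^2} + u = \frac{M}{H^2}.$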
This, we easily see, may be obtained by differentiation
from
33.5    $\left(\frac{du}{d\varphi}\right)^2 + u^2 - \frac{2Mu}{H^2} = \alpha$
where $\alpha$ is a constant. This corresponds to equation
32.7 obtained from Relativity theory in the
preceding section; in that last equation we
have, of course, to take A = -1 if we consider
the motion of a planet so that it becomes
33.51    $\left(\frac{du}{d\varphi}\right)^2 + u^2 = \lambda + \frac{\gamma}{h^2}\,u + \gamma u^3$
and we see that the difference is essentially in
only one term. But before we come to the com-
parison of the motions described by these two
equations we have to continue the discussion of
33.5. The character of motion described by it
depends on the values of the constants appearing
in it, and also on the initial conditions. We
begin the discussion by writing 33.5 in the form
33.52    $\left(\frac{du}{d\varphi}\right)^2 = -(u - u_1)(u - u_2)$
where $u_1$ and $u_2$ are the two roots of the polynomial
$u^2 - 2Mu/H^2 - \alpha$. If the two roots are complex,
or equal, the right hand side of 33.52 is
negative and we cannot have real motion. Also
when both roots are negative the right hand side
is negative for positive values of u (and u,
being the inverse distance, must be positive).
The case of one positive and one negative root
corresponds to u changing from 0 to a finite
value and then going back to zero, for instance,
a comet approaching the sun from an infinite
distance and then receding back into infinity.
But we want to treat the case of a planet, and
this will obviously correspond to the only remaining
case; viz., that of two distinct positive
roots. If by $u_1$ we denote the larger and
by $u_2$ the smaller of the two roots it will be
convenient to write our equation as
33.53    $\left(\frac{du}{d\varphi}\right)^2 = (u_1 - u)(u - u_2)$
and we see that a real solution is only possible
when u is between $u_2$ and $u_1$. The motion will
manifest itself in an oscillation of u between
$u_2$ and $u_1$ and the sign of $du/d\varphi$ will change at
these points. The particular question we want
to investigate is, as was mentioned at the beginning
of the section, to what change of $\varphi$ corresponds
one oscillation of u, between $u_2$ and $u_1$
say. In order to find this we solve the equation
for $d\varphi$, obtaining
$d\varphi = \frac{du}{\sqrt{(u_1 - u)(u - u_2)}}$
whence
33.6    $\varphi_1 - \varphi_2 = \int_{u_2}^{u_1}\frac{du}{\sqrt{(u_1 - u)(u - u_2)}};$
a change of variable will help us to evaluate
this integral. We put
33.7    $\frac{u - u_2}{u_1 - u_2} = \sin^2 x;$
when x changes from 0 to $\pi/2$, u will increase
from $u_2$ to $u_1$ as required. We have
$du = 2(u_1 - u_2)\sin x\cos x\,dx,$
$u - u_2 = (u_1 - u_2)\sin^2 x,$
$u_1 - u = u_1 - [u_2 + (u_1 - u_2)\sin^2 x] = (u_1 - u_2)\cos^2 x,$
and the integral becomes
33.8    $\varphi_1 - \varphi_2 = \int_0^{\pi/2} 2\,dx,$
33.9    $\varphi_1 - \varphi_2 = \pi.$
The answer to our question is then, that $\varphi$
changes exactly through $\pi$ while u performs an
oscillation between its minimum and its maximum
values, which corresponds to the fact mentioned
before that the aphelion and the perihelion are
on a straight line with the sun, which fact we
thus proved.
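The same conclusion can be reached without evaluating any integral, by following the second order equation $d^2u/d\varphi^2 + u = M/H^2$ step by step from one apsis to the next. The Python sketch below uses the classical fourth-order Runge-Kutta rule for this purpose; the method, the step size, the function name and the sample values of M, H and the starting value of u are our own choices and are not taken from the text. The starting value must differ from $M/H^2$, since otherwise the orbit is a circle and $du/d\varphi$ never changes sign.

```python
import math

def angle_between_apsides(M, H, u_start, dphi=1e-3):
    """Integrate u'' + u = M/H**2 from an apsis (du/dphi = 0) and return
    the angle phi at which du/dphi vanishes for the next time."""
    def acc(u):
        return M / H**2 - u

    u, v, phi = u_start, 0.0, 0.0      # v stands for du/dphi
    while True:
        k1u, k1v = v, acc(u)
        k2u, k2v = v + 0.5 * dphi * k1v, acc(u + 0.5 * dphi * k1u)
        k3u, k3v = v + 0.5 * dphi * k2v, acc(u + 0.5 * dphi * k2u)
        k4u, k4v = v + dphi * k3v, acc(u + dphi * k3u)
        u_next = u + dphi * (k1u + 2 * k2u + 2 * k3u + k4u) / 6
        v_next = v + dphi * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        if phi > 0 and v * v_next <= 0:
            # linear interpolation for the zero of du/dphi inside this step
            return phi + dphi * v / (v - v_next)
        u, v, phi = u_next, v_next, phi + dphi

swept = angle_between_apsides(M=1.0, H=2.0, u_start=0.4)
print(swept, math.pi)
```

The two printed values agree to many decimals, whatever admissible starting value is chosen.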
34. Relativity Motion of a Planet.
Following this excursion into celestial
mechanics according to Newton we return to our
Relativity formulas which we shall treat by com-
paring them to the formulas derived in the last
section.
At this stage we come again upon a funda-
mental question: we have two theories; the
quantities of one of them have been identified
with measured quantities, and this identifica-
tion proved, in the main, a splendid success; if
the new theory is to be applied successfully, it
is clear that it has essentially to agree with
the old theory with which it may be compared in-
stead of being compared with results of measure-
ment directly; that means that we have to estab-
lish a correspondence between quantities of the
two theories, and we have to expect that the
corresponding quantities of the two theories
obey approximately the same relations. This
correspondence has been anticipated in the pre-
ceding pages by using the same letters for quan-
tities which it is intended to identify. But it
may not be superfluous to remind the reader that
the quantities u, $\varphi$, $\theta$ of the two theories are
not the same; there is a certain arbitrariness
in choosing coordinates in curved space, and especially
obvious it must be in the case of r
(of which u is the inverse); it is possible to
substitute for r some simple function of r, and,
indeed, it has been done; the criterion of correctness
of choice must lie in the success of
the identification.
Next we must identify the constants of the
new theory with those of the old. It would seem
as though we must, in order to reach an agreement,
make $\gamma = 0$ so as to get rid of the last
term of the equation 33.51 by which it differs
essentially from 33.5. But this would annihilate
also the preceding term in the new formula
and so spoil the correspondence altogether. We
must, therefore, ascribe to $\gamma$ a finite value,
but we will expect that it will be small; more
precisely, it will be small in such a way that
the term $\gamma u^3$ will not affect the equation 33.51
essentially or will be small in comparison with
$u^2$. Next, let us compare 32.3 with 33.3. Of
course, the left hand sides differ by the factor
dt/ds, but this is equal (Section 13) to
$1/\sqrt{1 - \beta^2}$ which is, even for motions of planets,
very close to one, so that, in the first approximation
we may identify h with H. Comparing now
33.5 and 33.51 we come to the conclusion that
34.1    $\gamma = 2M$
so that we may write 33.51 as
34.2    $\left(\frac{du}{d\varphi}\right)^2 + u^2 = \alpha + \frac{2Mu}{H^2} + 2Mu^3.$
After we have made these identifications
the situation is then this: if we neglect the
term $2Mu^3$ in the equation, and this term is
negligible in most cases, we have the same
equation of the orbit as in Newtonian mechanics.
This result is very satisfactory; we have been
able to obtain the equations of motion of a
planet without considering any gravitational
forces, as a result of our identification of the
contracted Riemann tensor with the complete tensor.
Still the term $2Mu^3$ is there; the Relativity
theory predicts an orbit that is slightly
different from that predicted by the Newtonian
theory; is the difference within the error of
observation? We shall consider now this ques-
tion, but instead of considering the motion as a
whole, we shall consider only the feature of it
which for the case of Newtonian motion has been
considered in the preceding section; viz., we
shall ask ourselves whether, corresponding to an
oscillation of u between a minimum and a maximum
value, the change in $\varphi$ will be exactly $\pi$. Of
course, we are sure that in the new theory there
will be motions which differ but slightly from
the motion considered in the preceding section,
so that the general character of the motion will
be the same, and u will oscillate between a minimum
and maximum. The value of $(du/d\varphi)^2$ will now
be expressed by a polynomial of degree three the
first two terms of which are
$2Mu^3 - u^2.$
The sum of the three roots of this polynomial is
1/2M so that if $u_1$, $u_2$ denote two roots, viz.,
those two roots which differ but slightly from
the roots denoted in the same way in the preceding
case, the third root will be
$\frac{1}{2M} - u_1 - u_2,$
and the integral corresponding to 33.6 will be
$\int_{u_2}^{u_1}\frac{du}{\sqrt{(u_1 - u)(u - u_2)\,[1 - 2M(u_1 + u_2 + u)]}};$
the same substitution 33.7 as before will be applied.
We only have to calculate
$u_1 + u_2 + u = u_1 + u_2 + u_2 + (u_1 - u_2)\sin^2 x = u_1 + u_2 + u_1\sin^2 x + u_2\cos^2 x,$
so that the integral becomes
$\int_0^{\pi/2}\frac{2\,dx}{\sqrt{1 - 2M(u_1 + u_2 + u_1\sin^2 x + u_2\cos^2 x)}}.$
As we saw before, M is a very small quantity;
before, we neglected it altogether and obtained
$\pi$ for the value of the integral; now, we shall
go to the next approximation; we shall develop
the denominator according to the powers of M and
neglect all terms beyond the second (it would be
a very easy but not a worthwhile matter to estimate
the value of the error); we get in this way,
as an approximate value for $\varphi_1 - \varphi_2$,
$\int_0^{\pi/2} 2\,[1 + M(u_1 + u_2 + u_1\sin^2 x + u_2\cos^2 x)]\,dx,$
or
$\pi + \frac{3\pi M}{2}(u_1 + u_2).$
The new theory predicts then that the angle $\varphi$
will have changed by this amount while the distance
from the sun changes from its minimum to
its maximum; i.e., that the perihelion and aphelion
are not in a straight line with the sun but
that the planet moves through an additional angle
of $\frac{3\pi M}{2}(u_1 + u_2)$ after reaching the position opposite
the one where it was during the perihelion,
before reaching the aphelion. Since the same
situation applies to the motion between an aphelion
and the next perihelion we see that between
two consecutive perihelia the planet will have
moved through an angle $2\pi + 3\pi M(u_1 + u_2)$, or
that the perihelion will have moved through an
angle $3\pi M(u_1 + u_2)$ during one revolution of the
planet. This is a very small amount, and it may
be considered as a correction to the classical
result according to which the planet moves on an
elliptic orbit with the sun in one of the foci.
If a is the major semi-axis and e the eccentricity,
the distance at perihelion is a - ae and the
distance at aphelion a + ae; we have then
$u_1 + u_2 = 1/(a - ae) + 1/(a + ae) = 2/a(1 - e^2),$
and the final formula for the advance of the
perihelion comes out
34.5    $p = \frac{6\pi M}{a(1 - e^2)}.$
Here then we have two predictions: on the old
theory the perihelion will remain fixed in
space; according to the new one it will advance
by p during one revolution. What are the re-
sults? In the case of most planets either this
amount is too small or the position of the peri-
helion too uncertain to permit any decision but
in the case of the planet Mercury it was known
for a long time that there is a discrepancy be-
tween the prediction of the Newtonian theory and
actual observations; and it happens that the
discrepancy is very nearly the amount showing
the discrepancy between the two theories, so
that the theory of Relativity predicts a result
that has been actually observed. This must be
considered as a success of the new theory.
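To see the size of the effect, the following Python sketch does two things. It evaluates the integral $\int_0^{\pi/2} 2\,dx/\sqrt{1 - 2M(u_1 + u_2 + u_1\sin^2 x + u_2\cos^2 x)}$ by the midpoint rule and compares it with the first-order value $\pi + \tfrac{3\pi M}{2}(u_1 + u_2)$; and it inserts numbers for Mercury into $p = 6\pi M/a(1 - e^2)$. The numerical constants (M for the sun expressed as a length, about 1476.6 m, and Mercury's semi-axis, eccentricity and period) are modern values assumed by us; they are not given in the text, and the function names are hypothetical.

```python
import math

def apsidal_angle(M, u1, u2, n=20000):
    """Midpoint-rule value of the relativistic integral for phi_1 - phi_2."""
    step = (math.pi / 2) / n
    total = 0.0
    for i in range(n):
        s2 = math.sin((i + 0.5) * step) ** 2
        total += 2.0 / math.sqrt(1.0 - 2.0 * M * (u1 + u2 + u1 * s2 + u2 * (1.0 - s2)))
    return total * step

def perihelion_advance(M, a, e):
    """Advance of the perihelion per revolution, in radians."""
    return 6.0 * math.pi * M / (a * (1.0 - e**2))

# 1. quadrature against the first-order formula (arbitrary units, small M)
M, u1, u2 = 1e-3, 1.2, 0.8
print(apsidal_angle(M, u1, u2), math.pi + 1.5 * math.pi * M * (u1 + u2))

# 2. Mercury, with assumed modern constants (lengths in metres, period in days)
M_SUN = 1476.6            # G * (solar mass) / c^2
A_MERCURY = 5.7909e10     # major semi-axis
E_MERCURY = 0.2056        # eccentricity
PERIOD_DAYS = 87.969      # one revolution
per_century = perihelion_advance(M_SUN, A_MERCURY, E_MERCURY) * 36525.0 / PERIOD_DAYS
print(math.degrees(per_century) * 3600.0, "seconds of arc per century")
```

With these constants the advance comes out close to 43 seconds of arc per century, which is the order of the discrepancy referred to above. The small difference between the two numbers printed first is the second-order term in M which was neglected in the expansion.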
35. Deflection of Light.
According to Section 29 a light particle
also moves along a geodesic, only in this case
it is a zero geodesic, one along which the tangent
vectors have zero length. The equations
for such a geodesic are the same as for the
other kind with the difference that the parameter
is no longer arc length. As a result we
have to have zero instead of -1 in the right
hand member of equation 32.1, that is to make
A = 0 in the equations 32.5 and 32.7. The equation
of the orbit will therefore be (34.1)
35.1    $\left(\frac{du}{d\varphi}\right)^2 + u^2 = \alpha + 2Mu^3$
which will have to be compared with the same
equation without the term containing $u^3$ which
is an equation of a straight line and characterizes
the propagation of a beam of light on the
old theory. In fact, the equation of a straight
line whose distance from the origin is 1/p and
which is perpendicular to the polar axis is in
our coordinates $u = p\cos\varphi$; we have then
$du/d\varphi = -p\sin\varphi$ and taking the sum of the squares of
the last two expressions we find that they add
up to $p^2$ which we may identify with $\alpha$. Again
the term $2Mu^3$ is very small because the maximum
value u can take is the inverse of the minimum
value of the distance from the center of the
sun, which is the radius of the sun; we
treat the problem again as a perturbation problem,
that is, compare the required solution to
that of the equation without the $2Mu^3$ term.
Again we are interested in the change of the
angle $\varphi$ corresponding to a transition between
the two extreme values of u. We shall be interested
in a beam of light emitted from a star,
arriving into our telescope and passing on its
way very near to the surface of the sun. The
distances of the star and even of the earth from
the sun are very large in comparison with the
minimum distance, and we shall take them $= \infty$.
The maximum value of u, corresponding to the
minimum distance from the sun we shall denote
by $u_0$. Since $du/d\varphi$ changes its sign when the
light particle reaches this point it must vanish
there so that the left hand side of 35.1 reduces
to $u_0^2$, and we have
$u_0^2 = \alpha + 2Mu_0^3;$
we may use $u_0$ instead of $\alpha$ in our equation and
write it in the form
$\left(\frac{du}{d\varphi}\right)^2 + u^2 = u_0^2 - 2Mu_0^3 + 2Mu^3.$
Solving this for $d\varphi$ we find
$d\varphi = \frac{du}{\sqrt{2M(u^3 - u_0^3) - (u^2 - u_0^2)}}.$
We introduce a new variable x letting
$u = u_0\sin x$
and after the substitution develop according to
powers of M and keep only two terms; we thus
get for $d\varphi$ approximately
$d\varphi = \left[1 + Mu_0\,\frac{1 - \sin^3 x}{\cos^2 x}\right]dx;$
if we let x change from 0 to $\pi$, u will change
from zero to $u_0$ and back to zero, just the
change that the inverse distance will experience
during the propagation of the light particle.
The total change of the angle will then
be represented by the integral
$\int_0^{\pi}\left[1 + Mu_0\,\frac{1 - \sin^3 x}{\cos^2 x}\right]dx;$
on the old theory which corresponds to the absence
of the term with M in the equation 35.1
we will have to omit the term with M in this
integral and the result is $\pi$. The approximate
result according to the new theory will differ
from that by
$Mu_0\int_0^{\pi}\frac{1 - \sin^3 x}{\cos^2 x}\,dx = 4Mu_0.$
The beam of light coming to us from a star will
then be deflected by an angle $4M/r_0$ where $r_0$ is
its minimum distance from the sun, compared to
the old theory, or to the beam as it would go if
the sun were absent. If then we observe a star
in a certain position on the sky while the sun
is far away, and then observe the same star when
the sun is near the line of vision; i.e., when
the apparent position of the sun is near the
apparent position of the star, this latter position
must appear shifted away from the sun approximately
by the angle $4M/r_0$. Actual measurements
are possible only during an eclipse of the
sun, because otherwise the light from the sun
drowns out the fainter light from the star, and
are beset with difficulties but the results seem
to be in favor of the prediction.
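The factor 4 in the angle $4M/r_0$ comes from an elementary integral. After the substitution $u = u_0\sin x$ the first-order term in M of $d\varphi$ carries the factor $(1 - \sin^3 x)/\cos^2 x$ (this is our own working of the expansion and should be read as an assumption), and this factor equals $\sin x + 1/(1 + \sin x)$, a form which avoids the indeterminate value at $x = \pi/2$. The Python sketch below integrates it from 0 to $\pi$ by the midpoint rule and then inserts assumed modern values for the sun (M about 1476.6 m as a length, radius about $6.957\times 10^8$ m); neither the constants nor the function name come from the text.

```python
import math

def deflection_factor(n=20000):
    """Midpoint-rule value of the integral over 0..pi of (1 - sin^3 x)/cos^2 x,
    evaluated in the equivalent form sin x + 1/(1 + sin x)."""
    step = math.pi / n
    total = 0.0
    for i in range(n):
        s = math.sin((i + 0.5) * step)
        total += s + 1.0 / (1.0 + s)
    return total * step

factor = deflection_factor()
print(factor)                                   # to be compared with 4

M_SUN = 1476.6        # G * (solar mass) / c^2 in metres (assumed value)
R_SUN = 6.957e8       # radius of the sun in metres (assumed value)
angle = factor * M_SUN / R_SUN                  # deflection for a grazing ray, radians
print(math.degrees(angle) * 3600.0, "seconds of arc")
```

For a ray grazing the surface of the sun the deflection comes out at about one and three quarter seconds of arc, which shows why the measurement is so delicate.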
36. Shift of Spectral Lines.
We come to the third so-called test of the
General Relativity Theory, that is, a case where
the predictions of the theory differ from those
of older theories by an amount exceeding the
error of measurement, thus affording an oppor-
tunity to prove or disprove the advantages of
the new theory.
In this case again we deal with propagation
of light in the gravitational field of the sun,
but this time the source is supposed to be on
the sun itself, and the observer is on the earth,
so that the direction of the beam is that of a
radius of the sun; we take this to mean that
$\theta$ = const. and $\varphi$ = const. We have then according
to 32.5 with A = 0
36.1    $\frac{\dot r^2}{\eta} - \eta\,\dot t^2 = 0.$
This gives
$dr = \pm\,\eta\,dt;$
the double sign corresponds to two possible
senses of the beam: from the sun to the earth
and from the earth to the sun. The former, in
which we are now interested, is characterized by
the property that r increases as t increases;
the ratio dr/dt must therefore be positive, and
since $\eta$ is positive we must take
36.2    $dr = \eta\,dt.$
The orbit is thus determined by the equations
$d\theta = 0,\quad d\varphi = 0,\quad dr = \eta\,dt.$
But we are interested in the color this time,
and color, as we have agreed in Section 16, is
proportional to the time component of the momentum
vector. As the momentum vector we have to
consider the vector of components $dx_i/dp$, where
p is a parameter appearing in the equations of
geodesics, the one with respect to which differentiation
is denoted by a dot. In order to find
this parameter we have to go back to the original
equations of geodesics; 32.21 becomes here
$\ddot r - \frac{\eta'}{2\eta}\,\dot r^2 + \frac{\eta\eta'}{2}\,\dot t^2 = 0;$
equation 36.1 shows that the last two terms cancel
leaving us with
$\ddot r = 0.$
This means that r is a linear function of the
parameter,
$r = ap + b$
so that
$d/dp = a\cdot d/dr.$
The momentum vector $u^i$ is here therefore
$a\,\frac{dr}{dr} = a,\quad a\,\frac{d\theta}{dr} = 0,\quad a\,\frac{d\varphi}{dr} = 0,\quad a\,\frac{dt}{dr} = \frac{a}{\eta},$
the last according to 36.2. What about the value
of a? The answer is that it is not and cannot
be determined by the foregoing discussion.
There are different beams of light which satisfy
all the conditions imposed so far; they differ
in color, and different colors correspond to
different values of a.
Color, according to our definition, is the
time component of the momentum vector; i.e.,
the scalar product of the momentum vector of
light and the unit vector in the time direction.
If we denote the (contravariant) components of
the latter by $T^i$, the condition that it has the time
direction will be given by
$T^1 = T^2 = T^3 = 0$
and the condition that it is a unit vector - by
$g_{\alpha\beta}\,T^\alpha T^\beta = -1$
which, taking into account the relations just
preceding and the values of the g's becomes
$\eta\,(T^4)^2 = 1.$
The scalar product of the vectors $u^i$ and $T^i$ calculated
according to the formula $g_{ij}u^iT^j$ becomes, apart
from sign,
$\frac{a}{\sqrt{\eta}} = a\left(1 - \frac{2M}{r}\right)^{-1/2}$
or, if we expand and keep only the first two
terms,
$a(1 + M/r).$
The color of a beam of light is then not constant
along the beam. We shall compare the color
as it appears near the surface of the sun,
where r is equal to the radius of the sun $r_s$,
and near the surface of the earth, where we may
assume $r = \infty$. The frequencies in these two
cases will be for a given beam of light proportional
to
$1 + M/r$ and 1;
the change in frequency will be proportional to
$(1 + M/r) - 1 = M/r,$
and this will be also the relative change in
frequency.
If now we consider some source of light
near the surface of the sun, whose frequency we
know, the light emitted by it when it is re-
ceived at the surface of the earth will have a
frequency that is less, the amount of the rela-
tive change being given by
M/r.
If then we compare light coming from a terrestrial
source, for instance, emitted by an
atom, and light emitted by a corresponding source
on the sun, for instance, emitted by an atom of
the same kind, we would expect a change of frequency
of the amount M/r. Or, if we compare a
Solar spectrum with a Terrestrial spectrum, the
lines of the former will be shifted toward the
red by the amount M/r. This is the prediction
of the General Relativity Theory.
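The amount M/r is easily put into numbers. The Python sketch below compares the first-order value M/r with the unexpanded expression $(1 - 2M/r)^{-1/2} - 1$ for light leaving the surface of the sun. The constants are assumed modern values (M about 1476.6 m as a length, solar radius about $6.957\times 10^8$ m) and do not appear in the text; the function names are ours.

```python
import math

def relative_shift_exact(M, r):
    """Relative change of frequency between radius r and infinity, unexpanded."""
    return 1.0 / math.sqrt(1.0 - 2.0 * M / r) - 1.0

def relative_shift_first_order(M, r):
    """The same, keeping only the first two terms of the expansion."""
    return M / r

M_SUN = 1476.6        # G * (solar mass) / c^2 in metres (assumed value)
R_SUN = 6.957e8       # radius of the sun in metres (assumed value)
print(relative_shift_first_order(M_SUN, R_SUN))
print(relative_shift_exact(M_SUN, R_SUN))
```

Both values are near two parts in a million, and they differ from each other only in the twelfth decimal, so that nothing is lost by keeping two terms of the expansion.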
Again the experimental evidence seems to
favor this prediction.