
MATHEMATICS OF RELATIVITY

LECTURE NOTES

BY

G. Y. RAINICH

(All Rights Reserved)

(Printed in U. S. A.)

EDWARDS BROTHERS, INC.

Lithoprinters and Publishers ANN ARBOR, MICHIGAN

CONTENTS

Introduction

Chapter I. OLD PHYSICS

1. Motion of a Particle. The Inverse Square Law
2. Two Pictures of Matter
3. Vectors, Tensors, Operations
4. Maxwell's Equations
5. The Stress-Energy Tensor
6. General Equations of Motion. The Complete Tensor

Chapter II. NEW GEOMETRY

7. Analytic Geometry of Four Dimensions
8. Axioms of Four-Dimensional Geometry
9. Tensor Analysis
10. Complications Resulting From Imaginary Coordinate
11. Are the Equations of Physics Invariant?
12. Curves in the New Geometry

Chapter III. SPECIAL RELATIVITY

13. Equations of Motion
14. Lorentz Transformations
15. Addition of Velocities
16. Light Corpuscles, or Photons
17. Electricity and Magnetism in Special Relativity

Chapter IV. CURVED SPACE

18. Curvature of Curves and Surfaces
19. Generalizations
20. The Riemann Tensor
21. Vectors in General Coordinates
22. Tensors in General Coordinates
23. Covariant and Contravariant Components
24. Physical Coordinates as General Coordinates
25. Curvilinear Coordinates in Curved Space
26. New Derivation of the Riemann Tensor
27. Differential Relations for the Riemann Tensor
28. Geodesics

Chapter V. GENERAL RELATIVITY

29. The Law of Geodesics
30. Solar System, Symmetry Conditions
31. Solution of the Field Equations
32. Equations of Geodesics
33. Newtonian Motion of a Planet
34. Relativity Motion of a Planet
35. Deflection of Light
36. Shift of Spectral Lines

INTRODUCTION

Since we are going to deal with applied Mathematics, or Mathematics applied to Physics, we have to state in the beginning the general point of view we take on that subject. A mathematical theory consists of statements or propositions, some of which are written as formulas. Some of these propositions are proved, that is, deduced from others, and some are not; the latter are called definitions and axioms. Furthermore, most mathematical theories, in particular those in which we are interested, deal with quantities, so that the propositions take the form of relations between quantities.

Physical experiments also deal with quantities which are measured in definite prescribed ways, and then empirical relations are established between these measured quantities.

In an application of Mathematics to Physics a correspondence is established between some mathematical quantities and some physical quantities in such a way that the same relationship exists (as a result of the mathematical theory) between mathematical quantities as the experimentally established relation between the corresponding physical quantities. This view is not new; it was emphatically formulated by H. Hertz in the introduction to his Mechanics, and then emphasized again by A. S. Eddington in application to Relativity. The process of establishing the correspondence between the physical and the mathematical quantities we shall, following Eddington, call identification. An identification is successful if the condition mentioned above is fulfilled, viz., if the relations deduced for the mathematical quantities are experimentally proved to exist between the physical quantities with which they have been identified. From this point of view we do not speak of true or false theories, still less of absolute truth, etc.; truth for us is nothing but a successful identification, and it is necessary to say expressly that there may exist at the same time two successful identifications, two theories, each of which may be applied within experimental errors to the known experimental results; and that there may be times when no such theory has been found; and also that an identification which is successful at one time may cease to be so later, when the experimental precision is increased.

Very often it happens that the quantities of a theory are compared not with quantities which are direct results of experiment but with quantities of another, less comprehensive, theory whose identification with experimental quantities has proved successful. In fact, it seldom happens that we have to deal with direct results of experiment, since even an experimental paper usually contains a great amount of theory.

We may consider Geometry as a first attempt at a study of the outside world. It may be considered as a deductive system which reflects (in the sense explained above, that is, of the existence of a correspondence, etc.) very well our experiences with some features of the outside world, namely features connected with the displacements of what we call rigid bodies. We see at once how much is left out in such a study; in the first place, time is almost entirely left out: in trying to bring into coincidence two triangles we are not interested in whether we move one slowly or rapidly; in describing a circle we are not concerned with uniformity of motion. Another important feature that is left out is the distinction between vertical and horizontal lines, although we know that this distinction is a very real one. Then optical and electromagnetic phenomena are left out. We see thus that Geometry is not a complete theory of the outside world; one method of building a complete theory would be to introduce corrections into geometry, to introduce one by one time, gravitation, ether, electricity, to patch it up every time we discover a hole; this is a disrespectful but roughly correct description of the actual development of Mathematical Physics. Another method would be to scrap Geometry, and to build instead a new theory which would be an organic whole, embracing the displacement phenomena as a (very important, to be sure) special case. The purpose of the following discussion is to exhibit such a theory. Before we come to the systematic exposition we want to say something about the plan we are going to follow. It is possible to start with a complete statement of the general theory, and then show how special features may be obtained by specializations and approximations; instead, on didactical grounds, we shall begin with special cases and work up by modifications (these being counterparts of approximations) and generalizations. The essential difference between this procedure and the development of classical Physics is that in the latter Geometry was considered as a fixed basis, not to be affected by the upper structure, and we shall feel free to modify geometry when necessary.

Chapter I. OLD PHYSICS

The purpose of this chapter is to reformulate some of the fundamental equations of mechanics and electrodynamics, and to write them in a new form which constitutes an appropriate basis for the discussion that follows. The contents of the chapter are classical; the modifications which are characteristic of the Relativity theory have not been introduced, but, as was mentioned, the form is decidedly new.

1. Motion of a Particle. The Inverse Square Law.

The fundamental equations of Mechanics of a particle are usually written in the form

1.1    $m\,d^2x/dt^2 = X, \quad m\,d^2y/dt^2 = Y, \quad m\,d^2z/dt^2 = Z.$

Here m denotes the mass of the particle, x,y,z are functions of the time t whose values are the coordinates of the particle at the corresponding time, and X,Y,Z are functions of the coordinates whose values are the components of the force at the corresponding point. This system of equations was the first example of what we may call mathematical physics, and much that is now mathematical physics may be conveniently considered as a result of a development whose germ is the system 1.1. This chapter will be devoted to tracing out some lines of this development.

We begin by writing the equations 1.1 in the form

1.11 dmu/dt = X, dmv/dt = Y, dmw/dt = Z,

where

1.2 u = dx/dt, v = dy/dt, w = dz/dt

are the velocity components. The quantities mu, mv, mw are called the momentum components, and in this form our fundamental equations express the statement that the time rate of change of the momentum is equal to the force, the original statement of Newton. Equations 1.11 are seen to be equivalent to 1.1 if we use the notations 1.2 and the fact, usually tacitly assumed, that the mass of a particle does not change with time, or in symbols

1.12    dm/dt = 0.

In the equations 1.1, x,y,z are usually unknown functions of the time and X,Y,Z are given functions of the coordinates. The situation is then this: first the field has to be described by giving the forces X,Y,Z, and then the motion in the given field is determined by solving equations 1.1 (with some additional initial conditions).
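As a minimal sketch of this two-step situation (it is not part of the original notes), the following Python fragment first describes a field and then determines the motion by integrating equations 1.11 step by step from given initial conditions. The uniform field, the numerical values, and the names `force` and `integrate` are assumptions made only for illustration; the stepping rule is the standard velocity-Verlet scheme, chosen here because it reproduces motion under a constant force exactly.

```python
def force(position):
    # Hypothetical uniform field: the components X, Y, Z are constants.
    return (0.0, 0.0, -9.8)


def integrate(mass, position, velocity, dt, steps):
    """Integrate d(mu)/dt = X, d(mv)/dt = Y, d(mw)/dt = Z by velocity Verlet."""
    x = [float(value) for value in position]
    p = [mass * float(value) for value in velocity]  # momentum components mu, mv, mw
    f = force(x)
    for _ in range(steps):
        # Half step for the momentum, full step for the coordinates.
        p = [pi + 0.5 * dt * fi for pi, fi in zip(p, f)]
        x = [xi + dt * pi / mass for xi, pi in zip(x, p)]
        # Re-evaluate the field at the new point and finish the momentum step.
        f = force(x)
        p = [pi + 0.5 * dt * fi for pi, fi in zip(p, f)]
    return x, [pi / mass for pi in p]


# Initial conditions (assumed values): start at height 10, moving along x.
final_position, final_velocity = integrate(
    mass=2.0, position=(0, 0, 10), velocity=(1, 0, 0), dt=0.01, steps=100
)
print(final_position, final_velocity)
```

For a field that depends on the coordinates only the body of `force` would change; the integration loop is unaffected.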

We shall first discuss fields of a certain simple type. One of the simplest fields of force is the so-called inverse square field. The field has a center, which is a singularity of the field; in it the field is not determined; in every other point of the field the force is directed toward the center (or away from it) and the magnitude of the force is inversely proportional to the square of the distance from the center. As the most common realization of such a field of force we may consider the gravitational field of a mass particle or of a sphere. If Cartesian coordinates with the origin at the center are introduced the force components are

1.3    $X = cx/r^3, \quad Y = cy/r^3, \quad Z = cz/r^3,$

where c is a coefficient of proportionality, negative when we have attraction and positive in the case of repulsion, and

1.4    $r^2 = x^2 + y^2 + z^2;$

taking the sum of squares of X,Y,Z we easily find $c^2/r^4$, so that the magnitude of the force is $|c|/r^2$; the force is inversely proportional to the square of the distance. If the field is produced by several attracting particles the force at every point (outside of the points where the particles are located) is considered to be given by the sum of the forces due to the separate particles. In this case the expressions become quite complicated and it is easier to study the general properties of such fields by using certain differential equations to which the force components are subjected rather than by studying the explicit expressions.

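These differential equations are as follows:

1.51    $\partial X/\partial x + \partial Y/\partial y + \partial Z/\partial z = 0$

1.52    $\partial Y/\partial z - \partial Z/\partial y = 0, \quad \partial Z/\partial x - \partial X/\partial z = 0, \quad \partial X/\partial y - \partial Y/\partial x = 0.$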

The fact that the functions X,Y,Z given by the formulas 1.3 satisfy these equations may be proved by direct substitution; to facilitate calculation we may notice that differentiation of 1.4 gives

1.6    $r\,\partial r/\partial x = x, \quad r\,\partial r/\partial y = y, \quad r\,\partial r/\partial z = z.$

Differentiating the first of 1.3, we have now

$\partial X/\partial x = c/r^3 - (3cx/r^4)\,\partial r/\partial x = c/r^3 - 3cx^2/r^5.$

Substituting this and two analogous expressions into 1.51 we easily verify it: the sum is $3c/r^3 - 3c(x^2 + y^2 + z^2)/r^5$, which vanishes by 1.4. The verification of 1.52 hardly presents any difficulty.
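The direct substitution can also be checked numerically. The Python sketch below approximates the partial derivatives of the inverse square field by central differences and evaluates the left hand sides of 1.51 and 1.52 at one sample point. The value of c, the sample point, the step h, and the helper names `partial`, `divergence` and `curl` are assumptions made for this illustration only.

```python
import math

C = -2.0  # hypothetical coefficient c; a negative value means attraction


def force(x, y, z):
    """Components X, Y, Z of the inverse square field with center at the origin."""
    r = math.sqrt(x * x + y * y + z * z)
    return (C * x / r**3, C * y / r**3, C * z / r**3)


def partial(field, point, component, axis, h=1e-5):
    """Central-difference estimate of d(field[component]) / d(coordinate[axis])."""
    forward = list(point)
    backward = list(point)
    forward[axis] += h
    backward[axis] -= h
    return (field(*forward)[component] - field(*backward)[component]) / (2 * h)


def divergence(field, point):
    # Left hand side of 1.51: dX/dx + dY/dy + dZ/dz.
    return sum(partial(field, point, i, i) for i in range(3))


def curl(field, point):
    # Left hand sides of the three equations 1.52, in the order written there.
    return (
        partial(field, point, 1, 2) - partial(field, point, 2, 1),  # dY/dz - dZ/dy
        partial(field, point, 2, 0) - partial(field, point, 0, 2),  # dZ/dx - dX/dz
        partial(field, point, 0, 1) - partial(field, point, 1, 0),  # dX/dy - dY/dx
    )


sample_point = (0.7, -1.1, 0.4)  # any point other than the center will do
div_value = divergence(force, sample_point)
curl_value = curl(force, sample_point)
print(div_value, curl_value)  # all values vanish up to discretization error
```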

It is known that equations 1.52 give a necessary and sufficient condition for the existence of a function $\varphi$ of which X,Y,Z are partial derivatives. The derivative $\partial X/\partial x$ is then the second derivative of this function, and the system 1.51-1.52 may be replaced by the equivalent system

1.53    $X = \partial\varphi/\partial x, \quad Y = \partial\varphi/\partial y, \quad Z = \partial\varphi/\partial z$

1.54    $\partial^2\varphi/\partial x^2 + \partial^2\varphi/\partial y^2 + \partial^2\varphi/\partial z^2 = 0.$

The last equation is known as the equation of Laplace.

In the particular case where X,Y,Z are given by the formulas 1.3 a function $\varphi$ of which X,Y,Z are partial derivatives is (as it is easy to verify)

$\varphi = -c/r.$

We may say now that the field of force given by 1.3 satisfies the differential equations 1.51, 1.52 or 1.53, 1.54, which express the same thing. It is easy to show that these equations are satisfied not only by the field produced by one particle at the origin, but also by that due to any number of particles: first we notice that if a particle is not at the origin this results only in additive constants in the coordinates, and so does not affect the partial derivatives which appear in the equations 1.51-1.52, which therefore remain true in this case; secondly, these equations are linear and homogeneous, as a consequence of which the sum of two solutions of these equations necessarily is a solution, or if two fields satisfy these equations their sum also satisfies them; if then, as is generally assumed, the field produced by several particles is the sum of the fields due to the individual particles, such a field also satisfies equations 1.51, 1.52.

Conversely, it can be proved that any field satisfying the differential equations 1.51-1.52 may be produced by a (finite or infinite) set of particles each of which acts according to the inverse square law. We shall not prove this fact here (the proof is given in Potential Theory), but we shall show that these equations give us back the inverse square law if we add the condition that the field must be symmetric with respect to one point.

A situation of this character, a situation where we have to solve a system of partial differential equations with the "additional condition" of symmetry, will appear again later, and in order to be clear about its significance then, it is desirable to treat this special case here in detail.

To begin with we take the system 1.51, 1.52 in the form 1.53, 1.54, i.e., we state that X,Y,Z are derivatives of a function $\varphi$. Then from the condition of symmetry of the X,Y,Z field it follows that the field represented by the function $\varphi$ also must be symmetric, i.e., that this function may depend only on the distance from the origin, because two points which are equidistant from the origin are symmetric with respect to it and, therefore, $\varphi$ must have the same value in two such points.

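We have thus to solve equation 1.54 with the additional condition that $\varphi$ depends on x,y,z only through r. Indicating differentiation with respect to r by ' we have

1.56    $\partial\varphi/\partial x = \varphi'\,\partial r/\partial x$, etc.

and

1.57    $\partial^2\varphi/\partial x^2 = \varphi''\,(\partial r/\partial x)^2 + \varphi'\,\partial^2 r/\partial x^2$, etc.

Squaring each of the formulas 1.6 and taking the sum we have

$r^2\,[(\partial r/\partial x)^2 + (\partial r/\partial y)^2 + (\partial r/\partial z)^2] = x^2 + y^2 + z^2 = r^2$

or

$(\partial r/\partial x)^2 + (\partial r/\partial y)^2 + (\partial r/\partial z)^2 = 1;$

on the other hand differentiating each formula 1.6 we have

$(\partial r/\partial x)^2 + r\,\partial^2 r/\partial x^2 = 1$, etc.;

summing these we obtain

$r\,(\partial^2 r/\partial x^2 + \partial^2 r/\partial y^2 + \partial^2 r/\partial z^2) = 2.$

Using now 1.57 we can give the equation 1.54 the form

$\varphi'' + 2\varphi'/r = 0$

or

$(r^2\varphi')' = 0,$

whence

$\varphi' = c/r^2, \quad \varphi = -c/r$

(up to an additive constant), which substituted in 1.56 together with 1.6 gives 1.3. We see thus that the general equations 1.51-1.52 give us all the general information we need about the fields of force in question; we shall call this system the system of equations of a Newtonian field or simply the Newtonian system, although Newton never considered the differential equations that make it up.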
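The radial form of the equation of Laplace lends itself to a quick numerical check. The Python sketch below evaluates the expression $\varphi'' + 2\varphi'/r$ by finite differences, once for $\varphi = -c/r$ and once, for contrast, for the function $r^2$, which is symmetric but is not a solution. The value of c, the sample radius, the step h, and the names `phi` and `radial_laplacian` are assumptions for illustration.

```python
C = 1.5  # hypothetical value of the constant c


def phi(r):
    """Candidate potential -c/r of the inverse square field."""
    return -C / r


def radial_laplacian(f, r, h=1e-4):
    """Finite-difference estimate of f'' + 2 f'/r for a function of r alone."""
    first = (f(r + h) - f(r - h)) / (2 * h)
    second = (f(r + h) - 2 * f(r) + f(r - h)) / (h * h)
    return second + 2 * first / r


# For phi = -c/r the expression vanishes (up to discretization error).
residual = radial_laplacian(phi, 2.0)

# For comparison, r**2 gives 2 + 2*(2r)/r = 6, so it does not satisfy the equation.
counterexample = radial_laplacian(lambda r: r * r, 2.0)

print(residual, counterexample)
```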

We may comment briefly on the mathematical character of the magnitudes and equations we have been dealing with in this section. At every point X,Y,Z may be considered as the components of a vector, the force vector; we have thus a vector at every point of space, and this constitutes what is called a vector field. The function $\varphi$ is an example of a scalar field. These two fields are in the particular relation that the first is derived from the second by differentiation. The vector field satisfies equation 1.51, the left hand side of which is called the divergence of the vector field. To find the divergence of a vector field we take the sum of the derivatives of the components of the vector with respect to the corresponding coordinates, or we differentiate each component with respect to the corresponding coordinate and add up the results. The formula becomes more expressive if we number the coordinates by writing

1.71    $x = x_1, \quad y = x_2, \quad z = x_3$

and also the vector components, viz.,

1.72    $X = X_1, \quad Y = X_2, \quad Z = X_3.$

The expression for the divergence may then be written as

1.8    $\sum_{i=1}^{3} \partial X_i/\partial x_i.$

The operation of forming a divergence is of fundamental importance in what follows.

We abandon now for a while the study of force fields and direct our attention to the left hand sides of the equations of motion.

2. Two Pictures of Matter.

Our fundamental equations 1.1 connect matter as represented by the left hand sides with forces as represented by the right hand sides. There seems to be a fundamental difference in the mathematical aspects of matter and force. The quantities characterizing matter, the momentum components, for example, are functions of one variable t, and are subjected to ordinary differential equations, whereas quantities characterizing force, X,Y,Z, are functions of three variables x,y,z and, as a consequence, are subjected to partial differential equations; they are field quantities, whereas the matter components are not; another way of saying this is to say that force seems to be distributed continuously through space but matter seems to be connected with discrete points. This distinction is, however, not as essential as it looks; it is merely the result of the point of view we take. We could very well consider matter to be distributed continuously through space; each of the two theories, the discrete theory, according to which matter consists of discrete particles, or material points, each of which carries a finite mass, and the continuous theory, according to which matter is distributed continuously through space, or certain portions of space, may be considered as the limiting case of the other. We may start with material points, then increase their number while at the same time decreasing the mass of each, and so approximate with any degree of precision a given continuous distribution; or we may start with a continuous distribution, then make the density decrease everywhere except in the constantly decreasing neighborhoods of a discrete number of points, and thus approximate with any degree of precision a given discrete distribution. It is clear that there cannot be any question as to which of the two theories is correct, since the difference between the two can be made as small as we please, and therefore the predictions based on the two theories can be made to agree as closely as we may wish, so that if one identification is successful within experimental error the other will be likewise. Mathematically the difference will be largely that between ordinary differential equations, which are used in treating the motion of discrete particles, and partial differential equations, which apply to continuous distributions.

We may remark here that although forces are usually considered to be continuously distributed in space, it is possible to introduce a discrete picture here also; this is actually done sometimes in the electromagnetic theory, when a field of force is represented by discrete lines of force, and the intensity is characterized by the number of lines per square inch; we shall not, however, have occasion to use this picture.

We may also remark here that in the last few years still a third point of view has appeared (in Quantum Theory) which in a way occupies an intermediate position; mathematically the treatment is that used in the continuous case (partial differential equations), but the interpretation is given in terms of discrete particles, the continuous quantities being considered as probabilities of a particle being within a certain volume, and the like. This point of view also will not be used in what follows, and is mentioned here only for the sake of completeness.

We want now to translate the equations of motion 1.11 into the language of the continuous theory. Each point of space (or of a certain portion of space) will be considered as occupied at each moment (or each moment during a certain period) by a material particle. Here we also denote by u,v,w the velocity components of a particle of matter, but here they are considered as functions of the coordinates as well as of time; by

u(x,y,z,t), v(x,y,z,t), w(x,y,z,t)

we understand the velocity components of a particle which at the time t occupies the position x,y,z. The fundamental quantity in this theory corresponding to the mass of the discrete theory is density. A particle does not possess any finite mass; a mass corresponds only to a finite volume (at a given time). To a point (at a given time) we assign a density which may be explained as the limit of the mass of a sphere with the center at the given point divided by the volume of that sphere as the radius of the sphere tends toward zero. A better way of putting it is to say that we consider a point function $\rho(x,y,z,t)$ called the density and that the mass of matter occupying a given volume at a given time is the integral

$\iiint \rho(x,y,z,t)\,dx\,dy\,dz$

extended over the given volume. This integral will, in general, be a function of time; the mass in a given volume changes with time because new matter may be coming in and old matter going out, and they do not exactly balance each other. But if we consider a certain volume at a given time, and then consider at other moments the volume which is occupied by the same matter, then the mass of matter in that new volume must be the same. That means that if we consider x,y,z as functions of t, namely, as the coordinates of the same particle of matter at different times, and if we consider the region of integration as a variable volume but one that is occupied by the same particles of matter at all times, then the integral must be independent of time, or

2.1    $\frac{d}{dt}\iiint \rho(x,y,z,t)\,dx\,dy\,dz = 0.$

This may be written also in a differential form as

2.2    $\partial(\rho u)/\partial x + \partial(\rho v)/\partial y + \partial(\rho w)/\partial z + \partial\rho/\partial t = 0.$

We indicate two proofs for this fact; first an easy but not rigorous proof.

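For an infinitesimal volume V = dx·dy·dz we may consider the density as the same in all points of the volume, so that the mass will be the product V·$\rho$ and the derivative of this product will be

$dV/dt\cdot\rho + V\cdot d\rho/dt.$

Then again, considering V as a product dx·dy·dz we find

$dV/dt = d(dx)/dt\cdot dy\,dz + dx\cdot d(dy)/dt\cdot dz + dx\,dy\cdot d(dz)/dt.$

Now set $dx = x_2 - x_1$; then

$d(dx)/dt = d(x_2 - x_1)/dt = dx_2/dt - dx_1/dt = u_2 - u_1 = \partial u/\partial x\cdot dx,$

where $u_1$, $u_2$ here denote the values of u at the two ends of the interval dx; substituting this and the analogous expressions in the preceding relation we find

$dV/dt = V\,(\partial u/\partial x + \partial v/\partial y + \partial w/\partial z),$

and since

$d\rho/dt = \partial\rho/\partial x\cdot dx/dt + \partial\rho/\partial y\cdot dy/dt + \partial\rho/\partial z\cdot dz/dt + \partial\rho/\partial t = \partial\rho/\partial x\cdot u + \partial\rho/\partial y\cdot v + \partial\rho/\partial z\cdot w + \partial\rho/\partial t,$

the expression for the derivative of mass gives

$dm/dt = V\,(\partial(\rho u)/\partial x + \partial(\rho v)/\partial y + \partial(\rho w)/\partial z + \partial\rho/\partial t)$

so that constancy of mass is expressed by the condition 2.2.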
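To see condition 2.2 at work on a concrete case, the Python sketch below evaluates its left hand side by central differences for two flows with the same time-independent density. Both flows, the density, the constant `OMEGA`, and all function names are hypothetical examples chosen for illustration and do not come from the notes: a rigid rotation about the z axis carries the density along its own level surfaces and satisfies 2.2, whereas a radial outflow with the same unchanging density does not.

```python
import math

OMEGA = 0.8  # hypothetical angular velocity of the rotation


def density(x, y, z, t):
    # Time-independent density depending only on the distance from the z axis.
    return math.exp(-(x * x + y * y))


def rotation(x, y, z, t):
    # Velocity components u, v, w of a rigid rotation about the z axis.
    return (-OMEGA * y, OMEGA * x, 0.0)


def radial(x, y, z, t):
    # Velocity components of a radial outflow.
    return (x, y, z)


def partial(f, args, axis, h=1e-5):
    """Central-difference derivative of f with respect to args[axis]."""
    forward = list(args)
    backward = list(args)
    forward[axis] += h
    backward[axis] -= h
    return (f(*forward) - f(*backward)) / (2 * h)


def continuity_residual(rho, velocity, x, y, z, t):
    """Left hand side of 2.2 for the given density and velocity field."""
    args = (x, y, z, t)
    total = partial(rho, args, 3)  # d(rho)/dt
    for axis in range(3):
        flux = lambda x, y, z, t, axis=axis: rho(x, y, z, t) * velocity(x, y, z, t)[axis]
        total += partial(flux, args, axis)  # d(rho u)/dx, d(rho v)/dy, d(rho w)/dz
    return total


residual_rotation = continuity_residual(density, rotation, 0.5, -0.3, 0.2, 1.0)
residual_radial = continuity_residual(density, radial, 0.5, -0.3, 0.2, 1.0)
print(residual_rotation, residual_radial)
```

The nonzero second value says only that matter flowing outward cannot leave the density unchanged in time.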

A rigorous proof would be based on expressing the integral for the moment t', which may be written as $\iiint \rho(x',y',z',t')\,dx'\,dy'\,dz'$, using as variables of integration the coordinates of the corresponding particles at the moment t. The formulas of transformation would be

2.3    $x' = f(x,y,z,t'), \quad y' = g(x,y,z,t'), \quad z' = h(x,y,z,t')$

where f(x,y,z,t') is the x coordinate at the moment t' of the particle which at the moment t was at x,y,z, etc. Using the formulas of transformation of a multiple integral we would obtain

$\iiint \rho(f,g,h,t')\,J\,dx\,dy\,dz$

where J is the Jacobian of the functions 2.3 and the integration is over the volume occupied at the moment t. Setting the derivative of this integral with respect to t' equal to zero, and then making t' = t and noticing that $\partial f/\partial t'$ for t' = t is the velocity component u, etc., we would find the same equation 2.2.

This equation is called the "continuity equation of matter" or the "equation of conservation of matter." The corresponding equation in the discrete theory is the equation (1.12) dm/dt = 0, which is not usually included among the fundamental equations of mechanics. The continuity equation may be written in a very simple form if we use the index notations for the coordinates introduced before (1.71), introduce analogous notations for the velocity components, viz.,

2.4    $u = u_1, \quad v = u_2, \quad w = u_3,$

and in addition write

2.5    $t = x_4$

and agree, when it is convenient, to write $u_4$ for unity so that

2.41    $1 = u_4.$

With these notations the continuity equation becomes

2.21    $\sum_{i=1}^{4} \partial(\rho u_i)/\partial x_i = 0.$

Noticing the analogy of this equation with equation 1.8 we are tempted to say that the continuity equation expresses the fact that the "divergence" of the "vector" of components $\rho u_i$ is zero. This involves, of course, a generalization of the conceptions divergence and vector, because the summation here goes from one to 4 instead of to 3 as in the above formula. This generalization will be of extreme importance in what follows. In the meantime we may notice that the divergence of the vector $\rho u_i$ plays the same role in the continuous theory as the time derivative of the number m played in the discrete theory.

We now continue the translation of the equations of the discrete theory into the continuous language. The equations of motion express the fact that the time derivatives of the momentum components are equal to the force components; limiting our consideration to the left hand sides of the equations we have therefore to consider the time derivatives of the momentum components; in the first place the time derivative of mu; without repeating the reasoning which led us to the continuity equation of matter, and noticing that the only change consists in replacing m by mu, we find that the time rate of change of the first component of the momentum vector will be here

2.61    $\partial(\rho uu)/\partial x + \partial(\rho uv)/\partial y + \partial(\rho uw)/\partial z + \partial(\rho u)/\partial t$

and the analogous expressions for the other components will be

2.62    $\partial(\rho vu)/\partial x + \partial(\rho vv)/\partial y + \partial(\rho vw)/\partial z + \partial(\rho v)/\partial t$

2.63    $\partial(\rho wu)/\partial x + \partial(\rho wv)/\partial y + \partial(\rho ww)/\partial z + \partial(\rho w)/\partial t.$

These expressions will have to be set equal to the force components (or, rather, components of the force density) in order to obtain the equations of motion. Such equations were obtained by Euler for the motion of a fluid and are referred to as Euler's hydrodynamic equations; but at present we are not so much interested in the equations of motion as in the mathematical structure of the expressions involving matter components that we have written down. An attentive inspection will help to discover a far-reaching symmetry which again finds its best expression if we use the index notations introduced above (1.71, 2.4, 2.5). We may write, in fact, for the last three expressions

2.6    $\sum_{i=1}^{4} \partial(\rho u_i u_j)/\partial x_i, \quad j = 1,2,3$

and we note furthermore that if we let j here take the value 4 we obtain the expression appearing on the left hand side of the equation of continuity (2.21). We come thus to the idea of considering the quantities

2.7    $M_{ij} = \rho u_i u_j$

and we see that the expression

2.65    $\sum_{i=1}^{4} \partial M_{ij}/\partial x_i, \quad j = 1,2,3,4$

plays a very important part in our theory. The first three components, i.e., the expressions obtained for j = 1,2,3, give the time rate of change of the momentum components, and the last one, obtained by setting j = 4, gives the expression whose vanishing expresses conservation of mass. The expressions 2.65 appear as a generalization of what we call a divergence, and we shall call it divergence also, but it is clear that the whole structure of our expressions deserves a closer study to which we shall devote our next section.
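The four expressions 2.65 can be evaluated mechanically once density and velocity are given as functions of the four variables. The Python sketch below does this by central differences for a hypothetical example that is not taken from the notes: matter expanding uniformly and freely from the origin, with u = x/t, v = y/t, w = z/t and a density proportional to $1/t^3$. Since every particle of this flow moves with constant velocity, all four expressions are expected to vanish; the constant `K`, the sample point, and the function names are assumptions.

```python
K = 5.0  # hypothetical constant fixing the amount of matter


def density(x):
    """rho at the four-dimensional point x = (x1, x2, x3, x4), where x4 = t."""
    return K / x[3] ** 3


def velocity(x):
    """Components u1, u2, u3 of a uniform expansion, with u4 = 1 as in 2.41."""
    t = x[3]
    return (x[0] / t, x[1] / t, x[2] / t, 1.0)


def matter_tensor(i, j, x):
    """Component M_ij = rho * u_i * u_j of 2.7 (indices run over 0..3 here)."""
    u = velocity(x)
    return density(x) * u[i] * u[j]


def partial(f, x, axis, h=1e-5):
    """Central-difference derivative of f with respect to the coordinate x[axis]."""
    forward = list(x)
    backward = list(x)
    forward[axis] += h
    backward[axis] -= h
    return (f(forward) - f(backward)) / (2 * h)


def tensor_divergence(x):
    """The four sums over i of dM_ij/dx_i, one for each value of j."""
    result = []
    for j in range(4):
        total = 0.0
        for i in range(4):
            total += partial(lambda p, i=i, j=j: matter_tensor(i, j, p), x, i)
        result.append(total)
    return result


point = (1.0, 2.0, 3.0, 2.0)  # x, y, z and the time t
divergence_components = tensor_divergence(point)
print(divergence_components)
```

The first three entries correspond to the momentum components and the last to the continuity expression; Python counts indices from 0, so j = 4 of the text is index 3 in the code.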

3. Vectors, Tensors, Operations.

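We shall later treat the fundamental concepts of vector and tensor analysis in a systematic way. At present we shall show how the language of this theory, which for ordinary space has been partly introduced in section 1, can be applied to the case of four independent variables and extended so as to furnish a simple way of describing the relations introduced in the preceding section.

A quantity like $\rho$ which depends on the independent variables x,y,z,t we shall call a scalar field. The four quantities $u_1, u_2, u_3, u_4$ we shall consider as the components of a vector (or of a vector field; the latter, if we want to emphasize the dependence on the independent variables). The sixteen quantities $\rho u_i u_j$ furnish an example of a tensor (or tensor field). A convenient way to arrange the components of a tensor is in a square array; for instance,

3.1
$\begin{array}{cccc}
\rho u_1u_1 & \rho u_1u_2 & \rho u_1u_3 & \rho u_1u_4 \\
\rho u_2u_1 & \rho u_2u_2 & \rho u_2u_3 & \rho u_2u_4 \\
\rho u_3u_1 & \rho u_3u_2 & \rho u_3u_3 & \rho u_3u_4 \\
\rho u_4u_1 & \rho u_4u_2 & \rho u_4u_3 & \rho u_4u_4
\end{array}$

Another tensor is obtained by differentiating each component of the vector $u_i$ with respect to each of the independent variables:

3.2
$\begin{array}{cccc}
\partial u_1/\partial x & \partial u_2/\partial x & \partial u_3/\partial x & \partial u_4/\partial x \\
\partial u_1/\partial y & \partial u_2/\partial y & \partial u_3/\partial y & \partial u_4/\partial y \\
\partial u_1/\partial z & \partial u_2/\partial z & \partial u_3/\partial z & \partial u_4/\partial z \\
\partial u_1/\partial t & \partial u_2/\partial t & \partial u_3/\partial t & \partial u_4/\partial t
\end{array}$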

We want to mention here a very important tensor the array of whose components is

3.4
$\begin{array}{cccc}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{array}$

its components are usually denoted by $\delta_{ij}$, so that $\delta_{ij}$ is one if the indices have the same value, and zero when they are different. The $\delta_{ij}$ are often referred to as the Kronecker symbols.

A tensor has been obtained above from a vector by differentiation; the same process can also be applied to a scalar in order to obtain a vector. From the scalar $\rho$ we would thus obtain a vector whose components are $\partial\rho/\partial x$, $\partial\rho/\partial y$, $\partial\rho/\partial z$, $\partial\rho/\partial t$; this vector is often called the gradient of the scalar $\rho$. On the other hand the same process may be applied to a tensor. For instance, differentiating each of the sixteen components $M_{ij} = \rho u_i u_j$ introduced in 2.7 with respect to each of the independent variables $x_k$ we obtain the 16 × 4 numbers (or functions) $\partial M_{ij}/\partial x_k$. We call these numbers the components of a tensor of rank three, and we may now call what we called simply a tensor a tensor of rank two, a vector a tensor of rank one, and a scalar a tensor of rank zero. The operation of differentiation leads then from a tensor (or, better, a tensor field) to a tensor of the next higher rank.

It is also convenient to introduce another operation, the operation of contraction; it can be applied to a tensor of at least rank two and it lowers the rank by two; for a tensor of rank two it consists in forming the sum of all the components whose indices are equal, or, if the components are arranged as explained before, in taking the sum of all the components in the main diagonal.

The operation of taking the divergence of a vector (field) may be stated now to consist of the operation of differentiation followed by the operation of contraction applied to the resulting tensor of rank two.

A tensor of rank three may be contracted, in general, in three different ways; in general, a tensor of higher rank in as many ways as there are pairs of indices. To contract a tensor with respect to two of its indices means to take the sum of those components in which the two selected indices have the same values; for instance, $\partial M_{ij}/\partial x_k$ is a tensor of rank three; its contraction with respect to the first and the third indices is the sum $\sum_i \partial M_{ij}/\partial x_i$; the index j is allowed to take all the four values 1,2,3,4 so that we have four sums which are considered as the components of a vector, the divergence of the tensor $M_{ij}$.
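A small Python sketch may make the bookkeeping of contraction concrete. Tensors are stored as nested lists, so that indices are counted from 0 rather than from 1; the sample tensor of rank three is built as a product $a_i b_j c_k$ of three vectors so that each of its three contractions can be recognized at sight. The vectors and the function names are assumptions made for illustration.

```python
def kronecker(n=4):
    """Array of the Kronecker symbols: 1 on the main diagonal, 0 elsewhere."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def contract_rank2(t):
    """Contraction of a tensor of rank two: the sum along the main diagonal."""
    return sum(t[i][i] for i in range(len(t)))


def outer3(a, b, c):
    """Tensor of rank three with components a_i * b_j * c_k."""
    return [[[ai * bj * ck for ck in c] for bj in b] for ai in a]


def contract_rank3(t, pair):
    """Contract a rank-three tensor over the given pair of index positions."""
    n = len(t)
    result = []
    for free in range(n):
        total = 0
        for s in range(n):
            if pair == (0, 1):
                total += t[s][s][free]
            elif pair == (0, 2):
                total += t[s][free][s]
            elif pair == (1, 2):
                total += t[free][s][s]
            else:
                raise ValueError("pair must be (0, 1), (0, 2) or (1, 2)")
        result.append(total)
    return result


a = [1, 2, 0, 1]
b = [3, 0, 1, 2]
c = [2, 1, 1, 0]
t = outer3(a, b, c)

trace_of_delta = contract_rank2(kronecker())   # 4
first_second = contract_rank3(t, (0, 1))       # (a.b) * c = [10, 5, 5, 0]
first_third = contract_rank3(t, (0, 2))        # (a.c) * b = [12, 0, 4, 8]
second_third = contract_rank3(t, (1, 2))       # (b.c) * a = [7, 14, 0, 7]
print(trace_of_delta, first_second, first_third, second_third)
```

Each contraction of the rank-three tensor leaves one free index and so yields a vector, in agreement with the rule that contraction lowers the rank by two.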

We may finally mention the operation of multiplication which has been applied several times in what precedes. The vector $\rho u_i = q_i$ has been obtained as a result of multiplication of the vector $u_i$ by the scalar $\rho$. The tensor $M_{ij}$ has been obtained by multiplying the vector $q_i$ by the vector $u_j$ (every component of the first by every component of the second; that is why we have to use different indices; i and j are supposed to take, independently of each other, the values 1,2,3,4).

The operation of contraction introduced above will be performed very often; it is convenient, therefore, to simplify our notation; this simplification consists in omitting the symbol of summation, and in indicating that summation takes place by using Greek letters for indices with respect to which we sum. Thus we shall write

3.5    $\partial(\rho u_\alpha)/\partial x_\alpha = 0$

and

3.6    $\partial M_{\alpha j}/\partial x_\alpha$

for 2.21 and 2.65 respectively. The first gives an example of a divergence of a vector, the second of a divergence of a tensor.

The Greek index in the above formulas has no numerical value; any other Greek letter would do just as well; in this respect a Greek index may be compared with the variable of integration in a definite integral. The only case when we have to pay some attention to the particular Greek letters we are using is when two (or more) summations occur in one expression; in such a case different Greek letters have to be used for every summation. If we have to write, for example, $(\sum_i x_i y_i)^2$ using Greek indices we could write it as $(x_\alpha y_\alpha)^2$, but if we want to write out the two factors instead of using the exponent we have to write $x_\alpha y_\alpha x_\beta y_\beta$, because $(x_\alpha y_\alpha)(x_\alpha y_\alpha)$ would have meant $\sum_\alpha (x_\alpha y_\alpha)^2$.
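The difference between the two ways of writing the indices is easily seen on numbers. In the Python sketch below (the sample vectors are arbitrary and the variable names are assumptions), the loop variables `alpha` and `beta` play the part of the Greek indices.

```python
x = [1, 2, 3]
y = [4, 5, 6]
n = len(x)

# (x_alpha y_alpha)^2: one summation, then the square.
square_of_sum = sum(x[alpha] * y[alpha] for alpha in range(n)) ** 2

# x_alpha y_alpha x_beta y_beta: two independent summations, distinct letters.
two_letters = sum(
    x[alpha] * y[alpha] * x[beta] * y[beta]
    for alpha in range(n)
    for beta in range(n)
)

# The same letter used throughout: a single summation of the squares.
one_letter = sum(x[alpha] * y[alpha] * x[alpha] * y[alpha] for alpha in range(n))

print(square_of_sum, two_letters, one_letter)  # 1024 1024 440
```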

The operation of contraction is used quite often. The formation of the scalar product of two vectors u^ and vi may be considered as re- sulting from their multiplication followed by contraction; the multiplication gives the ten- sor of the second rank UjV*, and contracting this we get uava = u^ + u8va + u3v3, which is the scalar product; the scalar product of two vectors could be also called the contracted product of the two vectors. In an analogous fashion we can form a contracted product of two tensors of the second rank. If the tensors are ajj and b^ the contracted product will be aiabaj » ** i3 also a tensor of rank two. It may be interesting to note that the formation of the contracted product of two tensors is es- sentially the same operation as that of multi- plying two determinants corresponding to the arrays representing the tensors; to see that, it will be enough to consider two three row determinants

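$$\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} \quad\text{and}\quad \begin{vmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \end{vmatrix}.$$

Their product, according to the theorem of multiplication of determinants, is

$$\begin{vmatrix} a_{1\alpha} b_{\alpha 1} & a_{1\alpha} b_{\alpha 2} & a_{1\alpha} b_{\alpha 3} \\ a_{2\alpha} b_{\alpha 1} & a_{2\alpha} b_{\alpha 2} & a_{2\alpha} b_{\alpha 3} \\ a_{3\alpha} b_{\alpha 1} & a_{3\alpha} b_{\alpha 2} & a_{3\alpha} b_{\alpha 3} \end{vmatrix},$$

and it is seen that the elements of this determinant are the components of the tensor of rank two which arises from the tensors $a_{ij}$ and $b_{ij}$ by first multiplying them and then contracting with respect to the two inside indices.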

We could also speak of the contracted square of a tensor, meaning by this the contracted product of a tensor with itself.
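The connection between the contracted product and the multiplication of determinants can be tried on numbers. The sketch below is only an illustration: the function names `contracted_product` and `det3` and the sample arrays are assumptions introduced here, not notation from the text.

```python
def contracted_product(a, b):
    """Return c with c[i][j] = a[i][alpha] * b[alpha][j], summed over alpha."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def det3(m):
    """Determinant of a three-row array by expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

a = [[1, 2, 0], [0, 1, 3], [2, 0, 1]]   # arbitrary sample components
b = [[2, 0, 1], [1, 1, 0], [0, 3, 1]]
c = contracted_product(a, b)

print(det3(a), det3(b), det3(c))         # 13 5 65
print(det3(a) * det3(b) == det3(c))      # True
```

The contracted square of a tensor is obtained from the same function as `contracted_product(a, a)`.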

4. Maxwell's Equations.

In section 1 we discussed from a formal point of view the inverse square law and the fields of a more general nature which can be derived from it, and we expressed the laws of these fields in terms of three-dimensional tensor analysis, i.e., we employed only three independent variables; after that we found that matter is best discussed (from the continuous point of view) by using four-dimensional tensor analysis. We have thus a discrepancy: two different mathematical tools are used in the treatment of the two sides of the fundamental equations of mechanics. This discrepancy will be removed in what follows. It will be removed by considering force fields that differ from those derived by composition of inverse square laws, that is, by modifying this law in a sense; but the modifications will be different in the two cases in which the inverse square law has been applied in older physics, the two cases which we are going to mention now.

Originally, the inverse square law was introduced in the time of Newton in application to gravitational forces; we shall discuss the gravitational phenomena in Chapter V, and see what modifications (radical in nature, but very slight as far as numerical values are concerned) the inverse square law will suffer. Later it was recognized that the inverse square law applies also to the electrostatic and magnetostatic fields produced by a single electric or magnetic particle. Still later a more general law for electromagnetic fields was introduced by Faraday and Maxwell, which we shall have to study now.

If X, Y, Z denote the components of electric force, then in the static symmetric case, as we just said, the inverse square law applies, and, as shown in section 1, it follows under the assumption of additivity that for a field produced by any number of particles the divergence vanishes,

4.1   $\partial X/\partial x + \partial Y/\partial y + \partial Z/\partial z = 0,$

and the quantities $\partial Y/\partial z - \partial Z/\partial y$, $\partial Z/\partial x - \partial X/\partial z$, $\partial X/\partial y - \partial Y/\partial x$ also vanish. A static magnetic field does not interact with the electric field, but when a changing magnetic field is present the laws of the electric field are modified; viz., the quantities just mentioned are not zero any more but are proportional or, in appropriately chosen units, equal to the time derivatives of the components L, M, N of the magnetic field, so that we have, in addition,

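4.11
$$\begin{aligned} \partial Y/\partial z - \partial Z/\partial y &= \partial L/\partial t, \\ \partial Z/\partial x - \partial X/\partial z &= \partial M/\partial t, \\ \partial X/\partial y - \partial Y/\partial x &= \partial N/\partial t. \end{aligned}$$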

In a similar fashion, the divergence of the magnetic force vanishes,

4.2   $\partial L/\partial x + \partial M/\partial y + \partial N/\partial z = 0,$

and the expressions $\partial M/\partial z - \partial N/\partial y$, $\partial N/\partial x - \partial L/\partial z$, $\partial L/\partial y - \partial M/\partial x$ are proportional to the time derivatives of the electric components. The factor of proportionality, however, cannot be reduced to one; by an appropriate choice of units it can be reduced to minus one, and no changing of directions or sense of coordinate axes can permit us to get rid of this minus sign without introducing a minus sign in the preceding equations. This minus sign is of extreme importance in what follows, as we shall have occasion to observe many times; in the meantime we write out the remaining equations

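4.21
$$\begin{aligned} \partial M/\partial z - \partial N/\partial y &= -\partial X/\partial t, \\ \partial N/\partial x - \partial L/\partial z &= -\partial Y/\partial t, \\ \partial L/\partial y - \partial M/\partial x &= -\partial Z/\partial t. \end{aligned}$$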

Several remarks must be made here concerning these equations, which will be referred to as Maxwell's equations.

In the first place, these equations cannot be proved; they have to be regarded as the fundamental equations of a mathematical theory, whose justification lies in the fact that its quantities have been successfully identified with measured quantities of Physics, in the sense that for physical quantities the same relationships have been established experimentally as those deduced for the corresponding theoretical quantities from the fundamental equations. In the second place, the equations as they appear above present a simplified and idealized form of the fundamental equations, namely the fundamental equations for the case of free space, i.e., complete absence of matter.

In the third place, the choice of units which made the above simple form possible concerns not only units of electric and magnetic force, but also units of length and time; it was necessary to choose them in such a way that the velocity of light, which in ordinary units is 300,000 kilometers per second, becomes one. As a result of this, ordinary velocities, those we observe in everyday life, are expressed by very small quantities.

For our purposes it is convenient to arrange our equations in the following form, where differentiation with respect to a variable is indicated by a subscript,

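4.3
$$\begin{aligned} N_y - M_z - X_t &= 0, & Z_y - Y_z + L_t &= 0, \\ L_z - N_x - Y_t &= 0, & X_z - Z_x + M_t &= 0, \\ M_x - L_y - Z_t &= 0, & Y_x - X_y + N_t &= 0, \\ X_x + Y_y + Z_z &= 0, & L_x + M_y + N_z &= 0. \end{aligned}$$

The four equations on the left constitute the first set, the four on the right the second set.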
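As an illustration of the free-space equations 4.3 (first set: $N_y - M_z - X_t = 0$, etc.; second set: $Z_y - Y_z + L_t = 0$, etc.), one may test them on a simple field. The sketch below assumes a plane wave travelling along the x axis with $Y = N = \sin(x - t)$ and all other components zero; this particular field, the function names and the finite-difference step are choices made here for the example and do not come from the text.

```python
import math

def field(x, y, z, t):
    """Hypothetical plane wave moving along x; returns (X, Y, Z, L, M, N)."""
    f = math.sin(x - t)
    return (0.0, f, 0.0, 0.0, 0.0, f)

def d(comp, axis, p, h=1e-5):
    """Central-difference derivative of component comp along axis (0..3 = x, y, z, t)."""
    hi, lo = list(p), list(p)
    hi[axis] += h
    lo[axis] -= h
    return (field(*hi)[comp] - field(*lo)[comp]) / (2 * h)

X, Y, Z, L, M, N = range(6)      # positions of the components in the tuple
x_, y_, z_, t_ = range(4)        # positions of the independent variables

def residuals(p):
    """Left-hand sides of the eight equations 4.3 at the point p = (x, y, z, t)."""
    return [
        d(N, y_, p) - d(M, z_, p) - d(X, t_, p),
        d(L, z_, p) - d(N, x_, p) - d(Y, t_, p),
        d(M, x_, p) - d(L, y_, p) - d(Z, t_, p),
        d(X, x_, p) + d(Y, y_, p) + d(Z, z_, p),
        d(Z, y_, p) - d(Y, z_, p) + d(L, t_, p),
        d(X, z_, p) - d(Z, x_, p) + d(M, t_, p),
        d(Y, x_, p) - d(X, y_, p) + d(N, t_, p),
        d(L, x_, p) + d(M, y_, p) + d(N, z_, p),
    ]

worst = max(abs(r) for r in residuals((0.3, -1.2, 0.7, 0.5)))
print(worst)  # zero up to rounding error
```

The wave moves with velocity one, in agreement with the choice of units in which the velocity of light is one.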

As mentioned before, the above equations describe the behavior of electric and magnetic forces in free space, that is, in regions where there is no matter, or where we may neglect matter. On the discrete theory of matter these equations still hold everywhere except at points occupied by matter; in this theory matter appears as a singularity of the field, and some numerical characteristics of matter, such as electric charge, appear as residues corresponding to these singularities. We shall not discuss this point of view, although mathematically it is very interesting. On the continuous theory of matter some terms which represent matter have to be added to the preceding equations. The second set of Maxwell's equations (4.3) remains unaltered, but the first set is modified: the left hand sides do not vanish any more but are proportional to the velocity components of matter; the coefficient of proportionality is the electric density, which we denote by $\varepsilon$. The equations of Maxwell for space with matter are thus

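4.31
$$\begin{aligned} N_y - M_z - X_t &= \varepsilon u, & Z_y - Y_z + L_t &= 0, \\ L_z - N_x - Y_t &= \varepsilon v, & X_z - Z_x + M_t &= 0, \\ M_x - L_y - Z_t &= \varepsilon w, & Y_x - X_y + N_t &= 0, \\ X_x + Y_y + Z_z &= \varepsilon, & L_x + M_y + N_z &= 0. \end{aligned}$$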

We come thus across a new scalar quantity - electric density. However, in most cases this density is proportional to the mass density $\rho$ which we have considered before, the factor of proportionality being capable of only two numerical values, one negative for negative electricity and the other positive for positive electricity.

Even these equations are not sufficient for the description of electromagnetic phenomena; they correspond to a certain idealization in which the dielectric constant and magnetic permeability are neglected, but we shall not go beyond this idealization.

In the above equations we have four independent variables x, y, z, t, as in the discussion of matter in section 2, and we may try now to apply to them the same notations which have been introduced in that section and in section 3. The main question here is: how to treat the six quantities X, Y, Z, L, M, N? The question was solved by Minkowski in 1907 in the following way. It is clear that a vector has too few components to take care of these quantities; instead of using two vectors, Minkowski proposed to use a tensor of rank two. Of course, a tensor has too many components; to be exact, it has, in the general case, 16 components - four in the main diagonal, six above, and six below. We set those in the main diagonal zero, and those under the main diagonal equal with opposite sign to those above the main diagonal symmetric to them; in this way we are left with six essentially different components. The restriction just introduced is expressed in one formula

4.4   $F_{ij} + F_{ji} = 0;$

in fact, the elements in the main diagonal correspond to equal indices; if we set j = i the above formula becomes $F_{ii} + F_{ii} = 0$, whence $F_{ii} = 0$ as asserted. We try now to identify the components of this tensor with our electromagnetic force components in the following way:

4.5   $X = F_{41},\quad Y = F_{42},\quad Z = F_{43},\quad L = F_{23},\quad M = F_{31},\quad N = F_{12};$

using these notations together with 4.4, according to which, for example, $F_{14} = -X$, we can write the first set 4.3 in the highly satisfactory form,

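4.6
$$\begin{aligned} \partial F_{12}/\partial x_2 + \partial F_{13}/\partial x_3 + \partial F_{14}/\partial x_4 &= 0, \\ \partial F_{21}/\partial x_1 + \partial F_{23}/\partial x_3 + \partial F_{24}/\partial x_4 &= 0, \\ \partial F_{31}/\partial x_1 + \partial F_{32}/\partial x_2 + \partial F_{34}/\partial x_4 &= 0, \\ \partial F_{41}/\partial x_1 + \partial F_{42}/\partial x_2 + \partial F_{43}/\partial x_3 &= 0, \end{aligned}$$

or

$$\partial F_{i\alpha}/\partial x_\alpha = 0.$$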

These four equations show a high degree of symmetry; moreover they show a very pronounced similarity to some of the equations we have been considering in section 2, and for which we prepared a mathematical theory in section 3; we can say that the four equations written above express the fact that the divergence of the tensor $F_{ij}$ just introduced vanishes in the case of free space. However, if we apply the same notations to the second set 4.3 of Maxwell's equations nothing very simple comes out; the minus sign mentioned after formula 4.2 above seems to cause trouble. But there exists a way out also from this difficulty; it has been indicated (before Minkowski's paper) in the work of Poincaré and Marcolongo, and foreshadowed in a private letter by Hamilton as early as 1845. We can overcome the difficulty if we allow ourselves to use imaginary quantities side by side with real quantities - this ought to cause no worry provided we know the formal rules of operations, since our new notations are of an entirely formal nature anyway; we set now, instead of 2.5,

4.7   $x = x_1,\quad y = x_2,\quad z = x_3,\quad it = x_4,$

and instead of 4.5

4.72   $iX = F_{41},\quad iY = F_{42},\quad iZ = F_{43},\quad L = F_{23},\quad M = F_{31},\quad N = F_{12},$

and then the first set (4.3) becomes (4.6) as before, but the second set (4.3) also acquires a highly satisfactory form, namely

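4.61
$$\begin{aligned} \partial F_{23}/\partial x_4 + \partial F_{34}/\partial x_2 + \partial F_{42}/\partial x_3 &= 0, \\ \partial F_{31}/\partial x_4 + \partial F_{14}/\partial x_3 + \partial F_{43}/\partial x_1 &= 0, \\ \partial F_{12}/\partial x_4 + \partial F_{24}/\partial x_1 + \partial F_{41}/\partial x_2 &= 0, \\ \partial F_{23}/\partial x_1 + \partial F_{31}/\partial x_2 + \partial F_{12}/\partial x_3 &= 0, \end{aligned}$$

or

$$\partial F_{ij}/\partial x_k + \partial F_{jk}/\partial x_i + \partial F_{ki}/\partial x_j = 0.$$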

As mentioned before, we consider the components $F_{ij}$ as the components of a tensor. We may say that we have sixteen of them: the six which appear in the relations 4.72, six more which result from them by interchanging the indices and whose values differ from those given in 4.72 only in sign, and four more with equal indices, which according to formula (4.4) are zero. We may arrange them in a square array as follows:

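4.8
$$\begin{matrix} F_{11} & F_{12} & F_{13} & F_{14} \\ F_{21} & F_{22} & F_{23} & F_{24} \\ F_{31} & F_{32} & F_{33} & F_{34} \\ F_{41} & F_{42} & F_{43} & F_{44} \end{matrix} \qquad = \qquad \begin{matrix} 0 & N & -M & -iX \\ -N & 0 & L & -iY \\ M & -L & 0 & -iZ \\ iX & iY & iZ & 0 \end{matrix}$$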

We may compare the property $F_{ij} = -F_{ji}$ which our new tensor has with the property $M_{ij} = M_{ji}$ possessed by the tensor of matter (2.7), which is simply the result of commutativity of multiplication. These two properties are manifested in the square arrays (3.1 and 4.8) in that the components of $M_{ij}$ which are symmetric with respect to the main diagonal are equal, and those of $F_{ij}$ which are symmetric with respect to the main diagonal are opposite. Tensors of the first type are called symmetric, those of the second antisymmetric.
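The array 4.8 (rows $0, N, -M, -iX$; $-N, 0, L, -iY$; $M, -L, 0, -iZ$; $iX, iY, iZ, 0$) and its antisymmetry are easy to reproduce with complex numbers. The following is a minimal sketch; the function name `field_tensor` and the sample field values are assumptions made for the illustration.

```python
def field_tensor(X, Y, Z, L, M, N):
    """Array 4.8 built from 4.72; rows and columns 1..4 are stored as 0..3."""
    i = 1j  # the imaginary unit
    return [
        [0,      N,     -M,    -i * X],
        [-N,     0,      L,    -i * Y],
        [M,     -L,      0,    -i * Z],
        [i * X,  i * Y,  i * Z, 0],
    ]

F = field_tensor(1.0, 2.0, 3.0, 4.0, 5.0, 6.0)  # arbitrary sample values

# Property 4.4: F_ij + F_ji = 0 for every pair of indices.
antisymmetric = all(F[a][b] + F[b][a] == 0 for a in range(4) for b in range(4))
print(antisymmetric)  # True
print(F[3][0])        # 1j, i.e. iX with X = 1
```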

We want to see now whether the notations which permitted us to write Maxwell's equations in a nice form would not spoil the nice form which we previously gave to the hydrodynamical equations. But since here we have at our disposal the quantities $u_1, u_2, u_3, u_4$ we can arrange it so as to offset the $i$ in $x_4$. In fact, since differentiation with respect to time always occurs in the presence of a $u_4$, it is enough to set

4.9   $u_4 = i$

instead of 1, as in 2.41, and everything will be all right as far as the left hand sides of the equations are concerned, except that the left hand side of the continuity equation becomes imaginary. This, however, does not matter, since we temporarily disregard the right hand sides of the equations.

And now we may consider the Maxwell equations with matter 4.31; the second set is not affected and may be written as 4.61, but the first will appear in a form which may be written simply as

4.62   $\partial F_{i\alpha}/\partial x_\alpha = \varepsilon u_i.$

The equation of continuity of matter is a consequence of these equations; to obtain it, differentiate, contract, and use the property $F_{ij} = -F_{ji}$; the result gives the continuity equation of matter if we take into account that $\rho/\varepsilon$ is a constant.
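The step just described may be written out as follows. This is a sketch which assumes that the equations with matter have the form $\partial F_{i\alpha}/\partial x_\alpha = \varepsilon u_i$, with $\varepsilon$ standing for the electric density, and that the order of differentiation may be interchanged.

```latex
% Differentiate with respect to x_i and contract:
\frac{\partial^2 F_{i\alpha}}{\partial x_i\,\partial x_\alpha}
   = \frac{\partial (\varepsilon u_i)}{\partial x_i}.
% Rename the summation indices (i <-> alpha), then use F_{\alpha i} = -F_{i\alpha}:
\frac{\partial^2 F_{i\alpha}}{\partial x_i\,\partial x_\alpha}
   = \frac{\partial^2 F_{\alpha i}}{\partial x_\alpha\,\partial x_i}
   = -\frac{\partial^2 F_{i\alpha}}{\partial x_i\,\partial x_\alpha}
   \quad\Longrightarrow\quad
   \frac{\partial^2 F_{i\alpha}}{\partial x_i\,\partial x_\alpha} = 0.
% Hence, with \rho/\varepsilon constant:
\frac{\partial (\varepsilon u_i)}{\partial x_i} = 0
   \quad\Longrightarrow\quad
   \frac{\partial (\rho u_i)}{\partial x_i} = 0.
```

A quantity equal to its own negative is zero; this is the whole content of the argument.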

5. The Stress-Energy Tensor.

So far the equations to which we subjected force components have been linear equations, whereas operations performed on matter involved squares and products of matter components; the similarity which we observed in the mathematical aspects of force and matter components makes it seem desirable to subject force components to operations analogous to those which we applied to matter, viz., multiplication.

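In the static case we return to our notations (1.72) $X_1, X_2, X_3$ for X, Y, Z and form the tensor $X_i X_j$; this tensor, slightly modified, plays an important part in the theory. The modification consists in subtracting from it $\tfrac{1}{2}\delta_{ij} X_\alpha X_\alpha$, where $\delta_{ij}$ are the components of the tensor introduced in 3.4 and $X_\alpha X_\alpha$ stands for the contracted square of the vector $X_i$, i.e., $X^2 + Y^2 + Z^2$. We consider then the tensor

$$X_i X_j - \tfrac{1}{2}\delta_{ij} X_\alpha X_\alpha$$

whose array is

$$\begin{matrix} \tfrac{1}{2}(X^2 - Y^2 - Z^2) & XY & XZ \\ XY & \tfrac{1}{2}(Y^2 - X^2 - Z^2) & YZ \\ XZ & YZ & \tfrac{1}{2}(Z^2 - X^2 - Y^2). \end{matrix}$$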

Of this tensor we form the divergence (three-dimensional), and find as its components

$$\begin{aligned} & X(X_x + Y_y + Z_z) + Y(X_y - Y_x) + Z(X_z - Z_x), \\ & X(Y_x - X_y) + Y(X_x + Y_y + Z_z) + Z(Y_z - Z_y), \\ & X(Z_x - X_z) + Y(Z_y - Y_z) + Z(X_x + Y_y + Z_z); \end{aligned}$$

the connection of these expressions with the Newtonian equations (1.51, 1.52) is obvious; the expressions in brackets are the left hand sides of the Newtonian equations, so that the divergence of our new tensor $X_i X_j - \tfrac{1}{2}\delta_{ij} X_\alpha X_\alpha$ vanishes as the result of the Newtonian equations. This again confirms us in our opinion that from the mathematical point of view force and matter components are of very similar nature. We have now in mind electric and magnetic forces; if X, Y, Z are the components of the electric force vector, the tensor whose array is written out above is called the "electric stress tensor"; an analogous expression in magnetic components is called the "magnetic stress tensor"; the sum of the two, namely

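$$\begin{matrix} \tfrac{1}{2}(X^2 + L^2 - Y^2 - Z^2 - M^2 - N^2) & XY + LM & XZ + LN \\ XY + LM & \tfrac{1}{2}(Y^2 + M^2 - X^2 - Z^2 - L^2 - N^2) & YZ + MN \\ XZ + LN & YZ + MN & \tfrac{1}{2}(Z^2 + N^2 - X^2 - Y^2 - L^2 - M^2) \end{matrix}$$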

is called the "electromagnetic stress tensor"; it has been introduced by Maxwell and plays some part in electromagnetic theory, for instance, in the discussion of light pressure; but its main applications and importance seem to be in the study of the fundamental questions, as part of a more general four-dimensional tensor.

We saw how nicely the system of indices worked in the case of Maxwell's equations; it is natural to express this tensor also in index notation. We assert that the required expression is given by

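5.1   $E_{ij} = F_{i\rho} F_{\rho j} - \tfrac{1}{4}\delta_{ij} F_{\rho\sigma} F_{\sigma\rho},$

where i, j take on values 1, 2, 3, and the summations indicated by $\rho$ and $\sigma$ are extended from 1 to 4. In fact,

$$\begin{aligned} F_{\rho\sigma} F_{\sigma\rho} &= 2(-L^2 - M^2 - N^2 + X^2 + Y^2 + Z^2), \\ F_{1\rho} F_{\rho 1} &= F_{12}F_{21} + F_{13}F_{31} + F_{14}F_{41} = -N^2 - M^2 + X^2, \\ E_{11} &= -N^2 - M^2 + X^2 + \tfrac{1}{2}L^2 + \tfrac{1}{2}M^2 + \tfrac{1}{2}N^2 - \tfrac{1}{2}X^2 - \tfrac{1}{2}Y^2 - \tfrac{1}{2}Z^2 \\ &= \tfrac{1}{2}(X^2 + L^2 - Y^2 - Z^2 - M^2 - N^2), \end{aligned}$$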

and similarly for the other components corresponding to different combinations of the indices 1, 2, 3 for i and j. There seems to be an inconsistency here: the summation indices $\rho$ and $\sigma$ we let run from 1 to 4, but we consider only the values 1, 2, 3 for i and j. It is interesting to see what comes out if we let i and j take the value 4. We get four new components, namely,

$$\begin{aligned} E_{14} &= F_{1\rho}F_{\rho 4} = F_{12}F_{24} + F_{13}F_{34} = i(MZ - NY), \\ E_{24} &= F_{2\rho}F_{\rho 4} = F_{21}F_{14} + F_{23}F_{34} = i(NX - LZ), \\ E_{34} &= F_{3\rho}F_{\rho 4} = F_{31}F_{14} + F_{32}F_{24} = i(LY - MX), \\ E_{44} &= F_{4\rho}F_{\rho 4} - \tfrac{1}{2}(-L^2 - M^2 - N^2 + X^2 + Y^2 + Z^2) \\ &= \tfrac{1}{2}(X^2 + Y^2 + Z^2 + L^2 + M^2 + N^2). \end{aligned}$$

These quantities also happen to have physical meaning. The first three constitute (except for the factor $-i$, as follows from the array 4.8) the components of the so-called Poynting vector, $NY - MZ$, $LZ - NX$, $MX - LY$, and the last one is the so-called electromagnetic energy (or energy density). We are thus led by the notations we have introduced in a purely mathematical way to some physical quantities; we may say that the entire tensor with its sixteen components - it is called "the electromagnetic stress-energy tensor" - unifies in a single expression all the second degree quantities appearing in the electromagnetic theory: the stress components, the Poynting components, and the energy.

The stress-energy tensor may be written out in the form of the following square array:

5.2
$$\begin{matrix} X^2 + L^2 - h & XY + LM & XZ + LN & i(MZ - NY) \\ XY + LM & Y^2 + M^2 - h & YZ + MN & i(NX - LZ) \\ XZ + LN & YZ + MN & Z^2 + N^2 - h & i(LY - MX) \\ i(MZ - NY) & i(NX - LZ) & i(LY - MX) & h \end{matrix}$$

5.3   where $h = \tfrac{1}{2}(X^2 + Y^2 + Z^2 + L^2 + M^2 + N^2).$
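The passage from the index expression $E_{ij} = F_{i\rho}F_{\rho j} - \tfrac{1}{4}\delta_{ij}F_{\rho\sigma}F_{\sigma\rho}$ to the written-out components can be checked numerically. The sketch below builds $F_{ij}$ from the array 4.8 and evaluates the contraction directly; the function names and the sample field values are assumptions made here. With the signs of 4.8 the mixed components come out as $E_{14} = i(MZ - NY)$, $E_{24} = i(NX - LZ)$, $E_{34} = i(LY - MX)$.

```python
def field_tensor(X, Y, Z, L, M, N):
    """Array 4.8; index 0..3 stands for 1..4."""
    i = 1j
    return [[0, N, -M, -i * X],
            [-N, 0, L, -i * Y],
            [M, -L, 0, -i * Z],
            [i * X, i * Y, i * Z, 0]]

def stress_energy(F):
    """E[i][j] = F[i][r] F[r][j] - (1/4) delta_ij F[r][s] F[s][r], summed over r, s."""
    trace = sum(F[r][s] * F[s][r] for r in range(4) for s in range(4))
    return [[sum(F[i][r] * F[r][j] for r in range(4)) - (trace / 4 if i == j else 0)
             for j in range(4)] for i in range(4)]

X, Y, Z, L, M, N = 1.0, 2.0, 3.0, 4.0, 5.0, 6.0   # arbitrary sample field
E = stress_energy(field_tensor(X, Y, Z, L, M, N))
h = (X * X + Y * Y + Z * Z + L * L + M * M + N * N) / 2

print(abs(E[0][0] - (X * X + L * L - h)))      # 0.0
print(abs(E[0][1] - (X * Y + L * M)))          # 0.0
print(abs(E[0][3] - 1j * (M * Z - N * Y)))     # 0.0
print(abs(E[3][3] - h))                        # 0.0
```

The sample values are exactly representable, so the differences vanish exactly; for general fields one should expect agreement only up to rounding.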

6. General Equations of Motion. The Complete Tensor.

We let ourselves be guided once more by what seems to be natural from the formal point of view, and form the divergence (four-dimensional divergence) of the new tensor. This can be done either in components or in index notation. We show how to do it the latter way, and leave it to the reader to write out the stress-energy tensor as an array and to form the divergence of the separate lines. Applying formula 3.6 to the tensor 5.1 we have

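$$\frac{\partial E_{i\alpha}}{\partial x_\alpha} = \frac{\partial F_{i\gamma}}{\partial x_\alpha} F_{\gamma\alpha} + F_{i\gamma}\frac{\partial F_{\gamma\alpha}}{\partial x_\alpha} - \tfrac{1}{2} F_{\beta\alpha}\frac{\partial F_{\alpha\beta}}{\partial x_i};$$

the first term on the right may be split up in two equal parts, one of which, writing $\beta$ for $\gamma$, may be written as $\tfrac{1}{2} F_{\beta\alpha}\,\partial F_{i\beta}/\partial x_\alpha$; the other, writing $\alpha$ for $\gamma$ and $\beta$ for $\alpha$, and interchanging indices in both factors, which does not affect the value because it amounts to changing the sign twice, takes the form $\tfrac{1}{2} F_{\beta\alpha}\,\partial F_{\alpha i}/\partial x_\beta$. We thus have

$$\frac{\partial E_{i\alpha}}{\partial x_\alpha} = \tfrac{1}{2} F_{\beta\alpha}\left(\frac{\partial F_{i\beta}}{\partial x_\alpha} + \frac{\partial F_{\alpha i}}{\partial x_\beta} + \frac{\partial F_{\beta\alpha}}{\partial x_i}\right) + F_{i\gamma}\frac{\partial F_{\gamma\alpha}}{\partial x_\alpha}.$$

Substituting for the second factors their values from Maxwell's equations 4.61 and 4.62 (in space with matter) we get

$$\frac{\partial E_{i\alpha}}{\partial x_\alpha} = \varepsilon F_{i\gamma} u_\gamma,$$

or in components without indices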

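6.1
$$\begin{aligned} \partial E_{1\alpha}/\partial x_\alpha &= \varepsilon(Nv - Mw + X), \\ \partial E_{2\alpha}/\partial x_\alpha &= \varepsilon(Lw - Nu + Y), \\ \partial E_{3\alpha}/\partial x_\alpha &= \varepsilon(Mu - Lv + Z), \\ \partial E_{4\alpha}/\partial x_\alpha &= \varepsilon i(Xu + Yv + Zw). \end{aligned}$$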
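The written-out components can be compared with the contraction of the field tensor with the velocity components. The sketch below assumes the array 4.8 for $F_{ij}$, the velocity components $u, v, w, i$ (with $u_4 = i$ as in 4.9) and a density `eps`; the function names and all numerical values are arbitrary choices for the illustration.

```python
def field_tensor(X, Y, Z, L, M, N):
    """Array 4.8; index 0..3 stands for 1..4."""
    i = 1j
    return [[0, N, -M, -i * X],
            [-N, 0, L, -i * Y],
            [M, -L, 0, -i * Z],
            [i * X, i * Y, i * Z, 0]]

def force_density(F, u, v, w, eps):
    """eps * F[i][g] * u_g, summed over g, with u_4 = i as in 4.9."""
    vel = [u, v, w, 1j]
    return [eps * sum(F[i][g] * vel[g] for g in range(4)) for i in range(4)]

X, Y, Z, L, M, N = 1.0, 2.0, 3.0, 4.0, 5.0, 6.0   # arbitrary sample field
u, v, w, eps = 0.1, 0.2, 0.3, 2.0                  # arbitrary velocity and density

lhs = force_density(field_tensor(X, Y, Z, L, M, N), u, v, w, eps)
rhs = [eps * (N * v - M * w + X),
       eps * (L * w - N * u + Y),
       eps * (M * u - L * v + Z),
       eps * 1j * (X * u + Y * v + Z * w)]

worst = max(abs(a - b) for a, b in zip(lhs, rhs))
print(worst)  # zero up to rounding error
```

The first three entries are the components of the electric force plus the force exerted by the magnetic field on the moving charge; the fourth, apart from the factor i, is the rate of work.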

These expressions, obtained by us in a purely formal way, are known to possess physical significance: the first three represent the components of the force exerted by an electromagnetic field on a body of electric charge density $\varepsilon$, and the last (if we neglect the factor i) is the rate at which energy is expended by the field in moving the body. The first three expressions obviously give the right hand sides of our hydrodynamic equations. Since the left hand sides also (2.61, 2.62) have been obtained as divergence components (3.6), we may write these equations in an extremely simple form if we introduce a new tensor, the difference of the two appearing on the left and right sides, viz.,

6.2   $T_{ij} = M_{ij} - E_{ij}.$

Our hydrodynamical equations are then simply

6.3   $\partial T_{i\alpha}/\partial x_\alpha = 0 \quad (i = 1, 2, 3).$

This seems to be very satisfactory, but there is an unpleasant feature about it, namely, that for i = 4 we do not seem to get a correct equation: the contribution of the tensor $M_{ij}$ gives the left hand side of the continuity equation, and is, therefore, zero; but the contribution of the stress-energy tensor is the work performed by the forces on the particle and is, in general, not zero. The source of this unpleasantness and the way to remove it will be clear after the reader becomes acquainted with the contents of the next chapter.

It is time now to cast a glance on the situation as it has been worked out till now. We have ten fundamental quantities, $\rho$, u, v, w, X, Y, Z, L, M, N. They satisfy certain equations: the Maxwell equations (4.61, 4.62), the equation of continuity (3.5), and the equations of motion (6.3); in the last named equations our ten quantities enter in certain combinations which are the components of the tensor $T_{ij}$. This tensor appears then as a very fundamental one. It may be asked whether it determines the ten quantities which enter into it; if it does, all the quantities we have been considering are, in a general sense, components of one entity, the tensor $T_{ij}$, and all the equations we have introduced express properties of this tensor - that part of Physics which we are discussing in this book, with the exception of the gravitational field, appears then as the study of the tensor $T_{ij}$. It can be proved that $T_{ij}$, with certain restrictions, determines the quantities $\rho$, u, v, w, X, Y, Z, L, M, N, and in Chapter V it will be shown that the gravitational phenomena also are taken care of by it.


Chapter II.

NEW GEOMETRY

In the preceding chapter we achieved, by introducing appropriate notations, a great simplicity and uniformity in our formulas. The notations in which indices take the values from 1 to 4 are modeled after those previously introduced in ordinary geometry, the two points of distinction being first that we have four independent variables instead of the three coordinates, and second, that the fourth variable is assigned imaginary values. In spite of these distinctions the analogy with ordinary geometry is very great, and we shall profit very much by pushing this analogy as far as possible, and using geometrical language, as well as notations modeled after those of geometry.

Physics seems to require, then, a mathematical theory analogous to geometry and differing from it only in that it must contain four coordinates, one of which is imaginary. The first purpose of this chapter will be to build a theory to these specifications. The remaining part of the chapter will be devoted to a more systematic treatment of tensor analysis.

7. Analytic Geometry of Four Dimensions.

In the present section we shall give a brief outline of properties which we may expect from a four-dimensional geometry, guided by analogy with two and three-dimensional geometries; of course, we shall lay stress mainly on those features which we shall need for the application to Physics that we have in mind. In this outline we shall disregard the fact that our fourth coordinate must be imaginary; certain peculiarities connected with this circumstance will be treated in section 10.

The equations of a straight line we expect to be written in the form

7.1   $\dfrac{x_1 - a_1}{v_1} = \dfrac{x_2 - a_2}{v_2} = \dfrac{x_3 - a_3}{v_3} = \dfrac{x_4 - a_4}{v_4},$

entirely similar to that used in solid analytic geometry; but we may also use another form. As written out, the equations state that for every point of the line the four ratios have the same value; denoting this value by p we may express the condition that a point belongs to the line by stating that its coordinates may be written as

7.11   $x_1 = a_1 + p v_1,\quad x_2 = a_2 + p v_2,\quad x_3 = a_3 + p v_3,\quad x_4 = a_4 + p v_4;$

giving here to p different values we obtain (for given $a_i$, $v_i$) the coordinates of all the different points of the line. The variable p is called the parameter, and this whole way of describing a line is called "parametric representation". Parametric representation is by no means peculiar to four dimensions; it may be, and is, often used in plane and solid analytic geometry. We present it here because we shall need it later, and it is not always sufficiently emphasized.

A straight line is determined by two points; the equations of the line through the points $a_i$ and $b_i$ are given by the above equations (7.1) in which

7.2   $v_1 = b_1 - a_1,\quad v_2 = b_2 - a_2,$ etc.

Two points determine a directed segment or vector, whose components are the differences between the corresponding coordinates of the points, so that $v_i$ are the components of the vector whose initial point is given by $a_i$ and whose final point is given by $b_i$.

A vector determined by two points of a straight line is said to belong to that line, and we may say that we can use as denominators in the equations 7.1, or as coefficients of p in 7.11 the components of any vector belonging to the straight line.

Two vectors are considered equal if they have equal components; a vector is multiplied by a number by multiplying its components by that number, and two vectors are added by adding their corresponding components.

Two lines are parallel if they contain equal vectors, and it is easy to see that a condition for parallelism of line 7.11 and

7.3   $\dfrac{x_1 - A_1}{V_1} = \dfrac{x_2 - A_2}{V_2} = \dfrac{x_3 - A_3}{V_3} = \dfrac{x_4 - A_4}{V_4}$

is

$v_1/V_1 = v_2/V_2 = v_3/V_3 = v_4/V_4$   or   $V_i = \alpha v_i,$

where $\alpha$ is a number, so that proportionality of components of two vectors means parallelism.

A condition for perpendicularity of these two lines we expect to be the vanishing of the expression

7.4   $v_1 V_1 + v_2 V_2 + v_3 V_3 + v_4 V_4,$

which is called the scalar product of the two vectors $v_i$ and $V_i$.

The distance between the points $a_1, a_2, a_3, a_4$ and $b_1, b_2, b_3, b_4$ is given by the square root of the expression

7.41   $(a_1 - b_1)^2 + (a_2 - b_2)^2 + (a_3 - b_3)^2 + (a_4 - b_4)^2;$

this distance is also considered as the length of the vector joining $a_i$ and $b_i$. The expression for the square of the length of a vector may be considered as a special case of the expression 7.4. We may say then, that the square of the length of a vector is the square of the vector, i.e., the product of the vector with itself.
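The formulas of this section for lines, parallelism, perpendicularity and distance translate directly into a few lines of code. The following is a minimal sketch with real coordinates only (the imaginary fourth coordinate is disregarded, as in the outline above); the function names and the sample points are assumptions made for the illustration.

```python
import math

def vector(a, b):
    """Components of the vector from point a to point b (formula 7.2)."""
    return [bi - ai for ai, bi in zip(a, b)]

def point_on_line(a, v, p):
    """Point with parameter p on the line through a with direction v (formula 7.11)."""
    return [ai + p * vi for ai, vi in zip(a, v)]

def scalar_product(v, V):
    """Expression 7.4."""
    return sum(vi * Vi for vi, Vi in zip(v, V))

def parallel(v, V):
    """True when the components are proportional (every 2x2 minor vanishes)."""
    n = len(v)
    return all(v[i] * V[j] == v[j] * V[i] for i in range(n) for j in range(i + 1, n))

def distance(a, b):
    """Square root of expression 7.41."""
    d = vector(a, b)
    return math.sqrt(scalar_product(d, d))

a, b = [0, 0, 0, 0], [1, 2, 2, 4]
v = vector(a, b)
print(point_on_line(a, v, 0.5))          # [0.5, 1.0, 1.0, 2.0]
print(parallel(v, [2, 4, 4, 8]))         # True
print(scalar_product(v, [2, -1, 0, 0]))  # 0, so the two directions are perpendicular
print(distance(a, b))                    # 5.0
```

Testing proportionality through the vanishing of the minors avoids dividing by components that may be zero.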

We shall often use Roman letters to denote vectors. The scalar product of the vectors x and y will be denoted by x.y and the square of the vector x by $x^2$.

A vector whose length, or whose square, is unity we shall call a unit vector. Its components, we would expect, may be considered as the direction cosines of the line on which the vector lies. We also expect that the scalar product of two vectors is equal to the product of their lengths times the cosine of the angle between them; but, of course, an angle between two vectors in four-dimensional space has not been defined, so that we could simply define the angle between two vectors by this property, or by the formula

7.5   $\cos\varphi = \dfrac{x.y}{\sqrt{x^2\, y^2}}.$

But if we want the angle to be a real quantity the absolute value of this expression cannot exceed unity, or

$(x.y)^2 \le x^2 y^2$   or   $(x.y)^2 - x^2 y^2 \le 0.$

If we form the vector $\lambda x + \mu y$, where $\lambda$ and $\mu$ are two numbers, the square of that vector would be

$\lambda^2 x^2 + 2\lambda\mu\, x.y + \mu^2 y^2,$

and the above inequality, which expresses the fact that the discriminant of this expression is not positive, is seen to be a consequence of the assumption that the square of a vector is never negative.
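The inequality and the angle defined by formula 7.5 can be illustrated on numbers. The sketch below uses real components; the function names and the two sample vectors are assumptions chosen for the example, and the function `angle` presupposes that neither vector is zero.

```python
import math

def dot(x, y):
    """Scalar product x.y of two vectors given by their components."""
    return sum(a * b for a, b in zip(x, y))

def angle(x, y):
    """Angle defined through formula 7.5; assumes neither vector is zero."""
    return math.acos(dot(x, y) / math.sqrt(dot(x, x) * dot(y, y)))

x = [1, 0, 0, 0]
y = [1, 1, 0, 0]

print(dot(x, y) ** 2 <= dot(x, x) * dot(y, y))  # True
print(math.degrees(angle(x, y)))                 # approximately 45
```

For two nearly parallel vectors rounding may push the quotient slightly beyond unity, in which case it would have to be clamped before `math.acos` is applied.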

A plane we would expect to be determined by three points not in a line, or by two vectors with the same initial point, or by two lines through a point. Instead of characterizing a plane by equations we prefer to give it in parametric form; limiting ourselves to a plane through the origin we have

7.12   $x_i = p a_i + q b_i$

or

$x_1 = p a_1 + q b_1,\quad x_2 = p a_2 + q b_2,\quad x_3 = p a_3 + q b_3,\quad x_4 = p a_4 + q b_4,$

where $a_i$ and $b_i$ are the coordinates of two points in the plane, or the components of two vectors of the plane whose initial points may be considered as at the origin. We shall write this formula also as

$x = \alpha a + \beta b,$

where x, a, b stand for vectors whose components are $x_i$, $a_i$, $b_i$, and where we use Greek letters for parameters in order to avoid confusion with vectors, which we denote now by Roman letters.

We always can choose two mutually perpendicular unit vectors as the two vectors determining a plane; if we call these vectors i and j the preceding formula becomes

7.6   $x = \alpha i + \beta j.$

The fact that i and j are perpendicular unit vectors may be written as

$i^2 = 1,\quad i.j = 0,\quad j^2 = 1.$

It is easy to see that in this case $\alpha$ and $\beta$ are the projections of x on the directions of i and j, or the scalar products of x with i and j respectively.

Every pair of coordinate axes determines a plane and since six pairs can be formed from four objects we have six coordinate planes.

In the same way that the direction of a straight line is determined by a configuration of two points on it - a vector, the "orientation" of a plane may be determined by the configuration of three points on it - a triangle. A vector is given by its components, which are the lengths of its projections on the coordinate axes; in the same way a triangle may be characterized (to a certain extent) by the areas of its projections on the coordinate planes. If, for example, we take a triangle one of whose vertices is at the origin 0,0,0,0 and the two others at the points $x_i$ and $y_i$ respectively, the areas of the projections will be the six quantities

$$F_{ij} = \tfrac{1}{2}(x_i y_j - x_j y_i).$$

It is interesting to compare these numbers, which satisfy the relation

$F_{ij} + F_{ji} = 0,$

with the components of the tensor $F_{ij}$ (compare 4.4), three of which have been identified with the electric, and the other three with the magnetic force components. Our ten fundamental quantities $\rho u$, $\rho v$, $\rho w$, $\rho$, X, Y, Z, L, M, N seem thus to allow a geometrical interpretation: the first four are considered as the four projections of a part of a straight line on the coordinate axes, the remaining six as the projections of a part of a plane on the coordinate planes.

A little against our expectations, however, these six quantities $F_{ij}$ are not independent; the reader will easily verify, using the above expressions, that

7.7   $F_{23}F_{14} + F_{31}F_{24} + F_{12}F_{34} = 0.$
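The verification left to the reader can also be done on numbers. The sketch below assumes that the projected areas are proportional to $x_i y_j - x_j y_i$ and that the relation in question is $F_{23}F_{14} + F_{31}F_{24} + F_{12}F_{34} = 0$; since the relation is homogeneous, a common factor such as one half does not matter, and the code works with the doubled areas to stay in whole numbers. The function names and the sample points are hypothetical.

```python
def doubled_areas(x, y):
    """G[i][j] = x_i y_j - x_j y_i, twice the signed area of a projection."""
    return [[x[i] * y[j] - x[j] * y[i] for j in range(4)] for i in range(4)]

def relation_7_7(G):
    """F23 F14 + F31 F24 + F12 F34, with indices 1..4 stored as 0..3."""
    return G[1][2] * G[0][3] + G[2][0] * G[1][3] + G[0][1] * G[2][3]

x = [1, 2, 3, 4]     # arbitrary sample vertices of the triangle
y = [2, -1, 5, 7]
G = doubled_areas(x, y)

print(relation_7_7(G))  # 0
```

Any other pair of points gives zero as well, because the expression vanishes identically in the coordinates of the two points.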

We have here a relation that exists in the mathematical theory; at once the question arises: does a corresponding relation hold for the corresponding quantities in the physical theory? According to the formulas 4.5 this would mean

$LX + MY + NZ = 0,$

i.e., perpendicularity of the electric and magnetic force vectors; these vectors are, however, known not to be necessarily perpendicular to each other; our identification is therefore faulty. A slight modification would, however, help to overcome the difficulty: if instead of considering the areas of projections of a triangular contour we consider the areas of projections of an arbitrary contour, not necessarily a flat one, then the six quantities are independent and the formal analogy holds perfectly.

Returning to the plane, we may mention that, although it might seem strange at first glance, we should expect that two planes may have only one point in common - a situation which never occurs in three dimensions. An example of two planes with only one common point is given by the $x_1 x_2$ and the $x_3 x_4$ coordinate planes; the common point is, of course, the origin.

Four points not in a plane, or three vectors with a common origin, we expect to determine a "solid", which may be defined as the totality of points of three kinds: (1) points on the lines determined by the given vectors; (2) points on lines joining two points of the first kind; and (3) points of lines joining two points of the second kind. In ordinary geometry a configuration defined in this way exhausts all points, but not so in our four-dimensional geometry; as examples may serve the four coordinate solids, the totalities respectively of points satisfying the relations $x_1 = 0$, $x_2 = 0$, $x_3 = 0$, $x_4 = 0$.

A parametric representation of a solid is analogous to that of a plane. For a solid through the origin we have as such parametric representation

7.61   $x = \alpha i + \beta j + \gamma k,$

where i, j, k have again been chosen as perpendicular unit vectors so that

$i.j = j.k = k.i = 0;\quad i^2 = j^2 = k^2 = 1.$

For all possible values of $\alpha$, $\beta$, $\gamma$ we obtain all the points of the solid through the origin determined by the vectors i, j, k.

Next we consider a configuration determined by five points not in a solid; we obtain all the points of our four-dimensional space. Incidentally, as a generalization of the formulas 7.6 and 7.61 we may write now

7.62   $x = \alpha i + \beta j + \gamma k + \delta l;$

this formula gives the expression of every vector with initial point at the origin in terms of four mutually perpendicular unit vectors. We have, of course,

7.8   $i.j = i.k = i.l = j.k = j.l = k.l = 0,\quad i^2 = j^2 = k^2 = l^2 = 1.$

8. Axioms of Four-Dimensional Geometry.

Until now we have been listing some propositions that we may expect to have in four-dimensional geometry. But what is four-dimensional geometry? As an abstract mathematical theory it is just a collection of statements of which some may be taken without proof and considered as axioms and definitions, and the others are deduced from them as theorems. It is not difficult then to pass from our expectations of a four-dimensional geometry to a realization of such a geometry; all we would have to do would be to pick out certain of the propositions listed above and consider them as axioms, and to show that the others can be deduced from them. But in so doing we do not want to include among our axioms propositions involving coordinates. In two and three-dimensional geometry we are accustomed to see analytic geometry based on the study of elementary geometry that precedes it. We have an idea of what a straight line is before we come to coordinate axes, and we choose three of these pre-existing straight lines to play the part of coordinate axes. We may choose these axes to a certain extent arbitrarily (this arbitrariness being restricted only by our desire to have a rectangular system); the coordinate axes play only an auxiliary role, and it would be awkward to include any reference to a particular coordinate system in the axioms. We look, therefore, among the propositions mentioned in the preceding section for some that are independent of a coordinate system and from which we can reconstruct the whole system.

As our fundamental undefined conception we choose "vector". The axioms that follow will in the main be rules of operations on vectors.

Axiom I. Every two vectors have a sum. Addition is commutative, $a + b = b + a$, and associative, i.e., $a + (b + c) = (a + b) + c$; subtraction is unique, i.e., for every two vectors $a$ and $b$ there exists one and only one vector $x$ such that $a = b + x$. It follows that there exists a vector $0$ that satisfies the relation $a + 0 = a$ for every $a$.

Axiom II. Given a number $\alpha$ and a vector $a$ there exists a vector $\alpha a$ or $a\alpha$ which is called their product. The associative law holds in the sense that $\alpha(\beta a) = (\alpha\beta)a$, and also the distributive laws $(\alpha + \beta)a = \alpha a + \beta a$ and $\alpha(a + b) = \alpha a + \alpha b$.

Axiom III. To every two vectors $a$ and $b$ corresponds a number $a \cdot b$ or $ab$ called their (scalar) product. Scalar multiplication is commutative, $a \cdot b = b \cdot a$; it obeys, together with multiplication of vectors by numbers, the associative law $\alpha(a \cdot b) = (\alpha a) \cdot b$; and, together with addition, the distributive law $a \cdot (b + c) = a \cdot b + a \cdot c$.

Before we formulate the next axiom we introduce the definition: the vectors $a, b, c, \ldots$ are called linearly dependent if there exist numbers $\alpha, \beta, \gamma, \ldots$, not all zero, such that

$\alpha a + \beta b + \gamma c + \ldots = 0;$

they are called linearly independent when no such numbers exist.

Axiom IV. There are four independent vectors, but there are no five independent vectors.

Axiom V. If a vector is not zero its square is positive.

To these axioms on vectors we have to add some statements concerning points if we want to have a geometry, and as such we may take:

Axiom VI. Every two points $A$, $B$ have as their difference a vector $h$; or in formulas $B - A = h$, $B = A + h$.

Axiom VII. (A - B) + (B - C) = A - C.

The body of propositions which may be deduced from these axioms we call four-dimensional geometry.

In order to prove that the whole of geometry can be deduced from the propositions I-VII we would have to actually deduce it. We shall not do it, of course, but we shall indicate how analytic geometry can be arrived at in the following discussion, which is meant to be entirely formal, i.e., during which it is not intended to invoke our intuition but only the properties stated in the axioms.

By length of a vector we mean the positive square root of the product of the vector by itself: $|a| = \sqrt{a^2}$. Two vectors are considered perpendicular if their scalar product is zero. A vector is called a unit vector if its length is 1.

Lemma. There exist four mutually perpendicular unit vectors.

Proof. According to Axiom IV there exist four independent vectors $a, b, c, d$; i.e., such vectors that there are no four numbers $\alpha, \beta, \gamma, \delta$, not all zero, such that $\alpha a + \beta b + \gamma c + \delta d = 0$. It follows that $a \neq 0$, because if it were zero, the numbers $\alpha = 1$, $\beta = 0$, $\gamma = 0$, $\delta = 0$ would satisfy the above relation. Call $a$, multiplied by the reciprocal of its length, $i$, so that $i = a/|a|$. It is easy to see that $i$ is a unit vector, and that $i, b, c, d$ are independent. Consider the vector $b' = b - i(b \cdot i)$. It is easy to see that it is perpendicular to $i$, and it is not zero, since otherwise $i$ and $b$ would be dependent; multiply $b'$ by the reciprocal of its length and call the result $j$; then $i$ and $j$ are two perpendicular unit vectors and $i, j, c, d$ are independent. Call $c' = c - i(c \cdot i) - j(c \cdot j)$; this vector is perpendicular to both $i$ and $j$, and if we multiply it by the reciprocal of its length and call the result $k$, we have in $i, j, k$ three perpendicular unit vectors; the fourth vector $\ell$ may be obtained from $d$ in an analogous way.
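The construction used in this proof can be tried out numerically. The following is only a sketch: it assumes that vectors are stored as lists of four real numbers and that the scalar product is the ordinary sum of products, which already presupposes a reference coordinate system; the sample vectors `a`, `b`, `c`, `d` and the function names are illustrative choices, not part of the text.

```python
import math

def dot(u, v):
    """Scalar product, assumed here to be the ordinary sum of products."""
    return sum(p * q for p, q in zip(u, v))

def perpendicular_unit_vectors(vectors):
    """Follow the proof: subtract the parts along the unit vectors already
    found (b' = b - i(b.i), c' = c - i(c.i) - j(c.j), ...), then divide by the length."""
    found = []
    for v in vectors:
        w = list(v)
        for e in found:
            coeff = dot(v, e)
            w = [wi - coeff * ei for wi, ei in zip(w, e)]
        length = math.sqrt(dot(w, w))
        if length < 1e-12:
            raise ValueError("the given vectors are linearly dependent")
        found.append([wi / length for wi in w])
    return found

# Four independent vectors chosen arbitrarily for the illustration.
a, b, c, d = [1, 1, 0, 0], [1, 0, 1, 0], [0, 1, 1, 1], [1, 0, 0, 2]
i, j, k, ell = perpendicular_unit_vectors([a, b, c, d])
```

Every pair of the resulting vectors has scalar product zero and every one has square 1, up to rounding.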

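Lemma II. Given any vector $x$ we have the identity

8.1    $x = i(i \cdot x) + j(j \cdot x) + k(k \cdot x) + \ell(\ell \cdot x).$

Proof. Since every five vectors are dependent (Axiom IV), there exist five numbers $\alpha, \beta, \gamma, \delta, \varepsilon$, not all zero, such that

$\alpha i + \beta j + \gamma k + \delta \ell + \varepsilon x = 0;$

here $\varepsilon$ cannot be zero since otherwise $i, j, k, \ell$ would be dependent and we know that they are not. Dividing by $-\varepsilon$ we have

$x = (-\alpha/\varepsilon)\, i + (-\beta/\varepsilon)\, j + (-\gamma/\varepsilon)\, k + (-\delta/\varepsilon)\, \ell;$

multiplying by $i$, the three last terms on the right vanish as the result of perpendicularity of $i$ to $j$, $k$ and $\ell$, and the first term becomes $-\alpha/\varepsilon$, whereas the left hand side is $i \cdot x$; in the same way we prove that $-\beta/\varepsilon$, $-\gamma/\varepsilon$, $-\delta/\varepsilon$ have the values $j \cdot x$, $k \cdot x$, $\ell \cdot x$, respectively, and Lemma II is proved.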

A set of four perpendicular unit vectors we shall call a set of coordinate vectors. We shall call the quantities $x_1 = i \cdot x$, $x_2 = j \cdot x$, $x_3 = k \cdot x$, $x_4 = \ell \cdot x$ the components of $x$ with respect to $i, j, k, \ell$. A point $O$ together with a set of coordinate vectors we call a coordinate system. Given a coordinate system we can assign to every point $X$ four coordinates in the following way: denote the vector $X - O$ by $x$ (according to Axiom VI), and call the components of the vector $x$ the coordinates of the point $X$. If now we choose another origin $O'$ and the same set of coordinate vectors, the coordinates of the point $X$ will be the components of $X - O'$; but since, according to Axiom VII, $X - O = (X - O') + (O' - O)$, the old coordinates will be equal to the new coordinates plus the old coordinates of the new origin, and thus a connection is established with ordinary analytic geometry.

Formula 8.1 may be compared with the formula 7.62. It may also be written as

8.2    $x = x_1 i + x_2 j + x_3 k + x_4 \ell.$
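Formulas 8.1 and 8.2 and the rule for a change of origin can be checked on a small example. This is a sketch under stated assumptions: points and vectors are stored as 4-tuples of numbers with the ordinary sum-of-products scalar product, and the particular set of coordinate vectors, the point `X` and the origins `O`, `O_new` are made up for the illustration.

```python
def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

h = 0.5
# A set of coordinate vectors: four mutually perpendicular unit vectors (illustrative).
frame = [(h, h, h, h), (h, h, -h, -h), (h, -h, h, -h), (h, -h, -h, h)]

def components(x, frame):
    """x_1 = i.x, x_2 = j.x, x_3 = k.x, x_4 = l.x"""
    return [dot(e, x) for e in frame]

def rebuild(comps, frame):
    """Formula 8.2: x = x_1 i + x_2 j + x_3 k + x_4 l"""
    return [sum(c * e[n] for c, e in zip(comps, frame)) for n in range(4)]

def coordinates(point, origin, frame):
    """Coordinates of a point: components of the vector (point - origin)."""
    return components([p - q for p, q in zip(point, origin)], frame)

X = (3.0, -1.0, 2.0, 5.0)
O = (0.0, 0.0, 0.0, 0.0)
O_new = (1.0, 2.0, -1.0, 0.0)

old = coordinates(X, O, frame)
new = coordinates(X, O_new, frame)
shift = coordinates(O_new, O, frame)   # old coordinates of the new origin
```

Here `rebuild(components(X, frame), frame)` returns the vector one started from, and each entry of `old` equals the corresponding entry of `new` plus that of `shift`.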

9. Tensor Analysis.

We want to substitute now for the preliminary definitions of tensor analysis that were suggested by the formal developments in Chapter I a definition that is more satisfactory. At that time our point of view was simply that we shall consider symbols with two (or more) indices as the tensor components in a way similar to that of using symbols with one index as vector components. But in case of vectors (in ordinary space) we know what vectors are, and we consider the components as a method of representing that known thing. In the case of tensors we seem to have to take representation as the starting point of our study. The situation seems complicated, since the fact that there is not one but that there are many different representations of the same vector, depending on the coordinate system we use, leads us to think that the same general situation will obtain in the case of tensors, and the question arises naturally: how shall we be able to find out, given two representations of a tensor, whether it is the same tensor that is represented in the two cases; or, given a representation of a tensor in one system of coordinates, how to find the representation of the same tensor in a given other system. In order to be able to answer such questions intelligently we want to introduce the idea of the tensor itself, to put it in the foreground and to consider the components as something secondary. In the beginning we shall limit ourselves to the consideration of two dimensions.

We look then for some entity of which the components will be constituent parts. The first thing that occurs to our minds in connection with tensors of rank two is, of course, a determinant. It is a single number determined by its elements, or components. However, it cannot be used for our purposes because the determinant does not, in turn, determine its components.

Another instance where two index symbols occur in mathematics is the case of quadratic forms. The equation of a central conic may, for example, be written in the form $ax^2 + 2bxy + cy^2 = 1$; or, introducing the notations $x_1$ for $x$, $x_2$ for $y$, $a_{11}$ for $a$, $a_{12}$ and $a_{21}$ for $b$, and $a_{22}$ for $c$, in the form

$a_{11}x_1x_1 + a_{12}x_1x_2 + a_{21}x_2x_1 + a_{22}x_2x_2 = 1.$

Let us consider the left hand side of this equation. Here the tensor components $a_{ij}$ are combined (together with the variables $x_1$, $x_2$) into one expression; and they can, to a certain extent, be gotten back from that expression. If we set $x_1 = 1$, $x_2 = 0$, for instance, we get $a_{11}$ as the value of the expression; $a_{22}$ can be gotten in a similar way, but it would be difficult to imagine how $a_{12}$ could be obtained; in fact, it is impossible to get $a_{12}$ from this expression, because two expressions which differ in their form but for which $a_{12} + a_{21}$ has the same value would give the same values for all combinations of values of $x_1$, $x_2$. A slight generalization will, however, obviate this difficulty. This generalization is suggested by the equation of the tangent to the above conic and can be written as

9.1    $\varphi = a_{11}x_1y_1 + a_{12}x_1y_2 + a_{21}x_2y_1 + a_{22}x_2y_2;$

here, for instance, setting $x_1 = 1$, $x_2 = 0$, $y_1 = 0$, $y_2 = 1$ we get $a_{12}$. We shall therefore consider the bilinear form above as the tensor. If we do that we may free ourselves of coordinates easily. The variables $x_1, x_2$ and $y_1, y_2$ may be considered as the components of two vectors, and the above expression 9.1 furnishes us then a numerical value every time these two vectors are given; it may be considered as defining a function $\varphi$; the arguments of that function are the two vectors and the values are the numbers calculated by substituting the components of these vectors in the expression 9.1. This functional dependence we may consider as the tensor, so that if we want to use another coordinate system we shall have the same vectors given by different components $x'_1, x'_2$ and $y'_1, y'_2$, and we expect to find another expression of the same type as 9.1 involving these new components, say

9.11    $\varphi = a'_{11}x'_1y'_1 + a'_{12}x'_1y'_2 + a'_{21}x'_2y'_1 + a'_{22}x'_2y'_2,$

which would assign the same values to the two vectors. The coefficients will, of course, be different, and these new coefficients we shall consider as the new components of the same tensor in the new coordinate system.
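The difference between the quadratic form and the bilinear form 9.1 can be seen on two sets of coefficients that share the same $a_{11}$, $a_{22}$ and the same sum $a_{12} + a_{21}$. The numbers below are made up for the illustration, and the function names are not from the text.

```python
def quadratic(a, x):
    """a_11 x_1 x_1 + a_12 x_1 x_2 + a_21 x_2 x_1 + a_22 x_2 x_2"""
    return sum(a[m][n] * x[m] * x[n] for m in range(2) for n in range(2))

def bilinear(a, x, y):
    """Formula 9.1: a_11 x_1 y_1 + a_12 x_1 y_2 + a_21 x_2 y_1 + a_22 x_2 y_2"""
    return sum(a[m][n] * x[m] * y[n] for m in range(2) for n in range(2))

a = [[1, 5], [3, 2]]
b = [[1, 4], [4, 2]]   # same a_11, a_22 and same a_12 + a_21 as in `a`

samples = [(1, 0), (0, 1), (1, 1), (2, -3), (-1, 4)]
same_quadratic_values = all(quadratic(a, x) == quadratic(b, x) for x in samples)

# The bilinear form separates the two: these are a_12 of `a` and of `b`.
a12_from_a = bilinear(a, (1, 0), (0, 1))
a12_from_b = bilinear(b, (1, 0), (0, 1))
```

The quadratic form gives identical values for both coefficient sets on every sample, while the bilinear form returns the individual component $a_{12}$ of each.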

Let us perform the calculation. If we rotate our axes through an angle $\phi$ the old coordinates are expressed in the new coordinates by the formulas

9.2    $x_1 = x'_1 c - x'_2 s, \qquad x_2 = x'_1 s + x'_2 c,$

where

9.3    $c = \cos\phi, \qquad s = \sin\phi;$

the components of the other vector $y_1, y_2$ will be expressed by analogous formulas in terms of the new components of the same vector. Substituting these expressions in the above bilinear form (9.1) we get

$\varphi = a_{11}(x'_1 c - x'_2 s)(y'_1 c - y'_2 s) + a_{12}(x'_1 c - x'_2 s)(y'_1 s + y'_2 c) + a_{21}(x'_1 s + x'_2 c)(y'_1 c - y'_2 s) + a_{22}(x'_1 s + x'_2 c)(y'_1 s + y'_2 c),$

which may be written as 9.11 if we give to the $a'_{ij}$ the values

9.21
$a'_{11} = a_{11}c^2 + a_{12}cs + a_{21}sc + a_{22}s^2,$
$a'_{12} = -a_{11}sc + a_{12}c^2 - a_{21}s^2 + a_{22}sc,$
$a'_{21} = -a_{11}sc - a_{12}s^2 + a_{21}c^2 + a_{22}cs,$
$a'_{22} = a_{11}s^2 - a_{12}sc - a_{21}cs + a_{22}c^2.$

These are the new components of the tensor whose old components are the $a_{ij}$. The equation of the conic section in the new coordinates has the same form as in the old system. We may say that 9.11 expresses the same functional dependence on the two vectors using their new components, as 9.1 does using their old components.
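The statement that the new components give the same value of the bilinear form can be checked numerically. The sketch below writes the new components as the four coefficients that result from substituting the rotation $x_1 = x'_1 c - x'_2 s$, $x_2 = x'_1 s + x'_2 c$ into the bilinear form and collecting terms; the sample tensor, angle and vectors are arbitrary, and the function names are illustrative.

```python
import math

def new_components(a, angle):
    """Coefficients of x'_m y'_n after substituting the rotation into the bilinear form."""
    c, s = math.cos(angle), math.sin(angle)
    (a11, a12), (a21, a22) = a
    return [
        [a11*c*c + a12*c*s + a21*s*c + a22*s*s, -a11*s*c + a12*c*c - a21*s*s + a22*s*c],
        [-a11*s*c - a12*s*s + a21*c*c + a22*c*s, a11*s*s - a12*s*c - a21*c*s + a22*c*c],
    ]

def to_new(v, angle):
    """New components of a vector (the rotation formulas solved for the primed quantities)."""
    c, s = math.cos(angle), math.sin(angle)
    return (v[0]*c + v[1]*s, -v[0]*s + v[1]*c)

def bilinear(a, x, y):
    return sum(a[m][n] * x[m] * y[n] for m in range(2) for n in range(2))

a = [[1.0, 2.0], [3.0, 4.0]]
angle = 0.7
x, y = (2.0, -1.0), (0.5, 3.0)

old_value = bilinear(a, x, y)
new_value = bilinear(new_components(a, angle), to_new(x, angle), to_new(y, angle))
```

`old_value` and `new_value` agree up to rounding, although the individual components differ.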

The components of a tensor change, in general, when we pass from one coordinate system to another, but there are certain combinations which do not change; for instance, if we add together the first and the last of the four above equalities we obtain, taking into account that

9.4    $c^2 + s^2 = 1,$

the relation

9.5    $a'_{11} + a'_{22} = a_{11} + a_{22};$

also it is easy to prove that the expression

9.51    $\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}$

is not affected by the substitution of the primed components for the unprimed ones.

Expressions of this kind are called invariants.

We have thus achieved our purpose; although we use coordinates in the definition of a tensor the result is independent of the particular coordinate system used. We can go a step farther, however, and dispense with coordinates altogether in the definition of a tensor. Not every functional dependence of a number $\varphi$ on two vectors we shall call a tensor; the dependence on each vector must be linear (and homogeneous); by this we mean that the expression involves only first powers of the components of each vector, and no products of components of the same vector; our conception of linearity seems thus to involve components and we are not rid of coordinate systems yet. We want therefore to define linearity independently of coordinate representation, and we shall see that the following definition, entirely independent of coordinates, leads to the same results.

We say, in general, that $\varphi(x)$ depends on its argument $x$ linearly if

9.6    $\varphi(\lambda x + \mu y) = \lambda\varphi(x) + \mu\varphi(y).$

It is easy to see that linearity defined in terms of coordinates as dependence involving only first powers of the components satisfies this condition. We arrive thus at the following definition of a tensor:

A tensor of rank r is a function which assigns to r vector arguments numerical values, the dependence of the value on each argument being linear in the sense of 9.6.

We can prove that an expression of a tensor as a bilinear (or multilinear) form may be gotten back from this general definition. In fact, if given, e.g., a tensor of rank two, i.e., with two vector arguments, $\varphi(x, y)$, we substitute for $x$ and $y$ their expressions in terms of components and unit vectors (see 7.6)

$x = i x_1 + j x_2, \qquad y = i y_1 + j y_2,$

we may write, using the above definition of a tensor and that of linearity (9.6):

$\varphi(x, y) = \varphi(i,i)\,x_1y_1 + \varphi(i,j)\,x_1y_2 + \varphi(j,i)\,x_2y_1 + \varphi(j,j)\,x_2y_2.$

We see that this expression differs from that given by 9.1 as a bilinear form only in that $\varphi(i,i)$, $\varphi(i,j)$, $\varphi(j,i)$, $\varphi(j,j)$ appear instead of $a_{11}$, $a_{12}$, $a_{21}$, and $a_{22}$. From this point of view the conception of a tensor is entirely independent of a coordinate system and of components. We obtain tensor components when we introduce a set of coordinate vectors; and transformation of coordinates corresponds to replacing of one set of coordinate vectors by a new set.
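The coordinate-free view can be imitated in code by treating the tensor as a function of two vectors and reading off its components as the values on the coordinate vectors. This is a sketch: the function `phi` is a hypothetical tensor of rank two in the plane, and the second frame is an arbitrarily chosen pair of perpendicular unit vectors.

```python
def phi(x, y):
    """A hypothetical tensor of rank two: linear in each of its two vector arguments."""
    return 2*x[0]*y[0] - x[0]*y[1] + 4*x[1]*y[0] + 3*x[1]*y[1]

def dot(u, v):
    return u[0]*v[0] + u[1]*v[1]

def components(tensor, frame):
    """a_mn = tensor(e_m, e_n) for the coordinate vectors e_1, e_2 of the frame."""
    return [[tensor(p, q) for q in frame] for p in frame]

def value_from_components(a, x, y, frame):
    xs = [dot(e, x) for e in frame]   # components of x in this frame
    ys = [dot(e, y) for e in frame]   # components of y in this frame
    return sum(a[m][n] * xs[m] * ys[n] for m in range(2) for n in range(2))

old_frame = [(1.0, 0.0), (0.0, 1.0)]
new_frame = [(0.6, 0.8), (-0.8, 0.6)]   # another set of coordinate vectors

x, y = (2.0, -1.0), (0.5, 3.0)
a_old = components(phi, old_frame)
a_new = components(phi, new_frame)
```

Both `value_from_components(a_old, x, y, old_frame)` and `value_from_components(a_new, x, y, new_frame)` reproduce `phi(x, y)`: the components change with the frame, the function does not.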

We pass now to the consideration of operations on tensors. We had three such operations: multiplication, contraction and differentiation.

Multiplication is simple. If we have two tensors, say f(x,y) and g(z,u,v) we obtain a tensor of rank five by multiplying these two together

h(x,y,z,u,v) = f(x,y).g(z,u,v);

it is easy to see that the components of $h$ are obtained from the components of $f$ and $g$ in the following fashion

$h_{ijklm} = f_{ij}\, g_{klm}.$
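As an illustration of multiplication, the components of the product can be formed as products of the components of the factors, one component of $f$ times one component of $g$ for every choice of the five indices. The sketch below works in two dimensions with made-up integer components; the names `f_value`, `g_value`, `h_value` are illustrative.

```python
from itertools import product

f = [[1, 2], [3, 4]]                                  # components of a tensor of rank two
g = [[[1, 0], [2, -1]], [[0, 3], [1, 1]]]             # components of a tensor of rank three

def f_value(x, y):
    return sum(f[i][j] * x[i] * y[j] for i, j in product(range(2), repeat=2))

def g_value(z, u, v):
    return sum(g[k][l][m] * z[k] * u[l] * v[m] for k, l, m in product(range(2), repeat=3))

# Components of the product: every component of f times every component of g.
h = {(i, j, k, l, m): f[i][j] * g[k][l][m]
     for i, j, k, l, m in product(range(2), repeat=5)}

def h_value(x, y, z, u, v):
    return sum(comp * x[i] * y[j] * z[k] * u[l] * v[m]
               for (i, j, k, l, m), comp in h.items())

args = [(1, 2), (0, -1), (3, 1), (2, 2), (-1, 4)]
```

With these definitions `h_value(*args)` equals `f_value(args[0], args[1]) * g_value(args[2], args[3], args[4])`, and `h` has 32 components.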

Next comes contraction. We have defined it in indices; i.e., given the components of a tensor of rank two in a certain coordinate system we have a definite rule for obtaining a scalar, viz., $a_{11} + a_{22}$; the question arises: will we obtain the same scalar if we use another system of coordinates; in other words, is the definition independent of the system of coordinates, is it invariant? Yes, this invariance has been proved above by formula 9.5. We have now the right to use the definition of contraction in terms of components, knowing that it has an intrinsic meaning, that the result is independent of the system of coordinates used.

We are in a position now to answer a question that must have arisen in the mind of the reader. In the preceding chapter we agreed to consider a vector as a tensor of rank one. Here, with our new definition of tensor, a vector and a tensor of rank one seem to be two entirely different things; but we may consider together with every vector a tensor of rank one which has the same components. To find a tensor $f(x)$ which has the same components as the vector $v$ we have to make $f(i) = v_1$, $f(j) = v_2$, and we have

$f(x) = x_1 f(i) + x_2 f(j) = x_1v_1 + x_2v_2,$

so that the value of this tensor $f(x)$ is simply the scalar product of the vector to which it corresponds by the vector argument.

Incidentally, this raises a question as to the nature of the scalar product; if we define it as $x_\alpha v_\alpha$, is it invariant? It may be considered as resulting from two tensors of rank one by first multiplying them and then contracting the resulting tensor of rank two.

We also might at this place say a few words about the symbols of Kronecker $\delta_{ij}$. We may try to consider these symbols as the components of a tensor in some coordinate system. The value of the tensor will then be given by $\delta_{\alpha\beta}x_\alpha y_\beta$, and this is easily seen to be $x_\alpha y_\alpha$, the scalar product of the vectors $x$ and $y$, and thus independent of the coordinate system. We may then speak of the tensor $\delta_{ij}$ without mentioning the coordinate system because its components are the same in all coordinate systems.

The square of a vector may be defined as the scalar product of a vector with itself, it is the sum of the squares of the components in any coordinate system.

We next take up the operation of differentiation. Let us begin with a scalar field $f$; $f$ is a function of the coordinates which we do not put in evidence. The coordinates of a point $P$ are the components of the vector $OP$ which joins the origin to the point in question, and they depend in the fashion discussed before on the choice of the coordinate vectors.

After choosing a definite coordinate system we may assign to $f$ in every point a vector by agreeing that the components of this vector should be

9.71    $v_1 = \partial f/\partial x_1$ and $v_2 = \partial f/\partial x_2;$

given another coordinate system, the relation between $x_i$ and $x'_i$ being given by formulas 9.2, we can form the derivatives

9.72    $\partial f/\partial x'_1$ and $\partial f/\partial x'_2$

and consider them as the components in the new coordinate system of a vector. The question arises whether this will be the same vector as the one introduced above.

In order to settle this question let us see what the components of the vector whose components in the old system were $v_i$ should be in the new coordinate system. According to formulas 9.2 they are

$v'_1 = v_1c + v_2s = \partial f/\partial x_1 \cdot c + \partial f/\partial x_2 \cdot s,$
$v'_2 = -v_1s + v_2c = -\partial f/\partial x_1 \cdot s + \partial f/\partial x_2 \cdot c.$

On the other hand

$\partial f/\partial x'_1 = \partial f/\partial x_1 \cdot \partial x_1/\partial x'_1 + \partial f/\partial x_2 \cdot \partial x_2/\partial x'_1,$

and since

$\partial x_1/\partial x'_1 = c, \qquad \partial x_2/\partial x'_1 = s,$

we find that $v'_1 = \partial f/\partial x'_1$, and in the same way we find that $v'_2 = \partial f/\partial x'_2$, which shows that the components 9.71 and 9.72 above are the components in the two coordinate systems considered of one and the same vector. We proved then that the operation of obtaining a vector by taking as its components the derivatives of a scalar with respect to the coordinates is independent of the particular system of coordinates used; that means this operation is invariant.
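The argument can be checked numerically by comparing the derivatives with respect to the new coordinates, computed directly, with the old derivatives transformed as vector components. The sketch below is only an illustration: the scalar field `f`, the angle and the point are made up, and the derivatives are approximated by central differences rather than computed exactly.

```python
import math

def f(x1, x2):
    """A hypothetical scalar field given in the old coordinates."""
    return x1 * x1 * x2 + 3.0 * x2

def old_from_new(p1, p2, c, s):
    """Old coordinates in terms of the new ones: x1 = x1' c - x2' s, x2 = x1' s + x2' c."""
    return p1 * c - p2 * s, p1 * s + p2 * c

def partial(g, point, index, h=1e-6):
    """Central-difference approximation of a partial derivative of g at `point`."""
    up, down = list(point), list(point)
    up[index] += h
    down[index] -= h
    return (g(*up) - g(*down)) / (2 * h)

angle = 0.4
c, s = math.cos(angle), math.sin(angle)
new_point = (1.2, -0.7)
old_point = old_from_new(new_point[0], new_point[1], c, s)

# Components in the old system, then transformed as components of a vector.
v = [partial(f, old_point, 0), partial(f, old_point, 1)]
transformed = [v[0] * c + v[1] * s, -v[0] * s + v[1] * c]

# The same field expressed in the new coordinates, differentiated directly.
def f_new(p1, p2):
    return f(*old_from_new(p1, p2, c, s))

direct = [partial(f_new, new_point, 0), partial(f_new, new_point, 1)]
```

The lists `transformed` and `direct` agree to within the accuracy of the finite differences.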

Before passing to differentiation of a tensor of rank higher than zero we may note that the components $v_1$, $v_2$ we obtained may be considered as the components of a tensor of rank one; denoting the components of the argument vector by $h_1$, $h_2$ we have as the values of this tensor

$\partial f/\partial x_1 \cdot h_1 + \partial f/\partial x_2 \cdot h_2;$

this reminds us of a differential and suggests to write $dx_1$ for $h_1$ and $dx_2$ for $h_2$; we have then the formula

$df = \partial f/\partial x_\alpha \cdot dx_\alpha,$

which leads to the interpretation of the differential as a tensor of rank one whose components are the derivatives of the given function.

We next consider a tensor of rank one whose components in the old system are $f_i$, these components being functions of coordinates. Differentiating with respect to $x_j$ we get

$\partial f_i/\partial x_j;$

can we consider these as the components of some tensor of rank two? In other words, if we define a tensor by saying that in the old system it has these components, will its components in the new system be obtained by differentiating with respect to the new coordinates the new components $f'_i$ of the given tensor? A calculation analogous to the one preceding will convince us that this is so.

We have thus introduced an operation which leads from a tensor of rank zero to one of rank one, and from a tensor of rank one to one of rank two, and we could, continuing in the same way, pass from any tensor to one of the next higher rank. In introducing this operation we used components of tensors and coordinates of points, but we proved that the result is the same no matter what particular coordinate system we might have used; the operation of differentiation is thus independent of a coordinate system.

After this detailed treatment of tensors and operations on them in plane geometry it will not be difficult to generalize to higher dimensions. We consider first three-dimensional space (solid geometry) and begin with an equation of a central quadric surface. Using notations similar to those introduced at the beginning of this section it may be written as

9.8    $a_{11}x_1x_1 + a_{12}x_1x_2 + a_{13}x_1x_3 + a_{21}x_2x_1 + a_{22}x_2x_2 + a_{23}x_2x_3 + a_{31}x_3x_1 + a_{32}x_3x_2 + a_{33}x_3x_3 = 1,$

or, using our notations for summation with Greek indices introduced in Section 3, as

$a_{\rho\sigma}x_\rho x_\sigma = 1.$

For the same reason as before (in the case of a conic), namely, because not all coefficients of such an expression can be obtained as its values, we introduce a slightly more general expression

9.81    $a_{\rho\sigma}x_\rho y_\sigma$

as the tensor; giving in it to the variables the values $x_i = \delta_{2i}$, $y_i = \delta_{3i}$ (see definition of the symbol $\delta$ under 3.4) we obtain, for instance, the coefficient $a_{23}$.

In coordinateless notation we were independent of the number of dimensions to begin with, so that we may take over the definition of linearity 9.6 and the definition of tensor following it word for word. In order to effectuate the transition from vector notation to coordinate notation we write now, instead of $x = x_1i + x_2j$, $x = x_\rho i_\rho$, and substituting this and an analogous expression for $y$ into $\varphi(x, y)$ we get, using 9.6,

$\varphi(x_\rho i_\rho,\; y_\sigma i_\sigma) = \varphi(i_\rho, i_\sigma)\,x_\rho y_\sigma.$

The notation

$\varphi(i_\rho, i_\sigma) = a_{\rho\sigma}$

brings us back to formula 9.81. There is no difficulty about tensors of higher ranks; quantities with three indices give rise to trilinear forms, e.g.,

$a_{\rho\sigma\tau}x_\rho y_\sigma z_\tau;$

those with four indices to quadrilinear forms, etc. The definitions of multiplication, contraction and differentiation hardly present any difficulty, but we shall devote some time to the question of transformation of coordinates for three and four dimensions. In solid analytic geometry the question is usually treated by introducing formulas involving all coordinates at the same time, i.e., formulas permitting to pass at once from one system of coordinates to any other with the same origin; these formulas are quite complicated, they involve nine constants which are not independent but are connected by six relations, and the corresponding thing for four dimensions would be still more unwieldy; we could handle it by introducing index notations, but we prefer another method. We pass from one system of coordinates to another gradually, in steps, each step involving only two of the coordinates and one constant, the angle through which we rotate in the corresponding plane. Three such steps are enough to pass from any system to any other in three dimensions. For example, we may first perform a rotation in the xy-plane which brings the x-axis into the new xy-plane; then a rotation in the so obtained yz-plane bringing the y-axis into the new xy-plane, and finally we rotate the so obtained x and y axes until they coincide with the new x and y axes.

The advantage of this point of view will be seen from the following proof of the invariance of the operation of contraction in three dimensions. Given a tensor of rank two by its components $a_{11}, a_{12}, a_{13}, a_{21}$, etc., the result of contraction, according to our definition in Section 3, is

$a_{\rho\rho} = a_{11} + a_{22} + a_{33}.$

If we pass to another coordinate system the components will be changed into some components $a'_{ij}$, and the result of contraction will be

$a'_{\rho\rho} = a'_{11} + a'_{22} + a'_{33};$

in order to prove that contraction has an intrinsic meaning, independent of the system of coordinates, we have to prove that the last two expressions are equal. If the transformation involves only the $x_1$ and $x_2$ coordinates, but does not involve $x_3$, then $a'_{33}$, which is the coefficient of $x'_3y'_3$, will be $a_{33}$, which is the coefficient of $x_3y_3$, because $x'_3 = x_3$, $y'_3 = y_3$, and the other coordinates do not depend on $x_3$ and $y_3$; the coefficients $a'_{11}$, $a'_{22}$, on the other hand, will be transformed by the same formulas (9.21) as in the two-dimensional case because $x_1, x_2, y_1, y_2$ are transformed by the same formulas (9.2) as before. Therefore, formula 9.5 is applicable, and this together with the fact that $a'_{33} = a_{33}$ establishes the invariance of contraction under a transformation of coordinates involving $x_1$ and $x_2$ only. But the same reasoning would apply to a transformation involving $x_2$ and $x_3$ only, or $x_1$ and $x_3$ only, and since we have proved that a general transformation of coordinates may be replaced by a succession of transformations involving each only two coordinates, we have proved the invariance of the operation of contraction of a tensor of rank two under a general coordinate transformation.
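The step-by-step argument can be followed numerically: apply a succession of rotations, each involving only two axes, and watch the result of contraction. The sketch below assumes the transformation rule $a'_{ij} = r_{ki}\,a_{kl}\,r_{lj}$ for the components under a rotation with matrix $r$; the sample components and angles are arbitrary, and the function names are illustrative.

```python
import math

def plane_rotation(n, p, q, angle):
    """Matrix of a rotation through `angle` involving only the axes p and q."""
    r = [[1.0 if row == col else 0.0 for col in range(n)] for row in range(n)]
    c, s = math.cos(angle), math.sin(angle)
    r[p][p], r[p][q] = c, -s
    r[q][p], r[q][q] = s, c
    return r

def transform(a, r):
    """New components a'_ij = sum over k, l of r_ki a_kl r_lj."""
    n = len(a)
    return [[sum(r[k][i] * a[k][l] * r[l][j] for k in range(n) for l in range(n))
             for j in range(n)] for i in range(n)]

def contraction(a):
    return sum(a[m][m] for m in range(len(a)))

a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 10.0]]

# A rotation involving only the first two axes leaves a_33 alone.
after_first = transform(a, plane_rotation(3, 0, 1, 0.3))

# Three single rotations in succession, as in the three-step passage described above.
current = a
for p, q, angle in [(0, 1, 0.3), (1, 2, -1.1), (0, 1, 0.7)]:
    current = transform(current, plane_rotation(3, p, q, angle))
```

`after_first[2][2]` is still 10.0, and `contraction(current)` equals `contraction(a)` up to rounding. The same functions work unchanged for `n = 4`.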

Following the same principle we could prove the invariance of contraction for tensors of any rank and also, using the fact that the invariance of the operation of differentiation has been proved for two dimensions, prove that it has an intrinsic meaning in three dimensions.

We come now to four dimensions. Here it is easy to prove that a general transformation can be effectuated by a succession of six single rotations, i.e., rotations involving only two axes each; in fact, a rotation in the xt-plane will bring the x-axis into the new xyz-solid; a rotation in the yt-plane will bring the y-axis into the new xyz-solid; a rotation in the zt-plane will bring there the z-axis; now the t-axis coincides with the new t-axis and the x, y, z axes are all in the new xyz-solid and can be brought into coincidence with the new x, y, z axes by three more rotations, as we saw before.

The reasoning indicated for the three-dimensional case will, therefore, prove the invariance of the fundamental operations of tensor analysis also for four dimensions.

10. Complications Resulting From Imaginary Coordinate.

The only new feature of the new geometry considered in the preceding sections was this, that we have four coordinate axes instead of three; but we still have another departure from ordinary geometry (due in the final count to the "minus sign" in the Maxwell equations), viz., the fact that the fourth coordinate is imaginary. The introduction of imaginaries helped us to obtain a symmetrical form of Maxwell's equations, and seems to be beneficial from this point of view. The formal part of the theory runs now smoothly; but there is a disadvantage in this smoothness: it conceals very important peculiarities, and the present section will be devoted to the consideration of some of these peculiarities.

The discussion may be conveniently attached to the consideration of the expression (comp. 7.41)

$(x_1 - x'_1)^2 + (x_2 - x'_2)^2 + (x_3 - x'_3)^2 + (x_4 - x'_4)^2$

which defines the square of the distance between the two points whose coordinates appear in it; or the square of the vector joining these two points.

Since in the above formula the quantities $x_4$ and $x'_4$ and, therefore, their difference are imaginary, the fourth square is negative, and the expression may, according to the relative magnitudes of the terms, be positive, negative, or zero. There are thus three types of relative positions of two points, or directions, or of vectors; there are vectors of positive square, those of negative square, and those of zero square. Our geometry is thus more complicated than what we would expect it to be if it differed from the ordinary geometry in the number of dimensions only. This complication, or this richness of our geometry, far from being an undesirable feature, is, as we shall see, an advantage, because it corresponds to certain features of the outside world, e.g., the existence of both matter and light, which are going to be identified with two kinds of vectors. At present we only mention that the momentum vector of a material particle, or the vector of components $u, v, w, i$, are vectors of negative square; in fact, the square of the latter is $u^2 + v^2 + w^2 - 1$; the first three terms representing the square of the velocity of the particle which, according to the third remark preceding formula 4.3, is very small compared with unity, the expression is negative.

In some cases it may be desirable to sacrifice the formal advantages accruing from the use of the imaginary coordinate in order to put in evidence the peculiarities we are discussing; it is permissible to go back then to the old notations $x, y, z, t$, but it becomes necessary to modify the formulas accordingly. (We shall see later how it is possible to use index notations and still avoid imaginaries.) If we denote two points by $x, y, z, it$ and $x', y', z', it'$, instead of by $x_i$ and $x'_i$, the formula for the square of the distance will be

10.1    $(x - x')^2 + (y - y')^2 + (z - z')^2 - (t - t')^2,$

and the formula for the scalar product (see 7.4) of two vectors given by $a, b, c, id$ and $a', b', c', id'$,

10.2    $aa' + bb' + cc' - dd'.$

We may say that the minus sign appearing in these formulas is the same as the one appearing in the second set of Maxwell's equations (4.21), because it may be traced back to them.

We shall occasionally refer to the quantities $x_i$ and $u_i$ which carry indices and involve the square root of $-1$ as mathematical coordinates and components, and to the quantities $x, y, z, t$ and $a, b, c, d$ as physical coordinates and components. Four-dimensional space of the character we are studying now, i.e., either characterized by three real and one imaginary coordinates, or with four real coordinates but with scalar multiplication with a minus sign given by formula 10.2, is often called four-dimensional space-time because of the interpretation of the quantities $x, y, z$ and $t$ in ordinary physics.

Without going into detail we may mention a few consequences of the "minus sign". According to our definition, two vectors are considered perpendicular if their scalar product vanishes. But now a scalar product of two equal vectors may vanish, as happens, for instance, in the case of two vectors with physical components 0, 0, 1, 1 (formula 10.2). We must say then that such a vector is perpendicular to itself.
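The three kinds of vectors can be illustrated with the scalar product written in physical components, $aa' + bb' + cc' - dd'$. The sketch below also recomputes a square from the mathematical components $(a, b, c, id)$ as a plain sum of squares, using Python's complex numbers; the sample vectors and the function names are chosen for the illustration only.

```python
def scalar_product(p, q):
    """Scalar product of vectors given by physical components (a, b, c, d):
    three products added, the fourth subtracted."""
    return p[0]*q[0] + p[1]*q[1] + p[2]*q[2] - p[3]*q[3]

def square_from_mathematical(p):
    """The same square as a plain sum of squares of the components (a, b, c, i*d)."""
    mathematical = [p[0], p[1], p[2], 1j * p[3]]
    return sum(m * m for m in mathematical)

zero_square = (0.0, 0.0, 1.0, 1.0)      # perpendicular to itself
slow_particle = (0.1, 0.2, 0.0, 1.0)    # components u, v, w and 1, velocity small
spatial = (1.0, 0.0, 0.0, 0.0)

squares = {
    "zero_square": scalar_product(zero_square, zero_square),
    "slow_particle": scalar_product(slow_particle, slow_particle),
    "spatial": scalar_product(spatial, spatial),
}
```

The first square is zero, the second negative and the third positive, and `square_from_mathematical` gives the same numbers (as complex values with zero imaginary part).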

We also may mention that corresponding to the existence of three types of directions, those which correspond to vectors of positive square, negative square and zero square, there are three types of orientations or planes. An orientation may best be characterized by the number of zero directions it contains and it is easy to prove that there are orientations containing two, one or no zero directions.

As the result of existence of vectors of negative square our proof that the cosine of the angle between two vectors as defined by formula 7.5 does not exceed unity in absolute value is not applicable and we would have to consider imaginary angles or else consider the cosine as a hyperbolic cosine, but we shall not go into this question.

The only peculiarity due to the "minus sign" other than the existence of zero square vectors that we shall have to use in what follows is connected with transformation of coordinates.

Formally our transformation formulas remain the same; we may write, for instance,

$x'_3 = x_3\cos\phi - x_4\sin\phi,$
$x'_4 = x_3\sin\phi + x_4\cos\phi;$

but here $x_3$ is real and $x_4$ is imaginary and, of course, we expect the new coordinates to be of the same character, so that $x'_3$ will be real and $x'_4$ imaginary. Giving to $x_3$ and $x_4$ the values 1 and 0, respectively, we see that $\cos\phi$ must be real, and $\sin\phi$ imaginary. We shall, however, prefer not to use imaginary trigonometric functions; in order to avoid them we introduce a new notation as follows:

10.3    $\cos\phi = \sigma, \qquad \sin\phi = i\tau,$

where $\sigma$ and $\tau$ are real quantities. The identity $\cos^2\phi + \sin^2\phi = 1$ gives for $\sigma$ and $\tau$ the relation

10.4    $\sigma^2 - \tau^2 = 1.$

If we prefer to use one number, rather than two numbers connected by a relation, in describing different transformations of coordinates in the $x_3x_4$ plane we may again resort to trigonometry and interpret $\sigma$ and $\tau$ as the secant and tangent of a real angle $\psi$:

10.5    $\sigma = \sec\psi, \qquad \tau = \tan\psi.$

It must be noted, however, that $\psi$ has, so to say, no geometrical significance.

If in the above formulas of transformation we substitute for $x_3$ and $x_4$ their expressions in terms of $z$ and $t$ (4.7) and for $\cos\phi$ and $\sin\phi$ their expressions (10.3) in terms of $\sigma$ and $\tau$ we obtain the following formulas of transformation for the physical coordinates:

10.6    $z' = z\sigma + t\tau, \qquad t' = z\tau + t\sigma.$

These formulas are called the Lorentz transformation formulas and their physical interpretation will be discussed in the next chapter.
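The passage from the formal rotation with an imaginary sine to real formulas for $z$ and $t$ can be checked numerically. The sketch below takes $\sigma = \sec\psi$, $\tau = \tan\psi$ for a real parameter $\psi$, applies the real formulas $z' = z\sigma + t\tau$, $t' = z\tau + t\sigma$, and compares them with the complex rotation acting on $x_3 = z$, $x_4 = it$ with $\cos\phi = \sigma$, $\sin\phi = i\tau$; the sample values of $z$, $t$ and $\psi$ are arbitrary, and the function names are illustrative.

```python
import math

def sigma_tau(psi):
    """sigma = sec(psi), tau = tan(psi); they satisfy sigma^2 - tau^2 = 1."""
    return 1.0 / math.cos(psi), math.tan(psi)

def lorentz(z, t, psi):
    """Real transformation of the physical coordinates z and t."""
    sigma, tau = sigma_tau(psi)
    return z * sigma + t * tau, z * tau + t * sigma

def complex_rotation(z, t, psi):
    """The formal rotation with cos = sigma, sin = i*tau acting on x3 = z, x4 = i*t."""
    sigma, tau = sigma_tau(psi)
    cos_phi, sin_phi = complex(sigma, 0.0), complex(0.0, tau)
    x3, x4 = complex(z, 0.0), complex(0.0, t)
    return x3 * cos_phi - x4 * sin_phi, x3 * sin_phi + x4 * cos_phi

z, t, psi = 2.0, 3.0, 0.5
z_new, t_new = lorentz(z, t, psi)
x3_new, x4_new = complex_rotation(z, t, psi)
```

Here `x3_new` is real and equal to `z_new`, `x4_new` is purely imaginary with imaginary part `t_new`, and `z_new**2 - t_new**2` equals `z**2 - t**2` up to rounding.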

Concluding this section we may mention how our axioms of Section 8 must be modified in order to produce a geometry with the desired peculiarities.

It is clear that our axiom V, according to which a non-zero vector has positive square, has to be modified. The proper modification is the following:

Axiom V'. There are orientations containing two zero directions, but there are no orientations containing more than two zero directions.

If we replace axiom V by this axiom and keep the remaining axioms as they were stated in Section 8 we obtain a geometry of the kind desired. In order to show this, we first show the existence of four mutually perpendicular vectors three of which have squares equal to 1, and the fourth one equal to $-1$. We begin by picking out a plane with two zero square vectors $a$ and $b$; we assert that $a' = \tfrac{1}{2}(a + b)$ and $b' = \tfrac{1}{2}(a - b)$ are two perpendicular vectors with squares of opposite sign; in fact $a' \cdot b' = \tfrac{1}{4}(a^2 - b^2)$, and this is zero because $a^2 = 0$ and $b^2 = 0$; then, $a = a' + b'$; squaring this and keeping in mind that $a' \cdot b' = 0$, we have $a^2 = a'^2 + b'^2$; since $a^2 = 0$ it follows that the squares of $a'$ and $b'$ are of opposite sign. Dividing each of the vectors $a'$ and $b'$ by the square root of the absolute value of its square we obtain two mutually perpendicular vectors whose squares are $+1$ and $-1$, which we may call $k$ and $\ell$ respectively. It is easy now to pick out two more vectors $i$ and $j$ which together with $k$ and $\ell$ constitute a set of four mutually perpendicular vectors; none of them can have a zero square, because if, say, $i^2$ should be zero all the vectors of the plane determined by $a$ and $i$ would have zero squares, contrary to axiom V'.

Using these four vectors we can, as in the other case, express every vector in the form (comp. 8.2)

10.7    $x = \alpha i + \beta j + \gamma k + \delta l.$

The numbers $\alpha, \beta, \gamma, \delta$ will be considered as the components of this vector. In order to show that we have what we wanted we shall express the square of $x$ in terms of its components. Squaring 10.7 and taking into account that $l^2 = -1$ we obtain

$x^2 = \alpha^2 + \beta^2 + \gamma^2 - \delta^2.$

This shows that $\alpha, \beta, \gamma, \delta$ are what we call physical components of the vector; the mathematical components are obtained by setting

$x_1 = \alpha, \quad x_2 = \beta, \quad x_3 = \gamma, \quad x_4 = i\delta,$

and we see that we get the kind of geometry we expect to use in physics.

11. Are the Equations of Physics Invariant?

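We return now to physics. In Chapter I we arrived at certain equations that we consider as fundamental; namely, the equation of continuity (3.5)

11.1    $\partial(\rho u_\alpha)/\partial x_\alpha = 0,$

two sets of Maxwell's equations (4.61 and 4.62)

11.2    $\partial F_{ij}/\partial x_k + \partial F_{jk}/\partial x_i + \partial F_{ki}/\partial x_j = 0,$

11.3    $\partial F_{i\alpha}/\partial x_\alpha = \varepsilon u_i,$

and the equations of motion (6.3)

11.4    $\partial T_{i\alpha}/\partial x_\alpha = 0 \quad (i = 1, 2, 3)$

with (comp. 6.2, 5.1, 2.7)

11.5    $T_{ij} = E_{ij} - M_{ij},$

where $E_{ij}$ is the electromagnetic stress-energy tensor of 5.1, formed from products of the type $F_{i\rho}F_{\rho j}$, and $M_{ij} = \rho u_i u_j$ is the tensor of 2.7.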

The fact that the indices run here from 1 to 4 (except in 11.4) suggested four-dimensional geometry, which we have introduced in Sections 7 and 8; the fact that $x_4$ in the above equations is imaginary (comp. 4.7) suggested the peculiarities discussed in Section 10. Now that we have followed these suggestions and built a mathematical theory we have to see to what results the application of our new theory leads. In addition to following the suggestions we have introduced into the theory a feature that was not directly called for by physics: we made our theory independent of coordinates. In order to bring out the importance of this fact let us consider for a moment the case of two dimensions and compare plane geometry with a two-way diagram. Both in plane analytic geometry and in a diagram we use coordinate axes, but in geometry the axes play an auxiliary role: we find it convenient to express, by referring to axes, properties of configurations which exist and can be treated independently of the axes; the same properties can be expressed using any system of coordinate axes. The situation is different in the case when we use a plane as a means of representing a functional dependence between two quantities of different kind, when we have a diagram. We may use, for instance, the two axes to plot temperature and pressure, or the height of an individual and the number of individuals of that height. In the majority of such cases the axes play an essential part in the discussion; if we delete the axes the diagram loses its meaning, and the question of rotation of axes does not arise.

Returning to physics we have to ask ourselves what we actually need for it, a diagram or a geometry; in other words, are the coordinate axes essential or can they be changed at will, or again, do the equations of physics express properties independent of the coordinate axes; are they invariant, or not.

In order to answer this question let us first consider the formal structure of the equations 11.1 to 11.5. The fundamental dependent variables are here a scalar $\rho$, a vector $u_i$ and an anti-symmetric tensor $F_{ij}$.

The left hand side of equation 11.1 may be described as a result of multiplying the scalar $\rho$ by the vector $u_i$, then differentiating the resulting vector $\rho u_i$, then contracting the tensor so obtained; since the operations of multiplication, differentiation and contraction have been shown to be invariant, the scalar $\partial(\rho u_\alpha)/\partial x_\alpha$ is independent of the system of coordinates used, and if it is zero in one system of coordinates it is zero in all systems of coordinates. The continuity equation expresses, therefore, a fact independent of the system of coordinates employed.

An analogous reasoning applied to 11.3 would show the invariant character of that system. The question of invariance of the first system of Maxwell's equations requires a special discussion; it can best be treated by introducing a new anti-symmetric tensor $D_{ij}$ connected with $F_{ij}$ by the following relations:

11.6    $F_{12} = D_{34}, \quad F_{23} = D_{14}, \quad F_{31} = D_{24}, \quad F_{14} = D_{23}, \quad F_{24} = D_{31}, \quad F_{34} = D_{12}.$

Before we show how this is going to help us in connection with our equations we want to prove that these relations are independent of the coordinate system; i.e., that if 11.6 hold, relations of the same form, namely

11.6'    $F'_{12} = D'_{34}$, etc.,

will hold in any other coordinate system. Again, since a general transformation of coordinates can be achieved in steps it will be enough to test an $x_1x_2$ rotation only. As a result of such a rotation $F_{12}$ becomes (comp. 9.21)

$F'_{12} = -F_{11}cs + F_{12}c^2 - F_{21}s^2 + F_{22}sc;$

using the fact that $F_{ij}$ is anti-symmetric (4.4) we find

11.7    $F'_{12} = F_{12},$

and since obviously $D'_{34} = D_{34}$, because the $x_3x_4$ axes are not affected, we see that the first of the relations 11.6' follows from 11.6. In order to find $F'_{23}$ we have, according to the general rule following 9.31, to substitute $\delta_{2i}$ and $\delta_{3i}$ for $x'_i$ and $y'_i$ respectively in $F'_{\rho\sigma}x'_\rho y'_\sigma = F_{\rho\sigma}x_\rho y_\sigma$. As the corresponding values of $x_i$ and $y_i$ we find with the aid of 9.2, considering that $x'_3 = x_3$, $x'_4 = x_4$,

$x_1 = -s, \quad x_2 = c, \quad x_3 = 0, \quad x_4 = 0,$

$y_1 = 0, \quad y_2 = 0, \quad y_3 = 1, \quad y_4 = 0;$

so that

11.71    $F'_{23} = -sF_{13} + cF_{23};$

in a similar way we obtain

11.72    $D'_{14} = D_{14}c + D_{24}s;$

taking again into account the anti-symmetric property of $F$ (so that $-F_{13} = F_{31} = D_{24}$) we come to the conclusion that the second relation of 11.6' is a consequence of 11.6, and since the same reasoning applies to the remaining relations we conclude that the relations 11.6 are independent of the coordinate system; it is easy to see that they assign to every tensor $F_{ij}$ a tensor $D_{ij}$ (the tensor $D_{ij}$, or, rather, $\sqrt{-1}\,D_{ij}$, is often referred to as the dual of $F_{ij}$). Now if we express, using 11.6, the $F_{ij}$ in 11.2 in terms of the $D_{ij}$, that set becomes

11.2'    $\partial D_{i\alpha}/\partial x_\alpha = 0,$

and its invariance follows from general considerations as in the case of 11.3.
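The independence of the relations 11.6 from the coordinate system may also be tested numerically. The sketch below assumes the six relations in the form $F_{12} = D_{34}$, $F_{23} = D_{14}$, $F_{31} = D_{24}$, $F_{14} = D_{23}$, $F_{24} = D_{31}$, $F_{34} = D_{12}$, and an $x_1x_2$ rotation whose new axes have the old components $(c, s, 0, 0)$ and $(-s, c, 0, 0)$; the function names and the sample numbers are ours. It forms the tensor $D$ before and after the rotation and compares the two results.

```python
import math

def antisymmetric(t12, t13, t14, t23, t24, t34):
    """Build a 4x4 anti-symmetric array from its six independent components."""
    a = [[0.0] * 4 for _ in range(4)]
    entries = {(0, 1): t12, (0, 2): t13, (0, 3): t14,
               (1, 2): t23, (1, 3): t24, (2, 3): t34}
    for (i, j), value in entries.items():
        a[i][j] = value
        a[j][i] = -value
    return a

def dual(F):
    """Tensor D assigned to F by the assumed relations 11.6 (indices are 0-based here)."""
    return antisymmetric(t12=F[2][3],   # D12 = F34
                         t13=F[3][1],   # D13 = -D31 = -F24 = F42
                         t14=F[1][2],   # D14 = F23
                         t23=F[0][3],   # D23 = F14
                         t24=F[2][0],   # D24 = F31
                         t34=F[0][1])   # D34 = F12

def rotate_12(T, angle):
    """Components of a second-rank tensor after a rotation in the x1x2 plane."""
    c, s = math.cos(angle), math.sin(angle)
    A = [[c, s, 0.0, 0.0], [-s, c, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
    return [[sum(A[i][p] * A[j][q] * T[p][q] for p in range(4) for q in range(4))
             for j in range(4)] for i in range(4)]

def max_difference(P, Q):
    """Largest absolute difference between corresponding components."""
    return max(abs(P[i][j] - Q[i][j]) for i in range(4) for j in range(4))

if __name__ == "__main__":
    F = antisymmetric(1.0, 2.0, 3.0, 4.0, 5.0, 6.0)   # arbitrary sample values
    angle = 0.7                                        # arbitrary rotation angle
    # Rotating F and then forming D should agree with forming D and then rotating it.
    print(max_difference(dual(rotate_12(F, angle)), rotate_12(dual(F), angle)))
```

A difference of the order of rounding error means that the relations 11.6' hold in the rotated system whenever 11.6 hold in the original one.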

Formula 11.5 contains only multiplications, contractions and additions, so that there is no doubt concerning its invariance, but the situation changes when we come to the set 11.4. The vector $\partial T_{i\alpha}/\partial x_\alpha$ has been obtained by invariant operations but 11.4 states that only three of its components are zero, a statement which obviously depends on the choice of coordinate axes and is not invariant.

We have now two courses open before us: one is that of resignation; we can say: we see that physics is not like geometry in this respect, that we can only use four-dimensional notations, a four-dimensional diagram but not four-dimensional geometry. The other course is that of adventure; we may try to play the game of geometry. Let us pretend that we can apply the formulas of transformation of coordinates in this case; we know that there will be a difference between the theory we obtain and the physics which we undertook to translate into our language; but it may be that the difference will amount numerically to very little. Consider the fourth component of the vector $\partial T_{i\alpha}/\partial x_\alpha$; we found (comp. remark following 6.3) that one of the terms of this expression, $\partial M_{4\alpha}/\partial x_\alpha$, vanishes, and the other, $\partial E_{4\alpha}/\partial x_\alpha$, gives $Xu + Yv + Zw$, where $u, v, w$ are the components of velocity; but in order to present the Maxwell equations in a simple form we had to choose our units in such a way that the velocity of light is unity; ordinary velocities are of the order of magnitude of one ten-millionth of the velocity of light, so that we see that by setting the fourth component of $\partial T_{i\alpha}/\partial x_\alpha$ equal to zero we would commit an error that is numerically very small. This encourages us to go on with our adventure and try to force the geometrical character on physics. In order to do that let us go beyond the formal structure of our formulas and recall what the meaning of our fundamental quantities was. The components of the vector $u_i$ were given (see 1.2, 2.4, 4.9) as

11.8    $u_1 = dx/dt, \quad u_2 = dy/dt, \quad u_3 = dz/dt, \quad u_4 = i.$

But this identification is obviously not independent of the coordinate system; it gives preference to the fourth coordinate. We may think that this is the source of our difficulty, and that this difficulty may be overcome if we find an invariant identification to take the place of 11.8. The next section will prepare the way for this.

12. Curves in the New Geometry.

The root of the difficulty is that our description of motion was not invariant; motion was described by giving the dependence of the coordinates $x, y, z$ on time, that means by giving three of our coordinates $x_1, x_2, x_3$ as functions of the fourth, $x_4$, which thus is given preference. The situation is analogous to that in plane analytic geometry where we give $y$ as a function of $x$, or that in solid analytic geometry when we give $y$ and $z$ as functions of $x$; in both cases we represent curves; from our four-dimensional point of view we should then consider motion of a particle as a curve in four dimensions (using the word curve in a general sense so that straight line is a special case).

What we want then is a representation of curves in four dimensions which would not give preference to the fourth coordinate. We begin by considering representations of curves in two and three dimensions which give no preference to one coordinate. In the plane a line may be represented by

$x = ap + b, \quad y = cp + d,$

a circle by

$x = r\cos p, \quad y = r\sin p;$

in space a line by

$x = ap + b, \quad y = cp + d, \quad z = ep + f,$

a helix by

$x = r\cos p, \quad y = r\sin p, \quad z = kp,$ etc.

In all these cases to every value of the "parameter" $p$ corresponds a point of the curve; in general, if we set

$x = f(p), \quad y = g(p), \quad z = h(p)$

we have what we call a parametric representation of a curve (comp. parametric form of equation of straight line in 7.1). In the same way we may represent a curve in four dimensions, which we take to mean motion of a particle, by giving $x_i$ as functions of a parameter $p$.

The defect of this method is that it contains a certain arbitrariness; we may substitute for $p$ another parameter $q$ by making $p$ an arbitrary increasing function of $q$. We want now to standardize our parametric representation. The usual way is to choose the arc length along the curve as the parameter. Without going into detail we shall state that arc length between points corresponding to values $p_1$ and $p_2$ of the parameter is given by

12.1    $s = \int_{p_1}^{p_2} \sqrt{(dx/dp)^2 + (dy/dp)^2 + (dz/dp)^2}\; dp;$

in the special case when $s$ is used as parameter $p$ we differentiate both sides and obtain

12.2    $(dx/dp)^2 + (dy/dp)^2 + (dz/dp)^2 = 1.$

We may consider in general $dx/dp$, $dy/dp$, $dz/dp$ as the components of a vector tangent to the curve; the change of parameter would multiply these derivatives by the same number, i.e., substitute another tangent vector for that one; the quantity $(dx/dp)^2 + (dy/dp)^2 + (dz/dp)^2$ gives the square of the length of the tangent vector; the above equality 12.2 expresses then the fact that if we use arc length as the parameter the length of the tangent vector whose components are the derivatives of the coordinates with respect to the parameter is unity.

We come thus to the idea of a unit tangent vector; it characterizes in every point the direction of the curve; its components are the direction cosines of the tangent.
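The passage from an arbitrary parameter to arc length may be illustrated on the helix mentioned above. In the sketch below the radius, the pitch, the step of the numerical differentiation and all function names are our own illustrative assumptions. For the helix the tangent vector has the constant length $\sqrt{r^2 + k^2}$, so that $s = p\sqrt{r^2 + k^2}$, and after this change of parameter the tangent vector should come out of unit length.

```python
import math

def helix(p, r=2.0, k=0.5):
    """Point of the helix x = r cos p, y = r sin p, z = k p."""
    return (r * math.cos(p), r * math.sin(p), k * p)

def helix_by_arc_length(s, r=2.0, k=0.5):
    """The same helix with arc length s as parameter: p = s / sqrt(r^2 + k^2)."""
    return helix(s / math.sqrt(r * r + k * k), r, k)

def tangent(curve, p, h=1e-6):
    """Central-difference estimate of the derivatives of the coordinates at p."""
    before, after = curve(p - h), curve(p + h)
    return tuple((b - a) / (2.0 * h) for a, b in zip(before, after))

def length(vector):
    """Ordinary length of a three-dimensional vector."""
    return math.sqrt(sum(component * component for component in vector))

if __name__ == "__main__":
    # With the original parameter the tangent has length sqrt(r^2 + k^2).
    print(length(tangent(helix, 1.0)))
    # With arc length as parameter the tangent is a unit vector (equality 12.2).
    print(length(tangent(helix_by_arc_length, 1.0)))
```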

We may try to go through an analogous process in the case of curves in four-dimensional space, which as we saw may be taken to represent motions; if we succeed, the vector at which we arrive will suggest itself as a natural thing to identify with the vector of components $u_i$ which appears in our formulas. Starting with any parametric representation $x_i = x_i(p)$, where $p$ may be for example $t$, we try to change our parameter by introducing a new variable $q$ and making $p$ a function of $q$, choosing this function in such a way that

$(dx_1/dq)^2 + (dx_2/dq)^2 + (dx_3/dq)^2 + (dx_4/dq)^2 = 1;$

but $dx_i/dq = (dx_i/dp)\cdot(dp/dq)$; so that the function $p(q)$ must be such that

$dp/dq = 1\big/\sqrt{(dx_1/dp)^2 + (dx_2/dp)^2 + (dx_3/dp)^2 + (dx_4/dp)^2}.$

Is this possible? In the case when the original $p$ is $t$ the expression under the radical sign will be $u^2 + v^2 + w^2 - 1$, and for motions whose velocities are smaller than the velocity of light (Section 4) this is negative, so that we would get an imaginary value for $dp/dq$. In order to avoid this unpleasantness we decide to standardize our parameter by requiring $(dx_\alpha/dq)\cdot(dx_\alpha/dq)$ to be $-1$ instead of 1; in this case we find for $dq/dt$ the expression $\sqrt{1 - (u^2 + v^2 + w^2)}$ and we may write

12.3    $u_1 = \dfrac{u}{\sqrt{1 - \beta^2}}, \quad u_2 = \dfrac{v}{\sqrt{1 - \beta^2}}, \quad u_3 = \dfrac{w}{\sqrt{1 - \beta^2}}, \quad u_4 = \dfrac{i}{\sqrt{1 - \beta^2}},$

where $\beta$ stands for $\sqrt{u^2 + v^2 + w^2}$, i.e., for what we call speed (the length of the velocity vector). The quantities just written out we want to identify with the components of the vector $u_i$ which appears in our formulas. Since in ordinary cases $\beta$ is very small, the radical $\sqrt{1 - \beta^2}$ is very near to unity and our new identification differs from our old identification (11.8) numerically very little. On the other hand the new values for $u_i$ are, according to their derivation, the components of a vector, so that if we adopt this identification and also agree to set the fourth component of the divergence of the tensor $T_{ij}$ equal to zero we obtain an invariant theory whose statements differ only very slightly from those accepted in classical physics. It remains to be seen whether there are cases in which the discrepancy is large enough to be tested by experiment.
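The new identification can be tried out on numbers. The following sketch assumes units in which the velocity of light is unity and takes the four components in the form $u/\sqrt{1-\beta^2}$, $v/\sqrt{1-\beta^2}$, $w/\sqrt{1-\beta^2}$, $i/\sqrt{1-\beta^2}$; the function names and the sample velocities are ours. It verifies that the sum of the squares of the components is $-1$, and that for a small speed the components are close to those of the old identification $dx/dt$, $dy/dt$, $dz/dt$, $i$.

```python
import math

def four_velocity(u, v, w):
    """Mathematical components u_i for ordinary velocity (u, v, w), light speed = 1."""
    beta_squared = u * u + v * v + w * w
    if beta_squared >= 1.0:
        raise ValueError("the speed must be smaller than the velocity of light")
    factor = 1.0 / math.sqrt(1.0 - beta_squared)
    # The fourth component is imaginary because x4 = i t.
    return (u * factor, v * factor, w * factor, 1j * factor)

def square(vector):
    """Sum of the squares of the components (the square of the vector)."""
    return sum(component * component for component in vector)

def old_identification(u, v, w):
    """Components according to the old identification: dx/dt, dy/dt, dz/dt, i."""
    return (u, v, w, 1j)

if __name__ == "__main__":
    fast = four_velocity(0.3, 0.4, 0.0)       # speed one half of that of light
    print(square(fast))                        # close to -1
    slow_new = four_velocity(1e-4, 0.0, 0.0)   # an "ordinary" velocity
    slow_old = old_identification(1e-4, 0.0, 0.0)
    # Largest discrepancy between the two identifications for the slow motion.
    print(max(abs(a - b) for a, b in zip(slow_new, slow_old)))
```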


Chapter III. SPECIAL RELATIVITY

Guided by the point of view that the formulas of physics ought to be interpreted in four-dimensional geometry we were led to the interpretation of the motion of a particle as a curve in space-time. Following the analogy with a curve in ordinary space where arc length, $s$, is often used as a parameter, we have introduced a standard parameter, which we may also call arc length and denote by $s$, for curves in space-time. The derivatives $dx_i/ds$ of the coordinates of a point on the curve with respect to $s$ may be considered as the components of a vector tangent to the curve (the square of this vector is $-1$ in every point - we shall refer to such a vector also as a unit vector). We have then at every point $x_1, x_2, x_3, x_4$ of such a curve a unit vector $dx_i/ds$, and we have agreed to identify this vector with the vector $u_i$ which appears in our fundamental laws of physics (11.1 to 11.5) so that

$u_i = dx_i/ds.$

In this chapter we want to consider some consequences of this identification.

13. Equations of Motion.

The one thing that was not satisfactory about the formulas of physics was the fact that according to 11.4 only three components of the vector $\partial T_{i\alpha}/\partial x_\alpha$ are equal to zero. In this section we shall see how this defect is corrected by the adoption of the new identification. But before we do that we have to study some immediate consequences of this identification.

Before, a motion of a particle was given by giving the position of the particle in different moments of time, i.e., the coordinates $x, y, z$ as functions of $t$. Given these functions we can calculate for every moment the velocity vector of the particle - a vector of components

u = dx/dt, v = dy/dt, w = dz/dt.

Now, the same motion is described by giving $x_1, x_2, x_3, x_4$ as functions of $s$; and we have a vector of components $u_i = dx_i/ds$ which, due to the special choice of the parameter, satisfies the equation

13.1    $u_1^2 + u_2^2 + u_3^2 + u_4^2 = -1.$

Of course, we have merely two representations of the same thing. Given the $x_i(s)$ we can express $s$ as a function of $t$ from

$x_4(s) = it$

and substituting the expression of $s$ so found into $x_1(s)$, $x_2(s)$, $x_3(s)$ we will have $x, y, z$ as functions of $t$. Or given $x, y, z$ as functions of $t$ we can arrive at the representation $x_i(s)$ as indicated in Section 12.

Also the space-time vector $u_i$ and the space vector $u, v, w$ describe the same thing. The formulas 12.3 show how to find the components $u_i$ in terms of the velocity vector $u, v, w$, and it is easy to find the $u, v, w$ in terms of the components $u_i$. We simply have

$u = \dfrac{dx}{dt} = \dfrac{dx/ds}{dt/ds} = \dfrac{i\, dx_1/ds}{dx_4/ds} = i\,u_1/u_4,$

and in a similar fashion

13.2    $v = i\,u_2/u_4, \quad w = i\,u_3/u_4.$

We see thus that the vector $u_i$ determines the velocity of motion, and we agree to call it the four-dimensional velocity vector. On the other hand, being a unit vector this vector characterizes the direction in four dimensions of the curve representing motion; its components $u_i$ may be considered as the direction cosines of the tangent (compare Section 7, between formulas 7.41 and 7.5).

But a velocity vector does not characterize the motion of a particle completely; it gives only the kinematical characterization; in dynamics we need, in addition, to know the mass of the particle, and then we form the momentum vector (compare beginning of Section 1) whose components are $mu$, $mv$, $mw$. By analogy we form the expressions $mu_i$ or $\rho u_i$ (depending on whether we use the discrete or the continuous picture of matter) and consider them as the components of the four-dimensional momentum vector. Using the formulas 12.3 we have for its components

13.3    $mu_1 = \dfrac{mu}{\sqrt{1 - \beta^2}}, \quad mu_2 = \dfrac{mv}{\sqrt{1 - \beta^2}}, \quad mu_3 = \dfrac{mw}{\sqrt{1 - \beta^2}}, \quad mu_4 = \dfrac{im}{\sqrt{1 - \beta^2}}.$

These are what we call the mathematical components of the momentum vector; its "physical components" are

13.31    $a = \dfrac{mu}{\sqrt{1 - \beta^2}}, \quad b = \dfrac{mv}{\sqrt{1 - \beta^2}}, \quad c = \dfrac{mw}{\sqrt{1 - \beta^2}}, \quad d = \dfrac{m}{\sqrt{1 - \beta^2}}.$

We obtain a relation between the momentum components and mass if we take the sum of the squares of the components 13.3 or 13.31 and use 13.1, viz.,

$a^2 + b^2 + c^2 - d^2 = -m^2.$

In words, the negative square of mass is the square of the momentum vector, so that mass is essentially given by the length of the momentum vector; we see here another advantage of the four-dimensional representation: the four dynamical quantities of a particle which in classical physics are given by the three momentum components and mass are here represented more naturally by the four components of a vector.

As stated many times before, numerically $\sqrt{1 - \beta^2}$ is in most applications very close to unity so that approximately the first three components of the four-dimensional momentum vector are equal to the components of the three-dimensional momentum vector and the last (physical) component of the four-dimensional momentum vector is, in first approximation, equal to mass.

Let us consider more in detail this fourth component of the momentum vector. If we want a better approximation we develop the last of 13.31 according to powers of $\beta$ and keep only two terms; we have thus the approximate equality

13.4    $d = m + \tfrac12 m\beta^2;$

the correction represented by the second term is nothing but kinetic energy; of course, if ordinary units are used this term has to be written

13.41    $\tfrac12 mV^2/c^2,$

where $c$ is the velocity of light, and $V$ is the velocity of the particle measured in the same units, because $\beta$ is the ratio of the velocity of the particle to that of light. We had better say then that the correction is kinetic energy divided by the square of the velocity of light. Sometimes this fact is expressed by saying (neglecting the other terms, which are very, very small) that when a body is in motion its mass is increased by its kinetic energy (divided by the square of the velocity of light).
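How good the two-term approximation is may be seen from a few numbers. The sketch below compares the fourth physical component $m/\sqrt{1-\beta^2}$ with $m + \tfrac12 m\beta^2$; the function names and the chosen values of $\beta$ are our own illustrative assumptions, and units in which the velocity of light is unity are used throughout.

```python
import math

def fourth_component(m, beta):
    """Exact fourth physical component of the momentum vector: m / sqrt(1 - beta^2)."""
    return m / math.sqrt(1.0 - beta * beta)

def two_term_approximation(m, beta):
    """Mass plus kinetic energy (light speed = 1): m + (1/2) m beta^2."""
    return m + 0.5 * m * beta * beta

if __name__ == "__main__":
    m = 1.0
    for beta in (0.001, 0.01, 0.1, 0.5, 0.9):
        exact = fourth_component(m, beta)
        approx = two_term_approximation(m, beta)
        # The neglected terms begin with (3/8) m beta^4, so the error grows with beta.
        print(beta, exact, approx, exact - approx)
```

For small $\beta$ the two expressions are practically indistinguishable; the difference becomes appreciable only when the speed is a considerable fraction of the velocity of light.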

The interest of this lies in the close relationship which is thus established between mass and energy - a relationship that plays a prominent part in present physics.

Sometimes the whole expression $m/\sqrt{1 - \beta^2}$ is referred to as energy of the particle; mass, from this point of view, appears then as part of the energy, that part that the particle possesses even when it is at rest; in other words, mass appears as the rest-energy of the particle. We could also call $m/\sqrt{1 - \beta^2}$ generalized mass and say that mass changes as a result of motion (compare end of this section).


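We are ready now to discuss the equations of motion 11.4 or

13.5    $\partial(\rho u_i u_\alpha)/\partial x_\alpha = \partial E_{i\alpha}/\partial x_\alpha.$

The left hand sides may be written as

$\partial(\rho u_\alpha)/\partial x_\alpha \cdot u_i + \rho u_\alpha \cdot \partial u_i/\partial x_\alpha;$

the first factor of the first term vanishes according to the continuity equation 11.1; the second term may be written, recalling the definition of $u_i$, as

$\rho u_\alpha \cdot \partial u_i/\partial x_\alpha = \rho \cdot \partial u_i/\partial x_\alpha \cdot dx_\alpha/ds = \rho \cdot du_i/ds;$

the right hand side of 13.5, according to our former calculation (Section 6), is $\varepsilon F_{i\alpha}u_\alpha$, so that the equations of motion become

$\rho \cdot du_i/ds = \varepsilon F_{i\alpha}u_\alpha$

or, if we use the discrete picture, considering both mass and electric density to be concentrated in one point, and denoting mass by $m$, electric charge by $e$,

13.51    $m \cdot du_i/ds = eF_{i\alpha}u_\alpha.$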

These are the equations we are going to discuss. In applying them to physics we give preference to time by writing

13.52    $m \cdot du_i/dt = eF_{i\alpha} \cdot dx_\alpha/dt,$

which spoils the invariant form but does not change the contents of the statement because the transition from 13.51 to 13.52 is equivalent to multiplication by $ds/dt$. Using 4.72 (or 6.1) the last equations become

13.53    $m \cdot du_1/dt = e(X + Nv - Mw),$
         $m \cdot du_2/dt = e(Y + Lw - Nu),$
         $m \cdot du_3/dt = e(Z + Mu - Lv),$
         $m \cdot du_4/dt = ie(Xu + Yv + Zw).$
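The right hand sides of 13.53 are connected among themselves: if the first three are multiplied by $u$, $v$, $w$ respectively and added, the terms containing $L$, $M$, $N$ cancel and there remains $e(Xu + Yv + Zw)$, the expression which (apart from the factor $i$) stands on the right of the fourth equation. The sketch below checks this cancellation on sample numbers; the function names and the numerical values of the field and of the velocity are purely illustrative assumptions.

```python
def force_components(e, electric, magnetic, velocity):
    """Right hand sides of the first three equations 13.53."""
    X, Y, Z = electric
    L, M, N = magnetic
    u, v, w = velocity
    return (e * (X + N * v - M * w),
            e * (Y + L * w - N * u),
            e * (Z + M * u - L * v))

def rate_of_work(e, electric, velocity):
    """The expression e(Xu + Yv + Zw) of the fourth equation, without the factor i."""
    X, Y, Z = electric
    u, v, w = velocity
    return e * (X * u + Y * v + Z * w)

if __name__ == "__main__":
    e = 1.5                        # sample charge
    electric = (0.2, -0.7, 1.1)    # sample X, Y, Z
    magnetic = (0.4, 0.9, -0.3)    # sample L, M, N
    velocity = (0.1, 0.25, -0.05)  # sample u, v, w
    force = force_components(e, electric, magnetic, velocity)
    combined = sum(f * c for f, c in zip(force, velocity))
    # The two printed numbers should agree up to rounding.
    print(combined, rate_of_work(e, electric, velocity))
```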

Multiplying the left hand side of the first of these equations by $i\,u_1/u_4$ and the right hand side by $u$ (compare 13.2); using in the same way $i\,u_2/u_4 = v$ on the second, and $i\,u_3/u_4 = w$ on the third, and adding the results we get the fourth equation, because the left-hand side comes out

$(im/u_4)\cdot(u_1 \cdot du_1/dt + u_2 \cdot du_2/dt + u_3 \cdot du_3/dt)$

and, differentiating the identity 13.1, we find that the second factor is $-u_4 \cdot du_4/dt$. The fourth equation is thus a consequence of the first three, a great improvement over the situation as it was before the new identification. The fourth equation also has a definite physical meaning now; the left hand side may be said to represent the time rate of change of energy (since the variable part of $mu_4$ has been recognized as kinetic energy), and the right hand side has been recognized before (Section 6) as the rate at which the energy of the field (potential energy) is being expended in moving the body. The difficulty with the fourth equation has thus been settled in a most satisfactory fashion, but the system as a whole, or the first three equations, have to be tested by experiment (the fourth, being a consequence of the first three, cannot be wrong if these three are proved to be "true"). Since

$\dfrac{du_i}{dt} = \dfrac{d}{dt}\left(\dfrac{dx_i}{ds}\right) = \dfrac{dt}{ds}\cdot\dfrac{d}{dt}\left(\dfrac{dx_i}{dt}\right)$

(the variation of the factor $dt/ds$ during the motion being left out of account) we may write our equations as

13.6    $m' \cdot d^2x/dt^2 = e(X + Nv - Mw),$
        $m' \cdot d^2y/dt^2 = e(Y + Lw - Nu),$
        $m' \cdot d^2z/dt^2 = e(Z + Mu - Lv),$

where

13.7    $m' = m/\sqrt{1 - \beta^2}.$

The right hand sides of these equations (as stated in Section 6) are the components of the force exerted by the electromagnetic field on the particle. Comparing the left hand sides with the classical expressions we see then that the correction resulting from our identification is equivalent to the substitution of $m'$ for $m$ in the classical equations of motion. We may say then that our theory predicts that motion will be governed by the old equations in which mass has been replaced by a corrected mass, the correction being the kinetic energy (divided by the square of the velocity of light). In the vast majority of cases the factor $1/\sqrt{1 - \beta^2}$ is very close to one, but there are a few cases where it is not, and these cases afford an opportunity to test the new theory and to see whether it or the old one is better adapted to give account of experimental results. In experiments with "cathode ray particles" by Bucherer the predictions of the new theory seem to have been verified.

14. Lorentz Transformations.

Now that we saw that the new identification removes the difficulty in connection with the fourth equation of motion we want to consider some other consequences, and in the first place we want to give a discussion of the physical significance of the transformation of coordinates promised in Section 10. The new feature about our coordinate systems is the greater arbitrariness in their choice. Before, we were free to pass from one system to another with the same time axis; now we may change the time axis also (formulas 10.6), and we want to see what it means.

In general, in one system of coordinates a geometrical configuration is described by certain numbers (e.g., coordinates of its points) and certain equations (e.g., equations of its straight lines); in another system of coordinates the same configuration will be characterized by other numbers and other equations, but it will be another description of the same configuration; or, we may say, another identification which is theoretically just as good as the first, but may be more, or less, convenient for practical purposes. In general we make our choice of a coordinate system guided by the properties of the object we are studying and our own position in space. If we study an ellipsoid we would choose for coordinate axes its principal axes; or in another case we would choose the direction away from us as the y-axis, the direction to the right as the x-axis, the vertical direction as the z-axis; but in principle all axes are permitted. The same general situation obtains in physics, considered as four-dimensional geometry. We have many systems of coordinate axes at our disposal, and we want to investigate now what use we can make of this arbitrariness, how we can adjust the choice of axes to the requirements of a particular situation. In particular, we are interested in the choice of the t-axis.

The object we want to study in the first place is the motion of a particle. We represent such a motion by a curve in four-space (and a straight line we consider as a special case of a curve). At every point of that curve we have a unit tangent vector, the four-dimensional velocity vector of components $a, b, c, d$; or we may characterize it by the three-dimensional velocity components $u, v, w$; if we pass to another coordinate system the components $a, b, c, d$ will be changed, and so will the components $u, v, w$. If the coordinate transformation affects only the space coordinates $x, y, z$ then the component $d$ will not be affected, and therefore $\beta$ will not change; in other words, $u, v, w$ will be changed but not $u^2 + v^2 + w^2$; the velocity vector will have different components, but its absolute value, the speed, will be the same. This is essentially as in old physics; the new feature is in the existence of transformations affecting $t$, and the most striking result of it is expressed in the following theorem.

Theorem. For every motion it is possible for every moment to choose a system of space-time coordinates in such a way that the speed be zero.

Proof. Begin with any axes; then, without changing the time axis, change the space axes so that the motion, at the moment considered, takes place along the z-axis; we have then $a = b = 0$; now consider a transformation involving $z$ and $t$; denoting the new components of the four-dimensional velocity vector by $a', b', c', d'$ we shall have (10.6)

$a' = 0, \quad b' = 0, \quad c' = c\sigma + d\tau, \quad d' = c\tau + d\sigma;$

if we want to make $c' = 0$ we have to choose the angle $\psi$ so that

$\tau/\sigma = \sin\psi = -c/d;$

if $c$ is in absolute value less than $d$, and this is so for all motions of bodies so far observed, an angle satisfying this relation, and therefore a system of coordinates for which $a' = b' = c' = 0$, can be found. From formulas 12.3 it follows that in such a system $u = v = w = 0$, and the theorem is proved.
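The choice of the angle in the proof may be followed on numbers. In the sketch below the components $c$ and $d$ are taken for a motion along the z-axis with speed 0.6 (in units of the velocity of light), so that $c = 0.75$, $d = 1.25$; these values and the function names are our own illustrative assumptions. The angle is determined from $\sin\psi = -c/d$, and the transformed components are then computed from $c' = c\sigma + d\tau$, $d' = c\tau + d\sigma$.

```python
import math

def rest_coefficients(c, d):
    """Coefficients (sigma, tau) = (sec psi, tan psi) with sin psi = -c/d."""
    if abs(c) >= abs(d):
        raise ValueError("c must be smaller than d in absolute value")
    psi = math.asin(-c / d)
    return 1.0 / math.cos(psi), math.tan(psi)

def transform_components(c, d, sigma, tau):
    """New third and fourth components: c' = c*sigma + d*tau, d' = c*tau + d*sigma."""
    return c * sigma + d * tau, c * tau + d * sigma

if __name__ == "__main__":
    speed = 0.6                                   # sample speed along the z-axis
    d = 1.0 / math.sqrt(1.0 - speed * speed)      # fourth physical component
    c = speed * d                                 # third component
    sigma, tau = rest_coefficients(c, d)
    c_new, d_new = transform_components(c, d, sigma, tau)
    # c' should vanish and d' should equal 1: the particle is transformed to rest.
    print(c_new, d_new)
```

Since $d^2 - c^2 = 1$ for a unit vector, the transformed fourth component comes out equal to unity, as it must for a particle at rest.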

The theorem just proved is expressed often by saying that every particle can be transformed to rest.

After we have found a coordinate system in which a particle is at rest we can perform any transformations of space coordinates and the property will not be destroyed; any transformation involving time, on the contrary, will result in introducing significant space components of the velocity vector; we see thus that whether a particle is at rest in a coordinate system depends exclusively on the choice of the time axis, so that the choice of the time axis is equivalent to the choice of a body which we desire to consider at rest; in other words, the direction of the time axis may be characterized by indicating what particle is at rest in the corresponding coordinate system.

What time axis we actually choose depends, as in geometry, on circumstances; in many cases we shall want to consider ourselves as being at rest, or our laboratory, or the earth.

In what precedes we spoke of a motion of a particle at a given moment; in a given system of space-time coordinates a particle may be at rest at one moment and not at rest some other time; but there exists a class of particles which if transformed to rest for one moment will be at rest always; these are those particles whose representative four-dimensional curves have the same tangent vector at all points, i.e., are straight lines; it is clear that if the direction of such a straight line is taken as the direction of the t-axis the velocity in the so obtained coordinate system is zero. But if we choose any other (cartesian) coordinate system, the three-dimensional velocity u,v,w will be constant in absolute value and direction, so that we have a rectilinear uniform motion. From our point of view then the distinction between uniform rectilinear motion and rest is a non-essential distinction; this distinction does not exist until we introduce a coordinate system; it is of the same nature as the distinction between lines which are and those which are not parallel to the x-axis in ordinary analytic geometry.

If a motion is not uniform and rectilinear then there is no coordinate system in which the particle is permanently at rest. But rather than to make a strict distinction between particles which are and those which are not in uniform rectilinear motion or rest, it is more in keeping with our point of view to speak of particles which may be (within experimental error) considered at rest for a sufficiently long period of time.

We are now in a position to explain the name and the origin of the theory we are studying. We saw that in this theory there is no such thing as absolute rest or absolute motion of a body. If it is at rest with respect to one system of coordinates it may move with respect to another and vice versa; we can only speak of relative motion; that is where the name Relativity comes from.

If we adopt this point of view, we have to consider as permissible all transformations of coordinates from one to any other cartesian coordinate system. Later on we shall consider other, more general, systems of coordinates, and therefore more general coordinate transformations; we shall replace our equations by more general ones which will be invariant under these more general coordinate transformations; in comparison with this situation we may say that we consider now only special coordinate transformations and invariance under them; therefore the present theory is called "Special Relativity Theory".

It may be mentioned that the historical order of appearance of the ideas of our subject, as it happens so often, has been quite different from the order which seems natural and in which we have presented them. First the formulas of transformation involving space coordinates and time have been introduced by Lorentz without, however, giving to them the meaning they have now; in Lorentz's theory there exists one universal time t, and other times t' play only an auxiliary part. The merit of making the decisive step and recognizing the fact that all these variables are on the same footing belongs to Einstein (1905). The four-dimensional point of view, after some preliminary work had been done by Poincaré and Marcolongo, was most emphatically introduced by Minkowski in 1908.

15. Addition of Velocities.

As explained in Section 13 we have two ways of characterizing the velocity of a body: by means of the three-dimensional velocity vector and by means of the four-dimensional velocity vector. We can pass from one representation of velocity to the other without difficulty and the two methods are equivalent as long as we do not change our coordinate system. But if we come to study the relative motion of one body with respect to another and want to define the relative velocity, the four-dimensional point of view leads to conceptions which are at variance with commonly accepted ideas and we want to devote this section to the clarification of this situation. It is natural to reduce the definition of relative velocity of a body with respect to another body to the conception of the velocity of a body in a coordinate system by saying: By velocity of the body B with respect to a body A we mean the velocity of B in a system of coordinates in which the velocity of A is zero.

If we want to find the velocity of B with respect to A we have to transform our coordinates so that in the transformed coordinate system A be at rest. It is clear that the meaning of relative velocity is made to depend by the preceding definition on what we mean by transformation of coordinates. If by transformation of coordinates we mean only transformation of three-dimensional coordinates, that is transition to moving axes, we have the old idea of relative velocity; if, on the other hand, we consider four-dimensional coordinate axes and our transformation of coordinates involves the coordinate $x_4$, or t, in the sense of the theorem of Section 14, it is clear that we give a new meaning to relative velocity, and we should not be surprised if the so defined "relativistic" relative velocity will possess properties different from those of the "classical" relative velocity.

Consider a body A and a body B that moves with respect to A uniformly and rectilinearly with a velocity $V_{BA}$; this means according to our definition that the velocity of B in a coordinate system in which A is at rest is $V_{BA}$. Introduce a coordinate system in which A is at rest and B moves along the z-axis, and call the coordinates $x_A, y_A, z_A, t_A$; introduce also a system of coordinates $x_B, y_B, z_B, t_B$ in which B is at rest so that (10.6)

15.1    $x_A = x_B, \quad y_A = y_B, \quad z_A = z_B\sigma_{BA} + t_B\tau_{BA}, \quad t_A = z_B\tau_{BA} + t_B\sigma_{BA}.$

Now describe the motion of the body B in each system neglecting the x and y coordinates. In the system A the motion of the body B which, we assume, at the moment given by $t_A = 0$ was at $z_A = 0$ is

$z_A = V_{BA}\,t_A;$

in the system B the body B is at all times at the origin of the coordinate system, so that $z_B = 0$; substituting this value in the transformation formulas and eliminating $t_B$ we get $z_A = t_A\,\tau_{BA}/\sigma_{BA}$. Comparing this with the preceding equation we have (10.5)

15.2    $V_{BA} = \tau_{BA}/\sigma_{BA}.$

Solving the above transformation formulas for $z_B$, $t_B$ we also find that $\tau_{AB} = -\tau_{BA}$, $\sigma_{AB} = \sigma_{BA}$ so that

$V_{AB} = -V_{BA}.$

This result, that the relative velocity of A with respect to B is the negative of the relative velocity of B with respect to A, is in keeping with the old ideas.

Now consider three bodies, A, B, C, all moving in one direction (more precisely, B and C moving in the same direction with respect to A). Denote the velocities of B and C with respect to A by $V_{BA}$ and $V_{CA}$ respectively, and the velocity of C with respect to B by $V_{CB}$. We have in addition to the above transformation formulas the formulas

15.11    $z_A = z_C\sigma_{CA} + t_C\tau_{CA}, \quad t_A = z_C\tau_{CA} + t_C\sigma_{CA},$

and

15.12    $z_B = z_C\sigma_{CB} + t_C\tau_{CB}, \quad t_B = z_C\tau_{CB} + t_C\sigma_{CB},$

and also

15.21    $V_{CA} = \tau_{CA}/\sigma_{CA}, \quad V_{CB} = \tau_{CB}/\sigma_{CB}.$

Express now $z_A$, $t_A$ in terms of $z_C$, $t_C$ by substituting the values for $z_B$, $t_B$ given by the transformation formulas 15.12 into the transformation formulas 15.1; comparing the result with the transformation formulas 15.11 we get

$\sigma_{CA} = \sigma_{CB}\sigma_{BA} + \tau_{CB}\tau_{BA}, \quad \tau_{CA} = \tau_{CB}\sigma_{BA} + \sigma_{CB}\tau_{BA},$

whence, using the above expressions of velocities in terms of transformation coefficients, 15.2 and 15.21,

15.3    $V_{CA} = \dfrac{V_{BA} + V_{CB}}{1 + V_{BA}V_{CB}}.$

This is the Einstein formula for addition of velocities for the case of two motions in the same direction. This formula should be compared to the formula of addition of classically defined relative velocities

15.4

V = V' + V".

Of course, there is no contradiction between the two formulas because they refer to different quantities. Still it is legitimate to ask which formula is better from the point of view of experiment, which, if any, is "correct" for the relative velocities that we actually measure.

In ordinary units the second term in the denominator in formula 15.3 should be divided by the square of the velocity of light, so that for moderate velocities the formulas give results that differ numerically very little, and it seems to be difficult to devise an experiment with high enough velocities of material particles so that the formulas could be tested directly. In the next section we shall consider the case when one of the velocities is that of light; in the meantime we may mention that formula 15.3 is a special case of a more general formula which corresponds to the case when the two motions are not in the same directions. This general formula gained temporary importance some years ago when it played a decisive role in the early stages of the application of the idea of the spinning electron to the explanation of spectra.
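The derivation of formula 15.3 from the composition of two transformations may be illustrated by a short numerical sketch in Python. It assumes units in which the velocity of light is one and coefficients satisfying σ² − τ² = 1 with V = τ/σ; the function names `coefficients`, `compose` and `einstein_sum` are ours, chosen only for illustration.

```python
import math

def coefficients(v):
    """Return (sigma, tau) with tau/sigma == v and sigma**2 - tau**2 == 1."""
    sigma = 1.0 / math.sqrt(1.0 - v * v)
    return sigma, v * sigma

def compose(cb, ba):
    """Coefficients of the transformation C -> A from those of C -> B and B -> A."""
    sigma_cb, tau_cb = cb
    sigma_ba, tau_ba = ba
    sigma_ca = sigma_cb * sigma_ba + tau_cb * tau_ba
    tau_ca = tau_cb * sigma_ba + sigma_cb * tau_ba
    return sigma_ca, tau_ca

def einstein_sum(v_ba, v_cb):
    """Formula 15.3 in units where the velocity of light is one."""
    return (v_ba + v_cb) / (1.0 + v_ba * v_cb)

v_ba, v_cb = 0.5, 0.5
sigma_ca, tau_ca = compose(coefficients(v_cb), coefficients(v_ba))
v_ca = tau_ca / sigma_ca              # velocity read off the composed transformation
print(v_ca, einstein_sum(v_ba, v_cb), v_ba + v_cb)
```

For two velocities each equal to one half of the velocity of light the composed transformation and formula 15.3 agree on about 0.8, whereas the classical formula 15.4 gives 1.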

16. Light Corpuscles, or Photons.

In studying curves in four-space representing motions of particles we succeeded (Section 12) in choosing a standard parameter, s, by considering the expression

$\left(\dfrac{dx_1}{dp}\right)^2 + \left(\dfrac{dx_2}{dp}\right)^2 + \left(\dfrac{dx_3}{dp}\right)^2 + \left(\dfrac{dx_4}{dp}\right)^2$

and by setting dp/ds equal to the reciprocal of the square root of minus the above expression. This procedure would not work if the above expression were equal to zero. We can imagine in our four-dimensional geometry curves and straight lines for which the above expression is zero (Section 10), and the question arises: what will be the physical interpretation of such curves; in other words: is there anything in physics that could be identified with such curves in the same way that motions of particles are identified with curves for which the above expression is negative. In order to answer this question let us calculate the three-dimensional velocity corresponding to such a curve; if the above expression is zero for one choice of parameter it will be zero for all choices; using t as parameter, and using physical coordinates we have then

$(dx/dt)^2 + (dy/dt)^2 + (dz/dt)^2 - 1 = 0 \quad \text{or} \quad u^2 + v^2 + w^2 = 1,$

i.e., we can say that the curves of zero square tangent vectors correspond to what we have to call from the three-dimensional point of view particles moving with the velocity of light. This suggests identifying such curves in some way with the propagation of light.

Since the time of Newton and Huygens two theories of light have been vying for supremacy with variable success; according to one, the so-called corpuscular theory, light consists (like matter on the discrete theory) of particles which lately (Wolfers, 1925) have been called "photons"; according to the other theory light is a wave phenomenon. For our present purposes the former view seems to be better adapted. If we adopt it we can make our former statement more specific by saying that we identify curves of zero square tangent vectors with photons, or with motion of photons.

In adopting thus the corpuscular theory of light we do not in the least mean to say that the corpuscular theory of light is correct, and still less that the other theory, the wave theory, is wrong. We simply want to show that the identification just mentioned permits us to give account of certain light phenomena; and it is enough to mention polarization in order to see that other phenomena are left out.

To begin with we want to point out an advantage that the Relativity theory has compared to classical theory in the matter of corpuscular theory of light. In classical theory difference in velocity is merely a quantitative difference; in relativity this means an entirely different kind of curves, and there are other differences entirely of qualitative nature that are consequences of our identification, which is more in keeping with the nature of light compared to matter as we know it from experiment. This seems to constitute a very strong argument in favor of the adoption of the point of view of Relativity in general, and of the identification we are discussing now in particular.

We want next to discuss what is usually referred to as constancy of the velocity of light. The reader may have noticed that a while ago when we were calculating the three-dimensional velocity, corresponding to curves with zero-square tangent vectors, we did not say in what coordinate system we wanted to calculate this three-dimensional velocity. As a matter of fact, the result shows that it is independent of the coordinate system; i.e., no matter what bodies we consider as being at rest, we come out with the same value for the velocity of light, in our units one.

This seems surprising; it contradicts the commonly accepted ideas concerning addition of velocities; but we have been led to a different formula for the addition of velocities, and we can show that the constancy of velocity of light is in agreement with that formula 15.3. In fact, if we consider the case that C moves with the velocity of light (that is one in our units) with respect to B, that means that $V_{CB} = 1$; substituting this value in 15.3 we find that $V_{CA}$ is also one; that is, what is motion with velocity one in one system is motion with velocity one in another system.
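The substitution just described amounts to the identity (V + 1)/(1 + V) = 1. As a hedged illustration, the following Python sketch evaluates formula 15.3 with exact rational arithmetic for a range of velocities of B with respect to A; the name `einstein_sum` and the sample velocities are our own choices.

```python
from fractions import Fraction

def einstein_sum(v_ba, v_cb):
    """Formula 15.3 in units where the velocity of light is one."""
    return (v_ba + v_cb) / (1 + v_ba * v_cb)

# Velocities of B with respect to A from -9/10 to 9/10; C moves with velocity one
# with respect to B.
results = [einstein_sum(Fraction(n, 10), Fraction(1)) for n in range(-9, 10)]
print(all(r == 1 for r in results))
```

Whatever the velocity of B with respect to A in this range, the velocity of C with respect to A comes out exactly one.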

This discussion, of course, proves nothing but the inner consistency of the theory.

Another question is whether constancy of velocity of light, i.e., independence of this velocity from the choice of the system which is considered at rest, is consistent with experiment. As a matter of fact, it appears that it is; the weight of experimental evidence seems to be for it. Historically, results of some experiments by Michelson and Morley performed in 1887 and pointing in the same direction played a great role in the creation of the Theory of Relativity.

Having considered thus the question of velocity of light we pass to the discussion of another consequence of our identification.

We have decided in a general way to identify straight lines whose vectors are of zero square with light or the motion of photons in the same way that straight lines whose vectors are of negative square are identified with matter or uniform rectilinear motion of material particles. But a straight line (in four dimensions) does not characterize the motion of the particle completely; it only gives the velocity of the motion, it characterizes it only kinematically; for a complete dynamical characterization we had to introduce (Section 13) the mass of the particle, and that led us to introduce the momentum vector, whose square we found to be $-m^2$; the complete characterization of a material particle consists then of a line with a vector (of negative square) on that line. In the same way we shall characterize the motion of a photon by a line with a vector of zero square on it. We have thus the same picture for a material particle and a photon; in both cases we have a line with a vector on it; only in the first case it is a vector of negative square, in the second of zero square; this difference corresponds to the difference in the speeds of the particles in the classical theory. But in the classical theory this is a purely quantitative difference and here, as mentioned before, it leads to qualitative differences, some of which we are going to consider.

In the first place a photon cannot be transformed to rest. In fact transforming a photon to rest would mean finding a coordinate system such that in it the time axis will have the direction of the photon; but that would mean that the vector 0,0,0,1 would have zero square, which is impossible.

Then there is this distinction: two material particles may differ in mass, that means in the squares of their momentum vectors, and this is an essential difference because the square of a vector is not affected by a transformation of coordinates; all photons on the other hand have vectors of the same square, namely zero. We shall prove that as a consequence of this, two photons never differ essentially, that is, given two photons there always exist two systems of coordinates in which the descriptions of the two photons are the same. To begin with, we may choose the origins of the two coordinate systems on the respective straight lines; next we may consider the two lines in the respective z-t planes. The momentum vectors of the corresponding photons will have now in their respective coordinate systems the components $0,0,q_3,q_4$ and $0,0,p_3,p_4$ (contrary to our general agreement we use here subscripts with physical coordinates) and since both vectors are of zero square we shall have

$q_3^2 - q_4^2 = 0, \quad p_3^2 - p_4^2 = 0;$

by choosing appropriately the sense on each coordinate axis we can reduce these conditions to

16.1    $q_3 = q_4, \quad p_3 = p_4.$

Now perform in the second system the transformation 10.6

$z' = z\sigma + t\tau, \quad t' = z\tau + t\sigma,$

which applied to the second vector and taking into account 16.1 gives

16.2    $p_3' = p_4' = p_3(\sigma + \tau).$

But $\sigma$ and $\tau$ are subject only to the condition that $\sigma^2 - \tau^2 = 1$, so that we can choose $\sigma + \tau$ arbitrarily; if we make the choice

16.3    $\sigma + \tau = q_3/p_3,$

we shall have $p_3' = q_3$ and the statement is proved.
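The choice 16.3 can be made explicit. The sketch below, in Python, is an illustration under our own assumptions: both third components are taken positive (the senses of the axes having been chosen as for 16.1), and σ, τ are recovered from σ + τ = k and σ − τ = 1/k, which follows from σ² − τ² = 1; the name `match_photon` and the sample numbers are hypothetical.

```python
def match_photon(p3, q3):
    """Return (sigma, tau) with sigma + tau == q3/p3 and sigma**2 - tau**2 == 1."""
    k = q3 / p3                      # required value of sigma + tau
    if k <= 0:
        raise ValueError("this sketch assumes p3 and q3 of the same sign")
    sigma = (k + 1.0 / k) / 2.0      # because sigma - tau == 1 / (sigma + tau)
    tau = (k - 1.0 / k) / 2.0
    return sigma, tau

# Second photon: components 0, 0, p3, p4 with p3 == p4 (zero square).
p3 = p4 = 2.0
q3 = 5.0                             # third component of the first photon
sigma, tau = match_photon(p3, q3)
p3_new = p3 * sigma + p4 * tau       # transformation 10.6 applied to the vector
p4_new = p3 * tau + p4 * sigma
print(p3_new, p4_new)
```

After the transformation both components of the second photon equal those of the first, so the two descriptions coincide.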

This theoretical conclusion, that any two photons are not essentially different from each other, must be confronted with experience. At first sight it seems to contradict it. We know that light differs from case to case; it differs in intensity and color. For difference in intensity we account by assuming that every beam of light consists of many photons so that intensity (for a given color) is proportional to the number of photons in the beam. Remains color. But experiments show that color actually depends on the state of motion of the observer; when an observer approaches a source of light, color seemingly changes (Doppler effect) and so the field is clear for our assertion. Now let us see how it works out.

Before we treat the situation from the point of view of the Relativity theory we have to say a few words about how color appears in physics as a measurable quantity. From the point of view of the wave theory to light is attached a certain measurable quantity $\nu$, the "frequency", which corresponds to color in the sense that different colors correspond to different frequencies. On the corpuscular theory photons are characterized by their energies, E, and the fundamental relation between frequency and energy is given by the formula

16.4    $E = h\nu,$

where h is the so-called Planck's quantum constant, which for us appears simply as coefficient of proportionality establishing the relation between the values of two quantities which measure the same thing in different units, much in the same way that c, the velocity of light, appears in the formula connecting mass and energy (compare 15.41). Of the two quantities, E and $\nu$, which can be used to measure color, E will be the one that is more convenient for our purposes because we use the corpuscular theory of light.

The question now is with what quantity in our theory are we going to identify E. In order to have a suggestion we notice that E is of the character of kinetic energy; it plays for light particles the same role that kinetic energy plays for material particles. There (Section 13) we identified kinetic energy with the second term in the development

$\dfrac{m}{\sqrt{1 - (u^2 + v^2 + w^2)}} = m + \tfrac{1}{2}m(u^2 + v^2 + w^2) + \dots$

of the fourth component of the momentum vector; of the other terms the third and the following are negligible for material particles, and the first is a constant so that it plays no part in these considerations where only differences in energy are important; besides, the corresponding constant for light is zero; everything leads us thus to compare E with the fourth component of the momentum vector of light, or photon. We arrive in this way at a new identification; we identify the mathematical quantity "time component of the momentum vector of a photon" with the physical quantity E which, except for a factor of proportionality, is frequency and measures color. This identification makes color dependent on the coordinate system but this dependence, as was said before, is to be expected, and our next question is whether the character of this dependence corresponds to experimental facts.

Suppose that E is the energy of a light corpuscle in one system of coordinates; what will it be in another? We have already calculated how the components of a zero square vector change under a transformation of coordinates involving time. Formula 16.3 shows that the ratio of the fourth components of the two vectors, and according to our identification this means the ratio of the energies or frequencies, is

$\nu'/\nu = \sigma + \tau.$

On the other hand, we saw before (15.2) that the relative velocity of two bodies which are at rest in the corresponding systems of coordinates is $V = \tau/\sigma$; this gives $\nu'/\nu = \sigma(1 + V)$, and taking into account the identity $\sigma^2 - \tau^2 = 1$ we find

16.4    $\nu'/\nu = \dfrac{1 + V}{\sqrt{1 - V^2}},$

or, in first approximation, $1 + V$.

Let us try to figure out the predicted change of frequency on the classical (wave) theory. If we have a wave of frequency $\nu$ that means that there are $\nu$ vibrations per unit of time, and since the velocity is unity, there will be $\nu$ waves per unit of length. Now, if we move toward the source with a velocity V we shall travel in a unit of time the distance V and we shall meet $V\nu$ additional vibrations, so that the number of vibrations our eye receives in a unit of time will be $(1 + V)\nu$, and this will be the frequency for the moving observer. The two theories give then the predictions

$\nu'/\nu = \dfrac{1 + V}{\sqrt{1 - V^2}} \quad \text{and} \quad \nu'/\nu = 1 + V$

for the change in frequency due to motion of the observer, and the difference between these two values is too small to be subjected to an experimental test; within experimental error both seem to fit observations equally well.
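To see how close the two predictions are, one may tabulate them. The following Python sketch assumes the forms (1 + V)/√(1 − V²) and 1 + V for the two ratios, with V measured in units of the velocity of light; the function names and the sample velocities are ours.

```python
import math

def relativistic_ratio(v):
    """Ratio of frequencies according to the relativistic prediction."""
    return (1.0 + v) / math.sqrt(1.0 - v * v)

def classical_ratio(v):
    """Ratio of frequencies according to the classical wave argument."""
    return 1.0 + v

# Each row: velocity, relativistic ratio, classical ratio, their difference.
table = []
for v in (1e-4, 1e-2, 0.6):
    rel, cla = relativistic_ratio(v), classical_ratio(v)
    table.append((v, rel, cla, rel - cla))
    print(v, rel, cla, rel - cla)
```

For small V the difference is of the second order, roughly V²/2, which is why velocities small compared with that of light do not discriminate between the two formulas.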

17. Electricity and Magnetism in Special Relativity.

In the preceding sections of this chapter we have discussed some modifications that are brought about by the Theory of Relativity in Kinematics, Mechanics, and Optics. There are other modifications which have attracted a great deal of popular attention due to their sensational and paradoxical character. We shall only mention the so-called effects of motion on the shape of bodies, lengths, measure of time, and the fact that in the Theory of Relativity the conception of simultaneity loses its absolute character so that two events which are to be considered simultaneous in one system of space-time coordinates need not be simultaneous in another. But we shall say a few words about electricity and magnetism. Even in the first chapter the components of the electric and the magnetic force vectors were combined into one tensor $F_{ij}$, so that electricity and magnetism seem to be treated as two aspects, or manifestations, of a higher entity. But as long as we limit ourselves to transformations of space coordinates the components of F corresponding to electricity are transformed among themselves and those corresponding to magnetism among themselves, so that their unification in one tensor $F_{ij}$ may be considered as artificial. However, when we introduce transformations of space-time coordinates (formulas 10.6) the situation changes radically.

Following the procedure used in Section 11 when we were proving the invariant character of the relations between the tensors F and D we can deduce the following formulas corresponding to rotation in the $x_3x_4$ plane.

$F'_{41} = F_{31}s + F_{41}c, \quad F'_{23} = F_{23}c - F_{24}s,$

$F'_{42} = F_{32}s + F_{42}c, \quad F'_{43} = F_{43},$

$F'_{31} = F_{31}c - F_{41}s, \quad F'_{12} = F_{12}.$

From these mathematical formulas we can pass to formulas involving physical components and only real quantities by making use of 4.72, 10.3 and the fact that $F_{ij} = -F_{ji}$. We obtain thus the relations

$X' = \sigma X + \tau M, \quad Y' = \sigma Y - \tau L, \quad Z' = Z,$

$L' = \sigma L - \tau Y, \quad M' = \sigma M + \tau X, \quad N' = N.$

The interpretation of these formulas is that if the unprimed letters give the components of electric and magnetic force in one system the primed letters will give the components of electric and magnetic force in a system which moves with respect to the first with velocity $V = \tau/\sigma$.

These formulas show that the distinction between electric and magnetic forces is not an absolute distinction, but depends on the coordinate system used; we might, for example, have in the old coordinate system a purely electric field, L = M = N = 0; in the new system the magnetic components will be different from zero, viz., $-Y\tau$, $X\tau$, 0. What is the physical meaning of this? It means that a field may have electric effects on one body but electric and magnetic effects on a body that moves with respect to it. This prediction is verified by experiment. The fact may be restated by saying that an electric charge in motion has magnetic effects; it may, e.g., deflect a magnetic needle. As an example, the magnetic field of a moving electron may be easily calculated. We start with an electron at rest. Its magnetic field is supposed to be zero, its electric field is supposed to be independent of time and symmetric with respect to the point; under these conditions Maxwell's equations reduce, as can be seen easily, to Newton's equations which, as we know, give the inverse square law for the electric forces. The field of an electron in motion can now be obtained by applying the above formulas.
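The example of a purely electric field can be worked out numerically. The sketch below, in Python, assumes the relations X' = σX + τM, Y' = σY − τL, Z' = Z, L' = σL − τY, M' = σM + τX, N' = N with σ² − τ² = 1 and V = τ/σ; the function name `transform_fields` and the sample field are our own.

```python
import math

def transform_fields(electric, magnetic, velocity):
    """Components of electric and magnetic force in a system moving with V = tau/sigma."""
    sigma = 1.0 / math.sqrt(1.0 - velocity ** 2)
    tau = velocity * sigma
    X, Y, Z = electric
    L, M, N = magnetic
    new_electric = (sigma * X + tau * M, sigma * Y - tau * L, Z)
    new_magnetic = (sigma * L - tau * Y, sigma * M + tau * X, N)
    return new_electric, new_magnetic

def dot(p, q):
    """Ordinary scalar product of two three-dimensional vectors."""
    return sum(pi * qi for pi, qi in zip(p, q))

# A purely electric field in the old system: L = M = N = 0.
E = (1.0, 2.0, 3.0)
H = (0.0, 0.0, 0.0)
E_new, H_new = transform_fields(E, H, 0.6)
print(E_new, H_new)
# The combinations E.E - H.H and E.H come out the same in both systems.
print(dot(E_new, E_new) - dot(H_new, H_new), dot(E_new, H_new))
```

In the new system the magnetic components are −Yτ, Xτ, 0, different from zero as stated, while the two combinations printed last keep their old values.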


Chapter IV. CURVED SPACE

The theory that has been developed so far may be said to consist of two parts: a general part which may be called Geometry and which, in addition to material analogous to that treated in ordinary geometry, includes general rules of operations on tensors, and a special part which may be called Physics and which deals with three definite tensor fields, a scalar field $\rho$, a vector field $u_i$, an antisymmetric tensor field $F_{ij}$, which all have been combined into the tensor field $T_{ij}$, and with special conditions which we impose on these fields, viz., equations 11.1 to 11.5. The second part is independent of the first in the sense that we could have built with the same geometry a different "physics"; we could have chosen another set of tensors instead of $\rho$, $u_i$, $F_{ij}$. The reason why our physics was independent of our geometry is because the latter does not furnish us any tensors, except the tensor $\delta_{ij}$, or the tensor of scalar multiplication, which is, so to say, the same in all points (and at all times) and therefore cannot be used to explain the variety and change which are characteristic of the outside world. In other words, our geometry does not possess any structure which seems to be necessary for the interpretation of the outside world and therefore we had to superimpose on our geometry a certain arbitrary structure by introducing special tensors, by filling, so to say, the empty space-time with these tensor fields. Our geometry does not give us a landscape; it gives us, so to say, only a frame for one, or only a stage, and the landscape can be constructed on it by means of stage-settings which do not constitute an organic part of the stage. Although some success has been achieved with the theory just described we may want to accomplish more; we may want to have a geometry possessing a structure of its own which might be used in interpreting the outside world. Such a possibility is suggested by the consideration of curved surfaces.

The space-time we have been working with is of the same character as a plane (except for the number of dimensions); it is as devoid of structure as a plane. A curved surface, on the other hand, possesses a certain structure; it is not necessarily the same in all points, there may be a difference in curvature. We shall investigate the possibility of a four-dimensional space which bears the same relationship to our flat space-time as a curved surface bears to a plane; we shall expect to find that it possesses a certain structure which we shall try to interpret in terms of our physical quantities; more specifically, since all our physical quantities have been combined (by formula 11.5) into a symmetric tensor of the second rank, viz., $T_{ij}$, we shall expect to find a tensor of that character connected with the curvature of our curved four-dimensional space.

The plan of our study will be to begin with the simplest case, a case that is even simpler than that of a surface, viz., with the case of a curve, and then to work up gradually.

18. Curvature of Curves and Surfaces.

We consider a curve in the plane. We assume that it possesses a tangent at every point, and, furthermore, that if the origin of coordinates is chosen in any point of the curve, and the tangent at the origin is chosen as the x-axis, the curve, in the neighborhood of the origin, may be represented by a function

$y = f(x),$

which can be expanded into a power series converging in the neighborhood of the origin. The constant term of this expansion vanishes because the curve passes through the origin so that the equation must be satisfied for x = 0, y = 0; the coefficient of the first power of x also vanishes, since the slope of the tangent, which is the x-axis, must be zero; if we write the next term in the form

$\tfrac{1}{2}ax^2$

the coefficient a is called the curvature of the curve at the point considered, i.e., the point chosen for the origin. Since every point can be chosen for the origin this assigns to every point of the curve a curvature. We may say that if we drop all terms of the expansion following the one just written out, i.e., consider the curve

18.1    $y = \tfrac{1}{2}ax^2,$

this curve (a parabola) is an approximation to the given curve in the neighborhood of the point considered.

We consider next a surface. Here we assume that it has a tangent plane at every point. Taking a particular point on the surface as the origin and the tangent plane at that point as the xy-plane we may represent the surface by an equation z = f(x,y). We again assume that for every point on the surface this function may be developed into a power series converging in the neighborhood of that point. The constant term and the coefficients of the first powers of the variables will vanish as before. We write out the next group of terms, those that are quadratic in the variables, in the form

18.2    $z = \tfrac{1}{2}\left(a_{11}x_1^2 + 2a_{12}x_1x_2 + a_{22}x_2^2\right),$

where we use $x_1$ for x, and $x_2$ for y.

We may consider the coordinates $x_1$, $x_2$ as the components of a vector in the tangent plane which joins the origin to the projection of the point on the surface, whose coordinates are $x_1$, $x_2$, $z = f(x_1,x_2)$. The expression 18.2 assigns thus to every vector in the tangent plane a number (which may be considered as the ordinate of a paraboloid approximating the surface). This assignment is independent of the coordinate system, i.e., if we choose another system of coordinates we shall have the same number assigned to the same vector although its components will have changed; in fact in a rotation of the coordinate axes the degree of a polynomial is not affected so that the group of second degree terms in the expansion of z is transformed into the group of second degree terms of that expansion in the new coordinates. We have then a function whose values are numbers and whose argument is a vector; is it a tensor? Of course not; but it is easy to introduce a tensor with which our function 18.2 is closely connected. We simply write, as in 9.1

18.3    $a_{11}x_1y_1 + a_{12}x_1y_2 + a_{21}x_2y_1 + a_{22}x_2y_2 = s(x,y)$

using the coefficients $a_{11}$, $a_{12}$, $a_{22}$ of our function 18.2 and writing in the third term $a_{21}$ for $a_{12}$ for the sake of symmetry; this is a (symmetric) tensor of the second rank depending on two vector arguments x and y, and from which our function is obtained (up to the factor ½) by setting the vector arguments equal to each other. We arrived thus in the case of a surface at a tensor of rank two which expresses the curvature properties, i.e., the structure of the surface insofar as it describes its deviation from its tangent plane in the neighborhood of the point of contact. This encourages us in our enterprise: if we succeed in generalizing this result to higher dimensions, we may try to identify the generalization of this symmetric tensor of rank two with the symmetric tensor of rank two which, as we saw, combines in itself matter and electricity. We want to state at this time that we shall be ultimately successful in our enterprise but that everything will not run very smoothly, and we shall have to make an effort in order to arrive in the general case at a tensor of rank two. The configurations which will present themselves immediately will not be exactly tensors, and even after we shall arrive at a tensor it will not be a tensor of rank two. We shall have to overcome these obstacles, and in order to be able to do that we shall need some preparation, which we shall make by studying more attentively the case of a surface before we pass to the consideration of more complicated cases.

The curvature of a curve is characterized by a number; that of a surface is a more complicated thing and is characterized by a tensor; we know, however, that there are certain numbers connected with a tensor in an intrinsic way (that is, independent of the system of coordinates), viz., the numbers given by 9.5 and 9.51, and we may expect that they have geometrical significance. In fact the first one, $a_{11} + a_{22}$, is known as the mean curvature, and the second,

18.4

$$K = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix},$$

as the total curvature of the surface at the point considered. We know that $K$ is independent of the choice of a system of coordinates, but we want to show how it can be obtained without the use of any coordinates at all. We have introduced above a vector notation $s(x,y)$ for our tensor 18.3; we now write out the expression (the expressions in coordinates are written down for the sake of future references and may be disregarded at present):

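18.5

$$\begin{vmatrix} s(x,u) & s(x,v) \\ s(y,u) & s(y,v) \end{vmatrix} = \begin{vmatrix} a_{\alpha\beta}x_\alpha u_\beta & a_{\alpha\beta}x_\alpha v_\beta \\ a_{\alpha\beta}y_\alpha u_\beta & a_{\alpha\beta}y_\alpha v_\beta \end{vmatrix}$$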

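where $x, y, u, v$ are arbitrary vectors, and we assert that it is equal to

18.6

$$K \cdot \begin{vmatrix} x\cdot u & x\cdot v \\ y\cdot u & y\cdot v \end{vmatrix} = K \cdot \begin{vmatrix} x_\rho u_\rho & x_\rho v_\rho \\ y_\sigma u_\sigma & y_\sigma v_\sigma \end{vmatrix}.$$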

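To prove this, consider the expression

$$\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} \cdot \begin{vmatrix} x_1 & x_2 \\ y_1 & y_2 \end{vmatrix} \cdot \begin{vmatrix} u_1 & v_1 \\ u_2 & v_2 \end{vmatrix};$$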

multiplying the second and the third factors by the law of multiplication of determinants and writing $K$ for the first, we get 18.6; applying the law of multiplication of determinants first to the first two factors, then to the resulting determinant and the third factor, we get 18.5; we may thus write

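18.7

$$\begin{vmatrix} s(x,u) & s(x,v) \\ s(y,u) & s(y,v) \end{vmatrix} = K \cdot \begin{vmatrix} x\cdot u & x\cdot v \\ y\cdot u & y\cdot v \end{vmatrix};$$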

we may now obtain $K$ without using any system of coordinates by dividing the left hand side by the second factor on the right; the vectors $x, y, u, v$ may be any four vectors, provided only that the second factor on the right does not vanish. Setting $x = i$, $u = i$, $y = j$, $v = j$, where $i, j$ are two coordinate vectors, we get formula 18.4; now it is seen to hold for any system of coordinate vectors, so that incidentally we have a new proof of the invariance of $K$.
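
As a numerical illustration (not part of the original notes), relation 18.7 may be checked in a few lines of Python. The coefficients $a_{ij}$ and the four vectors below are arbitrary sample values chosen for the purpose, and the helper names `s`, `dot`, `det2` are ours; this is a sketch under those assumptions only.

```python
# Check of relation 18.7 with sample data (all numbers are arbitrary choices).

def s(a, x, y):
    """Bilinear form 18.3: s(x, y) = a_ij x_i y_j."""
    return sum(a[i][j] * x[i] * y[j] for i in range(2) for j in range(2))

def dot(x, y):
    """Scalar product of two vectors of the tangent plane."""
    return x[0] * y[0] + x[1] * y[1]

def det2(p, q, r, t):
    """Determinant | p q ; r t |."""
    return p * t - q * r

a = [[2.0, 0.5], [0.5, -1.0]]                  # symmetric: a12 = a21
K = det2(a[0][0], a[0][1], a[1][0], a[1][1])   # formula 18.4

x, y, u, v = (1.0, 2.0), (-3.0, 0.5), (0.7, -1.2), (2.0, 4.0)

lhs = det2(s(a, x, u), s(a, x, v), s(a, y, u), s(a, y, v))   # expression 18.5
gram = det2(dot(x, u), dot(x, v), dot(y, u), dot(y, v))      # second factor in 18.6

# K recovered without reference to coordinate vectors; requires gram != 0.
K_coordinate_free = lhs / gram
print(K, K_coordinate_free)
```

The two printed numbers should agree up to rounding, whatever four vectors are chosen, as long as the second determinant does not vanish.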

Before we leave the topic of ordinary surfaces we want to establish a relation between curvature of surfaces and curvature of curves. The points common to our surface and the xz-plane constitute a plane curve whose equation in the xz-plane may be obtained from $z = f(x,y)$ by setting $y = 0$ (here $x, y$ stand for $x_1, x_2$). It is clear that the x-axis is a tangent to this curve and that the first term of the expansion of $z$ as a function $f(x,0)$ into a power series will be obtained by setting $x_2 = 0$ in 18.2. We have thus

18.11

$$\tfrac{1}{2}a_{11}x_1^2$$

as this first non-vanishing term, and comparing with 18.1 we see that the curvature of the curve is $a_{11}$, or (18.3) the value $s(i,i)$. We can consider any plane passing through the z-axis as the xz-plane, or any unit vector in the tangent plane as the coordinate vector $i$; we have thus the result that the curvature at the point of contact of a curve resulting from the intersection of the surface with a normal plane is $s(i,i)$, if $i$ is a unit vector common to the tangent plane and the normal plane considered. In other words, to every direction in the tangent plane, characterized by a unit vector $i$, corresponds a normal plane containing it, and the curvature of the intersection of that plane with the surface is $s(i,i)$.

We see thus that to every direction in the tangent plane corresponds a definite number $s(i,i)$, the curvature in that direction. As an exercise the reader may try to express the curvature corresponding to a direction in the tangent plane in terms of the angle that direction makes with the x-axis.
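
The statement that $s(i,i)$ is the curvature of the normal section can be tried out numerically. The sketch below is our own illustration: the surface `f` is a hypothetical example (the quadratic terms of 18.2 plus cubic terms added only to show that they do not matter at the origin), and the curvature of the section is approximated by a central second difference, which is legitimate here because the tangent to the section is horizontal at the origin.

```python
import math

def s(a, x, y):
    """Bilinear form 18.3: s(x, y) = a_ij x_i y_j."""
    return sum(a[i][j] * x[i] * y[j] for i in range(2) for j in range(2))

def f(a, x1, x2):
    """Hypothetical surface: quadratic terms of 18.2 plus some cubic terms."""
    return 0.5 * s(a, (x1, x2), (x1, x2)) + 0.3 * x1**3 - 0.2 * x1 * x2**2

def section_curvature(a, theta, h=1e-4):
    """Second derivative at t = 0 of z(t) = f(t cos(theta), t sin(theta)).

    Since z'(0) = 0, this second derivative is the curvature of the
    normal section in the direction making the angle theta with the x-axis.
    """
    c, sn = math.cos(theta), math.sin(theta)
    z = lambda t: f(a, t * c, t * sn)
    return (z(h) - 2.0 * z(0.0) + z(-h)) / h**2

a = [[2.0, 0.5], [0.5, -1.0]]      # arbitrary symmetric coefficients
theta = 0.4                        # arbitrary direction
i = (math.cos(theta), math.sin(theta))

numeric = section_curvature(a, theta)
from_tensor = s(a, i, i)
print(numeric, from_tensor)
```

The two values should agree closely for every direction `theta`; the cubic terms drop out of the central difference because they are odd in `t`.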

As the next step of our discussion, whose general aim is to arrive at the most general situation as far as the number of dimensions is concerned, both of the space from which we start and of the configuration in it that we study, we take a skew curve in ordinary space; first we studied a curve ($n = 1$) in a plane ($N = 2$); then a surface ($n = 2$) in ordinary space ($N = 3$); now we take up the case $n = 1$, $N = 3$. A curve may be given in general by two equations on the three coordinates $x, y, z$. Solving these equations for $y$ and $z$, we represent the curve in the form $y = f(x)$, $z = g(x)$; we again make the assumption that a tangent exists for every point and that for every point, if we take this point as the origin and the tangent as the x-axis, it is possible to solve for $y$ and $z$, and that the functions $f$ and $g$ can be developed into power series; as a result of the choice of the coordinate system, the two power series will begin with quadratic terms

18.8

$$y = \tfrac{1}{2}ax^2 + \dots, \qquad z = \tfrac{1}{2}bx^2 + \dots$$

If we change the y- and z-axes, which fall into the normal plane to the curve, to other axes in the same plane, the form of the development will not be changed, of course, but the coefficients $a$, $b$ will assume new values; if, however, we consider these coefficients as components of a vector, the vector represented by them will be the same in all coordinate systems. Calling this vector $v$ we may say that the curvature situation of the curve is characterized by the expression

18.9

$$\tfrac{1}{2}vx^2.$$

This expression plays the part of the expressions 18.1 and 18.2 which have occurred in the two preceding situations.

19. Generalizations.

In the preceding section we discussed configurations in ordinary space, and we could rely on our intuition; everybody can conceive a plane curve, a surface, a twisted curve; we have at our disposal physical objects (drawings, graphs, models) with measurements on which quantities of our theories may be identified successfully. In the investigation we undertake now, we cannot use our intuition any more, and the identifications, when they come, will be of a much less immediate character. We have then to rely on analogy with the configurations studied in the preceding section and on mathematical reasoning supported by formulas.

We begin with what seems the next simplest case, a surface in four-dimensional space; it may be considered as a generalization both of a surface and of a curve in ordinary space. Such a surface is given, in general, by two equations on the four coordinates; in other words, we define as a surface in four-space the totality of points whose coordinates satisfy two equations $F(x,y,z,t) = 0$, $G(x,y,z,t) = 0$, where $F, G$ are two functions subjected to certain restrictions to be imposed presently. We define a plane in four dimensions as a surface which may be given by two linear equations (this definition, although given in terms of coordinates, is invariant, because it can be proved that if the equations are linear in one system of coordinates they remain linear after a transformation; the equivalence of this definition of the plane with that given in Section 7 is easily recognized). In the general case we choose a point on the surface as the origin of coordinates; we solve the two equations for two of the coordinates, and we define as the tangent plane at that point the plane whose equations result from omitting all but linear terms in the expansions. We next choose that plane as one of our coordinate planes; the lowest terms in the expansions are then the quadratic ones; denoting the coordinates for which the equations appear as solved by $x_3$, $x_4$, and the two other coordinates by $x_1$, $x_2$, we may write the groups of quadratic terms in the two expansions as

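19.1

$$\tfrac{1}{2}\left(a_{11}x_1^2 + 2a_{12}x_1x_2 + a_{22}x_2^2\right), \qquad \tfrac{1}{2}\left(b_{11}x_1^2 + 2b_{12}x_1x_2 + b_{22}x_2^2\right).$$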

For every vector in the tangent plane of components $x_1$, $x_2$ this gives us two numbers which may be considered as the components of a vector in the normal $x_3$, $x_4$ plane; or, we may consider this vector as given by a vector form

19.2

$$\tfrac{1}{2}\left(v_{11}x_1^2 + 2v_{12}x_1x_2 + v_{22}x_2^2\right),$$

the coefficients $v_{ij}$ being vectors of the normal plane whose components are $a_{ij}$ and $b_{ij}$ (along the $x_3$ and $x_4$ axes respectively). This expression assigns to every vector of the tangent plane a vector of the normal plane; we may substitute for it, as we did in an analogous case before (compare 18.3), a more general expression

19.3

$$s(x,y) = v_{11}x_1y_1 + v_{12}x_1y_2 + v_{21}x_2y_1 + v_{22}x_2y_2, \qquad \text{where } v_{21} = v_{12},$$

but although this is linear in each of the vectors $x$ and $y$ it is not a tensor, because the values of this expression are not numbers (they are vectors in the normal plane). We shall not introduce a special name for such expressions because we shall not have to deal with them much; the expression 19.3 we have denoted, as before, by $s(x,y)$, but we must keep in mind that the values of $s(x,y)$ are not numbers but vectors of the normal plane.

We may in this case form the expression 18.5, where it is understood that in the expansion of the determinant scalar products have to be used where ordinary products were used before; this change is made necessary by the fact, mentioned more than once, that the values of the elements are vectors. In all other respects we can apply to the expression the same reasoning as before, and we come to the conclusion that the relation 18.7 remains true, where $K$ is a number, independent of the coordinate systems in the tangent and normal planes, but which after such coordinate systems have been chosen can be calculated in terms of the coefficients of the vector form 19.3 by means of the formula

19.4

$$K = \begin{vmatrix} v_{11} & v_{12} \\ v_{21} & v_{22} \end{vmatrix}.$$

The important fact is that, although our expression 19.3 is a vector expression, and therefore does not furnish us a tensor, the invariant corresponding to 19.4 is still a number. The other invariant, which could be called mean curvature and written as $v_{11} + v_{22}$, is not a number in this case. The number $K$ we call, as before, the total curvature of the surface at the point considered.

In terms of the coefficients of the numerical forms 19.1 the total curvature $K$ may be expressed as follows: expanding the determinant 19.4 we have $K = v_{11}\cdot v_{22} - v_{21}\cdot v_{12}$; the term $v_{11}\cdot v_{22}$, for instance, is the scalar product of the vectors $v_{11}$ and $v_{22}$ whose components are respectively $a_{11}, b_{11}$ and $a_{22}, b_{22}$; the scalar product $v_{11}\cdot v_{22}$ is then $a_{11}a_{22} + b_{11}b_{22}$; in the same way, the term $v_{21}\cdot v_{12}$ in the expression for $K$ is $a_{21}a_{12} + b_{21}b_{12}$, so that we have for $K$ in terms of the $a$'s and $b$'s, rearranging terms and using determinant notation,

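19.41

$$K = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} + \begin{vmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{vmatrix}.$$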
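
The passage from the vector determinant 19.4 to the sum of two numerical determinants 19.41 can be followed on a small example. The coefficients below are arbitrary sample values of ours, and the helpers `dot` and `det2` are introduced only for this sketch.

```python
def dot(p, q):
    """Scalar product of two vectors of the normal plane."""
    return sum(pi * qi for pi, qi in zip(p, q))

def det2(m):
    """Determinant of a 2 x 2 array of numbers."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# Quadratic terms 19.1: x3 = 1/2 a_ij x_i x_j, x4 = 1/2 b_ij x_i x_j.
a = [[1.0, 2.0], [2.0, -1.0]]
b = [[0.5, -3.0], [-3.0, 4.0]]

# v_ij is the vector of the normal plane with components (a_ij, b_ij).
v = [[(a[i][j], b[i][j]) for j in range(2)] for i in range(2)]

# 19.4 expanded with scalar products in place of ordinary products.
K_vector_form = dot(v[0][0], v[1][1]) - dot(v[1][0], v[0][1])

# 19.41: sum of the two numerical determinants.
K_numeric_form = det2(a) + det2(b)

print(K_vector_form, K_numeric_form)
```

Both expressions give the same number for these sample coefficients, as the rearrangement of terms in the text requires.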

The next generalization is an easy one; we still consider a surface ($n = 2$), but instead of a four-dimensional space we take a space of an arbitrary number of dimensions $N$; we denote $N - n = N - 2$ by $r$, and we have a tangent plane, as before, but instead of a normal plane we have now a normal $r$-dimensional space, an $r$-flat as we may say. We call the corresponding coordinates $x_3$, $x_4$, etc., or $x_{2+k}$, where $k$ goes now from 1 to $r$ instead of only taking the values 1 and 2; we have here a vector form which may be written as before (19.2), only the $v$'s are vectors of the normal flat and have $r$ components each; these components we may distinguish by upper indices in brackets; if we denote by $I_k$ the $r$ coordinate vectors in the normal flat, and denote by $a^{(k)}_{ij}$ the component of $v_{ij}$ corresponding to $I_k$, we may write

19.5

$$v_{ij} = \sum_{k=1}^{r} a^{(k)}_{ij} I_k$$

and for $s(x,y)$ we may write

19.6

$$s(x,y) = \sum_{k=1}^{r} a^{(k)}_{\alpha\beta}x_\alpha y_\beta I_k;$$

otherwise there will be no changes. We can form the expression 19.4 as before; it will be independent of the choice of the coordinates $x_{2+k}$ because the scalar products used in the expansion of the determinant are; substituting the values 19.5 and evaluating we will have

19.42

$$K = \sum_{k=1}^{r} \begin{vmatrix} a^{(k)}_{11} & a^{(k)}_{12} \\ a^{(k)}_{21} & a^{(k)}_{22} \end{vmatrix},$$

an obvious generalization of formula 19.41, which may be obtained from this by taking $r = 2$ and writing $a$ for $a^{(1)}$ and $b$ for $a^{(2)}$ with proper subscripts. We may, if we wish, write out an expression analogous to 18.5. Substituting for $s(x,u)$ etc. the expressions 19.6 and using scalar products in the evaluation of the determinant, we shall find

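19.43

$$\begin{vmatrix} s(x,u) & s(x,v) \\ s(y,u) & s(y,v) \end{vmatrix} = \sum_{k=1}^{r} \begin{vmatrix} a^{(k)}_{\alpha\beta}x_\alpha u_\beta & a^{(k)}_{\alpha\beta}x_\alpha v_\beta \\ a^{(k)}_{\alpha\beta}y_\alpha u_\beta & a^{(k)}_{\alpha\beta}y_\alpha v_\beta \end{vmatrix}$$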

where the Greek letter subscripts imply summation from one to two, corresponding to the tangent plane, and the summation with respect to $k$, corresponding to the $r$ coordinates of the normal flat, is indicated in the usual fashion. Each of the determinants corresponding to the different values of $k$ is exactly of the same nature as 18.5, so that the reasoning which led from it to 18.7 applies to each of the determinants without change, and it is easy to see that formula 18.7 continues to hold. We may use this formula to define $K$, which we continue to call total curvature.

And now we come to the last generalization. We consider, in a space of an arbitrary number of dimensions $N$, a curved space of $n$ dimensions, with $n$ also an arbitrary number $< N$, which by definition is the totality of points whose coordinates satisfy $N - n = r$ equations

19.7 $F_k(x_1, x_2, \dots, x_N) = 0 \quad (k = 1, 2, \dots, r).$

We assume that for every point in the curved space these equations can be solved for $r$ of the coordinates and that these solutions can be expanded into power series in the remaining $n$ coordinates, converging in the neighborhood of the point selected. By a transformation of coordinates we may arrange it so that these expansions begin with second degree terms, so that we may write

19.71 $x_{n+k} = \tfrac{1}{2}a^{(k)}_{\alpha\beta}x_\alpha x_\beta$ + terms of higher degree

where the summation indicated by the Greek indices now goes from 1 to $n$. The sub-space defined by the first $n$ coordinate axes we call the tangent flat space at the point considered, and the sub-space corresponding to the remaining $r$ coordinate axes, the normal flat space at that point.

As before, we use the coefficients $a^{(k)}_{ij}$ to form the expressions

19.8

$$a^{(k)}_{\alpha\beta}x_\alpha y_\beta$$

where $x_i$, $y_i$ are two vectors of the tangent flat; and we combine these expressions into a vector expression

19.9

$$s(x,y) = \sum_{k=1}^{r} a^{(k)}_{\alpha\beta}x_\alpha y_\beta I_k.$$

It is natural to try to generalize the concept of total curvature. We can form the expression 18.5 but, and this is important, the transformation 18.7 does not apply; it was based essentially on the fact that $n = 2$, and it breaks down here.

20. The Riemann Tensor.

The way out of this difficulty is very simple. Although relation 18.7 does not hold, we still may consider its left hand side; it is a function of the four vectors $x, y, u, v$, a function which has numerical values; it is easy to show that it is linear in each of the vector arguments (we leave this proof to the reader because the result will follow later from formula 20.8); it is therefore a tensor, a tensor of rank four; we call it the Riemann tensor, denote it by $R(x,y;u,v)$ and write

20.1

$$R(x,y;u,v) = \begin{vmatrix} s(x,u) & s(x,v) \\ s(y,u) & s(y,v) \end{vmatrix}.$$

We have then at every point of the curved space a tensor of rank four instead of a number; it is connected with the second degree terms of the expansions 19.71 and therefore characterizes, at least in part, the deviation of the expressions of the $x_{n+k}$ from linearity, or of our space from flatness. The Riemann tensor tells us then something about the curvature of the curved space, and it is often called the curvature tensor.

The situation we have now reminds us of a situation in Section 18. The curvature of a curve was a number; when we passed to a surface we found that its curvature was characterized by a tensor; we succeeded in deriving from this tensor a number $K$, so that we could express (at least partially) the curvature of a surface by a number. Now, passing to higher curved spaces, we again obtain a tensor. In the preceding situation we succeeded in interpreting the tensor $s(x,y)$ given by formula 18.3 in terms of curvatures of certain curves on the surface; we found that the value $s(i,i)$ gives the curvature of the normal section determined by the unit vector $i$ and the normal to the surface. Is it possible to interpret the Riemann tensor in an analogous way as giving the total curvatures of some surfaces on our curved space? This is a natural question to ask, and the answer is affirmative. We shall prove, in fact, that certain values of the Riemann tensor give us total curvatures of surfaces situated on the curved space. Let $i, j$ be two arbitrary mutually perpendicular unit vectors of the tangent flat; choose a set of coordinate vectors so that $i, j$ are two of them. Pass through $i, j$ and the normal flat, i.e., through the $r$ vectors $I_k$, a flat space of $2 + r$ dimensions; its points will be those points of the $N$-space whose coordinates $x_3, x_4, \dots, x_n$ vanish; the intersection of this flat space with the given curved space will be a surface, i.e., a two-dimensional curved space, because the coordinates of its points must satisfy the $r$ equations of the curved space (19.7) and $n - 2$ equations

20.2

$$x_3 = 0, \quad x_4 = 0, \quad \dots, \quad x_n = 0,$$

which together is $N - n + n - 2 = N - 2$ equations. This surface we may consider as a surface of the $r + 2$ dimensional flat space 20.2; its equations in that space will be obtained by setting $x_3 = x_4 = \dots = x_n = 0$ in the equations 19.71 of the curved space (just as the equation of the normal section of a surface in the xz-plane was obtained (preceding formula 18.11) by setting $y = 0$ in the equation of the surface); these equations will then become

20.3

$x_{n+k} = \tfrac{1}{2}\left(a^{(k)}_{11}x_1^2 + 2a^{(k)}_{12}x_1x_2 + a^{(k)}_{22}x_2^2\right)$ + terms of higher degree

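and the total curvature of this surface is

$$\begin{vmatrix} v_{11} & v_{12} \\ v_{21} & v_{22} \end{vmatrix} \qquad \text{with} \qquad v_{ij} = \sum_{k=1}^{r} a^{(k)}_{ij} I_k;$$

but $v_{11} = s(i,i)$, $v_{12} = s(i,j)$, $v_{21} = s(j,i)$, $v_{22} = s(j,j)$, so that we have

$$\begin{vmatrix} s(i,i) & s(i,j) \\ s(j,i) & s(j,j) \end{vmatrix}$$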

which is $R(i,j;i,j)$, and our statement is proved. As we saw at the end of Section 18, a unit vector $i$ in the tangent plane to an ordinary surface determines a direction, a straight line, which together with the normal determines a normal plane, and the intersection of that normal plane with the surface is a normal section of curvature $s(i,i)$; here we have the situation that two unit vectors $i, j$ in the tangent flat to a curved space determine an orientation, a plane, which together with the normal $r$-flat determines a normal $(r + 2)$-flat, and the intersection of that normal $(r + 2)$-flat with the curved space is a normal section, a surface of curvature $R(i,j;i,j)$. We see then that the Riemann tensor plays with respect to a curved space a role analogous to that played by the tensor $s(x,y)$ with respect to an ordinary surface; our expectations then are fulfilled; we need, it is true, for the purposes of identification with the complete tensor $T$ a tensor of the second rank, but we shall get one of rank two from $R(x,y;u,v)$ by applying to it the operation of contraction.

In the meantime let us study the Riemann tensor, or the curvature tensor as it is sometimes called, as we have it. The Riemann tensor is not a general tensor of rank four. It satisfies the relations

20.41 $R(x,y;u,v) = -R(y,x;u,v) = -R(x,y;v,u),$

20.42 $R(x,y;u,v) = R(u,v;x,y),$

20.43 $R(x,y;u,v) + R(x,u;v,y) + R(x,v;y,u) = 0,$

which are easily verified by using 20.1. The first of these relations says that $R$ is antisymmetric in each of the pairs of the vector arguments, and the second, that it is symmetric in the two pairs.

If we introduce a coordinate system in the tangent flat by picking four coordinate vectors $i, j, k, l$ or $i_1, i_2, i_3, i_4$, we may represent the vector arguments (as in Section 9) in the form $x = i_\alpha x_\alpha$, etc., substitute these expressions into $R(x,y;u,v)$ and, by using linearity as defined in 9.6, write the Riemann tensor as

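20.5

$$R(x,y;u,v) = R_{\alpha\beta;\gamma\delta}\,x_\alpha y_\beta u_\gamma v_\delta$$

where

20.6

$$R_{ab;cd} = R(i_a, i_b; i_c, i_d).$$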

(We use here the first letters of the alphabet as subscripts, instead of $i$, etc., as before, in order to avoid confusion with the coordinate vectors which we denote by $i$.) These are the components of the Riemann tensor in the coordinate system chosen. The relations 20.41 to 20.43 can be written in components as

20.71 $R_{ab;cd} = -R_{ba;cd} = -R_{ab;dc},$

20.72 $R_{ab;cd} = R_{cd;ab},$

20.73 $R_{ab;cd} + R_{ac;db} + R_{ad;bc} = 0.$

Exercise. Prove that the number of independent components of a Riemann tensor for four dimensions is 20.

The vectors of the flat spaces tangent to the curved space may be considered as belonging to the curved space; they may be characterized in terms of the space itself, for instance by giving direction and length; they are accessible, as we may say, to beings who live in the space and for whom points outside the space do not exist. Normal vectors, the function $s(x,y)$, etc., are, on the contrary, not accessible to the inhabitants of the space. We shall confine ourselves for the most part to the consideration of these internal properties, properties accessible to the inhabitants; but later in the course of our investigation we shall have to use the expression of the Riemann tensor in terms of the coefficients $a^{(k)}_{ij}$, and we shall conclude this section by deducing it.

Substituting the expression 19.9 for $s$ into 20.1 and using 20.5 for the left hand side, we find

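20.8

$$R_{\alpha\beta;\gamma\delta}\,x_\alpha y_\beta u_\gamma v_\delta = \begin{vmatrix} \sum_k a^{(k)}_{\alpha\gamma}x_\alpha u_\gamma I_k & \sum_l a^{(l)}_{\alpha\delta}x_\alpha v_\delta I_l \\ \sum_k a^{(k)}_{\beta\gamma}y_\beta u_\gamma I_k & \sum_l a^{(l)}_{\beta\delta}y_\beta v_\delta I_l \end{vmatrix};$$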

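this determinant may be presented as the sum of $r^2$ determinants, of which, however, only $r$ are different from zero, namely those in which the same $I$ appears in the two columns, because in the expansions of the others all terms vanish as involving products of different and therefore mutually perpendicular $I$'s. What remains is (compare 19.43)

$$\sum_{k=1}^{r} \begin{vmatrix} a^{(k)}_{\alpha\gamma}x_\alpha u_\gamma & a^{(k)}_{\alpha\delta}x_\alpha v_\delta \\ a^{(k)}_{\beta\gamma}y_\beta u_\gamma & a^{(k)}_{\beta\delta}y_\beta v_\delta \end{vmatrix}$$

because the $I$'s are unit vectors and $I_k\cdot I_k = 1$, or

$$\sum_{k=1}^{r} \begin{vmatrix} a^{(k)}_{\alpha\gamma} & a^{(k)}_{\alpha\delta} \\ a^{(k)}_{\beta\gamma} & a^{(k)}_{\beta\delta} \end{vmatrix} x_\alpha y_\beta u_\gamma v_\delta.$$

Comparing this to the left hand side of 20.8 we have the required expression

20.9

$$R_{ab;cd} = \sum_{k=1}^{r} \begin{vmatrix} a^{(k)}_{ac} & a^{(k)}_{ad} \\ a^{(k)}_{bc} & a^{(k)}_{bd} \end{vmatrix}.$$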
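
Formula 20.9 lends itself to a direct numerical check of the relations 20.71 to 20.73. The following sketch is our own illustration: the dimensions `n` and `r`, the random symmetric coefficients `a[k][i][j]` standing for $a^{(k)}_{ij}$, and the function name `R` are assumptions made for the example, and indices run from 0 instead of 1.

```python
import itertools
import random

n, r = 4, 3          # sample choice: 4-dimensional curved space, 3 normal directions
random.seed(0)

# a[k][i][j] stands for a^(k)_ij, symmetric in i and j; the values are arbitrary.
a = []
for _ in range(r):
    m = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            m[i][j] = m[j][i] = random.uniform(-1.0, 1.0)
    a.append(m)

def R(p, q, c, d):
    """Formula 20.9: sum over k of the determinant | a_pc a_pd ; a_qc a_qd |."""
    return sum(m[p][c] * m[q][d] - m[p][d] * m[q][c] for m in a)

max_violation = 0.0
for p, q, c, d in itertools.product(range(n), repeat=4):
    checks = (
        R(p, q, c, d) + R(q, p, c, d),                   # 20.71, first pair
        R(p, q, c, d) + R(p, q, d, c),                   # 20.71, second pair
        R(p, q, c, d) - R(c, d, p, q),                   # 20.72
        R(p, q, c, d) + R(p, c, d, q) + R(p, d, q, c),   # 20.73
    )
    max_violation = max(max_violation, *(abs(t) for t in checks))

# Total curvature of the normal section spanned by the first two coordinate vectors.
K_section = sum(m[0][0] * m[1][1] - m[0][1] * m[1][0] for m in a)
print(max_violation, K_section, R(0, 1, 0, 1))
```

The largest violation should be zero up to rounding, and the component `R(0, 1, 0, 1)` coincides with the total curvature of the corresponding normal section, in agreement with the interpretation of $R(i,j;i,j)$ given above.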

21. Vectors in General Coordinates.

In the last section we learned how to associate with every point of a curved space a tensor of rank four; for our physical interpretation we need one of rank two; but we know how to obtain from a tensor of rank four one of rank two; we have to apply the operation of contraction. The result we shall call the "contracted Riemann tensor" and we shall expect to identify it with the tensor $T$. The first question we have to ask ourselves in this connection is whether the contracted Riemann tensor satisfies the equation 11.4, viz., $\partial T_{i\alpha}/\partial x_\alpha = 0$. But before we do that we have to go through quite a lengthy development, because at the present stage we do not know how to differentiate tensors on a curved space. In flat space we could consider the differential of a vector, or, more exactly, of a vector field, by (roughly) considering the difference of two vectors of the field in two neighboring points. In curved space, or on a curved surface, two vectors in two different points belong to two different tangent planes and their difference is not a vector of the surface at all. Or we could in a flat space adopt a cartesian system of coordinates and introduce as the components of the differential the derivatives of the components of the given tensor. This method also is not applicable directly to curved space because there is no such thing here as cartesian coordinates. Each method, the geometrical method and the coordinate method, could be so modified as to apply to curved space. We shall develop here the coordinate method because, in addition to permitting us to introduce differentiation (our immediate concern now), it is indispensable in treating special cases.

As we said before, there is no such thing as the cartesian system of coordinates in curved space, because there are no straight lines; so we shall have to use some other coordinates, let us say, general coordinates; the main difficulty in treating curved spaces is just this, viz., that rectilinear coordinates are not applicable here, or we may say: part of the difficulty (the greater part) lies in the fact that we have to use curvilinear coordinates, and part in the fact that the situation itself is so different from that which we encounter in flat space and with which we are more or less familiar. Or to put it in a still different form: the difficulty is two-fold, we have a new material to work on and we have to use new tools. To obviate the difficulty we are going to divide it; we already have studied curved spaces in the preceding sections; now we shall try to become familiar with the new tool, the method of general coordinates, applying it to the old material, ordinary three-dimensional space; and then, beginning with Section 25, we shall study curved space by means of curvilinear coordinates.

The essential thing in the matter of coordinates is that points receive names, the names being composed of numbers, so that we can handle numbers, which we can do by means of formulas, instead of the points themselves. There are many different ways of establishing a correspondence between points and triples of numbers; in the one that bears the name of Descartes (Cartesius) the three numbers which are assigned to a point are its distances from three mutually perpendicular planes; there does not seem to be anything that can take the place of this method in a general curved space because there are no planes and straight lines; still we may use coordinates; a system of coordinates on a special curved surface is known to everybody, even to those who never studied analytic geometry; we mean the system of specifying the position on the surface of the earth by means of latitude and longitude. Polar coordinates in the plane or in space furnish another example of a coordinate system which is not cartesian; in what follows we shall use an entirely arbitrary system of coordinates; we shall assume that a one-to-one correspondence is established between the points of a certain portion of space (which may be the whole space) and the triples of a certain set of triples of numbers. We shall call these numbers $u_1, u_2, u_3$ or $u_i$, and we shall keep the notation $x_i$ for some definite system of cartesian coordinates. To every triple $u_1, u_2, u_3$ corresponds a point whose cartesian coordinates (in some definite system) are $x_i$; these numbers $x_i$ are therefore determined by the $u_i$'s; we have three functions

21.1

$$x_1 = x_1(u_1,u_2,u_3), \quad x_2 = x_2(u_1,u_2,u_3), \quad x_3 = x_3(u_1,u_2,u_3)$$

which are defined on a certain range of triples. Conversely, if $x_1, x_2, x_3$ are the cartesian coordinates of a point of the portion of our space for which general coordinates have been introduced, they determine three numbers $u_1, u_2, u_3$, which therefore are functions of the $x$'s

21.2

$$u_1 = u_1(x_1,x_2,x_3), \quad u_2 = u_2(x_1,x_2,x_3), \quad u_3 = u_3(x_1,x_2,x_3),$$

which are defined for a certain range of triples $x_i$ and are the inverse functions of the functions 21.1.

We have to handle vectors even more often than we have to handle points, and we want to have a numerical representation also for vectors. Together with cartesian coordinates for points goes a very simple numerical representation for vectors; we represent a vector by three numbers which are the differences between the corresponding coordinates of its end-points, and are called the components of the vector; of course, in a different system of cartesian coordinates the same vector will have other components, but as long as we keep to a definite coordinate system, vectors, as well as points, have definite names. The method of representing vectors by their cartesian components has the great advantage that two equal vectors have equal components, that we can add vectors by adding their components, and multiply a vector by a number by multiplying the components by that number; these advantages are peculiar to the cartesian method and cannot be reproduced in other systems. The theory of curved space is differential geometry; we cannot handle immediately by its methods such things as a configuration consisting of two points at a finite distance; if we do, we have to introduce intermediate points; instead of subtraction we have here differentiation. There are two ways in which vectors arise by differentiation, and each gives rise to a system of notation for vectors associated with a given coordinate system for points; only for the rectangular cartesian system do the two representations coincide. The two ways in which a vector appears as a result of differentiation are the tangent vector of a curve and the gradient of a field. In this and the next section we shall take only the first of these two points of view. Given a curve in cartesian coordinates in parametric form

21.3

$$x = x(p), \quad y = y(p), \quad z = z(p),$$

the components of the tangent vector are obtained by differentiation

21.4

dx/dp, dy/dp, dz/dp.

This vector is determined not by the curve alone, but by the particular parametric representation we are using; but in this chapter we are not going to change the parameter often, and we shall speak of a curve when we mean "curve in a given parametric representation", and of the tangent vector when we mean "tangent vector resulting from differentiation with respect to that particular parameter". In cartesian coordinates, then, the components of the tangent vector are obtained by differentiating the coordinates of the points of the curve. This is certainly convenient, and we may ask ourselves whether we could not reproduce this advantage in general coordinates. Let us try; the parametric representation of the curve in the $u$'s can be obtained by substituting $x_i(p)$ into 21.2; the $u_i$'s become then functions of $p$, and this gives a parametric representation of the same curve in general coordinates; let us agree to represent the vector which, when we used the cartesian system, had the components 21.4, by the three numbers

21.5

$$du_1/dp, \quad du_2/dp, \quad du_3/dp.$$

We have then the required system of representation; but it is not necessary, every time we want to represent a vector, to introduce a curve to which it is tangent; we shall show how to find the components 21.5 when we are given the cartesian components 21.4 without actually considering the curve.

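We have, considering that $u_1$ depends on $x_1, x_2, x_3$, which in turn depend on $p$,

$$\frac{du_1}{dp} = \frac{\partial u_1}{\partial x_1}\frac{dx_1}{dp} + \frac{\partial u_1}{\partial x_2}\frac{dx_2}{dp} + \frac{\partial u_1}{\partial x_3}\frac{dx_3}{dp},$$

or, using the summation convention and applying to $u_1, u_2, u_3$,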

21.6

$$du_i/dp = \partial u_i/\partial x_\alpha \cdot dx_\alpha/dp,$$

so that, if we denote the quantities 21.4 by $v_i$ and the quantities 21.5 by $V^i$, we have

21.7

$$V^i = \partial u_i/\partial x_\alpha \cdot v_\alpha.$$

Introducing the abbreviation

21.8

$$a_{ij} = \partial u_i/\partial x_j$$

we may write the last formulas as

21.9

$$V^i = a_{i\alpha}v_\alpha.$$

It will be explained later (Section 23) why we use in the left hand side the index as a superscript. These are transformation formulas for vector components which are associated with the transformation formulas 21.2 for the coordinates of points; the formula just written out permits us to find the general components when the cartesian components are given. In a similar way we can find the inverse transformation formulas by starting with a parametric representation of a curve in general coordinates, substituting the $u_i(p)$ into the formulas 21.1 and differentiating; we arrive thus at

21.10

$$v_i = b_{i\alpha}V^\alpha,$$

where

21.11

$$b_{ij} = \partial x_i/\partial u_j.$$

Before we go further we shall use the fact that the formulas 21.9 and 21.10 are inverses of each other to obtain some relations on the $a$'s and $b$'s. Substituting 21.9 into 21.10 we get

21.12

$$v_i = b_{i\alpha}a_{\alpha\beta}v_\beta;$$

the left hand side may be written as $\delta_{i\beta}v_\beta$, so that

$$b_{i\alpha}a_{\alpha\beta}v_\beta = \delta_{i\beta}v_\beta,$$

and since this is an identity ($v$ being arbitrary) we have

21.13 $b_{i\alpha}a_{\alpha j} = \delta_{ij}.$

In the same way we may obtain

21.14

$$a_{i\alpha}b_{\alpha j} = \delta_{ij}.$$
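
To make the quantities $a_{ij}$ and $b_{ij}$ concrete, we may try them on a familiar system of general coordinates. The sketch below is our own illustration and assumes cylindrical coordinates $u = (\rho, \varphi, z)$ with $x_1 = \rho\cos\varphi$, $x_2 = \rho\sin\varphi$, $x_3 = z$; the function names and sample numbers are chosen for this example only.

```python
import math

def b_matrix(rho, phi):
    """b_ij = dx_i/du_j (21.11) for x = (rho cos phi, rho sin phi, z)."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, -rho * s, 0.0],
            [s,  rho * c, 0.0],
            [0.0, 0.0,    1.0]]

def a_matrix(rho, phi):
    """a_ij = du_i/dx_j (21.8), written in terms of the u's; needs rho != 0."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c,        s,       0.0],
            [-s / rho, c / rho, 0.0],
            [0.0,      0.0,     1.0]]

def matmul(p, q):
    """Product of two 3 x 3 arrays (summation over the inner index)."""
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply(m, w):
    """Apply a 3 x 3 array to a triple of components."""
    return [sum(m[i][k] * w[k] for k in range(3)) for i in range(3)]

rho, phi = 2.0, 0.6                 # arbitrary point (z does not enter)
a, b = a_matrix(rho, phi), b_matrix(rho, phi)

ba = matmul(b, a)                   # 21.13: should be the unit array
ab = matmul(a, b)                   # 21.14: should be the unit array

v = [1.0, -2.0, 0.5]                # cartesian components of some vector
V = apply(a, v)                     # general components, formula 21.9
v_back = apply(b, V)                # back to cartesian components, formula 21.10
print(ba, ab, V, v_back)
```

Both products reproduce the Kronecker symbol up to rounding, and the round trip through 21.9 and 21.10 returns the cartesian components we started with.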

We want to be able to operate with general components of vectors, for instance, find a scalar product of two vectors given in their general components; it is easy to obtain a formula answering this question by passing to cartesian components, and then applying the formula for the scalar product in cartesian components. Let the general components of two vectors be $V^i$ and $W^i$; according to the formula 21.10 their cartesian components are

$$v_\gamma = b_{\gamma\alpha}V^\alpha$$

and

$$w_\gamma = b_{\gamma\beta}W^\beta,$$

where we use another summation letter in the second expression to avoid complications in what follows. Now we write the scalar product using the formula $v_\gamma w_\gamma$ and get

21.15 $v_\gamma w_\gamma = b_{\gamma\alpha}b_{\gamma\beta}V^\alpha W^\beta.$

It follows that in order to be able to find scalar products of vectors given by their general components we have to know the quantities

21.16

$$g_{ij} = b_{\gamma i}b_{\gamma j}.$$

The quantities $a$ and $b$ help us to pass from a certain cartesian system of coordinates to the general system; they express a relation between the general system and that particular cartesian system, and thus are not of fundamental importance; the quantities $g_{ij}$, on the contrary, although they have been obtained by means of the $a$'s and $b$'s, are independent, as their significance shows, of any particular system of cartesian coordinates; they characterize the system of coordinates we are using in itself (and, as we shall see later, they characterize it completely, so that the $g$'s are all we need to know in order to be able to handle vectors given by their general components). The $a$'s as well as the $b$'s may be considered either as functions of the $x$'s, or as functions of the $u$'s. The $g$'s always will be considered as functions of the $u$'s.

Before we go any farther we note that, as it immediately follows from the definition 21.16, the g's are symmetric in the indices:

21.16'    $g_{ij} = g_{ji}$.

In order to show the importance of the g's let us deduce a formula for the length of a curve given in general coordinates. Let the curve be given by

21.31    $u_i = u_i(p)$.

For a curve given in cartesian coordinates we assume as known the formula (compare 12.1)

21.17    $s = \int \sqrt{(dx/dp)^2 + (dy/dp)^2 + (dz/dp)^2}\; dp$,

where s is the arc length between two points. This formula involves three inverse operations, that of integration, that of taking the square root, and that of division. It is not pleasant, in general, to have to do with these operations, and so we shall free our formula from them, and write it as

21.17'    $ds^2 = dx^2 + dy^2 + dz^2$.

The formula as just written is not essentially different from the one written above, and means exactly the same thing. The sign d may be taken to mean differentiation with respect to some unspecified parameter, since the correctness of the formula does not depend on what parameter we are using (provided that the same parameter is used on both sides). We translate 21.17' now into general components. Differentiating 21.2 we have

$dx = b_{1\alpha}\, du^\alpha$,  $dy = b_{2\alpha}\, du^\alpha$,  $dz = b_{3\alpha}\, du^\alpha$;

for $dx^2$ we may write $b_{1\alpha}\, du^\alpha \cdot b_{1\beta}\, du^\beta$; using similar expressions for the other terms of 21.17' we get

21.18    $ds^2 = g_{\alpha\beta}\, du^\alpha\, du^\beta$,

using the abbreviation 21.16 introduced before. We see that the quantities $g_{ij}$ appear again.
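The passage from the b's to the g's and then to arc length can be tried numerically. The following is a minimal sketch, assuming plane polar coordinates $u_1 = r$, $u_2 = \theta$ with $x = r\cos\theta$, $y = r\sin\theta$ (a choice made here only for illustration); the function names `b_matrix`, `metric` and `arc_length` are hypothetical helpers, not notation from the text.

```python
import math

def b_matrix(r, theta):
    # b[gamma][i] = d x_gamma / d u_i for x = r cos(theta), y = r sin(theta)
    return [
        [math.cos(theta), -r * math.sin(theta)],
        [math.sin(theta),  r * math.cos(theta)],
    ]

def metric(r, theta):
    # g_ij = b_{gamma i} b_{gamma j}, summed over gamma (formula 21.16)
    b = b_matrix(r, theta)
    return [[sum(b[g][i] * b[g][j] for g in range(2)) for j in range(2)]
            for i in range(2)]

def arc_length(curve, p0, p1, steps=2000):
    # Sum of ds = sqrt(g_ab du^a du^b) over small steps along u(p) (formula 21.18),
    # with the g's evaluated at the midpoint of each step.
    h = (p1 - p0) / steps
    total = 0.0
    for k in range(steps):
        lo, hi = curve(p0 + k * h), curve(p0 + (k + 1) * h)
        mid = curve(p0 + (k + 0.5) * h)
        du = [hi[i] - lo[i] for i in range(2)]
        g = metric(*mid)
        total += math.sqrt(sum(g[i][j] * du[i] * du[j]
                               for i in range(2) for j in range(2)))
    return total

g = metric(2.0, 0.7)   # for polar coordinates this should be diag(1, r^2)
# Circle of radius 3, described in general coordinates by r = 3, theta = p.
circumference = arc_length(lambda p: (3.0, p), 0.0, 2.0 * math.pi)
```

For these coordinates the g's come out as $g_{11} = 1$, $g_{22} = r^2$, $g_{12} = 0$, and the length of the circle $r = 3$ agrees with $6\pi$, although neither the integrand nor the curve was ever written in cartesian form.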

22. Tensors in General Coordinates.

We come now to the representation of tensors. We know that a tensor is a function which assigns numbers to vectors, and the question of representation will be simply this: given the general components of the vector arguments, how to find the corresponding value of the tensor. We have already solved this problem for one particular tensor, namely for the tensor of the scalar product, which we expressed in the preceding section by means of the g's, and we shall use the same method in the general case. Given the cartesian components $f_{ij}$ of a tensor and the general components $V^i$ and $W^i$ of two vector arguments, to find the corresponding value of the tensor. We pass from general components to cartesian components by formulas 21.10 and substitute these expressions into the expression $f_{\gamma\delta}\, v_\gamma w_\delta$ for the value of the tensor; the result is

$f_{\gamma\delta}\, b_{\gamma\alpha} V^\alpha\, b_{\delta\beta} W^\beta$,

and this may be written as

22.1    $F_{\alpha\beta}\, V^\alpha W^\beta$

if we set

22.2    $f_{\gamma\delta}\, b_{\gamma i}\, b_{\delta j} = F_{ij}$;

we call $F_{ij}$ the general components of the tensor whose cartesian components are $f_{ij}$, and we see that the values of a tensor are expressed by formula 22.1 in its general components and the general components of the vector arguments in the same way as in terms of cartesian components of the tensor and the vector arguments. Formula 21.15 for the scalar product of two vectors may be considered as a special case of 22.1. The cartesian components of the tensor $g_{ij}$ are, of course, the $\delta_{ij}$, and substituting δ for f in 22.2 we get the expressions 21.16 for $g_{ij}$. We treated here as an example a tensor of rank two; similar calculations can be performed for a tensor of any rank; we give the results for rank one and three, leaving it to the reader to go through the calculations:

22.11    $F_i = f_\gamma\, b_{\gamma i}$,    $F_{ijk} = f_{\alpha\beta\gamma}\, b_{\alpha i}\, b_{\beta j}\, b_{\gamma k}$.

Now naturally the inverse problem presents itself: given the general components of a tensor, to find its cartesian components. The problem can be solved by substituting in the expression for the value of a tensor (we again take a tensor of rank two as an example) given in terms of general components, $F_{\gamma\delta}\, V^\gamma W^\delta$, the expressions 21.9 for the general components of the vector arguments in terms of the cartesian components: $V^i = a_{i\alpha} v_\alpha$, $W^i = a_{i\beta} w_\beta$; the result is

$F_{\gamma\delta}\, a_{\gamma\alpha}\, a_{\delta\beta}\, v_\alpha w_\beta$;

comparing this to the expression $f_{\alpha\beta}\, v_\alpha w_\beta$ for the value of a tensor in cartesian components we derive the desired transformation formula for passing from general to cartesian components; here are these formulas for the first three ranks:

22.3    $f_i = F_\alpha\, a_{\alpha i}$,    $f_{ij} = F_{\alpha\beta}\, a_{\alpha i}\, a_{\beta j}$,    $f_{ijk} = F_{\alpha\beta\gamma}\, a_{\alpha i}\, a_{\beta j}\, a_{\gamma k}$.

We know now how to write tensors in general components, and we want to find out how to perform operations on them. Of course, we could always pass from general components to cartesian components, perform the required operations and then, if the result is a tensor, pass back to general components; but instead of following this program in every special case as it presents itself we shall do it once for all and derive general formulas whose application in special cases is much more convenient than ad hoc calculations.

We begin with the operation of contraction. Given again a tensor of rank two by its general components $F_{ij}$ we pass to its cartesian components by formula 22.3 and now we contract by taking the sum of components with equal indices according to the original definition,

22.4    $f_{\gamma\gamma} = F_{\alpha\beta}\, a_{\alpha\gamma}\, a_{\beta\gamma} = F_{\alpha\beta}\, g^{\alpha\beta}$,

where we use the abbreviation

22.5    $a_{i\gamma}\, a_{j\gamma} = g^{ij}$.

For tensors of higher ranks (contraction is possible only for tensors whose rank is $\ge 2$) entirely analogous formulas may be obtained easily; indices which are not affected by contraction may be simply disregarded, as it follows from similar calculations which are left to the student. For example, the result of contracting with respect to the second and fourth indices of a tensor of the fourth rank $F_{ijkl}$ will be

22.41    $F_{i\alpha k\beta}\, g^{\alpha\beta}$.

The quantities $g^{ij}$ introduced a moment ago play quite an important role comparable to that of the $g_{ij}$ with lower indices, and they are connected with them by the formula

22.6    $g^{i\alpha}\, g_{\alpha j} = \delta_{ij}$.

To prove it, it suffices to substitute the expressions 21.16 and 22.5 for the two kinds of g's and to apply formula 21.14 twice; we may also notice that, as it follows from the definition,

22.7    $g^{ij} = g^{ji}$,

so that formula 22.6 may also be written as

22.61    $g_{i\alpha}\, g^{\alpha j} = \delta_{ij}$.
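The relation between the two kinds of g's can be checked on a concrete coordinate system. The sketch below assumes plane polar coordinates ($u_1 = r$, $u_2 = \theta$), for which the a's are the derivatives of r and θ with respect to x and y and the b's are the derivatives of x and y with respect to r and θ; the helper names are hypothetical and chosen only for this illustration.

```python
import math

def a_matrix(r, theta):
    # a[i][alpha] = d u_i / d x_alpha with u = (r, theta), x = (x, y)
    return [
        [math.cos(theta), math.sin(theta)],
        [-math.sin(theta) / r, math.cos(theta) / r],
    ]

def b_matrix(r, theta):
    # b[gamma][i] = d x_gamma / d u_i
    return [
        [math.cos(theta), -r * math.sin(theta)],
        [math.sin(theta),  r * math.cos(theta)],
    ]

def g_lower(r, theta):
    # g_ij = b_{gamma i} b_{gamma j}  (formula 21.16)
    b = b_matrix(r, theta)
    return [[sum(b[c][i] * b[c][j] for c in range(2)) for j in range(2)]
            for i in range(2)]

def g_upper(r, theta):
    # g^ij = a_{i gamma} a_{j gamma}  (formula 22.5)
    a = a_matrix(r, theta)
    return [[sum(a[i][c] * a[j][c] for c in range(2)) for j in range(2)]
            for i in range(2)]

def matmul(p, q):
    # Ordinary 2 x 2 matrix product, i.e. summation over the adjacent indices.
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

r, theta = 2.0, 0.7
ab = matmul(a_matrix(r, theta), b_matrix(r, theta))        # 21.14: Kronecker delta
product = matmul(g_upper(r, theta), g_lower(r, theta))      # 22.6: Kronecker delta
upper = g_upper(r, theta)                                   # diag(1, 1/r^2) here
```

In this example $g^{11} = 1$ and $g^{22} = 1/r^2$, the reciprocals of $g_{11}$ and $g_{22}$, as must be the case for any system in which the g's with different indices vanish.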

Next comes the operation of differentiation. The result of differentiating a tensor is always a tensor of rank higher by one than the given tensor; its components will have one more index than those of the given tensor; we shall denote them by simply adding a new index, preceded by a comma, to the symbol of the given tensor.

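Because the situation is slightly more complicated, let us start in translating differentiation into general components with the simplest case of a tensor of rank one given in general components, $F_i$. We follow the same program: as a first step we pass to cartesian components by formula 22.3 and get $f_i = F_\alpha\, a_{\alpha i}$; we next find the cartesian components of the differential by simply differentiating with respect to cartesian coordinates with the result

$f_{i,j} = \dfrac{\partial (F_\alpha\, a_{\alpha i})}{\partial x_j}$.

As a third step we pass back to general components using the formula 22.2 and arriving at the result

$F_{i,j} = \dfrac{\partial (F_\alpha\, a_{\alpha\gamma})}{\partial x_\delta}\, b_{\gamma i}\, b_{\delta j}$;

but $b_{\delta j}$ according to formula 21.11 is $\partial x_\delta / \partial u_j$, so that this expression reduces to

$\dfrac{\partial (F_\alpha\, a_{\alpha\gamma})}{\partial u_j}\, b_{\gamma i} = \dfrac{\partial F_\alpha}{\partial u_j}\, a_{\alpha\gamma}\, b_{\gamma i} + F_\alpha\, \dfrac{\partial a_{\alpha\gamma}}{\partial u_j}\, b_{\gamma i}$;

using relation 21.14 the first term reduces to

$\dfrac{\partial F_i}{\partial u_j}$,

just what we would expect from analogy with cartesian coordinates as an expression for the result of differentiation; however, this is not the whole answer because there is a second term, so that the final result is

22.8    $F_{i,j} = \dfrac{\partial F_i}{\partial u_j} - \Gamma^\alpha_{ij}\, F_\alpha$,

where we set

22.81    $\Gamma^k_{ij} = -\dfrac{\partial a_{k\gamma}}{\partial u_j}\, b_{\gamma i}$;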

the second term may be considered as being in the nature of a correction to the expected result; we call it the correction term, and we call the $\Gamma^k_{ij}$ the correction coefficients. We see then that in general coordinates the components of the differential of a tensor of rank one consist of two parts - the first expresses the change (or rate of change) of components of the tensor, the second is due to the change of the coordinate system from point to point. In the case of the cartesian system the coordinate system is, so to say, the same in all points, and the second term is zero (the a's reduce in this case to constants, and their derivatives vanish); another extreme example is furnished by a tensor whose components are constants in some non-cartesian system of coordinates (for instance, polar coordinates); the derivatives of the components with respect to the coordinates are zero but the components of the differential are not; their values are given by the correction terms alone.

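For tensors of rank higher than the first the calculations are slightly more complicated, but the principle is the same; we write out the results for tensors of rank two and three:

22.82    $F_{ij,k} = \dfrac{\partial F_{ij}}{\partial u_k} - \Gamma^\alpha_{ik}\, F_{\alpha j} - \Gamma^\alpha_{jk}\, F_{i\alpha}$,

22.83    $F_{ijk,l} = \dfrac{\partial F_{ijk}}{\partial u_l} - \Gamma^\alpha_{il}\, F_{\alpha jk} - \Gamma^\alpha_{jl}\, F_{i\alpha k} - \Gamma^\alpha_{kl}\, F_{ij\alpha}$;

the general rule ought to be clear now; there are as many correction terms as there are indices; each correction term corresponds to one index, the other indices being disregarded in its formation.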

In order to be able to perform the operation of contraction (and the operation of scalar multiplication is a special case of it) in general coordinates we have to know, as we saw, the values of the g's; in order to be able to perform the operation of differentiation we have to know the values of the Γ's (the correction coefficients); if we know those we can perform all the necessary operations in general coordinates without going back to cartesian coordinates. We shall show now that the values of the Γ's can be derived from those of the g's.

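The correction coefficients were given originally by the formula 22.81; we can give to this another form by using the relation 21.14; writing it as $a_{k\gamma}\, b_{\gamma i} = \delta_{ki}$ and differentiating it with respect to $u_j$ we get, since the δ's are constants,

$\dfrac{\partial a_{k\gamma}}{\partial u_j}\, b_{\gamma i} + a_{k\gamma}\, \dfrac{\partial b_{\gamma i}}{\partial u_j} = 0$,

so that we have

$\Gamma^k_{ij} = a_{k\gamma}\, \dfrac{\partial b_{\gamma i}}{\partial u_j}$,

or, recalling the definition 21.11 of the b's,

22.84    $\Gamma^k_{ij} = a_{k\gamma}\, \dfrac{\partial^2 x_\gamma}{\partial u_i\, \partial u_j}$;

from this expression it follows that Γ is not affected by interchanging the two lower indices, or

22.71    $\Gamma^k_{ij} = \Gamma^k_{ji}$.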

We may now show how the Γ's can be derived from the g's. We shall do that by using the following artifice. Consider the tensor of scalar multiplication, whose cartesian components are the $\delta_{ij}$ and whose general components were shown to be $g_{ij}$; the components of the differential of this tensor in cartesian components are the derivatives of the δ's and therefore zero; the second formula 22.11 shows that the general components of this tensor of the third rank also must vanish, so that

22.85    $g_{ij,k} = 0$

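(we did not promise that general components of tensors will always be given by capital letters, but since heretofore we have been using capital letters for them it may be well to emphasize that the g's are intended to represent, following the generally accepted custom, general components of the scalar multiplication tensor). On the other hand, we can calculate the components $g_{ij,k}$ by the application of formula 22.82 and so we get

$\dfrac{\partial g_{ij}}{\partial u_k} - \Gamma^\alpha_{ik}\, g_{\alpha j} - \Gamma^\alpha_{jk}\, g_{i\alpha} = 0$;

this is a system of equations connecting the Γ's with the g's and their derivatives; we want to solve them for the Γ's. For that purpose we write out the above relation in two more forms resulting from it by cyclic interchanges of indices:

$\dfrac{\partial g_{jk}}{\partial u_i} - \Gamma^\alpha_{ji}\, g_{\alpha k} - \Gamma^\alpha_{ki}\, g_{j\alpha} = 0$,

$\dfrac{\partial g_{ki}}{\partial u_j} - \Gamma^\alpha_{kj}\, g_{\alpha i} - \Gamma^\alpha_{ij}\, g_{k\alpha} = 0$;

subtracting the last two relations from the first we notice that as the result of symmetry of the g's and the Γ's in the lower indices (formulas 21.16' and 22.71) four of the terms containing the Γ's cancel and the remaining two are identical; we thus have

22.9    $\dfrac{1}{2}\left(\dfrac{\partial g_{jk}}{\partial u_i} + \dfrac{\partial g_{ki}}{\partial u_j} - \dfrac{\partial g_{ij}}{\partial u_k}\right) = g_{k\alpha}\, \Gamma^\alpha_{ij}$.

We multiply now both sides by $g^{kl}$ and sum with respect to k, writing for it a Greek index, e.g., β. Taking into account 22.61 we have

22.91    $\Gamma^l_{ij} = \dfrac{1}{2}\, g^{l\beta}\left(\dfrac{\partial g_{j\beta}}{\partial u_i} + \dfrac{\partial g_{\beta i}}{\partial u_j} - \dfrac{\partial g_{ij}}{\partial u_\beta}\right)$.

This shows how, given the g's, to calculate the Γ's. We see thus that if only we are given the g's as functions of the u's we can perform all the required operations on tensors. Very often the calculation of the Γ's is divided into two parts; first the left hand sides of 22.9 are calculated and listed; they are denoted by $\Gamma_{k,ij}$; and then the $\Gamma^l_{ij}$ are calculated using the formulas in the form

22.92    $\Gamma^l_{ij} = g^{l\beta}\, \Gamma_{\beta,ij}$.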
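The two-stage calculation just described (first the quantities with all indices down, then multiplication by the g's with upper indices) can be sketched numerically. The example assumes plane polar coordinates with $g_{11} = 1$, $g_{22} = r^2$, and replaces the derivatives of the g's by central differences; the step size `h` and all function names are assumptions made for this illustration only.

```python
def metric(u):
    # g_ij for plane polar coordinates u = (r, theta)
    r, _theta = u
    return [[1.0, 0.0], [0.0, r * r]]

def inverse_metric(u):
    # g^ij for the same coordinates
    r, _theta = u
    return [[1.0, 0.0], [0.0, 1.0 / (r * r)]]

def d_metric(u, k, h=1e-5):
    # Central-difference estimate of d g_ij / d u_k, returned as a matrix in (i, j).
    up, dn = list(u), list(u)
    up[k] += h
    dn[k] -= h
    gp, gm = metric(up), metric(dn)
    return [[(gp[i][j] - gm[i][j]) / (2.0 * h) for j in range(2)]
            for i in range(2)]

def christoffel(u):
    n = 2
    dg = [d_metric(u, k) for k in range(n)]      # dg[k][i][j] = d g_ij / d u_k
    ginv = inverse_metric(u)
    # First stage: lower[k][i][j] = 1/2 (d_i g_jk + d_j g_ki - d_k g_ij)
    lower = [[[0.5 * (dg[i][j][k] + dg[j][k][i] - dg[k][i][j])
               for j in range(n)] for i in range(n)] for k in range(n)]
    # Second stage: gamma[l][i][j] = g^{l beta} lower[beta][i][j]
    return [[[sum(ginv[l][b] * lower[b][i][j] for b in range(n))
              for j in range(n)] for i in range(n)] for l in range(n)]

gamma = christoffel((2.0, 0.7))
gamma_r_tt = gamma[0][1][1]    # correction coefficient with upper index r, lower theta, theta
gamma_t_rt = gamma[1][0][1]    # correction coefficient with upper index theta, lower r, theta
```

For polar coordinates the only non-vanishing coefficients are the one with upper index r and lower indices θ, θ, equal to $-r$, and the one with upper index θ and lower indices r, θ (in either order), equal to $1/r$; the sketch reproduces these values at $r = 2$.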

23. Covariant and Contravariant Components.

We know (Section 9) that a vector is a tensor of rank one, or, more precisely, that to every vector v there corresponds a tensor of rank one $v_i$ which has the same cartesian components. Now we have introduced general components for vectors and also for tensors; if we have cartesian components of a vector $v_i$, to them correspond (21.9) the general components

$V^i = a_{i\alpha}\, v_\alpha$;

also, if we consider the $v_i$ as the components of a tensor, to them correspond (22.11) the general components

$V_i = b_{\alpha i}\, v_\alpha$;

to the same cartesian components $v_i$ there correspond thus two different sets of general components depending on whether we consider the $v_i$ as vector or as tensor components; it was in anticipation of this situation that we have been using the index for general vector components as a superscript. Essentially, a vector and a tensor of rank one are one and the same thing; and so we have two different systems of components for every vector (in a given general coordinate system); the components with subscripts are called covariant components, those with the superscript - contravariant components. It is clear from what precedes, but it may be worthwhile to repeat, that we have here two different representations of one and the same thing.

It was mentioned in Section 21 that there are two ways in which a vector results from differentiation; one, a vector considered as a tangent vector to a curve, was discussed before, and is the basis of what we have been doing all this time; it is interesting to consider now briefly the other. If we start with a scalar field f = f(x,y,z) we may derive a vector field by differentiation, and the cartesian components of this vector field will be

23.1    $f_i = \dfrac{\partial f}{\partial x_i}$;

this vector is known as the gradient of the field f; now, we may give the same scalar field in general coordinates

$f = f\bigl(x_1(u_1, u_2, u_3),\; x_2(u_1, u_2, u_3),\; x_3(u_1, u_2, u_3)\bigr)$;

if we differentiate f with respect to $u_i$, will we obtain general components of the gradient vector field? The question is easily answered by computing these components; we have

23.2    $\dfrac{\partial f}{\partial u_i} = \dfrac{\partial f}{\partial x_\alpha}\, \dfrac{\partial x_\alpha}{\partial u_i} = f_\alpha\, b_{\alpha i}$;

comparing this with 22.11 we see that the partial derivatives $\partial f / \partial u_i$ are the components with subscripts - the covariant components in general coordinates of the gradient. The two representations, the covariant and the contravariant, may be thus considered as corresponding to two ways in which a vector can be arrived at by differentiation; if we consider a vector as a gradient we arrive naturally at its representation by covariant components, if we consider it as a tangent vector we arrive at its representation by contravariant components. (The name covariant, by the way, is intended to indicate that these components change in the same way as, or have similar formulas of transformation with, partial derivatives.) In the case of cartesian coordinates, of course, covariant and contravariant components of a given vector coincide: in this case it is not necessary to make any distinction.
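The statement that the partial derivatives with respect to the u's are the covariant components of the gradient can be tested on an example. The sketch below assumes plane polar coordinates and the scalar field $f = x^2 y$, both chosen here purely for illustration; it compares $f_\alpha b_{\alpha i}$, built from the cartesian gradient, with a central-difference estimate of $\partial f / \partial u_i$.

```python
import math

def f_cart(x, y):
    # Illustrative scalar field f = x^2 y
    return x * x * y

def grad_cart(x, y):
    # Cartesian components f_alpha = d f / d x_alpha (formula 23.1)
    return [2.0 * x * y, x * x]

def to_cart(r, theta):
    return r * math.cos(theta), r * math.sin(theta)

def b_matrix(r, theta):
    # b[alpha][i] = d x_alpha / d u_i for u = (r, theta)
    return [
        [math.cos(theta), -r * math.sin(theta)],
        [math.sin(theta),  r * math.cos(theta)],
    ]

def covariant_gradient(r, theta):
    # F_i = f_alpha b_{alpha i}, the rank-one case of formula 22.11
    fa = grad_cart(*to_cart(r, theta))
    b = b_matrix(r, theta)
    return [sum(fa[al] * b[al][i] for al in range(2)) for i in range(2)]

def numeric_partials(r, theta, h=1e-6):
    # Central differences of f(x(u), y(u)) with respect to u_1 = r and u_2 = theta
    d_r = (f_cart(*to_cart(r + h, theta)) - f_cart(*to_cart(r - h, theta))) / (2.0 * h)
    d_t = (f_cart(*to_cart(r, theta + h)) - f_cart(*to_cart(r, theta - h))) / (2.0 * h)
    return [d_r, d_t]

from_components = covariant_gradient(2.0, 0.7)
from_derivatives = numeric_partials(2.0, 0.7)
```

The two lists agree to within the accuracy of the difference quotient, which illustrates formula 23.2 for this particular field and coordinate system.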

We shall have to use covariant as well as contravariant components, and it is important to be able to pass from one to the other representation; the necessary formulas can be found, of course, by passing through a cartesian representation. Let covariant components $F_i$ of a tensor of rank one (or vector) be given; formulas 22.3 show us that the corresponding cartesian components are $f_i = F_\alpha\, a_{\alpha i}$; in terms of these the contravariant components are obtained by formula 21.9 which gives here

23.3    $F^i = a_{i\beta}\, F_\alpha\, a_{\alpha\beta} = g^{i\alpha}\, F_\alpha$,

if abbreviation 22.5 is used. In the same way it is easy to prove the following formula, which permits one to calculate covariant components when the contravariant components are given:

23.31    $F_i = g_{i\alpha}\, F^\alpha$.

It may seem that there is a wasteful redundancy in this double system of notations, that one representation is enough, and that to have two means to indulge in luxury; as a matter of fact this double notation is a defect from a didactical point of view: it makes it more difficult to learn the new language; but once mastered it makes the calculations much simpler and the formulas much shorter and more elegant, if properly used; as an example, we want to give the formula for the scalar product of two vectors, one of which is given in covariant, and the other in contravariant components. This formula can be obtained by the usual procedure, i.e., passing through cartesian components, but we have already reached a stage where we can dispense to a great extent with the use of cartesian coordinates. The required formula is simply

23.4    $V^\alpha\, W_\alpha$,

and it can be proved by simply substituting for $W_\alpha$ its expression according to formula 23.31, viz., $g_{\alpha\beta}\, W^\beta$, and comparing the result to 21.15; of course, the scalar product could also be written as $V_\alpha\, W^\alpha$, and also $g^{\alpha\beta}\, V_\alpha W_\beta$, as it is easy to verify.

In Section 22 we derived a system of representation for tensors starting with the contravariant representation of vectors; we could do the same thing starting with the covariant representation of vectors, and we shall do it so as to have a perfectly symmetrical system of notations.

Suppose we are given covariant components of two vectors $V_i$, $W_i$ and the components $F_{ij}$ of a tensor, and we want to find the value of the tensor corresponding to the given vectors as arguments; we know how to solve the problem if the vectors are given by their contravariant components; therefore, let us calculate first the contravariant components, viz., $V^i = g^{i\alpha}\, V_\alpha$, $W^i = g^{i\beta}\, W_\beta$, and then substitute them into the left hand side of expression 22.1 giving the value of the tensor. The result is

23.4    $F_{\gamma\delta}\, g^{\gamma\alpha}\, V_\alpha\, g^{\delta\beta}\, W_\beta$,

which may be written as

23.41    $F^{\alpha\beta}\, V_\alpha W_\beta$

if we introduce the notation

23.42    $F^{ij} = g^{i\alpha}\, g^{j\beta}\, F_{\alpha\beta}$.

We call $F^{ij}$ the contravariant components of the tensor $F_{ij}$, and the components with lower indices (subscripts) which we have been using for tensors heretofore we call covariant components. We have thus two representations not only for tensors of the first rank (vectors) but also for tensors of all ranks. In one case we have been using already a symbol with two superscripts, viz., the $g^{ij}$ (introduced by 22.5); we shall show now that this notation is in agreement with the general notation we are introducing now by proving that these g's with upper indices are the contravariant components of the tensor of scalar multiplication. In order to prove that, we notice that, according to formula 23.42, the contravariant components of a tensor of covariant components $g_{ij}$ are

$g^{i\alpha}\, g^{j\beta}\, g_{\alpha\beta}$;

but according to 22.61 (and the symmetry of the g's) $g^{j\beta}\, g_{\alpha\beta} = \delta_{\alpha j}$, so that we get $\delta_{\alpha j}\, g^{i\alpha} = g^{ij}$, and the assertion is proved.

Next we want to learn how to differentiate a tensor given in contravariant components, but before we do that it seems necessary to introduce what we call mixed components. Suppose we are given one vector argument of a tensor of rank two in contravariant, and the other in covariant components, and we want to find the value of the tensor; if the components of the two given vectors are $V^i$ and $W_i$, and the cartesian components of the tensor are $f_{ij}$, we pass to cartesian components of the vector arguments

23.5    $v_i = b_{i\gamma}\, V^\gamma$,    $w_j = W_\delta\, a_{\delta j}$,

and express the value of the tensor as

23.51    $f_{\alpha\beta}\, v_\alpha w_\beta = f_{\alpha\beta}\, b_{\alpha\gamma}\, a_{\delta\beta}\, V^\gamma W_\delta = F_\gamma{}^\delta\, V^\gamma W_\delta$,

where the notation

23.52    $F_i{}^j = f_{\alpha\beta}\, b_{\alpha i}\, a_{j\beta}$

is introduced. The numbers $F_i{}^j$ with one lower and one upper index are called mixed components of the tensor of rank two whose cartesian components are $f_{ij}$. In this same way we may consider mixed components for a tensor of any rank with as many of the indices up as we may wish, and the others - down.

We can pass from one kind of component to any other directly, without going through cartesian components. The transition from components in which a certain index (for example, the third) is used as a superscript to components in which the same index is used as a subscript is called the lowering of that index. This change does not affect the geometrical meaning of the tensor, it merely corresponds to a transition from an expression of the tensor in which the corresponding vector argument (in our example, the third) was given by its covariant components to an expression of the same tensor using contravariant components for that vector argument. The formula for the lowering of an index is easily found to be independent of all other indices, so that, disregarding them, we always may use 23.31. Formula 23.3 may be considered as a general formula for raising an index. Lowering and raising of indices is sometimes referred to as juggling with indices.
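Lowering and raising of indices amounts to a summation against the g's, which the following sketch carries out for two vectors. It assumes the g's of plane polar coordinates at $r = 2$ (that is, $g_{11} = 1$, $g_{22} = 4$) and two arbitrarily chosen sets of contravariant components; the function names are hypothetical.

```python
def lower_index(g, v_up):
    # F_i = g_{i alpha} F^alpha  (formula 23.31)
    n = len(v_up)
    return [sum(g[i][a] * v_up[a] for a in range(n)) for i in range(n)]

def raise_index(g_inv, v_down):
    # F^i = g^{i alpha} F_alpha  (formula 23.3)
    n = len(v_down)
    return [sum(g_inv[i][a] * v_down[a] for a in range(n)) for i in range(n)]

def contract(p, q):
    # Sum over one upper and one lower index, as in V^alpha W_alpha.
    return sum(x * y for x, y in zip(p, q))

r = 2.0
g = [[1.0, 0.0], [0.0, r * r]]               # g_ij of polar coordinates at r = 2
g_inv = [[1.0, 0.0], [0.0, 1.0 / (r * r)]]   # g^ij

V_up = [1.0, 2.0]      # illustrative contravariant components
W_up = [3.0, -1.0]

V_down = lower_index(g, V_up)
W_down = lower_index(g, W_up)

s_mixed_1 = contract(V_up, W_down)           # V^alpha W_alpha
s_mixed_2 = contract(V_down, W_up)           # V_alpha W^alpha
s_lower_g = sum(g[a][b] * V_up[a] * W_up[b] for a in range(2) for b in range(2))
s_upper_g = sum(g_inv[a][b] * V_down[a] * W_down[b] for a in range(2) for b in range(2))
W_back = raise_index(g_inv, W_down)          # raising undoes lowering
```

All four expressions for the scalar product give the same number, and raising the lowered components returns the original contravariant components, which is the practical content of the "juggling" rules.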

Again it may seem that the introduction of mixed components is superfluous, but there are advantages in using mixed components.

One advantage appears in connection with contraction. The result of contraction is given in terms of covariant components by formula 22.4 (or 22.41). But, according to 23.3 we may write $F^i{}_j$ for $F_{\alpha j}\, g^{i\alpha}$ so that 22.4 may be written as

23.48    $F^\alpha{}_\alpha$

and for 22.41 we may write $F_i{}^\alpha{}_{k\alpha}$. We see thus that if a tensor is given by its mixed components and the two indices with respect to which we contract appear on different levels (one as a subscript, the other as a superscript) contraction is performed (as in cartesian coordinates) by simply replacing each of the two indices by the same Greek letter.

Another case where there is great advantage in using mixed components is that of differentiation of a contravariant tensor (as we say sometimes for "tensor given by its contravariant components"; a tensor in itself is, of course, neither contravariant nor covariant - covariance and contravariance are only properties, or types of representation of tensors); the components of the differential will have one more index, and this index, as one derived by differentiation, will naturally be a subscript, whereas the old indices are superscripts; this does not mean that we cannot pull the new index up, or the old ones down, but the expressions resulting from that would be more complicated.

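Suppose the given contravariant tensor is of rank one (a vector) $V^i$; we pass to cartesian components:

$v_\gamma = b_{\gamma\alpha}\, V^\alpha$;

we differentiate this:

$\dfrac{\partial v_\gamma}{\partial x_\delta} = \dfrac{\partial (b_{\gamma\alpha}\, V^\alpha)}{\partial x_\delta}$;

and we pass to mixed components by formula 23.52:

$V^i{}_{,j} = a_{i\gamma}\, b_{\delta j}\, \dfrac{\partial (b_{\gamma\alpha}\, V^\alpha)}{\partial x_\delta} = a_{i\gamma}\, \dfrac{\partial (b_{\gamma\alpha}\, V^\alpha)}{\partial u_j}$,

and this, using 21.14 and 22.71, reduces to

23.6    $V^i{}_{,j} = \dfrac{\partial V^i}{\partial u_j} + \Gamma^i_{j\alpha}\, V^\alpha$.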

The reader should be able, following the examples given, to deduce formulas for differentiation of a tensor given in any form. We just mention, because we will have occasion to use it later, the formula for differentiation of a mixed tensor

23.7    $F^i{}_{j,k} = \dfrac{\partial F^i{}_j}{\partial u_k} + \Gamma^i_{k\alpha}\, F^\alpha{}_j - \Gamma^\alpha_{jk}\, F^i{}_\alpha$.

We are in possession now of all the formal rules of operations on tensors in general coordinates. Although these rules were deduced by means of cartesian coordinates, these coordinates and components together with all formulas involving the a's and the b's form only a kind of scaffolding that can be removed after the building has been completed. All we have to know in order to operate on tensors are the g's. Using the g's we can lower and raise indices and contract and, as a special case, find the scalar product of two vectors; also find the angle between two vectors (using formula 7.5) and the length of a curve (using 21.18). Given the g's we can calculate the Γ's (end of Section 22) and with the aid of the Γ's we can differentiate tensors (formulas 22.8). We see thus that the g's play a fundamental part in all operations - the tensor of which they are components is often called the fundamental tensor.

Before we conclude we might state explicitly that all the formulas we have obtained are entirely independent of the number of dimensions.

24. Physical Coordinates as General Coordinates.

The principal purpose for the introduction of general coordinates was to make possible the treatment of tensors in curved space but it happens that general coordinates may be used with great advantage also in Special Relativity Theory, namely, in connection with the situation arising from the "minus sign". We remedied this situation in Section 4 by introducing imaginary coordinates and tensor components; we know how, using these imaginary quantities, to write our formulas in a nice symmetrical form. The system of notations for general coordinates that we have introduced permits us now to reintroduce real quantities, and still to preserve symmetry in the formulas. We shall express our four mathematical coordinates $x_1, x_2, x_3, x_4$, of which the fourth is imaginary, in terms of four real coordinates which we may denote by $u_i$; we may choose as these four real numbers the physical coordinates x, y, z, t and consider the formulas

24.1    $x_1 = x$,  $x_2 = y$,  $x_3 = z$,  $x_4 = it$

as the transformation formulas, corresponding to 21.1; and

24.11    $x = x_1$,  $y = x_2$,  $z = x_3$,  $t = -i x_4$

as the inverse formulas corresponding to 21.2. The $a_{ij}$ and the $b_{ij}$ with different subscripts are easily seen to be zero, and we have (compare 21.8 and 21.11)

24.2    $a_{11} = a_{22} = a_{33} = 1$,  $a_{44} = i$;    $b_{11} = b_{22} = b_{33} = 1$,  $b_{44} = -i$;

from these we obtain, using 21.16,

24.21    $g_{11} = g_{22} = g_{33} = -g_{44} = 1$, all others zero,

and the same values we obtain for the g's with upper indices, using 22.5.

The x, y, z, t may be considered as the contravariant components of the radius vector leading from the origin to the point P; the covariant components of the same vector are seen, applying the formula 23.31, to be x, y, z, -t.

The formula for the square of the distance from the origin may be obtained either from 21.15 or from 23.4; it is (compare 10.1)

24.3    $x^2 + y^2 + z^2 - t^2$.
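The lowering of the index of the radius vector and the resulting expression for the squared distance can be written out in a few lines. The sketch assumes the values of the g's quoted in 24.21 (ones for the first three diagonal entries, minus one for the fourth, zero elsewhere); the numerical components of the radius vector are arbitrary illustrative values.

```python
# g_ij in physical coordinates x, y, z, t according to 24.21
G = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, -1.0],
]

def lower_index(g, v_up):
    # Covariant components from contravariant ones, F_i = g_{i alpha} F^alpha
    n = len(v_up)
    return [sum(g[i][a] * v_up[a] for a in range(n)) for i in range(n)]

def squared_distance(v_up):
    # Scalar product of the vector with itself, V^alpha V_alpha
    v_down = lower_index(G, v_up)
    return sum(p * q for p, q in zip(v_up, v_down))

radius_up = [1.0, 2.0, 2.0, 3.0]           # contravariant components x, y, z, t
radius_down = lower_index(G, radius_up)    # covariant components x, y, z, -t
s2 = squared_distance(radius_up)           # x^2 + y^2 + z^2 - t^2
```

With the chosen numbers the covariant components are 1, 2, 2, -3 and the squared distance is 1 + 4 + 4 - 9 = 0, so this particular radius vector has zero length although none of its components vanish; no imaginary quantity enters the computation.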

We come next to Maxwell's equations where the "minus sign trouble" originated. To conform with the notations of this chapter we should use for the cartesian components - the mathematical components of preceding chapters - small letters, so that formulas 4.72 will have to be written

$f_{41} = iX$,  $f_{42} = iY$,  $f_{43} = iZ$,  $f_{23} = L$,  $f_{31} = M$,  $f_{12} = N$.

Using formulas 22.2 and 24.2 we obtain the covariant components in physical coordinates - and we use here capital letters - as follows:

24.4    $F_{41} = X$,  $F_{42} = Y$,  $F_{43} = Z$,  $F_{23} = L$,  $F_{31} = M$,  $F_{12} = N$.

Mixed and contravariant components may be obtained by raising indices - formula 23.3. Due to the simple character of the g's given by 24.21 it is easy to see that raising one of the indices 1, 2, 3 does not change the numerical value of a component, so that, for instance,

24.5    $F^{23} = g^{2\alpha}\, g^{3\beta}\, F_{\alpha\beta} = F_{23}$,

and raising of the index 4 just changes the sign of the component, so that

24.6    $F^4{}_1 = g^{4\alpha}\, F_{\alpha 1} = -F_{41}$.
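The effect of raising an index on the components of the field tensor can be verified mechanically. The sketch below assumes that the tensor is antisymmetric (so that only six components need be given), assigns X, Y, Z to the components with indices 41, 42, 43 and L, M, N to those with indices 23, 31, 12, and uses arbitrary numerical values for these six quantities; the function names are hypothetical.

```python
# g^ij in physical coordinates: the same values as the g_ij of 24.21
G_UP = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, -1.0],
]

def field_tensor(X, Y, Z, L, M, N):
    # Covariant components F_ij, stored with zero-based positions (i-1, j-1).
    # Antisymmetry F_ji = -F_ij is an assumption built into this helper.
    F = [[0.0] * 4 for _ in range(4)]

    def put(i, j, value):
        F[i - 1][j - 1] = value
        F[j - 1][i - 1] = -value

    put(4, 1, X)
    put(4, 2, Y)
    put(4, 3, Z)
    put(2, 3, L)
    put(3, 1, M)
    put(1, 2, N)
    return F

def raise_first(g_up, F):
    # Mixed components with the first index raised: g^{i alpha} F_{alpha j}
    return [[sum(g_up[i][a] * F[a][j] for a in range(4)) for j in range(4)]
            for i in range(4)]

F = field_tensor(X=0.5, Y=-1.0, Z=2.0, L=3.0, M=4.0, N=5.0)   # illustrative values
mixed = raise_first(G_UP, F)
```

Rows 1 to 3 of `mixed` coincide with the corresponding rows of `F`, while row 4 is reversed in sign; this is the whole mechanism by which the minus sign is absorbed without imaginary quantities.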

Which components shall we use in Maxwell's equations? It is clear that in the first set (11.2) all the indices must be on the same level, and since the last one must be a lower index we write all of them down. In the second set (11.3) again the one after the comma must be down; but the one with which we contract it must be on the other level, and therefore up; the position of the third index is arbitrary. We have thus, as the Maxwell equations for free space,

24.7    $F_{ij,k} + F_{jk,i} + F_{ki,j} = 0$,    $F_j{}^\alpha{}_{,\alpha} = 0$,

and in the presence of matter the second set becomes

24.71    $F_j{}^\alpha{}_{,\alpha} = \epsilon u_j$.

We notice here that no imaginary quantities appear and in spite of this our formulas are symmetric. The raising of the index 4 is equivalent to changing the sign of a component, and this is how the minus sign is taken care of.

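We shall now write out the expressions for the stress energy tensor and the equations of motion; it is clear that the formulas 11.4 and 11.5 become

24.8    $T_i{}^\alpha{}_{,\alpha} = 0$,

24.9    $T^i{}_j = F^i{}_\rho\, F^\rho{}_j - \tfrac{1}{4}\, \delta_{ij}\, F^\sigma{}_\rho\, F^\rho{}_\sigma - \rho\, u^i u_j$,

or

24.91    $T^{ij} = F^i{}_\rho\, F^{\rho j} - \tfrac{1}{4}\, g^{ij}\, F^\sigma{}_\rho\, F^\rho{}_\sigma - \rho\, u^i u^j$,

or

24.92    $T_{ij} = F_{i\rho}\, F^\rho{}_j - \tfrac{1}{4}\, g_{ij}\, F^\sigma{}_\rho\, F^\rho{}_\sigma - \rho\, u_i u_j$.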

The continuity equation (11.1) will now be written as

24.10    $(\rho\, u^\alpha)_{,\alpha} = 0$.

25. Curvilinear Coordinates in Curved Space.

We want next to apply the general coordinate system that we have introduced for flat space also to curved spaces. In flat space we introduce the language of general coordinates and components by translating from the language of cartesian coordinates and components; in curved space we have no cartesian components; we shall have, therefore, to begin by introducing something that will play the role of cartesian coordinates; we shall introduce quasi-cartesian coordinates, which will take that place; but whereas cartesian coordinates are universal in that the same system of coordinates works for the whole plane, or flat space - the neighborhood of every point in curved space has its own system of quasi-cartesian coordinates. They are defined in the following way: Consider at a given point P the tangent flat; there will be in general a neighborhood of P such that no two points of that neighborhood have the same projection on the tangent flat (for a sphere, e.g., we may obtain such a neighborhood by drawing any small circle around the point of contact); for such a neighborhood there exists a one-to-one correspondence between the points of it and the points of the tangent flat which are their projections. We introduce now on the tangent flat a cartesian system of coordinates with origin at the point of contact, and we use the coordinates of a point of the flat as the quasi-cartesian coordinates of the point of the curved space whose projection it is; if, for instance, a surface is given by equation

25.1    $z = \tfrac{1}{2}(a x^2 + 2bxy + c y^2) + \text{t.h.d.}$,

x, y will be the quasi-cartesian coordinates of the point x, y, z of the surface for the neighborhood of 0, 0, 0; and in the general case, if the curved space is given by 19.71

25.11

t.h.d.

the $x_i$ $(i = 1, \dots, n)$ will be the quasi-cartesian coordinates of the point $x_i$ $(i = 1, \dots, N)$ for the neighborhood of the point 0, 0, 0, ..., 0.

When we were discussing curved space in Sections 18, 19, 20 we were speaking of vectors and tensors; these vectors were vectors of the tangent plane or tangent flat with initial point at the point of contact. We shall not consider any other vectors in connection with curved spaces and we shall refer to these vectors as the vectors of the curved space. To make this idea seem more natural we may remark that a tangent vector to a curve on a surface (or on a curved space) is such a vector, that is, a vector of the tangent plane (or flat) with initial point at the point of contact. In handling these vectors we have been using for the vectors at every point of the curved space a cartesian coordinate system in the flat tangent at that point, in fact, we may say the same system that furnishes us the quasi-cartesian coordinates for the points of the curved space in the neighborhood of the point of contact. We shall, therefore, refer to the cartesian components of the vectors and tensors of the tangent flat - when they are considered as vectors and tensors of the curved space - as quasi-cartesian components of these vectors and tensors.

We have thus in connection with every point P on the curved space a local coordinate system which gives quasi-cartesian coordinates of the neighboring points and the quasi-cartesian components of the tensors at P, and in some cases these local coordinate systems are very useful, but it will be necessary to introduce more general, more universal systems and learn how to represent vectors in them. The necessity of this last requirement will be clear if we consider that, although a quasi-cartesian system of a point P may be used to represent points in the neighborhood of P, it is not quasi-cartesian for these points and cannot be used as such to represent vectors at such points.

There is no difficulty in introducing a universal system of coordinates for the points of a space - what we want is, just as in ordinary space, a one-to-one correspondence between the points and n-ples of numbers. A simple example is furnished by the so-called geographical coordinates for the surface of a sphere. Another approach is given by the so-called parametric representation of a curved space. If the coordinates of a flat space $x_1, \dots, x_N$ are given as functions of one parameter we have a curve; when they are given as functions

25.2    $x_i = x_i(u_1, \dots, u_n)$    $(i = 1, \dots, N)$

of n parameters $u_i$ we have what we have called an n-dimensional curved space, because eliminating these n parameters from the N equations which express the coordinates in terms of them we find that the coordinates must satisfy N - n = r equations, and this was our definition of an n-dimensional curved space. Now since to every set of values $u_1, \dots, u_n$ of the parameters there corresponds one point of the curved space, the parameters $u_i$ may be used as coordinates for the curved space.

Suppose then that we have introduced in some way a general system of coordinates for the points on a curved space (the reader may always think of the special case of a surface). What will be a natural system of representation of vectors to go with it? Just as we use a cartesian system in the tangent flat to represent points on the surface we may, so to say, project the general coordinate system on each tangent flat and use it to represent vectors and tensors in that flat, in particular those with initial points at the point of contact, i.e., the vectors and tensors of the curved space. For the neighborhood of each point we have thus two coordinate systems: the general and the quasi-cartesian for that point - and the same two systems, or, rather, their projections, we may use on the corresponding tangent flat. For the neighborhood of each point there will be transformation formulas for the coordinates of points, and from these we can derive transformation formulas for components of vectors and tensors involving the a's, the b's and the g's as introduced in Sections 21, 22 and 23. But since we consider only vectors with initial points at the point of contact we shall use the a's, the b's and the g's calculated from the correspondence between the quasi-cartesian and the general coordinates at a point only for that point itself. We know that the a's and b's are necessary only in the building up of a system, so that all we shall need in order to be actually able to handle tensors and vectors in a given general coordinate system are the g's.

We want to explain now how to obtain the g's when a space is given in parametric form 25.2.

We consider a curve $u_i(t)$ on the space, and its tangent vector at some point; it may be considered either as a vector of the curved space, and then its contravariant components will be given, if we denote differentiation with respect to the parameter by a dot placed over a letter, by $\dot u_i$, and the square of its length will be

$g_{\alpha\beta}\,\dot u_\alpha \dot u_\beta$;

or it may be considered as a vector of the containing space. Its (cartesian) components will be given by $\dot x_i$ (i = 1, ..., N), where the $x_i$ are functions of t which are obtained from 25.2 by substituting for the u's the expressions characterizing our curve; we have thus

$\dot x_i = \frac{\partial x_i}{\partial u_\alpha}\,\dot u_\alpha$

and for the square of the vector $\sum_{i=1}^{N} \dot x_i^2$.

Equating this to the expression we obtained above we find

25.3  $g_{jk} = \sum_{i=1}^{N} \frac{\partial x_i}{\partial u_j}\,\frac{\partial x_i}{\partial u_k}.$
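Formula 25.3 can be tried out numerically. The following is a minimal sketch, assuming as an illustration (not taken from these notes) the unit sphere in three-dimensional space with parameters $u_1 = \theta$, $u_2 = \phi$; the names `embedding`, `jacobian` and `induced_metric` are hypothetical, and the partial derivatives are only estimated by central differences.

```python
import math

def embedding(u):
    """Assumed example: unit sphere, x_i(u_1, u_2) with u_1 = theta, u_2 = phi."""
    theta, phi = u
    return [math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta)]

def jacobian(x, u, h=1e-6):
    """Central-difference estimate of dx_i/du_a, returned as J[i][a]."""
    n = len(u)
    cols = []
    for a in range(n):
        up, um = list(u), list(u)
        up[a] += h
        um[a] -= h
        xp, xm = x(up), x(um)
        cols.append([(p - m) / (2 * h) for p, m in zip(xp, xm)])
    return [[cols[a][i] for a in range(n)] for i in range(len(cols[0]))]

def induced_metric(x, u):
    """g_jk = sum over i of (dx_i/du_j)(dx_i/du_k), i.e. formula 25.3."""
    J = jacobian(x, u)
    n = len(u)
    return [[sum(J[i][a] * J[i][b] for i in range(len(J))) for b in range(n)]
            for a in range(n)]

g = induced_metric(embedding, [0.7, 1.3])
for row in g:
    print(row)
```

For this assumed surface the printed values should be close to $g_{11} = 1$, $g_{12} = g_{21} = 0$, $g_{22} = \sin^2\theta$, the familiar line element of the sphere.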

This formula ought to be compared to 21.16, of which it will be seen to be a generalization if account is taken of the values 21.11 of the b's. The method of giving the curved space by means of the formulas 25.11 may be considered as a special case of the one used above; this will be clear if we write 25.11 in the form

$x_1 = u_1,\quad x_2 = u_2,\quad \ldots,\quad x_n = u_n,$

It is seen that the parts of the u's are played by the first n of the x's. Differentiation of the formulas just written with respect to these variables gives

substitution of these expressions into 25.2 gives

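25.4  $g_{jk} = \delta_{jk} + \sum_{i=n+1}^{N} \frac{\partial x_i}{\partial x_j}\,\frac{\partial x_i}{\partial x_k} = \delta_{jk} + (\text{terms of the second degree in the } x\text{'s}) + \text{t.h.d.}$

These formulas give the values for the g's in pseudocartesian coordinates for a neighborhood of the point of contact.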

For the point of contact itself, i.e., for the origin of our system of coordinates, we have

25.41  $(g_{ij})_0 = \delta_{ij}$

and from the formulas 22.61 we conclude easily that the g's with the upper indices are also the δ's:

25.42  $(g^{ij})_0 = \delta_{ij}.$

As a consequence of this the distinction between covariant and contravariant components vanishes for quasi-cartesian coordinates at the point of contact.

We come now to the operation of differentiation. In the case of flat space we were simply trying to find a system of notations for some operations that were defined independently. Here the situation is different; we have not defined differentiation; we cannot define it in what would seem to be the natural way, as the rate of change of a vector, for instance, because this would necessitate the consideration of the difference between two vectors at two different points, and this conception is not defined for curved space.

Before we come to this definition let us formulate the situation in flat space as follows: a tensor field dF has been obtained by differentiation from a tensor field F if in every point the components of dF in a cartesian system are the derivatives of the components of F in that system.

In curved space there is no universal cartesian system but there is a quasi-cartesian system for every point; it is natural, therefore, to define differentiation in curved space by substituting in the above statement "quasi-cartesian system" for "cartesian system"; if we do that we arrive at the following definition:

Definition of Differentiation. We shall say that a tensor field dF has been obtained by differentiation from the tensor field F if at every point the components of dF, in a system of coordinates that is quasi-cartesian at that point, are the derivatives of the components of F in that system of coordinates.

Although this definition may sound complicated it is the simplest imaginable adaptation of the idea of differentiation to curved spaces. The complication arises from the fact that there is no cartesian system in curved space, but when we apply this definition to flat space we see that it brings us back to differentiation as we knew it in flat space.

We shall not have actually to pass from general coordinates to quasi-cartesian coordinates and then, after differentiation, translate the result back into the language of general coordinates in every special case. We can derive the formulas in general coordinates once for all, just as we did it in the case of flat space in Sections 22 and 23, and we shall obtain exactly the same formulas. The only difference may be in the derivation of the Γ's from the g's (end of Section 22), which was based there on the fact that the derivatives of the g's in cartesian coordinates vanish (22.84). Is this true also in curved space? Or, more precisely, do the derivatives of the g's in quasi-cartesian coordinates vanish at the point of contact?

In these coordinates the g's are given by 25.4; differentiating these expressions we obtain

25.5

t.h.d.

and for the point of contact, where the x's vanish, this is zero so that

25.6  $\left(\frac{\partial g_{ij}}{\partial x_k}\right)_0 = 0.$

We see thus that formally everything is just the same as in flat space, so that we can take over into curved space the whole apparatus of formulas worked out in Sections 21, 22, 23.

Incidentally we may mention that, as follows easily from 25.6, the quantities Γ also vanish in quasi-cartesian coordinates at the point of contact. For future reference we put down the formula

25.7  $(\Gamma^i_{jk})_0 = 0.$

In general, the point of contact in quasi-cartesian coordinates is a place where we have the closest possible approach to the situation which obtains in flat space when we use cartesian coordinates.

Another formula that can be easily obtained from 25.5 and that we need later is obtained by differentiating 25.5 once more and setting $x_i = 0$. We get thus

25.8

Given a curved space by the formulas 25.2 we know how to find the g's. The question now arises: suppose we are given $\tfrac12 n(n + 1)$ functions of the coordinates; is it possible to find a space for which these functions serve as the g's? The question is that of solving the system of partial differential equations (25.3), and without going into details we shall state that such a system of equations in general can be solved if the number of unknown functions is equal to that of equations; since we have here $\tfrac12 n(n + 1)$ equations we must have that many unknown functions; that means that the number of dimensions N of the containing space must be in general $\tfrac12 n(n + 1)$; in special cases it may, of course, be less than that. We may say then: a two-dimensional curved space given by its g's may always be considered as immersed in a 3-dimensional space; a three-dimensional curved space may always be considered as part of a six-dimensional flat space; and a four-dimensional one as part of a ten-dimensional flat space.
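The count used in this argument is simply the number of independent components of a symmetric array $g_{ij}$ with n rows. A small sketch follows; the function name `containing_dimension` is hypothetical, and the code only reproduces the counting, not the existence statement for the differential equations.

```python
def containing_dimension(n):
    """Number of independent g_ij of a symmetric n-by-n array: n(n + 1)/2."""
    return n * (n + 1) // 2

for n in (2, 3, 4):
    print(n, containing_dimension(n))
```

For n = 2, 3, 4 this gives the dimensions 3, 6 and 10 quoted in the text.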

Another question is whether for given real g's the containing space will come out real; and this is by no means always the case. We know that for $g_{11} = g_{22} = g_{33} = -g_{44} = 1$, all others zero, the minimum cartesian containing space is four-dimensional with one imaginary coordinate, and it is clear that no real cartesian space can contain it.
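The role of the imaginary coordinate can be checked with a few lines of arithmetic. This is only a sketch, assuming that the fourth cartesian coordinate is taken as i times the time difference; the function names are hypothetical.

```python
def square_with_imaginary_coordinate(dx, dy, dz, dt):
    """Sum of four squares, the fourth cartesian difference being i*dt."""
    dx4 = 1j * dt
    total = dx * dx + dy * dy + dz * dz + dx4 * dx4
    return total.real

def square_with_real_g(dx, dy, dz, dt):
    """Same quantity written with g_11 = g_22 = g_33 = -g_44 = 1."""
    g = (1.0, 1.0, 1.0, -1.0)
    d = (dx, dy, dz, dt)
    return sum(gi * di * di for gi, di in zip(g, d))

print(square_with_imaginary_coordinate(1.0, 2.0, 3.0, 4.0))
print(square_with_real_g(1.0, 2.0, 3.0, 4.0))
```

Both functions return the same number, which may be negative or zero for a non-zero displacement; a sum of squares of real cartesian differences could never behave in this way.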

Henceforth we may consider the curved space as given by its g's, and the g's may be considered as arbitrarily given functions of the u's.

It may seem that we lost from view the original purpose of introducing curved space, which was that of obtaining a tensor which we could identify with $T_{ij}$. We introduced the Riemann tensor having this in mind, but now we seem to be immersed in an entirely formal theory and far removed from the Riemann tensor; as a matter of fact, it is just around the corner; differentiation, although performed according to formulas that are formally the same as in flat space, has, as we shall see, a new content; in trying to discover the difference we will be led to the Riemann tensor from a new point of view.

26. New Derivation of the Riemann Tensor.

We said that the meaning of differentiation in curved space is different from that in flat space. To show this difference in one of its most important manifestations we start out with a tensor of the first rank given in its contravariant components $F^i$; we differentiate it twice to obtain a tensor of third rank $F^i{}_{,jk}$; in flat space this tensor would not differ from $F^i{}_{,kj}$ because in cartesian components differentiation of a tensor reduces to ordinary differentiation of its components, so that the cartesian components of the two tensors mentioned are $\partial^2 F_i/\partial x_j\,\partial x_k$ and $\partial^2 F_i/\partial x_k\,\partial x_j$ respectively, and these are equal because the result of ordinary differentiation does not depend on the order; two tensors having equal components in one system of coordinates would be equal in all systems of coordinates and so we have

26.1  $F^i{}_{,jk} - F^i{}_{,kj} = 0$ in flat space.

This reasoning does not apply to curved space; in fact, to find $F^i{}_{,jk}$ at a point P we have to differentiate $F^i{}_{,j}$; and in order to do that we have to know $F^i{}_{,j}$ at different points of the neighborhood of P; the finding of $F^i{}_{,j}$ at these points involves the use of quasi-cartesian systems for each of these points; we do not have then one quasi-cartesian system in which we can perform all our operations, and the reasoning that led us to 26.1 breaks down. In spite of this the result might still hold. In order to show that it does not, let us calculate the left hand side of 26.1 using the formulas which we deduced for flat space in Sections 22 and 23 and which, as we proved in Section 25, apply to curved space also.

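We start with the contravariant components $F^i$; we calculate the first differential according to formula 23.6 to obtain

26.2  $F^i{}_{,j} = \frac{\partial F^i}{\partial u^j} + \Gamma^i_{\alpha j}F^\alpha;$

next, we differentiate this again, and get, according to formula 23.7,

26.21  $F^i{}_{,jk} = \frac{\partial F^i{}_{,j}}{\partial u^k} + \Gamma^i_{\alpha k}F^\alpha{}_{,j} - \Gamma^\alpha_{jk}F^i{}_{,\alpha};$

now we form the difference we want to investigate, viz.,

$F^i{}_{,jk} - F^i{}_{,kj} = \left(\frac{\partial F^i{}_{,j}}{\partial u^k} - \frac{\partial F^i{}_{,k}}{\partial u^j}\right) + \left(\Gamma^i_{\alpha k}F^\alpha{}_{,j} - \Gamma^i_{\alpha j}F^\alpha{}_{,k}\right) - \left(\Gamma^\alpha_{jk} - \Gamma^\alpha_{kj}\right)F^i{}_{,\alpha};$

the last bracket vanishes according to 22.71, and what remains becomes, after the substitution of the above expression 26.2 for the first differential and rearrangement of terms,

$\left(\frac{\partial^2 F^i}{\partial u^j\,\partial u^k} - \frac{\partial^2 F^i}{\partial u^k\,\partial u^j}\right) + \left(\Gamma^i_{\alpha j}\frac{\partial F^\alpha}{\partial u^k} - \Gamma^i_{\beta j}\frac{\partial F^\beta}{\partial u^k}\right) + \left(\Gamma^i_{\alpha k}\frac{\partial F^\alpha}{\partial u^j} - \Gamma^i_{\beta k}\frac{\partial F^\beta}{\partial u^j}\right) + \left(\frac{\partial \Gamma^i_{\alpha j}}{\partial u^k} - \frac{\partial \Gamma^i_{\alpha k}}{\partial u^j}\right)F^\alpha + \left(\Gamma^i_{\beta k}\Gamma^\beta_{\alpha j} - \Gamma^i_{\beta j}\Gamma^\beta_{\alpha k}\right)F^\alpha.$

Here cancellation takes place in the first three pairs of terms: in the first as a result of independence of ordinary differentiation on order, in the next two pairs as a result of the fact that the name of the index of summation is immaterial; we come out with

26.3  $F^i{}_{,jk} - F^i{}_{,kj} = \left(\frac{\partial \Gamma^i_{\alpha j}}{\partial u^k} - \frac{\partial \Gamma^i_{\alpha k}}{\partial u^j} + \Gamma^i_{\beta k}\Gamma^\beta_{\alpha j} - \Gamma^i_{\beta j}\Gamma^\beta_{\alpha k}\right)F^\alpha$

or

26.4  $F^i{}_{,jk} - F^i{}_{,kj} = B^i_{\alpha jk}F^\alpha,$

where

26.5  $B^i_{\alpha jk} = \frac{\partial \Gamma^i_{\alpha j}}{\partial u^k} - \frac{\partial \Gamma^i_{\alpha k}}{\partial u^j} + \Gamma^i_{\beta k}\Gamma^\beta_{\alpha j} - \Gamma^i_{\beta j}\Gamma^\beta_{\alpha k}.$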

Before we discuss the question whether the expression vanishes we want to show that the B's are the components of a tensor. In fact, multiplying both sides of 26.4 by $X_iY^jZ^k$, where $X_i$, $Y^j$, $Z^k$ are components of arbitrary vectors, and contracting we have

$(F^i{}_{,jk} - F^i{}_{,kj})\,X_iY^jZ^k = B^i_{\alpha jk}F^\alpha X_iY^jZ^k.$

The left hand side is a scalar that has been obtained by legitimate operations and is, therefore, independent of the coordinate system used; so is therefore the right hand side, and this proves that the B's are the components of a tensor. (In order to see that this is an essential point and that not every symbol with indices may be considered as a tensor, the reader might consider the expression $\Gamma^i_{jk}X_iY^jZ^k$; this expression is, obviously, not independent of the choice of coordinates since in cartesian coordinates the Γ's vanish, and in other systems they do not; the Γ's furnish thus an example of symbols with indices that cannot be interpreted as components of a tensor.)

Now we can settle our question as to the vanishing of 26.3 by showing that the B's are mixed components of the Riemann tensor which has been introduced in Section 20. Since we have proved that they are components of a tensor we may use any system of coordinates, and we decide to use a quasi-cartesian system. In such a system the Γ's vanish at the point of contact (25.7) so that we are left with the terms

substituting for the Γ's with one upper index their expressions in terms of the g's with upper indices and the Γ's with all indices down (22.92) we get

the first two terms vanish again because the Γ's vanish at the point of contact; taking into account (25.41) that the g's are for the point of contact equal to the δ's, and using the expressions 22.9 we find after a few cancellations

(the index i appears here as a subscript because the distinction between contravariant and covariant quantities vanishes for quasi-cartesian coordinates at the point of contact). Using here for the second derivatives of the g's the expressions 25.8 we find

Comparing this to the expression for the Riemann tensor deduced at the end of Section 20 we convince ourselves of the identity of the two expressions.

This shows that, if the Riemann tensor does not vanish, the second differential of a vector field actually may depend on the order of differentiation. This fact is very interesting in itself, it confirms our statement that in curved space differentiation has a new meaning, and it has many important implications, on which, however, we cannot dwell here. For us it is important that we have obtained an expression of the Riemann tensor in terms of the g's alone; this means that those properties of the curvature of space which are expressed in the Riemann tensor are determined by the metric of the space, i.e., if distances along different curves are given, the curvature (as far as it is expressed in the Riemann tensor) is determined. According to our conception, the inhabitants of the space certainly can measure lengths; it follows that curvature, as expressed by the Riemann tensor, is accessible to the inhabitants; it is an internal property of the space. In particular, for N = 3, n = 2, i.e., for the ordinary surface, we obtain the fact that the total curvature can be calculated from the expression for the line element; this is Gauss's Theorema Egregium.
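That the curvature is determined by the g's alone can be illustrated numerically for a surface. The sketch below is only an illustration under stated assumptions: the sphere of radius a = 2 and the plane in polar coordinates are examples chosen here, the function names are hypothetical, derivatives are estimated by central differences, and the index and sign convention used for the curvature components is one common choice, which may differ from the convention of Section 20.

```python
import math

def sphere_metric(u, a=2.0):
    """Assumed example: g_ij of a sphere of radius a, u = (theta, phi)."""
    return [[a * a, 0.0], [0.0, (a * math.sin(u[0])) ** 2]]

def inverse2(g):
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]

def d_metric(metric, u, k, h=1e-4):
    """Central difference of g_ij with respect to the k-th coordinate."""
    up, um = list(u), list(u)
    up[k] += h
    um[k] -= h
    gp, gm = metric(up), metric(um)
    return [[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(2)] for i in range(2)]

def christoffel(metric, u):
    """G[i][j][k] = Gamma^i_{jk}, computed from the g's and their derivatives."""
    ginv = inverse2(metric(u))
    dg = [d_metric(metric, u, k) for k in range(2)]  # dg[k][i][j] = d g_ij / d u_k
    return [[[0.5 * sum(ginv[i][l] * (dg[j][l][k] + dg[k][l][j] - dg[l][j][k])
                        for l in range(2))
              for k in range(2)] for j in range(2)] for i in range(2)]

def d_christoffel(metric, u, m, h=1e-4):
    """Central difference of Gamma^i_{jk} with respect to the m-th coordinate."""
    up, um = list(u), list(u)
    up[m] += h
    um[m] -= h
    Gp, Gm = christoffel(metric, up), christoffel(metric, um)
    return [[[(Gp[i][j][k] - Gm[i][j][k]) / (2 * h) for k in range(2)]
             for j in range(2)] for i in range(2)]

def gaussian_curvature(metric, u):
    """Total curvature K = R_0101 / det(g), using only the g's."""
    g = metric(u)
    G = christoffel(metric, u)
    dG = [d_christoffel(metric, u, m) for m in range(2)]  # dG[m][i][j][k]

    def riemann(i, j, k, l):
        # R^i_{jkl} = d_k Gamma^i_{lj} - d_l Gamma^i_{kj}
        #             + Gamma^i_{ka} Gamma^a_{lj} - Gamma^i_{la} Gamma^a_{kj}
        return (dG[k][i][l][j] - dG[l][i][k][j]
                + sum(G[i][k][a] * G[a][l][j] - G[i][l][a] * G[a][k][j]
                      for a in range(2)))

    r_0101 = sum(g[0][i] * riemann(i, 1, 0, 1) for i in range(2))
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return r_0101 / det

print(gaussian_curvature(sphere_metric, [0.9, 0.4]))
```

For the sphere of radius 2 the result should be close to 1/4 at every point, while for the plane written in polar coordinates (line element $dr^2 + r^2 d\phi^2$) the same routine should give a value close to zero, although the g's are not constant.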

27. Differential Relations for the Riemann Tensor.

The method of quasi-cartesian coordinates in proving a relation between tensors that we used in identifying the B's with the components of the Riemann tensor can be applied often and helps to avoid lengthy computations. We shall use it now to prove certain differential relations for the Riemann tensor that are very important for us because we know that the tensor $T_{ij}$, which we want to identify with the contracted Riemann tensor, satisfies a certain differential equation, namely, $\partial T_{i\alpha}/\partial x_\alpha = 0$, and, of course, we expect the tensor in our mathematical theory with which we are going to identify T to satisfy the same relations. In order to deduce differential relations on the contracted Riemann tensor we have to prove first some relations for the non-contracted tensor. These relations were discovered by Ricci and then rediscovered by Bianchi and bear the latter's name; they are

27.1  $R^i{}_{jmn,p} + R^i{}_{jnp,m} + R^i{}_{jpm,n} = 0.$

The proof is very simple if we use quasi-cartesian coordinates. In these coordinates the Γ's at the point of contact vanish, and although the first derivatives of the Γ's do not, the components of the tensor obtained by differentiating the B's (formula 26.5), which we have identified with the R's, will contain the second derivatives only, because the first derivatives will be multiplied by the Γ's themselves, which do vanish. With this remark in mind the proof of the Bianchi relations does not present any difficulty; we simply substitute for each of the three terms in 27.1 the difference of the two second order derivatives and find that the result vanishes identically.

Now, in order to deduce from 27.1 the relations for the contracted tensor we raise in 27.1 the second index so that we have

$R^{ij}{}_{mn,p} + R^{ij}{}_{np,m} + R^{ij}{}_{pm,n} = 0$

and here we contract i with m, and j with n. We obtain

$R^{\alpha\beta}{}_{\alpha\beta,p} + R^{\alpha\beta}{}_{\beta p,\alpha} + R^{\alpha\beta}{}_{p\alpha,\beta} = 0.$

The second term here may be written as $-R^{\alpha\beta}{}_{p\beta,\alpha}$, using the fact (20.71) that the Riemann tensor changes its sign when two indices of the same pair are interchanged; and the third term is equal to the second, as we can see by interchanging α and β (which does not change the value of the expression since α and β are summation indices), and then interchanging the first two indices, i.e., β and α, and the next two, i.e., p and β (each of these interchanges changes the sign, so that nothing is changed in the result). We have thus

$R^{\alpha\beta}{}_{\alpha\beta,p} - 2R^{\alpha\beta}{}_{p\beta,\alpha} = 0.$

But $R^{i\beta}{}_{j\beta}$ are the mixed components of the contracted Riemann tensor, which we denote by $R^i_j$, so that, dividing by 2 and changing the sign, we have

$R^{\alpha}_{p,\alpha} - \tfrac12 R^{\alpha}_{\alpha,p} = 0.$

Finally, $R^\alpha_\alpha$ is the result of contraction of the contracted Riemann tensor; we denote this scalar by R (it is called the twice contracted Riemann tensor); then we can write for $R^\alpha_{\alpha,p}$ simply $R_{,p}$ or $(\delta^\alpha_p R)_{,\alpha}$, and our relation becomes

$\left(R^\alpha_p - \tfrac12\delta^\alpha_p R\right)_{,\alpha} = 0.$

28. Geodesics.

In concluding this fragmentary development of the mathematical theory that we intend to apply to Physics in the next chapter we shall study briefly a class of curves in curved space which play an important part in the study of motion. These curves may be considered as generalizations of straight lines in flat space, and we shall begin by considering these.

In agreement with the point of view of differential geometry (Section 21) we shall characterize a straight line by differential equations. If it is given in parametric form (7.11) we obtain, by differentiating twice with respect to the parameter and indicating differentiation by a dot placed over the letter,

28.1  $\ddot x_i = 0.$

Since the choice of the parameter is in a high degree arbitrary, and for another choice of a parameter the representation may cease to be linear and the equations (28.1) may not hold any more - we cannot say that they characterize a straight line; a complete statement would be: a straight line is a curve for which there exists a parametric representation such that 28.1 holds.

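Next, we translate 28.1 into the language of curvilinear coordinates, still keeping to flat space. We have, as in Section 21, except that we write now, in agreement with Section 23, the index as a superscript,

$\dot x^i = dx^i/dp = b^i_\alpha\,\dot u^\alpha,$

$d\dot x^i/dp = b^i_\alpha\,\ddot u^\alpha + \frac{\partial b^i_\alpha}{\partial u^\beta}\,\dot u^\alpha \dot u^\beta = 0.$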

Multiplying by a^ and summing with respect to i, writing i » we get

or, taking into account 21.14 and 22.82,

28.2  $\ddot u^j + \Gamma^j_{\alpha\beta}\,\dot u^\alpha \dot u^\beta = 0.$

We pass now to curved space; in general, we have here no straight lines but we may consider the same equations and investigate the properties of the curves represented by them. We introduce as our definition:

Geodesics are curves which satisfy, for an appropriate choice of parameter, Equation 28.2.

In studying geodesics it is often more convenient to consider not a single geodesic but a portion of space filled with geodesics, so that through every point there passes one and only one geodesic of the bunch. If we have this situation we have a vector $\dot u^i$ in every point of that portion of space, so that we have a vector field, and the components $\dot u^i$ may be considered as functions of the coordinates. We may then write

$\ddot u^j = \frac{\partial \dot u^j}{\partial u^\alpha}\,\dot u^\alpha$

and equation 28.2 becomes

$\left(\frac{\partial \dot u^j}{\partial u^\alpha} + \Gamma^j_{\beta\alpha}\,\dot u^\beta\right)\dot u^\alpha = 0$

or, according to 22.6,

28.3  $\dot u^j{}_{,\alpha}\,\dot u^\alpha = 0.$

This form is very convenient in some cases. We shall use it to prove two properties of geodesics. In the first place we may discuss the meaning of the parameter that we are using. Consider the square of the tangent vector $\dot u_\alpha \dot u^\alpha$; we can prove that this quantity is constant along a geodesic. In fact, differentiating with respect to the parameter, we have

$\frac{d}{dp}\left(\dot u_\alpha \dot u^\alpha\right) = \left(\dot u_\alpha \dot u^\alpha\right)_{,\beta}\,\dot u^\beta = 2\,\dot u_\alpha\,\dot u^\alpha{}_{,\beta}\,\dot u^\beta$

and this vanishes according to 28.3. Since arc length is given by the formula

$s = \int \sqrt{\dot u_\alpha \dot u^\alpha}\;dp$

we see that, as a result of the fact that $\dot u_\alpha \dot u^\alpha$ is constant, s is proportional to p, or p is proportional to the arc length s. Since multiplication of the parameter by a constant factor will not affect equation 28.2 or 28.3, we may always consider that the parameter is arc length. This discussion does not apply, however, when the quantity $\dot u^\alpha \dot u_\alpha$ is zero, i.e., when $\dot u^i$ is a zero square vector. If we agree to call curves whose tangent vectors have zero square singular curves we may state the following:

Theorem. In case of a non-singular geodesic, the parameter mentioned in the definition of a geodesic and used in 28.2 and 28.3 is proportional or equal to arc length.
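The constancy of the square of the tangent vector along a geodesic can be observed numerically. The following sketch assumes, purely as an illustration, the unit sphere with coordinates θ, φ, for which the only non-vanishing Γ's are $\Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta$ and $\Gamma^\phi_{\theta\phi} = \cot\theta$; equation 28.2 is integrated with a classical Runge-Kutta step, and all function names and starting values are hypothetical.

```python
import math

def geodesic_rhs(state):
    """Equation 28.2 on the unit sphere; state = (theta, phi, dtheta, dphi)."""
    theta, phi, dth, dph = state
    ddth = math.sin(theta) * math.cos(theta) * dph * dph        # -Gamma^theta_{phi phi} dphi^2
    ddph = -2.0 * math.cos(theta) / math.sin(theta) * dth * dph  # -2 Gamma^phi_{theta phi} dtheta dphi
    return (dth, dph, ddth, ddph)

def rk4_step(state, h):
    """One classical fourth-order Runge-Kutta step in the parameter p."""
    def shifted(s, k, c):
        return tuple(si + c * ki for si, ki in zip(s, k))
    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(state, k1, h / 2))
    k3 = geodesic_rhs(shifted(state, k2, h / 2))
    k4 = geodesic_rhs(shifted(state, k3, h))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def tangent_square(state):
    """g_ab du^a du^b for the unit sphere: dtheta^2 + sin^2(theta) dphi^2."""
    theta, _, dth, dph = state
    return dth * dth + (math.sin(theta) * dph) ** 2

def integrate(state, h=1e-3, steps=2000):
    values = [tangent_square(state)]
    for _ in range(steps):
        state = rk4_step(state, h)
        values.append(tangent_square(state))
    return state, values

final_state, squares = integrate((1.0, 0.0, 0.3, 0.8))
drift = max(squares) - min(squares)
print(drift)
```

The printed drift of the square of the tangent vector should be at the level of the integration error only, in agreement with the statement that the parameter is proportional to arc length.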

Next we may give an interpretation to equation 28.2 which sheds some light on the geometrical nature of geodesics. We may assume now that in all geodesics of the bunch arc length has been chosen as the parameter; then the vectors $\dot u^j$ are unit vectors and they characterize in each point the direction of the geodesic. The derivative $\dot u^j{}_{,i}$ characterizes the change of direction as we move in the direction given by the coordinate $u^i$, and $\dot u^j{}_{,\alpha}\dot u^\alpha$ gives the change of direction in the direction of the vector $\dot u^i$, i.e., in the direction of the geodesic itself. Since, according to 28.2, this quantity is zero we have proved that the direction of a geodesic does not change as we move along it. (The above discussion applies, strictly speaking, only to non-singular geodesics.)


Chapter V. GENERAL RELATIVITY

In Chapter I we introduced certain fundamental quantities, and we combined them into the symmetric tensor of rank two, $T_{ij}$. We found that this tensor satisfies the differential equation

$\partial T_{i\alpha}/\partial x_\alpha = 0,$

first for i = 1, 2, 3, and then, in Chapter III, we showed that, as a result of the new identification introduced there, the fourth equation is also satisfied. We thought it desirable to build a mathematical theory in which a tensor of the same formal properties will appear in a natural way, and in the preceding Chapter IV we succeeded in actually setting up such a theory - the theory of curved space-time.

The structure of such a space, we found, is expressed in a tensor of rank four - the Riemann tensor, but we obtained from it by contraction a tensor of rank two - the contracted Riemann tensor. In investigating the differential properties of the Riemann tensor we found in Section 27 a relation of the type desired; it is satisfied by a tensor which differs slightly from the contracted Riemann tensor, namely, the tensor $R^i_j - \tfrac12\delta^i_j R$, which we may call the corrected contracted Riemann tensor, and this is the tensor which we are going to identify with the physical tensor T, so that our fundamental assumption will be

$T^i_j = R^i_j - \tfrac12\delta^i_j R.$

Thus we decide to interpret T, and therefore our fundamental quantities of matter and electricity ρ, u, v, w, X, Y, Z, L, M, N, which went into it, in terms of the structure of curved space as it is reflected in the corrected contracted Riemann tensor. But in doing this we find ourselves before a radically new situation. As we wanted, the tensor is now an expression of the properties of space, i.e., the space is now different from the one we had before - geometry and physics are now an organic whole, and it is not clear what changes this brings with it; together with the desirable feature, namely the fact that T grew out of space, so to say, we may have brought in some undesirable and hard to manage features. But then there would be no gain if we could merely say that T is a geometrical thing now; we expected to gain something essential in undertaking the merging together of our geometry and physics; and now we stand before an accomplished fact and we have to see what it brought with it. We conjured up something and we do not seem to be able to stop; we have to go ahead and hope that the changes will be beneficial.

It might seem strange that we find a physical interpretation only for the contracted Riemann tensor, only for ten combinations of its twenty components. But this is quite in order. Should all the components of the Riemann tensor be used up in interpreting matter and electricity, that would mean that where there is no matter (and electricity) space-time is flat (as far as internal properties are concerned); that would mean that matter acts only where it is; but we know that matter makes itself felt, for instance, by the gravitational field that it produces, also outside the region which it occupies, and this is in accord with our identification, as a result of which only part of the components of the Riemann tensor vanish where there is no matter, so that the remaining components may be interpreted as corresponding to gravitational forces.

29. The Law of Geodesics.

In questions of celestial mechanics which we are going to treat now the effects of the electromagnetic field are usually negligible and we shall begin by equating to zero our electromagnetic tensor. Equation 24.9 becomes then

29.1  $T^i_j = \rho\,u^i u_j.$

According to our fundamental assumption, this tensor has been identified with the corrected contracted Riemann tensor, and it must satisfy the equation

29.2  $T^\alpha_{i,\alpha} = 0$

which formally is the same as our old equation of motion 24.8 but differs from it in that it has to be interpreted in curved space. The last two equations impose certain conditions on the velocity components $u^i$ and we want to find these conditions or, in other words, we want to eliminate density from the equations 29.1, 29.2. (We have been using in Chapter IV the letter u for the general coordinates - in this chapter we go back to our notation of the first three chapters and denote by $u^i$ again the four-dimensional velocity vector, and we shall denote the general coordinates by $x^i$.)

First of all we shall prove the following theorem due to ilineur.

Theorem. If the field $u^i$ satisfies the equations 29.1, 29.2 the vectors $u^i$ may be considered as tangent unit vectors to a family of geodesics filling the space.

Proof. We consider first the case when $u^i$ is a unit vector (and not a zero square vector), i.e., $u_\beta u^\beta = -1$. Differentiating this relation we have

29.3  $u_{\beta,i}\,u^\beta = 0.$

Substituting 29.1 into 29.2 we get

29.4  $\frac{\partial \rho}{\partial x^\alpha}\,u^\alpha u_j + \rho\,u^\alpha{}_{,\alpha}\,u_j + \rho\,u^\alpha u_{j,\alpha} = 0.$

Dividing by ρ and introducing the notation

29.5  $A = \frac{\partial \log\rho}{\partial x^\alpha}\,u^\alpha + u^\alpha{}_{,\alpha}$

we may write 29.4 as

$A\,u_j + u^\alpha u_{j,\alpha} = 0.$

Multiplying this by $u^j$ and summing with respect to j, for which we write β, we have

$A\,u_\beta u^\beta + u^\alpha u_{\beta,\alpha}\,u^\beta = 0$

which, according to 29.3, gives A = 0. Substituting this into $A\,u_j + u^\alpha u_{j,\alpha} = 0$ we obtain

29.6  $u^\alpha u_{j,\alpha} = 0,$

which, according to 28.3, proves the theorem in the case considered.

But we also have to consider the case of propagation of light. In this case we do not have to consider any density ρ, the momentum vector being given by $q_i$ with $q_\beta q^\beta = 0$. The preceding proof breaks down in this case, but continuity considerations lead us to the result that in this case also we can find a scalar field ρ such that $q^i/\rho$ will be tangent vectors to geodesics.

We conclude that in a gravitational field matter and light particles follow geodesics.

In the present chapter we are going to apply this result to the investigation of the motion of a planet and the propagation of light in the Solar system. We shall see that the changed significance of differentiation takes care, in a way, of what is usually accounted for by gravitational forces.

30. Solar System. Symmetry Conditions.

Our equations 29.1 and 29.2 describe relations existing between matter and field. We proved that the motion of matter is characterized by the geodesics of the curved space, but the curvature is in turn determined by matter. Theoretically, we may have a complete description of the situation, but in practice we do not know how to handle it, we do not know where to begin. In such cases we often resort to the method of successive approximations. Let us try to apply this method here. In investigating the motion of a planet around the sun we neglect in the first place the motion of the sun. Then, in the first approximation we neglect the mass of the planet, i.e., assume that there is no matter outside the sun. Since we have already neglected electromagnetism we have then that outside the sun the tensor T is zero so that, according to the fundamental assumption,

$R^i_j - \tfrac12\delta^i_j R = 0.$

Contracting we get $R - \tfrac12\cdot 4R = 0$, so that R = 0 and we have simply

30.1  $R^i_j = 0.$

These equations are known as Einstein's equations. We see that the statement that the corrected contracted Riemann tensor vanishes is equivalent to the statement that the contracted Riemann tensor vanishes.

As a result of our first approximation we derived thus the field equations 30.1. In the next approximation we introduce the planet and assume that its action on the field is negligible but that the field acts on it, i.e., that the motion of the planet is given by the geodesics of the field which has been determined in the preceding step; the motions will then be given by the equations 28.2

30.2  $\frac{d^2x^i}{ds^2} + \Gamma^i_{\alpha\beta}\,\frac{dx^\alpha}{ds}\,\frac{dx^\beta}{ds} = 0$

in which the Γ's are calculated from the g's which have been found to satisfy 30.1.

Our problem, therefore, falls in two: first, to find a field satisfying the equations 30.1, and second, to find the geodesics of this field.

In this form the problem is comparable to the problem in Newtonian mechanics as explained in Section 1. There the field was given by the potential which had to satisfy the Laplace equation; here the field is given by the g's which have to satisfy the equations 30.1.

There the motion, after the field had been determined, was described by second order ordinary equations, differentiation being taken with respect to time; here motion is also described by second order differential equations, differentiation being with respect to s.

It is possible by making some special assumptions, neglecting certain quantities, for instance the derivatives of all the g's except $g_{44}$, and dropping some terms, to obtain the general Newtonian equations as a special or limiting case of our equations. The equations 30.1 would thus reduce to the Laplace equation 1.54 for $g_{44}$ and the equations of a geodesic to the equations of motion 1.1 in which X, Y, Z are given by 1.53, so that we could consider the general Newtonian theory of motion in a gravitational field as a first approximation to the theory of Relativity, but it is quite difficult in the general case to estimate what we neglect and the error we commit, and we prefer to compare the two theories on some concrete special cases. All these cases will refer to what corresponds to a gravitational field produced by a single attracting center. We found in Section 1 such a field by using the general equations and, in addition, the condition of symmetry. We intend to follow an analogous course here. Our general equations are 30.1 and now we want to find what will correspond to the conditions of symmetry. The situation is much more complicated here. There the field could be characterized by a scalar φ and the condition of symmetry with respect to a point was simply expressed by stating that φ is a function of distance from that point; here the field is characterized by a tensor $g_{ij}$. There, in the second place, we worked in ordinary space; here we have space-time which has an additional coordinate, t. Last, there the space was given and in it distances were well defined; on this space was superimposed a field whose symmetry we had to discuss; here the field is not superimposed on a space with a given metric, but the metric itself constitutes a field which has to be determined by the symmetry condition.

We shall take up these three difficulties one by one.

In the first place let us consider a tensor field in ordinary space, and let us impose on it the condition of symmetry with respect to a point. A tensor we may consider (Section 9) as the left hand side of an equation of a central quadric surface (we are interested in a symmetric tensor here, since the g's are symmetric in the indices i and j - this symmetry we must try not to confuse with the symmetry with respect to a point which we impose on the field - and a symmetric tensor is sufficiently characterized by a quadratic form), which we may consider as an ellipsoid. Our tensor field will then be represented by an ellipsoid at every point of space. The field must allow rotations around a fixed center O, i.e., such a rotation must bring the field into itself; in other words, if a rotation brings a point P into a point Q it must bring the ellipsoid at P into the ellipsoid at Q. In particular, a rotation which leaves P unchanged must not change the ellipsoid at P. It is clear that every ellipsoid must be an ellipsoid of revolution and that its axis must be directed along the radius vector from O to P.

The ellipsoid at the point x, 0, 0 will be seen to be

$A(\xi - x)^2 + B(\eta^2 + \zeta^2) = 1$

and for a general point P, if we use polar coordinates for P and (considering, if it helps, the ellipsoid as infinitesimal) their differentials for the points of the quadric relative to P,

30.3  $A\,dr^2 + B\,(d\theta^2 + \sin^2\theta\,d\phi^2) = 1.$

Since ellipsoids at points equidistant from O must have the same dimensions, A and B must be functions of r alone.

The left hand side of this equation gives a tensor field which satisfies the condition of symmetry with respect to a point.

Next, we consider the complication resulting from the introduction of time. In Section 1 time was not mentioned; it means that the field was considered as independent of time, or static; we may say that the field must not be affected by a change in t or, from the four-dimensional point of view, by a translation along the t-axis. This is a requirement of the same character as that of symmetry with respect to a point; from the four-dimensional point of view we may combine the two requirements and say that the field must be symmetric with respect to a line - the t-axis. But the field now is a field in four-space; it will be represented by a quadratic form in $dr$, $d\theta$, $d\varphi$ and $dt$. For $dt = 0$ it must reduce to the field given before; the coefficients must be independent of t corresponding to the requirement that the field be static; and a change from t to -t must also not affect the field (reversibility of time) so that terms of the quadratic form involving dt to the first power must be absent. It follows that the addition of the fourth dimension results in the addition of only one term to our tensor which now may be written as

30.4   $A\,dr^2 + B(d\theta^2 + \sin^2\theta\,d\varphi^2) + C\,dt^2$

where C, as well as A and B, is a function of r alone.

And now we have to overcome the last difficulty, that connected with the fact that our space is curved and that we cannot define symmetry in terms of rotations because rotation means a transformation in which distances are preserved, and distances are defined only by the field of the g's which we want to determine by the requirement that it be not affected by rotations. To overcome this difficulty we have to agree on some other definition of symmetry, and it seems natural to adopt as such the following: in order to define a symmetry for a curved space we shall compare it with a flat space by establishing a one-to-one correspondence between the points of the two spaces. Corresponding to every transformation of the flat space we will have then a transformation of the curved space; and we shall say that the curved space possesses the same symmetry as a field F in the flat space if the metric of the curved space - as given by the g's - is not affected by those transformations of the curved space which correspond to the transformations in flat space not affecting the field F.

Suppose now that we have such a curved space. This implies that we have a one-to-one correspondence with the flat space, and we may use for the points of the curved space the same coordinates that we use for the corresponding points of the flat space. It is clear that 30.4 will satisfy the requirements, so that we can take it for our fundamental tensor, or as we shall say (compare 21.8) for our $ds^2$.

But the quantities $r$, $\theta$, $\varphi$, $t$, which have definite geometrical significance in flat space lose it in curved space - they are just numbers which we use to characterize different points as we use numbers to characterize houses on a street. There is no reason why we should not replace them by other numbers, i.e., transform our coordinates, if it would simplify our formulas. Now, it is clear that transformations involving $\theta$, $\varphi$, $t$ will make our expression 30.4 more complicated because it would introduce these coordinates into the coefficients. But we could choose a transformation on r alone so as to simplify that expression; we could, for instance, reduce any one coefficient to a prescribed function of the new r. We make this choice in such a way as to reduce B to $r^2$ because, in a way, it restores to r a geometrical meaning as we shall see presently. If we write $\xi(r)$ and $-\eta(r)$ for the functions of the new r which now appear instead of A and C, and interpret 30.4 as giving $-ds^2$, in accordance with the standardization of the parameter adopted in Section 12, our final formula will be

30.5   $-ds^2 = \xi(r)\,dr^2 + r^2(d\theta^2 + \sin^2\theta\,d\varphi^2) - \eta(r)\,dt^2.$

Letting here r and t have constant values we have a surface, and a simple calculation would show that $1/r^2$ is the total curvature of this surface, which gives a geometrical meaning to r.

Our task is now accomplished: we have imposed on our space the conditions of symmetry and we have next to impose on it the general equations 30.1.

31. Solution of the Field Equations.

We are now at a stage which corresponds to the assumption that the potential $\varphi$ is a function of r alone in Section 1, and our next task corresponds to the substitution of $\varphi(r)$ into Laplace's equation. Instead of one unknown function $\varphi(r)$ we have here the ten g's determined by 30.5 which we may write out as

31.1   $g_{11} = \xi(r)$,  $g_{22} = r^2$,  $g_{33} = r^2\sin^2\theta$,  $g_{44} = -\eta(r)$,  all others zero,

and which involve two unknown functions. In order to determine these functions we have to substitute 31.1 into 30.1. In the first place we have to calculate the g's with the upper indices from the formulas (22.6)

$g_{i\alpha}\,g^{\alpha j} = \delta_i^j.$

Since the g's with two distinct lower indices vanish, only those terms on the left are not zero in which $\alpha = i$ and we have

$g_{ii}\,g^{ij} = \delta_i^j$  (no summation over i);

for $i \neq j$ the right hand sides are zero, and since the first factors on the left are not zero the second must vanish; we see thus that the g's with two distinct upper indices also vanish. For $j = i$ we have unity on the right and thus

31.2   $g^{11} = 1/\xi(r)$,  $g^{22} = 1/r^2$,  $g^{33} = 1/(r^2\sin^2\theta)$,  $g^{44} = -1/\eta(r)$,  all others zero.

In what follows

$x_1 = r$,  $x_2 = \theta$,  $x_3 = \varphi$,  $x_4 = t$.

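Differentiation with respect to r will be denoted, as in Section 1, by '. We next calculate the $\Gamma$'s with all indices down according to 22.9 and obtain, omitting those that come out zero,

31.3
$\Gamma_{1,11} = \tfrac{1}{2}\xi'$,  $\Gamma_{1,22} = -r$,  $\Gamma_{1,33} = -r\sin^2\theta$,  $\Gamma_{1,44} = \tfrac{1}{2}\eta'$,
$\Gamma_{2,12} = r$,  $\Gamma_{2,33} = -r^2\sin\theta\cos\theta$,
$\Gamma_{3,13} = r\sin^2\theta$,  $\Gamma_{3,23} = r^2\sin\theta\cos\theta$,
$\Gamma_{4,14} = -\tfrac{1}{2}\eta'$.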

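Raising of an index is accomplished in this case simply by multiplying by the g with the index to be raised appearing above twice, because the sum $g^{i\alpha}\Gamma_{\alpha,jk}$ which, according to 23.3, is equal to $\Gamma^i_{jk}$ reduces, as the result of the vanishing of the g's with two distinct upper indices, to one term, namely $g^{ii}\Gamma_{i,jk}$. This permits us to write out easily the $\Gamma$'s with one index above:

31.31
$\Gamma^1_{11} = \dfrac{\xi'}{2\xi}$,  $\Gamma^1_{22} = -\dfrac{r}{\xi}$,  $\Gamma^1_{33} = -\dfrac{r\sin^2\theta}{\xi}$,  $\Gamma^1_{44} = \dfrac{\eta'}{2\xi}$,
$\Gamma^2_{12} = \dfrac{1}{r}$,  $\Gamma^2_{33} = -\sin\theta\cos\theta$,
$\Gamma^3_{13} = \dfrac{1}{r}$,  $\Gamma^3_{23} = \cot\theta$,
$\Gamma^4_{14} = \dfrac{\eta'}{2\eta}$.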
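The reader who wishes to check the list of $\Gamma$'s with one index above without going through the algebra may do so numerically. The following is only a sketch under stated assumptions: it takes the usual definition $\Gamma_{i,jk} = \tfrac{1}{2}(\partial_j g_{ik} + \partial_k g_{ij} - \partial_i g_{jk})$ for the symbols with all indices down, replaces derivatives by central differences, and uses two arbitrary test functions for $\xi(r)$ and $\eta(r)$ that are our own choice and not the solution found later. The function names are hypothetical.

```python
import math

def metric_diag(x, xi, eta):
    """Diagonal entries g_11..g_44 of 31.1 at x = (r, theta, phi, t)."""
    r, theta = x[0], x[1]
    return [xi(r), r * r, (r * math.sin(theta)) ** 2, -eta(r)]

def gammas_upper(x, xi, eta, step=1e-5):
    """Gamma^i_{jk} (0-based indices) from central differences of the metric."""
    g = metric_diag(x, xi, eta)
    dg = []  # dg[k][i] = d g_ii / d x_k
    for k in range(4):
        plus, minus = list(x), list(x)
        plus[k] += step
        minus[k] -= step
        gp, gm = metric_diag(plus, xi, eta), metric_diag(minus, xi, eta)
        dg.append([(gp[i] - gm[i]) / (2 * step) for i in range(4)])
    out = [[[0.0] * 4 for _ in range(4)] for _ in range(4)]
    for i in range(4):
        for j in range(4):
            for k in range(4):
                # Index-down symbol for a diagonal metric (assumed definition):
                # 1/2 (d_j g_ik + d_k g_ij - d_i g_jk)
                lower = 0.5 * ((dg[j][i] if i == k else 0.0)
                               + (dg[k][i] if i == j else 0.0)
                               - (dg[i][j] if j == k else 0.0))
                out[i][j][k] = lower / g[i]  # raise with g^{ii} = 1/g_ii
    return out

# Arbitrary smooth test functions (hypothetical, not the solution 31.8).
xi = lambda r: 1.0 + r * r
dxi = lambda r: 2.0 * r
eta = lambda r: 2.0 + r
deta = lambda r: 1.0

r, theta = 1.3, 0.7
G = gammas_upper([r, theta, 0.2, 0.0], xi, eta)
s, c = math.sin(theta), math.cos(theta)
closed = {  # closed forms for the non-zero symbols, 0-based (i, j, k)
    (0, 0, 0): dxi(r) / (2 * xi(r)),
    (0, 1, 1): -r / xi(r),
    (0, 2, 2): -r * s * s / xi(r),
    (0, 3, 3): deta(r) / (2 * xi(r)),
    (1, 0, 1): 1 / r,
    (1, 2, 2): -s * c,
    (2, 0, 2): 1 / r,
    (2, 1, 2): c / s,
    (3, 0, 3): deta(r) / (2 * eta(r)),
}
max_error = max(abs(G[i][j][k] - v) for (i, j, k), v in closed.items())
print(max_error)
```

The printed number is the largest disagreement between the finite-difference values and the closed forms; it should be of the order of the differencing error only.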

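Next we have to calculate those components of the Riemann tensor which appear in the expressions for the components of the contracted Riemann tensor, i.e., those with the first index equal to the one before last or those of the type $R^i_{\ jik}$. We do not write those out but state that the result of the calculation with their aid of the components of the contracted Riemann tensor is, that all these components with two distinct indices vanish and the others are

$R_{11} = -\dfrac{\eta''}{2\eta} + \dfrac{\eta'^2}{4\eta^2} + \dfrac{\xi'\eta'}{4\xi\eta} + \dfrac{\xi'}{r\xi},$

$R_{22} = 1 - \dfrac{1}{\xi} + \dfrac{r\xi'}{2\xi^2} - \dfrac{r\eta'}{2\xi\eta},$

$R_{33} = \sin^2\theta\,R_{22},$

$R_{44} = \dfrac{\eta''}{2\xi} - \dfrac{\eta'^2}{4\xi\eta} - \dfrac{\xi'\eta'}{4\xi^2} + \dfrac{\eta'}{r\xi}.$

It is more convenient to operate with the mixed components of the contracted Riemann tensor (although it is not necessary, and the reader might for the sake of practice go through the same calculations using covariant components), and these are obtained from the last formulas by multiplication by the corresponding g with upper indices; we obtain thus

31.4
$R^1_1 = -\dfrac{\eta''}{2\xi\eta} + \dfrac{\eta'^2}{4\xi\eta^2} + \dfrac{\xi'\eta'}{4\xi^2\eta} + \dfrac{\xi'}{r\xi^2},$

$R^2_2 = R^3_3 = \dfrac{1}{r^2}\left(1 - \dfrac{1}{\xi}\right) + \dfrac{\xi'}{2r\xi^2} - \dfrac{\eta'}{2r\xi\eta},$

$R^4_4 = -\dfrac{\eta''}{2\xi\eta} + \dfrac{\eta'^2}{4\xi\eta^2} + \dfrac{\xi'\eta'}{4\xi^2\eta} - \dfrac{\eta'}{r\xi\eta}.$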

We come now to the ten equations $R^i_j = 0$ that we have to satisfy; six of them, namely, those in which $i \neq j$, are satisfied identically because our R's as well as the g's vanish for distinct indices. Of the remaining four equations the second and the third are identically the same because of the equality of the corresponding values of R in 31.4. Three equations remain, viz.,

31.5   $R^1_1 = 0$,  $R^2_2 = 0$,  $R^4_4 = 0$.

Subtracting the last one from the first we have

31.6   $\dfrac{\xi'}{\xi} + \dfrac{\eta'}{\eta} = 0$,  whence  $\xi\eta = \text{const.}$

By choosing our unit of time appropriately we can reduce this constant to 1, so that

31.7   $\xi\eta = 1$  or  $\xi = 1/\eta$.

Using 31.6 and 31.7 in the second of the equations 31.5 we obtain

$1 - \eta - r\eta' = 0$

which gives

31.8   $\eta = 1 - \dfrac{\gamma}{r}$

where $\gamma$ denotes a constant of integration. Our field then is given by

31.9   $-ds^2 = \dfrac{dr^2}{\eta} + r^2\,d\theta^2 + r^2\sin^2\theta\,d\varphi^2 - \eta\,dt^2$

where $\eta$ is given by 31.8.
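It may reassure the reader to see that the functions just found do annul the contracted Riemann tensor. The sketch below assumes the usual closed forms of the mixed components $R^1_1$, $R^2_2$, $R^4_4$ for a form of the type 30.5 (with one common choice of overall sign, which does not matter when the components are set equal to zero); the value of $\gamma$ and the function names are hypothetical.

```python
def mixed_ricci(r, xi, dxi, eta, deta, d2eta):
    """R^1_1, R^2_2, R^4_4 for -ds^2 = xi dr^2 + r^2 dOmega^2 - eta dt^2.

    Assumed closed forms (standard sign convention), not taken verbatim
    from the notes.
    """
    x, x1 = xi(r), dxi(r)
    e, e1, e2 = eta(r), deta(r), d2eta(r)
    common = (-e2 / (2 * x * e)
              + e1 * e1 / (4 * x * e * e)
              + x1 * e1 / (4 * x * x * e))
    r11 = common + x1 / (r * x * x)
    r22 = (1 - 1 / x) / (r * r) + x1 / (2 * r * x * x) - e1 / (2 * r * x * e)
    r44 = common - e1 / (r * x * e)
    return r11, r22, r44

GAMMA = 0.3  # hypothetical value of the constant of integration in 31.8

eta = lambda r: 1 - GAMMA / r          # 31.8
deta = lambda r: GAMMA / r ** 2
d2eta = lambda r: -2 * GAMMA / r ** 3
xi = lambda r: 1 / eta(r)              # 31.7
dxi = lambda r: -deta(r) / eta(r) ** 2

residuals = [abs(v)
             for r in (1.0, 2.5, 10.0)
             for v in mixed_ricci(r, xi, dxi, eta, deta, d2eta)]
print(max(residuals))
```

All nine residuals vanish to rounding error, whereas an arbitrary pair of functions (say $\xi = 1$, $\eta = 1 + r$) leaves $R^2_2$ visibly different from zero.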

32. Equations of Geodesics.

We first consider the non-zero geodesics which correspond to a material particle. We know that in this case arc-length can be taken as parameter so that the curve in addition to the equations 30.2 must satisfy the equation 30.5 which we may write as

32.1   $\dfrac{\dot r^2}{\eta} + r^2\dot\theta^2 + r^2\sin^2\theta\,\dot\varphi^2 - \eta\,\dot t^2 = -1;$

we shall, however, make our discussion slightly more general and write A in the right hand side with a view of using the results also in the case corresponding to a light particle. We shall discuss this equation together with the equations 30.2 which become here

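32.21   $\ddot r - \dfrac{\eta'}{2\eta}\,\dot r^2 - r\eta\,\dot\theta^2 - r\eta\sin^2\theta\,\dot\varphi^2 + \dfrac{\eta\eta'}{2}\,\dot t^2 = 0,$

32.22   $\ddot\theta + \dfrac{2}{r}\,\dot r\,\dot\theta - \sin\theta\cos\theta\,\dot\varphi^2 = 0,$

32.23   $\ddot\varphi + \dfrac{2}{r}\,\dot r\,\dot\varphi + 2\cot\theta\,\dot\theta\,\dot\varphi = 0,$

32.24   $\ddot t + \dfrac{\eta'}{\eta}\,\dot r\,\dot t = 0.$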

The choice of the $\theta$ and $\varphi$ coordinates is at our disposal. We choose them in such a way that the initial position of the particle be on the equator and that the tangent be tangent to the equator. In this case $\theta = \pi/2$ and $\dot\theta = 0$ at the initial moment and the second equation shows that $\theta = \pi/2$ always. Now the last two equations may be integrated once each and they furnish

32.3   $r^2\dot\varphi = h,$

32.4   $\eta\,\dot t = k,$

where h and k are constants. Together with these two equations we have to consider the one corresponding to 32.1; viz.,

32.5   $\dfrac{\dot r^2}{\eta} + r^2\dot\varphi^2 - \eta\,\dot t^2 = A.$

We simplify our system of equations in the following way: (a) we eliminate $\dot t$ by means of 32.4; (b) we eliminate differentiation with respect to the parameter by using $\dot r = (dr/d\varphi)\,\dot\varphi$ and 32.3; (c) we introduce as a new variable, as is customary in celestial mechanics, the inverse distance $u = 1/r$, instead of r, so that

32.6   $r = \dfrac{1}{u}$,  $\dfrac{dr}{d\varphi} = -\dfrac{1}{u^2}\dfrac{du}{d\varphi};$

and, (d) we substitute the value for $\eta$ from 31.8. We obtain in this way a differential equation between u and $\varphi$; viz.,

32.7   $\left(\dfrac{du}{d\varphi}\right)^2 + u^2 = \alpha - \dfrac{A\gamma}{h^2}\,u + \gamma u^3$

where $\alpha$ is a constant. This equation may be considered as the equation of the orbit of a planet.
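That the geodesic equations really possess the first integrals used above can be seen by integrating them numerically. What follows is a minimal sketch, not a method of the notes: it assumes the equatorial form of the equations ($\theta = \pi/2$) with $\xi = 1/\eta$ and $\eta = 1 - \gamma/r$, steps them with an ordinary fourth-order Runge-Kutta scheme of our own choosing, and watches the three quantities $r^2\dot\varphi$, $\eta\dot t$ and $\dot r^2/\eta + r^2\dot\varphi^2 - \eta\dot t^2$. The value of $\gamma$ and the initial data are hypothetical.

```python
import math

GAMMA = 2.0  # hypothetical value of the constant in 31.8

def eta(r):
    return 1.0 - GAMMA / r

def deta(r):
    return GAMMA / (r * r)

def rhs(y):
    """Right-hand sides of the equatorial geodesic equations (theta = pi/2)."""
    r, phi, t, rd, phid, td = y
    e, e1 = eta(r), deta(r)
    rdd = e1 / (2 * e) * rd * rd + r * e * phid * phid - 0.5 * e * e1 * td * td
    phidd = -2.0 / r * rd * phid
    tdd = -e1 / e * rd * td
    return [rd, phid, td, rdd, phidd, tdd]

def rk4_step(y, ds):
    """One classical Runge-Kutta step of length ds in the parameter."""
    k1 = rhs(y)
    k2 = rhs([a + 0.5 * ds * b for a, b in zip(y, k1)])
    k3 = rhs([a + 0.5 * ds * b for a, b in zip(y, k2)])
    k4 = rhs([a + ds * b for a, b in zip(y, k3)])
    return [a + ds / 6 * (p + 2 * q + 2 * m + n)
            for a, p, q, m, n in zip(y, k1, k2, k3, k4)]

def first_integrals(y):
    """h of 32.3, k of 32.4 and the left-hand side A of 32.5."""
    r, _, _, rd, phid, td = y
    h = r * r * phid
    k = eta(r) * td
    A = rd * rd / eta(r) + r * r * phid * phid - eta(r) * td * td
    return h, k, A

# Hypothetical initial data: start at r = 10 with no radial motion and
# choose t-dot so that the left-hand side of 32.5 equals -1.
r0, phid0 = 10.0, 0.04
td0 = math.sqrt((1 + r0 * r0 * phid0 * phid0) / eta(r0))
y = [r0, 0.0, 0.0, 0.0, phid0, td0]

start = first_integrals(y)
for _ in range(2000):
    y = rk4_step(y, 0.01)
end = first_integrals(y)
drift = max(abs(a - b) for a, b in zip(start, end))
print(start, drift)
```

The three quantities stay constant along the computed curve up to the error of the integration scheme, which is what the passage from 32.23 and 32.24 to 32.3 and 32.4 asserts.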

33. Newtonian Motion of a Planet.

Every reader knows, of course, that according to the Newtonian theory a planet moves around the sun on an ellipse in one of whose foci the sun is situated, although he may not be in the possession of a proof of that; we shall not give a proof of that here either, but we shall discuss in detail only one feature of the situation. The vertex of the ellipse which is nearest to the focus in which the sun is located is called the perihelion, the other vertex - the aphelion; the line joining the perihelion and the aphelion is the major axis, and therefore passes through the sun. Using the coordinates u and $\varphi$ corresponding to those of the preceding section we may say that the perihelion corresponds to the maximum value of u, and the aphelion to the minimum value of u, and that the transition from the maximum to the minimum value of u corresponds to the change of $\varphi$ by the amount $\pi$. It is this last fact that we shall deduce from the equations of motion. We may (corresponding to the fact that we set $\theta = \pi/2$ in the preceding section) consider a motion in the xy-plane characterized by the equations (see 1.1 and 1.3)

33.1   $\dfrac{d^2x}{dt^2} = -\dfrac{Mx}{r^3}$,  $\dfrac{d^2y}{dt^2} = -\dfrac{My}{r^3}.$

We have now to introduce variables corresponding to those used above in the Relativity treatment, i.e., to set

$x = \cos\varphi/u$,  $y = \sin\varphi/u$.

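We calculate the first and the second derivatives of x and y with respect to t, substitute them into 33.1 and combining the terms with $\cos\varphi$ and those with $\sin\varphi$ we get

$\sin\varphi\left[2u^{-3}\left(\dfrac{du}{dt}\right)^2 - u^{-2}\dfrac{d^2u}{dt^2} - u^{-1}\left(\dfrac{d\varphi}{dt}\right)^2 + Mu^2\right] - \cos\varphi\left[2u^{-2}\dfrac{du}{dt}\dfrac{d\varphi}{dt} - u^{-1}\dfrac{d^2\varphi}{dt^2}\right] = 0,$

$\cos\varphi\left[2u^{-3}\left(\dfrac{du}{dt}\right)^2 - u^{-2}\dfrac{d^2u}{dt^2} - u^{-1}\left(\dfrac{d\varphi}{dt}\right)^2 + Mu^2\right] + \sin\varphi\left[2u^{-2}\dfrac{du}{dt}\dfrac{d\varphi}{dt} - u^{-1}\dfrac{d^2\varphi}{dt^2}\right] = 0.$

Multiplying the first of these equalities by $\sin\varphi$, the second by $\cos\varphi$ and adding the results we obtain

33.2   $2u^{-3}\left(\dfrac{du}{dt}\right)^2 - u^{-2}\dfrac{d^2u}{dt^2} - u^{-1}\left(\dfrac{d\varphi}{dt}\right)^2 + Mu^2 = 0$

and then easily

$2u^{-2}\dfrac{du}{dt}\dfrac{d\varphi}{dt} - u^{-1}\dfrac{d^2\varphi}{dt^2} = 0.$

The last equation may be written as

$\dfrac{d^2\varphi/dt^2}{d\varphi/dt} = 2\,\dfrac{du/dt}{u}$

whence

$\log\dfrac{d\varphi}{dt} = 2\log u + \text{const.}$

and then

33.3   $\dfrac{d\varphi}{dt} = Hu^2$

where H is a constant of integration. We next want to eliminate t from 33.2 with the help of the last formula. Differentiating it we have

$\dfrac{d^2\varphi}{dt^2} = 2Hu\,\dfrac{du}{dt},$

$\dfrac{du}{dt} = \dfrac{du}{d\varphi}\dfrac{d\varphi}{dt} = \dfrac{du}{d\varphi}\,Hu^2,$

$\dfrac{d^2u}{dt^2} = \dfrac{d^2u}{d\varphi^2}\left(\dfrac{d\varphi}{dt}\right)^2 + \dfrac{du}{d\varphi}\dfrac{d^2\varphi}{dt^2} = \dfrac{d^2u}{d\varphi^2}\,H^2u^4 + 2H^2u^3\left(\dfrac{du}{d\varphi}\right)^2.$

Substituting into 33.2 we arrive at

$2H^2u\left(\dfrac{du}{d\varphi}\right)^2 - H^2u^2\dfrac{d^2u}{d\varphi^2} - 2H^2u\left(\dfrac{du}{d\varphi}\right)^2 - u^{-1}H^2u^4 + Mu^2 = 0$

and, after two terms cancel, at

33.4   $\dfrac{d^2u}{d\varphi^2} + u = \dfrac{M}{H^2}.$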

This, we easily see, may be obtained by differentiation from

33.5   $\left(\dfrac{du}{d\varphi}\right)^2 + u^2 = \alpha + \dfrac{2Mu}{H^2}$

where $\alpha$ is a constant. This corresponds to equation 32.7 obtained from Relativity theory in the preceding section; in that last equation we have, of course, to take A = -1 if we consider the motion of a planet so that it becomes

33.51   $\left(\dfrac{du}{d\varphi}\right)^2 + u^2 = \alpha + \dfrac{\gamma u}{h^2} + \gamma u^3$

and we see that the difference is essentially in only one term. But before we come to the comparison of the motions described by these two equations we have to continue the discussion of 33.5. The character of motion described by it depends on the values of the constants appearing in it, and also on the initial conditions. We begin the discussion by writing 33.5 in the form

33.52   $\left(\dfrac{du}{d\varphi}\right)^2 = -(u - u_1)(u - u_2)$

where $u_1$ and $u_2$ are the two roots of the polynomial $u^2 - 2Mu/H^2 - \alpha$. If the two roots are complex, or equal, the right hand side of 33.52 is negative and we cannot have real motion. Also when both roots are negative the right hand side is negative for positive values of u (and u, being the inverse distance, must be positive). The case of one positive and one negative root corresponds to u changing from 0 to a finite value and then going back to zero, for instance, a comet approaching the sun from an infinite distance and then receding back into infinity. But we want to treat the case of a planet, and this will obviously correspond to the only remaining case; viz., that of two distinct positive roots. If by $u_1$ we denote the larger and by $u_2$ the smaller of the two roots it will be convenient to write our equation as

33.53   $\left(\dfrac{du}{d\varphi}\right)^2 = (u_1 - u)(u - u_2)$

and we see that a real solution is only possible when u is between $u_2$ and $u_1$. The motion will manifest itself in an oscillation of u between $u_2$ and $u_1$ and the sign of $du/d\varphi$ will change at these points. The particular question we want to investigate is, as was mentioned at the beginning of the section, to what change of $\varphi$ corresponds one oscillation of u, between $u_2$ and $u_1$ say. In order to find this we solve the equation for $d\varphi$, obtaining

$d\varphi = \dfrac{du}{\sqrt{(u_1 - u)(u - u_2)}}$

whence

33.6   $\varphi_1 - \varphi_2 = \displaystyle\int_{u_2}^{u_1}\dfrac{du}{\sqrt{(u_1 - u)(u - u_2)}};$

a change of variable will help us to evaluate this integral. We put

33.7   $\dfrac{u - u_2}{u_1 - u_2} = \sin^2 x;$

when x changes from 0 to $\pi/2$, u will increase from $u_2$ to $u_1$ as required. We have

$du = 2(u_1 - u_2)\sin x\cos x\,dx,$

$u - u_2 = (u_1 - u_2)\sin^2 x,$

$u_1 - u = u_1 - [u_2 + (u_1 - u_2)\sin^2 x] = (u_1 - u_2)\cos^2 x,$

and the integral becomes

33.8   $\displaystyle\int_0^{\pi/2} 2\,dx,$

so that

33.9   $\varphi_1 - \varphi_2 = \pi.$

The answer to our question is then, that $\varphi$ changes exactly through $\pi$ while u performs an oscillation between its minimum and its maximum values, which corresponds to the fact mentioned before that the aphelion and the perihelion are on a straight line with the sun, which fact we thus proved.
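The result may also be confirmed by direct numerical quadrature. In the sketch below we assume the integrand $du/\sqrt{(u_1 - u)(u - u_2)}$ and the substitution $u = u_2 + (u_1 - u_2)\sin^2 x$, but we do not simplify by hand; the integrand is evaluated as it stands at the midpoints of a subdivision of the range of x. The values of $u_1$ and $u_2$ and the function name are hypothetical.

```python
import math

def newton_half_turn(u1, u2, n=1000):
    """Midpoint-rule value of the integral of du / sqrt((u1-u)(u-u2)),
    taken from u2 to u1, after putting u = u2 + (u1-u2) sin^2 x."""
    dx = (math.pi / 2) / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        u = u2 + (u1 - u2) * math.sin(x) ** 2
        du_dx = 2 * (u1 - u2) * math.sin(x) * math.cos(x)
        total += du_dx / math.sqrt((u1 - u) * (u - u2)) * dx
    return total

# Hypothetical inverse distances at perihelion (u1) and aphelion (u2).
angle = newton_half_turn(1.2, 0.8)
print(angle, math.pi)
```

Whatever positive values $u_1 > u_2$ are chosen, the computed angle agrees with $\pi$ to rounding error, since the substituted integrand is the constant 2.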

34. Relativity Motion of a Planet.

Following this excursion into celestial mechanics according to Newton we return to our Relativity formulas which we shall treat by comparing them to the formulas derived in the last section.

At this stage we come again upon a fundamental question: we have two theories; the quantities of one of them have been identified with measured quantities, and this identification proved, in the main, a splendid success; if the new theory is to be applied successfully, it is clear that it has essentially to agree with the old theory with which it may be compared instead of being compared with results of measurement directly; that means that we have to establish a correspondence between quantities of the two theories, and we have to expect that the corresponding quantities of the two theories obey approximately the same relations. This correspondence has been anticipated in the preceding pages by using the same letters for quantities which it is intended to identify. But it may not be superfluous to remind the reader that the quantities $u$, $\varphi$, $\theta$ of the two theories are not the same; there is a certain arbitrariness in choosing coordinates in curved space, and especially obvious it must be in the case of r (of which u is the inverse); it is possible to substitute for r some simple function of r, and, indeed, it has been done; the criterion of correctness of choice must lie in the success of the identification.

Next we must identify the constants of the new theory with those of the old. It would seem as though we must, in order to reach an agreement, make $\gamma = 0$ so as to get rid of the last term of the equation 33.51 by which it differs essentially from 33.5. But this would annihilate also the preceding term in the new formula and so spoil the correspondence altogether. We must, therefore, ascribe to $\gamma$ a finite value, but we will expect that it will be small; more precisely, it will be small in such a way that the term $\gamma u^3$ will not affect the equation 33.51 essentially or will be small in comparison with $u^2$. Next, let us compare 32.3 with 33.3. Of course, the left hand sides differ by the factor dt/ds, but this is equal (Section 13) to $1/\sqrt{1 - \beta^2}$ which is, even for motions of planets, very close to one, so that, in the first approximation we may identify h with H. Comparing now 33.5 and 33.51 we come to the conclusion that

34.1   $\gamma = 2M$

so that we may write 33.51 as

34.2   $\left(\dfrac{du}{d\varphi}\right)^2 + u^2 = \alpha + \dfrac{2Mu}{H^2} + 2Mu^3.$

After we have made these identifications the situation is then this: if we neglect the term $2Mu^3$ in the equation, and this term is negligible in most cases, we have the same equation of the orbit as in Newtonian mechanics. This result is very satisfactory: we have been able to obtain the equations of motion of a planet without considering any gravitational forces, as a result of our identification of the contracted Riemann tensor with the complete tensor. Still the term $2Mu^3$ is there; the Relativity theory predicts an orbit that is slightly different from that predicted by the Newtonian theory; is the difference within the error of observation? We shall consider now this question, but instead of considering the motion as a whole, we shall consider only the feature of it which for the case of Newtonian motion has been considered in the preceding section; viz., we shall ask ourselves whether, corresponding to an oscillation of u between a minimum and a maximum value, the change in $\varphi$ will be exactly $\pi$. Of course, we are sure that in the new theory there will be motions which differ but slightly from the motion considered in the preceding section, so that the general character of the motion will be the same, and u will oscillate between a minimum and maximum. The value of $(du/d\varphi)^2$ will now be expressed by a polynomial of degree three the first two terms of which are

$2Mu^3 - u^2.$

The sum of the three roots of this polynomial is 1/2M so that if $u_1$, $u_2$ denote two roots, viz., those two roots which differ but slightly from the roots denoted in the same way in the preceding case, the third root will be

$\dfrac{1}{2M} - u_1 - u_2,$

and the integral corresponding to 33.6 will be

$\varphi_1 - \varphi_2 = \displaystyle\int_{u_2}^{u_1}\dfrac{du}{\sqrt{(u_1 - u)(u - u_2)\,[1 - 2M(u_1 + u_2 + u)]}};$

the same substitution 33.7 as before will be applied. We only have to calculate

$u_1 + u_2 + u = u_1 + u_2 + u_2 + (u_1 - u_2)\sin^2 x = u_1 + u_2 + u_1\sin^2 x + u_2\cos^2 x,$

so that the integral becomes

$\displaystyle\int_0^{\pi/2}\dfrac{2\,dx}{\sqrt{1 - 2M(u_1 + u_2 + u_1\sin^2 x + u_2\cos^2 x)}}.$

As we saw before, M is a very small quantity; before, we neglected it altogether and obtained $\pi$ for the value of the integral; now, we shall go to the next approximation; we shall develop the denominator according to the powers of M and neglect all terms beyond the second (it would be a very easy but not a worthwhile matter to estimate the value of the error); we get in this way, as an approximate value for $\varphi_1 - \varphi_2$,

$\displaystyle\int_0^{\pi/2} 2\,[1 + M(u_1 + u_2 + u_1\sin^2 x + u_2\cos^2 x)]\,dx,$

or

$\pi + \dfrac{3\pi M}{2}(u_1 + u_2).$

The new theory predicts then that the angle $\varphi$ will have changed by this amount while the distance from the sun changes from its minimum to its maximum; i.e., that the perihelion and aphelion are not in a straight line with the sun but that the planet moves through an additional angle of $\frac{3\pi M}{2}(u_1 + u_2)$ after reaching the position opposite the one where it was during the perihelion, before reaching the aphelion. Since the same situation applies to the motion between an aphelion and the next perihelion we see that between two consecutive perihelia the planet will have moved through an angle $2\pi + 3\pi M(u_1 + u_2)$, or that the perihelion will have moved through an angle $3\pi M(u_1 + u_2)$ during one revolution of the planet. This is a very small amount, and it may be considered as a correction to the classical result according to which the planet moves on an elliptic orbit with the sun in one of the foci. If a is the major semi-axis and e the eccentricity, the distance at perihelion is a - ae and the distance at aphelion a + ae; we have then

$u_1 + u_2 = 1/(a - ae) + 1/(a + ae) = 2/[a(1 - e^2)],$

and the final formula for the advance of the perihelion comes out

34.5   $p = \dfrac{6\pi M}{a(1 - e^2)}.$
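The quality of the approximation just made can be judged by evaluating the integral numerically. The sketch below assumes the substituted integrand $2/\sqrt{1 - 2M(u_1 + u_2 + u_1\sin^2 x + u_2\cos^2 x)}$ on the range from 0 to $\pi/2$ and compares a midpoint-rule value with the first-order expression $\pi + \frac{3\pi M}{2}(u_1 + u_2)$. The numerical values of M, $u_1$, $u_2$ and the function names are hypothetical.

```python
import math

def half_turn_numeric(M, u1, u2, n=2000):
    """Midpoint-rule value of the relativistic integral after the
    substitution u = u2 + (u1 - u2) sin^2 x."""
    dx = (math.pi / 2) / n
    total = 0.0
    for i in range(n):
        s2 = math.sin((i + 0.5) * dx) ** 2
        bracket = u1 + u2 + u1 * s2 + u2 * (1 - s2)
        total += 2.0 / math.sqrt(1.0 - 2.0 * M * bracket) * dx
    return total

def half_turn_first_order(M, u1, u2):
    """Value kept after developing in powers of M up to the first power."""
    return math.pi + 1.5 * math.pi * M * (u1 + u2)

M, u1, u2 = 1e-4, 1.2, 0.8   # hypothetical, with M*u small
numeric = half_turn_numeric(M, u1, u2)
approx = half_turn_first_order(M, u1, u2)
print(numeric - math.pi, approx - math.pi)
```

The two excesses over $\pi$ agree to within terms of the order of $(Mu)^2$, which is the error that the development neglects.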

Here then we have two predictions: on the old theory the perihelion will remain fixed in space; according to the new one it will advance by p during one revolution. What are the results? In the case of most planets either this amount is too small or the position of the perihelion too uncertain to permit any decision, but in the case of the planet Mercury it was known for a long time that there is a discrepancy between the prediction of the Newtonian theory and actual observations; and it happens that the discrepancy is very nearly the amount showing the discrepancy between the two theories, so that the theory of Relativity predicts a result that has been actually observed. This must be considered as a success of the new theory.
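To see what the formula $p = 6\pi M/[a(1 - e^2)]$ amounts to for Mercury we may put in numbers. None of the numerical values below come from these notes: they are assumed modern figures, namely $M \approx 1.4766$ km for the sun (the mass expressed as a length, in units where the velocity of light is one), and for Mercury a major semi-axis of about $5.791 \times 10^7$ km, an eccentricity of about 0.2056 and a period of about 87.969 days. The sketch only illustrates the order of magnitude.

```python
import math

# Assumed values, not taken from the notes.
M_SUN_KM = 1.4766          # mass of the sun as a length (G*M/c^2), km
A_MERCURY_KM = 5.791e7     # major semi-axis of Mercury's orbit, km
E_MERCURY = 0.2056         # eccentricity of Mercury's orbit
PERIOD_DAYS = 87.969       # period of revolution, days

def perihelion_advance(M, a, e):
    """Advance of the perihelion per revolution, in radians (formula 34.5)."""
    return 6 * math.pi * M / (a * (1 - e * e))

per_revolution = perihelion_advance(M_SUN_KM, A_MERCURY_KM, E_MERCURY)
revolutions_per_century = 36525 / PERIOD_DAYS
arcsec_per_century = (math.degrees(per_revolution * revolutions_per_century)
                      * 3600)
print(per_revolution, arcsec_per_century)
```

With these figures the advance comes out at roughly 43 seconds of arc per century, an amount far too small to notice in a single revolution but accumulating over many.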

35. Deflection of Light.

According to Section 29 a light particle also moves along a geodesic, only in this case it is a zero geodesic, one along which the tangent vectors have zero length. The equations for such a geodesic are the same as for the other kind with the difference that the parameter is no longer arc length. As a result we have to have zero instead of -1 in the right hand member of equation 32.1, that is to make A = 0 in the equations 32.5 and 32.7. The equation of the orbit will therefore be (34.1)

35.1   $\left(\dfrac{du}{d\varphi}\right)^2 + u^2 = \alpha + 2Mu^3$

which will have to be compared with the same equation without the term containing $u^3$ which is an equation of a straight line and characterizes the propagation of a beam of light on the old theory. In fact, the equation of a straight line whose distance from the origin is 1/p and which is perpendicular to the polar axis is in our coordinates $u = p\cos\varphi$; we have then $du/d\varphi = -p\sin\varphi$ and taking the sum of the squares of the last two expressions we find that they add up to $p^2$ which we may identify with $\alpha$. Again the term $2Mu^3$ is very small because the maximum value u can take is the inverse of the minimum value of the distance from the center of the sun, which is the radius of the sun; we treat the problem again as a perturbation problem, that is, compare the required solution to that of the equation without the $2Mu^3$ term. Again we are interested in the change of the angle $\varphi$ corresponding to a transition between the two extreme values of u. We shall be interested in a beam of light emitted from a star, arriving into our telescope and passing on its way very near to the surface of the sun. The distances of the star and even of the earth from the sun are very large in comparison with the minimum distance, and we shall take them $= \infty$. The maximum value of u, corresponding to the minimum distance from the sun, we shall denote by $u_0$. Since $du/d\varphi$ changes its sign when the light particle reaches this point it must vanish there so that the left hand side of 35.1 reduces to $u_0^2$, and we have

$u_0^2 = \alpha + 2Mu_0^3;$

we may use $u_0$ instead of $\alpha$ in our equation and write it in the form

$\left(\dfrac{du}{d\varphi}\right)^2 = u_0^2 - u^2 - 2Mu_0^3 + 2Mu^3.$

Solving this for $d\varphi$ we find

$d\varphi = \dfrac{du}{\sqrt{2M(u^3 - u_0^3) - (u^2 - u_0^2)}}.$

We introduce a new variable x letting

$u = u_0\sin x$

and after the substitution develop according to powers of M and keep only two terms; we thus get for $d\varphi$ approximately

$\left[1 + Mu_0\,\dfrac{1 + \sin x + \sin^2 x}{1 + \sin x}\right]dx;$

if we let x change from 0 to $\pi$, u will change from zero to $u_0$ and back to zero, just the change that the inverse distance will experience during the propagation of the light particle. The total change of the angle will then be represented by the integral

$\displaystyle\int_0^{\pi}\left[1 + Mu_0\,\dfrac{1 + \sin x + \sin^2 x}{1 + \sin x}\right]dx;$

on the old theory, which corresponds to the absence of the term with M in the equation 35.1, we will have to omit the term with M in this integral and the result is $\pi$. The approximate result according to the new theory will differ from that by

$Mu_0\displaystyle\int_0^{\pi}\dfrac{1 + \sin x + \sin^2 x}{1 + \sin x}\,dx = 4Mu_0 = \dfrac{4M}{r_0}.$

The beam of light coming to us from a star will then be deflected by an angle $4M/r_0$ where $r_0$ is its minimum distance from the sun, compared to the old theory, or to the beam as it would go if the sun were absent. If then we observe a star in a certain position on the sky while the sun is far away, and then observe the same star when the sun is near the line of vision, i.e., when the apparent position of the sun is near the apparent position of the star, this latter position must appear shifted away from the sun approximately by the angle $4M/r_0$. Actual measurements are possible only during an eclipse of the sun, because otherwise the light from the sun drowns out the fainter light from the star, and are beset with difficulties, but the results seem to be in favor of the prediction.
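Both the numerical factor 4 and the size of the effect can be checked in a few lines. The sketch assumes that the term with M in the integral is $Mu_0\int_0^{\pi}(1 + \sin x + \sin^2 x)/(1 + \sin x)\,dx$, and it uses assumed modern values, not given in these notes, for the mass of the sun expressed as a length ($\approx 1.4766$ km) and for its radius ($\approx 6.957 \times 10^5$ km). The function name is hypothetical.

```python
import math

def m_term_integral(n=2000):
    """Midpoint-rule value of the integral over 0..pi of
    (1 + sin x + sin^2 x) / (1 + sin x)."""
    dx = math.pi / n
    total = 0.0
    for i in range(n):
        s = math.sin((i + 0.5) * dx)
        total += (1 + s + s * s) / (1 + s) * dx
    return total

# Assumed values, not taken from the notes.
M_SUN_KM = 1.4766      # mass of the sun as a length (G*M/c^2), km
R_SUN_KM = 6.957e5     # radius of the sun, km

factor = m_term_integral()
deflection_rad = factor * M_SUN_KM / R_SUN_KM
deflection_arcsec = math.degrees(deflection_rad) * 3600
print(factor, deflection_arcsec)
```

For a beam grazing the surface of the sun the deflection comes out at about one and three quarter seconds of arc under these assumptions.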

36. Shift of Spectral Lines.

We come to the third so-called test of the General Relativity Theory, that is, a case where the predictions of the theory differ from those of older theories by an amount exceeding the error of measurement, thus affording an opportunity to prove or disprove the advantages of the new theory.

In this case again we deal with propagation of light in the gravitational field of the sun, but this time the source is supposed to be on the sun itself, and the observer is on the earth, so that the direction of the beam is that of a radius of the sun; we take this to mean that $\theta$ = const. and $\varphi$ = const. We have then according to 32.5 with A = 0

36.1   $\dfrac{\dot r^2}{\eta} - \eta\,\dot t^2 = 0.$

This gives

$dr = \pm\,\eta\,dt;$

the double sign corresponds to two possible senses of the beam: from the sun to the earth and from the earth to the sun. The former, in which we are now interested, is characterized by the property that r increases as t increases; the ratio dr/dt must therefore be positive, and since $\eta$ is positive we must take

36.2   $dr = \eta\,dt.$

The orbit is thus determined by the equations

$d\theta = 0$,  $d\varphi = 0$,  $dr = \eta\,dt.$

But we are interested in the color this time, and color, as we have agreed in Section 16, is proportional to the time component of the momentum vector. As the momentum vector we have to consider the vector of components $dx_i/dp$, where p is a parameter appearing in the equations of geodesics, the one with respect to which differentiation is denoted by a dot. In order to find this parameter we have to go back to the original equations of geodesics; 32.21 becomes here

$\ddot r - \dfrac{\eta'}{2\eta}\,\dot r^2 + \dfrac{\eta\eta'}{2}\,\dot t^2 = 0;$

equation 36.1 shows that the last two terms cancel leaving us with

$\ddot r = 0.$

This means that r is a linear function of the parameter,

$r = ap + b$

so that

$d/dp = a\,d/dr.$

The momentum vector $u^i$ is here therefore

$a\dfrac{dr}{dr}$,  $a\dfrac{d\theta}{dr}$,  $a\dfrac{d\varphi}{dr}$,  $a\dfrac{dt}{dr}$

or

$a$,  $0$,  $0$,  $a/\eta$,

the last according to 36.2. What about the value of a? The answer is that it is not and cannot be determined by the foregoing discussion. There are different beams of light which satisfy all the conditions imposed so far; they differ in color, and different colors correspond to different values of a.

Color, according to our definition, is the time component of the momentum vector; i.e., the scalar product of the momentum vector of light and the unit vector in the time direction. If we denote the (contravariant) components of the latter by $T^i$, the condition that it has time direction will be given by

$T^1 = T^2 = T^3 = 0$

and the condition that it is a unit vector - by

$-g_{\alpha\beta}T^\alpha T^\beta = 1$

which, taking into account the relations just preceding and the values of the g's becomes

$\eta\,(T^4)^2 = 1.$

The scalar product of the vectors $u^i$ and $T^i$ calculated according to the formula $g_{ij}u^iT^j$ becomes, in absolute value,

$\dfrac{a}{\sqrt{\eta}} = \dfrac{a}{\sqrt{1 - 2M/r}}$

or, if we expand and keep only the first two terms,

$a(1 + M/r).$

The color of a beam of light is then not constant along the beam. We shall compare the color as it appears near the surface of the sun, where r is equal to the radius of the sun $r_s$, and near the surface of the earth, where we may assume $r = \infty$. The frequencies in these two cases will be for a given beam of light proportional to

$1 + M/r$ and 1; the change in frequency will be proportional to

$(1 + M/r) - 1 = M/r,$

and this will be also the relative change in frequency.

If now we consider some source of light near the surface of the sun, whose frequency we know, the light emitted by it when it is received at the surface of the earth will have a frequency that is less, the amount of the relative change being given by

M/r.

If then we compare light coming from a terrestrial source, for instance, emitted by an atom, and light emitted by a corresponding source on the sun, for instance, emitted by an atom of the same kind, we would expect a change of frequency of the amount M/r. Or, if we compare a Solar spectrum with a Terrestrial spectrum, the lines of the former will be shifted toward the red by the amount M/r. This is the prediction of the General Relativity Theory.
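The size of the predicted shift is easily estimated. The sketch below uses assumed modern values, not given in these notes, for the mass of the sun expressed as a length ($\approx 1.4766$ km) and for its radius ($\approx 6.957 \times 10^5$ km), and compares the first-order amount M/r with the unexpanded expression $1/\sqrt{1 - 2M/r} - 1$.

```python
import math

# Assumed values, not taken from the notes.
M_SUN_KM = 1.4766      # mass of the sun as a length (G*M/c^2), km
R_SUN_KM = 6.957e5     # radius of the sun, km

# Relative change of frequency to the first order in M/r.
first_order = M_SUN_KM / R_SUN_KM

# The same quantity before expanding in powers of M/r.
unexpanded = 1 / math.sqrt(1 - 2 * M_SUN_KM / R_SUN_KM) - 1

print(first_order, unexpanded)
```

The relative shift is thus of the order of two parts in a million under these assumptions, and the terms neglected in the expansion are smaller still by a factor of the same order.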

Again the experimental evidence seems to favor this prediction.
