MATHEMATICS  OF  RELATIVITY 


LECTURE  NOTES 


BY 


G. Y. RAINICH


(All  Rights  Reserved) 

(Printed  in  U.  S.  A.) 


EDWARDS  BROTHERS,  INC. 

Lithoprinters  and  Publishers 
ANN  ARBOR,  MICHIGAN 


CONTENTS

Introduction

Chapter I. OLD PHYSICS

1. Motion of a Particle. The Inverse Square Law
2. Two Pictures of Matter
3. Vectors, Tensors, Operations
4. Maxwell's Equations
5. The Stress-Energy Tensor
6. General Equations of Motion. The Complete Tensor

Chapter II. NEW GEOMETRY

7. Analytic Geometry of Four Dimensions
8. Axioms of Four-Dimensional Geometry
9. Tensor Analysis
10. Complications Resulting From Imaginary Coordinate
11. Are the Equations of Physics Invariant?
12. Curves in the New Geometry

Chapter III. SPECIAL RELATIVITY

13. Equations of Motion
14. Lorentz Transformations
15. Addition of Velocities
16. Light Corpuscles, or Photons
17. Electricity and Magnetism in Special Relativity

Chapter IV. CURVED SPACE

18. Curvature of Curves and Surfaces
19. Generalizations
20. The Riemann Tensor
21. Vectors in General Coordinates
22. Tensors in General Coordinates
23. Covariant and Contravariant Components
24. Physical Coordinates as General Coordinates
25. Curvilinear Coordinates in Curved Space
26. New Derivation of the Riemann Tensor
27. Differential Relations for the Riemann Tensor
28. Geodesics

Chapter V. GENERAL RELATIVITY

29. The Law of Geodesics
30. Solar System, Symmetry Conditions
31. Solution of the Field Equations
32. Equations of Geodesics
33. Newtonian Motion of a Planet
34. Relativity Motion of a Planet
35. Deflection of Light
36. Shift of Spectral Lines


INTRODUCTION


Since we are going to deal with applied
Mathematics, or Mathematics applied to Physics,
we have to state in the beginning the general
point of view we take on that subject. A mathematical
theory consists of statements or propositions,
some of which are written as formulas.
Some  of  these  propositions  are  proved,  that 
means  deduced  from  others,  and  some  are  not; 
the  latter  are  called  definitions  and  axioms. 
Furthermore  most  mathematical  theories,  in  par- 
ticular those  in  which  we  are  interested,  deal 
with  quantities,  so  that  the  propositions  take 
the  form  of  relations  between  quantities. 

Physical  experiments  also  deal  with  quan- 
tities which  are  measured  in  definite  prescribed 
ways,  and  then  empirical  relations  are  estab- 
lished between  these  measured  quantities. 

In  an  application  of  Mathematics  to  Phys- 
ics a  correspondence  is  established  between  some 
mathematical  quantities  and  some  physical  quan- 
tities in  such  a  way  that  the  same  relationship 
exists  (as  a  result  of  the  mathematical  theory) 
between  mathematical  quantities  as  the  experi- 
mentally established  relation  between  the  cor- 
responding physical  quantities.  This  view  is 
not  new,  it  was  emphatically  formulated  by  H. 
Hertz  in  the  introduction  to  his  Mechanics,  and 
then  emphasized  again  by  A.  S.  Eddington  in  ap- 
plication to  Relativity.  The  process  of  estab- 
lishing the  correspondence  between  the  physical 
and  the  mathematical  quantities  we  shall,  fol- 
lowing Eddington,  call  identification.  An  iden- 
tification is  successful,  if  the  condition  men- 
tioned above  is  fulfilled,  viz.,  if  the  rela- 
tions deduced  for  the  mathematical  quantities 
are  experimentally  proved  to  exist  between  the 
physical quantities with which they have been
identified.  From  this  point  of  view  we  do  not 
speak  of  true  or  false  theories,  still  less  of 
absolute  truth,  etc.;  truth  for  us  is  nothing 
but  a  successful  identification,  and  it  is  nec- 
essary to  say  expressly  that  there  may  exist  at 
the  same  time  two  successful  identifications, 
two  theories,  each  of  which  may  be  applied  with- 
in experimental  errors  to  the  known  experimental 
results;  and  that  there  may  be  times  when  no 
such  theory  has  been  found;  and  also  that  an 
identification  which  is  successful  at  one  time 
may  cease  to  be  so  later,  when  the  experimental 
precision  will  be  increased. 

Very  often  it  happens  that  the  quantities 
of  a  theory  are  compared  not  with  quantities  which 
are  direct  results  of  experiment  but  with  quan- 
tities of  another,  less  comprehensive,  theory 


whose  identification  with  experimental  quanti- 
ties has  proved  successful.  In  fact,  it  sel- 
dom happens  that  we  have  to  deal  with  direct 
results  of  experiment,  since  even  an  experimen- 
tal paper  usually  contains  a  great  amount  of 
theory. 

We  may  consider  Geometry  as  a  first  attempt 
at  a  study  of  the  outside  world.  It  may  be  con- 
sidered as  a  deductive  system  which  reflects 
(in  the  sense  explained  above,  that  is  of  the 
existence  of  a  correspondence,  etc.)  very  well 
our  experiences  with  some  features  of  the  out- 
side world,  namely  features  connected  with  the 
displacements  of  what  we  call  rigid  bodies.  We 
see  at  once  how  much  is  left  out  in  such  a  study; 
in  the  first  place,  time  is  almost  entirely 
left  out:  in  trying  to  bring  into  coincidence 
two  triangles  we  are  not  interested  in  whether 
we  move  one  slowly  or  rapidly;  in  describing  a 
circle  we  are  not  concerned  with  uniformity  of 
motion.  Another  important  feature  that  is  left 
out  is  the  distinction  between  vertical  and 
horizontal  lines  although  we  know  that  this 
distinction  is  a  very  real  one.   Then  optical 
and electromagnetic phenomena are left out. We
see  thus  that  Geometry  is  not  a  complete  theory 
of  the  outside  world;  one  method  of  building  a 
complete  theory  would  be  to  introduce  correc- 
tions into  geometry,  to  introduce  one  by  one 
time,  gravitation,  ether,  electricity,  to  patch 
it  up  every  time  we  discover  a  hole;  this  is  a 
disrespectful  but  roughly  correct  description 
of  the  actual  development  of  Mathematical  Phys- 
ics. Another  method  would  be  to  scrap  Geometry, 
and  to  build  instead  a  new  theory  which  would  be 
an  organic  whole,  embracing  the  displacement 
phenomena  as  a  (very  important,  to  be  sure)  spe- 
cial case.  The  purpose  of  the  following  dis- 
cussion is  to  exhibit  such  a  theory.  Before  we 
come  to  the  systematic  exposition  we  want  to 
say  something  about  the  plan  we  are  going  to 
follow.  It  is  possible  to  start  with  a  com- 
plete statement  of  the  general  theory,  and  then 
show how special features may be obtained by
specializations  and  approximations;  instead,  on 
didactical  grounds,  we  shall  begin  with  special 
cases  and  work  up  by  modifications  (these  being 
counterparts  of  approximations)  and  generaliza- 
tions. The  essential  difference  between  this 
procedure  and  the  development  of  classical  Phys- 
ics is  that  in  the  latter  Geometry  was  consid- 
ered as  a  fixed  basis,  not  to  be  affected  by 
the  upper  structure,  and  we  shall  feel  free  to 
modify  geometry  when  necessary. 


Chapter  I. 
OLD  PHYSICS 


The  purpose  of  this  chapter  is  to  reformu- 
late some  of  the  fundamental  equations  of  mech- 
anics and  electrodynamics,  and  to  write  them  in 
a new form which constitutes an appropriate basis
for the discussion that follows. The content
of the chapter is classical; the modifications
which are characteristic of the Relativity
theory have not been introduced, but, as
was mentioned, the form is decidedly new.


1.  Motion  of  a  Particle. 
The  Inverse  Square  Law. 

The  fundamental  equations  of  Mechanics  of 
a  particle  are  usually  written  in  the  form 


1.1    $m\,d^2x/dt^2 = X,\quad m\,d^2y/dt^2 = Y,\quad m\,d^2z/dt^2 = Z.$


Here  m  denotes  the  mass  of  the  particle,  x,y,z 
are  functions  of  the  time  t  whose  values  are  the 
coordinates of the particle at the corresponding
time, and X,Y,Z are functions of the coordinates
whose values are the components of the
force at the corresponding point. This system
of  equations  was  the  first  example  of  what  we 
may  call  mathematical  physics,  and  much  that  is 
now  mathematical  physics  may  be  conveniently 
considered  as  a  result  of  a  development  whose 
germ  is  the  system  1.1.  This  chapter  will  be 
devoted  to  tracing  out  some  lines  of  this  de- 
velopment. 

We  begin  by  writing  the  equations  1.1  in 
the  form 

1.11   $d(mu)/dt = X,\quad d(mv)/dt = Y,\quad d(mw)/dt = Z,$

where

1.2    $u = dx/dt,\quad v = dy/dt,\quad w = dz/dt$

are the velocity components. The quantities

mu,    mv,    mw


are  called  the  momentum  components,  and  in  this 
form  our  fundamental  equations  express  the 
statement  that  the  time  rate  of  change  of  the 
momentum  is  equal  to  the  force,  the  original 
statement  of  Newton.  Equations  1.11  are  seen 
to  be  equivalent  to  1.1  if  we  use  the  notations 
1.2  and  the  fact,  usually  tacitly  assumed,  that 
the mass of a particle does not change with time,
or in symbols

1.12   $dm/dt = 0.$


In  the  equations  1.1,  x,y,z  are  usually  un- 
known functions  of  the  time  and  X,Y,Z  are  given 
functions  of  the  coordinates.  The  situation  is 
then  this:  first  the  field  has  to  be  described 
by  giving  the  forces  X,Y,Z,  and  then  the  motion 
in  the  given  field  is  determined  by  solving 
equations 1.1 (with some additional initial conditions).
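
The two steps just described, giving the field and then solving 1.1 from initial conditions, can be sketched numerically. The following is only an illustration under stated assumptions: the field chosen is a force directed toward the origin with magnitude |c|/r^2, and the velocity-Verlet stepping rule, the function names, the step size and the starting values are all choices made here, not part of the text.

```python
import math

def inverse_square_force(x, y, z, c=-1.0):
    """Force components (X, Y, Z) of a field centred at the origin.

    Illustrative assumption: magnitude |c|/r^2, attractive when c < 0.
    """
    r3 = (x * x + y * y + z * z) ** 1.5
    return c * x / r3, c * y / r3, c * z / r3

def integrate_motion(pos, vel, m=1.0, c=-1.0, dt=1e-3, steps=1000):
    """Advance m*d2x/dt2 = X (likewise y, z) with the velocity-Verlet scheme."""
    x, y, z = pos
    u, v, w = vel
    fx, fy, fz = inverse_square_force(x, y, z, c)
    for _ in range(steps):
        # first half-step of the velocity components u, v, w
        u += 0.5 * dt * fx / m
        v += 0.5 * dt * fy / m
        w += 0.5 * dt * fz / m
        # full step of the coordinates, using u = dx/dt etc.
        x += dt * u
        y += dt * v
        z += dt * w
        # force at the new position, then the second velocity half-step
        fx, fy, fz = inverse_square_force(x, y, z, c)
        u += 0.5 * dt * fx / m
        v += 0.5 * dt * fy / m
        w += 0.5 * dt * fz / m
    return (x, y, z), (u, v, w)

def energy(pos, vel, m=1.0, c=-1.0):
    """Kinetic energy plus the potential energy c/r belonging to this field."""
    r = math.sqrt(sum(q * q for q in pos))
    return 0.5 * m * sum(q * q for q in vel) + c / r

# Initial conditions of a circular orbit for m = 1, c = -1.
end_pos, end_vel = integrate_motion((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
print(math.sqrt(sum(q * q for q in end_pos)))   # radius, stays close to 1
print(energy(end_pos, end_vel))                 # stays close to -0.5
```

The field enters only through `inverse_square_force`, so any other given functions X,Y,Z of the coordinates could be substituted without touching the integration step.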

We  shall  first  discuss  fields  of  a  certain 
simple  type.  One  of  the  simplest  fields   of 
force  is  the  so-called  inverse  square  field. 
The  field  has  a  center,  which  is  a  singularity 
of  the  field;  in  it  the  field  is  not  determin- 
ed; in  every  other  point  of  the  field  the  force 
is  directed  toward  the  center  (or  away  from  it) 
and the magnitude of the force is inversely
proportional to the square of the distance from the center.
As  the  most  common  realization  of  such  a  field 
of  force  we  may  consider  the  gravitational  field 
of  a  mass  particle  or  of  a  sphere.  If  cartesi- 
an coordinates  with  the  origin  at  the  center 
are  introduced  the  force  components  are 


1.3    $X = cx/r^3,\quad Y = cy/r^3,\quad Z = cz/r^3,$


where  c  is  a  coefficient  of  proportionality, 
negative,  when  we  have  attraction  and  positive 
in  the  case  of  repulsion,  and 


1.4    $r^2 = x^2 + y^2 + z^2;$

taking the sum of squares of X,Y,Z we easily
find $c^2/r^4$, so that the magnitude of the force
is $|c|/r^2$; force is inversely proportional to the
square  of  the  distance.  If  the  field  is  pro- 
duced by  several  attracting  particles  the  force 
at  every  point  (outside  of  the  points  where  the 
particles  are  located)  is  considered  to  be  giv- 
en by  the  sum  of  the  forces  due  to  the  separate 
particles.  In  this  case  the  expressions  become 
quite  complicated  and  it  is  easier  to  study  the 
general  properties  of  such  fields  by  using  cer- 
tain differential  equations  to  which  the  force- 
components  are  subjected  rather  than  by  study- 
ing the  explicit  expressions. 

These  differential  equations  are  as  fol- 
lows: 

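1.51   $\partial X/\partial x + \partial Y/\partial y + \partial Z/\partial z = 0$

1.52   $\partial Y/\partial z - \partial Z/\partial y = 0,\quad \partial Z/\partial x - \partial X/\partial z = 0,\quad \partial X/\partial y - \partial Y/\partial x = 0.$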

The  fact  that  the  functions  X,Y,Z  given  by 
the formulas 1.3 satisfy these equations may be
proved  by  direct  substitution;  to  facilitate 


calculation  we  may  notice  that  differentiation 
of  1.4  gives 

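1.6    $r\,\partial r/\partial x = x,\quad r\,\partial r/\partial y = y,\quad r\,\partial r/\partial z = z.$

Differentiating the first of 1.3, we have now

$\partial X/\partial x = c/r^3 - (3cx/r^4)\,\partial r/\partial x = c/r^3 - 3cx^2/r^5.$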

Substituting  this  and  two  analogous  expressions 
into  1.51  we  easily  verify  it.  The  verification 
of  1.52  hardly  presents  any  difficulty. 
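
The direct substitution can also be checked numerically. The sketch below is an illustration only: it approximates the partial derivatives by central differences at one arbitrarily chosen point away from the center, and the names `field`, `partial`, `divergence` and `curl`, the step h and the sample point are assumptions made for this example.

```python
def field(x, y, z, c=-1.0):
    """Force components X, Y, Z of formulas 1.3."""
    r3 = (x * x + y * y + z * z) ** 1.5
    return (c * x / r3, c * y / r3, c * z / r3)

def partial(f, point, comp, axis, h=1e-4):
    """Central-difference estimate of d(f[comp])/d(x[axis]) at point."""
    plus, minus = list(point), list(point)
    plus[axis] += h
    minus[axis] -= h
    return (f(*plus)[comp] - f(*minus)[comp]) / (2 * h)

def divergence(f, point):
    """Left hand side of 1.51: dX/dx + dY/dy + dZ/dz."""
    return sum(partial(f, point, i, i) for i in range(3))

def curl(f, point):
    """The three left hand sides of 1.52, in the order written there."""
    return (
        partial(f, point, 1, 2) - partial(f, point, 2, 1),  # dY/dz - dZ/dy
        partial(f, point, 2, 0) - partial(f, point, 0, 2),  # dZ/dx - dX/dz
        partial(f, point, 0, 1) - partial(f, point, 1, 0),  # dX/dy - dY/dx
    )

sample = (0.7, -1.2, 0.5)          # any point other than the center
print(divergence(field, sample))   # close to 0
print(curl(field, sample))         # each entry close to 0
```

A finite-difference check of this kind is no proof, but it catches a wrong exponent or sign in 1.3 at once.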

It is known that equations 1.52 give a necessary
and sufficient condition for the existence
of a function $\varphi$ of which X,Y,Z are partial
derivatives. The derivative $\partial X/\partial x$ is then the
second derivative of this function, and the system
1.51-1.52 may be replaced by the equivalent
system

1.53   $X = \partial\varphi/\partial x,\quad Y = \partial\varphi/\partial y,\quad Z = \partial\varphi/\partial z$

1.54   $\partial^2\varphi/\partial x^2 + \partial^2\varphi/\partial y^2 + \partial^2\varphi/\partial z^2 = 0.$

The  last  equation  is  known  as  the  equation 
of  Laplace. 

In the particular case where X,Y,Z are given
by the formulas 1.3 a function $\varphi$ of which X,
Y,Z are partial derivatives is (as it is easy
to verify)

$\varphi = -c/r.$

We may say now that the field of force given
by 1.3 satisfies the differential equations
1.51, 1.52 or 1.53, 1.54 which express the same
thing.  It  is  easy  to  show  that  these  equations 
are  satisfied  not  only  by  the  field  produced  by 
one  particle  at  the  origin,  but  also  by  that  due 
to  any  number  of  particles:  first  we  notice 
that  if  a  particle  is  not  at  the  origin  this 
results  only  in  additive  constants  in  the  coor- 
dinates, and  so  does  not  affect  partial  deriva- 
tives which  appear  in  the  equations  1.51-1.52 
which  therefore  remain  true  in  this  case;   sec- 
ondly, these  equations  are  linear  and  homogene- 
ous, as  a  consequence  of  which  the  sum  of  two 
solutions  of  these  equations  necessarily  is  a 
solution,  or  if  two  fields  satisfy  these  equa- 
tions their  sum  also  satisfies  them;  if  then,  as 
is  generally  assumed,  the  field  produced  by  sev- 
eral particles  is  the  sum  of  the  fields  due  to 
the  individual  particles,  such  a  field  also  sat- 
isfies equations  1.51,  1.52. 
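
The superposition argument may be illustrated with a small numerical sketch. Everything specific in it is an assumption made for the example: the three particle positions and coefficients, the point of evaluation, the step h and the function names. It sums the potentials -c/r of the separate particles and estimates the left hand side of the equation of Laplace by second differences.

```python
import math

def potential(point, sources):
    """Sum of -c_k / (distance from particle k) over all particles."""
    total = 0.0
    for position, c in sources:
        total += -c / math.dist(point, position)
    return total

def laplacian(phi, point, h=1e-3):
    """Second-difference estimate of d2phi/dx2 + d2phi/dy2 + d2phi/dz2."""
    centre = phi(point)
    total = 0.0
    for axis in range(3):
        plus, minus = list(point), list(point)
        plus[axis] += h
        minus[axis] -= h
        total += (phi(plus) + phi(minus) - 2.0 * centre) / (h * h)
    return total

# Hypothetical particles: (position, coefficient c), two attracting, one repelling.
sources = [
    ((0.0, 0.0, 0.0), -1.0),
    ((2.0, 0.0, 0.0), -0.5),
    ((0.0, -1.0, 3.0), 2.0),
]
point = (0.8, 0.6, 0.4)   # a point not occupied by any particle

print(laplacian(lambda p: potential(p, sources), point))   # close to 0
```

Because the equations are linear, the residual of the sum is simply the sum of the residuals of the separate potentials, each of which is close to zero by itself.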

Conversely,  it  can  be  proved  that  any  field 
satisfying  the  differential  equations  1.51-1.52 
may  be  produced  by  a  -  finite  or  infinite  -  set 
of  particles  each  of  which  acts  according  to  the 
inverse square law. We shall not prove this
fact here (the proof is given in Potential Theory),
but we shall show that these equations
give us back the inverse square law, if we
add  the  condition  that  the  field  must  be  symmet- 
ric with  respect  to  one  point. 


A  situation  of  this  character,  a  situation 
where  we  have  to  solve  a  system  of  partial  dif- 
ferential equations  with  the  "additional  condi- 
tion" of  symmetry,  will  appear  again  later,  and 
in  order  to  be  clear  about  its   significance 
then,  it  is  desirable  to  treat  here  this  spe- 
cial case  in  detail. 

To begin with we take the system 1.51, 1.52
in the form 1.53, 1.54, i.e., we state that X,
Y,Z are derivatives of a function $\varphi$. Then from
the condition of symmetry of the X,Y,Z field it
follows that the field represented by the function
$\varphi$ also must be symmetric, i.e., that this
function may depend only on the distance from
the origin, because two points which are equidistant
from the origin are symmetric with respect
to it and, therefore, $\varphi$ must have the
same value in two such points.

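We have thus to solve equation 1.54 with
the additional condition that $\varphi$ depends on x,
y,z, only through r. Indicating differentiation
with respect to r by ' we have

1.56   $\partial\varphi/\partial x = \varphi'\,\partial r/\partial x$, etc.

and

1.57   $\partial^2\varphi/\partial x^2 = \varphi''\,(\partial r/\partial x)^2 + \varphi'\,\partial^2 r/\partial x^2$, etc.

Squaring each of the formulas 1.6 and taking
the sum we have

$r^2\,[(\partial r/\partial x)^2 + (\partial r/\partial y)^2 + (\partial r/\partial z)^2] = x^2 + y^2 + z^2 = r^2$

or

$(\partial r/\partial x)^2 + (\partial r/\partial y)^2 + (\partial r/\partial z)^2 = 1;$

on the other hand differentiating each formula
1.6 we have

$(\partial r/\partial x)^2 + r\,\partial^2 r/\partial x^2 = 1$, etc.;

summing these we obtain

$r\,(\partial^2 r/\partial x^2 + \partial^2 r/\partial y^2 + \partial^2 r/\partial z^2) = 2.$

Using now 1.57 we can give the equation 1.54
the form

$\varphi'' + 2\varphi'/r = 0$

or

$\varphi''/\varphi' + 2/r = 0$

whence

$\varphi' = c/r^2,$

$\varphi = -c/r$, which substituted in 1.56 together
with 1.6 gives 1.3. We see thus that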
the  general  equations  1.51-1.52  give  us  all  the 
general  information  we  need  about  the  fields  of 
force  in  question;  we  shall  call  this  system 
the  system  of  equations  of  a  Newtonian  field  or 
simply  the  Newtonian  system,  although  Newton 
never considered the differential equations
that  make  it  up. 
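
The passage from the equation of Laplace with the condition of symmetry to the inverse square law can also be followed numerically. The sketch below is an illustration under assumptions of our own: it takes the radial equation in the form phi'' + (2/r) phi' = 0, integrates it outward with the classical fourth-order Runge-Kutta rule from starting values that fix the two arbitrary constants, and compares the result with -c/r. The function name, the interval and the number of steps are hypothetical choices.

```python
def solve_radial_laplace(phi0, dphi0, r0, r1, steps=2000):
    """Integrate phi'' + (2/r)*phi' = 0 from r0 to r1 by classical RK4."""
    h = (r1 - r0) / steps
    r, phi, dphi = r0, phi0, dphi0

    def rhs(r, phi, dphi):
        # first-order system: (phi)' = dphi, (dphi)' = -2*dphi/r
        return dphi, -2.0 * dphi / r

    for _ in range(steps):
        k1 = rhs(r, phi, dphi)
        k2 = rhs(r + h / 2, phi + h / 2 * k1[0], dphi + h / 2 * k1[1])
        k3 = rhs(r + h / 2, phi + h / 2 * k2[0], dphi + h / 2 * k2[1])
        k4 = rhs(r + h, phi + h * k3[0], dphi + h * k3[1])
        phi += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        dphi += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        r += h
    return phi, dphi

c = -1.0
# Starting values taken from phi = -c/r and phi' = c/r^2 at r = 1.
phi_end, dphi_end = solve_radial_laplace(-c, c, 1.0, 3.0)
print(phi_end, -c / 3.0)        # the two numbers agree closely
print(dphi_end, c / 9.0)        # phi' follows c/r^2, i.e. the inverse square law
```

Other starting values change only the coefficient c and an additive constant in the potential, which does not affect the force.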

We  may  comment  briefly  on  the  mathematical 
character  of  the  magnitudes  and  equations  we 


have  been  dealing  with  in  this  section.  At  ev- 
ery point  X,Y,Z  may  be  considered  as  the  compo- 
nents of  a  vector,  the  force  vector;  we  have 
thus  a  vector  at  every  point  of  space,  and  this 
constitutes  what  is  called  a  vector  field.  The 
function  <p  is  an  example  of  a  scalar  field. 
These  two  fields  are  in  the  particular  relation 
that  the  first  is  derived  from  the  second  by 
differentiation.  The  vector  field  satisfies 
equation  1.51,  the  left  hand  side  of  which  is 
called  divergence  of  the  vector  field.  To  find 
the divergence of a vector field we take the sum
of  the  derivatives  of  the  components  of  the  vec- 
tor with  respect  to  the  corresponding  coordin- 
ates, or  we  differentiate  each  component  with 
respect  to  the  corresponding  coordinate  and  add 
up  the  results.  The  formula  becomes  more  ex- 
pressive if  we  number  the  coordinates  by  writ- 
ing 

1.71   $x = x_1,\quad y = x_2,\quad z = x_3$

and also the vector components, viz.,

1.72   $X = X_1,\quad Y = X_2,\quad Z = X_3.$

The expression for the divergence may then be
written as

1.8    $\sum_{i=1}^{3} \partial X_i/\partial x_i.$


The  operation  of  forming  a  divergence  is  of 
fundamental  importance  in  what  follows. 

We  abandon  now  for  a  while  the  study  of 
force  fields  and  direct  our  attention  to  the 
left  hand  sides  of  the  equations  of  motion. 


2.  Two  Pictures  of  Matter. 

Our  fundamental  equations  1.1  connect  mat- 
ter as  represented  by  the  left  hand  sides  with 
forces  as  represented  by  the  right  hand  sides. 
There  seems  to  be  a  fundamental  difference  in 
the  mathematical  aspects  of  matter  and  force. 
The  quantities  characterizing  matter,  the  mo- 
mentum components,  for  example,  are  functions 
of  one  variable  t,  and  are  subjected  to  ordin- 
ary differential  equations,  whereas  quantities 
characterizing  force,  X,Y,Z  are  functions  of 
three  variables  x,y,z  and,  as  a  consequence,  are 
subjected  to  partial  differential  equations; 
they  are  field  quantities,  whereas  the  matter 
components  are  not;  another  way  of  saying  this 
is  to  say  that  force  seems  to  be  distributed 
continuously  through  space  but  matter  seems  to 
be  connected  with  discrete  points.  This  dis- 
tinction is,  however,  not  as  essential  as  it 
looks;  it  is  merely  the  result  of  the  point  of 
view we take. We could very well consider matter
to be distributed continuously through
space;  each  of  the  two  theories,  the  discrete 
theory,  according  to  which  matter  consists  of 


discrete  particles,  or  material  points,  each  of 
which  carries  a  finite  mass,  and  the  continuous 
theory, according to which matter is distributed
continuously through space, or certain portions
of space, may be considered as the limiting
case of the other. We may start with material
points, then increase their number at the
same time decreasing the mass of each and so
approximate with any degree of precision a given
continuous distribution; or we may start with
a continuous distribution, then make the density
decrease everywhere except in the constantly
decreasing neighborhoods of a discrete
number of points, and thus approximate, with any
precision, a given discrete distribution. It is
clear  that  there  cannot  be  any  question  as  to 
which  of  the  two  theories  is  correct,  since  the 
difference  between  the  two  can  be  made  as  small 
as  we  please,  and  therefore  the  predictions 
based  on  the  two  theories  can  be  made  to  agree 
as  closely  as  we  may  wish,  so  that  if  one  iden- 
tification is  successful  within  experimental 
error  the  other  will  be  likewise.  Mathemati- 
cally the  difference  will  be  largely  that  be- 
tween ordinary  differential  equations,  which 
are  used  in  treating  the  motion  of  discrete 
particles,  and  partial  differential  equations, 
which  apply  to  continuous  distributions. 

We  may  remark  here  that  although  forces 
are  usually  considered  to  be  continuously  dis- 
tributed in  space,  it  is  possible  to  introduce 
a  discrete  picture  here  also;  this  is  being  ac- 
tually done  sometimes  in  the  electromagnetic 
theory, when a field of force is represented by
discrete  lines  of  force,  and  the  intensity  is 
characterized  by  the  number  of  lines  per  square 
inch;  we  shall  not,  however,  have  occasion  to 
use  this  picture. 

We  may  also  remark  here  that  in  the  last 
few  years  still  a  third  point  of  view  has  ap- 
peared (in  Quantum  Theory)  which  in  a  way  oc- 
cupies an  intermediate  position;  mathematically 
the treatment is that used in the continuous
case  (partial  differential  equations),  but  the 
interpretation  is  given  in  terms  of  discrete 
particles,  the  continuous  quantities  being  con- 
sidered as  probabilities  of  a  particle  being 
within  a  certain  volume,  and  the  like.   This 
point of view also will not be used in what follows,
and is mentioned here only for the sake
of  completeness. 

We  want  now  to  translate  the  equations  of 
motion  1.11  into  the  language  of  the  continuous 
theory.  Each  point  of  space  (or  of  a  certain 
portion  of  space)  will  be  considered  as  occu- 
pied at  each  moment  (or  each  moment  during  a 
certain  period)  by  a  material  particle.   Here 
we again denote by u,v,w the velocity components
of a particle of matter, but now they are
considered as functions of the coordinates as well
as of time; by

u(x,y,z,t),    v(x,y,z,t),   w(x,y,z,t) 


we  understand  the  velocity  components  of  a  par- 
ticle which  at  the  time  t  occupies  the  position 
x,y,z.  The  fundamental  quantity  in  this  theory 
corresponding  to  mass  of  the  discrete  theory  is 
density.  A  particle  does  not  possess  any  finite 
mass,  a  mass  corresponds  only  to  a  finite  vol- 
ume (at  a  given  time).  To  a  point  (at  a  given 
time)  we  assign  a  density  which  may  be  explain- 
ed as  the  limit  of  the  mass  of  a  sphere  with 
the  center  at  the  given  point  divided  by  the 
volume  of  that  sphere  as  the  radius  of  the  sphere 
tends  toward  zero.  A  better  way  of  putting  it 
is  to  say  that  we  consider  a  point  function 
$\rho(x,y,z,t)$ called the density and that the mass
of matter occupying a given volume at a given
time is the integral


$\iiint \rho(x,y,z,t)\,dx\,dy\,dz$


extended  over  the  given  volume.  This  integral 
will,  in  general,  be  a  function  of  time;   the 
mass  in  a  given  volume  changes  with  time  because 
new  matter  may  be  coming  in  and  old  matter  going 
out,  and  they  do  not  exactly  balance  each  other. 
But  if  we  consider  a  certain  volume  at  a  given 
time,  and  then  consider  at  other  moments  the 
volume  which  is  occupied  by  the  same  matter, 
then  the  mass  of  matter  in  that  new  volume  must 
be  the  same.  That  means  that  if  we  consider  x, 
y,z,  as  functions  of  t,  namely,  as  the  coordin- 
ates of  the  same  particle  of  matter  at  differ- 
ent times,  and  if  we  consider  the  region  of  in- 
tegration as  a  variable  volume  but  one  that  is 
occupied  by  the  same  particles  of  matter  at  all 
times,  then  the  integral  must  be  independent  of 
time,  or 


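2.1    $\frac{d}{dt}\iiint \rho(x,y,z,t)\,dx\,dy\,dz = 0.$


This may be written also in a differential form
as

2.2    $\partial(\rho u)/\partial x + \partial(\rho v)/\partial y + \partial(\rho w)/\partial z + \partial\rho/\partial t = 0.$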

We  indicate  two  proofs  for  this  fact;  first 
an  easy  but  not  rigorous  proof. 

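For an infinitesimal volume $V = dx\,dy\,dz$ we
may consider density as the same in all points
of the volume, so that mass will be the product
$V\rho$ and the derivative of this product will be

$dV/dt\cdot\rho + V\cdot d\rho/dt.$

Then again, considering V as a product $dx\,dy\,dz$
we find

$dV/dt = (d\,dx/dt)\,dy\,dz + dx\,(d\,dy/dt)\,dz + dx\,dy\,(d\,dz/dt).$

Now set $dx = x_2 - x_1$;

$d\,dx/dt = d(x_2 - x_1)/dt = dx_2/dt - dx_1/dt = u_2 - u_1 = (\partial u/\partial x)\,dx,$

where $u_1$, $u_2$ are the values of u at $x_1$, $x_2$, and similarly for dy and dz;
substituting this in the preceding relation we
find

$dV/dt = V(\partial u/\partial x + \partial v/\partial y + \partial w/\partial z),$

and since

$d\rho/dt = \partial\rho/\partial x\cdot dx/dt + \partial\rho/\partial y\cdot dy/dt + \partial\rho/\partial z\cdot dz/dt + \partial\rho/\partial t$
$= \partial\rho/\partial x\cdot u + \partial\rho/\partial y\cdot v + \partial\rho/\partial z\cdot w + \partial\rho/\partial t,$

the expression for the derivative of mass gives

$dm/dt = V\cdot(\partial(\rho u)/\partial x + \partial(\rho v)/\partial y + \partial(\rho w)/\partial z + \partial\rho/\partial t)$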

so  that  constancy  of  mass  is  expressed  by  the 
condition  2.2. 
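
As a concrete check of the continuity equation in its differential form, we may take a simple flow for which the answer is known. The example is entirely an assumption made for illustration: a cloud expanding uniformly with velocity components x/(1+t), y/(1+t), z/(1+t), whose density must then fall off as 1/(1+t)^3 if mass is to be conserved. The function names, the sample point and the step h are likewise hypothetical.

```python
def expanding_velocity(x, y, z, t):
    """Velocity components u, v, w of a uniformly expanding cloud (assumed)."""
    s = 1.0 + t
    return (x / s, y / s, z / s)

def matching_density(x, y, z, t):
    """Density that thins out with the expansion: rho = 1/(1+t)^3."""
    return 1.0 / (1.0 + t) ** 3

def constant_density(x, y, z, t):
    """A density that ignores the expansion, so mass is not conserved."""
    return 1.0

def continuity_residual(density, velocity, point, h=1e-5):
    """Central-difference estimate of the left hand side of the continuity equation."""
    def flux(i, p):
        # rho*u, rho*v or rho*w at the point p = [x, y, z, t]
        return density(*p) * velocity(*p)[i]

    def shifted(axis, step):
        q = list(point)
        q[axis] += step
        return q

    total = 0.0
    for i in range(3):
        total += (flux(i, shifted(i, h)) - flux(i, shifted(i, -h))) / (2 * h)
    # the remaining term is the partial derivative of rho with respect to t
    total += (density(*shifted(3, h)) - density(*shifted(3, -h))) / (2 * h)
    return total

point = (0.3, -0.2, 0.5, 0.7)   # x, y, z, t
print(continuity_residual(matching_density, expanding_velocity, point))  # close to 0
print(continuity_residual(constant_density, expanding_velocity, point))  # close to 3/1.7
```

The second call shows what the equation detects: with a density that does not thin out, matter would have to be created at the rate 3/(1+t) per unit volume to fill the expanding cloud.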

A rigorous proof would be based on expressing
the integral for the moment t', which may be
written as $\iiint \rho(x',y',z',t')\,dx'\,dy'\,dz'$, using as
variables of integration the coordinates of the
corresponding particles at the moment t. The
formulas of transformation would be

2.3    $x' = f(x,y,z,t'),\quad y' = g(x,y,z,t'),\quad z' = h(x,y,z,t')$

where f(x,y,z,t') is the x coordinate at the
moment t' of the particle which at the moment
t was at x,y,z, etc. Using the formulas of
transformation of a multiple integral we would
obtain


$\iiint \rho(f,g,h,t')\,J\,dx\,dy\,dz$


where J is the Jacobian of the functions 2.3
and the integration is over the volume occupied
at the moment t. Setting the derivative of
this integral with respect to t' equal to zero,
and then making t' = t and noticing that $\partial f/\partial t'$
for t' = t is the velocity component u, etc.,
we  would  find  the  same  equation  2.2. 

This  equation  is  called  the  "continuity 
equation  of  matter"  or  the  "equation  of  conser- 
vation of  matter."  The  corresponding  equation 
in the discrete theory is the equation (1.12)
dm/dt  =  0  which  is  not  usually  included  among 
the  fundamental  equations  of  mechanics.    The 
continuity  equation  may  be  written  in  a  very 
simple  form  if  we  use  the  index  notations  for 
the  coordinates  introduced  before  (1.71),  intro- 
duce analogous  notations  for  the  velocity  com- 
ponents, viz., 


2.4    $u = u_1,\quad v = u_2,\quad w = u_3,$

and in addition write

2.5    $t = x_4$

and agree, when it is convenient, to write $u_4$
for unity so that

2.41   $1 = u_4.$


With  these  notations  the  continuity  equa- 
tion becomes 


2.21   $\sum_{i=1}^{4} \partial(\rho u_i)/\partial x_i = 0.$


Noticing the analogy of this equation with equation
1.8 we are tempted to say that the continuity
equation expresses the fact that the
"divergence" of the "vector" of components $\rho u_i$
is  zero.  This  involves,  of  course,  a  generali- 
zation of  the  conceptions  divergence  and  vector, 
because  the  summation  here  goes  from  one  to  4 
instead  of  to  3  as  in  the  above  formula.   This 
generalization  will  be  of  extreme  importance  in 
what  follows.  In  the  meantime  we  may  notice 
that the divergence of the vector $\rho u_i$ plays the
same  role  in  the  continuous  theory  as  the  time 
derivative  of  the  number  m  played  in  the  dis- 
crete theory. 

We  now  continue  the  translation  of  the 
equations  of  the  discrete  theory  into  the  con- 
tinuous language.  The  equations  of  motion  ex- 
press the  fact  that  the  time  derivatives  of  the 
momentum  components  are  equal  to  the  force  com- 
ponents; limiting  our  consideration  to  the  left 
hand  sides  of  the  equations  we  have  therefore 
to  consider  the  time  derivatives  of  the  momen- 
tum components;  in  the  first  place  the  time  de- 
rivative of  mu;  without  repeating  the  reasoning 
which  led  us  to  the  continuity  equation  of  mat- 
ter, and  noticing  that  the  only  change  consists 
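in replacing m by mu, we find that the time
rate of change of the first component of the momentum
vector will be here

2.61   $\partial(\rho uu)/\partial x + \partial(\rho uv)/\partial y + \partial(\rho uw)/\partial z + \partial(\rho u)/\partial t$

and the analogous expressions for the other components
will be

2.62   $\partial(\rho vu)/\partial x + \partial(\rho vv)/\partial y + \partial(\rho vw)/\partial z + \partial(\rho v)/\partial t$

2.63   $\partial(\rho wu)/\partial x + \partial(\rho wv)/\partial y + \partial(\rho ww)/\partial z + \partial(\rho w)/\partial t.$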

These expressions will have to be set equal to
the force components (or, rather, components of
the force density) in order to obtain the equations
of motion. Such equations were obtained
by Euler for the motion of a fluid and are referred
to as Euler's hydrodynamic equations; but
at  present  we  are  not  so  much  interested  in  the 
equations  of  motion  as  in  the  mathematical  struc- 
ture of  the  expressions  involving  matter  compo- 
nents that  we  have  written  down.  An  attentive 


inspection will help to discover a far-reaching
symmetry which again finds its best expression
if we use the index notations introduced above
(1.71, 2.4, 2.5). We may write, in fact, for
the last three expressions


2.6    $\sum_{i=1}^{4} \partial(\rho u_i u_j)/\partial x_i,\qquad j = 1,2,3$


and we note furthermore, that if we let j here
take the value 4 we obtain the expression appearing
on the left hand side of the equation
of continuity (2.21). We come thus to the idea
of considering the quantities


2.7    $M_{ij} = \rho u_i u_j$


and we see that the expression

2.65   $\sum_{i=1}^{4} \partial M_{ij}/\partial x_i,\qquad j = 1,2,3,4$

plays  a  very  important  part  in  our  theory.  The 
first  three  components,  i.e.,  the  expressions 
obtained for j = 1,2,3, give the time rate of
change of the momentum components, and the last
one, obtained by setting j = 4, gives the expression
whose vanishing expresses conservation
of mass. The expression 2.65 appears as a generalization
of what we call a divergence, and
we  shall  call  it  divergence  also,  but  it  is 
clear  that  the  whole  structure  of  our  expres- 
sions deserves  a  closer  study  to  which  we  shall 
devote  our  next  section. 
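
The structure just described can be tried out on a simple case. The sketch below is only an illustration and rests on assumptions of our own: the flow is a uniformly expanding cloud with velocity components x/(1+t), y/(1+t), z/(1+t) and density 1/(1+t)^3, on which no force acts; the sixteen products of density and velocity components are stored as a nested list with indices running from 0 to 3 instead of 1 to 4; and the derivatives are estimated by central differences. All function names are hypothetical.

```python
def four_velocity(p):
    """Components u_1..u_4 at p = (x, y, z, t); the fourth is unity."""
    x, y, z, t = p
    s = 1.0 + t
    return (x / s, y / s, z / s, 1.0)

def density(p):
    """Density of the assumed expanding cloud."""
    return 1.0 / (1.0 + p[3]) ** 3

def matter_tensor(p):
    """The sixteen quantities rho*u_i*u_j as a 4 x 4 nested list."""
    rho = density(p)
    u = four_velocity(p)
    return [[rho * u[i] * u[j] for j in range(4)] for i in range(4)]

def tensor_divergence(p, h=1e-5):
    """For each j, the sum over i of d(rho*u_i*u_j)/dx_i by central differences."""
    result = []
    for j in range(4):
        total = 0.0
        for i in range(4):
            plus, minus = list(p), list(p)
            plus[i] += h
            minus[i] -= h
            total += (matter_tensor(plus)[i][j] - matter_tensor(minus)[i][j]) / (2 * h)
        result.append(total)
    return result

p = (0.3, -0.2, 0.5, 0.7)
print(tensor_divergence(p))   # all four components close to 0
```

The fourth component vanishes because the assumed density satisfies the continuity equation; the first three vanish because each particle of this cloud moves with constant velocity, so that its momentum does not change.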

3.   Vectors,  Tensors,  Operations. 

We  shall  later  treat  the  fundamental  con- 
cepts of  vector  and  tensor  analysis  in  a  syste- 
matic way.  At  present  we  shall  show  how  the 
language  of  this  theory  which  for  ordinary 
space  has  been  partly  introduced  in  section  1 
can  be  applied  to  the  case  of  four  independent 
variables  and  extended  so  as  to  furnish  a  sim- 
ple way  of  describing  the  relations  introduced 
in  the  preceding  section. 

A quantity like $\rho$ which depends on the independent
variables x,y,z,t we shall call a scalar field. The four
quantities $u_1, u_2, u_3, u_4$ we shall consider as the
components of a vector (or of a vector field, if we want to
emphasize the dependence on the independent variables). The
sixteen quantities $\rho u_i u_j$ furnish an example of a
tensor (or tensor field). A convenient way to arrange the
components of a tensor is in a square array; for instance,


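3.1
\begin{matrix}
\rho u_1 u_1 & \rho u_1 u_2 & \rho u_1 u_3 & \rho u_1 u_4 \\
\rho u_2 u_1 & \rho u_2 u_2 & \rho u_2 u_3 & \rho u_2 u_4 \\
\rho u_3 u_1 & \rho u_3 u_2 & \rho u_3 u_3 & \rho u_3 u_4 \\
\rho u_4 u_1 & \rho u_4 u_2 & \rho u_4 u_3 & \rho u_4 u_4
\end{matrix}

3.2
\begin{matrix}
\partial u_1/\partial x & \partial u_1/\partial y & \partial u_1/\partial z & \partial u_1/\partial t \\
\partial u_2/\partial x & \partial u_2/\partial y & \partial u_2/\partial z & \partial u_2/\partial t \\
\partial u_3/\partial x & \partial u_3/\partial y & \partial u_3/\partial z & \partial u_3/\partial t \\
\partial u_4/\partial x & \partial u_4/\partial y & \partial u_4/\partial z & \partial u_4/\partial t
\end{matrix}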


We want to mention here a very important tensor, the array
of whose components is

3.4
\begin{matrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{matrix}

Its components are usually denoted by $\delta_{ij}$, so that
$\delta_{ij}$ is one if the indices have the same value, and
zero when they are different. The $\delta_{ij}$ are often
referred to as the Kronecker symbols.

A tensor has been obtained above from a vector by
differentiation; the same process can also be applied to a
scalar in order to obtain a vector. From the scalar $\rho$ we
would thus obtain a vector whose components are
$\partial\rho/\partial x$, $\partial\rho/\partial y$,
$\partial\rho/\partial z$, $\partial\rho/\partial t$; this
vector is often called the gradient of the scalar $\rho$. On
the other hand, the same process may be applied to a tensor.
For instance, differentiating each of the sixteen components
$M_{ij} = \rho u_i u_j$ introduced in 2.7 with respect to each
of the independent variables $x_k$ we obtain the 16 x 4 numbers
(or functions) $\partial M_{ij}/\partial x_k$. We call these
numbers the components of a tensor of rank three, and we may
now call what we called simply a tensor a tensor of rank two,
a vector a tensor of rank one, and a scalar a tensor of rank
zero. The operation of differentiation leads then from a tensor
(or, better, a tensor field) to a tensor of the next higher
rank.

It  is  also  convenient  to  introduce  another 
operation,  the  operation  of  contraction;  it  can 
be  applied  to  a  tensor  of  at  least  rank  two  and 
it  lowers  the  rank  by  two;  for  a  tensor  of  rank 
two  it  consists  in  forming  the  sum  of  all  the 
components  whose  indices  are  equal,  or,  if  the 
components  are  arranged  as  explained  before,  in 
taking  the  sum  of  all  the  components  in  the 
main  diagonal. 

The  operation  of  taking  the  divergence  of 
a  vector  (field)  may  be  stated  now  to  consist 
of  the  operation  of  differentiation  followed  by 
the  operation  of  contraction  applied  to  the  re- 
sulting tensor  of  rank  two. 

A tensor of rank three may be contracted, in general, in
three different ways, and a tensor of higher rank in as many
ways as there are pairs of indices. To contract a tensor with
respect to two of its indices means to take the sum of those
components in which the two selected indices have the same
values; for instance, $\partial M_{ij}/\partial x_k$ is a
tensor of rank three; its contraction with respect to the
first and the third indices is the sum
$\sum_i \partial M_{ij}/\partial x_i$; the index j is allowed
to take all the four values 1,2,3,4 so that we have four sums
which are considered as the components of a vector - the
divergence of the tensor $M_{ij}$.
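As a small illustration of the two kinds of contraction just described, the following Python sketch contracts a rank-two array along its main diagonal and a rank-three array over its first and third indices. The component values and the function names are hypothetical, chosen only so that the sums are easy to follow by hand.

```python
def contract_rank2(t):
    """Sum of the components whose two indices are equal (main diagonal)."""
    return sum(t[a][a] for a in range(len(t)))


def contract_first_third(t):
    """Contract t[i][j][k] over the first and third indices, leaving index j."""
    n = len(t)
    return [sum(t[a][j][a] for a in range(n)) for j in range(n)]


n = 4
# The Kronecker symbols of 3.4 as a square array.
delta = [[1 if i == j else 0 for j in range(n)] for i in range(n)]

# Hypothetical rank-three components t[i][j][k] = i + 10*j + 100*k.
t3 = [[[i + 10 * j + 100 * k for k in range(n)] for j in range(n)]
      for i in range(n)]

trace_delta = contract_rank2(delta)   # rank two -> rank zero
vector = contract_first_third(t3)     # rank three -> rank one

print(trace_delta)
print(vector)
```

The rank drops by two in each case: the sixteen Kronecker symbols give one number, and the 64 components of the rank-three array give four.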

We may finally mention the operation of multiplication,
which has been applied several times in what precedes. The
vector $\rho u_i = q_i$ has been obtained as a result of
multiplication of the vector $u_i$ by the scalar $\rho$. The
tensor $M_{ij}$ has been obtained by multiplying the vector
$q_i$ by the vector $u_j$ (every component of the first by
every component of the second - that is why we have to use
different indices - i and j are supposed to take,
independently of each other, the values 1,2,3,4).

The operation of contraction introduced above will be
performed very often; it is convenient, therefore, to simplify
our notation; this simplification consists in omitting the
symbol of summation, and in indicating that summation takes
place by using Greek letters for indices with respect to which
we sum. Thus we shall write

3.5    $\partial(\rho u_\alpha)/\partial x_\alpha = 0$    and    3.6    $\partial M_{\alpha j}/\partial x_\alpha$

for 2.21 and 2.65 respectively. The first gives an example of
a divergence of a vector, the second of a divergence of a
tensor.

The Greek index in the above formulas has no numerical
value; any other Greek letter would do just as well; in this
respect a Greek index may be compared with the variable of
integration in a definite integral. The only case when we have
to pay some attention to the particular Greek letters we are
using is when two (or more) summations occur in one expression
- in such a case different Greek letters have to be used for
every summation. If we have to write, for example,
$(\sum_i x_i y_i)^2$ using Greek indices we could write it as
$(x_\alpha y_\alpha)^2$, but if we want to write out the two
factors instead of using the exponent we have to write
$x_\alpha y_\alpha x_\beta y_\beta$, because
$x_\alpha y_\alpha x_\alpha y_\alpha$ would have meant
$\sum_i (x_i y_i)^2$.
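The difference between using two Greek letters and reusing one can be checked numerically. The following Python sketch uses hypothetical component values; each Greek letter of the text becomes one independent loop variable.

```python
x = [1.0, 2.0, 3.0]
y = [4.0, 5.0, 6.0]
n = len(x)

# x_a y_a x_b y_b: two independent summations, one per Greek letter.
two_letters = sum(x[a] * y[a] * x[b] * y[b]
                  for a in range(n) for b in range(n))

# x_a y_a x_a y_a: a single summation over the one repeated letter.
one_letter = sum(x[a] * y[a] * x[a] * y[a] for a in range(n))

# The square of the sum, computed directly for comparison.
square_of_sum = sum(x[a] * y[a] for a in range(n)) ** 2

print(two_letters, one_letter, square_of_sum)
```

Only the version with two distinct letters reproduces the square of the sum; the version with one letter gives the sum of the squares of the separate terms.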

The operation of contraction is used quite often. The
formation of the scalar product of two vectors $u_i$ and $v_i$
may be considered as resulting from their multiplication
followed by contraction; the multiplication gives the tensor
of the second rank $u_i v_j$, and contracting this we get
$u_\alpha v_\alpha = u_1 v_1 + u_2 v_2 + u_3 v_3$, which is the
scalar product; the scalar product of two vectors could be
also called the contracted product of the two vectors. In an
analogous fashion we can form a contracted product of two
tensors of the second rank. If the tensors are $a_{ij}$ and
$b_{ij}$ the contracted product will be
$a_{i\alpha} b_{\alpha j}$; it is also a tensor of rank two. It
may be interesting to note that the formation of the
contracted product of two tensors is essentially the same
operation as that of multiplying two determinants
corresponding to the arrays representing the tensors; to see
that, it will be enough to consider two three row determinants

\begin{vmatrix}
a_{11} & a_{12} & a_{13} \\
a_{21} & a_{22} & a_{23} \\
a_{31} & a_{32} & a_{33}
\end{vmatrix}
    and
\begin{vmatrix}
b_{11} & b_{12} & b_{13} \\
b_{21} & b_{22} & b_{23} \\
b_{31} & b_{32} & b_{33}
\end{vmatrix}

Their product, according to the theorem of multiplication of
determinants, is

\begin{vmatrix}
a_{1\alpha} b_{\alpha 1} & a_{1\alpha} b_{\alpha 2} & a_{1\alpha} b_{\alpha 3} \\
a_{2\alpha} b_{\alpha 1} & a_{2\alpha} b_{\alpha 2} & a_{2\alpha} b_{\alpha 3} \\
a_{3\alpha} b_{\alpha 1} & a_{3\alpha} b_{\alpha 2} & a_{3\alpha} b_{\alpha 3}
\end{vmatrix}

and it is seen that the elements of this determinant are the
components of the tensor of rank two which arises from the
tensors $a_{ij}$ and $b_{ij}$ by first multiplying them and
then contracting with respect to the two inside indices.

We  could  also  speak  of  the  contracted  square 
of  a  tensor,  meaning  by  this  the  contracted 
product  of  a  tensor  with  itself. 
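The connection between the contracted product and the multiplication of determinants can be tried on numbers. The sketch below is only an illustration with hypothetical three row arrays; `contracted_product` and `det3` are names introduced here, not taken from the text.

```python
def contracted_product(a, b):
    """Components a_{i alpha} b_{alpha j}: multiply, then contract the inside indices."""
    n = len(a)
    return [[sum(a[i][s] * b[s][j] for s in range(n)) for j in range(n)]
            for i in range(n)]


def det3(m):
    """Three row determinant, expanded along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


# Hypothetical components.
a = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]
b = [[2, 0, 1], [1, 1, 0], [0, 3, 1]]

c = contracted_product(a, b)
print(c)
print(det3(a), det3(b), det3(c))
```

The determinant of the contracted product equals the product of the two determinants, as the theorem of multiplication of determinants asserts.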


4.  Maxwell's  Equations. 

In  section  1  we  discussed  from  a  formal 
point  of  view  the  inverse  square  law  and  the 
fields  of  a  more  general  nature  which  can  be 
derived  from  it;  and  we  expressed  the  laws  of 
these  fields  in  terms  of  three-dimensional  ten- 
sor analysis,  i.e.,  we  employed  only  three  in- 
dependent variables;  after  that  we  found  that 
matter  is  best  discussed  (from  the  continuous 
point  of  view)  by  using  four-dimensional  ten- 
sor analysis.  We  have  thus  a  discrepancy:  two 
different  mathematical  tools  are  used  in  the 
treatment  of  the  two  sides  of  the  fundamental 
equations of mechanics. This discrepancy will be removed in
what follows; it will be removed by considering force fields
that differ from those derived by composition of inverse
square laws, by modifying in a sense this law; but the
modifications will be different in the two cases in which the
inverse square law has been applied in older physics, the two
cases which we are going to mention now.

Originally,  the  inverse  square  law  was  in- 
troduced in  the  time  of  Newton  in  application  to 
gravitational  forces;  we  shall  discuss  in  chap- 
ter V  the  gravitational  phenomena,  and  see  what 
modifications  -  radical  in  nature,   but  very 
slight  as  far  as  numerical  values  are  concerned 
-  the  inverse  square  law  will  suffer.  Later,  it 
has  been  recognized  that  the  inverse  square  law 
applies  also  to  the  electrostatic  and  magneto- 
static  fields  produced  by  one  single  electric, 
resp.  magnetic  particle.   Still  later  a  more 
general  law  for  electromagnetic  fields  has  been 
introduced  by  Faraday  and  Maxwell,   which  we 
shall  have  to  study  now. 

If X,Y,Z denote the components of electric force, then in
the static symmetric case, as we just said, the inverse square
law applies, and, as shown in section 1, it follows under the
assumption of additivity that for a field produced by any
number of particles the divergence vanishes,

4.1        $\partial X/\partial x + \partial Y/\partial y + \partial Z/\partial z = 0$,

and the quantities $\partial Y/\partial z - \partial Z/\partial y$,
$\partial Z/\partial x - \partial X/\partial z$,
$\partial X/\partial y - \partial Y/\partial x$ also vanish; a
static magnetic field does not interact with the electric
field, but when a changing magnetic field is present the laws
of the electric field are modified; viz., the quantities just
mentioned are not zero any more but are proportional or, in
appropriately chosen units, equal to the time derivatives of
the components L,M,N of the magnetic field, so that we have,
in addition,

4.11
        $\partial Y/\partial z - \partial Z/\partial y = \partial L/\partial t$
        $\partial Z/\partial x - \partial X/\partial z = \partial M/\partial t$
        $\partial X/\partial y - \partial Y/\partial x = \partial N/\partial t$.

In a similar fashion, the divergence of the magnetic force
vanishes,

4.2        $\partial L/\partial x + \partial M/\partial y + \partial N/\partial z = 0$,

and the expressions $\partial M/\partial z - \partial N/\partial y$,
$\partial N/\partial x - \partial L/\partial z$,
$\partial L/\partial y - \partial M/\partial x$ are
proportional to the time derivatives of the electric
components; the factor of proportionality, however, cannot be
reduced to one; by an appropriate choice of units it can be
reduced to minus one, and no changing of directions or sense
of coordinate axes can permit us to get rid of this minus sign
without introducing a minus sign in the preceding equations;
this minus sign is of extreme importance in what follows, as
we shall have occasion to observe many times; in the meantime
we write out the remaining equations

4.21
        $\partial M/\partial z - \partial N/\partial y = -\partial X/\partial t$
        $\partial N/\partial x - \partial L/\partial z = -\partial Y/\partial t$
        $\partial L/\partial y - \partial M/\partial x = -\partial Z/\partial t$.


Several  remarks  must  be  made  here  concern- 
ing these  equations,  which  will  be  referred  to 
as  Maxwell's  equations. 

In the first place, these equations cannot be proved; they
have to be regarded as the fundamental equations of a
mathematical theory, whose justification lies in the fact that
its quantities have been successfully identified with measured
quantities of Physics, in the sense that for physical
quantities the same relationships have been established
experimentally as those deduced for the corresponding
theoretical quantities from the fundamental equations. In the
second place, the equations as they appear above present a
simplified and idealized form of the fundamental equations,
namely the fundamental equations for the case of free space,
i.e., complete absence of matter.


In the third place, the choice of units which made the above
simple form possible concerns not only units of electric and
magnetic force, but also units of length and time; it was
necessary to choose them in such a way that the velocity of
light, which in ordinary units is 300,000 kilometers per
second, becomes one. As a result of this, ordinary velocities,
those we observe in everyday life, are expressed by very small
quantities.
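To get a feeling for how small everyday velocities become in these units, one may divide them by the velocity of light. The sketch below does this for two hypothetical examples (a vehicle at 30 meters per second and a body moving at about 30 kilometers per second); only the figure of 300,000 kilometers per second comes from the text.

```python
C_KM_PER_S = 300_000.0  # velocity of light in ordinary units, as quoted above


def in_light_units(v_km_per_s):
    """Express a velocity as a fraction of the velocity of light."""
    return v_km_per_s / C_KM_PER_S


vehicle = in_light_units(0.03)   # hypothetical: 30 m/s = 0.03 km/s
fast_body = in_light_units(30.0)  # hypothetical: about 30 km/s

print(vehicle, fast_body)
```

Both numbers are far below one, which is the sense in which ordinary velocities are "very small quantities".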

For our purposes it is convenient to arrange our equations
in the following form, where differentiation with respect to
a variable is indicated by a subscript,


4.3
        $N_y - M_z - X_t = 0$        $Z_y - Y_z + L_t = 0$
        $L_z - N_x - Y_t = 0$        $X_z - Z_x + M_t = 0$
        $M_x - L_y - Z_t = 0$        $Y_x - X_y + N_t = 0$
        $X_x + Y_y + Z_z = 0$        $L_x + M_y + N_z = 0$.
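The eight equations 4.3 can be tried numerically on a simple field. The Python sketch below assumes a hypothetical plane wave travelling along x, with Y = N = cos(x - t) and all other components zero, and evaluates the left hand sides by central differences; the field, the sample point, and the step size are illustrative choices, not part of the text.

```python
import math


def fields(x, y, z, t):
    """(X, Y, Z, L, M, N) for a hypothetical plane wave moving along x."""
    a = math.cos(x - t)
    return (0.0, a, 0.0, 0.0, 0.0, a)


def d(component, axis, p, h=1e-5):
    """Central-difference derivative of one field component along one axis."""
    lo, hi = list(p), list(p)
    lo[axis] -= h
    hi[axis] += h
    return (fields(*hi)[component] - fields(*lo)[component]) / (2 * h)


X, Y, Z, L, M, N = range(6)   # positions of the components in the tuple
ax, ay, az, at = range(4)     # positions of x, y, z, t in the point


def residuals(p):
    """Left hand sides of the eight equations 4.3 at the point p."""
    first = [d(N, ay, p) - d(M, az, p) - d(X, at, p),
             d(L, az, p) - d(N, ax, p) - d(Y, at, p),
             d(M, ax, p) - d(L, ay, p) - d(Z, at, p),
             d(X, ax, p) + d(Y, ay, p) + d(Z, az, p)]
    second = [d(Z, ay, p) - d(Y, az, p) + d(L, at, p),
              d(X, az, p) - d(Z, ax, p) + d(M, at, p),
              d(Y, ax, p) - d(X, ay, p) + d(N, at, p),
              d(L, ax, p) + d(M, ay, p) + d(N, az, p)]
    return first + second


worst = max(abs(r) for r in residuals((0.3, -1.2, 0.7, 0.5)))
print(worst)
```

All eight left hand sides vanish up to rounding error, so this wave, which moves with velocity one, is a solution in free space.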


As  mentioned  before  the  above  equations 
describe  the  behavior  of  electric  and  magnetic 
forces  in  free  space,  that  is  in  regions  where 
there  is  no  matter,  or  where  we  may  neglect  mat- 
ter. On  the  discrete  theory  of  matter  these 
equations  still  hold  everywhere  except  at  points 
occupied  by  matter  -  in  this  theory  matter  ap- 
pears as  singularity  of  the  field  and  some  nu- 
merical characteristics  of  matter,  such  as  elec- 
tric charge,  appear  as  residues  corresponding  to 
these  singularities.  We  shall  not  discuss  this 
point  of  view,  although  mathematically  it  is 
very  interesting.  On  the  continuous  theory  of 
matter  some  terms  which  represent  matter  have 
to be added to the preceding equations. The second set of
Maxwell's equations (4.3) remains unaltered, but the first
set is modified; the left hand sides do not vanish any more
but are proportional to the velocity components of matter;
the coefficient of proportionality is electric density, which
we denote by e. The equations of Maxwell for space with
matter are thus


4.31
        $N_y - M_z - X_t = eu$        $Z_y - Y_z + L_t = 0$
        $L_z - N_x - Y_t = ev$        $X_z - Z_x + M_t = 0$
        $M_x - L_y - Z_t = ew$        $Y_x - X_y + N_t = 0$
        $X_x + Y_y + Z_z = e$         $L_x + M_y + N_z = 0$.


We come thus across a new scalar quantity - electric density.
However, in most cases this density is proportional to the
mass density $\rho$ we have considered before, the factor of
proportionality being capable of only two numerical values -
one negative for negative electricity, and the other positive
for positive electricity.

Even  these  equations  are  not  sufficient  for 
the  description  of  electromagnetic  phenomena; 
they  correspond  to  a  certain  idealization  in 
which  the  dielectric  constant  and  magnetic  per- 
meability are  neglected,  but  we  shall  not  go 
beyond  this  idealization. 

In the above equations we have four independent variables
x,y,z,t, as in the discussion of matter in section 2, and we
may try now to apply to them the same notations which have
been introduced in that section and section 3. The main
question here is: how to treat the six quantities X,Y,Z,L,M,N?
The question was solved by Minkowski in 1907 in the following
way: it is clear that a vector has too few components to take
care of these quantities; instead of using two vectors,
Minkowski proposed to use a tensor of rank two; of course, a
tensor has too many components; to be exact, it has, in the
general case, 16 components - four in the main diagonal, six
above, and six below; we set those in the main diagonal zero,
and those under the main diagonal equal with opposite sign to
those above the main diagonal symmetric to them; in this way
we are left with six essentially different components; the
restriction just introduced is expressed in one formula


4.4        $F_{ij} + F_{ji} = 0$;

in fact, the elements in the main diagonal correspond to
equal indices; if we set j = i the above formula becomes
$F_{ii} + F_{ii} = 0$, whence $F_{ii} = 0$ as asserted. We try
now to identify the components of this tensor with our
electromagnetic force components in the following way:


4.5
        $X = F_{41}$,   $Y = F_{42}$,   $Z = F_{43}$,
        $L = F_{23}$,   $M = F_{31}$,   $N = F_{12}$;

using these notations together with 4.4, according to which,
for example, $F_{14} = -X$, we can write the first set 4.3 in
the highly satisfactory form,


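4.6
        $\partial F_{12}/\partial x_2 + \partial F_{13}/\partial x_3 + \partial F_{14}/\partial x_4 = 0$
        $\partial F_{23}/\partial x_3 + \partial F_{21}/\partial x_1 + \partial F_{24}/\partial x_4 = 0$
        $\partial F_{31}/\partial x_1 + \partial F_{32}/\partial x_2 + \partial F_{34}/\partial x_4 = 0$
        $\partial F_{41}/\partial x_1 + \partial F_{42}/\partial x_2 + \partial F_{43}/\partial x_3 = 0$

or        $\partial F_{i\alpha}/\partial x_\alpha = 0$.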


These four equations show a high degree of symmetry;
moreover they show a very pronounced similarity to some of the
equations we have been considering in section 2, and for which
we prepared a mathematical theory in section 3; we can say
that the four equations written above express the fact that
the divergence of the tensor $F_{ij}$ just introduced vanishes
in the case of free space. However, if we apply the same
notations to the second set 4.3 of Maxwell's equations nothing
very simple comes out; the minus sign mentioned after formula
4.2 above seems to cause trouble; but there exists a way out
also from this difficulty; it has been indicated (before
Minkowski's paper) in the work of Poincaré and Marcolongo, and
foreshadowed in a private letter by Hamilton as early as 1845.
We can overcome the difficulty if we allow ourselves to use
imaginary quantities side by side with real quantities - this
ought to cause no worry provided we know the formal rules of
operations, since our new notations are of an entirely formal
nature anyway; we set now instead of 2.5


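4.7        $x = x_1$,   $y = x_2$,   $z = x_3$,   $it = x_4$

and instead of 4.5

4.72
        $iX = F_{41}$,   $iY = F_{42}$,   $iZ = F_{43}$,
        $L = F_{23}$,    $M = F_{31}$,    $N = F_{12}$,

and then the first set (4.3) becomes (4.6) as before, but the
second set (4.3) also acquires a highly satisfactory form,
namely

4.61
        $\partial F_{23}/\partial x_4 + \partial F_{34}/\partial x_2 + \partial F_{42}/\partial x_3 = 0$
        $\partial F_{34}/\partial x_1 + \partial F_{41}/\partial x_3 + \partial F_{13}/\partial x_4 = 0$
        $\partial F_{41}/\partial x_2 + \partial F_{12}/\partial x_4 + \partial F_{24}/\partial x_1 = 0$
        $\partial F_{12}/\partial x_3 + \partial F_{23}/\partial x_1 + \partial F_{31}/\partial x_2 = 0$

or        $\partial F_{ij}/\partial x_k + \partial F_{jk}/\partial x_i + \partial F_{ki}/\partial x_j = 0$.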


As mentioned before we consider the components $F_{ij}$ as
the components of a tensor. We may say that we have sixteen
of them: the six which appear in the relations 4.72, six more
which result from them by interchanging the indices and whose
values differ from those given in 4.72 only in sign, and four
more with equal indices, which according to the formula (4.4)
are zero. We may arrange them in a square array as follows:


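4.8
\begin{matrix}
F_{11} & F_{12} & F_{13} & F_{14} \\
F_{21} & F_{22} & F_{23} & F_{24} \\
F_{31} & F_{32} & F_{33} & F_{34} \\
F_{41} & F_{42} & F_{43} & F_{44}
\end{matrix}
    =
\begin{matrix}
0 & N & -M & -iX \\
-N & 0 & L & -iY \\
M & -L & 0 & -iZ \\
iX & iY & iZ & 0
\end{matrix}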


We may compare the property $F_{ij} = -F_{ji}$ which our new
tensor has with the property $M_{ij} = M_{ji}$ possessed by
the tensor of matter (2.7) and which is simply the result of
commutativity of multiplication. These two properties are
manifested in the square arrays (3.1 and 4.8) in that the
components of $M_{ij}$ which are symmetric with respect to the
main diagonal are equal, and those of $F_{ij}$ which are
symmetric with respect to the main diagonal are opposite.
Tensors of the first type are called symmetric, those of the
second - antisymmetric.
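The array 4.8 and its antisymmetry are easy to reproduce with complex numbers. In the following Python sketch the function name `field_tensor` and the numerical field values are hypothetical; the placement of the components follows 4.72 and 4.4.

```python
def field_tensor(X, Y, Z, L, M, N):
    """Square array of F_ij built from 4.72 together with F_ij = -F_ji."""
    i = 1j
    return [[0, N, -M, -i * X],
            [-N, 0, L, -i * Y],
            [M, -L, 0, -i * Z],
            [i * X, i * Y, i * Z, 0]]


# Hypothetical field values.
F = field_tensor(1.0, 2.0, 3.0, 4.0, 5.0, 6.0)

# Property 4.4: F_ij + F_ji = 0 for every pair of indices.
antisymmetric = all(F[a][b] + F[b][a] == 0
                    for a in range(4) for b in range(4))

print(antisymmetric)
print(F[3][0], F[1][2])   # F_41 = iX and F_23 = L
```

Python indices run from 0 to 3, so `F[3][0]` stands for $F_{41}$.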

We want to see now whether the notations which permitted us
to write Maxwell's equations in a nice form would not spoil
the nice form which we previously gave to the hydrodynamical
equations. But since here we have at our disposal the
quantities $u_1, u_2, u_3, u_4$ we can arrange it so as to
offset the i in the $x_4$. In fact, since differentiation with
respect to time occurs always in the presence of a $u_4$, it
is enough to set

4.9        $u_4 = i$

instead of 1, as in 2.41, and everything will be all right, as
far as the left hand sides of the equations are concerned,
except that the left hand side of the continuity equation
becomes imaginary. This, however, does not matter, since the
right hand sides of the equations we temporarily disregard.

And now we may consider the Maxwell equations with matter
4.31; the second set is not affected and may be written as
4.61, but the first will appear in a form which may be
written simply as

4.62        $\partial F_{i\alpha}/\partial x_\alpha = e u_i$.

The equation of continuity of matter is a consequence of
these equations; to obtain it differentiate, contract and use
the property $F_{ij} = -F_{ji}$; the result gives the
continuity equation of matter if we take into account that
$\rho/e$ is a constant.
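The step "differentiate, contract and use the antisymmetry" rests on one algebraic fact: an antisymmetric array contracted on both indices with a symmetric one gives zero, and second derivatives are symmetric in the two variables of differentiation. The Python sketch below checks that fact on hypothetical numbers; the arrays `A` and `S` are stand-ins for the antisymmetric tensor and for a symmetric array of second derivatives.

```python
# Hypothetical antisymmetric array (A_ij = -A_ji).
A = [[0, 1, -2, 3],
     [-1, 0, 4, -5],
     [2, -4, 0, 6],
     [-3, 5, -6, 0]]

# Hypothetical symmetric array (S_ij = S_ji).
S = [[1, 2, 3, 4],
     [2, 5, 6, 7],
     [3, 6, 8, 9],
     [4, 7, 9, 10]]

# Double contraction A_ab S_ab: every term is cancelled by its mirror image.
total = sum(A[a][b] * S[a][b] for a in range(4) for b in range(4))
print(total)
```

Applied to 4.62, this says that the doubly contracted second derivative of the antisymmetric tensor vanishes, so the divergence of $e u_i$ vanishes as well.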


5.  The  Stress-Energy  Tensor. 

So  far  the  equations  to  which  we  subjected 
force  components  have  been  linear  equations, 
whereas  operations  performed  on  matter  involved 
squares  and  products  of  matter  components;  the 
similarity  which  we  observed  in  the  mathematical 
aspects  of  force  and  matter  components  makes  it 
seem  desirable  to  subject  force  components  to 
operations  analogous  to  those  which  we  applied 
to  matter,  viz.,  multiplication. 

In the static case we return to our notations (1.72)
$X_1, X_2, X_3$ for X,Y,Z and form the tensor $X_i X_j$; this
tensor, slightly modified, plays an important part in the
theory; the modification consists in subtracting from it
$\tfrac12 \delta_{ij} X_\sigma X_\sigma$, where $\delta_{ij}$
are the components of the tensor introduced in 3.4 and
$X_\sigma X_\sigma$ stands for the contracted square of the
vector $X_i$, i.e., $X^2 + Y^2 + Z^2$. We consider then the
tensor $X_i X_j - \tfrac12 \delta_{ij} X_\sigma X_\sigma$,
whose array is

\begin{matrix}
\tfrac12(X^2 - Y^2 - Z^2) & XY & XZ \\
XY & \tfrac12(Y^2 - X^2 - Z^2) & YZ \\
XZ & YZ & \tfrac12(Z^2 - X^2 - Y^2)
\end{matrix}

Of this tensor we form the divergence (three-dimensional),
and find as its components

        $X(X_x + Y_y + Z_z) + Y(X_y - Y_x) + Z(X_z - Z_x)$,
        $X(Y_x - X_y) + Y(X_x + Y_y + Z_z) + Z(Y_z - Z_y)$,
        $X(Z_x - X_z) + Y(Z_y - Y_z) + Z(X_x + Y_y + Z_z)$;

the connection of these expressions with the Newtonian
equations (1.51, 1.52) is obvious; the expressions in brackets
are the left hand sides of the Newtonian equations, so that
the divergence of our new tensor
$X_i X_j - \tfrac12 \delta_{ij} X_\sigma X_\sigma$ vanishes as
the result of the Newtonian equations. This again confirms us
in our opinion that from the mathematical point of view force
and matter components are of very similar nature. We have now
in mind electric and magnetic forces; if X,Y,Z are the
components of the electric force vector the tensor whose array
is written out above is called the "electric stress tensor";
an analogous expression in magnetic components is called the
"magnetic stress tensor"; the sum of the two, namely

\begin{matrix}
\tfrac12(X^2 + L^2 - Y^2 - Z^2 - M^2 - N^2) & XY + LM & XZ + LN \\
XY + LM & \tfrac12(Y^2 + M^2 - X^2 - Z^2 - L^2 - N^2) & YZ + MN \\
XZ + LN & YZ + MN & \tfrac12(Z^2 + N^2 - X^2 - Y^2 - L^2 - M^2)
\end{matrix}


is called the "electromagnetic stress tensor"; it has been
introduced by Maxwell and plays some part in electromagnetic
theory, for instance, in the discussion of light pressure; but
its main applications and importance seem to be in the study
of the fundamental questions, as part of a more general
four-dimensional tensor.
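The vanishing of the divergence of the electric stress tensor for an inverse square field can be checked numerically. The Python sketch below assumes a single particle of unit strength at the origin, so that the force components are $x_i/r^3$, and forms the three-dimensional divergence by central differences at a hypothetical point away from the particle; the function names, the point, and the step size are illustrative choices.

```python
def force(p):
    """Inverse square field of a unit particle at the origin (an assumption)."""
    x, y, z = p
    r3 = (x * x + y * y + z * z) ** 1.5
    return (x / r3, y / r3, z / r3)


def stress(p):
    """Array X_i X_j - (1/2) delta_ij X_s X_s at the point p."""
    f = force(p)
    square = sum(c * c for c in f)
    return [[f[i] * f[j] - (0.5 * square if i == j else 0.0)
             for j in range(3)] for i in range(3)]


def divergence(p, h=1e-4):
    """Three components of the divergence, by central differences."""
    out = []
    for i in range(3):
        total = 0.0
        for j in range(3):
            lo, hi = list(p), list(p)
            lo[j] -= h
            hi[j] += h
            total += (stress(hi)[i][j] - stress(lo)[i][j]) / (2 * h)
        out.append(total)
    return out


div = divergence((1.0, 0.5, -0.7))
print(div)
```

The three components come out zero up to the error of the finite differences, in agreement with the statement that the divergence vanishes as a result of the Newtonian equations.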

We saw how nicely the system of indices worked in the case
of Maxwell's equations; it is natural to express this tensor
in index notation also. We assert that the required expression
is given by


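5.1        $E_{ij} = F_{i\rho} F_{\rho j} - \tfrac14 \delta_{ij} F_{\sigma\rho} F_{\rho\sigma}$,

where i,j take on values 1,2,3, and the summations indicated
by $\rho$ and $\sigma$ are extended from 1 to 4. In fact,

$F_{\sigma\rho} F_{\rho\sigma} = 2(F_{12}F_{21} + F_{13}F_{31} + F_{23}F_{32} + F_{14}F_{41} + F_{24}F_{42} + F_{34}F_{43})$
        $= 2(-L^2 - M^2 - N^2 + X^2 + Y^2 + Z^2)$,
$F_{1\rho} F_{\rho 1} = F_{12}F_{21} + F_{13}F_{31} + F_{14}F_{41} = -N^2 - M^2 + X^2$,
$E_{11} = -N^2 - M^2 + X^2 + \tfrac12 L^2 + \tfrac12 M^2 + \tfrac12 N^2 - \tfrac12 X^2 - \tfrac12 Y^2 - \tfrac12 Z^2$
        $= \tfrac12(X^2 + L^2 - Y^2 - Z^2 - M^2 - N^2)$,

and similarly for the other components corresponding to
different combinations of the indices 1,2,3 for i and j.
There seems to be an inconsistency here; the summation indices
$\rho$ and $\sigma$ we let run from 1 to 4, but we consider
only the values 1,2,3 for i and j. It is interesting to see
what comes out if we let i and j take the value 4. We get four
new components, namely,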

$E_{14} = F_{1\rho} F_{\rho 4} = F_{12}F_{24} + F_{13}F_{34} = i(MZ - NY)$,
$E_{24} = F_{2\rho} F_{\rho 4} = F_{21}F_{14} + F_{23}F_{34} = i(NX - LZ)$,
$E_{34} = F_{3\rho} F_{\rho 4} = F_{31}F_{14} + F_{32}F_{24} = i(LY - MX)$,
$E_{44} = F_{4\rho} F_{\rho 4} - \tfrac12(-L^2 - M^2 - N^2 + X^2 + Y^2 + Z^2)$
        $= \tfrac12(X^2 + Y^2 + Z^2 + L^2 + M^2 + N^2)$.

These quantities happen also to have physical meaning. The
first three constitute (except for the factor -i) the
components of the so-called Poynting vector, and the last one
is the so-called electromagnetic energy (or energy density).
We are thus led by the notations we have introduced in a
purely mathematical way to some physical quantities; we may
say that the entire tensor with its sixteen components - it is
called "the electromagnetic stress-energy tensor" - unifies in
a single expression all the second degree quantities appearing
in the electromagnetic theory: the stress components, the
Poynting components, and the energy.

The stress-energy tensor may be written out in the form of
the following square array:

5.2
\begin{matrix}
X^2 + L^2 - h & XY + LM & XZ + LN & i(MZ - NY) \\
XY + LM & Y^2 + M^2 - h & YZ + MN & i(NX - LZ) \\
XZ + LN & YZ + MN & Z^2 + N^2 - h & i(LY - MX) \\
i(MZ - NY) & i(NX - LZ) & i(LY - MX) & h
\end{matrix}

5.3  where $h = \tfrac12(X^2 + Y^2 + Z^2 + L^2 + M^2 + N^2)$.
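Formula 5.1 can be evaluated mechanically from the array 4.8. The Python sketch below does so for hypothetical field values and compares a few of the resulting components (the stress components and the energy) with the expressions in terms of X,Y,Z,L,M,N and h; the function names are introduced here for illustration only.

```python
def field_tensor(X, Y, Z, L, M, N):
    """Square array of F_ij as in 4.8."""
    i = 1j
    return [[0, N, -M, -i * X],
            [-N, 0, L, -i * Y],
            [M, -L, 0, -i * Z],
            [i * X, i * Y, i * Z, 0]]


def stress_energy(F):
    """E_ij = F_ir F_rj - (1/4) delta_ij F_sr F_rs, all sums from 1 to 4."""
    n = 4
    full = sum(F[s][r] * F[r][s] for s in range(n) for r in range(n))
    return [[sum(F[i][r] * F[r][j] for r in range(n))
             - (0.25 * full if i == j else 0)
             for j in range(n)] for i in range(n)]


# Hypothetical field values.
X, Y, Z, L, M, N = 1.0, 2.0, 3.0, 4.0, 5.0, 6.0
E = stress_energy(field_tensor(X, Y, Z, L, M, N))
h = 0.5 * (X * X + Y * Y + Z * Z + L * L + M * M + N * N)

print(E[0][0], X * X + L * L - h)   # first stress component
print(E[0][1], X * Y + L * M)       # a mixed stress component
print(E[3][3], h)                   # the energy
```

The computed array is symmetric, its components with indices 1,2,3 reproduce the electromagnetic stress tensor, and the component with both indices equal to 4 is h.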

6.  General  Equations  of  Motion. 
The  Complete  Tensor. 


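We let ourselves be guided once more by what seems to be
natural from the formal point of view, and form the divergence
(four-dimensional divergence) of the new tensor. This can be
done either in components or in index notation. We show how to
do it the latter way, and leave it to the reader to write out
the stress-energy tensor as an array and to form the
divergence of the separate lines. Applying formula 3.6 to the
tensor 5.1 we have

$\partial E_{i\alpha}/\partial x_\alpha = (\partial F_{i\gamma}/\partial x_\alpha) F_{\gamma\alpha} + F_{i\gamma}\, \partial F_{\gamma\alpha}/\partial x_\alpha - \tfrac12 F_{\beta\alpha}\, \partial F_{\alpha\beta}/\partial x_i$;

the first term on the right may be split up in two equal
parts, one of which, writing $\beta$ for $\gamma$, may be
written as $\tfrac12 (\partial F_{i\beta}/\partial x_\alpha) F_{\beta\alpha}$,
and the other, writing $\alpha$ for $\gamma$ and $\beta$ for
$\alpha$, and interchanging indices in both factors, which
does not affect the value because it amounts to changing the
sign twice, takes the form
$\tfrac12 (\partial F_{\alpha i}/\partial x_\beta) F_{\beta\alpha}$.
We thus have

$\partial E_{i\alpha}/\partial x_\alpha = \tfrac12 F_{\beta\alpha} (\partial F_{i\beta}/\partial x_\alpha + \partial F_{\alpha i}/\partial x_\beta + \partial F_{\beta\alpha}/\partial x_i) + F_{i\gamma}\, \partial F_{\gamma\alpha}/\partial x_\alpha$.

Substituting for the second factors their values from
Maxwell's equations 4.61 and 4.62 (in space with matter) we
get

$\partial E_{i\alpha}/\partial x_\alpha = e F_{i\gamma} u_\gamma$,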

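or in components without indices

6.1
        $\partial E_{1\alpha}/\partial x_\alpha = e(Nv - Mw + X)$,
        $\partial E_{2\alpha}/\partial x_\alpha = e(Lw - Nu + Y)$,
        $\partial E_{3\alpha}/\partial x_\alpha = e(Mu - Lv + Z)$,
        $\partial E_{4\alpha}/\partial x_\alpha = ei(Xu + Yv + Zw)$.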
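The right hand sides of 6.1 are the four sums obtained by contracting the array 4.8 with the quantities u, v, w, i (see 4.9) and multiplying by e. The Python sketch below evaluates these sums for hypothetical values of the fields, the velocity, and the electric density, and compares them with the expressions written without indices; the function names are illustrative.

```python
def field_tensor(X, Y, Z, L, M, N):
    """Square array of F_ij as in 4.8."""
    i = 1j
    return [[0, N, -M, -i * X],
            [-N, 0, L, -i * Y],
            [M, -L, 0, -i * Z],
            [i * X, i * Y, i * Z, 0]]


def force_density(e, F, u, v, w):
    """The four sums e * F_ir u_r, with u_4 = i as in 4.9."""
    velocity = (u, v, w, 1j)
    return [e * sum(F[a][r] * velocity[r] for r in range(4))
            for a in range(4)]


# Hypothetical values.
X, Y, Z, L, M, N = 1.0, 2.0, 3.0, 4.0, 5.0, 6.0
u, v, w = 0.1, 0.2, 0.3
e = 2.0

f = force_density(e, field_tensor(X, Y, Z, L, M, N), u, v, w)
expected = [e * (N * v - M * w + X),
            e * (L * w - N * u + Y),
            e * (M * u - L * v + Z),
            e * 1j * (X * u + Y * v + Z * w)]

for got, want in zip(f, expected):
    print(got, want)
```

The first three sums are real and the fourth is purely imaginary, matching the factor i in the last line of 6.1.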

These  expressions  obtained  by  us  in  a 
purely  formal  way  are  known  to  possess  physical 
significance:  the  first  three  represent  the 
components  of  the  force  exerted  by  an  electro- 
magnetic field  on  a  body  of  electric  charge  den- 
sity e,  and  the  last  (if  we  neglect  the  factor 
i)  is  the  rate  at  which  energy  is  expended  by 
the  field  in  moving  the  body.  The  first  three 
expressions  obviously  give  the  right  hand  sides 
of  our  hydrodynamic  equations.  Since  the  left 
hand  sides  also  (2.61,  2.62)  have  been  obtained 
as  divergence  components  (3.6)  we  may  write 
these  equations  in  an  extremely  simple  form,  if 


we introduce a new tensor, the difference of the two
appearing on the left and right sides, viz.,

6.2        $T_{ij} = M_{ij} - E_{ij}$.

Our hydrodynamical equations are then simply

6.3        $\partial T_{i\alpha}/\partial x_\alpha = 0$    $(i = 1,2,3)$.


This seems to be very satisfactory, but there is an
unpleasant feature about it, namely, that for i = 4 we do not
seem to get a correct equation: the contribution of the tensor
$M_{ij}$ gives the left hand side of the continuity equation,
and is, therefore, zero; but the contribution of the
stress-energy tensor is the work performed by the forces on
the particle and is, in general, not zero. The source of this
unpleasantness and the way to remove it will be clear after
the reader becomes acquainted with the contents of the next
chapter.

It is time now to cast a glance on the situation as it has
been worked out till now. We have ten fundamental quantities,
$\rho$,u,v,w,X,Y,Z,L,M,N. They satisfy certain equations: the
Maxwell equations (4.61, 4.62), the equation of continuity
(3.5), the equations of motion (6.3); in the last named
equations our ten quantities enter in certain combinations
which are the components of the tensor $T_{ij}$. This tensor
appears then as a very fundamental one. It may be asked
whether it determines the ten quantities which enter into it;
if it does, all the quantities we have been considering are,
in a general sense, components of one entity, the tensor
$T_{ij}$, and all the equations we have introduced express
properties of this tensor - that part of Physics which we are
discussing in this book, with the exception of the
gravitational field, appears then as the study of the tensor
$T_{ij}$. It can be proved that $T_{ij}$ with certain
restrictions determines the quantities
$\rho$,u,v,w,X,Y,Z,L,M,N, and in Chapter V it will be shown
that the gravitational phenomena also are taken care of by it.




Chapter  II. 


NEW  GEOMETRY 


In the preceding chapter we achieved, by introducing
appropriate notations, a great simplicity and uniformity in
our formulas. The notations in which indices take the values
from 1 to 4 are modeled after those previously introduced in
ordinary geometry, the two points of distinction being first,
that we have four independent variables instead of the three
coordinates, and second, that the fourth variable is assigned
imaginary values. In spite of these distinctions the analogy
with ordinary geometry is very great, and we shall profit very
much by pushing this analogy as far as possible, and using
geometrical language, as well as notations modeled after those
of geometry.

Physics  seems  to  require  then,  a  mathema- 
tical theory  analogous  to  geometry  and  differing 
from  it  only  in  that  it  must  contain  four  coor- 
dinates, one  of  which  is  imaginary.  The  first 
purpose  of  this  chapter  will  be  to  build  a  the- 
ory to  these  specifications.  The  remaining  part 
of  the  chapter  will  be  devoted  to  a  more  syste- 
matic treatment  of  tensor  analysis. 


7.  Analytic  Geometry  of  Four  Dimensions. 

In the present section we shall give a brief outline of
properties which we may expect from a four-dimensional
geometry, guided by analogy with two and three-dimensional
geometries; of course, we shall lay stress mainly on those
features which we shall need for the application to Physics
that we have in mind. In this outline we shall disregard the
fact that our fourth coordinate must be imaginary; certain
peculiarities connected with this circumstance will be treated
in section 10.

The equations of a straight line we expect
to be written in the form

7.1    $\frac{x_1 - a_1}{v_1} = \frac{x_2 - a_2}{v_2} = \frac{x_3 - a_3}{v_3} = \frac{x_4 - a_4}{v_4}$

entirely similar to that used in solid analytic
geometry; but we may also use another form; as
written out the equations state that for every
point of the line the four ratios have the same
value; denoting this value by $p$ we may express
the condition that a point belongs to the line
by stating that its coordinates may be written
as

7.11    $x_1 = a_1 + p v_1, \quad x_2 = a_2 + p v_2, \quad x_3 = a_3 + p v_3, \quad x_4 = a_4 + p v_4;$

giving here to $p$ different values we obtain (for
given $a_i$, $v_i$) the coordinates of all the different
points of the line. The variable $p$ is
called the parameter, and this whole way of describing
a line is called "parametric representation".
Parametric representation is by no
means peculiar to four dimensions; it may be,
and is, often used in plane and solid analytic
geometry. We present it here because we shall
need it later, and it is not always sufficiently
emphasized.
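The parametric form 7.11 translates directly into a few lines of code. The following is a minimal sketch in Python; the function names `direction` and `line_point` and the sample points are illustrative assumptions, not part of the text.

```python
def direction(a, b):
    """Direction vector of the line through the points a and b: v_i = b_i - a_i."""
    return [b_i - a_i for a_i, b_i in zip(a, b)]


def line_point(a, v, p):
    """Point of the line for parameter value p: x_i = a_i + p * v_i."""
    return [a_i + p * v_i for a_i, v_i in zip(a, v)]


a = [1.0, 0.0, 2.0, -1.0]   # illustrative point
b = [3.0, 4.0, 2.5, 1.0]    # illustrative point
v = direction(a, b)

# p = 0 gives the point a, p = 1 gives the point b,
# other values of p give the remaining points of the line.
start = line_point(a, v, 0.0)
end = line_point(a, v, 1.0)
mid = line_point(a, v, 0.5)

# For a point of the line the four ratios (x_i - a_i) / v_i have the same value p.
# (This check needs every v_i to be different from zero.)
ratios = [(x_i - a_i) / v_i for x_i, a_i, v_i in zip(mid, a, v)]
```

The symmetric form with four equal ratios breaks down when some component of the direction vector is zero, while the parametric form does not; this is one practical reason to prefer the latter in computation.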

A straight line is determined by two
points; the equations of the line through the
points $a_i$ and $b_i$ are given by the above equations
(7.1) in which

7.2    $v_1 = b_1 - a_1, \quad v_2 = b_2 - a_2,$ etc.

Two points determine a directed segment or
vector, whose components are the differences
between the corresponding coordinates of the
points, so that $v_i$ is the component of the vector
whose initial point is given by $a_i$ and
whose final point is given by $b_i$.

A  vector  determined  by  two  points  of  a 
straight  line  is  said  to  belong  to  that  line, 
and  we  may  say  that  we  can  use  as  denominators 
in  the  equations  7.1,  or  as  coefficients  of  p 
in  7.11  the  components  of  any  vector  belonging 
to  the  straight  line. 

Two  vectors  are  considered  equal  if  they 
have  equal  components;  a  vector  is  multiplied 
by  a  number  by  multiplying  its  components  by 
that  number,  and  two  vectors  are  added  by  add- 
ing their  corresponding  components. 

Two lines are parallel if they contain
equal vectors, and it is easy to see that a
condition for parallelism of line 7.11 and

7.3    $\frac{X_1 - A_1}{V_1} = \frac{X_2 - A_2}{V_2} = \frac{X_3 - A_3}{V_3} = \frac{X_4 - A_4}{V_4}$

is

$v_1/V_1 = v_2/V_2 = v_3/V_3 = v_4/V_4$ or $V_i = \alpha v_i$

where $\alpha$ is a number, so that proportionality of
components of two vectors means parallelism.

A condition for perpendicularity of these
two lines we expect to be the vanishing of the
expression

7.4    $v_1 V_1 + v_2 V_2 + v_3 V_3 + v_4 V_4$

which is called the scalar product of the two
vectors $v_i$ and $V_i$.

The distance between the points $a_1, a_2, a_3, a_4$
and $b_1, b_2, b_3, b_4$ is given by the square
root of the expression

7.41    $(a_1 - b_1)^2 + (a_2 - b_2)^2 + (a_3 - b_3)^2 + (a_4 - b_4)^2;$

this distance is also considered as the length
of the vector joining $a_i$ and $b_i$. The expression
for the square of the length of a vector
may be considered as a special case of the expression
7.4. We may say then, that the square
of the length of a vector is the square of the
vector, i.e., the product of the vector with
itself.

We  shall  often  use  Roman  letters  to  denote 
vectors.  The  scalar  product  of  the  vectors  x 
and  y  will  be  denoted  by  x.y  and  the  square  of 
the vector $x$ by $x^2$.

A  vector  whose  length,  or  whose  square,  is 
unity  we  shall  call  a  unit  vector.  Its  compo- 
nents, we  would  expect,  may  be  considered  as 
the  direction  cosines  of  the  line  on  which  the 
vector lies. We also expect that the scalar
product of two vectors is equal to the product
of their lengths times the cosine of the angle
between them; but, of course, an angle between
two vectors in four-dimensional space has not
been defined, so that we could simply define
the angle between two vectors by this property,
or by the formula

7.5    $\cos \varphi = \frac{x \cdot y}{\sqrt{x^2}\,\sqrt{y^2}}$

But if we want the angle to be a real quantity
the absolute value of this expression cannot exceed
unity, or

$(x \cdot y)^2 \le x^2 y^2$ or $(x \cdot y)^2 - x^2 y^2 \le 0.$

If we form the vector $\lambda x + \mu y$ where $\lambda$ and $\mu$ are
two numbers, the square of that vector would be

$\lambda^2 x^2 + 2\lambda\mu\, x \cdot y + \mu^2 y^2$

and the above inequality, which expresses the
fact that the discriminant of this expression is
not positive, is seen to be a consequence of the
assumption that a square of a vector is never
negative.
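The scalar product 7.4, the length, the angle 7.5 and the inequality just discussed can be checked numerically. The sketch below is only an illustration; the helper names and the two sample vectors are assumptions made for the example.

```python
import math


def dot(x, y):
    """Scalar product: sum of the products of corresponding components."""
    return sum(x_i * y_i for x_i, y_i in zip(x, y))


def length(x):
    """Length of a vector: the square root of its square x.x."""
    return math.sqrt(dot(x, x))


def angle(x, y):
    """Angle defined by cos(phi) = x.y / (|x| |y|); both vectors must be non-zero."""
    cos_phi = dot(x, y) / (length(x) * length(y))
    # Guard against rounding pushing the ratio slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos_phi)))


x = [1.0, 2.0, 0.0, 2.0]   # illustrative vectors
y = [2.0, 1.0, 0.0, 2.0]

# x^2 y^2 - (x.y)^2 is never negative, so the cosine is real and at most 1 in size.
gap = dot(x, x) * dot(y, y) - dot(x, y) ** 2

# The square of lam*x + mu*y, computed directly and from the expanded expression.
lam, mu = 2.0, -3.0
z = [lam * x_i + mu * y_i for x_i, y_i in zip(x, y)]
direct = dot(z, z)
expanded = lam ** 2 * dot(x, x) + 2 * lam * mu * dot(x, y) + mu ** 2 * dot(y, y)
```

Here both vectors have length 3 and scalar product 8, so the cosine of the angle is 8/9.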

A  plane  we  would  expect  to  be  determined  by 
three  points  not  in  a  line,  or  by  two  vectors 
with  the  same  initial  point  or  by  two  lines 
through  a  point.  Instead  of  characterizing  a 
plane by equations we prefer to give it in parametric
form; limiting ourselves to a plane
through the origin we have

7.12    $x_i = p a_i + q b_i$

or

$x_1 = p a_1 + q b_1, \quad x_2 = p a_2 + q b_2, \quad x_3 = p a_3 + q b_3, \quad x_4 = p a_4 + q b_4,$

where $a_i$ and $b_i$ are the coordinates of two
points in the plane or the components of two
vectors of the plane whose initial points may
be considered as at the origin. We shall write
this formula also as

$x = \alpha a + \beta b$

where $x$, $a$, $b$ stand for vectors whose components
are $x_i$, $a_i$, $b_i$ and where we use Greek
letters for parameters in order to avoid confusion
with vectors which we denote now by Roman
letters.

We always can choose two mutually perpendicular
unit vectors as the two vectors determining
a plane; if we call these vectors $i$ and
$j$ the preceding formula becomes

7.6    $x = \alpha i + \beta j.$

The fact that $i$ and $j$ are perpendicular
unit vectors may be written as

$i \cdot j = 0, \quad i^2 = j^2 = 1.$

It is easy to see that in this case $\alpha$ and
$\beta$ are the projections of $x$ on the directions of
$i$ and $j$, or the scalar products of $x$ with $i$ and
$j$ respectively.

Every  pair  of  coordinate  axes  determines  a 
plane  and  since  six  pairs  can  be  formed  from 
four  objects  we  have  six  coordinate  planes. 

In  the  same  way  that  the  direction  of  a 
straight  line  is  determined  by  a  configuration 
of  two  points  on  it  -  a  vector,  the  "orienta- 
tion" of  a  plane  may  be  determined  by  the  con- 
figuration of  three  points  on  it  -  a  triangle. 
A  vector  is  given  by  its  components,  which  are 
the  lengths  of  its  projections  on  the  coordi- 
nate axes;  in  the  same  way  a  triangle  may  be 
characterized  (to  a  certain  extent)  by  the  areas 
of  its  projections  on  the  coordinate  planes.  If, 
for  example,  we  take  a  triangle  one  of  whose 
vertices  is  at  the  origin  0,0,0,0  and  the  two 
others  at  the  points  Xi  and  y±  respectively,  the 
areas  of  the  projections  will  be  the  six  quan- 
tities 


It is interesting to compare these numbers,
which satisfy the relation

$F_{ij} + F_{ji} = 0$

with the components of the tensor $F_{ij}$ (compare
4.4), three of which have been identified
with the electric, and the other three with the
magnetic force components. Our ten fundamental
quantities $\rho u, \rho v, \rho w, \rho, X, Y, Z, L, M, N$ seem
to allow thus a geometrical interpretation; the
first four are considered as the four projections
of a part of a straight line on the coordinate
axes, the remaining six, as the projections
of a part of a plane on the coordinate
planes.

A little against our expectations, however,
these six quantities $F_{ij}$ are not independent;
the reader will easily verify, using the above
expressions, that

7.7    $F_{12} F_{34} + F_{13} F_{42} + F_{14} F_{23} = 0.$

We have here a relation that exists in the mathematical
theory; at once the question arises:
does a corresponding relation hold for the corresponding
quantities in the physical theory?
According to the formulas 4.5 this would mean

$L \cdot X + M \cdot Y + N \cdot Z = 0,$

i.e., perpendicularity of the electric and magnetic
force vectors. These vectors are, however,
known not to be necessarily perpendicular to
each other; our identification is therefore
faulty. A slight modification would, however,
help to overcome the difficulty: if instead of
considering the areas of projections of a triangular
contour, we consider the areas of projections
of an arbitrary contour, not necessarily
a flat one, then the six quantities are independent
and the formal analogy holds perfectly.
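The dependence among the six projections can be illustrated numerically. The sketch below rests on two assumptions that are not confirmed by the text as it stands: that the six quantities are (up to a common factor, which does not matter here) the minors $x_i y_j - x_j y_i$ of the two vectors spanning the triangle, and that the relation in question is $F_{12}F_{34} + F_{13}F_{42} + F_{14}F_{23} = 0$. The function names and sample vectors are hypothetical.

```python
from itertools import combinations


def minors(x, y):
    """Assumed form of the six quantities: F[(i, j)] = x_i*y_j - x_j*y_i for i < j."""
    return {(i + 1, j + 1): x[i] * y[j] - x[j] * y[i]
            for i, j in combinations(range(4), 2)}


def relation(F):
    """F_12*F_34 + F_13*F_42 + F_14*F_23, written with F_42 = -F_24."""
    return (F[(1, 2)] * F[(3, 4)]
            - F[(1, 3)] * F[(2, 4)]
            + F[(1, 4)] * F[(2, 3)])


# A single flat piece spanned by two vectors: the relation holds.
x = [1, 2, 3, 4]
y = [2, 0, 1, 5]
flat = minors(x, y)
flat_value = relation(flat)

# Adding the projections of two flat pieces lying in different planes
# (here the x1x2 and the x3x4 coordinate planes) imitates a contour
# that is not flat; the relation no longer holds.
piece_a = minors([1, 0, 0, 0], [0, 1, 0, 0])
piece_b = minors([0, 0, 1, 0], [0, 0, 0, 1])
combined = {key: piece_a[key] + piece_b[key] for key in piece_a}
combined_value = relation(combined)
```

In this sketch the flat piece gives the value 0, while the combined set of six numbers gives 1, so the six quantities of a non-flat contour are not tied by the relation.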

Returning  to  the  plane  we  may  mention  that 
although  it  might  seem  strange  at  first  glance, 
we  should  expect  that  two  planes  may  have  only 
one  point  in  common  -  a  situation  which  never 
occurs  in  three  dimensions.  An  example  of  two 
planes  with  only  one  common  point  is  given  by 
the $x_1 x_2$ and the $x_3 x_4$ coordinate planes; the
common  point  is,  of  course,  the  origin. 

Four points not in a plane, or three vectors
with a common origin, we expect to determine
a "solid" which may be defined as the totality
of points of three kinds: (1) points on
the lines determined by the given vectors;
(2) points on lines joining two points of the
first kind; and (3) points of lines joining two
points of the second kind. In ordinary geometry
a configuration defined in this way exhausts
all points, but not so in our four-dimensional
geometry; as examples may serve the four coordinate
solids, the totalities respectively of
points satisfying the relations $x_1 = 0$, $x_2 = 0$,
$x_3 = 0$, $x_4 = 0$.

A  parametric  representation  of  a  solid  is 
analogous  to  that  of  a  plane.  For  a  solid 
through  the  origin  we  have  as  such  parametric 
representation 


7.61    $x = \alpha i + \beta j + \gamma k$

where $i$, $j$, $k$ have again been chosen as perpendicular
unit vectors so that

$i \cdot j = j \cdot k = k \cdot i = 0; \quad i^2 = j^2 = k^2 = 1.$

For all possible values of $\alpha$, $\beta$, $\gamma$ we obtain
all the points of the solid through the
origin determined by the vectors $i$, $j$, $k$.

Next we consider a configuration determined
by five points not in a solid; we obtain all
the points of our four-dimensional space. Incidentally,
as a generalization of the formulas
7.6 and 7.61 we may write now

7.62    $x = \alpha i + \beta j + \gamma k + \delta \ell;$

this formula gives the expression of every vector
with initial point at the origin in terms
of four mutually perpendicular unit vectors. We
have, of course,

7.8    $i \cdot j = i \cdot k = i \cdot \ell = j \cdot k = j \cdot \ell = k \cdot \ell = 0, \quad i^2 = j^2 = k^2 = \ell^2 = 1.$


8.  Axioms  of  Four-Dimensional  Geometry. 

Until now we have been listing some propositions
that we may expect to have in four-dimensional
geometry. But what is four-dimensional
geometry? As an abstract mathematical theory
it is just a collection of statements of
which  some  may  be  taken  without  proof  and  con- 
sidered as  axioms  and  definitions,  and  the  oth- 
ers are  deduced  from  them  as  theorems.   It  is 
not  difficult  then  to  pass  from  our  expectations 
of  a  four-dimensional  geometry  to  a  realization 
of  such  a  geometry;  all  we  would  have  to  do 
would  be  to  pick  out  certain  of  the  propositions 
listed  above  and  consider  them  as  axioms  and  to 
show  that  the  others  can  be  deduced  from  them. 
But  in  so  doing  we  do  not  want  to  include 
among  our  axioms  propositions  involving  coor- 
dinates. In  two  and  three-dimensional  geometry 
we  are  accustomed  to  see  analytic  geometry  based 
on the study of elementary geometry, that precedes
it. We have an idea of what a straight
line  is  before  we  come  to  coordinate  axes,  and 
we  choose  three  of  these  pre-existing  straight 
lines  to  play  the  part  of  coordinate  axes.   We 
may choose these axes to a certain extent arbitrarily
(this arbitrariness being restricted
only by our desire to have a rectangular system);
the coordinate axes play only an auxiliary
role, and it would be awkward to include any
reference  to  a  particular  coordinate  system  in 
the  axioms.  We  look,  therefore,  among  the  prop- 
ositions mentioned  in  the  preceding  section  for 
some that are independent of a coordinate system
and  from  which  we  can  reconstruct  the  whole  sys- 
tem. 

As  our  fundamental  undefined  conception  we 
choose  "vector".  The  axioms  that  follow  will 
in  the  main  be  rules  of  operations  on  vectors. 


Axiom I. Every two vectors have a sum. Addition
is commutative, $a + b = b + a$, and associative,
i.e., $a + (b + c) = (a + b) + c$; subtraction
is unique, i.e., for every two vectors a
and  b  there  exists  one  and  only  one  vector  x 
such  that  a  =  b  +  x.  It  follows  that  there 
exists  a  vector  0  that  satisfies  the  relation 
a  +  0  =  a  for  every  a. 

Axiom II. Given a number $\alpha$ and a vector $a$
there exists a vector $\alpha a$ or $a\alpha$ which is called
their product. The associative law holds in
the sense that $\alpha(\beta a) = (\alpha\beta)a$ and also the distributive
laws $(\alpha + \beta)a = \alpha a + \beta a$ and
$\alpha(a + b) = \alpha a + \alpha b$.

Axiom III. To every two vectors $a$ and $b$ corresponds
a number $a \cdot b$ or $ab$ called their (scalar)
product. Scalar multiplication is commutative,
$a \cdot b = b \cdot a$; it obeys together with multiplication
of vectors by numbers the associative
law, $\alpha(a \cdot b) = (\alpha a) \cdot b$; and, together with addition,
the distributive law $a \cdot (b + c) = a \cdot b + a \cdot c$.

Before we formulate the next axiom we introduce
the definition: the vectors $a, b, c, \ldots$
are called linearly dependent if there exist
numbers $\alpha, \beta, \gamma, \ldots$ not all zero, such that

$\alpha a + \beta b + \gamma c + \ldots = 0;$

they are called linearly independent when no
such numbers exist.

Axiom IV. There are four independent vectors,
but there are no five independent vectors.

Axiom  V.  If  a  vector  is  not  zero  its  square  is 
positive. 

To  these  axioms  on  vectors  we  have  to  add 
some  statements  concerning  points  if  we  want  to 
have  a  geometry,  and  as  such  we  may  take: 

Axiom VI. Every two points $A$, $B$ have as their
difference a vector, $h$; or in formulas $B - A = h$,
$B = A + h$.

Axiom  VII.   (A  -  B)  +  (B  -  C)  =  A  -  C. 

The  body  of  propositions  which  may  be  de- 
duced from  these  axioms  we  call  four-dimension- 
al geometry. 

In  order  to  prove  that  the  whole  of  geomet- 
ry can  be  deduced  from  the  propositions  I-VII  we 
would  have  to  actually  deduce  it.  We  shall  not 
do  it,  of  course,  but  we  shall  indicate  how  ana- 
lytic geometry  can  be  arrived  at  in  the  follow- 
ing discussion,  which  is  meant  to  be  entirely 
formal,  i.e.,  during  which  it  is  not  intended  to 
invoke  our  intuition  but  only  the  properties 
stated  in  the  axioms. 

By length of a vector we mean the positive
square root of the product of the vector by itself:
$|a| = \sqrt{a^2}$. Two vectors are considered perpendicular
if their scalar product is zero. A
vector is called a unit vector if its length is
1.

Lemma. There exist four mutually perpendicular
unit vectors.

Proof. According to Axiom IV there exist
four independent vectors $a, b, c, d$; i.e., such
vectors that there are no four numbers $\alpha, \beta, \gamma, \delta$,
not all zero, such that $\alpha a + \beta b + \gamma c + \delta d = 0$.
It follows that $a \neq 0$, because if it were zero,
the numbers $\alpha = 1$, $\beta = 0$, $\gamma = 0$, $\delta = 0$ would
satisfy the above relation. Call $a$, multiplied
by the reciprocal of its length, $i$, so that
$i = a/|a|$. It is easy to see that $i$ is a unit
vector, and that $i, b, c, d$ are independent. Consider
the vector $b' = b - i(bi)$. It is easy to
see that it is perpendicular to $i$; multiply $b'$
by the reciprocal of its length and call the
result $j$; then $i$ and $j$ are two perpendicular
unit vectors and $i, j, c$ are independent. Call
$c' = c - i(ci) - j(cj)$; this vector is perpendicular
to both $i$ and $j$, and if we multiply it
by the reciprocal of its length and call the
result $k$, we have in $i, j, k$ three perpendicular
unit vectors; the fourth vector $\ell$ may be obtained
from $d$ in an analogous way.
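The construction used in this proof can be carried out mechanically. The following sketch represents vectors by lists of four numbers, which is an assumption made only for the illustration (the proof itself uses nothing but the axioms); the function name `orthonormalize` and the sample vectors are hypothetical.

```python
import math


def dot(x, y):
    """Scalar product of two vectors given by their components."""
    return sum(x_n * y_n for x_n, y_n in zip(x, y))


def orthonormalize(vectors):
    """Turn linearly independent vectors into mutually perpendicular unit vectors."""
    basis = []
    for w in vectors:
        # Subtract from w its projections on the unit vectors found so far,
        # as in b' = b - i(bi) and c' = c - i(ci) - j(cj).
        r = list(w)
        for e in basis:
            coeff = dot(w, e)
            r = [r_n - coeff * e_n for r_n, e_n in zip(r, e)]
        norm = math.sqrt(dot(r, r))
        # In exact arithmetic the remainder is zero only for dependent vectors;
        # with rounding errors a small tolerance would be a safer test.
        if norm == 0.0:
            raise ValueError("the given vectors are linearly dependent")
        # Multiply by the reciprocal of the length to obtain a unit vector.
        basis.append([r_n / norm for r_n in r])
    return basis


a = [1.0, 1.0, 1.0, 1.0]   # four illustrative independent vectors
b = [1.0, 1.0, 0.0, 0.0]
c = [1.0, 0.0, 1.0, 0.0]
d = [1.0, 0.0, 0.0, 0.0]
i, j, k, l = orthonormalize([a, b, c, d])
```

The four resulting vectors have scalar products 0 with each other and 1 with themselves, up to rounding.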

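Lemma II. Given any vector $x$ we have the
identity

8.1    $x = i(ix) + j(jx) + k(kx) + \ell(\ell x).$

Proof. Since every five vectors are dependent
(Axiom IV), there exist five numbers $\alpha, \beta, \gamma, \delta, \varepsilon$
not all zero such that

$\alpha i + \beta j + \gamma k + \delta \ell + \varepsilon x = 0;$

here $\varepsilon$ cannot be zero since otherwise $i, j, k, \ell$
would be dependent and we know that they are
not. Dividing by $-\varepsilon$ we have

$x = \left(-\frac{\alpha}{\varepsilon}\right) i + \left(-\frac{\beta}{\varepsilon}\right) j + \left(-\frac{\gamma}{\varepsilon}\right) k + \left(-\frac{\delta}{\varepsilon}\right) \ell;$

multiplying by $i$, the three last terms on the
right vanish as the result of perpendicularity
of $i$ to $j$, $k$ and $\ell$, and the first term becomes
$-\alpha/\varepsilon$, whereas the left hand side is $ix$; in the
same way we prove that $-\beta/\varepsilon$, $-\gamma/\varepsilon$, $-\delta/\varepsilon$ have the values
$jx$, $kx$, $\ell x$, respectively, and Lemma II is
proved.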

A set of four perpendicular unit vectors
we shall call a set of coordinate vectors. We
shall call the quantities $x_1 = ix$, $x_2 = jx$,
$x_3 = kx$, $x_4 = \ell x$ the components of $x$ with respect
to $i, j, k, \ell$. A point $O$ together with a
set of coordinate vectors we call a coordinate
system. Given a coordinate system we can assign
to every point $X$ four coordinates in the
following way: denote the vector $X - O$ by $x$
(according to Axiom VI), and call the components
of the vector $x$ the coordinates of the
point $X$. If now we choose another origin $O'$
and the same set of coordinate vectors, the coordinates
of the point $X$ will be the components
of $X - O'$; but since, according to Axiom VII,
$X - O = (X - O') + (O' - O)$, the old coordinates
will be equal to the new coordinates plus
the old coordinates of the new origin, and thus
a connection is established with ordinary analytic
geometry.

Formula 8.1 may be compared with the formula
7.62. It may also be written as

8.2    $x = x_1 i + x_2 j + x_3 k + x_4 \ell.$
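A small numerical illustration of Lemma II and of the change of origin may be useful. In the sketch below the particular set of coordinate vectors, the sample point and the helper names are assumptions chosen for the example; points and vectors are both represented by lists of four numbers.

```python
def dot(x, y):
    """Scalar product of two vectors given by their components."""
    return sum(x_n * y_n for x_n, y_n in zip(x, y))


def sub(p, q):
    """Difference of two points (or vectors), taken component by component."""
    return [p_n - q_n for p_n, q_n in zip(p, q)]


# An illustrative set of four perpendicular unit vectors
# (deliberately not along the axes used to write them down).
i = [0.5, 0.5, 0.5, 0.5]
j = [0.5, 0.5, -0.5, -0.5]
k = [0.5, -0.5, 0.5, -0.5]
l = [0.5, -0.5, -0.5, 0.5]
basis = [i, j, k, l]


def components(x, basis):
    """Components of x: its scalar products with the coordinate vectors."""
    return [dot(e, x) for e in basis]


def rebuild(comps, basis):
    """The vector x_1 i + x_2 j + x_3 k + x_4 l built from its components."""
    return [sum(c * e[n] for c, e in zip(comps, basis)) for n in range(4)]


x = [3.0, -1.0, 2.0, 5.0]
comps = components(x, basis)
x_again = rebuild(comps, basis)   # reproduces x

# Change of origin with the same coordinate vectors:
# X - O = (X - O') + (O' - O), so old = new + coordinates of the new origin.
X = [3.0, -1.0, 2.0, 5.0]
O = [0.0, 0.0, 0.0, 0.0]
O_new = [1.0, 1.0, 0.0, -2.0]
old_coords = components(sub(X, O), basis)
new_coords = components(sub(X, O_new), basis)
origin_coords = components(sub(O_new, O), basis)
```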


9.  Tensor  Analysis. 

We  want  to  substitute  now  for  the  prelimin- 
ary definitions  of  tensor  analysis  that  were 
suggested  by  the  formal  developments  in  Chapter 
I,  a  definition  that  is  more  satisfactory.   At 
that  time  our  point  of  view  was  simply  that  we 
shall  consider  symbols  with  two  (or  more)  in- 
dices as  the  tensor  components  in  a  way  similar 
to  that  of  using  symbols  with  one  index  as  vec- 
tor components.  But  in  case  of  vectors  (in  or- 
dinary space)  we  know  what  vectors  are,  and  we 
consider  the  components  as  a  method  of  repre- 
senting that  known  thing.  In  the  case  of  ten- 
sors we  seem  to  have  to  take  representation  as 
the  starting  point  of  our  study.  The  situation 
seems  complicated  since  the  fact  that  there  is 
not  one  but  that  there  are  many  different  rep- 
resentations of  the  same  vector,  depending  on 
the  coordinate  system  we  use  leads  us  to  think 
that  the  same  general  situation  will  obtain  in 
the case of tensors, and the question arises naturally:
how shall we be able to find out, given
two representations of a tensor, whether it
is  the  same  tensor  that  is  represented  in  the 
two  cases;  or,  given  a  representation  of  a  ten- 
sor in  one  system  of  coordinates  how  to  find 
the  representation  of  the  same  tensor  in  a  giv- 
en other  system.  In  order  to  be  able  to  answer 
such  questions  intelligently  we  want  to  intro- 
duce the  idea  of  the  tensor  itself,  to  put  it  in 
the  foreground  and  to  consider  the  components  as 
something  secondary.  In  the  beginning  we  shall 
limit  ourselves  to  the  consideration  of  two  di- 
mensions. 

We  look  then  for  some  entity,  of  which  the 
components  will  be  constituent  parts.  The  first 
thing  that  occurs  to  our  minds  in  connection 
with  tensors  of  rank  two  is,  of  course,  a  deter- 
minant. It  is  a  single  number  determined  by  its 
elements,  or  components.  However,  it  cannot  be 
used  for  our  purposes  because  the  determinant 
does  not,  in  turn,  determine  its  components. 

Another  instance  where  two  index  symbols 
occur  in  mathematics  is  the  case  of  quadratic 
forms. The equation of a central conic may, for
example, be written in the form $ax^2 + 2bxy + cy^2 = 1$;
or, introducing the notations $x_1$ for $x$, $x_2$
for $y$, $a_{11}$ for $a$, $a_{12}$ and $a_{21}$ for $b$, and $a_{22}$ for
$c$, in the form

$a_{11} x_1 x_1 + a_{12} x_1 x_2 + a_{21} x_2 x_1 + a_{22} x_2 x_2 = 1.$


Let us consider the left hand side of this equation.
Here the tensor components $a_{ij}$ are combined
(together with the variables $x_1$, $x_2$) into
one expression; and they can, to a certain extent,
be gotten back from that expression. If
we set $x_1 = 1$, $x_2 = 0$, for instance, we get $a_{11}$
as the value of the expression; $a_{22}$ can be gotten
in a similar way, but it would be difficult
to imagine how $a_{12}$ could be obtained; in fact,
it is impossible to get $a_{12}$ from this expression,
because two expressions which differ in
their form but for which $a_{12} + a_{21}$ has the same
value would give the same values for all combinations
of values of $x_1$, $x_2$. A slight generalization
will, however, obviate this difficulty.
This generalization is suggested by the
equation of the tangent to the above conic and
can be written as

9.1    $\varphi = a_{11} x_1 y_1 + a_{12} x_1 y_2 + a_{21} x_2 y_1 + a_{22} x_2 y_2;$


here, for instance, setting $x_1 = 1$, $x_2 = 0$,
$y_1 = 0$, $y_2 = 1$ we get $a_{12}$. We shall therefore
consider the bilinear form above as the tensor.
If we do that we may free ourselves of coordinates
easily. The variables $x_1$, $x_2$ and $y_1$,
$y_2$ may be considered as the components of two
vectors, and the above expression 9.1 furnishes
us then a numerical value every time these two
vectors are given; it may be considered as defining
a function $\varphi$; the arguments of that
function are the two vectors and the values are
the numbers calculated by substituting the components
of these vectors in the expression 9.1.
This functional dependence we may consider as
the tensor, so that if we want to use another coordinate
system we shall have the same vectors
given by different components $x'_1$, $x'_2$ and $y'_1$,
$y'_2$ and we expect to find another expression of
the same type as 9.1 involving these new components,
say

9.11    $\varphi = a'_{11} x'_1 y'_1 + a'_{12} x'_1 y'_2 + a'_{21} x'_2 y'_1 + a'_{22} x'_2 y'_2$

which would assign the same values to the two
vectors. The coefficients will, of course, be
different, and these new coefficients we shall
consider as the new components of the same tensor
in the new coordinate system.

Let us perform the calculation. If we rotate
our axes through an angle $\theta$ the old coordinates
are expressed in the new coordinates by
the formulas

9.2    $x_1 = x'_1 c - x'_2 s, \quad x_2 = x'_1 s + x'_2 c,$

where

9.3    $c = \cos \theta, \quad s = \sin \theta;$

the components of the other vector $y_1$, $y_2$ will
be expressed by analogous formulas in terms of
the new components of the same vector. Substituting
these expressions in the above bilinear
form (9.1) we get

$\varphi = a_{11}(x'_1 c - x'_2 s)(y'_1 c - y'_2 s) + a_{12}(x'_1 c - x'_2 s)(y'_1 s + y'_2 c)$
$\quad + a_{21}(x'_1 s + x'_2 c)(y'_1 c - y'_2 s) + a_{22}(x'_1 s + x'_2 c)(y'_1 s + y'_2 c),$

which may be written as 9.11 if we give to $a'_{ij}$
the values

9.21
$a'_{11} = a_{11} c^2 + a_{12} cs + a_{21} sc + a_{22} s^2$
$a'_{12} = -a_{11} cs + a_{12} c^2 - a_{21} s^2 + a_{22} sc$
$a'_{21} = -a_{11} sc - a_{12} s^2 + a_{21} c^2 + a_{22} cs$
$a'_{22} = a_{11} s^2 - a_{12} sc - a_{21} cs + a_{22} c^2.$

These are the new components of the tensor
whose old components are the $a_{ij}$. The equation
of the conic section in the new coordinates has
the same form as in the old system. We may say
that 9.11 expresses the same functional dependence
on the two vectors using their new components,
as 9.1 does using their old components.
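The statement that the new bilinear form assigns the same value to the same pair of vectors can be checked numerically. The sketch below assumes the rotation formulas $x_1 = x'_1 c - x'_2 s$, $x_2 = x'_1 s + x'_2 c$ and the four transformed components obtained by collecting terms after the substitution; the rotation angle is called `theta` here, and the sample tensor and vectors are illustrative.

```python
import math


def bilinear(a, x, y):
    """Value of the bilinear form: sum over m, n of a[m][n] * x[m] * y[n]."""
    return sum(a[m][n] * x[m] * y[n] for m in range(2) for n in range(2))


def old_from_new(v_new, theta):
    """Old components of a vector expressed through its new components."""
    c, s = math.cos(theta), math.sin(theta)
    return [v_new[0] * c - v_new[1] * s, v_new[0] * s + v_new[1] * c]


def new_components(a, theta):
    """New tensor components, found by collecting the products x'_m y'_n."""
    c, s = math.cos(theta), math.sin(theta)
    (a11, a12), (a21, a22) = a
    return [
        [a11 * c * c + a12 * c * s + a21 * s * c + a22 * s * s,
         -a11 * c * s + a12 * c * c - a21 * s * s + a22 * s * c],
        [-a11 * s * c - a12 * s * s + a21 * c * c + a22 * c * s,
         a11 * s * s - a12 * s * c - a21 * c * s + a22 * c * c],
    ]


a = [[3.0, -1.0], [4.0, 2.0]]       # illustrative old components
theta = 0.7                         # illustrative rotation angle
a_new = new_components(a, theta)

x_new, y_new = [1.5, -2.0], [0.25, 4.0]   # new components of two vectors
value_old = bilinear(a, old_from_new(x_new, theta), old_from_new(y_new, theta))
value_new = bilinear(a_new, x_new, y_new)
```

The two values agree up to rounding, whatever angle and vectors are chosen.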

The components of a tensor change, in general,
when we pass from one coordinate system to
another but there are certain combinations which
do not change; for instance, if we add together
the first and the last of the four above equalities
we obtain, taking into account that

9.4    $c^2 + s^2 = 1,$

the relation

9.5    $a'_{11} + a'_{22} = a_{11} + a_{22};$

also it is easy to prove that the expression

9.51    $\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}$

is not affected by the substitution of the
primed components for the unprimed ones.

Expressions of this kind are called invariants.
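Both invariants can be verified numerically for several rotation angles. In the sketch below the transformation is written compactly as $a'_{mn} = \sum_{p,q} R_{pm} a_{pq} R_{qn}$ with $R$ the rotation matrix, which is assumed to be equivalent to the four explicit formulas for the new components; the names and sample values are illustrative.

```python
import math


def rotated(a, theta):
    """New components a'_mn = sum over p, q of R[p][m] * a[p][q] * R[q][n],
    where R = [[c, -s], [s, c]] expresses old components through new ones."""
    c, s = math.cos(theta), math.sin(theta)
    R = [[c, -s], [s, c]]
    return [[sum(R[p][m] * a[p][q] * R[q][n] for p in range(2) for q in range(2))
             for n in range(2)]
            for m in range(2)]


def trace(a):
    """The combination a_11 + a_22."""
    return a[0][0] + a[1][1]


def determinant(a):
    """The determinant a_11*a_22 - a_12*a_21."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


a = [[3.0, -1.0], [4.0, 2.0]]   # illustrative components
checks = []
for theta in (0.3, 1.1, 2.5):
    a_new = rotated(a, theta)
    checks.append((trace(a_new) - trace(a), determinant(a_new) - determinant(a)))
```

Each pair in `checks` holds the change of the two expressions under the rotation; both entries are zero up to rounding, although the individual components change.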

We  have  thus  achieved  our  purpose;  although 
we  use  coordinates  in  the  definition  of  a  tensor 
the  result  is  independent  of  the  particular  co- 
ordinate system  used.  We  can  go  a  step  farther, 
however,  and  dispense  with  coordinates  altogeth- 
er in  the  definition  of  a  tensor.   Not  every 
functional  dependence  of  a  number  <p  on  two  vec- 
tors we  shall  call  a  tensor,  the  dependence  on 
each  vector  must  be  linear  (and  homogeneous);  by 
this  we  mean  that  the  expression  involves  only 
first  powers  of  the  components  of  each  vector, 
and  no  products  of  components  of  the  same  vec- 
tor; our  conception  of  linearity  seems  thus  to 
involve components and we are not rid of coordinate
systems yet. We want therefore to define
linearity independently of coordinate representation
and we shall see that the following
definition, entirely independent of coordinates,
leads to the same results.

We say, in general, that $\varphi(x)$ depends on
its argument $x$ linearly if

9.6    $\varphi(\lambda x + \mu y) = \lambda \varphi(x) + \mu \varphi(y).$

It is easy to see that linearity defined
in terms of coordinates as dependence involving
only first powers of the components satisfies
this condition. We arrive thus at the following
definition of a tensor:

A tensor of rank r is a function which assigns
to r vector arguments numerical values,
the dependence of the value on each argument
being linear in the sense of 9.6.

We can prove that an expression of a tensor
as a bilinear (or multilinear) form may be
gotten back from this general definition. In
fact, if given, e.g., a tensor of rank two, i.e.,
with two vector arguments, $\varphi(x, y)$, we substitute
for $x$ and $y$ their expressions in terms of components
and unit vectors (see 7.6)

$x = i x_1 + j x_2, \quad y = i y_1 + j y_2,$

we may write, using the above definition of a
tensor and that of linearity (9.6):

$\varphi(x, y) = \varphi(i, i) x_1 y_1 + \varphi(i, j) x_1 y_2 + \varphi(j, i) x_2 y_1 + \varphi(j, j) x_2 y_2.$

We see that this expression differs from
that given by 9.1 as a bilinear form only in
that $\varphi(i, i)$, $\varphi(i, j)$, $\varphi(j, i)$, $\varphi(j, j)$ appear instead
of $a_{11}$, $a_{12}$, $a_{21}$, and $a_{22}$. From this
point of view the conception of a tensor is entirely
independent of a coordinate system and
of components. We obtain tensor components
when we introduce a set of coordinate vectors;
and transformation of coordinates corresponds
to replacing of one set of coordinate vectors
by a new set.
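The coordinate-free point of view can be imitated in code by treating a tensor as a function of vectors and reading off its components as its values on the coordinate vectors. The sketch below is an illustration under assumptions: the particular function `phi`, the second set of coordinate vectors and the helper names are all hypothetical.

```python
def phi(x, y):
    """An illustrative tensor of rank two: a number depending linearly on x and on y."""
    return 3 * x[0] * y[0] - x[0] * y[1] + 4 * x[1] * y[0] + 2 * x[1] * y[1]


def dot(x, y):
    """Scalar product of two plane vectors."""
    return x[0] * y[0] + x[1] * y[1]


def components(tensor, basis):
    """Tensor components: the values of the tensor on pairs of coordinate vectors."""
    return [[tensor(e_m, e_n) for e_n in basis] for e_m in basis]


def combine(lam, x, mu, y):
    """The vector lam*x + mu*y."""
    return [lam * x_n + mu * y_n for x_n, y_n in zip(x, y)]


x, y, z = [1.0, 2.0], [-3.0, 0.5], [2.0, 2.0]

# Linearity in the first argument, in the sense of 9.6.
lam, mu = 2.0, -1.5
lhs = phi(combine(lam, x, mu, y), z)
rhs = lam * phi(x, z) + mu * phi(y, z)

# Components with respect to the original coordinate vectors i, j.
old_basis = [[1.0, 0.0], [0.0, 1.0]]
a = components(phi, old_basis)

# Components with respect to another pair of perpendicular unit vectors.
new_basis = [[0.6, 0.8], [-0.8, 0.6]]
a_new = components(phi, new_basis)

# New components of x and y are their scalar products with the new vectors;
# the bilinear form built from a_new gives the same value phi(x, y).
x_new = [dot(e, x) for e in new_basis]
y_new = [dot(e, y) for e in new_basis]
value_new = sum(a_new[m][n] * x_new[m] * y_new[n] for m in range(2) for n in range(2))
```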

We pass now to the consideration of operations
on tensors. We had three such operations:
multiplication, contraction and differentiation.

Multiplication  is  simple.  If  we  have  two 
tensors,  say  f(x,y)  and  g(z,u,v)  we  obtain  a 
tensor  of  rank  five  by  multiplying  these  two 
together 

h(x,y,z,u,v)  =  f(x,y).g(z,u,v); 

it is easy to see that the components of h are
obtained from the components of f and g in the
following fashion

$h_{ijklm} = f_{ij}\, g_{klm}.$
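Multiplication of tensors regarded as functions is equally easy to sketch. The example below works in two dimensions to keep it short; the two sample tensors and the names `multiply` and `component` are assumptions made for the illustration. It checks that every component of the product is the product of the corresponding components of the factors.

```python
def multiply(f, g):
    """Product of a rank-two and a rank-three tensor: a rank-five tensor."""
    def h(x, y, z, u, v):
        return f(x, y) * g(z, u, v)
    return h


def f(x, y):
    """Illustrative tensor of rank two."""
    return x[0] * y[1] + 2 * x[1] * y[0]


def g(z, u, v):
    """Illustrative tensor of rank three."""
    return z[0] * u[0] * v[1] - 3 * z[1] * u[1] * v[0]


basis = [[1.0, 0.0], [0.0, 1.0]]   # coordinate vectors i, j


def component(tensor, *indices):
    """Component of a tensor: its value on the coordinate vectors with the given
    (zero-based) indices."""
    return tensor(*(basis[n] for n in indices))


h = multiply(f, g)

# Compare each component of h with the product of components of f and g.
components_agree = all(
    component(h, i, j, k, l, m) == component(f, i, j) * component(g, k, l, m)
    for i in range(2) for j in range(2)
    for k in range(2) for l in range(2) for m in range(2)
)
```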


Next comes contraction. We have defined it
in indices; i.e., given the components of a tensor
of rank two in a certain coordinate system
we have a definite rule for obtaining a scalar,
viz., $a_{11} + a_{22}$; the question arises: will we
obtain the same scalar if we use another system
of coordinates, in other words, is the definition
independent of the system of coordinates,
is it invariant? Yes, this invariance has been
proved above by formula 9.5. We have now the
right to use the definition of contraction in
terms of components, knowing that it has an intrinsic
meaning, that the result is independent
of the system of coordinates used.

We  are  in  a  position  now  to  answer  a  ques- 
tion that  must  have  arisen  in  the  mind  of  the 
reader.  In  the  preceding  chapter  we  agreed  to 
consider  a  vector  as  a  tensor  of  rank  one.  Here 
with  our  new  definition  of  tensor  a  vector  and 
a  tensor  of  rank  one  seem  to  be  two  entirely 
different  things;  but  we  may  consider  together 
with  every  vector  a  tensor  of  rank  one  which 
has the same components. To find a tensor $f(x)$
which has the same components as the vector $v$
we have to make $f(i) = v_1$, $f(j) = v_2$, and we
have

$f(x) = x_1 f(i) + x_2 f(j) = x_1 v_1 + x_2 v_2$

so that the value of this tensor $f(x)$ is simply
the scalar product of the vector to which it
corresponds by the vector argument.

Incidentally, this raises a question as to
the nature of the scalar product; if we define
it as $x_\alpha v_\alpha$ is it invariant? It may be considered
as resulting from two tensors of rank one by
first multiplying them and then contracting the
resulting tensor of rank two; since contraction
is invariant, so is the scalar product.

We also might at this place say a few words
about the symbols of Kronecker $\delta_{ij}$. We may try
to consider these symbols as the components of
a tensor in some coordinate system. The value
of the tensor will then be given by $\delta_{ij} x_i y_j$ and
this is easily seen to be $x_\alpha y_\alpha$, the scalar product
of the vectors $x$ and $y$ and thus independent
of the coordinate system. We may then speak of
the tensor $\delta_{ij}$ without mentioning the coordinate
system because its components are the same in
all coordinate systems.

The square of a vector may be defined as
the scalar product of a vector with itself; it
is the sum of the squares of the components in
any coordinate system.

We next take up the operation of
differentiation. Let us begin with a scalar field f;
f is a function of the coordinates which we do not
put in evidence. The coordinates of a point P
are the components of the vector OP which joins
the origin to the point in question and they
depend in the fashion discussed before on the
choice of the coordinate vectors.

After  choosing  a  definite  coordinate  system 
we  may  assign  to  f  in  every  point  a  vector  by 




agreeing that the components of this vector
should be

9.71    $v_1 = \partial f/\partial x_1$  and  $v_2 = \partial f/\partial x_2$;

given another coordinate system, the relation
between $x_i$ and $x'_i$ being given by formulas 9.2,
we can form the derivatives

9.72    $\partial f/\partial x'_1$  and  $\partial f/\partial x'_2$


and  consider  them  as  the  components  in  the  new 
coordinate  system  of  a  vector.  The  question 
arises  whether  this  will  be  the  same  vector  as 
the  one  introduced  above. 

In order to settle this question let us
see what the components of the vector whose
components in the old system were $v_i$ should be
in the new coordinate system. According to
formulas 9.2 they are

$v'_1 = v_1 c + v_2 s = \partial f/\partial x_1 \cdot c + \partial f/\partial x_2 \cdot s$,
$v'_2 = -v_1 s + v_2 c = -\partial f/\partial x_1 \cdot s + \partial f/\partial x_2 \cdot c$.

On the other hand

$\partial f/\partial x'_1 = \partial f/\partial x_1 \cdot \partial x_1/\partial x'_1 + \partial f/\partial x_2 \cdot \partial x_2/\partial x'_1$,

and since

$\partial x_1/\partial x'_1 = c$,   $\partial x_2/\partial x'_1 = s$,

we find that $v'_1 = \partial f/\partial x'_1$, and, in the same
way we find that $v'_2 = \partial f/\partial x'_2$, which shows
that the components 9.71 and 9.72 above are
the components in the two coordinate systems
considered of one and the same vector. We
proved then that the operation of obtaining a
vector by taking as its components the
derivatives of a scalar with respect to the
coordinate axes is independent of the particular
system of coordinates used; that is, this
operation is invariant.
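
The argument just given can also be checked numerically. The following is only a sketch under stated assumptions: the scalar field `f`, the sample point and the angle are hypothetical choices made for illustration, and the rotation is taken in the form quoted above from formulas 9.2 ($v'_1 = v_1 c + v_2 s$, $v'_2 = -v_1 s + v_2 c$). It compares the old derivatives 9.71 after rotation with the derivatives 9.72 taken directly in the new coordinates by central differences.

```python
import math

def f(x1, x2):
    # Hypothetical scalar field, chosen only for illustration.
    return x1 * x1 * x2 + math.sin(x2)

def grad_old(x1, x2):
    # Components 9.71: analytic derivatives with respect to the old coordinates.
    return (2.0 * x1 * x2, x1 * x1 + math.cos(x2))

def to_new(v, c, s):
    # Rotation as quoted in the text: v'_1 = v_1 c + v_2 s, v'_2 = -v_1 s + v_2 c.
    return (v[0] * c + v[1] * s, -v[0] * s + v[1] * c)

def to_old(vp, c, s):
    # Inverse rotation: recovers old components from new ones.
    return (vp[0] * c - vp[1] * s, vp[0] * s + vp[1] * c)

def grad_new(xp1, xp2, c, s, h=1e-5):
    # Components 9.72: derivatives with respect to the new coordinates,
    # approximated by central differences.
    def g(a, b):
        return f(*to_old((a, b), c, s))
    d1 = (g(xp1 + h, xp2) - g(xp1 - h, xp2)) / (2.0 * h)
    d2 = (g(xp1, xp2 + h) - g(xp1, xp2 - h)) / (2.0 * h)
    return (d1, d2)

def gradient_mismatch(angle=0.7, point=(1.3, -0.4)):
    # Largest difference between "rotate 9.71" and "compute 9.72 directly".
    c, s = math.cos(angle), math.sin(angle)
    rotated = to_new(grad_old(*point), c, s)
    new_point = to_new(point, c, s)
    direct = grad_new(new_point[0], new_point[1], c, s)
    return max(abs(rotated[0] - direct[0]), abs(rotated[1] - direct[1]))
```

A mismatch at the level of the finite-difference error is what the invariance proved above leads us to expect; the check does not replace the proof.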

Before passing to differentiation of a
tensor of rank higher than zero we may note
that the components $v_1$, $v_2$ we obtained may be
considered as the components of a tensor of
rank one; denoting the components of the
argument vector by $h_1$, $h_2$ we have as the values of
this tensor

$\partial f/\partial x_1 \cdot h_1 + \partial f/\partial x_2 \cdot h_2$;

this reminds us of a differential and suggests
writing $dx_1$ for $h_1$ and $dx_2$ for $h_2$; we have
then the formula

$df = \partial f/\partial x_\alpha \cdot dx_\alpha$,

which leads to the interpretation of the
differential as a tensor of rank one whose components
are the derivatives of the given function.


We next consider a tensor of rank one whose
components in the old system are $f_i$, these
components being functions of coordinates.
Differentiating with respect to $x_j$ we get

$\partial f_i/\partial x_j$;

can we consider these as the components of some
tensor of rank two? In other words, if we
define a tensor by saying that in the old system
it has these components, will its components in
the new system be obtained by differentiating
the new components $f'_i$ of the given tensor
with respect to the new coordinates? A
calculation analogous to the one preceding will
convince us that this is so.

We  have  thus  introduced  an  operation  which 
leads  from  a  tensor  of  rank  zero  to  one  of  rank 
one,  and  from  a  tensor  of  rank  one  to  one  of 
rank  two,  and  we  could,  continuing  in  the  same 
way  pass  from  any  tensor  to  one  of  the  next 
higher  rank.   In  introducing  this  operation  we 
used  components  of  tensors  and  coordinates  of 
points, but we proved that the result is the
same no matter what particular coordinate
system we might have used; the operation of
differentiation is thus independent of a coordinate
system.

After  this  detailed  treatment  of  tensors 
and  operations  on  them  in  plane  geometry  it 
will  not  be  difficult  to  generalize  to  higher 
dimensions. We consider first three-dimensional
space - solid geometry - and begin with an
equation of a central quadric surface. Using
notations similar to those introduced at the
beginning of this section it may be written as

9.8     $a_{11}x_1x_1 + a_{12}x_1x_2 + a_{13}x_1x_3$
        $+ a_{21}x_2x_1 + a_{22}x_2x_2 + a_{23}x_2x_3$
        $+ a_{31}x_3x_1 + a_{32}x_3x_2 + a_{33}x_3x_3 = 1$,

or, using our notations for summation with Greek
indices introduced in Section 3, as

$a_{\rho\sigma}x_\rho x_\sigma = 1$.

For the same reason as before (in the case of a
conic), namely, because not all coefficients of
such an expression can be obtained as its
values, we introduce a slightly more general
expression

9.81    $a_{\rho\sigma}x_\rho y_\sigma$

as the tensor; giving in it to the variables the
values $x_i = \delta_{2i}$, $y_i = \delta_{3i}$ (see definition of the
symbol $\delta$ under 3.4) we obtain, for instance, the
coefficient $a_{23}$.

In  coordinateless  notation  we  were  inde- 
pendent of  the  number  of  dimensions  to  begin 




with,  so  that  we  may  take  over  the  definition 
of  linearity  9.6  and  the  definition  of  tensor 
following it word for word. In order to
effectuate the transition from vector notation
to coordinate notation we write now, instead of
$x = x_1 i + x_2 j$, $x = x_\rho i_\rho$, and substituting this
and an analogous expression for y into $f(x,y)$
we get using 9.6

$f(x_\rho i_\rho, y_\sigma i_\sigma) = x_\rho y_\sigma f(i_\rho, i_\sigma)$.

The notation

$f(i_\rho, i_\sigma) = a_{\rho\sigma}$

brings us back to formula 9.81. There is no
difficulty about tensors of higher ranks;
quantities with three indices give rise to
trilinear forms, e.g.,

$a_{\rho\sigma\tau}x_\rho y_\sigma z_\tau$;

those  with  four  indices  -  to  quadrilinear  forms, 
etc.  The  definitions  of  multiplication,  con- 
traction and  differentiation  hardly  present  any 
difficulty,  but  we  shall  devote  some  time  to 
the  question  of  transformation  of  coordinates 
for  three  and  four-dimensions.  In  solid  analy- 
tic geometry  the  question  is  usually  treated  by 
introducing  formulas  involving  all  coordinates 
at the same time, i.e., formulas permitting to
pass  at  once  from  one  system  of  coordinates  to 
any  other  with  the  same  origin;  these  formulas 
are  quite  complicated,  they  involve  nine  con- 
stants which  are  not  independent,  but  are  con- 
nected by  six  relations,  and  the  corresponding 
thing  for  four  dimensions  would  be  still  more 
unwieldy;  we  could  handle  it  by  introducing  in- 
dex notations,  but  we  prefer  another  method.  We 
pass  from  one  system  of  coordinates  to  another 
gradually,  in  steps,  each  step  involving  only 
two  of  the  coordinates  -  and  one  constant  -  the 
angle  through  which  we  rotate  in  the  correspond- 
ing plane.  Three  such  steps  are  enough  to  pass 
from  any  system  to  any  other  in  three  dimen- 
sions. For  example,  we  may  first  perform  a  ro- 
tation in  the  xy-plane  which  brings  the  x-axis 
into  the  new  xy-plane;  then  a  rotation  in  the 
so  obtained  yz-plane  bringing  the  y-axis  into 
the  new  xy-plane,  and  finally  we  rotate  the  so 
obtained  x  and  y  axes  until  they  coincide  with 
the  new  x  and  y  axes. 

The  advantage  of  this  point  of  view  will 
be  seen  from  the  following  proof  of  the  invari- 
ance  of  the  operation  of  contraction  in  three 
dimensions. Given a tensor of rank two by its
components $a_{11}, a_{12}, a_{13}, a_{21}$, etc., the
result of contraction, according to our
definition in Section 3, is

$a_{\rho\rho} = a_{11} + a_{22} + a_{33}$.

If we pass to another coordinate system the
components will be changed into some components
$a'_{11}, a'_{12}$, etc., and the result of
contraction will be

$a'_{\rho\rho} = a'_{11} + a'_{22} + a'_{33}$;

in order to prove that contraction has an
intrinsic meaning, independent of the system of
coordinates, we have to prove that the last two
expressions are equal. If the transformation
involves only the $x_1$ and $x_2$ coordinates, but
does not involve $x_3$, then $a'_{33}$ which is the
coefficient of $x'_3y'_3$ will be $a_{33}$ which is the
coefficient of $x_3y_3$, because $x'_3 = x_3$, $y'_3 = y_3$,
and the other coordinates do not depend on $x_3$
and $y_3$; the coefficients $a'_{11}$, $a'_{22}$, on the
other hand will be transformed by the same
formulas (9.21) as in the two-dimensional case
because $x_1, x_2, y_1, y_2$ are transformed by the
same formulas (9.2) as before. Therefore,
formula 9.5 is applicable, and this together with
the fact that $a'_{33} = a_{33}$ establishes the
invariance of contraction under a transformation of
coordinates involving $x_1$ and $x_2$ only. But the
same reasoning would apply to transformation
involving $x_2$ and $x_3$ only, or $x_1$ and $x_3$ only, and
since  we  have  proved  that  a  general  transforma- 
tion of  coordinates  may  be  replaced  by  a  suc- 
cession of  transformations  involving  each  only 
two  coordinates  we  have  proved  the  invariance  of 
the  operation  of  contraction  of  a  tensor  of  rank 
two  under  a  general  coordinate  transformation. 
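
The stepwise argument can be illustrated by a short computation. This is a minimal sketch, not the author's procedure: the helper names `plane_rotation`, `transform`, `contraction` and `rotate_in_steps` are hypothetical, and the law $a'_{ij} = r_{i\rho} r_{j\sigma} a_{\rho\sigma}$ is assumed as the coordinate form of the requirement that the bilinear form keep its value. Each step rotates two axes only, and the contraction is compared before and after.

```python
import math

def plane_rotation(n, p, q, angle):
    # n-by-n matrix of a rotation that involves only axes p and q (0-based).
    r = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    c, s = math.cos(angle), math.sin(angle)
    r[p][p], r[p][q] = c, s
    r[q][p], r[q][q] = -s, c
    return r

def transform(a, r):
    # Assumed transformation law of a rank-two tensor:
    # a'_ij = r_ip r_jq a_pq (summed over p and q).
    n = len(a)
    return [[sum(r[i][p] * r[j][q] * a[p][q]
                 for p in range(n) for q in range(n))
             for j in range(n)] for i in range(n)]

def contraction(a):
    # Sum of the components with equal indices.
    return sum(a[i][i] for i in range(len(a)))

def rotate_in_steps(a, steps):
    # Apply a succession of single rotations, each given as (p, q, angle).
    n = len(a)
    for p, q, angle in steps:
        a = transform(a, plane_rotation(n, p, q, angle))
    return a

# Example: three single rotations in three dimensions.
a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 10.0]]
b = rotate_in_steps(a, [(0, 1, 0.3), (1, 2, 1.1), (0, 1, -0.8)])
difference = abs(contraction(b) - contraction(a))
```

The same functions work unchanged for four axes, which is the point of treating a general transformation as a succession of single rotations.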

Following  the  same  principle  we  could  prove 
the  invariance  of  contraction  for  tensors  of  any 
rank  and  also,  using  the  fact  that  the  invari- 
ance of  the  operation  of  differentiation  has 
been  proved  for  two  dimensions,  prove  that  it 
has  an  intrinsic  meaning  in  three  dimensions. 

We  come  now  to  four  dimensions.  Here  it  is 
easy  to  prove  that  a  general  transformation  can 
be  effectuated  by  a  succession  of  six  single  ro- 
tations, i.e.,  rotations  involving  only  two  axes 
each;  in  fact,  a  rotation  in  the  xt-plane  will 
bring  the  x-axis  into  the  new  xyz-solid;  a  ro- 
tation in  the  yt-plane  will  bring  the  y-axis  in- 
to the  new  xyz-solid,  a  rotation  in  the  zt-plane 
will  bring  there  the  z-axis;  now  the  t-axis  co- 
incides with  the  new  t-axis  and  the  x,y,z  axes 
are  all  in  the  new  xyz-solid  and  can  be  brought 
into  coincidence  with  the  new  x,y,z  axes  by 
three  more  rotations  as  we  saw  before. 

The  reasoning  indicated  for  the  three-di- 
mensional case  will,  therefore,  prove  the  invar- 
iance of  the  fundamental  operations  of  tensor 
analysis  also  for  four  dimensions. 


10.  Complications  Resulting  From 
Imaginary  Coordinate. 

The  only  new  feature  of  the  new  geometry 
considered  in  the  preceding  sections  was  this 
that  we  have  four  coordinate  axes  instead  of 
three;  but  we  still  have  another  departure  from 
ordinary  geometry  (due  in  the  final  count  to  the 
"minus  sign"  in  the  Maxwell  equations),  viz., 




the fact that the fourth coordinate is
imaginary. The introduction of imaginaries helped us
to obtain a symmetrical form of Maxwell's
equations, and seems to be beneficial from this
point of view. The formal part of the theory
runs now smoothly; but there is a disadvantage
in this smoothness: it conceals very important
peculiarities,  and  the  present  section  will  be 
devoted  to  the  consideration  of  some  of  these 
peculiarities. 

The discussion may be conveniently attached
to the consideration of the expression (comp.
7.41)

$(x_1 - x'_1)^2 + (x_2 - x'_2)^2 + (x_3 - x'_3)^2 + (x_4 - x'_4)^2$

which defines the square of the distance
between the two points whose coordinates appear
in it; or the square of the vector joining
these two points.

Since in the above formula the quantities
$x_4$ and $x'_4$ and, therefore, their difference are
imaginary, the fourth square is negative, and
the expression may, according to the relative
magnitudes  of  the  terms,  be  positive,  negative, 
or  zero.  There  are  thus  three  types  of  rela- 
tive positions  of  two  points,  or  directions,  or 
of  vectors;  there  are  vectors  of  positive  square, 
those  of  negative  square,  and  those  of  zero 
square.  Our  geometry  is  thus  more  complicated 
than  what  we  would  expect  it  to  be  if  it  would 
differ  from  the  ordinary  geometry  in  the  num- 
ber of  dimensions  only.  This  complication,  or 
this  richness  of  our  geometry,  far  from  being 
an  undesirable  feature,  is,  as  we  shall  see,  an 
advantage,  because  it  corresponds  to  certain 
features  of  the  outside  world,  e.g.,  the  exist- 
ence of  both  matter  and  light  which  are  going 
to  be  identified  with  two  kinds  of  vectors.  At 
present we only mention that the momentum
vector of a material particle and the vector of
components u, v, w, i are vectors of negative
square; in fact, the square of the latter is
$u^2 + v^2 + w^2 - 1$; since the first three terms
represent the square of the velocity of the particle
which, according to the third remark preceding
formula 4.3, is very small compared with unity,
the expression is negative.

In  some  cases  it  may  be  desirable  to  sac- 
rifice the  formal  advantages  accruing  from  the 
use  of  the  imaginary  coordinate  in  order  to  put 
in  evidence  the  peculiarities  we  are  discuss- 
ing; it  is  permissible  to  go  back  then  to  the 
old  notations  x,y,z,t,  but  it  becomes  necessary 
to  modify  the  formulas  accordingly.  (We  shall 
see later how it is possible to use index
notations and still avoid imaginaries.) If we
denote two points by x, y, z, it and x', y', z', it',
instead of by $x_i$ and $x'_i$, the formula for the
square of the distance will be

10.1    $(x-x')^2 + (y-y')^2 + (z-z')^2 - (t-t')^2$,

and the formula for the scalar product (see 7.4)
of two vectors given by a, b, c, id and a', b', c',
id',

10.2    $aa' + bb' + cc' - dd'$.


We  may  say  that  the  minus  sign  appearing 
in  these  formulas  is  the  same  as  the  one  ap- 
pearing in  the  second  set  of  Maxwell's  equa- 
tions (4.21),  because  it  may  be  traced  back  to 
them. 

We shall occasionally refer to the
quantities $x_i$ and $u_i$ which carry indices and involve
the square root of -1 as mathematical
coordinates and components, and to the quantities x,
y, z, t and a, b, c, d as physical coordinates and
components. Four-dimensional space of the
character we are studying now, i.e., either
characterized by three real and one imaginary
coordinates, or with four real coordinates but with
scalar multiplication with a minus sign given
by formula 10.2, is often called four-dimensional
space-time because of the interpretation of
the quantities x, y, z and t in ordinary physics.

Without  going  into  detail  we  may  mention  a 
few  consequences  of  the  "minus  sign".  According 
to  our  definition,  two  vectors  are  considered 
perpendicular  if  their  scalar  product  vanishes. 
But now a scalar product of two equal vectors
may vanish, as happens, for instance, in the
case of two vectors each with physical
components 0,0,1,1. We must say then, that such a
vector is perpendicular to itself.
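
A small computation makes the three kinds of vectors concrete. The sketch below assumes the scalar product in physical components given by formula 10.2; the function names are hypothetical. It also forms the mathematical components by multiplying the fourth physical component by the square root of -1, so that the ordinary sum of squares can be compared with 10.2.

```python
def scalar_product(u, v):
    # Formula 10.2 in physical components: aa' + bb' + cc' - dd'.
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2] - u[3] * v[3]

def kind(v):
    # Classify a vector by the sign of its square.
    square = scalar_product(v, v)
    if square > 0:
        return "positive square"
    if square < 0:
        return "negative square"
    return "zero square"

def mathematical_components(v):
    # The fourth mathematical component is i times the physical one.
    return (v[0], v[1], v[2], 1j * v[3])

def square_from_mathematical(x):
    # Ordinary sum of squares, applied to the mathematical components.
    return sum(xi * xi for xi in x)

null_vector = (0, 0, 1, 1)          # perpendicular to itself
slow_particle = (0.1, 0.0, 0.0, 1)  # components u, v, w, 1 with small velocity
```

With these definitions `kind(null_vector)` gives "zero square" and `kind(slow_particle)` gives "negative square", in agreement with the remarks in the text.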

We  also  may  mention  that  corresponding  to 
the  existence  of  three  types  of  directions,  those 
which  correspond  to  vectors  of  positive  square, 
negative  square  and  zero  square,  there  are  three 
types  of  orientations  or  planes.  An  orientation 
may  best  be  characterized  by  the  number  of  zero 
directions  it  contains  and  it  is  easy  to  prove 
that  there  are  orientations  containing  two,  one 
or  no  zero  directions. 

As  the  result  of  existence  of  vectors  of 
negative  square  our  proof  that  the  cosine  of  the 
angle  between  two  vectors  as  defined  by  formula 
7.5  does  not  exceed  unity  in  absolute  value  is 
not  applicable  and  we  would  have  to  consider 
imaginary  angles  or  else  consider  the  cosine  as 
a  hyperbolic  cosine,  but  we  shall  not  go  into 
this  question. 

The  only  peculiarity  due  to  the  "minus 
sign"  other  than  the  existence  of  zero  square 
vectors  that  we  shall  have  to  use  in  what  fol- 
lows is  connected  with  transformation  of  coordi- 
nates. 

Formally our transformation formulas remain
the same; we may write, for instance,

$x'_3 = x_3 \cos\varphi - x_4 \sin\varphi$,
$x'_4 = x_3 \sin\varphi + x_4 \cos\varphi$;

but here $x_3$ is real and $x_4$ is imaginary and, of
course, we expect the new coordinates to be of
the same character so that $x'_3$ will be real and
$x'_4$ imaginary. Giving to $x_3$ and $x_4$ the values
1 and 0, respectively, we see that $\cos\varphi$ must
be real, and $\sin\varphi$ imaginary. We shall,
however, prefer not to use imaginary trigonometric
functions; in order to avoid them we introduce
a new notation as follows:

10.3    $\cos\varphi = \sigma$,   $\sin\varphi = i\tau$,

where $\sigma$ and $\tau$ are real quantities. The
identity $\cos^2\varphi + \sin^2\varphi = 1$ gives for $\sigma$ and $\tau$ the
relation

10.4    $\sigma^2 - \tau^2 = 1$.


If we prefer to use one number, rather than
two numbers connected by a relation, in
describing different transformations of coordinates in
the $x_3x_4$ plane we may again resort to
trigonometry and interpret $\sigma$ and $\tau$ as the secant and
tangent of a real angle $\psi$:

10.5    $\sigma = \sec\psi$,   $\tau = \tan\psi$.

It must be noted, however, that $\psi$ has, so to say,
no geometrical significance.

If in the above formulas of transformation
we substitute for $x_3$ and $x_4$ their expressions
in terms of z and t (4.7) and for $\cos\varphi$ and
$\sin\varphi$ their expressions (10.3) in terms of $\sigma$
and $\tau$ we obtain the following formulas of
transformation for the physical coordinates:

10.6    $z' = z\sigma + t\tau$,   $t' = z\tau + t\sigma$.

These formulas are called the Lorentz
transformation formulas and their physical
interpretation will be discussed in the next chapter.
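
The passage from the formal rotation to formulas 10.6 can be traced step by step on numbers. The sketch below rests on assumptions that should be kept in mind: it takes $x_3 = z$, $x_4 = it$ (as in 4.7), $\cos\varphi = \sigma$, $\sin\varphi = i\tau$, and the secant and tangent parametrization of 10.5; the function names and the parameter `psi` are hypothetical. It computes the new coordinates once directly from 10.6 and once through the rotation with imaginary quantities.

```python
import math

def sigma_tau(psi):
    # Parametrization 10.5: secant and tangent of a real angle.
    return 1.0 / math.cos(psi), math.tan(psi)

def lorentz(z, t, psi):
    # Formulas 10.6 for the physical coordinates.
    sigma, tau = sigma_tau(psi)
    return z * sigma + t * tau, z * tau + t * sigma

def via_imaginary_rotation(z, t, psi):
    # The same transformation written as a formal rotation in the x3 x4 plane.
    sigma, tau = sigma_tau(psi)
    cos_phi, sin_phi = sigma, 1j * tau   # real cosine, imaginary sine
    x3, x4 = z, 1j * t                   # mathematical coordinates
    x3_new = x3 * cos_phi - x4 * sin_phi
    x4_new = x3 * sin_phi + x4 * cos_phi
    # x3_new is real and x4_new is purely imaginary; read off z' and t'.
    return x3_new.real, x4_new.imag

def interval(z, t):
    # The part of expression 10.1 that involves z and t.
    return z * z - t * t
```

For example, `lorentz(2.0, 3.0, 0.4)` and `via_imaginary_rotation(2.0, 3.0, 0.4)` should agree up to rounding, and `interval` should take the same value before and after, because $\sigma^2 - \tau^2 = 1$.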

Concluding  this  section  we  may  mention  how 
our  axioms  of  Section  8  must  be  modified  in  or- 
der to  produce  a  geometry  with  the  desired  pe- 
culiarities. 

It  is  clear  that  our  axiom  V,  according  to 
which  a  non-zero  vector  has  positive  square  has 
to  be  modified.  The  proper  modification  is  the 
following: 

Axiom V'. There are orientations containing
two zero directions, but there are no
orientations containing more than two zero directions.

If we replace axiom V by this axiom and
keep the remaining axioms as they were stated
in Section 8 we obtain a geometry of the kind
desired. In order to show this, we first show
the existence of four mutually perpendicular
vectors three of which have squares equal to 1,
and the fourth one equal to -1. We begin by
picking out a plane with two zero square
vectors a and b; we assert that $a' = \tfrac{1}{2}(a + b)$ and
$b' = \tfrac{1}{2}(a - b)$ are two perpendicular vectors
with squares of opposite sign; in fact
$a'b' = \tfrac{1}{4}(a^2 - b^2)$ and this is zero because $a^2 = 0$ and
$b^2 = 0$; then, $a = a' + b'$; squaring this and
keeping in mind that $a'b' = 0$, we have
$a^2 = a'^2 + b'^2$; since $a^2 = 0$ it follows that
the squares of a' and b' are of opposite
sign. Dividing each of the vectors a' and b' by
the square root of the absolute value of its
square we obtain two mutually perpendicular
vectors whose squares are +1 and -1, which we may
call $k$ and $\ell$ respectively. It is easy now to
pick out two more vectors $i$ and $j$ which together
with $k$ and $\ell$ constitute a set of four
mutually perpendicular vectors; none of them can have
a zero square, because if, say, $i^2$ should be
zero all the vectors of the plane determined by
a and i would have zero squares, contrary to
axiom V'.
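
The construction of the two unit vectors from a pair of zero square vectors can be followed on an example. This is a sketch under assumptions: vectors are given by physical components with the scalar product of formula 10.2, the two zero square vectors are taken not proportional to each other (otherwise a division by zero occurs), and the function names are hypothetical.

```python
import math

def dot(u, v):
    # Scalar product 10.2 in physical components.
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2] - u[3] * v[3]

def unit_pair(a, b):
    # a and b: two zero square vectors, assumed not proportional.
    a_prime = tuple(0.5 * (p + q) for p, q in zip(a, b))
    b_prime = tuple(0.5 * (p - q) for p, q in zip(a, b))

    def normalized(v):
        # Divide by the square root of the absolute value of the square.
        norm = math.sqrt(abs(dot(v, v)))
        return tuple(p / norm for p in v)

    first, second = normalized(a_prime), normalized(b_prime)
    # Return the vector of square +1 first, the one of square -1 second.
    return (first, second) if dot(first, first) > 0 else (second, first)

# Example with two zero square vectors.
a = (1.0, 0.0, 0.0, 1.0)
b = (0.0, 1.0, 0.0, 1.0)
k, l = unit_pair(a, b)
```

Here `dot(k, k)` is +1, `dot(l, l)` is -1 and `dot(k, l)` vanishes, up to rounding, as the argument in the text asserts.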

Using these four vectors we can, as in the
other case, express every vector in the form
(comp. 8.2)

10.7    $x = \alpha i + \beta j + \gamma k + \delta \ell$.

The numbers $\alpha, \beta, \gamma, \delta$ will be considered as
the components of this vector. In order to
show that we have what we wanted we shall
express the square of x in terms of its components.
Squaring 10.7 and taking into account that
$\ell^2 = -1$ we obtain

$x^2 = \alpha^2 + \beta^2 + \gamma^2 - \delta^2$.

This shows that $\alpha, \beta, \gamma, \delta$ are what we call
physical components of the vector; the
mathematical components are obtained by setting

$x_1 = \alpha$,   $x_2 = \beta$,   $x_3 = \gamma$,   $x_4 = i\delta$,

and we see that we get the kind of geometry we
expect to use in physics.


11.  Are  the  Equations  of  Physics 
Invariant? 

We  return  now  to  physics.  In  Chapter  I  we 
arrived  at  certain  equations  that  we  consider 
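as fundamental; namely, the equation of
continuity (3.5)

11.1    $\partial(\rho u_\alpha)/\partial x_\alpha = 0$,

two sets of Maxwell's equations (4.61 and 4.62)

11.2    $\partial F_{ij}/\partial x_k + \partial F_{jk}/\partial x_i + \partial F_{ki}/\partial x_j = 0$,

11.3    $\partial F_{i\alpha}/\partial x_\alpha = \epsilon u_i$,

and the equations of motion (6.3)

11.4    $\partial T_{i\alpha}/\partial x_\alpha = 0$    (i = 1,2,3)

with (comp. 6.2, 5.1, 2.7)

11.5    $T_{ij} = E_{ij} - \dots$,   $\dots = F_{i\rho}F_{\rho j} \dots$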


The fact that the indices run here from
1 to 4 (except in 11.4) suggested
four-dimensional geometry, which we have introduced in
Sections 7 and 8; the fact that $x_4$ in the above
equations is imaginary (comp. 4.7) suggested
the peculiarities discussed in Section 10. Now
that we have followed these suggestions and
built a mathematical theory we have to see to
what results the application of our new theory
leads. In addition to following the
suggestions we have introduced into the theory a
feature that was not directly called for by
physics: we made our theory independent of
coordinates. In order to bring out the importance of
this  fact  let  us  consider  for  a  moment  the  case 
of  two  dimensions  and  compare  plane  geometry 
with  a  two-way  diagram.  Both  in  plane  analytic 
geometry  and  in  a  diagram  we  use  coordinate  ax- 
es, but  in  geometry  the  axes  play  an  auxiliary 
role,  we  find  it  convenient  to  express  by  refer- 
ring to  axes  properties  of  configurations  which 
exist  and  can  be  treated  independently  of  the 
axes;  the  same  properties  can  be  expressed  us- 
ing any  system  of  coordinate  axes.  The  situa- 
tion is  different  in  the  case  when  we  use  a 
plane  as  a  means  of  representing  a  functional 
dependence  between  two  quantities  of  different 
kind,  when  we  have  a  diagram.  We  may  use,  for 
instance,  the  two  axes  to  plot  temperature  and 
pressure,  or  the  height  of  an  individual  and 
the  number  of  individuals  of  that  height.   In 
the  majority  of  such  cases  the  axes  play  an  es- 
sential part  in  the  discussion;  if  we  delete 
the  axes  the  diagram  loses  its  meaning,  the 
question  of  rotation  of  axes  does  not  arise. 

Returning  to  physics  we  have  to  ask  our- 
selves what  we  actually  need  for  it,  a  diagram 
or  a  geometry;  in  other  words,  are  the  coordi- 
nate axes  essential  or  can  they  be  changed  at 
will,  or  again,  do  the  equations  of  physics  ex- 
press properties  independent  of  the  coordinate 
axes;  are  they  invariant,  or  not. 

In order to answer this question let us
first consider the formal structure of the
equations 11.1 to 11.5. The fundamental dependent
variables are here a scalar $\rho$, a vector $u_i$ and
an anti-symmetric tensor $F_{ij}$.

The left hand side of equation 11.1 may be
described as a result of multiplying the scalar
$\rho$ by the vector $u_i$, then differentiating the
resulting vector $\rho u_i$, then contracting the tensor
so obtained; since the operations of
multiplication, differentiation and contraction have been
shown to be invariant, the scalar $\partial(\rho u_\alpha)/\partial x_\alpha$ is
independent of the system of coordinates used,
and if it is zero in one system of coordinates it
is zero in all systems of coordinates. The
continuity equation expresses, therefore, a fact
independent of the system of coordinates
employed.

An  analogous  reasoning  applied  to  11.3 
would  show  the  invariant  character  of  that  sys- 
tem. The  question  of  invariance  of  the  first 




system of Maxwell's equations requires a
special discussion; it can best be treated by
introducing a new anti-symmetric tensor $D_{ij}$
connected with $F_{ij}$ by the following relations:

11.6    $F_{12} = D_{34}$,   $F_{23} = D_{14}$,   $F_{31} = D_{24}$,
        $F_{34} = D_{12}$,   $F_{14} = D_{23}$,   $F_{24} = D_{31}$.

Before we show how this is going to help us in
connection with our equations we want to prove
that these relations are independent of the
coordinate system; i.e., that if 11.6 hold,
relations of the same form, namely

11.6'   $F'_{12} = D'_{34}$,   etc.,


will  hold  in  any  other  coordinate  system.  Again, 
since  a  general  transformation  of  coordinates 
can  be  achieved  in  steps  it  will  be  enough  to 
test an $x_1x_2$ rotation only. As a result of such
a rotation $F'_{12}$ becomes (comp. 9.21)

$F'_{12} = -F_{11}cs + F_{12}c^2 - F_{21}s^2 + F_{22}sc$;

using the fact that $F_{ij}$ is anti-symmetric (4.4)
we find

11.7    $F'_{12} = F_{12}$


and since obviously $D'_{34} = D_{34}$ because the $x_3$, $x_4$
axes are not affected we see that the first of
the relations 11.6' follows from 11.6. In order
to find $F'_{23}$ we have, according to the general
rule following 9.31, to substitute $\delta_{2i}$ and $\delta_{3i}$
for $x'_i$ and $y'_i$ respectively in
$F'_{\rho\sigma}x'_\rho y'_\sigma = F_{\rho\sigma}x_\rho y_\sigma$. As the corresponding
values of $x_i$ and $y_i$ we find with the aid of 9.2,
considering that $x'_3 = x_3$, $x'_4 = x_4$,

$x_1 = -s$,   $x_2 = c$,   $x_3 = 0$,   $x_4 = 0$,
$y_1 = 0$,   $y_2 = 0$,   $y_3 = 1$,   $y_4 = 0$;

so that

11.71   $F'_{23} = -sF_{13} + cF_{23}$;

in a similar way we obtain

11.72   $D'_{14} = D_{14}c + D_{24}s$;


taking again into account the anti-symmetric
property of F we come to the conclusion that
the second relation of 11.6' is a consequence of
11.6, and since the same reasoning applies to
the remaining relations we conclude that the
relations 11.6 are independent of the coordinate
system; it is easy to see that they assign to
every tensor $F_{ij}$ a tensor $D_{ij}$ (the tensor
$D_{ij}$, or, rather, $\sqrt{-1}\,D_{ij}$ is often
referred to as the dual of $F_{ij}$). Now if we
express, using 11.6, the $F_{ij}$ in 11.2 in terms
of the $D_{ij}$, that set becomes

11.2'

and its invariance follows from general
considerations as in the case of 11.3.
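
The invariance of the relations 11.6 under an x1x2 rotation can
also be checked numerically. The following is only a sketch in
Python and not part of the original notes: the function names and
the sample components are ours, the six index pairs are our
reading of 11.6, and the rotation convention is the one read off
from 11.71. Real numbers are used for all components, which is
enough here because the check is linear in F.

```python
import math

# Index pairs (1-based), our reading of relations 11.6: D_key = F_value.
RELATIONS = {
    (3, 4): (1, 2), (1, 4): (2, 3), (2, 4): (3, 1),
    (2, 3): (1, 4), (3, 1): (2, 4), (1, 2): (3, 4),
}

def antisymmetric(f12, f13, f14, f23, f24, f34):
    """Return a 4x4 anti-symmetric array from its six independent components."""
    upper = {(0, 1): f12, (0, 2): f13, (0, 3): f14,
             (1, 2): f23, (1, 3): f24, (2, 3): f34}
    T = [[0.0] * 4 for _ in range(4)]
    for (i, j), value in upper.items():
        T[i][j] = value
        T[j][i] = -value
    return T

def associated_tensor(F):
    """Build D from F by the six relations; mirrored entries follow by anti-symmetry."""
    D = [[0.0] * 4 for _ in range(4)]
    for (i, j), (k, l) in RELATIONS.items():
        D[i - 1][j - 1] = F[k - 1][l - 1]
        D[j - 1][i - 1] = -F[k - 1][l - 1]
    return D

def rotate_x1x2(T, angle):
    """Transform a tensor of rank two under a rotation in the x1x2 plane."""
    c, s = math.cos(angle), math.sin(angle)
    a = [[c, s, 0, 0], [-s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
    return [[sum(a[i][k] * a[j][l] * T[k][l] for k in range(4) for l in range(4))
             for j in range(4)] for i in range(4)]

def max_difference(P, Q):
    """Largest absolute difference between corresponding components."""
    return max(abs(P[i][j] - Q[i][j]) for i in range(4) for j in range(4))

F = antisymmetric(0.3, -1.2, 0.7, 2.0, -0.5, 1.1)   # arbitrary sample values
angle = 0.7                                          # arbitrary rotation angle
D_from_rotated_F = associated_tensor(rotate_x1x2(F, angle))
rotated_D = rotate_x1x2(associated_tensor(F), angle)
print(max_difference(D_from_rotated_F, rotated_D))   # rounding-level number
```

Forming D after the rotation and rotating D itself give the same
components up to rounding, which is what the independence of the
relations from the coordinate system means.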

Formula 11.5 contains only multiplications,
contractions and additions, so that there is no
doubt concerning its invariance, but the
situation changes when we come to the set 11.4. The
vector $\partial T_{i\alpha}/\partial x_\alpha$ has been
obtained by invariant operations but 11.4 states
that only three of its components are zero, a
statement which obviously depends on the choice
of coordinate axes and is not invariant.

We  have  now  two  courses  open  before  us: 
one  is  that  of  resignation,  we  can  say:  we  see 
that  physics  is  not  like  geometry  in  this  re- 
spect, that  we  can  only  use  four-dimensional 
notations,  a  four-dimensional  diagram  but  not 
four-dimensional  geometry;  the  other  course  is 
that  of  adventure,  we  may  try  to  play  the  game 
of  geometry;  let  us  pretend  that  we  can  apply 
the  formulas  of  transformation  of  coordinates 
in  this  case;  we  know  that  there  will  be  a  dif- 
ference between  the  theory  we  obtain  and  the 
physics  which  we  undertook  to  translate  into 
our  language;  but  it  may  be  that  the  difference 
will amount numerically to very little. Consider
the fourth component of the vector
$\partial T_{i\alpha}/\partial x_\alpha$; we found (comp.
remark following 6.3) that one of the terms of
this expression, $\partial M_{4\alpha}/\partial x_\alpha$,
vanishes, and the other,
$\partial E_{4\alpha}/\partial x_\alpha$, gives Xu+Yv+Zw,
where  u,v,w  are  the  components  of  velocity,  but 
in  order  to  present  the  Maxwell  equations  in  a 
simple  form  we  had  to  choose  our  units  in  such 
a  way  that  the  velocity  of  light  is  unity;  or- 
dinary velocities  are  of  the  order  of  magnitude 
of  one  ten-millionth  of  the  velocity  of  light, 
so that we see that by setting the fourth
component of $\partial T_{i\alpha}/\partial x_\alpha$ equal
to zero we would commit an error that is
numerically very small.
This  encourages  us  to  go  on  with  our  adventure 
and  try  to  force  the  geometrical  character  on 
physics.  In  order  to  do  that  let  us  go  beyond 
the  formal  structure  of  our  formulas  and  recall 
what  the  meaning  of  our  fundamental  quantities 
was. The components of the vector $u_i$ were
given (see 1.2, 2.4, 4.9) as

11.8    $u_1 = dx/dt$,  $u_2 = dy/dt$,  $u_3 = dz/dt$,
        $u_4 = i$.

But  this  identification  is  obviously  not  inde- 
pendent of  the  coordinate  system,  it  gives  pref- 
erence to  the  fourth  coordinate.   We  may  think 
that  this  is  the  source  of  our  difficulty,  and 
that  this  difficulty  may  be  overcome  if  we  find 
an  invariant  identification  to  take  the  place 
of  11.8.  The  next  section  will  prepare  the  way 
for  this. 


12.  Curves  in  the  New  Geometry. 

The  root  of  the  difficulty  is  that  our  de- 
scription of  motion  was  not  invariant;  motion 
was  described  by  giving  the  dependence  of  the 
coordinates  x,y,z  on  time,  that  means  by  giving 
three of our coordinates $x_1$, $x_2$, $x_3$ as
functions of the fourth, $x_4$, which thus is given
preference. The situation is analogous to that in
plane  analytic  geometry  where  we  give  y  as  a 
function  of  x,  or  that  in  solid  analytic  geomet- 
ry when  we  give  y  and  z  as  functions  of  x;   in 
both  cases  we  represent  curves;  from  our  four- 
dimensional  point  of  view  we  should  then  consid- 
er motion  of  a  particle  as  a  curve  in  four  di- 
mensions (using  the  word  curve  in  a  general 
sense  so  that  straight  line  is  a  special  case) . 

What  we  want  then  is  a  representation  of 
curves  in  four  dimensions  which  would  not  give 
preference  to  the  fourth  coordinate.  We  begin 
by  considering  representations  of  curves  in  two 
and  three  dimensions  which  give  no  preference  to 
one  coordinate.  In  the  plane  a  line  may  be  rep- 
resented by 


x  =  ap  +  b, 


y  =  cp  +  d, 


a  circle  by 


x  =  r  cos  p,    y  =  r  sin  p; 
in  space  a  line  by 

x  =  ap  +  b,     y  =  cp  +  d,     z  =  ep  +  f , 
a  helix  by 

x  =  r  cos  p,    y  =  r  sin  p,   z  =  kp,  etc. 

In  all  these  cases  to  every  value  of  the 
"parameter"  p  corresponds  a  point  of  the  curve; 
in general, if we set

x = f(p),    y = g(p),    z = h(p)

we have what we call a parametric representation
of a curve (comp. parametric form of equation of
straight line in 7.1). In the same way we may
represent a curve in four dimensions, which we
take to mean motion of a particle, by giving $x_i$
as functions of a parameter p.

The  defect  of  this  method  is  that  it  con- 
tains a  certain  arbitrariness;  we  may  substitute 
for  p  another  parameter  q  by  making  p  an  arbi- 
trary increasing  function  of  q.  We  want  now  to 
standardize  our  parametric  representation.   The 
usual  way  is  to  choose  the  arc  length  along  the 
curve  as  the  parameter.  Without  going  into  de- 
tail we  shall  state  that  arc  length  between 
points corresponding to values $p_1$ and $p_2$ of the
parameter is given by

12.1    $s = \int_{p_1}^{p_2} \sqrt{(dx/dp)^2 + (dy/dp)^2 + (dz/dp)^2}\,dp$;

in the special case when s is used as parameter
p we differentiate both sides and obtain

12.2    $(dx/dp)^2 + (dy/dp)^2 + (dz/dp)^2 = 1$.

We may consider in general dx/dp, dy/dp, dz/dp
as the components of a vector tangent to the
curve; the change of parameter would multiply
these derivatives by the same number, i.e.,
substitute another tangent vector for that one; the
quantity $(dx/dp)^2 + (dy/dp)^2 + (dz/dp)^2$ gives
the square of the length of the tangent vector;
the above equality 12.2 expresses then the
fact that if we use arc length as the parameter
the length of the tangent vector whose
components are the derivatives of the coordinates
with respect to the parameter is unity.

We  come  thus  to  the  idea  of  a  unit  tangent 
vector;  it  characterizes  in  every  point  the  di- 
rection of  the  curve;  its  components  are  the  di- 
rection cosines  of  the  tangent. 
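
The statement that arc length as parameter gives a tangent vector
of length one may be tried on the helix mentioned above. The
following is a minimal sketch in Python, not taken from the notes;
the values of r and k, the function names, and the use of a
central difference in place of the exact derivative are our own
choices.

```python
import math

R, K = 2.0, 0.5            # hypothetical radius r and pitch constant k
SPEED = math.hypot(R, K)   # length of (dx/dp, dy/dp, dz/dp), constant for the helix

def helix(p):
    """Helix x = r cos p, y = r sin p, z = kp."""
    return (R * math.cos(p), R * math.sin(p), K * p)

def helix_by_arc_length(s):
    """Same curve with arc length s as parameter: s = p * sqrt(r^2 + k^2)."""
    return helix(s / SPEED)

def tangent(curve, t, h=1e-6):
    """Central-difference approximation to the derivative of the coordinates."""
    a, b = curve(t - h), curve(t + h)
    return tuple((bi - ai) / (2 * h) for ai, bi in zip(a, b))

def length(vector):
    return math.sqrt(sum(component * component for component in vector))

print(length(tangent(helix, 1.3)))                # close to SPEED
print(length(tangent(helix_by_arc_length, 1.3)))  # close to 1
```

With the original parameter p the tangent vector has length
sqrt(r^2 + k^2); after the change of parameter its length is
unity, and its components are the direction cosines of the
tangent.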

We  may  try  to  go  through  an  analogous  proc- 
ess in  the  case  of  curves  in  four-dimensional 
space,  which  as  we  saw  may  be  taken  to  repre- 
sent motions;  if  we  succeed,  the  vector  at  which 
we  arrive  will  suggest  itself  as  a  natural  thing 
to identify with the vector of components $u_i$
which appears in our formulas. Starting with
any parametric representation $x_i = x_i(p)$, where
p may be for example t, we try to change our
parameter by introducing a new variable q and
making p a function of q, choosing this
function in such a way that

$(dx_1/dq)^2 + (dx_2/dq)^2 + (dx_3/dq)^2 + (dx_4/dq)^2 = 1$;

but $dx_i/dq = (dx_i/dp)\cdot(dp/dq)$; so that the
function p(q) must be such that

$dp/dq = 1/\sqrt{(dx_1/dp)^2 + (dx_2/dp)^2 + (dx_3/dp)^2 + (dx_4/dp)^2}$.


Is this possible? In the case when the original
p is t the expression under the radical
sign will be $u^2 + v^2 + w^2 - 1$, and for motions
whose velocities are smaller than the velocity
of light (Section 4) this is negative, so that
we would get an imaginary value for dp/dq. In
order to avoid this unpleasantness we decide to
standardize our parameter by requiring
$(dx_\alpha/dq)\cdot(dx_\alpha/dq)$ to be -1 instead of 1; in
this case we find for dq/dt the expression
$\sqrt{1 - (u^2 + v^2 + w^2)}$ and we may write

12.3    $dx_1/dq = u/\sqrt{1-\beta^2}$,  $dx_2/dq = v/\sqrt{1-\beta^2}$,
        $dx_3/dq = w/\sqrt{1-\beta^2}$,  $dx_4/dq = i/\sqrt{1-\beta^2}$,

where $\beta$ stands for $\sqrt{u^2 + v^2 + w^2}$, i.e., for
what we call speed (the length of the velocity
vector). The quantities just written out we
want to identify with the components of the
vector $u_i$ which appears in our formulas. Since
in ordinary cases $\beta$ is very small, the radical
$\sqrt{1 - \beta^2}$ is very near to unity and our new
identification differs from our old identification
(11.8) numerically very little. On the other
hand the new values for $u_i$ are according to
their derivation the components of a vector, so
that if we adopt this identification and also
agree to set the fourth component of the
divergence of the tensor $T_{ij}$ equal to zero we obtain
an invariant theory whose statements differ
only very slightly from those accepted in
classical physics. It remains to be seen whether
there are cases in which the discrepancy is
large enough to be tested by experiment.
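
The two properties of the new identification used above, that the
square of the vector is -1 and that for ordinary velocities it
hardly differs from the old identification, may be illustrated by
a short computation. This is a sketch in Python under our own
assumptions: units in which the velocity of light is 1, the fourth
component of the old identification taken as i = dx_4/dt, and
function names and sample velocities that are not in the notes.

```python
import math

def four_velocity(u, v, w):
    """Components of 12.3, in units in which the velocity of light is 1."""
    beta_squared = u * u + v * v + w * w
    if beta_squared >= 1.0:
        raise ValueError("speed must be smaller than the velocity of light")
    root = math.sqrt(1.0 - beta_squared)
    return (u / root, v / root, w / root, 1j / root)

def old_identification(u, v, w):
    """Old identification: the three velocity components and i (assumed)."""
    return (u, v, w, 1j)

def square(vector):
    """Sum of the squares of the components (no complex conjugation)."""
    return sum(component * component for component in vector)

fast = four_velocity(0.3, 0.4, 0.5)          # a large, purely illustrative velocity
slow_new = four_velocity(1e-7, 0.0, 0.0)     # an "ordinary" velocity
slow_old = old_identification(1e-7, 0.0, 0.0)
largest_change = max(abs(n - o) for n, o in zip(slow_new, slow_old))
print(square(fast), largest_change)
```

The square comes out as -1 whatever admissible velocity is used,
while for a speed of one ten-millionth of the velocity of light
the components change only in about the fifteenth decimal place.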




Chapter  III. 
SPECIAL  RELATIVITY 


Guided  by  the  point  of  view  that  the  form- 
ulas of  physics  ought  to  be  interpreted  in  four- 
dimensional  geometry  we  were  led  to  the  inter- 
pretation of  the  motion  of  a  particle  as  a  curve 
in  space-time.  Following  the  analogy  with  a 
curve  in  ordinary  space  where  arc  length,  s,  is 
often  used  as  a  parameter,  we  have  introduced  a 
standard parameter, which we may also call arc
length and denote by s, for curves in
space-time. The derivatives $dx_i/ds$ of the
coordinates of a point on the curve with respect
to s may be considered as the components of a
vector tangent to the curve (the square of this
vector is -1 in every point - we shall refer to
such a vector also as a unit vector). We have
then at every point $x_1$, $x_2$, $x_3$, $x_4$ of such a
curve a unit vector $dx_i/ds$, and we have agreed
to identify this vector with the vector $u_i$ which
appears in our fundamental laws of physics (11.1
to 11.5) so that

$u_i = dx_i/ds$.

In this chapter we want to consider some
consequences of this identification.


13.  Equations  of  Motion.

The  one  thing  that  was  not  satisfactory 
about  the  formulas  of  physics  was  the  fact  that 
according to 11.4 only three components of the
vector $\partial T_{i\alpha}/\partial x_\alpha$ are equal to zero. In this
section  we  shall  see  how  this  defect  is  cor- 
rected by  the  adoption  of  the  new  identification. 
But  before  we  do  that  we  have  to  study  some  im- 
mediate consequences  of  this  identification. 

Before,  a  motion  of  a  particle  was  given 
by  giving  the  position  of  the  particle  in  dif- 
ferent moments  of  time,  i.e.,  the  coordinates 
x,y,z  as  functions  of  t.  Given  these  functions 
we  can  calculate  for  every  moment  the  velocity 
vector  of  the  particle  -  a  vector  of  components 

u  =  dx/dt,     v  =  dy/dt,    w  =  dz/dt. 

Now, the same motion is described by giving
$x_1$, $x_2$, $x_3$, $x_4$ as functions of s; and we have
a vector of components $u_i = dx_i/ds$ which, due
to the special choice of the parameter, satisfies
the equation

13.1    $u_1^2 + u_2^2 + u_3^2 + u_4^2 = -1$.


Of course, we have merely two representations
of the same thing. Given the $x_i(s)$ we can
express s as a function of t from

$x_4(s) = it$

and substituting the expression of s so found
into $x_1(s)$, $x_2(s)$, $x_3(s)$ we will have x, y, z
as functions of t. Or given x, y, z as functions
of t we can arrive at the representation
$x_i(s)$ as indicated in Section 12.

Also the space-time vector $u_i$ and the
space vector u, v, w describe the same thing.
The formulas 12.3 show how to find the
components $u_i$ in terms of the velocity vector u,
v, w, and it is easy to find the u, v, w in
terms of components $u_i$. We simply have

$u = dx/dt = (dx/ds)/(dt/ds) = i\,(dx_1/ds)/(dx_4/ds) = i\,u_1/u_4$

and in a similar fashion

13.2    $v = i\,u_2/u_4$,    $w = i\,u_3/u_4$.

We see thus that the vector $u_i$ determines the
velocity of motion, and we agree to call it the
four-dimensional velocity vector. On the other
hand, being a unit vector this vector
characterizes the direction in four dimensions of the
curve representing motion; its components $u_i$
may be considered as the direction cosines of
the tangent (compare Section 7, between
formulas 7.41 and 7.5).

But a velocity vector does not characterize
the motion of a particle completely; it gives
only the kinematical characterization; in
dynamics we need, in addition, to know the mass of
the particle, and then we form the momentum
vector (compare beginning of Section 1) whose
components are mu, mv, mw. By analogy we form
the expressions $mu_i$ or $\rho u_i$ (depending on
whether we use the discrete or the continuous
picture of matter) and consider them as the
components of the four-dimensional momentum vector.
Using the formulas 12.3 we have for its
components

13.3    $mu_1 = mu/\sqrt{1-\beta^2}$,  $mu_2 = mv/\sqrt{1-\beta^2}$,
        $mu_3 = mw/\sqrt{1-\beta^2}$,  $mu_4 = im/\sqrt{1-\beta^2}$.

These are what we call the mathematical
components of the momentum vector; its "physical
components" are

13.31   $a = mu/\sqrt{1-\beta^2}$,  $b = mv/\sqrt{1-\beta^2}$,
        $c = mw/\sqrt{1-\beta^2}$,  $d = m/\sqrt{1-\beta^2}$.


We  obtain  a  relation  between  the  momentum 
components  and  mass  if  we  take  the  sum  of  the 
squares  of  the  components  13.3  or  13.31  and  use 
13.1,  viz., 

$a^2 + b^2 + c^2 - d^2 = -m^2$;

in words, the negative square of mass is the
square  of  the  momentum  vector,  so  that  mass  is 
essentially  given  by  the  length  of  the  momen- 
tum vector;  we  see  here  another  advantage  of  the 
four-dimensional  representation:  the  four  dy- 
namical quantities  of  a  particle  which  in  clas- 
sical physics  are  given  by  the  three  momentum 
components  and  mass  are  here  represented  more 
naturally  by  the  four  components  of  a  vector. 

As stated many times before, numerically
$\sqrt{1-\beta^2}$ is in most applications very close to
unity so that approximately the first three
components of the four-dimensional momentum
vector are equal to the components of the
three-dimensional momentum vector and the last
(physical) component of the four-dimensional
momentum vector is, in first approximation, equal to
mass.

Let  us  consider  more  in  detail  this  fourth 
component  of  the  momentum  vector.  If  we  want  a 
better approximation we develop the last of 13.31
according to powers of $\beta$ and keep only two terms;
we have thus the approximate equality

13.4    $d = m + \tfrac{1}{2}m\beta^2$;

the correction represented by the second term is
nothing but kinetic energy; of course, if
ordinary units are used this term has to be written

13.41   $\tfrac{1}{2}mV^2/c^2$


where  c  is  the  velocity  of  light,  and  V  is  the 
velocity  of  the  particle  measured  in  the  same 
units, because $\beta$ is the ratio of the velocity of
the  particle  to  that  of  light.  We  had  better 
say  then  that  the  correction  is  kinetic  energy 
divided  by  the  square  of  the  velocity  of  light. 
Sometimes  this  fact  is  expressed  by  saying  (neg- 
lecting the  other  terms,  which  are  very,  very 
small)  that  when  a  body  is  in  motion  its  mass  is 
increased  by  its  kinetic  energy  (divided  by  the 
square  of  the  velocity  of  light) . 
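
How good the two-term approximation is can be seen from a small
computation. The following Python sketch is ours and not from the
notes; units are those in which the velocity of light is 1, the
function names are hypothetical, and the sample values of the
speed are chosen only for illustration.

```python
import math

def fourth_component(m, beta):
    """Fourth physical component of the momentum vector, m / sqrt(1 - beta^2)."""
    return m / math.sqrt(1.0 - beta ** 2)

def two_term_approximation(m, beta):
    """Mass plus kinetic energy, the first two terms of the development."""
    return m + 0.5 * m * beta ** 2

for beta in (0.01, 0.1, 0.5):
    exact = fourth_component(1.0, beta)
    approx = two_term_approximation(1.0, beta)
    # The next term of the development is (3/8) m beta^4.
    print(beta, exact, approx, exact - approx, 0.375 * beta ** 4)
```

For small speeds the neglected part is very nearly (3/8) m beta^4,
the next term of the development, so that it falls off with the
fourth power of the speed.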

The  interest  of  this  lies  in  the  close  re- 
lationship which  is  thus  established  between 
mass  and  energy  -  a  relationship  that  plays  a 
prominent  part  in  present  physics.     

Sometimes the whole expression $m/\sqrt{1-\beta^2}$
is referred to as energy of the particle; mass,
from this point of view, appears then as part of
the energy, that part that the particle
possesses even when it is at rest; in other words, mass
appears as the rest-energy of the particle. We
could also call $m/\sqrt{1-\beta^2}$ generalized mass and
say that mass changes as a result of motion
(compare end of this section).




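We are ready now to discuss the equations
of motion 11.4 or


The left hand sides may be written as

$\partial(\rho u_\alpha)/\partial x_\alpha \cdot u_i + \rho u_\alpha \cdot \partial u_i/\partial x_\alpha$;

the first factor of the first term vanishes
according to the continuity equation 11.1; the
second term may be written, recalling the
definition of $u_i$, as

$\rho\,u_\alpha\,\partial u_i/\partial x_\alpha = \rho \cdot \partial u_i/\partial x_\alpha \cdot dx_\alpha/ds = \rho \cdot du_i/ds$;

the right hand side of 13.5, according to our
former calculation (Section 6), is
$\varepsilon F_{i\alpha}u_\alpha$, so that the equations of
motion become

$\rho \cdot du_i/ds = \varepsilon F_{i\alpha}u_\alpha$

or, if we use the discrete picture, considering
both mass and electric density to be concentrated
in one point, and denoting mass by m, electric
charge by e,

13.51   $m \cdot du_i/ds = eF_{i\alpha}u_\alpha$.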


These  are  the  equations  we  are  going  to 
discuss.  In  applying  them  to  physics  we  give 
preference  to  time  by  writing 


13.52   $m \cdot du_i/dt = eF_{i\alpha} \cdot dx_\alpha/dt$

which spoils the invariant form but does not
change the contents of the statement because
the transition from 13.51 to 13.52 is
equivalent to multiplication by ds/dt. Using 4.72
(or 6.1) the last equations become

13.53   $m \cdot du_1/dt = e(X + Nv - Mw)$,
        $m \cdot du_2/dt = e(Y + Lw - Nu)$,
        $m \cdot du_3/dt = e(Z + Mu - Lv)$,
        $m \cdot du_4/dt = ie(Xu + Yv + Zw)$.

Multiplying the left hand side of the first
of these equations by $i\,u_1/u_4$ and the right
hand side by u (compare 13.2); using in the
same way $i\,u_2/u_4 = v$ on the second, and
$i\,u_3/u_4 = w$ on the third, and adding the
results we get the fourth equation because the
left-hand side comes out

$(im/u_4)\cdot(u_1 \cdot du_1/dt + u_2 \cdot du_2/dt + u_3 \cdot du_3/dt)$

and differentiating the identity 13.1, we find
that the second factor is $-u_4 \cdot du_4/dt$. The fourth
equation  is  thus  a  consequence  of  the  first 
three,  a  great  improvement  over  the  situation 
as  it  was  before  the  new  identification.   The 
fourth  equation  also  has  a  definite  physical 


meaning  now;  the  left  hand  side  may  be  said  to 
represent  the  time  rate  of  change  of  energy 
(since the variable part of $mu_4$ has been
recognized as kinetic energy), and the right hand
side  has  been  recognized  before  (Section  6)  as 
the  rate  at  which  the  energy  of  the  field  (po- 
tential energy)  is  being  expended  in  moving  the 
body.   The  difficulty  with  the  fourth  equation 
has  thus  been  settled  in  a  most  satisfactory 
fashion  but  the  system  as  a  whole,  or  the  first 
three  equations,  have  to  be  tested  by  experi- 
ment (the  fourth,  being  a  consequence  of  the 
first  three,  cannot  be  wrong  if  these  three 
will  be  proved  to  be  "true"). 
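
That the right hand side of the fourth equation is already
contained in the first three can be checked with numbers. The
sketch below, in Python, is ours: the assignment of X, Y, Z, L, M,
N to the components of the anti-symmetric field tensor is read off
from the right hand sides of 13.53 (we assume it agrees with 4.72,
which is not reproduced here), and the field values, the charge
and the velocity are arbitrary sample numbers.

```python
def field_tensor(X, Y, Z, L, M, N):
    """Anti-symmetric array F whose rows reproduce the right hand sides of 13.53."""
    i = 1j
    return [
        [0,      N,     -M,     -i * X],
        [-N,     0,      L,     -i * Y],
        [M,     -L,      0,     -i * Z],
        [i * X,  i * Y,  i * Z,  0],
    ]

def right_hand_sides(F, e, velocity):
    """e * F_(i alpha) * dx_alpha/dt, with dx_4/dt = i."""
    u, v, w = velocity
    dx_dt = (u, v, w, 1j)
    return [e * sum(F[row][a] * dx_dt[a] for a in range(4)) for row in range(4)]

X, Y, Z = 0.4, -1.0, 2.5        # sample electric components
L, M, N = 1.5, 0.2, -0.7        # sample magnetic components
e, velocity = 1.3, (0.1, -0.2, 0.3)
u, v, w = velocity

rhs = right_hand_sides(F=field_tensor(X, Y, Z, L, M, N), e=e, velocity=velocity)
combined = u * rhs[0] + v * rhs[1] + w * rhs[2]   # first three, multiplied and added
print(combined, rhs[3] / 1j)                      # the two numbers agree
```

The magnetic terms cancel in the sum, and what remains is
e(Xu + Yv + Zw), i.e., the right hand side of the fourth equation
divided by i.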
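Since

$du_i/dt = \frac{d}{dt}(dx_i/ds) = dt/ds \cdot \frac{d}{dt}(dx_i/dt)$

(the factor dt/ds being treated here as constant,
which is exact when the speed does not change)
we may write our equations as

13.6    $m' \cdot d^2x/dt^2 = e(X + Nv - Mw)$,  $m' \cdot d^2y/dt^2 = e(Y + Lw - Nu)$,
        $m' \cdot d^2z/dt^2 = e(Z + Mu - Lv)$,

where

13.7    $m' = m/\sqrt{1-\beta^2}$.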


The  right  hand  sides  of  these  equations  (as 
stated  in  Section  6)  are  the  components  of  the 
force  exerted  by  the  electromagnetic  field  on 
the  particle.  Comparing  the  left  hand  sides  with 
the  classical  expressions  we  see  then  that  the 
correction  resulting  from  our  identification  is 
equivalent to the substitution of m' for m in
the  classical  equations  of  motion.  We  may  say 
then  that  our  theory  predicts  that  motion  will 
be  governed  by  the  old  equations  in  which  mass 
has  been  replaced  by  a  corrected  mass  the  cor- 
rection being  the  kinetic  energy  (divided  by  the 
square  of  the  velocity  of  light).   In  the  vast 
majority of cases the factor $1/\sqrt{1-\beta^2}$ is very
close  to  one,  but  there  are  a  few  cases  where  it 
is  not,  and  these  cases  afford  an  opportunity  to 
test  the  new  theory  and  to  see  whether  it  or  the 
old  one  is  better  adapted  to  give  account  of  ex- 
perimental results.  In  experiments  with  "ca- 
thode ray  particles"  by  Bucherer  the  predictions 
of  the  new  theory  seem  to  have  been  verified. 


14.  Lorentz  Transformations. 

Now  that  we  saw  that  the  new  identification 
removes  the  difficulty  in  connection  with  the 
fourth  equation  of  motion  we  want  to  consider 
some  other  consequences,  and  in  the  first  place 
we  want  to  give  a  discussion  of  the  physical 
significance  of  the  transformation  of  coordi- 
nates promised  in  Section  10.  The  new  feature 


about our coordinate systems is the greater
arbitrariness in their choice. Before we were
free  to  pass  from  one  system  to  another  with 
the  same  time  axis;  now  we  may  change  the  time 
axis  also  (formulas  10.6),  and  we  want  to  see 
what  it  means. 

In general, in one system of coordinates a
geometrical configuration is described by
certain numbers (e.g., coordinates of its points)
and certain equations (e.g., equations of its
straight lines); in another system of
coordinates the same configuration will be
characterized by other numbers and other equations, but
it will be another description of the same
configuration; or, we may say, another
identification which is theoretically just as good as the
first, but may be more, or less, convenient for
practical purposes. In general we make our
choice of a coordinate system guided by the
properties of the object we are studying and our own
position in space. If we study an ellipsoid we
would  choose  for  coordinate  axes  its  principal 
axes;  or  in  another  case  we  would  choose  the  di- 
rection away  from  us  as  the  y-axis,  the  direc- 
tion to  the  right  as  the  x-axis,  the  vertical 
direction  as  the  z-axis;  but  in  principle  all 
axes  are  permitted.  The  same  general  situation 
obtains  in  physics,  considered  as  four-dimen- 
sional geometry.  We  have  many  systems  of  coor- 
dinate axes  at  our  disposal,  and  we  want  to  in- 
vestigate now  what  use  we  can  make  of  this  ar- 
bitrariness, how  we  can  adjust  the  choice  of  ax- 
es to  the  requirements  of  a  particular  situa- 
tion. In  particular,  we  are  interested  in  the 
choice  of  the  t-axis. 

The  object  we  want  to  study  in  the  first 
place is the motion of a particle. We
represent such a motion by a curve in four-space (and
a straight line we consider as a special case of
a curve). At every point of that curve we have
a unit tangent vector, the four-dimensional
velocity vector of components a, b, c, d; or we may
characterize it by the three-dimensional
velocity components u, v, w; if we pass to another
coordinate system the components a, b, c, d will be
changed, and so will the components u, v, w. If
the coordinate transformation affects only the
space coordinates x, y, z then the component d will
not be affected, and therefore $\beta$ will not change;
in other words, u, v, w will be changed but not
$u^2 + v^2 + w^2$; the velocity vector will have
different components, but its absolute value, the
speed, will be the same. This is essentially as
in  old  physics;  the  new  feature  is  in  the  exist- 
ence of  transformations  affecting  t,  and  the 
most  striking  result  of  it  is  expressed  in  the 
following  theorem. 

Theorem.  For  every  motion  it  is  possible 
for  every  moment  to  choose  a  system  of  space- 
time  coordinates  in  such  a  way  that  the  speed 
be  zero. 

Proof. Begin with any axes; then, without
changing the time axis, change the space axes
so that the motion, at the moment considered,
takes place along the z-axis; we have then
a = b = 0; now consider a transformation
involving z and t; denoting the new components of
the four-dimensional velocity vector a', b', c',
d' we shall have (10.6)

$a' = 0$,  $b' = 0$,  $c' = c\sigma + d\tau$,  $d' = c\tau + d\sigma$;

if we want to make c' = 0 we have to choose the
angle $\psi$ so that

$\tau/\sigma = -c/d$;

if c is in absolute value less than d, and this
is so for all motions of bodies so far observed,
an angle satisfying this relation and therefore
a system of coordinates for which a = b = c = 0
can be found. From formulas 13.2 it follows
that in such a system u = v = w = 0, and the
theorem is proved.

The  theorem  just  proved  is  expressed  often 
by  saying  that  every  particle  can  be  transformed 
to  rest. 
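
The choice of the angle in the proof can be followed through on
numbers. The Python sketch below rests on assumptions of ours that
should be kept in mind: a, b, c, d are taken as the physical (real)
components with $d^2 - c^2 = 1$, and the coefficients of the
transformation are taken as $\sigma = \cosh\psi$, $\tau = \sinh\psi$,
chosen only because they satisfy $\sigma^2 - \tau^2 = 1$; the
formulas 10.6 themselves are not reproduced here, and the function
name is hypothetical.

```python
import math

def transform_to_rest(c, d):
    """Return (c', d') after choosing psi so that c' = c*sigma + d*tau vanishes."""
    if abs(c) >= abs(d):
        raise ValueError("no such angle exists unless |c| is less than |d|")
    psi = math.atanh(-c / d)                       # makes tau/sigma equal to -c/d
    sigma, tau = math.cosh(psi), math.sinh(psi)    # assumed form of the coefficients
    return c * sigma + d * tau, c * tau + d * sigma

# A particle moving along the z-axis with speed w = 0.6 (velocity of light = 1):
w = 0.6
root = math.sqrt(1.0 - w * w)
c, d = w / root, 1.0 / root
print(transform_to_rest(c, d))   # c' is zero and d' is one, up to rounding
```

The condition that c be less than d in absolute value appears in
the sketch as the requirement that the argument of the inverse
hyperbolic tangent lie between -1 and 1.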

After  we  have  found  a  coordinate  system  in 
which  a  particle  is  at  rest  we  can  perform  any 
transformations  of  space  coordinates  and  the 
property  will  not  be  destroyed;  any  transforma- 
tion involving  time,  on  the  contrary  will  result 
in  introducing  significant  space  components  of 
the  velocity  vector;  we  see  thus  that  whether  a 
particle  is  at  rest  in  a  coordinate  system  de- 
pends exclusively  on  the  choice  of  the  time  ax- 
is, so  that  the  choice  of  the  time  axis  is  equiv- 
alent to  the  choice  of  a  body  which  we  desire  to 
consider  at  rest;  in  other  words,  the  direction 
of  the  time  axis  may  be  characterized  by  indi- 
cating what  particle  is  at  rest  in  the  corres- 
ponding coordinate  system. 

What  time  axis  we  actually  choose  depends, 
as  in  geometry,  on  circumstances;  in  many  cases 
we  shall  want  to  consider  ourselves  as  being  at 
rest,  or  our  laboratory,  or  the  earth. 

In  what  precedes  we  spoke  of  a  motion  of  a 
particle at a given moment; in a given system
of  space-time  coordinates  a  particle  may  be  at 
rest  at  one  moment  and  not  at  rest  some  other 
time;  but  there  exists  a  class  of  particles  which 
if  transformed  to  rest  for  one  moment  will  be  at 
rest  always;  these  are  those  particles  whose 
representative  four-dimensional  curves  have  the 
same  tangent  vector  at  all  points,  i.e.,  are 
straight  lines;  it  is  clear  that  if  the  direc- 
tion of  such  a  straight  line  is  taken  as  the  di- 
rection of  the  t-axis  the  velocity  in  the  so  ob- 
tained coordinate  system  is  zero.  But  if  we 
choose  any  other  (cartesian)  coordinate  system, 
the  three-dimensional  velocity  u,v,w,  will  be 
constant  in  absolute  value  and  direction,  so 
that  we  have  a  rectilinear  uniform  motion.  From 
our  point  of  view  then  the  distinction  between 
uniform  rectilinear  motion  and  rest  is  a  non-es- 
sential distinction,  this  distinction  does  not 
exist  until  we  introduce  a  coordinate  system;  it 


is  of  the  same  nature  as  the  distinction  between 
lines  which  are  and  those  which  are  not  parallel 
to the x-axis in ordinary analytic geometry.

If a motion is not uniform and rectilinear
then there is no coordinate system in which the
particle is permanently at rest. But rather
than to make a strict distinction between
particles which are and those which are not in
uniform rectilinear motion or rest, it is more in
keeping  with  our  point  of  view  to  speak  of  par- 
ticles which  may  be  (within  experimental  error) 
considered  at  rest  for  a  sufficiently  long  peri- 
od of  time. 

We  are  now  in  a  position  to  explain  the 
name  and  the  origin  of  the  theory  we  are  study- 
ing. We  saw  that  in  this  theory  there  is  no 
such thing as absolute rest or absolute motion
of  a  body.  If  it  is  at  rest  with  respect  to 
one  system  of  coordinates  it  may  move  with  re- 
spect to  another  and  vice  versa;  we  can  only 
speak  of  relative  motion;  that  is  where  the 
name  Relativity  comes  from. 

If we adopt this point of view, we have to
consider  as  permissible  all  transformations  of 
coordinates  from  one  to  any  other  cartesian  co- 
ordinate system.  Later  on  we  shall  consider 
other,  more  general,  systems  of  coordinates,  and 
therefore  more  general  coordinate  transforma- 
tions; we  shall  replace  our  equations  by  more 
general  ones  which  will  be  invariant  under  these 
more  general  coordinate  transformations;  in  com- 
parison with  this  situation  we  may  say  that  we 
consider  now  only  special  coordinate  transfor- 
mations and  invariance  under  them;  therefore  the 
present  theory  is  called  "Special  Relativity 
Theory". 

It  may  be  mentioned  that  the  historical 
order  of  appearance  of  the  ideas  of  our  sub- 
ject -  as  it  happens  so  often  -  has  been  quite 
different  from  the  order  which  seems  natural  and 
in  which  we  have  presented  them.  First  the 
formulas  of  transformation  involving  space  co- 
ordinates and  time  have  been  introduced  by  Lor- 
entz  without,  however,  giving  to  them  the  mean- 
ing they  have  now;  in  Lorentz's  theory  there 
exists one universal time t, and other times t'
play  only  an  auxiliary  part.  The  merit  of  mak- 
ing the  decisive  step  and  recognizing  the  fact 
that  all  these  variables  are  on  the  same  foot- 
ing -  belongs  to  Einstein  (1905).  The  four-di- 
mensional point  of  view,  after  some  preliminary 
work had been done by Poincaré and Marcolongo,
was  most  emphatically  introduced  by  Minkowski 
in  1908. 


15.  Addition  of  Velocities. 

As  explained  in  Section  13  we  have  two  ways 
of  characterizing  the  velocity  of  a  body:   by 
means  of  the  three-dimensional  velocity  vector 
and  by  means  of  the  four-dimensional  velocity 
vector.  We  can  pass  from  one  representation  of 


velocity  to  the  other  without  difficulty  and 
the  two  methods  are  equivalent  as  long  as  we  do 
not  change  our  coordinate  system.   But  if  we 
come  to  study  the  relative  motion  of  one  body 
with  respect  to  another  and  want  to  define  the 
relative  velocity,  the  four-dimensional  point 
of  view  leads  to  conceptions  which  are  at  vari- 
ance with  commonly  accepted  ideas  and  we  want  to 
devote  this  section  to  the  clarification  of  this 
situation.  It  is  natural  to  reduce  the  defin- 
ition of  relative  velocity  of  a  body  with  re- 
spect to  another  body  to  the  conception  of  the 
velocity  of  a  body  in  a  coordinate  system  by 
saying:  By  velocity  of  the  body  B  with  respect 
to  a  body  A  we  mean  the  velocity  of  B  in  a  sys- 
tem of  coordinates  in  which  the  velocity  of  A 
is zero.

If  we  want  to  find  the  velocity  of  B  with 
respect  to  A  we  have  to  transform  our  coordi- 
nates so  that  in  the  transformed  coordinate  sys- 
tem A  be  at  rest.  It  is  clear  that  the  meaning 
of  relative  velocity  is  made  to  depend  by  the 
preceding  definition  on  what  we  mean  by  trans- 
formation of  coordinates.  If  by  transformation 
of  coordinates  we  mean  only  transformation  of 
three-dimensional  coordinates  -  transition  to 
moving  axes  -  we  have  the  old  idea  of  relative 
velocity;  if,  on  the  other  hand,  we  consider 
four-dimensional  coordinate  axes  and  our  trans- 
formation of  coordinates  involves  the  coordinate 
x4,  or  t,  in  the  sense  of  the  theorem  of  Sec- 
tion 14,  it  is  clear  that  we  give  a  new  meaning 
to  relative  velocity,  and  we  should  not  be  sur- 
prised if  the  so  defined  "relativistic"  rela- 
tive velocity  will  possess  properties  different 
from  those  of  the  "classical"  relative  velocity. 

Consider a body A and a body B that moves
with respect to A uniformly and rectilinearly
with a velocity $V_{BA}$; this means according to our
definition that the velocity of B in a coordinate
system in which A is at rest is $V_{BA}$. Introduce
a coordinate system in which A is at rest and B
moves along the z-axis, and call the coordinates
$x_A$, $y_A$, $z_A$, $t_A$; introduce also a system of
coordinates in which B is at rest so that (10.6)

15.1
$$x_A = x_B, \quad y_A = y_B, \quad z_A = z_B \sigma_{BA} + t_B \tau_{BA}, \quad t_A = z_B \tau_{BA} + t_B \sigma_{BA}.$$


Now describe the motion of the body B in each
system neglecting the x and y coordinates. In
the system A the motion of the body B which, we
assume, at the moment t = 0 was at z = 0 is
given by

$$z_A = V_{BA} t_A;$$

in the system B the body B is at all times at
the origin of the coordinate system, so that
$z_B = 0$; substituting this value in the
transformation formulas and eliminating $t_B$ we get
$z_A = t_A \tau_{BA}/\sigma_{BA}$. Comparing this with the
preceding equation we have (10.5)

15.2
$$V_{BA} = \frac{\tau_{BA}}{\sigma_{BA}}.$$

Solving the above transformation formulas
for $z_B$, $t_B$ we also find that $\tau_{AB} = -\tau_{BA}$,
$\sigma_{AB} = \sigma_{BA}$ so that

$$V_{AB} = -V_{BA}.$$

This  result,  that  the  relative  velocity  of  A 
with respect to B is the negative of the relative
velocity of B with respect to A, is in keeping
with the old ideas.

Now  consider  three  bodies,  A,B,C,  all  mov- 
ing in  one  direction  (more  precisely  B  and  C 
moving  in  the  same  direction  with  respect  to  A). 
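Denote the velocities of B and C respectively
with respect to A by $V_{BA}$ and $V_{CA}$, and the
velocity of C with respect to B by $V_{CB}$. We have
in addition to the above transformation formulas
the formulas

15.11
$$z_A = z_C \sigma_{CA} + t_C \tau_{CA}, \quad t_A = z_C \tau_{CA} + t_C \sigma_{CA},$$

and

15.12
$$z_B = z_C \sigma_{CB} + t_C \tau_{CB}, \quad t_B = z_C \tau_{CB} + t_C \sigma_{CB},$$

and also

15.21
$$V_{CA} = \frac{\tau_{CA}}{\sigma_{CA}}, \quad V_{CB} = \frac{\tau_{CB}}{\sigma_{CB}}.$$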


Express now $z_A$, $t_A$ in terms of $z_C$, $t_C$ by
substituting the values for $z_B$, $t_B$ given by the
transformation formulas 15.12 into the
transformation formulas 15.1; comparing the result
with the transformation formulas 15.11 we get

$$\sigma_{CA} = \sigma_{CB}\sigma_{BA} + \tau_{CB}\tau_{BA}, \quad \tau_{CA} = \tau_{CB}\sigma_{BA} + \sigma_{CB}\tau_{BA},$$

whence, using the above expressions of velocities
in terms of transformation coefficients,
15.2 and 15.21,

15.3
$$V_{CA} = \frac{V_{BA} + V_{CB}}{1 + V_{BA} V_{CB}}.$$


This  is  the  Einstein  formula  for  addition  of 
velocities for the case of two motions in the
same  direction.  This  formula  should  be  com- 
pared to  the  formula  of  addition  of  classical- 
ly defined  relative  velocities 


15.4 


V  =  V'  +  V". 


Of  course,  there  is  no  contradiction  between 
the  two  formulas  because  they  refer  to  differ- 
ent quantities.  Still  it  is  legitimate  to  ask 
which  formula  is  better  from  the  point  of  view 
of  experiment,  which  -  if  any  -  is  "correct" 


for  the  relative  velocities  that  we  actually 
measure. 

In ordinary units the second term in the
denominator  in  formula  15.3  should  be  divided 
by  the  square  of  the  velocity  of  light,  so  that 
for  moderate  velocities  the  formulas  give  re- 
sults that  differ  numerically  very  little,  and 
it  seems  to  be  difficult  to  devise  an  experi- 
ment with  high  enough  velocities  of  material 
particles  so  that  the  formulas  could  be  tested 
directly.   In  the  next  section  we  shall  consid- 
er the  case  when  one  of  the  velocities  is  that 
of  light;  in  the  meantime  we  may  mention  that 
formula 15.3 is a special case of a more general
formula which corresponds to the case when
the  two  motions  are  not  in  the  same  directions. 
This  general  formula  gained  temporary  importance 
some years ago when it played a decisive role
in  the  early  stages  of  the  application  of  the 
idea  of  the  spinning  electron  to  the  explanation 
of  spectra. 
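
The derivation of formula 15.3 can be checked numerically. The following is a minimal sketch, assuming units in which the velocity of light is one; the function names (`boost_coefficients`, `compose`, `einstein_sum`) and the sample velocities are illustrative choices, not part of the original notes. It builds the coefficients from a velocity through V = tau/sigma and sigma^2 - tau^2 = 1, composes two transformations as in the text, and compares the result with 15.3 and with the classical sum 15.4.

```python
import math

def boost_coefficients(v):
    """Coefficients (sigma, tau) with sigma**2 - tau**2 = 1 and tau/sigma = v."""
    sigma = 1.0 / math.sqrt(1.0 - v * v)
    return sigma, v * sigma

def compose(cb, ba):
    """Coefficients of the C-to-A transformation, given C-to-B and B-to-A."""
    sigma_cb, tau_cb = cb
    sigma_ba, tau_ba = ba
    sigma_ca = sigma_cb * sigma_ba + tau_cb * tau_ba
    tau_ca = tau_cb * sigma_ba + sigma_cb * tau_ba
    return sigma_ca, tau_ca

def einstein_sum(v_ba, v_cb):
    """Formula 15.3 for two motions in the same direction."""
    return (v_ba + v_cb) / (1.0 + v_ba * v_cb)

# Two large velocities (hypothetical sample values).
v_ba, v_cb = 0.6, 0.7
sigma_ca, tau_ca = compose(boost_coefficients(v_cb), boost_coefficients(v_ba))
v_from_coefficients = tau_ca / sigma_ca
v_from_formula = einstein_sum(v_ba, v_cb)
v_classical = v_ba + v_cb

# Two moderate velocities: the formulas differ very little.
slow_gap = (1e-4 + 1e-4) - einstein_sum(1e-4, 1e-4)

print(v_from_coefficients, v_from_formula, v_classical, slow_gap)
```

The composed coefficients reproduce 15.3, the result stays below one even though the classical sum exceeds it, and for moderate velocities the gap between the two formulas is far below the velocities themselves.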


16.  Light  Corpuscles,  or  Photons. 

In studying curves in four-space representing
motions of particles we succeeded (Section 12)
in choosing a standard parameter, s, by
considering the expression

$$(dx/dp)^2 + (dy/dp)^2 + (dz/dp)^2 - (dt/dp)^2$$


and  by  setting  ds/dp  equal  to  the  reciprocal  of 
the  square  root  of  minus  the  above  expression. 
This  procedure  would  not  work  if  the  above  ex- 
pression were  equal  to  zero.  We  can  imagine  in 
our  four-dimensional  geometry  curves  and  straight 
lines  for  which  the  above  expression  is  zero 
(Section  10),  and  the  question  arises:  what  will 
be  the  physical  interpretation  of  such  curves; 
in  other  words:  is  there  anything  in  physics 
that  could  be  identified  with  such  curves  in  the 
same way that motions of particles are identified
with curves for which the above expression
is  negative.   In  order  to  answer  this  question 
let  us  calculate  the  three-dimensional  velocity 
corresponding  to  such  a  curve;  if  the  above  ex- 
pression is  zero  for  one  choice  of  parameter  it 
will  be  zero  for  all  choices;  using  t  as  param- 
eter, and  using  physical  coordinates  we  have 
then 

$$(dx/dt)^2 + (dy/dt)^2 + (dz/dt)^2 - 1 = 0$$
or
$$u^2 + v^2 + w^2 = 1,$$

i.e.,  we  can  say  that  the  curves  of  zero  square 
tangent  vectors  correspond  to  what  we  have  to 
call  from  the  three-dimensional  point  of  view 
particles  moving  with  the  velocity  of  light. 
This  suggests  to  identify  such  curves  in  some 
way  with  propagation  of  light. 

Since  the  time  of  Newton  and  Huygens  two 
theories  of  light  have  been  vying  for  suprem- 


acy  with  variable  success;  according  to  one, 
the  so-called  corpuscular  theory,  light  con- 
sists (like  matter  on  the  discrete  theory)  of 
particles which lately (Wolfers, 1925) have
been  called  "photons";  according  to  the  other 
theory  light  is  a  wave  phenomenon.  For  our 
present  purposes  the  former  view  seems  to  be 
better  adapted.  If  we  adopt  it  we  can  make  our 
former  statement  more  specific  by  saying  that 
we  identify  curves  of  zero  square  tangent  vec- 
tors with  photons,  or  with  motion  of  photons. 

In  adopting  thus  the  corpuscular  theory  of 
light  we  do  not  in  the  least  mean  to  say  that 
the  corpuscular  theory  of  light  is  correct,  and 
still  less  that  the  other  theory  -  the  wave  the- 
ory -  is  wrong.  We  simply  want  to  show  that 
the identification just mentioned permits us to
give  account  of  certain  light  phenomena;  and  it 
is  enough  to  mention  polarization  in  order  to 
see  that  other  phenomena  are  left  out. 

To  begin  with  we  want  to  point  out  an  ad- 
vantage that  the  Relativity  theory  has  compared 
to  classical  theory  in  the  matter  of  corpuscu- 
lar theory  of  light.  In  classical  theory  dif- 
ference in  velocity  is  merely  a  quantitative 
difference,  in  relativity  this  means  an  entire- 
ly different  kind  of  curves,  and  there  are  oth- 
er differences  entirely  of  qualitative  nature 
that  are  consequences  of  our  identification, 
which  is  more  in  keeping  with  the  nature  of 
light  compared  to  matter  as  we  know  it  from  ex- 
periment. This  seems  to  constitute  a  very 
strong  argument  in  favor  of  the  adoption  of  the 
point  of  view  of  Relativity  in  general,  and  of 
the  identification  we  are  discussing  now  in 
particular. 

We  want  next  to  discuss  what  is  usually 
referred  to  as  constancy  of  the  velocity  of 
light.  The  reader  may  have  noticed  that  a  while 
ago  when  we  were  calculating  the  three-dimen- 
sional velocity,  corresponding  to  curves  with 
zero-square  tangent  vectors,  we  did  not  say  in 
what coordinate system we wanted to calculate
this  three-dimensional  velocity.  As  a  matter 
of  fact,  the  result  shows  that  it  is  independ- 
ent of  the  coordinate  system;  i.e.,  no  matter 
what  bodies  we  consider  as  being  at  rest,  we 
come  out  with  the  same  value  for  the  velocity 
of  light,  in  our  units  -  one. 

This  seems  surprising;  it  contradicts  the 
commonly  accepted  ideas  concerning  addition  of 
velocities;  but  we  have  been  led  to  a  different 
formula  for  the  addition  of  velocities,  and  we 
can  show  that  the  constancy  of  velocity  of  light 
is  in  agreement  with  that  formula  15.3.  In  fact, 
if  we  consider  the  case  that  C  moves  with  the 
velocity  of  light  (that  is  one  in  our  units) 
with respect to B, that means that $V_{CB} = 1$;
substituting this value in 15.3 we find that $V_{CA}$
is  also  one;  that  is,  what  is  motion  with  veloc- 
ity one  in  one  system  is  motion  with  velocity 
one  in  another  system. 
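
As a worked check of the statement above (a sketch in the units of the text, where the velocity of light is one), substituting $V_{CB} = 1$ into formula 15.3 gives, for any $V_{BA}$ different from $-1$:

```latex
V_{CA} = \frac{V_{BA} + 1}{1 + V_{BA}\cdot 1} = 1 .
```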

This  discussion,  of  course,  proves  nothing 
but the inner consistency of the theory.


Another  question  is  whether  constancy  of 
velocity  of  light,  i.e.,  independence  of  this 
velocity from the choice of the system which is
considered  at  rest  is  consistent  with  experi- 
ment. As  a  matter  of  fact,  it  appears  that  it 
is;  the  weight  of  experimental  evidence  seems 
to  be  for  it.  Historically,  results  of  some 
experiments by Michelson and Morley performed
in  1887  and  pointing  in  the  same  direction  play- 
ed a  great  role  in  the  creation  of  the  Theory 
of  Relativity. 

Having  considered  thus  the  question  of  ve- 
locity of  light  we  pass  to  the  discussion  of  an- 
other consequence  of  our  identification. 

We  have  decided  in  a  general  way  to  identi- 
fy straight  lines  whose  vectors  are  of  zero 
square  with  light  or  the  motion  of  photons  in 
the  same  way  that  straight  lines  whose  vectors 
are  of  negative  square  are  identified  with  mat- 
ter or  uniform  rectilinear  motion  of  material 
particles.  But  a  straight  line  (in  four  dimen- 
sions) does  not  characterize  the  motion  of  the 
particle  completely  -  it  only  gives  the  velocity 
of  the  motion,  it  characterizes  it  only  kinema- 
tically;  for  a  complete  dynamical  characteriza- 
tion we  had  to  introduce  (Section  13)  the  mass 
of  the  particle,  and  that  led  us  to  introduce 
the  momentum  vector,  whose  square  we  found  to 
be $-m^2$; the complete characterization of a material
particle consists then of a line with a
vector  (of  negative  square)  on  that  line.   In 
the  same  way  we  shall  characterize  the  motion  of 
a  photon  by  a  line  with  a  vector  of  zero  square 
on  it.  We  have  thus  the  same  picture  for  a  ma- 
terial particle  and  a  photon;  in  both  cases  we 
have  a  line  with  a  vector  on  it;  only  in  the 
first  case  it  is  a  vector  of  negative  square; 
in  the  second  of  zero  square;  this  difference 
corresponds  to  the  difference  in  the  speeds  of 
the particles in the classical theory. But in
the  classical  theory  this  is  a  purely  quantita- 
tive difference  and  here,  as  mentioned  before, 
it  leads  to  qualitative  differences,  some  of 
which  we  are  going  to  consider. 

In  the  first  place  a  photon  cannot  be  trans- 
formed to  rest.  In  fact  transforming  a  photon 
to  rest  would  mean  finding  a  coordinate  system 
such  that  in  it  the  time  axis  will  have  the  di- 
rection of  the  photon;  but  that  would  mean  that 
the  vector  0,0,0,1  would  have  zero  square  which 
is  impossible. 

Then  there  is  this  distinction:   two  mate- 
rial particles  may  differ  in  mass,  that  means  in 
the  squares  of  their  momentum  vectors,  and  this 
is  an  essential  difference  because  the  square 
of  a  vector  is  not  affected  by  a  transformation 
of  coordinates;  all  photons  on  the  other  hand 
have  vectors  of  the  same  square,  namely  zero. 
We  shall  prove  that  as  a  consequence  of  this, 
two  photons  never  differ  essentially,  that  is, 
given  two  photons  there  always  exist  two  systems 
of  coordinates  in  which  the  descriptions  of  the 
two  photons  are  the  same.  To  begin  with,  we  may 


choose  the  origins  of  the  two  coordinate  sys- 
tems on  the  respective  straight  lines;  next  we 
may consider the two lines in the respective
z-t planes. The momentum vectors of the
corresponding photons will have now in their
respective coordinate systems the components
$0, 0, q_3, q_4$ and $0, 0, p_3, p_4$ (contrary to our
general agreement we use here subscripts with
physical coordinates) and since both vectors are of
zero square we'll have

$$q_3^2 - q_4^2 = 0, \quad p_3^2 - p_4^2 = 0;$$

by choosing appropriately the sense on each
coordinate axis we can reduce these conditions to

16.1
$$q_3 = q_4, \quad p_3 = p_4.$$


Now perform in the second system the
transformation 10.6

$$z' = z\sigma + t\tau, \quad t' = z\tau + t\sigma,$$

which applied to the second vector and taking
into account 16.1 gives

16.2
$$p_3' = p_4' = p_3(\sigma + \tau).$$

But $\sigma$ and $\tau$ are subject only to the condition
that $\sigma^2 - \tau^2 = 1$, so that we can choose $\sigma + \tau$
arbitrarily; if we make the choice

16.3
$$\sigma + \tau = q_3/p_3,$$

we shall have $p_3' = q_3$ and the statement is
proved.
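
The choice 16.3 can be made explicit. The sketch below is illustrative only: it assumes that $p_3$ and $q_3$ have the same sign (as arranged by the choice of sense on each axis), and the names `matching_coefficients` and `transform` are hypothetical. Since sigma^2 - tau^2 = (sigma + tau)(sigma - tau) = 1, fixing sigma + tau = k forces sigma - tau = 1/k, which determines both coefficients.

```python
def matching_coefficients(p3, q3):
    """Return (sigma, tau) with sigma + tau = q3 / p3 and sigma**2 - tau**2 = 1.

    Assumes p3 and q3 are nonzero and of the same sign.
    """
    k = q3 / p3                   # required value of sigma + tau (formula 16.3)
    sigma = 0.5 * (k + 1.0 / k)   # because sigma - tau must equal 1 / k
    tau = 0.5 * (k - 1.0 / k)
    return sigma, tau

def transform(z, t, sigma, tau):
    """Transformation 10.6 applied to the z and t components of a vector."""
    return z * sigma + t * tau, z * tau + t * sigma

# Two zero-square momentum vectors, written as (third, fourth) components.
p = (2.0, 2.0)
q = (5.0, 5.0)

sigma, tau = matching_coefficients(p[0], q[0])
p_new = transform(p[0], p[1], sigma, tau)
print(sigma, tau, p_new)
```

In the transformed system the second photon has the same components as the first, which is the content of the statement proved above.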

This  theoretical  conclusion,  that  any  two 
photons  are  not  essentially  different  from  each 
other  must  be  confronted  with  experience.   At 
first  sight  it  seems  to  contradict  it.  We  know 
that  light  differs  from  case  to  case;  it  differs 
in intensity and color. For difference in
intensity we account by assuming that every beam
of  light  consists  of  many  photons  so  that  in- 
tensity (for  a  given  color)  is  proportional  to 
the  number  of  photons  in  the  beam.   Remains 
color.  But  experiments  show  that  color  actual- 
ly depends  on  the  state  of  motion  of  the  ob- 
server; when  an  observer  approaches  a  source  of 
light,  color  seemingly  changes  (Doppler  effect) 
and  so  the  field  is  clear  for  our  assertion. 
Now  let  us  see  how  it  works  out. 

Before  we  treat  the  situation  from  the 
point  of  view  of  the  Relativity  theory  we  have 
to  say  a  few  words  about  how  color  appears  in 
physics  as  a  measurable  quantity.   From  the 
point of view of the wave theory to light is
attached a certain measurable quantity $\nu$ -
"frequency" which corresponds to color in the
sense  that  different  colors  correspond  to  dif- 
ferent frequencies.  On  the  corpuscular  theory 
photons  are  characterized  by  their  energies,  E, 
and  the  fundamental  relation  between  frequency 


and  energy  is  given  by  the  formula 


16.4

$$E = h\nu,$$


where  h  is  the  so-called  Planck's  quantum  con- 
stant, which  for  us  appears  simply  as  coeffici- 
ent of  proportionality  establishing  the  rela- 
tion between  the  values  of  two  quantities  which 
measure  the  same  thing  in  different  units,  much 
in  the  same  way  that  c,  the  velocity  of  light, 
appears in the formula connecting mass and
energy (compare 15.41). Of the two quantities, E
and $\nu$, which can be used to measure color, E
will  be  the  one  that  is  more  convenient  for  our 
purposes  because  we  use  the  corpuscular  theory 
of  light. 

The  question  now  is  with  what  quantity  in 
our  theory  are  we  going  to  identify  E.  In  order 
to  have  a  suggestion  we  notice  that  E  is  of  the 
character  of  kinetic  energy;  it  plays  for  light 
particles  the  same  role  that  kinetic  energy 
plays  for  material  particles.  There  (Section 
13) we identified kinetic energy with the
second term in the development

$$\frac{m}{\sqrt{1 - (u^2 + v^2 + w^2)}} = m + \tfrac{1}{2} m (u^2 + v^2 + w^2) + \ldots$$


of  the  fourth  component  of  the  momentum  vector; 
of  the  other  terms  the  third  and  the  following 
are  negligible  for  material  particles,  and  the 
first  is  a  constant  so  that  it  plays  no  part  in 
these  considerations  where  only  differences  in 
energy  are  important;  besides,  the  correspond- 
ing constant  for  light  is  zero;  everything  leads 
us  thus  to  compare  E  with  the  fourth  component 
of  the  momentum  vector  of  light,  or  photon.  We 
arrive  in  this  way  to  a  new  identification;  we 
identify  the  mathematical  quantity  "time  compo- 
nent of  the  momentum  vector  of  a  photon"  with 
the  physical  quantity  E  which,  except  for  a  fac- 
tor of  proportionality,  is  frequency  and  mea- 
sures color.  This  identification  makes  color 
dependent  on  the  coordinate  system  but  this  de- 
pendence, as  was  said  before,  is  to  be  expected, 
and  our  next  question  is  whether  the  character 
of  this  dependence  corresponds  to  experimental 
facts. 

Suppose  that  E  is  the  energy  of  a  light 
corpuscle  in  one  system  of  coordinates;  what 
will  it  be  in  another?  We  have  already  calcu- 
lated how  the  components  of  a  zero  square  vec- 
tor change  under  a  transformation  of  coordinates 
involving  time.  Formula  16.3  shows  that  the 
ratio of the fourth components of the two
vectors, and according to our identification this
means the ratio of the energies or frequencies,
is

$$\nu'/\nu = \sigma + \tau.$$

On the other hand, we saw before (15.2) that the
relative velocity of two bodies which are at rest
in the corresponding systems of coordinates is

$$V = \tau/\sigma;$$

this gives

$$\sigma + \tau = \sigma(1 + V),$$

and taking into account the identity $\sigma^2 - \tau^2 = 1$
we find

16.4
$$\nu'/\nu = \frac{1 + V}{\sqrt{1 - V^2}},$$

or, in first approximation $1 + V$.

Let us try to figure out the predicted
change of frequency on the classical (wave)
theory. If we have a wave of frequency $\nu$ that
means that there are $\nu$ vibrations per unit of
time, and since the velocity is unity, there
will be $\nu$ waves per unit of length. Now, if we
move toward the source with a velocity V we
shall travel in a unit of time the distance V
and we shall meet $V\nu$ additional vibrations,
so that the number of vibrations our eye
receives in a unit of time will be $(1 + V)\nu$, and
this will be the frequency for the moving
observer. The two theories give then the
predictions

$$\frac{1 + V}{\sqrt{1 - V^2}} \quad \text{and} \quad 1 + V$$

for  the  change  in  frequency  due  to  motion  of 
the  observer,  and  the  difference  between  these 
two  values  is  too  small  to  be  subjected  to  an 
experimental  test;  within  experimental  error 
both  seem  to  fit  observations  equally  well. 
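
To see how small the difference is, the two predictions can be evaluated side by side. This is a sketch in units where the velocity of light is one; the function names and the sample velocity (about 30 km/s, roughly the orbital speed of the earth, taken as 1e-4 in these units) are assumptions made for illustration. Expanding the relativistic ratio in powers of V shows that the two predictions first differ in the term V^2/2.

```python
import math

def ratio_relativistic(v):
    """Frequency ratio (1 + V) / sqrt(1 - V^2) for an approaching observer."""
    return (1.0 + v) / math.sqrt(1.0 - v * v)

def ratio_classical(v):
    """Frequency ratio 1 + V from the wave-counting argument."""
    return 1.0 + v

v = 1e-4  # hypothetical observer velocity in units of the velocity of light
gap = ratio_relativistic(v) - ratio_classical(v)
leading_term = 0.5 * v * v  # first term in which the two predictions differ

print(ratio_relativistic(v), ratio_classical(v), gap, leading_term)
```

For this velocity the gap is a few parts in a billion, while the common first-order shift V is one part in ten thousand.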


17.  Electricity  and  Magnetism 
in  Special  Relativity. 

In  the  preceding  sections  of  this  chapter 
we  have  discussed  some  modifications  that  are 
brought  about  by  the  Theory  of  Relativity  in 
Kinematics,  Mechanics,  and  Optics.  There  are 
other  modifications  which  have  attracted  a  great 
deal of popular attention due to their
sensational and paradoxical character. We shall only
mention the so-called effects of motion on the
shape of bodies, lengths, measure of time, and
the fact that in the Theory of Relativity the
conception of simultaneity loses its absolute
character so that two events which are to be
considered simultaneous in one system of
space-time coordinates need not be simultaneous in
another. But we shall say a few words about
electricity and magnetism. Even in the first
chapter the components of the electric and the
magnetic force vectors were combined into one
tensor $F_{ij}$, so that electricity and magnetism seem
to be treated as two aspects, or manifestations
of a higher entity. But as long as we limit
ourselves to transformations of space
coordinates the components of F corresponding to
electricity are transformed among themselves and
those corresponding to magnetism - among
themselves, so that their unification in one
tensor $F_{ij}$ may be considered as artificial.
However, when we introduce transformations of
space-time coordinates (formulas 10.6) the
situation changes radically.

Following  the  procedure  used  in  Section  11 
when  we  were  proving  the  invariant  character  of 
the  relations  between  the  tensors  F  and  D  we 
can  deduce  the  following  formulas  corresponding 
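to rotation in the $x_3 x_4$ plane:

$$F'_{12} = F_{12}, \quad F'_{43} = F_{43},$$
$$F'_{31} = F_{31} c - F_{41} s, \quad F'_{41} = F_{31} s + F_{41} c,$$
$$F'_{23} = F_{23} c - F_{24} s, \quad F'_{42} = F_{32} s + F_{42} c.$$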


From  these  mathematical  formulas  we  can 
pass  to  formulas  involving  physical  components 
and  only  real  quantities  by  making  use  of  4.72, 
10.3 and the fact that $F_{ij} = -F_{ji}$. We obtain
thus the relations

$$X' = \sigma X + \tau M, \quad L' = \sigma L - \tau Y,$$
$$Y' = \sigma Y - \tau L, \quad M' = \sigma M + \tau X,$$
$$Z' = Z, \quad N' = N.$$


The interpretation of these formulas is that
if the unprimed letters give the components of
electric and magnetic force in one system the
primed letters will give the components of
electric and magnetic force in a system which moves
with respect to the first with velocity $V = \tau/\sigma$.

These  formulas  show  that  the  distinction 
between  electric  and  magnetic  forces  is  not  an 
absolute distinction, but depends on the
coordinate system used; we might, for example, have
in the old coordinate system a purely electric
field, L = M = N = 0; in the new system the
magnetic components will be different from zero,
viz., $-Y\tau$, $X\tau$, 0. What is the physical meaning
of  this?  It  means  that  a  field  may  have  elec- 
tric effects  on  one  body  but  electric  and  mag- 
netic effects  on  a  body  that  moves  with  respect 
to  it.  This  prediction  is  verified  by  experi- 
ment. The  fact  may  be  restated  by  saying  that 
an  electric  charge  in  motion  has  magnetic  ef- 
fects, it  may,  e.g.,  deflect  a  magnetic  needle. 
As  an  example,  the  magnetic  field  of  a  moving 
electron  may  be  easily  calculated.   We  start 
with  an  electron  at  rest.  Its  magnetic  field 
is  supposed  to  be  zero,  its  electric  field  is 
supposed to be independent of time and symmetric
with  respect  to  the  point;  under  these  conditions 
Maxwell's  equations  reduce,  as  can  be  seen  eas- 
ily, to  Newton's  equations  which,  as  we  know, 
give  the  inverse  square  law  for  the  electric 
forces.  The  field  of  an  electron  in  motion  can 
now  be  obtained  by  applying  the  above  formulas. 
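
The transformation formulas for the field components can be tried on a purely electric field. The following is a sketch under stated assumptions: units in which the velocity of light is one, relative motion along the z-axis with V = tau/sigma and sigma^2 - tau^2 = 1, and a hypothetical helper `transform_fields`; the field values are arbitrary sample numbers. As an extra check, not taken from these notes but a standard property of such transformations, the quantity (X^2 + Y^2 + Z^2) - (L^2 + M^2 + N^2) is compared before and after.

```python
import math

def transform_fields(electric, magnetic, v):
    """Apply the primed-field relations for a system moving with velocity v."""
    sigma = 1.0 / math.sqrt(1.0 - v * v)
    tau = v * sigma
    x, y, z = electric
    l, m, n = magnetic
    new_electric = (sigma * x + tau * m, sigma * y - tau * l, z)
    new_magnetic = (sigma * l - tau * y, sigma * m + tau * x, n)
    return new_electric, new_magnetic

def square_difference(electric, magnetic):
    """Electric square minus magnetic square."""
    return sum(c * c for c in electric) - sum(c * c for c in magnetic)

# Purely electric field in the old system (sample values).
electric = (1.0, 2.0, 3.0)
magnetic = (0.0, 0.0, 0.0)
v = 0.6  # so that sigma = 1.25 and tau = 0.75

new_electric, new_magnetic = transform_fields(electric, magnetic, v)
print(new_electric, new_magnetic)
```

The new magnetic components come out as (-Y tau, X tau, 0), in agreement with the text, so a field that is purely electric in one system has a magnetic part in the other.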




Chapter IV.
CURVED  SPACE 


The  theory  that  has  been  developed  so  far 
may  be  said  to  consist  of  two  parts:  a  general 
part which may be called Geometry and which, in
addition  to  material  analogous  to  that  treated 
in  ordinary  geometry,  includes  general  rules  of 
operations  on  tensors,  and  a  special  part  which 
may  be  called  Physics  and  which  deals  with  three 
definite tensor fields, a scalar field $\rho$, a
vector field $u_i$, an antisymmetric tensor field $F_{ij}$,
which all have been combined into the tensor
field $T_{ij}$, and with special conditions which we
impose on these fields, viz., equations 11.1 to
11.5. The second part is independent of the
first in the sense that we could have built with
the same geometry a different "physics", we could
have chosen another set of tensors instead of
$\rho$, $u_i$, $F_{ij}$. The reason why our physics was
independent of our geometry is because the latter
does not furnish us any tensors, except the
tensor $\delta_{ij}$, or the tensor of scalar multiplication
which  is,  so  to  say,  the  same  in  all  points 
(and  at  all  times)  and  therefore  cannot  be  used 
to  explain  the  variety  and  change  which  are 
characteristic  of  the  outside  world.  In  other 
words,  our  geometry  does  not  possess  any  struc- 
ture which  seems  to  be  necessary  for  the  inter- 
pretation of  the  outside  world  and  therefore  we 
had  to  superimpose  on  our  geometry  a  certain 
arbitrary structure by introducing special
tensors, by filling, so to say, the empty
space-time with these tensor fields. Our geometry
does  not  give  us  a  landscape,  it  gives  us,  so 
to  say,  only  a  frame  for  one,  or  only  a  stage 
and  the  landscape  can  be  constructed  on  it  by 
means  of  stage-settings  which  do  not  constitute 
an  organic  part  of  the  stage.  Although  some 
success has been achieved with the theory just
described  we  may  want  to  accomplish  more,  we  may 
want  to  have  a  geometry  possessing  a  structure 
of  its  own  which  might  be  used  in  interpreting 
the  outside  world.  Such  a  possibility  is  sug- 
gested by  the  consideration  of  curved  surfaces. 
The space-time we have been working with is of
the same character as a plane (except for the
number of dimensions); it is as devoid of
structure as a plane. A curved surface, on the
other hand, possesses a certain structure; it is
not  necessarily  the  same  in  all  points,  there 
may  be  a  difference  in  curvature.  We  shall  in- 
vestigate the  possibility  of  a  four-dimensional 
space  which  bears  the  same  relationship  to  our 
flat  space-time  as  a  curved  surface  bears  to  a 
plane;  we  shall  expect  to  find  that  it  possess- 
es a  certain  structure  which  we'll  try  to  inter- 
pret in  terms  of  our  physical  quantities;  more 
specifically,  since  all  our  physical  quantities 
have been combined (by formula 11.5) into a
symmetric tensor of the second rank, viz., $T_{ij}$, we
shall expect to find a tensor of that character
connected with the curvature of our curved
four-dimensional space.

The  plan  of  our  study  will  be  to  begin  with 
the simplest case, a case that is even simpler
than  that  of  a  surface,  viz.,  with  the  case  of 
a  curve,  and  then  to  work  up  gradually. 


18.  Curvature  of  Curves  and  Surfaces. 

We  consider  a  curve  in  the  plane.   We  as- 
sume that  it  possesses  a  tangent  at  every  point, 
and,  furthermore,  that  if  the  origin  of  coordi- 
nates is  chosen  in  any  point  of  the  curve,  and 
the  tangent  at  the  origin  is  chosen  as  the  x- 
axis,  the  curve,  in  the  neighborhood  of  the  ori- 
gin may  be  represented  by  a  function 

$$y = f(x),$$

which  can  be  expanded  into  a  power  series  con- 
verging in  the  neighborhood  of  the  origin.  The 
constant  term  of  this  expansion  vanishes  because 
the  curve  passes  through  the  origin  so  that  the 
equation  must  be  satisfied  for  x  =  0,  y  =  0;  the 
coefficient  of  the  first  power  of  x  also  van- 
ishes, since  the  slope  of  the  tangent,  which  is 
the  x-axis,  must  be  zero;  if  we  write  the  next 
term in the form

$$\tfrac{1}{2} a x^2,$$

the coefficient a is called the curvature of
the curve at the point considered, i.e., the point
chosen for the origin. Since every point can
be chosen for the origin this assigns to every
point of the curve a curvature. We may say
that if we drop all terms of the expansion
following the one just written out, i.e., consider
the curve

18.1
$$y = \tfrac{1}{2} a x^2,$$


this  curve  (a  parabola)  is  an  approximation  to 
the  given  curve  in  the  neighborhood  of  the  point 
considered. 
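
A worked example may help fix the definition. It is a sketch, assuming the quadratic term is written as one half of the curvature times $x^2$, and taking a circle of radius $R$ (a choice made here for illustration) that touches the x-axis at the origin. Expanding its lower arc in powers of $x$:

```latex
y = R - \sqrt{R^2 - x^2}
  = \frac{x^2}{2R} + \frac{x^4}{8R^3} + \ldots ,
\qquad\text{so that}\qquad a = \frac{1}{R}.
```

The curvature of the circle is thus the reciprocal of its radius at every point, and the approximating parabola is $y = x^2/(2R)$.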

We  consider  next  a  surface.  Here  we  as- 
sume that  it  has  a  tangent  plane  at  every  point. 
Taking  a  particular  point  on  the  surface  as  the 
origin  and  the  tangent  plane  at  that  point  as 
the  xy-plane  we  may  represent  the  surface  by  an 
equation  z  =  f(x,y).  We  again  assume  that  for 
every  point  on  the  surface  this  function  may  be 
developed  into  a  power  series  converging  in  the 
neighborhood  of  that  point.  The  constant  term 
and  the  coefficients  of  the  first  powers  of  the 
variables  will  vanish  as  before.  We  write  out 
the  next  group  of  terms,  those  that  are  quadra- 
tic in  the  variables,  in  the  form 


18.2
$$2z = a_{11} x_1^2 + 2 a_{12} x_1 x_2 + a_{22} x_2^2,$$

where we use $x_1$ for x, and $x_2$ for y.

We may consider the coordinates $x_1$, $x_2$ as
the components of a vector in the tangent plane
which joins the origin to the projection of the
point on the surface, whose coordinates are $x_1$,
$x_2$, $z = f(x_1, x_2)$. The expression 18.2 assigns
thus  to  every  vector  in  the  tangent  plane  a  num- 
ber (which  may  be  considered  as  the  ordinate  of 
a  paraboloid  approximating  the  surface)  .  This 
assignment  is  independent  of  the  coordinate  sys- 
tem, i.e.,  if  we  choose  another  system  of  coor- 
dinates we  shall  have  the  same  number  assigned 
to  the  same  vector  although  its  components  will 
have  changed;  in  fact  in  a  rotation  of  the  co- 
ordinate axes  the  degree  of  a  polynomial  is  not 
affected  so  that  the  group  of  second  degree 
terms  in  the  expansion  of  z  is  transformed  into 
the group of second degree terms of that
expansion in the new coordinates. We have then a
function  whose  values  are  numbers  and  whose  ar- 
gument is  a  vector;  is  it  a  tensor?  Of  course 
not;  but  it  is  easy  to  introduce  a  tensor  with 
which  our  function  18.2  is  closely  connected. 
We  simply  write,  as  in  9.1 


18.3
$$a_{11} x_1 y_1 + a_{12} x_1 y_2 + a_{21} x_2 y_1 + a_{22} x_2 y_2 = s(x,y)$$

using the coefficients $a_{11}$, $a_{12}$, $a_{22}$ of our
function 18.2 and writing in the third term $a_{21}$
for $a_{12}$ for the sake of symmetry; this is a
(symmetric)  tensor  of  the  second  rank  depending 
on  two  vector  arguments  x  and  y,  and  from  which 
our  function  is  obtained  by  setting  the  vector 
arguments  equal  to  each  other.  We  arrived  thus 
in  the  case  of  a  surface  at  a  tensor  of  rank 
two  which  expresses  the  curvature  properties, 
i.e.,  the  structure  of  the  surface  insofar  as  it 
describes  its  deviation  from  its  tangent  plane 
in  the  neighborhood  of  the  point  of  contact. 
This  encourages  us  in  our  enterprise:  if  we 
succeed  in  generalizing  this  result  to  higher 
dimensions, we may try to identify the
generalization of this symmetric tensor of rank two
with  the  symmetric  tensor  of  rank  two  which,  as 
we  saw,  combines  in  itself  matter  and  electric- 
ity. We  want  to  state  at  this  time  that  we  shall 
be  ultimately  successful  in  our  enterprise  but 
that  everything  will  not  run  very  smoothly,  and 
we  shall  have  to  make  an  effort  in  order  to  ar- 
rive in  the  general  case  at  a  tensor  of  rank 
two.  The  configurations  which  will  present  them- 
selves immediately  will  not  be  exactly  tensors, 
and  even  after  we  shall  arrive  at  a  tensor  it 
will  not  be  a  tensor  of  rank  two.  We  shall  have 
to  overcome  these  obstacles,  and  in  order  to  be 
able  to  do  that  we  shall  need  some  preparation, 
which  we  shall  make  by  studying  more  attentively 
the  case  of  a  surface  before  we  pass  to  the  con- 
sideration of  more  complicated  cases. 

The  curvature  of  a  curve  is  characterized 
by  a  number;  that  of  a  surface  is  a  more  compli- 


cated  thing  and  is  characterized  by  a  tensor; 
we  know,  however,  that  there  are  certain  num- 
bers connected  with  a  tensor,  in  an  intrinsic 
way  (that  is,  independent  of  the  system  of  co- 
ordinates), viz.,  the  numbers  given  by  9.5  and 
9.51, and we may expect that they have geometrical
significance. In fact the first one,
$a_{11} + a_{22}$, is known as the mean curvature, and
the second,

18.4    $K = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}$


as  the  total  curvature  of  the  surface  at  the 
point  considered.  We  know  that  K  is  independ- 
ent of  the  choice  of  a  system  of  coordinates, 
but we want to show how it can be obtained without
the use of any coordinates at all. We have
introduced  above  a  vector  notation  s(x,y)  for 
our  tensor  18.3;  we  now  write  out  the  expres- 
sion (the  expressions  in  coordinates  are  writ- 
ten down  for  the  sake  of  future  references  and 
may  be  disregarded  at  present): 


18.5    $\begin{vmatrix} s(x,u) & s(x,v) \\ s(y,u) & s(y,v) \end{vmatrix}$


where  x,y,u,v  are  arbitrary  vectors,  and  we  as- 
sert that  it  is  equal  to 


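18.6    $K \cdot \begin{vmatrix} x\cdot u & x\cdot v \\ y\cdot u & y\cdot v \end{vmatrix} = K \cdot \begin{vmatrix} x_\rho u_\rho & x_\rho v_\rho \\ y_\sigma u_\sigma & y_\sigma v_\sigma \end{vmatrix}$;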


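to prove this consider the expression

$\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} \cdot \begin{vmatrix} x_1 & x_2 \\ y_1 & y_2 \end{vmatrix} \cdot \begin{vmatrix} u_1 & v_1 \\ u_2 & v_2 \end{vmatrix}$;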


multiplying  by  the  law  of  multiplication  of  de- 
terminants the  second  and  the  third  factors  and 
writing  K  for  the  first,  we  get  18.6;  applying 
the  law  of  multiplication  of  determinants  first 
to  the  first  two  factors,  then  to  the  resulting 
determinant  and  the  third  factor  we  get  18.5; 
we  may  thus  write 


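18.7    $\begin{vmatrix} s(x,u) & s(x,v) \\ s(y,u) & s(y,v) \end{vmatrix} = K \cdot \begin{vmatrix} x\cdot u & x\cdot v \\ y\cdot u & y\cdot v \end{vmatrix}$;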


we  may  now  obtain  K  without  using  any  system  of 
coordinates  by  dividing  the  left  hand  side  by 
the  second  factor  on  the  right;  the  vectors 
x,y,u,v,  may  be  any  arbitrary  four  vectors,  only 
such  that  the  second  factor  on  the  right  does 
not vanish. Setting x = i, u = i, y = j, v = j,
where i, j are two coordinate vectors we get
formula  18.4;  now  it  is  seen  to  hold  for  any 
system  of  coordinate  vectors,  so  that  incident- 
ally we  have  a  new  proof  of  the  invariance  of  K. 
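
The relation 18.7 can be tried out numerically. The following is only a sketch: the coefficients and the vectors are illustrative values chosen for this check (they do not come from the text), and the function names are ours. It divides the left hand side of 18.7 by the second factor on the right for several quadruples x, y, u, v and compares the quotient with formula 18.4.

```python
def s(a, x, y):
    """Bilinear form 18.3 with coefficient table a, where a[0][1] == a[1][0]."""
    return sum(a[i][j] * x[i] * y[j] for i in range(2) for j in range(2))

def dot(x, y):
    """Ordinary scalar product of two vectors of the tangent plane."""
    return x[0] * y[0] + x[1] * y[1]

def det2(p, q, r, t):
    """Determinant with first row (p, q) and second row (r, t)."""
    return p * t - q * r

def curvature_from_vectors(a, x, y, u, v):
    """Left side of 18.7 divided by the second factor on the right."""
    left = det2(s(a, x, u), s(a, x, v), s(a, y, u), s(a, y, v))
    second = det2(dot(x, u), dot(x, v), dot(y, u), dot(y, v))
    return left / second

# Illustrative coefficients a11, a12 = a21, a22 (an assumption, not from the text).
a = [[2.0, 0.5], [0.5, -1.0]]
K = det2(a[0][0], a[0][1], a[1][0], a[1][1])  # formula 18.4

# Three quadruples x, y, u, v for which the second factor does not vanish.
samples = [
    ((1, 0), (0, 1), (1, 0), (0, 1)),
    ((1, 2), (3, -1), (0.5, 1), (2, 1)),
    ((2, 1), (1, 1), (1, -1), (3, 2)),
]
results = [curvature_from_vectors(a, *q) for q in samples]
print(K, results)
```

The first quadruple is the choice x = u = i, y = v = j made in the text; the other two show that the quotient does not depend on the vectors used.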


Before  we  leave  the  topic  of  ordinary  sur- 
faces we  want  to  establish  a  relation  between 
curvature  of  surfaces  and  curvature  of  curves. 
The  points  common  to  our  surface  and  the  xz- 
plane  constitute  a  plane  curve  whose  equation 
in  the  xz-plane  may  be  obtained  from  z  =  f (x,y) 
by  setting  y  =  0.   It  is  clear  that  the  x-axis 
is  a  tangent  to  this  curve  and  that  the  first 
term  of  the  expansion  of  z  as  a  function  f (x,0) 
into  a  power  series  will  be  obtained  by  setting 
x2  =  0  in  18.2.  We  have  thus 


18.11    $\tfrac12 a_{11} x_1^2$


as this first non-vanishing term, and comparing
with 18.1 we see that the curvature of the curve
is $a_{11}$, or (18.3) the value s(i,i). We can consider
any plane passing through the z-axis as
the xz-plane, or any unit vector in the tangent
plane as the coordinate vector i; we have thus
the result that the curvature at the point of
contact of a curve resulting from the intersection
of the surface with a normal plane is
s(i,i), if i is a unit vector common to the
tangent  plane  and  the  normal  plane  considered. 
In  other  words  to  every  direction  in  the  tan- 
gent plane,  characterized  by  a  unit  vector  i 
corresponds  a  normal  plane  containing  it,  and 
the  curvature  of  the  intersection  of  that  plane 
with  the  surface  is  s(i,i). 

We  see  thus  that  to  every  direction  in  the 
tangent  plane  corresponds  a  definite  number 
s(i,i),  the  curvature  in  that  direction.  As  an 
exercise  the  reader  may  try  to  express  the  cur- 
vature corresponding  to  a  direction  in  the  tan- 
gent plane  in  terms  of  the  angle  that  direction 
makes  with  the  x-axis. 
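
One way to approach this exercise numerically is sketched below. We assume that the direction is given by the unit vector i = (cos θ, sin θ), where θ is the angle with the x-axis, and we substitute it into s(i,i) as given by 18.3. The function name and the sample coefficients are ours and serve only as an illustration.

```python
import math

def normal_curvature(a11, a12, a22, theta):
    """Value s(i,i) of the form 18.3 for i = (cos theta, sin theta)."""
    c, sn = math.cos(theta), math.sin(theta)
    return a11 * c * c + 2 * a12 * c * sn + a22 * sn * sn

# Illustrative coefficients (an assumption, not taken from the text).
a11, a12, a22 = 2.0, 0.5, -1.0

k_x = normal_curvature(a11, a12, a22, 0.0)           # direction of the x-axis
k_y = normal_curvature(a11, a12, a22, math.pi / 2)   # perpendicular direction

# The sum of the curvatures in two perpendicular directions
# should not depend on the angle; it is a11 + a22.
sums = [
    normal_curvature(a11, a12, a22, t) + normal_curvature(a11, a12, a22, t + math.pi / 2)
    for t in (0.0, 0.4, 1.1, 2.5)
]
print(k_x, k_y, sums)
```

For θ = 0 the value reduces to the coefficient of the normal section in the xz-plane discussed above, and the constant sum is the first of the two invariants mentioned before 18.4.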

As  the  next  step  of  our  discussion  whose 
general  aim  is  to  arrive  at  the  most  general 
situation  as  far  as  the  number  of  dimensions  is 
concerned  both  of  the  space  from  which  we  start 
and  the  configuration  in  it  that  we  study,  we 
take  a  skew  curve  in  ordinary  space;  first  we 
studied  a  curve  (n  =1)  in  a  plane  (N  =  2);  then 
a  surface  (n  =  2)  in  the  ordinary  space  (N  =  3); 
now we take up the case n = 1, N = 3. A curve
may  be  given  in  general  by  two  equations  on  the 
three  coordinates  x,y,z.  Solving  these  equa- 
tions for  y  and  z,  we  represent  the  curve  in  the 
form y = f(x), z = g(x); we again make the assumption
that a tangent exists for every point
and that for every point, if we take this point
as the origin and the tangent as the x-axis, it
is  possible  to  solve  for  y  and  z,  and  that  the 
functions  f  and  g  can  be  developed  into  power 
series;  as  a  result  of  the  choice  of  the  coordi- 
nate system,  the  two  power  series  will  begin 
with  quadratic  terms 


18.8    $y = \tfrac12 a x^2 + \dots, \quad z = \tfrac12 b x^2 + \dots$


If  we  change  the  y-  and  z-axes  which  fall  into 
the normal plane to the curve, to other axes in
the same plane, the form of the developments
will not be changed, of course, but the coefficients
a, b will assume new values; if, however,
we consider these coefficients as components
of a vector, the vector represented by
them will be the same in all coordinate systems.
Calling this vector v we may say that
the  curvature  situation  of  the  curve  is  charac- 
terized by  the  expression 


18.9    $\tfrac12 v x^2$


This  expression  plays  the  part  of  the  expres- 
sions 18.1  and  18.2  which  have  occurred  in  the 
two  preceding  situations. 


19.  Generalizations.

In  the  preceding  section  we  discussed  con- 
figurations in  the  ordinary  space,  and  we  could 
rely  on  our  intuition;  everybody  can  conceive  a 
plane  curve,  a  surface,  a  twisted  curve;  we 
have  at  our  disposal  physical  objects  (drawings, 
graphs,  models)  with  measurements  on  which  quan- 
tities of  our  theories  may  be  identified  suc- 
cessfully. In  the  investigation  we  undertake 
now,  we  cannot  use  our  intuition  any  more,  and 
the  identifications,  when  they  come,  will  be  of 
a much less immediate character. We have then
to  rely  on  analogy  with  the  configurations  stud- 
ied in  the  preceding  section  and  on  mathematical 
reasoning  supported  by  formulas. 

We  begin  with  what  seems  the  next  simplest 
case,  a  surface  in  four-dimensional  space;  it 
may  be  considered  as  a  generalization  both  of  a 
surface and of a curve in ordinary space. Such
a surface is given, in general, by two equations on
the four coordinates; in other words, we define
as a surface in four-space the totality of
points whose coordinates satisfy two equations
F(x,y,z,t) = 0, G(x,y,z,t) = 0, where F, G are
two functions subjected to certain restrictions
to be imposed presently. We define a plane
in four dimensions as a surface which may be
given  by  two  linear  equations  (this  definition, 
although  given  in  terms  of  coordinates,  is  in- 
variant, because  it  can  be  proved  that  if  the 
equations are linear in one system of coordinates
they remain linear after a transformation;
the equivalence of this definition of the
plane  with  that  given  in  Section  7  is  easily 
recognized).  In  the  general  case  we  choose  a 
point  on  the  surface  as  the  origin  of  coordi- 
nates; we  solve  the  two  equations  for  two  of  the 
coordinates,  and  we  define  as  the  tangent  plane 
at  that  point  the  plane  whose  equations  result 
from  omitting  all  but  linear  terms  in  the  ex- 
pansions. We  next  choose  that  plane  as  one  of 
our  coordinate  planes;  the  lowest  terms  in  the 
expansions are then the quadratic ones; denoting
the coordinates for which the equations appear
as solved by $x_3$, $x_4$, and the two other
coordinates by $x_1$, $x_2$, we may write the groups
of quadratic terms in the two expansions as

19.1    $\tfrac12\,(a_{11}x_1^2 + 2a_{12}x_1x_2 + a_{22}x_2^2)$,
        $\tfrac12\,(b_{11}x_1^2 + 2b_{12}x_1x_2 + b_{22}x_2^2)$.

For every vector in the tangent plane of components
$x_1$, $x_2$ this gives us two numbers which
may be considered as the components of a vector
in the normal $x_3$, $x_4$ plane; or, we may consider
this vector as given by a vector form


19.2    $\tfrac12\,(v_{11}x_1^2 + 2v_{12}x_1x_2 + v_{22}x_2^2)$,


the coefficients $v_{ij}$ being vectors of the normal
plane whose components are $a_{ij}$ and $b_{ij}$ (along
the $x_3$ and $x_4$ axes respectively). This expression
assigns to every vector of the tangent
plane  a  vector  of  the  normal  plane;  we  may  sub- 
stitute for  it,  as  we  did  in  an  analogous  case 
before  (compare  18.3),  a  more  general  expres- 
sion 


19.3    $s(x,y) = v_{11}x_1y_1 + v_{12}x_1y_2 + v_{21}x_2y_1 + v_{22}x_2y_2$,

where $v_{21} = v_{12}$;


but although this is linear in each of the vectors
x and y it is not a tensor, because the
values of this expression are not numbers (they
are vectors in the normal plane). We shall not
introduce a special name for such expressions
because we shall not have to deal with them much;
the expression 19.3 we have denoted, as before,
by  s(x,y),  but  we  must  keep  in  mind  that  the 
values  of  s(x,y)  are  not  numbers  but  vectors  of 
the  normal  plane. 

We  may  in  this  case  form  the  expression 
18.5  where  it  is  understood  that  in  the  expan- 
sion of  the  determinant  scalar  products  have  to 
be  used  where  ordinary  products  were  used  be- 
fore; this  change  is  made  necessary  by  the  more 
than  once  mentioned  fact  that  the  values  of  the 
elements  are  vectors.  In  all  other  respects  we 
can  apply  to  the  expression  the  same  reasoning 
as  before  and  we  come  to  the  conclusion  that  the 
relation  18.7  remains  true,  where  K  is  a  number, 
independent  of  the  coordinate  systems  in  the  tan- 
gent and  normal  planes,  but  which  after  such  co- 
ordinate systems  have  been  chosen  can  be  calcu- 
lated in  terms  of  the  coefficients  of  the  vector 
form  19.3  by  means  of  the  formula 


19.4    $K = \begin{vmatrix} v_{11} & v_{12} \\ v_{21} & v_{22} \end{vmatrix}$.


The  important  fact  is  that,  although  our  expres- 
sion 19.3  is  a  vector  expression,  and  therefore, 
does not furnish us a tensor, the invariant corresponding
to 19.4 is still a number. The other
invariant, which could be called mean curvature
and written as $v_{11} + v_{22}$, is not a number in this
case.  The  number,  K,  we  call,  as  before,  the 
total  curvature  of  the  surface  at  the  point  con- 
sidered. 

In terms of the coefficients of the numerical
forms 19.1 the total curvature K may be
expressed as follows: expanding the determinant
19.4 we have $K = v_{11}\cdot v_{22} - v_{21}\cdot v_{12}$; the
term $v_{11}\cdot v_{22}$, for instance, is the scalar product
of the vectors $v_{11}$ and $v_{22}$ whose components
are respectively $a_{11}$, $b_{11}$ and $a_{22}$, $b_{22}$; the scalar
product $v_{11}\cdot v_{22}$ is then $a_{11}a_{22} + b_{11}b_{22}$; in
the same way, the term $v_{21}\cdot v_{12}$ in the expression
for K is $a_{12}a_{21} + b_{12}b_{21}$, so that we have for
K in terms of the a's and b's, rearranging terms
and using determinant notation,

19.41    $K = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} + \begin{vmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{vmatrix}$
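
The computation of K for a surface in four-space may be illustrated as follows. This is a sketch under our own assumptions: the coefficients a and b are sample values, and the function names are not from the text. It evaluates K once from scalar products of the normal vectors and once as the sum of the two determinants of the a's and of the b's, and then rotates the axes of the normal plane to see that K is not affected.

```python
import math

def det2(m):
    """Determinant of a 2x2 table."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def curvature_from_normal_vectors(a, b):
    """K = v11.v22 - v21.v12, where v_ij has components (a_ij, b_ij)."""
    v = [[(a[i][j], b[i][j]) for j in range(2)] for i in range(2)]
    dot = lambda p, q: p[0] * q[0] + p[1] * q[1]
    return dot(v[0][0], v[1][1]) - dot(v[1][0], v[0][1])

def rotate_normal_plane(a, b, phi):
    """Components of the same vectors v_ij after rotating the x3, x4 axes by phi."""
    c, sn = math.cos(phi), math.sin(phi)
    a_new = [[c * a[i][j] + sn * b[i][j] for j in range(2)] for i in range(2)]
    b_new = [[-sn * a[i][j] + c * b[i][j] for j in range(2)] for i in range(2)]
    return a_new, b_new

# Illustrative symmetric coefficients (an assumption, not from the text).
a = [[1, 2], [2, 3]]
b = [[2, -1], [-1, 4]]

K_vectors = curvature_from_normal_vectors(a, b)   # 11 - 5
K_determinants = det2(a) + det2(b)                # -1 + 7

a_rot, b_rot = rotate_normal_plane(a, b, 0.8)
K_rotated = det2(a_rot) + det2(b_rot)
print(K_vectors, K_determinants, K_rotated)
```

The individual determinants of the a's and of the b's change under the rotation; only their sum stays the same.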


The  next  generalization  is  an  easy  one;  we 
still  consider  a  surface  (n  =  2)  but  instead  of 
a  four-dimensional  space  we  take  a  space  of  an 
arbitrary number of dimensions N; we denote
N - n = N - 2 by r and we have a tangent plane,
as before, but instead of a normal plane we
have now a normal r-dimensional space, an r-flat
as we may say. We call the corresponding
coordinates $x_3$, $x_4$, etc., or $x_{2+k}$, where k goes
now from 1 to r instead of only taking the values
1 and 2; we have here a vector form which
may be written as before (19.2), only the v's
are vectors of the normal flat and have r components
each; these components we may distinguish
by upper indices in brackets; if we denote
by $I_k$ the r coordinate vectors in the normal
flat, and denote by $a^{(k)}_{ij}$ the components corresponding
to $I_k$ of $v_{ij}$ we may write


19.5    $v_{ij} = \sum_{k=1}^{r} a^{(k)}_{ij} I_k$


and  for  s(x,y)  we  may  write 


19.6    $s(x,y) = \sum_{k=1}^{r} a^{(k)}_{\alpha\beta}\,x_\alpha y_\beta\, I_k$;


otherwise there will be no changes. We can form
the expression 19.4 as before; it will be independent
of the choice of the coordinates $x_{2+k}$
because  the  scalar  products  used  in  the  expan- 
sion of  the  determinant  are;  substituting  the 
values  19.5  and  evaluating  we  will  have 


19.42    $K = \sum_{k=1}^{r} \begin{vmatrix} a^{(k)}_{11} & a^{(k)}_{12} \\ a^{(k)}_{21} & a^{(k)}_{22} \end{vmatrix}$,


an obvious generalization of formula 19.41, which
may be obtained from this by taking r = 2, and
writing a for $a^{(1)}$ and b for $a^{(2)}$ with proper
subscripts. We may, if we wish, write out an
expression  analogous  to  18.5.  Substituting  for 
s(x,u)  etc.,  the  expressions  19.6  and  using 
scalar  products  in  the  evaluation  of  the  deter- 
minant we  shall  find 


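19.43    $\begin{vmatrix} s(x,u) & s(x,v) \\ s(y,u) & s(y,v) \end{vmatrix} = \sum_{k=1}^{r} \begin{vmatrix} a^{(k)}_{\alpha\gamma}x_\alpha u_\gamma & a^{(k)}_{\alpha\delta}x_\alpha v_\delta \\ a^{(k)}_{\beta\gamma}y_\beta u_\gamma & a^{(k)}_{\beta\delta}y_\beta v_\delta \end{vmatrix}$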


where the Greek letter subscripts imply summation
from one to two corresponding to the tangent
plane, and the summation with respect to k
corresponding to the r coordinates of the normal
flat is indicated in the usual fashion. Each
of the determinants corresponding to the different
values of k is exactly of the same nature
as  18.5  so  that  the  reasoning  which  led  from  it 
to  18.7  applies  to  each  of  the  determinants 
without  change,  and  it  is  easy  to  see  that  form- 
ula 18.7  continues  to  hold.  We  may  use  this 
formula  to  define  K  which  we  continue  to  call 
total  curvature. 
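
That formula 18.7 continues to hold for a surface with an r-dimensional normal flat can be checked on an example. The sketch below takes r = 3 with sample integer coefficients of our own choosing; the names used are hypothetical. The values of s(x,y) are stored as lists of r components, and scalar products replace ordinary products in the expansion of the determinant.

```python
def s(a, x, y):
    """Components of the vector s(x,y) along the r normal coordinate vectors.

    a[k][i][j] stands for the coefficient with upper index k and lower indices i, j.
    """
    return [sum(ak[i][j] * x[i] * y[j] for i in range(2) for j in range(2)) for ak in a]

def dot(p, q):
    """Scalar product of two vectors given by their components."""
    return sum(pi * qi for pi, qi in zip(p, q))

def left_side(a, x, y, u, v):
    """Expression 18.5 expanded with scalar products in the normal flat."""
    return dot(s(a, x, u), s(a, y, v)) - dot(s(a, x, v), s(a, y, u))

def second_factor(x, y, u, v):
    """The determinant of scalar products on the right of 18.7."""
    return dot(x, u) * dot(y, v) - dot(x, v) * dot(y, u)

def total_curvature(a):
    """Sum over k of the determinants of the coefficient tables."""
    return sum(ak[0][0] * ak[1][1] - ak[0][1] * ak[1][0] for ak in a)

# Illustrative symmetric coefficients for r = 3 (an assumption, not from the text).
a = [
    [[1, 2], [2, 3]],
    [[2, -1], [-1, 4]],
    [[0, 1], [1, 0]],
]
x, y, u, v = (1, 2), (3, -1), (1, 1), (2, -1)

K = total_curvature(a)
lhs = left_side(a, x, y, u, v)
rhs = K * second_factor(x, y, u, v)
print(K, lhs, rhs)
```

Integer values are used so that the two sides can be compared exactly.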

And  now  we  come  to  the  last  generalization. 
We  consider,  in  a  space  of  an  arbitrary  number 
of  dimensions  N  a  curved  space  of  n  dimensions 
with n also an arbitrary number < N, which by
definition is the totality of points whose coordinates
satisfy N - n = r equations

19.7    $F_k(x_1, x_2, \dots, x_N) = 0 \quad (k = 1, 2, \dots, r).$

We assume that for every point in the curved
space these equations can be solved for r
of  the  coordinates  and  that  these  solutions  can 
be  expanded  into  power  series  in  the  remaining 
n  coordinates,  converging  in  the  neighborhood  of 
the  point  selected.  By  a  transformation  of  co- 
ordinates we  may  arrange  it  so  that  these  expan- 
sions begin  with  second  degree  terms  so  that 
we  may  write 


19.71    $x_{n+k} = \tfrac12\, a^{(k)}_{\alpha\beta}\, x_\alpha x_\beta$ + terms of higher degree


where  the  summation  indicated  by  the  Greek  in- 
dices now  goes  from  1  to  n.  The  sub-space  de- 
fined by  the  first  n  coordinate  axes  we  call  the 
tangent  flat  space  at  the  point  considered,  and 
the  sub-space  corresponding  to  the  remaining  r 
coordinate axes, the normal flat space at that
point.

As before, we use the coefficients $a^{(k)}_{ij}$ to
form the expressions


19.8    $a^{(k)}_{\alpha\beta}\, x_\alpha y_\beta$


where $x_i$, $y_i$ are two vectors of the tangent flat;
and we combine these expressions into a vector
expression

19.9    $s(x,y) = \sum_{k=1}^{r} a^{(k)}_{\alpha\beta}\, x_\alpha y_\beta\, I_k.$


It is natural to try to generalize the notion
of total curvature. We can form the expression
18.5 but, and this is important, the transformation
18.7 does not apply; it was based essentially
on the fact n = 2, and it breaks down
here. 
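
The breakdown may be seen on a small example. The following sketch assumes the simplest case n = 3, r = 1 (so that the values of s are numbers) with a diagonal table of coefficients chosen by us for illustration. It forms the quotient of expression 18.5 by the determinant of scalar products for two different choices of the vectors.

```python
def s(a, x, y):
    """The form s(x,y) for n = len(a) tangent dimensions and one normal direction."""
    n = len(a)
    return sum(a[i][j] * x[i] * y[j] for i in range(n) for j in range(n))

def dot(x, y):
    """Scalar product in the tangent flat."""
    return sum(p * q for p, q in zip(x, y))

def quotient(a, x, y, u, v):
    """Expression 18.5 divided by the determinant of scalar products."""
    left = s(a, x, u) * s(a, y, v) - s(a, x, v) * s(a, y, u)
    second = dot(x, u) * dot(y, v) - dot(x, v) * dot(y, u)
    return left / second

# Illustrative coefficients for n = 3 (an assumption, not from the text).
a = [[1, 0, 0], [0, 2, 0], [0, 0, 3]]
e1, e2, e3 = (1, 0, 0), (0, 1, 0), (0, 0, 1)

q_12 = quotient(a, e1, e2, e1, e2)   # vectors in the plane of the first two axes
q_13 = quotient(a, e1, e3, e1, e3)   # vectors in the plane of the first and third axes
print(q_12, q_13)
```

For n = 2 the quotient was the same number K for all admissible vectors; here it depends on the vectors chosen, so that no single number K is defined by it.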


20.  The Riemann Tensor.

The  way  out  of  this  difficulty  is  very 
simple.  Although  relation  18.7  does  not  hold 
we  still  may  consider  its  left  hand  side;  it  is 
a function of the four vectors x, y, u, v, a function
which has numerical values; it is easy to
show  that  it  is  linear  in  each  of  the  vector 
arguments  (we  leave  this  proof  to  the  reader 
because  the  result  will  follow  later  from  form- 
ula 20.8);  it  is  therefore  a  tensor,  a  tensor 
of  rank  four;  we  call  it  the  Riemann  tensor, 
denote  it  by  R(x,y;u,v)  and  write 


20.1    $R(x,y;u,v) = \begin{vmatrix} s(x,u) & s(x,v) \\ s(y,u) & s(y,v) \end{vmatrix}$


We  have  then  at  every  point  of  the  curved 
space  a  tensor  of  rank  four  instead  of  a  number; 
it  is  connected  with  the  second  degree  terms  of 
the expansions 19.71 and therefore characterizes,
at least in part, the deviation of the expressions
of the $x_{n+k}$ from linearity, or of our
space  from  flatness.  The  Riemann  tensor  tells 
us  then  something  about  the  curvature  of  the 
curved  space,  and  it  is  often  called  the  curva- 
ture tensor. 

The  situation  we  have  now  reminds  us  of  a 
situation  in  Section  18.  The  curvature  of  a 
curve  was  a  number;  when  we  passed  to  a  surface 
we  found  that  its  curvature  was  characterized 
by a tensor; we succeeded in deriving from
this tensor a number K, so that we could express
(at least partially) the curvature of a
surface by a number. Now passing to higher
curved spaces we again obtain a tensor. In the
preceding situation we succeeded in interpreting
the tensor s(x,y) given by formula 18.3 in
terms of curvatures of certain curves on the
surface; we found that the value s(i,i) gives
the curvature of the normal section determined
by the unit vector i and the normal to the surface.
Is it possible to interpret the Riemann
tensor  in  an  analogous  way  as  giving  the  total 
curvatures  of  some  surfaces  on  our  curved  space? 
This  is  a  natural  question  to  ask,  and  the  ans- 
wer is  affirmative.  We  shall  prove,  in  fact, 
that  certain  values  of  the  Riemann  tensor  give 
us  total  curvatures  of  surfaces  situated  on  the 
curved space. Let i, j be two arbitrary mutually
perpendicular unit vectors of the tangent
flat; choose a set of coordinate vectors so
that i, j be two of them. Pass through i, j and
the normal flat, i.e., through the r vectors $I_k$,
a flat space of 2 + r dimensions; its points
will be those points of the N-space whose coordinates
$x_3, x_4, \dots, x_n$ vanish; the intersection
of  this  flat  space  with  the  given  curved  space 
will  be  a  surface,  i.e.,  a  two-dimensional  curv- 
ed space,  because  the  coordinates  of  its  points 
must  satisfy  the  r  equations  of  the  curved  space 
(19.7)  and  n  -  2  equations 


20.2    $x_3 = 0, \quad x_4 = 0, \quad \dots, \quad x_n = 0,$


which together is N - n + n - 2 = N - 2 equations.
This surface we may consider as a surface
of the r + 2 dimensional flat space 20.2;
its  equations  in  that  space  will  be  obtained  by 
setting $x_3 = x_4 = \dots = x_n = 0$ in the equations
19.71  of  the  curved  space  (just  as  the  equation 
of  the  normal  section  of  a  surface  in  the  xz- 
plane  was  obtained  (preceding  formula  18.11)  by 
setting  y  =  0  in  the  equation  of  the  surface); 
these  equations  will  then  become 


20.3    $x_{n+k} = \tfrac12\,(a^{(k)}_{11}x_1^2 + 2a^{(k)}_{12}x_1x_2 + a^{(k)}_{22}x_2^2)$ + terms of higher degree

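and the total curvature of this surface is

$K = \begin{vmatrix} v_{11} & v_{12} \\ v_{21} & v_{22} \end{vmatrix}$

with

$v_{11} = \sum_{k} a^{(k)}_{11} I_k$, $v_{12} = v_{21} = \sum_{k} a^{(k)}_{12} I_k$, $v_{22} = \sum_{k} a^{(k)}_{22} I_k$;

but $v_{11} = s(i,i)$, $v_{12} = s(i,j)$, $v_{21} = s(j,i)$,
$v_{22} = s(j,j)$, so that we have

$K = \begin{vmatrix} s(i,i) & s(i,j) \\ s(j,i) & s(j,j) \end{vmatrix}$,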


which is R(i,j;i,j), and our statement is proved.
As  we  saw  at  the  end  of  Section  18,  a  unit  vec- 
tor i  in  the  tangent  plane  to  an  ordinary  sur- 
face determines  a  direction,  a  straight  line, 
which,  together  with  the  normal  determines  a  nor- 
mal plane,  and  the  intersection  of  that  normal 
plane  with  the  surface  is  a  normal  section  of 
curvature  s(i,i);  here  we  have  the  situation 
that  two  unit  vectors  i,j  in  the  tangent  flat  to 
a curved space determine an orientation, a plane,
which  together  with  the  normal  r-flat  determines 
a  normal  r  +  2  flat,  and  the  intersection  of 
that  normal  r  +  2  flat  with  the  curved  space  is 
a  normal  section,  a  surface  of  curvature 
R(i,j;i,j). We see then that the Riemann tensor
plays with respect to a curved space a role
analogous  to  that  played  by  the  tensor  s(x,y) 
with  respect  to  an  ordinary  surface;  our  expec- 
tations then  are  fulfilled;  we  need,  it  is  true, 
for the purposes of identification with the complete
tensor T a tensor of the second rank,
but we shall get one of rank two from R(x,y;u,v)
by applying to it the operation of contraction.
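
As an illustration of the interpretation of R(i,j;i,j) we may take a case that is not treated in the text: a three-dimensional sphere of radius ρ in four-dimensional flat space. We assume that near a point, with the tangent flat as coordinate flat, the single normal coordinate begins with the terms -(x_1² + x_2² + x_3²)/(2ρ), so that r = 1 and the coefficients are -1/ρ on the diagonal and zero elsewhere. The sketch below (names are ours) evaluates R(i,j;i,j) for two different pairs of perpendicular unit vectors.

```python
import math

def s(a, x, y):
    """The form s(x,y); with r = 1 its values are numbers."""
    n = len(a)
    return sum(a[i][j] * x[i] * y[j] for i in range(n) for j in range(n))

def R(a, x, y, u, v):
    """The determinant 20.1 for the case of a single normal direction."""
    return s(a, x, u) * s(a, y, v) - s(a, x, v) * s(a, y, u)

rho = 2.0   # assumed radius of the sphere
n = 3
a = [[-1.0 / rho if i == j else 0.0 for j in range(n)] for i in range(n)]

# Two coordinate vectors of the tangent flat.
i_vec, j_vec = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
K_section = R(a, i_vec, j_vec, i_vec, j_vec)

# Another pair of mutually perpendicular unit vectors.
theta = 0.3
i_rot = (math.cos(theta), 0.0, math.sin(theta))
j_rot = (0.0, 1.0, 0.0)
K_other = R(a, i_rot, j_rot, i_rot, j_rot)
print(K_section, K_other, 1.0 / rho ** 2)
```

Under the stated assumption every normal section of this kind has the total curvature 1/ρ², which is the total curvature of an ordinary sphere of radius ρ.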

In  the  meantime  let  us  study  the  Riemann 
tensor, or the curvature tensor as it is called
sometimes, as we have it. The Riemann tensor
is  not  a  general  tensor  of  rank  four.  It  sat- 
isfies the  relations 

20.41  $R(x,y;u,v) = -R(y,x;u,v) = -R(x,y;v,u)$,

20.42  $R(x,y;u,v) = R(u,v;x,y)$,

20.43  $R(x,y;u,v) + R(x,u;v,y) + R(x,v;y,u) = 0$,

which  are  easily  verified  by  using  20.1.   The 
first  of  these  relations  says  that  R  is  anti- 
symmetric in  each  of  the  pairs  of  the  vector 
arguments,  and  the  second,  that  it  is  symmetric 
in  the  two  pairs. 
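
The verification by means of 20.1 can also be carried out on numbers. In the sketch below we assume n = 3 and r = 2 with sample symmetric integer coefficients; the vectors and all names are our own choices. Each of the quantities computed at the end should vanish if the three relations hold.

```python
def s(a, x, y):
    """Components of s(x,y) along the normal coordinate vectors; a[k] is symmetric."""
    n = len(x)
    return [sum(ak[i][j] * x[i] * y[j] for i in range(n) for j in range(n)) for ak in a]

def dot(p, q):
    """Scalar product of two vectors of the normal flat."""
    return sum(pi * qi for pi, qi in zip(p, q))

def R(a, x, y, u, v):
    """The determinant 20.1, expanded with scalar products."""
    return dot(s(a, x, u), s(a, y, v)) - dot(s(a, x, v), s(a, y, u))

# Illustrative symmetric coefficients for n = 3, r = 2 (an assumption).
a = [
    [[1, 2, 0], [2, 3, 1], [0, 1, -1]],
    [[2, 0, 1], [0, 1, 0], [1, 0, 4]],
]
x, y, u, v = (1, 0, 2), (0, 1, 1), (1, 1, 0), (2, -1, 1)

first_pair = R(a, x, y, u, v) + R(a, y, x, u, v)     # antisymmetry in x, y
second_pair = R(a, x, y, u, v) + R(a, x, y, v, u)    # antisymmetry in u, v
pair_exchange = R(a, x, y, u, v) - R(a, u, v, x, y)  # symmetry in the two pairs
cyclic = R(a, x, y, u, v) + R(a, x, u, v, y) + R(a, x, v, y, u)
print(first_pair, second_pair, pair_exchange, cyclic)
```

A numerical check of this kind does not replace the proof, but it guards against errors of sign in writing the relations.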

If  we  introduce  a  coordinate  system  in  the 
tangent  flat,  by  picking  four  coordinate  vectors 
i, j, k, l or $i_1$, $i_2$, $i_3$, $i_4$, we may represent the
vector arguments (as in Section 9) in the form
$x = i_\alpha x_\alpha$, etc., substitute these expressions into
R(x,y;u,v) and, by using linearity as defined
in 9.6, write the Riemann tensor as


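20.5    $R(x,y;u,v) = R_{\alpha\beta;\gamma\delta}\, x_\alpha y_\beta u_\gamma v_\delta$

where

20.6    $R_{ab;cd} = R(i_a, i_b; i_c, i_d)$.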


(We  use  here  the  first  letters  of  the  alphabet 
as  subscripts,  instead  of  i,  etc.,  as  before, 
in  order  to  avoid  confusion  with  the  coordinate 
vectors  which  we  denote  by  i.)  These  are  the 
components of the Riemann tensor in the coordinate
system chosen. The relations 20.4 can be
written  in  components  as 

20.71  $R_{ab;cd} = -R_{ba;cd} = -R_{ab;dc}$

20.72  $R_{ab;cd} = R_{cd;ab}$

20.73  $R_{ab;cd} + R_{ac;db} + R_{ad;bc} = 0.$

Exercise.  Prove  that  the  number  of  independent 
components  of  a  Riemann  tensor  for  four  dimen- 
sions is  20. 
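
The count asked for in the exercise may be organized as in the following sketch, which is not a proof. It rests on an assumption that the reader should verify: after 20.71 and 20.72 are taken into account, the relation 20.73 gives one new independent condition for each set of four distinct indices and nothing new otherwise. The function name is ours.

```python
from math import comb

def riemann_independent_components(n):
    """Count of independent components under the relations 20.71 to 20.73."""
    pairs = comb(n, 2)                    # index pairs ab with a < b, by 20.71
    symmetric = pairs * (pairs + 1) // 2  # symmetric arrangement of two pairs, by 20.72
    cyclic = comb(n, 4)                   # assumed number of conditions added by 20.73
    return symmetric - cyclic

counts = {n: riemann_independent_components(n) for n in (2, 3, 4, 5)}
print(counts)
```

For n = 2 the count gives a single component, in agreement with the fact that the curvature of a surface is expressed by the one number K.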

The  vectors  of  the  flat  spaces  tangent  to 
the  curved  space  may  be  considered  as  belonging 
to  the  curved  space,  they  may  be  characterized 
in terms of the space itself, for instance, by giving
direction and length; they are accessible,
as we may say, to beings who live in the space
and  for  whom  points  outside  the  space  do  not 
exist.  Normal  vectors,  the  function  s(x,y)  etc., 
are, on the contrary, not accessible to the inhabitants
of the space. We shall confine ourselves
for the most part to the consideration of
these  internal  properties,  properties  accessible 
to  the  inhabitants;  but  later  in  the  course  of 
our investigation we shall have to use the
expression of the Riemann tensor in terms of the
coefficients $a^{(k)}_{ij}$ and we shall conclude this section
by deducing it.

Substituting  the  expression  19.9  for  s  in- 
to 20.1  and  using  20.5  for  the  left  hand  side, 
we  find 


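20.8    $R_{\alpha\beta;\gamma\delta}\, x_\alpha y_\beta u_\gamma v_\delta = \begin{vmatrix} \sum_k a^{(k)}_{\alpha\gamma}x_\alpha u_\gamma I_k & \sum_l a^{(l)}_{\alpha\delta}x_\alpha v_\delta I_l \\ \sum_k a^{(k)}_{\beta\gamma}y_\beta u_\gamma I_k & \sum_l a^{(l)}_{\beta\delta}y_\beta v_\delta I_l \end{vmatrix}$;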


this determinant may be presented as the sum of
$r^2$ determinants, of which, however, only r are
different from zero, namely those in which the same
I appears in the two columns, because in the expansions
of the others all terms vanish as involving
products of different and therefore mutually
perpendicular I's. What remains is (compare
19.43)


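$\sum_{k=1}^{r} \begin{vmatrix} a^{(k)}_{\alpha\gamma}x_\alpha u_\gamma & a^{(k)}_{\alpha\delta}x_\alpha v_\delta \\ a^{(k)}_{\beta\gamma}y_\beta u_\gamma & a^{(k)}_{\beta\delta}y_\beta v_\delta \end{vmatrix}$

because the I's are unit vectors and $I_k \cdot I_k = 1$,
or

$\sum_{k=1}^{r} \begin{vmatrix} a^{(k)}_{\alpha\gamma} & a^{(k)}_{\alpha\delta} \\ a^{(k)}_{\beta\gamma} & a^{(k)}_{\beta\delta} \end{vmatrix} x_\alpha y_\beta u_\gamma v_\delta.$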


Comparing  this  to  the  left  hand  side  of  20.8  we 
have  the  required  expression 


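20.9    $R_{ab;cd} = \sum_{k=1}^{r} \begin{vmatrix} a^{(k)}_{ac} & a^{(k)}_{ad} \\ a^{(k)}_{bc} & a^{(k)}_{bd} \end{vmatrix}$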


21.  Vectors  in  General  Coordinates. 

In the last section we learned how to associate
with every point of a curved space a tensor
of rank four; for our physical interpretation
we  need  one  of  rank  two;  but  we  know  how  to  ob- 
tain from  a  tensor  of  rank  four  one  of  rank  two; 
we  have  to  apply  the  operation  of  contraction. 
The  result  we  shall  call  the  "contracted  Riemann 
tensor"  and  we  shall  expect  to  identify  it  with 
the  tensor  T.  The  first  question  we  have  to  ask 
ourselves in this connection is whether the contracted
Riemann tensor satisfies the equation
11.4, viz., $\partial T_{i\alpha}/\partial x_\alpha = 0$. But before we do that
we  have  to  go  through  quite  a  lengthy  development 
because  at  the  present  stage  we  do  not  know  how 
to  differentiate  tensors  on  a  curved  space.  In 
flat  space  we  could  consider  the  differential  of 
a  vector,  or,  more  exactly,  of  a  vector  field, 
by  (roughly)  considering  the  difference  of  two 




vectors  of  the  field  in  two  neighboring  points. 
In  curved  space,  or  on  a  curved  surface  two 
vectors  in  two  different  points  belong  to  two 
different  tangent  planes  and  their  difference 
is  not  a  vector  of  the  surface  at  all.   Or  we 
could  in  a  flat  space  adopt  a  cartesian  system 
of coordinates and introduce as the components
of  the  differential  the  derivatives  of  the  com- 
ponents of  the  given  tensor.  This  method  also 
is not applicable directly to curved space because
there is no such thing here as cartesian
coordinates. Each method, the geometrical
method and the coordinate method, could be so
modified as to apply to curved space. We shall develop
here the coordinate method because in addition
to permitting us to introduce differentiation,
our immediate concern now, it is indispensable
in treating special cases.

As we said before, there is no such thing
as  the  cartesian  system  of  coordinates  in  curved 
space,  because  there  are  no  straight  lines;  so 
we  shall  have  to  use  some  other  coordinates,  let 
us  say,  general  coordinates;  the  main  difficulty 
in  treating  curved  spaces  is  just  this,  viz., 
that  rectilinear  coordinates  are  not  applicable 
here; or we may say: part of the difficulty
(the greater part) lies in the fact that we have
to use curvilinear coordinates, and part in the
fact that the situation itself is so different
from  that  we  encounter  in  flat  space  and  with 
which  we  are  more  or  less  familiar.  Or  to  put 
it  in  a  still  different  form:  the  difficulty 
is  two-fold,  we  have  a  new  material  to  work  on 
and  we  have  to  use  new  tools.  To  obviate  the 
difficulty  we  are  going  to  divide  it;  we  already 
have  studied  curved  spaces  in  the  preceding  sec- 
tions; now  we  shall  try  to  become  familiar  with 
the  new  tool  -  the  method  of  general  coordinates, 
applying  it  to  the  old  material  -  ordinary  three- 
dimensional  space;  and  then  -  beginning  with 
Section  25,  we  shall  study  curved  space  by  means 
of  curvilinear  coordinates. 

The essential thing in the matter of coordinates
is that points receive names, the names
being  composed  of  numbers,  so  that  we  can  handle 
numbers,  which  we  can  do  by  means  of  formulas, 
instead  of  points  themselves.  There  are  many 
different  ways  of  establishing  a  correspondence 
between  points  and  triples  of  numbers;  in  the 
one  that  bears  the  name  of  Descartes  (Cartesius) 
the  three  numbers  which  are  assigned  to  a  point 
are  its  distances  from  three  mutually  perpend- 
icular planes;  there  does  not  seem  to  be  anything 
that  can  take  the  place  of  this  method  in  a  gen- 
eral curved  space  because  there  are  no  planes 
and  straight  lines;  still  we  may  use  coordinates; 
a  system  of  coordinates  on  a  special  curved  sur- 
face is  known  to  everybody,  even  to  those  who 
never  studied  analytic  geometry;  we  mean  the  sys- 
tem of  specifying  the  position  on  the  surface  of 
the  earth  by  means  of  latitude  and  longitude. 
Polar  coordinates  in  the  plane  or  in  space  fur- 
nish another  example  of  a  coordinate  system 
which  is  not  cartesian;  in  what  follows  we  shall 


use  an  entirely  arbitrary  system  of  coordinates; 
we  shall  assume  that  a  one-to-one  correspondence 
is  established  between  the  points  of  a  certain 
portion  of  space  (which  may  be  the  whole  space) 
and  the  triples  of  a  certain  set  of  triples  of 
numbers. We shall call these numbers $u_1$, $u_2$, $u_3$
or $u_i$ and we shall keep the notation $x_i$ for
some definite system of cartesian coordinates.
To every triple $u_1$, $u_2$, $u_3$ corresponds a point
whose cartesian coordinates (in some definite
system) are $x_i$; these numbers $x_i$ are therefore
determined by the $u_i$'s; we have three functions

21.1    $x_1 = X_1(u_1,u_2,u_3)$,  $x_2 = X_2(u_1,u_2,u_3)$,  $x_3 = X_3(u_1,u_2,u_3)$

which  are  defined  on  a  certain  range  of  triples. 
Conversely, if $x_1$, $x_2$, $x_3$ are the cartesian coordinates
of a point of the portion of our space
for which general coordinates have been introduced,
they determine three numbers $u_1$, $u_2$, $u_3$,
which therefore are functions of the x's


21.2    $u_1 = U_1(x_1,x_2,x_3)$,  $u_2 = U_2(x_1,x_2,x_3)$,  $u_3 = U_3(x_1,x_2,x_3)$,


which are defined for a certain range of triples
$x_i$ and are the inverse functions of the functions
21.1.
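
Polar coordinates in space, mentioned above, furnish a concrete instance of the functions 21.1 and 21.2. The sketch below is our own illustration: we take u_1 as the distance from the origin, u_2 as the angle with the third axis and u_3 as the azimuth, and we restrict ourselves to the portion of space where u_1 > 0, 0 < u_2 < π and -π < u_3 ≤ π, on which the correspondence is one-to-one.

```python
import math

def X(u1, u2, u3):
    """Functions 21.1 for polar coordinates: cartesian x_i from u_i."""
    return (
        u1 * math.sin(u2) * math.cos(u3),
        u1 * math.sin(u2) * math.sin(u3),
        u1 * math.cos(u2),
    )

def U(x1, x2, x3):
    """Functions 21.2, the inverse functions on the portion of space considered."""
    r = math.sqrt(x1 * x1 + x2 * x2 + x3 * x3)
    return (r, math.acos(x3 / r), math.atan2(x2, x1))

u = (2.0, 0.7, 1.2)       # a sample triple u_1, u_2, u_3
x = X(*u)                 # the point named by this triple
u_back = U(*x)            # the triple recovered from the cartesian coordinates
print(x, u_back)
```

Points of the third axis are excluded from the portion considered, because there the azimuth is not determined.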

We  have  to  handle  vectors  even  more  often 
than  we  have  to  handle  points,  and  we  want  to 
have  a  numerical  representation  also  for  vectors. 
Together  with  cartesian  coordinates  for  points 
goes  a  very  simple  numerical  representation  for 
vectors;  we  represent  a  vector  by  three  numbers 
which  are  the  differences  between  the  corres- 
ponding coordinates  of  its  end-points,  and  are 
called  the  components  of  the  vector;  of  course, 
in  a  different  system  of  cartesian  coordinates 
the  same  vector  will  have  other  components,  but 
as  long  as  we  keep  to  a  definite  coordinate  sys- 
tem, vectors,  as  well  as  points  have  definite 
names.  The  method  of  representing  vectors  by 
their  cartesian  components  has  the  great  advan- 
tage that  two  equal  vectors  have  equal  compo- 
nents, that  we  can  add  vectors  by  adding  their 
components,  and  multiply  a  vector  by  a  number 
by  multiplying  the  components  by  that  number; 
these  advantages  are  peculiar  to  the  cartesian 
method  and  cannot  be  reproduced  in  other  systems. 
The theory of curved space is differential geometry;
we cannot handle immediately by its methods
such things as a configuration consisting of two
points at a finite distance. If we want to do so we
have to introduce intermediate points, and instead of
subtraction we have here differentiation. There
are two ways in which vectors arise by differentiation,
and each gives rise to a system of notation
for vectors associated with a given coordinate
system for points; only for the rectangular
cartesian system do the two representations
coincide. The two ways in which a vector
appears as a result of differentiation are the


tangent  vector  of  a  curve  and  the  gradient  of  a 
field.  In  this  and  the  next  section  we  shall 
take  only  the  first  of  these  two  points  of  view. 
Given a curve in cartesian coordinates in parametric
form


21.3    $x(p),\quad y(p),\quad z(p),$


the  components  of  the  tangent  vector  are  obtain- 
ed by  differentiation 


21.4 


dx/dp,   dy/dp,   dz/dp. 


This  vector  is  determined  not  by  the  curve  alone, 
but  by  the  particular  parametric  representation 
we  are  using,  but  in  this  chapter  we  are  not 
going  to  change  the  parameter  often  and  we  shall 
speak  of  a  curve  when  we  mean  "curve  in  a  given 
parametric  representation",  and  of  the  tangent 
vector,  when  we  mean  "tangent  vector  resulting 
from  differentiation  with  respect  to  that  par- 
ticular parameter".  In  cartesian  coordinates, 
then,  the  components  of  the  tangent  vector  are 
obtained  by  differentiating  the  coordinates  of 
the  points  of  the  curve.  This  is  certainly  con- 
venient, and  we  may  ask  ourselves  whether  we 
could  not  reproduce  this  advantage  in  general 
coordinates. Let us try; the parametric representation
of the curve in the $u_i$'s can be obtained
by substituting $x_i(p)$ into 21.2; the $u_i$'s
become then functions of $p$, and this gives a
parametric  representation  of  the  same  curve  in 
general  coordinates;  let  us  agree  to  represent 
the  vector  which,  when  we  used  the  cartesian 
system  had  the  components  21.4,  by  the  three 
numbers 


21.5    $du_1/dp,\quad du_2/dp,\quad du_3/dp.$


We  have  then  the  required  system  of  repre- 
sentation; but  it  is  not  necessary,  every  time 
when  we  want  to  represent  a  vector  to  introduce 
a  curve  to  which  it  is  tangent;  we  shall  show 
how  to  find  the  components  21.5  when  we  are  giv- 
en the  cartesian  components  21.4  without  actual- 
ly considering  the  curve. 

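We have, considering that $u_1$ depends on
$x_1, x_2, x_3$ which in turn depend on $p$,

$du_1/dp = \partial u_1/\partial x_1 \cdot dx_1/dp + \partial u_1/\partial x_2 \cdot dx_2/dp + \partial u_1/\partial x_3 \cdot dx_3/dp,$

or, using the summation convention and applying
the same reasoning to $u_1, u_2, u_3$,

21.6    $du_i/dp = \partial u_i/\partial x_\alpha \cdot dx_\alpha/dp,$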


so that, if we denote the quantities 21.4 by $v_i$
and the quantities 21.5 by $V^i$, we have

21.7    $V^i = \partial u_i/\partial x_\alpha \cdot v_\alpha.$


Introducing the abbreviation

21.8    $a_{ij} = \partial u_i/\partial x_j$

we may write the last formulas as

21.9    $V^i = a_{i\alpha} v_\alpha.$


It  will  be  explained  later  (Section  23)  why  we 
use  in  the  left  hand  side  the  index  as  a  super- 
script. These  are  transformation  formulas  for 
vector  components  which  are  associated  with  the 
transformation formulas 21.2 for the coordinates
of points; the formula just written out permits
us to find the general components when the cartesian
components are given. In a similar way we
can  find  the  inverse  transformation  formulas  by 
starting  with  a  parametric  representation  of  a 
curve  in  general  coordinates,  substituting  the 
$u_i(p)$ into the formulas 21.1 and differentiating;
we  arrive  thus  at 


21.10    $v_i = b_{i\alpha} V^\alpha,$

where

21.11    $b_{ij} = \partial x_i/\partial u_j.$


Before  we  go  further  we  shall  use  the  fact 
that  the  formulas  21.9  and  21.10  are  inverses 
of  each  other  to  obtain  some  relations  on  the 
a's  and  b's.  Substituting  21.9  into  21.10  we 
get 


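21.12    $v_i = b_{i\alpha} a_{\alpha\beta} v_\beta;$

the left hand side may be written as $\delta_{i\beta} v_\beta$ so
that

$b_{i\alpha} a_{\alpha\beta} v_\beta = \delta_{i\beta} v_\beta$

and since this is an identity ($v$ being arbitrary)
we have

21.13    $b_{i\alpha} a_{\alpha j} = \delta_{ij}.$

In the same way we may obtain

21.14    $a_{i\alpha} b_{\alpha j} = \delta_{ij}.$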
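The transformation formulas 21.9 and 21.10 and the relations 21.13 and 21.14 can be checked numerically. The following Python sketch assumes plane polar coordinates $u = (r, \theta)$; the matrices are the partial derivatives 21.8 and 21.11 worked out by hand for that case, and the helper names and sample numbers are made up for the illustration.

```python
import math

def a_matrix(u):
    """a[i][j] = du_i/dx_j (21.8) for u = (r, theta)."""
    r, theta = u
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s / r, c / r]]

def b_matrix(u):
    """b[i][j] = dx_i/du_j (21.11)."""
    r, theta = u
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -r * s], [s, r * c]]

def matvec(m, v):
    """Matrix times vector, i.e. the sum over the repeated index."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in m]

def matmul(m, n):
    """Matrix product, used to check 21.13 and 21.14."""
    return [[sum(m[i][k] * n[k][j] for k in range(len(n)))
             for j in range(len(n[0]))] for i in range(len(m))]

u = (2.0, 0.75)
v = [0.3, -1.2]                        # cartesian components v_i
V = matvec(a_matrix(u), v)             # 21.9: general components V^i
v_back = matvec(b_matrix(u), V)        # 21.10: back to cartesian components
ab = matmul(a_matrix(u), b_matrix(u))  # 21.14: should be the unit matrix
ba = matmul(b_matrix(u), a_matrix(u))  # 21.13: should be the unit matrix
```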


We  want  to  be  able  to  operate  with  general 
components  of  vectors,  for  instance,  find  a 
scalar  product  of  two  vectors  given  in  their  gen- 
eral components;  it  is  easy  to  obtain  a  formula 
answering  this  question  by  passing  to  cartesian 
components,  and  then  applying  the  formula  for  the 
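scalar product in cartesian components. Let the
general components of two vectors be $V^i$ and $W^i$;
according to the formula 21.10 their cartesian
components are

$v_\gamma = b_{\gamma\alpha} V^\alpha$   and   $w_\gamma = b_{\gamma\beta} W^\beta,$

where we use another summation letter in the
second expression to avoid complications in
what follows. Now we write the scalar product
using the formula $v_\gamma w_\gamma$ and get

21.15    $v_\gamma w_\gamma = b_{\gamma\alpha} b_{\gamma\beta} V^\alpha W^\beta.$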


It follows that in order to be able to find
scalar products of vectors given by their general
components we have to know the quantities

21.16    $g_{ij} = b_{\gamma i} b_{\gamma j}.$


The a's and b's help to pass from a
certain cartesian system of coordinates to the
general system; they express a relation between
the general system and that particular cartesian
system, and thus are not of fundamental importance;
the quantities $g_{ij}$, on the contrary,
although they have been obtained by means of
the a's and b's, are independent, as their
significance shows, of any particular system of
cartesian coordinates; they characterize the
system of coordinates we are using in itself
(and,  as  we  shall  see  later,  they  characterize 
it  completely,  so  that  the  g's  are  all  we  need 
to  know  in  order  to  be  able  to  handle  vectors 
given  by  their  general  components).   The  a's 
as  well  as  the  b's  may  be  considered  either  as 
functions  of  the  x's,  or  as  functions  of  the 
u's.  The  g's  always  will  be  considered  as 
functions  of  the  u's. 
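A short Python sketch may make the role of the g's concrete. It assumes plane polar coordinates $u = (r, \theta)$, for which 21.16 gives $g_{11} = 1$, $g_{22} = r^2$, $g_{12} = g_{21} = 0$; the sample vectors and function names are invented for the example.

```python
import math

def a_matrix(u):
    """a[i][j] = du_i/dx_j (21.8) for u = (r, theta)."""
    r, theta = u
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s / r, c / r]]

def b_matrix(u):
    """b[i][j] = dx_i/du_j (21.11)."""
    r, theta = u
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -r * s], [s, r * c]]

def metric(u):
    """21.16: g[i][j] = sum over gamma of b[gamma][i] * b[gamma][j]."""
    b = b_matrix(u)
    return [[sum(b[k][i] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def scalar_product(u, V, W):
    """21.15 written with the abbreviation 21.16: g_{ab} V^a W^b."""
    g = metric(u)
    return sum(g[i][j] * V[i] * W[j] for i in range(2) for j in range(2))

u = (2.0, 0.75)
v, w = [0.3, -1.2], [1.5, 0.4]          # cartesian components
a = a_matrix(u)
V = [sum(a[i][k] * v[k] for k in range(2)) for i in range(2)]   # 21.9
W = [sum(a[i][k] * w[k] for k in range(2)) for i in range(2)]
cartesian_value = v[0] * w[0] + v[1] * w[1]
general_value = scalar_product(u, V, W)   # agrees with cartesian_value
```

Note that `scalar_product` needs only the g's and the general components; the a's appear above solely to manufacture test data.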

Before we go any farther we note that, as
immediately follows from the definition
21.16, the g's are symmetric in the indices:

21.16'    $g_{ij} = g_{ji}.$


In order to show the importance of the
g's let us deduce a formula for the length of
a curve given in general coordinates. Let the
curve be given by

21.31    $u_i(p).$

For a curve given in cartesian coordinates we
assume as known the formula (compare 12.1)

21.17    $s = \int \sqrt{(dx/dp)^2 + (dy/dp)^2 + (dz/dp)^2}\; dp$


where  s  is  the  arc  length  between  two  points. 
This  formula  involves  three  inverse  operations, 
that  of  integration,  that  of  taking  the  square 
root,  and  that  of  division.  It  is  not  pleasant, 
in  general,  to  have  to  do  with  these  operations, 
and  so  we  shall  free  our  formula  from  them,  and 
write  it  as 


21.17'    $ds^2 = dx^2 + dy^2 + dz^2.$


The formula as just written is not essentially
different from the one written above, and means
exactly the same thing. The sign $d$ may be taken
to mean differentiation with respect to some
unspecified parameter, since the correctness of
the formula does not depend on what parameter we
are using (provided that the same parameter is
used on both sides). We translate 21.17' now
into general components. Differentiating 21.1
we have

$dx = b_{1\alpha}\, du^\alpha,\quad dy = b_{2\alpha}\, du^\alpha,\quad dz = b_{3\alpha}\, du^\alpha;$

for $dx^2$ we may write $b_{1\alpha} du^\alpha \cdot b_{1\beta} du^\beta$; using similar
expressions for the other terms of 21.17' we
get

21.18    $ds^2 = g_{\alpha\beta}\, du^\alpha du^\beta,$

using the abbreviation 21.16 introduced before.
We see that the quantities $g_{ij}$ appear again.
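Formula 21.18 can be used directly to compute lengths without returning to cartesian coordinates. The sketch below is an illustration under stated assumptions: plane polar coordinates with $g_{11} = 1$, $g_{22} = r^2$, a simple midpoint rule for the integral, and two test curves chosen here because their lengths are known in closed form.

```python
import math

def metric(u):
    """g_ij for plane polar coordinates u = (r, theta): diag(1, r^2)."""
    r, _ = u
    return [[1.0, 0.0], [0.0, r * r]]

def curve_length(curve, velocity, p0, p1, steps=2000):
    """Integrate sqrt(g_ab du^a/dp du^b/dp) dp (21.18) by the midpoint rule."""
    h = (p1 - p0) / steps
    total = 0.0
    for n in range(steps):
        p = p0 + (n + 0.5) * h
        g, du = metric(curve(p)), velocity(p)
        ds2 = sum(g[a][b] * du[a] * du[b] for a in range(2) for b in range(2))
        total += math.sqrt(ds2) * h
    return total

# Quarter of a circle of radius 2: r = 2, theta = p, 0 <= p <= pi/2.
quarter_circle = curve_length(lambda p: (2.0, p), lambda p: (0.0, 1.0),
                              0.0, math.pi / 2)
# Spiral r = p, theta = p, 0 <= p <= 1, where ds = sqrt(1 + p^2) dp.
spiral = curve_length(lambda p: (p, p), lambda p: (1.0, 1.0), 0.0, 1.0)
```

The quarter circle should come out as $\pi$ (radius 2 times angle $\pi/2$), and the spiral as $\tfrac{1}{2}(\sqrt{2} + \sinh^{-1} 1)$, up to the error of the numerical integration.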


22.  Tensors  in  General  Coordinates. 

We  come  now  to  the  representation  of  ten- 
sors. We  know  that  a  tensor  is  a  function  which 
assigns  numbers  to  vectors,  and  the  question  of 
representation  will  be  simply  this:  given  the 
general  components  of  the  vector  arguments  how 
to  find  the  corresponding  value  of  the  tensor. 
We have already solved this problem for one particular
tensor, namely for the tensor of the
scalar product which we expressed in the preceding
section by means of the g's, and we shall
use the same method in the general case. Given
the cartesian components $f_{ij}$ of a tensor and the
general components $V^i$ and $W^i$ of two vector
arguments, to find the corresponding value of the
tensor. We pass from general components to cartesian
components by formulas 21.10 and substitute
these expressions into the expression
$f_{\gamma\delta} v_\gamma w_\delta$ for the value of the tensor; the result
is

$f_{\gamma\delta}\, b_{\gamma\alpha} V^\alpha\, b_{\delta\beta} W^\beta,$

and this may be written as

22.1    $F_{\alpha\beta} V^\alpha W^\beta$

if we set

22.2    $f_{\gamma\delta}\, b_{\gamma i} b_{\delta j} = F_{ij};$

we call $F_{ij}$ the general components of the tensor
whose cartesian components are $f_{ij}$, and we
see that the values of a tensor are expressed by
formula 22.1 in its general components and the
general components of the vector arguments in
the same way as in terms of cartesian components
of the tensor and the vector arguments. Formula
21.15 for the scalar product of two vectors
may be considered as a special case of 22.1. The
cartesian components of the tensor $g_{ij}$ are, of
course, the $\delta_{ij}$, and substituting $\delta$ for $f$ in
22.2 we get the expressions 21.16 for $g_{ij}$. We
treated here as an example a tensor of rank two;
similar calculations can be performed for a tensor
of any rank; we give the results for rank
one and three, leaving it to the reader to go
through the calculations:

22.11    $F_i = f_\gamma b_{\gamma i},\qquad F_{ijk} = f_{\gamma\delta\varepsilon}\, b_{\gamma i} b_{\delta j} b_{\varepsilon k}.$
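The statement that the value of a tensor is computed "in the same way" in general and in cartesian components can be tested numerically. The following sketch assumes plane polar coordinates and an arbitrary, made-up set of cartesian components $f_{ij}$; it forms the general components by 22.2 and compares the two values.

```python
import math

def a_matrix(u):
    """a[i][j] = du_i/dx_j (21.8) for u = (r, theta)."""
    r, theta = u
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s / r, c / r]]

def b_matrix(u):
    """b[i][j] = dx_i/du_j (21.11)."""
    r, theta = u
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -r * s], [s, r * c]]

def to_general(f, b):
    """22.2: F_ij = f_{cd} b_{c i} b_{d j}, summed over c and d."""
    n = range(2)
    return [[sum(f[c][d] * b[c][i] * b[d][j] for c in n for d in n)
             for j in n] for i in n]

def value(t, p, q):
    """Value of a rank-two tensor on two vectors: t_{ab} p_a q_b."""
    return sum(t[i][j] * p[i] * q[j] for i in range(2) for j in range(2))

u = (2.0, 0.75)
f = [[1.0, 2.0], [-0.5, 3.0]]          # cartesian components f_ij (made up)
v, w = [0.3, -1.2], [1.5, 0.4]         # cartesian vector arguments
a, b = a_matrix(u), b_matrix(u)
V = [sum(a[i][k] * v[k] for k in range(2)) for i in range(2)]   # 21.9
W = [sum(a[i][k] * w[k] for k in range(2)) for i in range(2)]
F = to_general(f, b)
cartesian_value = value(f, v, w)       # f_{cd} v_c w_d
general_value = value(F, V, W)         # 22.1: F_{ab} V^a W^b
```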


Now naturally the inverse problem presents
itself: given the general components of a tensor,
to find its cartesian components. The problem
can be solved by substituting in the expression
for the value of a tensor (we again take a
tensor of rank two as an example) given in terms
of general components, $F_{\gamma\delta} V^\gamma W^\delta$, the expressions
21.9 for the general components of the vector
arguments in terms of the cartesian components:
$V^i = a_{i\alpha} v_\alpha$, $W^i = a_{i\beta} w_\beta$; the result is

$F_{\gamma\delta}\, a_{\gamma\alpha} a_{\delta\beta}\, v_\alpha w_\beta;$

comparing this to the expression $f_{\alpha\beta} v_\alpha w_\beta$ for
the value of a tensor in cartesian components
we derive the desired transformation formula
for passing from general to cartesian components;
here are these formulas for the first
three ranks

22.3    $f_i = F_\alpha a_{\alpha i},\qquad f_{ij} = F_{\alpha\beta}\, a_{\alpha i} a_{\beta j},\qquad f_{ijk} = F_{\alpha\beta\gamma}\, a_{\alpha i} a_{\beta j} a_{\gamma k}.$


We  know  now  how  to  write  tensors  in  gener- 
al components,  and  we  want  to  find  out  how  to 
perform  operations  on  them.  Of  course,  we  could 
always  pass  from  general  components  to  carte- 
sian components,  perform  the  required  operations 
and  then,  if  the  result  is  a  tensor,  pass  back 
to  general  components;  but  instead  of  following 
this  program  in  every  special  case  as  it  pre- 
sents itself  we  shall  do  it  once  for  all  and 
derive  general  formulas  whose  application  in 
special  cases  is  much  more  convenient  than  ad 
hoc  calculations. 

We begin with the operation of contraction.
Given again a tensor of rank two by its general
components $F_{ij}$ we pass to its cartesian components
by formula 22.3 and now we contract by
taking the sum of components with equal indices
according to the original definition,

22.4    $f_{\gamma\gamma} = F_{\alpha\beta}\, a_{\alpha\gamma} a_{\beta\gamma} = F_{\alpha\beta}\, g^{\alpha\beta},$

where we use the abbreviation

22.5    $a_{i\gamma} a_{j\gamma} = g^{ij}.$

For tensors of higher ranks (contraction is
possible only for tensors whose rank is $\geq 2$)
entirely analogous formulas may be obtained easily;
indices which are not affected by contraction
may be simply disregarded, as follows from
similar calculations which are left to the
student. For example, the result of contracting
with respect to the second and fourth indices
a tensor of the fourth rank $F_{ijkl}$ will be

22.41    $F_{i\alpha k\beta}\, g^{\alpha\beta}.$


The quantities $g^{ij}$ introduced a moment ago
play quite an important role comparable to that
of the $g_{ij}$ with lower indices, and they are connected
with them by the formula

22.6    $g^{i\alpha} g_{\alpha j} = \delta_{ij}.$

To prove it, it suffices to substitute the expressions
21.16 and 22.5 for the two kinds of g's
and to apply formulas 21.13 and 21.14; we may also
notice that, as follows from the definition,

22.7    $g^{ij} = g^{ji},$

so that formula 22.6 may also be written as

22.61    $g^{\alpha i} g_{\alpha j} = \delta_{ij}.$
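The relation 22.6 between the two kinds of g's and the contraction formula 22.4 lend themselves to a quick numerical check. The sketch below again assumes plane polar coordinates; the cartesian components $f_{ij}$ are invented test data.

```python
import math

N = range(2)

def a_matrix(u):
    """a[i][j] = du_i/dx_j (21.8) for u = (r, theta)."""
    r, theta = u
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s / r, c / r]]

def b_matrix(u):
    """b[i][j] = dx_i/du_j (21.11)."""
    r, theta = u
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -r * s], [s, r * c]]

def upper_metric(u):
    """22.5: g^{ij} = a_{i c} a_{j c}, summed over c."""
    a = a_matrix(u)
    return [[sum(a[i][c] * a[j][c] for c in N) for j in N] for i in N]

def lower_metric(u):
    """21.16: g_{ij} = b_{c i} b_{c j}, summed over c."""
    b = b_matrix(u)
    return [[sum(b[c][i] * b[c][j] for c in N) for j in N] for i in N]

u = (2.0, 0.75)
g_up, g_low = upper_metric(u), lower_metric(u)
# 22.6: g^{i a} g_{a j} should be the unit matrix.
unit = [[sum(g_up[i][k] * g_low[k][j] for k in N) for j in N] for i in N]

f = [[1.0, 2.0], [-0.5, 3.0]]          # made-up cartesian components f_ij
b = b_matrix(u)
F = [[sum(f[c][d] * b[c][i] * b[d][j] for c in N for d in N) for j in N]
     for i in N]                       # 22.2
contraction_cartesian = f[0][0] + f[1][1]                               # f_{cc}
contraction_general = sum(F[i][j] * g_up[i][j] for i in N for j in N)   # 22.4
```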

Next  comes  the  operation  of  differentiation. 
The  result  of  differentiating  a  tensor  is  always 
a  tensor  of  rank  higher  by  one  than  the  given 
tensor;  its  components  will  have  one  more  in- 
dex than  those  of  the  given  tensor;  we  shall 
denote  them  by  simply  adding  a  new  index  preced- 
ed by  a  comma,  to  the  symbol  of  the  given  tensor. 

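Because the situation is slightly more complicated,
let us start translating differentiation
into general components with the simplest
case of a tensor of rank one given in general
components, $F_i$. We follow the same program:
as a first step we pass to cartesian components
by formula 22.3 and get $f_i = F_\alpha a_{\alpha i}$; we next
find the cartesian components of the differential
by simply differentiating with respect to cartesian
coordinates with the result

$f_{i,j} = \partial f_i/\partial x_j = \partial F_\alpha/\partial x_j \cdot a_{\alpha i} + F_\alpha \cdot \partial a_{\alpha i}/\partial x_j.$

As a third step we pass back to general components
using the formula 22.2 and arriving at the
result

$F_{i,j} = (\partial F_\alpha/\partial x_\delta \cdot a_{\alpha\gamma} + F_\alpha \cdot \partial a_{\alpha\gamma}/\partial x_\delta)\, b_{\gamma i} b_{\delta j};$

but $b_{\delta j}$ according to formula 21.11 is $\partial x_\delta/\partial u_j$,
so that this expression reduces to

$\partial F_\alpha/\partial u_j \cdot a_{\alpha\gamma} b_{\gamma i} + F_\alpha \cdot \partial a_{\alpha\gamma}/\partial u_j \cdot b_{\gamma i};$

using relation 21.14 the first term reduces to

$\partial F_i/\partial u_j,$

just what we would expect from analogy with cartesian
coordinates as an expression for the result
of differentiation; however, this is not
the whole answer because there is a second term,
so that the final result is

22.8    $F_{i,j} = \partial F_i/\partial u_j - \Gamma^\alpha_{ij} F_\alpha,$

where we set

22.81    $\Gamma^k_{ij} = -\partial a_{k\gamma}/\partial u_j \cdot b_{\gamma i};$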

the second term may be considered as being in
the nature of a correction to the expected result;
we call it the correction term, and we
call the $\Gamma^k_{ij}$ the correction coefficients. We see
then that in general coordinates the components
of the differential of a tensor of rank one consist
of two parts - the first expresses the
change (or rate of change) of components of the
tensor, the second is due to the change of the
coordinate system from point to point. In the
case of the cartesian system the coordinate system
is, so to say, the same in all points, and the
second term is zero (the a's reduce in this
case to constants, and their derivatives vanish);
another extreme example is furnished by a tensor
whose components are constants in some non-cartesian
system of coordinates (for instance, polar
coordinates); the derivatives of the components
with respect to the coordinates are zero
but the components of the differential are not;
their values are given by the correction terms
alone.
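The "extreme example" of constant components in polar coordinates can be made explicit. The sketch below is only an illustration: it assumes plane polar coordinates, takes the correction coefficients from 22.81 in the form $\Gamma^k_{ij} = -\partial a_{k\gamma}/\partial u_j \cdot b_{\gamma i}$ with the derivative of the a's estimated by central differences, and applies 22.8 to the constant components $F_1 = 1$, $F_2 = 0$.

```python
import math

def a_matrix(u):
    """a[i][j] = du_i/dx_j (21.8) for u = (r, theta)."""
    r, theta = u
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s / r, c / r]]

def b_matrix(u):
    """b[i][j] = dx_i/du_j (21.11)."""
    r, theta = u
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -r * s], [s, r * c]]

def gamma(u, h=1e-5):
    """22.81: G[k][i][j] = -(d a_{k c}/d u_j) b_{c i}, derivative by central differences."""
    b = b_matrix(u)
    G = [[[0.0] * 2 for _ in range(2)] for _ in range(2)]
    for j in range(2):
        up, dn = list(u), list(u)
        up[j] += h
        dn[j] -= h
        a_up, a_dn = a_matrix(up), a_matrix(dn)
        for k in range(2):
            for i in range(2):
                G[k][i][j] = -sum((a_up[k][c] - a_dn[k][c]) / (2 * h) * b[c][i]
                                  for c in range(2))
    return G

def differential_of_constant(F, u):
    """22.8 with dF_i/du_j = 0: F_{i,j} = -Gamma^a_{ij} F_a."""
    G = gamma(u)
    return [[-sum(G[al][i][j] * F[al] for al in range(2)) for j in range(2)]
            for i in range(2)]

u = (2.0, 0.75)
D = differential_of_constant([1.0, 0.0], u)   # only D[1][1] is nonzero, equal to r
```

Although every partial derivative of the components vanishes, the differential has the nonzero component $F_{2,2} = r$, coming from the correction term alone.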

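For tensors of rank higher than the first
the calculations are slightly more complicated,
but the principle is the same; we write out the
results for tensors of rank two and three

22.82    $F_{ij,k} = \partial F_{ij}/\partial u_k - \Gamma^\alpha_{ik} F_{\alpha j} - \Gamma^\alpha_{jk} F_{i\alpha},$

22.83    $F_{ijk,l} = \partial F_{ijk}/\partial u_l - \Gamma^\alpha_{il} F_{\alpha jk} - \Gamma^\alpha_{jl} F_{i\alpha k} - \Gamma^\alpha_{kl} F_{ij\alpha};$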


the  general  rule  ought  to  be  clear  now;  there 
are  as  many  correction  terms  as  there  are  in- 
dices; each  correction  term  corresponds  to  one 
index,  the  other  indices  being  disregarded  in 
its  formation. 

In order to be able to perform the operation
of contraction (and the operation of scalar
multiplication is a special case of it) in
general coordinates we have to know, as we saw,
the values of the g's; in order to be able to
perform the operation of differentiation we
have to know the values of the $\Gamma$'s (the correction
coefficients); if we know those we can perform
all the necessary operations in general coordinates
without going back to cartesian coordinates.
We shall show now how the values of the
$\Gamma$'s can be derived from those of the g's.

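The correction coefficients were given originally
by the formula 22.81; we can give to
this another form by using the relation 21.14;
writing it as $a_{k\alpha} b_{\alpha i} = \delta_{ki}$ and differentiating
it with respect to $u_j$ we get, since the $\delta$'s are
constants,

$\partial a_{k\alpha}/\partial u_j \cdot b_{\alpha i} + a_{k\alpha} \cdot \partial b_{\alpha i}/\partial u_j = 0,$

so that we have

$\Gamma^k_{ij} = a_{k\alpha} \cdot \partial b_{\alpha i}/\partial u_j,$

or, recalling the definition 21.11 of the b's,

22.84    $\Gamma^k_{ij} = a_{k\alpha} \cdot \partial^2 x_\alpha/\partial u_i \partial u_j;$

from this expression it follows that $\Gamma$ is not
affected by interchanging the two lower indices,
or

22.71    $\Gamma^k_{ij} = \Gamma^k_{ji}.$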

We may now show how the $\Gamma$'s can be derived
from the g's. We shall do that by using the
following artifice. Consider the tensor of scalar
multiplication, whose cartesian components
are the $\delta_{ij}$ and whose general components were
shown to be $g_{ij}$; the components of the differential
of this tensor in cartesian components are
the derivatives of the $\delta$'s and therefore zero;
the second formula 22.11 shows that the general
components of this tensor of the third rank also
must vanish, so that

22.85    $g_{ij,k} = 0$

(we did not promise that general components of
tensors will always be given by capital letters,
but since heretofore we have been using capital
letters for them it may be well to emphasize
that the g's are intended to represent (following
the generally accepted custom) general components
of the scalar multiplication tensor).
On the other hand, we can calculate the components
$g_{ij,k}$ by the application of formula
22.82 and so we get

$\partial g_{ij}/\partial u_k - \Gamma^\alpha_{ik} g_{\alpha j} - \Gamma^\alpha_{jk} g_{i\alpha} = 0;$


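this is a system of equations connecting the $\Gamma$'s
with the g's and their derivatives; we want to
solve them for the $\Gamma$'s. For that purpose we
write out the above relation in two more forms
resulting from it by cyclic interchanges of indices:

$\partial g_{jk}/\partial u_i - \Gamma^\alpha_{ji} g_{\alpha k} - \Gamma^\alpha_{ki} g_{j\alpha} = 0,$

$\partial g_{ki}/\partial u_j - \Gamma^\alpha_{kj} g_{\alpha i} - \Gamma^\alpha_{ij} g_{k\alpha} = 0;$

subtracting the last two relations from the first
we notice that as the result of symmetry of the
g's and the $\Gamma$'s in the lower indices (formulas
21.16' and 22.71) four of the terms containing
the $\Gamma$'s cancel and the remaining two are identical;
we thus have

22.9    $\tfrac{1}{2}(\partial g_{jk}/\partial u_i + \partial g_{ki}/\partial u_j - \partial g_{ij}/\partial u_k) = g_{k\alpha} \Gamma^\alpha_{ij}.$

We multiply now both sides by $g^{kl}$ and sum with
respect to $k$, writing for it a Greek index, e.g.,
$\beta$. Taking into account 22.61 we have

22.91    $\Gamma^l_{ij} = \tfrac{1}{2}\, g^{l\beta} (\partial g_{j\beta}/\partial u_i + \partial g_{\beta i}/\partial u_j - \partial g_{ij}/\partial u_\beta).$

This shows how, given the g's, to calculate the
$\Gamma$'s. We see thus that if only we are given
the g's as functions of the u's we can perform
all the required operations on tensors. Very
often the calculation of the $\Gamma$'s is divided into
two parts; first the left hand sides of
22.9 are calculated and listed; they are denoted
by $\Gamma_{k,ij}$; and then the $\Gamma^k_{ij}$ are calculated
using the formulas in the form

22.92    $\Gamma^k_{ij} = g^{k\alpha}\, \Gamma_{\alpha,ij}.$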
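The two-step calculation of the correction coefficients from the g's can be sketched in a few lines of Python. The example assumes plane polar coordinates, for which $g_{11} = 1$, $g_{22} = r^2$ and the other components vanish; the derivatives of the g's are estimated by central differences, and the function names are hypothetical.

```python
def metric(u):
    """g_ij for plane polar coordinates u = (r, theta): diag(1, r^2)."""
    r, _ = u
    return [[1.0, 0.0], [0.0, r * r]]

def inverse_metric(u):
    """g^{ij} for the same coordinates: diag(1, 1/r^2)."""
    r, _ = u
    return [[1.0, 0.0], [0.0, 1.0 / (r * r)]]

def d_metric(u, k, h=1e-5):
    """Central-difference estimate of d g_ij / d u_k, returned as a matrix."""
    up, dn = list(u), list(u)
    up[k] += h
    dn[k] -= h
    gu, gd = metric(up), metric(dn)
    return [[(gu[i][j] - gd[i][j]) / (2 * h) for j in range(2)] for i in range(2)]

def first_kind(u):
    """First step: G1[k][i][j] = (d_i g_jk + d_j g_ki - d_k g_ij) / 2."""
    dg = [d_metric(u, k) for k in range(2)]   # dg[k][i][j] = d g_ij / d u_k
    return [[[0.5 * (dg[i][j][k] + dg[j][k][i] - dg[k][i][j])
              for j in range(2)] for i in range(2)] for k in range(2)]

def second_kind(u):
    """Second step (22.92): G[k][i][j] = g^{k a} G1[a][i][j]."""
    g_up, G1 = inverse_metric(u), first_kind(u)
    return [[[sum(g_up[k][al] * G1[al][i][j] for al in range(2))
              for j in range(2)] for i in range(2)] for k in range(2)]

G = second_kind((2.0, 0.75))   # at r = 2: G[0][1][1] = -r, G[1][0][1] = 1/r
```

Only the g's enter the computation; no reference to cartesian coordinates or to the a's and b's is needed.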


23.  Covariant and Contravariant Components.

We know (Section 9) that a vector is a tensor
of rank one, or, more precisely, that to
every vector $v$ there corresponds a tensor of
rank one $v_i$ which has the same cartesian components.
Now we have introduced general components
for vectors and also for tensors; if we
have cartesian components of a vector $v_i$, to
them correspond (21.9) the general components

$V^i = a_{i\alpha} v_\alpha;$

also if we consider the $v_i$ as the components of
a tensor to them correspond (22.11) the general
components

$V_i = v_\alpha b_{\alpha i};$

to the same cartesian components $v_i$ there correspond
thus two different sets of general components
depending on whether we consider the $v_i$
as vector or as tensor components; it was in anticipation
of this situation that we have been
using the index for general vector components
as a superscript. Essentially, a vector and a
tensor  of  rank  one  are  one  and  the  same  thing; 
and  so  we  have  two  different  systems  of  compo- 
nents for  every  vector  (in  a  given  general  co- 
ordinate system)  ;  the  components  with  subscripts 
are  called  covariant  components,  those  with  the 
superscript  -  contravariant  components.  It  is 
clear  from  what  precedes,  but  it  may  be  worth- 
while to  repeat  that  we  have  here  two  different 
representations  of  one  and  the  same  thing. 

It  was  mentioned  in  Section  21  that  there 
are  two  ways  in  which  a  vector  results  from  dif- 
ferentiation; one,  a  vector  considered  as  a  tan- 
gent vector  to  a  curve,  was  discussed  before, 
and  is  the  basis  of  what  we  have  been  doing  all 
this  time;  it  is  interesting  to  consider  now 
briefly  the  other.  If  we  start  with  a  scalar 
field  f  =  f(x,y,z)  we  may  derive  a  vector  field 
by  differentiation,  and  the  cartesian  components 




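of this vector field will be

23.1    $\partial f/\partial x,\quad \partial f/\partial y,\quad \partial f/\partial z;$

this vector is known as the gradient of the field
$f$; now, we may give the same scalar field in general
coordinates

$f = f(x_1(u_1,u_2,u_3),\ x_2(u_1,u_2,u_3),\ x_3(u_1,u_2,u_3));$

if we differentiate $f$ with respect to $u_i$, will
we obtain general components of the gradient
vector field? The question is easily answered
by computing these components; we have

23.2    $\partial f/\partial u_i = \partial f/\partial x_\alpha \cdot \partial x_\alpha/\partial u_i = \partial f/\partial x_\alpha \cdot b_{\alpha i};$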


comparing this with 22.11 we see that the partial
derivatives $\partial f/\partial u_i$ are the components with
subscripts - the covariant components in general
coordinates of the gradient. The two representations,
the covariant and the contravariant,
may be thus considered as corresponding to
two ways in which a vector can be arrived at by
differentiation; if we consider a vector as a
gradient we arrive naturally at its representation
by covariant components, if we consider it
as a tangent vector we arrive at its representation
by contravariant components. (The name covariant,
by the way, is intended to indicate
that these components change in the same way as,
or have similar formulas of transformation to,
partial derivatives.) In the case of cartesian
coordinates, of course, covariant and contravariant
components of a given vector coincide; in
this case it is not necessary to make any distinction.
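That the partial derivatives $\partial f/\partial u_i$ transform like covariant components (23.2 compared with 22.11) can be verified on an example. The sketch below assumes plane polar coordinates and a made-up field $f = x^2 y$; the derivatives with respect to the u's are estimated by central differences and compared with the cartesian gradient multiplied by the b's.

```python
import math

def x_of_u(u):
    """21.1 for plane polar coordinates u = (r, theta)."""
    r, theta = u
    return (r * math.cos(theta), r * math.sin(theta))

def b_matrix(u):
    """b[i][j] = dx_i/du_j (21.11)."""
    r, theta = u
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -r * s], [s, r * c]]

def f_cartesian(x):
    """A made-up scalar field f(x, y) = x^2 y."""
    return x[0] ** 2 * x[1]

def grad_cartesian(x):
    """23.1: (df/dx, df/dy) for the field above, differentiated by hand."""
    return [2 * x[0] * x[1], x[0] ** 2]

def grad_general_numeric(u, h=1e-6):
    """df/du_i by central differences of f(x(u))."""
    out = []
    for i in range(2):
        up, dn = list(u), list(u)
        up[i] += h
        dn[i] -= h
        out.append((f_cartesian(x_of_u(up)) - f_cartesian(x_of_u(dn))) / (2 * h))
    return out

def grad_general_from_cartesian(u):
    """23.2: df/du_i = (df/dx_a) b_{a i}, the covariant rule 22.11."""
    b, gx = b_matrix(u), grad_cartesian(x_of_u(u))
    return [sum(gx[al] * b[al][i] for al in range(2)) for i in range(2)]

u = (2.0, 0.75)
numeric = grad_general_numeric(u)
by_formula = grad_general_from_cartesian(u)   # agrees with numeric
```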

We shall have to use covariant as well as
contravariant components, and it is important
to be able to pass from one to the other representation;
the necessary formulas can be found,
of course, by passing through a cartesian representation.
Let covariant components $F_i$ of a
tensor of rank one (or vector) be given; formulas
22.3 show us that the corresponding cartesian
components are $f_i = F_\alpha a_{\alpha i}$; in terms of these
the contravariant components are obtained by
formula 21.9 which gives here

23.3    $F^i = a_{i\rho} F_\alpha a_{\alpha\rho} = g^{i\alpha} F_\alpha,$

if abbreviation 22.5 is used. In the same way
it is easy to prove the following formula, which
permits us to calculate covariant components when
the contravariant components are given

23.31    $F_i = g_{i\alpha} F^\alpha.$


It may seem that there is a wasteful redundancy
in this double system of notations, that
one representation is enough, and that to have
two means to indulge in luxury; as a matter of
fact this double notation is a defect from a
didactical point of view: it makes it more
difficult to learn the new language; but once
mastered it makes the calculations much simpler
and the formulas much shorter and more elegant,
if properly used; as an example, we want to
give the formula for the scalar product of two
vectors, one of which is given in covariant,
and the other in contravariant components. This
formula can be obtained by the usual procedure,
i.e., passing through cartesian components, but
we have already reached a stage where we can
dispense to a great extent with the use of cartesian
coordinates. The required formula is
simply

23.4    $V^\alpha W_\alpha,$

and it can be proved by simply substituting for
$W_\alpha$ its expression according to formula 23.31,
viz., $g_{\alpha\beta} W^\beta$, and comparing the result to 21.15;
of course, the scalar product could also be
written as $V_\alpha W^\alpha$, and also $g^{\alpha\beta} V_\alpha W_\beta$, as it is
easy to verify.
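The four ways of writing the scalar product can be compared on a small numerical example. The sketch assumes plane polar coordinates at a point with $r = 2$, so that $g_{11} = 1$, $g_{22} = 4$, $g^{11} = 1$, $g^{22} = 1/4$; the contravariant components of the two vectors are made up.

```python
def lower(g_low, V_up):
    """23.31: V_i = g_{i a} V^a."""
    return [sum(g_low[i][a] * V_up[a] for a in range(2)) for i in range(2)]

def raise_index(g_up, V_low):
    """23.3: V^i = g^{i a} V_a."""
    return [sum(g_up[i][a] * V_low[a] for a in range(2)) for i in range(2)]

r = 2.0
g_low = [[1.0, 0.0], [0.0, r * r]]          # g_ij for plane polar coordinates
g_up = [[1.0, 0.0], [0.0, 1.0 / (r * r)]]   # g^{ij}

V_up, W_up = [0.5, -0.2], [1.0, 0.3]        # contravariant components (made up)
V_low, W_low = lower(g_low, V_up), lower(g_low, W_up)

products = [
    sum(V_up[a] * W_low[a] for a in range(2)),                                   # 23.4
    sum(V_low[a] * W_up[a] for a in range(2)),
    sum(g_low[a][b] * V_up[a] * W_up[b] for a in range(2) for b in range(2)),    # 21.15
    sum(g_up[a][b] * V_low[a] * W_low[b] for a in range(2) for b in range(2)),
]
V_round_trip = raise_index(g_up, V_low)     # lowering then raising restores V^i
```

All four entries of `products` coincide, and the mixed form 23.4 is the only one that needs no g's at all.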

In  Section  22  we  derived  a  system  of  repre- 
sentation for  tensors  starting  with  the  contra- 
variant  representation  of  vectors;  we  could  do 
the  same  thing  starting  with  the  covariant  rep- 
resentation of  vectors,  and  we  shall  do  it  so 
as  to  have  a  perfectly  symmetrical  system  of 
notations. 

Suppose we are given covariant components
of two vectors $V_i$, $W_i$ and the components $F_{ij}$ of
a tensor, and we want to find the value of the
tensor corresponding to the given vectors as arguments;
we know how to solve the problem if
the vectors are given by their contravariant
components; therefore, let us calculate first
the contravariant components, viz., $V^i = g^{i\gamma} V_\gamma$,
$W^i = g^{i\delta} W_\delta$, and then substitute them into the
left hand side of expression 22.1 giving the
value of the tensor. The result is

23.4    $F_{\alpha\beta}\, g^{\alpha\gamma} V_\gamma\, g^{\beta\delta} W_\delta,$

which may be written as

23.41    $F^{\gamma\delta} V_\gamma W_\delta$

if we introduce the notation

23.42    $F^{ij} = F_{\alpha\beta}\, g^{\alpha i} g^{\beta j}.$


We call $F^{ij}$ the contravariant components of the
tensor $F_{ij}$, and the components with lower indices
(subscripts) which we have been using for tensors
heretofore we call covariant components.
We have thus two representations not only for
tensors of the first rank (vectors) but also for
tensors of all ranks. In one case we have been
using already a symbol with two superscripts,
viz., the $g^{ij}$ (introduced by 22.5); we shall
show now that this notation is in agreement
with the general notation we are introducing now
by proving that these g's with upper indices are
the contravariant components of the tensor of
scalar multiplication. In order to prove that,
we notice that, according to formula 23.42 the
contravariant components of a tensor of covariant
components $g_{ij}$ are

$g_{\alpha\beta}\, g^{\alpha i} g^{\beta j};$

but according to 22.61 $g_{\alpha\beta}\, g^{\beta j} = \delta_{\alpha j}$, so that we
get $\delta_{\alpha j}\, g^{\alpha i} = g^{ij}$ and the assertion is proved.


Next we want to learn how to differentiate
a tensor given in contravariant components, but
before we do that it seems necessary to introduce
what we call mixed components. Suppose we
are given one vector argument of a tensor of
rank two in contravariant, and the other in covariant
components, and we want to find the value
of the tensor; if the components of the two
given vectors are $V^i$ and $W_i$, and the cartesian
components of the tensor are $f_{ij}$, we pass to
cartesian components of the vector arguments

23.5    $v_i = b_{i\gamma} V^\gamma,\qquad w_j = W_\delta a_{\delta j},$

and express the value of the tensor as

23.51    $f_{\alpha\beta} v_\alpha w_\beta = f_{\alpha\beta}\, b_{\alpha\gamma} a_{\delta\beta}\, V^\gamma W_\delta = F_\gamma{}^\delta V^\gamma W_\delta,$

where the notation

23.52    $F_i{}^j = f_{\alpha\beta}\, b_{\alpha i} a_{j\beta}$

is introduced. The numbers $F_i{}^j$ with one lower
and one upper index are called mixed components
of the tensor of rank two whose cartesian components
are $f_{ij}$. In this same way we may consider
mixed components for a tensor of any rank
with as many of the indices up as we may wish,
and the others - down.

We  can  pass  from  one  kind  of  component  to 
any  other  directly,  without  going  through  carte- 
sian components.  The  transition  from  components 
in  which  a  certain  index  (for  example,  the  third) 
is  used  as  a  superscript  to  components  in  which 
the  same  index  is  used  as  a  subscript  is  call- 
ed the  lowering  of  that  index.  This  change  does 
not  affect  the  geometrical  meaning  of  the  ten- 
sor, it  merely  corresponds  to  a  transition  from 
an  expression  of  the  tensor  in  which  the  corres- 
ponding vector  argument  (in  our  example,  the 
third) was given by its covariant components
to  an  expression  of  the  same  tensor  using  contra- 
variant  components  for  that  vector  argument.  The 
formula  for  the  lowering  of  an  index  is  easily 
found  to  be  independent  from  all  other  indices, 
so  that,  disregarding  them,  we  always  may  use 
23.31.  Formula  23.3  may  be  considered  as  a  gen- 
eral formula  for  raising  an  index.  Lowering  and 
raising  of  indices  is  sometimes  referred  to  as 
juggling  with  indices. 

Again  it  may  seem  that  the  introduction  of 
mixed  components  is  superfluous,  but  there  are 



advantages in using mixed components.

One advantage appears in connection with
contraction. The result of contraction is given
in terms of covariant components by formula
22.4 (or 22.41). But, according to 23.3 we may
write $F^i{}_j$ for $F_{\alpha j}\, g^{i\alpha}$ so that 22.4 may be written
as

23.48    $F^\alpha{}_\alpha,$

and for 22.41 we may write $F_i{}^\alpha{}_{k\alpha}$. We see thus
that if a tensor is given by its mixed components
and the two indices with respect to which
we contract appear on different levels (one as
a subscript, the other as a superscript) contraction
is performed (as in cartesian coordinates)
by simply replacing each of the two
indices by the same Greek letter.
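The simplification that mixed components bring to contraction can be seen in a small computation. The sketch assumes plane polar coordinates at $r = 2$ and made-up covariant components $F_{ij}$; it raises the first index with the $g^{ij}$, contracts by summing over equal upper and lower index, and compares the result with formula 22.4.

```python
r = 2.0
g_low = [[1.0, 0.0], [0.0, r * r]]          # g_ij for plane polar coordinates
g_up = [[1.0, 0.0], [0.0, 1.0 / (r * r)]]   # g^{ij}
N = range(2)

F_low = [[1.0, 2.0], [-0.5, 3.0]]           # covariant components F_ij (made up)

# Raising the first index: F^i_j = g^{i a} F_{a j}.
F_mixed = [[sum(g_up[i][a] * F_low[a][j] for a in N) for j in N] for i in N]

# Contraction with indices on different levels: just sum F^a_a.
contraction_mixed = sum(F_mixed[a][a] for a in N)

# Contraction in covariant components needs the g's explicitly (22.4).
contraction_covariant = sum(F_low[a][b] * g_up[a][b] for a in N for b in N)

# Lowering the index again restores the covariant components.
F_back = [[sum(g_low[i][a] * F_mixed[a][j] for a in N) for j in N] for i in N]
```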

Another  case  where  there  is  great  advan- 
tage in  using  mixed  components  is  that  of  dif- 
ferentiation of  a  contravariant  tensor  (as  we 
say  sometimes  for:  "tensor  given  by  its  con- 
travariant components";  a  tensor  in  itself  is, 
of  course,  neither  contravariant  nor  covariant 
-  covariance  and  contravariance  are  only  prop- 
erties, or  types  of  representation  of  tensors); 
the  components  of  the  differential  will  have 
one  more  index,  and  this  index  as  one  derived 
by  differentiation  will  naturally  be  a  subscript, 
whereas  the  old  indices  are  superscripts;  this 
does  not  mean  that  we  cannot  pull  the  new  index 
up,  or  the  old  ones  down,  but  the  expressions 
resulting  from  that  would  be  more  complicated. 

Suppose the given contravariant tensor is of rank one (a vector) $V^i$; we pass to cartesian components:


we  differentiate  this: 

O~Vj.     "ft      Y 

and  we  pass  to  mixed  components  by  formula 
23.52: 

"&  .  &  . 


1J 
and this, using 21.14 and 22.71, reduces to

23.6    $V^i{}_{,j} = \frac{\partial V^i}{\partial u_j} + \Gamma^i_{\alpha j} V^\alpha.$

The  reader  should  be  able,  following  the 
examples  given,  to  deduce  formulas  for  differ- 
entiation of  a  tensor  given  in  any  form.   We 
just  mention,  because  we  will  have  occasion  to 
use  it  later,  the  formula  for  differentiation 
of  a  mixed  tensor 


23.7 


J 


We are in possession now of all the formal rules of operations on tensors in general coordinates. Although these rules were deduced by means of cartesian coordinates, these coordinates and components together with all formulas involving the a's and the b's form only a kind of scaffolding that can be removed after the building has been completed. All we have to know in order to operate on tensors are the g's. Using the g's we can lower and raise indices and contract and, as a special case, find the scalar product of two vectors; also find the angle between two vectors (using formula 7.5) and the length of a curve (using 21.18). Given the g's we can calculate the Γ's (end of Section 22) and with the aid of the Γ's we can differentiate tensors (formulas 22.8). We see thus that the g's play a fundamental part in all operations - the tensor of which they are components is often called the fundamental tensor.
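
As a small numerical illustration of the statement that the g's suffice for these operations, the following sketch lowers and raises an index, forms a scalar product and finds an angle. It is not taken from the notes: the choice of polar coordinates in the plane (g's equal to 1, 0, 0, r²), the sample vectors, and all function names are assumptions made for the example, and the angle is computed from the usual cosine formula, which we take to be the content of formula 7.5.

```python
import math

def lower(g, v):
    """Covariant components v_i = g_{i a} v^a (sum over a)."""
    n = len(v)
    return [sum(g[i][a] * v[a] for a in range(n)) for i in range(n)]

def inverse_2x2(g):
    """The g's with upper indices for a two-dimensional space."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]

def raise_index(g_up, w):
    """Contravariant components w^i = g^{i a} w_a (sum over a)."""
    n = len(w)
    return [sum(g_up[i][a] * w[a] for a in range(n)) for i in range(n)]

def scalar_product(g, v, w):
    """Scalar product v^a w_a, the index of w being lowered with the g's."""
    w_low = lower(g, w)
    return sum(v[a] * w_low[a] for a in range(len(v)))

def angle(g, v, w):
    """Angle between two vectors from the cosine formula."""
    cosine = scalar_product(g, v, w) / math.sqrt(
        scalar_product(g, v, v) * scalar_product(g, w, w))
    return math.acos(cosine)

# Assumed example: polar coordinates (r, phi) at r = 2, so g = diag(1, r^2).
r = 2.0
g = [[1.0, 0.0], [0.0, r * r]]
v = [1.0, 0.0]   # contravariant components of a radial vector
w = [1.0, 0.5]   # radial part 1, angular part 0.5 (tangential length r * 0.5 = 1)

w_low = lower(g, w)
w_back = raise_index(inverse_2x2(g), w_low)
theta = angle(g, v, w)
```

Here `w_back` reproduces `w`, and `theta` comes out as 45 degrees, as one expects for a vector with equal radial and tangential lengths.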

Before  we  conclude  we  might  state  explicit- 
ly that  all  the  formulas  we  have  obtained  are 
entirely  independent  of  the  number  of  dimen- 
sions. 


24.  Physical  Coordinates  as  General  Coordinates. 

The principal purpose for the introduction of general coordinates was to make possible the treatment of tensors in curved space, but it happens that general coordinates may be used with great advantage also in Special Relativity Theory, namely, in connection with the situation arising from the "minus sign". We remedied this situation in Section 4 by introducing imaginary coordinates and tensor components; we know how, using these imaginary quantities, to write our formulas in a nice symmetrical form. The system of notations for general coordinates that we have introduced permits us now to reintroduce real quantities, and still to preserve symmetry in the formulas. We shall express our four mathematical coordinates $x_1, x_2, x_3, x_4$, of which the fourth is imaginary, in terms of four real coordinates which we may denote by $u_i$; we may choose as these four real numbers the physical coordinates x, y, z, t and consider the formulas


24.1    $x_1 = x,\quad x_2 = y,\quad x_3 = z,\quad x_4 = it$

as the transformation formulas, corresponding to 21.1; and

24.11    $x = x_1,\quad y = x_2,\quad z = x_3,\quad t = -ix_4$

as the inverse formulas corresponding to 21.2. The $a_{ij}$ and the $b_{ij}$ with different subscripts are easily seen to be zero, and we have (compare 21.8 and 21.11)


24.2    $a_{11} = a_{22} = a_{33} = 1,\quad a_{44} = i;\qquad b_{11} = b_{22} = b_{33} = 1,\quad b_{44} = -i;$

from these we obtain using 21.16

24.21    $g_{11} = g_{22} = g_{33} = -g_{44} = 1$, all others zero,

and the same values we obtain for the g's with upper indices, using 22.5.

The x, y, z, t may be considered as the contravariant components of the radius vector leading from the origin to the point P; the covariant components of the same vector are seen, applying the formula 23.21, to be x, y, z, -t.

The formula for the square of the distance from the origin may be obtained either from 21.15 or from 23.4; it is (compare 10.1)

24.3    $x^2 + y^2 + z^2 - t^2.$
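
The passage from the imaginary coordinate to the real g's can be checked numerically. The sketch below is only an illustration under an assumption: we take the g's to be formed as sums of products of the derivatives of the mathematical coordinates with respect to the physical ones (the form that formula 25.3 gives later), and the derivatives themselves are read off from $x_4 = it$. The names `J`, `metric_from_derivatives` and the sample point are ours.

```python
# Derivatives dx_l/du_i of the mathematical coordinates (x_1, ..., x_4)
# with respect to the physical ones (x, y, z, t); only x_4 = i t is not trivial.
J = [[1, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1j]]

def metric_from_derivatives(J):
    """g_ij = sum over l of (dx_l/du_i)(dx_l/du_j), with no complex conjugation."""
    n = len(J)
    return [[sum(J[l][i] * J[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]

g_complex = metric_from_derivatives(J)
# The imaginary unit enters only through its square, so every entry is real.
assert all(entry.imag == 0 for row in g_complex for entry in row)
g = [[int(entry.real) for entry in row] for row in g_complex]

def lower(g, v):
    """Covariant components v_i = g_{i a} v^a."""
    n = len(v)
    return [sum(g[i][a] * v[a] for a in range(n)) for i in range(n)]

radius = [1, 2, 3, 4]              # contravariant components x, y, z, t (sample values)
radius_low = lower(g, radius)      # covariant components x, y, z, -t
interval = sum(up * low for up, low in zip(radius, radius_low))

# The same quantity computed with the imaginary fourth coordinate x_4 = i t.
imaginary_form = sum(c * c for c in [1, 2, 3, 4j])
```

Both ways of computing the square of the distance agree, and no imaginary quantity survives in `g`, `radius_low` or `interval`.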

We come next to Maxwell's equations where the "minus sign trouble" originated. To conform with the notations of this chapter we should use for the cartesian components (the mathematical components of preceding chapters) small letters, so that formulas 4.72 will have to be written

$f_{41} = iX,\quad f_{42} = iY,\quad f_{43} = iZ,\quad f_{23} = L,\quad f_{31} = M,\quad f_{12} = N.$


Using formulas 22.2 and 24.2 we obtain the covariant components in physical coordinates - and we use here capital letters - as follows:

24.4    $F_{41} = X,\quad F_{42} = Y,\quad F_{43} = Z,\quad F_{23} = L,\quad F_{31} = M,\quad F_{12} = N.$


Mixed and contravariant components may be obtained by raising indices - formula 23.5. Due to the simple character of the g's given by 24.21 it is easy to see that raising one of the indices 1, 2, 3 does not change the numerical value of a component, so that, for instance,


24.5  F»»  = 


=  gaV*Fal   = 


and raising of the index 4 just changes the sign of the component so that

24.6     F\  ' 

Which components shall we use in Maxwell's equations? It is clear that in the first set (11.2) all the indices must be on the same level, and since the last one must be a lower index we write all of them down. In the second set (11.3) again the one after the comma must be down; but the one with which we contract it must be on the other level, and therefore up; the position of the third index is arbitrary. We have thus, as the Maxwell equations for free space

24.7    $F_{ij,k} + F_{jk,i} + F_{ki,j} = 0, \qquad F_j{}^{\alpha}{}_{,\alpha} = 0$


and  in  the  presence  of  matter  the  second  set 
becomes 


24.71 


euj 


We notice here that no imaginary quantities appear and in spite of this our formulas are symmetric. The raising of the index 4 is equivalent to changing the sign of a component, and this is how the minus sign is taken care of.
We  shall  now  write  out  the  expressions  for 
the  stress  energy  tensor  and  the  equations  of 
motion;  it  is  clear  that  the  formulas  11.4  and 
11.5  become 


24.6 

24.9 

or 

24.91 

or 

24.92 


Tla  =  0 
»<* 


TiJ   = 


-  *o,jF°FPa  - 


-  ig1JF°0FP0  -  puSx3 


=  FlpFPj    -  igijF°pFP0   - 


The continuity equation (11.1) will now be written as

24.10    $(\rho u^\alpha)_{,\alpha} = 0.$


25.  Curvilinear Coordinates in Curved Space.

We want next to apply the general coordinate system that we have introduced for flat space also to curved spaces. In flat space we introduce the language of general coordinates and components by translating from the language of cartesian coordinates and components; in curved space we have no cartesian components; we shall have, therefore, to begin by introducing something that will play the role of cartesian coordinates; we shall introduce quasi-cartesian coordinates, which will take that place. But whereas cartesian coordinates are universal in that the same system of coordinates works for the whole plane, or flat space, the neighborhood of every point in curved space has its own system of quasi-cartesian coordinates. They are defined in the following way: Consider at a given point P the tangent flat; there will be in general a neighborhood of P such that no two points of that neighborhood have the same projection on the tangent flat (for a sphere, e.g., we may obtain such a neighborhood by drawing any small circle around the point of contact). For such a neighborhood there exists a one-to-one correspondence between the points of it and the points of the tangent flat which are their projections. We introduce now on the tangent flat a cartesian system of coordinates with origin at the point of contact, and we use the coordinates of a point of the flat as the quasi-cartesian coordinates of the point of the curved space whose projection it is. If, for instance, a surface is given by the equation

25.1    $z = \frac{1}{2}(ax^2 + 2bxy + cy^2) + \text{t.h.d.},$

x, y will be the quasi-cartesian coordinates of the point x, y, z of the surface for the neighborhood of 0,0,0; and in the general case, if the curved space is given by 19.71


25.11 


t.h.d. 


the $x_i$ ($i = 1,\dots,n$) will be the quasi-cartesian coordinates of the point $x_i$ ($i = 1,\dots,N$) for the neighborhood of the point 0,0,0,...,0.

When we were discussing curved space in Sections 18, 19, 20 we were speaking of vectors and tensors; these vectors were vectors of the tangent plane or tangent flat with initial point at the point of contact. We shall not consider any other vectors in connection with curved spaces and we shall refer to these vectors as the vectors of the curved space. To make this idea seem more natural we may remark that a tangent vector to a curve on a surface (or on a curved space) is such a vector, that is, a vector of the tangent plane (or flat) with initial point at the point of contact. In handling these vectors we have been using for the vectors at every point of the curved space a cartesian coordinate system in the flat tangent at that point, in fact, we may say the same system that furnishes us the quasi-cartesian coordinates for the points of the curved space in the neighborhood of the point of contact. We shall, therefore, refer to the cartesian components of the vectors and tensors of the tangent flat, when they are considered as vectors and tensors of the curved space, as quasi-cartesian components of these vectors and tensors.

We  have  thus  in  connection  with  every  point 
P  on  the  curved  space  a  local  coordinate  system 
which  gives  quasi-cartesian  coordinates  of  the 
neighboring  points  and  the  quasi-cartesian  com- 
ponents of  the  tensors  at  P,  and  in  some  cases 
these  local  coordinate  systems  are  very  useful, 
but  it  will  be  necessary  to  introduce  more  gen- 
eral, more  universal  systems  and  learn  how  to 
represent vectors in them. The necessity of this last requirement will be clear if we consider that, although a quasi-cartesian system of a point P may be used to represent points in the neighborhood of P, it is not quasi-cartesian for these points and cannot be used as such to represent vectors at such points.

There is no difficulty in introducing a universal system of coordinates for the points of a space - what we want is, just as in ordinary space, a one-to-one correspondence between the points and n-ples of numbers. A simple example is furnished by the so-called geographical coordinates for the surface of a sphere. Another approach is given by the so-called parametric representation of a curved space. If the coordinates of a flat space $x_1,\dots,x_N$ are given as functions of one parameter we have a curve; when they are given as functions

25.2    $x_i = x_i(u_1,\dots,u_n) \qquad (i = 1,\dots,N)$

of n parameters $u_i$ we have what we have called an n-dimensional curved space because eliminating these n parameters from the N equations which express the coordinates in terms of them we find that the coordinates must satisfy N - n = r equations, and this was our definition of an n-dimensional curved space. Now since to every set of values $u_1,\dots,u_n$ of the parameters there corresponds one point of the curved space the parameters $u_i$ may be used as coordinates for the curved space.

Suppose then that we have introduced in some way a general system of coordinates for the points on a curved space (the reader may always think of the special case of a surface). What will be a natural system of representation of vectors to go with it? Just as we use a cartesian system in the tangent flat to represent points on the surface we may, so to say, project the general coordinate system on each tangent flat and use it to represent vectors and tensors in that flat, in particular those with initial points at the point of contact, i.e., the vectors and tensors of the curved space. For the neighborhood of each point we have thus two coordinate systems: the general and the quasi-cartesian for that point - and the same two systems, or, rather, their projections, we may use on the corresponding tangent flat. For the neighborhood of each point there will be transformation formulas for the coordinates of points, and from these we can derive transformation formulas for components of vectors and tensors involving the a's, the b's and the g's as introduced in Sections 21, 22 and 23. But since we consider only vectors with initial points at the point of contact we shall use the a's, the b's and the g's calculated from the correspondence between the quasi-cartesian and the general coordinates at a point only for that point itself. We know that the a's and b's are necessary only in the building up of a system so that all we shall need in order to be actually able to handle tensors and vectors in a given general coordinate system are the g's.

We  want  to  explain  now  how  to  obtain  the 
g's  when  a  space  is  given  in  parametric  form 
25.2. 

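We consider a curve $u_i(t)$ on the space, and its tangent vector at some point. It may be considered either as a vector of the curved space, and then its contravariant components will be given, if we denote differentiation with respect to the parameter by a dot placed over a letter, by $\dot u^i$, and the square of its length will be

$g_{\alpha\beta}\,\dot u^\alpha \dot u^\beta;$

or it may be considered as a vector of the containing space. Its (cartesian) components will be given by $\dot x_i$ ($i = 1,\dots,N$) where the $x_i$ are functions of t which are obtained from 25.2 by substituting for the u's the expressions characterizing our curve; we have thus

$\dot x_i = \frac{\partial x_i}{\partial u_\alpha}\,\dot u^\alpha$

and for the square of the vector

$\sum_{i=1}^{N} \dot x_i^2 = \sum_{i=1}^{N} \frac{\partial x_i}{\partial u_\alpha}\frac{\partial x_i}{\partial u_\beta}\,\dot u^\alpha \dot u^\beta.$

Equating this to the expression we obtained above we find

25.3    $g_{\alpha\beta} = \sum_{i=1}^{N} \frac{\partial x_i}{\partial u_\alpha}\frac{\partial x_i}{\partial u_\beta}.$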
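
To see formula 25.3 at work, the g's of a unit sphere may be computed from a parametric representation. The following is a sketch only: the parameters (colatitude and longitude, a variant of the geographical coordinates mentioned above), the use of central differences in place of exact derivatives, and all names are assumptions made for the illustration.

```python
import math

def embedding(u):
    """Cartesian coordinates of the unit sphere for parameters (colatitude, longitude)."""
    theta, phi = u
    return [math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta)]

def partials(x, u, h=1e-6):
    """Central-difference estimates; result[a][l] approximates dx_l/du_a."""
    rows = []
    for a in range(len(u)):
        up, um = list(u), list(u)
        up[a] += h
        um[a] -= h
        xp, xm = x(up), x(um)
        rows.append([(p - m) / (2 * h) for p, m in zip(xp, xm)])
    return rows

def metric(x, u):
    """g_ab = sum over l of (dx_l/du_a)(dx_l/du_b), the form of 25.3."""
    d = partials(x, u)
    n = len(u)
    return [[sum(p * q for p, q in zip(d[a], d[b])) for b in range(n)]
            for a in range(n)]

u0 = [1.0, 0.7]                # an arbitrary sample point on the sphere
g_sphere = metric(embedding, u0)
```

Up to the small error of the difference quotients, `g_sphere` is diagonal with entries 1 and sin² of the colatitude, which is the familiar line element of the sphere.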


This formula ought to be compared to 21.16 of which it will be seen to be a generalization if account is taken of the values 21.11 of the b's. The method of giving the curved space by means of the formulas 25.11 may be considered as a special case of the one used above; this will be clear if we write 25.11 in the form

$x_1 = u_1,\quad x_2 = u_2,\quad \dots,\quad x_n = u_n,$


It is seen that the part of the u's is played by the first n of the x's. Differentiation of the formulas just written with respect to these variables gives


substitution of these expressions into 25.3 gives


z 

i=n+l 


25.4     =  z 


Z 
k=l 


*n 


t.h.d 


t.h.d. 


These formulas give the values for the g's in quasi-cartesian coordinates for a neighborhood of the point of contact.

For the point of contact itself, i.e., for the origin of our system of coordinates we have

25.41    $(g_{ij})_0 = \delta_{ij}$

and from the formulas 22.61 we conclude easily that the g's with the upper indices are also the δ's:

25.42    $(g^{ij})_0 = \delta_{ij}.$

As a consequence of this the distinction between covariant and contravariant components vanishes for quasi-cartesian coordinates at the point of contact.

We come now to the operation of differentiation. In the case of flat space we were simply trying to find a system of notations for some operations that were defined independently. Here the situation is different; we have not defined differentiation; we cannot define it in what would seem to be the natural way, as the rate of change of a vector, for instance, because this would necessitate the consideration of the difference between two vectors at two different points and this conception is not defined for curved space.

Before we come to this definition let us formulate the situation in flat space as follows: a tensor field dF has been obtained by differentiation from a tensor field F if in every point the components of dF in a cartesian system are the derivatives of the components of F in that system.

In curved space there is no universal cartesian system but there is a quasi-cartesian system for every point; it is natural, therefore, to define differentiation in curved space by substituting in the above statement "quasi-cartesian system" for "cartesian system"; if we do that we arrive at the following definition:

Definition of Differentiation.   We shall say that a tensor field dF has been obtained by differentiation from the tensor field F if at every point the components of dF, in a system of coordinates that is quasi-cartesian at that point, are the derivatives of the components of F in that system of coordinates.

Although  this  definition  may  sound  compli- 
cated it  is  the  simplest  imaginable  adaptation 
of  the  idea  of  differentiation  to  curved  spaces. 
The  complication  arises  from  the  fact  that  there 
is  no  cartesian  system  in  curved  space  but  when 
we  apply  this  definition  to  flat  space  we  see 
that  it  brings  us  back  to  differentiation  as  we 
knew  it  in  flat  space. 

We  shall  not  have  actually  to  pass  from 
general  coordinates  to  quasi-cartesian  coordi- 
nates and  then,  after  differentiation  translate 
the  result  back  into  the  language  of  general  co- 
ordinates in  every  special  case.  We  can  derive 
the  formulas  in  general  coordinates  once  for 
all,  just  as  we  did  it  in  the  case  of  flat  space 
in Sections 22 and 23, and we shall obtain exactly the same formulas. The only difference may be in the derivation of the Γ's from the g's (end of Section 22) which was based there on the fact that the derivatives of the g's in cartesian coordinates vanish (22.84). Is this true also in curved space? Or, more precisely, do the derivatives of the g's in quasi-cartesian coordinates vanish at the point of contact?

In  these  coordinates  the  g's  are  given  by 
25.4;  differentiating  these  expressions  we  ob- 
tain 


25.5 


t.h.d. 


and for the point of contact, where the x's vanish, this is zero so that

25.6    $\left(\frac{\partial g_{ij}}{\partial x_k}\right)_0 = 0.$

We see thus that formally everything is just the same as in flat space so that we can take over into curved space the whole apparatus of formulas worked out in Sections 21, 22, 23.
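
The behavior of the g's at the point of contact can be illustrated on the surface 25.1 with the terms of higher degree dropped. Taking x, y as coordinates, formula 25.3 gives g's equal to the δ's plus the products of the first derivatives of z. The sketch below is an illustration under these assumptions; the coefficients chosen for a, b, c and the function names are ours.

```python
def quasi_cartesian_metric(a, b, c, x, y):
    """g_ij = delta_ij + (dz/dx_i)(dz/dx_j) for z = (a x^2 + 2 b x y + c y^2) / 2."""
    zx = a * x + b * y     # dz/dx
    zy = b * x + c * y     # dz/dy
    return [[1 + zx * zx, zx * zy],
            [zx * zy, 1 + zy * zy]]

def first_derivatives_at_origin(a, b, c, h=1e-3):
    """Central-difference derivatives of the g's at the point of contact.

    result[k][i][j] approximates d g_ij / d x_k at x = y = 0.
    """
    derivs = []
    for dx, dy in ((h, 0.0), (0.0, h)):
        gp = quasi_cartesian_metric(a, b, c, dx, dy)
        gm = quasi_cartesian_metric(a, b, c, -dx, -dy)
        derivs.append([[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(2)]
                       for i in range(2)])
    return derivs

a, b, c = 0.5, -0.25, 2.0      # sample coefficients, chosen arbitrarily
g_at_contact = quasi_cartesian_metric(a, b, c, 0.0, 0.0)
dg_at_contact = first_derivatives_at_origin(a, b, c)
g_nearby = quasi_cartesian_metric(a, b, c, 0.1, 0.2)
```

At the point of contact the g's reduce to the δ's and their first derivatives vanish, because the g's depend on x, y only through squares and products of first-degree expressions; away from that point (`g_nearby`) they differ from the δ's.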

Incidentally we may mention that, as it follows easily from 25.6, the quantities Γ also vanish in quasi-cartesian coordinates at the point of contact. For future reference we put down the formula

25.7    $(\Gamma^i_{jk})_0 = 0.$


In general, the point of contact in quasi-cartesian coordinates is a place where we have the closest possible approach to the situation which obtains in flat space when we use cartesian coordinates.

Another formula that we need later is easily obtained by differentiating 25.5 once more and setting $x_i = 0$. We get thus


25.8 


Given a curved space by the formulas 25.2 we know how to find the g's. The question now arises: suppose we are given $\frac{1}{2}n(n+1)$ functions of the coordinates; is it possible to find a space for which these functions serve as the g's? The question is that of solving the system of partial differential equations (25.3), and without going into details we shall state that such a system of equations in general can be solved if the number of unknown functions is equal to that of equations; since we have here $\frac{1}{2}n(n+1)$ equations we must have that many unknown functions; that means that the number of dimensions N of the containing space must be in general $\frac{1}{2}n(n+1)$; in special cases it may, of course, be less than that. We may say then: a two-dimensional curved space given by its g's may be always considered as immersed in a three-dimensional space; a three-dimensional curved space may be always considered as part of a six-dimensional flat space; and a four-dimensional as part of a ten-dimensional flat.
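
The count used in this argument is simply the number of independent components of a symmetric array of the g's. A minimal sketch (the function name is ours):

```python
def containing_dimension(n):
    """Number of independent g's of an n-dimensional space, n(n+1)/2.

    By the counting argument above this is, in general, the number of
    dimensions N required of the containing flat space.
    """
    return n * (n + 1) // 2

dimensions = {n: containing_dimension(n) for n in (2, 3, 4)}
```

For n = 2, 3, 4 this gives 3, 6 and 10, the three cases named in the text.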

Another question is, whether for given real g's the containing space will come out real; and this is by no means always the case. We know that for $g_{11} = g_{22} = g_{33} = -g_{44} = 1$, all others zero, the minimum cartesian containing space is four-dimensional with one imaginary coordinate and it is clear that no real cartesian space can contain it.

Henceforth  we  may  consider  the  curved  space 
as  given  by  its  g's,  and  the  g's  may  be  consid- 
ered as  arbitrarily  given  functions  of  the  u's. 




It may seem that we lost from view the original purpose of introducing curved space, which was that of obtaining a tensor which we could identify with $T_{ij}$. We introduced the Riemann tensor having this in mind, but now we seem to be immersed in an entirely formal theory and far removed from the Riemann tensor; as a matter of fact, it is just around the corner; differentiation, although performed according to formulas that are formally the same as in flat space, has, as we shall see, a new content; in trying to discover the difference we will be led to the Riemann tensor from a new point of view.


26.  New  Derivation  of  the  Riemann  Tensor. 

We said that the meaning of differentiation in curved space is different from that in flat space. To show this difference in one of its most important manifestations we start out with a tensor of the first rank given in its contravariant components $F^i$; we differentiate it twice to obtain a tensor of third rank $F^i{}_{,jn}$; in flat space this tensor would not differ from $F^i{}_{,nj}$ because in cartesian components differentiation of a tensor reduces to ordinary differentiation of its components, so that the cartesian components of the two tensors mentioned are $\frac{\partial^2 f_i}{\partial x_j\,\partial x_n}$ and $\frac{\partial^2 f_i}{\partial x_n\,\partial x_j}$ respectively, and these are equal because the result of ordinary differentiation does not depend on the order; two tensors having equal components in one system of coordinates would be equal in all systems of coordinates and so we have

26.1    $F^i{}_{,jn} - F^i{}_{,nj} = 0$ in flat space.


This reasoning does not apply to curved space; in fact, to find $F^i{}_{,jn}$ at a point P we have to differentiate $F^i{}_{,j}$; and in order to do that we have to know $F^i{}_{,j}$ at different points of the neighborhood of P; the finding of $F^i{}_{,j}$ at these points involves the use of quasi-cartesian systems for each of these points; we do not have then one quasi-cartesian system in which we can perform all our operations and the reasoning that led us to 26.1 breaks down. In spite of this the result might still hold. In order to show that it does not let us calculate the left hand side of 26.1 using the formulas which we deduced for flat space in Sections 22 and 23 and which, as we proved in Section 25, apply to curved space also.

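We start with the contravariant components $F^i$; we calculate the first differential according to formula 23.6 to obtain

26.2    $F^i{}_{,j} = \frac{\partial F^i}{\partial u_j} + \Gamma^i_{\alpha j}F^\alpha;$

next, we differentiate this again, and get, according to formula 23.7

26.21    $F^i{}_{,jn} = \frac{\partial F^i{}_{,j}}{\partial u_n} + \Gamma^i_{\alpha n}F^\alpha{}_{,j} - \Gamma^\alpha_{jn}F^i{}_{,\alpha};$

now we form the difference we want to investigate, viz.,

$F^i{}_{,jn} - F^i{}_{,nj} = \frac{\partial F^i{}_{,j}}{\partial u_n} - \frac{\partial F^i{}_{,n}}{\partial u_j} + \Gamma^i_{\alpha n}F^\alpha{}_{,j} - \Gamma^i_{\alpha j}F^\alpha{}_{,n} - (\Gamma^\alpha_{jn} - \Gamma^\alpha_{nj})F^i{}_{,\alpha};$

the last bracket vanishes according to 22.71, and what remains becomes after the substitution of the above expression 26.2 for the first differential and rearrangement of terms

$\frac{\partial^2 F^i}{\partial u_n\,\partial u_j} - \frac{\partial^2 F^i}{\partial u_j\,\partial u_n} + \Gamma^i_{\alpha j}\frac{\partial F^\alpha}{\partial u_n} - \Gamma^i_{\beta j}\frac{\partial F^\beta}{\partial u_n} + \Gamma^i_{\beta n}\frac{\partial F^\beta}{\partial u_j} - \Gamma^i_{\alpha n}\frac{\partial F^\alpha}{\partial u_j} + \left(\frac{\partial \Gamma^i_{\alpha j}}{\partial u_n} - \frac{\partial \Gamma^i_{\alpha n}}{\partial u_j}\right)F^\alpha + \Gamma^i_{\beta n}\Gamma^\beta_{\alpha j}F^\alpha - \Gamma^i_{\beta j}\Gamma^\beta_{\alpha n}F^\alpha.$

Here cancellation takes place in the first three pairs of terms: in the first as a result of independence of ordinary differentiation on order, in the next two pairs as a result of the fact that the name of the index of summation is immaterial; we come out with

26.3    $F^i{}_{,jn} - F^i{}_{,nj} = \left(\frac{\partial \Gamma^i_{\alpha j}}{\partial u_n} - \frac{\partial \Gamma^i_{\alpha n}}{\partial u_j} + \Gamma^i_{\beta n}\Gamma^\beta_{\alpha j} - \Gamma^i_{\beta j}\Gamma^\beta_{\alpha n}\right)F^\alpha$

or

26.4    $F^i{}_{,jn} - F^i{}_{,nj} = B^i{}_{\alpha jn}F^\alpha$

where

26.5    $B^i{}_{\alpha jn} = \frac{\partial \Gamma^i_{\alpha j}}{\partial u_n} - \frac{\partial \Gamma^i_{\alpha n}}{\partial u_j} + \Gamma^i_{\beta n}\Gamma^\beta_{\alpha j} - \Gamma^i_{\beta j}\Gamma^\beta_{\alpha n}.$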

Before we discuss the question whether the expression vanishes we want to show that the B's are the components of a tensor. In fact, multiplying both sides of 26.4 by $X_i Y^j Z^n$, where $X_i$, $Y^j$, $Z^n$ are components of arbitrary vectors, and contracting we have

$(F^\alpha{}_{,\beta\gamma} - F^\alpha{}_{,\gamma\beta})\,X_\alpha Y^\beta Z^\gamma = B^\alpha{}_{\rho\beta\gamma}F^\rho X_\alpha Y^\beta Z^\gamma.$

The left hand side is a scalar that has been obtained by legitimate operations and is, therefore, independent of the coordinate system used; so is therefore the right hand side, and this proves that the B's are the components of a tensor. (In order to see that this is an essential point and that not every symbol with indices may be considered as a tensor, the reader might consider the expression $\Gamma^\alpha_{\beta\gamma}X_\alpha Y^\beta Z^\gamma$; this expression is, obviously, not independent of the choice of coordinates since in cartesian coordinates the Γ's vanish, and in other systems they do not; the Γ's furnish thus an example of symbols with indices that cannot be interpreted as components of a tensor.)

Now we can settle our question as to the vanishing of 26.3 by showing that the B's are mixed components of the Riemann tensor which was introduced in Section 20. Since we have proved that they are components of a tensor we may use any system of coordinates, and we decide to use a quasi-cartesian system. In such a system the Γ's vanish at the point of contact (25.7) so that we are left with the terms


substituting for the Γ's with one upper index their expressions in terms of the g's with upper indices and the Γ's with all indices down (22.92) we get


the first two terms vanish again because the Γ's vanish at the point of contact; taking into account (25.41) that the g's are for the point of contact equal to the δ's, and using the expressions 22.9 we find after a few cancellations


(the index i appears here as a subscript because the distinction between contravariant and covariant quantities vanishes for quasi-cartesian coordinates at the point of contact). Using here for the second derivatives of the g's the expressions 25.8 we find


Comparing  this  to  the  expression  for  the  Riemann 
tensor  deduced  at  the  end  of  Section  20  we  con- 
vince ourselves  of  the  identity  of  the  two  ex- 
pressions. 

This shows that, if the Riemann tensor does not vanish, the second differential of a vector field actually may depend on the order of differentiation. This fact is very interesting in itself, it confirms our statement that in curved space differentiation has a new meaning and it has many important implications, on which, however, we cannot dwell here. For us it is important that we have obtained an expression of the Riemann tensor in terms of the g's alone; this means that those properties of the curvature of space which are expressed in the Riemann tensor are determined by the metric of the space, i.e., if distances along different curves are given, the curvature (as far as it is expressed in the Riemann tensor) is determined. According to our conception, the inhabitants of the space certainly can measure lengths; it follows that curvature, as expressed by the Riemann tensor, is accessible to the inhabitants; it is an internal property of the space. In particular, for N = 3, n = 2, i.e., for the ordinary surface we obtain the fact that the total curvature can be calculated from the expression for the line element; this is Gauss's Theorema Egregium.
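
The last remark can be tried out numerically: given only the g's of a surface as functions of the coordinates, the sketch below computes the Γ's, then one component of the curvature tensor, and from it the total curvature. It is an illustration under stated assumptions, not the computation of the text: derivatives are replaced by central differences, the Γ's are formed by the usual expression in terms of the derivatives of the g's, and the arrangement of indices and the sign of the curvature tensor follow one common convention, which may differ from the one used in these notes. All function names are ours.

```python
import math

def inverse_2x2(g):
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]

def christoffel(metric, u, h=1e-4):
    """G[i][j][k] approximates Gamma^i_{jk}, computed from the g's alone."""
    n = len(u)
    dg = []                                   # dg[k][i][j] ~ d g_ij / d u_k
    for k in range(n):
        up, um = list(u), list(u)
        up[k] += h
        um[k] -= h
        gp, gm = metric(up), metric(um)
        dg.append([[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(n)]
                   for i in range(n)])
    g_up = inverse_2x2(metric(u))
    return [[[sum(g_up[i][m] * 0.5 * (dg[k][m][j] + dg[j][m][k] - dg[m][j][k])
                  for m in range(n))
              for k in range(n)] for j in range(n)] for i in range(n)]

def riemann(metric, u, i, j, k, l, h=1e-4):
    """R^i_{jkl} = d_k Gamma^i_{jl} - d_l Gamma^i_{jk} + Gamma Gamma terms (assumed convention)."""
    n = len(u)

    def gamma_shifted(index, step):
        v = list(u)
        v[index] += step
        return christoffel(metric, v)

    G = christoffel(metric, u)
    dk = (gamma_shifted(k, h)[i][j][l] - gamma_shifted(k, -h)[i][j][l]) / (2 * h)
    dl = (gamma_shifted(l, h)[i][j][k] - gamma_shifted(l, -h)[i][j][k]) / (2 * h)
    quad = sum(G[i][k][m] * G[m][j][l] - G[i][l][m] * G[m][j][k] for m in range(n))
    return dk - dl + quad

def total_curvature(metric, u):
    """Total (Gaussian) curvature of a surface: R_{1212} divided by det g."""
    g = metric(u)
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    r_low = sum(g[0][i] * riemann(metric, u, i, 1, 0, 1) for i in range(2))
    return r_low / det

def sphere_metric(u, radius=2.0):
    """Line element of a sphere of the given radius in (colatitude, longitude)."""
    return [[radius ** 2, 0.0], [0.0, (radius * math.sin(u[0])) ** 2]]

def polar_plane_metric(u):
    """Line element of the flat plane in polar coordinates (r, phi)."""
    return [[1.0, 0.0], [0.0, u[0] ** 2]]

k_sphere = total_curvature(sphere_metric, [1.0, 0.3])
k_plane = total_curvature(polar_plane_metric, [1.5, 0.4])
```

For the sphere of radius 2 the result is close to 1/4, and for the plane in polar coordinates it is close to zero, although in both cases only the line element was used.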


27.  Differential  Relations  for  the 
Riemann  Tensor. 

The method of quasi-cartesian coordinates in proving a relation between tensors that we used in identifying the B's with the components of the Riemann tensor can be applied often and helps to avoid lengthy computations. We shall use it now to prove certain differential relations for the Riemann tensor that are very important for us because we know that the tensor $T_{ij}$, which we want to identify with the contracted Riemann tensor, satisfies a certain differential equation, namely, $\partial T_{i\alpha}/\partial x_\alpha = 0$, and, of course, we expect the tensor in our mathematical theory with which we are going to identify T to satisfy the same relations. In order to deduce differential relations on the contracted Riemann tensor we have to prove first some relations for the non-contracted tensor. These relations have been discovered by Ricci and then rediscovered by Bianchi and bear the latter's name; they are


27.1    $R^i{}_{jmn,p} + R^i{}_{jnp,m} + R^i{}_{jpm,n} = 0.$


The proof is very simple if we use quasi-cartesian coordinates. In these coordinates the Γ's at the point of contact vanish and although the first derivatives of the Γ's do not, the components of the tensor obtained by differentiating the B's (formula 26.5) which we have identified with the R's will contain the second derivatives only, because the first derivatives will be multiplied by the Γ's themselves that do vanish. With this remark in mind the proof of the Bianchi relations does not present any difficulty; we simply substitute for each of the three terms in 27.1 the difference of the two second order derivatives and find that the result vanishes identically.

Now, in order to deduce from 27.1 the relations for the contracted tensor we raise in 27.1 the second index so that we have

$R^{ij}{}_{mn,p} + R^{ij}{}_{np,m} + R^{ij}{}_{pm,n} = 0$

and here we contract i with m, and j with n. We obtain

$R^{\alpha\beta}{}_{\alpha\beta,p} + R^{\alpha\beta}{}_{\beta p,\alpha} + R^{\alpha\beta}{}_{p\alpha,\beta} = 0.$

The second term here may be written as $-R^{\alpha\beta}{}_{p\beta,\alpha}$ using the fact (20.71) that the Riemann tensor changes its sign when two indices of the same pair are interchanged; and the third term is equal to the second as we can see by interchanging α and β (which does not change the value of the expression since α and β are summation indices), and then interchanging the first two indices, i.e., β and α, and the next two, i.e., p and β (each of these interchanges changes the sign, so that nothing is changed in the result). We have thus

$R^{\alpha\beta}{}_{\alpha\beta,p} - 2R^{\alpha\beta}{}_{p\beta,\alpha} = 0.$

But $R^{\alpha\beta}{}_{p\beta}$ are the mixed components of the contracted Riemann tensor which we denote by $R^\alpha{}_p$ so that, dividing by 2 and changing the sign we have

$R^\alpha{}_{p,\alpha} - \frac{1}{2}R^\alpha{}_{\alpha,p} = 0.$

Finally, $R^\alpha{}_\alpha$ is the result of contraction of the contracted Riemann tensor; we denote this scalar by R (it is called the twice contracted Riemann tensor); then we can write for $R^\alpha{}_{\alpha,p}$ simply $R_{,p}$ or $(\delta^\alpha_p R)_{,\alpha}$ and our relation becomes

$\left(R^\alpha{}_p - \frac{1}{2}\delta^\alpha_p R\right)_{,\alpha} = 0.$


28.  Geodesics.

In  concluding  this  fragmentary  development 
of  the  mathematical  theory  that  we  intend  to  ap- 
ply to  Physics  in  the  next  chapter  we  shall  study 
briefly  a  class  of  curves  in  curved  space  which 
play  an  important  part  in  the  study  of  motion. 
These  curves  may  be  considered  as  generalizations 
of  straight  lines  in  flat  space,  and  we  shall 
begin  by  considering  these. 

In  agreement  with  the  point  of  view  of  dif- 
ferential geometry  (Section  21)  we  shall  charac- 
terize a  straight  line  by  differential  equations. 
If  it  is  given  in  parametric  form  (7.11)  we  ob- 
tain by  differentiating  twice  with  respect  to 
the  parameter  and  indicating  differentiation  by 
a  dot  placed  over  the  letter 


28.1   $\ddot{x}_i = 0.$


Since  the  choice  of  the  parameter  is  in  a 
high  degree  arbitrary,  and  for  another  choice  of 
a  parameter  the  representation  may  cease  to  be 
linear  and  the  equations  (28.1)  may  not  hold  any 
more  -  we  cannot  say  that  they  characterize  a 
straight  line;  a  complete  statement  would  be:   a 
straight  line  is  a  curve  for  which  there  exists 
a  parametric  representation  such  that  28.1  holds. 

Next,  we  translate  28.1  into  the  language 
of  curvilinear  coordinates,  still  keeping  to  flat 
space.  We  have,  as  in  Section  21,  except  that  we 
write  now  in  agreement  with  Section  23,  the  index 
as  a  superscript, 


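$\dot{x}^i = dx^i/dp = \frac{\partial x^i}{\partial u^\alpha}\,\dot{u}^\alpha = b^i_\alpha\,\dot{u}^\alpha,$

$d\dot{x}^i/dp = b^i_\alpha\,\ddot{u}^\alpha + \frac{\partial b^i_\alpha}{\partial u^\beta}\,\dot{u}^\alpha\dot{u}^\beta = 0.$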


Multiplying  by  a^  and  summing  with  respect 
to  i,  writing  i  »  Y»  we  get 


or, taking into account 21.14 and 22.82,

28.2   $\ddot{u}^i + \Gamma^i_{\alpha\beta}\,\dot{u}^\alpha\dot{u}^\beta = 0.$

We pass now to curved space; in general, we
have here no straight lines but we may consider
the same equations and investigate the properties
of the curves represented by them. We introduce
as our definition:

Geodesics are curves which satisfy for an
appropriate choice of parameter Equation 28.2.

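In studying geodesics it is often more convenient
to consider not a single geodesic but
a portion of space filled with geodesics, so
that through every point there passes one and
only one geodesic of the bunch. If we have this
situation we have a vector $\dot{u}^i$ in every point of
that portion of space, so that we have a vector
field, and the components $\dot{u}^i$ may be considered
as functions of the coordinates. We may then
write

$\ddot{u}^i = \frac{\partial \dot{u}^i}{\partial u^\alpha}\,\dot{u}^\alpha$

and equation 28.2 becomes

$\left(\frac{\partial \dot{u}^i}{\partial u^\alpha} + \Gamma^i_{\alpha\beta}\,\dot{u}^\beta\right)\dot{u}^\alpha = 0$

or, according to 22.6,

28.3   $\dot{u}^i{}_{,\alpha}\,\dot{u}^\alpha = 0.$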


This form is very convenient in some cases.
We shall use it to prove two properties of
geodesics. In the first place we may discuss the
meaning of the parameter that we are using.
Consider the square of the tangent vector $\dot{u}_\alpha\dot{u}^\alpha$; we
can prove that this quantity is constant along
a geodesic. In fact, differentiating with respect
to the parameter, we have

$\frac{d}{dp}(\dot{u}_\alpha\dot{u}^\alpha) = (\dot{u}_\alpha\dot{u}^\alpha)_{,\beta}\,\dot{u}^\beta = 2\,\dot{u}_\alpha\,\dot{u}^\alpha{}_{,\beta}\,\dot{u}^\beta$

and this vanishes according to 28.3. Since arc
length is given by the formula

$s = \int \sqrt{\dot{u}_\alpha\dot{u}^\alpha}\;dp$

we see that, as a result of the fact that $\dot{u}_\alpha\dot{u}^\alpha$
is constant, s is proportional to p, or p is
proportional to the arc length s. Since multiplication
of the parameter by a constant factor
will not affect equation 28.2 or 28.3 we
may always consider that the parameter is arc
length. This discussion does not apply, however,
when the quantity $\dot{u}_\alpha\dot{u}^\alpha$ is zero, i.e., when
$\dot{u}^i$ is a zero square vector. If we agree to call
curves whose tangent vectors have zero square
singular curves we may state the following:

Theorem. In case of a non-singular geodesic,
the parameter mentioned in the definition of a
geodesic and used in 28.2 and 28.3 is proportional
or equal to arc length.

Next we may give an interpretation to equation
28.2 which sheds some light on the geometrical
nature of geodesics. We may assume now
that in all geodesics of the bunch arc length
has been chosen as the parameter; then the vectors
$\dot{u}^j$ are unit vectors and they characterize
in each point the direction of the geodesic.
The derivative $\dot{u}^j{}_{,i}$ characterizes the change of
direction as we move in the direction given by
the coordinate $u^i$, and $\dot{u}^j{}_{,\alpha}\dot{u}^\alpha$ gives the change
of direction in the direction of the vector $\dot{u}^i$,
i.e., in the direction of the geodesic itself.
Since, according to 28.2 (in the form 28.3), this
quantity is zero we have proved that the direction
of a geodesic does not change as we move along
it. (The above discussion applies, strictly
speaking, only to non-singular geodesics.)
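
The statement that the square of the tangent vector stays constant along a geodesic can be tried out numerically. The sketch below is only an illustration under stated assumptions: it takes the unit sphere with $ds^2 = d\theta^2 + \sin^2\theta\,d\varphi^2$ as the curved space (an example not used in the text), integrates equation 28.2 with a classical Runge-Kutta step, and records how far the squared tangent drifts. The function names, the starting values and the step size are all hypothetical choices.

```python
import math

def geodesic_rhs(state):
    """Equation 28.2 on the unit sphere, written as a first-order system.

    state = (theta, phi, dtheta, dphi); dots mean d/dp.
    """
    theta, phi, dtheta, dphi = state
    # Nonzero Christoffel symbols of ds^2 = dtheta^2 + sin^2(theta) dphi^2:
    #   Gamma^theta_{phi phi} = -sin(theta) cos(theta)
    #   Gamma^phi_{theta phi} = Gamma^phi_{phi theta} = cot(theta)
    ddtheta = math.sin(theta) * math.cos(theta) * dphi * dphi
    ddphi = -2.0 * math.cos(theta) / math.sin(theta) * dtheta * dphi
    return (dtheta, dphi, ddtheta, ddphi)

def rk4_step(state, h):
    """One classical Runge-Kutta step of size h in the parameter p."""
    def shifted(k, factor):
        return tuple(s + factor * ki for s, ki in zip(state, k))
    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(k1, 0.5 * h))
    k3 = geodesic_rhs(shifted(k2, 0.5 * h))
    k4 = geodesic_rhs(shifted(k3, h))
    return tuple(
        s + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def tangent_square(state):
    """Square of the tangent vector, g_ij du^i/dp du^j/dp."""
    theta, _, dtheta, dphi = state
    return dtheta ** 2 + math.sin(theta) ** 2 * dphi ** 2

def max_drift(state, h=1e-3, steps=2000):
    """Largest deviation of the squared tangent from its starting value."""
    start = tangent_square(state)
    worst = 0.0
    for _ in range(steps):
        state = rk4_step(state, h)
        worst = max(worst, abs(tangent_square(state) - start))
    return worst

drift = max_drift((1.0, 0.0, 0.3, 0.7))
print(drift)
```

The drift should stay at the level of the integration error, which is what the theorem leads us to expect; a parameter proportional to arc length is the only kind for which 28.2 holds.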




Chapter  V. 
GENERAL  RELATIVITY 


In  Chapter  I  we  introduced  certain  funda- 
mental quantities,  and  we  combined  them  into  the 
symmetric tensor of rank two, $T_{ij}$. We found
that this tensor satisfies the differential
equation

$\partial T_{i\alpha}/\partial x_\alpha = 0,$

first for i = 1, 2, 3 and then, in Chapter III we
showed  that,  as  a  result  of  the  new  identifica- 
tion introduced  there,  the  fourth  equation  is 
also  satisfied.  We  thought  it  desirable  to  build 
a  mathematical  theory  in  which  a  tensor  of  the 
same  formal  properties  will  appear  in  a  natural 
way,  and  in  the  preceding  Chapter  IV  we  succeed- 
ed in  actually  setting  up  such  a  theory  -  the 
theory of curved space-time.

The  structure  of  such  a  space,  we  found,  is 
expressed in a tensor of rank four - the Riemann
tensor,  but  we  obtained  from  it  by  contraction  a 
tensor  of  rank  two  -  the  contracted  Riemann  ten- 
sor. In  investigating  the  differential  proper- 
ties of  the  Riemann  tensor  we  found  in  Section 
27  a  relation  of  the  type  desired;  it  is  satis- 
fied by  a  tensor  which  differs  slightly  from  the 
contracted  Riemann  tensor,  namely,  the  tensor 
$R^i_j - \tfrac{1}{2}\delta^i_j R$, which we may call the corrected
contracted Riemann tensor, and this is the tensor
which we are going to identify with the physical
tensor T so that our fundamental assumption will
be

$T^i_j = R^i_j - \tfrac{1}{2}\delta^i_j R.$

Thus  we  decide  to  interpret  T,  and  therefore  our 
fundamental  quantities  of  matter  and  electricity 
ρ, u, v, w, X, Y, Z, L, M, N, which went into it,
in  terms  of  structure  of  curved  space  as  it  is 
reflected  in  the  contracted  corrected  Riemann 
tensor.  But  in  doing  this  we  find  ourselves  be- 
fore a  radically  new  situation.  As  we  wanted, 
the  tensor  is  now  an  expression  of  the  proper- 
ties of  space,  i.e.,  the  space  is  now  different 
from the one we had before - geometry and physics
are now an organic whole and it is not clear
what  changes  this  brings  with  it;  together  with 
the  desirable  feature,  namely  the  fact  that  T 
grew  out  of  space,  so  to  say,  we  may  have  brought 
in  some  not  desirable  and  hard  to  manage  fea- 
tures. But  then  there  would  be  no  gain  if  we 
could  merely  say  that  T  is  a  geometrical  thing 
now;  we  expected  to  gain  something  essential  in 
undertaking  the  merging  together  of  our  geometry 
and  physics;  and  now  we  stand  before  an  accom- 
plished fact  and  we  have  to  see  what  it  brought 
with  it.  We  conjured  up  something  and  we  do  not 
seem  to  be  able  to  stop,  we  have  to  go  ahead  and 
hope  that  the  changes  will  be  beneficial. 


It might seem strange that we find a physical
interpretation only for the contracted
Riemann tensor, only for ten combinations of
its twenty components. But this is quite in
order. Should all the components of the Riemann
tensor be used up in interpreting matter
and electricity that would mean that where there
is no matter (and electricity) space-time is
flat (as far as internal properties are concerned);
that would mean that matter acts only
where it is; but we know that matter makes itself
felt, for instance, by the gravitational
field that it produces, also outside the region
which it occupies, and this is in accord with
our identification as a result of which only
part of the components of the Riemann tensor
vanish where there is no matter, so that the
remaining components may be interpreted as
corresponding to gravitational forces.


29.  The Law of Geodesics.

In  questions  of  celestial  mechanics  which 
we  are  going  to  treat  now  the  effects  of  the 
electromagnetic  field  are  usually  negligible 
and we shall begin by equating to zero our
electromagnetic tensor. Equation 24.9 becomes then

29.1   $T^i_j = \rho\,u^i u_j.$

According to our fundamental assumption,
this tensor has been identified with the corrected
contracted Riemann tensor, and it must
satisfy the equation

29.2   $T^{\alpha}{}_{j,\alpha} = 0$

which formally is the same as our old equation
of motion 24.8 but differs from it in that it
has to be interpreted in curved space. The last
two equations impose certain conditions on the
velocity components $u^i$ and we want to find these
conditions or, in other words, we want to eliminate
density from the equations 29.1, 29.2.
(We have been using in Chapter IV the letter u for
the general coordinates - in this chapter we go
back to our notation of the first three chapters
and denote by $u^i$ again the four-dimensional
velocity vector, and we shall denote the general
coordinates by $x^i$.)

First of all we shall prove the following
theorem due to Mineur.

Theorem. If the field $u^i$ satisfies the equations
29.1, 29.2 the vectors $u^i$ may be considered
as tangent unit vectors to a family of geodesics
filling the space.


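Proof. We consider first the case when $u^i$
is a unit vector (and not a zero square vector),
i.e., $u_\beta u^\beta = -1$. Differentiating this relation
we have

29.3   $u_{\beta,i}\,u^\beta = 0.$

Substituting 29.1 into 29.2 we get

29.4   $\frac{\partial\rho}{\partial x^\alpha}\,u^\alpha u_j + \rho\,u^\alpha{}_{,\alpha}\,u_j + \rho\,u^\alpha u_{j,\alpha} = 0.$

Dividing by ρ and introducing the notation

29.5   $A = \frac{\partial \log\rho}{\partial x^\alpha}\,u^\alpha + u^\alpha{}_{,\alpha}$

we may write 29.4 as

$A\,u_j + u^\alpha u_{j,\alpha} = 0.$

Multiplying this by $u^j$ and summing with respect
to j, for which we write β, we have

$A\,u_\beta u^\beta + u^\alpha u_{\beta,\alpha}\,u^\beta = 0$

which, according to 29.3, gives A = 0. Substituting
this into the relation following 29.5 we obtain

29.6   $u^\alpha u_{j,\alpha} = 0$

which, according to 28.3, proves the theorem in
the case considered.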

But we also have to consider the case of
propagation of light. In this case we do not
have to consider any density ρ, the momentum vector
being given by $q_i$ with $q_\beta q^\beta = 0$. The preceding
proof breaks down in this case, but continuity
considerations lead us to the result that in
this case also we can find a scalar field ρ such
that $q^i/\rho$ will be tangent vectors to geodesics.

We conclude that in a gravitational field
matter and light particles follow geodesics.

In  the  present  chapter  we  are  going  to  ap- 
ply this  result  to  the  investigation  of  the  mo- 
tion of  a  planet  and  the  propagation  of  light 
in  the  Solar  system.  We  shall  see  that  the 
changed  significance  of  differentiation  takes 
care,  in  a  way,  of  what  is  usually  accounted 
for  by  gravitational  forces. 

30.  Solar  System.  Symmetry  Conditions. 

Our  equations  29.1  and  29.2  describe  rela- 
tions existing  between  matter  and  field.   We 
proved that the motion of matter is characterized
by the geodesics of the curved space, but
the  curvature  is  in  turn  determined  by  matter. 
Theoretically,  we  may  have  a  complete  descrip- 
tion of  the  situation,  but  in  practice  we  do  not 
know  how  to  handle  it,  we  do  not  know  where  to 
begin.  In  such  cases  we  often  resort  to  the 
method  of  successive  approximations.  Let  us  try 
to  apply  this  method  here.  In  investigating  the 
motion  of  a  planet  around  the  sun  we  neglect  in 
the  first  place  the  motion  of  the  sun.  Then,  in 


the first approximation we neglect the mass of
the planet, i.e., assume that there is no matter
outside the sun. Since we have already neglected
electromagnetism we have then that outside
the sun the tensor T is zero so that,
according to the fundamental assumption,

$R^i_j - \tfrac{1}{2}\delta^i_j R = 0.$

Contracting we get $R - \tfrac{1}{2}\cdot 4R = 0$, so that R = 0
and we have simply

30.1   $R^i_j = 0.$

These equations are known as Einstein's equations.
We see that the statement that the corrected
contracted Riemann tensor vanishes is
equivalent to the statement that the contracted
Riemann tensor vanishes.

As a result of our first approximation we
derived thus the field equations 30.1. In the
next approximation we introduce the planet and
assume that its action on the field is negligible
but that the field acts on it, i.e., that
the motion of the planet is given by the geodesics
of the field which has been determined
in the preceding step; the motions will then be
given by the equations 28.2

30.2   $\frac{d^2x^i}{ds^2} + \Gamma^i_{\alpha\beta}\,\frac{dx^\alpha}{ds}\frac{dx^\beta}{ds} = 0$

in which the Γ's are calculated from the g's
which have been found to satisfy 30.1.

Our  problem,  therefore,  falls  in  two:  first, 
to  find  a  field  satisfying  the  equations  30.1, 
and second, to find the geodesics of this field.

In  this  form  the  problem  is  comparable  to 
the  problem  in  Newtonian  mechanics  as  explained 
in  Section  1.  There  the  field  was  given  by  the 
potential  which  had  to  satisfy  the  Laplace  equa- 
tion; here  the  field  is  given  by  the  g's  which 
have  to  satisfy  the  equations  30.1. 

There  the  motion,  after  the  field  had  been 
determined,  was  described  by  second  order  or- 
dinary equations,  differentiation  being  taken 
with respect to time; here motion is also described
by second order differential equations,
derivation  being  with  respect  to  s. 

It  is  possible  by  making  some  special  as- 
sumptions, neglecting  certain  quantities,  for 
instance  the  derivatives  of  all  the  g's  except 
g44  and  dropping  some  terms,  to  obtain  the  gen- 
eral Newtonian  equations  as  a  special  or  limit- 
ing case  of  our  equations.  The  equations  30.1 
would  thus  reduce  to  the  Laplace  equation  1.54 
for  g44  and  the  equations  of  a  geodesic  to  the 
equations  of  motion  1.1  in  which  X,Y,Z  are  giv- 
en by  1.53,  so  that  we  could  consider  the  gen- 
eral Newtonian  theory  of  motion  in  a  gravita- 
tional field  as  a  first  approximation  to  the 
theory  of  Relativity,  but  it  is  quite  difficult 
in  the  general  case  to  estimate  what  we  neglect 
and  the  error  we  commit,  and  we  prefer  to  com- 
pare the  two  theories  on  some  concrete  special 


cases. All these cases will refer to what corresponds
to a gravitational field produced by a
single attracting center. We found in Section
1 such a field by using the general equations
and, in addition, the condition of symmetry. We
intend to follow an analogous course here. Our
general equations are 30.1 and now we want to
find what will correspond to the conditions of
symmetry. The situation is much more complicated
here. There the field could be characterized
by a scalar φ and the condition of symmetry
with respect to a point was simply expressed by
stating that φ is a function of distance from
that point; here the field is characterized by
a tensor $g_{ij}$. There, in the second place, we
worked in ordinary space; here we have space-time
which has an additional coordinate, t.
Last, there the space was given and in it distances
were well defined; on this space was superimposed
a field whose symmetry we had to
discuss; here the field is not superimposed on
a space with a given metric, but the metric itself
constitutes a field which has to be determined
by the symmetry condition.

We  shall  take  up  these  three  difficulties 
one  by  one. 

In  the  first  place  let  us  consider  a  ten- 
sor field  in  ordinary  space,  and  let  us  impose 
on  it  the  condition  of  symmetry  with  respect  to 
a  point.  A  tensor  we  may  consider  (Section  9) 
as  the  left  hand  side  of  an  equation  of  a  cen- 
tral quadric  surface  (we  are  interested  in  a 
symmetric tensor here, since the g's are symmetric
in the indices i and j - this symmetry
we must try not to confuse with the symmetry
with  respect  to  a  point  which  we  impose  on  the 
field  -  and  a  symmetric  tensor  is  sufficiently 
characterized  by  a  quadratic  form)  which  we  may 
consider  as  an  ellipsoid.  Our  tensor  field  will 
then  be  represented  by  an  ellipsoid  at  every 
point  of  space.  The  field  must  allow  rotations 
around  a  fixed  center  0,  i.e.,  such  a  rotation 
must  bring  the  field  into  itself;  in  other  words, 
if  a  rotation  brings  a  point  P  into  a  point  Q 
it  must  bring  the  ellipsoid  at  P  into  the  ellip- 
soid at  Q.  In  particular,  a  rotation,  which 
leaves  P  unchanged  must  not  change  the  ellipsoid 
at  P.  It  is  clear  that  every  ellipsoid  must  be 
an  ellipsoid  of  revolution  and  that  its  axis 
must  be  directed  along  the  radius  vector  from 
0  to  P. 

The ellipsoid at the point x, 0, 0 will be
seen to be

$A(\xi - x)^2 + B(\eta^2 + \zeta^2) = 1$

and for a general point P, if we use polar coordinates
for P and (considering, if it helps,
the ellipsoid as infinitesimal) their differentials
for the points of the quadric relative
to P,

30.3   $A\,dr^2 + B(d\theta^2 + \sin^2\theta\,d\varphi^2) = 1.$

Since ellipsoids at points equidistant from O
must have the same dimensions, A and B must be
functions of r alone.

The left hand side of this equation gives
a tensor field which satisfies the condition of
symmetry with respect to a point. Next, we
consider the complication resulting from the
introduction of time. In Section 1 time was not
mentioned; it means that the field was considered
as independent of time, or static; we may
say that the field must not be affected by a
change in t or, from the four-dimensional point
of view, by a translation along the t-axis. This
is a requirement of the same character as that
of symmetry with respect to a point; from the
four-dimensional point of view we may combine
the two requirements and say that the field must
be symmetric with respect to a line - the t-axis.
But the field now is a field in four-space; it
will be represented by a quadratic form in dr,
dθ, dφ and dt. For dt = 0 it must reduce to the
field given before; the coefficients must be
independent of t corresponding to the requirement
that the field be static; and a change from t
to -t must also not affect the field (reversibility
of time) so that terms of the quadratic
form involving dt to the first power must be
absent. It follows that the addition of the
fourth dimension results in the addition of only
one term to our tensor which now may be written
as

30.4   $A\,dr^2 + B(d\theta^2 + \sin^2\theta\,d\varphi^2) + C\,dt^2$

where C, as well as A and B, are functions of r
alone.

And  now  we  have  to  overcome  the  last  dif- 
ficulty, that  connected  with  the  fact  that  our 
space  is  curved  and  that  we  cannot  define  sym- 
metry in  terms  of  rotations  because  rotation 
means  a  transformation  in  which  distances  are 
preserved,  and  distances  are  defined  only  by  the 
field  of  the  g's  which  we  want  to  determine  by 
the  requirement  that  it  be  not  affected  by  ro- 
tations. To  overcome  this  difficulty  we  have  to 
agree  on  some  other  definition  of  symmetry,  and 
it  seems  natural  to  adopt  as  such  the  following: 
in  order  to  define  a  symmetry  for  a  curved  space 
we  shall  compare  it  with  a  flat  space  by  estab- 
lishing a  one-to-one  correspondence  between  the 
points  of  the  two  spaces.  Corresponding  to  ev- 
ery transformation  of  the  flat  space  we  will 
have  then  a  transformation  of  the  curved  space; 
and we shall say that the curved space possesses
the same symmetry as a field F in the flat
space  if  the  metric  of  the  curved  space  -  as 
given  by  the  g's  -  is  not  affected  by  those 
transformations  of  the  curved  space  which  cor- 
respond to  the  transformations  in  flat  space 
not  affecting  the  field  F. 

Suppose  now  that  we  have  such  a  curved 
space.  This  implies  that  we  have  a  one-to-one 
correspondence  with  the  flat  space,  and  we  may 


use  for  the  points  of  the  curved  space  the  same 
coordinates  that  we  use  for  the  corresponding 
points of the flat space. It is clear that 30.4
will satisfy the requirements, so that we can
take it for our fundamental tensor, or as we
shall say (compare 21.8) for our $ds^2$.

But the quantities r, θ, φ, t, which have
definite geometrical significance in flat space
lose it in curved space - they are just numbers
which we use to characterize different
points as we use numbers to characterize houses
on a street. There is no reason why we should
not replace them by other numbers, i.e., transform
our coordinates, if it would simplify our
formulas. Now, it is clear that transformations
involving θ, φ, t will make our expression 30.4
more complicated because they would introduce
these coordinates into the coefficients. But we
could choose a transformation on r alone so as
to simplify that expression; we could, for instance,
reduce any one coefficient to a prescribed
function of the new r. We make this
choice in such a way as to reduce B to $r^2$ because,
in a way, it restores to r a geometrical
meaning as we shall see presently. If we write
ξ(r) and -η(r) for the functions of the new r
which now appear instead of A and C, and interpret
30.4 as giving $-ds^2$, in accordance with
the standardization of the parameter adopted in
Section 12, our final formula will be

30.5   $-ds^2 = \xi(r)\,dr^2 + r^2(d\theta^2 + \sin^2\theta\,d\varphi^2) - \eta(r)\,dt^2.$

Letting here r and t have constant values
we have a surface, and a simple calculation
would show that $1/r^2$ is the total curvature of
this surface, which gives a geometrical meaning
to r.
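
The "simple calculation" of the total curvature can be sketched numerically. The block below assumes the standard formula for a surface element $E\,d\theta^2 + G\,d\varphi^2$ with constant E, namely $K = -(\sqrt{G})''/(E\sqrt{G})$ with primes denoting differentiation with respect to θ; this formula is not quoted in the text and is brought in here as an assumption. The function names and the finite-difference step are hypothetical.

```python
import math

def sqrt_G(theta, r):
    """Square root of the coefficient of dphi^2 in r^2 (dtheta^2 + sin^2(theta) dphi^2)."""
    return r * math.sin(theta)

def total_curvature(theta, r, step=1e-4):
    """Gaussian curvature K = -(sqrt G)'' / (E sqrt G), valid when E = r^2 is constant.

    The second derivative with respect to theta is a central difference.
    """
    E = r * r
    second = (sqrt_G(theta + step, r) - 2.0 * sqrt_G(theta, r)
              + sqrt_G(theta - step, r)) / step ** 2
    return -second / (E * sqrt_G(theta, r))

r = 2.0
curvature = total_curvature(0.8, r)
print(curvature, 1.0 / r ** 2)
```

The two printed numbers should agree to within the accuracy of the difference quotient, whatever value of θ is chosen away from the poles.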

Our  task  is  now  accomplished,  we  have  im- 
posed on  our  space  the  conditions  of  symmetry 
and  we  have  next  to  impose  on  it  the  general 
equations  30.1. 


31.  Solution  of  the  Field  Equations. 

We  are  now  at  a  stage  which  corresponds  to 
the assumption that the potential φ is a function
of r alone in Section 1, and our next task
corresponds to the substitution of φ(r) into
Laplace's equation. Instead of one unknown
function φ(r) we have here the ten g's determined
by 30.5 which we may write out as

31.1   $g_{11} = \xi(r),\quad g_{22} = r^2,\quad g_{33} = r^2\sin^2\theta,\quad g_{44} = -\eta(r),$
all others zero,




and  which  involve  two  unknown  functions.  In  or- 
der to  determine  these  functions  we  have  to 
substitute  31.1  into  30.1.  In  the  first  place 
we have to calculate the g's with the upper indices
from the formulas (22.6)

$g_{i\alpha}\,g^{\alpha j} = \delta^j_i.$

Since the g's with two distinct lower indices
vanish, only those terms on the left are not
zero in which α = i and we have

$g_{ii}\,g^{ij} = \delta^j_i$   (no summation over i);

for i ≠ j the right hand sides are zero, and
since the first factors on the left are not
zero the second must vanish; we see thus that
the g's with two distinct upper indices also
vanish. For j = i we have unity on the right
and thus

31.2   $g^{11} = 1/\xi(r),\quad g^{22} = 1/r^2,\quad g^{33} = 1/(r^2\sin^2\theta),\quad g^{44} = -1/\eta(r),$
all others zero.


In what follows

$x_1 = r,\quad x_2 = \theta,\quad x_3 = \varphi,\quad x_4 = t.$

Differentiation with respect to r will be denoted,
as in Section 1, by '. We next calculate
the Γ's with all indices down according to
22.9 and obtain, omitting those that come out
zero,


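31.3   $\Gamma_{1,11} = \tfrac{1}{2}\xi',\quad \Gamma_{1,22} = -r,\quad \Gamma_{1,33} = -r\sin^2\theta,\quad \Gamma_{1,44} = \tfrac{1}{2}\eta',$
       $\Gamma_{2,12} = r,\quad \Gamma_{2,33} = -r^2\sin\theta\cos\theta,$
       $\Gamma_{3,13} = r\sin^2\theta,\quad \Gamma_{3,23} = r^2\sin\theta\cos\theta,\quad \Gamma_{4,14} = -\tfrac{1}{2}\eta'.$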

Raising of an index is accomplished in this case
simply by multiplying by the g with the index to
be raised appearing above twice, because the sum
$g^{i\alpha}\Gamma_\alpha$ which, according to 23.3, is equal to $\Gamma^i$
reduces, as the result of the vanishing of the g's
with two distinct upper indices, to one term, namely
$g^{ii}\Gamma_i$. This permits us to write out easily the
Γ's with one index above:


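31.31  $\Gamma^1_{11} = \frac{\xi'}{2\xi},\quad \Gamma^1_{22} = -\frac{r}{\xi},\quad \Gamma^1_{33} = -\frac{r\sin^2\theta}{\xi},\quad \Gamma^1_{44} = \frac{\eta'}{2\xi},$
       $\Gamma^2_{12} = \frac{1}{r},\quad \Gamma^2_{33} = -\sin\theta\cos\theta,$
       $\Gamma^3_{13} = \frac{1}{r},\quad \Gamma^3_{23} = \cot\theta,\quad \Gamma^4_{14} = \frac{\eta'}{2\eta}.$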
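
The Γ's with one index above can be checked against the metric numerically. The following sketch assumes the usual expression $\Gamma^i_{jk} = \tfrac{1}{2}g^{ii}(\partial_j g_{ik} + \partial_k g_{ij} - \partial_i g_{jk})$ for a diagonal metric and compares central-difference values with the closed forms that follow from 31.1 and 31.2. The sample functions `xi` and `eta`, the evaluation point and the step size are hypothetical choices made only to exercise the formulas; they are not solutions of the field equations.

```python
import math

# Hypothetical sample functions, chosen only to exercise the formulas.
def xi(r):
    return 1.0 + r * r

def dxi(r):
    return 2.0 * r

def eta(r):
    return 2.0 + r

def deta(r):
    return 1.0

def metric_diag(x):
    """Diagonal components g_11, g_22, g_33, g_44 of 31.1 at x = (r, theta, phi, t)."""
    r, theta, _, _ = x
    return [xi(r), r * r, (r * math.sin(theta)) ** 2, -eta(r)]

def d_metric(x, i, k, step=1e-5):
    """Central-difference estimate of the derivative of g_ii along coordinate k."""
    plus, minus = list(x), list(x)
    plus[k] += step
    minus[k] -= step
    return (metric_diag(plus)[i] - metric_diag(minus)[i]) / (2.0 * step)

def christoffel(x, i, j, k):
    """Gamma^i_{jk} for a diagonal metric (indices 0..3 stand for 1..4)."""
    total = 0.0
    if i == k:
        total += d_metric(x, i, j)   # d g_ik / d x^j
    if i == j:
        total += d_metric(x, i, k)   # d g_ij / d x^k
    if j == k:
        total -= d_metric(x, j, i)   # d g_jk / d x^i
    return total / (2.0 * metric_diag(x)[i])

def closed_forms(x):
    """Closed-form values of the nonzero Gamma^i_{jk} for the metric 31.1."""
    r, theta, _, _ = x
    s, c = math.sin(theta), math.cos(theta)
    return {
        (0, 0, 0): dxi(r) / (2.0 * xi(r)),
        (0, 1, 1): -r / xi(r),
        (0, 2, 2): -r * s * s / xi(r),
        (0, 3, 3): deta(r) / (2.0 * xi(r)),
        (1, 0, 1): 1.0 / r,
        (1, 2, 2): -s * c,
        (2, 0, 2): 1.0 / r,
        (2, 1, 2): c / s,
        (3, 0, 3): deta(r) / (2.0 * eta(r)),
    }

point = (2.0, 1.1, 0.3, 0.0)
worst = max(abs(christoffel(point, *idx) - value)
            for idx, value in closed_forms(point).items())
print(worst)
```

Because the metric is diagonal, each Γ with one index above is just the corresponding Γ with all indices down divided by one diagonal g, which is the remark made in the text about raising an index.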


Next we have to calculate those components
of the Riemann tensor which appear in the
expressions for the components of the contracted
Riemann tensor, i.e., those with the first index
equal to the one before last or those of the
type $R^i{}_{jih}$. We do not write those out but state
that the result of the calculation with their
aid of the components of the contracted Riemann
tensor is, that all these components with two
distinct indices vanish and the others are

g'r 


4T1 


It  is  more  convenient  to  operate  with  the  mixed 
components of the contracted Riemann tensor (although
it is not necessary, and the reader might
for  the  sake  of  practice  go  through  the  same 
calculations  using  covariant  components) ,  and 
these  are  obtained  from  the  last  formulas  by 
multiplication  by  the  corresponding  g  with  upper 
indices;  we  obtain  thus 

t  a 


31.4 


^Ti- 


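We come now to the ten equations $R^i_j = 0$
that we have to satisfy; six of them, namely,
those in which i ≠ j, are satisfied identically
because our R's as well as the δ's vanish for
distinct indices. Of the remaining four equations
the second and the third are identically
the same because of the equality of the corresponding
values of R in 31.4. Three equations
remain, viz.,

31.5   $\frac{\eta''}{2\xi\eta} - \frac{\eta'^2}{4\xi\eta^2} - \frac{\xi'\eta'}{4\xi^2\eta} - \frac{\xi'}{r\xi^2} = 0,$
       $\frac{1}{r^2} - \frac{1}{r^2\xi} + \frac{\xi'}{2r\xi^2} - \frac{\eta'}{2r\xi\eta} = 0,$
       $\frac{\eta''}{2\xi\eta} - \frac{\eta'^2}{4\xi\eta^2} - \frac{\xi'\eta'}{4\xi^2\eta} + \frac{\eta'}{r\xi\eta} = 0.$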


Subtracting the last one from the first we have

$\frac{\xi'}{\xi} + \frac{\eta'}{\eta} = 0,$

so that

31.6   $\xi\eta = \text{const.} = c.$

By choosing our unit of time appropriately
we can reduce this constant to 1, so that

31.7   $\xi\eta = 1$   or   $\xi = 1/\eta.$

Using 31.6 and 31.7 in the second of the equations
31.5 we obtain

$1 - \eta - r\eta' = 0$

which gives

31.8   $\eta = 1 - \gamma/r$

where γ denotes a constant of integration.
Our field then is given by

31.9   $-ds^2 = \frac{dr^2}{\eta} + r^2\,d\theta^2 + r^2\sin^2\theta\,d\varphi^2 - \eta\,dt^2$

where η is given by 31.8.
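
As a check on the solution, one may substitute 31.8 with ξ = 1/η back into the three remaining field equations. The sketch below writes the left-hand sides of $R^1_1 = 0$, $R^2_2 = 0$, $R^4_4 = 0$ for the metric 30.5 in one possible normalization; since the right-hand sides are zero, the overall sign convention adopted for the Riemann tensor does not matter here, but the explicit expressions are an assumption of this sketch and not a quotation of the text. The function names and numerical values are hypothetical.

```python
def field_equation_residuals(r, xi, dxi, eta, deta, d2eta):
    """Left-hand sides of the three remaining field equations for the metric 30.5.

    xi, eta are the values of the two unknown functions at r;
    dxi, deta, d2eta are their derivatives with respect to r.
    """
    common = (d2eta / (2.0 * xi * eta)
              - deta ** 2 / (4.0 * xi * eta ** 2)
              - dxi * deta / (4.0 * xi ** 2 * eta))
    first = common - dxi / (r * xi ** 2)
    second = (1.0 / r ** 2 - 1.0 / (r ** 2 * xi)
              + dxi / (2.0 * r * xi ** 2) - deta / (2.0 * r * xi * eta))
    last = common + deta / (r * xi * eta)
    return first, second, last

def solution_31_8(r, gamma):
    """eta = 1 - gamma/r and xi = 1/eta, together with the needed derivatives."""
    eta = 1.0 - gamma / r
    deta = gamma / r ** 2
    d2eta = -2.0 * gamma / r ** 3
    xi = 1.0 / eta
    dxi = -deta / eta ** 2
    return xi, dxi, eta, deta, d2eta

residuals = field_equation_residuals(3.0, *solution_31_8(3.0, 0.5))
print(residuals)
```

All three residuals should vanish up to rounding. Replacing γ/r by, say, γ/r² in `solution_31_8` leaves the first and last expressions still equal to each other but makes the second one different from zero, which shows that the second equation is what singles out the form 31.8.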


32.  Equations of Geodesics.

We first consider the non-zero geodesics
which correspond to a material particle. We
know that in this case arc-length can be taken
as parameter so that the curve in addition to
the equations 30.2 must satisfy the equation
30.5 which we may write as

32.1   $\frac{\dot{r}^2}{\eta} + r^2\dot{\theta}^2 + r^2\sin^2\theta\,\dot{\varphi}^2 - \eta\,\dot{t}^2 = -1;$

we shall, however, make our discussion slightly
more general and write A in the right hand side
with a view of using the results also in the
case corresponding to a light particle. We
shall discuss this equation together with the
equations 30.2 which become here


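32.21  $\ddot{r} - \frac{\eta'}{2\eta}\,\dot{r}^2 - r\eta\,\dot{\theta}^2 - r\eta\sin^2\theta\,\dot{\varphi}^2 + \tfrac{1}{2}\eta\eta'\,\dot{t}^2 = 0,$

32.22  $\ddot{\theta} + \frac{2}{r}\,\dot{r}\dot{\theta} - \sin\theta\cos\theta\,\dot{\varphi}^2 = 0,$

32.23  $\ddot{\varphi} + \frac{2}{r}\,\dot{r}\dot{\varphi} + 2\cot\theta\,\dot{\theta}\dot{\varphi} = 0,$

32.24  $\ddot{t} + \frac{\eta'}{\eta}\,\dot{r}\dot{t} = 0.$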


The choice of the θ and φ coordinates is at our
disposal. We choose them in such a way that the
initial position of the particle be on the equator
and that the tangent be tangent to the equator.
In this case θ = π/2 and $\dot{\theta}$ = 0 at the
initial moment and the second equation shows
that θ = π/2 always. Now the last two equations
may be integrated once each and they furnish

32.3   $r^2\dot{\varphi} = h,$
32.4   $\eta\,\dot{t} = k,$

where h and k are constants. Together with
these two equations we have to consider the one
corresponding to 32.1; viz.,

32.5   $\frac{\dot{r}^2}{\eta} + r^2\dot{\varphi}^2 - \eta\,\dot{t}^2 = A.$


We simplify our system of equations in the
following way: (a) we eliminate t by means of
32.4; (b) we eliminate differentiation with respect
to the parameter by using $\dot{r} = (dr/d\varphi)\,\dot{\varphi}$
and 32.3; (c) we introduce as a new variable,
as is customary in celestial mechanics, the inverse
distance u = 1/r, instead of r, so that

32.6   $\dot{r} = -h\,\frac{du}{d\varphi};$

and, (d) we substitute the value for η from 31.8.
We obtain in this way a differential equation
between u and φ; viz.,

$\left(\frac{du}{d\varphi}\right)^2 = \lambda - \frac{A\gamma}{h^2}\,u - u^2 + \gamma u^3,$

where λ is a constant. This equation may be
considered as the equation of the orbit of a
planet.
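
The role of the constant in the orbit equation can be illustrated numerically. The sketch below assumes the orbit equation in the form $(du/d\varphi)^2 = \lambda - (A\gamma/h^2)\,u - u^2 + \gamma u^3$, obtained by carrying out steps (a) to (d) on $\dot{r}^2/\eta + r^2\dot{\varphi}^2 - \eta\dot{t}^2 = A$ with $r^2\dot{\varphi} = h$, $\eta\dot{t} = k$ and η = 1 - γu, so that λ stands for $(k^2 + A)/h^2$. Differentiating with respect to φ gives the second-order form $u'' = -A\gamma/(2h^2) - u + \tfrac{3}{2}\gamma u^2$, which is what the code integrates. The numerical values of γ and h and the starting point are hypothetical.

```python
GAMMA = 0.02   # hypothetical value of the integration constant in 31.8
H = 1.0        # hypothetical value of the constant h
A = -1.0       # right-hand side for a material particle

def first_integral(u, du):
    """The quantity that the orbit equation asserts to be the constant lambda."""
    return du * du + A * GAMMA / H ** 2 * u + u * u - GAMMA * u ** 3

def rhs(state):
    """Second-order form of the orbit equation as a first-order system in phi."""
    u, du = state
    return (du, -A * GAMMA / (2.0 * H ** 2) - u + 1.5 * GAMMA * u * u)

def rk4_step(state, step):
    """One classical Runge-Kutta step of size `step` in phi."""
    def shifted(k, factor):
        return tuple(s + factor * ki for s, ki in zip(state, k))
    k1 = rhs(state)
    k2 = rhs(shifted(k1, 0.5 * step))
    k3 = rhs(shifted(k2, 0.5 * step))
    k4 = rhs(shifted(k3, step))
    return tuple(
        s + step / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def lambda_drift(state, step=1e-3, steps=12000):
    """Largest change of the first integral while phi runs over about two turns."""
    start = first_integral(*state)
    worst = 0.0
    for _ in range(steps):
        state = rk4_step(state, step)
        worst = max(worst, abs(first_integral(*state) - start))
    return worst

drift = lambda_drift((0.012, 0.0))
print(drift)
```

That the drift stays at rounding level is only a consistency check between the first-order and second-order forms; it does not by itself say anything about the shape of the orbit.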


33.  Newtonian  Motion  of  a  Planet. 

Every reader knows, of course, that according to the Newtonian theory
a planet moves around the sun on an ellipse in one of whose foci the
sun is situated, although he may not be in possession of a proof of
that; we shall not give a proof here either, but we shall discuss in
detail only one feature of the situation. The vertex of the ellipse
which is nearest to the focus in which the sun is located is called
the perihelion, the other vertex the aphelion; the line joining the
perihelion and the aphelion is the major axis, and therefore passes
through the sun. Using the coordinates u and $\varphi$ corresponding
to those of the preceding section we may say that the perihelion
corresponds to the maximum value of u, and the aphelion to the
minimum value of u, and that the transition from the maximum to the
minimum value of u corresponds to a change of $\varphi$ by the amount
$\pi$. It is this last fact that we shall deduce from the equations
of motion. We may (corresponding to the fact that we set
$\theta = \pi/2$ in the preceding section) consider a motion in the
xy-plane characterized by the equations (see 1.1 and 1.3)

33.1   $\frac{d^2x}{dt^2} = -\frac{Mx}{r^3}, \qquad \frac{d^2y}{dt^2} = -\frac{My}{r^3}.$


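We have now to introduce variables corresponding to those used above
in the Relativity treatment, i.e., to set

$x = \cos\varphi/u, \qquad y = \sin\varphi/u.$

We calculate the first and the second derivatives of x and y with
respect to t, substitute them into 33.1 and, combining the terms with
$\cos\varphi$ and those with $\sin\varphi$, we get

$\cos\varphi\left[2u^{-3}\left(\frac{du}{dt}\right)^2 - u^{-2}\frac{d^2u}{dt^2} - u^{-1}\left(\frac{d\varphi}{dt}\right)^2 + Mu^2\right] + \sin\varphi\left[2u^{-2}\frac{du}{dt}\frac{d\varphi}{dt} - u^{-1}\frac{d^2\varphi}{dt^2}\right] = 0,$

$\sin\varphi\left[2u^{-3}\left(\frac{du}{dt}\right)^2 - u^{-2}\frac{d^2u}{dt^2} - u^{-1}\left(\frac{d\varphi}{dt}\right)^2 + Mu^2\right] - \cos\varphi\left[2u^{-2}\frac{du}{dt}\frac{d\varphi}{dt} - u^{-1}\frac{d^2\varphi}{dt^2}\right] = 0.$

Multiplying the first of these equalities by $\cos\varphi$, the
second by $\sin\varphi$ and adding the results we obtain

33.2   $2u^{-3}\left(\frac{du}{dt}\right)^2 - u^{-2}\frac{d^2u}{dt^2} - u^{-1}\left(\frac{d\varphi}{dt}\right)^2 + Mu^2 = 0,$

and then easily

$2u^{-2}\,\frac{du}{dt}\,\frac{d\varphi}{dt} - u^{-1}\,\frac{d^2\varphi}{dt^2} = 0.$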

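The last equation may be written as

$\frac{d^2\varphi/dt^2}{d\varphi/dt} = \frac{2\,du/dt}{u},$

whence

$\log\frac{d\varphi}{dt} = 2\log u + \text{const},$

and then

33.3   $\frac{d\varphi}{dt} = Hu^2,$

where H is a constant of integration. We next want to eliminate t
from 33.2 with the help of the last formula. Differentiating it we
have

$\frac{d^2\varphi}{dt^2} = 2Hu\,\frac{du}{dt},$

$\frac{du}{dt} = \frac{du}{d\varphi}\,\frac{d\varphi}{dt} = \frac{du}{d\varphi}\,Hu^2,$

$\frac{d^2u}{dt^2} = \frac{d^2u}{d\varphi^2}\left(\frac{d\varphi}{dt}\right)^2 + \frac{du}{d\varphi}\,\frac{d^2\varphi}{dt^2} = \frac{d^2u}{d\varphi^2}\,H^2u^4 + 2H^2u^3\left(\frac{du}{d\varphi}\right)^2.$

Substituting into 33.2 we arrive at

$2u^{-3}H^2u^4\left(\frac{du}{d\varphi}\right)^2 - u^{-2}\left[\frac{d^2u}{d\varphi^2}\,H^2u^4 + 2H^2u^3\left(\frac{du}{d\varphi}\right)^2\right] - u^{-1}H^2u^4 + Mu^2 = 0$

and, after two terms cancel, at

33.4   $\frac{d^2u}{d\varphi^2} + u = \frac{M}{H^2}.$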


This, we easily see, may be obtained by differentiation from

33.5   $\left(\frac{du}{d\varphi}\right)^2 + u^2 - \frac{2Mu}{H^2} = \alpha,$

where $\alpha$ is a constant. This corresponds to equation 32.7
obtained from Relativity theory in the preceding section; in that
last equation we have, of course, to take A = -1 if we consider the
motion of a planet so that it becomes

33.51   $\left(\frac{du}{d\varphi}\right)^2 + u^2 - \frac{\gamma u}{h^2} - \gamma u^3 = \alpha,$

and we see that the difference is essentially in only one term. But
before we come to the comparison of the motions described by these
two equations we have to continue the discussion of 33.5. The
character of motion described by it depends on the values of the
constants appearing in it, and also on the initial conditions. We
begin the discussion by writing 33.5 in the form

33.52   $\left(\frac{du}{d\varphi}\right)^2 = -(u - u_1)(u - u_2),$


where $u_1$ and $u_2$ are the two roots of the polynomial
$u^2 - 2Mu/H^2 - \alpha$. If the two roots are complex, or equal, the
right hand side of 33.52 is negative and we cannot have real motion.
Also when both roots are negative the right hand side is negative for
positive values of u (and u, being the inverse distance, must be
positive). The case of one positive and one negative root corresponds
to u changing from 0 to a finite value and then going back to zero,
for instance, a comet approaching the sun from an infinite distance
and then receding back into infinity. But we want to treat the case
of a planet, and this will obviously correspond to the only remaining
case; viz., that of two distinct positive roots. If by $u_1$ we
denote the larger and by $u_2$ the smaller of the two roots it will
be convenient to write our equation as

33.53   $\left(\frac{du}{d\varphi}\right)^2 = (u_1 - u)(u - u_2),$


and we see that a real solution is only possible when u is between
$u_2$ and $u_1$. The motion will manifest itself in an oscillation of
u between $u_2$ and $u_1$ and the sign of $du/d\varphi$ will change
at these points. The particular question we want to investigate is,
as was mentioned at the beginning of the section, to what change of
$\varphi$ corresponds one oscillation of u, between $u_2$ and $u_1$
say. In order to find this we solve the equation for $d\varphi$,
obtaining

$d\varphi = \frac{du}{\sqrt{(u_1 - u)(u - u_2)}},$

whence

33.6   $\varphi_1 - \varphi_2 = \int_{u_2}^{u_1} \frac{du}{\sqrt{(u_1 - u)(u - u_2)}};$


a change of variable will help us to evaluate this integral. We put

33.7   $\frac{u - u_2}{u_1 - u_2} = \sin^2 x;$

when x changes from 0 to $\pi/2$, u will increase from $u_2$ to $u_1$
as required. We have

$du = 2(u_1 - u_2)\sin x\cos x\,dx,$

$u - u_2 = (u_1 - u_2)\sin^2 x,$

$u_1 - u = u_1 - [u_2 + (u_1 - u_2)\sin^2 x] = (u_1 - u_2)\cos^2 x,$

and the integral becomes

33.8   $\varphi_1 - \varphi_2 = \int_0^{\pi/2} 2\,dx,$

33.9   $\varphi_1 - \varphi_2 = \pi.$


The answer to our question is then, that $\varphi$ changes exactly
through $\pi$ while u performs an oscillation between its minimum and
its maximum values, which corresponds to the fact mentioned before
that the aphelion and the perihelion are on a straight line with the
sun, which fact we thus proved.
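
The result can also be checked numerically. The following is a minimal sketch, not part of the original argument: it integrates the orbit equation in the form $d^2u/d\varphi^2 + u = M/H^2$ (taken here to be what 33.4 states) with a classical Runge-Kutta step, starting at a perihelion, and returns the angle at which $du/d\varphi$ next vanishes. The function names and the sample values of M, H and the perihelion value of u are hypothetical choices made for illustration.

```python
import math


def rk4_step(u, v, h, c):
    """One RK4 step for the system u' = v, v' = c - u, where c = M/H^2."""
    k1u, k1v = v, c - u
    k2u, k2v = v + 0.5 * h * k1v, c - (u + 0.5 * h * k1u)
    k3u, k3v = v + 0.5 * h * k2v, c - (u + 0.5 * h * k2u)
    k4u, k4v = v + h * k3v, c - (u + h * k3u)
    u_new = u + h * (k1u + 2.0 * k2u + 2.0 * k3u + k4u) / 6.0
    v_new = v + h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return u_new, v_new


def perihelion_to_aphelion_angle(M, H, u_peri, h=1e-3, max_steps=100_000):
    """Angle swept between a maximum of u (perihelion) and the next minimum."""
    c = M / H**2
    if u_peri <= c:
        raise ValueError("u_peri must exceed M/H^2 to be a maximum of u")
    # Leave the turning point so that du/dphi is strictly negative.
    u, v = rk4_step(u_peri, 0.0, h, c)
    phi = h
    for _ in range(max_steps):
        u_next, v_next = rk4_step(u, v, h, c)
        if v < 0.0 <= v_next:
            # Linear interpolation for the zero of du/dphi inside the step.
            return phi + h * (-v) / (v_next - v)
        u, v, phi = u_next, v_next, phi + h
    raise RuntimeError("no aphelion found within max_steps")


# Hypothetical sample values in arbitrary units.
angle = perihelion_to_aphelion_angle(M=1.0, H=4.0, u_peri=0.08)
print(angle, math.pi)
```

For these sample values the computed angle agrees with $\pi$ to well within the step size, as the closed-form evaluation of the integral requires.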


34.  Relativity  Motion  of  a  Planet. 

Following this excursion into celestial mechanics according to Newton
we return to our Relativity formulas, which we shall treat by
comparing them to the formulas derived in the last section.

At this stage we come again upon a fundamental question: we have two
theories; the quantities of one of them have been identified with
measured quantities, and this identification proved, in the main, a
splendid success; if the new theory is to be applied successfully, it
is clear that it has essentially to agree with the old theory, with
which it may be compared instead of being compared with results of
measurement directly; that means that we have to establish a
correspondence between quantities of the two theories, and we have to
expect that the corresponding quantities of the two theories obey
approximately the same relations. This correspondence has been
anticipated in the preceding pages by using the same letters for
quantities which it is intended to identify. But it may not be
superfluous to remind the reader that the quantities u, $\varphi$,
$\theta$ of the two theories are not the same; there is a certain
arbitrariness in choosing coordinates in curved space, and this must
be especially obvious in the case of r (of which u is the inverse);
it is possible to substitute for r some simple function of r, and,
indeed, it has been done; the criterion of correctness of choice must
lie in the success of the identification.


Next we must identify the constants of the new theory with those of
the old. It would seem as though we must, in order to reach an
agreement, make $\gamma = 0$ so as to get rid of the last term of the
equation 33.51 by which it differs essentially from 33.5. But this
would annihilate also the preceding term in the new formula and so
spoil the correspondence altogether. We must, therefore, ascribe to
$\gamma$ a finite value, but we will expect that it will be small;
more precisely, it will be small in such a way that the term
$\gamma u^3$ will not affect the equation 33.51 essentially, or will
be small in comparison with $u^2$. Next, let us compare 32.3 with
33.3. Of course, the left hand sides differ by the factor dt/ds, but
this is equal (Section 13) to $1/\sqrt{1 - \beta^2}$ which is, even
for motions of planets, very close to one, so that, in the first
approximation we may identify h with H. Comparing now 33.5 and 33.51
we come to the conclusion that

34.1   $\gamma = 2M$

so that we may write 33.51 as

34.2   $\left(\frac{du}{d\varphi}\right)^2 + u^2 - \frac{2Mu}{H^2} - 2Mu^3 = \alpha.$


After we have made these identifications the situation is then this:
if we neglect the term $2Mu^3$ in the equation, and this term is
negligible in most cases, we have the same equation of the orbit as
in Newtonian mechanics. This result is very satisfactory: we have
been able to obtain the equations of motion of a planet without
considering any gravitational forces, as a result of our
identification of the contracted Riemann tensor with the complete
tensor. Still the term $2Mu^3$ is there; the Relativity theory
predicts an orbit that is slightly different from that predicted by
the Newtonian theory; is the difference within the error of
observation? We shall consider now this question, but instead of
considering the motion as a whole, we shall consider only the feature
of it which for the case of Newtonian motion has been considered in
the preceding section; viz., we shall ask ourselves whether,
corresponding to an oscillation of u between a minimum and a maximum
value, the change in $\varphi$ will be exactly $\pi$. Of course, we
are sure that in the new theory there will be motions which differ
but slightly from the motion considered in the preceding section, so
that the general character of the motion will be the same, and u will
oscillate between a minimum and a maximum. The value of
$(du/d\varphi)^2$ will now be expressed by a polynomial of degree
three the first two terms of which are

$2Mu^3 - u^2.$

The sum of the three roots of this polynomial is 1/2M so that if
$u_1$, $u_2$ denote two roots, viz., those two roots which differ but
slightly from the roots denoted in the same way in the preceding
case, the third root will be

$\frac{1}{2M} - u_1 - u_2,$

and the integral corresponding to 33.6 will be

$\varphi_1 - \varphi_2 = \int_{u_2}^{u_1} \frac{du}{\sqrt{(u_1 - u)(u - u_2)\,[1 - (u_1 + u_2 + u)\,2M]}};$


the same substitution 33.7 as before will be applied. We only have
to calculate

$u_1 + u_2 + u = u_1 + u_2 + u_2 + (u_1 - u_2)\sin^2 x = u_1 + u_2 + u_1\sin^2 x + u_2\cos^2 x,$

so that the integral becomes

$\int_0^{\pi/2} \frac{2\,dx}{\sqrt{1 - 2M(u_1 + u_2 + u_1\sin^2 x + u_2\cos^2 x)}}.$


As we saw before, M is a very small quantity; before, we neglected it
altogether and obtained $\pi$ for the value of the integral; now, we
shall go to the next approximation; we shall develop the denominator
according to the powers of M and neglect all terms beyond the second
(it would be a very easy but not a worthwhile matter to estimate the
value of the error); we get in this way, as an approximate value for
$\varphi_1 - \varphi_2$,

$\int_0^{\pi/2} 2\,[1 + M(u_1 + u_2 + u_1\sin^2 x + u_2\cos^2 x)]\,dx,$

or

$\pi + \frac{3\pi M}{2}\,(u_1 + u_2).$


The new theory predicts then that the angle $\varphi$ will have
changed by this amount while the distance from the sun changes from
its minimum to its maximum; i.e., that the perihelion and aphelion
are not in a straight line with the sun but that the planet moves
through an additional angle of $\frac{3\pi M}{2}(u_1 + u_2)$ after
reaching the position opposite the one where it was during the
perihelion, before reaching the aphelion. Since the same situation
applies to the motion between an aphelion and the next perihelion we
see that between two consecutive perihelia the planet will have moved
through an angle $2\pi + 3\pi M(u_1 + u_2)$, or that the perihelion
will have moved through an angle $3\pi M(u_1 + u_2)$ during one
revolution of the planet. This is a very small amount, and it may be
considered as a correction to the classical result according to which
the planet moves on an elliptic orbit with the sun in one of the
foci. If a is the major semi-axis and e the eccentricity, the
distance at perihelion is a - ae and the distance at aphelion a + ae;
we have then

$u_1 + u_2 = \frac{1}{a - ae} + \frac{1}{a + ae} = \frac{2}{a(1 - e^2)},$

and the final formula for the advance of the perihelion comes out

34.5   $p = \frac{6\pi M}{a(1 - e^2)}.$
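
The first-order expansion used above can be compared with a direct numerical evaluation of the integral. This is a sketch only, assuming the integrand after the substitution 33.7 is $2/\sqrt{1 - 2M(u_1 + u_2 + u_1\sin^2 x + u_2\cos^2 x)}$ on the interval from 0 to $\pi/2$; the Simpson routine, the function names and the sample values of M, $u_1$, $u_2$ are illustrative choices, not taken from the text.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def half_orbit_angle(M, u1, u2):
    """Numerical value of the integral for phi_1 - phi_2 (no expansion in M)."""
    def integrand(x):
        s2 = math.sin(x) ** 2
        bracket = u1 + u2 + u1 * s2 + u2 * (1.0 - s2)
        return 2.0 / math.sqrt(1.0 - 2.0 * M * bracket)
    return simpson(integrand, 0.0, math.pi / 2.0)


def first_order_angle(M, u1, u2):
    """Value kept in the text: pi + (3 pi M / 2)(u1 + u2)."""
    return math.pi + 1.5 * math.pi * M * (u1 + u2)


# Hypothetical sample values in arbitrary units, with M * u small.
exact = half_orbit_angle(1e-3, 1.2, 0.8)
approx = first_order_angle(1e-3, 1.2, 0.8)
print(exact, approx, exact - approx)
```

The difference between the two values is of the second order in M, which is the order of the terms neglected in the expansion.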


Here then we have two predictions: on the old theory the perihelion
will remain fixed in space; according to the new one it will advance
by p during one revolution. What are the results? In the case of most
planets either this amount is too small or the position of the
perihelion too uncertain to permit any decision, but in the case of
the planet Mercury it was known for a long time that there is a
discrepancy between the prediction of the Newtonian theory and actual
observations; and it happens that the discrepancy is very nearly the
amount showing the discrepancy between the two theories, so that the
theory of Relativity predicts a result that has been actually
observed. This must be considered as a success of the new theory.
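
To give an idea of the size of the effect, the advance per revolution, taken here as $6\pi M/(a(1 - e^2))$, can be evaluated for Mercury. This is an illustrative sketch: the numerical constants below (the solar mass expressed as a length, and Mercury's orbital elements and period) are modern reference values assumed for the example and do not appear in the text.

```python
import math

# Assumed illustrative values, not taken from the text.
M_SUN = 1476.6          # solar mass in length units, GM/c^2, metres
A_MERCURY = 5.7909e10   # major semi-axis of Mercury's orbit, metres
E_MERCURY = 0.2056      # eccentricity of Mercury's orbit
PERIOD_DAYS = 87.969    # orbital period of Mercury, days


def advance_per_revolution(M, a, e):
    """Perihelion advance per revolution, in radians: 6 pi M / (a (1 - e^2))."""
    return 6.0 * math.pi * M / (a * (1.0 - e * e))


def advance_arcsec_per_century(M, a, e, period_days):
    """Accumulated advance over a Julian century, in seconds of arc."""
    revolutions = 36525.0 / period_days
    radians = advance_per_revolution(M, a, e) * revolutions
    return math.degrees(radians) * 3600.0


mercury = advance_arcsec_per_century(M_SUN, A_MERCURY, E_MERCURY, PERIOD_DAYS)
print(f"{mercury:.1f} seconds of arc per century")
```

With these assumed inputs the accumulated advance comes to about 43 seconds of arc per century.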


35.  Deflection  of  Light. 

According to Section 29 a light particle also moves along a geodesic,
only in this case it is a zero geodesic, one along which the tangent
vectors have zero length. The equations for such a geodesic are the
same as for the other kind with the difference that the parameter is
no longer arc length. As a result we have to have zero instead of -1
in the right hand member of equation 32.1, that is to make A = 0 in
the equations 32.5 and 32.7. The equation of the orbit will therefore
be (34.1)

35.1   $\left(\frac{du}{d\varphi}\right)^2 + u^2 = \alpha + 2Mu^3$


which will have to be compared with the same equation without the
term containing $u^3$, which is an equation of a straight line and
characterizes the propagation of a beam of light on the old theory.
In fact, the equation of a straight line whose distance from the
origin is 1/p and which is perpendicular to the polar axis is in our
coordinates $u = p\cos\varphi$; we have then
$du/d\varphi = -p\sin\varphi$ and taking the sum of the squares of
the last two expressions we find that they add up to $p^2$ which we
may identify with $\alpha$. Again the term $2Mu^3$ is very small
because the maximum value u can take is the inverse of the minimum
value of the distance from the center of the sun, which is the radius
of the sun; we treat the problem again as a perturbation problem,
that is, compare the required solution to that of the equation
without the $2Mu^3$ term. Again we are interested in the change of
the angle $\varphi$ corresponding to a transition between the two
extreme values of u. We shall be interested in a beam of light
emitted from a star, arriving into our telescope and passing on its
way very near to the surface of the sun. The distances of the star
and even of the earth from the sun are very large in comparison with
the minimum distance, and we shall take them $= \infty$. The maximum
value of u, corresponding to the minimum distance from the sun, we
shall denote by $u_0$. Since $du/d\varphi$ changes its sign when the
light particle reaches this point it must vanish there so that the
left hand side of 35.1 reduces to $u_0^2$, and we have

$u_0^2 = \alpha + 2Mu_0^3;$

we may use $u_0$ instead of $\alpha$ in our equation and write it in
the form

$\left(\frac{du}{d\varphi}\right)^2 = u_0^2 - u^2 - 2Mu_0^3 + 2Mu^3.$


Solving this for $d\varphi$ we find

$d\varphi = \frac{du}{\sqrt{2M(u^3 - u_0^3) - (u^2 - u_0^2)}}.$

We introduce a new variable x letting

$u = u_0\sin x$

and after the substitution develop according to powers of M and keep
only two terms; we thus get for $d\varphi$ approximately

$\left[1 + Mu_0\,\frac{1 - \sin^3 x}{\cos^2 x}\right]dx;$

if we let x change from 0 to $\pi$, u will change from zero to $u_0$
and back to zero, just the change that the inverse distance will
experience during the propagation of the light particle. The total
change of the angle will then be represented by the integral

$\int_0^{\pi}\left[1 + Mu_0\,\frac{1 - \sin^3 x}{\cos^2 x}\right]dx;$

on the old theory, which corresponds to the absence of the term with
M in the equation 35.1, we will have to omit the term with M in this
integral and the result is $\pi$. The approximate result according to
the new theory will differ from that by

$Mu_0\int_0^{\pi}\frac{1 - \sin^3 x}{\cos^2 x}\,dx = 4Mu_0.$


The beam of light coming to us from a star will then be deflected by
an angle $4M/r_0$, where $r_0$ is its minimum distance from the sun,
compared to the old theory, or to the beam as it would go if the sun
were absent. If then we observe a star in a certain position on the
sky while the sun is far away, and then observe the same star when
the sun is near the line of vision, i.e., when the apparent position
of the sun is near the apparent position of the star, this latter
position must appear shifted away from the sun approximately by the
angle $4M/r_0$. Actual measurements are possible only during an
eclipse of the sun, because otherwise the light from the sun drowns
out the fainter light from the star, and are beset with difficulties,
but the results seem to be in favor of the prediction.
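
The deflection can be checked numerically and given a size for a ray grazing the sun. The sketch below assumes that, after the substitution $u = u_0\sin x$, the angle element is $dx/\sqrt{1 - 2Mu_0\,g(x)}$ with $g(x) = (1 - \sin^3 x)/\cos^2 x$, written as $(1 + \sin x + \sin^2 x)/(1 + \sin x)$ to avoid the indeterminate form at $x = \pi/2$. The solar constants are modern reference values assumed for illustration, and the function names are hypothetical.

```python
import math

# Assumed illustrative values, not taken from the text.
M_SUN = 1476.6    # solar mass in length units, GM/c^2, metres
R_SUN = 6.957e8   # radius of the sun, metres


def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def g(x):
    """(1 - sin^3 x)/cos^2 x, rewritten so that x = pi/2 causes no division by zero."""
    s = math.sin(x)
    return (1.0 + s + s * s) / (1.0 + s)


def deflection_first_order(M, r0):
    """Excess over pi of the integral keeping only the first power of M."""
    u0 = 1.0 / r0
    return simpson(lambda x: 1.0 + M * u0 * g(x), 0.0, math.pi) - math.pi


def deflection_unexpanded(M, r0):
    """Excess over pi of the integral without expanding in M."""
    u0 = 1.0 / r0
    total = simpson(lambda x: 1.0 / math.sqrt(1.0 - 2.0 * M * u0 * g(x)),
                    0.0, math.pi)
    return total - math.pi


first = deflection_first_order(M_SUN, R_SUN)
full = deflection_unexpanded(M_SUN, R_SUN)
arcsec = math.degrees(4.0 * M_SUN / R_SUN) * 3600.0
print(first, full, 4.0 * M_SUN / R_SUN, arcsec)
```

With these assumed inputs the first-order integral reproduces $4M/r_0$, the unexpanded integral differs from it only in the second order, and the angle for a grazing ray comes to about 1.75 seconds of arc.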


36.  Shift  of  Spectral  Lines. 

We come to the third so-called test of the General Relativity Theory,
that is, a case where the predictions of the theory differ from those
of older theories by an amount exceeding the error of measurement,
thus affording an opportunity to prove or disprove the advantages of
the new theory.

In this case again we deal with propagation of light in the
gravitational field of the sun, but this time the source is supposed
to be on the sun itself, and the observer is on the earth, so that
the direction of the beam is that of a radius of the sun; we take
this to mean that $\theta$ = const. and $\varphi$ = const. We have
then according to 32.5 with A = 0

$\frac{\dot r^2}{\eta} - \eta\,\dot t^2 = 0.$

This gives

36.1   $dr = \pm\,\eta\,dt;$

the double sign corresponds to two possible senses of the beam: from
the sun to the earth and from the earth to the sun. The former, in
which we are now interested, is characterized by the property that r
increases as t increases; the ratio dr/dt must therefore be positive,
and since $\eta$ is positive we must take

36.2   $dr = \eta\,dt.$

The orbit is thus determined by the equations

$d\theta = 0, \qquad d\varphi = 0, \qquad dr = \eta\,dt.$


But we are interested in the color this time, and color, as we have
agreed in Section 16, is proportional to the time component of the
momentum vector. As the momentum vector we have to consider the
vector of components $du^i/dp$, where p is a parameter appearing in
the equations of geodesics, the one with respect to which
differentiation is denoted by a dot. In order to find this parameter
we have to go back to the original equations of geodesics; 32.21
becomes here

$\ddot r - \frac{\eta'}{2\eta}\,\dot r^2 + \frac{\eta\eta'}{2}\,\dot t^2 = 0;$

equation 36.1 shows that the last two terms cancel leaving us with

$\ddot r = 0.$

This means that r is a linear function of the parameter,

$r = ap + b$

so that

$d/dp = a\cdot d/dr.$

The momentum vector $du^i/dp$ is here therefore

$a\,\frac{dr}{dr} = a, \qquad a\,\frac{d\theta}{dr} = 0, \qquad a\,\frac{d\varphi}{dr} = 0, \qquad a\,\frac{dt}{dr} = \frac{a}{\eta},$

the last according to 36.2. What about the value of a? The answer is
that it is not and cannot be determined by the foregoing discussion.
There are different beams of light which satisfy all the conditions
imposed so far; they differ in color, and different colors correspond
to different values of a.
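
The cancellation invoked above can be made explicit. This is a sketch under stated assumptions: it takes the radial form of 32.21 (with $\dot\theta = \dot\varphi = 0$) to be the first line below and 36.2 to read $dr = \eta\,dt$, so that $\dot r = \eta\,\dot t$.

```latex
% Assumed inputs: radial ray, \dot\theta = \dot\varphi = 0, and \dot r = \eta\,\dot t.
\begin{align*}
& \ddot r - \frac{\eta'}{2\eta}\,\dot r^{\,2} + \frac{\eta\eta'}{2}\,\dot t^{\,2} = 0,
  \qquad \dot r = \eta\,\dot t, \\
& -\frac{\eta'}{2\eta}\,(\eta\,\dot t)^2 + \frac{\eta\eta'}{2}\,\dot t^{\,2}
  = -\frac{\eta\eta'}{2}\,\dot t^{\,2} + \frac{\eta\eta'}{2}\,\dot t^{\,2} = 0
  \quad\Longrightarrow\quad \ddot r = 0, \\
& r = ap + b, \qquad
  \frac{dt}{dp} = \frac{dt}{dr}\,\frac{dr}{dp} = \frac{1}{\eta}\cdot a = \frac{a}{\eta}.
\end{align*}
```

The last line gives the time component of the momentum vector, the only component besides the radial one that does not vanish.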

Color, according to our definition, is the time component of the
momentum vector; i.e., the scalar product of the momentum vector of
light and the unit vector in the time direction. If we denote the
(contravariant) components of the latter by $T^i$, the condition that
it has time direction will be given by

$T^1 = T^2 = T^3 = 0$

and the condition that it is a unit vector by

$g_{\alpha\beta}\,T^\alpha T^\beta = -1$

which, taking into account the relations just preceding and the
values of the g's, becomes

$\eta\,(T^4)^2 = 1.$

The scalar product of the vectors $du^i/dp$ and $T^i$ calculated
according to the formula $g_{ij}\,\frac{du^i}{dp}\,T^j$ becomes,
apart from sign,

$\eta\cdot\frac{a}{\eta}\cdot\frac{1}{\sqrt{\eta}} = \frac{a}{\sqrt{\eta}} = a\left(1 - \frac{2M}{r}\right)^{-1/2},$

or, if we expand and keep only the first two terms,

$a(1 + M/r).$

The color of a beam of light is then not constant along the beam. We
shall compare the color as it appears near the surface of the sun,
where r is equal to the radius of the sun $r_s$, and near the surface
of the earth, where we may assume $r = \infty$. The frequencies in
these two cases will be for a given beam of light proportional to

$1 + M/r$ and 1;

the change in frequency will be proportional to

$(1 + M/r) - 1 = M/r,$

and this will be also the relative change in frequency.


If now we consider some source of light near the surface of the sun,
whose frequency we know, the light emitted by it when it is received
at the surface of the earth will have a frequency that is less, the
amount of the relative change being given by

$M/r.$

If then we compare light coming from a terrestrial source, for
instance, emitted by an atom, and light emitted by a corresponding
source on the sun, for instance, emitted by an atom of the same kind,
we would expect a change of frequency of the amount M/r. Or, if we
compare a Solar spectrum with a Terrestrial spectrum, the lines of
the former will be shifted toward the red by the amount M/r. This is
the prediction of the General Relativity Theory.

Again the experimental evidence seems to favor this prediction.
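
For a sense of scale, the relative shift M/r can be evaluated at the surface of the sun and compared with the unexpanded factor $(1 - 2M/r)^{-1/2} - 1$. This is an illustrative sketch: it assumes $\eta = 1 - 2M/r$, and the solar constants are modern reference values that do not appear in the text.

```python
import math

# Assumed illustrative values, not taken from the text.
M_SUN = 1476.6    # solar mass in length units, GM/c^2, metres
R_SUN = 6.957e8   # radius of the sun, metres


def relative_shift_first_order(M, r):
    """Relative change of frequency keeping two terms of the expansion: M/r."""
    return M / r


def relative_shift_unexpanded(M, r):
    """Relative change of frequency from (1 - 2M/r)^(-1/2) - 1, without expansion."""
    return 1.0 / math.sqrt(1.0 - 2.0 * M / r) - 1.0


shift = relative_shift_first_order(M_SUN, R_SUN)
shift_full = relative_shift_unexpanded(M_SUN, R_SUN)
print(shift, shift_full)
```

With these assumed inputs the relative shift is about two parts in a million, and the neglected terms change it only in the sixth significant figure.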

