NATURAL PHILOSOPHY
OF
CAUSE AND CHANCE
BY
MAX BORN
BEING
THE WAYNFLETE LECTURES
DELIVERED IN THE COLLEGE OF
ST. MARY MAGDALEN, OXFORD
IN HILARY TERM
1948
OXFORD
AT THE CLARENDON PRESS
1949
Oxford University Press, Amen House, London E.C. 4
GLASGOW NEW YORK TORONTO MELBOURNE WELLINGTON
BOMBAY CALCUTTA MADRAS CAPE TOWN
Geoffrey Cumberlege, Publisher to the University
PREFACE
A DRAFT of these lectures, written before they were delivered,
contained considerably more technicalities and mathematics than
the present text. Facing a large audience in which physicists
and mathematicians were presumably a minority, I had to
change my plans and to improvise a simplified presentation.
Though this did not seem difficult on the platform of the Hall
of Magdalen College, Oxford, the final formulation for publication
was not an easy task. I did not like replacing rigorous
mathematical reasoning by that mixture of literary style,
authority, and mystery which is often used by popularizing and
philosophizing scientists. Thus, the idea occurred to me to
preserve the mathematics by removing it to a detailed appendix
which could also contain references to the literature. The vast
extension of the latter, however, compelled me to restrict quotations
to recent publications which are not in the textbooks.
Some of these supplements contain unpublished investigations
of my school, mainly by my collaborator Dr. H. S. Green. In the
text itself I have given up the original division into seven lectures
and replaced it by a more natural arrangement into ten
chapters.
I have to thank Dr. Green for his untiring help in reading,
criticizing, and correcting my script, working out drafts of the
appendix, and reading proofs. I am also indebted to Mr. Lewis
Elton not only for proofreading but for carefully preparing the
index. I have further to thank Albert Einstein for permission
to publish sections of two of his letters.
My most sincere gratitude is due to the President and the
Fellows of Magdalen College who gave me the opportunity to
plan these lectures, and the leisure to write them down for
publication.
I wish to thank the Oxford University Press for the excellent
printing and their willingness to follow all my wishes.
M. B.
CONTENTS
I. INTRODUCTION
II. CAUSALITY AND DETERMINISM
III. ILLUSTRATION: ASTRONOMY AND PARTICLE MECHANICS
IV. CONTIGUITY: MECHANICS OF CONTINUOUS MEDIA; ELECTROMAGNETIC FIELDS; RELATIVITY AND THE FIELD THEORY OF GRAVITATION
V. ANTECEDENCE: THERMODYNAMICS
VI. CHANCE: KINETIC THEORY OF GASES; STATISTICAL MECHANICS; GENERAL KINETIC THEORY
VII. CHANCE AND ANTECEDENCE
VIII. MATTER: MASS, ENERGY, AND RADIATION; ELECTRONS AND QUANTA
IX. CHANCE: QUANTUM MECHANICS; INDETERMINISTIC PHYSICS; QUANTUM KINETIC THEORY OF MATTER
X. METAPHYSICAL CONCLUSIONS
APPENDIX
1. Multiple causes (II, 8)
2. Derivation of Newton's law from Kepler's laws (III, 13)
3. Cauchy's mechanics of continuous media (IV, 20)
4. Maxwell's equations of the electromagnetic field (IV, 24)
5. Relativity (IV, 27)
6. On classical and modern thermodynamics (V, 38)
7. Theorem of accessibility (V, 39)
8. Thermodynamics of chemical equilibrium (V, 43)
9. Velocity of sound in gases (V, 44)
10. Thermodynamics of irreversible processes (V, 45)
11. Elementary kinetic theory of gases (VI, 47)
12. Statistical equilibrium (VI, 50)
13. Maxwell's functional equation (VI, 51)
14. The method of the most probable distribution (VI, 52)
15. The method of mean values (VI, 54)
16. Boltzmann's collision integral (VI, 56)
17. Irreversibility in gases (VI, 57)
18. Formalism of statistical mechanics (VI, 60)
19. Quasiperiodicity (VI, 62)
20. Fluctuations and Brownian motion (VI, 63)
21. Reduction of the multiple distribution function (VI, 67)
22. Construction of the multiple distribution function (VI, 68)
23. Derivation of the collision integral from the general theory of fluids (VI, 69)
24. Irreversibility in fluids (VII, 72)
25. Atomic physics (VIII, 75)
26. The law of equipartition (VIII, 77)
27. Operator calculus in quantum mechanics (VIII, 91)
28. General formulation of the uncertainty principle (IX, 94)
29. Dirac's derivation of the Poisson brackets in quantum mechanics (IX, 97)
30. Perturbation theory for the density matrix (IX, 100)
31. The functional equation of quantum statistics (IX, 112)
32. Degeneration of gases (IX, 113)
33. Quantum equations of motion (IX, 116)
34. Supraconductivity (IX, 118)
35. Economy of thinking (X, 124)
36. Concluding remarks (X, 127)
INDEX
NOTATION
THE practice of representing vector quantities by means of clarendon
type in print is now well established, and used throughout these lectures.
For dealing with cartesian tensors, the notation of Chapman and Milne,
explained in the first chapter of Chapman and Cowling's book The
Mathematical Theory of Non-Uniform Gases (C.U.P., 1939) is used; this
consists in printing tensors in sans serif type.
The following examples will suffice to show how vector and tensor
equations are translated into coordinate notation.
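\[
\begin{aligned}
\mathbf{a} = \mathbf{b} &\;\rightarrow\; a_k = b_k \quad (k = 1, 2, 3) \\
\mathbf{a} \cdot \mathbf{b} &= \sum_{k=1}^{3} a_k b_k \\
\mathsf{a} = \mathsf{b} &\;\rightarrow\; a_{kl} = b_{kl} \quad (k, l = 1, 2, 3) \\
\mathsf{a} \cdot \mathbf{b} = \mathbf{c} &\;\rightarrow\; \sum_{l=1}^{3} a_{kl} b_l = c_k \quad (k = 1, 2, 3) \\
\mathbf{a} \cdot \mathsf{b} \cdot \mathbf{c} &= \sum_{k,l=1}^{3} a_k b_{kl} c_l \\
\mathbf{a} \wedge \mathbf{b} = \mathbf{c} &\;\rightarrow\; a_2 b_3 - a_3 b_2 = c_1, \text{ etc.} \\
\operatorname{grad} a = \mathbf{b} &\;\rightarrow\; \frac{\partial a}{\partial x_k} = b_k \quad (k = 1, 2, 3) \\
\operatorname{div} \mathbf{a} &= \sum_{k=1}^{3} \frac{\partial a_k}{\partial x_k} \\
\frac{\partial}{\partial \mathbf{x}} \cdot \mathsf{a} = \operatorname{div} \mathsf{a} = \mathbf{b} &\;\rightarrow\; \sum_{l=1}^{3} \frac{\partial a_{lk}}{\partial x_l} = (\operatorname{div} \mathsf{a})_k = b_k \quad (k = 1, 2, 3) \\
\operatorname{curl} \mathbf{a} = \mathbf{b} &\;\rightarrow\; \frac{\partial a_3}{\partial x_2} - \frac{\partial a_2}{\partial x_3} = b_1, \text{ etc.}
\end{aligned}
\]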
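For readers who like to check such translations numerically, the coordinate forms of the algebraic products can be sketched in a few lines of Python. This is only an illustration and not part of the lectures: the function names are hypothetical, vectors are taken to be plain 3-element lists, and tensors 3x3 nested lists.

```python
# Illustrative sketch (names are assumptions, not from the lectures).
# Vectors are 3-element lists; tensors are 3x3 nested lists t[k][l].

def dot(a, b):
    """Scalar product a.b = sum over k of a_k b_k."""
    return sum(a[k] * b[k] for k in range(3))

def tensor_dot_vector(t, b):
    """Vector c = t.b with components c_k = sum over l of t_kl b_l."""
    return [sum(t[k][l] * b[l] for l in range(3)) for k in range(3)]

def bilinear(a, t, c):
    """Scalar a.t.c = sum over k, l of a_k t_kl c_l."""
    return sum(a[k] * t[k][l] * c[l] for k in range(3) for l in range(3))

def cross(a, b):
    """Vector product: c_1 = a_2 b_3 - a_3 b_2, and cyclically."""
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]

IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```

With the unit tensor, `tensor_dot_vector(IDENTITY, b)` returns `b` unchanged, and `bilinear(a, IDENTITY, c)` reduces to `dot(a, c)`.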
I
INTRODUCTION
THE notions of cause and chance which I propose to deal with
in these lectures are not specifically physical concepts but have
a much wider meaning and application. They are used, more or
less vaguely, in everyday life. They appear, not only in all
branches of science, but also in history, psychology, philosophy,
and theology; everywhere with a different shade of meaning.
It would be far beyond my abilities to give an account of all
these usages, or to attempt an analysis of the exact significance
of the words 'cause' and 'chance' in each of them. However,
it is obvious that there must be a common feature in the use of
these notions, like the theme in a set of variations. Indeed,
cause expresses the idea of necessity in the relation of events,
while chance means just the opposite, complete randomness.
Nature, as well as human affairs, seems to be subject to both
necessity and accident. Yet even accident is not completely
arbitrary, for there are laws of chance, formulated in the mathematical
theory of probability, nor can the cause-effect relation
be used for predicting the future with certainty, as this would
require a complete knowledge of the relevant circumstances,
present, past, or both together, which is not available. There
seems to be a hopeless tangle of ideas. In fact, if you look
through the literature on this problem you will find no satisfactory
solution, no general agreement. Only in physics has a
systematic attempt been made to use the notions of cause and
chance in a way free from contradictions. Physicists form their
notions through the interpretation of experiments. This method
may rightly be called Natural Philosophy, a word still used for
physics at the Scottish universities. In this sense I shall attempt
to investigate the concepts of cause and chance in these lectures.
My material will be taken mainly from physics, but I shall try
to regard it with the attitude of the philosopher, and I hope that
the results obtained will be of use wherever the concepts of
cause and chance are applied. I know that such an attempt will
not find favour with some philosophers, who maintain that
science teaches only a narrow aspect of the world, and one which
is of no great importance to man's mind. It is true that many
scientists are not philosophically minded and have hitherto
shown much skill and ingenuity but little wisdom. I need hardly
enlarge on this subject. The practical applications of science
have given us the means of a fuller and richer life, but also the
means of destruction and devastation on a vast scale. Wise
men would have considered the consequences of their activities
before starting on them; scientists have failed to do so, and only
recently have they become conscious of their responsibilities to
society. They have gained prestige as men of action, but they
have lost credit as philosophers. Yet history shows that science
has played a leading part in the development of human thought.
It has not only supplied raw material to philosophy by gathering
facts, but also evolved the fundamental concepts on how to deal
with them. It suffices to mention the Copernican system of the
universe, and the Newtonian dynamics which sprang from it.
These originated the conceptions of space, time, matter, force,
and motion for a long time to come, and had a mighty influence
on many philosophical systems. It has been said that the metaphysics
of any period is the offspring of the physics of the preceding
period. If this is true, it puts us physicists under the
obligation to explain our ideas in a not-too-technical language.
This is the purpose of the following lectures on a restricted
though important field. I have made an attempt to avoid
mathematics entirely, but failed. It would have meant an unbearable
clumsiness of expression and loss of clarity. A way out
would have been the reduction of all higher mathematics to
elementary methods in Euclidean style following the celebrated
example of Newton's Principia. But this would even
have increased the clumsiness and destroyed what there is of
aesthetic appeal. I personally think that more than 200 years
after Newton there should be some progress in the assimilation of
mathematics by those who are interested in natural philosophy.
So I shall use ordinary language and formulae in a suitable
mixture; but I shall not give proofs of theorems (they are
collected in the Appendix).
In this way I hope to explain how physics may throw some
light on a problem which is not only important for abstract
knowledge but also for the behaviour of man. An unrestricted
belief in causality leads necessarily to the idea that the world is
an automaton of which we ourselves are only little cogwheels.
This means materialistic determinism. It resembles very much
that religious determinism accepted by different creeds, where
the actions of men are believed to be determined from the
beginning by a ruling of God. I cannot enlarge on the difficulties
to which this idea leads if considered from the standpoint of
ethical responsibility. The notion of divine predestination
clashes with the notion of free will, in the same way as the
assumption of an endless chain of natural causes. On the other
hand, an unrestricted belief in chance is impossible, as it cannot
be denied that there are a great many regularities in the world;
hence there can be, at most, 'regulated accident'. One has to
postulate laws of chance which assume the appearance of laws
of nature or laws for human behaviour. Such a philosophy
would give ample space for free will, or even for the willed
actions of gods and demons. In fact, all primitive polytheistic
religions seem to be based on such a conception of nature: things
happening in a haphazard way, except where some spirit interferes
with a purpose. We reject today this demonological
philosophy, but admit chance into the realm of exact science.
Our philosophy is dualistic in this respect; nature is ruled by
laws of cause and laws of chance in a certain mixture. How is
this possible? Are there no logical contradictions? Can this
mixture of ideas be cast into a consistent system in which all
phenomena can be adequately described or explained? What
do we mean by such an explanation if the feature of chance is
involved? What are the irreducible or metaphysical principles
involved? Is there any room in this system for free will or for
the interference of deity? These and many other questions can
be asked. I shall try to answer some of them from the standpoint
of the physicist, others from my philosophical convictions
which are not much more than common sense improved by
sporadic reading. The statement, frequently made, that modern
physics has given up causality is entirely unfounded. Modern
physics, it is true, has given up or modified many traditional
ideas; but it would cease to be a science if it had given up the
search for the causes of phenomena. I found it necessary,
therefore, to formulate the different aspects of the fundamental
notions by giving definitions of terms which seem to me in agreement
with ordinary language. With the help of these concepts,
I shall survey the development of physical thought, dwelling
here and there on special points of interest, and I shall try to
apply the results to philosophy in general.
II
CAUSALITY AND DETERMINISM
THE concept of causality is closely linked with that of determinism,
yet they seem to me not identical. Moreover, causality is
used with several different shades of meaning. I shall try to disentangle
these notions and eventually sum them up in definitions.
The cause-effect relation is used mainly in two ways; I shall
illustrate this by giving examples, partly from ordinary life,
partly from science. Take these statements:
'Overpopulation is the cause of India's poverty.'
'The stability of British politics is caused by the institution
of monarchy.'
'Wars are caused by the economic conditions.'
'There is no life on the moon because of the lack of an
atmosphere containing oxygen.'
'Chemical reactions are caused by the affinity of molecules.'
The common feature to which I wish to draw your attention
is the fact that these sentences state timeless relations. They
say that one thing or one situation A causes another B, meaning
apparently that the existence of B depends on A, or that if A
were changed or absent, B would also be changed or absent.
Compare these statements with the following:
'The Indian famine of 1946 was caused by a bad harvest.'
'The fall of Hitler was caused by the defeat of his armies.'
'The American war of secession was caused by the economic
situation of the slave states.'
'Life could develop on earth because of the formation of an
atmosphere containing oxygen.'
'The destruction of Hiroshima was caused by the explosion
of an atomic bomb.'
In these sentences one definite event A is regarded as the
cause of another B; both events are more or less fixed in space
and time. I think that these two different shades of the cause-effect
relation are both perfectly legitimate. The common factor
is the idea of dependence, which needs some comment. This
concept of dependence is clear enough if the two things connected
are concepts themselves, things of the mind, like two
numbers or two sets of numbers; then dependence means what
the mathematician expresses by the word 'function'. This
logical dependence needs no further analysis (I even think it
cannot be further analysed). But causality does not refer to
logical dependence; it means dependence of real things of nature
on one another. The problem of what this means is not simple
at all. Astrologers claim the dependence of the fate of human
beings on the constellations of stars. Scientists reject such statements,
but why? Because science accepts only relations of
dependence if they can be verified by observation and experiment,
and we are convinced that astrology has not stood this
test. Science insists on a criterion for dependence, namely
repetitive observation or experiment: either the things A and B
refer to phenomena, occurring repeatedly in Nature and being
sufficiently similar for the aspect in question to be considered
as identical; or repetition can be artificially produced by
experiment.
Observation and experiment are crafts which are systematically
taught. Sometimes, by a genius, they are raised to the
level of an art. There are rules to be observed: isolation of the
system considered, restriction of the variable factors, varying
of the conditions until the dependence of the effect on a single
factor becomes evident; in many cases exact measurements and
comparison of figures are essential. The technique of handling
these figures is a craft in itself, in which the notions of chance
and probability play a decisive part; we shall return to this
question at a later stage. So it looks as if science has a methodical
way of finding causal relations without referring to any metaphysical
principle. But this is a deception. For no observation
or experiment, however extended, can give more than a finite
number of repetitions, and the statement of a law 'B depends on
A' always transcends experience. Yet this kind of statement
is made everywhere and all the time, and sometimes from
scanty material. Philosophers call it Inference by Induction,
and have developed many a profound theory of it. I shall not
enter into a discussion of these speculations. But I have to
make it clear why I distinguish this principle of induction from
causality. Induction allows one to generalize a number of observations
into a general rule: that night follows day and day follows
night, or that in spring the trees grow green leaves, are inductions,
but they contain no causal relation, no statement of
dependence. The method of inductive thinking is more general
than causal thinking; it is used in everyday life as a matter of
course, and it applies in science to the descriptive and experimental
branches as well. But while everyday life has no definite
criterion for the validity of an induction and relies more or less
on intuition, science has worked out a code, or rule of craft, for
its application. This code has been entirely successful, and I
think that is the only justification for it, just as the rules of the
craft of classical music are only justified by full audiences and
applause. Science and art are not so different as they appear.
The laws in the realms of truth and beauty are laid down by the
masters, who create eternal works.
Absolute values are ideals never reached. Yet I think that
the common effort of mankind has approached some ideals in
quite a respectable way. I do not hesitate to call a man foolish
if he rejects the teaching of experience because no logical proof
is forthcoming, or because he does not know or does not accept
the rules of the scientific craft. You find such super-logical
people sporadically among pure mathematicians, theologians,
and philosophers, while there are besides vast communities of
people ignorant of or rejecting the rules of science, among them
the members of anti-vaccination societies and believers in
astrology. It is useless to argue with them; I cannot compel
them to accept the same criteria of valid induction in which I
believe: the code of scientific rules. For there is no logical
argument for doing so; it is a question of faith. In this sense I
am willing to call induction a metaphysical principle, namely
something beyond physics.
After this excursion, let us return to causality and its two
ways of application, one as a timeless relation of dependence,
the other as a dependence of one event fixed in time and space
on another (see Appendix, 1). I think that the abstract, timeless
meaning of causality is the fundamental one. This becomes
quite evident if one tries to use the term in connexion with a
specific case without implicit reference to the abstraction. For
example: The statement that a bad harvest was the cause of the
Indian famine makes sense only if one has in mind the timeless
statement that bad harvests are causes of famines in general.
I leave it to you to confirm this with the other examples I have
given or with any more you may invent. If you drop this reference
to a general rule, the connexion between two consecutive
events loses its character of causality, though it may still retain
the feature of perfect regularity, as in the sequence of day and
night. Another example is the timetable of a railway line. You
can predict with its help the arrival at King's Cross of the
10 o'clock from Waverley; but you can hardly say that the timetable
reveals a cause for this event. In other words, the law
of the timetable is deterministic: you can predict future events
from it, but the question 'why?' makes no sense.
Therefore, I think one should not identify causality and
determinism. The latter refers to rules which allow one to
predict from the knowledge of an event A the occurrence of an
event B (and vice versa), but without the idea that there is a
physical timeless (and spaceless) link between all things of the
kind A and all things of the kind B. I prefer to use the expression
'causality' mainly for this timeless dependence. It is
exactly what experimentalists and observers mean when they
trace a certain phenomenon to a certain cause by systematic
variation of conditions. The other application of the word to
two events following one another is, however, in such common use
that it cannot be excluded. Therefore I suggest that it should be
used also, but supplemented by some 'attributes' concerning
time and space. It is always assumed that the cause precedes the
effect; I propose to call this the principle of antecedence. Further,
it is generally regarded as repugnant to assume a thing to cause an
effect at a place where it is not present, or to which it cannot be
linked by other things; I shall call this the principle of contiguity.
I shall now try to condense these considerations in a few
definitions.
Determinism postulates that events at different times are
connected by laws in such a way that predictions of unknown
situations (past or future) can be made.
By this formulation religious predestination is excluded, since
it assumes that the book of destiny is only open to God.
Causality postulates that there are laws by which the occurrence
of an entity B of a certain class depends on the
occurrence of an entity A of another class, where the word
'entity' means any physical object, phenomenon, situation,
or event. A is called the cause, B the effect.
If causality refers to single events, the following attributes
of causality must be considered:
Antecedence postulates that the cause must be prior to, or at
least simultaneous with, the effect.
Contiguity postulates that cause and effect must be in spatial
contact or connected by a chain of intermediate things in
contact.
III
ILLUSTRATION: ASTRONOMY AND PARTICLE
MECHANICS
I SHALL now illustrate these definitions by surveying the
development of physical science. But do not expect an ordinary
historical treatment. I shall not describe how a great man
actually made his discoveries, nor do I much care what he himself
said about it. I shall try to analyse the scientific situation
at the time of the discovery, judged by a modern mind, and
describe it in terms of the definitions given.
Let us begin with the oldest science, astronomy. Pre-Newtonian
theory of celestial motions is an excellent example
of a mathematical and deterministic, yet not causal, description.
This holds for the Ptolemaic system as well as for the Copernican,
including Kepler's refinements. Ptolemy represented the motion
of the planets by kinematic models, cycles, and epicycles rolling
on one another and on the fixed heavenly sphere. Copernicus
changed the standpoint and made the sun the centre of cyclic
planetary motion, while Kepler replaced the cycles by ellipses.
I do not wish to minimize the greatness of Copernicus' step in
regard to the conception of the Universe. I just consider it from
the standpoint of the question which we are discussing. Neither
Ptolemy nor Copernicus nor Kepler states a cause for the behaviour
of the planets, except the ultimate cause, the will of the
Creator. What they do is, in modern mathematical language,
the establishment of functions,
\[ x_1 = f_1(t), \quad x_2 = f_2(t), \quad \ldots \]
for the coordinates of all particles, depending on time. Copernicus
himself claimed rightly that his functions, or more accurately
the corresponding geometrical structures, are very much simpler
than those of Ptolemy, but he refrained from advocating the
cosmological consequences of his system. This question came
to the foreground long after his death mainly by Galileo's telescopic
observations, which revealed in Jupiter and his satellites
a repetition of the Copernican system on a smaller scale.
Descartes's cosmology can be regarded as an early attempt to
establish causal laws for the planetary orbits by assuming a
complicated vortex motion of some kind of ether, and it is
remarkable that this construction satisfies contiguity. But it
failed because it lacked the main feature of scientific progress ;
it was not based on a reasonable induction from facts. Of course,
no code of rules existed, nor did Descartes's writings provide it
at that time. The principles of the code accepted today are
implicitly contained in the works of Galileo and Newton, who
demonstrated them with their actual discoveries in physics and
astronomy in the same way as Haydn established the rules of
the sonata by writing lovely music in this form.
Galileo's work precedes Newton's not only in time but also
in logical order ; for Galileo was experimenting with terrestrial
objects according to the rules of repetition and variation of conditions,
while Newton's astronomical material was purely observational
and restricted. Galileo observed how a falling body
moves, and studied the conditions on which the motion depends.
His results can be condensed into the well-known formula for the
vertical coordinate of a small body or 'particle' as a function of
time,
\[ z = -\tfrac{1}{2} g t^2, \tag{3.1} \]
where g is a constant, i.e. independent not only of time but also
of the falling body. The only thing this quantity g can depend
upon is the body towards which the motion is taking place, the
earth, a conclusion which is almost too obvious to be formulated;
for if the motion is checked by my hand, I feel the weight
as a pressure directed downwards towards the earth. Hence the
constant g must be interpreted as a property of the earth, not of
the falling body.
Using Newton's calculus (denoting the time derivative by a
dot) and generalizing for all three coordinates, one obtains the
equations
\[ \ddot{x} = 0, \quad \ddot{y} = 0, \quad \ddot{z} = -g, \tag{3.2} \]
which describe the trajectories of particles upon the earth with
arbitrary initial positions and velocities.
These formulae condense the description of an infinite number
of orbits and motions in one single simple statement: that some
property of the motion is the same for the whole class, independent
of the individual case, therefore depending only on the one
other thing involved, namely the earth. Hence this property,
namely the vertical acceleration, must be 'due to the earth', or
'caused by the earth', or 'a force exerted by the earth'.
This word 'force' indicates a specification of the general
notion of cause, namely a measurable cause, expressible in
figures. Apart from this refinement, Galileo's work is just a case
of ordinary causality in the sense of my definition.
Yet the law (3.2) involves time, since the 'effect' of the force
is an acceleration, the rate of change of velocity in time. This
is the actual result of observation and measurement, and has no
metaphysical root whatsoever. A consequence of this fact is the
deterministic character of the law (3.2) : if the position and the
velocity of a particle are given at any time, the equations determine
its position and velocity at any other time.
In fact, any other time in the past or future. This shows that
Galileo's law does not conform to the postulate of antecedence:
a given initial situation cannot be regarded as the cause of a
later situation, because the relation between them is completely
symmetrical; each determines the other. This is closely connected
with the notion of time which Galileo used, and which
Newton took care to define explicitly.
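The symmetry between prediction and retrodiction can be made concrete with a small numerical sketch. It assumes an upward-pointing vertical axis, so that the vertical acceleration is -g, and a value g = 9.81 m/s^2 that is not given in the text; the function name `state_at` is hypothetical.

```python
G = 9.81  # assumed value of g in m/s^2 (illustrative, not from the text)

def state_at(t, z0, v0, t0=0.0, g=G):
    """Return (z, v) at time t for vertical motion with constant
    acceleration -g, given position z0 and velocity v0 at time t0.
    The same formula serves for t later or earlier than t0."""
    dt = t - t0
    z = z0 + v0 * dt - 0.5 * g * dt * dt
    v = v0 - g * dt
    return z, v

# Predict the state two seconds after release from rest ...
z_later, v_later = state_at(2.0, z0=0.0, v0=0.0, t0=0.0)

# ... then treat the later state as "given" and recover the earlier one.
z_back, v_back = state_at(0.0, z0=z_later, v0=v_later, t0=2.0)
```

Up to rounding error, `z_back` and `v_back` return to the initial values, so neither state has a better claim than the other to be called the cause.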
The postulate of contiguity is also violated by Galileo's law
since the action of the earth on the moving particle needs
apparently no contact. But this question is better discussed in
connexion with Newton's generalization.
Newton applied Galileo's method to the explanation of celestial
motions. The material on which he based his deductions was
scanty indeed; for at that time only six planets (including the
earth) and a few satellites of these were known. I say 'deductions',
for the essential induction had already been made by
Kepler when he announced his three laws of planetary motion as
valid for planets in general. The first two laws, concerning the
elliptic shape of the orbit and the increase of the area swept by
the radius vector, were based mainly on Tycho Brahe's observations
of Mars, i.e. of one single planet. Generalized by a sweeping
induction to any planet they are, according to Newton, equivalent
to the statement that the acceleration is always directed
towards the sun and varies inversely as the square of the distance
r from the sun, $\mu r^{-2}$, where $\mu$ is a constant which may differ
from planet to planet. But it is the third law which reveals the
causal relation to the sun. It says that the ratio of the square
of the period and the cube of the principal axis is the same for
all planets, a statement induced from data about the six known planets.
This implies, as Newton showed (see Appendix, 2), that the
constant $\mu$ is the same for all planets. Hence as in Galileo's case,
it can depend only on the single other body involved, the sun.
In this way the interpretation is obtained that the centripetal
acceleration $\mu r^{-2}$ is 'due to the sun', or 'caused by the sun', or
'a force exerted by the sun'.
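The induction from the third law can be checked with a short calculation. The sketch below uses the standard relation between the period T and semi-major axis a of a Kepler orbit, T^2 = 4 pi^2 a^3 / mu, and approximate modern orbital elements for the six planets known to Newton. Both the data and the names are assumptions added for illustration and do not come from the lectures.

```python
import math

# Approximate modern values (assumed): semi-major axis in astronomical
# units, sidereal period in years.
PLANETS = {
    "Mercury": (0.387, 0.2408),
    "Venus": (0.723, 0.6152),
    "Earth": (1.000, 1.0000),
    "Mars": (1.524, 1.8808),
    "Jupiter": (5.203, 11.862),
    "Saturn": (9.537, 29.457),
}

def mu(a, period):
    """Constant of the acceleration law mu / r**2, from T**2 = 4 pi**2 a**3 / mu."""
    return 4 * math.pi ** 2 * a ** 3 / period ** 2

values = {name: mu(a, period) for name, (a, period) in PLANETS.items()}

# Relative spread between the largest and smallest value of mu.
spread = max(values.values()) / min(values.values()) - 1
```

With these figures the values of mu agree to well within one per cent, which is the numerical content of the statement that the constant is the same for all planets.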
The moon and the other planetary satellites were then the
material for the induction which led to the generalization of a
mutual attraction of all bodies towards one another. The most
amazing step, rightly admired by Newton's contemporaries and
later generations, was the inclusion of terrestrial bodies in the
law derived from the heavens. This is in fact the idea symbolized
by the apocryphal story of the falling apple: terrestrial gravity
was regarded by Newton as identical with celestial attraction.
By applying his laws of motion to the system earth-moon, he
could calculate Galileo's constant of gravity g from geodetical
and astronomical data: namely,
\[ g = \frac{4\pi^2 R^3}{r^2 T^2}, \tag{3.3} \]
where r is the radius of the earth, R the distance between the
centres of earth and moon, and T the time of revolution of
the moon (sidereal month).
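A rough numerical check of this calculation is easy to make. The sketch assumes a circular lunar orbit, so that the moon's centripetal acceleration 4 pi^2 R / T^2 equals mu / R^2, and then evaluates the same inverse-square law at the earth's surface, g = mu / r^2. The numerical values are approximate modern figures, not data quoted in the lectures, and the function name is hypothetical.

```python
import math

# Approximate modern values (assumed, in SI units).
MOON_DISTANCE = 3.844e8          # R: distance between centres of earth and moon, m
EARTH_RADIUS = 6.371e6           # r: radius of the earth, m
SIDEREAL_MONTH = 27.32 * 86400   # T: time of revolution of the moon, s

def g_from_moon(R, r, T):
    """Surface gravity inferred from the moon's orbit.

    Circular orbit: 4 pi^2 R / T^2 = mu / R^2, hence mu = 4 pi^2 R^3 / T^2,
    and at the earth's surface g = mu / r^2.
    """
    mu_earth = 4 * math.pi ** 2 * R ** 3 / T ** 2
    return mu_earth / r ** 2

g = g_from_moon(MOON_DISTANCE, EARTH_RADIUS, SIDEREAL_MONTH)
```

The result comes out close to the familiar terrestrial value of about 9.8 m/s^2; the small excess reflects the idealizations made here, such as ignoring the moon's own mass.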
The general equations for the motion of n particles under
mutual gravitation read in modern vector notation
(3.5)
where $\mathbf{r}_\alpha$ is the position vector of the particle $\alpha$ ($\alpha = 1, 2, \ldots, n$),
$r_{\alpha\beta} = |\mathbf{r}_\alpha - \mathbf{r}_\beta|$ the distance of two particles $\alpha$ and $\beta$, V the
potential of the gravitational forces.
Newton also succeeded in generalizing the laws of motion for
other non-gravitational forces by introducing the notion of mass,
or more precisely of inertial mass. Newton's method of representing
his results in an axiomatic form does not reveal the way
he obtained them. It is, however, possible to regard this step as
a case of ordinary causality derived by induction. One has to
observe the acceleration of different particles produced by the
same non-gravitational (say elastic) forces at the same point of
space; they are found to differ, but not in direction, only in
magnitude. Therefore, one can infer by induction the existence
of a scalar factor characteristic for the resistance of a particle
against acceleration or its inertia. This factor is called 'mass'.
It may still depend on velocity as is assumed in modern theory
of relativity. This can be checked by experiment, and as in
Newton's time no such effect could be observed, the mass was
regarded as a constant.
Then the generalized equations of motion read
$$ m_\alpha \ddot{\mathbf r}_\alpha = \mathbf F_\alpha, \tag{3.6} $$
where $m_\alpha$ ($\alpha = 1, 2, \ldots, n$) are the masses and $\mathbf F_\alpha$ the force vectors
which depend on the mutual distances $r_{\alpha\beta}$ of all the particles.
As in the case of gravitation, they may be derivable from a
potential V by the operation
$$ \mathbf F_\alpha = -\operatorname{grad}_\alpha V, \tag{3.7} $$
where V is a function of the $r_{\alpha\beta}$. The most general form of V for
forces inverse to the square of the distance would be ($\sum'_{\alpha,\beta}$ means
summing over all $\alpha$, $\beta$ except $\alpha = \beta$)
where $\mu_{\alpha\beta}$ are constants; comparison with (3.4) and (3.5) shows,
however, that these must have the form
$$ \mu_{\alpha\beta} = m_\alpha \mu_\beta. \tag{3.9} $$
Newton applies further a law of symmetry, stated axiomatically,
namely that action equals reaction, or $\mu_{\alpha\beta} = \mu_{\beta\alpha}$, from
which follows
$$ \mu_\beta = \kappa m_\beta, \tag{3.10} $$
where $\kappa$ is a universal constant, the constant of gravitation.
Hence the constants of attraction or gravitational masses $\mu_\alpha$
are proportional to the inertial masses $m_\alpha$.
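The step leading to (3.10) may be spelled out as follows. This is only a sketch of the implied argument, assuming that the constants have the product form $\mu_{\alpha\beta} = m_\alpha \mu_\beta$ and that the symmetry $\mu_{\alpha\beta} = \mu_{\beta\alpha}$ holds for every pair of particles.

```latex
\begin{align*}
\mu_{\alpha\beta} = \mu_{\beta\alpha}
  &\;\Longrightarrow\; m_\alpha \mu_\beta = m_\beta \mu_\alpha \\
  &\;\Longrightarrow\; \frac{\mu_\alpha}{m_\alpha} = \frac{\mu_\beta}{m_\beta}
     \quad\text{for every pair } \alpha \neq \beta \\
  &\;\Longrightarrow\; \mu_\beta = \kappa\, m_\beta .
\end{align*}
```

Since the ratio has the same value for every pair, it cannot depend on the particle at all, and this common value is what is denoted by $\kappa$.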
Neither Newton himself, nor many generations of physicists
and astronomers after him have paid much attention to the law
expressed by (3.10). Astronomical observations left little doubt
that it was correct, and it was proved by terrestrial observations
(with suitable pendulums) to hold with extreme accuracy
(Eötvös and others). Two centuries went by before Einstein saw
the fundamental problem contained in the simple equation
(3.10), and built on it the colossal structure of his theory of
general relativity, to which we have to return later.
But this is not our concern here. We have to examine
Newton's equations from the standpoint of the principle of
causality. I hope I have made it clear that they imply the notion
of cause exactly in the same sense as it is always used by the
experimentalist, namely signifying a verifiable dependence of
one thing on another. Yet this one thing is, in Galileo's and
Newton's theory alike, a peculiar quantity, namely an acceleration.
The peculiarity is not only that it cannot be seen or read
from a measuring tape, but that it contains the time implicitly.
In fact, Newton's equations determine the motion of a system
in time completely for any given initial state (position and
velocity of all particles involved). In this way, 'causation' leads
to 'determination', not as a new metaphysical principle, but as a
physical fact, like any other. However, just as in Galileo's
simpler case, so here the relation between two consecutive configurations
of the system is mutual and symmetrical. This has a
bearing on the question whether the principle of antecedence
holds. As this applies, according to our definition, only to the
cause-effect relation between single events, one has to change
the standpoint. Instead of considering the acceleration of one
body to be caused by the other bodies, one considers two
consecutive configurations of the whole system and asks whether
it makes sense to call the earlier one the cause of the later one.
But it makes no sense, for the relation between the two states is
symmetrical. One could, with the same right, call the later
configuration the cause of the earlier one.
The root of this symmetry is Newton's definition of time.
Whatever he says about the notion of time (in Principia,
Scholium I) as a uniform flow, the use he makes of it contains
nothing of a flow in one direction. Newton's time is just an
independent variable t appearing in the equations of motion, in
such a way that if t is changed into $-t$, the equations remain
the same. It follows that, if all velocities are reversed, the
system just goes back the same way; it is completely reversible.
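This reversibility can be illustrated numerically. The sketch below is an assumption-laden toy model, not anything from the lectures: one particle moves in a plane under an inverse-square attraction towards a fixed centre with $\mu = 1$, and the motion is integrated with the velocity-Verlet scheme, chosen because the scheme is itself symmetric under $t \to -t$. The names `acceleration` and `verlet` and all numerical values are hypothetical.

```python
def acceleration(x, y, mu=1.0):
    """Inverse-square attraction towards the origin (mu is an assumed constant)."""
    r_cubed = (x * x + y * y) ** 1.5
    return -mu * x / r_cubed, -mu * y / r_cubed


def verlet(state, dt, steps):
    """Advance (x, y, vx, vy) by `steps` velocity-Verlet steps of size dt."""
    x, y, vx, vy = state
    ax, ay = acceleration(x, y)
    for _ in range(steps):
        vx += 0.5 * dt * ax          # half kick
        vy += 0.5 * dt * ay
        x += dt * vx                 # drift
        y += dt * vy
        ax, ay = acceleration(x, y)
        vx += 0.5 * dt * ax          # second half kick
        vy += 0.5 * dt * ay
    return x, y, vx, vy


start = (1.0, 0.0, 0.0, 0.8)                      # a bound, elliptic orbit
forward = verlet(start, dt=0.001, steps=5000)

# Reverse all velocities and integrate for the same length of time.
turned = (forward[0], forward[1], -forward[2], -forward[3])
back = verlet(turned, dt=0.001, steps=5000)

# The system should be back at its starting point with reversed velocity.
return_error = max(
    abs(back[0] - start[0]),
    abs(back[1] - start[1]),
    abs(back[2] + start[2]),
    abs(back[3] + start[3]),
)
print(f"largest deviation after retracing the path: {return_error:.2e}")
```

The deviation is limited only by rounding error, which is the numerical counterpart of the statement that the equations do not distinguish the two directions of time.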
Newton's time variable is obviously an idealization abstracted
from simple mechanical models and astronomical observations,
fitting well into celestial motion, but not into ordinary experience.
To us it appears that life on earth is going definitely in one
direction, from past to future, from birth to death, and the
perception of time in our mind is that of an irresistible and
irreversible current.
Another feature of Newton's dynamics was repugnant to
many of his contemporaries, in particular the followers of
Descartes, whose cosmology, whatever else its shortcomings,
satisfied the principle of contiguity, as I have called the condition
that cause and effect should be in spatial contact. Newton's
forces, the quantitative expressions for causes of motion, are supposed
to act through empty space, so that cause and effect are
simultaneous whatever the distance. Newton himself refrained
from entering into a metaphysical controversy and insisted that
the facts led unambiguously to his results. Indeed, the language
of facts was so strong that they silenced the philosophical objections,
and only when new facts revealed to a later generation the
propagation of forces with finite velocity, was the problem of
contiguity in gravitation taken up. In spite of these difficulties,
Newton's dynamics has served many generations of physicists
and is useful, even indispensable, today.
IV
CONTIGUITY
MECHANICS OF CONTINUOUS MEDIA
ALTHOUGH I maintain that neither causality itself nor its
attributes, which I called the principles of antecedence and
contiguity, are metaphysical, and that only the inference by
induction transcends experience, there is no doubt that these
ideas have a strong power over the human mind, and we have
evidence enough that they have influenced the development of
classical physics. Much effort has been made to reconcile
Newton's laws with these postulates. Contiguity is closely bound
up with the introduction of contact forces, pressures, tensions,
first in ordinary material bodies, then in the electromagnetic
ether, and thus to the idea of fields of forces; but the systematic
application of contiguity to gravitation exploded Newton's
theory, which was superseded by Einstein's relativity. Similar
was the fate of the postulate of antecedence; it is closely bound
up with irreversibility in time, and found its first quantitative
formulation in thermodynamics. The reconciliation of it with
Newton's laws was attempted by atomistics and physical
statistics; the idea being that accumulations of immense numbers
of invisible Newtonian particles, atoms, or molecules appear
to the observer to have the feature of irreversibility for statistical
reasons. The atoms were first hypothetical, but soon they were
taken seriously, and one began to search for them, with increasing
success. They became more and more real, and finally even
visible. And then it turned out that they were no Newtonian
particles at all. Whereupon the whole classical physics exploded,
to be replaced by quantum theory. Looked at from the point
of view of our principles, the situation in quantum theory is
reversed. Determinism (which is so prominent a characteristic
of Newton's theory) is abandoned, but contiguity and antecedence
(violated by Newton's laws) are preserved to a considerable
degree. Causality, which in my formulation is independent
of antecedence and contiguity, is not affected by these changes:
scientific work will always be the search for causal interdependence
of phenomena.
After this summary of the following discussion, let us return to
the question why violation of the principles of contiguity and antecedence
in Newton's theory was first accepted, though not without
protest, but later amended and finally rejected. This change
was due to the transition from celestial to terrestrial mechanics.
The success of Newton's theory was mainly in the field of
planetary motion, and there it was overwhelming indeed. It is
not my purpose to expand on the history of astronomy after
Newton; it suffices to remember that the power of analytical
mechanics to describe and predict accurately the observations
led many to the conviction that it was the final formulation of
the ultimate laws of nature.
The main attention was paid to the mathematical investigation
of the equations of motion, and the works of Lagrange,
Laplace, Gauss, Hamilton, and many others are a lasting
memorial of this epoch. Of all these writings, I shall dwell only
for a moment on that of Hamilton, because his formulation of
Newton's laws is the most general and elegant one, and because
it will be used over and over again in the following lectures.
So permit me a short mathematical interlude which has nothing
directly to do with cause and chance.
Hamilton considers a system of particles described by any
(in general non-Cartesian) coordinates $q_1, q_2, \ldots$; then the potential
energy is a function of these, $V(q_1, q_2, \ldots)$, or shortly $V(q)$, and
the kinetic energy T a function of both the coordinates and the
generalized velocities $\dot q_1, \dot q_2, \ldots$, $T(q, \dot q)$. He then defines generalized
momenta
$$ p_\alpha = \frac{\partial T}{\partial \dot q_\alpha} \tag{4.1} $$
and regards the total energy $T+V$ as a function of the $q_\alpha$ and $p_\alpha$.
This function
$$ T + V = H(q, p) \tag{4.2} $$
is today called the Hamiltonian.
The equations of motion assume the simple 'canonical' form
$$ \dot q_\alpha = \frac{\partial H}{\partial p_\alpha}, \qquad \dot p_\alpha = -\frac{\partial H}{\partial q_\alpha}, \tag{4.3} $$
from which one reads at once the conservation law of energy,
$$ \frac{dH}{dt} = 0, \qquad H = \text{const.} \tag{4.4} $$
It is this set of formulae which has survived the most violent
revolution of physical ideas which has ever taken place, the
transition to quantum mechanics.
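A small numerical illustration of the canonical equations may be useful. The sketch below assumes the standard form $\dot q = \partial H/\partial p$, $\dot p = -\partial H/\partial q$ and applies it to a one-dimensional harmonic oscillator, $H = p^2/2m + kq^2/2$; the oscillator, the leapfrog stepping scheme, and all names and numbers are choices made here for illustration and do not come from the lectures.

```python
def hamiltonian(q, p, m=1.0, k=1.0):
    """Total energy H = T + V of an (assumed) harmonic oscillator."""
    return p * p / (2.0 * m) + 0.5 * k * q * q


def leapfrog_step(q, p, dt, m=1.0, k=1.0):
    """One step of the canonical equations dq/dt = dH/dp, dp/dt = -dH/dq."""
    p -= 0.5 * dt * k * q      # half step in p, using -dH/dq = -k*q
    q += dt * p / m            # full step in q, using  dH/dp = p/m
    p -= 0.5 * dt * k * q      # second half step in p
    return q, p


q, p = 1.0, 0.0
energy_start = hamiltonian(q, p)
for _ in range(10_000):
    q, p = leapfrog_step(q, p, dt=0.01)
energy_drift = abs(hamiltonian(q, p) - energy_start)
print(f"change of H after 10000 steps: {energy_drift:.2e}")
```

The value of H stays constant up to a small error of the order of the square of the step size, which is the discrete shadow of the conservation law read off from the canonical equations.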
Returning now to the post-Newtonian period, there was,
simultaneous with the astronomical applications and confirmations
of the theory, a lively interest in applying it to ordinary
terrestrial physics. Even here Newton had shown the way and
had calculated, for instance, the velocity of sound in a fluid.
Eventually the mechanics of elastic solids brought about a
modification of Newton's definition of force which satisfies contiguity.
Much of this work is due to the great mathematician
Cauchy. He started, as many before him, by treating a solid as
an aggregate of tiny particles, acting on one another with
Newtonian non-contiguous forces of short range, anticipating
to some degree the modern atomistic standpoint. But there was
of course, at that time, no evidence of the physical reality of
these particles. In the physical applications all traces of them
were obliterated by averaging. The form of these results suggested
to Cauchy another method of approach where particle
mechanics is completely discarded. Matter is considered to be a
real continuum in the mathematical sense, so that it has a
meaning to speak of a force between two pieces of matter
separated by a surface. This seems to be, from our modern
standpoint, a step in the wrong direction, as we know matter
to be discontinuous. But Cauchy's work showed how contiguity
could be introduced into mechanics; the importance of
this point became evident when the new method was applied to
the ether, the carrier of light and of electric and magnetic forces,
which even today is still regarded as continuous though it has
lost most of the characteristic properties of a substance and can
hardly be called a continuous medium.
In this theory all laws appear in the form of partial differential
equations, in which the three space-coordinates appear together
with the time as independent coordinates.
I shall give a short sketch of the mechanics of continuous
media.
Mass, velocity, and all other properties of matter are considered
continuously distributed in space. The mass per unit
volume or density $\rho$ is then a function of the space coordinates,
and the same holds for the current of mass $\mathbf u = \rho\mathbf v$ (namely the
quantity of mass passing through a surface per unit area and
unit time). The conservation (indestructibility) of mass then
leads to the so-called continuity equation (see Appendix, 3)
$$ \frac{\partial\rho}{\partial t} + \operatorname{div}\mathbf u = 0. \tag{4.5} $$
Concerning the forces, one has to assume that, if the substance
is regarded as separated into two parts by a surface, each
part exerts a push or pull through this surface on the other which,
measured per unit area, is called tension or stress. A simple
mathematical consideration, based on the equilibrium conditions
for the resultant forces acting on the surfaces of a volume
element, shows that it suffices to define these tension forces for
three non-coplanar surface elements, say those parallel to the
three coordinate planes; the force on the element normal to x
being $\mathbf T_x$ with components $T_{xx}, T_{xy}, T_{xz}$, the other two forces
correspondingly $\mathbf T_y$ ($T_{yx}, T_{yy}, T_{yz}$) and $\mathbf T_z$ ($T_{zx}, T_{zy}, T_{zz}$). Then the
force on a surface element with the normal unit vector
$\mathbf n$ ($n_x, n_y, n_z$)
is given by
$$ \mathbf T_n = \mathbf T_x n_x + \mathbf T_y n_y + \mathbf T_z n_z. \tag{4.6} $$
Application of the law of moments to a small volume element
shows (see Appendix, 3) that
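$$ T_{yz} = T_{zy}, \qquad T_{zx} = T_{xz}, \qquad T_{xy} = T_{yx}. \tag{4.7} $$
Hence the quantities T form a symmetrical matrix, the stress
tensor
$$ \mathsf T = \begin{pmatrix} T_{xx} & T_{xy} & T_{xz} \\ T_{yx} & T_{yy} & T_{yz} \\ T_{zx} & T_{zy} & T_{zz} \end{pmatrix}. \tag{4.8} $$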
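The rule for the force on an arbitrary surface element can be made concrete with a few lines of code. This is a minimal sketch under the convention used above, namely that row i of the matrix holds the force per unit area on the element normal to axis i; the function names and the sample numbers are hypothetical.

```python
def is_symmetric(stress):
    """Check the symmetry T_ij = T_ji required by the law of moments."""
    return all(stress[i][j] == stress[j][i] for i in range(3) for j in range(3))


def traction(stress, normal):
    """Force per unit area on a surface element with unit normal `normal`.

    Implements T_n = T_x n_x + T_y n_y + T_z n_z component by component,
    where stress[i] is the force vector on the element normal to axis i.
    """
    return tuple(
        sum(stress[i][j] * normal[i] for i in range(3)) for j in range(3)
    )


# An arbitrary symmetric example (values invented for illustration).
example_stress = [
    [2.0, 1.0, 0.0],
    [1.0, 3.0, 0.0],
    [0.0, 0.0, 1.0],
]

print(is_symmetric(example_stress))
print(traction(example_stress, (1.0, 0.0, 0.0)))   # force on the x-normal element
print(traction(example_stress, (0.0, 0.6, 0.8)))   # force on an oblique element
```

For a normal along a coordinate axis the function simply returns the corresponding row, so that the three rows are recovered as special cases of the general rule.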
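Newton's law applied to a volume element then leads to the
equations
$$ \rho\,\frac{d\mathbf v}{dt} = \operatorname{div}\mathsf T, \tag{4.9} $$
where div T is a vector with the components
$$ (\operatorname{div}\mathsf T)_x = \frac{\partial T_{xx}}{\partial x} + \frac{\partial T_{yx}}{\partial y} + \frac{\partial T_{zx}}{\partial z}, \ \ldots, \tag{4.10} $$
and d/dt the operator
$$ \frac{d}{dt} = \frac{\partial}{\partial t} + v_x\frac{\partial}{\partial x} + v_y\frac{\partial}{\partial y} + v_z\frac{\partial}{\partial z}, \tag{4.11} $$
which is called the 'convective derivative'.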
(4.9) together with (4.5) are the new equations of motion
which satisfy the postulate of contiguity. They are the prototype
for all subsequent field theories. In the present form they
are still incomplete and rather void of meaning, as the stress
tensor is not specified in its dependence on the physical conditions
of the system, just in the same way as Newton's equations
are void of meaning if the forces are not specified with their
dependence on the configuration of the particles. The configuration
of a continuous system cannot be described by the values
of a finite number of variables, but by certain space functions,
called 'strain-components'. They are defined in this way:
A small (infinitesimal) volume of initially spherical shape will be
transformed by the deformation into an ellipsoid; the equation
of this has the form
$$ e_{11}x^2 + e_{22}y^2 + e_{33}z^2 + 2e_{23}yz + 2e_{31}zx + 2e_{12}xy = \epsilon, \tag{4.12} $$
where $\epsilon$ is an (infinitesimal) constant, measuring the absolute
dimensions, and $e_{11}, e_{22}, \ldots, e_{12}$ are six quantities depending on
the position x, y, z of the centre of the sphere. These $e_{ij}$ are the
components of the strain tensor e.
In the theory of elasticity it is assumed that the stress components
$T_{ij}$ are linear functions of the strain components $e_{ij}$
(Hooke's law).
In hydrodynamics the relation between T and e involves
space- and time-derivatives of $e_{ij}$. In plastic solids the situation
is still more complicated.
We need not enter into these different branches of the
mechanics of continuous media. The only important point for
us is this: Contact forces spread not instantaneously but with
finite velocity. This is the main feature distinguishing Cauchy's
contiguous mechanics from Newton's non-contiguous. The
simplest example is an elastic fluid (liquid or gas). Here the
stress tensor T has only diagonal elements which are equal and
represent the pressure p. The configuration can also be described
by one variable, the density $\rho$ or, for a given mass, the volume V.
The relation between p and V may be any function $p = f(V)$; we
shall have to remember this later when we have to deal with
thermodynamics. For small disturbances of equilibrium the
general equations reduce to linear ones; any quantity $\phi$ in an
isotropic fluid (change of volume or pressure) satisfies the linear
wave equation
$$ \Delta\phi = \frac{1}{c^2}\frac{\partial^2\phi}{\partial t^2}, \tag{4.13} $$
where $\Delta$ is Laplace's differential operator
$$ \Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}, \tag{4.14} $$
and c a constant which is easily found to mean the phase
velocity of a plane harmonic wave
$$ \phi = A \sin\frac{2\pi}{\lambda}(x - ct). $$
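That the plane harmonic wave satisfies the wave equation, with c as its phase velocity, can be checked numerically. The following sketch assumes the one-dimensional form of the equation, $\partial^2\phi/\partial x^2 = c^{-2}\,\partial^2\phi/\partial t^2$, and approximates both second derivatives by central differences; the amplitude, wavelength, velocity, and sample point are arbitrary values chosen here.

```python
import math

AMPLITUDE = 1.0      # A, assumed
WAVELENGTH = 2.0     # lambda, assumed
PHASE_VELOCITY = 3.0 # c, assumed


def plane_wave(x, t):
    """phi = A sin((2*pi/lambda) * (x - c*t))."""
    return AMPLITUDE * math.sin(
        2.0 * math.pi / WAVELENGTH * (x - PHASE_VELOCITY * t)
    )


def second_difference(f, u, h):
    """Central-difference approximation of the second derivative of f at u."""
    return (f(u + h) - 2.0 * f(u) + f(u - h)) / (h * h)


x0, t0, h = 0.3, 0.7, 1e-4
phi_xx = second_difference(lambda x: plane_wave(x, t0), x0, h)
phi_tt = second_difference(lambda t: plane_wave(x0, t), t0, h)

# Residual of the wave equation at the sample point.
residual = abs(phi_xx - phi_tt / PHASE_VELOCITY ** 2)
print(f"residual of the wave equation: {residual:.2e}")
```

The residual is small compared with the individual terms, and it would not be if a different constant were used in place of c on the right-hand side.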
The equation (4.13) links up mechanics with other branches
of physics which have independently developed, optics and
electromagnetism.
ELECTROMAGNETIC FIELDS
The history of optics, in particular Newton's contributions
and his dispute with Huygens about the corpuscular or wave
nature of light, is so well known that I need not speak about it.
A hundred years after Newton, the wave nature of light was
established by Young and Fresnel with the help of experiments
on diffraction and interference. Wave equations of the type
(4.13) were used as a matter of course to describe the observations,
where now $\phi$ means the amplitude of the vibration.
But what is it that vibrates? A name, 'ether', was ready to
hand, and its ability to propagate transverse waves suggested
that it was comparable to an elastic solid. In this way it came
to pass that the ether-filled vacuum was the carrier of contact
forces, spreading with finite velocity. They existed for a long
period peacefully beside Newton's instantaneous forces of
gravitation, and other similar forces introduced to describe
elementary experiences in electricity and magnetism. These
forces are usually connected with Coulomb's name, who verified
them by direct measurements of the intensity of attraction and
repulsion between small charged bodies, and between the
poles of needle-shaped magnets. He found a law of the same
type as that of Newton, of the form $\mu r^{-2}$, where the constant $\mu$
depends on the state of electrification or magnetization respectively
of the interacting particles; by applying the law of
action and reaction $\mu$ can be split into factors, $\mu = e_1 e_2$ in the
electric case, where $e_1$, $e_2$ are called the charges. It must,
however, be remarked that this law was already
established earlier and with a higher degree of accuracy by
Cavendish and Priestley by an indirect reasoning, with the help
of the fact that a closed conductor screens a charged particle
from the influence of outside charges; this argument, though still
dressed up in the language of Newtonian forces, is already quite
close to notions of field theory.
It was the attempt to formulate the mechanical interactions
between linear currents (in thin wires) in terms of Newtonian
forces which entangled physics in the first part of the nineteenth
century in serious difficulties. Meanwhile Faraday had begun
his investigations unbiased by any mathematical theory, and
accumulated direct evidence for understanding electric and
magnetic phenomena with the help of contact forces. He spoke
about pressures and tensions in the media surrounding charged
bodies, using the expressions introduced in the theory of elasticity,
yet with considerable and somewhat strange modifications.
Indeed, the strangeness of these assumptions made it
difficult for his learned contemporaries to accept his ideas and
to discard the well-established Newtonian fashion of description.
Yet seen from our modern standpoint, there is no intrinsic
difference between the two methods, as long as only static and
stationary phenomena are considered. Mathematical analysis
shows that the resultant forces on observable bodies can be
expressed either as integrals over elementary contributions of
the Newtonian, or better Coulombian, type acting over the
distance, or by surface integrals of tensions derived from field
equations. This holds not only for conductors in vacuo, but also
for dielectric and magnetizable substances; it is true that in the
latter case the Coulombian forces lead to integral equations
which are somewhat involved, but the differential equations of
the field are, in spite of their simpler aspect, not intrinsically
simpler. This is often overlooked in modern textbooks. However,
in Faraday's time this equivalence of differential and
integral equations for the forces was not known, and if it had
been, Faraday would not have cared. His conviction of the
superiority of contact forces over Coulombian forces rested on
his physical intuition. It needed another, more mathematically
minded genius, Clerk Maxwell, to find the clue which made it
impossible to accept forces acting instantaneously over finite
distances: the finite velocity of propagation. It is not easy to
analyse exactly the epistemological and experimental foundations
of Maxwell's prediction, as his first papers make use of
rather weird models and the purity of his thought appears only
in his later publications. I think the process which led to Maxwell's
equations, stripped of all unnecessary verbiage and roundabout
ways, was this: By combining all the known experimental
facts about charges, magnetic poles, currents, and the forces
between them, he could establish a set of field equations connecting
the spatial and temporal changes of the electric and
magnetic field strength (force per unit charge) with the electric
charge density and current. But if these were combined with
the condition that any change of charge could occur only by
means of a current (expressed by a continuity equation analogous
to (4.5); see Appendix, 4), an inadequacy became obvious.
In the language of that time, the result was formulated by
saying that no open currents (like discharge of condensers)
could be described by this theory. Therefore something was
wrong in the equations, and an inspection showed a suspicious
feature, a lack of symmetry. The terms expressing Faraday's
induction law (production of electric force by the time variation
of the magnetic field) had no counterpart obtained by exchanging
the symbols for electric and magnetic quantities (production of
magnetic force by the time variation of the electric field). Without
any direct experimental evidence Maxwell postulated this
inverse effect and added to his equations the corresponding
term, which expresses that a change of the electric field (displacement
current) is, in its magnetic action, equivalent to an
ordinary current. It was a guess based on a belief in harmony.
Yet by some mathematical reasoning it can be connected with
one single but highly significant fact which sufficed to convince
Maxwell of the correctness of his conjecture, just as Newton
was convinced of the correctness of his law of gravitation by one
single numerical coincidence, the calculation of terrestrial
gravity from the moon's orbit. Maxwell showed that his modified
equations had solutions representing waves, the velocity, c, of
which could be expressed in terms of purely electric and magnetic
constants; for the vacuum c turned out to be equal to the ratio
of a unit of charge measured electrostatically (by Coulomb's
law) and electromagnetically (by Oersted's law). This ratio, a
quantity of the dimensions of a velocity, was known from
measurements by Kohlrausch and Weber, and its numerical
value coincided with the velocity of light. That could hardly
be accidental, indeed, and Maxwell could pronounce the electromagnetic
theory of light.
The final confirmation of Maxwell's theory was, after his
death, obtained by Hertz's discovery of electromagnetic waves.
I cannot follow the further course of events in the establishment
of electromagnetic theory. I only wish to stress the point
that the use of contact forces and field equations, i.e. the
establishment of contiguity, in electromagnetism was the result
of a long struggle against preconceptions of Newtonian origin.
This confirms my view that the question of contiguity is not a
metaphysical one, but an empirical one.
We have now to see whether the laws of electromagnetism
satisfy the principle of antecedence. An inspection of Maxwell's
equations (see Appendix, 4) shows that a reversal of time,
$t \to -t$, leaves everything, including the continuity equation,
unchanged, if the electric density and field are kept unchanged
while the electric current and magnetic field are reversed. This
is a kind of reversibility very similar to that of mechanics,
where a change of the sign of all velocities makes the system
return to its initial state. The difference is only a practical one:
a change of sign of all current densities and the whole magnetic
field is not as simple to perform as that of a finite set of velocities.
The situation is best seen by considering an electromagnetic
wave spreading from a point source; the corresponding solution
of Maxwell's equations is given by so-called retarded potentials
which express the electromagnetic state at a point P for the time
t in terms of the motion of the source at the time $t - r/c$, where r
is the distance of P from the source. But there also exist other
solutions, advanced potentials, which refer to the later time
$t + r/c$ and represent a wave contracting towards the source.
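For the scalar wave equation (4.13) the two kinds of solution can be written down explicitly. The following is only a sketch for a spherically symmetric scalar $\phi(r, t)$, not for Maxwell's full vector equations; the functions f and g are arbitrary and are introduced here purely for illustration.

```latex
\begin{align*}
\Delta\phi = \frac{1}{r}\frac{\partial^2 (r\phi)}{\partial r^2}
  \quad&\Longrightarrow\quad
  \frac{\partial^2 (r\phi)}{\partial r^2}
    = \frac{1}{c^2}\frac{\partial^2 (r\phi)}{\partial t^2}, \\
\phi(r,t) &= \frac{f(t - r/c)}{r} + \frac{g(t + r/c)}{r}.
\end{align*}
```

The first term is an expanding (retarded) wave, the second a contracting (advanced) one, and the substitution $t \to -t$ turns each kind into the other.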
Such contracting waves are of course necessary for solving
certain problems. Imagine, for instance, a spherical wave
reflected by a concentric spherical mirror. However, such a
mirror must be absolutely perfect to do its duty, and there
appears to be something improbable about the occurrence of
advanced potentials in nature. For the description of elementary
processes of emission of atoms or electrons one has
supplemented Maxwell's equations by the rule that only retarded
solutions are allowed. In this way a kind of irreversibility can
be introduced and the principle of antecedence satisfied. But
this is altogether artificial and unsatisfactory. The irreversibility
of actual electromagnetic processes has its roots in other
facts which we shall later have to describe in detail. Maxwell's
equations themselves do not satisfy the postulate of antecedence.
RELATIVITY AND THE FIELD THEORY OF GRAVITATION
The situation which we have now reached is that which I
found when I began to study almost half a century ago. There
existed, more or less peacefully side by side, Newton's mechanics
of instantaneous action over any distance, Cauchy's mechanics
of continuous substances, and Maxwell's electrodynamics, the
latter two satisfying the postulate of contiguity. Of these theories,
Maxwell's seemed to be the most promising and fertile, and the
idea began to spread that possibly all forces of nature might be of
electromagnetic origin. The problem had to be envisaged, how to
reconcile Newton's gravitational forces with the postulate of contiguity;
the solution was Einstein's general theory of relativity.
This is a long and interesting story by itself which involves
not only the notion of cause with which we are concerned here,
but other philosophical concepts, namely those concerned with
space and time. A detailed discussion of these problems would
lead us too far away from our subject, and I think it hardly
necessary to dwell on them because relativity is today widely
known and part of the syllabus of the student of mathematics
and physics as well. So I shall give a very short outline only.
The physical problems which led to the theory of relativity
were those concerned with the optical and electromagnetic
phenomena of fast-moving bodies. There are two types of
experiments: those using the high velocity of celestial bodies
(e.g. Michelson's and Morley's experiment) and those using fast
electrons or ions (e.g. Bucherer's measurement of the mass of
electrons in cathode rays as a function of the velocity). The work
of Lorentz, FitzGerald, Poincaré, and others prepared the ground
for Einstein's discovery that the root of all difficulties was the
assumption of a universal time valid for all moving systems of
reference. He showed that this assumption has no foundation
in any possible experience and he replaced it by a simple definition
of relative time, valid in a given coordinate system, but
different from the time of another system in relative motion.
The formal law of transformation from one space-time system
to another was already known, owing to an analysis of Lorentz;
it is in fact an intrinsic property of Maxwell's equations. The
Lorentz transformation is linear; it expresses the physical
equivalence of systems in relative motion with constant velocity
(see Appendix, 5).
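The linearity of the transformation, and the sense in which it relates equivalent systems, can be seen in a short numerical sketch. The formulae used are the standard one-dimensional form of the Lorentz transformation, assumed here rather than quoted from the Appendix; units with c = 1 and the sample events are likewise arbitrary choices.

```python
import math


def lorentz_boost(x, t, v, c=1.0):
    """Coordinates of the event (x, t) in a system moving with velocity v along x."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (x - v * t), gamma * (t - v * x / c ** 2)


def interval(x, t, c=1.0):
    """The quantity x^2 - c^2 t^2, which both systems agree upon."""
    return x * x - (c * t) ** 2


event_a = (2.0, 1.0)
event_b = (-0.5, 3.0)
velocity = 0.6

boosted_a = lorentz_boost(*event_a, v=velocity)
boosted_b = lorentz_boost(*event_b, v=velocity)

# Linearity: transforming the sum gives the sum of the transforms.
boosted_sum = lorentz_boost(
    event_a[0] + event_b[0], event_a[1] + event_b[1], v=velocity
)
linearity_error = max(
    abs(boosted_sum[0] - (boosted_a[0] + boosted_b[0])),
    abs(boosted_sum[1] - (boosted_a[1] + boosted_b[1])),
)

# Equivalence: the interval is the same in both systems.
interval_error = abs(interval(*boosted_a) - interval(*event_a))

print(f"linearity error: {linearity_error:.2e}")
print(f"interval error:  {interval_error:.2e}")
```

Both errors vanish up to rounding, whereas the time coordinate itself is changed by the transformation, in keeping with the replacement of a universal time by a relative one.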
Einstein's theory of gravitation is formally based on a generalization
of these transformations into arbitrary, non-linear ones;
with the help of these one can express the transition from one
system of reference to another accelerated (and simultaneously
deformed) one. The physical idea behind this mathematical
formalism has been already mentioned: the exact proportionality
of mass, as defined by inertia, and of mass as defined by
gravitation, equation (3.10); or, in other words, the fact that
in Newton's law of gravitational motion (3.4) the (inertial) mass
does not appear.
Einstein succeeded in establishing equations for the gravitational
field by identifying the components of this field with the
quantities $g_{\mu\nu}$ which define the geometry of space-time, namely
the coefficients of the line element
$$ ds^2 = \sum_{\mu,\nu} g_{\mu\nu}\,dx^\mu\,dx^\nu, \tag{4.15} $$
where $x^1, x^2, x^3$ stand for the space coordinates x, y, z, $x^4$ for the
time t.
In ordinary 3-dimensional Euclidean geometry the $g_{\mu\nu}$ are
constant and can, by a proper choice of the coordinate system,
be normalized in such a way that
$$ g_{11} = g_{22} = g_{33} = 1, \qquad g_{\mu\nu} = 0 \ \text{for}\ \mu \neq \nu. $$
Minkowski showed that special relativity can be regarded as
a 4-dimensional geometry, where time is added as the fourth
coordinate, but still with constant $g_{\mu\nu}$, which can be normalized to
$$ g_{11} = g_{22} = g_{33} = 1, \qquad g_{44} = -1, \qquad g_{\mu\nu} = 0 \ \text{for}\ \mu \neq \nu. \tag{4.16} $$
It was further known from the work of Riemann that a very
general type of non-Euclidean geometry in 3-dimensional space
could be obtained by taking the $g_{\mu\nu}$ as variable functions of
$x^1, x^2, x^3$, and the mathematical properties of this geometry had
been thoroughly studied (Levi-Civita, Ricci).
Einstein generalized Riemann's formalism to four dimensions,
assuming that the $g_{\mu\nu}$ depend not only on $x^1, x^2, x^3$, but also on
$x^4$, the time. However, he regarded the $g_{\mu\nu}$ not as given functions
of $x^1, x^2, x^3, x^4$ but as field quantities to be calculated from the
distribution of matter. He formed a set of quantities $R_{\mu\nu}$ which
can be regarded as a measure of the 'curvature' of space and are
functions of the $g_{\mu\nu}$ and their first and second derivatives, and
postulated equations of the form
$$ R_{\mu\nu} = \kappa T_{\mu\nu}, \tag{4.17} $$
where $\kappa$ is a constant and the $T_{\mu\nu}$ are generalizations of the
tensions in matter, defined in (4.8): one has to supplement the
tensor T by a fourth row and column, where $T_{14}, T_{24}, T_{34}$ are
the components of the density of momentum, $T_{44}$ the density of
energy. These equations (4.17) are invariant in a very general
sense, namely for all continuous transformations of space-time,
and they are essentially uniquely determined by this property
and the postulate that no higher derivatives than the second
order ones should appear.
If the distribution of matter is given, i.e. the $T_{\mu\nu}$ are known,
the field equations (4.17) allow one to calculate the $g_{\mu\nu}$, i.e. the
geometry of space. Einstein found the solution for a mass point
as source of the field, and by assuming that the motion of another
particle was determined by a geodesic, or shortest, or straightest
line in this geometry, he showed that Newton's laws of planetary
motion follow as a first approximation. But higher approximations
lead to small deviations, some of which can be observed.
I cannot enter into the discussion of all the consequences of the
new gravitational theory; Einstein's predictions have been confirmed,
although some of them are at the limit of observational
technique. But I wish to add a remark about a theoretical point
which is not so well known, yet very important. The assumption
that the motion of a particle is given by a geodesic is
obviously an unsatisfactory feature; one would expect that the
field equations alone should determine not only the field produced
by particles but also the reaction of the particles to the
field, that is their motion. Einstein, with his collaborators
Infeld and Hoffmann, has proved that this is in fact the case,
and the same result has been obtained independently and, as I
think, in a considerably simpler way, by the Russian physicist
Fock. On the basis of these admirable papers, one can say that
the field theory of gravitation is logically perfect; whether it
will stand all observational tests remains to be seen.
From the standpoint of the philosophical problem, which is
the subject of these lectures, there are several conclusions to be
drawn. The first is that now physical geometry, that is, not some
abstract mathematical system but the geometrical aspect of the
behaviour of actual bodies, is subject to the cause-effect relation
and to all related principles like any other branch of science.
The mathematicians often stress the opposite point of view;
they speak of the geometrization of physics, but though it cannot
be denied that the mathematical beauty of this method has
inspired numerous valuable investigations, it seems to me an
overestimation of the formalism. The main point is that
Einstein's geometrical mechanics or mechanical geometry
satisfies the principle of contiguity. On the other hand, antecedence,
applied to two consecutive configurations as cause and
effect, is not satisfied, or not more than in electrodynamics; for
there is no intrinsic direction in the flow of time contained in
the equations. The theory is deterministic, at least in principle:
the future or past motion of particles and the distribution of the
gravitational field are predictable from the equations, if the
situation at a given time is known, together with boundary
conditions (vanishing of field at infinity) for all times. But as
the gravitational field travels between the particles with finite
velocity, this statement is not identical with Newtonian determinism:
a knowledge is needed, not only of all particles, but also
of all gravitational waves (which do not exist in Newton's
theory). Einstein himself values the deterministic feature of his
theory very highly. He regards it as a postulate which has to
be demanded from any physical theory, and he rejects, therefore,
parts of modern physics which do not satisfy it.
Here I only wish to remark that determinism in field theories
seems to me of very little significance. To illustrate the power
of mechanics, Laplace invented a supermathematician able to
predict the future of the world provided the positions and velocities
of all particles at one moment were given to him. I can
sympathize with him in his arduous task. But I would really
pity him if he had not only to solve the numerous ordinary
differential equations of Newtonian type but also the partial
differential equations of the field theory with the particles as
singularities.
ANTECEDENCE: THERMODYNAMICS
We have now to discuss the experiences which make it possible
to distinguish in an objective empirical way between past and
future or, in our terminology, to establish the principle of
antecedence in the chain of cause and effect. These experiences
are connected with the production and transfer of heat. There
would be a long story to tell about the preliminary steps
necessary to translate the subjective phenomena of hot and cold
into the objective language of physics: the distinction between
the quality 'temperature' and the quantity 'heat', together
with the invention of the corresponding instruments, the thermometer
and calorimeter. I take the technical side of this
development to be well known and I shall use the thermal
concepts in the usual way, although I shall have to analyse them
presently from the standpoint of scientific methodology. It was
only natural that the measurable quantity heat was first
regarded as a kind of invisible substance called caloric. The
flow of heat was treated with the methods developed for material
liquids, yet with one important difference: the inertia of the
caloric fluid seemed to be negligible; its flow was determined
by a differential equation which is not of the second but of the
first order in time. It is obtained from the continuity equation
(see (4.5))
$\frac{\partial Q}{\partial t} + \operatorname{div} \mathbf{q} = 0$ (5.1)
by assuming that the change of the density of heat $Q$ is proportional
to the change of temperature $T$, $\delta Q = c\,\delta T$ (where $c$ is the
specific heat), while the current of heat $\mathbf{q}$ is proportional to the
negative gradient of temperature, $\mathbf{q} = -\kappa \operatorname{grad} T$ (where $\kappa$ is
the coefficient of conductivity). Hence
$c \frac{\partial T}{\partial t} = \kappa \Delta T$, (5.2)
a differential equation of the first order in time. This equation
was the starting-point of one of the greatest discoveries
in mathematics, Fourier's theory of expansion of arbitrary
functions in terms of orthogonal sets of simple periodic functions,
the prototype of numerous similar expansions and the embryo
from which a considerable part of modern analysis and mathematical
physics developed.
But that is not the aspect from which we have here to regard
the equation (5.2); it is this:
The equation does not allow a change of $t$ into $-t$; the result
cannot be compensated by a change of sign of other variables
as happens in Maxwell's equations. Hence the solutions exhibit
an essential difference of past and future, a definite 'flow of time'
as one is used to say, meaning, of course, a flow of events in
time. For instance, an elementary solution of (5.2) for the
temperature distribution in a thin wire along the $x$-direction is
$T \propto \frac{1}{\sqrt{t}}\, e^{-c x^2 / 4 \kappa t}$, (5.3)
which describes the spreading and levelling out of an initially
high temperature concentrated near the point x = 0, an
obviously irreversible phenomenon.
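The spreading can be checked numerically. The following Python sketch assumes that the elementary solution has the Gaussian form t^(-1/2) exp(-c x^2 / 4 kappa t), the standard heat kernel of (5.2); the function names, the unit values c = kappa = 1 and the step size h are illustrative choices, not taken from the text.

```python
import math

def kernel(x, t, c=1.0, kappa=1.0):
    """Assumed elementary solution of c*dT/dt = kappa*d2T/dx2 (Gaussian heat kernel)."""
    return math.exp(-c * x * x / (4.0 * kappa * t)) / math.sqrt(t)

def residual(x, t, c=1.0, kappa=1.0, h=1e-3):
    """c*T_t - kappa*T_xx estimated by central differences; should be close to 0."""
    T_t = (kernel(x, t + h, c, kappa) - kernel(x, t - h, c, kappa)) / (2.0 * h)
    T_xx = (kernel(x + h, t, c, kappa) - 2.0 * kernel(x, t, c, kappa)
            + kernel(x - h, t, c, kappa)) / (h * h)
    return c * T_t - kappa * T_xx

def peak_and_width(t, c=1.0, kappa=1.0):
    """Temperature at x = 0 and the distance at which it has dropped by a factor e."""
    return kernel(0.0, t, c, kappa), math.sqrt(4.0 * kappa * t / c)

for t in (1.0, 4.0, 16.0):
    print(t, peak_and_width(t))   # the peak falls and the width grows with t
```

As t increases the peak only falls and the profile only widens; for negative t the expression is not even real, which is one way of seeing that the solution has no time-reversed counterpart.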
I do not know enough of the history of physics to understand
how this theory of heat conduction was reconciled with the
general conviction that the ultimate laws of physics were of the
Newtonian reversible type.
Before a solution of this problem could be attempted another
important step was necessary: the discovery of the equivalence
of heat and mechanical work, or, as we say today, of the first
law of thermodynamics. It is important to remember that this
discovery was made considerably later than the invention of the
steam-engine. Not only the production of heat by mechanical
work (e.g. through friction), but also the production of work
from heat (steam-engine) was known. The new feature was the
statement that a given amount of heat always corresponds to a
definite amount of mechanical work, its 'mechanical equivalent'.
Robert Mayer pronounced this law on very scanty and indirect
evidence, but obtained a fairly good value for the equivalent
from known properties of gases, namely from the difference of
heat necessary to raise the temperature by one degree if either
the volume is kept constant or the gas allowed to do work
against a constant pressure. Joule investigated the same
problem by systematic experiments which proved the essential
point, namely that the work necessary to transfer a system from
one equilibrium state to another depends only on these two
states, not on the process of application of the work. This is the
real content of the first law; the determination of the numerical
value of the mechanical equivalent, so much stressed in textbooks,
is a matter of physical technique. To get our notions
clear, we have now to return to the logical and philosophical
foundations of the theory of heat.
The problem is to transform the subjective sense impressions
of hot and cold into objective measurable statements. The latter
are, of course, again somewhere connected with sense impressions.
You cannot read an instrument without looking at it.
But there is a difference between this looking at, say, a thermometer
with which a nurse measures the temperature of a patient
and the feeling of being hot under which the patient suffers.
It is a general principle of science to rid itself as much as
possible of sense qualities. This is often misunderstood as
meaning elimination of sense impressions, which, of course, is
absurd. Science is based on observation, hence on the use of the
senses. The problem is to eliminate the subjective features and
to maintain only statements which can be confirmed by several
individuals in an objective way. It is impossible to explain to
anybody what I mean by saying 'This thing is red' or 'This
thing is hot'. The most I can do is to find out whether other
persons call the same things red or hot. Science aims at a closer
relation between word and fact. Its method consists in finding
correlations of one kind of subjective sense impressions with
other kinds, using the one as indicators for the other, and in this
way establishes what is called a fact of observation.
Here I have ventured again into metaphysics. At least, a
philosopher would claim that a thorough study of these
methodological principles is beyond physics. I think it is again
a rule of our craft as scientists, like the principle of inductive
inference, and I shall not analyse it further at this moment.
In the case of thermal phenomena, the problem is to define
the quantities involved, temperature and heat, by means of
observable objective changes in material bodies. It turns out
that the concepts of mechanics, configuration and force, strain
and stress, suffice for this purpose, but that the laws of mechanics
have to be essentially changed.
Let us consider for simplicity only systems of fluids, that is of
continuous media, whose state in equilibrium is defined by one
single strain quantity, the density, instead of which we can also,
for a given mass, take the total volume V. There is also only one
stress quantity, the pressure p. From the standpoint of
mechanics the pressure in equilibrium is a given function of the
volume, p = f(V).
Now all those experiences which are connected with the
subjective impression of making the fluid hotter or colder, show
that this law of mechanics is wrong: the pressure can be changed
at constant volume, namely 'by heating' or 'by cooling'.
Hence the pressure p can be regarded as an independent
variable besides the volume V, and this is exactly what thermodynamics
does.
The generalization for more complicated substances (such as
those with rigidity or magnetic polarizability) is so obvious that
I shall stick to the examples of fluids, characterized by two
thermodynamically independent variables V, p. But it is
necessary to consider systems consisting of several fluids, and
therefore one has to say a word about different kinds of contact
between them.
To shorten the expression, one introduces the idea of 'walls'
separating different fluids. These walls are supposed to be so
thin that they play no other part in the physical behaviour of
the system than to define the interaction between two neighbouring
fluids. We shall assume every wall to be impenetrable
to matter, although in theoretical chemistry semi-permeable
partitions are used with great advantage. Two kinds of walls
are to be considered.
An adiabatic wall is defined by the property that equilibrium
of a body enclosed by it is not disturbed by any external process
as long as no part of the wall is moved (distance forces being
excluded in the whole consideration).
Two comments have to be made. The first is that the
adiabatic property is here defined without using the notion of
heat; that is essential, for as it is our aim to define the thermal
concepts in mechanical terms, we cannot use them in the elementary
definitions. The second remark is that adiabatic
enclosure of a system can be practically realized, as in the Dewar
vessel or thermos flask, with a high degree of approximation.
Without this fact, thermodynamics would be utterly impracticable.
The ordinary presentations of this subject, though rather
careless in their definitions, cannot avoid the assumption of the
possibility of isolating a system thermally; without this no
calorimeter would work and heat could not be measured.
The second type of wall is the diathermanous wall, defined by
the following property: if two bodies are separated by a diathermanous
wall, they are not in equilibrium for arbitrary values of
their variables $p_1, V_1$ and $p_2, V_2$, but only if a definite relation
between these four quantities is satisfied
$F(p_1, V_1, p_2, V_2) = 0$. (5.4)
This is the expression of thermal contact; the wall is only introduced
to symbolize the impossibility of exchange of material.
The concept of temperature is based on the experience that
two bodies, being in thermal equilibrium with a third one, are
also in thermal equilibrium with one another. If we write (5.4) in
the short form $F(1, 2) = 0$, this property of equilibrium can be
expressed by saying that of the three equations
$F(2, 3) = 0$, $F(3, 1) = 0$, $F(1, 2) = 0$, (5.5)
any two always involve the third. This is only possible if
(5.4) can be brought into the form
$\vartheta_1(p_1, V_1) = \vartheta_2(p_2, V_2)$. (5.6)
Now one can use one of the two bodies, say 2, as thermometer
and introduce the value of the function
$\vartheta_2(p_2, V_2) = \vartheta$ (5.7)
as empirical temperature. Then one has for the other body the
so-called equation of state
$\vartheta_1(p_1, V_1) = \vartheta$. (5.8)
Any arbitrary function of $\vartheta$ can be chosen as empirical temperature
with equal right; the choice is restricted only by
practical considerations. (It would be impractical to use a thermometric
substance for which two distinguishable states are in
thermal equilibrium.) The curves $\vartheta = \text{const.}$ in the $pV$-plane are
independent of the temperature scale; they are called isotherms.
It is not superfluous to stress the extreme arbitrariness of the
temperature scale. Any suitable property of any substance can
be chosen as thermometric indicator, and if this is done, still
the scale remains at our disposal. If we, for example, choose a
gas at low pressures, because of the simplicity of the isothermal
compression law $pV = \text{const.}$, there is no reason to take $pV = \vartheta$
as measure of temperature: one could just as well take $(pV)^2$ or
$\sqrt{pV}$. The definition of an 'absolute' scale of temperature was
therefore an urgent problem which was solved by the discovery
of the second law of thermodynamics.
The second fundamental concept of thermodynamics, that of
heat, can be defined in terms of mechanical quantities by a
proper interpretation of Joule's experiments. As I have pointed
out already, the gist of these experiments lies in the following
fact : If a body in an adiabatic enclosure is brought from one
(equilibrium) state to another by applying external work, the
amount of this work is always the same in whatever form
(mechanical, electrical, etc.) and manner (slow or fast, etc.) it is
applied.
Hence for a given initial state $(p_0, V_0)$ the work done adiabatically
is a function $U$ of the final state $(p, V)$, and one can
write
$W = U - U_0$; (5.9)
the function $U(p, V)$ is called the energy of the system. It is a
quantity directly measurable by mechanical methods.
If we now consider a non-adiabatic process leading from the
initial state $(p_0, V_0)$ to the final state $(p, V)$, the difference
$U - U_0 - W$ will not be zero, but can be determined if the energy
function $U(p, V)$ is known from previous experiment. This
difference
$U - U_0 - W = Q$ (5.10)
is called the heat supplied to the system during the process.
Equation (5.10) is the definition of heat in terms of mechanical
quantities.
This procedure presupposes that mechanical work is measurable
however it is applied; that means, for example, that the
displacements of and the forces on the surface of a stirring
wheel in a fluid, or the current and resistance of a wire heating
the fluid, must be registered even for the most violent reactions.
Practically this is difficult, and one uses either stationary processes
of a comparatively long duration where the irregular
initial and final stages can be neglected (this includes heating
by a stationary current), or extremely slow, 'quasi-static'
processes; these are in general (practically) reversible, since no
kinetic energy is produced which could be irreversibly destroyed
by friction. In ordinary thermodynamics one regards every
curve in the $pV$-plane as the diagram of a reversible process;
that means that one allows infinitely slow heating or cooling by
bringing the system into thermal contact with a series of large
heat reservoirs which differ by small amounts of temperature.
Such an assumption is artificial; it does not even remotely
correspond to a real experiment. It is also quite superfluous.
We can restrict ourselves to adiabatic quasi-static processes,
consisting of slow movements of the (adiabatic) walls. For these
the work done on a simple fluid is
$dW = -p\,dV$, (5.11)
where $p$ is the equilibrium pressure, and the first theorem of
thermodynamics (5.10) assumes the form
$dQ = dU + p\,dV = 0$. (5.12)
For systems of fluids separated by adiabatic or diathermanous
walls the energy and the work done are additive (according to
our definition of the walls); hence, for instance,
$dQ = dQ_1 + dQ_2 = dU + p_1\,dV_1 + p_2\,dV_2$, (5.13)
where $U = U_1 + U_2$.
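This equation is of course only of interest for the case of
thermal contact where the equation (5.6) holds; the system has
then only three independent variables, for which one can choose
$V_1$, $V_2$ and the temperature $\vartheta$, defined by (5.7) and (5.8). Then
$U_1 = U_1(V_1, \vartheta)$, $U_2 = U_2(V_2, \vartheta)$, and (5.13) takes the form
$dQ = \left(\frac{\partial U_1}{\partial V_1} + p_1\right) dV_1 + \left(\frac{\partial U_2}{\partial V_2} + p_2\right) dV_2 + \left(\frac{\partial U_1}{\partial \vartheta} + \frac{\partial U_2}{\partial \vartheta}\right) d\vartheta = 0$. (5.14)
Every adiabatic quasi-static process can be represented as a
line in the three-dimensional $V_1 V_2 \vartheta$-space which satisfies this
equation; let us call these for brevity 'adiabatic lines'.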
Equation (5.14) is a differential equation of a type studied by
Pfaff. Pfaffian equations are the mathematical expression of
elementary thermal experiences, and one would expect that the
laws of thermodynamics are connected with their properties.
That is indeed the case, as Carathéodory has shown. But
classical thermodynamics proceeded in quite a different way,
introducing the conception of idealized thermal machines which
transform heat into work and vice versa (William Thomson,
Lord Kelvin), or which pump heat from one reservoir into
another (Clausius). The second law of thermodynamics is then
derived from the assumption that not all processes of this kind
are possible: you cannot transform heat completely into work,
nor bring it from a state of lower temperature to one of higher
'without compensation' (see Appendix, 6). These are new and
strange conceptions, obviously borrowed from engineering.
I have mentioned that the steam-engine existed before thermodynamics;
it was a matter of course at that time to use the
notions and experiences of the engineer to obtain the laws of
heat transformation, and the establishment of the abstract
concepts of entropy and absolute temperature by this method
is a wonderful achievement. It would be ridiculous to feel anything
but admiration for the men who invented these methods.
But even as a student, I thought that they deviated too much
from the ordinary methods of physics; I discussed the problem
with my mathematical friend, Carathéodory, with the result
that he analysed it and produced a much more satisfactory
solution. This was about forty years ago, but still all textbooks
reproduce the 'classical' method, and I am almost certain
that the same holds for the great majority of lectures; I know,
however, a few exceptions, namely those of the late R. H. Fowler
and his school. This state of affairs seems to me one of unhealthy
conservatism. I take in these lectures an opportunity to advocate
a change.
The central point of Carathéodory's method is this. The
principles from which Kelvin and Clausius derived the second
law are formulated in such a way as to cover the greatest
possible range of processes incapable of execution: in no way
whatever can heat be completely transformed into work or
raised to a higher level of temperature. Carathéodory remarked
that it is perfectly sufficient to know the existence of some
impossible processes to derive the second law. I need hardly
say that this is a logical advantage. Moreover, the impossible
processes are already obtained by scrutinizing Joule's experiments
a little more carefully. They consisted in bringing a
system in an adiabatic enclosure from one equilibrium state to
another by doing external work: it is an elementary experience,
almost obvious, that you cannot get your work back by reversing
the process. And that holds however near the two states are.
One can therefore say that there exist adiabatically inaccessible
states in any vicinity of a given state. That is Carathéodory's
principle.
In particular, there are neighbour states of any given
one which are inaccessible by quasi-static adiabatic processes.
These are represented by adiabatic lines satisfying the
Pfaffian equation (5.14). Therefore the question arises: Does
Carathéodory's postulate hold for any Pfaffian or does it mean
a restriction?
The latter is the case, and it can be seen by very simple
mathematics indeed, of which I shall give here a short sketch
(see Appendix, 7).
Let us first consider a Pfaffian equation of two variables,
x and y,
$dQ = X\,dx + Y\,dy = 0$, (5.15)
where X, Y are functions of x, y. This is equivalent to the
ordinary differential equation
$\frac{dy}{dx} = -\frac{X}{Y}$, (5.16)
which has an infinite number of solutions $\phi(x, y) = \text{const.}$,
representing a one-parameter set of curves in the (x, y)-plane.
Along any of these curves one has
$d\phi = \frac{\partial \phi}{\partial x}\,dx + \frac{\partial \phi}{\partial y}\,dy = 0$, (5.17)
and this must be the same condition as the given Pfaffian;
hence one must have
$dQ = \lambda\,d\phi$. (5.18)
Each Pfaffian dQ of two variables has therefore an 'integrating
denominator' $\lambda$, so that $dQ/\lambda$ is a total differential.
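A short worked example may make the construction concrete. The particular Pfaffian chosen below is an illustration only and does not come from the text; the steps follow the route from the differential equation to the integrating denominator described above, with X = y and Y = -x.

```latex
% Illustrative example (assumed, not from the text): dQ = y dx - x dy
\begin{align*}
dQ = y\,dx - x\,dy = 0
  &\;\Longrightarrow\; \frac{dy}{dx} = \frac{y}{x}
  \;\Longrightarrow\; \phi(x, y) = \frac{y}{x} = \text{const.}, \\
d\phi &= \frac{x\,dy - y\,dx}{x^2} = -\frac{dQ}{x^2}, \\
dQ &= \lambda\,d\phi \quad\text{with}\quad \lambda = -x^2 \qquad (x \neq 0).
\end{align*}
```

Here dQ itself is not a total differential, since the derivative of y with respect to y is 1 while that of -x with respect to x is -1, yet dQ divided by the denominator -x^2 is one.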
For Pfaffians of three (or more) variables,
$dQ = X\,dx + Y\,dy + Z\,dz$, (5.19)
this does not hold. It is easy to give analytical examples (see
Appendix, 7); but one can see it geometrically in this way: if in
(5.19) dx, dy, dz are regarded as finite differences $\xi - x$, $\eta - y$, $\zeta - z$,
it is the equation of a plane through the point x, y, z; one has a
plane through each point of space, continuously varying in
orientation with the position of this point. Now if a function $\phi$
existed, these planes would have to be tangential to the surfaces
$\phi(x, y, z) = \text{const.}$ But one can construct continuously varying
sets of planes which are not 'integrable', i.e. tangential to a set
of surfaces. For example, take all circular screws with the same
axis, but varying radius and pitch, and construct at each point
of every screw the normal plane; these obviously form a non-integrable
set of planes.
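The distinction can also be tested numerically. The sketch below relies on the standard integrability criterion for a Pfaffian in three variables, namely that the vector A = (X, Y, Z) satisfies A . curl A = 0; this criterion is not stated in the text above and is brought in here as an assumption. The two fields are illustrative: `planar` is an integrable Pfaffian, and `screw` is loosely modelled on the screw construction, with a single fixed pitch parameter b for simplicity.

```python
def curl(A, p, h=1e-5):
    """Central-difference estimate of curl A at the point p = (x, y, z)."""
    def d(i, j):
        # Partial derivative of component i of A with respect to coordinate j.
        hi, lo = list(p), list(p)
        hi[j] += h
        lo[j] -= h
        return (A(*hi)[i] - A(*lo)[i]) / (2.0 * h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

def integrability_defect(A, p):
    """A . curl A at p; it vanishes wherever X dx + Y dy + Z dz is integrable."""
    return sum(a * c for a, c in zip(A(*p), curl(A, p)))

def planar(x, y, z):
    # dQ = y dx - x dy: not a total differential, but equal to -x^2 d(y/x).
    return (y, -x, 0.0)

def screw(x, y, z, b=1.0):
    # Tangent direction of circular helices about the z-axis, so that
    # -y dx + x dy + b dz = 0 describes their normal planes (b assumed fixed).
    return (-y, x, b)

point = (0.7, -0.3, 1.2)
print(integrability_defect(planar, point))  # close to 0
print(integrability_defect(screw, point))   # close to 2*b
```

A non-zero defect at any point means that no function and no integrating denominator can exist in a neighbourhood of that point.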
Hence all Pfaffians can be separated into two classes: those of
the form $dQ = \lambda\,d\phi$, which have an 'integrating denominator'
and represent the tangential planes of a set of surfaces $\phi = \text{const.}$,
and those which lack this property.
Now in the first case, $dQ = \lambda\,d\phi$, any line satisfying the
Pfaffian equation (5.19) must lie in the surface $\phi = \text{const.}$
Hence an arbitrary pair of points $P_0$ and $P$ in the xyz-space
cannot be connected by such a line. This is quite elementary.
Not quite so obvious is the inverse statement which is used in the
thermodynamic application: If there are points $P$ in any
vicinity of a given point $P_0$ which cannot be connected with $P_0$
by a line satisfying the Pfaffian equation (5.19), then there
exists an integrating denominator and one has $dQ = \lambda\,d\phi$.
One can intuitively understand this theorem by a continuity
consideration: All points $P$ inaccessible from $P_0$ will fill a certain
volume, bounded by a surface of accessible points going through $P_0$.
Further, to each inaccessible point there corresponds another
one in the opposite direction; hence the boundary surface must
contain all accessible points, which proves the existence of the
function $\phi$, so that $dQ = \lambda\,d\phi$ (see Appendix, 7).
The application of this theorem to thermodynamics is now
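simple. Combining it with Carathéodory's principle, one has for
any two systems
$dQ_1 = \lambda_1\,d\phi_1$, $dQ_2 = \lambda_2\,d\phi_2$, (5.20)
and for the combined system
$dQ = dQ_1 + dQ_2 = \lambda\,d\phi$; (5.21)
hence
$\lambda\,d\phi = \lambda_1\,d\phi_1 + \lambda_2\,d\phi_2$. (5.22)
Consider in particular two simple fluids in thermal contact;
then the system has three independent variables $V_1$, $V_2$, $\vartheta$, which
can be replaced by $\phi_1$, $\phi_2$, $\vartheta$. Then (5.22) shows that $\phi$ depends
only on $\phi_1$, $\phi_2$, and not on $\vartheta$, while
$\frac{\partial \phi}{\partial \phi_1} = \frac{\lambda_1}{\lambda}$, $\frac{\partial \phi}{\partial \phi_2} = \frac{\lambda_2}{\lambda}$. (5.23)
Hence these quotients are also independent of $\vartheta$,
$\frac{\partial}{\partial \vartheta} \frac{\lambda_1}{\lambda} = 0$, $\frac{\partial}{\partial \vartheta} \frac{\lambda_2}{\lambda} = 0$,
from which one infers
$\frac{1}{\lambda_1} \frac{\partial \lambda_1}{\partial \vartheta} = \frac{1}{\lambda_2} \frac{\partial \lambda_2}{\partial \vartheta} = \frac{1}{\lambda} \frac{\partial \lambda}{\partial \vartheta}$. (5.24)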
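Now $\lambda_1$ is a variable of the first fluid only, therefore only
dependent on $\phi_1$ and $\vartheta$; in the same way $\lambda_2 = \lambda_2(\phi_2, \vartheta)$. The
first equality (5.24) can only hold if both quantities depend only
on $\vartheta$. Hence
$\frac{\partial \log \lambda_1}{\partial \vartheta} = \frac{\partial \log \lambda_2}{\partial \vartheta} = \frac{\partial \log \lambda}{\partial \vartheta} = g(\vartheta)$, (5.25)
where $g(\vartheta)$ is a universal function, namely numerically identical
for different fluids and for the combined system.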
This simple consideration leads with ordinary mathematics
to the existence of a universal function of temperature. The rest
is just a matter of normalization. From (5.25) one finds for each
system
$\log \lambda = \int g(\vartheta)\,d\vartheta + \log \Phi$, $\lambda = \Phi\, e^{\int g(\vartheta)\,d\vartheta}$, (5.26)
where $\Phi$ depends on the corresponding $\phi$.
If one now defines
$T(\vartheta) = C\, e^{\int g(\vartheta)\,d\vartheta}$, $S(\phi) = \frac{1}{C} \int \Phi(\phi)\,d\phi$, (5.27)
where the constant $C$ can be fixed by prescribing the value of
$T_1 - T_2$ for two reproducible states of some normal substance
(e.g. $T_1 - T_2 = 100$, if $T_1$ corresponds to the boiling-point, $T_2$
to the freezing-point of water at 1 atmosphere of pressure), then
one has
$dQ = \lambda\,d\phi = T\,dS$. (5.28)
$T$ is the thermodynamical or absolute temperature and $S$ the
entropy.
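The role of T as integrating denominator can be illustrated for the simplest substance. The Python sketch below assumes one mole of an ideal gas with U = CV*T and p = R*T/V, a constant heat capacity CV = 1.5 R, and a gas-thermometer scale already identified with T; none of these specifics come from the text, and the rectangular loop in the (T, V) plane is a hypothetical choice. It integrates dQ = dU + p dV and dQ/T around the closed loop.

```python
import math

R = 8.314          # gas constant, J/(mol K)
CV = 1.5 * R       # assumed constant molar heat capacity (monatomic ideal gas)

def dQ(T, V, dT, dV):
    """Pfaffian dQ = dU + p dV for one mole of ideal gas (U = CV*T, p = R*T/V)."""
    return CV * dT + (R * T / V) * dV

def loop_integrals(corners, steps=2000):
    """Integrate dQ and dQ/T along straight segments joining the corners in a closed loop."""
    total_Q = total_S = 0.0
    for (Ta, Va), (Tb, Vb) in zip(corners, corners[1:] + corners[:1]):
        dT, dV = (Tb - Ta) / steps, (Vb - Va) / steps
        for k in range(steps):
            T = Ta + (k + 0.5) * dT   # midpoint of the small step
            V = Va + (k + 0.5) * dV
            q = dQ(T, V, dT, dV)
            total_Q += q
            total_S += q / T
    return total_Q, total_S

# Hypothetical loop: (T in kelvin, V in arbitrary units; only volume ratios matter).
corners = [(300.0, 1.0), (400.0, 1.0), (400.0, 2.0), (300.0, 2.0)]
Q_loop, S_loop = loop_integrals(corners)
print(Q_loop)   # not zero: dQ is not a total differential
print(S_loop)   # close to zero: dQ/T is one
```

For this loop the first integral equals R (T2 - T1) log(V2/V1) analytically, while the second vanishes, which is the content of (5.28) in this special case.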
Equation (5.28) refers only to quasi-static processes, that
is, to sequences of equilibrium states. To get a result about
real dynamical phenomena one has to apply Carathéodory's
principle again, considering a finite transition from an initial
state $V_1^0, V_2^0, S^0$ to a final state $V_1, V_2, S$. One can reach the latter
one in two steps: first changing the volume quasi-statically
(and adiabatically) from $V_1^0, V_2^0$ to $V_1, V_2$, the entropy remaining
constant, equal to $S^0$, and then changing the state adiabatically,
but irreversibly (by stirring, etc.) at constant volume, so that
$S^0$ goes over into $S$.
Now if any neighbouring value $S$ of $S^0$ could be reached in
this way, one would have a contradiction to Carathéodory's
principle, as the volumes are of course arbitrarily changeable.
Hence for each such process one must have either $S \geq S^0$ or
$S \leq S^0$. Continuity demands that the same sign holds for all
initial states; it holds also for different substances since the
entropy is additive (as can be easily seen). The actual sign $\geq$
or $\leq$ depends on the choice of the constant $C$ in (5.27); if this is
chosen so that T is positive, a single experience, say with a gas,
shows that entropy never decreases.
It may not be superfluous to add a remark on the behaviour
of entropy for the case of conduction of heat. As thermodynamics
has to do only with processes where the initial and
final states are equilibria, stationary flow cannot be treated: one
can only ask, What is the final state of two initially separated
bodies brought into thermal contact? The difficulty is that a
change of entropy is only defined by quasi-static adiabatic
processes; the sudden change of thermal isolation into contact,
however, is discontinuous and the processes inside the system
not controllable. Yet one can reduce this process to the one
considered before. By quasi-static adiabatic changes of volume
the temperatures can be made equal without change of entropy;
then contact can be made without discontinuity, and the initial
volumes quasi-statically restored, again without a change of
entropy. The situation is now the same as in the initial state
considered before, and it follows that any process leading to the
final state must increase the entropy.
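A small numerical illustration of this conclusion follows. It assumes two bodies with constant heat capacities C1 and C2 at fixed volume, so that the entropy change of each is the integral of C dT / T; this simplification and the function name are not taken from the text.

```python
import math

def contact_entropy_change(T1, T2, C1=1.0, C2=1.0):
    """Final temperature and total entropy change for two bodies of constant
    heat capacities C1, C2 (assumed) brought into thermal contact at fixed volume."""
    Tf = (C1 * T1 + C2 * T2) / (C1 + C2)                    # energy conservation
    dS = C1 * math.log(Tf / T1) + C2 * math.log(Tf / T2)    # integral of C dT / T
    return Tf, dS

Tf, dS = contact_entropy_change(300.0, 400.0)
print(Tf, dS)   # dS is positive whenever T1 differs from T2
```

Under these assumptions the total entropy change is zero only when the two initial temperatures coincide, and positive otherwise.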
The whole chain of considerations can be generalized for more
complicated systems without any difficulty. One has only to
assume that all independent variables except one are of the type
represented by the volume, namely arbitrarily changeable.
If one has to deal, as in chemistry, with substances which are
mixtures of different components, one can regard the concentrations
of these as arbitrarily variable with the help of semi-permeable
walls and movable pistons (see Appendix, 8).
By using thermodynamics a vast amount of knowledge has
been accumulated not only in physics but in the borderland
sciences of physico-chemistry, metallurgy, mineralogy, etc. Most
of it refers to equilibria. In fact, the expression 'thermodynamics'
is misleading. The only dynamical statements possible are
concerned with the irreversible transitions from one equilibrium
state to another, and they are of a very modest character, giving
the total increase of entropy or the decrease of free energy
$F = U - TS$. The irreversible process itself is outside the scope
of thermodynamics.
The principle of antecedence is now satisfied; but this gain is
paid for by the loss of all details of description which ordinary
dynamics of continuous media supplies.
Can this not be mended? Why not apply the methods of
Cauchy to thermal processes, by treating each volume element
as a small thermodynamical system, and regarding not only
strain, stress, and energy, but also temperature and entropy as
continuous functions in space? This has of course been done,
but with limited success. The reason is that thermodynamics is
definitely connected with walls or enclosures. We have used the
adiabatic and diathermanous variety, and mentioned semi-permeable
walls necessary for chemical separations; but a
volume element is not surrounded by a wall, it is in free contact
with its neighbourhood. The thermodynamic change to which
it is subject depends therefore on the flux of energy and material
constituents through its boundary, which themselves cannot be
reduced to mechanics. In some limiting cases, one has found
simple solutions. For instance, when calculating the velocity of
sound in a gas, one tried first for the relation between pressure
$p$ and density $\rho$ the isothermal law $p = c\rho$, where $c$ is a constant,
but found no agreement with experiment; then one took the
adiabatic law $p = c\rho^\gamma$, where $\gamma$ is the ratio of the specific heats at
constant pressure and constant volume (see Appendix, 9), which
gave a much better result. The reason is that for fast vibrations
there is no time for heat to flow through the boundary of a
volume element which therefore behaves as if it were adiabatically
enclosed. But by making the vibrations slower and slower,
one certainly gets into a region where this assumption does not
hold any more. Then conduction of heat must be taken into
account. The hydrodynamical equations and those of heat conduction
have to be regarded as a simultaneous system. In this
way a descriptive or phenomenological theory can be developed
and has been developed. Yet I am unable to give an account of
it, as I have never studied it; nor have the majority of physicists
shown much interest in this kind of thing. One knows that any
flux of matter and energy can be fitted into Cauchy's general
scheme, and there is not much interest in doing it in the most
general way. Besides, each effect needs separate constants,
e.g. in liquids compressibility, specific heat, conductivity of heat,
constants of diffusion; in solids elastic constants and parameters
describing plastic flow, etc., and very often these so-called constants
turn out to be not constants, but to depend on other
quantities (see Appendix, 10).
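The comparison of the isothermal and adiabatic laws for the velocity of sound mentioned above can be made quantitative. The sketch below uses the standard relation that the square of the sound velocity equals the derivative of pressure with respect to density, which is not stated in the text; the numerical values for air are approximate handbook figures and are assumptions of this example.

```python
import math

def sound_speed(pressure, density, gamma=1.0):
    """sqrt(gamma * p / rho); gamma = 1 corresponds to the isothermal law."""
    return math.sqrt(gamma * pressure / density)

P_AIR = 101325.0    # Pa, 1 atmosphere
RHO_AIR = 1.293     # kg/m^3, dry air at 0 deg C (approximate)
GAMMA_AIR = 1.40    # ratio of specific heats for air (approximate)

isothermal = sound_speed(P_AIR, RHO_AIR)
adiabatic = sound_speed(P_AIR, RHO_AIR, GAMMA_AIR)
print(round(isothermal), round(adiabatic))   # roughly 280 and 331 m/s
```

The measured velocity in air at 0 deg C is about 331 m/s, so with these figures the adiabatic value agrees with experiment and the isothermal one falls short by roughly 15 per cent.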
Therefore one can rightly say that with ordinary thermodynamics
the descriptive method of physics has come to its
natural end. Something new had to appear.
VI
CHANCE
KINETIC THEORY OF GASES
The new turn in physics was the introduction of atomistics and
statistics.
To follow up the history of atomistics into the remote past is
not in the plan of this lecture. We can take it for granted that
since the days of Demokritos the hypothesis of matter being
composed of ultimate and indivisible particles was familiar to
every educated man. It was revived when the time was ripe.
Lord Kelvin quotes frequently a Father Boscovich as one of the
first to use atomistic considerations to solve physical problems;
he lived in the eighteenth century, and there may have been
others, of whom I know nothing, thinking on the same lines.
The first systematic use of atomistics was made in chemistry,
where it allowed the reduction of innumerable substances to a
relatively small stock of elements. Physics followed considerably
later because atomistics as such was of no great use without
another fundamental idea, namely that the observable properties
of matter are not intrinsic qualities of its smallest parts, but
averages over distributions governed by the laws of chance.
The theory of probability itself, which expresses these laws, is
much older; it sprang not from the needs of natural science but from
gambling and other, more or less disreputable, human activities.
The first use of probability considerations in science was made
by Gauss in his theory of experimental errors. I can suppose
that every scientist knows the outlines of it, yet I have to dwell
upon it for a few moments because of its fundamental and
somewhat paradoxical aspect. It has a direct bearing on the
method of inference by induction which is the backbone of all
human experience. I have said that in my opinion the significance
of this method in science consists in the establishment of a
code of rules which form the constitution of science itself. Now
the curious situation arises that this code of rules, which ensures
the possibility of scientific laws, in particular of the cause-effect
relation, contains besides many other prescriptions those related
to observational errors, a branch of the theory of probability.
This shows that the conception of chance enters into the very
first steps of scientific activity, in virtue of the fact that no
observation is absolutely correct. I think chance is a more
fundamental conception than causality; for whether in a concrete
case a cause-effect relation holds or not can only be judged
by applying the laws of chance to the observations.
The history of science reveals a strong tendency to forget this.
When a scientific theory is firmly established and confirmed, it
changes its character and becomes a part of the metaphysical
background of the age: a doctrine is transformed into a dogma.
In fact no scientific doctrine has more than a probability value
and is open to modification in the light of new experience.
After this general remark, let us return to the question how
the notion of chance and probability entered physics itself.
As early as 1738 Daniel Bernoulli suggested the interpretation
of gas pressure as the effect of the impact of numerous particles
on the wall of the container. The actual development of the
kinetic theory of gases was, however, accomplished much later,
in the nineteenth century.
The object of the theory was to explain the mechanical and
thermodynamical properties of gas from the average behaviour
of the molecules. For this purpose a statistical hypothesis was
made, often called the 'principle of molecular chaos': for an
'ideal' gas in a closed vessel and in the absence of external forces all
positions and all directions of velocity of the molecules are
equally probable.
Applied to a monatomic gas (the atoms are supposed to be
mass points), this leads at once to a relation between volume $V$,
pressure $p$, and mean energy $U$ (see Appendix, 11)
$$Vp = \tfrac{2}{3}U, \tag{6.1}$$
if the pressure $p$ is interpreted as the total momentum transferred
to the wall, per unit time and unit area, by the impact of the
molecules. One has now
only to assume that the energy $U$ is a measure of temperature
to obtain Boyle's law of the isotherms. Then it follows from
thermodynamics that $U$ is proportional to the absolute temperature
(see Appendix, 9); one has
$$U = \tfrac{3}{2}RT, \qquad pV = RT, \tag{6.2}$$
where $R$ is the ordinary gas constant. This is the complete
equation of state (combined Boyle-Charles law), and one sees
that the specific heat of a monatomic gas for constant volume
is $\tfrac{3}{2}R$.
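Relation (6.1) can be checked by a small numerical experiment. The following is a minimal sketch in Python, not part of the lecture: it assumes a cubical vessel, point particles that all have the same speed, and directions drawn uniformly over the sphere (the function name, the speed, and the mass are illustrative choices). For a cube the momentum delivered to the walls gives $pV = \sum m\xi_x^2$, while $U = \sum \tfrac{1}{2}m\xi^2$, so the ratio should come out near 2/3 whatever speed is chosen.

```python
import random


def pv_over_u(n_molecules=100_000, speed=500.0, mass=6.6e-26, seed=1):
    """Estimate pV / U for point particles with isotropic velocity directions.

    Assumptions (illustrative, not from the text): every molecule has the
    same speed, and the vessel is a cube, so that p*V = sum(m * xi_x**2).
    """
    rng = random.Random(seed)
    pv = 0.0  # accumulates m * xi_x**2 (momentum transfer to the walls)
    u = 0.0   # accumulates (1/2) * m * xi**2 (kinetic energy)
    for _ in range(n_molecules):
        # For a uniformly random direction, cos(theta) is uniform on [-1, 1].
        cos_theta = rng.uniform(-1.0, 1.0)
        xi_x = speed * cos_theta
        pv += mass * xi_x * xi_x
        u += 0.5 * mass * speed * speed
    return pv / u


ratio = pv_over_u()
print(f"pV/U = {ratio:.4f}  (kinetic theory: 2/3 = {2 / 3:.4f})")
```

Only the isotropy of the directions enters the result; the distribution of speeds is left open, which is why temperature has to be brought in by a separate assumption.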
I have mentioned these things only to stress the point that the
kinetic theory right from the beginning produced verifiable
numerical results in abundance. There could be no doubt that
it was right, but what did it really mean?
How is it possible that probability considerations can be
superimposed upon the deterministic laws of mechanics without
a clash?
These laws connect the state at a time $t$ to the initial state,
at time $t_0$, by definite equations. They involve, however, no
restriction on the initial state. This has to be determined by
observation in every concrete case. But observations are not
absolutely accurate; the results of measurements will suffer
scattering according to Gauss's rules of experimental errors. In
the case of gas molecules, the situation is extreme; for owing
to the smallness and excessive number of the molecules, there is
almost perfect ignorance of the initial state.
The only facts known are the geometrical restriction of the
position of each molecule by the walls of the vessel, and some
physical quantities of a crude nature, like the resultant pressure
and the total energy: very little indeed in view of the number of
molecules (about $10^{19}$ per c.c.).
Hence it is legitimate to apply probability considerations to
the initial state, for instance the hypothesis of molecular chaos.
The statistical behaviour of any future state is then completely
determined by the laws of mechanics. This is in particular the
case for 'statistical equilibrium', when the observable properties
are independent of time; in this case any later state must have
the same statistical properties as the initial state (e.g. it must
also satisfy the condition of molecular chaos). How can this be
mathematically formulated? It is convenient to use the equations
of motion in Hamilton's canonical form (4.3, p. 18). The
distribution is described by a function $f(t, q_1, q_2, \dots, q_n, p_1, p_2, \dots, p_n)$
of all coordinates and momenta, and of time, such that $f\,dp\,dq$
is the probability for finding the system at time $t$ in a given
element $dp\,dq = dp_1 \dots dp_n\,dq_1 \dots dq_n$. One can interpret this
function as the density of a fluid in a $2n$-dimensional $pq$-space,
called 'phase space'; and, as no particles are supposed to disappear
or to be generated, this fluid must satisfy a continuity
equation, of the kind (4.5, p. 20), generalized for 2n dimensions,
namely (see Appendix, 3)
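$$\frac{\partial f}{\partial t} + \sum_{k=1}^{n} \left\{ \frac{\partial}{\partial q_k}(f\dot q_k) + \frac{\partial}{\partial p_k}(f\dot p_k) \right\} = 0. \tag{6.3}$$
This reduces in virtue of the canonical equations (4.3), p. 18,
to
$$\frac{\partial f}{\partial t} - [H, f] = 0, \tag{6.4}$$
where $[H, f]$ is an abbreviation, the so-called Poisson bracket,
namely
$$[H, f] = \sum_{k=1}^{n} \left( \frac{\partial H}{\partial q_k}\frac{\partial f}{\partial p_k} - \frac{\partial H}{\partial p_k}\frac{\partial f}{\partial q_k} \right). \tag{6.5}$$
On the other hand, the convective derivative defined for three
dimensions in (4.11, p. 21) may be generalized for $2n$ dimensions
thus:
$$\frac{df}{dt} = \frac{\partial f}{\partial t} + \sum_{k=1}^{n} \left( \frac{\partial f}{\partial q_k}\dot q_k + \frac{\partial f}{\partial p_k}\dot p_k \right) = \frac{\partial f}{\partial t} - [H, f]. \tag{6.6}$$
Then (6.4) says that in virtue of the mechanical equations
$$\frac{df}{dt} = 0. \tag{6.7}$$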
The result expressed by the equivalent equations (6.4) and
(6.7) is called Liouville's theorem. The density function is an
integral of the canonical equations, i.e. $f = \text{const.}$ along any
trajectory in phase space; in other words, the substance of the
fluid is carried along by the motion in phase space, so that the
integral
$$I = \int f\,dp\,dq \tag{6.8}$$
over any part of the substance moving in phase space is independent
of time.
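Liouville's theorem can be made concrete in the simplest case. The sketch below is an illustration under stated assumptions, not the general proof: it takes a single harmonic oscillator (the mass, the frequency, and the helper names are chosen only for the example), moves the corners of a small patch of phase space with the exact solution of the canonical equations, and compares the area of the patch before and after. Because the flow of this particular system is linear, the patch remains a polygon and its area can be computed exactly.

```python
import math


def oscillator_flow(q, p, t, mass=1.0, omega=2.0):
    """Exact solution of the canonical equations for H = p^2/2m + m w^2 q^2/2."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    return (q * c + p * s / (mass * omega),
            p * c - mass * omega * q * s)


def polygon_area(vertices):
    """Area of a polygon in the (q, p) plane by the shoelace formula."""
    twice_area = 0.0
    for (q1, p1), (q2, p2) in zip(vertices, vertices[1:] + vertices[:1]):
        twice_area += q1 * p2 - q2 * p1
    return abs(twice_area) / 2.0


def energy(q, p, mass=1.0, omega=2.0):
    """Value of the Hamiltonian at the phase point (q, p)."""
    return p * p / (2.0 * mass) + 0.5 * mass * omega ** 2 * q * q


# A small rectangular patch of phase space and its image after time t = 0.9.
patch = [(1.0, 0.0), (1.5, 0.0), (1.5, 0.7), (1.0, 0.7)]
moved = [oscillator_flow(q, p, t=0.9) for q, p in patch]

print("area before:", polygon_area(patch))
print("area after: ", polygon_area(moved))
```

The patch is rotated and sheared, but its area is unchanged, and each corner keeps its energy; any density of the form "function of the energy" is therefore carried along unchanged.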
Any admissible distribution function, namely one for which
the probabilities of a configuration at different times are compatible
with the deterministic laws of mechanics, must be an
integral of motion, satisfying the partial differential equation
(6.4). For a closed system, i.e. one which is free from external
disturbances (like a gas in a solid vessel), $H$ is explicitly independent
of time. The special case of statistical equilibrium
corresponds to certain time-independent solutions of (6.4), i.e.
functions $f$ satisfying
$$[H, f] = 0. \tag{6.9}$$
An obvious integral of this equation is $f = \Phi(H)$, where $\Phi$ indicates
an arbitrary function. This case plays a prominent part
in statistical mechanics.
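That any function of the energy alone is a stationary solution can be verified numerically. The following sketch assumes one degree of freedom, an anharmonic Hamiltonian picked only as an example, and the sign convention $[H, f] = \partial H/\partial q\,\partial f/\partial p - \partial H/\partial p\,\partial f/\partial q$; the derivatives are approximated by central differences, so the bracket vanishes only up to the discretization error.

```python
import math


def hamiltonian(q, p):
    """An anharmonic oscillator, chosen only as an example: H = p^2/2 + q^4/4."""
    return 0.5 * p * p + 0.25 * q ** 4


def poisson_bracket(h, f, q, p, step=1e-5):
    """[H, f] = dH/dq * df/dp - dH/dp * df/dq, by central differences."""
    dh_dq = (h(q + step, p) - h(q - step, p)) / (2 * step)
    dh_dp = (h(q, p + step) - h(q, p - step)) / (2 * step)
    df_dq = (f(q + step, p) - f(q - step, p)) / (2 * step)
    df_dp = (f(q, p + step) - f(q, p - step)) / (2 * step)
    return dh_dq * df_dp - dh_dp * df_dq


def f_of_energy(q, p):
    """f = Phi(H) with Phi(x) = exp(-x): a function of the energy alone."""
    return math.exp(-hamiltonian(q, p))


def f_not_of_energy(q, p):
    """A density that depends on q alone, hence not of the form Phi(H)."""
    return math.exp(-q * q)


stationary = poisson_bracket(hamiltonian, f_of_energy, 0.8, 0.5)
drifting = poisson_bracket(hamiltonian, f_not_of_energy, 0.8, 0.5)
print("[H, Phi(H)]  =", stationary)
print("[H, g(q)]    =", drifting)
```

The first bracket is zero within rounding, the second is not: a density that is not a function of the integrals of motion changes in time.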
Yet before continuing with these very general considerations
we had better return to the ideal gases and consider the kinetic
theory in more detail. In an ideal gas, the particles (atoms,
molecules) are supposed to move independently of one another.
Hence the function $f(p, q)$ is a product of $N$ functions $f(\mathbf{x}, \boldsymbol{\xi})$
each belonging to a single particle and all formally identical;
$\mathbf{x}$ is the position vector and $\boldsymbol{\xi} = (1/m)\mathbf{p}$ the velocity vector. Then
$f\,d\mathbf{x}\,d\boldsymbol{\xi}$ is the probability of finding a particle at time $t$ at a specified
element of volume and velocity.
In the case where no external forces are present ($\partial H/\partial t = 0$,
$\partial H/\partial \mathbf{x} = 0$) the Hamiltonian reduces to the kinetic energy,
$$H = \tfrac{1}{2}m\xi^2 = (1/2m)\,p^2.$$
The hypothesis of molecular chaos is expressed by assuming $f$ to
be a function of $\xi^2$ alone. This is indeed a solution of (6.9), as it
can be written in the form $f = \Phi(H)$ mentioned above. No
other solution exists if the gas as a whole is homogeneous and
isotropic (i.e. all positions and directions are physically equivalent;
see Appendix, 12).
The determination of the velocity distribution function $f(\xi^2)$
was recognized by Maxwell as a fundamental problem of kinetic
theory: it is the quantitative formulation of the 'law of chance'
for this case. He gave several solutions; his first and simplest
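reasoning was this: Suppose the three components of velocity
$\xi_1, \xi_2, \xi_3$ to be statistically independent, then
$$f(\xi^2) = f(\xi_1^2 + \xi_2^2 + \xi_3^2) = \phi(\xi_1)\,\phi(\xi_2)\,\phi(\xi_3). \tag{6.10}$$
This functional equation has the only solution (see Appendix, 13)
$$f = e^{\alpha - \beta\xi^2}, \tag{6.11}$$
where $\alpha, \beta$ are constants.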
This is Maxwell's celebrated law of velocity distribution.
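The step from the assumed independence of the components to the exponential law can be sketched in a few lines. What follows is one possible route, assuming that $f$ is positive and continuous; the auxiliary function $g$ is introduced here only for the purpose of the sketch and is not notation from the lecture.

```latex
% Sketch: why factorization into components forces an exponential
% (assumes f > 0 and continuous; g is an auxiliary function).
\begin{align*}
f(\xi_1^2+\xi_2^2+\xi_3^2) &= \phi(\xi_1)\,\phi(\xi_2)\,\phi(\xi_3)
  && \text{(independence of the components)} \\
f(a) &= \phi(\sqrt{a})\,\phi(0)^2, \qquad f(0) = \phi(0)^3
  && \text{(put two, then three, components equal to zero)} \\
\frac{f(a+b+c)}{f(0)} &= \frac{f(a)}{f(0)}\,\frac{f(b)}{f(0)}\,\frac{f(c)}{f(0)}
  && (a=\xi_1^2,\ b=\xi_2^2,\ c=\xi_3^2) \\
g(a+b) &= g(a)+g(b), \qquad g(s) := \log\frac{f(s)}{f(0)}
  && \text{(take logarithms, put } c=0\text{)} \\
g(s) &= -\beta s \;\Longrightarrow\; f(s) = e^{\alpha-\beta s}, \qquad \alpha := \log f(0)
  && \text{(Cauchy's equation, continuous solutions)}
\end{align*}
```

The constant $\beta$ must be positive if $f$ is to be normalizable over all velocities.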
However, the derivation given is objectionable, as the supposed
independence of the velocity components is not obvious at all.
I have mentioned it because the latest proof (and as I think the
most satisfactory and rigorous and of the widest possible generalization)
of the distribution formula uses exactly this Maxwellian
argument, only applied to more suitable variables as we shall
presently see.
Maxwell, being aware of this weakness, gave several other
proofs which have been improved and modified by other authors.
Eventually it appears that there are two main types of argument:
the equilibrium proof and the dynamical proof. We shall
first consider the equilibrium proof in some detail.
Assume each molecule to be a mechanical system with coordinates
$q_1, q_2, \dots$ and momenta $p_1, p_2, \dots$, for which we write
simply $q, p$, and with a Hamiltonian $H(p, q)$. The interaction
between the molecules will be neglected. The total number n
and the total energy U of the assembly of molecules are
given.
In order to apply the laws of probability it is convenient to
reduce the continuous set of points $p, q$ in phase space to a discontinuous
enumerable set of volume elements. One divides
the phase space into $N$ small cells of volumes $\omega_1 V, \omega_2 V, \dots, \omega_N V$,
where $V$ is the total volume; hence
$$\omega_1 + \omega_2 + \dots + \omega_N = 1. \tag{6.12}$$
To each cell a value of the energy $H(p, q)$ can be attached, say
that corresponding to its centre; let these energies be $\epsilon_1, \epsilon_2, \dots, \epsilon_N$.
Now suppose the particles distributed over the cells so that there
are $n_1$ in the first cell, $n_2$ in the second, etc., but of course with
the restriction that the totals
$$n_1 + n_2 + \dots + n_N = n, \tag{6.13}$$
$$n_1\epsilon_1 + n_2\epsilon_2 + \dots + n_N\epsilon_N = U \tag{6.14}$$
are fixed. Liouville's theorem suggests that the probability of a
single molecule being in a given cell is proportional to its volume.
Making this assumption, one has to calculate the composite
probability $P$ for any distribution $n_1, n_2, \dots, n_N$ under the restrictions
(6.13) and (6.14).
This is an elementary problem of the calculus of probability
(see Appendix, 14) which can be solved in this way: First the
second condition (6.14) is omitted; then the probability of a
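given distribution $n_1, n_2, \dots, n_N$ is
$$P(n_1, n_2, \dots, n_N) = \frac{n!}{n_1!\,n_2! \cdots n_N!}\,\omega_1^{n_1}\omega_2^{n_2} \cdots \omega_N^{n_N}. \tag{6.15}$$
If this is summed over all $n_1, n_2, \dots, n_N$ satisfying (6.13), one obtains
by the elementary polynomial theorem
$$\sum_{n_1, n_2, \dots, n_N} P(n_1, n_2, \dots, n_N) = (\omega_1 + \omega_2 + \dots + \omega_N)^n = 1, \tag{6.16}$$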
because of (6.12) as it must be if P is a properly normalized
probability.
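The normalization of the composite probability can be confirmed by brute force for a small assembly. The sketch below is illustrative only: it enumerates every distribution of $n$ particles over a handful of cells, with cell weights chosen arbitrarily, and adds up the multinomial probabilities; the function names are not from the text.

```python
from itertools import product
from math import factorial, prod


def multinomial_probability(occupations, weights):
    """n! / (n_1! ... n_N!) * w_1**n_1 * ... * w_N**n_N for one distribution."""
    coefficient = factorial(sum(occupations))
    for n_k in occupations:
        coefficient //= factorial(n_k)  # each partial quotient is an integer
    return coefficient * prod(w ** n_k for w, n_k in zip(weights, occupations))


def total_probability(n, weights):
    """Sum the probability over all distributions with n_1 + ... + n_N = n."""
    total = 0.0
    for occupations in product(range(n + 1), repeat=len(weights)):
        if sum(occupations) == n:
            total += multinomial_probability(occupations, weights)
    return total


print(total_probability(6, [0.5, 0.3, 0.2]))

# With equal cells the uniform distribution is the most probable one.
equal = [1 / 3, 1 / 3, 1 / 3]
print(multinomial_probability((2, 2, 2), equal),
      multinomial_probability((3, 2, 1), equal))
```

The enumeration grows very quickly with the number of particles and cells, which is one reason why asymptotic approximations are used for a real gas.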
It is well known that the polynomial coefficients $n!/(n_1!\,n_2! \cdots n_N!)$
have a sharp maximum for $n_1 = n_2 = \dots = n_N$; that means, if
all cells have equal volumes ($\omega_1 = \omega_2 = \dots = \omega_N$) the uniform
distribution would have an overwhelming probability. Yet this
is modified by the second condition (6.14) which we have now
to take into account. The simplest method of doing this proceeds
in three approximations which seem to be crude, but are perfectly
satisfactory for very large numbers of particles ($n \to \infty$). The
first approximation consists in neglecting all distributions of
comparatively small $n_1, n_2, \dots, n_N$; then the $n_k$ can be treated as
continuous variables. The second approximation consists in
replacing the exact expression (6.15) by its asymptotic value for
large $n_k$ by using Stirling's formula $\log(n!) \to n(\log n - 1)$ (see
Appendix, 14), and the result is
$$\log P = -n_1\log n_1 - n_2\log n_2 - \dots - n_N\log n_N + \text{const.} \tag{6.17}$$
The third approximation consists in the following assumption:
the actual behaviour of a gas in statistical equilibrium is deter
mined solely by the state of maximum probability; all other
states have so little chance to appear that they can be neglected.
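How crude the second approximation is can be judged directly. The following sketch compares $\log(n!)$, computed through the log-gamma function of the Python standard library, with the leading Stirling term $n(\log n - 1)$; the function name is chosen only for the illustration.

```python
import math


def stirling_relative_error(n):
    """Relative error of n*(log n - 1) as an approximation to log(n!)."""
    exact = math.lgamma(n + 1)          # log(n!) without forming n!
    approx = n * (math.log(n) - 1.0)    # leading term of Stirling's formula
    return (exact - approx) / exact


for n in (10, 1000, 10 ** 6):
    print(n, stirling_relative_error(n))
```

The error is noticeable for ten particles and negligible for a million, and the occupation numbers of a real gas are far larger still.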
Hence one has to determine the maximum of log P given by
(6.17) under the conditions (6.13) and (6.14). Using elementary
calculus this leads at once to
$$n_k = e^{\alpha - \beta\epsilon_k}, \tag{6.18}$$
where $\alpha$ and $\beta$ are two constants which are necessary in order to
satisfy the conditions (6.13), (6.14). Yet these constants play a
rather different part.
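The determination of the two constants from the two conditions can be sketched as follows. This is a minimal illustration, assuming equal cells, a short list of made-up cell energies, and a simple bisection on $\beta$ (the mean energy falls monotonically as $\beta$ grows); none of the names or numbers come from the text. The last lines compare $-\sum n_k\log n_k$ for the resulting distribution with a neighbouring distribution that has the same totals.

```python
import math


def boltzmann_occupations(energies, n_total, u_total, beta_lo=-50.0, beta_hi=50.0):
    """Find alpha, beta such that n_k = exp(alpha - beta*e_k) has the given totals.

    Assumes the bracket [beta_lo, beta_hi] contains the solution and that
    beta * e_k stays small enough for exp() not to overflow.
    """
    def mean_energy(beta):
        weights = [math.exp(-beta * e) for e in energies]
        return sum(e * w for e, w in zip(energies, weights)) / sum(weights)

    target = u_total / n_total
    for _ in range(200):
        beta = 0.5 * (beta_lo + beta_hi)
        if mean_energy(beta) > target:
            beta_lo = beta   # mean energy too high: beta must increase
        else:
            beta_hi = beta
    beta = 0.5 * (beta_lo + beta_hi)
    alpha = math.log(n_total / sum(math.exp(-beta * e) for e in energies))
    occupations = [math.exp(alpha - beta * e) for e in energies]
    return alpha, beta, occupations


def log_probability(occupations):
    """log P up to a constant, in the Stirling approximation."""
    return -sum(n * math.log(n) for n in occupations)


energies = [0.0, 1.0, 2.0, 3.0, 4.0]          # hypothetical cell energies
alpha, beta, occupations = boltzmann_occupations(energies, 1000.0, 1200.0)

# A neighbouring distribution with the same particle number and energy.
shifted = [n + d for n, d in zip(occupations, [5.0, -10.0, 5.0, 0.0, 0.0])]
print(beta, log_probability(occupations), log_probability(shifted))
```

Here $\alpha$ merely fixes the total number, while $\beta$ fixes the mean energy per particle, which already hints at the different parts the two constants play.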
If one has to do with a mixture of two gases $A$ and $B$ with given
numbers $n^{(A)}$ and $n^{(B)}$, one gets two conditions of the type (6.13)
but only one of the type (6.14), expressing that the total energy
is given.
Hence one obtains
$$n_k^{(A)} = e^{\alpha^{(A)} - \beta\epsilon_k^{(A)}}, \qquad n_k^{(B)} = e^{\alpha^{(B)} - \beta\epsilon_k^{(B)}}, \tag{6.19}$$
with two different constants $\alpha^{(A)}$ and $\alpha^{(B)}$, but only one $\beta$. Therefore
$\beta$ is the parameter of thermal equilibrium between the two
constituents and must depend only on temperature.
Indeed, if one now calculates the mean energy U and the
mean pressure p, one can apply thermodynamics and sees
easily that the second law is satisfied if
$$\beta = \frac{1}{kT}, \tag{6.20}$$
where T is the absolute temperature and k a constant, called
Boltzmann's constant. At the same time it appears that the
entropy is given by
$$S = k\log P = -k\sum_k n_k\log n_k. \tag{6.21}$$
All these results are mainly due to Boltzmann; in particular (6.18)
is called Boltzmann's distribution law. It obviously contains
Maxwell's law (6.11) as a special case, namely for mass points.
We have now to ask: Is this consideration which I called the
equilibrium proof of the distribution law really satisfactory?
One objection can be easily dismissed, namely that the
approximations made are too crude. They can be completely
avoided. Darwin and Fowler have shown that one can give a
rigorous expression for the mean value of any physical quantity
in terms of complex integrals, containing the so-called 'partition
function' (see Appendix, 15)
$$\omega_1 z^{\epsilon_1} + \omega_2 z^{\epsilon_2} + \dots + \omega_N z^{\epsilon_N} = F(z). \tag{6.22}$$
No distribution is neglected and no use is made of the Stirling
formula. Yet in the limit $n \to \infty$, all results are exactly the same
as given by the Boltzmann distribution function. Although this
method is extremely elegant and powerful, it does not introduce
any essential new feature in regard to the fundamental question
of statistical mechanics.
Another objection goes deeper: can the molecules of a gas
really be treated as independent?
There are numerous phenomena which show they are not,
even if one considers only statistical equilibrium. For no real
gas is 'ideal', i.e. satisfies Boyle's law rigorously, and the deviations
increase with pressure, ending in a complete collapse, condensation.
This proves the existence of long-range attractive
forces between the molecules. The statistical method described
above is unable to deal with them. The first attempt to correct
this was the celebrated theory of van der Waals, which was
followed by many others. I shall later describe in a few words
the modern version of these theories, which is, from a certain
standpoint, rigorous and satisfactory.
More serious are the interactions revealed by non-equilibrium
phenomena: viscosity, conduction of heat, diffusion. They can
all be qualitatively understood by supposing that each molecule
has a finite volume, or more correctly that two molecules have a
short-range repulsive interaction which prohibits a close
approach. This assumption has the consequence that there
exists an effective cross-section for a collision, hence a mean free
path for the straight motion of a molecule. The coefficients of
the three phenomena mentioned can be reduced, by elementary
considerations, to the free path, and the results, as far as they go,
are in good agreement with observations.
All this is very good physics producing in a simple and intuitive
way formulae which give the correct order of magnitude of
different correlated effects.
But for the problem of a rigorous kinetic theory, which takes
account of the interactions and is valid not only for equilibria,
but also for motion, these considerations have only the value of
a preliminary reconnoitring. The question is: How can one
derive the hydrodynamical equations of visible motion together
with the phenomena of transformation and conduction of heat
and, for a mixture, of diffusion?
This is an ambitious programme. For such a theory must
include the result that a gas left to itself tends to equilibrium.
Hence it must lead to irreversibility, although the laws of
ordinary reversible mechanics are assumed to hold for the molecules.
How is this possible? Further, is the equilibrium obtained
in this way the same as that derived directly, say by the method
of the most probable distribution?
To begin with the last question. Its answer represents what I
have called above the dynamical proof of the distribution law
for equilibrium.
The formulation of the non-equilibrium theory of gases is due
to Boltzmann. One can obtain his fundamental equation by
generalizing one of the equivalent formulae (6.4) or (6.7). These
are based on the assumption that each molecule moves independently
of the others according to the laws of mechanics, and they
describe how the distribution $f$ of an assembly of such particles
develops in time. Now the assumption of independence is
dropped, hence the expression on the left-hand side of (6.4) or
(6.7) is not zero; denoting by $f(1)$ the probability density for a
certain particle 1, one can write
$$\frac{df(1)}{dt} = \frac{\partial f(1)}{\partial t} - [H, f(1)] = C(1), \tag{6.23}$$
where $C(1)$ represents the influence of the other molecules on the
particle 1; it is called the 'collision integral', as Boltzmann calculates
it only for the case where the orbit of the centre of a
particle can be described as straight and uniform motion
interrupted by sudden collisions. For this purpose a new and
independent application of the laws of probability is made by
assuming that the probability of a collision between two particles 1
and 2 is proportional to the product of the probabilities of finding
them in a given configuration, $f(1)f(2)$. If one then expresses that
some molecules are thrown by a collision out of a given element
of phase space, others into it, one obtains (see Appendix, 16)
$$C(1) = \iint \{ f'(1)f'(2) - f(1)f(2) \}\, |\boldsymbol{\xi}_1 - \boldsymbol{\xi}_2|\, d\mathbf{b}\, d\boldsymbol{\xi}_2, \tag{6.24}$$
where $f(2)$ is the same function as $f(1)$, but taken for the particle
2 as argument. $f(1)$, $f(2)$ refer to the motion of two particles
'before' the collision, $f'(1)$, $f'(2)$ to that 'after' the collision; one
has to integrate over all velocities of the particle 2, ($d\boldsymbol{\xi}_2$), and over
the 'cross-section' of the collision, ($d\mathbf{b}$), which I shall not define
in detail. 'Before' and 'after' the collision mean the asymptotic
straight and uniform motions of approach and separation; it is
clear that if the former is given, the latter is completely determined
for any law of interaction force: it is the two-body problem
of mechanics. Hence the velocities of both particles $\boldsymbol{\xi}_1', \boldsymbol{\xi}_2'$ after
the collision are known functions of those before the collision
$\boldsymbol{\xi}_1, \boldsymbol{\xi}_2$, and (6.23) assumes the form of an integro-differential
equation for calculating $f$.
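The statement that the velocities after a collision are known functions of those before can be illustrated for one particular law of force. The sketch below assumes two equal, perfectly elastic hard spheres, a special case chosen for simplicity and not the general two-body problem; the function name and the numerical values are illustrative. Only the velocity components along the line of centres are exchanged.

```python
import math


def dot(a, b):
    """Scalar product of two vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))


def collide_hard_spheres(xi1, xi2, line_of_centres):
    """Velocities after an elastic collision of two equal hard spheres.

    `line_of_centres` is the unit vector joining the centres at contact.
    Only the velocity components along this line are exchanged.
    """
    transfer = dot([a - b for a, b in zip(xi1, xi2)], line_of_centres)
    xi1_after = [a - transfer * n for a, n in zip(xi1, line_of_centres)]
    xi2_after = [b + transfer * n for b, n in zip(xi2, line_of_centres)]
    return xi1_after, xi2_after


xi1, xi2 = [3.0, 0.0, 0.0], [-1.0, 2.0, 0.0]
n = [math.cos(0.4), math.sin(0.4), 0.0]
xi1_after, xi2_after = collide_hard_spheres(xi1, xi2, n)

momentum_before = [a + b for a, b in zip(xi1, xi2)]
momentum_after = [a + b for a, b in zip(xi1_after, xi2_after)]
energy_before = dot(xi1, xi1) + dot(xi2, xi2)
energy_after = dot(xi1_after, xi1_after) + dot(xi2_after, xi2_after)
print(momentum_before, momentum_after)
print(energy_before, energy_after)
```

Total momentum and total kinetic energy are unchanged, whatever the direction of the line of centres; these are the quantities that reappear later as 'collision invariants'.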
This equation has been the object of thorough mathematical
investigations, first by Boltzmann and Maxwell, and later by
modern writers. Hilbert has indicated a systematic method of
solution in which each step of approximation leads to an integral
equation of the normal (so-called Fredholm) type. Enskog and
Chapman have developed this method, with some modifications,
in detail. There is an admirable book by Chapman and Cowling
which represents the whole theory of non-homogeneous gases as
a consequence of the equation (6.23). I can only mention a few
points of these important investigations.
The first is concerned with the question of equilibrium. Does
the equation (6.23) really indicate an irreversible approach from
any initial state to a homogeneous equilibrium? This is in fact
the case, and a very strange result indeed: the metamorphosis
of reversible mechanics into irreversible thermodynamics with
the help of probability. But before discussing this difficult
question, I shall indicate the mathematical proof.
From the statistics of equilibrium it is known how the entropy
is connected with probability, namely by equation (6.21).
Replace here the discontinuous $n_k$ by the continuous $f$ and
summation by integration over the phase space, and you obtain
$$S = -k\int f(1)\log f(1)\,dq\,dp. \tag{6.25}$$
If one now calculates the time derivative $dS/dt$ by substituting
$df(1)/dt$ from (6.23), and assuming no external interference, one
finds (see Appendix, 17)
$$\frac{dS}{dt} \ge 0, \tag{6.26}$$
where the $=$ sign holds only if $f(1)$ is independent of the space
coordinates and satisfies, as a function of the velocities,
$$f(1)f(2) = f'(1)f'(2) \tag{6.27}$$
identically for any collision.
identically for any collision.
The result expressed by (6.26) is often quoted as Boltzmann's
$H$-theorem (because he used the symbol $H$ for $-S/k$). Boltzmann
claimed that it gave the statistical explanation of
thermodynamical irreversibility.
Equation (6.27) is a functional equation which determines $f$
as a function of 'collision invariants', like total energy and total
momentum. If the gas is at rest as a whole, the only solution of
(6.27) is Maxwell's (or Boltzmann's) distribution law:
$$f = e^{\alpha - \beta\epsilon}, \qquad H(p, q) = \epsilon. \tag{6.28}$$
This is what I called the dynamical proof, and is a most remarkable
result indeed; for it has been derived from the mechanism of
collisions, which was completely neglected in the previous
equilibrium methods. This point needs elucidation.
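The condition (6.27) can be tried out on a single collision. The sketch below assumes equal masses and an elastic encounter, described in the simplest way: the velocity of the centre of mass is kept and the relative velocity is turned into a new direction without change of magnitude. The trial function `quartic` and all numerical values are hypothetical, chosen only to show a distribution that fails the test.

```python
import math


def scatter(xi1, xi2, direction):
    """Elastic collision of equal masses: the centre-of-mass velocity is kept
    and the relative velocity is turned into `direction` (a unit vector)
    without change of magnitude."""
    centre = [(a + b) / 2 for a, b in zip(xi1, xi2)]
    rel_speed = math.sqrt(sum((a - b) ** 2 for a, b in zip(xi1, xi2)))
    xi1_after = [c + 0.5 * rel_speed * e for c, e in zip(centre, direction)]
    xi2_after = [c - 0.5 * rel_speed * e for c, e in zip(centre, direction)]
    return xi1_after, xi2_after


def maxwell(xi, alpha=0.0, beta=0.7, mass=1.0):
    """f = exp(alpha - beta * kinetic energy)."""
    return math.exp(alpha - beta * 0.5 * mass * sum(x * x for x in xi))


def quartic(xi):
    """A trial distribution that is isotropic but not Maxwellian."""
    return math.exp(-sum(x * x for x in xi) ** 2)


xi1, xi2 = [2.0, 0.0, 0.0], [0.0, 0.0, 0.0]
xi1_after, xi2_after = scatter(xi1, xi2, [0.0, 1.0, 0.0])

for f in (maxwell, quartic):
    print(f.__name__, f(xi1) * f(xi2), f(xi1_after) * f(xi2_after))
```

For the exponential law the product is unchanged because the exponent is the total kinetic energy, which the collision conserves; for the quartic trial function the two products differ.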
Before doing so, let me mention that the hydrothermal
equations of a gas, i.e. the equations of continuity, of motion and
of conduction of heat, are obtained from Boltzmann's equation
(6.23) by a simple formal process (multiplication with $1$, $\boldsymbol{\xi}$, and
$\tfrac{1}{2}m\xi^2$, followed by integration over all velocities) in terms of the
stress tensor $T$ (you will remember Cauchy's general formula
(4.9)), which itself is expressed in terms of the distribution
function $f$. To give these equations a real meaning, one has to
expand $f$ in terms of physical quantities, and this is the object of
the theories contained in Chapman and Cowling's book. In this
way a very satisfactory theory of hydro-thermodynamics of gases,
including viscosity, conduction of heat, and diffusion, is obtained.
STATISTICAL MECHANICS
I remember that forty years ago when I began to read scientific
literature there was a violent discussion raging about statistical
methods in physics, especially the $H$-theorem. The objections
raised have been classified into two types, one concerning reversibility,
the other periodicity.
Loschmidt, like Boltzmann, a member of the Austrian school,
formulated the reversibility objection in this way: by reversing
all velocities you get from any solution of the mechanical
equations another one: how can the integral $S$, which depends
on the instantaneous situation, increase in both cases?
The periodicity objection is based on a theorem of the great
French mathematician Henri Poincaré, which states that every
mechanical system is, if not exactly periodic, at least quasi-periodic.
This follows from Liouville's theorem according to
which a given region in phase space moves without change of
volume and describes therefore a tube-shaped region of ever-increasing
length. As the total volume available is finite (it is
contained in the surface of maximum energy), this tube must
somewhere intersect itself, which means that final and initial
states come eventually near together.
Zermelo, a German mathematician, who worked on abstract
problems like the theory of Cantor's sets and transfinite numbers,
ventured into physics by translating Gibbs's work on
statistical mechanics into German. But he was offended by the
logical imperfections of this theory and attacked it violently.
He used in particular Poincaré's theorem to show how scandalous
the reasoning of the physicists was: they claimed to have
proved the irreversible increase of a mechanical quantity for
a system which returns after a finite time to its initial state with
any desired accuracy.
These objections were not quite futile, as they led two distinguished
physicists, Paul Ehrenfest and his wife Tatjana, to
investigate and clear up the matter beyond doubt in their well-known
article in vol. iv of the Mathematical Encyclopedia.
Today we hardly need to follow all the logical finesses of this
work. It suffices to point out that the objections are based on the
following misunderstanding. If we describe the behaviour of the
gas (we speak only of this simple case, as for no other case has
the $H$-theorem been proved until recently) by the equation (6.4),
taking for $H$ the Hamiltonian of the whole system, a function of
the coordinates and momenta of all particles, then $f$ is indeed
reversible and quasi-periodic, and no $H$-theorem can be proved.
Boltzmann's proof is based not on this equation, but on (6.23),
where now $H$ is the Hamiltonian of one single molecule undisturbed
by the others, and where the right-hand term is not
zero but equal to the collision integral $C(1)$. The latter is taken as
representing roughly the effect of all the other molecules;
'roughly', that means after some reasonable averaging. This
averaging is the expression of our ignorance of the actual microscopic
situation. Boltzmann's theorem says that this equation
mixing mechanical knowledge with ignorance of detail leads to
irreversibility. There is no contradiction between the two statements.
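The coexistence of exact reversibility and recurrence with an apparently irreversible averaged behaviour can be watched in a toy model. The sketch below uses Kac's ring model, which is not discussed in the lecture and is substituted here for the gas purely as an illustration: balls of two colours sit on a ring, every ball moves one site per step, and a ball changes colour whenever it crosses one of a few randomly marked edges. All names and parameters are illustrative assumptions. The exact dynamics returns to the initial state after two revolutions, yet the excess of one colour over the other, the only thing a coarse observer sees, first decays towards zero.

```python
import random


def kac_ring_step(colours, markers):
    """Move every ball one site clockwise; flip its colour on a marked edge."""
    n = len(colours)
    new = [0] * n
    for i in range(n):
        c = colours[i]
        if markers[i]:          # edge between site i and site i + 1 is marked
            c = -c
        new[(i + 1) % n] = c
    return new


def imbalance(colours):
    """(number of white balls - number of black balls) / total."""
    return sum(colours) / len(colours)


def kac_ring_history(n_sites=1000, n_markers=100, seed=0):
    """Imbalance as a function of time over two full revolutions."""
    rng = random.Random(seed)
    markers = [False] * n_sites
    for i in rng.sample(range(n_sites), n_markers):
        markers[i] = True
    colours = [1] * n_sites     # start with all balls white
    history = [imbalance(colours)]
    for _ in range(2 * n_sites):
        colours = kac_ring_step(colours, markers)
        history.append(imbalance(colours))
    return history


history = kac_ring_history()
print("start:", history[0])
print("after 50 steps:", history[50])
print("after two revolutions:", history[-1])
```

An averaged equation for the imbalance, which treats the chance of crossing a marker as independent of the past, predicts a steady decay and nothing else; the recurrence belongs only to the exact description, and for a ring of macroscopic size it lies far beyond any time of observation.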
But there arises the other question whether such a modification
of the fundamental equation is justified. We shall see presently
that it is indeed, in a much wider sense than that claimed by
Boltzmann, namely not only for a gas, but for any substance
which can be described by a mechanical model. We have therefore
now to take up the question of how statistical methods can
be applied to general mechanical systems. Without such a
theory, one cannot even treat the deviations from the so-called
ideal behaviour of gases (Boyle's law), which appear at high
pressure and low temperature, and which lead to condensation.
Theories like that of van der Waals have obviously only preliminary
character. What is needed is a general and well-founded
formalism covering the gaseous, liquid, and solid states,
under all kinds of external forces.
For the case of statistical equilibrium, this formalism was
supplied by Willard Gibbs's celebrated book on Statistical
Mechanics (1901), which has proved to be extremely successful
in its applications (see Appendix, 18). The gist of Gibbs's idea
is to apply Boltzmann's results for a real assembly of many
equal molecules to an imaginary or 'virtual' assembly of many
copies of the system under consideration, and to postulate that
the one system under observation will behave like the average
calculated for the assembly. Before criticizing this assumption,
let us have a glimpse of Gibbs's procedure. He starts from
Liouville's theorem (6.4) and considers especially the case of
equilibrium where the distribution function $f$ of his virtual assembly
has to satisfy equation (6.9). He states that $f = \Phi(H)$ is a solution
(as we have seen) and he chooses two particular forms of
the function $\Phi$. The first is
$$f = \Phi(H) = \begin{cases} \text{const.}, & \text{if } E < H < E + \Delta E, \\ 0 & \text{outside of this interval,} \end{cases} \tag{6.29}$$
where $E$ is a given energy and $\Delta E$ a small interval of energy.
(In modern notation one could write $\Phi(H) = \delta(H - E)$, where
$\delta$ is Dirac's symbolic function.) The corresponding distribution
he calls microcanonical.
The second form is just that of Maxwell-Boltzmann,
$$f = e^{\alpha - \beta E}, \qquad H(p, q) = E, \tag{6.30}$$
and the corresponding distribution is called canonical. Gibbs shows
that both assumptions lead to the same results for the averages
of physical quantities. But the canonical is preferable, as it is
simpler to handle. $\beta$ turns out again to be equal for systems in
thermal equilibrium; if one puts $\beta = 1/kT$ the formal relations
between the averages constructed with (6.30) are a true replica
of thermodynamics. For instance, the normalization condition
for the probability
$$\int f\,dp\,dq = \int e^{\alpha - \beta E}\,dp\,dq = 1 \tag{6.31}$$
can be written
$$F = \frac{\alpha}{\beta} = -kT\log Z, \qquad Z = \iint e^{-E/kT}\,dp\,dq. \tag{6.32}$$
This $F$ plays the part of Helmholtz's free energy. The integral
$Z$, today usually called the 'partition function', depends, apart
from the energy $E$, on molar parameters like the volume $V$.
All physical properties can be obtained by differentiation, e.g.
entropy $S$ and pressure $p$ by
$$S = -\frac{\partial F}{\partial T}, \qquad p = -\frac{\partial F}{\partial V}. \tag{6.33}$$
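How thermodynamics comes out of the partition function by mere differentiation can be tried on the simplest case. The sketch below assumes a single point particle without interaction in a vessel of volume $V$, units in which Boltzmann's constant is 1, and the closed form of $Z$ that results from integrating $e^{-p^2/2mkT}$ over the momenta (constant factors are dropped); the derivatives are taken numerically. The function names are illustrative.

```python
import math

K_B = 1.0  # Boltzmann's constant in arbitrary units (assumption)


def log_z(temperature, volume, n_particles=1, mass=1.0):
    """log Z for point particles without interaction.

    Integrating exp(-p^2 / 2mkT) over the momenta and 1 over the vessel gives
    Z = [V * (2*pi*m*k*T)**1.5]**N, constant factors omitted.
    """
    per_particle = (math.log(volume)
                    + 1.5 * math.log(2 * math.pi * mass * K_B * temperature))
    return n_particles * per_particle


def free_energy(temperature, volume):
    """F = -kT log Z."""
    return -K_B * temperature * log_z(temperature, volume)


def entropy(temperature, volume, step=1e-6):
    """S = -dF/dT by a central difference."""
    return -(free_energy(temperature + step, volume)
             - free_energy(temperature - step, volume)) / (2 * step)


def pressure(temperature, volume, step=1e-6):
    """p = -dF/dV by a central difference."""
    return -(free_energy(temperature, volume + step)
             - free_energy(temperature, volume - step)) / (2 * step)


T, V = 2.0, 5.0
p = pressure(T, V)
U = free_energy(T, V) + T * entropy(T, V)
print("p =", p, " kT/V =", K_B * T / V)
print("U =", U, " (3/2) kT =", 1.5 * K_B * T)
```

The omitted constant factors in $Z$ shift the entropy by a constant but affect neither the pressure nor the energy, so the ideal gas law and the mean energy per particle are recovered.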
This formalism has been amazingly successful in treating thermo-mechanical
and also thermo-chemical properties. For instance,
the theory of real (non-ideal) gases is obtained by writing
$$E = H(p, q) = K(p) + U(q), \tag{6.34}$$
where $K$ is the kinetic and $U$ the potential energy; the latter
depends on the mutual interactions of the molecules. As $K$ is
quadratic in the $p$, the corresponding integration in $Z$ is easily
performed and the whole problem reduces to calculating the
multiple integral
$$Q = \int \cdots \int e^{-U(q_1, q_2, \dots, q_N)/kT}\,dq_1\,dq_2 \cdots dq_N. \tag{6.35}$$
Still, this is a very formidable task, and much work has been
spent on it. I shall mention only the investigations initiated by
Ursell, and perfected by Mayer and others, the aim of which was
to replace van der Waals's semi-empirical equation of state by
an exact one. In fact one can expand $Q$ into a series of powers of
$V^{-1}$ and, introducing this into (6.32) and (6.33), one obtains the
pressure $p$ as a similar series
$$p = \frac{RT}{V}\left(1 + \frac{A}{V} + \frac{B}{V^2} + \dots\right), \tag{6.36}$$
where the coefficients $A, B, \dots$, called virial coefficients, are
functions of $T$. One can even go further and discuss the process
of condensation, but the mathematical difficulties in the treatment
of the liquid state itself are prohibitive.
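The first correction to the ideal gas law can be computed for a simple model of the interaction. The sketch below relies on a standard result of the classical theory that is not stated in the lecture: for a pair potential $u(r)$ the coefficient of the first correction, per molecule, is $2\pi\int_0^\infty (1 - e^{-u(r)/kT})\,r^2\,dr$, to be multiplied by the number of molecules to obtain the coefficient of $1/V$. The hard-sphere potential, the cut-off `r_max`, and the function names are assumptions made for the illustration; the integral is evaluated by the midpoint rule.

```python
import math


def hard_sphere_potential(r, diameter=1.0):
    """Infinite repulsion inside the diameter, no interaction outside."""
    return math.inf if r < diameter else 0.0


def first_virial_coefficient(potential, kt, r_max=3.0, n_steps=3000):
    """2*pi * integral of (1 - exp(-u(r)/kT)) r^2 dr, per molecule,
    evaluated by the midpoint rule on [0, r_max]."""
    dr = r_max / n_steps
    total = 0.0
    for i in range(n_steps):
        r = (i + 0.5) * dr
        total += (1.0 - math.exp(-potential(r) / kt)) * r * r * dr
    return 2.0 * math.pi * total


b_numeric = first_virial_coefficient(hard_sphere_potential, kt=1.0)
b_exact = 2.0 * math.pi / 3.0      # (2*pi/3) * diameter**3 for hard spheres
print(b_numeric, b_exact)
```

For hard spheres the coefficient is four times the volume of a molecule and does not depend on temperature; an attractive part of the potential would make it temperature-dependent, as the text states for the general case.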
The range of application of Gibbs's theory is enormous. But
reading his book again, I felt the lack of a deeper foundation.
A few years later (1902, 1903) there appeared a series of papers
by Einstein in which the same formalism was developed,
obviously quite independently, as Maxwell and Boltzmann are
quoted, but not Gibbs; these papers contained two essential
improvements: an attempt to justify the statistical assumptions,
and an application to a case which at once transformed the kinetic
theory of matter from a useful hypothesis into something very real
and directly observable, namely, the theory of Brownian motion.
Concerning the foundation, Einstein used an argument which
Boltzmann had already introduced to support his distribution
law (6.18), though this seems to be hardly necessary, as for a
real assembly the method of enumerating distributions over
cells is perfectly satisfactory. Curiously enough, this argument
of Boltzmann is based on a theorem similar to Poincaré's considerations
on quasi-periodicity with which Zermelo intended
to smash statistical mechanics altogether. Einstein considers a
distribution of the microcanonical type, in Gibbs's nomenclature,
where only one 'energy surface' $H(p, q) = E$ in the phase
space is taken into account. The representative point in phase
space moves always on this surface. It may happen that the
whole surface is covered in such a way that the orbit passes
through every point of the surface. Such systems are called
ergodic; but it is rather doubtful whether they exist at all.
Systems are called quasiergodic where the orbit comes near to
every point of the energy surface; that this happens can be seen
by an argument similar to that which leads to Poincare's theorem
of quasiperiodicity. Then it can be made plausible that the total
time of sojourn of the moving point in a given part of the energy
surface is proportional to its area; hence the time average of any
function of p, q is obviously the same as that taken with the help
of a microcanonical virtual assembly. In this way quasi
periodicity is used to justify statistical mechanics, exactly
reversing Zermelo 's reasoning. This paradox is resolved by the
remark that Zermelo believes the period to be large and macro
scopic, while Einstein assumes it to be unobservably small.
Who is right ? You may find the obvious answer for yourselves
(see Appendix, 19).
Modern writers use other ways of establishing the foundations
of statistical mechanics. They are mostly adaptations of the
cell method to the virtual assembly; one has then to explain why
the average properties of a single real observed system can be
obtained by averaging over a great many systems of the virtual
assembly. Some say simply: As we do not know the real state,
we have the right to expect the average, provided exceptional
situations are theoretically extremely rare, and this is of course
the case. Others say we have not to do with a single isolated
system, but with a system in thermal contact with its surroundings,
as if it were in a thermostat or heat bath; we can then
assume this heat bath to consist of a great many copies of the
system considered, so that the virtual assembly is transformed
into a real one. I think considerations of this kind are not very
satisfactory.
There remains the fact that statistical mechanics has justified
itself by explaining a great many actual phenomena. Among
these are the fluctuations and the Brownian motion to which
Einstein applied his theory (see Appendix, 20). To appreciate
the importance of this step one has to remember that at that
time (about 1900) atoms and molecules were still far from being
as real as they are today; there were still physicists who did
not believe in them. After Einstein's work this was hardly
possible any longer. Visible tiny particles suspended in a gas or a
liquid (colloid solution) are test bodies small enough to reveal
the granular structure of the surrounding medium by their
irregular motion. Einstein showed that the statistical properties
of this movement (mean density, mean square displacement in
time, etc.) agree qualitatively with the predictions of kinetic
theory. Perrin later confirmed these results by exact measurements
and obtained the first reliable value of Avogadro's number
N, the number of particles per mole. From now on kinetic
theory and statistical mechanics were definitely established.
But beyond this physical result, Einstein's theory of Brownian
motion had a most important consequence for scientific methodology
in general. The accuracy of measurement depends on the
sensitivity of the instruments, and this again on the size and
weight of the mobile parts and the restoring forces acting on
them. Before Einstein's work it was tacitly assumed that
progress in this direction was limited only by experimental
technique. Now it became obvious that this was not so. If an
indicator, like the needle of a galvanometer, became too small
or the suspending fibre too thin, it would never be at rest but
perform a kind of Brownian movement. This has in fact been
observed. Similar phenomena play a large part in modern
electronic technique, where the limit of observation is given by
irregular variations which can be heard as a 'noise' in a loudspeaker.
There is a limit of observability given by the laws of
nature themselves.
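The size of this natural limit can be estimated with the equipartition law of statistical mechanics. The sketch below assumes a needle hanging on a torsion fibre with restoring constant D and deflection angle θ; both symbols are introduced here for illustration and do not occur in the text.

```latex
% Assumption: torsional potential energy (1/2) D \theta^2,
% needle in thermal equilibrium with its surroundings at temperature T
\[
  \tfrac{1}{2}\, D\, \overline{\theta^{2}} = \tfrac{1}{2}\, kT
  \quad\Longrightarrow\quad
  \sqrt{\overline{\theta^{2}}} = \sqrt{\frac{kT}{D}} .
\]
```

A thinner fibre (smaller D) makes the instrument more sensitive, but by the same formula it increases the root-mean-square wandering of the needle; at fixed D the wandering decreases only as the square root of T.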
This is a striking example that the code of rules for inference
by induction, though perhaps metaphysical in some way, is
certainly not a priori, but subject to reactions from the knowledge
which it has helped to create. For those rules which taught
the experimentalist how to obtain and improve the accuracy of
his findings contained to begin with certainly no hint that there
is a natural end to the process.
However, the idea of unlimited improvement of accuracy need
not be given up yet. One had only to add the rule: make your
measurements at as low a temperature as possible. For Brownian
motion dies down with decreasing temperature.
Yet later developments in physics proved this rule also to be
ineffective, and a much more trenchant change in the code had
to be made.
But before dealing with this question we have to finish our
review of statistical methods in classical mechanics.
GENERAL KINETIC THEORY
Kinetic theory could only be regarded as complete if it
applied to matter in (visible) motion as well as to equilibrium.
But if you look through the literature you will find very little:
a few simple cases. The most important of these, the theory of
gases, has been dealt with in some detail. Two others must be
mentioned: the theory of solids and of the Brownian motion.
Ideal solids are crystal lattices or gigantic periodic molecules.
But only for zero temperature are the atoms in regularly spaced
equilibrium positions; for higher temperatures they begin to
vibrate. As long as the amplitudes are small, the mutual forces
will be linear functions of them; then the vibrations can be
analysed into 'normal modes', each of which is a wave running
through the lattice with a definite frequency. These normal
modes represent a system of independent harmonic oscillations
to which Gibbs's method of statistical mechanics can be applied
without any difficulty. If, however, the temperature rises, the
amplitudes of the vibrations increase and higher terms appear in
the interaction: the waves are scattering one another and are
therefore strongly damped. Hence there exists a kind of free
path for the transport of energy which can be used for explaining
conduction of heat in crystals (Debye). Similar considerations,
applied to the electrons in metallic crystals, are used for the
explanation of transport phenomena like electric and thermal
conduction in metals.
In the case of Brownian motion, I have already mentioned that
Einstein calculated not only the mean density of a colloidal
solution, say, under gravity, but also the mean square displacement
of a single suspended particle in time (or, what amounts to
the same, the dispersion of a colloid by diffusion as a function
of time). The simplifying assumption which makes this possible
is that the mass of the colloidal particle is large compared with the
mass of the surrounding molecules, so that these impart only
small impulses. Similar considerations have been applied to
other fluctuation phenomena (see Appendix, 20).
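The growth of the mean square displacement with time can be illustrated by a small simulation. This is only a sketch under stated assumptions: each suspended particle is modelled as an independent one-dimensional random walker receiving many small Gaussian impulses, and the function name, the number of particles, and the step size are hypothetical choices, not quantities from the text.

```python
import random


def mean_square_displacement(n_particles, n_steps, step_sigma, seed=0):
    """Return the list [<x^2> after 1 step, ..., <x^2> after n_steps].

    Assumption: every particle performs an independent 1-D random walk
    whose steps are Gaussian with standard deviation step_sigma.
    """
    rng = random.Random(seed)
    positions = [0.0] * n_particles
    msd = []
    for _ in range(n_steps):
        for i in range(n_particles):
            positions[i] += rng.gauss(0.0, step_sigma)
        msd.append(sum(x * x for x in positions) / n_particles)
    return msd


msd = mean_square_displacement(n_particles=2000, n_steps=100, step_sigma=1.0)
# If <x^2> grows linearly in time, doubling the time should roughly double it.
print(msd[99] / msd[49])
```

The ratio printed at the end should come out close to 2, up to sampling noise, which is the linear law in time that distinguishes diffusion from uniform motion.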
A great number of more or less isolated examples of non-equilibria
have been treated by a semi-empirical method which
uses the notion of relaxation time. You find a very complete
account of such things related to solids and liquids in a book of
J. Frenkel, Kinetic Theory of Liquids. But you must not expect
to find in this work a systematic theory, based on a general idea,
nor will you find it in any other book.
My collaborator Dr. Green and I have tried to fill this gap,
and to develop the kinetic theory of matter in general. I hope
you will not mind if I indulge a little in the pleasure of explaining
the leading ideas. It will help the understanding of the interplay
of cause and chance in the laws of nature.
We have to remember the general principles laid down by
Gibbs which he, however, used only for the case of statistical
equilibrium.
An arbitrary piece of matter, fluid or solid, is, from the atomistic
standpoint, a mechanical system of particles (atoms, molecules)
defined by a Hamiltonian H. Its state is completely
determined if the initial values of coordinates and momenta are
given. Actually this is not the case; but there is a probability
(as yet unknown) $f_0(p,q)\,dp\,dq$ for the initial distribution. The
causal laws of motion demand that the distribution $f(t,p,q)$ at
a later time t is a solution of the Liouville equation (6.4) (p. 49)
$\dfrac{\partial f}{\partial t} - [H, f] = 0$, (6.37)
namely that solution which for t = 0 becomes $f_0$,
$f(0,p,q) = f_0(p,q)$. (6.38)
Let us assume for simplicity that all molecules are equal
particles (point masses) with coordinates $x^{(k)}$ and velocities
$\xi^{(k)} = p^{(k)}/m$. We shall consider f to be a function of these and
write $f(t, x, \xi)$. If we want to indicate that a function f depends on h
particles, we do not write all arguments, but simply $f_h(1, 2, \ldots, h)$ or
shortly $f_h$. As all the particles are physically indistinguishable, we
can assume all the functions $f_h$ to be symmetrical in the particles.
Now the physicist is not directly interested in a symmetric
solution $f_N$ of (6.37). He wants to know such things as the number
density (number of particles per unit volume) $n_1(t, x)$ at a
given point x of space, or perhaps, in addition, the velocity distribution
$f_1(t, x, \xi)$, i.e. just those quantities which are familiar
from the kinetic theory of gases, depending on one particle only.
One has therefore to reduce the function $f_N$ for N particles step
by step to the function $f_1$ of one particle.
This is done by integrating over the position and velocity of
one particle, say the last one, with the help of the integral
operator
$\chi_q \ldots = \iint \ldots \, dx^{(q)}\, d\xi^{(q)}$. (6.39)
If $f_{q+1}$ is given, we obtain $f_q$ by applying the operator $\chi_{q+1}$; it is,
however, convenient to add a normalizing factor and to write
$(N-q)\, f_q = \chi_{q+1} f_{q+1}$. (6.40)
The physical meaning of the operation is this: we give up the
pretence to know the whereabouts of one particle and declare
frankly our ignorance. By repeated application of the operation,
we obtain a chain of functions
$f_N, f_{N-1}, \ldots, f_2, f_1$, (6.41)
to which one can add $f_0 = 1$; $f_q$ means the probability of finding
the system in a state where q particles have fixed positions (i.e.
lie in given elements). The normalization is such that
$\int f_1(t, x, \xi)\, d\xi = n_1(t, x)$ (6.42)
is the number density; for one has
$\int n_1(t, x)\, dx = \iint f_1(t, x, \xi)\, dx\, d\xi = \chi_1 f_1 = N$, (6.43)
where the last equality follows from (6.40) for q = 0 (with
$f_0 = 1$).
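The reduction and its normalization can be tried out on a toy model. The sketch below is an illustration under loud assumptions: the continuous phase space is replaced by a handful of discrete cells, the integral operator by a plain sum over the cell of the last particle, and the symmetric weight chosen for the N-particle function is arbitrary. The names `reduce_once`, `chain`, `N`, and `M` are hypothetical.

```python
from itertools import product
from math import factorial


def reduce_once(f_next, q_next, n_particles, n_cells):
    """Return f_q from f_{q+1} via (N - q) f_q = chi_{q+1} f_{q+1}.

    Here chi_{q+1} is a plain sum over the cell of the last particle,
    the discrete stand-in for the integral over its position and velocity.
    """
    q = q_next - 1
    f_q = {}
    for cells in product(range(n_cells), repeat=q):
        summed = sum(f_next[cells + (c,)] for c in range(n_cells))
        f_q[cells] = summed / (n_particles - q)
    return f_q


N, M = 3, 4  # hypothetical toy sizes: 3 particles, 4 cells

# A toy weight that is symmetric under exchange of particles.
weights = {cells: 1.0 + sum(cells) for cells in product(range(M), repeat=N)}
total = sum(weights.values())

# Normalize f_N so that the chain ends in f_0 = 1 (its sum is then N!).
chain = {N: {cells: factorial(N) * w / total for cells, w in weights.items()}}
for q_next in range(N, 0, -1):
    chain[q_next - 1] = reduce_once(chain[q_next], q_next, N, M)

print(sum(chain[1].values()))  # total particle number carried by f_1
print(chain[0][()])            # the end of the chain, f_0
```

Summing the one-particle function over all cells gives the particle number N, and the chain closes with the value 1 for the zero-particle function, in line with the normalization stated above.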
Now we have to reduce the fundamental equation (6.37) step by
step by repeatedly applying the operator $\chi$ (see Appendix, 21).
Assuming that the atoms are acting on one another with central
forces, $\Phi^{(ij)}$ being the potential energy between two of them, the
result of the reduction is a chain of equations of the form
$\dfrac{\partial f_q}{\partial t} = [H_q, f_q] + S_q$ $(q = 1, 2, \ldots, N)$, (6.44)
where $S_q = \sum_{i=1}^{q} \chi_{q+1}\, [\Phi^{(i,q+1)}, f_{q+1}]$. (6.45)
This quantity $S_q$ will be called the statistical term. What is the
advantage of this splitting up of the problem into the solution of
a chain of equations? The first impression is that there is no
advantage at all; for to determine $f_1$ you need to know $S_1$, but $S_1$
contains $f_2$, and this again depends on $f_3$, and so on, so that one
eventually arrives at $f_N$, which satisfies the original equation.
Yet this reasoning supposes the desire to get information about
every detail of the motion, and that is just what we do not want.
We wish to obtain some observable and rather crude averages.
Starting from $f_1$ and climbing up to $f_2, f_3, \ldots$, we can soon stop, as
the chaos increases with the number of particles, and replace the
rigorous connexion between $f_q$ and $f_{q+1}$ by an approximate one,
according to the imperfection of observation.
Before explaining the application of this 'method of ignorance'
to simple examples, I wish to mention that we have actually
found the chain of equations (6.44) in quite a different way,
starting with $f_1$ and using the calculus of probability for events
not independent of one another (see Appendix, 22).
This derivation is less formal than the first one and illuminates
the physical meaning of the statistical term.
It would now be very attractive to show how from this general
formula (6.44) the mechanical and thermal laws for continuous
substances can be derived. But I have to restrict myself to a
few indications concerning the general 'method of ignorance',
to which I have already alluded.
The first example is the theory of gases. We have seen that
this theory is based on Boltzmann's equation (6.23)
$\dfrac{\partial f(1)}{\partial t} = [H, f(1)] + C(1)$, (6.46)
where C(1) is the collision integral (6.24);
$C(1) = \iint [f'(1) f'(2) - f(1) f(2)]\, |\xi_1 - \xi_2|\, db\, d\xi_2$. (6.47)
Now (6.46) has the same form as our general formula (6.44) for
q = 1, provided C(1) can be identified with $S_1$.
Green has shown that this is indeed the case, provided that the
molecular forces have a small range r ; then one can assume that
in the gaseous state the probability of finding more than two
particles in a distance smaller than r is negligible. In other
words, one can exclude all except 'binary' encounters. Two
particles outside the sphere of interaction can be regarded as
independent; hence one has there
$f_2(1,2) = f_1(1)\, f_1(2)$. (6.48)
This holds also, in virtue of Liouville's theorem, if on the left-hand
side the positions and velocities refer to a point in the
interior of the sphere of action while on the right-hand side the
values on its surface are used. With the help of this fact, the
integration in $S_1$ can be performed (see Appendix, 23), and
leads exactly to the expression C(1), in which only the 'boundary
values' of the functions f(1) and f(2) on the surface of the sphere
of action appear.
Hence the whole kinetic theory of gases is contained as a
special case in our theory.
Concerning liquids, one must proceed in a different way,
because triple and higher collisions cannot be handled with
elementary formulae. We have adopted a method suggested by
the American physicist Kirkwood. His formula is a generalization
of (6.48), namely
$f_3(1,2,3) = \dfrac{f_2(2,3)\, f_2(3,1)\, f_2(1,2)}{f_1(1)\, f_1(2)\, f_1(3)}$, (6.49)
and may be interpreted in different ways, e.g. by saying that the
occurrences of three pairs of particles (2, 3), (3, 1), (1, 2) at given
positions and with given velocities are almost independent
events, because the mutual interactions decrease rapidly with
the distance.
Substituting $f_3$ from (6.49) in $S_2$, one obtains from (6.44),
(6.45) two integro-differential equations for $f_1$ and $f_2$ which form
a closed system and can be solved by suitable approximations.
(If then $f_3$ is calculated from the solution $f_1$, $f_2$, with the help of
(6.49), the relation (6.40) for q = 3 is not necessarily satisfied;
this is the sacrifice of accuracy introduced by the Kirkwood
method.)
All physical properties of a liquid of the kind discussed here
(particles with central forces) can be expressed in terms of
$n_2(1, 2)$, a function known to the experimenters in X-ray research
on liquids as the radial distribution function. The method
explained leads to explicit formulae for the equation of state
and the energy; it allows also a discussion of the singularity
which separates the gaseous and liquid states. But I cannot
enter into a discussion of details.
Concerning non-equilibria, one can obtain the differential
equations for the mechanical and thermal flow in a rigorous way;
the result has, of course, the form of Cauchy's equations (4.9)
for continuous media, yet with a stress tensor T which can be
explicitly expressed in terms of the time derivatives of the strain
tensor (or the space derivative of the velocity) and the gradient
of temperature. In this way expressions for the coefficients of
viscosity and thermal conductivity are obtained. They differ
from the known formulae for gases by the great contribution of
the mutual forces. Yet again I cannot dwell on this subject
which would lead us far from the main topic of these lectures,
to which I propose now to return (see Appendix, 33).
VII
CHANCE AND ANTECEDENCE
WHAT can we learn from all this about the general problem of
cause and chance? The example of gases has already shown us
that the introduction of chance and probability into the laws of
motion removes the reversibility inherent in them; or, in other
words, it leads to a conception of time which has a definite
direction and satisfies the principle of antecedence in the
cause-effect relation.
The formal method consists in defining a certain quantity,
$S = -k \int f \log f \, dp\, dq$, (7.1)
and showing that it never decreases in time: $dS/dt \ge 0$. In the
case of a gas, the function f was the distribution function $f_1$ of
one single molecule, a function of the point p, q of the phase
space of this molecule.
The same integral represents also the entropy of an arbitrary
system in statistical mechanics, if f is replaced by $f_N$, the
distribution function in the 2N-dimensional phase space; it
satisfies all equilibrium relations of ordinary thermodynamics.
In the case of a gas, the time derivative of S could be determined
with the help of Boltzmann's collision equation, and it
was found that always
$\dfrac{dS}{dt} \ge 0$. (7.2)
I have stressed the point that this is not in contradiction to
the reversibility of mechanics; for this reversibility refers to a
distribution function of non-interacting molecules, satisfying
$\dfrac{\partial f}{\partial t} = [H, f]$, (7.3)
while molecules colliding with one another satisfy
$\dfrac{\partial f}{\partial t} = [H, f] + C$, (7.4)
where C is the collision integral. Irreversibility is therefore a
consequence of the explicit introduction of ignorance into the
fundamental laws.
Now the same considerations hold for any system. If we take
for f the function $f_N$ of a closed system of N particles, (7.3) is
again satisfied, and if its solution is introduced into (7.1), it can
be easily shown that dS/dt = 0.
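The short calculation behind this statement can be sketched as follows. It assumes that the quantity (7.1) has the form $S = -k \int f \log f\, dp\, dq$, that f obeys (7.3), and that f vanishes at the boundary of phase space so that integrated terms drop out.

```latex
% Assumptions: S = -k \int f \log f \, dp\, dq, \partial f/\partial t = [H, f],
% and f -> 0 at the boundary of phase space.
\begin{align*}
\frac{dS}{dt}
  &= -k \int (1 + \log f)\, \frac{\partial f}{\partial t}\, dp\, dq
   = -k \int (1 + \log f)\, [H, f]\, dp\, dq \\
  &= -k \int [H,\, f \log f]\, dp\, dq = 0 .
\end{align*}
```

The second line uses the product rule for the Poisson bracket, $[H, f \log f] = (1 + \log f)[H, f]$; the integral of a Poisson bracket over the whole phase space vanishes after one partial integration, because the mixed second derivatives of H cancel.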
Irreversibility can be understood only by explicitly exempting
a part of the system from causality. One has to abandon the
condition that the system is closed, or that the positions and
velocities of all particles are under control. The remarkable
thing is that it suffices to assume one single particle beyond control.
Then we have to do with a system of N+1 particles, but
concentrate our interest only on N of them. The distribution function
of these N particles satisfies the equation (6.44) for q = N:
$\dfrac{\partial f_N}{\partial t} = [H_N, f_N] + S_N$, (7.5)
where $S_N$ is a certain integral over $f_{N+1}$ given by (6.45) for q = N.
For a solution of this equation (7.5) the entropy is either constant
or increasing. This is of course a fortiori the case if the system
of N particles is coupled to more complicated systems out of
control (see Appendix, 24).
The increase of S continues until statistical equilibrium is
reached, and it can be shown that the final distribution is the
canonical one
) = E. (7.6)
This result is, in my opinion, the final answer of the age-old
question how the reversibility of classical mechanics and the
irreversibility of thermodynamics can be reconciled. The latter
is due to a deliberate renunciation of the demand that in principle
the fate of every single particle can be determined. You
must violate mechanics in order to obtain a result in obvious
contradiction to it. But one may say: this violation may be
necessary from practical reasons because one can neither observe
all particles nor solve the innumerable equations; in reality,
however, the world is reversible, and thermodynamics only a
trick for obtaining probable, not certain, results. This is the
standpoint taken in many presentations of statistical mechanics.
It is difficult to contradict if one accepts the axiom that the
positions and velocities of all particles can, at least in principle,
be determined; but can this really be maintained? We have
seen that the Brownian motion sets a limit to all observations
even on a macroscopic scale. One needs a spirit who can do
things we could not even do with infinitely improved technical
means. Further, the idea of a completely closed system is also
almost fantastic.
I think that the statistical foundation of thermodynamics is
quite satisfactory even on the ground of classical mechanics.
But in fact, classical mechanics has turned out to be defective
just in the atomic domain where we have applied it. The whole
situation has therefore to be re-examined in the light of quantum
mechanics.
VIII
MATTER
MASS, ENERGY, AND RADIATION
IN order not to lose sight of my main subject I have added to the
heading of each section of these lectures words like 'cause',
'contiguity', 'antecedence', 'chance'. The one for the present
section, 'matter', seems to be an intruder. For classical philosophy
teaches that matter is a fundamental conception of a
specific kind, entirely different from cause, though on the same
level in the hierarchy of notions: another 'category' in Kant's
terminology. This doctrine was generally accepted at the time
before the great discoveries were made of which I have now to
speak. It was the period when physics was governed by the
dualism 'force and matter', Kraft und Stoff (the title of a popular
book by Büchner). In modern physics this duality has become
vague, almost obsolete. The first steps in this direction have been
described in the preceding survey: the transition from Newton's
distance forces to contact forces, first in mechanics, then in
electromagnetism, and finally for gravitation; in other words,
the victory of the idea of contiguity. If force is spreading in
'empty space' with finite velocity, space cannot be quite empty;
there must be something which carries the forces. So space is
filled with ether, a kind of substance akin to ordinary matter
in many respects, in which strains and stresses can be produced.
Though these contact forces obey different laws from those which
govern elasticity, they are still forces in an ether, something
different from the carrier. Yet this distinction vanishes more
and more. Relativity showed that the ether does not share with
ordinary matter the property of 'localization': you cannot say
'here I am'; there is no physical way of identifying a point in the
ether, as you could recognize a point in running water by a
little mark, a particle of dust. Electric and magnetic stresses
are not something in the ether, they are 'the ether'. The question
of a carrier becomes meaningless.
However, this is a question of interpretation. Physicists are
very broad-minded in this respect; they will continue using obsolete
expressions like ether, and no harm is done. For them a
matter of terminology is not serious until a new quantitative
law is involved. That has happened here indeed. I refer to the
law connecting mass m, energy $\varepsilon$, and the velocity of light c
(see Appendix, 25),
$\varepsilon = mc^2$, (8.1)
which, after having been found to hold in special cases, was
generally established by Einstein. His reasoning is based on the
existence of the pressure of light, demonstrated experimentally
and also derivable from Maxwell's equations of electrodynamics.
If a body of mass M emits a well-defined quantity of light in a
parallel beam which carries the total electromagnetic energy $\varepsilon$,
it suffers a recoil corresponding to the momentum $\varepsilon/c$ transferred.
It therefore moves in the opposite direction, and to avoid a clash
with the mechanical law that the centre of mass of a system
cannot be accelerated by an internal process, one has to ascribe
to the beam of light not only an energy $\varepsilon$ and momentum $\varepsilon/c$ but
also a mass $\varepsilon/c^2$, and to assume that the mass M of the emitting
body is reduced by the same amount $m = \varepsilon/c^2$.†
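The step from the recoil to the mass of the beam can be written out in two lines. The sketch assumes that the body is at rest before the emission and introduces v for its recoil velocity; this symbol is a label chosen here, not one used in the text.

```latex
% Assumptions: body at rest before emission; v is its recoil velocity
% (negative, i.e. opposite to the beam); m is the mass ascribed to the light.
\begin{align*}
  (M - m)\, v &= -\frac{\varepsilon}{c}
     && \text{(recoil balances the momentum of the beam)} \\
  (M - m)\, v + m\, c &= 0
     && \text{(centre of mass remains at rest)} \\
  \Longrightarrow\quad m &= \frac{\varepsilon}{c^{2}} .
\end{align*}
```

Subtracting the first line from the second gives $mc = \varepsilon/c$ at once, so the result does not depend on how large the body is.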
The theory of relativity renders this result quite natural. It
provides, moreover, an expression for the dependence of mass
on velocity; one has
$m = \dfrac{m_0}{\sqrt{1 - v^2/c^2}}$, (8.2)
where $m_0$ is called the rest-mass. Energy $\varepsilon$ and momentum p are
then given by
$\varepsilon = mc^2, \quad p = mv$. (8.3)
I need hardly to remind you how this result of 'purest science'
has been lately confirmed by a terrifying, horrible, 'technical'
application in New Mexico, Japan, and Bikini. There is no
doubt, matter and energy are the same. The old duality between
the force and the substance on which it acts, has to be abandoned,
and hence also the original idea of force as the cause of
motion. We see how old notions are dissolved by new experiences.
It is this process which has led me to the abstract
definition of causality based only on the notion of physical
dependence, but transcending special theories which change
according to the experimental situation.
† M. Born, Atomic Physics (Blackie), 4th ed., 1948, Ch. III. 2, p. 52; A. VII,
p. 288. See Appendix, 25.
Returning to our immediate object, we learn from Einstein's
law that the atomistic conception of matter is necessarily connected
with the atomistic conception of energy. In fact the
existence of quanta of energy was deduced by Planck from the
laws of heat radiation five years before Einstein published his
relation between mass and energy.
Planck's discovery opened the first chapter in the history of
quantum theory, which corresponds to the years 1900 to 1913
and could be entitled 'Tracing the quantum by thermodynamical
and statistical methods'. The next chapter deals with the period
1913-25 when spectroscopical and electronic methods were in
the foreground, while the last chapter describes the birth and
development of quantum mechanics.
I cannot possibly give an account of this long and tedious
development, but I shall pick out a few points which are not so
well known and hardly found in textbooks, beginning with some
remarks on the thermostatistical quantum hunt.
The problem which Planck solved was the determination of
the density of radiation $\rho$ in equilibrium with matter of a given
temperature T as function of T and of the frequency $\nu$, so that
$\rho(\nu,T)\,d\nu$ is the energy per unit volume in the frequency interval
$d\nu$. By purely thermodynamical methods several properties of
this function were known: the temperature dependence of the
total radiation $\int \rho\, d\nu = \sigma T^4$ (law of Stefan and Boltzmann) and
the specification that $\rho/\nu^3$ is only a function of the quotient $\nu/T$.†
The problem remained to determine this function, and here
statistical methods had to be used.
† Law of Wien; see Atomic Physics, Ch. VII. 1, p. 198; A. XXVII, p. 343.
One can proceed in two ways. Either one regards the radiation
as being in equilibrium with a set of atoms which in their interaction
with radiation can be replaced by harmonic oscillators;
then the mean energy of these can be calculated in terms of the
radiation density and turns out to be proportional to it. This
was the method preferred by Planck. Or one regards the radiation
itself as a system of oscillators, each of these representing
the amplitude of a plane wave. This method was used by Rayleigh
and later by Jeans. In both cases the relation between the
mean energy $u(\nu)$ of the oscillators of frequency $\nu$ and the radiation
density $\rho$ is given by†
$\rho = \dfrac{8\pi\nu^2}{c^3}\, u$, (8.4)
and it suffices to determine u.
This can be done with the help of the so-called equipartition
law of statistical mechanics. Suppose the Hamiltonian H of a
system has the form
$H = \tfrac{1}{2}\, a\, \xi^2 + H'$, (8.5)
where $\xi$ is any coordinate or momentum and H' contains all
the other coordinates and momenta but not $\xi$. Then the mean
value of the contribution to the energy of this variable is (see
Appendix, 26)
$\tfrac{1}{2}\, a\, \overline{\xi^2} = \tfrac{1}{2}\, kT$, (8.6)
independent of the constant a, hence the same for all variables
of that description.
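The independence of the constant a can be checked numerically. The following is a minimal sketch, assuming the Boltzmann weight $e^{-H/kT}$ for the single quadratic term and a simple sum on an equally spaced grid; the function name, the grid size, and the sample values of a are hypothetical choices.

```python
import math


def mean_quadratic_energy(a, kT, n_grid=20001, width_in_sigmas=10.0):
    """Boltzmann average of (a/2) xi^2 with weight exp(-(a/2) xi^2 / kT).

    The integrals are approximated by sums on an equally spaced grid that
    extends width_in_sigmas standard deviations to either side of zero.
    """
    sigma = math.sqrt(kT / a)
    half_range = width_in_sigmas * sigma
    dx = 2.0 * half_range / (n_grid - 1)
    numerator = 0.0
    denominator = 0.0
    for i in range(n_grid):
        xi = -half_range + i * dx
        weight = math.exp(-0.5 * a * xi * xi / kT)
        numerator += 0.5 * a * xi * xi * weight
        denominator += weight
    return numerator / denominator


kT = 1.0
for a in (0.1, 1.0, 50.0):
    # Each value should be close to kT / 2, whatever a is.
    print(a, mean_quadratic_energy(a, kT))
```

A stiff variable (large a) makes small excursions and a soft one (small a) large excursions, but the mean energy stored in either comes out the same.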
Applied to a set of oscillators of frequency $\nu$, where
$H = \dfrac{1}{2m}\left(p^2 + 4\pi^2\nu^2 m^2 q^2\right)$, (8.7)
one obtains for the average energy
u = kT, (8.8)
hence, from (8.4), $\rho = \dfrac{8\pi\nu^2}{c^3}\, kT$. (8.9)
This is called the Rayleigh-Jeans radiation formula. It is a
rigorous consequence of classical statistical mechanics, but
nevertheless in obvious contradiction to facts. It does not even
lead to a finite total radiation, since $\rho$ increases as $\nu^2$ with frequency.
The law is, however, not quite absurd as it agrees well
with measurements for small frequencies (long waves) or high
temperatures. At the other end of the spectrum, the observed
energy density decreases again, and Wien has proposed for this
region an experimental law which would correspond to the
assumption that in (8.4) the oscillator energy is of the form
$u = u_0\, e^{-\varepsilon_0/kT}$. (8.10)
This looks very much like a Boltzmann distribution. According
to Wien's displacement law it holds for high values of the
quotient $\nu/T$, and both constants $u_0$ and $\varepsilon_0$ must be proportional
to $\nu$; but their meaning is obscure.
[Fig. 1]
† See Atomic Physics, Ch. VII. 1, p. 201; A. XXVIII, p. 347.
This was the situation which Planck encountered: two
limiting cases given by the formulae (8.8) and (8.10), the first
valid for large T, the second for small T. Planck set out to discover
a bridging formula; the difficulty of this task can be visualized
by looking at the two mathematical expressions or the
corresponding graphs in Fig. 1. Planck decided that the energy
was a variable unsuited for interpolation, and he looked for
another one. He found it in the entropy S. I shall give here
his reasoning in a little different form (due to Einstein, 1905),
where the entropy does not appear explicitly but the formulae
of statistical mechanics are used. Starting from Boltzmann's
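distribution law, according to which the probability of finding
a system in a state with energy $\varepsilon$ is proportional to $e^{-\beta\varepsilon}$, where
$\beta = 1/kT$, one can express the mean square fluctuation of the
energy
$\overline{(\Delta\varepsilon)^2} = \overline{(\varepsilon - \bar\varepsilon)^2} = \overline{\varepsilon^2} - \bar\varepsilon^2$
in terms of the average energy $\bar\varepsilon = u$ itself if the latter is given as
function of temperature or of $\beta$ (see Appendix, 20.10):
$\overline{(\Delta\varepsilon)^2} = -\dfrac{du}{d\beta}$. (8.11)
Now this function $u(\beta)$ is known for the two limiting cases: T
large or $\beta$ small, and T small or $\beta$ large, from (8.8) and (8.10),
$u = \beta^{-1}$ for small $\beta$, $u = u_0\, e^{-\beta\varepsilon_0}$ for large $\beta$. (8.12)
Hence one has
$\overline{(\Delta\varepsilon)^2} = u^2$ for small $\beta$, $\overline{(\Delta\varepsilon)^2} = \varepsilon_0 u$ for large $\beta$. (8.13)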
Now Planck argues like this: the two limiting cases will correspond
to the preponderance of two different causes, whatever
they may be. A well-known theorem of statistics says that the
mean square fluctuations due to independent causes are additive.
Let us assume that the condition of independence is here satisfied.
Hence, if both causes act simultaneously, we should have
$\overline{(\Delta\varepsilon)^2} = -\dfrac{du}{d\beta} = \varepsilon_0 u + u^2$. (8.14)
This is a differential equation for u, with the general solution
$u = \dfrac{\varepsilon_0}{e^{\alpha + \beta\varepsilon_0} - 1}$. (8.15)
The constant of integration $\alpha$ must vanish in order to have the
limiting cases (8.12) all right. Wien's displacement law, according
to which $\rho/\nu^3 = 8\pi u/c^3\nu$ depends only on $T/\nu$, leads then to
$\varepsilon_0 = h\nu$, where h is a constant, known as Planck's constant.
The result is Planck's formula for the mean oscillator energy
$u = \dfrac{\varepsilon_0}{e^{\beta\varepsilon_0} - 1} = \dfrac{h\nu}{e^{h\nu/kT} - 1}$, (8.16)
from which the radiation density follows according to (8.4); a
refined interpolation which turned out to be in so excellent agreement
with experiment that Planck looked for a deeper explanation
and discovered it in the assumption of energy quanta of
finite size $\varepsilon_0 = h\nu$. If the energy is a multiple of $\varepsilon_0$, the integral
(6.32) has to be replaced by the sum
$\sum_{n=0}^{\infty} e^{-n\beta\varepsilon_0} = \dfrac{1}{1 - e^{-\beta\varepsilon_0}}$, (8.17)
and then the usual procedure outlined in section 6 leads at once
to the expression (8.16) for the oscillator energy u.
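Both routes to the oscillator energy can be checked numerically. The sketch below assumes Planck's formula in the form $u = \varepsilon_0/(e^{\beta\varepsilon_0} - 1)$ and a Boltzmann weight $e^{-n\beta\varepsilon_0}$ for the level $n\varepsilon_0$; the function names and the sample values of β and ε₀ are illustrative choices, not taken from the text.

```python
import math


def planck_u(beta, eps0):
    """Mean oscillator energy eps0 / (exp(beta * eps0) - 1)."""
    return eps0 / math.expm1(beta * eps0)


def fluctuation(beta, eps0, h=1e-6):
    """Mean square fluctuation -du/dbeta, by a central difference."""
    return -(planck_u(beta + h, eps0) - planck_u(beta - h, eps0)) / (2.0 * h)


def mean_energy_from_levels(beta, eps0, n_max=2000):
    """Average of n * eps0 over levels n = 0, 1, ... with Boltzmann weights."""
    weights = [math.exp(-beta * n * eps0) for n in range(n_max)]
    z = sum(weights)
    return sum(n * eps0 * w for n, w in enumerate(weights)) / z


beta, eps0 = 0.7, 1.3  # arbitrary sample values
u = planck_u(beta, eps0)
print(fluctuation(beta, eps0), eps0 * u + u * u)   # the two sides of the fluctuation law
print(mean_energy_from_levels(beta, eps0), u)      # discrete sum against the closed formula
print(planck_u(1e-4, 1.0) * 1e-4)                  # small beta: u * beta tends to 1
print(planck_u(20.0, 1.0) / math.exp(-20.0))       # large beta: u tends to eps0 * exp(-beta * eps0)
```

Each printed pair should agree to within the accuracy of the finite difference and the truncated sum, and the last two lines should both be close to 1, reproducing the classical and the Wien limit respectively.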
Planck believed that the discontinuity of energy was a property
of the atoms, represented by oscillators in their interaction
with radiation, which itself behaved quite normally. Seven
years later Einstein showed that indeed wherever oscillations
occur in atomic systems, their energy follows Planck's formula
(8.16); I refer to his theory of specific heat of molecules and solids
which opened more than one new chapter of physics. But this
is outside the scope of these lectures.†
Einstein had, however, arrived already in 1905 at the conclusion
that radiation itself was not as innocent as Planck
assumed, that the quanta were an intrinsic property of radiation
and ought to be imagined to be a kind of particles flying
about. In textbooks this revival of Newton's corpuscular theory
of light is connected with Einstein's explanation of the photoelectric
effect and similar phenomena where kinetic energy of
electrons is produced by light or vice versa. This is quite correct,
but not the whole story. For it was again a statistical argument
which led Einstein to the hypothesis of quanta of light, or
photons, as we say today.
He considered the two limiting cases (8.13) from a different
point of view. Suppose the wave theory of light is correct, then
heat radiation is a statistical mixture of harmonic waves of all
directions, frequencies, and amplitudes. Then one can determine
the mean energy of the radiation and its fluctuation in a
given section of a large volume. This calculation has been performed
by the Dutch physicist, H. A. Lorentz, with the result
that $\overline{(\Delta\rho)^2} = \rho^2$ for any frequency, or expressed in terms of the
equivalent oscillators, $\overline{(\Delta\varepsilon)^2} = u^2$, in agreement with the
Rayleigh-Jeans case (small $\beta$, large T) in (8.13). Hence there
must be something else going on besides the waves, for which
$\overline{(\Delta\varepsilon)^2} = \varepsilon_0 u$; what can this be?
† See Atomic Physics, Ch. VII. 2, p. 207.
Suppose Planck's quanta exist really in the radiation and let
$n$ be their number per unit volume and unit frequency interval.
As each quantum has the energy $\epsilon_0 = h\nu$, one has $\bar\epsilon = u = \bar n\epsilon_0$,
and $\overline{(\Delta\epsilon)^2} = \epsilon_0^2\,\overline{(\Delta n)^2}$. Hence the fluctuation law in Wien's case
(large $\beta$, small $T$) of (8.13) can be written as
$$\overline{(\Delta n)^2} = \bar n. \tag{8.18}$$
This is a well-known formula of statistics referring to the
following situation: a great number of objects are distributed at
random in a big volume and $n$ is the number contained in a part.
Then one has just the relation (8.18) between the average $\bar n$
and its mean square fluctuation (see Appendix, 20). So Einstein
was led to the conclusion that the Wien part of the fluctuation
of energy is accounted for by quanta behaving like independent
particles, and he corroborated it by taking into account, besides
the energy, also the momentum $h\nu/c$ of the quantum and the
recoil of an atom produced by it. It was this result which encouraged
him to look for experimental evidence and led him to
the well-known interpretation of the photoelectric effect as a
bombardment of photons which knock out electrons from the
metal, transferring their energy to them.
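The statistical situation behind (8.18) is easy to simulate. The sketch below is an illustration under stated assumptions, not taken from the lectures: a hypothetical 1000 objects are scattered independently over a volume, the part considered is 2 per cent of it, and the experiment is repeated 1000 times with a fixed random seed. Strictly the variance of the count is $\bar n$ times (1 minus the fraction), which tends to $\bar n$ when the part is small compared with the whole.

```python
import random

def count_in_part(n_objects, fraction, rng):
    """Scatter n_objects at random over the whole volume; count those in the part."""
    return sum(1 for _ in range(n_objects) if rng.random() < fraction)

def mean_and_variance(samples):
    """Sample mean and (unbiased) sample variance of a list of counts."""
    m = sum(samples) / len(samples)
    var = sum((s - m) ** 2 for s in samples) / (len(samples) - 1)
    return m, var

rng = random.Random(1948)  # fixed seed so that the run is repeatable
counts = [count_in_part(1000, 0.02, rng) for _ in range(1000)]
mean_n, var_n = mean_and_variance(counts)

# The ratio of mean square fluctuation to mean should be close to 1.
print(mean_n, var_n, var_n / mean_n)
```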
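Expressed in terms of photon numbers the combined fluctuation
law (8.14) reads
$$\overline{(\Delta n)^2} = -\frac{1}{\epsilon_0}\frac{d\bar n}{d\beta} = \bar n + \bar n^2, \tag{8.19}$$
with the general solution
$$\bar n = \frac{1}{e^{\alpha + \beta\epsilon_0} - 1}, \tag{8.20}$$
where $\alpha = 0$ leads to the correct value for large $T$. But what if
$\alpha \neq 0$?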
Every physicist glancing at the last formula will recognize
it as the so-called Bose-Einstein distribution law for an ideal
gas of indistinguishable particles according to quantum theory.
It is most remarkable that at this early stage of quantum theory
Planck and Einstein had already hit on a result which was
rediscovered much later (Einstein again participating) (see
Appendix, 25, 32). In fact Planck's interpolation can be interpreted
in modern terms as the first and completely successful
attempt to bridge the gulf between the wave aspect and the
particle aspect of a system of equal and independent components,
whatever they may be, photons or atoms.
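That a distribution of the Bose-Einstein form $\bar n = 1/(e^{\alpha+\beta\epsilon_0} - 1)$ satisfies the combined fluctuation law, for vanishing and non-vanishing $\alpha$ alike, can be verified by differentiating numerically. This is a sketch under the assumption that the fluctuation is $-(1/\epsilon_0)\,d\bar n/d\beta$ and that it must equal $\bar n + \bar n^2$; the step size `h` and the sample values are arbitrary choices.

```python
import math

def n_mean(beta, alpha=0.0, eps0=1.0):
    """Candidate solution: n = 1 / (exp(alpha + beta * eps0) - 1)."""
    return 1.0 / math.expm1(alpha + beta * eps0)

def fluctuation_from_derivative(beta, alpha=0.0, eps0=1.0, h=1e-5):
    """-(1/eps0) dn/dbeta, estimated by a central difference."""
    dn = n_mean(beta + h, alpha, eps0) - n_mean(beta - h, alpha, eps0)
    return -dn / (2.0 * h * eps0)

# Compare the derivative with n + n^2 for alpha = 0 and for alpha != 0.
for alpha in (0.0, 0.7):
    for beta in (0.2, 1.0, 3.0):
        n = n_mean(beta, alpha)
        print(alpha, beta, fluctuation_from_derivative(beta, alpha), n + n * n)
```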
I shall conclude this section by giving a short account of
another consideration of Einstein's which belongs to a later
period of quantum theory, when Bohr's theory of atoms was
already well established, namely the existence of stationary
states in the atoms which differ by finite amounts of energy
content. Suppose the atom can exist in a lower state 1 and a
higher state 2; transitions are possible by emission or absorption
of a light quantum of energy $\epsilon_2 - \epsilon_1 = \epsilon$, hence of frequency
$\nu = \epsilon/h$. On the other hand, according to Boltzmann's law
the relative number of atoms in the two states will be
$$\frac{N_1}{N_2} = e^{\beta\epsilon}. \tag{8.21}$$
Now one can write (8.20), with $\alpha = 0$, in the form
$$(n + 1)\,e^{-\beta\epsilon} = n,$$
or, using (8.21),
$$nN_2 + N_2 = nN_1. \tag{8.22}$$
For this equation Einstein gave the following interpretation:
the left-hand side represents the number of quanta emitted per
unit of time from the $N_2$ atoms in the higher state, the right-hand
side those absorbed by the $N_1$ atoms in the lower state,
two processes which in equilibrium must of course cancel one
another.
The absorption is obviously proportional to the number of
atoms in the lower state, $N_1$, and to the number $n$ of photons
present, i.e. to $nN_1$. Concerning the emission, the term $N_2$ signifies
a spontaneous process, independent of the presence of radiation;
it corresponds to the well-known emission of electromagnetic
waves by a vibrating system of charges. The other term $nN_2$ is
a new phenomenon which was signalled for the first time in this paper
of Einstein (later confirmed experimentally), namely a forced
or induced emission proportional to the number of photons
present.
If we denote the number of spontaneous emissions by $AN_2$,
of induced emissions by $B_{21}N_2 n$, of absorptions by $B_{12}N_1 n$, we
learn from (8.22) that the probability coefficients (probabilities
per unit time, per atom, and per light-quantum) are all equal:
$$A = B_{12} = B_{21}. \tag{8.23}$$
This result had far-reaching consequences. The first is the existence
of a symmetric probability coefficient $B_{12} = B_{21}$ for transition
between two states induced by radiation. This became one
of the clues for the discovery of the matrix form of quantum
mechanics.
The second point is seen if one considers, not equilibrium, but
a process in time; Einstein's consideration leads at once to the
equation
$$\frac{dN_1}{dt} = -\frac{dN_2}{dt} = A\{n(N_2 - N_1) + N_2\}, \tag{8.24}$$
which is of the type used by the chemists for the calculation of
reaction velocities. One has, in their terminology, three competing
reactions, namely two diatomic ones and one monatomic
one. Now genuine monatomic reactions are rare in ordinary
chemistry, but abundant in nuclear chemistry; they were in
fact until recently the only known ones, namely the natural
radioactive disintegrations. If the radiation density is zero,
$n = 0$, one has
$$-\frac{dN_2}{dt} = AN_2, \tag{8.25}$$
which is exactly the elementary law of radioactive decay,
according to Rutherford and Soddy. It expresses the assumption
that the disintegrations are purely accidental and completely
independent of one another.
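The behaviour of such a rate equation can be sketched by stepping it forward in time. The example assumes the form $dN_1/dt = -dN_2/dt = A\{n(N_2 - N_1) + N_2\}$ with equal coefficients, holds the photon number $n$ fixed (an idealization, as if the radiation were a large reservoir), and uses a plain Euler step; the numerical values are arbitrary.

```python
import math

def step(n1, n2, n_photons, a_coeff, dt):
    """One Euler step of dN1/dt = -dN2/dt = A * (n * (N2 - N1) + N2)."""
    rate = a_coeff * (n_photons * (n2 - n1) + n2)
    return n1 + rate * dt, n2 - rate * dt

def evolve(n1, n2, n_photons, a_coeff, t_end, steps=100000):
    """Integrate from t = 0 to t_end and return the final populations (N1, N2)."""
    dt = t_end / steps
    for _ in range(steps):
        n1, n2 = step(n1, n2, n_photons, a_coeff, dt)
    return n1, n2

# No radiation (n = 0): only spontaneous emission, i.e. the exponential decay law.
n1, n2 = evolve(0.0, 1000.0, 0.0, a_coeff=0.5, t_end=2.0)
print(n2, 1000.0 * math.exp(-0.5 * 2.0))

# Fixed photon number n = 2: the populations settle at N1 / N2 = (n + 1) / n.
n1, n2 = evolve(0.0, 1000.0, 2.0, a_coeff=0.5, t_end=40.0)
print(n1 / n2, (2.0 + 1.0) / 2.0)
```

The second run ends in the equilibrium $nN_2 + N_2 = nN_1$, whatever the initial populations were.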
Thus Einstein's interpretation means the abandonment of
causal description and the introduction of the laws of chance
for the interaction of matter and radiation.
ELECTRONS AND QUANTA
Although my programme takes me through the whole history
of physics, I am well aware that it is a very one-sided account of
what really has happened. It will not have escaped you that I
believe progress in physics essentially due to the inductive
method (of which I hope to say a little more later), yet the experimentalist
may rightly complain that his efforts and achievements
are hardly mentioned. Yet as I am concerned with the
development of ideas and conceptions, I may be permitted
to take the skill and inventive genius of the experimenters for
granted and to use their results for my purpose without detailed
acknowledgement.
The period about 1900, when quantum theory sprang from
the investigations of radiation, was also full of experimental
discoveries: radioactivity, X-rays, and the electron are the
major ones.
In regard to the role of chance in physics, radioactivity was of
special importance. As I said before, the law of decay is the
expression of independent accidental events. Moreover, the
decay constant turned out to be perfectly insensitive to all
physical influences. There might be, of course, some internal
parameters of the atom which determine when it will explode.
Yet the situation is different from that in gas theory: there we
know the internal parameters, or believe we know them, they
are supposed to be ordinary coordinates and momenta; what
we do not know are their actual values at any time, and we are
compelled to take refuge in statistics because of this lack of
detailed knowledge. In radioactivity, on the other hand, nobody
had an idea what these parameters might be, their nature itself
was unknown. However, one might have kindled the hope that
this question would be solved and radioactive statistics reduced
to ordinary statistical mechanics. In fact, just the opposite has
happened.
Radioactivity is also important for our problem because it
provided the means of investigating the internal structure of the
atom. You know how Rutherford used α-particles as projectiles
to penetrate into the interior of the atom, and found the nucleus.
This result, together with J. J. Thomson's discovery of the electron,
led to the planetary model of the atom: a number of electrons
surrounding the nucleus, bound to it by electric forces. The
fundamental difficulty of this model is its mechanical instability.
As long as nothing was known about the forces which keep the
elementary particles in an atom together one could assume a
law of force which allowed stable equilibrium states. An ingenious
model of this kind is due to J. J. Thomson. But now one
knew that the forces were electrostatic ones, following Coulomb's
law, and these could never guarantee the extraordinary stability
of the actual atoms which survive billions of collisions without
any change of structure. Bohr connected this difficulty with the
facts of spectroscopy, and the result was his well-known model
of the atom consisting of 'quantized' electronic orbits.
Mentioning spectroscopy, I feel again sadly how I have to
skip over great fields of research with a few words.
The discovery of simple laws in line spectra was in fact a great
achievement. Still more important than numerical formulae,
like the one discovered by the Swiss schoolmaster Balmer for
the hydrogen spectrum, was the rule found by Ritz (also a
Swiss, who unfortunately died quite young), the so-called combination
principle; it says that the frequencies of the spectral
lines of gases can be obtained by forming differences of a single
row of quantities $T_1$, $T_2$, $T_3$, ..., which are called terms:
$$\nu_{nm} = T_n - T_m, \tag{8.26}$$
though not all of these differences appear as lines in the spectrum.
Balmer's formula for hydrogen is a special case where $T_n = -R/n^2$,
namely
$$\nu_{nm} = R\left(\frac{1}{m^2} - \frac{1}{n^2}\right).$$
The formula (8.26) gave Bohr the clue to the application of
quantum theory. Multiplying it by Planck's constant $h$ he
interpreted it as the energy difference $\epsilon_n - \epsilon_m = h\nu_{nm}$ between any
two stationary states having the energies $\epsilon_n = hT_n$ ($n = 1, 2, \ldots$).
This interpretation is a sweeping generalization of Planck's
original conception of discrete energy-levels of oscillators. It
explained at once the stability of atoms against impacts with an
energy smaller than a certain threshold, the difference between
emission and absorption spectra (the latter being of the form
$h\nu_{n1} = \epsilon_n - \epsilon_1$, where 1 means the ground state), and was in
detail confirmed by the well-known experiments of Franck and
Hertz (excitation of spectra by electron bombardment).
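The combination principle lends itself to a small exact check. The sketch below assumes the sign convention $T_n = -R/n^2$ for the hydrogen terms, so that $hT_n$ is the (negative) energy of the state and $\nu_{nm} = T_n - T_m$ is positive for $n > m$; frequencies are given in units of $R$, and the function names are illustrative.

```python
from fractions import Fraction

def term(n):
    """Hydrogen term in units of the Rydberg constant R: T_n = -1 / n^2."""
    return Fraction(-1, n * n)

def line(n, m):
    """Combination principle: nu_nm = T_n - T_m, in units of R."""
    return term(n) - term(m)

# Balmer series: lines whose lower state is m = 2.
balmer = [line(n, 2) for n in range(3, 7)]
print(balmer)

# Ritz: two lines sharing a level add up to a third line of the same row of terms.
print(line(3, 2) + line(2, 1) == line(3, 1))
```

The first Balmer line comes out as 5/36 of $R$, and the additivity of the last line holds identically because each frequency is a difference of two terms.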
However, I cannot continue to describe the whole development
of quantum theory because that would mean writing an
encyclopaedia of physics of the last thirty-five years. I have
given this short account of the initial period because it is fashionable
today to regard physics as the product of pure reason.
Now I am not so unreasonable as to say that physics could
proceed by experiment only, without some hard thinking, nor
do I deny that the forming of new concepts is guided to some
degree by general philosophical principles. But I know from my
own experience, and I could call on Heisenberg for confirmation,
that the laws of quantum mechanics were found by a slow and
tedious process of interpreting experimental results. I shall try to
describe the main steps of this process in the shortest possible way.
Yet it must be remembered that these steps do not form a
straight staircase upwards, but a tangle of interconnected alleys.
However, I must begin somewhere.
There was first the question whether the stationary states are
certain selected mechanical orbits, and if so, which. Proceeding
from example to example (oscillator, rotator, hydrogen atom),
'quantum conditions' were found (Bohr, Wilson, Sommerfeld)
which for every periodic coordinate $q$ of the motion can be
expressed in the form
$$I = \oint p\,dq = hn, \tag{8.27}$$
where $p$ is the momentum corresponding to $q$ and the integration
is extended over a period. The most convincing theoretical
argument for choosing these integrals $I$ was given by Ehrenfest,
who showed that if the system is subject to a slow external perturbation,
$I$ is an invariant and therefore well suited to be
equated to a discontinuous 'jumping' quantity $hn$.
Among these 'adiabatic invariants' $I$ there is in particular the
angular momentum of a rotating system and its component in a
given direction; if both are to be integer multiples of $h$, the strange
conclusion is obtained that an atom could not exist in all orientations
but only in a selected finite set. This was confirmed by
Stern and Gerlach's celebrated experiment (deflecting an
atomic beam in an inhomogeneous magnetic field). I am proud
that this work was done in my department in Frankfort-on-Main.
There is hardly any other effect which demonstrates the
deviations from classical mechanics in so striking a manner.
A signpost for further progress was Bohr's correspondence
principle. It says that, though ordinary mechanics does not
apply to atomic processes, we must expect that it holds at least
approximately for large quantum numbers. This was not so
much philosophy as common sense. Yet in the hands of Bohr
and his school it yielded a rich harvest of results, beginning with
the calculation of the constant $R$ in the Balmer formula.† The
mysterious laws of spectroscopy were reduced to a few general
rules about the energy levels and the transitions between them.
The most important of these rules was Pauli's exclusion principle,
derived from a careful discussion of simple spectra; it says
that two or more electrons are never in the same quantum state,
described by fixed values of the quantum numbers (8.27)
belonging to all periods, including the electronic spin (Uhlenbeck
and Goudsmit). With the help of these simple principles the
periodic system of the elements could be explained in terms of
electronic states. But all these great achievements of Bohr's
theory are outside the scope of our present interest. I have,
however, to mention Bohr's considerations about the correspondence
between the amplitudes of the harmonic components of a
mechanical orbit and the intensity of certain spectral lines.
† See Atomic Physics, Ch. V. 1, p. 98; A. XIV, p. 300.
Consider an atom in the quantum state $n$ with energy $\epsilon_n$ and
suppose the orbit can be, for large $n$, approximately described
by giving the coordinates $q$ as functions of time. As these will
be periodic, one can represent $q$ as a harmonic (Fourier) series,
of the type
$$q(t) = \sum_{m=1}^{\infty} a_m(n)\cos[2\pi\nu(n)\,mt + \delta_m], \tag{8.28}$$
where the fundamental frequency $\nu(n)$ and the amplitudes
$a_m(n)$ depend on the number $n$ of the orbit considered. In reality,
the frequencies observed are not $\nu(n)$, $2\nu(n)$, $3\nu(n)$, ... but
$$\nu_{nm} = \frac{1}{h}(\epsilon_n - \epsilon_m),$$
and what about the amplitudes? It was clear that the squares
$|a_m(n)|^2$ should correspond in some way to the transition probabilities
$B_{nm} = B_{mn}$ introduced by Einstein in his derivation
of Planck's radiation law (8.16). But how could the $m$th overtone
of the $n$th orbit be associated with the symmetric relation
between two states $m$, $n$?
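For hydrogen the correspondence between overtones and quantum frequencies can be made concrete. The following sketch rests on two assumptions that go beyond the text: the terms are taken as $-R/n^2$, and the classical orbital frequency of the $n$th Bohr orbit is taken as $\nu(n) = 2R/n^3$, the standard large-$n$ result. It compares the frequency of the jump $n \to n - \tau$ with the $\tau$th harmonic of $\nu(n)$; the name `tau` for the harmonic number is an illustrative choice.

```python
def quantum_frequency(n, tau, rydberg=1.0):
    """Frequency of the jump n -> n - tau for hydrogen terms -R / n^2."""
    return rydberg * (1.0 / (n - tau) ** 2 - 1.0 / n ** 2)

def classical_overtone(n, tau, rydberg=1.0):
    """tau-th harmonic of the orbital frequency nu(n) = 2R / n^3."""
    return tau * 2.0 * rydberg / n ** 3

# The ratio tends to 1 for large quantum numbers, but not for small ones.
for n in (5, 50, 500, 5000):
    ratios = [quantum_frequency(n, tau) / classical_overtone(n, tau)
              for tau in (1, 2, 3)]
    print(n, ratios)
```

For small $n$ the two sets of frequencies differ markedly, which is just the situation described above: the observed frequencies are differences of energies, and only asymptotically multiples of a fundamental.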
This was the central problem of quantum physics in the years
between 1913 and 1925. In particular there arose a great interest
in measuring intensities of spectral lines, with the help of newly
invented recording microphotometers. Simple laws for the
intensities of the component lines of multiplets were discovered
(Ornstein, Moll), and presented in quadratic tables which look
so much like matrices that it is hard to understand why this
association of ideas did not happen in some brain.
It did not happen because the mind of the physicist was still
working on classical lines, and it needed a special effort to get
rid of this bias. One had to give up the idea of a coordinate
being a function of time, represented by a Fourier series like
(8.28); one had to omit the summation in this formula and to
take the set of unconnected terms as representative of the
coordinate. Then it became possible to replace the Fourier
amplitudes $a_m(n)$ by quantum amplitudes $a(m,n)$ with two
equivalent indices $m$, $n$, and to generalize the multiplication law
for Fourier coefficients into that for matrices†
$$c(m,n) = \sum_k a(m,k)\,b(k,n). \tag{8.29}$$
† See Atomic Physics, Ch. V. 3, p. 123; A. XV, p. 305.
Heisenberg justified the rejection of traditional concepts by a
general methodological principle: a satisfactory theory should
use no quantities which do not correspond to anything observable.
The classical frequencies $m\nu(n)$ and the whole idea of
orbits have this doubtful character. Therefore one should
eliminate them from the theory and introduce instead the
quantum frequencies $\nu_{nm} = h^{-1}(\epsilon_n - \epsilon_m)$, while the orbits should
be completely abandoned.
This suggestion of Heisenberg has been much admired as the
root of the success of quantum mechanics. Attempts have been
made to use it as a guide in overcoming the difficulties which
have meanwhile turned up in physics (in the application of
quantum methods to field theories and ultimate particles); yet
with little success. Now quantum mechanics itself is not free
from unobservable quantities. (The wave-function of Schrödinger,
for instance, is not observable, only the square of its
modulus.) To rid a theory of all traces of such redundant concepts
would lead to unbearable clumsiness. I think, though
there is much to be said for cleaning a theory in the way recommended
by Heisenberg, the success depends entirely on scientific
experience, intuition, and tact.
The essence of the new quantum mechanics is the representation
of physical quantities by matrices, i.e. by mathematical
entities which can be added and multiplied according to well-known
rules just like simple numbers, with the only difference
that the product is non-commutative. For instance, the quantum
conditions (8.27) can be transcribed as the commutation law
$$qp - pq = i\hbar. \tag{8.30}$$
The Hamiltonian form of mechanics can be preserved by replacing
all quantities by the corresponding matrices. In particular
the determination of stationary states can be reduced
to finding matrices $q$, $p$ for which the Hamiltonian $H(p,q)$
as a matrix has only diagonal elements, which are then the energy
levels of the states. In order to obtain the connexion with
Planck's theory of radiation, the squares $|q(m,n)|^2$ have to be
interpreted as Einstein's coefficients $B_{mn}$. In this way a few
simple examples could be satisfactorily treated. But matrix
mechanics applies obviously only to closed systems with discrete
energy-levels, not to free particles and collision problems.
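The simplest of these examples, the harmonic oscillator, can be written out with small matrices. This is a sketch under stated assumptions: units with $\hbar$, mass, and frequency all equal to 1, the familiar oscillator matrices for $q$ and $p$, and a truncation to a hypothetical size `N` (the true matrices are infinite, so the last diagonal element of each product is spoiled by the cut-off). It checks the commutation law in the form $qp - pq = i\hbar$ and that $H = \tfrac12(p^2 + q^2)$ is diagonal.

```python
import math

N = 6  # size of the truncated matrices (assumption; the true ones are infinite)

def matmul(a, b):
    """Matrix product c(m, n) = sum_k a(m, k) b(k, n)."""
    size = len(a)
    return [[sum(a[m][k] * b[k][n] for k in range(size)) for n in range(size)]
            for m in range(size)]

def sub(a, b):
    """Element-wise difference of two matrices."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

# Oscillator matrices: only neighbouring states are connected.
q = [[0j] * N for _ in range(N)]
p = [[0j] * N for _ in range(N)]
for n in range(1, N):
    amp = math.sqrt(n / 2.0)
    q[n - 1][n] = q[n][n - 1] = amp
    p[n - 1][n] = -1j * amp
    p[n][n - 1] = 1j * amp

commutator = sub(matmul(q, p), matmul(p, q))  # i on the diagonal, up to the cut-off
h = [[0.5 * (x + y) for x, y in zip(rp, rq)]
     for rp, rq in zip(matmul(p, p), matmul(q, q))]  # H = (p^2 + q^2) / 2

print([commutator[k][k] for k in range(N)])
print([h[k][k].real for k in range(N)])
```

Apart from the last element, the diagonal of $H$ gives the energy levels $\tfrac12, \tfrac32, \tfrac52, \ldots$, and the off-diagonal elements vanish.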
This restriction was removed by Schrödinger's wave mechanics
which sprang quite independently from an idea of de Broglie
about the application of quantum theory to free particles. It is
widely held that de Broglie's work is a striking example of the
power of the human mind to find natural laws by pure reason,
without recourse to observation. I have not taken part in the
beginnings of wave mechanics, as I have in matrix mechanics,
and cannot speak therefore from my own experience. Yet I
think that not a single step would have been possible if some
necessary foothold in facts had been missing. To deny this
would mean to maintain that Planck's discovery of the quantum
and Einstein's theory of relativity were products of pure thinking.
They were interpretations of facts of observation, solutions
of riddles given by Nature, difficult riddles indeed, which only
great thinkers could solve.
De Broglie observed that in relativity the energy of a particle
is not a scalar, but the fourth component of a vector in space-time,
whose other components represent the momentum $\mathbf{p}$; on
the other hand, the frequency $\nu$ of a plane harmonic wave is also
the fourth component of a space-time vector, whose other
components represent the wave vector $\mathbf{k}$ (having the direction
of the wave normal and the length $\lambda^{-1}$, where $\lambda$ is the wavelength).
Now if Planck postulates that $\epsilon = h\nu$, one is compelled
to assume also $\mathbf{p} = h\mathbf{k}$. For light waves where $\lambda\nu = c$, this had
already been done by Einstein, who spoke of photons behaving
like darts with the momentum $p = \epsilon/c = h\nu/c$. De Broglie
applied it to electrons where the relation between $\epsilon$ and $p$ is
more complicated, namely obtained from (8.3) by eliminating
the velocity $v$:
$$\frac{\epsilon^2}{c^2} - p^2 = m_0^2 c^2. \tag{8.31}$$
If a particle $(\epsilon, \mathbf{p})$ is always accompanied by a wave $(\nu, \mathbf{k})$, the
phase velocity of the wave would be (using $\epsilon = mc^2$, $p = mv$)
$$\nu\lambda = \nu/k = \epsilon/p = c^2/v > c, \tag{8.32}$$
apparently an impossible result, as the principle of relativity
excludes velocities larger than that of light. But de Broglie
was not deterred by this; he observed that the prohibition of
velocities larger than $c$ refers only to such motions which can be
used for sending time-signals. That is impossible by means
of a monochromatic wave. For a signal one must have a small
group of waves, the velocity of which can be obtained, according
to Rayleigh, by differentiating frequency with respect to wave
number. Thus, from (8.31) and (8.32),†
$$\frac{d\nu}{dk} = \frac{d\epsilon}{dp} = \frac{c^2 p}{\epsilon} = v,$$
a most satisfactory result which completely justifies the formal
connexion of particles and waves, though the physical meaning
of this connexion was still mysterious.
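The two velocities can be computed directly from the relativistic relation between energy and momentum. The sketch assumes $\epsilon = h\nu$, $p = hk$, and $\epsilon^2/c^2 - p^2 = m_0^2c^2$ with $m_0$ the rest mass; the electron mass and Planck's constant are rounded values, and the group velocity is obtained by a numerical derivative $d\nu/dk$.

```python
import math

C = 299792458.0         # speed of light, m/s
H_PLANCK = 6.626e-34    # Planck's constant, J s (rounded)
M_ELECTRON = 9.109e-31  # electron rest mass, kg (rounded)

def energy(p, m0=M_ELECTRON):
    """Relativistic energy from eps^2 / c^2 - p^2 = m0^2 c^2."""
    return C * math.sqrt((m0 * C) ** 2 + p * p)

def frequency(k, m0=M_ELECTRON):
    """nu as a function of the wave number k, using eps = h nu and p = h k."""
    return energy(H_PLANCK * k, m0) / H_PLANCK

def phase_and_group_velocity(v, m0=M_ELECTRON):
    """Phase velocity nu / k and group velocity d nu / d k for particle speed v."""
    p = m0 * v / math.sqrt(1.0 - (v / C) ** 2)
    k = p / H_PLANCK
    phase = frequency(k, m0) / k
    dk = 1e-6 * k
    group = (frequency(k + dk, m0) - frequency(k - dk, m0)) / (2.0 * dk)
    return phase, group

phase, group = phase_and_group_velocity(0.6 * C)
print(phase / C, group / C)
```

For a particle moving at 0.6 of the velocity of light the phase velocity exceeds $c$ while the group velocity reproduces the particle velocity, and their product is $c^2$.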
This reasoning is indeed a stroke of genius, yet not a triumph
of a priori principles, but of an extraordinary capacity for combining
and unifying remote subjects.
I should say the same about the work of Schrödinger and Dirac,
but you could better ask them directly what they think about the
roots of their discoveries. I shall not describe them here in detail,
but indicate some threads to other facts or theories. Schrödinger
says that he was stimulated by a remark of de Broglie that any
periodic motion of an electron must correspond to a whole
number of waves of the corresponding wave motion. This led
him to his wave equation whose eigenvalues are the energy
levels of stationary states. He was further guided by the analogy
of mechanics and optics known from Hamilton's investigations;
the relation of wave mechanics to ordinary mechanics is the
same as that of undulatory optics to geometrical optics. Then,
looking out for a connexion of wave mechanics with matrix
mechanics, Schrödinger recognized as the essential feature of a
matrix that it represents a linear operator acting on a vector
(one-column matrix), and came in this way to his operator calculus
(see Appendix, 27); if a coordinate $q$ is taken as an ordinary
variable and the corresponding momentum as the operator
$$p = \frac{\hbar}{i}\frac{\partial}{\partial q},$$
the commutation law (8.30) becomes a trivial identity. Applying
the theory of sets of orthonormal functions, he could then
establish the exact relation between matrix and wave mechanics.
† See Atomic Physics, Ch. IV. 5, p. 84; A. XI, p. 295.
It is most remarkable that the whole story has been developed
by Dirac from Heisenberg's first idea by an independent and
formally more general method based on the abstract concept of
non-commuting quantities ($q$-numbers).
The growth of quantum mechanics out of three independent
roots uniting to a single trunk is strong evidence for the inevitability
of its concepts in view of the experimental situation.
From the standpoint of these lectures on cause and chance it
is not the formalism of quantum mechanics but its interpretation
which is of importance. Yet the formalism came first, and was
well secured before it became clear what it really meant: nothing
more or less than a complete turning away from the predominance
of cause (in the traditional sense, meaning essentially
determinism) to the predominance of chance.
This revolution of outlook goes back to a tentative interpretation
which Einstein gave of the coexistence of light waves and
photons. He spoke of the waves being a 'ghost field' which has
no ordinary physical meaning but whose intensity determines
the probability of the appearance of photons. This idea could be
transferred to the relation of electrons (and of material particles
in general) to de Broglie's waves. With the help of Schrödinger's
wave equation, the scattering of particles by obstacles, the excitation
laws of atoms under electron bombardment, and other
similar phenomena could be calculated with results which confirmed
the assumption.
I shall now describe the present situation of the theory in a
formulation due to Dirac which is well adapted to comparing the
new statistical physics with the old deterministic one.
IX
CHANCE
QUANTUM MECHANICS
In quantum mechanics physical quantities or observables are
not represented by ordinary variables, but by symbols which
have no numerical values but determine the possible values of
the observable in a definite way to be described presently. These
symbols can be added and multiplied with the proviso that
multiplication is non-commutative: $AB$ is in general different
from $BA$. I cannot deal with the most general aspect of this
symbolic calculus, but shall consider a special representation,
namely that where the coordinates $q_1, q_2, \ldots$ of the particles are
regarded as ordinary numbers. Then a definite state of a system
is defined by a function $\psi(q_1, q_2, \ldots)$, and an observable $A$ can be
represented by a linear operator: $A\psi(q)$ means a new function
$\phi(q)$, the result of operating with $A$ on $\psi$. If this result is, apart
from a factor, identical with $\psi$,
$$A\psi = a\psi, \tag{9.1}$$
$\psi$ is called an eigenfunction of $A$ and the constant $a$ an eigenvalue.
The whole set of eigenvalues is characteristic for the operator
$A$ and represents the possible numerical values of the observable,
which may be continuous or discontinuous.
The coordinates $q$ themselves can be considered to be operators,
namely multiplication operators: $q_\alpha$ operating on $\psi$
means multiplying $\psi$ by $q_\alpha$. Operators whose eigenvalues are
all real numbers are called real (or Hermitian) operators. It
is clear that all physical quantities have to be represented by real
operators, as the eigenvalues are supposed to represent the possible
results of measuring a physical quantity. One can easily
see that not only the multiplication operators $q_\alpha$ but also the
momenta $p_\alpha = \frac{\hbar}{i}\frac{\partial}{\partial q_\alpha}$ are real. But for the formal argument one
can also use complex operators, of the form $C = A + iB$ (where
$i = \sqrt{-1}$), and its conjugate $C^* = A - iB$; then $CC^*$ can be
shown to be a real operator with only positive (or zero) eigenvalues.
If two observables are represented by non-commuting operators,
$A$ and $B$, their eigenfunctions are not all identical; if $a$ is
an eigenvalue of $A$ belonging to such an eigenfunction, there is
no state of the system for which a measurement can result in
finding simultaneously for $A$ and $B$ sharp numerical values $a$
and $b$.
The theory cannot therefore in general predict definite
values of all physical properties, but only probability laws.
The same experiment, repeated under identical and controllable
conditions, may result in finding for a quantity $A$ so many
times $a_1$, so many times $a_2$, etc., and for $B$ in the same way $b_1$
or $b_2$, etc. But the average of repeated measurements must be
predictable. Whatever the rule for constructing the number
which represents the average $\bar A$ of the measurements of $A$, it
must, by common sense, have the properties that $\overline{A+B} = \bar A + \bar B$
and $\overline{cA} = c\bar A$, if $c$ is any number.
From this alone there follows an important result. Consider
apart from the averages $\bar A$, $\bar B$ of two operators $A$, $B$ also their
mean square deviations, or the 'spreading' of the measurements,
$$\delta A = \sqrt{\overline{(A - \bar A)^2}}, \qquad \delta B = \sqrt{\overline{(B - \bar B)^2}}, \tag{9.2}$$
then by a simple algebraic reasoning (see Appendix, 28), which
uses nothing other than the fact stated above that $CC^*$ has no
negative eigenvalues, hence $\overline{CC^*} \ge 0$, it is found that
$$\delta A \cdot \delta B \ge \frac{\hbar}{2}\,\bigl|\overline{[A,B]}\bigr|, \tag{9.3}$$
where
$$[A,B] = \frac{1}{i\hbar}(AB - BA) \tag{9.4}$$
is the so-called 'commutator' of the two operators $A$, $B$. If this
is specially applied to a coordinate and its momentum, $A = p$,
$B = q$, one has $[q,p] = 1$, therefore
$$\delta p \cdot \delta q \ge \frac{\hbar}{2}. \tag{9.5}$$
This is Heisenberg's celebrated uncertainty principle which is
a quantitative expression for the effect of non-commutation on
measurements, but independent of the exact definition of averages.
It shows how a narrowing of the range for the measured
$q$-values widens the range for $p$. The same holds, according to
(9.3), for any two non-commuting observables with the difference
that the 'uncertainty' depends on the mean of the commutator.
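The inequality for a coordinate and its momentum can be illustrated on a grid. The sketch below makes several assumptions of its own: units with $\hbar = 1$, a single coordinate, real wave functions (for which the mean momentum vanishes and the mean of $p^2$ is $\hbar^2\int\psi'^2\,dq$), and averages formed with $|\psi|^2$ as weight. The two trial functions are a Gaussian, which attains the bound, and the first excited oscillator function, which does not.

```python
import math

def spreads(psi, q_min=-12.0, q_max=12.0, steps=4000, hbar=1.0):
    """Return (delta_q, delta_p) for a real wave function psi(q) on a grid."""
    dq = (q_max - q_min) / steps
    qs = [q_min + i * dq for i in range(steps + 1)]
    vals = [psi(q) for q in qs]
    norm = sum(v * v for v in vals) * dq
    mean_q = sum(q * v * v for q, v in zip(qs, vals)) * dq / norm
    var_q = sum((q - mean_q) ** 2 * v * v for q, v in zip(qs, vals)) * dq / norm
    # Real psi: mean momentum is zero, <p^2> = hbar^2 * integral of psi'(q)^2.
    derivs = [(vals[i + 1] - vals[i - 1]) / (2.0 * dq) for i in range(1, steps)]
    var_p = hbar ** 2 * sum(d * d for d in derivs) * dq / norm
    return math.sqrt(var_q), math.sqrt(var_p)

def gaussian(q):
    return math.exp(-q * q / 4.0)

def first_excited(q):
    return q * math.exp(-q * q / 2.0)

for name, psi in (("gaussian", gaussian), ("first excited", first_excited)):
    delta_q, delta_p = spreads(psi)
    print(name, delta_q, delta_p, delta_q * delta_p)
```

The products come out close to 0.5 and 1.5 respectively, in units of $\hbar$, so that neither falls below $\hbar/2$.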
These general considerations are, so to speak, the kinematical
part of quantum mechanics. Now we turn to the dynamical
part.
Just as in classical mechanics, the dynamical behaviour of a
system of particles is described by a Hamiltonian
which is a (differential) operator. It is usually just taken over
from classical mechanics (where, if necessary, products like $pq$
have to be 'symmetrized' into $\tfrac{1}{2}(pq + qp)$). In Dirac's relativistic
theory of the electron there are, apart from the space coordinates,
observables representing the spin (and similar quantities in
meson theory); they lead to no fundamental difficulty and will
not be considered here.
Yet one remark about the Hamiltonian H has to be made,
bearing on our general theme of cause and chance: H contains
in the potential energy (and in corresponding electromagnetic
interaction terms) the last vestiges of Newton's conception of
force, or, using the traditional expression, of causation. We
have to remember this point later.
In classical mechanics we have used a formulation of the laws
of motion which applies just as well to a simple system, where all
details of the motion are of interest, as to a system of numerous
particles, where only statistical results are desired (and possible).
A function $f(t,p,q)$ of time and of all coordinates and momenta
was considered; if $p$, $q$ change with time according to the equations
of motion, the total change of $f$ is given by
$$\frac{df}{dt} = \frac{\partial f}{\partial t} - [H,f],$$
where $[H,f]$ is the Poisson bracket
$$[H,f] = \sum_k \left(\frac{\partial H}{\partial q_k}\frac{\partial f}{\partial p_k} - \frac{\partial H}{\partial p_k}\frac{\partial f}{\partial q_k}\right). \tag{9.7}$$
One recovers the canonical equations by taking for $f$
simply $q_k$ or $p_k$ respectively. On the other hand, if one puts
$df/dt = 0$, any solution of this equation is an integral of the
equations of motion, and from a sufficient number of such
integrals $f_k(t,p,q) = c_k$ one can obtain the complete solution
giving all $p$, $q$ as functions of $t$.
But if this is not required, the same equation is also the means
for obtaining statistical information in terms of a solution $f$,
called the 'distribution function', as I have described in detail.
$f$ is that integral of
$$\frac{\partial f}{\partial t} = [H,f], \tag{9.8}$$
which for $t = 0$ goes over into a given initial distribution $f_0(p,q)$.
If, in particular, this latter function vanishes except in the
neighbourhood of a given point $p_0$, $q_0$ in phase space, or, in Dirac's
notation, if $f_0 = \delta(p - p_0)\,\delta(q - q_0)$, one falls back to the case of
complete knowledge, $q_0$ and $p_0$ being the initial values of $q$ and $p$.
This procedure cannot be transferred without alteration to
quantum mechanics for the simple reason that $p$ and $q$ cannot
be simultaneously given fixed values. The uncertainty relation
(9.5) forbids the prescribing of sharp initial values for all $p$ and $q$.
Hence the first part of the programme, namely a complete
knowledge of the motion in the same sense as in classical mechanics,
breaks down right from the beginning. Yet the second part,
statistical prediction, remains possible. Following Dirac, we
ask which quantities have to replace the Poisson brackets (9.7)
in quantum theory, where all quantities are in general non-commuting.
These brackets $[\alpha, \beta]$ have a number of algebraic
properties, the most important of them being
$$[\alpha, \beta_1 + \beta_2] = [\alpha, \beta_1] + [\alpha, \beta_2], \qquad [\alpha, \beta_1\beta_2] = \beta_1[\alpha, \beta_2] + [\alpha, \beta_1]\beta_2. \tag{9.9}$$
If one postulates that these shall hold also for non-commuting
quantities $\alpha$ and $\beta$, provided the order of factors is always
preserved (as it is in (9.9)), then it can be shown (Appendix, 29)
that $[\alpha, \beta]$ is exactly the commutator as defined by (9.4).
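That the commutator does obey the sum and product rules of the Poisson bracket, with the order of factors preserved, can be seen on small matrices. In this sketch the constant factor in front of $AB - BA$ is left out, since it multiplies every term alike; the three integer matrices are arbitrary illustrative choices.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def add(a, b):
    """Element-wise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def bracket(a, b):
    """AB - BA; the constant prefactor is omitted because it cancels in the rules."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]

alpha = [[1, 2], [3, 4]]
beta1 = [[0, 1], [5, -2]]
beta2 = [[2, -1], [1, 3]]

sum_rule = bracket(alpha, add(beta1, beta2)) == add(
    bracket(alpha, beta1), bracket(alpha, beta2))
product_rule = bracket(alpha, matmul(beta1, beta2)) == add(
    matmul(beta1, bracket(alpha, beta2)), matmul(bracket(alpha, beta1), beta2))
# With the factors in the wrong order the product rule fails for these matrices.
wrong_order = bracket(alpha, matmul(beta1, beta2)) == add(
    matmul(bracket(alpha, beta2), beta1), matmul(beta2, bracket(alpha, beta1)))

print(sum_rule, product_rule, wrong_order)  # prints: True True False
```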
Now one has to replace the function $f$ in (9.8) by a time-dependent
operator $\rho$, called the statistical operator, and to
determine $\rho$ from the equation (formally identical with (9.8)):
$$\frac{\partial\rho}{\partial t} = [H, \rho] \tag{9.10}$$
with suitable initial conditions. To express these in a simple
way it is convenient to represent all operators by matrices in
the $q$-space; $A$ operating on a function $\psi(q)$ is defined by
$$A\psi(q) = \int A(q,q')\,\psi(q')\,dq' \tag{9.11}$$
($q$ stands for all coordinates $q_1, q_2, \ldots$, and $q'$ for another set of
values $q_1', q_2', \ldots$), where $A(q,q')$ is called the matrix representing
$A$.
The product AB is represented by the matrix
A B(q, q') = J A (q, q")B(q", q') dq". (9.12)
If now p and H are taken as such matrices, where the elements
of p depend also on time, (9.10) is a differential equation for
p(t,q y q'), and the initial conditions are simply
p(o,g,?') = Po(?,fia (913)
where p is a given function of the two sets of variables.
The number of vector arguments in $\rho$ for a system of $N$ particles
is $2N$, exactly as in the case of classical theory in the function
$f(p, q)$. But while the meaning of $f$ depending on $p, q$ is
obvious, that of $\rho$ depending on two sets $q, q'$ is not, except in
one case, namely when the two sets are identical, $q = q'$; then
the function
$$\rho(t, q, q) = n(t, q) \tag{9.14}$$
is the number density, corresponding to the classical
$$\int f(t, q, p)\,dp = n(t, q).$$
Quite generally, the classical operation of integrating over the $p$'s
is replaced by the simpler operation of equating the two sets
of $q$'s, $q = q'$, or in matrix language, taking the diagonal elements
of $\rho$.
The average of an observable $A$ for a configuration $q$ must be
a real number $\bar{A}$ formed from $\rho$ and $A$ so that $n\bar{A}$ is linear in both
operators. The simplest expression of this kind is
$$n\bar{A} = \tfrac{1}{2}(\rho A + A\rho)_{q=q'}, \tag{9.15}$$
and this gives, in fact, all results of quantum mechanics usually
obtained with the help of the wave function. For instance, the
statistical matrix describing a stationary state where $A$ has a
sharp value $a$, belonging to the eigenfunction $\psi(a, q)$, is
$$\rho = \psi(a, q)\,\psi^*(a, q'). \tag{9.16}$$
Then, from the definition (9.12) it follows easily that for this $\rho$
and any real operator $A$ one has
$$A\rho = \rho A = a\rho, \tag{9.17}$$
hence for $q = q'$, with (9.14),
$$n(a, q) = |\psi(a, q)|^2, \qquad \bar{A} = a. \tag{9.18}$$
Thus we have obtained the usual assumption that $|\psi(a, q)|^2$
is the 'probability' (if normalized to 1) or 'number density'
(if normalized to $N$) at the point $q$ for the state $a$. (It must,
however, be noted that for systems of numerous particles, like
liquids in motion, other ways of averaging are useful, for instance
for the square of a momentum instead of $n\overline{p^2} = \tfrac{1}{2}(\rho p^2 + p^2\rho)_{q=q'}$
the expression $\tfrac{1}{4}(\rho p^2 + p^2\rho + 2p\rho p)_{q=q'}$, which, however, for uniform
conditions coincides with the former.)
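The relations (9.16) to (9.18) can be illustrated on a discrete stand-in for the $q$-space, where the integrals of (9.11) and (9.12) become finite sums. This is only a sketch under stated assumptions: the three-point grid, the real symmetric matrix `A` and its eigenvector `psi` (eigenvalue 2) are hypothetical choices made for the illustration.

```python
import math

# Sketch: a three-point "q-space"; integrals over q' are replaced by sums.


def matmul(a, b):
    """Discrete analogue of the matrix product (9.12)."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Hypothetical real symmetric observable and one of its eigenvectors.
A = [[2.0, 1.0, 0.0],
     [1.0, 2.0, 1.0],
     [0.0, 1.0, 2.0]]
a = 2.0                                                # sharp value of A
psi = [1 / math.sqrt(2), 0.0, -1 / math.sqrt(2)]       # A psi = a psi

# (9.16): rho(q, q') = psi(q) psi*(q')
rho = [[psi[i] * psi[j].conjugate() for j in range(3)] for i in range(3)]

A_rho = matmul(A, rho)
rho_A = matmul(rho, A)

# (9.14): the diagonal elements give the density n(q).
density = [rho[i][i] for i in range(3)]

# (9.15): n * Abar = (1/2)(rho A + A rho) taken at q = q'.
n_times_average = [0.5 * (rho_A[i][i] + A_rho[i][i]) for i in range(3)]

print(density, n_times_average)
```

On this grid `n_times_average` equals `a * density` point by point, which is the discrete form of $\bar{A} = a$ in (9.18).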
Let us consider the general stationary case where $\rho$ is independent
of time and therefore satisfies
$$[H, \rho] = 0. \tag{9.19}$$
Any solution of this equation, i.e. any quantity $\Lambda$ which commutes
with $H$, is called an integral of the motion, in analogy to
the corresponding classical conception. $H$ itself is, of course, an
integral. All integrals $\Lambda_1, \Lambda_2, \ldots$ have different eigenvalues
$\lambda_1, \lambda_2, \ldots$ for one and the same eigenfunction $\psi(\lambda_1, \lambda_2, \ldots; q_1, q_2, \ldots)$
or shortly $\psi(\lambda, q)$:
$$\Lambda_1\psi = \lambda_1\psi, \qquad \Lambda_2\psi = \lambda_2\psi, \quad \ldots. \tag{9.20}$$
$\rho$ can be taken as any function of the $\Lambda$'s; its matrix representation
is given by
$$\rho(q, q') = \sum_{\lambda} P(\lambda)\,\psi(\lambda, q)\,\psi^*(\lambda, q'), \tag{9.21}$$
from which one obtains, with (9.18),
$$n(q) = \rho(q, q) = \sum_{\lambda} P(\lambda)\,|\psi(\lambda, q)|^2 = \sum_{\lambda} P(\lambda)\,n(\lambda, q). \tag{9.22}$$
This shows that the arbitrary coefficient $P(\lambda)$ is the probability
of finding the system in the stationary state $\lambda$.
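A mixture of stationary states can be sketched in the same discrete setting. The matrix `H`, its three orthonormal eigenvectors and the weights `P` below are hypothetical illustrative values, and `H` itself is taken as the only integral of the motion. The sketch checks that such a $\rho$ commutes with `H`, and that the weights are recovered as the probabilities of the stationary states.

```python
import math

# Sketch: rho = sum over states of P * psi psi^T on a three-point grid.


def matmul(a, b):
    """Matrix product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


H = [[2.0, 1.0, 0.0],
     [1.0, 2.0, 1.0],
     [0.0, 1.0, 2.0]]

r = math.sqrt(2.0)
# Orthonormal eigenvectors of H (eigenvalues 2 + sqrt 2, 2, 2 - sqrt 2).
STATES = [
    [0.5, r / 2, 0.5],
    [1 / r, 0.0, -1 / r],
    [0.5, -r / 2, 0.5],
]
P = [0.5, 0.3, 0.2]   # assumed weights of the three stationary states

rho = [[sum(P[k] * STATES[k][i] * STATES[k][j] for k in range(3))
        for j in range(3)] for i in range(3)]

# Stationarity: rho commutes with H.
H_rho = matmul(H, rho)
rho_H = matmul(rho, H)
commutes = all(abs(H_rho[i][j] - rho_H[i][j]) < 1e-12
               for i in range(3) for j in range(3))

# Density from the diagonal of rho.
density = [rho[i][i] for i in range(3)]

# The weight of each state is recovered as psi^T rho psi.
recovered = [sum(STATES[k][i] * rho[i][j] * STATES[k][j]
                 for i in range(3) for j in range(3)) for k in range(3)]

print(commutes, density, recovered)
```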
Dynamical problems arise in a somewhat different way from
those in classical theory. There it has a definite meaning to
speak about the motion of particles in a closed system, for
instance of the orbit of Jupiter in the planetary system. In
quantum theory a closed system settles down in a definite
stationary state, or a mixture of such states as given by (9.21).
But then nothing is changing in time; one cannot even make an
observation without interfering with the state of the system.
In classical physics it is supposed that we have to do with an
objective and always observable situation; the process of measuring
is assumed to have no influence on the object of observation.
I have, however, drawn your attention to the point that even
in classical physics this postulate is practically never fulfilled
because of the Brownian motion which affects the instruments.
We are therefore quite prepared to find that the assumption of
'harmless' observations is impossible.
The most general way of formulating a dynamical problem is
to split the Hamiltonian in two parts
$$H = H_0 + V, \tag{9.23}$$
where $H_0$ describes what is of interest while $V$ is of minor importance,
a so-called perturbation. $V$ may also include external
influences and depend explicitly on the time. This partition is,
of course, arbitrary to a high degree; but it corresponds to the
actual situation. If a water molecule H$_2$O is assembled from its
atoms, one can either ask what the stationary states of the whole
system are, or one can consider the parts H$_2$ and O and ask how
the states of the hydrogen molecule H$_2$ are changed by the
approaching oxygen atom, or one can ask the same question
for the HO radical and the H atom. The latter two are dynamical
problems.
Dynamical problems in quantum theory therefore, in contrast
to those in classical theory, cannot be defined without a subjective,
more or less arbitrary decision about what you are interested
in. In other words, quantum mechanics does not describe an
objective state in an independent external world, but the aspect
of this world gained by considering it from a certain subjective
standpoint, or with certain experimental means and arrangements.
This statement has produced much controversy, and
though it is generally accepted by the present generation of
physicists it has been decidedly rejected by just those two men
who have done more for the creation of quantum physics than
anybody else, Planck and Einstein. Yet, with all respect, I
cannot agree with them. In fact, the assumption of absolute
observability which is the root of the classical concepts seems to
me only to exist in imagination, as a postulate which cannot be
satisfied in reality.
Assuming the partition (9.23) one has to describe the system in
terms of the integrals of motion $\Lambda_1, \Lambda_2, \ldots$ of $H_0$ which are, however,
not integrals of motion of $H$. All operators are then to be
expressed as matrices in the eigenvalues $\lambda = (\lambda_1, \lambda_2, \ldots)$ of $\Lambda_1, \Lambda_2, \ldots$;
for instance, the statistical operator $\rho$ by the matrix $\rho(t; \lambda, \lambda')$.
The diagonal elements of this matrix
$$P(t, \lambda) = \rho(t; \lambda, \lambda) \tag{9.24}$$
represent the probability of a state $\lambda$ at time $t$, and they go over
for $t = 0$ into the coefficients $P(\lambda)$ which appear in the expansion
(9.21) and represent the initial probabilities. The function
$\rho(t; \lambda, \lambda')$ can be determined from the differential equation (9.10)
by a method of successive approximations. My collaborator,
Green, has even found an elegant formula representing the complete
solution. To a second approximation one finds
$$P(t, \lambda) = P(\lambda) + \sum_{\lambda'} J(\lambda, \lambda')\{P(\lambda') - P(\lambda)\} + \ldots; \tag{9.25}$$
the coefficients are given by
$$J(\lambda, \lambda') = \frac{1}{\hbar^2}\left|\int_0^t V(t; \lambda, \lambda')\,e^{(i/\hbar)(E - E')t}\,dt\right|^2, \tag{9.26}$$
where $E$ is the energy of the unperturbed system in the state $\lambda$,
$E'$ that in the state $\lambda'$ (see Appendix, 30).
Now equation (9.25) has precisely the form of the laws of radioactive
decay, or of a set of competing monomolecular reactions.
The matrix $J(\lambda, \lambda')$ obviously represents the probability of a
transition or jump from the state $\lambda$ to the state $\lambda'$. This interpretation
becomes still more evident if one assumes that the
$\lambda$ values are practically continuous, as would be the case if the
system allowed particles to fly freely about (for instance in radioactivity
one has to take account of the emitted $\alpha$-particles; in the
theory of optical properties of an atom of the photons emitted
and absorbed). If external influences are excluded, so that $V$
does not depend on time, the integral (9.26) can be worked out
with the result that $J$ becomes proportional to the time
$$J(\lambda, \lambda') = j(\lambda, \lambda')\,t, \tag{9.27}$$
where
$$j(\lambda, \lambda') = \frac{2\pi}{\hbar}\,|V(\lambda, \lambda')|^2\,\delta(E - E'). \tag{9.28}$$
The last factor $\delta(E - E')$ says that $j(\lambda, \lambda')$ differs from zero for
two states $\lambda$ and $\lambda'$ only if their energy is equal. $j(\lambda, \lambda')$ is
obviously the transition probability per unit time, precisely
the quantity used in radioactivity.
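The proportionality to the time can be made plausible numerically. For a constant $V$ the time integral in (9.26) gives the factor $4\sin^2(\omega t/2)/\omega^2$ with $\omega = (E - E')/\hbar$, and its integral over all $\omega$ is $2\pi t$; this is what the $\delta$-function and the factor $t$ in (9.27), (9.28) express for a practically continuous set of states. The sketch below is an illustration only: the cutoff and step count are arbitrary numerical assumptions, and the midpoint rule stands in for the exact integral.

```python
import math


def window(omega, t):
    """|integral from 0 to t of exp(i*omega*s) ds|^2 = 4 sin^2(omega t / 2) / omega^2."""
    if abs(omega) < 1e-12:
        return t * t          # limiting value at omega = 0
    return 4.0 * math.sin(0.5 * omega * t) ** 2 / omega ** 2


def integrated_window(t, cutoff=200.0, steps=20000):
    """Midpoint-rule integral of window(omega, t) over [-cutoff, cutoff]."""
    h = 2.0 * cutoff / steps
    return h * sum(window(-cutoff + (k + 0.5) * h, t) for k in range(steps))


area_1 = integrated_window(1.0)   # close to 2*pi*1
area_2 = integrated_window(2.0)   # close to 2*pi*2

print(area_1 / (2 * math.pi), area_2 / area_1)
```

Doubling the time roughly doubles the integrated weight, up to a small error from the finite cutoff, which corresponds to $J = j\,t$.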
By applying the formula (9.25) to the case of the interaction
of an atom with an electromagnetic field one obtains the formula
(8.24) which was used by Einstein in his derivation of Planck's
radiation law. There are innumerable similar applications, such
as the calculation of the effective cross-sections of various kinds
of collision processes, which have provided ample confirmation
of the formula (9.25).
INDETERMINISTIC PHYSICS
There is no doubt that the formalism of quantum mechanics
and its statistical interpretation are extremely successful in
ordering and predicting physical experiences. But can our
desire of understanding, our wish to explain things, be satisfied
by a theory which is frankly and shamelessly statistical and
indeterministic? Can we be content with accepting chance, not
cause, as the supreme law of the physical world?
To this last question I answer that not causality, properly
understood, is eliminated, but only a traditional interpretation
of it, consisting in its identification with determinism. I have
taken pains to show that these two concepts are not identical.
Causality in my definition is the postulate that one physical
situation depends on the other, and causal research means the
discovery of such dependence. This is still true in quantum
physics, though the objects of observation for which a dependence
is claimed are different: they are the probabilities of
elementary events, not those single events themselves.
In fact, the statistical matrix $\rho$, from which these probabilities
are derived, satisfies a differential equation which is essentially
of the same type as the classical field equations for elastic or
electromagnetic waves. For instance, if one multiplies the
eigenfunction $\psi(q)$ of the Hamiltonian $H$, $H\psi = E\psi$, by $e^{-iEt/\hbar}$,
the new function $\phi$ satisfies
$$i\hbar\,\frac{\partial \phi}{\partial t} = H\phi. \tag{9.29}$$
For a free particle, where $H = \frac{1}{2m}(p_x^2 + p_y^2 + p_z^2) = -\frac{\hbar^2}{2m}\Delta$, (9.29)
goes over into the wave equation
$$\frac{2m}{i\hbar}\,\frac{\partial \phi}{\partial t} = \Delta\phi. \tag{9.30}$$
Although here only the first derivative with respect to time
appears, it does not differ essentially from the ordinary wave
equation, where the left-hand side is $\frac{1}{c^2}\frac{\partial^2 \phi}{\partial t^2}$. One must remember
that only $\phi\phi^* = |\phi|^2$ has a physical meaning (as a probability),
where $\phi^*$ satisfies the conjugate complex equation
$$-i\hbar\,\frac{\partial \phi^*}{\partial t} = H\phi^*.$$
For this pair of equations a change in the time direction ($t \to -t$)
can be compensated by exchanging $\phi$ and $\phi^*$, which has no
influence on $\phi\phi^*$.
The same holds in the general case (9.29), and we see that the
differential equations of the wave function share the property
of all classical field equations that the principle of antecedence
is violated: there is no distinction between past and future for
the spreading of the probability density. On the other hand, the
principle of contiguity is obviously satisfied.
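The statement about reversing the time can be tried out on a simple case. The sketch below assumes a particle in a one-dimensional box of unit length with $\hbar = m = 1$, the standard time factor $e^{-iEt/\hbar}$, and two arbitrarily chosen complex amplitudes; none of these choices come from the lectures. It checks by finite differences that the time-reversed and complex-conjugated function is again a solution, while time reversal without conjugation is not.

```python
import cmath
import math

HBAR = 1.0
MASS = 1.0
# Assumed amplitudes of the first two box eigenfunctions sin(k*pi*q).
COEFFS = {1: 0.6 + 0.0j, 2: 0.0 + 0.8j}


def phi(q, t):
    """Superposition of box eigenfunctions with factors exp(-i E t / hbar)."""
    total = 0j
    for k, c in COEFFS.items():
        energy = (HBAR * k * math.pi) ** 2 / (2 * MASS)
        total += c * math.sin(k * math.pi * q) * cmath.exp(-1j * energy * t / HBAR)
    return total


def chi(q, t):
    """Time-reversed and complex-conjugated partner of phi."""
    return phi(q, -t).conjugate()


def reversed_only(q, t):
    """Time reversal without conjugation (not a solution in general)."""
    return phi(q, -t)


def residual(wave, q, t, dq=1e-3, dt=1e-4):
    """|i hbar d(wave)/dt + (hbar^2 / 2m) d2(wave)/dq2| by central differences."""
    d_t = (wave(q, t + dt) - wave(q, t - dt)) / (2 * dt)
    d_qq = (wave(q + dq, t) - 2 * wave(q, t) + wave(q - dq, t)) / dq ** 2
    return abs(1j * HBAR * d_t + HBAR ** 2 / (2 * MASS) * d_qq)


q0, t0 = 0.3, 0.7
res_phi = residual(phi, q0, t0)
res_chi = residual(chi, q0, t0)
res_reversed_only = residual(reversed_only, q0, t0)

print(res_phi, res_chi, res_reversed_only)
```

The probability density of the partner at time $t$ equals that of the original function at time $-t$, so no direction of time is singled out.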
The differential equation itself is constructed in a way very
similar to the classical equations of motion. It contains in the
potential energy, which is part of the Hamiltonian, the classical
idea of force, or in other words, the Newtonian quantitative
expression for causation. If, for instance, particles are acting
on one another with a Coulomb force (as the nucleus and the
electrons in an atom), there appears in H the same timeless action
over finite distance as in Newtonian mechanics. Yet one has the
feeling that these vestiges of classical causality are provisional
and will be replaced in a future theory by something more satisfactory;
in fact, the difficulties which the application of quantum
mechanics to elementary particles encounters are connected with
the interaction terms in the Hamiltonian; they are obviously still
too 'classical'. But these questions are outside the scope of my
lectures.
We have the paradoxical situation that observable events obey
laws of chance, but that the probability for these events itself
spreads according to laws which are in all essential features
causal laws.
Here the question of reality cannot be avoided. What really
are those particles which, as it is often said, can just as well
appear as waves ? It would lead me far from my subject to discuss
this very difficult problem. I think that the concept of reality is
too much connected with emotions to allow a generally acceptable
definition. For most people the real things are those things
which are important for them. The reality of an artist or a poet
is not comparable with that of a saint or prophet, nor with that
of a business man or administrator, nor with that of the natural
philosopher or scientist. So let me cling to the latter kind of
special reality, which can be described in fairly precise terms.
It presupposes that our sense impressions are not a permanent
hallucination, but the indications of, or signals from, an external
world which exists independently of us. Although these signals
change and move in a most bewildering way, we are aware of
objects with invariant properties. The set of these invariants of
our sense impressions is the physical reality which our mind
constructs in a perfectly unconscious way. This chair here looks
different with each movement of my head, each twinkle of my
eye, yet I perceive it as the same chair. Science is nothing else
than the endeavour to construct these invariants where they are
not obvious. If you are not a trained scientist and look through
a microscope you see nothing other than specks of light and
colour, not objects; you have to apply the technique of biological
science, consisting in altering conditions, observing correlations,
etc., to learn that what you see is a tissue with cancer cells, or
something like that. The words denoting things are applied to
permanent features of observation or observational invariants.
In physics this method has been made precise by using mathematics.
There the invariant against transformation is an exact
notion. Felix Klein in his celebrated Erlanger Programm has
classified the whole of mathematics according to this idea, and
the same could be done for physics.
From this standpoint I maintain that the particles are real,
as they represent invariants of observation. We believe in the
'existence' of the electron because it has a definite charge e and a
definite mass m and a definite spin s; that means in whatever
circumstances and experimental conditions you observe an effect
which theory ascribes to the presence of electrons you find for
these quantities, e, m, s, the same numerical values.
Whether you can now, on account of these results, imagine
the electron like a tiny grain of sand, having a definite position in
space, that is another matter. In fact you can, even in quantum
theory. What you cannot do is to suppose it also to have a
definite velocity at the same time; that is impossible according to
the uncertainty relation. Though in our everyday experience
we can ascribe to ordinary bodies definite positions and velocities,
there is no reason to assume the same for dimensions which are
below the limits of everyday experience.
Position and velocity are not invariants of observation. But
they are attributes of the idea of a particle, and we must use
them as soon as we have made up our minds to describe certain
phenomena in terms of particles. Bohr has stressed the point that
our language is adapted to our intuitional concepts. We cannot
avoid using these even where they fail to have all the properties
of ordinary experience. Though an electron does not behave like
a grain of sand in every respect, it has enough invariant properties
to be regarded as just as real.
The fact expressed by the uncertainty relation was first discovered
by interpreting the formalism of the theory. An
explanation appealing to intuition was given afterwards, namely
that the laws of nature themselves prohibit the measurement
with infinite accuracy because of the atomic structure of matter:
the most delicate instruments of observation are atoms or
photons or electrons, hence of the same order of magnitude as
the objects observed. Niels Bohr has applied this idea with great
success to illustrate the restrictions on simultaneous measurements
of quantities subject to an uncertainty rule, which he calls
'complementary' quantities.
One can describe one and the same experimental situation
about particles either in terms of accurate positions or in terms
of accurate momenta, but not both at the same time. The two
descriptions are complementary for a complete intuitive understanding.
You find these things explained in many textbooks
so that I need not dwell upon them.
The adjective complementary is sometimes also applied to
the particle aspect and the wave aspect of phenomena, I think
quite wrongly. One can call these 'dual aspects' and speak of a
'duality' of description, but there is nothing complementary as
both pictures are necessary for every real quantum phenomenon.
Only in limiting cases is an interpretation using particles alone
or waves alone possible. The particle case is that of classical
mechanics and is applicable only to the case of large masses,
e.g. to the centre of mass of an almost closed system. The wave
case is that of very large numbers of independent particles, as
illustrated by ordinary optics.
The question of whether the waves are something 'real' or a
fiction to describe and predict phenomena in a convenient way
is a matter of taste. I personally like to regard a probability
wave, even in $3N$-dimensional space, as a real thing, certainly
as more than a tool for mathematical calculations. For it has
the character of an invariant of observation; that means it
predicts the results of counting experiments, and we expect to
find the same average numbers, the same mean deviations, etc.,
if we actually perform the experiment many times under the
same experimental condition. Quite generally, how could we
rely on probability predictions if by this notion we do not refer
to something real and objective ? This consideration applies just
as much to the classical distribution function $f(t; p, q)$ as to the
quantum-mechanical density matrix $\rho(t; q, q')$.
The difference between $f$ and $\rho$ lies only in the law of propagation,
a difference which can be described as analogous to that
between geometrical optics and undulatory optics. In the latter
case there is the possibility of interference. The eigenfunctions
of quantum mechanics can be superposed like light waves and
produce what is often called 'interference of probability'.
[FIG. 2. Two-slit arrangement: source A, screen B with two slits, screen C.]
This leads sometimes to puzzling situations if one tries to express
the observations only in terms of particles. Simple optical
experiments can be used as examples. Assume a source $A$ of
light illuminating a screen $B$ with two slits $B_1$, $B_2$, and the
light penetrating these observed on a parallel screen $C$. If only
one of the slits B is open, one sees a diffraction pattern around
the point where the straight line AB hits the screen, with a
bright central maximum surrounded by small fringes. When
both slits are open and the central maxima of the diffraction
pattern overlap, there appear in this region new interference
fringes, depending on the distance of the two slits.
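The appearance of new fringes when both slits are open can be sketched with a deliberately crude model: each slit is treated as a point source of unit amplitude, so the single-slit diffraction envelope is ignored, and the wavelength, slit separation and screen distance are hypothetical values chosen only for illustration. The point of the sketch is that the probability amplitudes are added before squaring.

```python
import cmath
import math

# Assumed geometry (illustrative values, in metres).
WAVELENGTH = 5.0e-7
SLIT_SEPARATION = 1.0e-4
SCREEN_DISTANCE = 1.0


def amplitude(x, slit_offset):
    """Unit-strength wave from a point slit at slit_offset, observed at x."""
    path = math.hypot(SCREEN_DISTANCE, x - slit_offset)
    return cmath.exp(2j * math.pi * path / WAVELENGTH)


def intensity_one_slit(x, slit_offset):
    """Intensity on the screen with only one slit open."""
    return abs(amplitude(x, slit_offset)) ** 2


def intensity_both_slits(x):
    """Intensity with both slits open: amplitudes add, then are squared."""
    half = SLIT_SEPARATION / 2
    return abs(amplitude(x, half) + amplitude(x, -half)) ** 2


def sum_of_single_slits(x):
    """What one would expect if the two one-slit intensities simply added."""
    half = SLIT_SEPARATION / 2
    return intensity_one_slit(x, half) + intensity_one_slit(x, -half)


x_centre = 0.0
x_dark = WAVELENGTH * SCREEN_DISTANCE / (2 * SLIT_SEPARATION)  # first dark fringe

print(intensity_both_slits(x_centre), sum_of_single_slits(x_centre))
print(intensity_both_slits(x_dark), sum_of_single_slits(x_dark))
```

In this model the sum of the one-slit intensities is the same everywhere, while the two-slit intensity swings between about twice that value and nearly zero.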
The intensity, i.e. the probability of finding photons on the
screen, in the case of both slits open, is therefore not a simple
superposition of those obtained when only one of the slits is
open. This is at once understandable if you use the picture of
probability waves determining the appearance of photons. For
the spreading of the waves depends on the whole arrangement,
and there is no miracle in the effect of shutting one slit. Yet if
you try to use the particles alone you get into trouble; for then a
particle must have passed one slit or the other and it is perfectly
mysterious how a slit at a finite distance can have an influence on
the diffraction pattern. Reichenbach, who has published a very
thorough book on the philosophical foundations of quantum
mechanics, speaks in such cases of 'causal anomalies'. To avoid
the perplexity produced by them he distinguishes between
phenomena, i.e. things really observable, such as the appearance
of the photons on the screen, and 'interphenomena', i.e. theoretical
constructions about what has happened to a photon on its
way, whether it has passed through one slit or the other. He
states rightly that the difficulties arise only from discussing
interphenomena. 'That a photon has passed through the slit
$B_1$ is meaningless as a statement of a physical fact.' If we want
to make it a physical fact we have to change the arrangement
in such a way that the passing of a photon through the slit $B_1$ can
be really registered; but then it would not fly on undisturbed,
and the phenomenon on the screen would be changed. Reichenbach's
whole book is devoted to the discussion of this type of
difficulty. I agree with many of his discussions, though I object
to others. For instance, he treats the interference phenomenon
of two slits also in what he calls the wave interpretation; but here
he seems to me to have misunderstood the optical question. In
order to formulate the permitted and forbidden (or meaningless)
statements he suggests the use of a three-valued logic, where the
law of the 'excluded middle' (tertium non datur) does not hold.
I have the feeling that this goes too far. The problem is not one
of logic or logistic but of common sense. For the mathematical
theory, which is perfectly capable of accounting for the actual
observations, makes use only of ordinary two-valued logics.
Difficulties arise solely if one transcends actual observations and
insists on using a special restricted range of intuitive images
and corresponding terms. Most physicists prefer to adapt their
imagination to the observations. Concerning the logical problem
itself, I had the impression when reading Reichenbach's
book that in explaining three-valued logic he constantly used
ordinary logic. This may be avoidable or justifiable. I remember
the days when I was in daily contact with Hilbert, who was
working on the logical foundations of mathematics. He distinguished
two stages of logics: intuitive logic dealing with finite
sets of statements, and formal logic (logistics), which he described
as a game with meaningless symbols invented to deal with the
infinite sets of mathematics avoiding contradictions (like that
revealed in Russell's paradox). But Gödel showed that these
contradictions crop up again, and Hilbert's attempt is today
generally considered a failure. I presume that three-valued
logic is another example of such a game with symbols. It is
certainly entertaining, but I doubt that natural philosophy will
gain much by playing it.
Thinking in terms of quantum theory needs some effort and
considerable practice. The clue is the point which I have
stressed above, that quantum mechanics does not describe a
situation in an objective external world, but a definite experimental
arrangement for observing a section of the external world.
Without this idea even the formulation of a dynamical problem
in quantum theory is impossible. But if it is accepted, the fundamental
indeterminacy in physical predictions becomes natural,
as no experimental arrangement can ever be absolutely precise.
I think that even the most fervent determinist cannot deny
that present quantum mechanics has served us well in actual
research. Yet he may still hope that one day it will be replaced
by a deterministic theory of the classical type.
Allow me to discuss briefly what the chances of such a counter-revolution
are, and how I expect physics to develop in future.
It would be silly and arrogant to deny any possibility of a
return to determinism. For no physical theory is final; new
experiences may force us to alterations and even reversions. Yet
scanning the history of physics in the way we have done we see
fluctuations and vacillations, but hardly a reversion to more
primitive conceptions. I expect that our present theory will be
profoundly modified. For it is full of difficulties which I have not
mentioned at all: the self-energies of particles in interaction
and many other quantities, like collision cross-sections, lead to
divergent integrals. But I should never expect that these
difficulties could be solved by a return to classical concepts. I
expect just the opposite, that we shall have to sacrifice some
current ideas and use still more abstract methods. However,
these are only opinions. A more concrete contribution to this
question has been made by J. v. Neumann in his brilliant book,
Mathematische Grundlagen der Quantenmechanik. He puts the
theory on an axiomatic basis by deriving it from a few postulates
of a very plausible and general character, about the properties
of 'expectation values' (averages) and their representation by
mathematical symbols. The result is that the formalism of
quantum mechanics is uniquely determined by these axioms; in
particular, no concealed parameters can be introduced with the
help of which the indeterministic description could be transformed
into a deterministic one. Hence if a future theory
should be deterministic, it cannot be a modification of the present
one but must be essentially different. How this should be
possible without sacrificing a whole treasure of well-established
results I leave to the determinists to worry about.
I for my part do not believe in the possibility of such a turn
of things. Though I am very much aware of the shortcomings
of quantum mechanics, I think that its indeterministic foundations
will be permanent, and this is what interests us from the
standpoint of these lectures on cause and chance. There remains
now only to show how the ordinary, apparently deterministic
laws of physics can be obtained from these foundations.
QUANTUM KINETIC THEORY OF MATTER
The main problem of the classical kinetic theory of matter
was how to reconcile the reversibility of the mechanical motion
of the ultimate particles with the irreversibility of the thermodynamical
laws of matter in bulk. This was achieved by proclaiming
a distinction between the true laws which are strictly
deterministic and reversible but of no use for us poor mortals
with our restricted means of observation and experimentation,
and the apparent laws which are the result of our ignorance and
obtained by a deliberate act of averaging, a kind of fraud or
falsification from the rigorous standpoint of determinism.
Quantum theory can appear with a cleaner conscience. It
has no deterministic bias and is statistical throughout. It has
accepted partial ignorance already on a lower level and need
not doctor the final laws.
In order to define a dynamical phenomenon one has, as we have
seen, to split the system in two parts, one being the interesting
one, the other a 'perturbation'; and this separation is highly
arbitrary and adaptable to the experimental arrangement to be
described. Now this circumstance can be exploited for the problem
of thermodynamics. There one considers two (or more)
bodies first separated and in equilibrium, then brought into
contact and left to themselves until equilibrium is again
attained.
Let $H^{(1)}$ be the Hamiltonian of the first body, $H^{(2)}$ that of the
second, and write
$$H_0 = H^{(1)} + H^{(2)}. \tag{9.31}$$
Then this is the combined Hamiltonian of the separated bodies.
If they are brought into contact the Hamiltonian will be different,
namely
$$H = H_0 + V, \tag{9.32}$$
where $V$ is the interaction, which for ordinary matter in bulk will
consist of surface forces. Now (9.32) has exactly the form of the
Hamiltonian of the fundamental dynamical problem, if we are
'interested' in $H_0$: and that is just the case.
Hence we describe the behaviour of the combined system by
the proper variables of the unperturbed system, i.e. by the integrals
of motion $\Lambda_1^{(1)}, \Lambda_2^{(1)}, \ldots$ of the first body, and the integrals
of motion $\Lambda_1^{(2)}, \Lambda_2^{(2)}, \ldots$ of the second body, which all together
form the integrals of motion of the separated bodies, represented
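by $H_0$. Hence we can use the solution of the dynamical problem
given before, namely (9.25),
$$P(t, \lambda) = P(\lambda) + \sum_{\lambda'} J(\lambda, \lambda')\{P(\lambda') - P(\lambda)\} + \ldots, \tag{9.33}$$
where now $\lambda$ represents the sets of eigenvalues $\lambda^{(1)} = (\lambda_1^{(1)}, \lambda_2^{(1)}, \ldots)$
of $\Lambda_1^{(1)}, \Lambda_2^{(1)}, \ldots$, and $\lambda^{(2)} = (\lambda_1^{(2)}, \lambda_2^{(2)}, \ldots)$ of $\Lambda_1^{(2)}, \Lambda_2^{(2)}, \ldots$.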
Let us consider first statistical equilibrium. Then $P(t, \lambda)$ is
independent of time, $P(t, \lambda) = P(\lambda)$;
hence the sum must vanish, and one must have
$$P(\lambda') = P(\lambda) \tag{9.34}$$
for any two states $\lambda, \lambda'$ for which the transition probability
$J(\lambda, \lambda')$ is not zero. But we have seen further that these quantities
$J(\lambda, \lambda')$ are in all practical cases proportional to the time
and vanish unless the energy is conserved, $E = E'$ (see formulae
9.27, 9.28). If we disregard cases where other constants of motion
exist for which a conservation law holds (like angular momentum
for systems free to rotate), one can replace $P(\lambda)$ by $P(E)$. But
as the total system consists of two parts which are practically
independent, one has
$$P(E) = P(\lambda) = P_1(\lambda^{(1)})\,P_2(\lambda^{(2)}), \tag{9.35}$$
where the two factors represent the probabilities of finding the
separated parts initially in the states $\lambda^{(1)}$ and $\lambda^{(2)}$. This factorization
need not be taken from the axioms of the calculus of probability;
it is a consequence of quantum mechanics itself. For
if the energy is a sum of the form (9.31), the exact solution of the
fundamental equation for the density operator
$$\frac{\partial \rho}{\partial t} = [H, \rho] \tag{9.36}$$
is $\rho = \rho_1\rho_2$, where $\rho_1$ refers to the first system $H^{(1)}$, $\rho_2$ to the
second $H^{(2)}$; as according to (9.24) $P(t, \lambda) = \rho(t; \lambda, \lambda)$, the product
formula (9.35) holds not only for the stationary case (as long as
the interactions can be neglected). If now $E^{(1)}(\lambda^{(1)})$ and $E^{(2)}(\lambda^{(2)})$
are the energies of the separated parts, one obtains from (9.35)
$$P(E^{(1)} + E^{(2)}) = P_1(E^{(1)})\,P_2(E^{(2)}), \tag{9.37}$$
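which is a functional equation for the three functions $P$, $P_1$, $P_2$.
The solution is easily found to be (see Appendix, 31)
$$P = e^{\alpha - \beta E}, \qquad P_1 = e^{\alpha_1 - \beta E^{(1)}}, \qquad P_2 = e^{\alpha_2 - \beta E^{(2)}}, \tag{9.38}$$
with
$$\alpha = \alpha_1 + \alpha_2, \qquad E = E^{(1)} + E^{(2)} \tag{9.39}$$
and the same $\beta$ in all three expressions.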
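The step from the functional equation to the exponential form can be sketched as follows. This is a short outline under the added assumption that the three functions are positive and differentiable; the full argument is the one referred to in Appendix 31, which is not reproduced here.

```latex
\begin{aligned}
&\text{Take the logarithm of (9.37) and differentiate once with respect to }
  E^{(1)} \text{ and once with respect to } E^{(2)}: \\
&\qquad \frac{P'(E^{(1)} + E^{(2)})}{P(E^{(1)} + E^{(2)})}
  = \frac{P_1'(E^{(1)})}{P_1(E^{(1)})},
  \qquad
  \frac{P'(E^{(1)} + E^{(2)})}{P(E^{(1)} + E^{(2)})}
  = \frac{P_2'(E^{(2)})}{P_2(E^{(2)})}. \\
&\text{The right-hand sides are equal although they depend on different variables,
  so both are a constant, called } -\beta: \\
&\qquad \frac{P_1'}{P_1} = \frac{P_2'}{P_2} = \frac{P'}{P} = -\beta. \\
&\text{Integration gives }
  \log P_1 = \alpha_1 - \beta E^{(1)}, \quad
  \log P_2 = \alpha_2 - \beta E^{(2)}, \quad
  \log P = (\alpha_1 + \alpha_2) - \beta\,(E^{(1)} + E^{(2)}).
\end{aligned}
```

The constants of integration add because the probabilities multiply, which is the relation between $\alpha$, $\alpha_1$ and $\alpha_2$ stated above.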
Thus we have found again the canonical distribution of Gibbs,
with the modification that the energies appearing are not explicit
functions of $q$ and $p$ (Hamiltonians) but of the eigenvalues
$\lambda$, $\lambda^{(1)}$, $\lambda^{(2)}$ of the integrals of motion.
This derivation is obviously a direct descendant of Maxwell's
first proof of his velocity distribution law which we discussed
previously, p. 51. But while the argument of independence is not
justifiable with regard to the three components of velocity, it is
perfectly legitimate for the constants of motion $\Lambda$. The fact
that the multiplication law of probabilities and the additivity
of energies for independent systems leads to the exponential
distribution law has, of course, been noticed and used by many
authors, beginning with Gibbs himself. This reasoning becomes,
with the help of quantum mechanics, an exact proof which shows
the limits of validity of the results. For if there exist constants
of motion other than the energy, the distribution law has to
be modified, and therefore the whole of thermodynamics. This
happens for instance for bodies moving freely in space, like stars,
where the quantity $\beta = 1/kT$ is no longer a scalar but the time
component of a relativistic four-vector, the other components
representing $\beta v$, where $v$ is the mean velocity of the body. Yet
this is outside the scope of these lectures.
The simplest and much discussed application of quantum
statistics is that to the ideal gases. It was Einstein who first
noticed that for very low temperatures deviations from the
classical laws should appear. The Indian physicist, Bose, had
shown that one can obtain Planck's law of radiation by regarding
the radiation as a 'photon gas' provided one did not treat
the photons as individual recognizable particles but as completely
indistinguishable. Einstein transferred this idea to
material atoms. Later it was recognized that this so-called
Bose-Einstein statistics was a straightforward consequence of quantum
mechanics; about the same time Fermi and Dirac discovered
another similar case which applies to electrons and other particles
with spin.
In the language used here the two 'statistics' can be simply
characterized by the symmetry of the density function
\[ \rho(x_1, x_2, \dots, x_N;\; x'_1, x'_2, \dots, x'_N). \]
It is always symmetric, for indistinguishable particles, in both
sets of arguments, i.e. it remains unchanged if both sets are
subject to the same permutation. If, however, only one set is
permuted, ρ remains also unchanged in the Bose-Einstein case
for all permutations, while in the Fermi-Dirac case it does so
only for even permutations, and changes sign for odd permutations.
Applied to a system of free particles of equal structure, one
obtains at once from the canonical distribution law the properties
of so-called degenerate gases. But as these are treated in
many textbooks, I shall not discuss them here (see Appendix, 32).
After having considered statistical equilibrium we have now
to ask whether quantum mechanics accounts for the fact that
every system approaches equilibrium in time by the dissipation
of visible energy into heat, or, in other words, whether the
H theorem of Boltzmann holds.
This is the case indeed, and not difficult to prove. One defines
the total entropy, just as in classical theory, by
\[ S = -k \sum_{\lambda} P(t,\lambda) \log P(t,\lambda), \tag{9.40} \]
where the summations are to be extended over all values of the
λ_1, λ_2, ...; i.e. for each separate part of a coupled system over
λ_1^(1), λ_2^(1), ... and λ_1^(2), λ_2^(2), ..., respectively, and for the whole system
over both sets. For loosely coupled systems the probabilities
are, as we have seen, multiplicative at any time:
\[ P(t; \lambda^{(1)}, \lambda^{(2)}) = P_1(t, \lambda^{(1)})\, P_2(t, \lambda^{(2)}). \tag{9.41} \]
From this it follows easily that the entropies are additive,
\[ S = S_1 + S_2. \tag{9.42} \]
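Now substitute into (9.40) the explicit expression for P(t,λ)
from (9.33) which holds for weak coupling; then by neglecting
higher powers of the small quantities J(λ,λ') one obtains
for the change of the entropy from its initial value S_0
\[ S - S_0 = \frac{k}{2\sum_{\lambda} P(\lambda)} \sum_{\lambda,\lambda'} J(\lambda,\lambda')\, Q(\lambda,\lambda'), \tag{9.43} \]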
where
\[ Q(\lambda,\lambda') = \{P(\lambda) - P(\lambda')\} \log\frac{P(\lambda)}{P(\lambda')}. \tag{9.44} \]
The transition probabilities J(λ,λ') are, as we have seen, in all
practical cases proportional to time and vanish for transitions
for which energy is not conserved; one has, according to (9.27)
and (9.28),
\[ J(\lambda,\lambda') = \frac{2\pi}{\hbar}\, t\, |V(\lambda,\lambda')|^2\, \delta(E - E'), \tag{9.45} \]
where V is the interaction potential. These quantities J(λ,λ')
are always positive. So is the denominator Σ_λ P(λ), while Q(λ,λ')
is positive as long as P(λ) differs from P(λ').
Hence S increases with time and will continue to do so, until
statistical equilibrium is reached. For only then no further
increase of S will happen, as is seen by taking equilibrium as the
initial state (where according to (9.34) Q(λ,λ') = 0 for all
non-vanishing transitions).
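The monotonic growth of S can be illustrated numerically. The sketch below is not the derivation of the text: it assumes a small system of three states with hypothetical symmetric, non-negative transition rates (standing in for J(λ,λ')/t) and integrates the rate equation dP_i/dt = Σ_j w_ij (P_j − P_i) by explicit Euler steps. The names entropy, step, relax, RATES and P0 are introduced here for illustration only.

```python
import math


def entropy(p):
    """Return -sum p log p, i.e. the entropy in units of Boltzmann's constant k."""
    return -sum(x * math.log(x) for x in p if x > 0.0)


def step(p, rates, dt):
    """One explicit Euler step of dP_i/dt = sum_j w_ij (P_j - P_i)."""
    n = len(p)
    return [
        p[i] + dt * sum(rates[i][j] * (p[j] - p[i]) for j in range(n))
        for i in range(n)
    ]


def relax(p0, rates, dt, n_steps):
    """Integrate the rate equation and record the entropy after every step."""
    p = list(p0)
    history = [entropy(p)]
    for _ in range(n_steps):
        p = step(p, rates, dt)
        history.append(entropy(p))
    return p, history


# Hypothetical symmetric, non-negative rates between three states
# (an assumption for illustration, not values from the text).
RATES = [
    [0.0, 1.0, 0.5],
    [1.0, 0.0, 2.0],
    [0.5, 2.0, 0.0],
]
P0 = [0.90, 0.08, 0.02]  # arbitrary non-equilibrium starting distribution

p_final, s_history = relax(P0, RATES, dt=0.01, n_steps=2000)
print("final probabilities:", p_final)
print("initial and final entropy:", s_history[0], s_history[-1])
```

With symmetric rates and a small enough time step, every step mixes the probabilities by a doubly stochastic matrix, so the recorded entropy never decreases and approaches log 3, the value for equal probabilities.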
It remains now to investigate whether quantum kinetics
leads, for matter in bulk, to the ordinary laws of motion and thermal
conduction as formulated by Cauchy. This is indeed the
case, as far as these laws are expressed in terms of stress, energy,
and flux of matter and heat. Yet, as we have seen, this is only
half the story, since Cauchy's equations are rather void of meaning
as long as the dependence of these quantities on strain,
temperature, and the rate of their changes in space and time is
not given. Now in these latter relations the difference between
quantum theory and classical theory appears and can reach vast
proportions under favourable circumstances, chiefly for low
temperatures. The theory sketched in the following is mainly
due to my collaborator, Green.
The formal method of obtaining the hydrothermal equations
is very similar to that used in classical theory. Starting from the
fundamental equation for N particles
\[ \frac{\partial \rho_N}{\partial t} = [H_N, \rho_N], \tag{9.46} \]
a reduction process is applied to obtain similar equations for
N−1, N−2, ... particles, until the laws of motion for one particle
are reached.
The reduction consists, as in classical theory, in averaging
over one, say the last, particle of a set. The coordinates of each
particle appear twice as arguments of a matrix
\[ \rho_n = \rho_n(x^{(1)}, \dots, x^{(n)};\; x^{(1)\prime}, \dots, x^{(n)\prime}). \]
Put here x^(n) = x^(n)' and integrate over x^(n); the result is χ_n ρ_n, a
matrix which depends only on x^(1), ..., x^(n−1); x^(1)', ..., x^(n−1)'.
With the same normalization as in classical theory, (6.40), p. 67,
we have
\[ \chi_{q+1}\, \rho_{q+1} = (N - q)\, \rho_q. \tag{9.47} \]
By applying this operation several times to (9.46) one obtains
(see Appendix, 33)
\[ \frac{\partial \rho_q}{\partial t} = [H_q, \rho_q] + S_q \qquad (q = 1, 2, \dots, N), \tag{9.48} \]
where
\[ S_q = \sum_{i=1}^{q} \chi_{q+1}\, [\Phi^{(i,q+1)}, \rho_{q+1}], \tag{9.49} \]
in full analogy with the corresponding classical equations (6.44),
(6.45). Here H_q means the Hamiltonian of q particles, Φ^(i,q+1)
the interaction between one of these (i) and a further particle
(q+1), and S_q is called, as before, the statistical term.
The quantity ρ_q(x, x) = n_q(x) represents the generalized
number density for a 'cluster' of q particles, and in particular
n_1(x) is the ordinary number density.
Now one can obtain generalized hydrothermodynamical
equations from (9.47) by a similar process to that employed in
classical theory. Instead of integrating over the velocities one
has to take the diagonal terms of the matrices (putting x = x'),
and one has to take some precautions in regard to non-commutativity
by symmetrizing products, e.g. replacing αβ by ½(αβ+βα)
(see Appendix, 33). Exactly as in the classical equations of
motion there appears the average kinetic energy of the particle
(i) in a cluster of q particles which, divided by k, may be called
a kinetic temperature of the particle (i) in the cluster of q particles.
One might expect that the quantity T_1 corresponds to the
ordinary temperature; but this is not the case.
It is well known from simple examples (e.g. the harmonic
oscillator) that in quantum theory for statistical equilibrium the
thermodynamic temperature T, defined as the integrating denominator
of entropy, is not equal to the mean square momentum.
Here in the case of non-equilibrium it turns out that not
only this happens, but that a similar deviation occurs with regard
to pressure. The thermodynamical pressure p is defined as the
work done by compression for unit change of volume; the kinetic
pressure p_1 is the isotropic part of the stress tensor in the equations
of motion. These two quantities differ in quantum theory.
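The oscillator example can be made concrete. For a one-dimensional harmonic oscillator in thermal equilibrium the mean energy is (ħω/2) coth(ħω/2kT), half of which is kinetic. The sketch below assumes, for illustration only, that a "kinetic temperature" is defined as twice the mean kinetic energy divided by k; this one-particle definition and the names kinetic_temperature and theta (= ħω/k, in degrees) are assumptions made here and are not the cluster quantity T_1 of the text.

```python
import math


def kinetic_temperature(T, theta):
    """Kinetic temperature 2<p^2/2m>/k of a 1D quantum oscillator.

    T     -- thermodynamic temperature in degrees absolute
    theta -- characteristic temperature hbar*omega/k of the oscillator
    Uses <p^2/2m> = (hbar*omega/4) * coth(hbar*omega / (2 k T)).
    """
    if T <= 0.0:
        # Zero-point motion: the kinetic energy does not vanish at T = 0.
        return 0.5 * theta
    x = theta / (2.0 * T)
    return 0.5 * theta / math.tanh(x)


if __name__ == "__main__":
    theta = 100.0  # hypothetical characteristic temperature
    for T in (0.0, 10.0, 50.0, 100.0, 1000.0):
        print(T, kinetic_temperature(T, theta))
```

At high temperatures the two temperatures agree, while for T well below theta the kinetic temperature stays near theta/2 because of the zero-point motion.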
Observable effects produced by this difference occur only for
extremely low temperatures. For gases these are so low that
they cannot be reached at all because condensation takes place
long before. Most substances are solid crystals in this region of
temperature; for these one has a relatively simple quantum
theory, initiated by Einstein, where the vibrating lattice is
regarded as equivalent to a set of oscillators (the 'normal modes').
This theory represents the quantum effects in equilibrium
(specific heat, thermal expansion) fairly well down to zero
temperature, while the phenomena of flow are practically
unobservable.
There are only two cases where quantum phenomena of flow
at very low temperatures are conspicuous. One is liquid helium
which, owing to its small mass and weak cohesion, does not
crystallize under normal pressure even for the lowest temperatures
and becomes suprafluid at about 2° absolute. The other
case is that of the electrons in metals which, though not an
ordinary fluid, behave in many respects like one and, owing to
their tiny mass, exhibit quantum properties, the strangest of
which is supraconductivity.
In order to confirm the principles of quantum statistics,
investigations of these two cases are of great interest. Both
have been studied theoretically in my department in Edinburgh,
and I wish to say a few words about our results.
In the suprafluid state helium behaves very differently from a
normal liquid. It appears to lose its viscosity almost completely;
it flows through capillaries or narrow slits with a fixed velocity
almost independent of the pressure, creeps along the walls of the
container, and so on. A metal in the supraconductive state has, as
the name says, no measurable electrical resistance and behaves
abnormally in other ways. A common and very conspicuous
feature of both phenomena is the sharpness of the transition
point which is accompanied by an anomaly of the specific
heat: it rises steeply if the temperature approaches the critical
value T_c from below, and drops suddenly for T = T_c, so that the
graph looks like the Greek letter λ; hence the expression λ-point
for T_c. However, this similarity cannot be very deeply rooted.
Where has one to expect, from the theoretical standpoint, the
beginning of quantum phenomena? Evidently when the momentum
p of the particles and some characteristic length l are reaching
the limit stated by the uncertainty principle, pl ~ ħ. If we
equate the kinetic energy p²/2m to the thermal energy kT,
the critical temperature will be given by kT_c ~ ħ²/2ml². If one
substitutes here for k and ħ the well-known numerical values and
for m the mass of a hydrogen atom times the atomic mass
number μ, one finds, in degrees absolute,
oo
T C~Jt> ( 9 ' 5 )
where l is measured in Ångström units (10^-8 cm.).
For a helium atom one has μ = 4, and if l is the mean distance
of two atoms (order 1 Å) one obtains for T_c a few degrees, which
agrees with the observed transition at about 2°. But for electrons
in metals one has μ = 1/1840. If one now assumes one
electron per atom and interprets l as their mean distance it would
be again of the order 1 Å, hence the expression (9.50) would
become some thousand degrees and has therefore nothing to do
with the λ-point of supraconductivity. This temperature has,
in fact, another meaning; it is the so-called 'degeneration temperature'
T_g of the electronic fluid; below T_g, for instance at
ordinary temperatures, there are already strong deviations from
classical behaviour (e.g. the extremely small contribution of the
electrons to the specific heat), though not of the extreme character
of supraconductivity. In order to explain the λ-point of
supraconductivity which lies for all metals at a few degrees
absolute, one has to take l about 200 times larger (~ 200 Å).
As the interpretation of this length is still controversial, I shall
not discuss supraconductivity any further (see Appendix, 34).
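The order-of-magnitude estimate kT_c ~ ħ²/2ml² can be checked with a few lines of arithmetic. The sketch below is an illustration under stated assumptions: it takes the relation literally with ħ (not h) and with kT (not 3kT/2) as the thermal energy, so its numerical prefactor need not coincide with the one printed in the text, and only the orders of magnitude are meaningful. The function name critical_temperature and the constant names are introduced here.

```python
# Physical constants in SI units (standard values, rounded).
HBAR = 1.054571817e-34   # reduced Planck constant, J s
K_B = 1.380649e-23       # Boltzmann constant, J/K
M_H = 1.6735575e-27      # mass of a hydrogen atom, kg
ANGSTROM = 1.0e-10       # metres per Angstrom unit


def critical_temperature(mu, l_angstrom):
    """Estimate T_c from k*T_c ~ hbar^2 / (2 m l^2), with m = mu * M_H.

    mu         -- particle mass in units of the hydrogen atom mass
    l_angstrom -- characteristic length in Angstrom units
    Returns the temperature in degrees absolute (kelvin).
    """
    m = mu * M_H
    l = l_angstrom * ANGSTROM
    return HBAR ** 2 / (2.0 * m * l ** 2 * K_B)


if __name__ == "__main__":
    print("helium, l = 1 A:      ", critical_temperature(4.0, 1.0))
    print("electrons, l = 1 A:   ", critical_temperature(1.0 / 1840.0, 1.0))
    print("electrons, l = 200 A: ", critical_temperature(1.0 / 1840.0, 200.0))
```

With these conventions helium at a spacing of 1 Å comes out at a few degrees, electrons at the same spacing at tens of thousands of degrees, and electrons with a length of 200 Å at about one degree, in line with the qualitative argument above.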
Nor do I intend in the case of suprafluidity of helium to give
a full explanation of the λ-discontinuity, but I wish to direct
your attention to the thermomechanical properties of the
supraliquid below the λ-point, called He II.
I have already mentioned that in quantum liquids one has to
distinguish the ordinary thermodynamic temperature T and
pressure p from the kinetic temperature T_1 and pressure p_1.
The hydrothermal equations contain only T_1 and p_1, and these
quantities are constant in equilibrium, i.e. for a state where no
change in time takes place. But T_1 and p_1 are not simple
functions of T and p but depend also on the velocity and its
gradient. Therefore in such a state permanent currents of mass
and of energy may flow as if no viscosity existed. This is reflected in
the energy balance which can be derived from the hydrothermal
equations. One obtains a curious result which looks like a violation
of the first law of thermodynamics; for the change of heat
is given by
\[ dQ = T\,dS = dU + p\,dV - V\,d\pi, \tag{9.51} \]
where all symbols have the usual meaning, and π = p_1 − p is
the difference of the kinetic and thermodynamic pressures. This
equation differs from the ordinary thermodynamical expression
(5.12) by the term −V dπ; how is this possible if thermodynamics
claims rightly universal validity? This claim is quite legitimate,
but the usual form of the expression for dQ depends on the
assumption that a quasistatic, i.e. very slow, process can be
regarded as a sequence of equilibria each determined by the
instantaneous values of pressure and volume. This is correct in
the classical domain, because if the rate of change of external
action (compression, heat supply, etc.) is slowed down, all velocities
in the fluid tend to disappear. Not so in quantum mechanics.
In consequence of the indeterminacy condition the momenta or
the velocities cannot decrease indefinitely if the coordinates of
the particles are restricted to very small regions. An investigation
of the hydrothermal equations shows that this effect is
preserved, to some degree, even for the visible velocities; it is
true there can exist a genuine statistical equilibrium where the
density is uniform and the currents of mass and energy vanish,
but there are also those states possible where certain combinations
of currents of mass (velocities) and of energy (heat) permanently
exist. The production of these depends entirely on
the way in which the heat dQ is supplied to the system and cannot
be suppressed by just making the rate of change of volume very
small. We have therefore not a breakdown of the law of conservation
of energy but of its traditional thermodynamical
formulation.
The consequences of that extra term in (9.51) are easily seen
by introducing instead of the internal energy the quantity
\[ \Xi = U - \pi V \tag{9.52} \]
in the expression (9.51) for dQ, which then reads
\[ dQ = d\Xi + p_1\,dV, \tag{9.53} \]
where p_1 = p + π is the kinetic pressure. This shows that the
specific heat at constant volume is
\[ c_v = \left(\frac{\partial \Xi}{\partial T}\right)_V, \tag{9.54} \]
not (∂U/∂T)_V as in classical thermodynamics. Now as p_1, and
therefore π = p_1 − p, is very large at T = 0 and decreases with
increasing T to reach the value 0 at the λ-point, one obtains for
c_v(T) a curve exactly of the form actually observed. Hence the
λ-anomaly is due to the coupling of heat currents with the
mass motion characteristic of quantum liquids. It is a molar,
macroscopic motion, the shape of which depends on the geometrical
conditions, presumably consisting of tiny closed threads
of fast-moving liquid, or groups of density waves.
A similar conception has been derived by several authors
(Tisza, Mendelssohn, Landau) from the experiments; they speak
of the liquid being a mixture of ordinary atoms and special
degenerate atoms (z-particles) which are in the lowest quantum
state and carry neither energy nor entropy. Yet in a liquid one
cannot attribute a quantum state to single atoms.
These considerations are also the clue to the understanding
of other anomalous phenomena, as the flow through narrow
capillaries or slits, the so-called fountain effect, the 'second
sound', etc. Green has studied the properties of He II in detail
and arrived at the conclusion that the quantum theory of liquids
can account for the strange behaviour of this substance.
I have dwelt on this special problem in some detail as it reveals
in a striking way that quantum phenomena are not confined to
atomic physics or microphysics where one aims at observing
single particles, but appear also in molar physics which deals
with matter in bulk. From the fundamental standpoint this
distinction, so essential in classical physics, loses much of its
meaning in quantum theory. The ultimate laws are statistical,
and the deterministic form of the molar equations holds for
certain averages which for large numbers of particles or quanta
are all one wants to know.
Now these molar laws satisfy all postulates of classical
causality: they are deterministic and conform to the principles
of contiguity and antecedence.
With this statement the circle of our considerations about
cause and chance in physics is closed. We have seen how classical
physics struggled in vain to reconcile growing quantitative
observations with preconceived ideas on causality, derived from
everyday experience but raised to the level of metaphysical
postulates, and how it fought a losing battle against the intrusion
of chance. Today the order of ideas has been reversed: chance
has become the primary notion, mechanics an expression of its
quantitative laws, and the overwhelming evidence of causality
with all its attributes in the realm of ordinary experience is
satisfactorily explained by the statistical laws of large numbers.
METAPHYSICAL CONCLUSIONS
THE statistical interpretation which I have presented in the last
section is now generally accepted by physicists all over the world,
with a few exceptions, amongst them a most remarkable one.
As I have mentioned before, Einstein does not accept it, but still
believes in and works on a return to a deterministic theory. To
illustrate his opinion, let me quote passages from two letters.
The first is dated 7 November 1944, and contains these lines:
'In unserer wissenschaftlichen Erwartung haben wir uns zu Antipoden
entwickelt. Du glaubst an den würfelnden Gott und ich an volle Gesetzlichkeit
in einer Welt von etwas objektiv Seiendem, das ich auf wild
spekulativem Weg zu erhaschen suche. Ich hoffe, dass einer einen mehr
realistischen Weg, bezw. eine mehr greifbare Unterlage für eine solche
Auffassung finden wird, als es mir gegeben ist. Der grosse anfängliche
Erfolg der Quantentheorie kann mich doch nicht zum Glauben an das
fundamentale Würfelspiel bringen.'
(In our scientific expectations we have progressed towards antipodes.
You believe in the dice playing god, and I in the perfect rule of law in a
world of something objectively existing which I try to catch in a wildly
speculative way. I hope that somebody will find a more realistic way,
or a more tangible foundation for such a conception than that which is
given to me. The great initial success of quantum theory cannot convert
me to believe in that fundamental game of dice.)
The second letter, which arrived just when I was writing these
pages (dated 3 December 1947), contains this passage:
'Meine physikalische Haltung kann ich Dir nicht so begründen, dass
Du sie irgendwie vernünftig finden würdest. Ich sehe natürlich ein, dass
die principiell statistische Behandlungsweise, deren Notwendigkeit im
Rahmen des bestehenden Formalismus ja zuerst von Dir klar erkannt
wurde, einen bedeutenden Wahrheitsgehalt hat. Ich kann aber deshalb
nicht ernsthaft daran glauben, weil die Theorie mit dem Grundsatz
unvereinbar ist, dass die Physik eine Wirklichkeit in Zeit und Raum
darstellen soll, ohne spukhafte Fernwirkungen. . . . Davon bin ich fest
überzeugt, dass man schliesslich bei einer Theorie landen wird, deren
gesetzmässig verbundene Dinge nicht Wahrscheinlichkeiten, sondern
gedachte Tatbestände sind, wie man es bis vor kurzem als selbstverständlich
betrachtet hat. Zur Begründung dieser Überzeugung kann
ich aber nicht logische Gründe, sondern nur meinen kleinen Finger als
Zeugen beibringen, also keine Autorität, die ausserhalb meiner Haut
irgendwelchen Respekt einflössen kann.'
(I cannot substantiate my attitude to physics in such a manner that
you would find it in any way rational. I see of course that the statistical
interpretation (the necessity of which in the frame of the existing formalism
has been first clearly recognized by yourself) has a considerable
content of truth. Yet I cannot seriously believe it because the theory
is inconsistent with the principle that physics has to represent a reality
in space and time without phantom actions over distances. ... I am
absolutely convinced that one will eventually arrive at a theory in which
the objects connected by laws are not probabilities, but conceived facts,
as one took for granted only a short time ago. However, I cannot provide
logical arguments for my conviction, but can only call on my little finger
as a witness, which cannot claim any authority to be respected outside
my own skin.)
I have quoted these letters because I think that the opinion
of the greatest living physicist, who has done more than anybody
else to establish modern ideas, must not be bypassed. Einstein
does not share the opinion held by most of us that there is overwhelming
evidence for quantum mechanics. Yet he concedes
'initial success' and 'a considerable degree of truth'. He obviously
agrees that we have at present nothing better, but he
hopes that this will be achieved later, for he rejects the 'dice-playing
god'. I have discussed the chances of a return to determinism
and found them slight. I have tried to show that classical
physics is involved in no less formidable conceptional difficulties
and had eventually to incorporate chance in its system. We
mortals have to play dice anyhow if we wish to deal with atomic
systems. Einstein's principle of the existence of an objective
real world is therefore rather academic. On the other hand, his
contention that quantum theory has given up this principle is
not justified, if the conception of reality is properly understood.
Of this I shall say more presently.
Einstein's letters teach us impressively the fact that even an
exact science like physics is based on fundamental beliefs. The
words ich glaube appear repeatedly, and once they are underlined.
I shall not further discuss the difference between Einstein's
principles and those which I have tried to extract from
the history of physics up to the present day. But I wish to
collect some of the fundamental assumptions which cannot be
further reduced but have to be accepted by an act of faith.
Causality is such a principle, if it is defined as the belief in the
existence of mutual physical dependence of observable situations.
However, all specifications of this dependence in regard
to space and time (contiguity, antecedence) and to the infinite
sharpness of observation (determinism) seem to me not fundamental,
but consequences of the actual empirical laws.
Another metaphysical principle is incorporated in the notion
of probability. It is the belief that the predictions of statistical
calculations are more than an exercise of the brain, that they can
be trusted in the real world. This holds just as well for ordinary
probability as for the more refined mixture of probability and
mechanics formulated by quantum theory.
The two metaphysical conceptions of causality and probability
have been our main theme. Others, concerning logic, arithmetic,
space, and time, are quite beyond the frame of these lectures.
But let me add a few more which have occasionally occurred,
though I am sure that my list will be quite incomplete. One is
the belief in harmony in nature, which is something distinct from
causality, as it can be circumscribed by words like beauty,
elegance, simplicity applied to certain formulations of natural
laws. This belief has played a considerable part in the development
of theoretical physics (remember Maxwell's equations of
the electromagnetic field, or Einstein's relativity), but how far
it is a real guide in the search of the unknown or just the expression
of our satisfaction to have discovered a significant relation,
I do not venture to say. For I have on occasion made the sad
discovery that a theory which seemed to me very lovely nevertheless
did not work. And in regard to simplicity, opinions will
differ in many cases. Is Einstein's law of gravitation simpler
than Newton's? Trained mathematicians will answer Yes,
meaning the logical simplicity of the foundations, while others
will say emphatically No, because of the horrible complication
of the formalism. However this may be, this kind of belief may
help some specially gifted men in their research; for the validity
of the result it has little importance (see Appendix, 35).
The last belief I wish to discuss may be called the principle of
objectivity. It provides a criterion to distinguish subjective
impressions from objective facts, namely by substituting for
given sense-data others which can be checked by other individuals.
I have spoken about this method when I had to define
temperature: the subjective feeling of hot and cold is replaced
by the reading of a thermometer, which can be done by any
person without a sensation of hot or cold. It is perhaps the most
important rule of the code of natural science of which innumerable
examples can be given, and it is obviously closely related
to the conception of scientific reality. For if reality is understood
to mean the sum of observational invariants (and I cannot see
any other reasonable interpretation of this word in physics) the
elimination of sense qualities is a necessary step to discover
them.
Here I must refer to the previous Waynflete Lectures given by
Professor E. D. Adrian, on The Physical Background of Perception,
because the results of physiological investigations seem to
me in perfect agreement with my suggestion about the meaning
of reality in physics. The messages which the brain receives have
not the least similarity with the stimuli. They consist in pulses
of given intensities and frequencies, characteristic for the transmitting
nerve-fibre, which ends at a definite place of the cortex.
All the brain 'learns' (I use here the objectionable language of
the 'disquieting figure of a little hobgoblin sitting up aloft in the
cerebral hemisphere') is a distribution or 'map' of pulses. From
this information it produces the image of the world by a process
which can metaphorically be called a consummate piece of combinatorial
mathematics: it sorts out of the maze of indifferent
and varying signals invariant shapes and relations which form
the world of ordinary experience.
This unconscious process breaks down for scientific ultra-experience,
obtained by magnifying instruments. But then it is
continued in the full light of consciousness, by mathematical
reasoning. The result is the reality offered by theoretical physics.
The principle of objectivity can, I think, be applied to every
human experience, but is often quite out of place. For instance:
what is a fugue by Bach? Is it the invariant cross-section, or the
common content of all printed or written copies, gramophone
records, sound waves at performances, etc., of this piece of music?
As a lover of music I say No! that is not what I mean by a fugue.
It is something of another sphere where other notions apply,
and the essence of it is not 'notions' at all, but the immediate
impact on my soul of its beauty and greatness.
In cases like this, the idea of scientific objective reality is
obviously inadequate, almost absurd.
This is trivial, but I have to refer to it if I have to make good
my promise to discuss the bearing of modern physical thought
on philosophical problems, in particular on the problem of free
will. Since ancient times philosophers have been worried how
free will can be reconciled with causality, and after the tremendous
success of Newton's deterministic theory of nature, this
problem seemed to be still more acute. Therefore, the advent of
indeterministic quantum theory was welcomed as opening a
possibility for the autonomy of the mind without a clash with
the laws of nature. Free will is primarily a subjective phenomenon,
the interpretation of a sensation which we experience,
similar to a sense impression. We can and do, of course, project
it into the minds of our fellow beings just as we do in the case
of music. We can also correlate it with other phenomena in order
to transform it into an objective relation, as the moralists,
sociologists, lawyers do, but then it resembles the original
sensation no more than an intensity curve in a spectral diagram
resembles a colour which I see. After this transformation into
a sociological concept, free will is a symbolic expression to
describe the fact that the actions and reactions of human beings
are conditioned by their internal mental structure and depend on
their whole and unaccountable history. Whether we believe
theoretically in strict determinism or not, we can make no use
of this theory since a human being is too complicated, and we
have to be content with a working hypothesis like that of spontaneity
of decision and responsibility of action. If you feel that
this clashes with determinism, you have now at your disposal
the modern indeterministic philosophy of nature; you can assume
a certain 'freedom', i.e. deviation from the deterministic laws,
because these are only apparent and refer to averages. Yet if
you believe in perfect freedom you will get into difficulties again,
because you cannot neglect the laws of statistics which are laws
of nature.
I think that the philosophical treatment of the problem of free
will suffers often (see Appendix, 36) from an insufficient distinction
between the subjective and objective aspect. It is
doubtless more difficult to keep these apart in the case of such
sensations as free will, than in the case of colours, sounds, or
temperatures. But the application of scientific conceptions to a
subjective experience is an inadequate procedure in all such
cases.
You may call this an evasion of the problem, by means of
dividing all experience into two categories, instead of trying to
form one all-embracing picture of the world. This division is
indeed what I suggest and consider to be unavoidable. If quantum
theory has any philosophical importance at all, it lies in the
fact that it demonstrates for a single, sharply defined science the
necessity of dual aspects and complementary considerations. Niels
Bohr has discussed this question with respect to many applications
in physiology, psychology, and philosophy in general.
According to the rule of indeterminacy, you cannot measure
simultaneously position and velocity of particles, but you have
to make your choice. The situation is similar if you wish, for
instance, to determine the physicochemical processes in the
brain connected with a mental process: it cannot be done because
the latter would be decidedly disturbed by the physical investigation.
Complete knowledge of the physical situation is only
obtainable by a dissection which would mean the death of the
living organ or the whole creature, the destruction of the mental
situation. This example may suffice; you can find more and
subtler ones in Bohr's writings. They illustrate the limits of
human understanding and direct the attention to the question
of fixing the boundary line, as physics has done in a narrow
field by discovering the quantum constant ħ. Much futile
controversy could be avoided in this way. To show this by a
final example, I wish to refer to these lectures themselves which
deal only with one aspect of science, the theoretical one. There
is a powerful school of eminent scientists who consider such
things to be a futile and snobbish sport, and the people who
spend their time on it drones. Science has undoubtedly two
aspects: it can be regarded from the social standpoint as a practical
collective endeavour for the improvement of human
conditions, but it can also be regarded from the individualistic
standpoint, as a pursuit of mental desires, the hunger for knowledge
and understanding, a sister of art, philosophy, and religion.
Both aspects are justified, necessary, and complementary. The
collective enterprise of practical science consists in the end of
individuals and cannot thrive without their devotion. But
devotion does not suffice; nothing great can be achieved without
the elementary curiosity of the philosopher. A proper balance is
needed. I have chosen the way which seemed to me to harmonize
best with the spirit of this ancient place of learning.
APPENDIX
1. (II. p. 8.) Multiple causes
Any event may have several causes. This possibility is not
excluded by my definition (given explicitly on p. 9), though I
speak there of A being 'the' cause of the effect B. Actually the
'number' of causes, i.e. of conditions on which an effect B
depends, seems to me a rather meaningless notion. One often
finds the idea of a 'causal chain' A_1, A_2, ..., where B depends
directly on A_1, A_1 on A_2, etc., so that B depends indirectly on
any of the A_n. As the series may never end (where is a 'first
cause' to be found?) the number of causes may be, and will be
in general, infinite. But there seems to be not the slightest
reason to assume only one such chain, or even a number of
chains; for the causes may be interlocked in a complicated way,
and a 'network' of causes (even in a multidimensional space)
seems to be a more appropriate picture. Yet why should it be
enumerable at all? The 'set of all causes' of an event seems to
me a notion just as dangerous as the notions which lead to logical
paradoxes of the type discovered by Russell. It is a metaphysical
idea which has produced much futile controversy. Therefore I
have tried to formulate my definition in such a way that this
question can be completely avoided.
2. (III. p. 13.) Derivation of Newton's law from Kepler's
laws
The fact that Newton's law is a logical consequence of Kepler's
laws is the basis on which my whole conception of causality in
physics rests. For it is, apart from Galileo's simple
demonstration, the first and foremost example of a timeless cause-effect
relation derived from observations. In most textbooks of
mechanics the opposite way (deduction of Kepler's laws from
Newton's) is followed. Therefore it may be useful to give the
full proof in modern terms.
We begin with formulating Kepler's laws, splitting the first
one in two parts:
Ia. The orbit of a planet is a plane curve.
Ib. It has the shape of an ellipse, one focus of which is the sun.
II. The area A swept by the radius vector increases
proportionally to time.
III. The ratio of the cube of the semi-axis a of the ellipse to
the square of the period T is the same for all planets.
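From Ia it follows that it suffices to consider a plane,
introducing rectangular coordinates $x, y$ and polar coordinates $r, \phi$, so
that
$x = r\cos\phi, \quad y = r\sin\phi.$
Indicating differentiation with respect to time by a dot, one
obtains for the velocity
$\dot{x} = \dot{r}\cos\phi - r\dot{\phi}\sin\phi, \quad \dot{y} = \dot{r}\sin\phi + r\dot{\phi}\cos\phi,$
and for the acceleration
$\ddot{x} = a_r\cos\phi - a_\phi\sin\phi, \quad \ddot{y} = a_r\sin\phi + a_\phi\cos\phi,$
where
$a_r = \ddot{r} - r\dot{\phi}^2,$  (1)
$a_\phi = 2\dot{r}\dot{\phi} + r\ddot{\phi}$  (2)
are the radial and tangential components of the acceleration.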
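Next we use II. The element of the area in polar coordinates
is obviously
$dA = \tfrac{1}{2} r^2\, d\phi.$
[Fig. 3.]
If the origin is taken at the centre of the sun, the rate of
increase of $A$ is constant, say $\tfrac{1}{2}h$, $dA = \tfrac{1}{2}h\,dt$, or
$2\dot{A} = r^2\dot{\phi} = h.$  (3)
Now it is convenient to use the variable
$u = 1/r$
instead of $r$ and to describe the orbit by expressing $u$ as a function
of $\phi$, $u(\phi)$. Then
$\dot{\phi} = h u^2,$  (4)
$\ddot{\phi} = 2hu\,\dot{u} = 2hu\,\frac{du}{d\phi}\,\dot{\phi} = 2h^2u^3\,\frac{du}{d\phi}.$  (5)
Further,
$\dot{r} = -\frac{\dot{u}}{u^2} = -\frac{1}{u^2}\frac{du}{d\phi}\,\dot{\phi} = -h\,\frac{du}{d\phi}.$  (6)
Substituting (4), (5), (6) into (2), one finds
$a_\phi = 0.$  (7)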
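Hence the acceleration has only a radial component $a_r$ with
respect to the sun. To obtain the value of $a_r$ we calculate, with
the help of (4),
$\ddot{r} = \frac{d\dot{r}}{dt} = \frac{d\dot{r}}{d\phi}\,\dot{\phi} = -h\,\frac{d^2u}{d\phi^2}\,\dot{\phi} = -h^2u^2\,\frac{d^2u}{d\phi^2},$
and substitute this in (1):
$a_r = -h^2u^2\Bigl(\frac{d^2u}{d\phi^2} + u\Bigr).$  (8)
Now we use Ib. The polar equation of an ellipse is
$r = \frac{q}{1 + e\cos\phi},$  (9)
where $q$ is the semi-latus rectum and $e$ the numerical eccentricity;
or $u = \frac{1}{q}(1 + e\cos\phi)$.
From this, one obtains
$\frac{du}{d\phi} = -\frac{e}{q}\sin\phi, \quad \frac{d^2u}{d\phi^2} = -\frac{e}{q}\cos\phi,$
hence from (8)
$a_r = -\frac{h^2}{q}\,u^2 = -\frac{h^2}{q}\,\frac{1}{r^2}.$  (10)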
The acceleration is directed to the sun (centripetal) and is
inversely proportional to the square of the distance.
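The step from the ellipse (9) to the inverse-square law (10) can be checked numerically. The following is only a sketch: the values of q, e and h are arbitrary illustrations, and the second derivative of u is taken by a central finite difference rather than analytically.

```python
import math

def radial_acceleration(phi, q, e, h, d=1e-4):
    """a_r = -h^2 u^2 (u'' + u), equation (8), for u(phi) = (1 + e cos phi)/q."""
    u = lambda p: (1.0 + e * math.cos(p)) / q
    # central finite difference for the second derivative of u
    u2 = (u(phi + d) - 2.0 * u(phi) + u(phi - d)) / d**2
    return -h**2 * u(phi)**2 * (u2 + u(phi))

q, e, h = 1.3, 0.4, 2.0   # illustrative values, not taken from the text
for phi in (0.0, 1.0, 2.5, math.pi):
    r = q / (1.0 + e * math.cos(phi))          # equation (9)
    a_r = radial_acceleration(phi, q, e, h)
    # compare with the inverse-square form -(h^2/q)/r^2 of equation (10)
    print(phi, a_r, -(h**2 / q) / r**2)
```

The two printed columns should agree to within the finite-difference error at every point of the orbit.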
According to the third law III one can write
$\frac{a^3}{T^2} = \frac{\mu}{4\pi^2},$  (11)
where the constant $\mu$ is the same for all planets.
Now integrating (3) for a full revolution one has
$2A = hT.$  (12)
On the other hand, the area of an ellipse is given by
$A = \pi ab,$  (13)
where $a$ and $b$ are the major and minor semi-axes.
Taking in (9) $\phi = 0$ and $\phi = \pi$ one gets the aphelion and
perihelion distances; half of the sum of those is the semi-major
axis:
$a = \frac{1}{2}\Bigl(\frac{q}{1+e} + \frac{q}{1-e}\Bigr) = \frac{q}{1-e^2},$
while the semi-minor axis is given by
$b = \frac{q}{\sqrt{1-e^2}};$
hence $aq = b^2$.
Substituting this in (13), one gets from (12)
$hT = 2\pi ab = 2\pi a^{3/2} q^{1/2};$
solving with respect to $h^2/q$ and using (11):
$\frac{h^2}{q} = \frac{4\pi^2 a^3}{T^2} = \mu.$  (14)
Therefore the law of acceleration (10) becomes
$a_r = -\frac{\mu}{r^2},$  (15)
where $\mu$ is the same for all planets, hence a property of the sun,
called the gravitational mass.
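The chain (11) to (14) can be retraced in a few lines. This is a sketch under stated assumptions: the function name and the sample orbit (units of astronomical units and years, eccentricity 0.0167) are illustrative choices, not data from the text.

```python
import math

def kepler_constants(a, e, T):
    """Return (h, q, mu) for an ellipse with semi-major axis a, eccentricity e, period T."""
    b = a * math.sqrt(1.0 - e**2)          # semi-minor axis
    q = b**2 / a                           # semi-latus rectum, from a*q = b^2
    h = 2.0 * math.pi * a * b / T          # from 2A = hT with A = pi*a*b, (12) and (13)
    mu = 4.0 * math.pi**2 * a**3 / T**2    # equation (11)
    return h, q, mu

h, q, mu = kepler_constants(a=1.0, e=0.0167, T=1.0)   # illustrative orbit
print(h**2 / q, mu)    # the two numbers coincide, as equation (14) states
```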
This demonstrates the statement of the text that Newton's
derivation of his law of force is purely deductive, based on the
inductive work of Tycho Brahe and Kepler. The new feature due
to Newton is the theoretical interpretation of the deduced formula
for the acceleration, as representing the 'cause' of the motion,
or the force determining the motion, which then led him to the
fundamental idea of general gravitation (each body attracts each
other one). In the textbooks this situation is not always clear;
this may be due to Newton's own representation in his Principia
where he uses only geometrical constructions in the classical
style of the Greeks. Yet it is known that he possessed the
methods of infinitesimal calculus (theory of fluxions) for many
years. I do not know whether he actually discovered his results
with the help of the calculus; it seems to me incredible that he
should not. He was obviously keen to avoid new mathematical
methods in order to comply with the taste of his contemporaries.
But it is known also that he liked to conceal his real ideas by
dressing them up. This tendency is found in Gauss and other
great mathematicians as well and has survived to our time,
much to the disadvantage of science.
Newton regarded the calculation of terrestrial gravity from
astronomical data as the crucial test of his theory, and he
withheld publication for years as the available data about the radius
of the earth were not satisfactory. The formula (3.3) of the text
is simply obtained by regarding the earth as central body and
the moon as 'planet'. Then $\mu$ is the gravitational mass of the
earth which can be obtained from (11) by inserting for $a$ the
mean distance $R$ of the centre of the moon from that of the earth,
and for $T$ the length of the month. Substituting $\mu = 4\pi^2R^3/T^2$
into (15), where $r$ is the radius of the earth, one obtains for the
acceleration on the earth's surface $g$ ($= -a_r$) the formula (3.3)
of the text,
$g = \frac{4\pi^2 R^3}{T^2 r^2}.$
If here the values $R/r = 60$, $R = 3.84 \times 10^{10}$ cm., and
$T$ = 27 d 7 h 43 m 11.5 s = $2.361 \times 10^{6}$ sec. are substituted, one
finds $g = 980.2$ cm. sec.$^{-2}$, while the observed value (extrapolated
to the pole) is $g = 980.6$ cm. sec.$^{-2}$.
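The arithmetic can be repeated directly from formula (3.3). This sketch uses the rounded inputs as printed above; with them the result comes out within about one part in a thousand of the quoted figure, the last digits depending on how the ratio R/r is rounded.

```python
import math

R = 3.84e10          # mean earth-moon distance in cm (as quoted)
ratio = 60.0         # R / r (as quoted, rounded)
T = ((27 * 24 + 7) * 60 + 43) * 60 + 11.5    # 27 d 7 h 43 m 11.5 s in seconds

r = R / ratio                                    # radius of the earth
g = 4.0 * math.pi**2 * R**3 / (T**2 * r**2)      # formula (3.3)
print(f"T = {T:.1f} s, g = {g:.1f} cm/s^2")
```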
This reasoning is based on the plausible assumption that the
acceleration produced by a material sphere at a point outside is
independent of the radial distribution of density and the mass of
the sphere can therefore be regarded as concentrated in the
centre. The rigorous proof of this lemma forms an important
part of Newton's considerations and was presumably achieved
with the help of his theory of fluxions.
3. (IV. p. 20.) Cauchy's mechanics of continuous media
The mathematical tool for handling continuous substances is
the following theorem of Gauss (also attributed to Green).
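If a vector field $\mathbf{A}$ is defined inside and on the surface $S$ of a
volume $V$, one has
$\int_V \operatorname{div}\mathbf{A}\, dV = \int_S \mathbf{A}\cdot\mathbf{n}\, dS,$  (1)
where $\mathbf{n}$ is the unit vector in the direction of the outer normal of
the surface element $dS$ and
$\operatorname{div}\mathbf{A} = \frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z} = \frac{\partial}{\partial\mathbf{x}}\cdot\mathbf{A}.$  (2)
If $\rho$ is the density, the total mass inside $V$ is
$m = \int_V \rho\, dV.$  (3)
The amount of mass leaving the volume through the surface per
unit time is
$\int_S \mathbf{u}\cdot\mathbf{n}\, dS,$
where $\mathbf{u} = \rho\mathbf{v}$ is the current, $\mathbf{v}$ the velocity.
The indestructibility of mass is then expressed by
$\dot{m} + \int_S \mathbf{u}\cdot\mathbf{n}\, dS = 0.$
Substituting (3) and applying (1), one obtains a volume
integral, which vanishes for any surface; hence its integrand
must be zero:
$\dot{\rho} + \operatorname{div}\mathbf{u} = 0.$  (4)
This is the continuity equation (4.5) of the text.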
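The content of the continuity equation, that mass leaves a cell only by flowing into its neighbour, can be illustrated in one dimension. The sketch below is not part of the text: the upwind discretization, the periodic grid and all numerical values are assumptions made for the illustration.

```python
def step_density(rho, v, dx, dt):
    """One conservative upwind step of rho_t + (rho v)_x = 0 on a periodic grid (v > 0)."""
    n = len(rho)
    flux = [rho[i] * v for i in range(n)]     # current u = rho * v in cell i
    # each cell loses flux[i] through its right face and gains flux[i-1] through its left
    return [rho[i] - dt / dx * (flux[i] - flux[i - 1]) for i in range(n)]

rho = [1.0 + (0.5 if 10 <= i < 20 else 0.0) for i in range(50)]
dx, dt, v = 1.0, 0.4, 1.0     # illustrative values with v*dt/dx < 1
mass0 = sum(rho) * dx
for _ in range(100):
    rho = step_density(rho, v, dx, dt)
print(mass0, sum(rho) * dx)   # total mass before and after
```

Because every flux term appears once with each sign, the total mass is unchanged up to rounding, which is the discrete counterpart of equation (4).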
Consider now the forces acting on the volume $V$. Neglecting
those forces which act on each volume element (like Newton's
gravitation), we assume with Cauchy that there are surface
forces or tensions, acting on each element $dS$ of the surface $S$,
and proportional to $dS$. They will also depend on the orientation
of $dS$, i.e. on the normal vector $\mathbf{n}$, and can therefore be written
$\mathbf{T}_n\, dS$. If $\mathbf{n}$ coincides with one of the three axes of coordinates
$x, y, z$, the corresponding forces per unit area may be
represented by the vectors $\mathbf{T}_x, \mathbf{T}_y, \mathbf{T}_z$. Now the projections of an
element $dS$ on the coordinate planes are
$dS_x = n_x\, dS, \quad dS_y = n_y\, dS, \quad dS_z = n_z\, dS.$
The equilibrium of the tetrahedron with the sides $dS, dS_x, dS_y, dS_z$
then leads to the equation
$\mathbf{T}_n\, dS = \mathbf{T}_x\, dS_x + \mathbf{T}_y\, dS_y + \mathbf{T}_z\, dS_z,$
or
$\mathbf{T}_n = \mathbf{T}_x n_x + \mathbf{T}_y n_y + \mathbf{T}_z n_z,$  (5)
which is the formula (4.6) of the text.
[Fig. 4.]
Consider further the equilibrium of a rectangular volume
element, and in particular its cross-section $z = 0$, with the sides
$dx, dy$. The components of $\mathbf{T}_x$ in this plane may be denoted by
$T_{xx}$ and $T_{xy}$, those of $\mathbf{T}_y$ by $T_{yx}, T_{yy}$.
[Fig. 5.]
Then the tangential components on the surfaces $dy\,dz$ and $dx\,dz$
produce a couple about the origin with the moment
$(T_{xy}\, dy\, dz)\, dx - (T_{yx}\, dx\, dz)\, dy.$
This must vanish in equilibrium; therefore one has
$T_{xy} = T_{yx}$
and the corresponding equations obtained by cyclic permutation
of the indices, (4.7) of the text. Hence the stress tensor $T$ defined
by (4.8) is symmetrical. One can express this, with the help of
(5), in the form
$(\mathbf{T}_n)_x = T_{xx} n_x + T_{yx} n_y + T_{zx} n_z = \mathbf{T}_x\cdot\mathbf{n},$
where $\mathbf{T}_x$ is the vector $(T_{xx}, T_{xy}, T_{xz})$.
The $x$-component of the total force $\mathbf{F} = \int \mathbf{T}_n\, dS$ can now be
transformed with the help of formula (1) into a volume integral
$F_x = \int_S (\mathbf{T}_n)_x\, dS = \int_S \mathbf{T}_x\cdot\mathbf{n}\, dS = \int_V \operatorname{div}\mathbf{T}_x\, dV.$
Using the tensor notation of the text, (4.10), one can write this
$\mathbf{F} = \int_V \operatorname{div} T\, dV.$  (6)
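This has to be equated to the rate of change of momentum of a
given amount of matter, i.e. enclosed in a volume moving in
time. One has for any function $\Phi$ of space
$\frac{d}{dt}\int_V \Phi\, dV = \lim_{\Delta t \to 0} \frac{1}{\Delta t}\Bigl\{\int_{V(t+\Delta t)} \Phi(t+\Delta t)\, dV - \int_{V(t)} \Phi(t)\, dV\Bigr\}$
$= \lim_{\Delta t \to 0} \frac{1}{\Delta t}\Bigl\{\int_V \frac{\partial\Phi}{\partial t}\,\Delta t\, dV + \int_{\Delta V} \Phi\, dV\Bigr\}.$
The second integral is extended over the volume between two
infinitesimally near positions of the surface, so that
$dV = \mathbf{n}\cdot\mathbf{v}\,\Delta t\, dS$
and therefore
$\int_{\Delta V} \Phi\, dV = \int_S \Phi\,\mathbf{n}\cdot\mathbf{v}\, dS\,\Delta t = \int_V \operatorname{div}(\Phi\mathbf{v})\, dV\,\Delta t.$
Hence
$\frac{d}{dt}\int_V \Phi\, dV = \int_V \Bigl\{\frac{\partial\Phi}{\partial t} + \operatorname{div}(\Phi\mathbf{v})\Bigr\}\, dV.$  (7)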
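If this is applied to the components of the momentum density
$\rho\mathbf{v}$ one obtains for the rate of change of the total momentum $\mathbf{P}$:
$\frac{d\mathbf{P}}{dt} = \int_V \rho\Bigl\{\frac{\partial\mathbf{v}}{\partial t} + \Bigl(\mathbf{v}\cdot\frac{\partial}{\partial\mathbf{x}}\Bigr)\mathbf{v}\Bigr\}\, dV + \int_V \mathbf{v}\,\{\dot{\rho} + \operatorname{div}(\rho\mathbf{v})\}\, dV.$
Here the second integral vanishes in consequence of the
continuity equation (4), with $\mathbf{u} = \rho\mathbf{v}$. In the first integral appears the
convective derivative, defined by (4.11) of the text,
$\frac{d}{dt} = \frac{\partial}{\partial t} + \mathbf{v}\cdot\frac{\partial}{\partial\mathbf{x}}.$
Hence
$\frac{d\mathbf{P}}{dt} = \int_V \rho\,\frac{d\mathbf{v}}{dt}\, dV.$  (8)
Now the equation of motion
$\frac{d\mathbf{P}}{dt} = \mathbf{F}$
reduces in virtue of (6) and (8) to (4.9) of the text:
$\rho\,\frac{d\mathbf{v}}{dt} = \operatorname{div} T.$  (9)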
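Consider in particular an elastic fluid where
$T_{xx} = T_{yy} = T_{zz} = -p, \quad T_{yz} = T_{zx} = T_{xy} = 0,$
and the pressure $p$ is a function of $\rho$ alone. Then the continuity
equation and the equations of motion
$\dot{\rho} + \operatorname{div}(\rho\mathbf{v}) = 0, \quad \rho\,\frac{d\mathbf{v}}{dt} = -\frac{\partial p}{\partial\mathbf{x}}$
are four differential equations for the four functions $\rho, v_x, v_y, v_z$.
If one wishes to determine small deviations from equilibrium,
then $\mathbf{v}$ and $\phi = \rho - \rho_0$ are small and $\rho_0$ constant with regard to
space and time. Then the last two equations reduce in first
approximation to
$\dot{\phi} + \rho_0 \operatorname{div}\mathbf{v} = 0, \quad \rho_0\,\frac{\partial\mathbf{v}}{\partial t} = -\Bigl(\frac{dp}{d\rho}\Bigr)_0 \frac{\partial\phi}{\partial\mathbf{x}}.$
By differentiating the first of these with respect to time and
substituting $\rho_0\,\partial\mathbf{v}/\partial t$ from the second, one finds
$\ddot{\phi} = \Bigl(\frac{dp}{d\rho}\Bigr)_0 \operatorname{div}\frac{\partial\phi}{\partial\mathbf{x}},$
or with $\operatorname{div}\frac{\partial}{\partial\mathbf{x}} = \frac{\partial}{\partial\mathbf{x}}\cdot\frac{\partial}{\partial\mathbf{x}} = \Delta$,
$\frac{1}{c^2}\,\frac{\partial^2\phi}{\partial t^2} = \Delta\phi, \quad c^2 = \Bigl(\frac{dp}{d\rho}\Bigr)_0.$  (10)
This is the equation (4.13) of the text applied to the variation of
density $\phi$. Each of the velocity components satisfies the same
equation, which is the prototype of all laws of wave propagation.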
4. (IV. p. 24.) Maxwell's equations of the electromagnetic
field
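The mathematical part of Maxwell's work consisted in
condensing the experimental laws, mentioned in the text, in a set
of differential equations which, with the usual notation, are
$\operatorname{div}\mathbf{D} = 4\pi\rho, \quad \operatorname{curl}\mathbf{H} = \frac{4\pi}{c}\,\mathbf{u},$
$\operatorname{div}\mathbf{B} = 0, \quad \operatorname{curl}\mathbf{E} + \frac{1}{c}\,\dot{\mathbf{B}} = 0,$  (1)
$\mathbf{D} = \epsilon\mathbf{E}, \quad \mathbf{B} = \mu\mathbf{H}.$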
To give a simple example, Coulomb's law for the electrostatic
field is obtained by putting $\mathbf{B} = 0$, $\mathbf{H} = 0$, $\mathbf{u} = 0$; then there
remains
$\operatorname{div}\mathbf{D} = 4\pi\rho, \quad \operatorname{curl}\mathbf{E} = 0.$
The second equation implies that there is a potential $\phi$, such that
$\mathbf{E} = -\frac{\partial\phi}{\partial\mathbf{x}}.$
In vacuo, where $\mathbf{D} = \mathbf{E}$, one obtains therefore Poisson's
equation
$\operatorname{div}\mathbf{E} = -\Delta\phi = 4\pi\rho.$
The solution is
$\phi = \int \frac{\rho}{r}\, dV,$  (2)
provided singularities are excluded; this formula expresses
Coulomb's law for a continuous distribution of density. In a
similar way one obtains for stationary states ($\dot{\mathbf{B}} = 0$) the law of
Biot and Savart for the magnetic field of a current of density $\mathbf{u}$.
Maxwell's physical idea consisted in discovering the
asymmetry in the equations (1) which, in our style of writing, is
obvious even to the untrained eye: the missing term $\frac{1}{c}\dot{\mathbf{D}}$ in the
second equation. The logical necessity of this term follows from
the fact of the existence of open currents, e.g. discharges of
condensers through wires. In this case the charge on the condenser
changes in time, hence $\dot{\rho} \neq 0$; on the other hand, the equations
(1) imply $\frac{4\pi}{c}\operatorname{div}\mathbf{u} = \operatorname{div}\operatorname{curl}\mathbf{H} = 0$. Therefore the continuity
equation $\dot{\rho} + \operatorname{div}\mathbf{u} = 0$ is violated.
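To amend this Maxwell postulated a new type of current
bridging the gap between the conductors in the condenser, with
a certain density $\mathbf{w}$, so that
$\operatorname{curl}\mathbf{H} = \frac{4\pi}{c}(\mathbf{u} + \mathbf{w}).$  (3)
Then taking the div operation one has
$\operatorname{div}\mathbf{w} = -\operatorname{div}\mathbf{u} = \dot{\rho} = \frac{1}{4\pi}\operatorname{div}\dot{\mathbf{D}}.$
The simplest way of satisfying this equation is putting
$\mathbf{w} = \frac{1}{4\pi}\,\dot{\mathbf{D}},$  (4)
so that the corresponding equation in Maxwell's set becomes
$\operatorname{curl}\mathbf{H} - \frac{1}{c}\,\dot{\mathbf{D}} = \frac{4\pi}{c}\,\mathbf{u}$  (5)
and complete symmetry between electric and magnetic
quantities is obtained (apart from the fact that the latter have no true
charge and current).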
The modified system of field equations permits the prediction
of waves with finite velocity. In an isotropic substance free of
charges and currents ($\rho = 0$, $\mathbf{u} = 0$, $\mathbf{D} = \epsilon\mathbf{E}$, $\mathbf{B} = \mu\mathbf{H}$) one has
$\operatorname{curl}\mathbf{H} - \frac{\epsilon}{c}\,\dot{\mathbf{E}} = 0, \quad \operatorname{curl}\mathbf{E} + \frac{\mu}{c}\,\dot{\mathbf{H}} = 0;$
taking the curl of one of them, and using the formula that for a
vector with vanishing div one has $\operatorname{curl}\operatorname{curl} = -\Delta$, one obtains
for each component of $\mathbf{E}$ and $\mathbf{H}$ the wave equation
$\frac{1}{c_1^2}\,\ddot{\mathbf{E}} = \Delta\mathbf{E}, \quad c_1 = \frac{c}{\sqrt{\epsilon\mu}}.$
For vacuum ($\epsilon = \mu = 1$) the velocity of propagation should
therefore be equal to the electromagnetic constant $c$. As stated
in the text, this constant has the dimensions of a velocity and can
be measured by determining the magnetic field of a current
produced by a condenser discharge (measured therefore
electrostatically). Such experiments had been performed by
Kohlrausch and Weber, and their result for $c$ agreed with the velocity
of light in vacuo. This evidence for the electromagnetic theory of
light was strongly enhanced by experiments carried out by
Boltzmann, which showed that the velocity of light in simple
substances (rare gases, which are monatomic) can be calculated
from their dielectric constant $\epsilon$ ($\mu$ being practically = 1) with the
help of Maxwell's formula $c_1 = c/\sqrt{\epsilon}$.
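Maxwell's formula is easy to evaluate. In the sketch below the function name, the CGS value of c and the dielectric constant of the gas are assumptions chosen for illustration; the last is a hypothetical figure of the order found for rare gases, not a measured value from the text.

```python
import math

C_VACUUM = 2.99792458e10    # velocity of light in vacuo, cm/s

def light_velocity(eps, mu=1.0):
    """Maxwell's formula c_1 = c / sqrt(eps * mu)."""
    return C_VACUUM / math.sqrt(eps * mu)

eps_gas = 1.00007           # hypothetical dielectric constant of a rare gas
c1 = light_velocity(eps_gas)
print(c1, C_VACUUM / c1)    # velocity in the gas and the implied refractive index
```

The ratio printed in the second column is the refractive index, which on this theory equals the square root of the dielectric constant.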
Maxwell's formulation satisfies contiguity, but its relation to
Cauchy's form of the dynamical laws has still to be established.
The electric and magnetic field vectors, though originally defined
by the forces on point charges and magnetic poles (which actually
do not exist), are defined by the equations also in places where
neither charges nor currents exist. Yet they are not stresses
themselves; they are analogous to strains, on which the stresses
depend. The law of this connexion has also been found by
Maxwell; it is a mathematical formulation of Faraday's intuitive
interpretation of the mechanical reactions between electrified
and magnetized bodies. A short indication must here suffice.
Apart from the electric force on a point charge $e$, $\mathbf{F} = e\mathbf{E}$,
there exists a mechanical force on the element of a linear current
$\mathbf{u}$, produced by a magnetic field $\mathbf{H}$; this force is perpendicular
to $\mathbf{H}$ and to the current $\mathbf{u}$ and therefore does no work. It is not
quite uniquely determined, as one can obviously add any force
whose line integral over a closed circuit vanishes. The simplest
expression is:
$\mathbf{F} = \frac{1}{c}\,\mathbf{u}\wedge\mathbf{B},$
as can be seen by considering the change of magnetic energy
$\frac{1}{8\pi}\int \mathbf{H}\cdot\mathbf{B}\, dV$ produced by a virtual displacement of an element
of the current.
To illustrate Maxwell's procedure it suffices to consider charge
distributions in vacuo with density $\rho$ and current $\mathbf{u} = \rho\mathbf{v}$.
Combining the two forces $\mathbf{F} = e\mathbf{E}$ and $\mathbf{F} = \frac{1}{c}\,\mathbf{u}\wedge\mathbf{B}$ into one
expression, one has for the density of force
$\mathbf{f} = \rho\Bigl(\mathbf{E} + \frac{1}{c}\,\mathbf{v}\wedge\mathbf{B}\Bigr),$  (7)
the so-called Lorentz force.
Substituting here for $\rho$ and $\rho\mathbf{v}$ the expressions from Maxwell's
equations one can, by elementary transformations, bring $\mathbf{f}$ into
the form of Cauchy
[formula for $\mathbf{f}$ not legible in the source]
where
[formulae for the components of the tension tensor not legible in the source]
These are the celebrated formulae of Maxwell's tensions. They
can be easily generalized for material bodies with dielectric
constant and permeability, and they have become the prototype
for similar expressions in other field theories, e.g. gravitation
(Einstein), electronic field (Dirac), meson field (Yukawa).
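For orientation, the form in which Maxwell's tensions are usually quoted for fields in vacuo (Gaussian units, with $\mathbf{B} = \mathbf{H}$) is sketched here. It is taken from the standard textbook treatment, not from the formulae printed at this point, so the sign convention of $T_{ik}$ and the separate appearance of the time-derivative term are assumptions and may differ from the author's own way of writing.

```latex
\mathbf{f} = \rho\Bigl(\mathbf{E} + \tfrac{1}{c}\,\mathbf{v}\wedge\mathbf{B}\Bigr)
           = \operatorname{div} T
             - \frac{1}{4\pi c}\,\frac{\partial}{\partial t}\,(\mathbf{E}\wedge\mathbf{B}),
\qquad
T_{ik} = \frac{1}{4\pi}\Bigl[E_i E_k + B_i B_k
         - \tfrac{1}{2}\,\delta_{ik}\,(\mathbf{E}^2 + \mathbf{B}^2)\Bigr].
```

The tensor is symmetrical in its indices, in agreement with the general property of Cauchy's stress tensor derived in Appendix, 3.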
5. (IV. p. 27.) Relativity
It is impossible to give a short sketch of the theory of relativity,
and the reader is referred to the textbooks. The best
representation seems to me still the article in vol. v of the Mathematical
Encyclopaedia written by W. Pauli when he was a student,
about twenty years of age. There one finds a clear statement of
the experimental facts which led to the mathematical theory
almost unambiguously. Eddington's treatment gives the
impression that the results could have been obtained (or even
have been obtained) by pure reason, using epistemological
principles. I need not say that this is wrong and misleading.
There was, of course, a philosophical urge behind Einstein's
relentless effort; in particular the violation of contiguity in
Newton's theory seemed to him unacceptable. Yet the greatness
of his achievement was just that he based his own theory not on
preconceived notions but on hard facts, facts which were obvious
to everybody, but noticed by nobody. The main fact was the
identity of inertial and gravitational mass, which he expressed
as the principle of equivalence between acceleration and
gravitation. An observer in a closed box cannot decide by any
experiment whether an observed acceleration of a body in the box is
due to gravity produced by external bodies or to an acceleration
of the box in the opposite direction. This principle means that
arbitrary, non-linear transformations of time must be admitted.
But the formal symmetry between space-coordinates and time
discovered by Minkowski made it very improbable that the
transformations of space should be linear, and this was
corroborated by considering rotating bodies: a volume element on the
periphery should undergo a peripheral contraction according
to the results of special relativity, but remain unchanged in the
radial direction. Hence acceleration was necessarily connected
with deformation. This led to the postulate that all laws of
nature ought to be unchanged (covariant) with respect to
arbitrary space-time transformations. But as special relativity
must be preserved in small domains, the postulate of invariance
of the line element had to be made.
The long struggle of Einstein to find the general covariant field
equations was due to the difficulty for a physicist to assimilate
the mathematical ideas necessary, ideas which were in fact
completely worked out by Riemann and his successors,
Levi-Civita, Ricci, and others.
I wish to add here only one remark. The physical significance
of the line element seems to me rather mystical in a genuinely
continuous space-time. If it is replaced by the assumption of
parallel displacement (affine connexion), this impression of
mystery is still further enhanced. On the other hand, the
appearance of a finite length in the ultimate equations of physics
can be expected. Quantum theory is the first step in this
direction; it introduces not a universal length but a constant, Planck's
ħ, of the dimension length times momentum into the laws of
physics. There are numerous indications that the further
development of physics will lead to a separate appearance of
these two factors, ħ = q·p, in the ultimate laws. The difficulties
of present-day physics are centred about the problem of
introducing this length q in a way which satisfies the principle of
relativity. This fact seems to indicate that relativity itself
needs a generalization where the infinitesimal element ds is
replaced by a finite length.
The papers quoted in the text are: A. Einstein, L. Infeld, and
B. Hoffmann, Ann. of Math., 39, no. 1, p. 65 (Princeton,
1938); V. A. Fock, Journ. of Phys. U.S.S.R. 1, no. 2, p. 81 (1939).
6. (V. p. 38.) On classical and modern thermodynamics
It is often said that the classical derivation of the second law
of thermodynamics is much simpler than Carathéodory's as it
needs less abstract conceptions than Pfaffian equations. But
this objection is quite wrong. For what one has to show is the
existence of an integrating denominator of dQ. This is trivial
for a Pfaffian of two variables (representing, for example, a single
fluid with $V, \vartheta$); it must be shown not to be trivial and even, in
general, wrong for Pfaffians with more than two variables (e.g. two
fluids in thermal contact with $V_1, V_2, \vartheta$). Otherwise, the student
cannot possibly understand what the fuss is all about. But that
means explaining to him the difference between the two classes
of Pfaffians of three variables, the integrable ones and the
non-integrable ones. Without that all talk about Carnot cycles is just
empty verbiage. But as soon as one has this difference, why not
then use the simple criterion of accessibility from neighbouring
points, instead of invoking quite new ideas borrowed from
engineering? I think a satisfactory lecture or textbook should
bring this classical reasoning as a corollary of historical interest,
as I have suggested long ago in a series of papers (Phys. Zeitschr.
22, pp. 218, 249, 282 (1921)).
Since writing the text I have come across one book which
gives a short account of Carathéodory's theory, H. Margenau
and G. M. Murphy, The Mathematics of Physics and Chemistry
(D. van Nostrand Co., New York, 1943), § 1.15, p. 26. But
though the mathematics is correct, it does not do justice to the
idea. For it says on p. 28: 'This formal mathematical
consequence of the properties of the Pfaff equation [namely the
theorem proved in the next section of the appendix] is known as
the principle of Carathéodory. It is exactly what we need for
thermodynamics.' Carathéodory's principle is, of course, not
that formal mathematical theorem but the induction from
observation that there are inaccessible states in any neighbourhood
of a given state.
7. (V. p. 39.) Theorem of accessibility
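An example of a Pfaffian which has no integrating
denominator (by the way, the same example as described in geometrical
terms in the text) is this:
$dQ = -y\, dx + x\, dy + k\, dz,$
where $k$ is a constant. If it were possible to write $dQ$ in the
form $\lambda\, d\phi$, where $\lambda$ and $\phi$ are functions of $x, y, z$, one would have
$\frac{\partial\phi}{\partial x} = -\frac{y}{\lambda}, \quad \frac{\partial\phi}{\partial y} = \frac{x}{\lambda}, \quad \frac{\partial\phi}{\partial z} = \frac{k}{\lambda};$
hence
$\frac{\partial^2\phi}{\partial y\,\partial z} = \frac{\partial}{\partial z}\Bigl(\frac{x}{\lambda}\Bigr) = \frac{\partial}{\partial y}\Bigl(\frac{k}{\lambda}\Bigr), \quad \frac{\partial^2\phi}{\partial z\,\partial x} = \frac{\partial}{\partial z}\Bigl(-\frac{y}{\lambda}\Bigr) = \frac{\partial}{\partial x}\Bigl(\frac{k}{\lambda}\Bigr),$
$\frac{\partial^2\phi}{\partial x\,\partial y} = \frac{\partial}{\partial y}\Bigl(-\frac{y}{\lambda}\Bigr) = \frac{\partial}{\partial x}\Bigl(\frac{x}{\lambda}\Bigr),$
or
$x\,\frac{\partial\lambda}{\partial z} = k\,\frac{\partial\lambda}{\partial y}, \quad -y\,\frac{\partial\lambda}{\partial z} = k\,\frac{\partial\lambda}{\partial x}, \quad 2\lambda = x\,\frac{\partial\lambda}{\partial x} + y\,\frac{\partial\lambda}{\partial y}.$
By substituting $\partial\lambda/\partial x$ and $\partial\lambda/\partial y$ from the first two equations
in the third one finds $\lambda = 0$.
Examples like this show clearly that the existence of an
integrating denominator is an exception.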
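The same conclusion can be reached with the standard integrability criterion for a Pfaffian $X\,dx + Y\,dy + Z\,dz$ in three variables, namely that the vector $(X, Y, Z)$ be perpendicular to its own curl. This criterion is not used in the text, which argues directly; it is brought in here only as a cross-check. The sketch evaluates it by central differences, and the value of k and the sample point are arbitrary.

```python
def frobenius(X, p, d=1e-5):
    """Return X . curl X at the point p, using central differences for the curl."""
    def partial(i, j):
        # derivative of the i-th component of X with respect to the j-th coordinate
        hi = list(p); lo = list(p)
        hi[j] += d; lo[j] -= d
        return (X(*hi)[i] - X(*lo)[i]) / (2.0 * d)
    curl = (partial(2, 1) - partial(1, 2),
            partial(0, 2) - partial(2, 0),
            partial(1, 0) - partial(0, 1))
    return sum(a * b for a, b in zip(X(*p), curl))

k = 0.7                                    # illustrative constant
screw = lambda x, y, z: (-y, x, k)         # dQ = -y dx + x dy + k dz
planar = lambda x, y, z: (-y, x, 0.0)      # the same form with k = 0

point = (0.3, -1.2, 2.0)
print(frobenius(screw, point))    # close to 2k: no integrating denominator
print(frobenius(planar, point))   # close to 0: integrable
```

For the example of the text the expression equals 2k everywhere, so it vanishes only in the degenerate case k = 0, where the Pfaffian involves two variables only.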
We now give the proof of the theorem of accessibility.
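Consider the solutions of the Pfaffian
$dQ = X\, dx + Y\, dy + Z\, dz = 0,$  (1)
which lie in a given surface $S$,
$x = x(u, v), \quad y = y(u, v), \quad z = z(u, v).$
They satisfy a Pfaffian
$U\, du + V\, dv = 0,$  (2)
where
$U = X\,\frac{\partial x}{\partial u} + Y\,\frac{\partial y}{\partial u} + Z\,\frac{\partial z}{\partial u}, \quad V = X\,\frac{\partial x}{\partial v} + Y\,\frac{\partial y}{\partial v} + Z\,\frac{\partial z}{\partial v}.$
Hence through every point $P$ of $S$ there passes one curve, because
(2) is equivalent to the ordinary differential equation
$\frac{du}{dv} = -\frac{V}{U},$
which has a one-parameter set of solutions $\phi(u, v) = \text{const.}$,
covering the surface $S$.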
Let us now suppose that, in the neighbourhood of a point $P$, there
are inaccessible points; let $Q$ be one of these. Construct through
$P$ a straight line $\mathcal{L}$, which is not a solution of (1), and the plane $\pi$
through $Q$ and $\mathcal{L}$. In $\pi$ there is just one curve satisfying (1) and
going through $Q$; this curve will meet the line $\mathcal{L}$ at a point $R$.
[Fig. 6.] [Fig. 7.]
Then $R$ must be inaccessible from $P$; for if there should exist a
solution leading from $P$ to $R$, then one could also reach $Q$ from
$P$ by a continuous (though kinked) solution curve, which
contradicts the assumption that $Q$ is inaccessible from $P$. The
point $R$ can be made to lie as near to $P$ as one wishes by choosing
$Q$ near enough to $P$.
Now we move the straight line $\mathcal{L}$ parallel to itself in a cyclic
way so that it describes a closed cylinder. Then there exists on
this cylinder a solution curve $\mathcal{C}$ which starts from $P$ on $\mathcal{L}$ and
meets $\mathcal{L}$ again at a point $N$. It follows that $N$ and $P$ must
coincide. For otherwise one could, by deforming the cylinder, make
$N$ sweep along the line $\mathcal{L}$ towards $P$ and beyond $P$. Hence there
would be an interval of accessible points (like $N$) around $P$,
while it has been proved before that there are inaccessible points
$Q$ in any neighbourhood of $P$.
As $N$ now coincides with $P$ the connecting curve $\mathcal{C}$ can be
made, by steady deformation of the cylinder, to describe a
surface which contains obviously all solutions starting from $P$.
If this surface is given by $\phi(x, y, z) = 0$, one has
$dQ = \lambda\, d\phi,$
which is the theorem to be proved.
The function $\phi$ and the factor $\lambda$ are not uniquely determined;
if $\phi$ is replaced by $\Phi(\phi)$ one has
$\lambda\, d\phi = \Lambda\, d\Phi$ with $\Lambda = \lambda\Bigl(\frac{d\Phi}{d\phi}\Bigr)^{-1}.$
8. (V. p. 43.) Thermodynamics of chemical equilibria
Carathéodory's original publication on his foundation of
thermodynamics (Math. Ann. 67, p. 355, 1909) is written in a
very abstract way. He considers a type of systems which are
called simple and defined by the property that of the parameters
necessary to fix a state of equilibrium all except one are
configurational variables, i.e. such that their values can be
arbitrarily prescribed (like volumes). In my own presentation of the
theory, of 1921 (quoted in Appendix, 6), there is only a hint at the
end (§ 9) how such variables can be introduced in more
complicated cases, as for instance for chemical equilibria where the
concentrations of the constituents can be changed. I hoped at that time
that this might be worked out by the chemists themselves, for it
needs nothing more than the usual method of semi-permeable
walls with a slight modification of the wording. As this has not
happened, I shall give here a short indication how to do it.
I consider first a simple fluid (without decomposition), but
arrange it in such a way that volume $V$ and mass $M$ are both
independently changeable. For this purpose one has to imagine
a cylinder with a piston attached to the volume $V$, connected
by a valve, through which substance can be pressed into the
volume $V$ considered. The position of the auxiliary piston
determines uniquely the mass $M$ contained in $V$; hence $M$ can
be regarded as a configuration variable in Carathéodory's sense.
If the valve is closed, $V$ can be changed, by moving the 'main'
piston, without altering $M$. Hence $M$ and $V$ are both
independent configuration variables, and the work done for any change
of them must be regarded as measurable. If this work is
determined adiabatically one obtains the energy function, say in
terms of $V$, $M$ and the empirical temperature $\vartheta$, $U(V, M, \vartheta)$.
[Fig. 8.]
When this is known the differential of heat is defined by the
difference
$dQ = dU + p\, dV - \mu\, dM,$  (1)
where $p$ and $\mu$ are functions of the state $(V, M, \vartheta)$ like $U$, which
can be regarded as empirically known.
Now one has in (1) a Pfaffian of three variables and can apply
the same considerations as before which lead to the result that $dQ$
is integrable and can be equated to $T\, dS$. Hence one can write
$dU = T\, dS - p\, dV + \mu\, dM.$  (2)
But $U$ must be a homogeneous function of the first order in the
variables $S, V, M$. If one introduces the specific variables
$u, s, v$ by
$U = Mu, \quad S = Ms, \quad V = Mv,$  (3)
one has according to Euler's theorem
$u = Ts - pv + \mu,$  (4)
where
$T = \frac{\partial u}{\partial s}, \quad p = -\frac{\partial u}{\partial v},$  (5)
and then, from (2),
$du = T\, ds - p\, dv.$  (6)
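Euler's theorem, on which the step to (4) rests, states that a function homogeneous of the first order equals the sum of its variables each multiplied by the corresponding partial derivative. The sketch below checks this numerically. The energy function chosen is purely hypothetical (it is not the energy of any real fluid) and serves only to have something homogeneous in S, V, M; the derivatives are taken by central differences.

```python
import math

def U(S, V, M):
    """Hypothetical energy function, homogeneous of first order: U = M * u(S/M, V/M)."""
    s, v = S / M, V / M
    return M * math.exp(s) / v**0.5

def grad(f, x, d=1e-6):
    """Central-difference partial derivatives of f at the point x."""
    out = []
    for i in range(len(x)):
        hi = list(x); lo = list(x)
        hi[i] += d; lo[i] -= d
        out.append((f(*hi) - f(*lo)) / (2.0 * d))
    return out

state = (1.5, 2.0, 3.0)                  # (S, V, M), arbitrary illustrative values
T, minus_p, mu = grad(U, state)          # coefficients in dU = T dS - p dV + mu dM
euler_sum = T * state[0] + minus_p * state[1] + mu * state[2]
print(U(*state), euler_sum)              # Euler's theorem: the two numbers agree
```

Dividing the identity U = TS - pV + μM by M gives equation (4).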
If the substance inside $V$ is a chemical compound and one wishes
to investigate its decomposition into $n$ components, one assumes
$n$ cylinders with pistons attached to $V$, separated from $V$ by
semi-permeable walls, each of which allows the passage of only
one of the components. Then one has in the same way
$dU = T\, dS - p\, dV + \sum_{i=1}^{n} \mu_i\, dM_i,$  (7)
where $V, M_1, M_2, \ldots, M_n$ can be regarded as configuration variables
and $U, p, \mu_1, \mu_2, \ldots, \mu_n$ as known functions of these. Now, as above,
the specific energy, entropy, and volume are introduced and
further the concentrations $c_i$ by
$M_i = c_i M,$  (8)
where $M$ is the total mass: $M = \sum_i M_i$, hence
$\sum_{i=1}^{n} c_i = 1.$  (9)
One obtains from Euler's theorem
$u = Ts - pv + \sum_{i} \mu_i c_i,$  (10)
with
$T = \frac{\partial u}{\partial s}, \quad p = -\frac{\partial u}{\partial v}, \quad \mu_i = \frac{\partial u}{\partial c_i},$  (11)
where the differentiation with respect to the $c_i$ is performed as
if they were independent; and
$du = T\, ds - p\, dv + \sum_{i} \mu_i\, dc_i.$  (12)
The formalism of thermodynamics consists in deriving
relations between the variables by differentiating the equations
(11), e.g.
$\frac{\partial T}{\partial v} = -\frac{\partial p}{\partial s}, \quad \frac{\partial\mu_i}{\partial v} = -\frac{\partial p}{\partial c_i},$ etc.
As experiments are often performed at constant pressure or
temperature or both, one uses instead of $u(s, v, c_1, \ldots, c_n)$ the
functions free energy $u - Ts = f$, enthalpy $u + pv = w$, or free
enthalpy $\mu = u + pv - Ts$ (defined by (4)). For the latter, for
example, it follows from (12) that
$d\mu = -s\, dT + v\, dp + \sum_{i} \mu_i\, dc_i,$  (13)
$s = -\frac{\partial\mu}{\partial T}, \quad v = \frac{\partial\mu}{\partial p}, \quad \mu_i = \frac{\partial\mu}{\partial c_i},$  (14)
so that for $T$ = const., $p$ = const., one has simply
$d\mu = \sum_{i} \mu_i\, dc_i.$  (15)
The most important theorem, which follows from these general
equations, is Gibbs's phase rule. The system may exist in
different phases if the $n$ equations
$\mu_i(T, p, c_1, \ldots, c_n) = C_i \quad (i = 1, \ldots, n)$  (16)
have several solutions. Let $m$ be the number of independent
solutions; then there are $m$ phases which can be ordered in such a
way that each has contact with two others only. Hence there
are $m - 1$ interfaces and $n$ equations of the form (16) for each,
altogether $n(m - 1)$ independent equations. On the other hand,
there are $(n - 1)$ independent concentrations $c_1, \ldots, c_{n-1}$ for each
phase, i.e. $m(n - 1)$ for the whole system, to which the two
variables $p, T$ have to be added; hence $m(n - 1) + 2$ independent
variables. The number of arbitrary parameters, or the number of
degrees of freedom of the system, is therefore
$m(n - 1) + 2 - n(m - 1) = n - m + 2.$
So for a single pure substance, $n = 1$, the number of degrees of
freedom is $3 - m$; hence there are three cases $m = 1, 2, 3$
corresponding to one phase, two or three coexisting phases; more than
three phases cannot be in equilibrium. All further progress in
thermodynamics is based on special assumptions about the
functions involved, either prompted by experiment, or chosen by an
argument of simplicity, or (and this is the most important
step) derived from statistical considerations.
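The counting argument translates directly into a few lines. The function name is an illustrative choice; the body follows the two counts given above and nothing more.

```python
def degrees_of_freedom(n, m):
    """Gibbs's phase rule by direct counting for n components and m phases."""
    variables = m * (n - 1) + 2     # independent concentrations per phase, plus p and T
    equations = n * (m - 1)         # n conditions (16) on each of the m - 1 interfaces
    return variables - equations    # equals n - m + 2

for m in (1, 2, 3):
    print(m, degrees_of_freedom(1, m))   # single pure substance: 3 - m
```

For a single pure substance this gives 2, 1 and 0 degrees of freedom for one, two and three coexisting phases; a fourth phase would make the count negative.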
9. (V. p. 44.) Velocity of sound in gases
The simple problem of calculating the adiabatic law for an
ideal gas gives me the opportunity to show how the theory of
Carathéodory determines uniquely the absolute temperature
and entropy.
The ideal gas is defined by two properties: (1) Boyle's law,
the isotherms are given by $pV$ = const.; (2) the same quantity
$pV$ remains constant if the gas expands without doing work. In
mathematical symbols,
$pV = F(\vartheta), \quad U = U(\vartheta).$
Hence
$dQ = dU + p\, dV = U'\, d\vartheta + F\,\frac{dV}{V}.$  (1)
If $\Theta$ is defined by
$\frac{d\log\Theta}{d\vartheta} = \frac{U'}{F},$  (2)
(1) can be written
$dQ = F(\vartheta)\, d\log(\Theta V);$
hence one can put $\lambda = F(\vartheta)$, $\phi = \log(\Theta V)$ and obtain from the
equation (5.25) of the text
$\frac{\partial\log\lambda}{\partial\vartheta} = \frac{d\log F}{d\vartheta}.$
Then (5.27) gives, writing $C = 1/R$, the usual form of the
equation of state
$RT = F(\vartheta) = pV,$  (3)
and $S = S_0 + R\log(\Theta V)$.
If the special assumption is made, that $U$ depends linearly on
$pV$ (which holds for dilute gases with the same approximation
as Boyle's law), one has $U = c_v T$ and, from (2),
$\Theta = T^{c_v/R}.$
The entropy becomes therefore
$S = S_0 + \log(T^{c_v} V^{R}),$  (4)
or, substituting $p$ for $T$ from (3),
$S = S_1 + \log(p^{c_v} V^{c_p}),$  (5)
where
$c_p = c_v + R$  (6)
is the specific heat for constant pressure. Hence the adiabatic
law $S$ = const. is equivalent to
$pV^{\gamma} = \text{const.}, \quad \gamma = \frac{c_p}{c_v},$  (7)
which is identical with the equation $p = a\rho^{\gamma}$ in the text, as the
density $\rho$ is reciprocal to the volume $V$.
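The equivalence of the adiabatic law (7) with constant entropy can be tested numerically. In this sketch the molar values of R and c_v and the starting state are assumptions chosen for illustration (they are not taken from the text), and the additive constants S_0 and S_1 are omitted.

```python
import math

R, c_v = 8.314, 20.8      # illustrative molar values in J/(mol K); assumptions
c_p = c_v + R             # equation (6)
gamma = c_p / c_v

def entropy_TV(T, V):
    """Equation (4) without the constant S_0: log(T^c_v V^R)."""
    return c_v * math.log(T) + R * math.log(V)

def entropy_pV(p, V):
    """Equation (5) without the constant S_1: log(p^c_v V^c_p)."""
    return c_v * math.log(p) + c_p * math.log(V)

p0, V0 = 1.0e5, 0.024     # starting state (Pa, m^3 per mole); assumptions
for V in (0.024, 0.030, 0.040):
    p = p0 * (V0 / V) ** gamma     # adiabat p V^gamma = const., equation (7)
    T = p * V / R                  # equation of state (3)
    print(V, entropy_TV(T, V), entropy_pV(p, V))
```

Each column stays constant along the adiabat, and the two columns differ only by a fixed amount, which is absorbed in the constants S_0 and S_1.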
The velocity of sound was calculated in Appendix, 3; according
to (3.10) it is
$c = \sqrt{\frac{dp}{d\rho}}.$
For the isothermal law, $p = a\rho$, this means
$c = \sqrt{\frac{p}{\rho}},$
while, for the adiabatic law, $p = a\rho^{\gamma}$, one finds
$c = \sqrt{\frac{\gamma p}{\rho}},$
which is considerably larger; e.g. for diatomic molecules (air)
experiment, and kinetic theory as well, give $\gamma = 7/5 = 1.4$.
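To see how much larger, the two formulae can be evaluated for air. The pressure and density used below are assumed round values for one atmosphere at 0 °C in CGS units; they do not come from the text.

```python
import math

p = 1.01325e6       # pressure in dyn/cm^2 (one atmosphere); assumed value
rho = 1.293e-3      # density of air at 0 deg C in g/cm^3; assumed value
gamma = 7.0 / 5.0   # diatomic gas

c_isothermal = math.sqrt(p / rho)            # from p = a * rho
c_adiabatic = math.sqrt(gamma * p / rho)     # from p = a * rho**gamma
print(c_isothermal / 100.0, c_adiabatic / 100.0)    # both in m/s
```

The adiabatic value exceeds the isothermal one by the factor $\sqrt{\gamma}$, roughly 18 per cent for air.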
10. (V. p. 45.) Thermodynamics of irreversible processes
Since I wrote this section of the text a new development of
the descriptive or phenomenological theory has come to my
knowledge which is remarkable enough to be mentioned.
It started in 1931 with a paper by Onsager in which the
attempt was made to build up a thermodynamics of irreversible
processes by taking from the kinetic theory one single result,
called the theorem of microscopic reversibility, and to show that
this suffices to obtain some important properties of the flow of
heat, matter, and electricity. The starting-point is Einstein's
theory of fluctuations (see Appendix, 20), where the relation
$S = k\log P$ between probability $P$ and entropy $S$ is reversed,
using the known dependence of $S$ on observable quantities to
determine the probability $P$ of small deviations from equilibrium.
Then it is assumed that the law for the decay of an
accidental accumulation of some quantity (mass, energy, temperature,
etc.) is the same as that for the flow of the same
quantity under artificially produced macroscopic conditions.
This, together with the reversibility theorem mentioned,
determines the main features of the flow. The theory has been
essentially improved by Casimir and others; among these
contributions the book of Prigogine, from de Donder's school of
thermodynamics in Brussels, must be mentioned. Here is a list of the
literature:
L. Onsager, Phys. Rev. 37, p. 405 (1931); 38, p. 2265 (1931).
H. B. G. Casimir, Philips Research Reports, 1, 185-96 (April 1946);
Rev. Mod. Physics, 17, p. 343 (1945).
C. Eckhart, Phys. Rev. 58, pp. 267, 269, 919, 924 (1940).
J. Meixner, Ann. d. Phys. (v), 39, p. 333 (1941); 41, p. 409 (1943);
43, p. 244 (1943); Z. phys. Chem. B, 53, p. 235 (1943).
S. R. de Groot, L'Effet Soret, thesis, Amsterdam (1945); Journal
de Physique, no. 6, p. 191.
I. Prigogine, Étude thermodynamique des phénomènes irréversibles
(Paris, Dunod; Liège, Desoer, 1947).
11. (VI. p. 47.) Elementary kinetic theory of gases
To derive equation (6.1) of the text, consider the molecules
of a gas to be elastic balls which at impact on the wall of the
vessel recoil without loss of energy and momentum. If the
$yz$-plane coincides with the wall the $x$-component of the
momentum $m\xi$ of a molecule is changed into $-m\xi$; hence the
momentum $2m\xi$ is transferred to the wall. Let $n_{\mathbf{v}}$ be the
number of molecules per unit of volume having the velocity
vector $\mathbf{v}$. If one constructs a cylinder upon a piece of the
wall of area unity and side $v\,dt$, all molecules in it will strike
the part of the wall within the cylinder in the time-element $dt$;
the volume of the cylinder is $\xi\,dt$, hence the number of
collisions per unit surface and unit time is $\xi n_{\mathbf{v}}$ and the total
momentum transferred $2m\xi^2 n_{\mathbf{v}}$.
[Fig. 9.]
This has first to be summed over all angles of incidence (i.e.
over a hemisphere); the result is obviously the same as one-half
of the sum over the total sphere, namely
$$m\,\overline{\xi^2}\,n_v,$$
where $n_v$ is the number of molecules per unit of volume, with a
velocity of magnitude $v$ (but any direction). Now the 'principle
of molecular chaos' is used according to which
$$\overline{\xi^2} = \overline{\eta^2} = \overline{\zeta^2} = \tfrac{1}{3}v^2.$$
Hence the last expression is equal to $\tfrac{1}{3}m\,n_v v^2$, and the pressure is
finally obtained by summing over all velocities
$$p = \frac{m}{3}\sum_v n_v v^2. \qquad (1)$$
The total (kinetic) energy in the volume $V$ is
$$U = V\,\frac{m}{2}\sum_v n_v v^2;$$
hence one obtains the equation (6.1) of the text,
$$Vp = \tfrac{2}{3}U.$$
Now one can apply the considerations of Appendix, 9, using the
experimental fact expressed by Boyle's law (that all states of a
gas at a fixed empirical temperature $\vartheta$ satisfy $pV = \text{const.}$).
Then one obtains
$$pV = RT, \qquad U = \tfrac{3}{2}RT, \qquad (2)$$
as stated in (6.2) of the text.
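The averaging over directions used in this derivation can be checked by a small Monte Carlo experiment. The sketch below is only an illustration under simplifying assumptions: all molecules are given the same speed, directions are drawn isotropically, and the function name and numerical values are hypothetical.

```python
import random


def pressure_and_energy(n_molecules, mass, speed, volume, seed=0):
    """Estimate p = m * n * mean(xi^2) and U = N * m * v^2 / 2 for isotropic velocities."""
    rng = random.Random(seed)
    sum_xi_squared = 0.0
    for _ in range(n_molecules):
        # A normalised Gaussian triple gives a uniformly distributed direction.
        gx, gy, gz = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        norm = (gx * gx + gy * gy + gz * gz) ** 0.5
        xi = speed * gx / norm          # velocity component normal to the wall
        sum_xi_squared += xi * xi
    number_density = n_molecules / volume
    pressure = mass * number_density * (sum_xi_squared / n_molecules)
    energy = n_molecules * 0.5 * mass * speed ** 2
    return pressure, energy


# Assumed illustrative numbers in arbitrary units.
volume = 2.0
pressure, energy = pressure_and_energy(20000, mass=1.0, speed=3.0, volume=volume)
ratio = pressure * volume / energy
print(f"pV / U = {ratio:.4f} (kinetic theory gives 2/3)")
```

Because only the mean of $\xi^2$ enters, the same ratio results for any distribution of speeds, as long as the directions are isotropic.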
12. (VI. p. 50.) Statistical equilibrium
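If $H$ depends only on $\mathbf{p}$, not on $\mathbf{x}$, the equation $[H, f] = 0$
reduces to
$$\mathbf{p}\cdot\frac{\partial f}{\partial\mathbf{x}} = 0 \qquad (1)$$
and is equivalent to the set of ordinary differential equations
$$\frac{dx}{p_x} = \frac{dy}{p_y} = \frac{dz}{p_z}. \qquad (2)$$
By integrating these ($\mathbf{p}$ is constant) one obtains the general
solution of (1) as an arbitrary function of the integrals of (2),
$$f = \Phi(\mathbf{p}, \mathbf{m}), \qquad (3)$$
where
$$\mathbf{m} = \mathbf{x}\wedge\mathbf{p}. \qquad (4)$$
Now if the gas is isotropic, $f$ can depend only on $\mathbf{p}^2$ and $\mathbf{m}^2$,
and if it is to be homogeneous (i.e. all properties are independent
of $\mathbf{x}$), $\mathbf{m}^2$ cannot appear; hence
$$f = \Phi(\mathbf{p}^2) = \Psi(H), \qquad (5)$$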
as stated in the text.
13. (VI. p. 51.) Maxwell's functional equation
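To solve the equation (6.10) it suffices to set the third variable
equal to zero; writing $x$ and $y$ for the two remaining arguments, one
has
$$f(x+y) = \phi(x)\,\phi(y). \qquad (1)$$
Differentiating partially with respect to $x$,
$$f'(x+y) = \phi'(x)\,\phi(y), \qquad (2)$$
and dividing by the original equation
$$\frac{f'(x+y)}{f(x+y)} = \frac{\phi'(x)}{\phi(x)}. \qquad (3)$$
Now, the right-hand side is independent of $y$, and the left-hand
side cannot therefore depend on $y$; hence, putting $x = 0$,
$$\frac{f'(y)}{f(y)} = -\beta, \qquad (4)$$
where $\beta$ is a constant. By integration,
$$f(y) = a e^{-\beta y} = e^{\alpha - \beta y}, \qquad (5)$$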
which is the formula (6.11) of the text.
14. (VI. p. 52.) The method of the most probable distribution
We have to determine the probability of a distribution of
equal particles over $N$ cells, where $n_1$ of them are in the first cell,
$n_2$ of them in the second, etc. ($n_1 + n_2 + \dots + n_N = n$). To do this
we first take the particles in a fixed order; then the probability of
distribution $(n_1, n_2, \dots, n_N)$ is, according to the multiplication law,
$$\omega_1\cdots\omega_1\cdot\omega_2\cdots\omega_2\cdots\omega_N\cdots\omega_N = \omega_1^{n_1}\omega_2^{n_2}\cdots\omega_N^{n_N},$$
where $\omega_1, \omega_2, \dots, \omega_N$ are the relative volumes of the cells,
normalized so that $\omega_1 + \omega_2 + \dots + \omega_N = 1$. To obtain the probability
asked for, we have to destroy the fixed order of the particles. If
one performs first all $n!$ permutations, one gets too many cases,
as all those distributions, which differ only by permuting the
particles in each cell, count only once. Therefore one has to
divide $n!$ by the number of all these permutations inside a cell,
that is by $n_1!\,n_2!\cdots n_N!$. The total result is the expression (6.15)
$$P(n_1, n_2, \dots, n_N) = \frac{n!}{n_1!\,n_2!\cdots n_N!}\,\omega_1^{n_1}\omega_2^{n_2}\cdots\omega_N^{n_N}, \qquad (1)$$
which is nothing but the general term in the polynomial expansion
$$\sum_{n_1,\dots,n_N} P(n_1, n_2, \dots, n_N) = \sum_{n_1,\dots,n_N}\frac{n!}{n_1!\,n_2!\cdots n_N!}\,\omega_1^{n_1}\omega_2^{n_2}\cdots\omega_N^{n_N} = (\omega_1 + \omega_2 + \dots + \omega_N)^n = 1.$$
We now deal with the approximation of $n!$ by Stirling's
formula. The simplest way to obtain it is this: write
$$\log n! = \log(1\cdot 2\cdot 3\cdots n) = \log 1 + \log 2 + \log 3 + \dots + \log n$$
and replace the sum $\sum_{k=1}^{n}\log k$ by the integral
$$\int_0^n \log x\,dx = n(\log n - 1).$$
A more satisfactory derivation is the following: One can represent
$n!$ by an integral and evaluate it with the help of the so-called
method of steepest descent, which plays a great part in the
modern treatment of statistical mechanics due to Darwin and
Fowler (see p. 54). The approximate evaluation of n! may serve
as a simple example of this method.
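If the identity
$$\frac{d}{dx}\left(e^{-x}x^n\right) = -e^{-x}x^n + n e^{-x}x^{n-1}$$
is integrated from $0$ to $\infty$ and the abbreviation ($\Gamma$-function)
$$\Gamma(n+1) = \int_0^\infty e^{-x}x^n\,dx \quad (n > -1), \qquad (2)$$
used, one obtains
$$\Gamma(n+1) = n\,\Gamma(n). \qquad (3)$$
As $\Gamma(1) = 1$, one has
$$\Gamma(2) = 1\cdot\Gamma(1) = 1, \qquad \Gamma(3) = 2\,\Gamma(2) = 1\cdot 2, \qquad \Gamma(4) = 3\,\Gamma(3) = 1\cdot 2\cdot 3,$$
and in general
$$\Gamma(n+1) = n!. \qquad (4)$$
The integral (2) can be written
$$\Gamma(n+1) = \int_0^\infty e^{f(n,x)}\,dx, \qquad f(n, x) = -x + n\log x. \qquad (5)$$
The function $f(n, x)$ (hence also the integrand) has a maximum
where
$$f'(x) = -1 + \frac{n}{x} = 0,$$
i.e. at $x = n$, and
$$f(n) = -n + n\log n, \qquad f''(n) = -\frac{1}{n}.$$
The expansion of $f(x)$ in the neighbourhood of the maximum
$x = n$ is therefore
$$f(x) = -n + n\log n - \frac{(x-n)^2}{2n} + \dots,$$
and one has
$$\Gamma(n+1) = e^{-n}n^n\int_0^\infty e^{-(x-n)^2/2n + \dots}\,dx,$$
where the dots indicate terms of higher order which can be easily
worked out. If these are neglected the integral becomes
$$\int_0^\infty e^{-(x-n)^2/2n}\,dx = \int_{-n}^\infty e^{-\xi^2/2n}\,d\xi \to \sqrt{2\pi n}$$
for large $n$. Hence
$$n! = \Gamma(n+1) = \sqrt{2\pi n}\;e^{-n}n^n + \dots \qquad (6)$$
and
$$\log n! = n\log n - n + \tfrac{1}{2}\log(2\pi n) + \dots, \qquad (7)$$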
where the highest terms agree with the previous result.
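The quality of the two approximations, the crude integral estimate $n(\log n - 1)$ and formula (7), can be seen by comparing them with exact values of $\log n!$. The following sketch is an illustration only; it uses the standard-library function `math.lgamma`, for which `lgamma(n + 1)` equals $\log n!$, and the helper name `log_factorial_stirling` is hypothetical.

```python
import math


def log_factorial_stirling(n, with_correction=True):
    """Stirling estimate of log n!: n log n - n, optionally plus log(2 pi n) / 2."""
    value = n * math.log(n) - n
    if with_correction:
        value += 0.5 * math.log(2.0 * math.pi * n)
    return value


for n in (10, 100, 1000):
    exact = math.lgamma(n + 1)                       # log n!
    crude = log_factorial_stirling(n, with_correction=False)
    refined = log_factorial_stirling(n)
    print(f"n={n:5d}  exact={exact:.4f}  crude error={exact - crude:.4f}  "
          f"refined error={exact - refined:.6f}")
```

The error of the crude form grows like $\tfrac{1}{2}\log(2\pi n)$, while that of (7) decreases with increasing $n$.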
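Thus the logarithm of the probability $P$ can be written
$$\log P = \sum_s \phi_s(n_s) + \text{const.}, \qquad (8)$$
where
$$\phi_s(n_s) = n_s(\log\omega_s - \log n_s). \qquad (9)$$
(8) and (9) are, for equal $\omega$'s, equivalent to formula (6.17) of
the text.
One has to determine the maximum of $\log P$ with the conditions
(6.13), (6.14), of the text, namely
$$\sum_{s=1}^{N} n_s = n, \qquad \sum_{s=1}^{N} n_s\epsilon_s = U. \qquad (10)$$
Without using the special form of $\phi_s$, one obtains
$$\frac{\partial\phi_s}{\partial n_s} = \lambda + \beta\epsilon_s, \qquad (11)$$
where $\lambda$, $\beta$ are two Lagrangian factors. For the special function
(9) one has
$$\frac{\partial\phi_s}{\partial n_s} = \log\omega_s - \log n_s - 1, \qquad (12)$$
and if this is substituted in (11) with $\lambda + 1 = -\alpha$,
$$\log n_s = \log\omega_s + \alpha - \beta\epsilon_s,$$
$$n_s = \omega_s e^{\alpha - \beta\epsilon_s}; \qquad (13)$$
that is, for equal $\omega$'s, the formula (6.18) of the text.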
If one has two sets of systems $A$ and $B$, as discussed in the text,
there are three conditions
$$\sum_r n_r^{(A)} = n^{(A)}, \qquad \sum_r n_r^{(B)} = n^{(B)}, \qquad \sum_r n_r^{(A)}\epsilon_r^{(A)} + \sum_r n_r^{(B)}\epsilon_r^{(B)} = U,$$
and therefore instead of two multipliers three, $\lambda^{(A)}$, $\lambda^{(B)}$, $\beta$; and
one obtains, with $\lambda^{(A)} + 1 = -\alpha^{(A)}$, $\lambda^{(B)} + 1 = -\alpha^{(B)}$, the formulae
(6.19) of the text, which show that $\beta$ is the equilibrium parameter,
a function of the (empirical) temperature $\vartheta$ alone.
In order to see that $\beta$ is reciprocal to the absolute temperature
one must apply the second theorem of thermodynamics,
which refers to quasi-static processes involving external work
(for instance by changing the volume).
By an infinitely slow change of external parameters $a_1, a_2, \dots$,
the energies of the cells $\epsilon_r$ will be altered and at the same time
the occupation numbers $n_r$; the total energy will be changed by
$$dU = \sum_r n_r\,d\epsilon_r + \sum_r \epsilon_r\,dn_r, \qquad (14)$$
while the total number of particles is unchanged,
$$dn = \sum_r dn_r = 0. \qquad (15)$$
The first term in (14) represents the total work done
$$dW = -n\sum_a f_a\,da_a, \qquad (16)$$
where
$$f_a = -\frac{1}{n}\sum_r n_r\frac{\partial\epsilon_r}{\partial a_a} \qquad (17)$$
is the average force resisting a change of $a_a$. Then the second
term in (14)
$$dQ = \sum_r \epsilon_r\,dn_r \qquad (18)$$
must represent the heat produced by the rearrangements of the
systems over the cells.
The corresponding change of $\log P$ is obtained from (8)
and (11),
$$d\log P = \sum_r \frac{\partial\phi_r}{\partial n_r}\,dn_r = \sum_r (\lambda + \beta\epsilon_r)\,dn_r,$$
which in virtue of (15) and (18) reduces to
$$d\log P = \beta\,dQ. \qquad (19)$$
This shows that $\beta\,dQ$ is a total differential of a function depending
on $\beta, a_1, a_2, \dots$, and that $\beta(\vartheta)$ is the integrating factor.
Hence the second law of thermodynamics is automatically
satisfied by the statistical assembly, and one has, with the
notations of section V,
$$dQ = \lambda\,d\phi, \quad\text{with}\quad \lambda = \frac{1}{\beta}, \quad \phi = \log P;$$
then (5.25) and (5.26) give, with $C = 1/k$, $\log\Phi = 0$, $\Phi = 1$,
and (5.27),
$$T = \frac{1}{k\beta}, \qquad S - S_0 = k\phi = k\log P. \qquad (20)$$
$k$ is called Boltzmann's constant.
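Now the change of energy (14) becomes
$$dU = -n\sum_a f_a\,da_a + T\,dS. \qquad (21)$$
If one has a fluid with the only parameter $a_1 = V$, the corresponding
force is the pressure
$$p = n f_1 = -\sum_r n_r\frac{\partial\epsilon_r}{\partial V}, \qquad (22)$$
and one obtains the usual equation
$$dU = -p\,dV + T\,dS. \qquad (23)$$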
Returning to the general expression (21) one sees easily that
one can express all quantities in terms of the so-called partition
function (or 'sum-over-states')
$$Z = \sum_r \omega_r e^{-\beta\epsilon_r}. \qquad (24)$$
For, from (10) and (13),
$$n = e^{\alpha}\sum_r \omega_r e^{-\beta\epsilon_r} = e^{\alpha}Z,$$
hence $\alpha = \log n - \log Z$.
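Now one has, after simple calculation, from (19), (20) with
(8), (9), (17)
$$u = \frac{U}{n} = -\frac{\partial\log Z}{\partial\beta}, \qquad s = \frac{S}{n} = k\left(\log Z - \beta\frac{\partial\log Z}{\partial\beta}\right), \qquad (25)$$
and
$$du = -\sum_a f_a\,da_a + T\,ds. \qquad (26)$$
The simplest thermodynamical function, from which all
others can be derived by differentiation, is the free energy,
$$f = u - Ts = -kT\log Z; \qquad (27)$$
hence
$$f_a = -\frac{\partial f}{\partial a_a} = kT\frac{\partial\log Z}{\partial a_a}, \qquad (28)$$
while $s = -\partial f/\partial T$ leads back to the second formula (25).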
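The relation between the partition function (24) and the mean energy per particle can be verified numerically for a small set of cells. The sketch below is illustrative only: the energies, weights and value of $\beta$ are assumptions, the function names are hypothetical, and the derivative of $\log Z$ is taken by a central difference. It also checks that $\log P$ per particle, computed from (8) and (9) with the Stirling constant $n\log n$ included, equals $\log Z + \beta u$.

```python
import math


def log_partition(beta, energies, weights):
    """log Z with Z = sum_r w_r exp(-beta e_r), formula (24)."""
    return math.log(sum(w * math.exp(-beta * e) for e, w in zip(energies, weights)))


def occupation_fractions(beta, energies, weights):
    """n_r / n = w_r exp(-beta e_r) / Z, i.e. (13) with e^alpha = n / Z."""
    boltzmann = [w * math.exp(-beta * e) for e, w in zip(energies, weights)]
    z = sum(boltzmann)
    return [b / z for b in boltzmann]


def mean_energy(beta, energies, weights):
    """Direct average u = sum_r (n_r / n) e_r."""
    fractions = occupation_fractions(beta, energies, weights)
    return sum(e * p for e, p in zip(energies, fractions))


def mean_energy_from_log_z(beta, energies, weights, step=1e-6):
    """u = -d log Z / d beta, estimated by a central difference."""
    upper = log_partition(beta + step, energies, weights)
    lower = log_partition(beta - step, energies, weights)
    return -(upper - lower) / (2.0 * step)


def log_p_per_particle(beta, energies, weights):
    """(1/n) log P = sum_r (n_r/n) (log w_r - log(n_r/n)) at the maximum."""
    fractions = occupation_fractions(beta, energies, weights)
    return sum(p * (math.log(w) - math.log(p)) for p, w in zip(fractions, weights))


# Assumed illustrative cells (arbitrary units).
energies = [0.0, 1.0, 2.0, 5.0]
weights = [0.4, 0.3, 0.2, 0.1]
beta = 0.7

u_direct = mean_energy(beta, energies, weights)
u_from_z = mean_energy_from_log_z(beta, energies, weights)
entropy_over_k = log_partition(beta, energies, weights) + beta * u_direct
print(u_direct, u_from_z, entropy_over_k, log_p_per_particle(beta, energies, weights))
```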
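The application to ideal gases may be illustrated by the simplest
model where each particle is regarded as a mass point with
coordinates $x, y, z$, momenta $p_x, p_y, p_z$, and mass $m$. Then,
according to Liouville's theorem, one has to take as cells $\omega_s$
elements of the phase space $dx\,dy\,dz\,dp_x\,dp_y\,dp_z$ and replace the
sums by integrals. The energy is $(p_x^2 + p_y^2 + p_z^2)/2m$. Then the
partition function (24) becomes
$$Z = \int\!\cdots\!\int e^{-(\beta/2m)(p_x^2 + p_y^2 + p_z^2)}\,dx\,dy\,dz\,dp_x\,dp_y\,dp_z.$$
The integration over the space coordinates gives $V$, the volume.
If one puts $\sqrt{\beta/2m}\;p_x = \xi, \dots$, one has
$$Z = V\left(\frac{2m}{\beta}\right)^{3/2}\iiint e^{-(\xi^2 + \eta^2 + \zeta^2)}\,d\xi\,d\eta\,d\zeta,$$
the integration extended for each variable from $-\infty$ to $+\infty$.
The integral is a constant which is of no interest as all physical
quantities depend on derivatives of $\log Z$. Hence, with $\beta = (kT)^{-1}$,
$$f = -kT\log Z = -kT\left(\log V + \tfrac{3}{2}\log T\right) + \text{const.}\;T,$$
from which one obtains
$$p = -n\frac{\partial f}{\partial V} = \frac{nkT}{V}, \qquad s = -\frac{\partial f}{\partial T} = k\left(\log V + \tfrac{3}{2}\log T\right) + \text{const.}, \qquad u = f + Ts = \tfrac{3}{2}kT.$$
These are the well-known formulae for an ideal monatomic gas:
Boyle's law, the entropy and energy per atom. The specific
heat at constant volume is
$$c_v = n\frac{du}{dT} = \tfrac{3}{2}nk = \tfrac{3}{2}R,$$
if $n$ refers to one mole.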
15. (VI. p. 54.) The method of mean values
The method of Darwin and Fowler aims at computing the
mean value of any quantity $f(n_r)$, depending on the occupation
number $n_r$ of a cell for all possible distributions $n_1, n_2, \dots, n_N$,
satisfying the conditions
$$\sum_r n_r = n, \qquad \sum_r n_r\epsilon_r = U; \qquad (1)$$
that is, the quantity
$$\overline{f(n_r)} = \sum P(n_1, n_2, \dots, n_N)\,f(n_r), \qquad (2)$$
where $P$ is the probability of the distribution $n_1, n_2, \dots, n_N$, defined
by equation (6.15) or 14, (1).
We consider the function $F(z)$ defined by (6.22),
$$F(z) = \omega_1 z^{\epsilon_1} + \omega_2 z^{\epsilon_2} + \dots + \omega_N z^{\epsilon_N}, \qquad (3)$$
and assume that a very small unit of energy is chosen so that all
the $\epsilon_r$ are positive integers, which may be ordered in such a way
that $\epsilon_1 \le \epsilon_2 \le \epsilon_3 \le \dots \le \epsilon_N$; also, by choosing the zero of energy
suitably we can arrange that $\epsilon_1 = 0$.
Then we expand $\{F(z)\}^n$ into powers of $z$ according to the
multinomial theorem and obtain a series of terms
$$\frac{n!}{n_1!\,n_2!\cdots n_N!}\,\omega_1^{n_1}\omega_2^{n_2}\cdots\omega_N^{n_N}\,z^{n_1\epsilon_1 + n_2\epsilon_2 + \dots + n_N\epsilon_N};$$
by collecting all these terms with the same factor $z^U$ we obtain all
the $P(n_1, n_2, \dots, n_N)$ which belong to the same value of
$$U = \sum_r \epsilon_r n_r.$$
Now we substitute 1 for each $f(n_r)$ in (2) and obtain in this
way the total probability of these distributions which have a
given total energy $U$, in the form:
$$\sum P = \text{coefficient of } z^U \text{ in } \{F(z)\}^n.$$
This coefficient can be evaluated by Cauchy's theorem, if $z$ is
regarded as a complex variable; one has
$$\sum P = \frac{1}{2\pi i}\oint \{F(z)\}^n z^{-U-1}\,dz, \qquad (4)$$
where the integral is taken round a closed contour surrounding
the origin in the $z$-plane. The integral can be evaluated
approximately by the method of steepest descent which we have
already explained, for real variables, in Appendix 14, for the
$\Gamma$-function.
The first step is to express the integrand in the form
$$z^{-U-1}\{F(z)\}^n = e^{G(z)}, \qquad G(z) = n\log F(z) - (U+1)\log z.$$
Both $\log F(z)$ and its derivative increase monotonically from a
finite value to $\infty$ as $z$ moves along the real axis between $0$ and
$\infty$. Also $-\log z$ and its negative derivative $z^{-1}$ decrease
monotonically along the same path. Hence $G(z)$ can have only one
extremum, a minimum, on the real axis between $0$ and $\infty$, and
this minimum will be extremely steep if $n$ and $U$ are large.
Let $z_0$ be the point of the real axis where the minimum
happens to be; then at this point the first derivative of $G(z)$
vanishes and the second is positive and very large. Hence in the
direction orthogonal to the axis the integrand must have a very
sharp maximum. If we take as contour of integration a circle
about $0$ through $z_0$, only the immediate neighbourhood of this
point will contribute appreciably to the integral.
The minimum $z_0$ is to be found as root of the equation
$$G'(z_0) = n\frac{F'(z_0)}{F(z_0)} - \frac{U+1}{z_0} = 0, \qquad (5)$$
and one has
$$G''(z_0) = n\left[\frac{F''(z_0)}{F(z_0)} - \left(\frac{F'(z_0)}{F(z_0)}\right)^2\right] + \frac{U+1}{z_0^2}.$$
This shows that for large $U$ and $n$ a proportional increase in $U$
and $n$ will not change the root $z_0$, while $G''(z_0)$, which is positive,
can be made arbitrarily large.
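Putting $z = z_0 + iy$ one obtains for the integral (4)
$$\sum P = \frac{1}{2\pi}\,e^{G(z_0)}\int_{-\infty}^{\infty} e^{-\frac{1}{2}G''(z_0)\,y^2}\,dy,$$
where the terms of higher than second order are omitted and the
limits of integration are taken to be $\pm\infty$ because of the sharp
drop of the exponential function. This gives
$$\sum P = \frac{e^{G(z_0)}}{\sqrt{2\pi G''(z_0)}} = \frac{\{F(z_0)\}^n\,z_0^{-U-1}}{\sqrt{2\pi G''(z_0)}}. \qquad (6)$$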
$U+1$ can be replaced by $U$, because of the smallness of the energy
unit chosen; if one puts
$$z_0 = e^{-\beta} \qquad (7)$$
one has, for $N \to \infty$,
$$F(z_0) = \sum_r \omega_r z_0^{\epsilon_r} = \sum_r \omega_r e^{-\beta\epsilon_r} = Z(\beta), \qquad (8)$$
which shows that the function $F(z)$ is equivalent to the partition
function introduced in (14.24), p. 158.
If one now takes the logarithm of (6) the leading terms are
$$\log\sum P = n\log Z + \beta U.$$
On the other hand, one has from (5) to the same approximation
$$U = n z_0\frac{F'(z_0)}{F(z_0)} = -\frac{n}{Z}\frac{dZ}{d\beta} = -n\frac{d\log Z}{d\beta}, \qquad (9)$$
in agreement with (14.25); hence
$$\log\sum P = n\left(\log Z - \beta\frac{d\log Z}{d\beta}\right). \qquad (10)$$
Comparison with (14.25) shows that the entropy in this theory
is to be defined by
$$S = ns = k\log\sum P, \qquad (11)$$
while in the Appendix, 14, the definition was $S = k\log P$, where
$P$ means the maximum value of the probability.
Thus it becomes clear that owing to the enormous sharpness of
the maximum it does not matter whether one averages over all
states or picks out only the state of maximum probability.
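For a small assembly the coefficient of $z^U$ in $\{F(z)\}^n$ can be computed exactly and compared with the steepest-descent estimate (6). The following is a minimal sketch under stated assumptions: the cells, weights, $n$ and $U$ are illustrative choices, the function names are hypothetical, the exact coefficient is obtained by repeated polynomial multiplication, and the root of (5) is located by bisection on the positive real axis.

```python
import math


def power_coefficients(weights, energies, n):
    """Coefficients of {F(z)}^n for F(z) = sum_r w_r z**e_r (integer e_r >= 0)."""
    base = [0.0] * (max(energies) + 1)
    for w, e in zip(weights, energies):
        base[e] += w
    coeffs = [1.0]
    for _ in range(n):
        product = [0.0] * (len(coeffs) + len(base) - 1)
        for i, c in enumerate(coeffs):
            for j, b in enumerate(base):
                product[i + j] += c * b
        coeffs = product
    return coeffs


def saddle_point_estimate(weights, energies, n, total_energy):
    """Estimate (6): exp(G(z0)) / sqrt(2 pi G''(z0)), with z0 the root of (5)."""
    def derivatives(z):
        f0 = sum(w * z ** e for w, e in zip(weights, energies))
        f1 = sum(w * e * z ** (e - 1) for w, e in zip(weights, energies))
        f2 = sum(w * e * (e - 1) * z ** (e - 2) for w, e in zip(weights, energies))
        return f0, f1, f2

    def z_times_g_prime(z):
        f0, f1, _ = derivatives(z)
        return n * z * f1 / f0 - (total_energy + 1)

    low, high = 1e-9, 1e9            # bracket on the positive real axis
    for _ in range(200):             # bisection on a logarithmic scale
        middle = math.sqrt(low * high)
        if z_times_g_prime(middle) < 0.0:
            low = middle
        else:
            high = middle
    z0 = math.sqrt(low * high)
    f0, f1, f2 = derivatives(z0)
    g0 = n * math.log(f0) - (total_energy + 1) * math.log(z0)
    g2 = n * (f2 / f0 - (f1 / f0) ** 2) + (total_energy + 1) / z0 ** 2
    return math.exp(g0) / math.sqrt(2.0 * math.pi * g2), z0


# Assumed illustrative assembly: three cells, 60 particles, total energy 45.
weights, energies = [0.5, 0.3, 0.2], [0, 1, 3]
exact = power_coefficients(weights, energies, 60)[45]
approx, z0 = saddle_point_estimate(weights, energies, 60, 45)
print(f"exact sum of P = {exact:.6e}, steepest descent = {approx:.6e}, z0 = {z0:.4f}")
```

The value $-\log z_0$ plays the part of $\beta$ in (7); increasing $n$ and $U$ in proportion should leave it nearly unchanged while the relative error of the estimate decreases.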
In fact, the two methods, that of the most probable distribution
and that of mean values, do not differ as much as it appears.
Both use asymptotic approximations for the combinatorial
quantities: either for each factorial in the probability before
averaging, or for the resultant integral after averaging. The
results are completely identical. Yet there are apostles and
disciples for each of the two doctrines who regard their creed as the
only orthodox one. In my opinion it is just a question of training
and practice which formalism is more convenient. The method
of Darwin and Fowler has perhaps the advantage of greater
flexibility. The partition function is nothing but a 'generating
function' for the probabilities, and allows the representation of
these by complex integrals. In this way the powerful methods
of the theory of analytic functions of a complex variable can be
utilized for thermodynamics.
16. (VI. p. 56.) Boltzmann's collision integral
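The collision integral (6.24) can be derived in the following way.
The gas is supposed to be so diluted that only binary encounters
are to be taken into account. Then the relative motion of two
colliding particles has an initial and a final straight-line asymptote.
To specify an encounter we define the 'cross-section' as the
plane through a point $O$ with a normal parallel to the relative
velocity $\boldsymbol{\xi}_1 - \boldsymbol{\xi}_2$ of two particles before an encounter
and introduce the position vector $\mathbf{b}$ in this plane. We erect a
cylindrical volume element over the area $d\mathbf{b}$ with the height
$|\boldsymbol{\xi}_1 - \boldsymbol{\xi}_2|\,dt$; then all particles in this element having the
relative velocity $\boldsymbol{\xi}_1 - \boldsymbol{\xi}_2$ will pass through $d\mathbf{b}$ in time $dt$.
The probability of a particle 2 passing a particle 1 at $O$
within the cross-section element $d\mathbf{b}$ is obtained from the
product $f(1)\,d\mathbf{x}_1\,d\boldsymbol{\xi}_1\,f(2)\,d\mathbf{x}_2\,d\boldsymbol{\xi}_2$ by replacing $d\mathbf{x}_2$ by the volume
of the cylinder $|\boldsymbol{\xi}_1 - \boldsymbol{\xi}_2|\,d\mathbf{b}\,dt$,
$$f(1)\,f(2)\,|\boldsymbol{\xi}_1 - \boldsymbol{\xi}_2|\,d\mathbf{b}\,dt\,d\mathbf{x}_1\,d\boldsymbol{\xi}_1\,d\boldsymbol{\xi}_2.$$
[Fig. 10.]
Every encounter changes the velocities and removes therefore
the particle 1 from the initial range. The total loss is obtained
by integrating over all $d\mathbf{b}$ and $d\boldsymbol{\xi}_2$:
$$d\mathbf{x}_1\,d\boldsymbol{\xi}_1\,dt\iint f(1)\,f(2)\,|\boldsymbol{\xi}_1 - \boldsymbol{\xi}_2|\,d\mathbf{b}\,d\boldsymbol{\xi}_2. \qquad (1)$$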
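But there are other encounters such that the final state of the
particle 1 is in the element $d\mathbf{x}_1\,d\boldsymbol{\xi}_1$; they are called inverse
encounters.
If the final velocities of the direct encounters are $\boldsymbol{\xi}'_1$, $\boldsymbol{\xi}'_2$ the
laws of collision (conservation of momentum and energy) allow
one to express $\boldsymbol{\xi}'_1$, $\boldsymbol{\xi}'_2$ in terms of $\boldsymbol{\xi}_1$, $\boldsymbol{\xi}_2$ and two further parameters
(the components of $\mathbf{b}$ in the cross-section plane). These relations
are linear in $\boldsymbol{\xi}_1$, $\boldsymbol{\xi}_2$ and may be shortly written
$$(\boldsymbol{\xi}'_1, \boldsymbol{\xi}'_2) = \mathcal{L}\,(\boldsymbol{\xi}_1, \boldsymbol{\xi}_2), \qquad (2)$$
where $\mathcal{L}$ represents a $6 \times 6$ matrix. It is obvious that the
solutions of these equations for $\boldsymbol{\xi}_1$ and $\boldsymbol{\xi}_2$ in terms of $\boldsymbol{\xi}'_1$ and $\boldsymbol{\xi}'_2$ must
have the same form; that means that $\mathcal{L}^{-1} = \mathcal{L}$, so that $\mathcal{L}^2 = 1$
and $|\mathcal{L}| = \pm 1$ or
$$d\boldsymbol{\xi}'_1\,d\boldsymbol{\xi}'_2 = d\boldsymbol{\xi}_1\,d\boldsymbol{\xi}_2. \qquad (3)$$
Further, the elementary theory of collisions (conservation of
energy and momentum) implies
$$|\boldsymbol{\xi}'_1 - \boldsymbol{\xi}'_2| = |\boldsymbol{\xi}_1 - \boldsymbol{\xi}_2|. \qquad (4)$$
Hence the number of inverse encounters is
$$d\mathbf{x}_1\,d\boldsymbol{\xi}_1\,dt\iint f'(1)\,f'(2)\,|\boldsymbol{\xi}_1 - \boldsymbol{\xi}_2|\,d\mathbf{b}\,d\boldsymbol{\xi}_2, \qquad (5)$$
where $f'(1)$ means $f(t, \mathbf{x}_1, \boldsymbol{\xi}'_1)$, $\boldsymbol{\xi}'_1$ being the linear function of
$\boldsymbol{\xi}_1$, $\boldsymbol{\xi}_2$ given by (2).
Combining (1) and (5), one obtains for the total gain of particles
(1) in $d\mathbf{x}_1\,d\boldsymbol{\xi}_1$, per time-element $dt$,
$$d\mathbf{x}_1\,d\boldsymbol{\xi}_1\,dt\iint \{f'(1)f'(2) - f(1)f(2)\}\,|\boldsymbol{\xi}_1 - \boldsymbol{\xi}_2|\,d\mathbf{b}\,d\boldsymbol{\xi}_2. \qquad (6)$$
This has to be equated to the change of $f(1)$ calculated without
assuming interactions, namely,
$$\left\{\frac{\partial f(1)}{\partial t} - [H, f(1)]\right\}d\mathbf{x}_1\,d\boldsymbol{\xi}_1\,dt. \qquad (7)$$
The results are the combined formulae (6.23) and (6.24) of the
text,
$$\frac{\partial f(1)}{\partial t} - [H, f(1)] = \iint \{f'(1)f'(2) - f(1)f(2)\}\,|\boldsymbol{\xi}_1 - \boldsymbol{\xi}_2|\,d\mathbf{b}\,d\boldsymbol{\xi}_2. \qquad (8)$$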
17. (VI. p. 57.) Irreversibility in gases
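Assuming no external forces, the Hamiltonian of a particle
is $H = \mathbf{p}^2/2m$; hence Boltzmann's equation (6.23), or (Appendix,
16.8) reduces to
$$\frac{\partial f(1)}{\partial t} + \boldsymbol{\xi}_1\cdot\frac{\partial f(1)}{\partial\mathbf{x}_1} = \iint \{f'(1)f'(2) - f(1)f(2)\}\,|\boldsymbol{\xi}_1 - \boldsymbol{\xi}_2|\,d\mathbf{b}\,d\boldsymbol{\xi}_2. \qquad (1)$$
If now the entropy is defined by (6.25) or, using the velocity
instead of the momentum, by
$$S = -k\iint f(1)\log f(1)\,d\mathbf{x}_1\,d\boldsymbol{\xi}_1, \qquad (2)$$
one obtains
$$\frac{dS}{dt} = -k\iint \frac{\partial f(1)}{\partial t}\{1 + \log f(1)\}\,d\mathbf{x}_1\,d\boldsymbol{\xi}_1,$$
and substituting $\partial f(1)/\partial t$ from (1)
$$\frac{dS}{dt} = k\iint \boldsymbol{\xi}_1\cdot\frac{\partial f(1)}{\partial\mathbf{x}_1}\{1 + \log f(1)\}\,d\mathbf{x}_1\,d\boldsymbol{\xi}_1 - k\iiiint \{1 + \log f(1)\}\{f'(1)f'(2) - f(1)f(2)\}\,|\boldsymbol{\xi}_1 - \boldsymbol{\xi}_2|\,d\mathbf{b}\,d\mathbf{x}_1\,d\boldsymbol{\xi}_1\,d\boldsymbol{\xi}_2. \qquad (3)$$
Here the first integral can be written
$$\iint \boldsymbol{\xi}_1\cdot\frac{\partial}{\partial\mathbf{x}_1}\{f(1)\log f(1)\}\,d\mathbf{x}_1\,d\boldsymbol{\xi}_1$$
and transformed, by Gauss's theorem, into a surface integral
over the walls of the container,
$$\int d\sigma\int \boldsymbol{\xi}_1\cdot\boldsymbol{\nu}\,f(1)\log f(1)\,d\boldsymbol{\xi}_1,$$
where $\boldsymbol{\nu}$ is the unit vector parallel to the outer normal of the
surface, $d\sigma$ the surface element.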
The inner integral is $n$ times the mean value over all velocities
of $\boldsymbol{\xi}_1\cdot\boldsymbol{\nu}\,\log f(1)$, where $n = \int f(1)\,d\boldsymbol{\xi}_1$ is the number density. But
this average vanishes at the surface which is supposed to be
perfectly elastic and at rest, external interference being excluded;
for the numbers of incident and reflected particles with the same
absolute value of the normal component of the velocity, $\boldsymbol{\xi}_1\cdot\boldsymbol{\nu}$,
will be equal.
Hence there remains only the second integral in (3). This can
be written in four different forms, namely, apart from the one
given in (3), where the factor $1 + \log f(1)$ appears, three others where
this factor is replaced by $1 + \log f(2)$ or, with the sign of the whole
integrand reversed, by $1 + \log f'(1)$ or $1 + \log f'(2)$. For it is
obvious that 1 and 2 can be interchanged as the integration is
extended over both points in a symmetric way; and the dashed
variables can be exchanged with the undashed ones as
$$d\boldsymbol{\xi}'_1\,d\boldsymbol{\xi}'_2 = d\boldsymbol{\xi}_1\,d\boldsymbol{\xi}_2, \qquad |\boldsymbol{\xi}'_1 - \boldsymbol{\xi}'_2| = |\boldsymbol{\xi}_1 - \boldsymbol{\xi}_2|$$
(see Appendix, 16.3, 4). Hence
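$$\frac{dS}{dt} = -\frac{k}{4}\iiiint \{\log f(1) + \log f(2) - \log f'(1) - \log f'(2)\}\{f'(1)f'(2) - f(1)f(2)\}\,|\boldsymbol{\xi}_1 - \boldsymbol{\xi}_2|\,d\mathbf{b}\,d\mathbf{x}_1\,d\boldsymbol{\xi}_1\,d\boldsymbol{\xi}_2$$
$$= \frac{k}{4}\iiiint \log\frac{f'(1)f'(2)}{f(1)f(2)}\,\{f'(1)f'(2) - f(1)f(2)\}\,|\boldsymbol{\xi}_1 - \boldsymbol{\xi}_2|\,d\boldsymbol{\xi}_1\,d\boldsymbol{\xi}_2\,d\mathbf{b}\,d\mathbf{x}_1. \qquad (4)$$
Now $\log\dfrac{f'(1)f'(2)}{f(1)f(2)}$ is positive or negative according as $f'(1)f'(2)$
is greater or smaller than $f(1)f(2)$; it has therefore always the
same sign as $f'(1)f'(2) - f(1)f(2)$, and one obtains
$$\frac{dS}{dt} \ge 0; \qquad (5)$$
the $=$ sign can hold only if
$$f'(1)f'(2) = f(1)f(2) \qquad (6)$$
or
$$\log f'(1) + \log f'(2) = \log f(1) + \log f(2). \qquad (7)$$
One can express this also by saying that
$$\log f(1) + \log f(2) \qquad (8)$$
is a collision invariant.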
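The mechanics of the two-body problem teaches that there are
only four quantities conserved at a collision: the three components
of the momentum $m\boldsymbol{\xi}_1 + m\boldsymbol{\xi}_2$ and the total energy
$\tfrac{1}{2}m\boldsymbol{\xi}_1^2 + \tfrac{1}{2}m\boldsymbol{\xi}_2^2$. Hence $\log f$ must be a linear combination of these:
$$\log f = \alpha - \tfrac{1}{2}\beta m\boldsymbol{\xi}^2 + \boldsymbol{\gamma}\cdot\boldsymbol{\xi}. \qquad (9)$$
This can also be written
$$f = e^{\alpha_1 - \frac{1}{2}\beta m(\boldsymbol{\xi} - \mathbf{u})^2}, \qquad (10)$$
where $\mathbf{u} = \boldsymbol{\gamma}/(\beta m)$ and $\alpha_1 = \alpha + \boldsymbol{\gamma}^2/(2\beta m)$.
(10) shows that $\mathbf{u}$ is the mean velocity. For a gas at rest (in a
fixed vessel) one has therefore $\mathbf{u} = 0$, hence $\boldsymbol{\gamma} = 0$ and
$$f = e^{\alpha - \frac{1}{2}\beta m\boldsymbol{\xi}^2}.$$
This is the dynamical proof of Maxwell's distribution law.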
18. (VI. p. 60.) Formalism of statistical mechanics
As said in the text, Gibbs's statistical mechanics is formally
identical with Boltzmann's theory of gases if the actual gas is
replaced by a virtual assembly of copies of the system under
consideration. Hence all formulae referring to averages per
particle (small letters) can be taken over if the word 'particle'
is replaced by 'system under consideration'. One forms the
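partition function (14.24) or (15.8) $F(z) = Z(\beta)$, $z = e^{-\beta}$, and
from that the free energy (14.27)
$$f = -kT\log Z, \qquad (1)$$
from which all thermodynamical quantities can be obtained
by differentiation:
$$s = -\frac{\partial f}{\partial T}, \qquad f_a = -\frac{\partial f}{\partial a_a}. \qquad (2)$$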
This formalism includes also the case of chemical mixtures
where the number of particles of a certain type is variable. One
has to know how the quantity Z depends on these numbers;
then the chemical potentials, introduced in Appendix, 8, are
obtained by differentiating $f$ with respect to the concentrations.
We shall mention only the method of the 'great ensemble' which
can be used in this case.
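In the theory of non-ideal gases the Hamiltonian splits up
into a sum
$$H = \frac{1}{2m}\sum_{k=1}^{N}\mathbf{p}_k^2 + U(\mathbf{x}_1, \dots, \mathbf{x}_N), \qquad (3)$$
and the partition function into a product
$$Z = \int\!\cdots\!\int \exp\Big\{-\frac{\beta}{2m}\sum_{k=1}^{N}\mathbf{p}_k^2\Big\}\,d\mathbf{p}_1\cdots d\mathbf{p}_N \times \int\!\cdots\!\int \exp\{-\beta U\}\,d\mathbf{x}_1\cdots d\mathbf{x}_N.$$
The first integral can easily be evaluated and gives $(2\pi mkT)^{3N/2}$;
hence one has
$$Z = (2\pi mkT)^{3N/2}\,Q, \qquad (4)$$
where
$$Q = \int\!\cdots\!\int \exp\{-U(\mathbf{x}_1, \dots, \mathbf{x}_N)/kT\}\,d\mathbf{x}_1\cdots d\mathbf{x}_N, \qquad (5)$$
as in (6.35) of the text.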
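The method of Ursell for the evaluation of this integral applies
to the case where the potential energy is supposed to consist of
interaction in pairs between the centres of the particles,
$$U = \sum_{i>j}\Phi_{ij}, \qquad \Phi_{ij} = \Phi(r_{ij}). \qquad (6)$$
Then one can write
$$e^{-\beta U} = \prod_{i>j}(1 - f_{ij}), \qquad (7)$$
where
$$f_{ij} = 1 - e^{-\beta\Phi_{ij}}. \qquad (8)$$
The product (7) can be expanded into a series
$$\prod_{i>j}(1 - f_{ij}) = 1 - \sum f_{ij} + \sum f_{ij}f_{kl} - \dots, \qquad (9)$$
and the problem of calculating $Q$ is reduced to finding the
'cluster integrals'
$$\int\!\cdots\!\int f_{ij}\,d\mathbf{x}_1\cdots d\mathbf{x}_N, \qquad \int\!\cdots\!\int f_{ij}f_{kl}\,d\mathbf{x}_1\cdots d\mathbf{x}_N, \quad \dots, \qquad (10)$$
which are obviously proportional to $V^{N-1}$, $V^{N-2}$, ... . Hence one
obtains for $QV^{-N}$ an expansion in powers of $V^{-1}$ which holds
for small interactions ($\beta\Phi_{ij}$ small implies $f_{ij}$ small):
$$QV^{-N} = 1 - \frac{\alpha}{V} + \frac{\beta}{V^2} - \dots \qquad (11)$$
Then (1) and (4) give
$$f = -kT\left\{\frac{3N}{2}\log(2\pi mkT) + \log Q\right\}, \qquad (12)$$
hence
$$p = -\frac{\partial f}{\partial V} = kT\frac{\partial\log Q}{\partial V} = \frac{NkT}{V}\left(1 + \frac{A}{V} + \frac{B}{V^2} + \dots\right), \qquad (13)$$
where $A = \alpha/N$, $B = (\alpha^2 - 2\beta)/N$, ... . That is the formula (6.36)
given in the text.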
The actual evaluation of the cluster integrals is extremely
difficult and cumbersome. The analytical properties of the power
series (11) have been carefully investigated by J. Mayer, and by
myself in collaboration with K. Fuchs. The theory has been
generalized so as to include quantum effects by Uhlenbeck,
Kahn, de Boer. Here is a list of publications:
H. D. Ursell, Proc. Camb. Phil. Soc. 23, p. 685 (1927).
J. E. Mayer, J. Chem. Phys. 5, p. 67 (1937).
J. E. Mayer and P. J. Ackermann, ibid. p. 74.
J. E. Mayer and S. F. Harrison, ibid. 6, pp. 87, 101 (1938).
M. Born, Physica, 4, p. 1034 (1937).
M. Born and K. Fuchs, Proc. Roy. Soc. A, 166, p. 391 (1938).
K. Fuchs, ibid. A, 179, p. 340 (1942).
B. Kahn and G. E. Uhlenbeck, Physica, 4, p. 299 (1938).
B. Kahn, The Theory of the Equation of State. Utrecht Dissertation.
J. de Boer and A. Michels, Physica, 6, p. 97 (1939).
S. F. Streeter and J. E. Mayer, J. Chem. Phys. 7, p. 1025 (1939).
J. E. Mayer and E. W. Montroll, ibid. 9, p. 626 (1941).
J. E. Mayer and M. Goeppert Mayer, Statistical Mechanics, J. Wiley
& Sons, New York (1940).
J. E. Mayer, J. Chem. Phys. 10, p. 629 (1942).
W. G. MacMillan and J. E. Mayer, ibid. 13, p. 276 (1945).
J. E. Mayer, ibid. 43, p. 71 (1939); 15, p. 187 (1947).
H. S. Green, Proc. Roy. Soc. A, 189, p. 103 (1947).
J. de Boer and A. Michels, Physica, 7, p. 369 (1940).
J. de Boer, Contributions to the Theory of Compressed Gases.
Amsterdam Dissertation (1940).
J. Yvon, Actualités scientifiques et industrielles, 203, p. 1 (1935); p. 542
(1937); Cahiers de physique, 28, p. 1 (1945).
19. (VI. p. 62.) Quasi periodicity
The state of a mechanical system can be represented by a
point in the $6N$-dimensional phase space $p$, $q$, and its motion by
a single orbit on a 'surface' of constant energy in this space.
Following this orbit one must come very near to the initial point;
the time needed will be considerable, in the range of observability.
This is the quasi-period considered by Zermelo. Yet
there are much smaller quasi-periods if one takes into account
that all particles are equal and indistinguishable; the gas is
already in almost the same state as the initial one if any particle
has come near the initial position of any other. Then the orbit
defined above is not closed at all, yet the system has performed
another kind of quasi-period. These periods are presumably
small; I cannot give a mathematical proof, but it seems evident
from the overwhelming probability of distributions near the
most probable one. Einstein has this quasi-period in mind. It
is certainly extremely short in the scale of observable time
intervals, and one can therefore say that the representing 'point'
sweeps over the whole energy surface if this point is defined
without regard to the individuality of the particles (i.e. if an
enormous number of single points corresponding to permutation
of all particles are regarded as one point).
20. (VI. p. 63.) Fluctuations and Brownian motion
The statistical conception of matter in bulk implies that
spontaneous deviations from equilibrium are possible. There are
several different types of problems, some of them concerned
with the deviations from the average or fluctuations found by
repeated observations, others with actual motion of suspended
visible particles, the Brownian motion.
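The simplest case of fluctuations is that of density, i.e. of the
number of particles in a small part $\omega V$ of the whole volume $V$.
One has in this case two cells of relative size $\omega$ and $1 - \omega$, and the
probability of a distribution $n_1$, $n_2 = n - n_1$, is according to
(6.15) or (14.1)
$$P(n_1) = \frac{n!}{n_1!\,(n - n_1)!}\,\omega^{n_1}(1 - \omega)^{n - n_1}. \qquad (1)$$
The expectation value of $n_1$ found by repeated experiments is
$$\overline{n_1} = \sum_{n_1=0}^{n} n_1 P(n_1)$$
or
$$\overline{n_1} = n\omega\sum_{n_1=1}^{n}\frac{(n-1)!}{(n_1 - 1)!\,(n - n_1)!}\,\omega^{n_1 - 1}(1 - \omega)^{n - n_1}.$$
According to the binomial theorem this reduces to
$$\overline{n_1} = n\omega, \qquad (2)$$
as might be expected.
In order to calculate $\overline{n_1^2}$ we note that $\overline{n_1(n_1 - 1)}$ can be found in
exactly the same way as $\overline{n_1}$, namely,
$$\overline{n_1(n_1 - 1)} = \sum_{n_1=0}^{n} n_1(n_1 - 1)P(n_1) = n(n-1)\,\omega^2\sum_{n_1=2}^{n}\frac{(n-2)!}{(n_1 - 2)!\,(n - n_1)!}\,\omega^{n_1 - 2}(1 - \omega)^{n - n_1},$$
whence
$$\overline{n_1(n_1 - 1)} = n(n-1)\,\omega^2. \qquad (3)$$
Therefore
$$\overline{n_1^2} = n(n-1)\,\omega^2 + n\omega, \qquad (4)$$
so that the mean square deviation is
$$\overline{(\Delta n_1)^2} = \overline{n_1^2} - \overline{n_1}^2 = n\omega(1 - \omega). \qquad (5)$$
If $\omega$ is a small fraction, one obtains the well-known fluctuation
formula for independent events
$$\overline{(\Delta n_1)^2} = \overline{n_1}. \qquad (6)$$
This is directly applicable to the density fluctuation of an ideal
gas and can be used to explain the scattering of light by a gas,
as observed for instance in the blue of the sky (Lord Rayleigh;
Atomic Physics, Appendix IV, p. 280).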
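The mean and the mean square deviation of the occupation number can be checked by summing the binomial probabilities directly. The sketch below is an illustration with assumed values of $n$ and $\omega$; the function name `density_moments` is hypothetical.

```python
import math


def density_moments(n, omega):
    """Return (mean, variance) of n_1 from P(n_1) = C(n, n_1) w^n_1 (1 - w)^(n - n_1)."""
    probabilities = [
        math.comb(n, k) * omega ** k * (1.0 - omega) ** (n - k) for k in range(n + 1)
    ]
    mean = sum(k * p for k, p in enumerate(probabilities))
    mean_square = sum(k * k * p for k, p in enumerate(probabilities))
    return mean, mean_square - mean ** 2


# Assumed illustrative values: 50 particles, sub-volume one tenth of the whole.
mean_n1, variance_n1 = density_moments(50, 0.1)
print(mean_n1, variance_n1)   # to be compared with n*w and n*w*(1 - w)
```

For small $\omega$ the variance approaches the mean, which is the content of formula (6).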
There are also fluctuations of other properties of a gas. As
the state of a fluid is determined by two independent macroscopic
variables ($p$, $V$ for instance), it suffices to calculate the
fluctuation of one further quantity. The most convenient one
is the energy.
The following consideration holds, however, not only for ideal
gases, but for any set of independent equal systems of given
total (or mean) energy; it supposes only that the distribution
is canonical.
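Then all averages can be obtained with the help of the partition
function (14.24) or (15.8),
$$Z = \sum_r \omega_r e^{-\beta\epsilon_r}, \qquad \beta = \frac{1}{kT}. \qquad (7)$$
In particular the mean energy is (14.25)
$$u = \overline{\epsilon} = \frac{\sum_r \omega_r\epsilon_r e^{-\beta\epsilon_r}}{\sum_r \omega_r e^{-\beta\epsilon_r}} = -\frac{Z'}{Z} = -\frac{d\log Z}{d\beta}, \qquad (8)$$
and the mean of its square
$$\overline{\epsilon^2} = \frac{\sum_r \omega_r\epsilon_r^2 e^{-\beta\epsilon_r}}{\sum_r \omega_r e^{-\beta\epsilon_r}} = \frac{Z''}{Z}. \qquad (9)$$
Hence the mean fluctuation of energy
$$\overline{(\Delta\epsilon)^2} = \overline{\epsilon^2} - \overline{\epsilon}^2 = \frac{ZZ'' - Z'^2}{Z^2} = \frac{d}{d\beta}\left(\frac{Z'}{Z}\right),$$
or with (8)
$$\overline{(\Delta\epsilon)^2} = -\frac{du}{d\beta}. \qquad (10)$$
If the mean energy is known as a function of temperature,
hence of $\beta$, one obtains its fluctuation by differentiation. For
example, for an ideal gas one has $u = c_v T = (c_v/k)\beta^{-1}$, hence
$\overline{(\Delta\epsilon)^2} = c_v kT^2 = (k/c_v)u^2$. Another application, to the
fluctuation of radiation, is made in section VIII, p. 79.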
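Formula (10) can be tested on a small canonical system by comparing the directly computed variance of the energy with a numerical derivative of the mean energy. The following is a minimal sketch; the levels, weights and $\beta$ are assumed illustrative values, and the function names are hypothetical.

```python
import math


def energy_moments(beta, energies, weights):
    """Return (mean energy, mean square deviation) for the canonical distribution."""
    boltzmann = [w * math.exp(-beta * e) for e, w in zip(energies, weights)]
    z = sum(boltzmann)
    mean = sum(e * b for e, b in zip(energies, boltzmann)) / z
    mean_square = sum(e * e * b for e, b in zip(energies, boltzmann)) / z
    return mean, mean_square - mean ** 2


def minus_du_dbeta(beta, energies, weights, step=1e-5):
    """-du/dbeta estimated by a central difference of the mean energy."""
    upper, _ = energy_moments(beta + step, energies, weights)
    lower, _ = energy_moments(beta - step, energies, weights)
    return -(upper - lower) / (2.0 * step)


# Assumed illustrative system (arbitrary units).
levels = [0.0, 0.5, 1.5, 4.0]
level_weights = [1.0, 2.0, 2.0, 1.0]
beta_value = 1.2

_, fluctuation = energy_moments(beta_value, levels, level_weights)
derivative = minus_du_dbeta(beta_value, levels, level_weights)
print(fluctuation, derivative)
```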
If one wishes to determine the fluctuations of a part of a body
which cannot be decomposed into independent systems, these
simple methods are not applicable.
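Einstein has invented a most ingenious method which can be
applied in such cases. It consists in reversing Boltzmann's
equation
$$S = k\log P, \qquad (11)$$
taking $S$ as a known function of observable parameters, and
determining the probability $P$ from it,
$$P = e^{S/k}. \qquad (12)$$
Assume the whole system is divided into $N$ small, but still
macroscopic parts and $\Delta u_i$ is the fluctuation of energy in one of
them; then one has for the entropy in this part
$$S_i = S_i^0 + \frac{\partial S_i}{\partial u_i}\,\Delta u_i + \frac{1}{2}\frac{\partial^2 S_i}{\partial u_i^2}\,(\Delta u_i)^2 + \dots$$
If the whole system is adiabatically isolated one has
$$\sum_i \Delta u_i = 0.$$
By adding up all fluctuations one gets for the entropy
$$S = S_0 - k\gamma\sum_i \xi_i^2 + \dots, \qquad (13)$$
where the abbreviations
$$\xi_i = \Delta u_i, \qquad \gamma = -\frac{1}{2k}\frac{\partial^2 S}{\partial U^2} \qquad (14)$$
are used. According to the second law of thermodynamics one
has for constant volume
$$\frac{\partial S}{\partial U} = \frac{1}{T}, \qquad \frac{\partial^2 S}{\partial U^2} = -\frac{1}{T^2}\frac{\partial T}{\partial U} = -\frac{1}{T^2 c_v};$$
hence
$$\gamma = \frac{1}{2k}\,\frac{1}{T^2 c_v}, \qquad (15)$$
where $c_v = dU/dT$ is the specific heat for constant volume.
Substituting (13) in (12) one obtains approximately
$$P = P_0\,e^{-\gamma\sum_i \xi_i^2};$$
hence the mean square fluctuation of energy in one (macroscopic)
cell is
$$\overline{\xi_i^2} = -\frac{\partial}{\partial\gamma}\log\int_{-\infty}^{\infty} e^{-\gamma\xi^2}\,d\xi = \frac{1}{2\gamma},$$
or
$$\overline{(\Delta u_i)^2} = kT^2 c_v. \qquad (16)$$
This result is formally identical with that for an ideal gas obtained
above, yet holds also if $c_v$ is any function of $T$.
In a similar way other fluctuations can be expressed in terms
of macroscopic quantities.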
We now turn to the theory of Brownian motion which is also
due to Einstein. His original papers on this subject are collected
in a small volume Investigations on the Theory of the Brownian
Movement, by R. Fürth, translated by A. D. Cowper (Methuen
& Co. Ltd., London, 1926) and make delightful reading. Here
I give the main ideas of this theory in a slight modification
formulated independently by Planck and Fokker.
Let $f(x, t)\,dx$ be the probability that the centre of a colloidal
(visible) particle has an $x$-coordinate between $x$ and $x + dx$ at
time $t$. The particle may be subject to a constant force $F$ and
to the collisions of the surrounding molecules. The latter will
produce a friction-like effect; if the particle is big compared with
the molecules, its acceleration may be neglected and the velocity
component in the $x$-direction assumed to be proportional to the
force
$$\xi = BF, \qquad (17)$$
where $B$ is called the 'mobility'. Apart from this quasi-continuous
action, the collisions will produce tiny irregular displacements
which can be described by a statistical law, namely, by
defining a function $\phi(x)$ which represents the probability for a
particle to be displaced in the positive $x$-direction by $x$ during a
small but finite interval of time $\tau$.
Then one obtains a kind of collision equation (which is simpler
than Boltzmann's in the kinetic theory of gases, as no attempt
is made to analyse the mechanism of collision in detail): The
convective increase of $f(x, t)$ in the time-interval $\tau$,
$$\tau\frac{df}{dt} = \tau\left(\frac{\partial f}{\partial t} + \xi\frac{\partial f}{\partial x}\right),$$
is not zero but equal to the difference of the effect of the collisions
which carry a particle from $x'$ to $x$ and those which remove the
particle at $x$ to any other place $x'$:
$$\tau\left(\frac{\partial f}{\partial t} + \xi\frac{\partial f}{\partial x}\right) = \int_{-\infty}^{\infty}\{f(x - x') - f(x)\}\,\phi(x')\,dx'. \qquad (18)$$
$\phi(x)$ may be normalized to unity and the mean of the displacement
and of its square introduced by
$$\int_{-\infty}^{\infty}\phi(x)\,dx = 1, \qquad \int_{-\infty}^{\infty} x\,\phi(x)\,dx = \overline{\Delta x}, \qquad \int_{-\infty}^{\infty} x^2\phi(x)\,dx = \overline{(\Delta x)^2}. \qquad (19)$$
Further it may be supposed that the range of $\phi(x)$ is small; then
one can expand $f(x - x')$ on the right-hand side of (18) and obtain
a differential equation for $f(x, t)$ which, with (17) and (19), can
be written
$$\tau\frac{\partial f}{\partial t} + \left(\tau BF + \overline{\Delta x}\right)\frac{\partial f}{\partial x} = \frac{1}{2}\,\overline{(\Delta x)^2}\,\frac{\partial^2 f}{\partial x^2}. \qquad (20)$$
Let us assume that the irregular action of the collisions is
symmetric in $x$, $\phi(x) = \phi(-x)$; then $\overline{\Delta x} = 0$.
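Consider first statistical equilibrium; then $\partial f/\partial t = 0$, and (20)
reduces to
$$BF\,\frac{\partial f}{\partial x} = \frac{\overline{(\Delta x)^2}}{2\tau}\,\frac{\partial^2 f}{\partial x^2}. \tag{21}$$
Now the coordinate x of the colloidal particle can be included in
the total set of coordinates of the whole system, if a term $-Fx$
is added to the Hamiltonian, so that the canonical law of distribution
contains the factor $e^{\beta Fx}$, $\beta = 1/kT$. Hence the solution
of (21) must have the form
$$f = f_0\,e^{\beta Fx}, \tag{22}$$
so that
$$\frac{\partial^2 f}{\partial x^2} = \beta F\,\frac{\partial f}{\partial x}.$$
If this is substituted in (21), one finds
$$\frac{\overline{(\Delta x)^2}}{2\tau} = D = kTB. \tag{23}$$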
We consider now the motion of the particles without an
external field (F = 0), under the action of the collisions only.
Then (20) reads
$$\frac{\partial f}{\partial t} = D\,\frac{\partial^2 f}{\partial x^2}. \tag{24}$$
This is the well-known equation of diffusion. Einstein's main
result consists in the double formula (23) which connects the
mean square displacement with the coefficient of diffusion D and
with temperature and mobility.
If the particle is known to be at a given position, say x = 0,
at t = 0, the probability of finding it at x after the time t is the
following solution of (24):
$$f(x, t) = \frac{1}{\sqrt{4\pi Dt}}\,e^{-x^2/4Dt}; \tag{25}$$
the mean square of the coordinate, or the 'spread' of probability
after the time t is found by a simple calculation:
$$\overline{x^2} = \int_{-\infty}^{\infty} x^2 f(x, t)\,dx = 2Dt, \tag{26}$$
which for $t = \tau$ is equal to the mean square displacement $\overline{(\Delta x)^2}$
given by (23).
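The 'simple calculation' behind (26) can be checked numerically. The following is only a minimal sketch: the function names, the midpoint rule, and the values of D and t are illustrative assumptions, not part of the original text. It integrates $x^2 f(x, t)$ for the Gaussian solution of the diffusion equation and compares the result with 2Dt.

```python
import math

def diffusion_density(x, t, D):
    """Gaussian solution of the diffusion equation for a particle at x = 0 when t = 0."""
    return math.exp(-x * x / (4.0 * D * t)) / math.sqrt(4.0 * math.pi * D * t)

def mean_square(t, D, half_width=12.0, steps=20000):
    """Midpoint-rule approximation of the integral of x^2 f(x, t) dx.

    The range is +/- half_width standard deviations; the standard
    deviation of the Gaussian is sqrt(2 D t).
    """
    sigma = math.sqrt(2.0 * D * t)
    left = -half_width * sigma
    h = 2.0 * half_width * sigma / steps
    total = 0.0
    for i in range(steps):
        x = left + (i + 0.5) * h
        total += x * x * diffusion_density(x, t, D)
    return total * h

# Illustrative values in arbitrary units (assumed, not taken from the text).
D = 0.5
t = 3.0
print(mean_square(t, D), 2.0 * D * t)
```

The two printed numbers should agree to within the accuracy of the quadrature, whatever positive values are chosen for D and t.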
These formulae can be used in different ways to determine
Boltzmann's constant k, or Avogadro's number N = R/k (where
R is the gas constant per mole), i.e. the number of molecules per
mole. A static method consists in observing the sedimentation
under gravity of a colloid solution; then F = -mg, where m is
the mass of the colloid particle, and the number of particles
decreases with height according to the law (22), which now reads
$$n = n_0\,e^{-\beta mgx} = n_0\,e^{-mgx/kT}.$$
In order to apply this formula one has to determine the mass.
For spherical particles $m = (4\pi/3)r^3\rho$, where $\rho$ is the density and
r the radius. The mobility of a sphere in a liquid of viscosity $\eta$
has been calculated by Stokes from the hydrodynamical equations,
with the result
$$B = \frac{1}{6\pi\eta r}; \tag{27}$$
hence it falls under gravity, $F = mg = (4\pi/3)r^3\rho g$, with
the velocity (17)
$$\xi = BF = \frac{2r^2\rho g}{9\eta}.$$
As $\xi$ can be measured, r can be found, if $\rho$ and $\eta$ are known, and
finally m.
Another method is a dynamical one. One observes the displacements
$\Delta x_1, \Delta x_2, \dots$, of a single colloid particle in equal intervals
$\tau$ of time and forms the mean square $\overline{(\Delta x)^2}$. Then using the
same method as just described for determining the radius r,
one finds B from (27) and then k from (23).
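The dynamical method can be sketched in a few lines of code. This is an illustration under stated assumptions, not a record of any measurement: the function names are hypothetical, the radius, viscosity, temperature and interval are invented round numbers, and the 'observed' displacements are synthetic values constructed so that their mean square is exactly 2kTB tau.

```python
import math

def stokes_mobility(radius, viscosity):
    """Mobility of a sphere after Stokes: B = 1 / (6 pi eta r)."""
    return 1.0 / (6.0 * math.pi * viscosity * radius)

def boltzmann_constant_estimate(displacements, tau, temperature, radius, viscosity):
    """Estimate k from Einstein's relation: mean (dx)^2 / (2 tau) = k T B."""
    mean_square = sum(dx * dx for dx in displacements) / len(displacements)
    diffusion = mean_square / (2.0 * tau)
    return diffusion / (temperature * stokes_mobility(radius, viscosity))

# Hypothetical experimental conditions in SI units (assumed for illustration).
K_TRUE = 1.380649e-23   # value of k used to build the synthetic data
T = 293.0               # temperature in kelvin
R = 0.5e-6              # particle radius in metres
ETA = 1.0e-3            # viscosity in pascal seconds
TAU = 30.0              # observation interval in seconds

# Synthetic displacements with mean square exactly 2 k T B tau.
step = math.sqrt(2.0 * K_TRUE * T * stokes_mobility(R, ETA) * TAU)
synthetic = [step, -step, step, -step]

k_est = boltzmann_constant_estimate(synthetic, TAU, T, R, ETA)
avogadro = 8.314462618 / k_est   # N = R/k with the gas constant per mole
print(k_est, avogadro)
```

With real observations the list of displacements would be long and irregular, and the estimate would carry a statistical error that decreases with the number of intervals observed.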
In this way the first reliable determinations of N have been
made. Among those who have developed the theory M. v.
Smoluchowski has played a distinguished part, while the first
systematic measurements are due to J. Perrin.
A new and interesting approach to the theory of Brownian
motion may be mentioned: J. G. Kirkwood, J. Chem. Phys.
14, p. 180 (1946); 15, p. 72 (1947).
21. (VI. p. 67.) Reduction of the multiple distribution
function
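The total Hamiltonian $H_N$ of N particles can be split into two
parts, the first being the Hamiltonian $H_{N-1}$ of N-1 particles,
the second the interaction of these with the last particle:
$$H_N = H_{N-1} + \frac{m}{2}\,\xi^{(N)2} + \Phi^{(N)} + \sum_{i=1}^{N-1}\Phi^{(iN)}, \tag{1}$$
where $\Phi^{(i)}$ is the external potential on the particle i and $\Phi^{(ij)}$ the
mutual potential between two particles i and j.
Now we apply the operator $\chi_N$ to the equation for the total
system,
$$\frac{\partial f_N}{\partial t} = [H_N, f_N]. \tag{2}$$
From (6.40) we have, for q = N-1,
$$\chi_N f_N = f_{N-1}.$$
Hence
$$\chi_N\frac{\partial f_N}{\partial t} = \frac{\partial f_{N-1}}{\partial t}$$
and
$$\chi_N[H_N, f_N] = \chi_N[H_{N-1}, f_N] + \chi_N\Big[\frac{m}{2}\,\xi^{(N)2} + \Phi^{(N)}, f_N\Big] + \sum_{i=1}^{N-1}\chi_N[\Phi^{(iN)}, f_N].$$
Here the first term on the right-hand side becomes
$$\chi_N[H_{N-1}, f_N] = [H_{N-1}, f_{N-1}],$$
since $H_{N-1}$ does not depend on the particle N to which the operator
$\chi_N$ refers. Further,
$$\chi_N\Big[\frac{m}{2}\,\xi^{(N)2} + \Phi^{(N)}, f_N\Big] = 0;$$
for if the integration $\chi_N$ is performed, the result refers to values
of $f_N$ at infinity of the $x^{(N)}$ and $\xi^{(N)}$ respectively, and these vanish
as there is no probability for particles to be at an infinite distance
or to have infinite velocities.
If all this is substituted in (2) one obtains
$$\frac{\partial f_{N-1}}{\partial t} = [H_{N-1}, f_{N-1}] + \sum_{i=1}^{N-1}\chi_N[\Phi^{(iN)}, f_N]. \tag{3}$$
Repeating the same process with $\chi_{N-1}, \chi_{N-2}, \dots$ one obtains the
chain of equations (6.44), (6.45) of the text.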
22. (VI. p. 68.) Construction of the multiple distribution
function
The fundamental multiplication theorem for non-independent
events can be obtained in the following way.
Any event of a given set may have a certain property A or
not, $\bar A$. If B is another property we indicate by AB those events
which have both the properties A and B.
Then all events can be split into four groups $AB$, $A\bar B$, $\bar AB$,
$\bar A\bar B$, with the probabilities $p_{AB}$, $p_{A\bar B}$, $p_{\bar AB}$, $p_{\bar A\bar B}$.
The probability of A is
$$p_A = p_{AB} + p_{A\bar B}. \tag{1}$$
On the other hand, if A is known to occur, the cases $\bar AB$, $\bar A\bar B$ are
excluded, hence the probability of B is
$$p_B(A) = \frac{p_{AB}}{p_{AB} + p_{A\bar B}} = \frac{p_{AB}}{p_A},$$
or
$$p_{AB} = p_A\,p_B(A), \tag{2}$$
which is the multiplication rule; it reduces to the ordinary one
for independent events if $p_B(A)$ does not depend on A and is
equal to $p_B$.
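A small numerical illustration of the multiplication rule may be helpful. The probabilities of the four groups below are invented for the purpose (an assumption, not data from the text), and exact fractions are used so that the rule can be checked without rounding.

```python
from fractions import Fraction

# Hypothetical probabilities of the four groups; they add up to 1.
p = {
    ("A", "B"): Fraction(1, 10),
    ("A", "notB"): Fraction(3, 10),
    ("notA", "B"): Fraction(2, 10),
    ("notA", "notB"): Fraction(4, 10),
}

def prob_A(p):
    """Probability of A: the sum over the two groups that have property A."""
    return p[("A", "B")] + p[("A", "notB")]

def prob_B(p):
    """Unconditional probability of B."""
    return p[("A", "B")] + p[("notA", "B")]

def prob_B_given_A(p):
    """Probability of B when A is known to occur."""
    return p[("A", "B")] / prob_A(p)

# Multiplication rule: p_AB = p_A * p_B(A).
print(prob_A(p) * prob_B_given_A(p) == p[("A", "B")])
# The events are not independent here, since p_B(A) differs from p_B.
print(prob_B_given_A(p), prob_B(p))
```

In this invented example $p_B(A) = 1/4$ while $p_B = 3/10$, so the ordinary product rule $p_{AB} = p_A p_B$ would give a wrong result.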
This rule can be applied to a mechanical system of N particles
in the following way.
Let A signify that q particles are in given elements of phase
space; the probability of A can be written
$$p_A = f_q\,dx^{(1)}d\xi^{(1)}\dots dx^{(q)}d\xi^{(q)}. \tag{3}$$
Let B mean that the element q+1 is occupied. Then AB expresses
that all q+1 elements are occupied, or
$$p_{AB} = f_{q+1}\,dx^{(1)}d\xi^{(1)}\dots dx^{(q+1)}d\xi^{(q+1)}. \tag{4}$$
Hence $p_B(A)$, the probability for the element q+1 being occupied,
if q particles are in given elements, is
$$p_B(A) = \frac{p_{AB}}{p_A} = \frac{f_{q+1}}{f_q}\,dx^{(q+1)}d\xi^{(q+1)}. \tag{5}$$
If this is summed over all possible positions and velocities of the
last particle (q+1), the result is equal to the number of particles
excluding the q fixed ones, N-q; hence, with the normalization
described in the text, (6.42) and (6.43),
$$(N-q)\,f_q = \iint f_{q+1}\,dx^{(q+1)}d\xi^{(q+1)} = \chi_{q+1}f_{q+1}, \tag{6}$$
which is the formula (6.40) of the text.
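In order to construct the equation (6.44) for the rate of change
of $f_q$, one has to introduce a generalized distribution function
which depends not only on the position x and the velocity $\xi$
but also on the acceleration $\eta$ of the particles; the probability
for a set of q particles to be in the element
$$\prod_{i=1}^{q} dx^{(i)}d\xi^{(i)}d\eta^{(i)}$$
shall be denoted by
$$g_q(t, x^{(1)}, \xi^{(1)}, \eta^{(1)}, \dots, x^{(q)}, \xi^{(q)}, \eta^{(q)})\prod_{i=1}^{q} dx^{(i)}d\xi^{(i)}d\eta^{(i)}.$$
One has obviously
$$f_q = \int\dots\int g_q\prod_{i=1}^{q} d\eta^{(i)}. \tag{7}$$
Now the motion of the molecules follows causal laws; hence
the probability $f_q$ of a configuration in x, $\xi$-space at a time t must
be the same as that at the time $t+\delta t$ of that configuration which
is obtained from the first by substituting $x^{(i)}+\xi^{(i)}\delta t$ and $\xi^{(i)}+\eta^{(i)}\delta t$
for $x^{(i)}$ and $\xi^{(i)}$.
Hence (7) leads to
$$\int\dots\int\Big\{\frac{\partial g_q}{\partial t} + \sum_{i=1}^{q}\Big(\xi^{(i)}\cdot\frac{\partial g_q}{\partial x^{(i)}} + \eta^{(i)}\cdot\frac{\partial g_q}{\partial\xi^{(i)}}\Big)\Big\}\prod_{j=1}^{q} d\eta^{(j)} = 0. \tag{8}$$
The integration in the first two terms can be performed with
the help of (7); that of the last leads to the integral
$$\int\dots\int\eta^{(i)}g_q\prod_{j=1}^{q} d\eta^{(j)} = f_q\,\bar\eta^{(i)}, \tag{9}$$
where the symbol $\bar\eta^{(i)}$ is evidently the mean acceleration.
Hence one obtains from (8)
$$\frac{\partial f_q}{\partial t} + \sum_{i=1}^{q}\Big\{\xi^{(i)}\cdot\frac{\partial f_q}{\partial x^{(i)}} + \frac{\partial}{\partial\xi^{(i)}}\cdot\big(f_q\,\bar\eta^{(i)}\big)\Big\} = 0. \tag{10}$$
The final step consists in using the laws of mechanics for determining
$\bar\eta^{(i)}$. The equations of motion are (force $P^{(i)}$)
$$m\,\eta^{(i)} = P^{(i)} = -\frac{\partial\Phi^{(i)}}{\partial x^{(i)}} - \sum_{j\ne i}\frac{\partial\Phi^{(ij)}}{\partial x^{(i)}}. \tag{11}$$
Now the function $f_q$ refers to the case where the positions and
velocities of q particles are given, the others unknown. Hence
one has to split the sum (11) into two parts, the first referring to
the given particles, the second to the rest. For this rest the
probability of finding a particle in a given element q+1 is known,
namely $(f_{q+1}/f_q)\,dx^{(q+1)}d\xi^{(q+1)}$; hence the average of this sum can
be determined by integrating over $dx^{(q+1)}d\xi^{(q+1)}$, i.e. by applying
the operator $\chi_{q+1}$. In this way the mean acceleration is found to be
$$m\,\bar\eta^{(i)} = -\frac{\partial\Phi^{(i)}}{\partial x^{(i)}} - \sum_{j=1,\,j\ne i}^{q}\frac{\partial\Phi^{(ij)}}{\partial x^{(i)}} - \frac{1}{f_q}\,\chi_{q+1}\Big(f_{q+1}\frac{\partial\Phi^{(i,q+1)}}{\partial x^{(i)}}\Big). \tag{12}$$
Substituting this in (10), one obtains
$$\frac{\partial f_q}{\partial t} + \sum_{i=1}^{q}\Big\{\xi^{(i)}\cdot\frac{\partial f_q}{\partial x^{(i)}} - \frac{1}{m}\Big(\frac{\partial\Phi^{(i)}}{\partial x^{(i)}} + \sum_{j=1,\,j\ne i}^{q}\frac{\partial\Phi^{(ij)}}{\partial x^{(i)}}\Big)\cdot\frac{\partial f_q}{\partial\xi^{(i)}}\Big\} = \frac{1}{m}\sum_{i=1}^{q}\chi_{q+1}\Big(\frac{\partial\Phi^{(i,q+1)}}{\partial x^{(i)}}\cdot\frac{\partial f_{q+1}}{\partial\xi^{(i)}}\Big), \tag{13}$$
which is easily confirmed to be identical with the formulae (6.44),
(6.45) of the text.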
23. (VI. p. 69.) Derivation of the collision integral from the
general theory of fluids
From the standpoint of statistical theory a fluid differs from
a solid by the absence of a long-range order, so that for two events
A and B happening a long distance apart one has, with the
notations of Appendix, 22, $p_{AB} = p_A\,p_B$; for instance, for large
$|x^{(2)} - x^{(1)}|$ one has $f_2(x^{(1)}, x^{(2)}) = f_1(x^{(1)})\,f_1(x^{(2)})$, while in solids
this is not the case.
The distinction between liquid and gas is not so sharp and
may even be said to disappear above the critical state. However,
if one is not specially concerned with these intermediate conditions
there is a wide region where liquid and gas can be distinguished
by the extreme difference of density. From the
atomistic standpoint this has to be formulated thus:
The potential energy $\Phi(x^{(i)}, x^{(j)})$ between two molecules at
$x^{(i)}$ and $x^{(j)}$ decreases rapidly with the distance between their
mass centres, and (except in the case of ions, where Coulomb
forces act) a distance $r_0$, small by macroscopic standards, may
be specified, beyond which the interaction may without error
be assumed to vanish completely. In a liquid proper, there are
many molecules within this distance $r_0$ of a given molecule; in a
gas there are usually none, and the probability that there is more
than one is very small, except near condensation. The neglect
of this small probability is equivalent to the assumption of
'binary encounters' in gas-theory. Green has shown that when
this assumption is made, on taking q = 1 in the equations (6.44)
and (6.45) of the text,
$$\frac{\partial f_1}{\partial t} = [H_1, f_1] + S_1, \tag{1}$$
one obtains Boltzmann's collision equations (6.46), (6.47).
To prove this we first work out the expression $S_1$ using the
definition (6.5) of the Poisson bracket and of the operator $\chi$
(see also 22.13). With the assumption of binary encounters $f_2$
can be expressed in terms of $f_1$ by using the mechanical laws of
collision.
Consider the motion of two molecules which at time t have
positions $x^{(1)}$, $x^{(2)}$, such that $|x^{(2)} - x^{(1)}| < r_0$, and velocities
$\xi^{(1)}$, $\xi^{(2)}$, while at time $t_0$ (< t), when the molecules were last at a
distance $r_0$ from one another, their positions and velocities were
$x_0^{(1)}$, $x_0^{(2)}$ and $\xi_0^{(1)}$, $\xi_0^{(2)}$. The configurational probability
$$f_2(t, x^{(1)}, x^{(2)}, \xi^{(1)}, \xi^{(2)})\,dx^{(1)}dx^{(2)}d\xi^{(1)}d\xi^{(2)}$$
must remain unchanged during the interval $(t_0, t)$ as the motion
follows a causal law; also, by Liouville's theorem, the volume in
phase space $dx^{(1)}dx^{(2)}d\xi^{(1)}d\xi^{(2)}$ is unaltered. Since, as explained
above, molecular events in fluids which occur beyond the range
of interaction must be considered independent, one has
$$f_2(t, x^{(1)}, x^{(2)}, \xi^{(1)}, \xi^{(2)}) = f_1(t_0, x_0^{(1)}, \xi_0^{(1)})\,f_1(t_0, x_0^{(2)}, \xi_0^{(2)}). \tag{4}$$
Next one introduces an approximate assumption which is
always made in gas-theory, that $t_0$, $x_0^{(1)}$, $x_0^{(2)}$ may be replaced by
t, $x^{(1)}$ and $x^{(2)}$ on the right-hand side of (4) (but of course not
$\xi_0^{(1)}$, $\xi_0^{(2)}$ by $\xi^{(1)}$, $\xi^{(2)}$). As $r_0$ is very small the resulting error is of
microscopic order; nevertheless it is not without importance,
for it allows small deviations from Maxwell's velocity distribution
law (and other 'fluctuations'), which would otherwise be
unexplainable, as this law is a rigorous consequence of Boltzmann's
collision equation in equilibrium conditions.
It remains to calculate $\xi_0^{(1)}$ and $\xi_0^{(2)}$ in terms of $\xi^{(1)}$, $\xi^{(2)}$ and
$r = x^{(2)} - x^{(1)}$, which can be done by using the canonical equations
of motion or their independent integrals (conservation of energy,
momentum, and angular momentum). The resulting formulae
are the same as used in Boltzmann's theory (see Appendix, 16).
The reduction of $S_1$ can be performed without making use of
explicit expressions. One has only to remark that
$$f_2 = f_1(\xi_0^{(1)})\,f_1(\xi_0^{(2)})$$
now satisfies the equation
$$[H_2, f_2] = 0, \tag{5}$$
where
$$H_2 = \frac{m}{2}\big(\xi^{(1)2} + \xi^{(2)2}\big) + \Phi(r) \tag{6}$$
is the Hamiltonian of the two particles which are considered to
move independently of all the others. Now (5) becomes
We integrate this over $dx^{(2)}d\xi^{(2)}$; then the term with $\partial f_2/\partial\xi^{(2)}$ on
the right-hand side vanishes, because there are no particles with
infinite velocities. The other term, with $\partial f_2/\partial\xi^{(1)}$, becomes identical
with $mS_1$ according to (3), since
ao ao
Hence, with (4),
$$S_1 = \iint(\xi^{(2)} - \xi^{(1)})\cdot\frac{\partial}{\partial r}\{f_1(\xi_0^{(1)})\,f_1(\xi_0^{(2)})\}\,dr\,d\xi^{(2)}, \tag{8}$$
where the domain of integration over r may be limited by the
sphere of radius $r_0$ surrounding $x^{(1)}$.
This integration can be performed by imagining the sphere to
be partitioned by elementary tubes parallel to the relative
velocity $\xi^{(2)} - \xi^{(1)}$; one may then integrate, first over a typical
tube specified by the cross-section radius b, perpendicular from
the centre of the sphere to the tube (see Appendix, 16), and then
over all values of b. At the beginning of the tube, where
the interaction between the molecules is negligible, and the
functions giving $\xi_0^{(1)}$ and $\xi_0^{(2)}$ in terms of $\xi^{(1)}$, $\xi^{(2)}$, and r reduce to
$\xi^{(1)}$ and $\xi^{(2)}$. At the end of the tube the values $\xi^{(1)\prime}$ and $\xi^{(2)\prime}$ of these
functions have to be calculated from the collision integrals, just
as in Boltzmann's theory. Thus one obtains
>flg, (9)
which is identical with (6.47) of the text and the collision integral
in (16.8).
This derivation is not more complicated than Boltzmann's
original one and is preferable because it reveals clearly the
assumptions made.
24. (VII. p. 72.) Irreversibility in fluids
A rigorous proof of the irreversibility in dense matter from the
classical standpoint seems to be very difficult, or at least extremely
tedious. Green has, however, suggested a derivation
which, though not quite rigorous, is plausible enough and
certainly based on reasonable approximations.
It has to be shown that the entropy S defined by (7.1) never
decreases in time, so that
satisfies the equation
1 ), (2)
which expresses that one particle of unknown position and
velocity is added to a system of N particles.
If $\Phi$ is the total potential energy between the N particles and
$\Phi^{(i,N+1)}$ that between the ith of these and the additional particle,
one has
8t ~
<l il
(3)
If this is substituted in (1) the integrals of the first two terms
vanish on transformation to surface integrals. In the last sum
all terms contribute the same, as $f_N$ and $f_{N+1}$ can be assumed to
be symmetric in regard to all particles. Hence
Now the reasoning follows very closely that of Appendix, 23,
where (23.3) was transformed into the integrable expression
(23.8) with the help of the identity (23.7).
For this purpose one introduces instead of the velocities of the
two particles (1) and (N+1) appearing explicitly in (4) new
variables, namely their total momentum m, two components
of the relative angular momentum a, and the relative energy w,
m = m( 1 >+^+ 1 >), a = m(x^+ 1 >x< 1 >) A
w =
and regards $f_{N+1}$ as a function of these, so that
$$f_{N+1} = f_{N+1}(t, x^{(1)}, \dots, x^{(N+1)}, \xi^{(2)}, \dots, \xi^{(N)}, m, a, w).$$
Then by direct differentiation it can be verified that
_ / g/^ a/ A7+1 \
"" vs ' s ; '\ax^ +1 > ax^ +1 v'
an equation similar to (23.7).
^f ^A(i^+i)
If JN+I u  j s taken from it and substituted in (4), the
dt> (l) dx (1)
only term which does not vanish is found to be
x (OTD_(D) . rfx0^^...rfx^+ 1 5^+ (6)
m, a, w are parameters specifying the trajectories which would
be followed by the particles numbered (1) and (N+1) if no other
particles were present. Now one can apply the same reasoning
as in Appendix, 23, partitioning the $x^{(N+1)}$ domain by tubes
formed by the trajectories of (N+1) relative to (1), where
m, a, w are constant, and one can perform the integration with
respect to $x^{(N+1)}$ first along such a tube, then over all values of
the cross-section b. At each end of the trajectory where the
interaction $\Phi^{(1,N+1)}$ can be neglected, the function $f_{N+1}$ would
factorize into $f_1^{(N+1)}f_N$, provided no other particle were near to
the particle (N+1).
This is, of course, not the case; but it seems to be reasonable to
assume that the factorization is at least approximately correct
as the action of the rest will nearly cancel. This is the simplification
made by Green. It is clear that it could be corrected by a
more detailed consideration; but let us be content with it.
Since the sphere around $x^{(1)}$ in which $\Phi^{(1,N+1)}$ is effectively
different from zero is of microscopic dimensions, the values of
$x^{(N+1)}$ and $x^{(1)}$ need not be distinguished, nor the instants when
these points are reached. The initial velocities $\xi^{(1)\prime}$, $\xi^{(N+1)\prime}$ must,
however, be determined from the actual final velocities from the
'conservation' law, i.e. the definitions (5) for constant m, a, w:
m(x< A 5 (1 >'
If the integration in (5) is performed as described, one obtains
=m J (2 " t2) J
x (tf+D_(D dbdxdx^\..dx^d^\..d^^\ (7)
where instead of $x^{(1)}$ the centre $x = \tfrac12(x^{(1)} + x^{(N+1)})$ is introduced.
Here $f_1^{(N+1)}$ means $f_1(x^{(N+1)}, \xi^{(N+1)})$ which can be replaced, according
to formula (6.40) of the text, by
1 f f
If this is introduced into (7) one has an integral over
variables, where the integrand contains the factor f^fz/^/^
By repeating the procedure one can transform (7) into the
expression
f  pr^p / J
x 
where $F_N$ is the function obtained by replacing the variables
$x^{(i)}$ and $\xi^{(i)}$ in $f_N$ by $x^{(i+N)}$ and $\xi^{(i+N)}$ respectively.
Now one can apply the same transformations as for gases,
as explained in Appendix, 17, which lead from (17.3) to (17.4),
exchanging the dashed and undashed variables, and exchanging
the two groups (1, 2, ..., N) and (1+N, 2+N, ..., 2N). As it is
obvious that the integral is invariant for these changes, one
obtains
f  pra J '"' J "*(f )<^/n> x
x 5v+i>_5<i> dbdJid^...d^^d^ N +^...d^^d^...d^>, (8)
which makes it clear that dS/dt is positive or zero, and that the
latter happens only if
$$f_N F_N = f'_N F'_N. \tag{9}$$
The solution of this equation leads again essentially to the
canonical distribution. I shall, however, not reproduce the
derivation but refer the reader to the original papers:
M. Born and H. S. Green, Nature, 159, pp. 251, 738 (1947).
M. Born and H. S. Green, Proc. Roy. Soc. A, 188, p. 10 (1946);
190, p. 455 (1947); 191, p. 168 (1947); 192, p. 166 (1948).
H. S. Green, ibid. 189, p. 103 (1947); 194, p. 244 (1948).
The reader may compare this involved and, in spite of the
complication, not quite rigorous derivation from classical theory
with the simple and straightforward proof from quantum theory
given in section IX.
I wish to add an argument, also due to Green, which shows
that once the increase of entropy is secured the distribution
approaches the canonical one. The latter is given by
flre*. a = ^. = !, (10)
A is the free energy and E the energy, given by
E = Jm f ($ (i) u<*>)2
(12)
$u^{(i)}$ being the macroscopic velocity at the point $x^{(i)}$.
Let the actual distribution be
one has
jjf N dxd% = Nl, jjf N E dxd% = Nl 7,
where U is the internal energy, and the same holds, of course, for
/^, so that
jjf' N dxd$ = 0, jjf' N Edxd$ = 0. (13)
Then
Hence
= ~Fi JJ {^
Here the terms linear in f' N vanish in virtue of (13), and one
obtains *
This shows that an increase in the value of S requires a decrease
in the average value of $|f'_N|$ and therefore an approach to the
canonical distribution.
25. (VIII. p. 75.) Atomic physics
It seems impossible to supplement this and the following
sections, which deal with atomic physics in general, by appendixes
in the same way as before. The reader must consult
the literature; he will find a condensed account of these things
in my own book Atomic Physics (Blackie & Son, Glasgow;
4th edition 1948), which is constructed in a similar way to the
present lectures; the text uses very little mathematics, while a
series of appendixes contain short and rigorous proofs of the
theorems used. For instance, Einstein's law of the equivalence
of mass and energy is dealt with in Chapter III, 2, p. 52,
and a short derivation of the formula $E = mc^2$ given in A. Ph.
Appendix VII, p. 288. Whenever in the following sections I
wish to direct the reader to a section or appendix of my other
book, an abbreviation like (A. Ph. Ch. III. 2, p. 52; A. VII,
p. 288) is used.
26. (VIII. p. 77.) The law of equipartition
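If the Hamiltonian has the form (8.5), or
$$H = \frac{a}{2}\,\xi^2 + H', \tag{1}$$
where H' does not depend on $\xi$, one has for the average of
$\epsilon = \tfrac{a}{2}\xi^2$ in a canonical assembly
$$\bar\epsilon = \frac{\int_{-\infty}^{\infty}\epsilon\,e^{-\beta\epsilon}\,d\xi}{\int_{-\infty}^{\infty}e^{-\beta\epsilon}\,d\xi},$$
as all other integrations in numerator and denominator cancel.
Now this can be written
$$\bar\epsilon = -\frac{d}{d\beta}\log Z,\qquad Z = \int_{-\infty}^{\infty}e^{-\beta\epsilon}\,d\xi. \tag{2}$$
If the integration variable $\eta = \xi\sqrt{\beta a/2}$ is introduced one gets
$$Z = A\,\beta^{-1/2},$$
where A is a constant. Hence $\log Z = \text{const.} - \tfrac12\log\beta$ and
$$\bar\epsilon = \frac{1}{2\beta} = \tfrac12 kT, \tag{3}$$
in agreement with (8.6).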
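The result can be confirmed by direct numerical integration. The sketch below is only an illustration: the function name, the midpoint rule and the chosen values of a and $\beta$ are assumptions. It forms the canonical average of a quadratic energy term and shows that the result is $1/2\beta$ whatever the coefficient a may be.

```python
import math

def mean_quadratic_energy(a, beta, half_width=12.0, steps=20000):
    """Canonical average of eps = a xi^2 / 2 with weight exp(-beta eps).

    Both integrals are approximated by the midpoint rule over a range of
    +/- half_width standard deviations of the Gaussian weight.
    """
    sigma = 1.0 / math.sqrt(beta * a)
    left = -half_width * sigma
    h = 2.0 * half_width * sigma / steps
    numerator = 0.0
    denominator = 0.0
    for i in range(steps):
        xi = left + (i + 0.5) * h
        eps = 0.5 * a * xi * xi
        weight = math.exp(-beta * eps)
        numerator += eps * weight
        denominator += weight
    return numerator / denominator

beta = 2.0                      # illustrative value of 1/kT (arbitrary units)
for a in (0.3, 1.0, 5.0):       # the coefficient a drops out of the average
    print(a, mean_quadratic_energy(a, beta), 1.0 / (2.0 * beta))
```

That the coefficient a disappears from the average is the content of the law of equipartition: every quadratic term of the Hamiltonian contributes the same mean energy.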
27. (VIII. p. 91.) Operator calculus in quantum mechanics
The failure of matrix mechanics to deal with aperiodic motions,
continuous spectra, was less a matter of conception than of
practical methods. An indication of using integral operators
instead of matrices is contained in a paper by M. Born,
W. Heisenberg, and P. Jordan, Z. f. Phys. 35, 557 (1926), which
follows immediately after Heisenberg's first publication. The
idea that physical quantities correspond to linear operators in
general acting on functions was suggested by M. Born and
N. Wiener, Journ. Math. and Phys. 5, 84 (1926) and Z. f. Phys.
36, 174 (1926), where in particular operators of the form
T
were used. Here the kernel q(t, s) is a 'continuous matrix', also
introduced by Dirac. This paper contains also the representation
of special quantities by differential operators (with respect to
time) which satisfy identically the commutation law between
energy and time $Et - tE = i\hbar$.
Schrödinger's discovery, which was made quite independently,
consists in using a representation where the coordinates are
multiplication operators and the momenta differential operators,
so that the commutation laws
are identically satisfied. This opened the way to finding the
relation between matrix mechanics and wave mechanics and
to the later development of the general transformation theory of
quantum mechanics which is brilliantly represented in Dirac's
famous book.
The early development of quantum mechanics as represented
in textbooks has become rather legendary. To mention a few
instances: the matrices and the commutation law [q, p] = 1
which are traditionally called Heisenberg's, are not explicitly
contained in his first publication: W. Heisenberg, Z. f. Phys.
33, 879 (1925); his formulae correspond only to the diagonal
terms of the commutator. The complete formulae in matrix
notation are in the paper by M. Born and P. Jordan, Z. f. Phys.
34, 858 (1925). Further, the perturbation theory of quantum
mechanics, traditionally called Schrödinger's, is contained
already in the next publication of Heisenberg, Jordan, and
myself (quoted at the beginning), not only for matrices, but also
for vectors on which these matrices operate, and not only for
simple eigenvalues, but also degenerate systems. The only
difference of Schrödinger's derivation is that he starts from a
representation with continuous wave functions which he abandons
at once in favour of a discontinuous one (by a Fourier
transformation).
28. (IX. p. 94.) General formulation of the uncertainty
principle
The derivation of the most general form of the uncertainty
principle can be found in my book (A. Ph. A. XXII, p. 326).
As it is fundamental for the reasoning in these lectures, I shall
give it here in a little more abstract form.
We assume that for a complex operator C = A+iB and its
conjugate C* = A-iB the mean value of the product CC* is
real and not negative:
$$\overline{CC^*} \ge 0, \tag{1}$$
where the bar indicates any form of linear averaging, as described
in the text. Then writing $\lambda B$ instead of B, where $\lambda$ is a real
parameter, one has
$$\overline{(A+i\lambda B)(A-i\lambda B)} = \overline{A^2} + \overline{B^2}\,\lambda^2 + \hbar\,\overline{[A,B]}\,\lambda \ge 0, \tag{2}$$
where the abbreviation (9.4)
$$[A,B] = \frac{1}{i\hbar}(AB - BA) \tag{3}$$
is used. As the left-hand side of (2) is real and also the first two
terms on the right, it follows that $\overline{[A,B]}$ is real. The minimum
of the quadratic expression in $\lambda$, given by (2), occurs when
$$\lambda = -\frac{\hbar\,\overline{[A,B]}}{2\,\overline{B^2}},$$
and it is equal to
$$\overline{A^2} - \frac{\hbar^2\,\overline{[A,B]}^2}{4\,\overline{B^2}}.$$
Hence
$$\overline{A^2}\cdot\overline{B^2} \ge \frac{\hbar^2}{4}\,\overline{[A,B]}^2. \tag{4}$$
Now replace A by $A - \bar A$ and B by $B - \bar B$. As $\bar A$, $\bar B$ are numbers
and commute with A and B, the commutator [A, B] remains
unchanged. Putting, as in (9.2),
$$\delta A = \{\overline{(A-\bar A)^2}\}^{1/2},\qquad \delta B = \{\overline{(B-\bar B)^2}\}^{1/2},$$
one obtains from (4) the formula (9.3) of the text,
$$\delta A\cdot\delta B \ge \frac{\hbar}{2}\,\big|\overline{[A,B]}\big|, \tag{5}$$
and as [q, p] = 1, especially (9.5),
$$\delta p\cdot\delta q \ge \frac{\hbar}{2}. \tag{6}$$
This derivation reveals the simple algebraic root of the uncertainty
relation. But it is not superfluous at all to study the
meaning of this relation for special cases; simple examples can
be found in A. Ph. A. XII, p. 296, A. XXXII, p. 357, and in
many other books, for instance, Heisenberg, The Physical
Principles of the Quantum Theory.
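The general inequality can be tried out on the simplest non-commuting quantities. The following sketch rests on assumptions that are not in the text: the operators are the 2 x 2 Pauli matrices, the linear average is taken as $\psi^\dagger M\psi$ for a normalized two-component state, and all function names are illustrative. Since the normalized bracket is the commutator divided by $i\hbar$, the right-hand side of the general relation equals half the modulus of the mean of AB - BA, so that $\hbar$ drops out of the check.

```python
import cmath
import math

SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]

def matmul(X, Y):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mean(M, psi):
    """Linear average of M in the normalized state psi (assumed form)."""
    return sum(psi[i].conjugate() * M[i][j] * psi[j]
               for i in range(2) for j in range(2))

def spread(M, psi):
    """Root mean square deviation of a hermitian M from its mean."""
    m = mean(M, psi).real
    m2 = mean(matmul(M, M), psi).real
    return math.sqrt(max(m2 - m * m, 0.0))

def uncertainty_margin(A, B, psi):
    """Left-hand side minus right-hand side of the uncertainty relation."""
    AB, BA = matmul(A, B), matmul(B, A)
    commutator = [[AB[i][j] - BA[i][j] for j in range(2)] for i in range(2)]
    return spread(A, psi) * spread(B, psi) - 0.5 * abs(mean(commutator, psi))

def state(theta, phi):
    """A normalized two-component state parametrized by two angles."""
    return [complex(math.cos(theta / 2)), cmath.exp(1j * phi) * math.sin(theta / 2)]

for theta in (0.0, 0.7, 1.3, 2.9):
    for phi in (0.0, 0.4, 2.0):
        print(theta, phi, uncertainty_margin(SIGMA_X, SIGMA_Y, state(theta, phi)))
```

The printed margins are never negative beyond rounding error, and they vanish for the state with theta = 0, where the inequality becomes an equality.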
29. (IX. p. 97.) Dirac's derivation of the Poisson brackets
in quantum mechanics
It is fashionable today to represent quantum mechanics in an
axiomatic way without explaining why just these axioms have
been chosen, justifying them only by the success. I think that
no real understanding of the theory can be obtained in this way.
One must follow to some degree the historical development and
learn how things have actually happened. Now the decisive
fact was the conviction held by theoretical physicists that many
features of Hamiltonian mechanics must be right, in spite of the
fundamentally different aspect of quantum theory. This conviction
was based on the surprising successes of Bohr's principle
of correspondence. In fact, the solution of the problem consisted
in preserving the formalism of Hamiltonian mechanics as a whole
with the only modification that the physical quantities are to be
represented by non-commuting quantities.
If this is accepted, there is a most elegant consideration of
Dirac which leads in the shortest way to the rule for translating
formulae of classical mechanics into quantum mechanics. It
starts from the fact that classical mechanics can be condensed
into the equation
$$\frac{df}{dt} = \frac{\partial f}{\partial t} - [H, f] = 0, \tag{1}$$
which any function f(t, q, p) representing a quantity carried by
the motion must satisfy. Here the Poisson bracket is used
$$[\xi, \eta] = \sum_r\Big(\frac{\partial\xi}{\partial q_r}\frac{\partial\eta}{\partial p_r} - \frac{\partial\xi}{\partial p_r}\frac{\partial\eta}{\partial q_r}\Big). \tag{2}$$
If (1) is to be generalized for non-commuting quantities, it is
necessary to consider how the Poisson bracket should be translated
into the new language.
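Dirac uses the fact that these brackets have a series of formal
properties, namely
$$[\xi, \eta] = -[\eta, \xi],\qquad [\xi, c] = 0, \tag{3}$$
where c is a constant; further
$$[\xi_1+\xi_2, \eta] = [\xi_1, \eta] + [\xi_2, \eta],\qquad [\xi, \eta_1+\eta_2] = [\xi, \eta_1] + [\xi, \eta_2], \tag{4}$$
and finally
$$[\xi_1\xi_2, \eta] = [\xi_1, \eta]\,\xi_2 + \xi_1[\xi_2, \eta],\qquad [\xi, \eta_1\eta_2] = [\xi, \eta_1]\,\eta_2 + \eta_1[\xi, \eta_2]. \tag{5}$$
Here the factors are written in a definite order, though in
classical mechanics this does not matter. We have to do so if we
want to use these expressions for non-commuting quantities,
and the rule followed is simply to leave the order of factors
unchanged.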
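The question is, What do the brackets mean in this case? To see
this, we form the bracket $[\xi_1\xi_2, \eta_1\eta_2]$ in two ways, using the two
formulae (5) first in one order, then in the opposite one. Then
$$[\xi_1\xi_2, \eta_1\eta_2] = [\xi_1, \eta_1\eta_2]\,\xi_2 + \xi_1[\xi_2, \eta_1\eta_2]$$
$$= [\xi_1,\eta_1]\,\eta_2\xi_2 + \eta_1[\xi_1,\eta_2]\,\xi_2 + \xi_1[\xi_2,\eta_1]\,\eta_2 + \xi_1\eta_1[\xi_2,\eta_2],$$
and in the same way
$$[\xi_1\xi_2, \eta_1\eta_2] = [\xi_1\xi_2, \eta_1]\,\eta_2 + \eta_1[\xi_1\xi_2, \eta_2]$$
$$= [\xi_1,\eta_1]\,\xi_2\eta_2 + \xi_1[\xi_2,\eta_1]\,\eta_2 + \eta_1[\xi_1,\eta_2]\,\xi_2 + \eta_1\xi_1[\xi_2,\eta_2].$$
Equating these two expressions one obtains
$$[\xi_1,\eta_1](\xi_2\eta_2 - \eta_2\xi_2) = (\xi_1\eta_1 - \eta_1\xi_1)[\xi_2,\eta_2]. \tag{6}$$
As $[\xi_1,\eta_1]$ must be independent of $\xi_2$, $\eta_2$ and vice versa, it follows
$$\xi_1\eta_1 - \eta_1\xi_1 = \lambda[\xi_1,\eta_1],\qquad \xi_2\eta_2 - \eta_2\xi_2 = \lambda[\xi_2,\eta_2], \tag{7}$$
where $\lambda$ is independent of all four quantities and commutes with
$\xi_1\eta_1 - \eta_1\xi_1$ and $\xi_2\eta_2 - \eta_2\xi_2$. Hence $\lambda$ is a number. That it must
be purely imaginary, $\lambda = i\hbar$, cannot be derived in such a formal
way; but it follows from considerations like those used in the
previous appendix, where it is shown that a reasonable definition
of averages implies (28.3) that
$$[\xi, \eta] = \frac{1}{i\hbar}(\xi\eta - \eta\xi) \tag{8}$$
is real. Thus it is established that the Poisson brackets in
quantum theory correspond to properly normalized commutators.
If one inserts in the classical expression (2) for $\xi$ and $\eta$ a
coordinate or a momentum, one finds
$$[q_r, q_s] = 0,\qquad [p_r, p_s] = 0,\qquad [q_r, p_s] = \delta_{rs}, \tag{9}$$
where $\delta_{rr} = 1$, $\delta_{rs} = 0$ for $r \ne s$.
The same relations (9) must be postulated to hold in quantum
mechanics. In this way the fundamental commutation laws are
obtained.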
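That the commutator really possesses the formal properties demanded of the bracket, with the order of factors preserved, can be verified on any set of non-commuting matrices. The sketch below is an illustration only: the four integer matrices are arbitrary choices, the helper names are hypothetical, and the constant factor $1/i\hbar$ is left out since it does not affect the product rule.

```python
def matmul(X, Y):
    """Matrix product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def sub(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def bracket(X, Y):
    """Commutator XY - YX, i.e. the quantum bracket without the factor 1/(i hbar)."""
    return sub(matmul(X, Y), matmul(Y, X))

# Arbitrary non-commuting integer matrices (assumed for illustration).
X1 = [[1, 2], [0, 3]]
X2 = [[0, 1], [4, -1]]
Y1 = [[2, 0], [1, 1]]
Y2 = [[1, -2], [3, 0]]

# Product rule in the first argument, order of factors unchanged.
lhs_first = bracket(matmul(X1, X2), Y1)
rhs_first = add(matmul(bracket(X1, Y1), X2), matmul(X1, bracket(X2, Y1)))

# Product rule in the second argument.
lhs_second = bracket(X1, matmul(Y1, Y2))
rhs_second = add(matmul(bracket(X1, Y1), Y2), matmul(Y1, bracket(X1, Y2)))

print(lhs_first == rhs_first, lhs_second == rhs_second)
```

Integer entries are used so that the two sides can be compared exactly; the same identities hold for matrices of any size.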
30. (IX. p. 100.) Perturbation theory for the density matrix
We consider the problem of solving the equation
$$\frac{\partial\rho}{\partial t} = [H, \rho],\qquad H = H_0 + V, \tag{1}$$
where the perturbation function V is small.
The method is essentially the same as that used for the
corresponding problem in matrix or wave mechanics.
Assume that $\lambda$ represents the eigenvalues of a complete set
of integrals $\Lambda$ of $H_0$, so that $[H_0, \Lambda] = 0$ and $H_0$ becomes
diagonal in the $\lambda$-representation; put
$$H_0(\lambda, \lambda) = E,\quad H_0(\lambda', \lambda') = E',\quad\text{while } H_0(\lambda, \lambda') = 0 \text{ for } \lambda \ne \lambda'. \tag{2}$$
Introduce instead of $\rho$ and V the functions $\sigma$ and U given by
Then one has
f
F(A,A') = '
i 8t (i dt
(H p~ P H )(\,X f ) = (JS/~^)a(A,A'
Hence the equation (1) reduces to
 I {U(\ 9 A>(A", A')a(A, A")?7(A^, A')}. (4)
A*
Now assume that $\sigma$ is expanded in a series
$$\sigma = \sigma_0 + \sigma_1 + \sigma_2 + \dots, \tag{5}$$
where $\sigma_0$ is diagonal and independent of the time and $\sigma_1, \sigma_2, \dots$
of order 1, 2, ... in the perturbation. Then one obtains
hence ^(A, A') = *(A, A'){a (A')*oW} (6)
where
t t
u(\ A') = 1 f Z7(A, A') dt =  f F(A, A')e^/^r^ <fc. (7)
ft J J
o o
It follows for the diagonal elements from (6) that
a!(A,A) = 0. (8)
The next approximation cr 2 has to satisfy the equation
", A')}
We need only the diagonal elements; for these one has
>(A', AXa (A')a (A)},
which gives by integration
a a (A, A) =  M (A, A')  2 {a (A')a (A)}, (9)
since, according to (7), $u(\lambda, \lambda')$ is hermitian, $u(\lambda, \lambda') = u^*(\lambda', \lambda)$,
and vanishes for t = 0.
It is seen from (3) that the diagonal elements of $\rho$ and $\sigma$ are
identical; they represent the probability $P(t, \lambda)$ of finding the
system at time t in the state $\lambda$. Now (5), (8), and (9) give, in
agreement with (9.25) of the text,
P(t,X) = P(A)+
where
f F(A,A'
(10)
(11)
When $V(\lambda, \lambda')$ is independent of the time one can perform the
integration, with the result
J(M') = KlW
(12)
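Now the function
$$\frac{1}{2\pi t}\left|\frac{e^{iyt} - 1}{iy}\right|^2 = \frac{1 - \cos yt}{\pi t y^2}$$
behaves for large t like a Dirac $\delta$-function, i.e. its integral over y
is equal to 1 if the interval of integration $\Delta y$ includes y = 0
and if $t\,\Delta y \gg 1$.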
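The approach to a $\delta$-function can be seen numerically. The sketch below assumes the normalized form $(1 - \cos yt)/(\pi t y^2)$ of the function; the function names, the midpoint rule and the sample values of t and of the interval are illustrative choices. It integrates the function over a symmetric interval around y = 0 for a large and for a small value of the product of t and the interval width.

```python
import math

def kernel(y, t):
    """(1 - cos(y t)) / (pi t y^2), with the limiting value t / (2 pi) at y = 0."""
    if abs(y * t) < 1e-8:
        return t / (2.0 * math.pi)
    return (1.0 - math.cos(y * t)) / (math.pi * t * y * y)

def integral_over_interval(t, half_width, steps=100000):
    """Midpoint-rule integral of the kernel over (-half_width, +half_width)."""
    h = 2.0 * half_width / steps
    return h * sum(kernel(-half_width + (i + 0.5) * h, t) for i in range(steps))

# Large t times interval width: the integral is close to 1.
print(integral_over_interval(200.0, 5.0))
# Small t times interval width: the integral is far from 1.
print(integral_over_interval(0.1, 5.0))
```

The deficit from unity in the first case is of the order of the reciprocal of t times the interval width, which is the sense in which the function acts as a $\delta$-function only for large t.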
Suppose that the energy values are distributed so closely that
they are forming a practically continuous spectrum. Then one
can split the index $\lambda$ into $(\lambda, E)$ and replace the simple summation
in (10) by a summation over $\lambda$ and an integration over E; the
latter can be performed on the coefficients $J(\lambda, E; \lambda', E')$ with
the result that the formula (10) is unchanged, if the coefficients
are given by
which combines the equations (9.27) and (9.28) of the text.
As mentioned in the text, Green has found a formula which
allows one to calculate the higher approximations in a very
simple way. This formula is so elegant and useful that I shall
give it here, though without proof (which can be found in the
Appendix I, p. 178, of the paper by M. Born and H. S. Green,
Proc. Roy. Soc. A, 192, 166, 1948). Starting from the equation
(7) or _ , _
where U is known for a given perturbation V by (3), one forms the
successive commutators
u 22 = uu uu, i} 28 = u^n uu 22i ,..., (15)
and from them the expansion
If the initial condition u 2 = for t = is added, u% can be deter
mined by integrating this series term by term.
Then one forms
and the expansion
from which one can determine % so that u% = for = 0.
The second suffix Z in ?% has been chosen to indicate the power
of F which is involved in the expression; one has u% = (^(F 2 *"" 1 )
and this decreases rapidly with k when F or t are small. This rule
makes it possible to construct t* 4 , %,..., in a similar manner.
Then one has the solution of (4)
a = e u e u *e u *...p Q ...e u *e u *e u (19)
from which p is obtained by (3).
The explicit expressions for the expansion (5) of a are
(20)
These formulae will be useful for many purposes in quantum
theory. Concerning thermodynamics, the third-order terms will
have a direct application to the theory of fluctuations and
Brownian motion. The customary theory derives these phenomena
from considerations about the probability of distributions
in an assembly which differs from the most probable one. The
theory described here deals with one single system with the
methods of quantum mechanics (which allows anyhow only
statistical predictions); deviations from the average will then
depend on higher approximations. It can be hoped that this idea
leads to a new approach to the theory of fluctuations in quantum
mechanics.
31. (IX. p. 112.) The functional equation of quantum
statistics
The equation (9.37),
where $A^{(1)}$ depends on $E^{(1)}$, but not on $E^{(2)}$, and $A^{(2)}$ vice versa, is
obviously of the form
$$f(x+y) = \phi(x)\,\psi(y)$$
treated in Appendix, 13, and has as solution general exponential
functions; hence the distribution for all three systems is
canonical.
32. (IX. p. 113.) Degeneration of gases
The theory of gas degeneration is treated in my book Atomic
Physics, but in a way which appears not to conform with the
general principles of quantum statistics as explained in these
lectures. According to these one always has in statistical equilibrium
canonical distribution, $P = e^{\alpha - \beta E}$, while the presentation
in A. Ph. gives the impression that, by means of a modified
method of statistical enumeration, a different result is obtained.
This impression is only due to the terms used, which were those
of the earlier authors (Bose, Einstein, Fermi, Sommerfeld), while
in fact there is perfect agreement between the general theory
and the application to gases. A simple and clear exposition of
this subject is found in the little book by E. Schrödinger,
Statistical Thermodynamics (Cambridge University Press, 1946).
I shall give here a short outline of the theory.
In classical theory an ideal gas is regarded as a system of
independent particles. In quantum theory this is not permitted,
because the particles are indistinguishable. If $\psi_1(x^{(1)})$ and
$\psi_2(x^{(2)})$, or shortly $\psi_1(1)$ and $\psi_2(2)$, are the wave functions of two
identical particles with the energies $E_1$ and $E_2$, the Schrödinger
equation for the system of both particles, without interaction,
has obviously the solution $\psi_1(1)\psi_2(2)$ with the energy $E_1+E_2$;
but as the particles are identical there is another solution
belonging to the same energy, namely $\psi_1(2)\psi_2(1)$. Hence any
linear combination of these is also a solution. Two of these,
namely the symmetric one and the antisymmetric one,
$$\psi_s(1,2) = \psi_1(1)\psi_2(2) + \psi_1(2)\psi_2(1),$$
$$\psi_a(1,2) = \psi_1(1)\psi_2(2) - \psi_1(2)\psi_2(1), \qquad (1)$$
have a special property: the squares of their moduli, $|\psi_s|^2$ and
$|\psi_a|^2$, are unchanged if the particles are interchanged. One can
further show that they do not 'combine', i.e. the mixed interaction
integrals (matrix elements) vanish,
$$\int \psi_s^{*}\, A\, \psi_a \, dx^{(1)}\, dx^{(2)} = 0, \qquad (2)$$
for any operator A symmetric in the particles. Hence they represent
two entirely independent states of the system; each state
being characterized by two energy-levels of the single particle
occupied, without saying by which particle.
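The exchange properties of the symmetric and antisymmetric combinations can be checked numerically. The following is only a minimal sketch: the two single-particle functions (particle-in-a-box sines on the interval 0 to 1), the multiplicative operator `symmetric_operator`, and the grid size are illustrative assumptions, not taken from the text.

```python
import math

# Hypothetical single-particle states (particle in a box on 0..1);
# any two distinct real functions would serve for this illustration.
def psi1(x):
    return math.sin(math.pi * x)

def psi2(x):
    return math.sin(2 * math.pi * x)

def psi_s(x1, x2):
    """Symmetric combination of the two product solutions."""
    return psi1(x1) * psi2(x2) + psi1(x2) * psi2(x1)

def psi_a(x1, x2):
    """Antisymmetric combination of the two product solutions."""
    return psi1(x1) * psi2(x2) - psi1(x2) * psi2(x1)

def symmetric_operator(x1, x2):
    # An assumed multiplicative operator, symmetric in the two particles.
    return x1 ** 2 + x2 ** 2 + x1 * x2

def mixed_element(a_op, steps=40):
    """Midpoint-rule estimate of the mixed integral of psi_s * A * psi_a."""
    h = 1.0 / steps
    grid = [(k + 0.5) * h for k in range(steps)]
    total = sum(psi_s(x, y) * a_op(x, y) * psi_a(x, y)
                for x in grid for y in grid)
    return total * h * h

# Exchange of the particles: psi_s is unchanged, psi_a changes sign,
# so both squared moduli are unchanged.
exchange_ok = (psi_s(0.3, 0.7) == psi_s(0.7, 0.3)
               and psi_a(0.3, 0.7) == -psi_a(0.7, 0.3))
mixed = mixed_element(symmetric_operator)
```

The integrand of the mixed element is odd under exchange of the two arguments, so on a grid that is itself symmetric the contributions cancel in pairs and `mixed` comes out zero up to rounding.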
The same holds for any number of particles. If $E_1, E_2, \dots, E_n$
are the energies of the states of the isolated particles, the total
system (without interaction) has not only the eigenfunction
$\psi_1(1)\psi_2(2)\dots\psi_n(n)$ belonging to the energy $E_1+E_2+\dots+E_n$ but
all functions $P\psi_1(1)\psi_2(2)\dots\psi_n(n)$, where P means any permutation
of the particles, hence also all linear combinations of these.
There are in particular two combinations, the symmetric one
and the antisymmetric one,
$$\psi_s(1,2,\dots,n) = \sum_P P\,\psi_1(1)\psi_2(2)\dots\psi_n(n),$$
$$\psi_a(1,2,\dots,n) = \sum_P \pm P\,\psi_1(1)\psi_2(2)\dots\psi_n(n) \qquad (3)$$
(+ for even, − for odd permutations),
which have the same simple properties as described in the case
of two particles: $\psi_s$ remains unaltered when two particles are
exchanged, while $\psi_a$ (which can be written as a determinant)
changes its sign; hence $|\psi_s|^2$ and $|\psi_a|^2$ remain unchanged.
Further, the two states do not combine, a fact expressed by
formula (2).
The functions $\psi_s$ and $\psi_a$ describe the state of the n-particle
system in such a way that the particles have lost their
individuality; the only thing which counts is the number of particles
having a definite energy-level.
Experiment has shown that this description is adequate for all
particles in nature; every type of particle belongs either to one
or the other of these two classes.
The eigenfunctions of electrons belong, in view of spectroscopic
and other evidence, to the antisymmetric type; hence they vanish
if two of the single eigenfunctions $\psi_\alpha(\beta)$ are identical, i.e. if two
particles are in the same quantum state. This is the mathematical
formulation of Pauli's exclusion principle. Nucleons
(neutrons or protons) and neutrinos are of the same type; one
speaks of a Fermi-Dirac (F.D.) gas. Photons and mesons,
however, and many nuclei (containing an even number of
nucleons) are of the other type, having symmetric eigenfunctions;
they form a Bose-Einstein (B.E.) gas.
In both cases the total energy may be written
$$E = n_1\varepsilon_1 + n_2\varepsilon_2 + \dots = \sum_s n_s\varepsilon_s, \qquad (4)$$
where $\varepsilon_1, \varepsilon_2, \dots$ are the possible energy levels of the single particles
and $n_1, n_2, \dots$ integers which indicate how often this level appears
in the original sum $E_1+E_2+\dots+E_n$ (where each $E_k$ was attributed
to one definite particle).
The sum of these occupation numbers $n_1, n_2, \dots$
$$n_1 + n_2 + \dots = \sum_s n_s = n \qquad (5)$$
may be given or it may not. The latter holds if particles are
absorbed or emitted, as in the case of photons. For a B.E. gas,
including the case of photons, there is no restriction of the $n_s$,
while for a F.D. gas each energy-value $\varepsilon_s$ can only appear once,
if it appears at all. Hence one has the two cases
$$\text{(B.E.)}\quad n_s = 0, 1, 2, 3, \dots$$
$$\text{(F.D.)}\quad n_s = 0, 1. \qquad (6)$$
Now we apply the general laws of statistical equilibrium, which
have to be supplemented by the fundamental rule of quantum
mechanics that each non-degenerate (simple) quantum state has
the same weight. (This is implied by the equation (9.14) of the
text which shows that the diagonal element of the density matrix
determines the number of particles in the corresponding state.)
As we have seen in Appendix, 14, it suffices to calculate the
partition function Z (14.24), p. 158, with all $\omega_s = 1$,
$$Z = \sum e^{-\beta\sum_s n_s\varepsilon_s}, \qquad (7)$$
where the sum is to be extended over all quantum numbers $n_s$,
which describe a definite state of the system. These are just the
numbers introduced in (4), with or without the restriction (5)
according to the type of particle. Introducing the abbreviation
$$z_s = e^{-\beta\varepsilon_s} \qquad (8)$$
one has
$$Z = \sum_{n_1,n_2,\dots} z_1^{n_1} z_2^{n_2}\dots = \sum_{n_1} z_1^{n_1}\sum_{n_2} z_2^{n_2}\dots = \prod_s \sum_{n_s} z_s^{n_s}. \qquad (9)$$
The sum is easily evaluated for the two cases (6),
$$\text{(B.E.)}\quad \sum_{n_s} z_s^{n_s} = \frac{1}{1-z_s},$$
$$\text{(F.D.)}\quad \sum_{n_s} z_s^{n_s} = 1+z_s.$$
One can conveniently combine the results into one expression
$$Z = \prod_s (1 \mp z_s)^{\mp 1}, \qquad (10)$$
where the upper sign refers throughout to the B.E. case.
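The factorization of the unrestricted sum over occupation numbers into a product of one-level sums can be confirmed by brute force. The sketch below is an illustration under assumptions: the three energy levels, the value of β, and the cut-off of the B.E. occupation numbers at 40 are arbitrary choices made only so that the enumeration is finite.

```python
import itertools
import math

def brute_force_z(energies, beta, max_occupation):
    """Sum exp(-beta * sum n_s e_s) over all occupation numbers 0..max_occupation."""
    total = 0.0
    for ns in itertools.product(range(max_occupation + 1), repeat=len(energies)):
        total += math.exp(-beta * sum(n * e for n, e in zip(ns, energies)))
    return total

def closed_form_z(energies, beta, bose):
    """Product over levels of 1/(1 - z_s) (B.E.) or (1 + z_s) (F.D.)."""
    product = 1.0
    for e in energies:
        z = math.exp(-beta * e)
        product *= 1.0 / (1.0 - z) if bose else (1.0 + z)
    return product

# Assumed illustrative values, not taken from the text.
levels = [1.0, 1.5, 2.0]
beta = 1.0

z_fd_sum = brute_force_z(levels, beta, max_occupation=1)    # n_s = 0, 1
z_fd_product = closed_form_z(levels, beta, bose=False)

z_be_sum = brute_force_z(levels, beta, max_occupation=40)   # truncated geometric series
z_be_product = closed_form_z(levels, beta, bose=True)
```

For the F.D. case the two numbers agree to rounding; for the B.E. case the truncation error is of the order of the 41st power of the largest $z_s$, which is negligible here.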
This formula contains the theory of radiation, where the
condition (5) does not apply. But it is more convenient to deal
with the instance where (5) holds and to relax this condition in
the final result.
A glance at the original form (9) of Z shows that the condition
(5) indicates the selection from (10) of those terms which are
homogeneous of order n in all the $z_s$.
This can be done by the method of complex integration. We
form the generating function
$$f(\zeta) = \prod_s (1 \mp \zeta z_s)^{\mp 1} \qquad (11)$$
and expand it in powers of $\zeta$. The coefficient of $\zeta^n$ is obviously
equal to the product (9) with the restriction (5). Hence we obtain
instead of (10)
$$Z = \frac{1}{2\pi i}\oint \zeta^{-n-1} f(\zeta)\, d\zeta, \qquad (12)$$
where the path of integration surrounds the origin in such a way
that no other singularity is included except $\zeta = 0$.
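The selection of the terms of order n by a contour integral can be illustrated numerically. In this sketch the generating function is taken as the product over levels of $(1 \mp \zeta z_s)^{\mp 1}$, the integral around the origin is approximated by equally spaced points on a circle, and the result is compared with a direct enumeration of the occupation numbers whose sum is n. The values of $z_s$, the radius, and the number of points are assumptions chosen for illustration; the radius must stay below the smallest $1/z_s$ in the B.E. case.

```python
import cmath
import itertools
import math

def generating_function(zeta, zs, bose):
    """Product over levels of 1/(1 - zeta z_s) (B.E.) or (1 + zeta z_s) (F.D.)."""
    value = 1.0 + 0.0j
    for z in zs:
        value *= 1.0 / (1.0 - zeta * z) if bose else (1.0 + zeta * z)
    return value

def z_by_contour(zs, n, bose, radius=0.5, points=256):
    """Coefficient of zeta**n via (1/2 pi i) * contour integral of zeta**(-n-1) f(zeta)."""
    total = 0.0 + 0.0j
    for k in range(points):
        zeta = radius * cmath.exp(2j * math.pi * k / points)
        # d(zeta) = i * zeta * d(theta), so one power of zeta cancels.
        total += generating_function(zeta, zs, bose) / zeta ** n
    return (total / points).real

def z_by_enumeration(zs, n, bose):
    """Direct sum over occupation numbers with n_1 + n_2 + ... = n."""
    top = n if bose else 1
    total = 0.0
    for ns in itertools.product(range(top + 1), repeat=len(zs)):
        if sum(ns) == n:
            total += math.prod(z ** k for z, k in zip(zs, ns))
    return total

# Assumed illustrative values of z_s = exp(-beta * e_s), all below 1.
zs = [0.6, 0.4, 0.3, 0.2]
n = 3
fd_contour, fd_direct = z_by_contour(zs, n, False), z_by_enumeration(zs, n, False)
be_contour, be_direct = z_by_contour(zs, n, True), z_by_enumeration(zs, n, True)
```

For large n the enumeration becomes impractical, which is where the evaluation of the integral by steepest descent takes over.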
For large n this integral can be evaluated by the method of
steepest descent. It is easy to see that the integrand has one and
only one minimum on the real positive axis.
As in previous cases (see Appendix, 14, 15) the crudest
approximation suffices. One writes the integrand in the form
$e^{g(\zeta)}$, where
$$g(\zeta) = -(n+1)\log\zeta + \log f(\zeta),$$
and determines the minimum of the function $g(\zeta)$ from
$$g'(\zeta) = -\frac{n+1}{\zeta} + \frac{f'(\zeta)}{f(\zeta)} = 0; \qquad (13)$$
then one has to calculate $g(\zeta)$ and $g''(\zeta)$
for the value of $\zeta$ which is the root of (13). To a first
approximation one finds
$$Z = \frac{e^{g(\zeta)}}{\sqrt{2\pi g''(\zeta)}},$$
$$\log Z = -(n+1)\log\zeta + \log f(\zeta) - \tfrac{1}{2}\log\{2\pi g''(\zeta)\}.$$
Neglecting 1 compared with n and the last term (which can
be seen to be of a smaller order), one obtains
$$\log Z = -n\log\zeta + \log f(\zeta); \qquad (14)$$
here $\zeta$ is the root of (13), where also n+1 can be replaced by n.
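Now one gets from (11)
$$\log f(\zeta) = \mp\sum_s \log(1 \mp \zeta z_s). \qquad (15)$$
Hence, from (8) and (13),
$$n = \sum_s \frac{\zeta z_s}{1 \mp \zeta z_s} = \sum_s \frac{1}{e^{\alpha+\beta\varepsilon_s} \mp 1}, \qquad \zeta = e^{-\alpha}. \qquad (16)$$
From this equation $\alpha$ (or $\zeta$) can be determined as function of
the particle number and of temperature. One easily sees now
that the case where the number of particles n is not given is
obtained by just omitting the equation (16) and putting $\alpha = 0$
or $\zeta = 1$. Yet the equation (16) is not entirely meaningless now,
it gives the changing number of particles actually present.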
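How the parameter α follows from the particle number and the temperature can be shown by solving the condition numerically. The sketch assumes the convention that the mean occupation of level s is $1/(e^{\alpha+\beta\varepsilon_s} \mp 1)$ with $\zeta = e^{-\alpha}$; the energy levels, β, the bracketing interval, and the function names are hypothetical choices for illustration.

```python
import math

def mean_occupation(alpha, beta, energy, bose):
    """1 / (exp(alpha + beta*e) - 1) for B.E., 1 / (exp(alpha + beta*e) + 1) for F.D."""
    sign = -1.0 if bose else 1.0
    return 1.0 / (math.exp(alpha + beta * energy) + sign)

def total_particles(alpha, beta, energies, bose):
    return sum(mean_occupation(alpha, beta, e, bose) for e in energies)

def solve_alpha(n, beta, energies, bose, iterations=200):
    """Bisection for alpha; the total number decreases monotonically with alpha."""
    if bose:
        # B.E. requires alpha + beta*e > 0 for every level.
        low = -beta * min(energies) + 1e-9
    else:
        low = -50.0 - beta * max(energies)
    high = 50.0
    for _ in range(iterations):
        mid = 0.5 * (low + high)
        if total_particles(mid, beta, energies, bose) > n:
            low = mid      # too many particles: alpha must increase
        else:
            high = mid
    return 0.5 * (low + high)

# Assumed illustrative values, not taken from the text.
levels = [0.0, 1.0, 2.0, 3.0]
beta = 1.0
alpha_fd = solve_alpha(2.0, beta, levels, bose=False)
alpha_be = solve_alpha(2.0, beta, levels, bose=True)
occupations_fd = [mean_occupation(alpha_fd, beta, e, False) for e in levels]
```

In the F.D. case every mean occupation stays below 1, as the exclusion principle requires, and the number of particles cannot exceed the number of levels; in the B.E. case the lowest level can take any share of the particles.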
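The mean number of particles of the kind s is obviously
$$\bar n_s = -\frac{1}{\beta}\frac{1}{Z}\frac{\partial Z}{\partial\varepsilon_s} = -\frac{1}{\beta}\frac{\partial\log Z}{\partial\varepsilon_s};$$
hence, from (14) and (15),
$$\bar n_s = \frac{1}{e^{\alpha+\beta\varepsilon_s} \mp 1}, \qquad (17)$$
which confirms (16) with (5). This formula, for the B.E. case
(minus sign), has been mentioned in VIII (8.20), where it was
obtained by a completely different consideration of Einstein's.
In the same way, the average energy of the system is found to be
$$E = \sum_s \bar n_s\varepsilon_s = \sum_s \frac{\varepsilon_s}{e^{\alpha+\beta\varepsilon_s} \mp 1}, \qquad (18)$$
in agreement with (4).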
These are the fundamental formulae of quantum gases, derived
from the general kinetic theory. They are to be found in A. Ph.
Ch. VII, p. 197; in particular the fundamental formula (17) in
5, p. 224, for B.E., and in 6, p. 228, for F.D. All further
developments may be read there (or in any other of the
many books dealing with the subject). I wish to conclude this
presentation by giving the explicit formulae for monatomic
gases, where the energy is $\varepsilon = p^2/2m$ and the summation over
cells is to be replaced by an integration over the momentum
space. The weight of a cell is found, by a simple quantum-mechanical
consideration, for a single particle without spin
(A. Ph. Ch. VII, 4, p. 215) to be
1
Hence, introducing the integration variable
one obtains from (17) and (18)
(20)
which are the quantum generalizations of the formulae given in
Appendix 14 and reduce to them for $\alpha \to \infty$. A detailed discussion
would be outside the plan of this book. It need only be
said that the F.D. statistics of electrons have been fully confirmed
by the study of the properties of metals (A. Ph. Ch. VII. 7,
p. 229; 8, p. 232; 9, p. 235; 10, p. 236; A. XXX, p. 352).
33. (IX. p. 116.) Quantum equations of motion
At the end of Chapter VI, which deals with the kinetic theory
of (dense) matter from the classical standpoint, the statistical
derivation of the phenomenological hydrothermal equations
was mentioned and reference made to this later Appendix, which
belongs to quantum theory. This was done to save space; for
the classical derivation is essentially the same as that based on
quantum theory, and one easily obtains it from the latter by a
few simple rules.
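The first of these rules is, of course, the correspondence of the
normalized commutator $[\alpha, \beta] = \frac{1}{i\hbar}(\alpha\beta - \beta\alpha)$ with the Poisson
bracket
$$[\alpha, \beta] = \sum_{i=1}^{N}\left(\frac{\partial\alpha}{\partial x^{(i)}}\cdot\frac{\partial\beta}{\partial p^{(i)}} - \frac{\partial\alpha}{\partial p^{(i)}}\cdot\frac{\partial\beta}{\partial x^{(i)}}\right) \qquad (1)$$
if $x^{(i)}$ and $p^{(i)}$ are the position and momentum vectors on which
$\alpha$ and $\beta$ depend.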
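The second rule concerns the operator $\chi$, which in the text is
described in words; expressed in mathematical symbols it is
$$\chi_q \dots = \iint dx^{(q)}\, dx^{(q)\prime}\, \delta(x^{(q)} - x^{(q)\prime}) \dots \qquad (2)$$
It has to be interpreted classically to mean
$$\chi_q \dots = \iint dx^{(q)}\, dp^{(q)} \dots \qquad (3)$$
Thus the classical operation $\int dp^{(q)}$ corresponds to
$\int dx^{(q)\prime}\, \delta(x^{(q)} - x^{(q)\prime})$,
i.e. to substituting $x^{(q)}$ for $x^{(q)\prime}$, as stated in the text.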
In using the correspondence principle to proceed from classical
to quantum mechanics a product $\alpha\beta$ may not be left unchanged
unless $\alpha$ and $\beta$ commute; in general one must replace $\alpha\beta$ by
the symmetrized product $\{\alpha\beta\} = \tfrac{1}{2}(\alpha\beta + \beta\alpha)$.
By applying these rules one can easily go over from quantum
to classical formulae (and in many cases also vice versa).
Therefore we give here only the quantum treatment.
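The reason a plain product of two non-commuting quantities cannot be carried over unchanged is that it is in general not Hermitian, whereas half the sum of the two orderings is. The sketch below checks this for a pair of 2 × 2 matrices; the matrices and helper names are illustrative assumptions, and the symmetrized form $\tfrac{1}{2}(\alpha\beta + \beta\alpha)$ is the standard one assumed here.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Hermitian conjugate (conjugate transpose)."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def is_hermitian(m, tol=1e-12):
    d = dagger(m)
    return all(abs(m[i][j] - d[i][j]) < tol
               for i in range(len(m)) for j in range(len(m)))

def symmetrized(a, b):
    """Half the sum of the two orderings of the product."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[0.5 * (ab[i][j] + ba[i][j]) for j in range(n)] for i in range(n)]

# Two assumed Hermitian matrices that do not commute.
alpha = [[0 + 0j, 1 + 0j], [1 + 0j, 0 + 0j]]
beta = [[1 + 0j, 0 + 0j], [0 + 0j, 2 + 0j]]

plain_product = matmul(alpha, beta)       # [[0, 2], [1, 0]]: not Hermitian
symmetric_product = symmetrized(alpha, beta)
```

When the two quantities commute the two orderings coincide and the symmetrized product reduces to the ordinary one.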
To derive the equation (9.48) from (9.46) with the help of
(9.47), one proceeds by steps of which only the first need be
given, as the following ones are precisely similar: One has:
H N = tf w _ 1+ JLpWi+ JT tfW. . (4)
Hence, applying the operation $\chi_N$ to (9.46), one obtains, using
$\chi_N \rho_N = \rho_{N-1}$,
P W2 > ftr] + j
The middle term on the righthand side is
. PN]
(5)
and vanishes on transformation to a surface integral, because
there is no flow across the boundary at a large distance. Hence
(5) reduces to the equation (9.48) with $q = N-1$, which completes
the first step. The following steps are of the same pattern.
In order to make the transition from the 'microscopic' equations
of motion (9.48) of the molecular clusters to the macroscopic
equations of hydrodynamics, one needs first to define the
density and macroscopic velocity in terms of the molecular
quantities. The generalized 'density' $n_q$, which reduces to the
ordinary number density $n_1$ for q = 1, is obtained as a function
of the positions $x^{(1)}, \dots, x^{(q)}$ by writing $x^{(i)\prime} = x^{(i)}$ (i = 1, 2,..., q)
in the density matrix $\rho_q(x, x')$. The macroscopic velocity $u_q^{(i)}$
for a molecule (i) in the cluster of q molecules whose positions are
given is the average value of the quantity represented by the
operator $p^{(i)}$:
m i
* <
where the bracket {...} indicates the symmetrized product, as
introduced above, and the subscript x' = x the diagonal
elements of the matrix.
By expressing (9.48) in the coordinate representation and
writing $x^{(i)\prime} = x^{(i)}$, one obtains the equation of continuity
since
= ~
and
,x')] x , =x = 0.
Next, multiply (9.48) by the operator $p^{(i)}$ before and after, taking
half the sum, and then write $x^{(i)\prime} = x^{(i)}$ (i = 1, 2,..., q).
The left-hand side evidently reduces to $m\,\frac{\partial}{\partial t}(n_q u_q^{(i)})$. One has
further
,,**
and {p, [*<, P9 ]}^ K =
r
{P (i) ,x, + i[^ +1) .p 9+1 ]}x x =  J
Hence, if a tensor 1^ is defined by
one has
= 0. (8)
By using (7) one obtains
c\ n Q
or, if d/dt is the convective derivative ~ + V u^. ^ ,
ct j^i dx (1)
Hence (8) may be written in the form
where p tfi)
fl )} x ^ x mn a <)<> (11)
7/2'
here v<*> = I p(*>~i4*> f[ S(x^x< / ) (12)
is the relative molecular velocity referred to the visible motion.
The equation (10) is the generalized equation of motion of the
cluster of q molecules, which reduces to the ordinary equation of
hydrodynamics when q = 1.
$p_q^{(i)}$, the generalized pressure tensor, is seen to consist of two
parts $k_q^{(i)}$ and $l_q^{(i)}$ associated with the kinetic energy of motion
and the potential energy between the molecules respectively.
The diagonal element of the tensor $k_q^{(i)}$ is a multiple of the
kinetic temperature $T_q^{(i)}$ defined by
The equation of energy transfer can be obtained in the same way
as the equation of motion by calculating the rate of change with
time of
34. (IX. p. 118.) Supraconductivity
There exists a satisfactory phenomenological theory of
supraconductivity, mainly due to F. London; it is excellently
presented in a book by M. von Laue, where the literature can be
found. (Theorie der Supraleitung: Springer, Berlin u. Göttingen,
1947).
Many attempts to formulate an electronic theory have
been made, without much success. Recently W. Heisenberg
has published some papers (Z. f. Naturforschung, 2 a, p. 185,
1947) which claim to explain the essential features of the
phenomenon. According to this theory every metal ought to be
supraconductive for sufficiently low temperatures. Actually the alkali
metals which have one 'free' electron are not supraconductive
even at the lowest temperatures at present obtainable, and it is
not very likely that a further decrease of temperature will change
this. There are also theoretical objections against Heisenberg's
method.
A different theory has been developed by my collaborator
Mr. Kai Chia Cheng and myself, which connects supraconductivity
with certain properties of the crystal lattice and predicts
correlations between structure and supraconductive state,
which are confirmed by the facts (e.g. the behaviour of the alkali
metals). The complete theory will be worked out in due course.
35. (X. p. 124.) Economy of thinking
The ideal of simplicity has found a materialistic expression in
Ernst Mach's principle of economy in thought (Prinzip der
Denkökonomie). He maintains that the purpose of theory in
science is to economize our mental efforts. This formulation,
often repeated by other authors, seems to me very objectionable.
If we want to economize thinking the best way would be to stop
thinking at all. A minimum principle like this has, as is well
known to mathematicians, a meaning only if a constraining
condition is added. We must first agree that we are confronted
with the task not only of bringing some order into a vast expanse
of accumulated experience but also of perpetually extending this
experience by research; then we shall readily consent that we
would be lost without the utmost efficiency and clarity in
thinking. To replace these words by the expression 'economy of
thinking' may have an appeal to engineers and others interested
in practical applications, but hardly to those who enjoy thinking
for no other purpose than to clarify a problem.
36. (X. p. 127.) Concluding remarks
I feel that any critical reference to philosophical literature
ought to be based on quotations. Yet, as I have said before, my
reading of philosophical books is sporadic and unsystematic,
and what I say here is a mere general impression. A book which
I have recently read with some care is E. Cassirer's
Determinismus und Indeterminismus in der modernen Physik (Göteborg,
Elanders, 1937), which gives an excellent account of the
situation, not only in physics itself but also with regard to possible
applications of the new physical ideas to other fields. There one
finds references to and quotations from all great thinkers who have
written about the problem. The last section contains Cassirer's
opinion on the ethical consequences of physical indeterminism
which is essentially the same as that expressed by myself. I
quote his words (translated from p. 259): 'From the significance
of freedom, as a mere possibility limited by natural laws,
there is no way to that "reality" of volition and freedom of
decision with which ethics is concerned. To mistake the choice
(Auswahl) which an electron, according to Bohr's theory, has
between different quantum orbits, with a choice (Wahl) in
the ethical sense of this concept, would mean to become the
victim of a purely linguistic equivocality. To speak of an ethical
choice there must not only be different possibilities but a
conscious distinction between them and a conscious decision
about them. To attribute such acts to an electron would be a
gross relapse into a form of anthropomorphism. . . .' Concerning
the inverse problem whether the 'freedom' of the electron helps
us to understand the freedom of volition he says this (p. 261):
'It is of no avail whether causality in nature is regarded in
the form of rigorous "dynamical" laws or of merely statistical
laws. ... In neither way does there remain open an access to
that sphere of "freedom" which is claimed by ethics'.
My short survey of these difficult problems cannot be
compared with Cassirer's deep and thorough study. Yet it is a
satisfaction to me that he also sees the philosophical importance
of quantum theory not so much in the question of indeterminism
but in the possibility of several complementary perspectives or
aspects in the description of the same phenomena as soon as
different standpoints of meaning are taken. There is no unique
image of our whole world of experience.
This last Appendix, added after delivering the lectures, gives
me the opportunity to express my thanks to those among my
audience who came to me to discuss problems and to raise
objections. One of these was directed against my expression
'observational invariants'; it was said that the conception of
invariant presupposes the existence of a group of transformations
which is lacking in this case. I do not think that this is right.
The problem is, of course, a psychological one; what I call
'observational invariants' corresponds roughly to the Gestalten
of the psychologists. The essence of Gestalten theory is that the
primary perceptions consist not in uncoordinated sense
impressions but in total shapes or configurations which preserve
their identity independently of their own movements and the
changing standpoint of the observer. Now compare this with
a mathematical example, say the definition of the group of
rotations as those linear transformations of the coordinates
x, y, z for which $x^2+y^2+z^2$ is invariant. The latter condition
can be interpreted geometrically as postulating the invariance
of the shape of spheres. Hence the group is defined by assuming
the existence of a definite invariant configuration or Gestalt, not
the other way round. The situation in psychology seems to me
quite analogous, though much less precise. Yet I think that this
analogy is of some help in understanding what we mean by real
things in the flow of perceptions.
Another objection was raised against my use of the expression
'metaphysical' because of its association with speculative
systems of philosophy. I need hardly say that I do not like this kind
of metaphysics, which pretends that there is a definite goal to be
reached and often claims to have reached it. I am convinced
that we are on a never-ending way; on a good and enjoyable way,
but far from any goal. Metaphysical systematization means
formalization and petrification. Yet there are metaphysical
problems, which cannot be disposed of by declaring them
meaningless, or by calling them with other names, like
epistemology. For, as I have repeatedly said, they are 'beyond physics'
indeed and demand an act of faith. We have to accept this fact
to be honest. There are two objectionable types of believers:
those who believe the incredible and those who believe that
'belief' must be discarded and replaced by 'the scientific method'.
Between these two extremes on the right and the left there is
enough scope for believing the reasonable and reasoning on
sound beliefs. Faith, imagination, and intuition are decisive
factors in the progress of science as in any other human activity.
INDEX
Absolute temperature, 38, 42, 48, 53,
149, 157.
Absorption, 82, 86.
Accessibility, 144-5.
Ackermann, 169.
Adiabatics, 349, 147, 14951.
Adrian, A. D., 125.
Advanced potential, 26.
Angular momentum, 86, 111, 182,
184.
Antecedence, 9, 12, 15, 17,256, 302,
44, 713, 102, 120, 124.
Astronomy, 1016.
Atom, 17, 84.
Atomic physics, 187.
Atomistics, 46.
Avogadro's number, 63, 176.
Balmer, 85, 87.
Bernoulli, D., 47.
Binary encounters, 68.
Biot, 138.
Boer, de, 169.
Bohr, 82, 857, 105, 127, 191, 208.
Boltzmann, 53, 55, 56, 5860, 62, 76,
140, 163, 167, 1823.
Boltzmann's constant, 53, 158, 176.
equation, 55, 68, 71, 165, 172, 174,
181.
/ftheorem, 579, 113.
Born, 65, 169, 186, 1889, 195, 207.
Boscovich, 46.
Bose, 112, 197.
BoseEinstein statistics, 113, 199202.
BoyleCharles law, 48.
Boyle's law, 47, 54, 59, 149, 152, 159.
Broglie, de, 8992.
Brownian motion, 625, 73, 99, 170,
196.
Bucherer, 27.
Biichner, 74.
Caloric, 31.
Calorimeter, 31.
Canonical distribution, 60, 72, 112,
113, 175, 187, 197.
form, 18, 49, 96.
Carathéodory, 38-9, 143, 146, 149.
Carathéodory's principle, 39, 41, 42.
Carnot cycle, 143.
Casimir, H. B. G., 151.
Cassirer, E., 208.
Cauchy, 19, 21, 26, 44, 114, 134, 140.
Cauchy's equation, 20, 58, 70, 141.
theorem, 160.
Causality, 3, 59, 17, 72, 76, 95, 1013,
120, 124, 126, 129.
Cause, 47, 92, 101, 109, 120, 129.
Causeeffect relation, 5, 15, 467, 71.
Cavendish, 23.
Chance, 3, 4673. 83, 84, 92, 101, 103,
109, 120, 123.
Chapman, 56.
and Cowling, 56, 58.
Charge, 104.
Chemical equilibrium, 146.
Cheng, Kai Chia, 207.
Clausius, 38.
Collision crosssection, 55, 101, 109,
163.
integral, 56, 59, 68, 72, 163, 180,
183.
Colloids, 65.
Commutation law, 89, 91, 189.
Commutator, 947, 190, 193, 203.
Conduction of heat, 54, 58, 65, 70, 114.
Conservation of energy, 19, 119, 164,
182.
of mass, 20.
of momentum, 164, 182.
Constants of motion, 112.
Contact forces, 21.
Contiguity, 9, 12, 16, 1730, 74, 103,
120, 124, 140, 141.
Continuity equation, 20, 24, 31, 49,
57, 134, 137, 139, 205.
Continuous media, 1922, 134.
Convective derivative, 21, 49, 137,
206.
Copernicus, 12.
Corpuscular theory of light, 22.
Correspondence principle, 87, 191,
203.
Cosmology, 10, 11.
Coulomb forces, 234, 103, 180.
Coulomb's law, 25, 85, 138.
Cowling, see Chapman.
Critical temperature, 117.
Curvature of space, 28.
Darwin, C. G., 54, 155, 160.
Debye, 65.
Decay, law of, 84.
Degeneration of gases, 197.
temperature, 118.
Demokritos, 46.
Density, 20, 34, 170, 204,
function, 113.
matrix (or operator), 106, 111, 193,
199, 204.
Dependence, 6, 8, 76, 102, 124.
Descartes, 11, 16.
Determinism, 3, 59, 17, 30, 92, 101,
108, 110, 120, 1224, 126.
Dewar vessel, 35.
Diffraction, 22, 106.
Diffusion, 54, 58, 175.
Dirac, 91, 92, 95, 113, 141, 189, 191.
Dirac's 8function, 60, 96, 195.
Displacement current, 25.
Distribution function, 50, 71, 96, 106,
17680.
law of BoseEinstein, 81.
of MaxwellBoltzmann, 614,
57, 60, 78, 82, 112, 167, 181.
Eckhart, 151.
Economy of thinking, 207.
Eddington, 141.
Ehrenfest, 59, 86.
Eigenfunction, 934, 98, 102, 1056,
1989.
Eigenvalue, 91, 934, 98, 100, 11112,
195.
Einstein, 15, 2730, 625, 75, 78,
803, 8890, 92, 100, 101, 112,
116, 1224, 1413, 151, 170, 172,
173, 175, 197, 202.
Einstein's law, 756.
Electromagnetic field, 226, 138.
wave, 25.
Electron, 8492, 95, 1035, 11618,
141, 199, 207, 208.
spin, 87, 95, 104, 113.
Emission, 82, 86.
Energy and mass, 75.
and relativity, 90.
, density of, 29.
, dissipation of, 113.
in perturbation method, 100, 111.
levels, 825, 87, 91, 198, 199.
of atom, 159.
of oscillator, 779.
of system, 36, 478, 53, 1478, 171
3.
surface, 62, 170.
Enskog, 56.
Enthalpy, 148.
Entropy and probability, 53, 57, 151,
162, 172.
and temperature, 116.
, change of, 434, 712, 183, 186.
 , definition of, 42, 11314, 165.
, establishment of, 38.
from Caratheodory's theory, 149
50.
in chemical equilibria, 148.
of atom, 159.
Equations of motion, 18, 21, 49, 57,
956, 103, 116, 137, 182, 2036.
of state, 36, 48.
Equipartition law, 77, 188.
Erlanger Programm, 104.
Ether, 19, 22, 74.
Euler's theorem, 1478.
Excluded middle, 107.
Exclusion principle, 87.
Faraday, 234, 140.
Fermi, 113, 197.
FermiDirac statistics, 113, 199203.
Field equations, 289.
of force, 17.
vector, 140.
FitzGerald, 27.
Fluctuation law, 79, 81.
Fluctuations, 151, 17ty6, 196.
Fock, V. A., 29, 143.
Fourier, 31.
Fowler, R. H., 39, 54, 155, 160.
Franck, 86.
Free energy, 44, 61, 148, 159, 167,
186.
will, 3, 1267.
Frenkel, J., 65.
Frequency, 8890.
Fresnel, 22.
Fuchs, K., 169.
Functional equation of quantum
statistics, 112, 197.
Galileo, 10-13, 129.
Gammafunction, 155.
Gas constant, 48.
Gauss, 18, 46, 48, 133.
Gauss's theorem, 134, 165.
Geodesic, 29.
Gerlach, 87.
Gestalten theory, 209.
Gibbs, 58, 602, 65, 66, 112, 149, 167.
Godel, 108.
GoeppertMayer, 169.
Goudsmit, 87.
Gravitation, 13, 17, 2630, 124, 133,
134, 141, 142.
Green, H. S., 65, 68, 100, 115, 120
169, 181, 183, .1856, 195.
Groot, de, 151.
Hamilton, 18, 91, 191.
Hamiltonian as matrix, 89.
as operator, 95.
definition of, 18.
in equations of motion, 4951, 59,
66, 103, 115.
in perturbation method, 99, 110,
112.
in statistical mechanics, 167.
of a particle, 102, 166, 176, 182.
of oscillator, 77, 188.
Heat, 31, 34, 367, 119, 147, 157.
Heisenberg, 86, 889, 92, 94, 1889,
191, 2067.
Helium, 11618.
Helmholtz, 61.
Hermitian operator, 93.
Hertz, G., and Franck, J., 86.
Hertz, H., 25.
Hilbert, 56, 108.
Hoffmann, B., 29, 143.
Hooke's law, 21.
Huygens, 22.
Hydrogen atom, 86.
Hydrothermal equations, 115, 118,
119, 203.
Ideal gas, 47, 50, 54, 112, 149, 159,
171, 197.
Indeterminacy, 1019, 119, 208.
Induction, 6, 7, 14, 46, 64, 84.
Infeld, L., 29, 143.
Initial state, 48.
Integral of motion, 98, 100, 110, 112.
operator x , 66, 177, 181, 2034.
Integrating denominator, 401, 143
4.
Interference, 22, 106.
Inter phenomena, 107.
Irreversibility, 17, 32, 55, 579, 72,
109, 151, 165, 183.
Isotherms, 36, 47, 149.
Jeans, 77, 81.
Jordan, P., 1889.
Joule, 336.
Kahn, B., 169.
Kelvin, 38, 46.
Kepler, 10, 12, 129, 132.
Kepler's laws, 1213, 12932.
Kinetic energy, 18, 50, 61, 80, 116,
117, 206.
theory, general, 6470, 202.
of gases, 4658, 69, 1513, 174.
, quantum, 10921.
Kirkwood, 69, 176.
Klein, F., 104.
Kohlrausch, 25, 140.
Lagrange, 18.
Lagrangian factor, 156.
λ-point, 117-18.
Landau, 120.
Laplace, 18, 30.
differential operator, 22.
Laue, M. v., 206.
LeviCivita, 28, 142.
Light quantum, 82.
Liouville's theorem, 49, 52, 58, 60, 66,
68, 159, 181.
Logic, threevalued, 1078.
London, F., 206.
Lorentz, 27, 80.
force, 141.
Lorentz transformation, 27.
Loschmidt, 58.
Mach, E., 207.
MacMillan, 169.
Magnetic field, 138.
Margenau, H., 143.
Mass, gravitational, 15, 142.
, inertial, 14, 15, 28, 75, 104, 142.
Matrix, 83, 889, 91, 978, 1001, 115.
mechanics, 901, 1889.
Matter, 74, 83.
Maxwell, 247, 50, 51, 54, 56, 62, 112,
124, 13841.
Maxwell's equations, 246, 75, 124,
13841.
functional equation, 153.
tensions, 141.
Mayer, J., 169.
Mayer, R., 32.
Mean free path, 55.
Mechanical equivalent of heat, 323.
Meixner, 151.
Mendelssohn, 120.
Meson, 95, 141, 199.
Method of ignorance, 68.
Michels, 169.
Michelson and Morley experiment, 27.
Minkowski, 28, 142.
Molecular chaos, 47, 48, 50, 152.
Moll, 88.
Momentum, 18, 29, 75, 86, 90, 91, 116,
136, 152, 184.
Montroll, 169.
Multinomial theorem, 160.
Multiplet, 88.
Murphy, G. M., 143.
Neumann, J. v., 109.
Neutrino, 199.
Neutron, 199.
Newton, 1030, 74, 80, 95, 103, 124,
126, 129, 132, 133, 141.
Noncommuting quantities, 926.
NonEuclidean geometry, 28.
Nonlinear transformation, 27, 142.
Nucleon, 199.
Nucleus, 84, 103, 199.
Number density, 978.
Objectivity, 1245.
Observational invariant, 125, 209.
Oersted, 25.
Onsager, L., 151.
Operator, 91, 935, 978, 100, 1889.
Ornstein, 88.
Oscillator, 779, 856, 116.
Partition function, 54, 60, 61, 72,
1589, 162, 167, 171, 199.
Pauli, 87, 141.
Pauli's exclusion principle, 199.
Periodic system of elements, 87.
Perrin, 63, 176.
Perturbation, 99, 110, 1936.
Pfaffian equation, 3841, 1434, 147.
Phase rule, 149.
space, 49, 51.
velocity, 22, 90.
Photoelectric effect, 801.
Photon, 80, 82, 90, 92, 101, 105, 107,
112, 199.
Planck, 76, 7880, 82, 8890, 100, 101,
112.
Planck's constant, 79, 85, 127, 142.
Poincaré, 27, 58, 62.
Poisson bracket, 49, 96, 181, 1913,
203.
Poisson's equation, 138.
Potential, 14, 103, 138, 177.
energy, 18, 61, 67, 95, 168, 180,
183, 206.
Pressure, 22, 23, 34, 47, 48, 53, 61,
116, 118.
of light, 75.
tensor, 206.
Priestley, 23.
Prigogine, I., 151.
Probability, 51, 56, 94, 1023, 106,
124, 174.
and determinism, 48.
and entropy, 151, 172.
and irreversibility, 57.
coefficient, 83.
function, 4950.
of distribution, 4950, 52, 667, 98,
100, 154, 1603, 170, 17880, 194.
of energy, 79.
, theory of, 46.
wave, 1057.
Proton, 199.
Ptolemy, 10.
Quantum, 76, 802, 90,
conditions, 86, 89.
gas formulae, 202.
mechanics, 19, 73, 76, 83, 86, 89,
92103, 1079, 11121, 123,
18893, 196.
number, 87, 200.
theory, 17, 76, 82, 84, 110, 116, 120,
123, 1247, 142, 197, 2036.
Quasiperiodicity, 58, 62, 169.
Radial distribution function, 69.
Radiation, 7683, 889, 101, 112, 200.
 density, 767, 83.
Radioactive decay, 83, 101.
Radioactivity, 84, 101.
Rayleigh, 77, 81, 91, 171.
Reaction velocity, 83.
Reality, 1034, 123, 125.
Reichenbach, 1078.
Relativity, 14, 15, 17, 2630, 745, 90,
124, 1413.
Restmass, 75.
Retarded potential, 6.
Reversibility, 58, 71.
, microscopic, 151.
Ricci, 28, 142.
Riemann, 28, 142.
Ritz, 85.
Rotator, 86.
Russell, 108, 129.
Rutherford, 83, 84.
Savart, 138.
Schrödinger, 89, 91-2, 189, 197.
Selfenergy, 109.
Semi permeable walls, 43, 146.
Smoluchowski, M. v., 176.
Soddy, 83.
Sommerfeld, 86, 197.
Specific heat, 44, 80, 11619, 150, 159,
173.
Spectrum, 85, 87.
Statistical equilibrium, 48, 50, 53, 60,
66, 72, 111, 11316, 119, 153,
175, 197.
mechanics, 50, 5865, 713, 779,
84, 155, 167.
operator (or matrix), 97-8, 100, 102.
term, 67.
Statistics, 46, 84.
Steepest descent, 155, 161, 200.
Stefan, 76.
Stern, 87.
Stirling's formula, 534, 154.
Stokes, 176.
Strain, 20, 34, 44, 140.
tensor, 21, 70.
Stress, 20, 34, 44, 140.
tensor, 202, 29, &S, 76, 116, 135.
Supraconductivity, 11718, 2067.
Surface forces, 134.
Temperature, absolute, 38, 42, 48, 53,
149, 157.
and Brownian motion, 175.
and heat, 31, 34.
, critical, 117.
degeneration, 118.
, empirical, 36, 147, 153.
function, 42, 44.
, kinetic, 116, 118, 206.
scale, 36.
, thermodynamic, 116, 118.
Tension, 20, 23, 134.
Thermal energy, 117.
equilibrium, 356, 53, 60.
Thermal expansion, 116.
Thermodynamics, 17, 31-45, 53, 110,
1*, 146, 1489, 151.
, first law of, 33, 37, 118.
, second law of, 36, 38, 53, 143, 157
8, 173.
Thermometer, 3J, 35.
Thomson, J. J., 85.
Time, 16, 27, 32, 71.
, flow of, 32.
Tisza, 120.
Transition probability, 101, 114.
Tycho Brahe, 12, 132.
Uhlenbeck, 8* 169.
Uncertainty principle, 94, 96, 1045,
117, 18991.
Ursell, 61, 168, 169.
Van der Waals, 54, 60, 61.
Vector field, 134.
Velocity distribution, 50, 51.
of light, 25, 140.
of sound, 14951.
Viscosity, 54, 68, 70, 117, 118, 176.
Wave equation, 22, 912, 102, 140.
function, 89, 102, 197.
mechanics, 89, 91, 189.
theory of light, 22.
Weber, 25, 140.
Wien, 76, 789, 81.
Wiener, N., 188.
Wilson, 86.
Xrays, 84.
Young, 22.
Yukawa, 141.
Yvon, 169.
Zermelo, 58, 62, 169.
PRINTED IN
GREAT BRITAIN
AT THE
UNIVERSITY PRESS
OXFORD
BY
CHARLES BATEY
PRINTER
TO THE
UNIVERSITY