THE MATHEMATICAL THEORY OF RELATIVITY

BY A. S. EDDINGTON, M.A., M.Sc., F.R.S.
PLUMIAN PROFESSOR OF ASTRONOMY AND EXPERIMENTAL PHILOSOPHY IN THE UNIVERSITY OF CAMBRIDGE

CAMBRIDGE: AT THE UNIVERSITY PRESS, 1923

CAMBRIDGE UNIVERSITY PRESS, C. F. CLAY, Manager. LONDON: FETTER LANE, E.C. 4. LONDON: H. K. LEWIS AND CO., Ltd., 136 Gower Street, W.C. 1. NEW YORK: THE MACMILLAN CO. BOMBAY, CALCUTTA, MADRAS: MACMILLAN AND CO., Ltd. TORONTO: THE MACMILLAN CO. OF CANADA, Ltd. TOKYO: MARUZEN-KABUSHIKI-KAISHA.

ALL RIGHTS RESERVED. PRINTED IN GREAT BRITAIN.

PREFACE

A FIRST draft of this book was published in 1921 as a mathematical supplement to the French Edition of Space, Time and Gravitation. During the ensuing eighteen months I have pursued my intention of developing it into a more systematic and comprehensive treatise on the mathematical theory of Relativity. The matter has been rewritten, the sequence of the argument rearranged in many places, and numerous additions made throughout; so that the work is now expanded to three times its former size. It is hoped that, as now enlarged, it may meet the needs of those who wish to enter fully into these problems of reconstruction of theoretical physics.

The reader is expected to have a general acquaintance with the less technical discussion of the theory given in Space, Time and Gravitation, although there is not often occasion to make direct reference to it. But it is eminently desirable to have a general grasp of the revolution of thought associated with the theory of Relativity before approaching it along the narrow lines of strict mathematical deduction. In the former work we explained how the older conceptions of physics had become untenable, and traced the gradual ascent to the ideas which must supplant them. Here our task is to formulate mathematically this new conception of the world and to follow out the consequences to the fullest extent.
The present widespread interest in the theory arose from the verification of certain minute deviations from Newtonian laws. To those who are still hesitating and reluctant to leave the old faith, these deviations will remain the chief centre of interest; but for those who have caught the spirit of the new ideas the observational predictions form only a minor part of the subject. It is claimed for the theory that it leads to an understanding of the world of physics clearer and more penetrating than that previously attained, and it has been my aim to develop the theory in a form which throws most light on the origin and significance of the great laws of physics.

It is hoped that difficulties which are merely analytical have been minimised by giving rather fully the intermediate steps in all the proofs with abundant cross-references to the auxiliary formulae used.

For those who do not read the book consecutively attention may be called to the following points in the notation. The summation convention (p. 50) is used. German letters always denote the product of the corresponding English letter by √(−g) (p. 111). ħ is the symbol for "Hamiltonian differentiation" introduced on p. 139. An asterisk is prefixed to symbols generalised so as to be independent of or covariant with the gauge (p. 203).

A selected list of original papers on the subject is given in the Bibliography at the end, and many of these are sources (either directly or at second-hand) of the developments here set forth. To fit these into a continuous chain of deduction has involved considerable modifications from their original form, so that it has not generally been found practicable to indicate the sources of the separate sections.
A frequent cause of deviation in treatment is the fact that in the view of most contemporary writers the Principle of Stationary Action is the final governing law of the world; for reasons explained in the text I am unwilling to accord it so exalted a position.

After the original papers of Einstein, and those of de Sitter from which I first acquired an interest in the theory, I am most indebted to Weyl's Raum, Zeit, Materie. Weyl's influence will be especially traced in §§ 49, 58, 59, 61, 63, as well as in the sections referring to his own theory.

I am under great obligations to the officers and staff of the University Press for their help and care in the intricate printing.

A. S. E.
10 August 1922.

CONTENTS

INTRODUCTION

CHAPTER I ELEMENTARY PRINCIPLES
1. Indeterminateness of the space-time frame
2. The fundamental quadratic form
3. Measurement of intervals
4. Rectangular coordinates and time
5. The Lorentz transformation
6. The velocity of light
7. Timelike and spacelike intervals
8. Immediate consciousness of time
9. The "3 + 1 dimensional" world
10. The FitzGerald contraction
11. Simultaneity at different places
12. Momentum and Mass
13. Energy
14. Density and temperature
15. General transformations of coordinates
16. Fields of force
17. The Principle of Equivalence
18. Retrospect

CHAPTER II THE TENSOR CALCULUS
19. Contravariant and covariant vectors
20. The mathematical notion of a vector
21. The physical notion of a vector
22. The summation convention
23. Tensors
24. Inner multiplication and contraction. The quotient law
25. The fundamental tensors
26. Associated tensors
27. Christoffel's 3-index symbols
28. Equations of a geodesic
29. Covariant derivative of a vector
30. Covariant derivative of a tensor
31. Alternative discussion of the covariant derivative
32. Surface-elements and Stokes's theorem
33. Significance of covariant differentiation
34. The Riemann-Christoffel tensor
35. Miscellaneous formulae

CHAPTER III THE LAW OF GRAVITATION
36. The condition for flat space-time. Natural coordinates
37. Einstein's law of gravitation
38. The gravitational field of an isolated particle
39. Planetary orbits
40. The advance of perihelion
41. The deflection of light
42. Displacement of the Fraunhofer lines
43. Isotropic coordinates
44. Problem of two bodies. Motion of the moon
45. Solution for a particle in a curved world
46. Transition to continuous matter
47. Experiment and deductive theory

CHAPTER IV RELATIVITY MECHANICS
48. The antisymmetrical tensor of the fourth rank
49. Element of volume. Tensor-density
50. The problem of the rotating disc
51. The divergence of a tensor
52. The four identities
53. The material energy-tensor
54. New derivation of Einstein's law of gravitation
55. The force
56. Dynamics of a particle
57. Equality of gravitational and inertial mass. Gravitational waves
58. Lagrangian form of the gravitational equations
59. Pseudo-energy-tensor of the gravitational field
60. Action
61. A property of invariants
62. Alternative energy-tensors
63. Gravitational flux from a particle
64. Retrospect

CHAPTER V CURVATURE OF SPACE AND TIME
65. Curvature of a four-dimensional manifold
66. Interpretation of Einstein's law of gravitation
67. Cylindrical and spherical space-time
68. Elliptical space
69. Law of gravitation for curved space-time
70. Properties of de Sitter's spherical world
71. Properties of Einstein's cylindrical world
72. The problem of the homogeneous sphere

CHAPTER VI ELECTRICITY
73. The electromagnetic equations
74. Electromagnetic waves
75. The Lorentz transformation of electromagnetic force
76. Mechanical effects of the electromagnetic field
77. The electromagnetic energy-tensor
78. The gravitational field of an electron
79. Electromagnetic action
80. Explanation of the mechanical force
81. Electromagnetic volume
82. Macroscopic equations

CHAPTER VII WORLD GEOMETRY
Part I. Weyl's Theory
83. Natural geometry and world geometry
84. Non-integrability of length
85. Transformation of gauge-systems
86. Gauge-invariance
87. The generalised Riemann-Christoffel tensor
88. The in-invariants of a region
89. The natural gauge
90. Weyl's action-principle
Part II. Generalised Theory
91. Parallel displacement
92. Displacement round an infinitesimal circuit
93. Introduction of a metric
94. Evaluation of the fundamental in-tensors
95. The natural gauge of the world
96. The principle of identification
97. The bifurcation of geometry and electrodynamics
98. General relation-structure
99. The tensor *B^ε_μνσ
100. Dynamical consequences of the general properties of world-invariants
101. The generalised volume
102. Numerical values
103. Conclusion

Bibliography
Index

INTRODUCTION

The subject of this mathematical treatise is not pure mathematics but physics. The vocabulary of the physicist comprises a number of words such as length, angle, velocity, force, work, potential, current, etc., which we shall call briefly "physical quantities." Some of these terms occur in pure mathematics also; in that subject they may have a generalised meaning which does not concern us here. The pure mathematician deals with ideal quantities defined as having the properties which he deliberately assigns to them.
But in an experimental science we have to discover properties, not to assign them; and physical quantities are defined primarily according to the way in which we recognise them when confronted by them in our observation of the world around us.

Consider, for example, a length or distance between two points. It is a numerical quantity associated with the two points; and we all know the procedure followed in practice in assigning this numerical quantity to two points in nature. A definition of distance will be obtained by stating the exact procedure; that clearly must be the primary definition if we are to make sure of using the word in the sense familiar to everybody. The pure mathematician proceeds differently; he defines distance as an attribute of the two points which obeys certain laws (the axioms of the geometry which he happens to have chosen), and he is not concerned with the question how this "distance" would exhibit itself in practical observation. So far as his own investigations are concerned, he takes care to use the word self-consistently; but it does not necessarily denote the thing which the rest of mankind are accustomed to recognise as the distance of the two points.

To find out any physical quantity we perform certain practical operations followed by calculations; the operations are called experiments or observations according as the conditions are more or less closely under our control. The physical quantity so discovered is primarily the result of the operations and calculations; it is, so to speak, a manufactured article, manufactured by our operations. But the physicist is not generally content to believe that the quantity he arrives at is something whose nature is inseparable from the kind of operations which led to it; he has an idea that if he could become a god contemplating the external world, he would see his manufactured physical quantity forming a distinct feature of the picture.
By finding that he can lay x unit measuring-rods in a line between two points, he has manufactured the quantity x which he calls the distance between the points; but he believes that that distance x is something already existing in the picture of the world, a gulf which would be apprehended by a superior intelligence as existing in itself without reference to the notion of operations with measuring-rods.

Yet he makes curious and apparently illogical discriminations. The parallax of a star is found by a well-known series of operations and calculations; the distance across the room is found by operations with a tape-measure. Both parallax and distance are quantities manufactured by our operations; but for some reason we do not expect parallax to appear as a distinct element in the true picture of nature in the same way that distance does. Or again, instead of cutting short the astronomical calculations when we reach the parallax, we might go on to take the cube of the result, and so obtain another manufactured quantity, a "cubic parallax." For some obscure reason we expect to see distance appearing plainly as a gulf in the true world-picture; parallax does not appear directly, though it can be exhibited as an angle by a comparatively simple construction; and cubic parallax is not in the picture at all. The physicist would say that he finds a length, and manufactures a cubic parallax; but it is only because he has inherited a preconceived theory of the world that he makes the distinction.

We shall venture to challenge this distinction. Distance, parallax and cubic parallax have the same kind of potential existence even when the operations of measurement are not actually made: if you will move sideways you will be able to determine the angular shift, if you will lay measuring-rods in a line to the object you will be able to count their number.
Any one of the three is an indication to us of some existent condition or relation in the world outside us, a condition not created by our operations. But there seems no reason to conclude that this world-condition resembles distance any more closely than it resembles parallax or cubic parallax. Indeed any notion of "resemblance" between physical quantities and the world-conditions underlying them seems to be inappropriate. If the length AB is double the length CD, the parallax of B from A is half the parallax of D from C; there is undoubtedly some world-relation which is different for AB and CD, but there is no reason to regard the world-relation of AB as being better represented by double than by half the world-relation of CD.

The connection of manufactured physical quantities with the existent world-condition can be expressed by saying that the physical quantities are measure-numbers of the world-condition. Measure-numbers may be assigned according to any code, the only requirement being that the same measure-number always indicates the same world-condition and that different world-conditions receive different measure-numbers. Two or more physical quantities may thus be measure-numbers of the same world-condition, but in different codes, e.g. parallax and distance; mass and energy; stellar magnitude and luminosity. The constant formulae connecting these pairs of physical quantities give the relation between the respective codes. But in admitting that physical quantities can be used as measure-numbers of world-conditions existing independently of our operations, we do not alter their status as manufactured quantities. The same series of operations will naturally manufacture the same result when world-conditions are the same, and different results when they are different. (Differences of world-conditions which do not influence the results of experiment and observation are ipso facto excluded from the domain of physical knowledge.)
The size to which a crystal grows may be a measure-number of the temperature of the mother-liquor; but it is none the less a manufactured size, and we do not conclude that the true nature of size is caloric.

The study of physical quantities, although they are the results of our own operations (actual or potential), gives us some kind of knowledge of the world-conditions, since the same operations will give different results in different world-conditions. It seems that this indirect knowledge is all that we can ever attain, and that it is only through its influences on such operations that we can represent to ourselves a "condition of the world." Any attempt to describe a condition of the world otherwise is either mathematical symbolism or meaningless jargon. To grasp a condition of the world as completely as it is in our power to grasp it, we must have in our minds a symbol which comprehends at the same time its influence on the results of all possible kinds of operations. Or, what comes to the same thing, we must contemplate its measures according to all possible measure-codes, of course without confusing the different codes. It might well seem impossible to realise so comprehensive an outlook; but we shall find that the mathematical calculus of tensors does represent and deal with world-conditions precisely in this way. A tensor expresses simultaneously the whole group of measure-numbers associated with any world-condition; and machinery is provided for keeping the various codes distinct. For this reason the somewhat difficult tensor calculus is not to be regarded as an evil necessity in this subject, which ought if possible to be replaced by simpler analytical devices; our knowledge of conditions in the external world, as it comes to us through observation and experiment, is precisely of the kind which can be expressed by a tensor and not otherwise.
And, just as in arithmetic we can deal freely with a billion objects without trying to visualise the enormous collection; so the tensor calculus enables us to deal with the world-condition in the totality of its aspects without attempting to picture it.

Having regard to this distinction between physical quantities and world-conditions, we shall not define a physical quantity as though it were a feature in the world-picture which had to be sought out. A physical quantity is defined by the series of operations and calculations of which it is the result. The tendency to this kind of definition had progressed far even in pre-relativity physics. Force had become "mass × acceleration," and was no longer an invisible agent in the world-picture, at least so far as its definition was concerned. Mass is defined by experiments on inertial properties, no longer as "quantity of matter." But for some terms the older kind of definition (or lack of definition) has been obstinately adhered to; and for these the relativity theory must find new definitions. In most cases there is no great difficulty in framing them. We do not need to ask the physicist what conception he attaches to "length"; we watch him measuring length, and frame our definition according to the operations he performs. There may sometimes be cases in which theory outruns experiment and requires us to decide between two definitions, either of which would be consistent with present experimental practice; but usually we can foresee which of them corresponds to the ideal which the experimentalist has set before himself. For example, until recently the practical man was never confronted with problems of non-Euclidean space, and it might be suggested that he would be uncertain how to construct a straight line when so confronted; but as a matter of fact he showed no hesitation, and the eclipse observers measured without ambiguity the bending of light from the "straight line."
The appropriate practical definition was so obvious that there was never any danger of different people meaning different loci by this term.

Our guiding rule will be that a physical quantity must be defined by prescribing operations and calculations which will lead to an unambiguous result, and that due heed must be paid to existing practice; the last clause should secure that everyone uses the term to denote the same quantity, however much disagreement there may be as to the conception attached to it. When defined in this way, there can be no question as to whether the operations give us the real physical quantity or whether some theoretical correction (not mentioned in the definition) is needed. The physical quantity is the measure-number of a world-condition in some code; we cannot assert that a code is right or wrong, or that a measure-number is real or unreal; what we require is that the code should be the accepted code, and the measure-number the number in current use. For example, what is the real difference of time between two events at distant places? The operation of determining time has been entrusted to astronomers, who (perhaps for mistaken reasons) have elaborated a regular procedure. If the times of the two events are found in accordance with this procedure, the difference must be the real difference of time; the phrase has no other meaning. But there is a certain generalisation to be noticed. In cataloguing the operations of the astronomers, so as to obtain a definition of time, we remark that one condition is adhered to in practice evidently from necessity and not from design: the observer and his apparatus are placed on the earth and move with the earth.
This condition is so accidental and parochial that we are reluctant to insist on it in our definition of time; yet it so happens that the motion of the apparatus makes an important difference in the measurement, and without this restriction the operations lead to no definite result and cannot define anything. We adopt what seems to be the commonsense solution of the difficulty. We decide that time is relative to an observer; that is to say, we admit that an observer on another star, who carries out all the rest of the operations and calculations as specified in our definition, is also measuring time, not our time, but a time relative to himself. The same relativity affects the great majority of elementary physical quantities*; the description of the operations is insufficient to lead to a unique answer unless we arbitrarily prescribe a particular motion of the observer and his apparatus.

In this example we have had a typical illustration of "relativity," the recognition of which has had far-reaching results revolutionising the outlook of physics. Any operation of measurement involves a comparison between a measuring-appliance and the thing measured. Both play an equal part in the comparison and are theoretically, and indeed often practically, interchangeable; for example, the result of an observation with the meridian circle gives the right ascension of the star or the error of the clock indifferently, and we can regard either the clock or the star as the instrument or the object of measurement. Remembering that physical quantities are results of comparisons of this kind, it is clear that they cannot be considered to belong solely to one partner in the comparison.
It is true that we standardise the measuring appliance as far as possible (the method of standardisation being explained or implied in the definition of the physical quantity) so that in general the variability of the measurement can only indicate a variability of the object measured. To that extent there is no great practical harm in regarding the measurement as belonging solely to the second partner in the relation. But even so we have often puzzled ourselves needlessly over paradoxes, which disappear when we realise that the physical quantities are not properties of certain external objects but are relations between these objects and something else. Moreover, we have seen that the standardisation of the measuring-appliance is usually left incomplete, as regards the specification of its motion; and rather than complete it in a way which would be arbitrary and pernicious, we prefer to recognise explicitly that our physical quantities belong not solely to the objects measured but have reference also to the particular frame of motion that we choose.

The principle of relativity goes still further. Even if the measuring-appliances were standardised completely, the physical quantities would still involve the properties of the constant standard. We have seen that the world-condition or object which is surveyed can only be apprehended in our knowledge as the sum total of all the measurements in which it can be concerned; any intrinsic property of the object must appear as a uniformity or law in these measures. When one partner in the comparison is fixed and the other partner varied widely, whatever is common to all the measurements may be ascribed exclusively to the first partner and regarded as an intrinsic property of it.
Let us apply this to the converse comparison; that is to say, keep the measuring-appliance constant or standardised, and vary as widely as possible the objects measured; or, in simpler terms, make a particular kind of measurement in all parts of the field. Intrinsic properties of the measuring-appliance should appear as uniformities or laws in these measures. We are familiar with several such uniformities; but we have not generally recognised them as properties of the measuring-appliance. We have called them laws of nature!

* The most important exceptions are number (of discrete entities), action, and entropy.

The development of physics is progressive, and as the theories of the external world become crystallised, we often tend to replace the elementary physical quantities defined through operations of measurement by theoretical quantities believed to have a more fundamental significance in the external world. Thus the vis viva mv², which is immediately determinable by experiment, becomes replaced by a generalised energy, virtually defined by having the property of conservation; and our problem becomes inverted: we have not to discover the properties of a thing which we have recognised in nature, but to discover how to recognise in nature a thing whose properties we have assigned. This development seems to be inevitable; but it has grave drawbacks especially when theories have to be reconstructed. Fuller knowledge may show that there is nothing in nature having precisely the properties assigned; or it may turn out that the thing having these properties has entirely lost its importance when the new theoretical standpoint is adopted*. When we decide to throw the older theories into the melting-pot and make a clean start, it is best to relegate to the background terminology associated with special hypotheses of physics.
Physical quantities defined by operations of measurement are independent of theory, and form the proper starting-point for any new theoretical development.

Now that we have explained how physical quantities are to be defined, the reader may be surprised that we do not proceed to give the definitions of the leading physical quantities. But to catalogue all the precautions and provisos in the operation of determining even so simple a thing as length, is a task which we shirk. We might take refuge in the statement that the task though laborious is straightforward, and that the practical physicist knows the whole procedure without our writing it down for him. But it is better to be more cautious. I should be puzzled to say off-hand what is the series of operations and calculations involved in measuring a length of 10⁻¹⁵ cm.; nevertheless I shall refer to such a length when necessary as though it were a quantity of which the definition is obvious. We cannot be forever examining our foundations; we look particularly to those places where it is reported to us that they are insecure. I may be laying myself open to the charge that I am doing the very thing I criticise in the older physics: using terms that have no definite observational meaning, and mingling with my physical quantities things which are not the results of any conceivable experimental operation. I would reply: By all means explore this criticism if you regard it as a promising field of inquiry. I here assume that you will probably find me a justification for my 10⁻¹⁵ cm.; but you may find that there is an insurmountable ambiguity in defining it. In the latter event you may be on the track of something which will give a new insight into the fundamental nature of the world. Indeed it has been suspected that the perplexities of quantum phenomena may arise from the tacit assumption that the notions of length and duration, acquired primarily from experiences in which the average effects of large numbers of quanta are involved, are applicable in the study of individual quanta. There may need to be much more excavation before we have brought to light all that is of value in this critical consideration of experimental knowledge. Meanwhile I want to set before you the treasure which has already been unearthed in this field.

* We shall see in § 59 that this has happened in the case of energy. The dead-hand of a superseded theory continues to embarrass us, because in this case the recognised terminology still has implicit reference to it. This, however, is only a slight drawback to set off against the many advantages obtained from the classical generalisation of energy as a step towards the more complete theory.

CHAPTER I
ELEMENTARY PRINCIPLES

1. Indeterminateness of the space-time frame.

It has been explained in the early chapters of Space, Time and Gravitation that observers with different motions use different reckonings of space and time, and that no one of these reckonings is more fundamental than another. Our problem is to construct a method of description of the world in which this indeterminateness of the space-time frame of reference is formally recognised.

Prior to Einstein's researches no doubt was entertained that there existed a "true even-flowing time" which was unique and universal. The moving observer, who adopts a time-reckoning different from the unique true time, must have been deluded into accepting a fictitious time with a fictitious space-reckoning modified to correspond. The compensating behaviour of electromagnetic forces and of matter is so perfect that, so far as present knowledge extends, there is no test which will distinguish the true time from the fictitious. But since there are many fictitious times and, according to this view, only one true time, some kind of distinction is implied although its nature is not indicated.
Those who still insist on the existence of a unique "true time" generally rely on the possibility that the resources of experiment are not yet exhausted and that some day a discriminating test may be found. But the off-chance that a future generation may discover a significance in our utterances is scarcely an excuse for making meaningless noises.

Thus in the phrase true time, "true" is an epithet whose meaning has yet to be discovered. It is a blank label. We do not know what is to be written on the label, nor to which of the apparently indistinguishable time-reckonings it ought to be attached. There is no way of progress here. We return to firmer ground, and note that in the mass of experimental knowledge which has accumulated, the words time and space refer to one of the "fictitious" times and spaces (primarily that adopted by an observer travelling with the earth, or with the sun), and our theory will deal directly with these space-time frames of reference, which are admittedly fictitious or, in the more usual phrase, relative to an observer with particular motion.

The observers are studying the same external events, notwithstanding their different space-time frames. The space-time frame is therefore something overlaid by the observer on the external world; the partitions representing his space and time reckonings are imaginary surfaces drawn in the world like the lines of latitude and longitude drawn on the earth. They do not follow the natural lines of structure of the world, any more than the meridians follow the lines of geological structure of the earth. Such a mesh-system is of great utility and convenience in describing phenomena, and we shall continue to employ it; but we must endeavour not to lose sight of its fictitious and arbitrary nature.
It is evident from experience that a four-fold mesh-system must be used; and accordingly an event is located by four coordinates, generally taken as x, y, z, t. To understand the significance of this location, we first consider the simple case of two dimensions. If we describe the points of a plane figure by their rectangular coordinates x, y, the description of the figure is complete and would enable anyone to construct it; but it is also more than complete, because it specifies an arbitrary element, the orientation, which is irrelevant to the intrinsic properties of the figure and ought to be cast aside from a description of those properties. Alternatively we can describe the figure by stating the distances between the various pairs of points in it; this description is also complete, and it has the merit that it does not prescribe the orientation or contain anything else irrelevant to the intrinsic properties of the figure. The drawback is that it is usually too cumbersome to use in practice for any but the simplest figures.

Similarly our four coordinates x, y, z, t may be expected to contain an arbitrary element, analogous to an orientation, which has nothing to do with the properties of the configuration of events. A different set of values of x, y, z, t may be chosen in which this arbitrary element of the description is altered, but the configuration of events remains unchanged. It is this arbitrariness in coordinate specification which appears as the indeterminateness of the space-time frame. The other method of description, by giving the distances between every pair of events (or rather certain relations between pairs of events which are analogous to distance), contains all that is relevant to the configuration of events and nothing that is irrelevant. By adopting this latter method we can strip away the arbitrary part of the description, leaving only that which has an exact counterpart in the configuration of the external world.
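The two-dimensional comparison above can be checked numerically. The following is only a minimal sketch: the figure, the rotation angle and the helper names `rotate` and `pairwise_distances` are illustrative assumptions, not anything taken from the text. It turns a plane figure through an arbitrary angle and confirms that the coordinates change while the table of mutual distances does not.

```python
import math

def rotate(points, angle):
    """Rotate a list of (x, y) points about the origin by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

def pairwise_distances(points):
    """Distances between every pair of points, listed in a fixed order."""
    return [
        math.dist(points[i], points[j])
        for i in range(len(points))
        for j in range(i + 1, len(points))
    ]

# A hypothetical plane figure; the numbers are arbitrary illustrations.
figure = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0), (1.0, 2.5)]
turned = rotate(figure, 0.7)  # same figure, different orientation

# The coordinate description is altered by the change of orientation ...
coords_changed = any(
    not math.isclose(p[0], q[0]) or not math.isclose(p[1], q[1])
    for p, q in zip(figure, turned)
)
# ... but the description by mutual distances is not.
distances_same = all(
    math.isclose(a, b, abs_tol=1e-12)
    for a, b in zip(pairwise_distances(figure), pairwise_distances(turned))
)
print(coords_changed, distances_same)
```

Four points already need six distances, which hints at why the distance description becomes cumbersome for any but the simplest figures.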
To put the contrast in another form, in our common outlook the idea of position or location seems to be fundamental. From it we derive distance or extension as a subsidiary notion, which covers part but not all of the conceptions which we associate with location. Position is looked upon as the physical fact (a coincidence with what is vaguely conceived of as an identifiable point of space), whereas distance is looked upon as an abstraction or a computational result calculable when the positions are known. The view which we are going to adopt reverses this. Extension (distance, interval) is now fundamental; and the location of an object is a computational result summarising the physical fact that it is at certain intervals from the other objects in the world. Any idea contained in the concept location which is not expressible by reference to distances from other objects, must be dismissed from our minds. Our ultimate analysis of space leads us not to a "here" and a "there," but to an extension such as that which relates "here" and "there." To put the conclusion rather crudely: space is not a lot of points close together; it is a lot of distances interlocked.

Accordingly our fundamental hypothesis is:

Everything connected with location which enters into observational knowledge (everything we can know about the configuration of events) is contained in a relation of extension between pairs of events.

This relation is called the interval, and its measure is denoted by $ds$.

If we have a system S consisting of events A, B, C, D, ..., and a system S' consisting of events A', B', C', D', ..., then the fundamental hypothesis implies that the two systems will be exactly alike observationally if, and only if, all pairs of corresponding intervals in the two systems are equal, AB = A'B', AC = A'C', .... In that case if S and S'
are material systems they will appear to us as precisely similar bodies or mechanisms; or if S and S' correspond to the same material body at different times, it will appear that the body has not undergone any change detectable by observation. But the position, motion, or orientation of the body may be different; that is a change detectable by observation, not of the system S, but of a wider system comprising S and surrounding bodies.

Again let the systems S and S' be abstract coordinate-frames of reference, the events being the corners of the meshes; if all corresponding intervals in the two systems are equal, we shall recognise that the coordinate-frames are of precisely the same kind: rectangular, polar, unaccelerated, rotating, etc.

2. The fundamental quadratic form.

We have to keep side by side the two methods of describing the configurations of events by coordinates and by the mutual intervals, respectively: the first for its conciseness, and the second for its immediate absolute significance. It is therefore necessary to connect the two modes of description by a formula which will enable us to pass readily from one to the other. The particular formula will depend on the coordinates chosen as well as on the absolute properties of the region of the world considered; but it appears that in all cases the formula is included in the following general form:

The interval $ds$ between two neighbouring events with coordinates $(x_1, x_2, x_3, x_4)$ and $(x_1 + dx_1, x_2 + dx_2, x_3 + dx_3, x_4 + dx_4)$ in any coordinate-system is given by

$$\begin{aligned} ds^2 = {} & g_{11}\,dx_1^2 + g_{22}\,dx_2^2 + g_{33}\,dx_3^2 + g_{44}\,dx_4^2 + 2g_{12}\,dx_1 dx_2 + 2g_{13}\,dx_1 dx_3 \\ & + 2g_{14}\,dx_1 dx_4 + 2g_{23}\,dx_2 dx_3 + 2g_{24}\,dx_2 dx_4 + 2g_{34}\,dx_3 dx_4 \end{aligned} \tag{2.1}$$

where the coefficients $g_{11}$, etc. are functions of $x_1, x_2, x_3, x_4$. That is to say, $ds^2$ is some quadratic function of the differences of coordinates.
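As a hedged illustration of formula (2.1), the sketch below evaluates the quadratic form for one set of coordinate-differences. Storing the ten coefficients in a symmetric 4 x 4 table and summing over all index pairs is our own compact restatement (each cross term is then counted twice, which reproduces the factors of 2); the numerical values of `g` and `dx` are hypothetical.

```python
def interval_squared(g, dx):
    """Evaluate ds^2 as the double sum of g[m][k] * dx[m] * dx[k].

    `g` is a symmetric table (g[m][k] == g[k][m]), so every cross term
    appears twice in the sum, giving the factors of 2 seen in (2.1).
    """
    n = len(dx)
    return sum(g[m][k] * dx[m] * dx[k] for m in range(n) for k in range(n))

def independent_coefficients(n):
    """Number of distinct coefficients in a symmetric n x n table."""
    return n * (n + 1) // 2

# Hypothetical constant coefficients, chosen only for illustration.
g = [
    [1.0, 0.2, 0.0, 0.0],
    [0.2, 1.0, 0.0, 0.1],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.1, 0.0, -1.0],
]
dx = [0.01, 0.02, 0.0, 0.03]

# The same quantity written out term by term in the style of (2.1).
explicit = (
    g[0][0] * dx[0] ** 2 + g[1][1] * dx[1] ** 2
    + g[2][2] * dx[2] ** 2 + g[3][3] * dx[3] ** 2
    + 2 * g[0][1] * dx[0] * dx[1] + 2 * g[0][2] * dx[0] * dx[2]
    + 2 * g[0][3] * dx[0] * dx[3] + 2 * g[1][2] * dx[1] * dx[2]
    + 2 * g[1][3] * dx[1] * dx[3] + 2 * g[2][3] * dx[2] * dx[3]
)
print(interval_squared(g, dx), explicit, independent_coefficients(4))
```

In four dimensions `independent_coefficients` gives ten, matching the ten terms on the right-hand side of (2.1).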
This is, of course, not the most general case conceivable; for example, we might have a world in which the interval depended on a general quartic function of the $dx$'s. But, as we shall presently see, the quadratic form (2.1) is definitely indicated by observation as applying to the actual world. Moreover near the end of our task (§ 97) we shall find in the general theory of relation-structure a precise reason why a quadratic function of the coordinate-differences should have this paramount importance.

Whilst the form of the right-hand side of (2.1) is that required by observation, the insertion of $ds^2$ on the left, rather than some other function of $ds$, is merely a convention. The quantity $ds$ is a measure of the interval. It is necessary to consider carefully how measure-numbers are to be affixed to the different intervals occurring in nature. We have seen in the last section that equality of intervals can be tested observationally; but so far as we have yet gone, intervals are merely either equal or unequal, and their differences have not been further particularised. Just as wind-strength may be measured by velocity, or by pressure, or by a number on the Beaufort scale, so the relation of extension between two events could be expressed numerically according to many different plans. To conform to (2.1) a particular code of measure-numbers must be adopted; the nature and advantages of this code will be explained in the next section.

The pure geometry associated with the general formula (2.1) was studied by Riemann, and is generally called Riemannian geometry. It includes Euclidean geometry as a special case.

3. Measurement of intervals.

Consider the operation of proving by measurement that a distance AB is equal to a distance CD. We take a configuration of events LMNOP..., viz. a measuring-scale, and lay it over AB, and observe that A and B coincide with two particular events P, Q (scale-divisions) of the configuration.
We find that the same configuration* can also be arranged so that C and D coincide with P and Q respectively. Further we apply all possible tests to the measuring-scale to see if it has "changed" between the two measurements; and we are only satisfied that the measures are correct if no observable difference can be detected. According to our fundamental axiom, the absence of any observable difference between the two configurations (the structure of the measuring-scale in its two positions) signifies that the intervals are unchanged; in particular the interval between P and Q is unchanged. It follows that the interval A to B is equal to the interval C to D. We consider that the experiment proves equality of distance; but it is primarily a test of equality of interval.

* The logical point may be noticed that the measuring-scale in two positions (necessarily at different times) represents the same configuration of events, not the same events.

In this experiment time is not involved; and we conclude that in space considered apart from time the test of equality of distance is equality of interval. There is thus a one-to-one correspondence of distances and intervals. We may therefore adopt the same measure-number for the interval as is in general use for the distance, thus settling our plan of affixing measure-numbers to intervals. It follows that, when time is not involved, the interval reduces to the distance.

It is for this reason that the quadratic form (2.1) is needed in order to agree with observation, for it is well known that in three dimensions the square of the distance between two neighbouring points is a quadratic function of their infinitesimal coordinate-differences, a result depending ultimately on the experimental law expressed by Euclid I, 47.

When time is involved other appliances are used for measuring intervals.
If we have a mechanism capable of cyclic motion, its cycles will measure equal intervals provided the mechanism, its laws of behaviour, and all relevant surrounding circumstances, remain precisely similar. For the phrase "precisely similar" means that no observable differences can be detected in the mechanism or its behaviour; and that, as we have seen, requires that all corresponding intervals should be equal. In particular the interval between the events marking the beginning and end of the cycle is unaltered. Thus a clock primarily measures equal intervals; it is only under more restricted conditions that it also measures the time-coordinate t.

In general any repetition of an operation under similar conditions, but for a different time, place, orientation and velocity (attendant circumstances which have a relative but not an absolute significance*), tests equality of interval.

It is obvious from common experience that intervals which can be measured with a clock cannot be measured with a scale, and vice versa. We have thus two varieties of intervals, which are provided for in the formula (2.1), since $ds^2$ may be positive or negative and the measure of the interval will accordingly be expressed by a real or an imaginary number. The abbreviated phrase "imaginary interval" must not be allowed to mislead; there is nothing imaginary in the corresponding relation; it is merely that in our arbitrary code an imaginary number is assigned as its measure-number. We might have adopted a different code, and have taken, for example, the antilogarithm of $ds^2$ as the measure of the interval; in that case space-intervals would have received code-numbers from 1 to ∞, and time-intervals numbers from 0 to 1. When we encounter $\sqrt{-1}$ in our investigations, we must remember that it has been introduced by our choice of measure-code, and must not think of it as occurring with some mystical significance in the external world.
* They express relations to events which are not concerned in the test, e.g. to the sun and stars.

4. Rectangular coordinates and time.

Suppose that we have a small region of the world throughout which the $g$'s can be treated as constants*. In that case the right-hand side of (2.1) can be broken up into the sum of four squares, admitting imaginary coefficients if necessary. Thus writing

$$y_1 = a_1 x_1 + a_2 x_2 + a_3 x_3 + a_4 x_4, \qquad y_2 = b_1 x_1 + b_2 x_2 + b_3 x_3 + b_4 x_4, \qquad \text{etc.},$$

so that

$$dy_1 = a_1\,dx_1 + a_2\,dx_2 + a_3\,dx_3 + a_4\,dx_4; \qquad \text{etc.},$$

we can choose the constants $a_1, b_1, \ldots$ so that (2.1) becomes

$$ds^2 = dy_1^2 + dy_2^2 + dy_3^2 + dy_4^2 \tag{4.1}$$

For, substituting for the $dy$'s and comparing coefficients with (2.1), we have only 10 equations to be satisfied by the 16 constants. There are thus many ways of making the reduction. Note, however, that the reduction to the sum of four squares of complete differentials is not in general possible for a large region, where the $g$'s have to be treated as functions, not constants.

Consider all the events for which $y_4$ has some specified value. These will form a three-dimensional world. Since $dy_4$ is zero for every pair of these events, their mutual intervals are given by

$$ds^2 = dy_1^2 + dy_2^2 + dy_3^2 \tag{4.2}$$

But this is exactly like familiar space in which the interval (which we have shown to be the same as the distance for space without time) is given by

$$ds^2 = dx^2 + dy^2 + dz^2 \tag{4.3}$$

where x, y, z are rectangular coordinates. Hence a section of the world by $y_4 = \text{const.}$ will appear to us as space, and $y_1, y_2, y_3$ will appear to us as rectangular coordinates. The coordinate-frames $y_1, y_2, y_3$ and x, y, z are examples of the systems S and S' of § 1, for which the intervals between corresponding pairs of mesh-corners are equal. The two systems are therefore exactly alike observationally; and if one appears to us to be a rectangular frame in space, so also must the other.
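The reduction of (2.1) to the sum of four squares (4.1) can be sketched numerically for constant $g$'s. The method below, successive completion of the square, is only one of the "many ways" the text mentions and is our own choice; the function names and the sample coefficients are hypothetical. It happens to set six of the sixteen constants to zero, which uses up exactly the freedom left over by the ten equations. Complex arithmetic is used because a negative pivot calls for an imaginary coefficient.

```python
import cmath

def reduce_to_squares(g):
    """Return a table A with sum_k A[k][i] * A[k][j] == g[i][j].

    Row k of A holds the constants (a's, b's, ... in the text) defining
    dy_k = sum_j A[k][j] * dx_j. Assumes every pivot met is nonzero.
    """
    n = len(g)
    work = [[complex(v) for v in row] for row in g]
    A = [[0j] * n for _ in range(n)]
    for k in range(n):
        pivot = work[k][k]
        if abs(pivot) < 1e-14:
            raise ValueError("zero pivot: reorder the coordinates")
        root = cmath.sqrt(pivot)  # imaginary when the pivot is negative
        for j in range(k, n):
            A[k][j] = work[k][j] / root
        # Remove the completed square from what remains of the form.
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                work[i][j] -= A[k][i] * A[k][j]
    return A

def transform(A, dx):
    """Compute dy_k = sum_j A[k][j] * dx_j for each k."""
    return [sum(A[k][j] * dx[j] for j in range(len(dx))) for k in range(len(A))]

# Hypothetical constant coefficients and coordinate-differences.
g = [
    [1.0, 0.2, 0.0, 0.0],
    [0.2, 1.0, 0.0, 0.1],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.1, 0.0, -1.0],
]
dx = [0.01, 0.02, 0.0, 0.03]

A = reduce_to_squares(g)
dy = transform(A, dx)
direct = sum(g[m][k] * dx[m] * dx[k] for m in range(4) for k in range(4))
as_squares = sum(v * v for v in dy)  # dy_1^2 + dy_2^2 + dy_3^2 + dy_4^2
print(direct, as_squares)
```

With these sample values the last pivot is negative, so the fourth row of `A` is imaginary; this is the numerical face of "admitting imaginary coefficients if necessary."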
One proviso must be noted; the coordinates $y_1, y_2, y_3$ for real events must be real, as in familiar space, otherwise the resemblance would be only formal.

Granting this proviso, we have reduced the general expression to

$$ds^2 = dx^2 + dy^2 + dz^2 + dy_4^2 \tag{4.4}$$

where x, y, z will be recognised by us as rectangular coordinates in space. Clearly $y_4$ must involve the time, otherwise our location of events by the four coordinates would be incomplete; but we must not too hastily identify it with the time t.

* It will be shown in § 36 that it is always possible to transform the coordinates so that the first derivatives of the $g$'s vanish at a selected point. We shall suppose that this preliminary transformation has already been made, in order that the constancy of the $g$'s may be a valid approximation through as large a region as possible round the selected point.

I suppose that the following would be generally accepted as a satisfactory (pre-relativity) definition of equal time-intervals: if we have a mechanism capable of cyclic motion, its cycles will measure equal durations of time anywhere and anywhen, provided the mechanism, its laws of behaviour, and all outside influences remain precisely similar. To this the relativist would add the condition that the mechanism (as a whole) must be at rest in the space-time frame considered, because it is now known that a clock in motion goes slow in comparison with a fixed clock. The non-relativist does not disagree in fact, though he takes a slightly different view; he regards the proviso that the mechanism must be at rest as already included in his enunciation, because for him motion involves progress through the aether, which (he considers) directly affects the behaviour of the clock, and is one of those "outside influences" which have to be kept "precisely similar."
Since then it is agreed that the mechanism as a whole is to be at rest, and the moving parts return to the same positions after a complete cycle, we shall have for the two events marking the beginning and end of the cycle

$$dx,\ dy,\ dz = 0.$$

Accordingly (4.4) gives for this case

$$ds^2 = dy_4^2.$$

We have seen in § 3 that the cycles of the mechanism in all cases correspond to equal intervals $ds$; hence they correspond to equal values of $dy_4$. But by the above definition of time they also correspond to equal lapses of time $dt$; hence we must have $dy_4$ proportional to $dt$, and we express this proportionality by writing

$$dy_4 = ic\,dt \tag{4.5}$$

where $i = \sqrt{-1}$, and c is a constant. It is, of course, possible that c may be an imaginary number, but provisionally we shall suppose it real. Then (4.4) becomes

$$ds^2 = dx^2 + dy^2 + dz^2 - c^2 dt^2 \tag{4.6}$$

A further discussion is necessary before it is permissible to conclude that (4.6) is the most general possible form for $ds^2$ in terms of ordinary space and time coordinates. If we had reduced (2.1) to the rather more general form

$$ds^2 = dx^2 + dy^2 + dz^2 - c^2 dt^2 - 2c\alpha\,dx\,dt - 2c\beta\,dy\,dt - 2c\gamma\,dz\,dt \tag{4.7}$$

this would have agreed with (4.6) in the only two cases yet discussed, viz. (1) when $dt = 0$, and (2) when $dx, dy, dz = 0$. To show that this more general form is inadmissible we must examine pairs of events which differ both in time and place.

In the preceding pre-relativity definition of t our clocks had to remain stationary and were therefore of no use for comparing time at different places. What did the pre-relativity physicist mean by the difference of time $dt$ between two events at different places? I do not think that we can attach any meaning to his hazy conception of what $dt$ signified; but we know one or two ways in which he was accustomed to determine it. One method which he used was that of transport of chronometers.
Let us examine then what happens when we move a clock from $(x_1, 0, 0)$ at the time $t_1$ to another place $(x_2, 0, 0)$ at the time $t_2$. We have seen that the clock, whether at rest or in motion, provided it remains a precisely similar mechanism, records equal intervals; hence the difference of the clock-readings at the beginning and end of the journey will be proportional to the integrated interval

$$\int_1^2 ds \tag{4.81}$$

If the transport is made in the direct line ($dy = 0$, $dz = 0$), we shall have according to (4.7)

$$-ds^2 = c^2 dt^2 + 2c\alpha\,dx\,dt - dx^2 = c^2 dt^2 \left\{ 1 + \frac{2\alpha}{c}\frac{dx}{dt} - \frac{1}{c^2}\left(\frac{dx}{dt}\right)^2 \right\}.$$

Hence the difference of the clock-readings (4.81) is proportional to

$$\int_{t_1}^{t_2} dt \left( 1 + \frac{2\alpha u}{c} - \frac{u^2}{c^2} \right)^{1/2} \tag{4.82}$$

where $u = dx/dt$, i.e. the velocity of the clock. The integral will not in general reduce to $t_2 - t_1$; so that the difference of time at the two places is not given correctly by the reading of the clock. Even when $\alpha = 0$, the moving clock does not record correct time.

Now introduce the condition that the velocity u is very small, remembering that $t_2 - t_1$ will then become very large. Neglecting $u^2/c^2$, (4.82) becomes

$$\int_{t_1}^{t_2} dt \left( 1 + \frac{\alpha}{c}\frac{dx}{dt} \right) \text{ approximately} = (t_2 - t_1) + \frac{\alpha}{c}(x_2 - x_1).$$

The clock, if moved sufficiently slowly, will record the correct time-difference if, and only if, $\alpha = 0$. Moving it in other directions, we must have, similarly, $\beta = 0$, $\gamma = 0$. Thus (4.6) is the most general formula for the interval, when the time at different places is compared by slow transport of clocks from one place to another.

I do not know how far the reader will be prepared to accept the condition that it must be possible to correlate the times at different places by moving a clock from one to the other with infinitesimal velocity. The method employed in accurate work is to send an electromagnetic signal from one to the other, and we shall see in § 11 that this leads to the same formulae.
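The slow-transport argument can be illustrated by evaluating the integral (4.82) numerically. The sketch below is a rough check under stated assumptions: units with c = 1, a fixed separation of unit length, uniform velocity, and a simple midpoint rule; the names `clock_reading` and `excess` are ours. It prints the amount by which the integral exceeds $t_2 - t_1$ as the journey is made slower and slower, for one nonzero value of α and for α = 0.

```python
import math

def clock_reading(alpha, u_of_t, t1, t2, c=1.0, steps=2000):
    """Midpoint-rule estimate of the integral (4.82).

    The integrand is sqrt(1 + 2*alpha*u/c - u**2/c**2), where u_of_t(t)
    is the velocity of the clock at coordinate time t.
    """
    h = (t2 - t1) / steps
    total = 0.0
    for k in range(steps):
        u = u_of_t(t1 + (k + 0.5) * h)
        total += math.sqrt(1.0 + 2.0 * alpha * u / c - (u / c) ** 2) * h
    return total

def excess(alpha, duration, distance=1.0, c=1.0):
    """Integral (4.82) minus (t2 - t1) for uniform transport over `distance`."""
    u = distance / duration
    return clock_reading(alpha, lambda t: u, 0.0, duration, c) - duration

# Slower and slower journeys over the same separation x2 - x1 = 1.
for duration in (10.0, 100.0, 1000.0):
    print(duration, excess(0.1, duration), excess(0.0, duration))
```

With α = 0.1 the excess settles towards α(x₂ − x₁)/c = 0.1 however slowly the clock is carried, whereas with α = 0 it shrinks towards zero, in line with the conclusion that slow transport gives the correct time-difference only when α vanishes.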
We can scarcely consider that either of these methods of comparing time at different places is an essential part of our primitive notion of time in the same way that measurement at one place by a cyclic mechanism is; therefore they are best regarded as conventional. Let it be understood, however, that although the relativity theory has formulated the convention explicitly, the usage of the word time-difference for the quantity fixed by this convention is in accordance with the long established practice in experimental physics and astronomy.

Setting $\alpha = 0$ in (4.82), we see that the accurate formula for the clock-reading will be

$$\int_{t_1}^{t_2} dt\,(1 - u^2/c^2)^{1/2} = (1 - u^2/c^2)^{1/2}\,(t_2 - t_1) \tag{4.9}$$

for a uniform velocity u. Thus a clock travelling with finite velocity gives too small a reading: the clock goes slow compared with the time-reckoning conventionally adopted.

To sum up the results of this section, if we choose coordinates such that the general quadratic form reduces to

$$ds^2 = dy_1^2 + dy_2^2 + dy_3^2 + dy_4^2 \tag{4.95}$$

then $y_1, y_2, y_3$ and $y_4\sqrt{-1}$ will represent ordinary rectangular coordinates and time. If we choose coordinates for which

$$ds^2 = dy_1^2 + dy_2^2 + dy_3^2 + dy_4^2 + 2\alpha\,dy_1 dy_4 + 2\beta\,dy_2 dy_4 + 2\gamma\,dy_3 dy_4 \tag{4.96}$$

these coordinates also will agree with rectangular coordinates and time so far as the more primitive notions of time are concerned; but the reckoning by this formula of differences of time at different places will not agree with the reckoning adopted in physics and astronomy according to long established practice. For this reason it would only introduce confusion to admit these coordinates as a permissible space and time system.

We who regard all coordinate-frames as equally fictitious structures have no special interest in ruling out the more general form (4.96).
It is not a question of ascribing greater significance to one frame than to another, but of discovering which frame corresponds to the space and time reckoning generally accepted and used in standard works such as the Nautical Almanac.

As far as § 14 our work will be subject to the condition that we are dealing with a region of the world in which the $g$'s are constant, or approximately constant. A region having this property is called flat. The theory of this case is called the "special" theory of relativity; it was discussed by Einstein in 1905, some ten years before the general theory. But it becomes much simpler when regarded as a special case of the general theory, because it is no longer necessary to defend the conditions for its validity as being essential properties of space-time. For a given region these conditions may hold, or they may not. The special theory applies only if they hold; other cases must be referred to the general theory.

5. The Lorentz transformation.

Make the following transformation of coordinates

$$x = \beta\,(x' - ut'), \qquad y = y', \qquad z = z', \qquad t = \beta\,(t' - ux'/c^2), \qquad \beta = (1 - u^2/c^2)^{-1/2} \tag{5.1}$$

where u is any real constant numerically less than c.

We have by (5.1)

$$\begin{aligned} dx^2 - c^2 dt^2 &= \beta^2 \left\{ (dx' - u\,dt')^2 - c^2 (dt' - u\,dx'/c^2)^2 \right\} \\ &= \beta^2 \left\{ \left(1 - \frac{u^2}{c^2}\right) dx'^2 - (c^2 - u^2)\,dt'^2 \right\} \\ &= dx'^2 - c^2 dt'^2. \end{aligned}$$

Hence from (4.6)

$$ds^2 = dx^2 + dy^2 + dz^2 - c^2 dt^2 = dx'^2 + dy'^2 + dz'^2 - c^2 dt'^2 \tag{5.2}$$

The accented and unaccented coordinates give the same formula for the interval, so that the intervals between corresponding pairs of mesh-corners will be equal, and therefore in all observable respects they will be alike. We shall recognise x', y', z' as rectangular coordinates in space, and t' as the associated time. We have thus arrived at another possible way of reckoning space and time, another fictitious space-time frame, equivalent in all its properties to the original one.
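The invariance expressed by (5.2) is easy to confirm for particular numbers. The following is a small sketch, assuming units with c = 1 and arbitrarily chosen coordinate-differences; the function names are ours. Because the transformation (5.1) is linear, the differences dx', dy', dz', dt' transform in the same way as the coordinates themselves, so one function serves for both.

```python
import math

def lorentz(xp, yp, zp, tp, u, c=1.0):
    """Apply (5.1): accented values (x', y', z', t') to unaccented (x, y, z, t)."""
    beta = 1.0 / math.sqrt(1.0 - (u / c) ** 2)  # the factor called beta in (5.1)
    return (beta * (xp - u * tp), yp, zp, beta * (tp - u * xp / c ** 2))

def interval_squared(d, c=1.0):
    """ds^2 = dx^2 + dy^2 + dz^2 - c^2 dt^2, as in (4.6)."""
    dx, dy, dz, dt = d
    return dx * dx + dy * dy + dz * dz - (c * dt) ** 2

# Hypothetical coordinate-differences between two events, in the accented frame.
d_accented = (0.3, -0.2, 0.5, 0.4)
d_unaccented = lorentz(*d_accented, u=0.6)

print(interval_squared(d_accented), interval_squared(d_unaccented))
```

The individual differences dx and dt are altered by the transformation, but the two printed values of ds² agree to rounding error.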
For convenience we say that the first reckoning is that of an observer S and the second that of an observer S', both observers being at rest in their respective spaces*.

The constant u is easily interpreted. Since S is at rest in his own space, his location is given by $x = \text{const.}$ By (5.1) this becomes, in S''s coordinates, $x' - ut' = \text{const.}$; that is to say, S is travelling in the x'-direction with velocity u. Accordingly the constant u is interpreted as the velocity of S relative to S'.

It does not follow immediately that the velocity of S' relative to S is $-u$; but this can be proved by algebraical solution of the equations (5.1) to determine x', y', z', t'. We find

$$x' = \beta\,(x + ut), \qquad y' = y, \qquad z' = z, \qquad t' = \beta\,(t + ux/c^2) \tag{5.3}$$

showing that an interchange of S and S' merely reverses the sign of u.

The essential property of the foregoing transformation is that it leaves the formula for $ds^2$ unaltered (5.2), so that the coordinate-systems which it connects are alike in their properties. Looking at the matter more generally, we have already noted that the reduction to the sum of four squares can be made in many ways, so that we can have

$$ds^2 = dy_1^2 + dy_2^2 + dy_3^2 + dy_4^2 = dy_1'^2 + dy_2'^2 + dy_3'^2 + dy_4'^2 \tag{5.4}$$

* This is partly a matter of nomenclature. A sentient observer can force himself to "recollect that he is moving" and so adopt a space in which he is not at rest; but he does not so readily adopt the time which properly corresponds; unless he uses the space-time frame in which he is at rest, he is likely to adopt a hybrid space-time which leads to inconsistencies. There is no ambiguity if the "observer" is regarded as merely an involuntary measuring apparatus, which by the principles of § 4 naturally partitions a space and time with respect to which it is at rest.
The determination of the necessary connection between any two sets of coordinates satisfying this equation is a problem of pure mathematics; we can use freely the conceptions of four-dimensional geometry and imaginary rotations to find this connection, whether the conceptions have any physical significance or not. We see from (5.4) that $ds$ is the distance between two points in four-dimensional Euclidean space, the coordinates $(y_1, y_2, y_3, y_4)$ and $(y_1', y_2', y_3', y_4')$ being rectangular systems (real or imaginary) in that space. Accordingly these coordinates are related by the general transformations from one set of rectangular axes to another in four dimensions, viz. translations and rotations. Translation, or change of origin, need not detain us; nor need a rotation of the space-axes $(y_1, y_2, y_3)$ leaving time unaffected. The interesting case is a rotation in which $y_4$ is involved, typified by

$$y_1 = y_1' \cos\theta - y_4' \sin\theta, \qquad y_4 = y_1' \sin\theta + y_4' \cos\theta.$$

Writing $u = ic \tan\theta$, so that $\beta = \cos\theta$, this leads to the Lorentz transformation (5.1).

Thus, apart from obvious trivial changes of axes, the Lorentz transformations are the only ones which leave the form (4.6) unaltered.

Historically this transformation was first obtained for the particular case of electromagnetic equations. Its more general character was pointed out by Einstein in 1905.

6. The velocity of light.

Consider a point moving along the x-axis whose velocity measured by S' is $v'$, so that

$$v' = \frac{dx'}{dt'} \tag{6.1}$$

Then by (5.1) its velocity measured by S is

$$v = \frac{dx}{dt} = \frac{\beta\,(dx' - u\,dt')}{\beta\,(dt' - u\,dx'/c^2)} = \frac{v' - u}{1 - uv'/c^2} \quad \text{by (6.1)} \tag{6.2}$$

In non-relativity kinematics we should have taken it as axiomatic that $v = v' - u$.

If two points move relatively to S' with equal velocities in opposite directions $+v'$ and $-v'$, their velocities relative to S are

$$\frac{v' - u}{1 - uv'/c^2} \qquad \text{and} \qquad -\frac{v' + u}{1 + uv'/c^2}.$$

As we should expect, these speeds are usually unequal; but there is an exceptional case when $v' = c$.
The speeds relative to S are then also equal, both in fact being equal to c.

Again it follows from (5.2) that when ...

7. Timelike and spacelike intervals.

We make a slight change of notation, the quantity hitherto denoted by $ds^2$ being in all subsequent formulae replaced by $-ds^2$, so that (4.6) becomes

$$ds^2 = c^2 dt^2 - dx^2 - dy^2 - dz^2 \tag{7.1}$$

There is no particular advantage in this change of sign; it is made in order to conform to the customary notation.

The formula may give either positive or negative values of $ds^2$, so that the interval between real events may be a real or an imaginary number. We call real intervals timelike, and imaginary intervals spacelike.

From (7.1)

$$\left(\frac{ds}{dt}\right)^2 = c^2 - \left(\frac{dx}{dt}\right)^2 - \left(\frac{dy}{dt}\right)^2 - \left(\frac{dz}{dt}\right)^2 = c^2 - v^2 \tag{7.2}$$

where v is the velocity of a point describing the track along which the interval lies. The interval is thus real or imaginary according as v is less than or greater than c. Assuming that a material particle cannot travel faster than light, the intervals along its track must be timelike. We ourselves are limited by material bodies and therefore can only have direct experience of timelike intervals. We are immediately aware of the passage of time without the use of our external senses; but we have to infer from our sense perceptions the existence of spacelike intervals outside us.

From any event x, y, z, t, intervals radiate in all directions to other events; and the real and imaginary intervals are separated by the cone

$$0 = c^2 dt^2 - dx^2 - dy^2 - dz^2,$$

which is called the null-cone. Since light travels with velocity c, the track of any light-pulse proceeding from the event lies on the null-cone. When the $g$'s are not constants and the fundamental quadratic form is not reducible to (7.1), there is still a null-surface, given by $ds = 0$ in (2.1), which separates the timelike and spacelike intervals.
There can be little doubt that in this case also the light-tracks lie on the null-surface, but the property is perhaps scarcely self-evident, and we shall have to justify it in more detail later.

The formula (6.2) for the composition of velocities in the same straight line may be written

$$\tanh^{-1} (v/c) = \tanh^{-1} (v'/c) - \tanh^{-1} (u/c) \tag{7.3}$$

The quantity $\tanh^{-1} (v/c)$ has been called by Robb the rapidity corresponding to the velocity v. Thus (7.3) shows that relative rapidities in the same direction compound according to the simple addition-law. Since $\tanh^{-1} 1 = \infty$, the velocity of light corresponds to infinite rapidity. We cannot reach infinite rapidity by adding any finite number of finite rapidities; therefore we cannot reach the velocity of light by compounding any finite number of relative velocities less than that of light.

There is an essential discontinuity between speeds greater than and less than that of light which is illustrated by the following example. If two points move in the same direction with velocities

$$v_1 = c + \epsilon, \qquad v_2 = c - \epsilon$$

respectively, their relative velocity is by (6.2)

$$\frac{v_1 - v_2}{1 - v_1 v_2/c^2} = \frac{2\epsilon}{1 - (c^2 - \epsilon^2)/c^2} = \frac{2c^2}{\epsilon},$$

which tends to infinity as $\epsilon$ is made infinitely small! If the fundamental velocity is exactly 300,000 km. per sec., and two points move in the same direction with speeds of 300,001 and 299,999 km. per sec., the speed of one relative to the other is 180,000,000,000 km. per sec. The barrier at 300,000 km. per sec. is not to be crossed by approaching it. A particle which is aiming to reach a speed of 300,001 km. per sec. might naturally hope to attain its object by continually increasing its speed; but when it has reached 299,999 km. per sec., and takes stock of the position, it sees its goal very much farther off than when it started.

A particle of matter is a structure whose linear extension is timelike. We might perhaps imagine an analogous structure ranged along a spacelike track.
That would be an attempt to picture a particle travelling with a velocity greater than that of light; but since the structure would differ fundamentally from matter as known to us, there seems no reason to think that it would be recognised by us as a particle of matter, even if its existence were possible. For a suitably chosen observer a spacelike track can lie wholly in an instantaneous space. The structure would exist along a line in space at one moment; at preceding and succeeding moments it would be non-existent. Such instantaneous intrusions must profoundly modify the continuity of evolution from past to future. In default of any evidence of the existence of these spacelike particles we shall assume that they are impossible structures.

8. Immediate consciousness of time.

Our minds are immediately aware of a "flight of time" without the intervention of external senses. Presumably there are more or less cyclic processes occurring in the brain, which play the part of a material clock, whose indications the mind can read. The rough measures of duration made by the internal time-sense are of little use for scientific purposes, and physics is accustomed to base time-reckoning on more precise external mechanisms. It is, however, desirable to examine the relation of this more primitive notion of time to the scheme developed in physics.

Much confusion has arisen from a failure to realise that time as currently used in physics and astronomy deviates widely from the time recognised by the primitive time-sense. In fact the time of which we are immediately conscious is not in general physical time, but the more fundamental quantity which we have called interval (confined, however, to timelike intervals).

Our time-sense is not concerned with events outside our brains; it relates only to the linear chain of events along our own track through the world.
We may learn from another observer similar information as to the time-succession of events along his track. Further we have inanimate observers (clocks) from which we may obtain similar information as to their local time-successions. The combination of these linear successions along different tracks into a complete ordering of the events in relation to one another is a problem that requires careful analysis, and is not correctly solved by the haphazard intuitions of pre-relativity physics. Recognising that both clocks and time-sense measure $ds$ between pairs of events along their respective tracks, we see that the problem reduces to that which we have already been studying, viz. to pass from a description in terms of intervals between pairs of events to a description in terms of coordinates.

The external events which we see appear to fall into our own local time-succession; but in reality it is not the events themselves, but the sense-impressions to which they indirectly give rise, which take place in the time-succession of our consciousness. The popular outlook does not trouble to discriminate between the external events themselves and the events constituted by their light-impressions on our brains; and hence events throughout the universe are crudely located in our private time-sequence. Through this confusion the idea has arisen that the instants of which we are conscious extend so as to include external events, and are world-wide; and the enduring universe is supposed to consist of a succession of instantaneous states.

This crude view was disproved in 1675 by Römer's celebrated discussion of the eclipses of Jupiter's satellites; and we are no longer permitted to locate external events in the instant of our visual perception of them. The whole foundation of the idea of world-wide instants was destroyed 250 years ago, and it seems strange that it should still survive in current physics.
But, as so often happens, the theory was patched up although its original raison d'être had vanished. Obsessed with the idea that the external events had to be put somehow into the instants of our private consciousness, the physicist succeeded in removing the pressing difficulties by placing them not in the instant of visual perception but in a suitable preceding instant. Physics borrowed the idea of world-wide instants from the rejected theory, and constructed mathematical continuations of the instants in the consciousness of the observer, making in this way time-partitions throughout the four-dimensional world. We need have no quarrel with this very useful construction which gives physical time. We only insist that its artificial nature should be recognised, and that the original demand for a world-wide time arose through a mistake. We should probably have had to invent universal time-partitions in any case in order to obtain a complete mesh-system; but it might have saved confusion if we had arrived at it as a deliberate invention instead of an inherited misconception.

If it is found that physical time has properties which would ordinarily be regarded as contrary to common sense, no surprise need be felt; this highly technical construct of physics is not to be confounded with the time of common sense. It is important for us to discover the exact properties of physical time; but those properties were put into it by the astronomers who invented it.

9. The "3 + 1 dimensional" world.

The constant $c^2$ in (7.1) is positive according to experiments made in regions of the world accessible to us. The 3 minus signs with 1 plus sign particularise the world in a way which we could scarcely have predicted from first principles. H. Weyl expresses this specialisation by saying that the world is 3 + 1 dimensional. Some entertainment may be derived by considering the properties of a 2 + 2 or a 4 + 0 dimensional world.
A more serious question is, Can the world change its type? Is it possible that in making the reduction of (2.1) to the sum or difference of squares for some region remote in space or time, we might have 4 minus signs? I think not; because if the region exists it must be separated from our 3 + 1 dimensional region by some boundary. On one side of the boundary we have

$$ds^2 = -dx^2 - dy^2 - dz^2 + c_1^2\,dt^2,$$

and on the other side

$$ds^2 = -dx^2 - dy^2 - dz^2 - c_2^2\,dt^2.$$

The transition can only occur through a boundary where

$$ds^2 = -dx^2 - dy^2 - dz^2 + 0\,dt^2,$$

so that the fundamental velocity is zero. Nothing can move at the boundary, and no influence can pass from one side to another. The supposed region beyond is thus not in any spatio-temporal relation to our own universe, which is a somewhat pedantic way of saying that it does not exist.

This barrier is more formidable than that which stops the passage of light round the world in de Sitter's spherical space-time (Space, Time and Gravitation, p. 160). The latter stoppage was relative to the space and time of a distant observer; but everything went on normally with respect to the space and time of an observer at the region itself. But here we are contemplating a barrier which does not recede as it is approached.

The passage to a 2 + 2 dimensional world would occur through a transition region where

$$ds^2 = -dx^2 - dy^2 + 0\,dz^2 + c^2\,dt^2.$$

Space here reduces to two dimensions, but there does not appear to be any barrier. The conditions on the far side, where time becomes two-dimensional, defy imagination.

10. The FitzGerald contraction.

We shall now consider some of the consequences deducible from the Lorentz transformation. The first equation of (5.3) may be written

$$x'/\beta = x + ut,$$

which shows that $S$, besides making the allowance $ut$ for the motion of his origin, divides by $\beta$ all lengths in the $x$-direction measured by $S'$.
On the other hand the equation $y' = y$ shows that $S$ accepts $S'$'s measures in directions transverse to their relative motion. Let $S'$ take his standard metre (at rest relative to him, and therefore moving relative to $S$) and point it first in the transverse direction $y'$ and then in the longitudinal direction $x'$. For $S'$ its length is 1 metre in each position, since it is his standard; for $S$ the length is 1 metre in the transverse position and $1/\beta$ metres in the longitudinal position. Thus $S$ finds that a moving rod contracts when turned from the transverse to the longitudinal position.

The question remains, How does the length of this moving rod compare with the length of a similarly constituted rod at rest relative to $S$? The answer is that the transverse dimensions are the same whilst the longitudinal dimensions are contracted. We can prove this by a reductio ad absurdum. For suppose that a rod moving transversely were longer than a similar rod at rest. Take two similar transverse rods $A$ and $A'$ at rest relatively to $S$ and $S'$ respectively. Then $S$ must regard $A'$ as the longer, since it is moving relatively to him; and $S'$ must regard $A$ as the longer, since it is moving relatively to him. But this is impossible since, according to the equation $y = y'$, $S$ and $S'$ agree as to transverse measures.

We see that the Lorentz transformation (5.1) requires that $(x, y, z, t)$ and $(x', y', z', t')$ should be measured with standards of identical material constitution, but moving respectively with $S$ and $S'$. This was really implicit in our deduction of the transformation, because the property of the two systems is that they give the same formula (5.2) for the interval; and the test of complete similarity of the standards is equality of all corresponding intervals occurring in them.

The fourth equation of (5.1) is

$$t = \beta\,(t' - ux'/c^2).$$

Consider a clock recording the time $t'$, which accordingly is at rest in $S'$'s system ($x' = \text{const.}$).
Then for any time-lapse by this clock, we have

$$\delta t = \beta\,\delta t',$$

since $\delta x' = 0$. That is to say, $S$ does not accept the time as recorded by this moving clock, but multiplies its readings by $\beta$, as though the clock were going slow. This agrees with the result already found in (4.9).

It may seem strange that we should be able to deduce the contraction of a material rod and the retardation of a material clock from the general geometry of space and time. But it must be remembered that the contraction and retardation do not imply any absolute change in the rod and clock. The "configuration of events" constituting the four-dimensional structure which we call a rod is unaltered; all that happens is that the observer's space and time partitions cross it in a different direction.

Further we make no prediction as to what would happen to the rod set in motion in an actual experiment. There may or may not be an absolute change of the configuration according to the circumstances by which it is set in motion. Our results apply to the case in which the rod after being set in motion is (according to all experimental tests) found to be similar to the rod in its original state of rest*.

When a number of phenomena are connected together it becomes somewhat arbitrary to decide which is to be regarded as the explanation of the others. To many it will seem easier to regard the strange property of the fundamental velocity as explained by these differences of behaviour of the observers' clocks and scales. They would say that the observers arrive at the same value of the velocity of light because they omit the corrections which would allow for the different behaviour of their measuring-appliances. That is the relative point of view, in which the relative quantities, length, time, etc., are taken as fundamental.
From the absolute point of view, which has regard to intervals only, the standards of the two observers are equal and behave similarly; the so-called explanations of the invariance of the velocity of light only lead us away from the root of the matter.

Moreover the recognition of the FitzGerald contraction does not enable us to avoid paradox. From (5.3) we found that $S'$'s longitudinal measuring-rods were contracted relatively to those of $S$. From (5.1) we can show similarly that $S$'s rods are contracted relatively to those of $S'$. There is complete reciprocity between $S$ and $S'$. This paradox is discussed more fully in Space, Time and Gravitation, p. 55.

* It may be impossible to change the motion of a rod without causing a rise of temperature. Our conclusions will then not apply until the temperature has fallen again, i.e. until the temperature-test shows that the rod is precisely similar to the rod before the change of motion.

11. Simultaneity at different places.

It will be seen from the fourth equation of (5.1), viz.

$$t = \beta\,(t' - ux'/c^2),$$

that events at different places which are simultaneous for $S'$ are not in general simultaneous for $S$. In fact, if $dt' = 0$,

$$dt = -\beta u\,dx'/c^2 \tag{11.1}$$

It is of some interest to examine in detail how this difference of reckoning of simultaneity arises. It has been explained in § 4 that by convention the time at two places is compared by transporting a clock from one to the other with infinitesimal velocity. Our formulae are based on this convention; and, of course, (11.1) will only be true if the convention is adhered to. The fact that infinitesimal velocity relative to $S'$ is not the same as infinitesimal velocity relative to $S$, leaves room for the discrepancy of reckoning of simultaneity to creep in. Consider two points $A$ and $B$ at rest relative to $S'$, and distant $x'$ apart. Take a clock at $A$ and move it gently to $B$ by giving it an infinitesimal velocity $du'$ for a time $x'/du'$.
Owing to the motion, the clock will by (4.9) be retarded in the ratio $(1 - du'^2/c^2)^{1/2}$; this continues for a time $x'/du'$ and the total loss is thus

$$\{1 - (1 - du'^2/c^2)^{1/2}\}\,x'/du',$$

which tends to zero when $du'$ is infinitely small. $S'$ may accordingly accept the result of the comparison without applying any correction for the motion of the clock.

Now consider $S$'s view of this experiment. For him the clock had already a velocity $u$, and accordingly the time indicated by the clock is only $(1 - u^2/c^2)^{1/2}$ of the true time for $S$. By differentiation, an additional velocity $du$* causes a supplementary loss

$$(1 - u^2/c^2)^{-1/2}\,u\,du/c^2 \ \text{clock seconds} \tag{11.2}$$

per true second. Owing to the FitzGerald contraction of the length $AB$, the distance to be travelled is $x'/\beta$, and the journey will occupy a time

$$x'/(\beta\,du) \ \text{true seconds} \tag{11.3}$$

Multiplying (11.2) and (11.3), the total loss due to the journey is $ux'/c^2$ clock seconds, or

$$\beta u x'/c^2 \ \text{true seconds for } S \tag{11.4}$$

Thus, whilst $S'$ accepts the uncorrected result of the comparison, $S$ has to apply a correction $\beta u x'/c^2$ for the disturbance of the chronometer through transport. This is precisely the difference of their reckonings of simultaneity given by (11.1).

In practice an accurate comparison of time at different places is made, not by transporting chronometers, but by electromagnetic signals, usually wireless time-signals for places on the earth, and light-signals for places in the solar system or stellar universe. Take two clocks at $A$ and $B$, respectively. Let a signal leave $A$ at clock-time $t_1$, reach $B$ at time $t_B$ by the clock at $B$, and be reflected to reach $A$ again at time $t_2$. The observer $S'$, who is at rest relatively to the clocks, will conclude that the instant $t_B$ at $B$ was simultaneous with the instant $\tfrac{1}{2}(t_1 + t_2)$ at $A$, because he assumes that the forward velocity of light is equal to the backward velocity.
But for $S$ the two clocks are moving with velocity $u$; therefore he calculates that the outward journey will occupy a time $x/(c - u)$ and the homeward journey a time $x/(c + u)$. Now

$$\frac{x}{c-u} = \frac{x(c+u)}{c^2-u^2} = \frac{\beta^2 x}{c^2}(c+u), \qquad \frac{x}{c+u} = \frac{x(c-u)}{c^2-u^2} = \frac{\beta^2 x}{c^2}(c-u).$$

Thus the instant $t_B$ of arrival at $B$ must be taken as $\beta^2 x u/c^2$ later than the half-way instant $\tfrac{1}{2}(t_1 + t_2)$. This correction applied by $S$, but not by $S'$, agrees with (11.4) when we remember that owing to the FitzGerald contraction $x = x'/\beta$.

* Note that $du$ will not be equal to $du'$.

Thus the same difference in the reckoning of simultaneity by $S$ and $S'$ appears whether we use the method of transport of clocks or of light-signals. In either case a convention is introduced as to the reckoning of time-differences at different places; this convention takes in the two methods the alternative forms:

(1) A clock moved with infinitesimal velocity from one place to another continues to read the correct time at its new station, or
(2) The forward velocity of light along any line is equal to the backward velocity*.

Neither statement is by itself a statement of observable fact, nor does it refer to any intrinsic property of clocks or of light; it is simply an announcement of the rule by which we propose to extend fictitious time-partitions through the world. But the mutual agreement of the two statements is a fact which could be tested by observation, though owing to the obvious practical difficulties it has not been possible to verify it directly. We have here given a theoretical proof of the agreement, depending on the truth of the fundamental axiom of § 1.

The two alternative forms of the convention are closely connected.
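The agreement just proved between the clock-transport correction (11.4) and the light-signal correction can also be checked numerically. The following Python sketch is only an illustration: the function names and the default choice of units with $c = 1$ are assumptions made here, not part of the text.

```python
import math

def beta(u, c=1.0):
    """FitzGerald factor (1 - u^2/c^2)^(-1/2) for relative velocity u."""
    return 1.0 / math.sqrt(1.0 - (u / c) ** 2)

def transport_correction(x_prime, u, c=1.0):
    """Correction (11.4) applied by S, in true seconds: beta * u * x' / c^2."""
    return beta(u, c) * u * x_prime / c ** 2

def light_signal_correction(x_prime, u, c=1.0):
    """Amount by which, for S, arrival at B is later than the half-way instant."""
    x = x_prime / beta(u, c)      # length AB as contracted for S
    outward = x / (c - u)         # signal overtaking the receding clock B
    homeward = x / (c + u)        # reflected signal meeting the advancing clock A
    halfway = 0.5 * (outward + homeward)
    return outward - halfway

if __name__ == "__main__":
    # Hypothetical sample values: x' = 1 and several velocities u (with c = 1).
    for u in (0.1, 0.5, 0.9):
        print(u, transport_correction(1.0, u), light_signal_correction(1.0, u))
```

For each velocity the two printed corrections should coincide to rounding error, which is the content of the agreement between the two forms of the convention.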
In general, in any system of time-reckoning, a change $du$ in the velocity of a clock involves a change of rate proportional to $du$, but there is a certain turning-point for which the change of rate is proportional to $du^2$. In adopting a time-reckoning such that this stationary point corresponds to his own motion, the observer is imposing a symmetry on space and time with respect to himself, which may be compared with the symmetry imposed in assuming a constant velocity of light in all directions. Analytically we imposed the same general symmetry by adopting (4.6) instead of (4.7) as the form for $ds^2$, making our space-time reckoning symmetrical with respect to the interval and therefore with respect to all observational criteria.

* The chief case in which we require for practical purposes an accurate convention as to the reckoning of time at places distant from the earth, is in calculating the elements and mean places of planets and comets. In these computations the velocity of light in any direction is taken to be 300,000 km. per sec., an assumption which rests on the convention (2). All experimental methods of measuring the velocity of light determine only an average to-and-fro velocity.

12. Momentum and mass.

Besides possessing extension in space and time, matter possesses inertia. We shall show in due course that inertia, like extension, is expressible in terms of the interval relation; but that is a development belonging to a later stage of our theory. Meanwhile we give an elementary treatment based on the empirical laws of conservation of momentum and energy rather than on any deep-seated theory of the nature of inertia.

For the discussion of space and time we have made use of certain ideal apparatus which can only be imperfectly realised in practice: rigid scales and perfect cyclic mechanisms or clocks, which always remain similar configurations from the absolute point of view.
Similarly for the discussion of inertia we require some ideal material object, say a perfectly elastic billiard ball, whose condition as regards inertial properties remains constant from an absolute point of view. The difficulty that actual billiard balls are not perfectly elastic must be surmounted in the same way as the difficulty that actual scales are not rigid. To the ideal billiard ball we can affix a constant number, called the invariant mass*, which will denote its absolute inertial properties; and this number is supposed to remain unaltered throughout the vicissitudes of its history, or, if temporarily disturbed during a collision, is restored at the times when we have to examine the state of the body.

With the customary definition of momentum, the components

$$M\frac{dx}{dt}, \quad M\frac{dy}{dt}, \quad M\frac{dz}{dt} \tag{12.1}$$

cannot satisfy a general law of conservation of momentum unless the mass $M$ is allowed to vary with the velocity. But with the slightly modified definition

$$m\frac{dx}{ds}, \quad m\frac{dy}{ds}, \quad m\frac{dz}{ds} \tag{12.2}$$

the law of conservation can be satisfied simultaneously in all space-time systems, $m$ being an invariant number. This was shown in Space, Time and Gravitation, p. 142.

Comparing (12.1) and (12.2), we have

$$M = m\frac{dt}{ds} \tag{12.3}$$

We call $m$ the invariant mass, and $M$ the relative mass, or simply the mass.

The term "invariant" signifies unchanged for any transformation of coordinates, and, in particular, the same for all observers; constancy during the life-history of the body is an additional property of $m$ attributed to our ideal billiard balls, but not assumed to be true for matter in general.

Choosing units of length and time so that the velocity of light is unity, we have by (7.2)

$$\frac{ds}{dt} = (1 - v^2)^{1/2}.$$

Hence by (12.3)

$$M = m\,(1 - v^2)^{-1/2} \tag{12.4}$$

The mass increases with the velocity by the same factor as that which gives the FitzGerald contraction; and when $v = 0$, $M = m$. The invariant mass is thus equal to the mass at rest.
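As a numerical illustration of (12.4), the following Python sketch evaluates the relative mass and the corresponding momentum for a few speeds. The function names `relative_mass` and `momentum`, and the optional argument `c` (defaulting to units in which the velocity of light is unity), are illustrative assumptions and not notation from the text.

```python
import math

def relative_mass(m, v, c=1.0):
    """Relative mass M = m (1 - v^2/c^2)^(-1/2); reduces to (12.4) when c = 1."""
    if abs(v) >= c:
        raise ValueError("speed must be less than the fundamental velocity c")
    return m / math.sqrt(1.0 - (v / c) ** 2)

def momentum(m, v, c=1.0):
    """Momentum M v, using the relative mass M for invariant mass m."""
    return relative_mass(m, v, c) * v

if __name__ == "__main__":
    # Hypothetical sample: invariant mass 1 at increasing speeds (c = 1).
    for v in (0.0, 0.1, 0.6, 0.9, 0.99):
        print(v, relative_mass(1.0, v), momentum(1.0, v))
```

At $v = 0$ the function returns $m$ itself, and the value grows without limit as $v$ approaches the fundamental velocity.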
It is natural to extend (12.2) by adding a fourth component, thus

$$m\frac{dx}{ds}, \quad m\frac{dy}{ds}, \quad m\frac{dz}{ds}, \quad m\frac{dt}{ds} \tag{12.5}$$

By (12.3) the fourth component is equal to $M$. Thus the momenta and mass (relative mass) form together a symmetrical expression, the momenta being space-components, and the mass the time-component. We shall see later that the expression (12.5) constitutes a vector, and the laws of conservation of momentum and mass assert the conservation of this vector.

* Or proper-mass.

The following is an analytical proof of the law of variation of mass with velocity directly from the principle of conservation of mass and momentum. Let $M_1$, $M_1'$ be the mass of a body as measured by $S$ and $S'$ respectively, $v_1$, $v_1'$ being its velocity in the $x$-direction. Writing

$$\beta_1 = (1 - v_1^2/c^2)^{-1/2}, \quad \beta_1' = (1 - v_1'^2/c^2)^{-1/2}, \quad \beta = (1 - u^2/c^2)^{-1/2},$$

we can easily verify from (6.2) that

$$\beta_1 v_1 = \beta\beta_1'\,(v_1' - u) \tag{12.6}$$

Let a number of such particles be moving in a straight line subject to the conservation of mass and momentum as measured by $S'$, viz. $\Sigma M_1'$ and $\Sigma M_1' v_1'$ are conserved. Since $\beta$ and $u$ are constants it follows that $\Sigma M_1'\beta\,(v_1' - u)$ is conserved. Therefore by (12.6)

$$\Sigma M_1'\beta_1 v_1/\beta_1' \ \text{is conserved} \tag{12.71}$$

But since momentum must also be conserved for the observer $S$

$$\Sigma M_1 v_1 \ \text{is conserved} \tag{12.72}$$

The results (12.71) and (12.72) will agree if

$$M_1/\beta_1 = M_1'/\beta_1',$$

and it is easy to see that there can be no other general solution. Hence for different values of $v_1$, $M_1$ is proportional to $\beta_1$, or

$$M = m\,(1 - v^2/c^2)^{-1/2},$$

where $m$ is a constant for the body.

It requires a greater impulse to produce a given change of velocity $\delta v$ in the original direction of motion than to produce an equal change $\delta w$ at right angles to it. For the momenta in the two directions are initially

$$mv\,(1 - v^2/c^2)^{-1/2}, \quad 0,$$

and after a change $\delta v$, $\delta w$, they become

$$m\,(v + \delta v)\,[1 - \{(v + \delta v)^2 + (\delta w)^2\}/c^2]^{-1/2}, \quad m\,\delta w\,[1 - \{(v + \delta v)^2 + (\delta w)^2\}/c^2]^{-1/2}.$$
Hence to the first order in $\delta v$, $\delta w$ the changes of momentum are

$$m\,(1 - v^2/c^2)^{-3/2}\,\delta v, \quad m\,(1 - v^2/c^2)^{-1/2}\,\delta w,$$

or

$$M\beta^2\,\delta v, \quad M\,\delta w,$$

where $\beta$ is the FitzGerald factor for velocity $v$. The coefficient $M\beta^2$ was formerly called the longitudinal mass, $M$ being the transverse mass; but the longitudinal mass is of no particular importance in the general theory, and the term is dropping out of use.

13. Energy.

When the units are such that $c = 1$, we have

$$M = m\,(1 - v^2)^{-1/2} = m + \tfrac{1}{2}mv^2 \ \text{approximately} \tag{13.1}$$

if the speed is small compared with the velocity of light. The second term is the kinetic energy, so that the change of mass is the same as the change of energy, when the velocity alters. This suggests the identification of mass with energy. It may be recalled that in mechanics the total energy of a system is left vague to the extent of an arbitrary additive constant, since only changes of energy are defined. In identifying energy with mass we fix the additive constant $m$ for each body, and $m$ may be regarded as the internal energy of constitution of the body.

The approximation used in (13.1) does not invalidate the argument. Consider two ideal billiard balls colliding. The conservation of mass (relative mass) states that

$$\Sigma m\,(1 - v^2)^{-1/2} \ \text{is unaltered.}$$

The conservation of energy states that

$$\Sigma m\,(1 + \tfrac{1}{2}v^2) \ \text{is unaltered.}$$

But if both statements were exactly true we should have two equations determining unique values of the speeds of the two balls; so that these speeds could not be altered by the collision. The two laws are not independent, but one is an approximation to the other. The first is the accurate law since it is independent of the space-time frame of reference. Accordingly the expression $\tfrac{1}{2}mv^2$ for the kinetic energy in elementary mechanics is only an approximation in which terms in $v^4$, etc. are neglected.

When the units of length and time are not restricted by the condition $c = 1$, the relation between the mass $M$ and the energy $E$ is

$$M = E/c^2 \tag{13.2}$$
Thus the energy corresponding to a gram is $9 \times 10^{20}$ ergs. This does not affect the identity of mass and energy, that is, that both are measures of the same world-condition. A world-condition can be examined by different kinds of experimental tests, and the units gram and erg are associated with different tests of the mass-energy condition. But when once the measure has been made it is of no consequence to us which of the experimental methods was chosen, and grams or ergs can be used indiscriminately as the unit of mass. In fact, measures made by energy-tests and by mass-tests are convertible like measures made with a yard-rule and a metre-rule.

The principle of conservation of mass has thus become merged in the principle of conservation of energy. But there is another independent phenomenon which perhaps corresponds more nearly to the original idea of Lavoisier when he enunciated the law of conservation of matter. I refer to the permanence of invariant mass attributed to our ideal billiard balls but not supposed to be a general property of matter. The conservation of $m$ is an accidental property like rigidity; the conservation of $M$ is an invariable law of nature.

When radiant heat falls on a billiard ball so that its temperature rises, the increased energy of motion of the molecules causes an increase of mass $M$. The invariant mass $m$ also increases since it is equal to $M$ for a body at rest. There is no violation of the conservation of $M$, because the radiant heat has mass $M$ which it transfers to the ball; but we shall show later that the electromagnetic waves have no invariant mass, and the addition to $m$ is created out of nothing. Thus invariant mass is not conserved in general.

To some extent we can avoid this failure by taking the microscopic point of view. The billiard ball can be analysed into a very large number of constituents (electrons and protons), each of which is believed to preserve the same invariant mass for life.
But the invariant mass of the billiard ball is not exactly equal to the sum of the invariant masses of its constituents*. The permanence and permanent similarity of all electrons seems to be the modern equivalent of Lavoisier's "conservation of matter." It is still uncertain whether it expresses a universal law of nature; and we are willing to contemplate the possibility that occasionally a positive and negative electron may coalesce and annul one another. In that case the mass $M$ would pass into the electromagnetic waves generated by the catastrophe, whereas the invariant mass $m$ would disappear altogether. Again if ever we are able to synthesise helium out of hydrogen, 0.8 per cent. of the invariant mass will be annihilated, whilst the corresponding proportion of relative mass will be liberated as radiant energy.

It will thus be seen that although in the special problems considered the quantity $m$ is usually supposed to be permanent, its conservation belongs to an altogether different order of ideas from the universal conservation of $M$.

14. Density and temperature.

Consider a volume of space delimited in some invariant way, e.g. the content of a material box. The counting of a number of discrete particles continually within (i.e. moving with) the box is an absolute operation; let the absolute number be $N$. The volume $V$ of the box will depend on the space-reckoning, being decreased in the ratio $\beta$ for an observer moving relatively to the box and particles, owing to the FitzGerald contraction of one of the dimensions of the box. Accordingly the particle-density $\sigma = N/V$ satisfies […] $x_3'$, $x_4'$ which are any four functions of the old coordinates $x_1, x_2, x_3, x_4$. Conversely, $x_1, x_2, x_3, x_4$ are functions of $x_1', x_2', x_3', x_4'$. It is assumed that multiple values are excluded, at least in the region considered, so that values of $(x_1, x_2, x_3, x_4)$ and $(x_1', x_2', x_3', x_4')$ correspond one to one.
If

$$x_1 = f_1(x_1', x_2', x_3', x_4'); \quad x_2 = f_2(x_1', x_2', x_3', x_4'); \quad \text{etc.},$$

$$dx_1 = \frac{\partial f_1}{\partial x_1'}dx_1' + \frac{\partial f_1}{\partial x_2'}dx_2' + \frac{\partial f_1}{\partial x_3'}dx_3' + \frac{\partial f_1}{\partial x_4'}dx_4'; \quad \text{etc.} \tag{15.1}$$

or it may be written simply,

$$dx_1 = \frac{\partial x_1}{\partial x_1'}dx_1' + \frac{\partial x_1}{\partial x_2'}dx_2' + \frac{\partial x_1}{\partial x_3'}dx_3' + \frac{\partial x_1}{\partial x_4'}dx_4'; \quad \text{etc.} \tag{15.2}$$

Substituting from (15.2) in (2.1) we see that $ds^2$ will be a homogeneous quadratic function of the differentials of the new coordinates; and the new coefficients $g_{11}'$, $g_{22}'$, etc. could be written down in terms of the old, if desired.

For an example consider the usual transformation to axes revolving with constant angular velocity $\omega$, viz.

$$x = x_1'\cos\omega x_4' - x_2'\sin\omega x_4', \quad y = x_1'\sin\omega x_4' + x_2'\cos\omega x_4', \quad z = x_3', \quad t = x_4' \tag{15.3}$$

Hence

$$dx = dx_1'\cos\omega x_4' - dx_2'\sin\omega x_4' + \omega\,(-x_1'\sin\omega x_4' - x_2'\cos\omega x_4')\,dx_4',$$

$$dy = dx_1'\sin\omega x_4' + dx_2'\cos\omega x_4' + \omega\,(x_1'\cos\omega x_4' - x_2'\sin\omega x_4')\,dx_4',$$

$$dz = dx_3', \quad dt = dx_4'.$$

Taking units of space and time so that $c = 1$, we have for our original fixed coordinates by (7.1)

$$ds^2 = -dx^2 - dy^2 - dz^2 + dt^2.$$

Hence, substituting the values found above,

$$ds^2 = -dx_1'^2 - dx_2'^2 - dx_3'^2 + \{1 - \omega^2(x_1'^2 + x_2'^2)\}\,dx_4'^2 + 2\omega x_2'\,dx_1'\,dx_4' - 2\omega x_1'\,dx_2'\,dx_4' \tag{15.4}$$

Remembering that all observational differences of coordinate-systems must arise via the interval, this formula must comprise everything which distinguishes the rotating system from a fixed system of coordinates.

In the transformation (15.3) we have paid no attention to any contraction of the standards of length or retardation of clocks due to motion with the rotating axes. The formulae of transformation are those of elementary kinematics, so that $x_1', x_2', x_3', x_4'$ are quite strictly the coordinates used in the ordinary theory of rotating axes. But it may be suggested that elementary kinematics is now seen to be rather crude, and that it would be worth while to touch up the formulae (15.3) so as to take account of these small changes of the standards. A little consideration shows that the suggestion is impracticable. It was shown in § 4 that if $x_1', x_2', x_3', x_4'$ represent rectangular coordinates and time as partitioned by direct readings of scales and clocks, then

$$ds^2 = -dx_1'^2 - dx_2'^2 - dx_3'^2 + c^2\,dx_4'^2 \tag{15.45}$$

so that coordinates which give any other formula for the interval cannot represent the immediate indications of scales and clocks. As shown at the end of § 5, the only transformations which give (15.45) are Lorentz transformations. If we wish to make a transformation of a more general kind, such as that of (15.3), we must necessarily abandon the association of the coordinate-system with uncorrected scale and clock readings. It is useless to try to "improve" the transformation to rotating axes, because the supposed improvement could only lead us back to a coordinate-system similar to the fixed axes with which we started.

The inappropriateness of rotating axes to scale and clock measurements can be regarded from a physical point of view. We cannot keep a scale or clock at rest in the rotating system unless we constrain it, i.e. subject it to molecular bombardment, an "outside influence" whose effect on the measurements must not be ignored.

In the $x, y, z, t$ system of coordinates the scale and clock are the natural equipment for exploration. In other systems they will, if unconstrained, continue to measure $ds$; but the reading of $ds$ is no longer related in a simple way to the differences of coordinates which we wish to determine; it depends on the more complicated calculations involved in (2.1). The scale and clock to some extent lose their pre-eminence, and since they are rather elaborate appliances it may be better to refer to some simpler means of exploration. We consider then two simpler test-objects: the moving particle and the light-pulse.
In ordinary rectangular coordinates and time $x, y, z, t$ an undisturbed particle moves with uniform velocity, so that its track is given by the equations

$$x = a + bt, \quad y = c + dt, \quad z = e + ft \tag{15.5}$$

i.e. the equations of a straight line in four dimensions. By substituting from (15.3) we could find the equations of the track in rotating coordinates; or by substituting from (15.2) we could obtain the differential equations for any desired coordinates. But there is another way of proceeding. The differential equations of the track may be written

$$\frac{d^2x}{ds^2}, \quad \frac{d^2y}{ds^2}, \quad \frac{d^2z}{ds^2}, \quad \frac{d^2t}{ds^2} = 0 \tag{15.6}$$

which on integration, having regard to the condition (7.1), give equations (15.5).

The equations (15.6) are comprised in the single statement

$$\int ds \ \text{is stationary} \tag{15.7}$$

for all arbitrary small variations of the track which vanish at the initial and final limits, a well-known property of the straight line.

In arriving at (15.7) we use freely the geometry of the $x, y, z, t$ system given by (7.1); but the final result does not allude to coordinates at all, and must be unaltered whatever system of coordinates we are using. To obtain explicit equations for the track in any desired system of coordinates, we substitute in (15.7) the appropriate expression (2.1) for $ds$ and apply the calculus of variations. The actual analysis will be given in § 28.

The track of a light-pulse, being a straight line in four dimensions, will also satisfy (15.7); but the light-pulse has the special velocity $c$ which gives the additional condition obtained in § 7, viz.

$$ds = 0 \tag{15.8}$$

Here again there is no reference to any coordinates in the final result.

We have thus obtained equations (15.7) and (15.8) for the behaviour of the moving particle and light-pulse which must hold good whatever the coordinate-system chosen.
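The stationary property of the straight track can be illustrated numerically. The following Python sketch is a rough check under simplifying assumptions made here: units with $c = 1$, motion along $x$ only, and a particular one-parameter family of varied tracks $x = bt + \epsilon\sin(\pi t/T)$ which leaves the initial and final events unaltered. The function names and sample values are hypothetical.

```python
import math

def interval_length(velocity, t0, t1, steps=20000):
    """Midpoint-rule value of the integral of ds = sqrt(1 - v^2) dt, with c = 1."""
    h = (t1 - t0) / steps
    total = 0.0
    for i in range(steps):
        v = velocity(t0 + (i + 0.5) * h)
        total += math.sqrt(1.0 - v * v) * h
    return total

def varied_velocity(b, eps, T):
    """dx/dt for the track x = b t + eps sin(pi t / T); eps = 0 is the straight track."""
    return lambda t: b + eps * (math.pi / T) * math.cos(math.pi * t / T)

def deficit(b, eps, T):
    """Amount by which the varied track falls short of the straight track."""
    straight = interval_length(varied_velocity(b, 0.0, T), 0.0, T)
    varied = interval_length(varied_velocity(b, eps, T), 0.0, T)
    return straight - varied

if __name__ == "__main__":
    b, T = 0.5, 1.0
    d1 = deficit(b, 0.01, T)
    d2 = deficit(b, 0.02, T)
    # A stationary value means no term of first order in eps,
    # so doubling eps should roughly quadruple the deficit.
    print(d1, d2, d2 / d1)
```

In this example the change in the integral is of the second order in $\epsilon$, as a stationary value requires; for this timelike track the straight line happens to give the greatest value of the integral.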
The indications of our two new test-bodies are connected with the interval, just as in § 3 the indications of the scale and clock were connected with the interval. It should be noticed however that whereas the use of the older test-bodies depends only on the truth of the fundamental axiom, the use of the new test-bodies depends on the truth of the empirical laws of motion and of light-propagation. In a deductive theory this appeal to empirical laws is a blemish which we must seek to remove later.

16. Fields of force.

Suppose that an observer has chosen a definite system of space-coordinates and of time-reckoning $(x_1, x_2, x_3, x_4)$ and that the geometry of these is given by

$$ds^2 = g_{11}\,dx_1^2 + g_{22}\,dx_2^2 + \dots + 2g_{12}\,dx_1\,dx_2 + \dots \tag{16.1}$$

Let him be under the mistaken impression that the geometry is

$$ds_0^2 = -dx_1^2 - dx_2^2 - dx_3^2 + dx_4^2 \tag{16.2}$$

that being the geometry with which he is most familiar in pure mathematics. We use $ds_0$ to distinguish his mistaken value of the interval. Since intervals can be compared by experimental methods, he ought soon to discover that his $ds_0$ cannot be reconciled with observational results, and so realise his mistake. But the mind does not so readily get rid of an obsession. It is more likely that our observer will continue in his opinion, and attribute the discrepancy of the observations to some influence which is present and affects the behaviour of his test-bodies. He will, so to speak, introduce a supernatural agency which he can blame for the consequences of his mistake. Let us examine what name he would apply to this agency.

Of the four test-bodies considered the moving particle is in general the most sensitive to small changes of geometry, and it would be by this test that the observer would first discover discrepancies. The path laid down for it by our observer is

$$\int ds_0 \ \text{ is stationary,}$$

i.e. a straight line in the coordinates $(x_1, x_2, x_3, x_4)$.
The particle, of course, pays no heed to this, and moves in the different track

$$\int ds \ \text{ is stationary.}$$

Although apparently undisturbed it deviates from "uniform motion in a straight line." The name given to any agency which causes deviation from uniform motion in a straight line is force according to the Newtonian definition of force. Hence the agency invoked through our observer's mistake is described as a "field of force."

The field of force is not always introduced by inadvertence as in the foregoing illustration. It is sometimes introduced deliberately by the mathematician, e.g. when he introduces the centrifugal force. There would be little advantage and many disadvantages in banishing the phrase "field of force" from our vocabulary. We shall therefore regularise the procedure which our observer has adopted. We call (16.2) the abstract geometry of the system of coordinates $(x_1, x_2, x_3, x_4)$; it may be chosen arbitrarily by the observer. The natural geometry is (16.1).

A field of force represents the discrepancy between the natural geometry of a coordinate-system and the abstract geometry arbitrarily ascribed to it.

A field of force thus arises from an attitude of mind. If we do not take our coordinate-system to be something different from that which it really is, there is no field of force. If we do not regard our rotating axes as though they were non-rotating, there is no centrifugal force.

Coordinates for which the natural geometry is

$$ds^2 = -dx_1^2 - dx_2^2 - dx_3^2 + dx_4^2$$

are called Galilean coordinates. They are the same as those we have hitherto called ordinary rectangular coordinates and time (the velocity of light being unity). Since this geometry is familiar to us, and enters largely into current conceptions of space, time and mechanics, we usually choose Galilean geometry when we have to ascribe an abstract geometry. Or we may use slight modifications of it, e.g. substitute polar for rectangular coordinates.
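The case of rotating axes may be illustrated by a short numerical sketch. It takes an undisturbed particle moving uniformly in fixed axes, refers it to axes rotating with an assumed angular velocity `W`, and compares the apparent acceleration with the field of force an observer would invoke if he treated the rotating axes as non-rotating. The constants, the function names and the restriction to two space dimensions are illustrative assumptions; the Coriolis term is included alongside the centrifugal force named in the text because a moving particle exhibits both.

```python
import math

W = 0.7                            # assumed angular velocity of the rotating axes
A, B, C, D = 1.0, 0.5, -0.2, 0.3   # assumed constants of the track x = A + B*t, y = C + D*t

def rotating_position(t):
    """Coordinates of the free particle referred to axes rotating at rate W."""
    x, y = A + B * t, C + D * t
    xr = x * math.cos(W * t) + y * math.sin(W * t)
    yr = -x * math.sin(W * t) + y * math.cos(W * t)
    return xr, yr

def apparent_motion(t, h=1e-3):
    """Position, velocity and acceleration in the rotating coordinates,
    estimated by central differences with step h."""
    xm, ym = rotating_position(t - h)
    x0, y0 = rotating_position(t)
    xp, yp = rotating_position(t + h)
    vel = ((xp - xm) / (2 * h), (yp - ym) / (2 * h))
    acc = ((xp - 2 * x0 + xm) / h ** 2, (yp - 2 * y0 + ym) / h ** 2)
    return (x0, y0), vel, acc

def field_of_force(pos, vel):
    """Centrifugal plus Coriolis acceleration (force per unit mass) that the
    observer invokes when he ascribes non-rotating geometry to rotating axes."""
    (x, y), (vx, vy) = pos, vel
    return (W ** 2 * x + 2 * W * vy, W ** 2 * y - 2 * W * vx)

pos, vel, acc = apparent_motion(2.0)
force = field_of_force(pos, vel)
print("apparent acceleration:", acc)
print("invoked field of force:", force)
```

The two printed pairs agree to within the accuracy of the finite differences: the whole of the apparent acceleration is accounted for by a field of force which disappears when the axes are recognised as rotating.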
It has been shown in § 4 that when the g's are constants, coordinates can be chosen so that Galilean geometry is actually the natural geometry. There is then no need to introduce a field of force in order to enjoy our accustomed outlook; and if we deliberately choose non-Galilean coordinates and attribute to them abstract Galilean geometry, we recognise the artificial character of the field of force introduced to compensate the discrepancy. But in the more general case it is not possible to make the reduction of § 4 accurately throughout the region explored by our experiments; and no Galilean coordinates exist. In that case it has been usual to select some system (preferably an approximation to a Galilean system) and ascribe to it the abstract geometry of the Galilean system. The field of force so introduced is called "Gravitation."

It should be noticed that the rectangular coordinates and time in current use can scarcely be regarded as a close approximation to the Galilean system, since the powerful force of terrestrial gravitation is needed to compensate the error.

The naming of coordinates (e.g. time) usually follows the abstract geometry attributed to the system. In general the natural geometry is of some complicated kind for which no detailed nomenclature is recognised. Thus when we call a coordinate t the "time," we may either mean that it fulfils the observational conditions discussed in § 4, or we may mean that any departure from those conditions will be ascribed to the interference of a field of force. In the latter case "time" is an arbitrary name, useful because it fixes a consequential nomenclature of velocity, acceleration, etc.

To take a special example, an observer at a station on the earth has found a particular set of coordinates $x_1, x_2, x_3, x_4$ best suited to his needs.
He calls them x, y, z, t in the belief that they are actually rectangular coordinates and time, and his terminology (straight line, circle, density, uniform velocity, etc.) follows from this identification. But, as shown in § 4, this nomenclature can only agree with the measures made by clocks and scales provided (16.2) is satisfied; and if (16.2) is satisfied, the tracks of undisturbed particles must be straight lines. Experiment immediately shows that this is not the case; the tracks of undisturbed particles are parabolas. But instead of accepting the verdict of experiment and admitting that $x_1, x_2, x_3, x_4$ are not what he supposed they were, our observer introduces a field of force to explain why his test is not fulfilled. A certain part of this field of force might have been avoided if he had taken originally a different set of coordinates (not rotating with the earth); and in so far as the field of force arises on this account it is generally recognised that it is a mathematical fiction: the centrifugal force. But there is a residuum which cannot be got rid of by any choice of coordinates; there exists no extensive coordinate-system having the simple properties which were ascribed to x, y, z, t. The intrinsic nature of space-time near the earth is not of the kind which admits coordinates with Galilean geometry. This irreducible field of force constitutes the field of terrestrial gravitation. The statement that space-time round the earth is "curved" (that is to say, that it is not of the kind which admits Galilean coordinates) is not an hypothesis; it is an equivalent expression of the observed fact that an irreducible field of force is present, having regard to the Newtonian definition of force. It is this fact of observation which demands the introduction of non-Galilean space-time and non-Euclidean space into the theory.

17. The Principle of Equivalence.
In § 15 we have stated the laws of motion of undisturbed material particles and of light-pulses in a form independent of the coordinates chosen. Since a great deal will depend upon the truth of these laws it is desirable to consider what justification there is for believing them to be both accurate and universal. Three courses are open:

(a) It will be shown in Chapters IV and VI that these laws follow rigorously from a more fundamental discussion of the nature of matter and of electromagnetic fields; that is to say, the hypotheses underlying them may be pushed a stage further back.

(b) The track of a moving particle or light-pulse under specified initial conditions is unique, and it does not seem to be possible to specify any unique tracks in terms of intervals only other than those given by equations (15.7) and (15.8).

(c) We may arrive at these laws by induction from experiment.

If we rely solely on experimental evidence we cannot claim exactness for the laws. It goes without saying that there always remains a possibility of small amendments of the laws too slight to affect any observational tests yet tried. Belief in the perfect accuracy of (15.7) and (15.8) can only be justified on the theoretical grounds (a) or (b). But the more important consideration is the universality, rather than the accuracy, of the experimental laws; we have to guard against a spurious generalisation extended to conditions intrinsically dissimilar from those for which the laws have been established observationally.

We derived (15.7) from the equations (15.5) which describe the observed behaviour of a particle moving under no field of force. We assume that the result holds in all circumstances. The risky point in the generalisation is not in introducing a field of force, because that may be due to an attitude of mind of which the particle has no cognizance.
The risk is in passing from regions of the world where Galilean coordinates (x, y, z, t) are possible to intrinsically dissimilar regions where no such coordinates exist, that is, from flat space-time to space-time which is not flat.

The Principle of Equivalence asserts the legitimacy of this generalisation. It is essentially an hypothesis to be tested by experiment as opportunity offers. Moreover it is to be regarded as a suggestion, rather than a dogma admitting of no exceptions. It is likely that some of the phenomena will be determined by comparatively simple equations in which the components of curvature of the world do not appear; such equations will be the same for a curved region as for a flat region. It is to these that the Principle of Equivalence applies. It is a plausible suggestion that the undisturbed motion of a particle and the propagation of light are governed by laws of this specially simple type; and accordingly (15.7) and (15.8) will apply in all circumstances. But there are more complex phenomena governed by equations in which the curvatures of the world are involved; terms containing these curvatures will vanish in the equations summarising experiments made in a flat region, and would have to be reinstated in passing to the general equations. Clearly there must be some phenomena of this kind which discriminate between a flat world and a curved world; otherwise we could have no knowledge of world-curvature. For these the Principle of Equivalence breaks down.

The Principle of Equivalence thus asserts that some of the chief differential equations of physics are the same for a curved region of the world as for an osculating flat region*. There can be no infallible rule for generalising experimental laws; but the Principle of Equivalence offers a suggestion for trial, which may be expected to succeed sometimes, and fail sometimes.
The Principle of Equivalence has played a great part as a guide in the original building up of the generalised relativity theory; but now that we have reached the new view of the nature of the world it has become less necessary. Our present exposition is in the main deductive. We start with a general theory of world-structure and work down to the experimental consequences, so that our progress is from the general to the special laws, instead of vice versa.

18. Retrospect.

The investigation of the external world in physics is a quest for structure rather than substance. A structure can best be represented as a complex of relations and relata; and in conformity with this we endeavour to reduce the phenomena to their expressions in terms of the relations which we call intervals and the relata which we call events.

If two bodies are of identical structure as regards the complex of interval relations, they will be exactly similar as regards observational properties†, if our fundamental hypothesis is true. By this we show that experimental measurements of lengths and duration are equivalent to measurements of the interval relation.

To the events we assign four identification-numbers or coordinates according to a plan which is arbitrary within wide limits. The connection between our physical measurements of interval and the system of identification-numbers is expressed by the general quadratic form (2.1). In the particular case when these identification-numbers can be so assigned that the product terms in the quadratic form disappear leaving only the four squares, the coordinates have the metrical properties belonging to rectangular coordinates and time, and are accordingly so identified. If any such system exists an infinite number of others exist connected with it by the Lorentz transformation, so that there is no unique space-time frame. The relations of these different space-time reckonings have been considered in detail. It is
shown that there must be a particular speed which has the remarkable property that its value is the same for all these systems; and by appeal to the Michelson-Morley experiment or to Fizeau's experiment it is found that this is a distinctive property of the speed of light.

But it is not possible throughout the world to choose coordinates fulfilling the current definitions of rectangular coordinates and time. In such cases we usually relax the definitions, and attribute the failure of fulfilment to a field of force pervading the region. We have now no definite guide in selecting what coordinates to take as rectangular coordinates and time; for whatever the discrepancy, it can always be ascribed to a suitable field of force. The field of force will vary according to the system of coordinates selected; but in the general case it is not possible to get rid of it altogether (in a large region) by any choice of coordinates. This irreducible field of force is ascribed to gravitation. It should be noticed that the gravitational influence of a massive body is not properly expressed by a definite field of force, but by the property of irreducibility of the field of force. We shall find later that the irreducibility of the field of force is equivalent to what in geometrical nomenclature is called a curvature of the continuum of space-time.

* The correct equations for a curved world will necessarily include as a special case those already obtained for a flat world. The practical point on which we seek the guidance of the Principle of Equivalence is whether the equations already obtained for a flat world will serve as they stand or will require generalisation.

† At present this is limited to extensional properties (in both space and time). It will be shown later that all mechanical properties are also included. Electromagnetic properties require separate consideration.
For the fuller study of these problems we require a special mathematical calculus which will now be developed ab initio.

CHAPTER II
THE TENSOR CALCULUS

19. Contravariant and covariant vectors.

We consider the transformation from one system of coordinates $x_1, x_2, x_3, x_4$ to another system $x_1', x_2', x_3', x_4'$.

The differentials $(dx_1, dx_2, dx_3, dx_4)$ are transformed according to the equations (15.2), viz.

$$dx_1' = \frac{\partial x_1'}{\partial x_1}\,dx_1 + \frac{\partial x_1'}{\partial x_2}\,dx_2 + \frac{\partial x_1'}{\partial x_3}\,dx_3 + \frac{\partial x_1'}{\partial x_4}\,dx_4; \quad \text{etc.}$$

which may be written shortly

$$dx_\mu' = \sum_{\alpha=1}^{4} \frac{\partial x_\mu'}{\partial x_\alpha}\,dx_\alpha,$$

four equations being obtained by taking μ = 1, 2, 3, 4, successively.

Any set of four quantities transformed according to this law is called a contravariant vector. Thus if $(A^1, A^2, A^3, A^4)$ becomes $(A'^1, A'^2, A'^3, A'^4)$ in the new coordinate-system, where

$$A'^\mu = \sum_{\alpha=1}^{4} \frac{\partial x_\mu'}{\partial x_\alpha}\,A^\alpha \tag{19.1}$$

then $(A^1, A^2, A^3, A^4)$, denoted briefly as $A^\mu$, is a contravariant vector. The upper position of the suffix (which is, of course, not an exponent) is reserved to indicate contravariant vectors.

If $\phi$ is an invariant function of position, i.e. if it has a fixed value at each point independent of the coordinate-system employed, the four quantities

$$\left(\frac{\partial\phi}{\partial x_1},\ \frac{\partial\phi}{\partial x_2},\ \frac{\partial\phi}{\partial x_3},\ \frac{\partial\phi}{\partial x_4}\right)$$

are transformed according to the equations

$$\frac{\partial\phi}{\partial x_1'} = \frac{\partial x_1}{\partial x_1'}\frac{\partial\phi}{\partial x_1} + \frac{\partial x_2}{\partial x_1'}\frac{\partial\phi}{\partial x_2} + \frac{\partial x_3}{\partial x_1'}\frac{\partial\phi}{\partial x_3} + \frac{\partial x_4}{\partial x_1'}\frac{\partial\phi}{\partial x_4}; \quad \text{etc.}$$

which may be written shortly

$$\frac{\partial\phi}{\partial x_\mu'} = \sum_{\alpha=1}^{4} \frac{\partial x_\alpha}{\partial x_\mu'}\frac{\partial\phi}{\partial x_\alpha}.$$

Any set of four quantities transformed according to this law is called a covariant vector. Thus if $A_\mu$ is a covariant vector, its transformation law is

$$A_\mu' = \sum_{\alpha=1}^{4} \frac{\partial x_\alpha}{\partial x_\mu'}\,A_\alpha \tag{19.2}$$

We have thus two varieties of vectors which we distinguish by the upper or lower position of the suffix. The first illustration of a contravariant vector, $dx_\mu$, forms rather an awkward exception to the rule that a lower suffix indicates covariance and an upper suffix contravariance.
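The two laws (19.1) and (19.2) may be tried out numerically. The sketch below is not part of the original argument and makes several simplifying assumptions: only two coordinates are used instead of four, the new system is taken to be plane polar coordinates (r, θ), and the sample point and components are arbitrary. The names `jacobians_polar`, `to_contravariant` and `to_covariant` are illustrative.

```python
import math

def jacobians_polar(x, y):
    """Partial derivatives for the change from rectangular (x, y) to polar (r, theta).
    fwd[mu][al] = d x'_mu / d x_al   (used in the contravariant law 19.1)
    inv[al][mu] = d x_al / d x'_mu   (used in the covariant law 19.2)"""
    r, th = math.hypot(x, y), math.atan2(y, x)
    fwd = [[x / r, y / r], [-y / r ** 2, x / r ** 2]]
    inv = [[math.cos(th), -r * math.sin(th)], [math.sin(th), r * math.cos(th)]]
    return fwd, inv

def to_contravariant(A, fwd):
    """A'^mu = sum over al of (d x'_mu / d x_al) * A^al."""
    n = len(A)
    return [sum(fwd[mu][al] * A[al] for al in range(n)) for mu in range(n)]

def to_covariant(B, inv):
    """B'_mu = sum over al of (d x_al / d x'_mu) * B_al."""
    n = len(B)
    return [sum(inv[al][mu] * B[al] for al in range(n)) for mu in range(n)]

# Assumed sample data: a point and the components of one vector of each kind.
fwd, inv = jacobians_polar(3.0, 4.0)
A_up, B_down = [1.0, 2.0], [0.5, -1.5]
A_up_new = to_contravariant(A_up, fwd)
B_down_new = to_covariant(B_down, inv)

# The sum A^mu B_mu keeps the same value in both systems.
inner_old = sum(a * b for a, b in zip(A_up, B_down))
inner_new = sum(a * b for a, b in zip(A_up_new, B_down_new))

# For a rotation of rectangular axes the two laws give identical numbers.
c, s = math.cos(0.4), math.sin(0.4)
rot_fwd = [[c, s], [-s, c]]      # x' = c*x + s*y,  y' = -s*x + c*y
rot_inv = [[c, -s], [s, c]]      # x = c*x' - s*y', y = s*x' + c*y'
rot_as_contra = to_contravariant(B_down, rot_fwd)
rot_as_co = to_covariant(B_down, rot_inv)

print(inner_old, inner_new)
print(rot_as_contra, rot_as_co)
```

For the polar transformation the two laws give different components for the same starting numbers, yet the sum of products of a contravariant and a covariant vector is unchanged; for the rotation the two laws coincide, which is why the distinction does not arise so long as only rectangular coordinates are contemplated.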
There is no other exception likely to mislead the reader, so that it is not difficult to keep in mind this peculiarity of $dx_\mu$; but we shall sometimes find it convenient to indicate its contravariance explicitly by writing

$$dx_\mu = (dx)^\mu \tag{19.3}$$

A vector may either be a single set of four quantities associated with a special point in space-time, or it may be a set of four functions varying continuously with position. Thus we can have an "isolated vector" or a "vector-field."

For an illustration of a covariant vector we considered the gradient of an invariant, $\partial\phi/\partial x_\mu$; but a covariant vector is not necessarily the gradient of an invariant.

The reader will probably be already familiar with the term vector, but the distinction of covariant and contravariant vectors will be new to him. This is because in the elementary analysis only rectangular coordinates are contemplated, and for transformations from one rectangular system to another the laws (19.1) and (19.2) are equivalent to one another. From the geometrical point of view, the contravariant vector is the vector with which everyone is familiar; this is because a displacement, or directed distance between two points, is regarded as representing $(dx_1, dx_2, dx_3)$* which, as we have seen, is contravariant. The covariant vector is a new conception which does not so easily lend itself to graphical illustration.

20. The mathematical notion of a vector.

The formal definitions in the preceding section do not help much to an understanding of what the notion of a vector really is. We shall try to explain this more fully, taking first the mathematical notion of a vector (with which we are most directly concerned) and leaving the more difficult physical notion to follow.

We have a set of four numbers $(A_1, A_2, A_3, A_4)$ which we associate with some point $(x_1, x_2, x_3, x_4)$ and with a certain system of coordinates.
We make a change of the coordinate-system, and we ask, What will these numbers become in the new coordinates? The question is meaningless; they do not automatically "become" anything. Unless we interfere with them they stay as they were. But the mathematician may say "When I am using the coordinates $x_1, x_2, x_3, x_4$, I want to talk about the numbers $A_1, A_2, A_3, A_4$; and when I am using $x_1', x_2', x_3', x_4'$ I find that at the corresponding stage of my work I shall want to talk about four different numbers $A_1', A_2', A_3', A_4'$. So for brevity I propose to call both sets of numbers by the same symbol $\mathcal{A}$." We reply "That will be all right, provided that you tell us just what numbers will be denoted by $\mathcal{A}$ for each of the coordinate-systems you intend to use. Unless you do this we shall not know what you are talking about."

* The customary resolution of a displacement into components in oblique directions assumes this.

Accordingly the mathematician begins by giving us a list of the numbers that $\mathcal{A}$ will signify in the different coordinate-systems. We here denote these numbers by letters. $\mathcal{A}$ will mean*