Rl2?e $7-10135
Rademaeher
The en joyment-- of mathematics.
Kansas city g|| public library
Books will be issued only
on presentation of library card.
Please report lost cards and
change of residence promptly.
Card holders are responsible for
all books, records, films, pictures
or other library materials
checked out on their cards.
KANSAS CITY. MISSOURI PUBUC UBRARY
,-. g'pO i .;
JUL27'
JUL2619"? ?
The Enjoyment of Mathematics
OTHER BOOKS OF INTEREST TO AMATEURS
OF MATHEMATICS
G. Polya, How to Solve It
G. Polya, Mathematics and Plausible Reasoning
Vol. I. Induction and Analogy in Mathematics
Vol. II. Patterns of Plausible Inference
EL Weyl, Symmetry
H. Weyl, The Philosophy of Mathematics and Natural Science
Selections from Mathematics
for the Amateur
BY HANS RADEMACHER AND
OTTO TOEPLITZ
TRANSLATED BY HERBERT ZUCKERMAN
PRINCETON, NEW JERSEY
PRINCETON UNIVERSITY PRESS
1957
Published, 1957, by Princeton University Press
All Rights Reserved
This is a translation from Von Zahlen und Figuren: Pro-
ben Mathematischen Denkens fur Liebhaber der Mathe-
matik, by Hans Rademacher and Otto Toeplitz, second
edition originally published by Julius Springer, Berlin,
1933* Chapters 15 and 28 by Herbert Zuckerman have been
added to the English language edition.
Printed in the United States of America
Preface
Otto Toeplitz, co-author of this book, died in Jerusalem on Fe
bruary 19, 1940, after having left Germany in the Spring of 1939.
Toeplitz began his academic career in Gottingen as a disciple of
David Hilbert, was then professor in Kiel and later in Bonn. His
scientific work is centered around the theory of integral equations
and the theory of functions of infinitely many variables, fields to
which he has made lasting contributions.
The plan for this book arose at frequent meetings which the
authors had, while Toeplitz was in Kiel and I was at the University
of Hamburg. Both of us had repeatedly lectured about the subject
matter of this book to a wider public. The manuscript was rewritten
several times under mutual criticism. Toeplitz's great interest in the
history of mathematics is still visible in the present edition. I re
member with warm feelings the summer days in 1929 at Bonn, when
together we put the last touches to the German manuscript. I am
sure that Toeplitz would have been pleased and proud of the present
English edition, a project of which he had often thought.
I wish to thank the translator Professor Herbert S. Zuckerman for
his painstaking and understanding work. Not only do I admire his
apt translation of the German text, but I also think that in his pre
sentation of the content he has brought many of its arguments closer
to the English speaking reader. He has added two chapters (15 and
28) to the book, which faithfully reflect the spirit in which this book
was written.
My thanks go to my friends Emil Grosswald, D. H. Lehmer, and
Herbert Robbins for help and valuable advice, and also to the
publisher for his sympathetic cooperation.
Philadelphia, 1956 HANS RADEMACHER
CONTENTS
Preface v
Introduction 5
1. The Sequence of Prime Numbers 9
2. Traversing Nets of Curves 13
3. Some Maximum Problems 17
4. Incommensurable Segments and Irrational Numbers 22
5. A Minimum Property of the Pedal Triangle 27
6. A Second Proof of the Same Minimum Property 30
7. The Theory of Sets 34
8. Some Combinatorial Problems 43
9. On Waring's Problem 52
10. On Closed Self-Intersecting Curves 61
11. Is the Factorization of a Number into Prime Factors Unique? 66
12. The Four-Color Problem 73
13. The Regular Polyhedrons 82
14. Pythagorean Numbers and Fermat's Theorem 88
15. The Theorem of the Arithmetic and Geometric Means 95
16. The Spanning Circle of a Finite Set of Points 103
17. Approximating Irrational Numbers by Means of Rational
Numbers 111
18. Producing Rectilinear Motion by Means of Linkages 119
19. Perfect Numbers 129
20. Euler s Proof of the Infinitude of the Prime Numbers 135
21. Fundamental Principles of Maximum Problems 139
22. The Figure of Greatest Area with a Given Perimeter 142
23. Periodic Decimal Fractions 147
24. A Characteristic Property of the Circle 160
25. Curves of Constant Breadth 163
26. The Indispensability of the Compass for the Constructions
of Elementary Geometry 177
27. A Property of the Number 30 187
28. An Improved Inequality 192
Notes and Remarks 197
The Enjoyment of Mathematics
Introduction
Mathematics, because of its language and notation and its odd-
looking special symbols, is closed off from the surrounding world as
by a high wall. What goes on behind that wall is, for the most part,
a secret to the layman. He thinks of dull uninspiring numbers, of
a lifeless mechanism which functions according to laws of inescapable
necessity. On the other hand, that wall very often limits the view
of him who stays within. He is prone to measure all mathematical
things with a special yardstick and he prides himself that nothing
profane shall enter his realm.
Is it possible to breach this wall, to present mathematics in such
a way that the spectator may enjoy it? Cannot the enjoyment of
mathematics be extended beyond the small circle of those who are
"mathematically gifted"? Indeed, only a few are mathematically
gifted in the sense that they are endowed with the talent to discover
new mathematical facts. But by the same token, only very few are
musically gifted in that they are able to compose , music. Never
theless there are many who can understand and perhaps reproduce
music, or who at least enjoy it. We believe that the number of
people who can understand simple mathematical ideas is not
relatively smaller than the number of those who are commonly
called musical, and that their interest will be stimulated if only we
can eliminate the aversion toward mathematics that so many have
acquired from childhood experiences.
It is the aim of these pages to show that the aversion toward
mathematics vanishes if only truly mathematical, essential ideas are
presented. This book is intended to give samples of the diversified
phenomena which comprise mathematics, of mathematics for its
own sake, and of the intrinsic values which it possesses.
The attempt to present mathematics to nonmathematicians has
often been made, but this has usually been done by emphasizing the
usefulness of mathematics in other fields of human endeavor in an
effort to secure the comprehension and interest of the reader.
Frequently the advantages which it offers in technological and other
applications have been described and these advantages have been
illustrated by numerous examples. On the other hand, many books
INTRODUCTION
have been written on mathematical games and pastimes. Although
these books contain much interesting material, they give at best a
very distorted picture of what mathematics really is. Finally, other
books have discussed the foundations of mathematics with regard
to their general philosophical validity. A reader of the following
pages who is primarily interested in the pure, the absolute mathema
tics will naturally direct his attention toward just such an epistemolo-
gical evaluation of mathematics. But this seems to us to be attaching
an extraneous value to mathematics, to be judging its value according
to measures outside itself.
In the following pages we will not be able to demonstrate the
effects of the ideas to be presented on the domain of mathematics
itself. We cannot consider the interior applications, so to speak,
of mathematics, the use of the ideas and results of one field in other
fields of mathematics. This means that we must omit something
that is quite essential in the nature of the mathematical edifice, the
great and surprising cross-connections that permeate this edifice in
all directions. This omission is quite involuntary on our part, for
the greatest mathematical discoveries are those which have revealed
just such far-reaching interrelations. In order to present these inter
relations, however, we would need long and comprehensive prepara
tions and would have to assume a thorough training on the part of
the reader. This is not our intention here.
In other words, our presentation will emphasize not the facts as
other sciences can disclose them to the outsider but the types of
phenomena, the method of proposing problems, and the method of solving
problems. Indeed, in order to understand the great mathematical
events, the comprehensive theories, long schooling and persistent
application would be required. But this is also true with music.
On going to a concert for the first time one is not able to appreciate
fully Bach's "The Art of Fugue/ 5 nor can one immediately visualize
the structure of a symphony. But besides the great works of music
there are the smaller pieces which have something of true sublimity
and whose spirit reveals itself to everyone. We plan to select such
"smaller pieces' 3 from the huge realm of mathematics: a sequence
of subjects each one complete in itself, none requiring more than an
hour to read and understand. The subjects are independent, so
that one need not remember what has gone before when reading
any chapter. Also, the reader is not required to remember what
he may have been compelled to learn in his younger years. No
use is made of logarithms or trigonometry. No mention is made of
6
INTRODUCTION
differential or integral calculus. The theorems of geometrical
congruence and the multiplication of algebraic sums will gradually
be brought back to the memory of the reader; that will be all.
In the case of a small work of music it may not only be the line
of melody with which it opens that makes it beautiful. A little
variation of the theme, a surprising modulation may well be the
climax of the whole. Only he who has listened attentively to the
basic theme will fully perceive and understand this climax. In a
similar sense our reader will have to "listen" readily and attentively
to the basic motive of a problem, to its development., to the first
few examples which illustrate each theme, before the decisive
modulation to the cardinal thought takes place. He will have to
follow the reasoning with a little more active attention than is
usually required in reading. If he does this, he will find no dif
ficulty in grasping the essential ideas of each subject. He will
then get a glimpse of what a few great thinkers have created when
they have occasionally left the realm of their comprehensive theoretical
production and have built, from simple beginnings, a small self-
contained piece of art, a fragment of the prototype of mathematics.
i. The Sequence of Prime Numbers
6 is equal to 2 times 3, but 7 cannot be similarly written as a
product of factors. Therefore 7 is called a prime or prime number.
A prime is a positive whole number which cannot be written as the
product of two smaller factors. 5 and 3 are primes but 4 and 12
are not since we have 4 = 2-2 and 12 == 3-4. Numbers which
can be factored like 4 and 12 are called composite. The number 1
is not composite but, because it behaves so differently from other
numbers, it is not usually considered a prime either; consequently
2 is the first prime, and the first few primes are:
2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, - - -.
A glance reveals that this sequence does not follow any simple law
and, in fact, the structure of the sequence of primes turns out to
be extremely complicated.
A number can be factored a step at a time until it is reduced to
a product of primes. Thus 6 = 2 * 3 is immediately expressed as a
product of two primes, while 30 = 5 6 and 6 = 2-3 gives
30 = 2 3 5, a product of three primes. Similarly, 24 has four
prime factors (24 = 3 8 = 3 2 - 4 = 3 - 2 - 2 2), of which three
happen to be the same prime, 2. In the case of a prime such as
5 one can only write 5 = 5, a product of a single prime. By means
of this step-by-step factoring, any positive whole number except 1
can be written as a product of primes. Because of this, the prime
numbers can be thought of as the building blocks of the sequence
of all positive whole numbers.
In the ninth book of Euclid's Elements the question of whether the
sequence of primes eventually ends is raised and answered. It is shown
that the sequence has no end, that after each prime yet another and
larger prime can be found.
Euclid's proof is very ingenious yet quite simple. The numbers
3, 6, 9, 12, 15, 18, - are all multiples of 3. No other numbers
can be divided evenly by 3. The next larger numbers 4, 7, 10,
13, 16, 19, , which are multiples of 3 increased by 1, are certainly
not divisible by 3; for example, 19 = 6 3 + 1, 22 = 7 3 + 1, etc.
In the same way the multiples of 5 increased by 1 are not divisible
THE SEQUENCE OF PRIME NUMBERS
by 5 (21 = 4 5 + 1, etc.). The same thing is true for 7, for 11,
and so on.
Now Euclid writes down the numbers
2-3+1 = 7
2 3 5 + 1 = 31
2- 3 5- 7 + 1 = 211
2-3-5-7-11 + 1 = 2311
2-3-5-7-11 -13+1 = 30,031, etc.
The first two primes, the first three primes, and so forth, are
multiplied together and each product is increased by 1. None of
these numbers is divisible by any of the primes used to form it.
Since 31 is a multiple of 2 increased by 1, it is not divisible by 2.
Is is a multiple of 3 increased by 1 and hence is not divisible by
3. It is a multiple of 5 increased by 1, hence not divisible by 5.
31 happens to be a prime and it is certainly larger than 5. 211 and
2311 are also new primes, but 30031 is not a prime. However,
30031 is not divisible by 2, 3, 5, 7, 11, or 13, and hence its prime
factors are greater than 13. As a matter of fact, a little figuring
shows that 30031 = 59 509, and these prime factors are greater
than 13.
The same argument may be applied as far as one wants to go.
Let p be any prime and form the product of all primes from 2 to p;
increase this product by 1 and write
2-3-5-7- 11 -^ + 1 = JV.
None of the primes 2, 3, 5, - - - p divides JV, so either jV" is a prime
(certainly much greater than p} or all the prime factors of M are
different from 2, 3, 5, - - - p, and hence greater than p. In either
case, a new prime greater than p has been found. No matter how
large p is there is always another larger prime.
This part of Euclid is quite remarkable, and it would be hard
to name its most admirable feature. The problem itself is only of
theoretical interest. It can be proposed, for its own sake, only by
a person who has a certain inner feeling for mathematical thought.
This feeling for mathematics and appreciation of the beauty of
mathematics was very evident in the ancient Greeks, and they have
handed it down to later civilizations. Also, this problem is one that
most people would completely overlook. Even when it is brought
to our attention it appears to be trivial and superfluous, and its real
difficulties are not immediatelty apparent. Finally, we must admire
10
THE SEQUENCE OF PRIME NUMBERS
the ingenious and simple way in which Euclid proves the theorem.
The most natural way to try to prove the theorem is not Euclid's.
It would be more natural to try to find the next prime number
following any given prime. This has been attempted but has always
ended in failure because of the extreme irregularity of the formation
of the primes.
Euclid's proof circumvents the lack of a law of formation for the
sequence of primes by looking for some prime beyond instead of for
the next prime after/. For example, his proof gives 2311, not 13,
as a prime past 11, and it gives 59 as one past 13. Frequently there
are a great many primes between the one considered and the one
given by the proof. This is not a sign of the weakness of the proof,
but rather it is evidence of the ingenuity of the Greeks in that they
did not try to do more than was required.
As an illustration of the complexity of the sequence of primes, we
shall show that there are large gaps in the series. We shall show,
for example, that we can find 1000 consecutive numbers, all of which
are composite. The method is closely related to that of Euclid.
We saw that 2-3-5+1 = 31 is not divisible by 2, 3, or 5.
If two numbers are divisible by 2 then their sum is certainly divisible
by 2 as well. The same is true for 3, 5, etc. Now, 2*3-5 is
divisible by 2, 3, and 5, so 2 3 - 5 + 2 = 32 is divisible by 2,
2-3-5 + 3 = 33 by 3, 2-3-5 + 4 = 34 by 2, 2-3-5 + 5=: 35
by 5, and 2-3*5+6 = 36 by 2. Therefore none of the numbers
32, 33, 34, 35, 36 is a prime. This argument fails for the first time
for 2 3 - 5 + 7 = 37, which is not divisible by 2, 3, or 5.
In the same way we can find 1000 consecutive composite numbers.
Let/ be the first four digit prime (1009) and form the 1000 numbers
2- 3- - *p + 2, 2- 3-- -p + 3- , 2- 3-- -p + 1001.
Each of the numbers 2, 3, 4, 5, -, 1001, is divisible by one of
the primes 2, 3, - -, p and so is 2 - 3 - - j&. Therefore each of the
numbers listed is also divisible by one of 2, 3, - * - p, and hence is
not a prime itself. We have thus found 1000 consecutive numbers
none of which is prime, and consequently there is a gap of at least
1000 numbers in the sequence of primes.
Naturally a gap of such length will not occur until we have gone
quite far along in the sequence of primes, but if we go far enough
we can, by the same method, find gaps as long as we may desire.
This problem and Euclid's are very similar both in nature and
proof, yet the question of gaps in the sequence of primes was not
11
THE SEQUENCE OF PRIME NUMBERS
considered by the Greeks. It was taken up by modern mathemati
cians along with a great number of other related problems. Most
of these other problems are not as easy to solve; some remain unsolved
at the present time, while others have led to entirely new fields of
mathematics.
Let us consider one of these further problems which we can still
handle by the same methods and which will give a certain insight
into some of the others. The multiples of 3 are 3, 6, 9, -, and these
numbers increased by 1 viz. 4, 7, 10, have occurred above.
The remaining numbers 2, 5, 8, 11, 14, 17, 20, 23, - are the
numbers which have the remainder 2 when divided by 3. Do these
last numbers contain infinitely many primes? That is, does the
sequence 2, 5, 11, 17, 23, - never end? We shall prove that
there is an infinite number of these primes.
First we must show that if any two of the numbers 1, 4, 7, 10,
13, are multiplied together, the product is ^again one of these
same numbers. Each of the numbers is a multiple of 3 increased
by 1, say 3# + 1. If a second number is 3y + 1 3 then their product is
1) = 9*j> + 3j + 3* + 1
(1) =3(3*y+j>+*) + 1,
which is again a multiple of 3 increased by 1 and hence is in our
original set of numbers.
Now if any of the numbers 2, 5, 8, 11, is split into its prime
factors, at least one of the prime factors must again be one of 2, 5,
8, 11, . For example, 14 = 2 7 has the factor 2 and 35 = 5 7
has 5. To show that this is always true, we note that each of the
prime factors must be either a multiple of 3, in the sequence 1, 4,
7, 10, , or in the sequence 2, 5, 8, - -. The only multiple of
3 that is a prime is 3 itself, and it does not divide our chosen number.
If all the prime factors were of the kind 1, 4, 7, 10, -, then by
the above remark (1) our number would again be of this kind.
Since it is not of this kind, at least one of its prime factors must be
one of the following 2, 5, 8, 11, - -.
We can now proceed as in Euclid's proof except that we consider
2-3-5-7- 11 -^ 1 = Af
in place of
2- 3- 5- 7- 11 -jfr + 1 = JV*.
M is a multiple of 3 decreased by 1, which means that it is one of
2, 5, 8, 11, - -. Just as with JV", it is clear that M is not divisible
12
THE SEQUENCE OF PRIME NUMBERS
by any of the primes 2, 3, 5, 7, 1 1, -, p. Either M is a prime
(greater than p} y or all its prime factors are greater than p. In
the latter case at least one of the prime factors is one of 2, 5, 8, ,
and hence in both cases we have found a prime of this kind that
is greater than p. Therefore the sequence 2, 5, 8, contains
an infinite number of primes.
This leaves open the question whether the sequences 1, 4, 7, 10,
13, * * contains an infinite number of primes. It is quite possible
that 2, 5, 8, . . . could contain infinitely many primes while 1, 4, 7,
contained only a finite number. The fact is that the latter sequence
also contains infinitely many primes, but the proof of this requires
completely different methods. In a later chapter we shall gain
some insight into these methods. The reason for mentioning these
last problems was to point out how the problems and methods of
modern mathematics are related to those of the Greeks. This is
not only true in isolated cases, but in whole areas of modern mathe
matics, areas of very great interest in which research is still being
carried on.
2. Traversing Nets of Curves
A streetcar company decides to reorganize its system of routes
without changing the existing tracks. It wishes to do this in such a
way that each section of track will be used by just one route.
Passengers will be allowed to transfer from route to route until
they finally reach their destinations. The problem is: how many
routes must the company operate in order to serve all sections without having
more than one route on any section?
For a very small city with car lines as in Fig. 1, the problem is
quite simple. One route could go from A to B and a second from
C to D, both passing through K. Or a line could go from A through
K to D and a second from B through K to C. Finally, a line could
go from A through K to C and another from B through K to D.
Each of these possibilities necessitates two routes. Of course the
company could set up a new transfer point at R } end a route there,
and run a shuttle from R to J5. But this would only increase the
total number of routes needed. By adding more transfer points
the number of routes could be increased indefinitely, but we are not
interested in doing this, since the original problem asks for the
smallest possible number of routes.
13
TRAVERSING NETS OF CURVES
Fig. 2 is still a fairly simple network of lines, but the problem is
more complicated in this case. One possibility is for one route to
run from A around through B, C, D, E, and back to A (a closed route).
Fig. 1
A second route might go from A through F, G, H, to D. Three
more routes, BF, EG, CH, would be needed, making five in all.
However, this is not the best possible arrangement. The first two
routes could be combined into a single one running from A around
through B, C, D, E, and back to A } and then on through F, G,
and H to D. This cuts the number of routes to four, but would
it be possible to set up just three routes?
In the network of Fig. 3 one route might proceed from A through
-B, C, Z>, and E, back to A, and then on through F to B. That
would leave three sections, CF, DF, EF, of which two could be
Fig. 2
Fig. 3
combined into a single route CFD 9 leaving EF as a route by itself.
In this case the first two routes pass through F while the third ends
at F. The question still remains: would two routes do just as well?
It would not be unduly difficult to list all the possible routes for
Figs. 2 and 3 (as we did for Fig. 1) in order to be sure we have the
minimum number of routes. However, this would be quite a long
procedure for a fairly complicated network, and in any case it
would not be a very interesting problem. Also, it would no more
be mathematics merely to list all the possibilities for any given
network than it is mathematics to multiply together two seven-digit
numbers. The spirit of mathematics dictates that we seek out the
essentials of the problem and use them to solve it.
14
TRAVERSING NETS OF CURVES
The essential thing in this problem is quite simple. We must
consider where the ends of the various routes must lie. There must
be an end wherever a section has a free end as at A, JB, C, and D
in Fig. 1. Since in this case there must be four ends, and since
each route has no more than two ends (a closed route has no ends
of course), it is clear that there must be at least two routes between
these four ends. By means of a single consideration we have ob
tained the same result as was obtained above by considering all the
possible cases.
In Fig. 2 there are no free ends, but here there are places such
as A where three sections come together. At such a place at least
one route must end, since two routes can never be on the same
section. In Fig. 2 there are eight places of this kind, so there must
be at least eight ends of routes. Eight is an even number, the num
ber of ends of four routes. Therefore there must be at least four
routes; and, as we have already seen, four routes will do.
In Fig. 3 there are five junctions of the kind we have been dis
cussing, and a sixth point F where not three but five sections come
together. We shall call a point like F a junction of the fifth order.
At F two pairs of sections might be joined to form routes, leaving
one over, as was done in the first discussion of Fig. 3. The same
thing would be true for any junction of odd order; one section is
always left over, and hence there must be at least one end there.
We now see that Fig. 3 requires at least six ends (an even number)
and hence at least three routes.
For any network, no matter how complicated, we can easily
count the junctions of odd order and divide by 2 to obtain the least
possible number of routes. In the three examples so far, the number
of junctions of odd order was always even, and it turned out that
half this number of routes was not only necessary but also sufficient
to meet the requirements of the problem. We would now like to
determine whether or not the number of junctions of odd order is
always even, and whether half this number of routes will always do
for any given network.
No matter how complicated a network is, it is certainly possible
to find some system of routes having no more than one route on any
section. All one needs to do, in fact is to establish separate shuttle
lines on each section between each pair of junctions. However,
this will certainly require far too many routes. There will be other
systems using fewer routes, and we are looking for the best system,
the minimum number of routes needed. Among all the conceivable
15
TRAVERSING NETS OF CURVES
systems there will be certain ones that are best such that no
other system requires fewer routes.
It is clear that a system of routes is certainly not best if it includes
extra ends like the transfer point R of Fig. 1. Neither is it best if
more than one route ends at a point like F in Fig. 3, for if one
route came through C to F and another came through D to F 9 then
both could be connected at F to form a single route, and this would
decrease the total number. In order for a system to be best, the sec
tions at each junction must wherever possible be paired into routes.
This means that just one route will end at a junction of odd order,
none at a junction of even order, and the total number of ends will
be the number of junctions of odd order in the network.
We must still decide whether a system which is best can contain
a closed route. Such a closed route was mentioned in our discussion
of Fig. 2, but we connected it at A to the route AFGHD and decreased
the number of routes from five to four. The resulting route
ABCDAFGHD is not closed-it has the two ends A and D. This
same reduction can be made whenever a closed route contains a
junction of odd order, and a similar reduction can be made if the
closed route contains only junctions of even order. Let A be such
a junction (Fig. 4) on a closed route (shown here in the form of a
Fig. 4
figure eight). Some other routes through A are shown as dotted
curves and they may continue on in any way. If the system is
best no route can end at A, so the line from B through A continues
on, let us say, through E. We can then combine this line and
the closed route into a single route running through B to A, around
the closed route black to A, and then on through E. Since this reduces
the number of routes, the original system could not have been best.
However, we should note that in this reduction the newly formed
route may again be closed, since the original line through BAE may
have continued on and back to B. If this is the case, further reduc
tions may be possible. If at some stage the new closed route contains
16
SOME MAXIMUM PROBLEMS
a junction of odd order, then the next reduction will yield a route
that is not closed. Otherwise all the junctions of the original net
work must have been of even order and the system will finally be
reduced to a single closed route.
To sum up our results, we have seen that if a system is best, routes
end only at junctions of odd order and only one route ends at each
such point. Also, there will be a closed route in the system only
if all the junctions of the original network are of even order, and
then the closed route will traverse the entire network. In a best
system, the number of junctions of odd order is equal to the number
of ends of routes, and is therefore an even number. Furthermore,
the minimum number of routes is half the number of junctions of
odd order, except in the case where all junctions are of even order,
when the minimum is one (closed) route.
3. Some Maximum Problems
1. Let us compare the areas of various rectangles of two-inch
perimeter. Some are shown in Fig. 5. The width of each rectangle
must be less than 1 inch, and the closer it is to 1 inch the smaller
is the height and also the area. If the height is close to 1 inch,
then the width and again the area are very small. The inter
mediate rectangles have larger area, and one might ask which of
the rectangles has the largest area. This is a maximum problem.
It is probably the simplest and oldest of all such problems, and so
perhaps the most suitable one to use as an introduction.
D
Y
G
Fig. 5
Fig. 6
This problem is solved in Euclid, Book VI, theorem 27. Our
proof will use the same principles and will differ from Euclid's only
in its presentation. The rectangle ABCD of Fig. 6 is supposed to
represent any rectangle with a fixed perimeter P. The square
17
SOME MAXIMUM PROBLEMS
BEFG has each side of length JP, so its perimeter is also P. We
assert that the square is the answer to our problem., that its area is
greater than the area of any rectangle (not a square) ABCD having
the same perimeter. In the figure the shaded rectangle is part
of both the original rectangle and the square. The square also
contains the area X, while the original rectangle is made up of
and T. Now AB + BC is half the perimeter of the rectangle, and
GB + BE is half the perimeter of the square, so these two are equal;
AB + BC = GB + BE. This can be written as AG + GB + BC =
GB + BC + CE 9 from which we have AG = CE. Thus the rec-
angle X is just as high as T is wide. However, the width of X is
one side of the square, while the height of T is part of the side of the
square, and is therefore smaller. If two rectangles (Fig. 7) are the
Fig. 7
same length in one dimension, then the one with the larger other
dimension has the larger area, i.e., X is larger than T. Therefore
X + is larger than T + % 9 and the square has larger area than the
rectangle. The height of T would no longer be a part of the side
of the square only if ABCD were a square itself, and then the square
BEFG would itself be the original rectangle ABCD. Therefore the
square is larger than all other rectangles with the same perimeter.
This result has been stated as the Greeks would have put it. It
can also be written as an algebraic formula, which is the way we would
put it in modern mathematics. Let x and y be the dimensions of
the rectangle in inches. The area of the rectangle may then be
given in square inches and its perimeter is* + y + x + y = 2(x -\-y)
inches. Therefore the square has sides (# +y] inches in length
( x + j>\ 2
1 square inches. Thus, if x and y are any
two positive numbers^ the result becomes 1
xy ^
or
1 The symbol < means, and is read, ."is less than"; ^ means "is less than or
equal to." Thus, 3 < 5, 3 ^ 5, 3 ^ 3. Likewise, the symbol > means "greater
than," and ^ means "greater than or equal to."
18
SOME MAXIMUM PROBLEMS
/ ^
Vxy ^
Verbally, this might be stated: the geometric mean of two numbers is
always less than their arithmetic mean. The two are equal only if x
and y are equal.
2. We have now seen what is meant by a maximum problem
and its solution. First a solution is proposed, after which it is proved
that the figure named in this solution exceeds all the other figures
with which it is to be compared with regard to a given property
(in this case, area). It is now possible to turn to the main problem
of this section: to find the triangle of largest area that is inscribed in a
given circle. It is likely that this problem was discussed, if not solved,
at the time of Plato, a century before Euclid. However, neither
Euclid nor more modern books give the following solution, which
could easily have been understood and discovered by the Greeks.
Besides the original triangle ABC,, we consider the equilateral
triangle AQB Q C O inscribed in the same circle or another equal circle
(Fig. 8) . The area of AJB^C^ is completely determined, since the
Fig. 8
equilateral triangle is definitely fixed except that it may be turned.
We claim that the equilateral triangle is the solution to our problem,
that its area is greater than the area of any other inscribed triangle.
We first note that the circumference of our circle is split into three
equal arcs by the equilateral triangle, while the other triangle breaks
it up into three unequal arcs. Of these last three arcs, one must
be larger than one- third of the total circumference of the circle.
Otherwise the three arcs would each have to be exactly one-third
the circumference in order to make up the whole. The triangle
ABC would then be equilateral, but we have assumed that this is
not the case. In the same way, one of the arcs is smaller than one-
third the circumference. The third arc may be larger or smaller
than one-third the circumference; we are not able to decide which,
but it will not affect our argument.
Let us suppose that the triangle has been labeled in such a way
19
SOME MAXIMUM PROBLEMS
that the arc AB is less than one-third the circumference, and arc BC
is more. As in Fig. 9, we measure an arc CB" with a length equal
to that of AB along CB. The triangle CAB" is the mirror image of
ACB reflected in the diameter of the circle perpendicular to AC.
We also measure the arc AB' ', which is equal to one-third the cir
cumference in the same direction from A as B. Since AB is less
than one-third the circumference, B f will be past B. However,
B' will not be past B", that is, it will fall between B and B". If
it were past B", the arc AB' would be larger than AB", which is
the reflection of CB. But CB was greater than one-third the cir
cumference, while we made AB' exactly one-third the circum
ference.
Now since B r lies between B and B", it is higher than B. The
triangles ACB and ACS' have the same base AC, and the altitude
c 1
Fig. 9 Fig. 10
of ACB is less than the altitude of ACB'. Since the area of a triangle
is one-half the altitude times the base, ACB has less area than ACB'.
We have found a new inscribed triangle ACB' of greater area than
the original triangle ABC and with one side AB' equal to the side of
an inscribed equilateral triangle.
It may happen that ACB' is an equilateral triangle. This will
be the case if the arc AC cut off by the original tirangle was exactly
one-third the circumference. Then the proof that the equilateral
triangle is larger than the original triangle would be complete.
If ACB f is not equilateral we consider AB f as the base of the figure
just as AC was before. To do this we turn the figure around as in
Fig. 10 until AB r is at the bottom. We can now treat AB'C with
base AB' exactly as we did ACB with base AC. We will end up with
the triangle AB'C', which is equilateral since both B'C' and A'B
are each one-third the circumference of the circle. Because ^ J? C
of Fig. 8 and AB'C' are equilateral triangles inscribed in equal
circles, their areas are equal. We have shown that the area of
AB'C' is larger than that of AB'C, which in turn is larger than the
original triangle ABC, always assuming that ABC is not equilateral.
20
SOME MAXIMUM PROBLEMS
This completes the proof; the equilateral triangle has a larger area
than any other triangle inscribed in the same circle.
3. The previous result is a particular case of a more general
statement that can be proved in a similar way. We shall show
that of all polygons of n sides inscribed in a given circle, the regular polygon
is largest. If n = 3 the polygon is a triangle and this is our previous
result.
In order to prove this we need to notice just one more simple fact.
If we are given any polygon inscribed in a circle (Fig. 11), we can
inscribe another polygon in the circle having the same sides but in
any other order. All we need to do is draw the radii to the corners
of the polygon and cut the circle into sectors along these radii.
Fig. 11
These sectors can then be rearranged in any desired order. It is
obvious that the new polygon has the same area as the original one.
The proof now goes just about as before. In the first place, if
the polygon is not regular, there must be a side that cuts off an arc
less than one n-th of the circumference, and one that cuts off an arc
larger than one fl-th. In the case of a triangle any two sides are
always next to each other, but if n is greater than three the two
sides we are interested in may not be next to each other. However,
we have just seen that we can draw a new polygon in the same
circle, with the same sides and area, but with the two particular
sides next to each other. Suppose the smaller side is called AB,
the larger BG* We can measure off one 7Z-th of the circumference
from A in the direction of B and call its end B'. As before, B f lies
between B and the mirror image B" ofB. Then the new polygon
with B' in place of B and all the other corners the same has a larger
area than the original polygon. Also, one side, AB\ of the new
polygon is the length of a side of a regular polygon. We can apply
this same procedure to the n 1 remaining sides, always rearranging
the order of the sides if necessary. Continuing a step at a time, we
will finally arrive at a regular polygon, since a new side becomes
21
INCOMMENSURABLE SEGMENTS AND IRRATIONAL NUMBERS
the correct length at each step. Because the area is increased, each
time, the regular polygon has a larger area than the original polygon.
In a similar way it can be shown that of all polygons of n sides
circumscribed about a circle, the regular polygon has least area.
4. Incommensurable Segments and Irrational Numbers
The measurement of length, area, and volume is at the root of all
geometry. To measure one line segment by another, we see how
many times one goes into the other. This is simple enough if it
goes in exactly without leaving a remainder. If the smaller segment
does not go into the larger one exactly, then we look at the remainder.
It may happen that the remainder is one-half, one-third, two-thirds,
or some other similar fractional part of the segment we are using as
a measure. If so, we have a sort of substitute measure, a fractional
part of the segment that goes exactly into both the segment being
measured and the one that is used as a measure. This new segment
is a "common measure" for the two original segments.
The earliest geometrical problems undoubtedly involved a com
mon measure. For example, if a rectangle has sides 3 and 4 inches
in length, then by the Pythagorean theorem the square constructed
on the diagonal has area
3 2 + 4 2 = 9 + 16 = 25 sq. in.
and the diagonal is consequently 5 inches long. Since the smaller
side of the rectangle and the diagonal have a 1 inch segment as a
common measure, they are in the proportion 3:5.
It is quite natural to try to do the same thing for a square and to
look again for a common measure for the side and diagonal. If
we attempt to do this, we find a remainder (Fig. 12) and are forced
to try smaller and smaller fractional parts of the segment. The
INCOMMENSURABLE SEGMENTS AND IRRATIONAL NUMBERS
question arises as to whether one can eventually find a fine enough
measure or whether there just isn't any such measure, that is, whether
the two segments are "commensurable" or "incommensurable". This
leads to the question whether a line segment can be divided into
arbitrarily fine divisions or whether there is a limit to the possible
divisions. Is the line made up of a very large number of small
indivisible parts? Does it have an "atomic structure"? The con
ception 'of the atomic structure of matter is attributed to Democritus,
who lived just before Plato. However, there is a difference between
this and the atomic structure of the line. One can easily consider
a line as "continuous," i.e. capable of arbitrarily fine division, and
still suppose that physical material is strung along it in a series of
atoms! A fragment attributed to Anaxagoras has also been preserved
from about the same time; this asserts, in effect, that the line is
continuous. The fact that only this fragment has been handed
down does not signify that it was just an accidental utterance*
Rather, it was a controversial thesis that led to much discussion,
and it is representative of a time when mankind was making real
steps toward the solution of this basic problem.
We can imagine the tremendous effect of the more far-reaching
discovery that the side and diagonal of a square are incommensurable. This
discovery is attributed to the Pythagoreans, a secret society of
Southern Italy about whom very little is definitely known. Ac
cording to a legend, the Pythagorean who made these investigations
public atoned by perishing in a shipwreck. Perhaps this is more
allegory than truth, since it may refer to the shattering effect that
the discovery of irrationals had on the foundations of contemporary
thought. In any event we have Plato's own report in his Laws of
how this discovery excited him when he first learned of it.
We shall give two proofs of this fact without considering the in
teresting historical question of which proof is the older. The second
was given not only by Euclid but even by Aristotle. The first,
apparently older, is of the type usual with the Greeks and, in spirit,
belongs to the tenth book of Euclid.
For the first proof we must begin by noticing certain facts of an ele
mentary geometric nature. The whole argument is easily recognized
as arising from a vain attempt to find a common measure by measuring
the side off along the diagonal and then continuing to attempt to
find a suitable fractional part. We measure off the side along the
diagonal from B (Fig. 13). It goes in only once and we can call
the point where it ends D. The line B'D has been drawn perpen-
23
INCOMMENSURABLE SEGMENTS AND IRRATIONAL NUMBERS
dicular to BD and B r is the point where it crosses AC. Also B and
B' have been connected. We have BA = BD, BB' = BB', angle
BAB r = angle BDB r , the last because each is a right angle. There-
A" C
Fig. IS
fore by one of the congruence theorems, the triangles BAB' and
BDB' are congruent and consequently AJB' and DB', being corre
sponding sides, are equal. Also angle B'CB, between a side of
the square and a diagonal, is half a right angle, and since angle
CDB' was constructed a right angle, just half a right angle is left
over for the third angle of the triangle CDB'. Therefore CDB' is
an isosceles right triangle and its legs DB' and DC are equal.
Combining what we have proved, we have
(1) AB' = B'D = DC.
We now erect a perpendicular AC to the diagonal at C and make
it equal to DB'. When A is connected to B', A'B'CD forms a new
square smaller than the original one. The whole process can now
be applied to the new square, its side is measured off along its
diagonal B'C from B to give >', and the line B"D' drawn perpen
dicular to B'C. As before, we have
(2) A'B" = B"D f = D'C.
Clearly this can be repeated again and again, and each time there
will be a remainder on the diagonal which can be used as a side
of the next square. Although the process will never end, the
remainders (which are never zero) become smaller at each step:
(3) CD > CD' > CD" > CD" 1 > - -.
Each of these remainders is the difference between the diagonal and
side of the corresponding square:
(4) CD^CB-AB, CD'^CB'-A'B', CD"=CB"~-A"B", - - -.
24
INCOMMENSURABLE SEGMENTS AND IRRATIONAL NUMBERS
This completes the geometric preliminaries for the first proof.
The proof itself will be an indirect one. We assume that the side
and diagonal are commensurable and show that this leads to an
impossible situation. If the two are commensurable they have a
common measure, a segment E that goes exactly into both the side
and the diagonal. Now if any two segments are exact multiples
of E> their difference is also an exact multiple of E. Therefore, if
CB and AB are exact multiples of E, so is CD by (4). Also, by (1)
we have CB' = CA B'A = AB CD, which is a difference of
multiples of E and hence an exact multiple of E itself. The second
square of Fig. 13 has side A'B r = CD and diagonal CB', both of
which are multiples of E. From the first square we have gone to
the second and we can go on in the same way; from A'B 1 and CB'
we find that CD', A"B", and CB" are multiples of E, and so on
through the further squares.
We now come to the contradiction. If CB and AB are exact
multiples of E, then we have seen that CD', CD", CD" 1 , - - are also
exact multiples of E. But (3) shows that these multiples of E get
smaller and smaller without stopping, though they are never zero.
This is not possible since, for example, if CD were lOOOjE, then
CD', a smaller multiple of E, would be at most 999 E, etc., until
at the latest the 1001st term would be less than E. It would then be
zero, since it is a multiple ofE less than E. This contradicts the fact
that no term of (3) is zero.
The second proof is much simpler, and the arithmetical preparation
for it is shorter than the geometric preliminaries of the first proof.
We first consider even and odd numbers. An even number is
twice some other, so that it can be written 2x. An odd number is
an even number increased by 1, and can be written 2# + 1. The
square of an odd number is always odd, for
(2* + 1)2 = 4** + 4* + I = 2(2* 2 + 2*) + 1
is again twice a number increased by 1. From this we can im
mediately prove:
Lemma 2 1. If the square of a number is even then the number itself
is even. If the number were odd, as we have just seen, its square
would also be odd. It is even easier to prove:
Lemma 2. The square of an even number is always divisible by 4. Thus
(2#) 2 = 4# 2 is 4 times * 2 , or simply 4^.
2 A lemma is a proposition not important enough to be called a theorem,
but which is to be used in the later proof.
25
INCOMMENSURABLE SEGMENTS AND IRRATIONAL NUMBERS
The main proof is again indirect. We again suppose that the
side and diagonal of the square have a common measure E. Let
the diagonal be d times E, the side s times E. Then, applying
the Pythagorean theorem to one of the right triangles formed by
two sides of the square and the diagonal, we have
(5) d* = s 2 + s\ d* = 2s*.
We can suppose that d and s have no common factors since they
can always be reduced by a new choice of E. For example, 10 and
16 have the common factor 2, but they can be reduced to 5 and 8
by doubling the size of the common measure E. From now on we
shall assume that such a reduction has been made.
From (5) we see that d? is twice a number and hence is even.
Lemma 1 then shows d is also even. Consequently s must be odd,
since otherwise d and s would have the common factor 2 in contra
diction to the fact that they are reduced. However, since d is even,
lemma 2 shows that d 2 is divisible by 4, d 2 = 4g, and 2j 2 = 4g or
^2 _. 2g. Therefore j 2 is even and so is s (by another application
of lemma 1). This contradicts the fact that we have just 'shown
(that s is odd), and hence shows that the original assumption, that
s and d have a common measure, is false.
The essence of both proofs is that a decreasing sequence of whole
positive numbers must finally come to an end. In the first proof
it is apparent in our discussion of the sequence (3). In the second
proof it is hidden but really included in the remarks concerning the
reduction of the numbers d and s. The proof of the fact that the
reduced form of two numbers can be found depends on a series of
decreasing steps.
In the current mathematical notation, formula (5) would be
written
(4)'=
2.
Likewise, our final result would be written: There is no fraction
(no rational number) x whose square is 2. This can also be
s
stated as: There is no rational number which is equal to y% or
finally, V 2 * s an "irrational number".
26
5. A Minimum Property of the Pedal Triangle
We shall again consider a problem of the kind discussed in Chapter
3, but this time it might more properly be called a minimum
problem. It will serve to introduce mathematical methods that are
highly refined yet clear and simple. The theorem and the proof
we shall give here are the work of H. A. Schwarz. Although the
theorem is only a relatively minor mathematical problem, it shows
how this great mathematician's genius manifests itself, equally in
relatively trivial and extremely significant work.
1. Before considering our main theorem, let us look at a very
simple problem concerned with the law of reflection of light. It
is well known that if a light ray starts at A (Fig. 14) and strikes a
mirror g, it is reflected toward B in such a way that the angle of
A 1
Fig. 14 Fig. 15
incidence and the angle of reflection are equal. What we want
to prove is that the path ADB that the light ray chooses is the shortest
of all possible paths from A to B that touch the mirror g. It is the
path that a steamboat should take if it were required to go from a
place A to another place B and to touch at the bank g on the way.
We shall not go into the question of why a light ray, which does
not have the power of reasoning, chooses the same path that would
be chosen by the pilot of the steamboat after considerable thought.
All we shall prove is the purely mathematical fact that the path ADB
with equal angles of incidence and reflection is shorter than any other path ACB.
The proof depends on a device that seems artificial from a mathe
matical standpoint, but which is quite natural from the point of
view of optics. We reflect the point A and the lines AC and AD
in the mirror g (Fig. 15). If A' is the image of A, then A'C is the
image of AC and A'D is the image of AD, so that A'C = AC,
A'D = AD, and A'E = AE. Therefore the triangles EDA and
EDA' are congruent and the angles EDA and EDA' are equal.
According to our hypothesis we have angle EDA = angle CDB.
27
A MINIMUM PROPERTY OF THE PEDAL TRIANGLE
Therefore angles CDB and EDA' play the roles of vertical angles;
that is, A'DB is a straight line.
Now the lengths of the paths ADB and A'DB, as well as ACB
and A'CB, are equal. Since A'DB is a straight line connecting
A' and 5, it is shorter than the path A'CB, and consequently ADB
is shorter than ACB. Here we have used the fact that a straight
line is the shortest distance between two points.
2. We now turn to our main problem, to inscribe in a given acute-
angle triangle ABC a triangle UVW whose perimeter is as small as possible
(Fig. 16). The assertion is that the "pedal triangle" EFG (Fig. 17),
whose vertices are the three feet of the altitudes of the triangle ABC, has a
smaller perimeter than any other inscribed triangle UVW.
A I/ C
Fig. 16 Fig. 17
We must first prove a lemma concerning the pedal triangle. We
assert the angles AFG and CFE are equal (as in the law of reflection) ,
and consequently that the analogous angles at E are equal and also
those at G. In order to prove this lemma we must recollect some theo
rems of plane geometry: the theorem of Thales, that an angle inscribed
in a semicircle is a right angle (Fig. 18); that inscribed angles which
intercept the same arc are equal (Fig. 19); that the altitudes of a
triangle meet at a point. Using these, we see that the circle with
diameter AH passes through G and F and that the circle with
B
Fig. 18
Fig. 19
Fig. 20
(Fig. 20). Furthermore,
diameter CH passes through E and F
angle AFG intercepts the arc AG, as does angle AHG, so these two
angles are equal. In the same way we see that the angles CFE
and CHE are also equal. But angles AHG and CHE are vertical
angles and hence are equal. Therefore we have angle AFG =
angle CFE.
28
A MINIMUM PROPERTY OF THE PEDAL TRIANGLE
3. We can now commence Schwarz's proof. We reflect the
riangle ABC in the side BC (Fig. 21), reflect the reflected triangle
n its side CA' ', reflect the resulting triangle in A'B' y reflect in B'C',
hen C'A", then A"B" y a total of six reflections. We first prove
he fairly obvious fact that the final position A"B"C" is the original
position ABC moved parallel to itself without turning. The first
wo reflections shift ABC to the third position A f B'C. This shift
ould have been made without reflections and without lifting the
riangle out of its plane merely by rotating it about C through
he angle 2C in a clockwise direction. Similarly, the shift from
he third to the fifth position could be accomplished by a clockwise
otation through the angle 2B about the point B'~ Finally, a clock-
rise rotation through the angle 2A about A" would yield the seventh
r final position. In all, the triangle has been rotated one complete
evolution, through the angle 2C + 2B + 24, since the sum
1 + -B + C of the angles of a triangle is a^ straight angle. The final
>osition of the triangle therefore has the same orientation as the
uiginal; it has merely been moved but remains parallel to itself.
rherefore BC is parallel to B"C".
Fig. 21
Now we want to trace the various positions assumed by the pedal
riangle and the triangle UVW under the successive reflections,
fhese are shown in Fig. 21 by dotted lines and shading. From our
emma concerning the pedal triangle, we see at once that the second
)osition of EG forms a straight line with the first position of FE.
n the same way, one side of the pedal triangle will always lie on
he continuation of this line in successive positions. Therefore the
traight line EE" is made up of 6 segments, 2 equal to FG y 2 to GE 9
ind 2 to EF: hence it is equal to twice the perimeter of the pedal triangle.
Tracing out the positions assumed by the arbitrary triangle UVW
n the same way, we find that the zig-zag line UV'W'U'V"W"U">
onnecting U and U", is equal to twice the perimeter of the triangle UVW.
We have already seen that the segments UE and U"E" are
>arallel since they lie on BC and B"C". They are also equal, since
29
A MINIMUM PROPERTY OF THE PEDAL TRIANGLE
they are corresponding segments in two positions of the triangle
ABC. Then by a theorem of plane geometry EE"UU" is a paral
lelogram, and consequently its other two sides are equal, UU"=EE".
Therefore UU" is also equal to twice the perimeter of the pedal
triangle. The straight line UU" connecting U and U" is shorter
than the zig-zag line connecting the same two points, and the zig-zag
line is twice the perimeter of UVW. Therefore the perimeter of
the pedal triangle is less than the perimeter of UVW, as was to be
proved.
This proof is typical of many truly mathematical proofs. It
consists essentially in transforming the hypotheses and conclusion
until the true kernel of the theorem can be recognized at a glance.
6. A Second Proof of the Same Minimum Property
1. In the last chapter we proved that of all the triangles inscribed
in a given acute-angle triangle, the pedal triangle has the smallest
perimeter. It is worthwhile to consider another proof of the same
theorem, because this second proof will illustrate some new ideas
and, for our purposes, the methods used are of more importance and
interest than the mere mathematical content of new theorems.
The previous proof, originally given by H. A. Schwarz, depended
essentially on the fact that a straight line is the shortest distance
between two points, and it made use of the idea of reflection of a
figure in a line. These two principles also form the basis of the
second proof, and it is of interest to contrast the manner in which
they are used in the two proofs. The following proof was given by
L. Fejer, who discovered it as a student and thereby won especial
recognition from H. A. Schwarz.
2. In the given acute-angle triangle ABC (Fig. 22), let UVW be
an arbitrary inscribed triangle with U on BC, V on CA, W on AB.
A
Fig. 22
Let U be reflected in the two lines AC and AB and call U' and U"
the two images. Now f/Fand U'V are mirror images of each other
and hence are equal. For the same reason UW and U"W are also
30
A MINIMUM PROPERTY OF THE PEDAL TRIANGLE
equal. The perimeter of the triangle UVW is UV + VW + WU
and therefore it is equal to the length of the path U'VWU".
If we hold U fixed but move V and W to new positions, then the
points U' and U" remain fixed because they depend only on U and
the triangle ABC. The path U'VWU" then connects the two fixed
points U' and U", and its length is always equal to the' perimeter
of UVW. The shortest path from U' to U" is a straight line. There
fore the straight line segment U'U" is the smallest possible perimeter
for an inscribed triangle with one vertex held at U. This minimal
triangle with vertex at U, which we shall call UMN, is shown in
Fig. 22.
3. Having found the triangle of smallest perimeter with vertex
at /, we need only compare the minimal triangles for various
positions of U and pick out the one with smallest perimeter. That
triangle will certainly have the smallest perimeter of all inscribed
triangles.
We must determine the position of U so that the segment U'U"
is as small as possible. For this purpose we first notice that the
triangle AU'U" is isosceles, with AU' and AU" as equal sides.
Indeed these two segments are each mirror images of the same
segment AU and hence are equal, AU = AU' = AU".
Although the legs of the triangle AU'U" are equal in length to
AU and hence depend on the position of U on JBC, the size of the
angle U"AU r does not depend on the position of U. This angle is com
pletely determined by the original triangle ABC and nothing else,
since (because of the reflections) we have the following equations
between the angles of the figure:
UAB = U"AB, UAC = U'AC.
From the first we have
U"AU = 2UAB,
ajid from the second,
U'AU = 2UAC;
and hence
U'AU + U"AU = 2UAB + 2UAC
or
U'AU" = 2BAC,
which proves our assertion concerning the angle U'AU".
4. In the isosceles triangle AU'U" we want to make the base
U'U" as small as possible. Since the angle at A does not depend
on U 9 all these triangles, for different positions of U, have the same
31
A MINIMUM PROPERTY OF THE PEDAL TRIANGLE
vertex angle. Of all these the one with the shortest base will also
have the shortest legs. The legs AU 1 and AU" have the length AU.
Therefore we will obtain the shortest segment U'U" if we choose U
so that AU is as short as possible.
Now the segment A U connects the point A with the line BC 3
and it is well known that the shortest distance from a point to a
line is the perpendicular distance. Therefore we must choose U
so that AUis perpendicular to BC, that is, so that AU is the altitude
of the triangle ABC through A.
5. Let us now construct this triangle EFG of smallest perimeter
(Fig. 23). Let E be the foot of the perpendicular to BC through A.
If E' and E" are the images of E under reflection in AC and AB,
then E'E" is the length of the smallest perimeter of an inscribed
triangle. The two points F and G where the line E'E" cuts AC
and AB are the other two vertices of the minimal triangle.
Be. c
Fig. 23
If we think back over what we have done, we see that every
inscribed triangle UVW, different from EFG, must have a larger
perimeter. For if U is different from JE 9 then the segment U'U"
is larger than E'E" and the perimeter of UVW is at least as large
as U'U". If U and E coincide, then one or both of the points Fand
W will differ from the points F and G, and the path E'VWE" will
differ from the straight line E'FGE". In both cases, then, the
perimeter of UVW will actually be greater than the perimeter
of EFG.
6. These considerations have shown that the problem of finding
an inscribed triangle of least possible perimeter has only one solution.
We shall make use of this uniqueness of the solution. Our construc
tion of the minimal triangle did not treat all three vertices in the
same manner. One vertex E is the foot of the altitude through A,
but the other two were found by a construction having nothing to
do with the altitudes through B and C.
We could have carried out our argument starting with the vertex
B in place of A. That is, in 2, instead of reflecting the point U
in the sides AB and AC, we could have reflected the point V in BA
32
A MINIMUM PROPERTY OF THE PEDAL TRIANGLE
and BC and continued accordingly. We would then have ended
with a minimal triangle whose vertex F was the foot of the altitude
through B. Since there is only one minimal triangle, this construc
tion starting with B must have led to exactly the same triangle EFG
as the original construction starting with A. Since we could also
start with the vertex C, we may conclude that in the minimal
triangle EFG not only E but F and G as well are feet of altitudes.
Thus we have proved the theorem.
7. We can still get a little more from the proof. Making use
of the uniqueness of the solution, we saw that if E had a certain
property (being a foot of an altitude), then F and G also had the
analogous property. Similarly, any property that F and G have by
our construction will also hold for E. Because E 1 is the mirror
image of , the angles EFC and E'FC are equal. Since E'FC and
GFA are vertical angles, they are also equal, and angle EFC =
angle GFA. That is, the two sides of the minimal triangle that go
through Fforrn equal angles with the side ,4Cof the original triangle.
The corresponding statement holds for the point G. If we had started
our construction with F as the foot of the altitude through J5, then this
same proof would show that the angles GEB and FEG are also equal.
Disregarding the minimal property of the triangle EFG, we know
from 6 that EFG can be characterized as the pedal triangle.
Combining these two results, we have the theorem: if a pedal triangle
is inscribed in an acute-angle traingle, then the two angles formed
at each vertex of the pedal triangle by the two sides of the pedal
triangle and the side of the original triangle are equal.
This theorem now contains nothing concerning a minimum. It
is the type of theorem customarily found in elementary geometry,
and it could be proved by the methods of elementary geometry.
In fact, we have actually done just that in Chapter 5. Schwarz's
proof needed this result as a lemma, and we supplied it by making
use of a number of theorems concerning circles. An advantage of
Fejer's proof is that it makes use of nothing other than the principle
of the shortest distance and reflections. Furthermore, Fejer's proof
is distinguished by the fact that it uses only two reflections, while
Schwarz's employs six.
8. There is a counterpart to the theorem on the pedal triangle:
In every acute-angle triangle there is one and only one point the
sum of whose distances from the three vertices is a minimum. This
point is so situated that the lines that join it to the three vertices
form angles of 120 with each other.
This theorem was proved by L. Schruttka by a method suggested
33
THE THEORY OF SETS
by analogy with Schwarz's proof of the pedal triangle theorem. A
much shorter proof has been given by Btickner, and this is the one
we shall use.
Let P (Fig. 24a) be an arbitrary point in the acute-angle triangle
ABC. Let the triangle AGP be turned about the point A through
60 to the position AC'P'. This rotation is to be made in such a
direction that AC turns out of the triangle, so that finally the line
C" C"
c \^^ />/ c
Fig. 24a
Fig, 24b
AC lies between AB and AC'. Then we have C'P' = CP and
PP' = AP, for the triangle APP f is not only isosceles but equilateral,
its angle at A being 60. Then the path BPP'C' represents the sum of
the distances of P from the three vertices A, B, C. The point C' is in
dependent of the position of P. All the paths corresponding to
various positions of P join B and C'. The shortest of these paths
is the straight line BC' (Fig. 24b). Therefore the minimal point
P must lie on BC', and its position is completely determined by the
fact that angle AP C r = 60. The supplementary angle AP B is
then 120. The construction shows that there can be only one
minimal point P . Consequently the same construction with A
replaced by another vertex will lead to the same point P . There
fore the angles BP Q C and CP A are also 120.
7. The Theory of Sets
The subject of this chapter lies at the very foundations of mathe
matics. However, our interest in it will depend more on the beauty
and simplicity of the manner in which it is built up than on its
significance for mathematics in general. The theory of sets, origin
ated by Georg Cantor, is a truly mathematical theory which starts
with only the very simplest concepts and builds up to a ramified and
important subject through the use of pure reasoning.
Are there more whole numbers than even numbers? Which are
more numerous, the points of a line segment or the points of the
surface of a square? It is just such questions as these that started
34
THE THEORY OF SETS
Cantor on his theory. It is important to avoid jumping to con
clusions when trying to answer these questions. The questions as
they are put are not precise because we don 3 1 know exactly w r hat we
mean by one being more than another. Cantor's first important
step was to give them a precise meaning by using simple methods
of counting off, as is done for finite numbers, and by carefully
differentiating between cardinal and ordinal numbers, a distinction
that is merely grammatical for finite numbers.
A simple example will point out the direction in which we are
to go. Suppose we are in a dance hall and are asked whether
there are more men or women present. What would be the easiest
way to decide? One method would be to separate the men and
women into two groups, to count each group, and to compare the
numbers. A simpler method, however, would be to start the dance.
The men and women would pair off, and it would only be necessary
to observe whether those left over were men or women. We suppose
that everyone dances if he can find a partner.
This principle of pairing off was adopted by Cantor as a starting
point. If we wish to decide whether there are more whole numbers
than even numbers we are to try to pair them off. In fact, we
can find a pairing in which each whole number is paired with an
even number and none is left out. We do it as follows:
1, 2, 3, 4, 5, 6,
2, 4, 6, 8, 10, 12, - - -
where each whole number in the upper row is to be paired off with
the even number directly below it. By this method every number
gets paired off and none is left out. This simple fact is quite
remarkable. The two rows have been paired off exactly and yet
the lower row consists of part and only part of the upper row.
This brings up an essential difference between this case and the
case of finite numbers. In the case of the dance hall (where there
is a finite number of dancers) it is quite immaterial which man
dances with which woman. The number of those sitting out is
always the same and it does not change as long as no one enters or
leaves the hall. It is quite different in the case of the whole numbers
and even numbers. We have seen a pairing off that comes out
exactly, including every element, but it is easy to give another which
does not do so. We can use the most natural pairing: 2 with 2,
4 with 4, 6 with 6, etc. We have paired off the even whole numbers,
but all of the odd ones are left over. The essence of Cantor's theory
35
THE THEORY OF SETS
is that it abandons the idea of arbitrary pairing and only demands
that we find some one pairing off that comes out exactly. If such
a pairing can be found, the two sets are said to be of the same power.
Cantor's next step was to prove that the set of all rational numbers^
i.e. all whole numbers and fractions, does not ham higher power than the set
of whole numbers. To construct the appropriate pairing we arrange
the fractions not in order of size but according to the size of the sum
of their numerator and denominator. We consider only reduced
fractions and begin with those the sum of whose numerator and
denominator is 2. There is only one such fraction: 1/1 = 1. Those
with sum 3 are 1/2 and 2/1 = 2. Next come 1/3, 2/2, 3/1, with
sum 4, but 2/2 is omitted because it is not reduced. These are then
followed by 1/4, 2/3, 3/2, 4/1, etc. We now write the fractions
in this order under the series of natural numbers,
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
1, 1/2, 2, 1/3, 3, 1/4, 2/3, 3/2, 4, 1/5, 5, 1/6, - - -,
and can pair off each whole number with the fraction directly
under it. The lower row does not leave out any rational number,
because each one has a definite sum of numerator and denominator
and must therefore have some place in the row. This, then, is
the pairing off we desired.
This surprising result is expressed by saying that the set of rational
numbers is "countable" or "denumerable" because the pairing off
with the natural numbers is, in effect, a counting off of the rationals.
In general, a set is denumerable when it can be paired off with the
natural numbers, that is, when it has the same power as the set of the
natural numbers. Cantor goes on to show that several other sets, which
are much more extensive than the natural numbers, do not actually
have higher power. We shall pass by these examples since they
involve further mathematical concepts and our one example is
adequate.
Thus far all the infinite sets considered have had the same power.
Cantor's theory would be trivial if there were no sets of higher power.
Cantor showed that sets of higher power do exist by proving that the
set of points of a line segment is of higher power than the set of natural
numbers. The proof is an indirect one. We suppose that there is a
pairing between the points of, say, a 1 inch segment and the natural
numbers, and show that this leads to a contradiction. This pairing
off puts the points in an order, the first paired with 1, the second
with 2, etc., but this will obviously not be the natural order of the
36
THE THEORY OF SETS
points on the line. It is convenient to identify each point by its
distance from one end of the line segment. For example, the middle
of the segment corresponds to the number 0.5 and each point
corresponds to some decimal fraction. In order to measure the
points exactly we must use infinite decimals; for example the end
of the first one-third of the segment corresponds to 0.33333 . In
the pairing off of the points of the segment with the natural numbers
we can use the corresponding infinite decimal fractions instead of
the points themselves. In the first place there will be some decimal
0. corresponding to 1, in the second place a similar decimal, etc.
It is a little easier to list them in a vertical column, giving us a list
of the type shown below where we have inserted particular numbers
merely for purposes of illustration.
1. 0.35420 - - -
2. 0.61773*--
3. 0.55549 - - -
4. 0.01007 - - -
5. 0.20206 - - -
Now we shall show that there is a decimal fraction 0. - - - which,
contrary to our assumption, is not included in the list. This decimal
can be found as follows: we choose for the first digit after the decimal
point a digit that is different from the first digit of the first decimal
in the list. This gives us a choice of 9 digits. In order to make
it definite, let us choose the digit 1 unless the first digit of the first
decimal in the list is 1, in which case we shall choose 2. Now it is
clear that our new decimal will differ from the first decimal of the
list no matter what choice is made for the remaining digits, for the
two certainly differ in their first digits and will represent different
points even if they agree in all the other digits. We now choose
1 to be the second digit unless (as in the above example) the second
digit of the second decimal of the list is 1, in which case we choose
2. Therefore in either case the second digit of our decimal is
different from the second digit of the second decimal of the list.
Then our decimal will differ from the second decimal of the list as
well as from the first. We continue choosing digits in this way.
In the illustrative example our decimal would begin 0.12111 .
Since the process can be continued indefinitely, we have defined an
infinite decimal fraction which is different from all the decimals
37
THE THEORY OF SETS
in the list. The fact that the list does not contain this decimal
contradicts the assumption that they could be paired off. Therefore
the points of a line segment are not denumerable.
There is a certain objection to the above proof which must
be mentioned. It is caused by the decimals that end in an in
finite row of 9's, like 0.269999 , which is no different from
0.270000 = 0.27. The difficulty is that two different decimals
can represent the same point if one of them ends in 9's, and this
means that we were not justified in replacing the points of the
segment by the set of all decimal fractions. 3 The two sets do not
correspond exactly. Cantor avoided this difficulty by putting the
proof in quite a different form, but there is a very simple way to
dispose of the objection. We merely rule out all decimals ending
in 9's. That is, we would not identify a point by 0.269999
but would use 0.270000 -. All that we have to do is to make
sure that the decimal we construct does not end in 9's. But this
does not occur, since it is made up entirely of the digits 1 and 2
and does not contain any 9's.
This result has an interesting consequence. Since the set of
rational numbers is denumerable, we see that it is of lower power
than the set of all numbers between and 1. Therefore there must
certainly be numbers between and 1 which are not rational. Thus
we have proved the existence of irrational numbers by means of
quite general considerations. The same thing was proved in an
entirely different way in chapter 4.
Cantor's next result is also rather surprising. It states that the
set of points of the surface of a square does not have higher power than the
set of points of a side of the square. The reason this is so surprising is
that it contradicts the intuitive idea of dimension. A one-dimen
sional segment has the same power as a two-dimensional square,
and it can also be shown that a three-dimensional cube also has the
same power.
As in the previous proof we shall use infinite decimal fractions
and again we shall exclude decimals ending in 9 5 s. The points of
the segment will again be represented by decimals 0- just as
before. The points of the square will be characterized by a pair of
decimals, one decimal, x, representing the horizontal distance from
the left side of the square and another, jy, representing the vertical
distance from the lower side of the square (Fig. 25). We shall
3 For a further discussion, see R. Gourant and H. Robbins, What is Mathematics?,
1941, Oxford University Press, New York, pp. 64-66, 80-82.
38
THE THEORY OF SETS
set up a pairing in the following way. Each point P of the square
is characterized by two decimals
Fig. 25
x = Q.a&az - , y = O.^i^
From these two we form the single decimal
by alternately taking digits from x and y. Then we pair off the
point P of the square with the point ), of the segment characterized
by the decimal z- For example, the center of the square has
x = 0.500 - , y 0.500 - , and it is paired off with the point
of the segment corresponding to z = 0.550000 . In this way
each point of the square is paired off with some point of the segment.
But this is not quite enough. If we had only to pair each point of
the square with some point of the segment we could merely pair
each point P with the point directly above P on the upper side of the
square. But then each point of the side would be paired, not with
just one point of the square, but with an infinite number (all the
points on a vertical line). But this is ruled out in Cantor's method
of comparing powers of sets, just as in the case of the dancers we
did not suppose that one person would dance with several partners.
Our original, more refined, method of pairing off avoids this diffi
culty. For if a second point P' of the square with, say,
X =
was paired with the same point ^of the segment, then we would have
This decimal expression for z and the original one can be equal only
if the two correspond exactly in all digits,
But this shows that all the digits of*' are equal to the corresponding
digits of x 9 and the same is true for y' and y. Therefore we have
39
THE THEORY OF SETS
x' = x, y' = jy, and P 1 is the same point as P, contrary to the as
sumption that it was a different point. Consequently two different
points of the square are never paired off with the same point of the
segment.
The pairing will be complete when we show that no point of the
segment is left unpaired. This is easily seen, for if
Z = Q.CiCtftCiC 6 c B -
corresponds to any point of the segment, then the point of the
square with
x = 0.^3*5 - - , y = 0., and this is exactly the
decimal z with which we started.
There is still an objection to this proof which arises from decimals
ending in 9's. The point on the segment with z = 0.2202020 *
is paired with the point of the square having
x = 0.2000 , y = 0.2222 - - -.
The point on the segment with z' = 0.12929292 is paired with
the point of the square having
x' = 0.1999 -,/ = 0.2222 - ,
but here we have a decimal ending in 9's and we should write
#' = 0.2000 -. Therefore the two different points on the segment,
corresponding to z and z'> are paired with the same point of the
square.
A very simple, although not obvious, trick will eliminate this
difficulty. In the exact form in which it is given above, the proof
is not correct, but if we make the one following modification it
becomes valid. Whenever a digit 9 occurs in a decimal, we com
bine it with the digit to its right to form an inseparable "molecule".
If there are several consecutive 9 J s we include all of them and the
next digit in the molecule. Thus in the above example we would
have O.(l)(2)(92)(92)(92) and, as a new example, we would
write z" = O.(7)(3)(94)(990)(9997) -. These molecules are to
take the place of the digits in the proof, so that from,?", for example,
we would have
x" = 0. (7) (94) (9997) - - , y" = O.(3)(990) - -.
The pairing is now quite different from what it was before but,
40
THE THEORY OF SETS
after a moment's reflection, we see that the proof now goes through
without difficult so long as we remember to rule out decimals
ending in 9's.
At this point Cantor encountered a problem which has since
become famous. This was the question whether there exists a set
having a higher power than the set of natural numbers but a lower
power than the set of points of a segment.
This problem has been named " the problem of the continuum".
It defied all of Cantor's attempts to find an answer, as well as those of
his successors. Probably no other mathematical problem formulated
with so little mathematical preparation has ever defied solution so
stubbornly as this one. The only concepts used in the formulation
of the problem are those of whole numbers and line segments.
There is nothing remarkable in proposing one or many very difficult
problems if one uses complicated mathematical ideas. Using only
simple ideas to formulate a problem that is neither easily solved
nor trivial reveals' the true art of mathematics. Considered from
this point of view, the problem of the continuum stands out as a
shining example.
It soon became evident that the fundamental theory of sets, on
which all of mathematics depends, could not be properly developed
until the concept of what is meant by a set had been carefully
analyzed. One is forced to contemplate such an analysis because
of the following famous paradox in the theory of sets.
We have been talking of sets. These sets consist of "elements"
of one sort or another. The set of points of a segment contains as
elements the individual points of the segment. The whole numbers
themselves are the elements of the set of whole numbers. The
relation between the sets and elements is the same as that between
an association and its members. Sometimes an association has as
members not individuals but associations. For example, the
United Nations is an association of nations, each of which is an
association of members or citizens. The actual membership of the
United Nations is made up of individual countries, not of the citizens
of those countries. In the same way a set can contain sets as
elements. An example is the set of all denumerable sets, all of
whose elements are sets in themselves. Being a citizen of a parti
cipating nation does not make a person a member of the United
Nations. Similarly the number 1/5 is an element of the set of
rational numbers, which we have proved denumerable, but that
does not make it an element of the set of all denumerable sets.
41
THE THEORY OF SETS
Can a set contain itself as an element? The ordinary sets which
we have been considering do not have this property. However, it
is easy to give examples to show that such extraordinary sets do
exist. The set of all conceivable sets is of this type, since it is a
particular set in itself. For the moment we shall call a set that
contains itself as an element an extraordinary set, while we shall
call the others ordinary.
Now let us consider the set of all ordinary sets, and let us call it
the set s. Is s itself an ordinary or an extraordinary set? It must
be one or the other. If s is extraordinary then it must contain
itself as an element. But then s is a member of s and hence is an
ordinary set, as are all elements of s. This is a contradiction, so
s is not extraordinary. However, if s is ordinary, then it does not
contain itself as an element. Therefore s is not an element of s,
which contains all ordinary sets as elements, and so s is not ordinary.
This is again a contradiction, so s is not ordinary. This is the
paradox: s must be either ordinary or extraordinary, but each
possibility leads to a contradiction.
This paradox is not specifically restricted to the theory of sets.
In order to make this clear we shall reformulate the same paradox
in a rather frivolous way. This formulation is completely free of
the idea of sets. In a certain regiment a soldier is detailed to take
over the duties of barber. His exact orders direct him to shave
everyone in the regiment who does not shave himself. Should the
soldier shave himself? If he shaves himself, then he is one who
shaves himself, and his orders direct him not to. If he doesn't
shave himself then he is one who does not shave himself and again
he violates his orders. What can the man do to carry out his
orders strictly?
The paradox is a purely logical one. We are inescapably led
to it in the theory of sets, but it is more general than that and it
does not need that theory for its formulation. The old, rather dull,
subject of logic has developed into something quite interesting. As
a matter of fact, both mathematicians and logicians have been work
ing for some time to free logic from its old Aristotelian form, but at
present it is difficult to see just what new form it will take.
42
8. Some Combinatorial Problems
1. A simple example will serve to show what type of problem
we shall discuss. Suppose we have 4 red balls (R), I yellow ball
(jT), and 2 white balls (W). These balls are supposed to be of the
same size and weight and completely indistinguishable except by
their colors. We also suppose that we have two urns, A and B.
Urn A will hold exactly 3 balls, B will hold 4. In how many different
ways can the 7 colored balls" be distributed between the urns A and B?
Since we have a very simple case with only two urns, we need
only consider the balls that are put into A. The remaining 4
balls will then have to be put into B. To answer our question we
shall systematically list all the possibilities. First of all, A might
contain only red balls and B whatever is left over:
1. in A : RRR, in B : RTWW.
It does not matter which particular three red balls are put into A,
since we supposed that the balls could not be distinguished from
each other except by their colors.
Next A might contain just 2 red balls, in which case the third ball
must be either yellow or white. This gives the two distributions
2. in A :RRY, in B :RRWW,
3. in A : RRW, in B : RRTW.
If A contains just 1 red ball, then the other two will clearly
have to be either YW or WW, giving
4. in A:RTW, in B:RRRW,
5. in A:RWW, in B:RRRY.
Finally, if A contains no red balls, it must contain the other three:
6. in A : TWW, in B : RRRR.
Therefore we see that there are 6 possible ways to distribute 4 red,
1 yellow, and 2 white balls between two urns which hold 3 and 4
balls respectively. Clearly it is quite immaterial of what colors the
balls are; we would get the same result with 4 black, 1 green, and
2 blue balls, so it is necessary only to tell the number of balls of
each color, not their colors. In order to state our result a little
more briefly, we will then say: the number of distributions of 4, 1, 2,
balls between urns of contents 3 and 4 is 6. We will also write
the same fact in symbols in the form:
{4, 1, 2 1 3, 4} 7 = 6.
43
SOME COMBINATORIAL PROBLEMS
Here the numbers in front of the vertical bar are the number of
balls of the various colors, and the numbers after the bar are the
number of balls that each urn can hold. The sum of the numbers in
front of the bar must equal the sum of the numbers after the bar, since all
the balls together just fill all the urns. This total number of balls,
which is 7 in this case, has been written as a subscript to the right
of the bracket.
2. It is not necessary to consider just three colors and two urns.
As a general problem we might consider n balls of c different colors
and u urns of total content n. We would then use the symbol
(1) Z = {r, *, I a,b, }
for the number of distributions of n balls, of which r are one color,
s another, etc., among urns which will hold a, b, balls res
pectively. The problem would be to compute the value of from
the numbers n, r, s 9 -, a, b, - -. We will not solve this problem
in such a general form, but rather will restrict ourselves to a number
of examples and important special cases.
The problem does not really require the objects distributed to be
colored balls. For example, we have already used the letters
RRRRTWW to designate the 4 red, 1 yellow, and 2 white balls,
and have distributed these 7 letters between 2 sets A and B instead
of distributing 7 balls between 2 urns A and B.
3. We shall take up a number of examples having nothing to
do with colored balls, but we will try to discover a way of interpreting
each case as a distribution of colored balls. These examples are of
considerable interest and importance and they will serve to show
the significance of the symbol (1).
Example I. In how many ways can n persons be seated at n places?
In order to interpret this in the form of our previous problem, we
note that each person can be distinguished from every other one.
Therefore we can designate each person by a different color, a name, so
to speak. The problem then reduces to : in how many ways can n balls
of n different colors be put in n different places? Each of these n places
corresponds in our old problem to an urn that will hold just one ball.
Using our symbol, the problem reduces to finding the value of
( 2 ) Pn = {1, 1, - - - I 1, 1, - ' }
where there are n 1's in front of the vertical bar, corresponding to
n different colored balls, and n 1's after the bar corresponding to
the n urns, each holding just 1 ball.
44
SOME COMBINATORIAL PROBLEMS
If we think of the n urns or places as being put in a row, then we
are asking in how many ways n balls or persons (or any distinguishable
objects) can be arranged in a row? Such an arrangement is called
a permutation^ so we can say that P n of (2) represents the number of
permutations of n different things.
There still remains the problem of finding the numerical value
of P n if we are told how big n is. We shall find a formula for P n
in terms of n, but shall postpone it until we have taken up a number
of further examples.
4. Example IL In the game of skat 32 different cards are used
and 3 players participate. Each player is dealt 10 cards and the
remaining 2 cards go into the "skat". In how many different ways
can the hands be dealt out? Clearly the number of ways is the
same as the number of distributions of 32 differently colored balls
among 4 urns that will hold 10, 10, 10, 2 balls respectively. There
fore the number of ways the cards can be distributed in a game of
skat is given by
(3) S = {1, 1, - - - | 10, 10, 10, 2} 32 .
Again we will postpone the numerical computation of this symbol.
5. Example IIL The so-called polynomial theorem is another in
teresting example. We shall consider only a special case involving
three variables AT, j>, . The expression
is to be multiplied out. This w-th power represents a product of
n equal factors, each of which is (x + y + ) . If a sum of terms
enclosed in parentheses is to be multiplied by a factor, then each
term inside the parentheses must be multiplied by that factor. If
this rule is applied to all the n parentheses in the n-th power, then
it is seen that the power consists of a sum of products. Each of
these products has n factors, which can be x QT y or . Since the
factors of a product can be taken in any order we can bring all the
x*s together, then the j's, and then the 5 s. Then each product
will be of the form x a y*z? where a, , c, are whole numbers subject
to the restrictions
(4) a + b + c = n y a ^ 0, b ^ 0, c ^ 0.
A factor x can originate in any of the n parentheses and so can factors
y and . Therefore there can be several products x a y b ^ : with the
same values of a, b, c. How often will xy^ arise when we multiply
45
SOME COMBINATORIAL PROBLEMS
out the n-th power? One of the factors x, j>, or z must originate
from each of the n parentheses. We can imagine a row of n urns,
one corresponding to each parenthesis. Then we can put into
each urn the letter that originates in the corresponding parenthesis.
Now we need only count how many ways a elements * V (red balls),
b elements e >" (yellow balls), and c elements "*" (white balls) can
be distributed among n urns each holding just one element. We
shall designate this number by P<$ jC and we now have
(5) P$, c = & b,c | 1, 1, , 1},.
When the power of (x +j + z} n is multiplied out, the term# a jvV
will occur P c times. These like terms can be collected, and the
combined term will then have the coefficient P ( $ tC .
A particular case will help to make this result clearer. Let us
take n = 4 and think of multiplying out the power (x + y + ) 4 .
Terms of the type xy*z c will occur with all values of a, b, c that
are consistent (4) with n = 4. Listing them systematically, we find
the following 15 possibilities:
x*z, xyfi,
x*z 2 , y*z*> .
xy 2 z, xyz 2 ,
Supplying these terms with the coefficients we have found, we get
(x
(6)
This gives only the form of the power. We must still find the
values of P$ fC and, more generally, of P C .
6. Example IV. In how many different ways can k things be
chosen from among n different things? This number is usually
called the number of combinations of n things taken k at a time and
is designated by the symbol C^ n) . The n things are all different, so
we can consider them as differently colored balls, r*= 1, s = !,
The k balls that are chosen can be put in one urn, a = k, and the
remaining ones can be put in another urn, b = n k. The number
we are trying to determine is therefore
(7) CW = {1, 1, | *, - *}.
46 '
SOME COMBINATORIAL PROBLEMS
7. The duality of the distribution symbol. The symbol that we have
used to represent the number of distributions has a very important
property that will help us to compute its value. The numbers in
front of the vertical bar and those following it can be interchanged,
(8) {r,s,--- \ a, b, - -} B = {a, b, - - j r, s, - -} n .
Stated in terms of distributions, this asserts that there are exactly
the same number of ways to distribute r red balls, s white, - - among
urns that will hold a, b, - balls respectively, as there are ways
to distribute a red balls, b white, * among urns that will hold
r, s y - - ' balls respectively. Here, as always, we are supposing that
r + s + --' = a + b + -'- = n.
Expressed in this way, in terms of distributions, the equality (8)
is very easily proved. A simple numerical example will completely
demonstrate the proof. Let us prove the equality
(9) {3, 4 | 1, 1, 5} 7 = {1, 1, 5 j 3, 4} 7 .
The left side represents the number of distributions of 3 red and 4
white balls
R, R, R, W 9 W, W, W,
among the three urns, A and B each holding 1 ball and C holding
the other 5. Let us look at one of the possible distributions, say
\R\ \W\ \RRWWW\
A B C
Now we can just as well write this distribution by listing the balls
in a row and putting under each one the urn into which it goes,
R, W, R, R, W, W, W
A, B, C, C, C, C, C.
This is just a list of 7 pairs of letters. The upper letters R and W
are the names of the colors, the lower letters A, B, C are the names
of the urns. We can interchange the roles of the letters and let
A, B, C be the names of colors, R and W the names of urns. If
we now write the colors in the upper row, the urns in the lower,
and rearrange the pairs according to urns, we have the scheme
R, R, R, W, W, W, W.
Exactly the same pairs appear in (lOb) as in (10a), only each has
been inverted. But (I Ob) can be interpreted as representing a
47
SOME COMBINATORIAL PROBLEMS
distribution of 1 ball of color A, I of B, and 5 of C among the two
urns R and W, of content 3 and 4 respectively. But this is one
of the distributions which are counted by the right side of (9).
Every other distribution counted by the left side of (9) will correspond
to one counted by the right side, in the same way as (lOa) cor
responds to (1Gb). This correspondence clearly works in the other
direction, from (10b) to (lOa), in just the same way. Therefore
there is a complete correspondence, a "duality/' between the two
problems represented by the left and right sides of (9). Since the
two sets of distributions are paired off exactly, they must have the
same number, and hence (9) is a true equality.
In proving that the two sides of (9) are equal, we did not have
to know the numerical value of either side. In this case, however,
it is easy to list all possible distributions systematically and thus
to find
{3, 4 | 1, 1, 5} 7 = {1, 1, 5 | 3, 4} 7 = 4.
This proof of (9) consisted merely in interchanging the names of
the colors and the names of the urns. We do not need to go through
the detailed proof of (8). The same interchange of colors and urns
shows the duality of the two problems and therefore the general
equality (8).
8. The computation of the value of distributions in certain cases. The
computation of the value of {r, s, . . . \ a, b 9 : } for arbitrary
numbers r, s 9 , a, b, is quite troublesome and we shall not
attempt it. The symbols that arose in paragraphs 3 to 6 are 'not
of the most general sort. They all have the peculiarity that either
all the numbers in front of the vertical bar or all the numbers
following the bar (or both) consist entirely of 1's. That is, either
all the balls have different colors or each urn can hold just one
ball. Because of the duality we need only consider the case where
the balls all have different colors, and we may compute
{1, 1, - - -, 1 | a, b, c, -} n .
The subscript n reminds us that there are n balls and that all the
urns together will hold just n balls,
1 + 1 H + l = a + b + c-i = n.
Since we are only considering symbols with 1's in front of the bar,
we don't always need to write them in, and we can use the shorter
notation,
{1, 1, - - -, 1 \a 9 b,c,- -} n = {a, b,c,-- -} n .
48
SOME COMBINATORIAL PROBLEMS
The numbers a, , c, , representing the sizes of the urns, are
subject to no restrictions other than that their total be n. Ob
viously we have
(11) Wn=l
for there is only one way to put all the balls into a single urn. 'If
we replace the single urn JV of content n by two urns, JV^ of content
n 1 and JV' of content 1, then one of the balls must be taken
from JV and put into JV'. The remaining n I balls are put into
JVj. The one ball for JV' can be any one and, since they are all
different, this gives us n possibilities. Therefore we have
{n- !,!) = .
In the same way we prove
(12) a{a>b,c,--} h = {a- 1, 1, *, ; ' '3 1, V^jJ' ^ ' ' >
x z , between
an( j -L - ^ whose squares would yield the same remainder r
2
when divided by p. That is, we would have #f = q l p + r and
% = q^p + r. Subtracting, we find *f *| = (ft q*)P or
(*i *2) (*i + **) = (ft ft)A Since P is a P rime > it: must divide
either ^ x 2 or # x + x 2) but this is impossible, since ^ # 2 and
#! + x% are positive numbers less than p.
Now we take the remainders r 9 increase each by 1 and subtract
from p. This again gives us - - numbers s between and
2i
57
ON WARING'S PROBLEM
p 1, all of which are different. For ^=11 we would obtain
10, 9, 6, 1, 5, 7. At least one of the numbers s must equal a number
of our previous set r, since r uses up - of the p numbers
2
0, 1 5 2, ', P !, leaving over only numbers, while there
2t
are numbers s. For p = 11 we find the numbers 1, 9, 5
in both r and s.
Let ^ = be equal numbers from r and s. R is the remainder
/j _ ^
on dividing some # 2 , fg x ^ - , by p. S is obtained by
2i
_ j
dividing some number jy 2 , fj j> ^ - , increasing the remainder
2t
by 1 and subtracting from p. That is, x 2 = q^p + R, y 1 = q%p + r,
S = p (r + 1). Adding these three equations, we have
#2 + y + S = (qi + q 2 + l)p +R 1. Since R = S 9 we can
write this as x 2 +j 2 + 1 = mp where m = q^ + g 2 + * Also,
since ^ * 5g - - and fgjy fg - - , we have < mp ^
2 2
P - !\ 2 , P - 1\* ,
=
2
< j?? 2 , and therefore < m < j&. This proves theorem 2, since we
have 1 ^ 77Z < p and mp = x 2 +j>* + I 2 + O 2 .
For /? = 1 1 we could take R = S = 5, finding A? = 4, j> = 4,
4 a + 4 2 + I 2 + O 2 = 3 11. However, if we take = S = 1, we
find # = 1, y = 3, I 2 + 3 2 + I 2 + O 2 = 1 11. In this case we
have been able to write p = 11 as a sum of four squares. The
method we have used may not always give us p as a sum of four
squares, however, so we now prove:
Theorem 3. Ifp is a prime number greater than 2 and ifm is the smallest
positive whole number such that mp is a sum of four squares, then m = 1.
We already have m < p from 2. This smallest m cannot be even,
for if it were we would have x% + A| + x\ -\- x\ = mp, an even number.
Then either all the x's would be even, two would be even and two
odd, or all would be odd. In the second possibility, we can suppose
that we have numbered the #'s so that x^ and x 2 are even, while # 3
and Ar 4 are odd. Then in every case (x^ + ^2)* (*i #2)3 (#3 + #4)*
and (#3 # 4 ) are all even numbers. We then have
58
ON WARING S PROBLEM
<
'
and all the numbers whose squares are taken on the left are whole
numbers. That is, p can be written as a sum of four squares,
Zi
and the even 772 was not the smallest possible value, as we had
supposed.
We now know that the smallest m is odd and m < p. To prove
m = 1 3 we suppose m > 1 and again show that it could be reduced.
Since m is odd, we may suppose m ^ 3. Then we have
(4) mp = $ + ^ + 4 + af .
We divide each x k by m and obtain a remainder r fcy ^ r k < m.
If ^ r k 5g - , we write jy*. = r h . If - - ^* r k ^ m I,
2 2
we write y k = r k m. In both cases we have x k = q^m + y k and
m I m 1 . iu
-- ^ j & ^ - . Since jp fc = ^ q%m, we also nave,
2 2
using (4),
(5)
where TZ is a whole number. Furthermore, we have n > 0, since
if n = we would havej^ y% J 3 =j^ 4 = 0. This would mean
that each x is a multiple of m, and hence each # 2 is a multiple of m 2 .
From (4) we would then see that mp is a multiple of m 2 , so j& is
a multiple of TTZ. But this cannot be, since p is prime and l a ^d
hence n < m.
Multiplying (4) and (5), we find
(6) m*np = (4 + 4 + 4 + *J)0? +^ + Jl +^)
= the right side of (2').
The first number whose square appears on the right side of (2') is
59
ON WARING S PROBLEM
+*S?8 +*4&)
where x is a whole number. The second number whose square
appears on the right side of (2') is
where 2 is a ls a whole number. In a similar way, we find that
the third and fourth numbers are mz 3 and m^. Putting these in
(6) 3 we have
Therefore np can be written as a sum of four squares, and we have
already found < n < m. This shows that m > 1 was not the
smallest possible value, as we had supposed. All that is left is
m = 1, and theorem 3 is proved.
For the case p = 11, ^ = 4, * 2 = 4, # 3 = 1, # 4 = 0, m = 3,
we have ^ = 1, jv 2 = 1, j 3 = 1, j 4 = 0; 4 - 1 + 4 - 1 + 1 - 1 +
0-0 = 3^ -fc = 3; 4-1 4-1 + 1-0 0-1 = 3^, <: 2 = 0;
4-1-1.1 + 0-1-1-0=3^, ^=1; 4-0 0-1+4-1-
1 - 1 = 3^ 4 , ^ 4 = 1 and finally 3 2 + O 2 + I 2 + I 2 = 1 - 11, n = 1.
Theorem 4. If p is a prime number it can be written as a sum of four
squares This is hardly more than a restatement of theorem 3.
This is true if p 2, since we have 2 = I 2 + I 2 + O 2 + O 2 . If
p > 2 it is true by theorem 3.
We finally come to Lagrange's Theorem:
Theorem 5. Every whole number n ^ can be written as a sum of four
squares. This is true for n = and n = 1, since we have
= O 2 + O 2 + O 2 + O 2 and 1 = I 2 + O 2 + O 2 + O 2 . It is also
true if n equals a prime number by theorem 4. All that are left
are the composite numbers. If n is composite, we can factor it
into a product of primes, n - A/> 2 /> 3 - - - f t) where A, A,A> ' ' m > Pt
are prime numbers, not necessarily distinct. Now, by theorem 4,
60
ON CLOSED SELF-INTERSECTING CURVES
pi and p% can be written as sums of four squares. Then by theorem
1, the product pp 2 can also be written as the sum of four squares.
Again by theorem 4, p z can be written as a sum of four squares and,
by theorem 1, so can the product AAA- Continuing in this manner,
we finally find that n can be written as a sum of four squares.
10. On Closed Self-Intersecting Curves
1. The curves that we shall discuss in this chapter are of a special
kind. Although they may be quite complicated, they must satisfy
certain conditions. First, they must be traversible in a single
passage. That is, one should be able to draw the whole curve
with a single stroke, starting at a given point and never taking the
pencil from the paper until the curve is completely drawn. Second,
they must be closed. That is, when one draws the curve he should
be able to start at a given point, trace out the whole curve, and return
to the original starting point just as the curve is completed. Finally,
the curves may cross themselves any number of times, but when
they do, they shall pass through such a crossing point only twice.
The examples shown in Figs. 26 and 27 satisfy thes