Cornell University Library
QA 155.B66
Introduction to higher algebra.
Cornell University
Library
The original of this book is in
the Cornell University Library.
There are no known copyright restrictions in
the United States on the use of the text.
http://www.archive.org/details/cu31924002750473
INTRODUCTION TO HIGHER ALGEBRA
THE MACMILLAN COMPANY
NEW YORK · BOSTON · CHICAGO · DALLAS
ATLANTA · SAN FRANCISCO
MACMILLAN AND CO., Limited
LONDON · BOMBAY · CALCUTTA · MADRAS
MELBOURNE
THE MACMILLAN COMPANY
OF CANADA, Limited
INTRODUCTION
TO
HIGHER ALGEBRA
BY
MAXIME BÔCHER
PROFESSOR OF MATHEMATICS IN HARVARD UNIVERSITY
PREPARED FOR PUBLICATION WITH THE COOPERATION OF
E. P. R. DUVAL
INSTRUCTOR IN MATHEMATICS IN THE UNIVERSITY
OF WISCONSIN
New York
THE MACMILLAN COMPANY
Copyright, 1907,
By the MACMILLAN COMPANY.
All rights reserved — no part of this book may be
reproduced in any form without permission in writing
from the publisher, except by a reviewer who wishes
to quote brief passages in connection with a review
written for inclusion in magazine or newspaper.
Set up and electrotyped. Published December, 1907.
PRINTED IN THE UNITED STATES OF AMERICA
PREFACE
An American student approaching the higher parts of mathematics
usually finds himself unfamiliar with most of the main facts
of algebra, to say nothing of their proofs. Thus he has only a
rudimentary knowledge of systems of linear equations, and he knows
next to nothing about the subject of quadratic forms. Students in
this condition, if they receive any algebraic instruction at all, are
usually plunged into the detailed study of some special branch of
algebra, such as the theory of equations or the theory of invariants,
where their lack of real mastery of algebraic principles makes it
almost inevitable that the work done should degenerate to the level
of purely formal manipulations. It is the object of the present
book to introduce the student to higher algebra in such a way that
he shall, on the one hand, learn what is meant by a proof in algebra
and acquaint himself with the proofs of the most fundamental facts,
and, on the other, become familiar with many important results of
algebra which are new to him.
The book being thus intended, not as a compendium, but really,
as its title states, only as an introduction to higher algebra, the
attempt has been made throughout to lay a sufficiently broad foundation
to enable the reader to pursue his further studies intelligently,
rather than to carry any single topic to logical completeness. No
apology seems necessary for the omission of even such important
subjects as Galois's Theory and a systematic treatment of invariants.
A selection being necessary, those subjects have been chosen for
treatment which have proved themselves of greatest importance in
geometry and analysis, as well as in algebra, and the relations of
the algebraic theories to geometry have been emphasized throughout.
At the same time it must be borne in mind that the subject primarily
treated is algebra, not analytic geometry, so that such geometric
information as is given is necessarily of a fragmentary and somewhat
accidental character.
No algebraic knowledge is presupposed beyond a familiarity with
elementary algebra up to and including quadratic equations, and
such a knowledge of determinants and the method of mathematical
induction as may easily be acquired by a freshman in a week or
two. Nevertheless, the book is not intended for wholly immature
readers, but rather for students who have had two or three years'
training in the elements of higher mathematics, particularly in
analytic geometry and the calculus. In fact, a good elementary
knowledge of analytic geometry is indispensable.
The exercises at the ends of the sections form an essential part
of the book, not merely in giving the reader an opportunity to think
for himself on the subjects treated, but also, in many cases, by supplying
him with at least the outlines of important additional theories.
As illustrations of this we may mention Sylvester's Law of Nullity
(page 80), orthogonal transformations (page 154 and page 173), and
the theory of the invariants of the biquadratic binary form (page 260).
On a first reading of Chapters I-VII, it may be found desirable
to omit some or all of sections 10, 11, 18, 19, 20, 25, 27, 34, 35. The
reader may then either take up the subject of quadratic forms
(Chapters VIII-XIII), or, if he prefer, he may pass directly to the
more general questions treated in Chapters XIV-XIX.
The chapters on Elementary Divisors (XX-XXII) form decidedly
the most advanced and special portion of the book. A person
wishing to read them without reading the rest of the book should
first acquaint himself with the contents of sections 19 (omitting
Theorem 1), 21-25, 36, 42, 43.
In a work of this kind, it has not seemed advisable to give many
bibliographical references, nor would an acknowledgement at this
point of the sources from which the material has been taken be
feasible. The work of two mathematicians, however, Kronecker
and Frobenius, has been of such decisive influence on the character
of the book that it is fitting that their names receive special mention
here. The author would also acknowledge his indebtedness
to his colleague, Professor Osgood, for suggestions and criticisms
relating to Chapters XIV-XVI.
This book has grown out of courses of lectures which have been
delivered by the author at Harvard University during the last ten
years. His thanks are due to Mr. Duval, one of his former pupils,
without whose assistance the book would probably never have been
written.
CONTENTS
CHAPTER I
Polynomials and their Most Fundamental Properties
SECTION PAGE
1. Polynomials in One Variable 1
2. Polynomials in More than One Variable 4
3. Geometric Interpretations 8
4. Homogeneous Coordinates 11
5. The Continuity of Polynomials 14
6. The Fundamental Theorem of Algebra 16
CHAPTER II
A Few Properties of Determinants
7. Some Definitions 20
8. Laplace's Development 24
9. The Multiplication Theorem 26
10. Bordered Determinants 28
11. Adjoint Determinants and their Minors 30
CHAPTER III
The Theory of Linear Dependence
12. Definitions and Preliminary Theorems 34
13. The Condition for Linear Dependence of Sets of Constants 36
14. The Linear Dependence of Polynomials 38
15. Geometric Illustrations 39
CHAPTER IV
Linear Equations
16. Non-Homogeneous Linear Equations 43
17. Homogeneous Linear Equations 47
18. Fundamental Systems of Solutions of Homogeneous Linear Equations 49
CHAPTER V
Some Theorems Concerning the Rank of a Matrix
19. General Matrices 54
20. Symmetrical Matrices 56
CHAPTER VI
Linear Transformations and the Combination of Matrices
21. Matrices as Complex Quantities 60
22. The Multiplication of Matrices 62
23. Linear Transformation 66
24. Collineation 68
25. Further Development of the Algebra of Matrices 74
26. Sets, Systems, and Groups 80
27. Isomorphism 83
CHAPTER VII
Invariants. First Principles and Illustrations
28. Absolute Invariants; Geometric, Algebraic, and Arithmetical 88
29. Equivalence 92
30. The Rank of a System of Points or a System of Linear Forms as an
Invariant 94
31. Relative Invariants and Covariants 95
32. Some Theorems Concerning Linear Forms 100
33. Cross-Ratio and Harmonic Division 102
34. Plane-Coordinates and Contragredient Variables 107
35. Line-Coordinates in Space 110
CHAPTER VIII
Bilinear Forms
36. The Algebraic Theory 114
37. A Geometric Application 116
CHAPTER IX
Geometric Introduction to the Study of Quadratic Forms
38. Quadric Surfaces and their Tangent Lines and Planes 118
39. Conjugate Points and Polar Planes 121
40. Classification of Quadric Surfaces by Means of their Rank 123
41. Reduction of the Equation of a Quadric Surface to a Normal Form 124
CHAPTER X
Quadratic Forms
42. The General Quadratic Form and its Polar 127
43. The Matrix and the Discriminant of a Quadratic Form 128
44. Vertices of Quadratic Forms 129
45. Reduction of a Quadratic Form to a Sum of Squares 131
46. A Normal Form, and the Equivalence of Quadratic Forms 134
47. Reducibility 136
48. Integral Rational Invariants of a Quadratic Form 137
49. A Second Method of Reducing a Quadratic Form to a Sum of Squares 139
CHAPTER XI
Real Quadratic Forms
50. The Law of Inertia 144
51. Classification of Real Quadratic Forms 147
52. Definite and Indefinite Forms 150
CHAPTER XII
The System of a Quadratic Form and One or More Linear
Forms
53. Relations of Planes and Lines to a Quadric Surface 155
54. The Adjoint Quadratic Form and Other Invariants 159
55. The Rank of the Adjoint Form 161
CHAPTER XIII
Pairs of Quadratic Forms
56. Pairs of Conics 163
57. Invariants of a Pair of Quadratic Forms. Their λ-Equation 165
58. Reduction to Normal Form when the λ-Equation has no Multiple Roots 167
59. Reduction to Normal Form when ψ is Definite and Non-Singular 170
CHAPTER XIV
Some Properties of Polynomials in General
60. Factors and Reducibility 174
61. The Irreducibility of the General Determinant and of the Symmetrical
Determinant 176
62. Corresponding Homogeneous and Non-Homogeneous Polynomials 178
63. Division of Polynomials 180
64. A Special Transformation of a Polynomial 184
CHAPTER XV
Factors and Common Factors of Polynomials in One Variable
and of Binary Forms
65. Fundamental Theorems on the Factoring of Polynomials in One Variable
and of Binary Forms 187
66. The Greatest Common Divisor of Positive Integers 188
67. The Greatest Common Divisor of Two Polynomials in One Variable 191
68. The Resultant of Two Polynomials in One Variable 195
69. The Greatest Common Divisor in Determinant Form 197
70. Common Roots of Equations. Elimination 198
71. The Cases $a_0 = 0$ and $b_0 = 0$ 200
72. The Resultant of Two Binary Forms 201
CHAPTER XVI
Factors of Polynomials in Two or More Variables
73. Factors Involving only One Variable of Polynomials in Two Variables . 203
74. The Algorithm of the Greatest Common Divisor for Polynomials in Two
Variables 206
75. Factors of Polynomials in Two Variables 208
76. Factors of Polynomials in Three or More Variables 212
CHAPTER XVII
General Theorems on Integral Rational Invariants
77. The Invariance of the Factors of Invariants 218
78. A More General Method of Approach to the Subject of Relative Invariants 220
79. The Isobaric Character of Invariants and Covariants 222
80. Geometric Properties and the Principle of Homogeneity 226
81. Homogeneous Invariants 230
82. Resultants and Discriminants of Binary Forms 236
CHAPTER XVIII
Symmetric Polynomials
83. Fundamental Conceptions. Σ and S Functions 240
84. Elementary Symmetric Functions 242
85. The Weights and Degrees of Symmetric Polynomials 245
86. The Resultant and the Discriminant of Two Polynomials in One Variable 248
CHAPTER XIX
Polynomials Symmetric in Pairs of Variables
87. Fundamental Conceptions. Σ and S Functions 252
88. Elementary Symmetric Functions of Pairs of Variables 253
89. Binary Symmetric Functions 255
90. Resultants and Discriminants of Binary Forms 257
CHAPTER XX
Elementary Divisors and the Equivalence of λ-Matrices
91. λ-Matrices and their Elementary Transformations 262
92. Invariant Factors and Elementary Divisors 269
93. The Practical Determination of Invariant Factors and Elementary
Divisors 272
94. A Second Definition of the Equivalence of λ-Matrices 274
95. Multiplication and Division of λ-Matrices 277
CHAPTER XXI
The Equivalence and Classification of Pairs of Bilinear
Forms and of Collineations
96. The Equivalence of Pairs of Matrices 279
97. The Equivalence of Pairs of Bilinear Forms 283
98. The Equivalence of Collineations 284
99. Classification of Pairs of Bilinear Forms 287
100. Classification of Collineations 292
CHAPTER XXII
The Equivalence and Classification of Pairs of Quadratic
Forms
101. Two Theorems in the Theory of Matrices 296
102. Symmetric Matrices 299
103. The Equivalence of Pairs of Quadratic Forms 302
104. Classification of Pairs of Quadratic Forms 305
105. Pairs of Quadratic Equations, and Pencils of Forms or Equations . . 307
106. Conclusion 31?
Index • 31!
INTRODUCTION TO HIGHER ALGEBRA
CHAPTER I
POLYNOMIALS AND THEIR MOST FUNDAMENTAL
PROPERTIES
1. Polynomials in One Variable. By an integral rational function
of $x$, or, as we shall say for brevity, a polynomial in $x$, is meant
a function of $x$ determined by an expression of the form
(1) $c_1 x^{\alpha_1} + c_2 x^{\alpha_2} + \cdots + c_k x^{\alpha_k}$,
where the $\alpha$'s are integers positive or zero, while the $c$'s are any
constants, real or imaginary. We may without loss of generality
assume that no two of the $\alpha$'s are equal. This being the case, the
expressions $c_i x^{\alpha_i}$ are called the terms of the polynomial, $c_i$ is called
the coefficient of this term, and $\alpha_i$ is called its degree. The highest
degree of any term whose coefficient is not zero is called the degree
of the polynomial.
It should be noticed that the conceptions just defined — terms,
coefficients, degree — apply not to the polynomial itself, but to the
particular expression (1) which we use to determine the polynomial,
and it would be quite conceivable that one and the same function of
$x$ might be given by either one of two wholly different expressions
of the form (1). We shall presently see (cf. Theorem 5 below)
that this cannot be the case except for the obvious fact that we
may insert in or remove from (1) any terms we please with zero
coefficients.
By arranging the terms in (1) in the order of decreasing $\alpha$'s and
supplying, if necessary, certain missing terms with zero coefficients,
we may write the polynomial in the normal form
(2) $a_0 x^n + a_1 x^{n-1} + \cdots + a_{n-1} x + a_n$.
It should, however, constantly be borne in mind that a polynomial
in this form is not necessarily of the $n$th degree; but will be of the
$n$th degree when and only when $a_0 \neq 0$.
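The passage from an expression of the form (1) to the normal form (2) can be illustrated with a short Python sketch, which is not part of the original text. The function names `normal_form` and `degree` are assumptions made here for illustration, and the polynomial with all coefficients zero is reported as having no degree by returning `None`.

```python
def normal_form(terms):
    """Turn [(c_1, alpha_1), ..., (c_k, alpha_k)] into [a_0, a_1, ..., a_n],
    the coefficients of a_0*x^n + a_1*x^(n-1) + ... + a_n."""
    if not terms:
        return [0]
    n = max(exponent for _, exponent in terms)
    coeffs = [0] * (n + 1)
    for c, exponent in terms:
        # Terms with equal exponents are merged; missing terms keep coefficient 0.
        coeffs[n - exponent] += c
    return coeffs


def degree(coeffs):
    """Degree of a_0*x^n + ... + a_n, or None if every coefficient is zero."""
    for i, a in enumerate(coeffs):
        if a != 0:
            return len(coeffs) - 1 - i
    return None


# 2x^5 - 2x^5 + x^2 + 3 is written here with n = 5, yet its degree is 2,
# because the leading coefficient a_0 turns out to be zero.
example = normal_form([(3, 0), (2, 5), (-2, 5), (1, 2)])
print(example, degree(example))
```

The example shows why a polynomial written in the form (2) need not be of the $n$th degree.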
Definition. Two polynomials, $f_1(x)$ and $f_2(x)$, are said to be
identically equal ($f_1 \equiv f_2$) if they are equal for all values of $x$. A
polynomial $f(x)$ is said to vanish identically ($f \equiv 0$) if it vanishes for
all values of $x$.
We learn in elementary algebra how to add, subtract, and multiply*
polynomials; that is, when two polynomials $f_1(x)$ and $f_2(x)$ are
given, to form new polynomials equal to the sum, difference, and
product of these two.
Theorem 1. If the polynomial
$f(x) = a_0 x^n + a_1 x^{n-1} + \cdots + a_n$
vanishes when $x = \alpha$, there exists another polynomial
$\phi_1(x) = a_0 x^{n-1} + a_1' x^{n-2} + \cdots + a_{n-1}'$,
such that $f(x) \equiv (x - \alpha)\phi_1(x)$.
For since by hypothesis $f(\alpha) = 0$, we have
$f(x) = f(x) - f(\alpha) = a_0(x^n - \alpha^n) + a_1(x^{n-1} - \alpha^{n-1}) + \cdots + a_{n-1}(x - \alpha)$.
Now by the rule of elementary algebra for multiplying together
two polynomials we have
$x^k - \alpha^k = (x - \alpha)(x^{k-1} + \alpha x^{k-2} + \cdots + \alpha^{k-1})$.
Hence
$f(x) = (x - \alpha)[a_0(x^{n-1} + \alpha x^{n-2} + \cdots + \alpha^{n-1}) + a_1(x^{n-2} + \alpha x^{n-3} + \cdots + \alpha^{n-2}) + \cdots + a_{n-1}]$.
If we take as $\phi_1(x)$ the polynomial in brackets, our theorem is
proved.
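The coefficients of the quotient polynomial in Theorem 1 can also be computed numerically. The following Python sketch is not part of the original text. It uses synthetic division (Horner's scheme), which is a different route from the expansion used in the proof but produces the same coefficients. The name `divide_by_linear` is an assumption made for illustration.

```python
def divide_by_linear(coeffs, alpha):
    """Divide a_0*x^n + ... + a_n (given as [a_0, ..., a_n]) by (x - alpha).

    Returns (quotient, remainder), where the quotient is a coefficient list
    in the same descending order and the remainder equals f(alpha).
    """
    running = []
    value = 0
    for a in coeffs:
        # Horner step: multiply the running value by alpha and add the next a_i.
        value = value * alpha + a
        running.append(value)
    return running[:-1], running[-1]


# f(x) = x^3 - 6x^2 + 11x - 6 vanishes at x = 2, so the remainder is 0
# and f(x) = (x - 2)(x^2 - 4x + 3).
quotient, remainder = divide_by_linear([1, -6, 11, -6], 2)
print(quotient, remainder)
```

A zero remainder corresponds to the hypothesis that the polynomial vanishes at the chosen value; a nonzero remainder means the factorization of the theorem does not apply.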
Suppose now that $\beta$ is another value of $x$ distinct from $\alpha$ for which
$f(x)$ is zero. Then
$f(\beta) = (\beta - \alpha)\phi_1(\beta) = 0$;
* The question of division is somewhat more complicated and will be considered
in § 63.
and since $\beta - \alpha \neq 0$, $\phi_1(\beta) = 0$. We can therefore apply the theorem
just proved to the polynomial $\phi_1(x)$, thus getting a new
polynomial
$\phi_2(x) = a_0 x^{n-2} + a_1'' x^{n-3} + \cdots + a_{n-2}''$,
such that $\phi_1(x) \equiv (x - \beta)\phi_2(x)$,
and therefore $f(x) \equiv (x - \alpha)(x - \beta)\phi_2(x)$.
Proceeding in this way, we get the following general result:
Theorem 2. If $\alpha_1, \alpha_2, \dots, \alpha_k$ are $k$ distinct constants, and if
$f(x) = a_0 x^n + a_1 x^{n-1} + \cdots + a_n$ $(n \geq k)$,
and $f(\alpha_1) = f(\alpha_2) = \cdots = f(\alpha_k) = 0$,
then $f(x) \equiv (x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_k)\phi(x)$,
where $\phi(x) = a_0 x^{n-k} + b_1 x^{n-k-1} + \cdots + b_{n-k}$.
Applying this theorem in particular to the case $n = k$, we see that
if the polynomial $f(x)$ vanishes for $n$ distinct values $\alpha_1, \alpha_2, \dots, \alpha_n$ of $x$,
$f(x) \equiv a_0(x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_n)$.
Accordingly, if $a_0 \neq 0$, there can be no value of $x$ other than $\alpha_1, \dots, \alpha_n$
for which $f(x) = 0$. We have thus proved
Theorem 3. A polynomial of the nth degree in x cannot vanish
for more than n distinct values of x.
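The argument leading to this theorem can be traced on a concrete polynomial. The Python sketch below is not part of the original text: it removes one linear factor per known root by synthetic division, and when the number of distinct roots equals the degree, only the constant $a_0$ remains. The names `divide_by_linear` and `strip_roots` are assumptions made for illustration.

```python
def divide_by_linear(coeffs, alpha):
    """Synthetic division of [a_0, ..., a_n] by (x - alpha): (quotient, remainder)."""
    running = []
    value = 0
    for a in coeffs:
        value = value * alpha + a
        running.append(value)
    return running[:-1], running[-1]


def strip_roots(coeffs, roots):
    """Divide out (x - r) for each r in roots, checking that each r is a root."""
    quotient = list(coeffs)
    for r in roots:
        quotient, remainder = divide_by_linear(quotient, r)
        if remainder != 0:
            raise ValueError(f"{r} is not a root of the current quotient")
    return quotient


# f(x) = 2x^3 - 12x^2 + 22x - 12 vanishes at 1, 2, 3, so
# f(x) = 2(x - 1)(x - 2)(x - 3) and the final quotient is the constant a_0 = 2.
print(strip_roots([2, -12, 22, -12], [1, 2, 3]))
```

Since the final quotient is a nonzero constant, no further value of $x$ can make the polynomial vanish.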
Since the only polynomials which have no degree are those all of
whose coefficients are zero, and since such polynomials obviously
vanish identically, we get the fundamental result:
Theorem 4. A necessary and sufficient condition that a polynomial
in $x$ vanish identically is that all its coefficients be zero.
Since two polynomials in $x$ are identically equal when and only
when their difference vanishes identically, we have
Theorem 5. A necessary and sufficient condition that two polynomials
in $x$ be identically equal is that they have the same coefficients.
This theorem shows, as was said above, that the terms, coefficients,
and degree of a polynomial depend merely on the polynomial itself,
not on the special way in which it is expressed.
2. Polynomials in More than One Variable. A function of $(x, y)$
is called a polynomial if it is given by an expression of the form
$c_1 x^{\alpha_1} y^{\beta_1} + c_2 x^{\alpha_2} y^{\beta_2} + \cdots + c_k x^{\alpha_k} y^{\beta_k}$,
where the $\alpha$'s and $\beta$'s are integers positive or zero.
More generally, a function of $(x_1, x_2, \dots, x_n)$ is called a polynomial
if it is determined by an expression of the form
(1) $c_1 x_1^{\alpha_1} x_2^{\beta_1} \cdots x_n^{\nu_1} + c_2 x_1^{\alpha_2} x_2^{\beta_2} \cdots x_n^{\nu_2} + \cdots + c_k x_1^{\alpha_k} x_2^{\beta_k} \cdots x_n^{\nu_k}$,
where the $\alpha$'s, $\beta$'s, $\dots$, $\nu$'s are integers positive or zero.
Here we may assume without loss of generality that in no two
terms are the exponents of the various $x$'s the same; that is, that if
$\alpha_i = \alpha_j$, $\beta_i = \beta_j$, $\dots$, $\mu_i = \mu_j$,
then $\nu_i \neq \nu_j$.
This assumption being made, $c_i x_1^{\alpha_i} x_2^{\beta_i} \cdots x_n^{\nu_i}$ is called a term of the
polynomial, $c_i$ its coefficient, $\alpha_i$ the degree of the term in $x_1$, $\beta_i$ in $x_2$,
etc., and $\alpha_i + \beta_i + \cdots + \nu_i$ the total degree, or simply the degree, of
the term. The highest degree in $x_i$ of any term in the polynomial
whose coefficient is not zero is called the degree of the polynomial
in $x_i$, and the highest total degree of any term whose coefficient is
not zero is called the degree of the polynomial.
Here, as in § 1, the conceptions just defined apply for the present
not to the function itself but to the special method of representing
it by an expression of the form (1). We shall see presently, however,
that this method is unique.
Before going farther, we note explicitly that according to the
definition we have given, a polynomial all of whose coefficients are zero
has no degree.
When we speak of a polynomial in $n$ variables, we do not necessarily
mean that all $n$ variables are actually present. One or more
of them may have the exponent zero in every term, and hence not
appear at all. Thus a polynomial in one variable, or even a constant,
may be regarded as a special case of a polynomial in any
larger number of variables.
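These definitions can be made concrete with a small Python sketch, which is not part of the original text. A polynomial in $n$ variables is stored as a dictionary mapping each tuple of exponents to its coefficient; the names `degree_in` and `total_degree` are assumptions made for illustration, and a polynomial all of whose coefficients are zero is reported as having no degree by returning `None`.

```python
def nonzero_terms(poly):
    """Keep only the terms whose coefficient is not zero."""
    return {exps: c for exps, c in poly.items() if c != 0}


def degree_in(poly, i):
    """Degree of the polynomial in the i-th variable (0-based), or None."""
    terms = nonzero_terms(poly)
    return max(exps[i] for exps in terms) if terms else None


def total_degree(poly):
    """Highest total degree of any term with nonzero coefficient, or None."""
    terms = nonzero_terms(poly)
    return max(sum(exps) for exps in terms) if terms else None


# 3*x^2*y + 5*y^3*z^2 - x, plus a term with coefficient zero that is ignored.
f = {(2, 1, 0): 3, (0, 3, 2): 5, (1, 0, 0): -1, (4, 4, 4): 0}
print(degree_in(f, 0), degree_in(f, 1), total_degree(f))
print(total_degree({(1, 1, 0): 0}))
```

Terms with zero coefficient are discarded before any degree is computed, in agreement with the definitions above.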
A polynomial all of whose terms are of the same degree is said
to be homogeneous. Such polynomials we will speak of as forms,*
distinguishing between binary, ternary, quaternary, and in general
$n$-ary forms according to the number of variables involved, binary
forms involving two, ternary three, etc.
* There is diversity of usage here. Some writers, following Kronecker, apply the
term form to all polynomials. On the other hand, homogeneous polynomials are often
spoken of as quantics by English writers.
Another method of classifying forms is according to their
degree. We speak here of linear forms, quadratic forms, cubic
forms, etc., according as the degree is 1, 2, 3, etc. We will, however,
agree that a polynomial all of whose coefficients are zero may
also be spoken of indifferently as a linear form, quadratic form,
cubic form, etc., in spite of the fact that it has no degree.
If all the coefficients of a polynomial are real, it is called a real
polynomial even though, in the course of our work, we attribute
imaginary values to the variables.
It is frequently convenient to have a polynomial in more than one
variable arranged according to the descending powers of some one
of the variables. Thus a normal form in which we may write a
polynomial in $n$ variables is
$\phi_0(x_2, \dots, x_n) x_1^m + \phi_1(x_2, \dots, x_n) x_1^{m-1} + \cdots + \phi_m(x_2, \dots, x_n)$,
the $\phi$'s being polynomials in the $n - 1$ variables $(x_2, \dots, x_n)$.
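The arrangement according to descending powers of one variable can be sketched in Python; this example is not part of the original text. The polynomial is again stored as a dictionary from exponent tuples to coefficients, and the name `by_powers_of_first` is an assumption made for illustration. Each power of the first variable is paired with a polynomial in the remaining variables.

```python
def by_powers_of_first(poly):
    """Group the terms of a polynomial by the exponent of its first variable.

    Returns {power: polynomial in the remaining variables}, with the powers
    listed in descending order.
    """
    groups = {}
    for exps, c in poly.items():
        if c == 0:
            continue
        # exps[0] is the exponent of the first variable; exps[1:] belongs
        # to the coefficient polynomial in the remaining variables.
        groups.setdefault(exps[0], {})[exps[1:]] = c
    return dict(sorted(groups.items(), key=lambda item: item[0], reverse=True))


# 3*x^2*y + x^2 - y^3 = (3y + 1)*x^2 + (-y^3)*x^0
f = {(2, 1): 3, (2, 0): 1, (0, 3): -1}
print(by_powers_of_first(f))
```

Powers of the first variable that do not occur are simply absent from the result; they correspond to coefficient polynomials that vanish identically.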
We learn in elementary algebra how to add, subtract, and multiply
polynomials, getting as the result new polynomials.
Definition. Two polynomials in any number of variables are
said to be identically equal if they are equal for all values of the variables.
A polynomial is said to vanish identically if it vanishes for
all values of the variables.
Theorem 1. A necessary and sufficient condition that a polynomial
in any number of variables vanish identically is that all its coefficients
be zero.
That this is a sufficient condition is at once obvious. To prove
that it is a necessary condition we use the method of mathematical
induction. Since we know that the theorem is true in the case of
one variable (Theorem 4, § 1), the theorem will be completely proved
if we can show that if it is true for a certain number $n - 1$ of variables,
it is true for $n$ variables.
Suppose, then, that
$f(x_1, \dots, x_n) = \phi_0(x_2, \dots, x_n) x_1^m + \phi_1(x_2, \dots, x_n) x_1^{m-1} + \cdots + \phi_m(x_2, \dots, x_n)$
vanishes identically. If we assign to $(x_2, \dots, x_n)$ any fixed values
$(x_2', \dots, x_n')$, $f$ becomes a polynomial in $x_1$ alone, which, by hypothesis,
vanishes for all values of $x_1$. Hence its coefficients must, by Theorem
4, § 1, all be zero:
$\phi_i(x_2', \dots, x_n') = 0$ $(i = 0, 1, \dots, m)$.
That is, the polynomials $\phi_0, \phi_1, \dots, \phi_m$ vanish for all values of the
variables, since $(x_2', \dots, x_n')$ was any set of values. Accordingly, by
the assumption we have made that our theorem is true for polynomials
in $n - 1$ variables, all the coefficients of all the polynomials
$\phi_0, \phi_1, \dots, \phi_m$ are zero. These, however, are simply the coefficients
of $f$. Thus our theorem is proved.
Since two polynomials are identically equal when and only
when their difference is identically zero, we infer now at once the
further theorem :
Theorem 2. A necessary and sufficient condition that two polynomials
be identically equal is that the coefficients of their corresponding
terms be equal.
We come next to
Theorem 3. If $f_1$ and $f_2$ are polynomials in any number of variables
of degrees $m_1$ and $m_2$ respectively, the product $f_1 f_2$ will be of degree
$m_1 + m_2$.
This theorem is obviously true in the case of polynomials in one
variable. If, then, assuming it to be true for polynomials in $n - 1$
variables we can prove it to be true for polynomials in $n$ variables,
the proof of our theorem by the method of mathematical induction
will be complete.
Let us look first at the special case in which both polynomials
are homogeneous. Here every term we get by multiplying them
together by the method of elementary algebra is of degree $m_1 + m_2$.
Our theorem will therefore be proved if we can show that there is at
least one term in the product whose coefficient is not zero. For
this purpose, let us arrange the two polynomials $f_1$ and $f_2$ according
to descending powers of $x_1$,
$f_1(x_1, \dots, x_n) = \phi_0'(x_2, \dots, x_n) x_1^{k_1} + \phi_1'(x_2, \dots, x_n) x_1^{k_1 - 1} + \cdots$,
$f_2(x_1, \dots, x_n) = \phi_0''(x_2, \dots, x_n) x_1^{k_2} + \phi_1''(x_2, \dots, x_n) x_1^{k_2 - 1} + \cdots$.
Here we may assume that neither $\phi_0'$ nor $\phi_0''$ vanishes identically.
Since $f_1$ and $f_2$ are homogeneous, $\phi_0'$ and $\phi_0''$ will also be homogeneous
of degrees $m_1 - k_1$ and $m_2 - k_2$ respectively. In the product $f_1 f_2$ the
terms of highest degree in $x_1$ will be those in the product
$\phi_0'(x_2, \dots, x_n)\, \phi_0''(x_2, \dots, x_n)\, x_1^{k_1 + k_2}$,
and since we assume our theorem to hold for polynomials in $n - 1$
variables, $\phi_0' \phi_0''$ will be a polynomial of degree $m_1 + m_2 - k_1 - k_2$.
Any term in this product whose coefficient is not zero gives us when
multiplied by $x_1^{k_1 + k_2}$ a term of the product $f_1 f_2$ of degree $m_1 + m_2$
whose coefficient is not zero. Thus our theorem is proved for the
case of homogeneous polynomials.
Let us now, in the general case, write $f_1$ and $f_2$ in the forms
$f_1(x_1, \dots, x_n) = \phi_{m_1}'(x_1, \dots, x_n) + \phi_{m_1 - 1}'(x_1, \dots, x_n) + \cdots$,
$f_2(x_1, \dots, x_n) = \phi_{m_2}''(x_1, \dots, x_n) + \phi_{m_2 - 1}''(x_1, \dots, x_n) + \cdots$,
where $\phi_i'$ and $\phi_j''$ are homogeneous polynomials which are either of
degrees $i$ and $j$ respectively, or which vanish identically. Since,
by hypothesis, $f_1$ and $f_2$ are of degrees $m_1$ and $m_2$ respectively,
$\phi_{m_1}'$ and $\phi_{m_2}''$ will not vanish identically, but will be of degrees
$m_1$ and $m_2$.
The terms of highest degree in the product $f_1 f_2$ will therefore be
the terms of the product $\phi_{m_1}' \phi_{m_2}''$, and this being a product of homogeneous
polynomials comes under the case just treated and is therefore
of degree $m_1 + m_2$. The same is therefore true of the product
$f_1 f_2$, and our theorem is proved.
By a successive application of this theorem we infer
Corollary. If $k$ polynomials are of degrees $m_1, m_2, \dots, m_k$ respectively,
their product is of degree $m_1 + m_2 + \cdots + m_k$.
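The additivity of degrees under multiplication can be checked on examples with the following Python sketch, which is not part of the original text. Polynomials are stored as dictionaries from exponent tuples to coefficients, and the names `multiply` and `total_degree` are assumptions made for illustration. Such a check illustrates the theorem on particular cases and is of course no substitute for the proof.

```python
from collections import defaultdict


def multiply(f, g):
    """Product of two polynomials given as {exponent tuple: coefficient}."""
    product = defaultdict(int)
    for exps_f, c_f in f.items():
        for exps_g, c_g in g.items():
            # Exponents add when terms are multiplied; like terms are collected.
            exps = tuple(a + b for a, b in zip(exps_f, exps_g))
            product[exps] += c_f * c_g
    return {exps: c for exps, c in product.items() if c != 0}


def total_degree(poly):
    """Highest total degree of a term with nonzero coefficient, or None."""
    degrees = [sum(exps) for exps, c in poly.items() if c != 0]
    return max(degrees) if degrees else None


f = {(2, 0): 1, (0, 1): 1}    # x^2 + y, of degree 2
g = {(1, 1): 1, (0, 0): -1}   # x*y - 1, of degree 2
print(total_degree(multiply(f, g)))

# Some terms may cancel, as in (x + y)(x - y) = x^2 - y^2,
# but the terms of highest degree never cancel entirely.
print(multiply({(1, 0): 1, (0, 1): 1}, {(1, 0): 1, (0, 1): -1}))
```

In the second example the terms in $xy$ cancel, yet the product is still of degree $1 + 1 = 2$.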
We mention further, on account of their great importance, the
two rather obvious results :
Theorem 4. If the product of two or more polynomials is identically
zero, at least one of the factors must be identically zero.
For if none of them were identically zero, they would all have
definite degrees, and therefore their product would, by Theorem 3,
have a definite degree, and would therefore not vanish identically.
It is from this theorem that we draw our justification for cancelling
out from an identity a factor which we know to be not identically
zero.
Theorem 5. If $f(x_1, \dots, x_n)$ is a polynomial which is not identically
zero, and if $\phi(x_1, \dots, x_n)$ vanishes at all points where $f$ does not vanish,
then $\phi$ vanishes identically.
This follows from Theorem 4 when we notice that $f\phi \equiv 0$.
EXERCISES
1. If $f$ and $\phi$ are polynomials in any number of variables, what can be inferred
from the identity $f^2 \equiv \phi^2$ concerning the relation between the polynomials $f$ and $\phi$?
2. If $f_1$ and $f_2$ are polynomials in $(x_1, \dots, x_n)$ which are of degrees $m_1$ and $m_2$
respectively in $x_1$, prove that their product is of degree $m_1 + m_2$ in $x_1$.
3. Geometric Interpretations. In dealing with functions of a
single real variable, the different values which the variable may
take on may be represented geometrically by the points of a line ;
it being understood that when we speak of a point x we mean the
point which is situated on the line at a distance of x units (to the
right or left according as x is positive or negative) from a certain
fixed origin $O$ on the line. Similarly, in the case of functions of
two real variables, the sets of values of the variables may be pictured
geometrically by the points of a plane, and in the case of three real
variables, by the points of space ; the set of values represented by a
point being, in each case, the rectangular coordinates of that point.
When we come to functions of four or more variables, however, this
geometric representation is impossible.
The complex variable $x = \xi + \eta i$ depends on the two independent
real variables $\xi$ and $\eta$ in such a way that to every pair of real values
$(\xi, \eta)$ there corresponds one and only one value of $x$. The different
values which a single complex variable may take on may, therefore,
be represented by the points of a plane in which $(\xi, \eta)$ are used as
cartesian coordinates. In dealing with functions of more than one
complex variable, however, this geometric representation is impossible,
since even two complex variables $x = \xi + \eta i$, $y = \xi_1 + \eta_1 i$ are
equivalent to four real variables $(\xi, \eta, \xi_1, \eta_1)$.
By the neighborhood of a point $x = a$ we mean that part of the
line between the points $x = a - \alpha$ and $x = a + \alpha$ ($\alpha$ being an arbitrary
positive constant, large or small), or what is the same thing, all
points whose coordinates $x$ satisfy the inequality $|x - a| < \alpha$.*
* We use the symbol $|Z|$ to denote the absolute value of $Z$, i.e. the numerical
value of $Z$ if $Z$ is real, the modulus of $Z$ if $Z$ is imaginary.
Similarly, by the neighborhood of a point $(a, b)$ in a plane, we
shall mean all points whose coordinates $(x, y)$ satisfy the inequalities
$|x - a| < \alpha$, $|y - b| < \beta$,
where $\alpha$ and $\beta$ are positive constants. This neighborhood thus consists
of the interior of a rectangle of which $(a, b)$ is the center and
whose sides are parallel to the coordinate axes.
By the neighborhood of a point $(a, b, c)$ in space we mean all
points whose coordinates $(x, y, z)$ satisfy the inequalities
$|x - a| < \alpha$, $|y - b| < \beta$, $|z - c| < \gamma$.
In all these cases it will be noticed that the neighborhood
may be large or small according to the choice of the constants
$\alpha$, $\beta$, $\gamma$.
If we are dealing with a single complex variable $x = \xi + \eta i$, we
understand by the neighborhood of a point $a$ all points in the plane
of complex quantities whose complex coordinate $x$ satisfies the inequality
$|x - a| < \alpha$, $\alpha$ being as before a real positive constant. Since
$|x - a|$ is equal to the distance between $x$ and $a$, the neighborhood of
$a$ now consists of the interior of a circle of radius $\alpha$ described about
$a$ as center.
It is found convenient to extend the geometric terminology
we have here introduced to the case of any number of real or
complex variables. Thus if we are dealing with $n$ independent
variables $(x_1, x_2, \dots, x_n)$, we speak of any particular set of values
of these variables as a point in space of n dimensions. Here
we have to distinguish between real points, that is sets of values
of the x's which are all real, and imaginary points in which
this is not the case. In using these terms we do not propose
even to raise the question whether in any geometric sense there
is such a thing as space of more than three dimensions. We
merely use these terms in a wholly conventional algebraic
sense because on the one hand they have the advantage of
conciseness over the ordinary algebraic terms, and on the other
hand, by calling up in our minds the geometric pictures of three
dimensions or less, this terminology is often suggestive of new
relations which might otherwise not present themselves to us so
readily.
By the neighborhood of the point $(a_1, a_2, \dots, a_n)$ we understand all
points which satisfy the inequalities
$|x_1 - a_1| < \alpha_1$, $|x_2 - a_2| < \alpha_2$, $\dots$, $|x_n - a_n| < \alpha_n$,
where $\alpha_1, \alpha_2, \dots, \alpha_n$ are real positive constants.
If, in particular, $(a_1, a_2, \dots, a_n)$ is a real point, we may speak of
the real neighborhood of this point, meaning thereby all real points
$(x_1, x_2, \dots, x_n)$ which satisfy the above inequalities.
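The definition of a neighborhood amounts to one inequality per coordinate, which can be tested directly. The Python sketch below is not part of the original text; the name `in_neighborhood` and the argument names are assumptions made for illustration. Python's built-in `abs` returns the numerical value of a real number and the modulus of a complex one, so the same test serves for real and imaginary points.

```python
def in_neighborhood(point, center, radii):
    """True if |x_i - a_i| < alpha_i holds for every coordinate.

    point  -- the values (x_1, ..., x_n), real or complex
    center -- the values (a_1, ..., a_n)
    radii  -- the real positive constants (alpha_1, ..., alpha_n)
    """
    return all(abs(x - a) < alpha for x, a, alpha in zip(point, center, radii))


# A real point inside a rectangular neighborhood of (1, 3).
print(in_neighborhood((1.2, 2.9), (1, 3), (0.5, 0.2)))

# A single complex variable: the neighborhood is the interior of a circle,
# so a point on the circumference is excluded.
print(in_neighborhood((1 + 0.5j,), (1,), (1,)))
print(in_neighborhood((1 + 1j,), (1,), (1,)))
```

The strict inequality means that points on the boundary of the rectangle or circle do not belong to the neighborhood.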
As an illustration of the use to which the conception of the
neighborhood of a point can be put in algebra, we will prove the
following important theorem :
Theorem 1. A necessary and sufficient condition that a polynomial
$f(x_1, \dots, x_n)$ vanish identically is that it vanish throughout the
neighborhood of a point $(a_1, \dots, a_n)$.
That this is a necessary condition is obvious. To prove that it
is sufficient we begin with the case n = 1.
Suppose then that $f(x)$ vanishes throughout a certain neighborhood
of the point $x = a$. If $f(x)$ did not vanish identically, it would
be of some definite degree, say $k$, and therefore could not vanish at
more than $k$ points (cf. Theorem 3, § 1). This, however, is not the
case, since it vanishes at an infinite number of points, namely all
points in the neighborhood of $x = a$. Thus our theorem is proved
in the case $n = 1$.
Turning now to the case $n = 2$, let
$f(x, y) = \phi_0(y) x^k + \phi_1(y) x^{k-1} + \cdots + \phi_k(y)$
be a polynomial which vanishes throughout a certain neighborhood
of the point $(a, b)$, say when
$|x - a| < \alpha$, $|y - b| < \beta$.
Let $y_0$ be any constant satisfying the inequality
$|y_0 - b| < \beta$.
Then $f(x, y_0)$ is a polynomial in $x$ alone which vanishes whenever
$|x - a| < \alpha$. Hence, by the case $n = 1$ of our theorem, $f(x, y_0) \equiv 0$.
That is,
$\phi_0(y_0) = \phi_1(y_0) = \cdots = \phi_k(y_0) = 0$.
Thus all these polynomials $\phi$ vanish at every point $y_0$ in the neighborhood
of $y = b$, and therefore, by the case $n = 1$ of our theorem,
they are all identically zero. From this it follows that for every
value of $x$, $f(x, y)$ vanishes for all values of $y$, that is $f \equiv 0$, and
our theorem is proved.
We leave to the reader the obvious extension of this method of
proof to the case of n variables by the use of mathematical induction.
From the theorem just proved we can infer at once the following:
Theorem 2. A necessary and sufficient condition that two polynomials
in the variables $(x_1, \dots x_n)$ be identically equal is that they be
equal throughout the neighborhood of a point $(a_1, \dots a_n)$.
EXERCISES
1. Theorem 3, § 1 may be stated as follows: If $f$ is a polynomial in one
variable which is known not to be of degree higher than $n$, then if $f$ vanishes at
$n + 1$ distinct points, it vanishes identically.
Establish the following generalization of this theorem:
If $f$ is a polynomial in $(x, y)$ which is known not to be of higher degree than
$n$ in $x$, and not of higher degree than $m$ in $y$, then, if $f$ vanishes at the
$(n + 1)(m + 1)$ distinct points
$$(x_i, y_j) \qquad (i = 1, 2, \dots n + 1; \; j = 1, 2, \dots m + 1),$$
it vanishes identically.
2. Generalize the theorem of Exercise 1 to polynomials in any number of
variables.
3. Prove Theorem 4, § 2 by means of Theorem 1 of the present section; and
from this result deduce Theorem 3, § 2.
4. Do Theorems 1 and 2 of this section hold if we consider only real polynomials
and the real neighborhoods of real points?
4. Homogeneous Coordinates. Though only two quantities are
necessary in order to locate the position of a point in a plane, it is
frequently more convenient to use three, the precise values of the
quantities being of no consequence, but only their ratios. We will
represent these three quantities by x, y, t, and define their ratios by
the equations
$$\frac{x}{t} = X, \qquad \frac{y}{t} = Y,$$
where $X$ and $Y$ are the cartesian coordinates of a point in a plane.
Thus (2, 3, 5) will represent the point whose abscissa is 2/5 and whose
ordinate is 3/5. Any set of three numbers which are proportional to
(2, 3, 5) will represent the same point. So that, while to every
set of three numbers (with certain exceptions to be noted below)
there corresponds one and only one point, to each point there
correspond an infinite number of different sets of three numbers, all
of which, however, are proportional.
When $t = 0$ our definition is meaningless; but if we consider
the points (2, 3, 1), (2, 3, 0.1), (2, 3, 0.01), (2, 3, 0.001), ..., which
are, in cartesian coordinates, the points (2, 3), (20, 30), (200, 300),
(2000, 3000), ..., we see that they all lie on the straight line through
the origin whose slope is 3/2. Thus as $t$ approaches zero, $x$ and $y$
remaining fixed but not both zero, the point $(x, y, t)$ moves away
along a straight line through the origin whose slope is $y/x$. Hence
it is natural to speak of $(x, y, 0)$ as the point at infinity on the line
whose slope is $y/x$. If $t$ approaches zero through negative values,
the point will move off along the same line, but in the opposite
direction. We will not distinguish between these two cases, but
will speak of only one point at infinity on any particular line. It
can be easily verified that if a point moves to infinity along any line
parallel to the one just considered, its homogeneous coordinates may
be made to approach the same values $(x, y, 0)$ as those just obtained.
It is therefore natural to speak of the point at infinity in a certain
direction rather than on a definite line. Finally we will agree
that two points at infinity whose coordinates are proportional shall
be regarded as coinciding, since these coordinates may be regarded
as the limits of the coordinates of one and the same point which
moves further and further off.*
If $x = y = t = 0$, we will not say that we have a point at all, since
the coordinates of any point whatever may be taken as small as we
please, and so (0, 0, 0) might be regarded as the limits of the coordinates
of any fixed or variable point.
* It should be noticed that in speaking of points at infinity we are, considering
the matter from a purely logical point of view, doing exactly the same thing that we
did in § 3 in speaking of imaginary points, or points in space of n dimensions ; that is,
we are speaking of a set of quantities as a "point" which are not the coordinates of
any point. The only difference between the two cases is that the coordinates of our
"point at infinity " are the limits of the coordinates of a true point.
Thus, in particular, it is a pure convention, though a desirable and convenient
one, when we say that two points at infinity shall be regarded as coincident when and
only when their coordinates are proportional. We might, if we chose, regard all
points at infinity as coincident. There is no logical compulsion in the matter.
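The conventions just laid down for homogeneous coordinates in the plane can be tried out in a short Python sketch. The function names `is_point`, `same_point` and `to_cartesian` are ours and purely illustrative; exact rational arithmetic is assumed so that proportionality can be tested without rounding.

```python
from fractions import Fraction

def is_point(p):
    """(0, 0, 0) is excluded: it is not regarded as a point at all."""
    return any(c != 0 for c in p)

def same_point(p, q):
    """True when the triples p and q are proportional, i.e. represent the same point."""
    if not (is_point(p) and is_point(q)):
        raise ValueError("(0, 0, 0) does not represent a point")
    (x1, y1, t1), (x2, y2, t2) = p, q
    # For non-zero triples, proportionality is equivalent to the vanishing
    # of all two-rowed determinants formed from the two triples.
    return x1 * y2 == x2 * y1 and x1 * t2 == x2 * t1 and y1 * t2 == y2 * t1

def to_cartesian(p):
    """Return (X, Y) = (x/t, y/t), or None for a point at infinity (t = 0)."""
    if not is_point(p):
        raise ValueError("(0, 0, 0) does not represent a point")
    x, y, t = p
    if t == 0:
        return None
    return (Fraction(x) / Fraction(t), Fraction(y) / Fraction(t))
```

With these definitions, `to_cartesian((2, 3, 5))` gives the point with abscissa 2/5 and ordinate 3/5, while `(2, 3, 0)` and `(-4, -6, 0)` are reported as the same point at infinity.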
The equation
$$AX^2 + BXY + CY^2 + DX + EY + F = 0$$
becomes, in homogeneous coordinates,
$$A\frac{x^2}{t^2} + B\frac{xy}{t^2} + C\frac{y^2}{t^2} + D\frac{x}{t} + E\frac{y}{t} + F = 0,$$
or
$$Ax^2 + Bxy + Cy^2 + Dxt + Eyt + Ft^2 = 0,$$
a homogeneous equation of the second degree; and it is evident that
if the coordinates $X, Y$ in any algebraic equation be replaced by
the coordinates $x, y, t$, the resulting equation will be homogeneous,
and of the same degree as the original equation. It is to this fact
that the system owes its name, as well as one of its chief advantages.
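The passage from a cartesian equation to its homogeneous form can be sketched as follows. This is only an illustration under our own conventions: a polynomial in $X, Y$ is stored as a dictionary mapping exponent pairs `(i, j)` to coefficients, and the helper names `homogenize` and `evaluate` are hypothetical. The polynomial is assumed not to be identically zero.

```python
def homogenize(poly):
    """poly maps (i, j) to the coefficient of X**i * Y**j.

    Each term is multiplied by the power of t that raises it to the
    total degree of the polynomial, giving terms x**i * y**j * t**k.
    """
    degree = max(i + j for (i, j), c in poly.items() if c != 0)
    return {(i, j, degree - i - j): c for (i, j), c in poly.items() if c != 0}

def evaluate(hpoly, x, y, t):
    """Value of the homogeneous polynomial at (x, y, t)."""
    return sum(c * x**i * y**j * t**k for (i, j, k), c in hpoly.items())
```

Applied to the general conic above, every term of the result has total degree 2, and multiplying $x, y, t$ by a common factor $r$ multiplies the value by $r^2$, which is the homogeneity the text refers to.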
The equation $Ax + By + Ct = 0$
represents, in general, a line, but if $A = B = 0$, $C \neq 0$, it has no true
geometric locus. It is, in this case, satisfied by the coordinates of all
points at infinity, and by the coordinates of no other point. We shall
therefore speak of it as the equation of the line at infinity. The reader
may easily verify, by using the equation of a line in terms of its intercepts,
that if a straight line move further and further away, its homogeneous
equation will approach more and more nearly the form $t = 0$.
In space of three dimensions we will represent the point whose
cartesian coordinates are $X, Y, Z$ by the four homogeneous coordinates
$x, y, z, t$, whose ratios are defined by the equations
$$\frac{x}{t} = X, \qquad \frac{y}{t} = Y, \qquad \frac{z}{t} = Z.$$
We will speak of $(x, y, z, 0)$ as "the point at infinity" on a line
whose direction cosines are
$$\frac{x}{\sqrt{x^2 + y^2 + z^2}}, \qquad \frac{y}{\sqrt{x^2 + y^2 + z^2}}, \qquad \frac{z}{\sqrt{x^2 + y^2 + z^2}}.$$
(0, 0, 0, 0) will be excluded, and $t = 0$ will be spoken of as the equation
of the plane at infinity.
Extending the same terminology to the general case, we shall
sometimes find it convenient to speak of $(x_1, x_2, \dots x_n)$ not as a point
in space of $n$ dimensions, but as a point represented by its homogeneous
coordinates in space of $n - 1$ dimensions. Two points
whose coordinates are proportional will be spoken of as identical,
a point whose last coordinate is zero will be spoken of as a point at
infinity, and the case $x_1 = \cdots = x_n = 0$ will not be spoken of as a
point at all. This terminology will be adopted only in connection
with homogeneous polynomials, and even then it must be clearly
understood that we are perfectly free to adopt whichever terminology
we find most convenient. Thus, for instance, if $f(x_1, x_2, x_3)$ is
a homogeneous polynomial of the second degree, the equation $f = 0$
may be regarded either as determining a conic in a plane ($x_1, x_2, x_3$
being homogeneous coordinates) or a quadric cone in space ($x_1, x_2, x_3$
being ordinary cartesian coordinates).
Homogeneous coordinates may also be used in space of one
dimension. We should then determine the points on a line by two
coordinates $x, t$ whose ratio $x/t$ is the nonhomogeneous coordinate
$X$, i.e. the distance of the point from the origin. It is this representation
that is commonly made use of in connection with the
theory of binary forms.
5. The Continuity of Polynomials.
Definition. A function $f(x_1, \dots x_n)$ is said to be continuous at the
point $(c_1, \dots c_n)$ if, no matter how small a positive quantity $\epsilon$ be chosen,
a neighborhood of the point $(c_1, \dots c_n)$ can be found so small that the difference
between the value of the function at any point of this neighborhood
and its value at the point $(c_1, \dots c_n)$ is in absolute value less than $\epsilon$.
That is, $f$ is continuous at $(c_1, \dots c_n)$ if, having chosen a positive
quantity $\epsilon$, it is possible to determine a positive $\delta$ such that
$$|f(x_1, \dots x_n) - f(c_1, \dots c_n)| < \epsilon$$
for all values of $(x_1, \dots x_n)$ which satisfy the inequalities
$$|x_1 - c_1| < \delta, \quad |x_2 - c_2| < \delta, \quad \dots \quad |x_n - c_n| < \delta.$$
Theorem 1. If two functions are continuous at a point, their
sum is continuous at this point.
Let $f_1$ and $f_2$ be two functions continuous at the point $(c_1, \dots c_n)$
and let $k_1$ and $k_2$ be their respective values at this point. Then, no
matter how small the positive quantity $\epsilon$ may be chosen, we may
take $\delta_1$ and $\delta_2$ so small that
$$|f_1 - k_1| < \tfrac{1}{2}\epsilon \quad \text{when} \quad |x_i - c_i| < \delta_1,$$
$$|f_2 - k_2| < \tfrac{1}{2}\epsilon \quad \text{when} \quad |x_i - c_i| < \delta_2.$$
Accordingly
$$|f_1 - k_1| + |f_2 - k_2| < \epsilon \quad \text{when} \quad |x_i - c_i| < \delta,$$
where $\delta$ is the smaller of the two quantities $\delta_1$ and $\delta_2$; and, since
$|A + B| \le |A| + |B|$, we have
$$|f_1 - k_1 + f_2 - k_2| = |(f_1 + f_2) - (k_1 + k_2)| < \epsilon \quad \text{when} \quad |x_i - c_i| < \delta.$$
Hence $f_1 + f_2$ is continuous at the point $(c_1, \dots c_n)$.
Corollary. If a finite number of functions are continuous at a
point, their sum is continuous at this point.
Theorem 2. If two functions are continuous at a point, their
product is continuous at this point.
Let $f_1$ and $f_2$ be the two functions, and $k_1$ and $k_2$ their values at
the point $(c_1, \dots c_n)$ where they are assumed to be continuous. We
have to prove that however small $\epsilon$ may be, $\delta$ can be chosen so small
that
(1) $|f_1 f_2 - k_1 k_2| < \epsilon$ when $|x_i - c_i| < \delta$.
Let $\eta$ be a positive constant, which we shall ultimately restrict to a
certain degree of smallness, and let us choose two positive constants
$\delta_1$ and $\delta_2$, such that
$$|f_1 - k_1| < \eta \quad \text{when} \quad |x_i - c_i| < \delta_1,$$
$$|f_2 - k_2| < \eta \quad \text{when} \quad |x_i - c_i| < \delta_2.$$
Now take $\delta$ as the smaller of the two quantities $\delta_1$ and $\delta_2$. Then,
when $|x_i - c_i| < \delta$,
$$|f_1 f_2 - k_1 k_2| = |f_2(f_1 - k_1) + k_1(f_2 - k_2)| \le |f_2|\,|f_1 - k_1| + |k_1|\,|f_2 - k_2| \le \{|f_2| + |k_1|\}\eta.$$
Accordingly since, when $|x_i - c_i| < \delta$,
$$|f_2| = |k_2 + (f_2 - k_2)| \le |k_2| + |f_2 - k_2| < |k_2| + \eta,$$
we may write
(2) $|f_1 f_2 - k_1 k_2| < \{|k_1| + |k_2|\}\eta + \eta^2$.
If $k_1$ and $k_2$ are not both zero, let us take $\eta$ small enough to satisfy
the two inequalities
$$\eta < |k_1| + |k_2|, \qquad \eta < \frac{\epsilon}{2\{|k_1| + |k_2|\}}.$$
If $k_1 = k_2 = 0$, we will restrict $\eta$ merely by the inequality
$$\eta < \sqrt{\epsilon}.$$
In either case, inequality (2) then reduces to the form (1) and our
theorem is proved.
Corollary. If a finite number of functions are continuous at a
point, their product is continuous at this point.
Referring now to our definition of continuity, we see that any
constant may be regarded as a continuous function of $(x_1, \dots x_n)$ for
all values of these variables, and that the same is true of any one of
these variables themselves. Hence by the last corollary any function
of the form $c\,x_1^{k_1} \cdots x_n^{k_n}$, where the $k$'s are integers positive or zero, is
continuous at every point. If we now refer to the corollary to
Theorem 1, we arrive at the theorem:
Theorem 3. Any polynomial is a continuous function for all
values of the variables.
Finally, we give a simple application of this theorem.
Theorem 4. If $f(x_1, \dots x_n)$ is a polynomial and $f(c_1, \dots c_n) \neq 0$,
it is possible to take a neighborhood of the point $(c_1, \dots c_n)$ so small that
$f$ does not vanish at any point in this neighborhood.
Let $k = f(c_1, \dots c_n)$. Then, on account of the continuity of $f$ at
$(c_1, \dots c_n)$, a positive quantity $\delta$ can be chosen so small that throughout
the neighborhood $|x_i - c_i| < \delta$, the inequality
$$|f - k| < \tfrac{1}{2}|k|$$
is satisfied. In this neighborhood $f$ cannot vanish; for at any point
where it vanished we should have
$$|f - k| = |k| < \tfrac{1}{2}|k|,$$
which is impossible since by hypothesis $k \neq 0$.
6. The Fundamental Theorem of Algebra. Up to this point no
use has been made of what is often known as the fundamental
theorem of algebra, namely the proposition that every algebraic
equation has a root. This fact we may state in more precise form
as follows :
Theorem 1. If $f(x)$ is a polynomial of the $n$th degree where
$n \ge 1$, there exists at least one value of $x$ for which $f(x) = 0$.
This theorem, fundamental though it is, is not necessary for
most of the developments in this book. Moreover, the methods of
proving the theorem are essentially not algebraic, or only in part
algebraic. Accordingly, we will give no proof of the theorem here,
but merely refer the reader who desires a formal proof to any of the
textbooks on the theory of functions of a complex variable. We
shall, however, when we find it convenient to do so, assume the
truth of this theorem. In this section we will deduce a few of its
more immediate consequences.
Theorem 2. If $f(x)$ is a polynomial of the $n$th degree,
$$f(x) \equiv a_0 x^n + a_1 x^{n-1} + \cdots + a_{n-1}x + a_n \qquad (a_0 \neq 0),$$
there exists one and only one set of constants, $\alpha_1, \alpha_2, \dots \alpha_n$, such that
$$f(x) \equiv a_0(x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_n).$$
This theorem is seen at once to be true for polynomials of the
first degree. Let us then use the method of mathematical induction
and assume the proposition true for all polynomials of degree less
than $n$. If we can infer that the theorem is true for polynomials of
the $n$th degree, it follows that being true for those of the first degree
it is true for those of the second, hence for those of the third, etc.
By Theorem 1 we see that there is at least one value of $x$ for
which $f(x) = 0$. Call such a value $\alpha_1$. By Theorem 1, § 1 we may
write
$$f(x) \equiv (x - \alpha_1)\phi(x),$$
where
$$\phi(x) \equiv a_0 x^{n-1} + b_1 x^{n-2} + \cdots + b_{n-1}.$$
Since $\phi(x)$ is a polynomial of degree $n - 1$, and since we are assuming
our theorem to be true for all such polynomials, there exist
$n - 1$ constants $\alpha_2, \dots \alpha_n$ such that
$$\phi(x) \equiv a_0(x - \alpha_2) \cdots (x - \alpha_n).$$
Hence
$$f(x) \equiv a_0(x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_n).$$
Thus half of our theorem is proved.
Suppose now there were two such sets of constants, $\alpha_1, \dots \alpha_n$ and
$\beta_1, \dots \beta_n$. We should then have
(1) $f(x) \equiv a_0(x - \alpha_1) \cdots (x - \alpha_n) \equiv a_0(x - \beta_1) \cdots (x - \beta_n)$.
Let $x = \alpha_1$ in this identity. This gives
$$a_0(\alpha_1 - \beta_1)(\alpha_1 - \beta_2) \cdots (\alpha_1 - \beta_n) = 0.$$
Accordingly, since $a_0 \neq 0$, $\alpha_1$ must be equal to one of the quantities
$\beta_1, \beta_2, \dots \beta_n$. Let us suppose the $\beta$'s to have been taken in
such an order that $\alpha_1 = \beta_1$. Now in the identity (1) cancel out the
factor $a_0(x - \alpha_1)$ (see Theorem 4, § 2). This gives
$$(x - \alpha_2) \cdots (x - \alpha_n) \equiv (x - \beta_2) \cdots (x - \beta_n).$$
Accordingly, since we have assumed the theorem we are proving
to be true for polynomials of degree $n - 1$, the constants $\beta_2, \dots \beta_n$
are the same, except perhaps for the order, as the constants $\alpha_2, \dots \alpha_n$,
and our theorem is proved.
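The factored form of Theorem 2 can be explored numerically with a small sketch. It only goes in the easy direction, from a given leading coefficient and a given list of roots to the coefficients; the function names `expand_from_roots` and `horner` are ours, and no root-finding is attempted.

```python
def expand_from_roots(a0, roots):
    """Coefficients [a0, a1, ..., an], highest power first,
    of the product a0 * (x - r1) * ... * (x - rn)."""
    coeffs = [a0]
    for r in roots:
        # Multiply the current polynomial by (x - r):
        # x * p(x) shifts the coefficients, -r * p(x) scales them.
        shifted = coeffs + [0]
        scaled = [0] + [-r * c for c in coeffs]
        coeffs = [s + t for s, t in zip(shifted, scaled)]
    return coeffs

def horner(coeffs, x):
    """Evaluate the polynomial with the given coefficients at x."""
    value = 0
    for c in coeffs:
        value = value * x + c
    return value
```

The expansion does not depend on the order in which the roots are listed, in agreement with the statement that the constants are determined only up to their order, and the resulting polynomial vanishes at each listed root.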
Definition. The constants $\alpha_1, \dots \alpha_n$ determined in the last theorem
are called the roots of the polynomial $f(x)$, or of the equation
$f(x) = 0$. If $k$ of these roots are equal to one another, but different
from all the other roots, this root is called a $k$-fold root.
It is at once seen by reference to Theorem 1, § 1 that these roots
are the only points at which $f(x)$ vanishes.
Theorem 3. If $f(x_1, \dots x_n)$ is a polynomial which is not identically
equal to a constant, there are an infinite number of points $(x_1, \dots x_n)$
at which $f \neq 0$, and also an infinite number at which $f = 0$, provided
$n > 1$.
The truth of the first part of this theorem is at once obvious, for,
since $f$ is not identically zero, a point can be found at which it is not
zero, and then a neighborhood of this point can be taken so small
that $f$ does not vanish in this neighborhood (Theorem 4, § 5). This
neighborhood, of course, consists of an infinite number of points.
To prove that $f$ vanishes at an infinite number of points, let us
select one of the variables which enters into $f$ to at least the first
degree. Without loss of generality we may suppose this variable
to be $x_1$. We may then write
$$f(x_1, \dots x_n) \equiv F_0(x_2, \dots x_n)x_1^k + F_1(x_2, \dots x_n)x_1^{k-1} + \cdots + F_k(x_2, \dots x_n),$$
where $k \ge 1$ and $F_0$ is not identically zero. Let $(c_2, \dots c_n)$ be any
point at which $F_0$ is not zero. Then $f(x_1, c_2, \dots c_n)$ is a polynomial
of the $k$th degree in $x_1$ alone. Accordingly, by Theorem 1, there is
at least one value of $x_1$ for which it vanishes. If $c_1$ is such a value,
$f(c_1, c_2, \dots c_n) = 0$. Moreover, by the part of our theorem already
proved, there are an infinite number of points where $F_0 \neq 0$, that
is an infinite number of choices possible for the quantities $c_2, \dots c_n$.
Thus our theorem is completely proved.
Finally, we will state, without proof, for future reference, a
theorem which says, in brief, that the roots of an algebraic equation
are continuous functions of the coefficients :
Theorem 4. If $\alpha$ is a root of the polynomial
$$a_0 x^n + a_1 x^{n-1} + \cdots + a_{n-1}x + a_n \qquad (a_0 \neq 0),$$*
then no matter how small a neighborhood $|x - \alpha| < \epsilon$ of the point $\alpha$ we
may consider, it is possible to take in space of $n + 1$ dimensions a neighborhood
of the point $(a_0, a_1, \dots a_n)$ so small that, if $(b_0, b_1, \dots b_n)$ is any
point in this neighborhood, the polynomial
$$b_0 x^n + b_1 x^{n-1} + \cdots + b_{n-1}x + b_n$$
has at least one root $\beta$ in the neighborhood $|x - \alpha| < \epsilon$ of the point $\alpha$.
For a proof of this theorem we refer to Weber's Algebra,
Vol. 1, § 44.
* The theorem remains true if we merely assume that the polynomial is of at least
the first degree. That is, some of the first coefficients $a_0, a_1, \dots$ may be zero.
CHAPTER II
A FEW PROPERTIES OF DETERMINANTS
7. Some Definitions. We assume that the reader is familiar with
the determinant notation, and will merely recall to him that by a
determinant of the $n$th order
$$\begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix}$$
we understand a certain homogeneous polynomial of the $n$th degree
in the $n^2$ elements $a_{ij}$. By the side of these determinants it is often
desirable to consider the system of the $n^2$ elements arranged in the
order in which they stand in the determinant, but not combined
into a polynomial. Such a square array of $n^2$ elements we speak
of as a matrix. In fact, we will lay down the following somewhat
more general definition of this term:
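Definition 1. A system of $mn$ quantities arranged in a rectangular
array of $m$ rows and $n$ columns is called a matrix. If $m = n$, we
say that we have a square matrix of order $n$.
It is customary to place double bars on each side of this array,
thus:
$$\begin{Vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{Vmatrix}$$
Sometimes parentheses are used, thus:
$$\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}$$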
Even when a matrix is square, it must be carefully noticed that
it is not a determinant. In fact, a matrix is not a quantity at all,*
but a system of quantities. This difference between a square matrix
and a determinant is clearly brought out if we consider the
effect of interchanging columns and rows. This interchange has no
effect on a determinant, but gives us a wholly new matrix. In fact,
we will lay down the definition:
Definition 2. Two square matrices
$$\begin{Vmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{Vmatrix}, \qquad \begin{Vmatrix} a_{11} & \cdots & a_{n1} \\ \vdots & & \vdots \\ a_{1n} & \cdots & a_{nn} \end{Vmatrix}$$
of which either is obtained from the other by interchanging rows and
columns are called conjugate† to each other.
Although, as we have pointed out, square matrices and determinants
are wholly different things, every determinant determines a
square matrix, the matrix of the determinant, and conversely every
square matrix determines a determinant, the determinant of the
matrix.
Every matrix contains other matrices obtained from it by striking
out certain rows or columns or both. In particular it contains
certain square matrices; and the determinants of these square
matrices we will call the determinants of the matrix. If the matrix
contains $m$ rows and $n$ columns, it will contain determinants of all
orders from 1 (the elements themselves) to the smaller of the two
integers $m$ and $n$ inclusive.‡ In many important problems all
of these determinants above a certain order are zero, and it is often
of great importance to specify the order of the highest nonvanishing
determinant of a given matrix. For this purpose we lay down
the following definition:
* Cf., however, § 21. † Sometimes also transposed.
‡ If $m = n$, there is only one of these determinants of highest order, and it was this
which we called above the determinant of the square matrix.
Definition 3. A matrix is said to be of rank $r$ if it contains at
least one $r$-rowed determinant which is not zero, while all determinants
of order higher than $r$ which the matrix may contain are zero.
A matrix is said to be of rank 0 if all its elements are zero.
For brevity, we shall speak also of the rank of a determinant,
meaning thereby the rank of the matrix of the determinant.
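Definition 3 can be applied quite literally to small matrices. The sketch below searches the determinants of the matrix from the highest order downward; it is meant only as an illustration of the definition (the names `det` and `rank` are ours, and the brute-force search is far too slow for large matrices). Integer or rational entries are assumed so that the test for zero is exact.

```python
from itertools import combinations, permutations

def det(a):
    """Determinant by direct expansion: one signed term per permutation."""
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
        term = (-1) ** inversions
        for i in range(n):
            term *= a[i][p[i]]
        total += term
    return total

def rank(matrix):
    """Order of the highest nonvanishing determinant contained in the matrix."""
    m, n = len(matrix), len(matrix[0])
    for r in range(min(m, n), 0, -1):
        for rows in combinations(range(m), r):
            for cols in combinations(range(n), r):
                minor = [[matrix[i][j] for j in cols] for i in rows]
                if det(minor) != 0:
                    return r
    return 0  # all elements are zero
```

For instance, a three-rowed square matrix whose second row is twice its first has a vanishing determinant, but it is of rank 2 as soon as one of its two-rowed determinants is not zero.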
We turn now to certain definitions concerning the minors of
determinants ; that is, the determinants obtained from the given
determinant by striking out certain rows and columns.
It is a familiar fact that to every element of a determinant
corresponds a certain first minor; namely, the one obtained by
striking out the row and column of the determinant in which
the given element lies. Now the elements of a determinant
of the $n$th order may be regarded as its $(n - 1)$th minors.
Accordingly we have here a method of pairing off each one-rowed
minor of a given determinant with one of its $(n - 1)$-rowed
minors.
Similarly, if $M$ is a two-rowed minor of a determinant of the
$n$th order $D$, we may pair it off against the $(n - 2)$-rowed minor $N$
obtained by striking out from $D$ the two rows and columns which
are represented in $M$. The two minors $M$ and $N$ we will speak of
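as complementary. Thus, in the determinant
$$\begin{vmatrix} a_{11} & a_{12} & a_{13} & a_{14} & a_{15} \\ a_{21} & a_{22} & a_{23} & a_{24} & a_{25} \\ a_{31} & a_{32} & a_{33} & a_{34} & a_{35} \\ a_{41} & a_{42} & a_{43} & a_{44} & a_{45} \\ a_{51} & a_{52} & a_{53} & a_{54} & a_{55} \end{vmatrix}$$
the two minors
$$\begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}, \qquad \begin{vmatrix} a_{12} & a_{14} & a_{15} \\ a_{42} & a_{44} & a_{45} \\ a_{52} & a_{54} & a_{55} \end{vmatrix}$$
are complementary.
In the same way we pair off with every three-rowed minor an
$(n - 3)$-rowed minor; etc. In general we lay down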
Definition 4. If $D$ is a determinant of the $n$th order and $M$
one of its $k$-rowed minors, then the $(n - k)$-rowed minor $N$ obtained by
striking out from $D$ all the rows and columns represented in $M$ is called
the complement of $M$.
Conversely, $M$ is clearly the complement of $N$.
Let us go back now for a moment to the case of the one-rowed
minors; that is to the elements themselves. Let $a_{ij}$ be the element of
the determinant $D$ which stands in the $i$th row and the $j$th column.
Let $D_{ij}$ represent the corresponding first minor. It will be recalled
that we frequently have occasion to consider not this minor $D_{ij}$ but
the cofactor $A_{ij}$ of $a_{ij}$ defined by the equation $A_{ij} = (-1)^{i+j}D_{ij}$.
Similarly, it is often convenient to consider not the complement
of a given minor but its algebraic complement, which in the case just
mentioned reduces to the cofactor, and which, in general, we define
as follows:
Definition 5. If $M$ is the $m$-rowed minor of $D$ in which the rows
$k_1, \dots k_m$ and the columns $l_1, \dots l_m$ are represented, then the algebraic
complement of $M$ is defined by the equation
$$\text{alg. compl. of } M = (-1)^{k_1 + \cdots + k_m + l_1 + \cdots + l_m}\,[\text{compl. of } M].$$
The following special case is important:
Definition 6. By a principal minor of a determinant $D$ is understood
a minor obtained by striking out from $D$ the same rows as columns.
Since in this case, using the notation of Definition 5, we have
$$k_1 = l_1, \quad k_2 = l_2, \quad \dots \quad k_m = l_m,$$
it follows that the algebraic complement of any principal minor is equal
to its plain complement.
We have so far assumed tacitly that the orders of the minors we
were dealing with were less than the order $n$ of the determinant
itself. By the $n$-rowed minor of a determinant $D$ of the $n$th order
we of course understand this determinant itself. The complement
of this minor has, however, by our previous definition no meaning.
We will define the complement in this case to be 1, and, by Definition 5,
this will also be the algebraic complement.
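Definitions 4 and 5 translate directly into a short computation. In the sketch below the helper names are ours, and rows and columns are numbered from 0 as is usual in Python; since this lowers the exponent $k_1 + \cdots + k_m + l_1 + \cdots + l_m$ by the even number $2m$, the sign is the same as with the numbering from 1 used in the text.

```python
from itertools import permutations

def det(a):
    """Determinant by direct expansion; the empty determinant has the value 1."""
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
        term = (-1) ** inversions
        for i in range(n):
            term *= a[i][p[i]]
        total += term
    return total

def complement(d, rows, cols):
    """Minor left after striking out the given rows and columns (0-based)."""
    n = len(d)
    keep_rows = [i for i in range(n) if i not in rows]
    keep_cols = [j for j in range(n) if j not in cols]
    return det([[d[i][j] for j in keep_cols] for i in keep_rows])

def algebraic_complement(d, rows, cols):
    """Complement multiplied by (-1) raised to the sum of the row and column numbers."""
    return (-1) ** (sum(rows) + sum(cols)) * complement(d, rows, cols)
```

For a single element the algebraic complement is the cofactor, for a principal minor it coincides with the plain complement, and for the whole determinant the convention of the text (complement equal to 1) falls out of the empty product.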
EXERCISE
Prove that, if $M$ and $N$ are complementary minors, either $M$ and $N$ are the
algebraic complements of each other, or $-N$ is the algebraic complement of
$M$ and $-M$ is the algebraic complement of $N$.
8. Laplace's Development. Just as the elements of any row or
column and their corresponding cofactors may be used to develop a
determinant in terms of determinants of lower orders, so the $k$-rowed
minors formed from any $k$ rows or columns may be used, along with
their algebraic complements, to obtain a more general development
of the determinant, due to Laplace, and which includes as a special
case the one just referred to. In order to establish this development,
we begin with the following preliminary theorem:
Theorem 1. If the rows and columns of a determinant $D$ be
shifted in such a way as to bring a certain minor $M$ into the upper left-hand
corner without changing the order of the rows and columns either
of $M$ or of its complement $N$, then this shifting will change the sign of
$D$ or leave it unchanged according as $-N$ or $N$ is the algebraic complement
of $M$.
To prove this let us, as usual, number the rows and columns of
$D$, beginning at the upper left-hand corner, and let the numbers of
the rows and columns represented in $M$, arranged in order of increasing
magnitude, be $k_1, \dots k_m$ and $l_1, \dots l_m$ respectively. In order to
effect the rearrangement mentioned in the theorem, we may first
shift the row numbered $k_1$ upward into the first position, thus carrying
it over $k_1 - 1$ other rows and therefore changing the sign of the
determinant $k_1 - 1$ times. Then shift the row numbered $k_2$ into the
second position. This carries it over $k_2 - 2$ rows and hence changes
the sign $k_2 - 2$ times. Proceed in this way until the row numbered
$k_m$ has been shifted into the $m$th position. Then shift the columns
in a similar manner. The final result is to multiply $D$ by
$$(-1)^{k_1 + \cdots + k_m + l_1 + \cdots + l_m - 2(1 + 2 + \cdots + m)} = (-1)^{k_1 + \cdots + k_m + l_1 + \cdots + l_m}.$$
Comparing this with Definition 5, § 7, the truth of our theorem is
obvious.
Lemma. If $M$ is a minor of a determinant $D$, the product of $M$
by its algebraic complement is identical, when expanded, with some of
the terms of the expansion of $D$.
Let
$$D = \begin{vmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{vmatrix},$$
and call the order of $M$, $m$, and its complement $N$. We will first
prove our lemma in the special case in which $M$ stands in the upper
left-hand corner of $D$, so that $N$, which in this case is the algebraic
complement, is in the lower right-hand corner. What we have to
show here is that the product of any term of $M$ by a term of $N$ is a
term of $D$, and that this term does not come in twice to the product
$MN$. Any term of $M$ may be written
$$(-1)^{\mu} a_{1 l_1} a_{2 l_2} \cdots a_{m l_m},$$
where the integers $l_1, l_2, \dots l_m$ are merely some arrangement of the
integers $1, 2, \dots m$, and $\mu$ is the number of inversions of order in this
arrangement. Similarly, any term of $N$ may be written
$$(-1)^{\nu} a_{m+1,\, l_{m+1}} \cdots a_{n l_n},$$
where $l_{m+1}, \dots l_n$ is merely some arrangement of the integers $m + 1,
\dots n$, and $\nu$ is the number of inversions of order in this arrangement.
The product of these two terms
$$(-1)^{\mu + \nu} a_{1 l_1} a_{2 l_2} \cdots a_{n l_n}$$
is a term of $D$, for the factors $a$ are chosen in succession from the
first, second, ... $n$th rows of $D$, and no two are from the same column,
and $\mu + \nu$ is clearly precisely the number of inversions of order
in the arrangement $l_1, l_2, \dots l_n$, as compared to the natural arrangement,
$1, 2, \dots n$, of these integers.
Having thus proved our lemma in the special case in which $M$
lies in the upper left-hand corner of $D$, we now pass to the general
case. Here we may, by shifting rows and columns, bring $M$ into
the upper left-hand corner and $N$ into the lower right-hand corner.
This has, by Theorem 1, the effect of leaving each term in the
expansion of $D$ unchanged, or of reversing the sign of all of them
according as $N$ or $-N$ is the algebraic complement of $M$. Accordingly,
since the product $MN$ gives, as we have just seen, terms in
the expansion of this rearranged determinant, the product of $M$ by
its algebraic complement gives terms in the expansion of $D$ itself,
as was to be proved.
Laplace's Development, which may be stated in the form of the
following rule, now follows at once:
Theorem 2. Pick out any $m$ rows (or columns) from a determinant
$D$, and form all the $m$-rowed determinants from this matrix. The
sum of the products of each of these minors by its algebraic complement
is the value of $D$.
Since, by our lemma, each of these products when developed consists
of terms of $D$, it remains merely to show that every term of
$D$ occurs in one and only one of these products. This is obviously
the case; for every term of $D$ contains one element from each of the
$m$ rows of $D$ from which our theorem directs us to pick out $m$-rowed
determinants, and, since these elements all lie in different columns,
they lie in one and only one of these $m$-rowed determinants, say $M$.
Since the other elements in this term of $D$ obviously all lie in the
complement $N$ of $M$, this term will be found in the product $MN$ and
in none of the other products mentioned in our theorem.
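Laplace's Development can be checked on small numerical examples. The following sketch is only an illustration (the names `det` and `laplace` are ours): it picks out a given set of rows, forms every $m$-rowed minor from them, and sums the products of the minors by their algebraic complements. Rows are numbered from 0 and must be listed in increasing order.

```python
from itertools import combinations, permutations

def det(a):
    """Determinant by direct expansion; the empty determinant has the value 1."""
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
        term = (-1) ** inversions
        for i in range(n):
            term *= a[i][p[i]]
        total += term
    return total

def laplace(d, rows):
    """Expand d by the m-rowed minors of the chosen rows (0-based, increasing)."""
    n, m = len(d), len(rows)
    other_rows = [i for i in range(n) if i not in rows]
    total = 0
    for cols in combinations(range(n), m):
        other_cols = [j for j in range(n) if j not in cols]
        minor = det([[d[i][j] for j in cols] for i in rows])
        compl = det([[d[i][j] for j in other_cols] for i in other_rows])
        # 0-based numbering changes the exponent by 2m, so the sign is unaffected.
        sign = (-1) ** (sum(rows) + sum(cols))
        total += sign * minor * compl
    return total
```

Choosing a single row reproduces the ordinary development by cofactors; choosing two or more rows gives the more general development of Theorem 2, and in every case the result should agree with `det`.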
EXERCISES
1. From a square matrix of order $n$ and rank $r$, $s$ rows (or columns) are selected.
Prove that the rank of the matrix thus obtained cannot be less than $r + s - n$.
2. Generalize the theorem of Exercise 1.
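9. The Multiplication Theorem. Laplace's Development enables
us to write out at once the product of any two determinants as a
single determinant whose order is the sum of the orders of the two
given determinants
$$\begin{vmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{vmatrix} \cdot \begin{vmatrix} b_{11} & \cdots & b_{1m} \\ \vdots & & \vdots \\ b_{m1} & \cdots & b_{mm} \end{vmatrix} = \begin{vmatrix} a_{11} & \cdots & a_{1n} & 0 & \cdots & 0 \\ \vdots & & \vdots & \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} & 0 & \cdots & 0 \\ p_{11} & \cdots & p_{1n} & b_{11} & \cdots & b_{1m} \\ \vdots & & \vdots & \vdots & & \vdots \\ p_{m1} & \cdots & p_{mn} & b_{m1} & \cdots & b_{mm} \end{vmatrix},$$
whatever the values of the $p$'s may be. For, expanding the large
determinant in terms of the $n$-rowed minors of the first $n$ rows, all
the terms of the expansion are zero except the one written in the
first member of the equation.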
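The formula just stated lends itself to a quick numerical check. The sketch below builds the large determinant from three blocks and compares it with the product of the two given determinants; the names `det` and `block_lower` are ours, and the particular numbers used for the $p$'s are arbitrary.

```python
from itertools import permutations

def det(a):
    """Determinant by direct expansion: one signed term per permutation."""
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
        term = (-1) ** inversions
        for i in range(n):
            term *= a[i][p[i]]
        total += term
    return total

def block_lower(a, p, b):
    """Array with a (n x n) and zeros in the first n rows,
    p (m x n) and b (m x m) in the last m rows."""
    m = len(b)
    top = [list(row) + [0] * m for row in a]
    bottom = [list(p[i]) + list(b[i]) for i in range(m)]
    return top + bottom
```

Whatever is placed in the block of $p$'s, the value of the large determinant should remain the product of the determinants of `a` and `b`.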
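From this formula we will now deduce a far more important
one for expressing the product of two determinants of the same
order as a determinant of that order. For this purpose let us choose
the $p$'s in the last formula as follows:
$$p_{ij} = 0 \text{ when } i \neq j, \qquad p_{ii} = -1,$$
and let us consider for simplicity the product of two determinants
of the third order. We have
$$\begin{vmatrix} \alpha_1 & \alpha_2 & \alpha_3 \\ \beta_1 & \beta_2 & \beta_3 \\ \gamma_1 & \gamma_2 & \gamma_3 \end{vmatrix} \cdot \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} = \begin{vmatrix} \alpha_1 & \alpha_2 & \alpha_3 & 0 & 0 & 0 \\ \beta_1 & \beta_2 & \beta_3 & 0 & 0 & 0 \\ \gamma_1 & \gamma_2 & \gamma_3 & 0 & 0 & 0 \\ -1 & 0 & 0 & a_1 & a_2 & a_3 \\ 0 & -1 & 0 & b_1 & b_2 & b_3 \\ 0 & 0 & -1 & c_1 & c_2 & c_3 \end{vmatrix}.$$
Let us now reduce this six-rowed determinant by multiplying
its first column by $a_1$ and adding it to the fourth column; then
multiply the first column by $a_2$ and add it to the fifth; then
multiply the first column by $a_3$ and add it to the sixth. In this
way we bring zeros into the last three places in the fourth row.
Next multiply the second column successively by $b_1, b_2, b_3$ and
add it to the fourth, fifth, and sixth columns respectively.
Finally multiply the third column successively by $c_1, c_2, c_3$ and
add it to the fourth, fifth, and sixth columns. The determinant
thus takes the form
$$\begin{vmatrix} \alpha_1 & \alpha_2 & \alpha_3 & \alpha_1 a_1 + \alpha_2 b_1 + \alpha_3 c_1 & \alpha_1 a_2 + \alpha_2 b_2 + \alpha_3 c_2 & \alpha_1 a_3 + \alpha_2 b_3 + \alpha_3 c_3 \\ \beta_1 & \beta_2 & \beta_3 & \beta_1 a_1 + \beta_2 b_1 + \beta_3 c_1 & \beta_1 a_2 + \beta_2 b_2 + \beta_3 c_2 & \beta_1 a_3 + \beta_2 b_3 + \beta_3 c_3 \\ \gamma_1 & \gamma_2 & \gamma_3 & \gamma_1 a_1 + \gamma_2 b_1 + \gamma_3 c_1 & \gamma_1 a_2 + \gamma_2 b_2 + \gamma_3 c_2 & \gamma_1 a_3 + \gamma_2 b_3 + \gamma_3 c_3 \\ -1 & 0 & 0 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 & 0 & 0 \end{vmatrix},$$
and this reduces at once to the three-rowed determinant
$$\begin{vmatrix} \alpha_1 a_1 + \alpha_2 b_1 + \alpha_3 c_1 & \alpha_1 a_2 + \alpha_2 b_2 + \alpha_3 c_2 & \alpha_1 a_3 + \alpha_2 b_3 + \alpha_3 c_3 \\ \beta_1 a_1 + \beta_2 b_1 + \beta_3 c_1 & \beta_1 a_2 + \beta_2 b_2 + \beta_3 c_2 & \beta_1 a_3 + \beta_2 b_3 + \beta_3 c_3 \\ \gamma_1 a_1 + \gamma_2 b_1 + \gamma_3 c_1 & \gamma_1 a_2 + \gamma_2 b_2 + \gamma_3 c_2 & \gamma_1 a_3 + \gamma_2 b_3 + \gamma_3 c_3 \end{vmatrix}.$$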
We have thus expressed the product of two determinants of the
third order as a single determinant of the third order. The method
we have used is readily seen to be entirely general, and we thus get
the following rule for multiplying together two determinants of the
$n$th order:
Theorem. The product of two determinants of the $n$th order
may be expressed as a determinant of the $n$th order in which the
element which lies in the $i$th row and $j$th column is obtained by
multiplying each element of the $i$th row of the first factor by the
corresponding element of the $j$th column of the second factor and
adding the results.
It should be noted that changing rows into columns in either or
both of the given determinants, while not affecting the value of the
product, will alter its form materially. For example,
$$\begin{vmatrix} 2 & 3 \\ 4 & 5 \end{vmatrix} \cdot \begin{vmatrix} 1 & 7 \\ 6 & 9 \end{vmatrix} = \begin{vmatrix} 20 & 41 \\ 34 & 73 \end{vmatrix} = 66,$$
$$\begin{vmatrix} 2 & 3 \\ 4 & 5 \end{vmatrix} \cdot \begin{vmatrix} 1 & 6 \\ 7 & 9 \end{vmatrix} = \begin{vmatrix} 23 & 39 \\ 39 & 69 \end{vmatrix} = 66,$$
$$\begin{vmatrix} 2 & 4 \\ 3 & 5 \end{vmatrix} \cdot \begin{vmatrix} 1 & 7 \\ 6 & 9 \end{vmatrix} = \begin{vmatrix} 26 & 50 \\ 33 & 66 \end{vmatrix} = 66,$$
$$\begin{vmatrix} 2 & 4 \\ 3 & 5 \end{vmatrix} \cdot \begin{vmatrix} 1 & 6 \\ 7 & 9 \end{vmatrix} = \begin{vmatrix} 30 & 48 \\ 38 & 63 \end{vmatrix} = 66;$$
and similarly the product of any two determinants of the same order
may be written in four different forms.
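The four forms of the product in the numerical example can be reproduced with a few lines of Python. This is merely a check of the arithmetic; the names `det`, `row_by_column` and `transpose` are ours.

```python
from itertools import permutations

def det(a):
    """Determinant by direct expansion: one signed term per permutation."""
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
        term = (-1) ** inversions
        for i in range(n):
            term *= a[i][p[i]]
        total += term
    return total

def row_by_column(a, b):
    """Array whose (i, j) element is the ith row of a times the jth column of b."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    """Interchange rows and columns."""
    return [list(col) for col in zip(*a)]
```

Applying `row_by_column` to the two given arrays, and to the arrays obtained by interchanging rows and columns in either or both of them, yields four different arrays whose determinants all have the value 66.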
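10. Bordered Determinants. If to a determinant of the $n$th
order we add one or more rows and the same number of columns of
$n$ quantities each and fill in the vacant corner with zeros, the resulting
determinant is called a bordered determinant. Thus starting from
the two-rowed determinant
$$\begin{vmatrix} \alpha & \beta \\ \gamma & \delta \end{vmatrix}$$
we may form the bordered determinants
$$\begin{vmatrix} \alpha & \beta & u_1 \\ \gamma & \delta & u_2 \\ v_1 & v_2 & 0 \end{vmatrix}, \qquad \begin{vmatrix} \alpha & \beta & u_1 & u_1' \\ \gamma & \delta & u_2 & u_2' \\ v_1 & v_2 & 0 & 0 \\ v_1' & v_2' & 0 & 0 \end{vmatrix}, \qquad \begin{vmatrix} \alpha & \beta & u_1 & u_1' & u_1'' \\ \gamma & \delta & u_2 & u_2' & u_2'' \\ v_1 & v_2 & 0 & 0 & 0 \\ v_1' & v_2' & 0 & 0 & 0 \\ v_1'' & v_2'' & 0 & 0 & 0 \end{vmatrix}.$$
If in the second of these examples we use Laplace's Development
to expand the bordered determinant according to the two-rowed
determinants of the last two rows, we see that its value is
$$\begin{vmatrix} u_1 & u_1' \\ u_2 & u_2' \end{vmatrix} \cdot \begin{vmatrix} v_1 & v_2 \\ v_1' & v_2' \end{vmatrix},$$
a quantity into which the elements $\alpha, \beta, \gamma, \delta$ of the original determinant
do not enter. Similarly expanding the third of the above
bordered determinants according to the three-rowed determinants of
its last three rows, we see that its value is zero.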
The reasoning we have here used is of general application and
leads to the following results :
Theorem 1. If a determinant of the nth order is bordered with
n rows and n columns, the resulting determinant has a value which
depends only on the bordering quantities.
Theorem 2. If a determinant of the nth order is bordered with
more than n rows and columns, the resulting determinant always has the
value zero.
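Theorems 1 and 2 can be tried on the two-rowed case treated above. The sketch below is illustrative only: `det` and `border` are hypothetical helpers, and the numerical entries are arbitrary choices, not data from the text.

```python
def det(m):
    """Determinant by expansion along the first row (adequate for small orders)."""
    if not m:
        return 1
    total = 0
    for j, entry in enumerate(m[0]):
        if entry == 0:
            continue
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def border(core, cols, rows):
    """Border an n-rowed `core` with p columns (`cols`, n x p) and
    p rows (`rows`, p x n), filling the vacant corner with zeros."""
    p = len(rows)
    top = [core[i] + cols[i] for i in range(len(core))]
    bottom = [rows[k] + [0] * p for k in range(p)]
    return top + bottom


u = [[2, 3], [1, 7]]   # bordering columns: row i holds u_i, u_i'
v = [[4, 5], [6, 9]]   # bordering rows: v_1, v_2 and v_1', v_2'

# Theorem 1: with n = p = 2 the value does not depend on the original determinant.
for core in ([[1, 2], [3, 4]], [[5, -7], [0, 9]]):
    print(det(border(core, u, v)), det(u) * det(v))

# Theorem 2: bordering a two-rowed determinant with three rows and columns gives zero.
u3 = [[2, 3, 1], [1, 7, 4]]
v3 = [[4, 5], [6, 9], [1, 1]]
print(det(border([[1, 2], [3, 4]], u3, v3)))
```

In the first loop both values agree for either choice of `core`, which is the content of Theorem 1 in this small case.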
The cases of interest are therefore those in which the determinant
is bordered with less than n rows and columns. Concerning
these we will establish the following fact:
Theorem 3. If a determinant of the nth order be bordered by p
rows and p columns (p < n) of independent variables, the resulting
determinant is a polynomial of degree 2p in the bordering quantities,
whose coefficients are the pth minors of the original determinant; and
conversely, every pth minor of the original determinant is the coefficient
of at least one term of this polynomial.
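Let us consider the special case where n = 4 and p = 2,
$$D = \begin{vmatrix}
a_{11} & a_{12} & a_{13} & a_{14} & u_1 & u_1' \\
a_{21} & a_{22} & a_{23} & a_{24} & u_2 & u_2' \\
a_{31} & a_{32} & a_{33} & a_{34} & u_3 & u_3' \\
a_{41} & a_{42} & a_{43} & a_{44} & u_4 & u_4' \\
v_1 & v_2 & v_3 & v_4 & 0 & 0 \\
v_1' & v_2' & v_3' & v_4' & 0 & 0
\end{vmatrix}.$$
Developing this determinant, by Laplace's method (§ 8), in terms
of the two-rowed determinants of the last two rows, we have
$$D = \begin{vmatrix} v_1 & v_2 \\ v_1' & v_2' \end{vmatrix} \cdot
\begin{vmatrix}
a_{13} & a_{14} & u_1 & u_1' \\
a_{23} & a_{24} & u_2 & u_2' \\
a_{33} & a_{34} & u_3 & u_3' \\
a_{43} & a_{44} & u_4 & u_4'
\end{vmatrix} + \cdots \text{ to 6 terms.}$$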
If now we expand each of these four-rowed determinants, by
Laplace's method, in terms of the two-rowed determinants of their
last two columns, and then arrange the result as a polynomial in
the u's and v's, the truth of the theorem is apparent. We leave it
to the reader to fill in the details of the proof here sketched.
11. Adjoint Determinants and their Minors.
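Definition. If, in the determinant
$$D = \begin{vmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{vmatrix},$$
$A_{ij}$ is the cofactor of the element $a_{ij}$, then the determinant
$$D' = \begin{vmatrix} A_{11} & \cdots & A_{1n} \\ \vdots & & \vdots \\ A_{n1} & \cdots & A_{nn} \end{vmatrix}$$
is called the adjoint of D.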
By corresponding minors of D and D', or indeed of any two
determinants of the same order, we shall naturally understand
minors obtained by striking out the same rows and columns from
D as from D'. These definitions being premised, the fundamental
theorem here is the following:
Theorem. If D' is the adjoint of any determinant D, and M and
M' are corresponding m-rowed minors of D and D' respectively, then
M' is equal to the product of $D^{m-1}$ by the algebraic complement of M.
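We will prove this theorem first for the special case in which
the minors M and M' lie at the upper left-hand corners of D and D'
respectively. We may then write
$$M' = \begin{vmatrix}
A_{11} & \cdots & A_{1m} & A_{1,m+1} & \cdots & A_{1n} \\
\vdots & & \vdots & \vdots & & \vdots \\
A_{m1} & \cdots & A_{mm} & A_{m,m+1} & \cdots & A_{mn} \\
0 & \cdots & 0 & 1 & \cdots & 0 \\
\vdots & & \vdots & \vdots & \ddots & \vdots \\
0 & \cdots & 0 & 0 & \cdots & 1
\end{vmatrix}.$$
Let us now interchange the columns and rows of D,
$$D = \begin{vmatrix} a_{11} & \cdots & a_{n1} \\ \vdots & & \vdots \\ a_{1n} & \cdots & a_{nn} \end{vmatrix},$$
and then form the product M'D by the theorem of § 9. This gives
$$M'D = \begin{vmatrix}
D & \cdots & 0 & 0 & \cdots & 0 \\
\vdots & \ddots & \vdots & \vdots & & \vdots \\
0 & \cdots & D & 0 & \cdots & 0 \\
a_{1,m+1} & \cdots & a_{m,m+1} & a_{m+1,m+1} & \cdots & a_{n,m+1} \\
\vdots & & \vdots & \vdots & & \vdots \\
a_{1n} & \cdots & a_{mn} & a_{m+1,n} & \cdots & a_{nn}
\end{vmatrix}
= D^m \begin{vmatrix}
a_{m+1,m+1} & \cdots & a_{n,m+1} \\
\vdots & & \vdots \\
a_{m+1,n} & \cdots & a_{nn}
\end{vmatrix}.$$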
Let us here regard $a_{11}, \ldots, a_{nn}$ as $n^2$ independent variables.
Then the equation just written becomes an identity, from which D,
since it is not identically zero, may be cancelled out, and we get
$$M' = D^{m-1} \begin{vmatrix}
a_{m+1,m+1} & \cdots & a_{n,m+1} \\
\vdots & & \vdots \\
a_{m+1,n} & \cdots & a_{nn}
\end{vmatrix}. \tag{1}$$
Since the determinant which is written out in (1) is precisely the
algebraic complement of M, our theorem is proved in the special
case we have been considering. It should be noticed that this proof
holds even in the case m = n; cf. Corollary 2 below.
Turning now to the case in which the minors M and M' do not
lie at the upper left-hand corners of D and D', let us denote by $\mu$ the
sum of the numbers which specify the location of the rows and
columns in M or M', the numbering running, as usual, from the
upper left-hand corner. Then by Definition 5, § 7,
$$\text{alg. compl. of } M = (-1)^{\mu}\,[\text{compl. of } M]. \tag{2}$$
Let us now, by shifting rows and columns, bring the determinant
M into the upper left-hand corner of D. Calling the determinant
D, as thus rearranged, $D_1$, we have (cf. Theorem 1, § 8)
$$D_1 = (-1)^{\mu} D. \tag{3}$$
The cofactors in $D_1$ are equal to $(-1)^{\mu} A_{ij}$, since the interchange of
two adjacent rows or columns of a determinant changes the sign of
every one of its cofactors. Accordingly the adjoint of $D_1$, which we
will call $D_1'$, may be obtained from D' by rearranging its rows and
columns in the same way as the rows and columns of D were
rearranged to give $D_1$ and then prefixing the factor $(-1)^{\mu}$ to each
element.
Let us now apply the special case already established of our
theorem to the determinant $D_1$ and its adjoint $D_1'$, the m-rowed
minors $M_1$ and $M_1'$ being those which are situated in the upper
left-hand corner of $D_1$ and $D_1'$ respectively. We thus get
$$M_1' = D_1^{m-1}\,[\text{alg. compl. of } M_1]. \tag{4}$$
Now, since $M_1$ is a principal minor, its algebraic complement is
the same as its ordinary complement, and this in turn is the same as
the ordinary complement of the minor M in D. Accordingly, using
(2), we may write
$$\text{alg. compl. of } M_1 = (-1)^{\mu}\,[\text{alg. compl. of } M]. \tag{5}$$
Since the elements of $M_1'$ differ from those of M' only in having
the factor $(-1)^{\mu}$ prefixed to each, it follows that
$$M_1' = (-1)^{m\mu} M'. \tag{6}$$
We may now reduce (4) by means of (3), (5), and (6). We thus
get
$$(-1)^{m\mu} M' = (-1)^{m\mu - \mu} D^{m-1} (-1)^{\mu}\,[\text{alg. compl. of } M].$$
Cancelling out the factor $(-1)^{m\mu}$ from both sides of this equation,
we see that our theorem is proved.
We proceed now to point out a number of special cases of this
theorem which are worth noting on account of their frequent
occurrence.
Corollary 1. If $a_{ij}$ is any element of a determinant D of the
nth order, and if $\alpha_{ij}$ is the cofactor of the corresponding element $A_{ij}$ in
the adjoint of D, then
$$\alpha_{ij} = D^{n-2} a_{ij}.$$
This is merely the special case of our general theorem in which
m = n - 1, modified, however, slightly in statement by the use of
the cofactor $\alpha_{ij}$ in place of the (n - 1)-rowed minor $(-1)^{i+j}\alpha_{ij}$.
Corollary 2. If D is any determinant of the nth order and D'
its adjoint, then
$$D' = D^{n-1}.$$
This is the special case m = n.
Corollary 3. If D is any determinant, and S is the second
minor obtained from it by striking out its ith and kth rows and its jth
and lth columns, and if we denote by $A_{ij}$ the cofactor of the element
which stands in the ith row and the jth column of D, then
$$\begin{vmatrix} A_{ij} & A_{il} \\ A_{kj} & A_{kl} \end{vmatrix} = (-1)^{i+j+k+l} D S.$$
This is the special case m = 2.
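The corollaries lend themselves to a numerical check. The following sketch is an illustration under stated assumptions: the three-rowed determinant `D` is an arbitrary example, and `det`, `cofactor`, and `adjoint` are hypothetical helper names. The adjoint is formed as in the definition above, with the cofactor A_ij standing in row i and column j.

```python
def det(m):
    """Determinant by expansion along the first row (adequate for small orders)."""
    if not m:
        return 1
    total = 0
    for j, entry in enumerate(m[0]):
        if entry == 0:
            continue
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def cofactor(m, i, j):
    """Signed minor obtained by striking out row i and column j (0-based)."""
    minor = [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]
    return (-1) ** (i + j) * det(minor)


def adjoint(m):
    """Array of cofactors, A_ij standing in row i and column j."""
    n = len(m)
    return [[cofactor(m, i, j) for j in range(n)] for i in range(n)]


D = [[2, 0, 1], [1, 3, -1], [0, 5, 4]]
n = len(D)
Dp = adjoint(D)

# Corollary 2: D' = D^(n-1).
print(det(Dp), det(D) ** (n - 1))

# Corollary 1: the cofactor of A_ij in D' equals D^(n-2) * a_ij.
print(all(cofactor(Dp, i, j) == det(D) ** (n - 2) * D[i][j]
          for i in range(n) for j in range(n)))
```

For this particular `D` the determinant is 39, so the adjoint has the value 39 squared.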
CHAPTER III
THE THEORY OF LINEAR DEPENDENCE
12. Definitions and Preliminary Theorems. Two sets of constants
$(a_1, b_1, c_1, d_1)$ and $(a_2, b_2, c_2, d_2)$ are usually said to be
proportional to one another if every element of one set may be obtained
from the corresponding element of the other by multiplying by the
same constant factor. For example, (1, 2, 3, 4) and (2, 4, 6, 8) are
proportional. It is ordinarily assumed that either set may be thus
obtained from the other, and in most cases this is true; but in the
case of the two sets (1, 2, 3, 4) and (0, 0, 0, 0) we can pass from
the first to the second by multiplying by 0, but we cannot pass from
the second to the first.
A more convenient definition, for many purposes, and one which
is easily seen to be equivalent to the above-mentioned one, is the
following:
Definition 1. The two sets of constants
$$x_1', x_2', \ldots, x_n',$$
$$x_1'', x_2'', \ldots, x_n''$$
are said to be proportional to each other if two constants $c_1$ and $c_2$, not
both zero, exist such that
$$c_1 x_i' + c_2 x_i'' = 0 \qquad (i = 1, 2, \ldots, n).$$
If $c_1 \neq 0$, we have
$$x_i' = -\frac{c_2}{c_1}\, x_i'',$$
and if $c_2 \neq 0$, we have
$$x_i'' = -\frac{c_1}{c_2}\, x_i'.$$
The two sets of constants
$$x_1, x_2, \ldots, x_n,$$
$$0, 0, \ldots, 0,$$
are evidently proportional, since if we take $c_1 = 0$ and $c_2$ = any
constant not zero, we have a pair of c's which fulfill the requirements
of our definition.
Linear dependence may be regarded as a generalization of the
conception of proportionality. Instead of two sets of constants we
now consider m sets, and give the following:
Definition 2. The m sets of n constants each,
$$x_1^{[i]}, x_2^{[i]}, \ldots, x_n^{[i]} \qquad (i = 1, 2, \ldots, m),$$
are said to be linearly dependent if m constants $c_1, c_2, \ldots, c_m$, not all zero,
exist such that
$$c_1 x_j' + c_2 x_j'' + \cdots + c_m x_j^{[m]} = 0 \qquad (j = 1, 2, \ldots, n).$$
If this is not the case, the sets of quantities are said to be linearly
independent.
In the same way we generalize the familiar conception of the
proportionality of two polynomials as follows:
Definition 3. The m polynomials (in any number of independent
variables) $f_1, f_2, \ldots, f_m$ are said to be linearly dependent if m constants
$c_1, c_2, \ldots, c_m$, not all zero, exist such that
$$c_1 f_1 + c_2 f_2 + \cdots + c_m f_m \equiv 0.$$
If this is not the case, the polynomials are said to be linearly
independent.*
The following theorems about linear dependence, while almost
self-evident, are of sufficient importance to deserve explicit
statement:
Theorem 1. If m sets of constants (or if m polynomials) are
linearly dependent, it is always possible to express one (but not necessarily
any one) of them linearly in terms of the others. This set of constants
(or this polynomial) is then said to be linearly dependent on the others.
This is seen at once if we remember that at least one of the c's
is not zero. The relations (or relation) in which the c's occur can,
then, be divided through by this c.
* We might clearly go farther and consider the linear dependence of m sets of n
polynomials each. The two cases of the text would be merely special cases from this
general point of view.
Theorem 2. If there exist among the sets of constants (or among the
polynomials) a smaller number of sets (or of polynomials) which are linearly
dependent, then the m sets (or the m polynomials) are linearly dependent.
For suppose there are l sets of constants (or l polynomials)
which are linearly dependent (l < m), then we may take for our set
of m c's, the l c's which must exist for the l sets (or polynomials) and
(m - l) zeros.
Theorem 3. If any one of the m sets of constants consists
exclusively of zeros (or if any one of the polynomials is identically zero), the
m sets (or the m polynomials) are linearly dependent.
For we may take for the c corresponding to this particular set
(or polynomial) any constant whatever, except zero, and for the other
(m - 1) c's, (m - 1) zeros.
13. The Condition for Linear Dependence of Sets of Constants.
In considering m sets of n constants each,
$$x_1^{[i]}, x_2^{[i]}, \ldots, x_n^{[i]} \qquad (i = 1, 2, \ldots, m), \tag{1}$$
it will be convenient to distinguish between the two cases $m \leq n$ and
$m > n$.
(a) $m \leq n$. We wish here to prove the following fundamental
theorem:
Theorem 1. A necessary and sufficient condition for the linear
dependence of the m sets (1) of n constants each, when $m \leq n$, is that all
the m-rowed determinants of the matrix
$$\begin{Vmatrix}
x_1' & x_2' & \cdots & x_n' \\
x_1'' & x_2'' & \cdots & x_n'' \\
\vdots & \vdots & & \vdots \\
x_1^{[m]} & x_2^{[m]} & \cdots & x_n^{[m]}
\end{Vmatrix}$$
should vanish.
That this is a necessary condition is at once obvious; for if the
m sets of constants are linearly dependent, one of the rows can be
expressed as a linear combination of the others. Accordingly if in
any of the m-rowed determinants we subtract from the elements of
this row the corresponding elements of the other rows after each row
has been multiplied by a suitable constant, the elements of this row
will reduce to zero. The determinant therefore vanishes.
We come now to the proof that the vanishing of these
determinants is also a sufficient condition. We assume, therefore, that
all the m-rowed determinants of the above matrix vanish. Let us
also assume that the rank of the matrix is r > 0* (cf. Definition 3,
§ 7). Without any real loss of generality we may (and will) assume
that the r-rowed determinant which stands in the upper left-hand corner
of the matrix does not vanish; for by changing the order of the sets
of constants and the order of the constants in each set (and these
orders are clearly quite immaterial) we can bring one of the
non-vanishing r-rowed determinants into this position.
We will now prove that the first (r + 1) sets of constants are
linearly dependent. From this the linear dependence of the m sets
follows by Theorem 2, § 12.
Let us denote by $c_1, c_2, \ldots, c_{r+1}$ the cofactors in the (r + 1)-rowed
determinant which stands in the upper left-hand corner of the matrix,
and which correspond to the elements of its last column. If we
remember that all the (r + 1)-rowed determinants vanish, we get the relations
$$c_1 x_j' + c_2 x_j'' + \cdots + c_{r+1} x_j^{[r+1]} = 0 \qquad (j = r+1, r+2, \ldots, n).$$
Since the sum of the products of the elements of any column of a
determinant by the cofactors of the corresponding elements of another
column is zero, this equation is also true when j = 1, 2, ..., r.
This establishes the linear dependence of the first (r + 1) sets of
constants, since $c_{r+1}$, being the r-rowed determinant which stands
in the upper left-hand corner of the matrix, is not zero.
(b) $m > n$. This case can be reduced to the one already considered
by the following simple device. Add to each set of n constants m - n
zeros. We then have m sets of m constants each. Their matrix
contains only one m-rowed determinant, and this vanishes since one, at
least, of its columns is composed of zeros. Therefore these m sets of m
constants each are linearly dependent; and hence the original m sets
of n constants each were linearly dependent. Thus we get the theorem:
Theorem 2. m sets of n constants each are always linearly
dependent if $m > n$.
* In general we shall have r = m - 1, but r may have any value less than m. The
only case which we here exclude is that in which all the elements of the matrix are
zero, a case in which the linear dependence is at once obvious.
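The two theorems of this section amount to a test that is easy to mechanize. The sketch below rests on an assumption worth stating: that all m-rowed determinants of the matrix vanish exactly when its rank, in the sense of § 7, is less than m. The rank is found here by elimination over the rationals rather than by evaluating determinants, and `rank` and `linearly_dependent` are illustrative names.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows), by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        # Look for a row at or below position r with a non-zero entry in column c.
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def linearly_dependent(sets):
    """m sets of n constants are dependent when the rank of their matrix is below m."""
    return rank(sets) < len(sets)


print(linearly_dependent([(1, 2, 3, 4), (2, 4, 6, 8)]))            # proportional sets
print(linearly_dependent([(1, 2, 3, 4), (0, 0, 0, 0)]))            # one set of zeros
print(linearly_dependent([(1, 0, 0), (0, 1, 0), (1, 1, 1)]))       # independent sets
print(linearly_dependent([(1, 0, 0), (0, 1, 0), (0, 0, 1), (2, 3, 5)]))  # m > n
```

The last call illustrates Theorem 2: four sets of three constants each cannot have rank four.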
EXERCISES
Determine whether the following sets of constants are linearly
dependent or not :
S.
3 a,
2 J,
3c,
6d,
a,
0,
 c.
id,
0,
i.
0,
Zd.
1, 0, 0, 5,
1, 2, 6, 7,
3, 1, 3, 16.
5, 2, 1, 3, 4,
0, 3, 0, 0, 8,
15, 7, 3, 9, 7.
5,
7,
0,
1, 1,
1,
3,
2,
3, 1,
I 4, 0, 7, 9, 2.
14. The Linear Dependence of Polynomials. Suppose we have m
polynomials, $f_1, f_2, \ldots, f_m$,
in any number of independent variables. A necessary and sufficient
condition for the linear dependence of these polynomials is evidently
the linear dependence of their m sets of coefficients. Thus the
conditions deduced in the last section can be applied at once to the case of
polynomials.
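The reduction to sets of coefficients can be sketched as follows. The representation is an assumption made for illustration: each polynomial is a dictionary from a monomial label (an arbitrary string such as "x" or "x^2") to its coefficient, and the polynomials shown are examples, not the exercises of the text.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows), by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def coefficient_rows(polys):
    """One row of coefficients per polynomial, over the union of all monomials."""
    monomials = sorted({mono for p in polys for mono in p})
    return [[p.get(mono, 0) for mono in monomials] for p in polys]


def polynomials_dependent(polys):
    return rank(coefficient_rows(polys)) < len(polys)


f1 = {"x": 1, "y": 1}     # x + y
f2 = {"x": 1, "y": -1}    # x - y
f3 = {"x": 3, "y": 1}     # 3x + y = 2(x + y) + (x - y)
g = {"x^2": 1, "1": -1}   # x^2 - 1

print(polynomials_dependent([f1, f2, f3]))
print(polynomials_dependent([f1, f2, g]))
```

A monomial absent from a polynomial simply contributes a zero to that polynomial's set of coefficients.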
EXERCISES
Determine whether the following polynomials are linearly dependent
or not:
r Id a; + 30 s,
1. J 6a; + 2y + 53 4,
[ 15 a; + 9y 18.
2.
3.
3 Sj + 4 ij — 4 a;, + 6 a;^,
7a;i +Zx^ + 'J x^,
5 Xj + 9 Xj — Xj + 4 X4 + 8.
2a;2+ 8a;y+ 6y^ ■\\i:X + 12y 
7i» + 3/2+ 6 J. _ 4y^
33;" 6x3/+ 8y2 5x +
7,
Ba" + 20 xy + 15 3/2 + 35 x + 30 3/  10.
15. Geometric Illustrations. The sets of n constants with which
we had to deal in §§ 12, 13 may, provided that not all the constants
in any one set are zero, advantageously be regarded as the
homogeneous coordinates of points in space of n - 1 dimensions. It will
then be convenient to speak of the linear dependence or
independence of these points. The geometric meaning of linear dependence
will be at once evident from the following theorems for the
case n = 4.
Two points will here be represented by two sets of four constants
each,
$$x_1, y_1, z_1, t_1,$$
$$x_2, y_2, z_2, t_2,$$
which will be linearly dependent when, and only when, they are
proportional, that is, when the points coincide. Hence:
Theorem 1. Two points are linearly dependent when, and only
when, they coincide.
If we have three points in space, $P_1, P_2, P_3$, whose coordinates
are $(x_1, y_1, z_1, t_1)$, $(x_2, y_2, z_2, t_2)$, $(x_3, y_3, z_3, t_3)$, respectively, and
which are linearly dependent, there must exist three constants $c_1, c_2,
c_3$, not all zero, such that
$$\begin{aligned}
c_1 x_1 + c_2 x_2 + c_3 x_3 &= 0, \\
c_1 y_1 + c_2 y_2 + c_3 y_3 &= 0, \\
c_1 z_1 + c_2 z_2 + c_3 z_3 &= 0, \\
c_1 t_1 + c_2 t_2 + c_3 t_3 &= 0.
\end{aligned}$$
Let us suppose the order of the points to be so taken that $c_3 \neq 0$, and
solve for $x_3, y_3, z_3, t_3$:
$$\begin{aligned}
x_3 &= k_1 x_1 + k_2 x_2, \\
y_3 &= k_1 y_1 + k_2 y_2, \\
z_3 &= k_1 z_1 + k_2 z_2, \\
t_3 &= k_1 t_1 + k_2 t_2,
\end{aligned} \tag{1}$$
where $k_1 = -c_1/c_3$, $k_2 = -c_2/c_3$. Now if
$$Ax + By + Cz + Dt = 0$$
is the equation of any plane through the points $P_1$ and $P_2$, we have
$$Ax_1 + By_1 + Cz_1 + Dt_1 = 0,$$
$$Ax_2 + By_2 + Cz_2 + Dt_2 = 0.$$
Multiplying the first of these equations by $k_1$, the second by $k_2$, and
adding, we have, by means of the equations (1),
$$Ax_3 + By_3 + Cz_3 + Dt_3 = 0.$$
Hence every plane through $P_1$ and $P_2$ passes through $P_3$ also, and
the three points are collinear.
Now, in order to prove conversely that any three collinear points
are linearly dependent, let us suppose the three points $P_1, P_2, P_3$
collinear. We may assume that these three points are distinct, as
otherwise their linear dependence would follow from Theorem 1.
We have seen that when three points are linearly dependent, the line
through two of them contains the third. Hence if we let
$$\begin{aligned}
x' &= k_1 x_1 + k_2 x_2, \\
y' &= k_1 y_1 + k_2 y_2, \\
z' &= k_1 z_1 + k_2 z_2, \\
t' &= k_1 t_1 + k_2 t_2,
\end{aligned}$$
where $k_1$ and $k_2$ are two constants, not both zero, the point (x', y',
z', t') or P' lies on the line $P_1P_2$, and our theorem will be established
if we can show that the constants $k_1$ and $k_2$ can be so chosen that
the points P' and $P_3$ coincide. Now let $ax + by + cz + dt = 0$ be the
equation of any plane through the point $P_3$ but not through $P_1$ or $P_2$.
Thus $P_3$ is determined as the intersection of this plane with the line
$P_1P_2$, so that if P', which we know lies on $P_1P_2$, can be made to lie
in this plane, it must coincide with $P_3$ and the proof is complete.
The condition for P' to lie in this plane is $ax' + by' + cz' + dt' = 0$.
Substituting for x', y', z', t' their values given above, we have
$$k_1(ax_1 + by_1 + cz_1 + dt_1) + k_2(ax_2 + by_2 + cz_2 + dt_2) = 0.$$
But neither of these parentheses is zero, since the plane does not pass
through $P_1$ or $P_2$, hence we may give to $k_1$ and $k_2$ values different
from zero for which this equation is satisfied. We have thus proved
Theorem 2. Three points are linearly dependent when, and only
when, they are collinear.
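Theorem 2 can be combined with the criterion of § 13: three points are collinear exactly when every three-rowed determinant of their 3 x 4 matrix of homogeneous coordinates vanishes. The sketch below is an illustration; the helper names and the sample points are assumptions, not taken from the text.

```python
from itertools import combinations


def det(m):
    """Determinant by expansion along the first row (adequate for small orders)."""
    if not m:
        return 1
    total = 0
    for j, entry in enumerate(m[0]):
        if entry == 0:
            continue
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def all_minors_vanish(points):
    """True when every m-rowed determinant of the m x n matrix of
    coordinates is zero (the condition of Theorem 1, § 13, for m <= n)."""
    m, n = len(points), len(points[0])
    return all(det([[p[c] for c in cols] for p in points]) == 0
               for cols in combinations(range(n), m))


# Homogeneous coordinates (x, y, z, t).
on_a_line = [(0, 0, 0, 1), (1, 1, 1, 1), (2, 2, 2, 1)]
triangle = [(1, 0, 0, 1), (0, 1, 0, 1), (0, 0, 1, 1)]

print(all_minors_vanish(on_a_line))
print(all_minors_vanish(triangle))
```

The first three points satisfy P3 = 2 P2 - P1, so they are linearly dependent; the second three are the vertices of a triangle.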
The proofs of the following theorems are left to be supplied by
the reader. It will be found that some of them are readily proved
from the definition of linear dependence, as above, while for others
it is more convenient to use the condition for linear dependence
obtained in § 13.
Theorem 3. Four points are linearly dependent when, and only
when, they are complanar.
Theorem 4. Five or more points are always linearly dependent.
Another geometric application is suggested by the following
considerations:
A set of n ordinary* quantities is nothing more nor less than a
complex quantity with n components (cf. § 21). Our first definition of
linear dependence is therefore precisely equivalent to the following:
The m complex quantities
$$\alpha_1, \alpha_2, \ldots, \alpha_m$$
are said to be linearly dependent if m ordinary quantities $c_1, c_2, \ldots, c_m$,
not all zero, exist such that
$$c_1\alpha_1 + c_2\alpha_2 + \cdots + c_m\alpha_m = 0.$$
Now the simplest geometric interpretation for a complex quantity
with n components is as a vector in space of n dimensions,† and we
are thus led to the conception of linear dependence of vectors. The
geometric meaning of this linear dependence will be seen from the
following theorems for the case n = 3:
Theorem 5. Two vectors are linearly dependent when, and only
when, they are collinear.
Theorem 6. Three vectors are linearly dependent when, and only
when, they are complanar.
Theorem 7. Four or more vectors are always linearly dependent.
In order to get a geometric interpretation of the linear dependence
of polynomials, we must consider, not the polynomials themselves,
but the equations obtained by equating them to zero. We speak of
these equations as being linearly dependent if the polynomials are
* Two different standpoints are here possible according as we understand the term
ordinary quantity to mean real quantity, or ordinary complex quantity.
† There are of course other possible geometric interpretations. Thus in the case
n = 4 we may regard our complex quantities as quaternions, and consider the meaning
of linear dependence of two, three, or four quaternions.
linearly dependent. If then we regard the independent variables as
rectangular coordinates, these equations give us geometric loci in
space of as many dimensions as there are independent variables.
Thus, in the cases of two and three variables, we have plane curves
and surfaces respectively. The case of two loci is of no interest,
as they must coincide in order to be linearly dependent. In the case
of three linearly dependent loci it is easily shown that any one must
meet the other two in all their common points and in no others.
The following theorems will serve to illustrate the geometric
meaning of linear dependence:
(1) In the plane :
Theorem 8. Three circles are linearly dependent when, and only
when, they belong to the same coaxial family.
Theorem 9. Four circles are linearly dependent when, and only
when, they have a (real or imaginary) common orthogonal circle.
Theorem 10. Four circles are linearly dependent when, and only
when, the points of intersection of the first and second, and the points of
intersection of the third and fourth, lie on a common circle.
Theorem 11. Five or more circles are always linearly dependent.
(2) In space (using homogeneous coordinates) :
Theorem 12. Three planes are linearly dependent when, and only
when, they intersect in a line.
Theorem 13. Four planes are linearly dependent when, and only
when, they intersect in a point.
Theorem 14. Five or more planes are always linearly dependent.
CHAPTER IV
LINEAR EQUATIONS
16. Nonhomogeneous Linear Equations. In every elementary
treatment of determinants, however brief, it is explained how to
solve by determinants a system of n equations of the first degree in
n unknowns, provided that the determinant of the coefficients of
the unknowns is not zero. Cramer's Rule, by which this is done,
is this :
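Cramer's Rule. If in the equations
$$\begin{aligned}
a_{11}x_1 + \cdots + a_{1n}x_n &= k_1, \\
\cdots\cdots\cdots\cdots\cdots\cdots & \\
a_{n1}x_1 + \cdots + a_{nn}x_n &= k_n,
\end{aligned}$$
the determinant
$$a = \begin{vmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{vmatrix}$$
is not zero, the equations have one and only one solution, namely:
$$x_1 = \frac{a_1}{a}, \quad x_2 = \frac{a_2}{a}, \quad \ldots, \quad x_n = \frac{a_n}{a},$$
where $a_i$ is the n-rowed determinant obtained from a by replacing the
elements of the ith column by the elements $k_1, k_2, \ldots, k_n$.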
This rule, whose proof we assume to be known,* is of fundamental
importance in the general theory of linear equations to
which we now proceed.
* The proof as given in most English and American textbooks merely establishes
the fact that if the equations have a solution it is given by Cramer's formulae. That
these formulae really satisfy the equations in all cases is not commonly proved, but
may be easily established by direct substitution. We leave it for the reader to do this.
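Cramer's Rule, as stated above, translates directly into a short routine. This is a sketch for small systems with integer coefficients; `det` and `cramer` are illustrative names, and the numerical system is an arbitrary example. The final line carries out the direct substitution mentioned in the footnote.

```python
from fractions import Fraction


def det(m):
    """Determinant by expansion along the first row (adequate for small orders)."""
    if not m:
        return 1
    total = 0
    for j, entry in enumerate(m[0]):
        if entry == 0:
            continue
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def cramer(a, k):
    """Solve a x = k for an n-rowed matrix `a` (integer entries) with non-zero determinant."""
    d = det(a)
    if d == 0:
        raise ValueError("the determinant of the coefficients is zero")
    solution = []
    for i in range(len(a)):
        # a_i: replace the ith column of a by the constants k_1, ..., k_n.
        a_i = [row[:i] + [k[r]] + row[i + 1:] for r, row in enumerate(a)]
        solution.append(Fraction(det(a_i), d))
    return solution


a = [[2, 1], [1, -1]]   # 2x + y = 5,  x - y = 1
k = [5, 1]
x = cramer(a, k)
print(x)

# Direct substitution: each equation is satisfied by the values found.
print(all(sum(c * v for c, v in zip(row, x)) == rhs for row, rhs in zip(a, k)))
```

Exact rational arithmetic is used so that the substitution check is not disturbed by rounding.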
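Consider the system of m linear equations in n variables:
$$\begin{aligned}
a_{11}x_1 + \cdots + a_{1n}x_n + b_1 &= 0, \\
\cdots\cdots\cdots\cdots\cdots\cdots\cdots & \\
a_{m1}x_1 + \cdots + a_{mn}x_n + b_m &= 0,
\end{aligned}$$
where m and n may be any positive integers. Three cases arise:
(1) The equations may have no solution, in which case they are
said to be inconsistent.
(2) They may have just one solution.
(3) They may have more than one solution, in which case it will
presently appear that they necessarily have an infinite number of solutions.
Let us consider the two matrices:
$$\mathbf{a} = \begin{Vmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{Vmatrix}, \qquad
\mathbf{b} = \begin{Vmatrix} a_{11} & \cdots & a_{1n} & b_1 \\ \vdots & & \vdots & \vdots \\ a_{m1} & \cdots & a_{mn} & b_m \end{Vmatrix}.$$
We will call a the matrix of the system of equations, b the
augmented matrix.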
It is evident that the rank of the matrix a cannot be greater
than that of the matrix b, since every determinant contained in a
is also contained in b. We have, then, two cases:
I. Rank of a = Rank of b.
II. Rank of a < Rank of b.
We will consider Case II first.
Let r be the rank of b. Then b must contain at least one
r-rowed determinant which is not zero. Moreover, this determinant
must contain a column of b's, since otherwise it would be contained
in a also, which is contrary to our hypothesis. Suppose for
definiteness that this non-vanishing r-rowed determinant is the one
situated in the upper right-hand corner of b. There is no loss of
generality in assuming this, since by writing the equations in a
different order and changing the order of the variables $x_1, \ldots, x_n$ we
can always bring the determinant into this position. Now for
brevity let us represent the polynomials forming the first members
of our given equations by $F_1, F_2, \ldots, F_m$ respectively, and the
homogeneous polynomials obtained by omitting the constant terms
in each of these equations by $f_1, f_2, \ldots, f_m$. Then we have the
identities:
$$F_i \equiv f_i + b_i \qquad (i = 1, 2, \ldots, m).$$
Consider the first r of these identities. Since the rank of a is
less than r, the polynomials $f_1, f_2, \ldots, f_r$ are linearly dependent,
$$c_1 f_1 + c_2 f_2 + \cdots + c_r f_r \equiv 0,$$
hence
$$c_1 F_1 + \cdots + c_r F_r \equiv c_1 b_1 + \cdots + c_r b_r = C.$$
But since the rank of b is r, the polynomials $F_1, \ldots, F_r$ are linearly
independent and therefore $C \neq 0$. Hence the given equations are
inconsistent, for if they were consistent all the F's would be zero for
some suitably chosen values of $x_1, \ldots, x_n$, and if we substitute these
values in the last written identity we should have
$$0 = C.$$
Let us now consider Case I. Let r be the common rank of a
and b, then there is at least one r-rowed determinant in a which
is not zero. This same determinant also occurs in b. Suppose it to
be situated in the upper left-hand corner of each matrix. Since all
(r + 1)-rowed determinants of either matrix are zero, the first (r + 1)
of the F's are linearly dependent, and we have
$$c_1 F_1 + c_2 F_2 + \cdots + c_r F_r + c_{r+1} F_{r+1} \equiv 0;$$
and, since $F_1, \ldots, F_r$ are linearly independent, $c_{r+1}$ cannot be zero;
hence we may divide through by it and express $F_{r+1}$ linearly in
terms of $F_1, \ldots, F_r$. The same argument holds if instead of $F_{r+1}$ we
take $F_{r+2}$ or any other one of the remaining F's. Hence
$$F_{r+l} \equiv k_1^{[l]} F_1 + \cdots + k_r^{[l]} F_r \qquad (l = 1, 2, \ldots, m - r).$$
From these identities it is obvious that at any point $(x_1, \ldots, x_n)$ where
$F_1, \ldots, F_r$ all vanish, the remaining F's also vanish. In other words,
any solution which the first r equations of the given system may
have is necessarily a solution of the whole system.
Now consider the first r of the given equations. Assign to
$x_{r+1}, \ldots, x_n$ any fixed values $x_{r+1}', \ldots, x_n'$ and transpose all the terms
after the rth in each equation to the second member,
$$\begin{aligned}
a_{11}x_1 + \cdots + a_{1r}x_r &= -a_{1,r+1}x_{r+1}' - \cdots - a_{1n}x_n' - b_1, \\
\cdots\cdots\cdots\cdots\cdots\cdots & \\
a_{r1}x_1 + \cdots + a_{rr}x_r &= -a_{r,r+1}x_{r+1}' - \cdots - a_{rn}x_n' - b_r.
\end{aligned}$$
Remembering that the righthand sides of these equations are known
constants, and that the determinant of the coefficients on the left is
not zero, we see that we have the case to which Cramer's Rule
applies, and that this system of equations has therefore just one
solution. Hence the given system of equations is consistent, and we
have the theorem:
Theorem 1. A necessary and sufficient condition for a system of
linear equations to be consistent is that the matrix of the system
have the same rank as the augmented matrix.
From the foregoing considerations we have also
Theorem 2. If in a system of linear equations the matrix of
the system and the augmented matrix have the same rank r, the values
of n - r of the unknowns may be assigned at pleasure and the others
will then be uniquely determined.
The n - r unknowns whose values may be assigned at pleasure may
be chosen in any way provided that the matrix of the coefficients of
the remaining unknowns is of rank r.
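Theorems 1 and 2 suggest a simple classification of a system by two ranks. In the sketch below the equations are taken in the form of the text, with the constant term on the left; `rank` and `classify` are illustrative names, the rank is computed by elimination rather than by determinants, and the systems shown are arbitrary examples.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows), by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def classify(a, b):
    """For equations a[i][0]*x_1 + ... + a[i][n-1]*x_n + b[i] = 0, return
    'inconsistent', or the pair (r, n - r): the common rank and the number
    of unknowns that may be assigned at pleasure."""
    augmented = [list(row) + [bi] for row, bi in zip(a, b)]
    ra, rb = rank(a), rank(augmented)
    if ra < rb:
        return "inconsistent"
    return ra, len(a[0]) - ra


print(classify([[1, 1], [2, 2]], [-2, -4]))    # x + y - 2 = 0 taken twice over
print(classify([[1, 1], [1, 1]], [-2, -3]))    # x + y - 2 = 0 and x + y - 3 = 0
print(classify([[2, 1], [1, -1]], [-5, -1]))   # a single solution
```

When the second component of the returned pair is zero the system has just one solution, which is case (2) of the enumeration above.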
EXERCISES
Solve completely the following systems of equations:
1.
$2x - y + 3z - 1 = 0,$
$4x - 2y - z + 3 = 0,$
$2x - y - 4z + 4 = 0,$
$10x - 5y - 6z + 10 = 0.$
2.
$4x - y + z + 5 = 0,$
$2x - 3y + 5z + 1 = 0,$
$x + y - 2z + 2 = 0,$
$5x - z + 2 = 0.$
3.
$2x - 3y + 4z - w = 3,$
$x + 2y - z + 2w = 1,$
$3x - y + 2z - 3w = 4,$
$3x - y + z - 7w = 4.$
17. Homogeneous Linear Equations. We will now consider the
special case where the equations of the last section are homogeneous,
i.e. where all the b's are zero,
$$\begin{aligned}
a_{11}x_1 + \cdots + a_{1n}x_n &= 0, \\
\cdots\cdots\cdots\cdots\cdots\cdots & \\
a_{m1}x_1 + \cdots + a_{mn}x_n &= 0.
\end{aligned}$$
The matrices a and b of the last section differ here only by a column
of zeros; hence they always have the same rank and this is called
the rank of the system of equations. Theorems 1 and 2 of the last
section become
Theorem 1. A system of homogeneous linear equations always
has one or more solutions.
Theorem 2. If the rank of a system of homogeneous linear
equations in n variables is r, the values of n - r of the unknowns may be
assigned at pleasure and the others will then be uniquely determined.*
If the rank of the equations is n, there will therefore be only one
solution, and this solution is obviously $x_1 = x_2 = \cdots = x_n = 0$. Since
the rank can never be greater than n, we have
Theorem 3. A necessary and sufficient condition for a system of
homogeneous linear equations in the n variables $(x_1, \ldots, x_n)$ to have a
solution other than $x_1 = x_2 = \cdots = x_n = 0$ is that their rank be less than n.
Corollary 1. If there are fewer equations than unknowns, the
equations always have solutions other than $x_1 = x_2 = \cdots = x_n = 0$.
Corollary 2. If the number of equations is equal to the number
of unknowns, a necessary and sufficient condition for solutions other than
$x_1 = x_2 = \cdots = x_n = 0$ is that the determinant of the coefficients be zero.
In the special case where the number of equations is just one less
than the number of unknowns and the equations are linearly
independent, we will prove the following:
Theorem 4. Every set of values of $x_1, \ldots, x_n$ which satisfies a
system of n - 1 linearly independent† homogeneous linear equations in
* Cf. also the closing lines of Theorem 2, § 16.
† The theorem is still true if the equations are linearly dependent, but it is then
trivial, since the determinants in question are all zero.
n unknowns is proportional to the set of (n - 1)-rowed determinants
taken alternately with plus and minus signs, and obtained by striking
out from the matrix of the coefficients first the first column, then the
second, etc.
Let us denote by $a_i$ the (n - 1)-rowed determinant obtained by
striking out the ith column from the matrix of the equations. Since
the equations are linearly independent, there must be at least one
of the determinants $a_1, a_2, \ldots, a_n$ which is not zero. Let it be $a_i$.
Now assign to $x_i$ any fixed value, c, and transpose the ith term of
each equation to the second member and we have
$$\begin{aligned}
a_{11}x_1 + \cdots + a_{1,i-1}x_{i-1} + a_{1,i+1}x_{i+1} + \cdots + a_{1n}x_n &= -a_{1i}c, \\
\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots & \\
a_{n-1,1}x_1 + \cdots + a_{n-1,i-1}x_{i-1} + a_{n-1,i+1}x_{i+1} + \cdots + a_{n-1,n}x_n &= -a_{n-1,i}c.
\end{aligned}$$
Hence:
$$x_k = \frac{(-1)^{k-i} a_k c}{a_i} \qquad (k = 1, 2, \ldots, n),$$
from which it is clear that $(x_1, \ldots, x_n)$ are proportional to the
determinants $(a_1, -a_2, a_3, \ldots, (-1)^{n-1}a_n)$, as was to be proved.
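Theorem 4 gives an explicit solution that is easy to compute. The following sketch forms the determinants a_1, -a_2, a_3, ... for a system of n - 1 equations in n unknowns; `det` and `alternating_minors` are illustrative names, and the coefficients are an arbitrary example.

```python
def det(m):
    """Determinant by expansion along the first row (adequate for small orders)."""
    if not m:
        return 1
    total = 0
    for j, entry in enumerate(m[0]):
        if entry == 0:
            continue
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def alternating_minors(rows):
    """For n - 1 equations in n unknowns, return (a_1, -a_2, a_3, ...), where
    a_i is the determinant left after striking out the ith column."""
    n = len(rows[0])
    if len(rows) != n - 1:
        raise ValueError("expected n - 1 equations in n unknowns")
    return [(-1) ** i * det([row[:i] + row[i + 1:] for row in rows])
            for i in range(n)]


# x + 2y + 3z = 0,  4x + 5y + 6z = 0
rows = [[1, 2, 3], [4, 5, 6]]
solution = alternating_minors(rows)
print(solution)

# Every equation is satisfied by this set of values.
print([sum(c * v for c, v in zip(row, solution)) for row in rows])
```

For three unknowns this is the familiar vector product of the two rows of coefficients.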
The theory of homogeneous linear equations has here been
deduced from the theory of linear dependence. It can, however, in
turn be used to obtain further results in this last-mentioned theory.
As an example of this we will deduce the following theorem, which
we shall find useful later:
Theorem 5. If a set of points $(x_1, \ldots, x_n)$, finite or infinite in
number, have the property that k points can be found among them upon
which every other point of the set is linearly dependent, then any k + 1
points of the set will be linearly dependent.
Let $(x_1', \ldots, x_n')$, $(x_1'', \ldots, x_n'')$, ..., $(x_1^{[k]}, \ldots, x_n^{[k]})$ be the k points upon
which every other point of the set is linearly dependent, and let
$$(X_1', \ldots, X_n'),\ (X_1'', \ldots, X_n''),\ \ldots,\ (X_1^{[k+1]}, \ldots, X_n^{[k+1]})$$
be any k + 1 points of the set. Then we may write
$$\begin{aligned}
X_1^{[i]} &= c_1^{[i]} x_1' + c_2^{[i]} x_1'' + \cdots + c_k^{[i]} x_1^{[k]}, \\
\cdots\cdots & \cdots\cdots\cdots\cdots\cdots\cdots\cdots \\
X_n^{[i]} &= c_1^{[i]} x_n' + c_2^{[i]} x_n'' + \cdots + c_k^{[i]} x_n^{[k]}
\end{aligned} \qquad (i = 1, 2, \ldots, k+1). \tag{1}$$
This is true by hypothesis if $(X_1^{[i]}, \ldots, X_n^{[i]})$ is not one of the first k
points, and if it is one of these points, it is obviously true. We
have then to prove that k + 1 constants, $C_1, C_2, \ldots, C_{k+1}$, not all zero,
can be found such that
$$C_1 X_j' + C_2 X_j'' + \cdots + C_{k+1} X_j^{[k+1]} = 0 \qquad (j = 1, 2, \ldots, n).$$
By substituting here the values of the X's from (1), we see that
these equations will be fulfilled if
$$\begin{aligned}
C_1 c_1' + C_2 c_1'' + \cdots + C_{k+1} c_1^{[k+1]} &= 0, \\
\cdots\cdots\cdots\cdots\cdots\cdots\cdots & \\
C_1 c_k' + C_2 c_k'' + \cdots + C_{k+1} c_k^{[k+1]} &= 0,
\end{aligned}$$
and this is a system of fewer equations than unknowns, which is
therefore satisfied by a set of C's not all zero. (Cf. Theorem 3,
Cor. 1.)
EXERCISES
Solve completely the following systems of equations:
1.
$11x + 8y - 2z + 3w = 0,$
$2x + 3y - z + 2w = 0,$
$7x - y + z - 3w = 0,$
$4x - 11y + 5z - 12w = 0.$
2.
$2x - 3y + 5z + 3w = 0,$
$4x - y + z + w = 0,$
$3x - 2y + 3z + 4w = 0.$
18. Fundamental Systems of Solutions of Homogeneous Linear
Equations. If $(x_1', \ldots, x_n')$ is a solution of the system of equations
$$\begin{aligned}
a_{11}x_1 + \cdots + a_{1n}x_n &= 0, \\
\cdots\cdots\cdots\cdots\cdots\cdots & \\
a_{m1}x_1 + \cdots + a_{mn}x_n &= 0,
\end{aligned} \tag{1}$$
then $(cx_1', \ldots, cx_n')$ is also a solution, and by giving to c different values
we get thus (except in the special case in which the x''s are all zero)
an infinite number of solutions. These may include all the solutions
of (1) (cf. Theorem 4 of the last section), but in general this will
not be the case.
Suppose, again, that $(x_1', \dots, x_n')$ and $(x_1'', \dots, x_n'')$ are two solutions of
(1); then $(c_1 x_1' + c_2 x_1'', \dots, c_1 x_n' + c_2 x_n'')$ is also a solution. If the two
given solutions are proportional to each other, this clearly gives us
nothing more than what we had above by starting from a single
solution; but if these two solutions are linearly independent, we
build up from them, by allowing $c_1$ and $c_2$ to take on all values, a
doubly infinite system of solutions; but even this system will
usually not include all the solutions of (1). Similarly we see
that, if we can find three linearly independent solutions, we can
build up from them a triply infinite system of solutions, etc. If,
proceeding in this way, we succeed in finding a finite number of
linearly independent solutions in terms of which all solutions can
be expressed, this finite number of solutions is said to form a
fundamental system.
Definition. If $(x_1^{[i]}, \dots, x_n^{[i]})$ (i = 1, 2, ... k) are a system of k
solutions of (1) which satisfy the following two conditions, they are said
to form a fundamental system:
(a) They shall be linearly independent.
(b) Every solution of (1) shall be expressible in the form
$$(c_1 x_1' + c_2 x_1'' + \cdots + c_k x_1^{[k]},\ \dots,\ c_1 x_n' + c_2 x_n'' + \cdots + c_k x_n^{[k]}).$$
Theorem 1. If the equations (1) are of rank r < n, they possess
an infinite number of fundamental systems each of which consists
of n − r solutions.
Suppose the r-rowed determinant which stands in the upper left-hand
corner of the matrix of the equations (1) does not vanish, and
let us consider the first r of these equations. Any solution of these
will be a solution of all the others. Transpose all terms after the
rth to the second members, and let $(x_{r+1}, \dots, x_n)$ have any fixed set
of values $(x_{r+1}', \dots, x_n')$, not all zero; then these r equations will have
just one solution given by Cramer's Rule. Call it $(x_1', \dots, x_n')$.
Now let $(x_{r+1}, \dots, x_n)$ have any other fixed set of values $(x_{r+1}'', \dots, x_n'')$,
not all zero, and we get another solution, $(x_1'', \dots, x_n'')$. Continue in
this way until we have n − r solutions
$$\begin{matrix} x_1', & \dots, & x_r', & x_{r+1}', & \dots, & x_n', \\ \vdots & & & & & \vdots \\ x_1^{[n-r]}, & \dots, & x_r^{[n-r]}, & x_{r+1}^{[n-r]}, & \dots, & x_n^{[n-r]}. \end{matrix}$$
If we have chosen these n − r sets of values for $(x_{r+1}, \dots, x_n)$ so that
the determinant
$$\text{(2)}\qquad \begin{vmatrix} x_{r+1}' & \cdots & x_n' \\ \vdots & & \vdots \\ x_{r+1}^{[n-r]} & \cdots & x_n^{[n-r]} \end{vmatrix}$$
is not zero (and this may clearly be done in an infinite variety of
ways), these n − r solutions will be linearly independent. That is
to say, we may thus obtain an infinite number of sets of n − r solutions
each, each of which satisfies condition (a) of our definition for
a fundamental system.
To prove that these sets of solutions also satisfy condition (b), let
us suppose that $(X_1, \dots, X_n)$ is any solution of the r equations we are
considering. The last n − r of these X's are linearly dependent on
the n − r sets of values we have chosen for $(x_{r+1}, \dots, x_n)$, since we have
here more sets of constants than there are elements in each set (cf.
Theorem 2, §13), and the determinant (2) is not zero. Thus
$$\text{(3)}\qquad X_i = c_1 x_i' + c_2 x_i'' + \cdots + c_{n-r} x_i^{[n-r]} \qquad (i = r+1, r+2, \dots, n).$$
Let us now solve the first r equations (1) by Cramer's Rule, regarding
$x_{r+1}, \dots, x_n$ as known. We thus get results of the form
$$x_j = A_j' x_{r+1} + A_j'' x_{r+2} + \cdots + A_j^{[n-r]} x_n \qquad (j = 1, 2, \dots, r).$$
By assigning special values here to $x_{r+1}, \dots, x_n$, we get
$$\text{(4)}\qquad \begin{cases} x_j' = A_j' x_{r+1}' + A_j'' x_{r+2}' + \cdots + A_j^{[n-r]} x_n', \\ \quad\vdots \\ x_j^{[n-r]} = A_j' x_{r+1}^{[n-r]} + A_j'' x_{r+2}^{[n-r]} + \cdots + A_j^{[n-r]} x_n^{[n-r]}, \\ X_j = A_j' X_{r+1} + A_j'' X_{r+2} + \cdots + A_j^{[n-r]} X_n. \end{cases} \qquad (j = 1, 2, \dots, r)$$
If we multiply the first n − r of these equations by $c_1, \dots, c_{n-r}$
respectively and add, we get, by (3),
$$c_1 x_j' + \cdots + c_{n-r} x_j^{[n-r]} = A_j' X_{r+1} + \cdots + A_j^{[n-r]} X_n.$$
Consequently, by the last equation (4),
$$\text{(5)}\qquad X_j = c_1 x_j' + \cdots + c_{n-r} x_j^{[n-r]} \qquad (j = 1, 2, \dots, r).$$
Equations (3) and (5) together prove our theorem.
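The construction used in this proof may be sketched in a few lines of Python, which is of course no part of the original text. The sketch assumes exact rational arithmetic and makes the simplest choice for the determinant (2), namely the unit determinant: each unknown left free is in turn set equal to 1 while the others are set equal to 0. The name `fundamental_system` and the sample system are hypothetical and serve only as an illustration.

```python
from fractions import Fraction

def fundamental_system(rows):
    """Return n - r linearly independent solutions of the homogeneous system."""
    a = [[Fraction(v) for v in row] for row in rows]
    n = len(a[0])
    pivots = []  # column of the leading unknown in each reduced equation
    r = 0        # number of independent equations found so far (the rank)
    for col in range(n):
        # find an equation, not yet used, with a nonzero coefficient here
        p = next((i for i in range(r, len(a)) if a[i][col] != 0), None)
        if p is None:
            continue
        a[r], a[p] = a[p], a[r]
        a[r] = [v / a[r][col] for v in a[r]]
        for i in range(len(a)):
            if i != r and a[i][col] != 0:
                f = a[i][col]
                a[i] = [u - f * v for u, v in zip(a[i], a[r])]
        pivots.append(col)
        r += 1
    free = [c for c in range(n) if c not in pivots]
    solutions = []
    for fcol in free:
        # one free unknown is 1, the other free unknowns are 0
        x = [Fraction(0)] * n
        x[fcol] = Fraction(1)
        for i, pcol in enumerate(pivots):
            x[pcol] = -a[i][fcol]
        solutions.append(x)
    return solutions

# Hypothetical system of rank 2 in 4 unknowns (third equation = first + second).
system = [[1, 1, 1, 1], [1, 2, 3, 4], [2, 3, 4, 5]]
basis = fundamental_system(system)
```

Here `basis` contains n − r = 2 solutions, in agreement with Theorem 1; any other choice of values making (2) different from zero would serve equally well.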
We thus see that the totality of all solutions of the system (1)
forms a set of points satisfying the conditions of Theorem 5, §17.
Consequently,
Theorem 2. If the rank of a system of homogeneous linear equations
in n variables is r, then any n − r + 1 solutions are linearly
dependent.
Finally we will prove the theorem:
Theorem 3. A necessary and sufficient condition that a set of
solutions of a system of homogeneous linear equations of rank r in n
variables form a fundamental system is that they be
(a) linearly independent,
(b) n − r in number.
By definition, (a) is a necessary condition. To see that (b) also
is necessary, notice that by Theorem 2 there cannot be more
than n − r linearly independent solutions. We have, then,
merely to show that l linearly independent solutions never form
a fundamental system when l < n − r. If they did, then by
Theorem 5, §17, any set of l + 1 solutions would be linearly
dependent, and therefore the same would be true of any set of
n − r solutions (since n − r ≥ l + 1). But by Theorem 1, this
is not true.
In order now to prove that conditions (a) and (b) are also
sufficient, let
$$(x_1^{[i]}, x_2^{[i]}, \dots, x_n^{[i]}) \qquad (i = 1, 2, \dots, n-r)$$
be any system of n − r linearly independent solutions of our system
of equations, and let $(x_1, \dots, x_n)$ be any solution of the system.
Then, by Theorem 2, we have n − r + 1 constants $(c_1, \dots, c_{n-r+1})$,
not all zero, and such that
$$c_1 x_j' + c_2 x_j'' + \cdots + c_{n-r} x_j^{[n-r]} + c_{n-r+1} x_j = 0 \qquad (j = 1, 2, \dots, n).$$
But since the n − r given points are linearly independent, $c_{n-r+1} \neq 0$;
accordingly these last equations enable us to express the solution
$(x_1, \dots, x_n)$ linearly in terms of the n − r given solutions,
and this shows that these n − r solutions form a fundamental
system.
EXERCISES
1. Prove that all the fundamental systems of solutions of a system of homogeneous
linear equations are included in the infinite number obtained in the
proof of Theorem 1.
2. Given three planes in space by their equations in homogeneous coordinates.
What are their relative positions when the rank of the system of equations is 3?
when it is 2? when it is 1?
3. Given three planes in space by their equations in nonhomogeneous coordinates.
What are their relative positions for the different possible pairs of values
of the ranks of the matrices and augmented matrices?
CHAPTER V
SOME THEOREMS CONCERNING THE RANK OF A MATRIX
19. General Matrices. In order to show that a given matrix
is of rank r, we have first to show that at least one r-rowed determinant
of the matrix is not zero, and secondly that all (r + 1)-rowed
determinants are zero. This latter work may be considerably
shortened by the following theorem:
Theorem 1. If in a given matrix a certain r-rowed determinant
is not zero, and all the (r + 1)-rowed determinants of which this r-rowed
determinant is a first minor are zero, then all the (r + 1)-rowed determinants
of the matrix are zero.
We will assume, as we may do without loss of generality, that
the nonvanishing r-rowed determinant stands in the upper left-hand
corner of the matrix. Let the matrix be
$$\begin{Vmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{Vmatrix}$$
and consider the r + 1 sets of n quantities each which lie in the first
r + 1 rows of this matrix. These r + 1 sets of quantities are linearly
dependent, as will be seen by reference to the proof of Theorem 1,
§13, for although we knew there that all the (r + 1)-rowed determinants
were zero, we made use of this fact only for those (r + 1)-rowed
determinants which we now assume to be zero. Moreover,
since the r sets of constants which stand in the first r rows of our
matrix are linearly independent, it follows that the (r + 1)th row is
linearly dependent on the first r. Precisely the same reasoning
shows that each of the subsequent rows is linearly dependent on the
first r rows. Accordingly, by Theorem 5, §17, any r + 1 rows are
linearly dependent; and therefore, by Theorem 1, §13, all the (r + 1)-rowed
determinants of our matrix are zero, as was to be proved.
Still another method of facilitating the determination of the rank
of a matrix is by changing the form of the matrix in certain ways
which do not change its rank. In order to explain this method, we
begin by laying down the following definition:
Definition 1. By an elementary transformation of a matrix we
understand a transformation of any one of the following forms:
(a) the interchange of two rows or of two columns;
(b) the multiplication of each element of a row (or column) by the
same constant not zero;
(c) the addition to the elements of one row (or column) of the products
of the corresponding elements of another row (or column) by one and
the same constant.
It is clear that if we can pass from a matrix a to a matrix b by one
of these transformations, we can pass back from b to a by an elementary
transformation.
Definition 2. Two matrices are said to be equivalent if it is possible
to pass from one to the other by a finite number of elementary
transformations.
Theorem 2. If two matrices are equivalent, they have the same rank.
It is evident that the transformations (a) and (b) of Definition 1
do not change the rank of a matrix, since they do not affect the vanishing
or nonvanishing of any determinant of the matrix. In order
to prove our theorem, it is therefore sufficient to prove that the rank
of a matrix is not changed by a transformation (c).
Suppose this transformation consists in adding to the elements of
the pth row of a matrix a k times the elements of the qth row,
thus giving the matrix b. Let r be the rank of the matrix a. We
will first show that this rank cannot be increased by the transformation,
that is, that all (r + 1)-rowed determinants of the matrix b are
zero. By hypothesis all the (r + 1)-rowed determinants of the
matrix a are zero, and some of these determinants are clearly not
changed by the transformation, namely, those which do not contain
the pth row, or which contain both the pth and the qth row. The
other determinants, which contain the pth row but not the qth, take
on after the transformation the form A ± kB where A and B are
(r + 1)-rowed determinants of a, and are therefore zero. Thus we
see that the transformation (c) never increases the rank of a matrix.
Moreover, the rank of b cannot be less than that of a, for then the
transformation (c) which carries b into a would increase the rank of
b, and this we have just seen is impossible.
This theorem can often be used to advantage in determining the
rank of a matrix, for by means of elementary transformations it is
often easy to simplify the matrix very materially.
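By way of illustration, the use of Theorem 2 in finding a rank may be sketched as follows in Python (not part of the original text). The sketch uses only the transformations (a) and (c) of Definition 1, applied to rows, and works with exact fractions; the name `rank` and the sample matrix are hypothetical.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix, found by elementary transformations of its rows."""
    a = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for col in range(len(a[0])):
        # look for a row, not yet used, with a nonzero element in this column
        p = next((i for i in range(r, len(a)) if a[i][col] != 0), None)
        if p is None:
            continue
        a[r], a[p] = a[p], a[r]  # transformation (a): interchange two rows
        for i in range(r + 1, len(a)):
            f = a[i][col] / a[r][col]
            # transformation (c): add a multiple of one row to another
            a[i] = [u - f * v for u, v in zip(a[i], a[r])]
        r += 1
    return r

# Hypothetical example: the second row is twice the first.
sample = [[1, 2, 3], [2, 4, 6], [1, 0, 1]]
sample_rank = rank(sample)
```

Since none of these transformations changes the rank, the number of rows which do not reduce to zero is the rank of the original matrix; for `sample` this number is 2.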
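EXERCISES
Determine the ranks of the following matrices:
1. $$\begin{Vmatrix} 14 & 12 & 6 & 8 & 2 \\ 6 & 104 & 21 & 9 & 17 \\ 7 & 6 & 3 & 4 & 1 \\ 35 & 30 & 15 & 20 & 5 \end{Vmatrix}$$
2. $$\begin{Vmatrix} 75 & 0 & 116 & 39 & 0 \\ 171 & -69 & 402 & 123 & 45 \\ 301 & 0 & 87 & -417 & -169 \\ 114 & -46 & 268 & 82 & 30 \end{Vmatrix}$$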
3. Prove that any matrix of rank r can be reduced by means of elementary
transformations to a form where the element in the ith row and ith column is 1
when i ≤ r, while all the other elements of the matrix are zero.
4. Hence prove that two matrices with m rows and n columns each are always
equivalent when they have the same rank.
5. Prove that a necessary and sufficient condition that the matrix
$$\begin{Vmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{Vmatrix}$$
be of rank 0 or 1 is that there exist m + n constants $\alpha_1, \dots, \alpha_m, \beta_1, \dots, \beta_n$ such that
$$a_{ij} = \alpha_i \beta_j \qquad (i = 1, 2, \dots, m;\ j = 1, 2, \dots, n).$$
20. Symmetrical Matrices.
Definition. The square matrix
$$\mathbf{a} = \begin{Vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{Vmatrix}$$
(and also its determinant) is said to be symmetrical if the pairs of terms
which are situated symmetrically with respect to the principal diagonal
are equal. That is, if $a_{ij} = a_{ji}$.
We will denote by $M_i$ an i-rowed principal minor of a. It is
our main object in this section to show how the rank of the symmetrical
matrix may be determined by an examination of the principal
minors only. This may be done by means of the following three
theorems.
Theorem 1. If an r-rowed principal minor $M_r$ of the symmetrical
matrix a is not zero, while all the principal minors obtained by adding
one row and the same column, and also all those obtained by adding two
rows and the same two columns, to $M_r$ are zero, then the rank of a is r.
Let the nonvanishing minor be the one which stands in the upper
left-hand corner of a, and let $B_{\alpha\beta}$ denote the determinant obtained by
adding the αth row and the βth column to $M_r$. If we can show that
$B_{\alpha\beta} = 0$ for all unequal values of α and β our theorem will be proved.
Cf. Theorem 1, §19. Give to the integers α and β any two unequal
values, and let C denote the determinant obtained by adding to $M_r$ the
αth and βth rows and the αth and βth columns of a. Then we have,
by hypothesis, $M_r \neq 0$, $B_{\alpha\alpha} = 0$, $B_{\beta\beta} = 0$, $C = 0$. Let $M_2'$ be the two-rowed
principal minor of the adjoint of C which corresponds to the
complement of $M_r$ in C. Then by Corollary 3, §11, we have
$$M_2' = C M_r = 0.$$
But $$M_2' = B_{\alpha\alpha} B_{\beta\beta} - B_{\alpha\beta}^2.$$
Therefore $B_{\alpha\beta} = 0$.
Theorem 2. If all the (r + 1)-rowed principal minors of the symmetrical
matrix a are zero, and also all the (r + 2)-rowed principal
minors, then the rank of a is r or less.
If r = 0, all the elements in the principal diagonal are zero and all
the two-rowed principal minors are zero.
That is, $a_{ii} a_{jj} - a_{ij}^2 = 0$,
and therefore, since $a_{ii} = a_{jj} = 0$, $a_{ij} = 0$. That is, every element is
zero and hence the rank is zero, and the theorem is true in this
special case.
Now, assume it true when r = k; that is, we assume that when all
(k + 1)-rowed principal minors are zero and all (k + 2)-rowed principal
minors are zero, the rank of a is less than k + 1. Then it follows that
when all (k + 2)-rowed, and all (k + 3)-rowed principal minors are zero,
the rank of a is less than k + 2. For in this case, if all (k + 1)-rowed
principal minors are zero, the rank is less than k + 1, by hypothesis,
and if some (k + 1)-rowed principal minor is not zero, the rank is exactly
k + 1, by the last theorem. We see then that if the theorem is
true for r = k it is true for r = k + 1. But we have proved it true for
r = 0, hence it is true for all values of r.
Theorem 3. If the rank of the symmetrical matrix a is r > 0, there
is at least one r-rowed principal minor of a which is not zero.
For all (r + 1)-rowed principal minors are zero, and, if all r-rowed
principal minors were zero also, the rank of a would be r − 1 or less,
by the last theorem.
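Theorem 3, taken with the definition of rank, shows that the rank of a symmetrical matrix is the order of the largest principal minor which does not vanish. A minimal sketch of this in Python (not part of the original text) follows; it examines principal minors only, beginning with the largest, and assumes exact fractions. The names `det` and `rank_from_principal_minors` and the sample matrices are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def det(m):
    """Determinant, computed by elimination with exact fractions."""
    a = [[Fraction(v) for v in row] for row in m]
    n = len(a)
    d = Fraction(1)
    for c in range(n):
        p = next((i for i in range(c, n) if a[i][c] != 0), None)
        if p is None:
            return Fraction(0)
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d  # interchanging two rows changes the sign
        d *= a[c][c]
        for i in range(c + 1, n):
            f = a[i][c] / a[c][c]
            a[i] = [u - f * v for u, v in zip(a[i], a[c])]
    return d

def rank_from_principal_minors(a):
    """Order of the largest nonvanishing principal minor of a."""
    n = len(a)
    for size in range(n, 0, -1):
        for idx in combinations(range(n), size):
            if det([[a[i][j] for j in idx] for i in idx]) != 0:
                return size
    return 0

rank_one = rank_from_principal_minors([[1, 2, 3], [2, 4, 6], [3, 6, 9]])
rank_two = rank_from_principal_minors([[0, 1], [1, 0]])
```

The second example has a principal diagonal composed wholly of zeros, yet its two-rowed principal minor is not zero. For a matrix which is not symmetrical the method may fail: the matrix with rows (0, 1) and (0, 0) is of rank 1, although all its principal minors vanish.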
We close with a theorem of a somewhat special character which
will be found useful later (cf. Exercises 46, § 50).
Theorem 4. If the rank of the symmetrical matrix a is r > 0, we
may shift the rows (at the same time shifting the columns in the same
way, thus keeping a symmetrical) in such a way that no consecutive two
of the set of quantities $M_0, M_1, M_2, \dots, M_r$
shall be zero and $M_r \neq 0$; $M_0$ being unity, and the other M's being the
principal minors of a of orders indicated by their subscripts, which stand
in the upper left-hand corner of a after the shifting.
By definition we have $M_0 \neq 0$. Leaving aside for the moment
the special case in which all the elements of the principal diagonal are
zero, let us suppose the element $a_{ii}$ is not zero. Then by shifting the
ith row and column to the first place, we have $M_1 \neq 0$. We have
thus fixed the first row and column, but we are still at liberty to
shift all the others. Now consider the two-rowed principal minors
obtained by adding to $M_1$ one row and the same column. Leaving
aside still the special case in which these are all zero, let us suppose
that the two-rowed determinant obtained by striking out all the rows
and columns except those numbered 1 and $i_1$ is not zero. Then, by
shifting the $i_1$th row and column into the second place, we have
$M_2 \neq 0$. We next have to consider the three-rowed principal minors
of which $M_2$ is a first minor. We can evidently proceed in this way
until we have so shifted our rows and columns that none of the quantities
$M_0, M_1, \dots, M_r$ are zero, unless at a certain stage we find that
all the principal minors of a certain order which we have to consider
are zero. In this case we should have so shifted our first k rows and
columns that none of the quantities $M_0, M_1, \dots, M_k$ are zero, but we
should then find that all (k + 1)-rowed principal minors of which $M_k$
is a first minor vanish, so that, however we may shift the last n − k
rows and columns, we have $M_{k+1} = 0$. Let us then examine the
(k + 2)-rowed principal minors of which $M_k$ is a second minor.*
These can (by Theorem 1) not all be zero as otherwise the rank of a
would be k < r. That is, if $M_{k+1} = 0$, we can so arrange the rows
and columns that $M_{k+2} \neq 0$. Thus we see that the rows and columns
of a may be so shifted that no consecutive two of the M's are
zero. Now, if $M_{r-1} = 0$, the above proof shows that we can make
$M_r \neq 0$. But even though $M_{r-1} \neq 0$ we can still make $M_r \neq 0$, for by
hypothesis† all the determinants obtained by adding to $M_{r-1}$ two
rows and the same two columns vanish, and if all those obtained by
adding one row and the same column were zero also, the rank of a
would be r − 1, by Theorem 1.
A symmetrical matrix is said to be arranged in normal form when
no consecutive two of the M's of Theorem 4 are zero and $M_r \neq 0$.
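EXERCISES
1. Determine the ranks of the following matrices:
$$\begin{Vmatrix} 2 & 1 & 11 & 2 \\ 1 & 0 & 4 & 1 \\ 11 & 4 & 56 & 5 \\ 2 & 1 & 5 & 6 \end{Vmatrix}, \qquad \begin{Vmatrix} 0 & 4 & 10 & 1 \\ 4 & 8 & 18 & 7 \\ 10 & 18 & 40 & 17 \\ 1 & 7 & 17 & 3 \end{Vmatrix},$$
$$\begin{Vmatrix} 0 & 1 & b & d \\ 1 & 0 & c & e \\ b & c & 2bc & cd+be \\ d & e & cd+be & 2de \end{Vmatrix}, \qquad \begin{Vmatrix} 1 & 0 & 0 & 1 & 4 \\ 0 & 1 & 0 & 2 & 5 \\ 0 & 0 & 1 & 3 & 6 \\ 1 & 2 & 3 & 14 & 32 \\ 4 & 5 & 6 & 32 & 77 \end{Vmatrix}.$$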
2. By a skew-symmetric determinant, or matrix, is meant one in which $a_{ij} = -a_{ji}$
(and therefore $a_{ii} = 0$).
Establish for such matrices theorems similar to Theorems 1, 2, 3 of this section.
3. By considering the effect of changing rows into columns, prove that a skew-symmetric
determinant of odd order is always zero.
4. Prove that the rank of a skew-symmetric matrix is always even.
* The tacit assumption is here made that when k = r − 1, r < n, as otherwise $M_{k+2}$
would have no meaning. The case r = n can, however, obviously not occur here, for
then we should have $M_{k+1} = a \neq 0$.
† Here again we assume that r < n, for if r = n, $M_r = a \neq 0$.
CHAPTER VI
LINEAR TRANSFORMATIONS AND THE COMBINATION
OF MATRICES
21. Matrices as Complex Quantities. We have said in §7 that a
matrix of m rows and n columns is not a quantity, but a set of mn
quantities. This statement is true only if we restrict the term
quantity to the real and complex quantities of ordinary algebra. A
moment's reflection, however, will show that the conception of quantity
as used in arithmetic and algebra has been gradually enlarged
from the primitive conception of the positive integer by using the
word quantity to denote entities which, at an earlier stage, would
not have been regarded as quantities at all, as, for instance, negative
quantities. We will consider here only one of these extensions,
namely the introduction of complex quantities, as this will lead us to
look at our matrices from a broader point of view.
If we have objects of two or more different kinds which can be
counted or measured, and if we consider aggregates of such objects,
we get concrete examples of complex quantities, as, for instance,
5 horses, 3 cows, and 7 sheep. A convenient way to write such a
complex quantity is (5, 3, 7), it being agreed that, in the illustration
we are considering, the first place shall always indicate horses,
the second cows, and the third sheep. In the abstract theory of
complex quantities we do not specify any concrete objects such as
horses, cows, etc., but merely consider sets of quantities (couples,
triplets, etc.), distinguishing these quantities by the position they
occupy in our symbol. Such a complex quantity we often find it
convenient to designate by a single letter,
$$\alpha = (a, b, c),$$
just as in ordinary algebra we denote a fraction, for instance,
which really involves two numbers, by a single letter. We speak
here of the simple quantities a, b, c of which α is composed as its first,
second, third components; and we call two complex quantities equal
when and only when the components of one are equal respectively to
the corresponding components of the other. Similarly a complex quantity
is said to vanish when and only when all of its components are zero.
What makes it worth while to speak of such sets of quantities as
complex quantities is that it is found useful to perform certain algebraic
operations on them. By the sum and difference of two complex
quantities
$$\alpha_1 = (a_1, b_1, c_1), \qquad \alpha_2 = (a_2, b_2, c_2)$$
we mean the two new complex quantities
$$\alpha_1 + \alpha_2 = (a_1 + a_2,\ b_1 + b_2,\ c_1 + c_2), \qquad \alpha_1 - \alpha_2 = (a_1 - a_2,\ b_1 - b_2,\ c_1 - c_2).*$$
When it comes to the question of defining what we shall understand
by the product of two complex quantities, things are by no
means so simple. It is necessary here to lay down some rule according
to which, when two complex quantities are given, a third, which
we call their product, is determined. Such rules may be laid down
in an infinite variety of ways, and each such rule gives us a different
system of complex quantities.†
We come now to the subject of matrices. A matrix of m rows
and n columns being merely a set of mn quantities (which we
assume to be either real quantities or the ordinary complex quantities
of elementary algebra) arranged in a definite order, is, according to
the point of view we have explained, a complex quantity with mn
components; and it is only a special application of the theory of
complex quantities which we have sketched, when we lay down the
following definitions:
Definition 1. A matrix is said to be zero when and only when all
of its elements are zero.
Definition 2. Two matrices are said to be equal when and only
when they have the same number of rows and of columns, and every
element of one is equal to the corresponding element of the other.
* That this is the natural meaning to be attached to the terms sum and difference
will be seen by reference to the concrete illustration given above.
† If, in particular, we wish to introduce the ordinary system of complex quantities
of elementary algebra, we use a system of couples, and define the product of two
couples, $\alpha_1 = (a_1, b_1)$, $\alpha_2 = (a_2, b_2)$,
by the formula $\alpha_1 \alpha_2 = (a_1 a_2 - b_1 b_2,\ a_1 b_2 + a_2 b_1)$.
For further details cf. Burkhardt's Funktionentheorie, §§ 2, 3.
Definition 3. By the sum (or difference) of two matrices of m
rows and n columns each, we understand a matrix of m rows and n columns,
each of whose elements is the sum (or difference) of the corresponding
elements of the given matrices.
In order to distinguish them from matrices, we will call the
ordinary quantities of algebra (real quantities and ordinary complex
quantities) scalars.
Before proceeding, as we shall do in the next section, to the
definition of the product of two matrices, we will define the product
of a matrix and a scalar.
Definition 4. If a is a matrix* and k a scalar, then by the product
ka or ak we understand the matrix each of whose elements is k
times the corresponding element of a.
As an obvious consequence of our definitions we state the
theorem:
Theorem. All the laws of ordinary algebra hold for the addition
or subtraction of matrices and their multiplication by scalars.
For instance, if a, b, c are matrices, and k, l scalars,
$$\mathbf{a} + \mathbf{b} = \mathbf{b} + \mathbf{a},$$
$$\mathbf{a} + (\mathbf{b} + \mathbf{c}) = (\mathbf{a} + \mathbf{b}) + \mathbf{c},$$
$$k\mathbf{a} + k\mathbf{b} = k(\mathbf{a} + \mathbf{b}),$$
$$k\mathbf{a} + l\mathbf{a} = (k + l)\mathbf{a}.†$$
EXERCISE
If $r_1$ and $r_2$ are the ranks of two matrices and R the rank of their sum, prove
that
$$R \le r_1 + r_2.$$
22. The Multiplication of Matrices. Up to this point we have considered
matrices with m rows and n columns. For the sake of simplicity
of statement, we shall confine our attention from now on to
square matrices, that is to the case m = n. This involves no real loss
* The notation here used, matrices being denoted by heavy-faced type, will be
systematically followed in this book.
† We add that, as a matter of notation, we shall write
$(-1)\mathbf{a} = -\mathbf{a}$.
of generality provided we agree to consider a matrix of m rows and
n columns, where m ≠ n, as equivalent to a square matrix of order
equal to the larger of the two integers m, n and obtained from the
given matrix by filling in the lacking rows or columns with zeros.
The question now presents itself: How shall we define the product
of two square matrices of the same order? It must be clearly
understood that we are logically free to lay down here such definition
as we please, and that the definition we select is preferable to others
not on any a priori grounds, but only because it turns out to be more
useful. We select the following definition, which is suggested* by
the multiplication theorem for determinants:
Definition 1. The product ab of two square matrices of the nth
order is a square matrix of the nth order in which the element which lies
in the ith row and jth column is obtained by multiplying each element of
the ith row of a by the corresponding element of the jth column of b and
adding the results.
Let us denote by $a_{ij}$ and $b_{ij}$ the elements in the ith row and jth
column of a and b respectively, or, as we will say for brevity, the
element (i, j) of these matrices. Then, according to our definition,
the element (i, j) of the product ab is
$$\text{(1)}\qquad a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj},$$
while the element (i, j) in the matrix ba is
$$\text{(2)}\qquad a_{1j}b_{i1} + a_{2j}b_{i2} + \cdots + a_{nj}b_{in}.$$
Since the two quantities (1) and (2) are not in general equal, we
obtain
Theorem 1. The multiplication of matrices is not in general commutative,
that is, in general $\mathbf{ab} \neq \mathbf{ba}$.
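Let us now consider a third matrix c whose element (i, j) is $c_{ij}$,
and form the product (ab)c. The element (i, j) of this matrix is
$$\text{(3)}\qquad \begin{aligned} &(a_{i1}b_{11} + a_{i2}b_{21} + \cdots + a_{in}b_{n1})c_{1j} \\ &+ (a_{i1}b_{12} + a_{i2}b_{22} + \cdots + a_{in}b_{n2})c_{2j} \\ &+ \cdots \\ &+ (a_{i1}b_{1n} + a_{i2}b_{2n} + \cdots + a_{in}b_{nn})c_{nj}. \end{aligned}$$
* Historically this definition was suggested to Cayley by the consideration of the
composition of linear transformations; cf. §23.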
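On the other hand, the element (i, j) of the matrix a(bc) is
$$\text{(4)}\qquad \begin{aligned} &a_{i1}(b_{11}c_{1j} + b_{12}c_{2j} + \cdots + b_{1n}c_{nj}) \\ &+ a_{i2}(b_{21}c_{1j} + b_{22}c_{2j} + \cdots + b_{2n}c_{nj}) \\ &+ \cdots \\ &+ a_{in}(b_{n1}c_{1j} + b_{n2}c_{2j} + \cdots + b_{nn}c_{nj}). \end{aligned}$$
Since the two quantities (3) and (4) are equal, we have established
Theorem 2. The multiplication of matrices is associative, that is,
$$(\mathbf{ab})\mathbf{c} = \mathbf{a}(\mathbf{bc}).$$
Finally, since the element (i, j) of the matrix a(b + c) is clearly
equal to the sum of the elements (i, j) of the matrices ab and ac, we
have the result
Theorem 3. The multiplication of matrices is distributive, that is,
$$\mathbf{a}(\mathbf{b} + \mathbf{c}) = \mathbf{ab} + \mathbf{ac}.$$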
Besides the commutative, associative, and distributive laws, there
is one other principle of elementary algebra which is of constant use,
namely, the principle that a product cannot vanish unless at least one
of the factors is zero. Simple examples show that this is not true in
the algebra of matrices. We have, for instance,
$$\text{(5)}\qquad \begin{Vmatrix} 0 & 0 & 0 \\ a_{21} & a_{22} & 0 \\ a_{31} & a_{32} & 0 \end{Vmatrix} \cdot \begin{Vmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ b_{31} & b_{32} & 0 \end{Vmatrix} = 0,$$
whatever the values of the a's and b's may be. Hence
Theorem 4. From the vanishing of the product of two or more
matrices, we cannot infer that one of the factors is zero.
The process of cancelling out nonvanishing factors which enter
throughout an equation will, therefore, be inadmissible in the algebra
of matrices.
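Theorems 1, 2, and 4 may be verified in a particular case by the following sketch in Python (not part of the original text). The function `matmul` follows Definition 1 directly; its name and the particular matrices chosen are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices of the same order (Definition 1)."""
    n = len(a)
    # element (i, j): the ith row of a into the jth column of b
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

a = [[1, 2], [3, 4]]
b = [[0, 1], [1, 0]]
c = [[2, 0], [1, 3]]

ab = matmul(a, b)                      # [[2, 1], [4, 3]]
ba = matmul(b, a)                      # [[3, 4], [1, 2]]
left = matmul(matmul(a, b), c)         # (ab)c
right = matmul(a, matmul(b, c))        # a(bc)

# A matrix different from zero whose product with itself vanishes.
p = [[0, 1], [0, 0]]
pp = matmul(p, p)                      # [[0, 0], [0, 0]]
```

Here `ab` and `ba` differ, `left` and `right` agree, and `pp` is zero although neither factor is zero.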
We next state a result which follows at once from the similarity
between the theorem for the multiplication of determinants and our
definition of the product of two matrices:
Theorem 5. The determinant of a matrix which is obtained by
multiplying together two or more matrices is equal to the product of the
determinants of these matrices.
The conception of the conjugate of a matrix, as defined in §7,
Definition 2, is an important one, and the following theorem concerning
it is often useful:
Theorem 6. The conjugate of the product of any number of
matrices is the product of their conjugates taken in the reverse order.
In order to prove this theorem we first notice that its truth in the
case of two matrices follows at once from the definition of the product
of two matrices. Its truth will therefore follow in all cases if,
assuming the theorem to be true for the product of n − 1 matrices,
we can prove that it is true for the product of n matrices. Let us
write
$$\mathbf{b} = \mathbf{a}_2 \mathbf{a}_3 \cdots \mathbf{a}_n.$$
Then, from what we have assumed,
$$\mathbf{b}' = \mathbf{a}_n' \cdots \mathbf{a}_3' \mathbf{a}_2',$$
where we use accents to denote conjugates. Accordingly,
$$(\mathbf{a}_1 \mathbf{a}_2 \cdots \mathbf{a}_n)' = (\mathbf{a}_1 \mathbf{b})' = \mathbf{b}' \mathbf{a}_1' = \mathbf{a}_n' \cdots \mathbf{a}_2' \mathbf{a}_1',$$
and our theorem is proved.
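Theorems 5 and 6 may be checked numerically for three matrices of the second order by the following sketch in Python (not part of the original text). The conjugate is taken here, as in §7, to be the matrix with rows and columns interchanged; the names `matmul`, `conjugate`, and `det2` and the sample matrices are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices of the same order."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def conjugate(a):
    """Conjugate of a matrix: rows and columns interchanged."""
    return [list(col) for col in zip(*a)]

def det2(m):
    """Determinant of a matrix of the second order."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

a = [[1, 2], [3, 4]]
b = [[0, 1], [1, 0]]
c = [[2, 0], [1, 3]]

abc = matmul(matmul(a, b), c)
# conjugates taken in the reverse order: c' b' a'
reverse = matmul(matmul(conjugate(c), conjugate(b)), conjugate(a))
same_conjugate = conjugate(abc) == reverse
same_determinant = det2(abc) == det2(a) * det2(b) * det2(c)
```

Both `same_conjugate` and `same_determinant` are true for these matrices, as the theorems require.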
In conclusion we lay down the following:
Definition 2. A square matrix is said to be singular if its determinant
is zero.
According to the convention made at the beginning of this section,
it will be seen that all matrices which are not square are
singular.
EXERCISES
1. Definition. A matrix a is called a divisor of zero if a matrix b different
from zero exists such that either ab = 0 or ba = 0.
Prove that every matrix one of whose rows or columns is composed wholly of
zeros is a divisor of zero.
2. If it is possible to pass from a to b by means of an elementary transformation
(cf. §19, Definition 1), prove that there either exists a nonsingular matrix c
such that
ac = b,
or a nonsingular matrix d such that
da = b.
3. If all the elements of a matrix are real, and if the product of this matrix
and its conjugate is zero, prove that the matrix itself is zero.
4. If the corresponding elements of two matrices a and b are conjugate imaginaries,
and, b' being the matrix conjugate to b, if
ab' = 0, then a = b = 0.
23. Linear Transformation. Before going farther with the
theory of matrices we will take up, in this section and the next, the
closely allied subject of linear transformation, which may be regarded
as one of the most important applications of the theory of matrices.
In algebra and analysis we frequently have occasion to introduce,
in place of the unknowns, or variables, we had originally to deal
with, certain functions of these quantities which we regard as new
unknowns or variables. Such a transformation, or change of variables,
is particularly simple, and for many purposes particularly
important, if the functions in question are homogeneous linear polynomials.
It is then called a homogeneous linear transformation, or,
as we shall say for brevity, simply a linear transformation. If $x_1, \dots, x_n$
are the original variables, and $x_1', \dots, x_n'$ the new ones, we have, as the
formulae for the transformation,
$$\begin{cases} x_1' = a_{11}x_1 + \cdots + a_{1n}x_n, \\ \quad\vdots \\ x_n' = a_{n1}x_1 + \cdots + a_{nn}x_n. \end{cases}$$
The square matrix
$$\mathbf{a} = \begin{Vmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{Vmatrix}$$
is called the matrix of the transformation, and the determinant of
this matrix, which we will represent by a, is called the determinant
of the transformation. Inasmuch as the transformation is completely
determined by its matrix, no confusion will arise if we speak
of the transformation a.
In most cases where we have occasion to use a transformation it is
important for us to be able, in the course of our work, to pass back to
the original variables, and for this purpose it must be possible, not
merely to express $x_1', \dots, x_n'$ as functions of $x_1, \dots, x_n$, but also to express
$x_1, \dots, x_n$ as functions of $x_1', \dots, x_n'$. In the case of linear transformations
this can in general be done. For the equations of the transformation
may be regarded as nonhomogeneous linear equations in
$x_1, \dots, x_n$, and if the determinant a of the transformation is not zero,
they can be solved and give
$$\mathbf{A}\quad \begin{cases} x_1 = \dfrac{A_{11}}{a}x_1' + \cdots + \dfrac{A_{n1}}{a}x_n', \\ \quad\vdots \\ x_n = \dfrac{A_{1n}}{a}x_1' + \cdots + \dfrac{A_{nn}}{a}x_n', \end{cases}$$
where $A_{11}, \dots, A_{nn}$ are the cofactors of $a_{11}, \dots, a_{nn}$ in a.
This transformation A is called the inverse of the transformation
a, but it must be remembered that it exists only if a ≠ 0. A linear
transformation for which a = 0 is called a singular transformation.
If a is nonsingular, its inverse A is also nonsingular, since the determinant
of A is $a^{-1}$ (cf. Corollary 2, §11).
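The formation of the inverse from the cofactors may be sketched as follows in Python (not part of the original text). The element (i, j) of the inverse is taken to be the cofactor of $a_{ji}$ divided by the determinant a, in accordance with the formulae for A; the determinant is expanded by the first row, which is adequate only for matrices of small order. The names `minor`, `det`, and `inverse` and the sample matrix are hypothetical.

```python
from fractions import Fraction

def minor(m, i, j):
    """The matrix left on striking out row i and column j."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]

def det(m):
    """Determinant, expanded according to the elements of the first row."""
    if not m:
        return 1  # determinant of order zero, by convention
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j))
               for j in range(len(m)))

def inverse(m):
    """Matrix of the inverse transformation; fails when the determinant is 0."""
    d = det(m)
    if d == 0:
        raise ValueError("singular transformation: no inverse exists")
    n = len(m)
    # element (i, j) is the cofactor of a_ji divided by the determinant
    return [[Fraction((-1) ** (i + j) * det(minor(m, j, i)), d)
             for j in range(n)] for i in range(n)]

sample = [[2, 1], [5, 3]]    # determinant 1
sample_inverse = inverse(sample)
```

For `sample` the result is the matrix with rows (3, −1) and (−5, 2); for a matrix whose determinant vanishes the function raises an error, no inverse existing in that case.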
Definition. The special linear transformation
$$x_1' = x_1, \quad x_2' = x_2, \quad \dots, \quad x_n' = x_n,$$
whose matrix is
$$\begin{Vmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 1 \end{Vmatrix},$$
is called the identical transformation.
The determinant of this transformation is 1.
We turn now to the subject of the composition of linear transformations.
If we introduce a new set of variables x' as functions
of the original variables x, and then make a second transformation
by introducing a third set of variables x'' as functions of the variables
x', these two transformations can obviously be combined and
the variables x'' expressed directly in terms of the x's. If the two
transformations which we combine are linear transformations, it is
readily seen that the resulting transformation will also be linear.
The precise formulae are important here, and for the sake of simplicity
we will write them in the case of three variables, a case which
will be seen to be perfectly typical of the general case.
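Let
\[ \mathbf{a}: \begin{cases} x'_1 = a_{11}x_1 + a_{12}x_2 + a_{13}x_3, \\ x'_2 = a_{21}x_1 + a_{22}x_2 + a_{23}x_3, \\ x'_3 = a_{31}x_1 + a_{32}x_2 + a_{33}x_3, \end{cases} \qquad \mathbf{b}: \begin{cases} x''_1 = b_{11}x'_1 + b_{12}x'_2 + b_{13}x'_3, \\ x''_2 = b_{21}x'_1 + b_{22}x'_2 + b_{23}x'_3, \\ x''_3 = b_{31}x'_1 + b_{32}x'_2 + b_{33}x'_3, \end{cases} \]
be two linear transformations. Replacing the $x'$'s in $\mathbf{b}$ by their
values from $\mathbf{a}$, we get
\[ \begin{aligned}
x''_1 &= (a_{11}b_{11} + a_{21}b_{12} + a_{31}b_{13})x_1 + (a_{12}b_{11} + a_{22}b_{12} + a_{32}b_{13})x_2 + (a_{13}b_{11} + a_{23}b_{12} + a_{33}b_{13})x_3, \\
x''_2 &= (a_{11}b_{21} + a_{21}b_{22} + a_{31}b_{23})x_1 + (a_{12}b_{21} + a_{22}b_{22} + a_{32}b_{23})x_2 + (a_{13}b_{21} + a_{23}b_{22} + a_{33}b_{23})x_3, \\
x''_3 &= (a_{11}b_{31} + a_{21}b_{32} + a_{31}b_{33})x_1 + (a_{12}b_{31} + a_{22}b_{32} + a_{32}b_{33})x_2 + (a_{13}b_{31} + a_{23}b_{32} + a_{33}b_{33})x_3.
\end{aligned} \]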
It will be seen that the matrix of this transformation is $\mathbf{ba}$.
Hence,
Theorem. If we pass from the variables $x$ to the variables $x'$ by a
linear transformation of matrix $\mathbf{a}$, and from the variables $x'$
to the variables $x''$ by another linear transformation of matrix
$\mathbf{b}$, then the linear transformation of matrix $\mathbf{ba}$ will
carry us directly from the variables $x$ to the variables $x''$.*
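As a small illustrative sketch (the helper names `apply` and `matmul` and the two sample matrices are assumptions, not taken from the text), the theorem can be checked by carrying a set of values through the two transformations in succession and comparing with the single transformation of matrix ba:

```python
def apply(m, x):
    """Apply the linear transformation of matrix m to the variables x."""
    return [sum(m[i][j] * x[j] for j in range(len(x))) for i in range(len(m))]

def matmul(p, q):
    """Ordinary matrix product p q."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

a = [[1, 2, 0], [0, 1, 0], [0, 0, 1]]    # carries x  into x'
b = [[1, 0, 0], [3, 1, 0], [0, 0, 1]]    # carries x' into x''
x = [1, 1, 1]

step_by_step = apply(b, apply(a, x))     # first a, then b
direct = apply(matmul(b, a), x)          # the single transformation ba
print(step_by_step == direct)
print(matmul(b, a) == matmul(a, b))      # the order of the factors matters
```

Note that the transformation performed first stands on the right in the product.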
* This result may be remembered conveniently by means of the following
symbolic notation, which is often useful. Let us denote the
transformation $\mathbf{a}$ by the symbolic equation $x' = \mathbf{a}(x)$,
and the transformation $\mathbf{b}$ by $x'' = \mathbf{b}(x')$. The result of
combining these two transformations is then
$x'' = \mathbf{b}(\mathbf{a}(x))$ or simply $x'' = \mathbf{ba}(x)$.

24. Collineation. We come now to an important geometrical
application of the subject of linear transformation. For the sake of
simplicity we begin with the case of three variables, which we will
regard as the homogeneous coordinates of points in a plane.
The equations
\[ \begin{cases} x' = a_1 x + b_1 y + c_1 t, \\ y' = a_2 x + b_2 y + c_2 t, \\ t' = a_3 x + b_3 y + c_3 t \end{cases} \tag{1} \]
may be regarded as defining a transformation of the points of the
plane; that is, if $(x, y, t)$ is an arbitrarily given point, we can
compute, by means of (1), the coordinates $(x', y', t')$ of a second point
into which we regard the first point as being transformed. The
only exception is when the computed values of $x', y', t'$ are all three
zero, in which case there is no point into which the given point is
transformed. This exceptional case can clearly occur only when the
determinant of the transformation (1) is zero. Let us then confine
our attention to nonsingular linear transformations. In this case,
not only does every point $(x, y, t)$ correspond to a definite point
$(x', y', t')$, but conversely, every point $(x', y', t')$ corresponds to a
definite point $(x, y, t)$, since the transformation (1) now has an inverse
\[ \begin{cases} x = \dfrac{A_1}{D} x' + \dfrac{A_2}{D} y' + \dfrac{A_3}{D} t', \\ y = \dfrac{B_1}{D} x' + \dfrac{B_2}{D} y' + \dfrac{B_3}{D} t', \\ t = \dfrac{C_1}{D} x' + \dfrac{C_2}{D} y' + \dfrac{C_3}{D} t', \end{cases} \tag{2} \]
where $D$ is the determinant of (1), and $A_i, B_i, C_i$ are the cofactors
in $D$.
The points $(x, y, t)$ of the line
\[ \alpha x + \beta y + \gamma t = 0 \tag{3} \]
are transformed by means of the nonsingular transformation (1)
into points of another line,
\[ (\alpha A_1 + \beta B_1 + \gamma C_1) x' + (\alpha A_2 + \beta B_2 + \gamma C_2) y' + (\alpha A_3 + \beta B_3 + \gamma C_3) t' = 0, \tag{4} \]
as we see by using formulae (2). Conversely every point of the line
(4) corresponds, as we see by using (1), to a point on (3). That is,
the transformation establishes a one-to-one correspondence between
the points on the two lines (3) and (4), or, as we say, it transforms
the line (3) into the line (4). On account of this property of
transforming straight lines into straight lines, the transformation is
called a collineation. The transformation is also known as a projective
transformation, for it may be shown that it can be effected by
projecting one plane on to another by means of straight lines radiating
from a point in space.
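The passage from line (3) to line (4) can be illustrated by a short sketch. The matrix, the line, and the three points below are hypothetical choices, and the helper names are assumptions; the coefficients of the image line are formed from the cofactors as described above.

```python
def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def cofactor(m, i, j):
    """Signed minor obtained by deleting row i and column j."""
    minor = [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]
    return (-1) ** (i + j) * det(minor)

def apply(m, p):
    """Image of the point p under the transformation of matrix m."""
    return [sum(m[i][j] * p[j] for j in range(3)) for i in range(3)]

def image_line(m, line):
    """Coefficient i of the image line is alpha*A_i + beta*B_i + gamma*C_i."""
    return [sum(line[j] * cofactor(m, i, j) for j in range(3)) for i in range(3)]

m = [[1, 2, 0], [0, 1, 1], [1, 0, 1]]         # rows (a_i, b_i, c_i); determinant 3
line = [1, -1, 2]                             # alpha, beta, gamma
points = [[1, 1, 0], [2, 0, -1], [0, 2, 1]]   # three points on the given line

new_line = image_line(m, line)
for p in points:
    q = apply(m, p)
    # second value printed is the left member of the image line at q
    print(q, sum(c * v for c, v in zip(new_line, q)))
```

Each transformed point should make the left member of the image line vanish, which is what the loop prints.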
What we have here said in the case of two dimensions applies
with no essential change to three dimensions. The transformation
\[ \begin{cases} x' = a_1 x + b_1 y + c_1 z + d_1 t, \\ y' = a_2 x + b_2 y + c_2 z + d_2 t, \\ z' = a_3 x + b_3 y + c_3 z + d_3 t, \\ t' = a_4 x + b_4 y + c_4 z + d_4 t \end{cases} \tag{5} \]
gives us, provided its determinant is not zero, a one-to-one
transformation of the points of space, which carries over planes into
planes, and therefore also straight lines into straight lines, and is
called a collineation or projective transformation of space. The same
idea can be extended to spaces of higher dimensions.
Quite as important is the case of one dimension. The transformation
\[ \begin{cases} x' = a_1 x + b_1 t, \\ t' = a_2 x + b_2 t \end{cases} \tag{6} \]
gives us, provided its determinant is not zero, a one-to-one
transformation of the points on a line. This we call a projective
transformation of the line, the term collineation being in this case
obviously inadequate.
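It is possible, although for most purposes not desirable, to express
the projective transformations (6), (1), (5) in one, two, and three
dimensions in terms of nonhomogeneous, instead of homogeneous
coordinates. We thus get the formulae
\[ X' = \frac{a_1 X + b_1}{a_2 X + b_2}, \tag{7} \]
\[ X' = \frac{a_1 X + b_1 Y + c_1}{a_3 X + b_3 Y + c_3}, \qquad Y' = \frac{a_2 X + b_2 Y + c_2}{a_3 X + b_3 Y + c_3}, \tag{8} \]
\[ X' = \frac{a_1 X + b_1 Y + c_1 Z + d_1}{a_4 X + b_4 Y + c_4 Z + d_4}, \qquad Y' = \frac{a_2 X + b_2 Y + c_2 Z + d_2}{a_4 X + b_4 Y + c_4 Z + d_4}, \qquad Z' = \frac{a_3 X + b_3 Y + c_3 Z + d_3}{a_4 X + b_4 Y + c_4 Z + d_4}. \tag{9} \]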
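A brief sketch of the one-dimensional case may make the relation between (6) and (7) concrete. It assumes the usual passage X = x/t between the two kinds of coordinates; the function names and coefficients are illustrative only.

```python
from fractions import Fraction

def nonhomogeneous(a1, b1, a2, b2, X):
    """Fractional form: X' = (a1 X + b1) / (a2 X + b2)."""
    return (a1 * X + b1) / (a2 * X + b2)

def via_homogeneous(a1, b1, a2, b2, x, t):
    """Homogeneous form followed by the passage X' = x'/t'."""
    x_new = a1 * x + b1 * t
    t_new = a2 * x + b2 * t
    return Fraction(x_new, t_new)

X = Fraction(1, 2)                           # the point with homogeneous coordinates (1, 2)
first = nonhomogeneous(2, 1, 1, 3, X)
second = via_homogeneous(2, 1, 1, 3, 1, 2)
scaled = via_homogeneous(10, 5, 5, 15, 1, 2) # all four coefficients multiplied by 5
print(first, second, scaled)
```

The last call shows that proportional sets of coefficients give the same transformation of the points.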
These fractional forms may, in particular, be used to advantage
in case their denominators reduce to mere constants. This special
case, which is known as an affine transformation, may clearly be
characterized by saying that all finite points go into finite points.*
These affine transformations are of much importance in mechanics,
where they are known as homogeneous strains; cf., for instance,
Webster's Dynamics (Leipzig, Teubner), pp. 427-440.
* If we consider the still more special case in which the constant terms
in the numerators of (8) and (9) are zero, that is, affine transformations
in which the origin is transformed into itself, we see that our formulae
(8) and (9) have the form (6) and (1) respectively. Thus (6) may be
regarded either as the general projective transformation of a line (if
$x, t$ are regarded as homogeneous coordinates) or as a special affine
transformation of the plane (if $x, t$ are regarded as nonhomogeneous
coordinates). Similarly (1) may be regarded either as the general
projective transformation of a plane, or as a special affine
transformation of space.

Although we propose to leave the detailed discussion of singular
transformations to the reader (see Exercise 1 at the end of this
section), we will give one theorem concerning them.
Theorem 1. If the points $P_1, P_2, \cdots$ are carried over by a singular
projective transformation into the points $P'_1, P'_2, \cdots$, then, if our
transformation is in one dimension, the points $P'$ will all coincide;
if in two dimensions, they will all be collinear; if in three dimensions,
they will all be complanar, etc.
Suppose, for instance, that we have to deal with two dimensions.
Since the determinant of the collineation (1) is supposed to be zero,
the three polynomials in the second members of (1) are linearly
dependent; that is, there exist three constants, $k_1, k_2, k_3$, not all
zero, and such that for all values of $x, y, t$,
\[ k_1 x' + k_2 y' + k_3 t' = 0. \tag{10} \]
Accordingly all points $(x', y', t')$ obtained by this transformation
lie on the line (10).
Similar proofs apply to the cases of one dimension and of three
or more dimensions.
Theorem 2. Any three distinct points on a line may be carried
over respectively into any three distinct points on the line by one, and
only one, projective transformation.
Let the three initial points be $P_1, P_2, P_3$, with homogeneous
coordinates $(x_1, t_1)$, $(x_2, t_2)$, $(x_3, t_3)$ respectively, and let the
points into which we wish them transformed be $P'_1, P'_2, P'_3$ with
coordinates $(x'_1, t'_1)$, $(x'_2, t'_2)$, $(x'_3, t'_3)$. The projective
transformation
\[ \begin{cases} x' = \alpha x + \beta t, \\ t' = \gamma x + \delta t \end{cases} \]
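carries over any given point $(x, t)$ into a point $(x', t')$ whose position
depends on the values of the constants $\alpha, \beta, \gamma, \delta$. Our
theorem is true if it is possible to find one, and, except for a constant
factor which may be introduced throughout, only one, set of seven
constants (four, $\alpha, \beta, \gamma, \delta$, and three others,
$\rho_1, \rho_2, \rho_3$, none of which is zero) which satisfy the six
equations
\[ \begin{cases} \rho_1 x'_1 = \alpha x_1 + \beta t_1, \\ \rho_1 t'_1 = \gamma x_1 + \delta t_1, \end{cases} \qquad \begin{cases} \rho_2 x'_2 = \alpha x_2 + \beta t_2, \\ \rho_2 t'_2 = \gamma x_2 + \delta t_2, \end{cases} \qquad \begin{cases} \rho_3 x'_3 = \alpha x_3 + \beta t_3, \\ \rho_3 t'_3 = \gamma x_3 + \delta t_3. \end{cases} \]
Since the $x$'s and $t$'s are all known, we have here six homogeneous
linear equations in seven unknowns. Hence there are always solutions
other than zeros, the number of independent ones depending on
the rank of the matrix of the coefficients. Transposing and
rearranging the equations, we have
\[ \begin{aligned}
x_1 \alpha + t_1 \beta - x'_1 \rho_1 &= 0, \\
x_2 \alpha + t_2 \beta - x'_2 \rho_2 &= 0, \\
x_1 \gamma + t_1 \delta - t'_1 \rho_1 &= 0, \\
x_2 \gamma + t_2 \delta - t'_2 \rho_2 &= 0, \\
x_3 \alpha + t_3 \beta - x'_3 \rho_3 &= 0, \\
x_3 \gamma + t_3 \delta - t'_3 \rho_3 &= 0.
\end{aligned} \]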
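The matrix of these equations is of rank six. For consider the
determinant of the first six columns, with the signs of the elements of
its last two columns reversed,
\[ D = \begin{vmatrix} x_1 & t_1 & 0 & 0 & x'_1 & 0 \\ x_2 & t_2 & 0 & 0 & 0 & x'_2 \\ 0 & 0 & x_1 & t_1 & t'_1 & 0 \\ 0 & 0 & x_2 & t_2 & 0 & t'_2 \\ x_3 & t_3 & 0 & 0 & 0 & 0 \\ 0 & 0 & x_3 & t_3 & 0 & 0 \end{vmatrix}. \]
Since $P_1, P_2, P_3$ are distinct, there exist two constants $c_1, c_2$,
neither of which is zero, such that
\[ c_1 x_1 + c_2 x_2 + x_3 = 0, \qquad c_1 t_1 + c_2 t_2 + t_3 = 0. \]
Hence, adding to the fifth row of $D$ $c_1$ times the first row and $c_2$
times the second, and to the sixth row $c_1$ times the third row and $c_2$
times the fourth, we have
\[ D = \begin{vmatrix} x_1 & t_1 & 0 & 0 & x'_1 & 0 \\ x_2 & t_2 & 0 & 0 & 0 & x'_2 \\ 0 & 0 & x_1 & t_1 & t'_1 & 0 \\ 0 & 0 & x_2 & t_2 & 0 & t'_2 \\ 0 & 0 & 0 & 0 & c_1 x'_1 & c_2 x'_2 \\ 0 & 0 & 0 & 0 & c_1 t'_1 & c_2 t'_2 \end{vmatrix} = c_1 c_2 \begin{vmatrix} x_1 & t_1 \\ x_2 & t_2 \end{vmatrix}^2 \begin{vmatrix} x'_1 & x'_2 \\ t'_1 & t'_2 \end{vmatrix}, \]
and this is not zero, since $P'_1$ and $P'_2$ are distinct as well as $P_1$
and $P_2$.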
In the same way we see that the determinants obtained by
striking out the sixth and the fifth columns respectively of the
matrix are not zero. Accordingly, by Theorem 4, § 17, we see that
the equations have a solution in which none of the quantities
$\rho_1, \rho_2, \rho_3$ are zero, and that every solution is proportional to
this one. All these solutions obviously yield the same projective
transformation of the line.
Corollary. The transformation just determined is nonsingular.
This follows, by a reference to Theorem 1, from the fact that it
does not carry $P_1, P_2, P_3$ into a single point.
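The proof of Theorem 2 is constructive, and the following sketch follows it under stated assumptions: the six homogeneous equations in the seven unknowns are set up and solved by ordinary elimination in exact arithmetic. The function names `null_vector` and `three_point_map` and the sample points are hypothetical, and the elimination routine is a generic one rather than anything prescribed by the text.

```python
from fractions import Fraction

def null_vector(rows):
    """One nonzero solution of a homogeneous system whose solutions are all proportional."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_cols = len(m[0])
    pivots, r = [], 0
    for c in range(n_cols):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        lead = m[r][c]
        m[r] = [v / lead for v in m[r]]          # make the pivot equal to 1
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [vi - f * vr for vi, vr in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    free = [c for c in range(n_cols) if c not in pivots]
    if len(free) != 1:
        raise ValueError("expected exactly one free unknown")
    sol = [Fraction(0)] * n_cols
    sol[free[0]] = Fraction(1)
    for row_index, c in enumerate(pivots):
        sol[c] = -m[row_index][free[0]]
    return sol

def three_point_map(src, dst):
    """Unknowns in the order alpha, beta, gamma, delta, rho_1, rho_2, rho_3."""
    rows = []
    for i, ((x, t), (xp, tp)) in enumerate(zip(src, dst)):
        rho = [0, 0, 0]
        rho[i] = -xp
        rows.append([x, t, 0, 0] + rho)          # alpha x + beta t - rho x' = 0
        rho = [0, 0, 0]
        rho[i] = -tp
        rows.append([0, 0, x, t] + rho)          # gamma x + delta t - rho t' = 0
    return null_vector(rows)

src = [(0, 1), (1, 1), (1, 0)]                   # P_1, P_2, P_3
dst = [(1, 1), (2, 1), (3, 1)]                   # P'_1, P'_2, P'_3
alpha, beta, gamma, delta, *rho = three_point_map(src, dst)
print(alpha, beta, gamma, delta, rho)
print(alpha * delta - beta * gamma != 0)         # the corollary: nonsingular
```

Any nonzero multiple of the returned solution describes the same projective transformation, in agreement with the closing remark of the proof.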
EXERCISES
1. Discuss singular projective transformations in one, two, and three
dimensions; noting, in particular, the effect of the rank of the matrix of
the transformation, first, on the distribution of the points which have no
corresponding points after the transformation, and secondly, on the
distribution of the points into which no points are carried over by the
transformation.
2. Prove that any four complanar points no three of which are collinear may
be carried over into any four points in the plane, no three of which are
collinear, by one and only one collineation.
3. State and prove the corresponding theorem in n dimensions.
4. Prove that the transformation from a first system of homogeneous
coordinates to a second is effected by a nonsingular linear transformation.
Consider the case of one, two, and three dimensions.
5. Prove that a projective transformation in space effects on every plane a
two-dimensional, and on every line a one-dimensional, projective
transformation, while at the same time the positions of the plane and line
are changed.
[Suggestion. If $p$ and $p'$ are any two corresponding planes, assume in any
way a pair of perpendicular axes in each of them, and denote by
$(x_1, y_1, t_1)$ and $(x'_1, y'_1, t'_1)$ respectively the systems of
two-dimensional homogeneous coordinates based on these axes. Then show, by
using the result of Exercise 4, that the transformation of one plane on the
other will be expressed by writing $x'_1, y'_1, t'_1$ as homogeneous linear
polynomials in $x_1, y_1, t_1$.]
25. Further Development of the Algebra of Matrices. We proceed
to establish certain further properties of matrices, leaving, however,
much to the reader in the shape of exercises at the end of the section.
The theory of linear transformations suggests to us at once certain
properties of matrices. The first of these is:
Theorem 1. The matrix
\[ \mathbf{I} = \begin{Vmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 1 \end{Vmatrix} \]
has the property that if $\mathbf{a}$ is any matrix whatever
\[ \mathbf{I}\mathbf{a} = \mathbf{a}\mathbf{I} = \mathbf{a}. \]
For the linear transformation of which $\mathbf{a}$ is the matrix will
evidently not be changed by being either followed or preceded by the
identical transformation of which $\mathbf{I}$ is the matrix.
If we do not wish to use the idea of linear transformation, we may
prove the theorem directly by actually forming the products
$\mathbf{I}\mathbf{a}$ and $\mathbf{a}\mathbf{I}$.
This theorem tells us that $\mathbf{I}$ plays in the algebra of matrices
the same role that is played by 1 in ordinary algebra. For this
reason $\mathbf{I}$ is sometimes called the unit matrix or idemfactor.
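Let us now consider any nonsingular linear transformation and
its inverse. These two transformations performed in succession in
either order obviously lead to the identical transformation. This
gives us the theorem:
Theorem 2. If
\[ \mathbf{a} = \begin{Vmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{Vmatrix} \]
is a nonsingular matrix of determinant $a$, and if $A_{ij}$ denote in the
ordinary way the cofactors of the elements of $a$, the matrix
\[ \begin{Vmatrix} \dfrac{A_{11}}{a} & \cdots & \dfrac{A_{n1}}{a} \\ \vdots & & \vdots \\ \dfrac{A_{1n}}{a} & \cdots & \dfrac{A_{nn}}{a} \end{Vmatrix}, \]
called the inverse of $\mathbf{a}$, and denoted by $\mathbf{a}^{-1}$, is a
nonsingular matrix which has the property that
\[ \mathbf{a}\mathbf{a}^{-1} = \mathbf{a}^{-1}\mathbf{a} = \mathbf{I}. \]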
This suggests that we define positive and negative integral
powers of matrices as follows:
Definition 1. If $p$ is any positive integer and $\mathbf{a}$ any matrix we
understand by $\mathbf{a}^p$ the product $\mathbf{a}\mathbf{a} \cdots \mathbf{a}$
to $p$ factors. If $\mathbf{a}$ is a nonsingular matrix, we define its
negative and zero powers by the formulae
\[ \mathbf{a}^{-p} = (\mathbf{a}^{-1})^p, \qquad \mathbf{a}^0 = \mathbf{I}. \]
From this definition we infer at once
Theorem 3. The laws of indices
\[ \mathbf{a}^p \mathbf{a}^q = \mathbf{a}^{p+q}, \qquad (\mathbf{a}^p)^q = \mathbf{a}^{pq} \]
hold for all matrices when the indices $p$ and $q$ are positive integers, and
for all nonsingular matrices when $p$ and $q$ are any integers.
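The laws of indices can be tried out on a small case. The sketch below is restricted to matrices of order two for brevity; the names `inverse_2x2` and `power` and the sample matrix are illustrative assumptions.

```python
from fractions import Fraction

def matmul(p, q):
    """Ordinary matrix product p q."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def inverse_2x2(m):
    """Inverse of a nonsingular matrix of order two, from its cofactors."""
    (a, b), (c, d) = m
    determinant = a * d - b * c
    if determinant == 0:
        raise ValueError("singular matrix has no negative powers")
    return [[Fraction(d, determinant), Fraction(-b, determinant)],
            [Fraction(-c, determinant), Fraction(a, determinant)]]

def power(m, p):
    """m**p for any integer p, with m**0 = I and m**(-p) = (m**-1)**p."""
    result = [[Fraction(int(i == j)) for j in range(2)] for i in range(2)]
    base = m if p >= 0 else inverse_2x2(m)
    for _ in range(abs(p)):
        result = matmul(result, base)
    return result

a = [[2, 1], [1, 1]]                                       # hypothetical nonsingular matrix
print(matmul(power(a, 3), power(a, -5)) == power(a, -2))   # a^p a^q = a^(p+q)
print(power(power(a, 2), -3) == power(a, -6))              # (a^p)^q = a^(pq)
```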
We turn now to the question of the division of one matrix by
another. We naturally define division as the inverse of multiplication,
and, since multiplication is not commutative, we thus get two
distinct kinds of division; $\mathbf{a}$ divided by $\mathbf{b}$ being on the
one hand a matrix $\mathbf{x}$ such that
\[ \mathbf{a} = \mathbf{b}\mathbf{x}, \]
on the other hand a matrix $\mathbf{y}$ such that
\[ \mathbf{a} = \mathbf{y}\mathbf{b}. \]
On account of this ambiguity, the term division is not ordinarily used
here. We have, however, as is easily seen, the following theorem:
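Theorem 4. If $\mathbf{a}$ is any matrix and $\mathbf{b}$ any nonsingular
matrix, there exists one, and only one, matrix $\mathbf{x}$ which satisfies
the equation
\[ \mathbf{a} = \mathbf{b}\mathbf{x}, \]
and one, and only one, matrix $\mathbf{y}$ which satisfies the equation
\[ \mathbf{a} = \mathbf{y}\mathbf{b}, \]
and these matrices are given respectively by the formulae
\[ \mathbf{x} = \mathbf{b}^{-1}\mathbf{a}, \qquad \mathbf{y} = \mathbf{a}\mathbf{b}^{-1}. \]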
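A short numerical sketch shows that the two quotients of Theorem 4 are in general different matrices. The matrices below are hypothetical examples of order two, and the helper names are assumptions.

```python
from fractions import Fraction

def matmul(p, q):
    """Ordinary matrix product p q."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def inverse_2x2(m):
    """Inverse of a nonsingular matrix of order two, from its cofactors."""
    (a, b), (c, d) = m
    determinant = a * d - b * c
    if determinant == 0:
        raise ValueError("b must be nonsingular")
    return [[Fraction(d, determinant), Fraction(-b, determinant)],
            [Fraction(-c, determinant), Fraction(a, determinant)]]

a = [[1, 2], [3, 4]]
b = [[2, 1], [1, 1]]                  # nonsingular, determinant 1
b_inv = inverse_2x2(b)

x = matmul(b_inv, a)                  # solves a = b x
y = matmul(a, b_inv)                  # solves a = y b
print(matmul(b, x) == a, matmul(y, b) == a)
print(x == y)                         # the two kinds of division disagree here
```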
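A special class of matrices is of some importance; namely, those
of the type
\[ \begin{Vmatrix} k & 0 & \cdots & 0 \\ 0 & k & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & k \end{Vmatrix}. \]
Such matrices we will call scalar matrices for a reason which will
presently appear.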
If we denote by $\mathbf{k}$ the scalar matrix just written, and by
$\mathbf{a}$ any matrix of the same order as $\mathbf{k}$, we obtain readily
the formula
\[ \mathbf{k}\mathbf{a} = \mathbf{a}\mathbf{k} = k\mathbf{a}. \tag{1} \]
If now, besides the scalar matrix $\mathbf{k}$, we have a second scalar
matrix $\mathbf{l}$ in which each element in the principal diagonal is $l$, we
have the two formulae
\[ \mathbf{k} + \mathbf{l} = \mathbf{l} + \mathbf{k} = (k + l)\mathbf{I}, \tag{2} \]
\[ \mathbf{k}\mathbf{l} = \mathbf{l}\mathbf{k} = kl\,\mathbf{I}. \tag{3} \]
Formula (1) shows that scalar matrices may be replaced by ordinary
scalars when they are to be multiplied by other matrices; while
formulae (2) and (3) show that scalar matrices combined with one
another not only obey all the laws of ordinary scalars, but that each
scalar matrix may in such cases be replaced by the scalar which
occurs in each element of its principal diagonal provided that at the
end of the work the resulting scalar be replaced by the corresponding
scalar matrix.
For these reasons we may, in the algebra of matrices, replace all
scalar matrices by the corresponding scalars, and then consider that
all scalars which enter into our work stand for the corresponding
scalar matrices. If we do this, the unit matrix $\mathbf{I}$ will be
represented by the symbol 1.
Definition 2. By the adjoint $\mathbf{A}$ of a matrix $\mathbf{a}$ is understood
another matrix of the same order in which the element in the $i$th row
and $j$th column is the cofactor of the element in the $j$th row and $i$th
column of $\mathbf{a}$.*
It will be seen that when $\mathbf{a}$ is nonsingular,
\[ \mathbf{A} = a\,\mathbf{a}^{-1}, \tag{4} \]
but it should be noticed that while every matrix has an adjoint, only
nonsingular matrices have inverses.
Equation (4) may be written in the form
\[ \mathbf{A}\mathbf{a} = \mathbf{a}\mathbf{A} = a\mathbf{I}, \tag{5} \]
a form in which it is true not merely when $\mathbf{a}$ is nonsingular, but
also, as is seen by direct multiplication, when the determinant of
$\mathbf{a}$ is zero.
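The remark that the product of a matrix and its adjoint is the determinant times the unit matrix, even for a singular matrix, can be verified by direct multiplication. The sketch below does this for one nonsingular and one singular sample matrix; the names and the matrices are illustrative assumptions.

```python
def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def cofactor(m, i, j):
    """Signed minor obtained by deleting row i and column j."""
    minor = [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]
    return (-1) ** (i + j) * det(minor)

def adjoint(m):
    """Element (i, j) is the cofactor of element (j, i): note the interchange."""
    n = len(m)
    return [[cofactor(m, j, i) for j in range(n)] for i in range(n)]

def matmul(p, q):
    """Ordinary matrix product p q."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def scalar_matrix(k, n):
    """Matrix with k in each place of the principal diagonal, zeros elsewhere."""
    return [[k if i == j else 0 for j in range(n)] for i in range(n)]

samples = [[[2, 1, 0], [1, 3, 1], [0, 1, 2]],    # nonsingular
           [[1, 2, 3], [2, 4, 6], [0, 1, 1]]]    # singular: second row is twice the first
for a in samples:
    A = adjoint(a)
    print(det(a), matmul(A, a) == matmul(a, A) == scalar_matrix(det(a), 3))
```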
Finally we come to a few important theorems concerning the
rank of the matrix obtained by multiplying together two given
matrices. In the first place, we notice that the rank of the product
is not always completely determined by the ranks of the factors.
This may be shown by numerous examples, for instance, in formula
(5), § 22, the ranks of the factors are in general two and one, and the
rank of the product is zero, while in the formula
\[ \begin{Vmatrix} a_{11} & a_{12} & 0 \\ a_{21} & a_{22} & 0 \\ a_{31} & a_{32} & 0 \end{Vmatrix} \cdot \begin{Vmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{Vmatrix} = \begin{Vmatrix} 0 & a_{12} & 0 \\ 0 & a_{22} & 0 \\ 0 & a_{32} & 0 \end{Vmatrix} \]
the ranks of the factors are in general the same, namely two and
one, while the rank of the product is one.
But though, as this example shows, the ranks of the factors (even
together with the order of the matrices) do not suffice to determine
the rank of the product, there are, nevertheless, important
inequalities between these ranks, one of which we now proceed to deduce.
* Notice the interchange of rows and columns here, which in the case of
adjoint determinants, being immaterial and sometimes inconvenient, was not
made.
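For this purpose consider the two matrices
\[ \mathbf{a} = \begin{Vmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{Vmatrix}, \qquad \mathbf{b} = \begin{Vmatrix} b_{11} & \cdots & b_{1n} \\ \vdots & & \vdots \\ b_{n1} & \cdots & b_{nn} \end{Vmatrix} \]
and their product $\mathbf{ab}$.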
Theorem 5. Any $k$-rowed determinant of the matrix $\mathbf{ab}$ is equal
to an aggregate of $k$-rowed determinants of $\mathbf{b}$ each multiplied
into a polynomial in the $a$'s, and also to an aggregate of $k$-rowed
determinants of $\mathbf{a}$ each multiplied by a polynomial in the $b$'s.
For any $k$-rowed determinant of $\mathbf{ab}$ may be broken up into a sum
of determinants of the $k$th order in such a way that each column of
each determinant has one of the $b$'s as a common factor.* After
taking out these common factors from each determinant, we have
left a determinant in the $a$'s which, if it does not vanish identically, is
a $k$-rowed determinant of $\mathbf{a}$. Or, on the other hand, we may break
up the $k$-rowed determinant of $\mathbf{ab}$ into a sum of determinants of
the $k$th order in such a way that each row of each determinant has one
of the $a$'s as a common factor. After taking out these common factors
from each determinant, we have left a determinant in the $b$'s which,
if it does not vanish identically, is a $k$-rowed determinant of
$\mathbf{b}$.
From the theorem just proved it is clear that if all the $k$-rowed
determinants of $\mathbf{a}$ or of $\mathbf{b}$ are zero, the same will be true
of all the $k$-rowed determinants of $\mathbf{ab}$. Hence
Theorem 6. The rank of the product of two matrices cannot
exceed the rank of either factor.†
* The truth of this statement and the following will be evident if the
reader actually writes out the matrix $\mathbf{ab}$.
† Thus if $r_1$ and $r_2$ are the ranks of the two factors and $R$ is the rank
of the product, we have $R \leq r_1$, $R \leq r_2$. This is one half of
Sylvester's "Law of Nullity," of which the other half may be stated in the
form $R \geq r_1 + r_2 - n$, where $n$ is the order of the matrices; cf.
Exercise 8 at the end of this section. Sylvester defines the nullity
of a matrix as the difference between its order and its rank, so that his
statement of the law of nullity is: The nullity of the product of two
matrices is at least as great as the nullity of either factor, and at most
as great as the sum of the nullities of the factors.
There is one important case in which this theorem enables us to
determine completely the rank of the product, namely, the case in
which one of the two matrices $\mathbf{a}$ or $\mathbf{b}$ is nonsingular.
Suppose first that $\mathbf{a}$ is nonsingular, and denote the ranks of
$\mathbf{b}$ and $\mathbf{ab}$ by $r$ and $R$ respectively. By Theorem 6,
$R \leq r$. We may, however, also regard $\mathbf{b}$ as the product of
$\mathbf{a}^{-1}$ and $\mathbf{ab}$, and hence, applying Theorem 6
again, we have $r \leq R$. Combining these two results, we see that
$r = R$.
On the other hand, if $\mathbf{b}$ is nonsingular, and we denote the ranks
of $\mathbf{a}$ and $\mathbf{ab}$ respectively by $r$ and $R$, we get from
Theorem 6, $R \leq r$; and, applying this theorem again to the equation
\[ (\mathbf{ab})\mathbf{b}^{-1} = \mathbf{a}, \]
we have $r \leq R$. Thus again we get $r = R$.
We have thus established the result:
Theorem 7. If a matrix of rank $r$ is multiplied in either order by
a nonsingular matrix, the rank of the product is also $r$.
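Theorems 6 and 7 can be observed on small examples. The sketch below computes ranks by ordinary elimination in exact arithmetic, which is a convenient stand-in for the definition of rank by determinants and is not the method of the text; the function names and sample matrices are assumptions.

```python
from fractions import Fraction

def rank(m):
    """Rank found by elimination: the number of pivots obtained."""
    rows = [[Fraction(v) for v in row] for row in m]
    r = 0
    for c in range(len(rows[0])):
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [vi - f * vr for vi, vr in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def matmul(p, q):
    """Ordinary matrix product p q."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

a = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]    # rank two
b = [[0, 0, 0], [0, 1, 0], [0, 0, 1]]    # rank two
c = [[1, 2, 0], [0, 1, 1], [1, 0, 1]]    # nonsingular (determinant 3)

print(rank(a), rank(b), rank(matmul(a, b)))     # the product does not exceed either factor
print(rank(matmul(c, a)), rank(matmul(a, c)))   # a nonsingular factor leaves the rank unchanged
```

In this example the rank of the product ab also attains the lower bound given in the footnote to Theorem 6, since 2 + 2 - 3 = 1.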
EXERCISES
1. Prove that a necessary and sufficient condition that two matrices
$\mathbf{a}$ and $\mathbf{b}$ of the same order be equivalent is that there
exist two nonsingular matrices $\mathbf{c}$ and $\mathbf{d}$ such that
\[ \mathbf{dac} = \mathbf{b}. \]
Cf. § 22, Exercise 2, and § 19, Exercise 4.
2. Prove that a necessary and sufficient condition that two matrices
$\mathbf{a}$ and $\mathbf{b}$ of the same order be equivalent is that there
exist four matrices $\mathbf{c}, \mathbf{d}, \mathbf{e}, \mathbf{f}$ such that
\[ \mathbf{dac} = \mathbf{b}, \qquad \mathbf{a} = \mathbf{fbe}. \]
3. Prove that every matrix of rank $r$ can be written as the sum of $r$
matrices of rank one.*
[Suggestion. Notice that the special matrix mentioned in § 19, Exercise 3,
can be so written.]
* A matrix of rank one has been called by Gibbs a dyad, since it may (cf.
§ 19, Ex. 5) be regarded as a product of two complex quantities
$(a_1, a_2, \cdots, a_n)$ and $(b_1, b_2, \cdots, b_n)$. The sum of any number
of dyads is called a dyadic polynomial, or simply a dyadic. Every matrix is
therefore a dyadic, and vice versa. Gibbs's theory of dyadics, in the case
$n = 3$, is explained in the Vector Analysis of Gibbs-Wilson, Chap. V.
Geometric language is used here exclusively, the complex quantities
$(a_1, a_2, a_3)$ and $(b_1, b_2, b_3)$ from which the dyads are built up being
interpreted as vectors in space of three dimensions. This theory is
equivalent to Hamilton's theory of the Linear Vector Function in
Quaternions.
4. Prove that a necessary and sufficient condition that a matrix be a
divisor of zero (cf. § 22, Exercise 1) is that it be singular.
[Suggestion. Consider equivalent matrices.]
5. Prove that the inverse of the product of any number of nonsingular
matrices is the product of the inverses of these matrices taken in the
reverse order.
Hence deduce a similar theorem concerning the adjoint of a product of any
number of matrices, whether these matrices are singular or not.
What theorem concerning determinants can be inferred?
6. Prove that the conjugate of the inverse of a nonsingular matrix is the
inverse of the conjugate; and that the conjugate of the adjoint of any
matrix is the adjoint of the conjugate.
7. Prove that if a matrix has the property that its product with every
matrix of the same order is commutative, it is necessarily a scalar matrix.
8. If $r_1$ and $r_2$ are the ranks of two matrices of order $n$, and $R$ the
rank of their product, prove that $R \geq r_1 + r_2 - n$.*
[Suggestion. Prove this theorem first on the supposition that one of the two
matrices which are multiplied together is of the form mentioned in Exercise
3, § 19, using also at this point Exercise 1, § 8. Then reduce the general
case to this one by means of Exercise 1 of this section.]
* Cf. the footnote to Theorem 6.

26. Sets, Systems, and Groups. These three words are the
technical names for conceptions which are to be met with in all
branches of mathematics. In fact the first two are of such generality
that they may be said to form the logical foundation on which
all mathematics rests.† In this section we propose, after having
given a brief explanation of these three conceptions, to show how
they apply to the special subjects considered in this chapter.
† For a popular exposition of the point of view here alluded to, see my
address on The Fundamental Conceptions and Methods of Mathematics, St. Louis
Congress of Arts and Science, 1904. Reprinted in Bull. Amer. Math. Soc.,
December, 1904.
The objects considered in mathematics (we use the word object
in the broadest possible sense) are of the most varied kinds. We
have, on the one hand, to mention a few of the more important ones,
the different kinds of quantities ranging all the way from the positive
integers to complex quantities and matrices. Next we have in
geometry not only points, lines, curves, and surfaces but also such
things as displacements (rotations, translations, etc.), collineations,
and, in fact, geometrical transformations in general. Then in various
parts of mathematics we have to deal with the Theory of
Substitutions, that is, with the various changes which can be made
in the order of certain objects, and these substitutions themselves
may be regarded as objects of mathematical study. Finally, in
mechanics we have to deal with such objects as forces, couples,
velocities, etc.
These objects, and all others which are capable of mathematical
consideration, are constantly presenting themselves to us, not singly,
but in sets. Such sets (or, as they are sometimes called, classes) of
objects may consist of a finite or an infinite number of objects, or
elements. We mention as examples:
(1) All prime numbers.
(2) All lines which meet two given lines in space.
(3) All planes of symmetry of a given cube.
(4) All substitutions which can be performed on five letters.
(5) All rotations of a plane about a given line perpendicular to it.
Having thus gained a slight idea of the generality of the conception
of a set, we next notice that in many cases in which we have
to deal with a set in mathematics, there are one or more rules by
which pairs of elements of the set may be combined so as to give
objects, either belonging to the set or not as the case may be. As
examples of such rules of combination, we mention addition and
multiplication both in ordinary algebra and in the algebra of
matrices; the process by which two points, in geometry, determine a
line; the process of combining two displacements to give another
displacement, etc.
Such a set, with its associated rules of combination, we will call
a mathematical system, or simply a system.*
We come now to a very important kind of system known as a
group, which we define as follows :
* This definition is sufficiently general for our immediate purposes. In
general, however, it is desirable to admit, not merely rules of combination,
but also relations between the elements of a system. In fact we may have
merely one or more relations and no rules of combination at all. From this
point of view the positive integers with the relation of greater and less
would form a system, even though we do not introduce any rule of combination
such as addition or multiplication. It may be added that rules of
combination may be regarded as merely relations between three objects; cf.
the address referred to above.
Definition. A system consisting of a set of elements and one rule
of combination, which we will denote by $\circ$, is called a group if the
following conditions are satisfied:
(1) If $a$ and $b$ are any elements of the set, whether distinct or not,
$a \circ b$ is also an element of the set.*
(2) The associative law holds; that is, if $a, b, c$ are any elements
of the set, $(a \circ b) \circ c = a \circ (b \circ c)$.
(3) The set contains an element, $i$, called the identical element,
which is such that every element is unchanged when combined with it,
$i \circ a = a \circ i = a$.
(4) If $a$ is any element, the set also contains an element $a'$, called the
inverse of $a$, such that $a' \circ a = a \circ a' = i$.
Thus, for example, the positive and negative integers with zero
form a group if the rule of combination is addition. In this case
zero is the identical element, and the inverse of any element is its
negative. These same elements, however, do not form a group if
the rule of combination is multiplication, for while conditions (1),
(2), and (3) are fulfilled (the identical element being 1 in this case),
condition (4) is not, since zero has no reciprocal, and no integer other
than $\pm 1$ has a reciprocal which is itself an integer.
Again, the set of all real numbers forms a group if the rule
of combination is addition, but not if it is multiplication, since in
this case zero has no inverse. If we exclude zero from the set, we
have a group if the rule of combination is multiplication, but not if
it is addition.
As an example of a group with a finite number of elements we
mention the four numbers
\[ +1, \quad -1, \quad +\sqrt{-1}, \quad -\sqrt{-1} \]
with multiplication as the law of combination.
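For a finite set the four conditions of the definition can be tested exhaustively. The following sketch does so; the function name `is_group` and the particular sets tried are illustrative assumptions, and Python's `1j` stands for the square root of -1.

```python
from itertools import product

def is_group(elements, op):
    """Test conditions (1) to (4) of the definition on a finite set."""
    # (1) the group property: every combination lies in the set
    if not all(op(a, b) in elements for a, b in product(elements, repeat=2)):
        return False
    # (2) the associative law
    if not all(op(op(a, b), c) == op(a, op(b, c))
               for a, b, c in product(elements, repeat=3)):
        return False
    # (3) an identical element
    identities = [i for i in elements
                  if all(op(i, a) == a and op(a, i) == a for a in elements)]
    if not identities:
        return False
    i = identities[0]
    # (4) an inverse for every element
    return all(any(op(a, b) == i and op(b, a) == i for b in elements)
               for a in elements)

def multiply(a, b):
    return a * b

print(is_group([1, -1, 1j, -1j], multiply))   # the four numbers of the text
print(is_group([1, -1, 1j], multiply))        # fails (1): (-1) * 1j is missing
print(is_group([0, 1, -1], multiply))         # fails (4): zero has no inverse
```

The check is only possible because the sets are finite; for infinite sets such as the integers the conditions must be argued as in the text.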
In order to get an example of a group of geometrical operations,
let us consider the translations of a plane, regarded as a rigid lamina,
in the directions of its own lines. Every such translation may be
represented both in magnitude and in direction by the length and
direction of an arrow lying in the plane in question. Two such
translations performed in succession are obviously equivalent to a
translation of the same sort represented by the arrow obtained by
combining the two given arrows according to the law of the
parallelogram of forces. The set of all translations with the law of
combination just explained is readily seen to form a group if we
include in it the null translation, i.e. the transformation which leaves
every point in the plane fixed. This null translation is then the
identical element, and two translations are the inverse of each other
if they are equal in magnitude and opposite in direction.
* A system satisfying condition (1) is sometimes said to have "the group
property." In the older works on the subject this condition was the only one
to be explicitly mentioned, the others, however, being tacitly assumed.
All the groups we have so far mentioned satisfy, not only the
four conditions stated in the definition, but also a fifth condition,
viz. that the law of combination is commutative. Such groups are
called commutative or Abelian groups. In general, however, groups
do not have this property. As examples of non-Abelian groups,
we may mention first the group of all nonsingular matrices of a
given order, the rule of combination being multiplication; and
secondly the group of all matrices of a given order whose
determinants have the value $\pm 1$, the rule of combination being again
multiplication. This second group is called a subgroup of the first,
since all its elements are also elements of the first group, and the
rule of combination is the same in both cases. A subgroup of the
group last mentioned is the group of all matrices of a given order
whose determinants have the value $+1$,* the rule of combination
being multiplication.
We add that non-Abelian groups may readily be built up whose
elements are linear transformations, or collineations. On the other
hand, Abelian groups may be formed from matrices if we take as our
rule of combination addition instead of multiplication.
* These are called unimodular matrices; or, more accurately, properly
unimodular matrices to distinguish them from the improperly unimodular
matrices whose determinants have the value $-1$. It should be noticed that
these last matrices taken by themselves do not constitute a group, since
they do not even have the group property.

27. Isomorphism.
Definition. Two groups are said to be isomorphic† if it is possible
to establish a one-to-one correspondence between their elements of such a
sort that if $a$, $b$ are any elements of the first group and $a'$, $b'$ the
corresponding elements of the second, then $a' \circ b'$ corresponds to
$a \circ b$.*
† Simply isomorphic would be the more complete term. We shall, however, not
be concerned with isomorphism which is not simple.
We proceed to illustrate this definition by some 'examples,
leaving to the reader the proofs of the statements we make. In each
case we omit the statement of the rule of combination in the case of
transformations, where no misunderstanding is possible.
FiEST Example, (a) The group of the four elements
1, V1, 1, V1,
the rule of combination being multiplication.
(b) The group of four rotations about a given line through angles
of 0°, 90°, 180°, 270°. '
These two groups may be proved to be isomorphic by pairing the
elements against one another in the order in which they have just
been written.
Second Example, (a) The group of the four matrices
(o i),(~o i),(o l),( l),
the rule of combination being multiplication.
(6) The group of the following four transformations : the iden
tical transformation ; reflection in a plane ; reflection in a second
plane at right angles to the first ; rotation through 180° about the
line of intersection of these two planes.
(c) The group consisting of the identical transformation and of
three rotations through angles of 180° about three straight lines
through a point at right angles to one another.
The two groups of Example 1 are not isomorphic with the three of
Example 2, in spite of the fact that there are the same number of elements
in all the groups. This follows from the presence of two
elements in the groups of Example 1 whose squares are not the
identical element, whereas in the groups of Example 2 the square of
every element is the identical element.
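The comparison of the two examples can be checked mechanically. The following Python sketch is an illustration only: the helper name `is_isomorphic`, the encoding of the numbers ±1, ±√−1 as integer pairs (real part, imaginary part), and the reading of the four matrices of the Second Example as the diagonal matrices with diagonal elements ±1 are assumptions made here, not notation from the text. It tries every one-to-one correspondence and asks whether one of them preserves the rule of combination.

```python
from itertools import permutations


def is_isomorphic(elems_a, op_a, elems_b, op_b):
    """Search all one-to-one correspondences for one that preserves the rule of combination."""
    if len(elems_a) != len(elems_b):
        return False
    for image in permutations(elems_b):
        phi = dict(zip(elems_a, image))
        if all(phi[op_a(x, y)] == op_b(phi[x], phi[y])
               for x in elems_a for y in elems_a):
            return True
    return False


# First Example (a): 1, sqrt(-1), -1, -sqrt(-1), stored as (real, imaginary).
units = [(1, 0), (0, 1), (-1, 0), (0, -1)]


def unit_mul(x, y):
    return (x[0] * y[0] - x[1] * y[1], x[0] * y[1] + x[1] * y[0])


# First Example (b): rotations through 0, 90, 180, 270 degrees.
rotations = [0, 90, 180, 270]


def rot_compose(s, t):
    return (s + t) % 360


# Second Example (a), assumed to be diag(p, q) with p, q = +1 or -1.
diagonals = [(1, 1), (-1, 1), (1, -1), (-1, -1)]


def diag_mul(s, t):
    return (s[0] * t[0], s[1] * t[1])


first_example = is_isomorphic(units, unit_mul, rotations, rot_compose)
mixed = is_isomorphic(units, unit_mul, diagonals, diag_mul)
print(first_example, mixed)
```

The brute-force search is only practical for very small groups, which is all that is needed here.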
* This idea of isomorphism may obviously be extended to the case of any two systems
provided merely that there are the same number of rules of combination in both
cases. Thus the system of all scalar matrices on the one hand and of all scalars on the
other, the rules of combination being in both cases addition and multiplication, are obviously
isomorphic. It is for this reason that no confusion arises if no distinction is
made between scalar matrices and scalars.
Third Example. (a) The group of all real quantities; the rule
of combination being addition.
(b) The group of all scalar matrices of order n with real elements;
the rule of combination being addition.
(c) The group of all translations of space parallel to a given
line.
Fourth Example. (a) The group of all nonsingular matrices
of order n, with multiplication as the rule of combination.
(b) The group of all nonsingular homogeneous linear transformations
in n variables.
We might be tempted to mention as a group of geometrical transformations
isomorphic with the last two groups, the group of all
nonsingular collineations in space of n − 1 dimensions. This, however,
would be incorrect, for the correspondence we have established
between collineations and linear transformations is not one-to-one;
to every linear transformation corresponds one collineation, but to
every collineation correspond an infinite number of linear transformations,
whose coefficients are proportional to one another.* In
order to get a group of geometrical transformations isomorphic with
the group of nonsingular matrices of the nth order it is sufficient
to interpret the variables $x_1, \cdots x_n$ as nonhomogeneous coordinates in
space of n dimensions, and to consider the geometric transformation
effected by nonsingular homogeneous linear transformations of
these x's. These transformations are those affine transformations of
space of n dimensions which leave the origin unchanged; cf. the
footnote on p. 70. Thus the group of all nonsingular matrices of
the nth order is isomorphic with a certain subgroup of the group
of collineations in space of n dimensions, not with the group of all
nonsingular collineations in space of n − 1 dimensions.
* This does not really prove that the groups are not isomorphic, since it is conceivable
that some other correspondence might be established between their elements
which would be one-to-one and of such a sort as to prove isomorphism. Even the
fact, to be pointed out presently, that the groups depend on a different number of
parameters does not settle the question. A reference to the result stated in Exercise 7,
§ 25, shows that the groups are not isomorphic; for, according to it, the only nonsingular
collineation which is commutative with all collineations is the identical
transformation, whereas all linear transformations with scalar matrices have this
property.
An essential difference between these two groups is that one
depends on $n^2$ parameters (the $n^2$ coefficients of the linear transformation)
while the other depends only on $n^2 - 1$ parameters (the ratios
of the coefficients of the collineation).
We can, however, by looking at the subject a little differently,
obtain a group of matrices isomorphic with the group of all nonsingular
collineations in space of n − 1 dimensions. For this purpose
we need merely to regard two matrices as equal whenever the elements
of one can be obtained from those of the other by multiplying
all the elements by the same quantity not zero. When we take this
point of view with regard to matrices, it is desirable to indicate it
by a new terminology and notation. According to a suggestion of
E. H. Moore of Chicago, we will call such matrices fractional
matrices, and write them
[display: the notation for fractional matrices of the second and third orders, with elements $a_{11}, a_{12}, a_{21}, a_{22}$ and $a_{11}, \cdots a_{33}$], etc.
Agreeing that fractional matrices are to be added and multiplied
according to the same rules as ordinary matrices, we may now
say that the group of all nonsingular collineations in space of n − 1
dimensions is isomorphic with the group of all fractional matrices of
the nth order whose determinants are not zero.*
To take another example, the groups in the second example above
are isomorphic with the group whose elements are the four fractional
matrices
[display: four fractional matrices; only elements equal to 1 survive in the source]
and where the law of combination is multiplication. These four
matrices, if regarded as ordinary matrices, would not even satisfy
the first condition for a group.
* It should be noticed that we cannot speak of the value of the determinant of a
fractional matrix unless this value is zero, for if we multiply all the elements of the
matrix by c we do not change the matrix, but do multiply the determinant by $c^n$.
There is in particular no such thing as a unimodular fractional matrix. We may,
however, speak of the rank of a fractional matrix.
The reader wishing to get a further insight into the theory of
groups of linear transformations will find the following three treatments
interesting and instructive. They duplicate each other to
only a very slight extent.
Weber, Algebra, Vol. II.
Klein, Vorlesungen über das Ikosaeder.
Lie-Scheffers, Vorlesungen über continuirliche Gruppen.
EXERCISES
1. Definition. A group is said to be of order n if it contains n, and only n,
elements.
If a group of order n has a subgroup, prove that the order of this subgroup is
a factor of n.
[Suggestion. Denote the elements of the subgroup by $a_1, \cdots a_k$, and let b be any
other element of the group. Show that $ba_1, ba_2, \cdots ba_k$ are all elements of the group
distinct from each other and distinct from the a's. If there are still other elements,
let c be one and consider the elements $ca_1, \cdots ca_k$, etc.]
2. Prove that if a is any element of a group of finite order, it is possible by
multiplying a by itself a sufficient number of times to get the identical element.
Definition. The lowest power to which a can be raised so as to give the identical
element is called the period of a.
3. Prove that every element of a group of order n has as its period a factor
of n (1 and n included).
4. Definition. A group is called cyclic if all its elements are powers of a
single element.
Prove that all cyclic groups of order n are isomorphic with the group of rotations
about an axis through angles $0, \omega, 2\omega, \cdots (n-1)\omega$, where $\omega = 2\pi/n$, and
that conversely every such group of rotations is a cyclic group.
5. Prove that every group whose order is a prime number is a cyclic group.
6. Prove that all groups of order 4 are either cyclic or isomorphic with the
groups of the second example above. A group of this last kind is called a four-group
("Vierergruppe").
7. Obtain groups with regard to one or the other of which all groups of
order 6 are isomorphic.
8. Obtain groups with regard to one or the other of which all groups of
order 8 are isomorphic.
CHAPTER VII
INVARIANTS. FIRST PRINCIPLES AND ILLUSTRATIONS
28. Absolute Invariants, Geometric, Algebraic, and Arithmetical.
If we subject a geometric figure to a transformation, we find that,
while many properties of the figure have been altered, others have
not. If we consider, not a single transformation, but a set of transformations,
then those properties of figures which are not changed by
any of the transformations of the set are said to be invariant properties
with regard to this set of transformations. Thus if our set of
transformations is the group of all displacements, the property of
two lines being parallel or perpendicular to each other and the
property of a curve being a circle are invariant properties, since
after the transformation the lines will still be parallel or perpendicular
and the curve will still be a circle. If, however, we consider,
not the group of displacements, but the group of all nonsingular
collineations, none of the properties just mentioned will be invariant
properties. Properties invariant with regard to all nonsingular
collineations have played such an important part in the development
of geometry that a special name has been given to them, and they
are called projective or descriptive properties. As examples of such
projective properties we mention the collinearity and complanarity
of points, the complanarity and concurrence of lines, etc.; or, on
the other hand, the contact of a line with a curve or a surface or
the contact of two curves or of two surfaces, or of a curve and a
surface.
Definition 1. If there is associated with a geometric figure a
quantity which is unchanged by all the transformations of a certain
set, then this quantity is called an invariant with regard to the transformations
of the set.
For instance, if our set of transformations is the group of displacements,
the distance between two points and the angle between
two lines would be two examples of invariants.
The geometric invariants so far considered lead up naturally to
the subject of algebraic invariants. Thus let us consider the two
polynomials
\[ (1) \qquad \begin{cases} A_1 x + B_1 y + C_1, \\ A_2 x + B_2 y + C_2, \end{cases} \]
and subject the variables (x, y) to the transformations of the set
\[ (2) \qquad \begin{cases} x' = x\cos\theta + y\sin\theta + \alpha, \\ y' = -x\sin\theta + y\cos\theta + \beta, \end{cases} \]
where $\alpha, \beta, \theta$ are parameters which may have any values. The transformation
(2) carries over the polynomials (1) into two new polynomials:
\[ (3) \qquad \begin{cases} A_1' x' + B_1' y' + C_1', \\ A_2' x' + B_2' y' + C_2'. \end{cases} \]
The coefficients of (3) may be readily expressed in terms of the coefficients
of (1) and the parameters $\alpha, \beta, \theta$. Using these expressions,
we easily obtain the formulas
\[ (4) \qquad \begin{cases} A_1' B_2' - A_2' B_1' = A_1 B_2 - A_2 B_1, \\ A_1' A_2' + B_1' B_2' = A_1 A_2 + B_1 B_2. \end{cases} \]
We shall therefore speak of the two expressions
\[ (5) \qquad A_1 B_2 - A_2 B_1, \qquad A_1 A_2 + B_1 B_2 \]
as invariants of the system of polynomials (1) with regard to the set
of transformations (2) according to the following general definition:
Definition 2. If we have a system of polynomials in the variables
(x, y, z, ...) and a set of transformations of these variables, then any
function of the coefficients of the polynomials is called an invariant (or
more accurately an absolute invariant) with regard to these transformations
if it is unchanged when the polynomials are subjected to all the
transformations of the set.
The relation of the example considered above to the subject of
geometric invariants becomes obvious when we notice that the algebraic
transformations (2) may be regarded as expressing the displacements
of plane figures in their plane when (x, y) are rectangular
coordinates of points in the plane. If now we consider, not the polynomials
(1), but the two lines determined by setting them equal to
zero, we have to deal with the displacements of these two lines.
The invariants (5) have themselves no geometric significance, but by
equating them to zero, we get the necessary and sufficient conditions
that the two lines be respectively parallel and perpendicular, and
these, as we noticed above, are invariant properties with regard to
displacements. Finally we may notice that the ratio of the two invariants
(5) gives the tangent of the angle between the lines, a
geometric invariant.
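The formulas (4) can be checked numerically. The sketch below is only an illustration under stated assumptions: the function names `transform_line` and `invariants` and the sample numbers are chosen here and do not come from the text. The new coefficients are obtained by solving (2) for x, y and substituting in Ax + By + C, which gives A' = A cos θ + B sin θ, B' = −A sin θ + B cos θ, C' = C − A'α − B'β.

```python
import math


def transform_line(A, B, C, theta, alpha, beta):
    """Coefficients of A x + B y + C when rewritten in the variables x', y' of (2)."""
    c, s = math.cos(theta), math.sin(theta)
    A_new = A * c + B * s
    B_new = -A * s + B * c
    C_new = C - A_new * alpha - B_new * beta
    return A_new, B_new, C_new


def invariants(p, q):
    """The two expressions (5) for the polynomials p = (A1, B1, C1), q = (A2, B2, C2)."""
    (A1, B1, _), (A2, B2, _) = p, q
    return A1 * B2 - A2 * B1, A1 * A2 + B1 * B2


# Sample data (assumed for illustration).
p, q = (2.0, -1.0, 3.0), (0.5, 4.0, -7.0)
theta, alpha, beta = 0.7, 1.5, -2.25

before = invariants(p, q)
after = invariants(transform_line(*p, theta, alpha, beta),
                   transform_line(*q, theta, alpha, beta))
print(before, after)
```

The two printed pairs agree up to rounding, and their ratio is the tangent of the angle between the two lines.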
As a second example, let us consider, not two lines, but a line and
a point. Algebraically this means that we start with the system
\[ (6) \qquad \begin{cases} Ax + By + C, \\ (x_1, y_1), \end{cases} \]
consisting of a polynomial and a pair of variables. We shall wish
to demand here that whenever the variables (x, y) are subjected to a
transformation, the variables $(x_1, y_1)$ be subjected to the same transformation,
or as we say according to Definition 3 below, that (x, y)
and $(x_1, y_1)$ be cogredient variables. If we subject the system (6)
to any transformation of the set (2), we get a new system
\[ (7) \qquad \begin{cases} A'x' + B'y' + C', \\ (x_1', y_1'), \end{cases} \]
and it is readily seen that
\[ A'x_1' + B'y_1' + C' = Ax_1 + By_1 + C. \]
Accordingly we shall call $Ax_1 + By_1 + C$ a covariant of the system (6)
according to Definition 4 below. This covariant has also no direct
geometric meaning, but its vanishing gives the necessary and sufficient
condition for an invariant property, namely, that the point
$(x_1, y_1)$ lie on the line Ax + By + C = 0.
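The equality of the two values of this covariant can be verified numerically. In the sketch below the names `transform_line` and `displace` and all sample numbers are assumptions made for illustration; the point is carried along by the same transformation (2) as the variables, which is what cogredience requires.

```python
import math


def transform_line(A, B, C, theta, alpha, beta):
    """Coefficients of A x + B y + C when rewritten in the variables x', y' of (2)."""
    c, s = math.cos(theta), math.sin(theta)
    A_new = A * c + B * s
    B_new = -A * s + B * c
    C_new = C - A_new * alpha - B_new * beta
    return A_new, B_new, C_new


def displace(x, y, theta, alpha, beta):
    """Image of the point (x, y) under the transformation (2)."""
    c, s = math.cos(theta), math.sin(theta)
    return x * c + y * s + alpha, -x * s + y * c + beta


# Sample data (assumed for illustration).
A, B, C = 2.0, -1.0, 3.0
x1, y1 = 4.0, 0.5
theta, alpha, beta = 0.7, 1.5, -2.25

A_new, B_new, C_new = transform_line(A, B, C, theta, alpha, beta)
x1_new, y1_new = displace(x1, y1, theta, alpha, beta)

value_before = A * x1 + B * y1 + C
value_after = A_new * x1_new + B_new * y1_new + C_new
print(value_before, value_after)
```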
In the light of this example we may lay down the following general
definitions:
Definition 3. If we have several sets of variables
\[ (x, y, z, \cdots), \quad (x_1, y_1, z_1, \cdots), \quad (x_2, y_2, z_2, \cdots), \quad \cdots \]
and agree that whenever one of these sets is subjected to a transformation
every other set shall be subjected to the same transformation,
then we say that we have sets of cogredient variables.
Definition 4. If we have a system consisting of a number of polynomials
in (x, y, z, ⋯) and of a number of sets of variables cogredient
to (x, y, z, ⋯), then any function of the coefficients of the
polynomials and of the cogredient variables which is unchanged when
the variables (x, y, z, ⋯) are subjected to all the transformations of a
certain set is called a covariant (or more accurately an absolute covariant)
of this system with regard to the transformations of this set.
It will be seen that invariants may be regarded as special cases of
covariants.
Among the geometric invariants there are some which from their
nature are necessarily integers, and which we will speak of as arithmetical
invariants. An example would be the number of vertices of
a polygon if our set of transformations was either the group of displacements
or the group of nonsingular collineations. Another example
is the largest number of real points in which an algebraic
curve can be cut by a line, if our set of transformations is the group
of real nonsingular collineations.
These arithmetical invariants also play, as we shall see, an important
part in algebra. We mention here as an example the degree of an
n-ary form, which is an invariant with regard to all nonsingular linear
transformations.*
* It is, in fact, an invariant with regard to all linear transformations except the
one in which all the coefficients of the transformation are zero.
EXERCISES
1. Prove that
\[ (x_2 - x_1)^2 + (y_2 - y_1)^2 \quad \text{and} \quad \begin{vmatrix} x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \\ x_3 & y_3 & 1 \end{vmatrix} \]
are covariants of the system
\[ (x_1, y_1), \quad (x_2, y_2), \quad (x_3, y_3) \]
with regard to the transformations (2).
2. Prove that $A + C$ and $B^2 - AC$
are invariants of the polynomial
\[ Ax^2 + 2Bxy + Cy^2 + 2Dx + 2Ey + F \]
with regard to the transformations (2).
What geometric meaning can be attached to these invariants?
3. Prove that $A^2 + B^2$ is an invariant of the polynomial
\[ Ax + By + C \]
with regard to the transformations (2).
Hence show that
\[ \frac{Ax_1 + By_1 + C}{\sqrt{A^2 + B^2}} \]
is a covariant of the system (6). Note its geometric meaning.
29. Equivalence.
Definition 1. If A and B denote two geometric configurations
or two algebraic expressions, or sets of expressions, then A and B shall
be said to be equivalent with regard to a certain set of transformations
when, and only when, there exists a transformation of the set which carries
over A into B and also a transformation of the set which carries over
B into A.
To illustrate this definition we notice that the conception of equivalence
of geometric figures with regard to displacements is identical
with the Euclidean conception of the equality or congruence of figures.
Again, we see from Theorem 2, § 24, that on a straight line two
sets of three points each are always equivalent with regard to nonsingular
projective transformations.
In both of the illustrations just mentioned the set of transformations
forms a group. In such cases the condition for equivalence can
be decidedly simplified, for the transformation which carries A into
B has an inverse belonging to the set, and this inverse necessarily
carries B into A. Thus we have the
Theorem. A necessary and sufficient condition for the equivalence
of A and B with regard to a group of transformations is that a
transformation of the group carry over A into B.
This theorem will be of great importance, as the question of
equivalence will present itself to us only when the set of transformations
we are considering forms a group.
Let us consider, for the sake of greater definiteness, a group of
geometric transformations. If two geometric configurations are
equivalent with regard to this group, every invariant of the first
configuration must be equal to the corresponding invariant of the
second. Thus, for instance, if two triangles are equivalent with regard
to the group of displacements, all the sides and angles of the
first will be equal to the corresponding sides and angles of the second.
The same will be true of the altitudes, lengths of the medial lines,
radius of the inscribed circle, etc., all of these being invariants.
Now one of the first problems in geometry is to pick out from among
these invariants of the triangle as small a number as possible whose
equality for two triangles insures the equivalence of the triangles.
This can be done, for instance, by taking two sides and the included
angle, or two angles and the included side, or three sides. Any one
of these three sets of elements may be called a complete system of invariants
for a triangle with regard to the group of displacements, since two
triangles having these invariants in common are equivalent and
therefore have all other invariants in common. The conception we
have here illustrated may be defined in general terms as follows:
Definition 2. A set of invariants of a geometric configuration or an
algebraic expression are said to form a complete system of invariants if
two configurations or expressions having these invariants in common are
necessarily equivalent.*
It will be seen from this definition that all the invariants of a
geometric configuration or of an algebraic expression are uniquely
determined by any complete system of invariants.
Finally we will glance at an application to matrices of the ideas
of invariants and equivalence. Let us consider matrices of the nth
order,† and consider transformations of the following form which
transform the matrix A into the matrix B:
\[ (1) \qquad aAb = B, \]
where a and b are any nonsingular matrices of the nth order. This
transformation may be denoted by the symbol (a, b), and these symbols
must obviously be combined by the formula
\[ (a_2, b_2)(a_1, b_1) = (a_2 a_1, \; b_1 b_2). \]
By means of this formula it may readily be shown that these transformations
form a group.
According to our general definition of equivalence, two matrices
A and B must therefore be said to be equivalent when, and only
when, two nonsingular matrices a and b exist which satisfy (1).
That this definition of equivalence amounts to the same thing as our
earlier definition is seen by a reference to Exercise 1, § 25.
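As a small check that a transformation of the form (1) leaves the rank unchanged, the following Python sketch uses exact rational arithmetic. The helper names `matmul` and `rank` and the sample matrices are assumptions introduced for illustration only.

```python
from fractions import Fraction


def matmul(p, q):
    """Product of two matrices given as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]


def rank(m):
    """Rank by Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in m]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


A = [[1, 2, 3], [2, 4, 6], [0, 1, 1]]   # rank 2: second row is twice the first
a = [[1, 1, 0], [0, 1, 0], [2, 0, 1]]   # nonsingular
b = [[2, 0, 1], [1, 1, 0], [0, 3, 1]]   # nonsingular

B = matmul(matmul(a, A), b)             # the transformation (a, b): A -> aAb
print(rank(A), rank(B))
```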
* In the classical theory of algebraic invariants this term is used in a different and
much more restricted sense. There we have to deal with integral rational relative invariants
(cf. § 31). By a complete system of such invariants of a system of algebraic forms
is there understood a set of such invariants in terms of which every invariant of this sort
of the system of forms can be expressed integrally and rationally. Cf. for instance
Clebsch, Binäre Formen, p. 109.
† We may, if we choose, confine our attention throughout to matrices with real
elements.
30. The Rank of a System of Points or a System of Linear Forms
as an Invariant. Let $(x_1, y_1, z_1, t_1)$, $(x_2, y_2, z_2, t_2)$, $(x_3, y_3, z_3, t_3)$ be
any three distinct collinear points, so that the rank of the matrix
\[ \begin{Vmatrix} x_1 & y_1 & z_1 & t_1 \\ x_2 & y_2 & z_2 & t_2 \\ x_3 & y_3 & z_3 & t_3 \end{Vmatrix} \]
is two. Now subject space to a nonsingular collineation and we get
three new points which will also be distinct and collinear, and hence the
rank of their matrix will also be two. Thus we see that in this special
case the rank of the system of points is unchanged by a nonsingular
collineation.
Again, let
\[ \begin{aligned} a_1 x + b_1 y + c_1 z + d_1 t &= 0, \\ a_2 x + b_2 y + c_2 z + d_2 t &= 0, \\ a_3 x + b_3 y + c_3 z + d_3 t &= 0, \\ a_4 x + b_4 y + c_4 z + d_4 t &= 0 \end{aligned} \]
be any four planes which have one, and only one, point in common,
so that the rank of their matrix is three. After a nonsingular collineation
we have four new planes which will also have one, and only
one, point in common, and hence the rank of the matrix of their
coefficients will be three. The rank of this system of planes is
therefore unchanged by such a transformation.
We proceed now to generalize these facts.
Theorem 1. The rank of the matrix of m points
\[ (x_1^{[i]}, x_2^{[i]}, \cdots x_n^{[i]}) \qquad (i = 1, 2, \cdots m) \]
is an invariant with regard to nonsingular linear transformations.
Let
\[ (1) \qquad \begin{cases} X_1 = c_{11}x_1 + \cdots + c_{1n}x_n, \\ \quad \cdots \\ X_n = c_{n1}x_1 + \cdots + c_{nn}x_n \end{cases} \]
be a nonsingular linear transformation which carries the points
$(x_1^{[i]}, \cdots x_n^{[i]})$ over into the points $(X_1^{[i]}, \cdots X_n^{[i]})$. Now suppose any k
of the points $(x_1^{[i]}, \cdots x_n^{[i]})$, which for convenience we will take as the
first k, are linearly dependent. Then there exist k constants
$(c_1, \cdots c_k)$ not all zero, such that
\[ (2) \qquad c_1 x_j^{[1]} + c_2 x_j^{[2]} + \cdots + c_k x_j^{[k]} = 0 \qquad (j = 1, 2, \cdots n). \]
By means of the transformation (1) we have
\[ X_j^{[i]} = c_{j1}x_1^{[i]} + \cdots + c_{jn}x_n^{[i]}; \]
hence
\[ c_1 X_j^{[1]} + c_2 X_j^{[2]} + \cdots + c_k X_j^{[k]} = c_{j1}\bigl(c_1 x_1^{[1]} + c_2 x_1^{[2]} + \cdots + c_k x_1^{[k]}\bigr) + \cdots + c_{jn}\bigl(c_1 x_n^{[1]} + c_2 x_n^{[2]} + \cdots + c_k x_n^{[k]}\bigr) \qquad (j = 1, 2, \cdots n). \]
Since this vanishes on account of (2), the first k of the points
$(X_1^{[i]}, \cdots X_n^{[i]})$ are linearly dependent. Since (1) is a nonsingular
transformation, it is immaterial which set of points we consider as the
initial set. Thus we have shown that if any k points of either set
are linearly dependent, the corresponding k points of the other set
will be, also.
Now if the rank of the matrix of the x's is r, at least one set of r
of the x-points is linearly independent, but every set of (r + 1) of
them is linearly dependent. Consequently the same is true for the
X-points, and therefore their matrix must also be of rank r.
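The key step of the proof, that the same constants express the dependence before and after the transformation, can be seen in a small numerical case. The sketch below is illustrative only; the names `apply_transformation` and `combine`, the matrix c, and the points are assumptions chosen here.

```python
def apply_transformation(c, point):
    """Image X of the point x under X_j = c_j1 x_1 + ... + c_jn x_n."""
    return [sum(c[j][l] * point[l] for l in range(len(point)))
            for j in range(len(c))]


def combine(constants, points):
    """The combination c_1 x^[1] + ... + c_k x^[k], coordinate by coordinate."""
    n = len(points[0])
    return [sum(k * p[j] for k, p in zip(constants, points)) for j in range(n)]


c = [[1, 2, 0], [0, 1, 1], [1, 0, 1]]           # determinant 3, hence nonsingular
x_points = [[1, 0, 2], [3, 1, 1], [-1, -1, 3]]  # third point = 2 * first - second
constants = [2, -1, -1]                         # so 2 x^[1] - x^[2] - x^[3] = 0

X_points = [apply_transformation(c, p) for p in x_points]

relation_x = combine(constants, x_points)
relation_X = combine(constants, X_points)
print(relation_x, relation_X)
```

Both combinations vanish, so the transformed points satisfy the same linear relation as the original ones.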
Theorem 2. The rank of the matrix of m linear forms
\[ f_i(x_1, \cdots x_n) \equiv a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_n \qquad (i = 1, 2, \cdots m) \]
is an invariant with regard to nonsingular linear transformations.
The proof of this theorem, which is very similar to the proof of
Theorem 1, we leave to the reader.
It will be noticed that the invariants we have considered in this
section are examples of what we have called arithmetical invariants.
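31. Relative Invariants and Covariants. We will begin by considering
a system of n linear forms in n variables
\[ (1) \qquad \begin{cases} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n, \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n, \\ \quad \cdots \\ a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n. \end{cases} \]
Definition 1. The determinant
\[ \begin{vmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{vmatrix} \]
is called the resultant of the system (1).
Let us now subject the system (1) to the linear transformation
\[ (2) \qquad \begin{cases} x_1 = c_{11}x_1' + \cdots + c_{1n}x_n', \\ \quad \cdots \\ x_n = c_{n1}x_1' + \cdots + c_{nn}x_n'. \end{cases} \]
This gives the new system of forms
\[ (3) \qquad \begin{cases} a_{11}'x_1' + \cdots + a_{1n}'x_n', \\ \quad \cdots \\ a_{n1}'x_1' + \cdots + a_{nn}'x_n', \end{cases} \]
where
\[ a_{ij}' = a_{i1}c_{1j} + a_{i2}c_{2j} + \cdots + a_{in}c_{nj}. \]
From these formulae and the law of multiplication of matrices we
infer that
\[ (4) \qquad \begin{Vmatrix} a_{11}' & \cdots & a_{1n}' \\ \vdots & & \vdots \\ a_{n1}' & \cdots & a_{nn}' \end{Vmatrix} = \begin{Vmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{Vmatrix} \cdot \begin{Vmatrix} c_{11} & \cdots & c_{1n} \\ \vdots & & \vdots \\ c_{n1} & \cdots & c_{nn} \end{Vmatrix}. \]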
This result we state as follows:
Theorem 1. If a system of n linear forms in n variables with
matrix a is subjected to a linear transformation with matrix c, the resulting
system has as its matrix ac.
Taking the determinants of both sides of (4), we see that the resultant
of (1) is not an absolute invariant. It is, however, changed
in only a very simple manner by a linear transformation, namely, by
being multiplied by the determinant of the transformation. This
leads us to the following definition:
Definition 2. A rational function* of the coefficients of a form or
system of forms which, when these forms are subjected to any nonsingular
linear transformation, is merely multiplied by the μth power
(μ an integer†) of the determinant of the transformation is called a relative
invariant of weight μ of the form or system of forms.‡ The forms
themselves are called the ground forms.
* Besides these rational invariants we may also consider irrational ones (cf. § 90),
in which case the exponent μ will not necessarily be an integer.
† The condition that μ be an integer need not be included as a part of our
hypothesis, since it may be proved. The proof that μ cannot be a fraction is simple.
The proof that μ cannot be irrational or imaginary would take us outside of the domain
of algebra.
‡ From this definition it is clear that every relative invariant is an absolute invariant
with regard to the group of linear transformations of determinant +1. Cf. Exercise 7, § 81.
It will be seen that absolute invariants are simply relative invariants
of weight zero.
The fact pointed out above concerning the resultant may now be
stated in the following form:
Theorem 2. The resultant of a set of n linear forms in n variables
is a relative invariant of weight 1.
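A numerical instance may make Theorems 1 and 2 concrete. The following Python sketch is illustrative only: the helper names `det` and `matmul` and the matrices a and c are assumptions chosen here. It forms the matrix ac, checks that the forms take the same values before and after the substitution, and compares the resultants.

```python
def det(m):
    """Determinant by expansion along the first row (adequate for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def matmul(p, q):
    """Product of two matrices given as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]


a = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]   # matrix of the forms, resultant 18
c = [[1, 2, 0], [0, 1, 1], [1, 0, 1]]   # matrix of the transformation, determinant 3
new_a = matmul(a, c)                    # matrix of the transformed forms (Theorem 1)

# The forms take the same values at x = c x' as the new forms take at x'.
x_new = [3, -2, 5]
x_old = [sum(c[i][j] * x_new[j] for j in range(3)) for i in range(3)]
values_old = [sum(a[i][j] * x_old[j] for j in range(3)) for i in range(3)]
values_new = [sum(new_a[i][j] * x_new[j] for j in range(3)) for i in range(3)]

print(det(a), det(c), det(new_a))
```

The new resultant is the old one multiplied by the first power of the determinant of the transformation, which is what weight 1 means.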
We pass on now to relative covariants :
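Definition 3. If we have a system consisting of a number of n-ary
forms and of a number of points $(y_1, \cdots y_n)$, $(z_1, \cdots z_n)$, ⋯, the coordinates
of each of which are cogredient with the variables $(x_1, \cdots x_n)$ of
the forms, then any rational function of the coefficients of the forms and
the coordinates of the points which is merely multiplied by the μth power
(μ an integer) of the determinant of the transformation when the x's are
subjected to any nonsingular linear transformation is called a relative
covariant of weight μ of the system of forms and points.*
* In most books where the subject of covariants is treated, the same letters
$(x_1, \cdots x_n)$ are used for one of the points as for the variables of the forms. There is
no objection to this, and it is sometimes convenient. We prefer to use a notation
which shall make it perfectly clear that the variables of the forms have no connection
with the coordinates of the points except that they are cogredient with them.
We may regard an invariant as the extreme case of a covariant
where the number of points is zero. The other extreme case is that
in which the number of forms is zero. Here we have the theorem:
Theorem 3. The determinant
\[ \begin{vmatrix} x_1^{[1]} & \cdots & x_n^{[1]} \\ \vdots & & \vdots \\ x_1^{[n]} & \cdots & x_n^{[n]} \end{vmatrix} \]
is a relative covariant of weight −1 of the system of points
\[ (x_1^{[1]}, \cdots x_n^{[1]}), \quad (x_1^{[2]}, \cdots x_n^{[2]}), \quad \cdots \quad (x_1^{[n]}, \cdots x_n^{[n]}). \]
For applying the transformation
\[ \begin{cases} x_1 = c_{11}X_1 + \cdots + c_{1n}X_n, \\ \quad \cdots \\ x_n = c_{n1}X_1 + \cdots + c_{nn}X_n, \end{cases} \]
we have
\[ \begin{vmatrix} x_1^{[1]} & \cdots & x_n^{[1]} \\ \vdots & & \vdots \\ x_1^{[n]} & \cdots & x_n^{[n]} \end{vmatrix} = \begin{vmatrix} c_{11}X_1^{[1]} + \cdots + c_{1n}X_n^{[1]} & \cdots & c_{n1}X_1^{[1]} + \cdots + c_{nn}X_n^{[1]} \\ \vdots & & \vdots \\ c_{11}X_1^{[n]} + \cdots + c_{1n}X_n^{[n]} & \cdots & c_{n1}X_1^{[n]} + \cdots + c_{nn}X_n^{[n]} \end{vmatrix} = \begin{vmatrix} X_1^{[1]} & \cdots & X_n^{[1]} \\ \vdots & & \vdots \\ X_1^{[n]} & \cdots & X_n^{[n]} \end{vmatrix} \cdot \begin{vmatrix} c_{11} & \cdots & c_{n1} \\ \vdots & & \vdots \\ c_{1n} & \cdots & c_{nn} \end{vmatrix}. \]
Or
\[ \begin{vmatrix} X_1^{[1]} & \cdots & X_n^{[1]} \\ \vdots & & \vdots \\ X_1^{[n]} & \cdots & X_n^{[n]} \end{vmatrix} = \begin{vmatrix} c_{11} & \cdots & c_{1n} \\ \vdots & & \vdots \\ c_{n1} & \cdots & c_{nn} \end{vmatrix}^{-1} \begin{vmatrix} x_1^{[1]} & \cdots & x_n^{[1]} \\ \vdots & & \vdots \\ x_1^{[n]} & \cdots & x_n^{[n]} \end{vmatrix}, \]
as was to be proved.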
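The weight −1 can be observed in a small case. The Python sketch below is an illustration under stated assumptions: the helper name `det`, the matrix c, and the sample points are chosen here. The points are given first in the new coordinates X, and the old coordinates are computed from x = cX.

```python
from fractions import Fraction


def det(m):
    """Determinant by expansion along the first row (adequate for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


c = [[1, 2, 0], [0, 1, 1], [1, 0, 1]]          # transformation x = cX, determinant 3
X_points = [[1, 0, 2], [0, 1, 1], [2, 1, 0]]   # each row is a point in the new coordinates

# Old coordinates of each point: x_i = c_i1 X_1 + ... + c_in X_n.
x_points = [[sum(c[i][l] * P[l] for l in range(3)) for i in range(3)]
            for P in X_points]

det_x, det_X, det_c = det(x_points), det(X_points), det(c)

# Weight -1: the new determinant is the old one times (det c)^(-1).
weight_minus_one = det_X == Fraction(det_x, det_c)
print(det_x, det_X, det_c, weight_minus_one)
```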
Another extremely simple case arises when we have a single
form and a single point:
Theorem 4. The system consisting of the form $f(x_1, \cdots x_n)$ and
the point $(y_1, \cdots y_n)$ has as an absolute covariant with regard to linear
transformations
\[ f(y_1, \cdots y_n). \]
For let us denote f more explicitly as
\[ f(a_1, a_2, \cdots ; x_1, \cdots x_n), \]
where $a_1, a_2, \cdots$ are the coefficients of f. If the coefficients after the
transformation are $a_1', a_2', \cdots$, we have
\[ f(a_1', a_2', \cdots ; x_1', \cdots x_n') \equiv f(a_1, a_2, \cdots ; x_1, \cdots x_n). \]
This being true for all values of the x's, will be true if the x's are
replaced by the y's. But when this is done, the x''s will be replaced
by the y''s, since the x's and y's are cogredient. Accordingly
\[ f(a_1', a_2', \cdots ; y_1', \cdots y_n') = f(a_1, a_2, \cdots ; y_1, \cdots y_n), \]
as was to be proved.
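For a quadratic form the statement of Theorem 4 can be checked directly. The sketch below is illustrative only: the form is represented by an assumed symmetric matrix m, so that f(x) is the sum of $m_{ij}x_ix_j$, and the names `form_value` and `transformed_matrix` are chosen here. Under x = cx' the new coefficient matrix is the transpose of c times m times c.

```python
def form_value(m, point):
    """Value of the quadratic form with symmetric matrix m at the given point."""
    n = len(point)
    return sum(m[i][j] * point[i] * point[j] for i in range(n) for j in range(n))


def transformed_matrix(m, c):
    """Matrix c^T m c of the form after the substitution x = c x'."""
    n = len(m)
    return [[sum(c[k][i] * m[k][l] * c[l][j] for k in range(n) for l in range(n))
             for j in range(n)] for i in range(n)]


m = [[1, 2], [2, -3]]      # f = x1^2 + 4 x1 x2 - 3 x2^2
c = [[2, 1], [1, 1]]       # a nonsingular transformation x = c x'
m_new = transformed_matrix(m, c)

y_new = [3, -4]            # the point in the new coordinates
y_old = [sum(c[i][j] * y_new[j] for j in range(2)) for i in range(2)]  # cogredient with x

value_old = form_value(m, y_old)
value_new = form_value(m_new, y_new)
print(value_old, value_new)
```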
The three examples of invariants and covariants which have been
given in this section are all polynomials in the coefficients of the
forms and the coordinates of the points. Such invariants we shall
speak of as integral rational invariants and covariants.*
Theorem 5. The weight of an integral rational invariant cannot
be negative.†
Let $a_1, a_2, \cdots ; b_1, b_2, \cdots ; \cdots$ be the coefficients of the system of
forms, and let $c_{ij}$ be the coefficients of the transformation. It is clear
that the coefficients $a_1', a_2', \cdots ; b_1', b_2', \cdots ; \cdots$ after the transformation are
polynomials in the a's, b's, etc., and in the $c_{ij}$'s. Now let I be an
integral rational invariant of weight μ,
\[ I(a_1', a_2', \cdots ; b_1', b_2', \cdots ; \cdots) = c^{\mu} I(a_1, a_2, \cdots ; b_1, b_2, \cdots ; \cdots), \]
where c is the determinant of the transformation. Suppose now
that μ were negative, μ = −ν. Then
\[ (5) \qquad c^{\nu} I(a_1', \cdots ; b_1', \cdots ; \cdots) = I(a_1, \cdots ; b_1, \cdots ; \cdots). \]
This equality, like the preceding one, is known to hold for all
values of the $c_{ij}$'s for which c ≠ 0. Hence, since the expressions on
both sides of the equality are polynomials in the a's, b's, ⋯ and the
$c_{ij}$'s, we infer, by an application of Theorem 5, § 2, that we really
have an identity.
Let us now assign to the a's, b's, ⋯ any constant values such that
$I(a_1, \cdots ; b_1, \cdots ; \cdots) \neq 0$. Then $I(a_1', \cdots ; b_1', \cdots ; \cdots)$ will be a polynomial
in the $c_{ij}$'s alone, which, from (5), cannot be identically zero.
The identity (5) thus takes a form which states that the product of
two polynomials in the $c_{ij}$'s is a constant, and since the first of these
polynomials, $c^{\nu}$, is of higher degree than zero, this is impossible.
We will agree in future to understand by the terms invariant and
covariant, invariants or covariants (absolute, relative, or arithmetical)
with regard to all nonsingular linear transformations. If we wish to
consider invariants or covariants with regard to other sets of transformations,
for instance with regard to real linear transformations,
this fact will be explicitly mentioned.
* All rational invariants and covariants may be formed as the quotients of such as
are integral and rational; cf. Exercises 4, 5, § 78.
† It cannot be zero either; cf. Theorem 5, § 79.
Finally, let us note the geometric meaning to be associated with
the invariants and covariants which have been mentioned in this
section. We confine our attention to the case of four variables.
The vanishing of the resultant of four linear forms gives a necessary
and sufficient condition that the four planes determined by setting
the forms equal to zero meet in a point. The vanishing of the covariant
of Theorem 3 is a necessary and sufficient condition that the
four points lie in a plane. The vanishing of the covariant of
Theorem 4 is a necessary and sufficient condition that the point
$(y_1, y_2, y_3, y_4)$ lie on the surface f = 0. It will be seen that in all
cases we are thus led to a projective property; cf. §§ 80, 81.
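32. Some Theorems Concerning Linear Forms.
Theorem 1. Two systems of n linear forms in n variables are
equivalent with regard to nonsingular linear transformations if neither
resultant is zero.
Let
\[ (1) \quad \begin{cases} a_{11}x_1 + \cdots + a_{1n}x_n \\ \quad \cdots \\ a_{n1}x_1 + \cdots + a_{nn}x_n \end{cases} \qquad\qquad (2) \quad \begin{cases} b_{11}x_1 + \cdots + b_{1n}x_n \\ \quad \cdots \\ b_{n1}x_1 + \cdots + b_{nn}x_n \end{cases} \]
be the two systems, whose resultants,
\[ a = \begin{vmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{vmatrix}, \qquad b = \begin{vmatrix} b_{11} & \cdots & b_{1n} \\ \vdots & & \vdots \\ b_{n1} & \cdots & b_{nn} \end{vmatrix}, \]
are, by hypothesis, not zero. Applying the transformations
\[ \begin{cases} x_1' = a_{11}x_1 + \cdots + a_{1n}x_n \\ \quad \cdots \\ x_n' = a_{n1}x_1 + \cdots + a_{nn}x_n \end{cases} \qquad\qquad \begin{cases} x_1' = b_{11}x_1 + \cdots + b_{1n}x_n \\ \quad \cdots \\ x_n' = b_{n1}x_1 + \cdots + b_{nn}x_n \end{cases} \]
to (1) and (2) respectively, they are both reduced to the normal form
\[ (3) \qquad x_1', \; x_2', \; \cdots \; x_n'. \]
Now, since neither a nor b is zero, the transformations a and b have
inverses, which when applied to (3) carry it back into (1) and (2)
respectively. Hence the transformation $b^{-1}a$ carries (1) into (2).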
Theorem 2. A system of n linear forms in n variables has no
integral rational invariants * other than constant multiples of powers
of the resultant.
Let (1) be the given system and a its resultant, and let e be the
determinant of a nonsingular linear transformation which carries
(l)overinto ^a\,x[+ ... + a[^x'„,
(4)
. *ni^i "T" ■ • ■ + "^nn^n
If we call the resultant of (4) a', we have
a' = ac.
Let /(an, • • • «„„) be any integral rational invariant of the system
(1) of weight fi, and write
J'=J(ali, ...<„).
Then /' = c^I.
Now let us assume for a moment that $a \neq 0$, and consider the special
transformation which carries over (1) into the normal form (3). In
this special case we have $a' = 1$; hence, as may also be seen directly,
$ac = 1$. Calling the constant value which $I'$ has in this particular
case $k$, we have
$$k = a^{-\mu} I,$$
or
$$(5)\quad I = k a^{\mu}.$$
This equality, in which $k$ is independent of the coefficients $a_{ij}$, has
been established so far merely for values of the $a_{ij}$'s for which $a \neq 0$.
Since $\mu$ is not negative (cf. Theorem 5, § 31), we can now infer
that (5) is an identity, by making use of Theorem 5, § 2. Thus we
see that $I$ is merely a constant multiple of a power of the resultant,
as was to be proved.
COROLLARY. A system of m linear forms in n variables has no
integral rational invariants (other than constants) when m < n.
For such an invariant would also be an integral rational invariant
of the system of n linear forms obtained by adding n − m new forms to
the given system; and hence it would be a constant multiple of a
* It has the arithmetical invariant mentioned in Theorem 2, § 30.
power of the resultant of this system. This power must be zero,
and hence the invariant must be a mere constant, as otherwise it
would involve the coefficients of the added forms, and hence would
not be an invariant of the system of original forms.
EXERCISES
1. Prove that if we have two systems of n + 1 linear forms in n variables whose
matrices are both of rank n, a necessary and sufficient condition that these two
systems be equivalent with regard to nonsingular linear transformation is that
the resultants of the forms of one set taken n at a time be proportional to the
resultants of the corresponding forms of the other set.
2. Generalize the preceding theorem.
3. Prove that every integral rational invariant of a system of m linear forms in
n variables (m > n) is a homogeneous polynomial in the resultants of these forms
taken n at a time.
4. State and prove the theorems analogous to the theorems of the present
section, including the three preceding exercises, when the system of linear forms is
replaced by a system of points.
33. Crossratio and Harmonic Division. Let us consider any four
distinct points on a line
$$(1)\quad (x_1, t_1),\ (x_2, t_2),\ (x_3, t_3),\ (x_4, t_4).$$
We have seen, in § 31, Theorem 3, that each of the six determinants
$$(2)\quad \begin{matrix} x_1t_2 - x_2t_1, & x_1t_3 - x_3t_1, & x_1t_4 - x_4t_1, \\ x_3t_4 - x_4t_3, & x_4t_2 - x_2t_4, & x_2t_3 - x_3t_2 \end{matrix}$$
is a covariant of weight — 1. The ratio of two of these determinants
is therefore an absolute covariant, and we might be tempted, by
analogy with the examples of absolute covariants in Exercise 1, § 28,
to expect that it might have a geometric meaning. It will be readily
seen, however, that this is not the case, for the value of the ratio of
two of the determinants (2) will be changed if the two coordinates
of one of the four points are multiplied by the same constant, and
this does not affect the position of the points.
It is easy, however, to avoid this state of affairs by forming such
an expression as the following:*
$$(3)\quad (1, 2, 3, 4) = \frac{(x_1t_2 - x_2t_1)(x_3t_4 - x_4t_3)}{(x_2t_3 - x_3t_2)(x_4t_1 - x_1t_4)},$$
* The reversal of sign of the second factor in the denominator is not essential, but
is customary for a reason which will presently be evident.
which is also an absolute covariant of the four points (1), and is called
their crossratio or anharmonic ratio. More accurately it is called the
crossratio of these four points when taken in the order written in (1).*
In order to determine the geometric meaning of the crossratio
of four points, let us first suppose the four points to be finite so that
$t_1t_2t_3t_4 \neq 0$. Dividing numerator and denominator of (1, 2, 3, 4) by
this product, we find the following expression for the crossratio in
terms of the nonhomogeneous coordinates $X_i$ of the points,
$$(4)\quad (1, 2, 3, 4) = \frac{(X_1 - X_2)(X_3 - X_4)}{(X_2 - X_3)(X_4 - X_1)}.$$
Finally, denoting the points by $P_1, P_2, P_3, P_4$, we may write
$$(5)\quad (1, 2, 3, 4) = \frac{P_1P_2}{P_3P_2} : \frac{P_1P_4}{P_3P_4} = \frac{P_2P_1}{P_4P_1} : \frac{P_2P_3}{P_4P_3}.$$
In words, this formula tells us that the crossratio of four finite
points is the ratio of the ratio in which the second divides the first
and third and the ratio in which the fourth divides the first and
third ; and that it is also the ratio of the ratios in which the first
and third divide the second and fourth.
In this statement, it must be remembered that we have taken the
ratio in which C divides the points A, B as AC/BC, so that the ratio
is negative if C divides AB internally, positive if it divides it externally.
If we agree that the point at infinity on a line shall be said to
divide any two finite points A, B of this line in the ratio +1 (and this
is a natural convention since the more distant a point the more nearly
does it divide AB in the ratio +1) it is readily seen, by going back
to formula (3), that the first statement following (5) still holds if the
second or fourth point is at infinity, while the second statement holds
if the first or third is at infinity. Thus we have in all cases a simple
geometric interpretation of the crossratio of four distinct points.
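The behaviour of the crossratio described above can be tried out directly. The following minimal sketch implements formula (3) for points given by homogeneous coordinates $(x, t)$; the sample points, the rescaling factor and the matrix `m` are hypothetical choices made only for illustration. It checks three things: that multiplying both coordinates of one point by a constant leaves the value unchanged, that a nonsingular linear transformation of $(x, t)$ leaves it unchanged, and that a point at infinity can be used as it stands.

```python
from fractions import Fraction

def det2(p, q):
    """The determinant x_p * t_q - x_q * t_p of two points (x, t)."""
    return p[0] * q[1] - q[0] * p[1]

def cross_ratio(p1, p2, p3, p4):
    """Crossratio (1, 2, 3, 4) as in formula (3), homogeneous coordinates."""
    num = det2(p1, p2) * det2(p3, p4)
    den = det2(p2, p3) * det2(p4, p1)
    return Fraction(num, den)

def transform(m, p):
    """Apply the linear transformation with 2x2 matrix m to the point p."""
    (a, b), (c, d) = m
    return (a * p[0] + b * p[1], c * p[0] + d * p[1])

points = [(0, 1), (1, 1), (3, 1), (7, 1)]      # X = 0, 1, 3, 7
value = cross_ratio(*points)

# Same four points, with the second written as (5, 5) instead of (1, 1).
rescaled = cross_ratio((0, 1), (5, 5), (3, 1), (7, 1))

# A nonsingular projective transformation of the line.
m = [[2, 1], [1, 1]]
moved = cross_ratio(*(transform(m, p) for p in points))

# Fourth point at infinity: the value is the ratio in which the
# second point divides the first and third, here (1 - 0)/(1 - 3).
with_infinity = cross_ratio((0, 1), (1, 1), (3, 1), (1, 0))

print(value, rescaled, moved, with_infinity)
```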
The special case in which four points $P_1, P_2, P_3, P_4$ are so situated
that $(1, 2, 3, 4) = -1$ is of peculiar importance. In this case
we have
$$(1, 2, 3, 4) = (1, 4, 3, 2) = (3, 2, 1, 4) = (3, 4, 1, 2) = (2, 1, 4, 3)$$
$$= (2, 3, 4, 1) = (4, 1, 2, 3) = (4, 3, 2, 1) = -1.$$
* If these four points are taken in other orders, we get different crossratios:
(1, 2, 4, 3), (1, 4, 3, 2), etc. Cf. Exercise 1 at the end of this section.
The relation is therefore merely a relation between the two pairs of
points $P_1, P_3$ and $P_2, P_4$ taken indifferently in either order, and we
say that these two pairs of points divide each other harmonically.
From the geometric meaning of crossratio, we see that, if all four
points are finite, the pairs $P_1, P_3$ and $P_2, P_4$ divide each other
harmonically when, and only when, $P_2$ and $P_4$ divide $P_1, P_3$ internally
and externally in the same ratio; and also when, and only when, $P_1$
and $P_3$ divide $P_2, P_4$ internally and externally in the same ratio. If
$P_2$ or $P_4$ lies at infinity, the first of these statements alone has a
meaning, while if $P_1$ or $P_3$ lies at infinity, it is the second statement
to which we must confine ourselves.
It is easily seen that the case in which three of the four points,
say $P_1, P_2, P_3$, coincide, while $P_4$ is any point on the line, may be
regarded as a limiting form of two pairs of points which separate one
another harmonically. It is convenient to include this case under the
term harmonic division, and we will therefore lay down the definition:
Definition. Two pairs of points $P_1, P_3$ and $P_2, P_4$ on a line are
said to divide one another harmonically if they are distinct and their
crossratio taken in the order $P_1, P_2, P_3, P_4$ is $-1$, and also if at least
three of them coincide.
It will be seen that the property of two pairs of points dividing
each other harmonically is a projective property in space of one
dimension.
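A concrete harmonic set makes the eight equal crossratios above tangible. The sketch below uses formula (4) with the hypothetical points $X = 0, 2, 3, 6$ (chosen for illustration, not taken from the text) and evaluates the crossratio in the eight orders listed for the harmonic case. It also compares the two ratios in which the second and fourth points divide the first and third.

```python
from fractions import Fraction

def cross_ratio(x1, x2, x3, x4):
    """Crossratio (1, 2, 3, 4) of four finite points, formula (4)."""
    return Fraction((x1 - x2) * (x3 - x4), (x2 - x3) * (x4 - x1))

# Hypothetical harmonic set: P1 = 0, P2 = 2, P3 = 3, P4 = 6.
P = {1: 0, 2: 2, 3: 3, 4: 6}

orders = [(1, 2, 3, 4), (1, 4, 3, 2), (3, 2, 1, 4), (3, 4, 1, 2),
          (2, 1, 4, 3), (2, 3, 4, 1), (4, 1, 2, 3), (4, 3, 2, 1)]
values = [cross_ratio(*(P[i] for i in order)) for order in orders]

# Ratio in which C divides A, B is AC/BC (negative for internal division).
ratio_second = Fraction(P[2] - P[1], P[2] - P[3])   # P1P2 / P3P2
ratio_fourth = Fraction(P[4] - P[1], P[4] - P[3])   # P1P4 / P3P4

print(values)
print(ratio_second, ratio_fourth)
```

Here the second point divides the first and third internally and the fourth divides them externally, in ratios of equal magnitude and opposite sign.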
The most important applications of crossratio come in geometry
of two, three, or more dimensions where the points are not determined
as above by two coordinates (or one nonhomogeneous coordinate), but
by more. Suppose, for instance, we have four distinct finite points
on a line in space of three dimensions. Let the points be $P_1, P_2,
Q_1, Q_2$, and suppose the coordinates of $P_1, P_2$ are $(x_1, y_1, z_1, t_1)$ and
$(x_2, y_2, z_2, t_2)$ respectively. Then the coordinates of $Q_1, Q_2$ may be
written
$$(x_1 + \lambda x_2,\ y_1 + \lambda y_2,\ z_1 + \lambda z_2,\ t_1 + \lambda t_2), \quad (x_1 + \mu x_2,\ y_1 + \mu y_2,\ z_1 + \mu z_2,\ t_1 + \mu t_2).$$
Now, let
$$(6)\quad Ax + By + Cz + Dt = 0$$
be any plane through $Q_1$ but not through $P_2$, and we have
$$(Ax_1 + By_1 + Cz_1 + Dt_1) + \lambda(Ax_2 + By_2 + Cz_2 + Dt_2) = 0,$$
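or, since $P_2$ does not lie on (6),
$$\frac{Ax_1 + By_1 + Cz_1 + Dt_1}{Ax_2 + By_2 + Cz_2 + Dt_2} = -\lambda.$$
Changing to nonhomogeneous coordinates, we have
$$\frac{AX_1 + BY_1 + CZ_1 + D}{AX_2 + BY_2 + CZ_2 + D} = -\lambda\,\frac{t_2}{t_1}.$$
If $P_1M_1$ and $P_2M_2$ are the perpendiculars from $P_1$ and $P_2$ on the
plane (6), we have
$$\frac{P_1Q_1}{P_2Q_1} = \frac{P_1M_1}{P_2M_2} = \frac{AX_1 + BY_1 + CZ_1 + D}{AX_2 + BY_2 + CZ_2 + D} = -\lambda\,\frac{t_2}{t_1}.$$
In exactly the same way we get
$$\frac{P_1Q_2}{P_2Q_2} = -\mu\,\frac{t_2}{t_1}.$$
Consequently
$$\frac{P_1Q_1}{P_2Q_1} \Big/ \frac{P_1Q_2}{P_2Q_2} = \frac{\lambda}{\mu}.$$
This is the crossratio of the four points taken in the order $P_1, Q_1, P_2, Q_2$.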
It is readily seen that if one of the two points $Q_1$ or $Q_2$ lies at
infinity, all that is essential in the above reasoning remains valid, and
the crossratio is still $\lambda/\mu$.
The case in which one of the two points $P_1$ or $P_2$ is at infinity
may be reduced to the case just considered by writing for the
coordinates of $Q_1$ and $Q_2$, $(\xi_1, \eta_1, \zeta_1, \tau_1)$ and $(\xi_2, \eta_2, \zeta_2, \tau_2)$. The
coordinates of $P_1$ and $P_2$ are then
$$\left(\xi_1 - \tfrac{\lambda}{\mu}\xi_2,\ \eta_1 - \tfrac{\lambda}{\mu}\eta_2,\ \zeta_1 - \tfrac{\lambda}{\mu}\zeta_2,\ \tau_1 - \tfrac{\lambda}{\mu}\tau_2\right),$$
$$(\xi_1 - \xi_2,\ \eta_1 - \eta_2,\ \zeta_1 - \zeta_2,\ \tau_1 - \tau_2).$$
Accordingly, from what has just been proved, we see that the
crossratio of the four points taken in the order $Q_1, P_1, Q_2, P_2$ is $\lambda/\mu$.
But this change of order does not change the crossratio. Hence in
all cases we have the result:
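Theorem 1. The crossratio of the four distinct points
$$P_1\ (x_1, y_1, z_1, t_1), \qquad P_2\ (x_2, y_2, z_2, t_2),$$
$$Q_1\ (x_1 + \lambda x_2,\ y_1 + \lambda y_2,\ z_1 + \lambda z_2,\ t_1 + \lambda t_2),$$
$$Q_2\ (x_1 + \mu x_2,\ y_1 + \mu y_2,\ z_1 + \mu z_2,\ t_1 + \mu t_2),$$
taken in the order $P_1, Q_1, P_2, Q_2$, is $\lambda/\mu$.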
From this theorem we easily deduce the further result :
Theorem 2. The crossratio of four points on a line is invariant
with regard to nonsingular collineations of space.*
For the four points $P_1, P_2, Q_1, Q_2$ of Theorem 1 are carried over
by a nonsingular collineation into the four points
$$P_1'\ (x_1', y_1', z_1', t_1'), \qquad P_2'\ (x_2', y_2', z_2', t_2'),$$
$$Q_1'\ (x_1' + \lambda x_2',\ y_1' + \lambda y_2',\ z_1' + \lambda z_2',\ t_1' + \lambda t_2'),$$
$$Q_2'\ (x_1' + \mu x_2',\ y_1' + \mu y_2',\ z_1' + \mu z_2',\ t_1' + \mu t_2'),$$
whose crossratio, when taken in the order $P_1', Q_1', P_2', Q_2'$, is also $\lambda/\mu$.
Theorems similar to Theorems 1 and 2 hold in space of two, and in
general in space of n, dimensions and may be proved in the same way.
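Theorems 1 and 2 can be illustrated on four collinear points in space. The sketch below is a minimal check under stated assumptions: the points, the values of $\lambda$ and $\mu$, and the collineation matrix `m` are hypothetical; all four points and their images are finite; and the crossratio is computed from a single nonhomogeneous coordinate that separates the points, which is legitimate because projection onto a coordinate axis preserves the ratios of segments on a line.

```python
from fractions import Fraction

def on_line(p, q, k):
    """The point p + k*q in homogeneous coordinates."""
    return tuple(a + k * b for a, b in zip(p, q))

def apply(m, p):
    """Apply the collineation with 4x4 matrix m to the point p."""
    return tuple(sum(m[i][j] * p[j] for j in range(4)) for i in range(4))

def cross_ratio_on_line(a, b, c, d):
    """Crossratio of four finite collinear points, taken in the order given."""
    for k in range(3):
        xs = [Fraction(p[k], p[3]) for p in (a, b, c, d)]
        if len(set(xs)) == 4:          # this coordinate separates the points
            x1, x2, x3, x4 = xs
            return (x1 - x2) * (x3 - x4) / ((x2 - x3) * (x4 - x1))
    raise ValueError("points are not distinct")

P1, P2 = (1, 0, 2, 1), (3, 1, 0, 1)    # hypothetical points
lam, mu = 2, 5                          # hypothetical parameters
Q1, Q2 = on_line(P1, P2, lam), on_line(P1, P2, mu)

before = cross_ratio_on_line(P1, Q1, P2, Q2)

# A hypothetical nonsingular collineation (determinant 1).
m = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [1, 0, 0, 1]]
after = cross_ratio_on_line(*(apply(m, p) for p in (P1, Q1, P2, Q2)))

print(before, after)
```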
EXERCISES
1. Denote the six determinants (2) by
(1, 2), (1, 3), (1, 4), (3, 4), (4, 2), (2, 3),
and write
A = (1, 2)(3, 4), B = (1, 3)(4, 2), C = (1, 4)(2, 3).
Prove that six, and only six, crossratios can be formed from four points by
taking them in different orders, namely the negatives of the six ratios which can be
formed from A, B, C taken two and two.
2. Prove that $A + B + C = 0$, and hence show that if $\lambda$ is one of the
crossratios of four points, the other five will be
$$\frac{1}{\lambda}, \quad 1 - \lambda, \quad \frac{1}{1 - \lambda}, \quad \frac{\lambda - 1}{\lambda}, \quad \frac{\lambda}{\lambda - 1}.$$
* This also follows from Exercise 5, § 24.
3. Prove that the six crossratios of four distinct points are all different from
each other except in the following two cases:
(α) The case of four harmonic points, where the values of the crossratios
are $-1$, $2$, $\tfrac{1}{2}$.
(β) The case known as four equianharmonic points, in which the values of
the crossratios are $\tfrac{1}{2} \pm \tfrac{1}{2}\sqrt{-3}$.
4. Prove Theorem 2, § 24, by making use of the fact that the crossratio of four
points on a line is unchanged by nonsingular projective transformations of the
line.
5. By the crossratio of four planes which meet in a line is understood the
crossratio of the four points in which these planes are met by any line which
does not meet their line of intersection.
Justify this definition by proving that if the equations of the four planes are
$$p_1 = 0, \quad p_1 + \lambda p_2 = 0, \quad p_2 = 0, \quad p_1 + \mu p_2 = 0$$
($p_1$ and $p_2$ homogeneous linear polynomials in x, y, z, t), the crossratio of the four
points in which any line which does not meet the line of intersection of the planes
is met by the planes is $\lambda/\mu$.
6. Prove that the crossratio of four planes which meet in a line is invariant
with regard to nonsingular collineations.
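Before attempting Exercises 1 to 3, it can help to see the six values numerically. The sketch below is an experiment, not a proof: it evaluates formula (4) of this section for all twenty-four orders of four hypothetical points and collects the distinct values, once for a generic set and once for a harmonic set.

```python
from fractions import Fraction
from itertools import permutations

def cross_ratio(x1, x2, x3, x4):
    """Crossratio (1, 2, 3, 4) of four finite points, formula (4)."""
    return Fraction((x1 - x2) * (x3 - x4), (x2 - x3) * (x4 - x1))

def all_cross_ratios(points):
    """Set of distinct crossratios over all 24 orders of the four points."""
    return {cross_ratio(*order) for order in permutations(points)}

# A generic set of four points (hypothetical sample values).
generic = all_cross_ratios((0, 1, 3, 7))
lam = cross_ratio(0, 1, 3, 7)
expected = {lam, 1 / lam, 1 - lam, 1 / (1 - lam),
            (lam - 1) / lam, lam / (lam - 1)}

# A harmonic set: only three distinct values survive.
harmonic = all_cross_ratios((0, 2, 3, 6))

print(sorted(generic))
print(sorted(harmonic))
```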
34. PlaneCoordinates and Contragredient Variables. If $u_1, u_2,
u_3, u_4$ are constants, and $x_1, x_2, x_3, x_4$ are the homogeneous coordinates
of a point in space, the equation
$$(1)\quad u_1x_1 + u_2x_2 + u_3x_3 + u_4x_4 = 0$$
represents a plane. Since the values of the u's determine the position
of this plane, the u's may be regarded as coordinates of the plane.
We will speak of them as planecoordinates, just as the x's (each set
of which determines a point) are called pointcoordinates. And just
as we speak of the point $(x_1, x_2, x_3, x_4)$ so we will speak of the plane
$(u_1, u_2, u_3, u_4)$. The u's are evidently analogous to homogeneous
coordinates in that if they be all multiplied by the same constant,
the plane which they determine is not changed.
Suppose now that we consider the x's as constants and allow the
u's to vary, taking on all possible sets of values which, with the fixed
set of values of the x's, satisfy (1). This equation will now represent
a family of planes, infinite in number, each one of which is
determined by a particular set of values of the u's and all of which pass
through the fixed point $(x_1, x_2, x_3, x_4)$. The equation (1) may therefore
be regarded as the equation of a point in planecoordinates, since
it is satisfied by the coordinates of a moving plane which envelops
this point, just as when the x's vary and the u's are constant, it is
the equation of a plane in pointcoordinates, since it is satisfied by the
coordinates of a moving point whose locus is the plane.*
In the same way, a homogeneous equation of degree higher than
the first in the u's will be satisfied by the coordinates of a moving
plane which will, in general, envelop a surface. The equation will
then be called the equation of this surface in planecoordinates.†
Let us now subject space to the collineation
$$\mathbf{c}:\quad x_i' = c_{i1}x_1 + c_{i2}x_2 + c_{i3}x_3 + c_{i4}x_4 \qquad (i = 1, 2, 3, 4).$$
We will assume that the determinant $c$ of this transformation is not
zero; and we will denote the cofactors in this determinant by $C_{ij}$.
Then the inverse of the transformation c may be written
$$\mathbf{c}^{-1}:\quad x_i = \frac{C_{1i}}{c}x_1' + \frac{C_{2i}}{c}x_2' + \frac{C_{3i}}{c}x_3' + \frac{C_{4i}}{c}x_4' \qquad (i = 1, 2, 3, 4).$$
Substituting these expressions, we see that the plane (1) goes over
into
$$(2)\quad u_1'x_1' + u_2'x_2' + u_3'x_3' + u_4'x_4' = 0,$$
where
$$\mathbf{d}:\quad u_i' = \frac{C_{i1}}{c}u_1 + \frac{C_{i2}}{c}u_2 + \frac{C_{i3}}{c}u_3 + \frac{C_{i4}}{c}u_4 \qquad (i = 1, 2, 3, 4).$$
We thus see that the u's have also suffered a linear transformation,
though a different one from the x's, namely, the transformation whose
matrix is the conjugate (cf. § 7, Definition 2) of $c^{-1}$. This
transformation d of the planecoordinates is merely another way of expressing
the collineation which we have commonly expressed by the
transformation c of the pointcoordinates. The two sets of variables x and u
are called contragredient variables according to the following
Definition 1. Two sets of n variables each are called contragredient
if whenever one is subjected to a nonsingular linear transformation,
the other is subjected to the transformation which has as its matrix the
conjugate of the inverse of the matrix of the first.
* Similarly, in two dimensions, the equation
$$u_1x_1 + u_2x_2 + u_3x_3 = 0$$
represents a line in the pointcoordinates $(x_1, x_2, x_3)$ if $u_1, u_2, u_3$ are constants, or a
point in the linecoordinates $(u_1, u_2, u_3)$ if $x_1, x_2, x_3$ are constants.
† An example of this will be found in § 53.
Precisely the reasoning used above in the case of four variables
establishes here also the theorem :
Theorem.* If the two sets of contragredient variables $x_1, \dots, x_n$
and $u_1, \dots, u_n$ are carried over by a linear transformation into $x_1', \dots, x_n'$
and $u_1', \dots, u_n'$, then
$$u_1x_1 + u_2x_2 + \cdots + u_nx_n$$
will go over into $u_1'x_1' + u_2'x_2' + \cdots + u_n'x_n'$.
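The theorem on contragredient variables can be verified numerically from the cofactor formulas. The sketch below is only an illustration under stated assumptions: it takes n = 3 rather than the n = 4 of the geometric discussion, and the matrix `c` and the vectors `x`, `u` are hypothetical. The matrix `d` is built entry by entry as $C_{ij}/c$, the conjugate of the inverse of `c`.

```python
from fractions import Fraction

def cofactor(m, i, j):
    """Cofactor C_ij of the entry in row i, column j of a 3x3 matrix."""
    rows = [r for r in range(3) if r != i]
    cols = [s for s in range(3) if s != j]
    minor = (m[rows[0]][cols[0]] * m[rows[1]][cols[1]]
             - m[rows[0]][cols[1]] * m[rows[1]][cols[0]])
    return (-1) ** (i + j) * minor

def apply(m, v):
    """Apply the linear transformation with matrix m to the vector v."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

c = [[1, 2, 0], [0, 1, 3], [2, 0, 1]]           # hypothetical, nonsingular
det_c = sum(c[0][j] * cofactor(c, 0, j) for j in range(3))

# d has entries C_ij / c: the conjugate (transpose) of the inverse of c.
d = [[Fraction(cofactor(c, i, j), det_c) for j in range(3)] for i in range(3)]

x = [1, -2, 3]      # hypothetical point coordinates
u = [2, 1, -1]      # hypothetical contragredient coordinates

x_new = apply(c, x)
u_new = apply(d, u)

before = sum(ui * xi for ui, xi in zip(u, x))
after = sum(ui * xi for ui, xi in zip(u_new, x_new))
print(before, after)
```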
In connection with this subject of contragredient variables it is
customary to introduce the conception of contravariants, just as the
conception of covariants was introduced in connection with the sub
ject of cogredient variables. For this purpose we lay down the
Definition 2. If we have a system of forms in $(x_1, \dots, x_n)$ and
also a number of sets of variables, $(u_1', \dots, u_n')$, $(u_1'', \dots, u_n'')$, …,
contragredient to the x's, any rational function of the u's and the coefficients of
the forms which is unchanged by a nonsingular linear transformation of
the x's except for being multiplied by the μth power (μ an integer) of the
determinant of this transformation is called a contravariant of weight μ.
Thus the theorem that the resultant of n linear forms in n variables
is an invariant of weight 1 may, if we prefer, be stated in the form :
If we have n sets of n variables each, $(u_1', \dots, u_n'), \dots, (u_1^{(n)}, \dots, u_n^{(n)})$,
each of which is contragredient to the variables $(x_1, \dots, x_n)$, the
determinant of the u's is a contravariant of weight 1.†
It will be seen that the conception of contravariant, though
sometimes convenient, is unnecessary, since the contragredient vari
ables may always be regarded as the coefficients of linear forms, and,
when so regarded, the contravariant is merely an invariant.
Similarly, the still more general conception of mixed concomitants,
in which, besides the coefficients of forms and the contragredient
variables, certain sets of cogredient variables are involved,‡ reduces
to the familiar conception of covariants if we regard the
contragredient variables as coefficients of linear forms.
* This is really a special theorem in the theory of bilinear forms. Cf. the next
chapter.
† For other examples of contravariants in which coefficients also occur, see
Chap. XII.
‡ An example is $u_1x_1 + u_2x_2 + \cdots + u_nx_n$, the theorem above stating that this is an
absolute mixed concomitant.
35. Line Coordinates in Space. A line is determined by two
points $(y_1, y_2, y_3, y_4)$, $(z_1, z_2, z_3, z_4)$ which lie on it. It is clear that
these eight coordinates are not all necessary to determine the line;
and it will be seen presently that the following six quantities
determine the line completely, and may be used as linecoordinates,
$$p_{12},\ p_{13},\ p_{14},\ p_{34},\ p_{42},\ p_{23},$$
where
$$(1)\quad p_{ij} = \begin{vmatrix} y_i & y_j \\ z_i & z_j \end{vmatrix}.$$
In other words, the p's are the tworowed determinants of the matrix
$$\begin{Vmatrix} y_1 & y_2 & y_3 & y_4 \\ z_1 & z_2 & z_3 & z_4 \end{Vmatrix}$$
except that the sign of the determinant obtained by striking out
the first and third column has been changed. These six p's are not
all zero if, as we assume, the two points y and z are distinct.
These six p's are connected by the relation
$$(2)\quad p_{12}p_{34} + p_{13}p_{42} + p_{14}p_{23} = 0,*$$
as may be seen either directly or by expanding the vanishing
determinant
$$\begin{vmatrix} y_1 & y_2 & y_3 & y_4 \\ z_1 & z_2 & z_3 & z_4 \\ y_1 & y_2 & y_3 & y_4 \\ z_1 & z_2 & z_3 & z_4 \end{vmatrix}$$
by Laplace's method in terms of the minors of the first two rows.
That the p's may really be used as line coordinates is shown by
the following two theorems:
Theorem 1. When a line is given, its linecoordinates $p_{ij}$ are
completely determined except for an arbitrary factor different from zero by
which they may all be multiplied.
The definition (1) of the p's shows that they may all be multiplied
by an arbitrary factor different from zero without affecting
the position of the line, since the y's (and also the z's) may be
multiplied by such a factor without affecting the position of the point.
* Cf. Exercise 2, § 33.
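In order to prove our theorem it is sufficient to show that, if
instead of the two points used above for determining the p's we use
two other points of the line,
$$(Y_1, Y_2, Y_3, Y_4), \quad (Z_1, Z_2, Z_3, Z_4),$$
the linecoordinates
$$P_{ij} = \begin{vmatrix} Y_i & Y_j \\ Z_i & Z_j \end{vmatrix}$$
thus determined will be proportional to the $p_{ij}$'s. Since the points
Y and Z are collinear with the distinct points y, z, they are linearly
dependent upon them and we may write
$$Y_i = c_1y_i + c_2z_i, \qquad Z_i = k_1y_i + k_2z_i \qquad (i = 1, 2, 3, 4).$$
Accordingly
$$P_{ij} = \begin{vmatrix} c_1 & c_2 \\ k_1 & k_2 \end{vmatrix} \cdot \begin{vmatrix} y_i & y_j \\ z_i & z_j \end{vmatrix} = K p_{ij},$$
where $K \neq 0$, as Y and Z are distinct points.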
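The definition (1), the relation (2) and the proportionality just proved can all be checked on a small example. The sketch below assumes two hypothetical points `y` and `z` and two further points of the same line formed with hypothetical multipliers; the function name `line_coords` is an illustrative choice. The six coordinates are returned in the order $p_{12}, p_{13}, p_{14}, p_{34}, p_{42}, p_{23}$.

```python
def line_coords(y, z):
    """Linecoordinates (p12, p13, p14, p34, p42, p23) of the line through y, z."""
    def p(i, j):
        # Two-rowed determinant formed from columns i and j (0-based here).
        return y[i] * z[j] - y[j] * z[i]
    return (p(0, 1), p(0, 2), p(0, 3), p(2, 3), p(3, 1), p(1, 2))

y = (1, 2, 0, 1)    # hypothetical points on the line
z = (0, 1, 3, 2)
p12, p13, p14, p34, p42, p23 = coords = line_coords(y, z)

# Relation (2) between the six coordinates.
relation = p12 * p34 + p13 * p42 + p14 * p23

# Two other points of the same line: Y = 2y + z, Z = y - z.
c1, c2, k1, k2 = 2, 1, 1, -1
Y = tuple(c1 * a + c2 * b for a, b in zip(y, z))
Z = tuple(k1 * a + k2 * b for a, b in zip(y, z))
K = c1 * k2 - c2 * k1

print(coords, relation)
print(line_coords(Y, Z), K)
```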
Theorem 2. Any six constants $p_{ij}$ satisfying the relation (2) and
not all zero are the linecoordinates of one, and only one, line.
That they cannot be the coordinates of more than one line may be
seen as follows: Suppose the $p_{ij}$'s to be the coordinates of a line,
and take two distinct points y and z on the line. The coordinates
of these points may then be so determined that relations (1) hold.
Let us suppose, for definiteness, that $p_{12} \neq 0$.* Now, consider the
point whose coordinates are $c_1y_i + c_2z_i$. By assigning to $c_1$ and $c_2$
first the values $-z_1$ and $y_1$, then the values $-z_2$ and $y_2$, we get the
two points
$$(3)\quad (0,\ p_{12},\ p_{13},\ p_{14}), \quad (p_{21},\ 0,\ p_{23},\ p_{24}),$$
where, by definition, $p_{ij} = -p_{ji}$.
These two points are distinct, since for the first of them the first
coordinate is zero and the second is not, while for the second the
second coordinate is zero and the first is not. These points
accordingly determine the line, and since they, in turn, are
determined by the p's, we see that the line is uniquely determined
by the p's.
* By a slight modification of the formulae this proof will apply to the case in
which any one of the p's is assumed different from zero.
It remains, then, merely to show that any set of $p_{ij}$'s, not all zero,
which satisfy (2) really determine a line. For this purpose we again
assume $p_{12} \neq 0$ * and consider the two points (3) which, as above, are
distinct. The line determined by them has as its coordinates
$$p_{12}^2, \quad p_{12}p_{13}, \quad p_{12}p_{14}, \quad -p_{13}p_{42} - p_{14}p_{23}, \quad p_{12}p_{42}, \quad p_{12}p_{23}.$$
Using the relation (2), the fourth of these quantities reduces to
$p_{12}p_{34}$, so that, remembering that the coordinates of a line may be
multiplied by any constant different from zero, we see that we
really have a line whose coordinates are $p_{ij}$.
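The construction in this proof is effective: from six numbers satisfying (2), with $p_{12} \neq 0$, the two points (3) can be written down at once. The sketch below assumes a hypothetical set of six values chosen to satisfy (2), rebuilds the two points, and recomputes the linecoordinates, which should come out as $p_{12}$ times the given ones. The helper names are illustrative.

```python
def line_coords(y, z):
    """Linecoordinates (p12, p13, p14, p34, p42, p23) of the line through y, z."""
    def p(i, j):
        return y[i] * z[j] - y[j] * z[i]
    return (p(0, 1), p(0, 2), p(0, 3), p(2, 3), p(3, 1), p(1, 2))

def points_from_coords(coords):
    """The two points (3), assuming p12 is not zero."""
    p12, p13, p14, p34, p42, p23 = coords
    if p12 == 0:
        raise ValueError("this sketch assumes p12 != 0")
    p21, p24 = -p12, -p42          # by definition p_ij = -p_ji
    return (0, p12, p13, p14), (p21, 0, p23, p24)

# Hypothetical values satisfying relation (2): 2*(-6) + 6*(-6) + 4*12 = 0.
given = (2, 6, 4, -6, -6, 12)
assert given[0] * given[3] + given[1] * given[4] + given[2] * given[5] == 0

first, second = points_from_coords(given)
recovered = line_coords(first, second)
print(first, second)
print(recovered)
```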
In a systematic study of threedimensional geometry these line
coordinates play as important a part as the point or planecoordi
nates ; and in the allied algebraic theories we shall have to consider
expressions having the invariant property, into which these line
coordinates enter just as pointcoordinates occur in covariants and
planecoordinates in contravariants. We may, if we please, regard
these expressions as ordinary covariants, since the linecoordinates
are merely functions of the coordinates of two points, but the co
variants we get in this way are covariants of a special sort, since the
coordinates of the two points occur only in the combinations (1).
As an example, let us consider four points
$$(x_i, y_i, z_i, t_i) \qquad (i = 1, 2, 3, 4).$$
The determinant of these sixteen coordinates is, by Theorem 3, § 31,
a covariant of weight $-1$. Let us denote by $p'_{ij}$ and $p''_{ij}$ the
coordinates of the lines determined by the first two and the last two points
respectively. Expanding the fourrowed determinant just referred
to by Laplace's method according to the tworowed determinants of
the first two rows, we get
$$(4)\quad p'_{12}p''_{34} + p'_{34}p''_{12} + p'_{13}p''_{42} + p'_{42}p''_{13} + p'_{14}p''_{23} + p'_{23}p''_{14}.$$
This, then, is an expression having the invariant property and
involving only linecoordinates.
Since the vanishing of the fourrowed determinant from which
we started gave the condition that the four points lie in a plane, it
follows that the vanishing of (4) gives a necessary and sufficient
condition that the two lines p' and p'' lie in a plane, or, what
amounts to the same thing, that they meet in a point.
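The incidence condition (4) lends itself to a short numerical test. The sketch below assumes two hypothetical pairs of lines, one pair sharing a point and one pair skew, and evaluates expression (4) for each; the function names are illustrative choices.

```python
def line_coords(y, z):
    """Linecoordinates (p12, p13, p14, p34, p42, p23) of the line through y, z."""
    def p(i, j):
        return y[i] * z[j] - y[j] * z[i]
    return (p(0, 1), p(0, 2), p(0, 3), p(2, 3), p(3, 1), p(1, 2))

def incidence(p, q):
    """Expression (4) for two lines given in the order 12, 13, 14, 34, 42, 23."""
    return (p[0] * q[3] + p[3] * q[0]
            + p[1] * q[4] + p[4] * q[1]
            + p[2] * q[5] + p[5] * q[2])

# Two lines through the common point (1, 0, 0, 1): they meet.
meet_a = line_coords((1, 0, 0, 1), (0, 1, 0, 1))
meet_b = line_coords((1, 0, 0, 1), (0, 0, 1, 1))

# The line y = z = 0 and the line x = 0, y = 1: they are skew.
skew_a = line_coords((0, 0, 0, 1), (1, 0, 0, 1))
skew_b = line_coords((0, 1, 0, 1), (0, 1, 1, 1))

print(incidence(meet_a, meet_b), incidence(skew_a, skew_b))
```

Taking the same line twice gives twice the left-hand side of relation (2), which is one way to see why that relation must hold.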
* By a slight modification of the formulae, this proof will apply to the case in
which any one of the p's is assumed different from zero.
EXERCISES
1. Prove that, if the pointcoordinates are subjected to the linear transformation
$$x_i' = c_{i1}x_1 + c_{i2}x_2 + c_{i3}x_3 + c_{i4}x_4 \qquad (i = 1, 2, 3, 4),$$
the linecoordinates will be subjected to the linear transformation
$$p'_{ij} = (c_{i1}c_{j2} - c_{i2}c_{j1})p_{12} + (c_{i1}c_{j3} - c_{i3}c_{j1})p_{13} + (c_{i1}c_{j4} - c_{i4}c_{j1})p_{14}$$
$$\qquad + (c_{i3}c_{j4} - c_{i4}c_{j3})p_{34} + (c_{i4}c_{j2} - c_{i2}c_{j4})p_{42} + (c_{i2}c_{j3} - c_{i3}c_{j2})p_{23}.$$
2. A plane is determined by three points
$$(y_1, y_2, y_3, y_4), \quad (z_1, z_2, z_3, z_4), \quad (w_1, w_2, w_3, w_4).$$
Prove that the threerowed determinants of the matrix of these three points may
be used as coordinates of this plane, and that these coordinates are not distinct
from the planecoordinates defined in § 34.
3. A line determined by two of its points may be called a ray, and the
linecoordinates of the present section may therefore be called raycoordinates. A line
determined as the intersection of two planes may be called an axis. If $(u_1, u_2,
u_3, u_4)$ and $(v_1, v_2, v_3, v_4)$ are two planes given by their planecoordinates, discuss
the axiscoordinates of their intersection,
$$q_{12},\ q_{13},\ q_{14},\ q_{34},\ q_{42},\ q_{23},$$
where $q_{ij} = u_iv_j - u_jv_i$.
4. Prove that raycoordinates and axiscoordinates are not essentially different
by showing that, for any line, the q's, taken in the order written in Exercise 3,
are proportional to the p's taken in the order
$$p_{34},\ p_{42},\ p_{23},\ p_{12},\ p_{13},\ p_{14}.$$
5. A point is determined as the intersection of three planes
$$(u_1, u_2, u_3, u_4), \quad (v_1, v_2, v_3, v_4), \quad (w_1, w_2, w_3, w_4).$$
Prove that the threerowed determinants of the matrix of these planes may be
used as coordinates of this point, and that they do not differ from the ordinary
pointcoordinates.
Hence, show that all covariants may be regarded as invariants.
CHAPTER VIII
BILINEAR FORMS
36. The Algebraic Theory. Before entering on the study of
quadratic forms, which will form the subject of the next five chapters,
we turn briefly to a very special type of quadratic form in 2n variables,
known as a bilinear form, and which, as its name implies, forms
a natural transition between linear and quadratic forms.
Definition 1. A polynomial in the 2n variables $(x_1, \dots, x_n)$,
$(y_1, \dots, y_n)$ is called a bilinear form if each of its terms is of the first
degree in the x's and also of the first degree in the y's.
Thus, for n = 3, the most general bilinear form is
$$a_{11}x_1y_1 + a_{12}x_1y_2 + a_{13}x_1y_3$$
$$+\ a_{21}x_2y_1 + a_{22}x_2y_2 + a_{23}x_2y_3$$
$$+\ a_{31}x_3y_1 + a_{32}x_3y_2 + a_{33}x_3y_3.$$
This may be denoted, for brevity, by $\sum_1^3 a_{ij}x_iy_j$; and, in general, we
may denote the bilinear form in 2n variables by
$$(1)\quad \sum_1^n a_{ij}x_iy_j.$$
The matrix
$$\mathbf{a} = \begin{Vmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{Vmatrix}$$
is called the matrix of the form (1); its determinant, the determinant
of the form; and its rank, the rank of the form.* A bilinear form
is called singular when, and only when, its determinant is zero.
* It should be noticed that the bilinear form is completely determined when its
matrix is given, so there will be no confusion if we speak of the bilinear form a. If
two bilinear forms have matrices $a_1$ and $a_2$, their sum has the matrix $a_1 + a_2$. The
bilinear form whose matrix is $a_1a_2$ is not the product of the two forms, but is sometimes
spoken of as their symbolic product.
Let us notice that the bilinear form (1) may be obtained by
starting from the system of n linear forms in the y's of matrix a,
multiplying them respectively by $x_1, x_2, \dots, x_n$, and adding them
together. It can also be obtained by starting from the system
of n linear forms in the x's whose matrix is the conjugate of a,
multiplying them respectively by $y_1, y_2, \dots, y_n$, and adding them
together.
Using the first of these two methods, we see (cf. Theorem 1, § 31)
that if the y's are subjected to a linear transformation with matrix
d, the bilinear form is carried over into a new bilinear form whose
matrix is ad. Using the second of the above methods of building
up the bilinear form from linear forms, we see that if the x's are
subjected to a linear transformation with matrix c, we get a new
bilinear form the conjugate of whose matrix is a'c, where accents
are used to denote conjugate matrices. The matrix of the form
itself is then (cf. Theorem 6, § 22) c'a.*
Combining these two facts, we have
Theorem 1. If, in the bilinear form (1) with matrix a, we subject
the x's to a linear transformation with matrix c and the y's to a linear
transformation with matrix d, we obtain a new bilinear form with matrix
c'ad, where c' is the conjugate of c.
Considering the determinants of these matrices, we may say:
Theorem 2. The determinant of a bilinear form is multiplied by
the product of the determinants of the transformations to which the x's
and y's are subjected.†
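Theorems 1 and 2 can be confirmed on a small numerical case. The sketch below assumes n = 2, hypothetical matrices `a`, `c`, `d`, and the substitution convention $x = c\,x'$, $y = d\,y'$. It compares the value of the original form at the transformed arguments with the value of the form of matrix c'ad at the new arguments, and then compares determinants.

```python
def transpose(m):
    """Conjugate (transposed) matrix."""
    return [list(row) for row in zip(*m)]

def matmul(p, q):
    """Product of two square matrices given as lists of rows."""
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(m, v):
    """Apply the linear transformation with matrix m to the vector v."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def bilinear(a, x, y):
    """Value of the bilinear form with matrix a at (x, y)."""
    return sum(a[i][j] * x[i] * y[j]
               for i in range(len(x)) for j in range(len(y)))

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

a = [[1, 2], [3, 5]]     # hypothetical form
c = [[2, 1], [1, 1]]     # x = c x'
d = [[1, 4], [0, 3]]     # y = d y'

new = matmul(matmul(transpose(c), a), d)    # matrix c'ad of Theorem 1

x_new, y_new = [3, -1], [2, 5]
old_value = bilinear(a, apply(c, x_new), apply(d, y_new))
new_value = bilinear(new, x_new, y_new)
print(old_value, new_value, det2(new))
```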
We also infer from Theorem 1, in combination with Theorem 7,
§ 25, the important result:‡
Theorem 3. The rank of a bilinear form is an invariant with
regard to nonsingular linear transformations of the x's and y's.
Definition 2. A bilinear form whose matrix is symmetric is
called a symmetric bilinear form.
* These results may also be readily verified without referring to any earlier theorems.
† This theorem tells us that the determinant of a bilinear form is, in a generalized
sense, a relative invariant. Such invariants, where the given forms depend on several
sets of variables, are known as combinants.
‡ This result may also be deduced from Theorem 2, § 30.
Theorem 4. A symmetric bilinear form remains symmetric if we
subject the x's and the y's to the same linear transformation.
For if c is the matrix of the transformation to which both the x's
and the y's are subjected, the matrix of the transformed form will,
by Theorem 1, be c'ac. Remembering that a, being symmetric, is
its own conjugate, we see, by Theorem 6, § 22, that c'ac is its own
conjugate. Hence the transformed form is symmetric.
EXERCISES
1. Prove that a necessary and sufficient condition for the equivalence of two
bilinear forms with regard to nonsingular linear transformations of the x's and
y's is that they have the same rank.
2. Prove that a necessary and sufficient condition that it be possible to factor
a bilinear form into the product of two linear forms is that its rank be zero or one.
3. Prove that every bilinear form of rank r can be reduced by nonsingular
linear transformations of the x's and y's to the normal form
$$x_1y_1 + x_2y_2 + \cdots + x_ry_r.$$
4. Do the statements in the preceding exercises remain correct if we confine
our attention to real bilinear forms and real linear transformations?
5. Prove that a necessary and sufficient condition that the form
$$x_1y_1 + x_2y_2 + \cdots + x_ny_n$$
should be unchanged by linear transformations of the x's and of the y's is that
these be contragredient transformations.
37. A Geometric Application. Let $(x_1, x_2, x_3)$ and $(y_1, y_2, y_3)$ be
homogeneous coordinates of points in a plane, and let us consider
the bilinear equation
$$(1)\quad \sum_1^3 a_{ij}x_iy_j = 0.$$
If $(y_1, y_2, y_3)$ is a fixed point P, then (1), being linear in the x's,
is the equation of a straight line p. The only exception is when the
coefficients of (1), regarded as a linear equation in the x's, are all
zero, and this cannot happen if the determinant of the form is
different from zero. Thus we see that the equation (1) causes one, and
only one, line p to correspond to every point P of the plane,
provided the bilinear form in (1) is nonsingular.
Conversely, if
$$(2)\quad Ax_1 + Bx_2 + Cx_3 = 0$$
is a line p, there is one, and only one, point P which corresponds to it
by means of (1), provided the bilinear form in (1) is nonsingular. For
if P is the point $(y_1, y_2, y_3)$, the equation of the line corresponding to it
is (1), and the necessary and sufficient condition that this line coincide
with (2) is
$$a_{11}y_1 + a_{12}y_2 + a_{13}y_3 = \rho A,$$
$$a_{21}y_1 + a_{22}y_2 + a_{23}y_3 = \rho B,$$
$$a_{31}y_1 + a_{32}y_2 + a_{33}y_3 = \rho C,$$
where $\rho$ is a constant, not zero. For a given value of $\rho$, this set of
equations has one, and only one, solution $(y_1, y_2, y_3)$, since the
determinant a is not zero, while a change in $\rho$ merely changes all the y's
in the same ratio. Hence,
Theorem. If the bilinear equation (1) is nonsingular, it establishes
a one-to-one correspondence between the points and lines of the plane.
This correspondence is called a correlation.
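The two directions of a correlation can be sketched in a few lines. The example below assumes a hypothetical nonsingular matrix `a` and a hypothetical point; it computes the line corresponding to the point from (1), and then recovers the point from the line by solving the three linear equations above with $\rho = 1$, using Cramer's rule (a choice made here for brevity, not prescribed by the text).

```python
from fractions import Fraction

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def line_of_point(a, y):
    """Coefficients (A, B, C) of the line corresponding to the point y."""
    return tuple(sum(a[i][j] * y[j] for j in range(3)) for i in range(3))

def point_of_line(a, line):
    """Point y with a y = line (rho = 1), by Cramer's rule."""
    det = det3(a)
    if det == 0:
        raise ValueError("the bilinear form is singular")
    solution = []
    for k in range(3):
        # Replace column k of a by the right-hand side.
        m = [[line[i] if j == k else a[i][j] for j in range(3)]
             for i in range(3)]
        solution.append(Fraction(det3(m), det))
    return tuple(solution)

a = [[1, 2, 0], [0, 1, 3], [2, 0, 1]]   # hypothetical nonsingular form
P = (1, 1, 2)                            # hypothetical point

p = line_of_point(a, P)
back = point_of_line(a, p)
print(p, back)
```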
EXERCISES
1. Discuss the singular correlations of the plane, considering separately the
cases in which the rank of the bilinear form is 2 and 1.
2. Examine the corresponding equation in three dimensions, that is, the
equation obtained by equating to zero a bilinear form in which n = 4, and discuss it for
all possible suppositions as to the rank of the form.
3. Show that a necessary and sufficient condition for three or more lines,
which correspond to three or more given points by means of a nonsingular corre
lation, to be concurrent is that the points be collinear.
4. Show that the crossratio of any four concurrent lines is the same as that of
the four points to which they correspond by means of a nonsingular correlation.
5. Let P be any point in a plane and p the line corresponding to it by means
of a nonsingular correlation. Prove that a necessary and sufficient condition for
the lines corresponding to the points of p to pass through P is that the bilinear
form be symmetrical.
6. State and prove the corresponding theorem for points and planes in space
of three dimensions, showing that here it is necessary and sufficient that the form
be symmetrical or skewsymmetrical.*
* The correlation given by a symmetric bilinear equation is known as a
reciprocation. By reference to the formulae of the next chapter, it will be readily seen that in
this case every point corresponds, in the plane, to its polar with regard to a fixed conic;
in space, to its polar plane with regard to a fixed quadric surface. The skewsymmetric
bilinear equation gives rise in the plane merely to a very special singular correlation.
In space, however, it gives an important correlation which is in general nonsingular
and is known as a nullsystem. Cf. any treatment of line geometry, where, however,
the subject is usually approached from another side.
CHAPTER IX
GEOMETRIC INTRODUCTION TO THE STUDY OF QUADRATIC
FORMS
38. Quadric Surfaces and their Tangent Lines and Planes. If
$x_1, x_2, x_3$ are homogeneous coordinates in a plane, we see, by reference
to §4, that the equation of any conic may be written
$a_{11}x_1^2 + a_{22}x_2^2 + a_{33}x_3^2 + 2a_{12}x_1x_2 + 2a_{13}x_1x_3 + 2a_{23}x_2x_3 = 0.$
Similarly, in space of three dimensions, the equation of any quadric
surface may be written
$a_{11}x_1^2 + a_{22}x_2^2 + a_{33}x_3^2 + a_{44}x_4^2 + 2a_{12}x_1x_2 + 2a_{13}x_1x_3 + 2a_{14}x_1x_4$
$\qquad + 2a_{34}x_3x_4 + 2a_{42}x_4x_2 + 2a_{23}x_2x_3 = 0.$
This form may be made still more symmetrical if, besides the
coefficients $a_{12}, a_{13}, a_{14}, a_{34}, a_{42}, a_{23}$, we introduce the six other
constants $a_{21}, a_{31}, a_{41}, a_{43}, a_{24}, a_{32}$, defined by the general formula
$a_{ij} = a_{ji}.$
The equation of the quadric surface may then be written
$\quad a_{11}x_1^2 + a_{12}x_1x_2 + a_{13}x_1x_3 + a_{14}x_1x_4$
$+ a_{21}x_2x_1 + a_{22}x_2^2 + a_{23}x_2x_3 + a_{24}x_2x_4$
$+ a_{31}x_3x_1 + a_{32}x_3x_2 + a_{33}x_3^2 + a_{34}x_3x_4$
$+ a_{41}x_4x_1 + a_{42}x_4x_2 + a_{43}x_4x_3 + a_{44}x_4^2 = 0,$
or for greater brevity
(1) $\sum_{1}^{4} a_{ij}x_ix_j = 0.$
Definition 1. The matrix of the sixteen a's taken in the order
written above is called the matrix of the quadric surface (1), the determinant
of this matrix is called the discriminant of the quadric surface,
its rank is called the rank of the quadric surface, and if the discriminant
vanishes, the quadric surface is said to be singular.
A fundamental problem is the following: If $(y_1, y_2, y_3, y_4)$ and
$(z_1, z_2, z_3, z_4)$ are two points, in what points does the line yz meet the
surface (1)?
The coordinates of any point on yz, other than y, may be written
$(z_1 + \lambda y_1,\ z_2 + \lambda y_2,\ z_3 + \lambda y_3,\ z_4 + \lambda y_4).$
A necessary and sufficient condition for this point to lie on (1) is
$\sum_{1}^{4} a_{ij}(z_i + \lambda y_i)(z_j + \lambda y_j) = 0,$
or expanded,
(2) $\sum_{1}^{4} a_{ij}z_iz_j + 2\lambda \sum_{1}^{4} a_{ij}z_iy_j + \lambda^2 \sum_{1}^{4} a_{ij}y_iy_j = 0.$
If the point y does not lie on (1), this is a quadratic equation in $\lambda$.
To each root of this equation corresponds one point where the line
meets the quadric. Thus we see that every line through a point y
which does not lie on a quadric surface, meets this surface either in
two, and only two, distinct points, or in only one point.
On the other hand, if y does lie on (1), the equation (2) reduces to
an equation of the first degree, provided $\sum a_{ij}z_iy_j \neq 0$. In this case,
also, the line meets the surface in two, and only two, distinct points,
viz., the point y and the point corresponding to the root of the
equation of the first degree (2).
Finally, if $\sum a_{ij}y_iy_j = \sum a_{ij}z_iy_j = 0$, the first member of equation
(2) reduces to a constant, so that (2) is either satisfied by no value
of $\lambda$, in which case the line meets the surface at the point y only,
or by all values of $\lambda$ (if $\sum a_{ij}z_iz_j = 0$), in which case every point on the
line is also a point on the surface.
Combining the preceding results we may say:
Theorem 1. If a quadric surface and a straight line are given,
one of the following three cases must occur :
(1) The line meets the quadric in two, and only two, points, in which
case the line is called a secant.
(2) The line meets the quadric in one, and only one, point, in which
case it is called a tangent.*
(3) Every point of the line is a point of the quadric. In this case
the line is called a ruling of the quadric. f
* We shall presently distinguish between true tangents and pseudo-tangents.
† Also called a generator, because, as will presently appear, the whole surface may
be generated by the motion of such a line.
That all these three cases are possible is shown by simple examples;
for instance, in the case of the surface
the three coordinate axes illustrate the three cases.
We shall often find it convenient to say that a tangent line meets
the quadric in two coincident points.
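The case analysis behind Theorem 1 can be sketched numerically. The following is a minimal illustration, not part of the original text: the function names and the sample quadric $x_1x_2 - x_3x_4 = 0$ are assumptions chosen for the example. It forms the three coefficients of equation (2) and uses the fact that, over the complex numbers, the line yz is a ruling when all three vanish, a secant when $B^2 - AC \neq 0$, and a tangent otherwise.

```python
from fractions import Fraction

def bilinear(a, u, v):
    """Return the sum over i, j of a[i][j] * u[i] * v[j]."""
    n = len(a)
    return sum(a[i][j] * u[i] * v[j] for i in range(n) for j in range(n))

def classify_line(a, y, z):
    """Classify the line yz relative to the quadric with symmetric matrix a."""
    A = bilinear(a, z, z)  # term of equation (2) free of lambda
    B = bilinear(a, z, y)  # half the coefficient of lambda
    C = bilinear(a, y, y)  # coefficient of lambda squared
    if A == 0 and B == 0 and C == 0:
        return "ruling"    # equation (2) is satisfied by every lambda
    if B * B - A * C != 0:
        return "secant"    # two distinct intersection points
    return "tangent"       # exactly one intersection point

half = Fraction(1, 2)
# Hypothetical sample quadric x1*x2 - x3*x4 = 0 (assumed for illustration).
a = [[0, half, 0, 0],
     [half, 0, 0, 0],
     [0, 0, 0, -half],
     [0, 0, -half, 0]]
y = (1, 0, 0, 0)  # a point on the sample quadric

print(classify_line(a, y, (0, 1, 0, 0)))
print(classify_line(a, y, (0, 0, 1, 1)))
print(classify_line(a, y, (0, 0, 1, 0)))
```

Exact rational arithmetic is used so that the tests for vanishing coefficients are not disturbed by rounding.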
From the proof we have given of Theorem 1, we can also infer
the further result :
Theorem 2. If $(y_1, y_2, y_3, y_4)$ is a point on the quadric (1), then if
(3) $\sum_{1}^{4} a_{ij}x_iy_j \equiv 0$,*
every line through y is either a tangent or a ruling of (1); otherwise
every line through y which lies in the plane
(4) $\sum_{1}^{4} a_{ij}x_iy_j = 0$
is a tangent or ruling of (1), while every other line through y is a secant.
A theorem of fundamental importance, which follows immediately
from this, is :
Theorem 3. If there exists a point y on the quadric (1) such that
the identity (3) is fulfilled, then (1) is a cone with y as a vertex; and, conversely,
if (1) is a cone with y as a vertex, then the identity (3) is fulfilled.
We pass now to the subject of tangent planes, which we define
as follows :
Definition 2. A plane p is said to be tangent to the quadric (1)
at one of its points, P, if every line of p which passes through P is either a
tangent or a ruling of (1).
It will be seen that, according to this definition, if (1) is a cone, every
plane through a vertex of (1) is tangent to (1) at this vertex. We have
thus included among the tangent planes, planes which in ordinary
geometric parlance would not be called tangent. The same objection
applies to our definition of tangent lines. We therefore now introduce
the distinction between true tangent lines or planes and pseudo-tangent
lines or planes.
Definition 3. A line or plane which touches a quadric surface at
a point which is not a vertex is called a true tangent; all other tangent
lines and planes are called pseudo-tangents.
* It should be noticed that, on account of the relation $a_{ij} = a_{ji}$, $\sum a_{ij}x_iy_j \equiv \sum a_{ij}y_ix_j$.
EXERCISES
1. Prove that if P is a point on a quadric surface S, which is not a vertex,
and p the tangent plane at this point, one of the following three cases must occur:
(a) Two, and only two, lines of p are rulings of S, and these rulings intersect
at P.
(b) One, and only one, line of p is a ruling of S, and this ruling passes
through P.
(c) Every line of p is a ruling of S.
2. Prove that
(a) When case (a) of Exercise 1 occurs, the quadric surface is not a cone;
and, conversely, if the quadric surface is not a cone, case (a) will always occur.
(b) If case (b) of Exercise 1 occurs, p is tangent to S at every point of the
ruling which lies in p.
(c) If case (b) of Exercise 1 occurs, S is a cone with one, and only one, vertex,
and this vertex is on the ruling which lies in p; and conversely, if S is a cone with
one, and only one, vertex, case (b) will always occur.
(d) If case (c) of Exercise 1 occurs, there is a line l in p every point of which
(but no other point) is a vertex of S; and S consists of two planes one of which
is p, while the other intersects it in l.
39. Conjugate Points and Polar Planes. Two points are commonly
said to be conjugate with regard to a quadric surface
(1) $\sum_{1}^{4} a_{ij}x_ix_j = 0,$
when they are divided harmonically by the points where the line
connecting them meets the surface. In order to include all limiting
cases, we frame the definition as follows:
Definition. Two distinct points are said to be conjugate with regard
to the surface (1) if
(a) The line joining them is a tangent or a secant to (1), and the
points are divided harmonically by the points where this line meets (1); or
(b) The line joining them is a ruling of (1).
Two coincident points are called conjugate if they both lie on (1).
Let the coordinates of the points be $(y_1, y_2, y_3, y_4)$ and $(z_1, z_2, z_3, z_4)$,
and let us look first at the case in which the points are distinct and
neither of them lies on (1), and in which the line connecting them is
a secant of (1). The points of intersection of the line yz with (1)
may therefore be written
$(z_1 + \lambda_i y_1,\ z_2 + \lambda_i y_2,\ z_3 + \lambda_i y_3,\ z_4 + \lambda_i y_4) \qquad (i = 1, 2),$
where $\lambda_1$ and $\lambda_2$ are the roots of Equation (2), § 38. A necessary
and sufficient condition for harmonic division is that the cross-ratio
$\lambda_1/\lambda_2$ have the value $-1$; that is, $\lambda_1 + \lambda_2 = 0$; or, referring back
to Equation (2), § 38,
(2) $\sum_{1}^{4} a_{ij}y_iz_j = 0.$
We leave it for the reader to show that in all other cases in which y
and z are conjugate this relation (2) is fulfilled; and that, conversely,
whenever this condition is fulfilled, the points are conjugate. That is:
Theorem 1. A necessary and sufficient condition that the points
y, z be conjugate with regard to (1) is that (2) be fulfilled.
This theorem enables us at once to write down the equation of
the locus of the point x conjugate to a fixed point y, namely,
(3) $\sum_{1}^{4} a_{ij}x_iy_j = 0.$
Except when the first member of this equation vanishes identically,
this locus is therefore a plane called the polar plane of the point y.
We saw in the last section that the first member of (3) vanishes
identically when (1) is a cone and y is a vertex. This is the only
case in which it vanishes identically; for, if y is any point, not a
vertex, on a quadric surface, (3) represents the tangent plane at that
point; while if y is not on (1), the first member of (3) can clearly not
vanish identically, since it does not vanish when the x's are replaced
by the y's. Hence the theorem:
Theorem 2. If (1) is not a cone, every point y has a definite polar
plane (3); if (1) is a cone, every point except its vertices has a definite
polar plane (3), while for the vertices the first member of (3) is identically
zero.
We note that the property that a plane is the polar of a given
point with regard to a quadric surface is a projective property, since
a collineation of space evidently carries over two conjugate points
into points conjugate with regard to the transformed surface.
Theorem 3. If two points $P_1$ and $P_2$ are so situated that the
polar plane of $P_1$ passes through $P_2$, then, conversely, the polar plane
of $P_2$ will pass through $P_1$.
For, from the hypothesis, it follows that $P_1$ and $P_2$ are conjugate
points, and from this the conclusion follows.
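The reciprocity stated in Theorem 3 is easy to check on a numerical instance. The sketch below is illustrative only: the helper names and the sample quadric $x_1^2 + x_2^2 + x_3^2 - x_4^2 = 0$ are assumptions, not taken from the text. It computes the coefficients of the polar plane (3) of a point and tests whether a second point lies on it.

```python
def polar_plane(a, y):
    """Coefficients u_i = sum_j a[i][j] * y[j] of the polar plane sum_i u_i x_i = 0."""
    return [sum(a_ij * y_j for a_ij, y_j in zip(row, y)) for row in a]

def lies_on(plane, x):
    """True when the point x satisfies the plane equation."""
    return sum(u * x_i for u, x_i in zip(plane, x)) == 0

# Hypothetical sample quadric x1^2 + x2^2 + x3^2 - x4^2 = 0.
a = [[1, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, -1]]
p1 = (2, 0, 0, 1)
p2 = (1, 0, 0, 2)

print(polar_plane(a, p1), lies_on(polar_plane(a, p1), p2))
print(polar_plane(a, p2), lies_on(polar_plane(a, p2), p1))
```

Both membership tests reduce to the single symmetric condition $\sum a_{ij}y_iz_j = 0$, which is why one implies the other.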
40. Classification of Quadric Surfaces by Means of their Rank.
Theorem 2 of the last section may be stated by saying that a necessary
and sufficient condition that the quadric surface
(1) $\sum_{1}^{4} a_{ij}x_ix_j = 0$
be a cone and that $(y_1, y_2, y_3, y_4)$ be its vertex (or one of its vertices)
is that
(2) $\sum_{1}^{4} a_{ij}x_iy_j \equiv 0.$
This identity (2) is equivalent to the four equations
(3)
$a_{11}y_1 + a_{12}y_2 + a_{13}y_3 + a_{14}y_4 = 0,$
$a_{21}y_1 + a_{22}y_2 + a_{23}y_3 + a_{24}y_4 = 0,$
$a_{31}y_1 + a_{32}y_2 + a_{33}y_3 + a_{34}y_4 = 0,$
$a_{41}y_1 + a_{42}y_2 + a_{43}y_3 + a_{44}y_4 = 0.$
A necessary and sufficient condition for this set of equations to
have a common solution other than (0, 0, 0, 0) is that the determinant
of their coefficients be zero. We notice that this determinant
is the discriminant a of the quadric surface. Hence,
Theorem 1. A necessary and sufficient condition for a quadric
surface to be a cone is that its discriminant vanish.
If, then, the rank of the quadric surface is four, the surface is not
a cone.
If the rank is three, the set of equations (3) has one, and, except
for multiples of this, only one, solution. Hence in this case the surface
is an ordinary cone with a single vertex.
If the rank is two, equations (3) have two linearly independent solutions
(cf. §18), on which all other solutions are linearly dependent.
Hence in this case the surface is a cone with a whole line of vertices.
If the rank is one, equations (3) have three linearly independent
solutions on which all other solutions are linearly dependent. Hence
we have a cone with a whole plane of vertices.
If the rank is zero we have, strictly speaking, no quadric surface;
but the locus of (1) may be regarded as a cone, every point in space
being a vertex.
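The classification just given depends only on the rank of the matrix of the surface, so it can be sketched as a small computation. The code below is an illustration under assumptions: the function names, the wording of the labels, and the sample matrices are invented for the example. The rank is found by Gaussian elimination over exact rationals.

```python
from fractions import Fraction

def rank(matrix):
    """Rank of a matrix, by Gaussian elimination over exact rationals."""
    m = [[Fraction(v) for v in row] for row in matrix]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue  # no usable pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [x - factor * p for x, p in zip(m[i], m[r])]
        r += 1
    return r

KINDS = {
    4: "not a cone",
    3: "cone with a single vertex",
    2: "cone with a line of vertices",
    1: "cone with a plane of vertices",
    0: "every point of space is a vertex",
}

def classify_quadric(a):
    """Describe a quadric surface from the rank of its 4 x 4 matrix."""
    return KINDS[rank(a)]

# Hypothetical sample surfaces, given by their matrices.
sphere_like = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]
ordinary_cone = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, 0]]
plane_pair = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]   # 2*x1*x2 = 0
double_plane = [[1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]  # x1^2 = 0

for name, a in [("sphere_like", sphere_like), ("ordinary_cone", ordinary_cone),
                ("plane_pair", plane_pair), ("double_plane", double_plane)]:
    print(name, "->", classify_quadric(a))
```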
It is clear that the property of a quadric surface being a cone is
a projective property ; and the same is true of the property of a point
being a vertex of a cone. Hence from the classification we have
just given we infer
Theorem 2. The rank of a quadric surface is unchanged by nonsingular
collineations.
EXERCISES
1. Definition. If a plane p is the polar of a point P with regard to a quadric
surface, then P is called a pole of p.
Prove that if the quadric surface is nonsingular, every plane has one, and only
one, pole.
2. Prove that if the quadric surface is a cone, a plane which does not pass
through a vertex has no pole.
What can be said here about planes which do pass through a vertex ?
41. Reduction of the Equation of a Quadric Surface to a Normal
Form. Since cross-ratio is invariant under a nonsingular collineation,
a quadric surface S, a point P, not on S, and its polar plane
with regard to S, are carried over by any nonsingular collineation
into a quadric surface S', a point P', and its polar plane with regard
to S'. A point $(y_1, y_2, y_3, y_4)$ not on the quadric surface $\sum_{1}^{4} a_{ij}x_ix_j = 0$
cannot be on its own polar plane $\sum_{1}^{4} a_{ij}x_iy_j = 0$, as we see by replacing
the x's in this last equation by the y's. Now transform by a collineation
so that this point becomes the origin and its polar plane the
plane at infinity.* The quadric surface will now be a central quadric
with center at the origin, since, if any line be drawn through the
origin, the two points in which this line meets the surface are divided
harmonically by the origin and the point at infinity on this line.
The equation of the polar plane of the point $(y'_1, y'_2, y'_3, y'_4)$ with
regard to the transformed quadric
$\sum_{1}^{4} a'_{ij}x'_ix'_j = 0$
is
$\sum_{1}^{4} a'_{ij}x'_iy'_j = 0,$
* Such a collineation can obviously be determined in an infinite number of ways
by means of the theorem that there exists a collineation which carries over any five
linearly independent points into any five linearly independent points; cf. Exercises 2.
B, §24.
which reduces to the simple form
$a'_{14}x'_1 + a'_{24}x'_2 + a'_{34}x'_3 + a'_{44}x'_4 = 0$
when the point is the origin (0, 0, 0, 1). For this equation to represent
the plane at infinity, we must have
$a'_{14} = a'_{24} = a'_{34} = 0.$
Hence the quadric surface becomes
$\quad a'_{11}x_1'^2 + a'_{12}x'_1x'_2 + a'_{13}x'_1x'_3$
$+ a'_{21}x'_2x'_1 + a'_{22}x_2'^2 + a'_{23}x'_2x'_3$
$+ a'_{31}x'_3x'_1 + a'_{32}x'_3x'_2 + a'_{33}x_3'^2$
$+ a'_{44}x_4'^2 = 0.$
A slightly different reduction can be performed by transforming
the point $(y_1, y_2, y_3, y_4)$ to the point at infinity on the $x_1$-axis and its
polar plane to the $x_2x_3$-plane. It is easy to see that we thus get rid
of the terms containing $x_1$ except the square term.
Similarly we can get rid of the terms containing $x_2$ or $x_3$.
Thus we see that any quadric surface can be reduced by a collineation
to a form where its equation contains no term in $x_i$ except the term in
$x_i^2$, whose coefficient then is not zero.
According as we take for i the values 1, 2, 3, 4, we get thus four
different normal forms for the equation of our quadric surface, and
inasmuch as each of these forms can be obtained in a great variety
of ways, the question naturally arises whether we cannot perform
all four reductions simultaneously. That this can, in general, be
done may be seen as follows: let y be a point not on the quadric
surface, and z any point on the polar plane of y, but not on the
quadric surface. Its polar plane contains y. Let w be any point on
the intersection of the polar planes of y and z, but not on the quadric
surface. Then its polar plane passes through y and z. These three
polar planes meet in some point u, and it is readily seen that the four
points y, z, w, u do not lie on a plane. The tetrahedron yzwu is called
a polar or self-conjugate tetrahedron of the quadric surface, since it
has the property that any vertex is the pole of the opposite face.
If we transform the four points y, z, w, u to the origin and the
points at infinity on the three axes, the effect will be the same as that
of the separate transformations above, that is, the equation of the
quadric surface will be reduced to the form
$a_{11}x_1^2 + a_{22}x_2^2 + a_{33}x_3^2 + a_{44}x_4^2 = 0.$
We have tacitly assumed that it is possible to find points y, z, w,
constructed as indicated above, and not lying on the quadric surface.
We leave it for the reader to show that, if the quadric surface is not
a cone, this will always be possible in an infinite number of ways.
A cone, however, has no self-conjugate tetrahedron, and in this case
the above reduction is impossible.
EXERCISES
1. Prove that if the discriminant of a quadric surface is zero, the equation of
the surface can always be reduced, by a suitable collineation, to a form in which
the coordinate $x_4$ does not enter.
[Suggestion. Show, by using the results of this chapter, that if the vertex of a
quadric cone is at the origin, $a_{14} = a_{24} = a_{34} = a_{44} = 0$.]
2. Show that, provided the cone has a finite vertex, the collineation of
Exercise 1 may be taken in the form
$x'_1 = x_1 + \alpha x_4,$
$x'_2 = x_2 + \beta x_4,$
$x'_3 = x_3 + \gamma x_4,$
$x'_4 = x_4.$
[Suggestion. Use non-homogeneous coordinates.]
CHAPTER X
QUADRATIC FORMS
42. The General Quadratic Form and its Polar. The general
quadratic form in n variables is
(1) $\sum_{1}^{n} a_{ij}x_ix_j \equiv a_{11}x_1^2 + a_{12}x_1x_2 + \cdots + a_{1n}x_1x_n$
$\qquad + a_{21}x_2x_1 + a_{22}x_2^2 + \cdots + a_{2n}x_2x_n$
$\qquad + \cdots$
$\qquad + a_{n1}x_nx_1 + a_{n2}x_nx_2 + \cdots + a_{nn}x_n^2,$
where $a_{ij} = a_{ji}$.* The bilinear form $\sum a_{ij}y_iz_j$ is called the polar form of
(1). Subjecting (1) to the linear transformation
c: $\quad x_i = c_{i1}x'_1 + \cdots + c_{in}x'_n \qquad (i = 1, 2, \ldots, n),$
we get a new quadratic form
(2) $\sum_{1}^{n} a'_{ij}x'_ix'_j.$
The polar form of (2) is $\sum a'_{ij}y'_iz'_j$. If we transform the y's and z's of
the polar form of (1) by the same transformation c, we get a new
bilinear form $\sum \bar{a}_{ij}y'_iz'_j$. We will now prove that $\bar{a}_{ij} = a'_{ij}$.
We have the identities
(3) $\sum_{1}^{n} a_{ij}x_ix_j \equiv \sum_{1}^{n} a'_{ij}x'_ix'_j,$
(4) $\sum_{1}^{n} a_{ij}y_iz_j \equiv \sum_{1}^{n} \bar{a}_{ij}y'_iz'_j.$
* It should be clearly understood that this restriction is a matter of convenience,
not of necessity. If it were not made, the quadratic form would be neither more nor
less general.
Each of these we may regard as identities in the $x'$'s, $y'$'s, $z'$'s, the
x's, y's, z's being merely abbreviations for certain polynomials in the
corresponding primed letters. The last written identity reduces,
when we let $y'_i = z'_i = x'_i$ $(i = 1, 2, \ldots, n)$, to
$\sum a_{ij}x_ix_j \equiv \sum \bar{a}_{ij}x'_ix'_j.$
Combining this with (3) gives
$\sum a'_{ij}x'_ix'_j \equiv \sum \bar{a}_{ij}x'_ix'_j.$
Hence
$a'_{ii} = \bar{a}_{ii}$ and $a'_{ij} + a'_{ji} = \bar{a}_{ij} + \bar{a}_{ji}.$
We have assumed that $a'_{ij} = a'_{ji}$, these being merely the coefficients of
a certain quadratic form, and we proved, in Theorem 4, § 36, that
$\bar{a}_{ij} = \bar{a}_{ji}$. Hence we infer that $\bar{a}_{ij} = a'_{ij}$.
From this fact and from (4) we get at once the further result:
$\sum_{1}^{n} a_{ij}y_iz_j \equiv \sum_{1}^{n} a'_{ij}y'_iz'_j.$
That is:
Theorem. The polar form
$\sum_{1}^{n} a_{ij}y_iz_j$
is an absolute covariant of the system composed of the quadratic form
$\sum_{1}^{n} a_{ij}x_ix_j$
and the two points $(y_1, \ldots, y_n)$, $(z_1, \ldots, z_n)$.
43. The Matrix and the Discriminant of a Quadratic Form.
Definition. The matrix
$a = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{pmatrix}$
is called the matrix of the quadratic form
(1) $\sum_{1}^{n} a_{ij}x_ix_j.$
The determinant of a is called the discriminant of (1); and the rank of
a, the rank of (1). If the discriminant vanishes, (1) is called singular.
The matrix of (1) is the matrix of its polar form. Moreover, as
was shown in the last section, if the x's in (1) are subjected to a
linear transformation, and the y's and z's in the polar of (1) are subjected
to the same transformation, the matrix of the new quadratic
form will be the same as the matrix of the new bilinear form. But
we saw, in Theorem 1, § 36, how the matrix of a bilinear form is
changed by linear transformations of the variables. Thus we have
the theorem:
Theorem 1. If in the quadratic form (1) with matrix a we subject
the x's to a linear transformation with matrix c, we obtain a new
quadratic form with matrix c'ac, where c' is the conjugate of c.
From this there follow at once, precisely as in § 36, the further
results:
Theorem 2. The rank of a quadratic form is not changed by nonsingular
linear transformation.
Theorem 3. The discriminant of a quadratic form is a relative
invariant of weight two.
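Theorems 1 and 3 can be checked on a small numerical case. The sketch below is an illustration only; the helper names and the particular 2 by 2 matrices are assumptions made for the example. It forms c'ac, compares it with the result of direct substitution, and confirms that the discriminant is multiplied by the square of the determinant of c.

```python
def matmul(p, q):
    """Product of two matrices given as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def transpose(m):
    """Conjugate (transposed) matrix."""
    return [list(col) for col in zip(*m)]

def form_value(a, x):
    """Value of the quadratic form with matrix a at the point x."""
    n = len(a)
    return sum(a[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

a = [[1, 2], [2, -1]]  # x1^2 + 4*x1*x2 - x2^2 (sample form)
c = [[1, 3], [0, 2]]   # x1 = x1' + 3*x2',  x2 = 2*x2' (sample transformation)

new_a = matmul(matmul(transpose(c), a), c)
print(new_a)

# Direct substitution: the new form at x' equals the old form at x = c x'.
x_new = (5, -4)
x_old = [sum(c[i][j] * x_new[j] for j in range(2)) for i in range(2)]
print(form_value(new_a, x_new) == form_value(a, x_old))

# Weight two: det(c'ac) = det(c)^2 * det(a).
print(det2(new_a) == det2(c) ** 2 * det2(a))
```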
44. Vertices of Quadratic Forms.
Definition. By a vertex of the quadratic form
(1) $\sum_{1}^{n} a_{ij}x_ix_j$
we understand a point $(c_1, \ldots, c_n)$, where the c's are not all zero, such that
(2) $\sum_{1}^{n} a_{ij}x_ic_j \equiv 0.$
A quadratic form clearly vanishes at all of its vertices.
It is merely another way of stating this definition when we say:
Theorem 1. A necessary and sufficient condition that $(c_1, \ldots, c_n)$ be
a vertex of (1) is that it be a solution, not consisting exclusively of zeros,
of the system of equations
(3)
$a_{11}c_1 + a_{12}c_2 + \cdots + a_{1n}c_n = 0,$
$\cdots$
$a_{n1}c_1 + a_{n2}c_2 + \cdots + a_{nn}c_n = 0.$
Since the resultant of (3) is the discriminant of (1), we may add:
Theorem 2. A necessary and sufficient condition for a quadratic
form to have a vertex is that its discriminant be zero; and if the rank
of the form is r, it has n − r linearly independent vertices, and every point
linearly dependent on these is a vertex.
In particular, we note that if the discriminant of a quadratic
form is zero and if the cofactors of the elements of this determinant
are denoted in the ordinary way by $A_{ij}$, then $(A_{i1}, \ldots, A_{in})$ is a vertex,
provided all these A's are not zero.
The following identity is of great importance (cf. formula
(2), § 38),
(4) $\sum_{1}^{n} a_{ij}(z_i + \lambda y_i)(z_j + \lambda y_j) \equiv \sum_{1}^{n} a_{ij}z_iz_j + 2\lambda \sum_{1}^{n} a_{ij}z_iy_j + \lambda^2 \sum_{1}^{n} a_{ij}y_iy_j.$
This may be regarded as an identity in all the letters involved.
If $(c_1, \ldots, c_n)$ is a vertex of the quadratic form $\sum a_{ij}x_ix_j$, and these
c's are substituted in (4) in place of the y's, the last two terms of
the second member of this identity are zero, and we have
(5) $\sum_{1}^{n} a_{ij}(z_i + \lambda c_i)(z_j + \lambda c_j) \equiv \sum_{1}^{n} a_{ij}z_iz_j;$
and conversely, if (5) holds, $(c_1, \ldots, c_n)$ is a vertex; for subtracting
(5) from (4), after substituting the c's for the y's in (4), we have
$2\lambda \sum_{1}^{n} a_{ij}z_ic_j + \lambda^2 \sum_{1}^{n} a_{ij}c_ic_j \equiv 0,$
and, this being an identity in $\lambda$ as well as in the z's, we have
$\sum_{1}^{n} a_{ij}z_ic_j \equiv 0.$
Thus we have proved the following theorem:
Theorem 3. A necessary and sufficient condition that $(c_1, \ldots, c_n)$
be a vertex of the quadratic form (1) is that, $z_1, \ldots, z_n$ and $\lambda$ being
independent variables, the identity (5) be fulfilled.
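A short numerical check may make the two characterizations of a vertex concrete. This is a sketch under assumptions: the function names and the sample form $(x_1 + x_2)^2 - x_3^2$ are chosen for illustration and do not come from the text. It tests the linear equations (3) directly and then spot-checks identity (5) at one point z and one value of $\lambda$.

```python
def quad(a, x):
    """Value of the quadratic form with matrix a at the point x."""
    n = len(a)
    return sum(a[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

def is_vertex(a, c):
    """True when c is not all zeros and satisfies every equation of system (3)."""
    return any(c) and all(sum(a_ij * c_j for a_ij, c_j in zip(row, c)) == 0
                          for row in a)

# Hypothetical sample form (x1 + x2)^2 - x3^2, of rank 2 in three variables.
a = [[1, 1, 0],
     [1, 1, 0],
     [0, 0, -1]]
c = (1, -1, 0)

print(is_vertex(a, c))
print(is_vertex(a, (1, 0, 0)))

# Spot check of identity (5): moving z along the vertex leaves the form unchanged.
z, lam = (3, 5, 2), 7
shifted = [z_i + lam * c_i for z_i, c_i in zip(z, c)]
print(quad(a, shifted) == quad(a, z))
```

A single sample point does not prove the identity; it only illustrates what Theorem 3 asserts for all z and $\lambda$.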
EXERCISES
1. Prove that if $(c_1, \ldots, c_n)$ is a vertex of (1), and $(y_1, \ldots, y_n)$ is any point at
which the quadratic form vanishes, then the quadratic form vanishes at every
point linearly dependent on c and y.
2. State and prove a converse to 1.
45. Reduction of a Quadratic Form to a Sum of Squares. If in
the quadratic form
(1) $\phi(x_1, \ldots, x_n) \equiv \sum_{1}^{n} a_{ij}x_ix_j$
the coefficient $a_{ii}$ is not zero, we may simplify the form by the following
transformation due to Lagrange.
The difference
$\phi - \frac{1}{a_{ii}}(a_{i1}x_1 + \cdots + a_{in}x_n)^2$
is evidently independent of $x_i$. Denoting it by $\phi_1$, we have
$\phi \equiv \frac{1}{a_{ii}}(a_{i1}x_1 + \cdots + a_{in}x_n)^2 + \phi_1.$
If, then, we perform the nonsingular linear transformation
(2)
$x'_1 = a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_n,$
$x'_2 = x_1,$
$\cdots$
$x'_i = x_{i-1},$
$x'_{i+1} = x_{i+1},$
$\cdots$
$x'_n = x_n,$
the quadratic form $\phi$ is reduced to the form
(3) $\frac{1}{a_{ii}}x_1'^2 + \phi_1(x'_2, \ldots, x'_n),$
in which all the terms in $x'_1$ are wanting except the term in $x_1'^2$.
It will be seen that this reduction can in general be performed in
a variety of ways. It becomes impossible only when the coefficients
of all the square terms in the original quadratic form are zero.
Unless, in the new quadratic form $\phi_1$, the coefficients of all the
square terms are zero, we can apply the same reduction to this form
by subjecting the variables $x'_2, \ldots, x'_n$ to a suitable nonsingular linear
transformation. This transformation may also be regarded as a nonsingular
linear transformation of all the $x'$'s: $(x'_1, x'_2, \ldots, x'_n)$, if we write
$x''_1 = x'_1$. We thus reduce (3) to the form
Applying this reduction now to $\phi_2$, and proceeding as before, we
see that by a number of successive nonsingular transformations the
form $\phi$ can finally be reduced to the form:
(5) $c_1x_1^2 + c_2x_2^2 + \cdots + c_nx_n^2.$
These successive transformations can now be combined into a single
nonsingular linear transformation, and we are thus led to the
Theorem. Every quadratic form in n variables can be reduced to
the form (5) by a nonsingular linear transformation.
The proof of this theorem is not yet complete; for if at any
stage of the reduction the quadratic form still to be reduced has the peculiarity that
all its square terms are wanting, the next step in the reduction will be
impossible by the method we have used. Before considering this point,
we will illustrate the method of reduction by a numerical case.
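Example.
$\phi \equiv 2x_1^2 + x_1x_2 + 8x_1x_3$
$\qquad + x_2x_1 - 3x_2^2 + 9x_2x_3$
$\qquad + 8x_3x_1 + 9x_3x_2 + 2x_3^2$
$\quad \equiv \tfrac{1}{2}(2x_1 + x_2 + 8x_3)^2 + \phi_1,$
where
$\phi_1 \equiv -\tfrac{1}{2}(x_2 + 8x_3)^2 - 3x_2^2 + 18x_2x_3 + 2x_3^2$
$\quad \equiv -\tfrac{7}{2}x_2^2 + 10x_2x_3 - 30x_3^2$
$\quad \equiv -\tfrac{2}{7}\left(-\tfrac{7}{2}x_2 + 5x_3\right)^2 - \tfrac{160}{7}x_3^2.$
Accordingly, by means of the nonsingular linear transformation
$x'_1 = 2x_1 + x_2 + 8x_3,$
$x'_2 = -\tfrac{7}{2}x_2 + 5x_3,$
$x'_3 = x_3,$
the form $\phi$ reduces to $\tfrac{1}{2}x_1'^2 - \tfrac{2}{7}x_2'^2 - \tfrac{160}{7}x_3'^2$.
We have given here merely one method of reduction. Three different
methods were open to us at the first step and two at the second.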
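The numerical example can be reproduced mechanically. The following is a minimal sketch, assuming that the leading square term is used as the pivot at every step and never vanishes; the function names are invented for the illustration. It returns, for each step, the pivot $p_k$ and the linear form $L_k$ such that $\phi \equiv \sum_k L_k^2 / p_k$.

```python
from fractions import Fraction

def lagrange_reduce(matrix):
    """Return (pivots, forms) with phi = sum_k forms[k](x)^2 / pivots[k].

    Assumes the leading square coefficient is nonzero at every step; the
    case in which it vanishes needs the second method described in the text.
    """
    a = [[Fraction(v) for v in row] for row in matrix]
    n = len(a)
    pivots, forms = [], []
    for k in range(n):
        p = a[k][k]
        if p == 0:
            raise ValueError("leading square term is zero; Lagrange's step fails")
        pivots.append(p)
        forms.append([Fraction(0)] * k + a[k][k:])  # coefficients of L_k
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                a[i][j] -= a[i][k] * a[k][j] / p   # matrix of the next phi
    return pivots, forms

def quad(matrix, x):
    """Value of the quadratic form with the given matrix at x."""
    n = len(matrix)
    return sum(matrix[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

def reduced_value(pivots, forms, x):
    """Value of sum_k L_k(x)^2 / p_k."""
    return sum(sum(c * x_i for c, x_i in zip(form, x)) ** 2 / p
               for p, form in zip(pivots, forms))

a = [[2, 1, 8],
     [1, -3, 9],
     [8, 9, 2]]   # matrix of the form in the Example
pivots, forms = lagrange_reduce(a)
print(pivots)
print(forms)
print(quad(a, (1, 2, 3)) == reduced_value(pivots, forms, (1, 2, 3)))
```

For the last variable the sketch takes $L_3 = -\tfrac{160}{7}x_3$ with pivot $-\tfrac{160}{7}$, which contributes the same term $-\tfrac{160}{7}x_3^2$ as the choice $x'_3 = x_3$ made in the Example.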
We proceed now to complete the proof of the general theorem.
Let us suppose that the coefficients of all the square terms in $\phi$ are
zero,* but that $a_{12} \neq 0$. Then
$\phi(x_1, \ldots, x_n) \equiv 2a_{12}x_1x_2 + 2x_1(a_{13}x_3 + \cdots + a_{1n}x_n)$
$\qquad + 2x_2(a_{23}x_3 + \cdots + a_{2n}x_n) + \sum_{3}^{n} a_{ij}x_ix_j$
$\quad \equiv \frac{2}{a_{12}}(a_{12}x_2 + a_{13}x_3 + \cdots + a_{1n}x_n)(a_{21}x_1 + a_{23}x_3 + \cdots + a_{2n}x_n) + \phi_1,$
where
$\phi_1 \equiv -\frac{2}{a_{12}}(a_{13}x_3 + \cdots + a_{1n}x_n)(a_{23}x_3 + \cdots + a_{2n}x_n) + \sum_{3}^{n} a_{ij}x_ix_j.$
* This method may be used whenever $a_{11} = a_{22} = 0$, whether all the other coefficients
$a_{ii}$ are zero or not.
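The nonsingular linear transformation
$x'_1 = a_{12}x_2 + a_{13}x_3 + \cdots + a_{1n}x_n,$
$x'_2 = a_{21}x_1 + a_{23}x_3 + \cdots + a_{2n}x_n,$
$x'_3 = x_3,$
$\cdots$
$x'_n = x_n$
thus reduces $\phi$ to the form
$\frac{2}{a_{12}}x'_1x'_2 + \phi_1(x'_3, \ldots, x'_n).$
The further nonsingular transformation
$x'_1 = x''_1 + x''_2,$
$x'_2 = x''_1 - x''_2,$
$x'_3 = x''_3,$
$\cdots$
$x'_n = x''_n$
reduces $\phi$ to the form
$\frac{2}{a_{12}}x_1''^2 - \frac{2}{a_{12}}x_2''^2 + \phi_1(x''_3, \ldots, x''_n).$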
The above reduction was performed on the supposition that $a_{12} \neq 0$.
It is clear, however, that only a slight change in notation would be
necessary to carry through a similar reduction if $a_{12} = 0$ but $a_{ij} \neq 0$.
The only case to which the reduction does not apply is, therefore,
the one in which all the coefficients of the quadratic form are zero,
a case in which no further reduction is necessary or possible.
We thus see that whenever Lagrange's reduction fails, the method
last explained will apply, and thus our theorem is completely
established.
EXERCISES
1. Given a quadratic form in which $n = 5$ and $a_{ij} = |i - j|$. Reduce to the
form (5).
2. Reduce the quadratic form
$9x^2 - 6y^2 - 8z^2 + 6xy - 14xz + 18xw + 8yz + 12yw - 4zw$
to the form (5).
3. Prove that if $(y_1, \ldots, y_n)$ is any point at which a given quadratic form is
not zero, a linear transformation can be found (and that in an infinite number of
ways) which carries this point into the point (0, ..., 0, 1) and its polar into $kx_n$;
and show that this linear transformation eliminates from the quadratic form all
terms in $x_n$ except the term in $x_n^2$, which then has a coefficient not zero.
4. Prove that the transformations described in Exercise 3 are the only ones
which have the effect there described.
5. Show how the two methods of reduction explained in this section come as
special cases under the transformation of Exercise 3.
46. A Normal Form, and the Equivalence of Quadratic Forms.
In the method of reduction explained in the last section, it may
happen that, after we have taken a number of steps, and thus
reduced $\phi$ to the form
$c_1x_1^2 + \cdots + c_kx_k^2 + \phi_k(x_{k+1}, \ldots, x_n),$
the form $\phi_k$ is identically zero. In this case no further reduction
would be necessary and the form (5) of the last section to which $\phi$ is
reduced would have the peculiarity that
$c_{k+1} = c_{k+2} = \cdots = c_n = 0,$
while all the earlier c's are different from zero. It is easy to see just
when this case will occur.
For this purpose, consider the matrix
$\begin{pmatrix} c_1 & 0 & \cdots & 0 \\ 0 & c_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & c_n \end{pmatrix}$
of the reduced form (5) of § 45. It is clear that the rank of this
matrix is precisely equal to the number of c's different from zero;
and, since the rank of this reduced form is the same as that of the
original form, we have the result:
Theorem 1. A necessary and sufficient condition that it be possible
to reduce a quadratic form by means of a nonsingular linear transformation
to the form
(1) $c_1x_1^2 + \cdots + c_rx_r^2,$
where none of the c's are zero, is that the rank of the quadratic form be r.
This form (1) involves r coefficients $c_1, \ldots, c_r$. That the values of
these coefficients, apart from the fact that none of them are zero, are
immaterial will be seen if we consider the effect on (1) of the transformation
(2)
$x_i = \sqrt{k_i/c_i}\; x'_i \qquad (i = 1, \ldots, r),$
$x_{r+1} = x'_{r+1},$
$\cdots$
$x_n = x'_n,$
where $k_1, \ldots, k_r$ are arbitrarily given constants none of which, however,
is to be zero. The transformation (2) is nonsingular, and
reduces (1) to the form
(3) $k_1x_1'^2 + \cdots + k_rx_r'^2.$
Thus we have proved
Theorem 2. A quadratic form of rank r can be reduced by means
of a nonsingular linear transformation to the form (3), where the values
of the constants $k_1, \ldots, k_r$ may be assigned at pleasure provided none of
them are zero.
If, in particular, we assign to all the k's the value 1, we get
Theorem 3. Every quadratic form of rank r can be reduced to
the normal form
(4) $x_1^2 + \cdots + x_r^2$
by means of a nonsingular linear transformation.
From this follows
Theorem 4. A necessary and sufficient condition that two quadratic
forms be equivalent with regard to nonsingular linear transformations
is that they have the same rank.
That this is a necessary condition is evident from the fact that the
rank is an invariant. That it is a sufficient condition follows from
the fact that, if the ranks are the same, both forms can be reduced
to the same normal form (4).
The normal form (4) has no special advantage, except its symmetry,
over any other form which could be obtained from (3) by
assigning to the k's particular numerical values. Thus, for instance,
a normal form which might be used in place of (4) is
This form would have the advantage, in geometrical work, of giving
rise to a real locus.
Finally we note that the transformations used in this section are
not necessarily real, even though the form we start with be real.
EXERCISE
Apply the results of this section to the study of quadric surfaces.
47. Reducibility. A quadratic form is called reducible when it is
identically equal to the product of two linear forms, that is, when
(1) $\sum_{1}^{n} a_{ij}x_ix_j \equiv (b_1x_1 + b_2x_2 + \cdots + b_nx_n)(c_1x_1 + c_2x_2 + \cdots + c_nx_n).$
Let us seek a necessary and sufficient condition that this be the case.
We begin by supposing the identity (1) to hold, and we consider in
succession the case in which the two factors in the right-hand member
of (1) are linearly independent, and that in which they are proportional.
In the first case the b's are not all proportional to the
corresponding c's, and by a mere change of notation we may insure
$b_1, b_2$ not being proportional to $c_1, c_2$. This being done, the transformation
$x'_1 = b_1x_1 + b_2x_2 + \cdots + b_nx_n,$
$x'_2 = c_1x_1 + c_2x_2 + \cdots + c_nx_n,$
$x'_3 = x_3,$
$\cdots$
$x'_n = x_n$
is nonsingular and carries our quadratic form over into the form
$x'_1x'_2.$
The matrix of this form is readily seen to be of rank 2, hence the
original form was of rank 2.
Turning now to the case in which the two factors in (1) are
proportional to each other, we see that (1) may be written
$\sum_{1}^{n} a_{ij}x_ix_j \equiv C(b_1x_1 + \cdots + b_nx_n)^2,$ where $C \neq 0$.
Unless all the b's are zero (in which case the rank of the quadratic
form is zero) we may without loss of generality suppose $b_1 \neq 0$, in
which case the linear transformation
$x'_1 = b_1x_1 + \cdots + b_nx_n,$
$x'_2 = x_2,$
$\cdots$
$x'_n = x_n$
will be nonsingular and will reduce the quadratic form to
$Cx_1'^2,$
which is of rank 1.
Thus we have shown that if a quadratic form is reducible, its
rank is 0, 1, or 2. We wish now, conversely, to prove that every
quadratic form whose rank has one of these values is reducible.
A quadratic form of rank zero is obviously reducible.
A form of rank 1 can be reduced by a nonsingular linear transformation
to the form $x_1'^2$, that is,
$\sum_{1}^{n} a_{ij}x_ix_j \equiv x_1'^2.$
If here we substitute for $x'_1$ its value in terms of the x's, it is clear
that the form is reducible.
A form of rank 2 can be reduced to the form $x_1'^2 + x_2'^2$, that is,
$\sum_{1}^{n} a_{ij}x_ix_j \equiv x_1'^2 + x_2'^2 \equiv (x'_1 + \sqrt{-1}\,x'_2)(x'_1 - \sqrt{-1}\,x'_2).$
Here again, replacing $x'_1$ and $x'_2$ by their values in terms of the x's,
the reducibility of the form follows. Hence,
Theorem. A necessary and sufficient condition that a quadratic
form be reducible is that its rank be not greater than 2.
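The criterion of this theorem lends itself to a direct computational test. The sketch below is an illustration only; the function names and sample coefficients are assumptions. It builds the symmetric matrix $a_{ij} = \tfrac{1}{2}(b_ic_j + b_jc_i)$ of a product of two linear forms and checks that its rank does not exceed 2, while the form $x_1^2 + x_2^2 + x_3^2$, of rank 3, fails the test.

```python
from fractions import Fraction

def rank(matrix):
    """Rank of a matrix, by Gaussian elimination over exact rationals."""
    m = [[Fraction(v) for v in row] for row in matrix]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [x - factor * p for x, p in zip(m[i], m[r])]
        r += 1
    return r

def product_matrix(b, c):
    """Symmetric matrix of the quadratic form (b . x) * (c . x)."""
    n = len(b)
    return [[Fraction(b[i] * c[j] + b[j] * c[i], 2) for j in range(n)]
            for i in range(n)]

def is_reducible(a):
    """Apply the criterion of the theorem: the rank is 0, 1, or 2."""
    return rank(a) in (0, 1, 2)

independent = product_matrix((1, 2, 3, 4), (2, -1, 0, 5))   # two distinct factors
proportional = product_matrix((1, 2, 3, 4), (1, 2, 3, 4))   # a perfect square
sum_of_three_squares = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

print(rank(independent), is_reducible(independent))
print(rank(proportional), is_reducible(proportional))
print(rank(sum_of_three_squares), is_reducible(sum_of_three_squares))
```

The test only decides reducibility; finding the two linear factors of a form of rank 2 requires, in general, complex coefficients, as the factorization of $x_1'^2 + x_2'^2$ shows.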
48. Integral Rational Invariants of a Quadratic Form. We have
seen that the discriminant a of a quadratic form is an invariant of
weight 2. Any integral power of a, or more generally, any constant
multiple of such a power, will therefore also be an invariant. We
will now prove conversely the
Theorem. Every integral rational invariant of a quadratic form
is a constant multiple of some power of the discriminant.
Let us begin by assuming that the quadratic form
(1) $\sum_1^n a_{ij}x_ix_j$
is nonsingular, and let $c$ be the determinant of a linear
transformation which carries it over into the normal form
(2) $x_1'^2 + x_2'^2 + \cdots + x_n'^2$.
Let $I(a_{11}, \cdots a_{nn})$ be any integral rational invariant of (1) of weight $\mu$,
and denote by $k$ the value of this invariant when formed from (2).
It is clear that $k$ is a constant, that is, independent of the coefficients
$a_{ij}$ of (1). Then
$$k = c^\mu I.$$
Moreover, the discriminant $a$ being of weight 2, and having for (2)
the value 1, we have
$$1 = c^2a.$$
Raising the last two equations to the powers 2 and $\mu$ respectively,
we get
$$k^2 = c^{2\mu}I^2, \qquad 1 = c^{2\mu}a^\mu.$$
From which follows
(3) $I^2 = k^2a^\mu$.
This formula has been established so far for all values of the
coefficients $a_{ij}$ for which $a \neq 0$. That it is really an identity in the
$a_{ij}$'s is seen at once by a reference to Theorem 5, § 2. The
polynomial on the right-hand side of (3) is of degree $\mu$ in $a_{11}$;* hence
we see that $\mu$ must be an even number, since $I^2$ is of even degree in
$a_{11}$. Letting $\mu = 2\nu$, we infer from (3) (cf. Exercise 1, § 2) that one
or the other of the identities
$$I \equiv ka^\nu, \qquad I \equiv -ka^\nu$$
must hold, and either of these identities establishes our theorem.
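The fact used above, that the discriminant is an invariant of weight 2, can be checked on a small example. The sketch below assumes the convention that substituting $x = Cx'$ replaces the matrix $A$ of the form by $C^{T}AC$; the matrices `A` and `C` and the helper names are illustrative choices, not taken from the text.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (small n only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

A = [[2, 1, 0], [1, 3, -1], [0, -1, 4]]   # symmetric matrix of a quadratic form
C = [[1, 2, 0], [0, 1, 5], [3, 0, 1]]     # matrix of the substitution x = C x'

transformed = matmul(matmul(transpose(C), A), C)

# discriminant of the transformed form = (det C)^2 * discriminant of the original
print(det(transformed) == det(C) ** 2 * det(A))
```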
A comparison of the result of this section with Theorem 4, § 46
will bring out clearly the essential difference between the two
conceptions of a complete system of invariants mentioned in § 29. It will
be seen that the rank of a quadratic form is in itself a complete
system of invariants for this form in the sense of Definition 2, § 29;
while the discriminant of the form is in itself a complete system in
the sense of the footnote appended to this definition.
* We assume here that $k \neq 0$, as otherwise the truth of the theorem would be
obvious.
49. A Second Method of Reducing a Quadratic Form to a Sum of
Squares. By the side of Lagrange's method of reducing a quadratic
form to a sum of squares, there are many other methods of
accomplishing the same result, one of the most useful of which we
proceed to explain. It depends on the following three theorems. The
proof of the first of these theorems is due to Kronecker and
establishes, in a remarkably simple manner, the fact that any quadratic
form of rank $r$ can be written in terms of $r$ variables only, a fact
which has already been proved by another method in Theorem 1, § 46.
Theorem 1. If the rank of the quadratic form
(1) $\phi(x_1, \cdots x_n) \equiv \sum_1^n a_{ij}x_ix_j$
is $r > 0$, and if the variables $x_1, \cdots x_n$ are so numbered that the $r$-rowed
determinant in the upper left-hand corner of its matrix is not zero,*
new variables $x_1', \cdots x_n'$ can be introduced by means of a nonsingular
linear transformation such that
$$x_i' = x_i \quad (i = r + 1, \cdots n),$$
and such that (1) reduces to the form
$$\sum_1^r a_{ij}x_i'x_j'.$$
This, it will be noticed, is a quadratic form in $r$ variables in which
the coefficients, so far as they go, are the same as in the given form (1).
In order to prove this theorem, we begin by finding a vertex
$(c_1, \cdots c_n)$ of the form (1) by means of Equations (3), § 44. Since the
$r$-rowed determinant which stands in the upper left-hand corner of
the matrix of these equations is not zero, the values of $c_{r+1}, \cdots c_n$ may
be chosen at pleasure, and the other $c$'s are then completely
determined. If we let $c_{r+1} = c_{r+2} = \cdots = c_{n-1} = 0$, $c_n = 1$, we get a vertex
$$(c_1, \cdots c_r, 0, \cdots 0, 1).$$
Using this vertex in the identity (5), § 44, we have
$$\phi(x_1 + \lambda c_1, \cdots x_r + \lambda c_r, x_{r+1}, \cdots x_{n-1}, x_n + \lambda) \equiv \phi(x_1, \cdots x_n).$$
If we let $\lambda = -x_n$, this identity reduces to
$$\phi(x_1 - c_1x_n, \cdots x_r - c_rx_n, x_{r+1}, \cdots x_{n-1}, 0) \equiv \phi(x_1, \cdots x_n).$$
* That such an arrangement is possible is evident from Theorem 3, § 20.
Accordingly, if we perform the nonsingular linear transformation *
$$x_i' = x_i - c_ix_n \quad (i = 1, \cdots r),$$
$$x_i' = x_i \quad (i = r + 1, \cdots n),$$
the quadratic form (1) reduces to
$$\phi(x_1', \cdots x_{n-1}', 0) \equiv \sum_1^{n-1} a_{ij}x_i'x_j'.$$
This, being a quadratic form in $n - 1$ variables of rank $r$ and so
arranged that the $r$-rowed determinant which stands in the upper
left-hand corner of its matrix is not zero, can be reduced, by the
method just explained, to the form
$$\sum_1^{n-2} a_{ij}x_i''x_j'',$$
where the linear transformation used is nonsingular and such that
$$x_i'' = x_i' \quad (i = r + 1, \cdots n - 1).$$
By adding the formula $x_n'' = x_n'$,
we may regard this as a nonsingular linear transformation in the $n$
variables. This transformation may then be combined with the one
previously used, thus giving a nonsingular transformation in which
$$x_i'' = x_i \quad (i = r + 1, \cdots n),$$
and such that it reduces (1) to the form
$$\sum_1^{n-2} a_{ij}x_i''x_j''.$$
Proceeding in this way step by step, our theorem is at last
proved.
In the next two theorems we denote by $A_{ij}$ in the usual way the
cofactor of $a_{ij}$ in the discriminant $a$ of the quadratic form (1).
Theorem 2. If $A_{nn} \neq 0$, new variables $x_1', \cdots x_n'$ can be introduced
by a nonsingular transformation in such a way that
$$x_n' = x_n$$
and that (1) takes the form
$$\sum_1^{n-1} a_{ij}x_i'x_j' + \frac{a}{A_{nn}}x_n'^2.$$
* This transformation should be compared with Exercise 2, § 41.
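To prove this we consider the quadratic form
$$\sum_1^n a_{ij}x_ix_j - \frac{a}{A_{nn}}x_n^2.$$
Its discriminant is
$$\begin{vmatrix} a_{11} & \cdots & a_{1,n-1} & a_{1n} \\ \vdots & & \vdots & \vdots \\ a_{n-1,1} & \cdots & a_{n-1,n-1} & a_{n-1,n} \\ a_{n1} & \cdots & a_{n,n-1} & a_{nn} - \dfrac{a}{A_{nn}} \end{vmatrix} = a - \frac{a}{A_{nn}}A_{nn} = 0.$$
Hence by means of a nonsingular transformation of the kind used
in the last theorem, an essential point being that $x_n' = x_n$, we get
$$\sum_1^n a_{ij}x_ix_j - \frac{a}{A_{nn}}x_n^2 \equiv \sum_1^{n-1} a_{ij}x_i'x_j',$$
or
$$\sum_1^n a_{ij}x_ix_j \equiv \sum_1^{n-1} a_{ij}x_i'x_j' + \frac{a}{A_{nn}}x_n'^2.$$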
Theorem 3. If
$$A_{nn} = A_{n-1,n-1} = 0, \qquad A_{n,n-1} \neq 0,$$
new variables $x_1', \cdots x_n'$ can be introduced by a nonsingular
transformation in such a way that
$$x_n' = x_n, \qquad x_{n-1}' = x_{n-1},$$
and that (1) takes the form
$$\sum_1^{n-2} a_{ij}x_i'x_j' + \frac{2a}{A_{n,n-1}}x_n'x_{n-1}'.$$
Let us denote by $B$ the determinant obtained by striking out the
last two rows and columns of $a$. Then (cf. Corollary 3, § 11) we
have
(2) $aB = \begin{vmatrix} A_{n-1,n-1} & A_{n-1,n} \\ A_{n,n-1} & A_{nn} \end{vmatrix} = -A_{n,n-1}^2 \neq 0.$
Consider, now, the quadratic form
(3) $\sum_1^n a_{ij}x_ix_j - \dfrac{2a}{A_{n,n-1}}x_nx_{n-1}.$
Its discriminant is
(4) $\begin{vmatrix} a_{11} & \cdots & a_{1,n-1} & a_{1n} \\ \vdots & & \vdots & \vdots \\ a_{n-1,1} & \cdots & a_{n-1,n-1} & a_{n-1,n} - \dfrac{a}{A_{n,n-1}} \\ a_{n1} & \cdots & a_{n,n-1} - \dfrac{a}{A_{n,n-1}} & a_{nn} \end{vmatrix} = a - \dfrac{2a}{A_{n,n-1}}A_{n,n-1} - \left(\dfrac{a}{A_{n,n-1}}\right)^2 B,$
which has the value zero, as we see by making use of (2). Not only
does the determinant (4) vanish, but its principal minors obtained
by striking out its last row and column and its next to the last row
and column are zero, being $A_{nn}$ and $A_{n-1,n-1}$ respectively. The minor
obtained by striking out the last two rows and columns from (4) is
$B$, and, by (2), this is not zero. Thus we see (cf. Theorem 1, § 20)
that the determinant (4) is of rank $n - 2$. Hence, by Theorem 1,
we can reduce (3) by a nonsingular linear transformation in which
$x_{n-1}' = x_{n-1}$, $x_n' = x_n$ to the form
$$\sum_1^{n-2} a_{ij}x_i'x_j'.$$
Hence
$$\sum_1^n a_{ij}x_ix_j \equiv \sum_1^{n-2} a_{ij}x_i'x_j' + \frac{2a}{A_{n,n-1}}x_n'x_{n-1}'.$$
Corollary. Under the conditions of Theorem 3, the quadratic
form (1) can be reduced to the form
$$\sum_1^{n-2} a_{ij}x_i'x_j' + \frac{2a}{A_{n,n-1}}(x_{n-1}'^2 - x_n'^2)$$
by a nonsingular linear transformation.
To see this we have merely first to perform the reduction of
Theorem 3, and then to follow this by the additional nonsingular
transformation
$$x_i' = x_i'' \quad (i = 1, 2, \cdots n - 2),$$
$$x_{n-1}' = x_{n-1}'' - x_n'',$$
$$x_n' = x_{n-1}'' + x_n'',$$
which gives $x_{n-1}'x_n' = x_{n-1}''^2 - x_n''^2$.
Having thus established these three theorems, the method of
reducing a quadratic form completely is obvious. If the form (1) is
singular, we begin by reducing it by Theorem 1 to
$$\sum_1^r a_{ij}x_i'x_j',$$
where $r$ is the rank of the form. Unless all the principal
$(r - 1)$-rowed minors of the discriminant of this form are zero, the order
of the variables $x_1', \cdots x_r'$ can be so arranged that the reduction of
Theorem 2 is possible, a reduction which may be regarded as a
nonsingular linear transformation of all $n$ variables. If all the
principal $(r - 1)$-rowed minors are zero, there will be at least one of the
cofactors $A_{ij}$ which is not zero, and, by a suitable rearrangement of
the order of the variables, this may be taken as $A_{r,r-1}$. The
reduction of Theorem 3, Corollary, will then be possible. Proceeding in
this way, we finally reach the result, precisely as in Theorem 1, § 46,
that a quadratic form of rank $r$ can always be reduced by a
nonsingular linear transformation to the form
$$c_1x_1^2 + \cdots + c_rx_r^2.$$
It may be noticed that the arrangement of the transformation of
this section is in a certain sense precisely the reverse of that of § 45,
inasmuch as we here leave at each step the coefficients of the
unreduced part of the form unchanged, but change the variables which
enter into this part; while in § 45 we change the coefficients of the
unreduced part, but leave the variables in it unchanged.
CHAPTER XI
REAL QUADRATIC FORMS
50. The Law of Inertia. We come now to the study of real
quadratic forms and the effect produced on them by real linear
transformations.
We notice, here, to begin with, that the only operations involved
in the last chapter are rational operations (i.e. addition, subtraction,
multiplication, and division) with the single exception of the radicals
which come into formula (2), § 46. In particular the reduction of
§ 45 (or the alternative reduction of § 49) involves only rational
operations. Consequently, since rational operations performed on real
quantities give real results, we have
Theorem 1. A real quadratic form of rank $r$ can be reduced by
means of a real nonsingular linear transformation to the form
(1) $c_1x_1'^2 + c_2x_2'^2 + \cdots + c_rx_r'^2$,
where $c_1, \cdots c_r$ are real constants none of which are zero.
As we saw in the last chapter, this reduction can be performed in a
variety of ways, and the values of the coefficients $c_1, \cdots c_r$ in the reduced
form will be different for the different reductions. The signs of these
coefficients, apart from the order in which they occur, will not depend
on the particular reduction used, as is stated in the following
important theorem discovered independently by Jacobi and Sylvester
and called by the latter the Law of Inertia of Quadratic Forms:
Theorem 2. If a real quadratic form of rank $r$ is reduced by two
real nonsingular linear transformations to the forms (1) and
(2) $k_1x_1''^2 + k_2x_2''^2 + \cdots + k_rx_r''^2$
respectively, then the number of positive $c$'s in (1) is equal to the number
of positive $k$'s in (2).
In order to prove this, let us suppose that the $x'$'s and $x''$'s have
been so numbered that the first $\mu$ of the $c$'s and the first $\nu$ of the $k$'s
are positive while all the remaining $c$'s and $k$'s are negative. Our
theorem will be established if we can show that $\mu = \nu$. If this is
not the case, one of the two integers $\mu$ and $\nu$ must be the greater,
and it is merely a matter of notation to assume that $\mu > \nu$. We will
prove that this assumption leads to a contradiction.
If we regard the $x'$'s and $x''$'s simply as abbreviations for certain
linear forms in the $x$'s, (1) and (2) are both of them identically equal
to the original quadratic form, and hence to each other. This
identity may be written
(3) $c_1x_1'^2 + \cdots + c_\mu x_\mu'^2 - |c_{\mu+1}|x_{\mu+1}'^2 - \cdots - |c_r|x_r'^2 \equiv k_1x_1''^2 + \cdots + k_\nu x_\nu''^2 - |k_{\nu+1}|x_{\nu+1}''^2 - \cdots - |k_r|x_r''^2.$
Let us now consider the system of homogeneous linear equations
in $(x_1, \cdots x_n)$,
(4) $x_1'' = 0, \cdots x_\nu'' = 0, \quad x_{\mu+1}' = 0, \cdots x_n' = 0.$
We have here $\nu + n - \mu < n$ equations. Hence, by Theorem 3,
Corollary 1, § 17, we can find a solution of these equations in which
all the unknowns are not zero. Let $(y_1, \cdots y_n)$ be such a solution and
denote by $y_i', y_i''$ the values of $x_i', x_i''$ when the constants $y_1, \cdots y_n$ are
substituted in them for the variables $x_1, \cdots x_n$. Substituting the $y$'s
for the $x$'s in (3) gives
$$c_1y_1'^2 + \cdots + c_\mu y_\mu'^2 = -|k_{\nu+1}|y_{\nu+1}''^2 - \cdots - |k_r|y_r''^2.$$
The expression on the left cannot be negative, and that on the right
cannot be positive, hence they must both be zero; and this is
possible only if
$$y_1' = \cdots = y_\mu' = 0.$$
But by (4) we also have $y_{\mu+1}' = \cdots = y_n' = 0$.
That is, $(y_1, \cdots y_n)$ is a solution, not composed exclusively of zeros, of
the system of $n$ homogeneous linear equations in $n$ unknowns,
$$x_1' = 0, \quad x_2' = 0, \quad \cdots \quad x_n' = 0.$$
The determinant of these equations must therefore be zero, that is,
the linear transformation which carries over the $x$'s into the $x'$'s must
be a singular transformation. We are here led to a contradiction,
and our theorem is proved.
We can thus associate with every real quadratic form two
integers $P$ and $N$, namely, the number of positive and negative
coefficients respectively which we get when we reduce the form by any
real nonsingular linear transformation to the form (1). These two
numbers are evidently arithmetical invariants of the quadratic form
with regard to real nonsingular linear transformations, since two
real quadratic forms which can be transformed into one another by
means of such a transformation can obviously be reduced to the same
expression of form (1).*
The two arithmetical invariants $P$ and $N$ which we have thus
arrived at, and the arithmetical invariant $r$ which we had before, are
not independent since we have the relation
(5) $P + N = r.$
One of the invariants $P$ and $N$ is therefore superfluous and either
might be dispensed with. It is found more convenient, however,
to use neither $P$ nor $N$, but their difference,
(6) $s = P - N,$
which is called the signature of the quadratic form.
Definition. By the signature of a real quadratic form is
understood the difference between the number of positive and the number of
negative coefficients which we obtain when we reduce the form by any
real nonsingular linear transformation to the form (1).
Since the integers $P$ and $N$ used above were arithmetical
invariants, their difference $s$ will also be an arithmetical invariant. It
should be noticed, however, that $s$ is not necessarily a positive
integer. We have thus proved
Theorem 3. The signature of a quadratic form is an arithmetical
invariant with regard to real nonsingular linear transformations.
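The invariance of rank and signature may be observed numerically. The following is only a sketch under stated assumptions: it reduces a symmetric matrix to diagonal form by simultaneous row and column operations with rational arithmetic (a reduction in the spirit of §§ 45 and 49, not a transcription of either), and the names `diagonalize` and `rank_and_signature`, as well as the sample matrices, are illustrative.

```python
from fractions import Fraction

def diagonalize(matrix):
    """Diagonal entries reached from a symmetric matrix by congruence,
    i.e. by applying each row operation together with the same column operation."""
    a = [[Fraction(v) for v in row] for row in matrix]
    n = len(a)

    def add_multiple(src, dst, t):
        # row dst += t * row src, then column dst += t * column src
        for j in range(n):
            a[dst][j] += t * a[src][j]
        for i in range(n):
            a[i][dst] += t * a[i][src]

    def swap(p, q):
        a[p], a[q] = a[q], a[p]
        for row in a:
            row[p], row[q] = row[q], row[p]

    for k in range(n):
        if a[k][k] == 0:
            j = next((j for j in range(k + 1, n) if a[j][j] != 0), None)
            if j is not None:
                swap(k, j)                 # bring a nonzero square term forward
            else:
                j = next((j for j in range(k + 1, n) if a[k][j] != 0), None)
                if j is None:
                    continue               # variable k no longer occurs
                add_multiple(j, k, 1)      # now a[k][k] = 2*a[k][j], not zero
        for i in range(k + 1, n):
            if a[i][k] != 0:
                add_multiple(k, i, -a[i][k] / a[k][k])
    return [a[i][i] for i in range(n)]

def rank_and_signature(matrix):
    d = diagonalize(matrix)
    p = sum(1 for v in d if v > 0)
    neg = sum(1 for v in d if v < 0)
    return p + neg, p - neg

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

B = [[0, 1, 0], [1, 0, 0], [0, 0, -3]]    # the form 2*x1*x2 - 3*x3^2
C = [[1, 2, 0], [0, 1, 5], [3, 0, 1]]     # a real nonsingular transformation
congruent = matmul(matmul(transpose(C), B), C)

print(rank_and_signature(B), rank_and_signature(congruent))
```

Both calls return the same pair $(r, s)$, although the diagonal coefficients reached along the way differ.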
EXERCISES
1. Prove that the rank $r$ and the signature $s$ of a quadratic form are either
both even or both odd; and that
$$-r \leq s \leq r.$$
2. Prove that any two integers $r$ and $s$ ($r$ positive or zero) satisfying the
conditions of Exercise 1 may be the rank and signature respectively of a quadratic
form.
* P is sometimes called the index of inertia of the quadratic form.
3. Prove that a necessary and sufficient condition that a real quadratic form
of rank $r$ and signature $s$ be factorable into two real linear factors is that
either $r < 2$;
or $r = 2$, $s = 0$.
4. A quadratic form of rank $r$ shall be said to be regularly arranged (cf. § 20,
Theorem 4) if the $x$'s are so numbered that no two consecutive $A$'s are zero in the set
$$A_0 = 1, \quad A_1 = a_{11}, \quad A_2 = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}, \quad \cdots \quad A_r = \begin{vmatrix} a_{11} & \cdots & a_{1r} \\ \vdots & & \vdots \\ a_{r1} & \cdots & a_{rr} \end{vmatrix},$$
and that $A_r \neq 0$. Prove that if the form is real and any one of these $A$'s is zero,
the two adjacent $A$'s have opposite signs.
[Suggestion. In this exercise and the following ones, the work of § 49 should be
consulted.]
5. Prove that the signature of a regularly arranged real quadratic form is
equal to the number of permanences minus the number of variations of sign in the
sequence of the $A$'s, if the $A$'s which are zero are counted as positive or as
negative at pleasure.
6. Defining the expression sgn $x$ (read signum $x$) by the equations
sgn $x = +1$ if $x > 0$,
sgn $x = 0$ if $x = 0$,
sgn $x = -1$ if $x < 0$,
show that the signature of a regularly arranged real quadratic form of rank $r$ is
$$\operatorname{sgn}(A_0A_1) + \operatorname{sgn}(A_1A_2) + \cdots + \operatorname{sgn}(A_{r-1}A_r).$$
51. Classification of Real Quadratic Forms. We saw in the last
section that a real quadratic form has two invariants with regard to
real nonsingular linear transformations: its rank and its
signature. The main result to be established in the present section
(Theorem 2) is that these two invariants form a complete system.
If in § 46 the $c$'s and $k$'s are real, the transformation (2) will be
real when, but only when, each $c$ has the same sign as the
corresponding $k$. All that we can infer from the reasoning of that section
now is, therefore, that if a real quadratic form of rank $r$ can be
reduced by a real nonsingular linear transformation to the form
$$c_1x_1^2 + \cdots + c_rx_r^2,$$
it can also be reduced by a real nonsingular linear transformation to
the form
$$k_1x_1^2 + \cdots + k_rx_r^2,$$
where the $k$'s are arbitrarily given real constants, not zero, subject
to the condition that each $k$ has the same sign as the corresponding $c$.
Using the letters $P$ and $N$ for the number of positive and negative $c$'s
respectively, the transformation can be so arranged that the first
$P$ $c$'s are positive, the last $N$ negative. Accordingly the first $P$ $k$'s
can be taken as $+1$, the last $N$ as $-1$. From equations (5) and (6)
of § 50, we see that $P$ and $N$ may be expressed in terms of the rank
and signature of the form by the formulae
(1) $P = \dfrac{r + s}{2}, \qquad N = \dfrac{r - s}{2}.$
Thus we have the theorem:
Theorem 1. A real quadratic form of rank $r$ and signature $s$ can be
reduced by a real nonsingular linear transformation to the normal form
(2) $x_1^2 + \cdots + x_P^2 - x_{P+1}^2 - \cdots - x_r^2,$
where $P$ is given by (1).
We are now able to prove the fundamental theorem:
Theorem 2. A necessary and sufficient condition that two real
quadratic forms be equivalent with regard to real nonsingular linear
transformations is that they have the same rank and the same signature.
That this is a necessary condition is evident from the invariance
of rank and signature. That it is sufficient follows from the fact
that if the two forms have the same rank and signature, they can
both be reduced to the same normal form (2).
Definition. All real quadratic forms, equivalent with regard to
real nonsingular linear transformations to a given form, and therefore to
each other, are said to form a class.*
* This term may be used in a similar manner whenever the conception of
equivalence is involved.
Thus, for instance, since every real nonsingular quadratic form
in four variables can be reduced to one or the other of the five
normal forms,
(3)
$$x_1^2 + x_2^2 + x_3^2 + x_4^2,$$
$$x_1^2 + x_2^2 + x_3^2 - x_4^2,$$
$$x_1^2 + x_2^2 - x_3^2 - x_4^2,$$
$$x_1^2 - x_2^2 - x_3^2 - x_4^2,$$
$$-x_1^2 - x_2^2 - x_3^2 - x_4^2,$$
we see that all such forms belong to one or the other of five classes
characterized by the values
$$s = 4, 2, 0, -2, -4, \qquad r = 4.$$
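The enumeration just made can be mechanized. The short sketch below lists every pair $(r, s)$ that arises from some choice of $P$ and $N$ with $P + N \leq n$; the function names `classes` and `nonsingular_signatures` are introduced here for illustration only.

```python
def classes(n):
    """All (rank, signature) pairs of real quadratic forms in n variables,
    one pair for each choice of P positive and N negative squares."""
    return [(p + q, p - q) for p in range(n + 1) for q in range(n + 1 - p)]

def nonsingular_signatures(n):
    """Signatures of the classes of rank n, largest first."""
    return sorted((s for r, s in classes(n) if r == n), reverse=True)

print(nonsingular_signatures(4))   # the five values of s listed above
print(len(classes(4)))             # number of classes for n = 4
```

For $n = 4$ the second count agrees with the value $\frac{1}{2}(n+1)(n+2)$ stated in Exercise 1 at the end of this section.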
If, however, as is the case in many problems in geometry, we are
concerned not with quadratic forms, but with the equations obtained
by equating these forms to zero, the number of classes to be
distinguished will be reduced by about one half, since two equations are
the same if their first members differ merely in sign.
Thus there are only three classes of nonsingular quadric surfaces
with real equations, whose normal forms are obtained by equating
the first three of the forms (3) to zero. These equations written in
nonhomogeneous coordinates are
$$X^2 + Y^2 + Z^2 = -1,$$
$$X^2 + Y^2 + Z^2 = 1,$$
$$X^2 + Y^2 - Z^2 = 1.$$
The first of these represents an imaginary sphere, the second a real
sphere, and the third an unparted hyperboloid generated by the
revolution of a rectangular hyperbola about its conjugate axis. It may readily
be proved that this last surface may also be generated by the revolution
of either of the lines
$$Y = 1, \qquad X = \pm Z$$
about the axis of $Z$. We may therefore say:
Theorem 3. There are three, and only three, classes of nonsingular
quadric surfaces with real equations. In the first the surfaces are
imaginary; in the second real, but their rulings are imaginary; in the third
they are real, and the rulings through their real points are real.*
This classification is complete from the point of view we have
adopted of regarding quadric surfaces as equivalent if one can be
transformed into the other by a real nonsingular collineation. The
more familiar classification does not adopt this projective view, but
distinguishes in our second class between ellipsoids, biparted
hyperboloids, and elliptic paraboloids; and in the third class between
unparted hyperboloids and hyperbolic paraboloids.
* If, as here, we consider not real quadratic forms, but real homogeneous quadratic
equations we must use, not $s$, but $|s|$ as an invariant. In place of $|s|$ we may use what is
known as the characteristic of the quadratic form, that is the smaller of the two
integers $P$, $N$. This characteristic is simply $\frac{1}{2}(r - |s|)$.
EXERCISES
1. Prove that there are $\frac{1}{2}(n + 1)(n + 2)$ classes of real quadratic forms in $n$
variables.
2. Give a complete classification of singular quadric surfaces with real
equations from the point of view of the present section.
52. Definite and Indefinite Forms.
Definition. By an indefinite quadratic form is understood a real
quadratic form such that, when it is reduced to the normal form (2),
§ 51, by a real nonsingular linear transformation, both positive and
negative signs occur. All other real quadratic forms are called definite;*
and we distinguish between positive and negative definite forms
according as the terms in the normal form are all positive or all negative.
In other words, a real quadratic form of rank $r$ and signature $s$
is definite if $s = \pm r$, otherwise it is indefinite.†
The names definite and indefinite have been given on account of
the following fundamental property :
Theorem 1. An indefinite quadratic form is positive for some real
values of the variables, negative for others. A positive definite form is
positive or zero for all real values of the variables; a negative definite
form, negative or zero.
The part of this theorem which relates to definite forms follows
directly from the definition. To prove the part concerning
indefinite forms, suppose the form reduced by a real nonsingular linear
transformation to the normal form
(1) $x_1'^2 + \cdots + x_P'^2 - x_{P+1}'^2 - \cdots - x_r'^2.$
Regarding the $x'$'s as abbreviations for certain real linear forms in
the $x$'s, let us consider the system of $n - P$ homogeneous linear
equations
(2) $x_{P+1}' = 0, \quad x_{P+2}' = 0, \quad \cdots \quad x_n' = 0.$
Since these equations are real, and their number is less than the
number of unknowns, they have a real solution not consisting
* Some writers reserve the name definite for nonsingular forms, and call the
singular definite forms semidefinite.
† Otherwise stated, the condition for a definite form is that the characteristic be
zero. Cf. the footnote to Theorem 3, § 51.
exclusively of zeros. Let $(y_1, \cdots y_n)$ be such a solution. This
solution cannot satisfy all the equations
(3) $x_1' = 0, \quad \cdots \quad x_P' = 0,$
for equations (2) and (3) together form a system of $n$ homogeneous
linear equations in $n$ unknowns whose determinant is not zero, since
it is the determinant of the linear transformation which reduces the
given quadratic form to the normal form (1). Accordingly, if we
substitute $(y_1, \cdots y_n)$ for the variables $(x_1, \cdots x_n)$ in the given
quadratic form, this form will have a positive value, as we see from the
reduced form (1).
Similarly, by choosing for the $x$'s a real solution of the equations
$$x_1' = 0, \quad \cdots \quad x_P' = 0, \quad x_{r+1}' = 0, \quad \cdots \quad x_n' = 0,$$
which does not consist exclusively of zeros, we see that the
quadratic form takes on a negative value.
We pass now to some theorems which will be better appreciated
by the reader if he considers their geometrical meaning in the
case $n = 4$.
Theorem 2. If an indefinite quadratic form is positive at the real
point $(y_1, \cdots y_n)$ and negative at the real point $(z_1, \cdots z_n)$, then there
are two real points linearly dependent on these two, but linearly
independent of each other, at which the quadratic form is zero, and neither
of which is a vertex of the form.
The condition that the quadratic form
(4) $\sum_1^n a_{ij}x_ix_j$
vanish at the point $(y_1 + \lambda z_1, \cdots y_n + \lambda z_n)$ is
$$\sum_1^n a_{ij}y_iy_j + 2\lambda\sum_1^n a_{ij}y_iz_j + \lambda^2\sum_1^n a_{ij}z_iz_j = 0.$$
This quadratic equation in $\lambda$ has two real distinct roots, since,
from our hypothesis that (4) is positive at $y$ and negative at $z$, it
follows that
$$\Bigl(\sum_1^n a_{ij}y_iz_j\Bigr)^2 - \Bigl(\sum_1^n a_{ij}y_iy_j\Bigr)\Bigl(\sum_1^n a_{ij}z_iz_j\Bigr) > 0.$$
Let us call these roots $\lambda_1$ and $\lambda_2$. Then the points
(5) $(y_1 + \lambda_1z_1, \cdots y_n + \lambda_1z_n), \quad (y_1 + \lambda_2z_1, \cdots y_n + \lambda_2z_n)$
are two real points linearly dependent on the points $y$ and $z$ at which
(4) vanishes.
Next notice that
(6) $\begin{vmatrix} y_i + \lambda_1z_i & y_j + \lambda_1z_j \\ y_i + \lambda_2z_i & y_j + \lambda_2z_j \end{vmatrix} = \begin{vmatrix} 1 & \lambda_1 \\ 1 & \lambda_2 \end{vmatrix} \cdot \begin{vmatrix} y_i & y_j \\ z_i & z_j \end{vmatrix}.$
Since the points $y$ and $z$ are linearly independent, the integers $i, j$
can be so chosen that the last determinant on the right of (6) is not
zero. Then the determinant on the left of (6) is not zero; and,
consequently, the points (5) are linearly independent.
In order, finally, to prove that neither of the points (5) is a
vertex, denote them for brevity by
$$(Y_1, \cdots Y_n), \quad (Z_1, \cdots Z_n).$$
Letting $\lambda_1 - \lambda_2 = 1/\mu$, we have
$$z_i = \mu Y_i - \mu Z_i \quad (i = 1, 2, \cdots n).$$
Therefore
(7) $\sum_1^n a_{ij}z_iz_j = \mu^2\sum_1^n a_{ij}Y_iY_j - 2\mu^2\sum_1^n a_{ij}Y_iZ_j + \mu^2\sum_1^n a_{ij}Z_iZ_j.$
Since the points $Y$ and $Z$ have been so determined that (4) vanishes
at them, the first and last terms on the right of (7) are zero. If
either $Y$ or $Z$ were a vertex, the middle term would also be zero;
but this is impossible since the left-hand member of (7) is, by
hypothesis, negative. Thus our theorem is proved.
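The construction in this proof is easily carried out numerically. The sketch below solves the quadratic equation in $\lambda$ for a given form and a given pair of points; the function names and the sample form $x_1^2 + x_2^2 - x_3^2$ are illustrative assumptions, and floating-point square roots are used, so the form vanishes at the computed points only up to rounding.

```python
import math

def bilinear(a, y, z):
    """Value of sum a_ij * y_i * z_j for a symmetric matrix a."""
    n = len(a)
    return sum(a[i][j] * y[i] * z[j] for i in range(n) for j in range(n))

def zero_points_on_line(a, y, z):
    """The two points y + lam*z at which the form vanishes,
    assuming the form is positive at y and negative at z."""
    qy, qz, b = bilinear(a, y, y), bilinear(a, z, z), bilinear(a, y, z)
    disc = b * b - qy * qz            # positive, since qy > 0 > qz
    roots = [(-b + sign * math.sqrt(disc)) / qz for sign in (1, -1)]
    return [[yi + lam * zi for yi, zi in zip(y, z)] for lam in roots]

form = [[1, 0, 0], [0, 1, 0], [0, 0, -1]]     # x1^2 + x2^2 - x3^2
points = zero_points_on_line(form, [1, 0, 0], [0, 0, 1])
for p in points:
    print(p, bilinear(form, p, p))
```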
For the sake of completeness we add the corollary, whose truth
is at once evident :
Corollary. The only points linearly dependent on y and z at
which the quadratic form vanishes are points linearly dependent on one
or the other of the points referred to in the theorem; and none of these
are vertices.
We come now to a theorem of fundamental importance in the
theory of quadratic forms.
Theorem 3. A necessary and sufficient condition that a real
quadratic form be definite is that it vanish at no real points except its
vertices and the point $(0, 0, \cdots 0)$.
Suppose, first, that we have a real quadratic form which vanishes
at no real points except its vertices and the point $(0, 0, \cdots 0)$. If it
were indefinite, we could (Theorem 1) find two real points $y, z$, at
one of which it is positive, at the other, negative. Hence (Theorem
2) we could find two real points linearly dependent on $y$ and $z$, at
which the quadratic form vanishes. Neither of these will be the
point $(0, 0, \cdots 0)$, since, by Theorem 2, they are linearly
independent. Moreover, they are neither of them vertices. Thus we see
that the form must be definite, and the sufficiency of the condition
is established.
It remains to be proved that a definite form can vanish only at
its vertices and at the point $(0, 0, \cdots 0)$.
Suppose (4) is definite and that $(y_1, \cdots y_n)$ is any real point at
which it vanishes. Then,
$$\sum_1^n a_{ij}(x_i + \lambda y_i)(x_j + \lambda y_j) \equiv \sum_1^n a_{ij}x_ix_j + 2\lambda\sum_1^n a_{ij}x_iy_j.$$
If $y$ were neither a vertex nor the point $(0, 0, \cdots 0)$, $\sum a_{ij}x_iy_j$ would
not vanish identically, and we could find a real point $(z_1, \cdots z_n)$ such
that
$$k = \sum_1^n a_{ij}z_iy_j \neq 0.$$
If we let $c = \sum_1^n a_{ij}z_iz_j$,
we have
(8) $\sum_1^n a_{ij}(z_i + \lambda y_i)(z_j + \lambda y_j) = c + 2\lambda k.$
For a given real value of $\lambda$, the left-hand side of this equation
is simply the value of the quadratic form (4) at a certain real point.
Accordingly, for different values of $\lambda$ it will not change sign, while
the right-hand side of (8) has opposite signs for large positive
and large negative values of $\lambda$. Thus the assumption that $y$ was
neither a vertex nor the point $(0, 0, \cdots 0)$ has led to a contradiction;
and our theorem is proved.
Corollary. A nonsingular definite quadratic form vanishes,
for real values of the variables, only when its variables are all zero.
As a simple application of the last corollary we will prove
Theorem 4. In a nonsingular definite form, none of the
coefficients of the square terms can be zero.
For suppose the form (4) were definite and nonsingular; and
that $a_{ii} = 0$. Then the form would vanish at the point
$$x_1 = \cdots = x_{i-1} = x_{i+1} = \cdots = x_n = 0, \quad x_i = 1;$$
and this is impossible, since this is not the point $(0, 0, \cdots 0)$.
EXERCISES
1. Definition. By an orthogonal transformation * is understood a linear
transformation which carries over the variables $(x_1, \cdots x_n)$ into the variables $(x_1', \cdots x_n')$ in such
a way that
$$x_1^2 + x_2^2 + \cdots + x_n^2 \equiv x_1'^2 + x_2'^2 + \cdots + x_n'^2.$$
Prove that every orthogonal transformation is nonsingular, and, in particular,
that its determinant must have the value $+1$ or $-1$.
2. Prove that all orthogonal transformations in $n$ variables form a group; and
that the same is true of all orthogonal transformations in $n$ variables of
determinant $+1$.
3. Prove that a necessary and sufficient condition that a linear
transformation be orthogonal is that it leave the "distance"
$$\sqrt{(y_1 - z_1)^2 + (y_2 - z_2)^2 + \cdots + (y_n - z_n)^2}$$
between every pair of points $(y_1, \cdots y_n)$, $(z_1, \cdots z_n)$ invariant.
4. Prove that if $n = 3$, and if $x_1, x_2, x_3$ be interpreted as nonhomogeneous
rectangular coordinates in space, an orthogonal transformation represents either a
rigid displacement which leaves the origin fixed, or such a displacement combined
with reflection in a plane through the origin.
Show that the first of these cases will occur when the determinant of the
transformation is $+1$, the second when this determinant is $-1$.
5. If the coefficients of a linear transformation are denoted in the usual way
by $c_{ij}$, prove that a necessary and sufficient condition that the transformation be
orthogonal is that
$$c_{1i}^2 + c_{2i}^2 + \cdots + c_{ni}^2 = 1 \quad (i = 1, 2, \cdots n),$$
$$c_{1i}c_{1j} + c_{2i}c_{2j} + \cdots + c_{ni}c_{nj} = 0 \quad (i, j = 1, 2, \cdots n;\ i \neq j).$$
Show that these will still be necessary and sufficient conditions for an orthogonal
transformation if the two subscripts of every $c$ be interchanged.†
* The matrix of such a transformation is called an orthogonal matrix, and its
determinant an orthogonal determinant.
† We have here $\frac{1}{2}n(n + 1)$ relations between the $n^2$ coefficients of the
transformation. This suggests that it should be possible to express all the coefficients in terms of
$$n^2 - \frac{n(n + 1)}{2} = \frac{n(n - 1)}{2}$$
of them, or if we prefer in terms of $\frac{1}{2}n(n - 1)$ other parameters. For Cayley's
discussion of this question cf. Pascal's book, Die Determinanten, § 47. Cayley's formulae,
however, do not include all orthogonal transformations except as limiting cases.
CHAPTER XII
THE SYSTEM OF A QUADRATIC FORM AND ONE OR MORE
LINEAR FORMS
53. Relations of Planes and Lines to a Quadric Surface. If the
plane
(1) $u_1x_1 + u_2x_2 + u_3x_3 + u_4x_4 = 0$
is a true tangent plane to the quadric surface
(2) $\sum_1^4 a_{ij}x_ix_j = 0,$
there will be a point $(y_1, y_2, y_3, y_4)$ (namely the point of contact)
lying in (1) and such that its polar plane
(3) $\sum_1^4 a_{ij}x_iy_j = 0$
coincides with (1). From elementary analytic geometry we know
that a necessary and sufficient condition that two equations of the
first degree represent the same plane is that their coefficients be
proportional. Accordingly, from the coincidence of (1) and (3), we
deduce the equations
(4)
$$a_{11}y_1 + a_{12}y_2 + a_{13}y_3 + a_{14}y_4 - \rho u_1 = 0,$$
$$a_{21}y_1 + a_{22}y_2 + a_{23}y_3 + a_{24}y_4 - \rho u_2 = 0,$$
$$a_{31}y_1 + a_{32}y_2 + a_{33}y_3 + a_{34}y_4 - \rho u_3 = 0,$$
$$a_{41}y_1 + a_{42}y_2 + a_{43}y_3 + a_{44}y_4 - \rho u_4 = 0.$$
From the fact that the point $y$ lies on (1), we infer the further
relation
(5) $u_1y_1 + u_2y_2 + u_3y_3 + u_4y_4 = 0.$
These equations (4) and (5) have been deduced on the
supposition that (1) is a true tangent plane to (2). They still hold if
it is a pseudotangent plane; for then the quadric must be a cone,
and a vertex of this cone must lie on (1). Taking the point $y$ as
this vertex, equation (5) is fulfilled. Moreover, since now the first
member of (3) is identically zero, equations (4) will also be fulfilled
if we let $\rho = 0$. Thus we have shown in all cases, that if (1) is a
tangent plane to (2), there exist five constants, $y_1, y_2, y_3, y_4, \rho$, of
which the first four are not all zero, and which satisfy equations (4)
and (5). Hence
(6) $\begin{vmatrix} a_{11} & a_{12} & a_{13} & a_{14} & u_1 \\ a_{21} & a_{22} & a_{23} & a_{24} & u_2 \\ a_{31} & a_{32} & a_{33} & a_{34} & u_3 \\ a_{41} & a_{42} & a_{43} & a_{44} & u_4 \\ u_1 & u_2 & u_3 & u_4 & 0 \end{vmatrix} = 0.$
Conversely, if this last equation is fulfilled, there exist five
constants, $y_1, y_2, y_3, y_4, \rho$, not all zero, and which satisfy equations
(4) and (5). We can go a step farther and say that $y_1, y_2, y_3, y_4$
cannot all be zero, as otherwise, from equations (4) and the fact that
the $u$'s are not all zero, $\rho$ would also be zero. Thus we see that if
equation (6) is fulfilled, there exists a point $(y_1, y_2, y_3, y_4)$ in the
plane (1) whose coordinates, together with a certain constant $\rho$,
satisfy (4). If $\rho = 0$, this shows that the quadric is a cone with $y$
as a vertex, and hence that (1) is at least a pseudotangent plane.
If $\rho \neq 0$, equations (4) show us that the polar plane (3) of $y$
coincides with the plane (1). Moreover we see, either geometrically, or
by multiplying equations (4) by $y_1, y_2, y_3, y_4$ respectively and
adding, that the point $y$ lies on the quadric; so that, in this case, (1) is
a true tangent plane.
We have thus established the theorem:
Theorem 1. Equation (6) is a necessary and sufficient condition
that the plane (1) be tangent to the quadric (2).
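Condition (6) lends itself to direct computation. The sketch below borders the matrix of a quadric with the coordinates of a plane and evaluates the resulting five-rowed determinant; the helper names and the choice of the sphere $x_1^2 + x_2^2 + x_3^2 - x_4^2 = 0$ with the planes $x_1 = x_4$ and $x_1 = 2x_4$ are illustrative assumptions, not examples from the text.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (small n only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def tangency_determinant(a, u):
    """The bordered determinant on the left of equation (6)."""
    bordered = [list(row) + [ui] for row, ui in zip(a, u)]
    bordered.append(list(u) + [0])
    return det(bordered)

# x1^2 + x2^2 + x3^2 - x4^2 = 0: the unit sphere in homogeneous coordinates
sphere = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]

touching = tangency_determinant(sphere, [1, 0, 0, -1])   # plane x1 - x4 = 0
missing = tangency_determinant(sphere, [1, 0, 0, -2])    # plane x1 - 2*x4 = 0
print(touching, missing)
```

The first determinant vanishes, as the plane $X = 1$ touches the unit sphere; the second does not, the plane $X = 2$ lying wholly outside it.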
It will be seen that this theorem gives us no means of distinguish
ing between true and pseudotangent planes of quadric cones. In
the case of nonsingular quadrics, pseudotangent planes are impos
sible, and therefore equation (6) may, in this case, be regarded as the
equation of the quadric in planecoordinates.
In the case of a quadric surface of rank 3, that is, of a cone with
a single vertex, the coordinates $(u_1, u_2, u_3, u_4)$ of every plane through
this vertex satisfy equation (6), so that in this case this equation
represents a single point, and not the quadric cone.*
* In fact a cone cannot be represented by a single equation in plane-coordinates.
If the rank of (2) is less than 3, the coordinates of every plane
in space should satisfy (6), since every such plane passes through a
vertex and is therefore a tangent plane. This fact may be verified
by noticing that equation (6) may also be written
$$\sum_{1}^{4} A_{ij}u_iu_j = 0,$$
where the $A$'s are the cofactors in the discriminant of (2) according
to our usual notation.
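The identity used here, that the bordered determinant of (6) equals $-\sum A_{ij}u_iu_j$, can be checked numerically. The following is a minimal sketch in Python and is not part of the original text; the helper names (`det`, `cofactor`, `bordered`, `adjoint_form`) and the choice of the unit sphere $x_1^2 + x_2^2 + x_3^2 - x_4^2 = 0$ as test quadric are assumptions made for illustration.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (exact for integers)."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def cofactor(a, i, j):
    """Cofactor A_ij of the element a_ij in the square matrix a."""
    minor = [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]
    return (-1) ** (i + j) * det(minor)


def bordered(a, u):
    """Border the matrix a with the column u, the row u, and a zero corner, as in (6)."""
    return [row + [u_i] for row, u_i in zip(a, u)] + [u + [0]]


def adjoint_form(a, u):
    """Value of the sum of A_ij * u_i * u_j over all i, j."""
    n = len(a)
    return sum(cofactor(a, i, j) * u[i] * u[j] for i in range(n) for j in range(n))


# Hypothetical test quadric: the unit sphere in homogeneous coordinates.
sphere = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]
tangent_plane = [1, 0, 0, -1]   # x1 - x4 = 0, touches the sphere
secant_plane = [1, 0, 0, 0]     # x1 = 0, cuts the sphere in a circle
outside_plane = [1, 0, 0, -2]   # x1 - 2 x4 = 0, misses the real sphere

for u in (tangent_plane, secant_plane, outside_plane):
    print(u, det(bordered(sphere, u)), -adjoint_form(sphere, u))
```

In each printed row the last two numbers agree, and they vanish only for the tangent plane, as Theorem 1 requires for a nonsingular quadric.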
We pass now to the condition that a straight line touch the
quadric (2). This line we will determine as the intersection of the
two planes (1) and
(7) $v_1x_1 + v_2x_2 + v_3x_3 + v_4x_4 = 0.$
If the line of intersection of these planes is a true tangent to (2),
there will be a point $(y_1, y_2, y_3, y_4)$, namely the point of contact, lying
upon it, and such that its polar plane (3) contains the line. It must therefore
be possible to write the equation of this polar plane in the form
(8) $$\sum_{1}^{4} (\mu u_i + \nu v_i)x_i = 0;$$
and, in fact, by properly choosing the constants $\mu$ and $\nu$, the coefficients
of (8) may be made not merely proportional, but equal to
the coefficients of (3):
(9)
$$\begin{cases} a_{11}y_1 + a_{12}y_2 + a_{13}y_3 + a_{14}y_4 - \mu u_1 - \nu v_1 = 0, \\ a_{21}y_1 + a_{22}y_2 + a_{23}y_3 + a_{24}y_4 - \mu u_2 - \nu v_2 = 0, \\ a_{31}y_1 + a_{32}y_2 + a_{33}y_3 + a_{34}y_4 - \mu u_3 - \nu v_3 = 0, \\ a_{41}y_1 + a_{42}y_2 + a_{43}y_3 + a_{44}y_4 - \mu u_4 - \nu v_4 = 0. \end{cases}$$
Since the point $y$ lies on the line of intersection of the planes (1)
and (7), we also have the relations
(10)
$$\begin{cases} u_1y_1 + u_2y_2 + u_3y_3 + u_4y_4 = 0, \\ v_1y_1 + v_2y_2 + v_3y_3 + v_4y_4 = 0. \end{cases}$$
Since the six equations (9) and (10) are satisfied by six constants
$y_1, y_2, y_3, y_4, \mu, \nu$ not all zero, we have finally the relation
(11)
$$\begin{vmatrix} a_{11} & a_{12} & a_{13} & a_{14} & u_1 & v_1 \\ a_{21} & a_{22} & a_{23} & a_{24} & u_2 & v_2 \\ a_{31} & a_{32} & a_{33} & a_{34} & u_3 & v_3 \\ a_{41} & a_{42} & a_{43} & a_{44} & u_4 & v_4 \\ u_1 & u_2 & u_3 & u_4 & 0 & 0 \\ v_1 & v_2 & v_3 & v_4 & 0 & 0 \end{vmatrix} = 0.$$
We have deduced this equation on the supposition that the line of
intersection of (1) and (7) is a true tangent to (2). We leave it to
the reader to show that (11) holds if this line is a pseudotangent,
and also if it is a ruling of (2).
We also leave it for him to show that if (11) holds, the line of
intersection of (1) and (7) will be either a true tangent, a pseudotangent,
or a ruling, and thus to establish the theorem:
Theorem 2. A necessary and sufficient condition that the line of
intersection of the planes (1) and (7) be either a tangent or a ruling
of (2) is that equation (11) be fulfilled.
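As an illustration of Theorem 2, the six-rowed determinant of (11) can be evaluated for a few lines and a simple quadric. This is only a sketch under stated assumptions: the function names `det` and `doubly_bordered`, the unit sphere as test quadric, and the three sample lines are hypothetical choices, not taken from the text.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (exact for integers)."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def doubly_bordered(a, u, v):
    """The determinant of (11): the discriminant bordered by the planes u and v."""
    rows = [row + [u_i, v_i] for row, u_i, v_i in zip(a, u, v)]
    return det(rows + [u + [0, 0], v + [0, 0]])


# Hypothetical test quadric: x1^2 + x2^2 + x3^2 - x4^2 = 0 (the unit sphere).
sphere = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]
plane_v = [0, 1, 0, 0]  # the plane x2 = 0, used as plane (7) throughout

tangent_line = doubly_bordered(sphere, [1, 0, 0, -1], plane_v)  # x1 = x4, x2 = 0
chord = doubly_bordered(sphere, [1, 0, 0, 0], plane_v)          # x1 = 0, x2 = 0
missing_line = doubly_bordered(sphere, [1, 0, 0, -2], plane_v)  # x1 = 2 x4, x2 = 0

print(tangent_line, chord, missing_line)
```

Only the first line touches the sphere, and only for it does the determinant vanish.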
On expanding the determinant in (11), it will be seen that it is
a quadratic form in the six line-coordinates $q_{ij}$ (cf. Exercise 3, § 35).
Equation (11) may therefore be regarded as the equation of the
quadric surface in line-coordinates if the surface is not a cone, or is
a cone with a single vertex. If the rank of (2) is 2, so that the
quadric consists of two planes, (11) is the equation of the line of
intersection of these planes. While if the rank is 1 or 0, (11) is
identically fulfilled.
EXERCISES
1. Two planes are said to be conjugate with regard to a nonsingular quadric
surface if each passes through the pole of the other.
Prove that if (2) is a nonsingular quadric, a necessary and sufficient condition
that the planes (1) and (7) be conjugate with regard to it is the vanishing
of the determinant
$$\begin{vmatrix} a_{11} & a_{12} & a_{13} & a_{14} & u_1 \\ a_{21} & a_{22} & a_{23} & a_{24} & u_2 \\ a_{31} & a_{32} & a_{33} & a_{34} & u_3 \\ a_{41} & a_{42} & a_{43} & a_{44} & u_4 \\ v_1 & v_2 & v_3 & v_4 & 0 \end{vmatrix} = -\sum_{1}^{4} A_{ij}u_iv_j.$$
How must this definition of conjugate planes be extended in order that this
theorem be true for singular quadrics also?
2. Prove that if (2) is a nonsingular quadric, a necessary and sufficient condition
that the point of intersection of three planes lie on (2) is the vanishing of
the seven-rowed determinant formed by bordering the discriminant of (2) with the
coefficients of the three planes.
3. Admitting it to be obvious geometrically that a necessary and sufficient condition
that a line touch a nonsingular quadric is that the two tangent planes which
can be passed through this line should coincide, prove that, if (2) is nonsingular,
a necessary and sufficient condition that the line of intersection of (1) and (7)
touch (2) is
$$\Big(\sum_{1}^{4} A_{ij}u_iu_j\Big)\Big(\sum_{1}^{4} A_{ij}v_iv_j\Big) - \Big(\sum_{1}^{4} A_{ij}u_iv_j\Big)^2 = 0.$$
4. Show algebraically that the condition of Exercise 3 is equivalent to (11).
54. The Adjoint Quadratic Form and Other Invariants. Passing
now to the case of n variables, we begin by considering the system
consisting of a quadratic form
(1) $$\sum_{1}^{n} a_{ij}x_ix_j$$
and a single linear form
(2) $$\sum_{1}^{n} u_ix_i.$$
The geometrical considerations of the last section suggest that we
form the expression
(3) $$\sum_{1}^{n} A_{ij}u_iu_j = -\begin{vmatrix} a_{11} & \cdots & a_{1n} & u_1 \\ \vdots & & \vdots & \vdots \\ a_{n1} & \cdots & a_{nn} & u_n \\ u_1 & \cdots & u_n & 0 \end{vmatrix}.$$
This, it will be seen, is a quadratic form in the variables $(u_1, \cdots u_n)$
whose matrix is the adjoint of the matrix of (1). We will speak of
(3) as the adjoint of (1).
The invariance of (3) is at once suggested by the fact that in the
case n = 4 the vanishing of (3) gave a necessary and sufficient condition
for a projective relation. In fact we will prove the theorem:
Theorem 1. The adjoint form (3) is an invariant of weight two of
the pair of forms (1), (2).
Inasmuch as the $u$'s are, as we saw in § 34, contragredient to the
$x$'s, we may also call (3) a contravariant (cf. Definition 2, § 34).
In order to prove this theorem we must subject the $x$'s to a linear
transformation,
(4) $$x_i = c_{i1}x_1' + \cdots + c_{in}x_n' \qquad (i = 1, 2, \cdots n),$$
whose determinant we will call $c$. Let us denote by $a_{ij}'$ and $u_i'$ respectively
the coefficients of the quadratic and linear form into which
this transformation carries (1) and (2).
Let us now introduce an auxiliary variable $t$, and consider the
quadratic form in $x_1, \cdots x_n, t$,
(5) $$\sum_{1}^{n} a_{ij}x_ix_j + 2t(u_1x_1 + \cdots + u_nx_n).$$
The discriminant of this form is precisely the determinant in
(3), that is, the negative of the adjoint of (1).
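Let us now perform on the variables $x_1, \cdots x_n, t$ the linear transformation
given by formulae (4) and the additional formula
(6) $t = t'.$
The determinant of this transformation is $c$, and it carries over the
form (5) into
$$\sum_{1}^{n} a_{ij}'x_i'x_j' + 2t'(u_1'x_1' + \cdots + u_n'x_n').$$
From the fact that the discriminant of (5) is an invariant of
weight 2, we infer the relation we wished to obtain:
$$\begin{vmatrix} a_{11}' & \cdots & a_{1n}' & u_1' \\ \vdots & & \vdots & \vdots \\ a_{n1}' & \cdots & a_{nn}' & u_n' \\ u_1' & \cdots & u_n' & 0 \end{vmatrix} = c^2 \begin{vmatrix} a_{11} & \cdots & a_{1n} & u_1 \\ \vdots & & \vdots & \vdots \\ a_{n1} & \cdots & a_{nn} & u_n \\ u_1 & \cdots & u_n & 0 \end{vmatrix}.$$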
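The relation just obtained lends itself to a direct numerical check. In the sketch below, which is an illustration and not part of the original argument, the matrices `a`, `u`, `c` are arbitrary hypothetical data, and the helper names are assumptions. It uses the fact that the substitution (4), written $x = cx'$, turns the matrix of (1) into $c^{T}ac$ and the coefficients of (2) into $c^{T}u$.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (exact for integers)."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(p, q):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*q)] for row in p]


def bordered(a, u):
    """Border the matrix a with the column u, the row u, and a zero corner."""
    return [row + [u_i] for row, u_i in zip(a, u)] + [u + [0]]


a = [[2, 1, 0], [1, 3, -1], [0, -1, 1]]   # hypothetical symmetric matrix of (1)
u = [1, -2, 3]                            # hypothetical coefficients of (2)
c = [[1, 2, 0], [0, 1, 1], [1, 0, 3]]     # hypothetical matrix (c_ij) of (4)

a_new = matmul(matmul(transpose(c), a), c)                          # a' = c^T a c
u_new = [sum(c[i][j] * u[i] for i in range(3)) for j in range(3)]   # u' = c^T u

lhs = det(bordered(a_new, u_new))
rhs = det(c) ** 2 * det(bordered(a, u))
print(lhs, rhs)
```

The two printed numbers coincide, which is the statement that the bordered determinant is an invariant of weight two.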
The method just used admits of immediate extension to the proof
of the following more general theorem :
Theorem 2. If we have a system consisting of a quadratic form in
n variables and p linear forms, the (n + p)-rowed determinant formed by
bordering the discriminant of the quadratic form by p rows and p
columns each of which consists of the coefficients of one of the linear
forms is an invariant of weight 2.
We leave the details of the proof of this theorem to the reader.
If the discriminant a of the quadratic form (1) is not zero, we may
form a new quadratic form whose matrix is the inverse of the matrix
of (1). This quadratic form, which is known as the inverse or
reciprocal of (1), is simply the adjoint of (1) divided by the discriminant
$a$. We will prove the following theorem concerning it:
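Theorem 3. If the quadratic form (1) is nonsingular, it will be
carried over into its inverse by the nonsingular transformation
(7) $$x_i' = a_{i1}x_1 + \cdots + a_{in}x_n \qquad (i = 1, 2, \cdots n).$$
For we have
$$\sum_{1}^{n} a_{ij}x_ix_j = \sum_{1}^{n} x_i'x_i.$$
But from (7) we have
$$x_i = \frac{A_{i1}}{a}x_1' + \cdots + \frac{A_{in}}{a}x_n' \qquad (i = 1, 2, \cdots n),$$
and therefore
$$\sum_{1}^{n} a_{ij}x_ix_j = \sum_{1}^{n} \frac{A_{ij}}{a}x_i'x_j',$$
as was to be proved.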
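A small worked instance of Theorem 3 may help fix ideas. The sketch below is an illustration only: the two-variable form with matrix `a` is a hypothetical example, and the function names are assumptions. It evaluates the original form at a point $x$ and the inverse form at the transformed point $x'$ given by (7), using exact rational arithmetic.

```python
from fractions import Fraction

a = [[2, 1], [1, 3]]                          # hypothetical nonsingular form (1)
disc = a[0][0] * a[1][1] - a[0][1] * a[1][0]  # its discriminant

# Matrix of the inverse form: the cofactors A_ij divided by the discriminant.
inverse = [[Fraction(a[1][1], disc), Fraction(-a[0][1], disc)],
           [Fraction(-a[1][0], disc), Fraction(a[0][0], disc)]]


def form(m, x):
    """Value of the quadratic form with matrix m at the point x."""
    return sum(m[i][j] * x[i] * x[j] for i in range(2) for j in range(2))


def transform(x):
    """Transformation (7): x'_i = a_i1 x_1 + a_i2 x_2."""
    return [sum(a[i][j] * x[j] for j in range(2)) for i in range(2)]


for x in ([1, 2], [3, -1], [0, 5]):
    print(x, form(a, x), form(inverse, transform(x)))
```

For every sample point the two values agree, as the theorem asserts.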
It will be noticed that if (1) is a real quadratic form, the transformation
(7) is real; and from this follows
Theorem 4. A real nonsingular quadratic form and its inverse
have the same signature.
EXERCISES
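1. Given a quadratic form $\sum a_{ij}x_ix_j$ and two linear forms $\sum u_ix_i$, $\sum v_ix_i$.
Prove that
$$\sum_{1}^{n} A_{ij}u_iv_j = -\begin{vmatrix} a_{11} & \cdots & a_{1n} & u_1 \\ \vdots & & \vdots & \vdots \\ a_{n1} & \cdots & a_{nn} & u_n \\ v_1 & \cdots & v_n & 0 \end{vmatrix}$$
is an invariant of the system of weight 2.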
2. Generalize the theorem of Exercise 1 to the case in which we have more
than two linear forms.
3. Prove that if a first quadratic form is transformed into a second by the
linear transformation of matrix c, then the adjoint of the first will be transformed
into the adjoint of the second by the linear transformation whose matrix is the
conjugate of the adjoint of c.
4. Prove a similar theorem for bilinear forms.
5. State and prove a theorem for bilinear forms analogous to Theorem 3.
55. The Rank of the Adjoint Form. Suppose the discriminant $a$
of the quadratic form $\sum_{1}^{n} a_{ij}x_ix_j$ is of rank $r$, and that the discriminant
$A$ of its adjoint $\sum_{1}^{n} A_{ij}u_iu_j$ is of rank $R$. Then, if $r < n - 1$, all
the $(n - 1)$-rowed determinants of $a$ are zero; but these are the elements
of $A$, hence $R = 0$. If $r = n - 1$, at least one of the elements
of $A$ is not zero, and all two-rowed determinants of $A$ are zero (since
by § 11 each of them contains $a$ as a factor), hence $R = 1$. If $r = n$,
$R = n$; for if $R$ were less than $n$ we should have $A = 0$, and therefore
$a = 0$ (since $A = a^{n-1}$). But this is impossible, since by hypothesis
$r = n$. We have then:
Theorem 1. If the rank of a quadratic form in n variables and of
its adjoint are $r$ and $R$ respectively, then
if $r = n$, $R = n$,
if $r = n - 1$, $R = 1$,
if $r < n - 1$, $R = 0$.
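The three cases of Theorem 1 can be observed on small examples. The following sketch is illustrative only; the three symmetric matrices are hypothetical test data chosen to have ranks 3, 2 and 1, and the helper names are assumptions. The rank is computed by exact Gaussian elimination over the rationals.

```python
from fractions import Fraction


def det(m):
    """Determinant by Laplace expansion along the first row (exact for integers)."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def adjoint(a):
    """Matrix of the cofactors A_ij of the square matrix a."""
    n = len(a)

    def minor(i, j):
        return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]

    return [[(-1) ** (i + j) * det(minor(i, j)) for j in range(n)] for i in range(n)]


def rank(m):
    """Rank by Gaussian elimination with exact rational arithmetic."""
    rows = [[Fraction(x) for x in row] for row in m]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][col] / rows[rk][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[rk])]
        rk += 1
    return rk


examples = [
    [[2, 1, 0], [1, 3, -1], [0, -1, 1]],   # r = n = 3
    [[1, 1, 0], [1, 1, 0], [0, 0, 1]],     # r = n - 1 = 2
    [[1, 1, 1], [1, 1, 1], [1, 1, 1]],     # r = 1 < n - 1
]
ranks = [(rank(a), rank(adjoint(a))) for a in examples]
print(ranks)
```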
Let us consider further the case $r = n - 1$. Here we have seen
that $R = 1$, that is, that the adjoint is the square of a linear form,
$$\sum_{1}^{n} A_{ij}u_iu_j \equiv \Big(\sum_{1}^{n} c_iu_i\Big)^2 \equiv \sum_{1}^{n} c_ic_ju_iu_j.$$
Comparing coefficients, we see that
$$A_{ij} = c_ic_j.$$
All the $c$'s cannot be zero, as otherwise we should have $R = 0$. Let
$c_i \neq 0$. Then since
$$A_{ii} = c_i^2 \neq 0,$$
we see that not all the quantities $(A_{i1}, \cdots A_{in})$ are zero. Accordingly
(cf. § 44) the point $(A_{i1}, A_{i2}, \cdots A_{in})$, and therefore also the
point $(c_1, \cdots c_n)$, is a vertex of the original quadratic form. Thus we
have the theorem:
Theorem 2. If the rank of a quadratic form in n variables is $n - 1$,
its adjoint is the square of a linear form, and the coefficients of this
linear form are the coordinates of a vertex of the original form.
Since, in the case we are considering, all the vertices of the
quadratic form are linearly dependent on any one, this theorem completely
determines the linear form in question except for a constant
factor.
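Theorem 2 can be illustrated on a form of rank $n - 1$. The sketch below is an assumption-laden example, not the author's computation: the matrix `a` (of rank 2 in three variables) and all helper names are hypothetical. It tests the relation $A_{ij}A_{kk} = A_{ki}A_{kj}$, which is what $A_{ij} = c_ic_j$ amounts to when $A_{kk} \neq 0$, and verifies that the row $(A_{k1}, \cdots A_{kn})$ is a vertex, that is, that it is annihilated by the matrix of the form.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (exact for integers)."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def adjoint(a):
    """Matrix of the cofactors A_ij of the square matrix a."""
    n = len(a)

    def minor(i, j):
        return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]

    return [[(-1) ** (i + j) * det(minor(i, j)) for j in range(n)] for i in range(n)]


a = [[1, 1, 0], [1, 1, 0], [0, 0, 1]]   # hypothetical form of rank n - 1 = 2
adj = adjoint(a)

# Pick a diagonal cofactor that is not zero; its row gives a vertex.
k = next(i for i in range(3) if adj[i][i] != 0)
vertex = adj[k]

# A_ij = c_i c_j is equivalent to A_ij * A_kk = A_ki * A_kj when A_kk != 0.
is_square = all(adj[i][j] * adj[k][k] == adj[k][i] * adj[k][j]
                for i in range(3) for j in range(3))

# A vertex satisfies sum_j a_ij y_j = 0 for every i.
image = [sum(a[i][j] * vertex[j] for j in range(3)) for i in range(3)]
print(adj, vertex, is_square, image)
```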
CHAPTER XIII
PAIRS OF QUADRATIC FORMS
56. Pairs of Conics. We will give in this section a short geometrical
introduction to the study of pairs of quadratic forms, confining
ourselves, for the sake of brevity, to two dimensions.
Let u and v be two conics which we will assume to be so situated
that they intersect in four, and only four, distinct points, A, B, C, D.
Consider all conics through these four points. These conics, we will
say, form a pencil. It is obvious that there are three and only three
singular conics (i.e. conics which consist of pairs of lines) in this
pencil, namely, the three pairs of lines AB, CD; BC, DA; AC, BD.
Let us call the "vertices" of these conics P, Q, and R respectively.
From the harmonic properties of the complete quadrilateral* we
see that the secants PAB and PCD are divided harmonically by the
line QR. Accordingly QR is the polar of P with regard to every
conic of the pencil. In a similar manner PR is the polar of Q, and
PQ the polar of R with regard to every conic of the pencil. Thus,
we see that the triangle PQR is a self-conjugate triangle (see § 41)
with regard to every conic of the pencil. Accordingly, if we perform
a collineation which carries over P, Q, R into the origin and
the points at infinity on the axes of x and y, the equation of every
conic of the pencil will be reduced to a form in which only the
square terms enter. We are thus led to the result:
* Cf. any book on modern geometry.
Theorem. If two conics intersect in four and only four distinct
points, there exists a nonsingular collineation which reduces their
equations to the normal form
$$\begin{cases} A_1x_1^2 + A_2x_2^2 + A_3x_3^2 = 0, \\ B_1x_1^2 + B_2x_2^2 + B_3x_3^2 = 0. \end{cases}$$
If we wish to carry through this reduction analytically, we shall
write the equations of the two conics u and v in the forms
(1) $$\sum_{1}^{3} a_{ij}x_ix_j = 0, \qquad \sum_{1}^{3} b_{ij}x_ix_j = 0.$$
The pencil of conics may then be written
(2) $$\sum_{1}^{3} (a_{ij} - \lambda b_{ij})x_ix_j = 0,$$
or rather, to be accurate, this equation will represent for different
values of $\lambda$ all the conics of the pencil except the conic v. The
singular conics of the pencil will be obtained by equating the
discriminant of (2) to zero,
(3) $$\begin{vmatrix} a_{11} - \lambda b_{11} & a_{12} - \lambda b_{12} & a_{13} - \lambda b_{13} \\ a_{21} - \lambda b_{21} & a_{22} - \lambda b_{22} & a_{23} - \lambda b_{23} \\ a_{31} - \lambda b_{31} & a_{32} - \lambda b_{32} & a_{33} - \lambda b_{33} \end{vmatrix} = 0.$$
This equation we will call the $\lambda$-equation of the two conics.
When expanded, it takes the form
(4) $$-\Delta'\lambda^3 + \Theta'\lambda^2 - \Theta\lambda + \Delta = 0,$$
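where $\Delta$, $\Delta'$ are the discriminants of u and v respectively, and
$$\Theta = \begin{vmatrix} b_{11} & a_{12} & a_{13} \\ b_{21} & a_{22} & a_{23} \\ b_{31} & a_{32} & a_{33} \end{vmatrix} + \begin{vmatrix} a_{11} & b_{12} & a_{13} \\ a_{21} & b_{22} & a_{23} \\ a_{31} & b_{32} & a_{33} \end{vmatrix} + \begin{vmatrix} a_{11} & a_{12} & b_{13} \\ a_{21} & a_{22} & b_{23} \\ a_{31} & a_{32} & b_{33} \end{vmatrix},$$
while $\Theta'$ can be obtained from $\Theta$ by an interchange of the letters a
and b. It can readily be proved (cf. the next section) that the coefficients
$\Theta$ and $\Theta'$ as well as $\Delta$ and $\Delta'$ are invariants of weight two.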
Except when the discriminant $\Delta'$ of v is zero, the equation (4)
is of the third degree, and its three roots, which in the case we
have considered must evidently be distinct, give, when substituted
in (2), the three singular conics of the pencil.
We will not stop here to show how the theory of any two
conics, where no restriction as to the number of points of intersection
is made, can be deduced from equation (3).* This will follow
in Chapter XXII as an application of the method of elementary
divisors. Our only object in this section has been to give a geometrical
basis for the appreciation of the following sections.
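57. Invariants of a Pair of Quadratic Forms. Their $\lambda$-Equation.
We consider the pair of quadratic forms
$$\phi \equiv \sum_{1}^{n} a_{ij}x_ix_j, \qquad \psi \equiv \sum_{1}^{n} b_{ij}x_ix_j,$$
and form from them the pencil of quadratic forms
$$\phi - \lambda\psi \equiv \sum_{1}^{n} (a_{ij} - \lambda b_{ij})x_ix_j.$$
The discriminant of this pencil,
$$F(\lambda) \equiv \begin{vmatrix} a_{11} - \lambda b_{11} & \cdots & a_{1n} - \lambda b_{1n} \\ \vdots & & \vdots \\ a_{n1} - \lambda b_{n1} & \cdots & a_{nn} - \lambda b_{nn} \end{vmatrix},$$
is a polynomial in $\lambda$ which is in general of degree n, and which may
be written
$$F(\lambda) \equiv \Theta_0 - \Theta_1\lambda + \cdots + (-1)^n\Theta_n\lambda^n.$$
* An elementary discussion of the $\lambda$-equation of two conics (l'équation en $\lambda$) is
regularly given in French textbooks on analytic geometry. See, for instance, Briot
et Bouquet, Leçons de Géométrie analytique, 14th ed., p. 349, or Niewenglowski, Cours
de Géométrie analytique, Vol. I, p. 459.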
The coefficients of this polynomial are themselves polynomials in
the $a_{ij}$'s and $b_{ij}$'s, $\Theta_0$ and $\Theta_n$ being merely the discriminants of $\phi$ and
$\psi$ respectively, while $\Theta_k$ is the sum of all the different determinants
which can be formed by replacing k columns of the discriminant of
$\phi$ by the corresponding columns of the discriminant of $\psi$.
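The description of the coefficients $\Theta_k$ by column replacement can be turned into a short computation and compared with the direct expansion of the discriminant of the pencil. The sketch below is an illustration under stated assumptions: the matrices `a` and `b` are hypothetical sample data, and `theta` and `pencil_discriminant` are names introduced here, not notation from the text.

```python
from itertools import combinations


def det(m):
    """Determinant by Laplace expansion along the first row (exact for integers)."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def theta(a, b, k):
    """Sum of all determinants obtained by replacing k columns of a by those of b."""
    n = len(a)
    total = 0
    for cols in combinations(range(n), k):
        mixed = [[b[i][j] if j in cols else a[i][j] for j in range(n)]
                 for i in range(n)]
        total += det(mixed)
    return total


def pencil_discriminant(a, b, lam):
    """F(lam): the determinant of the matrix a - lam * b."""
    n = len(a)
    return det([[a[i][j] - lam * b[i][j] for j in range(n)] for i in range(n)])


a = [[2, 1, 0], [1, 3, -1], [0, -1, 1]]   # hypothetical matrix of phi
b = [[1, 0, 1], [0, 2, 0], [1, 0, 3]]     # hypothetical matrix of psi
thetas = [theta(a, b, k) for k in range(4)]

for lam in range(-3, 4):
    expansion = sum((-1) ** k * thetas[k] * lam ** k for k in range(4))
    print(lam, pencil_discriminant(a, b, lam), expansion)
```

Agreement at more than n values of $\lambda$ confirms the identity between the two polynomials of degree n for this sample pair.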
Theorem 1. The coefficients $\Theta_0, \cdots \Theta_n$ of $F(\lambda)$ are integral rational
invariants of weight two of the pair of quadratic forms $\phi$, $\psi$.*
In order to prove this, let us consider a linear transformation of
determinant $c$ which carries over $\phi$ and $\psi$ into $\phi'$ and $\psi'$ respectively,
where
$$\phi' \equiv \sum_{1}^{n} a_{ij}'x_i'x_j', \qquad \psi' \equiv \sum_{1}^{n} b_{ij}'x_i'x_j'.$$
Let us denote by $\Theta_i'$ the polynomial in the $a_{ij}'$'s and $b_{ij}'$'s obtained by
putting accents to the a's and b's in $\Theta_i$. Our theorem will then be
proved if we can establish the identities
$$\Theta_i' \equiv c^2\Theta_i \qquad (i = 0, 1, \cdots n).$$
This follows at once from the fact that $F(\lambda)$, being the discriminant
of $\phi - \lambda\psi$, is an invariant of weight two, so that if we denote by
$F'(\lambda)$ the discriminant of $\phi' - \lambda\psi'$, we have
$$F'(\lambda) \equiv c^2F(\lambda).$$
This being an identity in $\lambda$ as well as in the a's and b's, we can
equate the coefficients of like powers of $\lambda$ on the two sides, and this
gives precisely the identities we wished to establish.†
* Cf. Exercise 13, § 90.
† The method by which we have here arrived at invariants of the system of two
quadratic forms will be seen to be of very general application. If we have an integral
rational invariant $I$ of weight $\mu$ of a single form of the kth degree in n variables, we can
find a large number of invariants of the system $\phi_1, \phi_2, \cdots \phi_p$ of p forms of the kth degree
in n variables by forming the invariant $I$ for the form $\lambda_1\phi_1 + \cdots + \lambda_p\phi_p$. This will be
a polynomial in the $\lambda$'s, each of whose coefficients is seen, precisely as above, to be an
integral rational invariant of the system of $\phi$'s of weight $\mu$.
The equation
$$F(\lambda) = 0$$
we will call the $\lambda$-equation of the pair of forms $\phi$, $\psi$. Since, as we
have seen, $F$ is merely multiplied by a constant different from zero
when $\phi$ and $\psi$ are subjected to a nonsingular linear transformation,
the roots of the $\lambda$-equation will not be changed by such a transformation.
These roots, however, are irrational functions of the $\Theta$'s and
hence of the a's and b's. We may therefore state the result:
Theorem 2. The roots of the $\lambda$-equation of a pair of quadratic
forms are absolute irrational invariants of this pair of forms with
regard to nonsingular linear transformations.
It is clear that the multiplicity of any root of the $\lambda$-equation
will not be changed by a nonsingular linear transformation. Hence
Theorem 3. The multiplicities of the roots of the $\lambda$-equation are
arithmetical invariants of the pair of quadratic forms with regard to
nonsingular linear transformations.
If
$$\phi \equiv a_1x_1^2 + \cdots + a_nx_n^2, \qquad \psi \equiv x_1^2 + \cdots + x_n^2,$$
the roots of the $\lambda$-equation are $a_1, \cdots a_n$. This example shows
that the absolute invariants of Theorem 2 may have any values,
and also that the arithmetical invariants of Theorem 3 are subject
to no other restriction than the obvious one of being positive integers
whose sum is n.
58. Reduction to Normal Form when the $\lambda$-Equation has no Multiple
Roots. Although our main concern in this section is with the
case in which the $\lambda$-equation has no multiple roots, we begin by establishing
a theorem which applies to a much more general case.
Theorem 1. If $\lambda_1$ is a simple root of the $\lambda$-equation of the pair
$\phi$, $\psi$ of quadratic forms in n variables, then, by a nonsingular linear
transformation, $\phi$ and $\psi$ can be reduced respectively to the forms
(1) $$\lambda_1c_1z_1^2 + \phi_1(z_2, \cdots z_n), \qquad c_1z_1^2 + \psi_1(z_2, \cdots z_n),$$
where $c_1$ is a constant not zero and $\phi_1$, $\psi_1$ are quadratic forms in the
$n - 1$ variables $z_2, \cdots z_n$.
To prove this, we will consider the pencil of forms
(2) $$\phi - \lambda\psi \equiv \phi - \lambda_1\psi + (\lambda_1 - \lambda)\psi.$$
Since $\lambda_1$ is a root of the $\lambda$-equation of the pair of forms $\phi$, $\psi$, the form
$\phi - \lambda_1\psi$ is singular, and can therefore, by a suitable nonsingular linear
transformation, be written in a form in which one of the variables, say
$x_1'$, does not enter,
$$\phi - \lambda_1\psi \equiv \phi'(x_2', \cdots x_n').$$
If this transformation reduces $\psi$ to $\psi'$, we have
(3) $$\phi - \lambda\psi \equiv \phi'(x_2', \cdots x_n') + (\lambda_1 - \lambda)\psi'(x_1', \cdots x_n').$$
The discriminant of the quadratic form which stands here on the
right cannot contain $\lambda_1 - \lambda$ as a factor more than once, since $\lambda_1$ is, by
hypothesis, not a multiple root of the $\lambda$-equation of $\phi$ and $\psi$. It
follows from this that the coefficient of $x_1'^2$ in the quadratic form $\psi'$
cannot be zero, for otherwise the discriminant of the right-hand side
of (3) would have a zero in the upper left-hand corner, and $\lambda_1 - \lambda$
would be a factor of all the elements of its first row and also of its
first column; so that it would contain the factor $(\lambda_1 - \lambda)^2$.
Since the coefficient of $x_1'^2$ in $\psi'$ is not zero, we can by Lagrange's
reduction (Formulae (2), (3), § 45) obtain a nonsingular linear transformation
of the form
$$\begin{cases} z_1 = \gamma_1x_1' + \gamma_2x_2' + \cdots + \gamma_nx_n', \\ z_2 = x_2', \\ \;\cdots \\ z_n = x_n', \end{cases}$$
which reduces $\psi'$ to the form
$$c_1z_1^2 + \psi_1(z_2, \cdots z_n).$$
This transformation carries over the second member of (3) into
$$\phi'(z_2, \cdots z_n) + (\lambda_1 - \lambda)\psi_1(z_2, \cdots z_n) + (\lambda_1 - \lambda)c_1z_1^2.$$
Combining these two linear transformations and writing
$$\phi'(z_2, \cdots z_n) + \lambda_1\psi_1(z_2, \cdots z_n) \equiv \phi_1(z_2, \cdots z_n),$$
we have thus obtained a nonsingular linear transformation which
effects the reduction
$$\phi - \lambda\psi \equiv \lambda_1c_1z_1^2 + \phi_1(z_2, \cdots z_n) - \lambda\,[c_1z_1^2 + \psi_1(z_2, \cdots z_n)].$$
If, here, we equate the coefficients of $\lambda$ on both sides, and the
terms independent of $\lambda$, we see that we have precisely the reduction
of the forms $\phi$, $\psi$ to the forms (1); and the theorem is proved.
Let us now assume that the form $\psi$ is nonsingular, thus insuring
that the $\lambda$-equation be of degree n. We will further assume
that the roots $\lambda_1, \lambda_2, \cdots \lambda_n$ of this equation are all distinct. We can
then, by the theorem just proved, reduce the forms $\phi$, $\psi$ to the forms
(1) by a nonsingular linear transformation. The $\lambda$-equation of these
two forms is seen to differ from the $\lambda$-equation of the pair of forms
in $(n - 1)$ variables $\phi_1$, $\psi_1$ only by the presence of the extra factor
$\lambda_1 - \lambda$. Accordingly the $\lambda$-equation of the pair of forms $\phi_1$, $\psi_1$ has
as its roots $\lambda_2, \cdots \lambda_n$, and these are all simple roots. We may therefore
apply the reduction of Theorem 1 to the two forms $\phi_1$, $\psi_1$ and
thus by a nonsingular linear transformation of $z_2, \cdots z_n$ reduce them
to the forms
$$\lambda_2c_2z_2'^2 + \phi_2(z_3', \cdots z_n'), \qquad c_2z_2'^2 + \psi_2(z_3', \cdots z_n').$$
This linear transformation may, by means of the additional formula
$$z_1' = z_1,$$
be regarded as a nonsingular linear transformation of $z_1, \cdots z_n$ which
carries over $\phi$, $\psi$ into the forms
$$\lambda_1c_1z_1'^2 + \lambda_2c_2z_2'^2 + \phi_2(z_3', \cdots z_n'), \qquad c_1z_1'^2 + c_2z_2'^2 + \psi_2(z_3', \cdots z_n').$$
Proceeding in this way, we establish the theorem:
Theorem 2. If $\phi$, $\psi$ are quadratic forms in $(x_1, \cdots x_n)$ of which the
second is nonsingular, and if the roots $\lambda_1, \cdots \lambda_n$ of their $\lambda$-equation are all
distinct, there exists a nonsingular linear transformation which carries
over $\phi$ and $\psi$ into
$$\lambda_1c_1x_1'^2 + \lambda_2c_2x_2'^2 + \cdots + \lambda_nc_nx_n'^2, \qquad c_1x_1'^2 + c_2x_2'^2 + \cdots + c_nx_n'^2$$
respectively, where $c_1, \cdots c_n$ are constants all different from zero.
Since none of the c's are zero, the linear transformation
$$x_i'' = \sqrt{c_i}\,x_i' \qquad (i = 1, 2, \cdots n)$$
is nonsingular. Performing this transformation, we get the further
result:
Theorem 3. Under the same conditions as in Theorem 2, $\phi$ and
$\psi$ may be reduced by means of a nonsingular linear transformation to
the normal forms
$$\lambda_1x_1^2 + \lambda_2x_2^2 + \cdots + \lambda_nx_n^2, \qquad x_1^2 + x_2^2 + \cdots + x_n^2.$$
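A two-variable instance of Theorem 2 can be worked out explicitly. The sketch below does not follow the step-by-step Lagrange reduction of the proof; as a plainly different but equivalent route for distinct roots, it takes for each root $\lambda_k$ a nonzero solution of the singular system belonging to $\phi - \lambda_k\psi$ and uses these solutions as the columns of the transformation. The matrices `a`, `b`, the listed roots, and the helper names are hypothetical choices for illustration.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(p, q):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*q)] for row in p]


def null_vector(m):
    """A nonzero vector annihilated by the singular, nonzero 2x2 matrix m."""
    (p, q), (r, s) = m
    return [-q, p] if (p, q) != (0, 0) else [-s, r]


a = [[2, 1], [1, 2]]   # hypothetical matrix of phi
b = [[2, 1], [1, 1]]   # hypothetical matrix of psi (nonsingular)
roots = [1, 3]         # roots of det(a - lam * b) = lam^2 - 4 lam + 3

columns = [null_vector([[a[i][j] - lam * b[i][j] for j in range(2)]
                        for i in range(2)])
           for lam in roots]
P = transpose(columns)                         # x = P x'

phi_new = matmul(matmul(transpose(P), a), P)   # matrix of phi in the new variables
psi_new = matmul(matmul(transpose(P), b), P)   # matrix of psi in the new variables
print(phi_new, psi_new)
```

Both new matrices are diagonal, with the diagonal of `phi_new` equal to $\lambda_kc_k$ and that of `psi_new` equal to $c_k$, in agreement with Theorem 2.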
From this we infer at once
Theorem 4. If in the two pairs of quadratic forms $\phi$, $\psi$ and $\phi'$,
$\psi'$ the forms $\psi$ and $\psi'$ are both nonsingular, and if the $\lambda$-equations of
these two pairs of forms have no multiple roots, a necessary and sufficient
condition for the equivalence of the two pairs of forms is that
these two $\lambda$-equations have the same roots; or, what amounts to the same
thing, that the invariants $\Theta_0, \Theta_1, \cdots \Theta_n$ of the first pair of forms be proportional
to the invariants $\Theta_0', \Theta_1', \cdots \Theta_n'$ of the second.
EXERCISE
Prove that, under the conditions of Theorem 3, the reduction to the normal
form can be performed in essentially only one way, the only possible variation
consisting in a change of sign of some of the x's in the normal form.
59. Reduction to Normal Form when $\psi$ is Definite and Nonsingular.
We now consider the case of two real quadratic forms
$\phi$, $\psi$ of which $\psi$ is definite and nonsingular. Our main problem is
to reduce this pair of forms to a normal form by means of a real
linear transformation. For this purpose we begin by proving
Theorem 1. The $\lambda$-equation of a pair of real quadratic forms
$\phi$, $\psi$ can have no imaginary root if the form $\psi$ is definite and nonsingular.
For, if possible, let $\alpha + \beta i$ ($\alpha$ and $\beta$ real) be an imaginary root of
this $\lambda$-equation, so that $\beta \neq 0$. Then $\phi - \alpha\psi - i\beta\psi$ will be a singular
quadratic form, and can therefore be reduced by a nonsingular
linear transformation
$$\begin{cases} x_1' = (p_{11} + iq_{11})x_1 + \cdots + (p_{1n} + iq_{1n})x_n, \\ \;\cdots \\ x_n' = (p_{n1} + iq_{n1})x_1 + \cdots + (p_{nn} + iq_{nn})x_n \end{cases}$$
to the sum of k squares, where $k < n$,
(1) $$\phi - \alpha\psi - i\beta\psi \equiv x_1'^2 + x_2'^2 + \cdots + x_k'^2.$$
Let
(2) $$y_l = p_{l1}x_1 + \cdots + p_{ln}x_n,$$
(3) $$z_l = q_{l1}x_1 + \cdots + q_{ln}x_n \qquad (l = 1, 2, \cdots n),$$
so that $x_l' = y_l + iz_l$.
By equating the coefficients of $i$ on the two sides of (1) we thus get
(4) $$-\beta\psi \equiv 2y_1z_1 + 2y_2z_2 + \cdots + 2y_kz_k.$$
Let us now determine $x_1, \cdots x_n$ so as to make the right-hand side
of (4) vanish, for instance by means of the equations
$$y_1 = y_2 = \cdots = y_k = 0.$$
A reference to (2) shows that we have here a system of k real
homogeneous linear equations in n unknowns, so that, k being less than n, real values of
$x_1, \cdots x_n$ not all zero can be found satisfying these equations. For
these values of the variables, we see from (4), since $\beta \neq 0$, that $\psi$ vanishes; but
this is impossible (cf. the Corollary of Theorem 3, § 52), since $\psi$ is by
hypothesis nonsingular and definite.
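For two variables Theorem 1 says that the quadratic $\lambda$-equation $\Theta_0 - \Theta_1\lambda + \Theta_2\lambda^2 = 0$ has a non-negative discriminant $\Theta_1^2 - 4\Theta_0\Theta_2$ whenever $\psi$ is definite. The sketch below searches a small grid of integer coefficients for a counterexample; it is only a finite spot check, not a proof, and the grid range and the function name are assumptions made for illustration. The final lines show that the conclusion can fail when $\psi$ is indefinite.

```python
from itertools import product


def lambda_discriminant(a, b):
    """Discriminant of the quadratic det(a - lam * b) for symmetric 2x2 a and b."""
    (a11, a12), (_, a22) = a
    (b11, b12), (_, b22) = b
    theta0 = a11 * a22 - a12 ** 2
    theta1 = a11 * b22 + a22 * b11 - 2 * a12 * b12
    theta2 = b11 * b22 - b12 ** 2
    return theta1 ** 2 - 4 * theta0 * theta2


values = range(-2, 3)
# Smallest discriminant over all grid pairs in which psi is positive definite.
worst = min(
    lambda_discriminant([[a11, a12], [a12, a22]], [[b11, b12], [b12, b22]])
    for a11, a12, a22, b11, b12, b22 in product(values, repeat=6)
    if b11 > 0 and b11 * b22 - b12 ** 2 > 0
)

# With psi indefinite the roots may be imaginary: here det(a - lam b) = -lam^2 - 1.
indefinite = lambda_discriminant([[0, 1], [1, 0]], [[1, 0], [0, -1]])
print(worst, indefinite)
```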
Theorem 2. If $\psi$ is a nonsingular definite quadratic form and $\phi$
is any real quadratic form, the pair of forms $\phi$, $\psi$ can be reduced by a
real nonsingular linear transformation to the normal form
(5) $$\begin{cases} \phi \equiv \pm(\lambda_1x_1'^2 + \cdots + \lambda_nx_n'^2), \\ \psi \equiv \pm(x_1'^2 + \cdots + x_n'^2), \end{cases}$$
where $\lambda_1, \cdots \lambda_n$ are the roots of the $\lambda$-equation, and the upper or lower
sign is to be used in both cases according as $\psi$ is a positive or a negative
form.
The proof of this theorem is very similar to the proof of
Theorem 2, § 58. We must first prove, as in Theorem 1, § 58,
that $\phi$, $\psi$ can be reduced by a real nonsingular linear transformation
to the forms
(6) $$\begin{cases} \lambda_1c_1z_1^2 + \phi_1(z_2, \cdots z_n), \\ c_1z_1^2 + \psi_1(z_2, \cdots z_n). \end{cases}$$
To prove this, we consider the pencil of forms
$$\phi - \lambda\psi \equiv \phi - \lambda_1\psi + (\lambda_1 - \lambda)\psi.$$
Since $\lambda_1$ is real by Theorem 1, $\phi - \lambda_1\psi$ is a real singular quadratic
form, and can therefore by a real nonsingular linear transformation
be reduced to a form in which one of the variables does
not enter,
$$\phi - \lambda_1\psi \equiv \phi'(x_2', \cdots x_n').$$
If this transformation reduces $\psi$ to $\psi'$, we have
(7) $$\phi - \lambda\psi \equiv \phi'(x_2', \cdots x_n') + (\lambda_1 - \lambda)\psi'(x_1', \cdots x_n').$$
At this point comes the essential difference between the case
we are now considering and the case considered in § 58, as $\lambda_1$ may
now be a multiple root of the discriminant of the right-hand side
of (7). We need, then, a different method for showing that the
coefficient of $x_1'^2$ in $\psi'$ is not zero. For this purpose it is sufficient
to notice that $\psi$, and therefore also $\psi'$, is a nonsingular definite
form, and that accordingly, by Theorem 4, § 52, the coefficient of
none of the square terms in $\psi'$ can be zero.
Having thus shown that the coefficient of $x_1'^2$ in $\psi'$ is not zero,
we can apply Lagrange's reduction to $\psi'$, and thus complete the
reduction of the forms $\phi$, $\psi$ to the forms (6) precisely as in the proof
of Theorem 1, § 58, noticing that the transformation we have to deal
with is real.
In (6), $\phi_1$, $\psi_1$ are real quadratic forms in the $n - 1$ variables
$z_2, \cdots z_n$. Moreover, since
$$\psi \equiv c_1z_1^2 + \psi_1(z_2, \cdots z_n)$$
is nonsingular and definite, it follows that the same is true of $\psi_1$.
For, if $\psi_1$ were either singular or indefinite, we could find values
of $z_2, \cdots z_n$ not all zero and such that $\psi_1 = 0$; and these values together
with the value $z_1 = 0$ would make $\psi = 0$. This, however, is
impossible by the Corollary of Theorem 3, § 52.
The $\lambda$-equation of the two forms $\phi_1$, $\psi_1$ evidently differs from
the $\lambda$-equation of $\phi$, $\psi$ only by the absence of the factor $\lambda - \lambda_1$. The
roots of the $\lambda$-equation of $\phi_1$, $\psi_1$ are therefore $\lambda_2, \cdots \lambda_n$, so that if we
reduce $\phi_1$ and $\psi_1$ by the method already used for $\phi$, $\psi$ (we have just
seen that $\phi_1$, $\psi_1$ satisfy all the conditions imposed on $\phi$, $\psi$), we get
$$\begin{cases} \phi_1(z_2, \cdots z_n) \equiv \lambda_2c_2z_2'^2 + \phi_2(z_3', \cdots z_n'), \\ \psi_1(z_2, \cdots z_n) \equiv c_2z_2'^2 + \psi_2(z_3', \cdots z_n'). \end{cases}$$
Proceeding in this way, we finally reduce $\phi$, $\psi$ by a real nonsingular
linear transformation to the forms
(8) $$\begin{cases} \phi \equiv \lambda_1c_1y_1^2 + \cdots + \lambda_nc_ny_n^2, \\ \psi \equiv c_1y_1^2 + \cdots + c_ny_n^2. \end{cases}$$
Since $\psi$ is definite, the constants $c_1, \cdots c_n$ are all positive or all negative
according as $\psi$ is a positive or a negative form. By means of
the further nonsingular real linear transformation
$$x_i' = \sqrt{|c_i|}\,y_i \qquad (i = 1, 2, \cdots n),$$
the forms (8) may be reduced to the forms (5), and our theorem is
proved.
EXERCISES
1. If $\phi$ is a real quadratic form in n variables of rank r, prove that it can be
reduced by a real orthogonal transformation in n variables to the form
$$c_1x_1'^2 + \cdots + c_rx_r'^2.$$
Cf. Exercises, § 52.
2. Show that the determinant of the orthogonal transformation of Exercise 1
may be taken at pleasure as + 1 or − 1.
3. Discuss the metrical classification of real quadric surfaces along the
following lines:
Assume the equation in nonhomogeneous rectangular coordinates, and show
that by a transformation to another system of rectangular coordinates having the
same origin the equation can be reduced to a form where the terms of the second
degree have one or the other of the five forms (the A's being positive constants)
$A_1x_1^2 + A_2x_2^2 + A_3x_3^2$,
$A_1x_1^2 + A_2x_2^2 - A_3x_3^2$,
$A_1x_1^2 + A_2x_2^2$,
$A_1x_1^2 - A_2x_2^2$,
$A_1x_1^2$.
Then simplify each of the nonhomogeneous equations thus obtained by further
transformations of coordinates; thus getting finally the standard forms of the
equations of ellipsoids, hyperboloids, paraboloids, cones, cylinders, and planes
which are discussed in all elementary textbooks of solid analytic geometry.
CHAPTER XIV
SOME PROPERTIES OF POLYNOMIALS IN GENERAL
60. Factors and Reducibility. In the present section we will
introduce certain conceptions of fundamental importance in our
subsequent work.
Definition 1. By a factor or divisor of a polynomial $f$ in any
number of variables is understood a polynomial $\phi$ which satisfies an
identity of the form
$$f \equiv \phi\psi,$$
$\psi$ being also a polynomial.
It will be noticed that every constant different from zero is a
factor of every polynomial ; that every polynomial is a factor of a
polynomial which vanishes identically; while a polynomial which
is a mere constant, different from zero, has no factors other than
constants.
We note also that a polynomial in $x_1, \cdots x_n$ which is not identically
zero cannot have as a factor a polynomial which actually contains
any other variables.
The conception of reducibility, which we have already met in
a special case (§ 47), we define as follows:
Definition 2. A polynomial is said to be reducible if it is identically
equal to the product of two polynomials neither of which is a
constant.
In dealing with real polynomials, a narrower determination of
the conception of reducibility is usually desirable. We consider,
then, what we will call reducibility in the domain of reals, a conception
which we define as follows:
Definition 3. A real polynomial is said to be reducible in the
domain of reals if it is identically equal to the product of two other
real polynomials neither of which is a constant.
In many branches of algebra still another modification of the
conception of reducibility plays an important part. In order to
explain this, we first lay down the following definition:
Definition 4. A set of numbers is said to form a domain of
rationality if, when $a$ and $b$ are any numbers of the set, $a + b$,
$a - b$, $ab$, and, so far as $b \neq 0$, $a/b$ are also numbers of the set.
Thus all numbers, real and imaginary, form a domain of rationality,
and the same is true of all real numbers. The simplest of all
domains of rationality, apart from the one which contains only the
single number zero, is what is known as the natural domain, that is all
rational numbers, positive and negative. A more complicated domain
of rationality would be the one consisting of all numbers of the form
$a + b\sqrt{-1}$, where $a$ and $b$ are not merely real, but rational. These
illustrations, which might be multiplied indefinitely, should suffice to
make the scope of the above definition clear.*
Definition 5. A polynomial all of whose coefficients lie in a
domain of rationality $R$ is said to be reducible in this domain if it is
identically equal to a product of two polynomials, neither of which is a
constant, whose coefficients also lie in this domain.
It will be noticed that Definitions 2 and 3 are merely the special
cases of this definition in which the domain of rationality is the
domain of all numbers, and the domain of all reals respectively. To
illustrate these three definitions, we note that the polynomial $x^2 + 1$
is reducible according to Definition 2, since it is identically equal to
$(x + \sqrt{-1})(x - \sqrt{-1})$. It is, however, not reducible in the domain
of reals, nor in the natural domain. On the other hand, $x^2 - 2$ is
reducible in the domain of reals, but not in the natural domain.
Finally, $x^2 - 4$ is reducible in the natural domain.
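The three illustrations can be reproduced by a small test for monic quadratics $x^2 + px + q$ with integral coefficients. The sketch below rests on facts that are assumed here and not taken from the text: such a quadratic always factors in the domain of all numbers, factors in the domain of reals exactly when $p^2 - 4q \geq 0$, and factors in the natural domain exactly when $p^2 - 4q$ is a perfect square. The function name and the dictionary keys are hypothetical.

```python
from math import isqrt


def reducibility(p, q):
    """Domains in which x^2 + p x + q (p, q integers) is reducible."""
    disc = p * p - 4 * q
    over_reals = disc >= 0
    # Rational roots occur exactly when the discriminant is a perfect square.
    over_rationals = over_reals and isqrt(disc) ** 2 == disc
    return {
        "all numbers": True,
        "reals": over_reals,
        "natural domain": over_rationals,
    }


print(reducibility(0, 1))    # x^2 + 1
print(reducibility(0, -2))   # x^2 - 2
print(reducibility(0, -4))   # x^2 - 4
```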
Leaving these modifications of the conception of reducibility, we
close this section with the following two definitions:
Definition 6. Two polynomials are said to be relatively prime if
they have no common factor other than a constant.
* By $R(a_1, a_2, \cdots a_n)$ is understood the domain of rationality consisting of all
numbers which can be obtained from the given numbers $a_1, \cdots a_n$ by the rational processes
(addition, subtraction, multiplication, and division). In this notation the natural
domain would be most simply denoted by $R(1)$; the domain last mentioned in the
text by $R(1, \sqrt{-1})$ or, even more simply, by $R(\sqrt{-1})$. This notation would not apply
to all cases (e.g. the real domain) except by the use of an infinite number of arguments.
Definition 7. Two methods of factoring a polynomial shall be
said to be not essentially different if there are the same number n of
factors in each case, and these factors can be so arranged that the kth
factors are proportional for all values of k, from 1 to n inclusive.
EXERCISES
1. Prove that every polynomial in $(x, y)$ is irreducible if it is of the form
$f(x) + y$,
where $f(x)$ is a polynomial in $x$ alone.
Would this also be true for polynomials of the form
2. If $f$, $\phi$, $\psi$ are polynomials in any number of variables which satisfy the
relation $f \equiv \phi\psi$,
and if the coefficients of $f$ and $\phi$ lie in a certain domain of rationality, prove that
the coefficients of $\psi$ will lie in the same domain provided $\phi \not\equiv 0$.
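61. The Irreducibility of the General Determinant and of the Symmetrical
Determinant.
Theorem 1. The determinant
$$D = \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix}$$
is an irreducible polynomial if its $n^2$ elements are regarded as independent
variables.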
For suppose it were reducible, and let
$D \equiv f\phi$,
where neither $f$ nor $\phi$ is a constant. Expanding $D$ according to the
elements of the first row, we see that it is of the first degree in $a_{11}$.
Hence one of the two polynomials $f$ and $\phi$ must be of the first
degree in $a_{11}$, the other of the zeroth degree. Precisely the same
reasoning shows that if $a_{ij}$ is any element of $D$, one of the polynomials
$f$ and $\phi$ will be of the first degree in $a_{ij}$, while the other will
not involve this variable.
Let us denote by $f$ that one of the two polynomials which involves
$a_{ii}$, any element of the principal diagonal of $D$. Then $\phi$ does not
involve any element of the ith row or the ith column. For if it did,
since $f$ is of the first degree in $a_{ii}$ and $\phi$ is of the zeroth, their product
$D$ would involve terms containing products of the form $a_{ii}a_{ij}$ or
$a_{ii}a_{ji}$, which, from the definition of a determinant, is impossible.
Consequently, if either one of the polynomials $f$ and $\phi$ contains any
element of the principal diagonal of $D$, it must contain all the elements
standing in the same row and all those standing in the same
column with this one, and none of these can occur in the other
polynomial.
Now suppose $f$ contains $a_{ii}$ and that $\phi$ contains any other element
of the principal diagonal, say $a_{jj}$. Then $a_{ij}$ and $a_{ji}$ can be in neither $f$
nor $\phi$, which is impossible. Hence, if $f$ contains any one of the elements
in the principal diagonal, it must contain all the others, and
hence all the elements, and our theorem is proved.
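Theorem 2. The symmetrical determinant
$$D = \begin{vmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{vmatrix} \qquad (a_{ij} = a_{ji})$$
is an irreducible polynomial if its $\tfrac{1}{2}n(n + 1)$ elements be regarded as
independent variables.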
The proof given for the last theorem holds, almost word for word,
in this case also, the only difference being that while D is of the first
degree in each of the elements of its principal diagonal, it is of the
second degree in each of the other elements. The slight changes in
the proof made necessary by this difference are left to the reader.
EXERCISES
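1. The general bordered determinant
$$\begin{vmatrix} a_{11} & \cdots & a_{1n} & u_{11} & \cdots & u_{1p} \\ \vdots & & \vdots & \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} & u_{n1} & \cdots & u_{np} \\ v_{11} & \cdots & v_{1n} & 0 & \cdots & 0 \\ \vdots & & \vdots & \vdots & & \vdots \\ v_{p1} & \cdots & v_{pn} & 0 & \cdots & 0 \end{vmatrix}$$
is irreducible if $p < n$, the $a$'s, $u$'s, and $v$'s being regarded as independent variables.
2. The symmetrical bordered determinant obtained from the determinant in
Exercise 1 by letting $a_{ij} = a_{ji}$, $u_{ij} = v_{ji}$ is irreducible if $p < n$.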
3. If for certain values of $i$ and $j$, but not for all, $a_{ij} = a_{ji}$, but if the $a$'s are
otherwise independent, can we still say that
$$\begin{vmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{vmatrix}$$
is irreducible?
4. Prove that a skew-symmetric determinant (cf. Exercises 2, 3, § 20) is always
reducible by showing that, when it is of even order, it is a perfect square.
[Suggestion. Use Corollary 3, § 11, and Theorem 6 and Exercise 1, § 76.]
Does this theorem require any modification if the elements are real and we
consider reducibility in the domain of reals?
62. Corresponding Homogeneous and Non-Homogeneous Polynomials.
It is often convenient to consider side by side two polynomials,
one homogeneous and the other nonhomogeneous, which
bear to one another the same relation as the first members of the
equations of a plane curve or of a surface in homogeneous and
nonhomogeneous coordinates respectively. Such polynomials we will
speak of as corresponding to one another according to the following
definition:
Definition. If we have a nonhomogeneous polynomial of the kth
degree in any number of variables $(x_1, \dots x_{n-1})$ and form a new
polynomial by multiplying each term of the old by the power of a new
variable $x_n$ necessary to bring up the degree of this term to $k$, the
homogeneous polynomial thus formed shall be said to correspond to the given
nonhomogeneous polynomial.
Thus the two polynomials
(1) $2x^3 + 3x^2y - 5xz^2 - yz + 2z^2 + x - 3y - 9$,
(2) $2x^3 + 3x^2y - 5xz^2 - yzt + 2z^2t + xt^2 - 3yt^2 - 9t^3$,
correspond to each other.
It may be noticed that if $\phi(x_1, \dots x_{n-1})$ is the nonhomogeneous
polynomial of degree $k$, the corresponding homogeneous polynomial
may be written
$x_n^k\,\phi\!\left(\dfrac{x_1}{x_n}, \dfrac{x_2}{x_n}, \dots \dfrac{x_{n-1}}{x_n}\right)$.
To every nonhomogeneous polynomial there corresponds one, and
only one, homogeneous polynomial. Conversely, however, to a
homogeneous polynomial in n variables there correspond in general
$n$ different nonhomogeneous polynomials which are obtained by setting
one of the variables equal to 1. For instance, in the example
given above, to (2) corresponds not only (1) but also
(3) $2 + 3y - 5z^2 - yzt + 2z^2t + t^2 - 3yt^2 - 9t^3$,
(4) $2x^3 + 3x^2 - 5xz^2 - zt + 2z^2t + xt^2 - 3t^2 - 9t^3$,
(5) $2x^3 + 3x^2y - 5x - yt + 2t + xt^2 - 3yt^2 - 9t^3$.
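The passage between (1) and (2), and back again to (3), (4), (5), can be sketched in a few lines. The representation used here (a dictionary from exponent tuples to coefficients) and the names `homogenize`, `dehomogenize`, `P1`, `P2` are assumptions of this illustration only.

```python
def homogenize(poly):
    """poly maps exponent tuples (e_1, ..., e_{n-1}) to coefficients.
    Each term is multiplied by the power of a new last variable that
    brings its degree up to the degree k of the polynomial."""
    k = max(sum(e) for e in poly)
    return {e + (k - sum(e),): c for e, c in poly.items()}

def dehomogenize(form, i):
    """Set the variable with (0-based) index i equal to 1 and drop it."""
    out = {}
    for e, c in form.items():
        key = e[:i] + e[i + 1:]
        out[key] = out.get(key, 0) + c
    return {e: c for e, c in out.items() if c != 0}

# Polynomial (1) in the variables (x, y, z).
P1 = {(3, 0, 0): 2, (2, 1, 0): 3, (1, 0, 2): -5, (0, 1, 1): -1,
      (0, 0, 2): 2, (1, 0, 0): 1, (0, 1, 0): -3, (0, 0, 0): -9}

P2 = homogenize(P1)            # polynomial (2) in (x, y, z, t)
P3 = dehomogenize(P2, 0)       # polynomial (3): x set equal to 1
print(P2)
print(P3)
```

Setting the last variable equal to 1 recovers (1), which is the sense in which the two polynomials correspond.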
It should be noticed, however, that if one of the variables enters into
every term of a homogeneous polynomial, the result of setting this
variable equal to unity is to give, not a corresponding nonhomogeneous
polynomial, but a polynomial of lower degree. In fact, in
the extreme case in which every variable enters into every term of
the homogeneous polynomial, there is no corresponding nonhomogeneous
polynomial; as, for instance, in the case of the polynomial
$x^2yz + xy^2z + xyz^2$.
Theorem 1. If one of two corresponding polynomials is reducible,
then the other is, also, and the factors of each polynomial correspond to
the factors of the other.
For let $\phi(x_1, \dots x_n)$ be a homogeneous polynomial of degree
$(k + l)$, and suppose it can be factored into two factors of degrees
$k$ and $l$, respectively,
(6) $\phi(x_1, \dots x_n) \equiv \psi(x_1, \dots x_n)\,\chi(x_1, \dots x_n)$.
Now suppose the corresponding nonhomogeneous polynomial in
question is the one formed by setting $x_n = 1$. We have
(7) $\phi(x_1, \dots x_{n-1}, 1) \equiv \psi(x_1, \dots x_{n-1}, 1)\,\chi(x_1, \dots x_{n-1}, 1)$.
Since by hypothesis the degree of the polynomial on the left is
unchanged by this operation, neither of the factors on the right-hand
side of (6) can have had its degree reduced, hence neither
of the factors on the right of (7) is a constant. Our nonhomogeneous
polynomial is therefore reducible; and moreover the two
factors on the right of (7), being of degrees $k$ and $l$ respectively,
are precisely the two functions corresponding to the two factors on
the right of (6).
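Now, let $\Phi_{k+l}(x_1, \dots x_{n-1})$ be a nonhomogeneous polynomial, and
$\Phi_{k+l}(x_1, \dots x_{n-1}) \equiv \Psi_k(x_1, \dots x_{n-1})\,X_l(x_1, \dots x_{n-1})$,
where the subscripts denote the degrees of the polynomials. Let
$\phi_{k+l}$, $\psi_k$, $\chi_l$ be the homogeneous polynomials corresponding to
$\Phi$, $\Psi$, $X$. Then when $x_n \neq 0$,
$\Phi_{k+l}\!\left(\dfrac{x_1}{x_n}, \dots \dfrac{x_{n-1}}{x_n}\right) = \Psi_k\!\left(\dfrac{x_1}{x_n}, \dots \dfrac{x_{n-1}}{x_n}\right) X_l\!\left(\dfrac{x_1}{x_n}, \dots \dfrac{x_{n-1}}{x_n}\right)$.
Multiplying this equation by $x_n^{k+l}$ we have
$\phi_{k+l}(x_1, \dots x_n) = \psi_k(x_1, \dots x_n)\,\chi_l(x_1, \dots x_n)$,
an equation which holds whenever $x_n \neq 0$, and, therefore, by Theorem
5, § 2, is an identity. Thus our theorem is proved.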
As a simple illustration of the way in which this theorem may be
applied we mention the condition for reducibility of a nonhomogeneous
quadratic polynomial in any number of variables. By
applying the test of § 47 to the corresponding homogeneous polynomial
we obtain at once a test for the reducibility of any
nonhomogeneous quadratic polynomial.
Theorem 2. If $f$ and $\phi$ are nonhomogeneous polynomials, and
$F$, $\Phi$ are the corresponding homogeneous polynomials, a necessary and
sufficient condition that $F$ and $\Phi$ be relatively prime is that $f$ and
$\phi$ be relatively prime.
For if $f$ and $\phi$ have a common factor $\psi$ which is not a constant,
the homogeneous polynomial $\Psi$ which corresponds to $\psi$ is, by
Theorem 1, a common factor of $F$ and $\Phi$, and is clearly not a constant.
Conversely, if $\Psi$ is a common factor of $F$ and $\Phi$ which is
not a constant, $f$ and $\phi$ will have, by Theorem 1, a common factor
which corresponds to $\Psi$ and which therefore cannot be a mere
constant.
63. Division of Polynomials. We will consider first two polynomials
in one variable:
(1) $f(x) \equiv a_0x^n + a_1x^{n-1} + \cdots + a_n$,
    $\phi(x) \equiv b_0x^m + b_1x^{m-1} + \cdots + b_m$.
We learn in elementary algebra how to divide $f$ by $\phi$, getting a
quotient $Q(x)$ and a remainder $R(x)$. What is essential here is
contained in the following theorem:
Theorem 1. If $f$ and $\phi$ are two polynomials in $x$ of which $\phi$ is
not identically zero, there exists one, and only one, pair of polynomials,
$Q$ and $R$, which satisfy the identity
(2) $f(x) \equiv Q(x)\phi(x) + R(x)$,
and such that either $R \equiv 0$,* or the degree of $R$ is less than the degree of $\phi$.
We begin by proving that at least one pair of polynomials $Q$, $R$
exists which satisfies the conditions of the theorem.
If $f$ is of lower degree than $\phi$ (or if $f \equiv 0$), the truth of this statement
is obvious, for we may then let $Q \equiv 0$, $R \equiv f$.
Suppose, then, that $f$ is of at least as high degree as $\phi$. Writing
$f$ and $\phi$ in the form (1), we may assume
$a_0 \neq 0, \quad b_0 \neq 0, \quad n \geq m$.
Lemma. If $\phi$ is not of higher degree than $f$, there exist two polynomials
$Q_1$ and $R_1$ which satisfy the identity
$f(x) \equiv Q_1(x)\phi(x) + R_1(x)$,
and such that either $R_1 \equiv 0$, or the degree of $R_1$ is less than the degree of $f$.
The truth of this lemma is obvious if we let
$Q_1 \equiv \dfrac{a_0}{b_0}x^{n-m}$.
These two polynomials $Q_1$ and $R_1$ will serve as the polynomials $Q$
and $R$ of our theorem if $R_1 \equiv 0$, or if the degree of $R_1$ is less than the
degree of $\phi$. If not, apply the lemma again to the two functions $R_1$
and $\phi$, getting
$R_1(x) \equiv Q_2(x)\phi(x) + R_2(x)$,
where $R_2$ is either identically zero or is of lower degree than $R_1$. We
may then write
$f(x) \equiv [Q_1(x) + Q_2(x)]\phi(x) + R_2(x)$.
If $R_2 \equiv 0$, or if the degree of $R_2$ is less than the degree of $\phi$, we may
take for the polynomials $Q$, $R$ of our theorem, $Q_1 + Q_2$ and $R_2$. If
not, we apply our lemma again to $R_2$ and $\phi$. Proceeding in this way
* It will be remembered that, according to the definition we have adopted, a
polynomial which vanishes identically has no degree.
we get a series of polynomials $R_1, R_2, \dots$ whose degrees are constantly
decreasing. We therefore, after a certain number of steps,
reach a polynomial $R_i$ which is either identically zero or of degree
less than $\phi$. Combining the identities obtained up to this point, we
have
$f(x) \equiv [Q_1(x) + \cdots + Q_i(x)]\phi(x) + R_i(x)$,
an identity which proves the part of our theorem which states that
at least one pair of polynomials $Q$, $R$ of the kind described exists.*
Suppose now that besides the polynomials $Q$, $R$ of the theorem
there existed a second pair of polynomials $Q'$, $R'$ satisfying the same
conditions. Subtracting from (2) the similar identity involving
$Q'$, $R'$, we have
(3) $0 \equiv (Q - Q')\phi + (R - R')$.
From this we infer, as was to be proved,
$Q \equiv Q', \quad R \equiv R'$.
For if $Q$ and $Q'$ were not identical, the first term on the right of (3)
would be of at least the mth degree, while the second involves no
power of $x$ as high as $m$.
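The existence proof is the ordinary process of long division, and may be sketched as follows. This is an illustration under stated assumptions: polynomials are lists of rational coefficients with the highest power first, the zero polynomial is the empty list, and the names `strip` and `divide` are hypothetical. Each pass of the loop is one application of the lemma with the multiplier $\frac{a_0}{b_0}x^{n-m}$.

```python
from fractions import Fraction

def strip(p):
    """Drop leading zero coefficients; the zero polynomial becomes []."""
    i = 0
    while i < len(p) and p[i] == 0:
        i += 1
    return p[i:]

def divide(f, phi):
    """Return (Q, R) with f = Q*phi + R and either R = [] or deg R < deg phi."""
    f = strip([Fraction(c) for c in f])
    phi = strip([Fraction(c) for c in phi])
    if not phi:
        raise ZeroDivisionError("phi must not be identically zero")
    q = [Fraction(0)] * max(len(f) - len(phi) + 1, 0)
    r = f
    while r and len(r) >= len(phi):
        shift = len(r) - len(phi)      # n - m for the current remainder
        coeff = r[0] / phi[0]          # a_0 / b_0, as in the lemma
        q[len(q) - 1 - shift] = coeff  # coefficient of x^shift in Q
        # Subtract coeff * x^shift * phi; the leading term cancels.
        padded = phi + [Fraction(0)] * shift
        r = strip([rc - coeff * pc for rc, pc in zip(r, padded)])
    return q, r

# x^3 - 2x + 1 divided by x - 1, and x^3 + 1 divided by 2x^2 + 1.
print(divide([1, 0, -2, 1], [1, -1]))
print(divide([1, 0, 0, 1], [2, 0, 1]))
```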
Turning now to polynomials in several variables:
(4) $f(x_1, \dots x_k) \equiv a_0(x_2, \dots x_k)x_1^n + a_1(x_2, \dots x_k)x_1^{n-1} + \cdots + a_n(x_2, \dots x_k)$,
    $\phi(x_1, \dots x_k) \equiv b_0(x_2, \dots x_k)x_1^m + b_1(x_2, \dots x_k)x_1^{m-1} + \cdots + b_m(x_2, \dots x_k)$,
the ordinary method of dividing $f$ by $\phi$ would give us as quotient and
remainder, not polynomials, but fractional rational functions. In
order to avoid this, we state our theorem in the following form:
Theorem 2. If $f$ and $\phi$ are polynomials in $(x_1, \dots x_k)$ of which $\phi$
is not identically zero, there exist polynomials $Q$, $R$, $P$, of which the
last is not identically zero and does not involve the variable $x_1$, which
satisfy the identity,
(5) $P(x_2, \dots x_k)f(x_1, \dots x_k) \equiv Q(x_1, \dots x_k)\phi(x_1, \dots x_k) + R(x_1, \dots x_k)$,
and such that either $R \equiv 0$, or the degree in $x_1$ of $R$ is less than the degree
in $x_1$ of $\phi$.
The proof of this theorem follows the same lines as the proof
of Theorem 1.
* The reader should notice that the process just used is merely the ordinary process
of long division.
If $f$ is of lower degree in $x_1$ than $\phi$ (or if $f \equiv 0$), the truth of the
theorem is obvious, for we may then let $P \equiv 1$, $Q \equiv 0$, $R \equiv f$.
Suppose, then, that $f$ is of at least as high degree in $x_1$ as $\phi$.
Writing $f$ and $\phi$ in the form (4), we may assume
$a_0 \not\equiv 0, \quad b_0 \not\equiv 0, \quad n \geq m$.
Lemma. If $\phi$ is not of higher degree in $x_1$ than $f$, there exist two
polynomials $Q_1$, $R_1$ which satisfy the identity,
$b_0(x_2, \dots x_k)f(x_1, \dots x_k) \equiv Q_1(x_1, \dots x_k)\phi(x_1, \dots x_k) + R_1(x_1, \dots x_k)$,
and such that either $R_1 \equiv 0$, or the degree of $R_1$ in $x_1$ is less than the
degree of $f$ in $x_1$.
The truth of this lemma is obvious if we let
$Q_1 \equiv a_0(x_2, \dots x_k)\,x_1^{n-m}$.
The polynomials $Q_1$, $R_1$, $b_0$ will serve as the polynomials $Q$, $R$, $P$
of our theorem if $R_1 \equiv 0$, or if the degree of $R_1$ in $x_1$ is less than the
degree of $\phi$ in $x_1$. If not, apply the lemma again to the two functions
$R_1$ and $\phi$, getting
$b_0R_1 \equiv Q_2\phi + R_2$,
where $R_2$ is either identically zero or is of lower degree in $x_1$ than $R_1$.
We may then write
$b_0^2f \equiv (b_0Q_1 + Q_2)\phi + R_2$.
If $R_2 \equiv 0$, or if the degree of $R_2$ in $x_1$ is less than the degree of $\phi$ in
$x_1$, we may take for the polynomials $Q$, $R$, $P$ of our theorem the
functions $b_0Q_1 + Q_2$, $R_2$, $b_0^2$. If not, we apply our lemma again to $R_2$
and $\phi$. Proceeding in this way, we get a series of polynomials
$R_1, R_2, \dots$ whose degrees in $x_1$ are constantly decreasing. We therefore,
after a certain number of steps, reach a polynomial $R_i$ which is
either identically zero, or of degree in $x_1$ less than $\phi$. Combining
the identities obtained up to this point, we have
$b_0^if \equiv (b_0^{i-1}Q_1 + b_0^{i-2}Q_2 + \cdots + Q_i)\phi + R_i$,
an identity which proves our theorem, and which also establishes the
additional result:
Corollary. The polynomial $P$ whose existence is stated in our
theorem may be taken as a power of $b_0$.
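The same process can be sketched in code. As a simplifying assumption of this illustration, the coefficients $a_i$, $b_i$ are taken to be integers rather than polynomials in $(x_2, \dots x_k)$; integers behave in the same way in the one respect that matters here, namely that a coefficient cannot in general be divided by $b_0$. The name `pseudo_divide` is hypothetical, and coefficient lists run from the highest power of $x_1$ down.

```python
def pseudo_divide(f, phi):
    """Return (P, Q, R) with P*f = Q*phi + R, where P is a power of b_0 = phi[0]
    and either R = [] or deg R < deg phi. No division of coefficients is used."""
    def strip(p):
        i = 0
        while i < len(p) and p[i] == 0:
            i += 1
        return p[i:]

    f, phi = strip(list(f)), strip(list(phi))
    if not phi:
        raise ZeroDivisionError("phi must not be identically zero")
    b0 = phi[0]
    P = 1
    q = [0] * max(len(f) - len(phi) + 1, 0)
    r = f
    while r and len(r) >= len(phi):
        shift = len(r) - len(phi)
        lead = r[0]
        # One application of the lemma to r:
        #   b0 * r = lead * x^shift * phi + (new remainder).
        q = [b0 * c for c in q]
        q[len(q) - 1 - shift] += lead
        P *= b0
        padded = phi + [0] * shift
        r = strip([b0 * rc - lead * pc for rc, pc in zip(r, padded)])
    return P, q, r

# 4 * (x^2 + 1) = (2x - 1)(2x + 1) + 5
print(pseudo_divide([1, 0, 1], [2, 1]))
```

Here two applications of the lemma were needed, so that $P = b_0^2 = 4$, in agreement with the corollary.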
We note that it would obviously not be correct to add to the
statement of Theorem 2 the further statement that there is only
one set of polynomials $Q$, $R$, $P$, since the identity (5) may be multiplied
by any polynomial in $(x_2, \dots x_k)$ without changing its form.
Cf., however, the exercise at the end of § 73.
64. A Special Transformation of a Polynomial. Suppose that
$f(x_1, x_2, x_3, x_4)$ is a homogeneous polynomial of the kth degree in
the homogeneous coordinates $(x_1, x_2, x_3, x_4)$, so that the equation
$f = 0$ represents a surface of the kth degree. If, in $f$, the term in
$x_4^k$ has the coefficient zero, the surface passes through the origin;
and if the term in $x_1^k$ (or $x_2^k$, or $x_3^k$) has the coefficient zero, the
surface passes through the point at infinity on the axis of $x_1$
(or $x_2$, or $x_3$). It is clear that these peculiarities of the surface
can be avoided, and that, too, in an infinite variety of ways, by
subjecting the surface to a nonsingular collineation which carries over
any four noncomplanar points, no one of which lies on the surface,
into the origin and the three points at infinity on the coordinate
axes. It is this fact, generalized to the case of $n$ variables, which
we now proceed to prove.
Lemma. If $f(x_1, \dots x_n)$ is a homogeneous polynomial of the kth
degree in which the term $x_i^k$ is wanting, there exists a nonsingular
linear transformation of the variables $(x_1, \dots x_n)$ which carries $f$ into a
new form $f_1$, in which the term in $x_i'^k$ has a coefficient different from
zero, while the coefficients of the kth powers of the other variables have
not been changed.
In proving this theorem there is obviously no real loss of generality
in taking as the variable $x_i$ the last of the variables, $x_n$.
Let us then consider the nonsingular transformation
$x_i = x_i' + \alpha_ix_n' \quad (i = 1, \dots n - 1)$,
$x_n = x_n'$.
This transformation carries $f$ into
$f_1(x_1', \dots x_n') \equiv f(x_1' + \alpha_1x_n', \dots x_{n-1}' + \alpha_{n-1}x_n', x_n')$,
and evidently does not change the coefficients of the terms in
$x_1'^k, \dots x_{n-1}'^k$.
Now, since every term in $f_1$, except the term in $x_n'^k$, contains at least
one of the variables $x_1', \dots x_{n-1}'$, the coefficient of the term in $x_n'^k$ will
be
$f_1(0, \dots, 0, 1) = f(\alpha_1, \dots \alpha_{n-1}, 1)$.
Our lemma will therefore be proved if we can show that the
constants $\alpha_1, \dots \alpha_{n-1}$ can be so chosen that this quantity is not zero.
Let us take any point $(b_1, \dots b_n)$ for which $b_n \neq 0$; and consider
a neighborhood of this point sufficiently small so that $x_n$ does
not vanish at any point in this neighborhood. Then, since $f$
does not vanish identically, we can find a point $(c_1, \dots c_n)$ in this
neighborhood (so that $c_n \neq 0$) such that
$f(c_1, \dots c_n) \neq 0$.
If now we take for $\alpha_1, \dots \alpha_{n-1}$ the values $c_1/c_n, \dots c_{n-1}/c_n$, we shall
have, since $f$ is homogeneous,
$f(\alpha_1, \dots \alpha_{n-1}, 1) = \dfrac{f(c_1, \dots c_n)}{c_n^k} \neq 0$.
Thus our lemma is proved.
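As a small worked instance of the lemma (chosen for this illustration and not taken from the text), let $n = 2$, $k = 2$, and $f = x_1x_2$, in which the term in $x_2^2$ is wanting:

```latex
\begin{aligned}
f(x_1, x_2) &= x_1 x_2, \\
x_1 &= x_1' + \alpha_1 x_2', \qquad x_2 = x_2', \\
f_1(x_1', x_2') &= (x_1' + \alpha_1 x_2')\,x_2' = x_1' x_2' + \alpha_1 x_2'^2, \\
f_1(0, 1) &= f(\alpha_1, 1) = \alpha_1 .
\end{aligned}
```

Any choice $\alpha_1 \neq 0$ therefore gives the term in $x_2'^2$ a coefficient different from zero; the point $(c_1, c_2) = (1, 1)$, for instance, gives $\alpha_1 = 1$. The coefficient of $x_1'^2$ remains what it was, namely zero, as the lemma asserts.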
Theorem 1. If $f(x_1, \dots x_n)$ is a homogeneous polynomial of the kth
degree, there exists a nonsingular linear transformation which carries
$f$ into a new form $f_1$ in which the terms in $x_1'^k, \dots x_n'^k$ all have coefficients
different from zero.
The proof of this theorem follows at once from the preceding
lemma. For we need merely to perform in succession the
transformations which cause the coefficients first of $x_1^k$, then of $x_2^k$, etc.,
to become different from zero, and which our lemma assures us
will exist and be nonsingular, to obtain the transformation we
want. To make sure of this it is necessary merely to notice that
the coefficient of $x_1^k$ obtained by the first transformation will not
be changed by the subsequent transformations; that the same
will be true of the coefficient of $x_2^k$ obtained by the second
transformation; etc.
Theorem 2. If $f(x_1, \dots x_n)$ is a polynomial of the kth degree
which is not necessarily homogeneous, there exists a nonsingular
homogeneous linear transformation of $(x_1, \dots x_n)$ which makes this
polynomial of the kth degree in each of the variables $x_1', \dots x_n'$ taken
separately.
If $f$ is homogeneous, this is equivalent to Theorem 1. If $f$ is
nonhomogeneous, we may write it in the form
$f(x_1, \dots x_n) \equiv \phi_k(x_1, \dots x_n) + \phi_{k-1}(x_1, \dots x_n) + \cdots + \phi_0$,
where each $\phi_i$ is a homogeneous polynomial of the degree indicated
by its subscript or else is identically zero. We need now merely to
apply Theorem 1 to the homogeneous polynomial $\phi_k$, which is, of
course, not identically zero.
This theorem, and therefore also Theorem 1, which is merely a
special case of it, admits the following generalization to the case of a
system of functions:
Theorem 3. If we have a system of polynomials
$f_1(x_1, \dots x_n), \quad f_2(x_1, \dots x_n), \quad \dots \quad f_m(x_1, \dots x_n)$,
of degrees $k_1, k_2, \dots k_m$ respectively, there exists a nonsingular
homogeneous linear transformation which makes these polynomials of degrees
$k_1, \dots k_m$ in each of the variables $x_1', \dots x_n'$ taken separately.
This theorem may be proved either by the same method used in
proving Theorems 1, 2; or by applying Theorem 2 to the product
$f_1f_2 \cdots f_m$.
CHAPTER XV
FACTORS AND COMMON FACTORS OF POLYNOMIALS IN ONE
VARIABLE AND OF BINARY FORMS
65. Fundamental Theorems on the Factoring of Polynomials in
One Variable and of Binary Forms. Theorem 2, § 6 may be stated
in the following form:
Theorem 1. A polynomial of the nth degree in one variable is
always reducible when $n > 1$. It can be resolved into the product of $n$
linear factors in one, and essentially in only one, way.
By means of § 62 we can deduce from this a similar theorem in
the case of the binary form
(1) $a_0x_1^n + a_1x_1^{n-1}x_2 + \cdots + a_nx_2^n$.
Let us first assume that $a_0 \neq 0$. Then the nonhomogeneous polynomial
(2) $a_0x_1^n + a_1x_1^{n-1} + \cdots + a_n$
corresponds to (1), according to the definition of § 62. Factoring
(2), we get
$a_0(x_1 - \alpha_1)(x_1 - \alpha_2) \cdots (x_1 - \alpha_n)$,
or, if we take $n$ constants $\alpha_1'', \alpha_2'', \dots \alpha_n''$ whose product is $a_0$,
(3) $(\alpha_1''x_1 - \alpha_1')(\alpha_2''x_1 - \alpha_2') \cdots (\alpha_n''x_1 - \alpha_n')$,
where for brevity we have written
$\alpha_i'/\alpha_i'' = \alpha_i \quad (i = 1, 2, \dots n)$.
By Theorem 1, § 62, we now infer that the binary form (1) is identically
equal to
(4) $(\alpha_1''x_1 - \alpha_1'x_2)(\alpha_2''x_1 - \alpha_2'x_2) \cdots (\alpha_n''x_1 - \alpha_n'x_2)$.
Moreover, there cannot be any way essentially different from this of
factoring (1) into linear factors, for if there were we should, by
setting $x_2 = 1$, have a way of factoring (2) into linear factors
essentially different from (3). Thus our theorem is proved on the
supposition that $a_0 \neq 0$.
Turning now to the case $a_0 = 0$, let us suppose that
$a_0 = \cdots = a_{k-1} = 0, \quad a_k \neq 0$,
where $k \leq n$. The form (1) then has the form
(5) $a_kx_1^{n-k}x_2^k + \cdots + a_nx_2^n$,
which is equal to the product of $k$ factors $x_2$ and the binary form
$a_kx_1^{n-k} + \cdots + a_nx_2^{n-k}$
of degree $n - k$. Since the first coefficient in this form is not zero,
it can, as we have just seen, be factored into $n - k$ linear factors.
Thus, here also, we see that the binary form can be written in the
form (4), the only peculiarity being that in this case $k$ of the
constants $\alpha''$ are zero. We leave it to the reader to show that this
factoring can be performed in essentially only one way. This being
done, we have the result:
Theorem 2. A binary form of the nth degree is always reducible
when $n > 1$. It can be resolved into the product of $n$ linear
factors in one, and essentially only one, way.
EXERCISES
1. Prove that every real polynomial in one variable of degree higher than two
is reducible in the domain of reals, and can be resolved into irreducible factors in
one, and essentially only one, way.
2. Prove the corresponding theorem for real binary forms.
66. The Greatest Common Divisor of Positive Integers.* We
will consider in this section the problem of finding the greatest common
divisor of two positive integers $a$ and $b$, which has the closest
* In this section we use the term divisor in the arithmetical sense, not in the
algebraic sense defined in § 60. An integer $b$ is said to be a divisor of an integer $a$ if
an integer $c$ exists such that $a = bc$.
analogy with the algebraic problem of the next section. The solution
of this problem, which was given by Euclid, is as follows:
If we divide $a$ by $b$* and get a quotient $q_0$ and a remainder $r_1$, we
may write
$a = q_0b + r_1$,
where, if the division is carried out as far as possible, we have $r_1 < b$.
Then divide $b$ by $r_1$, getting a quotient $q_1$ and a remainder $r_2$
which, if the division is carried out as far as possible, is less than $r_1$.
Proceeding in this way, we get the following system of equations,
in which, since the remainders $r_1, r_2, \dots$ are positive integers which
continually decrease, we ultimately come to a point where the division
leaves no remainder:
(1) $a = q_0b + r_1 \quad (r_1 < b)$,
    $b = q_1r_1 + r_2 \quad (r_2 < r_1)$,
    $r_1 = q_2r_2 + r_3 \quad (r_3 < r_2)$,
    . . . . . . . . . .
    $r_{p-2} = q_{p-1}r_{p-1} + r_p \quad (r_p < r_{p-1})$,
    $r_{p-1} = q_pr_p$.
From the first of these equations we see that every common factor
of $a$ and $b$ is a factor of $r_1$; from the second, that every common
factor of $b$ and $r_1$ is a factor of $r_2$; etc.; finally, that every common
factor of $r_{p-2}$ and $r_{p-1}$ is a factor of $r_p$. Hence every common factor
of $a$ and $b$ is a factor of $r_p$.
On the other hand, we see from the last equation (1) that every
factor of $r_p$ is a factor of $r_{p-1}$; from the next to the last equation, that
every common factor of $r_p$ and $r_{p-1}$ is a factor of $r_{p-2}$; etc.; finally,
that every common factor of $r_2$ and $r_1$ is a factor of $b$, and that
every common factor of $r_1$ and $b$ is a factor of $a$. Hence every factor
of $r_p$ is a common factor of $a$ and $b$.
Since the largest factor of $r_p$ is $r_p$ itself, we have the result:
Theorem 1. In Euclid's algorithm (1), the greatest common divisor
of $a$ and $b$ is $r_p$.
In particular, a necessary and sufficient condition that $a$ and $b$ be
relatively prime is that $r_p = 1$.
* This is possible even if $a < b$, the only peculiarity in this case being that the
quotient is zero and the remainder equal to $a$.
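The system (1) may be generated mechanically. In the following sketch the function name `euclid` and the convention of listing $b$ itself as $r_0$ (so that the last entry of the list of remainders is always the greatest common divisor) are assumptions of the illustration.

```python
def euclid(a, b):
    """Euclid's algorithm for positive integers a and b.
    Returns ([q_0, ..., q_p], [r_0, r_1, ..., r_p]) with r_0 = b;
    the last remainder listed is the greatest common divisor."""
    quotients, remainders = [], [b]
    while True:
        q, r = divmod(a, b)
        quotients.append(q)
        if r == 0:                 # the division leaves no remainder
            return quotients, remainders
        remainders.append(r)
        a, b = b, r

print(euclid(42, 30))   # 42 = 1*30 + 12, 30 = 2*12 + 6, 12 = 2*6
print(euclid(5, 12))    # a < b: the first quotient is zero
```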
We will next deduce from the equations (1) an important formula
by means of which $r_p$ is expressed in terms of $a$, $b$, and the $q$'s.
From the first equation (1) we have
$r_1 = a - q_0b$.
Substituting this value into the second equation, we get for $r_2$
the value
$r_2 = -q_1a + (q_0q_1 + 1)b$.
Substituting the values for $r_1$ and $r_2$ just found in the third
equation, we get
$r_3 = (q_1q_2 + 1)a - (q_0q_1q_2 + q_2 + q_0)b$.
Proceeding in this way, we can express each of the $r$'s, and therefore
ultimately $r_p$, in terms of $a$ and $b$. In order to express conveniently
the general formula here, we introduce the following
notation:
(2) $[\ ] = 1$,
    $[\alpha_1] = \alpha_1$,
    $[\alpha_1, \alpha_2] = \alpha_1\alpha_2 + 1$,
    $[\alpha_1, \alpha_2, \alpha_3] = \alpha_1\alpha_2\alpha_3 + \alpha_3 + \alpha_1$,
    . . . . . . . . . .
    $[\alpha_1, \dots \alpha_n] = [\alpha_1, \dots \alpha_{n-1}]\alpha_n + [\alpha_1, \dots \alpha_{n-2}]$.
It will be seen that the values of $r_1$, $r_2$, $r_3$ found above are included
in the formula
(3) $r_k = (-1)^{k-1}[q_1, q_2, \dots q_{k-1}]\,a + (-1)^k[q_0, q_1, q_2, \dots q_{k-1}]\,b$.
By the method of mathematical induction this formula will therefore
be established for all values of $k \leq p$ if, assuming that it holds when
$k \leq k_1$, we can show that it holds when $k = k_1 + 1$. This follows at
once when, in the formula
$r_{k_1+1} = r_{k_1-1} - q_{k_1}r_{k_1}$,
we substitute for $r_{k_1-1}$ and $r_{k_1}$ their values from (3) and reduce the
resulting expression by means of the definitions (2).
We have therefore established the formula
(4) $r_p = Aa + Bb$,
where $A = (-1)^{p-1}[q_1, q_2, \dots q_{p-1}]$, $B = (-1)^p[q_0, q_1, \dots q_{p-1}]$.
Since the $q$'s are integers, it is clear that $A$ and $B$ will be integers.
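Formula (4) lends itself to a direct numerical check. The sketch below computes the bracket symbol from the recurrence in (2) and then forms $A$ and $B$ from the quotients; the names `bracket` and `bezout_from_quotients` are hypothetical, and the case $p = 0$ (in which $b$ divides $a$ and the formula of the text does not apply) is treated separately as an assumption of the example.

```python
def bracket(alphas):
    """The symbol [alpha_1, ..., alpha_n] of (2); the empty bracket is 1."""
    prev, cur = 0, 1          # brackets of the n-2 and n-1 leading terms
    for alpha in alphas:
        prev, cur = cur, cur * alpha + prev
    return cur

def bezout_from_quotients(a, b):
    """Return (r_p, A, B) with r_p = A*a + B*b, following formula (4)."""
    qs = []
    x, y = a, b
    while y != 0:
        q, r = divmod(x, y)
        qs.append(q)
        x, y = y, r
    p = len(qs) - 1           # qs = [q_0, ..., q_p]; x is now r_p
    if p == 0:                # b divides a: r_0 = b = 0*a + 1*b
        return x, 0, 1
    sign_A = 1 if (p - 1) % 2 == 0 else -1
    sign_B = 1 if p % 2 == 0 else -1
    A = sign_A * bracket(qs[1:p])     # [q_1, ..., q_{p-1}]
    B = sign_B * bracket(qs[0:p])     # [q_0, ..., q_{p-1}]
    return x, A, B

print(bezout_from_quotients(42, 30))   # 6 = (-2)*42 + 3*30
print(bezout_from_quotients(5, 12))    # 1 = 5*5 + (-2)*12
```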
The most important application of the result just obtained is to
the case in which $a$ and $b$ are relatively prime. Here $r_p = 1$, and we
have
(5) $Aa + Bb = 1$.
Conversely, if two integers $A$ and $B$ exist which satisfy (5), $a$ and $b$
are relatively prime, as otherwise the left-hand side of (5) would
have a factor greater than 1.
Theorem 2. A necessary and sufficient condition for $a$ and $b$ to be
relatively prime is that there exist two integers $A$ and $B$ such that
$Aa + Bb = 1$.
EXERCISES
1. Prove that $[\alpha_1, \alpha_2, \dots \alpha_n] = [\alpha_n, \alpha_{n-1}, \dots \alpha_1]$.
[Suggestion. Use the method of mathematical induction.]
2. Prove that the numerical values of the integers $A$ and $B$ found above are
respectively less than $\tfrac{1}{2}b$ and $\tfrac{1}{2}a$.
[Suggestion. Show that $a/b = [q_0, \dots q_p]/[q_1, \dots q_p]$, and that this second
fraction is expressed in its lowest terms.]
3. Prove that there can exist only one pair of integers $A$ and $B$ satisfying the
relation $Aa + Bb = 1$ and such that $A$ and $B$ are numerically less than $\tfrac{1}{2}b$ and $\tfrac{1}{2}a$
respectively.
67. The Greatest Common Divisor of Two Polynomials in One
Variable. In place of the integers $a$ and $b$ of the last section, we
consider here the two polynomials:
(1) $f(x) \equiv a_0x^n + a_1x^{n-1} + \cdots + a_n$,
    $\phi(x) \equiv b_0x^m + b_1x^{m-1} + \cdots + b_m$.
By the greatest common divisor of these two polynomials is
meant their common factor of greatest degree.* It will turn out in
the course of our work that (except in the case in which $f$ and $\phi$ are
both identically zero) this greatest common divisor is completely
determinate except for an arbitrary constant factor which may be
introduced into it.
* Many English and American textbooks use the term highest common factor; but
as there is not the slightest possibility that the word greatest, here, should refer to the
value of the polynomial, since the polynomial has an infinite number of values for
different values of the argument, it seems better to retain the traditional term.
We will assume that neither $f$ nor $\phi$ is a mere constant, and that
the notation has been so introduced that $f$ is of at least as high
degree as $\phi$; that is, we assume
$a_0 \neq 0, \quad b_0 \neq 0, \quad n \geq m > 0$.
Let us apply Euclid's algorithm to $f$ and $\phi$ precisely as in § 66 we
applied it to $a$ and $b$. We thus get the system of identities
(2) $f(x) \equiv Q_0(x)\phi(x) + R_1(x)$,
    $\phi(x) \equiv Q_1(x)R_1(x) + R_2(x)$,
    $R_1(x) \equiv Q_2(x)R_2(x) + R_3(x)$,
    . . . . . . . . . .
    $R_{p-1}(x) \equiv Q_p(x)R_p(x) + R_{p+1}$.
For the sake of uniformity we will write
$\phi(x) \equiv R_0(x)$.
Then $R_0, R_1, R_2, \dots$ are polynomials of decreasing degrees, so that
after a finite number of steps a remainder is reached which is a
constant. This remainder we have indicated by $R_{p+1}$.
From this algorithm we infer, as in § 66, that every common
factor of $f$ and $\phi$ is a factor of all the $R$'s, and, on the other hand,
that every common factor of two successive $R$'s is a factor of all the
preceding $R$'s and therefore of $f$ and $\phi$. Accordingly, if $f$ and $\phi$
have a common factor which is not a constant, this common factor
must be a factor of the constant $R_{p+1}$, and therefore $R_{p+1} = 0$.
Conversely, if $R_{p+1} = 0$, the polynomial $R_p(x)$ is itself a common factor
of $R_{p-1}$ and $R_p$, and therefore of $f$ and $\phi$. Hence the two theorems:
Theorem 1. A necessary and sufficient condition that two polynomials
in one variable $f$ and $\phi$, neither of which is a constant, be
relatively prime is that in Euclid's algorithm, (2), $R_{p+1} \neq 0$.
Theorem 2. If in Euclid's algorithm, (2), $R_{p+1} = 0$, then $R_p(x)$ is
the greatest common divisor of $f$ and $\phi$.
By means of this theorem we are in a position to compute the
greatest common divisor, not only of two, but of any finite number, of
polynomials in one variable. Thus if we want the greatest common
divisor of $f(x)$, $\phi(x)$, $\psi(x)$, we should first compute, as above, the
greatest common divisor $R_p(x)$ of $f$ and $\phi$, and then, by the same
method, compute the greatest common divisor of $R_p(x)$ and $\psi(x)$.
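The algorithm (2) may be sketched as follows. The representation (lists of rational coefficients, highest power first, the zero polynomial being the empty list) and the names `strip`, `remainder`, `poly_gcd` are assumptions of this illustration. Two further choices are made here and not in the text: the loop is carried one step past a constant remainder, and the result is divided by its leading coefficient, so that relatively prime polynomials give the constant 1 and the arbitrary constant factor is fixed.

```python
from fractions import Fraction

def strip(p):
    """Drop leading zero coefficients; the zero polynomial becomes []."""
    i = 0
    while i < len(p) and p[i] == 0:
        i += 1
    return p[i:]

def remainder(f, phi):
    """Remainder of f on division by phi (phi not identically zero)."""
    r = strip(f)
    while r and len(r) >= len(phi):
        shift = len(r) - len(phi)
        coeff = r[0] / phi[0]
        padded = phi + [Fraction(0)] * shift
        r = strip([rc - coeff * pc for rc, pc in zip(r, padded)])
    return r

def poly_gcd(f, phi):
    """Greatest common divisor, normalised to leading coefficient 1."""
    f = strip([Fraction(c) for c in f])
    phi = strip([Fraction(c) for c in phi])
    while phi:
        f, phi = phi, remainder(f, phi)
    return [c / f[0] for c in f]

# (x - 1)(x + 2) and (x - 1)(x - 3) have the common factor x - 1.
g = poly_gcd([1, 1, -2], [1, -4, 3])
print(g)
# Greatest common divisor of three polynomials, taken two at a time.
print(poly_gcd(g, [1, 0, -1]))
```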
From the identities (2) we can compute the value of each of the
remainders in terms of $f$, $\phi$, and the quotients $Q$. The formulae
here are precisely like those of § 66, and give for $R_{p+1}$ the value
(3) $R_{p+1} \equiv (-1)^p[Q_1(x), Q_2(x), \dots Q_p(x)]\,f(x) + (-1)^{p+1}[Q_0(x), Q_1(x), \dots Q_p(x)]\,\phi(x)$.
Suppose, now, that $f$ and $\phi$ are relatively prime. We may then
divide (3) by $R_{p+1}$ and get
(4) $F(x)f(x) + \Phi(x)\phi(x) \equiv 1$,
where
(5) $F(x) \equiv \dfrac{(-1)^p}{R_{p+1}}[Q_1(x), Q_2(x), \dots Q_p(x)]$,
    $\Phi(x) \equiv \dfrac{(-1)^{p+1}}{R_{p+1}}[Q_0(x), Q_1(x), \dots Q_p(x)]$.
From the definitions (2), § 66, we see that $F$ and $\Phi$ are polynomials
in $x$. The existence of two polynomials $F$ and $\Phi$ which satisfy
(4) is therefore a necessary condition that $f$ and $\phi$ be relatively prime.
It is also a sufficient condition; for from (4) we see that every common
factor of $f$ and $\phi$ must be a factor of 1, that is, must be a constant.
Thus we have proved the theorem:
Theorem 3. A necessary and sufficient condition that the polynomials
$f(x)$ and $\phi(x)$ be relatively prime is that two polynomials $F(x)$ and
$\Phi(x)$ exist which satisfy (4).*
We can make this statement a little more precise by noting the
degrees of $F$ and $\Phi$ as given by (5). For this purpose let us first
notice that if $\alpha_1, \dots \alpha_n$ are polynomials of degrees $k_1, \dots k_n$ respectively,
$[\alpha_1, \dots \alpha_n]$ will, by (2), § 66, not be of degree greater than
$k_1 + \cdots + k_n$. Now let the degree of $R_i(x)$ be $m_i$, and, as above, the
degrees of $f$ and $\phi$, $n$ and $m$ respectively. Then (cf. (2)) the degrees
of $Q_0, Q_1, Q_2, \dots$ will be $n - m$, $m - m_1$, $m_1 - m_2$, $\dots$ respectively.
Accordingly, by (5), the degrees of $F$ and $\Phi$ are respectively not greater
than
$(m - m_1) + (m_1 - m_2) + \cdots + (m_{p-1} - m_p) = m - m_p$,
and
$(n - m) + (m - m_1) + (m_1 - m_2) + \cdots + (m_{p-1} - m_p) = n - m_p$.
Hence, since $m_p > 0$, $F$ is of degree less than $m$, and $\Phi$ of degree less
than $n$.
* The proof we have given of this theorem applies only when neither $f$ nor $\phi$ is a
constant. The truth of the theorem is at once obvious if $f$ or $\phi$ is a constant.
Conversely, we will now show that if $F_1$ is a polynomial of degree
less than $m$, and $\Phi_1$ a polynomial of degree less than $n$, and if
(6) $F_1(x)f(x) + \Phi_1(x)\phi(x) \equiv 1$,
then $F_1(x) \equiv F(x)$, $\Phi_1(x) \equiv \Phi(x)$.
To prove this, subtract (6) from (4), getting
$(F - F_1)f \equiv (\Phi_1 - \Phi)\phi$.
If we resolve the two sides of this identity into their linear factors,
we see that, since $f$ and $\phi$ are relatively prime, $f$ must be a
factor of $\Phi_1 - \Phi$ and $\phi$ a factor of $F - F_1$. This, however, is possible
only if $\Phi_1 - \Phi$ and $F - F_1$ vanish identically, as otherwise
they would be of lower degree than $f$ and $\phi$ respectively. We
have thus proved the theorem:
Theorem 4. If $f(x)$ and $\phi(x)$ are relatively prime, and neither is
a constant, there exists one, and only one, pair of polynomials $F(x)$,
$\Phi(x)$, whose degrees are respectively less than the degrees of $\phi$ and $f$,
and which satisfy the identity (4).
Before proceeding to the general applications of the principles
here developed which will be found in the next section, the reader
will do well to familiarize himself somewhat with the ideas involved
by considering the special case of two polynomials of the second
<ieg^ee= f{x) = a^a? + a^x + a^ ag^O,
^(x) = 6q x^ +h^x + h^ h^^^
If the condition that these two polynomials be relatively prime
be worked out by a direct application of Euclid's algorithm, it
will be found necessary to consider separately the cases in which
aj6g — aj)^ is or is not zero. By collating these results it will be
found that in all cases the desired condition is:
(«2^o  «o*2)^ + («i^o  «o*i)(«/2  «2*i) ^ 0.
This condition may be found more neatly and quickly by obtaining
the condition that two polynomials of the form
$$F(x) \equiv p_0x + p_1,$$
$$\Phi(x) \equiv q_0x + q_1$$
exist which satisfy the identity (4).
It is this last method which we shall apply to the general case in
the next section.
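The condition for two polynomials of the second degree lends itself to a quick numerical check. The sketch below is ours and not part of the original text; the function name `quadratics_relatively_prime` is hypothetical, and each polynomial is assumed to be passed as its three coefficients $(a_0, a_1, a_2)$ with $a_0 \neq 0$.

```python
def quadratics_relatively_prime(a, b):
    """Test the condition of this section for a0 x^2 + a1 x + a2 and b0 x^2 + b1 x + b2."""
    a0, a1, a2 = a
    b0, b1, b2 = b
    value = (a2 * b0 - a0 * b2) ** 2 + (a1 * b0 - a0 * b1) * (a1 * b2 - a2 * b1)
    return value != 0

# x^2 - 1 and x^2 - x share the factor x - 1.
print(quadratics_relatively_prime((1, 0, -1), (1, -1, 0)))
# x^2 - 1 and x^2 + 1 have no common factor.
print(quadratics_relatively_prime((1, 0, -1), (1, 0, 1)))
```

The first call reports that the polynomials are not relatively prime, the second that they are.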
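68. The Resultant of Two Polynomials in One Variable. Let
$$f(x) \equiv a_0x^n + a_1x^{n-1} + \cdots + a_{n-1}x + a_n \qquad a_0 \neq 0,\ n > 0,$$
$$\phi(x) \equiv b_0x^m + b_1x^{m-1} + \cdots + b_{m-1}x + b_m \qquad b_0 \neq 0,\ m > 0.$$
The condition that these polynomials be relatively prime consists, as we see from Theorem 4, § 67, in the existence of constants $p_0, p_1, \cdots p_{m-1}$, $q_0, q_1, \cdots q_{n-1}$ such that
$$(p_0x^{m-1} + p_1x^{m-2} + \cdots + p_{m-1})(a_0x^n + a_1x^{n-1} + \cdots + a_n) + (q_0x^{n-1} + q_1x^{n-2} + \cdots + q_{n-1})(b_0x^m + b_1x^{m-1} + \cdots + b_m) \equiv 1.$$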
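Equating coefficients of like powers of $x$, we see that the following system of equations is equivalent to the last written identity:
$$(1)\qquad\begin{aligned}
a_0p_0 + b_0q_0 &= 0,\\
a_1p_0 + a_0p_1 + b_1q_0 + b_0q_1 &= 0,\\
\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots &\\
a_{m+1}p_0 + a_mp_1 + \cdots + a_2p_{m-1} + b_mq_1 + \cdots + b_0q_{m+1} &= 0,\\
\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots &\\
a_np_{m-2} + a_{n-1}p_{m-1} + b_mq_{n-2} + b_{m-1}q_{n-1} &= 0,\\
a_np_{m-1} + b_mq_{n-1} &= 1.
\end{aligned}$$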
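In writing these equations we have assumed for the sake of definiteness that $n > m$, though the change would be immaterial if this were not the case. This is a system of $m + n$ linear equations in the $m + n$ unknowns $p_0, \cdots p_{m-1}$, $q_0, \cdots q_{n-1}$, whose determinant, after an interchange of rows and columns and a shifting of the rows, is
$$R = \begin{vmatrix}
a_0 & a_1 & \cdots & a_n & 0 & \cdots & 0 \\
0 & a_0 & a_1 & \cdots & a_n & \cdots & 0 \\
\cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot \\
0 & \cdots & 0 & a_0 & a_1 & \cdots & a_n \\
0 & \cdots & 0 & b_0 & b_1 & \cdots & b_m \\
\cdot & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot \\
0 & b_0 & b_1 & \cdots & b_m & \cdots & 0 \\
b_0 & b_1 & \cdots & b_m & 0 & \cdots & 0
\end{vmatrix},$$
a determinant which, it should be noticed, has $m + n$ rows and columns.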
If $R \neq 0$, the set of equations (1) has one and only one solution, and $f$ and $\phi$ are relatively prime. If $R = 0$, two cases seem at first sight possible (cf. § 16): either the system of equations has no solution, or it has an infinite number of solutions. This latter alternative cannot, however, really arise, for we have seen in Theorem 4, § 67, that not more than one pair of polynomials $F$ and $\Phi$ can exist which satisfy formula (4) of that section and whose degrees do not exceed $m - 1$ and $n - 1$ respectively. Accordingly, if $R = 0$, the set of equations has no solution and $f$ and $\phi$ have a common factor. $R$ is called the resultant of $f$ and $\phi$.*
The term resultant has thus been defined only on the supposition that $f$ and $\phi$ are both of at least the first degree. It is desirable to extend this definition to the case in which one or both of these polynomials is a constant. Except in the extreme case $m = n = 0$, we will continue to use the determinant $R$ as the definition of the resultant. Thus when $m = 0$, $n > 0$ we have
$$R = \pm b_0^n.$$
If $b_0 \neq 0$ we have $R \neq 0$, and moreover in this case $f$ and $\phi$ are relatively prime since the constant $\phi$ has no factors other than constants. If, however, $b_0 = 0$, we have $R = 0$, and every factor of $f$ is a factor of $\phi$, since $\phi$ is now identically zero.
Similarly when $n = 0$, $m > 0$, we have
$$R = a_0^m,$$
and we see that a necessary and sufficient condition that $f$ and $\phi$ be relatively prime is that $R \neq 0$.
Finally, when $n = m = 0$, we define the symbol $R\begin{pmatrix} a_0 \\ b_0 \end{pmatrix}$, which we still use to denote the resultant, by the formula
$$R\begin{pmatrix} a_0 \\ b_0 \end{pmatrix} = \begin{cases} 1 & \text{when } a_0 \text{ and } b_0 \text{ are not both zero,} \\ 0 & \text{when } a_0 = b_0 = 0. \end{cases}$$
We may now say with entire generality:
Theorem. A necessary and sufficient condition for two polynomials
in one variable to be relatively prime is that their resultant do not vanish.
For another method of approach to the resultant, cf. Exercise 4
at the end of § 76.
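The theorem just stated gives a test for relative primeness that uses only the coefficients. The following Python sketch is ours and not part of the original text; the names `resultant_rows`, `det`, and `resultant` are hypothetical, coefficients are assumed to be rational and listed from the highest power down, and both degrees are assumed to be at least one. The rows are arranged with the $m$ rows of $a$'s above the $n$ rows of $b$'s, the latter shifted to the left as we go down; another arrangement could change at most the sign.

```python
from fractions import Fraction

def resultant_rows(a, b):
    """Rows of R for f = a[0]x^n + ... + a[n] and phi = b[0]x^m + ... + b[m]."""
    n, m = len(a) - 1, len(b) - 1
    rows = [[0] * i + list(a) + [0] * (m - 1 - i) for i in range(m)]
    rows += [[0] * (n - 1 - i) + list(b) + [0] * i for i in range(n)]
    return rows

def det(rows):
    """Determinant by elimination in exact rational arithmetic."""
    m = [[Fraction(v) for v in row] for row in rows]
    n = len(m)
    result = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            result = -result          # a row interchange changes the sign
        result *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= factor * m[c][k]
    return result

def resultant(a, b):
    return det(resultant_rows(a, b))

# x^2 - 1 and x^2 - x have the common factor x - 1, so the resultant vanishes.
print(resultant([1, 0, -1], [1, -1, 0]) == 0)
# x^2 + 1 and x - 1 are relatively prime, so it does not.
print(resultant([1, 0, 1], [1, -1]) != 0)
```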
* It should be noticed that the resultant of $\phi$ and $f$ may be the negative of the resultant of $f$ and $\phi$. This change of sign is of no consequence for most purposes.
69. The Greatest Common Divisor in Determinant Form.
Definition. By the $i$th subresultant $R_i$ of two polynomials in one variable is understood the determinant obtained by striking out the first $i$ and the last $i$ rows and also the first $i$ and the last $i$ columns from the resultant of these polynomials.
Thus if the polynomials are of degrees 5 and 3 respectively, the resultant $R$ is a determinant of the eighth order, $R_1$ of the sixth, $R_2$ of the fourth, and $R_3$ of the second, as indicated below:
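$$R = \begin{vmatrix}
a_0 & a_1 & a_2 & a_3 & a_4 & a_5 & 0 & 0 \\
0 & a_0 & a_1 & a_2 & a_3 & a_4 & a_5 & 0 \\
0 & 0 & a_0 & a_1 & a_2 & a_3 & a_4 & a_5 \\
0 & 0 & 0 & 0 & b_0 & b_1 & b_2 & b_3 \\
0 & 0 & 0 & b_0 & b_1 & b_2 & b_3 & 0 \\
0 & 0 & b_0 & b_1 & b_2 & b_3 & 0 & 0 \\
0 & b_0 & b_1 & b_2 & b_3 & 0 & 0 & 0 \\
b_0 & b_1 & b_2 & b_3 & 0 & 0 & 0 & 0
\end{vmatrix}$$
Here $R_1$, $R_2$, $R_3$ are the central blocks of $R$ which remain after one, two, and three of the outermost rows and columns have been struck out on every side.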
We now state the following results, leaving their proof to the reader:
Lemma. If $f_1(x)$ and $\phi_1(x)$ are polynomials, and
$$f(x) \equiv (x - \alpha)f_1(x), \qquad \phi(x) \equiv (x - \alpha)\phi_1(x),$$
the resultant of $f_1$ and $\phi_1$ and their successive subresultants are equal respectively to the successive subresultants of $f$ and $\phi$.
Theorem 1. The degree of the greatest common divisor of $f(x)$ and $\phi(x)$ is equal to the subscript of the first of the subresultants $R_0 = R, R_1, R_2, \cdots$ which does not vanish.
Theorem 2. If $i$ is the degree of the greatest common divisor of two polynomials $f(x)$ and $\phi(x)$, then this greatest common divisor may be obtained from the $i$th subresultant of $f$ and $\phi$ by replacing the last element in the last row of coefficients of $f$ by $f(x)$, the element just above this by $xf(x)$, the element above this by $x^2f(x)$, etc.; and replacing the last element in the first row of coefficients of $\phi$ by $\phi(x)$, the element below this by $x\phi(x)$, etc.
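If, for instance, the degrees of $f$ and $\phi$ are 5 and 3 respectively, and $i = 1$, the greatest common divisor is
$$\begin{vmatrix}
a_0 & a_1 & a_2 & a_3 & a_4 & xf(x) \\
0 & a_0 & a_1 & a_2 & a_3 & f(x) \\
0 & 0 & 0 & b_0 & b_1 & \phi(x) \\
0 & 0 & b_0 & b_1 & b_2 & x\phi(x) \\
0 & b_0 & b_1 & b_2 & b_3 & x^2\phi(x) \\
b_0 & b_1 & b_2 & b_3 & 0 & x^3\phi(x)
\end{vmatrix}$$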
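Theorem 1 of this section can be tried out on small examples. The sketch below is ours and not part of the original text; the names `resultant_rows`, `det`, `subresultant`, and `gcd_degree` are hypothetical, coefficients are assumed rational and listed from the highest power down, and the rows of the resultant are assumed to be arranged as in the determinant of the eighth order displayed above. A determinant with no rows at all is taken equal to 1, a convention of our own which makes the search stop when one polynomial divides the other and the degrees are equal.

```python
from fractions import Fraction

def resultant_rows(a, b):
    """Rows of R: m rows of a's shifted right, then n rows of b's shifted left."""
    n, m = len(a) - 1, len(b) - 1
    rows = [[0] * i + list(a) + [0] * (m - 1 - i) for i in range(m)]
    rows += [[0] * (n - 1 - i) + list(b) + [0] * i for i in range(n)]
    return rows

def det(rows):
    """Determinant by elimination in exact rational arithmetic (empty gives 1)."""
    m = [[Fraction(v) for v in row] for row in rows]
    n = len(m)
    result = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            result = -result
        result *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= factor * m[c][k]
    return result

def subresultant(a, b, i):
    """Strike out the first i and last i rows and columns of R."""
    rows = resultant_rows(a, b)
    size = len(rows)
    return det([row[i:size - i] for row in rows[i:size - i]])

def gcd_degree(a, b):
    """Subscript of the first subresultant R_0, R_1, ... which does not vanish."""
    i = 0
    while subresultant(a, b, i) == 0:
        i += 1
    return i

# (x-1)(x-2)(x-3) and (x-1)(x-2) have a greatest common divisor of degree 2.
print(gcd_degree([1, -6, 11, -6], [1, -3, 2]))
# (x-1)(x+1) and (x-1)(x-2) have a greatest common divisor of degree 1.
print(gcd_degree([1, 0, -1], [1, -3, 2]))
```

In the second example Theorem 2 may also be checked by hand: $R_1$ has the rows $1, 0$ and $1, -3$, and replacing its last column by $f(x)$, $\phi(x)$ gives $\phi(x) - f(x) \equiv -3(x - 1)$.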
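70. Common Roots of Equations. Elimination. Consider the equations
$$f(x) \equiv a_0x^n + a_1x^{n-1} + \cdots + a_n = 0 \qquad a_0 \neq 0,$$
$$\phi(x) \equiv b_0x^m + b_1x^{m-1} + \cdots + b_m = 0 \qquad b_0 \neq 0,$$
whose roots are $\alpha_1, \alpha_2, \cdots \alpha_n$ and $\beta_1, \beta_2, \cdots \beta_m$ respectively; and suppose $f(x)$ and $\phi(x)$ resolved into their linear factors:
$$f(x) \equiv a_0(x - \alpha_1)(x - \alpha_2)\cdots(x - \alpha_n),$$
$$\phi(x) \equiv b_0(x - \beta_1)(x - \beta_2)\cdots(x - \beta_m).$$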
Since, by Theorem 1, § 65, these sets of factors are unique, it is evident that the equations $f(x) = 0$ and $\phi(x) = 0$ will have a common root when, and only when, $f(x)$ and $\phi(x)$ have a common factor, that is, when, and only when, the resultant $R$ of $f$ and $\phi$ is zero.
To eliminate $x$ between two equations $f(x) = 0$ and $\phi(x) = 0$, is often taken in elementary algebra to mean: to find a relation between the coefficients of $f$ and $\phi$ which must hold if the two equations are both satisfied; that is, to find a necessary condition for the two equations to have a common root. For most purposes, however, when we eliminate we want a relation between the coefficients which not only holds when the two equations have a common root, but such that, conversely, when it holds the equations will have a common root. From this broader point of view, to eliminate $x$ between two equations $f(x) = 0$ and $\phi(x) = 0$ means simply to find a necessary and sufficient condition that these equations have a common root. Hence the result of this elimination is $R = 0$. Let us, however, look at this question from a little different point of view.
In the equations
$$(1)\qquad a_0x^5 + a_1x^4 + a_2x^3 + a_3x^2 + a_4x + a_5 = 0 \qquad a_0 \neq 0,$$
$$(2)\qquad b_0x^3 + b_1x^2 + b_2x + b_3 = 0 \qquad b_0 \neq 0,$$
let us consider the different powers of $x$ as so many distinct unknowns. We have, then, two nonhomogeneous, linear equations in the five unknowns $x, x^2, x^3, x^4, x^5$. Multiplying (1) through by $x$ and then by $x^2$, and (2) by $x, x^2, x^3, x^4$, in turn, we have
$$\begin{aligned}
a_0x^7 + a_1x^6 + a_2x^5 + a_3x^4 + a_4x^3 + a_5x^2 &= 0,\\
a_0x^6 + a_1x^5 + a_2x^4 + a_3x^3 + a_4x^2 + a_5x &= 0,\\
a_0x^5 + a_1x^4 + a_2x^3 + a_3x^2 + a_4x + a_5 &= 0,\\
b_0x^3 + b_1x^2 + b_2x + b_3 &= 0,\\
b_0x^4 + b_1x^3 + b_2x^2 + b_3x &= 0,\\
b_0x^5 + b_1x^4 + b_2x^3 + b_3x^2 &= 0,\\
b_0x^6 + b_1x^5 + b_2x^4 + b_3x^3 &= 0,\\
b_0x^7 + b_1x^6 + b_2x^5 + b_3x^4 &= 0,
\end{aligned}$$
a system of eight nonhomogeneous, linear equations in seven unknowns.
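If a value of $x$ satisfies both (1) and (2), it will evidently satisfy all the above equations. These equations are therefore consistent, so that by Theorem 1, § 16, we have
$$\begin{vmatrix}
a_0 & a_1 & a_2 & a_3 & a_4 & a_5 & 0 & 0 \\
0 & a_0 & a_1 & a_2 & a_3 & a_4 & a_5 & 0 \\
0 & 0 & a_0 & a_1 & a_2 & a_3 & a_4 & a_5 \\
0 & 0 & 0 & 0 & b_0 & b_1 & b_2 & b_3 \\
0 & 0 & 0 & b_0 & b_1 & b_2 & b_3 & 0 \\
0 & 0 & b_0 & b_1 & b_2 & b_3 & 0 & 0 \\
0 & b_0 & b_1 & b_2 & b_3 & 0 & 0 & 0 \\
b_0 & b_1 & b_2 & b_3 & 0 & 0 & 0 & 0
\end{vmatrix} = 0.$$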
Hence the vanishing of this determinant is a necessary condition for
(1) and (2) to have a common root.
This device is known as Sylvester's Dialytic Method of Elimination.*
* For the sake of simplicity we have taken the special case where $n = 5$ and $m = 3$. The method, however, is perfectly general.
The above determinant is seen to be exactly the same as the resultant $R$ of (1) and (2), so that Sylvester's method leads to the same condition for two equations to have a common root as that found above, namely $R = 0$. It does not prove, however, that this condition is sufficient, but merely that it is necessary. Thus Sylvester's method, while brief, is very imperfect.
The number of roots common to two equations, $f(x) = 0$ and $\phi(x) = 0$, and also an equation for computing the common roots, may be obtained at once from § 69.
71. The Cases $a_0 = 0$ and $b_0 = 0$. It is important for us to note that according to the definitions we have given, the determinant
$$R\begin{pmatrix} a_0, \cdots a_n \\ b_0, \cdots b_m \end{pmatrix}$$
will be the resultant of the two polynomials
$$f(x) \equiv a_0x^n + a_1x^{n-1} + \cdots + a_n,$$
$$\phi(x) \equiv b_0x^m + b_1x^{m-1} + \cdots + b_m,$$
only when $f$ and $\phi$ are precisely of degrees $n$ and $m$ respectively, that is, only when $a_0 \neq 0$, $b_0 \neq 0$. Thus, for instance, the resultant of the polynomials
$$f(x) \equiv a_1x^{n-1} + a_2x^{n-2} + \cdots + a_n,$$
$$\phi(x) \equiv b_0x^m + b_1x^{m-1} + \cdots + b_m,$$
is not the $(m + n)$-rowed determinant $R\begin{pmatrix} 0, a_1, \cdots a_n \\ b_0, \cdots b_m \end{pmatrix}$ but, if $a_1 \neq 0$, $b_0 \neq 0$, the $(m + n - 1)$-rowed determinant $R\begin{pmatrix} a_1, \cdots a_n \\ b_0, \cdots b_m \end{pmatrix}$, or, if $a_1$ or $b_0$ is zero, a determinant of still lower order.*
Let us indicate by $R$ the $(m + n)$-rowed determinant $R\begin{pmatrix} a_0, \cdots a_n \\ b_0, \cdots b_m \end{pmatrix}$ and by $r$ the resultant of $f$ and $\phi$, and consider the case $a_0 = 0$, $a_1 \neq 0$, $b_0 \neq 0$. Since every element of the first column of $R$ except the last is zero, we may write
$$R = \pm b_0r.$$
In a similar way we see that if the degree of $f$ is $n - i$, and $b_0 \neq 0$, we may write
$$R = \pm b_0^ir,$$
and if the degree of $\phi$ is $m - i$, and $a_0 \neq 0$, we have
$$R = a_0^ir.$$
Accordingly, except when $a_0 = b_0 = 0$, $R$ differs from $r$ only by a nonvanishing factor.
* As an illustration take the two polynomials $f(x) \equiv (\alpha + \beta)x^2 + x - \beta$ and $\phi(x) \equiv \alpha x + 1$. If $\alpha + \beta \neq 0$ and $\alpha \neq 0$, the resultant here is $(\alpha^2 - 1)\beta$. But if $\beta = -\alpha \neq 0$, the resultant is $1 - \alpha^2$.
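The illustration of the footnote can be followed numerically. The Python sketch below is ours and not part of the original text; the names `det2`, `det3`, `big_R`, and `small_r` are hypothetical, and the rows of the determinants are assumed to be arranged with the row of coefficients of $f$ first. It compares the three-rowed determinant $R$ with the true resultant $r$ in the case $\beta = -\alpha$, where the leading coefficient of $f$ vanishes and $R$ differs from $r$ by the factor $b_0 = \alpha$.

```python
def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def det3(m):
    # Expansion according to the elements of the first row.
    return (m[0][0] * det2([m[1][1:], m[2][1:]])
            - m[0][1] * det2([[m[1][0], m[1][2]], [m[2][0], m[2][2]]])
            + m[0][2] * det2([m[1][:2], m[2][:2]]))

def big_R(alpha, beta):
    """Three-rowed determinant for f = (alpha+beta)x^2 + x - beta, phi = alpha*x + 1."""
    return det3([[alpha + beta, 1, -beta],
                 [0, alpha, 1],
                 [alpha, 1, 0]])

def small_r(alpha):
    """Resultant of x + alpha and alpha*x + 1, that is, the case beta = -alpha."""
    return det2([[1, alpha], [alpha, 1]])

alpha = 3
print(big_R(alpha, 2), (alpha ** 2 - 1) * 2)           # general case: (alpha^2 - 1) * beta
print(big_R(alpha, -alpha), alpha * small_r(alpha))    # degenerate case: R = b_0 * r
```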
Theorem. Although $R\begin{pmatrix} a_0, \cdots a_n \\ b_0, \cdots b_m \end{pmatrix}$ is the resultant of $f$ and $\phi$ only when $a_0 \neq 0$ and $b_0 \neq 0$ (or when $m = 0$ or $n = 0$), nevertheless its vanishing still forms a necessary and sufficient condition that $f$ and $\phi$ have a common factor even when $a_0 = 0$ or $b_0 = 0$, provided merely that $a_0$ and $b_0$ are not both zero.
That this last restriction can not be removed is at once evident; for, if $a_0 = b_0 = 0$, every element in the first column of the determinant is zero, and hence the determinant vanishes irrespective of whether $f$ and $\phi$ have a common factor or not.* All that we can say, if we do not wish to make this exception, is, therefore, that in all cases the vanishing of $R$ forms a necessary condition that $f$ and $\phi$ have a common factor.
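72. The Resultant of Two Binary Forms. Let us now consider the binary forms
$$f(x_1, x_2) \equiv a_0x_1^n + a_1x_1^{n-1}x_2 + \cdots + a_nx_2^n \qquad (n \geq 1),$$
$$\phi(x_1, x_2) \equiv b_0x_1^m + b_1x_1^{m-1}x_2 + \cdots + b_mx_2^m \qquad (m \geq 1).$$
By the side of these forms we write the polynomials in one variable
$$F(x) \equiv a_0x^n + a_1x^{n-1} + \cdots + a_n,$$
$$\Phi(x) \equiv b_0x^m + b_1x^{m-1} + \cdots + b_m.$$
The determinant
$$R\begin{pmatrix} a_0, \cdots a_n \\ b_0, \cdots b_m \end{pmatrix}$$
will be the resultant of $F$ and $\Phi$ only when neither $a_0$ nor $b_0$ is zero. We will, however, call it the resultant of the binary forms $f$ and $\phi$ in all cases.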
* By looking at the question from the side of the theory of common roots of two equations (cf. § 70), and by introducing the conception of infinite roots, we may avoid even this last exception. An equation
$$a_0x^n + a_1x^{n-1} + \cdots + a_n = 0$$
has $n$ roots, distinct or coincident, provided $a_0 \neq 0$. If we allow $a_0$ to approach the value zero, one or more of these roots becomes in absolute value larger and larger, as is seen by the transformation $x' = 1/x$. Hence it is natural to say that if $a_0 = 0$ the equation has an infinite root. If then we consider two equations each of which has an infinite root as having a common root, we may say:
A necessary and sufficient condition that the equations
$$a_0x^n + a_1x^{n-1} + \cdots + a_n = 0 \qquad n > 0,$$
$$b_0x^m + b_1x^{m-1} + \cdots + b_m = 0 \qquad m > 0,$$
Theorem. A necessary and sufficient condition for two binary forms to have a common factor other than a constant is that their resultant be zero.
If $a_0$ and $b_0$ are both different from zero, the nonhomogeneous polynomials $F$ and $\Phi$ correspond to the forms $f$ and $\phi$ according to the definition of § 62. Accordingly, by Theorem 2 of that section a necessary and sufficient condition that $f$ and $\phi$ have a common factor other than a constant is, in this case, the vanishing of their resultant.
On the other hand, if $a_0 = b_0 = 0$, $f$ and $\phi$ have the common factor $x_2$, and the resultant of $f$ and $\phi$ obviously vanishes.
A similar remark applies to the case in which all the $a$'s or all the $b$'s are zero.
There remain then only the following two cases to be considered,
(1) $a_0 \neq 0$; $b_0 = b_1 = \cdots = b_k = 0$, $b_{k+1} \neq 0$ $(k < m)$,
(2) $b_0 \neq 0$; $a_0 = a_1 = \cdots = a_k = 0$, $a_{k+1} \neq 0$ $(k < n)$.
In Case (1), $F$ corresponds to $f$, and, if we write
$$\phi(x_1, x_2) \equiv x_2^{k+1}\phi_1(x_1, x_2),$$
$\Phi$ corresponds to $\phi_1$. Now we know in this case (cf. § 71) that $R \neq 0$ is a necessary and sufficient condition that $F$ and $\Phi$ be relatively prime. Accordingly, by Theorem 2, § 62, it is also a necessary and sufficient condition that $f$ and $\phi_1$ be relatively prime. But since $x_2$ is not a factor of $f$, the two forms $f$ and $\phi$ will be relatively prime when and only when $f$ and $\phi_1$ are relatively prime. Thus our theorem is proved in this case.
The proof in Case (2) is precisely similar to that just given.
CHAPTER XVI
FACTORS OF POLYNOMIALS IN TWO OR MORE VARIABLES
73. Factors Involving only One Variable of Polynomials in Two Variables. We have seen in the last chapter that polynomials in one variable are always reducible when they are of degree higher than the first. Polynomials in two, or more, variables are, in general, not reducible, as we have already noticed in the special case of quadratic forms.
Let $f(x, y)$ be any polynomial in two variables, and suppose it arranged according to powers of $x$,
$$f(x, y) \equiv a_0(y)x^n + a_1(y)x^{n-1} + \cdots + a_{n-1}(y)x + a_n(y),$$
the $a$'s being polynomials in $y$.
Theorem 1. A necessary and sufficient condition that a polynomial in $y$ alone, $\psi(y)$, be a factor of $f(x, y)$ is that it be a factor of all the $a$'s.
The condition is clearly sufficient. To prove that it is necessary, let us suppose that $\psi(y)$ is a factor of $f(x, y)$. Then
$$(1)\qquad a_0(y)x^n + \cdots + a_n(y) \equiv \psi(y)\left[b_0(y)x^n + \cdots + b_n(y)\right],$$
where the $b$'s are polynomials in $y$. For any particular value of $y$ we deduce from (1), which is then an identity in $x$, the following equations:
$$a_0(y) = \psi(y)b_0(y), \quad a_1(y) = \psi(y)b_1(y), \quad \cdots \quad a_n(y) = \psi(y)b_n(y).$$
Since these equations hold for every value of $y$, they are identities, and $\psi(y)$ is a factor of all the $a$'s.
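The criterion of Theorem 1 reduces a question about a polynomial in two variables to divisions of polynomials in $y$ alone. The following Python sketch is ours and not part of the original text; the names `remainder` and `is_factor_in_y` are hypothetical, and we assume that $f(x, y)$ is supplied as the list of its coefficients $a_0(y), \cdots a_n(y)$, each of these being a list of rational numbers running from the highest power of $y$ down.

```python
from fractions import Fraction

def remainder(p, q):
    """Remainder of p(y) on division by q(y); q must have a nonzero leading coefficient."""
    rem = [Fraction(c) for c in p]
    q = [Fraction(c) for c in q]
    while len(rem) >= len(q):
        c = rem[0] / q[0]
        for k in range(len(q)):
            rem[k] -= c * q[k]
        rem.pop(0)                    # the leading term is now zero
    return rem

def is_factor_in_y(psi, coefficients):
    """True when psi(y) divides every coefficient a_0(y), ..., a_n(y)."""
    return all(all(c == 0 for c in remainder(a, psi)) for a in coefficients)

psi = [1, -1]                                   # psi(y) = y - 1
# f = (y - 1)x^2 + (y^2 - 1)x + (y^2 - y): every coefficient is divisible by y - 1.
print(is_factor_in_y(psi, [[1, -1], [1, 0, -1], [1, -1, 0]]))
# f = (y - 1)x + 1: the last coefficient is not divisible by y - 1.
print(is_factor_in_y(psi, [[1, -1], [1]]))
```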
Theorem 2. If $f(x, y)$ and $\phi(x, y)$ are any two polynomials in $x$ and $y$, and $\psi(y)$ is an irreducible polynomial in $y$ alone * which is a factor of the product $f\phi$, then $\psi$ is a factor of $f$ or of $\phi$.
Let
$$f(x, y) \equiv a_0(y)x^n + a_1(y)x^{n-1} + \cdots + a_n(y),$$
and
$$\phi(x, y) \equiv b_0(y)x^m + b_1(y)x^{m-1} + \cdots + b_m(y);$$
then
$$f(x, y)\phi(x, y) \equiv a_0b_0x^{n+m} + (a_0b_1 + a_1b_0)x^{n+m-1} + \cdots + a_nb_m.$$
In order to prove that $\psi$ is a factor either of $f$ or of $\phi$ we must prove that it is either a factor of all the $a$'s or of all the $b$'s. If this were not the case, we could find a first $a$ in the sequence $a_0, a_1, \cdots a_n$ of which $\psi$ is not a factor. Call this function $a_i$. There would also be a first $b$ in the sequence $b_0, b_1, \cdots b_m$ which is not divisible by $\psi$. Call this function $b_j$. Our theorem will be proved if we can show that this assumption, that $a_i$ and $b_j$ are not divisible by $\psi$ while all the functions $a_0, \cdots a_{i-1}$, $b_0, \cdots b_{j-1}$ are divisible by $\psi$, leads to a contradiction. For this purpose let us consider in the product $f\phi$ the coefficient of $x^{(n-i)+(m-j)}$, which may be written
$$a_0b_{i+j} + a_1b_{i+j-1} + \cdots + a_{i-1}b_{j+1} + a_ib_j + a_{i+1}b_{j-1} + \cdots + a_{i+j}b_0,$$
provided we agree that the $a$'s and $b$'s with subscripts greater than $n$ and $m$ respectively shall be identically zero. Since $f\phi$ is by hypothesis divisible by $\psi$, it follows from Theorem 1 that the last written expression must be divisible by $\psi$. This being obviously the case for all the terms which precede and for all which succeed the term $a_ib_j$, it follows that this term must also be divisible by $\psi$, so that among the linear factors of the function $a_ib_j$ must be found $\psi$. But by Theorem 1, § 65, the function $a_ib_j$ can be resolved into its linear factors in essentially only one way, and one way of so resolving it is to resolve $a_i$ and $b_j$ into their linear factors. Since $\psi$ is not one of these factors, we are led to a contradiction, and our theorem is proved.
An important corollary of our theorem is :
Corollary. Let $f(x, y)$ and $\phi(x, y)$ be polynomials in $(x, y)$, and let $\psi(y)$ be a polynomial in $y$ alone. If $\psi$ is a factor of the product $f\phi$ but is relatively prime to $\phi$, then $\psi$ is a factor of $f$.
* That is, a linear polynomial.
If $\psi$ is irreducible, this corollary is identical with the theorem. Let us suppose $\psi$ resolved into its irreducible factors none of which are constants, that is, into its factors of the first degree:
$$\psi(y) \equiv \psi_1(y)\psi_2(y)\cdots\psi_k(y).$$
Now consider the identity which expresses the fact that $\psi$ is a factor of $f\phi$:
$$(2)\qquad f(x, y)\phi(x, y) \equiv \psi_1(y)\psi_2(y)\cdots\psi_k(y)G(x, y).$$
This shows that $\psi_1(y)$ is a factor of $f\phi$ and hence, by Theorem 2, it is a factor either of $f$ or of $\phi$. Since $\psi$ and $\phi$ are relatively prime, $\psi_1$ cannot be a factor of $\phi$. It must then be a factor of $f$:
$$f(x, y) \equiv \psi_1(y)f_1(x, y).$$
Substituting this in (2), and cancelling out $\psi_1$, as we have a right to do since it is not identically zero, we get
$$(3)\qquad f_1(x, y)\phi(x, y) \equiv \psi_2(y)\cdots\psi_k(y)G(x, y).$$
From this we infer that $\psi_2$, being a factor of $f_1\phi$, must be a factor of $f_1$:
$$f_1(x, y) \equiv \psi_2(y)f_2(x, y).$$
We substitute this in (3) and cancel out $\psi_2$. Proceeding in this way we get
$$f(x, y) \equiv \psi_1(y)\psi_2(y)\cdots\psi_k(y)f_k(x, y) \equiv \psi(y)f_k(x, y),$$
an identity which proves our corollary.
EXERCISE
If $f(x, y)$ and $\phi(x, y)$ are polynomials, then any two sets of polynomials
$$P_1(y),\ Q_1(x, y),\ R_1(x, y),$$
$$P_2(y),\ Q_2(x, y),\ R_2(x, y),$$
will be proportional to each other provided,
(a) they satisfy the identities
$$P_1(y)f(x, y) \equiv Q_1(x, y)\phi(x, y) + R_1(x, y),$$
$$P_2(y)f(x, y) \equiv Q_2(x, y)\phi(x, y) + R_2(x, y);$$
(b) there is no factor other than a constant common to $P_1$, $Q_1$, and also no factor other than a constant common to $P_2$, $Q_2$;
(c) $R_1$ and $R_2$ are both of lower degree in $x$ than $\phi$.
(Cf. Theorem 2, § 63.)
74. The Algorithm of the Greatest Common Divisor for Polynomials in Two Variables. We will consider the two polynomials in $x$ and $y$,
$$f(x, y) \equiv a_0(y)x^n + a_1(y)x^{n-1} + \cdots + a_n(y),$$
$$\phi(x, y) \equiv b_0(y)x^m + b_1(y)x^{m-1} + \cdots + b_m(y),$$
and assume $a_0 \not\equiv 0$, $b_0 \not\equiv 0$, $n \geq m > 0$.
Theorem 1 of the last section in combination with the results of § 67 enables us to get all the common factors of $f$ and $\phi$ which involve $y$ only; for such factors must be common factors of all the $a$'s and all the $b$'s.
It remains, then, merely to devise a method of obtaining the common factors of $f$ and $\phi$ which do not themselves contain factors in $y$ alone. We will show how this can be done by means of the algorithm of the greatest common divisor.
Dividing $f$ by $\phi$ (cf. § 63, Theorem 2), we get the identity
$$P_0(y)f(x, y) \equiv Q_0(x, y)\phi(x, y) + R_1(x, y),$$
where $R_1$ is either identically zero, or is of lower degree in $x$ than $\phi$. If $R_1 \not\equiv 0$, divide $\phi$ by $R_1$, getting the identity
$$P_1(y)\phi(x, y) \equiv Q_1(x, y)R_1(x, y) + R_2(x, y),$$
where $R_2$ is either identically zero, or is of lower degree in $x$ than $R_1$. If $R_2 \not\equiv 0$, divide $R_1$ by $R_2$. Proceeding in this way, we get the following system of identities in which the degrees in $x$ of $R_1, R_2, \cdots$ continually decrease, so that after a certain number of steps we reach an $R$, say $R_{\rho+1}$, which is independent of $x$:
$$(1)\qquad\begin{aligned}
P_0(y)f(x, y) &\equiv Q_0(x, y)\phi(x, y) + R_1(x, y),\\
P_1(y)\phi(x, y) &\equiv Q_1(x, y)R_1(x, y) + R_2(x, y),\\
P_2(y)R_1(x, y) &\equiv Q_2(x, y)R_2(x, y) + R_3(x, y),\\
\cdots\cdots\cdots\cdots &\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots\\
P_{\rho-1}(y)R_{\rho-2}(x, y) &\equiv Q_{\rho-1}(x, y)R_{\rho-1}(x, y) + R_\rho(x, y),\\
P_\rho(y)R_{\rho-1}(x, y) &\equiv Q_\rho(x, y)R_\rho(x, y) + R_{\rho+1}(y).
\end{aligned}$$
Theorem 1. A necessary and sufficient condition that $f$ and $\phi$ have a common factor which involves $x$ is
$$R_{\rho+1}(y) \equiv 0.$$
In order to prove this theorem we first note that, by the first of the identities (1), any common factor of $f$ and $\phi$ is a factor of $R_1$, hence, by the second of the identities, it is a factor of $R_2$, etc. Finally we see that every common factor of $f$ and $\phi$ is a factor of all the $R$'s. But $R_{\rho+1}$ does not contain $x$. Hence if $f$ and $\phi$ have a common factor which contains $x$, $R_{\rho+1}(y) \equiv 0$.
Now suppose conversely that $R_{\rho+1}(y) \equiv 0$, and let
$$(2)\qquad R_\rho(x, y) \equiv S(y)G(x, y),$$
where $G$ has no factor in $y$ alone.* The last identity (1) then tells us that $P_\rho(y)$ is a factor of
$$Q_\rho(x, y)S(y)G(x, y),$$
and since by hypothesis $G$ has no factor in $y$ alone, $P_\rho(y)$ must, by the Corollary of Theorem 2, § 73, be a factor of $Q_\rho S$, that is
$$(3)\qquad Q_\rho(x, y)S(y) \equiv P_\rho(y)H(x, y).$$
Substituting first (2) and then (3) in the last identity (1), and cancelling out the factor $P_\rho(y)$ from the resulting identity, as we have a right to do since $P_\rho(y) \not\equiv 0$, we get the result
$$R_{\rho-1}(x, y) \equiv H(x, y)G(x, y).$$
That is, $G$ is a factor not only of $R_\rho$, but also of $R_{\rho-1}$. Accordingly we may write the next to the last identity (1) in the form
$$P_{\rho-1}(y)R_{\rho-2}(x, y) \equiv J(x, y)G(x, y).$$
By the Corollary of Theorem 2, § 73, we see that $P_{\rho-1}(y)$ is a factor of $J$, so that $P_{\rho-1}(y)$ can be cancelled out of this last written identity, and we see that $G$ is a factor of $R_{\rho-2}$.
Proceeding in this way, we see that $G$ is a factor of all the $R$'s, and therefore, finally, of $f$ and $\phi$. Moreover, we see from (2) that $G$ is of at least the first degree in $x$, as otherwise $R_\rho$ would not contain $x$, while $R_{\rho+1}$ was assumed to be the first of the $R$'s which did not involve $x$.
Thus our theorem is proved.
Since, as we saw above, every common factor of $f$ and $\phi$ is also a factor of all the $R$'s, it follows from (2) that, if $\psi$ is a common factor of $f$ and $\phi$,
$$G(x, y)S(y) \equiv \psi(x, y)K(x, y).$$
* If $R_\rho$ has no factor in $y$ alone, $S$ reduces to a constant.
If then $\psi$ contains no factor in $y$ alone, $S$ must, by the Corollary of Theorem 2, § 73, be a factor of $K$. Consequently by cancelling out $S$ from the last written identity, we see that $\psi$ is a factor of $G$. That is,
Theorem 2. If in Euclid's algorithm $R_{\rho+1} \equiv 0$, the greatest common divisor of $f$ and $\phi$ which contains no factor in $y$ alone is the polynomial $G(x, y)$ obtained by striking from $R_\rho(x, y)$ all factors in $y$ alone.
We note that if $R_{\rho+1}$ is a constant different from zero, $f$ and $\phi$ are relatively prime; but that the converse of this is not true as the simple example
f = 2x^ + 2.y\ if> = x
shows.
Going back to the identities (1), we get from the first of these identities, by mere transposition, the value of $R_1$ in terms of $f$ and $\phi$ (and $P_0$, $Q_0$). Substituting this value in the second identity, we get a value for $R_2$ in terms of $f$, $\phi$, and certain $P$'s and $Q$'s. Proceeding in this way, we finally get the formula
$$(4)\qquad R_{\rho+1}(y) \equiv F(x, y)f(x, y) + \Phi(x, y)\phi(x, y),$$
where $F$ and $\Phi$ are polynomials in $(x, y)$.
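The chain of divisions of this section can be carried out mechanically. The Python sketch below is ours and not part of the original text; the names `pseudo_rem` and `last_remainder` are hypothetical, a polynomial is assumed to be stored as a dictionary sending the pair (power of $x$, power of $y$) to an integer coefficient, and the multiplier $P(y)$ of each division is assumed to be built up by multiplying through by the leading coefficient of the divisor at every step, which is one way, though not the only one, of obtaining an identity of the form used above. The function `last_remainder` returns the first remainder free of $x$, which by Theorem 1 vanishes identically when, and only when, there is a common factor involving $x$.

```python
def clean(p):
    return {k: v for k, v in p.items() if v != 0}

def sub(p, q):
    out = dict(p)
    for k, v in q.items():
        out[k] = out.get(k, 0) - v
    return clean(out)

def mul(p, q):
    out = {}
    for (i1, j1), a in p.items():
        for (i2, j2), b in q.items():
            key = (i1 + i2, j1 + j2)
            out[key] = out.get(key, 0) + a * b
    return clean(out)

def deg_x(p):
    """Degree in x; -1 for the polynomial which is identically zero."""
    return max((i for i, _ in p), default=-1)

def lead_x(p):
    """Coefficient, a polynomial in y, of the highest power of x in p."""
    d = deg_x(p)
    return {(0, j): c for (i, j), c in p.items() if i == d}

def pseudo_rem(f, g):
    """Remainder R in an identity P(y) f = Q g + R, R of lower degree in x than g."""
    r = dict(f)
    dg = deg_x(g)
    while r and deg_x(r) >= dg:
        shift = {(deg_x(r) - dg, 0): 1}
        # Multiply r by the leading coefficient of g, then cancel the leading term.
        r = sub(mul(lead_x(g), r), mul(mul(lead_x(r), shift), g))
    return r

def last_remainder(f, g):
    """Divide repeatedly until a remainder free of x appears, and return it."""
    while True:
        r = pseudo_rem(f, g)
        if deg_x(r) <= 0:
            return r
        f, g = g, r

# f = x^2 - y^2 and phi = x - y have the common factor x - y: the result is {}.
print(last_remainder({(2, 0): 1, (0, 2): -1}, {(1, 0): 1, (0, 1): -1}))
# f = x^2 + y^2 and phi = x have no common factor: the result is y^2, not zero.
print(last_remainder({(2, 0): 1, (0, 2): 1}, {(1, 0): 1}))
```

The second example also shows a pair of relatively prime polynomials for which the last remainder is not a constant.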
75. Factors of Polynomials in Two Variables.
Theorem 1. If $f(x, y)$ and $\phi(x, y)$ are any two polynomials in $x$ and $y$, and $\psi(x, y)$ is an irreducible polynomial which is a factor of the product $f\phi$, then $\psi$ is a factor of $f$ or of $\phi$.
If $\psi$ does not contain both $x$ and $y$, this theorem reduces to Theorem 2, § 73. It remains, then, only to consider the case that $\psi$ involves both variables. In this case, at least one of the polynomials $f$, $\phi$ must be of at least the first degree in $x$. Without loss of generality we may assume this to be $f$. If $\psi$ is a factor of $f$, our theorem is true. Suppose $\psi$ is not a factor of $f$; then, since $\psi$ is irreducible, $f$ and $\psi$ are relatively prime, and if we apply the algorithm of the greatest common divisor to $f$ and $\psi$ (as we did in the last section to $f$ and $\phi$) the first remainder $R_{\rho+1}(y)$ which does not involve $x$ is not identically zero. The identity (4) of the last section now becomes
$$(1)\qquad R_{\rho+1}(y) \equiv F(x, y)f(x, y) + \Phi(x, y)\psi(x, y).$$
If we multiply this by $\phi(x, y)$, the second member becomes a polynomial which has $\psi$ as a factor, since, by hypothesis, $f\phi$ has $\psi$ as a factor. We may therefore write
$$(2)\qquad R_{\rho+1}(y)\phi(x, y) \equiv \psi(x, y)\chi(x, y).$$
Now no factor other than a constant of $R_{\rho+1}$ can be a factor of $\psi$, since $\psi$ is irreducible. Consequently, by the Corollary of Theorem 2, § 73, $R_{\rho+1}$ is a factor of $\chi(x, y)$. Cancelling out $R_{\rho+1}$ from (2), as we have a right to do since it does not vanish identically, we get an identity of the form
$$\phi(x, y) \equiv \psi(x, y)\chi_1(x, y);$$
that is, $\psi$ is a factor of $\phi$, and our theorem is proved.
By applying this theorem a number of times, we get the
Corollary. If the product of any number of polynomials in two variables
$$f_1(x, y)f_2(x, y)\cdots f_k(x, y)$$
is divisible by an irreducible polynomial $\psi(x, y)$, then $\psi$ is a factor of at least one of the $f$'s.
We come now to the fundamental theorem of the whole subject
of divisibility of polynomials in two variables.
Theorem 2. A polynomial in two variables which is not identically
zero can be resolved into the product of irreducible factors no one of
which is a constant in one, and essentially in only one, way.
That a polynomial $f(x, y)$ can be resolved into the product of irreducible factors no one of which is a constant in at least one way may be seen as follows. If $f$ is irreducible, no factoring is possible or necessary. If $f$ is reducible, we have
$$f(x, y) \equiv f_1(x, y)f_2(x, y),$$
where neither $f_1$ nor $f_2$ is a constant. If $f_1$ and $f_2$ are both irreducible, we have a resolution of $f$ of the form demanded. If not, resolve such of these polynomials $f_1$ and $f_2$ as are reducible into the product of two factors neither of which is a constant. We thus get $f$ expressed as the product of three or four factors. This is the resolution of $f$ demanded if all the factors are irreducible. If not, resolve such as are reducible into the product of two factors, etc. This process must stop after a finite number of steps, for each time we factor a polynomial into two factors, the degrees of the factors are lower than the degree of the original polynomial. We shall thus ultimately resolve $f$ by this process into the product of irreducible factors, no one of which is a constant.
Suppose now that $f$ can be resolved in two ways into the product of irreducible factors none of which are constants,
$$f(x, y) \equiv f_1(x, y)f_2(x, y)\cdots f_k(x, y) \equiv \phi_1(x, y)\phi_2(x, y)\cdots\phi_l(x, y).$$
Since $\phi_1$ is a factor of $f$, it must, by the Corollary of Theorem 1, be a factor of one of the polynomials $f_1, f_2, \cdots f_k$. Suppose the $f$'s so arranged that it is a factor of $f_1$. Then, since $f_1$ is irreducible, $f_1$ and $\phi_1$ can differ only by a constant factor, and since $\phi_1 \not\equiv 0$, we may cancel it from the identity above, getting
$$c_1f_2f_3\cdots f_k \equiv \phi_2\phi_3\cdots\phi_l.$$
In the same way we see from this identity that $f_2$ and one of the $\phi$'s, say $\phi_2$, differ only by a constant factor. Cancelling $\phi_2$, we get
$$c_1c_2f_3\cdots f_k \equiv \phi_3\cdots\phi_l.$$
Proceeding in this way, we should use up the $\phi$'s before the $f$'s if $l < k$, the $f$'s before the $\phi$'s if $l > k$. Neither of these cases is possible, for we should then have ultimately a constant on one side of the identity, and a polynomial different from a constant on the other. Thus we must have $k = l$. Moreover we see that the $f$'s can be arranged in such an order that each $f$ is proportional to the corresponding $\phi$, and this is what we mean (cf. Definition 7, § 60) by saying that the two methods of factoring are not essentially different. Thus our theorem is proved.
Theorem 3. If two polynomials $f$ and $\phi$ in $(x, y)$ are relatively prime, there are only a finite number of pairs of values of $(x, y)$ for which $f$ and $\phi$ both vanish.*
For if $f$ and $\phi$ both vanished at the points
$$(3)\qquad (x_1, y_1),\ (x_2, y_2),\ \cdots,$$
and if these points were infinite in number, there would be among them either an infinite number of distinct $x$'s or an infinite number of distinct $y$'s. By a suitable choice of notation we may suppose that there are an infinite number of distinct $y$'s. Then it is clear that $f$ and $\phi$ must be of at least the first degree in $x$, since a polynomial in $y$ alone which does not vanish identically cannot vanish for an infinite number of values of $y$. We may then apply to $f$ and $\phi$ the algorithm of the greatest common divisor as in § 74, thus getting (cf. (4), § 74) an identity of the form
$$(4)\qquad F(x, y)f(x, y) + \Phi(x, y)\phi(x, y) \equiv R_{\rho+1}(y) \not\equiv 0.$$
Since the first member of (4) vanishes at all the points (3), $R_{\rho+1}(y)$ would vanish for an infinite number of distinct values of $y$, and this is impossible.
* Stated geometrically, this theorem tells us that two algebraic plane curves $f(x, y) = 0$, $\phi(x, y) = 0$ can intersect in an infinite number of points only when they have an entire algebraic curve in common.
An important corollary of the theorem just proved is that if $f$ and $\phi$ are two irreducible polynomials in $(x, y)$, and if the equations $f = 0$ and $\phi = 0$ have the same locus, then $f$ and $\phi$ differ merely by a constant factor. This would, however, no longer be necessarily true if $f$ and $\phi$ were not irreducible, as the example
$$f \equiv xy^2, \qquad \phi \equiv x^2y,$$
shows; for the two curves $f = 0$ and $\phi = 0$ are here identical, since the curve in each case consists of the two coordinate axes, and yet $f$ and $\phi$ are not proportional. By means of the following convention, however, the statement made above becomes true in all cases:
Let $f$ be resolved into its irreducible factors,
$$f \equiv f_1^{\alpha_1}f_2^{\alpha_2}\cdots f_k^{\alpha_k},$$
where $f_1, \cdots f_k$ are irreducible polynomials in $(x, y)$, no two of which are proportional to each other. The curve $f = 0$ then consists of the $k$ pieces
$$f_1 = 0,\ f_2 = 0,\ \cdots\ f_k = 0.$$
To each of these pieces we attach the corresponding positive integer $\alpha_i$ which we call the multiplicity of this piece; and we then regard two curves given by algebraic equations as identical only when they consist of the same irreducible pieces, and each of these pieces has the same multiplicity in both cases. With this convention we may say:
Corollary. If $f$ and $\phi$ are polynomials in $(x, y)$ neither of which is identically zero, a necessary and sufficient condition that the two curves $f = 0$, $\phi = 0$ be identical is that the polynomials $f$ and $\phi$ differ only by a constant factor.
EXERCISES
1. Let $f(x)$, $\phi(x)$, $\psi(x)$ be polynomials in $x$ whose coefficients lie in a certain domain of rationality. Then if $\psi$ is irreducible in this domain and is a factor of the product $f\phi$, prove that $\psi$ is a factor of $f$ or of $\phi$.
2. Let $f(x)$ be a polynomial in $x$, which is not identically zero, and whose coefficients lie in a certain domain of rationality. Prove that $f$ can be resolved into a product of polynomials whose coefficients lie in this domain, which are irreducible in this domain, and no one of which is a constant, in one and essentially in only one way.
3. Extend the results of this section to polynomials in two variables whose coefficients lie in a certain domain of rationality.
76. Factors of Polynomials in Three or More Variables. The results so far obtained in this chapter may be extended to polynomials in three variables without, in the main, essentially modifying the methods already used. We proceed therefore to state the theorems in the order in which they should be proved, leaving the proofs of most of them to the reader. The extension to the case of $n$ variables then presents no difficulty, and is left entirely to the reader (cf. Exercise 1).
Let $f(x, y, z)$ be any polynomial in three variables, and suppose it arranged according to powers of $x$,
$$f(x, y, z) \equiv a_0(y, z)x^n + a_1(y, z)x^{n-1} + \cdots + a_n(y, z),$$
the $a$'s being polynomials in $(y, z)$.
Corresponding to Theorems 1, 2 of § 73 we have
Theorem 1. A necessary and sufficient condition that a polynomial in $(y, z)$ be a factor of $f$ is that it be a factor of all the $a$'s.
Theorem 2. If $f(x, y, z)$ and $\phi(x, y, z)$ are any two polynomials in $(x, y, z)$ and $\psi(y, z)$ is an irreducible polynomial in $(y, z)$ only which is a factor of the product $f\phi$, then $\psi$ is a factor of $f$ or of $\phi$.
Corollary. Let $f(x, y, z)$ and $\phi(x, y, z)$ be polynomials in $(x, y, z)$, and let $\psi(y, z)$ be a polynomial in $(y, z)$ alone. If $\psi$ is a factor of the product $f\phi$, but is relatively prime to $\phi$, then it is a factor of $f$.
To find the greatest common divisor of two polynomials in three variables we proceed exactly as in the case of two variables, getting a set of identities similar to (1), § 74, the $P$'s and $R_{\rho+1}$ being now functions of $(y, z)$, while the other $R$'s and the $Q$'s are functions of $(x, y, z)$. Corresponding to Theorems 1, 2 of § 74 we now have
Theorem 3. A necessary and sufficient condition that $f(x, y, z)$ and
$\phi(x, y, z)$ have a common factor which involves $x$ is that $R_{\rho+1}(y, z) \equiv 0$.
Theorem 4. If $R_{\rho+1}(y, z) \equiv 0$, the greatest common divisor of
$f(x, y, z)$ and $\phi(x, y, z)$ which contains no factor in $(y, z)$ alone is
the polynomial $G(x, y, z)$ obtained by striking out from $R_{\rho}(x, y, z)$
all factors in $(y, z)$ alone.
From the algorithm of the greatest common divisor for the two
polynomials $f(x, y, z)$, $\phi(x, y, z)$ we also deduce the identity
(1) $R_{\rho+1}(y, z) \equiv F(x, y, z)f(x, y, z) + \Phi(x, y, z)\phi(x, y, z)$,
similar to (4), § 74.
Corresponding to Theorems 1, 2 of § 75 we have
Theorem 5. If $f(x, y, z)$ and $\phi(x, y, z)$ are any two polynomials
and $\psi(x, y, z)$ is an irreducible polynomial which is a factor of the
product $f\phi$, then $\psi$ is a factor of $f$ or of $\phi$.
Corollary. If the product of any number of polynomials
$f_1(x, y, z)\, f_2(x, y, z) \cdots f_k(x, y, z)$
is divisible by an irreducible polynomial $\psi(x, y, z)$, then $\psi$ is a factor
of at least one of the f's.
Theorem 6. A polynomial in three variables which is not identically
zero can be resolved into the product of irreducible factors no one of
which is a constant in one, and essentially in only one, way.
When we come to Theorem 3, § 75, however, we find that it
does not admit of immediate extension to the case of three variables;
for $R_{\rho+1}(y)$, which came into the proof of that theorem,
becomes now $R_{\rho+1}(y, z)$, and we can no longer say that this does not
vanish at an infinite number of points $(y, z)$. Not only is the proof
thus seen to fail, but the obvious extension of the theorem itself is
seen to be false when we recall that two surfaces intersect, in general,
in a curve.
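A minimal example may make this failure concrete. It assumes, as the discussion above indicates, that Theorem 3, § 75 concerns the finiteness of the set of common zeros of two relatively prime polynomials in two variables; the two polynomials below are chosen here purely for illustration.

```latex
f(x, y, z) \equiv x, \qquad \phi(x, y, z) \equiv y,
\qquad f(0, 0, z) = \phi(0, 0, z) = 0 \quad \text{for every value of } z .
```

These two polynomials are relatively prime, yet they vanish together at every point of the line $x = 0$, $y = 0$, that is, at an infinite number of points: the planes $f = 0$ and $\phi = 0$ intersect in a line.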
This theorem may, however, be replaced by the following one :
Theorem 7. If $f(x, y, z)$ and $\phi(x, y, z)$ are any two polynomials in
three variables of which $\phi$ is irreducible, and if $f$ vanishes at all points
$(x, y, z)$ at which $\phi$ vanishes, then $\phi$ is a factor of $f$.
In proving this theorem we may, without loss of generality,
assume that $\phi$ actually contains one of the variables, say $x$; for
if $\phi$ contains none of the variables $x, y, z$, the theorem is trivial
and obviously true.
Suppose $\phi$ were not a factor of $f$. Then, since $\phi$ is irreducible,
$f$ and $\phi$ are relatively prime. Hence, in the identity (1) above,
$R_{\rho+1}(y, z) \not\equiv 0$. Let us write
(2) $\phi(x, y, z) \equiv b_0(y, z)x^m + b_1(y, z)x^{m-1} + \cdots + b_m(y, z)$ $(m \geq 1)$,
where, without loss of generality, we may assume $b_0(y, z) \not\equiv 0$. Then
(3) $R_{\rho+1}(y, z)\, b_0(y, z) \not\equiv 0$.
Accordingly we can find a point $(y_1, z_1)$ such that
(4) $R_{\rho+1}(y_1, z_1) \neq 0$, $b_0(y_1, z_1) \neq 0$.
Consequently $\phi(x, y_1, z_1)$ is a polynomial in $x$ alone which is of at
least the first degree, and which therefore (Theorem 1, § 6) vanishes
for some value $x_1$ of $x$. That is
$\phi(x_1, y_1, z_1) = 0$.
Accordingly, by hypothesis,
$f(x_1, y_1, z_1) = 0$.
Referring now to the identity (1), we see that
$R_{\rho+1}(y_1, z_1) = 0$.
This, however, is in contradiction with (4). Thus our theorem is
proved.
If to each part of a reducible algebraic surface we attach a multiplicity
in precisely the same way as was explained in the last section
for plane curves, we infer at once the
Corollary. If $f$ and $\phi$ are polynomials in $(x, y, z)$ neither of which
is identically zero, a necessary and sufficient condition that the two
surfaces $f = 0$, $\phi = 0$
be identical is that the polynomials $f$ and $\phi$ differ only by a constant
factor.
Theorem 7 admits also the following generalization :
Theorem 8. If $f(x, y, z)$ and $\phi(x, y, z)$ are any two polynomials in
three variables which both vanish at the point $(x_0, y_0, z_0)$ and of which $\phi$
is irreducible, and if in the neighborhood $N$ of $(x_0, y_0, z_0)$ $f$ vanishes at
all points at which $\phi$ vanishes, then $\phi$ is a factor of $f$.
We assume, as before, that $\phi$ contains $x$ and can therefore be
written in the form (2). Let us first consider the case in which
$b_0(y_0, z_0) \neq 0$. Here the proof is very similar to the proof of
Theorem 7.
We obtain relation (3) precisely as above, and from it we infer
that a point $(y_1, z_1)$ in as small a neighborhood $M$ of $(y_0, z_0)$ as
we please can be found at which the relations (4) are true.
Now consider the equation
(5) $\phi(x, y_1, z_1) = 0$.
By writing $\phi$ in the form (2), we see that by taking the neighborhood
$M$ of $(y_0, z_0)$ sufficiently small, we can make the coefficients
of (5) differ from the coefficients of
(6) $\phi(x, y_0, z_0) = 0$
by as little as we please (cf. Theorem 3, § 5). Now $x_0$ is by
hypothesis a root of equation (6). Consequently by taking $M$
sufficiently small, we can cause (5) to have at least one root $x_1$ which
differs from $x_0$ by as little as we please (cf. Theorem 4, § 6). Thus
we see that a point $(x_1, y_1, z_1)$ in the given neighborhood $N$ of
$(x_0, y_0, z_0)$ can be found at which
$\phi(x_1, y_1, z_1) = 0$.
Accordingly, by hypothesis,
$f(x_1, y_1, z_1) = 0$.
From the identity (1) we have then
$R_{\rho+1}(y_1, z_1) = 0$,
which is in contradiction with (4). Thus our theorem is proved on
the supposition that $b_0(y_0, z_0) \neq 0$.*
* The proof just given will, in fact, apply to the case in which not all the b's in (2)
vanish at the point $(y_0, z_0)$, if we use the extension of Theorem 4, § 6, which is there
mentioned in a footnote. It is only the extreme case in which all the b's vanish at this
point which requires the special treatment which we now proceed to give. The reader
is advised to consider the geometrical meaning of this extreme case.
In order to treat the case in which $b_0(y_0, z_0) = 0$, let us denote by
$k$ the degree of $\phi(x, y, z)$, and let us subject this polynomial to a
nonsingular linear transformation
(7) $x = \alpha_1 x' + \beta_1 y' + \gamma_1 z'$,
    $y = \alpha_2 x' + \beta_2 y' + \gamma_2 z'$,
    $z = \alpha_3 x' + \beta_3 y' + \gamma_3 z'$,
which makes the degree of $\phi$ in $x'$ equal to the total degree $k$ of $\phi$
(cf. Theorem 2, § 64).
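The following worked instance is only a sketch of what such a transformation accomplishes; the polynomial and the particular substitution are assumptions chosen here for illustration, not taken from the text. Both coefficients $b_0$, $b_1$ vanish at $(y_0, z_0) = (0, 0)$, which is the extreme case under discussion.

```latex
\begin{aligned}
\phi(x, y, z) &\equiv y\,x + z, \qquad b_0 \equiv y, \quad b_1 \equiv z, \qquad k = 2,\\
x &= x', \qquad y = x' + y', \qquad z = z'
  \qquad (\text{determinant } 1),\\
\phi'(x', y', z') &\equiv x'^{2} + y'\,x' + z'.
\end{aligned}
```

After the substitution the coefficient of the highest power of $x'$ is the constant 1, and the degree in $x'$ equals the total degree 2. Geometrically, the surface $\phi = 0$ contains the whole line $y = 0$, $z = 0$ parallel to the axis of $x$, and the change of variables turns the axes so that this is no longer the case.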
Suppose that this transformation carries over the point $(x_0, y_0, z_0)$
into the point $(x_0', y_0', z_0')$. Then it is possible, since (7) is
nonsingular, to take such a small neighborhood $N'$ of $(x_0', y_0', z_0')$ that
all points in this neighborhood correspond to points in the given
neighborhood $N$ of $(x_0, y_0, z_0)$.
Moreover, by means of (7), $\phi$ has gone over into
(8) $\phi'(x', y', z') \equiv b_0' x'^{k} + b_1'(y', z')x'^{k-1} + \cdots + b_k'(y', z')$,
where $b_0'$ is a constant different from zero. Let us denote by
$f'(x', y', z')$ the polynomial into which $f$ is transformed. Then it
is clear that, since, in the neighborhood $N$, $f$ vanishes whenever $\phi$
does, in the neighborhood $N'$ (which corresponds to a part of $N$), $f'$
vanishes whenever $\phi'$ does. Accordingly we can apply the part of
the theorem already proved to the two polynomials $f'$ and $\phi'$, since
the first coefficient of $\phi'$ in the form (8), being a constant different
from zero, does not vanish at $(y_0', z_0')$. We infer that $\phi'$ is a factor
of $f'$,
$f'(x', y', z') \equiv \phi'(x', y', z')\,\psi'(x', y', z')$.
If here we replace $x', y', z'$ by their values in terms of $x, y, z$ from
(7), we see that $\phi$ is a factor of $f$; and our theorem is proved.
EXERCISES
1. State and prove the eight theorems of this section for the case of
polynomials in n variables.
2. Extend the result of the exercise at the end of § 73 to the case of
polynomials in n variables.
3. Extend the results of the two preceding exercises to the case in which we
consider only polynomials whose coefficients lie in a certain domain of
rationality.
4. The resultant of two polynomials in one variable
$f(x) \equiv a_0x^n + a_1x^{n-1} + \cdots + a_n$,
$\phi(x) \equiv b_0x^m + b_1x^{m-1} + \cdots + b_m$,
is sometimes defined as the polynomial $R$ in the a's and b's of lowest degree which
satisfies an identity of the form
$Ff + \Phi\phi \equiv R$,
where $F$ and $\Phi$ are polynomials in $(a_0, \cdots a_n; b_0, \cdots b_m; x)$, and the identity is an
identity in all these arguments. Prove that the resultant as thus defined differs
only by a constant factor different from zero from the resultant as we defined it
in § 68.
CHAPTER XVII
GENERAL THEOREMS ON INTEGRAL RATIONAL
INVARIANTS
77. The Invariance of the Factors of Invariants. Let us consider
the general n-ary form of the kth degree which we will represent
by $f(x_1, \cdots x_n; a_1, a_2, \cdots)$, the x's being the variables and the
a's the coefficients. By suitably changing the a's, this symbol may
be used to represent any such form. Hence, if we subject such a
form to a linear transformation, the new form, being n-ary and of
the same degree as the old, may be represented by the same functional
letter: $f(x_1', \cdots x_n'; a_1', a_2', \cdots)$. This new form will evidently
be homogeneous and linear in the a's; that is, each of the $a'$'s is a
homogeneous linear polynomial in the a's. It is also clear that each
of the $a'$'s is a homogeneous polynomial of the kth degree in the
coefficients of the transformation.
It follows from the very definition of invariants that if we have
a number of integral rational relative invariants of a form or system
of forms, their product will also be an integral rational relative
invariant. It is the converse of this that we wish to prove in this
section. We begin by stating this converse in the simple case of a
single form.
Theorem 1. If $I(a_1, a_2, \cdots)$ is an integral rational invariant of
the n-ary form
$f(x_1, \cdots x_n; a_1, a_2, \cdots)$
and is reducible, then all its factors are invariants.
It will evidently be sufficient to prove that the irreducible factors
of $I$ are invariants. Let $f_1, f_2, \cdots f_l$ be the irreducible factors of $I$.
Subjecting $f$ to the linear transformation
$x_1 = c_{11}x_1' + \cdots + c_{1n}x_n'$,
$\cdots$
$x_n = c_{n1}x_1' + \cdots + c_{nn}x_n'$,
whose determinant we call $c$, and denoting the coefficients of the
transformed form by $a_1', a_2', \cdots$, we have
(1) $I(a_1', a_2', \cdots) \equiv c^{\lambda} I(a_1, a_2, \cdots)$,
an identity which may also be written
$f_1(a_1', a_2', \cdots)\, f_2(a_1', a_2', \cdots) \cdots f_l(a_1', a_2', \cdots) \equiv c^{\lambda} f_1(a_1, a_2, \cdots)\, f_2(a_1, a_2, \cdots) \cdots f_l(a_1, a_2, \cdots)$.
We have here a polynomial in the c's and a's which, on the
second side of the identity, is resolved into its irreducible factors,
since by Theorem 1, § 61, the determinant $c$ is irreducible. Hence
each factor on the first side is equal to the product of some of the
factors on the second. That is
(2) $f_i(a_1', a_2', \cdots) \equiv c^{\lambda_i}\phi_i(a_1, a_2, \cdots)$ $(i = 1, 2, \cdots l)$
where the $\phi$'s are polynomials.
Now let
$c_{11} = c_{22} = \cdots = c_{nn} = 1$,
and let all the other c's be zero. Our transformation becomes the
identical transformation, the determinant $c = 1$, and each $a'$ is equal
to the corresponding $a$. The identities (2) therefore reduce to
$f_i(a_1, a_2, \cdots) \equiv \phi_i(a_1, a_2, \cdots)$ $(i = 1, 2, \cdots l)$.
Substituting this value of $\phi_i$ in (2), we see that $f_i$ is an invariant,
and our theorem is proved.
The general theorem, now, is the following :
Theorem 2. If $I(a_1, a_2, \cdots; b_1, b_2, \cdots; \cdots)$ is an integral rational
invariant of the system of forms
$f_1(x_1, \cdots x_n; a_1, a_2, \cdots)$,
$f_2(x_1, \cdots x_n; b_1, b_2, \cdots)$,
$\cdots$
and is reducible, then all its factors are invariants.
The proof of this theorem is practically identical with that of
Theorem 1.
EXERCISE
If $I(a_1, a_2, \cdots; b_1, b_2, \cdots; \cdots; y_1, \cdots y_n; z_1, \cdots z_n; \cdots)$ is an integral rational
covariant of the system of forms
$f_1(x_1, \cdots x_n; a_1, a_2, \cdots)$,
$f_2(x_1, \cdots x_n; b_1, b_2, \cdots)$,
$\cdots$
and the system of points $(y_1, \cdots y_n)$, $(z_1, \cdots z_n)$, $\cdots$, and is reducible, then all its
factors are covariants (or invariants).
78. A More General Method of Approach to the Subject of Relative
Invariants. We have called a polynomial $I$ in the coefficients
of an n-ary form $f$ a relative invariant of this form if it has the
property of being merely multiplied by a power of the determinant
of the transformation when $f$ is subjected to a linear transformation.
It is natural to inquire what class of functions $I$ we should obtain if
we make the less specific demand that $I$ be multiplied by a polynomial
in the coefficients of the transformation. We should expect
to get in this way a class of functions more general than the invariants
we have so far considered. As a matter of fact, we get precisely
the same class of functions, as is shown by the following theorem:
Theorem. Let $I$ be a polynomial not identically zero in the
coefficients $(a_1, a_2, \cdots)$ of an n-ary form $f$ and let $(a_1', a_2', \cdots)$ be the
coefficients of the form obtained by subjecting $f$ to the linear transformation
$x_1 = c_{11}x_1' + \cdots + c_{1n}x_n'$,
$\cdots$
$x_n = c_{n1}x_1' + \cdots + c_{nn}x_n'$.
If $I(a_1', a_2', \cdots) \equiv \psi(c_{11}, \cdots c_{nn})\, I(a_1, a_2, \cdots)$,
where $\psi$ is a polynomial in the c's, and this is an identity in the a's and
c's, then $\psi$ is a power of the determinant of the transformation.
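We will first show that $\psi \neq 0$ when $c \neq 0$. If possible let $d_{11}, \cdots d_{nn}$
be a particular set of values of the $c_{ij}$'s such that
$\psi(d_{11}, \cdots d_{nn}) = 0$
while
$\begin{vmatrix} d_{11} & \cdots & d_{1n} \\ \cdot & & \cdot \\ d_{n1} & \cdots & d_{nn} \end{vmatrix} \neq 0$.
Then the transformation
$x_i = d_{i1}x_1' + \cdots + d_{in}x_n'$ $(i = 1, 2, \cdots n)$
has an inverse
$x_i' = \delta_{i1}x_1 + \cdots + \delta_{in}x_n$ $(i = 1, 2, \cdots n)$.
Let us consider a special set of a's such that $I(a_1, a_2, \cdots) \neq 0$. Then
$I(a_1', a_2', \cdots) = \psi(d_{11}, \cdots d_{nn})\, I(a_1, a_2, \cdots) = 0$.
Now apply the inverse transformation, and we have
$I(a_1, a_2, \cdots) = \psi(\delta_{11}, \cdots \delta_{nn})\, I(a_1', a_2', \cdots) = 0$,
which is contrary to our hypothesis.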
Having thus proved that $\psi$ can vanish only when $c = 0$, let us
break up $\psi$ into its irreducible factors,
$\psi \equiv \psi_1\psi_2 \cdots \psi_l$.
Since $\psi$ vanishes whenever $\psi_i = 0$, $\psi_i$ can vanish only when $c = 0$.
Hence by the theorem for n variables which corresponds to Theorem 7,
§ 76, $\psi_i$ must be a factor of $c$. But $c$ is irreducible. Hence $\psi_i$ can
differ from $c$ only by a constant factor, and we may write
$\psi \equiv Kc^{\lambda}$.
It remains then merely to prove that the constant $K$ has the value 1.
For this purpose consider the identity
$I(a_1', a_2', \cdots) \equiv Kc^{\lambda} I(a_1, a_2, \cdots)$,
and give to the $c_{ij}$'s the values which they have in the identical
transformation. Then $c = 1$, and the $a'$'s are equal to the corresponding
a's. The last written identity therefore becomes
$I(a_1, a_2, \cdots) = K\, I(a_1, a_2, \cdots)$;
from which we infer that $K = 1$.
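The conclusion of this theorem can be checked numerically on a familiar invariant. The sketch below is not part of the text: it assumes the binary quadratic form $a_1x_1^2 + 2a_2x_1x_2 + a_3x_2^2$ with discriminant $a_1a_3 - a_2^2$, and the function names are hypothetical. It computes the coefficients of the transformed form directly and confirms that the multiplier is the square of the determinant of the transformation.

```python
def transform_binary_quadratic(a, c):
    """Coefficients (a1', a2', a3') of a1*x1^2 + 2*a2*x1*x2 + a3*x2^2 after
    the substitution x1 = c11*x1' + c12*x2', x2 = c21*x1' + c22*x2'."""
    a1, a2, a3 = a
    (c11, c12), (c21, c22) = c
    b1 = a1 * c11**2 + 2 * a2 * c11 * c21 + a3 * c21**2
    b2 = a1 * c11 * c12 + a2 * (c11 * c22 + c12 * c21) + a3 * c21 * c22
    b3 = a1 * c12**2 + 2 * a2 * c12 * c22 + a3 * c22**2
    return (b1, b2, b3)


def discriminant(a):
    """The invariant a1*a3 - a2^2 of the binary quadratic form."""
    a1, a2, a3 = a
    return a1 * a3 - a2**2


def det2(c):
    """Determinant of the 2x2 transformation matrix c."""
    (c11, c12), (c21, c22) = c
    return c11 * c22 - c12 * c21


if __name__ == "__main__":
    a = (3, -1, 4)
    c = ((2, -1), (3, 4))
    lhs = discriminant(transform_binary_quadratic(a, c))
    rhs = det2(c) ** 2 * discriminant(a)
    print(lhs == rhs)
```

Here the polynomial $\psi$ of the theorem is $c^2$, a power of the determinant, in agreement with the statement just proved; a numerical check of this kind is of course no substitute for the proof.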
EXERCISES
1. Prove that if a polynomial $I$ in the coefficients $a_1, a_2, \cdots$ of an n-ary form and
the coordinates $(y_1, \cdots y_n)$ of a point has the property of being merely multiplied
by a certain rational function $\psi$ of the coefficients of the transformation when the
form and the point are subjected to a linear transformation, then $\psi$ is a positive or
negative power of the determinant of the transformation, and $I$ is a covariant.
2. Generalize the theorem of this section to the case of invariants of a system
of forms.
3. Generalize the theorem of Exercise 1 to the case of a system of forms and
a system of points.
4. Prove that every rational invariant of a form or system of forms is the
ratio of two integral rational invariants.
5. Generalize the theorem of Exercise 4 to the case of covariants.
79. The Isobaric Character of Invariants and Covariants. In
many investigations, and in particular in the study of invariants and
covariants, it is desirable to attach a definite weight to each of the
variables with which we have to deal. To a product of two or more
such variables we then attach a weight equal to the sum of the
weights of the factors, and this weight is supposed to remain
unchanged if the product is multiplied by a constant coefficient.
Thus if $z_1, z_2, z_3$ are regarded as having weights $w_1, w_2, w_3$ respectively,
the term
$5z_1z_2z_3^2$
would have the weight $w_1 + w_2 + 2w_3$.
If, then, having thus attached a definite weight to each of the
variables, we consider a polynomial, each term of this polynomial will be
of a definite weight, and by the weight of a polynomial we understand
the greatest weight of any of its terms whose coefficient is not zero. If
moreover all the terms of a polynomial are of the same weight, the
polynomial is said to be isobaric.
It may be noticed that, according to this definition, a polynomial
which vanishes identically is the only one which has no weight, while
a polynomial which reduces to a constant different from zero is of
weight zero. Moreover if two polynomials are of weights $w_1$ and $w_2$,
their product is of weight $w_1 + w_2$.*
* The conception of degree of a polynomial is merely the special case of the
conception of weight in which all the variables are supposed to have weight 1. The
conception of being isobaric then reduces to the conception of homogeneity.
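The definitions of weight and of isobaric polynomials translate directly into a short computation. The following is a minimal sketch, assuming a polynomial is stored as a dictionary from exponent tuples to coefficients; this representation and the function names are choices made here for illustration only.

```python
from typing import Dict, Optional, Sequence, Tuple

# A polynomial maps exponent tuples to coefficients,
# e.g. {(1, 1, 2): 5} stands for 5*z1*z2*z3**2.
Poly = Dict[Tuple[int, ...], int]


def term_weight(exponents: Sequence[int], weights: Sequence[int]) -> int:
    """Weight of a term: each variable's weight counted once per occurrence."""
    return sum(e * w for e, w in zip(exponents, weights))


def poly_weight(poly: Poly, weights: Sequence[int]) -> Optional[int]:
    """Greatest weight of a term with nonzero coefficient.
    Returns None for the identically vanishing polynomial, which has no weight."""
    ws = [term_weight(e, weights) for e, coeff in poly.items() if coeff != 0]
    return max(ws) if ws else None


def is_isobaric(poly: Poly, weights: Sequence[int]) -> bool:
    """True if all terms with nonzero coefficient have the same weight
    (vacuously true for the zero polynomial)."""
    ws = {term_weight(e, weights) for e, coeff in poly.items() if coeff != 0}
    return len(ws) <= 1


def poly_mul(p: Poly, q: Poly) -> Poly:
    """Product of two polynomials in the same variables."""
    out: Poly = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            e = tuple(x + y for x, y in zip(e1, e2))
            out[e] = out.get(e, 0) + c1 * c2
    return out


if __name__ == "__main__":
    w = (1, 2, 3)                          # assumed weights of z1, z2, z3
    p = {(1, 1, 2): 5, (0, 0, 3): 1}       # 5*z1*z2*z3^2 + z3^3
    q = {(1, 0, 0): 1, (0, 1, 0): 1}       # z1 + z2
    print(poly_weight(p, w), is_isobaric(p, w))
    print(poly_weight(q, w), is_isobaric(q, w))
    print(poly_weight(poly_mul(p, q), w))
```

With all weights set equal to 1, `poly_weight` returns the degree and `is_isobaric` tests homogeneity, in line with the remark of the footnote.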
We will apply this conception of weight first to the case in which
the variables of which we have been speaking are the coefficients
$a_1, a_2, \cdots$ of the n-ary form
$f(x_1, \cdots x_n; a_1, a_2, \cdots)$.
We shall find it desirable to admit n different determinations of the
weights of these a's; one determination corresponding to each of the
variables $x_1, \cdots x_n$.
Definition 1. If $a_i$ is the coefficient of the term
$x_1^{p_1}x_2^{p_2} \cdots x_n^{p_n}$
in an n-ary form, we assign to $a_i$ the weights $p_1, p_2, \cdots p_n$ respectively
with regard to the variables $x_1, x_2, \cdots x_n$.
In the case of a binary form,
$a_0x_1^k + a_1x_1^{k-1}x_2 + \cdots + a_kx_2^k$,
the subscripts of the coefficients indicate their weights with regard
to $x_2$, while their weights with regard to $x_1$ are equal to the differences
between the degree of the form and these subscripts.
As a second example, we mention the quadratic form
$\sum_{i, j = 1}^{n} a_{ij}x_ix_j$.
Here the weight of any coefficient with regard to one of the variables,
say $x_j$, is equal to the number of times $j$ occurs as a subscript
to this coefficient.*
In connection with this subject of weight, the special linear
transformation
(1) $x_i = x_i'$ $(i \neq j)$,
    $x_j = kx_j'$
is useful. If $a_i$ is a coefficient which is of weight $\lambda$ with regard to
$x_j$, the term in which this coefficient occurs contains $x_j^{\lambda}$, and therefore
$a_i' = k^{\lambda}a_i$.
* For forms of higher degree, a similar notation for the coefficients by means of
multiple subscripts might be used. The weight of each coefficient could then be at
once read off from the subscripts.
That is
Theorem 1. The weight with regard to $x_j$ of a coefficient of an
n-ary form is the exponent of the power of $k$ by which this coefficient is
multiplied after the special transformation (1).
From this it follows at once that an isobaric polynomial of
weight $\lambda$ with regard to $x_j$ in the coefficients $(a_1, a_2, \cdots)$ of an n-ary
form is simply multiplied by $k^{\lambda}$ if the form is subjected to the linear
transformation (1).
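Moreover, the converse of this is also true. For if $a_1', a_2', \cdots$ are
the coefficients of the n-ary form after the transformation (1), and if
$\phi(a_1, a_2, \cdots)$ is a polynomial which has the property that
$\phi(a_1', a_2', \cdots) \equiv k^{\lambda}\phi(a_1, a_2, \cdots)$,
this being an identity in the a's and also in $k$, we can infer, as
follows, that $\phi$ is isobaric of weight $\lambda$. Let us group the terms of $\phi$
together according to their weights, thus writing $\phi$ in the form
$\phi(a_1, a_2, \cdots) \equiv \phi_1(a_1, a_2, \cdots) + \phi_2(a_1, a_2, \cdots) + \cdots$
where $\phi_1, \phi_2, \cdots$ are isobaric of weights $\lambda_1, \lambda_2, \cdots$. We have then
$\phi(a_1', a_2', \cdots) \equiv k^{\lambda_1}\phi_1(a_1, a_2, \cdots) + k^{\lambda_2}\phi_2(a_1, a_2, \cdots) + \cdots$.
But on the other hand
$\phi(a_1', a_2', \cdots) \equiv k^{\lambda}\phi(a_1, a_2, \cdots) \equiv k^{\lambda}\phi_1(a_1, a_2, \cdots) + k^{\lambda}\phi_2(a_1, a_2, \cdots) + \cdots$.
Comparing the last members of these two identities, we see that
$\lambda = \lambda_1 = \lambda_2 = \cdots$
as was to be proved. We have thus established the theorem: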
Theorem 2. A necessary and sufficient condition that a polynomial
$\phi$ in the coefficients of an n-ary form be simply multiplied by $k^{\lambda}$
when the form is subjected to transformations of the form (1) is that $\phi$
be isobaric of weight $\lambda$ with regard to $x_j$.
By means of this theorem we can show that the use of the word
weight introduced in § 31 is in accord with the definition given in
the present section. For an integral rational invariant of an n-ary
form which, according to the definition of § 31, is of weight $\lambda$ will, if
the form is subjected to the transformation (1), be merely multiplied
by $k^{\lambda}$ and must therefore, according to Theorem 2, be isobaric of
weight $\lambda$ with regard to $x_j$. That is:
Theorem 3. If $I$ is an integral rational invariant of a form $f$
which according to the definition of § 31 is of weight $\lambda$, it will also be of
weight $\lambda$ with regard to each of the variables $x_j$ of $f$ according to the
definitions of this section, and it will be isobaric with regard to each of
these variables.
As an illustration of this theorem we may mention the discriminant
$a_1a_3 - a_2^2$
of the binary quadratic form
$a_1x_1^2 + 2a_2x_1x_2 + a_3x_2^2$
which is isobaric of weight 2 both with regard to $x_1$ and with regard
to $x_2$.
The reader should consider in the same way the discriminant of
the general quadratic form.
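The isobaric character of this discriminant can be verified mechanically. The sketch below assumes the binary quadratic form $a_1x_1^2 + 2a_2x_1x_2 + a_3x_2^2$ and its discriminant $a_1a_3 - a_2^2$; the data layout and function names are hypothetical and serve only this illustration. It reads off the weights of the coefficients from the exponents of their terms, as in Definition 1, and also checks the effect of the special transformation $x_j = kx_j'$.

```python
# Exponents (p1, p2) of the term x1^p1 * x2^p2 carrying each coefficient of
# a1*x1^2 + 2*a2*x1*x2 + a3*x2^2; the weights are read off from these.
TERM_EXPONENTS = {"a1": (2, 0), "a2": (1, 1), "a3": (0, 2)}

# The discriminant a1*a3 - a2^2 as a list of (numeric factor, coefficient names).
DISCRIMINANT = [(1, ["a1", "a3"]), (-1, ["a2", "a2"])]


def term_weights(poly, j):
    """Weight with regard to the variable of index j (0-based) of each term."""
    return [sum(TERM_EXPONENTS[name][j] for name in names) for _, names in poly]


def evaluate(poly, values):
    """Numerical value of the polynomial for given values of the coefficients."""
    total = 0
    for factor, names in poly:
        product = factor
        for name in names:
            product *= values[name]
        total += product
    return total


def scale_variable(values, j, k):
    """Coefficients of the form after the special transformation x_j = k*x_j'."""
    return {name: v * k ** TERM_EXPONENTS[name][j] for name, v in values.items()}


if __name__ == "__main__":
    values = {"a1": 3, "a2": -1, "a3": 4}
    for j in (0, 1):
        print(term_weights(DISCRIMINANT, j))
        scaled = scale_variable(values, j, 5)
        print(evaluate(DISCRIMINANT, scaled) == 5**2 * evaluate(DISCRIMINANT, values))
```

Every term has weight 2 with regard to either variable, and the special transformation multiplies the discriminant by $k^2$, as Theorems 2 and 3 require.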
All of the considerations of the present section may be extended
immediately to the case in which we have to deal, not with a single
form, but with a system of forms. We state here merely the
theorem which corresponds to Theorem 3.
Theorem 4. If $I$ is an integral rational invariant of a system of
forms which according to the definition of § 31 is of weight $\lambda$, it will also
be of weight $\lambda$ with regard to each of the variables $x_j$ of the system, and
it will be isobaric with regard to each of these variables.
The reader may consider as an illustration of this theorem the
resultant of a system of linear forms, and also the invariants obtained
in Chapters XII and XIII.
We saw in Theorem 5, § 31, that the weight of an integral rational
invariant cannot be negative. This fact now becomes still more
evident, since the weight of no coefficient is negative. Moreover,
we can now add the following further fact:
Theorem 5. An integral rational invariant of a form or system
of forms cannot be of weight zero.
For consider any term of the invariant whose coefficient is not
zero. This term involves the product of a number of coefficients of
the system of forms. Since none of these coefficients can be of negative
weight, the weight of the term will be at least as great as the
weight of any one of them. But any one of them is at least of
weight 1 with regard to some one of the variables. Hence the
invariant is at least of weight 1 with regard to some one of the
variables, and hence with regard to any of the variables.
In order, finally, to be able to extend the considerations of this
section to the case of covariants, we must lay down the following
additional definition:
Definition 2. If the sets of variables $(y_1, \cdots y_n)$, $(z_1, \cdots z_n)$, $\cdots$
are cogredient with the variables $(x_1, \cdots x_n)$ of a system of n-ary forms,
we will assign to $y_j, z_j, \cdots$ the weight $-1$ with regard to $x_j$, to all the
other y's, z's, etc. the weight 0.
It will be noticed that here too, when we perform the transformation
(1), each of the variables is multiplied by a power of $k$ whose
exponent is the weight of the variable. It is therefore easy* to
extend the considerations of this section to this case, and we thus
get the theorem:
Theorem 6. If $I$ is an integral rational covariant of a system
of forms and a system of points which is of weight $\lambda$ according to the
definition of § 31, it will also be of weight $\lambda$ with regard to each of the
variables of the system, and it will be isobaric with regard to each of
these variables.
As an example of this theorem we note that the polar
of a binary quadratic form is isobaric of weight zero. The reader
may satisfy himself that the same is true of the polar of the general
quadratic form.
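The weights in this example may be tallied term by term. The computation below is a sketch under stated assumptions: the binary quadratic form is taken as $a_1x_1^2 + 2a_2x_1x_2 + a_3x_2^2$, its polar is written for two points $(y_1, y_2)$, $(z_1, z_2)$ in a notation chosen here, and $y_j$, $z_j$ are given the weight $-1$ with regard to $x_j$ as in Definition 2.

```latex
\begin{aligned}
P &\equiv a_1 y_1 z_1 + a_2\,(y_1 z_2 + y_2 z_1) + a_3 y_2 z_2,\\[2pt]
\text{with regard to } x_1:\quad
  & a_1 y_1 z_1:\ 2 - 1 - 1 = 0, \qquad
    a_2 y_1 z_2:\ 1 - 1 + 0 = 0,\\
  & a_2 y_2 z_1:\ 1 + 0 - 1 = 0, \qquad
    a_3 y_2 z_2:\ 0 + 0 + 0 = 0,\\[2pt]
\text{with regard to } x_2:\quad
  & a_1 y_1 z_1:\ 0 + 0 + 0 = 0, \qquad
    a_2 y_1 z_2:\ 1 + 0 - 1 = 0,\\
  & a_2 y_2 z_1:\ 1 - 1 + 0 = 0, \qquad
    a_3 y_2 z_2:\ 2 - 1 - 1 = 0.
\end{aligned}
```

Every term is thus of weight zero with regard to each variable, so that the polar is isobaric of weight zero.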
80. Geometric Properties and the Principle of Homogeneity. It
is a familiar fact that many geometric properties of plane curves or
surfaces are expressed by the vanishing of an integral rational function
of the coefficients of their equations. Take, for instance, the
surface
(1) $f(x, y, z; a_1, a_2, \cdots) = 0$,
where $f$ is a polynomial of the kth degree in the nonhomogeneous
coordinates $x, y, z$, and $a_1, a_2, \cdots$ are the coefficients of this
polynomial; and consider the relation
(2) $\phi(a_1, a_2, \cdots) = 0$,
where $\phi$ is a polynomial, which we will assume to be of at least
the first degree, in the coefficients $a_1, a_2, \cdots$. By Theorem 3, § 6,
there are an infinite number of polynomials of the kth degree in
$(x, y, z)$ whose coefficients satisfy the relation (2) and also an infinite
number whose coefficients do not satisfy this relation. In other
words, all polynomials of the kth degree in $(x, y, z)$ may be divided
into two classes, A and B, of which the first is characterized by
condition (2) being fulfilled, while the second is characterized by this
condition not being fulfilled. We may therefore say that (2) is a
necessary and sufficient condition that $f$ have a certain property,
namely, the property of belonging in class A.
The simplest examples, however, show that this property of $f$
need not correspond to a geometric property of the surface (1).
To illustrate this, let $k = 1$, so that we have
$f \equiv a_1x + a_2y + a_3z + a_4$.
And consider first the polynomial in the a's:
$\phi \equiv a_4$.
The vanishing of $\phi$ gives a necessary and sufficient condition that $f$
belong to the class of homogeneous polynomials of the first degree
in $(x, y, z)$, and thus expresses a property of the polynomial. This
same condition, $a_4 = 0$, also expresses a property of the plane $f = 0$,
namely, the property that it pass through the origin.
Suppose, however, that instead of the function $\phi$ we take the
polynomial
$\phi_1 \equiv a_4 - 1$.
The vanishing of this polynomial also gives a necessary and sufficient
condition that the polynomial $f$ have a certain property, namely,
that its constant term have the value 1. It does not serve to
distinguish planes into two classes, since we may write the equation of
any plane (except those through the origin) either with the constant
term 1 or with the constant term different from 1 by merely
multiplying the equation through by a constant.
From the foregoing it will be seen that saying that a surface has
a certain property amounts to the same thing as saying that it
belongs to a certain class of surfaces.*
Theorem 1. The equation (2) expresses a necessary and sufficient
condition for a geometric property of the surface (1) when, and only
when, the polynomial $\phi$ is homogeneous.
For if $\phi$ is nonhomogeneous, let us write it in the form
$\phi \equiv \phi_n + \phi_{n-1} + \cdots + \phi_1 + \phi_0$,
where $\phi_n$ is a homogeneous polynomial of the nth degree and each
of the other $\phi$'s which is not identically zero is a homogeneous
polynomial of the degree indicated by its subscript. Let $a_1', a_2', \cdots$ be a
set of values of the a's for which $\phi_n$ and at least one of the other $\phi_i$'s
is not zero, and consider the surface
(3) $f(x, y, z; ca_1', ca_2', \cdots) = 0$.
The condition (2) for this surface is
$c^n\phi_n(a_1', a_2', \cdots) + c^{n-1}\phi_{n-1}(a_1', a_2', \cdots) + \cdots + c\,\phi_1(a_1', a_2', \cdots) + \phi_0 = 0$.
This is an equation of the nth degree in $c$, and since at least one
of the coefficients after the first is different from zero, it will have
at least one root $c_1 \neq 0$. On the other hand, we can find a value
$c_2 \neq 0$ which is not a root of this equation. Hence the surface (3)
satisfies condition (2) if we let $c = c_1$ and does not satisfy it if $c = c_2$.
But a change in the value of $c$ merely multiplies the equation (3)
by a constant and does not change the surface represented by it.
Thus we see that one and the same surface can be regarded both as
satisfying and as not satisfying condition (2). In other words, if $\phi$
is nonhomogeneous, (2) does not express a property of the surface (1).
Assume now that $\phi$ is homogeneous of the nth degree, and
consider the class A of polynomials $f$ whose coefficients satisfy
equation (2) and the class B whose coefficients do not satisfy this
equation. Our theorem will be proved if we can show that we have
hereby divided the surfaces (1) into two classes, that is, that if
* This brief explanation must not be regarded as an attempt to define the
conception property, for no specific class can be defined without the use of some property.
$a_1', a_2', \cdots$ are the coefficients of a polynomial of class A and $a_1'', a_2'', \cdots$
the coefficients of a polynomial of class B, then the two surfaces
$f(x, y, z; a_1', a_2', \cdots) = 0$,
$f(x, y, z; a_1'', a_2'', \cdots) = 0$,
cannot be the same. If they were the same, the coefficients $a_1', a_2', \cdots$
would be proportional to $a_1'', a_2'', \cdots$ (cf. Theorem 7, Corollary, § 76),
and therefore
$\phi(a_1', a_2', \cdots) = c^n\phi(a_1'', a_2'', \cdots)$.
But this is impossible since by hypothesis
$\phi(a_1', a_2', \cdots) = 0$, $\phi(a_1'', a_2'', \cdots) \neq 0$.
Thus our theorem is proved.
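The contrast between a homogeneous and a nonhomogeneous condition can be seen in a few lines of computation. The sketch below takes the plane $a_1x + a_2y + a_3z + a_4 = 0$ with the two conditions $a_4 = 0$ and $a_4 - 1 = 0$ discussed for the case $k = 1$; the function names and the particular scale factors are assumptions made for this illustration.

```python
from fractions import Fraction


def phi(a):
    """Homogeneous condition polynomial: the constant term a4."""
    return a[3]


def phi1(a):
    """Nonhomogeneous condition polynomial: a4 - 1."""
    return a[3] - 1


def scale(a, c):
    """Coefficients of the same plane after multiplying its equation by c != 0."""
    return tuple(c * x for x in a)


def is_scale_stable(cond, a, factors):
    """True if 'cond vanishes' has the same truth value for every rescaling of a,
    i.e. if the condition cannot tell two equations of one plane apart."""
    outcomes = {cond(scale(a, c)) == 0 for c in factors}
    return len(outcomes) == 1


if __name__ == "__main__":
    factors = [1, 2, Fraction(1, 2), -3]   # sample nonzero multipliers
    plane = (2, -1, 3, 1)                  # 2x - y + 3z + 1 = 0
    print(is_scale_stable(phi, plane, factors))
    print(is_scale_stable(phi1, plane, factors))
```

For the homogeneous polynomial the answer does not depend on which of the proportional sets of coefficients is used, while for $a_4 - 1$ one and the same plane both satisfies and fails the condition, exactly as in the proof above.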
This theorem admits of generalization in various directions.
Suppose first that instead of a single surface (1) we have a system
of algebraic surfaces, and that $\phi$ is a polynomial in the coefficients of
all these surfaces. Then precisely the reasoning just used shows that
the equation $\phi = 0$ gives a necessary and sufficient condition for a
geometric property of this system of surfaces when and only when $\phi$
is homogeneous in the coefficients of each surface taken separately.
On the other hand, we may use homogeneous coordinates in
writing the equations of the surfaces, and the results so far stated
will obviously hold without change :
Theorem 2. Let
$f_1(x, y, z, t; a_1, a_2, \cdots)$, $f_2(x, y, z, t; b_1, b_2, \cdots)$, $\cdots$
be a system of homogeneous polynomials in $(x, y, z, t)$ whose coefficients
are $a_1, a_2, \cdots$; $b_1, b_2, \cdots$; etc.; and let
$\phi(a_1, a_2, \cdots; b_1, b_2, \cdots; \cdots)$
be a polynomial in the a's, b's, etc. Then the equation $\phi = 0$ expresses
a necessary and sufficient condition that the system of surfaces
$f_1 = 0$, $f_2 = 0$, $\cdots$
have a geometric property when, and only when, the polynomial $\phi$ is
homogeneous in the a's alone, also in the b's alone, etc.
In conclusion we note that all the results of this section can be
extended at once to algebraic curves in the plane ; or, indeed, to the
case of space of any number of dimensions.
EXERCISE
If, in Theorem 2, besides the surfaces $f_1 = 0$, $f_2 = 0$, $\cdots$ we also have a system
of points $(x_1, y_1, z_1, t_1)$, $(x_2, y_2, z_2, t_2)$, $\cdots$
and if $\phi$ is a polynomial not merely of the a's, b's, etc., but also of the coordinates
of these points, prove that $\phi = 0$ expresses a necessary and sufficient condition
that this system of surfaces and points have a geometric property when and
only when $\phi$ is homogeneous in the a's alone, in the b's alone, etc., and also in
$(x_1, y_1, z_1, t_1)$ alone, in $(x_2, y_2, z_2, t_2)$ alone, etc.
81. Homogeneous Invariants. From the developments of the last
section it is clear that the only integral rational invariants which
will be of importance in geometrical applications are those which are
homogeneous in the coefficients of each of the groundforms taken
separately.* Such invariants we will speak of as homogeneous
invariants. It will be found that all the invariants which we have
met so far are of this kind.
An important relation between the weight and the various degrees
connected with a homogeneous invariant is given by the following
theorem:
Theorem 1. If we have a system of n-ary forms
(1) $f_1(x_1, \cdots x_n; a_1, a_2, \cdots)$, $f_2(x_1, \cdots x_n; b_1, b_2, \cdots)$, $\cdots$
of degrees $m_1, m_2, \cdots$ respectively, and if
$I(a_1, a_2, \cdots; b_1, b_2, \cdots; \cdots)$
* This statement must not be taken too literally. It is true if in the geometrical
application in question we consider the variables as homogeneous coordinates and if
we have to deal with the loci obtained by equating the groundforms to zero. While
this is the ordinary way in which we interpret invariants geometrically, other
interpretations are possible. For instance, instead of interpreting the variables $(x, y)$ as
homogeneous coordinates on a line and equating the binary quadratic forms
$f_1 \equiv a_1x^2 + 2a_2xy + a_3y^2$,
$f_2 \equiv b_1x^2 + 2b_2xy + b_3y^2$
to zero, thus getting two pairs of points on a line, we may interpret $(x, y)$ as
nonhomogeneous coordinates in the plane, and consider the two conics $f_1 = 1$, $f_2 = 1$. With
this interpretation, the vanishing of the invariant
$a_1a_3 - a_2^2 + b_1b_3 - b_2^2$,
which is not homogeneous in the a's alone or in the b's alone, has a geometric meaning.
is a homogeneous invariant of this system, of weight $\lambda$, and of degree $\alpha$
in the a's, $\beta$ in the b's, etc., then
(2) $m_1\alpha + m_2\beta + \cdots = n\lambda$.
Subjecting the forms (1) to the linear transformation
(3) $x_1 = c_{11}x_1' + \cdots + c_{1n}x_n'$,
    $\cdots$
    $x_n = c_{n1}x_1' + \cdots + c_{nn}x_n'$,
whose determinant we will denote by $c$, we get
$f_1(x_1', \cdots x_n'; a_1', a_2', \cdots)$, $f_2(x_1', \cdots x_n'; b_1', b_2', \cdots)$, $\cdots$;
and, since by hypothesis $I$ is an invariant of weight $\lambda$,
(4) $I(a_1', a_2', \cdots; b_1', b_2', \cdots; \cdots) \equiv c^{\lambda}I(a_1, a_2, \cdots; b_1, b_2, \cdots; \cdots)$.
Every $a'$ is a homogeneous polynomial in the $c_{ij}$'s of degree $m_1$, every
$b'$ of degree $m_2$, etc.; and since $I$ is itself homogeneous of degree $\alpha$
in the a's, $\beta$ in the b's, etc., we see that the left-hand side of (4) is a
homogeneous polynomial of degree $m_1\alpha + m_2\beta + \cdots$ in the $c_{ij}$'s.
Equating this to the degree of the right-hand side of (4) in the $c_{ij}$'s,
which is evidently $n\lambda$, our theorem is proved.
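Relation (2) lends itself to a quick arithmetical check on invariants whose degrees and weights are known. The helper below is a hypothetical sketch, not something from the text: it solves $m_1\alpha + m_2\beta + \cdots = n\lambda$ for $\lambda$ and returns an exact fraction, so that a non-integral result signals that no homogeneous invariant with the proposed degrees can exist.

```python
from fractions import Fraction
from typing import Sequence


def invariant_weight(form_degrees: Sequence[int],
                     invariant_degrees: Sequence[int],
                     n: int) -> Fraction:
    """Solve m1*alpha + m2*beta + ... = n*lambda for lambda.

    form_degrees      -- degrees m1, m2, ... of the n-ary forms
    invariant_degrees -- degrees alpha, beta, ... of the invariant in the
                         coefficients of each form
    n                 -- number of variables
    """
    if len(form_degrees) != len(invariant_degrees):
        raise ValueError("one invariant degree is needed for each form")
    total = sum(m * d for m, d in zip(form_degrees, invariant_degrees))
    return Fraction(total, n)


if __name__ == "__main__":
    # Discriminant a1*a3 - a2^2 of a binary quadratic form: m = 2, alpha = 2, n = 2.
    print(invariant_weight([2], [2], 2))
    # Resultant a1*b2 - a2*b1 of two binary linear forms: m1 = m2 = 1, alpha = beta = 1.
    print(invariant_weight([1, 1], [1, 1], 2))
    # Discriminant (determinant) of a ternary quadratic form: m = 2, alpha = 3, n = 3.
    print(invariant_weight([2], [3], 3))
```

The three results are 2, 1 and 2, which agree with the powers of the determinant by which these invariants are multiplied under a linear transformation.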
An additional reason for the importance of these homogeneous
invariants is that the nonhomogeneous integral rational invariants
can be built up from them, as is stated in the following theorem:
Theorem 2. If an integral rational invariant $I$ of the system (1) be written in the form
$I = I_1 + I_2 + \dots + I_k$
where each of the $I_i$'s is a polynomial in the $a$'s, $b$'s, etc., which is homogeneous in the $a$'s alone, and also in the $b$'s alone, etc., and such that the sum of no two $I_i$'s has this property, then each of the functions
$I_1, I_2, \dots, I_k$
is a homogeneous invariant of the system (1).
This theorem follows immediately from the definition of an invariant. For from the identity,
$I_1(a'_1, a'_2, \dots;\ b'_1, b'_2, \dots;\ \dots) + \dots + I_k(a'_1, a'_2, \dots;\ b'_1, b'_2, \dots;\ \dots)$
$\equiv c^{\lambda}\,[\,I_1(a_1, a_2, \dots;\ b_1, b_2, \dots;\ \dots) + \dots + I_k(a_1, a_2, \dots;\ b_1, b_2, \dots;\ \dots)\,]$,
we infer at once the identities,
$I_1(a'_1, a'_2, \dots;\ b'_1, b'_2, \dots;\ \dots) \equiv c^{\lambda}\, I_1(a_1, a_2, \dots;\ b_1, b_2, \dots;\ \dots)$,
$\dots$
$I_k(a'_1, a'_2, \dots;\ b'_1, b'_2, \dots;\ \dots) \equiv c^{\lambda}\, I_k(a_1, a_2, \dots;\ b_1, b_2, \dots;\ \dots)$.
In the case of a single n-ary form, but in that case only, we have the theorem:

Theorem 3. An integral rational invariant of a single n-ary form is always homogeneous.

Let
$f(x_1, \dots, x_n;\ a_1, a_2, \dots)$
be the ground-form, and let $I$ be the invariant. By Theorem 2 we may write
$I = I_1 + I_2 + \dots + I_k$
where $I_1, \dots, I_k$ are homogeneous invariants. Let the degrees of these homogeneous invariants in the $a$'s be $\alpha_1, \dots, \alpha_k$ respectively. Their weights are all the same as the weight of $I$, which we will call $\lambda$. If, then, we call the degree of $f$, $m$, we have, by Theorem 1,
$m\alpha_1 = n\lambda,\quad m\alpha_2 = n\lambda,\quad \dots,\quad m\alpha_k = n\lambda$,
from which, since $m > 0$, we infer
$\alpha_1 = \alpha_2 = \dots = \alpha_k$.
That is, $I_1, \dots, I_k$ are of the same degree, and $I$ is homogeneous.
Theorem 4. If we have a system of n-ary forms $f_1, f_2, \dots$ and a polynomial $\phi$ in their coefficients, the equation $\phi = 0$ gives a necessary and sufficient condition for a projective property of the system of loci in space of $n - 1$ dimensions,
$f_1 = 0,\quad f_2 = 0,\quad \dots$,
when, and only when, $\phi$ is a homogeneous invariant of the system of forms $f$.
If $\phi$ is a homogeneous invariant, its vanishing gives a necessary and sufficient condition for a geometric property (cf. § 80), and this property must be a projective property since when we subject the loci to a non-singular collineation, $\phi$ is merely multiplied by a non-vanishing constant.

On the other hand let $\phi = 0$ be a necessary and sufficient condition for a projective property. In order to prove that $\phi$ is an invariant (it must be homogeneous by § 80) let $a_1, a_2, \dots$ be the coefficients of $f_1$; $b_1, b_2, \dots$ the coefficients of $f_2$, etc.; and suppose that the linear transformation,
(5) $x_1 = c_{11}x'_1 + \dots + c_{1n}x'_n,\ \ \dots,\ \ x_n = c_{n1}x'_1 + \dots + c_{nn}x'_n$,
carries over $f_1$ into $f'_1$ with coefficients $a'_1, a'_2, \dots$; $f_2$ into $f'_2$ with coefficients $b'_1, b'_2, \dots$; etc. The polynomial $\phi$ formed for the transformed forms is
$\phi(a'_1, a'_2, \dots;\ b'_1, b'_2, \dots;\ \dots)$,
and may, since the $a'$'s, $b'$'s, $\dots$ are polynomials in the $a$'s, $b$'s, $\dots$ and the $c$'s, be itself regarded as a polynomial in the $a$'s, $b$'s, $\dots$ and the $c$'s. Looking at it from this point of view, let us resolve it into its irreducible factors,
(6) $\phi(a'_1, a'_2, \dots;\ b'_1, b'_2, \dots;\ \dots) \equiv \phi_1(a_1, a_2, \dots;\ b_1, b_2, \dots;\ \dots\ c_{11}, \dots, c_{nn})\ \phi_2(\dots) \cdots \phi_l(\dots)$.

It is clear that at least one of the factors on the right must contain the $c$'s. Let $\phi_1$ be such a factor, and let us arrange it as a polynomial in the $c$'s whose coefficients are polynomials in the $a$'s, $b$'s, etc. Let
$\psi(a_1, a_2, \dots;\ b_1, b_2, \dots;\ \dots)$
be one of these coefficients which is not identically zero and which is the coefficient of a term in which at least one of the $c$'s has an exponent greater than zero. We can, now, give to the $a$'s, $b$'s, $\dots$ values which we will denote by $A$'s, $B$'s, $\dots$ such that neither $\phi$ nor $\psi$ vanishes; and consider a neighborhood $N$ of the point
$(A_1, A_2, \dots;\ B_1, B_2, \dots;\ \dots)$
throughout which
(7) $\phi(a_1, a_2, \dots;\ b_1, b_2, \dots;\ \dots) \neq 0$,
(8) $\psi(a_1, a_2, \dots;\ b_1, b_2, \dots;\ \dots) \neq 0$.
Let us now restrict the $a$'s, $b$'s, $\dots$ to the neighborhood $N$ and ask ourselves under what circumstances we can have $\phi_1 = 0$. If this equation is fulfilled, we see from (6) that $\phi$ vanishes for the transformed loci, while, by (7), it does not vanish for the original loci. Since, by hypothesis, the vanishing of $\phi$ gives a necessary and sufficient condition for a projective property, a transformation (5) which causes $\phi$ to vanish when it did not vanish before must be a singular transformation. That is, if the $a$'s, $b$'s, $\dots$ are in the neighborhood $N$, whenever $\phi_1$ vanishes the determinant $c$ of (5) vanishes. Moreover, $\phi_1$ does vanish for values of the $a$'s, $b$'s, $\dots$ in $N$, for if we assign to the $a$'s, $b$'s, $\dots$ any such values, $\phi_1$ becomes a polynomial in the $c_{ij}$'s, which, by (8), is of at least the first degree, and therefore vanishes for suitably chosen values of the $c_{ij}$'s. We can therefore apply the theorem for more than three variables analogous to Theorem 8, § 76, and infer that $\phi_1$ is a factor of the determinant $c$; and consequently, since this determinant is irreducible (Theorem 1, § 61), that $\phi_1$ is merely a constant multiple of $c$.

The reasoning we have just applied to $\phi_1$ applies equally to any of the factors on the right of (6) which are of at least the first degree in the $c_{ij}$'s. Accordingly (6) reduces to the form
(9) $\phi(a'_1, a'_2, \dots;\ b'_1, b'_2, \dots;\ \dots) \equiv c^{\mu}\, \chi(a_1, a_2, \dots;\ b_1, b_2, \dots;\ \dots)$,
where $\chi$ no longer involves the $c_{ij}$'s. To determine this polynomial $\chi$, let us assign to the $c_{ij}$'s the values 0, 1 which reduce (5) to the identical transformation. Then the $a'$'s, $b'$'s, $\dots$ reduce to the $a$'s, $b$'s, $\dots$, while $c = 1$; so that from (9) we see that
$\phi(a_1, a_2, \dots;\ b_1, b_2, \dots;\ \dots) \equiv \chi(a_1, a_2, \dots;\ b_1, b_2, \dots;\ \dots)$.
Substituting this value of $\chi$ in (9), we see that $\phi$ is really an invariant.
In order to avoid all misunderstanding, we state here explicitly that if we have two or more polynomials, $\phi_1, \phi_2, \dots$ in the coefficients of the forms $f_i$, the equations $\phi_1 = \phi_2 = \dots = 0$ may be a necessary and sufficient condition for a projective property of the loci $f_i = 0$, even though $\phi_1, \phi_2, \dots$ are not invariants. For instance, a necessary and sufficient condition that the two lines
$a_1x_1 + a_2x_2 + a_3x_3 = 0$,
$b_1x_1 + b_2x_2 + b_3x_3 = 0$
coincide is the vanishing of the three two-rowed determinants of the matrix
$\begin{Vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{Vmatrix}$,
none of which is an invariant. Or, again, a necessary and sufficient condition that a quadric surface break up into two planes, distinct or coincident, is the vanishing of all the three-rowed determinants of its matrix, and these are not invariants. In this case we can also express the condition in question by the identical vanishing of a certain contravariant, namely, the adjoint of the quadratic form; and this (a projective relation expressed by the identical vanishing of a covariant or contravariant) is typical of what we shall usually have when a single equation $\phi = 0$ is not sufficient to express the condition. There are, however, cases where the condition is given by the vanishing of two or more invariants; cf. Exercise 6, § 90.
EXERCISES
1. Prove that if in Theorem 1 our system consists not merely of the ground-forms (1) but also of certain points
$(y_1, \dots, y_n),\quad (z_1, \dots, z_n),\quad \dots$
and we have not an invariant $I$ but a covariant of weight $\lambda$, and of degree $\alpha$ in the $a$'s, $\beta$ in the $b$'s, etc., $\eta$ in the $y$'s, $\zeta$ in the $z$'s, etc., then
$m_1\alpha + m_2\beta + \dots = n\lambda + \eta + \zeta + \dots$.
2. Extend Theorem 2 to the case of covariants. Does Theorem 3 admit of such extension?
3. Extend Theorem 4 to the case of covariants.
4. Show that an integral rational invariant of a single binary form of odd degree must be of even degree.
5. Show that the weight of an integral rational invariant of a single binary form can never be smaller than the degree of the form.
6. Express the condition that (a) two lines, and (b) two planes coincide, in the form of the identical vanishing of a covariant or contravariant.
7. Prove that a polynomial in the coefficients of a system of n-ary forms which is homogeneous in the coefficients of each form taken by themselves, and which is unchanged when the forms are subjected to any linear transformation of determinant $+1$, is an invariant of the system of forms.
8. Generalize Exercise 7 to the case of covariants.
82. Resultants and Discriminants of Binary Forms. If we interpret $(x_1, x_2)$ as homogeneous coordinates in one dimension, the equations obtained by equating the two binary forms
$f(x_1, x_2) \equiv a_0x_1^n + a_1x_1^{n-1}x_2 + \dots + a_nx_2^n$,
$\phi(x_1, x_2) \equiv b_0x_1^m + b_1x_1^{m-1}x_2 + \dots + b_mx_2^m$
to zero represent sets of points on a line. The points given by the equation $f = 0$ are the points at which the linear factors of $f$ vanish, and the points corresponding to $\phi = 0$ are the points at which the linear factors of $\phi$ vanish. Since two binary linear forms obviously vanish at the same point when, and only when, these linear forms are proportional, it follows that the loci of the two equations $f = 0$, $\phi = 0$ have a point in common when, and only when, $f$ and $\phi$ have a common factor other than a constant. Hence, by § 72, a necessary and sufficient condition that the two loci $f = 0$, $\phi = 0$ have a point in common is that the resultant $R$ of the binary forms $f$, $\phi$ vanish.

The property of these two loci having a point in common is, however, a projective property. Thus, by Theorem 4, § 81,

Theorem 1. The resultant of two binary forms is a homogeneous invariant of this pair of forms.

From the determinant form of $R$ given in § 68 it is clear that $R$ is of degree $m$ in the $a$'s and of degree $n$ in the $b$'s. Hence by formula (2), § 81, the weight $\lambda$ of $R$ satisfies $nm + mn = 2\lambda$, and we have

Theorem 2. The weight of the resultant of two binary forms of degrees $m$ and $n$ is $mn$.
The following geometrical problem will lead us to an important invariant of a single binary form.

Let us resolve the form $f$, which we assume not to be identically zero, into its linear factors (cf. formula (4), § 65),
$f(x_1, x_2) \equiv (\alpha''_1x_1 - \alpha'_1x_2)(\alpha''_2x_1 - \alpha'_2x_2) \cdots (\alpha''_nx_1 - \alpha'_nx_2)$.
The equation $f = 0$ represents $n$ distinct points provided no two of these linear factors are proportional to each other. If, however, two of these factors are proportional, we say that $f$ has a multiple linear factor, and in this case two or more of the $n$ points represented by the equation $f = 0$ coincide. Let us inquire under what conditions this will occur.

Form the partial derivatives:
(1) $\dfrac{\partial f}{\partial x_1} \equiv \alpha''_1(\alpha''_2x_1 - \alpha'_2x_2) \cdots (\alpha''_nx_1 - \alpha'_nx_2) + \dots + \alpha''_n(\alpha''_1x_1 - \alpha'_1x_2) \cdots (\alpha''_{n-1}x_1 - \alpha'_{n-1}x_2)$,
$\dfrac{\partial f}{\partial x_2} \equiv -\alpha'_1(\alpha''_2x_1 - \alpha'_2x_2) \cdots (\alpha''_nx_1 - \alpha'_nx_2) - \dots - \alpha'_n(\alpha''_1x_1 - \alpha'_1x_2) \cdots (\alpha''_{n-1}x_1 - \alpha'_{n-1}x_2)$.

From these formulae we see that any multiple linear factor of $f$ is a factor of both of these partial derivatives.

Conversely, if these partial derivatives have a common linear factor, it must be a factor of $f$ on account of the formula,*
$nf \equiv x_1\dfrac{\partial f}{\partial x_1} + x_2\dfrac{\partial f}{\partial x_2}$,
a formula which follows immediately from the expressions,
(2) $\dfrac{\partial f}{\partial x_1} \equiv na_0x_1^{n-1} + (n-1)a_1x_1^{n-2}x_2 + \dots + a_{n-1}x_2^{n-1}$,
$\dfrac{\partial f}{\partial x_2} \equiv a_1x_1^{n-1} + 2a_2x_1^{n-2}x_2 + \dots + na_nx_2^{n-1}$.

But, by (1), no linear factor of $f$ can be a factor of both $\partial f/\partial x_1$ and $\partial f/\partial x_2$ unless it is a multiple factor of $f$. Thus we have proved
Theorem 3. A necessary and sufficient condition that $f$ have a multiple linear factor is that the resultant of $\partial f/\partial x_1$ and $\partial f/\partial x_2$ vanish.

Definition. The resultant of $\partial f/\partial x_1$ and $\partial f/\partial x_2$ is called the discriminant of $f$.

From (2) we see that the discriminant of $f$ may be written as a determinant of order $2n - 2$ whose elements, so far as they are not zero, are numerical multiples of the coefficients $a_0, a_1, \dots, a_n$ of $f$. That is, this discriminant is a polynomial in the $a$'s. Moreover, its vanishing gives a necessary and sufficient condition that the locus $f = 0$ have a projective property (namely, that two points of this locus coincide). Hence, by Theorem 4, § 81, this discriminant is a homogeneous invariant, whose degree and weight are readily determined. Thus we get the theorem:

Theorem 4. The discriminant of a binary form of the nth degree is a homogeneous invariant of this form of degree $2(n - 1)$ and of weight $n(n - 1)$.

* This is merely Euler's Theorem for Homogeneous Functions.
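The definition of the discriminant as a resultant can be tried out numerically. The sketch below is an illustration only: it assumes the usual Sylvester determinant as the "determinant form" of the resultant, stores a binary form $a_0x_1^n + a_1x_1^{n-1}x_2 + \dots + a_nx_2^n$ as the list `[a0, ..., an]`, and uses function names invented for this example.

```python
from fractions import Fraction


def det(matrix):
    """Determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in matrix]
    size = len(m)
    result = Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result  # a row interchange changes the sign
        result *= m[col][col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for k in range(col, size):
                m[r][k] -= factor * m[col][k]
    return result


def sylvester_resultant(p, q):
    """Sylvester determinant of two binary forms given by coefficient lists."""
    dp, dq = len(p) - 1, len(q) - 1
    size = dp + dq
    rows = [[0] * s + list(p) + [0] * (size - dp - 1 - s) for s in range(dq)]
    rows += [[0] * s + list(q) + [0] * (size - dq - 1 - s) for s in range(dp)]
    return det(rows)


def discriminant(a):
    """Resultant of df/dx1 and df/dx2, following formulae (2)."""
    n = len(a) - 1
    d1 = [(n - i) * a[i] for i in range(n)]    # coefficients of df/dx1
    d2 = [i * a[i] for i in range(1, n + 1)]   # coefficients of df/dx2
    return sylvester_resultant(d1, d2)


# (x1 - x2)^2 (x1 + 2 x2) has a multiple linear factor; x1^3 - x1 x2^2 has none.
print(discriminant([1, 0, -3, 2]))
print(discriminant([1, 0, -1, 0]))
```

For a cubic, multiplying every coefficient by $t$ multiplies this discriminant by $t^4$, and the substitution $x_1 = kx'_1$ multiplies it by $k^6$, in agreement with the degree $2(n-1)$ and weight $n(n-1)$ of Theorem 4.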
A slight modification in the definition of the discriminant is often desirable. Let us write the binary form $f$, not in the above form where the coefficients are $a_0, a_1, \dots, a_n$, but, by the introduction of binomial coefficients, in the form
$f(x_1, x_2) \equiv a_0x_1^n + na_1x_1^{n-1}x_2 + \dfrac{n(n-1)}{2!}a_2x_1^{n-2}x_2^2 + \dots + na_{n-1}x_1x_2^{n-1} + a_nx_2^n$.
Then we may write
$\dfrac{1}{n}\dfrac{\partial f}{\partial x_1} \equiv a_0x_1^{n-1} + (n-1)a_1x_1^{n-2}x_2 + \dfrac{(n-1)(n-2)}{2!}a_2x_1^{n-3}x_2^2 + \dots + a_{n-1}x_2^{n-1}$,
$\dfrac{1}{n}\dfrac{\partial f}{\partial x_2} \equiv a_1x_1^{n-1} + (n-1)a_2x_1^{n-2}x_2 + \dfrac{(n-1)(n-2)}{2!}a_3x_1^{n-3}x_2^2 + \dots + a_nx_2^{n-1}$.
We may then define the discriminant of $f$ as the resultant of the two binary forms just written. We thus get for the discriminant a polynomial in the $a$'s which differs from the discriminant as above defined only by a numerical factor, and for which Theorems 3 and 4 obviously still hold. If this last definition be applied to the case of a binary quadratic form, it will be seen that it leads us precisely to what we called the discriminant of this quadratic form in the earlier chapters of this book.
EXERCISES
1. Prove that the resultant of two binary forms of degrees $n$ and $m$ respectively is irreducible.
[Suggestion. When $b_0 = 0$, $R$ is equal to $a_0$ times the resultant of two binary forms of degrees $n$ and $m - 1$ respectively. Show that if this last resultant is irreducible, $R$ is also irreducible, and use the method of induction, starting with the case $n = 1$, $m = 1$.]
2. Prove by the methods of this chapter that the bordered determinants of Chapter XII are invariants of weight 2.
3. The following account of Bezout's method of elimination is sometimes given:
If $f$ and $\phi$ are polynomials in $x$ which are both of degree $n$, the expression
$f(x)\phi(y) - \phi(x)f(y)$
vanishes, independently of $y$, for a value of $x$ for which both $f$ and $\phi$ vanish, and is divisible by $x - y$, since it is zero for $x = y$. Hence
$F(x, y) = \dfrac{f(x)\phi(y) - \phi(x)f(y)}{x - y}$
is a polynomial of degree $(n - 1)$ in $x$ which vanishes for all values of $y$ when $x$ is a common root of $f$ and $\phi$. Arranging $F$ according to the powers of $y$, we have the expression
$c_{00} + c_{01}x + c_{02}x^2 + \dots + c_{0,n-1}x^{n-1}$
$+\ y\,(c_{10} + c_{11}x + c_{12}x^2 + \dots + c_{1,n-1}x^{n-1})$
$+\ y^2(c_{20} + c_{21}x + c_{22}x^2 + \dots + c_{2,n-1}x^{n-1})$
$+\ \dots$
$+\ y^{n-1}(c_{n-1,0} + c_{n-1,1}x + c_{n-1,2}x^2 + \dots + c_{n-1,n-1}x^{n-1})$.
If this function is to vanish independently of $y$, the coefficient of each power of $y$ must be zero. This gives $n$ equations between which we can eliminate the $n$ quantities, $1, x, x^2, \dots, x^{n-1}$, obtaining the resultant in the form of the determinant,
$R = \begin{vmatrix} c_{00} & c_{01} & \cdots & c_{0,n-1} \\ \vdots & \vdots & & \vdots \\ c_{n-1,0} & c_{n-1,1} & \cdots & c_{n-1,n-1} \end{vmatrix} = 0$.
With the help of the auxiliary function $F$ we have, in this case, reduced the resultant to a determinant of the nth order, while that obtained by the method of Sylvester was of order $2n$.
Criticise this treatment and make it rigorous, applying it, in particular, to the case of homogeneous variables.
4. If $f$ and $\phi$ are polynomials in $(x, y)$ of degrees $n$ and $m$ respectively and are relatively prime, prove that the curves $f = 0$, $\phi = 0$ cannot have more than $mn$ points of intersection.
[Suggestion. Show first that the coordinate axes can be turned in such a way that no two points of intersection have the same abscissa, and that the equations of the two curves are of degrees $n$ and $m$ respectively, after the transformation, in $y$ alone. Then eliminate $y$ between the two equations by Sylvester's dialytic method.]
5. Prove that every integral rational invariant of the binary cubic is a constant multiple of a power of the discriminant.
[Suggestion. Show that if the discriminant is not zero, every binary cubic can be reduced by a non-singular linear transformation to the normal form $x_1^3 - x_2^3$. Then proceed as in § 48.]
CHAPTER XVIII
SYMMETRIC POLYNOMIALS
83. Fundamental Conceptions. $\Sigma$ and $S$ Functions.

Definition 1. A polynomial $F(x_1, \dots, x_n)$ is said to be symmetric if it is unchanged by any interchange of the variables $(x_1, \dots, x_n)$.

It is not necessary, however, to consider all the possible permutations of the variables in order to show that a polynomial is symmetric. If we can show that it is unchanged by the interchange of every pair of the variables, this is sufficient, for any arrangement $(x_\alpha, x_\beta, \dots, x_\nu)$ may be obtained from $(x_1, x_2, \dots, x_n)$ by interchanging the $x$'s in pairs. Thus, if $\alpha \neq 1$, interchange $x_\alpha$ with $x_1$; then interchange the second letter in the arrangement thus obtained with $x_\beta$; and so on. Hence we have the following theorem:

Theorem 1. A necessary and sufficient condition for a polynomial to be symmetric is that it be unchanged by every interchange of two variables.
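Theorem 1 gives a finite test that is easy to mechanize. In the sketch below a polynomial is stored, purely as an assumption of this example, as a dictionary from exponent tuples to coefficients, and symmetry is tested by the interchange of every pair of variables.

```python
def swap_variables(poly, i, j):
    """Interchange variables i and j in a polynomial {exponent tuple: coefficient}."""
    swapped = {}
    for exps, coeff in poly.items():
        e = list(exps)
        e[i], e[j] = e[j], e[i]
        swapped[tuple(e)] = coeff
    return swapped


def is_symmetric(poly, n):
    """True when the polynomial is unchanged by every interchange of two of its n variables."""
    return all(swap_variables(poly, i, j) == poly
               for i in range(n) for j in range(i + 1, n))


# x1^2 x2 + x1^2 x3 + x2^2 x1 + x2^2 x3 + x3^2 x1 + x3^2 x2
symmetric_example = {(2, 1, 0): 1, (2, 0, 1): 1, (1, 2, 0): 1,
                     (0, 2, 1): 1, (1, 0, 2): 1, (0, 1, 2): 1}
# x1^2 x2 + x2^2 x3 + x3^2 x1 is unchanged only by cyclic permutations
cyclic_example = {(2, 1, 0): 1, (0, 2, 1): 1, (1, 0, 2): 1}

print(is_symmetric(symmetric_example, 3))
print(is_symmetric(cyclic_example, 3))
```

The second polynomial shows why all pairs must be tried: it is unchanged by the cyclic permutations of the three variables and yet is not symmetric.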
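A special class of symmetric polynomials of much importance are the $\Sigma$ functions, defined as follows:

Definition 2. $\Sigma$ before any term means the sum of this term and of all the similar ones obtained from it by interchanging the subscripts.

Thus, for example,
$\Sigma x_1^\alpha = x_1^\alpha + x_2^\alpha + \dots + x_n^\alpha$,
$\Sigma x_1^\alpha x_2^\beta = x_1^\alpha x_2^\beta + x_1^\alpha x_3^\beta + \dots + x_1^\alpha x_n^\beta$
$\qquad + x_2^\alpha x_1^\beta + x_2^\alpha x_3^\beta + \dots + x_2^\alpha x_n^\beta$
$\qquad + \dots$
$\qquad + x_n^\alpha x_1^\beta + x_n^\alpha x_2^\beta + \dots + x_n^\alpha x_{n-1}^\beta \qquad (\alpha \neq \beta)$,
$\Sigma x_1^\alpha x_2^\alpha = x_1^\alpha x_2^\alpha + x_1^\alpha x_3^\alpha + \dots + x_{n-1}^\alpha x_n^\alpha$.

It is clearly immaterial in what order the exponents $\alpha, \beta, \dots$ are written. Thus, $\Sigma x_1^\alpha x_2^\beta x_3^\gamma = \Sigma x_1^\beta x_2^\alpha x_3^\gamma$.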
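If we consider any term of a symmetric polynomial, it is evident that the polynomial must contain all the terms obtained from this one by interchanging the $x$'s. This aggregate of terms is merely a constant multiple of one of the $\Sigma$'s just defined. In the same way it is clear that all the other terms of the symmetric polynomial must arrange themselves in groups each of which is a constant multiple of a $\Sigma$. That is,

Theorem 2. Every symmetric polynomial is a linear combination with constant coefficients of a certain number of $\Sigma$'s.

Among these $\Sigma$'s the simplest are the sums of powers of the $x$'s. For the sake of brevity the notation is used:
$S_k = \Sigma x_1^k = x_1^k + x_2^k + \dots + x_n^k \qquad (k = 1, 2, \dots)$.
It is sometimes convenient to write $S_0 = n$.

Theorem 3. Any symmetric polynomial in the $x$'s can be expressed as a polynomial in a certain number of the $S$'s.

Since every symmetric polynomial is a linear combination of a certain number of $\Sigma$'s, in order to prove our theorem we have only to show that every $\Sigma$ can be expressed as a polynomial in the $S$'s.

Now
$S_\alpha = x_1^\alpha + x_2^\alpha + \dots + x_n^\alpha$,
$S_\beta = x_1^\beta + x_2^\beta + \dots + x_n^\beta$.
Hence, if $\alpha \neq \beta$,
$S_\alpha S_\beta = S_{\alpha+\beta} + \Sigma x_1^\alpha x_2^\beta$.
From this we get the formula:
(1) $\Sigma x_1^\alpha x_2^\beta = S_\alpha S_\beta - S_{\alpha+\beta} \qquad (\alpha \neq \beta)$.
If $\alpha = \beta$, we have
$S_\alpha^2 = x_1^{2\alpha} + x_2^{2\alpha} + \dots + x_n^{2\alpha} + 2x_1^\alpha x_2^\alpha + 2x_1^\alpha x_3^\alpha + \dots = S_{2\alpha} + 2\,\Sigma x_1^\alpha x_2^\alpha$.
Hence
(2) $\Sigma x_1^\alpha x_2^\alpha = \tfrac{1}{2}(S_\alpha^2 - S_{2\alpha})$.
Similarly, by multiplying $\Sigma x_1^\alpha x_2^\beta$ by $S_\gamma$, we get the following formulae where the three integers $\alpha, \beta, \gamma$ are supposed to be distinct:
(3) $\Sigma x_1^\alpha x_2^\beta x_3^\gamma = S_\alpha S_\beta S_\gamma - S_{\alpha+\beta}S_\gamma - S_{\alpha+\gamma}S_\beta - S_{\beta+\gamma}S_\alpha + 2S_{\alpha+\beta+\gamma}$,
(4) $\Sigma x_1^\alpha x_2^\alpha x_3^\gamma = \tfrac{1}{2}(S_\alpha^2 S_\gamma - S_{2\alpha}S_\gamma - 2S_{\alpha+\gamma}S_\alpha + 2S_{2\alpha+\gamma})$,
(5) $\Sigma x_1^\alpha x_2^\alpha x_3^\alpha = \tfrac{1}{6}(S_\alpha^3 - 3S_{2\alpha}S_\alpha + 2S_{3\alpha})$.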
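Formulae (1) to (5) can be checked on numerical values of the $x$'s. The following sketch is an illustration under stated assumptions: the helper names `sigma` and `S` and the sample values are invented here, and a $\Sigma$ is evaluated by summing over the distinct ways of distributing the given exponents among the variables.

```python
from itertools import permutations
from math import prod


def sigma(exponents, xs):
    """Value of the Sigma function with the given exponents at the point xs."""
    padded = tuple(exponents) + (0,) * (len(xs) - len(exponents))
    # set(...) keeps each distinct term once, as in Definition 2.
    return sum(prod(x ** e for x, e in zip(xs, perm))
               for perm in set(permutations(padded)))


def S(k, xs):
    """Sum of the k-th powers of the x's."""
    return sum(x ** k for x in xs)


xs = (2, 3, 5, 7)  # hypothetical sample values

# (1) with alpha = 3, beta = 1
print(sigma((3, 1), xs) == S(3, xs) * S(1, xs) - S(4, xs))
# (2) with alpha = 2
print(2 * sigma((2, 2), xs) == S(2, xs) ** 2 - S(4, xs))
# (3) with alpha = 3, beta = 2, gamma = 1
print(sigma((3, 2, 1), xs) == S(3, xs) * S(2, xs) * S(1, xs) - S(5, xs) * S(1, xs)
      - S(4, xs) * S(2, xs) - S(3, xs) * S(3, xs) + 2 * S(6, xs))
# (4) with alpha = 2, gamma = 1
print(2 * sigma((2, 2, 1), xs) == S(2, xs) ** 2 * S(1, xs) - S(4, xs) * S(1, xs)
      - 2 * S(3, xs) * S(2, xs) + 2 * S(5, xs))
# (5) with alpha = 2
print(6 * sigma((2, 2, 2), xs) == S(2, xs) ** 3 - 3 * S(4, xs) * S(2, xs) + 2 * S(6, xs))
```

Integer sample values are used so that the comparisons are exact; the factors $\tfrac{1}{2}$ and $\tfrac{1}{6}$ are cleared by multiplying through.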
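The proof indicated in these two special cases may be extended to the general case as follows:

If we multiply together the two symmetric polynomials
(6) $\Sigma x_1^\alpha x_2^\beta \cdots x_k^\kappa,\qquad S_\lambda = \Sigma x_1^\lambda \qquad (k < n)$
we get terms of various sorts which are readily seen to be all contained in one or the other of the following polynomials, each of these polynomials being actually represented:
$\Sigma x_1^{\alpha+\lambda}x_2^\beta \cdots x_k^\kappa,\quad \Sigma x_1^\alpha x_2^{\beta+\lambda} \cdots x_k^\kappa,\quad \dots,\quad \Sigma x_1^\alpha x_2^\beta \cdots x_k^{\kappa+\lambda},\quad \Sigma x_1^\alpha x_2^\beta \cdots x_k^\kappa x_{k+1}^\lambda$.
Consequently, since the product of the two polynomials (6) is symmetric, it must have the form
$c_1\Sigma x_1^{\alpha+\lambda}x_2^\beta \cdots x_k^\kappa + c_2\Sigma x_1^\alpha x_2^{\beta+\lambda} \cdots x_k^\kappa + \dots + c_k\Sigma x_1^\alpha x_2^\beta \cdots x_k^{\kappa+\lambda} + c_{k+1}\Sigma x_1^\alpha x_2^\beta \cdots x_k^\kappa x_{k+1}^\lambda$
where $c_1, \dots, c_{k+1}$ are positive integers.

Transposing, we may write
$\Sigma x_1^\alpha x_2^\beta \cdots x_k^\kappa x_{k+1}^\lambda = \dfrac{1}{c_{k+1}}\big[\Sigma x_1^\alpha x_2^\beta \cdots x_k^\kappa \cdot \Sigma x_1^\lambda - c_1\Sigma x_1^{\alpha+\lambda}x_2^\beta \cdots x_k^\kappa - c_2\Sigma x_1^\alpha x_2^{\beta+\lambda} \cdots x_k^\kappa - \dots - c_k\Sigma x_1^\alpha x_2^\beta \cdots x_k^{\kappa+\lambda}\big]$.

Hence, if our theorem is true for $\Sigma x_1^\alpha \cdots x_k^\kappa$, it is also true for $\Sigma x_1^\alpha \cdots x_{k+1}^\lambda$. But we know it is true for $k = 1$ (by definition of the $S$'s), hence it is true for $k = 2$, hence for $k = 3$, and so on. Thus our theorem is completely proved.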
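84. Elementary Symmetric Functions. The notation $\Sigma x_1^\alpha x_2^\beta \cdots x_n^\nu$ may be used to represent any $\Sigma$ in $n$ variables. If $\beta = \gamma = \dots = \nu = 0$, this becomes $\Sigma x_1^\alpha$ or $S_\alpha$; if $\gamma = \dots = \nu = 0$, it becomes $\Sigma x_1^\alpha x_2^\beta$; and so on.

Let us now consider $\Sigma x_1^\alpha x_2^\beta \cdots x_n^\nu$ where $\alpha, \beta, \dots, \nu$ are all 0 or 1. The following $n$ cases arise:
$\alpha = 1,\ \beta = \gamma = \dots = \nu = 0$: $\quad \Sigma x_1$,
$\alpha = \beta = 1,\ \gamma = \dots = \nu = 0$: $\quad \Sigma x_1x_2$,
$\dots$
$\alpha = \beta = \dots = \nu = 1$: $\quad x_1x_2 \cdots x_n$.
The extreme case $\alpha = \beta = \dots = \nu = 0$ is of no interest. We will represent these $n$ symmetric polynomials by $p_1, p_2, \dots, p_n$ respectively. They are called the elementary symmetric functions.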
Theorem 1. Any symmetric polynomial in the $x$'s may be expressed as a polynomial in the $p$'s.

Since any symmetric polynomial in the $x$'s may be expressed as a polynomial in the $S$'s, it is sufficient to show that every $S$ may be expressed as a polynomial in the $p$'s.

Let us introduce a new variable $x$ and consider the polynomial
$f(x;\ x_1, x_2, \dots, x_n) = (x - x_1)(x - x_2) \cdots (x - x_n) = x^n - p_1x^{n-1} + p_2x^{n-2} - \dots + (-1)^np_n$.
Using the factored form of $f$, we may write
$\dfrac{\partial f}{\partial x} = \dfrac{f}{x - x_1} + \dfrac{f}{x - x_2} + \dots + \dfrac{f}{x - x_n}$.
Since $f$ vanishes identically when $x = x_i$, we may write
$f = (x^n - x_i^n) - p_1(x^{n-1} - x_i^{n-1}) + \dots$.
Accordingly,
$\dfrac{f}{x - x_i} = x^{n-1} + (x_i - p_1)x^{n-2} + (x_i^2 - p_1x_i + p_2)x^{n-3} + \dots$,
$\dfrac{\partial f}{\partial x} = nx^{n-1} + (S_1 - np_1)x^{n-2} + (S_2 - p_1S_1 + np_2)x^{n-3} + \dots$.
On the other hand, we have
$\dfrac{\partial f}{\partial x} = nx^{n-1} - (n-1)p_1x^{n-2} + (n-2)p_2x^{n-3} - \dots$.
Hence, equating the coefficients of like powers of $x$ in these two expressions, we have
$S_1 - np_1 = -(n-1)p_1$,
$S_2 - p_1S_1 + np_2 = (n-2)p_2$,
$\dots$
$S_{n-1} - p_1S_{n-2} + p_2S_{n-3} - \dots + (-1)^{n-1}np_{n-1} = (-1)^{n-1}p_{n-1}$,
or
(1) $S_1 - p_1 = 0$,
$\quad\ S_2 - p_1S_1 + 2p_2 = 0$,
$\quad\ \dots$
$\quad\ S_{n-1} - p_1S_{n-2} + p_2S_{n-3} - \dots + (-1)^{n-1}(n-1)p_{n-1} = 0$.

Now consider the identities
$x_i^n - p_1x_i^{n-1} + p_2x_i^{n-2} - \dots + (-1)^np_n = 0 \qquad (i = 1, 2, \dots, n)$.
Multiplying these identities by $x_1^{k-n}, \dots, x_n^{k-n}$ respectively and adding the results, we have
(2) $S_k - p_1S_{k-1} + p_2S_{k-2} - \dots + (-1)^np_nS_{k-n} = 0 \qquad (k = n, n+1, \dots)$.

Formulae (1) and (2) are known as Newton's Formulae. By means of them we can compute in succession the values of $S_1, S_2, \dots$ as polynomials in the $p$'s:
(3) $S_1 = p_1$,
$\quad\ S_2 = p_1^2 - 2p_2$,
$\quad\ S_3 = p_1^3 - 3p_1p_2 + 3p_3$,
$\quad\ \dots$
Thus our theorem is proved.

It will be noted that Newton's formulae (1) cannot be obtained from (2) by giving to $k$ values less than $n$. The necessity for two different sets of formulae may, however, be avoided by introducing the notation
$p_{n+1} = p_{n+2} = \dots = 0$.
Then all of Newton's formulae may be included in the following form:
(4) $S_k - p_1S_{k-1} + \dots + (-1)^{k-1}p_{k-1}S_1 + (-1)^kkp_k = 0 \qquad (k = 1, 2, \dots)$.

Using this notation, we see that the explicit formulae (3) for expressing the $S$'s in terms of the $p$'s are wholly independent of the number $n$ of the $x$'s.

Since the formulae referred to in the last section for expressing the $\Sigma$'s in terms of the $S$'s are also independent of $n$, we have established

Theorem 2. If we introduce the notation $p_{n+1} = p_{n+2} = \dots = 0$ and use Newton's Formulae in the form (4), the formula for expressing any $\Sigma$ as a polynomial in the $p$'s is independent of the number $n$ of the $x$'s.
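Newton's Formulae in the form (4) amount to a recurrence for the $S$'s, which may be sketched as follows. The function names and the sample values are assumptions made for this illustration; the convention $p_{n+1} = p_{n+2} = \dots = 0$ is built into the helper `p_at`.

```python
from itertools import combinations
from math import prod


def elementary(xs):
    """The elementary symmetric functions p_1, ..., p_n of the values xs."""
    return [sum(prod(c) for c in combinations(xs, k))
            for k in range(1, len(xs) + 1)]


def power_sums_from_p(p, count):
    """S_1, ..., S_count computed from the p's by the recurrence (4)."""
    def p_at(j):
        # p_j, with p_j = 0 when j exceeds the number of variables
        return p[j - 1] if j <= len(p) else 0

    S = [None]  # S[0] is unused; S[k] holds S_k
    for k in range(1, count + 1):
        total = (-1) ** (k + 1) * k * p_at(k)
        for j in range(1, k):
            total += (-1) ** (j + 1) * p_at(j) * S[k - j]
        S.append(total)
    return S[1:]


xs = (2, 3, 5)  # hypothetical sample values
p = elementary(xs)
from_newton = power_sums_from_p(p, 6)
direct = [sum(x ** k for x in xs) for k in range(1, 7)]
print(p)
print(from_newton == direct)
```

The same recurrence serves for $k \le n$ and for $k > n$, which is the point of writing the formulae in the single form (4).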
When we have $k$ polynomials in $n$ variables
$f_1(x_1, \dots, x_n),\ \dots,\ f_k(x_1, \dots, x_n)$
we say that there exists a rational relation between them when, and only when, a polynomial in $k$ variables
$F(z_1, \dots, z_k)$
exists which is not identically zero, but which becomes identically zero as a polynomial in the $x$'s when each $z$ is replaced by the corresponding $f$,
$F(f_1, \dots, f_k) \equiv 0$.

Theorem 3. There exists no rational relation between the elementary symmetric functions in $n$ variables $p_1, \dots, p_n$.

For let $F(z_1, \dots, z_n)$ be any polynomial in $n$ variables which is not identically zero, and let $(a_1, \dots, a_n)$ be a point at which this polynomial does not vanish. Determine $(x_1, \dots, x_n)$ as the roots of the equation
$x^n - a_1x^{n-1} + a_2x^{n-2} - \dots + (-1)^na_n = 0$.
For these values of the $x$'s, the $p$'s have the values $a_1, \dots, a_n$, and therefore $F(p_1, \dots, p_n)$ does not vanish for these $x$'s, and is consequently not identically zero as a polynomial in the $x$'s. Thus our theorem is proved.
Corollary. There is only one way in which a symmetric polynomial in $(x_1, \dots, x_n)$ can be expressed as a polynomial in the elementary symmetric functions $p_1, \dots, p_n$.

For if $f$ is a symmetric polynomial, and if we had two expressions for it,
$f(x_1, \dots, x_n) \equiv \phi_1(p_1, \dots, p_n)$,
$f(x_1, \dots, x_n) \equiv \phi_2(p_1, \dots, p_n)$,
then by subtracting these identities from one another we should have as an identity in the $x$'s,
$\phi_1(p_1, \dots, p_n) - \phi_2(p_1, \dots, p_n) \equiv 0$.
This, however, would give us a rational relation between the $p$'s, unless
$\phi_1(z_1, \dots, z_n) \equiv \phi_2(z_1, \dots, z_n)$.
Thus we see that the two expressions for $f$ are really the same.
EXERCISES
1. Obtain the expressions for the following symmetric polynomials in terms of
the elementary symmetric functions :
2. Prove that every symmetric polynomial in $(x_1, \dots, x_n)$ can be expressed in one, and only one, way as a polynomial in $S_1, \dots, S_n$.
85. The Weights and Degrees of Symmetric Polynomials. We will attach to each of the elementary symmetric functions $p_i$ a weight equal to its subscript, cf. § 79.

Theorem 1. A homogeneous symmetric polynomial of degree $m$ in the $x$'s, when expressed in terms of the $p$'s, is isobaric of weight $m$.

Let
(1) $f(x_1, x_2, \dots, x_n) \equiv \phi(p_1, p_2, \dots, p_n)$
be such a polynomial. Since $p_1$ is a homogeneous polynomial of the first degree in the $x$'s, $p_2$ of the second, etc., any term of $\phi$, when written in the $x$'s, must be a homogeneous polynomial of degree equal to the original weight of the term. Thus, for example, the term $6p_1^2p_2p_3^3$ whose weight is 13, when written in the $x$'s will be a homogeneous polynomial of degree 13. Accordingly an isobaric group of terms when expressed in terms of the $x$'s will, since by Theorem 3, § 84, it cannot reduce identically to zero, be homogeneous of the same degree as its original weight. If then $\phi$ were not isobaric, $f$ would not be homogeneous, and our theorem is proved.

Corollary. If $f$ is non-homogeneous and of the mth degree, $\phi$ is non-isobaric and of weight $m$.
Theorem 2. A symmetric polynomial in $(x_1, \dots, x_n)$, when written in terms of the elementary symmetric functions $p_1, \dots, p_n$, will be of the same degree in the $p$'s as it was at first in any one of the $x$'s.

Let $f$ be the symmetric polynomial, and write
$f(x_1, x_2, \dots, x_n) \equiv \phi(p_1, p_2, \dots, p_n)$,
and suppose $f$ is of degree $m$ in $x_1$ (and therefore, on account of the symmetry, in any one of the $x$'s), and that $\phi$ is of degree $\mu$ in the $p$'s. We wish to prove that $m = \mu$. Since the $p$'s are of the first degree in $x_1$, it is clear that $m \le \mu$.

If $\phi$ is non-homogeneous, we can break it up into the sum of a number of homogeneous polynomials by grouping together all the terms of like degree. Each of these homogeneous polynomials in the $p$'s can be expressed (by substituting for the $p$'s their values in terms of the $x$'s) as a symmetric polynomial in the $x$'s. If our theorem were established in the case in which the polynomial in the $p$'s is homogeneous, its truth in the general case would then follow at once.

Let us then assume that $\phi$ is a homogeneous polynomial. The theorem is obviously true when $n = 1$, since then $p_1 = x_1$. It will therefore be completely proved by the method of mathematical induction if, assuming it to hold when the number of $x$'s is $1, 2, \dots, n - 1$, we can prove that it holds when the number of $x$'s is $n$.
For this purpose let us first assume that $p_n$ is not a factor of every term of $\phi$. Then $\phi(p_1, \dots, p_{n-1}, 0)$ is not identically zero but is still a homogeneous polynomial of degree $\mu$ in $(p_1, \dots, p_{n-1})$. Now let $x_n = 0$. This makes $p_n = 0$, and gives the identity
(2) $f(x_1, \dots, x_{n-1}, 0) \equiv \phi(p'_1, \dots, p'_{n-1}, 0)$,
where $p'_1, \dots, p'_{n-1}$ are the elementary symmetric functions of $(x_1, \dots, x_{n-1})$, and $f(x_1, \dots, x_{n-1}, 0)$ is a symmetric polynomial of degree $m_1$ in $x_1$, where $m_1 \le m$. From the assumption that our theorem holds when the number of $x$'s is $n - 1$, we infer from (2) that $\mu = m_1 \le m$; and since we saw above that $\mu$ cannot be less than $m$, we infer that $\mu = m$, as was to be proved.

There remains merely the case to be considered in which $p_n$ is a factor of every term of $\phi$. Let $p_n^k$ be the highest power of $p_n$ which occurs as a factor in $\phi$. Then
$\phi(p_1, \dots, p_n) \equiv p_n^k\,\phi_1(p_1, \dots, p_n)$,
where $\phi_1$ is a polynomial of degree $\mu - k$. Putting in for the $p$'s their values in terms of the $x$'s, we get
(3) $f(x_1, \dots, x_n) \equiv x_1^kx_2^k \cdots x_n^k\,f_1(x_1, \dots, x_n)$,
where
(4) $f_1(x_1, \dots, x_n) \equiv \phi_1(p_1, \dots, p_n)$.
From (3) we see that $f_1$ is of degree $m - k$ in $x_1$, and from (4), since $\phi_1$ does not contain $p_n$ as a factor, that the degrees of $f_1$ in $x_1$, and of $\phi_1$ in the $p$'s are equal,
$m - k = \mu - k$.
From which we see that $m = \mu$, as was to be proved.
The two theorems of this section are not only of theoretical importance, they may also be put to the direct practical use of facilitating the computation of the values of symmetric polynomials in terms of the $p$'s.

In order to illustrate this, let us consider the symmetric function
$f(x_1, \dots, x_n) = \Sigma x_1^2x_2x_3$.
Since $f$ is homogeneous of the fourth degree in the $x$'s, it will, by Theorem 1, be isobaric of weight 4 in the $p$'s. Since it is of the second degree in $x_1$, it will, by Theorem 2, be of the second degree in the $p$'s. Hence
(5) $\Sigma x_1^2x_2x_3 = Ap_1p_3 + Bp_2^2 + Cp_4$,
where $A$, $B$, and $C$ are independent of the number $n$ (Theorem 2, § 84), and may be determined by the ordinary method of undetermined coefficients.

Take $n = 3$, so that $p_4 = 0$. Letting $x_1 = 0$, $x_2 = x_3 = 1$, we have $p_1 = 2$, $p_2 = 1$, $p_3 = 0$. Substituting these values in (5), we find $B = 0$.

Letting $x_1 = -1$, $x_2 = x_3 = 1$, we have $p_1 = 1$, $p_2 = -1$, $p_3 = -1$, which gives $A = 1$.

Now let $n = 4$, $x_1 = x_2 = x_3 = x_4 = 1$. From this we find $p_1 = 4$, $p_2 = 6$, $p_3 = 4$, $p_4 = 1$. Substituting this in (5) gives $C = -4$. Hence
$\Sigma x_1^2x_2x_3 = p_1p_3 - 4p_4$.
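The result just obtained, $\Sigma x_1^2x_2x_3 = p_1p_3 - 4p_4$, can be spot-checked at numerical points, including the three used above to determine $A$, $B$, and $C$. The helper names below are assumptions of this sketch, not notation from the text.

```python
from itertools import combinations, permutations
from math import prod


def sigma(exponents, xs):
    """Value of the Sigma function with the given exponents at the point xs."""
    padded = tuple(exponents) + (0,) * (len(xs) - len(exponents))
    return sum(prod(x ** e for x, e in zip(xs, perm))
               for perm in set(permutations(padded)))


def elementary(k, xs):
    """The elementary symmetric function p_k (zero when k exceeds len(xs))."""
    return sum(prod(c) for c in combinations(xs, k))


def check(xs):
    """Compare Sigma x1^2 x2 x3 with p1 p3 - 4 p4 at the point xs."""
    lhs = sigma((2, 1, 1), xs)
    rhs = elementary(1, xs) * elementary(3, xs) - 4 * elementary(4, xs)
    return lhs == rhs


for point in [(0, 1, 1), (-1, 1, 1), (1, 1, 1, 1), (1, 2, 3, 4, 5)]:
    print(point, check(point))
```

The last point has five variables, which illustrates that the coefficients found with $n = 3$ and $n = 4$ do not depend on $n$.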
EXERCISES
1. The symmetric function
$f(x_1, \dots, x_n) = \Sigma x_1^2x_2x_3 + \Sigma x_1^2x_2^2 + \Sigma x_1x_2x_3x_4$
is homogeneous of the fourth degree in the $x$'s, and is of the second degree in $x_1$; hence, when written in terms of the $p$'s, it will have the same form, $Ap_1p_3 + Bp_2^2 + Cp_4$, as the above example. Compute the values of $A$, $B$, and $C$.
2. If $f(x_1, x_2, x_3) \equiv (x_1 - x_2)^2(x_1 - x_3)^2(x_2 - x_3)^2$, show that
$f(x_1, x_2, x_3) = -27p_3^2 - 4p_2^3 + 18p_1p_2p_3 - 4p_1^3p_3 + p_1^2p_2^2$.
86. The Resultant and the Discriminant of Two Polynomials in One Variable. Let
$f(x) \equiv x^n + a_1x^{n-1} + a_2x^{n-2} + \dots + a_n \equiv (x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_n)$,
$\phi(x) \equiv x^m + b_1x^{m-1} + b_2x^{m-2} + \dots + b_m \equiv (x - \beta_1)(x - \beta_2) \cdots (x - \beta_m)$,
be two polynomials in $x$, and consider the product of the $mn$ factors
(1) $(\alpha_1 - \beta_1)(\alpha_1 - \beta_2) \cdots (\alpha_1 - \beta_m)$
$\quad\ (\alpha_2 - \beta_1)(\alpha_2 - \beta_2) \cdots (\alpha_2 - \beta_m)$
$\quad\ \dots$
$\quad\ (\alpha_n - \beta_1)(\alpha_n - \beta_2) \cdots (\alpha_n - \beta_m)$.
This product vanishes when, and only when, at least one of the $\alpha$'s is equal to one of the $\beta$'s. Its vanishing therefore gives a necessary and sufficient condition that $f$ and $\phi$ have a common factor. Moreover, the product (1), being a symmetric polynomial in the $\alpha$'s and also in the $\beta$'s, can be expressed as a polynomial in the elementary symmetric functions of the $\alpha$'s and $\beta$'s, and therefore as a polynomial in the $a$'s and $b$'s. This will be still more evident if we notice that the product (1) may be written
$\phi(\alpha_1)\phi(\alpha_2) \cdots \phi(\alpha_n)$.
In this form it is a symmetric polynomial in the $\alpha$'s whose coefficients are polynomials in the $b$'s, and it remains merely to bring in the $a$'s in place of the $\alpha$'s.

We thus see that the product (1) may be expressed as a polynomial $F(a_1, \dots, a_n;\ b_1, \dots, b_m)$ in the $a$'s and $b$'s whose vanishing gives a necessary and sufficient condition that $f$ and $\phi$ have a common factor. In § 68 we also found a polynomial in the $a$'s and $b$'s whose vanishing gives a necessary and sufficient condition that $f$ and $\phi$ have a common factor, namely the resultant $R$ of $f$ and $\phi$.

We will now identify these two polynomials by means of the following theorem:
Theoeem 1. The product (1) differs from the resultant B of f
and <f) only hy a constant factor, and the resultant is an irreducible
polynomial in the a's and b's.
In order to prove this theorem we will first show that this prod
uct (1), which we will call F(a^, ■■•«„; 5^, ••• 5„), is irreducible.
This may be done as follows : Suppose F is reducible, and let
Fia^, ••• a„ ; b^,  b^)~F^(a^, ■■■ a„; b^, ■•■ J„) F^ia^, ••• a„ ; Jj, ••• JJ,
where F^ and F^ are polynomials neither of which is a constant.
Then, since the a's and b's are symmetric polynomials in the a's and
(8's, J'l and F^ may be expressed as symmetric polynomials i^j and <f>,
in the a's and ^'s, and we may write
■(«ly8i)(«i^2)"(«l/3™)
(«2  /3i) («2  ySj)  («2  /8™)
.(««/8i)(a„^2)K/3«)
250 INTRODUCTION TO HIGHER ALGEBRA
The factors on the righthand side of this identity being irreducible,
we see that 0j must be composed of some of these binomial factors
and ^2 of ''he others. This, however, is impossible, since neither ^j
nor <^2 would be symmetric. Hence F is irreducible.
Now, since $F = 0$ is a necessary and sufficient condition for $f(x)$
and $\phi(x)$ to have a common factor, and $R = 0$ is the same, any set of
values of the a's and b's which make $F = 0$ will also make $R = 0$. Hence
by the theorem for n + m variables analogous to Theorem 7, § 76, F is
a factor of R. Also, since F is a symmetric polynomial in the $\alpha$'s
and $\beta$'s of degree m in each of the $\alpha$'s and n in each of the $\beta$'s, by
Theorem 2, § 85, it must be of degree m in the a's and n in the b's.
But R is of degree not greater than m in the a's and n in the b's, as is
at once obvious from a glance at the determinant in § 68. Hence F,
being a factor of R, and of degree not lower than R, can differ from
it only by a constant factor. Thus our theorem is proved.
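Theorem 1 can be checked numerically on a small case. The sketch below is an illustration only, not part of the original text: it assumes the usual Sylvester arrangement of the resultant determinant (the arrangement in § 68 may differ from it by a sign), and the helper names `det`, `poly_from_roots` and `sylvester` are hypothetical.

```python
from itertools import product as cartesian


def det(matrix):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0
    for col, entry in enumerate(matrix[0]):
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * entry * det(minor)
    return total


def poly_from_roots(roots):
    """Coefficients [1, a_1, ..., a_n] (highest power first) of prod (x - root)."""
    coeffs = [1]
    for r in roots:
        # Multiply the current polynomial by (x - r).
        coeffs = [hi - r * lo for hi, lo in zip(coeffs + [0], [0] + coeffs)]
    return coeffs


def sylvester(f, g):
    """Sylvester matrix of f (degree n) and g (degree m), assumed arrangement."""
    n, m = len(f) - 1, len(g) - 1
    size = n + m
    rows = [[0] * i + f + [0] * (size - n - 1 - i) for i in range(m)]
    rows += [[0] * i + g + [0] * (size - m - 1 - i) for i in range(n)]
    return rows


alphas, betas = [1, 2], [3, 5]
f, phi = poly_from_roots(alphas), poly_from_roots(betas)

# The product (1): all differences alpha_i - beta_j multiplied together.
root_product = 1
for a, b in cartesian(alphas, betas):
    root_product *= a - b

resultant = det(sylvester(f, phi))
print(root_product, resultant)
```

For these two monic quadratics the two numbers agree, so the constant factor of the theorem is 1 in this arrangement; replacing one of the betas by one of the alphas makes both vanish.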
Let us turn now to the question: Under what conditions does
the polynomial $f(x)$ have a multiple linear factor? Using the same
notation as above, we see that the vanishing of the product
$$\begin{array}{l}
(\alpha_1-\alpha_2)(\alpha_1-\alpha_3)\cdots(\alpha_1-\alpha_n)\\
\qquad(\alpha_2-\alpha_3)\cdots(\alpha_2-\alpha_n)\\
\qquad\qquad\cdots\cdots\\
\qquad\qquad\qquad(\alpha_{n-1}-\alpha_n)
\end{array}
\equiv P(\alpha_1,\dots,\alpha_n)$$
is a necessary and sufficient condition for this. P is not symmetric
in the $\alpha$'s, since an interchange of two subscripts changes P into $-P$.
If, however, we consider $P^2$ in place of P, we have a symmetric
polynomial, which can therefore be expressed as a polynomial in
the a's,
$$[P(\alpha_1,\dots,\alpha_n)]^2 \equiv F(a_1,\dots,a_n).$$
Moreover, $F = 0$ is also a necessary and sufficient condition that $f(x)$
have a multiple linear factor.
On the other hand, it is easily seen that $f(x)$ has a multiple linear
factor when and only when $f(x)$ and $f'(x)$ have a common linear
factor. A necessary and sufficient condition for $f(x)$ to have a multiple
linear factor is therefore the vanishing of the resultant of $f(x)$
and $f'(x)$. This resultant we will call the discriminant $\Delta$ of $f(x)$.
It is obviously a polynomial in the coefficients of $f$.
Theorem 2. The polynomials F and $\Delta$ differ only by a constant
factor, and are irreducible.
The proof of this theorem is similar to the proof of Theorem 1,
and is left to the reader.
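As an illustration of Theorem 2 (added here, not taken from the original), the case n = 2 can be worked out by hand. The determinant below assumes the usual Sylvester arrangement for the resultant of $f$ and $f'$, which may differ in sign from the arrangement of § 68.

```latex
\[
\begin{aligned}
f(x) &\equiv x^2 + a_1 x + a_2 \equiv (x-\alpha_1)(x-\alpha_2),
  \qquad a_1 = -(\alpha_1+\alpha_2),\quad a_2 = \alpha_1\alpha_2,\\
F &\equiv (\alpha_1-\alpha_2)^2 \equiv (\alpha_1+\alpha_2)^2 - 4\alpha_1\alpha_2
   \equiv a_1^2 - 4a_2,\\
\Delta &\equiv
  \begin{vmatrix} 1 & a_1 & a_2\\ 2 & a_1 & 0\\ 0 & 2 & a_1 \end{vmatrix}
  \equiv a_1^2 - 2a_1^2 + 4a_2 \equiv -(a_1^2 - 4a_2) \equiv -F.
\end{aligned}
\]
```

Under that assumption the constant factor of the theorem is $-1$ for a quadratic.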
EXERCISES
1. Compute by the use of symmetric functions the product (1) for the two
polynomials
$$x^2 + a_1x + a_2,$$
$$x^2 + b_1x + b_2,$$
and compare the result with the resultant obtained in determinant form.
2. Verify Theorem 2 by comparing the result of Exercise 2, § 85, with the
discriminant in determinant form of the polynomial
$$x^3 + a_1x^2 + a_2x + a_3.$$
CHAPTER XIX
POLYNOMIALS SYMMETRIC IN PAIRS OF VARIABLES
87. Fundamental Conceptions. $\Sigma$ and S Functions. The variables
$(x_1, \dots, x_n)$ which we used in the last chapter may be regarded, if we
wish, not as the coordinates of a point in space of n dimensions, but
rather as the coordinates of n points on a line. In fact this is the
interpretation which is naturally suggested to us by the ordinary
applications of the theory of symmetric functions (cf. § 86). Looked
at from this point of view, it is natural to generalize the conception
of symmetric functions by considering n points in a plane,
$$(x_1, y_1),\ (x_2, y_2),\ \dots,\ (x_n, y_n). \tag{1}$$
Definition. A polynomial,
$$F(x_1, y_1;\ x_2, y_2;\ \dots;\ x_n, y_n),$$
in the coordinates of the points (1) is said to be a symmetric polynomial
in these pairs of variables if it is unchanged by every interchange of
these pairs of variables.
As in the case of points on a line, we see that it is not necessary
to consider all the possible permutations of the subscripts in order to
show that a polynomial F is symmetric. It is sufficient to show that
F is unchanged by the interchange of every pair of the points (1).
We will introduce the $\Sigma$ notation here precisely as in the case of
single variables. Thus, for example,
$$\Sigma x_1^{\alpha_1}y_1^{\beta_1} \equiv x_1^{\alpha_1}y_1^{\beta_1} + x_2^{\alpha_1}y_2^{\beta_1} + \cdots + x_n^{\alpha_1}y_n^{\beta_1},$$
$$\Sigma x_1^{\alpha_1}y_1^{\beta_1}x_2^{\alpha_2}y_2^{\beta_2} \equiv x_1^{\alpha_1}y_1^{\beta_1}x_2^{\alpha_2}y_2^{\beta_2} + x_1^{\alpha_1}y_1^{\beta_1}x_3^{\alpha_2}y_3^{\beta_2} + \cdots,$$
and so on.
As in the case of single variables, it is clear that the order in
which the pairs of exponents $\alpha_1, \beta_1;\ \alpha_2, \beta_2;\ \dots$ are written is
immaterial; and also that every symmetric polynomial in the pairs of variables
(1) is a linear combination of a certain number of $\Sigma$'s.
We introduce the notation
$$S_{ij} \equiv \Sigma x_1^i y_1^j \equiv x_1^i y_1^j + x_2^i y_2^j + \cdots + x_n^i y_n^j \qquad (i = 0, 1, \dots;\ j = 0, 1, \dots).$$
Theorem. Any symmetric polynomial $F(x_1, y_1;\ \dots;\ x_n, y_n)$ may
be expressed as a polynomial in these S's.
The proof of this theorem is exactly like that of Theorem 3, § 88,
and is left to the reader.
88. Elementary Symmetric Functions of Pairs of Variables.
Every $\Sigma$ function of n pairs of variables may, by giving to the $\alpha$'s
and $\beta$'s suitable values, be written in the form
$$\Sigma x_1^{\alpha_1}y_1^{\beta_1}x_2^{\alpha_2}y_2^{\beta_2}\cdots x_n^{\alpha_n}y_n^{\beta_n}. \tag{1}$$
Definition. The function (1) is said to be an elementary symmetric
function of the pairs of variables $(x_1, y_1), \dots, (x_n, y_n)$ when, and
only when,
$$\alpha_i + \beta_i \le 1 \qquad (i = 1, 2, \dots, n),$$
but not all the $\alpha$'s and $\beta$'s are zero.
We shall adopt the following notation for these elementary symmetric
functions:
$$p_{10} \equiv \Sigma x_1, \qquad p_{01} \equiv \Sigma y_1,$$
$$p_{20} \equiv \Sigma x_1x_2, \qquad p_{11} \equiv \Sigma x_1y_2, \qquad p_{02} \equiv \Sigma y_1y_2,$$
$$\cdots\cdots$$
$$p_{n0} \equiv x_1x_2\cdots x_n, \qquad p_{n-1,1} \equiv \Sigma x_1\cdots x_{n-1}y_n, \qquad \dots \qquad p_{0n} \equiv y_1y_2\cdots y_n.$$
It is clear that there are a finite number, $\tfrac{1}{2}n(n+3)$, of $p_{ij}$'s, but
an infinite number of $S_{ij}$'s.
We will attach to each $p_{ij}$ a weight with regard to the x's equal to its
first subscript and a weight with regard to the y's equal to its second
subscript. When we speak simply of the weight of $p_{ij}$, we will mean
its total weight, that is, the sum of its subscripts.
Theorem. Any symmetric polynomial $F(x_1, y_1;\ \dots;\ x_n, y_n)$ may be
expressed as a polynomial in the $p_{ij}$'s.
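Since, by the theorem in § 87, any such polynomial may be
expressed as a polynomial in the $S_{ij}$'s, it is sufficient to show that
the $S_{ij}$'s may be expressed as polynomials in the $p_{ij}$'s.
Let
$$\xi_i \equiv \lambda x_i + \mu y_i \qquad (i = 1, 2, \dots, n),$$
and form the elementary symmetric functions of these $\xi$'s:
$$\pi_1 \equiv \Sigma\xi_1 \equiv \lambda\,\Sigma x_1 + \mu\,\Sigma y_1 \equiv \lambda p_{10} + \mu p_{01},$$
$$\pi_2 \equiv \Sigma\xi_1\xi_2 \equiv \lambda^2 p_{20} + \lambda\mu\, p_{11} + \mu^2 p_{02},$$
$$\pi_3 \equiv \Sigma\xi_1\xi_2\xi_3 \equiv \lambda^3 p_{30} + \lambda^2\mu\, p_{21} + \lambda\mu^2 p_{12} + \mu^3 p_{03},$$
$$\cdots\cdots$$
$$\pi_n \equiv \xi_1\xi_2\cdots\xi_n \equiv \lambda^n p_{n0} + \lambda^{n-1}\mu\, p_{n-1,1} + \lambda^{n-2}\mu^2 p_{n-2,2} + \cdots + \mu^n p_{0n}.$$
Also let
$$\sigma_k \equiv \Sigma\xi_1^k \qquad (k = 1, 2, \dots).$$
Let $\alpha$ and $\beta$ be positive integers, or zero, but not both zero.
Then
$$\sigma_{\alpha+\beta} \equiv \Sigma\xi_1^{\alpha+\beta} \equiv \lambda^{\alpha+\beta}\,\Sigma x_1^{\alpha+\beta} + \cdots + \frac{(\alpha+\beta)!}{\alpha!\,\beta!}\,\lambda^{\alpha}\mu^{\beta}\,\Sigma x_1^{\alpha}y_1^{\beta} + \cdots + \mu^{\alpha+\beta}\,\Sigma y_1^{\alpha+\beta}.$$
But by Theorem 1, § 84, we may write
$$\sigma_{\alpha+\beta} \equiv F(\pi_1, \dots, \pi_n),$$
where F is a polynomial. Hence
$$\sigma_{\alpha+\beta} \equiv \Psi(p_{10}, \dots, p_{0n};\ \lambda, \mu),$$
where $\Psi$ is a polynomial. Regarding this as an identity in $(\lambda, \mu)$
and equating the coefficients of the terms containing $\lambda^{\alpha}\mu^{\beta}$, we get an
identity in the x's and y's,
$$S_{\alpha\beta} \equiv \Phi(p_{10}, \dots, p_{0n}),$$
where $\Phi$ is a polynomial in the p's. Thus our theorem is proved.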
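The method of this proof can be followed through on the simplest mixed sum. The worked case below is an added illustration under the notation of this section ($\xi_i \equiv \lambda x_i + \mu y_i$, with $\pi_k$ and $\sigma_k$ the elementary symmetric functions and power sums of the $\xi$'s); it takes $\alpha = \beta = 1$ and uses the familiar relation $\sigma_2 \equiv \pi_1^2 - 2\pi_2$.

```latex
\[
\begin{aligned}
\sigma_2 &\equiv \lambda^2\,\Sigma x_1^2 + 2\lambda\mu\,\Sigma x_1y_1 + \mu^2\,\Sigma y_1^2,\\
\pi_1^2 - 2\pi_2 &\equiv (\lambda p_{10} + \mu p_{01})^2
   - 2(\lambda^2 p_{20} + \lambda\mu\,p_{11} + \mu^2 p_{02}).
\end{aligned}
\]
Equating the coefficients of $\lambda\mu$ on the two sides gives
\[
2S_{11} \equiv 2p_{10}p_{01} - 2p_{11},
\qquad\text{that is,}\qquad
S_{11} \equiv \Sigma x_1y_1 \equiv p_{10}p_{01} - p_{11}.
\]
```

The coefficients of $\lambda^2$ and $\mu^2$ give, in the same way, $S_{20} \equiv p_{10}^2 - 2p_{20}$ and $S_{02} \equiv p_{01}^2 - 2p_{02}$.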
Theorem 3, § 84, does not hold in the case of pairs of variables,
as relations between the $\tfrac{1}{2}n(n+3)$ $p_{ij}$'s do exist; for example, if
n = 2, the polynomial
$$4p_{20}p_{02} - p_{20}p_{01}^2 - p_{10}^2p_{02} + p_{10}p_{11}p_{01} - p_{11}^2$$
vanishes identically when the p's are replaced by their values in
terms of the x's and y's. It does not vanish identically when n = 3.
In view of the remark just made, it is clear that the representations
of polynomials in pairs of variables in terms of the $p_{ij}$'s will
not be unique.
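The relation quoted for n = 2 can be tested numerically. The following sketch is an added illustration with hypothetical helper names; it evaluates the five functions $p_{10}, p_{01}, p_{20}, p_{11}, p_{02}$ on integer points and substitutes them in the polynomial above.

```python
from itertools import combinations, permutations, product


def elementary_p(points):
    """Return (p10, p01, p20, p11, p02) for a list of (x, y) pairs."""
    p10 = sum(x for x, _ in points)
    p01 = sum(y for _, y in points)
    p20 = sum(a[0] * b[0] for a, b in combinations(points, 2))
    p02 = sum(a[1] * b[1] for a, b in combinations(points, 2))
    # p11 is the sum of x_i * y_j over all ordered pairs with i != j.
    p11 = sum(a[0] * b[1] for a, b in permutations(points, 2))
    return p10, p01, p20, p11, p02


def relation(points):
    """Value of 4 p20 p02 - p20 p01^2 - p10^2 p02 + p10 p11 p01 - p11^2."""
    p10, p01, p20, p11, p02 = elementary_p(points)
    return (4 * p20 * p02 - p20 * p01 ** 2 - p10 ** 2 * p02
            + p10 * p11 * p01 - p11 ** 2)


# n = 2: the relation vanishes for every choice of the two points tried here.
grid = list(product(range(-2, 3), repeat=2))
all_zero_for_two_points = all(relation([p, q]) == 0 for p in grid for q in grid)

# n = 3: one set of points where it does not vanish.
three_point_value = relation([(1, 1), (2, 0), (3, 0)])
print(all_zero_for_two_points, three_point_value)
```

A finite grid is of course only a sanity check, not a proof of the identity for n = 2; a single nonzero value, on the other hand, is enough to show that the polynomial does not vanish identically for n = 3.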
For further information concerning the subjects treated in this
section, the reader may consult Netto's Algebra, Vol. 2, p. 63.
EXERCISES
1. Prove that a polynomial symmetric in the pairs of variables $(x_i, y_i)$ and
which is homogeneous in the x's alone of degree n and in the y's alone of degree m
can be expressed as a polynomial in the $p_{ij}$'s isobaric of weight n with regard to
the x's, and m with regard to the y's.
2. Express the symmetric polynomial
$$\Sigma x_1^2 y_2 y_3$$
in terms of the $p_{ij}$'s by the method of undetermined coefficients, making use of
Exercise 1.
3. A polynomial in $(x_1, y_1, z_1;\ x_2, y_2, z_2;\ \dots;\ x_n, y_n, z_n)$ which is unchanged by
every interchange of the subscripts is called a symmetric polynomial in the n
points $(x_i, y_i, z_i)$.
Extend the results of this section and the last to polynomials of this sort.
89. Binary Symmetric Functions. The pairs of variables $(x_1, y_1),
\dots, (x_n, y_n)$ may be regarded as the homogeneous coordinates of n
points on a line as well as the nonhomogeneous coordinates of n
points in a plane. It will then be natural to consider only symmetric
polynomials which are homogeneous in each pair of variables
alone. Such polynomials we will call binary symmetric functions.
Most of the $p_{ij}$'s of the last section are thus excluded. The last
n + 1 of them $(p_{n0}, p_{n-1,1}, \dots, p_{0n})$, however, are homogeneous of
the first degree in each pair of variables alone. We will call them
the elementary binary symmetric functions.
Theorem 1. Any binary symmetric function in $(x_1, y_1;\ \dots;\ x_n, y_n)$
can be expressed as a polynomial in $(p_{n0}, p_{n-1,1}, \dots, p_{0n})$.
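If we break up our binary symmetric function into $\Sigma$'s, it is clear
that each of these $\Sigma$'s will itself be a binary symmetric function, or,
as we will say for brevity, a binary $\Sigma$. It is therefore sufficient to
prove that our theorem is true for every binary $\Sigma$. The general
binary $\Sigma$ may be written
$$\Sigma x_1^{\alpha_1}y_1^{\beta_1}x_2^{\alpha_2}y_2^{\beta_2}\cdots x_n^{\alpha_n}y_n^{\beta_n} \qquad (\alpha_1 \ge \alpha_2 \ge \cdots \ge \alpha_n),$$
where, if we denote by m the degree of this $\Sigma$ in any one of the pairs
of variables,
$$m = \alpha_1 + \beta_1 = \alpha_2 + \beta_2 = \cdots = \alpha_n + \beta_n.$$
Let us assume for the moment that none of the y's are zero, and let
$$X_i = \frac{x_i}{y_i} \qquad (i = 1, 2, \dots, n).$$
Now consider the elementary symmetric functions of these X's:
$$P_1 \equiv \Sigma X_1 \equiv \frac{p_{1,n-1}}{p_{0n}}, \qquad
P_2 \equiv \Sigma X_1X_2 \equiv \frac{p_{2,n-2}}{p_{0n}}, \qquad \dots, \qquad
P_n \equiv X_1X_2\cdots X_n \equiv \frac{p_{n0}}{p_{0n}}.$$
We may write
$$\frac{\Sigma x_1^{\alpha_1}y_1^{\beta_1}\cdots x_n^{\alpha_n}y_n^{\beta_n}}{y_1^m y_2^m\cdots y_n^m} \equiv \Sigma X_1^{\alpha_1}X_2^{\alpha_2}\cdots X_n^{\alpha_n} \equiv \Phi(P_1, \dots, P_n), \tag{1}$$
where, since we have assumed $\alpha_1 \ge \alpha_2 \ge \cdots \ge \alpha_n$, $\Phi$ is a polynomial
of degree $\alpha_1$ in the P's (Theorem 2, § 85). Hence we may write
$$\Phi(P_1, \dots, P_n) \equiv \frac{\phi(p_{0n}, p_{1,n-1}, \dots, p_{n0})}{p_{0n}^{\alpha_1}}, \tag{2}$$
where $\phi$ is a homogeneous polynomial of degree $\alpha_1$.
We thus get from (1) and (2)
$$\Sigma x_1^{\alpha_1}y_1^{\beta_1}\cdots x_n^{\alpha_n}y_n^{\beta_n} = p_{0n}^{\beta_1}\,\phi(p_{0n}, p_{1,n-1}, \dots, p_{n0}), \tag{3}$$
an equation which holds except when one of the y's is zero. Since
each side of (3) can be regarded as a polynomial in the x's and y's,
we infer, by Theorem 5, § 2, that this is an identity, and our theorem
is proved.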
By Theorem 1, § 85, $\Phi$ is isobaric of weight $\alpha_1 + \alpha_2 + \cdots + \alpha_n$ in
the P's. Hence $\Sigma x_1^{\alpha_1}y_1^{\beta_1}\cdots x_n^{\alpha_n}y_n^{\beta_n}$, when expressed in terms of these
(n + 1) $p_{ij}$'s, is isobaric of weight $\alpha_1 + \alpha_2 + \cdots + \alpha_n$, provided we count
the weight of the $p_{ij}$'s with regard to the x's. Passing back now to
an aggregate of a number of such $\Sigma$'s, we get
Theorem 2. If a binary symmetric function is homogeneous in
the n x's (or y's) of degree k, it will, when expressed in terms of
$p_{n0}, p_{n-1,1}, \dots, p_{0n}$, be isobaric of weight k with regard to the x's (or y's).
We have seen in the proof of Theorem 1 that the polynomial $\phi$ in
(3) is a homogeneous polynomial of degree $\alpha_1$ in the p's; so that
$\Sigma x_1^{\alpha_1}y_1^{\beta_1}\cdots x_n^{\alpha_n}y_n^{\beta_n}$ is a homogeneous polynomial of degree $\alpha_1 + \beta_1 = m$
in the p's. Hence
Theorem 3. Any binary symmetric function of degree m in each
pair of variables will, when written in terms of $p_{n0}, p_{n-1,1}, \dots, p_{0n}$, be a
homogeneous polynomial of degree m in these p's.
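A small worked case may make the three theorems of this section concrete. It is an added illustration, not part of the original: take n = 2 and the binary $\Sigma$ with exponents $\alpha_1 = 2$, $\beta_1 = 0$, $\alpha_2 = 0$, $\beta_2 = 2$, so that m = 2, and follow the steps of the proof of Theorem 1 with $X_i = x_i/y_i$.

```latex
\[
\begin{aligned}
\frac{x_1^2y_2^2 + x_2^2y_1^2}{y_1^2y_2^2}
  &\equiv X_1^2 + X_2^2 \equiv P_1^2 - 2P_2
   \equiv \frac{p_{11}^2 - 2p_{20}p_{02}}{p_{02}^2},\\
\Sigma x_1^2y_2^2 &\equiv x_1^2y_2^2 + x_2^2y_1^2 \equiv p_{11}^2 - 2p_{20}p_{02},
\end{aligned}
\]
where $p_{20} \equiv x_1x_2$, $p_{11} \equiv x_1y_2 + x_2y_1$, $p_{02} \equiv y_1y_2$.
```

The result is homogeneous of degree m = 2 in the p's, as Theorem 3 requires, and each term has weight 2 with regard to the x's, in agreement with Theorem 2.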
EXERCISES
1. Prove that no rational relation exists between $p_{n0}, \dots, p_{0n}$, and hence that a
binary symmetric function can be expressed as a polynomial in them in only one
way.
2. By a ternary symmetric function is meant a symmetric polynomial in n
points $(x_i, y_i, z_i)$ which is homogeneous in the coordinates of each point.
Extend the results of this section to ternary symmetric functions. Cf. Exercise 3,
§ 88.
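90. Resultants and Discriminants of Binary Forms. It is the
object of the present section to show how the subject of the resultants
and discriminants of binary forms may be approached from
the point of view of symmetric functions.
Let
$$\begin{aligned}
f(x_1, x_2) &\equiv a_0x_1^n + a_1x_1^{n-1}x_2 + \cdots + a_nx_2^n\\
&\equiv (\alpha_1''x_1 - \alpha_1'x_2)(\alpha_2''x_1 - \alpha_2'x_2)\cdots(\alpha_n''x_1 - \alpha_n'x_2),\\
\phi(x_1, x_2) &\equiv b_0x_1^m + b_1x_1^{m-1}x_2 + \cdots + b_mx_2^m\\
&\equiv (\beta_1''x_1 - \beta_1'x_2)(\beta_2''x_1 - \beta_2'x_2)\cdots(\beta_m''x_1 - \beta_m'x_2)
\end{aligned}$$
be two binary forms. Each of these polynomials has here been
written first in the unfactored and secondly in the factored form.
By a comparison of these two forms we see at once that the elementary
binary symmetric functions of the n points
$$(\alpha_1', \alpha_1''),\ (\alpha_2', \alpha_2''),\ \dots,\ (\alpha_n', \alpha_n'')$$
are
$$a_0,\ -a_1,\ a_2,\ \dots,\ (-1)^na_n,$$
and of the m points $(\beta_1', \beta_1''),\ (\beta_2', \beta_2''),\ \dots,\ (\beta_m', \beta_m'')$
are
$$b_0,\ -b_1,\ b_2,\ \dots,\ (-1)^mb_m.$$
Let us now consider the two linear factors
$$\alpha_i''x_1 - \alpha_i'x_2, \qquad \beta_j''x_1 - \beta_j'x_2.$$
A necessary and sufficient condition for these factors to be proportional
is that the determinant
$$\begin{vmatrix} \alpha_i'' & \alpha_i'\\ \beta_j'' & \beta_j' \end{vmatrix} \equiv \alpha_i''\beta_j' - \alpha_i'\beta_j''$$
vanish. Let us form the product of all such determinants:
$$P \equiv
\begin{array}{l}
(\alpha_1''\beta_1' - \alpha_1'\beta_1'')(\alpha_1''\beta_2' - \alpha_1'\beta_2'')\cdots(\alpha_1''\beta_m' - \alpha_1'\beta_m'')\\
(\alpha_2''\beta_1' - \alpha_2'\beta_1'')(\alpha_2''\beta_2' - \alpha_2'\beta_2'')\cdots(\alpha_2''\beta_m' - \alpha_2'\beta_m'')\\
\qquad\cdots\cdots\\
(\alpha_n''\beta_1' - \alpha_n'\beta_1'')(\alpha_n''\beta_2' - \alpha_n'\beta_2'')\cdots(\alpha_n''\beta_m' - \alpha_n'\beta_m'').
\end{array}$$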
The vanishing of this product is a necessary and sufficient condition
that at least one of the linear factors of $f$ be proportional to
one of the linear factors of $\phi$, that is, that $f$ and $\phi$ have a common
factor which is not a constant.
We may obviously reduce P to the simple form
$$P \equiv f(\beta_1', \beta_1'')\,f(\beta_2', \beta_2'')\cdots f(\beta_m', \beta_m'').$$
In this form it appears as a homogeneous polynomial of the mth
degree in the a's, and as a symmetric polynomial in the m points
$(\beta_j', \beta_j'')$. Moreover, it is obviously a binary symmetric function which
is of the nth degree in the coordinates of each point. Consequently,
by Theorem 3, § 89, it can be expressed as a homogeneous polynomial
of the nth degree in the elementary binary symmetric functions of
the points $(\beta_j', \beta_j'')$, that is, in the b's. Thus we have shown that the
product P can be expressed as a polynomial in the a's and b's which is
homogeneous in the a's of degree m and in the b's of degree n.
In § 72 we found another polynomial in the a's and b's, whose
vanishing also gives a necessary and sufficient condition for $f$ and $\phi$
to have a common factor, namely, the resultant R. We will now
identify these polynomials by means of the following theorem:
Theorem 1. The product P differs from the resultant R of $f$ and
$\phi$ only by a constant factor, and the resultant is an irreducible
polynomial in the a's and b's.
We may show, in exactly the same way as in the proof of Theorem 1,
§ 86, that P, when expressed as a polynomial in the a's and
b's, is irreducible. Since $P = 0$ and $R = 0$ each give a necessary and
sufficient condition for $f$ and $\phi$ to have a common factor, any set of
values of the a's and b's which make $P = 0$ will also make $R = 0$.
Thus by Theorem 7, § 76, P is a factor of R. We have seen that P
is of degree m in the a's and n in the b's. The same is also true of
R, as may easily be seen by inspection of the determinant of § 68.
Hence, P being a factor of R, and of the same degree, can differ
from it only by a constant factor. Thus our theorem is proved.
Let us now inquire under what conditions the binary form $f(x_1, x_2)$
has a multiple linear factor. Using the same notation as above, we
see that the vanishing of the product
$$\begin{array}{l}
(\alpha_1''\alpha_2' - \alpha_1'\alpha_2'')(\alpha_1''\alpha_3' - \alpha_1'\alpha_3'')\cdots(\alpha_1''\alpha_n' - \alpha_1'\alpha_n'')\\
\qquad\cdots\cdots\\
\qquad\qquad(\alpha_{n-1}''\alpha_n' - \alpha_{n-1}'\alpha_n'')
\end{array}
\equiv P_1(\alpha_1', \alpha_1'';\ \dots;\ \alpha_n', \alpha_n'')$$
is a necessary and sufficient condition for this. $P_1$ is not symmetric
in the pairs of $\alpha$'s, since an interchange of two subscripts changes $P_1$
into $-P_1$. If, however, we consider $P_1^2$ instead of $P_1$, we have a
binary symmetric function which can be expressed as a polynomial
in the a's,
$$[P_1(\alpha_1', \alpha_1'';\ \dots;\ \alpha_n', \alpha_n'')]^2 \equiv F(a_0, \dots, a_n).$$
Moreover, F vanishes when, and only when, $P_1$ does. Accordingly
$F = 0$ is a necessary and sufficient condition for $f(x_1, x_2)$ to have a
multiple linear factor.
But the vanishing of the discriminant $\Delta$ (cf. § 82) of $f(x_1, x_2)$
is also a necessary and sufficient condition for this.
Theorem 2. F and $\Delta$ differ only by a constant factor, and are
irreducible.
The proof of this theorem, which is practically the same as that
of Theorem 1, is left to the reader.
If we subject the two binary forms $f$ and $\phi$, which we may suppose
written in the factored form, to the linear transformation
$$\begin{cases} x_1 = c_{11}x_1' + c_{12}x_2',\\ x_2 = c_{21}x_1' + c_{22}x_2', \end{cases} \tag{1}$$
we get two new binary forms
$$(A_1''x_1' - A_1'x_2')(A_2''x_1' - A_2'x_2')\cdots(A_n''x_1' - A_n'x_2'),$$
$$(B_1''x_1' - B_1'x_2')(B_2''x_1' - B_2'x_2')\cdots(B_m''x_1' - B_m'x_2'),$$
where
$$A_i'' = \alpha_i''c_{11} - \alpha_i'c_{21}, \qquad B_j'' = \beta_j''c_{11} - \beta_j'c_{21},$$
$$A_i' = -\alpha_i''c_{12} + \alpha_i'c_{22}, \qquad B_j' = -\beta_j''c_{12} + \beta_j'c_{22},$$
so that
$$A_i''B_j' - A_i'B_j'' \equiv c\,(\alpha_i''\beta_j' - \alpha_i'\beta_j''),$$
where c is the determinant of the transformation (1).
Since the linear transformation (1) may be regarded as carrying
over the $\alpha$'s and $\beta$'s into the A's and B's, the last written identity
shows us that $\alpha_i''\beta_j' - \alpha_i'\beta_j''$ is, in a certain sense, an invariant of
weight 1. It can, however, not be expressed rationally in terms of
the a's and b's. Such an expression is called an irrational invariant.
Since the resultant of $f$ and $\phi$ is the product of mn such irrational
invariants of weight 1, it is evident that the resultant itself is
an invariant of weight mn. Thus we get a new proof of this fact,
independent of the proof given in § 82.
A similar proof can be used in the case of the discriminant of a
binary form.
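The behaviour of these determinants under a linear transformation is easy to confirm with integers. The sketch below is an added illustration: the function names are hypothetical, each point is stored as the pair $(\alpha', \alpha'')$, and the transformed coordinates are obtained by substituting $x_1 = c_{11}x_1' + c_{12}x_2'$, $x_2 = c_{21}x_1' + c_{22}x_2'$ in a factor $\alpha''x_1 - \alpha'x_2$.

```python
def transform_point(point, c):
    """Carry (alpha', alpha'') into (A', A'') under the substitution with matrix c."""
    a1, a2 = point                      # a1 = alpha', a2 = alpha''
    (c11, c12), (c21, c22) = c
    new_a1 = -a2 * c12 + a1 * c22       # A'
    new_a2 = a2 * c11 - a1 * c21        # A''
    return new_a1, new_a2


def bracket(p, q):
    """The determinant alpha'' beta' - alpha' beta'' of two points."""
    return p[1] * q[0] - p[0] * q[1]


def bracket_product(alphas, betas):
    """Product of all the determinants, one for each pair (alpha_i, beta_j)."""
    result = 1
    for a in alphas:
        for b in betas:
            result *= bracket(a, b)
    return result


c = ((2, 1), (1, 3))
det_c = c[0][0] * c[1][1] - c[0][1] * c[1][0]

alphas = [(1, 4), (0, 1)]               # n = 2 points
betas = [(-2, 3), (1, 1), (2, 5)]       # m = 3 points

new_alphas = [transform_point(p, c) for p in alphas]
new_betas = [transform_point(p, c) for p in betas]

before = bracket_product(alphas, betas)
after = bracket_product(new_alphas, new_betas)
print(det_c, before, after)
```

Each single determinant is multiplied by the determinant of the transformation, so the product of all mn of them is multiplied by its mn-th power, which is the sense in which the text calls it an invariant of weight mn.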
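EXERCISES
Develop the theory of the invariants of the binary biquadratic
$$\begin{aligned}
f(x_1, x_2) &\equiv a_0x_1^4 + 4a_1x_1^3x_2 + 6a_2x_1^2x_2^2 + 4a_3x_1x_2^3 + a_4x_2^4\\
&\equiv (\alpha_1''x_1 - \alpha_1'x_2)(\alpha_2''x_1 - \alpha_2'x_2)(\alpha_3''x_1 - \alpha_3'x_2)(\alpha_4''x_1 - \alpha_4'x_2)
\end{aligned}$$
along the following lines:
1. Start from the irrational invariants of weight 2,
$$A = (\alpha_1''\alpha_2' - \alpha_1'\alpha_2'')(\alpha_3''\alpha_4' - \alpha_3'\alpha_4''),$$
$$B = (\alpha_1''\alpha_3' - \alpha_1'\alpha_3'')(\alpha_4''\alpha_2' - \alpha_4'\alpha_2''),$$
$$C = (\alpha_1''\alpha_4' - \alpha_1'\alpha_4'')(\alpha_2''\alpha_3' - \alpha_2'\alpha_3''),$$
whose sum is zero, and the negatives of whose ratios are the cross-ratios of the
four points $(\alpha_1', \alpha_1''),\ (\alpha_2', \alpha_2''),\ (\alpha_3', \alpha_3''),\ (\alpha_4', \alpha_4'')$.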
2. Form the further irrational invariants of weight 2
$$E_1 = B - C, \qquad E_2 = C - A, \qquad E_3 = A - B;$$
and prove that every homogeneous symmetric polynomial in $E_1, E_2, E_3$ is a
binary symmetric function of the four points $(\alpha_i', \alpha_i'')$, and therefore an integral
rational invariant of $f$.
3. In particular
$$G_2 = E_1E_2 + E_2E_3 + E_3E_1, \qquad G_3 = E_1E_2E_3$$
are homogeneous integral rational invariants of weights 4 and 6, and of degrees 2
and 3 respectively. Prove that
$$G_2 = -36\,g_2, \qquad G_3 = 432\,g_3,$$
where
$$g_2 = a_0a_4 - 4a_1a_3 + 3a_2^2,$$
$$g_3 = a_0a_2a_4 + 2a_1a_2a_3 - a_0a_3^2 - a_1^2a_4 - a_2^3.$$
These expressions $g_2$ and $g_3$ are the simplest invariants of $f$.*
4. Prove that the discriminant $\Delta$ of $f$ is given by the formula
$$\Delta = g_2^3 - 27g_3^2.$$
5. If $\Delta \ne 0$, prove that $g_3 = 0$ is a necessary and sufficient condition that the
four points $f = 0$ form a harmonic range; and that $g_2 = 0$ is a necessary and
sufficient condition that they form an equianharmonic range. (Cf. the exercises
of § 33.)
6. Prove that $g_2 = g_3 = 0$ is a necessary and sufficient condition that $f$ have
at least a threefold linear factor.†
* They are among the oldest examples of invariants, having been found by Cayley
and Boole in 1845.
† Notice that we here have a projective property of the locus $f = 0$ expressed by
the vanishing of two integral rational invariants; cf. the closing paragraph of § 81.
7. If $\lambda$ is the absolute irrational invariant given by the negative of one of
the ratios of A, B, C, i.e. one of the cross-ratios of the points $f = 0$, prove that
the absolute rational invariant
$$I = \frac{g_2^3}{\Delta}$$
can be expressed in the form
$$I = \frac{4}{27}\,\frac{(\lambda^2 - \lambda + 1)^3}{(\lambda - 1)^2\lambda^2}.$$
8. Prove that a necessary and sufficient condition for the equivalence of two
biquadratic binary forms neither of whose discriminants is zero is that the
invariant I have the same value for the two forms.
9. Prove that a necessary and sufficient condition for the equivalence with
regard to linear transformations with determinant + 1 of two biquadratic binary
forms for which $g_2$ and $g_3$ are both different from zero is that the values of $g_2$
and $g_3$ be the same for one form as for the other.
10. Prove that if the discriminant of a biquadratic binary form is not zero, the
form can be reduced by means of a linear transformation of determinant + 1 to the
normal form
$$4x_1^3x_2 - g_2x_1x_2^3 - g_3x_2^4.$$
11. Prove that every integral rational invariant of a biquadratic binary form
is a polynomial in $g_2$ and $g_3$.
12. Develop the theory of the invariants of a pair of binary quadratic forms
along the same lines as those just sketched for a single biquadratic form.
13. Prove that every integral rational invariant of a pair of quadratic forms
in n variables is an integral rational function of the invariants $\Theta_0, \dots, \Theta_n$ of § 57.
[Suggestion. Show first that, provided a certain integral rational function of
the coefficients of the quadratic form does not vanish, there exists a linear
transformation of determinant + 1 which reduces the pair of forms to
$$\alpha_1x_1^2 + \alpha_2x_2^2 + \cdots + \alpha_nx_n^2,$$
$$\beta_1x_1^2 + \beta_2x_2^2 + \cdots + \beta_nx_n^2.$$
Then show that every integral rational invariant of the pair of quadratic forms can be
expressed as a binary symmetric function of $(\alpha_1, \beta_1),\ (\alpha_2, \beta_2),\ \dots,\ (\alpha_n, \beta_n)$, and that
the $\Theta$'s are precisely the elementary binary symmetric functions.]
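Before attempting the proofs asked for in Exercises 1 and 3 on the binary biquadratic, it can be reassuring to check the stated relations on one numerical quartic. The sketch below is an added illustration only and proves nothing; the function names are hypothetical, points are stored as $(\alpha', \alpha'')$, and the quartic is built from its four linear factors.

```python
from fractions import Fraction
from math import comb


def quartic_from_points(points):
    """Expand prod (a2*x1 - a1*x2); returns coefficients of x1^4, x1^3 x2, ..., x2^4."""
    coeffs = [1]
    for a1, a2 in points:
        # Multiply by the linear factor a2*x1 - a1*x2.
        coeffs = [a2 * hi - a1 * lo for hi, lo in zip(coeffs + [0], [0] + coeffs)]
    return coeffs


def bracket(p, q):
    """The determinant alpha_i'' alpha_j' - alpha_i' alpha_j''."""
    return p[1] * q[0] - p[0] * q[1]


# Four points: the factors are x1, x2, x1 - x2 and x1 - 3*x2.
pts = [(0, 1), (-1, 0), (1, 1), (3, 1)]
plain = quartic_from_points(pts)

# The a_k of the text carry binomial coefficients: plain[k] = C(4, k) * a_k.
a = [Fraction(c, comb(4, k)) for k, c in enumerate(plain)]
g2 = a[0] * a[4] - 4 * a[1] * a[3] + 3 * a[2] ** 2
g3 = (a[0] * a[2] * a[4] + 2 * a[1] * a[2] * a[3]
      - a[0] * a[3] ** 2 - a[1] ** 2 * a[4] - a[2] ** 3)

A = bracket(pts[0], pts[1]) * bracket(pts[2], pts[3])
B = bracket(pts[0], pts[2]) * bracket(pts[3], pts[1])
C = bracket(pts[0], pts[3]) * bracket(pts[1], pts[2])
E1, E2, E3 = B - C, C - A, A - B
G2 = E1 * E2 + E2 * E3 + E3 * E1
G3 = E1 * E2 * E3
print(A + B + C, G2, -36 * g2, G3, 432 * g3)
```

For this quartic the sum A + B + C is zero and the two pairs of numbers agree, consistent with the relations $G_2 = -36g_2$ and $G_3 = 432g_3$.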
CHAPTER XX
ELEMENTARY DIVISORS AND THE EQUIVALENCE OF
$\lambda$-MATRICES
91. $\lambda$-Matrices and their Elementary Transformations. The theory
of elementary divisors, invented by Sylvester, H. J. S. Smith, and,
more particularly, Weierstrass, and perfected in important respects
by Kronecker, Frobenius, and others, has, in the form in which we
will present it,* for its immediate purpose the study of matrices
(which without loss of generality we assume to be square) whose
elements are polynomials in a single variable $\lambda$. Such matrices we
will call $\lambda$-matrices.† The determinant of a $\lambda$-matrix is a polynomial
in $\lambda$, and if this determinant vanishes identically, we will call the
matrix a singular $\lambda$-matrix. By the rank of a $\lambda$-matrix we understand
the order of the largest determinant of the matrix which is
not identically zero.
We have occasion here, as in § 19, to consider certain elementary
transformations which we define as follows:
Definition 1. By an elementary transformation of a $\lambda$-matrix
we understand a transformation of any one of the following forms:
(a) The interchange of two rows or of two columns.
(b) The multiplication of each element of a row (or of a column) by
the same constant not zero.
(c) The addition to the elements of a row (or column) of the products
of the corresponding elements of another row (or column) by one and
the same polynomial in $\lambda$.
* Various modifications of the point of view here adopted are possible and
important. First, we may consider matrices whose elements are polynomials in any
number of variables. Secondly, we may confine ourselves to polynomials whose coefficients
lie in a certain domain of rationality. Thirdly, we may approach the subject from the
side of the theory of numbers, assuming that the coefficients of the polynomials are
integers. The simplest case here would be that in which the elements of the matrix
are themselves integers; see Exercise 2, § 91, Exercise 3, § 92, and Exercise 2, § 94.
† The matrix of a pencil of quadratic forms is an important example of a $\lambda$-matrix
to which the general theory will be applied in Chapter XXII.
If we pass from a first matrix to a second by an elementary
transformation, it is clear that we can pass back from the second to the
first by an elementary transformation. Thus the following definition
is justified:
Definition 2. Two $\lambda$-matrices are said to be equivalent if it is
possible to pass from one to the other by means of a finite number of
elementary transformations.
We see here that all $\lambda$-matrices equivalent to a given matrix are
equivalent to each other; and, as in § 19, that two equivalent
$\lambda$-matrices always have the same rank.
The rank of a $\lambda$-matrix is not, however, the only thing which is
left unchanged by every elementary transformation. In order to
show this we begin with
Lemma 1. If the polynomial $\phi(\lambda)$ is a factor of all the i-rowed
determinants of a $\lambda$-matrix $\mathbf{a}$, it will be a factor of all the i-rowed
determinants of every $\lambda$-matrix obtained from $\mathbf{a}$ by means of an
elementary transformation.
If the transformation is of the type (a) or (b) of Definition 1, this
lemma is obviously true, since these transformations have no effect
on the i-rowed determinants of $\mathbf{a}$ except to multiply them by
constants which are not zero. If it is of the type (c), let us suppose it
consists in adding to the elements of the pth column of $\mathbf{a}$ the
corresponding elements of the qth column, each multiplied by the
polynomial $\psi(\lambda)$. Any i-rowed determinant of $\mathbf{a}$ which either does not
involve the pth column, or involves both the pth and the qth, will
be unaffected by this transformation. An i-rowed determinant
which involves the pth column but not the qth may be written after
the transformation in the form $A \pm \psi(\lambda)B$, where A and B are i-rowed
determinants of $\mathbf{a}$; so that here also our lemma is true.
Theorem 1. If $\mathbf{a}$ and $\mathbf{b}$ are equivalent $\lambda$-matrices of rank r, and
$D_i(\lambda)$ is the greatest common divisor of the i-rowed determinants $(i \le r)$
of $\mathbf{a}$, then it is also the greatest common divisor of the i-rowed
determinants of $\mathbf{b}$.
For by our lemma, $D_i(\lambda)$ is a factor of all the i-rowed determinants
of $\mathbf{b}$; and if these determinants had a common factor of higher
degree, this factor would, by our lemma, be a factor of all the
i-rowed determinants of $\mathbf{a}$; which is contrary to hypothesis.
The theorem just proved shows that the greatest common divisors
$D_1(\lambda), \dots, D_r(\lambda)$ are invariants with regard to elementary
transformations, or, more generally, that they are invariants with regard to
all transformations which can be built up from a finite number of
elementary transformations. In point of fact they form, along with
the rank r, a complete system of invariants. To prove this we now
proceed to show how, by means of elementary transformations, a
$\lambda$-matrix may be reduced to a very simple normal form.
Lemma 2. If the first element* $f(\lambda)$ of a $\lambda$-matrix is not identically
zero and is not a factor of all the other elements, then an equivalent
matrix can be formed whose first element is not identically zero and is of
lower degree than $f$.
Suppose first there is an element $f_1(\lambda)$ in the first row which is
not divisible by $f(\lambda)$ and let j denote the number of the column in
which it lies. Dividing $f_1$ by $f$ and calling the quotient q and the
remainder r, we have
$$f_1(\lambda) \equiv q(\lambda)f(\lambda) + r(\lambda).$$
Accordingly, if to the elements of the jth column we add those of the
first, each multiplied by $-q(\lambda)$, we get an equivalent matrix in which the
first element of the jth column is $r(\lambda)$, which is a polynomial of degree
lower than $f(\lambda)$. If now we interchange the first and jth columns,
the truth of our lemma is established in the case we are considering.
A similar proof obviously applies if there is an element in the
first column which is not divisible by $f(\lambda)$.
Finally, suppose every element of the first row and column is
divisible by $f(\lambda)$, but that there is an element, say in the ith row and
jth column, which is not divisible by $f(\lambda)$. Let us suppose the element
in the first row and jth column is $\psi(\lambda)f(\lambda)$, and form an
equivalent matrix by adding to the elements of the jth column
$-\psi(\lambda)$ times the corresponding elements of the first column. In
this matrix, $f(\lambda)$ still stands in the upper left-hand corner, the first
element of the jth column is zero; the first element of the ith row
has not been changed and is therefore divisible by $f(\lambda)$; and the
element in the ith row and jth column is still not divisible by $f$.
Now form another equivalent matrix by adding to the elements of
the first column the corresponding elements of the jth column. The
upper left-hand element is still $f(\lambda)$, while the first element of the
ith row is not divisible by $f(\lambda)$. This matrix, therefore, comes under
the case already treated in which there is an element in the first
column which is not divisible by $f(\lambda)$, and our lemma is established.
* By the first element of a matrix we will understand the element in the upper
left-hand corner.
Lemma 3. If we have a $\lambda$-matrix whose elements are not all identically
zero, an equivalent matrix can be formed which has the following
three properties:
(a) The first element $f(\lambda)$ is not identically zero.
(b) All the other elements of the first row and of the first column are
identically zero.
(c) Every element neither in the first row nor in the first column is
divisible by $f(\lambda)$.
For we may first, by an interchange of rows and of columns,
bring into the first place an element which is not identically zero.
If this is not a factor of all the other elements, we can, by Lemma 2,
find an equivalent matrix whose first element is of lower degree and
is not identically zero. If this element is not a factor of all the
others, we may repeat the process. Since at each step we lower the
degree of the first element, there must, after a finite number of steps,
come a point where the process stops, that is, where the first element
is a factor of all the others. We can then, by using transformations of
type (c) (Definition 1), reduce all the elements in the first row and in
the first column except this first one to zero, while the other elements
remain divisible by the first one. Thus our lemma is established.
Finally, we note that since $f(\lambda)$ in the lemma just proved is a
factor of all the other elements of the simplified matrix, it must, by
Theorem 1, be the greatest common divisor of all the elements of
the original matrix.
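The lemma just proved tells us that the $\lambda$-matrix of the nth order
of rank $r > 0$
$$\begin{Vmatrix} a_{11} & \cdots & a_{1n}\\ \vdots & & \vdots\\ a_{n1} & \cdots & a_{nn} \end{Vmatrix} \tag{1}$$
can be reduced by means of elementary transformations to the form
$$\begin{Vmatrix} f_1(\lambda) & 0 & \cdots & 0\\ 0 & b_{11} & \cdots & b_{1,n-1}\\ \vdots & \vdots & & \vdots\\ 0 & b_{n-1,1} & \cdots & b_{n-1,n-1} \end{Vmatrix} \tag{2}$$
where $f_1(\lambda) \not\equiv 0$ and where $f_1(\lambda)$ is a factor of all the b's. The last
written matrix being necessarily of rank r, the matrix of the (n − 1)th
order
$$\begin{Vmatrix} b_{11} & \cdots & b_{1,n-1}\\ \vdots & & \vdots\\ b_{n-1,1} & \cdots & b_{n-1,n-1} \end{Vmatrix} \tag{3}$$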
is of rank $r - 1$. Consequently, if $r > 1$, (3) may be reduced by
means of elementary transformations to the form
$$\begin{Vmatrix} f_2(\lambda) & 0 & \cdots & 0\\ 0 & c_{11} & \cdots & c_{1,n-2}\\ \vdots & \vdots & & \vdots\\ 0 & c_{n-2,1} & \cdots & c_{n-2,n-2} \end{Vmatrix} \tag{4}$$
where $f_2(\lambda) \not\equiv 0$ and where $f_2(\lambda)$ is a factor of all the c's. By Theorem 1,
$f_2(\lambda)$, being the greatest common divisor of all the elements of (4), is also the
greatest common divisor of all the b's, and is therefore divisible by $f_1(\lambda)$.
Now it is important to notice that the elementary transformations
which carry over (3) into (4) may be regarded as elementary transformations
of (2) which leave the first row and column of this matrix unchanged.
Thus by a succession of elementary transformations, we have reduced
(1) to the form
$$\begin{Vmatrix} f_1(\lambda) & 0 & 0 & \cdots & 0\\ 0 & f_2(\lambda) & 0 & \cdots & 0\\ 0 & 0 & c_{11} & \cdots & c_{1,n-2}\\ \vdots & \vdots & \vdots & & \vdots\\ 0 & 0 & c_{n-2,1} & \cdots & c_{n-2,n-2} \end{Vmatrix} \tag{5}$$
where neither $f_1$ nor $f_2$ vanishes identically, $f_1$ is a factor of $f_2$, and
$f_2$ is a factor of all the c's.
If $r > 2$, we may treat the (n − 2)-rowed matrix of the c's, which
is clearly of rank $r - 2$, in a similar manner. Proceeding in this
way, we finally reduce our matrix (1) to the form
$$\begin{Vmatrix}
f_1(\lambda) & 0 & \cdots & 0 & 0 & \cdots & 0\\
0 & f_2(\lambda) & \cdots & 0 & 0 & \cdots & 0\\
\vdots & \vdots & \ddots & \vdots & \vdots & & \vdots\\
0 & 0 & \cdots & f_r(\lambda) & 0 & \cdots & 0\\
0 & 0 & \cdots & 0 & 0 & \cdots & 0\\
\vdots & \vdots & & \vdots & \vdots & & \vdots\\
0 & 0 & \cdots & 0 & 0 & \cdots & 0
\end{Vmatrix} \tag{6}$$
where none of the f's is identically zero, and each is a factor of the
next following one.
So far we have used merely elementary transformations of the
forms (a) and (c), Definition 1. By means of transformations of the
form (b) we can simplify (6) still further by reducing the coefficient
of the highest power of $\lambda$ in each of the polynomials $f_i(\lambda)$ to unity.
We have thus proved the theorem:
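Theorem 2. Every $\lambda$-matrix of the nth order and of rank r can
be reduced by elementary transformations to the normal form
$$\begin{Vmatrix}
E_1(\lambda) & 0 & \cdots & 0 & 0 & \cdots & 0\\
0 & E_2(\lambda) & \cdots & 0 & 0 & \cdots & 0\\
\vdots & \vdots & \ddots & \vdots & \vdots & & \vdots\\
0 & 0 & \cdots & E_r(\lambda) & 0 & \cdots & 0\\
0 & 0 & \cdots & 0 & 0 & \cdots & 0\\
\vdots & \vdots & & \vdots & \vdots & & \vdots\\
0 & 0 & \cdots & 0 & 0 & \cdots & 0
\end{Vmatrix} \tag{7}$$
where the coefficient of the highest power of $\lambda$ in each of the polynomials
$E_i(\lambda)$ is unity, and $E_i(\lambda)$ is a factor of $E_{i+1}(\lambda)$ for $i = 1, 2, \dots, r - 1$.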
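The reduction described in Lemmas 2 and 3 can be followed on a small example. The $\lambda$-matrix below is chosen here purely for illustration and does not come from the text; each arrow is a single elementary transformation of the type indicated, in the sense of Definition 1.

```latex
\[
\begin{Vmatrix} \lambda & 0\\ 0 & \lambda+1 \end{Vmatrix}
\xrightarrow{(c):\ \text{row 1} + \text{row 2}}
\begin{Vmatrix} \lambda & \lambda+1\\ 0 & \lambda+1 \end{Vmatrix}
\xrightarrow{(c):\ \text{col 2} - \text{col 1}}
\begin{Vmatrix} \lambda & 1\\ 0 & \lambda+1 \end{Vmatrix}
\xrightarrow{(a):\ \text{interchange columns}}
\begin{Vmatrix} 1 & \lambda\\ \lambda+1 & 0 \end{Vmatrix}
\]
\[
\xrightarrow{(c):\ \text{col 2} - \lambda\cdot\text{col 1}}
\begin{Vmatrix} 1 & 0\\ \lambda+1 & -\lambda(\lambda+1) \end{Vmatrix}
\xrightarrow{(c):\ \text{row 2} - (\lambda+1)\cdot\text{row 1}}
\begin{Vmatrix} 1 & 0\\ 0 & -\lambda(\lambda+1) \end{Vmatrix}
\xrightarrow{(b):\ \text{row 2}\times(-1)}
\begin{Vmatrix} 1 & 0\\ 0 & \lambda^2+\lambda \end{Vmatrix}.
\]
```

Here $E_1(\lambda) = 1$ and $E_2(\lambda) = \lambda^2 + \lambda$. In agreement with Theorem 1, the greatest common divisor of the elements is 1 both for the original matrix (elements $\lambda$, 0, 0, $\lambda + 1$) and for the normal form, and the determinant is $\lambda(\lambda + 1)$ in both, apart from the constant factors introduced by the transformations.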
By Theorem 1, the greatest common divisor of the i-rowed determinants
$(i \le r)$ of the original matrix is the same as the greatest common
divisor of the i-rowed determinants of the normal form (7) to which it is
reduced. These last mentioned i-rowed determinants are, however, all
identically zero except those which are the product of i of the E's. Let
$$E_{k_1}(\lambda)\,E_{k_2}(\lambda)\cdots E_{k_i}(\lambda) \tag{8}$$
be any one of these, and suppose the integers $k_1, k_2, \dots, k_i$ to have
been arranged in order of increasing magnitude. We obviously
have $k_1 \ge 1,\ k_2 \ge 2,\ \dots,\ k_i \ge i$. Consequently $E_1$ is a factor of $E_{k_1}$, $E_2$
of $E_{k_2}$, etc. Thus
$$E_1(\lambda)\,E_2(\lambda)\cdots E_i(\lambda)$$
is seen to be a factor of (8), and, being itself one of the i-rowed
determinants of (7), it is their greatest common divisor. That is,
Theorem 3. The greatest common divisor of the i-rowed determinants
of a $\lambda$-matrix of rank r, when $i \le r$, is
$$D_i(\lambda) = E_1(\lambda)\,E_2(\lambda)\cdots E_i(\lambda),$$
where the E's are the elements of the normal form (7) to which the given
matrix is equivalent.
It may be noticed that this greatest common divisor is so determined
that the coefficient of the highest power of $\lambda$ in it is unity.
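The reduction of Lemmas 2 and 3 and the statement of Theorem 3 can be tried out on a computer. To keep the arithmetic short, the sketch below is an added illustration that works with the integer analogue described in Exercise 2 of this section (matrices of integers, with integer multipliers in transformations of type (c)) rather than with polynomials in $\lambda$. The function names and the sample matrix are assumptions made for this example.

```python
from itertools import combinations
from math import gcd


def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([r[:j] + r[j + 1:] for r in m[1:]])
               for j in range(len(m)))


def minor_gcd(m, i):
    """Greatest common divisor of all the i-rowed determinants of m."""
    g = 0
    for rows in combinations(range(len(m)), i):
        for cols in combinations(range(len(m)), i):
            g = gcd(g, det([[m[r][c] for c in cols] for r in rows]))
    return g


def normal_form(matrix):
    """Diagonal elements E_1, E_2, ... reached by integer elementary transformations."""
    m = [row[:] for row in matrix]
    n = len(m)
    for k in range(n):
        while True:
            entries = [(abs(m[r][c]), r, c) for r in range(k, n)
                       for c in range(k, n) if m[r][c]]
            if not entries:                      # the remaining block is zero
                break
            _, pr, pc = min(entries)
            m[k], m[pr] = m[pr], m[k]            # type (a): interchange rows
            for row in m:                        # type (a): interchange columns
                row[k], row[pc] = row[pc], row[k]
            pivot = m[k][k]
            for r in range(k + 1, n):            # type (c): reduce column k
                q = m[r][k] // pivot
                m[r] = [x - q * y for x, y in zip(m[r], m[k])]
            for c in range(k + 1, n):            # type (c): reduce row k
                q = m[k][c] // pivot
                for row in m:
                    row[c] -= q * row[k]
            border_clear = all(m[t][k] == 0 and m[k][t] == 0 for t in range(k + 1, n))
            bad = [r for r in range(k + 1, n) if any(x % pivot for x in m[r][k + 1:])]
            if border_clear and not bad:
                break
            if border_clear:
                # The pivot is not a factor of some element: bring it into row k.
                m[k] = [x + y for x, y in zip(m[k], m[bad[0]])]
    return [abs(m[t][t]) for t in range(n)]      # type (b): change of sign


example = [[2, 4, 4], [-6, 6, 12], [10, -4, -16]]
E = normal_form(example)
D = [minor_gcd(example, i) for i in (1, 2, 3)]
print(E, D)
```

For this matrix the diagonal elements come out as 2, 6, 12, and the greatest common divisors of the 1-, 2- and 3-rowed determinants are 2, 12 and 144, that is, the successive products of the diagonal elements, as Theorem 3 asserts in the polynomial case.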
We come now to the fundamental theorem:
Theorem 4. A necessary and sufficient condition for the equivalence
of two $\lambda$-matrices of the nth order is that they have the same rank
r, and that for every value of i from 1 to r inclusive, the i-rowed
determinants of one matrix have the same greatest common divisor as the
i-rowed determinants of the other.
To say that this is a necessary condition is merely to restate
Theorem 1. To prove it sufficient, suppose both matrices to be
reduced to the normal form (7), where we will distinguish the
normal form for the second matrix by attaching accents to the
E's in it. If the conditions of our theorem are fulfilled, we have, by
Theorem 3,
$$E_1'(\lambda)\,E_2'(\lambda)\cdots E_i'(\lambda) \equiv E_1(\lambda)\,E_2(\lambda)\cdots E_i(\lambda) \qquad (i = 1, 2, \dots, r);$$
and, since none of these E's are identically zero, it follows that
$$E_i'(\lambda) \equiv E_i(\lambda) \qquad (i = 1, 2, \dots, r).$$
Thus the normal forms to which the two $\lambda$-matrices can be reduced
are identical, and hence the matrices are equivalent, since two
$\lambda$-matrices equivalent to a third are equivalent to each other.
EXERCISES
1. Reduce the matrix
A.
1
\
A
A1
A.1
by means of elementary transformations to the normal form of Theorem 2.
Verify the result by finding the greatest common divisors $D_i(\lambda)$ first directly,
and secondly from the normal form.
2. By an elementary transformation of a matrix all of whose elements are
integers is understood a transformation of any one of the following forms:
(a) The interchange of two rows or of two columns.
(b) The change of sign of all the elements of any row or column.
(c) The addition to the elements of one row (or column) of the products of the
corresponding elements of another row (or column) by one and the same integer.
Starting from this definition, develop the theory of matrices whose elements are
integers along the same lines as the theory of $\lambda$-matrices was developed in this section.
92. Invariant Factors and Elementary Divisors. In place of the
invariants $D_i(\lambda)$ of the last section, it is, for most purposes, more
convenient to introduce certain other invariants to which we will
give the technical name invariant factors. As a basis for the definition
of these invariants we state the following theorem, which is
merely an immediate consequence of Theorem 3, § 91:
Theorem 1. The greatest common divisor of the $i$-rowed determinants
($i = 2, 3, \dots, r$) of a $\lambda$-matrix of rank $r$ is divisible by the greatest
common divisor of the $(i-1)$-rowed determinants of this matrix.
Definition 1. If $\mathbf{a}$ is a $\lambda$-matrix of rank $r$, and
$$D_i(\lambda) \qquad (i = 1, 2, \dots, r)$$
the greatest common divisor of its $i$-rowed determinants so determined
that the coefficient of the highest power of $\lambda$ is unity; and if $D_0(\lambda) \equiv 1$;
then the polynomial
$$E_i(\lambda) \equiv \frac{D_i(\lambda)}{D_{i-1}(\lambda)} \qquad (i = 1, 2, \dots, r) \tag{1}$$
is called the $i$th invariant factor of $\mathbf{a}$.
This definition shows that these $E$'s are really invariants since
they are completely determined by the $D$'s which we proved to be
invariants in § 91. Moreover, by multiplying together the first $i$ of
the relations (1), we get the formula
$$D_i(\lambda) \equiv E_1(\lambda)\,E_2(\lambda) \cdots E_i(\lambda) \qquad (i = 1, 2, \dots, r). \tag{2}$$
This shows us that the $E$'s completely determine the $D$'s, and since
these latter were seen in § 91 to form, together with the rank, a
complete system of invariants, the same is true of the $E$'s. That is,
Theorem 2. A necessary and sufficient condition that two $\lambda$-matrices
be equivalent is that they have the same rank $r$, and that the invariant
factors of one be identical respectively with the corresponding
invariant factors of the other.
Since, in the case of a nonsingular matrix of the $n$th order,
$D_n(\lambda)$ differs from the determinant of the matrix only by a constant
factor, we see that in this case the determinant of the matrix is,
except for a constant factor, precisely the product of all the invariant
factors. This is the case which is of by far the greatest importance,
and the term invariant factor comes from the fact that the $E$'s
are really factors of the determinant of the matrix in this case.
A reference to Theorem 3, § 91, shows that our invariant factors
are precisely the polynomials $E_i$ which occur in the normal form of
Theorem 2, § 91; and, since in that normal form each $E$ is a factor
of the next following one, we have the important result:
Theorem 3. If $E_1(\lambda), \dots, E_r(\lambda)$ are the successive invariant factors
of a $\lambda$-matrix of rank $r$, then each of these $E$'s is a factor of the next
following one.
This theorem enables us to arrange the invariant factors of a $\lambda$-matrix
in the proper order by simply arranging them in the order of
increasing degree, two $E$'s of the same degree being necessarily
identical.
The invariant factors (like the $D$'s of the last section) may be
spoken of as rational invariants of our $\lambda$-matrix since they are formed
from the elements of the $\lambda$-matrix by purely rational processes,
namely the elementary transformations of § 91, which involve only
the rational operations of addition, subtraction, multiplication, and
division. In distinction to these the elementary divisors, first introduced
by Weierstrass, are, in general, irrational invariants.* These
we now proceed to define.
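Definition 2. If $\mathbf{a}$ is a $\lambda$-matrix of rank $r$, and $D_r(\lambda)$ is the
greatest common divisor of the $r$-rowed determinants of $\mathbf{a}$, then the linear
factors
$$\lambda - \alpha,\quad \lambda - \alpha',\quad \lambda - \alpha'',\quad \dots$$
of $D_r(\lambda)$ are called the linear factors of $\mathbf{a}$.†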
Since, by formula (2), $D_r(\lambda)$ is the product of all the invariant
factors of $\mathbf{a}$, it is clear that each invariant factor is merely the product
of certain integral powers, positive or zero, of the linear factors
of $\mathbf{a}$. We may therefore lay down the following definition:
* German writers, following Frobenius, use the term elementary divisor to cover
both kinds of invariants. This is somewhat confusing, and necessitates the use of
modifying adjectives such as simple elementary divisors for the elementary divisors
as originally defined by Weierstrass, composite elementary divisors for the $E$'s. On
the other hand Bromwich (Quadratic Forms and their Classification by Means of
Invariant-factors, Cambridge, England, 1906) proposes to substitute the term invariant
factor for the term elementary divisor. Inasmuch as this latter term is wholly
appropriate, it seems clear that it should be retained in English as well as in German
in the sense in which Weierstrass first used it.
† It will be noticed that if $\mathbf{a}$ is nonsingular, the linear factors of $\mathbf{a}$ are simply the
linear factors of the determinant of $\mathbf{a}$.
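Definition 3. Let $\mathbf{a}$ be a $\lambda$-matrix of rank $r$, and
$$\lambda - \alpha,\quad \lambda - \alpha',\quad \lambda - \alpha'',\quad \dots$$
its distinct linear factors. Then if
$$E_i(\lambda) \equiv (\lambda-\alpha)^{e_i}(\lambda-\alpha')^{e'_i}(\lambda-\alpha'')^{e''_i} \cdots \qquad (i = 1, 2, \dots, r)$$
are the invariant factors of $\mathbf{a}$, such of the factors
$$\begin{matrix}
(\lambda-\alpha)^{e_1}, & (\lambda-\alpha)^{e_2}, & \dots, & (\lambda-\alpha)^{e_r},\\
(\lambda-\alpha')^{e'_1}, & (\lambda-\alpha')^{e'_2}, & \dots, & (\lambda-\alpha')^{e'_r},\\
(\lambda-\alpha'')^{e''_1}, & (\lambda-\alpha'')^{e''_2}, & \dots, & (\lambda-\alpha'')^{e''_r},\\
\dots
\end{matrix}$$
as are not mere constants are called the elementary divisors of $\mathbf{a}$, each
elementary divisor being said to correspond to the linear factor of which
it is a power.*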
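To fix ideas, consider the following example, in which the invariant factors are assumed merely for illustration and are not taken from the text. Let a $\lambda$-matrix of rank 3 have the invariant factors

$$E_1 = 1, \qquad E_2 = \lambda(\lambda-1), \qquad E_3 = \lambda^2(\lambda-1)(\lambda+2).$$

The distinct linear factors are $\lambda$, $\lambda - 1$, $\lambda + 2$, with exponents

$$\begin{array}{c|ccc}
 & E_1 & E_2 & E_3\\ \hline
\lambda & 0 & 1 & 2\\
\lambda-1 & 0 & 1 & 1\\
\lambda+2 & 0 & 0 & 1
\end{array}$$

Discarding the factors with exponent zero, the elementary divisors are $\lambda$, $\lambda^2$, corresponding to the linear factor $\lambda$; $\lambda - 1$, $\lambda - 1$, corresponding to $\lambda - 1$; and $\lambda + 2$, corresponding to $\lambda + 2$. It will be noticed that an elementary divisor may occur more than once.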
Since the invariant factors completely determine the elementary
divisors and vice versa, it is clear that the elementary divisors are not
merely invariants, but that, together with the rank, they form a
complete system of invariants. That is,
Theorem 4. A necessary and sufficient condition that two $\lambda$-matrices
be equivalent is that they have the same rank and that the elementary
divisors of one be identical respectively with the corresponding
elementary divisors of the other.
By means of Theorem 3 we infer the important result:
Theorem 5. The degrees $e_i$ of the elementary divisors corresponding
to any particular linear factor satisfy the inequalities
$$e_i \ge e_{i-1} \qquad (i = 2, 3, \dots, r).$$
By means of this theorem we can arrange the elementary divisors
corresponding to any given linear factor in the proper order by simply
noticing their degrees.
* It will be seen that the definition just given is equivalent to the following one, in
which the conception of invariant factors is not introduced:
Definition. Let $\lambda - \alpha$ be a linear factor of the $\lambda$-matrix $\mathbf{a}$ of rank $r$, and let $l_i$ be
the exponent of the highest power of $\lambda - \alpha$ which is a factor of all the $i$-rowed determinants
($i \le r$) of $\mathbf{a}$. If the integers $e_i$ (which are necessarily positive or zero) are defined
by the formula
$$e_i = l_i - l_{i-1} \qquad (i = 1, 2, \dots, r),$$
where we take $l_0 = 0$, then such of the expressions
$$(\lambda-\alpha)^{e_1},\quad (\lambda-\alpha)^{e_2},\quad \dots,\quad (\lambda-\alpha)^{e_r}$$
as are not constants are called the elementary divisors of $\mathbf{a}$ which correspond to the
linear factor $\lambda - \alpha$.
EXERCISES
1. If $\phi = 0$ and $\psi = 0$ are two conics of which the second is nonsingular,
show how the number and kind of singular conics contained in the pencil
$\phi - \lambda\psi = 0$ depends on the nature of the elementary divisors of the matrix of the
quadratic form $\phi - \lambda\psi$.
2. Extend Exercise 1 to the case of three dimensions.
3. Apply the considerations of this section to matrices whose elements are
integers. (Cf. Exercise 2, § 91.)
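For matrices whose elements are integers, the definitions of this section may be tried out numerically. The following Python sketch is only an illustration, under the assumption that the integer analogue of $D_i$ is the positive greatest common divisor of the $i$-rowed determinants, that of $E_i$ the quotient $D_i/D_{i-1}$, and that of an elementary divisor a prime power dividing some $E_i$. The function names `det`, `minor_gcds`, `invariant_factors`, and `prime_powers` are hypothetical, and the sample matrix is chosen merely for illustration.

```python
from itertools import combinations
from math import gcd


def det(m):
    """Determinant of a square integer matrix (Laplace expansion, first row)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def minor_gcds(a):
    """Return [D_1, ..., D_r], D_i being the g.c.d. of all i-rowed determinants."""
    rows, cols = len(a), len(a[0])
    ds = []
    for i in range(1, min(rows, cols) + 1):
        g = 0
        for rs in combinations(range(rows), i):
            for cs in combinations(range(cols), i):
                g = gcd(g, det([[a[r][c] for c in cs] for r in rs]))
        if g == 0:  # every i-rowed determinant vanishes, so the rank is i - 1
            break
        ds.append(g)
    return ds


def invariant_factors(a):
    """Return [E_1, ..., E_r] with E_i = D_i / D_{i-1} and D_0 = 1."""
    es, prev = [], 1
    for d in minor_gcds(a):
        es.append(d // prev)
        prev = d
    return es


def prime_powers(n):
    """Prime-power factors of a positive integer, e.g. 12 -> [4, 3]."""
    out, p = [], 2
    while p * p <= n:
        if n % p == 0:
            q = 1
            while n % p == 0:
                n //= p
                q *= p
            out.append(q)
        p += 1
    if n > 1:
        out.append(n)
    return out


a = [[2, 4, 4], [-6, 6, 12], [10, -4, -16]]
ds = minor_gcds(a)
es = invariant_factors(a)
elementary = [prime_powers(e) for e in es]
```

For the sample matrix this gives the values 2, 12, 144 for the $D$'s, the invariant factors 2, 6, 12, each a factor of the next, and the prime powers 2; 2, 3; 4, 3. The enumeration of all minors is practicable only for small matrices.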
93. The Practical Determination of Invariant Factors and Elementary
Divisors. The easiest general method for determining the
invariant factors of a particular $\lambda$-matrix is to reduce it by means of
elementary transformations to the normal form of Theorem 2, § 91,
following out step by step the reduction used in the proof of that theorem.
From this normal form the invariant factors may be read off;
and from these the elementary divisors may be computed, although only,
in general, by the solution of equations of more or less high degree.
There are, however, many cases of great importance in which the
elementary divisors may more easily be obtained by other methods.
The most obvious of these is to apply the definition of elementary
divisors directly to the case in hand. As an illustration, we mention a
matrix of the $n$th order which has $a - \lambda$ as the element in each place
of the principal diagonal, while all the other elements are zero except
those which lie immediately to the right of or above the elements of
the principal diagonal, these being all constants different from zero:

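$$\begin{pmatrix}
a-\lambda & c_1 & 0 & \cdots & 0 & 0\\
0 & a-\lambda & c_2 & \cdots & 0 & 0\\
\vdots & \vdots & \vdots & & \vdots & \vdots\\
0 & 0 & 0 & \cdots & a-\lambda & c_{n-1}\\
0 & 0 & 0 & \cdots & 0 & a-\lambda
\end{pmatrix} \qquad (c_1c_2 \cdots c_{n-1} \neq 0). \tag{1}$$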
The determinant of this matrix is $(a-\lambda)^n$. The determinant
obtained by striking out the first column and the last row is
$c_1c_2 \cdots c_{n-1}$. Accordingly
$$D_n(\lambda) = (\lambda-a)^n, \qquad D_{n-1}(\lambda) = 1, \qquad E_n(\lambda) = (\lambda-a)^n.$$
Thus we see that $(\lambda-a)^n$ is the only elementary divisor of this
matrix, while the invariant factors are $(\lambda-a)^n$ and $n-1$ 1's.
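For instance, taking $n = 3$ in (1) (a special case written out here only as a check on the general statement), the matrix and the two-rowed determinant obtained by striking out the first column and the last row are

$$\begin{pmatrix}
a-\lambda & c_1 & 0\\
0 & a-\lambda & c_2\\
0 & 0 & a-\lambda
\end{pmatrix},
\qquad
\begin{vmatrix}
c_1 & 0\\
a-\lambda & c_2
\end{vmatrix} = c_1c_2 \neq 0.$$

Since one of the two-rowed determinants is a constant different from zero, $D_2(\lambda) = 1$, and therefore also $D_1(\lambda) = 1$, while $D_3(\lambda) = (\lambda-a)^3$. Hence $E_1 = E_2 = 1$ and $E_3 = (\lambda-a)^3$.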
This direct method may sometimes be employed to advantage in
conjunction with the method of reduction by elementary transformations.
Cf. Exercise 1 at the end of this section.
A further means of recognizing the elementary divisors in some
special cases is furnished by the following theorems whose proofs,
which present no difficulty, we leave to the reader:
Theorem 1. If all the elements of a $\lambda$-matrix are zeros except
those in the principal diagonal, and if each element of this diagonal
which is not a constant is resolved into the product of a constant by powers
of distinct linear factors of the form $\lambda - \alpha$, $\lambda - \alpha'$, $\dots$, then these powers
of linear factors will be precisely the elementary divisors of the matrix.
Theorem 2. If all the elements of a $\lambda$-matrix are zeros except
those which lie in a certain number of non-overlapping principal minors,
then the elementary divisors of the matrix may be found by taking the
elementary divisors of all these principal minors.
The proof of this theorem consists in reducing the given matrix
to the form referred to in Theorem 1 by means of elementary transformations
each of which may be regarded as an elementary transformation
of one of the principal minors in question.
It should be noticed that this theorem would not be true if the
words invariant factors were substituted in it for elementary divisors;
cf. Exercise 3 below. The invariant factors may, however, be computed
from the elementary divisors when these have been found.
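One way in which the invariant factors may be computed from the elementary divisors rests on Theorems 3 and 5 of § 92: for each linear factor the highest exponent must go into $E_r$, the next highest into $E_{r-1}$, and so on. The following Python sketch illustrates this bookkeeping; it is not a method given in the text. The function name `invariant_factors_from_elementary_divisors` and the representation of $(\lambda-\alpha)^e$ as a pair `(alpha, e)` are assumptions made for the illustration.

```python
from collections import defaultdict


def invariant_factors_from_elementary_divisors(divisors, rank):
    """Rebuild E_1, ..., E_rank from a list of elementary divisors.

    divisors: list of pairs (alpha, e), each standing for (lambda - alpha)**e.
    Returns a list of dicts {alpha: exponent}; an empty dict stands for 1.
    """
    by_root = defaultdict(list)
    for alpha, e in divisors:
        by_root[alpha].append(e)

    factors = [dict() for _ in range(rank)]
    for alpha, exps in by_root.items():
        if len(exps) > rank:
            raise ValueError("too many elementary divisors for one linear factor")
        # Largest exponent goes into E_r, the next into E_{r-1}, and so on.
        for k, e in enumerate(sorted(exps, reverse=True)):
            factors[rank - 1 - k][alpha] = e
    return factors


# Diagonal matrix with elements lambda*(lambda - 1), lambda**2, lambda - 1:
# by Theorem 1 its elementary divisors are lambda, lambda - 1, lambda**2, lambda - 1.
result = invariant_factors_from_elementary_divisors(
    [(0, 1), (1, 1), (0, 2), (1, 1)], rank=3
)
```

Here `result` stands for $E_1 = 1$, $E_2 = \lambda(\lambda-1)$, $E_3 = \lambda^2(\lambda-1)$, which are not the invariant factors of the separate diagonal elements; this is the distinction to which the remark following Theorem 2 draws attention.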
EXERCISES
1. Prove that the matrix
Xa
xa.
xa
1
01
01
1
1
xa
Xa
Xa
is equivalent to
 1
1
1
(A._«)2+^2
1
(A
 ay + /82
1
(\
■ay^p^
and hence that its elementary divisors are
$$[\lambda - (\alpha + \beta i)]^3, \qquad [\lambda - (\alpha - \beta i)]^3.$$
2. Generalize Exercise 1 to matrices of order 2n.
3. Find (a) the elementary divisors, and (b) the invariant factors of the
X.\\  1)2
A.(X  1)3
X1
A
4. Determine the invariant factors and the elementary divisors of the matrix
matrix
2X
3
1
X
4A.
3(\ + 2)
A + 2
2X
6\
X 2X
X1
X1
3(\  1)
1 A
2(A.1)
Is this matrix equivalent to the matrix in the exercise at the end of § 91 ?
5. Devise a convenient rational process for computing the invariant factors of
matrices of the kinds considered in Theorems 1 and 2.
94. A Second Definition of the Equivalence of $\lambda$-Matrices. The
definition of equivalence of $\lambda$-matrices which we have used so far
rests on the elementary transformations. These transformations are
of such a special character that this definition is not convenient for
most purposes. We now give a new definition which we will prove
to be coextensive with the old one.
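Definition. Two $n$-rowed $\lambda$-matrices $\mathbf{a}$ and $\mathbf{b}$ are said to be equivalent
if there exist two nonsingular $n$-rowed $\lambda$-matrices $\mathbf{c}$ and $\mathbf{d}$ whose
determinants are independent of $\lambda$, and such that
$$\mathbf{b} \equiv \mathbf{c}\,\mathbf{a}\,\mathbf{d}.* \tag{1}$$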
Since the matrices $\mathbf{c}$ and $\mathbf{d}$ have, by hypothesis, constant determinants,
the inverse matrices $\mathbf{c}^{-1}$ and $\mathbf{d}^{-1}$ will also be $\lambda$-matrices, and not
matrices whose coefficients are fractional rational functions of $\lambda$ as would
in general be the case for the inverse of $\lambda$-matrices. Consequently, if
we write (1) in the form
$$\mathbf{a} \equiv \mathbf{c}^{-1}\,\mathbf{b}\,\mathbf{d}^{-1}, \tag{2}$$
we see that the relation established by our definition between the
matrices $\mathbf{a}$ and $\mathbf{b}$ is a reciprocal one, as is implied in the wording of
the definition.
* We use here and in what follows the sign $\equiv$ between two $\lambda$-matrices to denote
that every element of one matrix is identically equal to the corresponding element of
the other.
In order to justify the definition just given, we begin by establishing
the
Lemma. If $\mathbf{a}$ and $\mathbf{b}$ are $n$-rowed $\lambda$-matrices, and the polynomial
$\phi(\lambda)$ is a factor of all the $i$-rowed determinants of $\mathbf{a}$, it is a factor of all
the $i$-rowed determinants of $\mathbf{ab}$ and also of $\mathbf{ba}$.
For, by Theorem 5, § 25, every $i$-rowed determinant of $\mathbf{ab}$ and
also of $\mathbf{ba}$ is a homogeneous linear combination of certain $i$-rowed
determinants of $\mathbf{a}$.
Theorem 1. If $\mathbf{a}$ and $\mathbf{b}$ are equivalent according to the definition of
this section, they are also equivalent according to the definition of § 91.
For in this case there exist two nonsingular $\lambda$-matrices, $\mathbf{c}$ and
$\mathbf{d}$, whose determinants are constants, such that relation (1) holds.
Consequently, by Theorem 7, § 25,* $\mathbf{a}$ and $\mathbf{b}$ have the same rank $r$.
Let $D_i(\lambda)$ be the greatest common divisor of the $i$-rowed determinants
of $\mathbf{a}$, where $i \le r$. By our lemma, $D_i(\lambda)$ is a factor of all
the $i$-rowed determinants of $\mathbf{ca}$, and therefore, applying the lemma
again, it is a factor of all the $i$-rowed determinants of $\mathbf{cad}$, that is, of $\mathbf{b}$.
We can infer further that $D_i(\lambda)$ is the greatest common divisor
of the $i$-rowed determinants of $\mathbf{b}$. For applying to relation (2) the
reasoning just used, we see that the greatest common divisor of the
$i$-rowed determinants of $\mathbf{b}$ is a factor of all the $i$-rowed determinants
of $\mathbf{a}$, and cannot therefore be of higher degree than $D_i(\lambda)$.
A reference to Theorem 4, § 91, now shows us that $\mathbf{a}$ and $\mathbf{b}$ are
equivalent according to the definition of that section.
Theorem 2. If $\mathbf{a}$ and $\mathbf{b}$ are equivalent according to the definition
of § 91, they are also equivalent according to the definition of the present
section.
We begin by showing that if we can pass from a matrix $\mathbf{a}$ to a
matrix $\mathbf{a}_1$ by means of an elementary transformation, one of the following
relations always holds:
$$\mathbf{a}_1 \equiv \mathbf{c}\,\mathbf{a} \quad \text{or} \quad \mathbf{a}_1 \equiv \mathbf{a}\,\mathbf{d}, \tag{3}$$
where $\mathbf{c}$ and $\mathbf{d}$ are nonsingular matrices whose determinants are
independent of $\lambda$. To prove this we consider in succession the
elementary transformations of the forms which were called (a), (b),
(c), in Definition 1, § 91.
* How is it that we have a right to apply this theorem to $\lambda$-matrices?
(a) Suppose we interchange the $p$th and $q$th rows. This can
be effected by forming the product $\mathbf{ca}$ where the matrix $\mathbf{c}$ may be
obtained by interchanging the $p$th and $q$th rows (or columns) in the
unit matrix
$$\begin{pmatrix}
1 & 0 & \cdots & 0\\
0 & 1 & \cdots & 0\\
\vdots & \vdots & & \vdots\\
0 & 0 & \cdots & 1
\end{pmatrix}.$$
Similarly the interchange of the $p$th and $q$th columns of $\mathbf{a}$ may be
effected by forming the product $\mathbf{ac}$, where $\mathbf{c}$ has the same meaning
as before.
In each of these cases, $\mathbf{c}$ may be regarded as a nonsingular
$\lambda$-matrix with constant determinant, since its elements are constants
and its determinant is $-1$.
(b) To multiply the $p$th row of $\mathbf{a}$ by a constant $k$, we may form
the product $\mathbf{ca}$, where $\mathbf{c}$ differs from the unit matrix only in having
$k$ instead of 1 as the $p$th element of the principal diagonal.
Similarly, we multiply the $p$th column of $\mathbf{a}$ by $k$, by forming the
product $\mathbf{ac}$, where $\mathbf{c}$ has the same meaning as before.
If we take the constant $k$ different from zero, $\mathbf{c}$ may be regarded
as a nonsingular $\lambda$-matrix with constant determinant.
(c) We can add to the $p$th row of $\mathbf{a}$ $\phi(\lambda)$ times the $q$th row by
forming the product $\mathbf{ca}$, where $\mathbf{c}$ differs from the unit matrix only in
having $\phi(\lambda)$ instead of zero as the element in the $p$th row and $q$th
column.
Similarly we add to the $q$th column $\phi(\lambda)$ times the $p$th column
by forming the product $\mathbf{ac}$, where $\mathbf{c}$ has the same meaning as before.
The matrix $\mathbf{c}$, whose determinant is 1, is a nonsingular
$\lambda$-matrix.
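The three cases (a), (b), (c) may be checked on a small numerical example. The following Python sketch does so under the simplifying assumption that the polynomial $\phi(\lambda)$ is a constant, so that integer arithmetic suffices; the function names are hypothetical, and rows and columns are counted from 0 rather than from 1.

```python
def identity(n):
    """Unit matrix of order n as a list of rows."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def matmul(x, y):
    """Ordinary matrix product x * y."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def interchange(n, p, q):
    """Case (a): unit matrix with rows p and q interchanged."""
    c = identity(n)
    c[p], c[q] = c[q], c[p]
    return c


def scale(n, p, k):
    """Case (b): unit matrix with k in place of 1 at position (p, p)."""
    c = identity(n)
    c[p][p] = k
    return c


def add_multiple(n, p, q, phi):
    """Case (c): unit matrix with phi in place of 0 at position (p, q)."""
    c = identity(n)
    c[p][q] = phi
    return c


a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

c = add_multiple(3, 0, 2, 10)
rows_changed = matmul(c, a)   # row 0 increased by 10 times row 2
cols_changed = matmul(a, c)   # column 2 increased by 10 times column 0

s = interchange(3, 0, 1)
rows_swapped = matmul(s, a)   # rows 0 and 1 interchanged
cols_swapped = matmul(a, s)   # columns 0 and 1 interchanged

k = scale(3, 1, 5)
row_scaled = matmul(k, a)     # row 1 multiplied by 5
```

Multiplication on the left thus acts on rows and multiplication on the right on columns; in case (c) the same matrix $\mathbf{c}$ adds a multiple of the $q$th row to the $p$th row, but a multiple of the $p$th column to the $q$th column.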
It being thus established that one of the relations (3) holds
between any two $\lambda$-matrices which can be obtained from one another
by an elementary transformation, it follows that two matrices $\mathbf{a}$
and $\mathbf{b}$ which are equivalent according to the definition of § 91 will
satisfy a relation of the form
$$\mathbf{b} \equiv \mathbf{c}_p\,\mathbf{c}_{p-1} \cdots \mathbf{c}_1\,\mathbf{a}\,\mathbf{d}_1\,\mathbf{d}_2 \cdots \mathbf{d}_q,$$
where each of the $\mathbf{c}$'s and $\mathbf{d}$'s is a nonsingular $\lambda$-matrix of constant
determinant which corresponds to one of the elementary transformations
we use in passing from $\mathbf{a}$ to $\mathbf{b}$. This last relation being of the
form
$$\mathbf{b} \equiv \mathbf{c}\,\mathbf{a}\,\mathbf{d},$$
where $\mathbf{c}$ and $\mathbf{d}$ are nonsingular $\lambda$-matrices with constant determinants,
our theorem is proved.
We have now completed the proof that our two definitions of
the equivalence of $\lambda$-matrices are coextensive.
EXERCISES
1. If $\mathbf{a}$ denotes the matrix in Exercise 1, § 91, and $\mathbf{b}$ the normal form of
Theorem 2, § 91, for this matrix, determine two $\lambda$-matrices, $\mathbf{c}$ and $\mathbf{d}$, such that
relation (1) holds.
Verify your result by showing that the determinants of $\mathbf{c}$ and $\mathbf{d}$ are constants.
2. Apply the considerations of this section to matrices whose elements are
integers. Cf. Exercise 2, § 91, and Exercise 3, § 92.
95. Multiplication and Division of $\lambda$-Matrices. We close this
chapter by giving a few developments of what might be called the
elementary algebra of $\lambda$-matrices.
Definition. By the degree of a $\lambda$-matrix is understood the highest
degree in $\lambda$ of any one of its elements.
For a $\lambda$-matrix of the $k$th degree, the element in the $i$th row and $j$th
column may be written
$$a_{ij}\lambda^k + a'_{ij}\lambda^{k-1} + \cdots + a^{(k)}_{ij},$$
and at least one of the coefficients of $\lambda^k$ (i.e. one of the $a_{ij}$'s) must
be different from zero. If, then, we denote by $\mathbf{a}_l$ the matrix of
which $a^{(l)}_{ij}$ is the element which stands in the $i$th row and $j$th column,
we get the theorem
Theorem 1. Every $\lambda$-matrix of the $k$th degree may be written in the form
$$\mathbf{a}_0\lambda^k + \mathbf{a}_1\lambda^{k-1} + \cdots + \mathbf{a}_k \qquad (\mathbf{a}_0 \neq 0), \tag{1}$$
where $\mathbf{a}_0, \dots, \mathbf{a}_k$ are matrices with constant elements; and conversely,
every expression (1) is a $\lambda$-matrix of degree $k$.
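Theorem 2. The product of two $\lambda$-matrices of degrees $k$ and $l$
$$\mathbf{a}_0\lambda^k + \mathbf{a}_1\lambda^{k-1} + \cdots + \mathbf{a}_k \qquad (\mathbf{a}_0 \neq 0),$$
$$\mathbf{b}_0\lambda^l + \mathbf{b}_1\lambda^{l-1} + \cdots + \mathbf{b}_l \qquad (\mathbf{b}_0 \neq 0)$$
is a $\lambda$-matrix of degree $k + l$ provided at least one of the matrices $\mathbf{a}_0$ and
$\mathbf{b}_0$ is nonsingular.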
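For this product is a $\lambda$-matrix of the form
$$\mathbf{c}_0\lambda^{k+l} + \mathbf{c}_1\lambda^{k+l-1} + \cdots + \mathbf{c}_{k+l},$$
where $\mathbf{c}_0$ has the value $\mathbf{a}_0\mathbf{b}_0$ or $\mathbf{b}_0\mathbf{a}_0$ according to the order in which
the two given matrices are multiplied together. By Theorem 7,
§ 25, neither $\mathbf{a}_0\mathbf{b}_0$ nor $\mathbf{b}_0\mathbf{a}_0$ is zero if $\mathbf{a}_0$ and $\mathbf{b}_0$ are not both singular.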
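The next theorem relates to what we may call the division
of $\lambda$-matrices.
Theorem 3. If $\mathbf{a}$ and $\mathbf{b}$ are two $\lambda$-matrices and if $\mathbf{b}$, when written
in the form (1), has as the coefficient of the highest power of $\lambda$ a nonsingular
matrix, then there exists one, and only one, pair of $\lambda$-matrices
$\mathbf{q}_1$ and $\mathbf{r}_1$ for which
$$\mathbf{a} \equiv \mathbf{q}_1\mathbf{b} + \mathbf{r}_1$$
and such that either $\mathbf{r}_1 \equiv 0$, or $\mathbf{r}_1$ is a $\lambda$-matrix of lower degree than $\mathbf{b}$;
and also one and only one pair of $\lambda$-matrices $\mathbf{q}_2$ and $\mathbf{r}_2$ for which
$$\mathbf{a} \equiv \mathbf{b}\,\mathbf{q}_2 + \mathbf{r}_2$$
and such that either $\mathbf{r}_2 \equiv 0$, or $\mathbf{r}_2$ is a $\lambda$-matrix of lower degree than $\mathbf{b}$.
The proof of this theorem is practically identical with the proof
of Theorem 1, § 63.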
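The step on which such a proof may be expected to rest can be sketched as follows; this is only an outline, on the assumption that the proof referred to proceeds by the ordinary process of division, and it is not quoted from § 63. Let

$$\mathbf{a} \equiv \mathbf{a}_0\lambda^k + \cdots + \mathbf{a}_k, \qquad \mathbf{b} \equiv \mathbf{b}_0\lambda^l + \cdots + \mathbf{b}_l \qquad (k \ge l),$$

where $\mathbf{b}_0$ is nonsingular. Then in each of the differences

$$\mathbf{a} - \bigl(\mathbf{a}_0\mathbf{b}_0^{-1}\lambda^{k-l}\bigr)\mathbf{b}, \qquad \mathbf{a} - \mathbf{b}\bigl(\mathbf{b}_0^{-1}\mathbf{a}_0\lambda^{k-l}\bigr)$$

the coefficient of $\lambda^k$ is $\mathbf{a}_0 - \mathbf{a}_0\mathbf{b}_0^{-1}\mathbf{b}_0 = 0$ or $\mathbf{a}_0 - \mathbf{b}_0\mathbf{b}_0^{-1}\mathbf{a}_0 = 0$ respectively, so that each difference is either identically zero or of degree less than $k$. Repeating the first step until the degree falls below $l$ yields $\mathbf{q}_1$ and $\mathbf{r}_1$; repeating the second yields $\mathbf{q}_2$ and $\mathbf{r}_2$. Since matrices do not in general commute, the two quotients and the two remainders will usually be different.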
EXERCISE
Definition. By a real matrix is understood a matrix whose elements are real;
by a real $\lambda$-matrix, a matrix whose elements are real polynomials in $\lambda$; and by a real
elementary transformation, an elementary transformation in which the constant in (b)
and the polynomial in (c), Definition 1, § 91, are real.
Show that all the results of this chapter still hold if we interpret the words
matrix, $\lambda$-matrix, and elementary transformation to mean real matrix, real $\lambda$-matrix,
and real elementary transformation, respectively.
CHAPTER XXI
THE EQUIVALENCE AND CLASSIFICATION OF PAIRS OF
BILINEAR FORMS AND OF COLLINEATIONS
96. The Equivalence of Pairs of Matrices. The applications of
the theory of elementary divisors with which we shall be concerned
in this chapter and the next have reference to problems in which
$\lambda$-matrices occur only indirectly. A typical problem is the theory
of a pair of bilinear forms. The matrices $\mathbf{a}$ and $\mathbf{b}$ of these two forms
have constant elements, and we get our $\lambda$-matrix only by considering
the matrix $\mathbf{a} - \lambda\mathbf{b}$ of the pencil of forms determined by the two
given forms. It will be noticed that this matrix is of the first
degree, and in fact we shall deal, from now on, exclusively with
$\lambda$-matrices of the first degree.
By the side of this simplification, a new difficulty is introduced,
as will be clear from the following considerations. We shall subject
the two sets of variables in the bilinear forms to two nonsingular
linear transformations whose coefficients we naturally assume to be
constants, that is, independent of $\lambda$. These transformations have the
effect of multiplying the $\lambda$-matrix, $\mathbf{a} - \lambda\mathbf{b}$, by certain nonsingular
matrices whose elements are constants (cf. § 36) and therefore, by
§ 94, carry it over into an equivalent $\lambda$-matrix which is evidently of
the first degree. The transformations of § 94, however, were far
more general than those just referred to, so that it is not at all obvious
whether every $\lambda$-matrix of the first degree equivalent to the
given one can be obtained by transformations of the sort just referred
to or not.
These considerations show the importance of the following
theorem:
Theorem 1. If $\mathbf{a}_1$, $\mathbf{a}_2$, $\mathbf{b}_1$, $\mathbf{b}_2$ are matrices with constant elements
of which the last two are nonsingular, and if the $\lambda$-matrices of the first
degree
$$\mathbf{m}_1 \equiv \mathbf{a}_1 - \lambda\mathbf{b}_1, \qquad \mathbf{m}_2 \equiv \mathbf{a}_2 - \lambda\mathbf{b}_2$$
are equivalent, then there exist two nonsingular matrices, $\mathbf{p}$ and $\mathbf{q}$,
whose elements are independent of $\lambda$, and such that
$$\mathbf{m}_2 \equiv \mathbf{p}\,\mathbf{m}_1\,\mathbf{q}. \tag{1}$$
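Since $\mathbf{m}_1$ and $\mathbf{m}_2$ are equivalent, there exist two nonsingular
$\lambda$-matrices, $\mathbf{p}_0$ and $\mathbf{q}_0$, whose determinants are constants and such that
$$\mathbf{m}_2 \equiv \mathbf{p}_0\,\mathbf{m}_1\,\mathbf{q}_0. \tag{2}$$
The matrix $\mathbf{q}_0$ has, therefore, an inverse, $\mathbf{q}_0^{-1}$, which is also a
$\lambda$-matrix.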
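Let us now divide $\mathbf{p}_0$ by $\mathbf{m}_2$ and $\mathbf{q}_0^{-1}$ by $\mathbf{m}_1$ by means of Theorem
3, § 95, in such a way as to get matrices $\mathbf{p}_1$, $\mathbf{p}$, $\mathbf{s}_1$, $\mathbf{s}$ which satisfy the
relations
$$\mathbf{p}_0 \equiv \mathbf{m}_2\mathbf{p}_1 + \mathbf{p}, \qquad \mathbf{q}_0^{-1} \equiv \mathbf{s}_1\mathbf{m}_1 + \mathbf{s}, \tag{3}$$
$\mathbf{p}$ and $\mathbf{s}$ being matrices whose elements are independent of $\lambda$. From
(2) we get
$$\mathbf{m}_2\mathbf{q}_0^{-1} \equiv \mathbf{p}_0\mathbf{m}_1.$$
Substituting here from (3), we have
$$\mathbf{m}_2\mathbf{s}_1\mathbf{m}_1 + \mathbf{m}_2\mathbf{s} \equiv \mathbf{m}_2\mathbf{p}_1\mathbf{m}_1 + \mathbf{p}\,\mathbf{m}_1,$$
or
$$\mathbf{m}_2(\mathbf{p}_1 - \mathbf{s}_1)\mathbf{m}_1 \equiv \mathbf{m}_2\mathbf{s} - \mathbf{p}\,\mathbf{m}_1. \tag{4}$$
From this identity we may infer that $\mathbf{p}_1 \equiv \mathbf{s}_1$ and therefore
$$\mathbf{m}_2\mathbf{s} \equiv \mathbf{p}\,\mathbf{m}_1. \tag{5}$$
For if $\mathbf{p}_1 - \mathbf{s}_1$ were not identically zero, $\mathbf{m}_2(\mathbf{p}_1 - \mathbf{s}_1)$ would be a
$\lambda$-matrix of at least the first degree (cf. Theorem 2, § 95), and hence
the left-hand side of (4) would be a $\lambda$-matrix of at least the second
degree. But this is impossible, since the right-hand side of (4)
is a $\lambda$-matrix of at most the first degree.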
If we knew that $\mathbf{p}$ and $\mathbf{s}$ were both nonsingular, our theorem
would follow at once from (5); for we could write (5) in the form
$$\mathbf{m}_2 \equiv \mathbf{p}\,\mathbf{m}_1\,\mathbf{s}^{-1}, \tag{6}$$
and $\mathbf{p}$ and $\mathbf{s}^{-1}$ would be nonsingular matrices with constant elements.
Moreover, we see from (5) that $\mathbf{p}$ and $\mathbf{s}$ are either both singular or
both nonsingular. Our theorem will thus be proved if we can
show that $\mathbf{s}$ is nonsingular.
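For this purpose let us substitute in the identity
$$\mathbf{I} \equiv \mathbf{q}_0\,\mathbf{q}_0^{-1}$$
for $\mathbf{q}_0^{-1}$ its value from (3),
$$\mathbf{I} \equiv \mathbf{q}_0\mathbf{s}_1\mathbf{m}_1 + \mathbf{q}_0\mathbf{s}. \tag{7}$$
Now divide $\mathbf{q}_0$ by $\mathbf{m}_2$ by means of Theorem 3, § 95, in such a way as
to get
$$\mathbf{q}_0 \equiv \mathbf{q}_1\mathbf{m}_2 + \mathbf{q}, \tag{8}$$
where $\mathbf{q}$ is a matrix with constant elements.
Substituting this value in the last term of (7), we have
$$\mathbf{I} \equiv \mathbf{q}_0\mathbf{s}_1\mathbf{m}_1 + \mathbf{q}_1\mathbf{m}_2\mathbf{s} + \mathbf{q}\,\mathbf{s}.$$
Referring to (5), we see that this may be written
$$\mathbf{I} - \mathbf{q}\,\mathbf{s} \equiv (\mathbf{q}_0\mathbf{s}_1 + \mathbf{q}_1\mathbf{p})\,\mathbf{m}_1. \tag{9}$$
From this we infer that $\mathbf{q}_0\mathbf{s}_1 + \mathbf{q}_1\mathbf{p}$ must be identically zero, and
therefore
$$\mathbf{I} = \mathbf{q}\,\mathbf{s}. \tag{10}$$
For if $\mathbf{q}_0\mathbf{s}_1 + \mathbf{q}_1\mathbf{p}$ were not identically zero, the right-hand side of
(9) would be a $\lambda$-matrix of at least the first degree, while the left-hand
side of (9) does not involve $\lambda$.
Equation (10) shows that $\mathbf{s}$ is nonsingular, and thus our theorem
is proved. It shows us, however, also that $\mathbf{q}$ is nonsingular, and
that $\mathbf{q} = \mathbf{s}^{-1}$, so that equation (6) becomes $\mathbf{m}_2 \equiv \mathbf{p}\,\mathbf{m}_1\,\mathbf{q}$.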
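We may, therefore, add the following
Corollary. The matrices $\mathbf{p}$ and $\mathbf{q}$ whose existence is stated in the
above theorem may be obtained as the remainders in the division of $\mathbf{p}_0$
and $\mathbf{q}_0$ in (2) by $\mathbf{m}_2$ by means of the formulae:
$$\mathbf{p}_0 \equiv \mathbf{m}_2\mathbf{p}_1 + \mathbf{p}, \qquad \mathbf{q}_0 \equiv \mathbf{q}_1\mathbf{m}_2 + \mathbf{q}.$$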
From this theorem concerning $\lambda$-matrices of the first degree we
can now deduce the following theorem concerning pairs of matrices
with constant elements. It is this theorem which forms the main
foundation for such applications of the theory of elementary divisors
as we shall give.
We shall naturally speak of two pairs of matrices with constant
elements $\mathbf{a}_1$, $\mathbf{b}_1$ and $\mathbf{a}_2$, $\mathbf{b}_2$ as equivalent if two nonsingular matrices $\mathbf{p}$
and $\mathbf{q}$ exist for which
$$\mathbf{a}_2 = \mathbf{p}\,\mathbf{a}_1\,\mathbf{q}, \qquad \mathbf{b}_2 = \mathbf{p}\,\mathbf{b}_1\,\mathbf{q}. \tag{11}$$
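Theorem 2. If $\mathbf{a}_1$, $\mathbf{b}_1$ and $\mathbf{a}_2$, $\mathbf{b}_2$ are two pairs of matrices
independent of $\lambda$, and if $\mathbf{b}_1$ and $\mathbf{b}_2$ are nonsingular, a necessary
and sufficient condition that these two pairs of matrices be equivalent
is that the two $\lambda$-matrices
$$\mathbf{m}_1 \equiv \mathbf{a}_1 - \lambda\mathbf{b}_1, \qquad \mathbf{m}_2 \equiv \mathbf{a}_2 - \lambda\mathbf{b}_2$$
have the same invariant factors, or, if we prefer, the same elementary
divisors.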
For if the pairs of matrices are equivalent, equations (11)
hold; hence, multiplying the second of these equations by $\lambda$
and subtracting it from the first, we have
$$\mathbf{m}_2 \equiv \mathbf{p}\,\mathbf{m}_1\,\mathbf{q}, \tag{12}$$
that is, the $\lambda$-matrices $\mathbf{m}_1$ and $\mathbf{m}_2$ are equivalent, and therefore have
the same invariant factors, and the same elementary divisors. On
the other hand, it follows at once from the assumption that $\mathbf{b}_1$ and $\mathbf{b}_2$
are nonsingular, that $\mathbf{m}_1$ and $\mathbf{m}_2$ are nonsingular, and hence have
the same rank. Consequently if $\mathbf{m}_1$ and $\mathbf{m}_2$ have the same invariant
factors, or the same elementary divisors, they are equivalent. Since
they are of the first degree, there must, by Theorem 1, exist two
nonsingular matrices $\mathbf{p}$ and $\mathbf{q}$, whose elements are independent of $\lambda$,
which satisfy the identity (12). From this identity, the two equations
(11) follow at once; and the two pairs of matrices are equivalent.
Thus the proof of our theorem is complete.
A case of considerable importance is that in which the matrices
$\mathbf{b}_1$ and $\mathbf{b}_2$ both reduce to the unit matrix $\mathbf{I}$. In this case $\mathbf{m}_1$ and $\mathbf{m}_2$
reduce to what are known as the characteristic matrices of $\mathbf{a}_1$ and $\mathbf{a}_2$
respectively, according to the following definition:
Definition. If $\mathbf{a}$ is a matrix of the $n$th order with constant elements
and $\mathbf{I}$ the unit matrix of the $n$th order, the $\lambda$-matrix
$$\mathbf{A} \equiv \mathbf{a} - \lambda\mathbf{I}$$
is called the characteristic matrix of $\mathbf{a}$; the determinant of $\mathbf{A}$ is called
the characteristic function of $\mathbf{a}$; and the equation of the $n$th degree in
$\lambda$ formed by setting this determinant equal to zero is called the characteristic
equation of $\mathbf{a}$.
We can now deduce from Theorem 2 the following more special
result:
Theorem 3. If $\mathbf{a}_1$ and $\mathbf{a}_2$ are two matrices independent of $\lambda$, a necessary
and sufficient condition that a nonsingular matrix $\mathbf{p}$ exist such that*
$$\mathbf{a}_2 = \mathbf{p}\,\mathbf{a}_1\,\mathbf{p}^{-1} \tag{13}$$
is that the characteristic matrices $\mathbf{A}_1$ and $\mathbf{A}_2$ of $\mathbf{a}_1$ and $\mathbf{a}_2$ have the same
invariant factors, or, if we prefer, the same elementary divisors.
For if $\mathbf{A}_1$ and $\mathbf{A}_2$ have the same invariant factors (or elementary
divisors), there exist, by Theorem 2, two nonsingular matrices $\mathbf{p}$
and $\mathbf{q}$ such that
$$\mathbf{a}_2 = \mathbf{p}\,\mathbf{a}_1\,\mathbf{q}, \qquad \mathbf{I} = \mathbf{p}\,\mathbf{I}\,\mathbf{q}.$$
The second of these equations shows us that $\mathbf{q} = \mathbf{p}^{-1}$; and this
value being substituted in the first, we see that $\mathbf{p}$ is the matrix whose
existence our theorem asserts.
That, on the other hand, $\mathbf{A}_1$ and $\mathbf{A}_2$ have the same invariant factors
and elementary divisors if equation (13) is fulfilled, is at once obvious.
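It may be worth noticing, by means of a simple example which is not part of the text, that equality of the characteristic functions alone is not sufficient for a relation of the form (13). Take

$$\mathbf{a}_1 = \begin{pmatrix} 0 & 1\\ 0 & 0 \end{pmatrix}, \qquad \mathbf{a}_2 = \begin{pmatrix} 0 & 0\\ 0 & 0 \end{pmatrix},$$

so that

$$\mathbf{A}_1 \equiv \begin{pmatrix} -\lambda & 1\\ 0 & -\lambda \end{pmatrix}, \qquad \mathbf{A}_2 \equiv \begin{pmatrix} -\lambda & 0\\ 0 & -\lambda \end{pmatrix}.$$

Both characteristic functions are $\lambda^2$. For $\mathbf{A}_1$, however, $D_1 = 1$, $D_2 = \lambda^2$, so that the invariant factors are $1$, $\lambda^2$ and the only elementary divisor is $\lambda^2$; while for $\mathbf{A}_2$ we have $D_1 = \lambda$, $D_2 = \lambda^2$, so that the invariant factors are $\lambda$, $\lambda$ and the elementary divisors are $\lambda$, $\lambda$. By Theorem 3 no nonsingular $\mathbf{p}$ satisfying (13) exists, as is also evident directly, since $\mathbf{p}\,\mathbf{a}_2\,\mathbf{p}^{-1} = 0$ for every $\mathbf{p}$.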
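97. The Equivalence of Pairs of Bilinear Forms. Suppose we
have a pair of bilinear forms in $2n$ variables
$$\phi_1 \equiv \sum_{1}^{n} a'_{ij}x_iy_j, \qquad \psi_1 \equiv \sum_{1}^{n} b'_{ij}x_iy_j,$$
and also a second pair
$$\phi_2 \equiv \sum_{1}^{n} a''_{ij}x_iy_j, \qquad \psi_2 \equiv \sum_{1}^{n} b''_{ij}x_iy_j;$$
and let us assume that $\psi_1$ and $\psi_2$ are nonsingular. We will inquire
under what conditions the two pairs of forms are equivalent,
that is, under what conditions a first nonsingular linear transformation
for the $x$'s and a second for the $y$'s,
$$\mathbf{c}\ \begin{cases} x_1 = c_{11}x'_1 + \cdots + c_{1n}x'_n\\ \qquad\cdots\cdots\\ x_n = c_{n1}x'_1 + \cdots + c_{nn}x'_n \end{cases}
\qquad
\mathbf{d}\ \begin{cases} y_1 = d_{11}y'_1 + \cdots + d_{1n}y'_n\\ \qquad\cdots\cdots\\ y_n = d_{n1}y'_1 + \cdots + d_{nn}y'_n \end{cases}$$
can be found which together carry over $\phi_1$ into $\phi_2$ and $\psi_1$ into $\psi_2$.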
* Two matrices connected by a relation of the form (13) are sometimes called
similar matrices. This conception of similarity is evidently merely a special case of
the general conception of equivalence as defined in § 29, the transformations considered
being of the form (13) instead of the more general form usually considered in this
chapter and the last.
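If we denote the conjugate of the matrix $\mathbf{c}$ by $\mathbf{c}'$ and the matrices
of $\phi_1$, $\psi_1$, $\phi_2$, $\psi_2$ by $\mathbf{a}_1$, $\mathbf{b}_1$, $\mathbf{a}_2$, $\mathbf{b}_2$ respectively, we know, by Theorem 1,
§ 36, that the transformations $\mathbf{c}$, $\mathbf{d}$ carry over $\phi_1$ and $\psi_1$ into forms
with matrices
$$\mathbf{c}'\mathbf{a}_1\mathbf{d}, \qquad \mathbf{c}'\mathbf{b}_1\mathbf{d}$$
respectively; so that, if these are the forms $\phi_2$ and $\psi_2$, we have
$$\mathbf{a}_2 = \mathbf{c}'\mathbf{a}_1\mathbf{d}, \qquad \mathbf{b}_2 = \mathbf{c}'\mathbf{b}_1\mathbf{d}. \tag{1}$$
Consequently, by Theorem 2, § 96, the two $\lambda$-matrices
$$\mathbf{a}_1 - \lambda\mathbf{b}_1, \qquad \mathbf{a}_2 - \lambda\mathbf{b}_2$$
have the same invariant factors and elementary divisors.
Conversely, by the same theorem, if these two $\lambda$-matrices have
the same invariant factors (or elementary divisors), two constant
matrices $\mathbf{c}'$ and $\mathbf{d}$ exist which satisfy both equations (1); and hence
there exists a linear transformation of the $x$'s and another of the $y$'s
which together carry over $\phi_1$ into $\phi_2$ and $\psi_1$ into $\psi_2$. Thus we
have proved the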
Theorem. If $\phi_1$, $\psi_1$ and $\phi_2$, $\psi_2$ are two pairs of bilinear forms in $2n$
variables of which $\psi_1$ and $\psi_2$ are nonsingular, a necessary and sufficient
condition that these two pairs of forms be equivalent is that the matrices
of the two pencils
$$\phi_1 - \lambda\psi_1, \qquad \phi_2 - \lambda\psi_2$$
have the same invariant factors, or, if we prefer, the same elementary
divisors.*
EXERCISE
Prove that the theorem of this section remains true if the bilinear forms
$\phi_1$, $\psi_1$, $\phi_2$, $\psi_2$ are real and the term equivalent is understood to mean equivalent with
regard to real nonsingular linear transformations.
98. The Equivalence of Collineations. A second important application
of the theory of elementary divisors is to the theory of collineations.
For the sake of simplicity we will consider the case of
two dimensions,
$$\mathbf{a}\ \begin{cases}
x'_1 = a_{11}x_1 + a_{12}x_2 + a_{13}x_3,\\
x'_2 = a_{21}x_1 + a_{22}x_2 + a_{23}x_3,\\
x'_3 = a_{31}x_1 + a_{32}x_2 + a_{33}x_3,
\end{cases}$$
although the reasoning will be seen to be perfectly general.
* For the sake of brevity, we shall, in future, speak of these invariant factors and
elementary divisors as the invariant factors and elementary divisors of the pairs of
forms $\phi_1$, $\psi_1$ and $\phi_2$, $\psi_2$ respectively.
We have so far regarded a collineation merely as a means of
transforming certain geometric figures. It is possible to adopt another
point of view, and to study the collineation in itself with
special reference to the relative position of points before and after
the transformation. Thus suppose we have a figure consisting of
the points $A_1, A_2, \dots$, finite or infinite in number, and suppose these
points are carried over by the collineation $\mathbf{a}$ into the points $A'_1,
A'_2, \dots$. These two sets of points together form a geometric figure.
It is the properties of such figures as this that we call the properties
of the collineation. Such properties may be either projective or
metrical. Thus it would be a metrical property of a collineation if
it carried over some particular pair of perpendicular lines into a pair
of perpendicular lines; it would be a projective property of the
collineation if it carried over some particular triangle into itself. We
shall be concerned only with the projective properties of collineations.
As an example, let us consider the fixed points of the coUineation,
that is points whose initial and final position is the same. In order
that (ajj, x^^, iCg) be a fixed point it is necessary and sufficient that
xj = \Xi, x'2 = Xx^, x'^ = Xxg,
that is, substituting in a, that a constant X exist such that,
(1) fflgja;! + (a22  X)2;2 + a^sXg = 0,
«31^1 + «32 ^2 + («33  ^>3 = ^'
The matrix of this system of equations is precisely what we have
called the characteristic matrix of the matrix $\mathbf{a}$ of the linear
transformation. The characteristic function is a polynomial of the third
degree in $\lambda$ which, when equated to zero, has one, two, or three
distinct roots. Let $\lambda_1$ be one of these roots. When this is
substituted in (1), these equations are satisfied by the coordinates of
one or more points, the fixed points of the collineation $\mathbf{a}$.
The number and distribution of these fixed points give an important
example of a projective property of a collineation; and it is readily
seen that collineations may have wholly different properties in this
respect, one having three fixed points, another two, and still another
an infinite number.
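The three situations just mentioned can be checked numerically. The
following is only a sketch under stated assumptions: the helper names
`rank` and `fixed_point_dimensions` are hypothetical, the roots of the
characteristic equation are supplied by hand rather than computed, and
exact rational arithmetic is used. For each root it reports the
projective dimension of the set of fixed points (0 for a single point,
1 for a whole line of fixed points).

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        # Look for a nonzero pivot in this column at or below row r.
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r


def fixed_point_dimensions(a, roots):
    """Map each supplied root to the projective dimension of the solutions of (1)."""
    n = len(a)
    result = {}
    for lam in roots:
        shifted = [[a[i][j] - (lam if i == j else 0) for j in range(n)]
                   for i in range(n)]
        result[lam] = n - rank(shifted) - 1
    return result


# Illustrative matrices (assumptions, not taken from the text).
three_points = [[1, 0, 0], [0, 2, 0], [0, 0, 3]]
two_points = [[1, 1, 0], [0, 1, 0], [0, 0, 2]]
line_and_point = [[1, 0, 0], [0, 1, 0], [0, 0, 2]]

print(fixed_point_dimensions(three_points, [1, 2, 3]))
print(fixed_point_dimensions(two_points, [1, 2]))
print(fixed_point_dimensions(line_and_point, [1, 2]))
```

In the third example the root 1 yields dimension 1, that is, a line
every point of which is fixed, which is the case of an infinite number
of fixed points.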
Coming back now to the two sets of points $A_1, A_2, \dots$ and
$A_1', A_2', \dots$ which correspond to one another by means of the
collineation $\mathbf{a}$ (which may be singular or nonsingular), let us
subject all these points to a nonsingular collineation $\mathbf{c}$,
which carries over $A_1, A_2, \dots$ into $B_1, B_2, \dots$ and
$A_1', A_2', \dots$ into $B_1', B_2', \dots$ respectively. The figure
formed by the $B$'s will have the same projective properties as that
formed by the $A$'s; and consequently if we can find a collineation
$\mathbf{b}$ which carries over $B_1, B_2, \dots$ into
$B_1', B_2', \dots$, this collineation will have the same projective
properties as the collineation $\mathbf{a}$. Such a collineation is
clearly given by the formula
$$\mathbf{b} = \mathbf{c}\mathbf{a}\mathbf{c}^{-1}, \tag{2}$$
since $\mathbf{c}^{-1}$ carries over the points $B_i$ into the points
$A_i$, $\mathbf{a}$ then carries over these into $A_i'$, and $\mathbf{c}$
carries over the points $A_i'$ into the points $B_i'$.
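A small numerical illustration of formula (2) may help. The sketch
below is not part of the original text; the matrices `a` and `c` are
arbitrary choices and the helpers `mat_mul`, `inverse`, `apply`, and
`is_fixed` are hypothetical names. It forms $\mathbf{b} =
\mathbf{c}\mathbf{a}\mathbf{c}^{-1}$ in exact rational arithmetic and
checks that $\mathbf{c}$ carries each fixed point of $\mathbf{a}$ into a
fixed point of $\mathbf{b}$.

```python
from fractions import Fraction


def mat_mul(p, q):
    """Product of two matrices given as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]


def inverse(m):
    """Gauss-Jordan inverse over the rationals; raises StopIteration if m is singular."""
    n = len(m)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(i for i in range(col, n) if aug[i][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for i in range(n):
            if i != col and aug[i][col] != 0:
                factor = aug[i][col]
                aug[i] = [v - factor * w for v, w in zip(aug[i], aug[col])]
    return [row[n:] for row in aug]


def apply(m, point):
    """Image of a point (homogeneous coordinates) under the collineation m."""
    return [sum(m[i][j] * point[j] for j in range(len(point)))
            for i in range(len(m))]


def is_fixed(m, point):
    """True when the image is a nonzero multiple of the point itself."""
    image = apply(m, point)
    n = len(point)
    proportional = all(point[i] * image[j] == point[j] * image[i]
                       for i in range(n) for j in range(i + 1, n))
    return proportional and any(v != 0 for v in image)


a = [[1, 1, 0], [0, 1, 0], [0, 0, 2]]   # fixed points (1,0,0) and (0,0,1)
c = [[1, 2, 0], [0, 1, 1], [1, 0, 1]]   # an arbitrary nonsingular collineation
b = mat_mul(mat_mul(c, a), inverse(c))  # formula (2)

for point in ([1, 0, 0], [0, 0, 1]):
    print(is_fixed(a, point), is_fixed(b, apply(c, point)))
```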
Since two collineations $\mathbf{a}$ and $\mathbf{b}$ related by formula
(2) are indistinguishable so far as their projective properties go
(though they may have very different metrical properties), we will call
them equivalent according to the following
Definition. Two collineations $\mathbf{a}$ and $\mathbf{b}$ shall be
called equivalent if a nonsingular collineation $\mathbf{c}$ exists such
that relation (2) is fulfilled.
A reference to Theorem 3, § 96, now gives us the fundamental theorem:
Theorem. A necessary and sufficient condition that two collineations be
equivalent is that their characteristic matrices have the same invariant
factors, or, if we prefer, the same elementary divisors.
EXERCISES
1. If $P_1, P_2, \dots, P_k$ are fixed points of a nonsingular collineation in space of
$n - 1$ dimensions which correspond to $k$ distinct roots of the characteristic
equation, prove that these points are linearly independent.
2. Discuss the distribution of the fixed points of a collineation
(a) in two dimensions,
(b) in three dimensions,
for all possible cases of nonsingular collineations.
3. Discuss the distribution of
(a) the fixed lines of a collineation in two dimensions,
(b) the fixed planes of a collineation in three dimensions,
for all possible cases of nonsingular collineations, paying special attention to their
relation to the fixed points.
4. Two real collineations, $\mathbf{a}$ and $\mathbf{b}$, may be said to be equivalent if there exists
a real nonsingular collineation $\mathbf{c}$ such that $\mathbf{b} = \mathbf{c}\mathbf{a}\mathbf{c}^{-1}$.
With this understanding of the term equivalence, show that the theorem of the
present section holds for real collineations.
99. Classification of Pairs of Bilinear Forms. We consider again
the pair of bilinear forms
$$\phi \equiv \sum_{1}^{n} a_{ij}x_iy_j, \qquad \psi \equiv \sum_{1}^{n} b_{ij}x_iy_j,$$
of which we assume the second to be nonsingular, and form the
$\lambda$-matrix
$$\mathbf{a} - \lambda\mathbf{b}. \tag{1}$$
Using a slightly different notation from that employed in § 92, we
will denote the elementary divisors of (1) by
$$(\lambda-\lambda_1)^{e_1},\ (\lambda-\lambda_2)^{e_2},\ \dots,\ (\lambda-\lambda_k)^{e_k} \qquad (e_1 + e_2 + \dots + e_k = n),$$
so that the linear factors $\lambda - \lambda_i$ need not all be distinct from one
another. The most important thing concerning these elementary
divisors is, for many purposes, their degrees, $e_1, e_2, \dots, e_k$. When we
wish to indicate these degrees without writing out the elementary
divisors in full, we will use the symbol $[e_1\ e_2\ \dots\ e_k]$, called the
characteristic of the $\lambda$-matrix (1), or of the pair of forms $\phi, \psi$. It will be
seen that this characteristic is a sort of arithmetical invariant of the
pair of bilinear forms, since two pairs of bilinear forms which are
equivalent necessarily have the same characteristic. The converse
of this, however, is not true, since for the equivalence of two pairs of
bilinear forms the identity of the elementary divisors themselves,
not merely the equality of their degrees, is necessary.
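All pairs of bilinear forms which have the same characteristic are
said to form a category. Thus, for example, in the case of pairs of
bilinear forms in six variables we should distinguish between three
categories corresponding to the three characteristics,
[1 1 1], [2 1], [3],
which are obviously the only possible ones in this case. In fact, we
must inquire whether these three categories really all exist. This
question we answer in the affirmative by writing down the following
pairs of bilinear forms in six variables which represent these three
categories, each followed by the matrix of the pencil $\phi - \lambda\psi$:
I. [1 1 1]
$$\phi \equiv \lambda_1x_1y_1 + \lambda_2x_2y_2 + \lambda_3x_3y_3, \qquad
\psi \equiv x_1y_1 + x_2y_2 + x_3y_3,$$
$$\begin{pmatrix}
\lambda_1-\lambda & 0 & 0\\
0 & \lambda_2-\lambda & 0\\
0 & 0 & \lambda_3-\lambda
\end{pmatrix}.$$
II. [2 1]
$$\phi \equiv \lambda_1x_1y_1 + \lambda_1x_2y_2 + \lambda_2x_3y_3 + x_1y_2, \qquad
\psi \equiv x_1y_1 + x_2y_2 + x_3y_3,$$
$$\begin{pmatrix}
\lambda_1-\lambda & 1 & 0\\
0 & \lambda_1-\lambda & 0\\
0 & 0 & \lambda_2-\lambda
\end{pmatrix}.$$
III. [3]
$$\phi \equiv \lambda_1x_1y_1 + \lambda_1x_2y_2 + \lambda_1x_3y_3 + x_1y_2 + x_2y_3, \qquad
\psi \equiv x_1y_1 + x_2y_2 + x_3y_3,$$
$$\begin{pmatrix}
\lambda_1-\lambda & 1 & 0\\
0 & \lambda_1-\lambda & 1\\
0 & 0 & \lambda_1-\lambda
\end{pmatrix}.$$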
The pairs of bilinear forms we have just written down do more
than merely establish the existence of our three categories. They
establish the fact that not only the degrees of the elementary divisors
are arbitrary (subject merely to the condition that their sum be
three), but that, subject to this restriction, the elementary divisors
themselves may be arbitrarily chosen. They are, moreover, normal
forms to one or the other of which every pair of bilinear forms in
six variables, of which the second is nonsingular, may be reduced by
nonsingular linear transformations.
The general theorem here is this:
Theorem. If $\lambda_1, \lambda_2, \dots, \lambda_k$ are any constants, equal or unequal, and
$e_1, e_2, \dots, e_k$ are any positive integers whose sum is $n$, there exist pairs of
bilinear forms in $2n$ variables, the second form in each pair being
nonsingular, which have the elementary divisors
$$(\lambda-\lambda_1)^{e_1},\ (\lambda-\lambda_2)^{e_2},\ \dots,\ (\lambda-\lambda_k)^{e_k}. \tag{2}$$
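The proof of this theorem consists in considering the pair of
bilinear forms
$$\begin{aligned}
\phi \equiv{}& \sum_{1}^{e_1}\lambda_1x_iy_i + \sum_{e_1+1}^{e_1+e_2}\lambda_2x_iy_i + \dots + \sum_{n-e_k+1}^{n}\lambda_kx_iy_i\\
&+ \sum_{1}^{e_1-1}x_iy_{i+1} + \sum_{e_1+1}^{e_1+e_2-1}x_iy_{i+1} + \dots + \sum_{n-e_k+1}^{n-1}x_iy_{i+1},\\
\psi \equiv{}& x_1y_1 + x_2y_2 + \dots + x_ny_n,
\end{aligned}\tag{3}$$
of which the second is nonsingular. These forms have a $\lambda$-matrix
which may be indicated, for brevity, as
$$\begin{pmatrix}
\mathbf{M}_1 & & & \\
 & \mathbf{M}_2 & & \\
 & & \ddots & \\
 & & & \mathbf{M}_k
\end{pmatrix}, \tag{4}$$
where the letters $\mathbf{M}_1, \dots, \mathbf{M}_k$ represent not single terms but blocks of
terms; $\mathbf{M}_i$ standing for the matrix of order $e_i$
$$\mathbf{M}_i = \begin{pmatrix}
\lambda_i-\lambda & 1 & 0 & \cdots & 0\\
0 & \lambda_i-\lambda & 1 & \cdots & 0\\
\vdots & & \ddots & \ddots & \vdots\\
0 & 0 & \cdots & \lambda_i-\lambda & 1\\
0 & 0 & \cdots & 0 & \lambda_i-\lambda
\end{pmatrix},$$
while all the terms of the matrix (4) are zero which do not stand in
one of the blocks of terms $\mathbf{M}_i$. The elementary divisors of (4) are,
as we see by a reference to § 93 (Formula (1) and Theorem 2),
precisely the expressions (2). Thus our theorem is proved.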
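The block structure of (4) is easy to build mechanically. The sketch
below is an illustration only; the function name `normal_form_matrix` is
hypothetical. It assembles the matrix of the form $\phi$ of (3), that
is, the matrix (4) with $\lambda = 0$: each constant $\lambda_i$ is
repeated down the diagonal of its block, with ones immediately to the
right of the diagonal inside the block.

```python
def normal_form_matrix(lams, degrees):
    """Matrix of phi in the normal form (3) for constants lams and degrees e_i."""
    n = sum(degrees)
    m = [[0] * n for _ in range(n)]
    start = 0
    for lam, e in zip(lams, degrees):
        for i in range(start, start + e):
            m[i][i] = lam              # diagonal entry of the block
            if i + 1 < start + e:
                m[i][i + 1] = 1        # one just right of the diagonal
        start += e
    return m


# Elementary divisors (l - 5)^2, (l - 5), (l - 7): characteristic [(2 1) 1].
for row in normal_form_matrix([5, 5, 7], [2, 1, 1]):
    print(row)
```

Subtracting $\lambda$ from every diagonal entry of the result gives the
matrix (4) itself.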
A reference to § 97 shows that formula (3) is a normal form to
which every pair of bilinear forms in $2n$ variables with the
elementary divisors (2) can be reduced.*
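* Many other normal forms might be chosen in place of (3). Thus, for instance,
we might have used in place of (3) the form
$$\begin{aligned}
\phi \equiv{}& \sum_{1}^{e_1} c_1\lambda_1x_iy_{e_1-i+1} + \sum_{e_1+1}^{e_1+e_2} c_2\lambda_2x_iy_{2e_1+e_2-i+1} + \dots + \sum_{n-e_k+1}^{n} c_k\lambda_kx_iy_{2n-e_k-i+1}\\
&+ \sum_{1}^{e_1-1} d_1x_iy_{e_1-i} + \sum_{e_1+1}^{e_1+e_2-1} d_2x_iy_{2e_1+e_2-i} + \dots + \sum_{n-e_k+1}^{n-1} d_kx_iy_{2n-e_k-i},\\
\psi \equiv{}& \sum_{1}^{e_1} c_1x_iy_{e_1-i+1} + \sum_{e_1+1}^{e_1+e_2} c_2x_iy_{2e_1+e_2-i+1} + \dots + \sum_{n-e_k+1}^{n} c_kx_iy_{2n-e_k-i+1},
\end{aligned}\tag{3'}$$
where the constants $c_1, \dots, c_k$, $d_1, \dots, d_k$ may be chosen at pleasure provided merely
that none of them are zero. For instance, they may all be assigned the value 1.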
Let us now return to the classification of pairs of bilinear forms.
For a given number, $2n$, of variables we have obviously only a finite
number of categories. We may subdivide these categories into
classes by noticing which, if any, of the elementary divisors
correspond to the same linear factor. This we can indicate in the
characteristic by connecting by parentheses those integers which are the
degrees of elementary divisors corresponding to one and the same
linear factor. Thus, in the case $n = 8$, the characteristic
[(2 1)(1 1 1) 2]
would indicate that the $\lambda$-matrix has just three distinct linear
factors; that to one of these there correspond two elementary divisors
of degrees two and one respectively, to another three elementary
divisors of the first degree, and to the last a single elementary
divisor of degree two.
Two pairs of bilinear forms which are equivalent belong necessarily
to the same class, but two pairs of bilinear forms which belong
to the same class are not necessarily equivalent.
To illustrate what has just been said, let us again consider the
case $n = 3$. Here we have now, instead of three categories, six
classes, which are exhibited in the following table:
        a            b             c
I.      [1 1 1]      [(1 1) 1]     [(1 1 1)]
II.     [2 1]        [(2 1)]
III.    [3]
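The $\lambda$-matrix of this pair of forms may be written in the form (4), where,
however, $\mathbf{M}_i$ now stands for the matrix of order $e_i$:
$$\mathbf{M}_i = \begin{pmatrix}
0 & \cdots & 0 & d_i & c_i(\lambda_i-\lambda)\\
0 & \cdots & d_i & c_i(\lambda_i-\lambda) & 0\\
\vdots & & & & \vdots\\
d_i & c_i(\lambda_i-\lambda) & \cdots & 0 & 0\\
c_i(\lambda_i-\lambda) & 0 & \cdots & 0 & 0
\end{pmatrix}.$$
It will be noticed that the matrices $\mathbf{M}_i$, and therefore also the bilinear forms (3'),
are symmetrical, a fact which will make this normal form important when we come to
the subject of quadratic forms in the next chapter.
Constants similar to the constants $c_i$ and $d_i$ which we have introduced in (3')
might also have been introduced in (3).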
The three classes Ia, Ib, Ic form together the category I, and are
all represented by the normal form given for that category above,
the only difference being that in class Ia the three quantities
$\lambda_1, \lambda_2, \lambda_3$ are all distinct, in class Ib two, and only two, of
them are equal, while in class Ic they are all equal. Similarly
category II is now divided into two classes, IIa and IIb, for both of
which the normal form of category II holds good, $\lambda_1$ and $\lambda_2$
being, however, different in that normal form for class IIa and equal
for class IIb. Finally category III consists of only a single class.
For some purposes it is desirable to carry this subdivision still
farther. The second of our two bilinear forms, $\psi$, has been assumed
throughout to be nonsingular. The first, $\phi$, may be singular or
nonsingular; and it is readily seen that a necessary and sufficient
condition that $\phi$ be singular is that one, at least, of the constants
$\lambda_i$ which enter into the linear factors of the $\lambda$-matrix be zero.
Thus it will be seen that in a single class we shall have pairs of
bilinear forms both of which are nonsingular and others one of which is
singular, and we may wish to separate into different subclasses
the pairs of forms which belong to one or the other of these two
cases.
Let us go a step farther in this same direction, and inquire how
the rank of $\phi$ is connected with the values of the constants $\lambda_i$. We
notice that the matrix of $\phi$ is equal to the matrix of the pencil
$\phi - \lambda\psi$ when $\lambda = 0$. Accordingly, if $\phi$ is of rank $r$, every
$(r+1)$-rowed determinant of the matrix of $\phi - \lambda\psi$ will be divisible
by $\lambda$, while at least one $r$-rowed determinant of this matrix is not
divisible by $\lambda$. It is then necessary, as we see by a reference to the
definition of elementary divisors (cf. the footnote to Definition 3,
§ 92), that just $n - r$ of the constants $\lambda_i$ which enter into the
elementary divisors should be zero. Since the converse of these
statements is also true, we may say that a necessary and sufficient
condition that the form $\phi$ be of rank $r$ is that just $n - r$ of the
elementary divisors be of the form $\lambda^{e_i}$. Let us, in the characteristic
$[e_1\ e_2\ \dots\ e_k]$, place a small zero above each of the integers $e_i$
which is the degree of such an elementary divisor; and regard two pairs
of bilinear forms as belonging to a single class when, and only when,
their characteristics coincide in the distribution of these zeros as
well as in other respects. Here again two equivalent pairs of forms
will always belong to the same class, but the converse will not be true.
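As an illustration, let us again take the case $n = 3$. We have
now fourteen classes instead of six.
$[1\ 1\ 1]$, $[(1\ 1)\ 1]$, $[(1\ 1\ 1)]$, $[2\ 1]$, $[(2\ 1)]$, $[3]$,   $(r = 3)$,
$[1\ 1\ \overset{0}{1}]$, $[(1\ 1)\ \overset{0}{1}]$, $[2\ \overset{0}{1}]$, $[\overset{0}{2}\ 1]$, $[\overset{0}{3}]$,   $(r = 2)$,
$[(\overset{0}{1}\ \overset{0}{1})\ 1]$, $[(\overset{0}{2}\ \overset{0}{1})]$,   $(r = 1)$,
$[(\overset{0}{1}\ \overset{0}{1}\ \overset{0}{1})]$,   $(r = 0)$.
We have indicated, in each case, the rank $r$ of the form $\phi$. Thus
in the first six cases $\phi$ is nonsingular; in the next five it is of rank
2, etc.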
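The counts three, six, and fourteen can be reproduced by brute-force
enumeration. The sketch below is an illustration under the following
reading of the text, stated here as an assumption: a category is a
partition of $n$; a class is an unordered collection of parenthesized
groups, each group being itself a partition; and the zero marks can
fall on at most one whole group, since all elementary divisors of the
form $\lambda^{e}$ belong to the same linear factor. The function names
are hypothetical.

```python
def partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples."""
    if largest is None or largest > n:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(largest, 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def classes(n):
    """All classes: unordered collections of groups whose degrees sum to n."""
    groups = [p for size in range(1, n + 1) for p in partitions(size)]

    def build(remaining, start):
        if remaining == 0:
            yield ()
            return
        # Groups are taken in non-decreasing index order to avoid repeats.
        for idx in range(start, len(groups)):
            g = groups[idx]
            if sum(g) <= remaining:
                for rest in build(remaining - sum(g), idx):
                    yield (g,) + rest

    return list(build(n, 0))


def count_with_zero_marks(n):
    """Classes refined by which group (if any) carries the zero marks."""
    return sum(1 + len(set(cls)) for cls in classes(n))


n = 3
print(len(list(partitions(n))), len(classes(n)), count_with_zero_marks(n))
```

For $n = 3$ this prints the three counts 3, 6, and 14 found above.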
EXERCISES
1. Prove that there exist pairs of real bilinear forms in $2n$ variables of which
the second is nonsingular, and which have the elementary divisors
$$(\lambda-\lambda_1)^{e_1},\ (\lambda-\lambda_2)^{e_2},\ \dots,\ (\lambda-\lambda_k)^{e_k} \qquad (e_1 + e_2 + \dots + e_k = n),$$
provided that such of these elementary divisors as are not real admit of
arrangement in conjugate imaginary pairs. (Cf. Exercises 1, 2, § 93.)
2. Classify pairs of real bilinear forms in six variables (the second form
in each pair being nonsingular), distinguishing between real and imaginary
elementary divisors.
100. Classification of Collineations. The classification of pairs
of bilinear forms which we gave in the last section may obviously
be regarded, from a more general point of view, as a classification
of pairs of matrices, the second matrix of each pair being assumed
to be nonsingular. From this point of view it admits of application
to the classification of collineations, since, as we saw in § 98, to
every collineation corresponds a pair of matrices of which one is
nonsingular, namely the unit matrix $\mathbf{I}$ and the matrix of the linear
transformation. Moreover, the normal form (3) of § 99 is precisely
adapted to the treatment of the more special kind of equivalence
which we have to consider here, since the matrix of the form $\psi$ is
precisely the unit matrix. We may therefore state at once the
fundamental theorem:
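Theorem 1. If $\lambda_1, \lambda_2, \dots, \lambda_k$ are any constants, equal or unequal,
and $e_1, e_2, \dots, e_k$ any positive integers whose sum is $n$, there exists a
collineation in space of $n - 1$ dimensions whose characteristic matrix has
the elementary divisors
$$(\lambda-\lambda_1)^{e_1},\ (\lambda-\lambda_2)^{e_2},\ \dots,\ (\lambda-\lambda_k)^{e_k}.$$
To this we may add
Theorem 2. Every collineation of the kind mentioned in Theorem
1 is equivalent to the collineation whose matrix is
$$\begin{pmatrix}
\mathbf{M}_1 & & & \\
 & \mathbf{M}_2 & & \\
 & & \ddots & \\
 & & & \mathbf{M}_k
\end{pmatrix},$$
where $\mathbf{M}_i$ stands for the matrix of order $e_i$,
$$\mathbf{M}_i = \begin{pmatrix}
\lambda_i & 1 & 0 & \cdots & 0\\
0 & \lambda_i & 1 & \cdots & 0\\
\vdots & & \ddots & \ddots & \vdots\\
0 & 0 & \cdots & \lambda_i & 1\\
0 & 0 & \cdots & 0 & \lambda_i
\end{pmatrix}.$$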
We thus get a classification of collineations into categories and
a subdivision of these categories into classes precisely as in § 99.
For instance, in the case $n = 3$ (collineations in the plane), we have
three categories whose characteristics and representative normal
forms we give:
I. [1 1 1]
$$x_1' = \lambda_1x_1, \qquad x_2' = \lambda_2x_2, \qquad x_3' = \lambda_3x_3.$$
II. [2 1]
$$x_1' = \lambda_1x_1 + x_2, \qquad x_2' = \lambda_1x_2, \qquad x_3' = \lambda_2x_3.$$
III. [3]
$$x_1' = \lambda_1x_1 + x_2, \qquad x_2' = \lambda_1x_2 + x_3, \qquad x_3' = \lambda_1x_3.$$
These categories we should then subdivide either into six classes
or into fourteen classes, as in § 99. This latter
classification is the desirable one in this case. We proceed to give a
list of these fourteen classes with a characteristic property of each.
That the normal forms of the collineations have these properties
will be at once evident, and from this it follows that all the
collineations of the class have the property in question, since the
properties mentioned are obviously all projective. That the properties
mentioned are really characteristic properties, that is, serve to
distinguish one class from another, can only be seen a posteriori, by
noticing that no one of the properties mentioned is shared by two classes.
$[1\ 1\ 1]$      Three distinct noncollinear fixed points.*
$[(1\ 1)\ 1]$    Every point of a certain line and one point not on
                 this line are fixed.
$[(1\ 1\ 1)]$    The identical collineation.
$[2\ 1]$         Two distinct fixed points.
$[(2\ 1)]$       Every point of a certain line is fixed.
$[3]$            One fixed point.
* It should be understood here and in what follows that the fixed points which
are mentioned are the only fixed points of the collineation in question.
In all these cases the collineation is nonsingular. The remaining
collineations are singular. In the next three, one point $P$ of the
plane is not transformed at all, while all other points go over on to
a line $p$ which does not pass through $P$, and every one of whose
points corresponds to an infinite number of points.
$[1\ 1\ \overset{0}{1}]$      There are two fixed points on $p$.
$[(1\ 1)\ \overset{0}{1}]$    Every point on $p$ is fixed.
$[2\ \overset{0}{1}]$         One fixed point on $p$.
In the next two cases one point $P$ is not transformed at all,
while all other points go over on to a line $p$ which passes through $P$,
and every one of whose points corresponds to an infinite number of
points.
$[\overset{0}{2}\ 1]$         One fixed point.
$[\overset{0}{3}]$            No fixed point.
The remaining collineations are so simple that they are not merely
characterized, but completely described, by the property we mention.
$[(\overset{0}{1}\ \overset{0}{1})\ 1]$    The points on a certain line are not transformed. All
                 other points go over into a single point which does
                 not lie on this line.
$[(\overset{0}{2}\ \overset{0}{1})]$       The points on a certain line are not transformed. All
                 other points go over into a single point on this line.
$[(\overset{0}{1}\ \overset{0}{1}\ \overset{0}{1})]$    No point in the plane is transformed.
This last case is of course not a transformation at all.
EXERCISES
1. Classify, in a similar manner, the projective transformations in one
dimension.
2. Classify the collineations in space of three dimensions.
3. Classify the real projective transformations in space of one, two, and three
dimensions. (Cf. Exercises 1, 2, § 99.)
CHAPTER XXII
THE EQUIVALENCE AND CLASSIFICATION OF PAIRS OF
QUADRATIC FORMS
101. Two Theorems in the Theory of Matrices. In order to justify
the applications we wish to make of the theory of elementary
divisors to the subject of quadratic forms, it will be necessary for us
to turn back for a moment to the general theory of matrices.
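Definition. If $\phi(x)$ is a polynomial:
$$\phi(x) \equiv a_0x^m + a_1x^{m-1} + \dots + a_{m-1}x + a_m,$$
then
$$a_0\mathbf{x}^m + a_1\mathbf{x}^{m-1} + \dots + a_{m-1}\mathbf{x} + a_m\mathbf{I}$$
is called a polynomial in the matrix $\mathbf{x}$ and is denoted by $\phi(\mathbf{x})$.*
We come now to one of the most fundamental theorems in the
whole theory of matrices:
Theorem 1. If $\mathbf{a}$ is a matrix, and $\phi(\lambda)$ its characteristic
function, then
$$\phi(\mathbf{a}) = 0.$$
This equation is called the Hamilton-Cayley equation.
Let $\mathbf{c}$ be the characteristic matrix of $\mathbf{a}$:
$$\mathbf{c} = \mathbf{a} - \lambda\mathbf{I}.$$
This being a $\lambda$-matrix of the first degree, its adjoint $\mathbf{C}$ will be a
$\lambda$-matrix of degree not higher than $n - 1$, if $n$ is the order of the
matrix $\mathbf{a}$:
$$\mathbf{C} = \mathbf{C}_{n-1}\lambda^{n-1} + \mathbf{C}_{n-2}\lambda^{n-2} + \dots + \mathbf{C}_0. \tag{1}$$
We may also write
$$\phi(\lambda) = k_n\lambda^n + k_{n-1}\lambda^{n-1} + \dots + k_0. \tag{2}$$
Now referring to formula (5), § 25, we see that
$$\mathbf{a}\mathbf{C} - \lambda\mathbf{C} \equiv \phi(\lambda)\mathbf{I}.$$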
* It should be noticed that, according to this definition, the coefficients of a
polynomial in $\mathbf{x}$ are scalars. Contrast this with a $\lambda$-matrix, in which the coefficients are
matrices and the variable a scalar. Both of these conceptions would be included in
expressions of the form:
$$\mathbf{a}_0\mathbf{x}^m\mathbf{b}_0 + \mathbf{a}_1\mathbf{x}^{m-1}\mathbf{b}_1 + \dots + \mathbf{a}_{m-1}\mathbf{x}\mathbf{b}_{m-1} + \mathbf{a}_m.$$
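Substituting here from (1) and (2), we have, on equating corresponding
powers of $\lambda$,
$$\begin{aligned}
\mathbf{a}\mathbf{C}_0 &= k_0\mathbf{I},\\
\mathbf{a}\mathbf{C}_1 - \mathbf{C}_0 &= k_1\mathbf{I},\\
\mathbf{a}\mathbf{C}_2 - \mathbf{C}_1 &= k_2\mathbf{I},\\
&\ \ \vdots\\
\mathbf{a}\mathbf{C}_{n-1} - \mathbf{C}_{n-2} &= k_{n-1}\mathbf{I},\\
-\mathbf{C}_{n-1} &= k_n\mathbf{I}.
\end{aligned}$$
If we multiply these equations in succession by $\mathbf{I}, \mathbf{a}, \mathbf{a}^2, \dots, \mathbf{a}^n$,
and add, the first members cancel out, and we get
$$k_0\mathbf{I} + k_1\mathbf{a} + k_2\mathbf{a}^2 + \dots + k_n\mathbf{a}^n = 0.$$
This is precisely the equation
$$\phi(\mathbf{a}) = 0$$
which we wished to establish.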
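The Hamilton-Cayley equation can be verified numerically on a small
example. The sketch below is an illustration, not the method of the
text: it obtains the coefficients of $\det(\lambda\mathbf{I} -
\mathbf{a})$ by the Faddeev-LeVerrier recurrence (this polynomial
differs from the characteristic function $\det(\mathbf{a} -
\lambda\mathbf{I})$ only by the factor $(-1)^n$, which does not affect
the conclusion) and then evaluates the polynomial at the matrix by
Horner's rule. The helper names and the sample matrix are assumptions.

```python
from fractions import Fraction


def mat_mul(p, q):
    """Product of two square matrices given as lists of rows."""
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def char_poly(a):
    """Coefficients [c_0, ..., c_n] of det(lambda*I - a), via Faddeev-LeVerrier."""
    n = len(a)
    coeffs = [Fraction(0)] * (n + 1)
    coeffs[n] = Fraction(1)
    m = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        am = mat_mul(a, m)
        # M_k = a * M_{k-1} + c_{n-k+1} * I
        m = [[am[i][j] + (coeffs[n - k + 1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        am = mat_mul(a, m)
        trace = sum(am[i][i] for i in range(n))
        coeffs[n - k] = -trace / k
    return coeffs


def poly_at_matrix(coeffs, a):
    """Evaluate c_0*I + c_1*a + ... + c_n*a^n by Horner's rule."""
    n = len(a)
    result = [[Fraction(0)] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = mat_mul(result, a)
        result = [[result[i][j] + (c if i == j else 0) for j in range(n)]
                  for i in range(n)]
    return result


a = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]
coeffs = char_poly(a)
residue = poly_at_matrix(coeffs, a)
print(coeffs)
print(residue)   # every entry should be zero
```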
As a means of deducing our second theorem, we next establish a
lemma which relates merely to scalar quantities.
Lemma. If $\psi(x)$ is a polynomial of the $n$th degree ($n > 0$) whose
constant term is not zero, there exists a polynomial $\chi(x)$ of degree less
than $n$ such that
$$(\chi(x))^2 - x$$
is divisible by $\psi(x)$.
Let $x - a$, $x - b$, $x - c$, $\dots$ be the distinct linear factors of $\psi(x)$,
so that we may write
$$\psi(x) \equiv k(x-a)^{\alpha}(x-b)^{\beta}(x-c)^{\gamma}\cdots \qquad (\alpha + \beta + \gamma + \dots = n).$$
None of the constants $a, b, c, \dots$ are zero, since, by hypothesis, the
constant term of $\psi$ is not zero. Let us, further, denote by $\psi_1(x)$ the
polynomial obtained from $\psi$ by omitting the factor $(x-a)^{\alpha}$, by
$\psi_2(x)$ the polynomial obtained from $\psi$ by omitting the factor $(x-b)^{\beta}$,
etc., and finally let us form, with undetermined coefficients, the
polynomials
$$\begin{aligned}
A(x) &\equiv A_0 + A_1(x-a) + A_2(x-a)^2 + \dots + A_{\alpha-1}(x-a)^{\alpha-1},\\
B(x) &\equiv B_0 + B_1(x-b) + B_2(x-b)^2 + \dots + B_{\beta-1}(x-b)^{\beta-1},\\
C(x) &\equiv C_0 + C_1(x-c) + C_2(x-c)^2 + \dots + C_{\gamma-1}(x-c)^{\gamma-1},\\
&\ \ \vdots
\end{aligned}$$
From these polynomials we now form the polynomial
$$\chi(x) \equiv A(x)\psi_1(x) + B(x)\psi_2(x) + C(x)\psi_3(x) + \dots$$
whose degree can obviously not exceed $n - 1$. We wish to show
that the coefficients $A_i, B_i, \dots$ can be so determined that this
polynomial $\chi(x)$ satisfies the conditions of our lemma.
Since $\psi_2, \psi_3, \dots$ are all divisible by $(x-a)^{\alpha}$, a necessary and
sufficient condition that $(\chi(x))^2 - x$ be divisible by this factor is that
the polynomial
$$\phi(x) \equiv (A(x))^2(\psi_1(x))^2 - x$$
be divisible by $(x-a)^{\alpha}$. We have
$$\phi(a) = A_0^2k^2(a-b)^{2\beta}(a-c)^{2\gamma}\cdots - a.$$
In order that $\phi(x)$ be divisible by $x - a$ it is therefore necessary and
sufficient that
$$A_0^2 = \frac{a}{k^2(a-b)^{2\beta}(a-c)^{2\gamma}\cdots}.$$
Neither numerator nor denominator here being zero, we thus
obtain two distinct values for $A_0$, both different from zero. If we
give to $A_0$ one of these values, $\phi(x)$ is divisible by $x - a$. A necessary
and sufficient condition that it be also divisible by $(x-a)^2$ is
that $\phi'(a) = 0$, accents here, and in what follows, denoting
differentiation. We shall see in a moment that this condition can be imposed
in one, and only one, way by a suitable choice of $A_1$. The condition
that $\phi(x)$ be divisible by $(x-a)^3$ is then simply $\phi''(a) = 0$. We
wish to show that this process can be continued until we have finally
imposed the condition that $\phi(x)$ be divisible by $(x-a)^{\alpha}$. For this
purpose we use the method of mathematical induction, and assume
that $A_0, \dots, A_{s-1}$ have been so determined that
$\phi(a) = \phi'(a) = \dots = \phi^{(s-1)}(a) = 0$. It remains then merely to show
that $A_s$ can be so determined that $\phi^{(s)}(a) = 0$. For this purpose we
notice that
$$\phi^{(s)}(x) \equiv 2A^{(s)}(x)A(x)(\psi_1(x))^2 + R_s(x), \tag{4}$$
where $R_s(x)$ is an integral rational function with numerical coefficients
of $\psi_1, \psi_1', \dots, \psi_1^{(s)}, A, A', \dots, A^{(s-1)}$. Since
$$A(a) = A_0, \quad A'(a) = A_1, \quad A''(a) = 2!\,A_2, \quad \dots, \quad A^{(s-1)}(a) = (s-1)!\,A_{s-1},$$
it follows that $R_s(a)$ is a known constant, that is, that it does not
depend on any of the still undetermined constants $A_s, A_{s+1}, \dots, A_{\alpha-1}$,
nor on the $B$'s, $C$'s, etc. Consequently we see from (4) that a necessary
and sufficient condition that $\phi^{(s)}(a) = 0$ is that $A_s$ have the value
$$A_s = -\frac{R_s(a)}{2\cdot s!\,A_0(\psi_1(a))^2}. \tag{5}$$
Determining the coefficients $A_1, A_2, \dots, A_{\alpha-1}$ in succession by means
of this formula, we finally determine the polynomial $A(x)$ in such a
way that $\phi(x)$ is divisible by $(x-a)^{\alpha}$. For this determination,
$(\chi(x))^2 - x$ will, as we saw above, be divisible by $(x-a)^{\alpha}$.
In the same way we can now determine the coefficients of $B(x)$ so
that $(\chi(x))^2 - x$ is divisible by $(x-b)^{\beta}$; then we determine the
coefficients of $C(x)$ so that $(\chi(x))^2 - x$ is divisible by $(x-c)^{\gamma}$; etc. When
all the polynomials $A, B, C, \dots$ are thus determined, $(\chi(x))^2 - x$ is
divisible by $\psi(x)$, and our lemma is proved.
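As a small worked instance of this construction (an illustration added
here, not part of the original text), take $\psi(x) = (x-1)^2$, so that
there is the single linear factor $x - a$ with $a = 1$, $\alpha = 2$,
$k = 1$ and $\psi_1(x) = 1$. The two conditions $\phi(a) = 0$ and
$\phi'(a) = 0$ then determine $A_0$ and $A_1$ in turn:

```latex
\begin{aligned}
\chi(x) &= A(x)\,\psi_1(x) = A_0 + A_1(x-1), \\
\phi(x) &= \bigl(A_0 + A_1(x-1)\bigr)^2 - x, \\
\phi(1) &= A_0^2 - 1 = 0 \quad\Longrightarrow\quad A_0 = \pm 1, \\
\phi'(1) &= 2A_0A_1 - 1 = 0 \quad\Longrightarrow\quad A_1 = \frac{1}{2A_0}, \\
\text{taking } A_0 = 1:\qquad \chi(x) &= \tfrac{1}{2}(x+1), \qquad
(\chi(x))^2 - x = \tfrac{1}{4}(x-1)^2 .
\end{aligned}
```

The other choice, $A_0 = -1$, gives $\chi(x) = -\tfrac12(x+1)$, in
agreement with the remark that two distinct values of $A_0$ are
available.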
Theorem 2. If $\mathbf{a}$ is a nonsingular matrix of order $n$, there exist
matrices $\mathbf{b}$ of order $n$ (necessarily nonsingular) with the following properties:
$$\mathbf{b}^2 = \mathbf{a},$$
$\mathbf{b}$ is a polynomial in $\mathbf{a}$ of degree less than $n$.
Since $\mathbf{a}$ is nonsingular, its characteristic function $\phi(\lambda)$ is a
polynomial of the $n$th degree whose constant term is not zero. Hence, by
the preceding lemma, a polynomial $\chi(\lambda)$ of degree less than $n$ can be
determined such that
$$(\chi(\lambda))^2 - \lambda \equiv \phi(\lambda)f(\lambda),$$
where $f(\lambda)$ is also a polynomial. From this identity it follows that
$$(\chi(\mathbf{a}))^2 - \mathbf{a} = \phi(\mathbf{a})f(\mathbf{a}).$$
Since, by Theorem 1, $\phi(\mathbf{a}) = 0$, the last equation may be written
$$(\chi(\mathbf{a}))^2 = \mathbf{a},$$
so that $\mathbf{b} = \chi(\mathbf{a})$ is a matrix satisfying the conditions of our theorem,
which is thus proved.
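Two small cases make the theorem concrete. The sketch below is an
illustration with hand-chosen data: for the matrix with rows (1, 1),
(0, 1) the polynomial $\chi(\lambda) = \tfrac12(\lambda + 1)$ satisfies
$(\chi(\lambda))^2 - \lambda = \tfrac14(\lambda - 1)^2$, and for the
diagonal matrix with entries 4, 9 the polynomial
$\chi(\lambda) = \tfrac15(\lambda + 6)$ takes the values 2 and 3 at the
two roots. The helper names are hypothetical; the code only checks that
$\chi(\mathbf{a})$ squares to $\mathbf{a}$.

```python
from fractions import Fraction


def mat_mul(p, q):
    """Product of two square matrices given as lists of rows."""
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def poly_at_matrix(coeffs, a):
    """Evaluate c_0*I + c_1*a + ... (coefficients in ascending order) by Horner's rule."""
    n = len(a)
    result = [[Fraction(0)] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = mat_mul(result, a)
        result = [[result[i][j] + (c if i == j else 0) for j in range(n)]
                  for i in range(n)]
    return result


# chi given by ascending coefficients: chi(x) = 1/2 + x/2 and chi(x) = 6/5 + x/5.
a1, chi1 = [[1, 1], [0, 1]], [Fraction(1, 2), Fraction(1, 2)]
a2, chi2 = [[4, 0], [0, 9]], [Fraction(6, 5), Fraction(1, 5)]

b1 = poly_at_matrix(chi1, a1)
b2 = poly_at_matrix(chi2, a2)
print(b1, mat_mul(b1, b1) == a1)
print(b2, mat_mul(b2, b2) == a2)
```

In both cases the square root is a polynomial of degree one in
$\mathbf{a}$, that is, of degree less than the order $n = 2$.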
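102. Symmetric Matrices. The application of the theory of elementary
divisors to the subject of quadratic forms rests on the following
proposition:
Theorem 1. If $\mathbf{a}_1$ and $\mathbf{a}_2$ are symmetric matrices and if there
exist two nonsingular matrices $\mathbf{p}$ and $\mathbf{q}$ such that
$$\mathbf{a}_2 = \mathbf{p}\mathbf{a}_1\mathbf{q}, \tag{1}$$
then there also exists a nonsingular matrix $\mathbf{P}$ such that
$$\mathbf{a}_2 = \mathbf{P}'\mathbf{a}_1\mathbf{P}, \tag{2}$$
where $\mathbf{P}'$ is the conjugate of $\mathbf{P}$.*
Let us denote by $\mathbf{p}'$ and $\mathbf{q}'$ the conjugates of $\mathbf{p}$ and $\mathbf{q}$ respectively.
Taking the conjugates of both sides of (1), and remembering that
$\mathbf{a}_1$ and $\mathbf{a}_2$, being symmetric, are their own conjugates, we get, by
Theorem 6, § 22,
$$\mathbf{a}_2 = \mathbf{q}'\mathbf{a}_1\mathbf{p}'. \tag{3}$$
By equating the values of $\mathbf{a}_2$ in (1) and (3), we readily deduce the
further relation
$$(\mathbf{q}')^{-1}\mathbf{p}\mathbf{a}_1 = \mathbf{a}_1\mathbf{p}'\mathbf{q}^{-1}. \tag{4}$$
For brevity we will let
$$\mathbf{U} = (\mathbf{q}')^{-1}\mathbf{p}, \qquad \mathbf{U}' = \mathbf{p}'\mathbf{q}^{-1}, \tag{5}$$
and note that $\mathbf{U}'$ is the conjugate of $\mathbf{U}$; cf. Exercise 6, § 25.
Equation (4) may then be written
$$\mathbf{U}\mathbf{a}_1 = \mathbf{a}_1\mathbf{U}'. \tag{6}$$
From this equation we infer at once the following further ones:
$$\begin{aligned}
\mathbf{U}^2\mathbf{a}_1 &= \mathbf{U}\mathbf{a}_1\mathbf{U}' = \mathbf{a}_1\mathbf{U}'^2,\\
\mathbf{U}^3\mathbf{a}_1 &= \mathbf{U}\mathbf{a}_1\mathbf{U}'^2 = \mathbf{a}_1\mathbf{U}'^3,\\
&\ \ \vdots\\
\mathbf{U}^k\mathbf{a}_1 &= \mathbf{U}\mathbf{a}_1\mathbf{U}'^{k-1} = \mathbf{a}_1\mathbf{U}'^k.
\end{aligned}\tag{7}$$
Let us now multiply the equations (6) and (7) and also the equation
$\mathbf{a}_1 = \mathbf{a}_1$ by any set of scalar constants and add them together.
We see in this way that if $\chi(\mathbf{U})$ is any polynomial in $\mathbf{U}$,
$$\chi(\mathbf{U})\mathbf{a}_1 = \mathbf{a}_1\chi(\mathbf{U}'). \tag{8}$$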
* A proof of this theorem much simpler than that given in the text is the following:
From (1) we infer at once that $\mathbf{a}_1$ and $\mathbf{a}_2$ have the same rank. Hence the
quadratic forms of which $\mathbf{a}_1$ and $\mathbf{a}_2$ are the matrices are equivalent to each other by Theorem
4, § 46. If we denote by $\mathbf{P}$ the matrix of the linear transformation which carries over
the quadratic form $\mathbf{a}_1$ into the form $\mathbf{a}_2$, we see, from Theorem 1, § 43, that equation (2)
holds.
This proof would not enable us to infer that $\mathbf{P}$ can be expressed in terms of $\mathbf{p}$ and $\mathbf{q}$
alone, and this is essential for our purposes.
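We will choose the polynomial
$$\mathbf{V} = \chi(\mathbf{U})$$
so that $\mathbf{V}$ is nonsingular and
$$\mathbf{V}^2 = \mathbf{U},$$
as is seen to be possible by Theorem 2, § 101. Denoting by $\mathbf{V}'$ the
conjugate of $\mathbf{V}$, we evidently have
$$\mathbf{V}' = \chi(\mathbf{U}'),$$
so that we may write (8) in the form
$$\mathbf{V}\mathbf{a}_1 = \mathbf{a}_1\mathbf{V}',$$
or
$$\mathbf{a}_1 = \mathbf{V}^{-1}\mathbf{a}_1\mathbf{V}'.$$
We now substitute this value in (1) and get
$$\mathbf{a}_2 = \mathbf{p}\mathbf{V}^{-1}\mathbf{a}_1\mathbf{V}'\mathbf{q}. \tag{9}$$
From the first equation (5) we infer the formula
$$\mathbf{p}\mathbf{V}^{-1} = \mathbf{q}'\mathbf{V}.$$
Consequently $\mathbf{p}\mathbf{V}^{-1}$ is the conjugate of $\mathbf{V}'\mathbf{q}$, so that if we let
$$\mathbf{P} = \mathbf{V}'\mathbf{q},$$
equation (9) may be written
$$\mathbf{a}_2 = \mathbf{P}'\mathbf{a}_1\mathbf{P},$$
and our theorem is proved.
The proof just given enables us to add the
Corollary. As the matrix $\mathbf{P}$ of the foregoing theorem may be
taken the matrix $\mathbf{V}'\mathbf{q}$ where $\mathbf{V}'$ is the conjugate of any one of the square
roots, determined by Theorem 2, § 101, of $(\mathbf{q}')^{-1}\mathbf{p}$.
In particular it will be seen that $\mathbf{P}$ depends on $\mathbf{p}$ and $\mathbf{q}$ but not on
$\mathbf{a}_1$ or $\mathbf{a}_2$. Hence if $\mathbf{a}_1, \mathbf{a}_2, \mathbf{b}_1, \mathbf{b}_2$ are symmetric matrices, and there
exist two nonsingular matrices $\mathbf{p}$ and $\mathbf{q}$ such that
$$\mathbf{a}_2 = \mathbf{p}\mathbf{a}_1\mathbf{q}, \qquad \mathbf{b}_2 = \mathbf{p}\mathbf{b}_1\mathbf{q},$$
then there exists a nonsingular matrix $\mathbf{P}$ such that
$$\mathbf{a}_2 = \mathbf{P}'\mathbf{a}_1\mathbf{P}, \qquad \mathbf{b}_2 = \mathbf{P}'\mathbf{b}_1\mathbf{P}.$$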
From this and Theorem 2, § 96, we infer
Theorem 2. If $\mathbf{a}_1, \mathbf{a}_2, \mathbf{b}_1, \mathbf{b}_2$ are symmetric matrices of which $\mathbf{b}_1, \mathbf{b}_2$
are nonsingular, a necessary and sufficient condition that a nonsingular
matrix $\mathbf{P}$ exist such that
$$\mathbf{a}_2 = \mathbf{P}'\mathbf{a}_1\mathbf{P}, \qquad \mathbf{b}_2 = \mathbf{P}'\mathbf{b}_1\mathbf{P}, \tag{10}$$
where $\mathbf{P}'$ is the conjugate of $\mathbf{P}$, is that the matrices
$$\mathbf{a}_1 - \lambda\mathbf{b}_1, \qquad \mathbf{a}_2 - \lambda\mathbf{b}_2$$
have the same invariant factors, or, if we prefer, the same elementary
divisors.
If, in particular, $\mathbf{b}_1 = \mathbf{b}_2 = \mathbf{I}$, where $\mathbf{I}$ is the unit matrix, we have,
from the second equation (10), the formula
$$\mathbf{I} = \mathbf{P}'\mathbf{P}.$$
Such a matrix $\mathbf{P}$ we call an orthogonal matrix according to the definition,
which will readily be seen to be equivalent to the one given in
the first footnote on page 154:
Definition. By an orthogonal matrix we understand a nonsingular
matrix whose inverse is equal to its conjugate.
In the special case just referred to, Theorem 2 may be stated in the
following form:
Theorem 3. If $\mathbf{a}_1$ and $\mathbf{a}_2$ are two symmetric matrices, a necessary
and sufficient condition that an orthogonal matrix $\mathbf{P}$ exist such that
$$\mathbf{a}_2 = \mathbf{P}'\mathbf{a}_1\mathbf{P}$$
is that the characteristic matrices of $\mathbf{a}_1$ and $\mathbf{a}_2$ have the same invariant
factors, or, if we prefer, the same elementary divisors.
If this theorem is compared with Theorem 3, § 96, it will be seen
that it differs from it only in two respects, first that $\mathbf{a}_1$ and $\mathbf{a}_2$ are
assumed to be symmetric, and secondly that $\mathbf{P}$ is required to be
orthogonal.
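A short numerical check of the definition and of the easy half of
Theorem 3 can be sketched as follows. The matrices are arbitrary
illustrative choices and the helper names are hypothetical; "conjugate"
in the text corresponds to the transpose here. For an orthogonal
$\mathbf{P}$ the matrix $\mathbf{P}'\mathbf{a}_1\mathbf{P}$ is again
symmetric and has the same trace and determinant as $\mathbf{a}_1$,
which for matrices of order two means the same characteristic function.
This is only a necessary consequence of the theorem, not a test of
equal elementary divisors in general.

```python
from fractions import Fraction


def transpose(m):
    """Conjugate (transpose) of a matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]


def mat_mul(p, q):
    """Product of two matrices given as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]


def trace2(m):
    return m[0][0] + m[1][1]


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


# A rational orthogonal matrix (rotation through the 3-4-5 angle).
p = [[Fraction(3, 5), Fraction(4, 5)], [Fraction(-4, 5), Fraction(3, 5)]]
a1 = [[2, 1], [1, 3]]                            # symmetric
a2 = mat_mul(mat_mul(transpose(p), a1), p)       # a2 = P' a1 P

print(mat_mul(transpose(p), p))                  # the unit matrix
print(a2 == transpose(a2), trace2(a2), det2(a2))
```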
103. The Equivalence of Pairs of Quadratic Forms. Let us consider
the two pairs of quadratic forms
$$\phi_1 \equiv \sum_{1}^{n} a'_{ij}x_ix_j, \qquad \psi_1 \equiv \sum_{1}^{n} b'_{ij}x_ix_j,$$
and
$$\phi_2 \equiv \sum_{1}^{n} a''_{ij}x_ix_j, \qquad \psi_2 \equiv \sum_{1}^{n} b''_{ij}x_ix_j,$$
of which the two forms $\psi_1$ and $\psi_2$ are assumed to be nonsingular.
We will inquire under what conditions these two pairs of forms are
equivalent; that is, under what conditions a linear transformation
$$\mathbf{c}:\quad x_i = c_{i1}x_1' + \dots + c_{in}x_n' \qquad (i = 1, 2, \dots, n)$$
exists which carries over $\phi_1$ into $\phi_2$ and, at the same time, $\psi_1$ into $\psi_2$.
If we denote the conjugate of the matrix $\mathbf{c}$ by $\mathbf{c}'$, and the
matrices of the forms $\phi_1, \psi_1, \phi_2, \psi_2$ by $\mathbf{a}_1, \mathbf{b}_1, \mathbf{a}_2, \mathbf{b}_2$ respectively, we
know, by Theorem 1, § 43, that the transformation $\mathbf{c}$ carries over $\phi_1$
and $\psi_1$ into forms with the matrices
$$\mathbf{c}'\mathbf{a}_1\mathbf{c}, \qquad \mathbf{c}'\mathbf{b}_1\mathbf{c}$$
respectively; so that, if these are the forms $\phi_2$ and $\psi_2$, we have
$$\mathbf{a}_2 = \mathbf{c}'\mathbf{a}_1\mathbf{c}, \qquad \mathbf{b}_2 = \mathbf{c}'\mathbf{b}_1\mathbf{c}. \tag{1}$$
Consequently, by Theorem 2, § 102, the two $\lambda$-matrices
$$\mathbf{a}_1 - \lambda\mathbf{b}_1, \qquad \mathbf{a}_2 - \lambda\mathbf{b}_2$$
have the same invariant factors and elementary divisors.
Conversely, by the same theorem, if these two $\lambda$-matrices have
the same invariant factors (or elementary divisors), a matrix $\mathbf{c}$,
independent of $\lambda$, exists which satisfies both equations (1); and hence
the two pairs of quadratic forms are equivalent. Thus we have
proved
Theorem 1. If $\phi_1, \psi_1$ and $\phi_2, \psi_2$ are two pairs of quadratic
forms in $n$ variables, in which $\psi_1$ and $\psi_2$ are nonsingular, a necessary
and sufficient condition that these two pairs of forms be equivalent is
that the matrices of the two pencils
$$\phi_1 - \lambda\psi_1, \qquad \phi_2 - \lambda\psi_2$$
have the same invariant factors, or, if we prefer, the same elementary
divisors.*
A special case of this theorem which is of considerable importance
is that in which both of the forms $\psi_1$ and $\psi_2$ reduce to
$$x_1^2 + x_2^2 + \dots + x_n^2.$$
* For brevity, we shall speak of these invariant factors and elementary divisors as
the invariant factors and elementary divisors of the pairs of forms $\phi_1, \psi_1$ and $\phi_2, \psi_2$
respectively.
In this case we have to deal with orthogonal transformations (cf. the
Definition in Exercise 1, § 52), and our theorem may be stated in
the form*
Theorem 2. If $\mathbf{a}_1$ and $\mathbf{a}_2$ are the matrices of two quadratic forms,
a necessary and sufficient condition that there exist an orthogonal
transformation which carries over one of these forms into the other is that the
characteristic matrices of $\mathbf{a}_1$ and $\mathbf{a}_2$ have the same invariant factors,
or, if we prefer, the same elementary divisors.
To illustrate the meaning of the theorems of this section, let us
consider again briefly the problem of the simultaneous reduction of
two quadratic forms to sums of squares. In Chapter XIII we became
acquainted with two cases in which this reduction is possible;
cf. Theorem 2, § 58, and Theorem 2, § 59. We are in a position now
to state a necessary and sufficient condition for the possibility of
this reduction, provided that one of the two forms is nonsingular.
For this purpose, consider the two quadratic forms
$$\phi \equiv k_1x_1^2 + k_2x_2^2 + \dots + k_nx_n^2,$$
$$\psi \equiv c_1x_1^2 + c_2x_2^2 + \dots + c_nx_n^2,$$
where we assume, in order that the second form may be nonsingular,
that none of the $c$'s vanish. The matrix of the pencil $\phi - \lambda\psi$ is
$$\begin{pmatrix}
k_1 - c_1\lambda & & & \\
 & k_2 - c_2\lambda & & \\
 & & \ddots & \\
 & & & k_n - c_n\lambda
\end{pmatrix},$$
and the elementary divisors of this matrix are
$$\lambda - \frac{k_1}{c_1},\quad \lambda - \frac{k_2}{c_2},\quad \dots,\quad \lambda - \frac{k_n}{c_n},$$
all of the first degree. Consequently, any pair of quadratic forms
equivalent to the pair just considered must have a $\lambda$-matrix whose
elementary divisors are all of the first degree.
* This theorem is, of course, essentially equivalent to Theorem 3, § 102, of which it
may be regarded as an immediate consequence.
Conversely, if we have a pair of quadratic forms, of which the
second is nonsingular, whose $\lambda$-matrix has elementary divisors all of
the first degree, we can obviously choose the constants $k$ and $c$
in such a way that the $\lambda$-matrix of the forms $\phi$ and $\psi$ just
considered has these same elementary divisors, and therefore the given
forms are equivalent to these special forms $\phi$ and $\psi$. Thus we have
proved the theorem:
Theorem 3. If $\phi$ and $\psi$ are quadratic forms and $\psi$ is nonsingular,
a necessary and sufficient condition that it be possible to reduce $\phi$
and $\psi$ simultaneously by a nonsingular linear transformation to forms
into which only the square terms enter is that all the elementary divisors
of the pair of forms be of the first degree.
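A minimal example in which the condition fails may be helpful. It is an
illustration added here under the definitions used in this chapter, not
an example from the original text. Take $n = 2$, $\phi \equiv x_1^2$ and
$\psi \equiv 2x_1x_2$, so that $\psi$ is nonsingular. The greatest
common divisors $D_1, D_2$ of the one-rowed and two-rowed determinants
of the pencil give the invariant factors $E_1 = D_1$, $E_2 = D_2/D_1$:

```latex
\begin{aligned}
\phi - \lambda\psi &:\quad
\begin{pmatrix} 1 & -\lambda \\ -\lambda & 0 \end{pmatrix}, \\
D_1 &= \gcd(1,\ -\lambda,\ -\lambda,\ 0) = 1, \\
D_2 &= \det\begin{pmatrix} 1 & -\lambda \\ -\lambda & 0 \end{pmatrix}
     = -\lambda^2 \ \sim\ \lambda^2, \\
E_1 &= 1, \qquad E_2 = \lambda^2 .
\end{aligned}
```

The single elementary divisor $\lambda^2$ is of the second degree, so by
Theorem 3 this pair cannot be reduced simultaneously to forms containing
only square terms.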
This theorem obviously includes as a special case Theorem 2 of
§ 68, since the elementary divisors are necessarily of the first degree
when the Xequation has no multiple roots.
Comparing the theorem just proved with Theorem 2, § 59, we see
that under the conditions of that theorem the elementary divisors
must be of the first degree. Hence
Theorem 4. If $\psi$ is a nonsingular, definite, quadratic form, and
$\phi$ is a real quadratic form, all the elementary divisors of this pair of
forms are necessarily of the first degree.
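As a minimal worked illustration of Theorem 3 in the binary case, the following sketch may help. The two pairs of forms and the constant $a$ are chosen here for the example and are not taken from the text. The first pair already consists of sums of squares and has elementary divisors of the first degree. The second has a single elementary divisor of the second degree, so that, by Theorem 3, no nonsingular linear transformation reduces it simultaneously to sums of squares.

```latex
% Pair 1: both forms are sums of squares; the matrix of the pencil is diagonal.
\[
\phi = x_1^2 + 2x_2^2, \qquad \psi = x_1^2 + x_2^2, \qquad
\phi - \lambda\psi :\ \begin{pmatrix} 1-\lambda & 0 \\ 0 & 2-\lambda \end{pmatrix},
\qquad \text{elementary divisors } \lambda - 1,\ \lambda - 2 .
\]
% Pair 2: psi is nonsingular, but the pencil has one elementary divisor of degree 2.
\[
\phi = 2a\,x_1x_2 + x_1^2, \qquad \psi = 2x_1x_2, \qquad
\phi - \lambda\psi :\ \begin{pmatrix} 1 & a-\lambda \\ a-\lambda & 0 \end{pmatrix}.
\]
% The determinant is -(\lambda - a)^2, while the one-rowed determinants
% include the constant 1, so their greatest common divisor is 1.
\[
\text{elementary divisor } (\lambda - a)^2 .
\]
```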
104. Classification of Pairs of Quadratic Forms. We consider
the pair of quadratic forms
$$\phi = \sum_{i,j=1}^{n} a_{ij}x_ix_j, \qquad \psi = \sum_{i,j=1}^{n} b_{ij}x_ix_j, \tag{1}$$
and assume, as before, that $\psi$ is nonsingular. We denote the
elementary divisors of these forms, as in § 99, by
$$(\lambda-\lambda_1)^{e_1},\ (\lambda-\lambda_2)^{e_2},\ \cdots,\ (\lambda-\lambda_k)^{e_k} \qquad (e_1 + e_2 + \cdots + e_k = n).$$
The symbol $[e_1\ e_2\ \cdots\ e_k]$ we call the characteristic of the pair of
quadratic forms; and all pairs of quadratic forms which have the
same characteristic we speak of as forming a category.*
We have here, precisely as in the case of bilinear forms, the
theorem:
Theorem. If $\lambda_1, \lambda_2, \cdots \lambda_k$ are any constants, equal or unequal,
and $e_1, e_2, \cdots e_k$ are any positive integers whose sum is $n$, there exist pairs
of quadratic forms in $n$ variables, the second form in each pair being
nonsingular, which have the elementary divisors
$$(\lambda-\lambda_1)^{e_1},\ (\lambda-\lambda_2)^{e_2},\ \cdots,\ (\lambda-\lambda_k)^{e_k}. \tag{2}$$
* Thus, for instance, all pairs of forms of which the second is nonsingular and which
admit of simultaneous reduction to sums of squares, form a category whose
characteristic is [1 1 ... 1]. Cf. Theorem 3, § 103.
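The proof of this theorem consists in considering the following
pair of quadratic forms, analogous to the normal form (3') of § 99:
$$
\begin{aligned}
\phi &= \sum_{j=1}^{e_1} c_1\lambda_1 x_j x_{e_1-j+1} + \sum_{j=1}^{e_1-1} d_1 x_j x_{e_1-j} + \cdots \\
&\quad + \sum_{j=n-e_k+1}^{n} c_k\lambda_k x_j x_{2n-e_k-j+1} + \sum_{j=n-e_k+1}^{n-1} d_k x_j x_{2n-e_k-j}, \\
\psi &= \sum_{j=1}^{e_1} c_1 x_j x_{e_1-j+1} + \sum_{j=e_1+1}^{e_1+e_2} c_2 x_j x_{2e_1+e_2-j+1} + \cdots + \sum_{j=n-e_k+1}^{n} c_k x_j x_{2n-e_k-j+1},
\end{aligned}
\tag{3}
$$
where $c_1, \cdots c_k$, $d_1, \cdots d_k$ are constants which may be chosen at
pleasure, provided none of them are zero.
The $\lambda$-matrix of this pair of forms is the same as the $\lambda$-matrix of
the pair of bilinear forms (3') of § 99, and therefore has the desired
elementary divisors.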
A reference to Theorem 1, § 103, shows that formula (3) yields a
normal form to which every pair of quadratic forms, of which the
second is nonsingular and whose elementary divisors are given by
(2), can be reduced.
The categories, of which we have so far spoken, may be divided
into classes by the same methods we used in § 99 in the case of
bilinear forms. This may be done, as before, either by simply noting
which of the $\lambda_i$'s are equal to each other, or by further distinguishing
between the cases where some of the $\lambda_i$'s are zero.
We are now in a position to see exactly in what way our elementary
divisors give us a more powerful instrument than we had in the
invariants $\Theta_i$ of § 57. These invariants $\Theta_i$, being the coefficients of
the $\lambda$-equation of our pair of forms, determine the constants $\lambda_i$, which
are the roots of this equation, as well as the multiplicities of these
roots. They do not determine the degrees $e_i$ of the elementary
divisors, and the use of the $\Theta_i$'s alone does not, in all cases, enable us
to determine whether two pairs of forms are equivalent or not.
Thus, for instance, we may have two pairs of forms with exactly the
same invariants $\Theta_i$, but with characteristics [(1 1) 1 1 ... 1] and
[2 1 1 ... 1] respectively.* It will be seen, therefore, that the $\Theta_i$'s
form in only a very technical sense a complete system of invariants.
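The point of the preceding paragraph can be checked numerically in the binary case, where the two characteristics are [(1 1)] and [2]. The sketch below is only an illustration under stated assumptions: the two pairs of forms are chosen here and do not come from the text, and the helper names (`pencil_matrix`, `rank`, `det2`) are hypothetical. It also relies on a standard fact not stated in this section, namely that the number of elementary divisors belonging to a root $\lambda_0$ equals $n$ minus the rank of the matrix of $\phi - \lambda_0\psi$.

```python
from fractions import Fraction


def pencil_matrix(phi, psi, lam):
    """Matrix of phi - lam*psi, given the symmetric matrices of phi and psi."""
    return [[Fraction(a) - lam * Fraction(b) for a, b in zip(row_phi, row_psi)]
            for row_phi, row_psi in zip(phi, psi)]


def rank(matrix):
    """Rank of a matrix of rationals, by Gaussian elimination."""
    rows = [list(row) for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def det2(m):
    """Determinant of a two-rowed matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


# Pair 1 (illustrative): phi = x1^2 - x2^2, psi = x1^2 - x2^2.
PHI_1, PSI_1 = [[1, 0], [0, -1]], [[1, 0], [0, -1]]
# Pair 2 (illustrative): phi = 2*x1*x2 + x1^2, psi = 2*x1*x2.
PHI_2, PSI_2 = [[1, 1], [1, 0]], [[0, 1], [1, 0]]

# Two quadratic polynomials in lambda that agree at three points are identical,
# so this confirms that both pairs have the same lambda-equation.
same_lambda_equation = all(
    det2(pencil_matrix(PHI_1, PSI_1, t)) == det2(pencil_matrix(PHI_2, PSI_2, t))
    for t in range(3)
)

# Number of elementary divisors belonging to the double root lambda = 1.
nullity_1 = 2 - rank(pencil_matrix(PHI_1, PSI_1, 1))
nullity_2 = 2 - rank(pencil_matrix(PHI_2, PSI_2, 1))

print(same_lambda_equation, nullity_1, nullity_2)
```

Both pairs have the $\lambda$-equation $-(\lambda-1)^2 = 0$, yet the first has two elementary divisors $\lambda - 1$, $\lambda - 1$ and the second has the single elementary divisor $(\lambda-1)^2$, so the coefficients of the $\lambda$-equation alone cannot separate them.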
EXERCISES
1. Form a numerical example in the case n = 3 to illustrate the statement
made in the next to the last sentence of this section.
2. Prove that if two equivalent pairs of quadratic forms have two elementary
divisors of the first degree which correspond to the same linear factor, there exist
an infinite number of linear transformations which carry over one pair of forms
into the other.
3. Prove the general theorem, of which Exercise 2 is a special case, namely,
that if two equivalent pairs of quadratic forms have a characteristic in which one
or more parentheses appear, there exist an infinite number of linear
transformations which carry over one pair of forms into the other.
4. Prove that if two equivalent pairs of quadratic forms have a characteristic
in which no parentheses appear, only a finite number of linear transformations
exist which carry over one pair of forms into the other.†
How are these transformations related to each other?
105. Pairs of Quadratic Equations, and Pencils of Forms or
Equations.‡ In dealing with quadratic forms, the questions of
equivalence and classification do not always present themselves to us in
precisely the form in which we have considered them in the last two
sections. We frequently have to deal not with the quadratic forms
themselves but with the equations obtained by setting the forms
equal to zero. Two such pairs of equations we shall regard as
equivalent, not merely if the forms in them are equivalent, but also
if one pair of forms can be obtained from the other by multiplication
by constants different from zero.
Let us consider two quadratic forms $\phi$, $\psi$, of which we assume,
as before, that the second is nonsingular, and inquire what the
effect on the elementary divisors
$$(\lambda-\lambda_1)^{e_1},\ (\lambda-\lambda_2)^{e_2},\ \cdots,\ (\lambda-\lambda_k)^{e_k} \tag{1}$$
of these forms will be if the forms are multiplied respectively by the
constants $p$, $q$, which are both assumed to be different from zero.
Let us write
$$\phi_1 = p\phi, \qquad \psi_1 = q\psi.$$
Then
$$\phi_1 - \lambda\psi_1 = p(\phi - \lambda'\psi) \tag{2}$$
where
$$\lambda' = \frac{q}{p}\lambda.$$
* We may, in the case n = 3, put the same thing geometrically (cf. the next
section) by saying that it is impossible to distinguish between the case of two conics having
double contact and that of two conics having simple contact at a single point by the use
of the invariants $\Theta_i$ alone, whereas these two cases are at once distinguished by the use
of elementary divisors.
† The exercise in § 58 is practically a special case of this.
‡ Questions similar to those treated in this section might have been taken up in
the last chapter for the case of bilinear forms.
Let $\lambda - a$ be any one of the linear factors of the matrix of $\phi - \lambda\psi$,
so that $a$ is any one of the constants $\lambda_1, \lambda_2, \cdots \lambda_k$; and let us denote,
as in the footnote to Definition 3, § 92, by $l_i$ the exponent of the
highest power of $\lambda - a$ which is a factor of all the $i$-rowed
determinants of this matrix. Then it is clear, from (2), that $l_i$ is the
exponent of the highest power of $\lambda' - a$ which is a factor of all the
$i$-rowed determinants of the matrix of $\phi_1 - \lambda\psi_1$. In other words,
$$\left(\lambda - \frac{pa}{q}\right)^{l_i}$$
is the highest power of the linear factor $\lambda - pa/q$ which is a factor
of all the $i$-rowed determinants of the matrix of $\phi_1 - \lambda\psi_1$. Turning
now to the definition of elementary divisors as given in the footnote
to Definition 3, § 92, we see that the elementary divisors of the matrix
of $\phi_1 - \lambda\psi_1$ differ from those of the matrix of $\phi - \lambda\psi$ only in having
the constants $\lambda_i$ replaced by the constants $p\lambda_i/q$. We thus have the
result:
Theorem 1. If the pair of quadratic forms $\phi$, $\psi$, of which the
second is assumed to be nonsingular, has the elementary divisors
$$(\lambda-\lambda_1)^{e_1},\ (\lambda-\lambda_2)^{e_2},\ \cdots,\ (\lambda-\lambda_k)^{e_k},$$
and if $p$, $q$ are constants different from zero, then the pair of quadratic
forms $p\phi$, $q\psi$ has the elementary divisors
$$(\lambda-\lambda_1')^{e_1},\ (\lambda-\lambda_2')^{e_2},\ \cdots,\ (\lambda-\lambda_k')^{e_k},$$
where
$$\lambda_i' = \frac{p}{q}\lambda_i.$$
In particular, it will be seen that these two pairs of forms have
the same characteristic, even when the conception of the characteristic
is refined not merely by inserting parentheses but also by the
use of the small zeros.
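A short numerical check of Theorem 1 may be useful here. The forms and the constants $p = 3$, $q = 2$ below are assumptions made for the illustration only.

```latex
% Starting pair, with elementary divisors \lambda - 1 and \lambda - 2:
\[
\phi = x_1^2 + 2x_2^2, \qquad \psi = x_1^2 + x_2^2 .
\]
% Multiply by p = 3 and q = 2 respectively:
\[
3\phi - \lambda\,(2\psi) = (3 - 2\lambda)\,x_1^2 + (6 - 2\lambda)\,x_2^2 ,
\]
% so the elementary divisors of the pair 3\phi, 2\psi are
\[
\lambda - \tfrac{3}{2}, \qquad \lambda - 3 ,
\]
% in agreement with \lambda_i' = p\lambda_i/q:
\[
\lambda_1' = \tfrac{3}{2}\cdot 1 = \tfrac{3}{2}, \qquad \lambda_2' = \tfrac{3}{2}\cdot 2 = 3 .
\]
```

Since $p$ and $q$ are both different from zero, $\lambda_i'$ vanishes exactly when $\lambda_i$ does, which is why the small zeros in the characteristic are unaffected.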
The theorem just proved shows that pairs of homogeneous
quadratic equations, of which the second equation in each pair is
nonsingular, may be classified by the use of their characteristics
precisely as was done in the last section for pairs of quadratic forms.
We proceed to illustrate this in the case $n = 3$, where we may consider
that we have to deal with the classification of pairs of conics
in a plane, one of the conics being nonsingular.
We have here three categories represented by the following
normal forms:*
I. [1 1 1] $\phi = \lambda_1x_1^2 + \lambda_2x_2^2 - \lambda_3x_3^2$, $\quad\psi = x_1^2 + x_2^2 - x_3^2$.
II. [2 1] $\phi = 2\lambda_1x_1x_2 + x_1^2 + \lambda_2x_3^2$, $\quad\psi = 2x_1x_2 + x_3^2$.
III. [3] $\phi = 2\lambda_1x_1x_3 + \lambda_1x_2^2 + 2x_1x_2$, $\quad\psi = 2x_1x_3 + x_2^2$.
We next subdivide these categories into classes, and, by an
examination of the normal form in each case, we are enabled at once to
characterize each class by certain projective properties which it has,
and which are shared by no other class.† Since the conic $\psi$ is
nonsingular in all cases, this fact need not be explicitly stated.
* We assign to the constants $c_i$ and $d_i$, in formula (3) of the last section, values so
chosen that the loci $\phi = 0$, $\psi = 0$ are real when the constants $\lambda_i$ are real. This is, of
course, not essential, since we are not concerned with questions of reality.
† In order to verify the statements made below, the reader should have some
knowledge of the theory of the contact of conics; cf. for instance Salmon's Conic
Sections, Chapter XIV., pages 232-238.
[1 1 1] $\phi$ and $\psi$ intersect in four distinct points.
[(1 1) 1] $\phi$ and $\psi$ have double contact.
[(1 1 1)] $\phi$ and $\psi$ coincide.
[2 1] $\phi$ and $\psi$ meet in three distinct points at one of which
they touch.
[(2 1)] $\phi$ and $\psi$ have contact of the third order.
[3] $\phi$ and $\psi$ have contact of the second order.
In all of the above cases $\phi$, as well as $\psi$, is nonsingular.
In the next five cases, $\phi$ consists of a pair of distinct straight
lines.
$[1\ 1\ \overset{0}{1}]$ $\phi$ and $\psi$ intersect in four distinct points.
$[(1\ 1)\ \overset{0}{1}]$ Both of the lines of which $\phi$ consists touch $\psi$.
$[2\ \overset{0}{1}]$ One of the lines of which $\phi$ consists touches $\psi$ while
the other cuts it in two points distinct from the point
of contact of the first.
$[\overset{0}{2}\ 1]$ The two lines of which $\phi$ consists intersect on $\psi$, and
neither of them touches $\psi$.
$[\overset{0}{3}]$ The two lines of which $\phi$ consists intersect on $\psi$, and
one of them touches $\psi$.
In the next two cases, $\phi$ consists of a single line.
$[(\overset{0}{1}\ \overset{0}{1})\ 1]$ The line $\phi$ meets $\psi$ in two distinct points.
$[(\overset{0}{2}\ \overset{0}{1})]$ The line $\phi$ touches $\psi$.
Finally we have the case:
$[(\overset{0}{1}\ \overset{0}{1}\ \overset{0}{1})]$ Here $\phi \equiv 0$, and we have no conic other than $\psi$.
Suppose finally that we wish to classify not pairs of quadratic forms
or equations but pencils of quadratic forms or equations. Consider the
pencil of quadratic forms
$$\phi - \lambda\psi,$$
where $\phi$ and $\psi$ are quadratic forms, and $\psi$ is nonsingular, and
suppose that the elementary divisors of the pair of forms $\phi$, $\psi$ are
given by formula (1) above. The question presents itself whether,
if, in place of the forms $\phi$, $\psi$, we take any other two forms of the
pencil,
$$\phi_1 = \phi - \mu\psi, \qquad \psi_1 = \phi - \nu\psi,$$
the constants $\mu$, $\nu$ being so chosen that $\mu \neq \nu$ and that $\psi_1$ is
nonsingular, the pair of forms $\phi_1$, $\psi_1$ will have these same elementary
divisors (1). If this were the case, we could properly speak of (1)
as the elementary divisors of the pencil. This, however, is not the
case, and the pencil of quadratic forms cannot properly be said to have
elementary divisors.*
* We here regard the pencil as merely an aggregate of an infinite number of
quadratic forms, namely, all the forms which can be obtained from the expression
$\phi - \lambda\psi$ by giving to $\lambda$ different values. In this sense we cannot speak of the elementary
divisors of the pencil. If, however, we wish to regard the polynomial in the $x$'s and $\lambda$,
$\phi - \lambda\psi$, as the pencil, we may speak of its elementary divisors, meaning thereby simply
what we have called the elementary divisors of the pair of forms $\phi$, $\psi$.
There is, however, a simple relation between the elementary
divisors of two pairs of forms taken from the same pencil. In order
to show this, let us determine the elementary divisors of the pair of
forms $\phi_1$, $\psi_1$ above. For this purpose consider the expression
$\phi_1 - \lambda\psi_1$ which, when $\lambda \neq 1$, may be written
$$\phi_1 - \lambda\psi_1 = (1-\lambda)[\phi - \lambda'\psi] \tag{3}$$
where
$$\lambda' = \frac{\mu - \nu\lambda}{1-\lambda}.$$
Now suppose, as above, that $\lambda - a$ is any one of the linear factors of
the matrix of $\phi - \lambda\psi$, and that $l_i$ is the exponent of the highest
power of $\lambda - a$ which is a factor of all the $i$-rowed determinants of
this matrix. Then any one of the $i$-rowed determinants of the matrix
of $\phi - \lambda'\psi$ may, when $\lambda \neq 1$, be written in the form
$$(\lambda' - a)^{l_i} f(\lambda')$$
where $f$ is a polynomial in $\lambda'$ of degree not greater than $i - l_i$.
Accordingly, by (3), the corresponding $i$-rowed determinant of the
matrix of $\phi_1 - \lambda\psi_1$ may be written
$$[\mu - \nu\lambda - a(1-\lambda)]^{l_i} f_1(\lambda)$$
where $f_1$ is a polynomial in $\lambda$. Thus we see that
$$\left[\lambda - \frac{a-\mu}{a-\nu}\right]^{l_i}$$
is a factor of every $i$-rowed determinant of the matrix of $\phi_1 - \lambda\psi_1$.
Similar reasoning, carried through in the reverse order, shows that
this is the highest power of
$$\lambda - \frac{a-\mu}{a-\nu}$$
which is a factor of all these $i$-rowed determinants. Hence
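Theorem 2. If the pair of quadratic forms $\phi$, $\psi$, of which the
second is nonsingular, have the elementary divisors
$$(\lambda-\lambda_1)^{e_1},\ (\lambda-\lambda_2)^{e_2},\ \cdots,\ (\lambda-\lambda_k)^{e_k},$$
and if $\mu$, $\nu$ are any two constants distinct from each other and such
that $\nu$ is distinct from all the constants $\lambda_1, \lambda_2, \cdots \lambda_k$, then the two forms
$$\phi_1 = \phi - \mu\psi, \qquad \psi_1 = \phi - \nu\psi,$$
of which the second will then be nonsingular, will have the elementary
divisors
$$(\lambda-\lambda_1')^{e_1},\ (\lambda-\lambda_2')^{e_2},\ \cdots,\ (\lambda-\lambda_k')^{e_k},$$
where
$$\lambda_i' = \frac{\lambda_i - \mu}{\lambda_i - \nu} \qquad (i = 1, 2, \cdots k).$$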
In particular, it will be seen that the two pairs of forms $\phi$, $\psi$ and
$\phi_1$, $\psi_1$ have the same characteristic $[e_1\ e_2\ \cdots\ e_k]$ even if we put in
parentheses to indicate which of the $e$'s correspond to equal $\lambda_i$'s.
The characteristics will not, however, necessarily be the same if we
put in small zeros to indicate which of the $e$'s correspond to vanishing
$\lambda_i$'s, since $\lambda_i$ and $\lambda_i'$ do not usually vanish together.
Accordingly, in classifying pencils of quadratic forms, we may use the
characteristic of any pair of distinct forms of the pencil, the second
of which is nonsingular, but we must not introduce the small zeros
into these characteristics. This classification, of course, applies only
to what may be called nonsingular pencils, that is, pencils whose
forms are not all singular.
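The following sketch checks Theorem 2 on a simple pair and shows why the small zeros cannot be carried over to the pencil. The forms and the values $\mu = 1$, $\nu = 3$ are assumptions chosen for the illustration.

```latex
% Starting pair, with elementary divisors \lambda - 1 and \lambda - 2 (neither root is zero):
\[
\phi = x_1^2 + 2x_2^2, \qquad \psi = x_1^2 + x_2^2 .
\]
% Two other forms of the same pencil, with \mu = 1 and \nu = 3:
\[
\phi_1 = \phi - \psi = x_2^2, \qquad \psi_1 = \phi - 3\psi = -2x_1^2 - x_2^2 ,
\]
\[
\phi_1 - \lambda\psi_1 = 2\lambda\, x_1^2 + (1 + \lambda)\, x_2^2 ,
\]
% so the elementary divisors of the pair \phi_1, \psi_1 are \lambda and \lambda + 1,
% in agreement with \lambda_i' = (\lambda_i - \mu)/(\lambda_i - \nu):
\[
\lambda_1' = \frac{1 - 1}{1 - 3} = 0, \qquad \lambda_2' = \frac{2 - 1}{2 - 3} = -1 .
\]
```

Both pairs have the characteristic [1 1], but $\lambda_1' = 0$ although $\lambda_1 = 1 \neq 0$, so a small zero would appear for the second pair and not for the first.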
It will readily be seen that what has just been said applies without
essential change to the case of pencils of homogeneous quadratic
equations. We may therefore illustrate it by the classification of
nonsingular pencils of conics.* We have here six classes of pencils
which we characterize as follows:
[1 1 1] The conics all pass through four distinct points.
[(1 1) 1] The conics all pass through two points at which they
have double contact with each other.
[(1 1 1)] The conics all coincide.
[2 1] The conics all pass through three points at one of
which they touch one another.
[(2 1)] The conics all pass through one point at which they
have contact of the third order.
[3] The conics all pass through two points, at one of
which they have contact of the second order.
EXERCISES
1. Determine, by the use of elementary divisors, the nature of each of the
following pairs of conics:
(a) $3x_1^2 + 7x_2^2 +$ [illegible] $8x_1x_2 - 10x_2x_3 + 4x_1x_3 = 0,$
    $2x_1^2 + 3x_2^2$ [illegible] $4x_1x_2 - 6x_2x_3 + 6x_1x_3 = 0.$
(b) $3x_1^2 - x_2^2 - 3x_3^2 - 3x_1x_2 + 3x_2x_3$ [illegible] $x_1x_3 = 0,$
    $2x_1^2 + x_2^2 - x_3^2 - 2x_1x_2 - 2x_2x_3 + 2x_1x_3 = 0.$
2. Give a classification of pairs of binary quadratic equations, the second
equation of each pair being nonsingular, and interpret the work geometrically.
* For a similar classification of pencils of quadrics we refer to p. 46 of Bromwich's
book: Quadratic Forms and their Classification by Means of Invariant Factors.
106. Conclusion. We wish, in this section, to point out some
of the important questions connected with the subject of elementary
divisors, which, in order to keep our treatment within proper limits,
we have been obliged to leave out of consideration.
If $\phi_1$, $\psi_1$ and $\phi_2$, $\psi_2$ are two pairs of bilinear or quadratic
forms of which $\psi_1$, $\psi_2$ are nonsingular, we have found a method
of determining whether these two pairs of forms are equivalent
or not. If we use the invariant factors instead of the elementary
divisors, our method involves only the use of the rational
operations (addition, subtraction, multiplication, and division),
and can, therefore, be actually carried through in any concrete
case. In fact we have explained in § 93 some really practical
methods of determining the invariant factors of a $\lambda$-matrix, so
that the problem of determining whether or not two pairs of
bilinear or quadratic forms, the second form in each pair being
nonsingular, are equivalent, may be regarded as solved, not
merely from the theoretical, but also from the practical point of
view.
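To make concrete the remark that only rational operations are needed, here is a minimal sketch that computes the invariant factors of the $\lambda$-matrix of a pair of forms with exact rational arithmetic. It is not the method of § 93: it assumes the description of the invariant factors as quotients $D_i/D_{i-1}$, where $D_i$ is the greatest common divisor of the $i$-rowed determinants, and it assumes that $\psi$ is nonsingular, so that no $D_i$ vanishes identically. All function names are hypothetical, and the example pair is normal form III of § 105 with $\lambda_1 = 2$.

```python
from fractions import Fraction
from itertools import combinations

# A polynomial in lambda is a list of Fractions, lowest degree first; [] is zero.


def trim(p):
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])


def scale(p, c):
    return trim([c * a for a in p])


def mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)


def divmod_poly(p, q):
    """Quotient and remainder of p divided by the nonzero polynomial q."""
    p, quotient = list(p), [Fraction(0)] * max(len(p) - len(q) + 1, 0)
    while len(p) >= len(q):
        shift, coeff = len(p) - len(q), p[-1] / q[-1]
        quotient[shift] = coeff
        p = trim([a - coeff * (q[i - shift] if 0 <= i - shift < len(q) else 0)
                  for i, a in enumerate(p)])
    return trim(quotient), p


def gcd(p, q):
    """Monic greatest common divisor by Euclid's algorithm."""
    while q:
        p, q = q, divmod_poly(p, q)[1]
    return scale(p, 1 / p[-1]) if p else []


def det(m):
    """Determinant of a square matrix of polynomials, by first-row expansion."""
    if len(m) == 1:
        return trim(m[0][0])
    total = []
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total = add(total, scale(mul(entry, det(minor)), (-1) ** j))
    return total


def pencil(phi, psi):
    """Lambda-matrix of phi - lambda*psi from the symmetric matrices of phi, psi."""
    return [[trim([Fraction(a), -Fraction(b)]) for a, b in zip(rp, rq)]
            for rp, rq in zip(phi, psi)]


def invariant_factors(m):
    n = len(m)
    divisors = [[Fraction(1)]]  # D_0 = 1
    for size in range(1, n + 1):
        g = []
        for rows in combinations(range(n), size):
            for cols in combinations(range(n), size):
                g = gcd(g, det([[m[r][c] for c in cols] for r in rows]))
        divisors.append(g)
    return [divmod_poly(divisors[i], divisors[i - 1])[0] for i in range(1, n + 1)]


# phi = 4*x1*x3 + 2*x2^2 + 2*x1*x2, psi = 2*x1*x3 + x2^2 (normal form III, lambda_1 = 2).
PHI = [[0, 1, 2], [1, 2, 0], [2, 0, 0]]
PSI = [[0, 0, 1], [0, 1, 0], [1, 0, 0]]
factors = invariant_factors(pencil(PHI, PSI))
print(factors)
```

For this pair the first two invariant factors are 1 and the third is $(\lambda - 2)^3 = \lambda^3 - 6\lambda^2 + 12\lambda - 8$, which corresponds to the characteristic [3]. Two pairs of forms with nonsingular second forms can then be compared by comparing the lists returned for each.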
There is, however, another question here, which we have not
treated, namely, if the two pairs of forms turn out to be equivalent,
to find a linear transformation which carries over one into
the other. This problem, too, we may consider that we have
solved from a theoretical point of view; for the proof we have
given that if two pairs of forms have the same elementary
divisors there exists a linear transformation which carries over
one pair of forms into the other, consisted, as will be seen on
examination, in actually giving a method whereby such a linear
transformation could be determined. In fact, in the case of bilinear
forms, the processes involved are, here again, merely the rational
processes; so that, given two equivalent pairs of bilinear forms, the
second form of each pair being nonsingular, we are in a position to
find, in any concrete case, linear transformations of the $x$'s and $y$'s
which carry over one pair of forms into the other. Even here the
arrangement of the work in a practical manner might require
further consideration.
In the case of quadratic forms the problem becomes a much more
difficult one, inasmuch as the processes involved in the determination
of the required linear transformation are no longer rational; cf. the
Lemma of § 101. That this is not merely a defect of the method we
have used, but is inherent in the problem itself, will be seen by a
consideration of simple numerical examples. Let, for instance,
$$\phi_1 = 2x_1^2 + 3x_2^2, \qquad \phi_2 = 2x_1^2 - 3x_2^2,$$
$$\psi_1 = x_1^2 + x_2^2, \qquad \psi_2 = x_1^2 - x_2^2.$$
Here the pairs of forms $\phi_1$, $\psi_1$ and $\phi_2$, $\psi_2$ both have the elementary
divisors
$$\lambda - 2, \qquad \lambda - 3,$$
and are therefore equivalent. The linear transformation which
carries over one pair of forms into the other cannot, however, be
real (and therefore its coefficients cannot be determined rationally
from the coefficients of the given forms) since $\phi_1$ and $\psi_1$ are definite,
$\phi_2$ and $\psi_2$ indefinite.
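For this example one transformation of the required kind can be written down directly. The particular substitution below is supplied here as an illustration and is not given in the text; it involves the imaginary unit, in keeping with the remark that no real transformation can exist.

```latex
% Nonsingular (but not real) linear transformation:
\[
x_1 = x_1', \qquad x_2 = i\,x_2' \qquad (i^2 = -1).
\]
% Its effect on the first pair of forms:
\[
\phi_1 = 2x_1^2 + 3x_2^2 \;\longmapsto\; 2x_1'^2 - 3x_2'^2 ,
\qquad
\psi_1 = x_1^2 + x_2^2 \;\longmapsto\; x_1'^2 - x_2'^2 ,
\]
% which are the forms \phi_2 and \psi_2 written in the variables x_1', x_2'.
```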
We have, therefore, here the problem of devising a practical
method of determining a linear transformation which carries over a
first pair of quadratic forms into a second given equivalent pair. A
method of this sort, which is a practical one when once the elementary
divisors have been determined, will be found in Bromwich's
book on quadratic forms referred to in the footnote on p. 312.
Another point at which our treatment is incomplete is in the
restriction we have always made in assuming that, in the pair of
bilinear or quadratic forms $\phi$, $\psi$, the form $\psi$ is nonsingular.
Although this is the case in many of the most important problems to
which one wishes to apply the method of elementary divisors, it is still
a restriction which it is desirable to remove. This may be done in
part by making use not, as we have done, of the pencil $\phi - \lambda\psi$, but
of the more general pencil $\mu\phi - \lambda\psi$, $\mu$ and $\lambda$ being variable
parameters. The determinants of the matrix of this pencil are binary
forms in $(\mu, \lambda)$, and the whole subject of elementary divisors admits
an easy extension to this case, the elementary divisors being now
integral powers of linear binary forms. The only case which cannot
be treated in this way is that in which not only $\phi$ and $\psi$ are
both singular, but every form of the pencil $\mu\phi - \lambda\psi$ is singular.
This singular case, which was explicitly excluded by Weierstrass in
his original paper, requires a special treatment which has been given
by Kronecker. Cf., for the case of quadratic forms, the book of
Bromwich already referred to.
Still another question is the application of the method of
elementary divisors to the case in which the two forms $\phi$, $\psi$ are real,
and only real linear transformations are admitted. In the case of
bilinear forms, this question presents no serious difficulty; cf. the
exercises of §§ 97, 99. In the case of quadratic forms, however, the
irrational processes involved in the proof of the Lemma of § 101
introduce an essential difficulty, since they are capable of introducing
imaginary quantities. Moreover, this difficulty does not lie
merely in the method of treatment. The theorems themselves
which we have established do not remain true, as is seen by a
reference to the numerical example given earlier in this section for
another purpose, where we have two pairs of real quadratic forms
which, although they have the same elementary divisors, are not
equivalent with regard to real linear transformations.
We must content ourselves with merely mentioning this important
subject, and referring, for one of the fundamental theorems, to
p. 69 of the book of Bromwich.
For further information concerning the subject of elementary
divisors the reader is referred to Muth's Theorie und Anwendung der
Elementartheiler, Leipzig, Teubner, 1899. In English, the book of
Bromwich already referred to and some sections in Mathews' revision
of Scott's Determinants will be found useful.
INDEX
(The numbers refer to the pages.)
Abelian group, 83.
Adjoint of a determinant, 30 ;
of a matrix, 77, 80 ;
of a quadratic form, 159.
Affine transformation, 70.
Algebraic complement, 23
Anharmonic ratio, 103.
Associative law for matrices, 64.
Augmented matrix, 44.
Axis coordinates, 113.
Bézout's method of elimination, 238.
Bilinear forms, 114-117 ;
determinant, matrix, rank of, 114;
equivalence of, 116, 283 ;
normal forms of, 116, 288, 289;
pairs of, 283, 287-292;
reducibility of, 116 ;
singular, 114.
Binary forms, 5;
biquadratic, 260;
cubic, 239;
discriminant of, 237;
factors of, 188 ;
invariants of, 235;
resultant of, 201, 236, 239;
symmetric functions, 255.
Biquadratic, binary, 260.
Boole, 260.
Bordered determinants, 28, 156-160.
Bromwich, 270, 312-315.
Cancellation, 7.
Category of pairs of bilinear forms, 287 ;
of collineations, 293;
of pairs of quadratic forms, 305.
Cayley, 63, 154, 260, 296.
Characteristic equation, function, matrix,
282.
Characteristic of a quadratic form, 149 ;
of a collineation, 293 ;
of a pair of bilinear forms, 287 ;
of a pair of quadratic forms, 305.
Class of objects, 81 ;
of quadratic forms, 148.
Cofactor, 23.
Cogredient variables, 90.
Collineation, 68, 284, 292.
Combinant, 115.
Commutative group, 83;
law for matrices, 63.
Complement of a minor, 23.
Complete system of invariants, 93.
Complex quantity, 8, 60.
Component of complex quantity, 61.
Composite elementary divisors, 270.
Concomitants, 109.
Cone, 120-123, 156.
Conjugate of a matrix, 21, 65, 80;
planes, 158 ;
points, 121.
Continuity, 14.
Contragredient variables, 108.
Contravariant, 109.
Coordinates, homogeneous, 11 ;
point, plane, line (axis, ray), 107-113.
Correlation, 117.
Corresponding polynomials, 178.
Covariant, absolute, 91 ;
integral rational, 99;
relative, 97.
Cramer's Rule, 43.
Cross-ratio, 103, 107.
Cubic, binary, 239.
Cyclic group, 87.
Definite quadratic form, 150.
Degree of a polynomial, 1, 4;
of a $\lambda$-matrix, 277 ;
of a product, 6, 277.
Descriptive property, 88, 232.
Determinant, 20;
adjoint of, 30;
bordered, 28, 156-160;
Laplace's development of, 26;
matrix of, 21 ;
minors of, 22 ;
of a bilinear form, 114 ;
of a matrix, 21 ;
of a transformation, 66 ;
orthogonal, 154 ;
product of two, 26 ;
rank of, 22 ;
skew-symmetric, 59 ;
symmetric, 56.
Discriminant of a binary biquadratic, 260;
of a binary cubic, 239;
of a binary form, 237, 259;
of a polynomial in one variable, 250;
of a quadratic form, 128;
of a quadric surface, 118.
Division of polynomials, 180 ;
of $\lambda$-matrices, 278 ;
of matrices, 75.
Divisor of zero, 65, 80.
Domain of rationality, 175, 212, 216
Dyad (dyadic polynomial), 79.
Dyalitic method of elimination, 199.
Element of a determinant or matrix, 20;
of a set, system, or group, 81.
Elementary divisors of a $\lambda$-matrix, 271 ;
of a collineation, 286;
of a pair of bilinear forms, 284;
of a pair of quadratic forms, 303 ;
simple, composite, 270.
Elementary symmetric function, 242, 253.
Elementary transformation of a matrix, 55;
of a $\lambda$-matrix, 262.
Elimination, 198, 217, 238.
Equations, linear, 43 ;
homogeneous, linear, 47 ;
quadratic, 149;
quadratic, pairs of, 307;
quadratic, pencils of, 312.
Equianharmonic points, 107.
Equivalence, 92
Equivalent matrices, 55, 93;
collineations, 286;
$\lambda$-matrices, 263, 274 ;
pairs of bilinear forms, 283;
pairs of matrices, 281 ;
pairs of quadratic forms, 170, 303;
quadratic forms, 135, 148.
Euclid's algorithm, 189, 192, 206.
Euler's theorem for homogeneous functions,
237.
Factors of a polynomial, 174, 187, 203; see
invariant factors.
Fixed points of a collineation, 285 ;
lines, planes of a collineation, 286.
Forms, 4; see bilinear, quadratic, binary
forms ;
biquadratic, 260;
cubic, 239;
polar, 127.
Fours group, 87.
Fractional matrices, 86.
Frobenius, 262, 270.
Fundamental system of solutions, 49;
theorem of algebra, 16.
Generator of a quadric surface, 119.
Gibbs, 79.
Greatest common divisor of integers, 188 ;
of polynomials in one variable, 191, 197,
of polynomials in two variables, 206.
Groundform, 96.
Group, 80;
Abelian or commutative, 83;
cyclic, 87 ;
fours group, 87 ;
isomorphic, 83 ;
sub, 83.
Group property, 82.
Hamilton, 79, 296.
Harmonic division, 104.
Homogeneity, principle of, 226.
Homogeneous coordinates, 11 ;
invariants, 230;
linear equations, 47, 49;
polynomials, 4.
Idemfactor, 74.
Identical vanishing (equality) of polynomials,
2, 5, 7, 10;
element of a group, 82 ;
transformation, 67.
Indefinite quadratic form, 150.
Index of inertia of quadratic form, 146.
Invariant, absolute algebraic, 89;
arithmetical, 91, 94, 115, 124, 129, 146, 287 ;
complete system of, 93 ;
geometric, 88, 103;
homogeneous, 230;
integral rational, 99, 101, 115, 129, 137,
159, 166, 218, 239, 260;
irrational, 167, 259, 260 ;
rational, 96, 222; see also integral
rational;
relative algebraic, 96, 115; see integral
rational.
Invariant factors, of a $\lambda$-matrix, 269;
of a collineation, 286;
of a pair of bilinear forms, 284 ;
of a pair of quadratic forms, 303.
Inverse of a transformation, 67 ;
of an element of a group, 82;
of a matrix, 75, 80;
of a quadratic form, 160.
Isobaric polynomial, 222;
symmetric function, 245, 255, 256.
Isomorphic groups, 83.
Jacobi, 144.
Kronecker, 4, 139, 262, 314.
$\lambda$-equation of two conics, 164 ;
of two quadratic forms, 166.
$\lambda$-matrix, 262.
Lagrange's reduction, 131.
Laplace's development, 26.
Law of Inertia, 144 ;
of Nullity, 78, 80.
Line at infinity, 13.
Line-coordinates, 108, 110.
Linear dependence, conditions for, 36-38 ;
of geometric configurations, 39;
of polynomials, 35, 38 ;
of sets of constants, 35, 48.
Linear equations, 43, 47, 49;
transformations, 66.
Linear factors of polynomials in one variable,
187 ;
of binary forms, 188;
of $\lambda$-matrices, 270.
Matrix, theory of, 20-22, 51-66, 74-80, 86, 93,
262-283, 296-302 ;
adjoint of, 77, 80 ;
as a complex quantity, 60 ;
augmented, 44;
conjugate, 21 ;
determinant of, 21 ;
division of one by another, 75 ;
elementary transformation of, 55;
equivalent, 55 ;
fractional, 86 ;
inverse of, 75 ;
multiplication by matrix, 63;
multiplication by scalar, 62 ;
normal form of a, 56 (Exercise 3) ;
normal form of a symmetrical, 59;
of a bilinear form, 114 ;
of a determinant, 21 ;
of a quadratic form, 128;
of a quadric surface, 118 ;
of a system of linear equations, 44;
of a transformation, 66;
orthogonal, 154, 302, 304 ;
powers of, 75 ;
product of two, 63 ;
rank of, 22 ;
rank of product of two, 77 ;
scalar, 76 ;
similar, 283;
singular, 65 ;
skew-symmetric, 59;
sum or difference of two, 62 ;
symmetric, 56 ;
transposed, 21;
unimodular, 83 ;
unit, 74.
Minors of a determinant, 22 ;
complementary, 22 ;
corresponding, 31;
principal, 23, 57-59.
Mixed concomitants, 109.
Moore, 86.
Multiplication theorem, 28.
Multiplicity of roots of an equation, 18;
of pieces of curves and surfaces, 211, 211
Neighborhood of a point, 8, 16, 214.
Newton's formulae, 244.
Normal form of a bilinear form, 116;
of a binary biquadratic, 261 ;
of a binary cubic, 239;
of a $\lambda$-matrix, 267 ;
of a matrix, 56 (Exercise 3) ;
of a pair of bilinear forms, 289 ;
of a pair of quadratic forms, 169, 171, 306;
of a quadratic form, 135 ;
of a quadric surface, 124;
of a real quadratic form, 148;
of a symmetrical matrix, 59.
Nullity, Sylvester's Law of, 78, 80.
Nullsystem, 117.
Order of a determinant or matrix, 20 ;
of a group, 87.
Orthogonal transformation, matrix, determinant,
154, 173, 302, 304.
Pencil of conics, 163, 312;
of bilinear forms, 279 ;
of quadratic forms, 165, 310.
Period of an element of a group, 87.
Plane at infinity, 13;
conjugate, 158.
Plane-coordinates, 107.
Point in space of n dimensions, 9;
at infinity, 12;
conjugate, 121;
equation of a, 107, 108 ;
neighborhood of a, 8, 16, 214.
Polar plane, 122;
form, 127;
tetrahedron, 125.
Pole, 124.
Polynomial, definition, degree of, etc., 1, 4;
continuity of a, 14;
corresponding, 178;
dyadic, 79 ;
in a matrix, 296 ;
isobaric, 222;
linear dependence of, 35 ;
real, 5;
roots of a, 18 ;
symmetric, 240, 252.
Prime (relatively) polynomials, 175.
Principal minors, 23, 57-59.
Product, degree of, 1, 4;
of determinants, 26;
of matrices, 63, 277.
Projective transformation, 69;
property, 88, 232.
Pseudo tangent lines and planes, 120.
Quadratic forms, 127 ;
adjoint of, 159 ;
definite and indefinite, 150;
invariants of, 129, 137, 146, 159, 166, 303;
inverse or reciprocal of, 160;
law of inertia of, 144;
matrix, discriminant, rank of, 128;
normal forms of, 135, 148, 169, 171, 306 ;
pairs of, 165, 302 ;
polar of, 127 ;
real, 144 ;
reducibility of, 136, 147 ;
reduction of, to sum of squares, 131, 139,
167, 170, 173 ;
regularly arranged, 147 ;
signature of, 146 ;
singular, 128;
vertex of, 129.
Quadric surface, matrix, discriminant, rank
of, 118;
classification of, 123, 149, 173 ;
ruling of, 119;
singular, 118;
tangent to, 119, 120, 155.
Quantic, 4;
Quaternary form, 5.
Rank of a matrix or determinant, 22,
54-59;
of a bilinear form, 114;
of a $\lambda$-matrix, 262 ;
of a quadratic form, 128 ;
of a quadric surface, 118 ;
of a system of homogeneous linear equations,
47 ;
of a system of points or linear forms,
94;
of the adjoint of a quadratic form, 161 ;
of the product of two matrices, 77.
Rational invariants, 96, 222; see also invariant,
integral rational ;
of a $\lambda$-matrix, 270.
Rational relation, 244.
Rationality, domain of, 175, 212, 216.
Ray coordinates, 113.
Real polynomials, 5, 174 ;
matrix, $\lambda$-matrix, elementary transformation,
278 ;
quadratic forms, 144-154, 161, 170-173.
Reciprocal or inverse of a quadratic form,
160.
Reciprocation, 117.
Reducibility of a polynomial, 174 ;
in a domain, 174, 175 ;
of bilinear forms, 116 ;
of binary forms, 188;
of determinants, 176;
of polynomials in one variable, 187;
of quadratic forms, 136, 147.
Regularly arranged quadratic form, 147.
Resultant of linear forms, 95;
of two binary forms, 201, 236, 239, 257 ;
of two polynomials in one variable, 195,
239, 248.
Roots of a polynomial or equation, 18.
Ruling of a quadric surface, 119.
S-functions, 241, 253.
$\Sigma$-functions, 240, 252.
Scalar, 62 ;
matrix, 76.
Self-conjugate tetrahedron, 125;
triangle, 164.
Semidefinite quadratic form, 150.
Set of objects, 80.
Sgn, 147.
Signature of a quadratic form, 146.
Similar matrices, 283.
Simple elementary divisors, 270.
Singular matrix, 65;
bilinear form, 114;
conic, 163, 272;
linear transformation, 67 ;
quadratic form, 128;
quadric surface, 118.
Skew-symmetric determinant, 59;
bilinear form, 117 ;
matrix, 59.
Smith, H. J. S., 262.
Subgroup, 83.
Subresultant, 197.
Sylvester, 78, 144, 199, 262.
Symbolic product of bilinear forms, 114.
Symmetric determinant and matrix, 56,
299;
bilinear form, 115 ;
binary function, 255;
polynomial, 240;
polynomial in pairs of variables, 252;
ternary function, 257.
System, 80.
Tangent lines and planes to quadric surface,
true and pseudo, 119, 120.
Ternary form, 5 ;
symmetric function, 257.
Transformation, affine, 70 ;
determinant and matrix of, 66;
elementary (of a matrix), 55, 262;
identical, 67;
inverse, 67 ;
linear, 66 ;
orthogonal, 154, 173, 304 ;
projective, 69 ;
singular, 67.
Transposed matrix, 21.
Unimodular matrix, 83.
Unit matrix, 74.
Vertex of a cone, 120, 122, 123, 156;
of a quadratic form, 129.
Weierstrass, 262, 270, 314.
Weight of an invariant, 96, 225;
of a covariant, 97, 226 ;
of a polynomial, 222 ;
of a symmetric polynomial, 245, 253.