Kapteyn, Jacobus Cornelius
Skew frequency curves in biology
and statistics
SKEW FREQUENCY CURVES
IN BIOLOGY AND STATISTICS
2ND PAPER
BY
DR. J. C. KAPTEYN,
PROFESSOR OF ASTRONOMY AT THE UNIVERSITY OF GRONINGEN
AND
DR. M. J. VAN UVEN,
PROFESSOR OF MATHEMATICS AT THE HIGH SCHOOL FOR AGRICULTURE,
HORTICULTURE AND FORESTRY AT WAGENINGEN.
PUBLISHED BY THE ASTRONOMICAL LABORATORY AT GRONINGEN.
HOITSEMA BROTHERS, GRONINGEN 1916.
INTRODUCTION
BY
J. C. KAPTEYN.
When, now more than 11 years ago, I published my paper on
"Skew frequency curves in Biology and Statistics" (a paper which further
on will be referred to simply as first paper), I felt that I would be "unable,
probably for many years to come, further to prosecute a subject, which
lies somewhat far from my usual studies" and I expressed "the hope that
some mathematician may take up the task of developing the theory in
a more general way".*)
The necessity of such a more general theory appeared very soon after
the publication. Work done at the botanical laboratory of Groningen and
elsewhere convinced me that the special form
$F(x) = (x + \kappa)^q$,
which is the only case completely worked out (p. 18 etc.), is really too
restricted for the requirements of practice. I even was not long in recognising
the fact that no special form whatever would be quite satisfactory and that
only a wholly general development promised to be extensively applicable.
Fortunately I recognised at the same time that not only no limitation
is necessary, but that the derivation of the complete course of F(x) is
hardly more laborious than the working out of any particular hypothesis.
Besides this there is another point that made a new treatment of the
question desirable. The equation (7) of the first paper is erroneous, because in it the squares
of the $\Delta z$, which were neglected, ought to have been retained.
As it did not appear that any other investigator had the intention of
taking up the matter, these considerations soon led me to make occasional
notes and to collect examples, in the hope that at some future time I might
find time myself for a rediscussion of the whole matter. This hope was
not fulfilled. Often for months I hardly found a single hour to devote
to the subject and after a while I felt compelled, very reluctantly, to lay
the subject definitively aside, either forever or at the least until the time
of my resignation as a professor.
*) See Preface to 1st paper.
This was the state of affairs when, last year, Prof. VAN UVEN cour-
teously offered his cooperation. I at once gratefully accepted this generous
offer.
In order not to endanger the completion of the work any more we
agreed to simplify matters by leaving out all mathematical developments
which, though they might offer some mathematical interest, would probably
be of little importance in the study of the frequency curves offered by
nature. On the same ground we resolved, later, when we experienced
great difficulties in collecting numerous pregnant examples, rather to limit
ourselves to what examples we had already brought together, than to delay
much further the time of publication. The scarcity of good examples in
literature must, I think, probably be attributed to the way in which the
study of frequency curves has been conducted up to the present and I have
my hopes that a somewhat extensive trial of the method presented in the
following pages will soon put ample materials at our disposal. For as
long as we start from the idea that all frequency curves must fall in a
very limited number of mathematical types, there will naturally, consciously
or unconsciously, be a strong bias in favour of cases which fit into these
types, while deviating cases are apt to be neglected or to be attributed to
exceptional causes. For the present form of our method this danger will
not exist. It will rather encourage the investigation of peculiar forms, which
present no greater difficulty than the more common forms.
In both the present and the first papers the main purpose is different
from that of other studies on the subject. While the latter only try to
find good interpolation formulae, the main purpose of our papers is to
learn something about the effects of the causes to which any particular
form of frequency curve is due.
Meanwhile the first paper is still in so far in conformity with the
writings of Pearson and others, that the attempt was made (be it as a
secondary aim) of bringing the observed frequency curve under a
mathematical form, that is of finding for it a mathematical interpolation
formula. This plan has been almost altogether given up in the present paper.
The substitution of a mathematical expression, with a moderate number
of constants, for the frequency curve, is necessarily equivalent to a
limitation, a limitation wholly unjustified by the nature of the problem.
In the application of the method of the present paper everything is
done graphically or numerically. From the graphical representation of
the frequency curve, we derive the graphical representation of that function
which is normally distributed. From this function we further derive
graphically the reaction-curve — if need be the growth-curve. This does
not prevent us from finding such quantities as the median and the quartiles.
Quite the contrary. The finding of these quantities — as indeed the whole
of the discussion — becomes of extreme simplicity.
Summarising we may say that the present method is distinguished
from other methods, including that of our own first paper, by its perfect
generality, without loss of simplicity; and from other methods, excluding that
of our first paper, by its aim to learn something about causes.
The present paper is indeed quite independent of the preceding paper.
Still it might be decidedly recommendable to read at least the first 15 pages
of the latter before entering on the reading of the former.
For the rest I think that most of the words written in the introduction of
the first paper still apply to the present one: "even the student of statistics
who wishes to apply the method, but finds himself unable to follow the
argument of these (mathematical) articles, need not be deterred". The
derivation of the theory is necessarily mathematical, its application is
absolutely elementary.
The main purpose of both papers — the finding something about
causes — is no doubt an ambitious one. Indeed it may be well to warn
expressly against too sanguine expectations. The mathematical theory
necessarily starts from certain assumptions. These assumptions are probably
not or not fully realised in nature. Therefore it is impossible to say a
priori in how far our theory will apply to the cases offered by nature.
The main ground for not being altogether sceptical lies in the fact that
a close approach to the normal curve has already been found to occur
frequently. Now our theory is only as it were an extension of the mathe-
matical theory which leads to the normal curve and this extension starts
from what is certainly in innumerable cases a "vera causa", viz. that the
"deviations" are dependent on the size already reached by the individual.
A reasoning like that of art. 9 of the first paper, shows this with perfect
evidence.
Still the fact remains that the conclusions to which the theory leads
must not be taken as well established facts but rather as "working hypotheses".
There is another quite different cause for not being over sanguine.
For evident reasons a theory for the benefit of biologists would be
best worked out by a biologist.
If he cannot do it, because he is but a poor mathematician, the next
best thing — still not approximately equally good — would be to have
the work entrusted to a biologist working in close cooperation with an
expert mathematician.
About the worst possible thing will be to put the task wholly on a
mathematician. Now up to a short time ago, the last case has been that
of myself, with the only exception, that I cannot even call myself a
regular mathematician. By the cooperation of Prof. VAN UVEN, this
exception at least has been removed, but still we are in the third case,
the very worst of all.
In urging this point on the biological investigator, who may happen
to give our method a trial, it is not our intention of invoking his clemency
in judging about this study. It is rather to invoke his cooperation. If
he finds some difficulty, or some point not sufficiently worked out, let
him not at once throw the method overboard. It may well be only the
consequence of our not being biologists, and of himself not being mathe-
matician enough to judge about the possibility of removing his difficulties.
In this way we might come at least a little nearer to the second case,
the case of the close cooperation of the biologist and the mathematician.
In order to make my meaning clearer, I may perhaps quote an example
of what happened in the case of the first paper.
In this paper the theory was fully worked out only for the special case
$F(x) = (x + \kappa)^q$.
This form was deemed sufficient, because it embraced all the curves
tried by myself. Some investigators, however, finding that this form did not
cover the facts with which they were dealing, concluded that the theory
had to be rejected.
Now this conclusion is unjustified. The fact only proved that a
somewhat more general treatment was necessary. As already mentioned,
it was one of the main motives for undertaking the present treatment.
To my regret I must say that I experienced very little of this sort of
cooperation after the publication of my first paper. Criticisms have not
been wanting. Quite the contrary. But they were mostly from mathema-
ticians who evidently had studied the matter somewhat carelessly. *) The
workers of the Groningen botanical laboratory only have assisted me very
materially. To them and particularly to Miss Dr. TAMMES and Prof. MOLL
I feel deeply indebted for help and encouragement both in writing the
paper and afterwards. May they extend their kind interest to the present
publication.
*) This carelessness must be my excuse for not replying to most of these criticisms.
In proof of it I might quote many instances. One of these may suffice for the present.
An objection made either in writing or in print by the greater part of my critics, is,
that in my theory only four ordinates of the given frequency curve are used in determining
the constants of the best fitting curve, whereas all the ordinates are equally entitled to
contribute (see for instance KOOPMAN'S Inaugural dissertation (Leiden) p. 188 as also his
5th thesis). Yet the most superficial reading of my paper must convince anyone that the
objection is completely unfounded.
Nothing now remains but to state the exact part that each of the
joint authors took in the work. As indicated already in the heading of
the first two chapters, the first treatment of the main problem is by myself;
the second quite independent derivation is by Prof. VAN UVEN. The
examples given in the subsequent chapters were mostly collected by myself.
Their treatment by graphical methods is entirely due to Prof. VAN UVEN.
GRONINGEN June 1915.
DEVELOPMENT OF THE THEORY
BY
J. C. KAPTEYN.
CHAPTER I.
1. The normal curve. Many investigations have been made about the
way in which the normal GAUSSIAN frequency-curves are produced. We
will simply summarize the results.
Imagine a numerous collection of N individuals who began by all
having the same value $x_0$ of x. This x may represent the length or the
weight or the distance from any determined origin etc. for any individual.
On these individuals there come to operate, successively or simultaneously,
a great number of causes $C_1, C_2, \ldots, C_n$, tending to change the
x of the different individuals in different ways. We will call these causes
causes of deviation.
The result is obtained that the distribution of the frequencies of the
several values of x, for considerable values of n, rapidly converges to a
limit, which limit is reached for $n = \infty$. It is this limiting form which
is usually applied to the cases of nature, that is, it is assumed that we
can, without appreciable error, put $n = \infty$. Presently we will have to
consider this supposition more closely.
Adopting it provisionally, we introduce the following notations:
$C_h$ deviation cause;
$A_{h,k}$ deviation caused by $C_h$ in the kth individual;
$\bar{A}_h$ the mean value of all the $A_{h,k}$ *); and let
$A_{h,k} = \bar{A}_h + a_{h,k}$;
as a consequence of the last supposition we have:
(1)
(2)
(3)
*) In what follows a dash over any quantity will denote the arithmetical mean of
the whole of these quantities.
The result of the investigation then is:
If the causes $C_h$ produce deviations which satisfy the following conditions:
a. that they are independent of each other;
b. that the $a_h$ are of the same order of smallness *)
then, after the operation of all the causes, the individuals will be spread
in the normal Gaussian curve
(4)  $y = \frac{1}{\varepsilon\sqrt{2\pi}}\, e^{-\frac{(x - x_0 - M)^2}{2\varepsilon^2}}$
where $x_0$ is the size of all the individuals before any deviation has taken
place. We will call it the undisturbed value of x. Furthermore
(5)  $M = \sum \bar{A}_h, \qquad \varepsilon^2 = \sum \overline{a_h^2}$.
In what follows we will call $\bar{A}_h$ the mean growth under the influence
of cause $C_h$; similarly $\overline{a_h^2}$ will be the corresponding mean fluctuationsquare.
M and $\varepsilon^2$ will be called total growth and total fluctuationsquare.
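As a rough numerical illustration of the result just summarised (not part of the original derivation), the following Python sketch lets a number of independent causes act on individuals that all start at the same undisturbed value. The uniform law for the fluctuations, the function name `simulate_final_sizes` and all numerical values are assumptions chosen only for the example. The sample mean should come out near $x_0 + M$, the sample variance near $\varepsilon^2$, and about 68 per cent of the individuals should lie within one $\varepsilon$ of the mean, as a normal curve requires.

```python
import random
import statistics


def simulate_final_sizes(x0, mean_growths, spreads, n_individuals, seed=0):
    """Let independent causes of deviation act on individuals starting at x0.

    Cause h changes each individual by mean_growths[h] + a, where the
    fluctuation a is drawn uniformly from [-spreads[h], +spreads[h]].
    The uniform law is an illustrative assumption, not taken from the text.
    """
    rng = random.Random(seed)
    sizes = []
    for _ in range(n_individuals):
        x = x0
        for growth, spread in zip(mean_growths, spreads):
            x += growth + rng.uniform(-spread, spread)
        sizes.append(x)
    return sizes


x0 = 10.0
n_causes = 50
mean_growths = [0.02] * n_causes   # the mean growths of the single causes
spreads = [0.3] * n_causes         # half-widths of the uniform fluctuations

sizes = simulate_final_sizes(x0, mean_growths, spreads, n_individuals=20000)

M = sum(mean_growths)                      # total growth
eps2 = sum(s * s / 3.0 for s in spreads)   # total fluctuationsquare (variance of U(-s, s) is s^2/3)

sample_mean = statistics.fmean(sizes)
sample_var = statistics.pvariance(sizes)
sample_sd = sample_var ** 0.5
within_one_sd = sum(abs(x - sample_mean) <= sample_sd for x in sizes) / len(sizes)

print("mean:", sample_mean, "expected:", x0 + M)
print("variance:", sample_var, "expected:", eps2)
print("fraction within one s.d.:", within_one_sd)
```

Any other bounded law for the fluctuations could be substituted for the uniform one; only independence and comparable smallness matter for the argument.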
2. On the order of the quantities A and a and on the number n of causes.
As was already mentioned the equation (4) was derived in the supposition
of an infinite number of causes. Of course this cannot be the case of
nature, but as doubtlessly the number of causes is generally very great
and as further (even for moderate values of n) the form of the frequency
curve approaches very rapidly to the limit, it has been generally assumed
that there can be no serious objection against putting $n = \infty$. This view,
however, shall have to be modified, at least if we wish to extend the
theory to the size of plants and animals or of parts thereof. The necessity
of the modification is not a consequence of the deviation to be
apprehended from the Gaussian exponential form and we will retain this
form even where we do not take $n = \infty$. It is a consequence of the
observed proportion of the constants $\varepsilon$ and M.
This is easily seen. For let us begin by really taking $n = \infty$. From
the equation (4) it appears that, in order that the frequency curve be a
real curve, $\varepsilon^2$ must be finite.
We may exclude the case $\varepsilon = 0$, for in this case the frequency curve
will be reduced to a single point. Now, as we assume that the quantities
a are of the same order of magnitude, this order must evidently be that
of $1/\sqrt{n}$, so that the quantities $a^2$ will be of the order $1/n$. It is true that
it is allowable (see the footnote to condition b) to admit for some of the
$a^2$ a value of an order smaller than $1/n$. Still it is necessary, in
order that $\varepsilon$ may remain finite, that the number of the quantities $a^2$
which are of the order $1/n$ remain of the order n.
*) Which will not exclude that a part of them may be of a higher order of smallness.
These will simply have no appreciable influence on the result.
Of the order of the quantities A little can be said in general. If
however they are all of the same sign and of the same order and if M
is finite and not zero, they are evidently of the order of $1/n$.
In this case therefore we find that — in order that we may have a
real frequency curve — the fluctuations must be infinitely greater than
the growths.
The correctness of this at first somewhat startling result is easily
illustrated by a particular example. As such a particular example let the
A be not only of the same order but also equal. Similarly let all the a
be numerically equal. In this case it must be evident that the growth
must increase proportionally with the number of causes. On the other
hand the fluctuations , which by definition are as often positive as negative,
will grow (according to a well-known rule extensively used in the theory
of observation errors) proportionally with the square root of the number
of causes.
After the operation of n causes, therefore, the total growth will be
nA, the total fluctuationsquare $n a^2$. In order that the frequency curve
produced shall be a real curve, both these quantities must be finite.
Therefore both the A and the $a^2$ must be of the order $1/n$, the a themselves
of the order of $1/\sqrt{n}$. Consequently (if $n = \infty$) the fluctuations a must
be infinitely great as compared with the growths A.
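The scaling in this particular example can be checked with a few lines of arithmetic. The sketch below is only an illustration under the stated assumption of n equal causes; the function name `per_cause_sizes` and the chosen totals are hypothetical. It computes the per-cause growth A and fluctuation a that keep the total growth nA and the total fluctuationsquare $n a^2$ fixed, and shows that the ratio a/A then grows like $\sqrt{n}$.

```python
import math


def per_cause_sizes(n, total_growth, total_fluctuation):
    """Per-cause growth A and fluctuation a that keep the totals fixed.

    With n equal causes the total growth is n*A and the total
    fluctuationsquare is n*a**2, so A = M/n and a = eps/sqrt(n).
    """
    A = total_growth / n
    a = total_fluctuation / math.sqrt(n)
    return A, a


for n in (100, 10_000, 1_000_000):
    A, a = per_cause_sizes(n, total_growth=1.0, total_fluctuation=1.0)
    # n*A and sqrt(n)*a stay at 1.0, while a/A equals sqrt(n).
    print(n, n * A, math.sqrt(n) * a, a / A)
```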
Meanwhile we thus get into contradiction with nature. In considering
the size of plants and animals or parts thereof, we have to do with
onesided deviations, i.e. with deviations all in one sense, usually the sense
of growth. For instance : under the influence of certain causes some plants
will grow a little, some will grow more, others will grow considerably
more, but not a single individual will diminish in size. Wherever this
is the case it is impossible that the fluctuations produced by any one
cause be very much greater, far less infinitely greater, than the mean
growth produced by that same cause.
For, a fluctuation in excess of the mean growth and in the negative sense,
means a total deviation which would be negative, a case excluded a priori.
In these cases we are compelled to admit that the fluctuations must
be of the same order of magnitude as the growths. But this being admitted
and n being still considered to be infinite, $\varepsilon$ must become infinitely small
as compared to M. The frequency curve would thus necessarily be
reduced to a single point, or in other words, all the individuals will
finally have the same size.
I conclude that wherever in nature we have before us a real frequency
curve (which is not represented by a single point) — if we are sure before
hand that the deviations are onesided — the number n of causes cannot be
assumed to be infinite.
We may even go a step further and say that the number of causes
must be of the order of $(M/\varepsilon)^2$. For if the A and the a are of the same
order, M grows as n and $\varepsilon$ as $\sqrt{n}$, so that $M/\varepsilon$ is of the order of $\sqrt{n}$.
For a finite number of causes this is of
course rather a vague expression. In practice it may be taken to mean
something like this, that, though this number may be anywhere between
one half or double the value of $(M/\varepsilon)^2$, it will probably not reach one tenth
or 10 times this amount.
Such a conclusion has certainly something very surprising. It may
not be clear at first sight why we could not for instance, in the case of
growing plants, consider every minute of rain or sunshine as a separate
cause. If we could, this would of course lead us generally to admit a
very high number of causes.
But some reflection will lead us to think otherwise in this matter.
The conclusion that $\varepsilon$ must grow somewhat proportional to $\sqrt{n}$, while
M must grow more nearly proportional to n, rests on the supposition that
the several causes are independent of each other. This implies that if
certain individuals a have been benefitted by the cause $C_1$ to a smaller
degree than certain other individuals b, the case may as well be reversed
for the next cause $C_2$. I mean that now the individuals a must have as
good a chance of being the favoured ones as the individuals b. But this
will, I think, mostly not be the case if — as assumed just now — every
minute of rain or sunshine is considered as a separate cause. If certain
individuals are less favoured by one minute of sunshine than certain
others, there probably is a reason for this, which will not have ceased in
the next minute. When certain plants are in the shadow a first minute,
they will mostly still be so during the next minute. If this happens to
be the case, then we cannot (in the present theory) consider the one
minute of sunshine as a separate cause, but we must take as one cause the
whole of all the consecutive minutes during which the effect is constantly
favourable to a determined set of individuals, less favourable to another set.
Nobody will of course expect that by considerations like these, we
will be enabled to draw a sharp division line between the domains of what
we have to consider as different causes.
We will not expect to see the favours of fate distributed over the
individuals of a given set of plants in absolutely the same way during a
certain number of minutes and then suddenly to find this distribution
changed for another. We will rather expect that after some time, while
the great majority are still favoured in the same way, certain individuals
will begin to gain or lose in favour. That as time proceeds the number
of these individuals will gradually increase, till finally we come to a time
for which we may say that we have got quite a new distribution of the
favourable conditions. We will then know that we have arrived in the
domain of a new cause, though we will be unable to assign exactly the
division line between this domain and the domain of the preceding cause.
According to this consideration it is even conceivable that, by careful
observation at every moment, of the degree to which the several indivi-
duals are favoured, we could get a roughly approximate idea about the
number of what we have to consider as independent causes. It can hardly
be expected that some one will really undertake such observations, which
in every case will be extremely long and difficult, in many cases impos-
sible. We have therefore to stop at the rough notion we get at once from
the frequency curve, which is, that the number of causes must be of
the order of $(M/\varepsilon)^2$.
It is to be noted that the number of causes implied by this estimate —
though of course far from infinite — is still as a rule not inconsiderable.
For just in the case of one-sided deviations here considered, which is the
case of plants and animals, we generally find that the divergences from
the mean size are small as compared to the mean size itself. So for
instance Quetelet finds for the length of adult Italians
$M = 59.0, \qquad \varepsilon = 2.47$.
We conclude that the number of causes — in the sense of the present
investigation — must be of the order of 600, that is to say it may be
easily 1000, but probably not 10000.*)
*) Of course this is in the supposition that we have to do with really homogeneous
material. In the case quoted it seems far more probable that we have to do with a
mixture and that the size of all the individuals would have been found much more nearly
equal, consequently the concluded number of causes much more considerable, if we had
had before us, as tacitly assumed, a case of real "reine Linie".
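The arithmetic behind the figure of about 600 is simply the square of the ratio $M/\varepsilon$. A minimal sketch, using the two values quoted from Quetelet above (the function name is a hypothetical helper introduced for the example):

```python
def order_of_number_of_causes(M, eps):
    """Rough order of the number of independent causes, (M / eps) squared."""
    return (M / eps) ** 2


n_estimate = order_of_number_of_causes(59.0, 2.47)
print(round(n_estimate))  # about 570, i.e. of the order of 600
```

In the sense explained in the text, such a figure fixes only an order of magnitude: half or double the value remains quite admissible.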
I have been somewhat long in explaining this point, because I think
no attention has been drawn to it before. For the present paper it was
only necessary to point out that the evident necessity there is, in the
case of plants and animals, of admitting that the quantities A and a are
of the same order, does not exclude them from our theory. If we had
confined it to cases in which the deviations are nearly or wholly as often
positive as negative, the theory at least for the skew curves would only
have been slightly more simple, because in that case — the number of
causes being still considered to be very great — the quantities A might
have been treated as of a higher order of smallness than the quantities a.
For the normal curves even such a simplification does not exist at all, for
in the formulae (4) and (5) no supposition in regard to the order of the
quantities A is required.
Remark. The condition a (art. 1) that the causes of deviation must
be independent, implies that the deviations must be independent of the
size x. Meanwhile the derivation of the normal curve proves that the
deviations experienced by the individuals of different size, need not be
identically the same. It is only required that for individuals of different
size x, the quantities $\bar{A}_h$ and $\overline{a_h^2}$ be the same.
3. Skew curves. In the first paper p. 10 the remark was made that
not only must skew curves occur in nature, but that they must be the
rule. The skewness, however, may well be too small for ready detection.
The reason is that, even if certain quantities x are normally distri-
buted, the different functions of x cannot be so distributed. The remark
naturally leads to the following
Problem. On certain quantities Z there come to operate the causes
$C_h$, producing deviations $\Delta Z$, which satisfy the conditions a and b. These
deviations consequently are independent of the size Z.
Therefore let
(6)  $\Delta Z = A_{h,k} = \bar{A}_h + a_{h,k}$.
According to what precedes the frequency curve produced will be
normal. Now let the quantities x be dependent on the quantities Z
according to the equation
(7)  $Z = F(x)$.
What will be the frequency curve of the x?
Solution. If the $\Delta Z$ and $\Delta x$ are corresponding deviations we will
evidently have, neglecting powers higher than the second
(8)  $\Delta Z = F'(x)\,\Delta x + \frac{1}{2} F''(x)\,\Delta x^2$.
Solving this equation we will have, to the same degree of approximation:
(9)  $\Delta x = \frac{\Delta Z}{F'(x)} - \frac{1}{2}\,\frac{F''(x)}{[F'(x)]^3}\,\Delta Z^2$.
According to what precedes the $a_{h,k}$ will be, as a rule, of the order
of $1/\sqrt{n}$. The $\bar{A}_h$ will mostly be considerably smaller. Still, according to
what has been said about one-sided deviations, it will be necessary to
treat the A as quantities which may be of the same order as the a. At
all events we will assume that none of the A is greater than a quantity
of the order of $1/\sqrt{n}$.
Accurate to quantities of the order $1/n$ we will thus have:
deviation of the kth individual, at abscissa x,
(10)  $\Delta x = \frac{A}{F'(x)} - \frac{1}{2}\,\frac{F''(x)}{[F'(x)]^3}\,A^2$
or
(11)  $\Delta x = \frac{\bar{A}_h + a_{h,k}}{F'(x)} - \frac{1}{2}\,\frac{F''(x)}{[F'(x)]^3}\,(\bar{A}_h + a_{h,k})^2$.
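The weight of the second-order term can be judged on a case where the exact deviation is known. The sketch below is an illustration only, assuming $F(x) = \log x$, for which a deviation A of Z changes x exactly by $x(e^A - 1)$; the function names and the numerical values are hypothetical. It compares this exact value with the first-order expression $A/F'(x)$ and with the second-order expression $A/F'(x) - \frac{1}{2} F''(x) A^2 / [F'(x)]^3$.

```python
import math


def deviation_first_order(A, dF):
    """First-order deviation of x: A / F'(x)."""
    return A / dF


def deviation_second_order(A, dF, d2F):
    """Second-order deviation of x: A/F'(x) - (1/2) F''(x) A^2 / F'(x)^3."""
    return A / dF - 0.5 * d2F / dF ** 3 * A * A


# Assumed example: F(x) = log x, so F'(x) = 1/x and F''(x) = -1/x^2.
x, A = 5.0, 0.1
dF, d2F = 1.0 / x, -1.0 / x ** 2

# Exact deviation: log(x + dx) = log(x) + A  gives  dx = x * (exp(A) - 1).
exact = x * (math.exp(A) - 1.0)
first = deviation_first_order(A, dF)
second = deviation_second_order(A, dF, d2F)

print("exact       :", exact)
print("first order :", first, "error", abs(first - exact))
print("second order:", second, "error", abs(second - exact))
```

In this example the second-order expression comes much closer to the exact deviation than the first-order one, the remaining error being of the third order in A.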
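According to the formulae (4) and (5) the equation of the normal
frequency curve of Z is
(12)  $y = \frac{1}{\varepsilon\sqrt{2\pi}}\, e^{-\frac{[Z - F(x_0) - M]^2}{2\varepsilon^2}}$
in which
(13)  $M = \sum \bar{A}_h$ = total mean growth of the quantities Z
(14)  $\varepsilon^2 = \sum \overline{a_h^2}$ = total fluctuationsquare of the Z.
Now it is evident that, x and Z being corresponding quantities,
frequency Z to Z + dZ = frequency x to x + dx.
Therefore, if $y = \Omega(x)$ represents the frequency curve of the x,
$\Omega(x)\,dx = \frac{1}{\varepsilon\sqrt{2\pi}}\, e^{-\frac{[Z - F(x_0) - M]^2}{2\varepsilon^2}}\,dZ$,
for which equation, because Z = F(x) and dZ = F'(x)dx, we may write
(dividing by dx)
(15)  $\Omega(x) = \frac{F'(x)}{\varepsilon\sqrt{2\pi}}\, e^{-\frac{[F(x) - F(x_0) - M]^2}{2\varepsilon^2}}$
in which equation the meaning of M and $\varepsilon$ is still given by (13) and (14).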
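To see what such a skew curve looks like in a concrete case, the following sketch evaluates a frequency curve of the form $F'(x)\,e^{-[F(x) - F(x_0) - M]^2 / 2\varepsilon^2} / (\varepsilon\sqrt{2\pi})$ for the assumed choice $F(x) = \log x$. The function names, the midpoint-rule integrator and the numerical values are all assumptions made for the illustration. It checks numerically that the total area is 1 and that half of the area lies below the value of x for which $F(x) = F(x_0) + M$.

```python
import math


def skew_frequency(x, F, dF, x0, M, eps):
    """Frequency curve of x when Z = F(x) is normal with mean F(x0) + M and s.d. eps."""
    z = (F(x) - F(x0) - M) / eps
    return dF(x) / (eps * math.sqrt(2.0 * math.pi)) * math.exp(-0.5 * z * z)


def integrate(f, lo, hi, steps=20000):
    """Plain midpoint rule, sufficient for this smooth curve."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))


# Assumed example: F(x) = log x, so that F'(x) = 1/x.
F = math.log


def dF(x):
    return 1.0 / x


x0, M, eps = 10.0, 0.2, 0.3


def curve(x):
    return skew_frequency(x, F, dF, x0, M, eps)


area = integrate(curve, 1e-6, 200.0)
x_half = x0 * math.exp(M)              # the x for which F(x) = F(x0) + M
below_half = integrate(curve, 1e-6, x_half)

print("total area:", area)
print("area below x_half:", below_half)
```

Other monotonic functions F can be tried by replacing `F` and `dF`, provided the limits of integration cover the range over which the curve is appreciable.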
Remark 1. As has already been remarked the term with $\Delta Z^2$ in (9)
was erroneously neglected in my first paper. In the preceding article it
was shown that such a course is inadmissible, the reason being that if
the $\Delta Z^2$, therefore also the $a^2$, were really negligible, then we would find
by (14) that $\varepsilon^2$ would be zero. The neglect of the term in question is
permissible therefore only in the case that the frequency curve consists of
a single point.
The error may be made apparent in quite another way, by showing
that the neglect of the second term leads to erroneous results. It was
indeed in this way that my attention was drawn to it.
Indeed if deviations of the form $\frac{A}{F'(x)}$ gave rise to the frequency
curve (15), then it is easily seen that in the particular case that we take
for the A (which are the deviations of the quantities Z = F(x)) deviations
which are as often positive as negative (what we will here call
symmetrical deviations) we will have, according to (13)
$M = 0$.
This being so, and i being the lower limit of the frequency curve (15),
we will have for the median ($x_m$), that is for the value of x above and
below which the numbers are equal:
$\int_i^{x_m} \frac{F'(x)}{\varepsilon\sqrt{2\pi}}\, e^{-\frac{[F(x) - F(x_0)]^2}{2\varepsilon^2}}\,dx = \frac{1}{2}$.
In this expression put
$\frac{F(x) - F(x_0)}{\varepsilon\sqrt{2}} = t$.
As the quantities F(x) were assumed to be normally distributed we
will have
$F(i) = -\infty$.
Therefore our equation reduces to
$\frac{1}{\sqrt{\pi}} \int_{-\infty}^{t_m} e^{-t^2}\,dt = \frac{1}{2}$, where $t_m = \frac{F(x_m) - F(x_0)}{\varepsilon\sqrt{2}}$.
Consequently
$F(x_m) = F(x_0)$,
from which
$x_m = x_0$.
That is the median would be simply the undisturbed value of x.
Now this result is evidently false.
For it is apparent that where the deviations are all symmetrical the
arithmetical mean of all the x's cannot be changed. But at starting the
size of all the individuals is $x_0$. Therefore $x_0$ must be equal to the
arithmetical mean and not to the median.
Remark 2. In general the deviations $\Delta x$, as they are assumed to
have form (10), though of course they may be symmetrical for any
particular value of x, cannot be symmetrical for the individuals of every
size. As a consequence we cannot expect $x_0$ to be equal to the arithmetical
mean but in particular cases.
The only particular cases, where there is a possibility of symmetrical
deviation for all the individuals are the two following:
F(x) = a + bx
F(x) = h log (a + bx).
For the proof see Appendix I where at the same time it is shown
that in these cases we have really $x_0$ = arithm. mean.
Remark 3. The solution of the main problem of this article is
evidently equivalent with that of finding the frequency curve produced in
the case of deviations of the form (10). But in this form the problem
may seem to be lacking in plausibility. It would seem to be much more
natural to inquire what would be the frequency curve in the case that
the deviations were of the form
$\Delta x = \frac{A}{F'(x)}$.
This greater naturalness, however, is only apparent as will become
evident, if we fix our attention not on the deviations but on the intensity
with which the individuals of the size x react on a given cause. For,
where there are at work causes on which the individuals of size x react
with an intensity proportional to
$\frac{1}{F'(x)}$,
there we will in reality get deviations of the form (9). Under the action
of such causes the deviations would be of the form
(17)  $\Delta x = \frac{A}{F'(x)}$
only in the case that we might neglect terms of the second order. But
we know that this is not allowable. If therefore we admit higher powers,
we shall have to consider that, as soon as, by the beginning deviation,
the size of an individual, which originally was x, has changed a little,
say to size $x + \theta$, then the reaction of the cause will no longer be
proportional to $\frac{1}{F'(x)}$ but to $\frac{1}{F'(x + \theta)}$.
Now, neglecting 2nd and higher powers of $\theta$,
$\frac{1}{F'(x + \theta)} = \frac{1}{F'(x)} - \frac{F''(x)}{[F'(x)]^2}\,\theta$.
In order to find the total deviation we will have to find the
average value of this expression. This is obtained by putting $\theta = \frac{1}{2}\Delta x$,
or as (17) gives the value of $\Delta x$ accurate to first powers of A,
$\theta = \frac{1}{2}\,\frac{A}{F'(x)}$ + terms in $A^2$, $A^3$, so that finally, neglecting 3rd and
higher powers of A we have
$\Delta x = \frac{A}{F'(x)} - \frac{1}{2}\,\frac{F''(x)}{[F'(x)]^3}\,A^2$
which is just the form (10).
It thus becomes clear that the finding of the frequency curve for the
case that the deviations have the form (10) is really a much more impor-
tant and natural problem than the finding of the frequency curve for the
case that the deviation would have the form (17), a fortunate circumstance
because the latter problem must be much the more difficult of the two.
At least some trials made by myself have not been successful.
For the rest it may be well to remark:
a. that the very frequent occurrence of normal, or at least approxi-
mately normal curves, leads very naturally to the suspicion that in those
cases where we meet with decidedly skew distributions, this skewness may
be attributable to the fact that we did not measure the most suitable
quantities; that the normal curve would duly have made its appearance,
had we measured other quantities, which are functionally connected with
the observed ones, — if for instance we had measured surfaces or volumes
instead of diameters. Considerations like these lead of course immediately
just to the problem investigated in what precedes.
b. As will appear further on, the second term of (10) must be almost
negligible in many cases of nature. In such cases of course the two
problems become identical.
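4. Mean growth and mean fluctuationsquare under the influence of a
single cause $C_h$.
According to (11) we have, the mean value of $a_{h,k}$ being zero
(according to (21)),
(19)  Mean growth at abscissa x, cause $C_h$ = $\overline{\Delta x} = \frac{\bar{A}_h}{F'(x)} - \frac{1}{2}\,\frac{F''(x)}{[F'(x)]^3}\,\left(\bar{A}_h^2 + \overline{a_h^2}\right)$.
Furthermore, the divergence of any arbitrary $\Delta x$ from the mean $\overline{\Delta x}$
of all, is
$\Delta x - \overline{\Delta x} = \frac{a_{h,k}}{F'(x)} - \frac{1}{2}\,\frac{F''(x)}{[F'(x)]^3}\,\left(2\bar{A}_h a_{h,k} + a_{h,k}^2 - \overline{a_h^2}\right)$.
Therefore, to the same degree of approximation as before,
$(\Delta x - \overline{\Delta x})^2 = \frac{a_{h,k}^2}{[F'(x)]^2}$.
Consequently
(20)  Mean fluctuationsquare at abscissa x, cause $C_h$ = $\frac{\overline{a_h^2}}{[F'(x)]^2}$.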
5. Mean growth and mean fluctuationsquare under the influence of the
whole of all the causes.
If now we take the mean of all the expressions (19) and (20), for the
whole of the causes $C_h$ (h = 1, 2, 3 . . . n) we get by (5), if for the sake of
brevity we put
(21)  $\sum (\bar{A}_h)^2 = B\varepsilon^2$,
Mean growth at abscissa x = $\frac{M}{F'(x)} - \frac{1}{2}\,(1 + B)\,\varepsilon^2\,\frac{F''(x)}{[F'(x)]^3}$
Mean fluctuationsquare = $\frac{\varepsilon^2}{[F'(x)]^2}$
or, if we put
<22> T = *
(23) . . Mean growth at abscissa x =
(24) .... Mean fluctuationsq. abscissa x =
F(x) 2M
e2 1
. -,, f ^
For the case M = 0 , we will rather put
For this case therefore we get
(26) . . . Mean growth at abscissa x == — -^ H (1 + B) wr
TJ
(27) . . . Mean fluctuationsq. abscissa a = 7EvT~2»
Remark. According to (5) and (21)
According to what has been said in art. 2, wherever the number of
causes is very great, the A must be small as compared to the a. In general,
therefore, B will probably be a very small positive quantity. In the case
of one-sided deviations, however, we are compelled to admit that the
a and the A are of the same order. Still it seems natural to think that
even here the A must mostly be rather smaller than the a. Particularly
so in the case that M/ε is unusually small. For this quantity can become
small for two reasons: 1st because the number of causes, which according
to the same article must be of the order of (M/ε)², is small; 2nd because the
growths A are somewhat small as compared to the a. If now we find
a case in which M/ε is exceptionally small, whereas there seems no a priori
reason to think that the number of causes is particularly small, we will
be led to suspect the existence of the second cause. In short there
seems to be every reason to admit that the quantity B, which must
generally be very small, must, even in the case of one-sided deviations, be
mostly smaller than unity. It must be expected to be particularly small
wherever the value of M/ε is inconsiderable.
6. Inverse Problem.
Given that for certain quantities x we have found by observation the
frequency curve
y = Ω(x).
Required 1st to find a function F(x) which is normally distributed.
2nd, the growth-curve, fluctuation- curve and reaction-curve.
We call growth-curve, fluctuation-curve and reaction-curve the curves
of which the ordinates are proportional respectively to the mean growth,
to the square root of the mean fluctuationsquare and to the intensity with
which the individuals react on the causes of deviation.
Solution. From what precedes it appears that if the deviations of
certain quantities x have the form (10), with which corresponds an intensity
of reaction proportional to (16), and if, as a consequence thereof the
mean growth and the mean fluctuationsquare have the values (23) and
(24), resp. (26) and (27), that then the deviations of the quantities Z=F(x)
become independent of x, as a consequence whereof the Z will be distributed
in a normal curve, whereas the x will be distributed in a frequency
curve of the form (15).
May we conclude that the inverse holds too, i. e. may we conclude
that if the equation of the given frequency curve has been brought under
the form (15), the quantities Z=F(x) will be distributed in a normal curve?
Such would be the case if we might conclude that the deviations ΔZ
are independent of the Z. But cannot other deviations than those which
are independent of the Z also produce a normal frequency curve?
This possibility really exists, at least if we assume that growth and
fluctuation are independent of each other. So for instance it is easily proved
that a normal curve will be produced in the case that the growth (but
not the fluctuation) is a linear function of the Z, and even this does not
seem to be the most general case. In the above assumption therefore, if
we find the Z distributed in a normal curve , we cannot conclude that the
ΔZ are independent of the Z and, as a consequence thereof, we have to
admit that with any given skew frequency curve may correspond more
than one form of growth-curve and fluctuation-curve.
Meanwhile it is hardly conceivable that our assumption holds in nature.
The subdivision of the deviation into a mean growth and a fluctuation is
purely artificial and merely introduced for the convenience of the mathe-
matical discussion. Their independence therefore seems inadmissible.
What we have to expect is that the intensity of the reaction of the individuals
of different size x will be a function of x. With this one function
there will correspond a determined mean growth and a determined mean
fluctuation (a constant factor being disregarded). With this one function
there will also correspond a single solution for the frequency curve. This
appears from the solution given in the second chapter by Prof. VAN UVEN.
Admitting therefore that with one reaction curve there corresponds but
one frequency curve, the solution of the inverse problem, now under con-
sideration becomes evident.
If by observation, we have found for the quantities x, the frequency curve
(28) y = Ω(x)
and if we have determined F(x) from the equation
(29)
(a determination which we will have to consider presently) then the
quantities F(x) will be distributed normally and the mean growth and mean
fluctuationsquare will be determined by (23) and (24) resp. by (26) and
(27), whereas the intensity of the reaction will be proportional to (16).
The equations of the growth-curve, fluctuation-curve and reaction-curve will be:
(30) growth-curve: $y = \dfrac{1}{F'(x)} - \dfrac{(1+B)\,\varepsilon^{2}}{2M}\,\dfrac{F''(x)}{[F'(x)]^{3}}$
     fluctuation and reaction-curve: $y = \dfrac{1}{F'(x)}$
     if M not = 0,
respectively
(31) growth-curve: $y = -\dfrac{F''(x)}{[F'(x)]^{3}}$
     fluctuation and reaction-curve: $y = \dfrac{1}{F'(x)}$
     if M = 0,
in which we have always, according to (12) and (13)
M = Σ A_h = total growth of F(x),
ε² = Σ a_h² = total fluctuationsquare of F(x).
It is to be noted that both in the equations (30) and in (31) we have
neglected a constant factor which is not the same for the growth and the
fluctuation. If we wish to have the true proportion of the two we shall
have to go back to the equations (22) to (26).
7. Derivation of F(x) from Ω(x).
Let x = τ and x = v represent the lower and upper limit of the given
frequency curve. In order that the quantities F(x) be normally distributed
we must have, to begin with:
(33) F(τ) = −∞; F(v) = +∞.
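For other values of x we will find F(x) if we multiply (29) by dx and
integrate between the limits τ to x. We get an equation
which reduces to
(34) $$\int_{\tau}^{x}\Omega(x)\,dx=\frac{1}{\sqrt{\pi}}\int_{-\infty}^{\frac{F(x)-F(x_0)-M}{\varepsilon\sqrt{2}}}e^{-t^{2}}\,dt.$$
A table for the integral $\frac{1}{\sqrt{\pi}}\int_{-\infty}^{z}e^{-t^{2}}\,dt$ has been given in the first paper.
By its aid we get at once, for every value of x,
(35) $$f(x)=\frac{F(x)-F(x_0)-M}{\varepsilon\sqrt{2}}.$$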
It is clear that since the F(x) are normally distributed, the same holds
for the f(x). It will be convenient therefore to take f(x), which is
directly and completely given by the observed frequency curve y = Ω(x),
for the required function, which is normally distributed.
If we act in this way and if we remark that by putting x = x_0 in
(35), we get
(36) $$f(x_0)=-\frac{M}{\varepsilon\sqrt{2}},$$
we find that finally the solution of the present question comes to this:
From the observed function Ω(x) derive f(x) by
(37) $$\frac{1}{\sqrt{\pi}}\int_{-\infty}^{f(x)}e^{-t^{2}}\,dt=\int_{\tau}^{x}\Omega(x)\,dx.$$
The f(x) will be distributed in the normal frequency curve
1
(38)
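The rule (37) lends itself to a short numerical sketch. What follows is only an illustration under stated assumptions: the frequency table is hypothetical, the function names are invented for the example, and the table of the integral given in the first paper is replaced by the inverse normal distribution function of Python's standard library. Since the integral in (37) uses the weight e^{-t²}, the standard normal quantile has to be divided by √2.

```python
from statistics import NormalDist
import math

def f_from_cumulative(p):
    """Solve (1/sqrt(pi)) * integral_{-inf}^{f} exp(-t^2) dt = p for f."""
    # The standard normal quantile u satisfies Phi(u) = p, and f = u / sqrt(2).
    return NormalDist().inv_cdf(p) / math.sqrt(2.0)

def f_curve(class_limits, counts):
    """Return the points (x, f(x)) at the interior class limits of a table.

    class_limits has one entry more than counts; the two outer limits are
    skipped because the cumulative frequency there is 0 or 1 (f = -inf, +inf).
    """
    total = sum(counts)
    points, running = [], 0
    for upper, n in zip(class_limits[1:-1], counts[:-1]):
        running += n
        points.append((upper, f_from_cumulative(running / total)))
    return points

# Hypothetical frequency table: five classes between x = 0 and x = 5.
limits = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
counts = [5, 20, 40, 25, 10]
points = f_curve(limits, counts)
for x, f in points:
    print(x, round(f, 4))
```

The sketch assumes that the first and the last class are not empty, so that every cumulative frequency used lies strictly between 0 and 1.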
According to (30) and (31) we will further have, expressed in f(x)
and leaving out constant factors:
(39) growth-curve: $y = \dfrac{1}{f'(x)} + \dfrac{1+B}{4 f(x_0)}\,\dfrac{f''(x)}{[f'(x)]^{3}}$
     fluct. & react. curve: $y = \dfrac{1}{f'(x)}$
     if f(x_0) not = 0,
(40) growth-curve: $y = -\dfrac{f''(x)}{[f'(x)]^{3}}$
     fluct. & react. curve: $y = \dfrac{1}{f'(x)}$
     if f(x_0) = 0.
Remark. The second term in the growth-curve (39) may cause trouble
on account of the generally unknown factor (1 + B)/f(x_0).
Let us first assume that x_0 is known from another source, and consider
separately the cases of one-sided and not one-sided deviations.
a. Where, as with plants and animals, the deviations are one-sided
and in the sense of positive growth, x_0 cannot exceed τ; it may at
most be equal to τ. As a rule the limit of an observed frequency-curve
cannot be assigned with any precision. In these cases it seems
advisable to me to take τ = x_0. It will follow, according to (35), that f(τ)
is not −∞, but has the finite value f(x_0).
This being so, the value of the integral
$$\frac{1}{\sqrt{\pi}}\int_{\tau}^{v} f'(x)\,e^{-[f(x)]^{2}}\,dx$$
will no longer be equal to unity, but will differ from it by the amount
(42) $$\frac{1}{\sqrt{\pi}}\int_{-\infty}^{f(x_0)} e^{-t^{2}}\,dt.$$
This amount will be necessarily accumulated at the lower limit. We
will have to revert to this question of accumulation at the limits of the
frequency-curve.
It will be sufficient to remark in connection with the case under
consideration, that such an accumulation is readily conceivable. The whole
of the individuals begin by being accumulated at x = x_0. Where the
deviations are one-sided they may have any values between zero and any
small positive value. We are even compelled to admit values down to
zero, in the case, here assumed, that τ = x_0.
The consequence must be that after the operation of a certain number
of causes, some individuals will still have size x_0. The number of individuals
being sufficiently great, such must even be the case after the operation
of a very great number of causes.
Still we do not deny the difficulty, or even in many cases impossibility,
both of the assumption τ = x_0 and of a finite accumulation at this limit.
The difficulty however, is of the same nature as that which we meet in
the case of the normal curve which extends from −∞ to +∞, so that
a certain probability is attributed to any size whatever, both positive and
negative though in reality sizes beyond a certain amount are never observed.
The difficulty arises from the fact that this curve presupposes an infinity
of causes, whereas in nature this number is necessarily limited. It is not
felt as an objection, however, because the chances found for the extreme
sizes, are generally so small that they must escape notice.
The indeterminateness in the growth-curve as a consequence of the
unknown factor (1 + B)/f(x_0)
of the second term, will mostly not be very serious. For as f(x_0), though
perhaps not −∞, will doubtlessly generally have a considerable negative
value and as B will mostly be moderate, particularly where M/ε, therefore
−f(x_0), is smaller than usual (see Remark to art. 5), the total neglect of
the second term in (39) must not, as a rule, change the growth curve to
such an extent that its main features would be obliterated. There is one
circumstance which still further reassures us about the conclusions to be
drawn if we use
(41) $$y = \frac{1}{f'(x)}$$
instead of the more complete equation (39) as the equation of the growth-curve.
It is this. The complete equation may be written in the form
(42) $$y = \frac{1}{f'(x)} - \frac{1+B}{4 f(x_0)}\,\frac{1}{f'(x)}\,\frac{d}{dx}\!\left[\frac{1}{f'(x)}\right]$$
Therefore the second term vanishes absolutely in the points where
the curve (41) has its maxima and minima. These points, therefore,
which in most cases are the really interesting points of the curve, would
not be changed even were it possible to use the rigorous form. (The
abscissa of the maxima and minima may of course be slightly changed).
b. In the case of not one-sided deviations, x_0 will lie between the
limits τ and v of the frequency curve. x_0 being known from other sources,
f(x_0) will therefore be determined at the same time with the whole course
of f(x). The constant B will be negligible in this case, at least if we may
admit that the number of causes is very great. In conclusion, therefore,
the approximate determination of the growth curve will also not present
any difficulty in this case.
If x_0 be not given a priori, the growth curve cannot be determined
with any precision in this case. We shall have to rest content with the
reaction curve. Meanwhile we may here refer to the theory of the proportional
curves further below, from which it appears that in some cases at
least we might get indications about the value of x_0.
8. Conclusions to be drawn from the observed frequency curves.
The equations (36) to (40) enable us from given frequency curves
to draw some conclusions about the intensity with which individuals
of different size x react on the growth-causes, and about their mean
growth itself.
Of course we have not to forget that the hypotheses which lie at the
foundation of the theory may partly or wholly not be realised in the
cases of nature under investigation. For this reason the conclusions we
will draw will have no absolute cogency. They ought to be taken as
working hypotheses, hypotheses which draw the attention to certain more
or less probable facts. The present theory claims no other advantage
than this. How great the probability of the results is, what therefore
must be the value of the working hypotheses at which we arrive, must
appear from long experience.
The nature of the conclusions to which we may be led has already
been set forth in art. 23 of our first paper on skew curves. It seems
reasonable to expect that — if the theory gets a much more extensive trial
than we were able to give it — results in other domains and of another
nature will be found. Meanwhile there is one sort of conclusions, well
illustrated by the application given below, to which we wish here to draw
particular attention.
If for certain values of x the reaction becomes small, then, according
to what precedes, 1/f'(x) will be small, and f'(x) will be considerable. If,
therefore, ξ be a small quantity, f(x + ξ) will differ relatively much from
f(x). Therefore the number of individuals which, in the frequency-curve,
will have a size between x and x + ξ, that is a number proportional to
$$\frac{1}{\sqrt{\pi}}\int_{f(x)}^{f(x+\xi)} e^{-t^{2}}\,dt,$$
will also be relatively high.
We thus reach the conclusion that , wherever the reaction on the growth
causes becomes small, there we will find accumulation in the frequency
curve and conversely, wherever in the frequency-curve we have exceptional
accumulation, there we must conclude to relative rest in the growth.
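The connection between accumulation and small reaction can be sketched numerically. The following is a minimal illustration, not a procedure given in the text: the points (x, f(x)) are hypothetical, and the derivative f'(x) is replaced by the slope between successive points, so that the reaction curve, proportional to 1/f'(x), is obtained at the midpoints.

```python
def reaction_curve(points):
    """Approximate 1/f'(x) at the midpoints between successive (x, f) points."""
    curve = []
    for (x0, f0), (x1, f1) in zip(points, points[1:]):
        slope = (f1 - f0) / (x1 - x0)  # finite-difference estimate of f'(x)
        curve.append(((x0 + x1) / 2.0, 1.0 / slope))
    return curve

# Hypothetical course of f(x): steep between x = 2 and x = 3, i.e. many
# individuals accumulated in that interval of the frequency curve.
points = [(1.0, -1.2), (2.0, -0.8), (3.0, 0.4), (4.0, 0.8), (5.0, 1.2)]
curve = reaction_curve(points)
x_min, r_min = min(curve, key=lambda point: point[1])
print(curve)
print("smallest reaction near x =", x_min)
```

In this example the smallest ordinate of the reaction curve falls at x = 2.5, in the interval where f(x) rises most steeply, that is where the frequency curve shows its accumulation.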
A fine example of this phenomenon is furnished by the case of the
spores of Mucor Mucedo ¹) treated below. The enormous accumulation of
individuals not far from the middle of the frequency curve indicates at
once a stagnation in the growth of the individuals near the time at which
the size, determined by the minimum of the reaction curve, is reached.
Shortly after this conclusion was reached, my attention was drawn by
Prof. H. DE VRIES to the investigation of Prof. ERRERA (Botan. Zeitung
Vol 42 (1884) p. 497) who found a period of rest in the growth of the
sporangia of some of the fungi of the same Genus.
The probability therefore seems to be very great that direct observation ,
not of the sporangia but of the spores, will confirm our conclusions.
If we find in the frequency curve not an accumulation but a depression,
we will similarly conclude to a high degree of reaction for the individuals
having the size at which the depression occurs. A good example is that of
the ear-length of wheat under scanty feeding, given by Dr. C. DE BRUYKER
(Handelingen van het 13e Vlaamsche Natuur- en Geneeskundig Congres,
p. 172). If we neglect a pretty insignificant top, the curve has two very
decided maxima. Of such two-topped curves, there are at present a fair
number of examples in botanical literature. They are considered as an
indication that we have to do with hybrids, descended from two parental
forms having very different frequency curves.
The present theory furnishes another "working hypothesis", which
may be valuable in those cases where it may be considered probable or
certain that we are not concerned with hybrids.
In the present instance we are led to the conclusion of a much
accelerated growth of our individuals at about the time that a size of 45
to 85 millimeter is reached. Observations made for the express purpose
of testing this conclusion shall have to decide whether our explanation is
valid or not.
¹) Perfectly analogous results were found for the spores of Mucor Mucilagineus. I owe
both series of observation to Mr. G. POSTMA, who at the time worked as a student in the
botanical Laboratory of Groningen.
9. Accumulation at the limits.
Accumulation may well occur at or in the immediate neighbourhood
of the limits. It will do so if at these points the reaction curve stops
very suddenly, that is, if the ordinates of that curve, from being mode-
rately great very near the limits become zero at these points.
This is easily seen. Between the limits τ and v and up to any finite
distance from them 1/f'(x) can never be zero. For 1/f'(x) is proportional to
the reaction. Therefore, as all individuals begin by having size x_0, as
soon as by continued deviation they reach a point for which 1/f'(x) = 0,
all further deviation stops, so that no individual can pass that point,
which thus of necessity becomes a limit of the frequency-curve. Therefore
1/f'(x) must be finitely different from zero for all values of x at finite
distance from the limits. This being so we have f'(x) and, as a consequence
thereof, also f(x) finite for all values of x at all finite distances from the
limits. All this will hold in every case.
Now let us assume that we have the case of a reaction curve which
comes suddenly to a stop. Our theory will still hold provided we admit
that the vanishing of 1/f'(x) from a finite value to zero does not occur with
absolute suddenness ¹) but that the change occurs gradually and in a way
satisfying a certain condition ²), within a small interval x = τ to x = τ + ξ
resp. x = v − θ to x = v, which we will assume to be infinitesimally small.
In this case therefore we have: 1/f'(x) finitely different from zero for
all x between τ + ξ and v − θ,
therefore f'(x), consequently also f(x), finite between these same limits.
Therefore, finally, we will have, among a total number N of individuals:
numb. of indiv. between x = τ and x = τ + ξ:
$$\frac{N}{\sqrt{\pi}}\int_{\tau}^{\tau+\xi} f'(x)\,e^{-[f(x)]^{2}}\,dx=\frac{N}{\sqrt{\pi}}\int_{-\infty}^{f(\tau+\xi)} e^{-t^{2}}\,dt$$
numb. of indiv. between x = v − θ and x = v:
$$\frac{N}{\sqrt{\pi}}\int_{v-\theta}^{v} f'(x)\,e^{-[f(x)]^{2}}\,dx=\frac{N}{\sqrt{\pi}}\int_{f(v-\theta)}^{+\infty} e^{-t^{2}}\,dt$$
¹) The modification required by the theory for the case of an absolutely sudden
breaking off of the reaction curve is easily made in a particular case. I have not succeeded
in solving it generally. In nature the case can, I think, hardly be expected to exist.
²) The condition is that the ordinates must diminish in such a way that the greatest
deviation for any individual of size x remains constantly smaller than the distance which
still separates it from the limit, or in other words that every individual continually
approaches the limit without ever reaching it.
in which expressions f(τ + ξ) and f(v − θ) are both finite. The integrals
therefore have also finite values, that is we have finite accumulation of
individuals within the infinitesimal intervals
τ to τ + ξ, v − θ to v.
To my regret I have not found in literature any case of such accu-
mulations at the limit. Still there seems to be no doubt but that such
cases must exist. Imagine a number of plants of one species growing
in a flat topped greenhouse. As soon as the plants have reached a size
equal to the height of the green-house, further growth becomes impossible.
The reaction curve comes suddenly to a stop. Corresponding therewith
we will find an accumulation of individuals with a size just equal to that
of the green-house.
It seems probable that many cases of impediments against growth
beyond a certain size must exist in nature — though generally the limit
may not be so sharply determined as in the preceding instance. In such
cases attention will be called to such impediments by more or less evident
accumulations at the limit of the frequency curve.
10. Proportional curves.
What becomes of the frequency curves:
a. if the reaction on every one of the acting causes becomes λ-fold;
b. if, the average reaction or deviation remaining equal, the
number of causes grows in the proportion of 1 : λ?
We will call the curves in respect to the original one, proportional
curves of the first resp. the second kind. In regard to those of the first kind
the reaction, which originally was const./F'(x), now becomes λ · const./F'(x). In order
to pass from the original curve to the proportional one we have therefore
only to substitute (1/λ) F'(x) for F'(x). According to (11) this comes to the
same as if, leaving F'(x) unchanged, we put λA_h instead of A_h and
λa_h instead of a_h.
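According to (13) and (14) the consequence of such a change will be
that in the equation of the frequency-curve
M changes to λM,
ε changes to λε,
so that it becomes
$$y=\frac{F'(x)}{\lambda\varepsilon\sqrt{2\pi}}\,\exp\!\left(-\frac{[F(x)-F(x_0)-\lambda M]^{2}}{2\lambda^{2}\varepsilon^{2}}\right)$$
which, expressed in terms of f(x), according to (35) and (36) becomes
$$y=\frac{1}{\lambda\sqrt{\pi}}\,f'(x)\,e^{-\frac{1}{\lambda^{2}}\left[(\lambda-1)f(x_0)+f(x)\right]^{2}}$$
As for the proportional curves of the second kind, we suppose the
average deviations to remain the same; the A_h and a_h will remain the
same in the average. As, however, their number is supposed to increase
in the proportion of 1 : λ it follows from (13) and (14) that
M will change to λM,
ε² will change to λε².
The equation of the proportional curve of the second kind will
therefore become:
$$y=\frac{F'(x)}{\varepsilon\sqrt{2\pi\lambda}}\,\exp\!\left(-\frac{[F(x)-F(x_0)-\lambda M]^{2}}{2\lambda\varepsilon^{2}}\right)$$
or in terms of f(x)
(45) $$y=\frac{1}{\sqrt{\lambda\pi}}\,f'(x)\,e^{-\frac{1}{\lambda}\left[(\lambda-1)f(x_0)+f(x)\right]^{2}}$$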
We may summarise these results as follows: A frequency curve
(46) $$y=\frac{1}{\sqrt{\pi}}\,\varphi'(x)\,e^{-[\varphi(x)]^{2}}$$
will, in regard to another frequency curve,
(47) $$y=\frac{1}{\sqrt{\pi}}\,f'(x)\,e^{-[f(x)]^{2}}$$
(48) be proportional of the first kind if $\varphi(x)=\dfrac{1}{\lambda}\left[(\lambda-1)f(x_0)+f(x)\right]$
(49) be proportional of the second kind if $\varphi(x)=\dfrac{1}{\sqrt{\lambda}}\left[(\lambda-1)f(x_0)+f(x)\right]$
In both cases therefore the functions φ(x) and f(x) will be linear
functions of each other. Wherever we find two frequency curves which show
such a linear relation, there thus exists the possibility of their being
proportional curves. If neither λ nor f(x_0) are known a priori, we will be
unable to decide the kind of proportionality. If f(x_0) is known a priori
the decision becomes possible. It deserves attention that in any case where
we find a linear relation between φ(x) and f(x), if we have reason to
assume that the proportionality must be of a determined kind, we can
determine both λ and f(x_0) and consequently x_0. So in the case, treated
further below, of the summer and winter barometer heights. We assume
that we have to do with a proportionality of the first kind. If this is
really so then the undisturbed barometer height at den Helder must be
761.2 mm.
In the theory of observation errors similar cases would offer the
possibility of finding the correct value of the unknown x_0 notwithstanding the
presence of unknown systematic errors.
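The determination of λ and f(x_0) from a linear relation between φ(x) and f(x) can be sketched as follows. This is only an illustration: the least-squares line, the function names and the numerical values are assumptions made for the example, not the procedure or the data of the text. Writing the observed relation as φ = a + b·f, the relation (48) gives b = 1/λ and a = (1 − b) f(x_0), whereas (49) gives b = 1/√λ and a = b(λ − 1) f(x_0).

```python
def fit_line(f_values, phi_values):
    """Least-squares straight line phi = a + b * f; returns (a, b)."""
    n = len(f_values)
    mean_f = sum(f_values) / n
    mean_phi = sum(phi_values) / n
    sxx = sum((f - mean_f) ** 2 for f in f_values)
    sxy = sum((f - mean_f) * (p - mean_phi)
              for f, p in zip(f_values, phi_values))
    b = sxy / sxx
    return mean_phi - b * mean_f, b

def first_kind(a, b):
    """Relation (48): b = 1/lam and a = (1 - b) * f(x0)."""
    return 1.0 / b, a / (1.0 - b)

def second_kind(a, b):
    """Relation (49): b = 1/sqrt(lam) and a = b * (lam - 1) * f(x0)."""
    lam = 1.0 / b ** 2
    return lam, a / (b * (lam - 1.0))

# Hypothetical check: build phi from assumed values, then recover them.
lam_true, f_x0_true = 2.0, -1.5
f_values = [-1.0, -0.5, 0.0, 0.5, 1.0]
phi_values = [((lam_true - 1.0) * f_x0_true + f) / lam_true for f in f_values]

a, b = fit_line(f_values, phi_values)
lam, f_x0 = first_kind(a, b)          # assuming the first kind
lam_alt, f_x0_alt = second_kind(a, b)  # the same line read as second kind
print(lam, f_x0, lam_alt, f_x0_alt)
```

The same straight line yields λ = 2, f(x_0) = −1.5 when read as a proportionality of the first kind, and λ = 4, f(x_0) = −0.5 when read as one of the second kind, which illustrates why the kind of proportionality must be known or assumed beforehand. Both formulas fail for b = 1, the case λ = 1 in which the two curves coincide.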
11. Medians and quartiles.
Let q_.25, x_m, q_.75 represent the abscissae corresponding to the ordinates
which divide the area of the frequency curve in four equal parts; x_m will
be what is generally called the median,
x_m − q_.25 will be the first quartile,
q_.75 − x_m will be the second quartile.
The determination of these quantities is extremely simple. If the
ordinate of the frequency curve corresponding to the abscissa x be called
y and if τ be the lower limit of the curve then q_.25, x_m, q_.75 will be
respectively determined by
(50) $$\int_{\tau}^{q_{.25}} y\,dx=\tfrac14;\quad\text{resp.}\quad\int_{\tau}^{x_m} y\,dx=\tfrac12;\quad\int_{\tau}^{q_{.75}} y\,dx=\tfrac34.$$
Thus, for instance, the median of the frequency curve (38) will be
determined by
$$\frac{1}{\sqrt{\pi}}\int_{\tau}^{x_m} f'(x)\,e^{-[f(x)]^{2}}\,dx=\tfrac12$$
which by putting f(x) = z reduces to
$$\frac{1}{\sqrt{\pi}}\int_{-\infty}^{f(x_m)} e^{-z^{2}}\,dz=\tfrac12.$$
Therefore
(51) f(x_m) = 0.
In a similar way we get the other quantities. Remembering that
$$\frac{1}{\sqrt{\pi}}\int_{-\infty}^{-0.47694\ldots} e^{-t^{2}}\,dt=\tfrac14\quad\text{and}\quad\frac{1}{\sqrt{\pi}}\int_{-\infty}^{+0.47694\ldots} e^{-t^{2}}\,dt=\tfrac34,$$
we get, if we include the results for the proportional curves

                        f(x_m)           f(q_.25)                          f(q_.75)
curve (38)              0                −0.47694…                         +0.47694…
prop. curve 1st kind    (1 − λ) f(x_0)   −0.47694… λ + (1 − λ) f(x_0)      +0.47694… λ + (1 − λ) f(x_0)
prop. curve 2nd kind    (1 − λ) f(x_0)   −0.47694… √λ + (1 − λ) f(x_0)     +0.47694… √λ + (1 − λ) f(x_0)

For any given frequency curve therefore the median and the quartiles
are at once read off from the curve y = f(x).
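Reading the median and the quartiles off the curve y = f(x) may be sketched as follows. The points (x, f(x)) are hypothetical, and linear interpolation between them is an assumption of the sketch, standing in for reading a drawn curve.

```python
PROBABLE = 0.47694  # value of f at the quartiles of the curve (38)

def abscissa_where(points, target):
    """Interpolate linearly the abscissa at which f(x) takes the value target.

    points: (x, f(x)) pairs with x and f(x) both increasing.
    """
    for (x0, f0), (x1, f1) in zip(points, points[1:]):
        if f0 <= target <= f1:
            return x0 + (target - f0) * (x1 - x0) / (f1 - f0)
    raise ValueError("target lies outside the tabulated range of f(x)")

# Hypothetical curve f(x) = (x - 10) / 2, tabulated at five abscissae.
points = [(6.0, -2.0), (8.0, -1.0), (10.0, 0.0), (12.0, 1.0), (14.0, 2.0)]

q25 = abscissa_where(points, -PROBABLE)
median = abscissa_where(points, 0.0)
q75 = abscissa_where(points, +PROBABLE)
print(q25, median, q75)
```

For a proportional curve the same function can be used with the targets of the corresponding row of the table above, for instance (1 − λ) f(x_0) for the median.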
DEVELOPMENT OF THE THEORY
BY
M. J. VAN UVEN.
CHAPTER II.
1. Introductory. When measuring some quantity appertaining to some
organism (length of ears, weight of fruits, sugar percentage of beetroots),
the different values obtained are usually of very different frequency. The
distribution of the different values among the individuals resembles that
of the different results when observing some physical or astronomical
quantity. If systematic errors may be left out of consideration and the
causes of error are very numerous and independent from each other, the
results of observation are symmetrically spread round their arithmetical
mean, in accordance with a definite law, the so-called "exponential" law
of error. This law expresses how the probability of a certain deviation
(or error) from the arithmetical mean is determined by the amount of
that deviation. Values so distributed are said to have "normal frequency"
or to be represented by a normal frequency-curve (which is the graph of
the "exponential" relation between the amount of the error and its
probability). According to this law of error the smaller deviations are more
numerous than the larger ones, as was to be expected.
The frequency-table of some quantity measured in a great number of
individuals sometimes agrees with the normal law of error. In such a
case we may be sure that the different causes of deviation, or rather the
different causes of growth, are very numerous and independent from
each other.
As soon however as the causes of growth are no longer mutually inde-
pendent, the frequency-table ceases to agree with the exponential relation.
When a certain quantity z is spread according to the law of error,
some other quantity x, connected with z by some relation, will not be
thus distributed (only when x is a mere multiple of z, to which a constant
has been added, x is spread normally together with z).
Now, if this quantity x is measured, the frequency-table given by
observation will not be normal. In this case there will be a certain
quantity z, connected with x by some relation, the frequency of which
does follow the normal law ; and it may be an interesting problem to find
out the relation between the measured quantity x and the normally distri-
buted quantity z.
In the next paragraph it will be proved, that "abnormal" frequency
occurs when the effects of the causes of deviation (or of growth) cease to
be independent from each other, but, on the contrary, depend on the
magnitude of the growing quantity x, which undergoes the deviations, so
that the difference between the deviations is due not only to mere chance,
but also to a divergence of the values to which they refer. Moreover the
relation between the effect of the cause and the amount of the deviating
quantity x will be shown to be connected with the relation existing between
the measured quantity x and the normally distributed quantity z.
The relation between z and x, mathematically expressed by: z is a
function of x (z = f(x)), or by: x is a function of z (x = φ(z)), may be
geometrically represented on squared paper by a "graph" or curve, which
is the whole of the points having x for abscissa and the corresponding
value z for ordinate. We will in what follows speak of the "curve" z = f(x).
In the same manner, the effect of the cause being denoted by η, the
relation η = ψ(x) between η and x may be illustrated by another graph,
the curve η = ψ(x). The function ψ(x) will be called the "reaction-function",
whence the curve η = ψ(x) will be called the "reaction-curve". The problem
to be solved consists in finding the relation (or curve) z = f(x) from the
frequency-table given by observation, and afterwards in deducing the
reaction-function (or curve) from the relation (or curve) z = f(x).
In the next paragraph the mathematical treatment of this problem
will be given.
Those readers who have no taste for mathematical analysis may proceed
to the following paragraph, containing matter of a more practical kind.
In order to make the practical rules easier to understand, the results
of analysis are summarised and translated into easy geometrical language ;
after this the method of determining the curve z = f(x) and of deriving
the reaction-curve is expounded with the utmost simplicity ; finally several
practical hints are added.
2. Mathematical treatment. The successive increments of some growing
organism may be attributed either to the continuous, though variable,
action of a single cause or set of causes, or to the cumulative effects of
several causes, each of which acts during a period, small in comparison
with the whole duration of the growth.
The rate of increase of some organism under the influence of some
cause depends not only on the intensity of that cause, but also on the
degree in which the organism reacts upon it. For example in some cases
the rate of increase of a plant may be considered to be proportional to
the rate of taking up food, which, in its turn, may be proportional to
the area of some organs of the plant. By measuring the diameter x of
the organ concerned we find the rate of increase proportional not only to
the intensity of the cause of growth (rain, solar heat, etc.) but also to the
second power of the measured quantity x. In this case we call x² the
"reaction-function". Thus the increment of the plant, particularly
of the measured quantity x, is proportional as well to the intensity of the
cause as to the particular value of the reaction-function ψ(x).
In order to avoid difficult intricacies, we shall suppose the several
causes to have nearly the same reaction-functions, so that the notion
"mean reaction-function" may not be deprived of sense.
The increment within some period of growth may be considered as
a multiple m of an elementary increment. This multiple being proportional
to the intensity of the cause, the elementary increment is, in its turn,
proportional to the reaction-function.
Calling the elementary increment a ¹), the increment within some
period of growth may be given by
Δx = m a,
the dependence of a on the reaction-function being expressed by
a = β ψ(x),
where β represents a constant quantity of such an order of magnitude,
that mβ becomes of the same order as Δx.
Hence in
(1) Δx = m β ψ(x)
we may suppose the reaction-function to assume values of normal finite
amount.
In the same growth-period the different individuals, even if they have
equal values of x, will possess different values of m.
Here we introduce the essential supposition, that these different values
of m, representing the different intensities of the same cause, are due to
pure chance.
In consequence, denoting by m̄ the arithmetical mean of the individual
values of m, the deviations
μ = m − m̄
¹) This quantity a is not to be confounded with the symbol a of the preceding
chapter, where a was used to indicate fluctuations, whereas in the present chapter it
means the elementary increment.
from this arithmetical mean are supposed to be distributed according to
the exponential law of error. So the probability that μ may be found
between the limits μ − dμ/2 and μ + dμ/2 equals
s being the "mean error" or "standard-deviation".
In order to develop more methodically the theory of the general case
of a variable elementary increment a, we will make a preliminary study
of the case of a constant elementary increment.
The increments Δx = m a having from their arithmetical mean m̄ a
the deviations μ a, these latter, being the product of the constant a and
the quantity μ distributed according to (2), are also spread according to
the normal law of frequency.
The whole time of growing may be composed of a great number of
successive growth-periods P_1, P_2, …, P_h, …
The initial value of x being denoted by x_0, the successive values of x are
after P_1: x_1 = x_0 + Δx_0 = x_0 + m_1 a = x_0 + m̄_1 a + μ_1 a,
after P_2: x_2 = x_1 + Δx_1 = x_1 + m_2 a = x_0 + (m̄_1 + m̄_2) a + (μ_1 + μ_2) a,
after P_h: x_h = x_{h−1} + Δx_{h−1} = x_{h−1} + m_h a
         = x_0 + (m̄_1 + m̄_2 + … + m̄_h) a + (μ_1 + μ_2 + … + μ_h) a.
Putting generally
Σ m̄_k = m̄_1 + m̄_2 + … = m̄, Σ μ_k = μ_1 + μ_2 + … = μ,
we have finally
x = x_0 + m̄ a + μ a,
or, putting x̄ = x_0 + m̄ a and ξ = μ a,
x = x̄ + ξ.
The different values of x obviously have the arithmetical mean x̄; and
the deviations ξ from that mean, being the product of the constant a and
the sum μ = Σ μ_k (each term of which follows the exponential law),
are also distributed according to this law, so that the probability that ξ will
lie between ξ − dξ/2 and ξ + dξ/2 amounts to
Hence: the elementary increment being constant and the values of
its multiples , contained in one growth-period , being purely accidental , the
different lengths measured after a finite lapse of time are still found normally
distributed round their arithmetical mean.
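The case of a constant elementary increment can be tried by a small simulation. All numerical values below (initial size, increment a, number of periods, mean and standard deviation of the multiples m, the seed) are hypothetical, and the third standardised moment is used merely as a convenient check of symmetry; it is not a measure employed in the text.

```python
import random

def grow(x0, a, periods, mean_m, sd_m, rng):
    """Final size after `periods` increments m * a, m being drawn by chance."""
    x = x0
    for _ in range(periods):
        x += rng.gauss(mean_m, sd_m) * a  # constant elementary increment a
    return x

def skewness(values):
    """Third standardised moment: close to zero for a symmetric sample."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

rng = random.Random(1916)  # fixed seed, so that the run can be repeated
sizes = [grow(5.0, 0.2, 50, 1.0, 0.5, rng) for _ in range(2000)]
mean_size = sum(sizes) / len(sizes)
print(mean_size, skewness(sizes))
```

With these values the mean size should lie near x_0 + m̄ a = 5 + 50 × 1.0 × 0.2 = 15, and the skewness near zero, in agreement with a symmetrical distribution round the arithmetical mean.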
We have expressly supposed , that the values of the multiples m were
merely due to chance. The meaning of this is that the causes of growth ,
the intensity of which is as it were measured by m , are wholly independent
from each other and built up from a great number of small agents.
Thus far we have restricted ourselves to the case , that a certain cause ,
when operating with the same intensity , also has the same effect — purely
accidental deviations left out of consideration — upon the growth, what-
ever may be the value of the length x undergoing the increment.
Next we shall suppose the elementary increment $a$ to be variable,
that is to say, to depend on the value of $x$, or to be a function of $x$. In
agreement with the above notation we put
$a = \beta\,\psi(x)$,
$\beta$ being a constant of the same order of magnitude as $a = \frac{\Delta x}{m}$, so that
$\psi(x)$ will assume normal finite values.
This supposition evidently implies, that the organism reacts upon the
cause in such a way that equal intensities of the cause need not have equal
effects on the growth; that, on the contrary, the increment due to that cause
depends, besides on its intensity, also on the value of x, on which it acts.
When several causes cooperate, we shall assume, that the reaction-
functions are nearly identical, or that a single cause is preponderant, so
that it is sufficient to consider one single reaction-function.
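By taking the growth-period so small, that the value of $x$ and also
of $\psi(x)$ may be supposed constant, we obtain:
during $P_1$: $x_1 - x_0 = m_1\,\beta\,\psi(x_0)$,
during $P_2$: $x_2 - x_1 = m_2\,\beta\,\psi(x_1)$,
during $P_h$: $x_h - x_{h-1} = m_h\,\beta\,\psi(x_{h-1})$,
or, in general,
$\frac{\Delta x}{\psi(x)} = \beta\,m$.
Hence, starting with $x_0$ and terminating with $x$,
$\sum \frac{\Delta x}{\psi(x)} = \beta \sum m_k = \beta\,\bar m + \beta\,\mu$.
Proceeding to the limit $\Delta x = 0$ we find
(3) $\int_{x_0}^{x} \frac{dx}{\psi(x)} = \beta\,\bar m + \beta\,\mu$.
Putting
$\int \frac{dx}{\psi(x)} = F(x) = Z$, $\beta\,\bar m = M$, $\beta\,\mu = \zeta$,
the equation (3) passes into
$Z - Z_0 = F(x) - F(x_0) = M + \zeta$.
Here the quantity $\zeta$, product of the constant factor $\beta$ with the
normally distributed number $\mu$, follows the exponential law.
The quantity $x$ itself is no longer, as in the former case, normally spread.
The quantity
(4) $Z = F(x)$
follows the normal law
$\frac{h}{\sqrt{\pi}}\,e^{-h^{2}\zeta^{2}}\,d\zeta$.
Introducing
(5) $z = h\zeta$
we obtain the quantity $z$, distributed round the mean value zero according
to the law
(6) $\frac{1}{\sqrt{\pi}}\,e^{-z^{2}}\,dz$.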
When it is possible to determine the form $z = f(x)$, the reaction-function
$\psi(x)$ can be deduced from the relation
$\psi(x) = \frac{1}{F'(x)} = \frac{h}{f'(x)}$,
where $F'(x)$ and $f'(x)$ are the derivatives of $F(x)$ and $f(x)$ resp.
Thus the function $\psi(x)$ is determined but for a constant, this latter
circumstance being a consequence of the indefiniteness of the factor $\beta$.
In what follows we shall put
1
*~m'
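The contrast between a constant and a variable elementary increment may be checked numerically. The following is merely a sketch under assumed values: the number of growth-periods, the mean and spread of the multiple m, the constant beta and the choice psi(x) = x (a reaction proportional to x, for which F(x) = log x) are all hypothetical and are not taken from the text.

```python
import math
import random
import statistics


def skewness(values):
    """Sample skewness: mean cubed deviation divided by the cubed standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((v - mean) ** 3 for v in values) / (len(values) * sd ** 3)


def grow(x0, periods, beta, psi, rng):
    """Apply `periods` successive increments m * beta * psi(x), m being normally distributed."""
    x = x0
    for _ in range(periods):
        m = rng.gauss(1.0, 5.0)  # assumed mean and spread of the multiple m
        x += m * beta * psi(x)
    return x


rng = random.Random(1916)  # fixed seed so that the run is reproducible

# Constant elementary increment: psi(x) = 1.
constant = [grow(50.0, 200, 0.01, lambda x: 1.0, rng) for _ in range(1500)]
# Increment proportional to x: psi(x) = x, hence F(x) = log x.
proportional = [grow(50.0, 200, 0.01, lambda x: x, rng) for _ in range(1500)]

skew_constant = skewness(constant)
skew_proportional = skewness(proportional)
skew_of_F = skewness([math.log(x) for x in proportional])

print(f"skewness of x, constant increment:     {skew_constant:+.2f}")
print(f"skewness of x, proportional increment: {skew_proportional:+.2f}")
print(f"skewness of F(x) = log x:              {skew_of_F:+.2f}")
```

Under these assumptions x itself comes out markedly skew in the second case, whereas F(x) remains nearly symmetrical, which is the behaviour the theory leads us to expect.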
Determination of the functions Z=F(x) and z = f(x).
The distribution of the different values of x among the individuals
examined may be arranged in a frequency-table.
If for every value of $x$ the probability of occurrence were known, viz.
$y = \Omega(x)$,
then this relation would be the equation of the continuous frequency-curve.
The probability that $x$ may lie between $x_1$ and $x_2$ would be
$\int_{x_1}^{x_2} \Omega(x)\,dx$.
Of course the integral between the extreme limits must be unity. In
a lot of $N$ individuals, the number
$N \int_{x_1}^{x_2} \Omega(x)\,dx$
may be expected to lie between $x_1$ and $x_2$.
Now the observations never furnish the function $\Omega(x)$ itself, only some
discrete values of the quantity $\int \Omega(x)\,dx$.
Geometrically spoken: the observations furnish finite parts of the
area of the frequency-curve.
Let the values $\xi_1, \xi_2, \ldots, \xi_n$ *) (rising by equal amounts $c$) be observed
resp. $Y_1, Y_2, \ldots, Y_n$ times. Thus the constant class-range is $c$, so that
$\xi_2 - \xi_1 = \xi_3 - \xi_2 = \ldots = \xi_n - \xi_{n-1} = c$.
Then it has in fact been settled, that for $Y_k$ individuals $x$ is found
between $\xi_k - \tfrac{c}{2}$ and $\xi_k + \tfrac{c}{2}$.
Putting
(7) $\xi_k + \tfrac{c}{2} = x_k$
the observations tell us, that for $Y_k$ individuals
$x_{k-1} < x < x_k$.
In what follows we will denote the lower limit of $x$ by $x_0$ and the
upper limit by $x_n$. Hence the symbol $x_0$ will no longer indicate the initial
or undisturbed value of $x$.
So between $x_0$ and $x_1$ $Y_1$ individuals are found, between $x_0$ and $x_k$
$Y_1 + Y_2 + \ldots + Y_k$ individuals; finally between $x_0$ and $x_n$ the total number
$\sum_{k=1}^{n} Y_k = N$ of the individuals is found.
Hence the probability a posteriori amounts
for $x_{k-1} < x < x_k$ to $\frac{Y_k}{N}$,
(8)
for $x_0 < x < x_1$ (or $x < x_1$) to $I_1 = \frac{Y_1}{N}$,
for $x_0 < x < x_2$ (or $x < x_2$) to $I_2 = \frac{Y_1 + Y_2}{N}$,
...
for $x_0 < x < x_k$ (or $x < x_k$) to $I_k = \frac{Y_1 + Y_2 + \ldots + Y_k}{N}$,
...
for $x_0 < x < x_n$ (or $x < x_n$) to $I_n = \frac{Y_1 + Y_2 + \ldots + Y_n}{N} = 1$.
*) The symbol $n$ will henceforth indicate the number of distinct values observed
for $x$ (instead of the number of causes, as in Ch. I).
The probability $I_k$ is obviously represented by the area of the
frequency-curve contained within the axis of $x$, the frequency-curve $y = \Omega(x)$ and
the ordinate-line of $x_k$. Usually the ordinate of the lower limit $x_0$ is zero.
Hence
$I_k = \int_{x_0}^{x_k} \Omega(x)\,dx$.
In order to determine the function $z = f(x)$ we are guided by the
following principle:
Corresponding values of $x$ and $z$ have equal probabilities.
We can only verify that the probability of $x$ being between $x_1$ and $x_2$
equals the probability of $z$ lying between the conjugate values $z_1$ and $z_2$.
Whether $x_1 < x < x_2$ corresponds with $z_1 < z < z_2$ or with $z_2 < z < z_1$ has
not yet been settled.
The elementary increment was found above to be
$a = \beta\,\psi(x) = \frac{\beta h}{f'(x)}$.
Excluding infinite values for the reaction, we postulate
$f'(x) \neq 0$;
the meaning of this is that the function $z = f(x)$ may not have maxima
or minima in the real domain.
By taking $h$ and $\beta$ positive we assume the elementary increment $a$ and
the derived function $f'(x)$ to have always the same sign. As a rule $a$ and
$f'(x)$ will be positive. When the elementary increment $a$ is negative for
some values of $x$, then also $f'(x) < 0$, and the function $f'(x)$, in passing
from positive to negative values or inversely, must become infinite.
For the present we make the simplifying supposition, that the elementary
increment shall always be positive. Any negative increment is then
due to a negative value of the multiple $m$ (see above).
So we have
First simplification:
$f'(x) > 0$.
The variable $z$ ranges from $-\infty$ to $+\infty$. Unless particular circumstances
compel us to admit infinite values for $z = f(x)$ corresponding with
values $x$ within the limits $x_0$ and $x_n$, we shall for convenience' sake suppose,
that $z$ becomes infinite only at the limits of the real domain, viz. $x_0$ and
$x_n$. In consequence of the first simplification the lower limit $x = x_0$ is
conjugate to $z = -\infty$, and the upper limit $x = x_n$ to $z = +\infty$.
So the
Second simplification
$f(x) \neq \pm\infty$ for $x_0 < x < x_n$
gives us two pairs of conjugate values $(x, z)$, or, geometrically, two
points $(x, z)$, viz. $(x_0, -\infty)$ and $(x_n, +\infty)$.
At present we are able completely to determine the correspondence $(x, z)$:
(9) $I(x) = \int_{x_0}^{x} \Omega(x)\,dx = \frac{1}{\sqrt{\pi}} \int_{-\infty}^{z} e^{-z^{2}}\,dz = \Theta(z)$.
Now for $I(x_k)$, i. e. the probability of $x < x_k$, only the a-posteriori
value can be given. It amounts to $I_k = \frac{Y_1 + Y_2 + \ldots + Y_k}{N}$. So we have as
an approximate value for $\Theta(z_k)$ (which is the value a priori)
(9a) $\Theta(z_k) = I_k = I(x_k) = \frac{Y_1 + Y_2 + \ldots + Y_k}{N}$.
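The function $\Theta(z)$ and its inversion can be sketched in a few lines. The names `theta` and `z_from_p` are ours, and the bisection is only one of several possible ways of inverting; the sketch assumes that $\Theta(z) = \frac{1}{\sqrt{\pi}}\int_{-\infty}^{z} e^{-t^{2}}dt$, which is the same thing as $\tfrac{1}{2}(1 + \operatorname{erf} z)$.

```python
import math


def theta(z):
    """Theta(z) = (1/sqrt(pi)) * integral from -infinity to z of exp(-t^2) dt."""
    return 0.5 * (1.0 + math.erf(z))


def z_from_p(p, lo=-6.0, hi=6.0):
    """Invert theta by bisection; p must lie strictly between 0 and 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p = 0 and p = 1 correspond to z = -infinity and z = +infinity")
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if theta(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


print(round(theta(0.0), 3))      # 0.5
print(round(theta(-0.906), 3))   # 0.1
print(round(z_from_p(0.9), 3))   # 0.906
```

The limiting values p = 0 and p = 1 are refused on purpose, since they are conjugate to infinite values of z.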
The most probable value of the probability a priori is the probability
a posteriori $p$. According to the reversed theorem of BERNOULLI (2d theorem
of BAYES) the probable error $\rho_p$ of the probability a posteriori $p$, considered
as an approximate value of the probability a priori, is given by
(10) $\rho_p = \rho \sqrt{\frac{2p(1-p)}{N}}$,
where $\rho = 0{,}476936\ldots$, and $N$ is the whole number of trials, the fraction
$p$ of which has a favorable result.
So the chances are equal that the true probability a priori lies between
$p - \rho \sqrt{\frac{2p(1-p)}{N}}$ and $p + \rho \sqrt{\frac{2p(1-p)}{N}}$
as that it lies outside these limits.
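As a numerical illustration, the probable error of p may be computed as follows. The sketch assumes the usual expression $\rho_p = \rho\sqrt{2p(1-p)/N}$ for formula (10). The function `probable_error_z` is an assumption of ours and not the formula of the Appendix: it merely carries the error of p over to z in the first order, by dividing by the derivative $\Theta'(z) = e^{-z^{2}}/\sqrt{\pi}$. The number of individuals is hypothetical.

```python
import math

RHO = 0.476936  # constant quoted in the text


def probable_error_p(p, n):
    """Probable error of the a-posteriori probability p found in n trials."""
    return RHO * math.sqrt(2.0 * p * (1.0 - p) / n)


def probable_error_z(p, n, z):
    """First-order probable error of z (an assumption of this sketch): rho_p / Theta'(z)."""
    dtheta_dz = math.exp(-z * z) / math.sqrt(math.pi)
    return probable_error_p(p, n) / dtheta_dz


n = 1000  # hypothetical number of individuals
rho_p = probable_error_p(0.5, n)
print(f"p = 0.5: even chances that the a-priori value lies in "
      f"[{0.5 - rho_p:.4f}, {0.5 + rho_p:.4f}]")
print(f"probable error of z at the centre: {probable_error_z(0.5, n, 0.0):.4f}")
```

Since $\rho\sqrt{2} = 0{,}6745$, the same quantity is 0,6745 times the ordinary standard error of a proportion.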
Since the probability a priori is not absolutely certain, the quantity
$z$ is not determinate either. So the correspondence $(x, z)$ always has an
element of uncertainty, which may be expressed in numerical value by
the probable error $\rho_z$ of $z$ itself. In the Appendix to Ch. II, I A (p. 62) we
shall prove that for $\rho_z$ the following approximate values may be taken:
1°. in the neighbourhood of $z = 0$:
(11) $\rho_z = \rho \sqrt{\frac{\pi}{2N}}$,
2°. for great positive or negative values $\zeta$ of $z$
By operating with a sufficiently great number $N$ of individuals the
value of $\rho_z$ round the centre of the domain ($z = 0$) is small, and accordingly
there is but a slight uncertainty in the correspondence $(x, z)$.
On the contrary, the error in $z$ at the extremities of the domain
($z = -\infty$ and $z = +\infty$) is very important; the formula shows
$\lim_{z = \pm\infty} \rho_z = \infty$.
So the probable error of $z$ increases together with the absolute
arithmetical value of $z$. At the limits the value of $z$ is absolutely uncertain;
the meaning of this is that it is impossible to decide whether $z = -\infty$ or
$z = -\zeta$ (finite) must be made to correspond to a provisional value of $x_0$.
Inversely it is absolutely uncertain, which value $x_0$ answers to $z = -\infty$,
or which value $x_n$ must be made to correspond to $z = +\infty$.
Hence the correspondence $(x, z)$ is nearly exact at the centre of the
domain, doubtful at the values $x_1$ and $x_{n-1}$ preceding the extreme limits
and absolutely uncertain at the limits $x_0$ and $x_n$ themselves.
The limits $x_0$ and $x_n$ of the domain of correspondence, which are conjugate
to $z = -\infty$ and $z = +\infty$, are essentially absolutely indeterminate.
Now it is obvious in what way the function $z = f(x)$ may be determined.
The observations furnish
$Y_1$ times the value $\xi_1$,
$Y_2$ times the value $\xi_2$,
...
$Y_n$ times the value $\xi_n$.
So the total number of observations amounts to
$N = \sum_{k=1}^{n} Y_k$.
The observations really show, that $x$ is found
$Y_1$ times below $\xi_1 + \tfrac{c}{2} = x_1$,
$Y_2$ times between $\xi_1 + \tfrac{c}{2} = \xi_2 - \tfrac{c}{2} = x_1$ and $\xi_2 + \tfrac{c}{2} = x_2$,
...
$Y_n$ times above $\xi_{n-1} + \tfrac{c}{2} = \xi_n - \tfrac{c}{2} = x_{n-1}$,
or, in other words, that $x$ lies
$Y_1$ times between $x_0$ and $x_1$,
$Y_1 + Y_2 + \ldots + Y_k$ times between $x_0$ and $x_k$,
$Y_1 + Y_2 + \ldots + Y_n = N$ times between $x_0$ and $x_n$.
The probability a priori of $x_0 < x < x_k$ is expressed by
$p \pm \rho_p$,
where
$p = \frac{Y_1 + Y_2 + \ldots + Y_k}{N}$
and
$\rho_p$ = probable error of $p$ ($\rho = 0{,}476936\ldots$).
Then from
$p = \Theta(z_k) = I(x_k)$
we derive the most probable value $z_k$ of the variable $z$, which is conjugate
to $x_k$.
In this way we obtain $n - 1$ pairs $(x, z)$, viz. $(x_1, z_1), \ldots, (x_{n-1}, z_{n-1})$.
Marking these pairs by points with coordinates $(x, z)$ we get $n - 1$
points of the curve which represents the function $z = f(x)$.
The situation of these points is most certain at the centre of the
domain (round $z = 0$). It has been shown above that the smallest value
of the probable error $\rho_z$ of $z$ is $\rho \sqrt{\frac{\pi}{2N}}$. When moving away from the centre
the uncertainty increases with $z$ itself.
A continuous curve through the $n - 1$ marked points is most sharply
determined at the centre $z = 0$.
The uncertainty in the shape of the curve $z = f(x)$ may be illustrated
by drawing two curves at both sides of the original one, viz.
$z = f(x) - \rho_z$ and $z = f(x) + \rho_z$.
We thus obtain a strip round the most probable curve $z = f(x)$; this strip
is very narrow near the centre, but rather wide near the extremities and
even infinite at $z = +\infty$ and $z = -\infty$.
[Fig. 1]
From the curve $z = f(x)$ the reaction-function may be deduced either by
calculation or graphically.
3. Practical proceeding.
Summary of the results of the preceding paragraph.
Let the measured quantity ($x$) have the following values
$Y_1$ times $x = \xi_1$,
$Y_2$ times $x = \xi_2$,
...
$Y_n$ times $x = \xi_n$.
The whole number of individuals is therefore
$N = \sum_{k=1}^{n} Y_k$.
The values $\xi_k$ may have a constant difference $c$, so that
$c = \xi_2 - \xi_1 = \xi_3 - \xi_2 = \ldots = \xi_n - \xi_{n-1}$.
The value $\xi_k$ is considered as the centre of a class which extends to
$\tfrac{c}{2}$ at both sides of the centre $\xi_k$. Hence the class-limits are $x_{k-1} = \xi_k - \tfrac{c}{2}$
and $x_k = \xi_k + \tfrac{c}{2}$.
The extreme limits $x_0$ 1) and $x_n$, these being the limits which $x$ cannot
exceed a priori, are generally supposed to be different from $\xi_1 - \tfrac{c}{2}$ and
$\xi_n + \tfrac{c}{2}$.
So the observations furnish the following data for $x$:
$Y_1$ times $x_0 < x < x_1$,
$Y_2$ times $x_1 < x < x_2$,
...
$Y_{n-1}$ times $x_{n-2} < x < x_{n-1}$,
$Y_n$ times $x_{n-1} < x < x_n$,
where $x_k = \xi_k + \tfrac{c}{2}$ for $k = 1, 2, \ldots, n - 1$.
Now form the fractions
$p_1 = \frac{Y_1}{N}$, $p_2 = \frac{Y_1 + Y_2}{N}$, ..., $p_{n-1} = \frac{Y_1 + Y_2 + \ldots + Y_{n-1}}{N}$
and determine the values of $z$ corresponding to $p$ by the relation
$\Theta(z) = p$,
where $\Theta(z)$ is a function, tabulated at the end of this book.
The value $z_k$ which is found with $p_k$, is to be joined as ordinate to
the abscissa $x_k$.
In this way $n - 1$ points $(x_k, z_k)$ $(k = 1, 2, \ldots, n - 1)$ are obtained
belonging to the curve $z = f(x)$, which must be traced through these
points as exactly as possible. Particularly in the neighbourhood of $z = 0$
the coincidence must be very close.
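The steps just summarised (class-limits, cumulative fractions, conjugate values of z) may be sketched as follows. The frequency table is invented for the illustration, the function names are ours, and the tabulated function is replaced by $\Theta(z) = \tfrac{1}{2}(1 + \operatorname{erf} z)$ inverted by bisection. The sketch supposes that every fraction lies strictly between 0 and 1.

```python
import math
from itertools import accumulate


def theta(z):
    """Theta(z) = (1/sqrt(pi)) * integral from -infinity to z of exp(-t^2) dt."""
    return 0.5 * (1.0 + math.erf(z))


def z_from_p(p):
    """Invert theta by bisection for 0 < p < 1."""
    lo, hi = -6.0, 6.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if theta(mid) < p else (lo, mid)
    return 0.5 * (lo + hi)


def correspondence(centres, counts):
    """Return the n - 1 points (x_k, z_k) from class centres xi_k and frequencies Y_k."""
    c = centres[1] - centres[0]            # constant class-range
    total = sum(counts)                    # N
    cumulative = list(accumulate(counts))  # Y_1, Y_1 + Y_2, ...
    points = []
    for xi, running in zip(centres[:-1], cumulative[:-1]):
        x_k = xi + c / 2                   # upper limit of class k
        p_k = running / total
        points.append((x_k, z_from_p(p_k)))
    return points


# Hypothetical frequency table, invented for the illustration.
centres = [1.0, 2.0, 3.0, 4.0, 5.0]
counts = [1, 4, 6, 4, 1]
points = correspondence(centres, counts)
for x_k, z_k in points:
    print(f"x = {x_k:.1f}   z = {z_k:+.3f}")
```

The points so obtained are those which, in the graphical proceeding, are plotted on squared paper.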
The value of the reaction-function corresponding to $x$ is, save a
constant factor, equal to the trigonometrical tangent of the angle inclosed
by the axis of $z$ and the tangent-line to the curve at the point with abscissa
$x$. In this manner any number of points of the reaction-curve may be
plotted, through which the curve itself is to be traced.
The values of $p$ are obtained by dividing the sums $Y_1$, $Y_1 + Y_2$, ...,
$Y_1 + Y_2 + \ldots + Y_k$, ..., $Y_1 + Y_2 + \ldots + Y_{n-1}$ by the total number of
individuals $N = \sum_{k=1}^{n} Y_k$. This algebraic operation may be quickly performed with
the aid of calculating-tables or slide rules. Using a slide rule of about
15 cm. length, after some practice an approximation within 0,001 of the
value may be attained, which is usually sufficient. The fractions which
surpass 0,5 must be taken from unity, since the value of $1 - p$ must also
be determinate within 0,001 of its amount.
1) In what follows $x_0$ will no longer designate the initial value, but the lower
limit of $x$.
Employing squared paper in sheets of 20 × 26 cm. (SCHLEICHER &
SCHULL, No. 332½) the axis of $z$ should be taken parallel to the longer side.
The values of $z$ occurring in practice rarely exceed the interval from
$-2{,}6$ to $+2{,}6$ [$\Theta(-2{,}63) = 0{,}0001$]. The unit of $z$ may therefore be
represented by a length of 5 cm.
The length of the axis of $x$ (which lies in the middle of the sheet)
amounts to 20 cm. For $x$ such a unit is preferable that the class-range
corresponds to a whole number of mms and that all the class-limits $x_1, \ldots, x_{n-1}$
fall inside the sheet.
In order to plot the points $(x_1, z_1), \ldots, (x_{n-1}, z_{n-1})$ of the curve $z = f(x)$
either: the values of $z$ conjugate to the class-limits $x_k$ may be taken from
the table of the function $p = \Theta(z)$ at the end of this book, or: instead of
this table we may directly use a scale on which the number $p = \Theta(z)$
corresponding to the value of $z$ is marked at a distance of $z$ × 5 cm. from
the zero-point. Such a scale has at the zero-point itself the number
$0{,}5 = \Theta(0)$, and at a distance of 4,53 cm. = 0,906 × 5 cm. from the zero-point
at one side ($z = -0{,}906$) the number $0{,}100 = \Theta(-0{,}906)$ and at the other
side ($z = +0{,}906$) the number $0{,}900 = \Theta(+0{,}906)$. Using this scale 1)
the interpolation may be performed graphically.
When in this way $n - 1$ points of the curve $z = f(x)$ have been plotted,
a smooth line is drawn through them; care must be taken that near the
centre the line passes through the points as exactly as possible. Near the
extremities greater deviations from the given points are allowed in order
to avoid irregularities in shape.
The curve $z = f(x)$ having been drawn, the reaction-function
$\eta = \frac{1}{f'(x)} = \frac{dx}{dz}$
must be determined.
Sometimes it is fairly easy to find the analytic expression $z = f(x)$
corresponding to the plotted curve. Then this function $f(x)$ may be
differentiated, and the quantity $\eta$ is determined as the set of reciprocal
values of $\frac{dz}{dx} = f'(x)$.
1) Printed on non-shrinking card-board and published by Arnaud Pistoor, 's Hertogenbosch,
Holland. A reproduction of this scale is found at the end of this book.
Usually however the equation $z = f(x)$, represented by the given curve,
is very difficult to deduce. In this case we may have recourse to graphical
differentiation, which dispenses with the equation of the curve; this
advantage however is diminished by the drawback that the accuracy with
which the different values of $f'(x)$ or of $\frac{1}{f'(x)}$ are determined, is very small.
A slight roughness in the plotted curve immediately has its full effect
on the slope of the tangent. It is therefore very necessary to draw the
curve as carefully and thinly as possible.
In the graphical determination of $\frac{dx}{dz} = \frac{1}{f'(x)}$ we have, for some values
of $x$ (for instance for the class-limits $x_k$, or for the class-centres $\xi_k$), to
calculate the trigonometrical tangent of the angle inclosed by the axis of
$z$ and the tangent-line to the curve at the corresponding point.
A good plan is to copy the smooth curve z = f(x) with a fine pen on
transparent squared tracing-paper (SCHLEICHER & SCHULL No. SO?1^). A sheet
of clear white paper, on which a sharp narrow straight line is drawn, is
then put under the transparent paper. The sheets are shifted relatively
to each other until the straight line of the lower sheet coincides as well
as possible with the tangent-line to the curve z = f(x) at the desired point.
A solid ruler should not be used, because it covers one side of the sur-
roundings of the line. If the ruled paper on which the curve is drawn
is not transparent, the straight auxiliary line is traced on transparent
paper, which is placed on the squared paper.
Now the points are marked where this straight line meets two lines
parallel to the axis of $x$ and the mutual distance of which is 10 cm., or
(if the sheet is of sufficient size to get the intersections on it) 20 cm.
If the distance of the two marked points in the direction of $x$ is $l$ cm.,
the quotient $\frac{l}{10}$ (or $\frac{l}{20}$) is equal to the trigonometrical tangent of the angle
inclosed by the axis of $z$ and the tangent-line. This trigonometrical tangent
$q$ however is not equal to $\frac{dx}{dz}$, because, in general, the units of $x$ and $z$
are not the same.
Supposing that the unit of $x$ is represented by $a$ cm., and that of $z$
by $b$ cm. (in our case $b = 5$), then
$q = \frac{a\,dx}{b\,dz} = \frac{a}{b} \cdot \frac{1}{f'(x)}$,
whence
$\frac{1}{f'(x)} = \frac{dx}{dz} = \frac{b}{a}\,q$.
The different values of $\frac{1}{f'(x)}$ as well as those of $q$ must be considered
as the corresponding values of the reaction-function, which is determined
but for an (essentially unknown) constant factor. The multiplier $\frac{b}{a}$ of the
tangent $q$ is of no consequence, in fact.
Plotting the values of $\frac{dx}{dz}$ for the corresponding values of $x$, the points
so marked belong to the reaction-function. The reaction-curve itself may
be obtained as the smooth curve passing as exactly as possible through
the given points.
This entirely graphical method may be replaced by a "semi-graphical"
one, in which a set of equidistant ordinates of the smoothed curve $z = f(x)$
is measured. The reciprocal values of the differences of consecutive
ordinates are considered as nearly proportional to the values of $\frac{dx}{dz}$.
When the entirely-graphical method is carried out with the highest
possible precision, it is to be preferred to the semi-graphical one, which is
essentially less exact.
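The semi-graphical proceeding amounts to forming reciprocal differences. A minimal sketch follows; the function name and the sample curve $z = 1{,}5 \ln(x/4)$ are hypothetical, chosen only because the exact reaction $x/1{,}5$ is then known for comparison. It supposes $f'(x) > 0$, so that no difference of ordinates vanishes.

```python
import math


def reaction_from_ordinates(x_values, z_values):
    """Estimate dx/dz at the mid-points of equidistant abscissae by reciprocal differences."""
    estimates = []
    for i in range(len(x_values) - 1):
        dx = x_values[i + 1] - x_values[i]
        dz = z_values[i + 1] - z_values[i]
        x_mid = 0.5 * (x_values[i] + x_values[i + 1])
        estimates.append((x_mid, dx / dz))
    return estimates


# Hypothetical smoothed curve z = 1.5 * ln(x / 4), read off at equidistant abscissae.
xs = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
zs = [1.5 * math.log(x / 4.0) for x in xs]
for x_mid, eta in reaction_from_ordinates(xs, zs):
    print(f"x = {x_mid:.1f}   dx/dz = {eta:.3f}   exact x/1.5 = {x_mid / 1.5:.3f}")
```

The estimates fall slightly short of the exact values where the curve is strongly bent, which agrees with the remark that this method is the less exact one.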
4. Analytic expression for the relation z = f(x).
Sometimes the curve z=f(x) has so simple a shape, that it is easy to
guess the equation represented by it.
For the present we will treat only two cases. In the appendix a third
somewhat more intricate case will be discussed.
I. The points $(x_k, z_k)$ are nearly collinear.
Let the equation of the straight line passing through them be
(12) $z = \lambda (x - x_m)$.
The auxiliary straight line put under the transparent squared paper
(or, when itself drawn on transparent paper, placed on the squared paper)
must be so shifted that it passes as exactly as possible through the points,
particularly through the middle points. The axis of $x$ ($z = 0$) is cut in a
point, the abscissa of which is called the "median" and is denoted by $x_m$.
Since $z = 0$ corresponds to $p = \tfrac{1}{2}$, there are as many individuals for
which $x < x_m$ as for which $x > x_m$. The median value of $x$ is that which
is passed over with the probability $\tfrac{1}{2}$.
Now
$f'(x) = \lambda$
and
$\frac{1}{f'(x)} = \frac{b}{a}\,q$
(see p. 44) where $q$ is the trigonometrical tangent of the angle between
the line and the axis of $z$, $a$ the length in cm. of the unit of $x$ and $b$
that of the unit of $z$ (usually $b = 5$).
Hence
$\lambda = \frac{a}{b\,q}$
and, putting
$\frac{1}{q} = k$,
$k$ being the trigonometrical tangent of the angle between the line and the
axis of $x$, we have
$\lambda = \frac{a}{b}\,k$.
So $\lambda$ may be computed from the numbers $a$ and $b$ which have been
chosen in advance, and from the quantity $k$ which is to be measured.
In this way both the constants of the equation are determined. The
case just treated is that of normal frequency. The median $x_m$ here
coincides with the arithmetical mean 1).
Only in the case of normal distribution the arithmetical mean may
be considered as representative of the different values, as a typical value.
In the case of abnormal frequency this mean is of far less importance.
On account of the linear relation between $x$ and $z$, the standard-deviation
of $x$ [viz. the square root of the mean square of $x - x_m$] corresponds
to the standard-deviation of $z$ [viz. the square root of the mean
square of $z$], so that
$\varepsilon_z = \lambda\,\varepsilon_x$.
Now
$\varepsilon_z = \frac{1}{\sqrt{2}}$ (see p. 33 and p. 35),
hence
$\varepsilon_x = \frac{1}{\lambda \sqrt{2}}$,
since
$\frac{1}{\sqrt{\pi}} \int_{-\infty}^{+\infty} z^{2} e^{-z^{2}}\,dz = \frac{1}{2}$,
which result also follows from the law of distribution
$\frac{\lambda}{\sqrt{\pi}}\,e^{-\lambda^{2}(x - x_m)^{2}}\,dx$.
So our conclusion runs:
Normal distribution of the values of $x$ is indicated by a rectilinear
disposition of the points $(x_k, z_k)$.
In this case the arithmetical mean is the abscissa of the point at
which the line meets the axis of $x$, and the standard-deviation is either
to be computed from
$\varepsilon_x = \frac{1}{\lambda \sqrt{2}} = \frac{b}{a\,k\,\sqrt{2}}$
or to be read from the figure, viz. as the difference between $x_m$ and the
abscissa of the point, the $z$ of which amounts to $\frac{1}{\sqrt{2}} = 0{,}707$.
Here the reaction-function is a constant; so the elementary increment
is independent from the value of $x$, as was to be expected from the
preliminary study of the reaction-function.
The case just treated may be illustrated by Example I: Circumference
of the chest of recruits, measured by A. QUETELET (see p. 54).
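For the rectilinear case the two constants may also be found by calculation. The sketch below replaces the fitting by eye with an ordinary least-squares line, which is our substitution and not the proceeding of the text (in particular it gives no greater weight to the middle points). The points are hypothetical and lie exactly on the line belonging to a mean of 10 and a standard-deviation of 2.

```python
import math


def fit_line(xs, zs):
    """Ordinary least-squares line z = lam * (x - x_m); returns (lam, x_m)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_z = sum(zs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxz = sum((x - mean_x) * (z - mean_z) for x, z in zip(xs, zs))
    lam = sxz / sxx
    x_m = mean_x - mean_z / lam   # abscissa where the line cuts z = 0
    return lam, x_m


# Hypothetical points lying exactly on the line of a normal distribution
# with mean 10 and standard deviation 2.
xs = [6.0, 8.0, 10.0, 12.0, 14.0]
zs = [(x - 10.0) / (2.0 * math.sqrt(2.0)) for x in xs]

lam, x_m = fit_line(xs, zs)
eps_x = 1.0 / (lam * math.sqrt(2.0))   # standard-deviation of x
print(f"lambda = {lam:.4f}, median = {x_m:.2f}, standard deviation = {eps_x:.2f}")
```

With real observations the extreme points, being the least certain, might be left out of the fit or given a smaller weight.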
II a. The points $(x_k, z_k)$ lie in a curve, which sinks rather rapidly
to the left (with a tendency to remain to the right of a certain vertical
line $x = x_0$) and ascends gradually to the right with a decreasing slope (fig. 2).
This shape suggests the equation
(14) $z = \lambda \log \frac{x - x_0}{x_m - x_0}$.
As the numerus of the logarithm becomes negative
for $x < x_0$, the value $x_0$ is the lower limit for $x$, at
which $z = \lambda \log 0 = -\infty$. So the line $x = x_0$ is an asymptote.
We shall say that the quantity $x$ has "logarithmic distribution".
First of all the value $x_0$ must be estimated.
Let $x_0 = 0$ be the supposed lower limit. Then the equation reduces to
(14a) $z = \lambda \log \frac{x}{x_m} = \lambda \log x - \lambda \log x_m$.
Now we make use of logarithmic ruled paper, and so operate with
the coordinates
$u = \log x$ and $z$.
The equation in the coordinates $u$, $z$ runs
$z = \lambda (u - u_m)$,
whence the points $(u, z)$ plotted on logarithmic paper must be in a straight line.
Inversely, if the points $(u, z)$ (particularly in the vicinity of $z = 0$)
are nearly in a straight line, this is an indication, that the relation between
$x$ and $z$ is tolerably well approximated by the logarithmic equation (14).
When we use logarithmic paper of SCHLEICHER & SCHULL No. 376½,
the unit of $u$ is 1 dm., and taking, as before, for the unit of $z$ 5 cm.,
we find for the slope of the straight line
$\lambda = \frac{a}{b}\,k = 2k$.
The point of intersection with the axis of $x$ ($z = 0$) has for abscissa
$u = u_m = \log x_m$ and is marked at the margin of the paper by the number
$x_m$ itself, which evidently is the median value.
So, in the case $x_0 = 0$, both the constants of the equation are
immediately found by analysing the curve on logarithmic paper.
The reaction-function is
$\eta = \frac{1}{f'(x)} = \frac{x}{\lambda}$ (save a constant factor depending on the base of the logarithms).
The elementary increment is therefore proportional to the value of $x$
itself. This case very often occurs in nature. For an illustration of it we
may refer to Example II: Threshold of sensation, measured by Prof. G.
HEYMANS (see p. 55).
When the curve on ordinary squared paper has the form indicated
above, but the points $(u, z)$ plotted on logarithmic paper do not lie in a
straight line, this may be due either to an erroneous estimate of $x_0$, or
to the fact, that the relation between $x$ and $z$ is not logarithmic at all.
First we may try to bring the points $(u, z)$ in a straight line by
correcting the value of $x_0$.
Instead of putting $u = \log x$, we now put
$u = \log (x - x_0)$,
that is: we subtract the assumed value $x_0$ from $x$ and operate with the
ordinate-line which has the number $x - x_0$.
In correcting $x_0$ graphically we may take the following into consideration:
If $x_0$ is estimated too small, that is to say: if we operate with
$u' = \log (x - x_0')$ instead of $u = \log (x - x_0)$, $x_0$ being larger than $x_0'$, then
$z = \lambda (u - u_m)$
is conjugate not to $u$ but to $u'$.
Now the difference
$u' - u = \log (x - x_0') - \log (x - x_0) = \log \left( 1 + \frac{x_0 - x_0'}{x - x_0} \right)$
is positive and the smaller, the larger $x$. So we join to a certain value
of $z$ too large an abscissa $u'$. The figure built up of the points $(u', z)$
therefore lies to the right of the figure corresponding to the coordinates
$(u, z)$, which is the straight line in question $z = \lambda (u - u_m)$.
This line, having a positive direction-tangent $\lambda$, tends from left below
to right above, and the deviations from it are left below larger than right
above. So the curve $V$ $(u', z)$ obtained by too low an estimate of
$x_0$, is concave downwards, and its curvature decreases towards the top
(fig. 3a). (See for the rigorous proof the appendix to Ch. II, I B p. 64).
Replacing $x_0'$ by a larger value $x_0''$ we make all differences $x - x_0'$
smaller, and the difference
$u' - u'' = \log (x - x_0') - \log (x - x_0'') = \log \left( 1 + \frac{x_0'' - x_0'}{x - x_0''} \right)$
is left below larger than right above. The course of the curve retains the
same character, but the curvature has become fainter.
When, at last, we have hit the exact value $x_0$, the line is wholly
straightened.
If, on the contrary, $x_0$ is estimated too large, say $x_0' > x_0$, then we operate
with $u' = \log (x - x_0')$ instead of $u = \log (x - x_0)$,
$u' - u = \log \left( 1 - \frac{x_0' - x_0}{x - x_0} \right)$
being now negative and in absolute value the smaller, the larger $x$.
So we join $z = \lambda (u - u_m)$ to $u' < u$. Hence the abscissae are taken
too small, and left below more so than right above.
The curve $W$ $(u', z)$, obtained by estimating $x_0$ too
large, is therefore convex downwards, and its curvature
decreases towards the top (fig. 3b). (See
appendix to Ch. II, I B p. 64.)
When $x_0'$ is replaced by a smaller value $x_0''$,
still larger than $x_0$, all differences $x - x_0'$ become
larger, and the difference
$u'' - u' = \log (x - x_0'') - \log (x - x_0') = \log \left( 1 + \frac{x_0' - x_0''}{x - x_0'} \right)$
is left below larger than right above. So the line $W$ becomes less curved.
By reducing $x_0'$ too much, the curve $W$ passes into a curve of the
type $V$.
In the same manner we obtain, in the first case, by increasing $x_0'$
too much, a curve of the type $W$ instead of a straight line.
Usually the exact value of $x_0$ may be determined by interpolation.
If the distribution is not strictly logarithmic, a rather great uncertainty
remains in the determination of $x_0$. On the other hand $\lambda$ and $x_m$ may
be determined pretty accurately.
When it is not possible to straighten the curve by altering the
estimated value of $x_0$, the distribution is not really logarithmic.
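The straightening of the curve by trial values of $x_0$ may be imitated by calculation. In the sketch below the measure of straightness (the sum of squared residuals of a least-squares line), the list of trial values and the sample data are all assumptions of ours; common logarithms are used, as on logarithmic paper. The data are hypothetical and follow the logarithmic relation exactly with $x_0 = 2$, $x_m = 5$, $\lambda = 1{,}5$.

```python
import math


def straightness(xs, zs, x0):
    """Sum of squared residuals of the best straight line through (log10(x - x0), z)."""
    us = [math.log10(x - x0) for x in xs]
    n = len(us)
    mean_u = sum(us) / n
    mean_z = sum(zs) / n
    suu = sum((u - mean_u) ** 2 for u in us)
    suz = sum((u - mean_u) * (z - mean_z) for u, z in zip(us, zs))
    slope = suz / suu
    return sum((z - mean_z - slope * (u - mean_u)) ** 2 for u, z in zip(us, zs))


def estimate_x0(xs, zs, candidates):
    """Pick the trial lower limit for which the points (u, z) are straightest."""
    admissible = [x0 for x0 in candidates if x0 < min(xs)]
    return min(admissible, key=lambda x0: straightness(xs, zs, x0))


# Hypothetical class-limits and z-values with x0 = 2, x_m = 5, lambda = 1.5.
xs = [3.0, 4.0, 5.0, 6.0, 8.0, 11.0]
zs = [1.5 * math.log10((x - 2.0) / (5.0 - 2.0)) for x in xs]

candidates = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
best = estimate_x0(xs, zs, candidates)
print("estimated lower limit x0 =", best)
```

With real observations the residual never vanishes, and a finer set of trial values, or an interpolation between the two best ones, takes the place of the exact hit.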
The relation
$z = \lambda \log \frac{x - x_0}{x_m - x_0}$
generates the reaction-function
$\eta = \frac{1}{f'(x)} = \frac{x - x_0}{\lambda}$ (save a constant factor).
The elementary increment consists of a part proportional to the
attained value $x$ (which is positive for positive values of $x$) and of another
part independent from $x$ (which is negative for a positive value of $x_0$).
An illustration of this more general case of logarithmic distribution will
be given in Example III: Valuation of House Property in England and
Wales, by Prof. K. PEARSON (see p. 55).
II b. Sometimes the smooth curve drawn through the points $(x_k, z_k)$
seems to have a vertical asymptote $x = x_n$ on the
right, and rises there with an increasing slope (fig. 4).
The relation between $x$ and $z$ is now likely to be
approximated by
(16) $z = \lambda \log \frac{x_n - x_m}{x_n - x}$ $(x_m < x_n,\ \lambda > 0)$.
Since the numerus of the logarithm becomes
negative for $x > x_n$, the value $x_n$ is the upper
limit for $x$, at which $z = \lambda \log \infty = +\infty$.
This case is treated in the same manner as II a.
After having found the mentioned shape of the curve on ordinary
squared paper, we pass to logarithmic paper; and after estimating the
value $x_n$ of the upper limit, we plot the abscissa $u = \log (x_n - x)$.
The equation of the line on logarithmic paper will be
$z = \lambda [\log (x_n - x_m) - \log (x_n - x)] = \lambda (u_m - u) = -\lambda u + \lambda u_m$.
So we obtain, if operating with the exact $x_n$, on logarithmic paper
a straight line with a negative slope. If $x_n$ is estimated too small, we
operate with $u' = \log (x_n' - x)$ instead of $u = \log (x_n - x)$, $x_n'$ being
smaller than $x_n$, and we join $z = \lambda (u_m - u)$ to $u'$ instead of $u$. The
difference
$u' - u = \log (x_n' - x) - \log (x_n - x) = \log \left( 1 - \frac{x_n - x_n'}{x_n - x} \right)$
is negative and in absolute value the larger, the larger $x$. So we connect
a certain value $z$ with too small an abscissa $u'$. The
provisional figure thus lies to the left of the required
rectilinear locus $z = \lambda (u_m - u)$, which tends from right
below to left above. The deviations from it will at the
left end be larger than at the right. Hence the curve
$V$ $(u', z)$ obtained by estimating $x_n$ too small is concave downwards and its
curvature increases towards the top (fig. 5a) (see appendix to Ch. II, I B p. 65).
A greater value of $x_n$ produces a line of fainter
curvature.
By estimating $x_n$ too large ($x_n' > x_n$) we get,
by a similar reasoning, a curve $W$ to the right of
the required straight line. This curve $W$ deviates
more on the left side than on the right. So the
curve is convex downwards and its curvature
increases towards the top (fig. 5b).
Here too $x_n$ may be determined, be it roughly, by interpolation.
Now the reaction-function is
(17) $\eta = \frac{dx}{dz} = \frac{x_n - x}{\lambda}$.
The elementary increment consists of a part independent from $x$ (which
is positive for positive values of $x_n$) and of another part proportional to $x$
(which is negative for positive values of $x$). Hence there is besides a constant
element of growth a counteracting or inhibitory cause, proportional to $x$.
5. Irregularities in the frequency distribution.
I. Domains of very small frequency.
When it appears from the observations that a certain set of successive
class-intervals (with centres $\xi_h, \xi_{h+1}, \ldots, \xi_{h+i}$) is not occupied by individuals,
we have $Y_h = Y_{h+1} = \ldots = Y_{h+i} = 0$, so that as many individuals are found
below $\xi_h + \tfrac{c}{2} = x_h$ as below $\xi_{h+1} + \tfrac{c}{2} = x_{h+1}$, ... as below $\xi_{h+i} + \tfrac{c}{2} = x_{h+i}$,
from which follows $I_h = I_{h+1} = \ldots = I_{h+i}$.
So the probability a posteriori $p$, which is also the most probable
value of the probability a priori, remains constant, whence also $z$ assumes
$i + 1$ times the same value ($z_h = z_{h+1} = \ldots = z_{h+i}$). Consequently of all
the points $P_k (x_k, z_k)$ $(k = 1, \ldots, n - 1)$, the set $P_h, P_{h+1}, \ldots, P_{h+i}$ is
situated on a horizontal line ($z$ = const.), so that in the domain $x_h \ldots x_{h+i}$
the derived function $\frac{dz}{dx} = f'(x)$ would be zero, were it not that this would
correspond to an infinite reaction $\left[ \eta = \frac{1}{f'(x)} \right]$.
A slight deviation from the given points is therefore necessary. On
account of the uncertainty in the correspondence $(x, z)$ we may depart
from the horizontal line and give the curve a slope, however little it may
be. This, to be sure, enables us to get rid of the infinite value of the
reaction $\eta$, but we are obliged anyhow to assign to $\eta$ a considerable value,
especially near the centre ($z = 0$), because the correspondence $(x, z)$ is
quite certain there, so that the value of $f'(x)$ may differ but very little
from zero.
So a scantiness of individuals in a certain domain indicates a powerful
reaction at this spot, which makes the corresponding value of $x$
hypersensitive, so that the least occasion suffices to move $x$ from that value.
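The effect of unoccupied classes may be seen in a small numerical sketch. The frequencies are invented for the purpose, and $\Theta$ is inverted by bisection as an assumption of ours in place of the table.

```python
import math
from itertools import accumulate


def theta_inverse(p):
    """Invert Theta(z) = (1 + erf z) / 2 by bisection for 0 < p < 1."""
    lo, hi = -6.0, 6.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if 0.5 * (1.0 + math.erf(mid)) < p else (lo, mid)
    return 0.5 * (lo + hi)


# Hypothetical frequencies with two empty classes in the middle.
counts = [5, 10, 0, 0, 10, 5]
total = sum(counts)
p_values = [s / total for s in list(accumulate(counts))[:-1]]
z_values = [theta_inverse(p) for p in p_values]

# Differences of consecutive z: zero across the empty classes, i.e. f'(x) = 0 there.
dz = [b - a for a, b in zip(z_values, z_values[1:])]
print([round(p, 3) for p in p_values])
print([abs(round(d, 3)) for d in dz])
```

The class-limits bounding the empty classes receive one and the same value of z, so that the points lie on a horizontal line there and the reciprocal of the slope, taken literally, would be infinite.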
When however this scantiness occurs near the limits of the whole
domain ($x_0 \ldots x_n$) this inference is not nearly so sure in consequence of the
greater uncertainty in the correspondence $(x, z)$. But if the gap $\xi_h, \ldots, \xi_{h+i}$
is rather large, we are, even near the limits, obliged to admit that $f'(x)$
is very small within this interval and that accordingly the reaction-function
here assumes a strikingly high value.
If the frequencies $Y_h, \ldots, Y_{h+i}$ are not exactly zero but yet abnormally
small, the above considerations still hold. We may illustrate them by
Example V: Length of Wheat-ears under scanty feeding given by Dr. C.
DE BRUYKER (see p. 56).
II. Excessive frequencies within the limits of the domain.
If for a set of successive class-intervals (with centres $\xi_h, \ldots, \xi_{h+i}$)
abnormally large frequency-numbers $Y_h, \ldots, Y_{h+i}$ are found, the fraction
$p_k = \frac{Y_1 + \ldots + Y_k}{N}$ increases rapidly in the interval between $x_{h-1}$ and $x_{h+i}$
and so does $z$.
Hence the function $f'(x)$ reaches very large values in that interval
and the reaction-function very small ones, so that the growth almost
stagnates. So individuals with small $x$ may eventually overtake the
individuals whose $x$ lies in the domain in question, and (if also negative
growth is admitted) individuals with large $x$ may pass into such with
smaller $x$.
The consequence is an accumulation of individuals in this domain,
which explains the high frequency-numbers.
If the uncertainty in the correspondence $(x, z)$ is of such a kind that
the slope of $z$ (be it in a single point) may be assumed infinite (so that
the tangent-line at this point becomes parallel to the axis of $z$), then the
reaction may be considered to be zero there.
Another explanation of such an accumulation with the aid of a many-valued
function $z = f(x)$ will be given in a continuing article to be
published in the Proceedings of the Kon. Akad. v. Wet. te Amsterdam,
referred to by C. A.
Example IV: Diameter of spores of Mucor Mucedo, measured by
Mr. G. POSTMA (see p. 56) may serve for an illustration of the preceding case.
III. Excessively great frequencies at the limits of the frequency-domain.
We next consider the case that the frequencies Y_1, Y_2, ..., Y_i at the
lower limit, or the frequencies Y_{n−j}, Y_{n−j+1}, ..., Y_n at the upper limit,
or those at both limits, are abnormally large. This takes place, for instance,
when the frequencies form a series ascending to either or both limits.
If the first frequencies Y_1, ..., Y_i are large, the quantity z must rise
in the first intervals x_0, ..., x_i from −∞ to either a small negative or a
positive value. If the last frequencies Y_{n−j}, ..., Y_n are large, then z must
rise in the last intervals x_{n−j}, ..., x_n from either a negative or a small
positive value to +∞.
Now if we stick to the above developed theory, we arrive at a very
peculiar and hence improbable form of the function f(x), as will be shown
later on (appendix to Ch. II, II p. 66).
In order to make the theory applicable also to this case without having
to operate with less acceptable functions it must be generalised by dropping
the suppositions incidentally introduced for the sake of simplification. Such
a generalisation will be expounded in the C. A. (see above).
6. Proportional reaction.
Two sets of individuals of the same kind may be subject to the same
causes of growth, with only this difference, that the reactions of one set,
characterised by x_1, are λ times as strong as those of the other, represented
by x.
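The elementary increments are therefore resp.
α = β η(x) and α_1 = λ β η(x_1).
So we have, according to (1) (p. 32) and (3) (p. 34),
$$dZ = \frac{dx}{\eta(x)} = \frac{dx_1}{\lambda\,\eta(x_1)}.$$
The individuals, being supposed entirely homogeneous, are likely to
have the same initial value X.
Hence
$$Z = \int_X^{x}\frac{dx}{\eta(x)} = \frac{1}{\lambda}\int_X^{x_1}\frac{dx_1}{\eta(x_1)}$$
or
$$M + \zeta = F(x) - F(X) = \frac{1}{\lambda}\left[F(x_1) - F(X)\right].$$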
The same value ζ corresponds in one distribution to x, in the other
to x_1, which is generally different from x. Inversely, a same result of
observation x = x_1 = ξ, which is in one distribution joined to the value ζ,
is in the other connected with a different value, say ζ_1. These values ζ
and ζ_1 are found from the equation
$$M + \zeta = F(\xi) - F(X) = \lambda\,(M + \zeta_1),$$
whence
$$\zeta_1 - \zeta = \left(\frac{1}{\lambda} - 1\right)\left[F(\xi) - F(X)\right].$$
The difference ζ_1 − ζ = (1/h)(z_1 − z) does not contain M, the mean growth
of Z. The undisturbed or initial value X, which otherwise is inseparably
bound to M in the combination F(X) + M (see § 2, form. (4), p. 35), can now
be determined by itself.
The above formula shows that for x = X we have ζ_1 = ζ and z_1 = z.
By tracing both the curves z = f(x) and z_1 = f_1(x_1) in the same system
of coordinates we evidently find the initial (undisturbed) value X as the
abscissa of the point of intersection of the curves.
From
$$z + hM = h\left[F(x) - F(X)\right] \quad\text{and}\quad z_1 + hM = \frac{h}{\lambda}\left[F(x_1) - F(X)\right]$$
it follows that the curve z_1 = f_1(x_1) may be obtained by enlarging all
ordinates reckoned from a certain line z = −hM in a certain ratio 1/λ.
It is by this property of the two curves that proportional reaction
may be recognised.
The reaction-curves also have proportional ordinates (with the ratio λ)
as was to be expected on account of our starting point.
See Example VI : Summer and winter barometric heights at den Helder
(p. 57).
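The property just stated can be checked numerically. The following is only a sketch under stated assumptions: the form F(x) = log x and the values of X, h and M are hypothetical and chosen for illustration; only the ratio λ = 1.86 is borrowed from Example VI. It shows that the two curves meet at x = X on the line z = −hM, and that ordinates reckoned from that line stand in the ratio 1/λ.

```python
import math

def make_normal_functions(F, X, h, M, lam):
    """Return z(x) and z1(x) for two sets whose reactions differ by the factor lam."""
    def z(x):
        # z + h*M = h * [F(x) - F(X)]
        return h * (F(x) - F(X)) - h * M

    def z1(x):
        # z1 + h*M = (h / lam) * [F(x) - F(X)]
        return (h / lam) * (F(x) - F(X)) - h * M

    return z, z1

# Hypothetical illustration values (not taken from the text), except lam.
F = math.log            # assumed form of F(x)
X, h, M, lam = 10.0, 2.0, 0.3, 1.86

z, z1 = make_normal_functions(F, X, h, M, lam)

# At the undisturbed value X both curves pass through the line z = -h*M.
print(z(X), z1(X), -h * M)

# Elsewhere the ordinates measured from z = -h*M stand in the ratio 1/lam.
for x in (6.0, 8.0, 12.0, 15.0):
    ratio = (z1(x) + h * M) / (z(x) + h * M)
    print(x, round(ratio, 6), round(1 / lam, 6))
```

In practice the direction is reversed: the two curves are traced from the observations, and X is read off as the abscissa of their intersection.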
CHAPTER III.
EXAMPLES.
General remarks.
The frequency-numbers Y have been divided by the whole number N
of the individuals; the quotients y are the ordinates of the points, marked
by a cross (×), of the frequency-curve.
The points (x, z), marked by a dot (.), are joined by the smooth
curve representing the normal function.
The reaction-curve is obtained by graphical differentiation of
the normal function.
The scale of x is given at the bottom of the figure.
The unit of y is different in the different examples. It is so chosen
that the frequency-curve is always of a convenient size.
The scale of y is marked at the right margin of the figure.
The unit of z in the original figure is 5 cm; it is 2,5 cm in the
(reduced) reproduction.
The scale of z is marked at the left margin of the figure.
As the values of η contain an arbitrary constant factor, the unit of η
is chosen according to circumstances.
Example I.
Circumference of the chest of recruits, given by A. QUETELET (Anthropométrie,
Bruxelles, 1871, p. 289).
Unit of x: 1 inch; class-range = 1 unit = 1 inch; N = 1516.
Normal distribution: z = 0.334 (x − 35.0).

   x      Y      z (at the upper class-limit)
  28      2     −2.13
  29      4     −1.88
  30     17     −1.53
  31     55     −1.15
  32    102     −0.84
  33    180     −0.50
  34    242     −0.18
  35    310     +0.18
  36    251     +0.52
  37    181     +0.85
  38    103     +1.19
  39     42     +1.49
  40     19     +1.81
  41      6     +2.13
  42      2
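The values of z in Example I can be recomputed from the frequencies. The sketch below assumes that each z belongs to the upper limit of its class and that p is the cumulative fraction of individuals below that limit; the helper name z_from_p is ours. It uses the fact that p = (1/√π)∫ e^(−t²) dt, taken from −∞ to z, equals the standard normal distribution function evaluated at z√2.

```python
from math import sqrt
from statistics import NormalDist

# Frequencies Y for x = 28 ... 42 inches (Example I, N = 1516).
x_values = list(range(28, 43))
Y = [2, 4, 17, 55, 102, 180, 242, 310, 251, 181, 103, 42, 19, 6, 2]
N = sum(Y)

def z_from_p(p):
    """Solve p = (1/sqrt(pi)) * integral_{-inf}^{z} exp(-t^2) dt for z."""
    # Substituting t = u / sqrt(2) turns the integral into the standard
    # normal distribution function evaluated at z * sqrt(2).
    return NormalDist().inv_cdf(p) / sqrt(2)

z = []
cumulative = 0
for y in Y[:-1]:            # the last class-limit has p = 1, i.e. z = +infinity
    cumulative += y
    z.append(z_from_p(cumulative / N))

for x, value in zip(x_values, z):
    print(f"class-limit {x + 0.5:5.1f}   z = {value:+.2f}")
```

Plotting these z against the class-limits gives points lying close to the straight line z = 0.334 (x − 35.0) quoted above.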
Example II.
Threshold of sensation, measured by Prof. G. HEYMANS (J. C. KAPTEYN,
Skew Frequency Curves, p. 25).
Unit of x: 1 decigramme; class-range = 1 unit = 1 decigr.; N = 120.
[Equation of the normal function illegible.]

   x      Y      z (at the upper class-limit)
   1      1     −1.69
   2      6     −1.11
   3     23     −0.48
   4     21     −0.13
   5     21     +0.18
   6     15     +0.42
   7     15     +0.73
   8      3     +0.81
   9      7     +1.06
  10      3     +1.22
  11      2     +1.39
  12      0     +1.39
  13      1     +1.50
  14      1     +1.69
  15      0     +1.69
  16      0     +1.69
  17      1

Reproduction on logarithmic paper.
Example III.
Valuation of House Property in England and Wales, years 1885 to
1886, given by Prof. K. PEARSON. (Phil. Trans. Vol. 186, p. 396).
Unit of x : 10 £; class-range from 10 £ to 500 £; #=5829.9 thousand.
1rt, x — 0.50
UUg«»lJ*UUU
\j UJBV4AM1AHIVU • fi *•)•*••*•
'*0.94-
0.50
X
r.-iooo
Z
x
7:1000
z
0
8
+ 1.36
3175.
47.3
1
+ 0.08
10
+ 1.47
1451.
58.9
2
+ 0.58
15
+ 1.68
441.6
38.0
3
+ 0.79
30
+ 2.01
259.8
8.8
4
+ 0.97
50
+ 2.26
151.0
3.0
5
+ 1.10
100
+ 2.53
90.4
1.0
6
+ 1.20
150
104.1
8
+ 1.36
Reproduction on logarithmic paper.
Example IV.
Diameter of Spores of Mucor Mucedo, measured by Mr. G. POSTMA
(unpublished).
Unit of x : 3.27 //; class-range = 1 unit = 3.27 /*; N = 330.
Accumulation within the domain.
X
Y
z
X
Y
z
10
18
— 0.44
3
26
11
— 1.67
19
-0.28
3
50
12
-1.48
20
0.00
2
106
13
-1.40
21
-f-0.65
7
33
14
-1.20
22
+ 1.00
11
10
15
— 1.00
23
+ 1.18
12
6
16
— 0.85
24
-f 1.33
25
7
17
— 0.62
25
+ 1.67
26
3
18
— 0.44
26
Example V.
Length of wheat-ears, grown under unfavorable circumstances (closely
sown in poor soil), measured by Dr. C. DE BRUYKER. (Handelingen van
het 13e Vlaamsche Natuur- en Geneeskundig Congres, p. 172).
Unit of x : 1 mm ; class-range = 10 units = 1 cm ; N = 372.
Depression within the domain.
X
Y
z
x
Y
z
20.5
80.5
+ 0.18
28
37
30.5
— 1.02
90.5
+ 0.37
101
58
40.5
-0.28
100.5
+ 0.76
30
35
50.5
-0.13
110.5
+ 1.18
15
15
60.5
— 0.06
120.5
+ 1.70
26
2
70.5
+ 0.07
130.5
+ 1.97
24
1
80.5
+ 0.18
140.5
Example VI.
Summer and Winter barometric heights at Den Helder.
kJUU-lJ-UCJ. UlUUUIB* J.J.CVjU.CilVyJ-V/U.A VC .
L jma i
LUUVWAUU .
^ -} N- S2fi02
Winter months : frequency-curve : >
. * ; normal function:
. 7^ — 99.1 ««
Unit of x: 1 mm mercury; class-range = 1 unit = 1 mm mercury.
Proportional reaction: λ = η_1 : η = 1,86.
Undisturbed value X = 761.2.
Summer Winter
Summer
Winter
X
Y z Yl ^
X
F
z
Fj
zi
718
1
736
58
— 2.77
— 1.56
719
0
737
83
— 2.77
-1.49
720
0
738
103
— 2.77
— 1.42
721
2
739
3
110
— 2.58
— 2.58
— 1.36
722
2
740
6
120
— 2.48
— 2.37
-1.30
723
0
741
8
155
— 2.48
-2.24
— 1.24
724
4
742
15
175
— 2.37
— 2,11
-1.18
725
3
743
18
214
— 2.31
— 2.01
— 1.12
726
1
744
36
249
— 2.30
— 1.89
— 1.05
727
5
745
63
262
— 2,23
— 1.75
— 0.99
728
9
746
74
285
— 2.14
— 1.65
— 094
729
11
747
94
837
— 2.07
— 1.55
— 0.87
730
23
748
145
382
— 1.96
— 1.44
— 0.81
731
20
749
210
415
— 1.90
— 1.33
— 0.75
732
28
750
228
481
— 1.83
-1.24
— 0.69
733
39
751
356
508
— 1.75
— 1.13
— 0.63
734
47
752
444
578
— 1.68
— 1.02
— 0.56
735
52
753
594
632
— 1.62
— 0.90
— 0.49
736
58
754
722
594
58
Summer
Winter
Summer
Winter
X
Y
z
Yl
«i
x
Y z
Y
zi
754
722
594
770
395
732
— 0
.78
— 0
.43
+ 1.33
+ o
.73
755
873
683
771
278
654
1
— 0
.67
— 0.37
+ 1.49
+ o
.82
756
1046
715
772
191
567
1
— 0.55
— 0
.31
+ 1.67
+ o
92
757
1172
725
773
110
496
— 0
43
— 0.24
+ 1.86
+ 1
02
758
1324
778
774
57
387
-0.31
— 0.18
+ 2.06
+ 1
.12
759
1438
818
775
20
307
— 0.19
— 0
.11
+ 2.21
+ 1
.21
760
1549
816
776
13
279
1 •*•
— 0.07
-0.05
+ 2.42
+ 1,32
761
1567
877
777
6
280
+ 0.06
+ 0.02
+ 2.77
+ 1
.45
762
1601
842
778
151
+ 0.18
+ 0.09
+ 1,57
763
1671
848
779
111
+ 0.32
+ 0.16
+ 1,69
764
1425
825
780
85
+ 0.45
+ 0.23
+ 1
84
765
1310
907
781
41
1 •*•
+ 0.58
+ 0.31
4- 1 9fi
766
1179
880
782
31
+ 0.
73
+ 0.
39
+ 2.12
767
969
795
783
13
+ 0.
87
+ 0.46
+ 2.24
768
778
828
784
9
+ 1.02
+ 0.
55
+ 2.
39
769
613
762
785
6
+ 1.
18
+ 0.
63
+ 2.
65
770
395
732
786
2
APPENDIX TO CHAPTER I.
Note to art. 3, remark 2.
According to formula (10) of the text, the deviation executed by any
one individual K, under the influence of any one cause C_h, is:
(a) $\Delta x = \dfrac{\Delta_{h,K}}{F'(x)} - \dfrac{1}{2}\,\dfrac{F''(x)}{[F'(x)]^3}\,\Delta_{h,K}^2$
Required are the cases in which these deviations can be symmetrical
for the individuals of any size x. With some generalisation of the term,
we call symmetrical the deviations Δx if
(b) $\sum \Delta x = 0.$
It thus is required to find the cases in which this sum, taken over
all the individuals of an arbitrary size x, is zero for any cause C_h.
Substituting (a) in (b) we have
(c) $\dfrac{\sum \Delta_{h,K}}{F'(x)} - \dfrac{1}{2}\,\dfrac{F''(x)}{[F'(x)]^3}\sum \Delta_{h,K}^2 = 0,$
the sums being taken over the indices K.
A first solution will evidently be
F''(x) = 0.
Consequently
(d) F(x) = a + bx
if, at the same time, the deviations are such that
(e) $\sum \Delta_{h,K} = 0$ (summation over K).
In the case that F''(x) is not zero, let
(f) $\dfrac{\sum \Delta_{h,K}}{\sum \Delta_{h,K}^2} = B$ (summation over K).
If we suppose B the same for each of the causes C_h, we get another
solution, for then
$$\frac{F''(x)}{[F'(x)]^2} = 2B,$$
which integrated gives
$$-\frac{1}{F'(x)} = C + 2Bx \quad\text{or}\quad F'(x) = -\frac{1}{C + 2Bx},$$
and integrating again:
(g) $F(x) = -\dfrac{1}{2B}\log (C + 2Bx),$
which is easily brought under the forms of the text.
The forms (d) and (g) are thus seen to be the only ones for F(x)
in which symmetrical deviations of the x are possible.
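That the form (g) satisfies the condition F''(x)/[F'(x)]² = 2B may be verified numerically. The following sketch uses central differences; the constants B and C are hypothetical values chosen only for illustration, and the helper name derivatives is ours.

```python
import math

B, C = 0.4, 1.5      # hypothetical constants, for illustration only

def F(x):
    """Form (g): F(x) = -(1 / 2B) * log(C + 2Bx)."""
    return -math.log(C + 2 * B * x) / (2 * B)

def derivatives(x, step=1e-4):
    """Central-difference estimates of F'(x) and F''(x)."""
    first = (F(x + step) - F(x - step)) / (2 * step)
    second = (F(x + step) - 2 * F(x) + F(x - step)) / step ** 2
    return first, second

for x in (0.5, 1.0, 3.0):
    d1, d2 = derivatives(x)
    # The quotient F''/F'^2 should equal 2B at every x.
    print(x, d2 / d1 ** 2, 2 * B)
```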
We can easily verify that in these cases — provided the deviations
are indeed symmetrical — the arithmetical mean x is indeed = x0.
In the case (d), the equation of the frequency curve, according to
formula (15) of the text is:
0(a.)= * -&«->*-^
e V %TI
The condition of symmetry, according to (e), being I,Ah,K = Q (for
every cause ft) we have by formula (13) of the text M = 0 , therefore
This is a normal curve having really its centre of gravity (x — x) at
X = XQ.
In the case (#), the frequency curve, according to formula (15) of
the text, will be
"» -
The limits of this curve lie at the points
C
x = oo and x = — ~-g
for which F(x) becomes respectively — oo and + °°-
The arithmetical mean therefore is
~_
2B
Put
-j— Lg log (C + 2Bx0) — ££ log (C
We get
—
Putting in the last
We get, because f e-v2 dy =zV n
CV
CO
C1 9 R
(h\ x — __ — -I __ -- _ e~ 2B W-
2£ ^ C
According to (/) the deviations will be symmetrical in this case, if,
for every cause Ch
(2
,.K
or with the notation of art. 1
B = + <*/,.*) (summation over K).
v
It was shown in art. 2 that, in the case of an infinity of causes —
which is here assumed — the Ah must be of higher order of smallness
than the o^.^, so that, 2,ah.K being = 0, we may put
7? _.
-
Taking the sum of the similar equations for the whole of all the causes
we have, according to formulae (13) and (14) of the text
which, in (h) gives
APPENDIX TO CHAPTER II.
In the appendix to chapter II we propose to give firstly some
demonstrations and explanatory notes referring to the theory worked out
above, secondly a short discussion of the equation of the frequency-curve.
The further development of this theory will be given in the C. A.
(see p. 52). In this latter extension we partly apply this theory,
facilitated as it is by the two simplifications introduced on pp. 37, 38, to
a few new cases, which are, though a little more intricate, closely
related to those already treated.
In the C. A. we shall also expand the sphere of action of our theory
by dropping the simplifications mentioned, particularly in investigating
high frequency-numbers at the limits of the whole domain.
I. Explanatory notes.
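A. The probable error ρ_z of z.
From
$$p = \frac{1}{\sqrt{\pi}}\int_{-\infty}^{z} e^{-t^2}\,dt$$
ensues
$$\frac{dp}{dz} = \frac{e^{-z^2}}{\sqrt{\pi}}.$$
A small deviation Δz from a certain value z is accompanied by a
usually also small deviation Δp from the corresponding value of p; this
deviation may be approximated by
$$\Delta p = \frac{e^{-z^2}}{\sqrt{\pi}}\,\Delta z.$$
So to the probable error ρ_z of z corresponds the probable error ρ_p of
p, according to the (approximative) relation
$$\rho_p = \frac{e^{-z^2}}{\sqrt{\pi}}\,\rho_z,$$
whence
$$\rho_z = \sqrt{\pi}\,e^{z^2}\rho_p = \rho\,\sqrt{\frac{2\pi\,p(1-p)}{N}}\;e^{z^2}$$
[ρ = 0,47694, N = whole number of individuals, $p = \frac{1}{\sqrt{\pi}}\int_{-\infty}^{z} e^{-t^2}\,dt$].
Now, for z = 0 we have $e^{z^2} = 1$ and p = 1/2, so that
$$\rho_z = \frac{0.598}{\sqrt{N}}.$$
In the neighbourhood of z = 0 we may, without serious error, still
use the same value for ρ_z.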
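An expression can also be given for ρ_z, which holds near the limits
z = ±∞. Here we make use of the so-called "Error-function" introduced
by GLAISHER*) and defined by
$$\operatorname{Erf}\zeta = \int_{\zeta}^{\infty} e^{-t^2}\,dt,$$
which, for large values of ζ, can be expanded in powers of 1/ζ in this way:
$$\operatorname{Erf}\zeta = \frac{e^{-\zeta^2}}{2\zeta}\left(1 - \frac{1}{2\zeta^2} + \frac{3}{4\zeta^4} - \dots\right).$$
For large negative values −ζ the probability p is very small, so that
1 − p is nearly 1. So the probable error ρ_p of p is approximated by
$$\rho_p = \rho\sqrt{\frac{2p}{N}}$$
and the corresponding probable error ρ_ζ of −ζ by
$$\rho_\zeta = \sqrt{\pi}\,e^{\zeta^2}\rho_p = \rho\sqrt{\frac{2\pi p}{N}}\;e^{\zeta^2},$$
or, since
$$p = \frac{1}{\sqrt{\pi}}\operatorname{Erf}\zeta,$$
$$\rho_\zeta = \rho\,\frac{\pi^{1/4}}{\sqrt{N\zeta}}\;e^{\zeta^2/2}\left(1 - \frac{1}{4\zeta^2} + \dots\right).$$
A very tolerable approximation is still given by
$$\rho_\zeta = \rho\,\frac{\pi^{1/4}}{\sqrt{N\zeta}}\;e^{\zeta^2/2}.$$
*) See GLAISHER: On a class of Definite Integrals (Philos. Magaz. XLII, pp. 294
and 421). E. CZUBER: Theorie der Beobachtungsfehler (Leipzig, 1891, Teubner) p. 116.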
The smallest a-posteriori probability p which can occur in a set of
N individuals is obviously
$$p' = \frac{1}{N};$$
of course this value is found at the lower limit of the frequency-domain.
At the upper limit the value for p is at the utmost
$$p'' = \frac{N-1}{N}.$$
Hence the expression p(1 − p)/N has for its lower limit
$$\frac{p'(1-p')}{N} = \frac{p''(1-p'')}{N} = \frac{1}{N^2}\left(1 - \frac{1}{N}\right)$$
and approximately 1/N²,
so that an approximative value for the probable error ρ_{p'} of the
corresponding p' is found in
$$\rho_{p'} = \frac{\rho\sqrt{2}}{N} = \rho\sqrt{2}\,p',$$
or, expressed in terms of the corresponding value z' of z:
$$\rho_{p'} = \frac{\rho}{\sqrt{2\pi}}\,\frac{e^{-z'^2}}{|z'|}\left(1 - \frac{1}{2z'^2} + \dots\right).$$
Hence the approximative value for the probable error ρ_{z'} of z' is
$$\rho_{z'} = \sqrt{\pi}\,e^{z'^2}\rho_{p'} = \frac{\rho}{|z'|\sqrt{2}}\left(1 - \frac{1}{2z'^2}\right),$$
or, more roughly,
$$\rho_{z'} = \frac{\rho}{|z'|\sqrt{2}}.$$
So we conclude that the probable error in the largest z which may
be found near the limits of the domain, or the probable error in the
maximum values of |z_1| and z_{n−1}, is nearly inversely proportional to these
maximum values themselves; hence this error becomes less with increasing
values of the number N of the individuals.
The following table shows the numerical relations between the values
of z and those of p, ρ_z √N, and ρ_{z'}:

   ±z      p or 1 − p       ρ_z √N     ρ_{z'}
   0       0,500            0,598
   0,1     0,444            0,600      0,400
   0,5     0,240            0,655      0,321
   1,0     0,0786           0,877      0,246
   1,5     0,01695          1,464      0,191
   2,0     0,00234          3,153      0,152
   2,5     203 × 10^-6      8,838      0,126
   3,0     110 × 10^-7      32,20      0,107
   3,5     372 × 10^-9      152,3      0,0928
   4,0     771 × 10^-11     932,8      0,08
For instance N = 10000 furnishes a minimum value p' = 0,0001 for p,
and accordingly z' = −2,630. The probable error of z takes the values

   ±z        ρ_z
   0         0,00598
   0,1       0,00600
   0,5       0,00655
   1,0       0,00877
   1,5       0,01464
   2,0       0,03153
   2,5       0,08838
   2,630     0,12065
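The tabulated probable errors can be recomputed with a few lines of code. The sketch below assumes the relation ρ_z = ρ √(2π p(1 − p)/N) e^(z²) with ρ = 0.47694; the function names are ours. The extreme value z' is obtained by inverting p' = 1/N through the standard normal distribution function evaluated at z√2.

```python
from math import erfc, exp, pi, sqrt
from statistics import NormalDist

RHO = 0.47694  # probable-error constant quoted in the text

def p_of_z(z):
    """p = (1/sqrt(pi)) * integral_{-inf}^{z} exp(-t^2) dt."""
    return 0.5 * erfc(-z)

def rho_z(z, N):
    """Probable error of z: rho * sqrt(2*pi*p*(1-p)/N) * exp(z^2)."""
    p = p_of_z(z)
    return RHO * sqrt(2 * pi * p * (1 - p) / N) * exp(z * z)

def extreme_z(N):
    """z' belonging to the smallest a-posteriori probability p' = 1/N."""
    return NormalDist().inv_cdf(1 / N) / sqrt(2)

# rho_z * sqrt(N), obtained by putting N = 1.
for z in (0.0, 0.1, 0.5, 1.0, 1.5, 2.0, 2.5):
    print(f"z = {z:3.1f}   rho_z*sqrt(N) = {rho_z(z, 1):.3f}")

print(f"z' for N = 10000: {extreme_z(10000):.3f}")
```

Small differences in the last digit against the printed table are to be expected, since the original values were computed by hand.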
B. Rigorous discussion of the shape of the curves V and W (pp. 46-50).
We shall now prove rigorously that the curve V is concave downward
and W convex downward.
Introducing the number e, base of the Neperian logarithms, and the
modulus mod = M = ¹⁰log e = 0,434295..., we find from
$$u' = {}^{10}\!\log (x - x_0') \quad\text{or}\quad x = x_0' + 10^{u'}$$
the relations
$$x = x_0' + e^{u'/M}$$
and
$$u = {}^{10}\!\log (x - x_0) = M\,{}^{e}\!\log (x - x_0) = M\,{}^{e}\!\log\left(e^{u'/M} + x_0' - x_0\right)$$
or, putting
$$x_0' - x_0 = \sigma,$$
$$u = M\,{}^{e}\!\log\left(e^{u'/M} + \sigma\right)$$
and
$$z = \lambda (u - u_m) = \lambda u + \text{const.} = \lambda M\,{}^{e}\!\log\left(e^{u'/M} + \sigma\right) + \text{const.}$$
Hence
$$\frac{dz}{du'} = \frac{\lambda\,e^{u'/M}}{e^{u'/M} + \sigma} = \frac{\lambda}{1 + \sigma e^{-u'/M}}$$
and
$$\frac{d^2z}{du'^2} = \frac{\lambda\sigma}{M}\,\frac{e^{-u'/M}}{\left(1 + \sigma e^{-u'/M}\right)^2}.$$
Since λ > 0, d²z/du'² has the same sign as σ.
Too small a value x_0' makes σ < 0, consequently d²z/du'² < 0 (concave
downward).
Too large a value x_0' makes σ > 0, consequently d²z/du'² > 0 (convex
downward).
With increasing u' the absolute value of d²z/du'² becomes smaller.
As the slope varies but little, also the curvature decreases for increasing
u', viz. left above.
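In the same manner the second logarithmic curve is rigorously treated
as follows.
From
$$u' = {}^{10}\!\log (x_n' - x)$$
we derive
$$x = x_n' - 10^{u'} = x_n' - e^{u'/M}$$
and
$$u = {}^{10}\!\log (x_n - x) = M\,{}^{e}\!\log (x_n - x) = M\,{}^{e}\!\log\left(x_n - x_n' + e^{u'/M}\right)$$
or, putting
$$x_n' - x_n = \tau,$$
$$u = M\,{}^{e}\!\log\left(e^{u'/M} - \tau\right)$$
and
$$z = \lambda (u_m - u) = \text{const.} - \lambda u = \text{const.} - \lambda M\,{}^{e}\!\log\left(e^{u'/M} - \tau\right),$$
whence
$$\frac{dz}{du'} = -\frac{\lambda\,e^{u'/M}}{e^{u'/M} - \tau} = -\frac{\lambda}{1 - \tau e^{-u'/M}}$$
and
$$\frac{d^2z}{du'^2} = +\frac{\lambda\tau}{M}\,\frac{e^{-u'/M}}{\left(1 - \tau e^{-u'/M}\right)^2}.$$
On account of λ > 0, d²z/du'² and τ have the same sign.
Too small a value of x_n' makes τ < 0, consequently d²z/du'² < 0 (concave
downward). Too large a value of x_n' makes τ > 0, consequently d²z/du'² > 0
(convex downward).
With increasing u', that is right below, the absolute value of d²z/du'²
becomes less, and (on account of the small variation of the slope dz/du')
the curvature decreases also.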
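The sign rule for the first logarithmic curve may be illustrated numerically. This is a sketch only: the values of σ and u' are hypothetical, λ is put equal to 1, and the function names are ours. It compares a second difference of z = λM ln(e^(u'/M) + σ) with the closed form (λσ/M) e^(−u'/M) / (1 + σ e^(−u'/M))².

```python
from math import e, exp, log, log10

MOD = log10(e)   # modulus M = 0.434295...

def z_first_curve(u_prime, sigma, lam=1.0):
    """z = lam * M * ln(exp(u'/M) + sigma), the constant term being omitted."""
    return lam * MOD * log(exp(u_prime / MOD) + sigma)

def analytic_curvature(u_prime, sigma, lam=1.0):
    """Closed form of the second derivative of z with respect to u'."""
    t = exp(-u_prime / MOD)
    return lam * sigma / MOD * t / (1 + sigma * t) ** 2

def second_difference(func, u, step=1e-3):
    """Central second difference, a numerical estimate of the second derivative."""
    return (func(u + step) - 2 * func(u) + func(u - step)) / step ** 2

# sigma = x0' - x0: negative when x0' is taken too small, positive when too large.
for sigma in (-0.5, 0.5):   # hypothetical values
    numeric = second_difference(lambda u: z_first_curve(u, sigma), 0.5)
    print(sigma, numeric, analytic_curvature(0.5, sigma))
```

The second curve can be treated in the same way by replacing σ with −τ and reversing the sign of z.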
II. The equation of the frequency-curve.
The area of the frequency-curve bounded by the ordinate-line x
amounts to
$$W = \frac{1}{\sqrt{\pi}}\int_{-\infty}^{z} e^{-t^2}\,dt,$$
so that for the ordinate y of the frequency-curve
$$y = \frac{dW}{dx}.$$
Now
$$\frac{dW}{dz} = \frac{e^{-z^2}}{\sqrt{\pi}},$$
hence
$$\frac{dW}{dx} = \frac{dW}{dz}\cdot\frac{dz}{dx} = \frac{1}{\sqrt{\pi}}\,e^{-z^2} f'(x).$$
So the equation of the frequency-curve is found to be
$$y = \frac{1}{\sqrt{\pi}}\,f'(x)\,e^{-[f(x)]^2}.$$
If z = f(x) = ∞ for x = ξ, the factor $e^{-z^2} = e^{-[f(x)]^2}$ becomes infinitesimal
of an excessively high order. If y shall be finite, we must make f'(x) in
x = ξ infinite of the same order as $e^{z^2}$. This can only be done with very
peculiar forms of the function f(x), so that we are usually inclined to
attribute a finite value of y to a likewise finite value of z = f(x).
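As an illustration of the equation y = f'(x) e^(−[f(x)]²)/√π, the sketch below takes the normal function of Example I, z = 0.334 (x − 35.0), and checks numerically that the whole area of the resulting frequency-curve is unity. The integration range of 20 to 50 inches, the step and the helper name are our own choices.

```python
from math import exp, pi, sqrt

def frequency_ordinate(x, f, f_prime):
    """y = f'(x) * exp(-f(x)^2) / sqrt(pi)."""
    return f_prime(x) * exp(-f(x) ** 2) / sqrt(pi)

# Normal function of Example I: z = 0.334 (x - 35.0).
def f(x):
    return 0.334 * (x - 35.0)

def f_prime(x):
    return 0.334

# Trapezoidal integration over 20 ... 50 inches (assumed wide enough).
step = 0.01
xs = [20.0 + step * k for k in range(3001)]
ys = [frequency_ordinate(x, f, f_prime) for x in xs]
area = sum((ys[k] + ys[k + 1]) * step / 2 for k in range(len(ys) - 1))

print("ordinate at x = 35:", frequency_ordinate(35.0, f, f_prime))
print("area under the curve:", area)
```

Any other differentiable normal function f(x) can be substituted, provided its derivative is supplied as well.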
APPENDIX TO CHAPTER
As this paper is going through the press, Miss Dr. T. TAMMES, the
well known botanist, kindly sends us a curious frequency curve showing
strong accumulation at the lower limit. The curve is given, further below,
under the head Y. As this might be a good test case, we requested that
no particulars should be communicated before we had derived the normal
function (z) and the reaction curve (η)¹) in the ordinary way.
Though we can give no figure it must be easy to follow the course
of the latter curve from the numbers (η) in our table. The values of the
ordinates show that it starts from zero and then rises extremely abruptly.
A maximum however is soon reached at about x = 25, after which it
steadily decreases, so that the reaction for x = 100 is already below half
what it is at maximum.
The meaning of this is of course that the individuals evidently have
great difficulty in starting their growth. There seems to be an almost
insuperable impediment against beginning growth. Those individuals,
however, who succeed in overcoming the first difficulty then begin to grow
very rapidly indeed, the rapidity increasing till the size 25 is reached.
After that the growth begins to diminish; it gradually decreases, to only
half of the maximum growth for the individuals of size 100 and below
one tenth of the maximum growth for the individuals of size 170.
All this proves to be in good agreement with what has been really
observed. Dr. TAMMES writes: "The case I sent you is as follows: the
quantities communicated are stalk-lengths of Linum crepitans, a variety
of the ordinary flax. They were measured, at a moment in which the
growth had not yet ceased, by Miss A. HAGA. The seeds were sown in
a great deep flower-pot. Their number was purposely taken very high,
so that they were extremely crowded. At starting, therefore, the difficulty
for each seed was to get a root into the soil. It seems allowable to
assume that all the seeds germinated. This has necessarily entailed an
intense struggle and many individuals must not have succeeded or not
sufficiently succeeded. For those who really got their root in the soil there
now came a good time. There was plenty of food for a good many
very small plants. The case however changed when the plants, becoming
greater, required more room. Then a second struggle ensued, viz. the
struggle for the available food in the too narrow room. The plants now
became more and more impeded in their growth.
It seems to me that the conclusions from your curve are well in
accordance with the facts."
¹) In finding the reaction curve the normal function z was first smoothed.
Unit of x : 1 m.m., class-range : 5
m.m., ^=1338.
X
Y
*
17 x
Y
z
n
0
0
115
— 0.181
93
148
34
5
— 0.852
185
120
— 0.139
79
15
46
10
9
— 0.824
215
125
56
— 0.073
60
15
8
- 0.813
230
130
63
— 0.000
47
20
7
— 0.783
240
135
65
+ 0.084
39
25
10
- 0.764
243
140
94
+ 0.170
33
30
9
—. 0.739
241
145
65
+ 0.301
30
35
9
— 0.721
236
150
83
+ 0.400
28
40
- 0.700
227
155
+ 0.537
26
10
55
45
— 0.680
214
160
+ 0.642
24
14
71
50
— 0.653
196
165
+ 0.796
22.5
13
43
55
— 0.625
179
170
A /"»
+ 0.919
21
13
46
60
- 0.600
166
175
+ 1.081
19.5
21
26
65
— 0.560
156
180
C-\ A
-f 1.211
18.5
16
24
70
- 0.534
148
185
+ 1.381
17.5
22
16
75
— 0.495
142
190
+ 1.564
16.5
19
11
80
21
- 0.462
136
195
2
+ 1.810
15.5
85
19
- 0.429
131
200
2
-|- 1.89
15
90
38
— 0.398
127 205
1
+ 2.01
95
21
- 0.341
123 210
0
-f 2.10
100
25
- 0.311
119
215
0
+ 2.10
105
23
- 0.272
113
220
2
+ 2.10
110
43
- 0.240
104
225
115
— 0.181
93
•
Table for $p = \frac{1}{\sqrt{\pi}}\int_{-\infty}^{z} e^{-t^2}\,dt$
z (second decimal in the columns):  0  1  2  3  4  5  6  7  8  9
— 0.0
0.5000
0.4944
0.4887
0.4831
0-4774
o 4718
0.4662
0.4606
0.4550
o.4494|f
— O.I
.4438
.4382
.4326
.4271
.4215
.4160
.4105
.4050
•3995
•394i|
0.2
.3886
.3832
•3779
.3725
.3672
.3618
.3566
•35!3
.3461
•3409f
— 0.3
•3357
•33°6
.3254
.3204
•3153
.3103
•3053
.3004
•2955
.29o6fr
- 0.4
.2858
.2810
.2763
.2716
.2669
.2623
•2577
•2531
.2486
.2442^
- o-5
.2398
•2354
.2311
.2268
.2225
.2183
.2 [42
.2101
.2060
.202clf
- 0.6
.1981
.1942
.1903
.1865
.1827
.1790
•1753
.1717
.1681
.i646|
-0.7
.1611
•1577
•1543
.1509
•M77
.1444
.1412
.1381
•T35°
•i3i9i
— 0.8
.1289
.1260
.1231
.1202
.1174
.1147
.1119
.1093
.1067
.io4it
— 0.9
.1015
.0991
.0966
.0942
.0919
.0896
.0873
0.851
.0829
.0807!
I.O
.0786
.0766
.0746
.0726
.0707
.0688
.0669
.0651
•0633
.o6i6i-
— i.i
•0599
.0582
.0566
•°55°
•°535
.0519
.0505
.0490
.0476
.0462!-
1.2
.0448
•0435
.0422
.0410
•0397
.0385
•0374
.0362
•035 !
•034it
— 1-3
.0330
.0320
0.310
.0300
.0290
.0281
.0272
.0263
•0255
•0247Jf
~ 1-4
.0239
0.231
.0223
.0216
.0209
.0202
.0195
.0188
.0182
oi76|f
— 1-5
.0169
.0164
.0158
.0152
.0147
.0142
.0137
.0132
.0127
.01231-
— 1.6
.0118
.0114
.0110
.0106
.0102
.0098
.0094
.0091
.0088
.0084!-
— 1-7
.0081
.0078
.0075
.0072
.0069
.0067
.0064
.0062
.0059
•00571
— 1.8
•0055
.0052
.0050
.0048
.0046
.0044
.0043
.0041
.0039
.oo38j|.
— 1.9
.0036
•0035
•0033
.0032
.0030
.0029
.0028
.0027
.OO26
.O024f
2.0
.0023
.0022
.002 I
.0020
.0020
.0019
.OOl8
.0017
.00l6
.ooi6{-
— 2.1
.0015
.0014
.OOI4
.0013
.OOI2
.0012
.0011
.0011
.OOiO
.OOIOf
2.2
.0009
.0009
.0008
.0008
.0008
.0007
.0007
.0007
.0006
.ooo6t
— 2.3
.0006
.0005
.0005
.0005
.OOO5
.0004
.0004
.0004
.0004
.0004!-
- 2.4
.0003
.0003
.0003
.0003
.0003
.0003
.0003
.0002
.0002
.00021-
- 2.5
.0002
.0002
.OOO2
.0002
.OOO2
.0002
.ooot
.OOOI
.OOOI
.oooil
— 2.6
.0001
.0001
.0001
.0001
.0001
.0001
.0001
.OOOI
• .OOOI
.oooif
— 2.7
.0001
.0001
.0001
.0001
.0001
.ooor
.0000
.OOOO
.0000
.ooool
r*£r"
— 00
Z 0
i
2
3
4
5
6
7
8
9
- o.o 10.5000
0.5056
o-5ii3
0.5169
0.5226
0.5282
0.5338
0-5394
0-545°
0.5506
- O.I
•5562
.5618
•5674
•5729
.5785
.5840
•5895
•595°
.6005
.6059
- 0.2 .6lI4
.6168
.622 i
.6275
.6328
.6382
.6434
.6487
•6539
.6591
- 0.3 .6643
.6694
.6746
.6796
.6847
.6897
.6947
.6996
•7045
.7094
- 0.4
! .7142
.7190
•7237
.7284
•7331
•7377
•7423
.7469
•75H
•7558
- °-5
.7602
.7646
.7689
•7732
•7775
•7817
.7858
.7899
.7940
.7980
- 0.6 .8019
1
.8058
.8097
•8i35
•8173
.8210
.8247
.8283
.8319
.8354
- 0.7! .8389
.8423
.8457
.8491
•8523
.8556
.8588
.8619
.8650
.8681
- 0.8
1 .8711
.8740
.8769
.8798
.8826
•8853
.8881
.8907
.8933
.8959
- 0.9
.8985
.9009
.9034
.9058
.9081
.9104
.9127
.9149
.9171
•9r93
.9214
.9234
•9254
•9274
•9293
.9312
•9331
•9349
•9367
-9384
- i.i
.9401
.9418
•9434
•945°
•9465
.9481
•9495
•95to
•9524
•9538
- 1.2
•9552
•9565
•9578
•959°
.9603
9615
.9626
.9638
.9649
•9659
- i-3
.9670
.9680
.9690
.9700
.9710
.9719
.9728
•9737
•9745
•9753
- i-4
.9761
.9769
•9777
.9784
.9791
•9798
•9805
.9812
.9818
.9824
- i-5
.9831
.9836
.9842
.9848
.9853
.9858
.9863
.9868
.9873
.9877
- 1.6
.9882
.9886
.9890
.9894
.9898
.9902
.9906
.9909
.9912
.9916
- i-7
.9919
.9922
•9925
.9928
•9931
•9933
•9936
•9938
.9941
9943
- 1.8
•9945 -9948
•995°
•9952
•9954
•9956
•9957
•9959
.9961
.9962
- i-9
.9964
•9965
.9967
.9968
.9970
.9971
.9972
•9973
•9974
.9076
- 2.O
•9977
.9978
•9979
.9980
.9980
.9981
.9982
•9983
.9984
-9984
- 2.1
•9985
.9986
.9986
.9987
99.88
.9988
.9989
•9989
.9990
.9990
- 2.2
.9991
.9991
.9992
•9992
•9992
•9993
•9993
•9993
•9994
•9994
- 2.3
.9994
•9995
•9995
•9995
•9995
.9996
.9996
.9996
.9996
.9996
- 2.4
•9997 -9997
•9997
•9997
•9997
•9997
•9997
•9998
.9998
•9998
- 2.5
.9998 .9998
.9998
.9998
•9998
•9998
•9999
•9999
•9999
•9999
- 2.6
•9999
•9999
•9999
•9999
•9999
•9999
•9999
•9999
•9999
•9999
- 2.7:
•9999
•9999
•9999
.9999
•9999
•9999
I.OOOO
I.OOOO
I.OOOO
I.OOOO
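Single entries of the preceding table can be checked, or the table regenerated, with the following sketch. It assumes that each row is headed by z to one decimal and that the ten columns give the second decimal; the function names are ours. It relies on the identity p = (1 + erf z)/2, where erf is the modern error function of the Python standard library.

```python
from math import erf

def p_of_z(z):
    """p = (1/sqrt(pi)) * integral_{-inf}^{z} exp(-t^2) dt = (1 + erf z) / 2."""
    return 0.5 * (1.0 + erf(z))

def table_row(z_tenths, sign):
    """One row of the table: z = sign * (z_tenths/10 + k/100) for k = 0 ... 9."""
    return [p_of_z(sign * (z_tenths / 10 + k / 100)) for k in range(10)]

# First three rows of the negative half of the table.
for tenths in range(0, 3):
    row = table_row(tenths, -1)
    print(f"-{tenths / 10:.1f}  " + "  ".join(f"{p:.4f}" for p in row))
```

Calling table_row with sign = +1 gives the corresponding row of the positive half.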