BIOMETRIKA
A Journal for the Statistical Study of Biological Problems
Founded by W. F. R. Weldon, Francis Galton and Karl Pearson
Edited by Karl Pearson
Vol. XIV, Parts I and II, July, 1922
Issued by the Biometric Laboratory, University College, London, and printed by the University Press, Cambridge
Price Thirty Shillings net
[Issued 26 July, 1922]
BIOMETRIKA, Volume XIV, Nos. 1 and 2, July, 1922

THE STANDARD DEVIATIONS OF FRATERNAL AND PARENTAL CORRELATION COEFFICIENTS

By KIRSTINE SMITH, D.Sc., Lond.

CONTENTS.

Introduction
I. Fraternal Correlation
(a) The mean value
(b) The mean value of $\sigma^2$: the presumptive standard deviation
(c) The standard deviation of $\sigma^2$
(d) Mean value and standard deviation of the product moment $\Pi$
(e) The product moment, $\Pi_{\Pi\sigma^2}$, of $\Pi$ and $\sigma^2$
(f) The standard deviation of the fraternal correlation coefficient
(g) Numerical evaluation of the formula for the standard deviation of a fraternal correlation coefficient
(h) Application of the formula to previous calculations of correlation
II. Parental Correlation
(a) Mean value and standard deviation of the product moment $\Pi_{xy}$
(b) The product moment, $\Pi_{\Pi_{xy}\sigma^2}$, of $\Pi_{xy}$ and $\sigma^2$
(c) The product moment, $\Pi_{\Pi_{xy}\sigma'^2}$, of $\Pi_{xy}$ and $\sigma'^2$
(d) The product moment, $\Pi_{\sigma^2\sigma'^2}$, of $\sigma^2$ and $\sigma'^2$
(e) The standard deviation of the parental correlation coefficient
(f) The standard deviation of the slopes of the regression curves
(g) Numerical evaluation of the formula for the standard deviation of a parental correlation coefficient
(h) Application of the formula to previous calculations of correlation
Summary

INTRODUCTION.

No attempts have been made, as far as I know, to calculate special formulae for the standard deviations of fraternal and parental correlation coefficients. The usual formula for the standard deviation of a correlation coefficient*, which is deduced on the supposition that the values of the same variable are mutually uncorrelated, is generally used also for this case, although it is only correct for a fraternal correlation coefficient calculated from only two siblings of each family and for a parental correlation coefficient when only one offspring value from each family enters into the calculation.

* Vide Pearson and Filon: Phil. Trans. Vol. 191 A, p. 229, 1898.

When the material of observation, as is usually the case in investigations of inheritance in higher mammals, consists of families of varying size, and correlation tables are used in which the same weight is given to each observed pair of siblings or pair of parent and offspring, without regard to the size of the family, a rational treatment of the probable error is excluded at the outset.

With material in hand which makes it possible to examine numerous siblings, it is most reasonable to confine the investigation to a constant number of offspring from each family. In this case the deduction of formulae for the standard deviations of the two correlation coefficients does not present special difficulties, and this problem will be solved here.

We shall suppose that each group of $q$ siblings belongs to the same litter or that for other reasons their order of birth is indifferent.
Then each pair of siblings or each pair of parent and offspring ought to take a like part in the calculation, and $q$ siblings give rise to $\tfrac{1}{2}q(q-1)$ pairs of brothers and $q$ pairs of parent and offspring, all of which are entered in the calculation. The fraternal correlation can thus be calculated either from a correlation table which is made symmetrical, so that it contains $q(q-1)$ entries from each fraternity, or by the formula quoted in section I (g), which gives an identical result.

I. FRATERNAL CORRELATION.

Although this investigation aims especially at fraternal correlation, it concerns of course other calculations of correlation in which the material consists of classes of equal size inside which the individuals are mutually correlated, all of them forming like parts. In the following we shall therefore name a group of siblings a class.

Suppose we have a material consisting of $q$ individuals from each of $n$ classes, inside which the individuals are correlated while individuals from different classes are uncorrelated. We can then consider such a material as one of many possible samples of the same nature and size drawn from a population consisting of classes of individuals correlated as mentioned. It is therefore possible to face the problem of finding the law of errors for the mean value, the standard deviation of the character concerned and further for the correlation coefficient inside a class, supposing that these are all calculated from a sample like the one now considered.

Let the sample be $y_1, y_2, y_3, \ldots, y_{nq}$ with mean value $\bar{y}$ and standard deviation $\sigma$. No special notation will be introduced for individuals of the same class, but summation of products is indicated by $\Sigma$ when all factors of the product belong to the same class, and by $S$ when factors of the same product belong to two or more classes. The summations always extend to all $n$ classes.

(a) The Mean Value.
For the sample in hand we have
$$\bar{y} = \frac{1}{nq}\,\Sigma(y_1).$$
The mean value of $\bar{y}$ for a great number of samples coincides, according to the suppositions, with the mean of the population, and this we choose for the zero point of $y$. The squared standard deviation of $\bar{y}$ is therefore found simply by squaring the expression above, summing for all the samples imagined and taking the mean value of the result. We thus find
$$\sigma_{\bar{y}}^2 = \frac{1}{n^2q^2}\left\{\overline{\Sigma(y_1^2)} + 2\,\overline{\Sigma(y_1y_2)} + 2\,\overline{S(y_1y_2)}\right\},$$
where a bar above a summation indicates that the mean value has to be taken of the sums for all samples, i.e. for the population. Let the standard deviation of the population be $s$ and the correlation coefficient for individuals of the same class $r$; we then have
$$\overline{\Sigma(y_1^2)} = nq\,s^2 \quad\text{and}\quad \overline{\Sigma(y_1y_2)} = \tfrac{1}{2}nq(q-1)\,rs^2.$$
As individuals of different classes are uncorrelated, $\overline{S(y_1y_2)}$ is equal to 0, and accordingly we find
$$\sigma_{\bar{y}}^2 = \frac{s^2}{nq}\{1+(q-1)r\} \qquad (1).$$
This contains $s$ and $r$ for the population, which are, as a rule, only known from the sample in hand. It will be seen in the following what approximation is obtained by putting $s$ and $r$ equal to the values found from the sample.

(b) The Mean Value of $\sigma^2$: the presumptive Standard Deviation.

For our sample we find
$$\sigma^2 = \frac{1}{nq}\,\Sigma(y_1^2) - \bar{y}^2 \qquad (2).$$
By taking the mean of $\sigma^2$ for a great number of samples we find from this, remembering that the mean of $\bar{y}^2$ equals $\sigma_{\bar{y}}^2$,
$$\overline{\sigma^2} = s^2\left\{1 - \frac{1+(q-1)r}{nq}\right\} \qquad (3).$$
When we take the value found for $\sigma^2$ as an approximation to $\overline{\sigma^2}$, we find accordingly the presumptive* value of the standard deviation of the population by the formula
$$s^2 = \sigma^2\,\frac{nq}{nq-\{1+(q-1)r\}},$$
which for $r=0$ or $q=1$ takes the form known for uncorrelated observations.

* Vide Comptes-Rendus des Trav. du Lab. Carlsberg, Vol. XIV. No. 11, 1921, Copenhagen, p. 32.

For the S.D. of $\bar{y}$ we find by introducing $s$ in (1)
$$\sigma_{\bar{y}}^2 = \sigma^2\,\frac{1+(q-1)r}{nq-\{1+(q-1)r\}}.$$

(c) The Standard Deviation of $\sigma^2$.

The S.D. of the $\sigma^2$ of our sample is found from
$$\sigma_{\sigma^2}^2 = \overline{\sigma^4} - (\overline{\sigma^2})^2,$$
where the latter term is already known.
From (2) we find for the calculation of $\overline{\sigma^4}$
$$n^2q^2\sigma^2 = (nq-1)\,\Sigma(y_1^2) - 2\,\Sigma(y_1y_2) - 2\,S(y_1y_2) \qquad (4),$$
and from this
$$n^4q^4\sigma^4 = (nq-1)^2(\Sigma(y_1^2))^2 + 4(\Sigma(y_1y_2))^2 + 4(S(y_1y_2))^2 - 4(nq-1)\,\Sigma(y_1^2)\,\Sigma(y_1y_2) - 4(nq-1)\,\Sigma(y_1^2)\,S(y_1y_2) + 8\,\Sigma(y_1y_2)\,S(y_1y_2) \qquad (5).$$
For the calculation of the mean values contained in this equation, the six products of product sums must be examined. We find
$$\begin{aligned}
(\Sigma(y_1^2))^2 &= \Sigma(y_1^4) + 2\,\Sigma(y_1^2y_2^2) + 2\,S(y_1^2y_2^2),\\
(\Sigma(y_1y_2))^2 &= \Sigma(y_1^2y_2^2) + 2\,\Sigma(y_1^2y_2y_3) + 6\,\Sigma(y_1y_2y_3y_4) + 2\,S(y_1\,\mathrm{s}\,y_2\;\,y_3\,\mathrm{s}\,y_4)^{*},\\
\Sigma(y_1^2)\,\Sigma(y_1y_2) &= \Sigma(y_1^3y_2) + \Sigma(y_1^2y_2y_3) + S(y_1^2\;\,y_2\,\mathrm{s}\,y_3)
\end{aligned} \qquad (6).$$

* In the sums $S$ all factors of a product are supposed to belong to different classes except those which are denoted by an 's' inserted between them, as belonging to the same class.

When the multiplication of products containing the factor $S(y_1y_2)$ is carried out, it is clear that we need not consider such sums of products where the product contains a factor which is uncorrelated with all the other factors of the product, because the mean values of such product sums are 0. In the products $\Sigma(y_1^2)\,S(y_1y_2)$ and $\Sigma(y_1y_2)\,S(y_1y_2)$ all the sums of products are of this kind, the factors being distributed either in two classes of which one contains 3 and the other 1 factor, or in three classes with respectively 2, 1 and 1 in each. We therefore find
$$\begin{aligned}
(S(y_1y_2))^2 &= S(y_1^2y_2^2) + 2\,S(y_1^2\;\,y_2\,\mathrm{s}\,y_3) + 4\,S(y_1\,\mathrm{s}\,y_2\;\,y_3\,\mathrm{s}\,y_4) + \alpha_1,\\
\Sigma(y_1^2)\,S(y_1y_2) &= \alpha_2,\\
\Sigma(y_1y_2)\,S(y_1y_2) &= \alpha_3
\end{aligned} \qquad (7),$$
where the mean values of the $\alpha$'s for the population are 0.

Let us denote the product moment corresponding to $y_1^m y_2^n y_3^p y_4^q$ by $\beta_{mnpq}$ if all factors belong to the same class, and in the opposite case let us attach 'd' or 's' as denoting different or same class. We find then
$$\begin{aligned}
\overline{\Sigma(y_1^4)} &= nq\,\beta_4, &
\overline{S(y_1^2y_2^2)} &= \tfrac{1}{2}n(n-1)q^2\,\beta_{22}^{(d)},\\
\overline{\Sigma(y_1^3y_2)} &= nq(q-1)\,\beta_{31}, &
\overline{S(y_1^2\;\,y_2\,\mathrm{s}\,y_3)} &= \tfrac{1}{2}n(n-1)q^2(q-1)\,\beta_{211}^{(ds)},\\
\overline{\Sigma(y_1^2y_2^2)} &= \tfrac{1}{2}nq(q-1)\,\beta_{22}, &
\overline{S(y_1\,\mathrm{s}\,y_2\;\,y_3\,\mathrm{s}\,y_4)} &= \tfrac{1}{8}n(n-1)q^2(q-1)^2\,\beta_{1111}^{(sds)},\\
\overline{\Sigma(y_1^2y_2y_3)} &= \tfrac{1}{2}nq(q-1)(q-2)\,\beta_{211}, & &\\
\overline{\Sigma(y_1y_2y_3y_4)} &= \tfrac{1}{24}nq(q-1)(q-2)(q-3)\,\beta_{1111} & &
\end{aligned} \qquad (8).$$

Till now no suppositions have been made as to the law of distribution of the $y$'s, but in the following calculation we shall suppose that the distribution is normal and the correlation between individuals of the same class normal.

For the general case of normal correlation between $n$ variables the product moments have been determined by Sverker Bergström*. Taking the standard deviations as units of the variable and denoting the correlation coefficients by $r_{12}, r_{13}, \ldots$, where for instance $r_{23}$ means the correlation coefficient between the 2nd and 3rd variable of a product moment $\beta'_{mnpq}$, he finds the following formulae for the product moments of the 4th order:
$$\begin{aligned}
\beta'_4 &= 3,\\
\beta'_{31} &= 3r_{12},\\
\beta'_{22} &= 2r_{12}^2 + 1,\\
\beta'_{211} &= 2r_{12}r_{13} + r_{23},\\
\beta'_{1111} &= r_{12}r_{34} + r_{13}r_{24} + r_{14}r_{23}
\end{aligned} \qquad (9).$$

* Vide S. Bergström: Biometrika, Vol. XII. 1918, p. 177.

Substituting our special values for the correlation coefficient we find
$$\begin{aligned}
\beta_4 &= 3s^4,\\
\beta_{31} &= 3rs^4,\\
\beta_{22} &= (2r^2+1)s^4,\\
\beta_{211} &= r(1+2r)s^4,\\
\beta_{1111} &= 3r^2s^4,
\end{aligned}
\qquad\text{and further}\qquad
\begin{aligned}
\beta_{22}^{(d)} &= s^4,\\
\beta_{211}^{(ds)} &= rs^4,\\
\beta_{1111}^{(sds)} &= r^2s^4
\end{aligned} \qquad (10).$$
We are now by means of (8) and (10) in a position to evaluate the mean values of the products put down under (6) and (7). We find
$$\begin{aligned}
\overline{(\Sigma(y_1^2))^2} &= nq\,\{nq+2+2(q-1)r^2\}\,s^4,\\
\overline{(\Sigma(y_1y_2))^2} &= \tfrac{1}{2}nq(q-1)\,\{1+2(q-2)r+[\tfrac{1}{2}nq(q-1)+q^2-3q+3]\,r^2\}\,s^4,\\
\overline{\Sigma(y_1^2)\,\Sigma(y_1y_2)} &= \tfrac{1}{2}nq(q-1)\,\{nq+4+2(q-2)r\}\,rs^4,\\
\overline{(S(y_1y_2))^2} &= \tfrac{1}{2}n(n-1)q^2\,\{1+(q-1)r\}^2\,s^4,\\
\overline{\Sigma(y_1^2)\,S(y_1y_2)} &= 0 \quad\text{and}\quad \overline{\Sigma(y_1y_2)\,S(y_1y_2)} = 0
\end{aligned} \qquad (11).$$
The calculation of $\overline{\sigma^4}$ may now be continued. We find, by substituting the above mean values in (5),
$$n^2q^2\,\overline{\sigma^4} = s^4\{n^2q^2 - 1 - 2(nq+1)(q-1)r + (q-1)[2nq-(q-1)]\,r^2\}.$$
From (3) is found
$$n^2q^2\,(\overline{\sigma^2})^2 = s^4\{n^2q^2 - 2nq + 1 - 2(nq-1)(q-1)r + (q-1)^2r^2\},$$
and accordingly
$$\sigma_{\sigma^2}^2 = \overline{\sigma^4} - (\overline{\sigma^2})^2 = \frac{2s^4}{n^2q^2}\{nq - 1 - 2(q-1)r + (q-1)(nq-q+1)\,r^2\},$$
or arranged according to powers of $nq$
$$\sigma_{\sigma^2}^2 = \frac{2s^4}{nq}\left\{1+(q-1)r^2 - \frac{1}{nq}[1+(q-1)r]^2\right\} \qquad (12).$$
This formula for the S.D. of the squared standard deviation is thus exact, supposing that the correlation be normal. For great values of $n$, or rather of $\dfrac{nq}{1+(q-1)r}$, we may consider the S.D. of $\sigma^2$ a differential, so that
$$\sigma^2 + \delta\sigma^2 = (\sigma+\delta\sigma)^2 = \sigma^2 + 2\sigma\,\delta\sigma.$$
From $\delta\sigma^2 = 2\sigma\,\delta\sigma$ we find by squaring and taking mean value for a great number of samples
$$\sigma_{\sigma^2}^2 = 4\sigma^2\sigma_\sigma^2,$$
and by substituting the value of $\sigma_{\sigma^2}^2$, omitting the last term,
$$\sigma_\sigma^2 = \frac{s^4}{2nq\,\sigma^2}\{1+(q-1)r^2\},$$
or, as with the accuracy obtainable we have $s=\sigma$, it follows that
$$\sigma_\sigma^2 = \frac{\sigma^2}{2nq}\{1+(q-1)r^2\}.$$
We notice when comparing this formula with (1) that only for $r=1$ and $r=0$ does the rule $\sigma_\sigma^2 = \tfrac{1}{2}\sigma_{\bar{y}}^2$ hold good.

The fraternal correlation coefficient $\rho$ for the present sample is, when all the $\tfrac{1}{2}nq(q-1)$ pairs of siblings are used for the calculation, defined by
$$\rho = \frac{\Pi}{\sigma^2}, \quad\text{where}\quad \Pi = \frac{\Sigma(y_1y_2)}{\tfrac{1}{2}nq(q-1)} - \bar{y}^2 \qquad (13).$$
To determine the S.D. of $\rho$ one requires, in addition to $\sigma_{\sigma^2}$, the S.D. of $\Pi$ and the product moment for $\Pi$ and $\sigma^2$.

(d) Mean Value and Standard Deviation of the Product Moment $\Pi$.

Taking mean value of (13) for a great number of samples we find, as
$$\overline{\Sigma(y_1y_2)} = \tfrac{1}{2}nq(q-1)\,rs^2 \quad\text{and}\quad \overline{\bar{y}^2} = \sigma_{\bar{y}}^2,$$
$$\overline{\Pi} = s^2\left\{r - \frac{1}{nq}[1+(q-1)r]\right\} \qquad (14).$$
For calculating the mean value of $\Pi^2$, (13) may be written
$$n^2q^2(q-1)\,\Pi = -(q-1)\,\Sigma(y_1^2) + 2(nq-q+1)\,\Sigma(y_1y_2) - 2(q-1)\,S(y_1y_2) \qquad (15),$$
from which follows
$$n^4q^4(q-1)^2\,\overline{\Pi^2} = (q-1)^2\,\overline{(\Sigma(y_1^2))^2} + 4(nq-q+1)^2\,\overline{(\Sigma(y_1y_2))^2} - 4(q-1)(nq-q+1)\,\overline{\Sigma(y_1^2)\,\Sigma(y_1y_2)} + 4(q-1)^2\,\overline{(S(y_1y_2))^2},$$
the mean values of the two remaining products being 0 according to (11). Substituting the rest of the values from (11) we find
$$(q-1)\,n^2q^2\,\overline{\Pi^2} = s^4\{2nq-(q-1) + 2r\,[nq(q-3)-(q-1)^2] + r^2\,[n^2q^2(q-1) - 2nq(q-2) - (q-1)^3]\} \qquad (16),$$
and by squaring (14) is found
$$(q-1)\,n^2q^2\,(\overline{\Pi})^2 = s^4\{q-1 - 2r\,[nq(q-1)-(q-1)^2] + r^2\,[n^2q^2(q-1) - 2nq(q-1)^2 + (q-1)^3]\}.$$
By subtraction of this equation from (16) we arrive at
$$\sigma_\Pi^2 = \frac{2s^4}{n^2q^2(q-1)}\{nq-(q-1) + 2r\,[nq(q-2)-(q-1)^2] + r^2\,[nq(q^2-3q+3)-(q-1)^3]\},$$
or arranged according to $nq$
$$\sigma_\Pi^2 = \frac{2s^4}{nq(q-1)}\left\{1+2r(q-2)+r^2(q^2-3q+3) - \frac{q-1}{nq}[1+r(q-1)]^2\right\},$$
which may also be written
$$\sigma_\Pi^2 = \frac{2s^4}{nq(q-1)}\left\{[1+r(q-2)]^2 + r^2(q-1) - \frac{q-1}{nq}[1+r(q-1)]^2\right\} \qquad (17).$$

(e) The Product Moment, $\Pi_{\Pi\sigma^2}$, of $\Pi$ and $\sigma^2$.

By multiplication of (4) and (15) and taking mean value for a great number of samples we find for the mean value of the product $\Pi\sigma^2$
$$n^4q^4(q-1)\,\overline{\Pi\sigma^2} = -(nq-1)(q-1)\,\overline{(\Sigma(y_1^2))^2} - 4(nq-q+1)\,\overline{(\Sigma(y_1y_2))^2} + 2\{n^2q^2-nq^2+2(q-1)\}\,\overline{\Sigma(y_1^2)\,\Sigma(y_1y_2)} + 4(q-1)\,\overline{(S(y_1y_2))^2},$$
the mean values of the two remaining products being zero according to (11). Introducing the rest of the mean values from (11), we have
$$n^2q^2\,\overline{\Pi\sigma^2} = s^4\{-nq-1 + r\,[n^2q^2 - nq(q-4) - 2(q-1)] + r^2\,[nq(q-3)-(q-1)^2]\}.$$
From (3) and (14) is found
$$n^2q^2\,\overline{\Pi}\cdot\overline{\sigma^2} = s^4\{-nq+1 + r\,[n^2q^2 - nq^2 + 2(q-1)] + r^2\,[-nq(q-1)+(q-1)^2]\}.$$
As $\Pi_{\Pi\sigma^2} = \overline{\Pi\sigma^2} - \overline{\Pi}\cdot\overline{\sigma^2}$, it follows from the two foregoing equations that
$$\Pi_{\Pi\sigma^2} = \frac{2s^4}{n^2q^2}\{-1 + 2r\,[nq-(q-1)] + r^2\,[nq(q-2)-(q-1)^2]\},$$
or
$$\Pi_{\Pi\sigma^2} = \frac{2s^4}{nq}\left\{r\,[2+(q-2)r] - \frac{1}{nq}[1+(q-1)r]^2\right\} \qquad (18).$$

(f) The Standard Deviation of the Fraternal Correlation Coefficient.

If the sample is great in proportion to $(q-1)r$ the errors of $\Pi$ and $\sigma^2$ can be treated as differentials, and we have for the correlation coefficient calculated from a sample
$$\rho = \frac{\overline{\Pi}+\delta\Pi}{\overline{\sigma^2}+\delta\sigma^2} = \frac{\overline{\Pi}}{\overline{\sigma^2}}\left\{1 + \frac{\delta\Pi}{\overline{\Pi}} - \frac{\delta\sigma^2}{\overline{\sigma^2}}\right\} \quad\text{and}\quad \bar{\rho} = \frac{\overline{\Pi}}{\overline{\sigma^2}},$$
and therefore, neglecting the term containing $\dfrac{1}{nq}$ which according to these suppositions cannot be evaluated,
$$\bar{\rho} = r.$$
From
$$\delta\rho = \frac{\overline{\Pi}}{\overline{\sigma^2}}\left\{\frac{\delta\Pi}{\overline{\Pi}} - \frac{\delta\sigma^2}{\overline{\sigma^2}}\right\}$$
we find by squaring and forming mean value
$$\sigma_\rho^2 = \frac{(\overline{\Pi})^2}{(\overline{\sigma^2})^2}\left\{\frac{\sigma_\Pi^2}{(\overline{\Pi})^2} + \frac{\sigma_{\sigma^2}^2}{(\overline{\sigma^2})^2} - \frac{2\,\Pi_{\Pi\sigma^2}}{\overline{\Pi}\cdot\overline{\sigma^2}}\right\}.$$
When the values from (3), (12), (14), (17) and (18) are introduced in this formula and the terms containing the higher power of $\dfrac{1}{nq}$ are neglected, we get
$$\sigma_\rho^2 = \frac{2}{nq(q-1)}\{[1+r(q-2)]^2 + r^2(q-1)\} + \frac{2r^2}{nq}\{1+(q-1)r^2\} - \frac{4r^2}{nq}\{2+(q-2)r\},$$
from which is found
$$\sigma_\rho^2 = \frac{2}{nq(q-1)}\{1+r(q-2)-r^2(q-1)\}^2$$
and
$$\sigma_\rho = \frac{(1-r)\{1+(q-1)r\}}{\sqrt{\tfrac{1}{2}nq(q-1)}} \qquad (19).$$
For $q=2$ this formula coincides with the usual formula for the standard deviation of a correlation coefficient calculated from two series of values of two variables corresponding in pairs, the values of each series being mutually uncorrelated.

(g) Numerical Evaluation of the Formula for the S.D. of a Fraternal Correlation Coefficient.

The number, $N$, of observed pairs of observations being equal to $\tfrac{1}{2}nq(q-1)$, the formula (19) may also be written
$$\sigma_\rho = \frac{1}{\sqrt{N}}\,(1-r)\{1+(q-1)r\}.$$
Comparing materials of observations with different numbers of siblings $q$, we see that for the calculation of fraternal correlation, information of each available pair of siblings has a value inversely proportional to $\{1+(q-1)r\}^2$. The ratio
$$v_q = \left(\frac{1+r}{1+(q-1)r}\right)^2$$
serves as a measure for the value which must be attributed to information of an observed pair among $q$ siblings, supposing that all of the $\tfrac{1}{2}nq(q-1)$ pairs of siblings are used for the calculation, and supposing that the value of information of a pair of siblings for $q=2$ is put equal to 1. On the other hand $\dfrac{1}{v_q}$ indicates the ratio between the numbers of pairs of siblings which are required for obtaining the same accuracy in the correlation coefficient in the case of $q$ and in the case of two siblings from each family. Table I gives the numerical values of $v_q$ for different values of $r$ and $q$.

TABLE I.
Values of $v_q$.

 q    r=0.1   0.2     0.3     0.4     0.5     0.6     0.7     0.8     0.9
 2    1.000   1.000   1.000   1.000   1.000   1.000   1.000   1.000   1.000
 3    .840    .735    .660    .605    .563    .529    .502    .479    .460
 4    .716    .563    .468    .406    .360    .327    .301    .280    .264
 5    .617    .444    .349    .290    .250    .221    .200    .184    .171
 6    .538    .360    .270    .218    .184    .160    .143    .130    .119
 7    .473    .298    .216    .170    .141    .121    .107    .096    .088
 8    .419    .250    .176    .136    .111    .095    .083    .074    .068
 9    .373    .213    .146    .111    .090    .076    .066    .059    .054
10    .335    .184    .123    .093    .074    .063    .054    .048    .044

We notice that for values of $r$ somewhat greater than 0.5, such as are usually found for mammals, $v_3$ has already decreased to about $\tfrac{1}{2}$ and $v_4$ to about $\tfrac{1}{3}$. By giving the same weight to each pair of siblings when forming fraternal correlation tables from a material consisting of fraternities of different size, we therefore fail very largely to pay due regard to the observations. With material under consideration, as for example anthropometric data, which according to its nature consists of small groups of siblings of varying number, and which is not so numerous that we can afford to omit observations from the calculation to make $q$ constant for all fraternities, the rational proceeding must be to sort the material according to the number of siblings and calculate the correlation coefficient of each group separately. It is then possible to effect considerable saving of time and labour in the investigation of correlation by avoiding the forming of fraternal correlation tables and using instead the formula*
$$r = \frac{q\,\dfrac{\sigma_q^2}{\sigma^2} - 1}{q-1},$$
where $\sigma_q$ is the directly calculated S.D. for mean values of fraternities. The results found by the formula are identical with those of the defining formula, so that the only objection to this method of calculation is the lack of opportunity to examine the shape of the regression curve.
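The equivalence of the two ways of calculating the fraternal correlation, and the quantities $v_q$ and $\sigma_\rho$ of formula (19), can be checked numerically. The following is only a minimal sketch in Python: the function names and the small data set are hypothetical illustrations, not part of the memoir, and equal fraternity sizes are assumed throughout.

```python
import itertools
import math


def fraternal_r_pairwise(families):
    """Defining formula (13): rho = Pi / sigma^2, using every sibling pair."""
    values = [y for fam in families for y in fam]
    n, q = len(families), len(families[0])
    mean = sum(values) / (n * q)
    var = sum(y * y for y in values) / (n * q) - mean ** 2
    pair_sum = sum(a * b for fam in families
                   for a, b in itertools.combinations(fam, 2))
    product_moment = pair_sum / (0.5 * n * q * (q - 1)) - mean ** 2
    return product_moment / var


def fraternal_r_from_means(families):
    """Shortcut: r = (q * sigma_q^2 / sigma^2 - 1) / (q - 1)."""
    values = [y for fam in families for y in fam]
    n, q = len(families), len(families[0])
    mean = sum(values) / (n * q)
    var = sum(y * y for y in values) / (n * q) - mean ** 2
    fam_means = [sum(fam) / q for fam in families]
    var_of_means = sum(m * m for m in fam_means) / n - mean ** 2
    return (q * var_of_means / var - 1) / (q - 1)


def sd_fraternal_r(r, n, q):
    """Formula (19): S.D. of the fraternal correlation for n fraternities of q."""
    return (1 - r) * (1 + (q - 1) * r) / math.sqrt(0.5 * n * q * (q - 1))


def pair_value(r, q):
    """v_q: value of one sibling pair relative to the case q = 2."""
    return ((1 + r) / (1 + (q - 1) * r)) ** 2


# Hypothetical toy data: 4 fraternities of 3 siblings each.
families = [[1, 2, 3], [2, 4, 4], [5, 5, 7], [0, 1, 1]]
print(fraternal_r_pairwise(families), fraternal_r_from_means(families))
print(pair_value(0.5, 3))
```

The two estimators should agree to rounding error on any data with a constant number of siblings per fraternity, which is the identity asserted in the text; `pair_value` gives the quantity tabulated in Table I.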
From the correlation coefficients found for different values of $q$†, it is finally possible, with knowledge of their S.D.'s, to calculate a mean value of the fraternal correlation coefficient and its S.D.

* Vide K. Smith, Comptes-Rendus des Trav. du Lab. Carlsberg, Vol. XIV. No. 11, 1921, p. 8, where the formula is deduced for the special case $q=10$.

† In the memoir quoted it is shewn (p. 29) that the above formula may also be written
$$r = 1 - \frac{q}{q-1}\,\frac{\sigma_q'^2}{\sigma^2},$$
$\sigma_q'^2$ being the squared S.D. inside fraternities of $q$ siblings and being calculated as a mean of such values obtained from each of the $n$ fraternities. We may here instead of $\sigma_q'$ introduce the presumptive S.D. inside a fraternity, $\sigma'_\infty$, that is the S.D. we expect to find in fraternities consisting of a great number of siblings. The relation is $\sigma_\infty'^2 = \dfrac{q}{q-1}\,\sigma_q'^2$, so that we find $r = 1 - \dfrac{\sigma_\infty'^2}{\sigma^2}$, which shews that the value of $r$ arrived at must be expected to be independent of $q$.

In investigations of inheritance with animals with numerous offspring, where a great number of siblings are available, we have to face the problem of deciding what number of siblings it is profitable to employ for the investigation. We shall state the problem provisionally as follows: with which value of $q$ do we, provided the number of examined offspring individuals ($nq$) be fixed, obtain the most accurately determined fraternal correlation coefficient? Or in other words, for which value of $q$ is
$$\frac{1}{q-1}\{1+r(q-1)\}^2$$
a minimum? The condition of minimum is
$$q = 1 + \frac{1}{r}.$$
Corresponding to the values $\tfrac{1}{4}$, $\tfrac{1}{3}$, $\tfrac{1}{2}$ and $\tfrac{3}{5}$ for $r$ the values of $q$ are 5, 4, 3 and $2\tfrac{2}{3}$.

In examining the question of the most profitable number of siblings, attention must also be paid to the determination of the parental correlation, and the question will therefore be further discussed in the following section.
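The minimum condition $q = 1 + 1/r$ can be illustrated by evaluating formula (19) with the total number of individuals $nq$ held fixed. The sketch below is hypothetical helper code (the function names and the default of 1000 individuals are assumptions chosen to match the setting used for Table II); it simply scans integer values of $q$.

```python
import math


def sd_fixed_individuals(r, q, individuals=1000):
    """Formula (19) with n = individuals / q fraternities of q siblings."""
    n = individuals / q
    return (1 - r) * (1 + (q - 1) * r) / math.sqrt(0.5 * n * q * (q - 1))


def best_q(r):
    """Continuous minimiser of {1 + r(q-1)}^2 / (q-1): q = 1 + 1/r."""
    return 1 + 1 / r


def best_integer_q(r, q_max=50, individuals=1000):
    """Integer q in 2..q_max giving the smallest S.D. for fixed nq."""
    return min(range(2, q_max + 1),
               key=lambda q: sd_fixed_individuals(r, q, individuals))


for r in (0.25, 0.5):
    print(r, best_q(r), best_integer_q(r),
          round(sd_fixed_individuals(r, 2), 4))
```

For $r = \tfrac{1}{4}$ and $r = \tfrac{1}{2}$ the scan should return the same integers as the closed-form condition, since $1 + 1/r$ is then itself an integer; for other values of $r$ the best integer is one of the two neighbours of $1 + 1/r$.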
Besides, it cannot be left out of consideration that, as a rule, it will be easier to examine the same number of individuals distributed among a smaller than among a greater number of fraternities. When regard only is had to fraternal correlation, the values of $q$ obtained above must therefore be considered the minimum values.

For a more detailed illustration of the variation of the S.D. of the fraternal correlation coefficient with the number of siblings, Table II has been calculated. The table gives the values of the S.D. for 1000 observations distributed among from 500 to 100 fraternities, the sizes of which therefore vary from 2 to 10.

TABLE II.
The Standard Deviation of a Fraternal Correlation Coefficient calculated from 1000 observed Individuals.

 q    r=1/4    r=1/3    r=1/2    r=3/5
 2    .0419    .0398    .0335    .0286
 3    .0356    .0351    .0316    .0278
 4    .0339    .0344    .0323    .0289
 5    .0335    .0348    .0335    .0304
 6    .0337    .0356    .0350    .0320
 7    .0342    .0365    .0365    .0336
 8    .0349    .0376    .0380    .0352
 9    .0356    .0387    .0395    .0367
10    .0363    .0398    .0410    .0382

The table does not show a rapid increase of the S.D. when the number of siblings increases beyond the most profitable number found above. But a comparison of the values for $q=5$ and for $q=10$ still shows that the latter are respectively 8%, 14%, 22% and 25% greater than the former, so that when there are 10 siblings in each fraternity respectively 18%, 31%, 50% and 58% more individuals are required to obtain the same accuracy than when there are only 5 siblings from each family.

(h) Application of the Formula to previous Calculations of Correlation.

In an investigation* concerning the characters number of vertebrae ('Vert.'), number of rays in the pectoral fins ('Pd.' and 'Ps.') and number of pigment spots ('Pigm.') in Zoarces viviparus from the station Nakkehage in Isefjord, Denmark, the fraternal correlation coefficient was calculated for 6 (for pigment spots only 5) samples from different years consisting of fraternities of 10 siblings. In this case the probable error of the fraternal correlation coefficient is according to (19)
$$\mathrm{P.E.}(r) = \frac{0.67449}{\sqrt{45n}}\,(1-r)(1+9r).$$

* K. Smith, Comptes-Rendus des Trav. du Lab. Carlsberg, Vol. XIV. No. 11, 1921.

Table III gives for each sample the values of $n$, $r$ and P.E.($r$), as well as $\bar{r}$ for all the samples, each weighted according to the S.D.

TABLE III.
Fraternal Correlation.

Year when            Vert.                    Pd.                      Pigm.
sample taken         n     r ± P.E.           n     r ± P.E.           n     r ± P.E.
1914                 138   0.4590 ± .0238     132   0.3169 ± .0231     -     -
1915                 168   0.4693 ± .0215     174   0.4196 ± .0211     75    0.3175 ± .0306
1916                 123   0.5108 ± .0248     122   0.3985 ± .0251     87    0.3418 ± .0289
1917                 177   0.4715 ± .0209     176   0.3634 ± .0206     127   0.4112 ± .0247
1918                 153   0.4801 ± .0225     156   0.3329 ± .0215     113   0.3074 ± .0247
1919                 98    0.4066 ± .0281     98    0.2893 ± .0260     86    0.3722 ± .0296
From total samples         0.4689 ± .0095           0.3564 ± .0092           0.3517 ± .0122

For the mean values of $r$ probable errors have previously been calculated based on the 6 or 5 values found. These probable errors had for Vert., Pd. and Pigm. respectively the values 0.0094, 0.0137 and 0.0128, which for Vert. and Pigm. agree extremely well with the theoretical values now found, while for Pd. the error had been estimated somewhat too great.

II. PARENTAL CORRELATION.

For investigation of parental correlation we have a sample consisting as above of $nq$ offspring values $y_1, y_2, y_3, \ldots, y_{nq}$ distributed in $n$ classes with $q$ in each, and, in addition, containing for each class an observed parental value $x$. We aim at finding the correlation between $x$ and $y$'s of the same class.

Let the parental correlation be $r_p$ and the S.D. for $x$'s $s'$ in the population which we may imagine that the sample represents, and let us choose the mean value of the population as zero point for $x$. The parental correlation coefficient is from the sample determined by
$$\rho_p = \frac{\Pi_{xy}}{\sigma\sigma'},$$
where $\sigma'$ is the S.D.
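of $x$ calculated from the sample, and $\Pi_{xy}$ is the product moment for $x$ and $y$ determined by
$$\Pi_{xy} = \frac{1}{nq}\,\Sigma(x_1y_1) - \bar{x}\bar{y} \qquad (20).$$
As in the previous section $\Sigma$ means a sum of products each of which consists of factors from the same class. In the sums $S$ each product contains factors from at least two classes, and when two factors belong to the same class it is indicated by an 's' inserted between them.

For evaluation of the standard deviation of $\rho_p$, the S.D.'s of $\Pi_{xy}$ and $\sigma'^2$ are required, as well as the product moments for each pair of the three functions $\Pi_{xy}$, $\sigma^2$ and $\sigma'^2$.

(a) Mean Value and Standard Deviation of the Product Moment $\Pi_{xy}$.

The equation (20) may also be written
$$\Pi_{xy} = \frac{n-1}{n^2q}\,\Sigma(x_1y_1) - \frac{1}{n^2q}\,S(x_1y_1) \qquad (21).$$
By taking the mean value for a great number of samples we therefore find
$$\overline{\Pi_{xy}} = \frac{n-1}{n}\,r_p\,ss' \qquad (22).$$
From (21) we find by squaring and taking mean value
$$n^4q^2\,\overline{\Pi_{xy}^2} = (n-1)^2\,\overline{(\Sigma(x_1y_1))^2} + \overline{(S(x_1y_1))^2} - 2(n-1)\,\overline{\Sigma(x_1y_1)\,S(x_1y_1)} \qquad (23).$$
Together with the determination of the mean values occurring here, we shall determine the other mean values of products required for the evaluation of $\sigma_{\rho_p}$. They are such as arise from multiplication of $\Sigma(x_1y_1)$ and $S(x_1y_1)$ with each of the two groups $\Sigma(y_1^2)$, $\Sigma(y_1y_2)$, $S(y_1y_2)$ and $\Sigma(x_1^2)$, $S(x_1x_2)$, and also those which contain a factor of each of the two latter groups. As in the foregoing section, we need, however, not consider products of a $\Sigma$ and an $S$, because such products may be developed into sums of products all containing a factor uncorrelated with all the other factors of the product, from which it follows that the mean value for a great number of samples is zero for each of these sums of products. It remains to determine the following products:
$$\begin{aligned}
(\Sigma(x_1y_1))^2 &= \Sigma(x_1^2y_1^2) + 2\,\Sigma(x_1^2y_1y_2) + 2\,S(x_1\,\mathrm{s}\,y_1\;\,x_2\,\mathrm{s}\,y_2),\\
(S(x_1y_1))^2 &= S(x_1^2y_2^2) + 2\,S(x_1^2\;\,y_2\,\mathrm{s}\,y_3) + 2\,S(x_1y_2\,x_2y_1)^{*} + \epsilon_1,\\
\Sigma(x_1y_1)\,\Sigma(y_1^2) &= \Sigma(x_1y_1^3) + \Sigma(x_1y_1^2y_2) + S(x_1\,\mathrm{s}\,y_1\;\,y_2^2),\\
\Sigma(x_1y_1)\,\Sigma(y_1y_2) &= \Sigma(x_1y_1^2y_2) + 3\,\Sigma(x_1y_1y_2y_3) + S(x_1\,\mathrm{s}\,y_1\;\,y_2\,\mathrm{s}\,y_3),\\
S(x_1y_1)\,S(y_1y_2) &= S(x_1\,\mathrm{s}\,y_1\;\,y_2^2) + 2\,S(x_1\,\mathrm{s}\,y_1\;\,y_2\,\mathrm{s}\,y_3) + \epsilon_2,\\
\Sigma(x_1y_1)\,\Sigma(x_1^2) &= \Sigma(x_1^3y_1) + S(x_1\,\mathrm{s}\,y_1\;\,x_2^2),\\
S(x_1y_1)\,S(x_1x_2) &= S(x_1\,\mathrm{s}\,y_1\;\,x_2^2) + \epsilon_3,\\
\Sigma(x_1^2)\,\Sigma(y_1^2) &= \Sigma(x_1^2y_1^2) + S(x_1^2y_2^2),\\
\Sigma(x_1^2)\,\Sigma(y_1y_2) &= \Sigma(x_1^2y_1y_2) + S(x_1^2\;\,y_2\,\mathrm{s}\,y_3),\\
S(x_1x_2)\,S(y_1y_2) &= S(x_1\,\mathrm{s}\,y_1\;\,x_2\,\mathrm{s}\,y_2) + \epsilon_4
\end{aligned} \qquad (24).$$
$\epsilon_1$, $\epsilon_2$, $\epsilon_3$ and $\epsilon_4$ are sums, the means of which are 0.

* In this single case the notation fails, as it ought to be indicated that the first $x$ and the last $y$ belong to the same class.

The product moments are, as in the previous section, denoted by $\beta$, and the indices concerning $x$'s are placed in front of $\beta$; for instance the mean value of $x_1^3y_1$ is denoted by ${}_{3}\beta_{1}$. We thus find for the mean values of the sums occurring in (24):
$$\begin{aligned}
\overline{\Sigma(x_1^3y_1)} &= nq\;{}_{3}\beta_{1}, &
\overline{S(x_1\,\mathrm{s}\,y_1\;\,x_2^2)} &= n(n-1)q\;{}_{21}\beta_{1}^{(ds)},\\
\overline{\Sigma(x_1^2y_1^2)} &= nq\;{}_{2}\beta_{2}, &
\overline{S(x_1^2y_2^2)} &= n(n-1)q\;{}_{2}\beta_{2}^{(d)},\\
\overline{\Sigma(x_1^2y_1y_2)} &= \tfrac{1}{2}nq(q-1)\;{}_{2}\beta_{11}, &
\overline{S(x_1^2\;\,y_2\,\mathrm{s}\,y_3)} &= \tfrac{1}{2}n(n-1)q(q-1)\;{}_{2}\beta_{11}^{(ds)},\\
\overline{\Sigma(x_1y_1^3)} &= nq\;{}_{1}\beta_{3}, &
\overline{S(x_1\,\mathrm{s}\,y_1\;\,x_2\,\mathrm{s}\,y_2)} &= \tfrac{1}{2}n(n-1)q^2\;{}_{11}\beta_{11}^{(sds)},\\
\overline{\Sigma(x_1y_1^2y_2)} &= nq(q-1)\;{}_{1}\beta_{21}, &
\overline{S(x_1\,\mathrm{s}\,y_1\;\,y_2^2)} &= n(n-1)q^2\;{}_{1}\beta_{12}^{(sd)},\\
\overline{\Sigma(x_1y_1y_2y_3)} &= \tfrac{1}{6}nq(q-1)(q-2)\;{}_{1}\beta_{111}, &
\overline{S(x_1\,\mathrm{s}\,y_1\;\,y_2\,\mathrm{s}\,y_3)} &= \tfrac{1}{2}n(n-1)q^2(q-1)\;{}_{1}\beta_{111}^{(sds)}
\end{aligned} \qquad (25).$$
The sum $S(x_1y_2\,x_2y_1)$ has the same mean value as $S(x_1\,\mathrm{s}\,y_1\;\,x_2\,\mathrm{s}\,y_2)$.

From Bergström's formulae (9) we find, when introducing $r_p$, $r$ and 0 for the correlation coefficients and remembering that in his formulae $s$ and $s'$ are taken as units for $y$ and $x$:
$$\begin{aligned}
{}_{3}\beta_{1} &= 3r_p\,ss'^3, &
{}_{21}\beta_{1}^{(ds)} &= r_p\,ss'^3,\\
{}_{2}\beta_{2} &= (2r_p^2+1)\,s^2s'^2, &
{}_{2}\beta_{2}^{(d)} &= s^2s'^2,\\
{}_{2}\beta_{11} &= (2r_p^2+r)\,s^2s'^2, &
{}_{2}\beta_{11}^{(ds)} &= r\,s^2s'^2,\\
{}_{1}\beta_{3} &= 3r_p\,s^3s', &
{}_{11}\beta_{11}^{(sds)} &= r_p^2\,s^2s'^2,\\
{}_{1}\beta_{21} &= r_p(1+2r)\,s^3s', &
{}_{1}\beta_{12}^{(sd)} &= r_p\,s^3s',\\
{}_{1}\beta_{111} &= 3rr_p\,s^3s', &
{}_{1}\beta_{111}^{(sds)} &= rr_p\,s^3s'
\end{aligned} \qquad (26).$$
Applying (25) and (26) we find for the mean values of the products under (24) the following values:
$$\begin{aligned}
\overline{(\Sigma(x_1y_1))^2} &= nq\,\{q(n+1)r_p^2 + 1 + (q-1)r\}\,s^2s'^2,\\
\overline{(S(x_1y_1))^2} &= nq(n-1)\,\{qr_p^2 + 1 + (q-1)r\}\,s^2s'^2,\\
\overline{\Sigma(x_1y_1)\,\Sigma(y_1^2)} &= nq\,\{nq+2+2(q-1)r\}\,r_p\,s^3s',\\
\overline{\Sigma(x_1y_1)\,\Sigma(y_1y_2)} &= \tfrac{1}{2}nq(q-1)\,\{2+(nq+2q-2)r\}\,r_p\,s^3s',\\
\overline{S(x_1y_1)\,S(y_1y_2)} &= n(n-1)q^2\,\{1+(q-1)r\}\,r_p\,s^3s',\\
\overline{\Sigma(x_1y_1)\,\Sigma(x_1^2)} &= nq(n+2)\,r_p\,ss'^3,\\
\overline{S(x_1y_1)\,S(x_1x_2)} &= n(n-1)q\,r_p\,ss'^3,\\
\overline{\Sigma(x_1^2)\,\Sigma(y_1^2)} &= nq\,(n+2r_p^2)\,s^2s'^2,\\
\overline{\Sigma(x_1^2)\,\Sigma(y_1y_2)} &= \tfrac{1}{2}nq(q-1)\,[2r_p^2+nr]\,s^2s'^2,\\
\overline{S(x_1x_2)\,S(y_1y_2)} &= \tfrac{1}{2}n(n-1)q^2\,r_p^2\,s^2s'^2
\end{aligned} \qquad (27).$$
We may now continue the calculation of $\overline{\Pi_{xy}^2}$. Introducing the mean values in (23), we get
$$n^2q\,\overline{\Pi_{xy}^2} = (n-1)\,\{nqr_p^2 + 1 + (q-1)r\}\,s^2s'^2.$$
From (22) we find
$$n^2q\,(\overline{\Pi_{xy}})^2 = (n-1)^2\,qr_p^2\,s^2s'^2,$$
and when this equation is subtracted from the foregoing
$$\sigma_{\Pi_{xy}}^2 = \overline{\Pi_{xy}^2} - (\overline{\Pi_{xy}})^2 = \frac{n-1}{n^2q}\,\{qr_p^2 + 1 + r(q-1)\}\,s^2s'^2 \qquad (28).$$

(b) The Product Moment, $\Pi_{\Pi_{xy}\sigma^2}$, of $\Pi_{xy}$ and $\sigma^2$.

Multiplication of (4) and (21) gives
$$n^4q^3\,\Pi_{xy}\sigma^2 = (nq-1)(n-1)\,\Sigma(y_1^2)\,\Sigma(x_1y_1) - 2(n-1)\,\Sigma(x_1y_1)\,\Sigma(y_1y_2) + 2\,S(x_1y_1)\,S(y_1y_2) + \gamma_1,$$
where $\gamma_1$ consists of terms $S \times \Sigma$, the mean values of which are zero. Taking the mean value and applying (27) we therefore find
$$n^3q^2\,\overline{\Pi_{xy}\sigma^2} = (n-1)\,r_p\,\{nq(nq+1) + r\,nq(q-1)\}\,s^3s'.$$
For $\overline{\Pi_{xy}}\cdot\overline{\sigma^2}$ we get from (3) and (22)
$$n^3q^2\,\overline{\Pi_{xy}}\cdot\overline{\sigma^2} = (n-1)\,r_p\,\{nq(nq-1) - r\,nq(q-1)\}\,s^3s',$$
and accordingly from the two latter equations
$$\Pi_{\Pi_{xy}\sigma^2} = \overline{\Pi_{xy}\sigma^2} - \overline{\Pi_{xy}}\cdot\overline{\sigma^2} = \frac{2(n-1)}{n^2q}\,r_p\,\{1+r(q-1)\}\,s^3s' \qquad (29).$$

(c) The Product Moment, $\Pi_{\Pi_{xy}\sigma'^2}$, of $\Pi_{xy}$ and $\sigma'^2$.

For $\sigma'^2$ we obtain, from the formulae (4), (3) and (12), which concern $\sigma^2$, by substituting $x$ for $y$ and putting $q$ equal to 1:
$$\sigma'^2 = \frac{n-1}{n^2}\,\Sigma(x_1^2) - \frac{2}{n^2}\,S(x_1x_2) \qquad (30),$$
$$\overline{\sigma'^2} = \frac{n-1}{n}\,s'^2 \qquad (31),$$
and
$$\sigma_{\sigma'^2}^2 = \frac{2(n-1)}{n^2}\,s'^4 \qquad (32).$$
By multiplication of (21) and (30) we get
$$n^4q\,\Pi_{xy}\sigma'^2 = (n-1)^2\,\Sigma(x_1y_1)\,\Sigma(x_1^2) + 2\,S(x_1y_1)\,S(x_1x_2) + \gamma_2,$$
for which the mean value by application of (27) is found to be
$$n^2\,\overline{\Pi_{xy}\sigma'^2} = (n^2-1)\,r_p\,ss'^3.$$
From (22) and (31) follows
$$n^2\,\overline{\Pi_{xy}}\cdot\overline{\sigma'^2} = (n-1)^2\,r_p\,ss'^3,$$
so that
$$\Pi_{\Pi_{xy}\sigma'^2} = \overline{\Pi_{xy}\sigma'^2} - \overline{\Pi_{xy}}\cdot\overline{\sigma'^2} = \frac{2(n-1)}{n^2}\,r_p\,ss'^3 \qquad (33).$$

(d) The Product Moment, $\Pi_{\sigma^2\sigma'^2}$, of $\sigma^2$ and $\sigma'^2$.

For the product $\sigma^2\sigma'^2$ we find by multiplying (4) and (30):
$$n^4q^2\,\sigma^2\sigma'^2 = (nq-1)(n-1)\,\Sigma(x_1^2)\,\Sigma(y_1^2) - 2(n-1)\,\Sigma(x_1^2)\,\Sigma(y_1y_2) + 4\,S(x_1x_2)\,S(y_1y_2) + \gamma_3.$$
The mean value of $\gamma_3$ is zero, and therefore by taking the mean value and using (27) we get
$$n^2q\,\overline{\sigma^2\sigma'^2} = (n-1)\,\{nq - 1 + 2qr_p^2 - (q-1)r\}\,s^2s'^2,$$
and when from this is subtracted
$$n^2q\,\overline{\sigma^2}\cdot\overline{\sigma'^2} = (n-1)\,\{nq - 1 - (q-1)r\}\,s^2s'^2,$$
we arrive at
$$\Pi_{\sigma^2\sigma'^2} = \frac{2(n-1)}{n^2}\,r_p^2\,s^2s'^2 \qquad (34).$$

(e) The Standard Deviation of the Parental Correlation Coefficient.

For the logarithm of the parental correlation coefficient calculated from our sample we have
$$\log\rho_p = \log\Pi_{xy} - \tfrac{1}{2}\log\sigma^2 - \tfrac{1}{2}\log\sigma'^2.$$
For great values of $n$, which allow us to treat the deviations of $\sigma'^2$, $\sigma^2$ and $\Pi_{xy}$ from their mean values as differentials, it follows from the above equation by differentiation that
$$\frac{\delta\rho_p}{\rho_p} = \frac{\delta\Pi_{xy}}{\Pi_{xy}} - \frac{\delta\sigma^2}{2\sigma^2} - \frac{\delta\sigma'^2}{2\sigma'^2} \qquad (35).$$
With the accuracy here employed, which excludes the determination of terms containing the higher power of $\dfrac{1}{n}$, we have
$$\bar{\rho}_p = r_p.$$
From (35) we find by squaring and taking the mean value for a great number of samples
$$\sigma_{\rho_p}^2 = \bar{\rho}_p^{\,2}\left\{\frac{\sigma_{\Pi_{xy}}^2}{(\overline{\Pi_{xy}})^2} + \frac{\sigma_{\sigma^2}^2}{4(\overline{\sigma^2})^2} + \frac{\sigma_{\sigma'^2}^2}{4(\overline{\sigma'^2})^2} - \frac{\Pi_{\Pi_{xy}\sigma^2}}{\overline{\Pi_{xy}}\cdot\overline{\sigma^2}} - \frac{\Pi_{\Pi_{xy}\sigma'^2}}{\overline{\Pi_{xy}}\cdot\overline{\sigma'^2}} + \frac{\Pi_{\sigma^2\sigma'^2}}{2\,\overline{\sigma^2}\cdot\overline{\sigma'^2}}\right\},$$
which by introducing the values from (12), (28), (29), (32), (33) and (34) and neglecting the term containing the higher power of $\dfrac{1}{n}$ leads to
$$\sigma_{\rho_p}^2 = \frac{1}{nq}\{1+(q-1)r+qr_p^2\} + \frac{r_p^2}{2nq}\{1+(q-1)r^2\} + \frac{r_p^2}{2n} - \frac{2r_p^2}{nq}\{1+(q-1)r\} - \frac{2r_p^2}{n} + \frac{r_p^4}{n},$$
or
$$nq\,\sigma_{\rho_p}^2 = 1 + (q-1)r - \frac{r_p^2}{2}\,[q+3+(q-1)r(4-r)] + qr_p^4,$$
which may be written
$$\sigma_{\rho_p}^2 = \frac{(1-r_p^2)^2}{n} - \frac{q-1}{nq}\,(1-r)\{1-\tfrac{1}{2}r_p^2(3-r)\} \qquad (36).$$
The first term is the usual expression obtained for $q=1$. From this, for $q>1$, one must subtract a term which for given values of $r$ and $r_p$ increases with $q$.

(f) The Standard Deviation of the Slopes of the Regression Curves.

We shall finally add the formulae of the S.D. of the slope of the regression curves, for the calculation of which we have all the material ready. The regression coefficients are determined by
$$a_1 = \frac{\Pi_{xy}}{\sigma'^2} \quad\text{and}\quad a_2 = \frac{\Pi_{xy}}{\sigma^2}.$$
By differentiation, squaring and taking mean value, we find
$$\sigma_{a_1}^2 = \frac{(\overline{\Pi_{xy}})^2}{(\overline{\sigma'^2})^2}\left\{\frac{\sigma_{\Pi_{xy}}^2}{(\overline{\Pi_{xy}})^2} + \frac{\sigma_{\sigma'^2}^2}{(\overline{\sigma'^2})^2} - \frac{2\,\Pi_{\Pi_{xy}\sigma'^2}}{\overline{\Pi_{xy}}\cdot\overline{\sigma'^2}}\right\}$$
and a corresponding equation for $\sigma_{a_2}^2$. From these we find by introducing the S.D.'s and product moments
$$\sigma_{a_1}^2 = \frac{s^2}{nq\,s'^2}\{1+r(q-1)-qr_p^2\}^{*}$$
and
$$\sigma_{a_2}^2 = \frac{s'^2}{nq\,s^2}\{1+r(q-1)-qr_p^2+2(q-1)r_p^2(1-r)^2\}.$$

* Vide K.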
Smith, l.c., pp. 6, 7, where the same formula is deduced in a different form, containing […] instead of […]. The two expressions are easily seen to be identical when the term […] is neglected.

(g) Numerical Evaluation of the Formula for the S.D. of a Parental Correlation Coefficient.

We shall first examine how valuable a material consisting of $n$ groups of $q$ siblings with corresponding parental values is compared with $nq$ pairs of values from different families. Denoting the S.D.'s of $\rho_p$ calculated from the two materials by $\sigma_{\rho_p}$ and $\sigma'_{\rho_p}$ respectively, we find by applying (36)

$$\frac{\sigma'^2_{\rho_p}}{\sigma^2_{\rho_p}} = \frac{(1-r_p^2)^2}{q\,(1-r_p^2)^2 - (q-1)(1-r)\left(1 - \frac{3-r}{2}\,r_p^2\right)} \qquad (37).$$

This ratio indicates the value of an observed pair, when the parental value also occurs combined with $(q-1)$ other offspring values, in proportion to the value of an observed pair when the parental value only occurs once in the calculation. The numerical values of (37) are given in Table IV for values of $r_p$ and $r$ fairly well representative of the values met with in investigations of inheritance.

TABLE IV. Values of the ratio (37).

  q    r_p = .3, r = .4    r_p = .4, r = .5    r_p = .5, r = .6
  1         1.000               1.000               1.000
  2          .735                .698                .666
  3          .581                .536                .499
  4          .481                .435                .399
  5          .410                .366                .332
  6          .357                .316                .285
  7          .316                .278                .249
  8          .284                .248                .221
  9          .258                .224                .199
 10          .236                .204                .181

It appears that, when families with numbers of offspring varying from, for example, 1 to 5 are entered into the same parental correlation table, the same weight is given to pairs of observations which according to Table IV ought to vary in weight from 1 to roughly .4 or less (the row q = 5). It is therefore a more rational proceeding to sort the families according to the number of offspring and deal with each group separately. The work may then be shortened by calculating the correlation coefficient between the parental value and the mean for the offspring, from which the parental correlation for individuals is obtained by multiplying with $\sigma_q/\sigma$, $\sigma_q$ being as above (see I (g)) the S.D. for means of fraternities of $q$ individuals.
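The weights of Table IV can be recomputed with a short sketch. It assumes that (36) has the form sigma^2 = {(1 - r_p^2)^2 - ((q - 1)/q)(1 - r)(1 - (3 - r) r_p^2 / 2)} / n, a reading chosen because it reproduces the printed entries checked here; the function names are illustrative and do not come from the paper.

```python
def sampling_variance(n, q, r_p, r):
    """Approximate sampling variance of the parental correlation coefficient.

    Assumed reading of (36): n families with q offspring each, parental
    correlation r_p and fraternal correlation r.
    """
    usual_term = (1.0 - r_p ** 2) ** 2
    sibling_term = (q - 1) / q * (1.0 - r) * (1.0 - 0.5 * (3.0 - r) * r_p ** 2)
    return (usual_term - sibling_term) / n


def pair_weight(q, r_p, r, n=1000):
    """Ratio (37): variance from n*q unrelated pairs over variance from n families of q."""
    return sampling_variance(n * q, 1, r_p, r) / sampling_variance(n, q, r_p, r)


if __name__ == "__main__":
    columns = [(0.3, 0.4), (0.4, 0.5), (0.5, 0.6)]
    for q in range(1, 11):
        print(q, *(f"{pair_weight(q, r_p, r):.3f}" for r_p, r in columns))
```

The value of n cancels in the ratio, so the default of 1000 is only a convenience.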
It is then possible to calculate the correlation coefficient with S.D. for each group of families and finally calculate a mean value for the correlation coefficient.

In investigations of inheritance with animals with numerous offspring it is as a rule easier to provide information of a given number of individuals among a small number of families than to examine the same number of individuals if they belong to a larger number of families. The labour required is therefore not proportional to the number of individuals and it must be estimated for the individual materials whether the encumbrance of dealing with a relatively large number of families is duly compensated for by the reduction of the number of individuals hereby permissible.

It does not seem at the outset probable, but it may be possible, that, even in cases in which parent and offspring are equally easily available for investigation, a shortening of labour, that is, a diminution of the total number of observed individuals, may be obtainable by examining several offspring individuals of each family. We will therefore examine for which value of $q$, $\sigma^2_{\rho_p}$ is a minimum when $n(q+1)$ is put equal to a constant $k$. We find the condition

$$(1-r_p^2)^2 - \left(1 + \frac{1}{q^2}\right)(1-r)\left(1 - \frac{3-r}{2}\,r_p^2\right) = 0,$$

from which follows

$$q = \sqrt{\frac{(1-r)\left(1 - \frac{3-r}{2}\,r_p^2\right)}{(1-r_p^2)^2 - (1-r)\left(1 - \frac{3-r}{2}\,r_p^2\right)}}.$$

To obtain a survey we introduce a few sets of values for $r_p$ and $r$ for which we give the result in Table V.

TABLE V.

  r_p     r      q
  0.20   0.25   1.8
  0.30   0.40   1.3
  0.50   0.60   1.0

It will be seen that for sufficiently small values of $r$ and $r_p$ it is profitable to examine several siblings of each family in those cases where the examination of an offspring individual requires the same labour as that of a parent.
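As a check on Table V, the optimum can be evaluated directly. The sketch below assumes the variance has the form (q + 1){A - (1 - 1/q) B}/k with A = (1 - r_p^2)^2 and B = (1 - r)(1 - (3 - r) r_p^2 / 2), so that setting the derivative with respect to q to zero gives q = sqrt(B / (A - B)); the function name is hypothetical.

```python
import math


def optimal_offspring_number(r_p, r):
    """Value of q minimising the variance when n * (q + 1) is held constant.

    A and B are the two terms of the assumed form of (36).
    """
    a = (1.0 - r_p ** 2) ** 2
    b = (1.0 - r) * (1.0 - 0.5 * (3.0 - r) * r_p ** 2)
    if a - b <= 0.0:
        # No finite minimum exists in this case.
        raise ValueError("A - B must be positive")
    return math.sqrt(b / (a - b))


if __name__ == "__main__":
    for r_p, r in [(0.20, 0.25), (0.30, 0.40), (0.50, 0.60)]:
        print(r_p, r, round(optimal_offspring_number(r_p, r), 1))
```

Since q must be a whole number in practice, the unrounded optimum is only a guide to whether one or two offspring per family is the better choice.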
As a guide for the choice of the number of offspring in the more frequently occurring case when it is easier to provide data of offspring than of parent, we give in Table VI for some values of $r_p$ and $r$ the number of observations which, for varying values of $q$, yield the same accuracy in the parental correlation coefficient as 1000 parents with 1000 offspring. It appears from the table that while the number of offspring increases evenly with increasing $q$ the number of parents decreases more and more slowly, so that the compensation obtained in this way for the increased total number of offspring tends to be very small for increasing $q$. Already by increasing $q$ from 5 to 6 we find, for $r_p = .3$ and $r = .4$, that to outweigh the augmentation of 360 in the number of offspring, we only get a diminution of 21 in the number of parents.

TABLE VI. Number of Parental and Offspring Individuals which for varying q yield the same Accuracy to ρ_p.

        r_p = .3, r = .4         r_p = .4, r = .5         r_p = .5, r = .6
        σ_ρp = .0288             σ_ρp = .0266             σ_ρp = .0237
  q     Parents  Offspring       Parents  Offspring       Parents  Offspring
  1      1000      1000           1000      1000           1000      1000
  2       680      1360            717      1433            751      1502
  3       573      1720            622      1866            668      2004
  4       520      2081            575      2299            627      2507
  5       488      2441            546      2732            602      3009
  6       467      2801            528      3166            585      3511
  7       452      3161            514      3599            573      4013
  8       440      3522            504      4032            565      4516
  9       431      3882            496      4465            558      5018
 10       424      4242            490      4898            552      5520

For fraternal correlation we have found (see Table II) that the most profitable number of offspring was 3-4 for the values of $r$ now considered, and that a somewhat greater number was not substantially opposed to economy of work. Whether the number ought to be increased beyond 3-4 or confined to even fewer offspring individuals from each family depends in each investigation upon the relative difficulty of observing parents and offspring.

(h) Application of the Formula to previous Calculations of Correlation.

For the investigation of Zoarces viviparus mentioned in the previous section, in which 10 offspring individuals were examined for each mother, we have according to (36) the following formula for the probable error of the maternal correlation coefficient:

$$\text{P.E.}(r_p) = \frac{0.67449}{\sqrt{n}}\left\{(1-r_p^2)^2 - \frac{9}{10}\,(1-r)\left(1 - \frac{3-r}{2}\,r_p^2\right)\right\}^{1/2}.$$

In Table VII are found the values of $r_p$ for the three characters examined as well as their probable errors calculated from this formula. Giving each of these values of $r_p$ its due weight we have calculated a mean value and its probable error.

TABLE VII. Maternal Correlation.

 Year when sample       Vert.               Pd.                 Pigm.
 was taken              r_p ± P.E.          r_p ± P.E.          r_p ± P.E.
 1914                   0.3513 ± .0343      0.2409 ± .0332
 1915                   0.4375 ± .0281      0.3215 ± .0303      0.3762 ± .0381
 1916                   0.4139 ± .0355      0.2116 ± .0387      0.3622 ± .0373
 1917                   0.3775 ± .0298      0.2824 ± .0293      0.3722 ± .0332
 1918                   0.4382 ± .0298      0.2928 ± .0298      0.3710 ± .0308
 1919                   0.3674 ± .0378      0.1851 ± .0387      0.3380 ± .0398
 From total samples     0.4021 ± .0131      0.2654 ± .0133      0.3654 ± .0158

It appears that these probable errors agree extremely well with those originally calculated* on the basis of the 5 or 6 values of the correlation coefficient obtained from 5 or 6 samples.

Summary.

In the first section we dealt with fraternal correlation and a formula was deduced for the standard deviation of the fraternal correlation coefficient for the case when the material of observation consists of equal numbers of offspring from each family and when each available pair of siblings is introduced into the calculation. The formula is calculated on the supposition of normal distribution and normal fraternal correlation. It is shewn by means of the formula that by forming fraternal correlation tables for fraternities of different numbers and giving each pair of observations the same weight we disturb very highly the distribution of weight which the observations must claim according to their nature.
We find further from the formula that when the number of observed offspring from each family may be freely chosen the best determination of fraternal correlation from a given number of observations is obtained by taking $(1 + 1/r)$ offspring individuals from each family ($r$ = fraternal corr. coeff.).

* Vide l.c., p. 24, Table 6.

In the second section we deduce, also supposing normal distribution and normal correlation, the S.D. of the parental correlation coefficient calculated from a material comprising equal numbers of offspring from each family. The formula shews that by forming parental correlation tables of a material consisting of families of different sizes we also in an unfortunate manner disturb the due distribution of weight among the pairs of observation. It is shewn that if observations of parents are as easily produced as those of offspring it is, for determination of parental correlation, only for small values of the corr. coeffs., for instance $r_p < .4$ and $r < .4$, profitable to include more than one offspring individual from each family in the calculation. For the case more frequently occurring, when the observation of parents represents more labour or greater cost than that of offspring, we have for certain values of $r_p$ and $r$ and varying sizes of fraternities calculated such numbers of parents and of offspring which yield the same accuracy to the parental correlation as 1000 parents with corresponding 1000 offspring. Table VI shews that when the number of siblings exceeds 4-5, there is not much gained by increasing it.

Considering both fraternal and parental correlation we may therefore generally conclude that an essential increase in the number of offspring beyond $1 + 1/r$, i.e. in practice 3-4, is only then to be recommended, when it causes a relatively insignificant increase in labour.
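The equal-accuracy comparison summarised from Table VI can be sketched as follows. The sketch assumes the same reading of (36) as a variance of the form {A - ((q - 1)/q) B}/n, with A = (1 - r_p^2)^2 and B = (1 - r)(1 - (3 - r) r_p^2 / 2), and solves for the number of families giving the variance A/1000 of 1000 unrelated pairs. The function name and the default of 1000 reference pairs are illustrative choices.

```python
def equal_accuracy_sample(q, r_p, r, reference_pairs=1000):
    """Parents and offspring needed, with q offspring per parent, to match
    the accuracy of `reference_pairs` unrelated parent-offspring pairs.

    Uses the assumed form of (36); both numbers are returned unrounded.
    """
    a = (1.0 - r_p ** 2) ** 2
    b = (1.0 - r) * (1.0 - 0.5 * (3.0 - r) * r_p ** 2)
    parents = reference_pairs * (a - (q - 1) / q * b) / a
    return parents, parents * q


if __name__ == "__main__":
    for q in range(1, 11):
        parents, offspring = equal_accuracy_sample(q, 0.3, 0.4)
        print(q, round(parents), round(offspring))
```

Going from q = 5 to q = 6 with r_p = .3 and r = .4 saves about 21 parents at the price of about 360 additional offspring, which is the diminishing return the summary describes.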
This research has been occasioned by the investigations of inheritance carried out by the Carlsberg Laboratorium, København, and I am much indebted to Dr. Johs. Schmidt for the interest he has taken in my work.

ON THE VARIATIONS IN PERSONAL EQUATION AND THE CORRELATION OF SUCCESSIVE JUDGMENTS.

By EGON S. PEARSON, Trinity College, Cambridge.

CONTENTS.

I. Introduction
II. Generalized Theory of Personal Equation
III. Description of the Experiments
  (a) Experiments A and B
  (b) Experiments C and D
  (c) Experiment E
IV. Terminology and Table defining Constants
V. On Methods of Reduction
  (a) Variate Difference Correlation
  (b) Application of the Results of V. (a)
VI. Experiment A. Reduction
  (a) The individual Series
  (b) The Combination of Series
  (c) On the possible Result of shifting the Head during the course of a Series
  (d) Summary of Results
VII. Experiment B. Reduction
  (a) The individual Series
  (b) The Combination of Series
  (c) Comparison with Experiment A
VIII. Experiment C. Reduction
  (a) The individual Series
  (b) The Combination of Series
IX. Experiment D. Reduction
  (a) The individual Series
  (b) The Combination of Series
  (c) Comparison with Experiment C
X. Experiment E. Reduction
XI. Analysis of the Correlation between successive Judgments
  (a) The Theory of correlated Estimates and accidental Errors
  (b) Application of Theory to the Results of the Experiments
XII. Prediction
XIII. Summary and Conclusions

I. INTRODUCTION.
Starting from Bessel’s discovery, in the early part of the last century, of the existence of a definite relative personal equation for two observers recording transits by the eye and ear method, there has been a continuous discussion among astronomers on the errors which such personal equations may introduce, and on the methods of eliminating them or correcting for them*. In such discussions it has been the usual practice to take the yearly mean personal equation, whether relative or absolute, of different observers and to use this mean personal equation as the basis of any correction to be applied to observations made in that year. From a comparison of the yearly means it is admitted that there may be gradual secular changes in personal equation, but it is found that for experienced observers there is usually very little variation. In text-books on Practical Astronomy brief mention of the subject is usually made, and the conclusion drawn is that for an observer in normal health, the personal equation in any one type of observation will remain sensibly constant for “short periods” of time; an exact definition of the words “short period” is not and clearly cannot be attempted†. It is further assumed that variations from the personal equation are due to accidental errors and may be taken as randomly distributed in accordance with the Gaussian Law.

With the recent introduction of photography and mechanical methods of record, the interest of the astronomer in the subject has to some extent diminished, but there are many fields of scientific observation where the human element cannot be eliminated, and in the modern researches of the psychologist we find a study is made of problems of this type for their own interest and for the light which they may throw on the working of the human machine.
One very important aspect of the problem of personal equation, and of particular import to the astronomer, was discussed in detail in a paper entitled “On the Mathematical Theory of Errors of Judgment, with Special Reference to the Personal Equation,” published in the Phil. Trans. (Vol. 198A, p. 235). In this case various series of experiments were carried out simultaneously by three observers under identical conditions and it was found that there was a marked correlation between the variations in absolute personal equation of the different observers. This in itself was sufficient to show that the judgments of any one observer were not randomly distributed about his mean personal equation. The purpose of the present paper is to discuss the variations in judgment of one observer, and to inquire how far the evidence of four or five experiments suggests that the theory of personal equation and of errors of judgment, as usually accepted, requires modification. The subject is a large one, and much beyond the scope of a single paper; but by making careful inquiries of this type with the help of statistical methods, it

* For example, Monthly Notices, Vol. XL. 1880, pp. 75, 165, 302 (Discussion of Greenwich Observations of the Moon); Monthly Notices, Vol. XLIV. 1884, pp. 1 and 39 (Greenwich Observations of the Sun); Monthly Notices, Vol. LVII. 1897, p. 504 (General Discussion of relative personal Equations).

† For example, in Campbell’s Elements of Practical Astronomy, 1899, p. 157; Young’s General Astronomy, Revised Edn. § 114, and Chauvenet’s Spherical and Practical Astronomy, 4th Edn. I. p. 189.
may be possible to construct a more generalised theory of errors of judgment than that which has hitherto been adopted, and although the practical corrections which such a theory will impose may not be large, yet a more detailed knowledge of the nature of the variations and perhaps some insight into the psychological and physiological factors which underlie them, will give the observer a clearer idea of the precautions to be taken to avoid error and a greater justification for confidence in his results.

II. GENERALISED THEORY OF PERSONAL EQUATION.

Before proceeding to the reduction of the Experiments which have been carried out, I will consider whether it is not possible to make a very general, and yet simple, analysis of personal equation. Let us suppose that we have a large number, $N$, of observations, which have been made in separate groups, or at what may be termed separate sessions. For the astronomer, a session will be a night’s work; for the physicist or psychologist, one continuous set of readings or observations. Any particular observation $y$ may be designated (1) by $\tau$, a function of the time when it was recorded, measured from some fixed epoch, or (2) by the number of the session in which it was made, and $t$, the time of record measured from the commencement of that session. E.g. an observation made in the $p$th session may be written either as $y_\tau$ or ${}_p y_t$. We will suppose that the secular change can be represented by the function $\phi(\tau)$, but in addition to this change there may be another of a different type which may be termed the sessional change, and will be represented by the function $f_p(t)$. The fundamental difference between a secular and sessional change is this: if there is a break of some hours or perhaps days between two series of observations, the sessional change of the first series will have no influence on the judgments of the second series, while the secular change will continue from series to series.
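The distinction between the two kinds of change can be illustrated with a small synthetic example. Everything in it is hypothetical: the secular term is taken as a constant step from session to session, the sessional term as a straight line restarting at zero in each session, and no random residual is added, so the recovered values are exact.

```python
def fit_line(times, values):
    """Least-squares intercept and slope of values against times."""
    n = len(times)
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    s_tt = sum((t - t_mean) ** 2 for t in times)
    s_tv = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    slope = s_tv / s_tt
    return v_mean - slope * t_mean, slope


def synthetic_sessions(n_sessions=4, n_obs=6, secular_step=-0.5,
                       sessional_slope=-0.1, level=10.0):
    """Hypothetical data: secular term level + secular_step * p for session p,
    sessional term sessional_slope * t within the session."""
    return [
        [level + secular_step * p + sessional_slope * t for t in range(n_obs)]
        for p in range(n_sessions)
    ]


def session_summary(sessions):
    """Return (mean, within-session slope) for every session."""
    summary = []
    for values in sessions:
        times = list(range(len(values)))
        _, slope = fit_line(times, values)
        summary.append((sum(values) / len(values), slope))
    return summary


if __name__ == "__main__":
    for mean, slope in session_summary(synthetic_sessions()):
        print(round(mean, 3), round(slope, 3))
```

The drift of the session means from one session to the next shows the secular change, while the slope recovered inside each session shows the sessional change, which restarts with every new session.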
The sessional change is thus peculiar to its own session or series of observations, although it is very possible that the same type of change may be repeated in session after session; it may be a change resulting simply from fatigue or perhaps from more complex causes. Figure 3 (p. 46) provides a good illustration of secular and sessional changes; the centres of the small circles represent the mean values of twenty different series of observations, and it will be seen that the general tendency is for a drop in mean judgment from left to right of the diagram; this is the secular change. The sessional changes are represented by the continuous lines drawn through the centres of the circles, and the slope of these lines is on the whole seen to be very constant throughout the twenty series. In this case the secular and sessional changes are acting in the same direction, but they may well act in opposite directions.

We have thus seen that an observation $y$ may be expressed in the form

$$y = \phi(\tau) + f_p(t) + Y_t \qquad \text{(i)},$$

where $Y_t$ is the residual after the removal of secular and sessional changes. The duration of the session is likely to be so short compared with the period over which the secular change is measured, that $\tau$ may be taken as practically constant for any one session, and $\phi(\tau_p)$ may be described as the secular term in the observations of the $p$th session. It remains therefore to consider the function $f_p(t)$. Supposing that there were $n$ observations made in a session, it would of course be possible to fit an $(n-1)$th order parabola on which all the observations would lie, so that the values of $Y_t$ would all be zero, but such a curve would be entirely useless.
If the observations are made at finite intervals so that we can imagine that one may be interpolated between two others, owing to the mass of random errors to which each judgment is subject, we should not for a moment expect that the interpolated error would lie on, or even close to, the $(n-1)$th order parabola. A curve of far lower order would probably give a much better fit. If the sessional change is a sign of some physiological change of state which is affecting the observer’s judgment, it is natural to suppose that it can be represented fairly closely by some simple curve, a low order parabola if not a straight line, or perhaps, if periodic, a sine curve.

Suppose that in a practical case, a first or second order parabola has been fitted to the observations of a session; then it will be easy to test whether the residuals $Y_t$ follow a Gaussian distribution; a simple practically sufficient, if not theoretically sufficient test would be to find whether

$$\beta_1 = \frac{n\,\{S(Y_t^3)\}^2}{\{S(Y_t^2)\}^3} = 0 \qquad \text{(ii)},$$

$$\beta_2 = \frac{n\,S(Y_t^4)}{\{S(Y_t^2)\}^2} = 3 \qquad \text{(iii)},$$

approximately. But there is a further possibility; it may be found that although the relations (ii) and (iii) hold approximately, the $Y_t$’s are not randomly distributed in time, and that there is in fact a correlation between the successive values of $Y_t$, so that

$$r_{Y_t Y_{t+i}} \neq 0 \qquad \text{(iv)}$$

for perhaps several positive integral values of $i$ from 1 upwards.

To emphasise the importance of the different terms in the relation

$${}_p y_t = \phi(\tau_p) + f_p(t) + Y_t \qquad \text{(i) bis},$$

let us take the case of an astronomer who makes a number of observations, often at many days’ interval. He will take a mean

$$\bar{y} = \text{mean } \phi(\tau_p) + \text{mean } f_p(t),$$

but he must not suppose that the quantities

$${}_p y_t - \bar{y} = \phi(\tau_p) - \text{mean } \phi(\tau_p) + f_p(t) - \text{mean } f_p(t) + Y_t$$

follow a Gaussian distribution. It will be only a part of the expression that does so, the $Y_t$’s, and it is possible that even these may not do so.
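The two checks on the residuals described above can be sketched in code. The sketch assumes that relations (ii) and (iii) are the usual moment-ratio conditions beta_1 = 0 and beta_2 = 3, and that the correlation of successive residuals is the ordinary product-moment correlation between the series and itself shifted by a lag i; the function names are illustrative.

```python
import math


def moment_ratios(residuals):
    """Return (beta_1, beta_2) = (m3^2 / m2^3, m4 / m2^2) about the mean."""
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((y - mean) ** 2 for y in residuals) / n
    m3 = sum((y - mean) ** 3 for y in residuals) / n
    m4 = sum((y - mean) ** 4 for y in residuals) / n
    return m3 ** 2 / m2 ** 3, m4 / m2 ** 2


def lag_correlation(residuals, lag):
    """Product-moment correlation between Y_t and Y_(t+lag); lag must be >= 1."""
    first = residuals[:-lag]
    second = residuals[lag:]
    n = len(first)
    mean_a = sum(first) / n
    mean_b = sum(second) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(first, second))
    var_a = sum((a - mean_a) ** 2 for a in first)
    var_b = sum((b - mean_b) ** 2 for b in second)
    return cov / math.sqrt(var_a * var_b)


if __name__ == "__main__":
    drifting = [0.0, 0.4, 0.9, 1.1, 0.8, 0.2, -0.3, -0.9, -1.2, -0.7, -0.1, 0.5]
    print(moment_ratios(drifting))
    print([round(lag_correlation(drifting, i), 3) for i in (1, 2, 3)])
```

A series can pass the moment-ratio check and still show sizeable lag correlations, which is the case the text singles out: the residuals look Gaussian in bulk but are not randomly distributed in time.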
Further it is clear that successive values of ${}_p y_t - \bar{y}$ will not be independent; correlation will arise from the inclusion of both the secular and sessional terms, and perhaps too from a relationship between the successive $Y_t$’s. There may be no large scale sessional change, and it may be possible to correct for a secular change in personal equation, but even then the mean of a small number “$m$” of successive observations, subject to its probable error $.6745\,\sigma/\sqrt{m}$, will not be a satisfactory approximation to the true value of the quantity observed, if these “$m$” observations are correlated. Suppose for example that the points in Figure 14 (p. 76) represent a series of successive observations which have been corrected for any secular change in personal equation; the linear sessional change is small and has been represented by the continuous straight line, while the dotted straight line represents the mean value of the 63 observations. Yet many sets of 10 consecutive observations could be taken, the difference between the mean of which and that of the whole 63 would be far greater than would be anticipated from the value of the probable error calculated from the expression above. This is because the observations are not randomly distributed in time.

In addition to secular and sessional changes in the value of an estimation, there may be similar changes in the standard deviation; the judgments may become more erratic or less so. A sessional change giving an increase in standard deviation would suggest the effect of fatigue; and secular change decreasing the standard deviation might be the indication of increased accuracy with experience. An example of secular change in personal equation and standard deviation is illustrated in the diagram on p.
84; the details of this will be discussed more fully in the reduction of Experiment D, but it is here sufficient to say that the central curve represents the smoothed personal equation, while the distance between any point on this curve and either of the outer curves gives the smoothed standard deviation at that point or period in the series of observations. It will be seen that the standard deviation increases in the later observations.

It would be out of place at this point to enter further into the details of variation in personal equation and correlation of judgments, but I think that enough has been said to indicate the general lines of enquiry. In choosing the experiments which will be described in the following sections, the aim has been to select those in which there was likely to be considerable variation in judgment, and where consequently the secular and sessional changes, if present, would be clearly recognizable and the correlation of successive judgments easy to measure. It was also important that the errors in measurement should be small compared with the variations in judgment. It may of course be urged that the experiments should have been carried out by an observer who was unaware of the lines of enquiry and therefore not liable to bias of any form, but this was not practicable, and in fact none of the reductions had been completed nor the general theory developed before all the experiments had been carried out, and I do not think that the observations could have been affected by any conscious or unconscious prejudice.

III. THE EXPERIMENTS.

The present paper is based on the reduction of the following Experiments:

A. Estimation of the value of a Third, or Trisection Experiment.
B. Estimation of the value of a Half, or Bisection Experiment.
C. Estimation of Time, by counting of Ten Seconds.
D. Estimation of Ten Seconds without intermediate counting.
E.
Some repeated measurements of fine structure in a Stellar Spectrum, with a Zeiss Comparator.

The first four Experiments were carried out by the writer in accordance with a uniform scheme; each Experiment was divided into 20 series of 63 observations, making 1260 observations in all. Only one series (or 63 observations) was done at a sitting to avoid as far as possible the effect of fatigue; in the case of Experiments A and B the sequence of the series was much broken, spreading over some weeks, but C and D were carried out within four consecutive days. The dates of the series are given with the detailed discussion of the observations below.

(a) Experiments A and B.

Figure 1 is a copy of one of the printed forms used for these experiments; the longer line was used for A, the distance between the inner edges of its bounding marks being 7.53 inches; the shorter line was used for B, the distance between the inner edges of its bounding marks being 5.94 inches. The lines were on the same form simply for convenience in printing, etc., and that not used was concealed while the observation on the other was being made; a fresh line was used for each of the 1260 observations. In carrying out a series a pile of 63 forms was placed on a table slightly tilted up towards the observer, and straight in front of him, with a good light coming from the left-hand side, the pencil being in his right hand. He then made a short pencil stroke across the line at the point which he estimated was one-third way along the line from the left-hand end (Experiment A), or at the point which he considered to bisect the line (Experiment B).
He then turned the form over, face downwards at his side, and proceeded to deal with the next form in the same manner, continuing until the 63 were finished*. The pencil stroke was made after a rapid eye estimate, the aim being to record the first impression of third or half formed upon seeing the fresh line, and to avoid hesitation; the average time taken in going through a series of 63 observations was 5 minutes 40 seconds for Trisection, 5 minutes 22 seconds for Bisection, or 5.4 seconds and 5.1 seconds respectively between judgments.

* Actually in Experiments A and B 70 forms were marked in each series; the first 7 were to enable the observer to “get his eye in,” and the measures of them were not used at all in the reduction.

Fig. 1. Printed form. Line used for Experiment A: 7.53 inches, distance between inner edges of bounding marks. Line used for Experiment B: 5.94 inches, distance between inner edges of bounding marks.

To avoid bias, it would have been desirable to complete all the observations of an experiment before commencing the measurement of any of the series, but from considerations of time and as all the forms were not printed at the commencement this was not done. In some cases therefore a series was measured directly after it had been marked, and if the observer happened to remember that its estimates were considerably too large or too small, his judgment would almost certainly be influenced when marking the next succeeding series; the correlation of judgments within this second series would hardly be altered, but any natural secular change which had been occurring from series to series might be broken*. The measures of the observations were made with a ruler divided to fiftieths of an inch, so that readings could be taken to one hundredth of an inch with fair accuracy.

(b) Experiments C and D.

These two experiments were carried out with the help of a chronograph.
The instrument was run by clockwork, and had a paper tape on which records could be made independently by two pens worked by small electromagnets. One pen was put in circuit with a seconds pendulum, a platinum pointer at the end of which made contact at each swing through the vertical position by cutting through a bead of mercury; the other pen was connected with a tapping key. The rate of the driving clock was not quite uniform, and the pendulum second-marks on the tape were therefore necessary in reckoning the intervals between the marks made by the other pen, corresponding to taps of the key. As the estimate in both experiments was one of 10 seconds, it was found that except for a few cases in Experiment D†, the true value of the time interval between the taps could be represented with sufficient accuracy by the factor $e/p$, where $e$ was the distance measured on the tape between consecutive marks of the key, and $p$ the length on the tape of the nearest corresponding 10 seconds recorded by the pendulum pen.

* See p. 49, remark in Table I, regarding Series IX and X.

† In Experiment D, some of the estimates had values nearer 20 seconds than 10 seconds, and here half the distance on the tape between the nearest corresponding 20 seconds was taken for $p$.

Had the pendulum been beating exactly one second, $10 \times e/p$ seconds would have been the true length of the estimate; actually the period as found by comparison for a long run with a watch was, before Experiments C and D (6th December), 1.020 seconds, and after Experiments C and D (16th December), […], so that the length of estimate with sufficient accuracy is $10.2 \times e/p$ seconds. It is the factor $e/p$ that will be used throughout the reductions.

Fig. 2. Shows a small piece of tape, and the points (a, b, c) from which the measurements were made.
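The conversion from tape measurements to seconds can be written out as a short sketch. It assumes the pendulum period of 1.020 seconds quoted above and a nominal count of 10 beats; the function name and the example distances are hypothetical.

```python
def estimate_in_seconds(e, p, pendulum_period=1.02, beats=10):
    """Length of one estimate in seconds.

    e: tape distance between the two key marks of the estimate.
    p: tape distance covered by the nearest `beats` pendulum marks.
    Both distances must be in the same unit; only their ratio matters.
    """
    if p <= 0:
        raise ValueError("p must be positive")
    return beats * pendulum_period * e / p


if __name__ == "__main__":
    # Hypothetical tape distances, in arbitrary units.
    print(estimate_in_seconds(45.0, 50.0))
```

Because only the ratio of the two distances enters, slow changes in the speed of the driving clock cancel as long as both distances are measured on the same stretch of tape.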
If the amplitude of the pendulum was rather small, it was sometimes noticeable that the intervals between the second marks were alternately longer and shorter; this was due either to slight deformation in the shape of the mercury bead or (what is really the same thing) to the centre of the bead not having been placed exactly under the equilibrium position of the platinum pointer. But in taking for measurement the even number of 10 seconds, such errors would be inappreciable.

In both experiments the beginning and end of the estimate were recorded by sharp taps on the key (at a and b respectively in Figure 2); a long drawn tap (c in figure) then followed to make a break before the next estimate was recorded. The interval between the b tap of one observation and the a tap of the following varied from 1 to 24 seconds. This method of record soon became quite automatic, and very few mistaps occurred. The measurements on the tape were made from the sharp beginnings of the marks, which correspond to the making of the electric contact at the beginning of the tap on the key.

In Experiment C the counting was “sotto voce,” the first tap being made on the count “nought,” the last on “ten”; in order that the counts might be quite uniform the word “sen” was used instead of the two-syllabled “seven.” The counting was usually done in step to a slight beat of the thumb on the key (not hard enough, of course, to make contact), and it was fairly easy to keep the attention concentrated during the counts. In Experiment D there was no counting and it was far harder to keep one’s mind fixed; in fact the mental effort required was quite noticeable, and I found that a greater interval of rest was required between each series than for C. It is mainly by reference to the passing of external events, to changes the duration of which we can infer from previous experience, that we estimate any but the shortest intervals of time.
In the counting experiment, the second-intervals between each of the 10 counts which made up the observation were comparatively short, and the beating of the thumb or fingers became almost mechanical; the interval of course varied but was not subject to violent fluctuations. But while most people are able to estimate a second interval with fair accuracy, it would need very much practice to estimate a 10 second interval, and in my case I found it quite impossible to concentrate attention for 10 seconds solely on the passing of time. I soon found myself imagining that I saw the seconds’ hand of a watch, passing usually from the position where 60 is marked on the dial to the 10; but it was not another case of counting, for I did not note the passing of each individual second mark, only having a vague idea of the position of the 5 second division line. If I tried to think of nothing, my thoughts probably wandered on to other subjects, until I came up with a start, and realising that I had very little idea of how long before I had pressed the key to start the observation, pressed it to finish, with the greatest uncertainty. To keep attention fixed, it appeared that I must try to record the stages of the passage of 10 seconds, and this I was doing vaguely on the imaginary clock face, but I must say that the seconds’ hand was very refractory, at times appearing to stop or even move backwards, and was often so slow that I had to close the observation before it reached the 10 second mark. I have given the above description at some length in order to shew that there was an essential difference between Experiments C and D, which is borne out by the figures of the reduction given later in this paper.

The observer with the key sat in a separate room where the beats of the chronograph could not be heard.
Experiment D was actually carried out in the week previous to C; before starting, a few trials at estimating 10 seconds had been made with a watch, but these were not repeated after the commencement. Again, some 10 second counts were made with a watch before starting on C, but no comparison with a watch or clock was made during the course of the experiment. The measuring up of C and D was left until both experiments were completed, so that the chance of some bias to the judgment, which occurred in the case of A and B, was avoided.

(c) Experiment E. This consists of nine series of readings made with a Zeiss Comparator at the Solar Physics Observatory, Cambridge, on photographic plates of the spectrum of Nova Aquilae III. The readings were taken in the first place in order to calculate the Probable Errors of the measurements of certain types of structure featuring in the broad emission bands, and each series consists of readings taken from 51 consecutive settings on a particular marking, either a maximum or the edge of a maximum. Although the number of readings is not sufficient for any great weight to be attached to the results, they are, I think, of sufficient interest to be included. In the instrument used, the plate to be measured is fixed to a slide, which is moved horizontally in a greased slot by pressure with the hand; the measurer looks through one eyepiece and pushes the slide until the feature on the plate of which he is wishing to measure the position comes under a cross wire in the focus of the eyepiece; then looking through a second eyepiece at the scale attached to the slide, he takes the reading, the last two figures of which are read from a graduated wheel attached to a micrometer screw-head. In making a measurement there are therefore two adjustments:

(1) The setting of the marking in the plate under the cross wire in the first eyepiece.
(2) The shifting of two very close parallel wires by a micrometer screw in the second eyepiece, until a line of division on the scale appears to lie exactly in the centre between them.

Far the greater source of error arises from the first setting, particularly if the marking on the plate is not clear cut. In taking a series of measurements, the observer should always move the slide from the same direction, that is he should always push it or always pull it, until he thinks that the marking is bisected or "edged" by the cross wire, and then he should stop; if he obviously overshoots the mark he should start again, and not hesitatingly move the slide backwards and forwards in search of what he thinks may be the best setting. By shifting the slide into position from the same direction, the measures may be all subject to a fairly constant personal equation due to "over push" or "under push," "over pull" or "under pull" of the slide, but this effect may be eliminated by reversing the plate in the instrument, making a fresh series of measures, and taking the mean of the two. In this particular set of readings the slide was always "pulled" into its final position.

(d) It is hoped that the results of some further experiments of a different type in estimating length, which were kindly undertaken for me by Mr E. A. Milne of Trinity College, and Mr L. J. Comrie of St John's College, Cambridge, will be included in a future paper.

IV. TERMINOLOGY.

Experiments A, B, C and D were arranged in accordance with a uniform scheme, each Experiment being divided into 20 "series" consisting of 63 observations.
The series will be designated by the Roman numerals I, II ... XX in the order in which they were carried out, and the 63 observations* in a series by the letters $y_1, y_2, \ldots y_t, \ldots y_{63}$. In dealing with each Experiment one of the first objects will be to ascertain whether there is any correlation between successive judgments, and the manner in which this correlation, if existent, falls off as the interval between the judgments correlated is increased. To obtain these coefficients of correlation it is necessary to divide the observations of each series into "groups," and thus we have the 50 observations

$y_1, y_2, \ldots y_{50}$ form Group 1 with mean $d_1$ and standard deviation $\sigma_1$,
$y_2, y_3, \ldots y_{51}$ form Group 2 with mean $d_2$ and standard deviation $\sigma_2$,
$y_k, y_{k+1}, \ldots y_{50+k-1}$ form Group $k$ with mean $d_k$ and standard deviation $\sigma_k$,
$y_{14}, y_{15}, \ldots y_{63}$ form Group 14 with mean $d_{14}$ and standard deviation $\sigma_{14}$.

* The first 7 observations, see footnote, p. 28, being always disregarded.

By "the correlation of successive judgments at intervals of one," I shall understand the correlation of the 50 observations of Group 1 of a series with the 50 corresponding observations of Group 2 of that series; this will be expressed as $\rho_1$. Similarly "the correlation of successive judgments at intervals of $k$," or $\rho_k$, is the correlation of the corresponding observations in Groups 1 and $k+1$. In fact $\rho_k$ is given by

$$\rho_k = \frac{\dfrac{1}{50}\sum_{t=1}^{50} y_t\,y_{t+k} - d_1 d_{k+1}}{\sigma_1\,\sigma_{k+1}} \qquad \text{(iv)}.$$

When these constants are to be referred to some particular series, say the $p$th, the prefix $p$ will be placed before them, e.g. ${}_p d_k$, ${}_p\sigma_k$, ${}_p\rho_k$, etc. A comparison of the $d$'s, $\sigma$'s and $\rho$'s of the different series will be instructive, but as each of these constants has been calculated from 50 observations only, to obtain quantities with smaller probable errors we must combine the observations of the 20 series. Thus we shall obtain

$$D_k = \frac{1}{mn}\sum_m \sum_{t=1}^{n} (y_{t+k-1}) = \frac{1}{m}\sum_m (d_k) \qquad \text{(v)},$$

where $n = 50$, the number in a group, $m = 20$, the number of series, and $\sum_m$ indicates summation for all 20 series.
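The grouping scheme and equation (iv) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `group` and `rho` and the simulated series are hypothetical and are not taken from the paper; the standard deviations use the divisor $n$, as the formula implies.

```python
import numpy as np

N_GROUP = 50  # n, the number of observations in a Group


def group(series, k, n=N_GROUP):
    """Group k (1-indexed) of a series: observations y_k ... y_{k+n-1}."""
    y = np.asarray(series, dtype=float)
    return y[k - 1 : k - 1 + n]


def rho(series, k, n=N_GROUP):
    """Equation (iv): correlation of Group 1 with Group k+1 of one series."""
    g1, gk1 = group(series, 1, n), group(series, k + 1, n)
    d1, dk1 = g1.mean(), gk1.mean()
    s1, sk1 = g1.std(), gk1.std()          # divisor n, not n - 1
    return (np.mean(g1 * gk1) - d1 * dk1) / (s1 * sk1)


# Hypothetical data: one series of 63 judgments with a slow downward drift.
rng = np.random.default_rng(0)
series = 0.5 - 0.001 * np.arange(63) + 0.01 * rng.standard_normal(63)
rho_1 = rho(series, 1)   # correlation of successive judgments at intervals of one
```

With 63 observations and $n = 50$ the last admissible Group is Group 14, so `rho` is meaningful for $k$ = 1 to 13.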
The product moment of the combined observations is

$$P_k = \frac{1}{m}\sum_m (\rho_k\sigma_1\sigma_{k+1} + d_1 d_{k+1}) - D_1 D_{k+1} = \frac{1}{m}\sum_m (\rho_k\sigma_1\sigma_{k+1}) + \frac{1}{m}\sum_m (D_1 - d_1)(D_{k+1} - d_{k+1}) \quad \text{in view of (v)} \qquad \text{(vi)}.$$

Putting $k = 0$ in (vi) we have as the square of the standard deviation

$$S_1^2 = \frac{1}{m}\sum_m (\sigma_1^2) + \frac{1}{m}\sum_m (D_1 - d_1)^2, \qquad S_{k+1}^2 = \frac{1}{m}\sum_m (\sigma_{k+1}^2) + \frac{1}{m}\sum_m (D_{k+1} - d_{k+1})^2 \qquad \text{(vii)},$$

and finally the coefficient of correlation $R_k$ is given by

$$R_k = \frac{P_k}{S_1 S_{k+1}} \qquad \text{(viii)}.$$

$D_k$ and $S_k$ are the mean and standard deviation of the combined observations, 1000 in all, of the 20 Groups $k$, while $R_k$ is the correlation between the 1000 observations in the 20 Groups 1 and the corresponding 1000 observations in the 20 Groups $k+1$, where it must be remembered that owing to the break between each series the 50th observation in Series I is correlated with the $(50+k)$th observation in that series, and not with the $k$th in Series II, etc.

It will be seen from the equations (vi) and (viii) that it is possible for $R_k$ to have a large value even though the coefficients of correlation of successive judgments for the separate series are negligible. For though $\sum_m (\rho_k\sigma_1\sigma_{k+1})$ may be zero for $k > p$, let us say, where $p$ may perhaps be 3 or 4, it is clear that the coefficients for the combined series, $R_k$, will not vanish as $k$ increases unless

$$\frac{\sum_m (D_1 - d_1)(D_{k+1} - d_{k+1})}{m\,S_1 S_{k+1}} \to 0.$$

In fact if $P_k$ (and therefore $R_k$) does not vanish for values of $k$ for which the $\rho$'s of the individual series vanish, this is a sign of the existence of a secular change running through the series; the means of the separate series differ significantly from the mean of the combined 1000 observations, that is to say they differ significantly from each other. Now it is important to obtain a measure of the correlation of successive judgments, when freed from this secular term.
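The point that $R_k$ may be large although the separate series show negligible correlation can be illustrated numerically. The sketch below is an assumption-laden illustration, not the author's computation: the name `combined_R` and the simulated data (independent errors within each series, plus a different level for each series) are hypothetical. It splits the product moment of (vi) into its within-series and between-series parts.

```python
import numpy as np


def combined_R(data, k, n=50):
    """R_k of equation (viii) for an (m, 63) array holding m series.

    Returns (R_k, within, between), the last two being the two parts of the
    product moment in equation (vi), each divided by S_1 * S_{k+1}.
    """
    data = np.asarray(data, dtype=float)
    g1 = data[:, :n]              # Group 1 of every series
    gk = data[:, k : k + n]       # Group k+1 of every series
    d1, dk = g1.mean(axis=1), gk.mean(axis=1)   # series means d_1, d_{k+1}
    D1, Dk = g1.mean(), gk.mean()               # combined means, equation (v)
    # (1/m) * sum over series of rho_k * sigma_1 * sigma_{k+1}
    within = np.mean((g1 - d1[:, None]) * (gk - dk[:, None]))
    # (1/m) * sum over series of (D_1 - d_1)(D_{k+1} - d_{k+1})
    between = np.mean((D1 - d1) * (Dk - dk))
    S1, Sk = g1.std(), gk.std()                 # equation (vii)
    return (within + between) / (S1 * Sk), within / (S1 * Sk), between / (S1 * Sk)


# Hypothetical data: no correlation inside a series, but shifting series levels.
rng = np.random.default_rng(1)
levels = rng.normal(0.0, 1.0, size=(20, 1))
data = levels + rng.normal(0.0, 1.0, size=(20, 63))
R5, within5, between5 = combined_R(data, 5)
```

In such data nearly the whole of `R5` comes from the `between5` term, which is the secular part of equation (vi).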
First I define $S_k'$ by the relation

$$S_k'^2 = \frac{1}{m}\sum_m (\sigma_k^2) \qquad \text{(ix)},$$

($m = 20$, $\sum_m$ indicating summation for the 20 series); it is the standard deviation of the 1000 observations in the combined Groups $k$ after the secular change has been removed. Then $R_k'$ is given by

$$R_k' = \frac{\dfrac{1}{m}\sum_m (\rho_k\sigma_1\sigma_{k+1})}{S_1' S_{k+1}'} \qquad \text{(x)};$$

this is the correlation of successive judgments freed from secular change; before correlating the observations we are in fact fitting the series means together, by subtracting ${}_1d_1 - D_1$ from the observations of the 1st Group of Series I, ${}_1d_2 - D_2$ from the 2nd Group and so on, and again subtracting ${}_1d_{k+1} - D_{k+1}$ from the observations of the $(k+1)$th Group of Series I, etc.

Again it may be desirable to examine the residuals after a sessional change has been removed from the observations of each series, in addition to the general secular term. Suppose that an observation in the $p$th Series can be expressed in the form introduced on page 25

$${}_py_t = \phi(\tau_p) + f_p(t) + {}_pY_t \qquad \text{(i) bis},$$

where $\phi(\tau_p)$ represents the secular term which we take as constant for all the observations of the $p$th Series, and $f_p(t)$ gives the sessional change, then $S_1''$ will be the standard deviation of the 1000 residuals in the twenty 1st Groups, $S_k''$ of the 1000 residuals in the twenty $k$th Groups, etc., so that

$$S_k''^2 = \frac{1}{mn}\sum_m \sum_{t=1}^{n} (Y_{t+k-1}^2) \qquad \text{(xi)},$$

the mean of the residuals being zero, and $m = 20$, $n = 50$ again; while the correlation of the successive residuals at intervals of $k$, after the removal of secular and sessional terms, or $R_k''$, will be given by

$$R_k'' = \frac{\dfrac{1}{mn}\sum_m \sum_{t=1}^{n} (Y_t\,Y_{t+k})}{S_1'' S_{k+1}''} \qquad \text{(xii)}.$$

TABLE OF CONSTANTS.

In the following table definitions are given of the most important of the constants referred to in the preceding section and of others to be introduced in the sequel.

1. The $k$th Group of the $p$th Series consists of the 50 observations ${}_py_k, {}_py_{k+1}, \ldots {}_py_{k+49}$. As each Series consists of 63 observations, there are 14 Groups in each of the 20 Series.
$n$ will often be used for 50, the number of observations in a Group,
$m$ will often be used for 20, the number of Series.

2. The crude Observations.

(a) For the $p$th Series.
${}_pd$ = mean of the whole 63 observations.
${}_pd_k$ = mean of observations in $k$th Group.
${}_p\sigma_k$ = standard deviation of observations in $k$th Group.
${}_p\rho_k$ = coefficient of correlation between corresponding observations of Groups 1 and $k+1$, i.e. between ${}_py_1$ and ${}_py_{k+1}$, ${}_py_2$ and ${}_py_{k+2}$, etc.
${}_p\sigma_\delta$ = standard deviation of the first forward differences of the observations in Group 1, i.e. of ${}_py_1 - {}_py_2$, ${}_py_2 - {}_py_3$, ... ${}_py_n - {}_py_{n+1}$.
${}_pb$ = slope of the straight line $y - {}_pd_1 = {}_pb\left(t - \frac{n+1}{2}\right)$ which fits "best" the 50 observations ${}_py_1, {}_py_2, \ldots {}_py_t, \ldots {}_py_n$ of Group 1.
${}_p\sigma_k'$ = standard deviation of residuals left after the ordinates of this "best" fitting straight line have been subtracted from the observations of Group $k$.
${}_p\rho_k'$ = coefficient of correlation between these residuals of Group 1 and Group $k+1$.

In the reduction of the results of the experiments, unless it is necessary to specify a particular series, the prefix $p$ before these constants will usually be omitted for brevity.

(b) For the combined 20 series.
$D$ = mean of the whole 1260 (= 20 × 63) observations of an experiment.
$D_k$ = mean of the 1000 observations in the combined $k$th Groups of the 20 series.
$S_k$ = standard deviation of the 1000 observations in the combined $k$th Groups of the 20 series.
$R_k$ = coefficient of correlation between the 1000 observations in the 1st Groups and the 1000 corresponding observations in the $k+1$th Groups.
${}_sR_k$ = coefficient of correlation between the 1000 $s$th forward differences of the observations in the 1st Groups and the corresponding differences of the observations in the $k+1$th Groups.
$S_\delta$ = standard deviation of the 1000 first forward differences of the observations in the 1st Groups.

3. The Observations freed from the Secular Change.

The "secular term" in the observation ${}_py_t$, considered as a member of the $k$th Group, is ${}_pd_k$. Thus the mean of the 1000 observations in the $k$th Groups each freed from its secular term will be zero.
$S_k'$ = standard deviation of the 1000 observations (freed from secular term) in the $k$th Groups.
$R_k'$ = coefficient of correlation between the 1000 observations in the 1st Groups and the 1000 corresponding observations in the $k+1$th Groups (all freed from secular term).

4. The Observations freed from both Secular and Sessional Change.

$y = f_p(t)$ is the curve representing the sessional change in the $p$th Series, so that $f_p(t)$ is the "sessional term" in ${}_py_t$, the $t$th observation in the $p$th Series.
${}_pY_t$ = the residual left after removing the secular and sessional terms from ${}_py_t$.
$S_k''$ = standard deviation of the 1000 $Y$'s in the $k$th Groups.
$R_k''$ = coefficient of correlation between the 1000 $Y$'s in the 1st Groups and the corresponding 1000 $Y$'s in the $k+1$th Groups.
${}_p\alpha_t$ = the part of ${}_pY_t$ representing the actual estimate which the observer wishes to record.
${}_p\beta_t$ = the part of ${}_pY_t$ representing a complex of accidental errors superimposed on ${}_p\alpha_t$ in the process of record.
$G_k$ = standard deviation of the sessional terms in the 1000 observations of the $k$th Groups.
$F_k$ = 1st order product moment coefficient about the mean of these sessional terms in the 1st Groups and the corresponding terms in the $k+1$th Groups.

V. ON METHODS OF REDUCTION.

(a) Variate Difference Correlation.
It will become evident in the detailed discussion of the results of the experiments, that a considerable part of the correlation of the successive judgments is due to a secular change with time, occurring from series to series, and in the case of the Trisections, to a sessional change as well occurring within the series; I therefore propose to consider at this point how far the Variate Difference Correlation Method is applicable in this type of problem, and to do this will approach the matter from a slightly more general point of view than that of "Student" in Biometrika, Vol. x. p. 179.

Suppose that $x$ and $y$ are the two variables to be correlated, with corresponding values

$$x_1, x_2, \ldots x_t, \ldots x_\nu, \qquad y_1, y_2, \ldots y_t, \ldots y_\nu,$$

and that we may express $x_t$ and $y_t$ in the form

$$x_t = F_1(t) + X_t, \qquad y_t = F_2(t) + Y_t,$$

where $F_1(t)$ and $F_2(t)$ are polynomials of degree $n$ in $t$, the unit of $t$ being the interval of time or space between the successive values of the variates, which is supposed equal and constant; $X_t$ and $Y_t$ are independent of the secular or sessional change represented by $F_1$ and $F_2$. Let us now obtain a general expression for

(1) $r_{\Delta_n x_t\,\Delta_n y_t}$ or ${}_nR$, the correlation of the $n$th forward differences of $x_t$ and $y_t$,
(2) $r_{\Delta_n X_t\,\Delta_n Y_t}$ or ${}_nR'$, the correlation of the $n$th forward differences of $X_t$ and $Y_t$.

Now

$$\Delta_n x_t = (1-\epsilon)^n x_{n+t} = x_{n+t} - n\,x_{n+t-1} + \ldots + (-1)^s \frac{n!}{s!\,(n-s)!}\,x_{n+t-s} + \ldots + (-1)^n x_t \qquad \text{(xiii)},$$

where the operator $\epsilon$ is defined by $\epsilon^s x_t = x_{t-s}$, etc. Further we must assume that

(a) $\sum_{t=1}^{\nu} x_{t+h}$ = constant for all values of $h$ small compared with $\nu$, = 0 by suitable choice of origin, and likewise $\sum_{t=1}^{\nu} y_{t+h} = 0$, from which it follows that $\sum_{t=1}^{\nu} \Delta_n x_{t+h} = 0 = \sum_{t=1}^{\nu} \Delta_n y_{t+h}$,

(b) $\sum_{t=1}^{\nu} (x_{t+h}^2)$ = constant = $\nu\sigma_x^2$ for all values of $h$ small compared with $\nu$, and likewise $\sum_{t=1}^{\nu} (y_{t+h}^2) = \nu\sigma_y^2$,

(c) $\sum_{t=1}^{\nu} (x_{t+h}\,x_{t+h+k}) = \nu\,{}_x\rho_k\,\sigma_x^2$ for all values of $h$ small compared with $\nu$, and likewise $\sum_{t=1}^{\nu} (y_{t+h}\,y_{t+h+k}) = \nu\,{}_y\rho_k\,\sigma_y^2$ and $\sum_{t=1}^{\nu} (x_{t+h+k}\,y_{t+h}) = \nu\,{}_{xy}\rho_k\,\sigma_x\sigma_y$.

Similar relations will hold for the residuals $X$ and $Y$.
Then a little consideration shews that the sum of the coefficients of the products of the $x$'s and $y$'s whose indices differ by $p$ in the expression $\Delta_n x_t\,\Delta_n y_t$ or $(1-\epsilon)^n x_{n+t}\,(1-\epsilon)^n y_{n+t}$ is the coefficient of $\nu\,{}_{xy}\rho_p\,\sigma_x\sigma_y$ in the product moment $\sum \Delta_n x_t\,\Delta_n y_t$; call this coefficient $a_p$. Now $\epsilon^r$ operating on $x_{n+t}$ gives $x_{n+t-r}$, $\epsilon^{r'}$ operating on $y_{n+t}$ gives $y_{n+t-r'}$, and if $(n+t-r) - (n+t-r') = p$, then $r' - r = p$; hence $a_p$ is the sum of the coefficients of the products $\epsilon_1^{\,r}\epsilon_2^{\,r'}$ in the expansion of $(1-\epsilon_1)^n (1-\epsilon_2)^n$ for which $r' - r = p$, or the coefficient of $\epsilon^p$ in $\left(1 - \frac{1}{\epsilon}\right)^n (1-\epsilon)^n$, or of $\epsilon^{n+p}$ in $(-1)^n (1-\epsilon)^{2n}$, so that

$$a_p = (-1)^p \frac{(2n)!}{(n-p)!\,(n+p)!}.$$

Hence finally writing $j = n + p$ we have

$$\frac{1}{\nu}\sum_{t=1}^{\nu} \Delta_n x_t\,\Delta_n y_t = \sigma_x\sigma_y \sum_{j=0}^{2n} (-1)^{n+j} \frac{(2n)!}{(2n-j)!\,j!}\;{}_{xy}\rho_{j-n} \qquad \text{(xiv)},$$

where negative values of the subscript of $\rho$ imply that the subscript of $x$ is less than that of $y$; e.g. ${}_{xy}\rho_{-p}$ is the correlation between $x_t$ and $y_{t+p}$. Similarly for the standard deviations of the $n$th differences

$$\frac{1}{\nu}\sum_{t=1}^{\nu} (\Delta_n x_t)^2 = \sigma_x^2 \sum_{j=0}^{2n} (-1)^{n+j} \frac{(2n)!}{(2n-j)!\,j!}\;{}_x\rho_{j-n} \qquad \text{(xv)},$$

$$\frac{1}{\nu}\sum_{t=1}^{\nu} (\Delta_n y_t)^2 = \sigma_y^2 \sum_{j=0}^{2n} (-1)^{n+j} \frac{(2n)!}{(2n-j)!\,j!}\;{}_y\rho_{j-n} \qquad \text{(xvi)},$$

and for the correlation between the differences

$${}_nR = \frac{\sum_{j=0}^{2n} (-1)^{n+j} \dfrac{(2n)!}{(2n-j)!\,j!}\;{}_{xy}\rho_{j-n}}{\sqrt{\sum_{j=0}^{2n} (-1)^{n+j} \dfrac{(2n)!}{(2n-j)!\,j!}\;{}_x\rho_{j-n}\;\cdot\;\sum_{j=0}^{2n} (-1)^{n+j} \dfrac{(2n)!}{(2n-j)!\,j!}\;{}_y\rho_{j-n}}} \qquad \text{(xvii)}.$$

The correlation of the $n$th forward differences of the residuals $X_t$ and $Y_t$, or ${}_nR'$, will equal an exactly similar expression to the last, in which ${}_{XY}\rho$, ${}_X\rho$ and ${}_Y\rho$ are substituted for ${}_{xy}\rho$, ${}_x\rho$ and ${}_y\rho$. But as $F_1(t)$ and $F_2(t)$ are polynomials of degree $n$ in $t$, we know that

$$\Delta_n x_t = \Delta_n X_t + \text{constant}, \qquad \Delta_n y_t = \Delta_n Y_t + \text{constant},$$

and therefore

$${}_nR = r_{\Delta_n x_t\,\Delta_n y_t} = r_{\Delta_n X_t\,\Delta_n Y_t} = {}_nR',$$

that is to say we may equate ${}_nR$ to an expression similar to that on the right hand side of (xvii) above, except that the correlation coefficients of the residuals, namely ${}_{XY}\rho$, ${}_X\rho$ and ${}_Y\rho$, are to be substituted for ${}_{xy}\rho$, ${}_x\rho$ and ${}_y\rho$.
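The weights $(-1)^{n+j}(2n)!/\{(2n-j)!\,j!\}$ appearing in (xiv) to (xvii), and the fact that $n$th differencing reduces a polynomial of degree $n$ to a constant, can be checked with a short sketch. The function name `difference_weights` and the example polynomial are assumptions made for illustration only.

```python
from math import comb

import numpy as np


def difference_weights(n):
    """Weights (-1)^(n+j) * (2n)! / ((2n-j)! j!) for j = 0 .. 2n."""
    return [(-1) ** (n + j) * comb(2 * n, j) for j in range(2 * n + 1)]


w1 = difference_weights(1)   # weights applied to rho_{-1}, rho_0, rho_1
w2 = difference_weights(2)   # weights applied to rho_{-2}, ..., rho_2

# A hypothetical secular term of degree 2: its second differences are constant,
# so they contribute nothing to the correlation of second differences.
t = np.arange(1, 21)
poly = 3.0 - 2.0 * t + 0.5 * t**2
second_diff = np.diff(poly, n=2)
```

For $n = 1$ the weights are $-1, 2, -1$, and for $n = 2$ they are $1, -4, 6, -4, 1$; in every case they sum to zero.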
Now in the usual problem to which the Variate Difference Method is applied it is assumed that after taking a sufficient number of differences we shall approach a state in which the corresponding values of $X_t$ and $Y_t$, the residuals left after the ordinates of an $n$th order parabola have been subtracted from $x_t$ and $y_t$, are mutually at random in time or space; or that

$${}_{XY}\rho_p = 0, \qquad {}_X\rho_p = 0, \qquad {}_Y\rho_p = 0,$$

for all values of $p$ other than zero, and that ${}_X\rho_0 = 1 = {}_Y\rho_0$, ${}_{XY}\rho_0 = r_{X_tY_t}$, i.e. the correlation between $X_t$ and $Y_t$. Upon this assumption it follows at once from the modified form of (xvii) that

$${}_nR = {}_{XY}\rho_0 \quad \text{or} \quad r_{\Delta_n x_t\,\Delta_n y_t} = r_{X_tY_t},$$

the fundamental relation of the original Variate Difference Correlation Method.

Let us now turn to the particular type of problem in which we wish to correlate the successive values of the same variate. If we are correlating the values at intervals of $k$, we shall have as corresponding variables, not $x_t$ and $y_t$ but $y_t$ and $y_{t+k}$, so that

${}_{xy}\rho_{j-n}$ becomes $\rho_{j+k-n}$ and ${}_{XY}\rho_{j-n}$ may be written $\rho^{(n)}_{j+k-n}$,
${}_x\rho_{j-n}$ becomes $\rho_{j-n}$ and ${}_X\rho_{j-n}$ may be written $\rho^{(n)}_{j-n}$,
${}_y\rho_{j-n}$ becomes $\rho_{j-n}$ and ${}_Y\rho_{j-n}$ may be written $\rho^{(n)}_{j-n}$,

where as in the notation of page 35 $\rho_p$ is the correlation of successive values of the variate at intervals of $p$, and $\rho^{(n)}_p$ the correlation of successive residuals (at intervals of $p$) which are left after the subtraction of the ordinates of an $n$th order parabola representing the secular change. Hence we have from equation (xvii) that ${}_nR_k$, or the correlation between the $n$th forward differences of $y_t$ and $y_{t+k}$, is given by

$${}_nR_k = \frac{\sum_{j=0}^{2n} (-1)^{n+j} \dfrac{(2n)!}{(2n-j)!\,j!}\,\rho_{j+k-n}}{\sum_{j=0}^{2n} (-1)^{n+j} \dfrac{(2n)!}{(2n-j)!\,j!}\,\rho_{j-n}} \qquad \text{(xviii)}$$

$$\phantom{{}_nR_k} = \frac{\sum_{j=0}^{2n} (-1)^{n+j} \dfrac{(2n)!}{(2n-j)!\,j!}\,\rho^{(n)}_{j+k-n}}{\sum_{j=0}^{2n} (-1)^{n+j} \dfrac{(2n)!}{(2n-j)!\,j!}\,\rho^{(n)}_{j-n}} \qquad \text{(xix)},$$

where negative values of the subscript of $\rho$ and $\rho^{(n)}$ are to be treated as positive, e.g. $\rho_{-1} = \rho_1$.
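Equation (xviii) lends itself to direct computation once the correlations at the several intervals are known. The following is a minimal sketch; the name `diff_correlation` and the geometrically decaying correlations are hypothetical, chosen only so that the result can be checked by hand.

```python
from math import comb


def diff_correlation(rho, n, k):
    """Correlation of nth differences at interval k, as in equation (xviii).

    rho[p] is the correlation at interval p, with rho[0] = 1; negative
    subscripts are treated as positive, as stated in the text.
    """
    def weighted_sum(shift):
        return sum(
            (-1) ** (n + j) * comb(2 * n, j) * rho[abs(j + shift - n)]
            for j in range(2 * n + 1)
        )

    return weighted_sum(k) / weighted_sum(0)


# Hypothetical correlations falling off geometrically with the interval.
rho = [0.6**p for p in range(8)]
r_first = diff_correlation(rho, n=1, k=1)   # (-1 + 2 rho_1 - rho_2) / (2 (1 - rho_1))

# Observations mutually at random: rho_p = 0 for every p other than zero.
rho_random = [1.0] + [0.0] * 7
r_random = diff_correlation(rho_random, n=1, k=1)
```

For observations mutually at random the first differences at interval one are correlated to the extent of $-\tfrac{1}{2}$, which is why a vanishing $\rho$ does not imply a vanishing difference correlation.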
We are again supposing that this secular change can be represented by $y = f(t)$, a polynomial of degree $n$ in $t$, but we cannot expect that after removing a parabola of even 5th or 6th order*, the residuals $Y_1, Y_2, \ldots Y_t, \ldots Y_\nu$ will be mutually at random in time or space; if we anticipate correlation between $Y_t$ and $Y_{t+k}$, we must also be prepared for correlation between $Y_t$ and $Y_{t+k-1}$, and in any case the correlation between $Y_t$ and $Y_t$, or $\rho^{(n)}_{k+j-n}$ where $j = n - k$, will be unity. Hence we cannot make the assumptions of the first problem (that ${}_Y\rho_p = 0$, etc.); in fact $r_{\Delta_n Y_t\,\Delta_n Y_{t+k}}$ is not equal to $r_{Y_tY_{t+k}}$.

Now consider the use which may be made of equations (xviii) and (xix). If the values of the $\rho_p$'s have been calculated from the crude values of the variate, the quickest method of finding the correlations of differences ${}_nR_k$ is not by direct calculation but by putting these known values of the $\rho_p$'s into the right hand side of (xviii). Then using (xix) we have a number of equations connecting the $\rho^{(n)}_p$'s, and the question that at once arises is whether there are sufficient equations to determine these coefficients. It will be seen at once that there cannot be; if we are proceeding to $n$th differences, we can obtain $q$ equations by putting $k = 1, 2, \ldots q$, but these will contain coefficients $\rho^{(n)}_1$ to $\rho^{(n)}_{n+q}$; in fact $n$ more equations are required. By using the appropriate equations for the Product Moments and for the Standard Deviation of $n$th differences corresponding to (xiv), (xv) and (xvi) we could obtain one further equation, but at the same time we introduce one further unknown, the standard deviation of the residuals.

That these equations will be indeterminate can be seen from another standpoint; the $n$th difference correlation equations (xviii) and (xix) will be satisfied not only by the $\rho_p$'s and $\rho^{(n)}_p$'s as defined above, but by the correlation of the residuals left after the ordinates of a parabola of any order less than $n$ have been subtracted from the crude observations.
Nor can further equations for the $\rho^{(n)}_p$'s be obtained by proceeding to $n+1$, or higher differences; the further relations obtained will not be independent, for example

$${}_{n+1}R_k = \frac{-\,{}_nR_{k-1} + 2\,{}_nR_k - {}_nR_{k+1}}{2\,(1 - {}_nR_1)}, \quad \text{etc.}$$

The possible application of these difference correlation equations is considered in the next section.

* The figures will probably not warrant the taking of differences of much higher orders than 5th or 6th.

(b) The Application of the Results of the preceding Section.

Although the correlation of differences does not appear to provide a general method for obtaining the correlation of successive values of a variate after secular changes have been removed, the equations (xviii) and (xix) will be found of considerable assistance in certain cases.

The results of the analysis given in the three illustrative problems below will be used in obtaining the values of various constants in the reduction of the experiments in the later sections. It seemed desirable to collect the algebra together in this way, but in reading this paper the reader may find it more convenient to pass on and refer back to the theory when occasion arises for the numerical application of the results.

Problem 1. In this and the following illustrations of the method of the preceding section, the notation of Section IV for the correlation of judgment will be used. I shall suppose that we have $m$ series of observations through the course of which there is some form of secular change; the means of the different series, or the values of ${}_pd_k$, varying considerably. The coefficients of correlation for the combined series, $R_1, R_2, \ldots R_k, \ldots R_s$, have been calculated, and also the single coefficient $R_1'$, the correlation of the successive values of the observations (at intervals of 1) after the series means have been fitted together, i.e. after removal of secular change.
It is clear that $\Delta_1 y_t = \Delta_1 Y_t'$, where $y_t = d_1 + Y_t'$, within any one series, and

$$\sum_m \sum_{t=1}^{n} (\Delta_1 y_t \cdot \Delta_1 y_{t+k}) = \sum_m \sum_{t=1}^{n} (\Delta_1 Y_t' \cdot \Delta_1 Y_{t+k}'), \quad \text{etc.},$$

where again $\sum_m$ stands for summation for the $m$ series, so that the 1st difference correlation equations (xviii) and (xix) are applicable, and become

$${}_1R_1 = \frac{-1 + 2R_1 - R_2}{2\,(1 - R_1)}, \qquad {}_1R_k = \frac{-R_{k-1} + 2R_k - R_{k+1}}{2\,(1 - R_1)}, \quad k = 2 \text{ to } s-1 \qquad \text{(xx)},$$

$${}_1R_1 = \frac{-1 + 2R_1' - R_2'}{2\,(1 - R_1')}, \qquad {}_1R_k = \frac{-R_{k-1}' + 2R_k' - R_{k+1}'}{2\,(1 - R_1')}, \quad k = 2 \text{ to } s-1 \qquad \text{(xxi)}.$$

From (xx) we get the values of ${}_1R_k$, $k = 1, 2 \ldots s-1$, and using these and the value of $R_1'$ already supposed to be known, the $s-1$ equations (xxi) will give the $s-1$ unknowns $R_2', \ldots R_s'$.

The accuracy of this method will of course depend on the errors involved in the assumptions (a), (b), and (c) of page 37 above.

Problem 2. To obtain the coefficients of correlation of the successive residuals left after the ordinates of the "best" fitting straight lines have been subtracted from each of $m$ series of observations, that is, after the removal of a linear sessional change as well as a secular change. In the notation of p. 35 these coefficients may therefore be written $R_1'', R_2'', \ldots R_s''$.

In the first place let us obtain the constants of the straight line "best" fitting the 50 observations of Group 1 of a series; this can be done by the method of Least Squares. If for any series the equation to the line is

$$y = d + b\left(t - \frac{n+1}{2}\right) \quad (n = 50 \text{ as before}) \qquad \text{(xxii)},$$

where the $t$th observation is

$$y_t = d + b\left(t - \frac{n+1}{2}\right) + Y_t,$$

we have that

$$K = \sum_{t=1}^{n} Y_t^2 = \sum_{t=1}^{n} \left\{y_t - d - b\left(t - \frac{n+1}{2}\right)\right\}^2 \text{ is to be a minimum},$$

therefore $\dfrac{\partial K}{\partial d} = 0$ and $\dfrac{\partial K}{\partial b} = 0$, or

$$\sum_{t=1}^{n} Y_t = 0, \text{ whence } \sum_{t=1}^{n} y_t = nd,$$

and

$$\sum_{t=1}^{n} \left\{y_t - d - b\left(t - \frac{n+1}{2}\right)\right\}\left(t - \frac{n+1}{2}\right) = 0,$$

giving

$$\sum_{t=1}^{n} y_t\left(t - \frac{n+1}{2}\right) = b \sum_{t=1}^{n} \left(t^2 - (n+1)\,t + \tfrac{1}{4}(n+1)^2\right).$$
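The two least-squares conditions just obtained can be checked numerically. In this sketch the function name `best_fitting_line` and the simulated Group of 50 observations are hypothetical; the code solves the two conditions directly and confirms that the residuals sum to zero and are uncorrelated with $t$.

```python
import numpy as np


def best_fitting_line(y):
    """Constants d and b of the line y = d + b (t - (n+1)/2), t = 1 .. n."""
    y = np.asarray(y, dtype=float)
    n = y.size
    c = np.arange(1, n + 1) - (n + 1) / 2    # t measured from its mean
    d = y.mean()                              # from sum(Y_t) = 0
    b = np.sum(y * c) / np.sum(c * c)         # from sum(Y_t (t - (n+1)/2)) = 0
    residuals = y - d - b * c
    return d, b, residuals


# Hypothetical Group 1: a slow linear decrease plus accidental error.
rng = np.random.default_rng(2)
t = np.arange(1, 51)
y = 0.5 - 0.002 * t + 0.01 * rng.standard_normal(50)
d, b, Y = best_fitting_line(y)
```

The divisor `np.sum(c * c)` is the sum of squares of $t$ about its mean, which for $n = 50$ equals $n(n^2-1)/12 = 10412.5$.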
t=1 t=1 Or, the first order product moment coefficient about the mean of y, and t (n?— 1) os Pu = ’ giving for the constants of the best fitting line d=d,=" ZY 12° 4s =e (—1) Pu: The next step is to obtain the correlation of the successive residuals left after the ordinates of this hne have been subtracted from the observations. We shall have that NO1Top: = > S id4b(t— -"*)+ Y ao (t+1—" 5°) + Ven} t=1 2 | 2 —nd (q+ enh i) n d So + Yer) — nd? + b> \(¢ +1- ae = 1) (Yer — d) n+1 . wort Be! aC “Z-+ 1)(n- a} -0 3 ( 5 ) (¢ a ) > Vi Yin — nd? —d Yn — ) ll ie, = > Y; as +b Pee a a d) i= d) — Yay + ink #21 , n os 1 — 62 > So a Peg 7 ay n+1 n—1 ,r(W—1) = 2 Y.Vintb ee —~nd+" > Vir ts am a Yn} S20 Sone a dns tae De (W? - ren es eS tons nd PEA Econ S. PEARSON 43 and if p,’ be the correlation of the successive residuals and o; and o,' the corre- sponding standard deviations in oe 1 and 2, we have finally oF (ve — ee {nm +1)y4+(n—1) Yay — 2nd} ... (xxii). PRS ee Pi G7 Fe. eee 5 Similarly we have ass \a40(1-"4 "| it vi = net t=1 2 = 203 (y») - 2nd +203 V(e- "F \n- ah -v 3 (1-75 ] t n b? = > V2+ 2bnp,, ——n(r?—-1 aa t Pu 12 u( ) == wee n(n? — 1), whence it follows that 2 a? =o2— 19 Dr eameuceest ek nicer cree (xxiv). And again, No? = 5 {a+o(t+1-"3") + a ae aan) e—lt = 2d & yrs — Ind? + 2 S (e+ = FS) na} § S (i aot Gi faa a ee 9 Vie —- Y, 2 2 +3 Vrgann ( e i — nb? — Inbd — 2. (b + d) (Yass — 4h — nd) ; (n— 1) = nox? + 2bnpy + nb? — v e—(n—1)t+ A fet OR (4 +1 yy + n ss Ynsi— nd) =Nnoo?+ e n(n?—1) +b {(n+1) y+ (2-1) Ynys — 2nd} o,7 =02— a (?—1)- s {(a +1) y+ (1 — 1) Yngr— 2nd} oo. cceeseveeee (xxv). If the values of p,’ have been calculated by this means for each of the m series, we shall have for the combined series, = (p: a, 0%) ie (xxvi), TECMED Mm me a modified form of equation (xii). As we are subtracting the ordinates of a different straight line from each series, a modification of the first-ditterence equations may be necessary. 
The 44 On the Variations in Personal Equation Ist order product moment coefticient, for the m combined series*, of successive first differences at intervals of k is given by 1 n a nN P= Eee Yer) (Yerk — Yerkua) — a hee — ata Jy Hees hana mn n m n il 5 e 2 SS (Vi = Vi — >) Yaee = Vea) MN m t=1 aes ape = b anes ee =) {3 (- b ay Vin > ~Eewst)} NM (an n 1 7. 9 fa f 1G 1— Yn MG i-Y N+1 SS Wi) Vek sear) ee “fs z a pra MN mn t=1 le nm nm me = y (6 Nm Yau + Yen a ne {3 b l 3 his Yat + Yr oe n mn) don mn EL See sy PN m WN m MN Or finally, p= (Rp PR, Rea SSO On ee (xxvil), making the assumptions (a), (b), and (c) of p. 37, and where ih Oe > ~ = (0 mn Yn ste Uiceso aa Yaseen) a 3 a s Wy Yaz + Yea aie M m n mm mn 2. y be is the standard deviation of the b’s. There will be similar corrected expressions for the standard deviations of the combined first differences. If we are justified in neglecting terms of the order of Q; + ?, we may use the first difference equations, Ry, = ao 2(1-R fe 4 re erg Pee ye ee (xxvil), — Ria aoe ke — Rey = - k= ae SO sicey Bee where, as in Problem 1, the known R,’s will give the ,R,;’s, and it will only be necessary to calculate directly the one quantity R,”, in order to obtain uy 4 yo RY Re Re Problem 3. In the last illustration it may happen that while Q,+0? is so small as to cause only a negligible error in the value of R,” found from ae 1 a 2k,” ae Re 2S Ree iy = * 1b is the slope of best fitting line in the pth Sevies. Econ S. PEARSON 45 the cumulative effect of this error may be considerable in the value found for R,” (s = 12, say). If then we take second differences 1 W oP, = ne Dz (Ye — Wer + Yee) (Yere — 2Yeresa + Yerere) ‘ thn | a 1 {s Y= Yo — Yn t+ tat : Yer — Yero — Yetnt tHeins| me Len n wh n BS iC a Y =— > & (VY: — 2V i + Vere) Veen — 20 trea + Virus) MN m t=1 = 1 fy WG = Y, ee ere at Vue) J View — Vite ee Viera + Vien ne ‘le n ms n = (Ry_2” — 4Ry” + GR,” — 4Ryy.” + Rayo”) 8”, and is independent of the differing values of the b’s. 
The appropriate equations are in fact of type, (Rye POS HR 4 OR” — ARG” FRA." a 2(8 —4R,7 = Ry) for k=1, 2,3...s—2, where R_,”=R,” etc. and R,’=1. Then using the known value of R,’, and that of R,”, found as in Problem 2 from the first difference equation, these s — 2 equations will give the s— 2 unknowns R,”... R,”. eT ate (xxix), It is clear that similar methods could be applied in the case of sessional changes of higher order, but I have taken the algebra in these three Problems, as the results will be used in the reduction of the experiments later on. The general explanation and equations may have appeared long, but the actual calculation in any particular case of such quantities as ,R,, R,,...,R,, or .R,,....R,, and then of R,’,...R,, and R,”,... Ry”, is exceedingly simple, and far shorter than a direct calculation from the crude figures would be. In two cases the correlations were calculated both by the difference correlation method and directly without approxi- mation, and the agreement of the former results with the latter established con- fidence in this method of approximation. VI. EXPERIMENT A (TRISECTION). REDUCTION OF OBSERVATIONS. (a) The indindual Series. The observations of this Experiment have been reduced in more detail than in the other cases; the values of pz, k =1, 2,... 13, were found separately for each series, and these and the values of d and o—the means and standard deviations of the Groups—are given in Tables I, Il and III. Several points of interest will be noted ; in the first place the observations have a marked tendency to decrease (i.e. for the estimate of a third to become smaller) both in the course of a series (as is seen by the general decrease of d; as k& increases) and also in passing from the earlier to the later series. These are examples of what have been termed Sessional and Secular Changes. 
These changes are illustrated in Figure 3 where the centres of the circles give the values of $d_1$ for each Series, the length of the dotted lines from either side of these points representing the standard deviations $\sigma_1$, and the continuous lines through the points representing the "best" fitting straight lines for the 50 observations of Group 1; the slopes of these last lines, or constants ${}_pb$, have been calculated by the Least Square method as in Problem 2, p. 41, and their values are given in the 3rd column of Table IV.

[Fig. 3. Distribution of Personal Equation in Trisecting Lines. Axes: estimate of 1/3 length; place of series in order.]

Another way of examining the sessional change, and of obtaining a typical representation of it, is to calculate the average values for the 20 series of $y_t$, the $t$th observation in a series; thus

$$\bar{y}_t = \frac{1}{m}\sum_m ({}_py_t) = \frac{1}{m}\sum_m ({}_pd + {}_pY_t),$$

where ${}_pd$ stands for the mean of the $p$th series (63 observations) as opposed to ${}_pd_k$, the mean of a particular Group $k$ of that series. The values of $Y_t$ represent the sessional variation in any series about the mean of that series or session of observations, and the sequence $\bar{y}_t - D$, $t = 1, 2, 3, \ldots 63$, will clearly represent the mean sessional change. The values of $\bar{y}_t$ are given at the end of Table II and have been plotted in Figure 4, where they have been fitted with the second order parabola (calculated by least squares)

$$y = \cdot486 + \cdot00255\,t - \cdot0000189\,t^2 \qquad \text{(xxx)}.$$

[Fig. 4. Trisection Experiment. Mean Sessional Change. Axes: order of observation in individual series (1 to 63); estimate of 1/3 of length (mean of 20 series). The mean of observed thirds and the actual third are marked.]
Figures 3 and 4 together show very clearly the marked sessional change; while the former shows that except in a few series, notably Series I, IV and X, the regression is remarkably constant in its value, the latter indicates that the sessional change is better represented by a parabola than by a straight line.

The sessional change can also be represented numerically with the help of the correlation ratio of $y$ upon $t$. If we are dealing with the observations freed from the secular change, that is after the removal of the means ${}_pd$ from the 63 observations of the $p$th series, we have $\eta_{yt}$ given by

$$\eta_{yt}^2 = \frac{\sum_{t=1}^{63} (\bar{y}_t - D)^2}{63\,S'^2}, \quad \text{where} \quad S'^2 = \frac{1}{1260}\sum_m \sum_{t=1}^{63} ({}_py_t - {}_pd)^2,$$

or $S'$ is the standard deviation of the whole 1260 observations after the removal of the secular term*. Then the ratio of the mean square distance of every observation from the regression line or line of means $\bar{y}_t$, to the standard deviation of the observations is

$$\sqrt{1 - \eta_{yt}^2} = \frac{1}{S'}\sqrt{\frac{1}{1260}\sum_m \sum_{t=1}^{63} ({}_py_t - {}_pd - \bar{y}_t + D)^2},$$

where $\sum_m$ indicates summation for the 20 series. This is a measure of the closeness of fit of the observations in a series to the mean sessional change as represented by the values $\bar{y}_t$; the larger $\eta_{yt}$, and therefore the smaller $\sqrt{1 - \eta_{yt}^2}$ is, the more nearly does a sessional change of the same form recur in series after series. A comparison of the values of $\sqrt{1 - \eta_{yt}^2}$ for the different experiments will show the relative significance of their mean sessional changes. In the present case the value of $\eta_{yt}$ is found to be $\cdot579 \pm \cdot013$, while $\sqrt{1 - \eta_{yt}^2} = \cdot815$.
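The correlation ratio of $y$ upon $t$ described above can be sketched as follows. The function name `correlation_ratio` and the simulated 20 series are hypothetical; the code removes each series mean as the secular term, forms the line of means, and returns both $\eta_{yt}$ and the complementary ratio. The last line merely checks the arithmetic linking the two figures quoted in the text.

```python
import numpy as np


def correlation_ratio(data):
    """Correlation ratio of y upon t for an (m, T) array holding m series.

    Each series is first freed from its secular term by removing its own mean.
    Returns (eta, scatter) with scatter = sqrt(1 - eta^2).
    """
    data = np.asarray(data, dtype=float)
    freed = data - data.mean(axis=1, keepdims=True)   # remove the series means
    line_of_means = freed.mean(axis=0)                 # mean sessional change
    total = np.mean(freed**2)                          # S'^2
    eta = np.sqrt(np.mean(line_of_means**2) / total)
    scatter = np.sqrt(np.mean((freed - line_of_means) ** 2) / total)
    return eta, scatter


# Hypothetical data: a common sessional drift, differing series levels, and error.
rng = np.random.default_rng(3)
t = np.arange(1, 64)
session = -0.0005 * (t - 32)
data = 0.5 + rng.normal(0, 0.02, (20, 1)) + session + rng.normal(0, 0.01, (20, 63))
eta, scatter = correlation_ratio(data)

# Arithmetic check of the quoted values: sqrt(1 - 0.579^2) is about 0.815.
quoted = np.sqrt(1 - 0.579**2)
```

Because every series here has the same number of observations, the mean of the series means equals the general mean $D$, so the line of means of the freed observations is the sequence $\bar{y}_t - D$.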
It would be an interesting problem to obtain the correlation of the successive residuals left after the ordinates of the "best" fitting parabola for each series had been subtracted from the observations of that series; but although this has not been done, a fair idea of the degree to which the correlation of the successive judgments in the individual series is due to the sessional change can be obtained by removing the "best" fitting straight lines from each series. The values calculated for the ${}_pb$'s have been referred to above, and using these and the equations (xxii) to (xxiv) of pp. 41 to 43, the values of $\rho_1'$ and $\sigma_1'$, or the correlations and standard deviations of successive observations freed from the linear sessional changes, have been calculated and are given in the 4th and 6th columns of Table IV. The $\rho_1'$'s are all less than the corresponding $\rho_1$'s, except in Series X where they are equal, but though there is in general a considerable reduction, it is clear that neither a linear sessional change nor a parabolic one of the form represented by equation (xxx) accounts for the greater part of the correlation of successive judgments. The coefficients $\rho_1$ and also $\rho_1'$ vary considerably from series to series, but there is no very marked progressive secular change.

* Actually it is only the values of the Group Standard Deviations $S_1', S_2' \ldots S_{14}'$ which have been calculated; they are not all equal (as shown in Table V) owing to the sessional change in standard deviation, but an approximation to $S'$ sufficiently accurate for the purpose will be given by taking $S'^2 = \frac{1}{14}\{S_1'^2 + S_2'^2 + \ldots + S_{14}'^2\}$.

[TABLE I. TRISECTION. Coefficients of Correlation from Separate Series ($\rho_1$ to $\rho_{13}$ for Series I to XX), with Dates (1920), Times and Remarks, and the probable errors of a coefficient of correlation calculated from 50 pairs. Entries illegible in this copy.]

[TABLE II. TRISECTION. Means of Series Groups (from origin 2.5000 inches), Groups 1 to 14 for Series I to XX, with the mean values $\bar y_t$ of the $t$th observation of each series. Entries illegible in this copy.]

[TABLE III. TRISECTION. Standard Deviations of Series Groups (in inches), Groups 1 to 14 for Series I to XX. The values of $S_k$ at the bottom of the Table have been taken from Table V below. Entries illegible in this copy.]
On the whole both $\rho_1$ and $\rho_1'$ are large when the standard deviation is large, and a measure of this correspondence will be given by the correlation of $\rho$ and $\sigma$. This can be obtained most readily, and with sufficient accuracy for the purpose, from the correlation of the ranks of these variates, by the method referred to in Biometrika, Vol. X. p. 416*. The results are

correlation between $\rho_1$ and $\sigma_1$: $+.52 \pm .11 = r_{\rho\sigma}$,
correlation between $\rho_1'$ and $\sigma_1'$: $+.60 \pm .10 = r'_{\rho\sigma}$.

The difference is not significant, and we may draw the conclusion, which could not have been assumed a priori, that the correlation of successive judgments is larger when the variations in judgment are larger, and that this relationship does not appear to be reduced when the large linear sessional change has been removed. Large values of $\sigma$ might have implied erratic observation and small relation between successive judgments, and at the same time high correlation might have been expected to result in small variation. The significance of this will be discussed in the concluding sections of this paper.

* The theory is based on the hypothesis that the variates follow a normal distribution, and though this may not be strictly true for the $\rho_1$'s and $\sigma_1$'s the method probably gives a sufficiently accurate approximation to the value of their correlation.

TABLE IV. Constants of Individual Series (Trisection).
(The definition of these constants is given on p. 35.)

1 | 2 | 3 | 4 | 5 | 6 | 7 | 8
Series | $\bar d_1$ | $b$ | $\rho_1'$ | $\rho_1$ | $\sigma_1'$ | $\sigma_1$ | $\sigma_\delta$
I | 2.6238 | +.000673 | +.2925 | +.3008 | .06015 | .06093 | .0721
II | 2.7036 | -.002964 | +.4149 | +.5485 | .08125 | .09182 | .0873
III | 2.6350 | -.003626 | +.3643 | +.5560 | .08001 | .09561 | .0901
IV | 2.5114 | +.001718 | -.2521 | -.0460 | .05356 | .05902 | .0854
V | 2.5308 | -.001555 | +.2520 | +.3234 | .06853 | .07210 | .0839
VI | 2.5132 | -.001529 | +.2270 | +.3390 | .05495 | .05921 | .0681
VII | 2.6448 | -.004244 | +.4918 | +.6457 | .09322 | .11154 | .0939
VIII | 2.5314 | -.002788 | +.5478 | +.6089 | .11369 | .12060 | .1067
IX | 2.3404 | -.004477 | +.4979 | +.7075 | .07935 | .10233 | .0783
X | 2.5590 | -.000086 | +.7151 | +.7151 | .11141 | .11141 | .0841
XI | 2.4582 | -.000972 | +.7320 | +.7381 | .08317 | .08435 | .0610
XII | 2.5014 | -.002720 | +.4851 | +.6360 | .05923 | .07105 | .0606
XIII | 2.4752 | -.003594 | +.5101 | +.6897 | .06409 | .08244 | .0649
XIV | 2.5000 | -.003818 | +.6433 | +.7965 | .05993 | .08141 | .0519
XV | 2.4290 | -.005588 | +.6810 | +.8568 | .07051 | .10711 | .0573
XVI | 2.4390 | -.003071 | +.6408 | +.7412 | .06441 | .07819 | .0562
XVII | 2.4254 | -.004369 | +.2569 | +.6556 | .05840 | .08594 | .0713
XVIII | 2.3944 | -.000580 | +.2870 | +.3144 | .04568 | .04644 | .0544
XIX | 2.3700 | -.003236 | +.4935 | +.7219 | .05107 | .06920 | .0516
XX | 2.3666 | -.001725 | +.2850 | +.5072 | .03680 | .04443 | .0441

Mean value of $b = -.002425$.

In Table III giving the $\sigma$'s, it will be seen that in general the standard deviations increase in the later groups; though this may be due in part to the parabolic form of the sessional change, with its tendency to an increasing drop towards the end of the series, it is possible that it also indicates a fatigue effect setting in, and causing the later observations in a series to be more erratic; the same phenomenon appears in the Bisection Experiment, where there is no appreciable sessional change within the series. It may in fact be looked on as a sessional change in the standard deviation.

At the end of Table I are given the dates on which the different series were carried out, remarks noted at the time as to the condition of the observer, and, for the last 14 series, the time taken to mark off the 70 forms*. It will be seen that there was a large gap between the times of carrying out the first six and the last fourteen series, and this interval of nearly two months broke the continuity of the secular change in the means of the series.
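The rank-correlation estimate of the relation between $\rho_1$ and $\sigma_1$ can be sketched from the 5th and 7th columns of Table IV. Two assumptions are made here and should be read as such: the values are as transcribed from Table IV, and the passage from rank correlation to product-moment correlation is taken to be the relation $r = 2\sin(\pi\rho_s/6)$ for normally distributed variates, which the paper does not itself write out. The function names are hypothetical.

```python
from math import pi, sin, sqrt

# rho_1 and sigma_1 for Series I to XX, as transcribed from Table IV.
rho1 = [0.3008, 0.5485, 0.5560, -0.0460, 0.3234, 0.3390, 0.6457, 0.6089,
        0.7075, 0.7151, 0.7381, 0.6360, 0.6897, 0.7965, 0.8568, 0.7412,
        0.6556, 0.3144, 0.7219, 0.5072]
sigma1 = [0.06093, 0.09182, 0.09561, 0.05902, 0.07210, 0.05921, 0.11154,
          0.12060, 0.10233, 0.11141, 0.08435, 0.07105, 0.08244, 0.08141,
          0.10711, 0.07819, 0.08594, 0.04644, 0.06920, 0.04443]


def ranks(values):
    """Rank from 1 (smallest) to n (largest); assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result


def rank_correlation(x, y):
    """Correlation of ranks from the sum of squared rank differences."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))


rho_s = rank_correlation(rho1, sigma1)
# Assumed conversion from rank correlation to product-moment correlation.
r = 2 * sin(pi * rho_s / 6)
# Probable error by the usual formula .6745 (1 - r^2) / sqrt(n).
probable_error = 0.6745 * (1 - r * r) / sqrt(len(rho1))
```

With the values as transcribed, `r` and `probable_error` come out close to the $+.52 \pm .11$ quoted in the text.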
In Figure 5 the means of Group 1 (or the $\bar d_1$'s of each series) have been plotted firstly with the order of series and secondly with the date of series.

[Fig. 5. Trisection. Personal Equation (in inches) plotted against Order of Series and against Time (days from 6th May).]

If $x$ is the personal equation, or mean value of the observations of Group 1 of a series measured in inches, $y$ the number or order of a Series, $z$ the number of days between the 6th May and the date on which the series was carried out, we have for the regression straight line of $x$ on $y$

$$x - 2.4976 = -.01351\,(y - 10.5) \qquad \text{(xxxii)},$$

and for the regression of $x$ on $z$

$$x - 2.4976 = -.00233\,(z - 53.05) \qquad \text{(xxxiii)},$$

and these lines have been drawn in the diagram.

* Reference to the 7 trial forms first marked, in addition to the 63 of the Series proper, is made on p. 28 in footnote.

The corresponding coefficients of correlation between (1) personal equation and order, (2) personal equation and time, and (3) time and order, a meaningless coefficient but required to find the partial correlations, are

(1) $r_{xy} = -.800 \pm .054$, (2) $r_{xz} = -.692 \pm .079$, (3) $r_{yz} = +.882 \pm .033$,

and the partial correlations are

$r_{xy.z} = -.559 \pm .104$, $r_{xz.y} = +.049 \pm .150$.

But the interval between the May and July series was so large that the series should perhaps be considered as forming two groups, one of six and the other of fourteen. Taking the last fourteen series, we have the regression lines

$$x - 2.4596 = -.01346\,(y - 7.5),$$

the Series VII being given the order 1, VIII, 2, etc., and

$$x - 2.4596 = -.01048\,(z - 6.64),$$

$z$ being the days between 10th July and date of Series. These lines have also been drawn on the diagrams.
The correlations are

(1) $r_{xy} = -.674 \pm .098$, (2) $r_{xz} = -.673 \pm .099$, (3) $r_{yz} = +.956 \pm .016$,

giving partial correlations

$r_{xy.z} = -.143 \pm .177$, $r_{xz.y} = -.138 \pm .177$.

The point of interest is this: there is a secular change in personal equation from series to series; is this change more closely related to the number of series or sessions that have gone before (that is, almost, to the experience gained), or is it due to some change with time in the observer's outlook? Suppose that it was arranged to carry out observations on a number of different days with varying intervals of time between them, and that on each day a certain number of different series of observations or sessions were undertaken at regular intervals of perhaps an hour or less; any series could then be classified as the $p$th series of the $q$th day. Then $r_{xy.z}$ (the partial correlation of personal equation and order, time being kept constant) would give a measure of the relationship between change in personal equation and order of series in any one day. This will not necessarily be the same as the sessional change, for it has been supposed that this latter occurs only during the course of a sitting, and is broken by the interval of rest in between. On the other hand, if we take all the $p$th series of the various days, then $r_{xz.y}$ (the partial correlation of personal equation and time, order being constant) gives the relation between change in personal equation and time, taken over a long period. The long break in the middle of the Trisection Experiment takes away any real significance from the difference between $r_{xy.z}$ ($-.559$) and $r_{xz.y}$ ($+.049$) for the twenty series, and in the case of the last fourteen series these coefficients are practically equal ($-.143$ and $-.138$), because the intervals between the series were nearly uniform. In the Timing Experiments, C and D, the arrangement of the series in groups on consecutive days leads to considerably more interesting results*.
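The partial correlations for the twenty series can be checked against the three total correlations with the ordinary first-order formula. The sketch below assumes the values $r_{xy} = -.800$, $r_{xz} = -.692$ and $r_{yz} = +.882$ as read from the text; the function names are hypothetical, and the probable error uses the formula $.6745\,(1 - r^2)/\sqrt{n}$.

```python
from math import sqrt


def partial_correlation(r_ab, r_ac, r_bc):
    """Correlation of a and b with c held constant."""
    return (r_ab - r_ac * r_bc) / sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))


def probable_error(r, n):
    """Probable error of a correlation coefficient from n pairs."""
    return 0.6745 * (1 - r * r) / sqrt(n)


# Total correlations for the twenty series (x: personal equation, y: order, z: time).
r_xy, r_xz, r_yz = -0.800, -0.692, 0.882

r_xy_z = partial_correlation(r_xy, r_xz, r_yz)   # order, time constant
r_xz_y = partial_correlation(r_xz, r_xy, r_yz)   # time, order constant
pe_xy = probable_error(r_xy, 20)
```

Because the inputs are rounded to three places, the results agree with the quoted $-.559$ and $+.049$ only to within a unit or two in the third place.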
A comparative measure of the consistency of the consecutive judgments in the different series is the standard deviation of first differences, or

$$\sigma_\delta = \sqrt{\frac{1}{n}\sum_{t=1}^{n}(y_{t+1} - y_t)^2} = \sigma_1\sqrt{2(1-\rho_1)}$$

approximately. The values of this expression are given in the 8th column of Table IV. Now suppose we compare the constants in Table IV, the dates and remarks at the end of Table I and the diagrammatic representation of seven of the series given in Figure 6. The first series to be remarked on is IV; most of the series were carried out at the beginning of the morning before any other work, and it is possibly the fact that IV was done soon after a spell of measuring spectrograms with a Zeiss comparator that explains the exceptional values of $\rho_1$ and $\rho_1'$, namely $\rho_1 = -.0460$, $\rho_1' = -.2521$. The $\sigma_\delta$, or standard deviation of 1st differences, is no higher than for the other series done at about the same time (in May), and the $\sigma_1$ is lower. The first graph in Figure 6 gives the diagram of this series; the rapid fluctuations in judgment about a very steady, if slowly changing, personal equation may perhaps have some physiological significance. In the second and third graphs of Figure 6 are represented two of the four Series VII to X which were done when the observer was not very fit; they have large values for $\sigma_1$, and the $\sigma_\delta$'s are large compared with those of the ten series which follow, showing that the judgments were rather erratic; the correlation is however high. In VIII there is a great jump between the 44th and 45th judgments, from 2.22 to 2.66, and the gradual drop down, which follows, to 2.20 (for the 52nd judgment) is a good example of a way in which successive judgments are correlated. In Series XI (not represented among the graphs) there appears to be a periodic variation, for the correlation falls steadily from $\rho_1 = +.7381$ to $\rho_{12} = -.4428$.
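The measure $\sigma_\delta$ can be sketched in both of its forms: directly from the first differences of a series, and from the approximate relation with $\sigma_1$ and $\rho_1$. The function names and the short alternating series are hypothetical; the constants for Series XV and XX are those of Table IV.

```python
from math import sqrt


def first_difference_sd(y):
    """Root-mean-square of the first differences y[t+1] - y[t]."""
    diffs = [b - a for a, b in zip(y, y[1:])]
    return sqrt(sum(d * d for d in diffs) / len(diffs))


def sd_from_correlation(sigma_1, rho_1):
    """Approximate sigma_delta from the standard deviation and the
    correlation of successive judgments."""
    return sigma_1 * sqrt(2 * (1 - rho_1))


# A hypothetical series that alternates by one unit at every judgment.
alternating = first_difference_sd([0, 1, 0, 1])

# Series XV and XX, with sigma_1 and rho_1 as given in Table IV.
series_xv = sd_from_correlation(0.10711, 0.8568)
series_xx = sd_from_correlation(0.04443, 0.5072)
```

The two Table IV rows illustrate the point made in the text: a large $\sigma_1$ combined with a high $\rho_1$ can give much the same average jump from judgment to judgment as a small $\sigma_1$ with a lower $\rho_1$.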
XIII, XV and XVI are typical highly correlated series with large sessional changes; the $\sigma_\delta$'s as well as the $\sigma_1$'s are considerably smaller than in the series VII to X. In examining the fourth to sixth graphs we notice what may be called the large scale correlated variations superimposed upon the linear sessional change; it is due to these variations that the values of $\rho_1'$ remain so large, and it is their absence that makes the correlation in IV so low. The last graph (Series XX) is that for which the $\sigma_1$ and $\sigma_\delta$ are least, and yet $\rho_1$ is quite high ($+.507$). As a last instance of these points, we may compare the constants† for XV and XVIII:

Series | $b$ | $\rho_1'$ | $\rho_1$ | $\sigma_1$ | $\sigma_\delta$
XV | -.005588 | +.6810 | +.8568 | .10711 | .0573
XVIII | -.000580 | +.2870 | +.3144 | .04644 | .0544

[Fig. 6. Trisection. Diagrams representing variations in judgment, for Series IV, VIII, IX, XIII, XV, XVI and XX. Axes: estimate in inches; order of observation in Series. The horizontal line intersecting each graph gives the mean of the first 50 observations in that series.]

* See pp. 70, 75 and 83, and below.

† For definitions of these constants the table on page 35 may be referred to.
XV has a large linear sessional change, but superimposed on this there must be considerable correlated variations, for the removal of the best fitting straight line only reduces $\rho_1$ (.8568) to $\rho_1'$ (.6810); XVIII has variations altogether on a smaller scale; the correlation of successive judgments is low and it is barely affected by the removal of the linear sessional change. And yet, though the $\sigma_1$ for XV is more than twice as great as for XVIII, the $\sigma_\delta$'s, or measures of the average jump in estimation from judgment to judgment, are practically identical; the importance of this constant $\sigma_\delta$ as a measure of variability of judgment is discussed on p. 69 below.

[TABLE V. Constants of Combined Series (Trisection), for $k = 1$ to 14. Entries illegible in this copy.]

[TABLE VI. Coefficients of Correlation of Successive Observations freed from Linear Sessional Change, from the 1st Difference Equations and from the 2nd Difference Equations, for $k = 1$ to 13. Entries illegible in this copy.]

(b) The Combination of Series.

Having discussed the reduction of the individual series, I will proceed to consider the results of combining the 20 series. The formulae (v) to (viii) on pp. 33 and 34 give the values of $D_k$, $S_k$ and $R_k$, which are given in Table V. Remembering that $D_1$ and $S_1$ are the mean and standard deviation of the combined observations $y_1, y_2 \ldots y_{50}$ from each of the 20 series, $D_2$ and $S_2$ the mean and standard deviation of the combined observations $y_2, y_3 \ldots y_{51}$, and finally $D_{14}$ and $S_{14}$ of $y_{14}, y_{15} \ldots y_{63}$, we see that the progressive decrease in $D_k$ as $k$ increases indicates the shortening in the estimate of a third during the course of a sitting, while the increase in $S_k$ may perhaps be partly due to increasing variability of judgment due to fatigue. The values of $R_k$ are large, but this is to be expected owing to the large changes in personal equation from series to series; in fact for $k = 13$ it will be found that the limiting expression $\lambda_k$ of page 34 gives $\lambda_{13} = +.5435$, while $R_{13} = .6151$.
The reason for this difference between $\lambda_{13}$ and $R_{13}$ is that $S_m(\rho_{13}\sigma_1\sigma_{14})$, and therefore $R_{13}'$, does not vanish. The next step is to obtain the values of $S_k'$ and $R_k'$, or the standard deviations and correlations of successive judgments after the secular change has been removed. They are found from Equations (ix) and (x) of p. 34 and are given in Table V (5th and 6th rows).

There is here an opportunity of testing the accuracy of the Difference Correlation method discussed in Section V (b); the case is that of Problem 1, page 41: the values of $R_1, R_2 \ldots R_{13}$ are known and give the correlations of 1st differences; these, together with the coefficients of correlation of 2nd differences to be used later, are given at the bottom of Table V. Then using the value $+.6246$ for $R_1'$, we get the values of the 12 quantities $R_2' \ldots R_{13}'$, which have been inserted in the 7th row of Table V. It will be seen that the values obtained by this approximate method agree well with the others, the differences being within the probable error of the $R_k'$'s up to and including $R_7'$; beyond this the approximate values become rapidly too small, the error, from the form of Equations (xx) and (xxi), being clearly cumulative. This failure is certainly largely due to the fact that the errors involved in the assumptions (a), (b) and (c) of p. 37 are not negligible when the later groups enter into the correlation, for we have already seen that both $D_k$ and $S_k$ change steadily with $k$.

The values of $S_k'$ and $R_k'$ in Table V correspond to the average values of the standard deviations and correlations of successive judgments in the individual series, i.e. of the $\sigma$'s and $\rho$'s given in Tables III and I. Owing to the sessional change which occurs during the course of nearly all the series, $R_k'$ does not vanish as $k$ increases, but appears to approach a limiting value in the neighbourhood of $+.16$.
By obtaining for the separate series the coefficient of correlation, $\rho_1'$, of the successive observations at intervals of one, freed from the linear sessional change, a step has been made towards the further reduction of the problem. $R_1''$, the coefficient for the combined series corresponding to $\rho_1'$ of the individual series, is given by

$$R_1'' = \frac{S_m(\rho_1'\sigma_1'\sigma_2')}{\sqrt{S_m(\sigma_1'^2)\,S_m(\sigma_2'^2)}} \qquad \text{(xxv) bis},$$

and taking $\rho_1'$, $\sigma_1'$ and $\sigma_2'$ calculated from Equations (xxii), (xxiv) and (xxv) we find that

$$R_1'' = +.48922 \pm .01623.$$

Then $R_2'', R_3'' \ldots R_{13}''$ can be found by the method of Problem 2, p. 41; or again, using the value of $R_2''$ found from the first difference equations, we may proceed to second differences as in Problem 3, and so obtain $R_3'' \ldots R_{13}''$. In this particular case there is no need to use the second difference equations*, but the values of the $R_k''$'s have been worked out by both methods, as numerical examples of the theory of Sections V (a) and (b). A comparison of the values given in Table VI shows that there is no significant difference between the results of the two methods†, and the agreement found earlier in this section between the values of $R_k'$ calculated directly and those found from the difference equations warrants confidence in the results for $R_k''$. Although the negative values of $R_k''$ are probably too large for the higher values of $k$ (just as the later positive values of $R_k'$ in Table V, row 7, were too small), there is no doubt, I think, that the correlation of the successive observations freed from the linear sessional change does become negative at $k = 5$, 6 or 7 and remain negative for the higher values of $k$.
A word of qualification is necessary; the linear sessional change to be removed has been represented by the line "best" fitting the first 50 observations of each series, and a glance at Figure 4 shows that the mean values of the later observations in the series of 63 would lie well off this line because of the parabolic form of the sessional change; the negative values found for the $R_k''$ at the higher values of $k$ may probably be largely accounted for by this fact. A more satisfactory approximation to the correlation of successive judgments freed from sessional change will be obtained in Section XI below.

As $\sigma_\delta = \sigma_1\sqrt{2(1-\rho_1)}$, referred to on p. 55 above, gives the standard deviation of the first differences of consecutive judgments in a single series, we shall have as a corresponding measure for the combined twenty series

$$S_\delta = S_1'\sqrt{2(1-R_1')}.$$

For the Trisection Experiment $S_\delta = .0732$.

* To get an idea of the order of the terms $Q_k$ and $b^2$ which are being neglected, the values were calculated for two values of $k$, with the result: $k = 1$, $Q_1 + b^2 = -.000001064$; $k = 9$, $Q_9 + b^2 =$ [illegible].

† The probable errors in the Table have been calculated from the usual formula $e = \pm .6745\,\dfrac{1 - R^2}{\sqrt{n}}$ and do not cover the errors arising from the method of approximation.

(c) On the possible Effect of shifting the Head during the course of a Series.

It was suggested to me that the correlation of successive judgments in this and in the Bisection Experiment might be due to periodic shifting of the head from side to side during the course of a series, some parallax effect of the two eyes making corresponding variations in the estimation of a third (or a half) of the line on the form.
Now such an explanation might account for part of the correlation in these two experiments, but it could not explain the regular secular and sessional changes in the Trisection, except by the highly improbable hypothesis that the observer's head leant over increasingly to one side during the course of a sitting, and that he started with it more on one side in the later series than in the earlier ones. But beyond this, the fact that correlation is found also in the timing experiments suggests that it is of deeper and more complex origin. It is likely to arise from many unknown causes affecting the environment and condition of the observer, and if one of them is a relative shifting of the eyes, it is of interest, for it will enter into many kinds of observations where the observer who takes the readings is not looking through a fixed eyepiece.

To test the effect of a relative shift between head and paper, 42 of the forms were taken and trisected in the usual way, but for alternate groups of seven the paper was shifted 4 inches relatively from side to side. The measures of the estimates and their means are in Table VII. The three sets of seven under the heading I were made with the forms in one position, the three sets under II with the forms shifted 4 inches to the right. The difference is noticeable at once; readings I are smaller than II, and at the same time the curious effect of sessional change is appearing, the later readings of I, and again of II, being on the whole smaller than the earlier ones. Now in carrying out the observations of the Trisection and Bisection Series, the body and head were kept as steady as possible, and it is unlikely that frequent shifts as large as 4 inches could have occurred; further, the differences between the means of readings I and II are much smaller than the actual variations in judgment shown in the diagrams of Figure 6.
But as a further test, a series of 63 forms was marked off with the head fixed mechanically; the results are given in Table VIII with the usual notation. The correlations are not as high as many of those in Table I, but they are comparable with those of Series I, V, VI, XVIII. The sessional change is also indicated by the decrease in $d_k$ as $k$ increases*. Without carrying through a good number of series with fixed head, no useful comparison can be made, but I think that the evidence of this one series is sufficient to justify the assertion that a shifting of the head from side to side cannot account for the greater part of the correlation of successive judgments.

* The value of $\sigma_\delta$, or .074, may be compared with that of $S_\delta$ for the ordinary series of Experiment A, which was .0732.

[TABLE VII. Experiments on Shift of Head. TABLE VIII. The tabulated values of both tables are illegible in this copy.]

(d) Summary.

First considering the individual series, it was noticed that there was a secular change in Personal Equation with time, i.e. the means decreased in passing from
the earlier to the later series; in addition there was a remarkably constant sessional change within each series, this change being again a decrease from the earlier to the later observations. There was something in these changes almost analogous to an elastic strain: during the course of a series the estimation of a third drops; in the interval between the series there is a recovery, but not a complete recovery, for the first judgments in the succeeding series start at a little lower level than the first, but well above the last, judgments in the series before; this slight "permanent deformation" caused by the "strain" represented in the sessional change results in the secular fall. The figure below gives an ideal representation of this.

[Figure: $A_1B_1$, $A_2B_2$, ... sessional change in Series 1, 2, ... etc.; $B_1A_2$, $B_2A_3$, ... "recovery" during the interval between Series 1 and 2, 2 and 3, etc.; $M_1M_2$ the resulting secular change.]

Then combining the twenty series, in order to get more reliable results, the coefficients of correlation of successive judgments, $R_k$, were obtained; owing to the secular and sessional changes these coefficients had very high values and, as $k$ increased, apparently tended to a limit at about +.60. By fitting the means of the series together, the secular change was eliminated, and a series of coefficients $R_k'$ obtained, which represented the average value of the correlation in a series; owing to the sessional change the $R_k'$'s did not appear to tend to zero as $k$ increased but to a limit at +.16 or +.15.
The correlation of successive values of the residuals, left after subtracting the ordinates of the straight line "best" fitting the first 50 observations of each series from the observations of that series, gave a set of coefficients $R_k''$, which fell off very rapidly and became negative when $k$ equalled 6 or 7; the large negative values of the coefficients for the high values of $k$ were probably due in part to the method of approximation used, and also to the fact that the straight line fitting the first 50 observations in a series did not represent satisfactorily the sessional change. The values of $R_k''$ calculated (up to $k = 13$) gave no evidence of any tendency to periodicity in this coefficient, although there was evidence of this occurring in some of the individual series; periodicity in $R_k''$ would indicate marked variations of roughly the same period occurring at any rate in a large number of the series.

It will be shown in a later section that the values of $R_k'$, $k = 1 \ldots 13$, can be fitted very closely by a curve of the type $y = p + qr^k$, where $p$, $q$ and $r$ are constants.

Finally it was shown that the correlation of successive judgments could not be due to a shifting of the head during the course of observation, although this might perhaps be one of many contributory causes.

[Fig. 8. Experiment A. Trisection. Correlation-Interval Diagram: correlation of successive judgments plotted against the interval between successive judgments.]

In Figure 8 the values of $R_k$, $R_k'$ and $R_k''$ (for linear sessional change) are plotted to $k$; the theoretical curves of Equations (xx) and (lvi) shown in the Figure will be discussed in Section XI, and also the further set of points marked in the Figure [their symbol is illegible in this copy].

VII. EXPERIMENT B. BISECTION. REDUCTION OF OBSERVATIONS.

(a) The Individual Series.
In this Experiment the coefficients of correlation of successive judgments for the individual series were not all worked out, but only the values of $\rho_1$; these are tabled with $\sigma_1$ and $d_1$ in Table IX. The values of $d_k$ for each series, $k = 1 \ldots 14$, are also given in Table X. It will be seen that there are not the same marked secular or sessional changes as characterised the Trisection Series. In Figure 9 the means of Groups 1 of each Series, or the $d_1$'s, have been plotted to "order" and to "time," and again if $x$ is the personal equation or mean, $y$ the number of order of series, $z$ the number of days between 13th June and the date on which the series was carried out, we have for the regression lines

$$x - 2.8793 = -.00101381\,(y - 10.5) \qquad \text{(xxxiv)},$$
$$x - 2.8793 = -.0005359\,(z - 32.10) \qquad \text{(xxxv)}.$$

These lines have been drawn in the diagrams; the coefficients of correlation are

$$r_{xz} = -.337 \pm .134, \quad r_{xy} = -.156 \pm .147, \quad r_{yz} = +.945 \pm .016,$$

giving partial correlations

$$r_{xy.z} = +.529 \pm .109, \quad r_{xz.y} = -.589 \pm .099.$$

TABLE IX. Constants of Bisection Experiments.

[The columns giving the date (1920), the time at start and the time taken for each series are not legible in this copy.]

Probable errors of coefficients of correlation calculated from 50 pairs of the variates: $\rho = .80$, ±.0343; $\rho = .70$, ±.0486; $\rho = .60$, ±.0610.

Series | $d_1$ | $\sigma_1$ | $\rho_1$ | $\sigma_\delta = \sigma_1\sqrt{2(1-\rho_1)}$
I | 2.8648 | .04997 | +.4942 | .0503
II | 2.8624 | .05461 | +.2609 | .0664
III | 2.9262 | .03821 | +.0823 | .0518
IV | 2.8642 | .04690 | +.4107 | .0509
V | 2.8290 | .05158 | +.5870 | .0469
VI | 2.9114 | .04609 | +.5768 | .0424
VII | 2.9178 | .04415 | +.2993 | .0523
VIII | 2.9218 | .04766 | +.4360 | .0506
IX | 2.8724 | .04384 | +.1389 | .0575
X | 2.8990 | .04579 | +.1018 | .0614
XI | 2.9238 | .03617 | -.0423 | .0522
XII | 2.9298 | .04810 | +.5089 | .0477
XIII | 2.8806 | .04407 | +.2769 | .0530
XIV | 2.8312 | .04955 | +.4445 | .0522
XV | 2.8242 | .03606 | +.3334 | .0416
XVI | 2.7976 | .04135 | +.3190 | .0483
XVII | 2.8566 | .03739 | +.5531 | .0353
XVIII | 2.8808 | .03497 | +.2776 | .0420
XIX | 2.8890 | .03986 | +.5407 | .0382
XX | 2.9030 | .02610 | +.1404 | .0342

Probable errors (continued): $\rho = .50$, ±.0715; $\rho = .40$, ±.0801; $\rho = .30$, ±.0868; $\rho = .20$, ±.0916; $\rho = .10$, ±.0944; $\rho = .00$, ±.0954.

Mean time taken for a series of 70 observations (including the 7 preliminary trials*): 5 m. 58 s.
Mean interval between records of judgment: 5.11 s.

* See p. 28, footnote.

[TABLE X. Bisection. Table of Means of Groups (in inches), Series I to XX, Groups 1 to 14. The tabulated values are illegible in this copy.]
The dates on which the series were carried out, the $z$'s, are given at the end of Table IX; the distribution was more satisfactory than that of the Trisections, and the significance of these two partial correlations will be referred to shortly.

The variation in the means of the series is much smaller than in the case of the Trisections; we have here a range from 2.93 to 2.80 ins., while in the other from 2.70 to 2.34 ins.; in both cases the secular change is in the direction which lessens the measures, i.e. the marks on the forms in the later series were on the whole further to the observer's left hand than in the earlier series. Nor does experience appear to increase accuracy, for the true position of the half is at 2.97 inches (and of the third at 2.51 inches).
[Fig. 9. Bisection. Means of Groups 1 of each series (mean of 1st 50 observations of series) plotted with Order of Series and Date of Series (time in days from 13th June), with the regression lines (xxxiv) and (xxxv).]

Next considering the sessional change, the values of $\bar{y}_t$ (defined on p. 47) have been plotted in Figure 10; the straight line "best" fitting these points is

$$\bar{y}_t - 2.8816 = +.0003534\,(t - 32),$$

where $t$ is the order of observation in a series, and the coefficient of correlation between $\bar{y}_t$ and $t$ is +.5294 ± .0137*. Using the relations of page 48, it is found that

$$\eta_{yt} = .271 \pm .018, \quad \sqrt{1 - \eta_{yt}^2} = .963,$$

and on comparing this latter value with that for the Trisections (.815) we see that in the present case the mean sessional change is of less significance.

[Fig. 10. Bisection. Diagram of Mean Sessional Change, $\bar{y}_t$ in inches plotted against $t$, Order of Observation in Series. Mean at 2.8816 inches; regression $\bar{y}_t - 2.8816 = +.0003534\,(t - 32)$; the value of the true half is 2.97 inches.]

* This correlation between the mean $t$th observation ($\bar{y}_t$) and $t$ must be distinguished from the correlation between the $t$th observation ($y_t$) and $t$, which is +.143 and, as it should be, less than $\eta$.

It will be noticed in looking at Figure 10 that the points $(t, \bar{y}_t)$ appear to be subject to a fairly consistent periodic variation about the regression line, the complete period covering from 20 to 22 observations. Without a detailed analysis of the separate series, it is not possible to say whether there is a period of this order underlying the variations in judgment in all series, or whether this periodicity in $\bar{y}_t$ results from large variations in one or two series; the diagrams of seven of the series in Figure 11 do not certainly suggest any marked periodic variation, and it is possible that the drop at about the 44th and the peak near the
55th observations in Series V and VI would go far to account for the similar features in the $\bar{y}_t$ diagram, the "y" scale of which is four times greater than that in Figure 11.

[Fig. 11. Bisections. Diagrams representing variations in judgment (estimate in inches against order of observation in series) for Series I ($\rho_1 = +.494$), V ($\rho_1 = +.587$), VI ($\rho_1 = +.577$), X ($\rho_1 = +.102$), XII ($\rho_1 = +.509$), XVI ($\rho_1 = +.319$) and XX ($\rho_1 = +.140$). The horizontal line intersecting each graph gives the mean of the first 50 observations in that series.]

Using the method of Correlation of Ranks*, the correlation between $\sigma_1$ and $\rho_1$ has been calculated for the 20 Series; the result is

$$r_{\sigma_1\rho_1} = +.420 \pm .124.$$

Another coefficient which may be calculated is that of the correlation between $\sigma_\delta$, or the standard deviation of first differences of consecutive judgments, and $\rho_1$; using the same method as for $r_{\sigma_1\rho_1}$, it is found that

$$r_{\sigma_\delta\rho_1} = -.416 \pm .125,$$

and again

$$r_{\sigma_\delta\sigma_1} = +.465 \pm .118.$$

Now $\rho_1$, $\sigma_1$ and $\sigma_\delta$ are not three independent quantities, as they are connected by the relation

$$\sigma_\delta = \sqrt{\frac{1}{n}\sum_{t=1}^{n}(y_{t+1} - y_t)^2} = \sigma_1\sqrt{2(1-\rho_1)},$$

and it is open to question which two are the most fundamental. In the ordinary theory of the Combination of Observations, where it is assumed that $\rho_1$ is zero, it is natural to consider $\sigma_1$ (or $\sigma$) as a fundamental constant, the measure of the accuracy of judgment; $\sigma_\delta$ appears to have no special significance and merely equals $\sqrt{2}\,\sigma$.
If however there is a correlation of successive judgments, $\sigma$ loses its importance; if we take a small number, $p$, of successive observations and calculate their standard deviation, $s_p$, we can no longer say that $s_p$, subject to its probable error $\pm .6745\,\sigma/\sqrt{2p}$, will be equal to $\sigma$, the standard deviation of a long series of judgments. On the other hand there is every reason to expect that the $\sigma_\delta$ found from a few observations will give a fair approximation to the $\sigma_\delta$ found from a large number. $\sigma$ is dependent to a high degree on the sessional change; for example it has been shown† that if this change can be represented by a straight line of the form $y = bt$, then $\sigma'$, or the standard deviation of the observations freed from this change, is given by

$$\sigma'^2 = \sigma^2 - \frac{b^2}{12}(n^2 - 1).$$

It is true that $\sigma_\delta$ is dependent to some extent on the sessional change, but far less so; for instance in the case of the linear sessional change, $\sigma_\delta'$, the standard deviation of the first differences of the successive residuals left after the removal of the line, is given approximately by the relation

$$\sigma_\delta'^2 = \sigma_\delta^2 - b^2.$$

And for any form of sessional change which is likely to occur in experiments of the type we are considering, the correction to the difference between two successive observations necessary to get the corresponding difference between the residuals after the removal of the sessional term will be very small indeed compared with the standard deviation of this difference, or $\sigma_\delta$.

* p. 52 and footnote.
† Section V (b) p. 43.

It is therefore suggested that in the combination of correlated observations, $\sigma_\delta$, the average value of the jump in estimation between two successive judgments, is of more fundamental importance than $\sigma$.
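The contrast drawn above between $\sigma$ and $\sigma_\delta$ can be checked numerically. The sketch below is an illustration under stated assumptions, not the author's computation: the error series, the slope `b` and all names are hypothetical, $\sigma_\delta$ is taken as the root mean square of the first differences, and the errors are chosen symmetric about the middle observation so that both relations hold exactly (for real data the second is only approximate, as the text says).

```python
import math

def std_dev(xs):
    """Standard deviation about the mean, with divisor n."""
    mean = sum(xs) / len(xs)
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))

def rms_first_difference(xs):
    """Root mean square of the differences between consecutive values."""
    diffs = [b - a for a, b in zip(xs, xs[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

n = 63                      # observations in a series
b = 0.001                   # hypothetical slope of the linear sessional change y = b t
centre = (n + 1) // 2       # middle observation, t = 32

# Hypothetical errors, symmetric about the middle observation, so that they are
# uncorrelated with t and their first differences sum to zero.
errors = [0.04 * (-1) ** abs(t - centre) for t in range(1, n + 1)]
observed = [e + b * t for t, e in enumerate(errors, start=1)]

sigma = std_dev(observed)
sigma_delta = rms_first_difference(observed)

# Values freed from the sessional change, using the two relations in the text.
sigma_freed = math.sqrt(sigma ** 2 - b ** 2 * (n ** 2 - 1) / 12)
sigma_delta_freed = math.sqrt(sigma_delta ** 2 - b ** 2)
```

In this example the correction to $\sigma$ is several hundred times larger than the correction to $\sigma_\delta$, which is the sense in which $\sigma_\delta$ is the more stable constant.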
As an example, consider the diagrams of the observations of Series X and Series XX in Figure 11; the correlation, $\rho_1$, is very low in both cases, but it is suggested that the physiological significance of the difference in type between the two lies in the fact that $\sigma_\delta$ for Series X is nearly twice as large as $\sigma_\delta$ for Series XX, rather than in the difference in the $\sigma_1$'s. Or again in the diagrams of the Trisection Experiment, Figure 6, I would emphasise the same point in a comparison of the difference between the two highly correlated Series VIII and XVI.

Now returning to the coefficients of partial correlation

$$r_{xy.z} = +.529 \pm .109, \quad r_{xz.y} = -.589 \pm .099.$$

With the interpretation suggested on p. 54 for these coefficients, we are led to a rather suggestive conclusion. If we are dealing with a number of series carried out at equal intervals of time in the course of one, or even perhaps two days, but effectively at one epoch when comparison is made with the long range of nearly 70 days covered by the Bisection Series, then the correlation between $x$ and $y$ is positive, or the pencil mark in the later series tends to be made further to the observer's right than in the earlier series; this change is in the same direction as the sessional change within a series. There is indeed a curious coincidence, on which of course no stress must be laid,

$$r_{xy.z} = +.529 \pm .109, \quad r_{\bar{y}_t t} = +.5294 \pm .0137.$$

That is to say the correlation between the mean of a series and the order of that series, when a number of series are done in close succession, is of the same sign and magnitude as the correlation between the mean $t$th observation and its order, $t$, in the series. But if we are dealing with all the $p$th series of sets which have been carried out on different days with varying and perhaps many days' interval between, then the coefficient $r_{xz.y}$ is negative, or the bisection-marks on the later days have on the whole a tendency to move to the left of the observer; this is in the direction of the secular change.
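The two partial correlations quoted for the Bisection Experiment can be recovered from the three total correlations. The sketch below assumes the usual first-order partial correlation formula and the probable-error formula $.6745\,(1-r^2)/\sqrt{n}$ with $n = 20$ series; the function names are illustrative only.

```python
import math

def partial_correlation(r_ab, r_ac, r_bc):
    """Correlation of a and b with c held constant (first-order partial)."""
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))

def probable_error(r, n):
    """Probable error of a correlation coefficient r found from n pairs."""
    return 0.6745 * (1 - r ** 2) / math.sqrt(n)

# Total correlations for the Bisection series, as given in the text:
# x = mean of series, y = order of series, z = days from 13th June.
r_xz, r_xy, r_yz = -0.337, -0.156, 0.945

r_xy_z = partial_correlation(r_xy, r_xz, r_yz)   # order of series, date held constant
r_xz_y = partial_correlation(r_xz, r_xy, r_yz)   # date, order of series held constant

pe_xy_z = probable_error(0.529, 20)
pe_xz_y = probable_error(-0.589, 20)
```

The values obtained agree with the quoted +.529 and -.589 to within about .002, a difference that presumably comes from the three input coefficients being rounded to three figures; because $r_{yz}$ is so close to unity the partial coefficients are sensitive to such rounding.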
The conclusion which it seems possible to draw is this: if a number of series are done at very short intervals, the interval of rest between the series will not be sufficient to break the effect of the sessional change; but if a considerable interval elapses between the carrying out of the series, then the sessional change in one series has no influence on the judgments in the succeeding series, but a quite distinct secular change may be noticeable. In the Bisection Experiment both secular and sessional changes are very small, but they are acting in opposite directions. If these two changes are due to different physiological factors, it seems possible that it is the fact that they are acting in opposite directions in the Bisection Experiment which causes them to be of so much smaller magnitude than in the Trisection Experiment, where they were acting in the same direction.

(b) The Combination of the Series.

For the combined series, the coefficients of correlation of successive judgments $R_k$ for $k = 1, 2 \ldots 13$ were calculated from 13 correlation tables, each based on the 1000 combined observations; the results for $D_k$, $S_k$ and $R_k$ are tabled below (Table XI). The effect of the slight sessional change is noticeable in the increasing values of $D_k$. Using the values of $D_k$, $S_k$ and $R_k$, and of the $d_k$ of the separate series from Table X, Equations (vi), (vii) and (viii) give the corresponding sums over the individual series [the expressions are illegible in this copy] for $k = 1, 2 \ldots 14$. Equations (ix) and (x) then give the values of $S_k'$ and $R_k'$ contained in the 5th and 6th rows of Table XI. The value of $R_1'$ found by this method should be compared with that found with the help of the $\rho_1$'s, $\sigma_1$'s and $\sigma_2$'s of the individual series, namely

$$R_1' = \frac{\frac{1}{m}\sum(\rho_1\sigma_1\sigma_2)}{S_1'S_2'} = +.3578 \pm .0186 \qquad \text{(x) bis}.$$

The difference, which is well within the probable error, arises from the fact that $R_1'$ has been found by grouping the observations in a correlation table, while the $\rho_1$'s, $\sigma_1$'s and $\sigma_2$'s were found by direct multiplication of the crude values of the observations.
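The second way of reaching $R_1'$ can be approximated from the constants of Table IX. The sketch below rests on two assumptions that are not in the text: that the combined coefficient is the mean of $\rho_1\sigma_1\sigma_2$ over the $m = 20$ series divided by $S_1'S_2'$, with $S_1'^2$ the mean of the $\sigma_1^2$, and that $\sigma_2$ may be replaced by $\sigma_1$, since the $\sigma_2$'s are not tabled here. It is therefore a rough check and not a reproduction of the published figure.

```python
import math

# (sigma_1, rho_1) for Series I to XX of the Bisection Experiment, from Table IX.
table_ix = [
    (0.04997, 0.4942), (0.05461, 0.2609), (0.03821, 0.0823), (0.04690, 0.4107),
    (0.05158, 0.5870), (0.04609, 0.5768), (0.04415, 0.2993), (0.04766, 0.4360),
    (0.04384, 0.1389), (0.04579, 0.1018), (0.03617, -0.0423), (0.04810, 0.5089),
    (0.04407, 0.2769), (0.04955, 0.4445), (0.03606, 0.3334), (0.04135, 0.3190),
    (0.03739, 0.5531), (0.03497, 0.2776), (0.03986, 0.5407), (0.02610, 0.1404),
]

m = len(table_ix)

# Mean within-series variance and mean within-series product moment,
# taking sigma_2 equal to sigma_1 (an assumption made only for this sketch).
mean_variance = sum(s * s for s, _ in table_ix) / m
mean_product = sum(r * s * s for s, r in table_ix) / m

s1_prime = math.sqrt(mean_variance)       # compare with S_1' = .0436 in the text
r1_prime = mean_product / mean_variance   # compare with +.3578 in the text
```

Under these assumptions the sketch gives a value of $S_1'$ agreeing with the .0436 quoted for the Bisections and a value of $R_1'$ within about .002 of +.3578, the remaining gap being attributable to the substitution of $\sigma_1$ for $\sigma_2$.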
Another method of obtaining the $R_k'$'s is from the first difference correlation equations, or the method of Problem 1, p. 41; the results are given in the 7th row of Table XI, while the constants required in the solution, the coefficients of correlation of successive first differences, are in the 8th row of the Table. Comparing the values of $R_k'$ found by the two methods, we find good agreement up to $k = 6$, but beyond this point the $R_k'$'s of the second and approximative method assume much too large negative values*. It is however evident from the results of the first method that $R_k'$ does become negative, and as it could not remain negative indefinitely as $k$ increased, there seems here to be another indication that a periodic variation exists among the judgments, at any rate in a certain number of the series. For a complete period covering from 20 to 22 observations, suggested by the $\bar{y}_t$ diagram, $R_k'$ should have a minimum value at $R_{10}'$ or $R_{11}'$; the figures suggest that the minimum occurs somewhat earlier [the value of $k$ named is illegible in this copy], but the probable errors for these small coefficients are very large. When time is available it would be interesting to examine further the significance of this periodicity.

The points $(R_k, k)$ and $(R_k', k)$ have been plotted in Figure 12.

It will be noticed that the $S_k'$'s in the later groups are larger than in the earlier, this suggesting again, as in the case of the Trisections, that the observations become slightly more erratic towards the end of a series.

* This result tends to confirm the suggestion made on p. 60 that the difference correlation method gave too large negative values for $R_k''$ in the Trisection Experiment.
[TABLE XI. Constants of Combined Series (Bisection): rows for $D_k$, $S_k$, $R_k$, $S_k'$, $R_k'$ by the 1st and 2nd methods, and the correlations of successive first differences, $k = 1 \ldots 14$. The tabulated values are illegible in this copy.]

(c) Comparison with Experiment A.

The difference between the results of the two experiments is probably due to the fact that the estimation of a half is so much easier than the estimation of a third. The variations in the latter observations are all on a larger scale than in the former; the secular and sessional changes are very much greater, and if we compare the values of the fundamental constants, we find:

 | $S_1'$ | $S_\delta$ | $R_1'$
Trisection | .0845 | .0732 | +.6246 ± .0130
Bisection | .0436 | .0495 | +.3519 ± .0187
and even after the removal of the greater part of the sessional change (the best fitting straight lines) the coefficient $R_1''$ for the Trisections is +.4892, or greater than $R_1'$ for the Bisections. The ratio of the values of $S_\delta$, or roughly 3 to 2, is a measure of the relative uncertainty of the observer in making his estimate in the two different Experiments.

There is some evidence for a slight periodicity in the judgments in the Bisection Series; if there is any period in the Trisections it must cover at least 26 observations, for there is no indication of a significant increase in the values of $R_k$, as far as calculated, i.e. up to $R_{13}$.

[Fig. 12. Experiment B. Bisection. Correlation-Interval Diagram: correlation of successive judgments, $R_k$ and $R_k'$, plotted against the interval between successive judgments.]

VIII. EXPERIMENT C. COUNTING OF 10 SECONDS. REDUCTION OF OBSERVATIONS.

(a) The Individual Series.

The values of $d_1$, $\sigma_1$ and $\rho_1$ for each of the 20 series are given in Table XII, as well as the hour and date; the means ($d_1$) have been plotted to the order of series in Figure 13. If $x$ is the mean in the factor e/p for a series, $y$ the order of series, $z$ the time in hours and fractions of an hour between 2.0 p.m. on December 13 and the commencement of series, we have for the regression lines

$$x - .9186 = -.006056\,(y - 10.5) \qquad \text{(xxxvi)},$$
$$x - .9186 = -.001552\,(z - 38.24) \qquad \text{(xxxvii)},$$

[...]

$$R_1' = \frac{\frac{1}{m}\sum(\rho_1\sigma_1\sigma_2)}{S_1'S_2'} = +.5200 \pm .0156 \qquad \text{(x) bis},$$

and the remaining values of $R_k'$, $k = 2, \ldots 13$, by the approximate method of Problem 1, p. 41. Perhaps the chief source of error in the method is variation in $S_k$, which has been assumed constant; in this experiment the range of $S_k$ is only 1.8 % compared with 3.6 % for the Trisections and 2.5 % for the Bisections, and the results which are contained in the 6th row of Table XIV may be regarded, therefore, with reasonable confidence.
* Eleven definite interruptions in the ordinary routine of counting, due to a mistap on the key or a miscount of the 10 seconds, were recorded at the time of observation, but only three of these resulted in breaks of judgment > .07, the limiting value taken in the above investigation.

[Footnote to Table XIV: the constant marked (1) was obtained from a relation that is illegible in this copy, the values entering it being obtained without grouping; the constant marked (2) was obtained from the grouped data of the correlation tables; its probable error is illegible.]

As before, for the higher values of $k$, $R_k'$ may be
a little too low, and as a test of the amount of cumulative error which may be affecting $R_{13}'$, I have worked out this constant directly from the relations

$$R_{13}'S_1'S_{14}' = R_{13}S_1S_{14} - \frac{1}{m}\sum(D_1 - d_1)(D_{14} - d_{14}),$$
$$S_1'^2 = S_1^2 - \frac{1}{m}\sum(D_1 - d_1)^2, \quad S_{14}'^2 = S_{14}^2 - \frac{1}{m}\sum(D_{14} - d_{14})^2,$$

with the following results:

$$R_{13}' = -.0124 \pm .0213, \quad L_{13} = +.632^*, \quad S_1' = .03420, \quad S_{14}' = .03444.$$

$R_{13}'$ and presumably $R_{12}'$ are not therefore significantly negative, and it seems probable that $R_k'$ tends to zero as $k$ increases, without oscillating about that value. The points $(k, R_k)$ and $(k, R_k')$ are plotted in Figure 17; the theoretical curve drawn in the diagram will be referred to in Section XI.

* $L_{13}$, or the limit to which $R_{13}$ approaches as $R_{13}'$ tends to zero, is discussed on p. 34.

[TABLE XIV. Constants of Combined Series (Counting Seconds), $k = 1 \ldots 14$. The tabulated values are illegible in this copy.]

[Fig. 17. Experiment C. 10 Second Counting. Correlation-Interval Diagram: correlation of successive judgments plotted against the interval between successive judgments.]

IX. EXPERIMENT D. ESTIMATION OF 10 SECONDS. REDUCTION OF OBSERVATIONS.

(a) The Individual Series.

In Table XV are given the values of $d$ (the mean of the 63 observations of a series, not those of Group 1 only), and of $\sigma_1$ and $\rho_1$ for the individual series; the low values of $\rho_1$ will be noted at once, and also the high values of $\sigma_1$ compared with those in the Counting Experiment. In Figure 19 below the means have been plotted to order of series, and if $x$ is the mean in the factor e/p, $y$ the order of series, $z$ the time in hours and fractions of an hour between 10 a.m. on December 7th and the commencement of series,

TABLE XV. Constants of Individual Series (Estimate of Seconds).

Series | $d$ | $\sigma_1$ | $\rho_1$ | Time of Start | Date (1920)
I | 1.151 | .1217 | +.1518 ± .0932 | 10.45 a.m. | 7th December
| II Tell] | +1254 + °2332 +0902 11.30 a.m. Ill 1:109 "1330 — 0249 + 0953 12.10 p.m. 7th December IV | 1°052 | +1393 +°1803 + :0923 2.0 p.m. V ‘973 "1292 + 2632 + ‘0888 3.0 p.m. VI HENGE) *1349 +1300 + 0988 10.15 a.m. } VII 1-011 1312 + 3673 + 0825 11.0 a.m. VIII 1:073 1318 +1631 +0929 2.0 p.m. 8th December IX 1:003 | +1108 +1976 + ‘0917 2.30 p.m. x 1°089 “0989 +:0380 + 0953 3.15 p.m. xl 1204 | *1519 + 3405 + ‘0843 10.0 a.m. XII 1°204 “1467 +°1415 +0935 11.0 a.m XIII 1:091 ‘1166 +°3241+°0854 12.0 midday 9th December XIV 1:0386 “1059 + 0566 + 0951 2.0 p.m. XV 1132, |) +1884 +°4814+°0733 3.15 p.m. XVI 1°170 *1500 +1036 + ‘0944 10.0 a.m. XVII | 1:°421 *1520 — 0834 + 0947 11.0 a.m. | XVIII 1°300 “1591 +2314 + °0903 12.0 midday 10th December X1X 1243 +1708 | +:2260+-0905 | 2.0 pm. | XX 1°170 1833 +°1659 + 0928 2.45 p.m. | Correlation between p; and o,, 7o, p,= +°176 +146 (calculated from correlation of ranks). we have for the regression lines # — 11833 = + 01018 (y — 105) .......... hee (xxxix) e — 1:1833 = + °002493 (z — 38:62)... . ssc cceenseceor an (xl). The coefficients of correlation are Te = +'638 + °089, Tey aly 562 ete 103, yz = ar 983 aE 005, Econ S. PEARSON 83 giving partial correlation coefficients Yue y = 510 £7102, rey 2=— 470 4 “118. These latter coefficients suggest that the secular change for observations spread over a number of days will be a lengthening in estimation, but that, if a number of series are done in rapid succession, the tendency will be for a shortening; in fact we should expect the sessional change to be in the opposite direction to the secular, as for the Bisections. 116-—= | 112-2 fe 2 ros.’ & ND) 1044 ~o A — so 10 30 : 30 Fig. 18. 10 Second Estimation. t, Order of Observation in Series. 
The values of $y_t$ have been plotted in Figure 18; the best fitting line has not been calculated, but it would certainly correspond very closely with the mean, $y = 1.1333$. There is in fact apparently no mean sessional change, though the drop in the last eight values of $y_t$ may be significant, and a mark of the tendency suggested by the negative value of $r_{xy \cdot z}$.

In Figure 19 the centres of the small circles represent the positions of the means of the 63 observations of each series; these points have been fitted with the cubic

$$x = 1.093971 + .022116\,(y - 10.5) + .001174\,(y - 10.5)^2 - .0002002\,(y - 10.5)^3 \qquad \text{(xli)},$$

which is the middle of the three curves. There is evidence of a slight secular change, the length of the estimation increasing towards the end of the experiment. If however it is remembered that the 20 series were carried out in 4 days, it will be seen that there is in general a decrease in estimation in the course of the 5 series done in any one day. It is this daily drop that the coefficient $r_{xy \cdot z}$ $(= -.470)$ is picking out.

Now in addition to the secular change in personal equation, the figures in Table XV suggest that there is also a secular change in standard deviation. The vertical lines on each side of the series-means in Figure 19 equal in length the corresponding standard deviations, or $\sigma_1$'s. These values of $\sigma_1$ have been fitted with the cubic

$$x' = .129006 + .001072\,(y - 10.5) + .000302\,(y - 10.5)^2 + .0000214\,(y - 10.5)^3 \qquad \text{(xlii)},$$

and the other two curves in the diagram have ordinates equal to $x + x'$ and $x - x'$, so that the distance between the central curve and either of the outer curves gives the smoothed value of the standard deviation at the point. The diagram provides a generalised representation of a secular change in personal equation and standard deviation.
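The two cubics can be checked against the other constants quoted for this experiment. The sketch below is an illustration only: the function and variable names are hypothetical, and the coefficients are those of (xli) and (xlii) as read from the text. It relies on a general property of least-squares polynomial fits with a constant and a linear term: the fitted ordinates have the same mean, and the same linear regression slope on the order of series, as the data to which they were fitted.

```python
# Coefficients of the cubics (xli) and (xlii), as read from the text,
# in powers of u = y - 10.5, where y is the order of the series (1..20).
MEAN_CUBIC = (1.093971, 0.022116, 0.001174, -0.0002002)
SD_CUBIC = (0.129006, 0.001072, 0.000302, 0.0000214)


def cubic(coeffs, y):
    """Evaluate a cubic in (y - 10.5)."""
    u = y - 10.5
    a0, a1, a2, a3 = coeffs
    return a0 + a1 * u + a2 * u**2 + a3 * u**3


def band(y):
    """Smoothed lower curve, central curve and upper curve at series y."""
    centre = cubic(MEAN_CUBIC, y)
    spread = cubic(SD_CUBIC, y)
    return centre - spread, centre, centre + spread


series = list(range(1, 21))
fitted = [cubic(MEAN_CUBIC, y) for y in series]

# Mean of the fitted ordinates: should agree with the general mean 1.1333.
mean_fitted = sum(fitted) / len(fitted)

# Linear regression slope of the fitted ordinates on y:
# should agree with the slope .01018 of regression line (xxxix).
u = [y - 10.5 for y in series]
slope = sum(ui * (fi - mean_fitted) for ui, fi in zip(u, fitted)) / sum(
    ui * ui for ui in u
)

print(f"mean of cubic (xli) over the 20 series: {mean_fitted:.4f}")
print(f"implied linear slope:                   {slope:.5f}")
print("band at series XVII:", tuple(round(v, 3) for v in band(17)))
```

Agreement of both quantities with the text to within rounding supports the reading of the signs and decimal places in (xli).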
The factor for a true 10 second interval would be $.98$ and was most nearly approached by the means of Series V, VII and IX, while in the case of XVII the mean estimation nearly reached the high value of 15 seconds.

[Fig. 19. Distribution of Personal Equation in Estimating Seconds. Ordinate: estimate of 1 second; abscissa: place of series in order.]

(b) The Combination of the Series.

In combining the twenty series, $D_k$, $S_k$ and $R_k$ were calculated from the thirteen correlation tables of the observations of the combined series. Using the correlations and standard deviations of the separate series, $R_1'$ is obtained from

$$R_1' = \frac{\sum_m (\rho_1 \sigma_1 \sigma_2)}{\sqrt{\sum_m (\sigma_1^2)\,\sum_m (\sigma_2^2)}} = +.19841 \pm .02049 \qquad \text{(x) bis},$$

and $S_1' = .14101$, $S_2' = .14056$. Then using this value of $R_1'$, and the first difference correlation equations (Problem 1, p. 41), $R_k'$ can be calculated for $k = 2, \dots, 12$. The values of these quantities are given in Table XVI below.

TABLE XVI. Constants of Combined Series (Estimating Seconds).

k     D_k       S_k               R_k               R_k'
1     1.1421    .1749 ± .0026     .4825 ± .0164     +.19841 ± .02049
2     1.1440    .1749 ± .0026     .4269 ± .0174     +.1123 ± .0211
3     1.1415    .1759 ± .0027     .3965 ± .0180     +.0652 ± .0212
4     1.1413    .1761 ± .0027     .3913 ± .0181     +.0570 ± .0212
5     1.1421    .1764 ± .0027     .3764 ± .0183     +.0338 ± .0213
6     1.1423    .1760 ± .0027     .3755 ± .0183     [illegible]
7     1.1416    .1754 ± .0026     .3983 ± .0180     [illegible]
8     1.1402    .1757 ± .0026     .4045 ± .0178     [illegible]
9     1.1395    .1763 ± .0027     .3488 ± .0187     [illegible]
10    1.1391    .1756 ± .0026     .3691 ± .0184     [illegible]
11    1.1382    .1760 ± .0027     .3524 ± .0187     [illegible]
12    1.1378    .1767 ± .0027     .3831 ± .0182     [illegible]

$S_1' = .14101 \pm .00213$, $S_2' = .14056 \pm .00212$, $S_\delta = S'\sqrt{2(1 - R_1')} = .1785$. [The remaining rows of the table are illegible.]

The fall in $R_k$ is small, and although there is considerable irregular variation in the later values, it appears that $R_k$ will not vanish as $k$ increases, but approach a constant value in the neighbourhood of $+.35$. This can be tested; we have from the equations (vi) to (x)

$$R_k' = \frac{R_k S_1 S_{k+1} - \frac{1}{m}\sum_m (D_1 - d_1)(D_{k+1} - d_{k+1})}{\sqrt{\left\{S_1^2 - \frac{1}{m}\sum_m (D_1 - d_1)^2\right\}\left\{S_{k+1}^2 - \frac{1}{m}\sum_m (D_{k+1} - d_{k+1})^2\right\}}} \qquad \text{(xliii)},$$

and as the sessional change for the series is very small, we may make the approximation

$$\sum_m (D_1 - d_1)(D_{k+1} - d_{k+1}) = \sum_m (D_1 - d_1)^2 = \sum_m (D_{k+1} - d_{k+1})^2$$

for all values of $k$, and in view of the constancy of $S_k$, $S_1' = S_{k+1}'$ for all values of $k$. Then on the assumption that there is no significant periodic variation in the observations, $R_k' \to 0$ as $k$ increases, and from (xliii)

$$R_k \to \frac{\frac{1}{m}\sum_m (D_1 - d_1)^2}{S_1'^2 + \frac{1}{m}\sum_m (D_1 - d_1)^2} = +.354.$$

The correlations $R_k'$ become rapidly insignificant; the values tabulated are of course subject to the errors of the method of approximation, but as in the case of the 10 second Counting Experiment, these should not be large owing to the constancy of $S_k$*. The points $(k, R_k)$ and $(k, R_k')$ are plotted in Figure 20; the two curves there drawn will be referred to in Section XI below.

* The difference between $S_1$ and $S_{13}$ is one of 1.4 % only.

(c) Comparison of Experiments C and D.
It has been found that in both the Counting and the Estimating Experiments there is evidence of a secular change in personal equation, and that in both cases the tendency is for the estimates to depart further from the true value of 10 seconds in the later series; in the Counting Seconds there is a decrease, in the Estimating Seconds an increase in length of estimate. There is also very little evidence of regular sessional change in either experiment.

[Fig. 20. Experiment D. 10 Second Estimating. Correlation-Interval Diagram. Ordinate: correlation of successive judgments; abscissa: interval between successive judgments.]

Beyond this the similarity ceases; it is only necessary to compare the values of the chief constants (defined on p. 36),

              S'       S_delta    R_1'
Counting      .0343    .0338      +.5200 ± .0156
Estimating    .1410    .1785      +.1984 ± .0205

The variations in judgment in the Estimating Experiment are very large compared with those in the other, and at the same time there is low correlation between successive judgments, so that the observations will be found to be scattered far more nearly in accordance with the Normal Error Law than in the three preceding experiments. In the case of the Counting Experiment, the skew distribution of the 1000 observations has already been referred to. But for one or two exceptions (as III and XIX) the individual series in the Counting conform more closely to a general type than in the Trisections or Bisections, and this results in the very smooth values of the constants $R_k$ and $R_k'$.

X. EXPERIMENT E. PLATE MEASUREMENTS WITH ZEISS COMPARATOR.

The values of $\rho_1$ only have been calculated; these, with $\sigma_1$ and a brief description of the nature of the marking measured, are given in Table XVII; $\sigma_1$ is in millimetres.
Series I to VIII involved settings of both slide and micrometer, IX of micrometer only. No great weight can be attached to the result of one series of 50 readings on a marking, but it is justifiable to draw certain conclusions from the results of the eight series. In the first place, there appears to be a significant correlation between the successive measures of the edge of a band (I and II), but in measuring the centres, i.e. in bisecting a bright maximum with the cross wire, there is on the whole no correlation. This perhaps might be expected; the edges of bands or maxima in photographic spectra are not quite sharply cut, so that some uncertainty must exist in the observer's mind as to where the real edge should be taken to be; his opinion on this point may vary throughout the course of the sitting, and consequently correlation will be found between the successive readings.

TABLE XVII.

Series   rho_1            sigma_1   Part     Description of marking
I        +.384 ± .081     .0016     Edge     Sharp edge of bright band
II       +.467 ± .075     .0015     Edge     Slightly vaguer edge than I
III      +.117 ± .094     .0007     Centre   Clear and narrow maximum
IV       +.090 ± .095     .0012     Centre   ditto
V        +.021 ± .095     .0016     Centre   ditto
VI       -.001 ± .095     .0019     Centre   Broad and obscure maximum
VII      -.050 ± .095     .0022     Centre   ditto
VIII     +.227 ± .091     .0041     Centre   ditto
IX       +.288 ± .087     .0004     ..       Micrometer screw settings only

On the other hand, in the bisection of a narrow maximum, there will be little doubt as to the position of the centre; the real estimate of the observer will vary but slightly, and the variations in the reading will be due mainly to failure in breaking off the push or pull of the slide at the right moment. It is possible that unconscious "over-pulls" or "under-pulls" may go in runs together, but the measures seem to show that this is not the case, and that the correlation of successive judgments is due rather to correlated changes of mental estimate than to those of a more physical character.
If it were more difficult to bisect a maximum, if there were greater opportunity for variation, it is probable that there would be a correlation of successive judgments, and this is perhaps illustrated by the case of Series VIII, which has the largest standard deviation (.0041) and also a correlation ($\rho_1 = +.227 \pm .091$) possibly significant. The result of IX suggests that there is a correlation between successive settings of the micrometer wires in the second eyepiece; this correlation would of course enter into the results of I to VIII, but the standard deviation of IX (.0004) is so small that the effect will be insignificant where the variations in slide settings are large.

As a matter of practical application these results serve to emphasise the importance of the routine of measurement usually adopted; if, for example, it is proposed to take four readings of each of a number of markings on a plate, the four readings should not be made in succession, but all the markings should be measured once, and then perhaps a short interval taken before the second measuring is made, and so on. This method should eliminate the error in the mean of several measurements of a marking, which may arise from a correlation of successive judgments, as well as errors due to change in temperature of instrument or plate, etc.

XI. ANALYSIS OF THE CORRELATION BETWEEN SUCCESSIVE JUDGMENTS.

(a) The Theory of correlated Estimates and accidental Errors.

It has been seen that in the case of the Bisection and Timing Experiments when the secular term was removed the coefficients of correlation of the successive judgments, or the constants $R_k'$, diminished to approximately zero values as $k$, the interval between the judgments correlated, was increased.
In the Trisection Experiment, owing to the marked sessional change which was repeated in practically all the series, $R_k'$ appeared to approach a value of $+.16$ and not zero as $k$ was increased; the sessional change in this case appeared to be of parabolic rather than linear form, and it seemed possible that if the ordinates of the "best" fitting parabola of each series were removed from the observations, the coefficients of correlation of the residuals, or the $R_k''$'s, would tend to zero as $k$ increased, as in the case of the other three experiments in which there was no large sessional change.

The points representing the values of $R_k'$ which have been plotted in Figures 8, 12, 17 and 20 appear on the whole to lie so nearly on a smooth curve, that it is of no little interest to inquire whether we can obtain equations to such curves based on some definite theory of the physiological factors underlying the variations in an observer's judgment.

In the first place we have seen that neither a secular change in personal equation (the variation in series means) nor a simple sessional change such as that represented by the straight line or by a second order parabola considered in the Trisection Experiment, will account for the whole of the correlation of successive judgments. We must therefore conclude that quite apart from the large scale variations in judgment which are due to the more gradual changes of state in the observer resulting, perhaps, from experience or fatigue, there is a definite relationship between the small scale variations in judgment; if judgment $y_t$ is greater than the average of the five or six preceding judgments, then we shall on the whole expect that $y_{t+1}$, the next judgment, will also be greater. I propose therefore to consider what results will follow from the assumption that $y_t$ has a correlation $r$ with $y_{t-1}$ and $y_{t+1}$, but that for $y_{t-1}$ or $y_{t+1}$ constant it has no partial correlation with $y_{t-2}$ and $y_{t+2}$ or judgments at greater intervals.
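The consequence of this assumption can be written out with the standard formula for a first-order partial correlation. The derivation below is a sketch in modern notation, not the author's own working; $\rho_k$ denotes the total correlation between judgments $k$ places apart, so that $\rho_1 = r$, and the series is taken to be stationary.

```latex
% Partial correlation of y_t and y_{t+2}, with y_{t+1} held constant:
\rho_{t,\,t+2 \,\cdot\, t+1}
  = \frac{\rho_2 - \rho_1 \rho_1}{\sqrt{(1-\rho_1^2)(1-\rho_1^2)}}
  = \frac{\rho_2 - r^2}{1 - r^2},
\qquad\text{which vanishes if and only if}\qquad \rho_2 = r^2 .

% In general, for y_t and y_{t+k} with y_{t+1} held constant:
\rho_{t,\,t+k \,\cdot\, t+1}
  = \frac{\rho_k - \rho_1 \rho_{k-1}}{\sqrt{(1-\rho_1^2)(1-\rho_{k-1}^2)}} = 0
\;\Longrightarrow\;
\rho_k = r\,\rho_{k-1}
\;\Longrightarrow\;
\rho_k = r^k \quad (k = 1, 2, 3, \dots).
```

The correlation therefore falls off geometrically with the interval, which is the form the fitted curves of this Section take once allowance is made for accidental errors.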
In other words we will suppose that the observer's estimation at any moment is only influenced by the preceding estimation, and only through this, and not directly, by the earlier estimations. Let us take the successive judgments $y_t, y_{t+1}, y_{t+2}, \dots, y_{t+k}, \dots$ and suppose that the total correlation between $y_t$ and $y_{t+k}$ is $\rho_k$, where $k = 1, 2, 3, \dots$, and $\rho_1 = r$. If there is no partial correlation between $y_t$ and $y_{t+2}$, $y_{t+1}$ being constant, we must have

$$\rho_2 - \rho_1^2 = 0 \quad \text{or} \quad \rho_2 = r^2.$$

In the same way if there is no partial correlation between $y_t$ and $y_{t+3}$ when $y_{t+1}$ (or $y_{t+2}$) is constant,

$$\rho_3 - \rho_1 \rho_2 = 0 \quad \text{or} \quad \rho_3 = r^3,$$

and in general we find that

$$\rho_k = r^k \qquad \text{(xliv)}.$$

In reaching this simple result there is a point however that has been overlooked; it has been assumed that there is some physiological or psychological significance in the correlation of an estimate of a quantity and the preceding estimate, but it must be remembered that the value which the observer records may not be exactly that which he wished to record, or in other words he may be unable to record his true estimate. Thus in bisecting a line it is likely that the pencil point will not strike the paper exactly at the spot intended, or in counting 10 seconds the tapping of the key may not be exactly synchronised with the beginning or end of the count, and there may be many other little external influences of which the observer is unaware, which will all combine to form what may be termed an accidental error superimposed upon the true correlated estimation. Let us examine how the relation (xliv) will be modified by introducing the idea of these accidental and uncorrelated errors; we must suppose that the observer's recorded judgment $y_t$ is made up of two parts, $\alpha_t$, his actual estimate at the moment of record, and $\beta_t$, some complex of accidental errors affecting his record.
Then

$$y_t = \alpha_t + \beta_t \qquad \text{(xlv)}.$$

Now if we assume that the accidental errors $\beta_t$ are as likely to be positive as negative, and that they will not be correlated in any manner among themselves nor with the fundamental part of the judgment $\alpha_t$, we shall have the following approximate relations

$$\frac{1}{N}\sum_{t=1}^{N} \beta_{t+k} = 0, \qquad \frac{1}{N}\sum_{t=1}^{N} \beta_{t+k}\,\beta_{t+k'} = 0, \qquad \frac{1}{N}\sum_{t=1}^{N} \beta_{t+k}\,\alpha_{t+k'} = 0 \qquad \text{(xlvi)},$$

where $N$ is large compared with $k$, and where $k$ and $k'$ take any of the values 1, 2, 3, ... etc. But the correlation between successive values of the $y$'s at intervals of $k$ is

$$\rho_k = \frac{\frac{1}{N}\sum_{t=1}^{N} (\alpha_t + \beta_t)(\alpha_{t+k} + \beta_{t+k}) - \frac{1}{N}\sum_{t=1}^{N} (\alpha_t + \beta_t)\cdot\frac{1}{N}\sum_{t=1}^{N} (\alpha_{t+k} + \beta_{t+k})}{\sqrt{\overline{y_0^2}}\,\sqrt{\overline{y_k^2}}} = \frac{[\alpha_t \alpha_{t+k}]}{\sqrt{\left(\overline{\alpha_0^2} + \overline{\beta_0^2}\right)\left(\overline{\alpha_k^2} + \overline{\beta_k^2}\right)}}$$

in view of the relations (xlvi), where $[\alpha_t \alpha_{t+k}]$ is the first order product moment coefficient referred to mean of the successive $\alpha$'s at intervals of $k$, and

$\sqrt{\overline{\alpha_k^2}}$ is the standard deviation of $\alpha_k, \alpha_{k+1}, \dots, \alpha_{k+N}$,
$\sqrt{\overline{\beta_k^2}}$ is the standard deviation of $\beta_k, \beta_{k+1}, \dots, \beta_{k+N}$,
$\sqrt{\overline{y_k^2}}$ is the standard deviation of $y_k, y_{k+1}, \dots, y_{k+N}$.

Now unless there is a steady sessional change in the $\alpha$'s, we may assume that for large values of $N$

$$\overline{\alpha_0^2} = \overline{\alpha_1^2} = \dots = \overline{\alpha^2}, \text{ say},$$

and similarly unless the accidental errors are steadily increasing or decreasing in magnitude

$$\overline{\beta_0^2} = \overline{\beta_1^2} = \dots = \overline{\beta^2},$$

and we have

$$\rho_k = \frac{[\alpha_t \alpha_{t+k}]}{\overline{\alpha^2} + \overline{\beta^2}} = r_{\alpha_t \alpha_{t+k}}\,\frac{\overline{\alpha^2}}{\overline{\alpha^2} + \overline{\beta^2}}.$$

But on the assumption made above of zero partial correlation between two estimates which are not consecutive, we have found that $r_{\alpha_t \alpha_{t+k}}$, the correlation between the observer's real estimates at intervals of $k$, can be expressed in the form $r^k$, and therefore

$$\rho_k = q\,r^k \qquad \text{(xlvii)},$$

where $q = \overline{\alpha^2} / \left(\overline{\alpha^2} + \overline{\beta^2}\right)$ is a constant not depending on the interval $k$.
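The relation $\rho_k = q\,r^k$ can be illustrated by simulation. The following is a minimal sketch under assumptions that are not in the text: the true estimates are generated as a Gaussian first-order autoregressive series, which is one process satisfying the zero-partial-correlation assumption, and the accidental errors are independent Gaussian deviates. The function names are hypothetical. The constants are those given for the Ten-second Counting Experiment in Table XIX, used here only as plausible magnitudes.

```python
import math
import random


def simulate_judgments(n, r, sd_alpha, sd_beta, seed=0):
    """Recorded judgments y_t = alpha_t + beta_t (assumed Gaussian model).

    alpha_t: true estimates, correlation r between neighbours, s.d. sd_alpha.
    beta_t:  accidental errors, mutually independent, s.d. sd_beta.
    """
    rng = random.Random(seed)
    innovation_sd = sd_alpha * math.sqrt(1.0 - r * r)  # keeps s.d. of alpha fixed
    alpha = rng.gauss(0.0, sd_alpha)
    ys = []
    for _ in range(n):
        ys.append(alpha + rng.gauss(0.0, sd_beta))
        alpha = r * alpha + rng.gauss(0.0, innovation_sd)
    return ys


def lag_correlation(ys, k):
    """Product-moment correlation between y_t and y_{t+k}."""
    n = len(ys) - k
    a, b = ys[:n], ys[k:]
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n
    var_a = sum((x - mean_a) ** 2 for x in a) / n
    var_b = sum((y - mean_b) ** 2 for y in b) / n
    return cov / math.sqrt(var_a * var_b)


R, SD_ALPHA, SD_BETA = 0.79, 0.028, 0.020
Q = SD_ALPHA**2 / (SD_ALPHA**2 + SD_BETA**2)  # q as defined under (xlvii)

ys = simulate_judgments(200_000, R, SD_ALPHA, SD_BETA)
observed = {k: lag_correlation(ys, k) for k in range(1, 6)}
predicted = {k: Q * R**k for k in range(1, 6)}

for k in range(1, 6):
    print(f"k={k}: simulated {observed[k]:+.3f}, q r^k = {predicted[k]:+.3f}")
```

With a long enough simulated series the two columns agree closely, while the first correlation stays well below $r$ itself, which is the effect of the accidental errors.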
With this expression for the correlation we shall of course find an apparent partial correlation between the judgments at intervals greater than one; for example the partial correlation between $y_t$ and $y_{t+2}$, $y_{t+1}$ being constant, is

$$\frac{q\,r^2\,(1 - q)}{1 - q^2 r^2},$$

and does not vanish unless $q = 1$. According to the theory suggested this is however a spurious correlation due solely to the presence of the accidental errors.

The next problem is to inquire how far a relation of the type of (xlvii) will fit the correlation coefficients which have been calculated for the Experiments A, B, C and D. In the first place, in order to get as smooth values for the coefficients as possible we must combine the 20 series, which we may do if we remove the secular change as represented by the variation in the series means; this step is clearly necessary for we are considering the relationship between judgments made in close proximity and are not concerned for the moment with the variation in personal equation from day to day. We must therefore deal with the coefficients of correlation $R_k'$ and endeavour to fit a curve $z = q\,r^x$ through the points $x = k$, $z = R_k'$. I will consider the different experiments in turn.

(b) Application of Theory to Results of Experiments.

Experiment A.

The curve represented by $z = q\,r^x$ is asymptotic to the $x$ axis (as $r < 1$), so that if it is to fit the points $(k, R_k')$ it is necessary that $R_k'$ should tend to zero as $k$ increases. But the values of $R_k'$ given in Table V, p. 58, appear to tend as $k$ increases to a limiting value between $+.16$ and $+.15$ rather than to zero.
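A non-zero limit of this kind calls for a curve with an added constant, $R_k' = p + q\,r^k$, which the text goes on to fit by least squares. The sketch below shows one way such a fit can be carried out, by repeated linearised corrections with a step-halving safeguard. It is an illustration only and does not reproduce the author's single hand-computed correction. The function names and starting values are hypothetical, and the data are the thirteen directly calculated values of $R_k'$ as read from the second column of Table XVIII.

```python
# Directly calculated R_k' for k = 1..13, as read from column 2 of Table XVIII.
# The k = 6 entry is inferred from the printed difference and fitted value.
R_PRIME = [0.625, 0.523, 0.388, 0.315, 0.281, 0.232, 0.222,
           0.191, 0.165, 0.183, 0.168, 0.172, 0.160]


def model(params, k):
    """The three-constant curve p + q * r**k."""
    p, q, r = params
    return p + q * r**k


def sse(params):
    """Sum of squared differences between observed and fitted values."""
    return sum((obs - model(params, k)) ** 2
               for k, obs in enumerate(R_PRIME, start=1))


def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, 3):
            factor = m[row][col] / m[col][col]
            for j in range(col, 4):
                m[row][j] -= factor * m[col][j]
    x = [0.0, 0.0, 0.0]
    for row in (2, 1, 0):
        tail = sum(m[row][j] * x[j] for j in range(row + 1, 3))
        x[row] = (m[row][3] - tail) / m[row][row]
    return x


def correction(params):
    """Least-squares corrections (dp, dq, dr) from the linearised curve."""
    p, q, r = params
    normal = [[0.0] * 3 for _ in range(3)]
    rhs = [0.0] * 3
    for k, obs in enumerate(R_PRIME, start=1):
        grad = [1.0, r**k, k * q * r ** (k - 1)]  # d/dp, d/dq, d/dr
        resid = obs - model(params, k)
        for i in range(3):
            rhs[i] += grad[i] * resid
            for j in range(3):
                normal[i][j] += grad[i] * grad[j]
    return solve3(normal, rhs)


def fit(start, iterations=50):
    """Apply corrections repeatedly, halving any step that worsens the fit."""
    params = list(start)
    for _ in range(iterations):
        step = correction(params)
        scale = 1.0
        while scale > 1e-6:
            trial = [v + scale * d for v, d in zip(params, step)]
            if 0.0 < trial[2] < 1.0 and sse(trial) <= sse(params):
                params = trial
                break
            scale /= 2.0
        else:
            break  # no acceptable step: treat as converged
    return params


START = (0.16, 0.70, 0.70)  # hypothetical rough trial values
p_fit, q_fit, r_fit = fit(START)
print(f"p = {p_fit:.4f}, q = {q_fit:.4f}, r = {r_fit:.4f}, SSE = {sse((p_fit, q_fit, r_fit)):.6f}")
```

Because the tabled correlations are rounded to three places, the fitted constants need not match the published ones to the last figure.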
I think that this results from the marked sessional changes which have been represented in mean form by a second order parabola (see Equation (xxx) and Figure 4), and that if there is a physiological significance in the distinction between the sessional change and the residual variations of the observations when freed from this change, it will be of interest to find out how the coefficients of correlation of these successive residuals, what have been termed the $R_k''$'s, fall off as the interval or $k$ is increased. Should it be found that the $R_k''$'s follow the law

$$R_k'' = q'\,r^k,$$

the argument in favour of distinguishing the sessional change from the residual variations will be strengthened.

It was found that the values of $R_k'$ given in Table V could be fitted closely by a curve of the form

$$R_k' = p + q\,r^k \qquad \text{(xlviii)},$$

where $p$, $q$ and $r$ are constants. A rough trial gave approximate values $p_0$, $q_0$, $r_0$ [values illegible]. Now if

$$z = f(p, q, r) = f(p_0, q_0, r_0) + \delta p\,\frac{\partial f}{\partial p_0} + \delta q\,\frac{\partial f}{\partial q_0} + \delta r\,\frac{\partial f}{\partial r_0}$$

to first order,

$$= p_0 + q_0 r_0^k + \delta p + r_0^k\,\delta q + k\,q_0\,r_0^{k-1}\,\delta r,$$

we have as equations of condition for a least square solution

$$\delta p + r_0^k\,\delta q + k\,q_0\,r_0^{k-1}\,\delta r = R_k' - p_0 - q_0 r_0^k, \qquad \text{for } k = 1, 2, \dots, 13.$$

Using the values of $p_0$, $q_0$ and $r_0$ given above, the corrections $\delta p$, $\delta q$ and $\delta r$ were calculated and gave finally as the best fitting numerical equation,

$$R_k' = .1524 + .6817\,(.7105)^k \qquad \text{(xlix)}.$$

TABLE XVIII. Values of the $R_k$'s for Trisection Experiments.

Columns: (1) $k$; (2) $R_k'$ by direct calculation; (3) $R_k'$ from equation (xlix); (4) difference, col. 2 minus col. 3; (5) probable error of $R_k'$; (6) $S_k''$ and (7) $R_k''$, values obtained from (lii) on assumption of constancy of $G_k$; (8) $R_k''$ from equation (lvi).

(1)   (2)      (3)      (4)      (5)      (6)      (7)      (8)
0     ..       +.834    ..       ..       ..       ..       +.804
1     +.625    .637     -.012    ±.013    .0778    +.550    .571
2     .523     .497     +.026    ±.016    .0776    .431     .406
3     .388     .397     -.009    ±.018    .0778    .268     .288
4     .315     .326     -.011    ±.019    .0781    .183     .205
5     .281     .276     +.005    ±.020    .0778    .142     .146
6     .232     .240     -.008    ±.020    .0782    .084     .103
7     .222     .215     +.007    ±.020    .0782    .071     .074
8     .191     .197     -.006    ±.021    .0783    .035     .052
9     .165     .184     -.019    ±.021    .0787    .006     .037
10    .183     .175     +.008    ±.021    .0802    .031     .026
11    .168     .168     .000     ±.021    .0823    .017     .019
12    .172     .164     +.008    ±.021    .0834    .023     .013
13    +.160    +.160    .000     ±.021    .0840    +.009    +.009
14    ..       ..       ..       ..       .0840    ..       ..

In the second column of Table XVIII are given the values of $R_k'$ taken from Table V and in the fifth column their probable errors; the values of $R_k'$ given by equation (xlix) are in the third column, and in the fourth are the differences col. 2 minus col. 3. It will be seen that the fit is a good one, the difference being only greater than the probable error in the case of $R_2'$. The points $(k, R_k')$ and the curve of (xlix) are shown in Figure 8 (p. 64).

The problem before us is therefore this; can we explain the constant $p$ in equation (xlviii) in terms of the sessional changes? We have seen that the mean sessional change for the 20 series can be represented by a parabola of the second order, but we must allow for a different change in each series. Let us suppose that $y = f_p(t)$ will represent the sessional change in the $p$th Series after the secular term represented by the series mean has been removed, so that instead of equation (xlv) of p. 89, we have

$$y_t' = f_p(t) + \alpha_t + \beta_t = f_p(t) + Y_t, \qquad \text{where } Y_t = \alpha_t + \beta_t.$$

Then if $\sum_m$ indicates summation for the $m$ (or 20) series, $n = 50$, the number of observations in each group of a series, and $k$ takes any of the group numbers 1, 2, ... 14, since $y = f_p(t)$ will be the "best" fitting curve of its type

$$\sum_{t=1}^{n} Y_{t+k-1} = 0$$

approximately, and on combining the $m$ series

$$\sum_m \sum_{t=1}^{n} Y_{t+k-1} = 0.$$

Again we have no reason to suppose that there will be any correlation between the sessional term $f_p(t)$ and the residual $Y_t$, so that

$$\sum_m \sum_{t=1}^{n} f_p(t+k-1)\,Y_{t+k'-1} = 0$$

for all values of $k$ and $k'$ between 1 and 14. As $y_t'$ is freed from the secular term, using the relations above we have that

$$R_k' = \frac{\frac{1}{mn}\sum_m\sum_{t=1}^{n}\{f_p(t)+Y_t\}\{f_p(t+k)+Y_{t+k}\} - \frac{1}{mn}\sum_m\sum_{t=1}^{n} f_p(t)\cdot\frac{1}{mn}\sum_m\sum_{t=1}^{n} f_p(t+k)}{\sqrt{\left[\frac{1}{mn}\sum_m\sum_{t=1}^{n}\{f_p(t)+Y_t\}^2 - \left\{\frac{1}{mn}\sum_m\sum_{t=1}^{n} f_p(t)\right\}^2\right]\left[\frac{1}{mn}\sum_m\sum_{t=1}^{n}\{f_p(t+k)+Y_{t+k}\}^2 - \left\{\frac{1}{mn}\sum_m\sum_{t=1}^{n} f_p(t+k)\right\}^2\right]}}$$

$$= \frac{R_k''\,S_1''\,S_{k+1}'' + F_k}{\sqrt{\left(S_1''^2 + G_1^2\right)\left(S_{k+1}''^2 + G_{k+1}^2\right)}} \qquad \text{(li)},$$

where $R_k''$ is the coefficient of correlation between $Y_t$ and $Y_{t+k}$, $S_1''$ and $S_{k+1}''$ are the standard deviations of the $Y$'s of Groups 1 and $k+1$ (see (xi) and (xii) on page 35), and

$$F_k = \frac{1}{mn}\sum_m\sum_{t=1}^{n} f_p(t)\,f_p(t+k) - \left\{\frac{1}{mn}\sum_m\sum_{t=1}^{n} f_p(t)\right\}\left\{\frac{1}{mn}\sum_m\sum_{t=1}^{n} f_p(t+k)\right\},$$

$$G_k^2 = \frac{1}{mn}\sum_m\sum_{t=1}^{n} \{f_p(t+k-1)\}^2 - \left\{\frac{1}{mn}\sum_m\sum_{t=1}^{n} f_p(t+k-1)\right\}^2.$$

It will be seen that $G_k$ is the standard deviation of the ordinates of the curves representing the sessional changes, $y = f_p(t)$, which correspond to the observations in the $k$th groups, while $F_k / (G_1 G_{k+1})$ is the correlation of these successive ordinates at intervals of $k$. If the sessional changes were linear this correlation would be unity, and a little consideration will show that if the sessional change in each series can be represented by a curve of gradual bend, the correlation will not be far from this value. For example in the case of the parabola (Equation (xxx), p. 47) which was fitted to the mean sessional change and is drawn in Figure 4, it is found that this correlation is $+.994$. We shall therefore make no great error in assuming that $F_k = G_1 G_{k+1}$, and it follows that the relation for $R_k'$ can be expressed in the form

$$R_k' = \frac{1}{\sqrt{\left(1 + \dfrac{S_1''^2}{G_1^2}\right)\left(1 + \dfrac{S_{k+1}''^2}{G_{k+1}^2}\right)}} + \frac{R_k''}{\sqrt{\left(1 + \dfrac{G_1^2}{S_1''^2}\right)\left(1 + \dfrac{G_{k+1}^2}{S_{k+1}''^2}\right)}} = p_k + \lambda_k R_k'' \qquad \text{(lii)},$$

which must be compared with the relation

$$R_k' = p + q\,r^k \qquad \text{(xlviii) bis},$$

where $p = .1524$, $q = .6817$, $r = .7105$, that has been found empirically to fit the actual values of $R_k'$. If the expressions $p_k$ and $\lambda_k$ were constant for $k = 1, 2, \dots, 14$ an interpretation of (lii) would be at once suggested. Namely that $R_k''$, the coefficient of correlation of the successive residuals $Y_t$ and $Y_{t+k}$ left after the removal of the secular and sessional changes, is expressible in the form

$$R_k'' = q'\,r^k \qquad \text{(liii)},$$

that is to say, making allowance for the presence of accidental errors, the law of relationship between the successive estimates suggested on p. 90 above holds good.

Now without finding the curve which represents the sessional change in each series we do not know the values of $S_k''$ and $G_k$. We have however that

$$S_k''^2 + G_k^2 = S_k'^2 \qquad \text{(liv)},$$

where $S_k'$ is the standard deviation of the observations in the $k$th groups after the removal of the secular term. The values of $S_k'$ are given in Table V, p. 58; they are seen to increase as $k$ increases and therefore $p_k$ and $\lambda_k$ can only be constant for all values of $k$ if

$$\frac{S_1''^2}{G_1^2} = \frac{S_2''^2}{G_2^2} = \dots = \frac{S_{14}''^2}{G_{14}^2} \qquad \text{(lv)}.$$

That the relations (lv) should hold approximately is not at all improbable; for with a sessional change of the parabolic form of the curve (xxx) illustrated in Figure 4, the standard deviations of the ordinates in the later groups will increase owing to the increasing drop of the curve towards the end of the series, while $S_k''$ may increase with $k$ owing to greater variation towards the end of a session. In fact for this particular mean series with its sessional curve represented by (xxx) it is found that $G_1 = .0336$ ins., $G_{14} = .0406$ ins., while $S_1'' = .0165$ ins., $S_{14}'' = .0201$ ins., that is to say, the variations superimposed upon the main sessional change (the distances of the points plotted in Figure 4 from the parabola) become greater towards the end of the series when the observer's judgment perhaps became more erratic as he grew tired. These values give $S_1''/G_1 = .49$, $S_{14}''/G_{14} = .50$, suggesting that the relations (lv) do hold very closely. What we find therefore in this typical mean series represented by Figure 4 may well be expected to hold approximately in the individual series.

If then $p_k$ is constant for $k = 1, 2, \dots, 14$ and equals $p$, we find readily from equations (lii) and (lv) that $\lambda_k = 1 - p_k = 1 - p$, and hence

$$(1 - p)\,R_k'' = q\,r^k \quad \text{or} \quad R_k'' = \frac{q}{1 - p}\,r^k.$$

Making use of the numerical values $p = .1524$, $q = .6817$, $r = .7105$ we obtain finally

$$R_k'' = .8043\,(.7105)^k \qquad \text{(lvi)},$$

as the theoretical expression for the correlation of the successive residuals after the observations have been freed from secular and sessional change. This curve is the lower of the two curves drawn in Figure 8. The points which are there plotted about this curve are the points $(k, R_k'')$* obtained from equation (lii)

(a) On the assumption that $G_1^2 = G_2^2 = \dots = G_{14}^2 = \text{constant}$,
(b) Taking $1 \Big/ \sqrt{\left(1 + \dfrac{S_1''^2}{G_1^2}\right)\left(1 + \dfrac{S_2''^2}{G_2^2}\right)} = p_1 = .1524$,
(c) Making use of equations (liv) and the tabled values of $S_k'$.

The close fit of the curve to these points shows that the manner in which the values of $R_k''$ fall off as $k$ increases is not much affected by the different assumptions regarding the relations of the $S_k''$'s and the $G_k$'s made in the two cases†.

* In Figure 8 these points have been indicated by $R_k'''$ to distinguish them from the correlation coefficients of residuals after removal of linear sessional change, there denoted by $R_k''$.

† The values of $R_k''$ calculated from equation (lvi) and of $R_k''$ and $S_k''$ calculated on the assumption of the constancy of $G_k$ are given in the 8th, 7th and 6th columns of Table XVIII.

Experiment B.

Reference has been made on p. 71 to evidence for a slight periodicity in the observations of this Experiment, which gives rise to small but apparently significant negative values to $R_k'$, for $k > 7$.
Further investigation might enable a correction for this periodicity to be made, but at present it is not possible to express $R_k'$ with exactness in the form

$$R_k' = q\,r^k.$$

For the purpose of comparison with the other experiments we can however obtain values of $q$ and $r$ which will give a rough fit for the first few values of $R_k'$. Thus if we take $q = .47$, $r = .72$‡ we get the values

$$R_1' = .34, \quad R_2' = .24, \quad R_3' = .18, \quad R_4' = .13,$$

which agree roughly with the actual values given in Table XI, namely

$$R_1' = .3?2 \text{ (middle figure illegible)}, \quad R_2' = .231, \quad R_3' = .183, \quad R_4' = .085.$$

‡ $r = .72$ is the value of the mean of the ratios $R_2'/R_1'$, $R_3'/R_2'$ and $R_4'/R_3'$, and using this value for $r$, $q$ was taken as $.47$ by rough trial.

Experiment C.

At the end of the section dealing with the reduction of the observations for this experiment the conclusion reached was that $R_{12}'$ and $R_{13}'$ were not significantly negative; no difficulty therefore arises in fitting a curve of the form $y = q\,r^x$ to the values of $R_k'$ given in the 6th row of Table XIV, p. 80. This was effected by the method of least squares, with the result

$$R_k' = .6673 \times r^k, \quad \text{with } r = .79 \text{ to two places} \qquad \text{(lvii)}.$$

In the 7th row of Table XIV are given the values of $R_k'$ calculated from this equation, and in the 8th row the differences ($R_k'$ from observations) minus ($R_k'$ from curve). If these differences are compared with the probable errors of $R_k'$, it will be seen that the fit is very satisfactory, for the later calculated values of $R_k'$ are in any case uncertain; $R_{12}'$ and $R_{13}'$ were indeed not used in the least square solution as they were known to have too high negative values.

Experiment D.

On p. 85 it was suggested that $R_k$ would approach the value $+.354$ as $k$ increased. In this case a curve of the form

$$R_k = .354 + q\,r^k$$

was fitted to the calculated values of $R_k$. The fitting was carried out by moments. Making $R_k - .354 = z$, we have

$$S(z) = q\,r\,\frac{1 - r^s}{1 - r} = N, \text{ say} \qquad \text{(lix)},$$

where $s$ is the number of ordinates, or 12,

$$S(kz) = q\,(r + 2r^2 + 3r^3 + \dots + s\,r^s) = N \times \mu_1',$$

whence

$$\mu_1' = \frac{1}{1 - r} - \frac{s\,r^s}{1 - r^s} \qquad \text{(lviii)},$$

and is the distance of the mean from an origin at unit distance from the first ordinate $q\,r$. The constants $\mu_1'$ and $N$ are known; solving (lviii) by approximation we have $r$, and then (lix) gives $q$. The values are

$$q = .1153, \qquad r = .8121,$$

and finally,

$$R_k = .354 + .1153\,(.8121)^k \qquad \text{(lx)}.$$

Then using the approximate relation

$$R_k' = (R_k - .354) \times \frac{S_1'^2 + \frac{1}{m}\sum_m (D_1 - d_1)^2}{S_1'^2},$$

which is a modified form of Equation (xliii), we obtain for $R_k'$ the equation

$$R_k' = .1785\,(.8121)^k \qquad \text{(lxi)}.$$

Both of the curves, represented by equations (lx) and (lxi), have been drawn in Figure 20, and show a satisfactory fit, if the roughness of the data is taken into account.

The results of the Trisection and of the Ten-second Counting Experiments, and as far as the rough form of the data will allow, of the Ten-second Estimating Experiment, suggest therefore that there is some foundation for the theory of relationship between successive estimates put forward at the beginning of the present Section. To reach the expression $q\,r^k$ for the correlation of successive judgments at intervals of $k$, it has been necessary in all cases to remove the secular change, and in one case a sessional change as well, but if these changes correspond in themselves to some definite mental or physical processes which can be separated in some degree from the causes underlying the residual variations, then we are justified in inquiring into the significance of the constants $q$ and $r$. It has been suggested that

$$q = \frac{\overline{\alpha^2}}{\overline{\alpha^2} + \overline{\beta^2}} \qquad \text{(lxii)},$$

so that $q$ is dependent on the ratio between the correlated and the uncorrelated parts of the observer's judgment, that is between what I have considered as the true estimate and the accidental errors superimposed in the process of record. Now using (lxii) and the relation*

$$\overline{\alpha^2} + \overline{\beta^2} = S'^2 \qquad \text{(lxiii)},$$

(or $S''$ for the Trisections where it has been necessary to allow for a sessional change), we find that

$$\sqrt{\overline{\alpha^2}} = \sqrt{q}\,S', \qquad \sqrt{\overline{\beta^2}} = \sqrt{1 - q}\,S' \qquad \text{(lxiv)},$$

and the values calculated in this way for $\sqrt{\overline{\alpha^2}}$ and $\sqrt{\overline{\beta^2}}$ are given in Table XIX.

TABLE XIX.

Experiment                       q      S'                        sqrt(alpha^2)   sqrt(beta^2)   r
Trisection                       .80    .080 (= S'') in inches    .071            .036           .71
Bisection (approximate only)     .47    .045 in inches            .031            .033           .72
Ten-second Counting              .67    .034 in factor            .028            .020           .79
Ten-second Estimating            .18    .141 in factor            .060            .128           .81

* It will be seen that owing to a sessional change in standard deviation, $S_k''$ for the Trisections (Table XVIII) and $S_k'$ for the Bisections (Table XI) increase with $k$. To obtain an approximate value for the standard deviation of the whole 1200 observations as opposed to that for the 1000 observations of any particular Group $k$, I have used in equations (lxiii) and (lxiv) $S'$ (or $S''$) given by $S'^2 = \frac{1}{14}\left(S_1'^2 + S_2'^2 + S_3'^2 + \dots + S_{14}'^2\right)$.

If the Trisection and Bisection results are compared it will be seen that the standard deviations of the accidental errors ($\sqrt{\overline{\beta^2}}$) are nearly the same but that there is a large difference between the measures of the variations of the true estimates ($\sqrt{\overline{\alpha^2}}$).
This is a result which we should anticipate, for the method of recording the estimate was the same in each experiment, and accidental errors of the same magnitude would occur in both cases; on the other hand the observer was faced with a more difficult problem in estimating a third than in estimating a half, and this is shown by the greater variability of his estimate in the former case (.07 against .03)*. For the Timing experiments, we find no correspondence between the $\sqrt{\beta^2}$'s; the great difference between the counting of ten seconds and the attempted concentration of mind on the passing of an unbroken ten second interval has been emphasized in the description of the experiments above, and a correspondence was hardly to be expected. The standard deviations are in terms of the factors $e/p$ and must be multiplied by 10.2 if required in seconds.

If now we turn to the values of $r$ given in the last column of Table XIX, it will be seen that they lie near together, and although that for the Bisections is not an exact measure, there is a suggestion of close agreement between the $r$'s in the pairs of similar experiments, for we have estimations of length with .71 and .72, and estimations of time with .79 and .81. This coefficient is a measure of the rate at which the correlation of successive judgments falls off or the influence of previous estimates vanishes from the observer's mind: on the theory of zero partial correlation it is simply the coefficient of correlation between a true estimate freed from accidental errors and the preceding estimate. On any theory $r$ would seem to be a fundamental constant not varying greatly for different types of observations, but perhaps varying considerably for different observers.
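As a numerical check on equations (lxii) to (lxiv), the following Python sketch splits each observed standard deviation $S'$ of Table XIX into a correlated part $\sqrt{\alpha^2} = \sqrt{q}\,S'$ and an accidental part $\sqrt{\beta^2} = \sqrt{1-q}\,S'$. The function and variable names are illustrative assumptions and do not come from the paper; the values of $q$ and $S'$ are those read from Table XIX, the $q$ for the Counting experiment being taken as the rounded constant of equation (lvii).

```python
import math


def split_standard_deviation(q, s):
    """Return (sqrt(alpha^2), sqrt(beta^2)) given the fraction q and the SD s.

    Uses alpha^2 = q * s^2 and beta^2 = (1 - q) * s^2, i.e. equations
    (lxii) to (lxiv) with s standing for S' (or S'').
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie between 0 and 1")
    return math.sqrt(q) * s, math.sqrt(1.0 - q) * s


# (q, S') pairs as read from Table XIX; the dictionary layout is hypothetical.
TABLE_XIX = {
    "Trisection": (0.80, 0.080),
    "Bisection": (0.47, 0.045),
    "Ten-second Counting": (0.67, 0.034),
    "Ten-second Estimating": (0.18, 0.141),
}


def table_rows():
    """Compute the unrounded pair (sqrt(alpha^2), sqrt(beta^2)) per experiment."""
    return {
        name: split_standard_deviation(q, s)
        for name, (q, s) in TABLE_XIX.items()
    }


if __name__ == "__main__":
    for name, (alpha, beta) in table_rows().items():
        print(f"{name:22s} sqrt(alpha^2) = {alpha:.4f}  sqrt(beta^2) = {beta:.4f}")
```

Under these assumptions the computed values agree with the last two numerical columns of Table XIX to within about .001, the remaining differences being attributable to the rounding of $q$ and $S'$.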
The fact that $r$ is so nearly the same for experiments with a five second interval between observations (Trisection and Bisection) and for others with an interval of ten seconds or more (Counting and Estimating) shows that the correlation of successive judgments is a function not only of the time interval between two judgments but also of the number of intervening judgments. For if it were purely a function of the time interval we should expect to find a greater difference between the values of $r$ found for experiments with a five second interval and a ten second interval. Indeed if the experiments were exactly the same but for difference in interval, $R_1'$ for that with ten seconds would equal $R_2'$ for that with five seconds. Further experiments of the same type in which the interval between the recording of judgments was varied would undoubtedly throw much light on this point.

* This may be compared with the ratio of 3 to 2 given on p. 73 from a comparison of the $S_k$'s before making any allowance for the accidental errors.

XII. PREDICTION.

If the values of "m" successive judgments are known and there is no correlation between them, the "most probable" value of the (m + 1)th judgment, that is the most reasonable guess at its value that can be made, is the mean of the "m" judgments. If however the successive judgments are correlated, then it is possible to predict the value of the (m + 1)th with much greater expectation of accuracy.

In the Experiments B, C and D it has been found that the correlation between judgments at intervals of $k$, made in the same session, can be expressed approximately in the form

$$R_k' = q r^{k} \qquad \text{(lxv)},$$

while for Experiment A, owing to the large sessional change, the expression was

$$[\text{illegible}] \qquad \text{(lxvi)}.$$

The decrease of correlation in geometrical progression expressed by (lxv) follows precisely the law of ancestral heredity, for which the multiple regression equations required for prediction have already been worked out*. It is not therefore proposed to go further into the problem in the present Paper, nor to inquire whether the general multiple regression equations would reduce to as simple a form when the correlation is expressed by equation (lxvi) rather than (lxv).

XIII. SUMMARY AND CONCLUSIONS.

The secular change in personal equation is shown by the variation in the series' means, but it is only in Experiment A and perhaps Experiment C, where the general trend of the variations is markedly in one direction, that we find that type of change which is usually understood when a secular change is referred to. In the Bisection Experiment B the linear secular change is very small and its existence might well not be recognized, and yet the series' means are subject to fluctuations far exceeding those of random sampling. For the probable error of the mean of a series (or of the observations in Group 1) is

$$\pm\,.67449 \times \frac{S_1'}{\sqrt{50}} = \pm\,.00416,$$

but if we take the distribution consisting of the 20 series means, $d_s$, we find that the standard deviation is .037375, giving for the probable error of a mean $d_s$ $\pm\,.02521$, which is more than six times as large as the probable error we have calculated by considering the variations within a series. It is therefore clear that the 50 observations in a series are not random samples of the whole "universe" of observations, as they should be on the Gaussian hypothesis of normal errors.
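The comparison of the two probable errors can be reproduced in a few lines of Python. The sketch below assumes the usual relation for normally distributed errors, probable error = .67449 × standard deviation, and uses only the two figures quoted above (± .00416 and the standard deviation .037375 of the series means). The function names are illustrative, and the "implied" within-series standard deviation is back-calculated from the quoted ± .00416 and is not a figure taken from the text.

```python
import math

# Ratio of probable error to standard deviation under the normal law.
PROBABLE_ERROR_FACTOR = 0.67449


def probable_error(sd):
    """Probable error of a single value whose standard deviation is sd."""
    return PROBABLE_ERROR_FACTOR * sd


def probable_error_of_mean(sd, n):
    """Probable error of the mean of n independent values with SD sd."""
    return PROBABLE_ERROR_FACTOR * sd / math.sqrt(n)


def implied_sd(pe_of_mean, n):
    """Invert probable_error_of_mean: the SD that would give pe_of_mean."""
    return pe_of_mean * math.sqrt(n) / PROBABLE_ERROR_FACTOR


def compare(pe_within=0.00416, sd_series_means=0.037375):
    """Return (probable error from the series means, its ratio to pe_within)."""
    pe_between = probable_error(sd_series_means)
    return pe_between, pe_between / pe_within


if __name__ == "__main__":
    pe_between, ratio = compare()
    print(f"probable error from the 20 series means: {pe_between:.5f}")
    print(f"ratio to the within-series value:        {ratio:.2f}")
    print(f"implied within-series SD (assumption):   {implied_sd(0.00416, 50):.4f}")
```

The ratio comes out a little above six, in agreement with the statement that the observed fluctuation of the series means is more than six times what random sampling within a series would give.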
It is again only in Experiment A that there is a fairly consistent sessional change from series to series which an observer might easily recognize and possibly allow for, and yet if we turn to any of the graphs for the Bisection or Seconds-counting which show the variations of judgment within a series (Figures 11 and 15), it will be seen how very often the mean of ten consecutive judgments will give but a poor approximation to the mean of the series; we cannot take the judgments within one series as scattered at random. When dealing with a sample of $m$ correlated variates, the usual expression for the probable error of the mean is (1) $\pm\,.67449 \times$ [illegible], as compared with (2) $\pm\,.67449 \times$ [illegible] when the variates are not correlated, but owing to the sessional variations to which a large part of the correlation is due, the expression (1), being the smaller, is in the present case a worse measure than (2) of the probable limits of divergence of the mean of the sample from the mean of the series. The graphs of Figures 6, 11 and 15 show that there is a tendency for the judgments to vary in waves, to be first on one side of the mean for the series, and then to change to the other, but with no definite period of variation. It is owing to these large correlated variations, which cannot be expressed in any simple sessional term, that the coefficients of correlation, $r_{\sigma_m \rho_m}$, between $\sigma_m$ and $\rho_m$ have been found to have positive values ranging from $+.52 \pm .11$ in Experiment A to $+.18 \pm .15$ in D, showing that greater variation is associated with higher correlation of successive judgments.

* The Galton-Pearson Law of Ancestral Heredity; the offspring and the mean of the $k$th grandparents have $q r^k$ for their correlation.
An analysis has suggested that the coefficients of correlation of the crude values of the observations at intervals of $k$ can be expressed in the generalized form

$$R_k = [\text{illegible}] \qquad \text{(lxvii)},$$

where $\Sigma (D_s - \bar{d}_k)(D_{s+k} - \bar{d}_{k+1})$, $\Sigma (D_s - \bar{d}_k)^2$, etc. are terms representing the secular change, $F_k$ and $G_k$ are functions of the sessional change, and $R_k''$ and $S_k''$ are the correlation coefficients and standard deviations of the residuals left after secular and sessional changes have been removed. In two experiments it has been found that $R_1$ is greater than $+.80$, which shows clearly that the estimates have not been distributed randomly in time. The coefficients $R_k''$ appear to fall off in geometrical progression, and to be closely represented by expressions of the form $q r^k$, in which $q$ and $r$ are constant for any experiment; it has been found that the introduction of the quantities $F$ and $G$ in equation (lxvii), in addition to the secular terms, is only necessary if there is a significant sessional change which repeats itself in series after series. Thus in Experiment C, where there was no such change, $R_k$ could be expressed by the relation

$$R_k = [\text{illegible}] \qquad \text{(lxviii)}.$$

A tentative interpretation has been given to the results of this analysis. The observations in Experiment A suggested that there was some physiological significance in the distinction between the secular and sessional changes, and this was confirmed in Experiment B, where it was found that there was evidence of a linear sessional change acting in the opposite direction to the secular change.
A discussion of the values of the two partial correlation coefficients (personal equation and order, time constant; and personal equation and time, order constant) suggested that if the interval between the successive series were made very short, it might not be sufficient to break the effect of the sessional change.

The correlated variations which have been found to follow the law $R_k' = q r^k$ have been considered as in some way separate from and superimposed upon the other more steady changes. Starting from the tentative assumption that there is little or no partial correlation between the observer's true estimates at intervals greater than one (that is to say, that the observer's judgment at any moment is only influenced by the judgment immediately preceding, and only through this and not directly by the earlier judgments), it has been shown that the constant $q$ in the relation

$$R_k' = q r^{k} \qquad \text{(lxv) bis}$$

can be accounted for by the presence of uncorrelated accidental errors which are superimposed on the correlated variations in the observer's true estimate.

Without further investigation it would be difficult to distinguish between what may perhaps be termed the physiological and the psychological factors; in the experiments that have been undertaken the variations in recorded judgment depend partly on the movements of the hand, so that the former factors are likely to have played some part as well as the latter. The successive recording motions of the hand may have been correlated as well as the variations in mental estimate.

The importance of the results of course depends on how far they may be considered as typical of any practical series of observations made by the astronomer or the physicist.
Experiments were admittedly chosen in which it was expected that the variations in judgment would be large, and for the experienced observer working at the type of observation in which he has had much practice, the errors would no doubt be smaller, but it seems to me likely that the phenomena which have been discussed will be present in the judgments of other observers even if on a smaller scale. Experience and accuracy may be gained by practice, but it does not follow that the correlation between successive judgments will disappear. The secular and sessional changes may be small, but if rough comparisons of only the yearly mean personal equations of different observers are made, the finer changes, which may be of considerable importance in a combination of observations, cannot be recognized.

The Law of Normal Errors requires but two constants to describe adequately any series of observations: (1) the mean, (2) the standard deviation, while the introduction of a third may be necessary if a gradual secular change in personal equation is noticed. But the more generalized Theory of Errors discussed in the preceding sections requires more detailed information and a greater number of constants to define the character of an observer's personal equation and variations in judgment.
We shall require to know how the personal equation and the standard deviation vary both within a session and over long periods of time, and if there is any correlation between successive judgments, what is the form of the function $\psi$ which gives the value of the successive correlation coefficients in the relation $R_k = \psi(k)$. It is only by a detailed analysis of the observations themselves or of others carried out ad hoc, copying them as closely as possible, that full information on these points can be obtained; but if the possible complexities which may be present in the variations of judgment are fully realised, a great deal may be done in practical cases by the arrangement of the observations and the combination of the results, to eliminate the factors whose magnitude is unknown and to correct for others which are more easy to ascertain.

I have heartily to thank Miss I. McLearn for making the diagrams for Figs. 3, 4, 8, 12, 17, 19 and 20, and Miss M. Noel Karn for assistance in some of the computation.

[Biometrika, Vol. XIV, Part I. Plate I. Warren, Inheritance in Foxglove. Figs. I–IX: Pelorism of various intensities. Fig. X: Split corolla.]

INHERITANCE IN THE FOXGLOVE, AND THE RESULT OF SELECTIVE BREEDING.

By ERNEST WARREN, D.Sc. Lond.

In Biometrika, Vol. XI. pp. 303–327, 1917, the author published a preliminary report on the earlier results obtained in the breeding of foxgloves; and the present paper contains some account of the final results of the selection experiments.

In 1914 ten foxglove plants (Digitalis gloxiniaeflora), obtained from various sources and of different characteristics, were crossed among themselves and also self-fertilised. In subsequent years, 1915–19, new generations were obtained chiefly by the self-fertilisation of selected parents. The measurement, or when not possible the grading, of certain characters (pelorism, colour, size of flower, spotting of flower, etc.)
was undertaken in all the generations in order to determine the effect of selection when selfing alone occurred in an apparently pure race.

1. PELORISM.

Mendelian inheritance occurred in a typical fashion. A peloric plant crossed with a non-peloric plant produced non-peloric offspring. On selfing these, or crossing them together, there resulted on the average one peloric to three non-pelorics. Of the 10 parent plants two exhibited the peloric condition in a fully developed form, and the rest were non-peloric. The character was very perfectly recessive, and by breeding it was found that three of the remaining plants were really heterozygous, while all the others were non-peloric and homozygous.

It was soon observed that the peloric condition was by no means a clearly defined and fixed character. Pelorism in the foxglove may be regarded as an abnormal lack of power to produce internodes between the flower-buds, and consequently there may result considerable fusion of such buds with one another.

The maximum stage of pelorism is seen when the main-axis is short and abruptly ceases to grow in height. Only two or three normal flowers may be produced by the axis, and its blunt, sharply truncated end is surrounded by a whorl of bracts or sepals, petals being absent. Sometimes a ring of sessile anthers occurs (Pl. I, figs. I, II).

In typical pelorism the inability to produce internodes affects the terminal portions of all of the flower-axes of a plant, both central and side-axes. A variable number of flower-buds fuse and the corollas unite and may form a large symmetrical cup or saucer of some ornamental value, but the sepals mostly remain separate (figs. III, IV). When numerous flower-buds fuse a dense rosette may be formed by the petals, and the result is not pleasing. The peloric or crown-flower opens early, often before any of the normal flowers.
After the crown-flower has faded, the main-axis usually grows through the centre of it, and may even produce a second crown-flower (fig. VI); but in the case of the side-shoots the axis generally ends in an ovary and no further growth occurs (fig. V).

If the peloric tendency is not so well-marked, the main-axis may be only slightly affected by the suppression of several internodes, and by the partial fusion of flower-buds, at a variable distance above the lowest normal flower of the axis. Sometimes a considerable number of internodes may be unduly shortened, so as to produce excessive crowding of flowers which do not actually fuse (fig. VII), and frequently a strongly marked spiral bending of the axis occurs (fig. VIII). At other times the suppression of the internodes may occur only high up on the flowering axis close to where it normally ceases to grow (fig. IX).

When the central axis is strongly peloric the side-axes are invariably so, and in all other cases the side-axes exhibit greater pelorism than the main-axis. Finally, the main-axis may be quite normal and show no peloric tendency, but the side-axes may still be strongly peloric. The last trace of pelorism in a plant is shown when only one or two of the weaker side-axes exhibit some slight sign of a peloric tendency.

It is unfortunate that it has not been found possible to devise any practical method of measuring the intensity of pelorism, and therefore the plants have been arranged in four grades.

0° grade = no peloric tendency.
1°–25° grade = those in which the central axis is non-peloric, but the side-axes exhibit some peloric tendency.
26°–50° grade = main-axis non-peloric, but side-axes may reach full pelorism.
51°–75° grade = main-axis partially peloric, side-axes fully so.
76°–100° grade = plants ranging to complete pelorism in all axes.
In the generations produced from 1914–19 there were in all 128 fertilisations of different classes of individuals, recessive (peloric), homozygous dominant (non-peloric) and heterozygous dominant (non-peloric) plants, and families were raised. In the table on p. 105 the experimental and theoretical results are compared. The fertilisations of the classes DD × DD, RR × RR, and DR × DR include both selfing and crossing.

The sum totals of the experimental and theoretical results are remarkably close; being, crowned, 1019 experimental and 1013 theoretical; non-crowned, 1169 experimental and 1175 theoretical. It must be noted here that a plant was recorded as "peloric" or "crowned" if it exhibited the least tendency towards pelorism in any of the axes. Taking all the classes or groups together it may be said that the inheritance of the quality of pelorism is typically Mendelian. The group RR × RR should include no non-crowned offspring, and the 7 which occurred were obtained by gradual selection.

The group in which the experimental result diverged the most widely from the theoretical result was DR × RR (heterozygous plants crossed with recessives) and it would be interesting to know whether such is generally the case in Mendelian inheritance.

| Gametic nature | Number of pairings (families) | Number of offspring | Crowned offspring, experimental | Crowned offspring, theoretical | Non-crowned offspring, experimental | Non-crowned offspring, theoretical |
|---|---|---|---|---|---|---|
| DD × DD | 16 | 266 | 0 | 0 | 266 | 266 |
| RR × RR | 43 | 741 | 734 | 741 | 7 | 0 |
| DR × DR | 38 | 777 | 187 | 194 | 590 | 583 |
| DR × DD | 5 | 93 | 0 | 0 | 93 | 93 |
| DR × RR | 12 | 156 | 98 | 78 | 58 | 78 |
| DD × RR | 14 | 155 | 0 | 0 | 155 | 155 |
| Totals | 128 | 2188 | 1019 | 1013 | 1169 | 1175 |

The Inheritance of the Degree or Intensity of Pelorism.
If a peloric plant be crossed with a non-peloric homozygous dominant, the offspring are heterozygous and non-peloric, and if these are self-fertilised or crossed together the peloric character re-appears in an apparently unchanged and undiluted condition. If, on the other hand, a strongly peloric plant is crossed with a weakly peloric one the offspring are more or less intermediate, and if the offspring are selfed or fertilised together the intermediate nature of the peloric character tends to be retained.

In the accompanying table A, B, C, D, E are plants of various gametic constitution. On selfing (A) the offspring were all fully peloric. On selfing some 5 offspring, A, 2–9, the plants produced were all essentially fully crowned. On crossing two recessive plants (A and E) of different peloric intensities (see bottom of table) the offspring tended to be intermediate. On crossing (A) with an ordinary plant (B) the offspring were non-peloric and heterozygous. On selfing two of these plants, (A × B) pls. 2 and 7, the offspring were either fully peloric, or non-peloric (heterozygous and homozygous). On selfing two recessives, (A × B) 2, pls. 8 and 9, obtained from (A × B) pl. 2, the offspring were all nearly completely peloric. Thus, there was no clearly marked dilution or apparent contamination by crossing a peloric plant with a non-peloric one.

When, however, the same recessive plant (A) was crossed with a heterozygous plant (C) having in its gametes a weak peloric tendency of about 35° there was much variation in the offspring, and on selfing some of these plants, (A × C) 1, 2, 7, 11, and raising a new generation it was obvious that considerable dilution of the peloric tendency had occurred. On crossing the same plant (A) with a heterozygous plant (D) having a stronger peloric tendency (75°) in its gametes it was clear that in the next generation raised, (A × D) 6, 5, 11, less dilution had taken place than in the former case.
Pelorism: Various Pairings.

Each row gives the number of offspring falling in each grade of pelorism. Rows beginning "selfed offspring" are the families raised by selfing individual plants of the parentage immediately above.

| Parentage | 100° | 75° | 50° | 25° | Non-peloric |
|---|---|---|---|---|---|
| A (100° pelorism), selfed = RR × RR | 33 | 0 | 0 | 0 | 0 |
| selfed offspring: A pl. 2 (100° pelorism) | 13 | 1 | 0 | 0 | 0 |
| selfed offspring: A pl. 3 | [illegible] | | | | |
| selfed offspring: A pl. 4 | [illegible] | | | | |
| selfed offspring: A pl. 6 | 6 | 0 | 0 | 0 | 0 |
| selfed offspring: A pl. 9 | [illegible] | | | | |
| A ♀ (100° pelorism) × B ♂ (0° pelorism and homozygous); A × B = RR × DD | 0 | 0 | 0 | 0 | 13 |
| selfed offspring: (A × B) pl. 2 (non-peloric and heterozygous) | 6 | 0 | 0 | 0 | 21 |
| selfed offspring: (A × B) pl. 7 (non-peloric and heterozygous) | 5 | 0 | 0 | 0 | 23 |
| selfed offspring: (A × B) 2, pl. 8 (100° pelorism) | 12 | 1 | 0 | 0 | 0 |
| selfed offspring: (A × B) 2, pl. 9 | [illegible] | | | | |
| C (heterozygous), selfed = DR × DR | [illegible] | | | | |
| A ♀ (100° pelorism) × C ♂ (non-crowned and heterozygous with, say, 35° pelorism in gamete); A × C = RR × DR | 4 | 12 | 3 | 1 | 7 |
| selfed offspring: (A × C) pl. 1 (75° pelorism) | 20 | 4 | 9 | 2 | 0 |
| selfed offspring: (A × C) pl. 2 (50° pelorism) | [illegible] | | | | |
| selfed offspring: (A × C) pl. 7 (heterozygous) | - | - | 2 | 0 | 10 |
| selfed offspring: (A × C) pl. 11 | [illegible] | | | | |
| D (heterozygous), selfed | 1 | 1 | 1 | 0 | 8 |
| A ♀ (100° pelorism) × D ♂ (non-crowned and heterozygous with, say, 75° pelorism in gamete) | [illegible] | | | | |
| selfed offspring: (A × D) pl. 6 (100° pelorism) | 17 | 2 | 2 | 0 | 0 |
| selfed offspring: (A × D) pl. 5 | [illegible] | | | | |
| selfed offspring: (A × D) pl. 11 (75° pelorism) | 27 | 5 | 1 | 0 | 0 |
| A (100°) × E (50°) | [illegible] | | | | |

In the last generation it will be seen that there was no sharp separation of the plants into two groups attributable to the two grandparental factors. Thus, in the case of (A × C) pl. 2 (50°) the offspring are not clearly divisible into those of 100° resembling A, and those of 35° attributable to C; in other words there was no obvious segregation into two degrees of pelorism.
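The theoretical columns of the table of pairings on p. 105 follow from the usual Mendelian expectations for a fully recessive character. The Python sketch below recomputes them from the offspring totals and observed crowned counts of that table. The dictionary names and the rounding to whole plants are illustrative assumptions, and the recessive fractions used (0, 1/4, 1/2, 1) are the standard single-factor ratios; the text itself states only the one-to-three ratio for the heterozygous pairings.

```python
from fractions import Fraction

# Expected fraction of recessive (crowned) offspring for each class of pairing,
# assuming a single factor with complete dominance (an assumption of this sketch).
RECESSIVE_FRACTION = {
    "DD x DD": Fraction(0),
    "RR x RR": Fraction(1),
    "DR x DR": Fraction(1, 4),
    "DR x DD": Fraction(0),
    "DR x RR": Fraction(1, 2),
    "DD x RR": Fraction(0),
}

# pairing: (offspring raised, crowned offspring observed), from the table on p. 105.
OFFSPRING = {
    "DD x DD": (266, 0),
    "RR x RR": (741, 734),
    "DR x DR": (777, 187),
    "DR x DD": (93, 0),
    "DR x RR": (156, 98),
    "DD x RR": (155, 0),
}


def expected_counts(pairing):
    """Return the expected (crowned, non-crowned) counts for one class of pairing."""
    total, _observed_crowned = OFFSPRING[pairing]
    crowned = round(total * RECESSIVE_FRACTION[pairing])  # nearest whole plant
    return crowned, total - crowned


def totals():
    """Return (observed crowned, expected crowned, observed plain, expected plain)."""
    obs_crowned = sum(c for _, c in OFFSPRING.values())
    obs_plain = sum(t - c for t, c in OFFSPRING.values())
    exp_crowned = sum(expected_counts(p)[0] for p in OFFSPRING)
    exp_plain = sum(expected_counts(p)[1] for p in OFFSPRING)
    return obs_crowned, exp_crowned, obs_plain, exp_plain


if __name__ == "__main__":
    for pairing in OFFSPRING:
        print(pairing, expected_counts(pairing))
    print(totals())  # (1019, 1013, 1169, 1175)
```

The totals reproduce the 1019 against 1013 crowned and 1169 against 1175 non-crowned quoted in the text, and the per-class output shows that the whole of the discrepancy in the heterozygous-by-recessive group comes from 98 crowned plants observed where 78 would be expected.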
On the factorial and chromosome hypotheses we must suppose that the factor or factors governing the peloric character tend to become mutually changed and intermediate in nature when the male and female chromosomes containing the factors for the two degrees of pelorism lie alongside each other in the zygote.

It will be of interest to obtain a general measure of the strength of inheritance between mid-parent and offspring with respect to the transmission of the degree or intensity of pelorism. For this purpose only recessives were used, involving 30 mid-parents. Employing Prof. Karl Pearson's method the accompanying table gives the correlation surface.

Pelorism: Correlation Table, Recessives. Mid-parent and Offspring.

| Mid-parents, grade of pelorism | Offspring 76°–100° | Offspring 51°–75° | Offspring 26°–50° | Offspring 1°–25° | Totals |
|---|---|---|---|---|---|
| 1°–25° | 6 | 2 | 23 | 18 | 49 |
| 26°–50° | 58 | 61 | 68 | 11 | 198 |
| 51°–75° | 64 | 31 | 15 | - | 110 |
| 76°–100° | 143 | 14 | 11 | 5 | 173 |
| Totals | 271 | 108 | 117 | 34 | 530 |

The coefficient of correlation, calculated from the table, between mid-parent and offspring is .52. The result can be regarded as only a very rough approximation, since a satisfactory method of measuring pelorism has yet to be found. The figure obtained is somewhat low, but it would seem to indicate that the inheritance of the degree of pelorism is of the nature of ordinary blended inheritance.

The point of interest to notice is that the union of two peloric plants of different peloric intensities influences the gametes, while the union of a peloric plant with a homozygous non-peloric plant does not very readily affect the purity of the gametes with respect to pelorism.

Pelorism. Effect of Selection in a homogeneous race.
A peloric plant (C) with pelorism of about 85° intensity was self-fertilised, and the offspring, 16 in number, were as follows: 7 with 100°, 4 with 75° and 5 with 50° of pelorism. Two of these plants of 50° (C 2 and 7) were selfed, and the generation raised exhibited a lowered pelorism. The various selections made and the results obtained are shown in the accompanying table. It will be seen that finally on the selfing of plant C 7, 10, 20, 4 (0°) only non-peloric offspring were obtained.

| Parentage (self-fertilisation) | Crowned 100° | Crowned 75° | Crowned 50° | Crowned 25° | Non-crowned |
|---|---|---|---|---|---|
| C (85°) | 7 | 4 | 5 | 0 | 0 |
| C 2 (50°) | [illegible] | | | | |
| C 2, 11 (75°) | [illegible] | | | | |
| C 2, 2 (50°) | [illegible] | | | | |
| C 2, 8 (50°) | 1 | 13 | 11 | 0 | 0 |
| C 7 (50°) | [illegible] | | | | |
| C 7, 10 (25°) | 0 | 2 | 18 | 2 | 5 |
| C 7, 10, 20 (25°) | [illegible] | | | | |
| C 7, 10, 20, 4 (0°) | 0 | 0 | 0 | 0 | 6 |

2. GENERAL COLORATION OF THE COROLLA.

As described in the previous report (loc. cit.) the intensity of the purple coloration was measured by comparing it with a colour-scale founded on the intensity of colour by transmitted light of varying depths of a standard colour-solution.

Purple and white foxgloves exhibit the ordinary Mendelian relationship, purple being dominant.

[Table of Coloured Series I to V, giving for each family the gametic nature of the pairing, the parentage, the colour of the parent, the distribution of the offspring over the grades of the colour-scale, and the mean colour of the offspring. The numerical entries are illegible. The rows are: Series I (all DD × DD): E (selfed); F; ♀ E × ♂ F; (E × F) pl. 16 (selfed); (E × F) pl. 9 (selfed). Series II: DD × RR, ♀ E × ♂ White; DR × DR, (E × White) pl. 18 (selfed). Series III: DD × DD, B (selfed); DD × RR, ♀ B × ♂ White; DR × DR, (B × White) pl. 1 (selfed); DR × DR, (B × White) pl. 5 (selfed). Series IV: DD × DD, B (selfed); DR × DR, A (selfed); DD × DR, ♀ B × ♂ A; DD × DD, (B × A) pl. 7 (selfed); DD × DD, (B × A) pl. 2 (selfed). Series V: DD × DD, B (selfed); DR × DR, C (selfed); DD × DR, ♀ B × ♂ C; DD × DD, (B × C) pls. 8, 4, 7, 6 and 1 (selfed).]

Two of these offspring were selected, (E × F) pls. 16 and 9, as widely divergent from each other as possible, and selfed. In the families obtained there was no tendency for the occurrence of segregation into the two colour-intensities of (E) and (F) respectively. There was thus a definite blend, and the means of the two families approached the respective colour-intensities of the two self-fertilised plants.

In Series II the same homozygous dominant plant (E), with colour-intensity of 90°, was crossed with a white plant and all the offspring were heterozygous and intermediate. On selfing one of the darker coloured offspring, no. 18, the dominant plants raised tended to be of about the same colour-intensity as the grandparent (E).

In Series III a dark-coloured homozygous dominant plant (B) was also crossed with a white plant.
One of the darkest heterozygous offspring, (B × White) pl. 1, was selfed and the coloured plants raised tended to be paler than the grandparent, but the family was small.

In Series IV the dark-coloured homozygous plant (B) was crossed with a dark heterozygous plant (A). From the offspring raised, two were selected and selfed, one very dark and the other moderately dark. The two families included only coloured plants, and consequently the parents may be supposed to have been homozygous. The moderately dark parent, (B × A) pl. 2, failed to produce any offspring as dark as the grandparent (B).

In Series V the same plant (B) was crossed with a light heterozygous plant (C). From the offspring produced five homozygous dominants were selfed, and in the five families raised only two plants reached the colour-intensity of the grandparent (B).

On taking all these results together it may be said that there is evidence for the view that crossing a dark race of foxgloves with white plants tends to dull the colour-intensity of homozygous dominants of subsequent generations.

General Coloration: Strength of Inheritance and Effect of Selection.

In 1914 a dark-coloured homozygous plant (B, ♀) was crossed with a somewhat pale-coloured heterozygous plant (C, ♂) = DD × DR = III. The offspring would consist theoretically of approximately equal numbers of dominants and heterozygous individuals. The reciprocal cross (C ♀ × B ♂) was also made = II. Several dominants were selfed and families were raised. Out of these families certain plants were selected and selfed and new families were obtained. This procedure was continued until 1917, and the results are given in the accompanying table. The families of the different years are arranged in ascending order of the colour-intensities of the parents.
On comparing the means of the families with the colour-grade of the parents (shown in brackets) it will be at once seen that small variations in the colour-intensity of the parents tended to be transmitted to the offspring. It is obvious that the table exhibits the effect of selection in self-fertilised homozygous generations. For example we may take the following (entries that cannot be read in the source are marked [illegible]):

  Plant                                       Colour of plant   Mean of offspring
  Homozygous plant, II 1                      70                74
  An offspring of above, II 1, 4              74                82
  An offspring of above, II 1, 4, 17          110               95

  Homozygous plant, III 2                     [illegible]       66
  An offspring of above, III 2, 1             66                55
  An offspring of above, III 2, 1, 18         [illegible]       85
  An offspring of above, III 2, 1, 18, 28     95                100

Reverse selection is shown also:

  Homozygous plant, III 2                     [illegible]       66
  An offspring of above, III 2, 5             [illegible]       57
  An offspring of above, III 2, 5, [illegible]            40    41
  An offspring of above, III 2, 5, [illegible]            30    32

Inheritance of Colour-Intensity among Dominants. Dominant Generations (Self-fertilisation).
[Table not recoverable from the source. For each selfed parent it gave the distribution of the offspring over the grades of the colour-scale (20–29 to 130–139) and the family mean. The family means, in the order printed, are: first block 66, 54, 62, 66, 69, 64, 64, 82, 78, 70, 59, 65, 71, 81, 95; second block 67, 66, 57, 54, 60, 55, 71, 69, 41, 47, 62, 68, 85, 32, 43, 47, 44, 74, 75, 91, 100.]

Thus, starting with a plant of about 70 colour-intensity we arrive by selection of self-fertilised plants at mean family intensities of 100 in one direction and 32 in the reverse direction.

In another series, starting with a homozygous dominant plant of colour-intensity of about 11, I have by selection obtained plants in which the corolla exhibited no general tint. On selfing the pale plant no white plants occurred, and the offspring were all pale-coloured; but when the intensity was decreased by selection to about 4, the “white” plants showed Mendelian segregation, for the offspring arising from the plants produced from a cross with a dark-coloured plant were sharply divisible into strongly coloured and “white” individuals.

As a further example of selection, I started with a homozygous medium-coloured (48) plant (G). This was selfed and a family of 31 coloured plants was raised; there were no whites. Thus, the parent plant may be regarded as homozygous. A plant (G 3) in this family, not far removed in colour (55) from the average, was selfed and the resulting family had a mean colour approximating to the colour of the parent. A light-coloured (27) plant (G 3, 20) and a dark (81) plant (G 3, 13) in this last family were selfed also, and the two families raised tended to resemble their respective parents. In a succeeding generation further progress was obtained in securing a dark race and a pale race. The necessary details are given in the accompanying diagrammatic table. The families printed in heavy type are those leading to a dark race, while those in ordinary type are passing into a light race.

Formation of Light and Dark Races from a Dominant (homozygous) G.
[Table not recoverable from the source. It listed each parent (number and colour) against the distribution of its offspring on the scale of colour.]

Correlation Table—Colour-Intensity—Dominants (homozygous). Series II and III.
[Table not recoverable from the source. It gave the grades of colour-intensity of the parents against those of the offspring, in grades of ten units from 30–39 to 120–129.]

In the last table, p. 113, a correlation surface is shown between parents and offspring. It is formed from the series of families given in the table preceding the last, and arising by self-fertilisation. The constants calculated from the table are: standard deviation of weighted parents 1.7805 units, and of offspring 1.8962 units, coefficient of correlation .707. In this table 39 families were involved, as detailed in the previous table. The starting points were four homozygous dominant plants occurring in the two families raised from the reciprocal crosses (C₁ × B₁) and (B₁ × C₁).

3. BROWN SPOTS.

The amount of spotting on the inside of the corolla is not closely correlated to the intensity of the general purple coloration of the flower, for even in white plants the spots may be numerous and of a deep purple colour. In coloured plants the spots were almost always dark purple. As a very rare exception in the coloured plants (4 plants in about 2500) some of the spots were russet brown, and in the case of the larger spots there was a middle area of brown bordered by a margin of purple. In white flowers the spots were fairly frequently brownish-green or brown. In such brown spotted white flowers I could never detect the slightest tinge of purple on the general surface of the corolla, while in purple-spotted white flowers a faint tinge of purple could often be seen.
The brown spots of white flowers might not become visible until the flowers were on the point of fading, and in the case of any given white plant it was wholly impossible to affirm that brown spots were, or would be, entirely absent from all of the flowers.

With the exception of the four plants mentioned above there was a sharp discontinuity to the naked eye between purple spots and brown spots, intermediate conditions being absent. The brown colouring matter may be regarded as altered or decomposed anthocyanin. In purple spots a microscopic examination often showed a certain amount of decomposition; but, with the exception of the four plants, the amount was not enough to alter the colour of the spots sufficiently for detection by the naked eye. Thus, the discontinuity lies between a normal small amount of decomposition, and an abnormal entire decomposition.

It may be stated that under ordinary circumstances brown or greenish spots (as seen by the naked eye) are linked to a perfectly white corolla, but purple spots occur in both purple and “white” flowers, and an apparently perfectly white corolla may also bear purple spots.

If a brown spotted plant is crossed with a purple spotted one the offspring are all purple spotted and heterozygous. The brown spotted condition is inherited in Mendelian fashion, and is recessive to purple spots. No special crossings have been made to investigate the matter, and the results which are given below are merely picked out from the records of the numerous families which have been raised for other purposes.

In the accompanying table it is useless to include families in which there was no taint of whiteness, since all the individuals (except 4 plants out of 2500) had purple spots.

Brown Spots—Families White or Some Taint of Whiteness.

  Gametic Nature   Number of   Number of   Purple Spotted              Brown Spotted
  of Pairings      Families    Offspring   Experimental  Mendelian     Experimental  Mendelian
                                                         Expectation                 Expectation
  DD × DD          13          344         344           344           0             0
  RR × RR          11          169         0             0             169           169
  DR × DR          13          213         166           160           47            53
  DR × DD          15          137         137           137           0             0
  DR × RR          1           8           3             4             5             4
  DD × RR          6           70          70            70            0             0
  Totals           59          941         720           715           221           226

It is obvious from the table that the brown spotted condition exhibits Mendelian inheritance: 720 purple spotted and 221 brown spotted plants were obtained against expectations of 715 and 226.

4. INHERITANCE OF CERTAIN SPORT ABNORMALITIES.

Crenate Margin.—In a homogeneous family of 29 plants there appeared one plant in which the free edge of the mouth of the flower exhibited a well-marked serrated condition. All the flowers of a main-axis of considerable size were similarly affected, and later, lateral flowering axes were formed, and the flowers were also serrate. The character was sufficiently marked to be noticeable at a casual glance of the plant, and since all the numerous flowers were alike in this particular, the character was clearly inherent in the plant, and was not due to a chance environmental disturbance influencing a young growing axis or certain flower-buds. The plant was self-fertilised, and it was confidently expected that the character would reappear in the offspring. Out of a family of some 20 plants 12 flowered and no sign of the peculiar serrated condition could be detected in any one of the plants. Here we have a conspicuous character in a large healthy plant affecting every flower of all the flowering axes, and yet apparently it was incapable of being transmitted to the offspring.

Split Corolla.—In a homogeneous family (XXXIV) of 27 plants there appeared one plant in which in the great majority of the numerous flowers the corolla was symmetrically divided into an upper, a lower and two lateral pieces by four lateral splits extending down to the base of the flower. The plant was a large, healthy one and produced a number of similar lateral axes.
At least 90% of the flowers were completely split (Pl. I, fig. 10). In a family (VIII 7) unrelated to the above there were 16 plants, and of these, four plants were similarly affected. In one of these plants practically all (99%) of the flowers were entirely split into four pieces, while in the remaining three plants some 50–60% of the flowers were split. All the plants were large and vigorous. It was thought that very probably the character would exhibit Mendelian inheritance. The results of crossing and selfing are shown in the accompanying table.

Inheritance of Split Corolla.
[Table not recoverable from the source. For each parental pairing it gave the number of offspring falling in each class of percentage of flowers split; the classes legible in the source are No Splitting, 15–29, 30–44, 45–59, 60–74 and 75–99.]

The first mentioned plant (XXXIV 4) with 90% of the flowers split was crossed with an unrelated plant with some 99% of the flowers split (5th vertical column of table). Of the 17 offspring 8 plants were wholly unsplit, while the remainder exhibited the character in a very greatly weakened condition. Three of these offspring, S. J. nos. 9, 18 and 6, having 0%, 13% and 18% of the flowers split respectively, were selfed, and the families raised all contained some plants very conspicuously split, but the character was more marked in the two families raised from parents 18 and 6, which showed some degree of splitting. In a subsequent generation (S. J. 18 pl. 4 and S. J. 18 pl. 10) raised by selfing, the character became very strongly pronounced.

An unrelated non-split plant (II 6, 1) was crossed with the first mentioned plant having at least 90% of the flowers split (XXXIV 4). In the family of 12 plants raised none of the plants exhibited splitting. Two of these offspring (R. J. nos. 9, 16) were selfed and no splitting occurred in the two families. Another generation was raised from R. J. 16, plant 14, and some re-appearance of splitting was detected.

The table includes all the split plants which have occurred among some 3000 plants which have been under observation. The results obtained indicate that heredity has some influence, but the data are insufficient for determining the nature of the transmission, which does not bear a Mendelian aspect.

Creased Upper Lip.—In a certain plant in the majority of the flowers the upper surface and lip exhibited a conspicuous pucker or crease. This plant was crossed with an unrelated normal plant with no crease. Most of the seedlings were killed by the violent elements, but four plants were raised, and in one, a number of flowers exhibited a crease, which, however, was much less developed than in the paternal parent. The data are scanty, but the hereditary transmission does not seem to be Mendelian.

Spontaneous Appearance of White plants.—Among the numerous homozygous dominant coloured families that have been raised a white plant appeared spontaneously on two occasions in two unrelated families. These plants, of course, bred true, and as there was no evidence of contamination of the seed the plants must be regarded as new sports.

5. INHERITANCE OF SEED-LENGTH.

The mean length of the seed varied considerably in different plants. No discontinuous variation could be detected, and inheritance was of the blended type. Ten seeds were taken at random from one or more capsules of a number of plants of certain series and the means determined. The seeds of a capsule exhibited a moderate amount of variation, but they were monomorphic in varietal crossings, and not dimorphic as was noticed in an interspecific crossing. The distribution was more or less normal. Unfortunately there was very considerable variation in the mean size of the seeds in different capsules of the same plant, and consequently no very accurate determination of the strength of inheritance was possible with this character without an excessive number of measurements. As it was, the investigation entailed the measurement of about 1000 seeds.

A plant, C₁ (mean seed-length 639 units), was crossed with B₁ (mean seed-length 628 units) and a family was raised; C₁ × B₁ = II. In family II twelve plants were selfed, namely II 1, II 2 ... II 12, the seeds were measured and twelve families were obtained. In family II 1 three plants were selfed and the seed-length determined, namely (II 1) 1, (II 1) 2 and (II 1) 4. The means of the seed-lengths of these three plants were compared with the seed-length of the parent II 1. Similarly, for example, in family II 1, 2 two plants were selfed, namely (II 1, 2) 5 and (II 1, 2) 20, and the means of the seed-lengths of these two plants were compared with the seed-length of the parent II 1, 2. The data are given in the accompanying table.

Mean Seed-length, Parents and Offspring.
(Entries that cannot be read in the source are marked [illegible].)

  Parent (selfed)   Mean length    Offspring (selfed), with mean length
  II 1              606            II 1, 1  572;  II 1, 2  668;  II 1, 4  649
  II 1, 2           668            II 1, 2, 5  668;  II 1, 2, 20  642
  II 1, 4           649            II 1, 4, 3  655;  II 1, 4, 17  674
  II 2              528            II 2, 3  582;  II 2, 5  637;  II 2, 16  566;  others [illegible]
  II 3              629            II 3, 1  686;  II 3, 4  686;  II 3, 15  672
  II 4              592            II 4, 2  668;  II 4, 6  657;  II 4, 8  628;  II 4, 12  598
  II 6              [illegible]    II 6, 4  670;  II 6, 11  695;  others [illegible]
  II 6, 11          695            II 6, 11, 6  665
  II 7              [illegible]    II 7, 12  570;  II 7, 14  624;  others [illegible]
  II 7, 1           [illegible]    II 7, 1, 7  649
  II 8              620            [illegible]
  II 9              653            II 9, 2  633;  II 9, 3  629;  II 9, 10  630;  II 9, 11  620
  II 10             646            II 10, 1  660;  II 10, 5  649;  II 10, 7  660;  others [illegible]
  II 10, 5          649            II 10, 5, 5  598;  II 10, 5, 10  629
  II 10, 7          660            II 10, 7, 9  653
  II 11             615            [illegible]
  II 12             679            II 12, 9  642

  C₁ (self-pollen) seed-length = 639
  B₁ (self-pollen) seed-length = 628
  C₁ (B₁ pollen) seed-length = 642; these last seeds produced fam. II.

The coefficient of correlation, calculated from the above numbers, between parents (selfed) and offspring (selfed) is .378. This is low for mid-parental correlation; but as all the generations arose by self-fertilisation we ought to have practically no correlation at all according to the pure-line hypothesis, for the two original parents (C₁ and B₁) were closely similar to each other in the character under investigation.

6. PURPLE SPOTTING OF THE COROLLA.

The purple spotting of the lower surface of the corolla-tube and lower lip varied greatly in the original parent plants, and the character was obviously inherited. The amount of spotting had little relationship to the intensity of the general coloration of the corolla, and “white” flowers were sometimes richly spotted with purple.
The percentage area of the lower surface covered with spots was estimated by comparing the flowers with a series of diagrams each covered with a definitely known percentage of spotting. With practice it was found that sufficiently uniform results could be obtained by this method.

In plants which had lost completely the power of producing any purple coloration whatever, the spots were brown and usually small and scanty, and among such plants an almost entire absence of spots of any kind occasionally occurred. We have already seen that with regard to the colour of the spots (brown and purple) Mendelian segregation takes place.

In the inheritance of the amount of purple spotting no Mendelian relationship could be detected. The smallest amount of purple spotting met with in coloured foxgloves equalled about 1%, and the maximum about 70%. It will be remembered that on crossing a dark purple plant with a plant bearing flowers very faintly tinged with purple (say, colour 4 of standard), definite segregation into “white” and purple plants occurred in the second generation following; but on crossing a plant possessing an abundance of purple spots (say, 50%) with a plant bearing very few purple spots (say, 2% or 3%) no such segregation was found, and the spotting tended to remain intermediate in amount.

In the numerous crosses that have been made for various purposes the condition of the spotting was observed, and it is undoubtedly true that the means of the spotting of the families resulting from the crosses tended on the average to approximate to the spotting of the mid-parent, ½(♂ + ♀). No difference could be detected between the reciprocal crosses of two plants.

Influence of Selection and Strength of Inheritance in Self-fertilised Generations.

In this connection details of Series II and III may be given (see p. 120). Plant C₁ with 11% spotting was crossed with pollen of plant B₁ with spotting 48% = II.
Seven of the offspring were selfed and the spotting of the resulting families was determined. Subsequently two other generations were raised by selfing. Plant B₁ was crossed with pollen of C₁ = III. Four of the offspring were selfed and subsequently three other generations were raised by self-fertilisation.

The distributions of the spotting in the families of the different generations are shown in the accompanying table. In each generation the families are arranged in the ascending order of the parental spotting (see the top and middle horizontal lines).

A casual inspection indicates at once that the general trend of the family-distributions follows the gradual increase in the spotting of the parents.

As an example of selection we may take:

  III 2 (9%) selfed produced with others a plant III 2, 5 (15%)
  III 2, 5 (15%) selfed produced with others a plant III 2, 5, 10 (22%)
  III 2, 5, 10 (22%) selfed produced with others a plant III 2, 5, 10, 17 (27%)
  III 2, 5, 10, 17 (27%) selfed produced a family with mean spotting of 39%

Thus, we have passed from a plant with 9% spotting to a plant with 27%, which on selfing produced a family with a mean spotting of 39%.

With reference to the strength of inheritance two tables are given on p. 121, one for parents and offspring, and one for grandparents and grandchildren. The respective coefficients of correlation are .560 and .395. This correlation does not arise by the mixture of two races which have been sorted out by segregation during the different self-fertilised generations.

Series II and III. Purple Spotting—Families from Self-fertilised Parents.
[Table not recoverable from the source. For each selfed parent, with its percentage of spotting, it gave the distribution of the offspring over the grades of spotting.]
Inspection of the tables shows that the distributions of the various families give no indication whatever of the occurrence of segregation into little spotted and much spotted plants. The gradual rise in the degree of spotting of the different parents is followed by a gradual increase in the spotting of the respective families obtained by self-fertilisation. The fact that the correlation between the grandparents and grandchildren is less than that between the parents and offspring is further evidence that the small, apparently fortuitous, variations in spotting occurring among self-fertilised generations are inherited. This result is opposed to the pure-line hypothesis, according to which such small variations are regarded as slightly different expressions of the same identical character which remains unchanged in its essence from one self-fertilised generation to another. If such were the case the small variations would be fluctuating, non-inheritable variations; but the results in the present case are definitely against a supposition of this kind.

Correlation Table—Spotting—Parents and Offspring. Series II and III.
[Cell entries not recoverable from the source. Parents and offspring were each classed in grades of spotting 0–3, 4–7, 8–11, 12–15, 16–19, 20–23, 24–27, 28–31, 32–35, 36–39, 40–43 and 44–47.]

Correlation Table—Spotting—Grandparents and Grandchildren. Series II and III.
[Cell entries not recoverable from the source. Row totals by grade of spotting of the grandparents: 8–11, 101; 12–15, 124; 16–19, 154; 20–23, 60; 24–27, 30; 28–31, 40. Column totals for the grandchildren, in ascending order of grade: 7, 12, 17, 30, 33, 66, 66, 91, 117, 47, 20, 3. Grand total 509.]

It might be urged by some that the result is really due to the existence of genotypes, and that variations within the limits of each genotype are not inheritable.
The distributions of the families in the table do not indicate the occurrence of genotypes of any considerable magnitude. If the genotypes are supposed to be very small the practical result would become indistinguishable from the inheritance of continuous variations.

7. RATIO OF BREADTH TO LENGTH OF COROLLA.

The breadth was measured as the maximum horizontal width across the mouth of the corolla of a fully expanded flower in which the anthers had opened; the length was the maximum distance measured along the mid-adcauline surface with the lower lip stretched out straight in the long axis of the flower. It is convenient to express the ratio in the form $\dfrac{\text{Breadth}}{\text{Length}} \times 1000$. The mean of the ratios of the four lowest flowers of an axis was taken as the mean of the plant.

The original parent plants varied widely in this ratio, and the families raised by selfing tended to have the same ratio as their parents. A plant bearing wide flowers was crossed with one having narrow flowers, and the offspring tended to be intermediate. On selfing these offspring the new generation exhibited, of course, considerable variation, but taken as a whole the intermediate condition was retained, and there was clearly no segregation into wide flowers and narrow flowers. Thus, the different degrees of this character blend readily on crossing, and the mode of inheritance is very similar to that of the spotted condition. The results of a multitude of crossings of plants bearing variously shaped flowers have been carefully determined and tabulated, and there is no question about the general accuracy of the statement made above.

In the present place we may confine our attention to the self-fertilised generations of Series II and III (p. 123). A plant (♀ C₁) with relatively wide flowers (ratio 608) was crossed with a plant (♂ B₁) having relatively narrow flowers (ratio 487). The family (= II) had flowers approximately intermediate. The reciprocal cross = III.
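The ratio defined at the head of this section, together with the rule of averaging the four lowest flowers of an axis, can be sketched in Python as follows. Only the formula (breadth / length) × 1000 and the four-flower mean come from the text; the function names and the measurements in the example are hypothetical and are not taken from the records.

```python
from statistics import mean


def corolla_ratio(breadth: float, length: float) -> float:
    """Ratio of breadth to length of one corolla, scaled by 1000."""
    if length <= 0:
        raise ValueError("length must be positive")
    return breadth / length * 1000


def plant_ratio(flowers: list[tuple[float, float]]) -> float:
    """Mean ratio of a plant from (breadth, length) pairs.

    The text uses the four lowest flowers of an axis; the caller is
    assumed to pass exactly those four measurements.
    """
    if len(flowers) != 4:
        raise ValueError("expected the four lowest flowers of an axis")
    return mean(corolla_ratio(b, l) for b, l in flowers)


if __name__ == "__main__":
    # Hypothetical measurements in any common unit, not data from the paper.
    measurements = [(24.0, 48.0), (25.0, 50.0), (26.0, 52.0), (27.0, 54.0)]
    print(round(plant_ratio(measurements)))
```

On this scale a plant whose flowers are half as broad as they are long has a ratio of 500, which sits between the two parental values of 487 and 608 quoted above.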
The distributions of the families of the various generations raised by selfing are shown in the accompanying table. The families of each generation are given in an ascending order of the ratios of the parents.

As in the case of the character of spotting it will be seen that there is a clearly marked tendency for the mean ratios of the families to approximate to the ratios of the respective parents. In none of the families do we find any definite segregation into plants with wide flowers and plants with narrow flowers resembling those of the two progenitors of the series.

Wide and narrow races could be raised by selection using only self-fertilisation. Thus in family III with a mean ratio of 531 there was a single plant (III 2) with as high a ratio as 575. This was selfed and the mean ratio of the offspring was 574. In this family there was a plant (III 2, 1) with a ratio of 533 and the mean of offspring = 563. III 2, 1, 18 (ratio 551) produced a family with mean 561, and III 2, 1, 18, 28 (ratio 598) produced a family with a mean of 606. In the reverse direction, through III 2, III 2, 5, III 2, 5, 10 and III 2, 5, 10, 22 we pass from a parent of ratio 575 to a family having a mean ratio of 477.

Series II and III. Ratio of Flower—Families from selfed plants.
[Table not recoverable from the source. For each selfed parent, with its ratio, it gave the distribution of the offspring over the grades of the ratio and the family mean.]

With the data given in the preceding table, correlation tables have been prepared for parents and offspring, and grandparents and grandchildren.

Correlation Table—Ratios of Corolla—Parents and Offspring. Series II and III.
[Cell entries not recoverable from the source. Row totals by grade of ratio of the parents: 410–439, 15; 440–469, 25; 470–499, 132; 500–529, 178; 530–559, 178; 560–589, 126; 590–619, 49; 620–649, 0; 650–679, 10. Grand total 713.]

Correlation Table—Ratios of Corolla—Grandparents and Grandchildren. Series II and III.
[Cell entries not recoverable from the source. Row totals legible by grade of ratio of the grandparents: 440–469, 29; 470–499, 46; 500–529, 123; 530–559, 179; 560–589, 118; 650–679, 10.]

The coefficients of correlation are .601 for parents and offspring and .492 for grandparents and grandchildren. The latter figure is somewhat high; but taking the results altogether they are incompatible with any notion of pure-lines.

8. GENERAL CONCLUSIONS.

In the various characters that have been dealt with in the crossing of different strains of the garden foxglove we have seen that in pelorism, colour of corolla and colour of spots, the mode of inheritance is Mendelian with reference to the qualities: peloric and non-peloric, purple and white corolla, purple spots and brown spots. If, however, there are any marked differences in the intensities of these qualities, the mode of inheritance of the intensity of the quality was found to be of the blended type.
The other characters examined were quantitative in nature, such as degree of the development of purple spots and the ratio of breadth to length of corolla, and these characters blended completely.

When the intensity of a quality is very slight and approaching zero the difficulty arises as to which category the individual should be referred. When Mendelian inheritance is in evidence the critical point may apparently be determined by the occurrence of segregation. Thus, if a homozygous plant with a very faint tinge of purple (say an intensity of about 4) is crossed with a homozygous strongly coloured plant, segregation occurs in the so-called F2 generation, and we obtain on the average 1 faintly tinged plant to 3 much more darkly coloured plants. When, however, the pale plant has a somewhat greater intensity (say about 10), the F1 and subsequent generations are intermediate, and definite segregation does not occur. In accordance with this procedure a plant with flowers having an intensity of general coloration which did not reach 5 of the scale was classed as "white." Without employing such a line of demarcation the results obtained were wholly unintelligible.

From the strict Mendelian standpoint, in the example given above, it would probably be affirmed that the faint tinge of purple on "white" flowers is not really a fractional part of the general purple coloration of coloured plants, but is a distinct character governed by a different factor or set of factors in the chromosomes. To one who has grown the plants this view appears an artificial one. In my previous account I stated that there appeared to be a distinct gap among my plants between "white" plants and coloured plants, and that colorations of about 8–25 of the scale were extremely rare or almost absent, but I have subsequently obtained a number of plants having such intensities of coloration, passing imperceptibly down to absolute whiteness.
Consequently it is quite unlikely that the faint tinge of purple on "white" flowers is anything else than the last remnant of a general purple coloration. It is quite similar in the character of pelorism, but the difficulty in finding a suitable method of measuring this character renders the matter less obvious.

Thus, it would appear that if a character is not present beyond a certain minimum or unit quantity it may be unable to blend on crossing with a plant possessing the character in a well-marked degree.

With reference to the characters which blend, the accompanying table summarizes the results obtained for parental correlation. Mid-parents and self-fertilised parents are regarded as comparable.

Character | Number of Offspring | Coefficient of Correlation, Parents and Offspring
Intensity of pelorism (homozygous recessive, mid-parents and self-fertilised parents) | 530 | .520
Intensity of general purple coloration (homozygous dominants, self-fertilised parents) | 529 | .707
Seed-length (self-fertilised parents) | 46 | .378
Spotting (self-fertilised parents) | 716 | .560
Ratio of Corolla (self-fertilised parents) | 713 | .601

The probable errors of these results are reasonably small and the average coefficient for the 5 characters is .553, which is not far removed from the average coefficient found by Professor Karl Pearson for a large number of characters in a variety of different organisms.

It must be again emphasized that these results are based on self-fertilised generations of pedigree plants of known gametic constitution, and on Johannsen's theory of pure-lines these parental coefficients should be zero, or at least very small.
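The average quoted for the five characters can be checked by simple arithmetic. The sketch below is only illustrative: it assumes the five coefficients read .520, .707, .378, .560 and .601 (the second value in particular is taken as .707), and the names `coefficients` and `mean_coefficient` are hypothetical.

```python
# Parental correlation coefficients as read from the summary table.
# Assumption: the coefficient for general purple coloration is .707.
coefficients = {
    "intensity of pelorism": 0.520,
    "intensity of general purple coloration": 0.707,
    "seed-length": 0.378,
    "spotting": 0.560,
    "ratio of corolla": 0.601,
}


def mean_coefficient(values):
    """Unweighted arithmetic mean of the coefficients."""
    values = list(values)
    return sum(values) / len(values)


average = mean_coefficient(coefficients.values())
print(f"average coefficient = {average:.3f}")
```

The mean is unweighted, which is what "the average coefficient for the 5 characters" appears to describe; weighting by the number of offspring would give a different figure.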
The evidence of the present investigation is therefore definitely against any general application of the theory of pure-lines and of genotypes of any appreciable magnitude, and further it indicates that selective breeding within self-fertilised generations of a homogeneous race is capable of modifying that race to a marked degree.

EXPLANATION OF PLATE I.

Figs. 1 and 2. Pelorism of maximum intensity; grade 100°. Corollas absent, sessile anthers.
Figs. 3 and 4. Perfect pelorism, grade 100°. Corollas joined along their split edges forming a complete saucer. Stamens with filaments.
Fig. 5. Peloric flower of side-axis; the axis terminates in an ovary.
Fig. 6. Pelorism of grade 100°. Numerous flowers fused irregularly forming a rosette, the axis has grown through the crown.
Figs. 7 and 8. Incomplete pelorism of main axes, grade 75°. A spiral bending often occurs.
Fig. 9. Faintly defined pelorism. When such occurred on the lateral axes the plant was said to possess a grade of 25°. Side view, and view from above.
Fig. 10. Flowering axis of a conspicuous sport in which practically all the corollas are completely split longitudinally into four elongated blades. Nature of inheritance obscure.

The photographs were kindly taken by Dr Conrad Akerman.

ON POLYCHORIC COEFFICIENTS OF CORRELATION.

By KARL PEARSON, F.R.S. and EGON S. PEARSON.

(1) One of the difficulties which are constantly recurring in statistical practice is that of the correlation or contingency table in which the two variates are classified in broad categories. We may indeed proceed by the method of mean square contingency and correct for the grouping of both variates by the class index corrections on the assumption that the marginal totals for both variates may be assumed to follow approximately normal distributions. Such a procedure gives reasonably satisfactory results*, provided the marginal totals are not in very unequal groupings and the correlation is not intense (say, .85 and above).
The polychoric table has been discussed by Ritchie-Scott and he has described a method of reaching a polychoric coefficient of correlation from the weighted mean of the possible tetrachoric values†. Such a process is, however, so laborious that it can hardly establish itself in practice. From the theoretical standpoint, however, Ritchie-Scott's paper was of great interest (i) as guiding us by the size of the probable errors to discriminate between the valuable and worthless dichotomies in tetrachoric determinations of the correlation‡, (ii) as providing standard values by which those obtained by other procedures could be directly tested.

We shall endeavour to reach in this paper another form of polychoric coefficient, that is a correlation coefficient which does use all the information given in a polychoric table, but which requires less analysis than Ritchie-Scott's weighted mean coefficient. Thus what may be lost in exactness will possibly be repaid by practical efficiency. There is another point also of very considerable illustrative importance; we desire wherever the data are suitable actually to exhibit in the form of a graph the relation between the two variates. This should be possible in the case of a polychoric table, and in the past has frequently been done by approximate methods of more or less validity. We can indeed take such methods as our present starting point as they will directly indicate to the reader our line of approach.

We start with the hypothesis that the marginal totals of our polychoric table can be represented on a normal scale. This is no great assumption in itself. If a true quantitative scale ever becomes available it can be attached at once and with little trouble to the normal scale. To exhibit a variate on a normal scale makes no greater assumption than when we exhibit a pressure-volume curve as a straight line by using a logarithmic scale.

* By "reasonably satisfactory results," we mean that in cases which can be directly checked by the product moment method the difference is within the range of practical insignificance as judged by probable error.
† Biometrika, Vol. XII. pp. 93–133.
‡ Thus in a 3×3 table it is possible for two of the corner dichotomies, i.e. those unassociated with the diagonal in the sense of the correlation, to have even negative weights, so that they should be omitted in finding the mean.

Now let the polychoric table be such that in the population $N$ under discussion, the $s$th category of the first variate $A$ contains $n_{s\cdot}$ individuals and the $s'$th category of the second variate $B$ contains $n_{\cdot s'}$ individuals, while the number of individuals who combine in the population $N$ the $s$th category of $A$ and the $s'$th category of $B$ is $n_{ss'}$.

Now when we proceed to exhibit the categories of the $A$-variate on a normal scale, the process will give us two important quantities:

(a) We shall have the ratio of abscissa to standard deviation at the dichotomy between each pair of broad categories.

If $n_{1\cdot}, n_{2\cdot}, n_{3\cdot}, \ldots, n_{s\cdot}, \ldots$ be the frequencies of the $A$-variate for the several categories the values of the ratios of abscissae to standard deviation will be specified as

$$-\infty,\ h_1,\ h_2,\ \ldots,\ h_{s-1},\ h_s,\ \ldots,\ h_{q-1},\ +\infty.$$

Here $h_{s-1}$, $h_s$ are the values on either side of the category $n_{s\cdot}$ and if there be $q$ categories, $n_{1\cdot}$ is bounded by $h_0$ or $-\infty$ and $h_1$, while $n_{q\cdot}$ is bounded by $h_{q-1}$ and $h_q$ or $+\infty$. The lower $h$'s will have negative and the upper positive signs and the greatest care must be taken to see that the proper signs are given to the values of $h$.
Similarly if the frequencies of the various categories of the $B$-variate be

$$n_{\cdot 1},\ n_{\cdot 2},\ n_{\cdot 3},\ \ldots,\ n_{\cdot s'},\ \ldots$$

the values of the ratios of ordinates to standard deviation will be represented by

$$-\infty,\ k_1,\ k_2,\ k_3,\ \ldots,\ k_{s'-1},\ k_{s'},\ \ldots,\ +\infty,$$

where $k_{s'-1}$ and $k_{s'}$ give the dichotomies on either side of $n_{\cdot s'}$.

We may consider the coordinate at the back of the variate $A$ when represented on a normal scale to be $x'$, the origin being taken at the mean on the normal scale. Hence if the standard deviation be $\sigma_x$, we shall find it convenient to write the absolute normal abscissae

$$x' = \sigma_x x, \qquad h'_s = \sigma_x h_s.$$

Similarly we take $y'$ for the coordinate at the back of the variate $B$, measured from the mean, and write:

$$y' = \sigma_y y, \qquad k'_{s'} = \sigma_y k_{s'},$$

where $\sigma_y$ is the standard deviation of $B$. Clearly until a quantitative scale has been determined we shall know $h$, $k$, $x$, $y$ but not $h'$, $k'$, $x'$, $y'$, $\sigma_x$ and $\sigma_y$.

(b) We shall determine the ratio of abscissa to standard deviation, or the ratio of ordinate to standard deviation of the centroids or means of the groups $n_{s\cdot}$ and $n_{\cdot s'}$.

Let

$$H_s = \frac{1}{\sqrt{2\pi}}e^{-\frac12 h_s^2}, \qquad K_{s'} = \frac{1}{\sqrt{2\pi}}e^{-\frac12 k_{s'}^2},$$

then the means of the categories $n_{s\cdot}$ and $n_{\cdot s'}$ are determined by

$$\bar h_{s\cdot} = \frac{H_{s-1} - H_s}{n_{s\cdot}/N}, \qquad \bar k_{\cdot s'} = \frac{K_{s'-1} - K_{s'}}{n_{\cdot s'}/N} \qquad \text{(i)}$$

respectively. The numerical values of $\bar h_{s\cdot}$ and $\bar k_{\cdot s'}$ can be easily ascertained from the table published recently of ordinates of normal curve to permilles of area*. Care must be taken in every case to give the correct sign to $\bar h_{s\cdot}$ and $\bar k_{\cdot s'}$.

Now if there were no correlation, $\bar h_{s\cdot}$ and $\bar k_{\cdot s'}$ combined would give the mean of the group $n_{ss'}$, and they give a fair approximation to the result if there are numerous categories, that is if the range of the categories be small. The correlation found from these marginal centroids would then be

$$r_h = S\,(n_{ss'}\,\bar h_{s\cdot}\,\bar k_{\cdot s'})/N \qquad \text{(ii)},$$

but as Ritchie-Scott has shown† this $r_h$ diverges much more than the mean square contingency value from the true correlation, and considerably more than the tetrachoric or polychoric coefficients do.
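The reduction of marginal totals to a normal scale, and the category means of (i), can be sketched numerically. This is a minimal illustration rather than the authors' procedure: it uses Python's standard-library `statistics.NormalDist` in place of printed tables, the function name `normal_scale` is hypothetical, and the proportions are those of the x-variate in the worked example of section (3).

```python
from statistics import NormalDist


def normal_scale(proportions):
    """Return (boundaries, means) on the normal scale.

    proportions: marginal proportions n_s/N of the ordered categories.
    boundaries:  h_0 = -inf, h_1, ..., h_q = +inf (the dichotomies).
    means:       category centroids (H_{s-1} - H_s) / (n_s/N), as in (i).
    """
    normal = NormalDist()
    boundaries = [float("-inf")]
    ordinates = [0.0]  # the ordinate H vanishes at -infinity
    cumulative = 0.0
    for p in proportions[:-1]:
        cumulative += p
        h = normal.inv_cdf(cumulative)
        boundaries.append(h)
        ordinates.append(normal.pdf(h))
    boundaries.append(float("inf"))
    ordinates.append(0.0)  # and again at +infinity
    means = [
        (ordinates[s] - ordinates[s + 1]) / proportions[s]
        for s in range(len(proportions))
    ]
    return boundaries, means


# Marginal proportions of the x-variate in the worked example.
x_proportions = [0.036, 0.322, 0.264, 0.180, 0.069, 0.101, 0.028]
h, h_bar = normal_scale(x_proportions)
for s, (upper, centre) in enumerate(zip(h[1:], h_bar), start=1):
    print(s, round(upper, 5), round(centre, 5))
```

The signs take care of themselves here because the ordinates are differenced in a fixed order; with printed tables of positive arguments the sign has to be attached by hand, which is the point of the caution in the text.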
The reason for this is clear and was pointed out by one of us in 1913‡. Namely $\bar h_{s\cdot}$ and $\bar k_{\cdot s'}$ do not give the coordinates of the mean of $n_{ss'}$. In fact $n_{ss'}\,\bar h_{s\cdot}\,\bar k_{\cdot s'}$ is not the contribution of $n_{ss'}$ to the product-moment.

We propose in the present paper to give first the actual contributions of $n_{ss'}$ to the means and product-moments of the two variates and then to apply these results in order to obtain (a) a polychoric coefficient, and (b) a graph of the relation of the two variates.

The essential assumptions that will be made are the following:

(i) The marginal totals having been reduced to a normal scale, and the correlation being supposed to be $r$, we shall calculate what the contents of the $s$th-$s'$th cell would be on the assumption that the frequency surface is the normal surface represented by the given correlation and the marginal totals reduced to normal scales. We shall further calculate the $x$-moment, the $y$-moment and the $xy$ product-moment of the $s$th-$s'$th cell on the same hypothesis.

(ii) From these data we shall determine the most suitable value to give to $r$, so that the actually observed frequencies differ least from those that would be given by such a correlation surface.

We shall also obtain a formula for calculating the mean value of $y$ for the array of $B$-variates, $n_{s\cdot}$ in number, which corresponds to the $s$th category of $A$. We shall thus be in a position to plot the regression line of $B$ on $A$ and test at the same time the closeness with which it fits the thus calculated array means, both variates being represented on a normal scale.

We shall write the real coefficient of correlation of the population $r$, the coefficient as found from a single $s$th-$s'$th cell, as $r_{ss'}$, and those found from the $n_{s\cdot}$ and $n_{\cdot s'}$ arrays as $r_{s\cdot}$ and $r_{\cdot s'}$ respectively.
$\bar h_{ss'}$, $\bar k_{ss'}$ will be the $A$- and $B$-variate means of the $s$th-$s'$th cell and $\Pi_{ss'}$ the product-moment, per unit of the population, of the frequency in the $s$th-$s'$th cell about the mean axes as determined from the marginal totals on the normal scale.

* See Biometrika, Vol. XII. pp. 426–8.
† Biometrika, Vol. XII. p. 122.
‡ Biometrika, Vol. IX. p. 188.

(2) The developments we require involve the use of the tetrachoric functions. The tetrachoric function of the order $t$ is given by*

$$\tau_t = \frac{1}{\sqrt{t!}}\left(-\frac{d}{dx}\right)^{t-1}\frac{1}{\sqrt{2\pi}}\,e^{-\frac12 x^2} \qquad \text{(iii)}.$$

The tetrachoric functions $\tau_1$ to $\tau_6$ are tabled for positive values of $x$ in Tables for Statisticians and Biometricians† to five decimal places. For negative values of $x$ tetrachoric functions of an odd order remain unchanged, but those of an even order must have their sign as given in the tables reversed.

It will frequently be needful to take the difference of the tetrachoric functions at the boundaries of a marginal category. Thus if $\tau_t(h)$ denotes the value of the tetrachoric function for $x = h$, we shall need for the $s$th marginal total

$$\tau_t(h_s) - \tau_t(h_{s-1}).$$

This difference we shall write, for brevity, $\delta_s\tau_t$, and in obtaining its numerical value from tables of the tetrachoric functions it is essential to remember that $s$ (or $s'$) is supposed to increase in the positive direction of the axis of $x$ (or $y$), and that when $h$ (or $k$) is negative attention must be paid to changing the sign of the tabled value of $\tau_t$, if $t$ be even.
The formula for determining the successive tetrachoric functions for a given value of $x$ is

$$\tau_t = p_t\,x\,\tau_{t-1} - q_t\,\tau_{t-2} \qquad \text{(iv)},$$

where $p_t$ and $q_t$ are given by the following table:

t | p_t | q_t
2 | .707,1068 | .000,0000
3 | .577,3503 | .408,2483
4 | .500,0000 | .577,3503
5 | .447,2136 | .670,8204
6 | .408,2483 | .730,2968
7 | .377,9645 | .771,5168
8 | .353,5534 | .801,7838
9 | .333,3333 | .824,9578
10 | .316,2278 | .843,2740
11 | .301,5113 | .858,1163
12 | .288,6751 | .870,3880
13 | .277,3501 | .880,7047
14 | .267,2612 | .889,4990
15 | .258,1989 | .897,0851
16 | .250,0000 | .903,6962
17 | .242,5356 | .909,5085+
18 | .235,7023 | .914,6592
19 | .229,4157 | .919,2547
20 | .223,6068 | .923,3804
21 | .218,2179 | .927,1051
22 | .213,2007 | .930,4842
23 | .208,5144 | .933,5637
24 | .204,1241 | .936,3819
25 | .200,0000 | .938,9709

Since $\tau_1 = \frac{1}{\sqrt{2\pi}}e^{-\frac12 x^2}$, it can be found at once from the tables for the ordinates of the normal curve, and will indeed have been computed at each division in order to determine $\bar h_{s\cdot}$ and $\bar k_{\cdot s'}$. It is then often simpler to work directly with (iv) rather than interpolate into the tabled values of the functions.

* The reasons why the tetrachoric functions are tabled with the factor $1/\sqrt{t!}$ are: (a) because this factor greatly simplifies our formulae and (b) because a factor of some such order is essential, if we are to have manageable tabulated values. As a matter of fact the factor chosen reduces all tetrachoric functions to numerical values lying between 0 and 1.
† Cambridge University Press, pp. 42–51.

In an earlier paper* dealing with the tetrachoric functions one of us has shown that if

$$z = \frac{N}{2\pi\sqrt{1-r^2}}\,e^{-\frac12\,\frac{x^2 - 2rxy + y^2}{1-r^2}}$$

be the equation to a normal correlation surface the variates being measured in the standard deviations as units, then

$$z/N = \tau_1\tau_1' + 2r\,\tau_2\tau_2' + 3r^2\,\tau_3\tau_3' + \ldots + t\,r^{t-1}\,\tau_t\tau_t' + \ldots \qquad \text{(v)},$$

where $\tau_t = \tau_t(x)$ and $\tau_t' = \tau_t(y)$.
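Recurrence (iv) lends itself to direct computation. The following is a minimal sketch, assuming the starting values are taken as $\tau_0 = -$(area of the normal curve up to $x$), the sign convention used in the numerical tables of this paper, and $\tau_1 =$ the ordinate at $x$; the function name `tetrachoric` is hypothetical and `statistics.NormalDist` stands in for the printed tables.

```python
import math
from statistics import NormalDist


def tetrachoric(x, t_max):
    """Return [tau_0, tau_1, ..., tau_t_max] at the abscissa x.

    tau_0 is minus the normal area up to x and tau_1 the normal ordinate;
    higher orders follow tau_t = p_t * x * tau_{t-1} - q_t * tau_{t-2},
    with p_t = 1/sqrt(t) and q_t = (t - 2)/sqrt(t (t - 1)).
    """
    if math.isinf(x):
        # Every function vanishes at -infinity; at +infinity tau_0 = -1.
        return [-1.0 if x > 0 else 0.0] + [0.0] * t_max
    normal = NormalDist()
    taus = [-normal.cdf(x), normal.pdf(x)]
    for t in range(2, t_max + 1):
        p_t = 1.0 / math.sqrt(t)
        q_t = (t - 2) / math.sqrt(t * (t - 1))
        taus.append(p_t * x * taus[t - 1] - q_t * taus[t - 2])
    return taus[: t_max + 1]


# First dichotomy of the x-variate in the worked example, h = -1.79912.
for order, value in enumerate(tetrachoric(-1.79912, 6)):
    print(order, round(value, 5))
```

Because the recurrence carries the sign of $x$ through each step, no separate rule for reversing the sign of even orders at negative arguments is needed when the functions are generated this way.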
Now in order to proceed further it is needful to determine the following integrals:

$$\int_{h_{s-1}}^{h_s}\tau_t\,dx, \qquad \int_{h_{s-1}}^{h_s}x\,\tau_t\,dx.$$

We can determine these by using (iii) after in the second case integrating by parts. We have:

$$\int_{h_{s-1}}^{h_s}\tau_t\,dx = \frac{1}{\sqrt{t!}}\int_{h_{s-1}}^{h_s}\left(-\frac{d}{dx}\right)^{t-1}\frac{1}{\sqrt{2\pi}}e^{-\frac12x^2}\,dx = -\frac{1}{\sqrt{t!}}\left[\left(-\frac{d}{dx}\right)^{t-2}\frac{1}{\sqrt{2\pi}}e^{-\frac12x^2}\right]_{h_{s-1}}^{h_s} = -\frac{1}{\sqrt t}\,\delta_s\tau_{t-1} \qquad \text{(vi)},$$

$$\int_{h_{s-1}}^{h_s}x\,\tau_t\,dx = -\frac{1}{\sqrt t}\Big[x\,\tau_{t-1}\Big]_{h_{s-1}}^{h_s} + \frac{1}{\sqrt t}\int_{h_{s-1}}^{h_s}\tau_{t-1}\,dx = -\frac{1}{\sqrt t}\,\delta_s(x\,\tau_{t-1}) - \frac{1}{\sqrt t\,\sqrt{t-1}}\,\delta_s\tau_{t-2} \qquad \text{(vi) bis}.$$

But by (iv):

$$x\,\tau_{t-1} = \frac{\tau_t + q_t\,\tau_{t-2}}{p_t},$$

where $p_t = 1/\sqrt t$, $q_t = (t-2)/\sqrt{t(t-1)}$.

* Phil. Trans. Vol. 195 A, p. 4, Equation (xiv), with a slight change of notation. In that paper, $\frac{1}{\sqrt{2\pi}}e^{-\frac12x^2}\frac{v_n}{\sqrt{(n+1)!}}$ is written for $\tau_{n+1}$, and $\frac{1}{\sqrt{2\pi}}e^{-\frac12y^2}\frac{w_n}{\sqrt{(n+1)!}}$ for $\tau'_{n+1}$.

Thus:

$$x\,\tau_{t-1} + \frac{1}{\sqrt{t-1}}\,\tau_{t-2} = \sqrt t\,\tau_t + \sqrt{t-1}\,\tau_{t-2}.$$

Accordingly

$$\int_{h_{s-1}}^{h_s}x\,\tau_t\,dx = -\frac{1}{\sqrt t}\left(\sqrt t\,\delta_s\tau_t + \sqrt{t-1}\,\delta_s\tau_{t-2}\right) \qquad \text{(vii)}.$$

The latter form throws us back on $\delta_s\tau_t$ which will have to be calculated to determine the integral in (vi) for the successive values of $t$ and $s$. On the other hand a table of

$$T_{t-1} = \sqrt t\,\tau_t + \sqrt{t-1}\,\tau_{t-2} \qquad \text{(viii)}$$

would be a convenient method of determining the integral and tables of $T$ might be easily formed, say up to $T_6$. In this case we may write (vii):

$$\int_{h_{s-1}}^{h_s}x\,\tau_t\,dx = -\frac{1}{\sqrt t}\,\delta_sT_{t-1} \qquad \text{(ix)}.$$

We are now in the position to compute all the requisite integrals we need; if we write $\bar n_{ss'}$ for the contents of the $s$th-$s'$th cell, then on the supposition that the surface is normal, has correlation $r$ and follows the actual marginal frequencies, we have:

$$\frac{\bar n_{ss'}}{N} = \int_{h_{s-1}}^{h_s}\!\!\int_{k_{s'-1}}^{k_{s'}}\frac{z}{N}\,dx\,dy = \delta_s\tau_0\,\delta_{s'}\tau_0' + r\,\delta_s\tau_1\,\delta_{s'}\tau_1' + r^2\,\delta_s\tau_2\,\delta_{s'}\tau_2' + \ldots + r^p\,\delta_s\tau_p\,\delta_{s'}\tau_p' + \ldots \qquad \text{(x)},$$

$$\frac{\bar n_{ss'}\,\bar h_{ss'}}{N} = \int_{h_{s-1}}^{h_s}\!\!\int_{k_{s'-1}}^{k_{s'}}\frac{xz}{N}\,dx\,dy = \delta_sT_0\,\delta_{s'}\tau_0' + r\,\delta_sT_1\,\delta_{s'}\tau_1' + \ldots + r^p\,\delta_sT_p\,\delta_{s'}\tau_p' + \ldots \qquad \text{(xi)},$$

$$\frac{\bar n_{ss'}\,\bar k_{ss'}}{N} = \int_{h_{s-1}}^{h_s}\!\!\int_{k_{s'-1}}^{k_{s'}}\frac{yz}{N}\,dx\,dy = \delta_s\tau_0\,\delta_{s'}T_0' + r\,\delta_s\tau_1\,\delta_{s'}T_1' + \ldots + r^p\,\delta_s\tau_p\,\delta_{s'}T_p' + \ldots \qquad \text{(xii)},$$

$$\frac{\bar n_{ss'}\,\Pi_{ss'}}{N} = \int_{h_{s-1}}^{h_s}\!\!\int_{k_{s'-1}}^{k_{s'}}\frac{xyz}{N}\,dx\,dy = \delta_sT_0\,\delta_{s'}T_0' + r\,\delta_sT_1\,\delta_{s'}T_1' + \ldots + r^p\,\delta_sT_p\,\delta_{s'}T_p' + \ldots \qquad \text{(xiii)}.$$

It is desirable to say a few words about the functions $\tau_0$ and $T_0$ which may at first present difficulties to the reader. $-\delta_s\tau_0$ clearly stands for the integral

$$\int_{h_{s-1}}^{h_s}\tau_1\,dx = \int_{h_{s-1}}^{h_s}\frac{1}{\sqrt{2\pi}}e^{-\frac12x^2}\,dx$$

and is therefore simply $n_{s\cdot}/N$. Similarly $-\delta_{s'}\tau_0' = n_{\cdot s'}/N$.

Next clearly $-\delta_sT_0$ stands for

$$\int_{h_{s-1}}^{h_s}x\,\tau_1\,dx = \int_{h_{s-1}}^{h_s}x\,\frac{1}{\sqrt{2\pi}}e^{-\frac12x^2}\,dx = -\delta_s\tau_1, \quad\text{or}\quad \delta_sT_0 = \delta_s\tau_1,$$

which is precisely the value given by (viii). Thus (vii) is shown to be correct even for this special case although a form like (vi) bis through which it is reached shows difficulties. Similarly $\delta_{s'}T_0' = \delta_{s'}\tau_1'$*. The remainder of the $\tau$'s knowing $\tau_0$ and $\tau_1$ come directly from (iv) and the $T$'s are always given by (viii).

Now it is clear that (x) to (xiii) provide a large number of ways of determining $r$. We might find $r$, i.e. $r_{ss'}$, from the single cell by writing in (x) $n_{ss'}$ for $\bar n_{ss'}$. Or we may write:

$$n_{s\cdot}\,\bar h_{s\cdot} = S_{s'}\,(n_{ss'}\,\bar h_{ss'}) = S_{s'}\left\{\frac{n_{ss'}}{\bar n_{ss'}}\,N\left(\delta_sT_0\,\delta_{s'}\tau_0' + r\,\delta_sT_1\,\delta_{s'}\tau_1' + \ldots + r^p\,\delta_sT_p\,\delta_{s'}\tau_p' + \ldots\right)\right\} \qquad \text{(xiv)},$$

where $\bar n_{ss'}$ is given by (x). But $\bar h_{s\cdot}$ is the known centroid of the $n_{s\cdot}$ marginal total, and accordingly the above is an equation to find $r$, i.e. $r_{s\cdot}$, from a given column of the table. If we use this value of $r_{s\cdot}$ in (x) and (xii) to find $\bar n_{ss'}$ and $\bar k_{ss'}$, we obtain the theoretical cell frequency and $y$-mean of the cell as found from a column. Now sum $n_{ss'}\,\bar k_{ss'}$ for every value of $s'$ and we find $\bar k_{s\cdot}$ the $y$ mean of a column depending on the data as found from the column, i.e.

$$\bar k_{s\cdot} = \frac{N}{n_{s\cdot}}\,S_{s'}\left\{\frac{n_{ss'}}{\bar n_{ss'}}\left(\delta_s\tau_0\,\delta_{s'}T_0' + r\,\delta_s\tau_1\,\delta_{s'}T_1' + \ldots + r^p\,\delta_s\tau_p\,\delta_{s'}T_p' + \ldots\right)\right\} \qquad \text{(xv)},$$

where $n_{ss'}$ is the observed cell frequency and $\bar n_{ss'}$ the frequency found by (x) when we insert the value of $r$ as found from (xiv). We are thus in a position theoretically to determine on a normal scale the mean of a column from the correlation actually determined from that column.

This would be the ideal method of determining the mean of a row or column; but it would involve a great deal of hard work, as with the two regression curves we should need to find $r$ for every row and column by an equation of a high order. Hence in most cases we are likely to content ourselves by finding $r$ for the whole table and then use this value in (x) to determine $\bar n_{ss'}$ and in (xv) to find the mean of the array. $\bar k_{s\cdot}$ plotted to the known $\bar h_{s\cdot}$ on the normal scale will give the regression curve.

* We can thus take $T_0 = \tau_1$ and $T_0' = \tau_1'$.

The question now arises as to the manner in which we can find $r$ for the whole table most effectively. Clearly we might assume the product-moment components from (xiii) and sum for all cells. We should have

$$r = S_{s,s'}\left(\frac{n_{ss'}\,\Pi_{ss'}}{N}\right),$$

since the coordinates are measured from the means in terms of the standard deviations as units. Hence substituting from (xiii) we have:

$$r = S_{s,s'}\left\{\frac{n_{ss'}}{\bar n_{ss'}}\left(\delta_sT_0\,\delta_{s'}T_0' + r\,\delta_sT_1\,\delta_{s'}T_1' + \ldots + r^p\,\delta_sT_p\,\delta_{s'}T_p' + \ldots\right)\right\} \qquad \text{(xvi)}.$$

Here $\bar n_{ss'}$ must be substituted from (x) and we have finally

$$r = S_{s,s'}\left\{\frac{n_{ss'}\left(\delta_sT_0\,\delta_{s'}T_0' + r\,\delta_sT_1\,\delta_{s'}T_1' + \ldots + r^p\,\delta_sT_p\,\delta_{s'}T_p' + \ldots\right)}{N\left(\delta_s\tau_0\,\delta_{s'}\tau_0' + r\,\delta_s\tau_1\,\delta_{s'}\tau_1' + \ldots + r^p\,\delta_s\tau_p\,\delta_{s'}\tau_p' + \ldots\right)}\right\} \qquad \text{(xvi) bis}.$$

This equation based upon the product-moment method of finding $r$ is clearly likely to be very complicated, and although it can be proved that the product-moment method is the "best" method of finding $r$ when we are dealing with a series of quantitatively measured individuals, we have no certainty that it is the best method in the present case of broad categories. It may indeed be questioned whether another method now to be considered cannot be shown to be better or at least equally efficacious.
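Series (x) for the theoretical cell contents can be sketched in a few lines once the tetrachoric functions are available. This is an illustration under stated assumptions, not the authors' computing scheme: the series is truncated at $r^p$ with $p$ = `order`, the helper names `tetrachoric` and `cell_proportions` are hypothetical, and the three-by-three grouping with boundaries at ±0.5 is invented for the demonstration.

```python
import math
from statistics import NormalDist


def tetrachoric(x, t_max):
    """[tau_0, ..., tau_t_max] at x, with tau_0 = -(normal area up to x)."""
    if math.isinf(x):
        return [-1.0 if x > 0 else 0.0] + [0.0] * t_max
    normal = NormalDist()
    taus = [-normal.cdf(x), normal.pdf(x)]
    for t in range(2, t_max + 1):
        taus.append(
            x * taus[t - 1] / math.sqrt(t)
            - (t - 2) * taus[t - 2] / math.sqrt(t * (t - 1))
        )
    return taus[: t_max + 1]


def cell_proportions(h_bounds, k_bounds, r, order=6):
    """Theoretical proportions n_bar/N of every cell from series (x).

    h_bounds, k_bounds: dichotomies including -inf and +inf.
    The series is truncated after the term in r**order.
    """
    tau_x = [tetrachoric(h, order) for h in h_bounds]
    tau_y = [tetrachoric(k, order) for k in k_bounds]
    table = []
    for s in range(1, len(h_bounds)):
        row = []
        for c in range(1, len(k_bounds)):
            row.append(sum(
                r ** p
                * (tau_x[s][p] - tau_x[s - 1][p])   # delta_s tau_p
                * (tau_y[c][p] - tau_y[c - 1][p])   # delta_s' tau'_p
                for p in range(order + 1)
            ))
        table.append(row)
    return table


# Hypothetical three-by-three grouping, correlation 0.5.
bounds = [-math.inf, -0.5, 0.5, math.inf]
for row in cell_proportions(bounds, bounds, r=0.5):
    print([round(value, 4) for value in row])
```

With $r = 0$ only the first term survives and each cell reduces to the product of its marginal proportions. For any $r$ the cells still sum to unity, since the differences $\delta\tau_p$ sum to zero over the categories for every $p \geq 1$.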
Let us consider for a moment what we have in view. We observe $n_{ss'}$ as the frequency of the $s$th-$s'$th cell; we find that with a given correlation $r$ the frequency of this cell would be $\bar n_{ss'}$ on the assumption that the frequency surface is the normal frequency surface corresponding to the observed marginal totals. Accordingly the most probable value to give to $r$ would be that which made

$$\chi^2 = S_{s,s'}\,\frac{(n_{ss'} - \bar n_{ss'})^2}{\bar n_{ss'}} = \text{minimum},$$

or, what is the same thing,

$$S_{s,s'}\left(\frac{n_{ss'}^2}{\bar n_{ss'}}\right) = \text{minimum}.$$

This leads us, differentiating with regard to $r$, to

$$S_{s,s'}\left\{\left(\frac{n_{ss'}}{\bar n_{ss'}}\right)^2\frac{d\bar n_{ss'}}{dr}\right\} = 0,$$

or, writing at length, our equation for $r$ is:

$$S_{s,s'}\left\{\left(\frac{n_{ss'}}{N}\right)^2\frac{\delta_s\tau_1\,\delta_{s'}\tau_1' + 2r\,\delta_s\tau_2\,\delta_{s'}\tau_2' + \ldots + p\,r^{p-1}\,\delta_s\tau_p\,\delta_{s'}\tau_p' + \ldots}{\left(\delta_s\tau_0\,\delta_{s'}\tau_0' + r\,\delta_s\tau_1\,\delta_{s'}\tau_1' + r^2\,\delta_s\tau_2\,\delta_{s'}\tau_2' + \ldots + r^p\,\delta_s\tau_p\,\delta_{s'}\tau_p' + \ldots\right)^2}\right\} = 0 \qquad \text{(xvii)}.$$

Neither (xvi) nor (xvii) are very readily solved. Probably the easiest way will be to obtain an approximate value of $r$ by existing methods either from a good fourfold table, or from contingency, and then evaluate (xvi) or (xvii) for values of $r$, one well above and one well below this result, so that the real value of $r$ lies between the two. A linear interpolation will probably suffice in most cases to determine $r$ with sufficient accuracy.

It will be observed that what we are trying to do is to fit a normal correlation surface to a series of cell frequencies. We may do this by equating product-moments, or actual cell frequencies properly weighted. The factors $n_{ss'}/\bar n_{ss'}$ and $(n_{ss'}/\bar n_{ss'})^2$ come into our equations as a form of weights. When $n_{ss'}$ is small as compared with $\bar n_{ss'}$ that cell will contribute less to the general equations for $r$, and when $n_{ss'}$ is large as compared with $\bar n_{ss'}$ the contribution will be considerable. If the observed results were closely normal then $n_{ss'}$ would be nearly $\bar n_{ss'}$. If we might assume the differences of $n_{ss'}$ and $\bar n_{ss'}$ so small as to be negligible we should have:

$$r = S_{s,s'}\left(\delta_sT_0\,\delta_{s'}T_0' + r\,\delta_sT_1\,\delta_{s'}T_1' + \ldots + r^p\,\delta_sT_p\,\delta_{s'}T_p' + \ldots\right) \qquad \text{(xvi) ter},$$

and

$$0 = S_{s,s'}\left(\delta_s\tau_1\,\delta_{s'}\tau_1' + 2r\,\delta_s\tau_2\,\delta_{s'}\tau_2' + \ldots + p\,r^{p-1}\,\delta_s\tau_p\,\delta_{s'}\tau_p' + \ldots\right) \qquad \text{(xvii) bis},$$

instead of (xvi) bis and (xvii). These equations it will be found are identically satisfied. Hence our values for $r$ from (xvi) and (xvii) depend on $\bar n_{ss'}$ differing from $n_{ss'}$.

(3) We now proceed to illustrate the application of these results.

Stature of Father and Son. The following table gives a correlation table for the inheritance of stature in Father and Son made up in broad categories corresponding to eye-colour groups*. Upon this material we shall be able to test our correlations and our graph against those found by definite numerical groupings.

[Table: Stature of Father (Broad Categories), in columns, against the stature of Son, in rows; seven categories each. The cell entries are not recoverable. Row totals: 34, 301, 284, 137, 105, 98, 41.]

The positive direction of $x$ is from left to right and of $y$ vertically downwards.

It will suffice to take the $\tau$'s to five decimal figures but it will be needful to go further with the $\tau$'s if the $T$'s are to be taken correctly to five figures from (viii). The general reduction formula for the $T$'s is:

$$T_t(x) = \frac{T_{t-1}(x)\,(t-2)\sqrt{t-1}\;x^3 - T_{t-2}(x)\,(t-3)\left(x^2(t-1)+1\right)}{\sqrt{t(t-1)}\,\left(x^2(t-2)+1\right)} \qquad \text{(xviii)},$$

or,

$$T_t(x) = \frac{1}{x^2(t-2)+1}\left\{\frac{t-2}{\sqrt t}\,x^3\,T_{t-1}(x) - \left(x^2(t-1)+1\right)\frac{t-3}{\sqrt{t(t-1)}}\,T_{t-2}(x)\right\} \qquad \text{(xviii) bis}.$$

* See Biometrika, Vol. IX. p. 220.

Hence if $T_0$ and $T_1$ be found accurately the remaining $T$'s can be determined as accurately as we please without reference to the $\tau$'s. But,

$$T_0 = \tau_1, \qquad T_1 = x\,\tau_1 + \tau_0 \qquad \text{(xix)}.$$

Hence the tables of ordinates and areas of the normal curve readily provide $T_0$ and $T_1$ to seven decimal places, and (xviii) provides the higher $T$'s. These were cut down to five figures and an approximate check on their values obtained by (viii).

As a matter of fact if $r$ is of the order .50 we cannot hope to obtain more than three figure accuracy in $r$ without going to higher $\tau$- and $T$-functions than the sixth, especially when using (xvi).
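The fitting of $r$ by making $S(n_{ss'}^2/\bar n_{ss'})$ a minimum can be sketched as follows. Several things here are assumptions rather than the procedure of the text: a plain grid search over $r$ stands in for the suggested evaluation at two values followed by linear interpolation, the observed table is synthetic (generated from the series itself at $r = 0.5$ on a hypothetical three-by-three grouping), and the names `expected_cells`, `criterion` and `fit_r` are illustrative.

```python
import math
from statistics import NormalDist


def tetrachoric(x, t_max):
    """[tau_0, ..., tau_t_max] at x, with tau_0 = -(normal area up to x)."""
    if math.isinf(x):
        return [-1.0 if x > 0 else 0.0] + [0.0] * t_max
    normal = NormalDist()
    taus = [-normal.cdf(x), normal.pdf(x)]
    for t in range(2, t_max + 1):
        taus.append(
            x * taus[t - 1] / math.sqrt(t)
            - (t - 2) * taus[t - 2] / math.sqrt(t * (t - 1))
        )
    return taus[: t_max + 1]


def expected_cells(h_bounds, k_bounds, r, total, order):
    """Theoretical cell frequencies from series (x), truncated at r**order."""
    tx = [tetrachoric(h, order) for h in h_bounds]
    ty = [tetrachoric(k, order) for k in k_bounds]
    return [
        [
            total * sum(
                r ** p * (tx[s][p] - tx[s - 1][p]) * (ty[c][p] - ty[c - 1][p])
                for p in range(order + 1)
            )
            for c in range(1, len(k_bounds))
        ]
        for s in range(1, len(h_bounds))
    ]


def criterion(observed, h_bounds, k_bounds, r, order):
    """S(n^2 / n_bar); its minimum coincides with the minimum of chi-square."""
    total = sum(map(sum, observed))
    fitted = expected_cells(h_bounds, k_bounds, r, total, order)
    return sum(
        n * n / m
        for row_n, row_m in zip(observed, fitted)
        for n, m in zip(row_n, row_m)
    )


def fit_r(observed, h_bounds, k_bounds, order=12):
    """Grid search for r on 0.000, 0.005, ..., 0.700 (an assumed range)."""
    grid = [i / 200 for i in range(141)]
    return min(grid, key=lambda r: criterion(observed, h_bounds, k_bounds, r, order))


# Synthetic check: build a table at r = 0.5 and recover that value.
bounds = [-math.inf, -0.5, 0.5, math.inf]
observed = expected_cells(bounds, bounds, 0.5, total=1000.0, order=12)
print(fit_r(observed, bounds, bounds))
```

The search range is kept to moderate correlations because the truncated series can turn negative in sparsely filled corner cells when $r$ is high, which echoes the remark in the text that higher functions are needed when the correlation is sensibly above .50.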
But three figures in the correlation are usually adequate and the labour of computing is much increased if higher functions are used. Such must, however, be used if the correlation be sensibly higher than .50.

The following table gives the $\frac12(1+\alpha)$'s, $h$'s, $H$'s, $\bar h$'s, $\tau$'s, $\delta\tau$'s, $T$'s, and $\delta T$'s for the $x$-variate.

TABLE I.

Columns 0 to 7 are the dichotomies $h_0$ to $h_7$. Entries for a category ($\bar h$, $\delta\tau$, $\delta T$) stand in the column of its upper dichotomy.

Quantity | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7
½(1+α) | 0 | .036 | .358 | .622 | .802 | .871 | .972 | 1.000
h | -∞ | -1.79912 | -.36381 | +.31074 | +.84879 | +1.13113 | +1.91104 | +∞
H = τ_1 | 0 | .07908 | .37340 | .38014 | .27827 | .21042 | .06425 | 0
h̄ = (H_{s-1} - H_s) / ½(α_s - α_{s-1}) | | -2.19667 | -.91404 | -.02553 | +.56594 | +.98333 | +1.44723 | +2.29464
τ_0 | 0 | -.036 | -.358 | -.622 | -.802 | -.871 | -.972 | -1
τ_1 = T_0 | 0 | +.07908 | +.37340 | +.38014 | +.27827 | +.21042 | +.06425 | 0
τ_2 | 0 | -.10060 | -.09606 | +.08353 | +.16701 | +.16830 | +.08682 | 0
τ_3 | 0 | +.07221 | -.13226 | -.14021 | -.03176 | +.02401 | +.06956 | 0
τ_4 | 0 | -.00688 | +.07952 | -.07001 | -.10990 | -.08359 | +.01634 | 0
τ_5 | 0 | -.04291 | +.07579 | +.08432 | -.02041 | -.05839 | -.03270 | 0
τ_6 | 0 | +.03654 | -.06933 | +.06182 | +.07319 | +.03408 | -.03744 | 0
δτ_0 | | -.036 | -.322 | -.264 | -.180 | -.069 | -.101 | -.028
δτ_1 | | +.07908 | +.29432 | +.00674 | -.10187 | -.06785 | -.14617 | -.06425
δτ_2 | | -.10060 | +.00454 | +.17959 | +.08348 | +.00129 | -.08148 | -.08682
δτ_3 | | +.07221 | -.20447 | -.00795 | +.10845 | +.05577 | +.04555 | -.06956
δτ_4 | | -.00688 | +.08640 | -.14953 | -.03989 | +.02631 | +.09993 | -.01634
δτ_5 | | -.04291 | +.11870 | +.00853 | -.10473 | -.03798 | +.02569 | +.03270
δτ_6 | | +.03654 | -.10587 | +.13115 | +.01137 | -.03911 | -.07152 | +.03744
T_1 | 0 | -.17827 | -.49385 | -.50388 | -.56581 | -.63299 | -.84922 | -1
T_2 | 0 | +.23690 | +.29898 | +.29475 | +.33853 | +.33915 | +.21135 | 0
T_3 | 0 | -.18799 | -.00734 | +.00466 | +.06947 | +.12432 | +.18307 | 0
T_4 | 0 | +.04848 | -.09506 | -.09186 | -.10916 | -.08255 | +.06601 | 0
T_5 | 0 | +.07412 | +.00799 | -.00511 | -.06648 | -.10343 | -.05518 | 0
T_6 | 0 | -.08325 | +.05616 | +.05364 | +.05379 | +.01471 | -.08491 | 0
δT_0 | | +.07908 | +.29432 | +.00674 | -.10187 | -.06786 | -.14617 | -.06425
δT_1 | | -.17827 | -.31558 | -.01003 | -.06193 | -.06718 | -.21622 | -.15078
δT_2 | | +.23690 | +.06208 | -.00422 | +.04377 | +.00063 | -.12780 | -.21135
δT_3 | | -.18799 | +.18065 | +.01200 | +.06481 | +.05485 | +.05874 | -.18307
δT_4 | | +.04848 | -.14354 | +.00320 | -.01731 | +.02662 | +.14856 | -.06601
δT_5 | | +.07412 | -.06613 | -.01310 | -.06137 | -.03695 | +.04825 | +.05518
δT_6 | | -.08325 | +.13941 | -.00253 | +.00015 | -.03907 | -.09962 | +.08491

The following table gives the corresponding quantities $\frac12(1+\alpha')$'s, $k$'s, $K$'s, $\bar k$'s, $\tau'$'s, $\delta\tau'$'s, $T'$'s and $\delta T'$'s for the $y$-variate.

TABLE II.

Columns 0 to 7 are the dichotomies $k_0$ to $k_7$. Entries for a category ($\bar k$, $\delta\tau'$, $\delta T'$) stand in the column of its upper dichotomy.

Quantity | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7
½(1+α') | 0 | .034 | .335 | .619 | .756 | .861 | .959 | 1.000
k | -∞ | -1.82501 | -.42615 | +.30286 | +.69349 | +1.08482 | +1.73920 | +∞
K = τ_1' | 0 | +.07545 | +.36431 | +.38106 | +.31367 | +.22149 | +.08792 | 0
k̄ = (K_{s'-1} - K_{s'}) / ½(α'_{s'} - α'_{s'-1}) | | -2.21916 | -.95968 | -.05896 | +.49188 | +.87789 | +1.36301 | +2.14436
τ_0' | 0 | -.034 | -.335 | -.619 | -.756 | -.861 | -.959 | -1
τ_1' = T_0' | 0 | +.07545 | +.36431 | +.38106 | +.31367 | +.22149 | +.08792 | 0
τ_2' | 0 | -.09737 | -.10978 | +.08160 | +.15382 | +.16990 | +.10812 | 0
τ_3' | 0 | +.07179 | -.12172 | -.14130 | -.06647 | +.01599 | +.07268 | 0
τ_4' | 0 | -.00929 | +.08932 | -.06851 | -.11185 | -.08942 | +.00077 | 0
τ_5' | 0 | -.04057 | +.06463 | +.08551 | +.00990 | -.05411 | -.04815 | 0
τ_6' | 0 | +.03702 | -.07647 | +.06061 | +.08449 | +.04134 | -.03475 | 0
δτ_0' | | -.034 | -.301 | -.284 | -.137 | -.105 | -.098 | -.041
δτ_1' | | +.07545 | +.28886 | +.01675 | -.06739 | -.09218 | -.13357 | -.08792
δτ_2' | | -.09737 | -.01241 | +.19138 | +.07222 | +.01608 | -.06178 | -.10812
δτ_3' | | +.07179 | -.19351 | -.01958 | +.07483 | +.08246 | +.05669 | -.07268
δτ_4' | | -.00929 | +.09861 | -.15783 | -.04334 | +.02243 | +.09019 | -.00077
δτ_5' | | -.04057 | +.10520 | +.02088 | -.07561 | -.06401 | +.00596 | +.04815
δτ_6' | | +.03702 | -.11349 | +.13708 | +.02388 | -.04315 | -.07609 | +.03475
T_1' | 0 | -.17170 | -.49025 | -.50360 | -.53847 | -.62073 | -.80610 | -1
T_2' | 0 | +.23105 | +.30439 | +.29416 | +.32847 | +.34094 | +.25021 | 0
T_3' | 0 | -.18723 | -.01151 | +.00432 | +.04271 | +.11544 | +.18882 | 0
T_4' | 0 | +.05286 | -.09892 | -.09140 | -.11081 | -.08901 | +.03768 | 0
T_5' | 0 | +.06989 | +.01240 | -.00474 | -.04316 | -.09869 | -.08340 | 0
T_6' | 0 | -.08412 | +.05897 | +.05326 | +.06263 | +.02276 | -.08010 | 0
δT_0' | | +.07545 | +.28886 | +.01675 | -.06739 | -.09218 | -.13358 | -.08792
δT_1' | | -.17170 | -.31855 | -.01334 | -.03488 | -.08225 | -.18537 | -.19391
δT_2' | | +.23105 | +.07334 | -.01023 | +.03431 | +.01247 | -.09072 | -.25021
δT_3' | | -.18723 | +.17572 | +.01583 | +.03839 | +.07273 | +.07338 | -.18882
δT_4' | | +.05286 | -.15178 | +.00752 | -.01941 | +.02179 | +.12670 | -.03768
δT_5' | | +.06989 | -.05749 | -.01714 | -.03842 | -.05553 | +.01529 | +.08340
δT_6' | | -.08412 | +.14309 | -.00571 | +.00937 | -.03988 | -.10286 | +.08010

From Tables I and II we can find from (x) the value of $\bar n_{ss'}/N$ for any given value of $r$, and by equating $\bar n_{ss'}/N$ to $n_{ss'}/N$ we should have an equation to determine the correlation $r$ from that cell alone. The weighted mean of these 49 $r$'s would be Ritchie-Scott's polychoric correlation coefficient. But the labour would be immense*.

We are now in a position to give the product of $\delta_s\tau_p\,\delta_{s'}\tau_p'$, see Table III, p. 138. There are certain checks on the accuracy of this table, namely

$$S_{s,s'}\,\delta_s\tau_p\,\delta_{s'}\tau_p' = 0 \quad\text{except for } p = 0, \text{ when it} = 1.$$

* We are not underrating the large amount of arithmetic of the present process.
It is not likely to be often repeated, and the sole purpose of publishing all these tables for an individual case is to impress the reader with that fact; while at the same time illustrating the actual numerical processes. The amount of arithmetic, great as it is, is relatively small compared with that of solving and weighting the resulting $r$'s in the case of a 49-cell table.

TABLE III.

Values of $\delta_s\tau_p\,\delta_{s'}\tau_p'$.

s' | p | s=1 | s=2 | s=3 | s=4 | s=5 | s=6 | s=7
1 | 0 | +.001,224 | +.010,948 | +.008,976 | +.006,120 | +.002,346 | +.003,434 | +.000,952
1 | 1 | +.005,967 | +.022,206 | +.000,509 | -.007,686 | -.005,119 | -.011,029 | -.004,848
1 | 2 | +.009,795 | -.000,442 | -.017,487 | -.008,128 | -.000,126 | +.007,934 | +.008,454
1 | 3 | +.005,184 | -.014,679 | -.000,571 | +.007,786 | +.004,004 | +.003,270 | -.004,994
1 | 4 | +.000,064 | -.000,803 | +.001,389 | +.000,371 | -.000,244 | -.000,928 | +.000,152
1 | 5 | +.001,741 | -.004,816 | -.000,346 | +.004,249 | +.001,541 | -.001,042 | -.001,327
1 | 6 | +.001,353 | -.003,919 | +.004,855 | +.000,421 | -.001,448 | -.002,648 | +.001,386
2 | 0 | +.010,836 | +.096,922 | +.079,464 | +.054,180 | +.020,769 | +.030,401 | +.008,428
2 | 1 | +.022,843 | +.085,017 | +.001,947 | -.029,426 | -.019,599 | -.042,223 | -.018,559
2 | 2 | +.001,248 | -.000,056 | -.002,229 | -.001,036 | -.000,016 | +.001,011 | +.001,077
2 | 3 | -.013,973 | +.039,567 | +.001,538 | -.020,986 | -.010,792 | -.008,814 | +.013,461
2 | 4 | -.000,678 | +.008,520 | -.014,745 | -.003,934 | +.002,594 | +.009,854 | -.001,611
2 | 5 | -.004,514 | +.012,487 | +.000,897 | -.011,018 | -.003,995 | +.002,703 | +.003,440
2 | 6 | -.004,147 | +.012,015 | -.014,884 | -.001,290 | +.004,439 | +.008,117 | -.004,249
3 | 0 | +.010,224 | +.091,448 | +.074,976 | +.051,120 | +.019,596 | +.028,684 | +.007,952
3 | 1 | +.001,325 | +.004,930 | +.000,113 | -.001,706 | -.001,136 | -.002,448 | -.001,076
3 | 2 | -.019,253 | +.000,869 | +.034,370 | +.015,976 | +.000,247 | -.015,594 | -.016,616
3 | 3 | -.001,414 | +.004,004 | +.000,156 | -.002,123 | -.001,092 | -.000,892 | +.001,362
3 | 4 | +.001,086 | -.013,637 | +.023,600 | +.006,296 | -.004,153 | -.015,772 | +.002,579
3 | 5 | -.000,896 | +.002,478 | +.000,178 | -.002,187 | -.000,793 | +.000,536 | +.000,683
3 | 6 | +.005,009 | -.014,513 | +.017,978 | +.001,559 | -.005,361 | -.009,804 | +.005,132
4 | 0 | +.004,932 | +.044,114 | +.036,168 | +.024,660 | +.009,453 | +.013,837 | +.003,836
4 | 1 | -.005,329 | -.019,834 | -.000,454 | +.006,865 | +.004,572 | +.009,850 | +.004,330
4 | 2 | -.007,265 | +.000,328 | +.012,970 | +.006,029 | +.000,093 | -.005,884 | -.006,270
4 | 3 | +.005,403 | -.015,300 | -.000,595 | +.008,115 | +.004,173 | +.003,409 | -.005,205
4 | 4 | +.000,298 | -.003,745 | +.006,481 | +.001,729 | -.001,140 | -.004,331 | +.000,708
4 | 5 | +.003,244 | -.008,975 | -.000,645 | +.007,919 | +.002,872 | -.001,942 | -.002,472
4 | 6 | +.000,873 | -.002,528 | +.003,132 | +.000,272 | -.000,934 | -.001,708 | +.000,894
5 | 0 | +.003,780 | +.033,810 | +.027,720 | +.018,900 | +.007,245 | +.010,605 | +.002,940
5 | 1 | -.007,290 | -.027,130 | -.000,621 | +.009,390 | +.006,254 | +.013,474 | +.005,923
5 | 2 | -.001,618 | +.000,073 | +.002,888 | +.001,342 | +.000,021 | -.001,310 | -.001,396
5 | 3 | +.005,954 | -.016,861 | -.000,656 | +.008,943 | +.004,599 | +.003,756 | -.005,736
5 | 4 | -.000,154 | +.001,938 | -.003,354 | -.000,895 | +.000,590 | +.002,241 | -.000,367
5 | 5 | +.002,747 | -.007,598 | -.000,546 | +.006,704 | +.002,431 | -.001,644 | -.002,093
5 | 6 | -.001,577 | +.004,568 | -.005,659 | -.000,491 | +.001,688 | +.003,086 | -.001,616
6 | 0 | +.003,528 | +.031,556 | +.025,872 | +.017,640 | +.006,762 | +.009,898 | +.002,744
6 | 1 | -.010,563 | -.039,312 | -.000,900 | +.013,607 | +.009,063 | +.019,524 | +.008,582
6 | 2 | +.006,215 | -.000,280 | -.011,095 | -.005,157 | -.000,080 | +.005,034 | +.005,364
6 | 3 | +.004,094 | -.011,591 | -.000,451 | +.006,148 | +.003,162 | +.002,582 | -.003,943
6 | 4 | -.000,621 | +.007,792 | -.013,486 | -.003,598 | +.002,373 | +.009,013 | -.001,474
6 | 5 | -.000,256 | +.000,707 | +.000,051 | -.000,624 | -.000,226 | +.000,153 | +.000,195
6 | 6 | -.002,780 | +.008,056 | -.009,979 | -.000,865 | +.002,976 | +.005,442 | -.002,849
7 | 0 | +.001,476 | +.013,202 | +.010,824 | +.007,380 | +.002,829 | +.004,141 | +.001,148
7 | 1 | -.006,953 | -.025,877 | -.000,593 | +.008,956 | +.005,965 | +.012,851 | +.005,649
7 | 2 | +.010,877 | -.000,491 | -.019,417 | -.009,026 | -.000,139 | +.008,810 | +.009,387
7 | 3 | -.005,248 | +.014,861 | +.000,578 | -.007,882 | -.004,053 | -.003,311 | +.005,056
7 | 4 | +.000,005 | -.000,067 | +.000,115 | +.000,031 | -.000,020 | -.000,077 | +.000,013
7 | 5 | -.002,066 | +.005,715 | +.000,411 | -.005,043 | -.001,829 | +.001,237 | +.001,575
7 | 6 | +.001,270 | -.003,679 | +.004,557 | +.000,395 | -.001,359 | -.002,485 | +.001,301
PEARSON 139

Applying these tests, i.e. summing the products of Table III over all values of s and s' for each order p, we find:

p = 0: 1.000,000
p = 1: +.000,001
p = 2: +.000,001
p = 3: +.000,008
p = 4: -.000,002
p = 5: +.000,001
p = 6: +.000,002

results as close as we should expect, when we take into account the fact that our differences of the τ's were only to five-figure accuracy, and our products to six.

The meaning of Table III should be quite intelligible; namely, for example:

$$.084 = \frac{n_{32}}{N} = .079{,}464 + .001{,}947\,r - .002{,}229\,r^2 + .001{,}538\,r^3 - .014{,}745\,r^4 + .000{,}897\,r^5 - .014{,}884\,r^6 + \dots \qquad \text{(xxi)}$$

is the equation which will give the correlation coefficient r as deduced from the (3, 2) cell. If r be given any other value, the right-hand side of the above expression is equal to the contents of the (3, 2) cell for a normal correlation surface of correlation coefficient r having the observed marginal totals.

Thus far the arithmetic is absolutely comparable with that needed for Ritchie-Scott's "polychoric r." We should have to solve the 49 equations, and then calculate (the stiffest part of the work) the probable errors of the 49 correlation coefficients which are the roots of these equations. Using these probable errors as our weighting data, we should find a mean coefficient. Our purpose is to replace the weighting and the solution of the 49 equations by the solution of a single equation.

It will be noticed that both Ritchie-Scott's and our methods have an undesirable limitation, for we both assume the marginal totals to be those of the normal correlation surface. Actually in our case we ought to treat the marginal totals as unknown, or select $h_1, h_2, h_3, \dots$ and $k_1, k_2, k_3, \dots$ as well as r to give as closely as possible the observed frequencies. Now the τ's and τ''s, and consequently their differences, all depend upon the h's and k's, and the equations obtained by making

$$S_{ss'}\left(\frac{n_{ss'}^2}{\bar n_{ss'}}\right) = \text{minimum}$$

do not appear to lend themselves to any reasonably brief system of solutions.
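The use of equation (xxi) can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the coefficients are those read from Table III for the (3, 2) cell, the series is cut off at the sixth power of r as printed, and the names `COEFFS_32`, `OBSERVED_32`, `expected_proportion` and `ratio_to_observed` are hypothetical labels introduced here, not the authors' notation.

```python
# Coefficients of r^0 .. r^6 for the (3, 2) cell, as read from Table III
# (block s' = 2, column s = 3). The series is truncated at r^6 as printed.
COEFFS_32 = [0.079464, 0.001947, -0.002229, 0.001538,
             -0.014745, 0.000897, -0.014884]

# Observed proportion n_32 / N, the left-hand side of (xxi).
OBSERVED_32 = 0.084


def expected_proportion(r, coeffs=COEFFS_32):
    """Evaluate the truncated series of (xxi) at r by Horner's rule."""
    total = 0.0
    for c in reversed(coeffs):
        total = total * r + c
    return total


def ratio_to_observed(r):
    """Ratio of the expected cell proportion to the observed one."""
    return expected_proportion(r) / OBSERVED_32


if __name__ == "__main__":
    for r in (0.45, 0.50, 0.55):
        print(f"r = {r:.2f}  expected = {expected_proportion(r):.6f}  "
              f"expected/observed = {ratio_to_observed(r):.6f}")
```

At r = 0.45 the ratio comes out at roughly 0.944, which appears to agree with the corresponding entry of Table IV. With the series truncated in this way the expected proportion stays below .084 throughout the range tried, which gives some idea of why a single cell is a poor basis for determining r.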
We were compelled therefore to introduce the admittedly limited form of solution, i.e. the determination of the best normal correlation surface subject to the restriction of its having the same marginal totals as the observed frequency surface. We consider this a practically necessary but none the less grave restriction.

We next proceeded to determine the value of $\bar n_{ss'}/n_{ss'}$ and $(\bar n_{ss'}/n_{ss'})^2$ for certain selected values of r in order to build up equation (xvii) and solve it by interpolation. The values chosen were: 0.45, 0.50 and 0.55. These cover the range within which we anticipate the solution of (xvii) for r will lie. We need also the value of the numerator in (xvii), i.e. $\nu_{ss'}$, the series formed from the products of Table III of orders p = 1, 2, 3, ... multiplied respectively by 1, 2r, 3r², ..., for the same three values of r. These results are given in Table IV.

On Polychoric Coefficients of Correlation 140

TABLE IV. Values of $(\bar n_{ss'}/n_{ss'})$, $(\bar n_{ss'}/n_{ss'})^2$ and $\nu_{ss'}$.

St Function sil s=2 9=8} s=4 s=5 s=6 G=7/ Tiss'|Mes’ (2) | 1°602,750 | -879,954 | °814,714 wo "388,000 co co fe) | 2119000, -oie'da| verresr| so | aeeoo0| é = , 6e) | 2°115,5¢ "916,405 587,85 ao “174,000 eo) <9) (Tisg:/g9') (@) | 2°568,806 | °774,321 663,759 ea "150,544 ry oo 1 (b) | 3:°407,716 | +813,604| -497,830 2 070,756 00 2 (c) | 4:475,340| 839,805 | 345,576 oo ‘030,276 ca <0 vs (a) | + 018,462 | + 011,177 | — °014,603 | — 009,218 | — 002,733 | — ‘002,747 | — 000,336 b) |+ 020,480 |+ 008,113 | — -015,910 | — 008,382 | — -002,154 | — -001,929 | — 000,218 (c) |+°022,694 |+ -004,477 | — 017,013 | — 007,243 |— -001,519 | — 001,228 | — -000,168 isy'Msy (2) | *867,348] 905,539 | +944,250| 1°478,500 | 1:379,000 | 1-887,333 oo b)| 894,565) -944,630| 939,833 | 1:383,615 | 1-215,375 | 1:544,667 wo _ _ (¢) | 915,130) 986,935) -933,333 | 1-278,500 | 1-043,500 | 1:213,333 Ee (ise’/Nas’)” (@) | °752,293 | +820,001 | °891,608 | 2-185,962 | 1-901,641 | 3°562,026 oo 2 (b)| +800,247| -892,326| 883,286 | 1:914,390 | 1:477,136 | 2°385,996 ea (c) | *837,463| 974,041) 871,110) 1-634,562 | 
1-088,892 | 1-472,177 wo vey (@) | + 013,846 | + °116,000 | — -005,963 | — 046,943 | — -025,552 | — 041,623 | — 009,764 (b) | + °011,084 |4++125,051 | — 009,011 | — 051,854 | — 026,828 | — 040,529 | — 007,913 (¢) | + 007,767 | +°135,874 | — 013,006 | — 057,659 | — 028,171 | — ‘038,864 | — -005,940 Tisg[Nse (4) | *857,750 | 1:075,552 | 1°108,280] -812,500]| °854,818| -984,375 | 2°194,000 (0) | °751,875 | 1-076,195 | 1°138,747 | -823,410] 844,773 | 930,333 | 1°846,500 _ (e)| °635,750 | 1:075,448 | 1°175,027 | -835,909 | -831,636 | 866,000 | 1°486,500 (Mgs-/Msq)? (@) | *735,735 | 1°156,812 | 1°228,285 | -660,156 |) °730,714| *968,994| 4:813,636 3 (6) | *565,316 | 1:158,196 | 1:296,745 | -678,004| -713,641| 865,519 | 3:409,562 (c)| 404,178 | 1-156,588 | 1°380,688 | -698,744| -691,618| -°749,956 | 2:209,682 vey (a) | — 016,095 | + 002,075 |+ 041,770 | + 013,402 | — -003,847 | — 023,749 | — 013,555 (6) |— °017,786 | + 000,037 | + °049,827 | + 015,435 | — -005,038 | — 028,268 | — ‘014,205 7 __(¢) |= 019,311 | — -002,805 | + ‘059,278 | + 017,601 | — -006,601 | — 033,522 |- ‘014,539 Tiss'[Msy (@) | 1°634,000 | 1°155,897 | 1:078,222| -808,892 | 850,571 | 1°225,786| -671,833 (6) | 1:260,000 | 1-096,966 | 1°098,417 | -837,135 | 877,714 | 1:239,929| -627,333 - (c) | *917,000 | 1-030,862 | 1°121,944| -869,568 | -907,429 | 1-250,000| -570,000 (7iss’/sg')? 
(@) | 2°669,956 | 1°336,098 | 1°162,563 | -654,306| -723,471 | 1°502,551 | -451,360 4 (6) | 1°587,600 | 1-203,334 | 1:206,520 | -700,795| °770,382 | 1°537,424| 393,547 (c)| 840,889 | 1-062,676 | 1°258,758 | -756,149 | °823,427 | 1°562,500| 324,900 Ye (@) | — 007,715 | - -032,319 | + 013,434 |+ -019,505 | +-007,261 |-+°004,459 | — 004,625 (0) | — 007,215 | — -036,182 | + °015,696 | + -022,370 | 4 007,947 | + 003,430 | — 006,095 (c) | — 006,471 | — -040,720 |+ 018,237 |+ 025,717 |+ 008,735 |+ ‘002,185 | — 007,680 Tiss'[Nsy () oo 1:114,278 | 1°028,556 | 934,423} -960,545| 935,111] 946,600 (0) co 1:006,167 | 1°027,185 | +969,000 | 1:008,273 | 978,944 | -944,400 : a) ca -890,389 | 1°024,148 | 1:007,692 | 1-061,727 | 1°025,111 |] -927,400 o [Orr] 2 | Louw] 8600 | Sasa | athe | ‘ace | “Setar oe) 012,372 055,10! Boe ) : - 958 ,3% 4 (c)| ow 792,793 | 1:048,879 | 1-015,443 | 1+127,264 | 1-050,853 | -860,071 vey (&) | — 004,797 | — -037,653 | — 000,381 | + 017,025 | + 009,967 | + 015,398 | + -000,440 (6) | — 003,957 | — 040,252 | — 001,134 | + -018,995 |-+-011,095 | + 016,166 | — -000,916 (c) | — 002,988 | — 043,158 | — 002,230 | + 021,305 |+ 012,465 |+ 017,118 | — 002,508 iss/Nsg_ (CL) ee) 1:461,333 | °867,077 | 1:216,474 | 1°604,286 | °701,931 | -906,500 (db) 20 1°224,000 | °830,576 | 1-245,526 | 1°693,714] °754,966| -969,125 7 , (2 co ‘988,111 | °786,077 | 1-273,789 | 1°791,000| 812,828 | 1:028,375 g [ore | |} Tawive| -esersee | rseirsae | a-cnerenel) seu maa Maa iene “498,17 : 5 "551,385 |, 278 367 | 75 7 ; 9 2 ‘976,363 | °617,917 | 1°622,538 | 3-207,681 | °660,689 | 1-057,555 vey (4) |= "008,069 | — 042,728 | — 017,170 | + 011,165 | + 012,060 | + °029,542 | +4 010,202 (b) | — 002,189 |—-042,658 | — 020,931 | + -010,905 | + -013,028 | + -032,069 | + -009,778 (ce) | — ‘001,381 | — -042,197 | — 025,479 | + -010,572 |-+ 014,219 |+ 035,116 | + 009,152 Mggi[Nsg (4) ce) ‘961,333 | °747,556 | 1-462,667 | -845,000 | 1:140,500] -870,286 (b ea ‘705,000 | °648,556 | 1°411,167 | 
865,000 | 1-235,000 | 1-003,143 - ., (2) a0 "491,000 | 542,000 | 1:337,333 | 877,000 | 1°331,000 | 1°150,286 7 [Owlw |. | aorcoss | asotess | 1-aeicsen | -fasiose | Treenteep | aeaneae oo *497,025 “Ag 2 = 39% *748,225 "525,225 ‘006 {6 ie 241,081 | -293,764 | 1°788,460 | -769,129 | 1-771,561 | 1-323,158 Ye (@) | — 000,633 | — 016,551 | — 017,086 | — 004,935 | + 002,845 | 4+ °018,719 | +°017,641 (b) | — 000,417 | — 014,160 | — -0182536 | — -007,468 | +-001,950 | + -019,060 |+-019,571 (¢) | — 000,309 |— 011,471 | - -019,787 |— 010,293 | + :000,873 | + °019,302 | + °021,685

The values (a), (b), (c) refer respectively to r = 0.45, 0.50, 0.55.

KARL PEARSON AND EGON S. PEARSON 141

Having obtained $(\bar n_{ss'}/n_{ss'})^2$ and $\nu_{ss'}$ for the trial values of r, it is only a matter of adding $\nu_{ss'}/(\bar n_{ss'}/n_{ss'})^2$ for all values of s and s' on the machine in order to obtain:

$$u = S_{ss'}\left\{\nu_{ss'}/(\bar n_{ss'}/n_{ss'})^2\right\}.$$

The values obtained were:

r = | 0.45 | 0.50 | 0.55
u = | +.157,074 | +.012,276 | -.209,976

Whence by inverse interpolation* we find: u = 0 for r = .5034, which is "polychoric r" as based upon Equation (xvii)†. We shall compare later the value for r as found by other processes. But the above value is clearly well in accord with the usual result for paternal correlation in man. Table V gives the working values of $\nu_{ss'}/(\bar n_{ss'}/n_{ss'})^2$.

TABLE V. Values of $\nu_{ss'}/(\bar n_{ss'}/n_{ss'})^2$.
F- s r $= | s=—2 so sa=:4 s=5) S=—0) s=7 (a) |+:007,186 |+4-014,435 | — 022,001 0 | -018,159 0 0 1 | (b) |+ 006,010 | + -009,972 | — 031,958 0 | —*030,424 0 0 (ec) |-+-005,071 |+-005,331 | — 049,232 0 — 050,132 0 0 (a) |+-018,405 |+-141,463 | - -006,688 |— -021,475 | — 013,436 | — -011,685 0 2 | (b) |+-013,851 |4+°140,141 |— 010,202 | — 027,086 | — 018,162 |— -016,986 0 (c) |+:009,274 |-+ -139,495 | — 014,930 | — 035,275 |— ‘025,871 |— -026,399 0 (a) |—-021,876 |+-001,794 | +-034,007 | + 020,301 | — 005,265 | — 024,509 | — 002,816 3 | (b) |—-031,462 |+ -000,032 | + -038,425 |+ -022,755 | — 007,060 | — 032,660 | — 004,166 (ce) |—-047,779 |—-002,425 |+ | 002,890 | — °024,189 |+ ‘011,556 4 (6) |--004,543 |— -030,027 | + °013,009 (ec) |—+007,695 | —-038,318 | + -014,488 — 8 Se | 029,810 | + -010,036 | + 002,968 | — -010,247 031,921 | + 010,316 | + -002,231 |~ -015,487 at ee 042,934 | 4+ 025,189 | — 009,544 | — -044,832 | — :006,579 + + + 034,010 | + °010,608 | + 001,398 | — 023,039 (a) 0 — 030,326 | — 000,360 | + -019,498 | — 010,803 | + -017,610 | + 000,491 5 | (b) 0 — 039,760 | — 001,075 | + 020,230 | + 010,914 | + -016,869 | — 001,027 (c) 0 — 054,439 | — -002,126 | + -020,981 |+-011,058 | + -016,285 | — -002,916 (a) 0 — 020,008 | — -022,838 | + -007,545 | + -004,686 | + -059,958 | + -012,415 6 | (b) 0 — 028,474 |— -030,341 | + -007,029 | + 004,542 | + 056,264 |+-010,411 (c) 0 ~ 043,219 |— -041,234 | + -006,516 | + 004,433 | + -053,151 | + -008,654 (a) 0 — 017,910 | ~ 030,574 | — 002,307 | + “003,984 | + -014,391 | + -023,291 7 | (b) 0 ~ 028,491 |— -044,067 | — -003,750 | +-002,606 | + -012,497 | +-019,449 (e) 0 ~-047,576 | — -067,356 | — -005,755 | + 001,135 | + -010,895 |+ 016,389 S(a)= +°157,074, S(b) = +'012,276, S (¢c)= — 209,976. * The formula used was Casus I or zg=2)+40 (Az_1 + Az) + $0262z9, the solution of the quadratic giving 0. 
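The inverse interpolation of the footnote can be sketched in Python as follows. This is a sketch under assumptions rather than the authors' own working: the footnote formula is read here as $z_\theta = z_0 + \tfrac12\theta(\Delta z_{-1} + \Delta z_0) + \tfrac12\theta^2\delta^2 z_0$, the quadratic in θ is solved for the root of smaller magnitude, and the function name `inverse_interpolate` and its arguments are hypothetical.

```python
import math


def inverse_interpolate(r0, step, z_minus, z0, z_plus):
    """Find r near r0 at which the tabulated function vanishes.

    z_minus, z0, z_plus are the function values at r0 - step, r0, r0 + step.
    Assumed formula: z(theta) = z0 + (1/2) theta (dz_{-1} + dz_0)
                                   + (1/2) theta^2 d2z_0,
    with r = r0 + theta * step.
    """
    d_minus = z0 - z_minus            # first difference, Delta z_{-1}
    d_zero = z_plus - z0              # first difference, Delta z_0
    a = 0.5 * (d_zero - d_minus)      # half the second central difference
    b = 0.5 * (d_minus + d_zero)      # mean of the two first differences
    c = z0
    if abs(a) < 1e-15:
        theta = -c / b                # differences equal: linear interpolation
    else:
        disc = b * b - 4.0 * a * c    # math.sqrt raises if no real root exists
        root = math.sqrt(disc)
        candidates = [(-b + root) / (2.0 * a), (-b - root) / (2.0 * a)]
        theta = min(candidates, key=abs)  # keep the root nearest r0
    return r0 + theta * step


if __name__ == "__main__":
    # Sums S(a), S(b), S(c) of Table V at r = 0.45, 0.50, 0.55.
    r_u = inverse_interpolate(0.50, 0.05, 0.157074, 0.012276, -0.209976)
    print(f"r for u = 0: {r_u:.4f}")
```

Run on the three sums of Table V, the sketch gives a value within about .0001 of the .5034 reported in the text; the small difference presumably arises from rounding in the hand computation.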
† The table suggests, a posteriori, that we should have got quite reasonable results from linear interpolation; we have: from (a) and (b) r = .5042; from (a) and (c) r = .4928; and from (b) and (c) r = .5025, as against our .5034. It should be noticed that the values in Table V are not always in agreement in the last figure with those obtained by dividing $\nu_{ss'}$ in Table IV by the $(\bar n_{ss'}/n_{ss'})^2$ of that table, because the somewhat more accurate process was adopted of multiplying $\nu_{ss'}$ by $n_{ss'}^2$ and then dividing by $\bar n_{ss'}^2$. Still the physical meanings of $\bar n_{ss'}/n_{ss'}$ and $(\bar n_{ss'}/n_{ss'})^2$ are so prominent in the work that it seemed desirable to register their values.

142 On Polychoric Coefficients of Correlation

Before we consider the graph due to this solution, let us investigate the value of r to be found from (xvi). The values of $\bar n_{ss'}/n_{ss'}$ are already provided in Table IV, but we need a table corresponding to Table III giving the products which enter (xvi) in place of those of Table III. This is provided in Table VI. Further, if $\kappa_{ss'}$ denote the series in ascending powers of r whose successive coefficients are the products of Table VI, Table VII (p. 143) provides $\kappa_{ss'}$ for the same three values of r, i.e. 0.45, 0.50 and 0.55. Finally Table VIII (p. 143) gives $\kappa_{ss'}/(\bar n_{ss'}/n_{ss'})$, whence by summing we obtain

$$v = r - S_{ss'}\left\{\kappa_{ss'}/(\bar n_{ss'}/n_{ss'})\right\}$$

for the three cases. Using the same interpolation formula as before in order to discover the value of r for which v = 0 we find: r = .5204.

There is thus a difference of .0170 between the two methods. The probable error found for the product-moment r is .0160, and the result by the usual product-moment process may be given: r = .5189 ± .0160. Thus either of the values reached by the methods of this paper differs by less than the probable error from the true product-moment value.

(4) If we work out the results by mean square contingency we find for the coefficient of contingency: C = .480,690, and the class index correlations are*: for fathers, .962,329; for sons, .964,523.
Hence the correlation from mean square contingency, found by dividing the coefficient of contingency by the product of the two class index correlations, is r = .5179, which is in excellent agreement with the product-moment value. It would therefore be quite reasonable for such a table as the present to use mean square contingency and class index corrections, and save the heavy labour of Equation (xvi bis) or (xvii). At the same time we cannot assert that this process would always be equally satisfactory for tables with but few broad categories and with much higher correlation.

Our two processes seem to give values slightly in defect and in excess of the true value of r, and we might use their mean, i.e. .5118, to obtain our graph. We shall, however, first proceed to compare the actual results of solving (xiv) and substituting in (xv) with the result of such approximative processes.

Table IX (p. 145) gives the products required for this purpose, and will therefore enable us, by aid of Table IV (p. 140), which gives the values of $\bar n_{ss'}/n_{ss'}$, to obtain the columnar means of (xiv) for any value of r. Let $\lambda_{ss'}$ denote the series in ascending powers of r whose successive coefficients are the products of Table IX ...... (xxii).

* Using the tabulated values in Tables I and II respectively.

KARL PEARSON AND EGON S. PEARSON 143

TABLE VI. Coefficients of the successive powers of r in $\kappa_{ss'}$.
p geil s=2 3=38 oa ce s=6 si p O |+:005,966 |-+ °022,207 |+ 000,509 | — 007,686 | — -005,120 | — 011,028 |— 004,848 | 0 1 }+:030,608 |+ 054,185 |+ -001,722 | + 010,633 | + -011,536 | + 037,126 |+ 025,890 | 1 2 |+:054,736 |+ -014,342 | — 000,976 |+ -010,114 | + 000,145 | — 029,528 |— 048,833 | 2 3 |+-035,199 | — -033,825 | — 002,246 | — -012,136 | — -010,270 | — 010,999 |+ 034,276 | 3 4 |+-°002,562 |—-007,587 | + 000,169 | — -000,915 | + 001,407 | + 007,853 |— °003,489 | 4 5 |+:005,180 | — -004,622 | — 000,915 | — -004,289 | — :002,583 | + 003,372 |+°003,856 | 5 6 |+ 007,003 | — 011,727 |+:000,213 | — -000,013 | + 003,287 | + °008,380 |— 007,142 | 6 O |+°022,842 |+ -085,018 |-+ 001,948 | — 029,426 | — 019,601 | — 042,222 |— "018,559 | 0 1 |+:°056,787 | + °100,528 |+ 003,195 |+ -019,728 | + °021,402 | + 068,878 |+ °048,033 | 1 2 |+-:017,375 |+ 004,553 | — 000,310 | + -003,210 | + ‘000,046 | — -009,373 |— 015,501 | 2 3 |—'033,035 |+ -031,745 | + 002,108 | + -011,389 | + 009,639 | + 010,323 |— 032,169} 3 | 2 4 | -—-007,358 |+ ‘021,786 | — ‘000,486 | + -002,627 | — 004,040 | — 022,548 |+ "010,019 | 4 5 |—:004,261 |+ -003,802 | + :000,753 | + -003,528 | + ‘002,124 | — 002,774 |— 003,172] 5 6 |—-011,913 |+-019,949 | — 000,362 | + -000,022 | — 005,591 | — 014,255 |+ 012,150 | 6 O |+:001,324 |+-004,929 | + -000,113 |— -001,706 | — 001,136 | — 002,448 |— 001,076 | 0 1 |+ 002,378 |+-004,211 | + 000,134 | + -000,826 | + -000,896 | + ‘002,885 |+ °002,012] 1 2 |— 002,423 | — -000,635 | + 000,043 | — -000,448 | — 000,006 | + 001,307 |+ °002,162] 2 3 | — 002,976 | + 002,860 | + 000,190 | + -001,026 | + 000,868 | + °000,930 |— °002,898 | 3 4 |+ 000,365 | — -001,080 | + ‘000,024 | — 000,130 | + -000,200 | + °001,118 |— 000,497 | 4 5 |—-001,271 |+-001,134 |+ 090,225 |+ -001,052 | + 000,634 | — -000,827 | - 000,946 | 5 6 |+:000,475 | — 000,796 | + 000,014 | — -000,001 | + -000,223 | + 000,569 |— 000,485) 6 0 |—-005,329 | — 019,833 |— -000,455 |+ 006,865 | + 004,573 | +-009,850 |+ 
004,330] 0 1 |+:006,217 |+ -011,006 | + 000,350 | + -002,160 | + 002,343 |+°007,541 |+ 005,259 | 1 2 |+:008,127 |+ 002,130 | ~ -000,145 | + -001,502 | + 000,022 | — 004,385 |— 007,251] 2 3 |— 007,217 |+ 006,935 |-+ 000,461 |+ 002,488 |+ °002,106 | + °002,255 | — "007,028 | 3 4 |—-000,941 |-+ -002,786 | — 000,062 |-+ -000,336 | — 000,517 | — 002,883 |+°001,281] 4 5 |—°002,847 | + °002,540 | + 000,503 | + -002,358 | + 001,419 | — 001,854 |— 002,120] 5 6 | - ‘000,780 | + 001,306 | — -000,024 |+ 000,001 | — 000,366 | — -000,934 |+ 000,796 | 6 0 |—:007,289 | — -027,130 | — 000,622 |+ -009,390 | + 006,255 | + °013,473 |+ "005,923 | 0 1 |+ 014,662 | + °025,956 | + -000,825 |+ -005,094 | + 005,526 |+ 017,784 |+ 012,402 | 1 2 |+ :002,953 | + 000,774 | — -000,053 | + 000,546 | + -000,008 | — 001,593 |— 002,635 | 2 5 | 3 |-:013,673 |+-013,139 | + :000,873 |+-004,714 | + -003,990 |+ 004,273 |— 013,315 | 3 4 |+ 001,057 |— -003,128 | + -000,070 | — 000,377 | + °000,580 | + °003,238 |— 001,489 | 4 5 |-°004,116 |+ 003,672 |+ 000,727 |+ -003,408 | + 002,052 | — 002,679 | — 003,064 | 5 6 |+ 003,320 | — -005,559 | + 000,101 | — -000,006 + :001,558 | + -003,973 | — 003,386 | 6 0 |—-010,563 |— -039,314 |— -000,901 | + -013,607 | + -009,064 | + 019,524 |+°008,582 | 0 1 |+ 033,046 | + 058,500 |-+ 001,860 | + -011,480 | + 012,454 | + -040,082 |+°027,952 | 1 -2 .|—:021,493 | — 005,632 | + :000,383 |— -003,971 | — 000,057 |+-011,595 |+°019,175 | 2 3 |—'013,795 | + 013,256 | + 000,880 |+-004,756 + :004,025 + -004,311 |— 013,433 | 3 4 |+:006,142 |— -018,186 |-+ 000,406 | — 002,193 | + -003,372 |-+-018,822 |— 008,364 | 4 5 |+°001,134 |— -001,011 | — -000,200 | — 000,939 | — -000,565 | + -000,738 |+ 000,844} 5 6 |+ 008,563 | - -014,340 | + 000,260 | — 000,016 | + -004,019 |+-010,247 | — ‘008,733 | 6 0 |—-006,952 | — 025,876 |— :000,593 |+ 008,956 | + 005,966 | + 012,851 |+°005,649 | 0 1 |+ 024,867 )+ 061,193 | + °001,945 |+ 012,009 | + -013,028 | -+ 041,927 |+ 029,238 | 1 2 |—°059,276 | — 
015,532 | + 001,057 | — 010,953 | — -000,157 | + 031,978 |+ °052,883 | 2 3 |+°035,498 | — 034,111 | — 002,265 | — -012,238 | — 010,357 | — 011,092 |+ "034,567 | 3 | 7 4 |-—-001,827 | + ‘005,409 | — -000,121 | + 000,652 | ~ 001,003 | — 005,599 |+ "002,488 | 4 5 |+°006,181 | — -005,515 | — 001,092 | — -005,118 | — -003,082 | + -004,024 |+°004,602 | 5 6 |— 006,668 |+ 011,167 | — 000,202 | + -000,202 | — -003,130 | — 007,980 |+ 006,801 | 6 144 On Polychoric Coefficients of Correlation TABLE VII. Values of kgy. s' r s=1 2, s=3 s=4 s=5 s=6 s=7 (a) |+:034,290 |+-045,918 |+000,873 | — -002,076 | — -000,798 | — 000,849 | — 000,094 1 | (8) |+°039,786 |+ -047,855 | +-000,831 /— -001,549 | —-000,541 | — -000,496 | — 000,036 (c) |+ 045,903 | + 049,468 | + 000,762 |— 001,097 | — 000,350 | — -000,251 | — 000,001 (a) |+'048,425 |-+-135,200 | + :008,506 | — 018,688 | — -009,255 | — ‘013,278 | — 002,561 2 | (6) |++050,671 | ++142,180 |+ 003,719 |— 017,061 |— ‘007,957 | — 010,554 |— -001,722 (c) |+°052,617 |-+-149,704 | + -003,946 | — ‘015,291 |—-006,630 | — 008,054 | — -001,089 (a) |+ 001,628 | + -006,926 | + -000,205 | — -001,317 | — 000,633 | — -000,765 | — 000,039 3 | (b) |+:001,526 |+ 007,188 |+-000,223 | — 001,252 | — -000,545 | — 000,509 | + 000,040 (ec) |-+-001,386 |4 007,465 | + -000,245 |~-001,175 | — 000,444 | — -000,235 | + -000,096 (a) |—-001,641 |—+013,645 | —-000,278 |+ 008,425 |+ -005,826 |+ 012,401 |+ 004,608 4 | (b) |-:001,250 |—-012,657 | — 000,247 |+ °008,726 |+ 006,019 |+ 012,553 |+ 004,294 (c) |— 000,903 | — 011,563 | — -000,211 |+ 009,071 | + 006,233 | + 012,663 | + 003,892 (a) |— 001,344 |— -014,202 |—.000,165 |+-012,270 |-+ 009,181 |+ 021,659 | + -009,613 5 | (b) |—:000,940 | — 012,484 |— 000,085 | + 012,745 |+ 09,643 |+ 022,682 |+ -009,562 (c) |—+000,625 | —-010,689 |+ -000,007 |+ -013,278 |+ °010,160 | + 023,755 | + -009,352 (a) |—-000,958 | — °018,805 |+ 000,109 |+ 018,295 |+ 015,185 |+-041,172 |+ -023,419 6 | (b) |—-000,584 | — -011,207 |+ -000,258 | + 
-018,782 |+ ‘016,036 | + 044,362 |+ -025,040 (ec) |—+000,328 |—-008,749 |+ -000,419 |+-019,263 |-+ 016,957 |-+ 047,837 |+ 026,557 (a) |—*000,047 | —:004,380 | + -000,263 |+°010,961 |+-010,729 |+-036,961 |+ :032,908 7 (b) |-+000,076 | — 003,086 | + 000,316 |+°010,574 | + -010,938 | + -040,073 | 4+ 038,215 (ec) |+000,159 | — 002,067 | + 000,348 |+ °010,019 | + 011,027 | + -043,208 | + 044,126 TABLE VIII. Values of tes:/(Fisg/ se’). s’ r s=1 S12) so s=4 s=5 s=6 si (a) |+°021,394 |+ -052,182 |+-001,072 ) — 002,057 ) ) 1 | (6) |-+-0213559 | +-053-054 |+-001,178 ) — ‘002,033 ) 0 (ce) |-+°021,698 |-+-053,980 | + -001,296 ) — 022,011 ) 0 — | - — (a) |+°055,831 |+°149,303 + -003,713 | — ‘012,640 |— -006,711 |— -007,035 ) | 2 | (6) |+°056,643 | +°150,514 |-+ 003,957 |— 012,331 | — 006,547 | — 006,833 0 (ec) |+-057,497 |-+ 151,686 | + 004,228 | — -011,960 | — 006,354 | — -006,638 0 (a) |+:001,898 |+ 006,439 |+ 000,185 | - ‘001,621 | — 000,741 | — ‘000,777 |—:000,018 3 | (b) |+-002,029 | +-006,679 | + -000,196 | — -001,521 | — 000,645 | — 000,547 | + -000,022 (ec) |-+-002,180 |-+-006,941 | +-000,209 | — -001,406 | — 000,534 | — -000,271 |-+ 000,065 (a) ~ -001,004 — °011,805 |— :000,258 |+-010,415 | + 006,850 |+ -010,117 |+ -006,859 4 | (b) |—-000,992 | — -011,538 | — -000,225 |+-010,424 |+ -006,858 | + °010,124 |+ -006,845 (c) | — 000,985 | — 011,217 | — 000,188 |+ 010,432 |+ 006,869 |+ -010,130 |-+ 006,828 (a) ) — 012,745 | — 000,160 |+ 013,131 |+ 009,558 |+ 023,162 |+ -010,155 5 | (b) i) — 012,407 |—-000,083 | + 013,153 |+ 009,564 |+ 023,170 |-+ 010,125 ©) ) — °012,005 |— 000,007 |+ 013,177 | + °009,569 | + °023,173 |+ °010,084 (a) 0 — -009,447 |+ :000,126 |4+°015,039 |+ -009,465 |+ -058,655 |+ 025,835 6 | (0) 0 — 009,156 | + -000,311 |+°015,079 | + 009,468 |-+ °058,760 |+-025,838 (ec) 0 — 008,854 |+ 000,533 | + ‘015,123 | + -009,468 |-+ 058,853 | + “025,824 (a) 0 2 --004, 556 |-+ 000,352 |+ 007,494 + -012,697 |+ ‘032,408 |+ °037,813 Henle) 0 —-004,377 | + 000,487 | + 
-007,493 |-+ 012,645 | + -032,448 | + -038,095 (c) ) = 004,210 | + -000,642 | + “007,492 | + -012,574 |-+ -032;463 | +-038,361 S(a)=+°510,573, S(b)=+°517,476, 8 (c)= +°524,735, Y= -- 060,573, v= — 017,476, = +:025,265. Kart PEARSON AND Econ S. PEARSON 145 TABLE IX. Values of 3.1, Sy Tp p s=1 s=2 s— | s=4 s=5 s=6 Si) p 0 |—:002,689 | — :010,007 | — 000,229 | + -003,464 | + 002,307 +:004,970 | + °002,185 | 0 1 |—:013,450 | — -023,810 |—-000,757 | — 004,673 | — :005,069 |— 016,314 |—°011,377] 1 2 |—'023,067 | — -006,044 |+ 000,411 | — 004,262 |— -000,061 |+-012,444 |+°020,57 2 3 |- 013,496 + -012,969 + 000,861 | + -004,653 + 003,938 | + 004,217 |— 013,142 | 3 4 |—+000,450 | + :001,333 | — 000,030 |+-000,161 —*000,247 — -001,380 |+-000,613 | 4 5 |—:003,007 | + -002,683 |+°000,531 |-+-002,490 + 001,499 | —-001,958 |— 002,239 | 5 6 |— 003,082 |+ :005,161 |— -000,094 |+ 000,006 |— ‘001,446 |— 003,688 |+ 003,143 | 6 0 | —*023,802 — :088,590 | — :002,030 | + 030,662 + °020,425 + 043,996 |+ 019,339 | 0 1 |—°051,494 |— 091,158 |- -002,898 | — ‘017,889 |— -019,407 | —-062,458 | — "043,556 | 1 2 |—+002,940 | — :000,770 | + -000,052 | — 000,543 — -000,008 | + 001,586 | + :002,623 | 2 3 |+ 036,379 | — 034,958 | — 002,322 | — 012,542 | — ‘010,614 | — -011,368 |+ -035,425| 3 | 2 4 |+004,781 |— :014,154 |-+-000,316 | — 001,707 + °002,625 |+ -014,650 — -006,510 | 4 5 |+:'007,797 | -- :006,957 | — 001,378 | — 006,456 | — :003,887 | + 005,076 |+ °005,805 | 5 6 |+:009,448 | — 015,822 |-+ -000,287 | —-000,017 + :004,434 +-011,306 —-009,636 | 6 | 0 |— 022,458 | - ‘083,587 | — 001,915 | + -028,931 |+°019,271 + °041,511 |+ 018,247 | 0 1 |— 002,986 | — :005,286 |— 000,168 | — 001,037 |— 001,125 — -003,622 |— 002,526} 1 2 | +045,338 | + 011,880 | — 000,808 | + -008,377 | + °000,120 | — -024,459 |— 040,449 | 2 3 +°003,681 | — °003,537 | — -000,235 | — -001,269 |— 001,074 —-001,150 + 003,584 | 3 4 |--007,651 |+ °022,655 | — -000,506 |+ :002,732 |— 004,201 | — ‘023,447 |+ 010,419 | 4 5 
|+°001,548 | — -001,381 | — 000,273 | — 001,281 |—*000,772 +°001,007 |+-001,152| 5 6 |—:011,412 |+°019,111 | ~ 000,346 + -000,021 |— 005,356 |— 013,656 +°011,639 | 6 0 |—*010,833 |— °040,322 |— -000,924 | + 013,956 |+ 009,296 + -020,025 + -008,802 | 0 1 |+ 012,013 |+ ‘021,267 | + -000,676 |-+ -004,173 |+ 004,528 +°014,571 |+°010,161 | 1 2 |+:017,109 |+ ‘004,483 | — -000,305 | + -003,161 |-+ °000,045 | — 009,230 |— -015,264] 2 3 |—°014,068 |+ 013,518 |+ -000,898 | + -004,850 |+ °004,105 +:004,396 |— -013,699 | 3 4 |— 002,101 |+ 006,221 | — :000,139 | + 000,750 |— -001,154 — 006,439 + -002,861 | 4 5 | —:005,604 |+ 005,000 | + -000,990 |+ 004,640 |+ °002,794 |— 003,648 |—-004,172} 5 6 |—-001,988 |+ 003,329 | — 000,060 | + -000,004 | — 000,933 | — 002,379 |-+-002,028| 6 = = ae | — = 0 |—:008,303 | — :030,904 | — 000,708 |+-010,696 |+°007,125 + °015,347 |+°006,746 | 0 1 |+ 016,433 | + 029,090 |+ 000,925 |+ 005,709 + :006,193 + :019,931 + °013,899 | 1 2 | +:003,809 | + 000,998 | — 000,068 |+ 000,704 + 000,010 | — 002,055 | — 003,399 | 2 3 | —°015,502 |+°014,897 |+ ‘000,989 |+ 005,344 |+4 -904,523 |-+-004,844 |— 015,096 | 3 4 + :001,087 | — 003,220 | + 000,072 | — 000,388 |+ 000,597 |-+-003,332 |— 001,481 | 4 5 |—004,744 |+ -004,233 | + 000,838 | + -003,928 |+ 002,365 | — 003,088 |—*003,532 | 5 6 | + 003,592 |-- 006,016 | + -000,109 |— 000,007 | + °001,686 | + -004,299 |— -003,664) 6 0 |—-007,749 | — 028,843 | — -000,661 |+-009,983 + -006,650 + °014,324 |+°006,297 | 0 1 |+°023,811 |+-042,152 | + ‘001,340 |+ 008,272 + °008,974 |+°028,881 |+-020,140} 1 2 |—-014,636 | — 003,835 |-+ -000,261 | — -002,704 — -000,039 | + °007,896 |+°013,057 | 2 3 | —:010,657 | +-010,241 |+ -000,680 |+ 003,674 |+-003,110 | + -003,330 |— 010,378 | 3 4 | +4 °004,372 | —-012,946 |+ :000,289 |—-001,561 + :002,401 | + °013,399 |— 005,954) 4 5 |+ 000,442 | —-000,394 | — -000,078 | — 000,366 — -000,220 | + °000,288 | + °000,329 | 5 6 + 006,335 — -010,608 | + °000,192 | — 000,012 + °002,973 | + 007,580 
| — 006,461 | 6 | | | 0 | — 003,242 | — 012,067 | — ‘000,276 }+ 004,177 +°002,782 | + °005,993 |-+ 002,634] 0 1 |+°015,673 |+ 027,746 |+ 000,882 |+ 005,445 + -005,907 | +-019,010 |+-013,257] 1 2 |—-025,614 | — 006,712 |+ 000,457 | — 004,733 — -000,068 | + -013,818 |+ "022,851 ] 2 3 |+°013,663 | — 013,130 |— -000,872 | — 004,711 = 003,987 -- 004,270 |+ °013,305 | 3 4 |—:000,037 |+ -000,111 | — -000,002 + -000,013 | — -000,020 | — 000,114 |+°000,051 | 4 5 |+ 003,569 | — -003,184 | — 000,631 | — -002,955 | — 001,779 | + -002,323 | + 002,657 | 5 6 |— 002,893 | + :004,845 | — 000,088 | + -000,005 | — :001,358 — :003,462 |+-002,951] 6 — i) Biometrika xiv 146 On Polychoric Coefficients of Correlation TABLE X. Values of Sst Yelp. | s’ p | ja | BS s=3 s—4 i) s=6 s=7 p Sf 0 |~-002,716 —-024,296 | — 019,919 |— -013,581 | — 005,206 | — 007,621 |- :002,113] 0 1 |= °013,578 |— -050,535 | — 001,157 | + 017,491 |+-011,650 |-+-025,097 |+ 011,032] 1 2 —+023,244 + -001,049 | + 041,494 | + 019,288 | + -000,298 | — 018,826 | — 020,060] 2 1 | 3 | —-013,520 |+ 038,284 |-+ -001,489 | — 020,306 | — -010,442 | — 008,529 /+ 013,024] 3 | 1 4 — -000,364 | + -004,567 | — 007,904 |— 002,108 |-+ -001,391 | + 005,282 |— 000,864 | 4 5 — 002,999 | + -008,296 | + 000,596 |— -007,320 | — 002,654 | + 001,795 |+ 002,285 | 5 6 = -003,074 |+ -008,906 | -011,032 — “000,956. + -003,290 | + 006,016 |— 003,149 6 | 0 —-010,399 | — -093,014 | — 076-260 | — 051,995 | — 019,931 | — 029,175 | — 008,088 | 0 1 | ~*025,191 | — 093,756 | ~ -002,147 | + 032,451 | + 021,614 |+ 046,563 |+ "020,467 | 1 2 | —:007,378 | + -000,333 |+°013,171 | + -006,123 | +-000,095 | — 005,976 | - 006,367] 2 2 | 3 +:012,689 | — -035,930 | — 001,397 | + -019,057 | + 009,800 | + 008,004 |—-012,223] 3.] 
2 4 4°001,044 — -013,114 | + -022,696 + -006,054 | — -003,993 | — 015,167 |+*002,480| 4 5 |+ 002,467 —-006,824 | — 000,490 | + 006,021 | + 002,183 | - 001,477 |— 001,880 | 5 6 + °005,229 — -015,149 | + 018,767 + -001,627 | — ‘005,596 | - 010,234 |+ 005,357] 6 0 |—+000,603 | — -005,392 | — 004,421 |— -003,014 | —-001,155 | - 001,691 |— 000,469 | 0 1 |= "001,055 | — 003,927 | — -000,090 | + 001,359 |+-000,905 |-+ -001,950 |+ 000,857 | 1 2 |-+ 001,029 | — -000,046 | — :001,837 | — 000,854 | — -000,013 | + 000,833 |+ 000,888 | 2 3 | 3 |4+:001,143 | —-003,237 | — -000,126 |+ -001,717 |+ 000,883 |+ 000,721 |— 001,101] 3 | 3 4 |—-000,052 | + 000,650 |— -001,125 | — -000,300 | + -000,198 | + “000,752 |— 000,123 | 4 5 |+:000,736 |—-002,035 | — 000,146 | + -001,795 | + 000,651 | — 000,440 |— 000,561 | 5 6 | —*000,209 | + -000,605 | — -000;749 | — 000,065 | + 000,223 |+ -000,408 |— -000,214 | 6 0 | + 002,426 | + -021,699 | + 017,790 | + -012,130 | + 004,650 | + 006,806 |+ -001,887 | 0 1 |= 002,758 | — -010,265 | — -000,235 | + 003,553 | + 002,366 | + -005,098 |+ 002,241] 1 2 | —-003,451 | + -000,156 | +-006,161 |+-002,864 | + 000,044 | — -002,795 |— 002,979 | 2 4 | 3 |+-002,772 | —-007,849 | — 000,305 |+ 004,163 |+ 002,141 | + -001,749 |— 002,670] 3 | 4 4 | +000,134 | — -001,677 | + -002,902 | +-000,774 | — ‘000,511 | — 001,939 |+ -000,317| 4 5 |+ 001,648 |— -004,560 | — 000,328 | + 004,023 |+ 001,459 | — 000,987 | 001,256 | 5 6 | +000,342 | — -000,992 |-+ °001,229 + -000,107 | — :000,366 | — 000,670 |+ 000,351 | 6 0 |+°003,318 | + -029,682 | + 024,335 | + °016,592 | + -006,360 | + -009,310 |+-002,581 | 0 1 |— 006,504 | — -024,207 | — 000,554 | + 008,379 | + 005,581 | + 012,022 |+ 005,284 | 1 2 |— 001,254 |+ -000,057 |-+-002,239 | + -001,041 | + -000,016 | — -001,016 |— -001,082 |} 2 5 | 3 |+ 005,252 |—-014,872 — -000,578 | + 007,888 | + 004,056 | + 003,313 |— 005,059] 3 | 5 4 |— 000,150 |+-001,883 | — -003,259 | — 000,869 | + 000,573 | + “002,178 |—-000,356 | 4 5 | + 002,383 
|—-006,592 | — 000,474 |+-005,816 | -+-002,109 | — -001,427 |- 001,816 | 5 6 |—*001,457 | + -004,222 |— -005,230 |— 000,453 + 001,560 | + 002,852 | — 001,493 | 6 | ae |e 2 — Q |+°004,809 +4 -043,011 | +°035,264 + -024,044 + -009,217 |+ -013,491 |+-003,740 | 0 1 |— 014,659 | — 054,559 | — 001,249 |-+ 018,884 + -012,578 | + 027,096 |+ 011,910} 1 2 |+ 009,127 | — -000,412 | -- -016,293 | — 007,574 — 000,117 |+ 007,392 |+-007,876 | 2 6 | 3 |4 005,299 | —-015,004 | — 000,583 | + -007,958 + 004,092 | + -003,342 |— 005,104) 3 | 6 4 |—:000,872 + -010,946 | — -018,945 | — -005,054 + 003,333 | + ‘012,661 |— 002,070} 4 5 |—-000,656 + -001,815 | + 000,130 — 001,602 —-000,581 | + 000,393 |+ 000,500 | 5 6 | — 003,758 + -010,889 | — 013,490 | - 001,169 + 004,023 |-+ :007,356 |— 003,851 | 6 0 |+-003,165 + -028,310 | + -023,211 | + -015,825 + -006,066 | + 008,880 |+ 002,462 | 0 1 |—:015,334 |— -057,071 | — 001,307 |+ 019,753 | + 013,157 | + 028,344 |+ 012,459 | 1 2 |+-025,172 — -001,136 | — -044,936 | — -020,888 | — -000,323 | + 020,387 |+ 021,724 | 2 7 | 3 |—-013,635 + -038,608 |+-001,501 | — -020,478 | —-010,531 | — -008,601 |+ 013,134] 3 | 7 4 + 000,259 — -003,256 |+-005,635 |-+-001,503 —-000,991 |— -003,766 |+ 000,616 | 4 5 |--003,579 + -009,899 | + 000,711 | — 008,734 | — -003,167 | + “002,142 | + 002,727 | 5 6 |+-002,927 — -008,480 | +-010,505 |-+ -000,911 | — -003,133 | - 005,729 | + 002,999 | 6 KARL PEARSON AND Econ S. PEARSON 147 We shall proceed to calculate A, for three values of 7 which lie near the probable value of r as found from each column. We will take these as °45, °50 and ‘55; from these values we shall obtain h,. for each column from (xiv) and inter- polating the real f,. between them find the corresponding columnar r, which will be then substituted in (xv) by aid of Table X to obtain the columnar mean /;,.. Table XI gives the values of Ay, for r=°"45, 50 and ‘55, and Table XII the resulting values of he : TABLE XI. Values of Axy for r='45, 50 and ‘55. 
r s=l1 s=2 ==) s=4 SO) s=6 s=7 (a) |—-014,742,01 | — 020,616,58 —°000,400,18 +000,974,70 + ‘000,377,97 | + :000,409,54 | + -000,044,95 (b) |—-017,038,00 | —-021,554,08 | — -000,383,88 + -000,731,59 | + -000,258,31 | +-000,246,06 | +-000,015,95 (c) |—-019,587,49 | — -022,373,22 | — -000,356,40 +-000,518,95 | +-000,168,60 | + -000,136,31 | — -000,003,29 (a) |—-043,836,23 | — -133,792,74 | — -003,545,25 + -021,169,83 | + -010,795,76 | +-015,963,45 | +-003,258,21 b) | —-045,046,53 | —-140,080,50 | —-003,775,08 | +-019,705,30 | +:009,504,62 | +-012,993,41 | + -002,268,84 ’ ) Oy a] | (ce) |—*045,869,07 | — 146,859,24 | — -004,026,98 | +°018,090,53 | + ‘008, 150,14 | + -010,141,50 | + :001,500,21 (a) |-—:014,665,26 | — -082,820,10 | — 002,204,29 | + °030,133,27 | + °018,460,19 | + °033,767,07 | 4+--009,791,12 (b) | - -012,764,50 | —-082,030,73 | — 002,275,94 | +-030,479,17 | +:018,233,88 +-031,794,16 | +-008, 188,80 (ce) |—-010,711,23 |—-080,956,49 | —-002,360,54 | + -030,869,67 | +-017,938,38 + -029,455,85 | + -006,551,72 eee x el ee = eae (a) |—-003,450,60 | — -028,237,21 | —-090,587,66 | +-017,032,32 | +-011,713,27 + -024,762,35 | + -009,092,34 (b) | —-002,645,25 | — -026,280,92 | — 000,528.69 +-017,630,94 | +012,084,98 + -024,998,89 | + 008,434.25 (c) |—-001,920,26 | —-024,106,94 | — -000,459,56 | +-018,316,53 | +°012,492,17 + -025,139,70 | + -007,601,99 (a) | —-001,562,59 | — -016,357,80 | —-000,196,08 | + :013,951,10 | +-010,408,16 | -+-024,456,57 | + -010,780,30 b) |—-001,096,19 —-014,410,34 | —-000,106,48 | + 014,492.89 | +-010,926,94 | + 025,583,17 | +:010,698,56 ] ’ ) 7 Sj ? | (ec) |— 000,731,638 | —-012,372,26 | — -000,003,49 | +°015,100,01 | +°011,507,01 feb ipl o2 + °010,435,95 (a) |—-000,728,92 | — -010,344,20 | +-000,068,82 | -+013,421,77 | -+-011,082,88 | +-029,840,53 | +-016,766,62 b) | —-000,448,58 — -008,432,81 + -000,177,88 | +:013,793,06 | +-011,705,64 | +°032,119,62 |+-017,871,20 ) b] ? 
> t | irs | (e) | —°000,255,73 —:006,613,75 | + -000,295,92 + °014,164,31 + °012,382,26 + 034,601,52 | + °018,889,99 ts | =| S — =| | a (a) |—-000,090,63 | — -002,150,92 | + -000,121,52 + -005,185,57 | 4+-005,018,14 | +-016,965,98 + -014,515,02 (6) | —-000,037,11 — -001,530,11 +-000,149,03 | +-005,035,92 | + (005, 142,06 + -018,430,12 | +-016,770,70 (ec) |—-000,000,75 | — -001,037,56 | +-000,167,89 + :004,808,83 -+005,217,99 | + -019,928,67 | +019,271,47 | | | TABLE XII. Values of hy. for Columns. r Sa c= 2 Ste sa=4 s=5 | s=6 s—i) “45 — 2°19299 |— -92114 | — 02549 | + -56650 | + ‘98336 |4+1°45054 | +2°30569 50 —2°18505 | — °91848 | — 02538 |+ °56616 | + °98315 +1°44900 |+2°29880 515) — 2°17567 | — 91485 | — 02521 |+°56580 | + 98286 | + 1°44685 | 4+ 2°28999 = | =€ | Actual Ibe — 2°19667 |—-91404 | — 02553 |+ °56594 | + 98333 |+1°44723 |42°29464 | BESIES Os 4229 | +5599 | -4167 | -5309 | -4585 5426 5249 | Interpolated r | 3 : ; Si hae 10—2 148 On Polychoric Coefficients of Correlation We have thus the values of r found from each column*. We now turn to Table X and calculate in exactly the same way the values of Neg = ete Ty ere tay Ly ee. a PE et Sy lias ow eee, for the r peculiar to each column for that column. We thus obtain Table XIII. TABLE XIII. Values of r’sy for r of each Vertical Column. 
3’ Sil s=2 s=3 s=4 s=5 s=6 s=7 1 — ‘013,707,54 | — :044,362,34 | — -013,376,98 | — 002,394,74 | —-000,770,04 | — :000,212,73 | — 000,006, 10 2 | —'021,315,40 | —°153,841,07 | — (074,192,36 | — -029,418,02 | — :009,240,63 | — :006,036,02 | — ‘000, 641,41 3 | — 000,771,58 | — 008,202,77 | — -004,826,27 | — 002,225,87 | — ‘000,633,67 | — -000,217,60 |+ -000,030,11 | 4 | +:000,880,64 | + °014,176,58 | + °018,829,61 | + °015,680,02 | + °005,954,01 | + °008,797,09 |+ -001,837,83 5 |+:000,759,52 | +°013,488,41 | + °024,318,82 | + -022,680,28 | + :009,395,75 | 4+ -016,257,72 |+ -004,194,22 6 | +:000,584,54 | -+°011,211,78 | + :031,232,08 | + -032,630,32 | + °015,526,73 | + :032,207,15|+ ‘O01 1,205,66 7 |+:000,127,47 | + 002,739,883 | + °015,206,16 | + -017,131,65 | 4--010,878,46 | + 028,515,80 | + 017,104,71 lege | — 963,72 — 507,66 — ‘010,29 + °303,04 + 425,95 +°789,11 + 1:238,75 The values in Table XIII divided by 7i,./nsy from Table XIV and summed for each column give, on multiplication by N/n,., the i. of the last row of the table. To obtain Table XIV we must return to Equation (x), use the appropriate r for the column and the values in Table III of 3,7, Sy, 7,’.. Taking o, and o, as units of the horizontal and vertical variates we can plot /,. in Table XIII to h;. from Table XII and so obtain the regression line as formed by the means of each column, and set against it the regression lines as found from polychoric 7, = ‘5034, or ‘5204. TABLE XIV. She i Values of —* for columnar Values of r. 
Ns Sf s=1 s=2 s=3 s=4 s=5 s=6 s=7 1 | 1-481,160 | -918,247 | 881,901 | ee ‘366,258 ee) ee 2 | +850,270 | -995,746 | -946,290 | 1:319,975 | 1°351,776 | 1:261,565 oo 3 | -910,673 | 1:075,087 | 1-090,803 | 830,945 | 853,291 | -876,250 | 1-668,232 4 | 1:846,116 | 1:016,776 | 1-:066,432 | 856,640 | 855,012 | 1:248,823 | -600,417 | 5 2 | :866,469 | 1-:028,801 | -992,395 | -968,288 1°018,112 | -937,952 6 2 | 980,885 | 887,677 | 1-263,099 | 1-619,040 | -803,917 | -999,089 a a0 454,171 808,851 | 1°368,315 848,918 1°316,691 | 1:074,517 * The mean value of » weighted with the column totals is (i.e, within the probable error of) the results on p. 142. +5022 which is in reasonable accord with “ KARL PEARSON AND Econ S. PEARSON 149 This is done in Diagram I. But what we actually desire is to compare the obser- vations and the regression lines as given by the present polychoric method with those obtained by product-moment methods. Stature of Son in Inches. es ee eS SES -2 | oO +) +2 Stature of Father in Inches. Diagram I. Our actual data from which the table on p. 135 was obtained are given in Table XV. The following are the values of the constants in inches : Mean Stature of Father: % = 67-878. Mean Stature of Son: 47 = 68'"845. Standard Deviation of Father: o, = 2’"6576. Standard Deviation of Son: — a, = 26885. Correlation of Father and Son: 7 =°5189 + :0160. In Diagram II the regression line (slope, 5245) with means of the arrays as dark circles is given. Against this we have put as hollow circles the values of h;. and k,. multiplied by their respective s.p.’s to indicate the result as worked out in the present paper. The closeness of the polychoric coefficient 5204 and the product-moment coefficient does not permit of two regression lines being drawn. It will be seen that the fit to the observations by use of broad categories and the polychoric method is really quite as satisfactory as the fit by the product- moment method. 
But the amount of arithmetical work is incomparably greater by the former, even if it be less than Ritchie-Scott’s process with 49 cells would be. Accordingly we now proceeded to investigate the extent to which approxi- mations shortening the arithmetic would introduce serious error. The first question to be answered is: To what extent in finding the means k;. of the arrays is it needful to use the actual value of the correlation coefficient as found for each column? In order to test this we proceeded to find the k.. for each columnar ‘TI Weiser “sayouy ur way. fo aunqn1g NOS 40 NVIN *sayouy ur uog fo aunjynzy On Polychoric Coefficients of Correlation SHLiVS JO NVIW Stature of Son. Kart PEARSON AND Econ S. PEARSON 151 TABLE XV. Correlation of Stature in 1000 pairs, Father and Son. Stature of Father. TiS eae eee i [ees cae acti leet. elpastadile cea sital BlSlelefelsef{elslelsef;sese;e|/s}ye fe] s © | | | |S |S | | | S| & | | & | H | & | & | & | & | Totals yee eSellaee | saee ess iP ss] ime | 5. || se | So foe & Hl hl & |i] & ote) ie) | S Ne) ie) ie) Ne} Ne) Ne) ie) ton ~ to ~ | ~~ NN Oo eal ee | | | ae 2 60875 | ae mers ee ea ee n Bie ae =|}, I Na he ei VS A Ua a easy ee 6 Cs pee Oe aah Sale Po Sl Ole bse Pee ee) ee 90 Coat We ciel | 6h S|) Oo | 9 | 8 | 1 heart, ero 64875-12131 3] 5] 11 | AICO, We | ede | ale epee ae en eh a 70 uso |) de) 2 6 | “9 | 10/20) 17)15| 7) 6 ey) ey! 66"°875—| 1|—| 6| 4] 11] 24] 21 | 28}10}/ 12] 7] 4] 1 129 67"875—| — | 2 | 2] 7] 9 | 20| 16 | 33 | 27 | 26) 20] 13]; 6| —|—j|—]|—|] 181 CoO oe | alt 1s) 4.) py 19/13 | Jo | 92 | 96 | 24.) 6) 9/2! 
1 |—| —F 125 69 s76— | — | — | — | — | 6] 11) 15/18} 18 | 23) 18/13) 4) 4/1 | 1 | —7 131 10" 875— | eRe Aun oe Lon ilsalbe sei |e) Wale eed = 80 VS pe Wes Potala yay | Gnleeor Okay Wasi tT (aw Sky 57 ee eo Oo 18h deh ONS On Te ey a Oe 36 VE Sis) | SS) SS SS a al by |) abu Sieh yy) ae Se 21 Wiycey 5 |) — ee ee rhe eee Se) OM one sar eo Pope fee | SS 8 i pee Se | ee gy | ee Pf 1 GO”"S7G= || = | SS he SSS SS SS SS SS SSS SS SS SS a a eS 2 CPO || a a a ne es ee ene ee ee ee 3 Oe) | SS Sh | |} iad ea (San ea ey a 1 Totals | 7 36 | 63 | 109 fant 139 | 125 1000 array for the same correlation coefficient, and we took for the value of that coefficient ‘5000, somewhat under the value found by either polychoric coefficient. Table XVI gives our results. It involved finding a new series of values for N'sv, but those for figy/nsy have already been computed under (b) in Table IV. The results are given in terms of inches. TABLE XVI. Columnar Means by Different Processes. ks. X Ty kg, X Oy Ie. X Cy | 8 hs. X Oy Common base | Hach column | Each column | Each column its own r for r=-50 assumed Normal 1 —5'8379 —2°5881 — 26498 — 2°4809 Wy) 2 — 2°4292 — 1°3633 — 1°3531 —1°4357 BY 3 — ‘0678 — ‘0276 — ‘0176 — ‘O701 By 4 +1°5040 + ‘8138 + ‘8087 + °7632 3h 5 +2°6133 | +1°1439 +1°1511 +1:0866 344’ 6 .+3°8462 | 42:°1192 +2°1122 +2°1744 4! 4-5’ ai +6:°0982 +3°'3267 +3:°3194 +3:°1109 5’ 152 On Polychoric Coefficients of Correlation An examination of the fourth column of Table XVI shows us that we have not for practical purposes seriously modified the columnar means by using 7 = °50 instead of the individual value for each column. This is illustrated in Diagram III, where except in the case of the first array there is hardly daylight between the two series of points. Stature of Son in Inches. Stature of Father in Inches. Diagram ITI. In Diagram III the hollow circles give the means with 7 obtained for each column, the nearly superposed dark circles the means with r= ‘5000. 
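The comparison drawn from Table XVI can be checked with a few lines of arithmetic. The following is only an illustrative sketch in Python, not part of the original computation: the two lists are the third and fourth columns of Table XVI as transcribed here (columnar means in inches, with each column's own r and with r = .50), and the variable names are our own.

```python
# Columnar means in inches, transcribed from Table XVI (assumed correct as read).
means_own_r = [-2.5881, -1.3633, -0.0276, 0.8138, 1.1439, 2.1192, 3.3267]
means_r_50 = [-2.6498, -1.3531, -0.0176, 0.8087, 1.1511, 2.1122, 3.3194]

# Absolute difference between the two series for each columnar array s = 1..7.
diffs = [abs(a - b) for a, b in zip(means_own_r, means_r_50)]

for s, d in enumerate(diffs, start=1):
    print(f"s={s}: |difference| = {d:.4f} in.")
```

On these figures the first array stands apart from the rest, which is the sense in which only the first pair of points is visibly separated in Diagram III.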
The solution of the problem therefore falls back on Equations (x), (xvit) and (xv). We should still have to calculate $,7,, 35 T »’, Ss Tp and 9; T,,’, but we should only need the three series of products 35 Tp Sy Ty’, 3s Tp Sy Tp and 95 Tp Sy Ty’, and to obtain k,. it would be adequate to use a value of r for which fisy/n.y had been found for the final interpolation. Still this involves very lengthy arithmetic, and we naturally crave for a still easier process. The present full working out of a numerical example enables us for the first time. really to test the adequacy of an easier method of dealing with such polychoric tables which has been long in use as an approximate method in the Biometric Laboratory. (6) It is clear that if we could find the means of the columnar arrays, we could readily obtain the correlation and the regression line by aid of the correlation ratio corrected for class index. The whole problem accordingly turns on a ready means of reaching—at any rate—an approximate value of the mean of a columnar array. This array is the slice between two parallel planes of a normal correlation surface, In the ease of a surface of zero correlation Z= Dye BAY KARL PEARSON AND Econ 8. PEARSON 153 the slice between XY, and _X, has for its volume on dY Xa _4Xx2q? ei ee | g Tle ax ge ae the slice is therefore given by the normal curve : : —4Y?/a? Ordinate =const. xe ” ioe It seems therefore not unreasonable after the surface of revolution is stretched and slid into a correlation surface to assume the slice to be still approximately a normal curve. Unfortunately the determination of the best mean and standard deviation for normal material given in broad categories does not admit of very easy solution. What we need is the difference between the means of a columnar array and of a marginal frequency as a multiple of the standard deviation of the latter. 
We shall obtain results differing more or less from each other according to the individual broad category we take as the basis of comparison between $\sigma_s$, the standard deviation of the $s$th slice, and $\sigma_y$, the standard deviation of the marginal frequency. In fact the range of any broad category or of any combination of broad categories, except the tail categories, can be made a means of linking up $\sigma_s$ and $\sigma_y$. A little experience, however, shows (a) that it is undesirable to find the $\sigma_s$ of any array from a category of small frequency, and (b) that for arrays of small total frequency symmetrical tripartite divisions as far as feasible are the best*. The last column in Table XVI shows the system selected for each of our columnar arrays. Take, for example, $s = 5$; the columnar array may be taken on the base of the 3′ and 4′ categories as in the first column of figures below, and compared with the second column as the corresponding marginal distribution:

    Categories        Columnar array    Marginal distribution
    1′ + 2′                  9                  335
    3′ + 4′                 36                  421
    5′ + 6′ + 7′            24                  244
    Totals                  69                 1000

The corresponding proportional frequencies up to the dichotomic planes are .1304 and .6521 for the columnar array, and .3350 and .7560 for the marginal distribution. The distances of the mean† from the two dichotomic planes in the first case are $-1.1245\,\sigma_s$ and $+.3910\,\sigma_s$, and in the second case $-.4261\,\sigma_y$ and $+.6935\,\sigma_y$, where $\sigma_s$ is the standard deviation of the normal curve assumed to represent the columnar array 5. Accordingly the range of the 3′ + 4′ categories $= 1.5155\,\sigma_s = 1.1196\,\sigma_y$, which gives $\sigma_s$ in terms of $\sigma_y$.

* The probable error of a standard deviation found in this way is discussed in Biometrika, Vol. XII. p. 129.
† Found from the Probability Integral Table.

Hence the distance between the means is

$$.6935\,\sigma_y - .3910\,\sigma_s = \{.6935 - .3910 \times 1.1196/1.5155\}\,\sigma_y = .4046\,\sigma_y = 1.0866 \text{ inches},$$

if we introduce the value of $\sigma_y$. This and the corresponding values are recorded in the fifth column of Table XVI. It will be seen that these values approximate to those in the third column, the greatest differences being in the small first and last arrays.
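The worked case for s = 5 can be sketched in code. What follows is a minimal illustration in Python, not the procedure as carried out in the Laboratory: it assumes the array and the marginal total are each grouped into the same three broad classes, uses the standard library's normal quantile function in place of the Probability Integral Table, and the function name `column_mean_offset` is our own.

```python
from statistics import NormalDist

def column_mean_offset(column_counts, marginal_counts):
    """Distance of the columnar mean from the marginal mean, in units of sigma_y.

    Both arguments are three grouped frequencies (lower, middle, upper) cut by
    the same two dichotomic planes. Each distribution is assumed normal.
    Returns (offset in sigma_y units, sigma_s / sigma_y).
    """
    std = NormalDist()  # unit normal curve
    n_col, n_mar = sum(column_counts), sum(marginal_counts)

    # Distances of the two planes from each mean, in that curve's own s.d.
    lo_col = std.inv_cdf(column_counts[0] / n_col)
    hi_col = std.inv_cdf((column_counts[0] + column_counts[1]) / n_col)
    lo_mar = std.inv_cdf(marginal_counts[0] / n_mar)
    hi_mar = std.inv_cdf((marginal_counts[0] + marginal_counts[1]) / n_mar)

    # The middle class has one physical range: (hi - lo) * sigma is the same
    # length for both curves, which fixes sigma_s in terms of sigma_y.
    sd_ratio = (hi_mar - lo_mar) / (hi_col - lo_col)

    # Upper plane measured from the marginal mean, less the same plane
    # measured from the columnar mean, both in sigma_y units.
    offset = hi_mar - hi_col * sd_ratio
    return offset, sd_ratio

offset, sd_ratio = column_mean_offset([9, 36, 24], [335, 421, 244])
print(f"mean offset = {offset:.4f} sigma_y, sigma_s/sigma_y = {sd_ratio:.4f}")
```

Because the s.d. ratio is fixed from the middle class, the lower plane gives the same offset as the upper one; a different choice of middle class would in general give a somewhat different answer, which is the dependence on the basis of comparison noted above.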
Of course in actually working with material solely given in broad categories we use the value 4046, treating o, as our unit of measurement. The means of the columnar arrays can be found with great ease and with considerable approximation by this method. If we now proceed to take the mean of our means duly weighted with their frequencies, we find it to. be —-0510,—not a very serious divergence from zero. However, we subtract it* from the means in the fifth column of Table XVI, multiply the squares of the remainders by the corresponding frequencies, sum and divide by the square of «,. Thus we obtain 1:'818,8034 Pe = D5 ? = 7-211.9103 5210144, or: n = 502148. If we divide by the class index correlation of the #-variate, 1.e. ‘962,3297, we obtain n = "5218, which correlation ratio we may take to be the correlation coefficient and compare with our polychoric coefticient 5204 (p. 142). Clearly although our means as found by the hypothesis of normal] distribution of the columnar arrays agree only approxi- mately with the polychoric means of the third column of Table XVI, they he practically on the same regression line, as Diagram IV indicates. We conclude, therefore, that in this case as probably in many like cases, it 1s quite adequate to obtain the means of the columnar arrays by treating them as normal distributions, then determining their correlation ratio and correcting it for the class index. The corresponding regression line with the means of the columnar arrays indicated will be for many purposes an adequate graph showing the general nature of the correlation. The general purpose of this paper has now been fulfilled; it has been shown how a general polychoric coefficient covering all the data provided in a given contingency table may be found, and how a graph may be drawn representing such a table effectively. At the same time such a process is very laborious and probably will not be lightly undertaken or only in cases of grave uncertainty. 
The method is one of fitting the “best” normal surface to the data subject to the limitation that the marginal totals are exactly reproduced, and this limits the generality. An example has been given of the process, but it is seen from this example that the heavy arithmetic does not lead us to any more accurate value for the correlation than far simpler methods. Thus:

    Correlation from product-moment = .5189 ± .0160.
    Polychoric Correlation Coefficient “Best Fit” = .5034.
    Polychoric Correlation Coefficient “Product Moment” = .5204.
    Mean Square Contingency, Corrected for Class Indices = .5179.
    Correlation Ratio from means of arrays = .5218.

The latter method, which has been long in use in the Biometric Laboratory, is thus, when used with due precaution, seen to be justified by the theoretically preferable polychoric method. If a method could be discovered of finding uniquely the mean of a columnar array, using all its cells at the same time, this method would still more effectively replace the polychoric correlation coefficient.

[Diagram IV. Stature of Son in Inches against Stature of Father in Inches, with the mean of son and the mean of father marked.]

* Correlation ratio without subtraction = .5222.
† See p. 142.

ON EXPANSIONS IN TETRACHORIC FUNCTIONS.

By JAMES HENDERSON, M.A., B.Sc.

(1) We define the tetrachoric function of order $s$ to be $\tau_s(x)$, where

$$\tau_s(x) = \frac{1}{\sqrt{s!}}\left(-\frac{d}{dx}\right)^{s-1}\frac{e^{-\frac{1}{2}x^2}}{\sqrt{2\pi}} \qquad \text{(i)}.$$

Other writers have adopted various other values for the external numerical factor but this is immaterial.
The factor $1/\sqrt{s!}$ was chosen because it gives an extremely simple expression for the volume of a quadrant of the normal bivariate frequency surface, and because for tabulating the numerical values of the functions it is necessary to have some reduction factor of this kind to keep them of manageable size. We can usually drop the argument $x$ and speak of $\tau_s$. The values of $\tau_s$ for $s = 1$ up to $s = 6$ are tabled to five decimal places in the book, Tables for Statisticians and Biometricians*, for values of $\frac{1}{2}(1-\alpha)$ (which is really $\tau_0$ when the argument is negative) from .000 to .500 at intervals of .001. With a different multiplier they have been tabled by Charlier† to four decimal places only for $s = 1$, 4 and 5 ($x = .00$ to 3).

The general form of the tetrachoric function of order $s$ is

$$\tau_s(x) = \frac{1}{\sqrt{s!}}\left\{x^{s-1} - \frac{(s-1)(s-2)}{2}x^{s-3} + \frac{(s-1)(s-2)(s-3)(s-4)}{2\cdot 4}x^{s-5} - \text{etc.}\right\}\frac{1}{\sqrt{2\pi}}e^{-\frac{1}{2}x^2} \qquad \text{(ii)},$$

that is, the ordinate of the normal curve of errors multiplied by a polynomial of degree $(s-1)$. $\tau_1$ is simply the ordinate of the normal curve, while $\tau_0$ is the area of the tail of the normal curve up to a given abscissa $x$, with the addition of an arbitrary constant. This constant may be so selected that

$$\tau_0 = \int_{-\infty}^{x}\frac{1}{\sqrt{2\pi}}e^{-\frac{1}{2}x^2}\,dx,$$

and will be found from the tables of the probability integral. It will be equal to $\frac{1}{2}(1+\alpha)$ if $x$ is positive and $\frac{1}{2}(1-\alpha)$ if $x$ be negative, in the usual notation. Accordingly the expansion of a function of $x$, $f(x)$, in a series of tetrachoric functions is really the expansion of the difference of the function and a multiple of the probability integral in terms of

$$\left\{c_0 + c_1\frac{x}{\sigma} + c_2\frac{x^2}{\sigma^2} + \dots\right\}e^{-\frac{1}{2}x^2/\sigma^2},$$

where $\sigma$ and $c_0, c_1, c_2, \dots$ are at our choice.

* Cambridge University Press, p. 1, and pp. 42-51.
† Vorlesungen über die Grundzüge der mathematischen Statistik, 1920.

The real reason for adopting $c_0'\tau_0 + c_1'\tau_1 + c_2'\tau_2 + c_3'\tau_3 + \dots$, instead of the above expression, is that the calculation of the constants $c_0', c_1', c_2', \dots$
is more direct than that of ¢, c,, c.... because the tetrachoric functions are semi- orthogonal functions*. It will be seen that the problem of expansion in tetra- choric functions is closely related to a theorem of Laplace. If U be a unimodal function of « within the range under discussion and the integral [= | Uda be required, Laplace transfers to the mode m as origin so that «= m+ & and writes U in the following form : U = Uy), e-* (1 +a, + a,&+...). He extends the limits to 2% in both directions by supposing U = 0 outside the given range and in the integration apples the well-known values of | Be wide, Le. Ah Coife'o) zero if s be odd, and again if s be even (= 27), | ele? ger dE — (Qr —1) (Qr—3)...8.1.V2r oF, —0o ' It will be seen that Laplace is really proceeding by expansion in tetrachoric functions as the process is precisely the same whatever be the limits of the integral of U. Following Laplace we develop our function in “incomplete normal moment . ee vw gS e-2? Ae : ; sfube! functions, ie. TS du+; it is better to use tetrachoric functions. The series in —a 297 tetrachoric functions seems to converge slightly better than that in incomplete normal moment functions. If we have FF (ae) = Gy y+ Oy Ta Oe Ty ce Ogi Ts tee: then, assuming we may integrate the right-hand side of this equation term by term (Le. assuming uniform convergence) between w and o , U a 9 [Fw di= Oni5 a eh eal V2 VB since | i : T.dx = 7 BSS eA aan ar RNR GD O85 50 (ii1). * A series of functions fj (x), fo (x) ... fy (v) ... fy (wv) is orthogonal if fr (x) fy (v) dv =0 when s and s’ are not equal, the integration being throughout the range. They are semi-orthogonal if iE (x) fy (a) 7) (x) dx =0, ¢ (x) being a function of x peculiar to the series. In other words a. system is orthogonal if the sums of the products of different order functions vanish without weighting for xz. A system is semi-orthogonal if we require to weight the values of x to obtain the vanishing of the product sum. 
This weighting is the great disadvantage of semi-orthogonal functions. In our case of the tetrachoric functions the weighting factor is $e^{\frac{1}{2}x^2}$, or the tails of series are excessively weighted.

† Discussed Biometrika, Vol. VI. p. 59. Tables of these functions up to $s = 10$ are given in Tables for Statisticians, pp. 22-3.

Let

$$\tau_s = \frac{p_{s-1}}{\sqrt{s!}}\,\frac{1}{\sqrt{2\pi}}e^{-\frac{1}{2}x^2},$$

where $p_{s-1}$ is the polynomial in $x$ of degree $(s-1)$ in (ii). Let $\tau_{s'}$ be another tetrachoric function and suppose $s'$ is greater than $s$. Then

$$\int_{-\infty}^{\infty}\tau_s\tau_{s'}e^{\frac{1}{2}x^2}\,dx = \frac{1}{\sqrt{s!}\sqrt{s'!}\sqrt{2\pi}}\int_{-\infty}^{\infty}p_{s-1}\left(-\frac{d}{dx}\right)^{s'-1}\frac{e^{-\frac{1}{2}x^2}}{\sqrt{2\pi}}\,dx.$$

Now since $\tau_s(\infty)$ and $\tau_s(-\infty)$ will always be zero owing to the exponential factor ($s > 0$), we can integrate by parts, transferring the $\frac{d}{dx}$ from the exponential to the polynomial; therefore

$$\int_{-\infty}^{\infty}\tau_s\tau_{s'}e^{\frac{1}{2}x^2}\,dx = \frac{1}{\sqrt{s!}\sqrt{s'!}\sqrt{2\pi}}\int_{-\infty}^{\infty}\frac{dp_{s-1}}{dx}\left(-\frac{d}{dx}\right)^{s'-2}\frac{e^{-\frac{1}{2}x^2}}{\sqrt{2\pi}}\,dx.$$

The integrated part at every step vanishes at the limits, and ultimately

$$\int_{-\infty}^{\infty}\tau_s\tau_{s'}e^{\frac{1}{2}x^2}\,dx = \frac{1}{\sqrt{s!}\sqrt{s'!}\sqrt{2\pi}}\int_{-\infty}^{\infty}\frac{d^{\,s'-1}p_{s-1}}{dx^{\,s'-1}}\,\frac{e^{-\frac{1}{2}x^2}}{\sqrt{2\pi}}\,dx.$$

Since $p_{s-1}$ is a polynomial of degree $(s-1)$ and $s'$ is $> s$, the differential of the polynomial vanishes, i.e.

$$\int_{-\infty}^{\infty}\tau_s\tau_{s'}e^{\frac{1}{2}x^2}\,dx = 0 \qquad \text{(iv)}.$$

If $s' = s$ then the differential of $p_{s-1}$ reduces to $(s-1)!$, so that

$$\int_{-\infty}^{\infty}\tau_s^2\,e^{\frac{1}{2}x^2}\,dx = \frac{1}{s\sqrt{2\pi}} \qquad \text{(v)}.$$

These equations (iv) and (v), which give the fundamental properties of the tetrachoric functions, enable us to expand any function $F(x)$ in terms of tetrachoric functions if we can find the value of the integral

$$\int_{-\infty}^{\infty}F(x)\,\tau_s\,e^{\frac{1}{2}x^2}\,dx = \frac{1}{\sqrt{2\pi}\sqrt{s!}}\int_{-\infty}^{\infty}F(x)\,p_{s-1}\,dx \qquad \text{(vi)}.$$

Since $p_{s-1}$ is an integral function of $x$, this amounts to saying that we can expand any function of which we are able to determine the successive moment-coefficients. The practical value of the functional expansion when obtained is, however, a very different matter.
That depends on the convergency of the series and our experience has shown us that in the most common cases the convergency is so slight or non-existent as to render the expansion idle.

The matter is a very important one, for Thiele*, Edgeworth† and Charlier‡ have proposed to treat skew frequency distributions by a process which amounts to the same thing as the expansion by tetrachoric functions. An attempt made many years ago§ to expand Incomplete Γ- and B-functions by Laplace’s method in Incomplete Moment Functions convinced Professor Pearson that little was to be gained by a series expansion in the form of a polynomial multiplied by the ordinate of a normal curve. A variant of this method, that of expressing Incomplete Γ- and B-functions in a series of tetrachoric functions, was tried a year ago and it was found that except for a small distance round the mode this method of expressing a frequency distribution was quite ineffectual.

The matter is of considerable importance because quite recently a Scandinavian actuary in America‖ has been analysing mortality curves by tetrachoric functions and asserts not only that they give a good fit but apparently believes that each function of the series has some natural physiological meaning! It is quite possible to represent the survivors of 100,000 persons born in the same year of life by a Fourier’s series from 0 to 100 years but one would hardly claim any special physiological significance for the individual periodic terms¶. Such a series however is far easier to deal with in later treatment, such as differencing, than a series in tetrachoric functions.

For the numerical calculation of the tetrachoric functions the difference equation of these functions is invaluable, i.e.

$$\tau_s = x\,\beta_s\,\tau_{s-1} - \gamma_s\,\tau_{s-2},$$

where $x$ is the argument of the functions and

$$\beta_s = \frac{1}{\sqrt{s}}, \qquad \gamma_s = \frac{s-2}{\sqrt{s(s-1)}}.$$

Tables of $\beta_s$ and $\gamma_s$ are given in Tables for Statisticians (p.
1 of introduction) to five decimal places for $s = 7$ to $s = 24$ (the first six tetrachoric functions being given on pp. 42-51) and in Biometrika, Vol. XIV. p. 130 to 7 decimal places. For our work $\beta_s$ and $\gamma_s$ were required to 7 places (sometimes to 8) to obtain the requisite accuracy. The procedure consists in calculating $\tau_1$, which is equal to $\frac{1}{\sqrt{2\pi}}e^{-\frac{1}{2}x^2}$, directly to the required degree of accuracy, and then by means of the tables referred to above the higher tetrachoric functions are obtained in rapid succession on the machine for a given value of the argument. In the testing of our tetrachoric series seven-place accuracy was aimed at, so that it was necessary to calculate $\tau_1$ to eight places, which was done with the help of Vega’s ten-figure logarithms.

* Forelaesninger over Almindelig Iagttagelseslaere, Kjøbenhavn, 1889.
† Royal Soc. Proc. Vol. LVI. p. 271, and in many papers, Journal of R. Statistical Society.
‡ Vorlesungen über die Grundzüge der mathematischen Statistik (Hamburg, 1920), p. 67.
§ Biometrika, Vol. VI. p. 68, 1908.
‖ Arne Fisher, Casualty Actuarial and Statistical Society of America, Proceedings, Vol. IV. Part I. No. 9.
¶ A normal curve, for example, is quite adequately represented by two or three periodic terms; see Phil. Trans. Vol. CLXXXVI. A, p. 355, 1895.

(2) It is well known that a wide range of frequency distributions can be adequately represented by one or other of the curves

$$\left.\begin{aligned} y &= y_0\,e^{-\gamma x}\left(1+\frac{x}{a}\right)^{\gamma a} &&\text{(a)}\\ y &= y_0\left(1+\frac{x}{a_1}\right)^{m_1-1}\left(1-\frac{x}{a_2}\right)^{m_2-1} &&\text{(b)} \end{aligned}\right\} \qquad \text{(vii)}.$$

By a change of origin and the appropriate stretch or squeeze these may be reduced to

$$\left.\begin{aligned} y &= y_0\,x^{p-1}e^{-x} &&\text{(a)}\\ y &= y_0\,x^{m_1-1}(1-x)^{m_2-1} &&\text{(b)} \end{aligned}\right\} \qquad \text{(vii) bis}.$$

Now, generally, it is not the ordinates of these curves which are required but the areas of certain portions, or in other words the probability integrals of these skew curves.
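The difference equation of section (1), with $\beta_s = 1/\sqrt{s}$ and $\gamma_s = (s-2)/\sqrt{s(s-1)}$, lends itself to a short routine that takes the place of the tables of $\beta_s$ and $\gamma_s$. The sketch below is an illustration in Python under that reading of the recurrence, not a reproduction of the machine procedure; the function name `tetrachoric` is our own, and the check against the closed form uses the polynomial of (ii) for s = 4.

```python
import math

def tetrachoric(x, s_max):
    """Return [tau_1(x), ..., tau_{s_max}(x)] by the difference equation
    tau_s = x * beta_s * tau_{s-1} - gamma_s * tau_{s-2}."""
    # tau_1 is the ordinate of the normal curve, computed directly.
    tau_1 = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    taus = [tau_1]
    if s_max >= 2:
        # gamma_2 = 0, so tau_2 needs only tau_1.
        taus.append(x * tau_1 / math.sqrt(2.0))
    for s in range(3, s_max + 1):
        beta = 1.0 / math.sqrt(s)
        gamma = (s - 2) / math.sqrt(s * (s - 1))
        taus.append(x * beta * taus[-1] - gamma * taus[-2])
    return taus

x = 1.5
taus = tetrachoric(x, 6)

# Closed form for s = 4 from the general expression (ii):
# the polynomial is x^3 - 3x and the external factor is 1/sqrt(4!).
tau_4_direct = (x**3 - 3.0 * x) * taus[0] / math.sqrt(math.factorial(4))
print(taus[3], tau_4_direct)
```

In floating point the two printed values should agree to within rounding error; for high orders and large arguments the recurrence accumulates error, which is presumably why the tabled constants were carried to seven or eight places.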
The total range for (vil) bis(a) is 0 to o and for (b) is 0 to 1; since [ Cee dail Gp) Jo 1 and i gett a) dz Bim, m,), 0 we may take these probability integrals to be T( : | : 1 a) = —— uP e—” du oy P(p)Jo 7 1 B(m,, mz) - which are the ratios of the incomplete to the complete ['- and B-functions. . and Bw, m,, m,) = [ gm-1(] — y)m-t dy, 0 The equations on p. 158 show us that if either of the frequency functions (vil) is expressible in a series of tetrachoric functions their probability integrals (assuming convergence) will also be. Now there is no doubt that a large mass of material does not differ practically from the forms in (vil) and accordingly if the above probability integrals cannot be adequately expressed in a series of tetrachoric functions, we may be certain that tetrachoric functions do not furnish a suitable method of representing skew frequency. Accordingly our problem reduces itself to the following one: Can J (p,v) and B(v, m,, m.), or the Incomplete T- and B-functions, be represented with adequate convergency by a series of tetrachoric functions / After examination of the numerical and graphical results obtained, we are obliged to conclude that the answer to this question is in the negative. (3) Let us first consider the expansion in tetrachoric functions of the function Be CP (OP) ae ase siacoan ins deme tiarennt (vill). In expanding this expression there are at least two methods, which we ought to consider, and one may have advantages over the other as far as convergency is Biometrika xiv iil 162 On Expansions in Tetrachoric Functions concerned. It may be expanded with regard: (i) to the mean and the standard deviation, or (11) to the mode in the manner of Laplace*. (i) The mean of the function (viii) is easily found to be at «=p, the mode is at # =p —1 and the standard deviation is Vp. Referring to the mean as origin the function becomes a (E+ pes e— (E+P) y = mum e- @p d gE Let y=¢(- D) Woz , where D= FE and z= ae RI eA ok: (x). 
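Except for a numerical factor the right-hand side is a series of tetrachoric functions. Let
$$\phi(-D) = c_0 - c_1 D + c_2 D^2 - \ldots + (-1)^s c_s D^s + \ldots$$
The function $\phi(-D)$ has to be determined, i.e. we require to find the successive $c$'s:
$$y = c_0\tau_1 + c_1\sqrt{2!}\,\tau_2 + c_2\sqrt{3!}\,\tau_3 + \ldots + c_s\sqrt{(s+1)!}\,\tau_{s+1} + \ldots \tag{xi}$$

To determine the $c$'s. With the origin at the mean the function $y$ must be taken as zero from $-\infty$ to $-p$, while from $-p$ to $+\infty$ it is given by (ix). The $c$'s will be obtained most easily by multiplying both sides of (x) by $e^{\theta\xi}$ and equating the coefficients of powers of $\theta$ on both sides of the equation, i.e. we make all the moments of the two expressions for the curve the same, for the coefficient of $\theta^s$ on either side is the $s$th moment†. Thus
$$\int_{-\infty}^{\infty} y\,e^{\theta\xi}\,d\xi = \int_{-\infty}^{\infty} e^{\theta\xi}\,\phi(-D)\,\frac{1}{\sqrt{2\pi}}e^{-\frac12 z^2}\,d\xi;$$
but $y = 0$ from $\xi = -\infty$ to $-p$. Now $x = p + \xi$ and $z = \xi/\sqrt{p}$. Thus
$$\int_0^\infty e^{\theta(x-p)}\,\frac{x^{p-1}e^{-x}}{\Gamma(p)}\,dx = \sqrt{p}\int_{-\infty}^{\infty} e^{\theta\sqrt{p}\,z}\,\phi(-D)\,\frac{1}{\sqrt{2\pi}}e^{-\frac12 z^2}\,dz.$$
The left-hand side is equal to
$$e^{-p\theta}\int_0^\infty \frac{x^{p-1}e^{-x(1-\theta)}}{\Gamma(p)}\,dx = \frac{e^{-p\theta}}{(1-\theta)^p}\int_0^\infty \frac{u^{p-1}e^{-u}}{\Gamma(p)}\,du = e^{-p\theta}(1-\theta)^{-p},$$
on putting $x(1-\theta) = u$. Accordingly
$$e^{-p\theta}(1-\theta)^{-p} = \sqrt{p}\int_{-\infty}^{\infty} e^{\theta\sqrt{p}\,z}\,\phi(-D)\,\frac{1}{\sqrt{2\pi}}e^{-\frac12 z^2}\,dz.$$

* Laplace's method is really an expansion in incomplete normal moment functions but as we have seen (p. 158) these may be replaced by tetrachoric functions.
† We owe this elegant method of determining the $c$'s to Mr H. E. Soper. Originally the $c$'s were determined by use of the fundamental property of the tetrachoric functions but that method, while leading to the same result, is more laborious.

To find the value of the integral on the right-hand side, consider the term $c_s(-D)^s$ in the function $\phi(-D)$. Its contribution to the integral is
$$c_s\int_{-\infty}^{\infty} e^{\theta' z}\Bigl(-\frac{d}{dz}\Bigr)^{s}\frac{1}{\sqrt{2\pi}}e^{-\frac12 z^2}\,dz, \quad\text{where } \theta' = \theta\sqrt{p}.$$
On integrating by parts the term between limits vanishes owing to the factor $e^{-\frac12 z^2}$. Hence the integral
$$= c_s\theta'\int_{-\infty}^{\infty} e^{\theta' z}\Bigl(-\frac{d}{dz}\Bigr)^{s-1}\frac{1}{\sqrt{2\pi}}e^{-\frac12 z^2}\,dz,$$
and ultimately
$$= c_s\theta'^{\,s}\int_{-\infty}^{\infty} e^{\theta' z}\,\frac{1}{\sqrt{2\pi}}e^{-\frac12 z^2}\,dz = c_s\theta'^{\,s}e^{\frac12\theta'^2}.$$
Therefore the whole integral on the right is $\phi(\theta')\,e^{\frac12\theta'^2}$, i.e. $e^{-p\theta}(1-\theta)^{-p} = \sqrt{p}\,\phi(\sqrt{p}\,\theta)\,e^{\frac12 p\theta^2}$, or
$$\sqrt{p}\,\phi(\sqrt{p}\,\theta) = e^{-p\theta - \frac12 p\theta^2}(1-\theta)^{-p},$$
and
$$\phi(\sqrt{p}\,\theta) = c_0 + c_1(\sqrt{p}\,\theta) + c_2(\sqrt{p}\,\theta)^2 + \ldots + c_s(\sqrt{p}\,\theta)^s + \ldots = c_0' + c_1'\theta + c_2'\theta^2 + \ldots + c_s'\theta^s + \ldots,$$
where $c_s' = c_s(\sqrt{p})^s$ or $c_s = c_s'/(\sqrt{p})^s$. Now
$$e^{-p\theta-\frac12 p\theta^2}(1-\theta)^{-p} = e^{-p\theta-\frac12 p\theta^2 - p\log(1-\theta)} = e^{\frac13 p\theta^3 + \frac14 p\theta^4 + \frac15 p\theta^5 + \ldots} = b_0 + b_1\theta + b_2\theta^2 + b_3\theta^3 + b_4\theta^4 + \ldots,$$
where $b_0 = 1$, $b_1 = b_2 = 0$, $b_3 = \frac13 p$, $b_4 = \frac14 p$, $b_5 = \frac15 p$, $b_6 = \frac16 p + \frac12\bigl(\frac13 p\bigr)^2 = \frac1{18}p(p+3)$, etc. But $\sqrt{p}\,c_s' = b_s$, therefore
$$c_0 = \frac{1}{\sqrt{p}}, \quad c_1 = c_2 = 0, \quad c_3 = \frac{1}{3p}, \quad c_4 = \frac{1}{4p\sqrt{p}}, \quad c_5 = \frac{1}{5p^2}, \text{ etc.}$$

For numerical purposes these coefficients are much more usefully obtained in the following way: Let
$$e^{-p\theta-\frac12 p\theta^2}(1-\theta)^{-p} = b_0 + b_1\theta + b_2\theta^2 + \ldots$$
Take the differential of the logarithms of both sides;
$$\Bigl(-p - p\theta + \frac{p}{1-\theta}\Bigr)(b_0 + b_1\theta + b_2\theta^2 + \ldots + b_s\theta^s + \ldots) = b_1 + 2b_2\theta + \ldots + s\,b_s\theta^{s-1} + \ldots,$$
i.e.
$$p\theta^2(b_0 + b_1\theta + \ldots + b_s\theta^s + \ldots) = (1-\theta)(b_1 + 2b_2\theta + \ldots + s\,b_s\theta^{s-1} + \ldots).$$
Equating coefficients of $\theta^s$ we have $p\,b_{s-2} = (s+1)\,b_{s+1} - s\,b_s$, i.e.
$$b_{s+1} = \frac{1}{s+1}\{s\,b_s + p\,b_{s-2}\} \tag{xii}$$
By this difference formula successive $b$'s can be found very quickly if $b_0$, $b_1$, $b_2$ are known and we have already found these. Now $c_s = (\sqrt{p})^{-s}c_s' = (\sqrt{p})^{-s}\,b_s\,(\sqrt{p})^{-1} = b_s(\sqrt{p})^{-(s+1)}$, or $b_s = (\sqrt{p})^{s+1}c_s$. Substituting in (xii)
$$(\sqrt{p})^{s+2}c_{s+1} = \frac{1}{s+1}\{s(\sqrt{p})^{s+1}c_s + p(\sqrt{p})^{s-1}c_{s-2}\},$$
or
$$c_{s+1} = \frac{1}{\sqrt{p}\,(s+1)}\{s\,c_s + c_{s-2}\} \tag{xiii}$$
This formula gives us very readily the coefficients of $\phi(-D)$ and thus the expansion is obtained. We had, Equation (xi),
$$y = c_0\tau_1 + c_1\sqrt{2!}\,\tau_2 + c_2\sqrt{3!}\,\tau_3 + \ldots + c_s\sqrt{(s+1)!}\,\tau_{s+1} + \ldots$$
and all the $c$'s are known since $c_0 = 1/\sqrt{p}$, $c_1 = c_2 = 0$.

To find the area under the curve (xi) up to abscissa $x$, remembering that the left-hand side is zero from $\xi = -\infty$ to $-p$,
$$\int_{-p}^{\xi}\frac{(\xi+p)^{p-1}e^{-(\xi+p)}}{\Gamma(p)}\,d\xi = \sqrt{p}\int_{-\infty}^{z}\phi(-D)\,\frac{1}{\sqrt{2\pi}}e^{-\frac12 z^2}\,dz,$$
i.e.
$$\int_0^x\frac{x^{p-1}e^{-x}}{\Gamma(p)}\,dx = \sqrt{p}\int_{-\infty}^{z}\bigl(c_0\tau_1 + c_1\sqrt{2!}\,\tau_2 + \ldots + c_s\sqrt{(s+1)!}\,\tau_{s+1} + \ldots\bigr)\,dz.$$
Now
$$\int_{-\infty}^{z}\tau_{s+1}\,dz = -\frac{1}{\sqrt{s+1}}\,\tau_s,$$
therefore
$$\int_0^x\frac{x^{p-1}e^{-x}}{\Gamma(p)}\,dx = \tfrac12(1+\alpha_z) - \sqrt{p}\,\{c_1\tau_1 + c_2\sqrt{2!}\,\tau_2 + \ldots + c_s\sqrt{s!}\,\tau_s + \ldots\}.$$
Therefore finally
$$\int_0^x\frac{x^{p-1}e^{-x}}{\Gamma(p)}\,dx = \tfrac12(1+\alpha_z) - a_3\tau_3 - a_4\tau_4 - \ldots - a_s\tau_s - \ldots \tag{xiv}$$
(since $c_1 = c_2 = 0$) where $a_s = \sqrt{p}\,c_s\sqrt{s!}$. Now $c_{s+1} = \frac{1}{\sqrt{p}\,(s+1)}\{s\,c_s + c_{s-2}\}$ from equation (xiii), i.e.
$$\frac{a_{s+1}}{\sqrt{p}\sqrt{(s+1)!}} = \frac{1}{\sqrt{p}\,(s+1)}\Bigl\{\frac{s\,a_s}{\sqrt{p}\sqrt{s!}} + \frac{a_{s-2}}{\sqrt{p}\sqrt{(s-2)!}}\Bigr\},$$
therefore
$$a_{s+1} = \frac{1}{\sqrt{p}\,(s+1)}\bigl\{s\sqrt{s+1}\;a_s + \sqrt{(s+1)\,s\,(s-1)}\;a_{s-2}\bigr\} = \frac{1}{\sqrt{p\,(s+1)}}\bigl\{s\,a_s + \sqrt{s(s-1)}\;a_{s-2}\bigr\} \tag{xv}$$
where $a_0 = 1$, $a_1 = a_2 = 0$.

The argument of $\tfrac12(1+\alpha_z)$ and of the tetrachoric functions is $\xi/\sqrt{p}$, which equals $(x-p)/\sqrt{p} = z$, say. Since the terms $\tau_1$ and $\tau_2$ do not appear one might hope that only a few terms of the expansion (xiv) would be required to obtain a sufficiently accurate result. $\tfrac12(1+\alpha_z)$ is the ordinary probability integral at $z$. Note that if $x$ is less than $p$, i.e. $z$ is negative, $\tfrac12(1-\alpha_z)$ must be used instead of $\tfrac12(1+\alpha_z)$ and the tetrachoric functions of even order must be taken of opposite sign to those for positive $z$ such as are given in the tables. The odd order functions are the same for positive and negative $z$:
$$\tau_{2s}(z) = -\tau_{2s}(-z), \qquad \tau_{2s+1}(z) = \tau_{2s+1}(-z).$$
Obviously we could get the area of any portion of the curve between $x = x_1$ and $x = x_2$ by subtracting two expressions like (xiv) for $z_1$ and $z_2$.

The general expression for $x^{p-1}e^{-x}/\Gamma(p)$ is
$$\begin{aligned}
\frac{x^{p-1}e^{-x}}{\Gamma(p)} = {} & \frac{1}{\sqrt{p}}\tau_1 + \frac{1}{p}\frac{\sqrt{4!}}{3}\tau_4 + \frac{1}{p\sqrt{p}}\frac{\sqrt{5!}}{4}\tau_5 + \frac{1}{p^2}\frac{\sqrt{6!}}{5}\tau_6 + \frac{1}{p^2\sqrt{p}}\frac{\sqrt{7!}}{6}\Bigl(\frac{p}{3}+1\Bigr)\tau_7 \\
& + \frac{1}{p^3}\frac{\sqrt{8!}}{7}\Bigl(\frac{7p}{12}+1\Bigr)\tau_8 + \frac{1}{p^3\sqrt{p}}\frac{\sqrt{9!}}{8}\Bigl(\frac{47p}{60}+1\Bigr)\tau_9 + \frac{1}{p^4}\frac{\sqrt{10!}}{9}\Bigl(\frac{p^2}{18}+\frac{19p}{20}+1\Bigr)\tau_{10} \\
& + \frac{1}{p^4\sqrt{p}}\frac{\sqrt{11!}}{10}\Bigl(\frac{5p^2}{36}+\frac{153p}{140}+1\Bigr)\tau_{11} + \frac{1}{p^5}\frac{\sqrt{12!}}{11}\Bigl(\frac{341p^2}{1440}+\frac{341p}{280}+1\Bigr)\tau_{12} \\
& + \frac{1}{p^5\sqrt{p}}\frac{\sqrt{13!}}{12}\Bigl(\frac{p^3}{162}+\frac{493p^2}{1440}+\frac{3349p}{2520}+1\Bigr)\tau_{13} + \ldots
\end{aligned}$$
and
$$\begin{aligned}
\int_0^x\frac{x^{p-1}e^{-x}}{\Gamma(p)}\,dx = {} & \tfrac12(1+\alpha_z) - \frac{1}{\sqrt{p}}\frac{\sqrt{4!}}{3\sqrt4}\tau_3 - \frac{1}{p}\frac{\sqrt{5!}}{4\sqrt5}\tau_4 - \frac{1}{p\sqrt{p}}\frac{\sqrt{6!}}{5\sqrt6}\tau_5 - \frac{1}{p^2}\frac{\sqrt{7!}}{6\sqrt7}\Bigl(\frac{p}{3}+1\Bigr)\tau_6 \\
& - \frac{1}{p^2\sqrt{p}}\frac{\sqrt{8!}}{7\sqrt8}\Bigl(\frac{7p}{12}+1\Bigr)\tau_7 - \frac{1}{p^3}\frac{\sqrt{9!}}{8\sqrt9}\Bigl(\frac{47p}{60}+1\Bigr)\tau_8 - \frac{1}{p^3\sqrt{p}}\frac{\sqrt{10!}}{9\sqrt{10}}\Bigl(\frac{p^2}{18}+\frac{19p}{20}+1\Bigr)\tau_9 \\
& - \frac{1}{p^4}\frac{\sqrt{11!}}{10\sqrt{11}}\Bigl(\frac{5p^2}{36}+\frac{153p}{140}+1\Bigr)\tau_{10} - \frac{1}{p^4\sqrt{p}}\frac{\sqrt{12!}}{11\sqrt{12}}\Bigl(\frac{341p^2}{1440}+\frac{341p}{280}+1\Bigr)\tau_{11} \\
& - \frac{1}{p^5}\frac{\sqrt{13!}}{12\sqrt{13}}\Bigl(\frac{p^3}{162}+\frac{493p^2}{1440}+\frac{3349p}{2520}+1\Bigr)\tau_{12} - \ldots \\
= {} & \tfrac12(1+\alpha_z) - \frac{0.81649658}{\sqrt{p}}\tau_3 - \frac{1.22474487}{p}\tau_4 - \frac{2.19089023}{p\sqrt{p}}\tau_5 - \frac{1.49071198}{p^2}(p+3)\,\tau_6 \\
& - \frac{0.84515425}{p^2\sqrt{p}}(7p+12)\,\tau_7 - \frac{0.41833001}{p^3}(47p+60)\,\tau_8 - \frac{0.37184890}{p^3\sqrt{p}}(10p^2+171p+180)\,\tau_9 \\
& - \frac{0.15118579}{p^4}(175p^2+1377p+1260)\,\tau_{10} - \frac{0.05698029}{p^4\sqrt{p}}(2387p^2+12276p+10080)\,\tau_{11} \\
& - \frac{0.02010408}{p^5}(560p^3+31059p^2+120564p+90720)\,\tau_{12} - \ldots
\end{aligned}$$

(ii) Laplacian Form of Expansion. This is an expansion with regard to the mode or maximum ordinate as origin. The mode of $y = x^{p-1}e^{-x}/\Gamma(p)$ is at $x = (p-1)$, so that it will be easier to deal with $y$ in the form
$$y = \frac{x^{p'}e^{-x}}{\Gamma(p'+1)}, \quad\text{where } p' = (p-1).$$
Let $x = p' + \xi$, i.e. take the mode as origin. Then as before we require to find $\phi(-D)$ so that
$$\frac{(p'+\xi)^{p'}e^{-(p'+\xi)}}{\Gamma(p'+1)} = \phi(-D)\,\frac{e^{-\xi^2/2p'}}{\sqrt{2\pi}\sqrt{p'}} \tag{xvi}$$
where $D = \dfrac{d}{dz}$ and $z = \dfrac{\xi}{\sqrt{p'}}$. The introduction of $\sqrt{p'}$ in the denominator simplifies the integration a little. Proceeding as before:
$$\int_{-p'}^{\infty}e^{\theta\xi}\,\frac{(p'+\xi)^{p'}e^{-(p'+\xi)}}{\Gamma(p'+1)}\,d\xi = \int_{-\infty}^{\infty}e^{\theta\xi}\,\phi(-D)\,\frac{e^{-\xi^2/2p'}}{\sqrt{2\pi}\sqrt{p'}}\,d\xi,$$
i.e.
$$\int_0^\infty e^{\theta(x-p')}\,\frac{x^{p'}e^{-x}}{\Gamma(p'+1)}\,dx = \int_{-\infty}^{\infty}e^{\theta\sqrt{p'}\,z}\,\phi(-D)\,\frac{1}{\sqrt{2\pi}}e^{-\frac12 z^2}\,dz,$$
and
$$e^{-p'\theta}(1-\theta)^{-(p'+1)} = \phi(\theta\sqrt{p'})\,e^{\frac12 p'\theta^2},$$
therefore
$$\phi(\theta\sqrt{p'}) = e^{-p'\theta-\frac12 p'\theta^2}(1-\theta)^{-(p'+1)} \tag{xvii}$$
Now if $\phi(-D) = c_0 - c_1D + c_2D^2 + \ldots + (-1)^s c_sD^s + \ldots$,
$$\phi(\theta\sqrt{p'}) = c_0 + c_1(\theta\sqrt{p'}) + c_2(\theta\sqrt{p'})^2 + \ldots + c_s(\theta\sqrt{p'})^s + \ldots = c_0' + c_1'\theta + c_2'\theta^2 + \ldots + c_s'\theta^s + \ldots,$$
where $c_s' = c_s(\sqrt{p'})^s$ or $c_s = c_s'/(\sqrt{p'})^s$, and
$$e^{-p'\theta-\frac12 p'\theta^2}(1-\theta)^{-(p'+1)} = e^{-p'\theta-\frac12 p'\theta^2-(p'+1)\log(1-\theta)} = e^{\theta+\frac12\theta^2+(p'+1)\left(\frac13\theta^3+\frac14\theta^4+\ldots\right)} = c_0' + c_1'\theta + c_2'\theta^2 + \ldots,$$
where $c_0' = c_1' = c_2' = 1$, $c_3' = \frac13 p' + 1$, and generally by differentiating
$$c_s' = c_{s-1}' + \frac{p'}{s}\,c_{s-3}', \quad\text{or}\quad c_s(\sqrt{p'})^s = c_{s-1}(\sqrt{p'})^{s-1} + \frac{p'}{s}\,c_{s-3}(\sqrt{p'})^{s-3},$$
thus
$$c_s = \frac{1}{\sqrt{p'}}\Bigl\{c_{s-1} + \frac{1}{s}\,c_{s-3}\Bigr\} \tag{xviii}$$
where $c_0 = 1$, $c_1 = \dfrac{1}{\sqrt{p'}}$, $c_2 = \dfrac{1}{p'}$. Therefore
$$\frac{(p'+\xi)^{p'}e^{-(p'+\xi)}}{\Gamma(p'+1)} = \{c_0 - c_1D + c_2D^2 - \ldots (-1)^s c_sD^s - \ldots\}\,\frac{e^{-\xi^2/2p'}}{\sqrt{2\pi}\sqrt{p'}} = \frac{1}{\sqrt{p'}}\bigl\{c_0\tau_1 + \sqrt{2!}\,c_1\tau_2 + \sqrt{3!}\,c_2\tau_3 + \ldots + \sqrt{(s+1)!}\,c_s\tau_{s+1} + \ldots\bigr\}.$$
To find the area up to abscissa $x$ we have
$$\int_{-p'}^{\xi}\frac{(p'+\xi)^{p'}e^{-(p'+\xi)}}{\Gamma(p'+1)}\,d\xi = \int_{-\infty}^{z}\bigl\{c_0\tau_1 + \sqrt{2!}\,c_1\tau_2 + \ldots + \sqrt{(s+1)!}\,c_s\tau_{s+1} + \ldots\bigr\}\,dz = \tfrac12(1+\alpha_z) - c_1\tau_1 - \sqrt{2!}\,c_2\tau_2 - \sqrt{3!}\,c_3\tau_3 - \ldots - \sqrt{s!}\,c_s\tau_s - \ldots,$$
as $c_0 = 1$, i.e.
$$\int_0^x\frac{x^{p'}e^{-x}}{\Gamma(p'+1)}\,dx = \tfrac12(1+\alpha_z) - a_1'\tau_1 - a_2'\tau_2 - a_3'\tau_3 - \ldots - a_s'\tau_s - \ldots,$$
where $a_s' = c_s\sqrt{s!}$. Substituting in (xviii) to obtain the difference equation for the $a$'s we have
$$\frac{a_s'}{\sqrt{s!}} = \frac{1}{\sqrt{p'}}\Bigl\{\frac{a_{s-1}'}{\sqrt{(s-1)!}} + \frac{1}{s}\frac{a_{s-3}'}{\sqrt{(s-3)!}}\Bigr\},$$
therefore
$$a_s' = \frac{1}{\sqrt{p'\,s}}\bigl\{s\,a_{s-1}' + \sqrt{(s-1)(s-2)}\;a_{s-3}'\bigr\} \tag{xix}$$
and $a_0' = 1$, $a_1' = \dfrac{1}{\sqrt{p'}}$, $a_2' = \dfrac{\sqrt2}{p'}$. By this formula the $a$'s are readily obtained numerically. It is to be noted that in this case the terms in $\tau_1$ and $\tau_2$ do not vanish, as they did in the expansion from the mean. The argument of $\tfrac12(1+\alpha_z)$ and of the tetrachoric functions is $(x-p')/\sqrt{p'}$ and the remarks with regard to sign made above must be again observed.

Coefficients in the expansion from the mode:
$$\begin{aligned}
& a_0' = 1, \quad a_1' = \frac{1}{\sqrt{p'}}, \quad a_2' = \frac{\sqrt{2!}}{p'}, \quad a_3' = \frac{\sqrt{3!}}{p'\sqrt{p'}}\Bigl(\frac{p'}{3}+1\Bigr), \quad a_4' = \frac{\sqrt{4!}}{p'^2}\Bigl(\frac{7p'}{12}+1\Bigr), \quad a_5' = \frac{\sqrt{5!}}{p'^2\sqrt{p'}}\Bigl(\frac{47p'}{60}+1\Bigr), \\
& a_6' = \frac{\sqrt{6!}}{p'^3}\Bigl(\frac{p'^2}{18}+\frac{19p'}{20}+1\Bigr), \quad a_7' = \frac{\sqrt{7!}}{p'^3\sqrt{p'}}\Bigl(\frac{5p'^2}{36}+\frac{153p'}{140}+1\Bigr), \quad a_8' = \frac{\sqrt{8!}}{p'^4}\Bigl(\frac{341p'^2}{1440}+\frac{341p'}{280}+1\Bigr), \\
& a_9' = \frac{\sqrt{9!}}{p'^4\sqrt{p'}}\Bigl(\frac{p'^3}{162}+\frac{493p'^2}{1440}+\frac{3349p'}{2520}+1\Bigr), \ \ldots
\end{aligned}$$
We note that the coefficients of powers of $\theta$ in the functions
$$\phi(\theta) = e^{p\left(\frac13\theta^3+\frac14\theta^4+\ldots\right)} \quad\text{and}\quad \phi'(\theta) = e^{\theta+\frac12\theta^2+(p+1)\left(\frac13\theta^3+\frac14\theta^4+\ldots\right)}$$
(in the expansion from the mode we had $p'$ for $p$ in $\phi'(\theta)$) are closely related.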
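The expansion from the mean, equations (xiv) and (xv), lends itself to a short numerical check. The sketch below is an illustration under stated assumptions rather than the computation actually carried out for the tables: it takes $\tau_s(z) = He_{s-1}(z)\,\frac{1}{\sqrt{2\pi}}e^{-z^2/2}/\sqrt{s!}$ (with $He_n$ the Hermite polynomials generated by $He_{n+1} = z\,He_n - n\,He_{n-1}$), evaluates $\tfrac12(1+\alpha_z)$ through the error function, and uses the hypothetical names `tetrachoric`, `mean_coefficients` and `gamma_series`.

```python
import math


def tetrachoric(s, z):
    """tau_s(z) = He_{s-1}(z) * phi(z) / sqrt(s!), for s >= 1 and any sign of z."""
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    he_prev, he = 0.0, 1.0            # He_{-1} (unused placeholder), He_0
    for n in range(s - 1):            # climb from He_0 to He_{s-1}
        he_prev, he = he, z * he - n * he_prev
    return he * phi / math.sqrt(math.factorial(s))


def mean_coefficients(p, s_max):
    """a_0 .. a_{s_max} from the difference formula (xv); a_0 = 1, a_1 = a_2 = 0."""
    a = [1.0, 0.0, 0.0]
    for s in range(2, s_max):
        a.append((s * a[s] + math.sqrt(s * (s - 1)) * a[s - 2])
                 / math.sqrt(p * (s + 1)))
    return a[: s_max + 1]


def gamma_series(p, x, s_max):
    """Partial sums of (xiv): {s: value of the series up to the term in tau_s}."""
    z = (x - p) / math.sqrt(p)
    a = mean_coefficients(p, s_max)
    total = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))   # (1/2)(1 + alpha_z)
    sums = {}
    for s in range(3, s_max + 1):
        total -= a[s] * tetrachoric(s, z)
        sums[s] = total
    return sums


if __name__ == "__main__":
    for s, value in gamma_series(49.0, 42.0, 30).items():
        print(s, round(value, 7))
```

Because `tetrachoric` evaluates the Hermite polynomial directly, the sign rule for negative $z$ stated above is handled automatically and no separate table lookup is needed. For $p = 49$, $x = 42$ the printed partial sums can be compared with the last column of the table for that integral.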
Then if $c_n$ is the coefficient of $\theta^n$ in $\phi(\theta)$ and $c_n'$ is the coefficient of $\theta^n$ in $\phi'(\theta)$,
$$c_n' = \frac{n+3}{p}\,c_{n+3}.$$

(4) In the last expansion it might seem possible to get rid of the terms in $\tau_1$ and $\tau_2$ by breaking away from Laplace and expanding with regard to $e^{-\xi^2/2g}$ instead of $e^{-\xi^2/2p'}$; then choose $g$ to give us the desired result. In Laplace's form of the modal expansion the exponential term is $e^{\frac12\xi^2\left(\frac{d^2u}{dx^2}\right)_0}$, where $u = \log y$ and $\left(\frac{d^2u}{dx^2}\right)_0$ means the value of $\frac{d^2u}{dx^2}$ at the mode.
$$y = \frac{x^{p'}e^{-x}}{\Gamma(p'+1)}, \qquad \log_e y = p'\log_e x - x - \log_e\Gamma(p'+1), \qquad \frac{du}{dx} = \frac{p'}{x} - 1, \qquad \frac{d^2u}{dx^2} = -\frac{p'}{x^2},$$
therefore
$$\Bigl(\frac{d^2u}{dx^2}\Bigr)_{\text{Mode}} = -\frac{p'}{p'^2} = -\frac{1}{p'}.$$
If we take $y = \phi(-D)\,\dfrac{e^{-\frac12 z^2}}{\sqrt{2\pi g}}$, where $D = \dfrac{d}{dz}$ and $z = \dfrac{\xi}{\sqrt{g}}$, we have to find $g$, so that either the $\tau_1$ or $\tau_2$ term or both will vanish. By proceeding as before equation (xvii) becomes
$$\phi(\theta\sqrt{g}) = e^{-p'\theta-\frac12 g\theta^2}(1-\theta)^{-(p'+1)} = e^{-p'\theta-\frac12 g\theta^2-(p'+1)\log(1-\theta)}.$$
The term in $\tau_2$ will vanish if $g = p'+1$ which is the square of the standard-deviation from the mean, but $\tau_1$ will still be left. However, it does not seem likely that any advantage will be gained by departing from Laplace's form of the exponential term.

Having found the two expansions from the mean and the mode respectively we shall now proceed to examine the behaviour of the series by numerical calculation, but before doing so we shall endeavour to find a similar series for the Incomplete B-function.

(5) To expand $\displaystyle\int_0^x\frac{x^{p-1}(1-x)^{q-1}}{B(p,q)}\,dx$ in terms of tetrachoric functions about the mean.

The mean is at $x = p/(p+q)$. The standard deviation is
$$\sigma = \frac{\sqrt{pq}}{(p+q)\sqrt{p+q+1}}.$$
Take origin at the mean; then $x = p/(p+q) + \xi$. Let
$$\frac{x^{p-1}(1-x)^{q-1}}{B(p,q)} = \phi(-D)\,\frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac12 y^2} \tag{xx}$$
where $D = \dfrac{d}{dy}$, $y = \dfrac{\xi}{\sigma}$.

As in the case of the Incomplete $\Gamma$-function multiply each side by $e^{\theta\xi}$ and integrate. The limits of the integral on the left-hand side will be $x = 0$ and $x = 1$, as we take the value of the integral outside these limits to be zero. The $\xi$ limits will therefore be $-p/(p+q)$ for $x = 0$ and $q/(p+q)$ for $x = 1$. Then
$$\int_{-p/(p+q)}^{q/(p+q)}e^{\theta\xi}\,\frac{\{\xi+p/(p+q)\}^{p-1}\{1-(\xi+p/(p+q))\}^{q-1}}{B(p,q)}\,d\xi = \int_{-\infty}^{\infty}e^{\theta\sigma y}\,\phi(-D)\,\frac{1}{\sqrt{2\pi}}e^{-\frac12 y^2}\,dy,$$
i.e.
$$\int_0^1 e^{\theta\{x-p/(p+q)\}}\,\frac{x^{p-1}(1-x)^{q-1}}{B(p,q)}\,dx = \phi(\theta\sigma)\,e^{\frac12\theta^2\sigma^2},$$
and
$$e^{-\theta p/(p+q)}\int_0^1 e^{\theta x}\,\frac{x^{p-1}(1-x)^{q-1}}{B(p,q)}\,dx = \phi(\theta\sigma)\,e^{\frac12\sigma^2\theta^2} \tag{xxi}$$
Now
$$\begin{aligned}
\int_0^1 e^{\theta x}\,\frac{x^{p-1}(1-x)^{q-1}}{B(p,q)}\,dx &= \int_0^1\frac{x^{p-1}(1-x)^{q-1}}{B(p,q)}\Bigl\{1+\theta x+\frac{\theta^2x^2}{2!}+\frac{\theta^3x^3}{3!}+\ldots+\frac{\theta^sx^s}{s!}+\ldots\Bigr\}\,dx \\
&= 1+\theta\,\frac{B(p+1,q)}{B(p,q)}+\frac{\theta^2}{2!}\frac{B(p+2,q)}{B(p,q)}+\ldots+\frac{\theta^s}{s!}\frac{B(p+s,q)}{B(p,q)}+\ldots
\end{aligned}$$
But
$$\frac{B(p+s,q)}{B(p,q)} = \frac{p(p+1)\ldots(p+s-1)}{(p+q)(p+q+1)\ldots(p+q+s-1)},$$
therefore
$$\int_0^1 e^{\theta x}\,\frac{x^{p-1}(1-x)^{q-1}}{B(p,q)}\,dx = 1+\theta\,\frac{p}{p+q}+\frac{\theta^2}{2!}\frac{p(p+1)}{(p+q)(p+q+1)}+\ldots+\frac{\theta^s}{s!}\frac{p(p+1)\ldots(p+s-1)}{(p+q)(p+q+1)\ldots(p+q+s-1)}+\ldots$$
From equation (xxi)
$$\phi(\theta\sigma) = e^{-\theta p/(p+q)-\frac12\sigma^2\theta^2}\Bigl\{1+\theta\,\frac{p}{p+q}+\frac{\theta^2}{2!}\frac{p(p+1)}{(p+q)(p+q+1)}+\ldots+\frac{\theta^s}{s!}\frac{p(p+1)\ldots(p+s-1)}{(p+q)(p+q+1)\ldots(p+q+s-1)}+\ldots\Bigr\} \tag{xxii}$$
Let $\phi(-D) = a_0 - a_1D + a_2D^2 - \ldots(-1)^sa_sD^s + \ldots$,
$$\phi(\theta\sigma) = a_0 + a_1(\theta\sigma) + a_2(\theta\sigma)^2 + a_3(\theta\sigma)^3 + \ldots + a_s(\theta\sigma)^s + \ldots = c_0 + c_1\theta + c_2\theta^2 + \ldots + c_s\theta^s + \ldots,$$
where $c_s = a_s\sigma^s$. By equating coefficients of powers of $\theta$ in equation (xxii) the coefficients in $\phi(\theta\sigma)$ can be obtained in terms of $p$ and $q$, for
$$\sigma^2 = \frac{pq}{(p+q)^2(p+q+1)}.$$
Obviously $c_0 = 1$, $c_1 = -p/(p+q) + p/(p+q) = 0$,
$$c_2 = \frac12\Bigl\{\frac{p^2}{(p+q)^2}-\sigma^2\Bigr\} - \frac{p^2}{(p+q)^2} + \frac{1}{2!}\frac{p(p+1)}{(p+q)(p+q+1)} = \frac12\Bigl\{\frac{p(p+1)}{(p+q)(p+q+1)} - \frac{p^2}{(p+q)^2} - \frac{pq}{(p+q)^2(p+q+1)}\Bigr\} = 0.$$
Similarly the other $c$'s can be determined but the work becomes more and more laborious as we go on. Unfortunately, as far as the numerical work is concerned, we have failed after many attempts to find a relation connecting successive $c$'s, similar to that found in the case of the Incomplete $\Gamma$-function.

At first it was thought that the following treatment would facilitate the numerical calculation of these coefficients. Let
$$e^{-\theta p/(p+q)-\frac12\sigma^2\theta^2} = b_0 + b_1\theta + b_2\theta^2 + \ldots + b_s\theta^s + \ldots,$$
then
$$-\frac{p}{p+q}\,\theta - \tfrac12\sigma^2\theta^2 = \log_e\{b_0 + b_1\theta + b_2\theta^2 + \ldots + b_s\theta^s + \ldots\}.$$
Differentiate this and then equate coefficients of powers of $\theta$:
$$(b_0 + b_1\theta + b_2\theta^2 + \ldots + b_s\theta^s + \ldots)\bigl(-p/(p+q) - \sigma^2\theta\bigr) = b_1 + 2b_2\theta + 3b_3\theta^2 + \ldots + s\,b_s\theta^{s-1} + \ldots$$
Equate coefficients of $\theta^{s-1}$:
$$s\,b_s = -\frac{p}{p+q}\,b_{s-1} - \sigma^2b_{s-2};$$
therefore
$$b_s = -\frac{1}{s}\Bigl\{\frac{p}{p+q}\,b_{s-1} + \sigma^2b_{s-2}\Bigr\} \tag{xxiii}$$
This formula enables us to calculate the $b$'s very rapidly on the machine when $p/(p+q)$ and $\sigma^2$ have been determined. From equation (xxii)
$$c_0 + c_1\theta + c_2\theta^2 + \ldots + c_s\theta^s + \ldots = (b_0 + b_1\theta + b_2\theta^2 + \ldots)\Bigl\{1+\theta\,\frac{p}{p+q}+\frac{\theta^2}{2!}\frac{p(p+1)}{(p+q)(p+q+1)}+\ldots\Bigr\}.$$
Equate coefficients of $\theta^s$:
$$c_s = \frac{1}{s!}\frac{p(p+1)\ldots(p+s-1)}{(p+q)(p+q+1)\ldots(p+q+s-1)} + b_1\frac{1}{(s-1)!}\frac{p(p+1)\ldots(p+s-2)}{(p+q)(p+q+1)\ldots(p+q+s-2)} + \ldots + b_{s-1}\frac{p}{p+q} + b_s,$$
i.e.
$$c_s = \sum_{r=0}^{s}b_r\,\frac{1}{(s-r)!}\,\frac{p(p+1)\ldots(p+s-r-1)}{(p+q)(p+q+1)\ldots(p+q+s-r-1)}.$$
The $b$'s, having been calculated previously by (xxiii), this last formula gives a fairly rapid way of calculating the $c$'s, at least the earlier $c$'s. Then
$$a_s = \frac{1}{\sigma^s}\sum_{r=0}^{s}b_r\,\frac{1}{(s-r)!}\,\frac{p(p+1)\ldots(p+s-r-1)}{(p+q)(p+q+1)\ldots(p+q+s-r-1)} \tag{xxiv}$$
What we require generally is the area represented by $\displaystyle\int_0^x\frac{x^{p-1}(1-x)^{q-1}}{B(p,q)}\,dx$:
$$\frac{x^{p-1}(1-x)^{q-1}}{B(p,q)} = \phi(-D)\,\frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac12 y^2},$$
i.e.
$$\begin{aligned}
\int_0^x\frac{x^{p-1}(1-x)^{q-1}}{B(p,q)}\,dx &= \int_{-\infty}^{y}\phi(-D)\,\frac{1}{\sqrt{2\pi}}e^{-\frac12 y^2}\,dy \\
&= \tfrac12(1+\alpha_y) - a_1\sqrt{1!}\,\tau_1 - a_2\sqrt{2!}\,\tau_2 - a_3\sqrt{3!}\,\tau_3 - \ldots \\
&= \tfrac12(1+\alpha_y) - a_1'\tau_1 - a_2'\tau_2 - a_3'\tau_3 - \ldots - a_s'\tau_s - \ldots,
\end{aligned}$$
where $a_s' = a_s\sqrt{s!}$. Then
$$a_s' = \frac{\sqrt{s!}}{\sigma^s}\sum_{r=0}^{s}b_r\,\frac{1}{(s-r)!}\,\frac{p(p+1)\ldots(p+s-r-1)}{(p+q)(p+q+1)\ldots(p+q+s-r-1)}.$$
Now $c_1$ and $c_2$ are equal to zero, so that $a_1'$, $a_2'$ are zero. Thus there are no terms in $\tau_1$ and $\tau_2$. The argument of the tetrachoric functions and of $\tfrac12(1+\alpha_y)$ is $y$, which is equal to
$$\frac{\xi}{\sigma} = \frac{x-p/(p+q)}{\sigma}.$$
On applying the above formula for $a_s'$, we were greatly disappointed to find, that with the $b$'s to 8 decimal places the expression under the summation sign in the examples used commenced with 4 or 5 zeros after the decimal point. As $\sqrt{s!}$ and $(1/\sigma)^s$ both increase with $s$ ($1/\sigma$ being in our case $> 1$) accuracy to the seventh place in our $a$'s could not be obtained. Accordingly the formula actually used was of a different type.

Let
$$\frac{x^{p-1}(1-x)^{q-1}}{B(p,q)} = \sum_{s=1}^{\infty}c_s\tau_s,$$
where the argument of the tetrachoric function is again
$$\frac{\xi}{\sigma} = \frac{x-p/(p+q)}{\sigma}.$$
Multiply both sides by $\tau_s$, weighting by the factor $e^{\frac12\xi^2/\sigma^2}$, and integrate from $-\infty$ to $+\infty$, the left-hand side being taken as zero outside $x = 0$ and $x = 1$. Then
$$\int_{-p/(p+q)}^{q/(p+q)}\frac{x^{p-1}(1-x)^{q-1}}{B(p,q)}\,\tau_s\,e^{\frac12\xi^2/\sigma^2}\,d\xi = \int_{-\infty}^{\infty}\tau_s\sum_{r=1}^{\infty}c_r\tau_r\,e^{\frac12\xi^2/\sigma^2}\,d\xi.$$
Since $\displaystyle\int_{-\infty}^{\infty}\tau_r\tau_s\,e^{\frac12\xi^2/\sigma^2}\,d\xi = 0$ for $r \neq s$, only the term in $\tau_s$ will be left on the right-hand side, i.e.
$$\int_{-\infty}^{\infty}\tau_s\sum_{r=1}^{\infty}c_r\tau_r\,e^{\frac12\xi^2/\sigma^2}\,d\xi = c_s\int_{-\infty}^{\infty}\tau_s^2\,e^{\frac12\xi^2/\sigma^2}\,d\xi.$$
Putting $\xi/\sigma = y$, $d\xi = \sigma\,dy$, we have
$$\int_{-p/(p+q)}^{q/(p+q)}\frac{x^{p-1}(1-x)^{q-1}}{B(p,q)}\,\tau_s\,e^{\frac12\xi^2/\sigma^2}\,d\xi = c_s\,\sigma\int_{-\infty}^{\infty}\tau_s^2\,e^{\frac12 y^2}\,dy = \frac{c_s\,\sigma}{s\sqrt{2\pi}},$$
i.e.
$$c_s = \frac{s\sqrt{2\pi}}{\sigma}\int_0^1\frac{x^{p-1}(1-x)^{q-1}}{B(p,q)}\,\tau_s\,e^{\frac12 y^2}\,dx = \frac{s}{\sigma\sqrt{s!}}\int_0^1\frac{x^{p-1}(1-x)^{q-1}}{B(p,q)}\Bigl\{y^{s-1} - \frac{(s-1)(s-2)}{2\cdot1!}\,y^{s-3} + \frac{(s-1)(s-2)(s-3)(s-4)}{2^2\cdot2!}\,y^{s-5} - \ldots\Bigr\}\,dx \tag{xxv}$$
with $y = \{x-p/(p+q)\}/\sigma$. The integral for any particular value of $s$ reduces to a series of B-functions and so $c_s$ is found.

The area up to abscissa $x$ is generally required: i.e.
$$\int_0^x\frac{x^{p-1}(1-x)^{q-1}}{B(p,q)}\,dx = \sigma\int_{-\infty}^{y}\sum(c_s\tau_s)\,dy.$$
Now $\displaystyle\int_{-\infty}^{y}\tau_s\,dy = -\frac{1}{\sqrt{s}}\,\tau_{s-1}$, therefore
$$\int_0^x\frac{x^{p-1}(1-x)^{q-1}}{B(p,q)}\,dx = \sigma\int_{-\infty}^{y}c_1\tau_1\,dy - \sigma\Bigl\{\frac{c_2}{\sqrt2}\,\tau_1 + \frac{c_3}{\sqrt3}\,\tau_2 + \ldots\Bigr\}$$
[…]

* $z = \dfrac{x-p}{\sqrt{p}} = \dfrac{29.4-49}{7} = -2.8$.

[…] of the values of the series than a set of isolated points would. Figures 1-7 correspond to the data given in Tables I-VII.

Now in the case of the Incomplete $\Gamma$-function we obtained two expansions, with respect to the mean and the mode respectively, and the graphs tell us which of these two gives us the better approximation. Figs.
1 and 3 (Tables I and III) show the variations in the values of the series for $\displaystyle\int_0^{29.4}\frac{x^{48}e^{-x}}{\Gamma(49)}\,dx$ from the mean and the mode respectively, while Figs. 2 and 4 give us similar information for $\displaystyle\int_0^{42}\frac{x^{48}e^{-x}}{\Gamma(49)}\,dx$.

[Fig. 1. Value of the series against number of terms: $\tfrac12(1-\alpha)$, $\tau_3$, $\tau_4$, ..., $\tau_{30}$.]

[Fig. 2. Value of the series against number of terms: $\tfrac12(1-\alpha)$, $\tau_3$, $\tau_4$, ..., $\tau_{30}$.]

It will be seen that in Fig. 1 the points are much closer to the 'true value' line than in Fig. 3 (and similarly in Fig. 2 they are closer than in Fig. 4) so that the expansion from the mean seems to give a better approximation than that from the mode and it has the additional advantage that the terms in $\tau_1$ and $\tau_2$ are missing. Besides, it seems more natural to expand these normal curve functions in terms of the mean and standard deviation. For comparison purposes the graphs are all on the same scale.

TABLE II. $\displaystyle\int_0^{42}\frac{x^{48}e^{-x}}{\Gamma(49)}\,dx$, $z = -1$*.

  s    Tetrachoric          Terms in the         Value of Series
       Functions tau_s      Series -a_s tau_s    up to term tau_s
  0    +0.1586553           +0.1586553           0.1586553
  1    +0.24197074           0.0000000
  2    -0.17109916           0.0000000
  3     0.00000000           0.0000000
  4    +0.09878417          -0.0024691           0.1561862
  5    -0.04417762          +0.0002822           0.1564684
  6    -0.05410632          +0.0017468           0.1582152
  7    +0.05453404          -0.0009735           0.1572417
  8    +0.02410087          -0.0002025           0.1570392
  9    -0.05302190          +0.0007797           0.1578189
 10    -0.00355664          +0.0000456           0.1578645
 11    +0.04657133          -0.0004172           0.1574474
 12    -0.01034833          +0.0001079           0.1575553
 13    -0.03814548          +0.0004117           0.1579669
 14    +0.01939964          -0.0001868           0.1577802
 15    +0.02921077          -0.0002967           0.1574834
 16    -0.02483411          +0.0002739           0.1577573
 17    -0.02054429          +0.0002318           0.1579891
 18    +0.02755708          -0.0003334           0.1576556
 19    +0.01256341          -0.0001691           0.1574865
 20    -0.02825493          +0.0004191           0.1579057
 21    -0.00548187          +0.0000910           0.1579967
 22    +0.02745951          -0.0005223           0.1574744
 23    -0.00060803          +0.0000134           0.1574878
 24    -0.02558848          +0.0006555           0.1581433
 25    +0.00568862          -0.0001726           0.1579707
 26    +0.02297227          -0.0008343           0.1571364
 27    -0.00978859          +0.0004299           0.1575663
 28    -0.01987296          +0.0010674           0.1586337
 29    +0.01296514          -0.0008607           0.1577730
 30    +0.01649808          -0.0013669           0.1564061

True value 0.1577387.

* $z = \dfrac{x-p}{\sqrt{p}} = \dfrac{42-49}{7} = -1$.

[Fig. 3. Value of the series against number of terms: $\tfrac12(1-\alpha)$, $\tau_1$, $\tau_2$, ..., $\tau_{30}$.]

The graphs for the mode and the mean behave in a very similar manner; for, if we regard the graphs as a wave, it will be noticed that at first the amplitude of the wave is big, decreases gradually up to a term in the neighbourhood of $\tau$[…] and thereafter increases more and more rapidly. This can be explained fairly easily; as $s$ increases the tetrachoric functions $\tau_s$ do not increase or decrease steadily but vary in sign and remain of the same order of magnitude. The coefficients $a_s$ vary in much the same way (except that they are all positive) up to a certain point and then begin to increase very fast. In equation (xv) we had
$$a_{s+1} = \frac{1}{\sqrt{p\,(s+1)}}\bigl\{s\,a_s + \sqrt{s(s-1)}\;a_{s-2}\bigr\},$$
i.e.
$a_{s+1}$ is of order $\sqrt{s/p}\;\{a_s + a_{s-2}\}$, so that as $s$ increases there comes a time when $\sqrt{s}$ overcomes the reducing effect of $1/\sqrt{p}$ and then the coefficients will continually increase. For higher values of $p$ this turning point will not be arrived at so soon and the points will hang closer to the 'true value' line for a greater number of terms, but it does not seem likely that the values of the series will tend to a definite limit. The equation for the modal expansion coefficients is a similar one and these coefficients behave in the same way.

[Fig. 4. Value of the series against number of terms: $\tfrac12(1-\alpha)$, $\tau_1$, $\tau_2$, ..., $\tau_{30}$.]

TABLE III. $\displaystyle\int_0^{29.4}\frac{x^{48}e^{-x}}{\Gamma(49)}\,dx$, $z' = -2.6846788$*.

  s    a_s'          Tetrachoric          Terms in             Value of Series
                     Functions tau_s      Series -a_s' tau_s   up to term tau_s
  0    1.00000000    +0.0036296           +0.0036296            0.0036296
  1    0.14433757    +0.01085979          -0.0015675            0.0020621
  2    0.02946278    -0.02061573          +0.0006074            0.0026695
  3    0.12521683    +0.02752089          -0.0034461           -0.0007766
  4    0.06166251    -0.02503898          +0.0015440            0.0007675
  5    0.02648957    +0.01160193          -0.0003073            0.0004601
  6    0.04236301    +0.00557065          -0.0002360            0.0002241
  7    0.03460284    -0.01460370          +0.0005053            0.0007295
  8    0.02288715    +0.00939504          -0.0002150            0.0005144
  9    0.02516285    +0.00363988          -0.0000916            0.0004229
 10    0.02488684    -0.01101274          +0.0002741            0.0006969
 11    0.02136289    +0.00579095          -0.0001237            0.0005732
 12    0.02167770    +0.00509737          -0.0001105            0.0004627
 13    0.02272772    -0.00713419          +0.0001621            0.0006249
 14    0.02256727    +0.00058475          -0.0000132            0.0006117
 15    0.02351439    +0.00599464          -0.0001410            0.0004707
 16    0.02546065    -0.00455186          +0.0001158            0.0005865
 17    0.02739094    -0.00248832          +0.0000682            0.0006547
 18    0.02996700    +0.00573797          -0.0001720            0.0004827
 19    0.03360181    -0.00124666          +0.0000419            0.0005246
 20    0.03803862    -0.00454994          +0.0001731            0.0006977
 21    0.04355963    +0.00382135          -0.0001665            0.0005312
 22    0.05068120    +0.00204641          -0.0001037            0.0004275
 23    0.05968962    -0.00471305          +0.0002813            0.0007089
 24    0.07107603    +0.00066657          -0.0000474            0.0006615
 25    0.08566837    +0.00406751          -0.0003485            0.0003130
 26    0.10443752    -0.00276906          +0.0002892            0.0006022
 27    0.12866399    -0.00240728          +0.0003097            0.0009119
 28    0.16018283    +0.00383980          -0.0006151            0.0002969
 29    0.20147298    +0.00036667          -0.0000739            0.0002230
 30    0.25589543    -0.00382481          +0.0009788            0.0012017

True value 0.0005850.

* $z' = \dfrac{x-p'}{\sqrt{p'}} = \dfrac{29.4-48}{\sqrt{48}} = -2.6846788$.

Turning our attention to the expansions from the mean, Fig. 1 (and Fig. 3 to a less extent) would seem to suggest that the tetrachoric series gives quite a good approximation to the value of the integral. Although some of the points are very near to the 'true value' line, the approximation is not really a good one. The important question for us is: To how many decimal places does the series give the result correct? On going through the tables it will be found that there is no value of the series up to the $s$th term giving the result correct to more than three or four places.

We now come to the real trouble. Suppose a frequency function is expanded in tetrachoric series, how are we to know at what term to stop so as to obtain the most accurate result?
TABLE IV. $\displaystyle\int_0^{42}\frac{x^{48}e^{-x}}{\Gamma(49)}\,dx$, $z' = -0.8660254$*.

  s    Tetrachoric          Terms in             Value of Series
       Functions tau_s      Series -a_s' tau_s   up to term tau_s
  0     0.1932381            0.1932381           0.1932381
  1    +0.27418875          -0.0395757           0.1536624
  2    -0.16790564          +0.0049470           0.1586093
  3    -0.02798427          +0.0035041           0.1621134
  4    +0.10905792          -0.0067248           0.1553886
  5    -0.02346554          +0.0006216           0.1560102
  6    -0.07134833          +0.0030225           0.1590328
  7    +0.04145828          -0.0014346           0.1575982
  8    +0.04451198          -0.0010188           0.1565794
  9    -0.04705083          +0.0011839           0.1577634
 10    -0.02465039          +0.0006135           0.1583768
 11    +0.04681171          -0.0010000           0.1573768
 12    +0.00975248          -0.0002114           0.1571654
 13    -0.04356976          +0.0009902           0.1581556
 14    +0.00140962          -0.0000318           0.1581238
 15    +0.03877058          -0.0009117           0.1572121
 16    -0.00966795          +0.0002462           0.1574583
 17    -0.03323150          +0.0009102           0.1583686
 18    +0.01562623          -0.0004683           0.1579003
 19    +0.02744360          -0.0009222           0.1569781
 20    -0.01974338          +0.0007510           0.1577291
 21    -0.02171195          +0.0009458           0.1586749
 22    +0.02237974          -0.0011342           0.1575407
 23    +0.01622818          -0.0009687           0.1565720
 24    -0.02382475          +0.0016934           0.1582654
 25    -0.01111122          +0.0009519           0.1592173
 26    +0.02431475          -0.0025394           0.1566779
 27    +0.00643169          -0.0008275           0.1558504
 28    -0.02404492          +0.0038516           0.1597020
 29    -0.00222729          +0.0004487           0.1601507
 30    +0.02317774          -0.0059311           0.1542196

True value 0.1577387.

* $z' = \dfrac{x-p'}{\sqrt{p'}} = \dfrac{42-48}{\sqrt{48}} = -0.8660254$.

TABLE V. $\displaystyle\int_0^{0.5}\frac{x^{14}(1-x)^{4}}{B(15,5)}\,dx$, $y = -2.6457513$, $p = 15$, $q = 5$, $n = 20$†.

  s    a_s            Tetrachoric          Terms in            Value of Series
                      Functions tau_s      Series -a_s tau_s   up to term tau_s
  0    1.00000000      0.0040751            0.0040751          0.0040751
  3   -0.19638608     +0.02950904          +0.0057952          0.0098703
  4   +0.01452267     -0.02602453          +0.0003780          0.0102482
  5   +0.03818545     +0.01099737          -0.0004199          0.0098283
  6   +0.05515045     +0.00712711          -0.0003931          0.0094352
  7   -0.01389639     -0.01561177          -0.0002170          0.0092183
  8   -0.03609105     +0.02031787          +0.0007333          0.0099516

True value 0.0096054.

† $y = \dfrac{x-p/(p+q)}{\sigma} = \dfrac{0.5-0.75}{0.0944911} = -2.6457513$.

TABLE VI. $\displaystyle\int_0^{0.5}\frac{x^{3}(1-x)^{\frac12}}{B(4,\frac32)}\,dx$, $y = -1.3010412$, $p = 4$, $q = \frac32$, $n = 5.5$*.

  s    a_s            Tetrachoric          Terms in            Value of Series
                      Functions tau_s      Series -a_s tau_s   up to term tau_s
  0    1.00000000      0.0966212            0.0966212          0.0966212
  3   -0.28327885     +0.04839695          +0.0137098          0.1103310
  4   -0.01400852     +0.05941568          +0.0008323          0.1111633
  5   +0.16688842     -0.06703628          +0.0111876          0.1223509
  6   +0.05349154     -0.00778490          +0.0004164          0.1227673
  7   -0.05325140     +0.05554783          +0.0029580          0.1257253
  8   -0.09445982     -0.01930950          -0.0018240          0.1239013
  9   -0.00063525     -0.03745046          -0.0000238          0.1238775

True value 0.1188790.

TABLE VII. $\displaystyle\int_0^{0.1}\frac{x^{3}(1-x)^{\frac12}}{B(4,\frac32)}\,dx$, $y = -3.59087385$†.

  s    a_s            Tetrachoric          Terms in            Value of Series
                      Functions tau_s      Series -a_s tau_s   up to term tau_s
  0    1.00000000      0.0001648            0.0001648          0.0001648
  3   -0.28327885     +0.00307042          +0.0008698          0.0010346
  4   -0.01400852     -0.00458580          -0.0000642          0.0009704
  5   +0.16688842     +0.00530458          -0.0008853          0.0000851
  6   +0.05349154     -0.00442734          +0.0002368          0.0003219
  7   -0.05325140     +0.00191632          +0.0001020          0.0004239
  8   -0.09445982     +0.00111687          +0.0001055          0.0005294
  9   -0.00063525     -0.00291774          -0.0000019          0.0005275

True value 0.00023603.

* $y = \dfrac{x-p/(p+q)}{\sigma} = \dfrac{0.5-0.7272727}{0.17468526} = -1.3010412$.
† $y = \dfrac{0.1-0.7272727}{0.17468526} = -3.59087385$.

If the value of an integral is required, the true value is wanted. In our work we chose integrals of which the value was already known. From Figs. 1-4 it is easily seen that we have as good an approximation at the 5th or 6th term as at the 15th, say, and better than at the 30th. Of course, one might calculate the various terms till the sums became more or less steady, take the mean of these sums after the steady stage is reached and use that as the value required. This process, however, will not give a greater accuracy than three or four decimal places correct and very likely the result will not be so good as that. Besides which it is difficult to give such an arbitrary weighting of terms a theoretical justification.
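The averaging device just described can be tried on the printed figures. The sketch below is only an illustration: it copies the partial sums for terms $\tau_{10}$ to $\tau_{20}$ from the table for $\int_0^{42}x^{48}e^{-x}/\Gamma(49)\,dx$ (expansion from the mean), and the choice of that window as the "steady stage", like the helper name `correct_decimals`, is an assumption made here and not something fixed by the paper.

```python
# Partial sums of the series from the mean (terms tau_10 .. tau_20),
# transcribed from the table for the integral with upper limit 42.
PARTIAL_SUMS = {
    10: 0.1578645, 11: 0.1574474, 12: 0.1575553, 13: 0.1579669,
    14: 0.1577802, 15: 0.1574834, 16: 0.1577573, 17: 0.1579891,
    18: 0.1576556, 19: 0.1574865, 20: 0.1579057,
}
TRUE_VALUE = 0.1577387


def correct_decimals(value, true_value, limit=15):
    """Largest d such that |value - true_value| < 0.5 * 10**(-d)."""
    error = abs(value - true_value)
    d = 0
    while d < limit and error < 0.5 * 10 ** -(d + 1):
        d += 1
    return d


def window_mean(sums):
    """Plain unweighted mean of the partial sums in the chosen window."""
    return sum(sums.values()) / len(sums)


average = window_mean(PARTIAL_SUMS)
best_term = min(PARTIAL_SUMS, key=lambda s: abs(PARTIAL_SUMS[s] - TRUE_VALUE))

if __name__ == "__main__":
    print("mean of partial sums:", round(average, 7))
    print("decimals correct    :", correct_decimals(average, TRUE_VALUE))
    print("closest single term :", best_term,
          correct_decimals(PARTIAL_SUMS[best_term], TRUE_VALUE))
```

On these figures neither the window mean nor the best single partial sum in the window is right beyond the fourth decimal place, and in practice the true value needed to pick the best term would not be available.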
Thus it seems that the tetrachoric series is not at all suitable for the representation of the Incomplete $\Gamma$-function.

When we consider the tables and graphs for the Incomplete B-function, the results are certainly no better than in the case of the Incomplete $\Gamma$-function. Unfortunately, owing to the lack of a difference formula connecting the successive coefficients, we only calculated a few terms, but the behaviour of the graphs is similar to that of the graphs of the Incomplete $\Gamma$-function. Fig. 5 is very like Figs. 1-4 but Figs. 6 and 7 are rather different. In Fig. 5 the integral is $\displaystyle\int_0^{0.5}\frac{x^{14}(1-x)^{4}}{B(15,5)}\,dx$, where $p$ is of high value and $q$ is of moderate size. In Figs. 6 and 7 the integral is $\displaystyle\int_0^{x}\frac{x^{3}(1-x)^{\frac12}}{B(4,\frac32)}\,dx$, where the upper limits are 0.5 and 0.1 respectively. Here $p$ is 4 and $q$ is $\frac32$. It seems in the incomplete $\Gamma$- and B-functions that the points come nearer the 'true value' line for the tail of the integral than if the upper limit is near the mode.

[Fig. 5. Value of the series against number of terms, $\tau_3$ to $\tau_8$, with the true value line.]

[Fig. 6. Value of the series against number of terms: $\tfrac12(1-\alpha)$, $\tau_3$, ..., $\tau_9$.]

[Fig. 7. Value of the series against number of terms: $\tfrac12(1-\alpha)$, $\tau_3$, ..., $\tau_9$.]

Table VIII gives the results for $\displaystyle\int_0^{49}\frac{x^{48}e^{-x}}{\Gamma(49)}\,dx$ and, since $z = \dfrac{49-49}{7} = 0$ for the expansion from the mean, all the tetrachoric functions of even order vanish. It will be observed that the values of the series vary in a similar fashion to the others and not one of these gives the result correct to more than four decimal places.

TABLE VIII. $\displaystyle\int_0^{49}\frac{x^{48}e^{-x}}{\Gamma(49)}\,dx$, $z = 0$*. (Expansion with regard to the Mean.)
  s    a_s            Tetrachoric          Terms in            Value of Series
                      Functions tau_s      Series -a_s tau_s   up to term tau_s
  0    1.00000000      0.5000000            0.5000000          0.5000000
  3    0.11664237     -0.1628675           +0.0189973          0.5189973
  5    0.00638743     +0.1092549           -0.0006979          0.5182994
  7    0.01785148     -0.0842920           +0.0015047          0.5198041
  9    0.01470566     +0.0695373           -0.0010226          0.5187815
 11    0.00895618     -0.0596711           +0.0005345          0.5193160
 13    0.01079260     +0.0525526           -0.0005672          0.5187488
 15    0.01015854     -0.0471442           +0.0004789          0.5192277

True value 0.5189993.

* $z = \dfrac{x-p}{\sqrt{p}} = \dfrac{49-49}{7} = 0$.

After a careful study of the tables and graphs we are forced to the conclusion that a tetrachoric series is of no practical utility as a representation of skew frequency curves such as $y = y_0x^{p-1}e^{-x}$ and $y = y_0x^{p-1}(1-x)^{q-1}$, and although it may be rash to generalise from our results on these two types it would seem that such a series cannot be generally suitable to represent skew frequency distributions. Moreover, the types, which have been discussed, are of common occurrence and for these the expansion is certainly futile.

The true values of the incomplete $\Gamma$-function were taken from Tables of the Incomplete $\Gamma$-function which will be shortly issued by H.M. Stationery Office. The values of the incomplete B-function were determined by direct calculation; the power of $(1-x)$ was expanded and the result readily obtained with the help of the relation
$$B(p,q) = \frac{\Gamma(p)\,\Gamma(q)}{\Gamma(p+q)}.$$

In his Vorlesungen über die Grundzüge der mathematischen Statistik (Hamburg, 1920) Charlier, when dealing with skew frequency curves, gives as the general equation for the skew frequency curves of his Type A
$$Y = \phi_0 + \beta_3\phi_0''' + \beta_4\phi_0^{iv} + \beta_5\phi_0^{v} + \ldots,$$
where $\phi_0 = \dfrac{1}{\sqrt{2\pi}}e^{-\frac12x^2}$ and $\phi_0'''$, $\phi_0^{iv}$, $\phi_0^{v}$, ... are the third, fourth, fifth, etc. differential coefficients of $\phi_0$, i.e. $Y$ is really expressed in a series of tetrachoric functions, or
$$Y = \{\tau_1 - \beta_3\sqrt{4!}\,\tau_4 + \beta_4\sqrt{5!}\,\tau_5 - \beta_5\sqrt{6!}\,\tau_6 + \ldots\}.$$
$\beta_3$, $\beta_4$, $\beta_5$, etc. along with $M$ (the mean) and $\sigma$ Charlier calls the 'characteristics' of the distribution curve.
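The identity between Charlier's Type A form and the tetrachoric form just stated can be verified numerically. The following is a minimal sketch under the assumptions that $x$ is already in standard units, that $\phi_0^{(n)} = (-1)^nHe_n\,\phi_0$ with $He_n$ the Hermite polynomials, and that $\tau_s = He_{s-1}\phi_0/\sqrt{s!}$; the names `type_a` and `type_a_tetrachoric` and the sample values of $\beta_3$, $\beta_4$ are purely illustrative.

```python
import math


def hermite(n, x):
    """He_n(x) by the recurrence He_{k+1} = x He_k - k He_{k-1}."""
    he_prev, he = 0.0, 1.0
    for k in range(n):
        he_prev, he = he, x * he - k * he_prev
    return he


def phi0(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def phi_derivative(n, x):
    """n-th differential coefficient of phi_0: (-1)^n He_n(x) phi_0(x)."""
    return (-1) ** n * hermite(n, x) * phi0(x)


def tetrachoric(s, x):
    return hermite(s - 1, x) * phi0(x) / math.sqrt(math.factorial(s))


def type_a(x, betas):
    """Y = phi_0 + sum of beta_n * phi_0^(n); betas maps order n to beta_n."""
    return phi0(x) + sum(b * phi_derivative(n, x) for n, b in betas.items())


def type_a_tetrachoric(x, betas):
    """The same Y written as tau_1 - beta_3 sqrt(4!) tau_4 + beta_4 sqrt(5!) tau_5 - ..."""
    total = tetrachoric(1, x)
    for n, b in betas.items():
        total += (-1) ** n * b * math.sqrt(math.factorial(n + 1)) * tetrachoric(n + 1, x)
    return total


if __name__ == "__main__":
    betas = {3: 0.05, 4: 0.02}          # illustrative values only
    for x in (-2.0, -0.5, 0.0, 1.0, 2.5):
        print(x, type_a(x, betas), type_a_tetrachoric(x, betas))
```

The two columns printed agree to rounding error, which is all the sketch is meant to show; it says nothing about how well a two-term series of this kind fits any particular frequency curve.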
Now he seems to think that generally the coefficients $\beta_3$ and $\beta_4$* will only be required and so he has tabled $\phi_0(x)$, $\phi_0'''(x)$, $\phi_0^{iv}(x)$ for $x = 0.00$ to $3.00$ at intervals of 0.01 and also for $x = 4$ (Tables III, IV and V on pp. 123-125) to four decimal places. With the series up to $\beta_4$ the theoretical Y-coordinate will be found, according to Charlier, but from our experience of tetrachoric functions we are exceedingly sceptical about the accuracy of such a result. In fact, we feel certain that the approximation will not be a good one. If the frequency curve be little different from the normal then possibly the approximation would not be very bad.

The above investigation was undertaken by me at the suggestion of Professor Pearson and I am indebted to him for several hints. My grateful thanks are due to Miss I. McLearn for her assistance in the preparation of the diagrams.

* Charlier defines the 'skewness' $S$ to be $S = 3\beta_3$ and the 'excess' $E$ to be $E = 3\beta_4$.

MISCELLANEA.

I. On the $\chi^2$ test of Goodness of Fit.

By KARL PEARSON, F.R.S.

In a paper published in the Philosophical Magazine for July 1900, pp. 157-175, I dealt with the following problem: A very large population is sampled, say, the population $n_1, n_2, \ldots n_s, \ldots n_p$ with total $N$, and any individual sample is $m_1, m_2, \ldots m_s, \ldots m_p$, total $M$. The "probable constitution" is given by
$$m_1' = n_1\frac{M}{N}, \quad m_2' = n_2\frac{M}{N}, \quad \ldots \quad m_s' = n_s\frac{M}{N}, \quad \ldots \quad m_p' = n_p\frac{M}{N}.$$
If a large number of samples of size $M$ are taken, what is the distribution of variations from the "probable constitution" in these samples? I showed that if the distribution of categories were such that no category contained a few isolated units, then the distribution depended on the calculation of
$$\chi^2 = S_1^p\Bigl\{\frac{(m_s-m_s')^2}{m_s'}\Bigr\}$$
and provided a value for the probability $P$ that samples would not diverge more than any given sample from the "probable constitution." This process is now familiar to statisticians as the $\chi^2$, $P$ test.
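The $\chi^2$, $P$ computation just described can be sketched in a few lines. The closed forms used for $P$ below are the standard finite sums for the tail of the $\chi^2$ distribution with $n' - 1$ degrees of freedom, $n'$ being the number of categories; the function names and the small example are illustrative assumptions and are not taken from the 1900 paper.

```python
import math


def probable_constitution(population, sample_size):
    """m'_s = n_s * M / N for each category s."""
    total = sum(population)
    return [n_s * sample_size / total for n_s in population]


def chi_square(sample, expected):
    """chi^2 = sum over categories of (m_s - m'_s)^2 / m'_s."""
    return sum((m - e) ** 2 / e for m, e in zip(sample, expected))


def pearson_p(chi2, n_prime):
    """Chance of a chi^2 at least as large as the one observed, n' >= 2 categories."""
    nu = n_prime - 1                      # degrees of freedom
    half = 0.5 * chi2
    if nu % 2 == 0:
        # even nu: exp(-chi2/2) * sum_{j < nu/2} (chi2/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, nu // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # odd nu: normal tail plus a finite sum in odd powers of chi
    chi = math.sqrt(chi2)
    term, total = chi, 0.0
    for r in range(1, (nu - 1) // 2 + 1):
        if r > 1:
            term *= chi2 / (2 * r - 1)
        total += term
    return math.erfc(chi / math.sqrt(2.0)) + math.sqrt(2.0 / math.pi) * math.exp(-half) * total


if __name__ == "__main__":
    population = [500, 300, 200]          # illustrative category totals n_s
    sample = [42, 35, 23]                 # illustrative sample m_s, M = 100
    expected = probable_constitution(population, sum(sample))
    value = chi_square(sample, expected)
    print(value, pearson_p(value, len(sample)))
```

Here `pearson_p` returns the chance of a divergence at least as great as the one observed, which is the quantity read from the tables of goodness of fit entered with $n'$.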
The sole limiting conditions were that the samples should be random, and each should be of the same size $M$.

In some cases the "probable constitution" ($m'$ series) can be found at once because the distribution of the sampled population is known a priori. In other cases the values of the $m'$ series have to be approximated to, and such approximations are the general rule in all discussions of probable error. We say for example that the standard deviation of the mean of a sample taken from an indefinitely large population of size $N$ and standard deviation $\sigma$ is $\sigma/\sqrt{n}$, where $n$ is the size of the sample. We say that the standard deviation of second moment-coefficients of samples of size $n$ is
$$\frac{\sqrt{\mu_4-\mu_2^2}}{\sqrt{n}},$$
where $\mu_2$ $(=\sigma^2)$ and $\mu_4$ are the second and fourth moment-coefficients of the population sampled. In fact every constant of the sample has a probable error determinable in terms of the constants of the sampled population. All these distributions of deviations from "probable constitution" are true for perfectly general but random samples of size $n$ drawn from our indefinitely large population.

But unfortunately in a considerable number of cases that sampled population is unknown to us; we have no direct means of finding $\mu_2$, $\mu_4$, etc. What accordingly do we do? Why we replace the constants of the sampled population by those calculated from the sample itself, as the best information we have. And the justification of this proceeding is not far to seek. $\mu_s$ as found for the sample will only differ from the $\mu_s$ of the sampled population by terms of the order $1/\sqrt{n}$; for example if we are not dealing with small samples, and $\sigma'$ be the standard deviation of the sample, $\sigma'$ differs from $\sigma$ by terms of the order $\sigma/\sqrt{2n}$ and accordingly the standard deviation of the mean is written $\sigma'/\sqrt{n}$ when it is really $\sigma/\sqrt{n}$. This method of treating probable errors is universal in the case of fair sized samples to-day and scarcely needs justification.
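The substitution of sample constants for population constants can be put in code. The sketch below simply evaluates $\sigma'/\sqrt{n}$ and $\sqrt{(\mu_4'-\mu_2'^2)/n}$ from the moment-coefficients of a sample taken about its own mean; the function names and the five-value sample are illustrative assumptions, and the formulas are the large-sample ones quoted above.

```python
import math


def moment_coefficient(sample, k):
    """k-th moment-coefficient of the sample about its own mean (divisor n)."""
    n = len(sample)
    mean = sum(sample) / n
    return sum((x - mean) ** k for x in sample) / n


def sd_of_mean(sample):
    """sigma' / sqrt(n), with the sample sigma' standing in for the unknown sigma."""
    return math.sqrt(moment_coefficient(sample, 2) / len(sample))


def sd_of_second_moment(sample):
    """sqrt((mu_4 - mu_2^2) / n), again with sample values substituted."""
    mu2 = moment_coefficient(sample, 2)
    mu4 = moment_coefficient(sample, 4)
    return math.sqrt((mu4 - mu2 * mu2) / len(sample))


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0, 5.0]      # illustrative sample only
    print(sd_of_mean(data), sd_of_second_moment(data))
```

A sample of five is of course far too small for the substitution to be trusted; it is used here only so that the arithmetic can be followed by hand.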
In writing the sample values of the constants for those of the sampled population, we do not in any way alter our original supposition that we are considering the distribution of random samples of size $n$. We have still $p - 1$ degrees of freedom, if we have $p$ categories of frequency. The process of substituting sample constants for sampled population constants does not mean that we select out of possible samples of size $n$ those which have precisely the same values of the constants as the individual sample under discussion. Clearly the given sample has definite moment-coefficients, and if there be $p$ frequency categories the first $p - 1$ moment-coefficients together with the size $n$ of the sample would suffice to fix all the frequencies of the $p$ categories*. Hence no deviations from the “probable constitution” would be possible if we confined our attention to samples of size $n$ tied to the constants of the given sample! In using the constants of the given sample to replace the constants of the sampled population, we in no wise restrict the original hypothesis of free random samples tied down only by their definite size. We certainly do not by using sample constants reduce in any way the random sampling degrees of freedom. What we actually do is to replace the accurate value of $\chi^2$, which is unknown to us, and cannot be found, by an approximate value, and we do this with precisely the same justification as the astronomer claims, when he calculates his probable error on his observations, and not on the mean square error of an infinite population of errors which is unknown to him.

The whole of this matter was very fully discussed (pp. 164-7) in my original paper dealing with the $\chi^2$, $P$ test. The above re-description of what seem to me very elementary considerations would be unnecessary had not a recent writer in the Journal of the Royal Statistical Society† appeared to have wholly ignored them.
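The remark above, that the first $p - 1$ moment-coefficients together with the size of the sample fix every one of the $p$ frequencies, can be checked with a small sketch. It assumes the categories sit at known, distinct values of the character, and it works with power sums (moment-coefficients multiplied by the known sample size), which carry the same information. The category values, frequencies and function names are hypothetical.

```python
from fractions import Fraction


def power_sums(values, freqs, count):
    """S{ m_s * x_s^k } for k = 0 .. count-1; the k = 0 sum is the sample size."""
    return [sum(f * Fraction(x) ** k for x, f in zip(values, freqs))
            for k in range(count)]


def solve_linear(matrix, rhs):
    """Gauss-Jordan elimination in exact rational arithmetic."""
    n = len(rhs)
    rows = [[Fraction(v) for v in row] + [Fraction(b)]
            for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Find a row with a non-zero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        lead = rows[col][col]
        rows[col] = [v / lead for v in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [v - factor * p for v, p in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]


def frequencies_from_moments(values, sums):
    """Recover the p frequencies from p power sums (orders 0 to p-1)."""
    p = len(values)
    vandermonde = [[Fraction(x) ** k for x in values] for k in range(p)]
    return solve_linear(vandermonde, sums)


# Hypothetical categories and frequencies, not taken from the paper.
values = [0, 1, 2, 3]
freqs = [12, 30, 41, 17]
sums = power_sums(values, freqs, len(values))
recovered = frequencies_from_moments(values, sums)
print(recovered == freqs)
```

Because the system has a unique solution whenever the category values are distinct, samples tied to all of these constants could show no variation at all, which is the point the text makes.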
My critic considers that I have made serious blunders in not limiting my degrees of freedom by the number of moments I have taken; for example he asserts (p. 93) that if a frequency curve be fitted by the use of four moments then the $n'$ of the tables of goodness of fit should be reduced by 4. I hold that such a view is entirely erroneous, and that the writer has done no service to the science of statistics by giving it broad-cast circulation in the pages of the Journal of the Royal Statistical Society.

What he would obtain if he placed this restriction on his samples is not the $\chi^2$ for the distribution of samples of size $n$, but of samples which give definite moments. The absurdity of this manner of approach is at once obvious if, as I have suggested, we consider the first $p$ moments, as there is no reason why we should not do (for these are just as much “fixed” as the first four), and the conclusion must be that we can learn nothing at all about variation from our sample; for we have $p$ frequency groups and $p$ tying conditions. When we wish to find the probable error of a mean or a standard deviation, we do not start by fixing down these characters to their values in the individual sample; we suppose them to take all the possible values they could take by sampling, and after we have reached our measure of variation we then put into our formula the sampled values, to give an approximate value to the functions reached, because we are in ignorance of the real values in the sampled population.

The writer in the Journal of the Royal Statistical Society speaks as if I applied $\chi^2$ to a contingency table starting by fixing the marginal totals. As far as I am aware I am not guilty of this. My conception of contingency is very different from my conception of $\chi^2$.
I started my conception of contingency with the idea not of a random sample, but with the idea that some function of frequencies alone, without regard to their relation to the measured characters, would lead to the value of the correlation. Naturally I started from the deviation of the individual cell contents from the same cell contents on the basis of independent probability, as determined by the marginal totals. There was no question of sampling in the matter. In now fairly usual notation I termed

$$m_{ss'} - \frac{m_{s\cdot}\, m_{\cdot s'}}{M}$$

the cell contingency, and after playing about with such cell contingencies for a time succeeded in finding a function $\phi^2$ of them which for indefinitely fine grouping for a bi-variate normal frequency distribution gave the correlation $r$ as

$$r = \sqrt{\frac{\phi^2}{1 + \phi^2}},$$

where

$$\phi^2 = \frac{1}{M}\, S\left\{\frac{\left(m_{ss'} - \dfrac{m_{s\cdot}\, m_{\cdot s'}}{M}\right)^2}{\dfrac{m_{s\cdot}\, m_{\cdot s'}}{M}}\right\} \qquad (\alpha).$$

* This is Thiele's method of representing frequency distributions.
† Vol. LXXXV. p. 87, 1922.

I see no reason for confusing this $\phi^2$ as a measure of correlation with the $\chi^2$ which is a measure of variability in the samples of constant size drawn from an indefinitely large population. It was different in its origin, as far as I am concerned, and different in its use. It is only when we come to consider the probable error of $\phi^2$ that we have to distinguish between (a) the actual marginal totals of the sample and (b) the probable constitution of the marginal totals as deduced from an indefinitely large sampled population. There are, as those who have read Biometrika* will recognise, considerable difficulties about determining the probable error of $\phi^2$, where

$$1 + \phi^2 = S\left(\frac{m_{ss'}^2}{m_{s\cdot}\, m_{\cdot s'}}\right),$$

and the determination of the mean $\phi^2$ and of the standard deviation of $\phi^2$ involves very troublesome analysis. So laborious is the arithmetic involved that for ordinary statistical use it became doubtful whether it would not be better to define $\phi^2$
as the mean squared contingency measured not from the marginal totals of the sample, but from the “probable constitution” of the marginal totals of the sample as deduced from the sampled population. In this case, if

$$m'_{ss'} = \frac{M}{N}\, n_{ss'}, \qquad m'_{s\cdot} = \frac{M}{N}\, n_{s\cdot}, \qquad m'_{\cdot s'} = \frac{M}{N}\, n_{\cdot s'},$$

$$\phi^2 = \frac{1}{M}\, S\left\{\frac{\left(m_{ss'} - \dfrac{m'_{s\cdot}\, m'_{\cdot s'}}{M}\right)^2}{\dfrac{m'_{s\cdot}\, m'_{\cdot s'}}{M}}\right\} \qquad (\beta),$$

or,

$$1 + \phi^2 = S\left(\frac{m_{ss'}^2}{m'_{s\cdot}\, m'_{\cdot s'}}\right);$$

with this change of definition the probable error and mean of $\phi^2$ are more easily obtainable, and in this case for the first time $M\phi^2$ can be looked upon as equivalent to a $\chi^2$.

* Vol. v. p. 191, Vol. x. p. 570, Vol. xi. p. 570, and Vol. xii. p. 259.

The form ($\alpha$) from my standpoint cannot be treated as a $\chi^2$, because it is not the deviation-measure of a given sample from the sampled population. Nor again is ($\beta$) the deviation-measure of the sample from the sampled population, unless we assume that population to have zero contingency, i.e. $m'_{ss'} = m'_{s\cdot}\, m'_{\cdot s'}/M$. But $\chi^2$ may in the form ($\beta$) be treated as a deviation-measure of the actual sample from an artificial sampled population, which differs from the actual population in having no correlation or contingency, but having the same marginal distributions of the two characters. The moment, however, we assume form ($\beta$) for our contingency we are giving, what we clearly must give, absolute freedom to the marginal totals of our samples. The sole limit on our sample is its total size $M$. But when we come to actually calculating $\phi^2$ for the individual sample, or the mean value or the standard deviation (i.e. probable error) of $\phi^2$ for a series of samples, we have only one course open to us: if we do not know the constants of the sampled population, we must insert the marginal totals of the individual sample of which we have cognizance in place of the unknown values of the sampled population. Thus ($\alpha$) and ($\beta$) provide ultimately the same $\phi^2$, but the probable error of $\phi^2$ and the mean value of $\phi^2$
will be different in the two cases. In the first case we vary our marginal totals with the sample as they obviously would vary in practice. In the second case we define our $\phi^2$ to be a deviation from the independent probability of an artificial population; we do not keep the marginal totals of the sample fixed any more than in ($\alpha$). But if we think in terms of $\chi^2$ (and not $\phi^2$) we appear to do so because ultimately we have to take our marginal probabilities as those of the sample in default of a knowledge of any better values.

This point seems to me well illustrated in what my critic in the Journal of the Royal Statistical Society has to say on p. 90 of his paper about Messrs Greenwood and Yule's use of $\chi^2$ for a fourfold table. He asserts that they ought to have entered the table of goodness of fit with $n' = 2$. The problem before them was whether their fourfold tables could possibly be samples of bi-variate independent probability distributions. Each sample from such a distribution would have perfectly free cell frequencies $m_{11}, m_{12}, m_{21}, m_{22}$, subject to the sole binding condition that

$$m_{11} + m_{12} + m_{21} + m_{22} = M.$$

The proper $\chi^2$ is given by

$$\chi^2 = \frac{\left(m_{11} - \dfrac{m'_{1\cdot}\, m'_{\cdot 1}}{M}\right)^2}{\dfrac{m'_{1\cdot}\, m'_{\cdot 1}}{M}} + \frac{\left(m_{12} - \dfrac{m'_{1\cdot}\, m'_{\cdot 2}}{M}\right)^2}{\dfrac{m'_{1\cdot}\, m'_{\cdot 2}}{M}} + \frac{\left(m_{21} - \dfrac{m'_{2\cdot}\, m'_{\cdot 1}}{M}\right)^2}{\dfrac{m'_{2\cdot}\, m'_{\cdot 1}}{M}} + \frac{\left(m_{22} - \dfrac{m'_{2\cdot}\, m'_{\cdot 2}}{M}\right)^2}{\dfrac{m'_{2\cdot}\, m'_{\cdot 2}}{M}} \qquad (\gamma),$$

and this has three degrees of freedom and is what Messrs Yule and Greenwood desired to find, and they properly used the value of $P$ for $n' = 4$. Then like the astronomer, who finding the probable error of his mean to be $.67449\,\sigma/\sqrt{M}$ and not knowing the $\sigma$ of his sampled population, puts it equal to the $\sigma$ of his observations, so Messrs Yule and Greenwood very properly replaced the marginal totals of their unknown population by those of their sample, but very properly did not replace $n' = 4$ by $n' = 2$! But, says my critic, if they had, they would have got the same measure of improbability as if they had compared the difference of percentages!
Quite so, and obviously so; for in taking percentages they have actually fixed their marginal totals, taking 100 of each class, and thus for the first time confined their attention to a limited class of samples, not the random sample of size $M$, which has not its marginal totals fixed. We have, indeed, reduced our degrees of freedom by two in taking ratios.

When we consider generally the $\chi^2$ for a fourfold table to measure the improbability of a sample we are really comparing the special sample

$$\begin{array}{cc|c} a & b & a+b \\ c & d & c+d \\ \hline a+c & b+d & M \end{array}$$

with

$$\begin{array}{cc|c} a' & b' & a'+b' \\ c' & d' & c'+d' \\ \hline a'+c' & b'+d' & M \end{array}$$

the general population, where in the latter case $a'd' = c'b'$. Now the mean square contingency of the first of these tables is

$$\phi^2 = \frac{1}{M}\left\{\frac{\left(a - \frac{(a+b)(a+c)}{M}\right)^2}{\frac{(a+b)(a+c)}{M}} + \frac{\left(b - \frac{(a+b)(b+d)}{M}\right)^2}{\frac{(a+b)(b+d)}{M}} + \frac{\left(c - \frac{(a+c)(c+d)}{M}\right)^2}{\frac{(a+c)(c+d)}{M}} + \frac{\left(d - \frac{(c+d)(b+d)}{M}\right)^2}{\frac{(c+d)(b+d)}{M}}\right\}$$

$$= \frac{a^2}{(a+b)(a+c)} + \frac{b^2}{(a+b)(b+d)} + \frac{c^2}{(a+c)(c+d)} + \frac{d^2}{(c+d)(b+d)} - 1 = \frac{(ad - bc)^2}{(a+b)(a+c)(b+d)(c+d)} \qquad (\delta).$$

But the $\chi^2$ is

$$\chi^2 = \frac{\left(a - \frac{(a'+b')(a'+c')}{M}\right)^2}{\frac{(a'+b')(a'+c')}{M}} + \frac{\left(b - \frac{(a'+b')(b'+d')}{M}\right)^2}{\frac{(a'+b')(b'+d')}{M}} + \frac{\left(c - \frac{(a'+c')(c'+d')}{M}\right)^2}{\frac{(a'+c')(c'+d')}{M}} + \frac{\left(d - \frac{(c'+d')(b'+d')}{M}\right)^2}{\frac{(c'+d')(b'+d')}{M}}$$

$$= M\left\{\frac{a^2}{(a'+b')(a'+c')} + \frac{b^2}{(a'+b')(b'+d')} + \frac{c^2}{(a'+c')(c'+d')} + \frac{d^2}{(c'+d')(b'+d')} - 1\right\},$$

there being three degrees of freedom, so that we must take $n' = 4$ in calculating the probability $P$. This may be written

$$\chi^2 = \frac{1}{M}\left\{\frac{a^2}{p'_{1\cdot}\, p'_{\cdot 1}} + \frac{b^2}{p'_{1\cdot}\, p'_{\cdot 2}} + \frac{c^2}{p'_{2\cdot}\, p'_{\cdot 1}} + \frac{d^2}{p'_{2\cdot}\, p'_{\cdot 2}}\right\} - M,$$

where $p'_{\cdot 1}$, $p'_{\cdot 2}$, $p'_{1\cdot}$ and $p'_{2\cdot}$ are the four percentage numbers of the marginal categories in the sampled population. Now we do not know these percentages in that population and we do what every physicist, every astronomer, and (till I saw the paper by my critic in the Journal of the Statistical Society I should have said) every statistician does, supply the unknown constants from the sample, which leads us to

$$\chi^2 = \frac{M(ad - bc)^2}{(a+b)(a+c)(b+d)(c+d)}$$

as used in my memoir of 1912*. The problem I had and still have in view is the variability in samples of definite size, with no other restriction than sample size.
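The fourfold arithmetic above can be sketched numerically. The code below computes $\phi^2$ from the closed form, checks that $M\phi^2$ agrees with the cell-by-cell sum when the sample's own marginal totals are substituted, and evaluates the tail probability $P$ under both entries of the table at issue. It relies on the convention of the tables that $n'$ is one more than the number of degrees of freedom, and on the standard closed forms for the chi-square tail with one and three degrees of freedom. The cell counts and function names are hypothetical; the sketch shows only what each choice yields and does not decide between them.

```python
import math


def phi_squared(a, b, c, d):
    """Mean square contingency (ad - bc)^2 / {(a+b)(a+c)(b+d)(c+d)}."""
    return (a * d - b * c) ** 2 / ((a + b) * (a + c) * (b + d) * (c + d))


def chi_square_by_cells(a, b, c, d):
    """Sum of (observed - expected)^2 / expected, sample marginals substituted."""
    total = a + b + c + d
    cells = [(a, a + b, a + c), (b, a + b, b + d),
             (c, c + d, a + c), (d, c + d, b + d)]
    return sum((obs - row * col / total) ** 2 / (row * col / total)
               for obs, row, col in cells)


def p_for_n_prime_2(chi2):
    """Tail probability with one degree of freedom (n' = 2)."""
    return math.erfc(math.sqrt(chi2 / 2))


def p_for_n_prime_4(chi2):
    """Tail probability with three degrees of freedom (n' = 4)."""
    return (math.erfc(math.sqrt(chi2 / 2))
            + math.sqrt(2 * chi2 / math.pi) * math.exp(-chi2 / 2))


# Hypothetical fourfold table, not taken from the paper.
a, b, c, d = 30, 20, 15, 35
total = a + b + c + d
chi2 = total * phi_squared(a, b, c, d)
print(chi2, chi_square_by_cells(a, b, c, d))
print(p_for_n_prime_4(chi2), p_for_n_prime_2(chi2))
```

For any given $\chi^2$ the entry with $n' = 4$ returns the larger $P$, so the choice of $n'$ changes the stated improbability of the sample while leaving $\chi^2$ itself untouched.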
The solution of that problem is absolutely comparable with that of any discussion of the probability of an observed result in the theory of probable errors. We have in the bulk of such cases constants involved which concern the distribution in an unknown population, and we supply those constants from the sample itself. As I have already noted the probable error of a mean is

$$.67449\, \frac{\sqrt{\mu_2' - \mu_1'^2}}{\sqrt{M}}.$$

By this we understand that the means of samples restricted solely by their size $M$ from an indefinitely large population of moment-coefficients $\mu_1'$, $\mu_2'$ about a fixed origin will have a variability determined by the above formula. But when we proceed to give both $\mu_1'$ and $\mu_2'$ the values determined from the sample we know, we do not add in the manner of my Royal Statistical Society critic, “but in doing so the type of samples is reduced to those having the mean and standard deviation of the sample.” If we did, this selection of samples would clearly have no variation of mean or standard deviation at all! In fact probable errors would be meaningless, unless we drew our samples from a population already fully known to us, in which case we should not in 99% of cases want to sample it at all. In the same way when we use the marginal totals of the sample in formulae like ($\delta$) we do not thereby reduce our samples to those having constant marginal totals, we merely take the best approximation available to the proper value of $\chi^2$, and the fact that $\chi^2$, as found from the sample, is only an approximation to the true $\chi^2$ was fully recognised and discussed in my original memoir in the Philosophical Magazine.

It only remains to say that the following sentence of my critic's paper seems to me based upon a fallacious principle and apparently flows from a disregard of the nature of probable errors in general.
“It should be pointed out that certain of Pearson's Tables for Statisticians and Biometricians, namely Tables XVII, XIX and XX, together with XXII (Abac to determine $r_P$) are all calculated on the assumption that $n' = 4$ in fourfold tables, and consequently should not be used when, as is almost always the case, the marginal totals are obtained from the data” (loc. cit. p. 91).

* On a novel method of regarding the association of two variates classed solely in alternative categories. Drapers' Company Research Memoirs, Cambridge University Press.

I hold those tables are quite correctly calculated for $n' = 4$, and those who attempt to modify them by assuming $n' = 2$ will be dealing with an entirely different problem. Namely, they will be considering not the improbability of the given sample as one of all possible samples of the given size, which it really is, but one of the indefinitely smaller number of samples that have fixed marginal totals. We do not find the probable error of $r$ for a tetrachoric table* on the assumption that the marginal totals are fixed. We find it on the assumption that the marginal totals also vary from sample to sample, and when we have found it, then we substitute in the result the values of not only the marginal totals, but the cell-contents, $a$, $b$, $c$, $d$ of the sample itself for those of the unknown population. With $\chi^2$ we go through an exactly similar process of reasoning. If by this procedure we in some mysterious manner tied our degrees of freedom down to the values of the cell-contents used in our formula and adopted from our sample there could be no probable error for $r$, for the values of $a$, $b$, $c$, and $d$ are all required and used.

I trust my critic will pardon me for comparing him with Don Quixote tilting at the windmill; he must either destroy himself, or the whole theory of probable errors, for they are invariably based on using sample values for those of the sampled population unknown to us.
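The practice appealed to here, computing a probable error with the sample's own moment-coefficients in place of the unknown population values, can be sketched for the probable error of a mean discussed earlier in the note. The observations and the function name are hypothetical; the constant .67449 is the factor quoted in the text.

```python
import math

PROBABLE_ERROR_FACTOR = 0.67449  # factor quoted in the text


def probable_error_of_mean(xs):
    """.67449 * sqrt(mu_2' - mu_1'^2) / sqrt(M), sample moments about a fixed origin."""
    size = len(xs)
    mu1 = sum(xs) / size                  # first moment about the origin
    mu2 = sum(x * x for x in xs) / size   # second moment about the origin
    return PROBABLE_ERROR_FACTOR * math.sqrt(mu2 - mu1 ** 2) / math.sqrt(size)


# Hypothetical observations, not taken from the paper.
observations = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(probable_error_of_mean(observations))
```

Nothing in the calculation confines attention to samples sharing these moments; the sample values enter only as estimates of the population constants.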
For example, here is an argument for Don Quixote of the simplest nature: In the $s$th category of a population $N$ the frequency is $n_s$; a sample shows $m_s$ in a total $M$. The standard deviation of this frequency is

$$\sqrt{M\, \frac{n_s}{N}\left(1 - \frac{n_s}{N}\right)}.$$

But we don't know the population sampled and accordingly obtain an approximate value of the above standard deviation by writing $m_s/M$ for $n_s/N$ and taking for the standard deviation of $m_s$

$$\sqrt{m_s\left(1 - \frac{m_s}{M}\right)}.$$

In doing this it is not a question even of using a marginal total; we have used a cell frequency found from our sample. We have therefore according to our critic reduced our possibilities of freedom by selecting out of all possible samples those with $m_s$ in the $s$th cell. This is exactly parallel to our reducing our freedom by “fixing” marginal proportions or moment-coefficients. But if $m_s$ be fixed, it is ridiculous to talk of a variation of the $m_s$ frequency. Therefore either $m_s = 0$ or $m_s = M$, or the usual theory and practice of probable errors are wholly at fault. I think this will illustrate what I mean by Don Quixote and the windmill.

II. Is Tuberculosis to be regarded from the Aetiological Standpoint as an acute disease of Childhood? By Dr Kr. F. ANDVORD (Christiania). Tubercle, Vol. 11. No. 3, December, 1921.

This paper is, we must confess, unconvincing. The author holds that in a community that has long been subject to tuberculosis the time of infection should be fixed in the infantile years for the great majority of cases and consequently we should protect children for the first three or four years from infection. As evidence of his views he takes a graph of what he calls a “population frame”, which is really the well-known “number living in a stationary population” ($l_x$), and represents within this graph the numbers dying from tuberculosis and the numbers who have suffered from it at each age. We are doubtful if his graphs for deaths are correctly drawn.
They are made to rise suddenly for about a year and then fall till age 7, but we suspect that they should fall from birth till age 7. We cannot justify his chart (No. VIII) which gives the whole population and the tubercular population.

* Phil. Trans. Vol. 195A, p. 14.

The non-tubercular found by this chart actually increase after age 17 for many years, so that the non-tubercular not only have no mortality but are increased by some process of resurrection! Admittedly the chart is hypothetical but as it stands it calls for amendment.

Dr Andvord's remark that “one would hardly gather from these per-thousand curves,” i.e. from rates of mortality for various ages, “that, as is really the case, more persons die from tuberculosis in the first and second years of life than in any subsequent age period” seems to betray an inexperience in matters related to a life table: this weakness is shown elsewhere, e.g. p. 102, where deaths are stated without populations and without reference to age distributions.

Dr Andvord may have other evidence in support of his views but the article under review does not justify them statistically; we think every point he brings out could be explained as well on other hypotheses. He cannot, moreover, completely prove his case till he has studied communities which become subject to infection after having been kept free from it. For if his theory be correct, the measures he proposes would necessarily produce such a community.

W. PALIN ELDERTON.
By Karu Pearson, F.R.S. Price 6s. net. Sold only with complete sets. A Second Study of the Statistics of Pulmonary. Tuberculosis. Marital Infec- tion. By Ernest G. Pork, revised by Karn Pearson, F.R.S. With an Appendix on Assortative Mating by EtHen M. ELDERTON. Price 3s. net. The Health of the School-Child in re- lation to its Mental Characters. By Karn Parson, F.R.S. [In preparation. On the Inheritance of the Diathesis of Phthisis and Insanity. A Statistical Study based upon the Family History of 1,500 Criminals. By CHARLES GORING, M.D., B.Sc. Price 3s. net. A Third Study of the Statistics of Pulmonary Tuberculosis. The Mortality of the Tuberculous-and Sanatorium Treat- ment. By W. P. Experton, F.LA., and S. J. Perry, ALA. Price 3s. net. VII. Onthe Intensity of Natural Selection in Man. (On the Relation of Darwinism to the Infantile Death-rate.) By E. C. Snow, D.Se. Price 3s. net. VIII. A Fourth Study of the Statistics of Pulmonary Tuberculosis: the Mortality of the Tuberculous: Sanatorium and Tuber- culin Treatment. By W. Pain ELperton, F.LA., and Sipney J. Perry, ALA. Price 3s. net. IX. A Statistical Study of Oral Tem- peratures in School Children with special reference to Parental, Environmental and Class Differences. By M. H. Wrutiams, M.B., Jonza Brut, M.A., and Karu PraRson, F.R.S. Price 6s. net. vi. Technical Series. On a Theory of the Stresses in Crane and Coupling Hooks with Experimental Comparison with Existing Theory. By E. S. Anprews, B.Sc. Eng., assisted by Karu Prarson, F.R.S. Jsswed. Price 3s. net. On some Disregarded Points in the | Stability of Masonry Dams.- By L. W. ATCHERLEY, assisted by Karn PEARSON, F.R.S. Jsswed. Price 7s. net. Sold only with complete sets. On the Graphics of Metal Arches with special reference to the Relative Strength of Two-pivoted, Three-pivoted and Built-in Metal Arches. By L. W. ATCHERLEY and Kart Prarson, F.R.S. Issued. Price 5s. net. On Torsional Vibrations in Axles and Shafting. By Karu Pearson, F.R.S. 
Issued, Price 6s, net, vV. An Experimental Study of the Stresses in Masonry Dams. By Karu Pearson, F.R.S., and A. F. Camppenn POLLARD, assisted by C. W. WurgEn, B.Sc. Eng., and L. F. RicHarpson, B.A. Jssued. » Price 7s. net. On a Practical Theory of Elliptic and Pseudo-elliptic Arches, with special refer- ence to the ideal Masonry Arch. By Karu Pearson, F.R.S., W. D. Reynonps, B.Sc. Eng, and W. F. Sranton, B.Sc. Eng. Issued, ° Price 4s. net. VI. VII. On the Torsion resulting from Flexure in Prisms with Cross-sections of Uni-axial Symmetry only. By A. W. Youne, ErHEen M. Evperton and Karp Prarson, F,R,S, Issued, Price 7s 6d. net II. THO IV. II. III. IV. VI. VII. VIII. IX. Drapers’ Company Research Memoirs—(cont.). Questions of the Day and of the Fray. The Influence of Parental Alcoholism on the Physique and Ability of the Off- spring. A Reply to the Cambridge Econo- mists. ls. net. Mental Defect, Mal-Nutrition, and the Teacher’s Appreciation of Intelligence. A Reply to Criticisms of the Memoir on ‘The Influence of Defective Physique and Unfavourable Home Environment on the Intelligence of School Children.’ By Davip Heron, D.Sc. Price 1s. net. An Attempt to correct some of the | Misstatements made by Sir Victor Hors- LEY, F.R.S., F.R.C.S., and Mary D. SturRGE, M.D., in their Criticisms of the Memoir : ‘A First Study of the Influence of Parental By Karu Pearson, F.R.S. Price -| Alcoholism,’ &c. By Karn Pearson, F.R.S. | Price 1s. net. The Fight against Tuberculosis and | the Death-rate from Phthisis. By Karu Prarson, F.RS. [Out of print. Social Problems: Their Treatment, Past, Present and Future. By Karn PEARSON, F.R.S. Price 1s. net. | | | VI. Hugenics and Public Health. Lecture to the York Congress of the Royal Sanitary Institute. By Karn Pearson, F.R.S. Price ls. net. VII. Mendelismandthe Problem of Mental Defect. I. A Criticism of Recent American Work. By Davin Heron, D.Se. (Double Number.) Price 2s. net. VIII. Mendelism andthe Problemof Mental Defect. II. 
The Continuity of Mental Defect. By Karu Parson, F.R.S., and Gustav A.J AEDERHOLM, Ph.D. Prices. net. IX. Mendelism and the Problemof Mental Defect. IIT. On the Graduated Character of Mental Defect, and on the need for standardizing Judgments as to the Grade of Social Inefficiency which shall involve Segregation. By Kart Pearson, F.R.S. (Double Number.) Price 2s. net. xX. The Science of Man. Its Needs and its Prospects. By Karu Pearson, F.R.S. Being the Presidential Address to Section H of the British Association, 1920. Price 1s. 6d. net. Eugenics Laboratory Publications MEMOIR SERIES. The Inheritance of Ability. By Epcar ScuusteEr, D.Sc., Formerly Galton Research Fellow, and ErHrt M. Ex.prrton, Galton Scholar. Price 4s. net. Sold only with complete sets. A First Study of the Statistics of Insanity and the Inheritance of the Insane Diathesis. By Davin Heron, D.Sc., Form- erly Galton Research Fellow. Price 3s. net. The Promise of Youth and the Performance of Manhood. By Epear ScuusteErR, D.Sc., Formerly Galton Research Fellow. Price 2s. 6d. net. On the Measure of the Resemblance of First Cousins. By Eran M. ELpEerton, Galton Research Fellow, assisted by Karu Pearson, F.R.S. Price 3s. 6d. net. A First Study of the Inheritance of Vision and of the Relative Influence of Heredity and Environment on Sight. By Amy Barrryeron and Karu PEARSON, F.R.S. Price 4s. net. Treasury of Human Inheritance (Pedigrees of physical, psychical, and patho- logical Charactersin Man). Parts I and II (double part). (Diabetes insipidus, Split- Foot, Polydactylism, Brachydactylism, Tuberculosis, Deaf-Mutism, and Legal Ability.) Price 14s. net. On the Relationship of Condition of the Teeth in Children to Factors of Health and Home Environment. By E. C. Ruopsgs, A B.A. Price 9s. net. The Influence of Unfavourable Home Environment and Defective Physique on the Intelligence of School Children. By Davip Herron, M.A., D.Sc., Formerly Galton Research Fellow. [Out of print. 
The Treasury of Human Inheritance (Pedigrees of physical, psychical, and patho- logical Characters in Man). Part III. (Angioneurotic Oedema, Hermaphroditism, Deaf-Mutism, Insanity, Commercial Abili- ty.) Price 6s. net. X. TheInfluence of Parental Alcoholism on the Physique and Intelligence of the Offspring. By Eraen M. Etprrton, as- sisted by Karn PEarson. Second Edition. Price 4s. net. The Treasury of Human Inheritance (Pedigrees of physical, psychical, and patho- logical Characters in Man). Part IV. (Cleft Palate, Hare-Lip, Deaf-Mutism, and Congenital Cataract.) Price 10s. net. The Treasury of Human Inheritance (Pedigrees of physical, psychical, and patho- logical Characters in Man). Parts V and VI. (Haemophilia.) Price 15s. net. A Second Study of the Influence of Parental Alcoholism on the Physique and Intelligence of the Offspring. By Karu Parson, F.R.S., and Eraen M, ExpErTon. Price 4s. net. A Preliminary Study of Hxtreme Alcoholism in Adults. By Amy Barrine- ton and Karu Prarson, F.R.S., assisted by Davip Heron, D.Sc. Price 4s. net. The Treasury of Human Inheritance. Dwarfism, with 49 Plates of Illustrations and 8 Plates of Pedigrees. Price 15s. net. The Treasury of Human Inheritance. Prefatory matter and indices to Vol. I. With Frontispiece Portraits of Sir Francis Galton and Ancestry. Price 3s. net. A Second Study of Extreme Alco- holism in Adults. With special reference to the Home-Office Inebriate Reformatory data. By Davip Huron, D.Sc. Price 5s. net. On the Correlation of Fertility with Social Value. A Cooperative Study. Price 6s. net. . XIX—XX. Report on the English Birthrate. Part I. England, North of the Humber. By Eruet M. EpErron, Galton Research Fellow. Price 9s. net. The Treasury of Human Inheritance. Vol. I (Nettleship Memorial Volume). Ano- malies and Diseases of the Eye. Part I. [Just ready. XI. XII. XIII. XIV. XV. XVI. XVII. XVIII. XXII. Eugenics Laboratory Publications—(cont.). Vol. 
I of The Treasury of Human Inheritance (VI+IX+XI+XI1+XV+XVI of the above memoirs) may now be obtained bound in buckram, price 57s. 6d. net. Buckram cases for binding can be purchased at 3s. 9d. with impress of the bust of Sir Francis Galton. An engraved portrait of Sir Francis Galton can be obtained by sending a postal order for 3s. 6d. to the Secretary to the Laboratory, University College, London, W.C. LECTURE SERIES. Price Is. net each (Nos. III, X, XI, XII excepted). I. The Scope and Importance to the | State of the Science of National Eugenics. By Karu Pearson, F.R.S. Third Edition. II. The Groundwork of Hugenics. By Karr PEARSON, F.R.S. Second Edition. The Relative Strength of Nurture and Nature. Much enlarged Second Edition. Part I. The Relative Strength of Nurture and Nature. (Second Edition revised.) By Eruen M. Experton. Part II. Some Recent Misinterpretations of the Problem of Nurture and Nature. (First Issue.) By Kart Prar- son, F.R.S. Price 2s. net. IV. On the Marriage of First Cousins. By Erxet M. ELDERTON. V. The Problem of Practical Eugenics. By Kart Puarson, F.R.S. Second Edition. VI. Nature and Nurture, the Problem of the Future. By Karn Prarson, F.R.S. Second Edition. III. VII. The Academic Aspect of the Science of National Eugenics. By Karn Prarson, F.R.S. VIII. Tuberculosis, Heredity and Environ- ment. By Karn PkArsoy, F.R.S. IX. Darwinism, Medical Progress and Hu- genics. The Cavendish Lecture, 1912. By Karu Pgarson, F.R.S. xX. The Handicapping of the First-born. By Karu Prarson, F.R.S. Price 2s. net. National Life from the Standpoint of Science. (Third Issite.) By Karn Pearson, F.R.S. Price 1s. 6d. net. The Function of Science in the Modern State. (New Issue.) By Karn Prarson, F.RS.- Price 2s. net. Sidelights on the Evolution of Man. By Kart Prarson, F.R.S:. Price 3s. net. XI. XII. XIII. The Chances of Death and other Studies in Evolution By KARL PEARSON, F.R.S. GALTON Proressor, UNIVERSITY oF LONDON x Vout. I 1. The Chances of Death. 2. 
The Scientific Aspect of Monte Roulette. Carlo 3. Reproductive Selection. 4, Socialism and Natural Selection. 5. Politics and Science. 6. Reaction. 7. Woman and Labour. 8 . Variation in Man and Woman. Reissue. Vou. II 9. Woman as Witch. Evidences of Mother- Right in the Customs of Mediaeval Witchcraft. 10. Ashiepattle, or Hans seeks his Luck. 11. Kindred Group Marriage. Part I. Mother Age Civilisation. Part IIT. General Words for Sex and Kinship. Part III. Special Words for Sex and Relationship. 12. The German Passion Play: A Study in the Evolution of Western Christianity. Price 80/- net. The following works prepared in the Biometric Laboratory can be obtained from H.M. Stationery Office. The English Convict, A Statistical Study. By CHartes Gorina, M.D. Text. Price 9s. Tables of Measurements (printed by Convict-Labour). Price 5s. The English Convict. An Abridgment, with an Introduction by Karu Pearson, F.R.S. Price 3s. Tables of the Incomplete [-Function. Edited with an Introduction by Kar PEARSON, F.R.S. Price £2. 2. Od. or by Post £2. 2s. 9d. The Life, Letters, and Labouré of Francis Galton By KARL PEARSON, F.R:S. GALTON PROFESSOR, UNIVERSITY OF LONDON Vol. |. Birth 1822 to Marriage 1853 WITH 5 PEDIGREE PLATES AND 72 PHOTOGRAPHIC PLATES, FRONTISPIECE AND 2 TEXT-FIGURES Price 30s. net “Tt is not too much to say of this book that it will never cease to be memor- able. Never will man hold in his hands a biography more careful, more complete.”—The Times “A monumental tribute to one of the most suggestive and inspiring men of . ” . f=) modern times.”— Westminster Gazette “Tt was certainly fitting that the life of the great exponent of heredity should be written by his great disciple, and it is gratifying indeed to find that he has made of it, what may without exaggeration be termed a great book.’—Daily Telegraph Vol. II is now im preparation. Tables for Statisticians and Biometricians Epitep By KARL PEARSON, E.R.S. GALTON PROFESSOR, UNIVERSITY OF LONDON Price 15s. 
net
“To the workers in the difficult field of higher statistics such aids are invaluable. Their calculation and publication was therefore as inevitable as the steady progress of a method which brings within grip of mathematical analysis the highly variable data of biological observation. The immediate cause for congratulation is, therefore, not that the tables have been done but that they have been done so well....... The volume is indispensable to all who are engaged in serious statistical work.” - Science
“The whole work is an eloquent testimony to the self-effacing labour of a body of men and women who desire to save their fellow scientists from a great deal of irksome arithmetic; and the total time that will be saved in the future by the publication of this work is, of course, incalculable....... To the statistician these tables will be indispensable.” - Journal of Education
“The issue of these tables is a natural outcome of Professor Karl Pearson’s work, and apart from their value for those for whose use they have been prepared, their assemblage in one volume marks an interesting stage in the progress of scientific method, as indicating the number and importance of the calculations which they are designed to facilitate.” - Post Magazine
(Copies of the Corrigenda to these Tables can be obtained by former purchasers on sending a stamped and directed envelope to Mr C. F. Clay)

Just issued, Cambridge University Press: Mounted Charts of Weight and Health of Male and Female Babies. Price 7s. 6d. net the pair.

CAMBRIDGE UNIVERSITY PRESS
C. F. CLAY, MANAGER
LONDON: FETTER LANE, E.C. 4

UNIVERSITY OF LONDON, UNIVERSITY COLLEGE
DEPARTMENT OF APPLIED STATISTICS

The Biometric Laboratory
(assisted by a grant from the Worshipful Company of Drapers)
Until the phenomena of any branch of knowledge have been submitted to measurement and number it cannot assume the status and dignity of a science. FRANCIS GALTON.
Under the direction of Professor Karl Pearson, F.R.S.
Assistants: Dr Julia Bell, E. S. Pearson, B.A.; Crewdson Benington Student in Anthropometry: G. Morant, B.Sc.; Hon. Research Assistant: J. Henderson, M.A.
This laboratory provides a complete training in modern statistical methods and is especially arranged so as to assist research workers engaged on biometric problems.

The Francis Galton Eugenics Laboratory
National Eugenics is the study of agencies under social control, that may improve or impair the racial qualities of future generations, either physically or mentally.
The Laboratory was founded by Sir Francis Galton and is under the supervision of Professor Karl Pearson, F.R.S. Galton Research Fellow: Ethel M. Elderton; Reader in Medical Statistics: M. Greenwood, M.R.C.P., M.R.C.S.; Medical Officer: Percy Stocks, M.A., M.D.; Assistants: E. C. Rhodes, M.A., M. Noel Karn, M. Moul and J. O. Irwin, B.A.; Secretary: M. B. Child.
It was the intention of the Founder, that the Laboratory should serve (i) as a storehouse of statistical material, bearing on the mental and physical conditions in man, and the relation of these conditions to inheritance and environment; (ii) as a centre for the publication or other form of distribution of information concerning National Eugenics; (iii) as a school for training and assisting research-workers in special problems in Eugenics.

Now Ready, Cambridge University Press
New Series, TRACTS FOR COMPUTERS
This Series will endeavour to supply a gap in statistical literature, namely “first aid” to the professional computer.
No. I. Tables of the Digamma and Trigamma Functions. By Eleanor Pairman, M.A. Price 3s. net. Tables for summing $S = \sum^{i=\infty} \frac{1}{(p_1 i + q_1)(p_2 i + q_2) \cdots (p_n i + q_n)}$, where the p’s and q’s are numerical factors.
No. II. On the Construction of Tables and on Interpolation. Part I. Univariate Tables. By Karl Pearson, F.R.S. Price 3s. 9d. net.
No. III. On the Construction of Tables and on Interpolation. Part II. Bivariate Tables. By Karl Pearson, F.R.S. Price 3s. 9d.
net.
No. IV. Tables of the Logarithms of the Complete Γ-function to Twelve Figures. Reprint with Portrait of A. M. Legendre. Price 3s. 9d. net.
No. V. Table of Coefficients of Everett’s Central-Difference Interpolation Formula. By A. J. Thomson, B.Sc. Price 3s. 9d. net.
No. VI. Smoothing. By E. C. Rhodes, B.A. Price 3s. 9d. net.
No. VII. The numerical Evaluations of the Incomplete B-Function or of the Integral $\int_0^x x^{p-1}(1-x)^{q-1}\,dx$ for Ranges of x between 0 and 1. By H. E. Soper, M.A. Price 3s. 9d. net.
No. VIII. Table of the Logarithms of the Complete Γ-Function (to ten decimal places) for Argument 2 to 1200 beyond Legendre’s Range (Argument 1 to 2). By Egon S. Pearson, B.A. Price 3s. 9d. net.

(All Rights reserved)

BIOMETRIKA. Vol. XIV, Parts I and II
CONTENTS
I. The Standard Deviations of Fraternal and Parental Correlation Coefficients. By Kirstine Smith, D.Sc.
II. On the Variations in Personal Equation and the Correlation of Successive Judgments. By Egon S. Pearson, B.A. (With twenty Diagrams in Text)
III. Inheritance in the Foxglove, and the Result of Selective Breeding. By Ernest Warren, D.Sc. (With Plate)
IV. On Polychoric Coefficients of Correlation. By Karl Pearson, F.R.S. and Egon S. Pearson. (With four Diagrams in Text)
V. On Expansions in Tetrachoric Functions. By James Henderson, M.A. (With seven Diagrams in Text)
Miscellanea:
I. On the χ² test of Goodness of Fit. By Karl Pearson, F.R.S.
II. Is Tuberculosis to be regarded from the Aetiological Standpoint as an acute Disease of Childhood? By Dr Kr. F. Andvord. Review of paper in Tubercle by W. Palin Elderton.

The publication of a paper in Biometrika marks that in the Editor’s opinion it contains either in
method or material something of interest to biometricians. But the Editor desires it to be distinctly understood that such publication does not mark assent to the arguments used or to the conclusions drawn in the paper.

Biometrika appears about four times a year. A volume containing about 400 pages, with plates and tables, is issued annually. Papers for publication and books and offprints for notice should be sent to Professor Karl Pearson, University College, London. It is a condition of publication in Biometrika that the paper shall not already have been issued elsewhere, and will not be reprinted without leave of the Editor. It is very desirable that a copy of all measurements made, not necessarily for publication, should accompany each manuscript. In all cases the papers themselves should contain not only the calculated constants, but the distributions from which they have been deduced. Diagrams and drawings should be sent in a state suitable for direct photographic reproduction, and if on decimal paper it should be blue-ruled, and the lettering only pencilled.

Papers will be accepted in French, Italian or German. In the last case the manuscript should be in Roman not German characters. Russian contributors may use Russian but their papers will be translated into English before publication.

Contributors receive 25 copies of their papers free. Fifty additional copies may be had on payment of 7/- per sheet of eight pages, or part of a sheet of eight pages, with an extra charge for Plates; these should be ordered when the final proof is returned.

The subscription price, payable in advance, is 40s. net per volume (packing and postage 4s.): single numbers 15s. net (postage 1s.). Owing to the scarcity of early volumes, the following rates must now be charged: Volumes I-II in wrappers, sold only with complete sets, 60s. net each; Vols. III-XIII in wrappers, 40s. net each. Index to Vols. I to V, 2s. net. Standard buckram cases for binding, price 3s.
per volume. Subscriptions should be sent direct to the Secretary (Miss M. B. Child), Biometric Laboratory, University College, London, W.C. 1, to whom orders for series and single copies should be addressed. All cheques should be crossed “Biometrika Account.”

PRINTED IN ENGLAND BY J. B. PEACE, M.A., AT THE CAMBRIDGE UNIVERSITY PRESS.