BIOMETRIKA

A JOURNAL FOR THE STATISTICAL STUDY OF BIOLOGICAL PROBLEMS

FOUNDED BY W. F. R. WELDON, FRANCIS GALTON AND KARL PEARSON

EDITED BY KARL PEARSON

VOLUME X
APRIL 1914 TO MAY 1915

CAMBRIDGE: AT THE UNIVERSITY PRESS
LONDON: FETTER LANE, E.C. (C. F. CLAY, MANAGER)
AND H. K. LEWIS, GOWER STREET
WILLIAM WESLEY AND SON, 28, ESSEX STREET, STRAND
EDINBURGH: 100, PRINCES STREET
CHICAGO: UNIVERSITY OF CHICAGO PRESS
BOMBAY, CALCUTTA AND MADRAS: MACMILLAN AND CO., LIMITED
TORONTO: J. M. DENT AND SONS, LIMITED
TOKYO: THE MARUZEN-KABUSHIKI-KAISHA

[All rights reserved]

Cambridge: PRINTED BY JOHN CLAY, M.A. AT THE UNIVERSITY PRESS.

CONTENTS OF VOL. X.

Memoirs.

I. Congenital Anomalies in a Native African Race. By HUGH STANNUS STANNUS
II. Tables of Poisson's Exponential Binomial Limit. By H. E. SOPER
III. On the Poisson Law of Small Numbers. By LUCY WHITAKER
IV. The Relationship between Weight of the Seed Planted and the Characteristics of the Plant Produced. By J. ARTHUR HARRIS
V. On the Probability that two Independent Distributions of Frequency are really Samples of the same Population, with special Reference to Recent Work on the Identity of Trypanosome Strains. By KARL PEARSON
VI. On Homotyposis and Allied Characters in Eggs of the Common Tern. By WILLIAM ROWAN, K. M. PARKER and JULIA BELL
VII. A Piebald Family. By E. A. COCKAYNE
VIII. Clypeal Markings of Queens, Drones and Workers of Vespa vulgaris. By OSWALD H. LATTER
IX. Table of the Gaussian "Tail" Functions; when the "Tail" is larger than the Body. By ALICE LEE
X. Contribution to a Statistical Study of the Cruciferae. Variation in the Flowers of Lepidium draba Linnaeus. By JAMES J. SIMPSON
XI. Nochmals über "The Elimination of Spurious Correlation due to Position in Time or Space." Von O. ANDERSON
XII. Statistical Notes on the Influence of Education in Egypt. By M. HOSNY
XIII. Height and Weight of School Children in Glasgow. By ETHEL M. ELDERTON
XIV. Numerical Illustrations of the Variate Difference Correlation Method. By BEATRICE M. CAVE and KARL PEARSON
XV. An Examination of some Recent Studies of the Inheritance Factor in Insanity. By DAVID HERON
XVI. On the Probable Error of the Bi-Serial Expression for the Correlation Coefficient. By H. E. SOPER
XVII. On the Partial Correlation Ratio. Part I, Theoretical. By L. ISSERLIS
XVIII. Association of Finger-Prints. By H. WAITE
XIX. On the Problem of Sexing Osteometric Material. By KARL PEARSON
XX. Further Evidence of Natural Selection in Man. By ETHEL M. ELDERTON and KARL PEARSON
XXI. Frequency Distribution of the Values of the Correlation Coefficient in Samples from an indefinitely large Population. By R. A. FISHER
XXII. Appendix to Papers by "Student" and R. A. Fisher. EDITORIAL
XXIII. Tuberculosis and Segregation. By ALICE LEE
XXIV. The Influence of Isolation on the Diphtheria Attack- and Death-rates. By ETHEL M. ELDERTON and KARL PEARSON

Miscellanea.

(i) The Statistical Study of Dietaries, a Reply to Professor Karl Pearson. By D. NOEL PATON
(ii) The Statistical Study of Dietaries, a Rejoinder. By KARL PEARSON
(iii) Note on the Essential Conditions that a Population breeding at random should be in a Stable State. By KARL PEARSON
(iv) The Elimination of Spurious Correlation due to Position in Time or Space. By "STUDENT"
(v) On certain Errors with regard to Multiple Correlation occasionally made by those who have not adequately studied the Subject. By KARL PEARSON
(vi) Formulae for the Determination of the Capacity of the Negro Skull from External Measurements. By L. ISSERLIS
(vii) Note on a Negro Piebald. By C. D. MAYNARD
(viii) Note on Infantile Mortality and Employment of Women. By ETHEL M. ELDERTON
(ix) Announcement of Prize Essay by Professor F. M. URBAN
(x) Corrigendum, Congo Female Crania in Vol. VIII. p. 307
(xi) On Spurious Values of Intra-class Correlation Coefficients arising from Disorderly Differentiation within the Classes. By J. ARTHUR HARRIS
(xii) Correction of a Misstatement made by Mr MAJOR GREENWOOD, Junior. By K. P.
(xiii) Note on Reproductive Selection. By DAVID HERON
(xiv) On the Probable Error of a [illegible] Coefficient. By KARL PEARSON
(xv) On Medieval English Femora, A Reply to Professor [name illegible]. By KARL PEARSON

Plates.

Plate I. (1) Samuti, an Ateliotic Dwarf. (2) Subgiant, Height 1·92 m., with Wife and Albinotic Child. (3)–(4) Etimu, aged 25, an Achondroplasic Dwarf.
Plate II. (5)–(6) Masimosya, aged 19, Gynaecomastos, with features which were formerly described as those of Partial Hermaphroditism. (7) Microcephalic Idiot. (8) Case of Hydrocele testis.
Plate III. (9)–(11) Boy, aged 15 years, showing Scaphocephaly.
Plate IV. (12)–(13) Son of Matikwiri, aged 7, a case of Scaphocephaly. (14) Cases of Umbilical Hernia. (15) [Split?] Hand and Foot in a Child, aged 5.
Plate V. (16) Case showing faint median depression of upper lip. (17) Blantyre boy, aged 10, with Hare-lip. (18) Young Woman with two nipples on left breast. (19) Gobedi, aged 22, with congenital Humeral Micromely.
Plate VI. (20) Ndala, Split Hand, left only. (21) Chibisa, aged 30, elongation of all segments of middle finger and its metacarpal bone. (22) Shortening of the fourth metatarsal bone. (23) Case of [illegible] Shortening of the left [illegible] toe.
Plate VII. Transition of Poisson's Exponential Limit of the Binomial Series into the Gaussian.
Plate VIII.
Sample Eggs, Common Tern, Natural Size (in colours).
Plate IX. Types of Mottling of Eggs of Common Tern.
Plate X. Dr Maynard's Piebald Negro.
Plate XI. Pedigree of Cockayne's Piebald Family.
Plates XII–XVI. Photographs of members of Cockayne's Piebald Family.
Plate XVII. Right forearm of a member of Cockayne's Piebald Family to show leucotic patches.
Plate XVIII. Leucotic forehead, blaze and white [illegible] in a member of Cockayne's Piebald Family.
Plate XIX. Model of Skew Regression Surface giving mean Weight of Girls of Class B in Glasgow Schools for a given Height and Age.
Plate XX. Types of Finger-Prints.

Corrigenda and Addenda.

The following corrections and additions to the memoir on the Partial Correlation Ratio, pp. 391–411, have been received from Mr L. Isserlis:

Page 396. Eqns (28), (30) and (35).
Page 397. Eqns (37), (38) and (46), and the lines following Eqns (45) and (46).
Page 402. Line 8 from top: after the expression insert "provided the SDs of x for constant y are homoscedastic, or sufficiently so, for this to be an approximation." Line 6 from bottom. Expression at foot of page: a factor of the first term in the large curled bracket, and a factor of the expression in the numerator of the last line of the same bracket.
Page 403. Denominator of second term of expression in first line.
Page 405. Denominator of left-hand side of Eqn (70).
Page 406. Line below Eqn (71), and Eqn (72).

[The "for ... read ..." expressions of these corrections are illegible in this copy.]

Vol. X.
Part I. April, 1914. Price Ten Shillings net. [Issued April 30, 1914]
VOLUME X    APRIL, 1914    No. 1

BIOMETRIKA

CONGENITAL ANOMALIES IN A NATIVE AFRICAN RACE.

By HUGH STANNUS STANNUS, M.D. Lond., Medical Officer, Nyasaland.

(1) I have thought it would be of interest to put on record some observations made by myself in Nyasaland during the past seven years, on the subject which appears as the title of this paper.

These observations relate to members of a native population of Bantu stock, belonging to several main tribes, namely, Mananja, Yao, Ngoni and Tumbuka, with a few references to the Nkonde in the north and the Nguru from the south-east. My interest in the subject was aroused by the frequency with which some abnormalities were seen and I think the facts I bring forward will go to shew that this unusual incidence is real and not only the result of the ease with which observations may be made among a partially clothed community.
Statistics dealing with the subject, to be of value, must treat of large numbers; such have, however, only been possible in a few instances, to be referred to later. I speak therefore largely from impressions in appraising the rarity or otherwise of any particular condition. It should be remembered in this direction that the cases now to be reported have been met with more or less casually, most of them while travelling on the path or in some village, few in the course of Native Hospital work and none in any Special Department.

Classification is a matter of some difficulty for many reasons, and as the number of anomalies to be described is not very large it is perhaps more convenient to consider the various conditions according to the anatomical part affected.

One large section of congenital anomalies, Anomalies of Pigmentation, I have already dealt with (Biometrika, Vol. IX. pp. 333–365), and they will not be touched on in the present paper.

(2) Dealing with those deviations from the normal in which there is a change of a more or less general nature, I refer firstly to Infantilism, at the same time recognising that such a condition may not constitute a truly congenital anomaly. To the class designated Idiopathetic Infantilism I should relegate a woman aged 22 years seen in 1911 at Zomba who presented the figure and development of a girl of 13. There was no breast development, no pubic or axillary hair, and the rounded contours of the body and limbs usually associated with this age in a woman were wanting; menstruation had not commenced. In other respects she appeared normal and her mental development was but little if at all below the average.

(3) In W. Nyasa I encountered a very excellent example of the Ateliotic Dwarf, a perfect "little man," a man in miniature 1·25 metres in height.
Another case which I think must be considered as one of simple dwarfism is here reproduced: Samuti, aged 35, a Yao, 1·42 metres high. He is shewn together with a man of 1·85 metres. Samuti shews no other abnormality (Plate I, (1)).

No case of Cretinism or Myxoedematous Dwarfism has been seen. I may here mention that Cachectic Infantilism is well seen in some cases of spinal caries among Natives just as among Europeans.

A paper on "Congenital Humeral Micromely" in the Nouvelle Iconographie de la Salpêtrière, T. XXIV. pp. 463–471, Paris 1911, by Dr S. A. Kinnier Wilson and myself, contains references to two cases of Achondroplasia in Nyasaland. Since then I have heard of two other cases and seen a fifth: Etimu, male, aged 25 years, a Yao, son of Masinjiri of Ndindi's near Chipoli, Dedza District. The subject stated that he had no children and that no member of the family was known to have been similarly affected. He is a perfect example of the condition as the photographs will attest, and further remarks are unnecessary (Plate I, (3) and (4)).

The following measurements were made and tracings of his hands are here depicted (Fig. 1):

(1) Head: maximum length: 20·1 cm.
(2) Head: maximum breadth: [illegible]
(3) Head: circumference: 60·0 cm.
(4) Nose: length, base to root: 3·6 cm.
(5) Nose: breadth, across nostrils: 4·5 cm.
(6) Face: bizygomatic breadth: [illegible]
(7) Face: length, nasion to chin: [illegible]
(8) Face: length, nasion to commissure of lips: 6·7 cm.
(9) Standing height: [illegible]
(10) Span of arms: [illegible]
(11) Arm: acromion to external condyle of humerus: [illegible]
(12) Forearm: external humeral condyle to tip of ulnar tubercle: [illegible]
(13) Forearm to tip of middle finger: [illegible]
(14) Leg: top of iliac crest to head of fibula: [illegible]
(15) Leg: top of iliac crest to external malleolus: [illegible]
(16) Leg: top of iliac crest to sole of foot: [illegible]
(17) Trunk: upper border of sternum to umbilicus: [illegible]
(18) Trunk: upper border of sternum to symphysis pubis: 45 (?) cm.

Fig. 1. Etimu. Tracings of the left and right hands.
(4) No case of actual Gigantism has been seen. Tallness or shortness often runs in families. The tallest man I have ever seen measured 1·92 metres. He was the father of an albinotic child and had internal strabismus but no signs of acromegaly (see Plate I, (2)). Another man, whom I have not seen but who was measured by Dr Davey at Kota Kota, was 2·0 metres in height.

No case of Acromegaly has been seen by myself.

(5) The following case, in the want of development of the lower jaw and zygomatic arches, might be considered as the converse to acromegaly (Fig. 2). From the sketch the subject will at once be recognised as a type of Congenital Idiot, the above-mentioned features and ill-formed pinnae together with the rather bird-like appearance being characteristic.

Jaidi, male, aged 20 years, a Yao of Chumbosa, Bursali, is the second child of a family of three, the elder brother being dead and the younger sister normal. No family history was elicited.

Fig. 2. Jaidi.

The growth of the face is defective as before noted; the zygomatic arches are so little developed that there are practically no cheeks. The descending rami of the jaws converge very considerably so that the floor of the mouth is very narrow, and the horizontal rami are so short that the symphysis is situated mid-way between the lower lip and the neck as they lie on one horizontal plane. The palate is high and narrow. The following measurements were made:

Maximum occipito-frontal: 19·1 (?) cm.
Maximum bi-parietal: 13·8 cm.
Bizygomatic at junction of zygoma with temporal: 12·3 cm.
Nose: length: 4·7 cm.
Nose: breadth: 3·8 (?) cm.
Face: nasion to commissure of lips: [illegible]
Face: nasion to symphysis of chin: [illegible]

Right external strabismus is present and vision defective. Though mentally an imbecile with an impaired speech he is an excellent field labourer.
He states that no woman would marry him but that he has had sexual intercourse and that he is capable of the act.

A few other cases of Congenital Idiocy have been seen and include an example of Spastic Diplegia, a Mongol Idiot aged 4 years in W. Nyasa district and two microcephalic idiots met with in adjacent villages in Chikala district, in neither of which were factors of etiological interest elicited.

(a) Aged 22, male, looked like a boy of 12 in physical development; the head was very small but no measurements were made; the palpebral fissures were markedly slanting downwards and inwards and an internal strabismus was present; the ears and palate were normal; the hands large and like those of a man.

(b) A male infant aged one year with so marked a degree of microcephaly as to approach in type anencephaly, the resemblance being the more marked as the protuberant eyes and lips were like those characteristically found in anencephalic monsters (Plate II, (7)).

(6) The following case is given at length (Plate II, (5) and (6)).

Masimosya, aged 19 years (1911), a Yao of Chipi's village, Zomba, exhibits a marked want of development of sexual organs (male) associated with large breasts. The general form of the body is that of a woman; the attitude, voice, laugh, facial aspect and expression resemble those of a woman rather than of a man. The teeth are good, the body and limbs well developed and there is a fair deposit of subcutaneous fat. The breasts (see photo) are remarkable, being large, with large well-formed nipples and well-marked areolae, dark in colour. They have started to become pendulous and resemble exactly those of a nulliparous woman of the same age. The abdomen is well formed and round the umbilicus there is a deposit of fat such as is commonly seen in women; the pelvis appears large. There is some hair in the axillae but none on the face or body.
The pubes is rather prominent, resembling the female mons veneris, and there is some development of hair upon it. The penis is very small, only two inches in length and of infantile type; the glans is covered by a prepuce and there is no deformity. The scrotum is very small indeed and only contains one testicle, the left, which can be felt as a small body about the size of a bean, three-eighths of an inch long. The right testicle is not apparently present in the scrotum or inguinal canal. The scrotum shews no tendency to be divided nor is there anything in the arrangement of the skin to suggest labia. No rectal examination was made.

The subject is insane. He is fairly tractable and good-natured. He has delusions and hallucinations, it is reported, with various phases of the moon, when he is said to travel 15 miles to bathe in a certain stream, etc. He has tried to burn down some houses. I could get very little of his history. The mother and father are said to have been normal; the only other child, a girl, was insane and died in the Central Asylum. The subject once cohabited with a woman who was to have been his wife, but she ran away the next day, and I was unable to find out from him if he had any sexual desire.

Such is a case which would have been called one of Partial Hermaphroditism, but in the absence of further data I shall not discuss it.

(7) Obesity. No cases of general obesity outside normal limits with possibly a congenital origin have been seen. Steatopygy does not occur.

(8) Symmetrical Lipomatosis is conveniently considered here though perhaps not strictly within the subject. Three old women have been seen all presenting the same abnormal feature, namely, the presence of symmetrical lipomata in both axillae, each about the size of a small orange.
In a fourth case the affection was one-sided, the subject giving a history of the gradual descent of the tumour from the upper aspect of the shoulder into the arm-pit. That these tumours were lipomata I can only support by clinical examination; they certainly were not of the nature of the pads seen in myxoedema and no signs of that disease were present. There is the possibility that they were accessory breasts but they did not present the characters found in undoubted cases of this condition. These tumours may have a similar pathogeny to the masses seen on either side of the back of the neck of men and specially described by Sir Jonathan Hutchinson; on account of their possible paleogenetic significance I have included notes on these cases here.

(9) Lymphatism. Post-mortem examination on a boy 10 years of age who died after receiving a blow on the head revealed a thymus gland of considerable bulk, 4 inches long. The blow had not severed the soft tissues over the skull and in the absence of any other evidence of injury or disease one might suspect the case to be one of lymphatism, an inherent disorder which had predisposed to death. In a second case, that of a woman aged 40 years who died after moderately severe burns, a body 4 [fraction illegible] inches long, of yellow colour and firm consistency, was found lying on the anterior surface of the heart, the apex of this body being at a level with the 2nd costal cartilage.

(10) Coming now to Malformations, there is a well-defined deformation of the skull of which I have seen several examples, the main points of which are well shewn in the photographs. The extreme height of the cranium and marked dolichocephaly without bossing of the forehead, while the sides of the vault of the skull are flattened, are characteristic. The photographs depict a boy aged 7, son of Matikwiri, headman of Mlanje, whose two younger sisters are said to resemble him exactly in the deformity present (Plate IV, (12) and (13)).
The second case is a boy aged 15 years; the head measured 21·5 cm. long and 12·5 cm. broad (Plate III, (9)–(11)).

(11) Congenital Ptosis is not uncommon and is associated with the typical expression due to this disability.

A slight degree of Epicanthus may be fairly often observed; more marked, it is sometimes seen associated with obliquity of the palpebral fissures giving a regular mongolian character to the face (Fig. 3).

Fig. 3. Epicanthus.

Buphthalmos has been seen on two occasions in young adults with a history of its congenital nature but nothing else of note; tension normal and vision apparently good.

Microphthalmos was once seen associated with coloboma of the iris and choroid (see below).

Coloboma. This defect was met with in two brothers aged about 18 and 17 years, but neither parent nor, as far as I could ascertain, any other member of the family was similarly affected. Bwanali the elder presented a coloboma of the iris and choroid of the left eye; there was also a small opacity on the posterior surface of the lens, which however could not be traced more deeply but which suggested a remnant of an "arteria centralis." The right cornea shewed some superficial opacities, the iris appeared normal, but examination of the fundus revealed a large white triangular area with the apex near the disc, with here and there small masses of pigment. The middle portion of the white area was on a much deeper plane than the rest of the fundus, forming a posterior staphyloma, the whole composing a kind of posterior coloboma (Fig. 4).

Fig. 4. Bwanali. Coloboma: right eye and left eye.

This boy also had an accessory nipple.

The younger brother Pete presented on the right side a microphthalmic eye with coloboma of iris and choroid resembling the condition in his brother, with an opaque spot on the posterior surface of the lens. The eye is convergent and vision poor; he counts fingers at one yard.
The left eye is normal.

Dermoid Cysts of the Face have been seen in the situations shewn in the sketch (Fig. 5). One of these was excised and found to contain the usual pultaceous mass mixed with hairs. These hairs examined microscopically were found to be spindle shaped, tapering at each end; brown diffuse and granular pigment was present in them.

Fig. 5. Dermoid cysts of face.

A relic of the cleft between the median and upper external processes of the foetal face was on one occasion seen as a small pit at the lower extremity of and just external to an epicanthal fold.

(12) Congenital Naevus. Only two cases of naevus have been seen. One, a woman, presented a small naevus just to the left of the middle line on the forehead at the margin of the hairy scalp, 1 cm. in diameter. The second was a man with a similar growth 1 cm. in diameter on the lower lip just to the right of the middle line (Ching'waya of Zomba).

(13) Ear. The general conformation of the ear varies a good deal; some of the types are shewn in the sketches (Fig. 6) but all these must be considered as coming within the limits of normal variation.

Fig. 6.

In one case a kind of Accessory Lobule was noted; the subject was an albino. A number of persons with Accessory Auricles have been seen. These consist of little subcutaneous nodules of cartilage forming tubercles one to four in number situated just in front of the tragus, the affection being usually bilateral.

An abnormality seen affecting a woman in N. Nyasa consisted in the direct prolongation of the skin from the side of the head on to the outer surface of the pinna so that the upper margin of the ear was hidden, though easily felt beneath the skin.

Helical fistula. Under this name have been described the remains of the first branchial cleft found as little pits on the helix. The condition is certainly rare in England and persons exhibiting the anomaly are sometimes shewn as interesting cases at medical societies.
That heredity plays a part in its incidence is well known, as illustrated by a case shewn by Dr Prichard at the Royal Society of Medicine, an infant with symmetrical helical fistulae, whose mother, four siblings, maternal grandmother and two great-aunts all exhibited the same defect.

Having noted this same anomaly in quite a number of natives I became interested to ascertain the actual incidence. The statistics given below embody the results of my observations covering nearly 6500 individuals of all tribes. The populations of whole villages were taken so that consecutive unselected persons were dealt with.

Tribe                      Number   Right   Left   Both
[illegible]   Males           416       7      6      3
              Females         612      13      8      5
[illegible]   Males           100       -      4      1
              Females         136       -      3      -
[illegible]   Males          1941      34     22     12
              Females        2576      69     53     23
Wankonde*                     455       4      8      5
Awemba                         48       1      -      1
Anyanja                        65       1      1      -
Ahenga                        142       3      5      -
Totals                       6491     132    110     50

* These figures were kindly supplied by Dr Davey.

Thus among 6491 individuals of all ages and both sexes a total of 292 were found to have helical fistula (4·5 %). It was more commonly unilateral, affecting the right side a little more often than the left, giving percentages of 2·03 and 1·69 respectively, and for bilateral cases 0·77 %. Taking each sex we see that the proportions between the three numbers are almost the same.

              Number   Right   Left   Both
Males           2457      41     32     16
Females         3324      82     64     28

The actual incidence in the two sexes is however greater among females than males in the proportion of 5·2 % to 3·6 %. An abnormality occurring so frequently as 45 per mille might almost be considered to be a variation within the limits of the normal. The fact remains, however, that it is the persistence of a foetal character and abnormal, if the whole of mankind be taken into consideration.
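The percentages quoted above follow directly from the counts in the table. As a check on the arithmetic, here is a minimal Python sketch that recomputes them from the grand totals and from the male and female sub-totals. The tuple layout and the function name `incidence` are assumptions of this illustration and are not part of the original paper.

```python
# Counts transcribed from the helical fistula table.
# Assumed layout for this sketch: (number examined, right only, left only, both sides).
TOTAL = (6491, 132, 110, 50)
MALES = (2457, 41, 32, 16)      # sub-totals of the three tribes recorded by sex
FEMALES = (3324, 82, 64, 28)


def incidence(row):
    """Return percentages (right, left, both, any side) for one row of counts."""
    examined, right, left, both = row
    affected = right + left + both
    return tuple(round(100 * n / examined, 2) for n in (right, left, both, affected))


if __name__ == "__main__":
    for label, row in (("All", TOTAL), ("Males", MALES), ("Females", FEMALES)):
        r, l, b, a = incidence(row)
        print(f"{label:8s} right {r:.2f}%  left {l:.2f}%  both {b:.2f}%  any {a:.2f}%")
```

For the whole series this gives 2.03, 1.69 and 0.77 per cent. for right, left and bilateral cases, 4.50 per cent. in all, and 3.62 and 5.23 per cent. for males and females respectively, in agreement with the rounded figures in the text.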
Dealing more in detail with this defect, there is some variation in the exact site of the fistula; the sketches (Fig. 7) serve to illustrate the extremes of position in three directions.

Fig. 7.

Three cases presented two pits on the same side, one each in positions A and B. In these three cases the affection was bilateral and symmetrical. The common position at which the pit is found is in D. In another case not included in the series a pit was observed resembling those above mentioned but situated at the junction of the tragus and lobule as in E.

These helical fistulae, which I have described as little pits, consist of a small opening on the skin 1 or 2 mm. in diameter leading into a blind sac 1 or 2 mm. deep; often this sac opens out into a little ampulla which can be seen and felt under the skin. The ampulla and canal are generally filled with a little plug of sebaceous matter. In three cases the skin in this situation looked like scar tissue and presented a honey-combed appearance, there being several openings into the ampulla, giving the impression that an abscess had formed at some past date in the ampulla with consequent loss of tissue. The fistula is so common and so unremarkable that most tribes have no name for it and one cannot elicit long pedigrees to shew its incidence in families. Cases of heredity were common enough but the type was not necessarily the same in members of the same family; thus a mother with Left Fistula had a child with Right and Left, or again, three brothers were seen, two with the Left side affected, the third with Right Fistula.

No malformations in connection with other branchial clefts have been seen.

(14) Lips, Mouth and Palate. Most natives shew a well-marked tubercle in the median line on the "red" margin of the upper lip; in a few however this is replaced by a distinct groove which involves the red margin of the lip or only the subjacent fold of mucous membrane (see sketch Fig.
8 and photo, Plate V, (16)).
[Fig. 8.]
These cases resemble one of a Hindu (recorded in the Lancet, Oct. 2, 1909, by Thurston), who besides having the median hare-lip was the subject of polydactylism. In one of my cases there was a considerable gap between the upper central incisors but no further abnormalities were present.
In a single case notching of the upper lip was found to the left of the middle line with a mark running up to the nostril which looked like a scar. There was no question of any operation having been performed, though the condition resembled exactly an artificial repair of a lateral hare-lip (Fig. 9). A similar case has been shewn at the W. Lond. Med. Chir. Society in which there was, besides, a deformity of the nose and a family history of hare-lip.
[Fig. 9.]
I have only seen one case of ordinary Hare-Lip, a Blantyre boy aged 10 years (1909), the affection being left-sided and unassociated with any cleft of the palate (Plate V, (17)). Among 30,000 natives examined in the northern districts of this country no case was seen.
No case of typical Cleft Palate has come to my notice; on the other hand I have seen three cases which owing to their non-association with defects in the upper lip are of great interest. All three cases, one a boy aged 10 years (1906), the other two adult males, presented complete Absence of the Premaxilla and attached teeth. In the boy there was also a Median Perforation in the hard Palate.
Congenital perforations of the palate apart from clefts are apparently rare in Europe. Dundas Grant (Roy. Soc. Med. April 1910) has recorded the case of a girl aged 16 years with a perforation above and to the right of the base of the uvula with no history of trauma or syphilis. Prof. Karl Pearson has drawn my attention to a skull which was brought by Du Chaillu from Fernand Vas in the Congo (see Biometrika, Vol. VIII, Plate XXVI); this shews congenital
absence of the premaxilla, but the two maxillae have not approximated in the mid-line in front as in my own cases, and we do not know the condition of the soft parts, but it is interesting to see this anomaly from another part of Africa.
(15) Teeth. Native children are said to be born sometimes with teeth; it is possible that this is not very rare as there is a common superstition regarding them. I have seen one case with this history, to be mentioned later, as having deformities of the lower extremities.
A gap of as much as 4 of an inch between the lower central incisors has been noticed a number of times, the other teeth all being regular and touching one another. A similar condition may be seen also affecting the upper pair of incisors, one that I am not conversant with among Europeans.
Among 1500 natives examined for statistical purposes in regard to caries the following numerical abnormalities were noted:
(a) Complete reduplication of the set of teeth in an adult, the second set lying on the palatal side of what appeared to be the normal set. I have every reason to believe that this was a case of true reduplication, that is to say, the result of growth from doubled enamel organs and not of retention of the deciduous teeth.
(b) Reduplication of upper incisors.
(c) Reduplication of right lower bicuspid.
(d) Reduplication of both bicuspids in the lower jaw on each side and in the upper jaw on the right side in a woman aged 24 years.
A single case of a Bifid Extremity to the Tongue was seen in an albino child.
(16) Polymazia and Polythelia. 14 cases of these anomalies have been met with casually, so that I imagine this anomaly by excess is comparatively not uncommon. Short notes of these cases are given below for purposes of comparison:
(a) Male adult, accessory nipple springing from the skin at the right sternal edge opposite the 3rd intercostal space; it was large and well formed like a woman’s but there was nothing resembling an accessory mamma beneath it.
(b) Male aged 45. Insane and suffering from spinal caries. There was a rudimentary accessory nipple in Scarpa’s triangle on the right side 14” below Poupart’s ligament.
(c) Adult female, an accessory nipple on the right breast, small but well formed and lying above the one proper to the breast; both are patent and milk can be drawn through both.
(d) Adult male, the accessory nipple is situated in a line with the left nipple below it and half-way between it and the costal margin.
(e) and (f) Two women each had two nipples to the right breast.
(g) A young woman was found to have two nipples on the left breast (Plate V, (18)).
(h) Male with congenital coloboma iridis mentioned above has an accessory nipple just above and to the inner side of the right nipple.
(i) Young male adult has just at the outer edge of the areola of the left breast a very small accessory nipple, and beyond this and above it over the third intercostal space another flat nipple with areola and hairs.
(j) Male, presents a rudimentary nipple in the left groin just below the middle of Poupart’s ligament.
(k) Young adult male shews a small accessory nipple just below and internal to the right nipple; his brother, father, and grandfather are all possessed of the same identical anomaly. The subject has no children, no nephews or nieces.
(l) Female in hospital with syphilis has a small accessory nipple springing from the skin of the chest wall just internal to the point of the left pendant breast.
(m) A woman with well-formed accessory breast in the right axilla. It is breast-shaped and pendant though there is no nipple. The woman volunteered the fact that it was a breast and said it swelled with pregnancy. The right breast was twice as big as the left.
(n) An old woman with symmetrical masses in each axilla resembling rather the symmetrical lipomata mentioned elsewhere: see p. 6.
She states that they appeared at puberty and thinks them to be breasts but denies that they enlarged with pregnancy.
In the Japanese this condition has been shewn to be not unrare, and among them tuberculosis has been found to be more frequent than among the normal population. I can only support the idea with one case (No. b).
(17) Meningocoele and Spina Bifida. No typical case has been noted. A man was seen with a little dimple of the skin over the lower part of the sacrum in the median line having a little fold of skin on either side forming two small vertical lips.
(18) Penis, Testicle; Hernia. Epispadias, hypospadias and extroversion of the bladder have never been seen. I have seen a boy aged 18 years with a short penis enclosed in a fold of skin from the upper surface of the scrotum (Fig. 10). The boy had other deformities which are described later.
[Fig. 10. Boy aged 18.]
When examining a number of recruits I was surprised to find in a large proportion the right testicle hanging lower than the left, the reverse of what is known to occur among Europeans. On examination of 400 consecutive men, adults, between the ages of 30 and 40 years, I found in 166 or 41.5 % the right testicle lower than the left. In the remainder or 58.5 % the right testicle was on a level with the left, or rather higher in the scrotum. I also got the impression that, associated with right lower testicles, the testicles and penis were large. In another series of 280 men, the left was lower than or on the same level as the right in 185; the right lower in 88. There were two cases of left cryptorchidism, one of right cryptorchidism; one each left and right hydrocoeles and two right inguinal bubonocoeles.
I have come across a number of cases of undescended testis among other natives, in some associated with a swelling in the inguinal canal, in others there was complete cryptorchidism.
Inguinal hernia is not infrequent in adult males but I can give no figures relating to a large number of persons. In a single man it was associated with umbilical hernia. I have never seen a femoral hernia.
Umbilical hernia is common enough especially in children. The following figures though small in number give some idea of the frequent incidence of the condition. They refer to all the children in a single village and may therefore be said to be unselected in any way.

Age              [heading illegible]   [heading illegible]   [heading illegible]
0-1 year                 18                    13                     6
1-2 years                44                    27                    12
3-10 years              102                    44                    10
Totals                  164                    84                    28        = 276
giving roughly           60 %                  30 %                  10 %
                                               (the last two together 40 %)

That the hernia diminishes in size even to disappearance after childhood, as indicated by these figures, is certainly true, as the same incidence is undoubtedly not found among adults. The protrusion is sometimes very marked and takes the shape of the finger of a glove, some several inches long and curving downwards (Plate IV, (14)).
Writing recently, E. M. Corner, in doubting the commonness of congenital sacs in hernia in general, as insisted on by some writers, has shewn in a series extending to between two and three thousand observations that herniae in children are often multiple and associated particularly with a ventral hernia, a diastema, which though very rare at birth is common in young children and of the nature of a true hernia. He believes that this ventral protrusion, which is certainly not congenital, is caused by increased abdominal pressure due to gaseous distension of the bowels, the result of fermentative processes, and that other herniae are due to the same cause. Among native children abdominal distension is almost the rule; “pot-bellied” is an expression always used in speaking of them. This distension is due largely, I believe, to fermentative processes, and also a second factor, absent in European children, namely enlarged spleen.
Of 50 children under the age of 5 years taken from among those with umbilical hernia, 43 or 86 % were found to have the ventral protrusion as described by Corner. 18 of these had enlarged spleens and 20 shewed a considerable abdominal distension. In none was any other hernia found. In these cases we see par excellence the effect of intra-abdominal pressure, in producing first ventral hernia and secondly umbilical hernia.
The weakness of the umbilical scar is due, I have little doubt, to the method of treating the cord at birth. The custom prevailing among many is to bind the whole cord and placenta on to the child’s abdomen till it separates; with others the greater part of the cord is so treated after severing the placenta; in any case there must be considerable tension, I think, at the umbilicus and sepsis is more likely to occur. Cursham Corner has said that the size of the bulging is proportional to the length of cord left proximal to the ligature, and the same principle adapted to natives who use no ligature may be true, and thus account for the very “long” umbilical hernias.
I am therefore inclined to agree with Corner that the umbilical and the ventral herniae of children are due largely to intra-abdominal pressure, but though my numbers are small, the absence of any other hernia among my cases must be taken to mean that for their production there is another factor to be taken into account, and that is, I believe, in Corner’s cases some congenital structural anomaly, namely a congenital sac, and, conversely, I think congenital sacs are uncommon among natives of this country.
(19) Malformations of the Extremities. Various forms of Congenital Talipes are met with which call for no special comment.
A peculiar condition characterised by symmetrical shortening of the humeri has been observed and forms the subject of a paper by Dr S. A. Kinnier Wilson and myself referred to above; certain deformities of the hands and feet are also therein dealt with.
Since this paper was written I have seen three other cases of Congenital Humeral Micromely, one of which I mention here as there is a family history of the defect, a point of some interest and one which I had not elicited in previous cases.
Gobedi, male, aged 22 years, a Yao employed as a machila carrier in Zomba, exhibits the deformity in typical form well represented in the photograph (Plate V, (19)).
The head of each humerus appears to be poorly developed and though movement at the shoulder joint is free, a certain amount of fine crepitus is elicited, such as was found in several of the other cases. The point of interest however is the fact that the maternal aunt is stated to have had the same congenital anomaly. The subject has no brothers or sisters and his own two young children are stated to be normal; his mother and father and more remote relations are not known to be affected.
Besides these the following cases deserve mention.
A boy was seen, 18 years of age, with a peculiar deformation of the hands, stated to be congenital; the fingers and thumb shewed considerable thickening about the 1st interphalangeal joints with marked ulnar deflection; the bridge of the nose was depressed, the lips very thick, and epicanthus present. There was also the penile deformity above mentioned, except for which I should have doubted the statement in regard to the congenital nature of the hand deformity (Fig. 11).
[Fig. 11. Boy aged 18.]
I saw at Bandawe a female infant aged 14 years presenting multiple deformities. The astragalus of the left foot was apparently implanted in a cup-shaped depression on the lower end of a very much shortened thigh. The femur of this leg was short but around it there was an abnormal amount of muscle as if the usual amount of muscle for a normal had been cramped up into the shortened limb; the foot could be freely moved by the child.
The left foot had only a hallux and two toes with a partial cleft between the hallux and the adjacent toe, but I think four metatarsal bones. The right thigh was also somewhat shortened but the bones of the leg apparently both present; the knee-joint could not be distinctly made out and was flail. Right talipes equinovarus present, also right internal strabismus. No history of similar deformity in family; a brother a year older was born with two upper incisors. Father and mother normal. The father has two other wives with six and ten children respectively, all normal.
Such gross congenital deformities are from time to time recorded in Europe; thus Lockhart Mummery described a case of congenital absence of the femur in a male child, etc., in the Brit. Med. Jour. for November 5, 1910.
In a male 35 years of age I found Congenital Absence of the Right Fibula, the tibia being bowed forward with 8 inches shortening of the limb; the foot on the same side had only three metatarsal bones and three digits including the hallux.
A woman was seen with congenital shortening of one leg to the extent of four inches.
A single case of unilateral Congenital Dislocation of the Hip has been met with.
(20) Split Hand and Split Foot Deformities. The photograph, Plate IV, (15), serves to shew moderately well the deformities met with in a male child aged 5 years (1905): in the absence of a skiagram it is impossible to go into the detail of the bony conditions present. There was no admitted history of similar or other deformity in the family.
A second case, Ndala of Njalusi’s Mangoche, shewed a similar deformity of the left hand but in a less degree; he was otherwise normal and stated that no other members of his family were similarly affected (Plate VI, (20)).
These cases are interesting to compare with those collected and classified by Lewis and Embleton in Biometrika, Vol. VI, 1908.
(21) Shortening of the Fourth Metatarsal Bone.
When first I entered the country my attention was attracted by a number of natives who presented a shortening of the fourth toe. Since then Captain Hughes has noticed the condition in Egypt. The description he gives is as follows (Lancet, July 16, 1910): “The fourth toe is markedly retracted usually behind the level of the fifth toe. The phalanges are not apparently abnormally short, and the metatarsal bone can be felt unfractured but with the head very much farther back than usual. Commonly the digit is pushed upwards by the pressure inwards of the fifth toe. The condition is sometimes unilateral sometimes bilateral.” He adds that in one case the second metatarsal and, in another, the third metatarsal were also shortened. In a single case he saw a similar condition in the hand, shortening of the second and fifth metacarpals.
The above description corresponds exactly with the condition seen in this country.
I have also seen other toes than the fourth affected, and I shew a photograph of a man’s feet with involvement of the metatarsal of the hallux; in another case the fifth was affected; in another case, a woman, the common variety was associated with shortening of the third metatarsal of the left foot (see Plate VI, (22), (24) and Fig. 12).
[Fig. 12. Fig. 13.]
(22) Syndactyly of various degrees has been observed; sketches of two examples are given in Fig. 13.
(23) Polydactyly is not at all uncommon. I have casually come across some dozen cases in five years.
In the majority the supernumerary digit consists of a miniature phalanx attached to the skin of the hand or foot at the level of the head of the fifth metacarpal or metatarsal bone. Such digits are often removed in childhood, leaving a small cartilaginous nodule at the seat of removal.
Most commonly it is a symmetrical affection of both hands and feet; in other cases hands or feet alone (Plate VI, (23)), or one extremity only, present the deformity.
In some the accessory digit is well formed and an accessory metatarsal or metacarpal bone more or less complete is present. In one case it was the hallux which was reduplicated, the two digits being partially fused. In another, reported to me, the supernumerary digit in each hand was situated on the radial side of the first finger with probably an accessory metacarpal bone in connection with it. The feet had extra digits beyond the fifth toes.
(24) The following case is of some interest:
Chibisa, male, an Angoni of Kawenga’s, aged 30 years. The deformities involve all segments of the right upper and lower limbs and to a minor extent the left limbs (Plate VI, (21) and Figs. 15-18). On the right side there is shortening of the humerus and forearm (10 cm. difference between the two sides), but elongation of all the segments of the middle finger and its metacarpal bone; the middle finger itself measures 10} cm. The metacarpal bones and phalanges of the other fingers are, I think, absolutely a little shortened. The left arm and hand are normal, except that this hand as also the right hand shew a little nodule at the base of the little finger where a supernumerary digit was removed.
[Fig. 15. Fig. 16. Fig. 17. Fig. 18.]
The right foot presents a similar condition to the right hand, elongation of phalanges and metatarsal affecting the second toe, the toe itself being 7 cm. long. The tibia is somewhat bowed outwards. The left foot presents shortening of the metatarsal bone of the hallux.
The photograph and sketches illustrate some of these points. Other measurements were as follows:
Height 166.5 cm.; span of arms 164.5 cm.;
Maximum fronto-occipital 18.0; maximum biparietal 13.8;
Nose length and width 4.4.
(25) Congenital Anomalies of the Kidney. Post-mortem examination on a native prisoner who died of pellagra revealed the presence of a double kidney on the left side and none on the right. From the sketches (Fig. 19) it will be seen that the upper part was the one proper to the side while the lower half was the abnormal portion.
[Fig. 19. Churinigu. Kidney of left side double.]
The two parts were really very distinct, partly separated by a groove and cleft. The lower viscus had been felt during life as a tumour in the abdomen of unknown nature as it lay along the left side of the vertebral column. The kidney was unfortunately removed before dissection of vessels, etc. was made, but the sketch shews the arrangement of these at the hilum of the kidney.
The two ureters united below the lower pole of the double organ, the distal ureter being nearly twice the normal size.
The bladder was normal; there was no right ureter. The suprarenal body of the right side was in its normal position and appeared normal. No other abnormalities were remarked.
(26) Some suggestive observations have been made by Dr Ewald Stier, published in the Deutsche Zeitschrift für Nervenheilkunde (Band xiv, Heft 1-2, S. 21), from which the generalisation is made that in all anomalies of overgrowth the right side of the body is much more frequently involved than the left, whereas in anomalies of undergrowth the left is more commonly the site of the condition than the right, this distribution being the result of a preponderance of persons with a leading or superior left cerebral hemisphere, as with left-handed persons the converse was found to be true. In other words, the plus anomalies occur on the right side in right-handed people and the minus anomalies on the left side, the left hemisphere being the superior hemisphere, the converse being true.
I have therefore tabulated my observations, and though small in number they tend to confirm the idea, assuming that the African native is right-handed. This remains unproved and a less marked superiority of the left hemisphere may account for non-conformity of my few cases to Stier’s rule.

                                   Right   Bilateral   Left
Plus Anomalies:
  Reduplication of teeth             2         3         0
  Polymazia                          1         1         0
  Polythelia                        10         0         5
  Polydactyly                        1         1         2
Minus Anomalies:
  Hare-lip                           0         0         2
  Cryptorchidism                     2         0         2
  Absence of Fibula                  1         0         0
  Absence of Fibula and Tibia        0         0         1
  Split hand, foot                   0         1         1
  Shortened metacarpal, tarsal       0         1         3
  Syndactyly                         1         0         0
  Coloboma iris                      1         1         0
Plus and Minus together:
  Chibisa                          1 (+)       -       1 (-)

In considering these cases it should be remembered that the majority of my observations have been made casually among natives met in the bush, in villages, etc., others in the course of routine work among troops, prisoners, etc.; a few were the result of special investigation.
(27) Concluding Remarks. The notes of cases which I have thus collected together form rather a medley of facts but I think certain deductions may be made from them.
It would appear that
(1) The slighter the anomaly the greater the frequency with which it may be observed.
(2) The more marked degrees of deformity are only seen in children and those in places where European influence is felt.
(3) Cases of heredity are only seen among the lesser anomalies.
(4) The least obvious congenital anomaly is a helical fistula, and this is found in 4.5 % of the population and is frequently inherited.
The difference in the observed incidence between the minor anomalies and those of more marked proportions may be real or only apparent. I think the latter supposition is true for reasons which can be deduced from the facts given above.
It is the custom among all the tribes of this country to destroy all deformed children at birth.
Any minimal deformity such as a helical fistula is of course unrecognised, an accessory nipple is probably hardly noticeable, accessory digits which can be removed by a nick with a knife are matters of no import, while a foot with six well-formed toes would hardly be considered worthy of note. These abnormalities are therefore comparatively common, but hare-lip, cleft-palate, deformities common enough in Europe, are among the rarest in this country; a child with a hare-lip would be seen to resemble a hare and would be immediately destroyed. Children with the greater deformities would certainly be destroyed. In recent years under European influence native customs fall into abeyance and so we see my single case of hare-lip in a boy aged 10 at Blantyre, a township of 25 years standing, a child with gross deformities of the lower extremities born practically on a mission station; or, to quote another example, an albino reported by myself was the fifth albino child born, the first four having been killed at birth by order of a chief, who in later years came under the influence of an up-country mission station, for which the living albino has to thank his survival. The gross abnormality of absence of premaxilla would pass unnoticed as the deformity is slight. History relates that in the case of the child with lobster claw deformity of hands and feet, it was only saved from a summary death by the efforts of the mother.
I think with the evidence as it stands one may with fairness say that congenital anomalies are common among the natives of this country.
Secondly, I think one may also deduce from the facts stated that abnormalities of all kinds are at least not uncommon. In the few cases in which I have adduced statistics there can be no doubt; in other cases it is rather a matter of one’s impression.
I have shewn that certain congenital anomalies among natives of Nyasaland are common and have attempted to argue that probably many of them are common.
(28) Very few statistics are available for comparison, but I should like to refer to some by writers in Egypt. Prof. Madden cites in a letter to the Lancet a case of cleft-palate which he operated on as the first in 11 years during surgical work at the Kasr-el-ainy Hospital, and assigns as the cause of the lack of such cases the “truly awful struggle for existence” which would eliminate infants so handicapped. The Lancet remarked (Lancet, July 3, 1909), in an annotation upon this letter, that Prof. Elliot Smith considers it to be impossible to endeavour to explain this rarity of congenital defects in Egypt, unless the time-honoured scapegoat of our too modern civilisation be invoked to account for their frequency in other countries.
Statistics of the Kasr-el-ainy Hospital compiled by Dr Day are quoted: in 1907, among 2630 total surgical admissions the only congenital deformities were 5 hare-lips, 2 talipes, 2 imperforate anus, 1 extroversion of bladder; in 1908, 2702 admissions, 3 hare-lips, 2 imperforate anus, 1 hypospadias, 1 undescended testicle, 1 meningo-encephalocoele.
Capt. G. W. G. Hughes, R.A.M.C., in a paper to the Lancet, July 16, 1910, referring to this annotation, remarks “Readers will be interested to hear that our too modern civilisation is innocent of this slur,” and goes on to shew that many congenital defects are by no means uncommon. Dealing with males between the ages of 14 and 21 years he gives the following figures:
Hare-lip in 0.041 %.
Cleft-palate 0.016 %.
Polydactylism 0.058 % and 0.04 % in two series.
Shortened metatarsal 0.37 % and 0.23 %.
Other deformities of fingers and toes 0.22 %.
Talipes 0.16 %.
Among the thousands of ancient Egyptian bodies which Prof. G. Elliot Smith has unearthed and examined, a single case of cleft-palate was met with, a female of 20 years of age with a skull of negroid type, of between the 4th and 6th century B.C.; only one case of talipes (T. equinovarus) was recorded.
It is obvious that in Egypt surgical treatment is not sought in cases of cleft-palate and rarely for other congenital defects but many of them are common enough. May the rarity of defects among the ancient peoples of Egypt be due to the same cause that acts in Nyasaland to-day? Were the children affected with deformities killed at birth and “thrown onto the dust-heap” where their remains were soon lost trace of?
Of chief interest to me are the figures published by Captain Hughes. He shews that a shortening of the 4th metatarsal bone occurs in percentages rising to 0.37 of males examined. This defect is peculiarly common in this country. Again, polydactylism occurs in 0.05 % and other deformities of fingers and toes in 0.22 % of Egyptians, both deformities very frequently met with by myself in Nyasaland.
He however does not mention polythelia and polymazia nor helical fistula. I should be very interested to learn if this last insignificant anomaly was looked for.
There is no doubt that one of them, helical fistula, occurs with a frequency in Nyasaland which cannot be rivalled by any other among peoples of any race. I think one may also say with certainty that the incidence of others (shortened 4th metatarsal and polydactylism) in this country is far in excess of that among Europeans, though probably much about the same as in Egypt.
Upon what hypothesis can these facts be explained? Is there a single cause or are there many at work? These are questions which I shall not attempt to enter into, but by simply recording my observations I shall hope to stimulate others to do the same, for only by accumulating facts can it be hoped that such problems will ever be solved.

Biometrika, Vol. X, Part I. Plate I
(1), (2) Samuti, an Ateliotic Dwarf. Subgiant, Height 1.92 metres, with Wife and Albinotic Child. Etimu, aged 25, an Achondroplasic Dwarf.
Biometrika, Vol.
X, Part I. Plate II
(5), (6) Masimosya, aged 19, Gynaecomastos, with other features which were formerly described as those of Partial Hermaphroditism.
(8) Microcephalic Infant.
[Case of Hydrocele testis included by an oversight of Dr Stannus in the photographs, and engraved in consequence. Discovered too late to rearrange plates.]
Plate III
(10) Boy, aged 15 years, showing Scaphocephaly.
Plate IV
(13) Son of Matikwiri, aged 7, a case of Scaphocephaly.
(14) Cases of Umbilical Hernia.
(15) Split Hand and Foot in a child, aged 5.
Plate V
(16) Case showing faint median depression of upper lip.
(17) Blantyre boy, aged 10, with Hare-lip.
(18) Young woman with two nipples on left breast.
(19) Gobedi, aged 32, with congenital Humeral Micromely.
Plate VI
(20) Ndala, Split Hand, left only.
(21) Chibisa, aged 30, elongation of all segments of middle finger, and its metacarpal bone.
(22) Shortening of the fourth metatarsal bone.
(24) Shortening of the left great toe.

TABLES OF POISSON’S EXPONENTIAL BINOMIAL LIMIT.
By H. E. SOPER, M.A.
In his treatise, Recherches sur la Probabilité des Jugements, Paris, 1837, Poisson* shows that the series of frequencies
\[ p^{n} + n\,p^{n-1}q + \frac{n(n-1)}{2!}\,p^{n-2}q^{2} + \dots + \frac{n!}{r!\,(n-r)!}\,p^{n-r}q^{r} + \dots \]
given by the expanded terms of the binomial $(p+q)^{n}$ becomes in the limit, when $q$ is diminished, and $n$ increased, indefinitely, but so that $nq$ remains finite and equal to $m$, the exponential series
\[ e^{-m}\left(1 + m + \frac{m^{2}}{2!} + \dots + \frac{m^{r}}{r!} + \dots\right); \]
and he points out that the terms of this series will give the proportional frequencies of the occurrences 0, 1, 2, ... r, ... times, in any sample, of an event, every occurrence of which is equally likely in the sample and independent of the other occurrences, and which is of such frequency that $m$ events occur in the sample on an average.
The series is arrived at by “Student†,” when considering the theoretical frequencies in sample drops of a liquid of minute corpuscles supposed distributed at random throughout the mass of the liquid.
The event may also occur in time, each occurrence being supposed to take place with equal probability in any finite period taken as the sample, and to act independently of the occurrences of all the other events. A physical example, which appears by the closeness of the observed to the theoretical frequencies to satisfy these conditions, is the number of α-particles discharged per 1/8-minute or 1/4-minute interval from a film of polonium*.
* pp. 205 et seq.
† Biometrika, Vol. V. p. 351, “On the Error of counting with a Haemacytometer.”
In vital statistics the sample may be an individual or house or community and the event an accident or disease and so on. But it must be borne in mind that for such series as the above to be applicable the occurrence of one event in the sample must not preclude or influence in any way the occurrence of a second.
The probability of $x$ occurrences, $m$ being the mean number, in a sample, is
\[ e^{-m}\,m^{x}/x! \]
and in the tables which follow this is evaluated for $m = 0.1, 0.2, \dots$ to $15.0$ and for $x = 0, 1, 2, \dots$ up to such an integer as gives a figure in the sixth place of decimals, the number of places tabulated.
The terms of the series were calculated, each by a fractional operation upon the preceding, beginning with the modal term and going both forward and back. Thus if $m = 7.6$ the term $e^{-7.6} \times (7.6)^{7}/7!$
was first calculated by tables of logarithms, and the succeeding terms were then obtained seriatim by the operations $\times \frac{7.6}{8}$, $\times \frac{7.6}{9}$, $\times \frac{7.6}{10}$, etc., and the preceding ones by the operations $\times \frac{7}{7.6}$, $\times \frac{6}{7.6}$, $\times \frac{5}{7.6}$, etc., done with a mechanical calculator, first a multiplication and then a division.

Seven places of decimals were thus calculated and the series is checked by the total, which differs from unity by the remainder (a figure in the eighth or later place of decimals in all the present cases) and the algebraical sum of the errors of seventh figure approximations.

Poisson's exponential series has been previously calculated to four places of decimals by L. von Bortkewitsch† for values of m from 0.1 to 10.0.

The present tables give the probability of each number of times of occurrence of the event. For the sums of these values, that is, the probability of occurrence of the event, a given number of times or greater, or a given number of times or less, reference must be made to a second paper in this issue of Biometrika‡, where such probabilities are calculated for integral values of m from 1 to 30.

* See Rutherford and Geiger: "The Probability Variations in the Distribution of α-Particles," Philosophical Magazine, Vol. xx. p. 700, 1910. See also E. C. Snow, "Note on the Probability Variations, &c.," Vol. xxi. p. 198, 1911, who finds the variance of experiment from theory to be such as would occur once in six experiments and once in three experiments respectively of the limited time taken, were theory exact. In a note to the first paper H. Bateman gives a proof of the exponential series of probabilities arrived at from considerations of this problem.
† Das Gesetz der kleinen Zahlen, 1898. A comparison of the table printed therein with the present table shows agreement except as to the fourth figure; the nearest fourth figure is not given, in rather many instances, in the tables of Bortkewitsch.
‡ Lucy Whitaker, B.Sc. "On the Poisson Law of Small Numbers," Vol. x. p.
37 et seq.

TABLE of $e^{-m} m^x / x!$: General Term of Poisson's Exponential Expansion ("Law of Small Numbers").

x \ m | 0.1 | 0.2 | 0.3 | 0.4 | 0.5 | 0.6 | 0.7 | 0.8 | 0.9 | 1.0
0 | .904837 | .818731 | .740818 | .670320 | .606531 | .548812 | .496585 | .449329 | .406570 | .367879
1 | .090484 | .163746 | .222245 | .268128 | .303265 | .329287 | .347610 | .359463 | .365913 | .367879
2 | .004524 | .016375 | .033337 | .053626 | .075816 | .098786 | .121663 | .143785 | .164661 | .183940
3 | .000151 | .001092 | .003334 | .007150 | .012636 | .019757 | .028388 | .038343 | .049398 | .061313
4 | .000004 | .000055 | .000250 | .000715 | .001580 | .002964 | .004968 | .007669 | .011115 | .015328
5 | - | .000002 | .000015 | .000057 | .000158 | .000356 | .000696 | .001227 | .002001 | .003066
6 | - | - | .000001 | .000004 | .000013 | .000036 | .000081 | .000164 | .000300 | .000511
7 | - | - | - | - | .000001 | .000003 | .000008 | .000019 | .000039 | .000073
8 | - | - | - | - | - | - | .000001 | .000002 | .000004 | .000009
9 | - | - | - | - | - | - | - | - | - | .000001

x \ m | 1.1 | 1.2 | 1.3 | 1.4 | 1.5 | 1.6 | 1.7 | 1.8 | 1.9 | 2.0
0 | .332871 | .301194 | .272532 | .246597 | .223130 | .201897 | .182684 | .165299 | .149569 | .135335
1 | .366158 | .361433 | .354291 | .345236 | .334695 | .323034 | .310562 | .297538 | .284180 | .270671
2 | .201387 | .216860 | .230289 | .241665 | .251021 | .258428 | .263978 | .267784 | .269971 | .270671
3 | .073842 | .086744 | .099792 | .112777 | .125510 | .137828 | .149587 | .160671 | .170982 | .180447
4 | .020307 | .026023 | .032432 | .039472 | .047067 | .055131 | .063575 | .072302 | .081216 | .090224
5 | .004467 | .006246 | .008432 | .011052 | .014120 | .017642 | .021615 | .026029 | .030862 | .036089
6 | .000819 | .001249 | .001827 | .002579 | .003530 | .004705 | .006124 | .007809 | .009773 | .012030
7 | .000129 | .000214 | .000339 | .000516 | .000756 | .001075 | .001487 | .002008 | .002653 | .003437
8 | .000018 | .000032 | .000055 | .000090 | .000142 | .000215 | .000316 | .000452 | .000630 | .000859
9 | .000002 | .000004 | .000008 | .000014 | .000024 | .000038 | .000060 | .000090 | .000133 | .000191
10 | - | .000001 | .000001 | .000002 | .000004 | .000006 | .000010 | .000016 | .000025 | .000038
11 | - | - | - | - | - | .000001 | .000002 | .000003 | .000004 | .000007
12 | - | - | - | - | - | - | - | - | .000001 | .000001

x \ m | 2.1 | 2.2 | 2.3 | 2.4 | 2.5 | 2.6 | 2.7 | 2.8 | 2.9 | 3.0
0 | .122456 | .110803 | .100259 | .090718 | .082085 | .074274 | .067206 | .060810 | .055023 | .049787
1 | .257159 | .243767 | .230595 | .217723 | .205212 | .193111 | .181455 | .170268 | .159567 | .149361
2 | .270016 | .268144 | .265185 | .261268 | .256516 | .251045 | .244964 | .238375 | .231373 | .224042
3 | .189012 | .196639 | .203308 | .209014 | .213763 | .217572 | .220468 | .222484 | .223660 | .224042
4 | .099231 | .108151 | .116902 | .125409 | .133602 | .141422 | .148816 | .155739 | .162154 | .168031
5 | .041677 | .047587 | .053775 | .060196 | .066801 | .073539 | .080360 | .087214 | .094049 | .100819
6 | .014587 | .017448 | .020614 | .024078 | .027834 | .031867 | .036162 | .040700 | .045457 | .050409
7 | .004376 | .005484 | .006773 | .008255 | .009941 | .011836 | .013948 | .016280 | .018832 | .021604
8 | .001149 | .001508 | .001947 | .002477 | .003106 | .003847 | .004708 | .005698 | .006827 | .008102
9 | .000268 | .000369 | .000498 | .000660 | .000863 | .001111 | .001412 | .001773 | .002200 | .002701
10 | .000056 | .000081 | .000114 | .000158 | .000216 | .000289 | .000381 | .000496 | .000638 | .000810
11 | .000011 | .000016 | .000024 | .000035 | .000049 | .000068 | .000094 | .000126 | .000168 | .000221
12 | .000002 | .000003 | .000005 | .000007 | .000010 | .000015 | .000021 | .000029 | .000041 | .000055
13 | - | .000001 | .000001 | .000001 | .000002 | .000003 | .000004 | .000006 | .000009 | .000013
14 | - | - | - | - | - | .000001 | .000001 | .000001 | .000002 | .000003
15 | - | - | - | - | - | - | - | - | - | .000001
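The method of calculation described in the text (the modal term found by logarithms, each succeeding term obtained by multiplying by m and dividing by x + 1, each preceding term by multiplying by x and dividing by m, and the total checked against unity) may be illustrated by a short sketch. It is offered only as an illustration in Python and is not the author's own working: the function name `poisson_terms`, the use of `math.lgamma` in place of printed logarithm tables, and the rule of stopping once a term falls below half a unit in the sixth decimal place are all assumptions made for the example.

```python
import math


def poisson_terms(m, places=6):
    """Return {x: e^-m m^x / x!}, built outward from the modal term.

    Hypothetical helper for illustration; terms smaller than half a
    unit in the last tabulated decimal place are left out.
    """
    cutoff = 0.5 * 10 ** (-places)
    mode = int(math.floor(m))  # a modal value of x (x = 0 when m < 1)

    # Modal term via logarithms: log P = -m + x log m - log x!
    log_modal = -m + mode * math.log(m) - math.lgamma(mode + 1)
    terms = {mode: math.exp(log_modal)}

    # Forward: P(x + 1) = P(x) * m / (x + 1)
    x, t = mode, terms[mode]
    while True:
        t = t * m / (x + 1)
        x += 1
        if t < cutoff:
            break
        terms[x] = t

    # Backward: P(x - 1) = P(x) * x / m
    x, t = mode, terms[mode]
    while x > 0:
        t = t * x / m
        x -= 1
        if t < cutoff:
            break
        terms[x] = t

    return dict(sorted(terms.items()))


if __name__ == "__main__":
    column = poisson_terms(7.6)
    for x, p in column.items():
        print(x, f"{p:.6f}")
    # The total should differ from unity only by the omitted remainder.
    print("total", f"{sum(column.values()):.7f}")
```

Working outward from the modal term keeps every step a single multiplication and a single division, which is presumably why the procedure suited a mechanical calculator; the comparison of the total with unity serves as the check on the whole column.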
| 11 | *000287 | -000368 | *000467 | 000587 | -000730 | -00090) | °001102 | -001337 | -001610 | 001925 12 | ‘000074 | 000098 | °000128 | 000166 | -000213 | -090270 | 000340 | -000423 | -000523 -000642 3% | ‘000018 | -000024 | *000033 | -000043 , -000057 | -000075 | ‘000097 | 000124 | -000157 | -000197 14 | -000004 | -000006 | “000008 | -000011 000014 | 000019 | -000026 | -000034 | “000044 | -000056 | 15 | 000001 | *O00001 | “000002 | -000002 | -000093 | -000005 | 000006 | *CO0009 | -000011 | -000015 MM v ie eer ae — as 81 32 3-8 ow) ass 36 Pil 3:8 39 40 0 | 045049 | -040762 | -036883 | -033373 | -030197 | 027324 | -024724 | 022371 | 020242 | -018316| 0 1 | *139653 | -130439 | 121714 | -113469 | -105691 | -098365 | -091477 | -085009 | -078943 | -073263| 1 2 | 216461 | -208702 | *200829 | -192898 | -184959 | -177058 | -169233 | “161517 | 153940 | -146525| 2 3 | 223677 | -222616 | -220912 | -218617 | -215785 | -212469 | 208720 | -204588 | -200122 | -195367| 3 4 | 173350 | -178093 | 182252 | -185825 | -188812 | -191222 | -193066 | -194359 | -195119 | -195367| 4 5 | ‘107477 | -113979 | *120286 | -126361 | -132169 | -137680 | 142869 | -147713 | 152193 | 156293) 5 6 | 055530 -060789 | 066158 | 071604 | -077098 | -082608 | -088102 | 093551 | -098925 | 104196 6 7 | 024592 | -027789 | -031189 | -034779 | 038549 | -042484 | 046568 | -050785 | -055115 | -059540| 7 8 | 009529 | 011116 | 012865 | -014781 | -016865 | -019118 | -021538 | -024123 | -026869 029770! 
8 9 | 003282 | -003952 | °004717 | -005584 | -006559 | -007647 | -008854 | 010185 | -011643 | -013231| 9 10 | -001018 | -001265 | ‘001557 | -001899 | -002296 | -002753 | 003276 | 003870 | 004541 | 005292 | 10 1 2Q me Soe s 5 16¢)| = : — 000001 -000001 | -000001 | -000001 | -000002 , -000003 | -000004 | 16 Ly =) ieee ee or ee = a “000001 | 000001 | 17 al at | ‘49 43 Fy NS 46 Te 48 yo G0 ae 0 | 016573 | -014996 | °013569 | -012277 | -011109 | -010052 | ‘009095 | :008280 , -007447 | :006738 1 | 067948 | -062981 | 058345 | °054020 | 049990 | -046238 042748 | °039503 | -036488 | -033690 2 | *139293 | +132261 | 125441 | +118845 | -112479 | 106348 | -100457 | ‘094807 | -089396 | -084224 3 | 190368 | +185165 | 179799 *174305 | -168718 | -163068 | °157383 | °151691 | -146014 | -140374 4 | 195127 | 194424 | -193284 | 191736 | :189808 | -187528 | 184925 | +182029 | *178867 | °175467 5 | 160004 | +163316 | °166224 | +168728 | 170827 | °172525 | -173830 | °174748 | -175290 | 175467 G | *109336 | -114321 | +119127 | 7123734 | -128120 | -132270 | 136167 | 1139798 | 143153 | 146223 7 | (064040 | -068593 | ‘073178 | ‘077775 | 082363 | 086920 | 091426 | 095862 | -100207 | *104445 032820 | -036011 | 039333 | 042776 | 046329 | 049979 | -053713 | 057517 | ‘061377 | -065278 9 | -O14951 | -016805 | 018793 °020913 | -023165 | 025545 | 028050 | °030676 | -033416 | -036266 10 | -006130 | 007058 | 008081 | °009202 | -010424 | -011751 | -013184 | 014724 -016374 | 018133 11 | -002285 | 002695 | -003159 | -008681 | -004264 | -004914 | -005633 , 006425 007294 | 008242 12 | 000781 | -000943 | -001132 | 001350 | -001599 | 001884 | :002206 | 002570 -002978 | :003434 3 | -000246 | -000305 | 000374 | 000457 | 000554 | -000667 | 000798 | -000949 001123 | -001321 14 | 000072 | 000091 | -000115 | 000144 | -000178 | -000219 | 000268 | 000325 | 000393 | -000472 15 | 000020 | -000026 | -000033 | -000042 | -000053 000067 | -009084 | -000104 -000128 | 000157 16 -Q00005 | 000007 | 000009 | ‘000012 | -000015 
-000019 | -000025 | 000031 -000039 000049 17 | 000001 | -000902 | *000002 | *000003 | -000004 -000005 | -000007 | *000009 | *000011 | :000014 7) WOANAAK OHS iS |) Ga — | 000001 | *000001 | -000001 | -000001 | -000002 *000002 - -000003 | -000004 1S) = _— ces — = — — | 000001 | -000001 | *000001 | | ole of Se Nord a4 55 56 a7 eee se ye) i) OO) x 0 006097 | °005517 | 004992 | -004517 | -004087 | -003698 | 003346 | -003028 | 002739 | 002479 031093 | ‘028686 | -026455 | °024390 | 022477 | ‘020708 | 019072 | -017560 | 016163 | -014873 2 | -079288 | 074584 | 070107 | -065852 | -061812 | -057982 | :054355 | -050923 | -047680 | -044618 3 134790 | 129279 | 123856 | 118533 113323 | *108234 | +103275 | 098452 | ‘093771 | 089235 | ES Ce e®R > H. E. Soper 29 TA BLE—(continued). m qr Le Or © nn aS) N s aT ial OU fo ~t ~ oa) Ne) QD S 171857 | *168063 164109 | -160020 | *155819 | *151528 | 147167 | °142755 | 138312 | 133853) 4 175294 | 174785 173955 | -172821 | 171401) -169711 | 167770 | -165596 | *163208 | *160623 5 *149000 | °151480 | -153660 | 155539 | °157117 ) °158397 | *159382 | -160076 | 160488 | 160623) 6 “108557 | °112528 | -116343 | -119987 | °123449 126717 | 129782 | +132635 | °135268 1387677) 7 | 069205 | 073143 -077077 | -080991 | ‘084871 -088702 | 092470 -096160 | ‘099760 *103258 8 039216 | ‘042261 -045390 | 048595 | °051866 | -055192 ‘058564 | -061970 | °065398 | 06838 | 9 “020000 | 021976 024057 | -026241 | °028526 -030908 | °033382 | -035943 | 038585 | 041303, 10 -009273 | °010388 -011591 | -012882 | °014263 -015735 017298 | 018952 | -020696 | 022529 | 71 003941 | ‘004502 -005119 | -005797 | °006537 -007343 | -008216 | 009160 | 010175 | ‘011264 | 12 001546 | 001801 | 002087 | 002408 | °002766 -003163 -003603 | -004087 | 004618 | °005199 | 13 ‘000563 | -000669 | -000790 | -000929 -001087 -001265 | 001467 | -001693 | °001946 | "002228 | 14 ‘000191 | 000232 | -000279 | -000334 000398 000472 | 000557 -000655 | ‘000766 000891 15 000061 | ‘000075 | -000092 | -000113 | 
*000137 | -000165 | 000199 | -000237 | *000282 | ‘000334 16 000018 | -000023 | -000029 | -000086 | ‘000044 | -000054 | ‘000067 | 000081 | ‘000098 | ‘000118 | 17 “000005 | ‘000007 | ‘000008 | :000011 | ‘000014 | -G00017 | 000021 | -000026 | -000032 | ‘000039 | 18 000001 | 000002 -000002 | -000003 | -000004 | -000005 | -000006 | -000008 | -000010 -000012 | 19 = — 000001 000001 | 000001 | -000001 -000002 | -000002 | 000003 | -000004 | 20 = = —- ; — ar — -= | :Q00001 | *O00001 | *OO0001 | 21 6'l 6°2 63 Or4 6'5 O°6 oy 6°8 Org 70 we 002243 | -002029 | -001836 | °001662 | ‘001508 | -001360 | -001231 | -001114 | -001008 | ‘000912 | 0 013682 | °012582 | -011569 | -010634 | 009772 | ‘008978 | °008247 | 007574 | °006954 | ‘006383 | 1 041729 | :039006 | :036441 | °034029 | 031760 | 029629 | 027628 | -025751 | -023990 | 022341 | 2 084848 | ‘080612 | -076527 | °072595 | ‘068814 | 065183 | 061702 | -058368 | -055178 | 052129 | 3 129393 | °124948 | :120530 | -116151 | °111822 | -107553 | :103351 | (099225 | -095182 | ‘091226 | 4 "157860 | °154936 | *151868 | -148674 | °145369 +141969 | 138490 | 134946 | -131351 | 127717) 6 "160491 | °160100 | -159461 | +158585 | °157483 | -156166 | °154648 | 152939 | 151053 | -149003 | 6 139856 | *141803 | *143515 | 144992 *146234 | -147243 | :148020 | 148569 | °148895 | *149008 | 7 "106640 | -109897 | :113018 | °115994 | ‘118815 | °121475 | -123967 | °126284 | -128422 | 130377] $& 072278 | ‘075707 | :079113 | -082484 | 085811 | 089082 | 092286 | -095415 | 098457 | 101405 | 9 044090 | 046938 | -049841 | -052790 | 055777 058794 | 061832 | 064882 | 067935 | ‘070983 | 10 024450 | 026456 028545 | :030714 | °032959 | 035276 | -037661 | -040109 | -042614 | “045171 | 11 012429 | ‘013669 | 014986 | :016381 | °017853 | -019402 | -021028 | -022728 | -024503 | 026350 | 12 005832 | 006519 | -007263 | ‘008064 | °008926 | -009850 | :010837 | -011889 | -013005 | 014188 | 13 002541 | ‘002887 003268 003687 | (004144 -004644 | -005186 | 005774 | ‘006410 ‘007094 | 
14 001033 | °0011938 -001373 | -001573 | 001796 -002043 | -002317 | 002618 | -002949 003311 | 14 ‘000394 | :000462 | -000540 | -000629 | 000730 | -000843 | -000970 | 001113 | 001272 *001448 | 10 000141 | ‘000169 | 000299 | 000237 | ‘000279 *000327 | *000382 | -000445 | ‘000516 | ‘000596 | 17 000048 | -000058 | 000070 | ‘000084 | ‘000101 000120 | :000142 | ‘000168 | 000198 -000232 | 1S ‘000015 | *000019 | -000023 | -000028 | -000034 -000042 | -000050 | -000060 | -000072 ‘000085 | 19 000005 | ‘000006 | -000007 | -000009 | °000011 -000014 | ‘000017 | -000020 | 000025 000030 | 20 000001 | *090002 | -000002 | -000003 | *000003 | -000004 | 000005 | -000007 | ‘000008 -000010 | 271 = -— | -000001 | 000001 | -000001 ‘000001 | *000002 | -000002 | -000003 -000003 | 2.2 = = | -—- —- | — | = — 000001 | 000001 000001 | 22 q q 30 Poisson's Exponential Binomial Limit TABLE—(continued). ™m av ax Hi | 7:2 13 Th 1s} v6 at 78 79 8-0 0 | :000825 | 000747 | -000676 | -000611 | 7000553 | -000500 000453 | 000410 | -000371 | 000335 | 0 1 | :005858 | ‘005375 | 004931 | 004523 | 004148 -003803 | (003487 | -003196 | 002929 | 002684] 1 | 2 | 020797 | 7019352 | -018000 | 016736 | ‘015555 | 014453 | (013424 | -012464 | 011569 | 010735 | 2 3 | 049219 | 046444 | 043799 | 041282 | 038889 036614 | 084455 032407 -030465 | 7028626] 3 4 | (087364 | -083598 | -079934 | 076372 | (072916 069567 | 066326 -063193 060169 | 057252 | 4 5 | 124057 | 120382 | +116703 | -113031 | °109375 105742 | -102142 -098581 ‘095067 | 091604} 5 6 | 146800 | "144458 | 141989 | :139405 | 136718 °133940 | "131082 | -128156 125171 | *122138| 6 7 | *148897 | *148586 | *148074 | -147371 | °146484 | °145421 | 144191 | -142802 | -141264 | 139587] 7 8 | 132146 | -183727 | °135118 | +136318 | ‘137329 | -138150 | *138783 | -139232 | -139499 | -139587| 8 9 | *104249 | *106982 | 109596 | *112084 | *114440 | °116660 | 118737 | -120668 | -122449 | 124077] 9 10 | ‘074017 | 077027 | 080005 | -082942 | 085830 | ‘088661 | °091427 -094121 | 
096735 | -099262 | 10 11 | ‘047774 050418 | °053094 -055797 | °058521 | °061257 063999 -066740 | 069473 | -072190) 11 12 | 028267 | -030251 | ‘032299 | -034408 | 036575 | 038796 041066 | -043381 | 045736 | -048127 | 12 3 | :015438 | 016754 018137 019586 | ‘021101 | -022681 | °024324 -026029 | 027794 | 029616 | 13 14 | 007829 | 008616 009457 | -010353 | 011304 | -012312 | 013378 -014502 | -015684 | 016924 | 14 5 | 003706 -004136 004603 -005107 | *005652 006238 ‘006867 -007541 | 008260 | -009026 | 15 16 | 001644 001861 002100 002362 | 002649 002963 | °003305 -003676 | -004078 | 004513 | 16 17 | -000687 | -000788 | ‘000902 | °001028 | 7001169 | ‘001325 | 001497 | -001687 | -001895 | 002124 | 17 18 | -000271 | 000315 | -000366 | -000423 | 000487 | *000559 | 000640 | 000731 | *000832 | 000944 | 18 19 | -000101 | °000119 | *000141 -000165 | 000192 °000224 | *000259 -000300 *000346 | 000397 | 19 20 | 000036 -000043 | 000051 | 000061 | 000072 | 000085 | 000100 | -000117 | 000137 | 000159 | 20 21 | 000012 -000015 *000018 | 000021 | °000026 | -000031.| *000037 000043 | -000051 | -000061 | 21 22 | 000004 | -000005 -000006 | -000007 | 000009 -000011 | *000013 -000015 | -000018 | -000022 | 22 23 | 000001, -000002 *000002 | -000002 | “000003 , 000004 | *000004 | -000005 ‘000006 | 000008 | 23 24 = = ‘000001 | -000001 | 000001 *000001 | 000001 -000002 | -000002 | *00C003 | 2 25 = 23 = ae = mn — 000001 | 000001 | -000001 | 25 = | ae raw x | 8 8-2 8:3 Sy 85 SiG) ile oa 88 | 89 90 | «x 0 | 000304 | -000275 | 000249 | *000225 | 000203 | 000184 | -000167 | 000151 | 000136 | -000123 | 0 1 | -002459 | ‘002252 ‘002063 | ‘001889 | -001729 | -001583 | -001449 | 001326 | 001214 | OOL111) 1 2 | 009958 -009234 008560 | ‘007933 | 007350 006808 -006304 -005836 -005402 | 004998 | 2 3 | 026885 | 025239 | 023683 | 022213 | 020826 | 019517 | ‘018283 | 017120 | -016025 | ‘014994 | 3 4 | 054443 | 051740 049142 046648 | 044255 -041961 039765 | 037664 035656 | -033737 | 4 5 | 088198 | 084854 081576 | 
078368 | ‘075233 | 072174 069192 | -066289 | 063467 | -060727 | 5 6 | 119067 | "115967 | "112847 | °109716 | "106581 *103449 | -100328 | 097224 | 094143 | 091090 | 6 7 | 137778 snes 1305 “a105) "129419 | -127094 | °124693 | -129224 | 119696 | °117116| 7 8 | -139500 | *139244 | -138823 | "138242 | 137508 136626 | 135604 | -134446 | 133161 | 131756 | 8 9 | 125550 | *126866 | 128025 | "129026 | *129869 | -130554 *131084 | "131459 | "131682 | -1381756| 9 10 | -101696 | *104031 | 106261 | *108382 | °110388 | -112277 *114043 | -115684 | *117197 | 118580) 10 11 | :074885 | ‘077550 | ‘080179 | ‘082764 | ‘085300 | -087780 | °090197 | 092547 | °094823 | 097020 | 11 12 | :050547 | -052993 | -055457 | °057935 | 060421 062909 | -065393 | -067868 | °070327 | °072765 | 12 13 | 031495 | ‘033426 | 035407 | ‘037435 | °039506 -041617 °043763 , 045941 | -048147 | -050376 | 13 14 | -018222 | 019578 | 020991 | (022461 | ‘023986 | 025565 -027196 | -028877 | 030608 | 032384 | 14 15 | -009840 | 010703 | 011615 | °012578 | 013592 | -014657 ‘015773 | 016941 | -018161 | ‘019431 | 15 16 | -004981 | 005485 -006025 | °006604 | °007221 | -007878 | 008577 | -009318 | 010102 | °010930 | 16 17 | -002373 | 002646 002942 | 003263 | 003610 -003985 004389 | -004823 | ‘005289 | ‘005786 | 17 18 | -001068 | 001205 -001356 | °001523 | 001705 | -001904 | 002121 | -002358 | 002615 | -002893 | 18 19 | -000455 | -000520 | -000593 | °000673 | 000763 | -000862 | 000971 | -001092 | °001225 | 001370) 19 20 | 000184 ee ia ‘000283 | -000324 | -000371 | 000423 | 000481 | -000545 | ‘000617 | 20 H. KE. Soper 31 TABLE—(continued). 
me xv rs Bo Sl 82 S38 Sh Sb S'6 Ont ss SD | 9:0 21 | 000071 | ‘000083 | ‘000097 | ‘000113 -000131 | °000152 | ‘O00L75 | *000201 | *000231 | ‘000264 | 27 22 | -000026 | ‘000031 | -000037 | ‘000043 | -000051 | ‘000059 | 000069 | ‘000081 | 000093 | *000108 | 22 28 | 000009 | -000011 | ‘000013 | ‘000016 | 000019 | 000022 | *000026 -000031 | -000036 | 000042 | 23 24 | 000003 | -000004 | 000005 | ‘000006 | -000007 | ‘000008 -009009 | ‘000011 000013 | ‘000016 | 24 25 | -000001 | 000001 | ‘000002 | :000002 ‘000002 | *000003 | -000003 | °000004 | 000005 | ‘000006 | 25 | 26 — — —- 000001 | 000001 | *O00001 | “000001 | *QOQ0OL1 | *O00002 | -000002 | 26 2Y = — — a = — — ‘OOO0O1 | ‘OOO0001 | 27 Da Oot 9°2 Fs a4 hes) IO Oey 9°8 99 10°0 4B | sons. 0 | 000112 | 000101 | ‘OO0091 | -000083 -000075 | ‘000068 | *OO0061 | *000055 | *000050 | ‘000045 | U 1 | 001016 | -000930 | ‘000850 | :000778 | ‘000711 | 000650 | ‘000594 | -000543 | 000497 | :000454 I 2 | 004624 | 004276 | 003954 | 003655 | -003378 | 003121 | (002883 -002663 | 002459 | 002270 g 3 | 014025 | 013113 | °012256 | 011452 | ‘010696 | °009987 | *009322 | ‘008698 | ‘008114 | ‘007567 3 4, | 031906 | 030160 | 028496 | -026911 | -025403 | :023969 | ‘022606 | °021311 | 020082 | 0189174 5 | 058069 | -055494 | -053002 | 050593 | -048266 | 046020 | 043855 °041770 | 039763 | °037833 | 5 6 | °088072 | -085091 | °082154 | °079262 | ‘076421 | -073632 ‘070899 | °068224 | *065609 | °063055 6 7 | 114493 | -111834 | *109147 | +106438 | °103714 | *100981 | ‘098246 | -095514 | °092790 | ‘090079 7 8 | 130236 "128609 | °126883 | °125065 | 123160 | °121178 | °119123 | *117004 | *114827 | *112599 8 9 | 131683 | °131467 | *131113 | 130623 | -130003 | 129256 | °128388 | *127405 | °126310] 1251109 10 | *119832 | *120950 | °121935 | *122786 | °123502 | °124086 | °124537 | °124857 | -°125047 | -125110 10 11 | :099188 | -101158 | *103090 | *104926 | -106661 | *108293 | ‘109819 | *111236 *112542 | 113736 11 12 | ‘075176 | ‘077555 | ‘079895 | -082192 | 
-084440 | 086634 | ‘088770 | -090848 | °092847 | 094780 12 18 | 052628 | :054885 | ‘057156 | 059431 | -061706 | (063976 | ‘066236 | 068481 | ‘070707 | (072908 13 14 | °034205 | :036067 | °037968 -039904 | ‘041872 | °043869 | °045892 | 047937 | “050000 | °052077 14 15 | 020751 | 022121 | °023540 *025006 | 026519 | 028076 | ‘029677 | ‘031319 | ‘033000 | ‘034718 15 16 | °011802 | 012720 | ‘013683 -014691 | -015746 | -016846 | ‘017992 -019183 | °020419 | -021699 16 17 | (006318 | -006884 | 007485 | 008123 | :008799 | 009513 | -010266 | -011058 | °011891 | ‘012764 17 18 | 003194 | 003518 | -003867 004242 | 004644 | 005074 | 005532 | ‘006021 | ‘006540 | ‘007091 18 19 | (001530 | -001704 | 001893 ‘002099 | 002322 | -002563 | °002824 | °003105 | °003408 | 003732 | 19 20 | :000696 | 000784 | ‘000880 | 000986 | -001103 | -001230 | *001370 | °001522 | *001687 | 001866 | 20 21 | 000302 | 000343 | ‘000390 000442 -000499 | 000563 | 000633 | 000710 | *000795 | ‘000889 | 21 22 | 000125 | 000144 | °000165 *O000189 | 000215 | 000245 000279 | *000316 | ‘000358 | -000404 | 22 23 | 000049 | -000057 | ‘000067 = “000077 000089 | -000102 *000118 | *000135 | 000154 | -000176 | 23 24 | 000019 | -000022 | ‘000026 -000030 | 000035 | ‘000041 | *000048 | *000055 | ‘000064 | -000073 | 24 | 25 | ‘000007 | -000008 | 000010 | -O00011 | *000013 | ‘000016 | “000018 | ‘000022 *000025 | -000029 | 25 | 26 | 000002 | -000003 | *000003 | :000004 | ‘000005 | ‘000006 | *O00007 | *O00008 000010 | 000011 | 26 27 | "000001 | 000001 | *OOO001 | *O00001 | *000002 | -000002 | *O00002 | "000008 *000004 | 000004 | 27 28 — —- | = — 000001 | ‘000001 | ‘000001 | 000001 000001, -000001 | 28 QO — = —- | — ee 000001 p22 @ 10:1 10°2 L0°3 L104 L0°5 106 L0°7 L0°8 KORY) | 110 7 0 | -000041 | :000037 | -000034 -000030 -000028 | -000-25 | ‘000023 | -000020 | -000018 | *OO0017 0 1 | 000415 | *000379 | 000346 | 000317. 
°000289 | -000264 | -000241 | :000220 | 000201 | ‘000184 7 2 | 002095 | 001934 | 001784 -001646 | -001518 | :001400 | 001291 | -001190 | -001097 ; 001010 | 2 3 | 007054 | 006574 | °006125 | -005705 | 005313 | 004946 | 004603 | 004283 | -003984 | *008705 | 3 1 | | | 32 Powssovs Exponential Binomial Limit TABLE—(continued). UD x = ine ey ] BL 10°1 10:2 10°3 10"4 | 10°5 | 10°6 L0°7 10°S 10:9 | 11:0 | —|- ——|——|——— | _—— 4 | O17811 -016764 | -015773 | 014834 | -013946 | -013107 | -012313 | -011564 | 010856 | -010189 |] 4 5 | 085979 | °084199 | °032492 | 030855 | *029287 | ‘027786 | °026350 | 024978 | 023667 | 022415 | 5 6 | °060565 | -058139 | (055777 | 053482 -051252 049089 | 046991 | -044960 | 042995 041095 | 6 7 | 0873887 | 084716 | 082072 | 079458 °076878 | (074334 | 071830 | 069367 | 066949 064577, 7 8 | °110326 | *108013 | *105668 | 103296 *100902 | -098493 | ‘096072 -093646 | (091218 | ‘088794 | 8 9 | °123810 | 122415 | °120931 | -119364 “117720 | *116003 | °114219 -112375 | -110475 °108526; 9 10 | 125048 | °124863 | °124559 | 124139 +123606 +122963 | -122215 | -121365 | -120418 | 119378 | 70 17 | 114817 | *115782 | °116633 | °117368 | -117987 | *118492 | -118882 | *119159 | -119323 | -119378 | 71 12 | 096637 | 098415 | *100110 | °101719 | *103239 | :104667 | *106003 | +107243 | :108386 | 109430 | 12 13 | ‘075080 | ‘077218 | ‘079318 | -081375 | ‘083385 | °085344 | ‘087248 | -089094 | 090877 | °092595 | 13 14 | °054165 | °056259 | 058355 | *060450 | -062539 | °064618 |} 066683 | °068730 | ‘070754 | 072753 | 14 15 | °0B6471 | °038256 | 040071 | 041912 | °043777 | °045663 | 047567 | -049485 | 051415 053352 | 15 16 | °023022 | 024388 | 025795 | °027243 | °028729 | °030252 | -031810 | 033403 | 035026 ‘036680 | 76 17 | (013678 | 014633 | °015629 | 016666 | ‘017744 | °018863 | 020022 | -021220 | 022458 | 023734 | 17 18 | :007675 | *008292 | ‘008943 | -009629 | -010351 | °011108 | -011902 | :012732 ‘013600 | 014504 | 78 19 | ‘004080 | 004451 | *004848 | *0054%71 | 
-005720 | -006197 | °006703 | -007237 | 007802 | -008397 | 19 | 20 | °002060 | -002270 :002497 | -002741 003003 | 003285 | -003586 | 003908 | -004252 | 004618 | 20 21 | 000991 | °001103 | *001225 | ‘001357 | *001502 | °001658 | -001827 | °002010 | 002207 | ‘002419 | 21 22 | 000455 | -000511 | 000573 | 000642 | 000717 | 000799 | :000889 | -000987 | -001093 | °001210 | 22 23 | -000200 | -000227 | -000257 | 000290 -000327 | -000368 | 000413 | 000463 | -000518 | ‘000578 | 23 24 | 000084 -000096 | -000110 | 000126 -000143 | 000163 | -000184 000208 | -000235 -000265 | 2 25 | 000034 | 000039 | *000045 | *000052 , 000060 | *000069 | ‘000079 | -000090 | 000103 | ‘000117 | 25 26 000013 | 000015 *000018 | -000021 | 000024 -000028 | -000032 000037 | -000043 | 000049 | 26 27 | 000005 | 000006 | 000007 | *000008 | 000009 | :000011 | *000013 | :000015 | 000017 | *000020 | 27 28 000002 | -000002 | *0000038 | *000003 | -000004 | *000004 | :000005 | *000006 | -000007 | -000008 | 28 29 | 000001 | ‘000001 | -000001 | °000001 | 000001 | 000002 | *000002 | -000002 | -000003 | 000003 | 29 3O — _ — a — 000001 | ‘000001 | -000001 | *000001 | ‘000001 | 30 a let ED 113 114 TAG 5 = eto Mie 11°8 11:9 12°0 iz. 
0 | 000015 | 000014 | ‘000012 | ‘000011 | ‘000010 | *000009 | °000008 | °000008 | *000007 ‘000006 |} 0 Z | -000168 | °000153 | 000140 | 000128 | ‘000116 | *000106 | ‘000097 | -000089 | ‘000081 | ‘000074 | 7 2 | 000931 | -000858 | 000790 | 000727 | 000670 | ‘000617 | 000568 | 000522 | 000481 | 000442 | 2 3 | 003445 | -003202 | 002976 | 002764 | ‘002568 | 002385 | -002214 | -002055 | 001907 | ‘001770 | 3 4 | 009559 | 008965 | -008406 | 007879 | 007382 | 006915 | 006476 | 006062 | -005674 | 005309 | 4 5 | °021221 | 020082 | -018997 | 017963 016979 | °016043 | 015153 | ‘014307 | -013504 | ‘012741 | 5 G | ‘039259 | °037487 | -035778 | 7034130 | 032544 | -031017 | °029549 | 028137 | -026782 | 025481 6 7 | 062253 | -059979 | -057755 | 055584 | 053465 | 051400 | 049388 | -047432 | 045530 | 043682 | 7 8 | 086376 | °083970 | 081579 | -079206 | ‘076856 | 074529 | 072231 | -069962 | 067725 | ‘065523 | 8 9 | 106531 | -104496 | 102427 | -100328 | 098204 | 096060 | -093900 | °091728 | 089548 | ‘087364 | 9 10 | “118249 | -117036 | 115743 | 114374 | +112935 | *111430 | *109863 | 108239 "106562 | °104837 | 10 11 | 119324 | +119164 | °118899 | °118533 | -118068 | *117508 | °116854 | *116110 | -115281 | 114368 | 11 12 | *110375 | 111220 | +111964 | *112607 | *113149 | °113591 | +113933 | -114175 | 114320 | *114363 | 72 13 | °094243 | 095820 | -097322 | 098747 | -100093 | °101358 | -102539 | -103636 | -104647 | 105570 | 13 14 | 074721 | (076656 | -078553 | -080409 | -082219 | 083982 | 085694 | :087350 | -088950 | 090489 | 14 15 | 055294 | 057236 | 059177 | 061110 | -063035 | -064946 | 066841 | -068716 070567 | 072891 | 15 16 | -038360 | ‘040065 | -041793 | 043541 | 045306 047086 | 048877 | ‘050678 | 052484 | -054293 | 16 17 | 025047 | -026396 | -027780 | -029198 | 030648 | °032129 | :033639 | 035176 | 036739 | ‘0388325 | 17 18 | 015446 | 016424 | -017440 | -018492 | 019581 | -020706 | 021865 | ‘023060 | -024288 | 025550 | 18 19 | -009023 | 009682 | -010372 | ‘011095 | 011852 | :012641 
| 013465 | -014322 | -015212 | ‘016137 | 19 Se ee ee ee ay ee ee ee ee Po pain hs: H. E. Soper TA BLE—(continued). Vit Lv = = = =F = — 111 11-2 113 [14 HN, | LTO 9) eat te? | | | | 20 | 005008 | -005422 | 005860 | -006324 -006815 | 007332 | ‘007877 21 | :002647 | 002892 -003153 | :003433 -003732 | ‘004050 | 004388 | 22 | 001336 | -001472 001620 001779 | -001941 | ‘002136 | -00233 23 | 000645 | -000717 | -000796 | 000882 | -000975 -001077 -001187 24 | °000298 | -000335 | 000375 -000419 | -000467 | 000521 | -000579 25 | 000132 | -000150 000169 | 000191 | 000215 | 000242 | -000271 26 | °000057 | 000065 000074 -000084 | 000095 | 000108 000122 | 27 | 000023 | 000027 | 000031 —-000035 “000041 "000046 | -000053 28 | ‘000009 | -OVDOL1 | 000012 -000014 | ‘000017 | -000019 | 000022 | 29 | -Q00004 “O00004 000005 *OO0006 | *OO0007 | *O00008 | -~OO0009 | 30 | -000001 | -000002 ‘000002 | -000002 | -000003 | -000003 | -000003 | el *YODVO1 | -000001 | *000001 | 000001 | *OO0001 | ‘000001 | OY | — -- — — -— | F = i || StF 12:2 12°3 124 io 12°6 12°7 | r | “aanel| 0 000006 | -000005 | *000005 | 000004 -000004 | *000003 000008 | 1 000067 -000061 | *000056 | 000051, 000047. 
*000042 -000039 2 000407 000374 | 000344 -000317 000291 -000268 | -000246 3 001641 | -001522 | °001412 | -001309 | -001213 | 001124 | -001042 4 | 004966 | -004643 | °004841 | 004057 | -003791 | 003541 | 003307 5 | 012017 | -011330 | 010679 | -010062 | -009477 | ‘008924 | 008400 | 6 | 024233 | -023037 | °021892 | 020794 | -019744 | (018740 | -017781 7 | 041889 | 040151 | 038467 | 036836 | -035258 | ‘033733 | -032259 | 8 | 063358 | -061230 | °059142 | -057095 | 055091 | °053129 | 051212 9 | -085181 | -083000 | ‘080828 | -078665 | 076515 074381 | (072266, 10 | *103069 | 101261 | °099418 | ‘097544 | -095644 °093720 -091777 11 | 113876 | °112308 | 111168 | +109959 | -108686 | 107352 | °105961 12 | *114321-| -114180 | 113947 | -113624 | -113215 | °112720 | -:112142 13 | °106406 | -107153 | °107811 | -108380 | -108860 | °109251 | *109554 14 | 091965 | :093376 094720 095994 | -097197 | ‘098326 | -099381 15 | 074185 | ‘075946 | °077670 | 079355 | -080997 | °082594 | 084143 16 | 056103 | -057909 | °059709 /-061500 | (063279 ‘065043 | 066788 | 17 | :039932 | -041558 | °043201 | 044859 | -046529 ‘048208 | -049895 | 18 | ‘026843 | 028167 | ‘029521 | -030903 | -032312 °033746 | °035204 19 | -017095 | -018086 | ‘019111 | -020168 | -021258 | °022379 | 023531 20 | -010342 | -011033 | ‘011753 | -012504 | -013286 | 014099 | 014942 21 | 005959 | -006409 | ‘006884 | -007383 | 007908 | ‘008459 | -009036 | 22 | -003278 | 003554 | 003849 | 004162 | 004493 | ‘004845 | -005216 23 | 001724 | -001885 | ‘002058 | -002244 | -002442 | ‘002654 | 002880 24 | 000869 | 000958 | -001055 001159 | 001272 | *001393 001524 | 25 | 000421 | -000468 | 000519 | -000575 | 000636 | ‘000702 | -000774 26 | :000196 | :000219 | -000246 | -000274 | -000306 | °000340 | -000378 | 27 | -000088 | -000099 | ‘000112 | 000126 | -000142 | ‘000159 | 000178 | 28 | -000038 | 000043 -000049 | -000056 | -000063 | ‘000071 | -000081 | 29 | 000016 -000018 000021 | -000024 | -000027 | -000031 | -000035 30 | 000006 | -000007 000009 
-000010 | -000011 | (000013 | ‘000015 31 | -000002 | -000003 -000003 000004 | 000005 | -000005 | 000006 32 | 000001 | -000001 000001 -000002 ‘000002 000002 | 000002 | Bo = | = — “000001 | 000001 | “000001 | *O00001 BY ee ee = == i) = ae Biometrika x ao ints ———— rs it} || rile) | BO) 008450 | -009051 | -009682. 20 004748 | 005129 -005533. 21 | -002547 | 002774 | 003018 | 22 001307 | 001435 | 001575. 23 000642 °000712 | ‘000787 24 “000303 | 000339 | ‘000378 25 000138 | -000155 *OOO1L74 26 “000060 | *O00068 | 000078 | 27 ‘000025 | -000029 | ‘000083 | “5 © 000010 000012 | -O00014 | 79 “000004 | 000005 -000005 | 3 ‘000002 | -000002 - *000002 3 *000D01L | 000001 | “000001 | 37 I 12'8 12°9 13:0 xv 000003 | 000002 | 000002} 0 000035 | -000032 | *000029 | 1 “000226 *000208 | ‘000191 2 000965 | ‘000894 | *000828 | 6 003088 | °002882 | *002690 | 4 007905 | 007436 | 006994 | 5 016864 | -015988 | °015153 | 6 030837 | -029464 | ‘028141 | 7 049339 | °047511 | 045730 | 8 ‘O70171 068100 -066054) 9 089819 | -087849 | ‘085870 | LO 104516 | *103023 | *101483 | 11 -111484 | -110749 | :109940 | 12 -109769 | -109897 | (109940 | 13 100360 | -101263 | "102087 |) 14 085641 | ‘087086 | 088475 15 (068513 | ‘070213 -071886 16 ‘051586 | 053279 | -054972 | 17 036683 | 038183 039702 | 18 024713 | (025925 -027164 | 19 015816 | 016721 °017657 | 20 009640 | 010272 °010930 ) 2 ‘005609 | -006023 | ‘006459 | 22 003122 | :003378 ‘003651 | 25 001665 | ‘001816 | 001977 | 24 000852 | ‘000937 | °001028 | 25 000420 000465 | 000514 | 26 “000199 | -000222 | 000248 | 27 000091 , 000102 ‘000115 | 28 000040. -000046 | *000052 | 29 000017 | -000020 | *000022 30 “000007 ‘000008 | 000009 31 “000003 -000003 | ‘000004 52 “000001 | -OO0001 | -000002 | 3: SS = 0000014173 5 34 Poissons Eaponential Binonial Limit TABLE—(continued). m | | | \ 13°2 LO258 Wy lovin A elo olen elo G 137. 
| 13:8 13°9 1L‘0 | | > St Ss INO © © % Ww & 7 9 | *000568 | -000626 | ‘000689 | :000758 000002 ‘000002 *000002 -000002 | 000001 ‘000001 *000001 | *000001 000001 ‘000001 000027 | ‘000024 | *000022 -000020 | -000019 | ‘000017 000015 | 000014 | -000013 | -000012 000175 ‘00016 | *000148 000136 -000125 | ‘000115 *000105 | -000097 | -000089 | *000081 | ‘000766 | 000709 | °000657 -000608 000562 | -000520 000481 | -000445 000411 | 000380 | ‘002510 | *002341 | 002183 | -002035 -001897 | ‘001768 °001648 | 001535 -001429 | ‘001331 | "006575 006180 °005807 | 005455 005123 ‘004810 004514 | -004236 ‘003974 | 003727 "014356 013596 | *012872 | -012183 | 011526 | 010902 *010308 009743 | 009206 | :008696 ‘026867 | 025639 | "024458 | -023322 022230 | ‘021181 | 020173 | 019207 | -018280 | 017392 "043994 -042304 -040661 -039064 -037512 | -036007 084547 -033132 031762 | 030435 ‘064036 | °062046 *060088 | 058161 | -056269 | -054410 | 052588 | -050802 | -049054 | 047344 | 083887 | ‘081901 | ‘079916 | -077936 075963 | -073998 | -072046 | ‘070107 -068185 | -066282 | 10 ‘099901 098281 | 096626 | -094940 | 093227 -091489 -089730 | -087953 ‘086162 | -084359 | 11 ee HOBLOO fine a aon ue 104880 | *103687 102441 | -101146 | ‘099804 | -098418 | 1? 
| "109898 | *109773 | *109566 | -109279 -108914 | *108473 | °107957 | -107370 | *106713 | *105989 | Lo "102833 | *103500 “104087 | 104595 *105024 | -105373 +105644 105836 | 105951 | *105989 | 14 _ 089807 | °091080 | °092291 | -093439 | -094522 | 095539 , 096488 | ‘097369 | 098181 | 098923 | 15 | 073530 075141 | ‘076717 | -078255 079753 | -081208 | -082618 | 083981 | -085295 | 086558 WH OHMWAAK Cs WHS a D ‘056661 | °058345 | -060019 | 061683 -063333 | -064966 ‘066580 | -068173 | 069741 | 071283 | 17 041237 | *042786 | °044348 | 045920 | -047500 | -049086 | (050675 | °052266 | 053856 | 055442 | 18 039400 | :040852 | 19 027383 | ‘028597 | 20 018125 | 019064 | 1 011452 | :012132 | 22 028432 029725 | 031043 | -032385 | 033750 | 035135 °036539 -037962 | 018623 | 019619 | 020644 | -021698 | -022781 | °023892 -025030 | -026193 ‘O11617 | °012332 | ‘013074 | 013846 -014645 | -015473 016329 | -017213 ‘006917 | °007399 | 007904 | -008433 -008987 | 009565 -010168 | -010797 003940 | -004246 | ‘004571 | 004913 -005275 | ‘005656 006057 , ‘006478 | :006921 | :007385 | 25 | “002151 | 002336 | ‘002533 | :002743 | 002967 | 003205 | 003457 | -003725 | 004008 | 004308 | 24 | 001127 | ‘001233 | 001348 | 001470 -001602 | 001744 | 001895 | ‘002056 | 002229 | -002412 000832 -000912 | 000998 | -001091 | ‘001191 | :001299 | 26 | 000275 | ‘000306 | °000340 | 000376 -000416 | 000459 | -000507 | 000558 | (000613 | :000674 | 27 ‘000129 | -000144 | ‘000161 | 000180 -000201 | :000223 | :000248 | 000275 ‘000305 | 000337 | 28 | 000058 | “OO0066 ‘000074 | 000083 | -000093 | 000105 | -000117 | -000131 -000146 | :000163 | 29 | 000025 | -000029 | 000033 | -000037 | 000042 | 000047 ‘000053 “000060 -000068 | ‘000076 | 30 | 000011 -000012 | :000014 | -000016 | 000018 | °000021 | -000024 | °000027 -000030 | :000034 | 31 | "000004 | 000005 | -000006 | -000007 | ‘000008 | -000009 | 000010 | -000012 | 000013 | ‘000015 | 22 | “000002 | -000002 -000002 | -000003 | ‘000003 | 000004 | 000004 | *000005 | ‘000006 | 
‘000006 3S | | 000001 | ‘OO0001 | ‘000001 | -000001 | -000001 | -000001 | -000002 | ‘000002 -O00002 | 000003 of D —- |; — a — | :000001 | ‘000001 | ‘000001 | ‘OO000I | ‘000001 | 35 141 | Ife | 14°38 4 | 145 146 et 14-8 | 14-9 15°0 “ | 4 0 |} *Q00001 | 000001 | 000001 , 000001 000001 | = — | —- —— a ei =. | © 5 1 | 000011 | -000010 | 000009 | -000008 — -000007 | 000007 | ‘000006 | ‘000006 | -000005 | *000005 1 : 2 | 000075 | ‘000069 | -O00063 » 000058 | *000053 | 000049 | -000045 000041 | ‘000088 | *0000384 | 2 3 | *0G0352 000325 | 000300 | ‘000277 :000256 | ‘000237 | 000219 | -000202 | ‘000186 | ‘000172 | 3 4 | 001239 -001153 -001073 | 000999 -000929 | -O00864 | 000803 -000747 -000694 | 000645 | 4 5 | 003494 -003275 -003070 | -002876 -002694 -002523 -002362 002211 | ‘002069 | °0019386 | 3 | g 6 | 008212 -007752 | ‘007316 | “006902 -006510 | ‘006139 | 005787 °005454 | 005138 | (004839; 6 ¢ 7 | (016541 | -015726 | 014946 -014199 -013486 | -012804 | -012152. -011530 -010937 | 010370) 7 | ; & | 029153 | 027913 | 026715 | 025559 | -024443 | -023367 | -022330 | 021331 | -020370 | 019444 | 8 2 9 | 045673 | 044040 | 042447 | -040894 | 039380 | ‘037907 | (086472 | "035078 | 033723 | ‘032407 9 10 | 064399 | -062537 | 060700 | -058887 057101 | 055343 | 053614 | -051915 050247 | 048611 10 11 | :082547 | 080730 | -078910 | 077089 | -075270 | 073456 | ‘071648 :069850 -068062 | ‘066287 | 17 \ | 7 Ds er H. E. Soper TABLE—(continued). m 35 Lyl | 142 Lyd | Lh 4 1d | 146 TH 148 19 15-0 s | = | Rae ae - ; a §-096993 095530 | -094034 | -092507 -090951 089371 ‘087769 | 086148 | 084510 | 082859 12 105200 | -104349 | 103437 -102469 | -101446 | *100371 -099247 | 098076 | -096862 | 095607 13 105951 *105839 | 105654 -105396 | -105069 | -104672 *104209 103681 | 103089 | 102436 14 § -099594 | -100195 | 100723 -101181 | *101567 101881 *102125 | -102298 | 102402 | -102436 15 ‘087768 088923 | -090021 091063. 
-092045 092967 -093827 | -094626 095361 | 096034 16 072795 | -074277 | 075724 077135 | ‘078509 | -079842 | -081133 | 082380 | -083581 | 084736 17 (057023 *058596 | -060158 -061708 063243. ‘064761 -066259 067735 069187 | 070613 18 042317 043793 | 045277 -046768 | 048264 -049763 -051263 | 052762 | 054257 | 055747 19 029834 031093 | 032373 ‘033673 034992 -036327 037678 -039044 | 040422 | ‘041810 20 020031 | -021025 | 022045 -023090 | ‘024161 -025256 026375 027517 028680 | -029865 27 | 012838 -013570 | 014329 | 015114 -015924 -016761 -017623 | 018511 | 019424 | 020362 22 007870 008378 | ‘008909 | 009462 | 010039 ‘010640 -011264 011911 | -012584 | 013280 23 (004624 -004957 | 005308 | 005677 | 006065 006472 -006899 “007345 °007812 | 008300 24 002608 -002816 | 003036 | 003270 ‘003518 -003780 -004057 °004348 -004656 | 004980 25 001414 001538 | 001670 “001811 -001962 “002123 -002294 002475 | -002668 | -002873 | 26 “000739 ‘000809 | -000884 -000966. ‘001054 -001148 | -001249 “001357 001473 | -001596 27 000372 -000410 000452 | 000497 | 000546 000598 -000656 -000717 “000784 | 000855 28 ‘000181 -000201 | -000223 | -000247 -000273 000301 -000332 000366 “000403 | 000442 29 ‘000085 “000095 | “000106 “000118 | °000132 “000147 | -000163 -000181 | -000200 | 000221 40 000039 “000044 000049 | -000055 000062 | 000069 -000077 000086 -000096 | -000107 4 “000017 {000019 | 000022 | 000025 | -000028 | -000032 -000035 | -000040 | “000045 000050. 32 000007 -000008 | 000009 000011 | -000012 “000014 -000016 000018 -000020 | -000023 33 § 000003 -000003 | -000004 -000005 -000005 | 000006 000007 000008 | 000009 | 000010 34 000001 “000001 | -000002 000002 000002 “000002 “000003 000003 000004 | 000004 35 — 900001 | 000001 -000001 | -000001 000001 -000001 | 000001 | «000002 | -000002 36 =e aa eae ap as ae = 000001 | -000001 ‘000001 37 ON THE POISSON LAW OF SMALL NUMBERS. By LUCY WHITAKER, BSc. PART I. THEORY AND APPLICATION TO CELL-FREQUENCIES. (1) Introductory. 
Let $p$ denote the probability of the happening of a certain event $A$, and $q = 1 - p$ the probability of its failure in one trial. Then it is well known that the distribution of the frequencies of occurrence $n, n-1, n-2, \ldots$ times in a series $N$ of $n$ trials is given by the terms of the point binomial

$$N(p + q)^n \qquad \text{(i)}.$$

The fitting of point-binomials plotted on an elementary base $c$ to observed frequency distributions has been discussed by Pearson*, and he has indicated that, if $c$ be unknown, the problem can be solved in terms of the three moment coefficients $\mu_2$, $\mu_3$, $\mu_4$ required to find $c$, $p$ and $n$. In actual practice but few cases of frequency can be found which are describable in terms of a point-binomial, and of these few a considerable section have $n$ negative, $p$ greater than unity and $q$ negative; thus defying at present interpretation, however well they may serve as an analytical expression of the frequency.

The hypothesis made in deducing the binomial $(p + q)^n$ as a description of frequency is clearly that each trial shall be absolutely independent of those which precede it. In this respect it may be said that binomial frequencies belong to the teetotum class of chances, and not to those of card-drawings, when each drawing is unreplaced. In the latter case the "contributory cause groups are not independent," and our series corresponds to the hypergeometrical rather than to the binomial type of progression†.

Using the customary notation $\beta_1 = \mu_3^2/\mu_2^3$, $\beta_2 = \mu_4/\mu_2^2$, the binomial is determined from:

$$n = \frac{2}{3 - \beta_2 + \beta_1}, \qquad c = \sigma\sqrt{6 - 2\beta_2 + 3\beta_1}, \qquad pq = \frac{1}{2}\,\frac{3 - \beta_2 + \beta_1}{6 - 2\beta_2 + 3\beta_1} \qquad \text{(ii)}.$$

* "Skew Variation in Homogeneous Material," Phil. Trans. Vol. 186, A, p. 347, 1895.
† Phil. Trans. Vol. 186, A, p. 381, 1895.

In order that $n$ should be positive, it is needful that $3 - \beta_2 + \beta_1 = \frac{1}{2}(6 - 2\beta_2 + 2\beta_1)$ should be positive. If this is satisfied clearly $c$ will be real because $\beta_1$ is always positive.
Further, then,

$$pq = \frac{1}{4}\,\frac{6 - 2\beta_2 + 2\beta_1}{6 - 2\beta_2 + 3\beta_1}$$

is always less than a quarter and $p$ and $q$ will therefore be real. If the reader will turn to Rhind's diagram, Biometrika, Vol. VII. p. 131, he will see that the line $3 - \beta_2 + \beta_1 = 0$ cuts off all curves of Types III, IV, V and VI, and includes a portion only of Type I, with a part of its U and J varieties. The binomial description of frequency, therefore, is not, considering our experience of frequency distributions, likely to be of very universal application.

(2) Further Limitations.

Now let us still further limit our binomial by supposing: (i) that the unit of grouping of the observed frequencies corresponds to the actual binomial base unit $c$ and (ii) that the first of the observed frequencies corresponds to the term $Np^n$ of the binomial*. In this case the mean $m$ of the observed frequency measured from the first term of the frequency will be equal to the $nq$ of the binomial and the standard deviation of the observed distribution will be equal to $\sqrt{npq}$. We have thus:

$$p = \frac{\sigma^2}{m}, \qquad q = 1 - \frac{\sigma^2}{m}, \qquad n = \frac{m^2}{m - \sigma^2} \qquad \text{(iii)},$$

and $n$ and $q$ will both be negative, if $m$ be less than $\sigma^2$. The condition for a positive binomial is therefore that $\sigma$ be less than $\sqrt{m}$.

(3) Probable errors of the constants of a Binomial Frequency.

It is desirable to find the probable errors of $p$ and $n$ as determined by these formulae. We have:

$$\mu_1' = nq, \quad \mu_2 = npq, \quad \delta\mu_1' = q\,\delta n + n\,\delta q, \quad \delta\mu_2 = pq\,\delta n + nq\,\delta p + np\,\delta q,$$

assuming deviations may be represented by differentials. Hence, since $\delta p = -\delta q$:

$$\delta\mu_2 - (p - q)\,\delta\mu_1' = q^2\,\delta n \quad \text{and} \quad p\,\delta\mu_1' - \delta\mu_2 = nq\,\delta q.$$

Square each of these results, sum for all samples and divide by the number of samples, and we have:

$$\sigma^2_{\mu_2} + (p - q)^2\sigma^2_{\mu_1'} - 2(p - q)\,\sigma_{\mu_2}\sigma_{\mu_1'}r_{\mu_2\mu_1'} = q^4\sigma_n^2,$$

$$p^2\sigma^2_{\mu_1'} + \sigma^2_{\mu_2} - 2p\,\sigma_{\mu_1'}\sigma_{\mu_2}r_{\mu_1'\mu_2} = n^2q^2\sigma_q^2.$$

* The exact nature of these limitations must be fully appreciated. The best fitting binomial to a given frequency distribution will usually be far from one in which the first term of the binomial corresponds to the first observed frequency.
The modes of the binomial and the observed frequency will closely correspond, but the "tails" of the binomial may be quite insignificant and correspond to no observed frequencies.

Now $\sigma_{\mu_2}$ is the standard deviation of variations in $\mu_2$ and therefore $\sigma^2_{\mu_2} = (\mu_4 - \mu_2^2)/N$. Similarly $\sigma_{\mu_1'}$ is the standard deviation of variations in the mean and therefore $\sigma^2_{\mu_1'} = \mu_2/N$. Lastly the product $\sigma_{\mu_2}\sigma_{\mu_1'}r_{\mu_2\mu_1'}$ measures the correlation between deviations in $\mu_2$ and $\mu_1'$ and is known to be $\mu_3/N$*. Thus we have:

$$q^4\sigma_n^2 = \frac{1}{N}\left\{\mu_4 - \mu_2^2 + (p - q)^2\mu_2 - 2(p - q)\mu_3\right\}, \qquad n^2q^2\sigma_q^2 = \frac{1}{N}\left\{\mu_4 - \mu_2^2 + p^2\mu_2 - 2p\mu_3\right\}.$$

But†

$$\mu_4 = npq\{1 + 3(n - 2)pq\}, \qquad \mu_3 = npq(p - q), \qquad \mu_2 = npq \qquad \text{(iv)}.$$

* Biometrika, Vol. II. "On the Probable errors of Frequency Constants," see p. 275 (iv), p. 276 (vii), and p. 279 (xii).
† Phil. Trans. Vol. 186, A, p. 347, 1895.

Whence after some purely algebraical reductions we deduce:

$$\sigma_n = \frac{n}{\sqrt{N}}\,\frac{p}{q}\sqrt{2\left(1 - \frac{1}{n}\right)} \qquad \text{(v)},$$

$$\sigma_p = \sigma_q = \frac{1}{\sqrt{N}}\sqrt{2p^2\left(1 - \frac{1}{n}\right) + \frac{pq}{n}} \qquad \text{(vi)}.$$

Formulae (v) and (vi) are very important; they enable us to obtain the probable errors for $n$ and $p$ when a binomial limited in the present manner is fitted to a frequency distribution. We see at once, that as $n$ grows large and $q$ grows small $\sigma_p = \sigma_q$ approaches the limit $\sqrt{2/N}$, or the probable error, $.67449\sqrt{2/N}$, of $p$ and $q$ is finite. But $\sigma_q$ being finite $\sigma_n$ becomes infinitely great, or the probable error of $n$ indefinitely large. Thus when the $n$ of the binomial is very large, $q$ being very small, the probable error of its determination is so great that its actual value is not capable of being found accurately. Again, suppose $N$ embraced 200 observations, the probable error of $q$ would be of the order .07; if $N$ corresponded to only eighteen observations, then the probable error of $q$ would be of the order .22. It is clearly wholly impossible
from series of observations even of the order 200, much less of order 18, to assert that $q$ is or is not really a "small quantity." Thus the observed value of $q$ corresponding to a population of extremely small $q$ might easily show $q = .15$ to $.50$!

‡ There is no difficulty in obtaining the probable errors of $n$ and $p$ from the more general values in (ii). In this case

$$\sigma_n = \tfrac{1}{2}n^2\sqrt{\sigma^2_{\beta_2} + \sigma^2_{\beta_1} - 2\sigma_{\beta_1}\sigma_{\beta_2}r_{\beta_1\beta_2}},$$

$$\sigma_p = \sigma_q = \frac{1}{2(p - q)(6 - 2\beta_2 + 3\beta_1)^2}\sqrt{(\beta_2 - 3)^2\sigma^2_{\beta_1} + \beta_1^2\sigma^2_{\beta_2} - 2\beta_1(\beta_2 - 3)\sigma_{\beta_1}\sigma_{\beta_2}r_{\beta_1\beta_2}}.$$

The values of $\sigma_{\beta_1}$, $\sigma_{\beta_2}$ and $r_{\beta_1\beta_2}$ for different values of $\beta_1$ and $\beta_2$ have been tabled by Rhind, Biometrika, Vol. VII. pp. 136-141.

(4) Poisson: Law of Small Numbers.

A last limitation of the point-binomial is made by supposing the mean $m = nq$ to remain finite, but $q$ to be indefinitely small. We write:

$$N(p + q)^n = N(1 - q + q)^n = N(1 - q)^n\left(1 + \frac{q}{1 - q}\right)^n = N(1 - q)^{m/q}(1 + q)^{m/q} \text{ nearly} = Ne^{-m}\left(1 + m + \frac{m^2}{2!} + \frac{m^3}{3!} + \ldots\right).$$

Here the successive terms give the frequency of occurrence of 0, 1, 2, 3 ... successes on the basis of each success not being prejudiced by what has previously occurred. This is the Law of Small Numbers. It was first published by Poisson in 1837*. It was adopted later by Bortkewitsch, who published a small treatise expanding by illustrations Poisson's work†. The same series was deduced later by "Student" in ignorance of both Poisson and Bortkewitsch's papers, when dealing with the counts made with a haemacytometer‡. The mean is at $m$ from the first group, the other moments as "Student" has shewn§ are:

$$\mu_2 = m, \qquad \mu_3 = m, \qquad \mu_4 = 3m^2 + m.$$

Hence $\beta_1 = 1/m$, $\beta_2 - 3 = 1/m$. When the mean value is large, $\beta_1$, $\beta_2$ and the higher $\beta$'s approach the values given by the Gaussian curve. Clearly the Poisson-Exponential formula contains only the single constant $m = \mu_2$, and its probable error is therefore

$$.67449\,\sigma/\sqrt{N} = .67449\sqrt{m/N}.$$

This will, if $N$ be reasonably large and $m$ not too big, be a small or at any rate a finite quantity (i.e. not like $\sigma_n$ for $q$ very small).
Hence it might be supposed, although erroneously, that the Poisson-Exponential formula was capable of great accuracy in addition to its great simplicity. But this is to neglect the fundamental assumptions on which it is based, namely: (i) that the data actually correspond to a binomial, (ii) that in that binomial $q$ is small and $n$ large. Clearly (i) shows us that, if we can find the binomial, it will actually be closer to the observed frequency than Poisson's merely approximate formula.

Secondly (ii) can only be justified as an assumption by actually ascertaining the form of the binomial from the data and testing whether $n$ is large and $q$ small and positive. It appears absurd to base our formula on an approximation to a binomial of a particular kind when, on testing in the actual problem, such a binomial does not describe the results. As a merely empirical formula, the Poisson-Exponential of course can be tested by the usual processes for measuring goodness of fit, but no such test nor any discussion of the probable errors of their results have been provided by Bortkewitsch himself nor by Mortara, who has followed recently his lines in a work to be considered later.

As a matter of fact in the cases dealt with by Bortkewitsch, by Mortara and by "Student," $n$ will be found almost as frequently small and negative as large and positive, and $q$ takes a great variety of values large and negative and large and positive, as well as small and positive. Thus the initial assumptions made from which the "law of small numbers" is deduced are by no means justified on the material to which it has so far been applied.

* Recherches sur la Probabilité des Jugements. Paris, 1837, pp. 205 et seq.
† Das Gesetz der kleinen Zahlen, Leipzig, 1898.
‡ "On the Error of Counting with a Haemacytometer," Biometrika, Vol. V, pp. 351-5, 1907.
§ They may be deduced at once from (iv).
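The test called for above, namely fitting the limited binomial and asking whether $n$ is large and $q$ small and positive once their probable errors are allowed for, may be sketched numerically. The following is a minimal Python sketch, assuming formulae (iii), (v) and (vi) in the forms $p = \sigma^2/m$, $q = 1 - p$, $n = m/q$, $\sigma_n^2 = 2n(n-1)p^2/(Nq^2)$ and $\sigma_q^2 = \{2p^2(1 - 1/n) + pq/n\}/N$. The function name, the dictionary keys and the input values are illustrative assumptions and form no part of the memoir.

```python
import math

PROBABLE_ERROR_FACTOR = 0.67449  # probable error = 0.67449 x standard deviation


def fit_limited_binomial(mean, variance, n_obs):
    """Fit the limited binomial N(p + q)^n from an observed mean and variance.

    Assumes mean = n*q and variance = n*p*q (formulae (iii)), then attaches
    the probable errors of n and q from formulae (v) and (vi).
    The case mean == variance (q = 0, a pure Poisson) is not handled.
    """
    p = variance / mean
    q = 1.0 - p
    n = mean / q
    # Squared standard deviations, written so that negative n and q
    # (which do occur in practice) need no special treatment.
    var_n = 2.0 * n * (n - 1.0) * p * p / (n_obs * q * q)
    var_q = (2.0 * p * p * (1.0 - 1.0 / n) + p * q / n) / n_obs
    if var_n < 0 or var_q < 0:
        raise ValueError("formulae (v)/(vi) give a negative variance here")
    return {
        "p": p,
        "q": q,
        "n": n,
        "pe_n": PROBABLE_ERROR_FACTOR * math.sqrt(var_n),
        "pe_q": PROBABLE_ERROR_FACTOR * math.sqrt(var_q),
    }


if __name__ == "__main__":
    # Illustrative inputs (assumed): 400 observations, mean 0.6825,
    # variance 0.8117, i.e. a variance exceeding the mean.
    fit = fit_limited_binomial(mean=0.6825, variance=0.8117, n_obs=400)
    for key, value in fit.items():
        print(f"{key:5s} {value: .4f}")
```

With a variance greater than the mean the fitted $q$ and $n$ come out negative, and the question is then whether $q$ differs from a small positive quantity by more than a few times its probable error.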
(5) Application of the Law of Small Numbers to determine the Probable Errors of Small Frequencies.

Given a distribution of frequency for a population $\bar{N}$, let $\bar{n}_{st}$ be the frequency in the cell of the $s$th row and $t$th column of a contingency table (or if we drop $t$, $\bar{n}_s$ would stand for the frequency of any class). Then if we take a random sample of $N$ individuals from this population, the chance that an individual is taken out of the $\bar{n}_{st}$ cell is $\bar{n}_{st}/\bar{N}$, and that it is not is $1 - \bar{n}_{st}/\bar{N}$. Therefore if the original population be so large that the withdrawal of an individual does not affect the next draw, the frequency of individuals in $M$ random samples of $N$ will be given by the terms of the binomial:

$$M\left\{\left(1 - \frac{\bar{n}_{st}}{\bar{N}}\right) + \frac{\bar{n}_{st}}{\bar{N}}\right\}^{N}.$$

Now, if $\bar{n}_{st}/\bar{N}$ be very small, and $N$ large, this will approximate to the Poisson series:

$$Me^{-m}\left(1 + m + \frac{m^2}{2!} + \frac{m^3}{3!} + \ldots\right),$$

where $m = \dfrac{\bar{n}_{st}}{\bar{N}} \times N$. But $\bar{n}_{st}/\bar{N}$ will approximately be the mean proportion of the whole in the $st$ cell of the sample itself $= n_{st}/N$, or $m = n_{st}$.

Thus if in any cell of a contingency table, or in any sub-class of a frequency whatsoever, we have a frequency $n_{st}$ small as compared to the population $N$, then in sampling, this small frequency will have a distribution approximating to the Poisson Law, and tending as $n_{st}$ becomes larger to approach the Gaussian distribution*. It would appear, therefore, that the Poisson Law of Small Numbers should be applied in order to deal with the errors of random sampling in any small frequency, and an appeal should not be made, as is usually the case, to Sheppard's Tables on the assumption that the frequency is Gaussian.

* Such approach is usually assumed when we speak of $.67449\sqrt{n_{st}\left(1 - n_{st}/N\right)}$ as the probable error of the frequency $n_{st}$. But such a "probable error" has really no meaning if $n_{st}$ be very small and the exponential law be applied.
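The closeness of the binomial to the Poisson series for a small cell frequency can be checked directly. A minimal Python sketch, assuming a sample of 1000 with expected cell frequencies of 10 and 30; the function names are illustrative and not taken from the memoir.

```python
import math


def binomial_pmf(k, n, prob):
    """Chance of exactly k individuals in the cell in a sample of n."""
    return math.comb(n, k) * prob**k * (1.0 - prob) ** (n - k)


def poisson_pmf(k, m):
    """Term of the Poisson series e^-m * m^k / k!."""
    return math.exp(-m) * m**k / math.factorial(k)


def compare(cell_frequency, sample_size, counts):
    """Rows of (count, binomial, Poisson) for one expected cell frequency."""
    prob = cell_frequency / sample_size
    return [
        (k, binomial_pmf(k, sample_size, prob), poisson_pmf(k, cell_frequency))
        for k in counts
    ]


if __name__ == "__main__":
    for frequency, counts in ((10, (0, 1, 2)), (30, (19, 20, 21))):
        print(f"{frequency} in 1000")
        for k, b, p in compare(frequency, 1000, counts):
            print(f"  {k:2d}  binomial {b:.5f}  Poisson {p:.5f}")
```

The two columns differ only in the fifth decimal place for the lowest counts, which is the sense in which the Poisson series stands in for the binomial here.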
The following Table I illustrates the results obtained (a) from the Binomial, (b) from the Poisson-Exponential and (c) from the normal curve on the two hypotheses that ($\alpha$) the frequency is 10 in the 1000 and ($\beta$) is 30 in the 1000. But here a word must be said as to which Gaussian is to be compared with the Binomial or the Poisson-Exponential. The usual method of fitting a Gaussian is to give it the same mean and standard-deviation as the material to which we are fitting it. For example, we should compare the Poisson exponential with a Gaussian at mean $m$ and with standard-deviation $\sqrt{m}$, or the point binomial with mean $nq$

TABLE I. Comparison of Binomial, Poisson-Exponential and Gaussian for cell-frequency variations in samples for case of 10 and 30 in a total population of 1000.

Percentage Frequency

Frequency | Binomial (10 in 1000) | Poisson-Exponential (10 in 1000) | Gaussian (10 in 1000) | Frequency | Binomial (30 in 1000) | Poisson-Exponential (30 in 1000) | Gaussian (30 in 1000)
0 | .00004 | .00005 | .00132 | 19 | .00848 | .00894 | .01100
1 | .00044 | .00045 | .00327 | 20 | .01287 | .01341 | .01553
2 | .00220 | .00227 | .007385 | 21 | .01857 | .01916 |

[Plate VII, Biometrika, Vol. X, Part I: histograms of frequencies (total 1000), with Gaussian curves.]

than if we apply the ordinary process of mean $nq$, standard deviation $\sqrt{npq}$, and Sheppard's table for areas to the frequencies. It will be noted that this amounts to using Sheppard's correction on the crude second-moment and slightly shifting the central ordinate towards the side of greater frequency. This is the Gaussian curve used in Table I.
The object of the present section of our work is to indicate how far it is legitimate to use the Poisson-Exponential up to cell frequencies of the order 30 in a population of about 1000* and how far we then reach a state of affairs, which for practical purposes may be described by ordinary tables of the Gaussian. It will be seen from Table I that the Poisson-Exponential even for $n_{st} = 10$ and 30 is not extremely divergent from the Binomial. In Plate VII the transition of the exponential histograms of frequency towards the Gaussian form is indicated for cell-frequency = 1, 5, 10, 15, 20, 25 and 30; in the cases of 10 and 30 the corresponding Gaussian curves are drawn. It will be seen that with due caution the Poisson-Exponential may be reasonably used up to frequencies of about 30 in the 1000, and that after that it would be fairly satisfactory to use the areas of the Gaussian curve as provided in the usual tables.

* Of course in the Poisson-Exponential itself the total frequency plays no part; it is only useful in testing the validity of the approximation.

(6) In order to table the results of the Poisson-Exponential for easy use, it seemed desirable to turn them into percentages of excess and defect. For example take the distribution for a frequency 5. It is:

Frequency | Probability | Per cent. of cases in which: | Per cent.
0 | .006737945 | a defect of 5 occurs | 0.674
1 | .033689725 | a defect of 4 or more occurs | 4.043
2 | .084224310 | a defect of 3 or more occurs | 12.465
3 | .140373850 | a defect of 2 or more occurs | 26.503
4 | .175467310 | a defect of 1 or more occurs | 44.049
5 | .175467310 | the true value occurs | 17.547
6 | .146222755 | an excess of 1 or more occurs | 38.404
7 | .104444825 | an excess of 2 or more occurs | 23.782
8 | .065278015 | an excess of 3 or more occurs | 13.337
9 | .036265564 | an excess of 4 or more occurs | 6.809
10 | .018132782 | an excess of 5 or more occurs | 3.183
11 | .008242178 | an excess of 6 or more occurs | 1.370
12 | .003434238 | an excess of 7 or more occurs | 0.545
13 | .001320860 | an excess of 8 or more occurs | 0.202
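The conversion of the Poisson terms into percentages of excess and defect, as done above for a frequency of 5, can be sketched as follows. This is a minimal Python sketch; the function names and the returned layout are assumptions made for illustration only.

```python
import math


def poisson_pmf(k, m):
    """Term of the Poisson series e^-m * m^k / k!."""
    return math.exp(-m) * m**k / math.factorial(k)


def excess_defect_table(m, max_excess=8):
    """Percentages of defect, of the true value, and of excess.

    m is an integer cell frequency. Returns (defect, actual, excess):
    defect[x] is the per cent. of cases with a defect of x or more,
    i.e. a count of m - x or fewer; excess[x] is the per cent. of cases
    with an excess of x or more, i.e. a count of m + x or more.
    """
    pmf = [poisson_pmf(k, m) for k in range(m + 1)]
    defect = {x: 100.0 * sum(pmf[: m - x + 1]) for x in range(1, m + 1)}
    actual = 100.0 * pmf[m]
    tail = 1.0 - sum(pmf)  # chance of m + 1 or more
    excess = {}
    for x in range(1, max_excess + 1):
        excess[x] = 100.0 * tail
        tail -= poisson_pmf(m + x, m)
    return defect, actual, excess


if __name__ == "__main__":
    defect, actual, excess = excess_defect_table(5)
    for x in sorted(defect, reverse=True):
        print(f"defect of {x} or more: {defect[x]:7.3f}")
    print(f"the true value:       {actual:7.3f}")
    for x in sorted(excess):
        print(f"excess of {x} or more: {excess[x]:7.3f}")
```

The same function applied to each integer frequency in turn gives one column of a table of this kind per frequency.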
Thus we see that if the true value of the frequency be 5 for the average sample, it will only lie outside the range 1 to 10 in .674 + 1.370 = 2.044 cases per cent., or the odds are 49 to 1 that the value found will be from 1 to 10. On the other hand it will lie outside the range 2 to 8 in 4.043 + 6.809 = 10.852 per cent. of cases, or once in about 9 trials the frequency will lie outside this range. Or, again, once in about every four trials (25.8 per cent.) the result will fall outside the range 3 to 7.

On the other hand if we write $\sigma = \sqrt{5(1 - .005)} = 2.23047$, we have $-4.5$ and $+5.5$ as the deviations from a mean 5 of all below 0.5 and above 10.5, giving $x/\sigma = -2.0175$ and $+2.4658$ respectively. These cut off tail areas of .02181 and .00684, respectively. Thus in 2.865, not 2.044, per cent. of cases we should assert that the frequency would lie outside the range 1 to 10, or the odds that it would lie inside this range are now only about 34 to 1, not 49 to 1. Calculated from the Gaussian the frequencies outside ranges 2 to 8 and 3 to 7 correspond to 10.1 per cent. and 26.2 per cent. of the trials instead of 10.9 per cent. and 25.8 per cent. If we take for the standard-deviation of our Gaussian $\sqrt{npq - \frac{1}{12}} = 2.21171$, we find that the odds in the first case are still only 35 to 1, but the percentages in the other two cases are 11.3 and 25.8. It will be clear that near the centre of the curve, especially when we equalise the excess and defect of the Gaussian by taking equal ranges on both sides, it does not give bad percentages of frequency, but that it does not lend itself to the accurate determination of the range for reasonable working odds such as 50 to 1.
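The comparison just made between the Poisson-Exponential and the Gaussian for a true frequency of 5 can be retraced numerically. A minimal Python sketch, assuming a Gaussian centred at the mean and cut half a unit beyond each end of the range; the function names are illustrative and the normal tail is taken from the complementary error function rather than from Sheppard's Tables.

```python
import math


def poisson_pmf(k, m):
    """Term of the Poisson series e^-m * m^k / k!."""
    return math.exp(-m) * m**k / math.factorial(k)


def normal_upper_tail(z):
    """Area of the unit Gaussian lying beyond the deviate z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def percent_outside(m, low, high, sd):
    """Per cent. of samples falling outside low..high for true frequency m.

    Returns (Poisson value, Gaussian value). The Gaussian has mean m and
    standard deviation sd, and is cut at low - 0.5 and high + 0.5.
    """
    inside = sum(poisson_pmf(k, m) for k in range(low, high + 1))
    poisson = 100.0 * (1.0 - inside)
    gaussian = 100.0 * (
        normal_upper_tail((m - (low - 0.5)) / sd)
        + normal_upper_tail(((high + 0.5) - m) / sd)
    )
    return poisson, gaussian


if __name__ == "__main__":
    npq = 5 * (1 - 0.005)  # frequency 5 in a sample of 1000
    for label, sd in (("crude", math.sqrt(npq)),
                      ("Sheppard", math.sqrt(npq - 1.0 / 12.0))):
        for low, high in ((1, 10), (2, 8), (3, 7)):
            poisson, gaussian = percent_outside(5, low, high, sd)
            print(f"{label:8s} {low}-{high}: Poisson {poisson:6.3f}  "
                  f"Gaussian {gaussian:6.3f}")
```

The disagreement is greatest for the widest range, where the skewness of the Poisson series matters most, which is the point made above about working odds of the order 50 to 1.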
It will be noted that the total area in excess and defect of 2 and more = 23.782 + 26.503 = 50.285, or corresponds very nearly to the "probable error." Actually the Gaussians with standard deviations of 2.23047 and 2.21171 give probable errors of 1.504 and 1.492 respectively, so that the Gaussian with 1.5 as the probable error is very nearly accurate.

Table II gives the Poisson-Exponential; it will enable the reader to appreciate the range of probable variation in small frequencies. Thus we realise that in 37 per cent. of cases in which the true frequency is 1, the cell will be found empty; in 13.5 per cent. of cases it will be empty when the actual frequency is 2, and in 5 per cent. of cases when the frequency is 3 and in 1.8 per cent. when the frequency is 4. These results indicate how rash it is to assume that a sample 4-fold table with one zero quadrant signifies perfect dependence or association in the attributes of the material sampled. The second line below gives the percentages of cases that 0 would appear in a cell when the actual number to be expected is that in the first line, calculated from Table II on the usual theory of a priori probabilities:

Actual | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 & over
Percentage | 63.21 | 23.25 | 8.55 | 3.15 | 1.16 | 0.43 | 0.16 | 0.06 | 0.02 | 0.01

TABLE II. Table of Poisson-Exponential for Cell Frequencies 1 to 30.

Per cent. occurrence of values differing by $x$ or more in defect (rows above "Actual") and by $x$ or more in excess (rows below "Actual").

$x$ | Cell Frequencies: 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10
22 | - | - | - | - | - | - | - | - | - | -
21 | - | - | - | - | - | - | - | - | - | -
20 | - | - | - | - | - | - | - | - | - | -
19 | - | - | - | - | - | - | - | - | - | -
18 | - | - | - | - | - | - | - | - | - | -
17 | - | - | - | - | - | - | - | - | - | -
16 | - | - | - | - | - | - | - | - | - | -
15 | - | - | - | - | - | - | - | - | - | -
14 | - | - | - | - | - | - | - | - | - | -
13 | - | - | - | - | - | - | - | - | - | -
12 | oo ii | ‘6 10 005 i 9 ae 012 ‘050 ce 8 034 123 yi | 5 if 091 302 623 | 1:033 S 6 248 7730) |) e tea7d |) 222123) |e 25925 5 ——=——| -674 | 1°735 -| 2:964 | 4:238 | 5-496 | 6:708 | 4 1°832 | 4:043 | 6-197 | 8-177 | 9-963 | 11°569 | 13-014 3B ———_| 4:979 | 9:158 | 12°465 | 15:120 | 17-299 | 19-124 | 20-678 | 22-022 | 2 |__| 13-534 | 19:915 | 23-810 | 26-503 | 28°506 | 30-071 | 31°337 | 32-390 | 33-282 | 1 | 36-788 | 40-601 | 42°319 | 43°347 | 44-049 | 44:568 | 44°971 | 45-296 | 45°565 | 45°793 Actual] 36°788 | 27-067 | 22°404 | 19°537 | 17-547 | 16-062 | 14°900 | 13-959 | 13°176 | 12°511 1 | 26:424 | 32°332 | 35-277 | 37:116 | 38-404 | 39°370 | 40-129 | 40-745 | 41-259 | 41°696 2 8030 | 14-288 | 18°474 | 21-487 | 23-782 | 25-602 | 27°091 | 28:°338 | 29-401 | 30°323 B 1°899 | 5-265 | 8:392 | 11-067 | 13°337 | 15-276 | 16-950 | 18°411 | 19°699 | 20°845 4 366 | 1°656 | 3°351 | 5113 | 6:809 | 8-392 | 9852 | 11°192 | 12-422 | 13-554 5 ‘059 453 | 1191 2°136 | 3:183 | 4:262 | 5:335 | 6°380 | 7°385 | 8°346 6 ‘008 ‘110 “380 813 | 1°370 | 2:009 | 2°700 | 3°418 | 4:146 | 4:875 i 001 024 ‘110 284 +545 883 | 1:281 1726 | 2°403 | 2°705 8 000, -005 029 092 | 202 363 “572 ‘823 | 1:110 | 1°428 9 =e ‘001 ‘007 027 ‘070 140 ‘Q4] 372 | 532 ‘719 10 = 000 002 008 | 023 ‘051 096 159 | 242 346 11 is = “O00 “002 ‘007 ‘O18 ‘036 ‘065 “105 “160 a | 12 as = : ‘001 002 006 013 025 | 044 | ‘071 E 13 — ; = _ “000,001 002 ‘005 “009 ‘O17 “030 Se eR hs = i000 ‘001 002 | 003.) 007 ‘O13 15 == weet) t= See S25 ‘000 ‘001 ‘001 | 002 ‘006 EB | 16 = ae a = 000 ‘000 ‘001 002 eel 17% = = — a = = - “000 001 18 _ | = = ‘001 19 = — — —_— | — — -—— a — “O00 20 = a = 224 (ies = = = = — 21 = = aa == hie = = a a —_ 22 = = -_ = = = = = = = 23 = = _ = = _ = == = — 2 imate ee os — = os ee ee le = _ Ne = —_ = ao Se dig ee See = 26 — = — — = a= = = = oe at | — —— == = = 46 On the Poisson Law of Small Numbers TABLE I1—(continued). 
Cell Frequencies av 11 12 | 18 Uy 15 16 iif 18 19 20 | | Ly ed 22 | hes 21 = 20 a a 19 sik eS =p 18 — a | 17 = — ‘000 “000 es | 16 | | — | »:000'| 000) | COM mmoue ea | 15 | | 000 000 | ‘001 002 | 004 | ‘007 Be | Ly ‘000 | -001 002 | 004 | ‘008 | ‘015 | -026 eB ole ite ‘000 | 001 004 | 009 | 018 | 032] -052 | -078 eed He 001 003 | 009 | 021 040 | -067 | ‘104 | -151 209 8 || any 002 008 | 022 | -047.| -086 | +138 | -206 | ~-289"\") -3Sia/Nae500 es | 10 020 052 105 }_ 181 279 | 401 543 | -706 | 886 | 1-081 a | 129 121 229 | 374 | 553 | -763 | 1:000 | 1:60 | 17538 | 1-832 | 2-139 peril 8 ‘492 | -760 | 1-073 | 1:423 | 1:800 | 2-199 | 2-612 | 3-037 | 3-467 | 3-901 -£] 7 | 1510 | 2-034 | 2589 | 3-162 | 3-745 | 4-330] 4-912 | 5-489 | 6-056 | 6-613 ae 5 | 3-752 | 4582 | 5-403 | 6-206 | 6:985 | 7:740 | 8-467 | 9°167 | 9-840 | 10-486 is 5 | 7861 | 8-950 | 9-976 | 10-940 | 11-846 | 12-699 | 13°502 | 14-260 | 14-975 | 15-651 8 4 | 14319 | 15-503 | 16°581 | 17°568 | 18-475 | 19-312 | 20-087 | 20-808 | 21-479 | 22-107 2 3 | 23-198 | 24-239 | 25-168 | 26-004 | 26-761 | 27-451 | 28-084 | 28°665 | 29-203 | 29-703 2 2 | 34-051 | 34-723 | 35°317 | 35-846 | 36-322 | 36-753 | 37-146 | 37-505 | 37-836 | 38-142 1 | 45-989 | 46-150 | 46°31] | 46-445 | 46°565 | 46-674 | 46°774 | 46-865 | 46-948 | 47-026 | | |Actual, 11-938 | 11-437 | 10-994 | 10°599 | 10-244 | 9-922 | 9-629 | 9°360 | 9-112 | 8:884 “A 1 | 42-073 | 42-404 | 42-695 | 42-956 | 43-191 | 43-404 | 43-597 | 43-776 | 43-939 | 44-091 8 2 | 31:130 | 31-846 | 32:486 | 33-064 | 33-588 | 34-066 | 34°503 | 34-909 | 35-283 | 35-630 z 3 | 21-871 | 22-798 | 23°639 | 24-408 | 25-114 | 25-765 | 26-367 | 26-928 | 27-451 | 27-939 = 4 | 14596 15-559 | 16-450 | 17-280 | 18-053 | 18-776 | 19-451 | 20-088 | 20-686 | 21-251 2 5 | 9-261 | 10°129 | 10-953 | 11°736 | 12°478 | 13-184 | 13-852 | 14-491 | 15-099 | 15-677 = 6 | 5593 | 6-297 | 6-983 | 7°650 | 8-297 | 8-923 | 9-526 | 10-111 | 10-675 | 11-219 s | v7 | 3219 | 3742 | 4266 | 4-791 | 5-311 | 
5-825 | 6-399 | 6-826 | 7:313 | 7-789 |= | 8 | 1769 | 2-198 | 2-501 | 2°884 | 3-275 | 3-669 | 4-064 | 4-461 | 4-856 | 5-248 © 929 | 1160 | 1:407 | 1-671 ; 1°947 | 9-932 | 9-593 | 9:824 || 3107 \\iardae | > 10 467 | -607 762 | . 933° | 1117 | 1-312 | 1-516 | 1732" |S kobaeeoaiep foes ana) Ble 225 305 | 396 | 502 | 619 | -746 | -882'| 1-030 | 1:185 | 1:348 aa | 12 104 | +148 | = -201 261 331 ‘All ‘497 | +595 | -699 | - -809 58 | 13 047 | 069 | 097 | -131 172 | -219 | -272 | 333 | 400 | “478 Seca, 020} ‘031 046 | -063-| 086 |" 1140) “14d | 1625) eos mmmoes ea | Is 008 ‘O14 ‘O21 ‘030 042 ‘057 074 ‘096 121 149 295 | 16 003 | -006 | 009 | -013 | 020 | 028 | 036 | 050 | 064 | -081 cali | ll 001 002 | 004 | 006 | ‘009 | 014] 017 025 | 033 | 042 i 18 001 000 | 002 | -002 | 004 | 006 | 008 | ‘O12 | “O17, |) =e-0z2 2 19 000 | 000-001 001 002 | 003 | 003 | -006 | 008] O11 8 20 ae ae 001 ‘000 | -001 ‘002 | 002 | 003 | -004 | -005 5 Of |, was = 000 | 000 | -000 | 001 001 | 002 | -002 | -003 E Bea = 000 | 000-001 001 | ‘001 So eee dine = = = ze = an = 000 | 000 | 001 ° 24 = = = = = = = = = 000 = 25 at em | ees = = = = = = == 8 26 = = ke i oi R a7 _ — — — — — — aw 28 Es at, ae ee Sn (ure — = = = Lucy WHITAKER TABLE Il—(continued). 
Cell Frequencies 4 = v 21 22 23 24 25 26 or 28 29 30 Al , ° Op | ae es ee = = = = = = = ‘000 5 21 = ral -_ oe = =e ‘000 000 ‘000 ‘001 8 20 —_ 2 — — 000 000 ‘O01 001 ‘001 “002 | 19 ae aa -000 -000 ‘001 ‘001 002 | = -008 004 006 2p 18 -000 000 001 001 002 004 006-009 012 ‘O17 |-Ba | 17 ‘001 002 003 ‘005 008 ‘O11 016 | 023 ‘O31 ‘O41 some 16 003 ‘006 ‘010 O15 022 ‘031 043 | 056 073 092 | alee eee 012 020 -030 043 O59 ‘078 “102 “129 160 195 An Mew 039 058 ‘081 “109 142 “180 224 273 328 387 | ES 1 ‘ll “150 ‘198 252 314 384 “460 543 632 727 | Se | 19 277 355 “443 540 ‘647 762 884 | 1:014 1-151 1-293 | pon lett “625 763 912 | 1:072 | 1°240 | 1-417 | 1-601 | 1°791 1987 | 2-187 een | 10 1-290 | 1°512 | 1°743 | 1:983 | 2:229 | 2-482 | 2-739 | 3:000 | 3-263 | 3:528 ze 9 2-455 | 2°778 | 3°107 | 3:440 | 3°775 | 4111 | 4:446 | 4°781 | 5-114 | 5-444 ee 8 4-336 | 4°769 | 5-200 | 5-626 | 6:048 | 6-463 | 6°872 | 7-274 | 7-669 | 8:057 zg Hi 7°157 | 7°689 | 8-208 | 8-713 | 9-204 | 9-682 | 10-147 | 10-599 | 11-038 | 11°465 38 6 | 11°107 | 11:704 | 12-277 | 12:827 | 13°358 | 13°867 | 14°357 | 14°830 | 15-285 | 15°724 aa 5 | 16292 | 16-900 | 17-477 | 18-025 | 18°549 19-048 | 19°525 | 19°981 | 20-417 | 20-836 = 4 | 22°696 | 23-250 | 23°771 | 24:263 | 24°730 | 25-172 | 25°591 | 25-990 | 26-371 | 26-734 © 3 | 30-168 | 30:603 | 31:010 | 31°391 | 31°753 | 32-094 | 32-416 | 32-721 | 33-011 | 33-287 2 2 | 38-426 | 38-691 | 38:938 | 39°168 | 39-387 | 39°593 | 39°786 | 39:970 | 40143 | 40-308 1 | 47-097 | 47:164 | 47-226 | 47-283 | 47-340 | 47-392 | 47-440 | 47-486 | 47-530 | 47-572 Actual] 8°671 | 8:473 | 8-288 | 8-115 | 7:°952 | 7:799 | 7°654 | 7-517 7°387 | 7:264 2 1 | 44-232 | 44:363 | 44-485 | 44-603 | 44°708 | 44°810 | 44:906 | 44:997 -| 45-083 | 45°165 8 2 | 35-955 | 36-258 | 36-542 | 36-812 | 37-062 | 37-299 | 37-525 | 37-739 | 37-942 | 38-135 2 3 | 28-397 | 28:828 | 29-235 | 29-620 | 29-982 | 30:326 | 30°653 | 30-965 | 31-262 | 31-546 5 4 | 21-785 | 22-290 | 22-770 | 23-227 | 23-660 | 
24-074 | 24:469 | 24°847 | 25-208 | 25-555 a 5 | 16-230 | 16°758 | 17-264 | 17-748 | 18-211 | 18-655 | 19-083 | 19-493 | 19-888 | 20-269 # 6 | 11-744 | 12-251 | 12-740 | 13-213 | 13-669 | 14:110 | 14°538 | 14°951 | 15-351 | 15°738 a ” 8254 | 8709 | 9°153 | 9:585 | 10-007 | 10-418 | 10°819 | 11°210 | 11-591 11-962 | B 8 5-637 | 6:022 | 6-402 | 6-777 | 7:146 | 7:509 | 7:866 | 8-218 | 8°562 | 8-901 & 9 3°742 | 4:052 | 4:362 | 4:670 | 4:978 | 5-284 | 5°588 | 5-890 | 6-188 | 6-484 | ES 10 2-415 | 2-654 | 2°895 | 3:188 | 3°385 | 3-632 | 3°880 | 4:129 | 4:377. 4°625 a 11 1517 | 1:692 | 1°873 | 2-057 | 2-246 | 2-438 | 2-633 | 2°831 | 3-030 | 3-230 | tai | 812 927 | 1°051 | 1°18] 1315 | 1°456 | 1:599 | 1:°747 | 1:899 | 2-053 | 2-210 isso 13 “552 637 727 “821 92] 1025 | 1°134 | 1:247 | 1°362 | 1:481 HS 14 320 376 437 “500 570 643 720 ‘801 885 | 973 4 15 ‘181 217 256 298 “345 394 ‘448 “504 564 626 2 | 16 ‘100 -122 147 ‘173 204. 237 272 311 352 395 ‘ee | 17 054 -067 082 ‘098 ‘118 139 162 “188 ‘215 | +245 ig 18 -028 036 ‘045 ‘055 ‘067 ‘080 095 ‘111 129 | +149 ° 19 O15 ‘019 024 -030 ‘037 ‘045 054. ‘065 076 | 089 S) 20 007 010 013 016 ‘020 ‘025 ‘03 ‘037 044 052 B 21 004 -005 ‘007 009 ‘Oll | 014 ‘O17 021 025 ‘030 = 22 002 ‘002 004 ‘005 006-007 009 ‘O11 O14 ‘O17 8 23 ‘001 ‘001 002 ‘003 003-004 ‘005 006 ‘007 ‘010 S Qh 001 001 ‘001 002 002 002 003 003 004 006 = 25 ‘000 000 001 ‘001 ‘001 001 ‘001 002 002 003 8 26 =s ee -000 ‘000 000 ‘001 001 ‘001 ‘O01 002 — x or es = : ‘000 -000 000 -000 ‘001 a 28 Bs = = -_ _ : | “O00 48 On the Poisson Law of Small Numbers PART II. CRITICISMS OF PREVIOUS APPLICATIONS OF POISSON’S LAW OF SMALL NUMBERS. (7) We now turn to the illustrations which various authors have given of the Law of Small Numbers. “Student's” Cases. We take first the series given by “Student” in his memoir on counting with a Haemacytometer*. 
They are of special importance because the series at first appear of fairly adequate size, namely consisting of 400 individuals, and further we should anticipate that the Law of Small Numbers would hold in his cases. He obtains better fits with the binomial than with the exponential but, as he remarks, he has one more constant at his disposal. On the other hand, if the exponential be a true approximation, the binomial ought to come out with a large n and a small but positive q. "Student" finds for his four series:

I. $400 \times (1.1893 - .1893)^{-3.6054}$,
II. $400 \times (.97051 + .02949)^{46.2084}$,
III. $400 \times (1.0889 - .0889)^{-20.2473}$,
IV. $400 \times (.9525 + .0475)^{98.5263}$.

II. and IV. may, perhaps, be held fairly to satisfy the conditions, although it is not certain if 46 is to be considered a large n or .05 a very small q. I. and III. fail to satisfy the conditions at all, unless the probable errors of q and n are such that q might really be a small positive quantity and n really large and positive. The following are the values for the four series of n and q and their probable errors:

I. $q = -.1893 \pm .0647$, $n = -3.6054 \pm 1.2209$.
II. $q = +.0295 \pm .0457$, $n = 46.2084 \pm 71.7373$.
III. $q = -.0889 \pm .0534$, $n = -20.2473 \pm 12.1165$.
IV. $q = +.0475 \pm .0452$, $n = 98.5263 \pm 93.7494$.

Now while these results are very satisfactory for II. and IV., they are not wholly conclusive for I. and III. We can approach the matter from another standpoint; the probable error of q for p = 1 is $.67449\sqrt{2/N} = .67449 \times .0707$ in "Student's" cases. Thus the deviation of q from q a very small quantity is for I. 2.68 times the S.D., and for III. 1.26 times the S.D. Since q may be either positive or negative, we may reasonably apply the probability tables, and the odds against deviations occurring as great as these are in one trial about 250 to 1 and 9 to 1 respectively. Hence in four trials we should still have large odds against their combined appearance.

* Biometrika, Vol. v. p. 356.

We have said that the results for II. and IV.
are fairly satisfactory, i.e. we mean that they are consistent with q being small and positive and n being large; but of course they are also consistent with q being negative and n being small and negative. It will be obvious from these results for "Student's" data that it is extremely difficult to test the legitimacy of the hypothesis on which the "Law of Small Numbers" is based. In none of the cases dealt with by Bortkewitsch, much less in those dealt with by Mortara, are the populations (N) anything like as extensive as those considered by "Student." But populations of even 400 give, as we see, too large values of the probable errors of q and n for us to be certain of our conclusions.

(8) Bortkewitsch's Cases. Taking Bortkewitsch next, he deals with the following cases:

I. Suicides of Children in Prussia for 25 years: (a) Boys, (b) Girls, 25 cases.
II. Suicides of Women in eight German States for 14 years: 112 cases, or 8 subseries of 14.
III. Accidental Deaths in 11 Trade Societies in 9 years: 99 cases, or 11 subseries of 9.
IV. Deaths from the Kick of a Horse in 14 Prussian Army Corps for 20 years: 280, or, as Bortkewitsch, 200 cases.

It will be noted at once that Bortkewitsch's populations (N) are far too small for any effective determination of the legitimacy of his application of Poisson's formula to his data. We take his cases in order:

I. (a) Suicides of Boys.

TABLE III.

Number of Suicides | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 and over
Number of Years    | 4 | 8 | 5 | 3 | 4 | 0 | 1 | 0

The binomial is: $25\,(1.2033 - .2033)^{-9.6425}$, Mean = 1.9600 and $\mu_2 = 2.3584$. We have $q = -.2033 \pm .2421$, $n = -9.6425 \pm 10.9416$. If q were really zero its probable error would be $\pm .1908$. Clearly 25 cases are wholly inadequate to test the legitimacy of applying the Poisson-Exponential to the frequency*. But to what extent is the reader made conscious by Bortkewitsch that his cases fail entirely to demonstrate the legitimacy of applying his hypotheses?
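The constants quoted for the boys' series follow from the first two moments, and the arithmetic can be retraced with a short sketch. What follows is only an illustration in Python, not part of the memoir: it assumes that the frequencies of Table III read 4, 8, 5, 3, 4, 0, 1 for 0 to 6 suicides, and the name `moment_fit` is hypothetical. It takes p = μ₂/mean, q = 1 − p, n = mean/q, and the probable error of q for q = 0 as .67449 √(2/N).

```python
import math

# Assumed reading of Table III: number of years with 0, 1, ..., 6 suicides of boys.
boys = {0: 4, 1: 8, 2: 5, 3: 3, 4: 4, 5: 0, 6: 1}


def moment_fit(freq):
    """Fit N(p + q)^n by the mean and second moment (hypothetical helper)."""
    total = sum(freq.values())                      # N, the number of years
    mean = sum(k * f for k, f in freq.items()) / total
    mu2 = sum(f * (k - mean) ** 2 for k, f in freq.items()) / total
    p = mu2 / mean                                  # since mean = nq and mu2 = npq
    q = 1.0 - p                                     # negative when mu2 exceeds the mean
    n = mean / q
    pe_q_zero = 0.67449 * math.sqrt(2.0 / total)    # probable error of q if q were zero
    return {"N": total, "mean": mean, "mu2": mu2, "q": q, "n": n, "pe_q_zero": pe_q_zero}


fit = moment_fit(boys)
for name, value in fit.items():
    print(f"{name:10s} {value:.4f}")
```

On this reading the second moment exceeds the mean, so q and n are both negative, and q is about the size of the probable error it would have if it were really zero.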
* The $\chi^2$ for the binomial is 2.379 and for the exponential 2.836, showing a somewhat better result for the binomial.

I. (b) Suicides of Girls.

TABLE IV.

Number of Suicides | 0  | 1 | 2 | 3
Number of Years    | 15 | 9 | 1 | 0

The binomial is: $25\,(.7418 + .2582)^{1.7041}$, Mean = .4400 and $\mu_2 = .3264$. We find $q = .2582 \pm .1012$, $n = 1.7041 \pm .7850$. As in the case of the boys' suicides, if q were practically zero its probable error would be $\pm .1908$, and there is nothing in this result again to justify us in asserting that q is indefinitely small and n indefinitely large. Actually we have:

TABLE V. Number of Suicides per Year.

                 | 0    | 1   | 2   | 3
Actual           | 15   | 9   | 1   | —
Bortkewitsch     | 16.1 | 7.1 | 1.6 | .2
Binomial (a)     | 15.0 | 8.9 | 1.1 | —
Binomial (b)     | 15.2 | 8.7 | 1.1 | —

(a) is the binomial considered above, (b) is the binomial obtained by taking n a whole number = 2, and q = mean/2 = .22, i.e. $25\,(.78 + .22)^2$.

It is clear that either (a) or (b) gives better results than the Poisson-Exponential. Applying the test of goodness of fit, we have $\chi^2 = .007$ for the binomial (a), $\chi^2 = .610$ for Bortkewitsch's solution. Both give P > .60 but the first is much better than the second.

If both boys and girls are taken together, we find the binomial $25\,(.9333 + .0667)^{n}$. This is the nearest approach to a small q and big n we have so far found, i.e. the nearest approach so far to an exponential, but it is reached by a process, i.e. that of adding together two series of entirely different means and variabilities, in a manner which cannot be justified, for Bortkewitsch's hypothesis depends essentially on the homogeneity of his material. Even here the fit of the point binomial is slightly better than that of the exponential.

II. Suicides of Women in Eight German States.

Bortkewitsch gives the following table:

TABLE VI.
Number of Suicides of Women per Year State - - Totals Ore iee ee eel Sachse he Sl o.8| Fo | | | (a) Schaumbureg- tnppe: 4) 4) 2) 4)/—]--|]— | 14 (b) Waldeck... Mee huer dl 8 A he | 14 (ce) Liitbeck ee ves elle i geale cele a eee os 14 (d) Reuss a. L. . ce soni | oil | Sl soulmate 2a Ue Pe ms 14 _(e) Lippe PR Gy OU) BE ay a | | 14 (f) Schwarzburg- Rudolstadt ... | — ; 1}—}; 2)/—) 56]; 8) 2) 1)—)— 14 (g) Mecklenbure- Strelitz soe |e MBL) PE ee ea eT So eee eae 14 | (A) Schwarzburg-Sonderhausen SN a ed Sc he 14 Totals 112 The resulting binomials are : (a) 14( 9714 + 0286)", (b) 14( ‘8571 +°1429)9%, (c) 14( 5819 +4181), (d) 14 (1:0058 — -0058)-##24, (e) 14(1°3929 — -3929)-77, (f) 14( 6071 + :3929)3%, (g) 14(1°5792 — 5792)-91"7, (h) 14 (16609 — -6609)-3, Thus it will be seen that of the eight binomials only four have a positive q, and of these only one can be said to have a very smajJl g, and even in this case the n is not indefinitely large. Of the four negative binomials three have quite substantial q’s, and the fourth with its small negative qg corresponds most closely to the Poisson-Exponential. The probable error of g for g=0 is +:2549. The number, 14, of cases taken is therefore wholly inadequate to test whether the Poisson-Exponential may be applied to these data. The mean value of q is negative and = — ‘0820 + ‘0901, and the standard deviation of g=:3928 + -0637, which are within the limits of random sampling of g =0 with a standard deviation of 3779. We shall return to a different manner of considering the point later. At present we wish only to indicate that the hypothesis is that q is a very small positive quantity and that data which give ga standard deviation of ‘3928, or in the next example of 4714 are really inadequate to test such a hypothesis ; for in the resulting binomials g may easily lie anywhere between +°8 and —°8, and it is not possible to demonstrate that its real value is practically an exceeding small positive quantity. 
7—2 52 On the Poisson Law of Small Numbers III. Accidental Deaths in 11 Trade Societies. Bortkewitsch provides data from which the following table is deduced: TABLE VII. Accidental Deaths Index Number of Society Totals 1 ] | ; 0 BNO NG We | eae | 10 al 13 S Syl 9 1h ; 2 3 9 12 fq 3 9 20 =a a 9 23 ei 9 QF AG pas 9 29 Ne ie 9 Al Te hee 9 40 1 2 9 42 }— | — 9 55 — 2 9 Totals ... | 16 | 7 The resulting binomials are: (18) 9( 4914+ -5086)5°8, (14) 9( 61844 -3816)"%, (12) 9:(1:9227 = 9227), 27, (20) 9 (11282 — +1282)-s2"e00, (23) “9° 9921 2.0079) eenes (27) 9( 52294-4771), (29) 9 (14130 — -4130)72™, (41) 9( 8454 + +1546)9°66, (40)° 9 (2:0342 — 1:0842)-27™4, (42) 9( 9822+ -0678)72”, (55) 9( 6154+ °3846)n2, Of these eleven binomials seven have a positive g; only one of these (23) actually corresponds to a really small q and large n, although a second, (42), approximates to this condition. In the five other cases the q’s are quite sub- stantial; in (13) the q is larger than p. Of the four negative q’s none can be said to be so small and the » so large as to suggest that they really correspond to the Poisson-Exponential. The probable error of q for q=0 is, however, + ‘3180, and thus for such small series, no test whatever can be really reached of the legitimacy of applying the Poisson-Exponential to such data. We may note, indeed, that seven of the eleven values of g exceed the probable error and two of these are more than three times the probable error. We should only expect two negative values of ¢ as great or greater than ‘9227 in 80 trials, whereas two have occurred in 9 trials, Lucy WHITAKER 53 so that the odds are considerably against such an experience. g is — 0469 +0959 and the standard deviation of g is ‘5127 + ‘0678, both results compatible with g indefinitely small and a standard deviation = “4714. 
The main problem, however, of the legitimacy of applying the Poisson-Exponential to such series cannot be answered by data involving only total frequencies of 9 to 14 cases in the individual series. He clubs the results given for each application of the Poisson-Exponential together and examines the observed totals against the sums of the calculated totals. Thus calculating the 11 Poisson-Exponential series* and adding them together Bortkewitsch finds for observed and calculated deaths: TABLE VIII. Accidental Deaths in 11 Trade-Societies. Bortkewitsch examines the matter from another standpoint. The mean value of Number of Deaths Observed Frequencies Sums of 11 Exponentials Single Binomial .. | 3°8 | 95 0 10 | 11 | 22 oe O) Te 13 4: 16 7 3-7 15°2 | 14-3 | 12°3 | 9°8 | 20|12)o7| 0-7 13°9 | 15°6 14-8 | 124 9°6 138 & over} Totals If we attempt to fit a single binomial to the observed line of totals, we obtain: m= 43636, o2=7'°5849 leading to the negative binomial : O97 382 — 7802). 7s, g=— 1382, + 18297,- n=— 59111 + 1391, or the constants are significantly substantial with regard to their probable errors. The resulting frequencies are given in the last line of the table above. The reader Here: * The values of the means and standard deviations for the eleven societies are : m | o m o m o 13 7889 1:969 23 6°222 2-485 || 40 2°889 2°424 14 2-556 1:343 27 1'889 0-994 || 42 | 4-556 2-061 12 2556 | 2217 29 5889 2-885 || 55 4°333 1°633 20 4:333 | 2-211 41 5111 2079 || | All these means are less than 10, which is the limit reached by Bortkewitsch’s Tables for the Poisson- Exponential. Bortkewitsch says he has taken the societies for which ‘the statistics indicated the smallest numbers of such accidents.’’ This is not very clear. It is certain that a society with a mean number of accidents =100, if it consisted of 200,000 members, would be more suitable for application of the exponential, than one with a mean of 8 if it only contained 10,000 members. 
Both Bortkewitsch and Mortara confine their results to means less than 10, and seem to indicate that ‘‘ smallness” has been determined by the absolute frequencies, but clearly it is relative frequency with which we have to deal. The use of such a term as Das Gesetz der kleinen Zahlen for the Poisson-Exponential seems open to serious objection, if it be associated with ‘‘m” an absolutely small number, and not with smallness of ‘¢q.” + For q=0, the probable error would be +°0959 and accordingly q is very divergent from the Poisson-Exponential value of zero. 54 On the Poisson Law of Small Numbers will be surprised to see how closely the single negative binomial determined by two constants gives the same result as the sum of the eleven Poisson-Exponentials determined by eleven constants, no one of which is really of any significance for its own exponential*. If we apply the condition for “goodness of fit,” »?= 5°83 for the single binomial and y?= 5°88 for the sum of the eleven Poisson exponentials, leading to P='950 and P= ‘951 respectively, or the fit with a-single negative binomial is slightly better than that with eleven exponentials. The two constants are significant, the eleven constants have no real significance for their individual series, as is demonstrated by the fact that the binomials for these series do not approximate to the Poisson-Exponential type. We may now consider the previous case of suicides of women from the same standpointt. The following are the data as given by Bortkewitsch : TABLE IX. Suicides of Women in Hight German States. | Number of Suicides 0 | 1 | 2 | Beall oA | 8 | 9 | 10 & over Totals Or fo) Se | Observed Frequencies 9? S198 O05 115 Uae ee eee ate 3 112 Sum of 8 Exponentials | 8°0 | 16°9 | 20°3) 18°7 | 15-1 | 11°4 | 8°3 | 5°6 | 3°6 | 21 | 2°0 112 | | | Single Binomial ... | 12°6 | 18:4 | 18°8 | 16°4 | 13-2 | 9°9 | 7:2 | 5-1 | 35 | 24 | 4°5 112 i | For the single binomial we have : m = 3'°4732, o7? 
=8:2312, leading to: 112 (2:3699 — 1°3699)- 25354, where q=— 13699 +°1490, n= — 25354 + 8076. If q were very small its probable error would be +0901. The values of g and n are quite significant, g is large and negative and n is small and negative. The resulting frequencies are given in the last line of the table as “Single Binomial.” Turning now to the test of “goodness of fit,” we have for the sum of the 8 ex- ponentials y?= 7:957, and for the single binomial y?= 7°740, leading to P= 633 * If the reader will turn to the first footnote on p. 53 he will note that for nine cases, the standard deviations of the means (o//9) are roughly about -7 or errors of +1 to +1:5 may easily occur in the means. Hence with the possible exception of (13) and (27) the m’s have not significant differences, and are not typical of the individual societies. + The values of the means and standard deviations are: : | m o | m o | Schaumburg-Lippe | 1:°429 1-178 Lippe Are ag on 2°857 1-995 Waldeck eligi) poco 1:378 | Schwarzburg-Rudolstadt ... 5143 1:767 Liibeck ... ais 2-571 1:223 | Mecklenburg-Strelitz or 5°286 2°889 Reuss a. L. we =| 2648 1:631 Schwarzburg-Sonderhausen | 5°642 3-061 The standard deviation of the mean is here o/V14, or, say, 5. Thus errors of 1 might easily occur in the values of m. There are probably significant differences between the first five and the last three states, but not between the first five among themselves or the last three among themselves. Thus the Poisson-Exponentials, if correct in theory, are not significant for the individual states, Lucy WHITAKER 55 and ‘654 respectively. Thus again the single binomial with only two constants give a fit slightly better, than the sum of eight exponentials with eight constants. Bortkewitsch looking at the observed frequencies and the sum of 8 or 11 exponentials—without using any satisfactory test for “goodness of fit ”—assumes that the coincidence is so good as to justify his hypothesis. 
But a better fit can be obtained with two instead of 8 or 11 constants by simply using a negative binomial. We must note here that Bortkewitsch is using the final coincidence merely as justification of the Poisson-Exponential; the total frequency is not describable in terms of the 8 or 11 constants as it is in terms of the two, for these 8 or 11 constants are not really significant for his individual eleven trade societies or for the suicides in the individual eight states. If he wants to describe the total, he has no constants by which he can do it. If, on the other hand, he wishes to describe what has occurred in the individual societies or states, we have seen that their binomials differ very widely from Poisson-Exponentials. If, lastly, no stress be laid on the individual cases as having too large probable errors, but only on the general coincidence with total frequencies, then the same coincidence would justify us in using a single binomial with two constants only*. It appears to us that to properly test the Poisson-Exponential, we need not 9 or 14 instances in the individual case, but several hundred instances (more, indeed, than "Student" has taken), and that no proof of the "Law of Small Numbers" can be obtained on data such as those of Bortkewitsch or Mortara.

IV. Deaths from the Kick of a Horse in Prussian Army Corps, omitting four Corps with Bortkewitsch.

Here the results are:

TABLE X.

Number of Deaths | 0   | 1  | 2  | 3 | 4 | Totals
Number of Corps  | 109 | 65 | 22 | 3 | 1 | 200

Whence: m = .61, $\mu_2 = .6079$ and the binomial is: $200\,(.996{,}557 + .003{,}443)^{177.1711}$. This is the first of Bortkewitsch's illustrations for which his hypothesis that q is small and n large is really justified by his data. For: $q = .0034 \pm .0670$, $n = 177.1711 \pm 3449.108$. The probable error of q for q really zero is $\pm .0674$.
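The constants just given, and the exponential fit to the same table, can be retraced numerically. The sketch below is an illustration in Python and not part of the memoir; the variable names are arbitrary. It assumes that the last class of Table X stands for "4 and over" and gives it the whole remainder of the exponential series, so the resulting χ² need not agree to the last figure with a value computed from a series cut off earlier.

```python
import math

# Table X: number of corps-years with 0, 1, 2, 3 and 4 (taken as "4 and over") deaths.
observed = [109, 65, 22, 3, 1]

N = sum(observed)
m = sum(k * f for k, f in enumerate(observed)) / N              # mean
mu2 = sum(f * (k - m) ** 2 for k, f in enumerate(observed)) / N

# Binomial constants from the first two moments: mean = nq, mu2 = npq.
q = 1.0 - mu2 / m
n = m / q
pe_q_zero = 0.67449 * math.sqrt(2.0 / N)    # probable error of q if q were really zero

# Poisson-Exponential frequencies N * e^(-m) * m^k / k! for k = 0..3,
# with the remainder of the series assigned to the last class.
expected = [N * math.exp(-m) * m ** k / math.factorial(k) for k in range(4)]
expected.append(N - sum(expected))

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print(f"m = {m:.4f}, mu2 = {mu2:.4f}, q = {q:.4f}, n = {n:.2f}")
print("expected:", [round(e, 4) for e in expected])
print(f"chi-square = {chi2:.3f}")
```

Here q is far smaller than its probable error, which is the sense in which the data do not contradict a small positive q and a large n.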
* Of course immensely better general total fits are obtained by using the sums of the actual 8 or 11 binomials than by the Poisson-Exponential sum or the single binomial, but the results in that case involve 16 or 22 non-significant constants. 56 On the Poisson Law of Small Numbers The actual results as given by the binomial and the Poisson-Exponential are: TABLE XI. | i Number of Deaths ... | 0 eae ae 2 3 4, and over | = we |I— | | | | Observed Hae as 109° =4)), Ob" 22 3 1 Binomial a ax 108°6 | 66°4 | 20°2 Ail 0-7 Exponential ... “3 108°7 | 66°3 20°22 | 4:1 O'7 | Actually if we work to two decimal places in the frequencies we have y? = ‘61 for both binomial and exponential, or the goodness of fit is practically identical. In this case it seemed worth discussing the binomial fit more at length. Taking the moment coefficients about the mean we have: (1) Mean =ng="6100. (11) bs = npg = 6079. (111) Hs = npg (p — g) = 590,562. (iv) fy = npg (1 + 8npgq — Opq) = 1:°643,373. We have already discussed the binomial from (i) and (ii), giving x” for goodness of fit ="6096. Using (11) and (111) we have for the binomial 200 (985,739 + 014,261), giving x? = 665. Using (111) and (iv) we have: 200 (979,524 + °020,057 303, giving x? = 707. Putting : B.= Me | pe? and Bi = M3/ ps?, we have: B,—-3 =(1—6pq)/npq, Bi. = —4pq)/npy, and working from 8, and £, we find: 200 (969,150 + :030,850)89™, and in this case x? = 1:1286. This of course does not give a bad fit, but it is clear that working from the lowest moment coefficients, as we might anticipate, gives the best results. But if q be the chance of death from the kick of a horse, and n the number of men in an army corps, then the binomial should be 200 (p +q)” Now it is obvious that none of the binomials give, by their value of n any approach to the real number of men in an army corps. 
If we start with the Lucy WHITAKER 57 number of men 7 in an army corps as 50,000*, we have ng ="61 and g=:000,0122, thus reaching the binomial 200 (-999,9878 + 0000122), giving as compared against Bortkewitsch : Binomial Bortkewitsch 0 108°6876 108°6703 1 66°3002 66°2889 2 20°2213 20°2181 3 41115 41110 4 and over °*7035 ‘7034 and y? = 608,298 608,318 or, the slight advantage to the binomial exists but is of no significance. Now it seems to us that in this case the use of the exponential is justified for the total frequencies, but as far as describing those frequencies is concerned, it gives no better result than the binomial. But as in the other five of Bortke- witsch’s cases the Exponential is not justified by the individual series themselves. It is perfectly true that the exponential has a definite theory behind it, and is interpretable in terms of that theory, i.e. we must suppose the probability of an occurrence very small and the chance of its repetition absolutely identical. But is the second of these conditions ever likely to be demonstrable a priori, or must * This supposes that every man in the army corps is equally liable to death from the kick of a horse; of course a very arbitrary assumption. + To illustrate the idleness of the application of the Poisson-Exponential even to these data for the Prussian Army Corps, we give here the binomials for the whole of the 14 corps. Index Number of Corps Binomial G 20 (-95 + -05)16-0000 : 20 (1325 — +325)-2-4615 ul 20 (1:5667 — 5667) —1-0585 Tl 20 (-9 + +1)6-0000 IV 20 (-6 + -4)1-0000 W 20 (-6318 + +3682) 1-4938 Vi 20 (1:0912 — :0912)—9-3202 VII 20 (-9 + +1)6-0000 VI 20 (-65 + +35) 1-000 Ix 20 (‘8115 + *1885)3-4483 x 20 (1°05 — -05)-15-0000 XI 20 (1-11 — -11)-11-3036 XIV 20 (1:05 — -05)—24-0000 XV 20 (1:1 — *1)—4-0000 One seeks in vain through these binomials for any approach to q very small and positive and very large and positive. 
In no case does n approach the number of men in an army corps, say 50,000, or q equal the chance of a death from the kick of a horse, say, ‘0000122! It seems impossible by clubbing such equations together to give any satisfactory proof that the Poisson-Exponential really does apply to individual cases. In the 20 years involved, there were doubtless great changes in both the training and the personnel of each army corps, and the results obtained may be just as much due to such causes as to the errors of small samples. Biometrika x 58 On the Poisson Law of Small Numbers not we a posteriori demonstrate it from the data themselves? Child suicide may be influenced by example, by environmental conditions in different districts, possibly even by meteorological conditions in different years. Again, even in different army corps the conditions may be far from uniform, the spirit of the corps, the teaching with regard to the handling of horses, the experience of past life according to whether the corps is raised in town or rural districts may all tell. Even Bortkewitsch before he gets his best fit removes four corps or 80 observations from his data. We do not criticise this removal, but even unremoved he says the fit of theory with experience leaves “wie man sieht, nichts zu wiinschen iibrig” (p. 25). But the binomial is before removal: 280 (1:085,714 — 085,714)-8 155 in which q is not very small and is negative, and n is not very large and is not positive. It is true that the probable error of g for q insignificant is in this case +0570, but this only shows that the data were insufficient in quantity to determine whether the exponential could be applied or not. (9) Mortara’s Cases. Mortara* in an interesting paper has realised the possibility of repetitions not being independent and has discussed a constant @’, by which he proposes to test such influence. This quantity Q should be unity, if the Bortkewitschian hypo- thesis can be applied. 
He then takes 16 or 17 districts with records of 10 years, and calculates the mean number of deaths from some special cause per year, say, for each district for those years. If this mean number exceeds 10, he casts out that district, presumably on the ground either (1) that such a number is no longer small, or (ii) that it differentiates the district from those with lower numbers. Thus Bologna with 10°9 deaths by murder is excluded and Bergamo with 84 is included, although Q’=1 for both. Bologna with 7:1 deaths from smallpox is included, but Pavia with 12°3 is excluded although the Q’ of the former is 2°5 and that of the latter 1:7. What method should be employed in dealing with the frequency of the excluded districts which may amount to 50 °/, of all districts is not discussed. Having thus reduced his available districts, Mortara proceeds to apply the exponential to each individual district ; he adds up the results for each district and compares his totals with the observed totals. It will thus be observed that he fits his exponential to ten observations, and then adds together five or more districts to get his totals. We can equally well apply this process by fitting a binomial to each 10 observations and then adding up such results. But it is quite clear that on the basis of ten observations, it is, owing to the large probable errors, wholly impossible to assert, whether a binomial of the kind required by the Bortkewitsch-Mortara hypothesis,—i.e. one of very small positive q and very large positive n—really is justified. We can illustrate this at once from Mortara’s Tables (see his pp. 42 and 45) for deaths from Chronic Alcoholism. The * « Sulle variazioni di frequenza di aleuni fenomeni demografici rari,” Annali di Statistica, Serie v. Vol. 1v. pp. 5—81. Roma, 1912. Lucy WHITAKER 59 observed numbers, and those deduced from the binomials are given in the accompanying table. 
At the foot are the observed totals, Mortara’s exponential totals and the binomial totals. TABLE XII. Deaths from Chronic Alcoholism. Oe Cason aber) |) ling | 9 | io | ar | a2 las |e | | | | | | | | i. i st | | Calabria 1 ® 4 — 2 = |) = | | — | — | Observed 1:49] 2°84) 2°70) 1°71 81]; -31}) -10] -03)| -01 = | —- | Mortara 118} 2°85} 3:06} 1:91 "76 | °20} -03 | - | = | | Binomial Foggia 1 2 4 — | 2 oS || | | | O. 1:00} 2°30] 2°65] 2:03) 1:17 54 | “21 07 | °02] -O1 | M. 96) 2°29) 2°70 | 20S eelaliSae Padova 10 ( 36833 — 2°6833)-%# Verona 10( 56000 — 4:6000)-™" Brescia 10 (O97 20 — B8i0i 2h) Bergamo 10 ( 23821 — 1:3821)-7=9 Catanzaro 10 (156128 — 14-6128)~ 76 Vicenza 10 ( 34854 — 2°4854)-Ve97 Out of the eleven cases only two give g small and positive; not a single one gives for g anything like the chance of a death from small-pox in the district, nor for n anything like the population of the district. There is an increasing divergence from the positive binomial as Mortara’s Q’ increases in value. We see that in nine cases, however, a negative binomial not the exponential is required to describe the frequencies. The probable error of gq, for insignificant q is as before + 3016, and therefore it is improbable that g is zero in at least 9 out of these 11 districts. Examining the totals we find Binomial Exponential v2 = 9°64 * 570°79 Oi ‘000,000 Lucy WHITAKER 61 TABLE XIII. Deaths from Small-powx (1900—1909). : fy | 12 or 0 1 2 8 4 5 6 : g er ea lee | more Venezia 4 5 = 1 7 || le Observed | 4:49 | 3°60] 1°44 38 08 01) -- — = — —|— — | Mortara 4°40} 3°71} 1°46 36) 06) ‘01 -—— | Binomial Bologna 4 4 1 1 — —|/|— —}|} — |90. 4:07 | 3°66} 1°65| 49 ear Nfs ee: ©) een |r| ean cn | i ee IVT 4:04} 3°68} 1°65 “49 ‘ll 02 Ol; — |) — B. Treviso 5 3 7 pe == 1 | | 0! 3°68 | 3°68} 1°84 61 15| :03 ‘OL| - - M. Halls 236) 1-18 | ‘61 By ON ‘09 ‘05 03) ‘Ol; — | — = B. Pavia 4 3 el eee fee lL catty tea 0. SO) 382624) 2:17 87 26 ‘06 Ol = | = M. 
4:14] 2°76 e538) 79 40 19 09 O05 02 01} ‘01 |) — | = B. Cagliari 5 1 1 1 SV es eel aes peo ee eee | a ea ee eS OS Ocou eouiOln woo “99 42 IL) 04 OL y= —}— = M. 4:07] 1°89} 1°17 79| 55! -39| -28| -21| -15| -11/:08|-°06/ -25 |B. Padova 3 3 _ 2 = 1 a |e == = 1/— = 7 |KO): ‘91 | 2°18) 2°61} 2°09] 1°25 ‘60 24 08 03 Ol}; — | — — M. o2'| 2°03) 1°40 98 MON 250 36 26 19 13°] “10 7 16 | B. Verona 4 3 — 1 = = 1 ome ee oe |e | i O. OM Ql Sei :6la|) 22098) 125 ‘60 | +24) -08 03 ol) — J] — — M. 4:07 | 1°74] 1:09 ‘75 4 40 | °31 733 18 14 | -11 | ‘09 Fata) || 15%, Brescia 2 3 2 2) = == = = a ee | 13 0) ES alan le QO 2 Oil) || a2, 1°82 | 1°20] ‘66 31 33 05 | 02 | — — M. 4°99) 1°42 87 6 47 coll 30 24 20 17 | *14 | °12 79 |B. Bergamo 2 — 2 2 — | 1 — 1 1 1 }/—/—] — |9. *20 79)| 1°54) 2:00)) 1°95 1°52 99 5)5) 27 12 | 04! -02 ‘Ol | M. Hotere dol 1:57) 1:46) 1723 98 74 54 38 he | Altsy | aly 24 |B. | Catanzaro 3 3 1 1 1 |g en |e | 1* | 0. *20 “ON 1254.1 2:00)) 5951) 1:52 "99 YD) ||) 47/ 12 | :04 | 02 ‘O1 M. 4°80 |} 1:20 el 50 38 ill 25 21 18 LG 4 | el2s 04s 1B: Vicenza 3 = 1 1 1 1 = 1 1 = — | — 1 O. ‘17 GSae eso alee 1:95 | 1°60 | 1:09 BS} 15 | 706 | °02) -O1 M. Qe SOs da? 1eQBnle LeODie 282, 65 51 39 30 | SBA PI | °48 B. lias ae | Totals Somos, ie i.| 2 2 CRden No Wo sor ery) wae. Os 19°24 | 24°97 | 21°50 | 16°54 | 11°76 | 7°58 | 4°38 | 2:25 | 1°07 ‘46 | °16 | -06 03. «| M. 40°25 | 23°70 | 14:05 | 8°58] 5°78 | 4°16 | 3:08 | 2 L254 ed 59 Olles7ios eo soln b | Brescia, and 1 at 27 in case of Catanzaro, if the means were to agree with those given by Mortara. * 1 at ‘12 or more’ in cases of Brescia and Catanzaro was found to signify 1 at 20 in the case of 62 On the Poisson Law of Small Numbers In other words the binomials give a reasonable total fit, the exponentials a practically impossible one. But there is another question to be asked in such series as those of Mortara: What justification is there in cutting off at 10 cases, say of murder? (Standard Deviation)? 
= or, if we use the form preferred by Bortkewitsch* 8S? (ms — nq)? eet ee S? (ms, — nq) (l—1)nq © This in other notation is Mortara’s Q”, the only criterion he’ actually uses provided by his equation (17 ter), p. 18. Thus his Q’, which he says must not differ much from 1, is only /p, and it would be better to use p—which has a direct physical meaning—than Mortara’s Q =,/p. Clearly Mortara’s somewhat elaborate process of deducing Q’, does not amount to more than saying: Fit a point binomial and test if p is slightly less than unity. We contend that it is best straight otf to fit the binomial. Hence : p= It is true that Mortara does not reach his Q”, our p, by the simple process of asking whether the binomial is one with a positive probability less than unity. He endeavours to obtain it by considering whether there is “lumpiness” in the observations. But it seems to us clearer and briefer to ask: Are the contributory cause-groups independent as in teetotum spinning? If so, the data will fit a true binomial and p will of necessity be a positive quantity less than unity. If they are not of this character then p must of necessity be greater than unity. It is of interest to see how Mortara’s test of dependence of contributory cause groups 2 leads to a criterion, but he actually only gets his Q”, Le. our binomial p after 2 a series of hypotheses which much limit, and that in no very obvious manner, * The use of NE or en in the value of the standard deviation when l is small has been several times discussed. It may be dealt with as follows: The probable errors of a mean as deduced by the two processes are E=-67449 . o/a/1, and E’=-67449 .o/,/t-1, now B= 61449 6] JT(1 +5) +-~- ] 1 = -67449 +. (« eee ot. ) wl Not OE 1 may and —— is less and often much less than ‘67449. wel /21 Hence if we only know o from the observations themselves, and this is the usual case, we have: 1 / 2s gh Jt where o’ differ from o by a quantity usually far less than the probable error of ¢. 
In other words the refinement of using H’ for F is idle having regard to the accuracy of our observations; and the form used by Bortkewitsch and Mortara with ,/1—1 for nibs of no importance. Now the probable error of o is °67449 E’ = -67449 64 On the Poisson Law of Small Numbers the nature of those contributory causes groups. Of course if their dependence were of the nature of successive draws from a pack, then the result would be a hypergeometrical series and Q? would have no physical meaning for the series at all. (11) We will deal with one further illustration out of many considered by Mortara which are of like character. In the case of Marriages of Uncle and Niece (see Table XIV, p. 65), where the distribution of Q’s is the most favourable for his theory, the binomials are Reggio Marche 10( "7000 + -3000)'° Umbria 10( :9000 + -+1000)°° Basilicata 10 (14000 — -4000)7* Sardegna 10( °44545 + °55455)198% Emilia 10( 9818 + -0182)2020 Abruzzi 10( 8429 + °1571)788 Lazio 10 (12548 — -2548)—12 1646 Puglie 1OGeS =o ace Veneto 10 (1:34.44 — +34.44.)~1'S064 Toscana 10:(2:2667 A266 7) 28" Calabria 10 (13584 — -8584)—2#°88 of which only one (Emilia) approaches the conditions for an exponential distribu- tion. If we test the totals at the foot of Table XIV, we find the result much to the advantage of the binomial, for which P = ‘902 as against ‘714 for the exponential. (12) On Mortara’s own showing nearly all the Qs of his numerous series are greater than unity, and very few of the binomials are positive. If we consider the distribution of Q's, given in his work omitting Table 13 (Deaths from Malaria) we find a range from ‘5 to 3°6 with a mean Q at 1:2565 + 0847, while for the distribution of all the p’s in the binomials we have determined, we find a range from ‘4 to 15°6 with a mean p at 2°5655 + ‘3817. These results are sufficient to show that there is no real distribution of p round the value unity but the binomials have a distinct tendency to be negative. 
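The binomials quoted in this section follow from the first two moments of each observed series. A minimal sketch of that fitting is given here; it is not the author's own computing scheme. The function name `fit_binomial_by_moments` is hypothetical, the standard deviation is taken with divisor l (the number of years) rather than l - 1, and the observed frequencies are those read from the O. rows of Table XIV. The first Basilicata frequency is illegible in this copy and is assumed to be 6, so that the ten years are accounted for.

```python
def fit_binomial_by_moments(freqs):
    """Fit (p + q)^n to a frequency table by its first two moments.

    freqs[x] is the number of years in which exactly x cases occurred.
    Assumes mean = n*q and variance = n*p*q, so that
    p = variance / mean, q = 1 - p and n = mean / q.
    (A series with variance equal to the mean gives q = 0 and is not handled.)
    """
    total = sum(freqs)
    mean = sum(x * f for x, f in enumerate(freqs)) / total
    second = sum(x * x * f for x, f in enumerate(freqs)) / total
    variance = second - mean ** 2      # divisor l, not l - 1
    p = variance / mean
    q = 1.0 - p
    n = mean / q                       # comes out negative when p > 1
    return mean, p, q, n


# Observed frequencies as read from Table XIV (Basilicata's first entry assumed).
observed = {
    "Marche": [7, 3],
    "Umbria": [6, 3, 1],
    "Basilicata": [6, 3, 0, 1],
    "Sardegna": [2, 5, 3],
}

fits = {name: fit_binomial_by_moments(f) for name, f in observed.items()}
for name, (mean, p, q, n) in fits.items():
    kind = "positive" if 0.0 < p < 1.0 else "negative"
    print(f"{name:10s} m={mean:.2f} p={p:.4f} q={q:+.4f} n={n:+.4f} ({kind} binomial)")
```

A value of p above unity, as for Basilicata, is what the text calls a negative binomial: both q and n change sign while their product remains the mean.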
(13) But the whole theory of Poisson’s exponential law in the hands of Bortke- witsch and Mortara appears essentially vague. The binomial is built up on the assumption of the repetition m times of a number of independent events, of which the chance of occurrence is identical and equal to g. The population is n and the chance of occurrence q in the case of each individual. The mean frequency of occurrence is ng. But if g be very small we have seen that the series 1s —m [1 , m me e ppnow nt soy v8 ; i —— Lucy WHITAKER 65 TABLE XIV. Marriages of Uncle and Niece (1900—1909). : as 0 1 ze | 8 pe Nae |e oer |e om ip it | 22 | 23 | 14 15 | 18% | | over er Marche O.| 7 3 = = | | M.| 7-41] 2-22] -33] -04 | | | | Bale iz 2 | Umbria 0. 6 3 1 — — | | M 6:06 | 3°03 s(oa wld 02 B. | 5°90} 3:28 ‘73 08 | -— BasiicamOnime. | o3 | — | 1 | — | ; | | | | | M 5°49 | 3:29 99 ‘20 03 | | B. | 6°04] 2°59 92 31 10 03 Ol. | Sardegna O. 2 5 3 = - M.| 3°33] 3°66) 2°01 74 20) °05 O1 Pelpeccores-o6 | 3:03) = | — | — | — | | Emilia =O. 1 3 2 Zee 1 — M Il-l1l| 2°44) 2°68} 1:97] 1°08} °48]) ‘18 05) -Ol B 1:09] 2°48} 2°70] 1:98 | 1:08 47| ‘17 05 Ol Abruzzi ©, || — SUF whee ail 3 | 2 _- 1 — | — | M 61) 1°70) 2°38) 2:23] 1°56 8 Se G 06 02 B Se elon mcd S|) 2°43 | 1°68 87 34} *11 03) — isazion ©. | 1 eal a2 3) — | 2}/—) 1] —) — M ADM eA One Dele 224 173!) 1:07 5D 25; 10} ‘03 OL B 63} 1°56! 2°09} 2:00} 1°54) 1°01 59 31 14} -06 03 O01} ‘O01 Puglie O.| — 30 al 2 nh peal 1 72 ae bbe |e fee ae | M 27 FOS) wake) Loi U38 | “835 427) =19) 08 03 01 | — | B Sone lesOny We77 | 1-80 | 1538 |1:14| -77| °49) 29] +16) +10) :05] :03 | ‘Ol | ‘OL | Veneto O. 1 = 1 l 3 1 i a 2 | | M. 11 SHOP aoa OOM le OO! Ts fn) 28 82) “46 23 10 OA 025-01) Bs 21 ‘70 | 1°26] 1°62] 1°67 46 | 1°13 TS Mh > aap 30) 17 09 | 05 | °02 | -O1 | O01 Toscana O. _ =e 1 2 2 2 1 Le — | — 1 M. O04 24 66) 1:19) 1:60] 1°73 / 1°56) 1:20} -81 49} °26 13 | 06} ‘02 | 01 | — B. 
31 SM lsOvaleelaton al 2ifal elie) 1-00 83 | °65 5O 37 27 | 19 | 13} 09 | 06] :10 | Calabria O = - 2 2 1 1 1 ented tg | —| 1 M — Ol 05:; “16. 36 4) 94] 1:20 | 1°33] 1:32] 1:17 | :95) -70| 48 | -31 | -18] -20 | B 00 O03 wi; 26 48 73 96 | 1:16 Deal OL C7 6Ou plaleon I 2a. a6 : | ea | Totals O. 24 24 1} || ile 8 9 5 5 3 it 1 it = |) SE ee al SS M. | 24°88 | 19°47 | 14:93 | 12°72 | 10°39 | 7:93 | 5°76 | 4°10 | 2°96 | 2-17] 1°57] 1°13 | -78| -51 | -32 | -18] -20 \ B. | 24°22 | 22°16 | 16°46 | 11°73 | 9°35 | 6°88 | 4°98 | 3°74 | 2°84 | 2°19 | 1°71 | 1:29 | -97 67 | 48 | °32] -26 | } J As Biometrika x 9 66 On the Poisson Law of Small Numbers from which x has disappeared, and in this exponential we have seen that Bortkewitsch and Mortara suppose m small, ic. 10 or under. We have seen that there is no reason why m should be absolutely small, and that the name given by Bortkewitsch to the Poisson-Exponential—ie. the “Law of Small Numbers ”—is misleading. But supposing the mean occurrence m to be small, it by no means follows that g need be small and n finite. For if g="2 and n=4, m would be “small ”—and the sort of small number with which our authors deal, but the mere fact that the mean frequency of occurrence was 2 would not justify our using the Poisson-Exponential for (Golam ey) The fact is that when our authors speak, of the deaths in a Prussian Army corps from the kick of a horse, or the suicides of schoolgirls, or the deaths from chronic alcoholism as being “small,” they really mean small as compared with the number of persons exposed to risk. They had probably in mind all the men in the army corps, all school-girls or all individuals liable to death in the towns considered. But are all men in the army corps,—or only the cavalry, the artillery, etc..—equally liable to death from the kick of a horse? Is every school-girl equally hable to commit suicide or only a very few morbid and unhealthy minded girls? 
Is every individual equally liable to die of chronic alcoholism, or only perhaps the 10 or 12 confirmed and aged drunkards in a town? The moment we realise these doubts, what is the population n to be considered? It is not m being small, but the smallness of m/n that leads us to believe that the binomial may have passed into an exponential. But if only six school-girls per year in a community are in the least likely to commit suicide, what is the justification for the “law of small numbers,” if the average number of suicides be .65? Further, if we pass to even a large community in which the tendency to commit suicide is graded (a very probable state of affairs), m might be small and n large, and yet since q is not constant, the binomial and its exponential limit would not be applicable; and this non-applicability would not depend on “lumpiness,” i.e. contagion or example in occurrence. Thus the probability might be:

$$(p_1 + q_1)(p_2 + q_2)(p_3 + q_3) \cdots (p_n + q_n)$$

with all the p's independent (as in spinning differently divided teetotums) and not correlated (as they would be in drawing successive non-returned cards from a pack). It would seem therefore that a priori we should not expect the conditions for the exponential to be fulfilled in most of the cases selected by Bortkewitsch and Mortara, although with perfect mixing we might expect it in the cases cited by “Student.”

(14) In order to test this point on adequate numbers, the ages at death of all persons dying over 70 years of age were extracted for a period of three complete years from the notices of death in the Times newspaper for the years 1910-1912: see Table XV. These announcements of death are those of individuals in a fairly limited class, which may be considered stable in numbers for these three years.

[TABLE XV. Deaths per day of the Aged from the Times newspaper: observed, binomial and exponential frequencies of the number of deaths per diem, for men and for women of 70, 80, 85 and 90 years and over. Entries not legible.]

Table XVI shows that the announcements of deaths over 70 years of age only amount to 3.74 per day for males and 3.52 for females. These are certainly “small numbers,” but “small” with regard to what?
Are we to consider n as the number of the population which embraces, (i) all the individuals of the limited classes of the same range of ages as the defunct, (ii) all the individuals announced as dead on the same day, (iii) all the individuals of whatever ages of the class which announces deaths in the Times? Or, should we refer to all the individuals in the community of that range of ages, or the whole community at large, i.e. the chance that in a population of so many millions an individual over 70 or 80 as the case may be will die and have their death announced in the Times newspaper? Well, it really does not matter, because if for any one or all of these populations the binomial $(p + q)^n$ applied, we should get, if q were small and n large, the Poisson series

$$e^{-m}\left(1 + m + \frac{m^2}{2!} + \frac{m^3}{3!} + \dots\right);$$

and this quite regardless of the size of n. If therefore we did find a series in which q was very small and n large, we might not be able to say to which, if any, of the above populations n applied. On the other hand the mere fact that m is small is no justification for the use of the “law of small numbers” as is sometimes implied. If it be argued that the small number of people who die over 80 and have their names recorded in the Times are drawn from a small population, we reply: so, it may be argued, are the school children who commit suicide, the uncles who feel any inclination to marry their nieces, or the men liable to die of chronic alcoholism; and we can in the case of the announcement of deaths test the values of q and n on fairly adequate numbers. As a matter of fact we do not know, in attempting to apply the Poisson formula, what is the population from which we are drawing our individuals, and the justification of the Poisson formula lies only in showing that there actually does exist a binomial for which q is small and n large.
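The statement that the Poisson series is reached whenever q is small and n large, whatever the actual size of n, can be checked numerically. The following is a minimal sketch under stated assumptions: the mean is held at m = 2 purely for illustration, the values of n are arbitrary, and the helper names (`binomial_terms`, `poisson_terms`, `max_gap`) are hypothetical.

```python
from math import comb, exp, factorial


def binomial_terms(n, q, upto):
    """Chances of 0, 1, ..., upto occurrences under the binomial (p + q)^n."""
    p = 1.0 - q
    return [comb(n, x) * q ** x * p ** (n - x) if x <= n else 0.0
            for x in range(upto + 1)]


def poisson_terms(m, upto):
    """Successive terms of e^{-m} (1 + m + m^2/2! + ...)."""
    return [exp(-m) * m ** x / factorial(x) for x in range(upto + 1)]


def max_gap(n, m, upto=12):
    """Largest difference between corresponding binomial and Poisson terms."""
    b = binomial_terms(n, m / n, upto)
    e = poisson_terms(m, upto)
    return max(abs(bi - ei) for bi, ei in zip(b, e))


m = 2.0                      # assumed mean, for illustration only
gaps = {n: max_gap(n, m) for n in (4, 40, 400, 4000)}
for n, gap in gaps.items():
    print(f"n={n:5d} q={m / n:.4f} largest term difference={gap:.5f}")
```

With n = 4 and q = 1/2 the mean is "small" but the binomial is far from the exponential; the agreement comes only as q falls, which is the point made in the text.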
We might imagine that as we got to the higher ages practically every person of that age would die, or that in our notation q would be 1 nearly and p be a very small quantity; thus an approach might be made to the Poisson-Exponential. But the approach to the Poisson-Exponential arises not through q approaching unity but from q becoming very small. Nor again in the lower age groups do we find ourselves left with a positive binomial. In all cases except women over 90 years of age, we find that a negative binomial best fits the observations. Even in the case of the announcements of deaths of women over 90 years, we find that the approach of the binomial to the Poisson exponential depends on

$$\left(1 + \frac{1}{53.3333}\right)^{53.3333}$$

being measured with sufficient approximation by e = 2.71828. But

$$(1.01875)^{53.3333} = 2.69323,$$

and is therefore not a very close approximation, a result shown when we use a binomial by the substantial improvement in the measure P of “goodness of fit.” Even in this case we are not prepared to say what is the population for which q = .01875 in the case of these announcements of deaths of women over 90 years of age. It can scarcely be that there are only 29 women over 90 years of age living in the country, whose deaths are likely to be announced in the Times when they occur. Further the probable error of q is such that actually this case might equally well be a random sample from material following a negative binomial.

TABLE XVI. Constants for Deaths of Aged.

Men.

| Age over | p | q | Probable Error of q | n | Probable Error of n | m | Binomial P | Exponential P |
| 70 years | 1.12965 | -.12965 | ±.03314 | -28.8747 | ±7.3734 | 3.7436 | .1355 | .0045 |
| 80 years | 1.12152 | -.12152 | ±.03349 | -14.0703 | ±3.8704 | 1.7099 | .9358 | .1129 |
| 85 years | 1.01903 | -.01903 | ±.02902 | -43.2996 | ±67.5797 | .8289 | .9737 | .9715 |
| 90 years | 1.00654 | -.00654 | ±.02934 | -42.8498 | ±192.3069 | .2801 | .6741 | .6672 |

Women.

| Age over | p | q | Probable Error of q | n | Probable Error of n | m | Binomial P | Exponential P |
| 70 years | 1.34012 | -.34012 | ±.04161 | -10.3522 | ±1.2307 | 3.5210 | .8084 | .0000 |
| 80 years | 1.20770 | -.20770 | ±.03294 | -10.4400 | ±1.8309 | 2.1569 | .9686 | .0018 |
| 85 years | 1.14507 | -.14507 | ±.03077 | -8.1447 | ±1.9627 | 1.1816 | .9860 | .1062 |
| 90 years | .98125 | +.01875 | ±.02779 | +29.0573 | ±43.0634 | .5447 | .9848 | .8116 |

Analysing our material we see that our first two cases of males and the first three of females are such that they could not possibly be random samples from positive binomials; the probable errors of q are too small. Next, seven cases out of the eight do give actually negative binomials and the eighth might, having regard to its probable errors, well be a negative binomial. Thus although our daily occurrences are certainly in Bortkewitsch and Mortara's sense “small numbers,” they give no support to the use of a Poisson-Exponential. If it be said that these “small numbers” differ in character from those used by our authors, the reply must be: we know in none of these cases the real population from which deaths are to be considered as drawn. The chances of death are certainly graduated with age, but the chances of suicide are graduated with temperament, and the same is true of alcoholism, or again the chance of death by accident is graduated with occupation. At any rate until those who support the use of the “law of small numbers” demonstrate its application on material where the probable errors are sufficiently small for us to measure the true value of q and n, no advance can be made.
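The comparison of each q with its probable error, on which the preceding analysis of Table XVI rests, can be sketched as follows. This is only an illustration: the text does not state a numerical limit, so the multiple of three probable errors used here is an assumption, and the names `TABLE_XVI` and `classify` are hypothetical. The values of q and its probable error are those read from Table XVI.

```python
# (sex, age over, q, probable error of q), as read from Table XVI.
TABLE_XVI = [
    ("men", 70, -0.12965, 0.03314),
    ("men", 80, -0.12152, 0.03349),
    ("men", 85, -0.01903, 0.02902),
    ("men", 90, -0.00654, 0.02934),
    ("women", 70, -0.34012, 0.04161),
    ("women", 80, -0.20770, 0.03294),
    ("women", 85, -0.14507, 0.03077),
    ("women", 90, 0.01875, 0.02779),
]


def classify(q, probable_error, multiple=3.0):
    """Say whether the sign of q is settled, given its probable error.

    `multiple` is an assumed limit: q is treated as definitely negative
    (or positive) only if it lies more than `multiple` probable errors
    away from zero.
    """
    if q < -multiple * probable_error:
        return "negative binomial"
    if q > multiple * probable_error:
        return "positive binomial"
    return "sign of q undetermined"


verdicts = {(sex, age): classify(q, pe) for sex, age, q, pe in TABLE_XVI}
for (sex, age), verdict in verdicts.items():
    print(f"{sex:5s} over {age}: {verdict}")
```

Under this assumed limit the first two series for men and the first three for women are definitely negative binomials, and no series is definitely a positive one.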
Nor until we have clear ideas of the population in which the chance is q, is it possible to assert that it may be used for the suicides of school children, and the marriage of uncle and niece, and must not be used for the deaths of aged people, which certainly occur in “smaller” numbers. In the illustrations of deaths we have taken, certainly the Poisson-Exponential is not the rule, although the distributions appear to approach it, as towards a limit, when the number of deaths approaches zero. But our data, which show the rule of the negative binomial, appear to show it in no more marked manner than much of the data selected by Mortara himself indicate the negative binomial, although owing to the sparsity of his material his results are far more erratic and unreliable. Nor is Bortkewitsch much behind Mortara in the evidence he produces for a negative binomial being as reasonable a description (possibly owing to inherent lumpiness) as a positive binomial of these “small number” frequencies.

(15) Conclusions.

(a) The Poisson-Exponential gives a fairly reasonable method of dealing with the probable deviations of small sub-frequencies in the case of random sampling. When the average value of a sub-frequency is not more than 3% of a population, then Poisson's formula suffices in most practical cases to determine the range of error likely to be made. Tables are given to assist its use.

(b) The application of the Poisson-Exponential to various data by Bortkewitsch and Mortara has hardly been justified by those writers, for they have not tested whether the probability q is small and positive and the power n large and positive in the cases considered by them. When this is actually done, it is found that their hypotheses, having regard to the probable errors of q and n, are largely unjustified in the case of their illustrations. Even in such cases where it is justified, a binomial gives a better result as measured by the test for goodness of fit.
(c) Negative binomials repeatedly occur and give just as good fits, where they occur, as positive binomials. In the illustrations taken by Mortara, the frequency 10 used is so small that it is not possible to assert that either positive or negative binomials are demanded by the data. Still the average p of his results is very significantly in excess of unity.

(d) Mortara like Bortkewitsch cuts out of his data straight off all districts with, on the average, more than 10 cases in the year. But the q obtained from 20, 40, or even 100 cases in a population of 100,000 is a small q in the sense that the resulting binomial is adequately expressed by a Poisson-Exponential. There appears to be no valid reason for such a procedure, except the experience that many such cases actually give negative binomials*. It seems to us theoretically unjustifiable to apply the exponential to 8 cases say in a district of 100,000, and not apply it to 12 cases in a district of 200,000. Actually p may be 1.4 in the first case and only .9 in the second.

(e) We consider that the reasonable method in every case is not to start with the Poisson-Exponential, which screens the truth or falsity of the a priori hypotheses, but to fit a binomial regardless of the magnitude of p. The fact that quite as good fits are obtained with negative as with positive binomials suggests that a new interpretation of these cases of “negative probability” is requisite. Several cases of the interrelation of “contributory cause groups” which provide a series represented by a negative binomial $(p - q)^{-n}$ have been recognised†. A general interpretation based on a very simple conception seems needed for these demographic cases in which the law of small numbers appears far more often to correspond to a negative than to a positive binomial.

This paper was worked out in the Biometric Laboratory, and I have to thank Professor Karl Pearson for his aid at various stages.
* Can we cite in addition perhaps, the fact that existing tables of $m^x e^{-m}/x!$ do not extend beyond m = 10?

† Pearson, Biometrika, Vol. IV. p. 208.

THE RELATIONSHIP BETWEEN THE WEIGHT OF THE SEED PLANTED AND THE CHARACTERISTICS OF THE PLANT PRODUCED

By J. ARTHUR HARRIS, Ph.D., Carnegie Institution of Washington, U.S.A.

I. INTRODUCTORY REMARKS.

1. In Biometrika, Vol. IX. pp. 11-21, March 1913, were published constants showing the relationship between the weight of the seed planted and the number of pods on the plants produced in twenty experimentally grown series of Phaseolus vulgaris. From the economic view point, number of pods is the most important character which could have been chosen, total weight of seed matured only excepted. But to the student of morphogenesis, or of the physiology of seed production, other characters are of equal interest, while the comparison of the correlations for various features must yield results of significance.

The purpose of the present communication is the presentation of the constants measuring the influence of the weight of the seed planted upon the number of ovules formed and the number of seeds developing in the pods of the matured plant.

These various relationships have now been worked out for a relatively large bulk of material. Altogether there are 29 individual series belonging to 5 varieties, involving 17,953 plants, from which 119,192 determinations of the number of ovules and seeds per pod have been made. The reply to the possible suggestion that the expenditure of effort in the collection and analysis of such masses of data is quite unjustifiable is twofold. First, a major portion of the labour involved was necessary for investigations not touched upon here. Secondly, there are many problems of morphogenesis and physiology which can only be solved by the amassing of large series of accurately determined biometric constants which, when sufficiently numerous, may themselves be the materials for statistical analysis.
The data here contained are recorded in partial fulfilment of such requirements for certain definite morphological and physiological problems.

The present paper is limited strictly to matters of fact; general discussions are reserved until further data, much of which is already available in a raw state, are reduced.

II. MATERIALS.

The first paper may be consulted for details not entered here. The data analysed are drawn in part from the series already considered for the relationship between weight planted and number of pods produced. In addition to the White Flageolet, Navy and Ne Plus Ultra varieties already treated, several lots of Burpee's Stringless and two of Golden Wax are available.

III. ANALYSIS OF DATA.

2. Data for Number of Ovules and Seeds per Pod.

Tables III-VI, similar to those of the preceding paper, give in a condensed form the data for the correlations discussed. Table I* gives the correlations between weight of seed planted and ovules per pod, $r_{wo}$, and between weight planted and number of seeds matured per pod, $r_{ws}$. The partial correlation coefficients,

$${}_p r_{wo} = \frac{r_{wo} - r_{wp}\, r_{po}}{\sqrt{1 - r_{wp}^2}\,\sqrt{1 - r_{po}^2}}, \qquad {}_p r_{ws} = \frac{r_{ws} - r_{wp}\, r_{ps}}{\sqrt{1 - r_{wp}^2}\,\sqrt{1 - r_{ps}^2}},$$

showing the correlation for weight (w) and ovules (o) and weight and seeds (s) for constant numbers of pods (p) per plant are also given. These require in addition to the correlations here given $r_{wp}$, $r_{po}$ and $r_{ps}$, the correlations between the number of pods per plant and the number of ovules and seeds in these pods. Values of $r_{wp}$ are available from the preceding paper (Biometrika, Vol. IX. p. 21, Table VII) and from a supplementary table giving nine additional constants*. For the reader's convenience these are reprinted in this table. The values of $r_{po}$ and $r_{ps}$ will be published in connection with another problem.

TABLE I. Correlation and Partial Correlation Coefficients.

| Series | Number of Plants | Correlation, Weight and Pods, $r_{wp}$ | Number of Pods Examined | Correlation, Weight and Ovules, $r_{wo}$ | Partial Correlation, ${}_pr_{wo}$ | Correlation, Weight and Seeds, $r_{ws}$ | Partial Correlation, ${}_pr_{ws}$ |
| LL | 1141 | -.008 ± .020 | 8043 | .026 ± .008 | .027 ± .008 | -.013 ± .008 | -.013 ± .008 |
| LG | 182 | .066 ± .050 | 806 | .153 ± .023 | .140 ± .023 | -.100 ± .024 | -.103 ± .024 |
| GG | 750 | -.368 ± .021 | 6310 | .018 ± .008 | .029 ± .008 | .004 ± .008 | .016 ± .008 |
| GGH | 583 | .208 ± .027 | 5251 | .045 ± .010 | .019 ± .009 | .024 ± .010 | -.004 ± .009 |
| GGH2 | 499 | .176 ± .029 | 3502 | .093 ± .011 | .083 ± .011 | .063 ± .011 | .049 ± .011 |
| GGHH | 396 | .193 ± .033 | 2656 | -.022 ± .013 | -.042 ± .013 | -.029 ± .013 | -.048 ± .013 |
| GGD | 514 | .159 ± .029 | 1438 | .107 ± .018 | .089 ± .018 | .071 ± .018 | .068 ± .018 |
| GGD2 | 449 | .215 ± .030 | 1227 | .044 ± .019 | .018 ± .019 | .079 ± .019 | .062 ± .019 |
| GGDD | 342 | .137 ± .036 | 807 | .101 ± .023 | .092 ± .024 | .089 ± .024 | .076 ± .024 |
| HH | 1484 | .177 ± .017 | 14029 | .010 ± .006 | -.039 ± .006 | .007 ± .006 | -.054 ± .006 |
| HHH | 1271 | .145 ± .019 | 11230 | -.000 ± .006 | -.030 ± .006 | .016 ± .006 | -.014 ± .006 |
| HD | 1416 | .129 ± .018 | 5581 | -.044 ± .009 | -.067 ± .009 | -.049 ± .009 | -.052 ± .009 |
| HDD | 1204 | .121 ± .019 | 5449 | -.029 ± .009 | -.065 ± .009 | -.010 ± .009 | -.030 ± .009 |
| DD | 513 | .282 ± .027 | 1827 | .098 ± .016 | .009 ± .016 | .050 ± .016 | .008 ± .016 |
| DDD | 459 | .215 ± .030 | 2018 | .044 ± .015 | .000 ± .015 | .046 ± .015 | .006 ± .015 |
| DH | 670 | .258 ± .024 | 5955 | .075 ± .009 | -.005 ± .009 | .076 ± .009 | -.013 ± .009 |
| DHH | 565 | .152 ± .028 | 5019 | .045 ± .010 | .008 ± .010 | .011 ± .010 | -.025 ± .010 |
| USC | 530 | .150 ± .029 | 2569 | .059 ± .013 | .032 ± .013 | .031 ± .013 | .024 ± .013 |
| USS | 680 | .155 ± .025 | 6605 | .023 ± .008 | -.000 ± .008 | .041 ± .008 | .024 ± .008 |
| USH | 361 | .129 ± .035 | 3406 | .032 ± .012 | .001 ± .012 | .037 ± .012 | .020 ± .012 |
| USHH | 224 | .143 ± .044 | 1743 | .112 ± .016 | .098 ± .016 | .011 ± .016 | -.004 ± .016 |
| USD | 312 | .195 ± .037 | 802 | .127 ± .023 | .098 ± .024 | .071 ± .024 | .067 ± .024 |
| USDD | 237 | .241 ± .041 | 851 | .238 ± .022 | .175 ± .023 | .131 ± .023 | .090 ± .023 |
| FSC | 586 | .147 ± .027 | 2876 | .047 ± .013 | .017 ± .013 | .089 ± .012 | .073 ± .013 |
| FSS | 868 | .098 ± .023 | 7809 | .021 ± .008 | .001 ± .008 | .026 ± .008 | .004 ± .008 |
| FSH | 475 | .100 ± .031 | 4541 | .049 ± .010 | .018 ± .010 | -.045 ± .010 | -.073 ± .010 |
| FSHH | 427 | .121 ± .032 | 3837 | .015 ± .011 | -.013 ± .011 | .040 ± .011 | .017 ± .011 |
| FSD | 428 | .130 ± .032 | 1449 | .060 ± .018 | -.027 ± .018 | -.019 ± .018 | -.036 ± .018 |
| FSDD | 387 | .144 ± .034 | 1556 | .037 ± .017 | .013 ± .017 | .047 ± .017 | .024 ± .017 |

* The weight of the seed planted was weighted with the number of pods counted. Sheppard's correction was used for seed weight, but not for the integral variates ovules per pod or seeds per pod. Thus $\bar{w}$ and $\sigma_w$ differ slightly from those of Table II of the first paper.

The probable errors have all been calculated on the basis of the number of pods examined as N. There is considerable question whether the actual number of seeds planted should not have been used instead; the degree of trustworthiness of a constant is perhaps not greater than is indicated by the lowest number of actual measurements (irrespective of the number of associated measures taken).
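The two calculations behind Table I, the probable error of a correlation coefficient and the partial correlation for a constant number of pods, can be sketched as follows. This is a minimal illustration, not the author's working: the probable error is assumed to follow the usual formula $.67449(1 - r^2)/\sqrt{N}$, the function names are hypothetical, and because $r_{po}$ is not given in this paper the value 0.30 used below is a purely hypothetical stand-in.

```python
from math import sqrt


def probable_error_r(r, n):
    """Probable error of a correlation coefficient r based on n observations."""
    return 0.67449 * (1.0 - r * r) / sqrt(n)


def partial_r(r_xy, r_xz, r_yz):
    """Correlation of x and y for constant z (first-order partial correlation)."""
    return (r_xy - r_xz * r_yz) / (sqrt(1.0 - r_xz ** 2) * sqrt(1.0 - r_yz ** 2))


# Probable errors for two entries of Table I.
pe_lg_pods = probable_error_r(0.066, 182)      # LG series, weight and pods, 182 plants
pe_ll_ovules = probable_error_r(0.026, 8043)   # LL series, weight and ovules, 8043 pods
print(round(pe_lg_pods, 3), round(pe_ll_ovules, 3))

# Partial correlation for the DD series with a HYPOTHETICAL r_po = 0.30.
r_wo, r_wp, r_po_assumed = 0.098, 0.282, 0.30
print(round(partial_r(r_wo, r_wp, r_po_assumed), 3))
```

A positive correlation between pods and ovules lowers the partial coefficient below $r_{wo}$, which is the direction of change seen in most rows of Table I.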
The point is not of the greatest practical importance for the present case, since the number of series is so large that conclusions can be drawn from the run of the constants as a whole and too much weight need not be given to individual series.

A glance at the table shows that the correlations are low throughout. The suggestion naturally arises that some of the extremely low values may be due to non-linear regression. The regression straight line equations and the results of Blakeman's test† are given in Table II. Here r, η and the straight line equation for the regression of ovules and seeds per pod on weight planted (in working units) are determined by the conventional formulae. The final two columns give the values of

$$\frac{\sqrt{\zeta}}{2\chi_1}\cdot\frac{1}{\sqrt{1 + (1 - \eta^2)^2 - (1 - r^2)^2}},$$

where $\zeta = \eta^2 - r^2$ and $\chi_1 = .67449/\sqrt{N}$.

All the straight lines are shown in Diagram 1. The empirical means are indicated in all of the cases where it can be done without confusion. The slope is very slight and the agreement of observed and predicted means not very close, especially near the ends of the range, where the number of observations is small. There is, however, no clear indication that a curve of a higher order would describe the results better than a straight line. This irregularity is precisely what is to be expected in cases of low correlation.

* Harris, J. Arthur, “An Illustration of the Influence of Substratum Heterogeneity upon Experimental Results.” Science, N. S. Vol. XXXVIII. pp. 345-346, 1913.

† Blakeman, J., Biometrika, Vol. IV. pp. 332-350, 1905.

TABLE II. Tests for Linearity of Regression.

| Series | Correlation, r, and Probable Error | Correlation-Ratio, η, and Probable Error | Regression Straight Line Equation | Blakeman's Criterion, Test A | Blakeman's Criterion, Test B |
| For Ovules: | | | | | |
| USS | .0232 ± .0083 | .0657 ± .0083 | 5.4230 + .0074 w | 3.720 | 1.688 |
| DHH | .0445 ± .0095 | .0788 ± .0095 | 4.9385 + .0257 w | 3.431 | 1.151 |
| USDD | .2381 ± .0218 | .2978 ± .0211 | 3.6886 + .1001 w | 11.096 | 2.102 |
| GGD2 | .0442 ± .0192 | .1276 ± .0189 | 4.7224 + .0137 w | 3.152 | 1.907 |
| FSS | .0209 ± .0076 | .0403 ± .0076 | 5.5606 + .0153 w | 2.263 | .754 |
| HH | .0098 ± .0057 | .0661 ± .0057 | 5.3600 + .0056 w | 5.159 | 1.678 |
| For Seeds: | | | | | |
| USS | .0407 ± .0083 | .0946 ± .0082 | 3.5870 + .0206 w | 5.182 | 2.351 |
| DHH | .0111 ± .0095 | .0541 ± .0095 | 4.1521 + .0106 w | 5.573 | 1.869 |
| USDD | .1313 ± .0227 | .1932 ± .0223 | 2.1840 + .0940 w | 8.712 | 1.650 |
| GGD2 | .0794 ± .0191 | .1760 ± .0187 | 2.4735 + .0846 w | 4.181 | 2.529 |
| FSS | .0261 ± .0076 | .0499 ± .0076 | 3.0712 + .0269 w | 2.793 | .931 |
| HH | .0068 ± .0057 | .0953 ± .0057 | 4.2119 + .0058 w | 8.421 | 2.739 |

Blakeman's criterion has been applied in two ways, A and B. In the first the actual number of pods examined has been taken as N. In test B the number of seeds planted (not the weighted number) has been used in obtaining $\chi_1$. If the first test be accepted as the proper one, it follows that regression cannot safely be regarded as linear. But there are two important points to be taken into account. The correlation ratio η depends upon the squares of the differences in means, hence it has always a positive value, which may be very substantial because of the errors of sampling when the number of individuals per array is small. Thus when r approaches zero η is limited by $\bar{\eta}$, the mean value of η for zero correlation*. Hence a test for linearity based on a comparison of η with a very low value of r may be misleading. Again, as pointed out above, the significance of both r and η should perhaps be tested on the basis of the lowest number of measurements.
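The criterion tabulated in the last two columns of Table II can be sketched numerically. The following is an illustration under stated assumptions: the criterion is taken as $\sqrt{\zeta}/(2\chi_1)$ divided by $\sqrt{1 + (1-\eta^2)^2 - (1-r^2)^2}$, with $\zeta = \eta^2 - r^2$ and $\chi_1 = .67449/\sqrt{N}$; the function name is hypothetical; and the limit of about 2.5 usually quoted for this criterion is an assumption here, since the text does not state it. The figures used are those of the DHH series for ovules, with 5019 pods (test A) and 565 plants (test B).

```python
from math import sqrt


def blakeman_criterion(r, eta, n):
    """Blakeman's criterion for linearity of regression.

    r   : correlation coefficient
    eta : correlation ratio
    n   : number of observations taken as N
    """
    zeta = eta ** 2 - r ** 2
    chi_1 = 0.67449 / sqrt(n)
    denominator = sqrt(1.0 + (1.0 - eta ** 2) ** 2 - (1.0 - r ** 2) ** 2)
    return sqrt(zeta) / (2.0 * chi_1) / denominator


ASSUMED_LIMIT = 2.5   # conventional limit, not stated in the text

test_a = blakeman_criterion(0.0445, 0.0788, 5019)   # N = pods examined
test_b = blakeman_criterion(0.0445, 0.0788, 565)    # N = plants (seeds planted)
for label, value in (("A", test_a), ("B", test_b)):
    verdict = "non-linear" if value > ASSUMED_LIMIT else "no evidence against linearity"
    print(f"test {label}: {value:.3f} ({verdict})")
```

Since r and η are unchanged between the two tests, the two values differ only by the factor $\sqrt{565/5019}$, which is why the choice of N decides the verdict.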
If this be done, as it is in test B, there is found very little evidence for non-linear regression. Certainly, one cannot possibly assert that the low value of r, which is seen throughout these experiments, is due to the number of ovules (seeds) per pod at first becoming larger and then decreasing after a maximum is reached as one passes from the lowest to the highest grade of seed weight.

The results of Table I are also shown graphically in Diagram 2. Here the relationships for weight of seed planted and number of pods on the plant developing are also indicated as a basis of comparison. The values of both $r_{wo}$ and $r_{ws}$ are in general conspicuously lower than the low values of $r_{wp}$. But very few of them drop below the zero bar; one is forced to the conclusion that there is a distinct though very slight correlation between weight and ovules and between weight and seeds.

* See K. Pearson, Biometrika, Vol. VIII. pp. 254-256, 1911.

Consider in somewhat greater detail the signs and magnitudes of these correlations*. Of the 26 values of $r_{wo}$ only 4 are negative. The mean value of the 22 positive coefficients is +.0673; the mean of the 4 negative is -.0236; the mean of all (regarding signs) is +.0533. For the relationship between weight of seed planted and number of seeds matured per pod, $r_{ws}$, 21 constants are positive and 5 are negative. The mean of the positive coefficients is +.0502; the mean of the negative values is -.0303; for all 26 correlations the mean (regarding signs) is +.0348. Thus both correlations are (as is clear from the diagrams) unquestionably positive but very low. Apparently the relationship for weight and ovules is slightly closer than that for weight and seeds per pod, but the difference is too slight to justify any final conclusion.
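The counts and means just quoted can be recomputed from the three-decimal coefficients of Table I, leaving out the LL, LG and GG series. The sketch below is only a check on the arithmetic: the lists are the values as read from Table I in the order GGH to FSDD, the names are hypothetical, and small differences from the printed means are to be expected because the printed means were presumably formed from unrounded coefficients. The entry printed as -.000 is kept as a negative zero so that it is counted with the negative values.

```python
from math import copysign

# Coefficients for the 26 series GGH ... FSDD, as read from Table I.
R_WO = [0.045, 0.093, -0.022, 0.107, 0.044, 0.101, 0.010, -0.000, -0.044,
        -0.029, 0.098, 0.044, 0.075, 0.045, 0.059, 0.023, 0.032, 0.112,
        0.127, 0.238, 0.047, 0.021, 0.049, 0.015, 0.060, 0.037]
R_WS = [0.024, 0.063, -0.029, 0.071, 0.079, 0.089, 0.007, 0.016, -0.049,
        -0.010, 0.050, 0.046, 0.076, 0.011, 0.031, 0.041, 0.037, 0.011,
        0.071, 0.131, 0.089, 0.026, -0.045, 0.040, -0.019, 0.047]


def is_negative(value):
    """True for negative values, including the negative zero written -0.000."""
    return copysign(1.0, value) < 0


def summarise(values):
    """Count the signs and form the means of the positive, negative and all values."""
    negative = [v for v in values if is_negative(v)]
    positive = [v for v in values if not is_negative(v)]
    return {
        "n_positive": len(positive),
        "n_negative": len(negative),
        "mean_positive": sum(positive) / len(positive),
        "mean_negative": sum(negative) / len(negative),
        "mean_all": sum(values) / len(values),
    }


summary_wo = summarise(R_WO)
summary_ws = summarise(R_WS)
for name, summary in (("r_wo", summary_wo), ("r_ws", summary_ws)):
    print(name, {key: round(val, 4) for key, val in summary.items()})
```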
Consider now the question whether the observed correlations ry, Ts are to be regarded as direct biological relationships between the two variables w and o or w and s, or whether they are to be looked upon as merely necessary resultants of other interdependences. At present, the only other demonstrated correlation which might tend to bring about sensible values of 7», and rys is that between number of pods per plant and number of ovules formed and number of seeds developing per pod. Since number of pods per plant is known to be correlated with weight of seed planted, while both number of ovules and number of seeds per pod are correlated with number of pods per. plant, some correlation must be expected between weight planted and number of ovules and seeds per pod. If now the observed values of 7%. and r,s which are always small, are merely the necessary resultant of the relationships 7), 7po, ps, one would expect the partial correlation coefficients, ,?0, pws, to be sensibly zero. If these partial correlations are not sensibly zero, it can only mean that there is a direct (causal) relationship other than the one just considered between number of ovules (or seeds) and the weight of the seed planted. The partial correlations and the correlations are shown side by side in Diagrams 3 and 4. The lowering of the degree of interdependence between both weight and ovules and weight and seeds by the correction for number of pods per plant is clearly marked. In a number of cases in which the correlation coefficient is positive the partial correlation coefficient is negative. Thus only 4 of the 26 values of 7. are negative, while 9 of the partial cor- relation coefficients have the minus sign. In only 5 cases is ry, negative, but in 11 of the series, the sign of 7s is negative. The mean values of the partial cor- relations are very close indeed to zero. Thus j7y.='0186 as compared with Two = 0533; pFws= "0099 as against 7, = 0348. * T have already shown (Science, N. 8. Vol. 
xxxviii. pp. 345-346, 1913) that the LL, LG and GG series are open to question because of the lack of certain precautions in the cultures; while they are included in the table of fundamental constants to avoid any possible criticism of selection of series they will be left out of account in the following discussions.

[DIAGRAM 1. Regression of ovules per pod and of seeds per pod on weight of seed planted. The six upper lines are for ovules, the six lower for seeds. Horizontal axis: weight of seeds planted in working units. Vertical axis: mean number of ovules or seeds.]

[DIAGRAM 2. Comparison of the known correlations for weight of seeds planted and characteristics of individuals developing. Vertical axis: values of correlation coefficients. The key to the line styles is not legible in this copy.]

[DIAGRAM 3. Comparison of correlation of weight of seed planted and ovules per pod with the partial correlation for constant number of pods per plant. Broken lines and solid lines distinguish the two coefficients. Vertical axis: value of coefficients.]

[DIAGRAM 4. Comparison of correlation of weight of seed planted and number of seeds per pod with the partial correlation for constant number of pods per plant. Broken lines and solid lines distinguish the two coefficients. Vertical axis: value of coefficients.]

[Table, printed sideways in the source; the figures are not recoverable from this copy. Columns: total pods, total ovules and total seeds for each series. Rows: weight class of seed planted.]
[Table continued, printed sideways in the source; the figures are not recoverable from this copy. Columns: total pods, total ovules and total seeds for each series. Rows: weight class of seed planted.]

... ...

[Last surviving row of the first part of the following table, row label lost:]
1 | 3 | 10 | 19 | 29 | 35 | 67 | 54 | 92 | 51 | 74 | 56 | 68 | 59 | 85 | 61 | 72 | 50

Length in Microns (continued).

Strain | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | Totals | Remarks
Mzimba (Donkey) (a) | 2 | 2 | 2 | [remaining entries illegible] | 500 | R. S. Proc. Vol. 87, B, p. 31. Rats only.
Human, Native Woman (b) | 49 | 27 | 23 | 13 | 7 | 1 | 1 | - | - | - | 1220 | R. S. Proc. Vol. 85, B, p. 427. Various hosts.
Human, mixed (c) | 132 | 125 | 90 | 59 | 30 | 13 | 8 | 2 | 2 | - | 3600 | R. S. Proc. Vol. 86, B, p. 301. Read from diagram. Rats.
T. brucei | 27 | 26 | 18 | 11 | 4 | 4 | - | - | 2 | - | 1000 | R. S. Proc. Vol. 84, B, p. 331. Read from diagram.
T. rhodesiense | [entries illegible] | 1000 | R. S. Proc. Vol. 85, B, p. 227. Various hosts.

To further establish our point let us compare the Human strain (c) for 3600 trypanosomes with the T. rhodesiense. Here $\chi^2 = 325.47$, leading to P < .000,000,01. In other words the great degree of divergence for the case of the Nyasaland native woman is exceeded at least a thousand times, when we take the big example of four natives and one European.

Sir David Bruce and his colleagues write of these strains:

"(1) The trypanosome of the human trypanosome disease of Nyasaland is T. rhodesiense (Stephens and Fantham)." In other words the P = .000,01 is interpreted as sameness.

"(2) This is a distinct species, nearly related to T. brucei and T. gambiense, but more closely resembling the former than the latter." In other words they at this date distinguished between T. brucei and T. rhodesiense*, and as a result of this distinction proposed to call the human trypanosome disease of North-east Rhodesia and Nyasaland by the name "Kaodzera" as not being identical with the sleeping sickness of Uganda and the West Coast of Africa.

* R. S. Proc. Vol. 85, B, p. 433, 1912.

If we, however, compare T. brucei and T. rhodesiense we find $\chi^2 = 46.83$ and P = .019. In other words once in about 50 trials we might expect to get two samples from the same population as divergent or more divergent than the distributions found for T. brucei and T. rhodesiense. We have in fact in the cases of these two trypanosomes reached our first instance of comparative sameness, and the statistics should have shown Sir David Bruce and his colleagues that T. brucei and T.
rhodesiense were relatively the same, and though both differed from the human trypanosome of Nyasaland widely, the approach to T. rhodesiense was only slightly closer.

The accordance, speaking in a relative sense, of T. rhodesiense and T. brucei was asserted by Stephens and Fantham in March, 1912*. In May, 1912, Bruce and others, speaking of the T. rhodesiense, term it a distinct species; in February, 1913, they say, although without publishing further frequency distributions, that "There is some reason for the belief that T. rhodesiense and T. brucei (Plimmer and Bradford) are one and the same species,"† and in a further paper of the same month, "Evidence is accumulating that T. rhodesiense and T. brucei (Plimmer and Bradford) are identical‡." In May, 1913 (R. S. Proc. Vol. 87, B, p. 34), we are told that the Mzimba strain is identical with the wild-game strain and that "it has already been concluded that this species is T. brucei vel T. rhodesiense." As far as the statistics of the subject go the only really weighty evidence for the identity is that of 1912, on which, without statistical analysis, the distinction between the two species was asserted.

(c) We will next consider the possible identification of T. gambiense with T. rhodesiense and with T. brucei. The second identification is suggested by Sir D. Bruce and others in the words§: "Whether these slight differences are fundamental or only accidental it is impossible at present to say, but enough has been written to show that Trypanosoma gambiense and Trypanosoma brucei approach each other very closely in shape and size."

The following table|| provides the data for T. gambiense to be compared with the distribution of T. rhodesiense ranging from 12 to 39 in the last table.

[Table: frequency distribution of length in microns for T. gambiense, total 1000, from a variety of hosts. The individual frequencies are not legible in this copy.]

For the 28 classes we have $\chi^2 = 140.27$ and P < .000,000,1. The chief point therefore is the complete divergence, not the resemblance of the two series.

* R. S. Proc. Vol. 85, p. 238, 1912.
† R. S. Proc. Vol. 86, B, p. 407.
‡ R. S. Proc. Vol. 86, B, p. 302.
§ R. S. Proc. Vol. 84, B, p. 332.
|| R. S. Proc. Vol. 84, B, p. 330.

Stephens and Fantham, who term their work a "biometric study," speak of "the general resemblance between the curves representing the measurements of these three trypanosomes (T. gambiense, T. rhodesiense, T. brucei)." They continue: "We do not consider, however, that identity of measurement would necessarily imply identity of species. We still believe that the difference in internal morphology, namely the presence of the posterior nucleus, is sufficient to separate T. rhodesiense both from T. gambiense and T. brucei*." As a matter of fact the "biometric study" of the data does not indicate identity in the measurements, but confirms the result of internal morphology by proclaiming wide differentiation†.

(d) We can now compare T. brucei and T. gambiense. Of these Sir David Bruce writes: "Whether these slight differences are fundamental or only accidental it is impossible at present to say, but enough has been written to show that Trypanosoma gambiense and Trypanosoma brucei approach each other very closely in size and shape‡." The biometric commentary on this is that for length of the two series $\chi^2 = 126.52$, giving P < .000,000,1, and that as far as size is concerned the samples differ immeasurably, i.e. far beyond the limits of the calculated tables of P.

We should thus conclude, merely from the statistical evidence, for close sameness in T. brucei and T. rhodesiense but for marked divergence of both from T. gambiense.

* R. S. Proc. Vol. 85, B, p. 233.
† In a later section of this memoir I show that Stephens and Fantham have been markedly biased in their judgment of even and odd units of measurement (p. 129 below), and that the recognition of this makes a wide difference in the goodness of fit of my resolution into components to their data for T. rhodesiense. It seems desirable therefore to inquire whether this bias affects the test of "sameness" of T. rhodesiense with T. gambiense, T. brucei, and the Human strains (b) and (c), see the Tables pp. 95-6. The data were accordingly classified into groups of two microns, starting with 12 and 13, 14 and 15, etc., so as to get rid of the even bias as far as possible, and we find:

Strains compared | Old Unit Ranges: n', $\chi^2$, P | New Two Unit Ranges: n', $\chi^2$, P
T. rhodesiense and T. gambiense | 28, 140.27, < .000,000,1 | 14, 118.73, < .000,000,1
T. rhodesiense and T. brucei | 28, 46.83, .019 | 14, 25.76, .018
T. rhodesiense and Human strain (b) | 28, 69.95, .000,01 | 14, 45.92, .000,06
T. rhodesiense and Human strain (c) | 28, 325.47, < .000,000,01 | 14, 253.37, < .000,000,01

The bias towards even numbers of Stephens and Fantham has thus not substantially influenced our results, which still show the relative likeness of T. rhodesiense and T. brucei, and the marked divergence of the former from T. gambiense and the human strains.

‡ R. S. Proc. Vol. 84, B, p. 332.

(e) It seemed well worth while to investigate how far the two Nyasaland strains of Human Trypanosomes given in the table on p. 95 agree or differ. The first (b) of these strains from a native woman of Nyasaland may be compared with (c) a compound strain from four natives and a European. We find $\chi^2 = 172.36$, giving P < .000,000,1. In other words, the two Nyasaland strains from human beings are indefinitely differentiated.

I now compare the Mzimba (Donkey) strain* (a) with human strains (b) and (c), we find: for (a) and (b) $\chi^2$
= 223.16, giving P certainly < .000,000,01; for (a) and (c) $\chi^2 = 348.55$, with P < .000,000,01. Thus the trypanosome strain found in the donkey appears to be absolutely incomparable with that found in man in Nyasaland, just as the strain found in the donkey differed from that found in wild-game.

(f) We may now turn to a memoir† by Sir David Bruce and others comparing the Mvera cattle strain, the wild-game strain, and the wild Glossina morsitans strain. They give on p. 18 of that paper the graphs for 500 specimens of T. pecorum, the wild-game strain, and of the wild Glossina morsitans strain taken from a variety of hosts. The following are the frequencies:

Microns.

Strain | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | Totals
Mvera Cattle Strain | [row garbled in this copy; legible fragments: 15, 64, 101, 186, 114, 59] | 500
Wild-Game Strain | - | - | 2 | 34 | 85 | 172 | 119 | 63 | 22 | 3 | - | 500
Wild G. morsitans Strain | 1 | 4 | 16 | 42 | 129 | 147 | 103 | 42 | 15 | 1 | - | 500

We compare first Mvera cattle strain with the wild-game strain and find for our 10 categories $\chi^2 = 34.554$, P = .000,243. This is a relatively low degree of divergence considering that P has been running into 1 in 10,000,000! But it means that if these two strains were samples of one and the same population, we should only expect two such divergent samples to occur 1 in 4000 trials.

* This Mzimba strain of trypanosome is discussed in a paper headed: "Morphology of the various strains of Trypanosome causing Disease in Man in Nyasaland. The Mzimba Strain" (R. S. Proc. Vol. 87, B, p. 26); it is said to be of the Nagana type and is identified by Sir David Bruce and colleagues with T. brucei vel rhodesiense, the source of the human trypanosome disease.
† R. S. Proc. Vol. 87, B, p. 4.

Next we find for Mvera cattle strain and the wild Glossina morsitans strain, $\chi^2 = 40.508$, or P = .000,008, or only once in 125,000 trials would a pair of samples so divergent arise when testing the same material.
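The comparisons in this section all rest on one calculation, which can be sketched for the two rows of the table above that are fully legible (wild-game and wild G. morsitans). The sketch assumes the usual two-sample form of the statistic, $\chi^2 = \sum N N' (f/N - f'/N')^2 / (f + f')$ summed over the length classes; the function names are illustrative, and the number of degrees of freedom to be used with the tail probability is an assumption, since the text states only the number of categories.

```python
import math


def two_sample_chi2(f, g):
    """Chi-square for two frequency series recorded over the same classes.

    Classes that are empty in both series contribute nothing and are skipped.
    """
    n, m = sum(f), sum(g)
    total = 0.0
    for a, b in zip(f, g):
        if a + b > 0:
            total += n * m * (a / n - b / m) ** 2 / (a + b)
    return total


def chi2_tail(x, df):
    """P(chi-square with integer df exceeds x), from the finite series."""
    half = x / 2.0
    if df % 2 == 0:
        term, acc = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            acc += term
        return math.exp(-half) * acc
    root = math.sqrt(x)
    acc, term = 0.0, root
    for k in range(1, (df - 1) // 2 + 1):
        if k > 1:
            term *= x / (2 * k - 1)
        acc += term
    phi = math.exp(-half) / math.sqrt(2.0 * math.pi)
    return math.erfc(root / math.sqrt(2.0)) + 2.0 * phi * acc


# Frequencies for lengths 9 to 18 microns, as read from the table above.
wild_game = [0, 0, 2, 34, 85, 172, 119, 63, 22, 3]
g_morsitans = [1, 4, 16, 42, 129, 147, 103, 42, 15, 1]

chi2 = two_sample_chi2(wild_game, g_morsitans)
print(f"chi-square = {chi2:.2f}")

# The choice of degrees of freedom is an assumption; several are shown.
for df in (9, 10, 11):
    print(f"df = {df}: P = {chi2_tail(chi2, df):.6f}")
```

With these frequencies the statistic comes to about 35.41, the value the text reports for this pair. The tail probability is sensitive to the degrees of freedom adopted, changing by a factor of roughly four between 9 and 11, so any P obtained this way should be read against the convention of the tables used.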
Lastly, testing the resemblances of wild-game strain and wild G. morsitans strain, we find $\chi^2 = 35.41$, or P = .000,2, not such a gigantic divergency as we have found in many cases, but a difference so great that it only occurs once in 5000 trials requires explanation as divergency and cannot be used as an argument for "sameness."

It will thus be quite clear that as far as the measurements of length go, there is wide divergence to be accounted for between the trypanosomes found in the cattle, the wild-game and the tsetse fly, and that statistically this divergence is the remarkable feature. Yet the conclusion of Sir David Bruce and his colleagues, arguing very largely from the frequency distributions, is that "The Mvera cattle strain, the wild-game strain and the wild G. morsitans strain belong to the same species of trypanosome, T. pecorum*."

It will be seen that actual statistical analysis does not in any way confirm the bulk of the conclusions reached by Sir David Bruce and his collaborators. The strains may or may not be ultimately of like origin, but what is quite clear from the analysis is that, if we are to rely on the measurements, then it is the divergence, not the sameness of these strains, which should have been emphasised. No stronger evidence could be deduced of the danger of appeal to statistics when the statistics are not handled by the trained statistician. The mere appeal to the resemblance of frequency curves given in the form of percentages, often based on widely different totals, is an only too common error of medical investigations; it is by no means confined to the Scientific Commission of the Royal Society, Nyasaland. But it has recently become so marked a feature of Series B of the Proceedings of the Royal Society, that a vigorous protest is really needful. Thus in the very last part issued (Vol. 87, B, p.
89) occurs a paper on "The Trypanosomes causing Dourine." In this paper there may be microscopic evidence to differentiate the strains A, B and C dealt with; on that I cannot express an opinion. But on pp. 92-3 percentage frequency curves are drawn for the three strains, and the following remark is made:

"A survey of the curves obtained by plotting out in percentages the various lengths of trypanosomes encountered in each of the three strains is of interest. It will be observed that in the case of rats the curves of each of the strains correspond fairly closely."

* A further conclusion is also reached (Ibid. p. 26): "T. pecorum, Nyasaland, is identical with the species found and described in Uganda." Unfortunately the species found in Uganda is dealt with in a paper (R. S. Proc. Vol. 82, B, p. 468) which provides no frequency distributions, and does not tell us the total number on which the mean length, 13.3 microns, is based. The mean value of the T. pecorum, Nyasaland, is 13.954 (R. S. Proc. Vol. 87, B, p. 3) and the standard deviation is 1.393 in microns, thus the probable error of the mean is .67449 × .0623. Assuming the Uganda trypanosome to be the same strain and to have the same variability as the T. pecorum, Nyasaland, the difference of the means = .654, with a probable error of .67449 × √2 × .0623 = .67449 × .088, thus the deviation of the means is 7.43 times its standard deviation. A deviation so great would only occur about once in 4 × 10^[exponent illegible] trials, i.e., would be practically impossible if the two strains were identical. Here again it is excessive divergence not sameness which the statistics indicate.

Now what do the authors mean by "fairly closely"? In their conclusions they identify B and C and differentiate A. Unfortunately they have not given their actual frequencies, and I have had to endeavour to reconstruct them from the percentage curves. There results for the rat-data:

Microns.

Strain | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | Totals
Berlin Strain A | 1 | 1 | 10 | 9 | 12 | 17 | 17 | 22 | 28 | 48 | 47 | 57 | 55 | 42 | 39 | 37 | 28 | 13 | 8 | 6 | 3 | 500
Frankfurt Strain B | - | - | 1 | 3 | 5 | 1 | 4 | 10 | 20 | 22* | 18* | 25 | 24 | 35* | 23 | 15 | 18 | 15 | 8 | 3 | - | 250
East Prussian Strain C | - | 1 | 4 | 3 | 6 | 12 | 15 | 22 | 24 | 27 | 28 | 37 | 31 | 16 | 10 | 7 | 5 | 2 | - | - | - | 250

We obtain the following results:

Strains A and B: $\chi^2 = 31.11$, P = .0627,
Strains A and C: $\chi^2 = 43.37$, P = .0034,
Strains B and C: $\chi^2 = 72.72$, P < .000,001.

Thus to judge from rats only, B and C are far more divergent from each other than either is from A; in other words the strain A is intermediate between B and C and closer to B, from which it is not immensely divergent; two such samples as A and B might, as far as the length distributions go, be drawn from common material once in 16 trials. Now of course no one suggests that a conclusion drawn from this rat-material is to replace one drawn from guinea-pig material, but the statistician cannot agree that for rats "the strains correspond very closely"; and he finds it illogical to place the evidence of the rat-data on one side and proceed to draw conclusions from the ocular inspection of the guinea-pig curves, without noticing that the conclusion is markedly opposed to the proper deduction from rat-data. Indeed while the guinea-pig-data† give a relatively high degree of relationship between B and C (P = .0157) it is not as high as the rats give between A and B (P = .0627); and while the relationships of A and B (P < .000,000,1) and A and C (P < .000,001) are very low, the origin of the second hump in the guinea-pig distribution for A requires much more analysis and the certainty by control experiments, that it always repeats itself, and is not the result of hitting a "pocket."

* The values given by the percentage graphs in these cases are respectively 21, 17 and 34, and the total appears to be 247 and not 250 as stated. Either 247 were used or the graph is in error. The three individuals were introduced in a way calculated not to increase divergence.
† The frequency distributions for the guinea-pigs have had to be reconstructed from the percentage curves, the necessary data not being published by the authors.

It seems to me that any statistical analysis by modern methods of the trypanosome data compels us to confess that either statistical methods must be discarded entirely in these trypanosome investigations, or they must be pushed to their logical conclusion, and used as the fundamental instrument of research which can guide our enquiries by inference and suggestion when, and when only, it is handled by the trained craftsman. Thus far the use made of statistical methods seems merely to have confused the issues, and brave would be the man who would venture to say after reading this section of our present paper that any two strains discussed by the commission are definitely "same" or certainly differentiated.

(5) On the Probability that the Animal in which the Trypanosome is cultivated makes essential Differences in the Distributions of Frequency.

But the very method which casts apparent discredit on the results at present reached seems able to lead us to definite conclusions provided we start with it as the fundamental mode of investigation. Really very little inspection seems to indicate that not only the host but the period of infection materially influences the frequency distribution. These points have not been wholly disregarded by the investigators in this field, but they have had no quantitative measure by which they could appreciate the relative influence of the various environmental factors.
Nor indeed could the method be fully applied without experimental observations on trypanosomes of the same strain subjected to differential treatment. Knowing in such cases the quantitative divergence produced, we should be in a position to infer whether two strains from different sources were separate species or merely modified by differential environment. Until we have such quantitative measure no hypothesis of sameness or difference can flow from statistical treatment; nobody as yet knows how much to attribute to environment, how much to attribute to individuality of strain.

In endeavouring to throw light on this matter we are, however, checked at the very start by the absence of effective material. In some cases the period of infectivity is not given; in others we are not always able to break up the total frequency by reference to the host, or to a single host. And even when we merely classify by one type of animal as host, we may have reduced our material to such small numbers that samples may be "same," which on larger numbers would show the marked divergence due to the emphasis of smaller differences*. Some suggestive points can, however, be effectively dealt with and they are treated in the following paragraphs.

* It may not be possible to differentiate Bavarian from Würtemberger on samples of 50 crania, although quite possible on samples of 400.

(a) I ask what difference is made when a strain is passed through various animals (goat, monkey, dog, rat) or through a single animal alone. Taking the wild-game strain discussed by Sir David Bruce and others*, we have:

[Table: length in microns, 10 to 18, with totals, for the Wild-Game Strain (from various animals) and the Wild-Game Strain (from a single rat 510). The frequencies are not legible in this copy.]

Here we find $\chi^2 = 65.37$ and P < .000,000,1.
In other words the distribution of lengths of the trypanosomes of the wild-game strain obtained from various animals differs so enormously from that obtained from a single rat that the two cannot be looked upon as samples of the same population. The moment this result is realised we appreciate that (i) it is impossible to compare two strains developed in a variety of animals unless we have previously tested on the same strain the equal valency of these animals, (ii) a series of animals of even the same species may quite possibly give widely divergent results from those obtained for a single animal. Thus passing from a variety of animals in wild-game strain to a variety in wild G. morsitans strain makes less difference (P = .000,008), although great enough, than passing from a variety of hosts to a single rat in the wild-game strain. This rule is not universal, but it illustrates the absolutely essential need for testing the effect of change of host before questioning the identity or non-identity of two strains.

* R. S. Proc. Vol. 87, B, pp. 6 and 8.

(b) I now turn to the Mvera cattle strain, and ask what differentiation is produced by the dog and goat as hosts. The data are very sparse and unless we get a high degree of resemblance may be worth little. They run†:

Microns.

[Table: length in microns, 9 to 17, with totals, for Mvera Cattle, Goat (total 100) and Mvera Cattle, Dog (total 100). The individual frequencies are not legible in this copy.]

We have $\chi^2 = 5.396$, leading to P = .714, or in 71 pairs of samples out of 100 from a homogeneous population, we should get more divergent results. It follows therefore that, as far as these small series of this strain go, goat and dog are interchangeable as hosts.

† R. S. Proc. Vol. 87, B, p. 3. I tested the relative interchangeability of goat and sheep in the case of T. caprae. The data are as follows (R. S. Proc. Vol. 86, B, p. 280), some entries being doubtful in this copy:

Microns.

T. caprae | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | Totals
Goat | - | - | 8 | 7 | 11 | 35 | 43 | 50 | 38 | 28 | 27 | 17 | 5 | 1 | - | 260
Sheep | - | - | 1 | 10 | 12 | 29 | 39 | 31 | 28 | 20 | 5 | 3 | [?] | 1 | 180

leading to $\chi^2 = 18.088$ and P = .1133, or the resemblance is considerable although not so great as we find between goat and dog for the Mvera cattle strain.

Let us go a stage further and ask whether ox is interchangeable with goat and dog. The following is the frequency distribution for the trypanosomes through the ox:

Microns.

[Table: length in microns, 9 to 18, with totals, for Mvera Cattle, Ox (total 180). Legible entries: 33, 44, 49, 7, 1; the remainder are garbled in this copy.]

Compared with the goat strain, this gives $\chi^2 = 9.559$ and P = .3888, and compared with the dog strain $\chi^2 = 9.461$ and P = .3973. Thus in about two out of five trials from a same population we should get pairs of samples differing more than the dog and goat strains do from the ox strain. We conclude that while for practical purposes dog, goat and ox strains in the Mvera cattle trypanosomes are interchangeable, yet the dog and goat strain are nearly twice as much alike as the ox strain is to either.

Lastly, although it is rather a rash proceeding, I compare rat with goat and dog. It is rash because only 40 trypanosomes through the rat were measured, and this is wholly inadequate for real determination. The frequencies for the lengths are:

Microns.

Strain | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | Totals
Mvera Cattle, Rat | [entries illegible in this copy] | 40
Mvera Cattle, Dog and Goat | 1 | 1 | 6 | 25 | 49 | 56 | 40 | 21 | 1 | 200

We find $\chi^2 = 21.329$ and P = .0064. The small series of rat trypanosomes probably accounts for no smaller value of P, but the odds of 155 to 1 are sufficient to show that rat series must not be mixed with series from the goat, dog or ox.
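The verbal renderings of P used throughout ("once in 16 trials", "odds of 155 to 1") follow directly from the probability itself. As a small illustration, with helper names that are purely illustrative, the conversion can be sketched as:

```python
def once_in(p):
    """Expected number of trials per occurrence of an event of probability p."""
    return 1.0 / p


def odds_against(p):
    """Odds against an event of probability p, expressed as 'x to 1'."""
    return (1.0 - p) / p


# Values of P quoted in the text for the Mvera cattle strain comparisons.
comparisons = [
    ("rat against dog and goat", 0.0064),
    ("ox against goat", 0.3888),
    ("goat against dog", 0.714),
]

for label, p in comparisons:
    print(f"{label}: P = {p}, once in {once_in(p):.1f} trials, "
          f"odds against {odds_against(p):.1f} to 1")
```

For P = .0064 the odds against come to a little over 155 to 1, the figure given above, and P = .0627 corresponds to about once in 16 trials.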
This confirms the view obtained for the wild-game strain, that a strain taken through the rat as host is incomparable with strains from other animals.

(c) The totals considered for one species of host in (a) and (b) are rather small. Larger numbers are forthcoming for the so-called Mzimba strain of trypanosomes taken from a donkey at Mzimba. The frequencies are here*:

Microns.

[Table: length in microns, 16 to 32, with totals, for the Mzimba Strain, Dog (total 360) and the Mzimba Strain, Rat (total 500). The rows are partly garbled in this copy. Legible run for the dog: 3, 17, 56, 69, 67, 47, 27, 22, 10, 12, 7, 4, 4, 4, 2, 1. Legible fragments for the rat: 2, 41, 91, 79, 56, 53, 38, 39, 22, 19, 16, 15, 9, 2, 2.]

We find $\chi^2 = 25.499$ and P = .0619. Thus only about once in 16 trials should we get such a degree of divergence as the two samples present, drawing them from the same population. This is very far from such a divergence as we have noted in the rat and dog for the Mvera cattle strain, or in the case of rat against other animals in the wild-game strain, which was extremely large. The only explanations that occur to me here are:

(i) In the case of the wild-game strain and the Mvera cattle strain a single rat seems to have provided all the trypanosomes, while in the case of the Mzimba strain two rats were used; this might lessen the influence of individuality.

(ii) In the case of the Mvera cattle strain and the wild-game strain the trypanosomes were ultimately taken from a great number of individuals. In the Mvera cattle case we are told that 32% of the herd were affected, and we have some details of 16 head of cattle and 5 donkeys naturally infected†. In the wild-game case, the wild game affected were very numerous, covering cases of eland, reedbuck, waterbuck, bushbuck, oribi, koodoo, hartebeeste, buffalo and hyaena. Now can we start with the hypothesis that all the individual cattle and all the individual wild game were each bitten by a fly carrying the same strain of trypanosome?
Have we any more right to suppose a priori that one wild-game strain of trypanosome and one cattle strain of trypanosome exist, and ask whether these two are identical, than to ask whether the strains carried by hyaena and hartebeeste are the same? We have already (p. 93) seen that the strains from two hartebeeste are extremely divergent. What right have we a priori to classify all wild-game trypanosomes together and call them a wild-game strain? And if two antelopes, whether of the same or of different species, give widely different results, why are the trypanosomes of oxen of the same herd or donkeys and oxen from the same neighbourhood to be classed a priori as of one species?

* R. S. Proc. Vol. 87, B, p. 31.
† R. S. Proc. Vol. 87, B, p. 15.

If we turn to the Mvera cattle, we find there were four sources of trypanosomes for the ox, two for the goat, and the same two for the dog, these two sources being two of the four cattle sources. There was only one source for the rat, but I have not discovered how far it was identical with one of those for ox or goat*. In the Mzimba donkey strain there was one source for dog and rat. In the wild-game strain there were, I make out, eight sources of trypanosomes for the goat, four for the dog, and only one for the rat. Thus the individuality, which might be supposed to influence the result, because we are treating of trypanosomes in this case from a single rat, in the Mvera cattle case from a single rat, and in the Mzimba data from only two rats, may really arise from the fact that the rat strains in each case are derived from a single source, while the dog, goat and ox strains show a multiplicity of sources.

The troublesome point is that the experimental part of the work has not been designed to answer what seem to me fundamental questions. We cannot directly inquire what difference the host makes because different hosts have rarely been treated with the strain from a unique source.
We can say that dog and goat are interchangeable for the Mvera cattle strain, because both drew trypanosomes from the same two sources; but we cannot determine whether the difference in the ox is due to difference of the host, or to the introduction of two more sources. Similarly the divergence between the trypanosomes from rat and from other animals for the wild-game strain may be due to using one rat and therefore one source, and not the many sources of the other animals, or it may really be due to the differentiation of the host. In the same way the difference between the two hartebeeste may be due to individuality in the same species, or to infection from different strains.

(d) To some slight extent we may appreciate the effect of individuality by comparing the two rats 512 and 513 in the case of the single source, the Mzimba strain‡. The frequencies are as follows:

[Table: Mzimba Strain, Rat 512 and Rat 513; lengths from 16 microns upward; totals 240 and 260. The individual frequencies are not legible in this copy.]

The numbers are not as large as we should like; but they give $\chi^2 = 17.89$, $P = .3306$.

* R. S. Proc. Vol. 87, B, pp. 2 and 15.
† R. S. Proc. Vol. 87, B, pp. 6 and 8 compared with 5. Rat from p. 8.
‡ R. S. Proc. Vol. 87, B, pp. 29 and 31.

Clearly then two samples as divergent as those found would occur on the average once in three trials. It follows that two individual rats are really interchangeable, and we note that the extent to which ox is interchangeable with dog or goat for the cattle strain is very much the degree in which two rats are interchangeable. To judge from this single instance, individuality within the same species of host is not very important, and when we find two hartebeeste differing as those considered on p.
93, it seems much more likely, with the information we have at present, that the hartebeeste were infected with different strains of trypanosome than that their individuality produced the enormous divergence noted. Again the sensible divergence between Mzimba strain in dog and rat on p. 104 is probably due to difference of host, but the enormous difference in the wild-game strain between a single rat and dog and goat on p. 103 is probably due to differences in the strains of trypanosomes in the various types of wild game dealt with.

We may consider whether the dog and goat data for the wild-game strain differ sensibly. We have*

| Microns | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | Totals |
| Wild-Game Strain, Goat | 1 | 16 | 37 | 73 | 38 | 26 | 8 | 1 | 200 |
| Wild-Game Strain, Dog | - | 12 | 31 | 57 | 50 | 24 | 6 | - | 180 |

Here $\chi^2 = 6.04$ and $P = .5378$. Thus in more than half the trials we should obtain from homogeneous material pairs of samples more divergent than those for dog and goat. This confirms the view formerly expressed that as far as trypanosomes are concerned dog and goat are interchangeable. We cannot yet say that they are not interchangeable with the rat, as the mixture of strains in dog and goat and the uniqueness of strain in the rat may account for the marked divergence of the latter.

Sir David Bruce and his colleagues do not appear to have noticed the wide divergence of the distribution of the rat from the dog and goat either as indicating the heterogeneity of the wild-game and the cattle strains of trypanosomes, or as suggesting such wide differentiation of strain by the host, that rat-material cannot be mixed with that from dog and goat. They do, however, remark of the wild-game strain: "In this the rat is not a suitable animal, since many strains of T. pecorum have no effect on it†." This suggests that T. pecorum is not homogeneous and that the rat exercises a selective influence on its strains.
The suggested rejection of the rat data seems, however, to be based upon the inconvenience of its non-infectivity, and not on what might turn out to be of great importance, a selective influence on wild-game or cattle strains. It is not possible to test this selective power in the present instance, as we do not actually know how heterogeneous either the cattle or wild-game material used really was.

* R. S. Proc. Vol. 87, B, p. 7.
† R. S. Proc. Vol. 87, B, p. 7.

(e) If we turn to the T. pecorum strain as actually found in the tsetse fly, we see that Sir David Bruce and his colleagues deal with these trypanosomes passed through a variety of animals, of which only goat and dog supply sufficient numbers for any even approximately accurate treatment. The data are as follows*:

| Microns | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | Totals |
| Wild G. morsitans strain: Goat | 1 | 3 | 12 | 21 | 55 | 60 | 32 | 12 | 4 | - | 200 |
| Wild G. morsitans strain: Dog | - | - | 3 | 14 | 34 | 41 | 40 | 19 | 9 | - | 160 |
| Wild G. morsitans strain: Rat | - | 1 | - | 3 | 22 | 28 | 19 | 6 | 1 | - | 80 |

For goat and dog we find $\chi^2 = 19.518$, which gives $P = .0125$. The resemblance is therefore far less than we have found for goat and dog in other strains; only once in 80 trials from homogeneous material would two samples of such divergent character arise. Before we comment on this it seems desirable to compare the very inadequate rat data. For rat and goat we have $\chi^2 = 12.201$, $P = .1434$. For rat and dog we have $\chi^2 = 11.370$, $P = .1245$. Accordingly we see that for this material the rat strain (i) lies between the dog and goat strains, and (ii) is definitely interchangeable with dog and with goat, while the dog and goat are much more divergent. Now the sparsity here of all the data must prevent any dogmatism; all we can reach is suggestion for further investigation. But the following points should be noted†.
The trypanosomes through the goats were obtained from six different goats, infected directly from the wild fly; the trypanosomes from the dogs were obtained from only four different sources, namely from a monkey directly infected by the wild fly, from a dog directly infected, and from two goats (89 and 125), the former only of which is identical with one of the former six goat sources. Lastly, the rats were infected from one dog alone, upon which the tsetse flies had directly fed. This dog is not identical with one of the dog sources. Now unless we assume that all the strains of the trypanosome found in the tsetse fly are identical, which is certainly not in accordance with the differences found in the strains of wild game from the "fly-country", it is by no means certain that the trypanosomes obtained from wild G. morsitans, through goat, dog and rat as above noted, came from anything like the same sources. Further, the closer resemblance between rat and dog strains may simply be the result of the rat strain having been developed in the dog as host. The divergence between the dog and goat strain may again be solely due to the greater variety of sources in the goat.

* R. S. Proc. Vol. 87, B, p. 11.
† R. S. Proc. Vol. 87, B, pp. 10, 11, and 19 to 22.

The data from the wild G. morsitans experiments seem to indicate that the observed divergences between the strain from rat and the strain from goat or dog may not be due to difference of host, but to difference of source from which the material was drawn, and to difference of treatment of the individual stock of trypanosomes, e.g. the number of hosts, etc., through which it has passed.
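The comparisons in this section all rest on the same statistic, $\chi^2 = \sum N N' (f/N - f'/N')^2 / (f + f')$, summed over the length classes, where $f$, $f'$ are the two frequencies in a class and $N$, $N'$ the two totals. The sketch below is a minimal modern restatement in Python, not the author's own procedure: the function names are illustrative, classes empty in both samples are dropped, and P is taken with one degree of freedom fewer than the number of remaining classes, which is an assumption that agrees with the printed values. It uses the goat and dog rows of the wild G. morsitans table in (e).

```python
import math


def chi2_two_sample(f, g):
    """Chi-square for two frequency rows over the same classes, with its degrees of freedom."""
    n, m = sum(f), sum(g)
    chi2, classes = 0.0, 0
    for a, b in zip(f, g):
        if a + b == 0:  # class empty in both samples contributes nothing
            continue
        classes += 1
        chi2 += n * m * (a / n - b / m) ** 2 / (a + b)
    return chi2, classes - 1


def chi2_upper_tail(x, df):
    """P(chi-square on integer df exceeds x), from the finite closed-form series."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    half = x / 2.0
    if df % 2 == 0:
        # exp(-x/2) * sum_{j < df/2} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # odd df: erfc term plus a finite series in odd powers of sqrt(x)
    root = math.sqrt(x)
    term, total = root, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(root / math.sqrt(2.0)) + math.sqrt(2.0 / math.pi) * math.exp(-half) * total


if __name__ == "__main__":
    goat = [1, 3, 12, 21, 55, 60, 32, 12, 4, 0]  # lengths 9 to 18 microns
    dog = [0, 0, 3, 14, 34, 41, 40, 19, 9, 0]
    chi2, df = chi2_two_sample(goat, dog)
    print(f"chi2 = {chi2:.3f} on {df} degrees of freedom")  # chi2 = 19.518 on 8
    print(f"P = {chi2_upper_tail(chi2, df):.4f}")
```

For this pair the statistic reproduces the printed 19.518; the exact tail probability comes out near .0123, slightly below the printed .0125, which was presumably read from tables.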
It seems absolutely certain that at the present time most light would be thrown on the conditions for asserting sameness or diversity of strains by well devised experiments on strains from single sources passed through different species of hosts in different manners, in order to determine the exact measure of divergence produced by host and by treatment, and ultimately to devise a standard treatment for all strains which we desire to compare. The exact nature not only of host, but of standard treatment is most vital.

We can demonstrate the influence of treatment at once by considering the "percentages of posterior nuclear forms among short and stumpy forms" recorded by Sir David Bruce and his colleagues for the wild-game strain*. All the trypanosomes were from rats, and although the date of infection of the rat is, I think, not stated, the dates of first extraction will be after much the same interval, and we can therefore classify by date from first extraction. We find the following table:

Wild-Game Strains. Percentage of Posterior-Nuclear Forms among Short and Stumpy Forms.

| From first Extraction | 21% and under | 22% and over | Totals |
| 6 days and under | 18 | 6 | 24 |
| 7 days and over | 6 | 18 | 24 |
| Totals | 24 | 24 | 48 |

Using Sheppard's formula for the four-fold table, we have for tetrachoric r

$r = .707$,

or, the correlation between this character of the trypanosome and the time after infection of extraction is very considerable. It will be obvious that in a standardised treatment this time of extraction will play a most important part. But it again is not independent of the species of trypanosome, for if we take the wild Glossina morsitans strains†, we find:

* R. S. Proc. Vol. 86, pp. 396-404, Tables III, VI, IX, XII and XV.
† R. S. Proc. Vol. 86, B, pp. 410-418, Tables III, VI, IX, XII and XV. I have added one percentage by random selection from the complete table by lot in order to give 60 cases, and save labour in fractionising.
Percentage of Posterior-Nuclear Forms among Short and Stumpy Forms.

| From first Extraction | 7% and under | 8% and over | Totals |
| 6 days and under | 12 | 18 | 30 |
| 7 days and over | 18 | 12 | 30 |
| Totals | 30 | 30 | 60 |

leading to $r = -.309$. In other words, using tsetse fly strains and not wild-game strains, but the same host, we find that now the correlation is negative, or the longer the infection the smaller the percentage. Actually the five G. morsitans strains show remarkably irregular results compared with the results for the wild-game strains; the extractions were spread over much the same period, 13 to 14 days on the average, but were somewhat more numerous for the G. morsitans. Thus even the same method of extraction may give widely varying results according to the nature of the strain producing the infection, although the host be the same.

To the statistician who examines the frequency distributions provided by Sir David Bruce and his colleagues for both wild-game strains and Glossina morsitans strains, there can hardly remain a doubt about the heterogeneity of the material in each case. We have already demonstrated this statistically for the wild-game strains. These strains not only differ by immense differences inter se, but intra se they are clearly heterogeneous. Whether this heterogeneity is due to the mixture of separate strains, to dimorphism within the strain, or to the combination of material drawn from the rat at various stages of infection, it is not possible on the material at present available to determine finally. The same remarks apply with even greater certitude to the wild G. morsitans strains than to the wild-game strains. But we shall return to this point in the last section of this paper.

We have already noted that Sir David Bruce and his colleagues identify, against the weight of the statistical evidence, the Mvera cattle strain, the wild-game strain and the wild G. morsitans strain as belonging to the same species T. pecorum*.
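To return for a moment to the two four-fold tables of posterior-nuclear percentages: when both characters are divided at their medians, Sheppard's result reduces to $r = \cos(\pi q)$, where $q$ is the proportion of cases falling in the two "unlike" cells. The sketch below is a minimal illustration under that assumption (median division and an underlying normal surface); the function name is hypothetical, and the cell layout used for the G. morsitans table is a reconstruction from the legible entries and the printed value of r.

```python
import math


def sheppard_r(a, b, c, d):
    """Tetrachoric r for the table [[a, b], [c, d]] when both margins are cut at the median.

    a and d are the 'like' cells (low with low, high with high);
    b and c are the 'unlike' cells.
    """
    total = a + b + c + d
    if total == 0:
        raise ValueError("empty table")
    unlike = (b + c) / total
    return math.cos(math.pi * unlike)


if __name__ == "__main__":
    # Wild-game strains: 18, 6 / 6, 18 (48 cases).
    print(f"{sheppard_r(18, 6, 6, 18):.3f}")
    # Wild G. morsitans strains, cells as reconstructed: 12, 18 / 18, 12 (60 cases).
    print(f"{sheppard_r(12, 18, 18, 12):.3f}")
```

With the reconstructed cells the second table gives the printed value of -.309, which is the main check available on that reconstruction.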
They had previously identified other strains in wild game, G. morsitans and human beings† with T. rhodesiense which they elsewhere describe as vel brucei‡. This is again, I hold, against the weight of statistical evidence. But it is not clear from the memoirs themselves what is the exact process by which an individual fly, an individual human being, or the blood from a specimen of wild game is credited with carrying a homogeneous strain. The sizes are so different in the cases of T. pecorum and T. simiae that there may be no difficulty in distinction, but the range is so great and to the statistician the material seems so heterogeneous in the case of T. brucei vel rhodesiense that, perhaps, a fuller description by the authors of the process of differentiation would aid him. This is of especial importance if it should turn out, as I suspect, that the trypanosomes classed as T. brucei are either dimorphic, or belong to two different species.

* R. S. Proc. Vol. 87, B, p. 26.
† R. S. Proc. Vol. 86, B, p. 42.
‡ R. S. Proc. Vol. 86, B, p. 426.

In another paper* we find the trypanosomes from G. morsitans, on the basis of their infective powers on monkey, goat and dog, resolved into T. brucei vel rhodesiense, T. pecorum, T. simiae and T. caprae. But it is clear that the differentiation was not done solely by infectivity, or there would have been no means of distinguishing T. brucei and T. pecorum, which attack all three: monkey, dog and goat. The question arises, whether T. pecorum, T. simiae and T. caprae being readily identified by microscopic examination or size, the remainder was classed as T. brucei, in which case the question of the heterogeneity of this group, which appears to attack all animals, is rather supported than otherwise by this paper.

Frequencies of the Various Strains for Length.

Length in Microns.

| Strain | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 |
T.
pecorum 2 | 6 | 42/193 | 452] 618/ 453/178] 51] 5 mee T. simiae —{|—|—| — | — | 7) 28) 76) 93) 126) “92)/0 47i\" 221) eG io) ae ee T. caprae —;/—-—|;-|]—- 1}— | 3] 8) 28) 49) 79.) 95) 80 | (i) 7. rhodesiense —i—|—]1 3 10} 19] 29] 35! 67 | 54) 92 | 51| 74}> 56! 68 59| 85 Gal Ze bruce, 62-9) — ao 8| 14] 17} 40] 63) 55 66 63)| 75) 87) 93] 80] 82 (ii) 7. gambiense... | —|—|—]— |} 1 | — 9| 21| 56| 79|114/122 110] 85] 85] 61] 47] 49 (iv) Mzimba Strain | 8| 27) 791175 189 139/109} 72) 66) 36) 32 (v) G. morsitans ... | = 7| 3L| 148 | 230 | 326 252 237 | 184) 143 | 115 | 130 | 110 (vi) Wild Game ... | 1 8} 53/118 | 252 381 | 348 | 285 | 200 | 162 | 149 | 135 (vii) Human Strain | — | —|—}— | — 1} 10/ 41/154) 325 | 494 | 528 577 | 512 | 525 | 511 | 464 | 425 | (viii) Chituluka... | |= | - 1 8} 48} 81, 78) 71) 44, 46] 56) 53) 98) 120 | | | | | Length in Microns—(continued). | | jie | | | | Strain 27 | 28 | 29 | 30 | 31 | 32 | 33 84 | 85 | 36 | 37 | 38 | 39 | Totals} Source - | ee ae | aes T. pecorum sti - | | |—|—| 2000 | R. S. Proc. 87, B, p. 13 7. simiae | esa ea 500 | Ibid. 85, B, p. 477 T. caprae 68| 57; 24) 9) 2) 2)/—|—|} - 500 | Zbid. 86, B, p. 278 (i) TZ. rhodesiense | 61| 72| 50] 52| 28| 13/13|-5| 1| 1|--|—| 1] 1000 | Jéid. 85, B, p. 227 (i) 7. brucei 72| 50) 38| 27) 26) 18)11} 4) 4)/—|—j| 2 |—] 1000 | Jbed. 84, B, p. 331 Gii) ZT. gambiense 47| 44) 31] 20] 11] 4] 4 - -—|—-| 1000 | Ldzd. 84, B, p. 330 (iv) Mzimba Strain | 24] 22) 16) 7) 4] 4)—| | —|—| 1000 | Zbcd. 87, B, p. 31 (v) G.morsitans ... |127|133/113} 96; 54; 44/11) 7; 2;—|—|—]/— | 2500 Lbid. 86, B, p. 419 (vi) Wild Game ... | 125/110] 62} 55} 33] 12] 7/ 3} 1|/—|--|—|—| 2500 | Jbid. 86, B, p. 405 (vii) Human Strain | 372] 347 307 | 198 | 167 | 123 | 77 | 36} 12/11} 2 | 1 |-—| 6220 | Zoid. 86, B, p. 330 (vill) Chituluka 111 | 128/138} 99/117} 91/63/27/11| 9) 1 | 1 |—J| 1500 | Zbed. 86, B, p. 291 * R.S. Proc. Vol. 86, B, p. 422. 
At any rate the exact method of differentiation adopted would be of interest to the statistician. The result of the paper is that the four species of trypanosomes occur in quite comparable permilles of tsetse flies caught in the sleeping sickness area of Nyasaland, and there is no evidence to show that they or other strains also may not occur side by side in the same fly or in the same specimen of wild game. Further, these compound strains would then appear in different proportions in the host. Some such hypothesis seems very needful to account for the extreme heterogeneity of the wild game, wild G. morsitans, and human strains as recorded by Sir David Bruce and his colleagues. The following table gives a comparison of what appear to be homogeneous strains (T. pecorum, T. simiae and T. caprae) with what appear statistically to be heterogeneous strains, i.e. T. brucei, T. rhodesiense, T. gambiense, the Mzimba strain, the wild-game and wild G. morsitans strains of human type, and the human strains themselves.

Means, Standard Deviations and Coefficients of Variation of eleven Strains of Trypanosomes.

| Series | Mean | Standard Deviation | Coefficient of Variation |
| T. pecorum | 13.992 ± .019 | 1.2816 ± .014 | 9.16 ± .099 |
| T. simiae | 17.870 ± .050 | 1.6558 ± .035 | 9.27 ± .199 |
| T. caprae | 25.508 ± .063 | 2.1011 ± .045 | 8.58 ± .184 |
| (i) T. rhodesiense | 23.577 ± .100 | 4.6764 ± .071 | 19.83 ± .311 |
| (ii) T. brucei | 23.529 ± .094 | 4.3938 ± .066 | 18.67 ± .291 |
| (iii) T. gambiense | 22.113 ± .081 | 3.7867 ± .057 | 17.12 ± .266 |
| (iv) Mzimba Strain | 21.413 ± .063 | 2.9586 ± .045 | 13.82 ± .212 |
| (v) G. morsitans | 22.695 ± .058 | 4.3002 ± .041 | 18.95 ± .187 |
| (vi) Wild Game | 22.622 ± .047 | 3.4541 ± .033 | 15.27 ± .174 |
| (vii) Human Strain | 23.796 ± .035 | 4.1262 ± .025 | 17.34 ± .108 |
| (viii) Chituluka | 26.172 ± .084 | 4.8414 ± .060 | 18.50 ± .235 |

The table above gives the means, standard deviations and coefficients of variation of these strains. It will be seen that the first three are of a very different character to the last five.
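The constants in the table of means can be recomputed from a frequency distribution of lengths. The sketch below is an illustration under stated assumptions, not a statement of the author's working: it applies Sheppard's correction for unit grouping to the second moment and takes probable errors as .67449 times the usual standard errors, choices inferred here from the fact that they reproduce the printed T. pecorum line. The function and key names are hypothetical.

```python
import math

PROBABLE_ERROR = 0.67449  # probable error = .67449 x standard error


def grouped_summary(lengths, freqs, width=1.0):
    """Mean, S.D. (with Sheppard's correction) and coefficient of variation of grouped data."""
    n = sum(freqs)
    mean = sum(x * f for x, f in zip(lengths, freqs)) / n
    raw_var = sum(f * (x - mean) ** 2 for x, f in zip(lengths, freqs)) / n
    sd = math.sqrt(raw_var - width ** 2 / 12.0)  # Sheppard's correction for grouping
    cv = 100.0 * sd / mean
    return {
        "n": n,
        "mean": mean,
        "sd": sd,
        "cv": cv,
        "pe_mean": PROBABLE_ERROR * sd / math.sqrt(n),
        "pe_sd": PROBABLE_ERROR * sd / math.sqrt(2 * n),
        "pe_cv": PROBABLE_ERROR * cv / math.sqrt(2 * n)
        * math.sqrt(1.0 + 2.0 * (cv / 100.0) ** 2),
    }


if __name__ == "__main__":
    # T. pecorum, lengths 9 to 18 microns, 2000 trypanosomes.
    lengths = list(range(9, 19))
    freqs = [2, 6, 42, 193, 452, 618, 453, 178, 51, 5]
    s = grouped_summary(lengths, freqs)
    print(f"mean {s['mean']:.3f} +/- {s['pe_mean']:.3f}")
    print(f"s.d. {s['sd']:.4f} +/- {s['pe_sd']:.3f}")
    print(f"c.v. {s['cv']:.2f} +/- {s['pe_cv']:.3f}")
```

Run on the T. pecorum frequencies, this gives a mean of 13.992 and a standard deviation of about 1.2816, in agreement with the printed line.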
The variation of the latter is about double that of the admittedly pure strains, and throughout the whole course of our further work this possibility of heterogeneity, and the differential selection of the components by the host, must be borne carefully in mind. Great divergences do not discourage the use of biometric methods, and we get occasionally identities of strains which are quite beyond the limits of chance coincidence and which point to definite possibilities if only host, environment, and treatment are once effectively standardised. I propose to try to throw some light on these points in the remaining sections of this paper.

(6) On the Probability that Strains are alike after allowance for the Host.

(a) Luckily in certain cases the treatment has been more or less alike. Thus in the wild Glossina morsitans strain, the tsetse flies brought to the Laboratory from the "fly-country" were in one strain (I) fed on a monkey and in the case of four other strains (II to V) fed on dogs. From these animals thus infected others were inoculated, but in each case only the trypanosomes from a single rat were used for purposes of measurement and comparison. The following table gives the frequency distributions of the five strains, and chiefly on the basis of these distributions, Sir David Bruce and his colleagues conclude that: "The five wild Glossina morsitans strains resemble each other closely, and all belong to the same species of trypanosome." (p. 421.)

Wild G. morsitans Strains*.

[Table: frequency distributions of length in microns for Strains I to V. The entries are not legible in this copy.]

Investigating the statistical measure of resemblance in the usual way we have the following series of results:

Strains I and II: $\chi^2 = 81.88$, $P < .000{,}000{,}1$,
Strains I and III: $\chi^2$ = [illegible], $P < .000{,}000{,}01$,
Strains I and IV: $\chi^2$ = [illegible], $P < .000{,}000{,}1$,
Strains I and V: $\chi^2 = 115.77$, $P < .000{,}000{,}1$,
Strains II and III: $\chi^2 = 328.12$, $P < .000{,}000{,}01$,
Strains II and IV: $\chi^2 = 184.88$, $P < .000{,}000{,}01$,
Strains II and V: $\chi^2$
= 208:79, P < :000,000,01, Strains III and IV: x? = 122°79, P < :000,000,1, Strains III and V: x? = 147-20, P < :000,000,1, Strains IV and V: x? = 23°90, P =:2470. in the usual way we have Statistically therefore there is not the faintest resemblance whatever between any pair of these strains except the IV and V. These strains are for practical purposes interchangeable. In one out of every four trials two pairs of samples of 500 from the same trypanosome population would give results more divergent than those observed. But what is the source of this resemblance? Why are these two strains alike and all the others widely divergent? There is nothing whatever in the paper to account for this agreement, and it is the more remarkable because Strains IV and V are to the statistician the most compound looking of all the strains. But some uniformity of origin or treatment has caused the two com- ponents to appear in like proportions, and at the back of this resemblance there is some vital point, if we could follow it up. Were the two dogs bitten by the same * RS, Proc. Vol. 86, B, p. 409 et seq. fly, or Rats 658 and 660 really inoculated from the same dog ? Kart PEARSON Clearly 113 there is a point here which ought to be cleared up, for otherwise the statistician could only conclude that the wild G. morsitans strains are widely divergent, and that their compound nature suggests that the tsetse fly carries various types of trypanosomes and these in varying proportions, (b) I now turn to the five human strains dealt with by Sir David Bruce and his colleagues. animals. Human Strains. A: Compounds from Various Animals*. Let us first consider the human strains compounded from various The following table gives the length distributions : Microns. | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 Strain I, Mkanyanga ... 4] 19 | 42 | 63! 
81) 75) 91) 65) 66} 93] 91/107 ” ) 2 2 | 12 | 55 | 108] 159 210 | 188 | 215 | 177) 138) 83 » II], Chituluka 1 8 | 48 | 81 | 78| 71] 44] 46] 56) 53) 98] 120] » LV, Chipochola ... 2] 4 | 32 | 68 |110/101/109|106) 95° 95] 74] 64| » V, Chibibi 1 8 | 20 | 58 | 117(122|123,107| 93) 98] 63} 51 | — Sum 41 | 154 | 325 | 494 [oes 577 | 512 | 525 | 511 | 464 | 425 Human Strains. A: Compounds from Various Animals—(continued). Microns. 38 | Totals Strain. I, Mkanyanga . 1220 ee al B 1500 » Ill, Chituluka ... 1500 IV, Chipochola ... 1000 V, Chibibi 1000 6220 We may conipare the strains precisely as in the case of the wild G. morsitans We find: Strains I and II: Strains I and III: Strains I and IV: Strains I and V: strains. Strains II and III: Strains II and IV: Strains II and V: Strains III and IV: Strains III and V: Strains lV and V: x? = 408°50, x? = 204-99, x? = 180°63, x” = 20540, x” = 923°62, x foi 0k, veh, 00, x? = 53132, x? = 563°82, x? = 16°81, * R. S. Proc. Vol. 86, B, pp. 287, 291, 295, and 297. B, p. 423. Biometrika x P =< -000,000,01, P =< :000,000,01, P = < :000,000,01, P =<:000,000,01, P = <:000,000,001, P =< -000,000,5, P =< 000,000,5, P = <:000,000,01, P = <:000,000,01, P =7733. For Strain I see R. S. Proc. Vol. 85, 15 114 A Study of Trypanosome Strains Again we have the remarkable result that all the human strains are statis- tically divergent beyond any possible comparison, except those of Chipochola and Chibibi which show a high degree of correspondence. Now is this result the outcome of treatment? We note the following diversity of hosts : Strain I. Strain II. Strain III. Strain IV, Strain V. Gross | Percentage | Gross | Percentage | Gross Percentage | Gross | Percentage | Gross | Percentage Men 60 4°9 — 0-0 - 0:0 — 0 — 0:0 Monkey ... 
100 8:2 160 10°7 160 1KO}27/ 160 16°0 160 16:0 Goat 20 16 60 4:0 80 5°3 80 ‘0 80 8:0 Sheep 60 4°9 20 1:3 1 0-0 ‘0 — 0°0 Dog ss 260 21°3 260 17°3 260 17°3 260 26°0 260 26°0 Guinea Pig | 120 9°8 _ oOo | — 0-0 — ‘0 — 0-0 > Rat 600 49°2 1000 66°7 1000 66°7 500 50°0 500 50°0 Totals 1220 — 1500 — 1500 — 1000 — 1000 — Now it will be clear at once that the percentages of trypanosomes drawn from various types of host are identical only in the case of Strains IV and V, which we have found in close accordance. But there is not great divergence in source between Strains II and III although Strain I shows fairly wide differences. We find, however, that II and III are statistically very unlike, the next closest resemblances, although very slight, being between II and IV and V. It would not seem therefore that the degree of similarity is wholly determined by similarity of hosts. I have accordingly reinvestigated the five human strains by taking rats only. But, of course, even then it is of vital importance to be certain that the process of transfer from man to rat was the same in all five cases, and of this no evidence is provided. Human Strains. B: From Rat only*. Microns. 15 | 16 | 17 | 28 | 19 | 20 | 22 | 22 | 29 | 24 | 25 | 26 | 27 | Strain I, Mkanyanga =e lie 1 | 21 | 40 | 52 | 49 | 80 | 31 | 36 | 33 | 48] 52 » LL, 5, Rat 728 .~f—|—| 2] 4] 15 | 30] 57 | 72 | 85 | 72 | 59 | 44 | 26 » I, E, Rat 796... --... | — |— | 2 24 | 30.) 42 |60))) 61 "87 | 78ul soulmoralnoo » III, Chituluka, Rat 952 | 1 | 3 | 21 | 27 | 23] 15] 10] 15 | 19 | 21 | 34 | 44 | 36 3 ILI, Chituluka, Rat 953 — ih 17 | 26 | 20 | 19 | 15 | 14) 26 | 18 | 33 | 40 | 34 ,» IV, Chipochola, Rat 1337] — | — | 4| 6] 16| 29 | 53] 61 | 59 | 69 | 56 | 51 | 36 .) V, Chibibi, Rat 1660... | —- | -— | — | 4 {17 | 29 | 46 32 | 69 | 73 | 52 | 40 | 31 Sum Ne Ete via 1 | 5 47 | 112! 161 | 216 | 290 | 316 | 376 | 362 | 322 | 294 | 235 * R. S. Proc. Vol. 86, B, pp. 288, 289, 292, 293, 295, and 298. For Strain I see R. S. Proc. Vol. 
85, B, p. 423.

Human Strains. B: From Rat only (continued).

| Strain | Totals |
| Strain I, Mkanyanga | 600 |
| Strain II, E, Rat 728 | 500 |
| Strain II, E, Rat 726 | 500 |
| Strain III, Chituluka, Rat 952 | 500 |
| Strain III, Chituluka, Rat 953 | 500 |
| Strain IV, Chipochola, Rat 1337 | 500 |
| Strain V, Chibibi, Rat 1660 | 500 |
| Sum | 3600 |

[The frequencies for the greater lengths are not legible in this copy, apart from the first entries of the Sum row: 219, 210, 134, 108, 88, 57, 28.]

This table with its two pairs of rats inoculated from the same strains is peculiarly instructive. We can compare II, Rat 726, with II, Rat 728. We find:

$\chi^2 = 36.195$, giving $P = .0048$.

This is far from the high degree of divergence we have found between the compound human strains, but it is not satisfactory as a measure of the agreement of the same strain in two hosts of the same species. Applying the same test to the two Rats 952 and 953 of Strain III we have:

$\chi^2 = 14.715$, giving $P = .9038$.

This is, of course, quite satisfactory. We should not hesitate to assert identity of strains and of treatment in the case of the trypanosomes from these two rats. The statistician will feel fairly confident that there is a factor of divergence between the trypanosomes of the two rats in Strain II, which does not occur in the two rats of Strain III. He will be almost certain that the strain was not conveyed through the same steps or at the same stage of the disease to the rats in Strain II. Unfortunately dates and processes are not discussed. Sir David Bruce and his colleagues say that it is remarkable how much alike these distributions for Rats 726 and 728 are, and again for the distributions for Rats 952 and 953 that they also closely resemble each other. "It is curious and striking that the same strain of trypanosome growing in two different animals should show this remarkable similarity*." The interesting point is that the statistician would agree with the remarkable similarity in the latter case, but the divergence, not the remarkable resemblance, in the first case would force him to seek for some explanation in treatment.
It will, I think, be clear from these illustrations that a strain of trypanosomes, even if obviously compound, can be taken from a single source and after inoculation into two different individuals of the same species be identified as the same; but to insure this result on every repetition the greatest caution will have to be exercised as to identity of process and treatment.

* R. S. Proc. Vol. 86, B, pp. 289 and 293.

There are still further results of importance to be ascertained, however, from our table of human strains. Let us compare Strains IV and V, which we found resembled each other closely even for compounded hosts. We now reach $\chi^2 = 14.085$ and $P = .5229$. Or, the probability that these two strains are identical has been reduced by selecting out the rat data only. But the result is still so high that no one would hesitate to assert that Chipochola and Chibibi were suffering from a disease due to the same strain of trypanosome. The correspondence is so close that we have combined Strains IV and V for all other comparisons. In the case of Strain III, we have added together the results for Rats 952 and 953. Such addition is less reasonable for Rats 726 and 728, but without doing this, it is impossible to decide which rat is to represent the E strain. I have then made the following comparisons:

Strains IV and V with III: $\chi^2 = 525.67$, $P < .000{,}000{,}01$.

There is accordingly no similarity at all between the Chituluka strain and that common to Chipochola and Chibibi.

Strains IV and V with II: $\chi^2 = 64.70$, $P < .000{,}001$.

Thus the strain from the European E from Portuguese East Africa diverges from the Nyasaland strain widely, but not as widely as that of Chituluka does from those of Chipochola and Chibibi.

Strain I with Strain III: $\chi^2 = 126.13$, $P < .000{,}000{,}1$,
Strain I with IV and V: $\chi^2 = 217.82$, $P < .000{,}000{,}01$.

Thus the trypanosomes from Mkanyanga are widely divergent from those of the three other Nyasaland cases.
Nor are they any closer to the European E:

Strain I with Strain II: $\chi^2 = 331.37$, $P < .000{,}000{,}01$.

Thus with the exception of the Chipochola and Chibibi strains, the trypanosome distributions from human sources differ widely. Nor is this to be wondered at, if the human beings owe their trypanosomes to Glossina morsitans, for in that case we should expect the human strains to be as diverse as we have found those from the tsetse fly itself. It would remain to explain the close similarity of the Chipochola and Chibibi cases. It would be interesting to know the history of these cases with regard to locality and to the possibility of a unique source of infection.

(c) In the case last dealt with, namely that of Chipochola and Chibibi, we have the remarkable feature that the strains, although significantly identical whether treated in the rat alone or in compounded distributions from various hosts, resemble each other somewhat less closely in the single host series. This is not generally the rule. Some of the big divergencies we have already noticed become far less appreciable, nay, even become resemblances when we confine our attention to one species of host. The chief misfortune which then too often arises is the paucity of the total numbers that we have at our disposal. I will consider, however, from this aspect the relations of the three strains wild G. morsitans, wild game, and Mvera cattle.

I compare first the lengths of 200 trypanosomes from wild G. morsitans and wild-game strains. These yield for the host, goat*:

| From Goat | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | Totals |
| Wild G. morsitans Strain | 1 | 3 | 12 | 21 | 55 | 60 | 32 | 12 | 4 | - | - | 200 |
| Wild-Game Strain | - | - | 1 | 16 | 37 | 73 | 38 | 26 | 8 | 1 | - | 200 |

giving: $\chi^2 = 26.782$ and $P = .0015$.

To further test this, I take the same two strains in the dog as host†:

Microns.
| ere ee | From Dog D0 | IED RED HES NGI A GESY IRE TRS | is} | Totals i onl | fe | eas | ee ee | Wild G. morsttans Strain 5) 3 . 34 41 | 40/19) 9 | — 160 Wild-Game Strain ae al a 31 | 57 | 50 | 24 | 6 | —| 180 | | \ | | Here v= (045 and P= 3171. The value we had previously found for a mixture of all strains was P = -0002. Thus the two strains may be considered as identical when we deal with the trypanosomes from the dog, as showing considerable divergence when we take the goat, and as showing marked divergence when we take a great variety of hosts. The weight of evidence in favour of a standardised treatment thus becomes very great. Let us look at precisely the same material for the wild-game strain and for the Mvera cattle strain, first for the goat and then for the dog as host+. The grave difficulty is the paucity of measurements thus differentiated. Microns. i =a Eee aaa | | : ie yi eel | lo | | From Goat 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | Totals | | : SS | : | Wild-Game Strain ...|— | — | 1 | 16 | 37 | 73 | 38] 26| 8 | 1 | 900 | Mvera Cattle Strain ... | 1 1 OF la 225260) TOE sa 1} ==) 100 | ed | This gives x’ = 14670, leading to P =-'1013. * R.S. Proc. Vol. 87, B, pp. 6 and 11. + R. S. Proc. Vol. 87, B, pp. 6 and 11. by by . S. Proc. Vol. 87, B, pp. 3 and 5 + + 118 A Study of Trypanosome Strains Microns. | From Dog | 11 | 12 | 13 | 14 | 15 | 16 | 17 | Totals eee | ps ee Me | | : | | | Wild-Game Strain soe fe IQ ASI D7 de bOM24alaG | 180 Mvera Cattle Strain | || Lia O eel 8 | — 100 | | I This leads to x? = 15992, P="0138; Previously (p. 98) on the total series of different hosts we had found P=-000,243, Thus by referring our material to individual hosts, we have reduced the degree of divergency between the wild-game and Mvera cattle strains, but it would be still hazardous to state that these strains are identical. Lastly, we turn to the Mvera cattle strain and the wild G. 
morsitans strain, dealing with dog and goat as hosts separately*:

Microns.
From Goat                 |  9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | Totals
Wild G. morsitans Strain  |  1 |  3 | 12 | 21 | 55 | 60 | 32 | 12 |  4 | 200
Mvera Cattle Strain       |  1 |  1 |  3 | 14 | 22 | 26 | 19 | 13 |  1 | 100

This gives $\chi^2 = 7.968$, $P = .4368$. And again:

Microns.
From Dog                  |  9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | Totals
Wild G. morsitans Strain  |  - |  - |  3 | 14 | 34 | 41 | 40 | 19 |  9 | 160
Mvera Cattle Strain       |  - |  - |  3 | 11 | 27 | 30 | 21 |  8 |  - | 100

resulting in $\chi^2 = 11.120$, $P = .0852$. The Mvera cattle strain and the Glossina morsitans strain had for all hosts a divergence measured by $P = .000{,}008$. Thus the great bulk of this divergence is due to multiplicity of hosts†.

* R. S. Proc. Vol. 87, B, pp. 3, 10-11.
† It is worthy of note that in comparisons with the cattle strain the goat appears to give closer results than the dog, but the dog appears the better in the comparison of the G. morsitans and wild-game strains.

To sum up the results obtained for T. pecorum in Mvera cattle, wild G. morsitans and wild-game strains, the identification of these strains was quite illegitimate on the basis of the compound host frequencies. It is reasonable on the basis of trypanosomes taken from a single species of host. But how far the resemblance in these cases is produced by a selective influence of the host and not necessarily by an identity of all the members of the strain before transference to the host is not demonstrated. On the other hand while divergence due to host will account for the divergences which are so notable in T. pecorum, it will not account for the divergences in the human strains; these are startlingly conspicuous even if we confine our attention to a single species of host. Precisely the same remarks apply to the trypanosomes similar to those causing disease in human beings found in wild game and in the tsetse fly itself.
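The host-by-host comparisons above all use one statistic, and it can be checked mechanically. The following is a minimal sketch, not the memoir's own computing scheme: it assumes the usual two-sample form, the sum over cells of N N' (f/N - f'/N')^2 / (f + f'), and takes P from the finite series for the chi-square tail with n' - 1 degrees of freedom. The function names `two_sample_chi2` and `chi2_tail` are hypothetical, and the frequencies are those of the goat and dog tables above.

```python
import math

def two_sample_chi2(f, g):
    """Chi-square for two frequency rows on the same cells:
    sum over cells of N*N' * (f/N - g/N')**2 / (f + g)."""
    n, m = sum(f), sum(g)
    total = 0.0
    for a, b in zip(f, g):
        if a + b == 0:
            continue  # a cell empty in both samples contributes nothing
        total += n * m * (a / n - b / m) ** 2 / (a + b)
    return total

def chi2_tail(chi2, groups):
    """P of a larger chi-square for n' = groups cells, i.e. groups - 1
    degrees of freedom, from the finite series for the tail area."""
    df = groups - 1
    if df % 2 == 0:
        # even degrees of freedom: exp(-x/2) * sum of (x/2)^k / k!
        term, series = 1.0, 1.0
        for k in range(1, df // 2):
            term *= (chi2 / 2.0) / k
            series += term
        return math.exp(-chi2 / 2.0) * series
    # odd degrees of freedom: normal tail plus a series in odd powers of chi
    chi = math.sqrt(chi2)
    term, series = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        series += term
        term *= chi2 / (2 * r + 1)
    return (math.erfc(chi / math.sqrt(2.0))
            + math.sqrt(2.0 / math.pi) * math.exp(-chi2 / 2.0) * series)

# Goat as host, lengths 9 to 18 microns.
goat_morsitans = [1, 3, 12, 21, 55, 60, 32, 12, 4, 0]
goat_wild_game = [0, 0, 1, 16, 37, 73, 38, 26, 8, 1]
# Dog as host, lengths 11 to 17 microns.
dog_wild_game = [0, 12, 31, 57, 50, 24, 6]
dog_mvera = [3, 11, 27, 30, 21, 8, 0]

chi2_goat = two_sample_chi2(goat_morsitans, goat_wild_game)
p_goat = chi2_tail(chi2_goat, len(goat_morsitans))
chi2_dog = two_sample_chi2(dog_wild_game, dog_mvera)
p_dog = chi2_tail(chi2_dog, len(dog_wild_game))
print(round(chi2_goat, 3), round(p_goat, 4))  # the memoir gives 26.782 and .0015
print(round(chi2_dog, 3), round(p_dog, 4))    # the memoir gives 15.992 and .0138
```

Cells empty in both rows are skipped, so the choice of how many blank columns to carry at either end of a table does not alter chi-square, though it does alter n' and therefore P.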
There must be another source for these divergences.

(7) Discussion of the Heterogeneity which is statistically demonstrable in the bulk of the Trypanosome Measurements.

The reader who has attentively followed the course of the argument in the previous sections will be prepared for the next step in this memoir, the attempt to account for the large divergences between strains of trypanosomes in individuals of the same species by the heterogeneity of those strains. My suggestion is that the strain in one fly differs from that in another because the components do not appear in the same proportion, the strain in one specimen of wild game from that in another, or in one man from that in another because they have been bitten by a fly containing the components in unlike proportions. The host does make some difference, either by nutrition or selection of trypanosomes, but it is a minor difference. Thus consider what we may probably hold to be pure strains and observe the average differences in length found by Sir David Bruce and his colleagues:

Microns.
T. simiae*           | T. caprae†           | T. pecorum, Mvera Cattle‡ | T. pecorum, Wild G. morsitans§
Goat 17.3            | Waterbuck 26.8       | Donkey 13.5               | Goat 13.5
Monkey 18.1          | Ox 25.7              | Ox 14.2                   | Monkey 13.6
-                    | Goat 25.3            | Goat 13.8                 | Dog 14.2
-                    | Sheep 25.6           | Dog 13.8                  | Guinea Pig 14.6
-                    | -                    | Rat 14.8                  | Rat 14.0
Max. Difference 0.8  | Max. Difference 1.5  | Max. Difference 1.3       | Max. Difference 1.1

* R. S. Proc. Vol. 85, B, p. 479.
† R. S. Proc. Vol. 87, B, p. 3.
‡ R. S. Proc. Vol. 86, B, p. 279.
§ R. S. Proc. Vol. 87, B, p. 10.

We may thus anticipate that in a pure strain the change of host would hardly make a difference of more than 2 microns in the average length. We must accordingly be prepared for some such change as this in the shifting of the mean when the host is varied.

We have next to inquire what type of curve accurately describes the strains which we are fairly certain are homogeneous. If the reader will turn back to p.
110 he will note at once a marked difference between the distributions for T. caprae, T. pecorum and T. simiae when compared with those entitled Mzimba strain, human strain, wild-game strain, T. brucei, T. rhodesiense, T. gambiense and the wild G. morsitans strain. The coefficients of variation of the former group are all under 9.5 (mean = 9.00), the coefficients of variation of the latter group are all over 13.5 (mean = 17.29). We recognise therefore a totally different order of variability. Even in absolute variation as measured by the standard deviations we find the first group with its mean S.D. = 1.68 and the second with its mean 3.96. An examination of the graphs scattered through the trypanosome papers to which we have referred will, we think, convince the statistician that we have to deal with heterogeneous and not skew homogeneous material*. It becomes of course important to ascertain whether in the pure strains a Gaussian curve will suffice to describe the frequency closely enough for statistical purposes, for, if it does, the analysis into at any rate two Gaussian components of the heterogeneous strains becomes relatively direct, if laborious. I will consider the T. pecorum, T. simiae, and T. caprae strains from this standpoint.

(a) T. pecorum (see p. 110). Mean = 13.992 microns. S.D. = 1.2816 microns.

Microns      | Observed Values | Calculated Values
9 and under  |    2 |   0.46
10           |    6 |   5.98
11           |   42 |  45.41
12           |  193 | 192.52
13           |  452 | 456.70
14           |  618 | 607.12
15           |  453 | 452.49
16           |  178 | 188.98
17           |   51 |  44.16
18 and over  |    5 |   6.20

$\chi^2 = 7.630$, $P = .572$.

Hence in 57 out of 100 trials from material following the Gaussian distribution a more divergent sample than that observed would actually be obtained. We can therefore conclude that a simple Gaussian frequency adequately describes the distribution in size of T. pecorum. This is illustrated in Diagram II.

* Note especially the bimodal graphs in R. S. Proc. Vol. 83, B, pp. 5 and 11, for both the Uganda and Zululand strains of T. brucei, in Vol.
86, B, pp. 291-293, for human strains, in Vol. 86, B, pp. 395, 397 for wild-game strains and pp. 409, 411, 417 and 419 for G. morsitans strains.

[Diagram II. Gaussian fitted to T. pecorum Frequency. Total 2000. Abscissa: Microns; ordinate: Frequencies per Micron.]

(b) T. simiae (see p. 110). Mean = 17.870 microns. S.D. = 1.6558 microns.

Microns       | Observed Values | Calculated Values
14 and under  |    7 |  10.46
15            |   28 |  27.63
16            |   76 |  63.92
17            |   93 | 103.78
18            |  126 | 118.32
19            |   92 |  94.66
20            |   47 |  53.18
21            |   22 |  20.96
22            |    6 |   5.80
23 and over   |    3 |   1.29

$\chi^2 = 8.149$, $P = .520$.

We conclude that the Gaussian adequately describes the distribution of T. simiae. In more than half the trials we should get a worse sample. See for graphical fit, Diagram III.

[Diagram III. Gaussian fitted to T. simiae Frequency. Abscissa: Microns; ordinate: Frequencies per Micron.]

(c) T. caprae (see p. 110). Mean = 25.508 microns. S.D. = 2.1011 microns.

Microns       | Observed Values | Calculated Values
20 and under  |    4 |   4.28
21            |    8 |   9.82
22            |   23 |  23.95
23            |   49 |  46.74
24            |   79 |  73.05
25            |   95 |  91.38
26            |   80 |  91.54
27            |   68 |  73.45
28            |   57 |  47.16
29            |   24 |  24.26
30            |    9 |   9.98
31 and over   |    4 |   4.38

$\chi^2 = 5.175$, $P = .92$.

This is a still more excellent fit; if the Gaussian represented the population, in 92 % of samples we should get a more divergent sample than that observed. The curve is given in Diagram IV.

[Diagram IV. Gaussian fitted to T. caprae Frequency. Abscissa: Microns; ordinate: Frequencies per Micron.]

It will be clear from the above three illustrations of what we may term homogeneous trypanosome strains that the Gaussian curve of frequency suffices to describe adequately such material.
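The "Calculated Values" in the three fits above can be approximated from the mean and standard deviation alone. The sketch below assumes cells of unit width centred on each whole micron, with the two end cells absorbing the tails; that grouping rule is an assumption, since the memoir does not state it, and the helper names are hypothetical. It is shown for T. caprae.

```python
import math

def normal_cdf(z):
    """Area of the unit Gaussian to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def gaussian_cells(mean, sd, total, first, last):
    """Expected counts for unit cells centred on first..last microns;
    the first and last cells also take the two tails."""
    counts = []
    for x in range(first, last + 1):
        lower = 0.0 if x == first else normal_cdf((x - 0.5 - mean) / sd)
        upper = 1.0 if x == last else normal_cdf((x + 0.5 - mean) / sd)
        counts.append(total * (upper - lower))
    return counts

def goodness_of_fit(observed, expected):
    """Chi-square of observed counts against calculated counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# T. caprae: "20 and under", 21, ..., 30, "31 and over".
observed = [4, 8, 23, 49, 79, 95, 80, 68, 57, 24, 9, 4]
expected = gaussian_cells(25.508, 2.1011, sum(observed), 20, 31)
chi2 = goodness_of_fit(observed, expected)

for microns, o, e in zip(range(20, 32), observed, expected):
    print(microns, o, round(e, 2))
print("chi-square:", round(chi2, 3))  # the memoir gives 5.175
```

Small departures from the printed figures are to be expected, since the printed frequencies were rounded to two decimals before chi-square was formed.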
It is equally clear that no Gaussian can possibly describe such skew distributions as we get in the wild-game strain or wild tsetse fly strain of the trypanosome species identified by Sir David Bruce and colleagues as T. rhodesiense*. It is equally impossible in the case of the human strains figured in the paper of February 1913†. I illustrate this on the frequency distribution for 6220 trypanosomes of human strains‡.

Microns       | Observed | Calculated
14 and under  |    1 |  75.45
15            |   10 |  62.51
16            |   41 | 101.51
17            |  154 | 155.50
18            |  325 | 224.73
19            |  494 | 306.27
20            |  528 | 393.73
21            |  577 | 477.39
22            |  512 | 545.93
23            |  525 | 588.91
24            |  511 | 599.17
25            |  464 | 575.04
26            |  425 | 520.55
27            |  372 | 444.42
28            |  347 | 357.96
29            |  307 | 271.88
30            |  198 | 194.81
31            |  167 | 131.68
32            |  123 |  83.91
33            |   77 |  50.44
34            |   36 |  28.61
35            |   12 |  15.30
36 and over   |   14 |  14.18

Here $\chi^2 = 501$ and $P < .000{,}000{,}001$. In other words description by a Gaussian is absolutely impossible. The histogram of observations and the curve are shewn on Diagram V.

Now the suggestion that flowed at once from these results was the compound nature of all the material classed under the headings:

(i) T. rhodesiense.
(ii) T. brucei.
(iii) T. gambiense.
(iv) Mzimba Strain.
(v) Wild G. morsitans Strain.
(vi) Wild-Game Strain.
(vii) Human Strain.

With the experience of the Gaussian fitting the homogeneous strains, the direct step was to investigate whether the above material could be analysed into two Gaussian components and to determine how nearly these components were in agreement. The method of carrying out this analysis was provided in the first of my series of Contributions to the Mathematical Theory of Evolution§. There was nothing to prevent the process being applied to every individual frequency given by the trypanosome workers, except the very laborious arithmetic.
* See R. S. Proc. Vol. 86, B, pp. 407 and 419.
† See R. S. Proc. Vol. 86, B, pp. 285 et seq.
‡ See R. S. Proc. Vol. 86, B, p. 300.
§ Phil. Trans. Vol. 185, A, pp. 71-110, 1894.

[Diagram V. Failure of Gaussian to fit Frequency Distribution of Human Trypanosomes. Total 6220. Abscissa: Microns; ordinate: Frequency per Micron.]

The method was applied to the above seven cases, and also (viii) for the purposes of illustration to a single human case, that of Chituluka, a native of Nyasaland, who died of sleeping sickness*. With the single exception of T. brucei every one of these distributions broke up into two components, and into two components with strikingly close means. I propose to call these two components T. minus and T. majus. I do not assert that they are distinct species; they may be dimorphic groups of one and the same trypanosome species. But the recognition of their existence seems to bring some order at least into the chaos we have already noted as existing in the trypanosome measurements. Two human strains or two wild-game strains differ from each other with such wide divergence in their frequencies because these two groups T. minus and T. majus are mixed in the individual in different proportions.

* R. S. Proc. Vol. 86, B, p. 290.

The table below gives the chief biometric characters of T. minus and T. majus as found from the seven resolutions. The mean values of the constants for T. minus and for T. majus are placed at the foot; in calculating these mean values, Chituluka's data have been excluded as already included in the human strain, and also those for T. brucei not directly resolved. At the foot of the table I have placed the constants for T. simiae and T. caprae, the nearest pure strains to T. minus and T. majus respectively.

                      |        Means        | Standard Deviations | Coefficients of Variation | Size of Populations
Strain                | T. minus | T. majus | T. minus | T. majus | T. minus | T. majus       | T. minus     | T. majus
T. rhodesiense        | 18.7418  | 26.1122  | 2.3184   | 3.4397   | 12.370   | 13.173         | [illegible]  | [illegible]
T. brucei             | 19.8244  | 26.1122  | 2.6439   | 3.4134   | 13.337   | 13.072         | 410.83       | 589.17
T. gambiense          | 19.8926  | 26.2463  | 2.0566   | 2.6260   | 10.339   | 10.005         | [illegible]  | [illegible]
Mzimba Strain         | 19.8966  | 24.0508  | 1.3961   | 3.1028   |  7.017   | 12.901         | 634.96       | 365.04
G. morsitans Strain   | 19.6475  | 27.1966  | 1.7503   | 2.7013   |  8.908   |  9.932         | [illegible]  | [illegible]
Wild-Game Strain      | 20.4418  | 25.8263  | 1.6332   | 2.8799   |  7.990   | 11.151         | [illegible]  | [illegible]
Human Strain          | 20.3687  | 26.2930  | 1.9444   | 3.4470   |  9.536   | 13.110         | [illegible]  | [illegible]
Chituluka             | 19.8410  | 28.7875  | 1.9785   | 2.8823   |  9.972   | 10.012         | [illegible]  | [illegible]
Means                 | 19.8315  | 25.9542  | 1.8498   | 3.0328   |  9.360   | 11.712         | -            | -
T. simiae             | 17.870   | -        | 1.6558   | -        |  9.266   | -              | 100 %        | -
T. caprae             | -        | 25.508   | -        | 2.1011   | -        |  8.580         | -            | 100 %

I do not in the least suggest there is any identity, but comparison may bring home to the trypanosome worker the average sizes of the two components*. The differences of the variabilities are, however, much larger, and the influence of host on variability as well as on mean ought to be studied.

It will be seen at once that the divergence in the individual means of T. minus from the general mean is very slight, at most a micron, and well within the limits which arise, as we have seen, from difference of host. It is a most remarkable fact that from six independent reductions the mean size of T. minus should come out so nearly 19.8 microns. In T. majus the correspondence is not so good; the average of about 26 microns falls to 24 in the Mzimba strain and rises to 28.8 in the case of Chituluka†.
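The row of means at the foot of the table is simple arithmetic over the six direct resolutions, with T. brucei and Chituluka left out as the text explains. A minimal sketch of that check follows; the dictionary layout and the helper name `column_means` are illustrative only.

```python
# (mean, standard deviation) in microns for the six direct resolutions.
t_minus = {
    "T. rhodesiense": (18.7418, 2.3184),
    "T. gambiense": (19.8926, 2.0566),
    "Mzimba": (19.8966, 1.3961),
    "Wild G. morsitans": (19.6475, 1.7503),
    "Wild game": (20.4418, 1.6332),
    "Human": (20.3687, 1.9444),
}
t_majus = {
    "T. rhodesiense": (26.1122, 3.4397),
    "T. gambiense": (26.2463, 2.6260),
    "Mzimba": (24.0508, 3.1028),
    "Wild G. morsitans": (27.1966, 2.7013),
    "Wild game": (25.8263, 2.8799),
    "Human": (26.2930, 3.4470),
}

def column_means(rows):
    """Average the means and the standard deviations over the strains."""
    n = len(rows)
    mean_length = sum(m for m, _ in rows.values()) / n
    mean_sd = sum(s for _, s in rows.values()) / n
    return mean_length, mean_sd

minus_mean, minus_sd = column_means(t_minus)
majus_mean, majus_sd = column_means(t_majus)
print(round(minus_mean, 4), round(minus_sd, 4))  # table foot: 19.8315 and 1.8498
print(round(majus_mean, 4), round(majus_sd, 4))  # table foot: 25.9542 and 3.0328
```

The largest departure of any single T. minus mean from the general mean is that of T. rhodesiense, a little over one micron, which is the point the text makes about the closeness of the six reductions.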
Still it does not appear to me that these changes of mean of the T. majus component are absolutely beyond the variation due to differences of host and treatment. Another more serious matter is the comparatively wide range found for the variabilities; but even here it is impossible to assert that such differences will not occur with difference of host. For example the Mvera cattle strain, a fair sample of the simple T. pecorum, gives:

Host  | Mean  | Standard Deviation | Coefficient of Variation
Goat  | 13.80 | 1.462 | 10.592
Rat   | 14.75 |  .839 |  5.689
Dog   | 13.79 | 1.087 |  7.885

Here while the means are within one micron, the differences in variability are of the same order as those found in T. majus from different hosts. Again, taking a pure homogeneous strain as T. caprae with goat and sheep as host, which are scarcely so differentiated as man and antelope, we find:

Host   | Mean  | Standard Deviation | Coefficient of Variation
Goat   | 25.31 | 2.187 | 8.642
Sheep  | 25.60 | 1.923 | 7.512

Lastly, taking T. simiae for goat and monkey we have:

Host    | Mean  | Standard Deviation | Coefficient of Variation
Monkey  | 17.26 | 1.403 | 8.127
Goat    | 18.11 | 1.687 | 9.315

* The maximum average length of T. caprae is 26.8 in the waterbuck and of T. simiae 18.1.
† It should be noted that with the whole of the human data the mean is 26.33 and that Chituluka's mean is very exceptional.

I think we may conclude that, allowing for the errors of random sampling and the errors arising from the resolving process, the deviations observed in the variability of our two components do not invalidate the hypotheses:

(i) That the widely divergent results obtained from different strains are due to the existence in the same individual of two types of trypanosome with very varying percentages from individual to individual.
(ii) That one of these types has a mean length of about 19.8 microns and a variability of about 1.8 microns, the other a mean of about 26.0 microns and a variability of about 3.0 microns. The means may vary 1 or 2 microns with the nature of the host and the variability 0.5 to 1 micron.

The large type predominates in the Nyasaland human strains*, on the average in about the ratio of 3 to 2, but the smaller type predominates in the G. morsitans and wild-game strains in about the same ratio; while in the trypanosomes classed as T. rhodesiense and T. gambiense, as well as in the strain from the Mzimba donkey, the preponderance is still of the smaller type and the ratio approaches 13 to 7. Whether these ratios are peculiar to the host or due to the infecting fly, it is not at present possible to determine. But the hypothesis of the existence of these two types, whether as a dimorphism of T. rhodesiense or as independent species, seems to bring some order into the apparent chaos of recent trypanosome measurements.

The following paragraphs give the calculated constants of the reductions, and the numbers of the diagrams showing the nature of the compound frequencies:

(i) T. rhodesiense.

Mean = 23.577, $\mu_2 = 21.86874$, $\mu_3 = +4.01986$, $\mu_4 = 1079.10255$, $\mu_5 = +1105.74834$.

Reducing nonic:

$$24q^9 - 298.7232q^7 - .5817q^6 + 1144.7684q^5 + 34.7620q^4 - 1179.2495q^3 + 12.9808q^2 + .0891q + .0001 = 0,$$

where† $p_2 = -10q$. The root is $p_2 = -12.2578$. This leads to the two components in the Table p. 125. The histogram of the observations and the two component Gaussian curves with their compound are given in Diagram VI.

* The European from Portuguese East Africa had predominance of T. minus. See R. S. Proc. Vol. 86, B, p. 288.
† Notation of the memoir Phil. Trans. Vol. 185, A, p. 84, Eqn. (29).

[Diagram VI. Resolution of the Frequency of T. rhodesiense into T. minus and T. majus. Total 1000. Abscissa: Microns; ordinate: Frequency per Micron.]

The resolution is not a very good one; for 24 groups $\chi^2 = 37.48$, and $P = .05$, or once in 20 trials only we should get a worse result. But an examination of either the graph or the original frequency shows at once the cause of this divergence. In their measurements Drs Stephens and Fantham have had a strange bias in favour of even numbers. No curve whatever could fit the data satisfactorily under the circumstances! Either they used a scale graduated to 2 microns only, and had a prejudice in favour of the scale markings, or else their even numbers were in some way more conspicuous than their odd. Whatever the source of this peculiarity may be, there can be no doubt of the bias*.

The only way to obtain a reasonable measure of the goodness of fit in Stephens and Fantham's results for T. rhodesiense is to group from 10 to 12, 12 to 14 and so on in comparing the observed and calculated frequencies. If this be done we find $\chi^2 = 5.03$ for 13 groups and $P = .957$, a splendid fit. The frequencies are as follows:

Microns  | Observed     | Calculated
10-14    |   9          |   7.17
14-16    |  38.5        |  34.67
16-18    |  93.0        |  92.99
18-20    | 133.5        | 132.91
20-22    | 134.0        | 124.79
22-24    | 127.0        | 124.28
24-26    | 135.5        | 146.35
26-28    | 139.5        | 145.56
28-30    | 112.0        | 106.55
30-32    | [illegible]  |  56.22
32-34    | [illegible]  |  21.36
34-36    | [illegible]  | [illegible]
36-38    | [illegible]  | [illegible]
Totals   | 1000         | 999.85

(ii) T. brucei. The data for this trypanosome were taken from Sir David Bruce and colleagues' diagram†. I have not come across the original publication with the measurements involved in this diagram. Describing this species in July 1910‡, the authors speak of its well-marked dimorphism. This is very obvious in the graphs for length given for the Uganda 1909 and Zululand 1894 strains, but the numbers given are far too slender (160 and 200 respectively) to justify any attempt at analytical resolution.
Graphically we may take it that roughly the following are the means of the components:

               | T. minus   | T. majus
Uganda 1909    | 20 microns | 28 microns
Zululand 1894  | 18 microns | 29 microns

These are not very widely divergent from the values 19.8 microns and 26.0 microns we have found from the seven resolutions. In May 1911§ the two curves for Uganda and Zululand appear to be added together to give a T. brucei curve of length distribution. This is again markedly bimodal with one component mean at 18.75 microns and the other at 27.5 microns, both approximative. Thus far T. brucei appears quite well to fit in with our other material. But in September 1911 appears the diagram of T. brucei said to be based on 1000 individuals. Here there is a mode about 24.0, with possibly a sub-mode at 19 microns, but the evidence for dimorphism has largely disappeared. It is very desirable that we should know the details of this curve, i.e. the nature of the hosts and so forth, for it apparently replaces the earlier data and remains the standard T. brucei distribution. It certainly shows nothing of the definite heterogeneity (or dimorphism) of the previous Uganda material.

* Bias of this or of a similar character is not uncommon, even in the pages of this Journal. I remember once pointing out to a Scotch anthropometer his prejudice in favour of whole centimetres. He looked at his results, recognised the bias, and then gravely told me that it was not due to any personal bias, but that the Creator must have designed Scotsmen on the metric scale!
† R. S. Proc. Vol. 84, B, p. 331.
‡ R. S. Proc. Vol. 83, B, p. 2.
§ R. S. Proc. Vol. 84, B, p. 186.

Its constants are as follows:

Mean = 23.5290, $\mu_2 = 19.30583$, $\mu_3 = 10.54837$, $\mu_4 = 996.87764$, $\mu_5 = 2146.37930$.

$$24q^9 - 101.8648q^7 - 4.0057q^6 + 140.6937q^5 + 62.0835q^4 - 29.3940q^3 + 11.2371q^2 + 1.4413q + .0331 = 0.$$
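The coefficients of the reducing nonic follow from the four moments, and a short computation can stand in for part of the "laborious arithmetic". The sketch below assumes the coefficient formulas of the 1894 memoir in the form written in the comments, with $\lambda_4 = 9\mu_2^2 - 3\mu_4$ and $\lambda_5 = 30\mu_2\mu_3 - 3\mu_5$, followed by the substitution $p_2 = -10q$; these formulas are stated from memory of that memoir and should be treated as an assumption to be checked against it. The function name `reducing_nonic` is hypothetical. Used on the T. brucei constants it should return coefficients close to those printed above.

```python
def reducing_nonic(mu2, mu3, mu4, mu5):
    """Coefficients c[k] of q**k, k = 0..9, of the reducing nonic after
    the substitution p2 = -10 q (assumed form, see lead-in)."""
    l3 = mu3
    l4 = 9 * mu2 ** 2 - 3 * mu4
    l5 = 30 * mu2 * mu3 - 3 * mu5
    # a[k] is the coefficient of p2**k before the substitution
    a = [0.0] * 10
    a[9] = 24.0
    a[7] = -28 * l4
    a[6] = 36 * l3 ** 2
    a[5] = -(24 * l3 * l5 - 10 * l4 ** 2)
    a[4] = -(148 * l3 ** 2 * l4 + 2 * l5 ** 2)
    a[3] = 288 * l3 ** 4 - 12 * l3 * l4 * l5 - l4 ** 3
    a[2] = 24 * l3 ** 3 * l5 - 7 * l3 ** 2 * l4 ** 2
    a[1] = 32 * l3 ** 4 * l4
    a[0] = -24 * l3 ** 6
    # put p2 = -10 q and divide through by (-10)**9 so the lead term is 24 q**9
    return [a[k] * (-10.0) ** k / (-10.0) ** 9 for k in range(10)]

def evaluate(coefficients, q):
    """Value of the nonic at q."""
    return sum(c * q ** k for k, c in enumerate(coefficients))

# T. brucei constants as given in the text.
brucei = reducing_nonic(19.30583, 10.54837, 996.87764, 2146.37930)
for power in range(9, -1, -1):
    print("q^%d: %.4f" % (power, brucei[power]))
```

Finding the admissible root, and passing from it to the two components, needs the further equations of the 1894 memoir and is not attempted in this sketch; `evaluate` only allows a stated root to be checked by substitution.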
No suitable root of this equation exists and accordingly it would appear that this distribution is not rigidly reducible to Gaussian components. This result is so remarkable in view of the obviously bi-modal character of the earlier T. brucei distribution, and the resolution into two components of all the other seven distributions, said to be allied to T. brucei, that I determined to consider the matter further by fitting Gaussians to the 'tails' of the T. brucei distribution*. I chose as the right-hand 'tail' the frequency from 28 to 38 inclusive, and as the left-hand 'tail' the frequency from 13 to 18 microns inclusive. The two resulting components were:

T. minus: $m_1 = 20.0817$ (19.83), $\sigma_1 = 2.8685$ (1.85), $n_1$ = [illegible].
T. majus: $m_2 = 26.4359$ (25.95), $\sigma_2 = 3.6399$ (3.03), $n_2 = 467.52$.

The total populations for each component are clearly not very good and their combination exceeds by 9.6 % the total observed population; but the means are not widely divergent from the average values resulting from our six resolutions, as the numbers given in brackets testify. Accordingly I determined to select the means of the components at values near the mean values of six reductions, and after one or two slight betterments, determine the sizes of the populations and their standard deviations so as to give the mean, and second and third moments of the observed population. These provided:

T. minus: $m_1 = 19.8244$, $\sigma_1 = 2.6439$, $n_1 = 410.83$.
T. majus: $m_2 = 26.1122$, $\sigma_2 = 3.4134$, $n_2 = 589.17$.

* Biometrika, Vol. II. p. 1 and Vol. VI. p. 65.
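Once the two means are fixed, the remaining four constants follow from the total, the mean and the second and third moments by linear algebra. The text says only that the populations and standard deviations were chosen to reproduce those moments; the linear solve below is one way of doing so, offered as a sketch rather than as the procedure actually followed, and the function names are hypothetical. The second function forms the compound frequency of a unit cell from the two Gaussian areas, again on the assumption of cells centred on whole microns.

```python
import math

def components_from_fixed_means(total, mean, mu2, mu3, m1, m2):
    """Sizes and standard deviations of two Gaussian components with
    given means m1 < m2 that reproduce the total, mean, mu2 and mu3."""
    p1 = (m2 - mean) / (m2 - m1)   # proportions follow from the mean alone
    p2 = 1.0 - p1
    d1, d2 = m1 - mean, m2 - mean
    # mu2 = p1*(s1 + d1^2) + p2*(s2 + d2^2), with s = sigma^2
    # mu3 = p1*(3*s1*d1 + d1^3) + p2*(3*s2*d2 + d2^3)
    a = mu2 - p1 * d1 ** 2 - p2 * d2 ** 2            # = p1*s1 + p2*s2
    b = (mu3 - p1 * d1 ** 3 - p2 * d2 ** 3) / 3.0    # = p1*d1*s1 + p2*d2*s2
    det = p1 * p2 * (d2 - d1)
    s1 = (a * p2 * d2 - b * p2) / det
    s2 = (b * p1 - a * p1 * d1) / det
    return (total * p1, m1, math.sqrt(s1)), (total * p2, m2, math.sqrt(s2))

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def compound_cell(x, components):
    """Calculated frequency of the unit cell centred on x microns."""
    return sum(n * (normal_cdf((x + 0.5 - m) / s) - normal_cdf((x - 0.5 - m) / s))
               for n, m, s in components)

# T. brucei: observed total, mean and moments, with the two selected means.
minus, majus = components_from_fixed_means(
    1000.0, 23.5290, 19.30583, 10.54837, 19.8244, 26.1122)
print(minus)  # the text gives n1 = 410.83, sigma1 = 2.6439
print(majus)  # the text gives n2 = 589.17, sigma2 = 3.4134
print(round(compound_cell(21, [minus, majus]), 2))  # tabled calculated value: 78.43
```

Because the proportions are fixed by the mean alone, the choice of the two means leaves no freedom in the populations; only the two variances remain to absorb the second and third moments.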
The following table gives the observed and calculated values:

Microns | Observed     | Calculated
13      |   5          |  3.44
14      |   8          |  5.80
15      |  14          | 12.25
16      |  17          | 22.79
17      |  40          | 37.05
18      |  63          | 52.87
19      |  55          | 66.68
20      |  66          | 75.44
21      |  63          | 78.43
22      |  75          | 77.49
23      |  87          | 75.61
24      |  93          | 74.71
25      |  80          | 74.36
26      |  82          | 72.74
27      | [illegible]  | 67.98
28      |  50          | 59.71
29      |  38          | 48.10
30      | [illegible]  | 36.04
31      |  26          | 24.79
32      |  18          | 15.67
33      |  11          |  9.09
34      | [illegible]  |  4.84
35      |   4          |  2.37
36      |   -          |
37      |   -          |  1.75 (36 to 38 together)
38      |   2          |

From these results we find $\chi^2 = 29.92$ and $P = .22$. Thus more often than once in five trials we should get a worse divergence than the observed, if the sample were taken from the calculated population. Some endeavour was made to better the fit by small variations from the above solution, discussed by least squares, but no improvement was effected. The two components are represented in Diagram VII (p. 132).

(iii) T. gambiense.

Mean = 22.1130, $\mu_2 = 14.3389$, $\mu_3 = 29.1104$, $\mu_4 = 531.3585$, $\mu_5 = 2429.0948$.

Reducing nonic:

$$24q^9 - 71.7810q^7 - 30.5070q^6 - 300.0260q^5 + 869.6372q^4 - 278.8475q^3 - 270.9547q^2 + 58.9108q + 14.6050 = 0.$$

This leads to $p_2 = -10q = -9.1777$, and the components given in the Table p. 125. The two Gaussians and their compound are given in Diagram VIII (p. 133). We find $\chi^2 = 11.96$, giving for $n' = 18$, $P = .80$, a splendid fit.

(iv) Mzimba Strain (from Donkey).

Mean = 21.4130, $\mu_2 = 8.7531$, $\mu_3 = 26.6602$, $\mu_4 = 293.5629$, $\mu_5 = 1926.7045$.

The reducing nonic:

$$24q^9 + 53.5186q^7 - 25.5876q^6 - 41.5706q^5 - 171.2637q^4 + 227.1211q^3 - 37.3371q^2 - 30.8995q + 8.6177 = 0.$$

The required root is $p_2 = -10q = -4.0000$, which leads to the two components given in the Table on p. 125. The two components and their compound curve are figured on Diagram IX. We have $\chi^2 = 19.28$, giving for $n' = 17$, $P = .26$, a fairly reasonable fit.

[Diagram VII. Resolution of the Frequency of T. brucei into T. minus and T. majus. Total 1000. Abscissa: Microns; ordinate: Frequencies per Micron.]

[Diagram VIII. Resolution of the Frequency of T. gambiense into T. minus and T. majus. Total 1000. Abscissa: Microns; ordinate: Frequencies per Micron.]

[Diagram IX. Resolution of the Frequency of the Mzimba Strain into T. minus and T. majus. Total 1000. Abscissa: Microns; ordinate: Frequencies per Micron.]

(v) Wild G. morsitans Strain.

Mean = 22.6952, $\mu_2 = 18.4918$, $\mu_3 = 43.0246$, $\mu_4 = 758.4420$, $\mu_5 = 3954.8788$.

Reducing nonic:

$$24q^9 - 224.6115q^7 - 66.6402q^6 - 595.9589q^5 + 5079.3305q^4 - 4500.7030q^3 - 1460.5459q^2 + 879.6116q + 152.2340 = 0.$$

The required root is $p_2 = -10q = -4.75085$, which leads to the components given in Table on p. 125. These components with their compound curve are drawn in Diagram X (p. 136). Here $\chi^2 = 92.75$ which for 20 groups gives $P < .000{,}000{,}1$. Thus although the G. morsitans strain breaks up into two components the combined curve is not a probable description of the frequency. One would like to test another sample of this strain; at present it tells against the validity of our reduction.

(vi) Wild-Game Strain.

Mean = 22.6220, $\mu_2$ = [illegible], $\mu_3 = 29.0514$, $\mu_4 = 404.4932$, $\mu_5 = 2247.6657$.

Reducing nonic:

$$24q^9 - 18.9446q^7 - 30.3834q^6 - 250.2869q^5 + 851.7475q^4 + 118.6154q^3 - 212.3972q^2 + 15.4222q + 14.4283 = 0.$$

The root required is $p_2 = -10q = -6.9859$. There result the two components provided in the Table p. 125.
The two components and their compound are figured on Diagram XI (p. 137). We find $\chi^2 = 12.61$ giving for $n' = 19$, $P = .81$, an excellent fit.

(vii) Human Strain.

Mean = 23.7963, $\mu_2 = 17.0252$, $\mu_3 = 27.1389$, $\mu_4 = 713.1660$, $\mu_5 = 3034.1222$.

Reducing nonic:

$$24q^9 - 131.3796q^7 - 26.5147q^6 - 89.8059q^5 + 964.4176q^4 - 674.2755q^3 - 114.7894q^2 + 81.4492q + 9.5887 = 0.$$

The root is given by $p_2 = -10q = -8.5576$, which leads to the components given in the Table on p. 125. The two curves and their compound are figured in Diagram XII (p. 138). Although the two components merely from the graphical point of view do not give a bad fit, the number of trypanosomes involved is so large that the deviations are not reconcilable with random sampling from two such components. We find $\chi^2 = 79.67$, giving $P < .000{,}001$.

[Diagram X. Resolution of the Frequency of the Wild G. morsitans Strain into T. minus and T. majus. Abscissa: Microns; ordinate: Frequency per Micron.]

In order to determine how far heterogeneity of treatment or material might be responsible we took further frequencies. In the first place we dealt with the 3600 measurements for trypanosomes through the rat only. The frequencies are:

Microns   | 15 | 16 | 17 |  18 |  19 |  20 |  21 |  22 |  23 |  24 |  25 |  26
Frequency |  1 |  5 | 47 | 112 | 161 | 216 | 290 | 316 | 376 | 362 | 322 | 294

Microns   |  27 |  28 |  29 |  30 |  31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | Totals
Frequency | 235 | 219 | 210 | 134 | 108 | 88 | 57 | 28 |  9 |  8 |  1 |  1 | 3600

[Diagram XI. Resolution of the Frequency of the Wild-Game Strain into T. minus and T. majus. Total 2500. Abscissa: Microns; ordinate: Frequencies per Micron.]
[Diagram XII. Resolution of the Frequency of the Human Strain into T. minus and T. majus. Total 6220. Abscissa: Microns; ordinate: Frequency per Micron.]

These give:

Mean = 24.6175, $\mu_2 = 15.25897$, $\mu_3 = 19.21542$, $\mu_4 = 602.23008$, $\mu_5 = 2023.21556$,

leading to the reducing nonic:

$$24q^9 - 80.8739q^7 - 13.2924q^6 - 42.3159q^5 + 306.5227q^4 - 166.4257q^3 - 24.8654q^2 + 12.6008q + 1.2081 = 0,$$

which gives $p_2 = -10q = -7.0031$. This provides the two components:

T. minus: $m_1 = 21.6772$, $\sigma_1 = 2.2404$, $n_1 = 1611.18$.
T. majus: $m_2 = 26.9993$, $\sigma_2 = 3.298$[?], $n_2 = 1988.82$.

The components and their compound are figured in Diagram XIII, p. 140, and we find for $n' = 21$, $\chi^2 = 52.68$ and $P = .00016$. There has thus been much improvement of goodness of fit, although the result is still unsatisfactory. It is impossible, however, to look through the graphs given by Sir David Bruce and others for the human strains* without being convinced of their fundamentally bimodal character, although there appears to be much evidence of its being disguised by heterogeneity of host and treatment.

(viii) Diagram XIV (p. 141) gives the resolution for the human strain from Chituluka†. The constants are:

Mean = 26.172, $\mu_2 = 23.02260$, $\mu_3 = -37.18226$, $\mu_4 = 1179.30786$, $\mu_5 = -3248.43805$,

leading to the reducing nonic:

$$24q^9 - 393.8678q^7 - 49.6370q^6 + 520.2910q^5 + 8226.9435q^4 - 12493.5620q^3 - 101.1017q^2 + 855.7520q + 63.2383 = 0.$$

The value of the root is $p_2 = -10q = -16.2295$ and this leads to the components given in the Table p. 125, and illustrated in the diagram. The graph while giving broadly some of the features of the case is by no means a satisfactory fit; for $n' = 21$ groups, $\chi^2 = 86$ and $P$ is $< .000{,}000{,}1$.
The diagram suggests that we are probably dealing with a mixture of three components with means about 18.5, 25.5 and 31.0, but at present we have no satisfactory method of performing multiple resolutions of this character.

* R. S. Proc. Vol. 86, B, pp. 285-302.
† R. S. Proc. Vol. 86, B, p. 291.

[Diagram XIII. Resolution of the Frequency of Trypanosomes from Human Strain, through Rats only, into T. minus and T. majus. Abscissa: microns; ordinate: frequency per micron. Total 3600.]

[Diagram XIV. Resolution of the Frequency of Trypanosomes from the Native Chituluka into T. minus and T. majus. Abscissa: microns; ordinate: frequency per micron.]

It will be seen that the following strains, T. rhodesiense, T. brucei, T. gambiense, the Mzimba, and wild game, give either reasonable or excellent results as combined frequencies of T. minus and T. majus. On the other hand the G. morsitans and the human strains break up into reasonable pairs of components, but the goodness of fit test is not fulfilled. In the case of the human strain, we better matters somewhat by taking the strain through the rat only, but the fit is still bad. If we confine our attention to a single human being, the case of Chituluka, we still do not get a satisfactory fit, although few statisticians could look at the four diagrams published by Sir David Bruce and others for Chituluka*, and not recognise the character of the material as being at least bimodal. The same applies to the Mkanyanga data of an earlier paper†; it is distinctly bimodal. But besides this bimodal character there are certain other features in the human data, and to a lesser extent in the G. morsitans, which appear to some extent to disguise the bimodal features. I am not prepared to assert definitely that this is the appearance of a third component.
It is of course easy to improve the fit of the distribution by the introduction of such a third component, but the remarkable excellence of a bimodal resolution for T. rhodesiense, T. gambiense, and the wild-game strain makes me hesitate at present to adopt such an expedient.

Owing to the courtesy of Sir David Bruce (who heard from Sir John Rose Bradford that I was much puzzled over the differentiation of strains) I have been able to examine a series of drawings of the various strains of trypanosomes. There is no other morphological differentiation which impresses itself a priori on the layman and statistician, and which might serve as a new measure of the possibility of differentiation into T. minus and T. majus. But it occurs to me that an index of breadth to length of the nucleus might just possibly serve as a differential character of even more importance than the length. It is only a suggestion and considerable caution would have to be used in selecting only nuclei not near the dividing stage. But it would be of striking interest to see how far the resulting frequency distributions for the nuclear indices were or were not bimodal. I think a classification according to nuclear index might possibly, to judge from the drawings, cut across the forms "intermediate" in length. But this is only a suggestion which may appear idle to the student of the subject‡. Some difficulty might also arise from the doubt as to whether the index was really greater than 100, or the nucleus as a whole had set itself athwart the "length" of the trypanosome. This difficulty would certainly have to be considered in the "stumpy" T. brucei and T. gambiense forms, but I am inclined to think that the index really passes through the value 100. Undoubtedly this range of index, or possible athwartness of the nucleus, is not conspicuous in the simple strains like T. pecorum, T. simiae and T. caprae.

* R. S. Proc. Vol. 86, B, pp. 291 to 293.
† R. S. Proc. Vol. 85, B, p. 428.
‡ Several students of the subject with whom I discussed the matter stated that they considered the nucleus so mobile and so impermanent in form, that a "nuclear index" would prove of little value. I think much objection could a priori be raised to the use of the trypanosome "length" on the same grounds. The problem is rather, whether in dealing with large numbers we do reach an average type. It would only be possible a posteriori to justify the use of a nuclear index, i.e. if it were found to differ sensibly from one pure strain to a second, and if it confirmed in such cases as T. rhodesiense resolutions based on length frequencies.

Conclusions.

(i) If appeal be made to statistical measurements, judgment between identity and diversity of strain must be formed by means of accepted statistical processes and not by mere comparison of graphs.

(ii) Statistical processes show that the conclusions already formed as to the identity of trypanosome strains from mere inspection of the graphs cannot be confirmed.

(iii) There must be some standardised process of treatment both in regard to host, and to method of and stage of infectivity at extraction.

(iv) Even making allowance for differences due to host and treatment, we find remarkable divergences in the very strains asserted to be identical.

(v) It would appear that some order would be brought into the chaos, if we could consider the strains described as T. brucei, T. rhodesiense, T. gambiense, the wild-game, the Mzimba, and very probably the tsetse fly and the human strains as really consisting of two components, which for the time I have termed T. minus and T. majus. It is highly desirable that additional measurements should be made (? a nuclear index ascertained) to determine whether these lead also to similar components. I do not assume that this is a final solution of the problem, nor do I assert that T. minus and T.
majus represent necessarily, although probably, distinct strains; they may be dimorphic forms of one and the same strain occurring in different proportions. But I believe that the suggestion of their existence may help to explain some anomalies of the present chaos.

I ought also to state quite frankly that this paper is not written in a merely critical spirit. I believe that the trypanosome workers have undertaken in their elaborate systems of measurements most laborious and most valuable work, but I think the time has now come when, without trained statistical aid, but little further progress will be made in a very important and urgent matter.

The very large amount of arithmetical work in this paper would never have got carried through had I not had the ever ready assistance of my colleague Miss Julia Bell; to Mr H. E. Soper also I owe help in the arithmetical work, but I have to thank him in particular for the careful preparation of the diagrams, and the planimetric determination of their frequencies by aid of which the $\chi^2$ for all but two of the compound curves was found. In the case of T. brucei and T. rhodesiense actual calculation of the areas of the normal curves was used.

ON HOMOTYPOSIS AND ALLIED CHARACTERS IN EGGS OF THE COMMON TERN

By WILLIAM ROWAN, K. M. PARKER, B.Sc., and JULIA BELL, M.A.

(1) Origin of the material and method of measurement.

The settlement of Common Terns, which provided material for the present work, is one of old establishment on Blakeney Point, Norfolk. This is a shingle spit of some 8 miles in length on the north coast of Norfolk, about 12 miles west of Cromer. The colony is situated on the very end of the point, with water on three sides. Here the spit is a combination of dunes, salt marsh and shingle, and for the most part the nests are found on the open shingle on the seaward side of the dunes. Nests are plentiful in the embryo dunes in some years, though this year (1913) none were found there.
The colony was more scattered than usual and covered the greater part of a mile of sea front. To avoid missing any clutches, Miss K. M. Parker, B.Sc., and Mr William Rowan divided the nesting area into suitable well marked plots and worked these one after another. Each of these again was worked in strips, till a patch was completed, when the workers moved on to a remote one, to give the birds a chance of settling down again. After measurement each egg was numbered with indelible ink, so that any one egg was never measured twice. In all 203 clutches were handled.

(2) Reduction of the material.

The principal part of the work of tabling and reduction was carried out by Julia Bell*. The characters dealt with were:

(i) Length of Egg: $L$
(ii) Breadth of Egg, maximum value: $B$
(iii) Lateral Girth at section with maximum breadth: $G_b$
(iv) Longitudinal Girth: $G_l$
(v) Length-Breadth Index: $B/L$
(vi) Mottling, as determined from a scale of nine eggs: $M$
(vii) Ground Colour, as determined from a tint scale: $C$

* The authors have to thank Miss B. M. Cave for certain tables and their correlation coefficients. The Editor is responsible for the actual wording of this paper.

The Length of egg $L$ may be considered as the easiest character to determine and needs no further comment. The Breadth of egg $B$ should be closely related to the Lateral Girth $G_b$, and in most cases the relationship $G_b = \pi B$ is very closely satisfied. If we sum and take the means we have

$$\pi = \text{Mean Lateral Girth} / \text{Mean Breadth}.$$

This gives in the present material $\pi = 3.224$ as against $3.142$, which marks an error of about 2.6 %, rather larger than we might anticipate, and possibly due to the inclusion of a certain number of slightly damaged eggs, and the measurement of the eggs in the field and not in the laboratory.
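The arithmetic of this check is short enough to set out. The sketch below is illustrative only: the function names and the 5 per cent tolerance in `girth_consistent` are assumptions introduced here and are not values from the paper; only the ratio 3.224 is taken from the text.

```python
import math


def percent_excess(ratio, true_value=math.pi):
    """Percentage by which an observed girth/breadth ratio exceeds pi."""
    return 100.0 * (ratio - true_value) / true_value


def girth_consistent(breadth_cm, lateral_girth_cm, tolerance_percent=5.0):
    """Field check on one egg: is lateral girth / breadth within tolerance of pi?

    The tolerance is a hypothetical choice made for illustration.
    """
    ratio = lateral_girth_cm / breadth_cm
    return abs(percent_excess(ratio)) <= tolerance_percent


# Ratio of mean lateral girth to mean breadth quoted in the text.
print(round(percent_excess(3.224), 1))  # 2.6
```

A single egg could be screened in the same way before it is returned to the nest, for instance with `girth_consistent(3.0, 9.6)`, where both measurements are invented values.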
The relation between $G_b$ and $B$ is a useful test of accuracy and should be determined with a slide rule before the egg is finally replaced in the nest, or lost sight of.

The Longitudinal Girth $G_l$ is somewhat more difficult to measure, and a rough test of its accuracy not so easy to determine as in the case of $G_b$. We have, however, developed a formula for determining $G_l$ in terms of $B$ and $L$, and on testing it we find that as a rule the differences are below 1.5 mm. Such a formula may be useful as emphasising the need for remeasurements, when the difference between the observed and calculated girths is much in excess of 1.5 mm. We are not prepared to say, however, that the coefficients in this formula can be extended beyond the case of the Common Tern.

While the Length-Breadth Index is valuable as giving a measure of the ellipticity of the egg, it is not of much influence on the apparent oval shape, unless we suppose some theoretical geometrical construction for the egg. If we suppose the blunt end of the egg to be approximately spherical, the hemisphere ending with the maximum breadth, then the egg might be considered as divided into two portions, the upper or hemispherical with radius $\tfrac{1}{2}B$ and the lower with length from the base of the hemisphere (or 'equator') to the lower pole $= L - \tfrac{1}{2}B$. The ratio of these two segments of the length depends only on the index $B/L$. Thus it is conceivable that this index has actually as much association with ovality as with ellipticity, although without some geometric theory of egg-shape, we are not able to make any dogmatic assertion as to the value of $B/L$. It seems, however, a character of considerable interest as being free of absolute size and also some measure of shape. If $I = B/L$ and $O$ be the ratio of $\tfrac{1}{2}B$ to $L - \tfrac{1}{2}B$, i.e.

$$O = \frac{B/L}{2 - B/L} = \frac{I}{2 - I},$$

we may consider $O$ a measure of the ovality, and we have correlated $O$ for eggs of the same clutch as well as $I$. Of course, since $O$ is a function of $I$, there will be relatively little difference in the results.

The mottling is a far more difficult matter for determination. The points which may be considered are:

(i) Size and shape of individual splodges.
(ii) Portion of the egg over which these splodges are distributed.
(iii) Area of mottled surface as compared with whole area of the egg.

The fieldworkers selected 9 typical mottlings (see Plate IX) and named these a, b, c, d, e, f, g, h, i; they then compared each recorded egg with these and selected the letter which marked the egg on the scale most resembling the egg to be recorded. There is little doubt that in this manner they divided the whole series of eggs into differentiated classes. But it may be doubted whether the judgment made depended on one only of the above three characteristics. Hence when we came to arrange the eggs a, b, c, d, ... h, i on a scale of mottling, we found that the order would not be the same when we classified in turn by each of the three characteristics. We endeavoured to place the eggs in order by extent of mottling, i.e. by (iii), but we think that the relatively low value of the homotyposis which has resulted is possibly due to size and shape of the mottlings, (i), having had as much influence on the classification as the extent of area mottling. Even position on the egg, (ii), can influence judgment considerably. We believe that in future work on eggs, it would be desirable to classify the mottling of each by using the three characteristics independently. Even then an ocular appreciation, as this must be, may fail to give a very close measure of the nature of the mottling and thus weaken any homotypic correlation.

The Ground Colour of these eggs varies through all shades of brown to brownish greens and blue-greens. The fieldworkers attempted to give the value or depth of ground-colour pigmentation without regard to the brown or green shade of colouring.
The scale of values is given at the foot of Plate VIII.

A point seemed worth consideration: assuming the pigments to be deposited on the egg in its passage through the oviduct, it was conceivable that greater pressure might indicate greater intensity of pigmentation. We accordingly selected the broader egg in each clutch and investigated for every pair of eggs from the same clutch whether the broader or narrower egg had the larger mass of mottling and greater density of ground colour. We reached the following results:

The broader egg in every possible clutch-pair has
Greater mottling in 26 cases | More dense ground colour in 25 cases
The same mottling in [illegible] cases | The same ground colour in 39 cases
Less mottling in 40 cases | Less dense ground colour in [illegible] cases

Perhaps not very much stress is to be laid on these results, but they suggest that the total amount of pigment deposited is less the broader the egg, i.e. for the same bird a relatively smaller egg will be more pigmented. A solution of this rather unexpected result may, perhaps, be found in the suggestion that the total amount of pigment is the same in both eggs, but the mottling and ground colour will appear denser on the smaller surface of the smaller egg. The point deserves consideration on the basis of larger numbers and possibly better defined measures of pigmentation.

[Plate VIII. Sample eggs, Common Tern, natural size, with Colour Value Scale.]

[Plate IX. Types of mottling of eggs of Common Tern, a to i.]

(3) Means and Variability.

Table I gives the means, standard deviations and coefficients of variation of the several characters studied. It will be seen that the tern's egg has for quantitative characters relatively small variation. The values of the coefficients of variation are less than many of those which we find for the human skull (3 to 8), but greater than those we know for the wing of the wasp. It is very doubtful whether the coefficients of variation of the indices should be included in such considerations, for the object of the use of these coefficients is to get rid of absolute lengths, and this is already done in the case of indices†. It is noteworthy that the length of the egg is only slightly more variable than the breadth and the breadth-girth is actually more variable than the length-girth.

TABLE I. Means and Variabilities (Absolute Measurements in Centimetres).

Character | Mean | Standard Deviation | Coefficient of Variation
Length $L$ | 4.14 ± .007 | .180 ± .005 | 4.34 ± .12
Breadth $B$ | 2.98 ± .004 | .099 ± .003 | 3.33 ± .09
Girth $G_l$ | 11.39 ± .015 | .376 ± .010 | 3.30 ± .09
Girth $G_b$ | 9.59 ± .014 | .347 ± .010 | 3.62 ± .10
Index $B/L$ | 72.04 ± .136 | 3.449 ± .096 | [4.79 ± .13]
Index of Ovality, $O$* | 56.35 ± .171 | 4.334 ± .121 | [7.69 ± .22]

* $O = (B/L)/\{2 - (B/L)\}$.
† For example, if we take $1/O$ for our index of ovality its mean = 176.32, the standard deviation = 11.24 and the coefficient of variation = 6.38. Is $O$ or $1/O$ the more variable? It does not seem that the coefficient of variation can help us in such a problem.

(4) Correlations.

If we turn to the correlation of characters in the same egg, we note that while the ordinary product-moment correlation $r$ has been calculated for all measurable pairs of characters, this is not possible for the ground colour or the mottling. Where mottling has been used with a quantitative character, $\eta$ has been calculated and both corrections used. Where mottling has been considered in conjunction with ground colour, we have adopted mean square contingency, correcting for both number of cells and for class-index correlations.

TABLE II. Correlations of Characters in the same Egg.

Characters | Symbols | Correlation | Remarks
Length and Breadth | $L, B$ | +.2220 ± .0374 |
Longitudinal and Equatorial Girths | $G_l, G_b$ | +.5297 ± .0284 |
Length and Longitudinal Girth | $L, G_l$ | +.8804 ± .0088 |
Breadth and Longitudinal Girth | $B, G_l$ | +.5216 ± .0286 |
Index and Longitudinal Girth | $B/L, G_l$ | -.3832 ± .0336 |
Index and Length | $B/L, L$ | -.7284 ± .0185 |
Index and Breadth | $B/L, B$ | +.5033 ± .0294 |
Mottling and Ground Colour | $M, C$ | +.2260 (corrected $C_2$) | More mottling, deeper ground colour
Mottling and Index | $M, B/L$ | -.1550 (corrected $\eta$) | Less mottling, higher index
Mottling and Breadth | $M, B$ | -.1803 (corrected $\eta$) | Less mottling, greater breadth
Ground Colour and Index | $C, B/L$ | .0000 (corrected $\eta$) | No relationship
Ground Colour and Breadth | $C, B$ | -.1506 (corrected $\eta$) | Fainter ground colour, greater breadth

Certain facts are at once obvious from this Table, others are obscured. In the first place length and breadth of the egg of the Common Tern have a relatively small relationship, while the relationship between the two girths is between two and three times as great. This probably flows from the consideration that the correlation of $G_l$ and $G_b$ arises from $B$ being a factor in both and only secondarily from the correlation between $L$ and $B$. The correlation of the longitudinal girth with egg length is 60% higher than that of longitudinal girth with egg breadth; both these correlations are more substantial than that of the longitudinal girth, $G_l$, with the egg index, $B/L$. The egg index correlated with length is large and negative, and with breadth considerable and positive, precisely the results we should anticipate would appear if the correlation were largely spurious*.

In order to ascertain how far it was possible to predict the longitudinal girth from length and breadth, double (for $L$ and $B$) and triple (for $L$, $B$ and $B/L$) regression formulae were worked out. Here $I$ denotes the index $B/L$ on the percentage scale of Table I, and barred symbols denote means. The following equations resulted:

(i) $G_l - \bar{G}_l = 1.2701\,(B - \bar{B}) + 1.6415\,(L - \bar{L})$, or

$$G_l = 1.2701\,B + 1.6415\,L + .8224,$$

and

(ii) $G_l - \bar{G}_l = -17.2930\,(B - \bar{B}) + 14.6374\,(L - \bar{L}) + .7636\,(I - \bar{I})$, or

$$G_l = -17.2930\,B + 14.6374\,L + .7636\,B/L - 52.7239.$$
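The two regression formulae can be written as small functions, which is convenient for the field check the authors have in mind. This is a sketch under stated assumptions: the decimal points of the coefficients are read as 1.2701, 1.6415, .8224 and 17.2930, 14.6374, .7636, 52.7239 (so read, formula (i) returns very nearly the mean longitudinal girth of Table I at the mean length and breadth); $B/L$ in formula (ii) is taken to be the index 100 × breadth/length; and the function names and the 0.15 cm threshold in `needs_remeasurement` are introduced here for illustration.

```python
def girth_formula_i(breadth_cm, length_cm):
    """Longitudinal girth (cm) from the double regression on breadth and length."""
    return 1.2701 * breadth_cm + 1.6415 * length_cm + 0.8224


def girth_formula_ii(breadth_cm, length_cm):
    """Longitudinal girth (cm) from the triple regression on breadth, length and index."""
    index = 100.0 * breadth_cm / length_cm  # assumed percentage scale for B/L
    return (-17.2930 * breadth_cm + 14.6374 * length_cm
            + 0.7636 * index - 52.7239)


def needs_remeasurement(observed_girth_cm, breadth_cm, length_cm, threshold_cm=0.15):
    """Flag an egg whose observed girth departs from formula (i) by more than the threshold.

    The threshold is a hypothetical choice, not a rule stated in the paper.
    """
    predicted = girth_formula_i(breadth_cm, length_cm)
    return abs(observed_girth_cm - predicted) > threshold_cm


# Evaluate both formulae at the mean breadth and length of Table I.
print(round(girth_formula_i(2.98, 4.14), 2), round(girth_formula_ii(2.98, 4.14), 2))
```

At the tabulated means both formulae return values close to the mean longitudinal girth of 11.39 cm, which is the behaviour one would expect of regression equations passing through the means.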
The first seventeen eggs were taken as a random set to test these results upon with the following values:

* As a matter of fact the correlation of index and length for a constant breadth is -.997 and of index and breadth for a constant length is +.996 instead of unity. These values indicate how closely the linearity of regression holds in these quantitative measurements.

TABLE III. Observed and Calculated Longitudinal Girths.

Egg Number | Observed Girth | Calculated Girth (ii) | Calculated Girth (i) | Difference $\Delta_{ii}$ | Difference $\Delta_{i}$
1 | 11.40 | 11.14 | 11.20 | +.26 | +.20
2 | 11.65 | 11.83 | 11.74 | -.18 | -.09
3 | 12.10 | 12.24 | 12.07 | -.14 | +.03
4 | 10.80 | 11.46 | 10.84 | -.66 | -.04
5 | 11.70 | 11.23 | 11.31 | +.47 | +.39
6 | 11.20 | 11.27 | 11.34 | -.07 | -.14
7 | 12.15 | 13.19 | 12.31 | -1.04 | -.16
8 (i) | 11.20 | 11.19 | 11.27 | +.01 | -.07
8 (ii) | 11.30 | 11.09 | 11.27 | +.21 | +.03
9 (i) | 11.50 | 11.44 | 11.61 | +.06 | -.11
9 (ii) | 11.40 | 11.36 | 11.45 | +.04 | -.05
10 | 11.50 | 11.52 | 11.61 | -.02 | -.11
11 | 11.80 | 11.55 | 11.72 | +.25 | +.08
12 | 11.90 | 11.62 | 11.74 | +.28 | +.16
13 (i) | 11.10 | 11.01 | 10.94 | +.09 | +.16
13 (ii) | 10.80 | 10.75 | 10.78 | +.05 | +.02
13 (iii) | 11.70 | 11.45 | 11.55 | +.25 | +.15
Root mean square $\Delta$ | | | | .354 | .146

To judge by this small sample we obtain only increased inaccuracy by taking the more complicated formula. We shall only make an error of about 1.5 mm. if we calculate the longitudinal girth from

$$G_l = 1.2701\,B + 1.6415\,L + .8224,$$

and for the egg of the Common Tern at least this is a convenient formula for verifying measurements in the field.

The remaining correlations indicate sensible relationships, but these correlations might well be substantially higher had a better scale of mottling been adopted ab initio. In the first place we see that the mottling and the ground colour are sensibly correlated, and the deeper the ground colour the more intense is the mottling*. We have already seen (p. 146) that for eggs of the same clutch the broader has less intensity of ground colour and more meagre mottling. This is true for the eggs of the Common Tern in general, although it is probable that a better classification of mottling would bring out more marked correlations. The following are the orders (a) of mottling chosen, (b) of breadth classes, (c) of index classes:

(a) Order of Mottling | (b) Order of Breadth | (c) Order of Index
Class | Class, $B$ | Class, $B/L$
g+e+d | a, 3.00 | a, 72.64
a | c, 2.99 | f+i, 72.54
b | g+e+d, 2.97 | c, 72.30
c | f+i, 2.96 | g+e+d, 72.27
h | h, 2.96 | h, 71.95
f+i | b, 2.95 | b, 70.54
| Mean 2.98 | Mean 72.30

* This might probably be asserted interracially as well as intraracially, compare for example the swallow with the skylark, the lapwing with the ringed plover, etc.

The relationship is small, but exists. It seems reasonable to suppose that the order of mottling classes as given by $B$ or $B/L$, where there is only one displacement, may be a better one than that we have selected. But if in the mottling order b and c were interchanged, it would agree with the $B$ classification, in so far that the three classes of least and of most mottling in the two classifications would be the same.

We now turn to the ground colour. We see that the ground colour is fainter, when the egg has greater breadth, but that there is no relation of the index to the intensity of ground colour. The results of p. 147 are thus confirmed by the general correlation of ground colour and breadth. Although there is no high correlation, we may assert that it is probable that the intensity of pigment does not depend on the pressure during transit of the oviduct, but rather on a constant amount of pigment being distributed over a larger surface.

(5) Homotyposis in Eggs of the same Clutch.

The homotyposis, or degree of resemblance in character between eggs of the same clutch, may be studied on the present material.
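How such a degree of resemblance may be computed can be sketched as follows. The sketch assumes the symmetrical-table method customary in work on homotyposis: every ordered pair of eggs from the same clutch is entered, so that each pair counts twice with the eggs interchanged, and the product-moment correlation is taken over the resulting table. The paper does not spell out its procedure at this point, so this is an assumption; the function name and the clutch lengths in the example are invented for illustration and are not the authors' data.

```python
from itertools import permutations


def homotypic_correlation(clutches):
    """Correlation between eggs of the same clutch from a symmetrical pair table."""
    firsts, seconds = [], []
    for clutch in clutches:
        # Each unordered pair is entered twice, as (a, b) and as (b, a).
        for a, b in permutations(clutch, 2):
            firsts.append(a)
            seconds.append(b)
    n = len(firsts)
    if n == 0:
        raise ValueError("need at least one clutch with two or more eggs")
    # By symmetry both margins of the table share the same mean and variance.
    mean = sum(firsts) / n
    variance = sum((x - mean) ** 2 for x in firsts) / n
    if variance == 0:
        raise ValueError("no variation between eggs")
    covariance = sum((x - mean) * (y - mean) for x, y in zip(firsts, seconds)) / n
    return covariance / variance


# Hypothetical egg lengths in centimetres, grouped by clutch.
example = [[4.10, 4.15], [4.30, 4.22, 4.35], [3.95, 4.02], [4.20, 4.12]]
print(round(homotypic_correlation(example), 3))
```

Entering each pair in both orders makes the table symmetrical, so a single mean and variance serve for both eggs of a pair; a clutch of three eggs contributes six entries and a clutch of two contributes two.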
The chief direct and cross homotypic correlations are given in Table IV. Pearson has shewn* that the degree of resemblance of undifferentiated 'like organs' might be expected to be equal to that of pairs of brethren, i.e. about .50, and proved that this is so for many homotypes in the vegetable kingdom, a result which has been since confirmed by much as yet unpublished material from the animal kingdom, including a number of series of birds' eggs. Thus the mean value of the homotyposis for eggs of the Common Tern could hardly be improved upon. Only the colour characters show irregularity, especially the mottling, a feature we have already indicated as difficult to measure. It will be seen that the correlation of the ground colour of an egg with the mottling of a second (.3989) has come out greater than the organic correlation between mottling and ground colour in the same egg (.2260).

* "On Homotyposis in the Vegetable Kingdom," Phil. Trans. Vol. 197, A, pp. 285-379, 1900.

TABLE IV. Homotypic Correlations.

Group | Symbols | Characters | Correlation
Direct | $L, L$ | Lengths of Eggs in same clutch | .4643 ± .0346
Direct | $B, B$ | Breadths of Eggs in same clutch | .5176 ± .0326
Direct | $G_l, G_l$ | Longitudinal Girths of Eggs in same clutch | .5076 ± .0327
Direct | $G_b, G_b$ | Equatorial Girths of Eggs in same clutch | .4621 ± .0350
Direct | | Mean value | .4879
Direct | $M, M$ | Mottling of Eggs in same clutch | .3500
Direct | $C, C$ | Ground colour of Eggs in same clutch | .5709
Direct | | Mean of six characters | .4788
Cross | $L, B$ | Length of one Egg with Breadth of a second | .0922 ± .0441
Cross | $C, M$ | Ground colour of one Egg with Mottling of a second | .3989 ± .0379
Cross | $L, G_l$ | Length of one Egg with Longitudinal Girth of a second | .4229 ± .0362
Cross | $B, G_l$ | Breadth of one Egg with Longitudinal Girth of a second | .2530 ± .0416
Cross | $G_l, G_b$ | Longitudinal Girth of one Egg with Equatorial Girth of a second | .2603 ± .0413
Index | $B/L, B/L$ | Indices of two Eggs of same clutch | .5537 ± .0308
Index | $O, O$ | Indices of ovality of two Eggs of same clutch | .5527 ± .0309
Index | $1/O, 1/O$ | Inverse of indices of ovality | .5361 ± .0317
Index | | Mean of three Index Correlations | .5475
| | Mean of nine Homotypic Correlations | .5017

We feel that the classification by mottling is at present too uncertain, and that until the result cited has been confirmed with larger numbers and more definite categories, it would be idle to consider whether, while a given bird has usually highly or lowly pigmented eggs both as to ground colour and mottling, yet when in the individual egg there is an excess of mottling pigment, there may be some tendency to a relatively less increase of ground colour. Thus the correlation in the individual egg might possibly be less than the correlation between eggs of the same clutch. Such considerations must be postponed until the fact itself is adequately demonstrated.

Another relation suggested by Pearson* is that the cross homotypic correlation of the characters $x$ and $y$ should on the average equal ½ (correlation of $x$ and $x$ + correlation of $y$ and $y$) × (the organic correlation of $x$ and $y$). It is clearly impossible from what has just been said to apply this to the cross homotyposis of ground colour and mottling. We can apply it to the five cases in which quantitative measurements have been made. Table V gives the requisite data, the last two columns giving respectively the calculated and observed cross correlations.

TABLE V. Cross Homotypic Correlations.

Characters (1), (2) | Direct Correlation (1) and (1) | Direct Correlation (2) and (2) | Organic Correlation (1) and (2) | Cross Correlation, Calculated | Cross Correlation, Observed
$L$, $B$ | .4643 | .5176 | .2220 | .1090 | .0922
$L$, $G_l$ | .4643 | .5076 | .8804 | .4278 | .4229
$G_l$, $G_b$ | .5076 | .4621 | .5297 | .2568 | .2603
$B$, $G_l$ | .5176 | .5076 | .5216 | .2674 | .2530
$G_l$, $B/L$ | .5076 | .5537 | -.3832 | -.2033 | -.2007

When we compare the calculated and observed cross correlations, we see a striking agreement; that is, the theory that cross homotyposis is the product of direct homotyposis and the organic correlation of the characters under investigation holds very closely for the egg of the Common Tern.

The general results obtained are in good accord with those reached by previous observers, and the authors hope to investigate one or two doubtful points on fuller material this year.

* Phil. Trans. Vol. 197, A, p. 290.

APPENDIX OF CORRELATION TABLES

TABLE A. Length and Breadth of Egg.
[Correlation table, total 294 eggs; cell frequencies not legible in this copy.]

TABLE B. Girth L and Girth B.
[Correlation table; cell frequencies not legible in this copy.]

TABLE C. Length of Egg and Girth L.
[Correlation table; cell frequencies not legible in this copy.]

[Tables D to F: captions and cell frequencies not legible in this copy; the one surviving axis label is "Index".]

TABLE G. Breadth of Egg and Index 100 Breadth/Length.
[Correlation table; cell frequencies not legible in this copy.]

TABLE H. Ground Colour and Mottling.
[Contingency table, total 291; cell frequencies not legible in this copy.]

TABLE J. Index 100 Breadth/Length and Mottling.
[Table with mottling classes g+d+e, a, b, c, h, f+i; cell frequencies not legible in this copy.]

TABLE K. Breadth of Egg and Mottling.
[Table with mottling classes g+d+e, a, b, c, h, f+i; cell frequencies not legible in this copy.]

TABLE L. Ground Colour and Index 100 B/L.
[Correlation table; cell frequencies not legible in this copy.]

[Tables M and N: captions and cell frequencies not legible in this copy.]

TABLE O. Length of Egg in Pairs of same Clutch.
[Symmetrical correlation table; cell frequencies not legible in this copy.]

TABLE P. Girth B in Pairs of same Clutch.
[Symmetrical correlation table, total 230; cell frequencies not legible in this copy.]

TABLE Q. Girth L in Pairs of same Clutch.
[Symmetrical correlation table; cell frequencies not legible in this copy.]

TABLE R. Mottling in Pairs of Eggs of same Clutch.
[Symmetrical table with mottling classes; cell frequencies not legible in this copy.]

TABLE S. [...] Pairs of same Clutch.
Ground Colour of one kgg with Mottling of the other Egg for Ground Colour. Totals gtet+d 8 67 (-5°16) al | 6 Wly/ | (+2°66) bh b 31 = S Cc 10 82 s (-1°71) h 1 14 (1-00) | (— -94) f+t 4 3 13 (+2°14) | (41:20) Totals 32 31 224 iS In Tables R—T, the contingency of each cell is given in brackets, W. Rowan, K. M. Parker anp J BELL 163 TABLE T. Ground Colour in Pairs of Eggs of same Clutch. a+b 12 42) | (+ °58) | (4+10°47)) (41:43) | (—2°97) f 3 4 4 12 20 8 51 (— 5-96)] (—3-12) | (—3-12) |(+ 1°43), (48-28) | (+2°49) g—k 1 4 1 2 | 8 8 24 (- 3-22) (42°49) | (45-41) Totals 222 TABLE U. Length of one Egg with Breadth of the other Egg for Pairs of same Clutch. Length of one Egg of Pair. Breadth of Second Egg of Pair. a = = = ms 2) S/S 2 | i Totals ese 2:55—2'59 | | 1 | — c= = 1 260—2°64 | — | — eee | 1 265—2°7 = | = 0) 25— 27 1 1 2 2°80—2°84 = 1 1 a 1 2 2 1] — 1 12 2°85—2:89 | — | — | — 1 3 oe 3 3 }|— 2 16 2:90—2:94 1 4 1 7 3 6 i 2 is) 2 5 tn eee ee | 47 2:95—2:99 | 1 | — 4 2 2 8 9 6 (ji) zt 83 5 Ab ee eel 55 3:00—3 04 il 2 2 2 i 2H a3 5 4 5 4 6;—|]—]|1 52 POSER) | | 4 (oe | 4 4 2 3 2)— 1 |= 1} — Zo) SHOE | | 1 2 3 3 1 1 Ae | |) SS ee 17 3:15—3'19 == | 1} — — | — 2 Totals 230 164 Length Girth of Second Egg. Length and TABLE V. Girth L. in Pairs of same Clutch. On Homotyposis in Eggs of the Common Tern 10°20—10°39 10°40—10°59 10°60—10°79 | 10°SO—10°99 11°00—11°19 11°20—11°39 11°40—11°59 11°60—11°79 11°80—11°99 1200—12°19 Totals Length of First Egg & |os fe |aelalalyalals | | | | | | | | | | Totals Si S S => } S S S = S/Sia|/RXl/ S/S] s 39 oi|(sei sisal arlasrls | to bo no bo co bo | pre EPDprhe OQD-! — i BORO Or TABLE W. Breadth and Girth L. in Pairs of same Clutch. Breadth of First Egg. 2:55—2°59 Length Girth of Second Egg. Totals 10°20—10'39 | 10°40—10°59 10°60—10°79 10°50—10°99 11°00—11°19 11°20—11°39 | 11:40—11°59 11°60—-11°79 11°SO0O—11°99 12°00—12°19 2°65—2°69 270-274 280—2'84 rele! | 285—2°89 47 | Gees ioe | 2:95—2:99 Sora! 
| | | 200-304 bo oR HAT on bo 3°05—3'09 bo or 3-10—3'1h lromaale| | | Length Girth of Second Egg. W. Rowan, K. M. Parker anp J. BELL TABLE X. Girth L. and Girth B. in Pairs of same Clutch. Girth of First Egg. 165 10°20—10°39 8:20— 8°89 Breadth aD Sd NN lon} Sc | oo | | S > & | & 9:20— 9°39 9-40— 9°59 9:00— 9:19 10:00 —10°19 | 10°40—10°59 Index 100 Breadth/Length in Pairs of same Clutch. op Meros50 lst ms a ad Se |p | 10°60—10°75 etl Se ee | | OOO OO A | 2) G18 eo a) Se | 11-00—11°19 Ae ol ae pie eb |= melt 11°20—11°39 }—| 83| 9/11 ]}12]} 8} 4)/—]—j 1 ie om 0 oOl a a 9 3a) SAN tA Ni | 6) B)|| 2el\e=9) 11°60—11°79 Sel eouee ied | sole ot tle = 11°80—11°99 ees alpaca ee 50m al wt j=l) Sete 12-00—12°19 | } 2] 1] | Totals oF Oo iO | 8 | 16 | 37 | 68 | 59 | 31 | eS lewia| 13 TABLE Y. 65 69 70 | TL NG 72 failed 76 = — 2 1 15 3D) | ol 1 3 say |p aay |i) baa) 15 | — 3 2°5 15 | 3 tf 75 15) 2eon 1 :on Le — 5 | 5 5 3 6 1 2 15] 6°5 5 °B) || —— 3 | 1 bo LY BO Ge OH OV CH 5 O55 | — Lat == 4/4] = |? 3 I 1: 6 |6: 25/1 10 il 2 5 | 2 Se OW HO 5 | ‘i | 3 I — {5 | — B59 ea ebe 2 | 5/—|]—|— 3 2°5| — | — i 62: — | — | ee 25 3) Totals | HPHOmlprEwoH HOR HR HH at | | 166 On Homotyposis in Eggs of the Common Tern MONK FOr DMDOAADIEOD sl AAMAS wd A ie ics re) 58 | 59 | 60 | 61 in Pairs of same Clutch. TABLE Z. 100 B/L 2 — B/L 6321 85h C66 56 | 57 Index TABLE AA. Clutch. f same a 2 L/B—1 in Pairs W. Rowan, K. M. Parker anv J. BELL 60-6—00-6 66-I1—£6-T 96-I—¥6-1 86-I—16-1 [egal Sse Ge 06-I—88-T lie Wlealer? eee ete 48-I—G8-T | Ja aa | faa] | | | [aaa ece | ee a mame HTN WOODY SS | nN |p ora 69 a1 | [Egat tina SenlhON | 69-I—49-T [| <2 JAG Sts A eSeSiaECY 99-I—"9.1 Anan Pao walion &9-L—19.T sow | a! | 09-I—8G-T AG EGET 4G.I—@G-T 167 168 Index of Second Egg. | TABLE BB. On Homotyposis in Eggs of the Common Tern Girth L and Index 100 B/L in Pairs of same Clutch. Le ng th—Guirth of First Egg. 
[Body of Table BB not recoverable from the scan; the index classes of the second egg run from 62.0-62.9 to 79.0-79.9.]

MISCELLANEA.

I. The Statistical Study of Dietaries, a reply to Professor Karl Pearson.

By Professor D. NOEL PATON, F.R.S.

Professor Pearson’s criticism of Miss Lindsay’s Study of the Diets of the Labouring Classes in the City of Glasgow (Biometrika, Vol. IX, Oct. 1913) is a good example of the danger of one who does not understand the problems involved, and who is ignorant of the work already done upon a subject, attempting to discredit the results of an investigation by the application of mathematics according to his own fancy and in what seems to me a totally illegitimate manner.

Not appreciating the questions which were under investigation, he starts his criticism by demanding that our studies should afford a solution of problems other than those we had before us, and, because he does not find the solution of these problems, he proceeds to abuse the work.

Apparently in his opinion the object of the studies should have been to determine what effect the diets which the families were taking at the time of the study had upon the physique of the various individuals. He states that, if adequate anthropometric observations had been secured in such a study, it would have been at once possible to co-relate these with the diets.
It is unnecessary to point out, as was pointed out in the Report, that the physique is determined by the whole previous condition of life and by the influence of heredity, and that it is absurd to attempt to relate it solely to the diet (Report, pp. 3 and 4).

The objects of the studies are quite clearly stated on p. 4 of the Report: “Do the working classes of this city get such a diet as will enable them to develop into strong, healthy, energetic men, and, as men, will enable them to do a strenuous day’s work; or are the conditions of the labouring classes such that a suitable diet is not obtainable? Further, if a suitable diet is obtainable, and is obtained, is it procured, or can it be procured, at a cost low enough to leave a margin sufficient to cover the other necessary expenses of the family life, with something over for those pleasures and amenities without which the very continuance of life is of doubtful value?”

It was accepted as proved by previous work that for the labouring classes: “If a family diet ... gives a yield of energy of less than 3500 Calories per man per day it is insufficient for active work, and if less than 3000 it is quite inadequate for the proper maintenance of growth and normal activity.”

The first question investigated was: “Did the families examined receive this supply of energy?” As regards the poorest classes this was answered in the negative. The validity of this conclusion has not been challenged by Professor Pearson.

The second question considered was whether the diets contained a sufficient supply of protein. Previous work indicates that this is probably something above 110 grms. per man per diem. It was shown that in families with regular incomes of over 20s. a week the average protein intake was above 110 grms., and that in families with regular incomes and in those with irregular incomes of under 20s. a week the average protein intake was under 110 grms.
This conclusion has not been refuted.

Accepting our premises, the final conclusion was (p. 27) “that while the labouring classes with a regular income of over 20s. a week generally manage to secure a diet approaching the proper standard for active life, those with a smaller income and those with an irregular income entirely fail to get a supply of food sufficient for the proper development and growth of the body and for the maintenance of the capacity for active work.”

The main points proposed for the study were thus elucidated.

The part of the Report to which Professor Pearson specially directs his criticism is not the main problem, but that dealt with on pp. 30 and 31, “The Physique of Children in Relationship to Diet,” a subject taken up at the suggestion of Dr Chalmers. Professor Pearson, having declared the data totally insufficient, proceeds to apply his statistical methods not to refute Miss Lindsay’s conclusion, but to demolish other conclusions upon the relationship of physique to income which were never deduced by us.

The very guarded conclusion in the Report was: “These show very markedly the relationship between the physique and the food. When the weight is much below the average for that age almost without exception the diet is inadequate.”

Weights alone were considered. Thirty-six children, boys and girls, were dealt with. As the relationship of weight to income was not under consideration, they were classified not according to the income but according to the energy value of the family diet. Hence Professor Pearson’s remarks upon this point are quite beside the mark.

I give below, in a re-arranged form, the Table from Appendix IV.
The individuals are placed in two groups according to the energy value of their diets, with, opposite each child, the average weight for the age, taken from the Report of the Anthropometric Committee published in the Transactions of the British Association for the Advancement of Science, 1883, and with the difference between the weight of the child and the average weight. The differences between groups 1 and 2 are sufficiently marked and warrant the conclusion as stated above.

That is, of the children in families the diets of which yielded more than 3000 Calories per man per day:
    10 were above the standard or not more than 5 lbs. below it,
    8 were more than 5 lbs. below it,
while of the children in families in which the diet yielded less than 3000 Calories
    3 were above the standard or not more than 5 lbs. below it,
    15 were more than 5 lbs. below it.

It must be remembered that the ‘standard’ is for the children of all classes and not for those of the poorer classes. The fact that the average age of the children in the second group was about 1[?] years greater than that of the children in the first group does not account for the marked difference.

TABLE A. Family Diets above 3000 Calories per Man per Day.

Number        Calories   Age in        Sex   Weight    Standard Weight   Difference
              in Diet    years               in lbs.   in lbs.
2             4003       [illegible]   ♀     39        46.7              - 7.7
2             4003       10            ♂     63        67.5              - 4.5
2             4003       8             ♂     50        54.9              - 4.9
2             4003       5             ♂     35        39.9              - 4.9
36            4091       3.25          ♂     35        [illegible]       [illegible]
4             3882       8             ♀     45        52.2              - 7.2
32            3822       6.25          ♀     39        42.4              - 3.4
4             3882       6             ♀     39        42.4              - 3.4
4             3882       10            ♂     56        67.5              -11.5
39            3422       10.5          ♀     55        65                -10.0
50            3471       6.25          ♀     37        42.4              - 5.4
50            3215       6             ♀     47        42.4              + 4.6
[illegible]   3116       6             ♀     43        42.4              + 0.6
18            3248       5.5           ♀     43        41.0              + 2.0
54            3282       5             ♀     33        39.6              - 6.6
58            3080       6             ♂     38        44.4              - 6.4
30*           3136       5.5           ♂     21        41                -20.0
49            3841       5.5           ♂     42        41                + 1

* Family with rickets.

TABLE B. Family Diets below 3000 Calories per Man per Day.

Number        Calories   Age in        Sex   Weight    Standard Weight   Difference
              in Diet    years               in lbs.   in lbs.
14            2690       13            ♀     76        87                -11.0
14            2690       12            ♀     60        76.4              -16.4
14            2690       10            ♀     45.5      62.0              -17.5
[illegible]   2936       10            ♀     56        62.0              - 6.0
7             2931       9.75          ♀     44        62.0              -18.0
[illegible]   2686       5.75          ♀     42        42.4              - 0.4
14            2690       9             ♂     45        60.4              -15.4
41            2723       6.75          ♂     53        49.7              + 3.3
14            2690       6             ♂     36        44.4              - 8.4
57            2974       5             ♂     37        39.9              - 2.9
3             2891       5             ♂     37        39.9              - 2.9
2 [?]         2772       5.5           ♀     34        41.0              - 7.0
24            2412       [illegible]   ♀     39        68.1              -29.1
21            2329       9             ♀     37.5      55.5              -18.0
24            2412       6             ♀     28        42.4              -14.4
21            2329       1[?]          ♂     60        72.0              -12.0
10            2435       8             ♂     43        54.9              -11.9
59            1978       5             ♂     26        39.9              -13.9

The last question which Miss Lindsay had to consider was, how the necessary supply of energy and of protein might be supplied without increased expenditure, and she was right in stating that these can be more cheaply purchased in vegetable than in animal foods. She undoubtedly starts with the well-known conclusion that a Calorie in the food absorbed in a mixed diet from whatever source, protein, fat or carbohydrate, is of equal dynamic value. Previous work amply justifies this. She was not foolish enough to attempt to draw any conclusion from her investigations as to the relative value of animal and vegetable food in the diets on the physical development of the individuals.

Professor Pearson seems entirely unable to grasp the fundamental fact that the physical development of the individual depends largely upon his past conditions of life. To co-relate it with the special constituents of the food which he habitually eats will require not only an enormous series of studies, but a full investigation of the character of the various food stuffs and of the mode of cooking.
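As a rough illustration of the fourfold count stated above (10 and 8 children in the group above 3000 Calories, 3 and 15 in the group below), the following Python sketch works out the share of children more than 5 lbs. below the standard in each group. It uses only the four counts given in the text; the names `counts`, `share_below` and `shares` are illustrative, and no measure of sampling error is attempted.

```python
# Counts as stated in the text, per diet group:
# (above standard or not more than 5 lbs. below it, more than 5 lbs. below it).
counts = {
    "above 3000 Calories": (10, 8),
    "below 3000 Calories": (3, 15),
}


def share_below(near_or_above, well_below):
    """Fraction of children more than 5 lbs. below the standard weight."""
    return well_below / (near_or_above + well_below)


shares = {group: share_below(*pair) for group, pair in counts.items()}

for group, share in shares.items():
    print(f"{group}: {share:.0%} more than 5 lbs. below standard")
```

With 18 children in each group, such shares rest on very small numbers, which is the point at issue in the rejoinder that follows.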
These points I tried to explain to him when I wrote to him in summer. He did not write to me as, in his criticism, he says he did. Miss Lindsay forwarded to me a letter from him to her, and I wrote a reply to Professor Pearson which he did not acknowledge.

In conclusion I would say that before he expects his criticism of a physiological problem to be taken seriously, he had better make some attempt to understand the nature of the problem. Certainly it is not my intention to waste time in replying further to his criticism unless in the future it is more pertinent than is his present contribution.

II. The Statistical Study of Dietaries. A Rejoinder.

By KARL PEARSON, F.R.S.

I publish Professor Noel Paton’s reply because it is very typical of the type of difficulty which we meet with at present, when we assert that what is really statistical work must be undertaken only by the adequately trained statistician and that when it is not, then the investigation cannot be considered as falling into the field of science.

Professor Paton states that the following question given on p. 4 of the Report formulated its object: “Do the working classes of this city get such a diet as will enable them to develop into strong, healthy, energetic men, and as men, will enable them to do a strenuous day’s work; or are the conditions of the labouring classes such that a suitable diet is not obtainable?” ...

Now Professor Paton either assumes that the sample taken of the diet of the individual family was their customary diet, or he does not. If he does, then the question: Was the diet such as would enable the working classes “to develop into strong, healthy, energetic men”? has meaning. If he does not, not only is it idle, but the section dealing with the physique of the children on the basis of a sample diet taken as a rule for a week (occasionally for a fortnight), is beside the point.
But anyhow, I ask how he can possibly ascertain how the working classes will “develop into strong, healthy, energetic men,” if he does not take an adequate anthropometric survey of the families subjected to the dietaries recorded? He says that it is accepted and proved that “If a family diet ... gives a yield of energy of less than 3500 calories per man per day it is insufficient for active work; and if less than 3000, it is quite inadequate for the proper maintenance of growth and normal activity.” He further assumes with Miss Lindsay that calories from animal and vegetable foods have equal “dynamic value.” I assert that neither of these conclusions, which he accepts, are based on adequate research and they are in fact refuted by Miss Lindsay’s own material. For, if it can be shown that animal and vegetable calories have different results on the physical development of the children, it is clear that the first statement as to how many calories are needful for the proper maintenance of growth has no significance until a statement is made with regard to the source of the calories. Professor Paton cites no evidence for his statements; from what I have read on the subject of calories, I feel convinced that most of the data on the matter would not stand for five minutes any adequate statistical analysis.

The Report, Professor Paton tells us, shows “very markedly the relationship between the physique and the food.” Yet in a previous paragraph he says “that the physique is determined by the whole previous condition of life and by the influence of heredity, and that it is absurd to attempt to relate it solely to the diet.” Now the only way to ascertain whether there was a marked relationship between the food and the physique of the children was to correlate the two for a constant age and investigate whether the correlations were such, having regard to their probable errors, that they could be considered significant.
I did this with the result that the correlation of the total calories in the food and the girls’ weight for constant age was not definitely significant with regard to the probable error, while in the case of the boys the probable error was so large that it was impossible to say whether the relationship was really considerable or not. In fact no marked relationship could be deduced from Miss Lindsay’s data; they were too inadequate. If Professor Paton’s statement as to the influence of heredity is to be trusted, then even my correction for age was inadequate, and the data ought to be corrected also for physique of parent! If so, why was the parent not measured?

Professor Paton places before the readers of Biometrika two tables on which this “marked” relationship is asserted by him to rest. One of the cases in his Table A, No. 32, is erroneously placed in this table; the details show that the number of calories was 2949 and not 3822*; it should be in Table B. These tables contain 16 boys’ weights and 20 girls’ weights. Professor Paton takes the British Association measurements, which are, of course, wholly inadequate as a test of Glasgow children, and making no real correction for age† considers whether the children in the two tables were or were not above the quite arbitrary limit of 5 lbs. below standard. He gives us no measure at all of the significance of the result, which is based on the vagaries of sampling 16 boys of ages from 3 to 11, and 20 girls from 5 to 13; and he supposes in some way that this treatment can possibly refute the correlation coefficient, ${}_a r_{wC_t}$, of weight and food calories for constant age with its probable error!

I can, however, throw more light on the matter. Owing to the great courtesy of Dr Chalmers, Medical Officer of Health for Glasgow, I have been able to more than treble the number of weights of the boys and girls subjected to the dietaries. The results for total calories in food, $C_t$, now are‡:

Girls, 69: ${}_a r_{wC_t} = +.21 \pm .08$,
Boys, 55: ${}_a r_{wC_t} = +.05 \pm .09$.
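The probable errors quoted with these coefficients can be checked approximately with the usual large-sample expression for the probable error of a correlation coefficient, $0.6745\,(1 - r^2)/\sqrt{n}$. The Python sketch below is only an illustration: that this expression was applied unchanged to the constant-age (partial) coefficients is an assumption, and the function name `probable_error_r` is hypothetical.

```python
import math


def probable_error_r(r, n):
    """Probable error of a correlation coefficient r from n observations,
    using the large-sample expression 0.6745 * (1 - r^2) / sqrt(n).
    Assumption: the same expression is used for the constant-age coefficients."""
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


# Coefficients and sample sizes as quoted in the text.
cases = [("girls", 0.21, 69), ("boys", 0.05, 55)]

for label, r, n in cases:
    pe = probable_error_r(r, n)
    print(f"{label}: r = {r:+.2f}, probable error = {pe:.2f}, r / p.e. = {r / pe:.1f}")
```

A coefficient that is only two or three times its probable error, as for the girls here, was conventionally treated as doubtful, and one smaller than its probable error, as for the boys, as insignificant.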
Thus the relation for boys is now quite insignificant, and for girls may well be insignificant also. At any rate although both correlations are positive, there is no “marked” relationship between the physique and the dietary. Of course, it may be said that these weights ($w$) have been taken at some interval after the dietaries were recorded, but unless we assume the dietary to be a rough measure of the permanent feeding of the family, whose physique has been gradually built up for years before the dietaries were recorded, the observations must be discarded as of no value at all for testing physique, or as Professor Paton phrases it “development.”

* In the Appendix V of Rickety Families, it is given again; this time as 2329 calories.
† The deviation at each age would have to be measured in terms of the standard-deviation of weight at that age; naturally the deviations are larger for older children.
‡ I have to thank Miss B. M. Cave for the present series of correlations.

But the most interesting point ascertained from the new material is the confirmation of the result that the higher the proportion of animal to vegetable calories the greater the weight. In Biometrika, Vol. IX, p. 533, we had for 16 boys and 20 girls:

Boys: ${}_a r_{w,\,C_v/C_a} = -.23 \pm .16$,
Girls: ${}_a r_{w,\,C_v/C_a} = -.12 \pm .15$.

We now have for 55 boys and 69 girls:

Boys: ${}_a r_{w,\,C_v/C_a} = -.30 \pm .08$,
Girls: ${}_a r_{w,\,C_v/C_a} = -.[?]4 \pm .08$.

These results seem to indicate that Miss Lindsay and Professor Paton, who supports her view, are in error when they consider a calory the same whether it be from animal or vegetable food.

On the other hand, our larger numbers now indicate that:

(i) For a constant age the expenditure on vegetable or on animal food has no sensible relation to weight.

(ii) For a constant age the number of calories in vegetable food has no sensible relation to weight.
(iii) For a constant age the number of calories in animal food has a positive correlation with weight for both girls and boys, being definitely significant in the first case (+.32 ± .07) and not so in the second (+.08 ± .09).

(iv) For a constant age the correlations of weight with ratio of expenditure on vegetable and animal foods are for both boys and girls quite insignificant as compared with their probable errors.

I am extremely obliged to Dr Chalmers for doing his best to supply additional material. As far as it goes, it tends to show that calories are of far more importance than expenditures, but that calories from animal food are more closely related to physique than are calories from vegetable food*. The new material supports my criticisms that the failure to distinguish between animal and vegetable calories stultified the advice given by Miss Lindsay, i.e. to spend money on oatmeal rather than on eggs. It also indicates that no safe conclusions with regard to dietaries can be drawn until a reasonable anthropometric survey accompanies the record of dietaries, and the whole is reduced with adequate statistical knowledge.

One point I can allow Professor Paton. It was an oversight on my part, when I said that I had written to both Miss Lindsay and to himself; the letters in which Miss Lindsay and he stated that to follow up the families now would be impossible were both replies to one and the same letter of mine addressed to Miss Lindsay. The additional facts I desired were in their opinion unascertainable, and further correspondence did not seem to me likely to be of any service in achieving the end I had in view, namely to render of real service to science a piece of recording work from which in my opinion then and in my opinion still, very misleading conclusions had been drawn, and which conclusions in their turn had been exaggerated in the press résumés of the paper.
I do not think any such work as that done on dietaries by Miss Lindsay and Professor Noel Paton will be of real value until (i) these dietaries are accompanied by a thorough anthropometric survey of the whole families of the dieted and (ii) the equality of animal and vegetable food calories ceases to be considered as a dogmatic truth.

* Of course the results show that on such data as are available, the food has relatively little relation to the weight; there is no “marked” relationship.

III. Note on the essential Conditions that a Population breeding at random should be in a Stable State.

By K. PEARSON, F.R.S.

Let us deal with bi-parental inheritance in the first place. Let $x$ be a character in the father, mean $\bar{x}$, standard deviation $\sigma_1$; let $y$ be the same character in the mother, $\bar{y}$ its mean, and $\sigma_2$ its standard deviation. Let $z$ be the character in offspring of one sex, $\sigma_3$ be the standard deviation of all offspring of this sex and $\bar{z}$ the mean. Let

$$\mu_2',\ \mu_3',\ \mu_4';\qquad \mu_2'',\ \mu_3'',\ \mu_4'';\qquad \mu_2''',\ \mu_3''',\ \mu_4'''$$

be the moment coefficients about the means respectively of father, mother and offspring frequency distributions. Let $\bar{z}_{xy}$ be the mean of the offspring of those parents who have characters $x$ and $y$, and let the array of frequency of such offspring be given by $f_3(u)\,du$ about $\bar{z}_{xy}$, i.e. the character of any offspring in this array is $\bar{z}_{xy} + u$, where $u$ is independent of the parental characters $x$ and $y$, but $\bar{z}_{xy}$ is a function of $x$ and $y$, the parental characters.

Some writers have suggested that the offspring character should be taken as a blend of the parental characters, i.e. $z = \tfrac{1}{2}(x + y)$, understanding by blend the mean of the parental characters. This appears to be very unsatisfactory for:

(a) It supposes the parental characters to fix absolutely the offspring characters, which is far from a result of experience.
(b) It supposes the mother to reproduce the female size of character in the male and the female offspring alike, whereas she contributes to each the sex character of her own stock, i.e. if she is a tall woman, she would contribute absolutely more to a son than to a daughter.

The late Sir Francis Galton got over this difficulty by “reducing female measures to their male equivalents.” This he did by altering absolute measurements in the ratio of male to female mean measurements. Thus he would take for the mean of his array of offspring

$$\bar{z}_{xy} = \tfrac{1}{2}\left(x + \frac{\bar{x}}{\bar{y}}\,y\right)$$

if he were dealing with male offspring. A more reasonable hypothesis is to assume that

$$\bar{z}_{xy} = \tfrac{1}{2}\sigma_3\left(\frac{x}{\sigma_1} + \frac{y}{\sigma_2}\right) \qquad \text{(i)}.$$

This will practically agree with Sir Francis’s form, if the coefficients of variation in the two sexes are the same, i.e. $\sigma_1/\bar{x} = \sigma_2/\bar{y}$. If we measure $u$ from the mean of the array of offspring we have

$$z = \tfrac{1}{2}\sigma_3\left(\frac{x}{\sigma_1} + \frac{y}{\sigma_2}\right) + u \qquad \text{(ii)}.$$

We shall now suppose the offspring to follow the law (i), or

$$z - \bar{z} = \tfrac{1}{2}\sigma_3\left(\frac{x - \bar{x}}{\sigma_1} + \frac{y - \bar{y}}{\sigma_2}\right) + u \qquad \text{(iii)},$$

where $x$ and $y$ are uncorrelated (mating at random), and $u$ represents other influences than the parental, and is therefore uncorrelated with $x$ and $y$*. The frequency distributions of $x$ and $y$ may be taken as given by $f_1(x - \bar{x})$ and $f_2(y - \bar{y})$. Let $N_1 \times N_2$ be the total number of possible matings

$$= \iint f_1(x - \bar{x})\, f_2(y - \bar{y})\, dx\, dy,$$

and the total number of offspring $N_3$ in any array $= \int f_3(u)\, du$.

* This assumes the homoscedasticity of the arrays of offspring due to pairs of fathers and mothers with characters $x$ and $y$.

I now propose to give the expression for the $n$th moment coefficient about the mean, i.e. $\mu_n'''$, of the population of offspring of a given sex. We have

$$N_1 N_2 N_3\, \mu_n''' = \iiint \left\{\tfrac{1}{2}\sigma_3\left(\frac{x - \bar{x}}{\sigma_1} + \frac{y - \bar{y}}{\sigma_2}\right) + u\right\}^{n} f_1(x - \bar{x})\, f_2(y - \bar{y})\, f_3(u)\, dx\, dy\, du,$$

the integration being extended over the whole of the frequency distributions of father, mother and offspring.
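The triple integral above is simply the $n$th moment about the mean of a sum of three independent terms, $a\,x + b\,y + u$, with $a = \sigma_3/2\sigma_1$ and $b = \sigma_3/2\sigma_2$. As a minimal sketch, the Python code below computes such a moment by direct enumeration for small discrete distributions. The discrete distributions, the plain constants `a` and `b`, and all function names are illustrative assumptions, not part of the original note.

```python
from fractions import Fraction
from itertools import product


def central_moment(dist, n):
    """n-th moment about the mean of a discrete distribution {value: probability}."""
    mean = sum(v * p for v, p in dist.items())
    return sum((v - mean) ** n * p for v, p in dist.items())


def offspring_moment(fx, fy, fu, a, b, n):
    """n-th central moment of z = a*x + b*y + u, with x, y, u independent.
    The joint probability of each triple is the product of the three marginals."""
    dist = {}
    for (x, px), (y, py), (u, pu) in product(fx.items(), fy.items(), fu.items()):
        z = a * x + b * y + u
        dist[z] = dist.get(z, 0) + px * py * pu
    return central_moment(dist, n)


# Hypothetical skew distributions for father, mother and array deviation.
fx = {0: Fraction(1, 4), 1: Fraction(1, 2), 3: Fraction(1, 4)}
fy = {-1: Fraction(1, 3), 2: Fraction(2, 3)}
fu = {-1: Fraction(1, 2), 0: Fraction(1, 4), 2: Fraction(1, 4)}
a, b = Fraction(1, 2), Fraction(1, 3)

for n in (2, 3, 4):
    print(n, offspring_moment(fx, fy, fu, a, b, n))
```

Exact fractions are used so that the enumerated moments can be compared term by term with any closed expression for the second, third and fourth moments of such a sum.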
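Thus

$$N_1 N_2 N_3\, \mu_n''' = \sum_{s=0}^{n} \sum_{t=0}^{n-s} \frac{n!}{(n-s)!\, s!}\, \frac{(n-s)!}{(n-s-t)!\, t!}\, \frac{\sigma_3^{\,n-s}}{2^{\,n-s}\, \sigma_1^{\,n-s-t}\, \sigma_2^{\,t}} \iiint (x - \bar{x})^{n-s-t} (y - \bar{y})^{t}\, u^{s}\, f_1(x - \bar{x})\, f_2(y - \bar{y})\, f_3(u)\, dx\, dy\, du.$$

Now $x$, $y$ and $u$ being independent we have

$$\frac{1}{N_1} \int (x - \bar{x})^{n-s-t} f_1(x - \bar{x})\, dx = \mu'_{n-s-t}, \qquad \frac{1}{N_2} \int (y - \bar{y})^{t} f_2(y - \bar{y})\, dy = \mu''_{t}, \qquad \frac{1}{N_3} \int u^{s} f_3(u)\, du = \mu^{\mathrm{iv}}_{s},$$

$$\mu_n''' = \sum_{s=0}^{n} \frac{\sigma_3^{\,n-s}}{2^{\,n-s}}\, \frac{n!}{(n-s)!\, s!} \sum_{t=0}^{n-s} \frac{(n-s)!}{(n-s-t)!\, t!}\, \frac{\mu'_{n-s-t}\, \mu''_{t}\, \mu^{\mathrm{iv}}_{s}}{\sigma_1^{\,n-s-t}\, \sigma_2^{\,t}} \qquad \text{(iv)}.$$

Thus we reach, remembering that $\mu_1' = \mu_1'' = \mu_1^{\mathrm{iv}} = 0$,

$$\mu_2''' = \tfrac{1}{4}\sigma_3^2 \left(\frac{\mu_2'}{\sigma_1^2} + \frac{\mu_2''}{\sigma_2^2}\right) + \mu_2^{\mathrm{iv}} \qquad \text{(v)},$$

$$\mu_3''' = \tfrac{1}{8}\sigma_3^3 \left(\frac{\mu_3'}{\sigma_1^3} + \frac{\mu_3''}{\sigma_2^3}\right) + \mu_3^{\mathrm{iv}} \qquad \text{(vi)},$$

$$\mu_4''' = \tfrac{1}{16}\sigma_3^4 \left(\frac{\mu_4'}{\sigma_1^4} + \frac{6\mu_2'\mu_2''}{\sigma_1^2 \sigma_2^2} + \frac{\mu_4''}{\sigma_2^4}\right) + \tfrac{3}{2}\sigma_3^2 \left(\frac{\mu_2'}{\sigma_1^2} + \frac{\mu_2''}{\sigma_2^2}\right) \mu_2^{\mathrm{iv}} + \mu_4^{\mathrm{iv}} \qquad \text{(vii)}.$$

But $\mu_2' = \sigma_1^2$, $\mu_2'' = \sigma_2^2$, and $\mu_2''' = \sigma_3^2$. Hence we must have

$$\mu_2^{\mathrm{iv}} = \tfrac{1}{2}\sigma_3^2 \qquad \text{(viii)}.$$

If as usual we take $\beta_1 = \mu_3^2/\mu_2^3$ and $\beta_2 = \mu_4/\mu_2^2$ we find from (vi) and (vii), writing $s^2 = \mu_2^{\mathrm{iv}}$,

$$\sqrt{\beta_1^{\mathrm{iv}}}\; s^3 = \sigma_3^3 \left\{\sqrt{\beta_1'''} - \tfrac{1}{8}\left(\sqrt{\beta_1'} + \sqrt{\beta_1''}\right)\right\} \qquad \text{(ix)},$$

$$\beta_2^{\mathrm{iv}}\, s^4 = \sigma_3^4 \left\{\beta_2''' - \tfrac{1}{16}\left(\beta_2' + \beta_2'' + 6\right) - \frac{3 s^2}{\sigma_3^2}\right\} \qquad \text{(x)}.$$

Whence by the use of (viii)

$$\sqrt{\beta_1^{\mathrm{iv}}} = 2\sqrt{2} \left\{\sqrt{\beta_1'''} - \tfrac{1}{8}\left(\sqrt{\beta_1'} + \sqrt{\beta_1''}\right)\right\} \qquad \text{(xi)},$$

$$\beta_2^{\mathrm{iv}} = 4 \left\{\beta_2''' - \tfrac{1}{16}\left(\beta_2' + \beta_2'' + 6\right) - \tfrac{3}{2}\right\} \qquad \text{(xii)}.$$

Hence in order that the offspring population should be stable, it is needful that in the array of offspring for given parents:

(a) $s = \dfrac{1}{\sqrt{2}}\,\sigma_3$.

(b) $\sqrt{\beta_1^{\mathrm{iv}}} = 2\sqrt{2} \left\{\sqrt{\beta_1'''} - \tfrac{1}{8}\left(\sqrt{\beta_1'} + \sqrt{\beta_1''}\right)\right\} = 2\sqrt{2}\,\sqrt{\beta_1'''}\left(1 - \tfrac{1}{4}\right) = \dfrac{3}{\sqrt{2}}\sqrt{\beta_1'''}$, if $\beta_1' = \beta_1'' = \beta_1'''$, i.e. the skewness be the same for fathers, mothers and offspring.

(c) $\beta_2^{\mathrm{iv}} = \tfrac{1}{2}\left(7\beta_2''' - 15\right)$, if $\beta_2' = \beta_2'' = \beta_2'''$.

Thus, we have for the array of offspring of given parents

$$s = \frac{1}{\sqrt{2}}\,\sigma_3, \qquad \beta_1^{\mathrm{iv}} = \tfrac{9}{2}\,\beta_1''', \qquad \beta_2^{\mathrm{iv}} - 3 = \tfrac{7}{2}\left(\beta_2''' - 3\right) \qquad \text{(xiii)}.$$

Accordingly the variability of the array is less than that of the population of offspring; and the array (unless $\beta_1''' = 0$, $\beta_2''' = 3$) is more skew and has greater kurtosis than the general population.

If $r_{12}$, $r_{23}$, $r_{31}$ be the three correlations of father, mother and offspring we know that the mean standard-deviation of the offspring of arrays having the same parents is

$$s' = \sigma_3 \sqrt{\frac{1 - r_{12}^2 - r_{23}^2 - r_{13}^2 + 2 r_{12} r_{13} r_{23}}{1 - r_{12}^2}},$$

and this equals, if there be no assortative mating ($r_{12} = 0$),

$$\sigma_3 \sqrt{1 - r_{13}^2 - r_{23}^2}.$$

If we could assume this equal to $s$ we must have, since $s = \dfrac{1}{\sqrt{2}}\,\sigma_3$,

$$\frac{1}{\sqrt{2}} = \sqrt{1 - r_{13}^2 - r_{23}^2},$$

leading to $r_{13}^2 + r_{23}^2 = \tfrac{1}{2}$, or if the two parental correlations are equal to $r_{13} = r_{23} = .5$.

In other words, if the parental influences were equal and there were no assortative mating and the character in the array of offspring had the mean value $\tfrac{1}{2}\sigma_3\left(\dfrac{x}{\sigma_1} + \dfrac{y}{\sigma_2}\right)$, then the population could only be stable if $r_{13} = r_{23} = 0.5$.

But this apparently noteworthy result only begs the question. By the general theory of correlation the mean of the array of offspring is

$$\bar{z}_{xy} = \bar{z} + \frac{r_{13} - r_{12} r_{23}}{1 - r_{12}^2}\, \frac{\sigma_3}{\sigma_1}\,(x - \bar{x}) + \frac{r_{23} - r_{12} r_{13}}{1 - r_{12}^2}\, \frac{\sigma_3}{\sigma_2}\,(y - \bar{y}),$$

or, if there be no assortative mating,

$$\bar{z}_{xy} = \sigma_3 \left(r_{13} \frac{x}{\sigma_1} + r_{23} \frac{y}{\sigma_2}\right) + \left\{\bar{z} - \sigma_3 \left(r_{13} \frac{\bar{x}}{\sigma_1} + r_{23} \frac{\bar{y}}{\sigma_2}\right)\right\}.$$

Hence if we assume the mean of array of offspring to be given by

$$\tfrac{1}{2}\sigma_3 \left(\frac{x}{\sigma_1} + \frac{y}{\sigma_2}\right),$$

(i) the second portion of the expression must be zero, i.e. mean of whole population of offspring must coincide with mean of array of offspring where parents have the mean values, and (ii) we must have $r_{13} = r_{23} = \tfrac{1}{2}$. In other words the form of our assumption involves both the equal influence of the parents and the value of the parental correlation. From the standpoint of heredity no such assumption is legitimate. Neither in Mendelian theory nor in biometric formula, nor again in actual observation is it permissible to suppose that the mean of the array of offspring is determined solely by the parents.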
Still less is it possible to suppose the actual character of the offspring to be the mean of that of the parents (i.e. put $u = 0$). If it were we should have $z = \tfrac{1}{2}(x + y)$, whence flow
\[ \mu_2''' = \tfrac{1}{4}\left(\mu_2' + \mu_2''\right), \qquad \mu_3''' = \tfrac{1}{8}\left(\mu_3' + \mu_3''\right), \qquad \mu_4''' = \tfrac{1}{16}\left(\mu_4' + 6\mu_2'\mu_2'' + \mu_4''\right) \tag{xiv} \]
But these equations assume that $\mu_2^{\mathrm{iv}}$, $\mu_3^{\mathrm{iv}}$ and $\mu_4^{\mathrm{iv}}$ are all zero, an absurdity in itself and contrary to all experience, whether biometric or Mendelian. For non-assortative mating and equal potency of parents, they lead to parental correlations of the order .7 and to an impossibility of stability in any population*. In fact any such relations as (xiv) are inconceivable on the basis of both biometric as well as Mendelian theory and observation. Parental correlations have never been observed anywhere near such a value as 0.7.

Equations (xiii) are, however, suggestive; they show that if the parental distribution be symmetrical and mesokurtic, the array of offspring will remain so after selection; but if the parental distribution does not possess these characters, then any selection of individual parents will emphasize the asymmetry and the kurtosis in the resulting array of offspring; or continued selection of this type will lead to greater and greater divergence from the normal or Gaussian frequency distribution.

* If we assume that the mean of the array of offspring of parents of characters $x$ and $y$ is given by $lx + my$, it is only another way of asserting that the regression is linear and that
\[ l = \frac{r_{13} - r_{12} r_{23}}{1 - r_{12}^2}\,\frac{\sigma_3}{\sigma_1}, \qquad m = \frac{r_{23} - r_{12} r_{13}}{1 - r_{12}^2}\,\frac{\sigma_3}{\sigma_2}. \]
If we make $l = m$, or give equal weight to the parents, it is only rational to suppose that $\sigma_1 = \sigma_2$ and $r_{13} = r_{23}$, which lead us to
\[ l = m = \frac{r_{13}}{1 + r_{12}}\,\frac{\sigma_3}{\sigma_1}. \]
Hence the mean of the array is $\dfrac{r_{13}}{1 + r_{12}}\,\dfrac{\sigma_3}{\sigma_1}\,(x + y)$, and whether we make $x$ constant and $y$ constant or $x + y$ constant leads to precisely the same variability in the array, i.e.
\[ s = \sigma_3 \sqrt{\frac{1 - r_{12}^2 - r_{13}^2 - r_{23}^2 + 2 r_{12} r_{13} r_{23}}{1 - r_{12}^2}} = \sigma_3 \sqrt{1 - \frac{2 r_{13}^2}{1 + r_{12}}}. \]
If assortative mating be zero, this equals $\sigma_3\sqrt{1 - 2 r_{13}^2}$, and, if to reach the results for $\mu'''$ given above we put this zero, we must have $r_{13} = \sqrt{.50} = 0.7$ nearly.

IV. The Elimination of Spurious Correlation due to position in Time or Space.

By "STUDENT."

In the Journal of the Royal Statistical Society for 1905*, p. 696, appeared a paper by R. H. Hooker giving a method of determining the correlation of variations from the "instantaneous mean" by correlating corresponding differences between successive values. This method was invented to deal with the many statistics which give the successive annual values of vital or commercial variables; these values are generally subject to large secular variations, sometimes periodic, sometimes uniform, sometimes accelerated, which would lead to altogether misleading values were the correlation to be taken between the figures as they stand.

Since Mr Hooker published his paper, the method has been in constant use among those who have to deal statistically with economic or social problems, and helps to show whether, for example, there really is a close connection between the female cancer death rate and the quantity of imported apples consumed per head!

Prof. Pearson, however, has pointed out to me that the method is only valid when the connection between the variables and time is linear, and the following note is an effort to extend Mr Hooker's method so as to make it applicable in a rather more general way.

If $x_1, x_2, x_3$, etc., $y_1, y_2, y_3$, etc., be corresponding values of the variables $x$ and $y$, then if $x_1, x_2, x_3$, etc., $y_1, y_2, y_3$, etc. are randomly distributed in time and space, it is easy to show that the correlation between the corresponding $n$th differences is the same as that between $x$ and $y$.

Let ${}_nD_x$ be the $n$th difference. For
\[ {}_1D_x = x_1 - x_2, \qquad \therefore\ {}_1D_x^{\,2} = x_1^2 - 2 x_1 x_2 + x_2^2. \]
Summing for all values and dividing by $N$ and remembering that since $x_1$ and $x_2$ are mutually random $S(x_1 x_2) = 0$, we get†
\[ \sigma^2_{{}_1D_x} = 2\sigma_x^2. \]
Again,
\[ {}_1D_y = y_1 - y_2, \qquad \therefore\ {}_1D_x\, {}_1D_y = x_1 y_1 - x_2 y_1 - x_1 y_2 + x_2 y_2. \]
Summing for all values and dividing by $N$, and remembering that $x_1$ and $y_2$ and $x_2$ and $y_1$ are mutually random,
\[ r_{{}_1D_x\, {}_1D_y}\, \sigma_{{}_1D_x}\, \sigma_{{}_1D_y} = 2\, r_{xy}\, \sigma_x \sigma_y, \qquad \therefore\ r_{{}_1D_x\, {}_1D_y} = r_{xy}. \]
Proceeding successively
\[ r_{{}_nD_x\, {}_nD_y} = r_{{}_{n-1}D_x\, {}_{n-1}D_y} = \dots = r_{xy} \tag{1} \]
Now suppose $x_1, x_2, x_3$, etc. are not random in space or time; the problems arising from correlation due to successive positions in space are exactly similar to those due to successive occurrence in time, but as they are to some extent complicated by the second dimension, it is perhaps simpler to consider correlation due to time. Suppose then
\[ x_1 = X_1 + b t_1 + c t_1^2 + d t_1^3 + \text{etc.}, \qquad x_2 = X_2 + b t_2 + c t_2^2 + d t_2^3 + \text{etc.}, \]
where $X_1, X_2$, etc. are independent of time and $t_1, t_2, t_3$ are successive values of time, so that $t_n - t_{n-1} = T$, and suppose $y_1 = Y_1 + b' t_1 + c' t_1^2 + \text{etc.}$ as before.

* The method had been used by Miss Cave in Proc. Roy. Soc. Vol. LXXIV. pp. 407 et seq., that is in 1904, but being used incidentally in the course of a paper it attracted less attention than Hooker's paper which was devoted to describing the method. The papers were no doubt quite independent.

† The assumption made is that $n$ is sufficiently large to justify the relations $S_1^{\,n-1}(x)/(n-1) = S_2^{\,n}(x)/(n-1) = S_1^{\,n}(x)/n$ and $S_1^{\,n-1}(x^2)/(n-1) = S_2^{\,n}(x^2)/(n-1) = S_1^{\,n}(x^2)/n$, being taken to hold.

Then
\[ {}_1D_x = {}_1D_X - bT - cT(t_1 + t_2) - dT(t_1^2 + t_1 t_2 + t_2^2) - \text{etc.} \]
\[ \phantom{{}_1D_x} = {}_1D_X - \{bT + cT^2 + dT^3 + \text{etc.}\} - t_1\{2cT + 3dT^2 + 4eT^3 + \text{etc.}\} - t_1^2\{3dT + 6eT^2 + \text{etc.}\} - \text{etc.} \]
In this series the coefficients of $t_1$, $t_1^2$, etc. are all constants and the highest power of $t_1$ is one lower than before, so that by repeating the process again and again we can eliminate $t$ from the variable on the right-hand side, provided of course that the series ends at some power of $t$.
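The elimination described above can be illustrated numerically. What follows is a minimal Python sketch and not part of the original note: it builds two series whose random parts are correlated, adds an unrelated cubic time trend to each, and prints the correlation between the successive differences of each order. The series length, the seed, the trend coefficients, the target correlation 0.6 and the names `differences`, `correlation` and `simulate` are all assumptions chosen for illustration. Once the order of differencing reaches the degree of the trend, the trend contributes at most a constant, so the observed correlation coincides with that of the random parts alone.

```python
import random
from math import sqrt


def differences(series, order=1):
    """Return the order-th successive differences (each earlier value minus the next)."""
    out = list(series)
    for _ in range(order):
        out = [a - b for a, b in zip(out[:-1], out[1:])]
    return out


def correlation(u, v):
    """Product-moment correlation coefficient of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    s_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    s_uu = sum((a - mean_u) ** 2 for a in u)
    s_vv = sum((b - mean_v) ** 2 for b in v)
    return s_uv / sqrt(s_uu * s_vv)


def simulate(n=400, rho=0.6, seed=1):
    """Hypothetical data: correlated random parts X, Y plus unrelated cubic trends."""
    rng = random.Random(seed)
    X, Y = [], []
    for _ in range(n):
        a, b = rng.gauss(0, 1), rng.gauss(0, 1)
        X.append(a)
        Y.append(rho * a + sqrt(1 - rho ** 2) * b)
    # Illustrative trend coefficients; any polynomial in time would do.
    x = [X[i] + 0.05 * i - 2e-4 * i ** 2 + 1e-6 * i ** 3 for i in range(n)]
    y = [Y[i] - 0.03 * i + 1e-4 * i ** 2 - 5e-7 * i ** 3 for i in range(n)]
    return X, Y, x, y


X, Y, x, y = simulate()
print("raw figures:", round(correlation(x, y), 3))
for k in range(1, 6):
    r_obs = correlation(differences(x, k), differences(y, k))
    r_rand = correlation(differences(X, k), differences(Y, k))
    print(f"difference {k}: observed {r_obs:+.3f}, random parts alone {r_rand:+.3f}")
```

In this sketch the two printed columns agree from the third difference onward, because the third difference of a cubic is a constant and a constant does not affect a correlation coefficient.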
When this has been done, we get
\[ {}_nD_x = {}_nD_X + \text{a constant}, \qquad {}_nD_y = {}_nD_Y + \text{a constant}, \]
\[ \therefore\ r_{{}_nD_x\, {}_nD_y} = r_{{}_nD_X\, {}_nD_Y} = r_{XY}, \]
and of course
\[ r_{{}_{n+1}D_x\, {}_{n+1}D_y} = r_{{}_nD_x\, {}_nD_y}, \]
for ${}_nD_X$ and ${}_nD_Y$ are now random variables independent of time.

Hence if we wish to eliminate variability due to position in time or space and to determine whether there is any correlation between the residual variations, all that has to be done is to correlate the 1st, 2nd, 3rd ... $n$th differences between successive values of our variable with the 1st, 2nd, 3rd ... $n$th differences between successive values of the other variable. When the correlation between the two $n$th differences is equal to that between the two $(n+1)$th differences, this value gives the correlation required.

This process is tedious in the extreme, but that it may sometimes be necessary is illustrated by the following examples: the figures from which the first two are taken were very kindly supplied to me by Mr E. G. Peake, who had been using them in preparing his paper "The Application of the Statistical Method to the Bankers' Problem" in The Bankers' Magazine (July-August, 1912). The material for the next is taken from a paper in The Journal of Agricultural Science by Hall and Mercer, on the error of field trials, and are the yields of wheat and straw on 500 plots of 1/500 acre each into which an acre of wheat was divided at harvest. The remainder are from the three Registrar-Generals' returns.

Correlation between:
I. Sauerbeck's Index numbers and Bankers' Clearing House rates per head.
II. Marriage Rate and Wages.
III. Yield of Grain and Yield of Straw.
IV, V, VI. Tuberculosis Death Rate and Infantile Mortality, in Ireland, England and Scotland respectively.

                     |    I     |    II    |   III     |    IV       |     V       |     VI
Raw figures          |  -.33    |  -.52    |  +.753    |  [3]        |  [.35]      |  +.02
First difference     |  +.51    |  +.67    |  +.590    |  +.75       |  +.69       |  +.51
Second difference    |  +.30    |  +.58    |  +.539    |  [.74]      |  [.74]      |  +.65
Third difference     |  +.07    |  +.52    |  +.530    |   -         |   -         |   -
Fourth difference    |  +.11    |  +.55    |  +.524    |   -         |   -         |   -
Fifth difference     |  +.05    |  +.58    |   -       |   -         |   -         |   -
Sixth difference     |   -      |  +.55    |   -       |   -         |   -         |   -
Number of cases      | 41 years | 57 years | 500 plots | 42 years    | [illegible] | [illegible]

The difference between I and II is very marked, and would seem to indicate that the causal connection between index numbers and Bankers' clearing house rates is not altogether of the same kind as that between marriage rate and wages, though all four variables are commonly taken as indications of the short period trade wave.

I had hoped to investigate this subject more thoroughly before publishing this note, but lack of time has made this impossible.

V. On certain Errors with regard to Multiple Correlation occasionally made by those who have not adequately studied this Subject.

By KARL PEARSON, F.R.S.

(1) It is well-known* that if we endeavour to predict the value of a variate $x_0$ from $n$ correlated variates $x_1, x_2, \dots x_n$, by determining a linear function of $x_1, x_2, \dots x_n$ which has the maximum correlation $R_n$ with $x_0$, then the value of $R_n^2$ is given by
\[ R_n^2 = 1 - \Delta/\Delta_{00}, \]
where $\Delta$ is the determinant
\[ \Delta = \begin{vmatrix} 1 & r_{01} & r_{02} & \dots & r_{0n} \\ r_{10} & 1 & r_{12} & \dots & r_{1n} \\ \vdots & \vdots & \vdots & & \vdots \\ r_{n0} & r_{n1} & r_{n2} & \dots & 1 \end{vmatrix} \]
and $\Delta_{pq}$ is the minor corresponding to the constituent of the $p$th column and $q$th row.

The system I propose to consider is that in which all correlations like $r_{0p}$ are equal, whatever $p$ be, to a constant $\rho$, and all correlations $r_{pq}$, where $p$ and $q$ may take any values from 1 to $n$, are the same and equal to $\epsilon$. We now have for the value of $\Delta$ the expression
\[ \Delta = \begin{vmatrix} 1 & \rho & \rho & \dots & \rho \\ \rho & 1 & \epsilon & \dots & \epsilon \\ \rho & \epsilon & 1 & \dots & \epsilon \\ \vdots & \vdots & \vdots & & \vdots \\ \rho & \epsilon & \epsilon & \dots & 1 \end{vmatrix}. \]
To evaluate this determinant add all the rows but the first together, giving
\[ n\rho,\quad 1 + (n-1)\epsilon,\quad 1 + (n-1)\epsilon,\quad \dots\quad 1 + (n-1)\epsilon, \]
multiply the result by $\rho/(1 + (n-1)\epsilon)$ and subtract from the first row. We have
\[ \Delta = \begin{vmatrix} 1 - \dfrac{n\rho^2}{1 + (n-1)\epsilon} & 0 & 0 & \dots & 0 \\ \rho & 1 & \epsilon & \dots & \epsilon \\ \rho & \epsilon & 1 & \dots & \epsilon \\ \vdots & \vdots & \vdots & & \vdots \\ \rho & \epsilon & \epsilon & \dots & 1 \end{vmatrix} = \left(1 - \frac{n\rho^2}{1 + (n-1)\epsilon}\right) \times \Delta_{00}. \]
Hence
\[ R_n^2 = 1 - \frac{\Delta}{\Delta_{00}} = \frac{n\rho^2}{1 + (n-1)\epsilon} \tag{i} \]
or
\[ R_n\dagger = \rho\sqrt{\frac{n}{1 + (n-1)\epsilon}} \tag{ii} \]

* Biometrika, Vol. VIII. p. 439.
† The sign of $R_n$ must be determined from other considerations.

Thus if $n$ variates are equally correlated ($\epsilon$) among themselves, and equally correlated ($\rho$) with another variable, we shall not indefinitely increase the accuracy with which the last variable will be predicted from the others by increasing indefinitely the number of the variates $n$.

Illustration. The coefficient of multiple correlation is required as we increase the number of brothers from whom a prediction of a character in a given brother is made. The fraternal correlation = .5.

Number of Brothers | $R_n$
1                  | .5000
2                  | .5774
3                  | .6124
4                  | .6325
5                  | .6455
6                  | .6547
10                 | .6742
$\infty$           | .7071

Compare against these results two parents only in a population where there is no assortative mating and the parental correlation = .5. Here $\epsilon = 0$, $\rho = .5$ and $n = 2$, $\therefore R = \tfrac{1}{2}\sqrt{2} = .7071$, or two parents will give more information than 10 brothers and sisters, and as much in fact as an indefinite number. Suppose the parents tend to select their like, i.e. suppose there is assortative mating in the population, say, $\epsilon = +.15$, then with the same intensity of parental correlation $R = .6594$, or, two parents will give us more information than six brothers and sisters.

Now this illustration brings out the real nature of the effect of increasing the number of variables from which we predict. Such increase has very little value, if those variables are fairly highly correlated with each other. To be effective they must be highly correlated with the variate we wish to predict and correlated very slightly with each other. Even in this case there is a limit to the degree of correlation reached when the number of variates is indefinitely increased, namely $\rho/\sqrt{\epsilon}$, and it is clear that if $\rho$ be small and $\epsilon$ fairly large, no very great increase of correlation is obtained if we use an indefinitely great number of variates. For example if $\rho = .05$ and $\epsilon = .5$, we find $R_\infty = .0707$ only.
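The figures in this illustration follow directly from the closed form for equally correlated variates, and can be reproduced with a few lines of Python. This is a minimal sketch: the function name `multiple_correlation` and its argument names are illustrative assumptions, with `rho` standing for the common correlation of each predictor with the predicted variate and `eps` for the common correlation among the predictors.

```python
from math import sqrt, inf


def multiple_correlation(n, rho, eps):
    """Multiple correlation of one variate with n equally correlated predictors.

    rho: correlation of every predictor with the predicted variate.
    eps: correlation of every pair of predictors with each other.
    """
    if n == inf:
        return rho / sqrt(eps)  # limiting value; requires eps > 0
    return rho * sqrt(n / (1 + (n - 1) * eps))


# Brothers: fraternal correlation .5 both with the subject and with each other.
for n in (1, 2, 3, 4, 5, 6, 10, inf):
    print(n, f"{multiple_correlation(n, 0.5, 0.5):.4f}")

# Two parents, without and with assortative mating of .15.
print("parents, eps = 0:  ", f"{multiple_correlation(2, 0.5, 0.0):.4f}")
print("parents, eps = .15:", f"{multiple_correlation(2, 0.5, 0.15):.4f}")
```

The sketch only evaluates the formula; like the text, it leaves the sign of the coefficient to be settled from other considerations.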
Even if $\rho$ were .10, we should only raise $R$ to .1414, could we predict from an indefinitely large number of such correlated variates*. Indeed as long as $\epsilon$ is not less than $\rho$ we gain singularly little by combining large numbers of variates. For example if $\rho$ were .4, and $\epsilon = .4$, ten such variates would only raise the correlation to .5898, and an indefinitely large number to .6325, which is less than double the single correlation. Yet there are apparently many persons who believe that by taking a number of low correlations, a high relationship can be reached!

Actually there is a limit to what relations can possibly exist between a variate $x_0$ and a series of equally correlated variables $x_1 \dots x_n$. Since $R$ must be less than unity, we have
\[ n\rho^2 < 1 + (n-1)\epsilon, \qquad \text{or} \qquad \epsilon > \frac{n\rho^2 - 1}{n - 1}. \]
Thus if $n = 10$ and $\rho = .5$, $\epsilon$ must be $> .1667$. Or, it would be impossible for 10 variates to have a correlation .5 with another variable, and a zero correlation with each other.

* Even if $\rho$ were .10 and $\epsilon$ as low as .10 we should not raise $R$ for endless variates of this order of correlation above .3163, while from compounding ten such variates we should only obtain a correlation about double that of a single variate, i.e. $R = .2294$.

If we suppose a number of variates $n$ to be uncorrelated with each other, but correlated $r_{01}, r_{02}, \dots r_{0n}$ with another variable $x_0$, then we have from the determinant as given below
\[ \Delta = \begin{vmatrix} 1 & r_{01} & r_{02} & \dots & r_{0n} \\ r_{01} & 1 & 0 & \dots & 0 \\ r_{02} & 0 & 1 & \dots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ r_{0n} & 0 & 0 & \dots & 1 \end{vmatrix} = \left(1 - r_{01}^2 - r_{02}^2 - \dots - r_{0n}^2\right)\Delta_{00}, \]
\[ R^2 = r_{01}^2 + r_{02}^2 + \dots + r_{0n}^2, \qquad \text{or} \qquad R = \sqrt{n} \times \sqrt{\frac{r_{01}^2 + r_{02}^2 + \dots + r_{0n}^2}{n}}. \]
Therefore, if $n$ variables, uncorrelated among themselves, be correlated with an additional variable, it is necessary that the root mean square of their correlations should be less than $1/\sqrt{n}$. We see therefore that it must either be impossible to find a large number of variables $n$ uncorrelated among themselves, which are correlated with an additional variable, or else their correlations with this variable must be extremely low.

The last result shows us the fallacy of supposing that correlations are simply added together for a combined effect; clearly when the variates are uncorrelated among themselves, we add by the sum of the squares. For example, if $r_{01} = r_{02} = \dots = r_{0n} = .03$, one hundred such variables would only raise $R$ to .30. On the other hand if the variates are highly correlated together, say $\epsilon = .81$, an indefinitely great number of such variables would only raise the multiple correlation to .0333, if the individual correlation were .0300.

We are now in a position to apply our results to the problem of the relative intensity of heredity and environment. This problem has been singularly misunderstood especially by the popular exponents of Eugenics. Some illustrations of this may be given here. Major Leonard Darwin writes as follows in the Journal of the Eugenics Education Society:

"It is impossible to compare heredity as a whole with environment as a whole as far as their effects are concerned; for no living being can exist for a moment without either of them*. Moreover, in order to compare two things so as to be able to use the words more or less in connection with such a comparison, we must have a common unit of measurement applicable to them both. But what is the unit by which both heredity and environment may be measured? I myself have no idea. May we not be discussing questions as illogical as enquiring what portion of the area of a rectangle is due to its width and what to its length?
_Is it ever wise to use words in scientific literature without endeavouring to attach a definite meaning to them_†?"

It is hard to conceive a paragraph of the same length more full of evidence of complete ignorance of the methods used in modern science for comparing correlated variates! Yet it goes out as the opinion of the President of a Society which is endeavouring to spread the scientific doctrines of Eugenics among the people! Major Darwin begins by stating that it is needful to have a common unit of measurement in order to compare two variates. To begin with we are not comparing two things, but we are comparing the influence of two things on a third, i.e. the intensity of a certain environmental influence and the intensity of a certain somatic character in the parent, say, on the intensity of the somatic character in the offspring.

* There would in our sense be no heredity if the average child born to noteworthy parents was equal to the average child of the whole community. Yet it is perfectly easy to understand how living beings could exist under such a law of reproduction. Major Darwin seems to be confusing two things, the fact that a man is born true to his species, and the fact that he resembles his immediate ancestry. It is the latter fact only which concerns us when we compare heredity and environment, i.e. how variation of immediate ancestry affects the individual's physical or mental characters. But without such heredity individuals might quite well exist.

† The Eugenics Review, Vol. v. p. 152. The italics are mine.
Yet Major Darwin tells us we cannot do this because we cannot measure these things in the same unit! How suavely yet forcibly Sir Francis Galton himself would have ridiculed such ignorance in high places as is passed by the Editor of the Eugenics Journal! We can hear him now telling us how the intensity of each character could be measured by its grade, and how the problem turned on whether the same change in grade in the environment and in the parental somatic character produced greater or less change in the grade of the filial somatic character. When we inquire whether inter-racially stature is more closely related to cephalic index or to eye colour, are we to be met by the statement that these characters cannot be compared because they cannot be measured in a 'common unit,' and then be told that it is not "wise to use words in scientific literature without endeavouring to attach a definite meaning to them?" Every trained statistician knows that each character is measured in the unit of its own variability, in what he terms its standard deviation*, and that this standard deviation provides him with a measure of the frequency of each value of the variate in question. It seems to me that the only correct sentence in this paragraph is the author's statement that he himself has no idea what unit is 'common' to heredity and environment.

But our author continues:

"Take any quality, and we find that the human beings composing any community differ more or less considerably as regards that quality. Now we can measure the correlation between the differences shown in this quality and the differences of environment to which the members of the community in question had previously been exposed†. This is one correlation. Then we can also measure the correlation coefficient between, say, father and son, as regards the quality in question.
Here is a second correlation; and if we are told that the relative influence of environment and heredity is measured by the ratio between these two correlation coefficients, we certainly do thus get a clear conception of what is meant‡."

But has the writer really obtained a clear conception of what such coefficients of correlation mean, when in the next paragraph he continues:

"Imagine an ideal republic, in some respects similar to that designed by Plato, where not only were all the children removed from their parents, but where they were all treated exactly alike. In these circumstances none of the differences between the adults could have anything to do with the differences of environments, and all must be due to some differences in inherent factors. In fact the environment correlation coefficient would be nil, whilst the hereditary correlation coefficient might be high§."

Could any better evidence be adduced that the President of the Eugenics Education Society did not know what a coefficient of correlation meant at that date? The coefficient of correlation for the environment might be anything from -1 to +1; the only obvious fact would be that you could not find its value, except in the form 0/0, from an environment which precluded any measure of variation. How again Sir Francis would have smiled at the notion that the coefficient of correlation for a constant environment must be nil. Why should we follow such advice as that given by the President of the Society to avoid as far as possible "such phrases as the relative influence of heredity and environment," when on his own showing he does not in the least appreciate the methods by which this relative influence is measured?

* Of course he may or does need other constants to help in the description of the frequency.

† loc. cit. p. 153.

‡ This seems to contradict the writer's previous assertion that two things are incomparable, if they have not a 'common unit'!

§ I wrote at once to Major Darwin pointing out the error of such a statement and he withdrew it in the next number. But the harm done by an article of this kind cannot be reversed by correcting a single misstatement.

Then Major Darwin continues:

"Surely what we want to know is how we can do most good, whether by attending to reforms intended to affect human surroundings, or to reforms intended to influence mankind through the agency of heredity. But does this ratio [that of the environmental and hereditary correlation] give us any sure indication of the relative amount of attention which should be paid to these two methods of procedure?"

Our only reply can be that these correlations certainly do, and that as long as the President of the Eugenics Education Society fails to grasp their meaning, he is doing grave harm to the science of eugenics. We measure the change in the character of an individual which would be produced by a change of a like or an allied character in a parent, such change being one of which we have experience; we measure the change which would be produced in the character of the individual by changes in the environment such as we have experience of, i.e. when we move the individual from a badly ventilated to a well ventilated house, from a back to back to a through house, from a low wage to a high wage, and so forth, and we find the resulting changes are of a wholly different order in these cases to what happens when we change the physical characters, the health or habits which define the parents. It is on the basis of this that we assert that the relative strength of heredity is far greater than the strength of environment. To this reasoning, apart from such arguments as the above or those to be immediately dealt with, reply is only made by talk as to the impossibility of an individual surviving if you deprived him of his normal environment!
It would be just as reasonable to assert that everything must be due to heredity, because a race of supermen would breed supermen! What the scientific eugenist has endeavoured to measure are the influences of such range of differences in environment as occur in everyday experience and are therefore producible from the political, economic and social standpoints, not the absence of all environment at all.

But while this is recognised by some of the popular eugenic writers, they have approached the problem from another standpoint which indicates equally how little they grasp modern statistical theory. We admit, they say, that the environmental correlations may be of the order .03 or .05 and the inheritance correlations of the order .50. But this is the correlation of one character in environment. You ought to take ten or twenty, and then you will have multiplied up environment to be more effective than heredity, for .03 × 20 = .60. In the first place we may suggest that it would be just as reasonable, if the argument were a valid one, to multiply up the favourable hereditary characters, to take weight, height, muscular activity, health, intelligence, caution, and many other desirable factors, and these not only in one parent but in brothers, sisters, aunts, uncles and grandparents, and treat the cross-correlation of these with the character under discussion. But although every improvement in stock would reflect itself in improvement in offspring, correlations cannot be added together, any more than forces, by simple arithmetical addition. You do not combine two hereditary correlations any more than two environmental correlations by mere addition. You must proceed by the combinatory process indicated at the commencement of this paper, which is one of course familiar to every trained statistician.
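The gap between simple addition and the combinatory process can be shown with the very figures of the objection. The following is a minimal Python sketch under the assumption made at the opening of the paper, namely that every factor has the same correlation with the character and the same correlation with every other factor. The function name `combined_correlation` and the three inter-correlations tried (0, .5 and .7) are chosen for illustration only.

```python
from math import sqrt


def combined_correlation(n, r, e):
    """Multiple correlation from n factors, each correlated r with the
    character and e with every other factor (equally correlated case)."""
    return r * sqrt(n / (1 + (n - 1) * e))


n, r = 20, 0.03
print("simple addition of correlations:", round(n * r, 4))
for e in (0.0, 0.5, 0.7):
    print(f"inter-correlation {e}: R = {combined_correlation(n, r, e):.4f}")
```

Even in the most favourable case, with the twenty factors wholly uncorrelated among themselves, the combined value is the root of the sum of squares and falls far short of the figure obtained by addition.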
Yet here is a statement which the Editor of the Eugenics Review admits to its pages without contradiction*:

* Vol. v. p. 219, in an article by A. M. Carr-Saunders.

The point that we wish to make is this. In the face of so much ignorance concerning, not only heredity itself, but also its complement, the influence of environment, how can any one be justified in making sweeping generalisations with reference to these subjects? Such generalisations, however, are made. It is said that we have a definite proof that inheritance is of far greater strength than environment. This argument takes the following shape. The correlations between parent and offspring for a number of features have been calculated, and the mean is found to be somewhere about .5. Correlations between individuals and various aspects of their environment have also been worked out (as, for instance, mental ability and conditions of clothing, or between myopia and the age of learning to read*) and the mean value is found to be about .03. It is then said that the mean "nature value" is at least five to ten times as great as the mean "nurture value," and upon this is founded the generalisation that "nature" is of far greater importance than "nurture"†. It may be questioned, however, whether such a comparison does not involve a serious mistake. For if we consider the two mean values that are compared, we find that, whereas the "mean nature value" is the mean value of a number of observations, all of which provide a full measure of the strength of heredity, the "mean nurture value" is the mean value of a number of observations, each of which measures only the strength of some one isolated aspect of environment. It would appear then that the full strength of inheritance has been compared, not with the full strength of environment, but with the average of a number of small isolated aspects of the latter.
As a matter of fact it is quite beyond our power at present to sum up the full effect of environment upon the individual and compare it with the full effect of heredity. We are, therefore, justified in saying that we neither know in particular cases how far the environment can produce any effect, nor can we make any definite statement as to the comparative strength of "nature" and "nurture."

Now this is the doctrine passed by the Editors of the Eugenics Review, the journal of a society which has assumed the mantle of Francis Galton‡, and it is passed because the editorial committee of that society does not grasp the meaning of multiple correlation! The passages in italics have been so printed to draw our readers' attention to them. In the first place, of course, a single correlation coefficient does not provide a full measure of the strength of heredity. In the table cited the coefficients are those for one parent or for one brother or sister. Each relative (and those for independent stocks are either non-correlated or inter-correlated very slightly) provides such a coefficient, and further each character in such relatives may be correlated with the character under discussion in the subject in question. In the next place the environment factors do not consist of "some one isolated aspect of environment." All these factors or aspects are closely interlinked, and this was a fact well-known to the workers in the Galton Laboratory. The real interpretation of such a difference as .50 and .03 in the average values of single coefficients can only be appreciated by those who are conversant with the theory of multiple correlation, and it is quite clear that those who profess to guide the public in this very difficult problem, which is essentially a scientific problem, lack any adequate knowledge of the sole instrument by which any conclusion can be drawn.
The writer appears to be wholly ignorant of the nature of multiple correlation in the first place, and in the second entirely to overlook the very high correlations which exist between environmental factors. Bad wages, bad habits, bad housing, uncleanliness, insanitary surroundings, crowded rooms, danger of infection, etc., etc. are all closely associated together, and while the order of correlation between environmental and physical characters is low, that between individual environmental factors is in our experience very high. Thus the problem of multiple correlation illustrates closely the theory developed in the first part of this note; we have to deal with a low $\rho$ and a high $\epsilon$.

* As the writer phrases this correlation, it is very liable to be misinterpreted. What the Galton Laboratory did was to show that myopia was very markedly inherited, and that the theory that it was largely due to school environment was incorrect, because children who began to read late, i.e. went late to school, were not less myopic than those who went early.

† Karl Pearson, Nature and Nurture, Eugenics Laboratory Lectures, VI. p. 25.

‡ If there was one point on which Francis Galton felt strongly and wrote it was on this point of the relatively great intensity of "nature" as compared with "nurture." I do not stand alone in recognising it as an essential part of his teaching: "I am inclined to agree with Francis Galton," writes Charles Darwin, "in believing that education and environment produce only a small effect on the mind of anyone, and that most of our qualities are innate."

For example, if we take the environmental factors to have an average inter-correlation of .70, then an infinity of such factors for a mean environmental and individual correlation of .03 would only raise the correlation to .0359 against a single parental correlation of .5000; if the correlation was .05 instead of .03, we should have the total possible environmental multiple correlation .0598 as against .5000. Even if we raise the average environmental correlation to .1 and the inter-environmental factor correlation be reduced to .5, the multiple correlation of an infinity of factors is only .1414 as against the single factor of heredity .5000. Even if we could pick out one hundred environmental factors which had no inter-correlations (which experience shows is wholly impossible) and each of these independent factors was correlated to the extent of .05 with the mental or physical characters of an individual, they would only just reach the hereditary influence of a single character in a single parent.

Now let us suppose an absolutely idle case, namely that the environmental factors had the same correlation as a parent, i.e. .5, with the character of the individual, and only a correlation of .6 with each other; then if we could use an indefinitely great number of such factors the multiple correlation would only be $.5/\sqrt{.6} = .6455$, while the correlation with two parents, with no assortative mating, would be .7071. Even with assortative mating, it suffices to take only the four grandparents into account to show that heredity acts in excess of an environmental scheme even so preposterous as is suggested above.
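The claim about the four grandparents can be checked numerically. The following is a minimal Python sketch using only the standard library. It assumes the correlations the paper adopts for this case (parental .50, grandparental .25, assortative mating .15), treats relatives not linked by parentage or marriage as uncorrelated, and obtains the squared multiple correlation as the sum of each relative's regression weight times its correlation with the individual, which is equal to one minus the ratio of the two determinants. The helper `solve` and the variable names are illustrative and not taken from the paper.

```python
from math import sqrt


def solve(a, b):
    """Solve the linear system a.w = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for i in range(col + 1, n):
            f = m[i][col] / m[col][col]
            for j in range(col, n + 1):
                m[i][j] -= f * m[col][j]
    w = [0.0] * n
    for i in reversed(range(n)):
        w[i] = (m[i][n] - sum(m[i][j] * w[j] for j in range(i + 1, n))) / m[i][i]
    return w


# Order of relatives: father, mother, father's two parents, mother's two parents.
r = [0.50, 0.50, 0.25, 0.25, 0.25, 0.25]  # correlations with the individual
C = [
    [1.00, 0.15, 0.50, 0.50, 0.00, 0.00],
    [0.15, 1.00, 0.00, 0.00, 0.50, 0.50],
    [0.50, 0.00, 1.00, 0.15, 0.00, 0.00],
    [0.50, 0.00, 0.15, 1.00, 0.00, 0.00],
    [0.00, 0.50, 0.00, 0.00, 1.00, 0.15],
    [0.00, 0.50, 0.00, 0.00, 0.15, 1.00],
]

weights = solve(C, r)                          # regression weights of the six relatives
R2 = sum(w * ri for w, ri in zip(weights, r))  # squared multiple correlation
print("weights:", [round(w, 4) for w in weights])
print("R^2 =", round(R2, 4), " R =", round(sqrt(R2), 4))
print("limit for endless factors (.5, .6):", round(0.5 / sqrt(0.6), 4))
```

Comparing the last two printed values shows parents and grandparents together exceeding the limiting value for the idle environmental case.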
If we take the parental correlations .50, the grandparental .25, and those of assortative mating .15, we have for the determinant:

$$\Delta = \begin{vmatrix}
1 & .50 & .50 & .25 & .25 & .25 & .25 \\
.50 & 1 & .15 & .50 & .50 & 0 & 0 \\
.50 & .15 & 1 & 0 & 0 & .50 & .50 \\
.25 & .50 & 0 & 1 & .15 & 0 & 0 \\
.25 & .50 & 0 & .15 & 1 & 0 & 0 \\
.25 & 0 & .50 & 0 & 0 & 1 & .15 \\
.25 & 0 & .50 & 0 & 0 & .15 & 1
\end{vmatrix}$$

Add together the second and third rows multiplied by .3951, and the fourth, fifth, sixth and seventh multiplied by .0456 and subtract the result from the first. The first row then becomes

$$|\ .5593,\ 0,\ 0,\ 0,\ 0,\ 0,\ 0\ |$$

the others of course remaining the same. Hence $\Delta = .5593\,\Delta_{11}$ and $R^2 = 1 - \Delta/\Delta_{11} = 1 - .5593 = .4407$. Therefore $R = .6639$. Or together grandparents and parents would influence a man’s character more than an infinity of environmental factors of the same grade of correlation, because the latter factors are far more highly correlated together than several of our relatives.

Actually of course we are dealing with average values; the average value of environmental correlation with individual character being in our experience of the order .03 to .05 and the inter-environmental factor correlations of the order .5 to .7. But these averages enable us to appreciate the total effect.

The doctrine taught by the writers in the Eugenics Review, that we know nothing of the relative intensity of environment and heredity and that it is unwise “to use words in scientific literature without endeavouring to attach a definite meaning to them” only demonstrates how far the Editors of that Journal are removed from any appreciation themselves of modern statistical methods. How far the doctrine is removed from the very strong views held on this point by Francis Galton, only those who have studied his writings and know how strongly he felt personally on the subject are in the least competent to appreciate.

VI. Formulae for the Determination of the Capacity of the Negro Skull from External Measurements.

By L. ISSERLIS, B.A.

§ 1.
Formulae for the determination of the capacity of the human skull from external measurements were obtained by Lee and Pearson*. The material they employed consisted of various series of measurements of Bavarian, Aino and Naqada skulls. Measurements of ancient and modern Egyptian and other non-European skulls were employed, chiefly for purposes of comparison. The formulae, some of which will be quoted later, were intended primarily for the prediction of the capacity of European skulls from external measurements. Doubt has been thrown on several occasions on the applicability of these formulae to the Negro skull, one of the reasons alleged being the supposed difference in thickness of the bone of European and Negro crania. The publication† of the late Dr R. Crewdson Benington’s researches on the negro skull has made it possible to obtain similar formulae for negro skulls, and to test how far these can be applied to the prediction of the capacity of European skulls and conversely to test the applicability of Lee and Pearson’s equations to the negro skull.

§ 2. The material is fully described in Dr Benington’s Study. The crania dealt with in the present paper are Benington’s series A, B, C.

A. Congo Crania in the Royal College of Surgeons. These crania provide 46 males and 21 females, as owing to various defects no capacity is available for numbers 25, 38, 48, 54 among the males and numbers 69, 72, 75, 79, 82, 85 among the females.

B. Crania from the Gaboon, Group I, brought by Du Chaillu from Fernand Vaz in 1864. Of the 50 male and 44 female crania in the series, 2 males (numbers 3 and ?) and 1 female (number 2) are defective, leaving 48 male and 43 female crania available.

C. Crania from the Gaboon, Group II, brought by Du Chaillu from Fernand Vaz in 1880. Two of the 18 males (numbers 12a and 20) and two of the 19 females (numbers 8 and 18) are defective.

Altogether 110 male and 81 female crania have been dealt with.
The correlation has been calculated of the capacity (C) and the product of the breadth, length and total height (B, L and H), for each group and for the aggregates of 110 male, and of 81 female crania. Correlation coefficients have also been calculated for the capacity and breadth, capacity and length, and capacity and total height, but for the aggregates of the three groups only. Regression formulae are given in all cases. It is to be observed that Dr Crewdson Benington’s measurements of capacity were taken with mustard seed, packing and measuring glass and that the error of measurement, or rather his average difference as compared with other workers in the Biometric Laboratory, was under 10 cm³.

In comparing the regression formulae obtained here with those given by Lee and Pearson for European and other skulls it must be remembered that in all their formulae except (12) and (13) of p. 247 they employed the auricular height and not the total height. In the present paper as in Dr Benington’s study H denotes the total height. Lee and Pearson denote this by H′ and use H for the auricular height. It was not possible here to use the auricular height as it was not available for the whole of the Gaboon series B and C.

* Phil. Trans. Vol. 196, Series A, pp. 225-264.

† Biometrika, Vol. VIII. Nos. 3 and 4, Dec. 1911.

Taking first the male skulls, the mean value of the capacity and the product BLH, their standard deviations and the correlations are given in the following table.

TABLE I.

| Group | Mean capacity in cm³ | Mean value of BLH in cm³ | σ_C in cm³ | σ_BLH in cm³ | r_C,BLH |
| 46 Congo Skulls | 1344 | 3303 | 126.22 | 282.99 | .872 |
| 48 Gaboon (1864) | 1379 | 3295 | 108.30 | 230.30 | .822 |
| 16 Gaboon (1880) | 1447 | 3463 | 109.60 | 266.42 | .808 |
| 110 Negro skulls | 1375 | 3323 | 120.74 | 265.20 | .842 |

The corresponding regression lines are

for the 46 Congo skulls: C = .0003889 BLH + 59 ± …/√n ......... (1),
for the 48 Gaboon (1864): C = .0003865 BLH + 105 ± …/√n ......... (2),
for the 16 Gaboon (1880): C = .0003323 BLH + 297 ± …/√n ......... (3),
for the 110 male negro skulls: C = .0003849 BLH + 96 ± …/√n ......... (4).

Lee and Pearson’s corresponding equation for males is

C = .000266 LBH′ + 524.6* ......... (P).

This is not a regression line, but is obtained by the method of least squares from the results for various races in their table 20. The formulae (1) to (4) can be used to predict the capacity of an individual skull from external measurements. The probable errors of the mean were calculated by the formula $0.67449\,\sigma/\sqrt{n}$, where n is the number of skulls in the group to which the formula is applied.

If we substitute in (1) to (4) the mean values of B, L, H for the Bavarian male skulls used by Lee and Pearson, viz.: B = 150.5, L = 180.6, H = 133.8, we obtain,

from (1), C = 1474 ± …/√n,
from (2), C = 1471 ± …/√n,
from (3), C = 1506 ± …/√n,
from (4), C = 1496 ± …/√n.

* Loc. cit. Equation (12). H′ = total height.

The measured capacities of these German skulls have a mean value of 1503 c.c., a result which is in very close agreement with (4), the formula based on 110 skulls. 1503 is the mean capacity of 100 skulls so that …/√100 = 6.55. Thus the difference between the actual mean capacity of German skulls and the mean capacity estimated by the negro formula is less than 10 cm³, although the mean capacity of German male skulls exceeds that of negro males by 1503 − 1375 = 128 cm³.

If the above values of B, L, H are substituted in Lee and Pearson’s formula (P) on p. 4 we obtain C = 1492. On the other hand if we substitute the mean values of the dimensions of the 110 male negro skulls, B = 137, L = 178, H = 135, in formula P we obtain C = 1400 as compared with the measured mean of 1375.
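The substitutions just described are simple enough to reproduce. The sketch below evaluates formula (4) and formula (P) at the two sets of mean dimensions quoted in the text. The function names are hypothetical, dimensions are in millimetres and capacities in cubic centimetres. The constant 524.6 in (P) is taken as the value that reproduces the quoted results C = 1492 and C = 1400; this is an inference from the arithmetic, since the printed constant is poorly legible in the scan.

```python
def capacity_negro_male(b, l, h_total):
    """Formula (4): regression line from the 110 male negro skulls."""
    return 0.0003849 * b * l * h_total + 96.0

def capacity_lee_pearson_male(b, l, h_total):
    """Formula (P): Lee and Pearson's least-squares line through racial means.

    The constant 524.6 is an assumption checked against the quoted results.
    """
    return 0.000266 * b * l * h_total + 524.6

# Mean breadth, length and total height in mm, as quoted in the text.
bavarian_male = (150.5, 180.6, 133.8)
negro_male = (137.0, 178.0, 135.0)

if __name__ == "__main__":
    print("Bavarian means, formula (4):", round(capacity_negro_male(*bavarian_male)))
    print("Bavarian means, formula (P):", round(capacity_lee_pearson_male(*bavarian_male)))
    print("Negro means, formula (P):", round(capacity_lee_pearson_male(*negro_male)))
```

Rounded to the nearest cubic centimetre these give 1496, 1492 and 1400, against measured means of 1503 for the Bavarian and 1375 for the negro male skulls.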
This is not as good a reconstruction as our formula (4) or as the formulae of Lee and Pearson employing auricular height, and is probably due to the fact that P is obtained by the method of least squares from 11 means only.

§ 4. An approximation to the influence of the thickness of the bone of the skull on predictions of capacity from external measurements can be obtained by differentiating the equation

$$C = kBLH + \text{const.}$$

and putting $dB = dL = dH = t$. We obtain

$$dC = k(BL + LH + HB)\,t,$$

or if we observe that in the equations the constant is comparatively small

$$\frac{dC}{C} = \left(\frac{1}{B} + \frac{1}{L} + \frac{1}{H}\right) t;$$

with B = 150.5, L = 180.6, H = 133.8,

$$\frac{dC}{1500} = t\,(.02) \text{ approximately.}$$

Thus a difference of 10 cm³ in capacity corresponds to a difference of 1/3 mm. in thickness, which is about 5% of the thickness (say 6 mm.) of the human skull. We may fairly conclude then, that there is no appreciable difference in the thickness of the negro skull as compared with the European.

§ 4. The female crania yield very similar results. The following is the table for the female skulls.

TABLE II.

| Group | Mean capacity in cm³ | Mean BLH in cm³ | σ_C in cm³ | σ_BLH in cm³ | r_C,BLH |
| 21 Congo Skulls | 1206 | 2858 | … | … | .90 |
| 43 Gaboon (1864) | 1232 | 2924 | 126.7 | 270.95 | .8814 |
| 17 Gaboon (1880) | 1240 | 2964 | 97.31 | 265.8 | .8560 |
| 81 Negro skulls | 1227 | 2956 | 117 | 255.72 | .7668 |

The corresponding regression lines are

21 Congo skulls: C = .0003645 BLH + 164 ± 45/√n ......... (5),
43 Gaboon (1864): C = .0004122 BLH + 27 ± …/√n ......... (6),
17 Gaboon (1880): C = .0003134 BLH + 311 ± …/√n ......... (7),
81 Negro skulls: C = .0003508 BLH + 204 ± …/√n ......... (8).

The corresponding Lee and Pearson formula obtained by the method of least squares is

C = .000156 LBH′ + 812 ......... (Q).

The mean values of B, L, H′ for the Bavarian female skulls discussed by Lee and Pearson are B = 144.11, L = 173.59, H′ = 128.07. With these values, we deduce from (5) to (8) the following values for C.
(5) C = 1331 ± 45/√n,
(6) C = 1347 ± …/√n,
(7) C = 1315 ± …/√n,
(8) C = …

The mean of the measured values of the capacities of these skulls is 1337 and formula (8) based on 81 negro skulls gives a result in very close agreement. If the above values of B, L, H′ are substituted in Lee and Pearson’s formula Q we obtain C = 1284, a result which differs from the true value much more seriously than the prediction by the negro regression formula. Again, if we insert the mean values B = 130.75, L = 171.38, H = 129.81, of the 81 female negro crania in the formula Q we get C = 1266 as against the mean of the measured values which is C = 1227, demonstrating again the fact that the formulae P, Q based on 11 means are not as good as the regression formulae.

§ 5. We add tables of the correlation between capacity and breadth, capacity and length, and capacity and total height for the 110 male and the 81 female skulls, and for comparison reprint the corresponding values for German (Bavarian) skulls.

TABLE III. Correlation. Males.

| | Negro | German |
| Capacity and Breadth | .4977 | .6720 |
| Capacity and Height | .6080 (total height) | .2431 (auricular height) |
| Capacity and Length | .7433 | .5152 |

TABLE IV. Correlation. Females.

| | Negro | German |
| Capacity and Breadth | .7578 | .7068 |
| Capacity and Height | .5450 (total height) | .4512 (auricular height) |
| Capacity and Length | .6699 | .6873 |

The corresponding regression lines are given in the tables below:

TABLE V. Males.

| | Negro | German |
| (9) | C = 12.6356 B − 356 ± 105/√n | C = 13.432 B − 517.34 |
| (10) | C = 12.8301 L − 1087 ± …/√n | C = 9.892 L − 289.55 |
| (11) | C = 15.3265 H′ − 694 ± 96/√n (H′ = total height) | C = 5.264 H + 868.05 (auricular height) |

TABLE VI. Females.

| | Negro | German |
| (12) | C = 17.872 B − 1114 ± 7…/√n | C = 15.716 B − 927.66 |
| (13) | C = 12.46 L − … ± 87/√n | C = 12.055 L − 755.53 |
| (14) | C = 10.871 H′ − 184 ± 98/√n (H′ = total height) | C = 10.993 H + 82.13 (auricular height) |
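The regression lines in this note follow from the tabulated means, standard deviations and correlations in the usual way: the slope is $r\,\sigma_C/\sigma_X$ and the line passes through the two means. The sketch below is an illustration under stated assumptions, with hypothetical function names. It takes rows of Table I as read here (the correlation .808 for the 1880 Gaboon group is a reading of a damaged entry) and divides the slope by 1000 because the table gives BLH in cm³ while the formulae take B, L, H in millimetres. The second function computes $\sigma_C\sqrt{1-r^2}$; that this is the quantity standing over √n in Tables V and VI is an inference from the arithmetic, not something the text states.

```python
import math

def regression_line(mean_c, mean_x, sd_c, sd_x, r):
    """Slope and intercept of the regression of capacity C on a measurement X."""
    slope = r * sd_c / sd_x
    intercept = mean_c - slope * mean_x
    return slope, intercept

def sd_about_line(sd_c, r):
    """Standard deviation of capacity about the regression line."""
    return sd_c * math.sqrt(1.0 - r * r)

# Table I rows: mean C, mean BLH (cm^3), sd of C, sd of BLH, correlation.
table_i = {
    "46 Congo": (1344, 3303, 126.22, 282.99, 0.872),
    "48 Gaboon (1864)": (1379, 3295, 108.30, 230.30, 0.822),
    "16 Gaboon (1880)": (1447, 3463, 109.60, 266.42, 0.808),
}

if __name__ == "__main__":
    for label, row in table_i.items():
        slope, intercept = regression_line(*row)
        # Divide by 1000 to express the slope per mm^3 of B*L*H.
        print(f"{label}: C = {slope / 1000:.7f} BLH + {intercept:.0f}")
    # Males, sd of capacity 120.74, correlations from Table III.
    print(round(sd_about_line(120.74, 0.4977)), round(sd_about_line(120.74, 0.6080)))
    # Females, sd of capacity 117, correlations from Table IV.
    print(round(sd_about_line(117, 0.6699)), round(sd_about_line(117, 0.5450)))
```

For the Congo and 1864 Gaboon groups this reproduces the coefficients of (1) and (2); for the 1880 group the intercept comes out within a unit or two of the printed 297, which is about what the rounding of the tabulated moments allows.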
No great degree of accuracy can be expected in reconstructing the capacity of a skull from a single measurement, but the remarkable difference of formula (11) for negro skulls from the corresponding German formula is of course due to their referring to different measurements of the height. If we insert H = 133.8 in (11), which is the mean total height of the Bavarian skulls, we get C = 1356.7 ± 96/√n instead of the measured mean C = 1503 cm³. Similarly equation (9) gives C = 1555.6 ± 105/√n instead of 1503 when we insert the German mean B = 150.5. Thus (9) to (14) are of little use for our purpose.

VII. Note on a Negro Piebald. (C. D. MAYNARD.)

[Biometrika, Vol. X, Part I. Plate X: Dr Maynard’s Piebald Negro.]

The remarkably interesting photograph of a negro piebald on Plate X has been forwarded to the Editor by Dr C. D. Maynard. The native comes from the district round Chai Chai. Dr Maynard writes from Ressano Garcia, and states that the hospital attendant took the photograph. The extraordinary interest of the case arises from the fact that the thighs and feet are of normal negro pigmentation, but in the other patches we have varying degrees of pigmentation of the skin down to albinotic white. Unfortunately there is no dorsal view, but the back is stated to be also affected with albinotic areas. The boy reported that he was in the same condition when born, and that the nature and areas of the pigmentation had not altered.

VIII. Note on Infantile Mortality and Employment of Women, from the Report on Condition of Woman and Child Wage-earners in the United States, Volume XIII. Infant Mortality and its Relation to the Employment of Mothers.

By ETHEL M. ELDERTON.
The author of this Report emphasizes the difficulty of determining the effect of women’s employment and points out that

“It would be possible to draw positive conclusions as to the relative importance of this particular factor only by point-to-point comparison of the infant mortality for a period of years in two large communities, or two classes of large communities, in which all the material conditions were substantially common, with the single important exception that in one a considerable proportion of the married female population of child-bearing age were at work outside of their homes and in the other community with which the comparison was made none of the women were so employed. To admit of entirely sound conclusions, it would be necessary that the populations, and especially the women, of both communities should be of like ages, races, and physical health, that their living conditions should be practically identical, and that, in a general way, the child-bearing women should be of about the same grade of intelligence....... In default of some such comparison on a broad scale of the mortality of the infants of working and non-working women of similar ages, races, intelligence, and living conditions, no one can determine accurately how many of the deaths of working women’s infants are due to the mother’s work and how many to the other conditions of their lives and environment.” (p. 18).
The author illustrates the point by taking the six New England States and giving the infant deathrate, percentage of women of 16 years and over who are breadwinners, percentage of foreign-born to the population and percentage of population living in towns of 4000 and more inhabitants, and showing that, though the states with the highest infant mortality have also the largest number of women employed, they have also the largest percentage of foreign-born and of those living in urban surroundings, and that it is therefore impossible without further investigation to assign the infant deathrate to any of these three factors.

A further investigation has been undertaken into the 32 Massachusetts cities and the deathrate under a year is given, the percentage of foreign-born, the births per 1000 of the population*, the percentage of women gainfully employed and the percentage illiterate, and a comparison is made between the ten cities with the highest and the ten cities with the lowest infant deathrate and percentage of women employed and the other factors enumerated. The conclusion is reached that

“These comparisons indicate, superficially at least, that a more direct relation exists between infant mortality and the birthrate, the percentage of foreign-born, and the percentage of female illiteracy than between infant mortality and the employment of women.” (p. 38).

There can be no doubt that a direct study of the infant mortality in relation to women’s employment can only properly be made when we confine our attention to women, employed and unemployed, who are actually mothers and live in the same town, and when we correct for age†, and if possible home conditions. Still if we take a series of different towns the right method must be to correct by the method of partial correlation for such divergent factors as we are able to ascertain and allow for in the series of towns investigated.
* The author is not very confident of the full accuracy of the complete registration of births.

† Young women are often employed up to the birth of their first one or two children, but the deathrate of these elder-born is heavier than the deathrate of those who immediately follow.

I have endeavoured to apply modern statistical methods to the data of this Report, taking as measures of the environmental conditions in the towns: D = the general deathrate, i = percentage of illiteracy, f = percentage of foreign-born population, e = percentage of females employed 10 years of age and upwards (note, not percentage of employed mothers, so we may be largely measuring effect of child labour on future motherhood), and d = deaths under one year per 1000 births. Then we have for correlations:

$$r_{de} = .68, \quad r_{di} = .70, \quad r_{df} = .74.$$

Hence numbers of foreign-born and of illiterate appear to be slightly more influential on infantile mortality than employment of women. These values are certainly high and the first is the sort of crude value which is used as an argument against the employment of women. Proceeding to partial correlations we have

$${}_{i}r_{de} = .36, \quad {}_{f}r_{de} = .43, \quad \ldots = .48, \quad {}_{e}r_{di} = .42, \quad {}_{e}r_{df} = .57, \quad {}_{f}r_{di} = .31.$$

We next corrected for two factors and found:

$${}_{if}r_{de} = .34, \quad {}_{ef}r_{di} = .12, \quad {}_{ei}r_{df} = .43.$$

Thus we see that illiteracy has least influence on the infantile deathrate and the presence of foreign-born most. But even the presence of foreign-born and of illiterates is not a very complete measure of environmental effects liable to influence the infantile mortality in different towns as apart from employment of women. Many women employed means industrial conditions and possibly generally bad environment. I have taken as a measure of this the general deathrate D and find

$$r_{Dd} = .71, \quad r_{De} = .47, \quad r_{Di} = .60, \quad r_{Df} = .49.$$

Whence I find:

$${}_{D}r_{de} = .57, \quad {}_{D}r_{di} = .62, \quad {}_{D}r_{df} = .75, \quad {}_{D}r_{fe} = .49, \quad {}_{D}r_{ei} = .61, \quad {}_{D}r_{if} = .68,$$

showing very substantial relations after correction for a general measure of poor environment.
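The successive corrections in this note all rest on one formula: the correlation of x and y with a third factor z held constant is $(r_{xy} - r_{xz}r_{yz})/\sqrt{(1-r_{xz}^2)(1-r_{yz}^2)}$, and higher orders are obtained by feeding partial correlations of the next lower order back into the same expression. The sketch below is a minimal illustration with a hypothetical function name. The inputs are coefficients as read from this note; several subscripts in the scanned source are damaged, so which symbol each number belongs to is an assumption here, checked only by the arithmetic coming out close to the printed results.

```python
import math

def partial_r(r_xy, r_xz, r_yz):
    """Correlation of x and y with z held constant.

    Works at any order: pass partial correlations of order k that share the
    same fixed factors to obtain the partial correlation of order k + 1.
    """
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# First order: infant deathrate d and employment e, general deathrate D constant.
# Inputs r_de = .68, r_Dd = .71, r_De = .47.
d_e_given_D = partial_r(0.68, 0.71, 0.47)

# Second order: d and e with D and f constant, from first-order values
# read as D.r_de = .57, D.r_df = .75, D.r_fe = .49.
d_e_given_Df = partial_r(0.57, 0.75, 0.49)

# Third order: d and e with D, f and i constant, from second-order values
# read as fD.r_de = .35, fD.r_di = .23, fD.r_ei = .44.
d_e_given_Dfi = partial_r(0.35, 0.23, 0.44)

if __name__ == "__main__":
    print(round(d_e_given_D, 3), round(d_e_given_Df, 3), round(d_e_given_Dfi, 3))
```

The three results come out near .56, .35 and .28, against .57 printed for the first and .35 and .28 quoted for the second and third in the continuation of the note; the small difference in the first is within what two-figure rounding of the inputs would produce.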
Next proceeding to allow for two factors we find

$${}_{fD}r_{di} = .23, \quad {}_{fD}r_{ei} = .44, \quad {}_{fD}r_{de} = .35,$$

the latter result shows that general deathrate and illiteracy are about equally influential on the relation of employment of women to infantile mortality.

Finally I corrected for all three factors and found: ${}_{ifD}r_{de} = .28$, or 60% of the crude correlation $r_{de} = .68$ is due to women being most employed in towns where the general deathrate is high, where illiterates are frequent and the population is largely foreign-born. How much further the relationship would be reduced, could we equalise other features of these Massachusetts cities, it is not possible to predict. The examination of the individuals in one city appears to me to be the only satisfactory method of disentangling the numerous factors which influence infant mortality.

We commend, however, the study of the first part of this Report, as it deals very clearly with the difficulties which arise, and will counteract the tendency, which is prevalent, to assert causation whenever association is observed. The author lays stress on avoiding such logical confusions.

Part II of the Report deals with infant mortality and its relation to the employment of mothers in Fall River, Massachusetts. In 1908 the attempt was made to visit the homes of each of the mothers of the 859 infants who died during the year and to ascertain details concerning her occupation, etc. In 279 cases the family could not be traced. In 266 cases prior to the birth of the child the mother was at work outside the home while in 314 cases the mother’s work was limited to household duties or other work carried on entirely at home.

Thus only the cases of deaths are dealt with and the causes of death are compared in the two groups of cases (1) when the mother was at work outside the home prior to the birth of the child and (2) when the mother’s work was carried on entirely in the home.
I hold that this method will never prove as satisfactory as that employed in districts in England; in England certain districts are chosen and every baby within that area is visited and the deathrate per number born in one group can be compared with another and the circumstances surrounding those babies who survive and those who die in the first year of life in a given district can be analysed. I do not think that the fact that a rather higher percentage of all deaths from gastritis etc. in Fall River occur when the mother works away from home and a rather higher percentage from congenital debility at birth when the mother does not work away from home will help us much in discovering the influence of the employment of the mother on infant mortality, nor do I think it will throw much light on the question of stillbirths with which the Report also deals.

It is found that there are no more stillbirths proportional to all deaths when the mother is industrially employed, but it seems to me that this tells us nothing about the number of stillbirths proportional to all births. The real question is whether mothers employed away from home in factory or workshop, whose other circumstances are the same, lose more children in the first year of life or have more children stillborn than the mothers who are only employed in their homes and I do not think a comparison of causes of death will lead us much further, and I think it may lead to difficulties.

When dealing with the mother’s work after childbirth in relation to the causes of infant mortality it is pointed out that the smaller percentage of deaths from congenital disease among the children of mothers who returned to work after childbirth was owing to the fact that most of the children dying from this group of causes died in the early weeks of life before the mother returned to work. For this same reason the number of deaths from gastritis etc. of children whose mothers returned to work is exaggerated, for we are missing out a whole series of illnesses which have ceased to add to the child deathrate by the time the mother returns to work and we must increase in this way the percentage of deaths of any disease of the later months of a child’s first year of life.

It seems to me that a comparison of deaths in this way will really give very little information; an excess of deaths from one disease means a defect in some other disease; it is shown that when the baby is nursed exclusively by the mother 26.0 per cent. of the deaths were from diarrhoea, gastritis, etc.; when partly nursed the percentage was 52.3 and when artificial food was exclusively employed the percentage of deaths from diarrhoea etc. was 42.9; the baby certainly dies less from gastritis when it is breast fed but it dies in greater numbers from other causes. Here again there is a difficulty; deaths from congenital diseases fall on the first weeks of life when breast feeding is the rule, while deaths from gastritis etc. fall on the later months of child life when “partial breast feeding” has become more common and I do not think it is possible to draw any conclusions from a comparison of deaths from one disease to deaths from all diseases as to the importance of artificial feeding in relation to deaths from gastritis.

Interesting information is given as to the reasons for artificial feeding; the numbers are not large enough to justify any definite conclusions, but this is such an important part of any inquiry into the influence of artificial feeding on the infant deathrate that one welcomes its inclusion in a report of this kind.

We have been requested by Professor F. M. Urban to insert the accompanying announcement.

ANNOUNCEMENT.

A prize of One Hundred Dollars ($100.00) is offered for the best paper on the Availability of Pearson’s Formulae for Psychophysics.
The rules for the solution of this problem have been formulated in general terms by William Brown. It is now required (1) to make their formulation specific, and (2) to show how they work out in actual practice. This means that the writer must show the steps to be taken, in the treatment of a complete set of data (Vollreihe), for the attainment in every case of a definite result. The calculations should be arranged with a view to practical application, i.e. so that the amount of computation is reduced to a minimum. If the labour of computation can be reduced by new tables, this fact should be pointed out.

The paper must contain samples of numerical calculation, but it is not necessary that the writer have experimental data of his own. In default of new data, those of F. M. Urban’s experiments on lifted weights (all seven observers) or those of H. Keller’s acoumetrical experiments (all results of one observer in both time-orders) are to be used.

Papers in competition for this Prize will be received, not later than December 31st, 1914, by Professor E. B. Titchener, Cornell Heights, Ithaca, N.Y., U.S.A. Such papers are to be marked only with a motto, and are to be accompanied by a sealed envelope, marked with the same motto, and containing the name and address of the writer. The Prize will be awarded by a committee consisting of Professors William Brown, E. B. Titchener and F. M. Urban.

The committee will make known the name of the successful competitor on July 1, 1915. The unsuccessful papers, with the corresponding envelopes, will be destroyed (unless called for by their authors) six months after the publication of the award.

Corrigendum.

Dr Derry has most kindly pointed out a slip on p. 307, Vol. VIII; the value of 100 (B − H)/L for Congo female crania is +1.9 and not −1.9, which brings these crania nearer to their proper place, and the remarks on this point p. 308 should accordingly be cancelled.
BIOMETRIKA. Vol. X. Parts II and III. November, 1914. [Issued December 3, 1914]
VOLUME X NOVEMBER, 1914 Nos. 2 AND 3

A PIEBALD FAMILY.

By E. A. COCKAYNE, M.D., M.R.C.P.

In spite of the great interest which they have always excited, well authenticated examples of piebalds in the dark races have been found to be rare. In the white races they are much less conspicuous, in part owing to the presence of clothing and in part owing to the lack of contrast between the pigmented and unpigmented skin, but the likelihood of their coming under the notice of a skilled observer is much greater. The scarcity of records shows that piebalds in the white races also must be very uncommon.

Last year I met with a case in a baby, and found that the child belonged to a family, many of whose members showed a similar defect of pigmentation. The family, belonging to a farming stock, originally came from the neighbourhood of Bury St Edmunds in Suffolk, and the anomaly is known to have descended directly through six generations. The oldest member with whom I have talked is fairly certain that it was present in one generation at least before this.

Of the first two generations in the pedigree (see Plate XI), I could obtain no definite information except the statement as to the existence of the piebaldism in I. 1 and II. 1, but of the third, III.
2 is said to have had a frontal blaze of white hair and white skin on the neck and forearms, which was very conspicuous owing to its marked contrast with the neighbouring weather-stained normal skin. III. 4 appears to have been the only member of the family who showed a marked dislike to the condition and always wore a wig to hide the frontal blaze.

III. 2, whose family name was C——*, had fifteen children. The first, IV. 2, a male, with dark hair, married twice, and had eight normal children, five by the first wife and three by the second. The second child, IV. 5, was a piebald, with a large frontal blaze, white skin on the front of the neck and arms, and blue eyes. He transmitted the condition to all his three children. V. 3, the eldest boy, aged 22 and unmarried, possesses dark hair, with a V-shaped frontal blaze of white or cream coloured hair, the apex of the V commencing near the coronal suture and spreading out to a width of 3½ inches as it reaches the forehead. The eyebrows and some of the eyelashes are white. The next boy, V. 4, is aged 18. He has light hair and a very large blaze of unpigmented hair, which covers the whole of the top of the head. His eyebrows and eyelashes are white, and the eyes are blue. Both boys have white patches on the front of the neck and on the arms (see Plate XII).

* Names preserved in the confidential register of the Galton Laboratory.

Next in the fourth generation were twins, IV. 6 and 7, both piebalds. They were evidently not uniovular, because one had dark hair, and one light, and the white blazes were dissimilar in extent, but it is uncertain which had the larger. Both died at an early age. The next, IV. 8, a girl, was normal with dark hair and eyes and remained unmarried. Next came a woman, IV. 10, who was a piebald with a large frontal blaze, white eyebrows and eyelashes, and white skin on the front of the neck and forearms.
The right eye was blue, and the left brown (see Plate XIII (B)). Her child, aged 13, is quite normal with light hair and dark eyes.

The next child, IV. 12, Mrs W——, has a large frontal blaze and dark brown irides. There is a large irregular patch of white skin extending from just below the chin to the heads of the clavicles, and round it the skin appears to be more deeply pigmented than the rest of the skin of the neck. There are a few small islands of pigmented skin near the edge of the unpigmented area. The skin of the anterior aspect of the forearms is unpigmented from the elbows to the wrists, and here also there are some small islands of pigmented skin in marked contrast to the unpigmented area in which they lie (see Plate XIV). The first two children of this individual were daughters, V. 8 and V. 9, both piebald, the third a normal son, V. 10, and then three more piebald daughters, V. 11–13.

The first of the daughters, Mrs G——, V. 8 (see Plate XIV), is very fair with a very large frontal blaze covering the whole of the top of the head, and her eyebrows and eyelashes are white. Her normal hair has pale creamy diffused pigment and, according to the individual hair, some to a decided number of granules*. The hair of the blaze has no diffused pigment and no granules. The irides are light brown, but the outer segments on both sides are paler and greenish in colour. The skin of the forehead and base of the nose is very pale in colour. She has a large white patch on the skin of the front of the neck, beginning just below the chin and widening out so as to embrace that over the inner ends of both clavicles. As in her mother there appears to be some concentration of pigment round this white area, and there are small isolated areas of pigmented skin near its edge. She has unpigmented skin on the anterior aspect of both forearms. Of her two children the first, VI. 1, a boy aged 8, is normal, the second, VI. 2, a boy aged 1½, is a piebald (see Plate XIV).
This child, VI. 2, was nine months old when first seen. He had a very large frontal blaze, resembling that of his mother and covering all the top of the head; the eyebrows and eyelashes were white with the exception of some of the outer hairs. Hair, pale cream in colour, said to be from the light area, has pale creamy diffused pigment and some granules (β), the granules being very small. It was obvious, even at this age, that heterochromia iridis was present. The right iris was pale except for a sector of dark grey occupying the upper and outer quadrant, the left iris was entirely dark grey. No difference in colour of the skin of the neck or forearm could be made out.

* β to γ on the Galton Laboratory scale of granular pigmentation.

When the baby was seen after the summer of 1913, the grey portions of the irides were becoming brown, the pale portion was still light blue. The face and arms were sunburnt, and it was noticed that the forehead was paler than the rest of the face. There was a pale area on the front of the neck, and the whole anterior surfaces of the forearms were white, the edges being very irregular in contour. There was also a white streak running obliquely right across the posterior or extensor aspect of the left forearm, and this offered a marked contrast with the rest of the surface, which was very brown. When the sunburn had died away the difference between the pigmented and unpigmented skin could no longer be made out.

IV. 12's second daughter, V. 9, aged 23, has only a small cream coloured frontal blaze, and the rest of her hair is light brown (see Plates XV and XVI). The eyebrows are composed of an even mixture of brown and white hairs, and the eyelashes are similar, with brown and white hairs alternating. The irides are grey and uniformly pigmented. There is a large irregular area of white skin at the base of the neck.
The whole of the anterior aspect of the right forearm is unpigmented, and there are similar small areas scattered over the posterior aspect (see Plate XVII). The left forearm is white only on the anterior aspect.

The next girl, V. 11, is aged 9. She has a very small frontal blaze, but the skin of the forehead is pale (see Plate XV). The eyebrows show a division into two parts, on the inner halves grow white hairs only, and on the outer brown hairs. The eyelashes on the contrary consist of alternate brown and white hairs. The irides are grey and uniformly coloured (see Plate XVIII). There is only a small white area in the middle of the front of the neck, but there are well differentiated white areas on the anterior aspects of both forearms (see Plate XIII (A)), and on the inner aspects of both upper arms. Her hair was examined and the first sample showed very pale diffused pigment and some granules (β). Two more samples were then examined, one from the blaze and one from the neighbouring part of the scalp. The first showed no diffused pigment and no granules, the second showed the majority of hairs with yellow-brown diffused pigment and a decided number to plenty of small granules (γ–δ), but a few had no diffused pigment and no granules.

The next piebald child, V. 12, died young. She had a frontal blaze and blue irides. Some of her hair showed very pale diffused pigment, and some granules (β). The next child, V. 13, also died young. She was a piebald nearer to the classical type than any of the others. She had a large frontal blaze, white skin on the forehead, and large areas of white skin on the front of the neck and chest, and in addition a very extensive area on the abdomen.

Of the fourth generation the next child, IV. 13, was a male with dark hair and eyes, who had 5 normal children; the next, IV. 15, had fair hair and died young. Twins, IV. 16 and 17, came next and died in infancy*.
They were heterogeneous, a dark-haired boy and a light-haired girl. A girl, IV. 19, was born next and she had twin sons, V. 15 and 16, who were also normal. The last three children, IV. 20–22, a girl, a boy, and one whose sex I am unable to ascertain, were all normally pigmented and all died at a very early age.

* The tendency to twin in this family is worth noting.

The pedigree confirms the strongly hereditary nature of piebaldism, and in this as in other published cases the character can affect either sex, but has only been transmitted by those affected. Unless we are to assume that in the case of such a rare anomaly as piebaldism, I. 2, II. 2 or III. 1 were really unnoticed piebalds, then III. 2 could only be heterozygous, or since piebaldism is dominant a (DR). We must take IV. 4, IV. 9, IV. 11 and V. 7 for pure recessives (RR). Thus the number of piebalds in the five sibships of generations IV, V and VI should be one quarter, i.e. ¼(15 + 3 + 1 + 6 + 2) = 7 nearly. We have actually 14 out of 27, thus piebaldism does not seem to act numerically as a pure dominant.

The areas of unpigmented skin are less than in the classical piebalds, but it is probable that in some, at least, they are larger and more numerous than I have stated. On the covered parts of the body and legs, which I was unable to examine except in the baby, they would not be very noticeable. It was not until I had noticed the white skin on the neck and arms of one of them that I was told anything about the existence of similar patches on the others. If true, it is remarkable that none have had white patches on the legs.

With regard to the local distribution of the pigment, there appears to be an excess at the edges of some of the unpigmented areas, as has been noted in other cases. In the case of other pale areas, the demarcation between them and the normal skin is very slight, and is probably due to the fact that they are not wholly unpigmented.
This remark applies especially to the forehead, which in some of them looks paler than natural, but not wholly devoid of pigment. In some the eyelashes are alternately white and brown, and in others the eyebrows are similar, and in one at least hairs growing on the scalp near the blaze are in some instances entirely without either diffused or granular pigment. This suggests that the skin beneath may show a deficient and irregular distribution of pigment.

The most interesting feature is the occurrence in three members of the family of well-marked heterochromia iridis, a character which has been met with in members of a piebald family, but always independently of their piebaldism, never, as in this case, in true association with it. It proves conclusively that these cases are not congenital leucoderma.

There seems to be no association of piebaldism and general lack of pigmentation of hair and irides. Affected and unaffected members have been both fair and dark, but the fairest piebalds seem to have the most extensive frontal blazes.

In the cases photographed the individuals were blonds and there has been great difficulty in getting a good photographic contrast of differences of pigmentation very noticeable in the living subject.

Plate XI. Biometrika, Vol. X, Parts II and III. Pedigree of Piebald Family. Key: Piebald; Normal; Heterochromia iridis; Died young.

Plate XII. V. 3 and V. 4 as children showing their marked V-shaped frontal blazes.

Plate XIII. (A) V. 11, showing a well differentiated white area on the anterior aspect of the left arm. (B) IV. 10; the blaze may be easily seen, but the heterochromia iridis is distinguishable only with difficulty under a lens.

Plate XIV. Piebaldism in three generations: IV. 12, grandmother; V. 8, mother; and VI. 2, grandson. The white frontal blaze is visible in all three, and the darker section of the right eye of VI. 2 with the use of a lens.

Plate XV. Two sisters, V. 9 and V. 11, with white frontal blazes.

Plate XVI. V. 9. Showing white forelock or blaze.

Plate XVII. Right forearm of V. 9 showing white patches on posterior aspect. The photograph is untouched and it is difficult to bring out by photography the grades of pigmentation when the arm is untanned by the sun, although they are quite clear on actual inspection.

Plate XVIII. Large photograph of V. 11 to show paleness of forehead and white hairs on inner half of eyebrows.

CLYPEAL MARKINGS OF QUEENS, DRONES AND WORKERS OF VESPA VULGARIS.

By OSWALD H. LATTER, M.A.

Upon the front of the head of Vespa vulgaris certain yellow markings stand out conspicuously upon the otherwise black surface. Below the three ocelli and between the upper portions of the two compound eyes there is a median four-sided yellow patch, the “corona”; to the right and left of this, separated from it by a fairly wide interval, and occupying the bay of each of the compound eyes is a pair of elongated yellow blotches; while straight below the corona and between the lower portions of the compound eyes is a very conspicuous yellow area which extends over the clypeus and down to the labrum or upper lip which lies between the two mandibles. This clypeal patch of yellow bears upon it a black mark which is subject to considerable variation. I distinguish in the queens and workers five chief types of this black mark: see diagrams on p. 202.
In Type I a broad vertical black band extends right through the yellow patch from the top to the bottom; a little below its middle the band bears to right and left a pair of bluntly pointed and slightly upturned arms: the portion of the median band below these arms is somewhat narrower than that above. Type II is derived from I by suppression of the black portion below the transverse arms. In Type III the extent of the black colouring is yet further reduced by the absence of the upper half (or thereabouts) of the vertical band. In Type IV the lower part of the vertical band re-appears, but the width of all the components is very much less than in any of the preceding types. In Type V the component parts of the black marking cease to be in contact; the upper portion of the vertical band is interrupted by a broad belt of yellow; the two “arms” are separated from the lower part of what remains and from one another; while there is no black at all below these remnants of the “arms”, a feature recalling Types II and III. Types IV and V are however represented only by single individuals in the series examined.

Between these main types certain intermediates occur. Thus some individuals have the black piece below the “arms” very narrow, approximating therefore to II, but conforming to I if we take extension of the black right through the yellow as the criterion of I; such individuals are distinguished as I + II. Others again conform to II but possess a slightly darker stain on the yellow in the line where the distinctive lower black portion of I might occur; these are called II + I. Similarly, intermediates between II and III are recognisable: in II + III the top of the upper portion of the vertical black band is very narrow; while in III + II there is a mere stain on the yellow of this region.
A single instance occurs of an intermediate between I and III (I + III), where the vertical black band extends right through the yellow, but is much narrowed at its upper extremity.

[Figure: Front of head of V. vulgaris, with the ocelli, eye, and yellow patch in bay of eye labelled, followed by drawings of the clypeus alone for Types I to IX. All parts left white are actually yellow. Drawn by K. W. Merrylies.]

My first examination consisted of about 200 tubes containing queens of Vespa vulgaris from different nests. In the case of some queens the heads were missing, and in the course of transit the contents of some of the tubes had got loose in the jar. I have numbered these 199 to 208. The results are given in Table I (p. 204) and the summary below:

Type I: Pure, …; I + II, 11; I + III, 1; total 62.
Type II: II + I, …; Pure, 94; II + III, 4; total ….
Type III: III + II, …; Pure, 3; III + IV, 0.
Type IV: IV + III, 0; Pure, 1; IV + V, 0.
Type V: V + IV, …; Pure, 1.
Total: 185

It will be seen that transitional cases undoubtedly occur. The bulk of the queens, however, fall into Types I and II, or queens are very little variable.

To test: (i) whether this variability was still further lessened by taking only the queens from a single nest, and (ii) the relative variability of queens, drones and workers, I now examined all the queens, workers and drones of a single nest of V. vulgaris. In this case all the 127 queens were of Type II*.

The classes of the workers are given in Table II (p. 205) and may be summarised as follows:

Type I: Pure, 5; I v. s. or (I + II)?, 5; total 10.
Type II: II + I, 6; Pure, ….
Total: 172

It will be seen that they are somewhat more variable than the queens of the same nest, but not so variable as queens from different nests.

I now turn to the drones of this same nest. I had 150 at my disposal. The drones exhibit a very wide range of facial markings.
In the material examined comparatively few fall into the scheme of classification adopted for the queens and workers, and it thus becomes necessary to resort to six types of face which appear to be peculiar to the male sex. These are numbered VI, * There were 129 queens in this nest, but No. 34 was missing and No. 98 had its head damaged too badly for classing. 204 Clypeal Markings of Vespa vulgaris VIG VID, VII, VII (4 VIID, VIII and IX, see diagrams, p. 202. In Type VI there are two somewhat elongated black dots upon the yellow clypeus, one being sub-central, the other on the ventral margin; in VI (4 VII) the ventral dot is longer dorso-ventrally and a third dot appears upon the left side (right side, in figure seen from in front) opposite the gap between the two previous dots; TABLE I. Types of Clypeal Markings in V. vulgaris Queens. No. | No. | No No. No. | | | | | | oe = 44 III 87 | 1+II | 180 = 173 Ill hee II Dials 88 II 131 II 174, | Tees | 3 | Il4I 46 II 89 ae a 132 II 175 II | 4 1 47 = 90 alge 133 Il WG ho WIESE ae ig alle AS. 
Wise Oe L 134 — 177 Tore 6 at AQ) 3) > Sel 92 et 135 I Tey) ~ UE re |) IE BORE in ale 93) LEE in| 36 II 179 esi 3 ail 51 if 949 i 137 II 180 II+I 9 | II o2 II 955) ILE 138 I 181 II 10 || I 5a el 96.09) 139 II 182 = 11 Il+I 54a eee 97 I 140 I 183 | I1+II1 12 II+1 ii) 10 98 dee 141 I 184 = Lea 9 al 56 md 99 |II+IIL} 142 Teen 185 I 14 oe 57 IL 100 I 143 I 186 Il 15 II 58 pa 101 ats 144 Il 187 =|) St 16 II 59 I 102 Il 145 II 188 | I 7 A eaeale 60) meet 103 I 146 I 189 I+II 18 Il 61 IL 104 IU 147 II 190 II 19 | II+I 62 II 105 Il 148 =e 191 I 20) ae 63 II 106 a 149 i 192 I HS ee Al 64 | II 107 I 150 Il 193 III 22a mel (ij) = TU 108 Il 151 II 193.4 II 937 | 21 66 II 109 I 152 = 194 = OAgea eles 67 = TOKO} 7 |) S50 153 II 195 | I+III 25 IL+I Con ay tll 111 | IE+T 154 1H 196 II 26 I 69 eet 112 Il sys | AU 19 jee 27 Il 70 113 I 156 Il 198 IIl+I 28 | I a Il 114 II 157 Til 29 TST 72 ] 115 |II+I0T} 158 IIl+I = - 30 I a eal 116 II 159 Il 31 II Mee letsel Has IU 160 I Loose | 32 II Aven AT 118 I 161 I 33 V 75 | II ie) |). HE 162 Il 199 I 34 Tal M6 | SUL 120 |II+III} 163 I 200 I Si alipageeee i, ae 191. || 7 ale 164 II 201 I 36 iat AGH), eae 129° | 165 II HO i) ULE 37 I oe ee 122 -- NGS | IM 2O3ke ee lal 38 II 80 ee 124. I Gy | 7 204 Il 39 Il+I] Se eee 125 II 16S 205 II 40 ete 82 - 126 elie es 169 |) »— 206 Il Alton lee 83 I NO Wel 70a) eae 207 Ll AN Oyl ad 845 i 128 | II 171 I 208s | 42 Il 85 128b Il 172 = 209) | sri | | 43 I 86 | TSU 129 SHI | | | O. H. Larrer 205 TABLE ILI. Types of Clypeal Marking of Workers of a single Nest of V. vulgaris. No. No. No. No. No. 1 — 37 II 74 II 112 II 150 II - 2 — 38 IT 75 II 113 II 151 II+I1 3 — 39 II 76 II 114 II 152 II 4 _- 40 II 77 II 115 II 153 IT 5 — 41 II 78 — 116 II 154 IT 6 — 42 II 79 I 117 II 155 II 7 — 43 II 80 II 118 II 156 II | 8 —- 44 II 81 II 119 II 157 Il | 9 II 45 |Iv.v.s. 82 II 120 II 158 II 10 II [1+11] 83 II 121 II 159 | II 11 II 46.) I 84 II 122 II 160 II 12 Iv. s. 47 II 85 it 123 II 161 II [I+11] ] 48 I 86.) 
1d 124 I 162 | II 13 II 49 —— 87 II 125 II 163 II 14 II 50 IT 88 II 126 II 164 I+! 15 I 51 II 89 II 127 II+1 165 II 16 II 52 II 90 II 128 II 166 II 17 II 53 II 91 II 129 Il 167 Il 18 IT 54 II 92 II 130 II 168 IT 19 II 55 II 93 II 131 II 169 II 20 II 56 II 94 Il 132 II 170 II 21 II 57 IT 95 IT 133 II 171 II 22 IT 58 Iv. s. 96 II 134 II 172 II 93. | Tf Aeim o7 fo = | 135 | mm} ia | 1 24 II 59 II 98 II 136 II+I1 174 II 25 I 60 II 99 II 137 II 175s I 26 II 61 II 100 II 138 II 176¢" If 27 II 62 lil 101 II 139 1H ied, II 28 II 63 II 102 II 140 II 178 II 29 II 64 II 103 IT 141 II 179 II 30 Iv.s. 65 I 104 II 142 Il 180 Il f[+it)| 66 -| IL 105 | II ea iii [er 31 II 67 II 106 II 144 II 182 II 32 II+I 68 II 107 II 145 II 183 II 33 II 69 I 108 II 146 II 184 II 34 — 70 II 109 II+I 147 II 185 — 35 IAs al II 110 II 148 II 186 — (I+1I] } 72 II 111 II 149 IT 187 II 36 II 73 II in VII the two median dots are united by a slender black line, and there is a pair of lateral dots, right and left; in VII (4 VIII) the median line is of uniform width, extending from about the centre to the lower margin, and to its left side there is a single dot; in VIII the median line alone is visible, both lateral dots having disappeared; while in IX there is no continuous median line, but merely two black spots, one at the extreme dorsal and the other at the extreme ventral side of the clypeus. It will be noticed that Types VII—VIII approximate to Type IV in so far as the black stripe begins at about the middle of the clypeus and extends right down to the ventral margin. Biometrika x 27 206 Clypeal Markings of Vespa vulgaris The data are given in Table III (p. 207) and are summarised below: very narrow 6 ape a A VII) if f Type II no horns 1, very narrow 1 2 Type III : : : : : 0 Type IV : : : : : 0 Type V ; : : ; 0 Type VI Pure oi 60 3 Vale vali 1 Type VII 5 : : . 
…
Type VIII: (VIII near VI), 8; (VIII + ½ VII), 2; Pure, 48; total 58.
Type IX: 2.
Total: 151

It will be realised at once how far more variable the drones, of even one nest, are than the workers or queens for this character. But their variability is rather of a negative than a positive character, appearing to consist in more or less extensive absence of the fuller markings of queen and worker.

The results here deduced for variability of non-measured characters do not wholly agree with those found by Wright, Lee and Pearson on the wing measurements of the same nest of V. vulgaris. They found that for absolute measurements the variability as determined by the coefficient of variation was in every case such that the worker was more variable than the drone and the drone than the queen. On the other hand they found when they dealt with indices that the drone for wing measurements was slightly more variable than the worker and the queen less variable than either*. Possibly the divergence apparent here may be explicable in the sense of the drone's variability lying in the present case in an absence of marking rather than in any positive variation. The drone's variation is about a centre of much diminished marking. If we could measure the variation in the total area of marking in queen and worker we might find it as great as the variations in the smaller markings of the drone.

It would be of much interest to investigate a series of drones from different nests. It is clear that the clypeal markings form a secondary sexual character and they would probably provide classifications for hereditary purposes.

* Biometrika, Vol. v. pp. 414 and 421.

TABLE III.

Types of Clypeal Markings in Drones of a single Nest of V. vulgaris.

No. No. No. No. No.
| 1 VIII 33 VII 63 VII 93 VI 123 VIII near VI} 34 VI 64 Wi 94 VI 124 I 2 VI 35 VIII 65 VII 95 VI narrow | 3 VIII 36 VI 66 VIII 96 VII 125 VI 4 VIII 37 VI 67 I 97 il 126 VIII 5 VI 38 VII very very 127 VII | 6 VI 39 VI narrow | narrow | 128 VII | 7 VIII 40 VII 68 VI 98 | VIII 129 VIII 8 VIII 4] VI 69 VIII 99 VIII 130 VIII 9 VIII 42 VI 70 VI 100 VI 131 VI | 10 IX 43 VIII Hil II 101 | VI gy VIII | 11 VI 44 VI but no 102 AVIA 133} VI 12 VIII 45 VIII horns 103 VIII 134 VI 13 VI 46 VIII 72 VIII near VI} 135 VI 14 Vall 47 VI Wp} VI 104 VI 136 VIII 15 VIII 48 VIII 74 VIII 105 VIII 137 VI near VI} 49 VII 75 VIII 106 VII 138 I 16 VIII 50 VIII 76 VIII 107 VIII dots of 17 Vi 51 VII 77 VI 108 VIII VII 18 VI 52 VII 78 VI 109 VI 139 VII 19 VIII 53 VIII 79 I 110 VI 140 VI 20 VII 54 VI narrow Wadi WAL 141 VI 21 VII 55 I 80 VI 12 VIII 142 VIII 22 VIII very 81 VI 13 VI 143 VIII 23 VIII narrow | 82 VIII 114 VIII near VL near VI} 56 IX near VI} 115 VI 144 VI | 24 VII 57 I 83 VIII 116 VI 145 VIII | 25 VII very 84 VII 117 II near VI 26 VI narrow | 85 VI very 146 VIII De VI 58 VIII 86 VI narrow 147 VI 28 VI near VI} 87 VIII 118 == 148 VIII 4 VII 59 VIII 88 VII 119 VII 149 Vali 29 VIII 60 VIII 89 VI 120s ee Vall 150 VI 30 VIII 61 VIII 90 VIII 121 VI 151 VI 31 VI 62 VIII 91 VII 122 VI 152 VI 32 VIII iVII 92 VII 4 VII 27—2 “ar TABLE OF THE GAUSSIAN “TAIL” FUNCTIONS; WHEN THE “TAIL” IS LARGER THAN THE BODY. By ALICE LEE, DSc. In a paper published in Biometrika, Vol. v1. pp. 59—68, tables for the’ in- complete normal moment functions were printed, and they have since been reproduced in Tables for Statisticians and Biometricians recently issued from the Cambridge University Press. From these tables values of the Gaussian “Tail” functions were deduced and a short table of yy, and yy, appeared in Biometrika, Vol. vi. p. 68. 
The value of these functions being demonstrated in practice during the last few years, a more complete table of $\psi_1$, $\psi_2$, $\psi_3$ has appeared in the Tables for Statisticians and Biometricians. In the introduction to those tables, however, Professor Pearson indicated that it was important to have a similar table when the “tail” forms more than half the entire curve, and gave the fundamental formulae for obtaining the numerical values of the functions. The present table has been calculated to supply the want thus indicated.

[Figure: a Gaussian curve truncated at the ordinate AB, with the ordinate GH drawn through the mean of the truncated portion and the points O, B and H marked on the base line.]

Let the figure represent a Gaussian curve of total population $N$ and standard deviation $\sigma$. Let AB be the ordinate at which it is truncated and let OB $= h = h' \times \sigma$. Let GH be the ordinate through the mean G of the truncated portion and BH $= d$, the distance of the mean from the line of truncation, let $\Sigma$ be the standard deviation of the truncated portion about GH, and $n$ = the area of the truncated portion, or of the population observed. Then if any material be supposed to form a truncated portion of a normal curve, $d$, $n$ and $\Sigma$ can be found (see Tables, pp. xxvii and 25). We have

$\psi_1 = \Sigma^2 / d^2$ ........ (i),
$\psi_2 = \sigma / d$ ........ (ii),
$\psi_3 = N / n$ ........ (iii).

These are tabled for each value of $h'$, at first proceeding by .01 and then by .10 as unit. Now $\psi_1$ being known we find $h'$ from the table, and hence deduce $\psi_2$ and $\psi_3$. $\psi_2$ gives us the value of $\sigma$ from known $d$. Hence $h = h' \times \sigma$ can be found, lastly (iii) gives us the total population from which $n$ is drawn. Thus the constants $N$, $\sigma$ and $h$ which fix the total Gaussian are determined.

It will be sufficient to illustrate the method of using the tables on certain data as to the English thigh-bone, recently published by Parsons*. Dwight† has adopted a method of sexing human femora on the basis of a markedly bimodal distribution obtained by him for American bones.
He terms female any femur with diameter of head less than 45 mm., and male any femur with diameter of head over 47 mm. Parsons follows this rule and sexes by other points femora with heads from 45 to 47. As unsettled remainder he has 20 femora of 45 mm. and he gives 12 to ♀ and 8 to ♂; of 46 mm. and 47 mm. he has 41 femora and he gives 4 to ♀ and 37 to ♂. As a result of this process he obtains a female frequency curve which rises very abruptly at high values of the diameter, and a male frequency curve which rises very abruptly for low values of the femora. But, if there really be any marked skewness in frequency of the parts of the human skeleton, which is very unusual, we should anticipate that it would be of the same sense. Parsons' distributions are as follows (loc. cit. p. 256):

Diameter in mm.   36  37  38  39  40  41  42  43  44  45  46  47  48  …
♀                  1   1   -   3   8  14  12  18  12  12   3   -   1
♂                                                       8   8  29   …

The ♀ 48 mm. femur according to the rule should have been treated as a male but presumably it had marked female characters. Were there no marked male characters in any bone below 45 mm.? It will be seen that there is a remarkable dip in the total material at 46 mm. which corresponds to Dwight's division. In material measured six years ago in the Biometric Laboratory‡, where every bone in a relatively large series was measured, no such dip occurs and there is in those data no justification for Dwight's method of sexing. The group of 29 ♂ bones at 47 mm. and the sudden cut off at 45 mm. seems to condemn this method of sexing, at any rate from the statistical standpoint.

* Journal of Anatomy and Physiology, Vol. xiv. pp. 238–267.
† American Journal of Anatomy, Vol. iv. p. 19.
‡ This material has been statistically reduced and will shortly be published.

Without arguing this point out here, we may illustrate the use of the Table (p.
214) of ψ's by taking two of Parsons' frequency distributions for females; we will cut them off at the points suggested, and then investigate the total populations of females which result. Our author pools for these distributions right and left bones. Taking the diameter of "head of femur" for the females, we have

Diameter in mm. | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44
Frequency       |  1 |  1 |  - |  3 |  8 | 14 | 12 | 18 | 12

These are exactly the bones the Dwight process gives as female. We find Σ² = 2.8851, d = 2.6159 (measured from 44.5). Hence ψ₁ = Σ²/d² = .4216. Whence by interpolation from the table h' = .782, ψ₂ = .864, ψ₃ = 1.278, leading to σ = 2.260, h = 1.767, and Mean = 42.73 mm., N = 88.2.

Parsons gives for R. femur, Mean = 43, L. femur, Mean = 42, and the total number of bones dealt with 55 + 48 = 103 (Tables, loc. cit. pp. 249-251). In his frequency distribution (p. 256) he only records 85 female bones, which give a mean of 42.54 and a standard deviation of 2.078 mm. These values are clearly not widely divergent from those we have found above by supposing all bones under 45 to be female. To test the matter further the 105* female bones of which the head was measured by Parsons were taken out. They provide the distribution:

Diameter in mm. | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | Total
Frequency       |  1 |  1 |  - |  4 | 10 | 18 |  … |  … |  … |  … |  … |  … |  … | 105

These give Mean = 42.47, S.D. = 1.996, N = 105.

* It is not possible to say whether he has omitted two queried measurements. He has not omitted bones he queries in breadth of lower articulation.

Cutting off all bones over 45.5 we find Σ² = 2.6392, d = 2.6264, leading to ψ₁ = .3826. Hence h' = .984, ψ₂ = .783 and ψ₃ = 1.195. These provide for the non-truncated population, Mean 42.48, S.D. = 2.056, N = 104, which are in still better agreement with Parsons' constants for the 105 bones than the constants for the 85 bones were for their series.
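The steps of the first worked example (the 69 bones under 45 mm.) can be retraced numerically as a check on the interpolation. The sketch below is not the computation used for the Table: it assumes the standard closed-form moments of a normal curve cut off at h' standard deviations beyond its mean, with the retained portion lying below the cut, and the function names (`tail_functions`, `solve_h_prime`, `fit_truncated`) are hypothetical. Solving for h' directly rather than interpolating in a three-figure table, it should give values close to, but not identical with, those quoted above.

```python
import math


def phi(z):
    """Ordinate of the standard normal curve."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def Phi(z):
    """Area of the standard normal curve below z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tail_functions(h_prime):
    """Return (psi1, psi2, psi3) = (Sigma^2/d^2, sigma/d, N/n).

    Assumption: the curve is cut at h_prime standard deviations above
    its mean and the portion below the cut is retained.
    """
    lam = phi(h_prime) / Phi(h_prime)
    d_over_sigma = h_prime + lam                      # d / sigma
    var_ratio = 1.0 - h_prime * lam - lam * lam       # Sigma^2 / sigma^2
    return var_ratio / d_over_sigma ** 2, 1.0 / d_over_sigma, 1.0 / Phi(h_prime)


def solve_h_prime(psi1, lo=0.0, hi=3.0, tol=1e-10):
    """Bisection for h' on the tabled range; psi1 decreases as h' grows."""
    if not tail_functions(hi)[0] <= psi1 <= tail_functions(lo)[0]:
        raise ValueError("psi1 lies outside the range covered by h' = 0 to 3")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if tail_functions(mid)[0] > psi1:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def fit_truncated(values, freqs, cut):
    """Recover the constants of the whole curve from the portion below `cut`."""
    n = sum(freqs)
    d = sum(f * (cut - x) for x, f in zip(values, freqs)) / n
    second = sum(f * (cut - x) ** 2 for x, f in zip(values, freqs)) / n
    big_sigma2 = second - d * d                       # Sigma^2 about the truncated mean
    psi1 = big_sigma2 / d ** 2
    h_prime = solve_h_prime(psi1)
    _, psi2, psi3 = tail_functions(h_prime)
    sigma = psi2 * d
    h = h_prime * sigma
    return {"n": n, "d": d, "Sigma2": big_sigma2, "psi1": psi1,
            "h_prime": h_prime, "sigma": sigma, "h": h,
            "mean": cut - h, "N": psi3 * n}


# Female diameters of head of femur, 36 to 44 mm., truncated at 44.5.
diameters = [36, 37, 38, 39, 40, 41, 42, 43, 44]
frequencies = [1, 1, 0, 3, 8, 14, 12, 18, 12]
fit = fit_truncated(diameters, frequencies, 44.5)
for name, value in fit.items():
    print(f"{name:>8}: {value:.4f}")
```

For a portion retained above the cut, as with the male bones treated later, the same functions apply if distances are measured upward from the cut and the mean is taken as `cut + h`.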
It would appear therefore that, if we suppose all bones under 45 female and use our Tables, we get results in reasonable accordance with Parsons', and possibly by a theoretically more justifiable method than endeavouring to sex the bones above 44 and below 48 from other characters.

We have considered from the same aspect the character breadth of lower articular end of femur. Parsons' distribution of 89 female femora is as follows (p. 257):

Breadth in mm. | 60 | 61 | 62 | 63 | 64 | 65 | 66 | 67 | 68 | 69 | 70 | 71 | 72 | 73 | 74
Frequency      |  … |  … |  … |  … |  … |  … |  … |  … |  … |  … |  … |  … |  … |  … |  …

In this Table he has only one bone in excess of the numbers on which he bases his means on pp. 250-1. If we truncate at 69.5, i.e. reject all bones over 69 mm., we find Σ² = 3.9803, d = 2.9058, and ψ₁ = .4714. Hence we deduce h' = .5295, ψ₂ = .977, ψ₃ = 1.426. These lead to h = 1.503, σ = 2.839, N = 98.4, Mean = 68.00 mm. The actual values given by Parsons' distribution above are σ = 2.571, N = 89, Mean = 67.54 mm. Thus the agreement is not nearly so good as for the diameter of the head of the femur, being about 10 per cent. wrong in σ and N. It should give as good a result if the method were quite satisfactory, for the bones have been sexed by the diameter of the head, and the limit 44 mm. for diameter of the head corresponds fairly closely to 69 mm. for the breadth of lower articulation.

As this paper is not intended as a discussion of Parsons' data, to which we hope again to return, we will only deal with one more illustration of the use of the Table. We take out from his Tables, pp. 244-248, the diameter of head of femur for 174 male bones.

Diameter of Head in mm. | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 52 | 53 | 54 | 55
Frequency               |  9 | 10 | 33 | 17 | 38 | 20 | 18 | 12 |  6 |  8 |  3

The constants of this distribution are Mean = 49.14, σ = 2.377, N = 174. Truncating at 47.5 we have Σ²
= 3.4341, d = 2.7869, whence ψ₁ = .4422, and from the Table h' = .679, ψ₂ = .908, ψ₃ = 1.331. These lead to h = 1.718 and Mean = 49.22, σ = 2.530, N = 162, i.e. to a "tail" of 40 not one of 52 below 47.5. Actually this tail distributes itself as follows:

                 | Under 45 | 45 | 46 | 47
Gaussian tail    |     5    |  6 | 12 | 17
Against Parsons' |     0    |  9 | 10 | 33

This confirms our previously expressed view that probably a considerable number of the bones classed as 47 mm. are really female femora, and that the male distribution runs considerably beyond 45 mm. into the range treated as purely female.

Finally let us try the result of pooling male and female bones and breaking up the composite frequency by the method of Phil. Trans. Vol. 185 A, p. 84. We have now 279 bones distributed as follows:

Diameter of Head in mm. | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 52 | 53 | 54 | 55 | Total
Frequency               |  1 |  1 |  - |  4 | 10 |  … |  … |  … |  … |  … | 13 | 33 | 18 | 38 | 20 | 18 | 12 |  6 |  8 |  3 | 279

The constants are Mean = 46.63, S.D. = 3.93, μ₂ = 15.4040, μ₃ = -6.7791, μ₄ = 541.5162, μ₅ = -5380.5339. The nonic is

$q_2^9 - 5.96189\,q_2^7 + .0689\,q_2^6 + 9.83579\,q_2^5 - 3.4275\,q_2^4 - 8.2041\,q_2^3 - .30209\,q_2^2 + .0144\,q_2 - .0097 = 0$,

giving the root q₂ = -.934 and p₂ = -9.34, and ultimately the two components:

                   | Male   | Female
Mean               | 49.83  | 43.72
Population         | 133.25 | 145.75
Standard Deviation | …      | 2.662
Max. Ordinate      | 23.83  | 21.84

While the means agree roughly with those obtained by Parsons' sexing (49 and 43), we see that this analysis much more nearly equalises the number of male and female bones, and indeed makes the female population rather larger than the male, while Parsons has 79 per cent. more males. The "truncated tail" method would probably give results in better accordance with the present had we not truncated at the quite arbitrary Dwight-Parsons' divisions.
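Returning to the male bones truncated at 47.5, the "Gaussian tail" row of the small comparison table can be recomputed from the fitted constants N = 162, Mean = 49.22, σ = 2.530. The following is a minimal sketch, assuming that each whole-millimetre group covers half a millimetre on either side of its nominal value; the function names are hypothetical and the code is an illustration, not the original computation. It should agree with the printed row to within about a unit per group.

```python
import math


def Phi(z):
    """Area of the standard normal curve below z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_counts(total, mean, sigma, edges):
    """Expected frequency between successive edges of a normal curve."""
    cdf = [Phi((edge - mean) / sigma) for edge in edges]
    return [total * (upper - lower) for lower, upper in zip(cdf, cdf[1:])]


# Groups: under 45, 45, 46 and 47 mm. (assumed limits at the half-millimetre).
edges = [-math.inf, 44.5, 45.5, 46.5, 47.5]
gaussian_tail = expected_counts(162.0, 49.22, 2.530, edges)
parsons = [0, 9, 10, 33]

for label, fitted, observed in zip(["under 45", "45", "46", "47"], gaussian_tail, parsons):
    print(f"{label:>8}: Gaussian {fitted:5.1f}   Parsons {observed:3d}")
print(f"{'total':>8}: Gaussian {sum(gaussian_tail):5.1f}   Parsons {sum(parsons):3d}")
```

The excess of observed over fitted frequency falls almost wholly in the 47 mm. group, which is the point made in the text.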
These examples may suffice to illustrate the application of the Tables to anthropometric measurements on man, where we can feel fairly confident that the material, if sufficient in quantity, would be adequately described by a Gaussian or normal distribution. Such cases may arise when material for the two sexes, or for two races, is commingled and we can be fairly certain that one or other or both "tails" of the material present homogeneous parts of the mixture. Another illustration drawn from Galton's data for American trotters will be found in the Tables for Statisticians, p. xxvi. The chief weakness of the method, besides the assumption of the Gaussian, often quite legitimate, is the absence as yet of the values of the probable errors, which values must be very considerable for slender material such as that used above.

Table of Gaussian "Tail" Functions, "Tail" larger than "Body."

[Table: ψ₁, ψ₂ and ψ₃, each with its first difference (-)Δ, tabled against h' from .00 to .10 by steps of .01 and from .1 to 3.0 by steps of .1. The entries are not legible in this copy apart from the last row: h' = 3.0, ψ₁ = .109, ψ₂ = .333, ψ₃ = 1.001.]

See Tables for Statisticians and Biometricians, Introduction, p. xxvii.

CONTRIBUTION TO A STATISTICAL STUDY OF THE CRUCIFERAE. VARIATION IN THE FLOWERS OF LEPIDIUM DRABA LINNAEUS.

By JAMES J. SIMPSON, M.A., D.Sc.

CONTENTS.

I. Introductory
II. Botanical
  1. Specific Characters
  2. Morphology of the Flower
  3. Conception of Chorisis
  4. Orientation of the Flower
III. Examination of the Data
  1. Classification
  2. Analysis
IV. Statistical
  1. Study of the Means and Standard Deviations
  2. Study of the Correlation Coefficients
V. Morphological Significance of the Statistical Results
VI. Variation in the Gynaecium
VII. Suggestions for Future Studies in this Plant

I. INTRODUCTORY*.

In the summer of 1905 Professor J. W. H.
Trail drew my attention to an extraordinary example of variation which occurred in the several organs of the flowers of Lepidium Draba Linnaeus. At that time I examined in detail 1832 individual flowers taken from a single plant growing in a piece of uncultivated ground in his garden, and the results of these observations form the basis of the present contribution to the study of the variation in the Cruciferae.

* I am pleased to have this opportunity of expressing my great indebtedness to Dr J. F. Tocher for invaluable assistance in the biometric part of this paper. The correlation and other constants were calculated in his laboratory, and without his assistance the publication of this paper would have been greatly delayed. I must also thank Professor Karl Pearson, in whose department in University College, London, the statistical study was originally undertaken, for reviewing this paper for publication and also for much kindly criticism and advice. To Professor Trail my thanks are also due for many botanical hints.

Botanical problems, which have been hitherto attacked from the biometric standpoint, have been comparatively easily handled, because the material has been more or less homogeneous in character. For example, variations in the number of sepals of Anemone nemorosa* or in the number of ray-florets of Chrysanthemum leucanthemum†, and the consequent distribution of these, are capable of direct treatment by Pearson's well-known method of fitting frequency curves. The only work comparable to the one in hand occurs in Biometrika, Vol. II. p. 145 (Variation and Correlation in the Lesser Celandine), but in this case the numbers of members in the calyx, corolla and androecium have been examined as a basis for a study of homotypic correlation and in this flower each of these organs consists of a single constituent with numerous members.
* Yule, Biometrika, Vol. I. p. 307.
† Biometrika, Vol. II. p. 309 et seq.

The problems studied in this paper, however, are more complex inasmuch as they deal not with one organ of the flower but with all the organs, their constituents and members both separately and collectively. It is also, I believe, the first biometric work of its kind on a cruciferous flower and embodies a study of chorisis, that is, "the splitting up or division of one or more components of a flower into two or more equal or unequal parts", a factor which is supposed to have been of the utmost importance in the evolution of the natural order Cruciferae. A complete discussion of this phenomenon is reserved until the flower is studied in detail.

It would be well here to emphasise the fact that the flowers examined for this study were not taken from different plants but, on the contrary, were obtained from several inflorescences growing on stems which had arisen from buds on the roots of a single parent plant. This mode of reproduction is rather unusual, but, in the present instance, is of particular interest inasmuch as it gives greater homogeneity to the material.

The parts of the flower which have been considered are (a) the perianth, which consists of (1) the calyx and (2) the corolla, (b) the androecium and (c) the gynaecium. The functional differentiation of these organs is of great importance in the interpretation of results so that it might be well to recall the particular rôles which these play in plant economics.

The gynaecium and the androecium are respectively the female and male organs of reproduction and consist of carpels and stamens, while the perianth forms a protective covering for these delicate structures. The calyx or outer organ of the perianth is concerned solely in the protection of the flower in the bud, but the corolla, in the open flower, also serves, along with the honey-secreting sacs at the base of the stamens, as an attraction for insects.
The characters which have been taken as a basis for this study are numerical, e.g. the number of petals in the corolla, but no measurable characters, e.g. the length and breadth of the petals, have been considered, although, as will be pointed out later in connection with possible future studies in this flower, these characters might also with advantage be taken.

The Cruciferae, as an order, are usually regarded by botanists as being very definite in type and no observations have been recorded to show to what extent, if any, deviation from the recognised botanical floral formula exists, so that the main object of this paper was to determine the frequency of the variability of the parts of the various organs and constituents, and also the degrees of correlation existing between the organs themselves.

The mode of observation is worthy of remark, however, as it might well be argued that if the flowers used for examination were fully "blown" deficiency in the number of parts might be due to post-developmental fracture, but in all the cases here recorded the observations were made on flowers in bud or only half open so that the influence of wind or other external agency is altogether discounted. The material was also examined microscopically in all cases so that there should be no possible doubt as to the exact origin of any member. The importance of this will be seen in the details of the analysis.

II. BOTANICAL.

1. Specific characters.

The generic and specific characters of Lepidium Draba may be obtained in any complete systematic botanical work so that it is unnecessary to repeat them here, but a few notes bearing especially on the study in hand may be of value. It is a perennial about a foot in height and is covered by a minute down from which its popular name, the hoary cress, is derived. The inflorescence is a raceme not much lengthened and so forms a broad, almost flat, corymb-like termination. The individual flowers are small, white and numerous.
The constituents of the calyx, namely the sepals, are green; they are short, nearly equal and bear no pouch at the base. The petals are small and white; they are equal in size, obovate, undivided and generally stalked. The stamens are six in number; the filament is simple, i.e. it bears no appendages, and is shorter than the petals; the anther consists of two roundish lobes. The pods are "broader than long"; they are compressed laterally at right angles to the narrow partition. The thick valves are boat-shaped and sharply keeled but not winged; each valve contains a single seed.

2. Morphology of the flower.

The typical flower consists of six whorls, made up in the following manner (Plate I, fig. 1):

(a) Calyx 2.
(b) Corolla 1.
(c) Androecium 2.
(d) Gynaecium 1.

(a) Calyx. This organ is composed of two whorls each consisting of two sepals. The outer pair arise at one level on opposite sides of the flower and are inserted on a slightly lower plane than the inner pair; they are parallel to the plane of compression of the gynaecium. The inner pair are also situated opposite one another but in a plane perpendicular to that of the outer pair; they are thus at right angles to the plane of compression of the gynaecium. These whorls are denoted on Plate I, fig. 1, by the Roman numerals I and II respectively.

(b) Corolla. This organ consists of four petals all inserted at one level and alternating with the position of the sepals; they thus constitute a single whorl. (See III, Plate I, fig. 1.)

(c) Androecium. Six stamens form the androecium; they arise at two different levels and thus constitute two separate whorls. The outer whorl, which is lower down, consists of two stamens which are shorter than the others and correspond in position to the inner sepals. The inner whorl consists of four stamens, arranged in pairs which correspond in position to the outer sepals. (A reference to the figure (Plate I, fig.
1) in which the two whorls are marked IV and V respectively will make this clear.)

(d) Gynaecium. This organ consists of two carpels forming the sixth or innermost whorl. (See VI, Plate I, fig. 1.)

It will be seen from the foregoing description that the order of the six whorls here detailed is that in which they would be found were we to strip the flower of its components at the different levels consecutively from below upwards. It is also the order in which we would find them, passing from the outside to the centre, were we to cut a transverse section through the flower. Another point, however, which is not so obvious but one which has special interest in our study, is the fact that this is also the order in time of development. The actual sequence in which these constituents of the flower appear in the bud is therefore:

I. Outer whorl of Calyx (Sepals).
II. Inner whorl of Calyx (Sepals).
III. Corolla (Petals).
IV. Outer whorl of Androecium (Stamens).
V. Inner whorl of Androecium (Stamens).
VI. Gynaecium (Carpels).

3. Conception of Chorisis.

Chorisis or reduplication is generally looked upon by botanists as a means of multiplication of the parts of a flower. It consists in the division or splitting of an organ in the course of its development by which two or more organs are produced in place of one.

Chorisis may take place in two ways: (1) transversely, when the increased parts are placed one before the other, that is, the resulting components are on the same radius; this is known as vertical, parallel or transverse chorisis; (2) collaterally, when the increased parts stand side by side, that is, on the same circumference.

Transverse chorisis is supposed to be of frequent occurrence; thus the petals of Lychnis and many other caryophyllaceous plants exhibit a small scale on the inner surface at the point where the limb of the petal is united to the claw.
The formation of these scales is supposed by many to be due to the chorisis or unlining of an inner portion of the petal from the outer.

Collateral chorisis is seen in different natural orders. In Streptanthus, in place of two stamens there is sometimes a single filament forked at the top and each division bears an anther. This is usually supposed to be due to collateral chorisis arrested in its progress. The flowers of the Fumitory are also generally considered to afford another example of this type of chorisis. In these we have two sepals, four petals in two rows and six stamens, two of which are perfect and four more or less imperfect. The latter are said to arise by collateral chorisis, one stamen being divided into three parts. Collateral chorisis may be compared, according to Bentley, to a compound leaf which is composed of two or more distinct and similar parts.

Let us now consider chorisis in its bearing on the flower under consideration. In the description of the morphology of the flower we noted that in the inner whorl of the androecium there were four stamens arranged in pairs while in the outer whorl there were only two stamens situated singly. Various opinions have from time to time been advanced to explain this anomalous structure so that it might be well to review these briefly.

Of the androecium of the Cruciferae Oliver says: "The two pairs of long stamens are generally thought to be due to chorisis or the division in the course of development of single antero-posterior stamens. Others have thought that the six glands represent abortive stamens and that these with the six stamens make up a normal series of twelve in three whorls."

De Candolle held the view that the stamens formed a single, originally tetramerous whorl alternating with the petals in which the median members, i.e. the anterior and posterior, were cleft (chorised) in two.
Since however the lateral stamens are inserted lower down than the median stamens and are also, as already pointed out, formed earlier in the bud, this view is clearly untenable. Two whorls must be taken into consideration owing to the difference in the levels of insertion, the single stamens being lower down.

Kunth, Wydler, Chatin and others regard these two whorls as typically four-membered (tetramerous), those of the outer whorl corresponding in position to the sepals, those of the inner whorl corresponding in position to the petals. To arrive at a typical cruciferous flower from this, two stamens in the outer whorl abort, while the individuals of the two pairs of the inner whorl come together. (Plate I, fig. 3.)

Others (Krause, Wretschko and Duchartre) regard the outer whorl as typically dimerous (i.e. with two constituents) and the inner whorl as typically tetramerous (i.e. four-membered).

The more modern view, however, regards both whorls as dimerous but the inner one chorised collaterally, thus giving the typical cruciferous flower. The reasons put forward to support this theory are as follows:

(1) The upper long stamens are usually paired in the median line, also sometimes coherent. Further, in place of one or both of the pairs, there occurs sometimes a single stamen (a hint at reversion), or one or both pairs may be replaced by three or more (a suggestion of further chorisis).

(2) In the earliest visible stage of development in the bud it may be seen that each pair of stamens arises from a single wart-like projection and that division is therefore a secondary result. This is not very easily demonstrable in the Cruciferae but is more evident in a closely allied family, the Capparidaceae.
Since the present study includes numerical variation in the different constituents and positions of the androecium it will be interesting to note to what extent any one of these theories is borne out by the variations in this flower.

4. Orientation of the flower.

Having defined the positions of the various stamens relative to one another, in what is usually regarded as a normal cruciferous flower, let us now consider the different possibilities when the flower is abnormal.

Suppose that one of the pairs of stamens of the inner whorl is represented by a single stamen, that is, suppose that chorisis had not taken place. Now with regard to the peduncle of the inflorescence this stamen might be placed in two diametrically opposite positions, namely (1) it might be adjacent to the peduncle (Plate I, fig. 4) or (2) it might be on the distal half of the flower with reference to the peduncle (Plate I, fig. 5).

Two questions now arise, (1) do non-chorised stamens occur as frequently as chorised stamens on the side of the flower next to the peduncle? or (2) does either of these occur with greater frequency in this adjacent position?

According to which of these questions is answered in the affirmative must we conclude whether there is any connection or correlation between the proximity of the chorised stamens to the peduncle and chorisis. The former would suggest no correlation, whereas the degree of correlation hinted at by the latter would depend on the frequency of the occurrence.

We have so far considered only two possible positions, viz. a non-chorised stamen adjacent to the peduncle, i.e. in the proximal half of the flower, and a non-chorised stamen opposite to the peduncle (i.e.
in the distal half of the flower with reference to the peduncle), but the question naturally arises "Are these the only two possible relative positions which might occur?" Might the petiole not twist so as to bring the hypothetic non-chorised stamen into any position varying from 0° to 180° with reference to the original plane? Let us illustrate this by means of Figure 6, Plate I. Taking the position of the peduncle as our fixed point the non-chorised stamen might occupy the "adjacent" position a or the "opposite" position a1. A rotation of the petiole, however, might cause this stamen to occupy any of the positions marked a2, a3 or a4 or even any intermediate position between a and a1 on either side of the vertical plane A-B, in the horizontal plane a, a4, a2, a3, a1.

In a study of the variations in this flower, this is precisely what was found to occur, i.e. the distribution was equal round a fixed point so that we are unable to say whether there is any connection between the proximity of the non-chorised stamen to the peduncle and chorisis or not.

But the full bearing of this consideration does not end here. The orientation of the flower is of practical importance in fixing a basis on which to establish a grouping of the different variations. Any analysis of the data is impossible unless some definite part of the flower be agreed upon as a starting point. Now we have seen that the position of the peduncle with respect to any definite stamen does not require to be taken into consideration. Consequently we may take either of the two stamens of the outer whorl, which correspond in position to the outer sepals and which are "normally" non-chorised, as our fixed point and call it 1; the stamen opposite, i.e. in the same whorl, we shall call 2; the chorised pair of the inner whorl to the left (or in the floral diagram above) may be termed 3 and 4; while the corresponding pair to the right (or in the floral diagram below) would thus be 5 and 6 (Plate I, fig. 7).
Where variations occur in any of these stamens we shall hereafter refer to those as occurring in "position" 1, 2, 3, 4 and 5, 6 respectively.

On this basis of symmetry, it will simplify matters considerably if we regard as 1, in flowers in which either of the two outer stamens is modified, that one which still maintains its original character while, on the other hand, if both are modified, that one which retains the greatest approximation to normality, e.g. if one be chorised while the other is not, the latter would be in position 1; or if one were chorised while the other was only partially chorised* the latter would again be in position 1. Following on this it is at once seen that where both are normal or where both are equally abnormal it makes absolutely no difference which position we choose as 1.

* For the present we use the terms "chorised" and "chorisis" in the sense of the definition already given.

III. EXAMINATION OF THE DATA.

1. Classification.

Considerable difficulty was experienced in classifying the variations owing to these occurring in so many different forms yet with so few characteristics in common as to warrant their inclusion in definite classes.

The total number of flowers examined was 1832, of which 1062 had the accepted normal structure (see page 218). The remaining 770 showed variation in different degrees of advance or regression, i.e. there was an excess or deficiency in the number and structure of the members of the various organs. Thus we see that there was a deviation from the accepted normal structure in over 42 per cent. of the individuals examined.

The perianth has been selected as a basis for classification and Table A shows the sub-divisions which have been adopted.

TABLE A.

 | Number of Variations | Number in Group | Number in Sub-class | Number in Class | Variations in the Class
Class I. Perianth normal | - | - | - | 1687 | 62
Sub-Class A. Gynaecium normal | - | - | 1680 | - | -
Group (a). Androecium normal | 1 | 1062 | - | - | -
Group (b). Androecium abnormal | 57 | 618 | - | - | -
Sub-Class B. Gynaecium abnormal | - | - | 7 | - | -
Group (a). Gynaecium one carpel | 2 | 4 | - | - | -
Group (b). Gynaecium reduplicated | 2 | 3 | - | - | -
Class II. Perianth abnormal | - | - | - | 115 | 29
Sub-Class A. Calyx normal, corolla abnormal | - | - | 55 | - | -
Group (a). Gynaecium normal | 11 | 54 | - | - | -
Group (b). Gynaecium a single carpel | 1 | 1 | - | - | -
Sub-Class B. Both calyx and corolla abnormal | - | - | 60 | - | -
Group (a). Gynaecium normal | 11 | 46 | - | - | -
Group (b). Gynaecium a single carpel | 6 | 14 | - | - | -
Totals | 91 | 1802 | 1802 | 1802 | 91

Amongst those flowers in which the perianth was normal there were no fewer than 62 different types of variation, and amongst those in which the perianth showed a departure from the accepted normal structure there were 29 types of variation.

Thus of 1802 flowers examined, 1062 had the typical cruciferous structure, 625 had the perianth normal but the androecium and gynaecium modified in 62 different ways and 115 had all three organs modified in 29 different types of variation.

The remaining 30 individuals are not capable of classification under the foregoing scheme but have been grouped into three classes as shown in Table B.

TABLE B.

 | Number of Variations | Number of individuals in the Class
Class III. Reduplication of parts but flowers not separate | 10 | 11
Class IV. Reduplication of parts with flowers separate | 6 | 17
Class V. Part of a flower replaced by a flower | 2 | 2
Totals | 18 | 30

Altogether, therefore, there are five separate classes which give a total of 109 different modes of variation.

2. Analysis.

In the further reduction of the data it is essential that we consider the variations in the stamens, and for this purpose we must naturally commence with Class I, Sub-class A. To avoid describing each of these in detail, it is necessary to have recourse to a graphic method of representation.
Several such methods suggested themselves and although none are ideal we have chosen one which may help to give a true impression of the various modifications assumed by the androecium. We shall also give a few examples by another method which might have been adopted but which seems to us to be even more complicated.

Let us, in the first place, consider in what directions abnormalities have occurred. A typical stamen consists of two parts, (1) the filament and (2) the anther.

(1) Filament. This may be of its normal length or less than its normal length or altogether absent.
(2) Anther. This may be present or absent.

But other complications arise. As already explained, in the accepted typical cruciferous flower, chorisis has taken place in positions 3.4 and 5.6 so as to give rise to two stamens in each of these positions. Now, we find that in certain flowers chorisis has only partially taken place and in others it has not occurred at all so that we have thus another three possibilities to consider.

In describing the androecium, therefore, we must (1) define the position of each stamen to which we refer, (2) state the nature of the filament, (3) note the presence or absence of the anther and (4) emphasise the nature of the chorisis.

Let us use the following symbols 1, ½ and 0.

1 with reference to
  Filament, indicates that it is present and complete.
  Chorisis, indicates that it is total or complete.
  Anther, indicates that it is present.
½ with reference to
  Filament, indicates that it is only half-length.
  Chorisis, indicates that it is only partial.
0 with reference to
  Filament, indicates that it is absent.
  Chorisis, indicates that it has not taken place.
  Anther, indicates that it is absent.

We have already fixed upon our nomenclature for the various positions; these are 1; 2; 3.4; and 5.6.
To avoid descriptions and at the same time give a graphic representation of the floral formula of the androecium the following system might be adopted:

(1) Place the whole floral formula within square brackets thus [ ].

(2) Place positions 1; 2; 3.4; and 5.6 within curled brackets thus { }; and

(3) Place individuals, i.e. 1, 2, 3, 4, 5 and 6, within rounded brackets thus ( ).

Expanding this with reference to a normal flower we would have for the androecium only [formula illegible in this copy], or still further, in the order of Filament, Chorisis, Anther:

    [{(1.0.1)} {(1.0.1)} {(1.1.1)(1.1.1)} {(1.1.1)(1.1.1)}]

Or, taking an actual example from our data: Stamen number 1 is normal and complete, and there is no chorisis; stamen number 2 has a filament only half-length but the anther is present and complete, and there is no chorisis; stamen number 3 is normal and complete; stamen number 4 is only half-length but with a complete anther; chorisis between 3 and 4 is complete; stamens 5 and 6 are only half the normal length but have complete anthers; chorisis between 5 and 6 is complete. This would be represented thus:

    [{(1.0.1)} {(½.0.1)} {(1.1.1)(½.1.1)} {(½.1.1)(½.1.1)}]

Another graphic method, and the one which we have adopted, is as follows. Each flower is represented in a table similar to the following:

    Frequency | Number of Diagram | Stamen | Filament | Chorisis | Anther

The first vertical column gives the frequency of the variation, or the number of individuals examined with this structure. The second vertical column gives the number of the corresponding diagram in the plates. The third vertical column gives the individual stamens in the positions already defined, while the other three columns denote the various factors to be considered. The different possibilities of variation in these may be shown by the symbols 1, ½ and 0 as already defined.
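The bracket notation just described can be made concrete with a short sketch. The function names and the exact spacing of the rendered formula are assumptions made for illustration; only the three symbols (1, ½, 0), the order Filament, Chorisis, Anther, and the three kinds of bracket come from the text.

```python
from fractions import Fraction

HALF = Fraction(1, 2)  # the symbol "half": half-length filament or partial chorisis


def symbol(value):
    """Render one state symbol: 1, 1/2 or 0."""
    return "1/2" if value == HALF else str(value)


def stamen(filament, chorisis, anther):
    """One individual stamen in rounded brackets: (Filament.Chorisis.Anther)."""
    return "(" + ".".join(symbol(v) for v in (filament, chorisis, anther)) + ")"


def formula(positions):
    """Whole androecium: positions in curled brackets, formula in square brackets."""
    groups = ["{" + "".join(stamen(*s) for s in pos) + "}" for pos in positions]
    return "[" + " ".join(groups) + "]"


# Positions 1; 2; 3.4; 5.6 of a normal flower (hypothetical encoding).
NORMAL = [
    [(1, 0, 1)],
    [(1, 0, 1)],
    [(1, 1, 1), (1, 1, 1)],
    [(1, 1, 1), (1, 1, 1)],
]

# The worked example from the text: stamen 2 half-length, stamens 4, 5, 6 half-length.
EXAMPLE = [
    [(1, 0, 1)],
    [(HALF, 0, 1)],
    [(1, 1, 1), (HALF, 1, 1)],
    [(HALF, 1, 1), (HALF, 1, 1)],
]

if __name__ == "__main__":
    print(formula(NORMAL))
    print(formula(EXAMPLE))
```

Keeping each stamen as a triple makes it easy to move between this linear formula and the tabular form, since each triple is one row of the table.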
It should be noted, however, that in positions 1 and 2 a dash (—) will be placed in the chorisis column to indicate that these are typically non-chorised stamens and that absence of chorisis does not therefore indicate abnormality.

Representing the same example as before, by this method, we would have:

    Frequency | Number of Diagram | Stamen | Filament | Chorisis | Anther
              |                   |   1    |    1     |    —     |   1
              |                   |   2    |    ½     |    —     |   1
              |                   |   3    |    1     |    1     |   1
              |                   |   4    |    ½     |    1     |   1
              |                   |   5    |    ½     |    1     |   1
              |                   |   6    |    ½     |    1     |   1

The following table shows graphically the types of variations illustrated in Figs. I–LVIII, i.e. Class I, Sub-class A, flowers in which the perianth and gynaecium are both normal. It will be seen that in the flowers illustrated in Figs. XLVIII–LVIII another complication has crept in. Stamens 3, 4, 5 and 6 have themselves sometimes undergone partial or total secondary chorisis. In the tables, therefore, by sub-dividing the squares containing the details we can thus adhere to our initial nomenclature. Let us take the three most difficult examples to illustrate this.

(1) Fig. XLVIII. The division corresponding to stamen 3 is sub-divided. This would indicate that in this position there were actually two stamens. The nature of each of these individual stamens is, as before, given in the sub-divisions. In this example what really occurs is: There are two individual stamens each equal to the original length and bearing a complete anther and separated from one another by secondary chorisis.

(2) Fig. L. Divisions 5 and 6 are sub-divided to show that there has been secondary chorisis in both of these stamens. In the former, chorisis has been complete but has resulted in one being full-length and with a functioning anther while the other is only half-length with a functioning anther. In the latter, chorisis has not been complete inasmuch as only the anther has been chorised.

(3) Fig. LII. Divisions 5 and 6 are both sub-divided, consequently we may infer that both of these stamens have undergone some stage of chorisis. In the first column we see that all are full-length, in the third that all have functioning anthers, but the second tells us that chorisis has been only partial in each case. The symbol ½ between 5 and 6 indicates that between these two chorisis has also been partial. Thus we conclude a state of affairs as follows: In position 5.6 (1) there arises a single filament which divides into two at some distance from the base, (2) each of these again sub-divides and (3) on the end of each of these four sub-filaments there arises a functioning anther.

The others may be worked out in a similar manner but a reference to the diagrams will at once obviate any misrepresentation.

TABLE C.

[Table C and its continuations: one block per diagram, Figs. I–LVIII, with the columns Frequency, Number of Diagram, Stamen, Filament, Chorisis and Anther. The cell values are not recoverable from this copy; the frequencies exceeding 3 are repeated in Table F.]

From the foregoing table and illustrations it is evident that further classification is possible but it would be well to point out here certain difficulties which arise. As an example let us consider such a case as (using our original terminology) that in which, in any of the positions (1; 2; 3.4; or 5.6), the stamens are represented thus (1.1.0)(0.0.0), thus (½.1.1)(½.1.1) or thus (½.0.1)(½.0.1). Which shall have precedence?
If we are to consider these variations as deviations from the usually accepted normal cruciferous flower, then we may safely assume that that flower which has the greatest number of functioning parts in a certain position is less aberrant than one in which any or all of the parts are altogether wanting; while, on the other hand, if in a position in which chorisis normally takes place, we have defective groups like those in cases 2 and 3 cited above, in one of which chorisis has taken place but not in the other, we must consider that group in which chorisis has occurred as being the one less removed from normal. On this basis then the above examples would be placed in the following order with regard to normality:

    (1) (½.1.1)(½.1.1); (2) (½.0.1)(½.0.1); (3) (1.1.0)(0.0.0).

Similarly for any of the others. Consequently we are now in a position to classify the actual cases under observation.

TABLE D.

                                                                      Number of          Number of
                                                                     Variations in      Individuals in
                                                                   Section  Sub-group  Section  Sub-group
Group a. Whole of the androecium normal. Fig. I                       —        1          —       1062
Group b. Androecium variously modified
Sub-group α. Outer whorl normal (5 and 6 variously modified)          —       21          —        480
  Section i.   Stamens 3 and 4 normal. Figs. II–IX                    8        —        419         —
  Section ii.  Stamen 3 normal, 4 represented thus (½.1.1).
               Figs. X–XIV                                            5        —         25         —
  Section iii. Stamens 3 and 4 thus {(1.0.1)(1.0.1)}.
               Figs. XV–XVIII                                         4        —          9         —
  Section iv.  Stamens 3 and 4 thus {(½.1.1)(½.1.1)}. Fig. XXII       1        —          2         —
  Section v.   Stamens 3 and 4 replaced by one. Figs. XIX–XXI         3        —         25         —
Sub-group β. Outer whorl represented thus {(1.—.1)(½.—.1)}            —       13          —         54
  Section i.   Stamens 3 and 4 normal. Figs. XXIII–XXIX               7        —         42         —
  Section ii.  Stamens 3 and 4 thus {(1.1.1)(½.1.1)}.
               Figs. XXX–XXXIV                                        5        —         11         —
  Section iii. Stamens 3 and 4 thus {(½.0.1)(0.0.0)}. Fig. XXXV       1        —          1         —
Sub-group γ. Stamens 1 and 2 represented thus {(½.—.1)(½.—.1)}        —        4          —         16
  Section i.   Stamens 3 and 4 normal. Figs. XXXVI and XXXVII         2        —         11         —
  Section ii.  Stamens 3 and 4 thus {(1.1.1)(½.1.1)}.
               Figs. XXXVIII and XXXIX                                2        —          5         —
Sub-group δ. Stamen 1 normal, 2 absent                                —        5          —         29
  Section i.   Stamens 3 and 4 normal. Figs. XL–XLIII                 4        —         27         —
  Section ii.  Stamen 3 normal, 4 thus (½.1.1). Fig. XLIV             1        —          2         —
Sub-group ε. Stamen 1 thus (½.—.1), 2 absent. Figs. XLV–XLVII         —        3          —          6

So far we have considered only those flowers in which there was the typical number of stamens, with their manifold variations in size and structure, but now we must classify those in which secondary chorisis has given rise to more than the accepted number. Let us take stamens 5 and 6 as our basis, i.e. those individuals in which position 5.6 is occupied by more than two stamens.

TABLE E.

                                                                      Number of          Number of
                                                                     Variations in      Individuals in
                                                                   Section  Sub-group  Section  Sub-group
Sub-group η. Stamens 1 and 2 are normal                               —        9          —         30
  Section i.   Stamens 3 and 4 are represented by three.
               Fig. XLVIII                                            1        —          3         —
  Section ii.  Stamens 3 and 4 are normal. Figs. XLIX–LI              3        —         16         —
  Section iii. Stamens 3 and 4 represented thus {(1.1.1)(½.1.1)}.
               Fig. LII                                               1        —          2         —
  Section iv.  Stamens 3 and 4 thus {(1.0.1)(1.0.1)}. Figs. LIII–LV   3        —          5         —
  Section v.   Stamens 3 and 4 represented by one. Fig. LVI           1        —          4         —
Sub-group θ. Stamens 1 and 2 thus {(½.—.1)(½.—.1)}. Fig. LVII         —        1          —          1
Sub-group κ. Stamen 1 normal, 2 absent. Fig. LVIII                    —        1          —          2

The relative frequencies of the different types of variation in the androecium in the 618 specimens so far considered (see Table C) are very interesting. The number 1062 in Table F refers to 1062 flowers in which the androecium was normal.

TABLE F.

Frequencies more than 3 in order of magnitude. (References have been made to the figures.)

    Figure   Frequency  |  Figure   Frequency  |  Figure   Frequency
    I          1062     |  XXVIII      11      |  XXV          7
    VIII        227     |  XXXVI        9      |  VII          6
    III         130     |  VI           8      |  X            5
    IV           38     |  XII          8      |  XLII         5
    XX           21     |  XIII         8      |  LI           5
    XL           18     |  XLIX         8      |  XVIII        4
    XXII         15     |  IX           7      |  LVI          4
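The section counts in Tables D and E can be checked against the sub-group totals and against the figures of 57 variations and 618 individuals given for flowers with an abnormal androecium. The sketch below does so; the dictionary keys and the function name are illustrative labels, and the numbers are the table entries as read from this copy.

```python
# (variations, individuals) for each section, with the stated sub-group total.
SUBGROUPS = {
    "D alpha":   {"sections": [(8, 419), (5, 25), (4, 9), (1, 2), (3, 25)], "total": (21, 480)},
    "D beta":    {"sections": [(7, 42), (5, 11), (1, 1)],                   "total": (13, 54)},
    "D gamma":   {"sections": [(2, 11), (2, 5)],                            "total": (4, 16)},
    "D delta":   {"sections": [(4, 27), (1, 2)],                            "total": (5, 29)},
    "D epsilon": {"sections": [(3, 6)],                                     "total": (3, 6)},
    "E first":   {"sections": [(1, 3), (3, 16), (1, 2), (3, 5), (1, 4)],    "total": (9, 30)},
    "E second":  {"sections": [(1, 1)],                                     "total": (1, 1)},
    "E third":   {"sections": [(1, 2)],                                     "total": (1, 2)},
}


def check_totals(subgroups):
    """Return names of inconsistent sub-groups and the grand totals."""
    bad = []
    grand_variations = grand_individuals = 0
    for name, entry in subgroups.items():
        variations = sum(v for v, _ in entry["sections"])
        individuals = sum(i for _, i in entry["sections"])
        if (variations, individuals) != entry["total"]:
            bad.append(name)
        grand_variations += variations
        grand_individuals += individuals
    return bad, grand_variations, grand_individuals


if __name__ == "__main__":
    print(check_totals(SUBGROUPS))
```

A check of this kind is also what fixes the one illegible entry in Table E: the sub-group total of 30 leaves 4 individuals for Fig. LVI, which agrees with Table F.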
Where variation occurs, the greatest frequency, namely 227, occurs in flowers in which one of the pairs in the inner whorl is replaced by a single stamen, while the next highest frequency, namely 130, occurs in those flowers in which partial chorisis has taken place in the inner whorl of the androecium. Following this the magnitude of the frequencies diminishes rapidly. The next, namely 38, occurs in flowers in which nearly all the parts of the androecium are modified while, near this, is the frequency 21 which exists in flowers having only one stamen in each position. In the next two frequencies, namely 18 and 15, we find that stamens 1 and 2 are involved.

From these raw data we can see that the inner whorl of the androecium is the whorl most subject to variation and further that this variation is in the direction of a decrease in number.

TABLE G.

Class I, Sub-class B: Perianth normal, Gynaecium abnormal (see Table A).

[Table G: four blocks with the columns Frequency, Number of Diagram, Stamen, Filament, Chorisis and Anther. The cell values are not recoverable from this copy.]

Class II: Perianth abnormal.

Variations in the members of the perianth (calyx and corolla) have necessitated the introduction of new symbols in the diagrams. These are shown in the composite diagram Plate I, fig. 10, and are explained on p. 257.

It will be evident from Table H, p. 233, that the same type of variation in the androecium occurs with different types of variation in the perianth, e.g. in the second and fifth figures no fewer than six different variations in the perianth accompany a single type of variation in the androecium. Reference to the diagrams in the plates will show what these variations are and will render a detailed explanation unnecessary.

The asterisk in Table H under LXXVIII indicates that there has been adhesion between stamen 1 and one of the stamens in position 3.4, in other words between one of the members of the outer whorl and one of the members in the inner whorl.

TABLE H.

Class II.

[Table H: same columns as Table C. The cell values are not recoverable from this copy.]

Class III.

The members of this class are characterised by a reduplication of the various organs but without separation into two distinct flowers. There are in all 11 individuals with 10 different types of variation. A word of explanation is necessary with regard to the interpretation of the position of the various stamens in these flowers. The stamens in the outer whorl are longer than those in the inner whorl. Consequently if these were reduced in length they might easily be mistaken for members of the outer whorl. In all cases of difficulty, however, the crucial test, the point of origin, was applied and the positions assigned to the various members as shown in the diagrams were determined microscopically in this manner. See Table I, p. 234.

TABLE I.

Class III.

[Table I: same columns as Table C. The cell values are not recoverable from this copy.]

TABLE J.

Class IV.
[Table J: same columns as Table C, with two blocks, A and B, for each specimen. The cell values are not recoverable from this copy.]

Class IV.

In this class there are 17 individuals giving six different modes of variation. Reduplication has taken place to such an extent as to give rise to two separate flowers on one pedicel. Each of the flowers was diminutive in size. Two tables are thus necessary for each "flower," A and B: see Table J, p. 235.

Class V.

This class has been formed to include two very aberrant flowers showing two distinct variations. In both cases part of the flower has been replaced by another flower, in one case normal, in the other slightly divergent. CIX is one of these, in which the original flower is normal except that one of the carpels has been replaced by a small flower (see Table K and diagram, Plate X). CX is the other. In the original flower stamen 1 has been chorised and one of the chorised parts has given origin to a separate flower (see Table K and diagram, Plate X).

TABLE K.

Class V.

[Table K: blocks A and B for each of Figs. CIX and CX, with the same columns as Table C. The cell values are not recoverable from this copy.]

The asterisk denotes the position of the origin of the secondary flower.

This concludes our analysis of variations LIX to CX both as to perianth and androecium, but before proceeding to the statistical part it is desirable that certain peculiarities should be observed and that an understanding be arrived at with regard to the interpretation of these.

In this procedure 44 variations, namely LIX to CII, must be dealt with. When we study the number of parts which occur in the position of individual members of a whorl and then try to draw conclusions as to normality or abnormality of the whorl itself we find the following difficulties. Let us take the outer whorl of the androecium as an example.

(1) If in the position normally occupied by stamen 1 there were two stamens and in the position normally occupied by stamen 2 no stamen occurred, then with regard to the whorl the total number of stamens would be two. Now this is the accepted normal number of stamens in the outer whorl, so that if number alone were considered the inference would legitimately be drawn from the table that the whorl was normal. But this is not so!

Or (2) If in position number 1 one normal stamen occurred and in position number 2 one functioning stamen, with the filament only half the normal length, occurred, then the number of functioning stamens in the whorl would be two, i.e. the accepted normal number. But again, on the basis of number alone, we should not be able to say whether the whorl as a whole was normal or abnormal.
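The two difficulties just described amount to saying that a count of functioning stamens cannot by itself decide whether a whorl is normal. A minimal sketch of the distinction follows; the data layout (a whorl as a list of positions, each holding (filament, anther) pairs) and both function names are assumptions made for illustration.

```python
HALF = 0.5  # half-length filament


def functioning_count(whorl):
    """Number of stamens in the whorl that carry an anther."""
    return sum(1 for position in whorl for (_filament, anther) in position if anther == 1)


def is_typical(whorl, per_position=1):
    """True only if every position holds the expected number of complete stamens."""
    return all(
        len(position) == per_position
        and all(filament == 1 and anther == 1 for filament, anther in position)
        for position in whorl
    )


# Outer whorl of the androecium: positions 1 and 2, one stamen expected in each.
typical = [[(1, 1)], [(1, 1)]]
case_1 = [[(1, 1), (1, 1)], []]      # two stamens in position 1, none in position 2
case_2 = [[(1, 1)], [(HALF, 1)]]     # position 2 has a half-length filament

if __name__ == "__main__":
    for name, whorl in [("typical", typical), ("case 1", case_1), ("case 2", case_2)]:
        print(name, functioning_count(whorl), is_typical(whorl))
```

All three whorls give the same count, and only the position-by-position test separates them, which is the reason for recording normality alongside number.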
Now as this state of affairs exists not only in the whorl under consideration but in all the whorls of the flower, we have thought it not only advisable but necessary to emphasise these abnormalities as a safeguard in the interest of systematic statistical treatment. For this purpose, therefore, small diagrammatic formulae have been drawn up, and these have been given in conjunction with the diagrams: see Plate I, figs. 11, 12.

We have already defined the positions of the various parts of the androecium but have hitherto refrained from naming the different constituents of the perianth. In the cases under consideration, however, it is necessary to do so, and Plate I, fig. 11 illustrates how these are definitely determined.

The two outer sepals are named A and B (see Fig. 11). A corresponds in position to stamens 3.4 and B to stamens 5.6. C and D are the two inner sepals; C corresponds in position to stamen 1 and D to the stamen in position 2. The petals are named A', B', C' and D' and lie respectively between sepals A and C, B and D, A and D, and B and C.

The actual order of all the parts is summed up in Plate I, fig. 12 (1–14).

TABLE L.

Class V.

[Table L and its continuations: one block per diagram, with the columns Frequency, Number of Diagram, Whorl, Members in Whorl, Number in each position, and Whorl Normal or Abnormal. The cell values are not recoverable from this copy.]

IV. STATISTICAL.
The analysis which we have given of 1813 flowers is sufficient to show that the idea of a definite fixed number of sepals in the calyx, of petals in the corolla, of stamens in the androecium or of carpels in the gynaecium of cruciferous plants is not upheld by an examination of a large number of flowers of this species.

In less than 1 per cent. of the flowers examined there was an increase* or decrease in the number of sepals in the calyx; in less than 1 per cent. there was also an increase or decrease in the number of petals in the corolla, but in 2 per cent. there was an increase in the number of stamens in the androecium, while in 22 per cent. there was a decrease in the number.

Since then the number of sepals, petals and stamens is not absolutely fixed for any of the organs it becomes necessary now to consider whether the number of members in one organ is related to the number in the others.

As has already been pointed out we have not only to consider organs as a whole, but, in the case of the calyx and the androecium, the constituents of these organs, owing to the fact that these organs are each divided into two separate whorls which are inserted at different levels and are placed in directions at right angles to one another.

Further, a special study has been made of the various positions in the androecium to ascertain to what extent bilateral symmetry may be regarded as an inherent character of the flower under consideration.

By this means also it seems that some definite information might be obtained with regard to the perplexing and, at present, hypothetical theory of chorisis, the reasons for the existence of which have been summarised on p. 219.

The statistical part has been divided into two sections: (1) a study of the Means and Standard Deviations, and (2) a study of the Correlation Coefficients.

1. Study of the Means and Standard Deviations.
Although it is obvious from the analysis of the data under consideration that the numbers given for the botanical floral formula, namely, Calyx 4, Corolla 4, Stamens 6 and Gynaecium 2, are the nearest integers, it is not at all certain from a mere inspection of the tables whether the actual means deviate from this number in the direction of excess or deficiency.

* Where chorisis of a sepal or petal has resulted in two or more distinct individuals we have regarded each of these as a distinct sepal or petal in recording the numbers. This method is natural however inasmuch as it is the only means by which we may possibly trace reduplication of parts.

Consequently the mean and standard deviation for each of the organs and its constituents have been calculated and these are given in the following table:

TABLE M.

Means and Standard Deviations of the Number of the Organs and their Constituents.

    Organ        Constituent    Member         Mean          Standard      Coefficient
                                                             Deviation     of Variation
    Calyx        —              —              3.9796        [illegible]     6.7415
      "          Outer whorl    —              1.9757         .1979         10.020
      "          Inner whorl    —              2.0039         .1304          6.507
    Corolla      —              —              3.9520         .3523          8.914
    Androecium   —              —              5.8092         .7567         13.025
      "          Outer whorl    —              1.9570         .2704         13.817
      "          —              Stamen 1        .9840         .1602         16.280
      "          —              Stamen 2        .9713         .1915         19.715
      "          Inner whorl    —              [illegible]    .6588         17.077
      "          —              Stamens 3.4    1.9950         .2728         13.674
      "          —              Stamens 5.6    1.8627         .4863         26.107

(1) The most obvious result which is revealed by these constants is the fact that in all cases (except the inner whorl of the calyx) the actual mean of the organs is less than the recognised typical number, thus:

The mean number for the calyx is 3.979 instead of 4.
The mean number for the corolla is 3.952 instead of 4.
The mean number for the androecium is 5.809 instead of 6.

(2) The inner whorl of the calyx shows the smallest departure from the accepted typical number, namely, 2.004 instead of 2.
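The last column of Table M can be checked, and the two damaged cells estimated, on the assumption that the coefficient of variation is 100 × standard deviation / mean. The legible rows are consistent with that definition. The sketch below is a reader's check, not the author's computation; the row labels and function names are illustrative, and the numbers are the table entries as read from this copy.

```python
def coefficient_of_variation(mean, sd):
    """Coefficient of variation in per cent: 100 * S.D. / mean."""
    return 100.0 * sd / mean


# (mean, standard deviation, tabled coefficient) for the fully legible rows.
LEGIBLE_ROWS = {
    "calyx, outer whorl": (1.9757, 0.1979, 10.020),
    "calyx, inner whorl": (2.0039, 0.1304, 6.507),
    "corolla": (3.9520, 0.3523, 8.914),
    "androecium": (5.8092, 0.7567, 13.025),
    "stamen 1": (0.9840, 0.1602, 16.280),
    "stamen 2": (0.9713, 0.1915, 19.715),
    "stamens 3.4": (1.9950, 0.2728, 13.674),
    "stamens 5.6": (1.8627, 0.4863, 26.107),
}


def sd_from_cv(mean, cv):
    """Back-solve a standard deviation from the mean and the coefficient."""
    return mean * cv / 100.0


def mean_from_cv(sd, cv):
    """Back-solve a mean from the standard deviation and the coefficient."""
    return 100.0 * sd / cv


if __name__ == "__main__":
    for name, (mean, sd, tabled) in LEGIBLE_ROWS.items():
        print(f"{name}: computed {coefficient_of_variation(mean, sd):.3f}, tabled {tabled}")
    # Estimates for the two damaged cells (derived values, not readings of the print).
    print("calyx S.D. estimate:", round(sd_from_cv(3.9796, 6.7415), 4))
    print("inner whorl mean estimate:", round(mean_from_cv(0.6588, 17.077), 4))
```

The estimate for the inner whorl of the androecium can be compared with the sum of the means for positions 3.4 and 5.6, since the mean of a sum is the sum of the means.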
Let us, however, test how far the differences in the character of the analogous parts are significant by ascertaining the Probable Error of the difference of the means of the characters.

TABLE N (1). I. Constituents of the Calyx.

| Constituent | Mean Number | Standard Deviation |
|---|---|---|
| Outer whorl | 1.9757 | .1979 |
| Inner whorl | 2.0039 | .1304 |

The difference here is $D = .02813$ and the probable error of the difference is $E_D = .0037$; thus the value $D/E_D = 7.6$. The difference is therefore clearly significant.

It is worthy of notice that the outer whorl of the calyx is more variable than the inner whorl and that it possesses on an average fewer sepals.

TABLE N (2). II. Members of the Outer Whorl of the Andrœcium.

| Member | Mean Number | Standard Deviation |
|---|---|---|
| Position 1 | .9840 | .1602 |
| Position 2 | .9713 | .1915 |

The difference here is $D = .01268$ and $E_D = .00391$; the value $D/E_D$ is 3.2. Thus when the two positions of the outer whorl of the andrœcium are taken into consideration, a probably significant difference is found between the means of the distribution of the parts of this whorl. Now in position number 1 there is a greater approach to the accepted type owing to the fact that when the component of one of the positions of the outer whorl was found to depart from the accepted type, the other position was selected as the starting point for the orientation of the flower and was called position number 1. It is all the more noteworthy that the deviation for position 2 is not in the direction of greater but of lesser frequency and that the variability of position 2 is greater. We have thus again a reduction in the value of the type with greater variability.

TABLE N (3). III. Members of the Inner Whorl of the Andrœcium.

| Member | Mean Number | Standard Deviation |
|---|---|---|
| Position 3.4 | 1.9950 | .2728 |
| Position 5.6 | 1.8627 | .4863 |

Here the ratio $(m_1 - m_2)/E_{(m_1 - m_2)}$ is nearly 15 and therefore there is quite a significant difference between the means of the distributions of the two members of the inner whorl of the andrœcium.

In both cases the tendency is towards a suppression of functioning stamens rather than an increase, together with greater variability in the case where the reduction from the accepted type is more marked. This difference in variability is, in the main, real and is not due to the arbitrary selection of the 3.4 position. This will be evident from a study of Table XIX. It will there be seen that there were 1754 cases where two stamens occurred in one of the positions of the inner whorl of the andrœcium. This is the number in the accepted type, and thus there is no variability. What is the nature of the distribution of the stamens in the other position (Table XVIII)? It is as follows:

| Number of stamens | 1 | 2 | 3 | 4 | Total |
|---|---|---|---|---|---|
| Frequency | 287 | 1436 | 26 | 5 | 1754 |

The mean for this array is 1.8569 stamens, with a variability of .4116. Thus when there is no variability in one position of the inner whorl of the andrœcium there is a large variability in the other position.

Similarly we find the following distribution for position 2 in the outer whorl of the andrœcium when position 1 is of the accepted type, i.e. shows no variability.

Frequencies: 57, 1708, 1; total 1766.

The mean for this array is .9683 with a variability of .1784. Again therefore, when there is no variability in position 1, there is a reduction of type in position 2 with great variability.

2. Study of the Correlation Coefficients.

For the purposes of this study a number of correlation tables have been prepared and, as the results of these will have to be considered under different groupings, it seems advisable to tabulate them and insert them consecutively.
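The means and variabilities quoted for the two arrays of stamen counts in the preceding section, and the ratio of a difference to its probable error, can be rechecked with a few lines of arithmetic. The following is a minimal sketch, not the author's own procedure: it assumes that "variability" means the standard deviation with divisor n, and that the classes of the second array (whose headings are not legible) are 0, 1 and 2 stamens. All names are illustrative.

```python
import math


def mean_and_sd(values, frequencies):
    """Mean and standard deviation (divisor n) of a grouped frequency array."""
    n = sum(frequencies)
    mean = sum(v * f for v, f in zip(values, frequencies)) / n
    variance = sum(f * (v - mean) ** 2 for v, f in zip(values, frequencies)) / n
    return mean, math.sqrt(variance)


def significance_ratio(difference, probable_error):
    """Ratio of a difference of means to its probable error (D / E_D)."""
    return difference / probable_error


# Inner whorl: stamens in the other position when one position has exactly two.
inner_mean, inner_sd = mean_and_sd([1, 2, 3, 4], [287, 1436, 26, 5])

# Outer whorl, position 2, when position 1 is of the accepted type.
# ASSUMPTION: class values 0, 1, 2 are inferred, not read from the source.
outer_mean, outer_sd = mean_and_sd([0, 1, 2], [57, 1708, 1])

# Calyx whorls: D and E_D as printed with Table N (1).
calyx_ratio = significance_ratio(0.02813, 0.0037)

if __name__ == "__main__":
    print(f"inner whorl array: mean {inner_mean:.4f}, s.d. {inner_sd:.4f}")
    print(f"outer whorl array: mean {outer_mean:.4f}, s.d. {outer_sd:.4f}")
    print(f"calyx D / E_D: {calyx_ratio:.1f}")
```

With these assumptions the sketch returns the printed figures to four decimal places, which supports the inferred class values of the second array.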
The system which has been adopted to facilitate reference is to commence with the outer whorl of the calyx and consider all its relations with the other whorls of the flower passing from the outside inwards; following this comes the inner whorl of the calyx and its relations with the other constituents of the flower from the outside inwards, and so on.

The following table shows the characters studied and the correlation coefficients found.

In order to make the comparison of the various correlations as complete as possible it will be necessary to consider each constituent or organ with all the other constituents or organs and to avoid overlapping as far as possible. The most natural method would be to commence either with the outermost constituent, namely, the outer whorl of the calyx, or with the innermost constituent, namely, the inner whorl of the andrœcium. For reasons of a morphological character, which will be seen later, the inner whorl of the andrœcium has been chosen as the starting point.

TABLE O. Correlation Coefficients between the Number of Various Organs and their Constituents.

| Characters | Table | r |
|---|---|---|
| The outer whorl of the calyx and the inner whorl of the calyx | I | .1957 |
| The outer whorl of the calyx and the corolla | II | .7275 |
| The outer whorl of the calyx and the outer whorl of the andrœcium | III | .5886 |
| The outer whorl of the calyx and the inner whorl of the andrœcium | IV | .2613 |
| The outer whorl of the calyx and the andrœcium | V | .4371 |
| The inner whorl of the calyx and the corolla | VI | .2476 |
| The inner whorl of the calyx and the outer whorl of the andrœcium | VII | .3229 |
| The inner whorl of the calyx and the inner whorl of the andrœcium | VIII | .3905 |
| The inner whorl of the calyx and the andrœcium | IX | .4592 |
| The calyx and the corolla | X | .6926 |
| The calyx and the outer whorl of the andrœcium | XI | .6245 |
| The calyx and the inner whorl of the andrœcium | XII | .4014 |
| The calyx and the andrœcium | XIII | .5721 |
| The corolla and the outer whorl of the andrœcium | XIV | .4762 |
| The corolla and the inner whorl of the andrœcium | XV | .1773 |
| The corolla and the andrœcium | XVI | .3174 |
| The outer whorl of the andrœcium and the inner whorl of the andrœcium | XVII | .1984 |
| The inner whorl of the andrœcium, position 3.4, and the inner whorl of the andrœcium, position 5.6 | XVIII | .4305 |
| The outer whorl of the calyx and the inner whorl of the andrœcium, position 3.4 | XIX | .4646 |
| The outer whorl of the calyx and the inner whorl of the andrœcium, position 5.6 | XX | .2134 |
| The inner whorl of the calyx and the inner whorl of the andrœcium, position 3.4 | XXI | .4634 |
| The inner whorl of the calyx and the inner whorl of the andrœcium, position 5.6 | XXII | .2519 |
| The corolla and the inner whorl of the andrœcium, position 3.4 | XXIII | .2558 |
| The corolla and the inner whorl of the andrœcium, position 5.6 | XXIV | .0903 |
| The outer whorl of the andrœcium and the inner whorl of the andrœcium, position 3.4 | XXV | .2661 |
| The outer whorl of the andrœcium and the inner whorl of the andrœcium, position 5.6 | XXVI | .1539 |

(a) The inner whorl of the andrœcium.

From the standpoint of the systematic botanist the most anomalous constituent of the cruciferous flower is the inner whorl of the andrœcium, inasmuch as in each of the positions where one stamen would naturally be expected, the presence of two is regarded as typical. It has been explained in a previous section that botanists now usually regard this anomaly as having arisen by collateral chorisis from what was originally a single stamen in ancestral forms.
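Each coefficient in Table O is obtained from a two-way table of frequencies (Tables I to XXVI). As a hedged illustration of that computation, the sketch below takes the coefficients to be ordinary product-moment correlations, which is an assumption, and applies the formula to a small table of hypothetical counts; none of the numbers below are taken from the paper's tables.

```python
import math


def correlation_from_table(row_values, col_values, counts):
    """Product-moment correlation from a two-way frequency table.

    counts[i][j] is the number of flowers with row_values[i] members in one
    constituent and col_values[j] members in the other.
    """
    n = sum(sum(row) for row in counts)
    mean_x = sum(x * sum(row) for x, row in zip(row_values, counts)) / n
    mean_y = sum(
        y * sum(row[j] for row in counts) for j, y in enumerate(col_values)
    ) / n
    sxx = syy = sxy = 0.0
    for x, row in zip(row_values, counts):
        for y, f in zip(col_values, row):
            sxx += f * (x - mean_x) ** 2
            syy += f * (y - mean_y) ** 2
            sxy += f * (x - mean_x) * (y - mean_y)
    return sxy / math.sqrt(sxx * syy)


# HYPOTHETICAL counts, for illustration only (not data from the paper).
HYPOTHETICAL_COUNTS = [
    [10, 5],   # 1 member in the first constituent
    [5, 80],   # 2 members in the first constituent
]

r = correlation_from_table([1, 2], [1, 2], HYPOTHETICAL_COUNTS)

if __name__ == "__main__":
    print(f"r = {r:.4f}")
```

A table in which most flowers are of the accepted type in both constituents, as here, can still give a substantial coefficient, because the few departures from type tend to occur together.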
For the sake of conciseness and in order to avoid unnecessary repetition the following abbreviations have been used in Tables P to X.

O. W. Ca. = Outer whorl of the calyx.
I. W. Ca. = Inner whorl of the calyx.
Ca. = Calyx.
Co. = Corolla.
O. W. A. = Outer whorl of the andrœcium.
I. W. A. = Inner whorl of the andrœcium.
A. = Andrœcium.

The following Table, P, gives the correlation coefficients between the I. W. A. and the other constituents or organs of the flower in order of position.

TABLE P. I. W. A. and the other Constituents.

| Constituent or Organ | Table | Correlation |
|---|---|---|
| O. W. Ca. | IV | .2613 |
| I. W. Ca. | VIII | .3905 |
| Ca. | XII | .4014 |
| Co. | XV | .1773 |
| O. W. A. | XVII | .1984 |

The highest correlation between the inner whorl of the andrœcium and the other constituents or organs is that with the calyx; next in order come the inner whorl of the calyx, the outer whorl of the calyx, the outer whorl of the andrœcium, and lastly the corolla. In other words, we should be better able to predict the number of stamens in the inner whorl of the andrœcium from the number of members in the calyx than from the number of members in any other constituent or organ.

(b) Relations between the organs themselves.

Having thus discussed the relations of the inner whorl of the andrœcium with the other organs and constituents, we may usefully proceed to determine the "organic correlation" existing between the various organs themselves. In this connection we have to consider the calyx, the corolla and the andrœcium, and for this purpose the correlation Tables X, XIII and XVI have been prepared. The character which has been selected for this study is the number of members in each organ. The following Table (Q) shows the results obtained:

TABLE Q. Correlation Coefficients between

| Organs | r |
|---|---|
| Ca. and Co. | .6926 |
| Ca. and A. | .5721 |
| Co. and A. | .3174 |

(1) The calyx and corolla are much more highly correlated with one another than is either of these with the andrœcium. In other words, the two protective organs of the perianth are more highly correlated with one another than is either protective organ with the male reproductive organ. It is further evident that (2) the calyx is much more highly correlated with both the corolla and the andrœcium than are the two last named with one another. From (1) it may be concluded that, on an average, an increase or decrease from the accepted typical number, namely four, of petals in the corolla is accompanied by an increase or decrease in the number of sepals in the calyx; while from (2) an increase or decrease in the number of stamens in the andrœcium will be accompanied, on an average, by a greater increase or decrease in the number of sepals than in the number of petals.

(c) Relations between the constituents of organs.

The constituents of (1) the calyx and (2) the andrœcium will now be considered.

(1) Calyx. The outer and inner whorls of this organ are inserted at different levels and have a decussate arrangement, so that, although the organ as a whole is protective in function, the two whorls actually help to enclose the flower at right angles to one another. The correlation between these two whorls is an extremely low one, namely, .1957 (Table I); in other words, an increase or decrease in the number of sepals in either of the whorls of the calyx is associated only in a very small degree with an increase or decrease in the number of sepals in the other whorl. Or again it may be expressed thus: the two whorls of the calyx vary to a great extent independently of one another. This statement should be taken in conjunction with that made on p.
244 with regard to their Means and Variabilities and should also be borne in mind when the correlations between these two constituents and the other parts of the flower are discussed below (see Tables R and S).

(2) Andrœcium. This organ is also composed of two whorls, an outer and an inner, inserted at different levels. Its function is of course reproductive. The correlation between the two constituents is very low, namely, .1984 (see Table XVII), and is almost the same as that between the two whorls of the calyx. The inner whorl of the andrœcium shows greater variability than the outer whorl and tends to vary independently of this latter constituent, just as in the case of the two whorls of the calyx.

Having thus considered the organs per se, let us now compare the correlations between each individual constituent or organ and all the other constituents or organs. For this purpose it will be necessary to tabulate the results in series and consequently it might be well to start with the outermost constituent of the flower, namely, the outer whorl of the calyx, and tabulate the correlation coefficients passing inwards to the andrœcium. The inner whorl of the calyx will next be taken in relation to the other constituents and so on.

TABLE R. (d) Correlation Coefficients between the Outer Whorl of the Calyx and

| 2nd Component | Table | r |
|---|---|---|
| I. W. Ca. | I | .1957 |
| Co. | II | .7275 |
| O. W. A. | III | .5886 |
| I. W. A. | IV | .2613 |
| A. | V | .4371 |

From the above table it will be seen that the outer whorl of the calyx is most highly correlated with the corolla; it is also highly correlated with the outer whorl of the andrœcium but much less so with the inner whorl of the andrœcium.

TABLE S. (e) Correlation Coefficients between the Inner Whorl of the Calyx and

| 2nd Component | Table | r |
|---|---|---|
| Co. | VI | .2476 |
| O. W. A. | VII | .3229 |
| I. W. A. | VIII | .3905 |
| A. | IX | .4592 |

The low correlation between the inner whorl of the calyx and the corolla is due to the close adherence of the former to type, that is, there is very small variability.

TABLE T. (f) Correlation Coefficients between the Calyx and

| 2nd Component | Table | r |
|---|---|---|
| Co. | X | .6926 |
| O. W. A. | XI | .6245 |
| I. W. A. | XII | .4014 |
| A. | XIII | .5721 |

There is a higher degree of correlation between the two organs of the perianth than between the calyx and the andrœcium. The high correlation between the calyx and the outer whorl of the andrœcium is mainly due to the high value obtained for the correlation between the outer whorl of the calyx and the outer whorl of the andrœcium.

TABLE U. (g) Correlation Coefficients between the Corolla and

| 2nd Component | Table | r |
|---|---|---|
| O. W. A. | XIV | .4762 |
| I. W. A. | XV | .1773 |
| A. | XVI | .3174 |

The corolla is much more highly correlated with the outer whorl than with the inner whorl of the andrœcium, and the correlation between the corolla and the andrœcium as a whole is not very great.

A comparison of Tables T and U shows that there is a much greater correlation between the calyx and the andrœcium and its two whorls than between the corolla and the same constituents.

So far we have considered the relationships between the different parts of the flower from the outside inwards, but when we examine these relationships, taking the inner whorls as our starting point, some new aspects of the problem become manifest and, as these have been of great value in the interpretation of the results, it has been considered advisable to tabulate them thus:

TABLE V. (h) Correlation Coefficients between the Inner Whorl of the Andrœcium and

| 2nd Component | Table | r |
|---|---|---|
| Ca. | XII | .4014 |
| I. W. Ca. | VIII | .3905 |
| O. W. Ca. | IV | .2613 |
| O. W. A. | XVII | .1984 |
| Co. | XV | .1773 |

TABLE W.
(i) Correlation Coefficients between the Outer Whorl of the Andrœcium and

| 2nd Component | Table | r |
|---|---|---|
| Ca. | XI | .6245 |
| O. W. Ca. | III | .5886 |
| Co. | XIV | .4762 |
| I. W. Ca. | VII | .3229 |

A comparison of Tables V and W shows that the correlations between the outer whorl of the andrœcium and the other components are higher than for the inner whorl of the andrœcium, except in the case of the inner whorl of the calyx.

TABLE X. (j) Correlation Coefficients between the Andrœcium and

| 2nd Component | Table | r |
|---|---|---|
| Ca. | XIII | .5721 |
| I. W. Ca. | IX | .4592 |
| O. W. Ca. | V | .4371 |
| Co. | XVI | .3174 |

This table shows that when the andrœcium is considered as a whole it is most highly correlated with the calyx and least correlated with the corolla.

V. MORPHOLOGICAL SIGNIFICANCE OF THE STATISTICAL RESULTS.

It is quite clear from the tabulated results that there is a definite departure from the usually accepted cruciferous structure in a very large number of the flowers of Lepidium Draba which have been examined for this study. This does not obtain merely in any one organ or constituent but in all the organs and constituents, although not to the same degree in each.

The statistical results will now be examined from the standpoint of the botanist in order (a) to note their morphological or genetic significance and (b) to see whether these figures throw any light on the evolution of this cruciferous plant.

It is almost axiomatic to state that the "purpose" of a flower is a purely reproductive one and that therefore its existence is justified only in so far as it serves to reproduce its kind. But not all the parts of a flower are solely reproductive in function. Each individual consists of two parts, (1) Reproductive, (2) Protective.

(1) The reproductive organs are the gynæcium (♀) and the andrœcium (♂), while

(2) The protective organs (perianth) are the corolla and the calyx.
One of the organs of the perianth, namely the corolla, is still further specialised. The calyx consists of four sepals, green in colour, whose sole function is to protect the flower when in the bud, and in many cases these are reflexed immediately after the flower has opened up, and are of no further importance to it. On the other hand the petals, though essentially sepal-like in structure, in this as in the great majority of flowers, are not green but of some other colour. In the species under consideration they are white. Now although the petals are of great importance in protecting the reproductive organs while in the bud, their utility does not cease when the flower opens; along with small nectaries at the base of the stamens, they serve as an attraction for insects whose visits are essential for cross-fertilisation.

The reproductive organs of what is regarded as the typical cruciferous flower consist of (1) the gynæcium, which is composed of two carpels, and (2) the andrœcium, which is composed of six stamens.

The stamens are delicate structures and do not hold an isolated position in the flower. When in the bud and immature they are subject to external influences, for example, (1) they might be shrivelled up by the heat of the sun, (2) they might be blasted by rain or wind or (3) they might be attacked by herbivorous insects, so that the protective perianth plays an important part in flower economics. Now what does an increase in the number of stamens imply? It is obvious that if the number of stamens is increased the total volume occupied by the reproductive organs is increased and consequently a tax is put upon the protective organs if they are to fulfil their function adequately.
If the perianth does not respond to this tax from space considerations, the reproductive organs stand a small chance of ever fulfilling their function, so that one would naturally expect that variation of some kind in the perianth would follow variation in the reproductive organs.

Another important point which must never be lost sight of when interpreting the statistical results is the symmetry of the cruciferous flower. The calyx consists of two whorls each with two sepals; the corolla of one whorl of four petals; and the andrœcium of two whorls of stamens, the outer having two members and the inner four members (see Plate I, fig. 7). Consequently a cruciferous flower is bilaterally symmetrical only on that vertical plane which passes through the division wall of the carpels, between each of the pairs of stamens in the inner whorl, between two petals on either side and through the middle of the outer pair of sepals. This plane may be referred to as the "plane of symmetrical division." Owing to the fact that the corolla consists of only one whorl, the outer whorl of the calyx corresponds in position to the inner whorl of the andrœcium, and the inner whorl of the calyx to the outer whorl of the andrœcium.

From a study of the Means and Standard Deviations of the various organs and constituents we arrive at the following conclusions:

Calyx.

(1) The greatest approach to constancy in number in the whole flower is in the inner whorl of the calyx.

(2) There is a significant difference between the means of the two whorls.

(3) There is much greater variability in the outer than in the inner whorl of the calyx and on an average the outer whorl possesses fewer sepals.

(4) There is a tendency towards a reduction from type in the number of sepals in the calyx.

Corolla.

(5) There is a tendency towards a reduction from the accepted typical number in the number of petals in the corolla.

Andrœcium.
(6) There is a significant difference between the means of the distributions of (a) the members of the two whorls of the andrœcium, (b) the members of the two positions in the inner whorl, and (c) the members of the two positions in the outer whorl.

(7) From whatever axis we view the andrœcium as an organ it is distinctly asymmetrical in the distribution of its functioning stamens.

(8) There is much greater variability in the inner whorl than in the outer whorl of the andrœcium.

(9) In both positions in the inner whorl of the andrœcium there is a tendency towards a reduction from the accepted typical number of stamens, and in the position where this is most marked there is the greatest variability.

The interpretation of these results is not at first sight very evident. Why should there be a tendency towards a reduction in the number of members in the different organs of the flower and why should this tendency be most marked in the inner whorl of the andrœcium? As has already been pointed out, all the flowers examined were taken from a single plant which gave rise to new stems by means of buds on the roots. May this tendency to reduction in the parts of the flower whose function is sexual reproduction not be an expression of a tendency towards an elimination of sexual in favour of vegetative reproduction? Another phenomenon which lends support to this hypothesis is the fact that in this plant the percentage of "pods" which attain maturity is extremely small.

Whether there is or is not a tendency towards vegetative reproduction, may we not also have here a harking back towards an ancestral form in which the number was less than the at present accepted typical number?
In fact one would expect that if the present constitution of the inner whorl of the andrœcium had been most recent in development, reversion would first take place in it, and conversely one might reasonably conclude that since this whorl shows greatest variability, and most marked tendency to reduction in the number of members, it is more than probable that its present constitution was arrived at by an increase in number from a more primitive type.

Let us now examine the deductions made from a study of the correlation coefficients and see if they have any morphological interpretation.

(1) The calyx and corolla are more highly correlated with one another than is either of these with the andrœcium.

(2) The calyx is more highly correlated with the andrœcium than is the corolla.

In other words, the two protective parts are more intimately associated in increase or decrease with one another than is either of these with the male reproductive organs, and further the calyx, which is solely protective in function, is more intimately correlated with the male reproductive organs than is the corolla, which serves as an attraction for insects as well as a protective covering of the bud.

(3) The two whorls of the calyx are not highly correlated, i.e. they vary independently of one another.

(4) The two whorls of the andrœcium also are not highly correlated. Morphologically this means that when there are two constituents in one organ, each having the same function, they may vary independently of one another, so that although an increase or decrease in the number in either may be correlated with an increase or decrease in the number in any other constituent of the flower, the same does not hold true with regard to the two constituents themselves.

(5) The outer whorl of the calyx is most highly correlated with the corolla, next with the outer whorl of the andrœcium and lastly with the inner whorl of the andrœcium.
The reason why the outer whorl of the calyx is more highly correlated with the outer whorl than with the inner whorl of the andrœcium is not at first sight very evident, but may be explained on the basis of its protective power. The members of the outer whorl of the andrœcium lie in a plane parallel to that of the outer whorl of the calyx, and are much more widely separated in this plane than are the members of the inner whorl of the andrœcium. Consequently any increase in the number of stamens in the outer whorl would involve a much greater increase in volume within the flower than a corresponding increase in the number of stamens in the inner whorl. Thus we are not surprised to find that such an increase in the outer whorl of the andrœcium is more intimately associated with an increase in the outer whorl of the calyx than a corresponding increase in the inner whorl of the andrœcium would be.

(6) There is very low variability in the inner whorl of the calyx and it is almost equally correlated with the two whorls of the andrœcium. The morphological explanation of these facts follows as a corollary to that given above.

(7) The calyx is much more highly correlated with the andrœcium as a whole and with its two whorls than is the corolla. As we have already said, the calyx is the predominantly protective organ and consequently this higher correlation has a physical basis. The corolla, being partly attractive, does not enter so closely into space economics.

(8) The outer whorl of the andrœcium is more highly correlated with the other components of the flower than is the inner whorl of the andrœcium. This again follows on the basis of space considerations. Any increase in the number of members in the inner whorl of the andrœcium does not involve so radical a change in the volume of the flower as does a corresponding increase in the outer whorl of the andrœcium.

VI. VARIATION IN THE GYNÆCIUM.
So far we have not considered the gynæcium on account of the small number of variations which occur in that organ and from the fact that these do not lend themselves to statistical treatment.

The gynæcium consists typically of two carpels which are flattened in a vertical plane parallel to those containing the pairs of stamens in the inner whorl of the andrœcium. The thin partition wall separating the two carpels therefore stands at right angles to this plane.

Now when we examine the different types of variations in the structure and number of the carpels we find the following: (1) a single carpel, (2) two carpels (typical), (3) three carpels, (4) four carpels, (5) two sets of two carpels within a single perianth, (6) two sets of two carpels within separate perianths but on one pedicel.

Let us now proceed to examine each of these in some detail.

(1) The gynæcium consists of a single carpel (see Figs. LXXXVIII to XCII).

In all these cases, except LXXXVII, as will be at once seen by reference to the figures, the suppression of a carpel is accompanied by the suppression of some of the members of nearly all the other organs, thus:

In LXXXVIII two petals are absent and one stamen is aborted.
In LXXXIX one sepal, two petals and two stamens are absent.
In XC, XCI and XCII all the organs are deficient in members.

A noteworthy phenomenon in this respect also is that the suppression of members which accompanies the suppression of a carpel is usually in the vertical plane which passes through the plane of separation of the carpels.

(2) The gynæcium consists of two carpels.

This is the accepted typical structure and the statistical study deals with these in detail.

(3) The gynæcium consists of three carpels (see Figs. XCIII and CII).

When three carpels occur in the gynæcium they are never found co-laterally, i.e. the additional carpel is never found with its origin at the side of a carpel, but always arising from the plane of separation, which is in the plane of greatest variability.

(4) The gynæcium consists of four carpels (see Fig. CI).

Just as in the previous case the increase in the number of carpels takes place in the plane of separation of the carpels, one on either side, so that a cruciate structure is found. A reference to Fig. CI will make this clear. In both of these groups it will be evident that an increase in the female reproductive organs is associated not only with an increase in the male reproductive organs but also with an increase in the protective organs or perianth.

(5) The gynæcium consists of two sets of carpels within a single perianth.

This is rather an anomalous group but is extremely interesting inasmuch as it contains a series of annectant forms linking group 2 to group 6. What we actually have here is a complete reduplication of the reproductive organs encased within a single series of protective organs. In some of the flowers examined with this structure it was rather difficult to determine the orientation owing to a torsion of the thalamus, but in the types figured on Plates IX and X (Fig. XCIV and Figs. XCV et seq.) the mode of origin of these is quite evident.

Several important observations on these forms may be stated.

(a) There are really two complete sets of reproductive organs and in one case (see Fig. XCVII) each of these is of the typical cruciferous structure.

(b) Increase in the number of the reproductive organs is accompanied by an increase in the number of members in the protective organs.

(c) The increase in the number of members of the reproductive organs is for the most part in the plane of division of the carpels, in other words, in the outer whorl of the calyx and its associated petals.
(d) This is also the plane along which the separation of the reproductive organs has taken place.

(e) This plane is the one which we have already shown in the statistical part to be the plane of greatest variability.

(6) The gynæcium consists of two sets of two carpels within separate perianths but on one pedicel.

In this group we reach the limit of variability in the material examined. In place of a single flower consisting of calyx, corolla, andrœcium and gynæcium we actually find two complete sets of all these organs on one pedicel (see Figs. CIII to CVIII), while in one case (Fig. CIII) each of the two flowers has the typical cruciferous structure, so that were each of these separately examined it would undoubtedly be regarded as a normal flower. Yet we must bear in mind that, botanically considered, one flower and one flower only arises from a pedicel. Were this, therefore, an isolated example, and if no annectant forms existed, the departure might well be regarded as a "mutation," but a consideration of the numerous variations which we have already considered, taken in conjunction with group 5, only serves to emphasise the fact that "the vertical plane which passes through the partition wall of the two carpels and consequently separates the individuals of the pairs of stamens in the inner whorl and passes through the centres of the sepals of the outer whorl of the calyx is a plane along which this flower is in a state of flux and is the plane in which it is probable that the flower has changed, and is still changing, from some quite different ancestral form."

VII. SUGGESTIONS FOR FUTURE STUDIES IN THIS PLANT.

It must be very obvious to anyone who has perused this paper that the results which might be obtained from a study of this plant are by no means exhausted. An attempt, however, has been made to interpret the variability in its flowers, both from a morphological and an evolutionary standpoint.
Studies of a different nature might be undertaken in order to test the results obtained, e.g. : (1) What is the degree of fertility in the flowers of this plant? For this purpose it would be necessary to find the percentage of flowers which produce fertile seed. (2) What are the variants, if any, which are associated with infertility ? (3) What are the characters of the flowers which are produced from the seeds of the different variants? If seeds selected from the different variants were grown separately and self-fertilised, one could trace the variations in the flowers of the next generation and see to what extent the different variations were transmitted. This study is capable of much elaboration and is one which would be fraught with great possibilities. It seems to involve a satisfactory method of determining how far these variations are concerned in plant economics, and also to what extent they have been instrumental in the evolution of the Order Crucifere. EXPLANATION OF FIGURES 8, 9 AND 10. PLATE I. FIGURE 8. (a) Typical stamen (outer whorl). (b) Stamen with half-length filament and complete anther (outer whorl). (c) Typical stamen (inner whorl). (d) Non-chorised stamen with two complete anthers (inner whorl). (e) Stamen of inner whorl with two complete anthers but only chorised in the upper half. FIGURE 9. ) Aborted stamen of outer whorl, i.e. filament with no anther. (b) Absence of stamen in outer whorl. (c) Full-length filament in inner whorl with no anther. (d) Half-length filament in inner whorl with complete anther. (e) Half-length filament in inner whorl with no anther, (f) Non-chorised stamen in inner whorl with half-length filament but with two complete anthers. FIGURE 10. ) Normal sepal. (b) Sepal divided almost to the very base. (c) Sepal completely divided into two distinct sepals. (d) Sepal absent. (e) Normal petal. (f) Petal divided almost to the very base. (g) Aborted petal. (h) Petal absent. Outer Whorl of Andrcecium. Inner Whorl of Calyx. 
[Tables I and III (Outer Whorl of Calyx, number of sepals) and the plate of Figures XCVI, CIII A-B, CIV B, CVI A-B and CVII A-B: contents not recoverable.]

ONCE MORE ON "THE ELIMINATION OF SPURIOUS CORRELATION DUE TO POSITION IN TIME OR SPACE."

By O. ANDERSON, St. Petersburg, Russia.

1. In the April number of Biometrika "Student" showed* that the procedure proposed by Cave and Hooker admits of a generalisation. That procedure frees the correlation coefficient of two oscillating variables from the evolutionary element by computing first differences, that is, by replacing the series $x_1, x_2, \dots, x_n$ by the series

$$\Delta' x_1 = x_1 - x_2,\quad \Delta' x_2 = x_2 - x_3,\quad \dots,\quad \Delta' x_{n-1} = x_{n-1} - x_n .$$

Strictly speaking, the procedure is correct only when the evolutionary component can be represented by a linear equation. If this is not so, for instance if the component satisfies only a parabolic equation of higher order, one must take second, third, etc. differences (thus instead of $\Delta' x_1, \Delta' x_2, \dots$ one takes $\Delta'' x_1 = \Delta' x_1 - \Delta' x_2$, $\Delta'' x_2 = \Delta' x_2 - \Delta' x_3$, etc.) and compute correlation coefficients from these. The coefficients may soon reach a constant limiting value, which is the result sought.

The undersigned reached similar conclusions about two years ago, but for reasons beyond his control he has until now been prevented from printing his paper on the subject. Since his investigation follows paths very different from those of "Student," and also reaches some conclusions which appear to have remained unknown to the latter, a brief account of its most important results may be of some interest to the readers of Biometrika.

2. Method.
The English statistical school neglects in its investigations a procedure which is often applied by Russian and German scholars (Tchebycheff, Markoff, v. Bortkiewicz, etc.) and which, besides great rigour and exactness, has the advantage of being quite elementary: the method of mathematical expectation. The mathematical expectation of a quantity $A$ is, as is well known, the product of this quantity and its probability $w$, that is $Aw$. If a variable can take a series of mutually exclusive values, its mathematical expectation is defined as the sum of the expectations of all these values. We shall denote mathematical expectation throughout by the symbol $E(\ )$; $E(A)$ is thus, for example, equal to $Aw$.

* Biometrika, Vol. X. Part 1, p. 179, "The Elimination of Spurious Correlation due to Position in Time or Space." By "Student."

The principal theorems on mathematical expectations may be assumed to be known. To make the formulae of this paper easier to check, however, we briefly indicate those most important for us:

(1) $E(x + y - z + u - t \dots) = E(x) + E(y) - E(z) + E(u) - E(t) \dots$

(2) If $x, y, z, \dots$ are independent of one another, then $E(x \cdot y \cdot z \dots) = E(x) \cdot E(y) \cdot E(z) \dots$

(3) $E(kx) = kE(x)$, where $k$ is a constant; and hence also [relation illegible in the source].

(4) If a variable $X$ can take the values $x_1, x_2, \dots, x_n$, then the probability $W$ that the difference $x_i - E(x)$ lies between the limits

$$-a\sqrt{E(x^2) - [E(x)]^2} \quad\text{and}\quad +a\sqrt{E(x^2) - [E(x)]^2}$$

is greater than $1 - \dfrac{1}{a^2}$, where $a$ must be greater than 1 (a theorem of Tchebycheff).

In our investigation we shall everywhere compute the mathematical expectation of a quantity instead of its most probable value. We first determine how the correlation coefficient of two oscillating series behaves when for their terms we substitute the differences $\Delta' x, \Delta'' x, \Delta''' x, \dots$ and $\Delta' y, \Delta'' y, \Delta''' y, \dots$, and we then examine the limits of applicability of the generalised Cave-Hooker method. To save space we give only the final results of the calculations, except for the first three formulae, whose derivation may serve as an example of the method of calculation.

3. Definition.

By an oscillatory series we shall understand a series

$$x_1, x_2, x_3, \dots, x_i, \dots, x_n$$

for which

$$E(x_1) = E(x_2) = \dots = E(x_i) = \dots = E(x_n) = E(x) = \text{const.}$$

and in which all the individual terms are completely independent of one another, so that $E(x_i x_j) = E(x_i) \cdot E(x_j)$ for $i \neq j$. Such conditions would be satisfied, for example, by a series whose terms are the results of a series of trials with constant probability, say the results of drawings from an urn with $m$ white and $n$ black balls.

4. Mean square error.

If we denote $x_i - E(x)$ by $\xi_i$, then

$$E(\xi_i) = E[x_i - E(x)] = E(x) - E(x) = 0 .$$

Since the individual $\xi$ are completely independent of one another, $E(\xi_i \xi_j) = E(\xi_i) \cdot E(\xi_j) = 0$.

The mean square error of the series $x$ is $\dfrac{1}{n}\sum_1^n [x_i - E(x)]^2$. Its mathematical expectation we shall call (not quite in agreement with the usual notation) $\sigma_x^2$:

$$\sigma_x^2 = E\left[\frac{1}{n}\sum_1^n \big(x_i - E(x)\big)^2\right] = \frac{1}{n}E\left[\sum_1^n x_i^2\right] - \frac{2E(x)}{n}E\left[\sum_1^n x_i\right] + [E(x)]^2 = E(x^2) - [E(x)]^2 ,$$

an expression which stands under the square root sign in theorem 4 (§ 2) above. On the other hand $\dfrac{1}{n}\sum_1^n [x_i - E(x)]^2$ is equal to $\dfrac{1}{n}\sum_1^n \xi_i^2$, and hence $\sigma_x^2 = E(\xi^2)$.

Let us examine the expression $E\left[\sum_1^n (x_i - M_x)^2\right]$, where $M_x$ denotes the arithmetic mean of the series $x$, that is $\dfrac{1}{n}\sum_1^n x_i$. Since

$$x_i - M_x = E(x) + \xi_i - \frac{[E(x) + \xi_1] + [E(x) + \xi_2] + \dots + [E(x) + \xi_n]}{n} = \xi_i - M_\xi$$

(writing $M_\xi$ for $\dfrac{1}{n}\sum_1^n \xi_i$), we have

$$E\left[\sum_1^n (x_i - M_x)^2\right] = E\left[\sum_1^n (\xi_i - M_\xi)^2\right] = E\left[\sum_1^n \xi_i^2 - 2M_\xi\sum_1^n \xi_i + nM_\xi^2\right] = nE(\xi^2) - nE(M_\xi^2) = nE(\xi^2) - n\cdot\frac{E(\xi^2)}{n} = (n-1)E(\xi^2) .$$

To obtain $E(\xi^2)$ one must therefore divide this expression by $(n - 1)$.
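The result $E[\sum (x_i - M_x)^2] = (n-1)E(\xi^2)$ can be checked exactly for the urn example of § 3. The following is a minimal sketch, assuming drawings with replacement (so that the probability stays constant) and a hypothetical urn of 3 white and 2 black balls; the function name is illustrative only and does not come from the paper.

```python
from fractions import Fraction
from itertools import product


def expected_sum_sq_dev(n, p):
    """Exact E[sum (x_i - M_x)^2] for n independent 0/1 draws with P(x = 1) = p."""
    total = Fraction(0)
    for xs in product((0, 1), repeat=n):      # every possible series of n draws
        prob = Fraction(1)
        for x in xs:
            prob *= p if x == 1 else 1 - p    # independence: probabilities multiply
        mean = Fraction(sum(xs), n)           # M_x, the arithmetic mean of this series
        total += prob * sum((x - mean) ** 2 for x in xs)
    return total


# Hypothetical urn: 3 white (x = 1) and 2 black (x = 0) balls, drawn with replacement.
p = Fraction(3, 5)
n = 6
sigma2 = p - p * p                # E(x^2) - [E(x)]^2 for a 0/1 variable
lhs = expected_sum_sq_dev(n, p)   # expectation of the sum of squared deviations
rhs = (n - 1) * sigma2            # (n - 1) E(xi^2)
print(lhs, rhs)
```

Because the enumeration uses exact fractions, the two printed values can be compared for equality rather than only approximately.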
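We therefore also have

$$\sigma_x^2 = E\left[\frac{\sum_1^n (x_i - M_x)^2}{n-1}\right] .$$

To obtain the square error $\sigma^2_{\Delta' x}$ of the first difference $\Delta' x$, we note that $\Delta' x_i = (x_i - x_{i+1}) = (\xi_i - \xi_{i+1})$ and $E(\Delta' x_i) = E(\xi_i) - E(\xi_{i+1}) = 0$. Hence

$$\sigma^2_{\Delta' x} = E\left[\frac{\sum_1^{n-1} (\Delta' x_i)^2}{n-1}\right] = \frac{1}{n-1}E\left[\sum_1^{n-1} (\xi_i - \xi_{i+1})^2\right] = \frac{1}{n-1}\left\{E\left[\sum_1^{n-1}\xi_i^2\right] - 2E\left[\sum_1^{n-1}\xi_i\xi_{i+1}\right] + E\left[\sum_1^{n-1}\xi_{i+1}^2\right]\right\} = \frac{1}{n-1}\left\{(n-1)E(\xi^2) - 0 + (n-1)E(\xi^2)\right\} = 2E(\xi^2) .$$

Thus $\sigma^2_{\Delta' x} = 2\sigma_x^2$.

By the same scheme of calculation the mean square error is found to be $6\sigma_x^2$ for the second difference $\Delta'' x_i$, $20\sigma_x^2$ for the third difference $\Delta''' x_i$, $70\sigma_x^2$ for the fourth difference $\Delta^{iv} x_i$, and $\dfrac{(2k)!}{k!\,k!}\sigma_x^2$ for the $k$-th difference $\Delta^{(k)} x_i$.

We can therefore set up the equation

$$\sigma_x^2 = \frac{\sigma^2_{\Delta' x}}{2} = \frac{\sigma^2_{\Delta'' x}}{6} = \frac{\sigma^2_{\Delta''' x}}{20} = \frac{\sigma^2_{\Delta^{iv} x}}{70} = \dots = \frac{\sigma^2_{\Delta^{(k)} x}}{\dfrac{(2k)!}{k!\,k!}} \qquad (1)$$

which is exact, and the equation

$$\frac{\sum_1^n (x_i - M_x)^2}{n-1} = \frac{\sum_1^{n-1} (\Delta' x_i)^2}{2(n-1)} = \frac{\sum_1^{n-2} (\Delta'' x_i)^2}{6(n-2)} = \frac{\sum_1^{n-3} (\Delta''' x_i)^2}{20(n-3)} = \frac{\sum_1^{n-4} (\Delta^{iv} x_i)^2}{70(n-4)} = \dots = \frac{\sum_1^{n-k} (\Delta^{(k)} x_i)^2}{\dfrac{(2k)!}{k!\,k!}(n-k)} \qquad (1a)$$

which is only approximately correct*.

* It is more advantageous to compute $\dfrac{\sum_1^{n-k} (\Delta^{(k)} x_i)^2}{n-k}$ and not $\dfrac{\sum_1^{n-k} (\Delta^{(k)} x_i - M_{\Delta^{(k)} x})^2}{n-k-1}$.

5. The mean product.

If two series $x_1, x_2, x_3, \dots, x_n$ and $y_1, y_2, y_3, \dots, y_n$ are both oscillatory in the sense of § 3, and a correlation can exist only between terms with equal indices, that is between $x_1$ and $y_1$, $x_2$ and $y_2$, $x_i$ and $y_i$, etc., then it is easy to see that

$$E\{[x_i - E(x)][y_j - E(y)]\} = 0 \quad\text{when } i \neq j .$$

If we denote $y_i - E(y)$ by $\eta_i$ and $x_i - E(x)$ again by $\xi_i$, we easily find the following expressions:

$$p_{xy} = E\left[\frac{\sum_1^n \xi_i\eta_i}{n}\right] = E(\xi_i\eta_i), \qquad E\left[\frac{\sum_1^n (x_i - M_x)(y_i - M_y)}{n-1}\right] = E(\xi_i\eta_i) = p_{xy},$$

$$p_{\Delta' x,\,\Delta' y} = E\left[\frac{\sum_1^{n-1} \Delta' x_i\,\Delta' y_i}{n-1}\right] = 2p_{xy}, \qquad \dots, \qquad p_{\Delta^{(k)} x,\,\Delta^{(k)} y} = E\left[\frac{\sum_1^{n-k} \Delta^{(k)} x_i\,\Delta^{(k)} y_i}{n-k}\right] = \frac{(2k)!}{k!\,k!}\,p_{xy} .$$

We can therefore again set up two systems of equations:

$$p_{xy} = \frac{p_{\Delta' x,\,\Delta' y}}{2} = \frac{p_{\Delta'' x,\,\Delta'' y}}{6} = \frac{p_{\Delta''' x,\,\Delta''' y}}{20} = \frac{p_{\Delta^{iv} x,\,\Delta^{iv} y}}{70} = \dots = \frac{p_{\Delta^{(k)} x,\,\Delta^{(k)} y}}{\dfrac{(2k)!}{k!\,k!}} \qquad (2)$$

and

$$\frac{\sum_1^n [x_i - E(x)][y_i - E(y)]}{n} = \frac{\sum_1^n (x_i - M_x)(y_i - M_y)}{n-1} = \frac{\sum_1^{n-1} \Delta' x_i\,\Delta' y_i}{2(n-1)} = \frac{\sum_1^{n-2} \Delta'' x_i\,\Delta'' y_i}{6(n-2)} = \frac{\sum_1^{n-3} \Delta''' x_i\,\Delta''' y_i}{20(n-3)} = \frac{\sum_1^{n-4} \Delta^{iv} x_i\,\Delta^{iv} y_i}{70(n-4)} = \dots = \frac{\sum_1^{n-k} \Delta^{(k)} x_i\,\Delta^{(k)} y_i}{\dfrac{(2k)!}{k!\,k!}(n-k)} \qquad (2a)$$

of which the first is exact and the second approximate, and which are exactly analogous to the equations for $\sigma_x^2$ (and hence also for $\sigma_y^2$) in § 4.

6. The square error of the square errors.

Let us now consider the range within which the quantities of systems (1a) and (2a) fluctuate about their mathematically expected values in (1) and (2). In other words, let us pass (with regard to theorem 4, § 2) to the mathematical expectation of the square errors of these quantities.

For $\dfrac{\sum_1^n [x_i - E(x)]^2}{n}$ the square error is

$$\frac{E(\xi^4) - [E(\xi^2)]^2}{n} .$$

For $\dfrac{\sum_1^n [x_i - M_x]^2}{n-1}$ the square error is

$$\frac{E(\xi^4) - [E(\xi^2)]^2}{n} + \frac{2[E(\xi^2)]^2}{n(n-1)} .$$

For $\dfrac{\sum_1^{n-1} [\Delta' x_i]^2}{2(n-1)}$ the square error is

$$\frac{(2n-3)\{E(\xi^4) - [E(\xi^2)]^2\} + 2(n-1)[E(\xi^2)]^2}{2(n-1)^2} .$$

For $\dfrac{\sum_1^{n-2} [\Delta'' x_i]^2}{6(n-2)}$ the square error is

$$\frac{(9n-23)\{E(\xi^4) - [E(\xi^2)]^2\} + (17n-42)[E(\xi^2)]^2}{9(n-2)^2} .$$

And finally for $\dfrac{\sum_1^{n-k} (\Delta^{(k)} x_i)^2}{\dfrac{(2k)!}{k!\,k!}(n-k)}$ we obtain the rather complicated expression

$$\frac{1}{A_0^2(n-k)^2}\Big\{A_0^2(n-2k)\{E(\xi^4) - [E(\xi^2)]^2\} + 4[E(\xi^2)]^2\big[A_1^2(n-2k+1) + A_2^2(n-2k+2) + A_3^2(n-2k+3) + \dots + A_k^2(n-2k+k)\big] + 2B_0^2\{E(\xi^4) - [E(\xi^2)]^2\} + 8[E(\xi^2)]^2\big[B_1^2 + B_2^2 + B_3^2 + \dots + B_{k-1}^2\big]\Big\} .$$

If $b_0, b_1, b_2, \dots, b_k$ denote the coefficients of the expansion of the binomial $(1+1)^k$, so that $b_0 = 1$, $b_1 = k$, $b_2 = \dfrac{k(k-1)}{1\cdot 2}$, etc., then here

$$A_0^2 = (b_0^2 + b_1^2 + b_2^2 + \dots + b_k^2)^2 = \left[\frac{(2k)!}{k!\,k!}\right]^2 ,$$

$$A_1^2 = (b_0b_1 + b_1b_2 + b_2b_3 + \dots + b_{k-1}b_k)^2 = \left[\frac{(2k)!}{(k-1)!\,(k+1)!}\right]^2 ,$$

$$A_2^2 = (b_0b_2 + b_1b_3 + \dots + b_{k-2}b_k)^2 = \left[\frac{(2k)!}{(k-2)!\,(k+2)!}\right]^2 ,$$

$$\dots\dots$$

$$A_j^2 = (b_0b_j + b_1b_{j+1} + \dots + b_{k-j}b_k)^2 = \left[\frac{(2k)!}{(k-j)!\,(k+j)!}\right]^2 ,$$

$$B_0^2 = (b_0^2)^2 + (b_0^2 + b_1^2)^2 + (b_0^2 + b_1^2 + b_2^2)^2 + \dots + (b_0^2 + b_1^2 + \dots + b_{k-1}^2)^2 ,$$

$$B_1^2 = (b_0b_1)^2 + (b_0b_1 + b_1b_2)^2 + (b_0b_1 + b_1b_2 + b_2b_3)^2 + \dots + (b_0b_1 + b_1b_2 + \dots + b_{k-2}b_{k-1})^2 ,$$

and so on, the further $B_j^2$ being formed in the same way from the products $b_ib_{i+j}$ [their printed definitions are illegible in the source].

When $E(\xi^4) = 3[E(\xi^2)]^2$, as for the normal distribution, the special cases above take the following simpler forms, the approximate values being those legible in the source:

For $\dfrac{\sum (x_i - M_x)^2}{n-1}$: $\dfrac{2\sigma_x^4}{n-1}$.

For $\dfrac{\sum (\Delta' x_i)^2}{2(n-1)}$: $\dfrac{(3n-4)\,\sigma_x^4}{(n-1)^2}$, or approximately $\dfrac{3\sigma_x^4}{n-1}$.

For $\dfrac{\sum (\Delta'' x_i)^2}{6(n-2)}$: $\dfrac{(35n-88)\,\sigma_x^4}{9(n-2)^2}$, or approximately $\dfrac{4\sigma_x^4}{n-2}$.

For $\dfrac{\sum (\Delta''' x_i)^2}{20(n-3)}$: $\dfrac{(231n-843)\,\sigma_x^4}{50(n-3)^2}$, or approximately [coefficient illegible] $\dfrac{\sigma_x^4}{n-3}$.

For $\dfrac{\sum_1^{n-k} (\Delta^{(k)} x_i)^2}{\dfrac{(2k)!}{k!\,k!}(n-k)}$, finally, the square error can be put in the form

$$\frac{2\sigma_x^4}{(n-k)^2}\left\{(n-k) + 2(n-k-1)\left(\frac{k}{k+1}\right)^2 + 2(n-k-2)\left(\frac{k(k-1)}{(k+1)(k+2)}\right)^2 + 2(n-k-3)\left(\frac{k(k-1)(k-2)}{(k+1)(k+2)(k+3)}\right)^2 + \dots + 2(n-k-j)\left(\frac{k(k-1)(k-2)\dots(k-j+1)}{(k+1)(k+2)(k+3)\dots(k+j)}\right)^2 + \dots + 2(n-k-k)\left(\frac{k(k-1)(k-2)\dots 2\cdot 1}{(k+1)(k+2)(k+3)\dots(k+k)}\right)^2\right\} .$$

It is thus clear that with finite differencing the uncertainty in the determination of $\sigma_x^2$ grows steadily, at first roughly in the ratio $\sqrt{2} : \sqrt{3} : \sqrt{4} : \sqrt{5} \dots$

7. The square error of the mean product.

For $\dfrac{\sum_1^n \xi_i\eta_i}{n}$ the square error is

$$\frac{E(\xi^2\eta^2) - [E(\xi\eta)]^2}{n} .$$

For $\dfrac{\sum_1^n (x_i - M_x)(y_i - M_y)}{n-1}$ the square error is

$$\frac{E(\xi^2\eta^2) - [E(\xi\eta)]^2}{n} + \frac{[E(\xi\eta)]^2 + \sigma_x^2\sigma_y^2}{n(n-1)} .$$

For $\dfrac{\sum_1^{n-1} \Delta' x_i\,\Delta' y_i}{2(n-1)}$ the square error is

$$\frac{(2n-3)\{E(\xi^2\eta^2) - [E(\xi\eta)]^2\} + (n-1)\{[E(\xi\eta)]^2 + \sigma_x^2\sigma_y^2\}}{2(n-1)^2} .$$

In the general case $\dfrac{\sum_1^{n-k} \Delta^{(k)} x_i\,\Delta^{(k)} y_i}{\dfrac{(2k)!}{k!\,k!}(n-k)}$ we obtain for the square error the expression

$$\frac{1}{A_0^2(n-k)^2}\Big\{A_0^2(n-2k)\{E(\xi^2\eta^2) - [E(\xi\eta)]^2\} + 2\{[E(\xi\eta)]^2 + \sigma_x^2\sigma_y^2\}\big[A_1^2(n-2k+1) + A_2^2(n-2k+2) + \dots + A_k^2(n-2k+k)\big] + 2B_0^2\{E(\xi^2\eta^2) - [E(\xi\eta)]^2\} + 4\{[E(\xi\eta)]^2 + \sigma_x^2\sigma_y^2\}\{B_1^2 + B_2^2 + \dots + B_{k-1}^2\}\Big\} ,$$

where the coefficients $A_0^2, A_1^2, \dots, B_0^2, B_1^2, \dots$ have the same meaning as in § 6.

If $\xi_i$ and $\eta_i$ are completely identical, then $[E(\xi\eta)]^2 = [E(\xi^2)]^2 = \sigma_x^4$, $E(\xi^2\eta^2) = E(\xi^4)$, $\sigma_x^2\sigma_y^2 = \sigma_x^4$, and the above expression coincides with that of § 6.

For the case of the "normal" distribution all these expressions can also be considerably simplified, especially if $E(\xi\eta)$ and $E(\xi^2\eta^2)$ are expressed as functions of $r_{xy}$. Without entering into this here, we shall now turn to the correlation coefficient itself.

8. Definition of the correlation coefficient.

The correlation coefficient $R$ is usually computed by the formula

$$R = \frac{\sum_1^n (x_i - M_x)(y_i - M_y)}{\sqrt{\sum_1^n (x_i - M_x)^2\,\sum_1^n (y_i - M_y)^2}} .$$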
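The numerical factors of §§ 4 to 6 can be checked with exact arithmetic. The sketch below is not part of the paper: it assumes independent terms, and for the square errors it assumes the normal case $E(\xi^4) = 3\sigma_x^4$; the function names are illustrative. It confirms that the $k$-th difference multiplies the variance by $(2k)!/(k!\,k!)$ and that the general bracketed series of § 6 reproduces the closed forms $(3n-4)/(n-1)^2$, $(35n-88)/9(n-2)^2$ and $(231n-843)/50(n-3)^2$.

```python
from fractions import Fraction
from math import comb


def difference_weights(k):
    """Coefficients of x_i, x_{i+1}, ..., x_{i+k} in the k-th difference."""
    return [(-1) ** j * comb(k, j) for j in range(k + 1)]


def variance_factor(k):
    """Variance of the k-th difference in units of sigma_x^2 (independent terms)."""
    return sum(w * w for w in difference_weights(k))


def normal_square_error(n, k):
    """Square error of sum((D^k x)^2) / (C(2k, k) (n - k)) in units of sigma_x^4,
    assuming E(xi^4) = 3 sigma_x^4 and n > 2k."""
    bracket = Fraction(n - k)
    for j in range(1, k + 1):
        # k(k-1)...(k-j+1) / ((k+1)...(k+j)) written as a ratio of binomial coefficients
        ratio = Fraction(comb(2 * k, k - j), comb(2 * k, k))
        bracket += 2 * (n - k - j) * ratio ** 2
    return 2 * bracket / (n - k) ** 2


factors = [variance_factor(k) for k in range(5)]   # variance factors for k = 0..4
n = 30
checks = {
    1: Fraction(3 * n - 4, (n - 1) ** 2),
    2: Fraction(35 * n - 88, 9 * (n - 2) ** 2),
    3: Fraction(231 * n - 843, 50 * (n - 3) ** 2),
}
agree = all(normal_square_error(n, k) == v for k, v in checks.items())
print(factors, agree)
```

The choice $n = 30$ is arbitrary; any $n > 2k$ gives the same agreement, since both sides are the same rational function of $n$.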
Of which expression, then, is it to be regarded as the empirical approximation:

$$E\left[\frac{\sum_1^n (x_i - M_x)(y_i - M_y)}{\sqrt{\sum_1^n (x_i - M_x)^2\,\sum_1^n (y_i - M_y)^2}}\right] \quad\text{or}\quad \frac{E\left[\sum_1^n (x_i - M_x)(y_i - M_y)\right]}{\sqrt{E\left[\sum_1^n (x_i - M_x)^2\right] E\left[\sum_1^n (y_i - M_y)^2\right]}}\ ?$$

The two formulae are by no means to be identified and coincide only to a first approximation. Since the second is considerably easier to handle, and also corresponds better to the usual methods of calculation of the English school, we define $r_{xy}$ as

$$r_{xy} = \frac{E\left[\sum_1^n (x_i - M_x)(y_i - M_y)\right]}{\sqrt{E\left[\sum_1^n (x_i - M_x)^2\right] \cdot E\left[\sum_1^n (y_i - M_y)^2\right]}} .$$

In other words $r_{xy} = \dfrac{p_{xy}}{\sigma_x\sigma_y}$, where $p_{xy}$, $\sigma_x$, $\sigma_y$ have the meanings assigned to them in §§ 4 and 5 above.

9. The behaviour of the correlation coefficient of two oscillating series $x$ and $y$ when their terms are replaced by differences.

For the $k$-th finite difference of $x$ and $y$ we have

$$r_{\Delta^{(k)} x,\,\Delta^{(k)} y} = \frac{p_{\Delta^{(k)} x,\,\Delta^{(k)} y}}{\sigma_{\Delta^{(k)} x}\,\sigma_{\Delta^{(k)} y}} = \frac{E\left[\sum_1^{n-k} \Delta^{(k)} x_i\,\Delta^{(k)} y_i\right]}{\sqrt{E\left[\sum_1^{n-k} (\Delta^{(k)} x_i)^2\right] \cdot E\left[\sum_1^{n-k} (\Delta^{(k)} y_i)^2\right]}} = \frac{\dfrac{(2k)!}{k!\,k!}\,p_{xy}}{\dfrac{(2k)!}{k!\,k!}\,\sigma_x\sigma_y} = r_{xy} .$$

We thus have quite generally the exact result

$$r_{xy} = r_{\Delta' x,\,\Delta' y} = r_{\Delta'' x,\,\Delta'' y} = r_{\Delta''' x,\,\Delta''' y} = \dots = r_{\Delta^{(k)} x,\,\Delta^{(k)} y} .$$

But these $r$ remain unknown, and for any $r_{\Delta^{(k)} x,\,\Delta^{(k)} y}$ we know only its approximation formula

$$R_k = \frac{\sum_1^{n-k} \Delta^{(k)} x_i\,\Delta^{(k)} y_i}{\sqrt{\sum_1^{n-k} (\Delta^{(k)} x_i)^2\,\sum_1^{n-k} (\Delta^{(k)} y_i)^2}} .$$

We must therefore again determine how far one can rely in practice on the agreement of the empirical coefficients with their mathematical expectations, that is, how large the uncertainty of their determination is to be estimated.

10. Mean square error of the correlation coefficient of the finite differences of two series.
From the formula $R_k = \dfrac{\sum \Delta^{(k)} x_i\,\Delta^{(k)} y_i}{\sqrt{\sum (\Delta^{(k)} x_i)^2\,\sum (\Delta^{(k)} y_i)^2}}$ the following expression can be derived:

$$R_k = r_{\Delta^{(k)} x,\,\Delta^{(k)} y}\left\{1 + \frac{\dfrac{\sum_1^{n-k} \Delta^{(k)} x_i\,\Delta^{(k)} y_i}{n-k} - p_{\Delta^{(k)} x,\,\Delta^{(k)} y}}{p_{\Delta^{(k)} x,\,\Delta^{(k)} y}} - \frac{\dfrac{\sum_1^{n-k} (\Delta^{(k)} x_i)^2}{n-k} - \sigma^2_{\Delta^{(k)} x}}{2\sigma^2_{\Delta^{(k)} x}} - \frac{\dfrac{\sum_1^{n-k} (\Delta^{(k)} y_i)^2}{n-k} - \sigma^2_{\Delta^{(k)} y}}{2\sigma^2_{\Delta^{(k)} y}}\right\} ,$$

which holds only to a first approximation, and only when all the fractions in the expression are proper fractions; this is the case when $n$ is large.

From it there further results the formula

$$\sigma^2_{R_k} = \frac{(1 - r_{xy}^2)^2}{(n-k)^2}\left\{(n-k) + 2(n-k-1)\left(\frac{k}{k+1}\right)^2 + 2(n-k-2)\left(\frac{k(k-1)}{(k+1)(k+2)}\right)^2 + \dots + 2(n-k-k)\left(\frac{k(k-1)(k-2)\dots 2\cdot 1}{(k+1)(k+2)(k+3)\dots(k+k)}\right)^2\right\}$$

(compare the formula for $\dfrac{\sum_1^{n-k} (\Delta^{(k)} x_i)^2}{\dfrac{(2k)!}{k!\,k!}(n-k)}$ in § 6).

The formula for $\sigma^2_{R_k}$ is valid only when the expression

$$\frac{E(\xi^2\eta^2) - [E(\xi\eta)]^2}{[E(\xi\eta)]^2} + \frac{E(\xi^4) - [E(\xi^2)]^2}{4[E(\xi^2)]^2} + \frac{E(\eta^4) - [E(\eta^2)]^2}{4[E(\eta^2)]^2} - \frac{E(\xi^3\eta) - E(\xi\eta)E(\xi^2)}{E(\xi\eta)E(\xi^2)} - \frac{E(\xi\eta^3) - E(\xi\eta)E(\eta^2)}{E(\xi\eta)E(\eta^2)} + \frac{E(\xi^2\eta^2) - E(\xi^2)E(\eta^2)}{2E(\xi^2)E(\eta^2)}$$

may be set equal to $\dfrac{(1 - r_{xy}^2)^2}{r_{xy}^2}$ (cf. Biometrika, Vol. IX. p. 4).

From the formula for $\sigma^2_{R_k}$ we obtain:

$$\sigma^2_{R_0} = \frac{(1 - r_{xy}^2)^2}{n}, \qquad \sigma^2_{R_1} = \frac{(1 - r_{xy}^2)^2}{n-1}\cdot\frac{3n-4}{2(n-1)},$$

$$\sigma^2_{R_2} = \frac{(1 - r_{xy}^2)^2}{n-2}\cdot\frac{35n-88}{18(n-2)}, \qquad \sigma^2_{R_3} = \frac{(1 - r_{xy}^2)^2}{n-3}\cdot\frac{231n-843}{100(n-3)}, \quad\text{etc.}$$

The square errors of the correlation coefficients of successive orders of differences are therefore to one another roughly as $2 : 3 : 4 : 5 \dots$ The uncertainty thus grows with increasing order of differences roughly in the ratio $\sqrt{2} : \sqrt{3} : \sqrt{4} : \sqrt{5} \dots$

11. Correlation coefficient of two composite series consisting of oscillatory and evolutionary elements.

Since "Student" has set out this question aptly, we can be brief. For us the evolutionary component of a series has in practice already vanished when it has become so small relative to the oscillatory component that it can affect only the 3rd, 4th, etc. decimal places of the expression for $R$. Bearing this in mind, we conclude that finite differencing eliminates not only components that can be represented by a parabola of higher order, but also those which satisfy only transcendental equations (e.g. sine series). More than this, it can be proved that all more or less "smooth series," that is, all those in which a sufficient positive correlation between neighbouring terms is noticeable, vanish for practical purposes under finite differencing.

The generalised Cave-Hooker procedure is therefore evidently a very universal means of extracting the correlation of the oscillatory elements from composite series. It has, however, a catch, to which attention must still be drawn here.

12. Can one determine from the behaviour of the series $R_0, R_1, R_2, \dots, R_k$ whether we have before us the correlation coefficient of purely oscillatory series?

"Student" seems to believe that if any $R_j$ is equal to its predecessor $R_{j-1}$, we are certainly dealing with the correlation coefficient of the oscillating elements. Against such a conclusion an emphatic warning must be given. As my calculations (which are somewhat too lengthy for this article) show, two neighbouring coefficients $R_j$, $R_{j+1}$ can be approximately equal even for strongly evolutionary series, and the probability of such a coincidence is by no means to be rated very low. Only when, starting from some $R_j$, we always obtain the same value for $R$, that is $R_j = R_{j+1} = R_{j+2} = R_{j+3} = \dots$, will such a conclusion be justified, and the longer the run of equal $R$, the more probable the conclusion becomes.

STATISTICAL NOTES ON THE INFLUENCE OF EDUCATION IN EGYPT.

By M. HOSNY, M.A., B.Sc.

The statistical returns for Egypt are, as compared with European data, still in a somewhat elementary stage.
Age-distributions are of very little value, and in the case of infantile mortality we have only information for certain towns. Further, in the larger towns there is a considerable cosmopolitan element, which gives them a widely different character from the often sparsely populated rural and desert districts. Education is not compulsory, and schools and literacy are largely confined to Cairo, Alexandria and the Canal Government, even when we exclude all foreign scholars. In the same way criminality* preponderates, in an inverse order it is true, in these three districts, but it is not absolutely certain whether this is due to their more efficient policing, to the presence of more foreigners, or to a real absence of crime in the rural populations.

* We understand by "criminality" in this paper, not commission of but conviction for crime.

Crime does not appear to arise in Egypt from poverty or drunkenness, two of the main factors of its origin in Western Europe. The criminal, indeed, is rarely habitual; he is an amateur, rather than a professional, and criminals are more often well-to-do, their crimes arising from motives of revenge or passion. The fact that criminality in Egypt is highly correlated with literacy and scholarship would be noteworthy and might possibly be used as an argument against education, did not the association of crime and education arise from the prevalence of both in the more populated districts, where again we find the greatest abundance of foreigners. Naturally such questions arise as:

(i) Are the foreigners, and if so which section of them, to any extent responsible for the prevalence of crime in the districts frequented by them?

(ii) If we allow for urban conditions, will there still be found a high association of crime and education?

It is perfectly easy to obtain from the Egyptian Census (we used that of 1907) the number of foreigners of each denomination in the various Egyptian governments.
The only difficulty here was the presence of British troops in Cairo and Alexandria, which placed that nationality in an anomalous position. These were estimated approximately and subtracted. The following groups of foreigners were then dealt with: (a) Ottomans, (b) British subjects, French, Austrians, Germans and Russians*, (c) Greeks, (d) Italians. The Greeks and Italians were separated from the general European group (b), because they are largely differentiated, the Greeks being frequently small traders and the Italians often manual workers. Their large numbers also justified a separate classification.

Table I gives the foreigners per 10,000 in the 17 Egyptian districts we were able to deal with. It will be noted that the Greeks far outnumber other foreigners, but that all foreigners are concentrated in the Cairo, Alexandria and Canal governments.

TABLE I. Foreigners per 10,000 and Population per sq. kilometre.
(? marks an entry illegible in the source.)

Governments | Ottomans | Europeans other than Greeks and Italians | Greeks | Italians | Population per sq. kilometre
Cairo | 453 | 312 | 298 | 204 | 6060
Alexandria | 661 | 514 | 745 | 482 | 6780
Canal† | 416 | 583 | 846 | 445 | 7666
Behera† | 43 | 23 | 31 | 11 | 178
Charkieh | 29 | 5 | 24 | 1 | 257
Dakahlieh and Damietta | 18 | 5 | 18 | 3 | 346
Gharbieh | 2 | 0 | ? | 0 | 226
Kalliuhieh | 8 | 3 | 13 | 2 | 469
Menufieh | 2 | 1 | 7 | 0 | 618
Assiut† | ? | ? | 3 | ? | 454
Assuan | 8 | 5 | 13 | 4 | 533
Beni Suef | 15 | 6 | 9 | 1 | 351
Fayoum | 10 | 3 | 4 | 0 | 255
Gerga | 2 | 0 | 2 | 0 | 532
Guizeh | 6 | 6 | 4 | 33 | 447
Kena† | 5 | 4 | 4 | 2 | 339
Minia† | 10 | 5 | 7 | 1 | 458

It was far more difficult to obtain a measure of urban conditions. We had to take very rough measures of the density of the population, because the limits of certain areas are too vaguely defined to be of any service. El Arish has been excluded from the Canal district; Suez and Sinai have also been excluded, as there is no enumeration of them with respect to criminality, literacy and scholarship.
These densities, with such value as they have, are given in the last column of Table I.

* The contributions from other smaller nationalities were omitted.
† Various approximations and omissions occur in these cases in obtaining density.

Table II provides the number of male criminals per 1000 of the male population, the literacy or number of male persons able to read and write per 100 of the male population*, and the number of male scholars aged 5 to 19 per 100 of the native boys of those ages†.

TABLE II. Educational and Criminal Indices.
(? marks an entry illegible in the source.)

Governments | Male Criminals per 1000 males | Literacy per 100 males | Male Scholars 5-19, per 100 boys of those ages
Cairo | 12.90 | 28.03 | 30.20
Alexandria | 14.15 | 30.09 | 19.99
Canal and El Arish | 22.30 | 23.39 | 8.54
Behera | 5.30 | 9.29 | 1.01
Charkieh | 4.20 | 9.09 | 1.66
Dakahlieh and Damietta | 4.35 | ? | 1.76
Gharbieh | 5.65 | ? | 3.04
Kalliuhieh | 6.65 | 8.13 | 1.39
Menufieh | 3.85 | 8.45 | 1.06
Assiut | 5.45 | 7.01 | 4.02
Assuan | 4.20 | 7.68 | 0.82
Beni Suef | ? | 8.42 | 2.03
Fayoum | 6.85 | 6.54 | 1.85
Gerga | 3.95 | 5.84 | 2.22
Guizeh | 5.10 | 6.38 | 1.15
Kena | 3.45 | 5.34 | 1.66
Minia | 5.20 | ? | ?

We shall use the following symbols to denote the factors which occur in Tables I and II:

O = Ottomans, G = Greeks, I = Italians, E = Europeans other than Greeks and Italians,
C = Criminality, L = Literacy, S = Scholarship, D = Density of Population.

Each government was treated as of equal weight, although the populations vary from 233,000 in Assuan to 1,485,000 in Gharbieh. The standard deviations and product-moments were found without grouping.
The following results were obtained:

Means | Standard Deviations | Correlations
$m_C = 7.015$ | $\sigma_C = 4.791$ | $r_{CL} = +0.8450 \pm 0.0468$
$m_L =$ [illegible] | $\sigma_L =$ [illegible] | $r_{CS} = +0.6242 \pm 0.0999$
$m_S = 5.031$ | $\sigma_S = 7.735$ | $r_{LS} = +0.9028 \pm 0.0308$
$m_D = 1528$ | $\sigma_D = 2475.5$ |

Correlations with density: $r_{DC} = +0.9614 \pm 0.0124$, $r_{DS} = +0.8097 \pm 0.0566$, $r_{DL} = +0.9563 \pm 0.0138$.

* Egyptian Census, 1907, p. 99.
† Foreign male scholars are excluded in the case of Cairo, Alexandria and the Canal. They have no sensible numerical existence elsewhere. Criminals and scholars are taken from the Annuaire Statistique de l'Égypte, 1912, pp. 95 and 135.

Now at first sight these results would seem to indicate a very bad influence of education on crime. Where literacy and scholarship are greatest, there criminals are most numerous! And a superficial argument might be used to condemn the character of education in Egypt, or education in general. But it will be clear on examination of the isolated values that the observed high correlations arise solely from the urban character in Egypt of both criminality and education. We have endeavoured therefore to correct this by finding the partial correlations for constant density of population. There now result

$${}_D\rho_{CS} = -0.9554 \pm 0.0143, \qquad {}_D\rho_{CL} = -0.9231 \pm 0.0242, \qquad {}_D\rho_{LS} = +0.7480 \pm 0.0721 .$$

Thus, while there still remains a quite considerable relation between the prevalence of literacy and of scholars for constant density, we find that for a constant degree of urban conditions, the greater the literacy and the greater the amount of education, the less will be the criminality. The negative correlations are now even higher than the uncorrected positive ones and of course are markedly significant. While admitting the slender nature of the Egyptian data, we think that this swinging over of the relation of crime and education when we correct for density is suggestive, and it would be of interest to work out similar correlations for states in which the statistics are of a more ample character.
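The correction for constant density can be retraced from the printed coefficients. The sketch below assumes the usual first-order partial correlation formula and the probable-error formula $0.6745(1 - r^2)/\sqrt{n}$ with $n = 17$ governments. Neither formula is quoted in the paper, but both reproduce its printed figures to within rounding of the four-figure inputs.

```python
from math import sqrt


def partial_corr(r_xy, r_xz, r_yz):
    """Correlation of x and y for constant z, from the three total correlations."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def probable_error(r, n):
    """Probable error 0.6745 (1 - r^2) / sqrt(n); an assumed formula, not quoted in the text."""
    return 0.6745 * (1 - r ** 2) / sqrt(n)


# Total correlations as printed (C crime, L literacy, S scholars, D density).
r_CL, r_CS, r_LS = 0.8450, 0.6242, 0.9028
r_DC, r_DS, r_DL = 0.9614, 0.8097, 0.9563
n = 17  # number of governments

rho_CS = partial_corr(r_CS, r_DC, r_DS)   # crime and scholars, density constant
rho_CL = partial_corr(r_CL, r_DC, r_DL)   # crime and literacy, density constant
rho_LS = partial_corr(r_LS, r_DL, r_DS)   # literacy and scholars, density constant

for name, rho in (("CS", rho_CS), ("CL", rho_CL), ("LS", rho_LS)):
    print(name, round(rho, 3), "+/-", round(probable_error(rho, n), 4))
```

The signs of the two crime coefficients reverse because each raw correlation falls short of the product of the corresponding correlations with density, so that the numerator $r_{xy} - r_{xz}r_{yz}$ becomes negative.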
It does, however, appear reasonable to assert that in Egypt there is no evidence to indicate that education leads to criminality; the evidence points rather the other way.

We will next consider the influence of the presence of foreigners in Egypt. We find:

Means | Standard Deviations | Correlations
$m_O = 99.53$ | $\sigma_O = 195.60$ | $r_{OC} = +0.8425 \pm 0.0478$
$m_E = 86.88$ | $\sigma_E =$ [illegible] | $r_{EC} = +0.9546 \pm 0.0145$
$m_G = 119.41$ | $\sigma_G = 256.61$ | $r_{GC} = +0.9429 \pm 0.0181$
$m_I = 68.24$ | $\sigma_I = 152.04$ | $r_{IC} = +0.9192 \pm 0.0254$

Correlations with density: $r_{DO} = +0.9575 \pm 0.0136$, $r_{DE} = +0.9844 \pm 0.0050$, $r_{DG} = +0.9491 \pm 0.0162$, $r_{DI} = +0.9617 \pm 0.0123$.

Here, if we judged by the raw correlations only, we must assert that the correlations of crime with the presence of foreigners are so high that the foreigners must be corrupting the Egyptian population. But again the association only arises because the criminals and foreigners are both prevalent in the big towns. If we correct for density of population, we find the results are very different. Thus we have:

$${}_D\rho_{OC} = -0.9811 \pm 0.0061, \qquad {}_D\rho_{EC} = +0.1692 \pm 0.1591, \qquad {}_D\rho_{GC} = +0.3524 \pm 0.1483, \qquad {}_D\rho_{IC} = -0.0718 \pm 0.1628 .$$

It is now obvious that the correlation of Europeans other than Greeks and Italians with criminality has become insignificant having regard to its probable error; the correlation of the presence of Italians and criminality is now negative, but less than its probable error. Thus of Christians only the presence of the Greeks may possibly, but not certainly, be detrimental. The Ottomans have now a large negative correlation of a quite significant character, or we might assert that the presence of Ottomans tends to diminish criminality. The Greeks are frequently moneylenders and alcohol dealers, and the Ottomans, especially the Arabs, have among them a good many religious teachers.

We have, however, to note that criminality is greatest in the Canal Government, where Europeans and Greeks are most frequent, while the Ottomans are most numerous in Alexandria, where crime is almost 40% less than in the Canal Government. To test the influence of the three densely populated governments, we put the Canal proportion of the Ottomans at Cairo, that of Cairo at Alexandria and that of Alexandria at the Canal. There resulted:

$$r_{OC} = +0.9707 \text{ instead of } +0.8425, \qquad r_{DO} = +0.9870 \text{ instead of } +0.9575,$$

leading to ${}_D\rho_{OC} = +0.4918$, or we may safely say that if the proportions of Ottomans at Alexandria and along the Canal were interchanged, then no relation between the presence of Ottomans and the absence of criminality would exist; indeed the relation would probably be reversed.

The prevalence of the Ottomans in Alexandria has been attributed to its more temperate climate. There is certainly a large Ottoman element in Alexandria, there being 21,827 Ottomans out of a population of 332,246, and it is larger than any other foreign element except the Greeks. In Cairo, with 29,516 out of 654,476 inhabitants, the Ottomans exceed any other single foreign element. It is conceivable, therefore, that they may be able to influence the moral tone of those towns. It must be borne in mind, however, that crime is far more frequent in the Cairo and Alexandria governments than in the more purely rural districts, and we can scarcely suppose that Cairo and Alexandria would reach the still higher criminality level of the Canal, were it not for the presence of the Ottomans. In the Canal Government there are exceptional conditions, and we can hardly assume that a transfer of the Ottomans from Alexandria to the Canal would interchange their proportions of criminality. Greeks no doubt flock to the Canal for business purposes, the other Europeans largely for control purposes; the Ottomans, relatively speaking, avoid it. Without further analysis it would not be possible to assert definitely that the presence of Ottomans reduces crime.
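The fragility of the Ottoman coefficient can be seen by recomputing it from the printed correlations before and after the interchange. The sketch below again assumes the usual first-order partial correlation formula, which the paper does not quote. Since the inputs are rounded to four figures and the denominator is small, agreement with the printed values can be expected only to about two decimal places.

```python
from math import sqrt


def partial_corr(r_xy, r_xz, r_yz):
    """Correlation of x and y for constant z, from the three total correlations."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


r_DC = 0.9614  # density with criminality, unchanged by the interchange

# Ottomans as recorded, and after permuting the three urban proportions.
observed = partial_corr(0.8425, 0.9575, r_DC)
interchanged = partial_corr(0.9707, 0.9870, r_DC)

# The partial coefficient changes sign where r_OC equals r_DO * r_DC.
threshold = 0.9575 * r_DC

print(round(observed, 3), round(interchanged, 3), round(threshold, 4))
```

With both correlations with density so close to unity, the partial coefficient turns on whether $r_{OC}$ lies above or below the product $r_{DO}r_{DC}$ (about 0.92 for the recorded figures), which is why rearranging three of the seventeen values is enough to reverse its sign.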
It may be doubted whether the presence of foreigners, with the possible exception of the Greeks, is really associated with the extent of criminality in Egypt.

A further investigation was undertaken in regard to the possible influence of education on infantile mortality. The birthrate and deathrate in Egypt are both remarkably high. Thus for the years 1899-1909 inclusive the average rates were:

Town | Births per 1000* | Deaths per 1000 | Excess of Births over Deaths
Cairo | 40.7 | 35.7 | +5.0
Alexandria | 38.0 | 31.8 | +6.2

* Still births not included.

Many European towns with half the above birthrates have considerably greater excesses of births over deaths.

Unfortunately the infantile mortality is only recorded in Egyptian towns and not in the governments at large or in the rural districts. We are obliged, therefore, to deal with these only when considering the relation of education to infantile mortality. From p. 49 of the Annuaire Statistique de l'Égypte, 1912, we obtain the infantile mortality for 1911, and note at once how extraordinarily high it stands. From p. 286 of the Census of Egypt, 1907, we take the percentage of male literates in the total male population, and from the Statistique Scolaire, 1912-1913, p. 74, the percentage of scholars in the total population*. As it was possible that the density of town population might influence the results, we took the number of persons per house, which was about the only social factor available. This will probably represent fairly closely the average size of family. This was taken from the Census, 1907, p. 286. The mean, 5.64 persons per house, suggests that the average number of living children can hardly exceed three. The marked relationship that occurs in European towns between gross size of family and infantile mortality cannot be satisfactorily tested on the Egyptian data, because we cannot ascertain the infantile mortality in each size of family.
The number of persons to the house is indeed rather a measure of net than gross family, and we only know this as an average value for each town. It does not follow that a town with a low number of persons per house is one with small gross families; the low number may be due to the heavy infantile mortality itself. Accordingly the correlation between persons per house and infantile mortality is not necessarily even a measure of the influence of overcrowding on infantile mortality (although this is often supposed to be the case); it is conceivable that a high infantile mortality might be the source of a low number of persons per house, and the unravelling of cause and effect is only possible where we know not only the number of persons per house, but its relation to both the gross and net family of that house.

Let $I$ = Infantile mortality, $L$ = Literacy, $S$ = Scholars, $P$ = Persons per house. Then we have the following results:

  Means               Standard Deviations         Correlations
  $M_I = 29.30$       $\sigma_I = 7.608$          $r_{IL} = -.1040 \pm .1500$
  $M_L = 21.96$       $\sigma_L$ = [illegible]    $r_{IS} = +.5309 \pm .1278$
  $M_S = 5.569$       $\sigma_S = 2.951$          $r_{LS} = +.0093 \pm .1508$
  $M_P = 5.643$       $\sigma_P = 1.428$

  Correlations:  $r_{PI} = +.1675 \pm .1487$,  $r_{PL} = -.3421 \pm .1487$,  $r_{PS} = +.0296 \pm .1508$.

  * The scholars were taken for 1910-1911, the year of infantile mortality, but this involved the assumption that the foreign scholars were the same in numbers in 1910-11 and 1912-13, probably not a very inaccurate assumption, which in any case affects little more than Cairo, Alexandria, Suez and Ismailia practically. It is the number of Egyptian scholars that is rapidly changing and the scholars dealt with in our ratio are Egyptian only.

TABLE III.
Infantile Mortality, Persons per House and Education.

  Town              Infantile Mortality   Male Literacy per     Scholars per          Persons
                    per 100 births        100 of population     100 of population     per House
  Cairo                  32.9                 28.03                  6.12               4.62
  Alexandria             26.9                 30.09                  3.41               8.43
  Damietta               18.1                  7.06                  1.89               6.96
  Port Said              21.0                 24.13                  2.64               4.14
  Ismailia               16.0                 28.05                  1.22               4.10
  Suez                   26.9                 25.74                  0.64               3.85
  Benha                  29.6                 20.54                  4.87               5.47
  Zagazig                27.9                 25.67                  6.47               6.07
  Tantah                 29.6                 25.62                  9.45               5.18
  Mansorah               21.4                 26.77                  6.20               4.60
  Chibine El Kom         16.1                 18.71                  5.10               4.92
  Damanhur            [illegible]             19.54                  3.55               7.27
  Guizeh                 35.6                 19.23               [illegible]           6.12
  Fayoum                 40.1                 15.55                  4.78               9.34
  Beni Suef           [illegible]             21.15                  7.19               5.11
  Minia                  38.2                 21.96                  8.89               4.96
  Assiut                 33.6                 20.92                 11.48               6.00
  Sohag                  29.0              [illegible]               5.58               6.15
  Kena                [illegible]             16.76                  4.80               5.14
  Assuan                 41.1                 21.31                  5.78               4.09

It will be clear from these results that there is no significant relation between the literacy of the male population and infantile mortality. There is also no significant relation between the number of persons to a house and the number of scholars, i.e. it does not appear to be the more crowded towns which have the largest percentage of scholars to the population; Alexandria and Damietta, for example, have considerably more than the mean number of persons to the house and relatively few scholars. On the other hand a larger number of literates marks less crowding. Crowding and infantile mortality are slightly related, but considering the probable error, not with definite significance*.

While literacy has no relation to the infantile deathrate, it is noteworthy that there is a significant correlation (+.5309 ± .1278) between the number of scholars and the infantile deathrate, which is greater where there is more education. Now this either suggests that many scholars mean large families and large families correspond to increased infantile mortality, which is usual, or that the towns in which there are the classes who educate their children have a higher infantile deathrate.
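One way to separate these two explanations is a partial correlation with persons per house held constant. The following is a minimal sketch in Python, not part of the original paper, of the standard first-order formula $r_{xy\cdot z} = (r_{xy} - r_{xz} r_{yz}) / \sqrt{(1 - r_{xz}^2)(1 - r_{yz}^2)}$. It is applied to the zero-order coefficients printed for the 20 towns ($r_{IL} = -.1040$, $r_{IS} = +.5309$, $r_{LS} = +.0093$, $r_{PI} = +.1675$, $r_{PL} = -.3421$, $r_{PS} = +.0296$). The function name `partial_r` and the variable names are illustrative assumptions.

```python
from math import sqrt


def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))


# Zero-order correlations as printed in the paper (I = infantile mortality,
# L = literacy, S = scholars, P = persons per house).
r_IL, r_IS, r_LS = -0.1040, 0.5309, 0.0093
r_PI, r_PL, r_PS = 0.1675, -0.3421, 0.0296

# Scholars and mortality, literacy and mortality, literacy and scholars,
# each for a constant number of persons per house.
scholars_mortality = partial_r(r_IS, r_PI, r_PS)
literacy_mortality = partial_r(r_IL, r_PI, r_PL)
literacy_scholars = partial_r(r_LS, r_PL, r_PS)

for label, value in [
    ("scholars and mortality", scholars_mortality),
    ("literacy and mortality", literacy_mortality),
    ("literacy and scholars", literacy_scholars),
]:
    print(f"{label:24s} {value:+.4f}")
```

With these inputs the scholars and mortality coefficient comes out near +.53 and the literacy and mortality coefficient near -.05, which agrees to rounding with the partial coefficients the paper goes on to report.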
The only means, and those inadequate, of testing the first assumption are to take the partial coefficient between scholars and deathrate for constant number of persons per house. We find*:

  ${}_P r_{IS} = +.5336 \pm .1107$;  similarly  ${}_P r_{IL} = -.0504 \pm .1543$.

There is thus a slight increase in the relation of scholars to deathrate when we take a constant number of persons per house, and it is hard to believe that the relation is indirectly due to size of family.

The second result shows that literacy has no relation to the infantile deathrate. Towns like Alexandria, Damietta, Port Said, Ismailia, and to a less extent Suez, with a low infantile deathrate have a low education rate, and towns like Cairo, Guizeh, Beni Suef, Minia, and Assiut, with high infantile deathrates have high education rates. The first towns are on the sea or the canal, the second in the Nile Valley; it is conceivable that the latter are the more unhealthy for the infant; it would need special local knowledge to explain why education has been most accepted above Cairo†. There does not, however, seem any relation between ignorance, as measured by literacy, and a heavy infantile mortality, nor on the other hand can we assert that education and European influence have certainly increased criminality.

  * (On crowding and infantile mortality.) This agrees with the result for overcrowding and infantile mortality in English manufacturing towns, where the correlation is very small and sometimes has one sign and sometimes the other.
  * (On the partial coefficients.) The value of ${}_P r_{LS}$ is $+.0207 \pm .1508$, and is therefore not significant.
  † It is noteworthy that there is no relation between literacy and number of scholars, i.e. education of children does not appear to follow the power to read and write in their parents, that is to say if we judge by the averages in towns and not by individuals.


HEIGHT AND WEIGHT OF SCHOOL CHILDREN IN GLASGOW

By ETHEL M. ELDERTON, Galton Fellow, University of London.
In 1905-6 an enquiry was made in the Public Schools of the School Board for Glasgow as to the height and weight of all the scholars, the occupation of the parents, the number of rooms occupied etc. By permission of Sir John Struthers, of the Scottish Education Office, these schedules were most kindly placed at the disposal of the Galton Laboratory. The number of children concerning whom the enquiry was made is over seventy thousand of ages 5 to 18 years. The schools from which these children came were divided into four groups according to the district in which the schools were situated.

  Group A comprised schools in the poorest districts of the city.
  Group B comprised schools in poor districts of the city.
  Group C comprised schools in districts of a better class.
  Group D comprised schools in districts of a still higher class, with which are included four out of five Higher Grade Schools.

The data were originally used by the Galton Laboratory with the object of discovering how far the physique of school children, judged by their height and weight, is affected by the occupation of the father and the employment of the mother. With this end in view the necessary data were entered on cards. Children over 14 were excluded and all children who had not both parents alive were also excluded; this left us with 30,965 girls and 32,811 boys.

The object of the present paper is to ascertain what is the average weight of a child of a given age and a given height. The first step in this enquiry was to sort the cards and form tables giving the distribution of weight for each height at each age in each school group, and this laborious work was carried out very largely by Miss Augusta Jones; this step necessitated making 72 tables, and she is responsible for 58 of them while the remaining 14 are due to Miss H. Gertrude Jones; I have to thank most heartily these colleagues for their very efficient help in this matter.

The three factors with which we are concerned are the age, height, and weight of school children.
The instructions issued to the teachers in the schools for recording these three facts were that ages were to be given to the nearest year, weights to the nearest pound, and heights to the nearest quarter of an inch*.

The method of recording ages is very important. Ages being recorded to the nearest year, this means that children classed as 6 years were from 5.5 years to 6.5 years; and the average age of this group was 6 years; this is not the method most frequently employed for recording ages; "age last birthday" is generally used and if "age last birthday" be given as 6 years then the children of that age are from 6 to 7 and the average age of children in this group is approximately 6.5 years. It will be seen at once that a comparison of weights and heights of two groups of children of 6 years cannot be undertaken until we know which method of recording ages has been adopted.

The height and weight of these Glasgow children have been compared by Dr Leslie Mackenzie and Captain Foster† with the height and weight of children as given by the Anthropometric Committee of the British Association and it is pointed out that at each age the average weight of the children is uniformly below the "standard of the Anthropometric Committee," and that generally speaking the same thing applies to height. As a matter of fact this point as to age has not been noticed by these writers and children whose average age is 6 years in Glasgow are compared with children whose average age is 6.5 years; naturally the younger children are shorter and lighter. There is further an important question to be asked: Which standard of the B.A. Anthropometric Committee ought to be selected? To this point I return below.

As I have said the Glasgow children's ages were recorded to the nearest year‡, but the Anthropometric Committee recorded age last birthday, and before these children can be compared the six months extra growth must be allowed for.
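The two recording conventions, and the half-year allowance they call for, can be sketched in a few lines of Python. This is an illustration only, not part of the original paper. The function names are hypothetical, the mean ages assume birthdays spread evenly through the year, and the regression coefficients are the ones the paper reports for children of 5 to 14 (weight on age 4.564 and 4.916 lbs. per year, height on age 1.807 and 1.937 inches per year, for boys and girls respectively).

```python
def recorded_age_nearest_year(years, months):
    """Glasgow rule: disregard months and record to the nearest year."""
    if not 0 <= months <= 11:
        raise ValueError("months must lie between 0 and 11")
    if months == 6:
        # The Glasgow direction does not say what was done at an exact half year.
        raise ValueError("an exact half year is not covered by the instruction")
    return years + 1 if months > 6 else years


def mean_true_age(recorded, convention):
    """Average true age of children recorded at a given age.

    Assumes ages are spread evenly over the one-year interval.
    """
    if convention == "nearest_year":      # interval recorded-0.5 to recorded+0.5
        return float(recorded)
    if convention == "last_birthday":     # interval recorded to recorded+1
        return recorded + 0.5
    raise ValueError(f"unknown convention: {convention}")


# Regressions on age as reported in the paper (per year of age).
REGRESSION_ON_AGE = {
    "boys": {"weight_lb": 4.564, "height_in": 1.807},
    "girls": {"weight_lb": 4.916, "height_in": 1.937},
}


def half_year_allowance(sex):
    """Half the yearly regression: the growth expected in six months."""
    return {name: value / 2.0 for name, value in REGRESSION_ON_AGE[sex].items()}


print(recorded_age_nearest_year(6, 7), recorded_age_nearest_year(8, 3))
print(mean_true_age(6, "nearest_year"), mean_true_age(6, "last_birthday"))
print(half_year_allowance("boys"), half_year_allowance("girls"))
```

The allowances agree with the figures the paper uses: about 2.28 lbs. and .90 inches for boys, and about 2.5 lbs. and .97 inches for girls.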
This is quite easily done by finding the regression of height and weight on age and adding half the regression coefficient to the height and weight of the Glasgow children. We have found the regression for children of 5 to 14 inclusive to be as follows:

                                                     Boys      Girls
  Regression of Weight on Age (lbs. per year)        4.564     4.916
  Regression of Height on Age (inches per year)      1.807     1.937

  * It is not known what record was made when an exact half year, an exact half pound or an exact quarter inch occurred.
  † Report on the Physical Condition of Children attending the Public Schools of the School Board for Glasgow, by Dr W. Leslie Mackenzie and Captain A. Foster. Wyman and Sons, 1907.
  ‡ The actual wording of the Glasgow direction to school teachers runs: "In recording age, disregard months and record to nearest year; thus 6 years 7 months record as 7 years, 8 years 3 months record as 8 years." It is not clear how 6 years 6 months would be recorded; we have assumed as no half years are entered in the schedules that an exact calculation was made in the case of each child of doubtful age to ascertain whether it was or was not past the half year.

This means that we must add 2.28 lbs. to the weight of the Glasgow boys and .90 inches to their height and 2.5 lbs. to the weight of Glasgow girls and .97 inches to their height before we can compare them with the Anthropometric Committee's standard. The Glasgow children still fall below the "Anthropometric Committee's Average" but not to the appalling extent shown in the diagram at the end of Dr Mackenzie's and Captain Foster's Report.

Personally I should hesitate to compare actual height and weight of Glasgow school children with the so-called Anthropometric Committee's standard. The so-called standard is taken from the Final Report of the Anthropometric Committee of the British Association, 1883.
In Tables XVI-XIX, the average heights and weights at different ages of males and females of different classes of the population of Great Britain are given. For example in the case of stature we have four classes: Class I, Professional Classes, Town and Country, 10,739 individuals, ages 9 to 60; Class II, Commercial Classes, Towns, 5472 individuals, ages 8 to 60 (5 below 8 are of no service for means); Class III, Labouring Classes, Country, 8727 individuals, ages 3 to 70 (8 below 8 are of no service); Class IV, Artizans, Towns, 126,236 individuals, ages 3 to 60, and 451 babies at birth. All these data are pooled in the column headed "General Population, All Classes, Town and Country," and it is this "General Population" which is so frequently cited by various medical authorities, including Dr Leslie Mackenzie and Captain Foster, as the Anthropometric Committee's "standard." What they understand by such a "standard" it is impossible to say. It does not represent the "General Population" of Great Britain, but the total population measured by the Committee. In this all the babies are artizan babies, there are only 8 children from 0 to 2 and these belong to the labouring rural classes, and there is no professional class contribution until after 9 years of age. Then the various age groups are made up from various social classes in proportions which bear no relation whatever to their actual proportions in the kingdom at large. For example, the average height of lads of 18 is determined from 1724 of the professional, 62 of the commercial, 148 of the rural labourer, and 371 of the town artizan classes! It will be quite clear that a "standard" reached in this way means absolutely nothing at all, and yet this is the "standard" which, attached to numerous weighing machines, is posted in innumerable public places up and down this country. It does not in the least represent any "General Population" of Great Britain.
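The imbalance in the age-18 sample is easy to put in figures. The short Python sketch below is an illustration, not part of the original paper. It takes the four class counts quoted above for lads of 18 and prints the share each class contributes to the pooled "General Population" mean; the names `AGE_18_SAMPLE` and `class_shares` are hypothetical.

```python
# Numbers of lads of 18 measured in each class, as quoted in the paper.
AGE_18_SAMPLE = {
    "professional": 1724,
    "commercial": 62,
    "rural labourer": 148,
    "town artizan": 371,
}


def class_shares(counts):
    """Fraction of the pooled sample contributed by each class."""
    total = sum(counts.values())
    return {name: count / total for name, count in counts.items()}


shares = class_shares(AGE_18_SAMPLE)
for name, share in shares.items():
    print(f"{name:15s} {AGE_18_SAMPLE[name]:5d}  {100 * share:5.1f}%")
print(f"{'total':15s} {sum(AGE_18_SAMPLE.values()):5d}")
```

On these counts roughly three quarters of the pooled sample at age 18 is professional class and about one sixth is town artizan, so the pooled mean at that age is in effect a professional-class mean.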
To be a standard of the general population each class should have been properly weighted, and this cannot be done as in certain classes certain ages are quite inadequately represented, or not represented at all. There is in fact no such thing as an "Anthropometric Committee's standard" for either height or weight. The only thing that is possible is to compare the corresponding social class in that Committee's measurements with the measurements under consideration. In the case of Dr Leslie Mackenzie's and Captain Foster's data, this is undoubtedly the Class IV, "Artizans, Towns." Such a comparison is made in the accompanying diagrams. It will be seen that the Glasgow children as far as height is concerned are the equals if not the superiors of the Anthropometric Committee's artizan class. In weight they appear to be somewhat less, but here Dr Leslie Mackenzie and Captain Foster have overlooked the fact that the Glasgow children were weighed without boots, but the British Association Committee weighed in ordinary indoor clothing, i.e. with boots or shoes on. Now girls' boots weigh as much as 1¼ to 2¼ lbs. and boys' boots 1¾ to 3½ lbs.* Hence in comparing children in Glasgow with those six months older, Dr Leslie Mackenzie and Captain Foster have dropped 2¼ lbs. in weight, while in comparing children without boots with those with boots they have dropped another 1¼ lbs. to possibly 3½ lbs. We should anticipate therefore that their readings would be 3½ to nearly 6 lbs. too small†.
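The arithmetic behind this range can be checked directly. The sketch below is an illustration in Python, not part of the original paper. It takes the allowance for six months of age as 2¼ lbs. and the allowance for boots as 1¼ to 3½ lbs.; these fractions are this sketch's reading of the text and should be treated as an assumption. The names are hypothetical.

```python
from fractions import Fraction

# Assumed readings of the allowances, in lbs.
AGE_GAP_LB = Fraction(9, 4)                    # six months' growth, about 2 1/4 lbs.
BOOTS_LB = (Fraction(5, 4), Fraction(7, 2))    # lightest and heaviest boots


def total_shortfall(age_gap, boots_range):
    """Smallest and largest combined deficit from age and boots together."""
    lightest, heaviest = boots_range
    return age_gap + lightest, age_gap + heaviest


low, high = total_shortfall(AGE_GAP_LB, BOOTS_LB)
print(f"combined shortfall: {float(low)} to {float(high)} lbs.")
```

Under these readings the two omissions together account for between 3½ and 5¾ lbs., which is the "3½ to nearly 6 lbs." of the text.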
There is in our mind very little doubt that the weight of the Glasgow children is at every age equal or superior to the weight of the artizan children measured by the British Association Anthropometric Committee and the statement of Dr Leslie Mackenzie and Captain Foster that "at each age from 5 to 18 the average weight of the [Glasgow] children is uniformly below the standard of the Anthropometrical Committee‡" arises from their having entirely overlooked the conditions as to class, age and manner of weighing which were adopted by that Committee, a knowledge of which was essential to any comparison with the Committee's data. In the diagrams on pp. 292-3 we have given the Glasgow measurements set against those of the artizan class of the Anthropometric Committee, and the reader will see clearly how all the arguments based on differences between the Glasgow and the "Anthropometric standard" fall at once to the ground. There is nothing exceptional in the Glasgow data; they differ of course from data for the children of the professional classes, but this difference is not confined to Glasgow. Apart from this point it is essential that the ages of the two groups of children should be the same and not differ by six months.

  * New "tacket" boots for girls of five in Glasgow weigh 1 lb. 5 oz. falling to about 1 lb. 3 oz. when the tackets are worn down; for girls of fourteen 2 lbs. 6 oz. falling to about 2 lbs. 2 oz. For boys of five years new tacket boots weigh 1 lb. 14 oz. falling to about 1 lb. 11 oz. when worn down; for boys of fourteen the former weigh 3 lbs. 9 oz. and the latter about 3 lbs. 3 oz. We have to thank Dr Chalmers, M.O.H. for Glasgow, for this information.
  † Many public elementary school children have great masses of metal on their boots. Undoubtedly the older children have heavier boots, and we can see from the diagrams that the divergence of the Glasgow children from the Anthropometric Committee's artizan children increases with age.
  ‡ Report, Scottish Education Department, 1907, p. iv.

[Diagram, p. 292: WEIGHT AND HEIGHT OF GIRLS. Glasgow Data, All Schools, set against B.A. Anthropometric Committee, Artizan Class. Scales: weight in lbs., height in inches.]

[Diagram, p. 293: WEIGHT AND HEIGHT OF BOYS. Glasgow Data, All Schools, set against B.A. Anthropometric Committee, Artizan Class. Scales: weight in lbs., height in inches.]

In the data used for this paper, children of 5 were omitted; they are few in number and are not therefore likely to give such reliable results when each age group is used separately. The mean weight for each height in inches was then found and the regression equation calculated. These equations are given in Table I. It will be observed from these equations that, though some irregularities occur, generally speaking weight increases more rapidly for a given height in the better school groups, at the later ages, and for girls more than boys except at ages 6 and 7.

TABLE I. Glasgow.
(Each entry gives W; thus "-21.164 + 1.503H" stands for W = -21.164 + 1.503H.)

  Age   Group A, Boys        Group B, Boys        Group C, Boys         Group D, Boys
   6    -21.164 + 1.503H     -22.576 + 1.532H      -24.489 + 1.591H      -25.678 + 1.603H
   7    -21.065 + 1.519H     -26.308 + 1.635H      -27.390 + 1.667H      -34.819 + 1.818H
   8    -19.381 + 1.495H     -28.127 + 1.695H       -6.640 + 1.227H      -32.777 + 1.790H
   9    -24.652 + 1.635H     -32.826 + 1.818H      -20.671 + 1.562H      -36.030 + 1.883H
  10    -29.589 + 1.768H     -35.696 + 1.899H      -42.942 + 2.055H      -51.636 + 2.218H
  11    -54.910 + 2.302H     -39.725 + 2.005H      -43.628 + 2.088H      -67.664 + 2.546H
  12    -49.513 + 2.217H     -62.890 + 2.476H      -55.386 + 2.337H      -62.052 + 2.450H
  13    -65.467 + 2.547H     -63.516 + 2.511H      -81.342 + 2.854H      -99.908 + 3.160H
  14    -83.784 + 2.888H     -76.749 + 2.775H     -103.661 + 3.251H     -126.446 + 3.633H

  Age   Group A, Girls       Group B, Girls       Group C, Girls        Group D, Girls
   6    -14.561 + 1.329H     -15.985 + 1.345H      -23.728 + 1.551H      -27.563 + 1.624H
   7    -19.762 + 1.465H     -24.130 + 1.556H      -29.312 + 1.693H      -30.226 + 1.694H
   8    -20.721 + 1.503H     -30.622 + 1.718H      -29.113 + 1.691H      -40.156 + 1.927H
   9    -30.133 + 1.730H     -29.081 + 1.709H      -38.624 + 1.917H      -45.066 + 2.045H
  10    -36.478 + 1.878H     -38.866 + 1.925H      -46.263 + 2.088H      -62.015 + 2.397H
  11    -48.707 + 2.153H     -43.005 + 2.034H      -51.146 + 2.209H      -57.754 + 2.330H
  12    -58.277 + 2.360H     -63.908 + 2.465H      -77.316 + 2.735H      -84.298 + 2.859H
  13    -74.156 + 2.694H     -88.043 + 2.939H      -83.960 + 2.892H     -103.594 + 3.229H
  14    -95.464 + 3.084H     -84.496 + 2.906H     -106.133 + 3.317H     -134.197 + 3.804H

  W is weight in lbs., H is height in inches. The ages are central ages, and to obtain the weight corresponding to a given height the child should be taken to the nearest whole year.

We can see from these equations that the multiple regression surface for weight on height and age is not absolutely planar. It can be shown that it is weight on age and not height on age which is non-linear. The departure from linearity is not great, but Mr H. E.
Soper, in order to smooth the material, fitted a parabolic surface to the regression surface of weight on height and age. Let $W$ be as before the weight in lbs., $H$ = height in inches, and $y$ equal the age of the child measured from 10*. Then

$$W = -\phi_1(y) + \phi_2(y)\,H$$

is the form of the surface when the relation of $W$ to $H$ for a given age is sensibly linear. Mr Soper now assumed:

$$\phi_1(y) = a_0 + a_1 y + a_2 y^2, \qquad \phi_2(y) = b_0 + b_1 y + b_2 y^2,$$

and determined $a_0$, $a_1$ and $a_2$, $b_0$, $b_1$ and $b_2$ so that:

$$\Sigma\{n\,(\phi_1(y) - a_0 - a_1 y - a_2 y^2)^2\} = \text{minimum},$$
$$\Sigma\{n\,(\phi_2(y) - b_0 - b_1 y - b_2 y^2)^2\} = \text{minimum},$$

where $n$ is the number of individuals in any age group.

  * Thus $y$ takes every value from -4 to +4 and we have nine equations to deal with.

[Plate XIX (Biometrika, Vol. X, Parts II and III): Model of Regression Surface, giving mean weight of Girls of Class B of Glasgow Schools (Glasgow Report, 1906) for a given Height and Age. The mean weight is the vertical coordinate and each section parallel to the front of the model gives the mean weight for the several Heights of Girls of a given Age. See p. 295.]

Now $\phi_1(y)$ and $\phi_2(y)$ for given $y$'s are the values determined in Table I for the constants at each age of the regression lines $W = -A + BH$, and $n$ is the number of children dealt with at each age. Our type equations are then of the form:

$$a_0\,\Sigma(n) + a_1\,\Sigma(ny) + a_2\,\Sigma(ny^2) = \Sigma(nA),$$
$$a_0\,\Sigma(ny) + a_1\,\Sigma(ny^2) + a_2\,\Sigma(ny^3) = \Sigma(nAy),$$
$$a_0\,\Sigma(ny^2) + a_1\,\Sigma(ny^3) + a_2\,\Sigma(ny^4) = \Sigma(nAy^2),$$

and similar equations for $b_0$, $b_1$, $b_2$, with $B$ for $A$. When these constants had been determined we put $Y = 10 + y$ and obtain the equations given in Table II.

TABLE II. Glasgow. School Children.
$W$ = Weight in lbs. $H$ = Height in inches. $Y$ = True age.

  Boys
  Group A   $W = \{.02181Y^2 + 2.214 - .2554Y\} \times H - \{1.13275Y^2 + 67.542 - 14.7417Y\}$,
  Group B   $W = \{.01533Y^2 + 1.900 - .1493Y\} \times H - \{.83314Y^2 + 53.101 - 9.8662Y\}$,
  Group C   $W = \{.03990Y^2 + 3.614 - .5796Y\} \times H - \{2.08397Y^2 + 139.456 - 31.6570Y\}$,
  Group D   $W = \{.02983Y^2 + 2.799 - .3636Y\} \times H - \{1.76624Y^2 + 111.407 - 24.0174Y\}$.

  Girls
  Group A   $W = \{.01880Y^2 + 1.657 - .1644Y\} \times H - \{.96454Y^2 + 38.119 - 9.7305Y\}$,
  Group B   $W = \{.02081Y^2 + 1.907 - .2043Y\} \times H - \{1.16330Y^2 + 60.230 - 13.6925Y\}$,
  Group C   $W = \{.02315Y^2 + 2.222 - .2457Y\} \times H - \{1.21642Y^2 + 67.626 - 14.3416Y\}$,
  Group D   $W = \{.02701Y^2 + 2.385 - .2832Y\} \times H - \{1.56165Y^2 + 87.624 - 18.9239Y\}$.

A model of the surface for Glasgow Girls of Class B has been made by Mr Soper. Allowing for the points based on few observations at the end of each regression line of weight on height for constant age, the scroll represented by black threads is quite a good fit to the observations represented by card sections. The model is, however, difficult to photograph in a manner which shows effectively the approximation of the thread scroll to the cut card sections. The reader should note that an additional thread is placed between each of the threads which graduate the regression lines for the different ages.

Further eight tables (Tables α-θ) have been constructed in order that the average weight of any boy or girl of a given height and age can be read off at once. See pp. 300-303. It has been stated before that the age groups in Glasgow are from 5.5 to 6.5 years etc. and that 6, 7, 8 etc. are the centres of each age group, but since frequently the centres are at 6.5, 7.5 etc. we have constructed Table III which enables anyone to find mean height and weight at any age between 5.5 and 14.5 years. The regression lines are calculated from the original tables in which children of 4.5 to 5.5 years were included.

TABLE III. Glasgow.

                        Mean Height (inches)               Mean Weight (lbs.)
  Age                   A       B       C       D          A       B       C       D
  Boys:
   5.5- 6.5            41.3    42.1    42.1    43.0       40.9    42.0    42.5    43.3
   6.5- 7.5            43.0    44.0    44.0    44.8       44.2    45.6    45.9    46.6
   7.5- 8.5            45.1    45.9    46.2    46.9       48.0    49.6    50.1    51.2
   8.5- 9.5            47.0    47.7    48.1    49.0       52.3    53.9    54.4    56.3
   9.5-10.5            48.8    49.5    49.9    50.9       56.7    58.4    59.5    61.2
  10.5-11.5            50.6    51.1    51.5    52.6       61.6    62.7    63.9    66.3
  11.5-12.5            52.3    52.8    53.5    54.2       66.4    67.8    69.1    70.8
  12.5-13.5            53.8    54.3    55.0    55.9   [illegible]  72.9    75.6    76.9
  13.5-14.5            55.2    55.5    57.2    57.7       75.6    77.3    82.2    83.2
  Regression on Age    1.800   1.728   1.847   1.846      4.305   4.395   4.772   4.914
  Girls:
   5.5- 6.5            41.0    42.0    41.9    42.7       39.9    40.6    41.3    41.8
   6.5- 7.5            42.9    43.7    43.7    44.8       43.0    43.9    44.7    45.6
   7.5- 8.5            44.6    45.6    45.6    46.4       46.4    47.7    48.1    49.3
   8.5- 9.5            46.6    47.4    47.6    48.6       50.5    51.8    52.7    54.3
   9.5-10.5            48.5    49.2    49.4    50.4       54.7    55.8    56.9    58.8
  10.5-11.5            50.3    51.1    51.2    52.2       59.5    60.8    61.9    64.4
  11.5-12.5            52.4    53.0    53.3    54.1       65.3    66.8    68.4    70.5
  12.5-13.5            54.4    55.2    55.4    56.5       72.4    74.3    76.1    78.8
  13.5-14.5            55.8    57.1    57.0    58.7       76.8    81.3    83.0    89.0
  Regression on Age    1.914   1.859   1.903   1.943      4.551   5.083   4.944   5.489

  (Regressions on age are in inches per year for height and lbs. per year for weight.)

The regression lines omitting the children of 4.5 to 5.5 years were worked out for height on age and weight on age for boys in Group A, and were found to be 1.81 instead of 1.80 for height and 4.39 instead of 4.31 for weight, but such differences are not great enough to matter and the remaining regression coefficients were not calculated with children of five years excluded.

In connection with the tables (1 to 72, pp. 304-339) it should be noted that in transferring the data for boys from the original sheets to cards, .75 of an inch was included in the inch above; for example, 30.75 inches was entered as 31 inches and the centre of the group of 30 inches is 30.125 inches. The data for the girls
were transferred to cards much later and the simpler method was employed, and 30.75 was included in the 30 inch group and the centre of this group is 30.375.

Through the kindness of Dr Priestley, School Medical Officer for Staffordshire, we have been able to obtain the regression of Weight on Height for certain age groups of boys and girls in that county. Staffordshire is a county of very various occupations and contains an agricultural as well as a mining and factory population. The children measured are "entrants" and "leavers" and a further group of children, namely those from 8 to 9 were measured. The "leavers" include children of 12 to 14 years, "since in general the only 'leavers' at age 12 to 13 are rural, and the only 'leavers' at age 13 to 14 are urban*." The children were of the age stated, 5 and not yet 6, 8 and not yet 9, on January 1, 1911, but the actual day of weighing may have been any school day from January to December, so that a child entered as 8 may have been only a few days short of 10 when it was actually measured, and therefore the mean age of the group of children of 8 to 9 will be 9 years. "In the case of the group of leavers, 13 to 14, no child can have been more than 14, because on attaining that age the children are entitled to leave school, and generally do leave. With these the mean height and weight in our tables refer to the true mean of the years of the group, viz. 13 and a half†." We shall table to the middle of the group, namely at ages 6, 9, 13 and 13½. The children were weighed and measured without shoes, but in ordinary indoor clothes. The figures were read to the nearest quarter of an inch and to the nearest quarter of a pound.

Staffordshire Children.

                       GIRLS                                    BOYS
  Ages    Mean      Mean      Regression of        Mean         Mean         Regression of
          Height    Weight    Weight on Height     Height       Weight       Weight on Height
   6      41.9      39.8      1.705                [illegible]  41.0         1.741
   9      47.7      51.1      2.024                48.1         53.0         2.120
  13      56.7      78.0      3.272                55.8         75.3         2.811
  13½     57.1      81.0      3.360                56.3         [illegible]  3.166

It will be as well to compare these means with those for all Glasgow; so far we have not given them in this paper for all the schools taken together but only for each school group.

  * Staffordshire County Council, Annual Report of the School Medical Officer for the Year 1911. J. and C. Mort, Ltd., 39, Greengate Street, Stafford, 1912.
  † Ibid. p. 25.

Glasgow Children.

                  GIRLS                     BOYS
  Ages    Mean      Mean            Mean         Mean
          Height    Weight          Height       Weight
   6      41.7      40.5            41.9         41.8
   9      47.3      51.3            [illegible]  53.7
  13      55.2      74.8            54.6         73.6

Girls in Staffordshire are taller at ages 6, 9, and 13 than girls in Glasgow, but they are lighter at ages 6 and 9. We might argue from this a lack of physique in Staffordshire girls who are absolutely .7 lbs. lighter at age 6 than Glasgow children, and relatively to their height even more than this amount. At age 9 the absolute difference is less and at age 13 Staffordshire girls are heavier than Glasgow girls but they are 1½ inches taller, and since the regression of weight on height at age 13 for girls is 3.272 lbs. we should expect Staffordshire girls to be 4.9 lbs. heavier than Glasgow girls, but they are not so much. I should hesitate to say that the physique of Staffordshire girls is inferior to that of Glasgow girls; the difference probably is one of race, but such questions must remain unsolved till we have a far wider range of anthropometric data than is available at present for all the districts of Great Britain.

Boys show the same characteristics to a lesser extent; Staffordshire boys are taller at ages 6, 9, and 13, but they are lighter in weight; at age 6 they are .8 lbs. lighter than Glasgow boys; at age 9 they are .7 lbs. lighter and at age 13 they are 1.7 lbs. heavier. Again relative to their height Staffordshire boys are lighter than Glasgow boys at the three ages for which a comparison can be made.

Comparing boys and girls in Staffordshire we find that girls of 6 and 9 are shorter and lighter than boys of the same age, but at 13 and 13½ girls are both taller and heavier. At 6 and 9 years the regression of weight on height is practically the same for both sexes, but at 13 and 13½ the regression of weight on height is greater for girls than for boys; girls are heavier proportionally to their height than boys are. For girls of 13 an additional inch in height should mean 3.3 lbs. more weight while for boys the additional pounds expected are only 2.8, while for girls of 13½ we expect 3.4 lbs. increase for every inch of growth and for boys 3.2 lbs. increase.

A comparison of the regression coefficients with those given for Glasgow in Table I will show that the coefficient is higher in Staffordshire for children of 6 and boys of 9 than in any of the school groups in Glasgow. The regression coefficient found for girls of 9 and 13 in Staffordshire is practically identical with that found in Group D in Glasgow, and boys of 13 in Staffordshire would seem to be most like boys of Group C in Glasgow.

In a Drapers' Company Research Memoir* recently published tables are given showing the height and weight of boys and girls of 12 to 13 years who were members of the Worcestershire public elementary schools. These tables are XLIV and LIX and will be found on pp. 100 and 107 of the work cited; we have calculated the mean heights and weights and the regression coefficient of weight on height as we have done for Glasgow and Staffordshire. The mean age of the group of children of 12 to 13 years is 12.5 years, so allowance must be made for the six months age difference in comparing with the Glasgow data.
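The six months' allowance for the Worcestershire children can be sketched as follows. This is an illustration in Python, not part of the original paper. It uses the yearly increments the paper adopts for this purpose (roughly 1.9 inches and 4.9 lbs. for a girl, 1.8 inches and 4.6 lbs. for a boy) and the Worcestershire means at 12.5 years. The boys' mean weight of 72.1 lbs. is an assumption of this sketch, chosen to agree with the paper's remark that Worcestershire boys of 12.5 weigh 1½ lbs. less than Glasgow boys of 13 (73.6 lbs.). The names are hypothetical.

```python
# Growth per year of age adopted in the paper for this comparison.
YEARLY_GAIN = {
    "girls": {"height_in": 1.9, "weight_lb": 4.9},
    "boys": {"height_in": 1.8, "weight_lb": 4.6},
}

# Worcestershire means at age 12.5; the boys' weight is an inferred value.
WORCESTERSHIRE_AT_12_5 = {
    "girls": {"height_in": 55.2, "weight_lb": 72.9},
    "boys": {"height_in": 54.6, "weight_lb": 72.1},
}


def project(means, yearly_gain, years):
    """Shift mean height and weight forward by a number of years of growth."""
    return {name: means[name] + years * yearly_gain[name] for name in means}


at_13 = {
    sex: project(WORCESTERSHIRE_AT_12_5[sex], YEARLY_GAIN[sex], 0.5)
    for sex in WORCESTERSHIRE_AT_12_5
}
for sex, values in at_13.items():
    print(f"{sex:5s} height {values['height_in']:.2f} in., "
          f"weight {values['weight_lb']:.2f} lbs.")
```

On these inputs the projected means at age 13 are about 56.15 inches and 75.35 lbs. for girls and 55.5 inches and 74.4 lbs. for boys.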
We have already given the mean heights and weights of Glasgow and Staffordshire boys and girls of age 13 so we will calculate what the height and weight of Worcestershire children at age 13 would be. An additional year makes a difference of roughly 1.9 inches and 4.9 lbs. in the height and weight of a girl and of 1.8 inches and 4.6 lbs. in the height and weight of a boy.

                                       GIRLS                                     BOYS
  Age     District           Mean         Mean      Regression of      Mean      Mean      Regression of
                             Height       Weight    Weight on Height   Height    Weight    Weight on Height
  12.5    Worcestershire     55.2         72.9      2.829              54.6      72.1      2.800
  13      Worcestershire     [illegible]  75.3      -                  55.5      74.4      -
  13      Staffordshire      56.7         78.0†     -                  55.8      75.3      -
  13      Glasgow            55.2         74.8      -                  54.6      73.6      -

Worcestershire children are taller than Glasgow children but slightly shorter than Staffordshire children. They are also rather heavier than Glasgow children but not relatively to their height. The height of Worcestershire children of 12.5 years is the same as the height of Glasgow children of 13 years, but the weight of girls is 2 lbs. less and of boys is 1½ lbs. less. Worcestershire children are lighter than Staffordshire children, but when allowance is made for the difference in height the Worcestershire children are not much at a disadvantage; girls are a pound lighter and the weight of boys is practically the same.

The differences we have found between the Worcestershire, Staffordshire and Glasgow children may well be due to differences of local race, and not be the results of differential environment or nurture. We should have little hesitation in applying the returns for Glasgow children as an approximate standard, say to the lb. and inch, for all British children of the artizan classes.

  * "A Statistical Study of Oral Temperatures in School Children with special reference to Parental Environment and Class Differences," by M. H. Williams, Julia Bell and Karl Pearson. Studies in National Deterioration, IX. 1914. Dulau and Co., Ltd., 37, Soho Square, W.
  † This weight appears somewhat exaggerated.
It may in part be due to local differences in the average ages of 'leavers.'

TABLE α. GLASGOW. BOYS. GROUP A*. Weights for Height at each Age.

                                Actual Age.
Height      6      7      8      9     10     11     12     13     14
33        [?]      -      -      -      -      -      -      -      -
34       30.0   31.0      -      -      -      -      -      -      -
35        [?]   32.5      -      -      -      -      -      -      -
36        [?]    [?]    [?]      -      -      -      -      -      -
37       34.4   35.4   35.8      -      -      -      -      -      -
38       35.8   36.9   37.4    [?]      -      -      -      -      -
39       37.3   38.4   39.0   39.0   38.4      -      -      -      -
40       38.8   39.9    [?]   40.6   40.2   39.3      -      -      -
41       40.3    [?]    [?]    [?]    [?]    [?]      -      -      -
42        [?]    [?]   43.7    [?]    [?]    [?]    [?]      -      -
43       43.2   44.4   45.2    [?]   45.7   45.4   44.7    [?]      -
44       44.6   45.9   46.8   47.4   47.6   47.5   47.0   46.2    [?]
45        [?]    [?]    [?]    [?]   49.4    [?]   49.3   48.7   47.9
46       47.6   48.9   49.9   50.7   51.3   51.5   51.6    [?]    [?]
47       49.1   50.4   51.5    [?]   53.1   53.6   53.8    [?]    [?]
48       50.5   51.9   53.1    [?]   54.9   55.6   56.1    [?]    [?]
49          -   53.4   54.6   55.8   56.8   57.7   58.4   59.1   59.6
50          -   54.9   56.2   57.5   58.6   59.7   60.7   61.6   62.5
51          -      -   57.8   59.2   60.5   61.8   63.0   64.2    [?]
52          -      -   59.3   60.8   62.3   63.8   65.3   66.8   68.3
53          -      -   60.9   62.5   64.2   65.8   67.6   69.4    [?]
54          -      -      -   64.2   66.0   67.9   69.9   72.0   74.1
55          -      -      -   65.9   67.8   69.9   72.2   74.5   77.0
56          -      -      -   67.6   69.7   72.0   74.5   77.1   79.9
57          -      -      -   69.2   71.5   74.0   76.7   79.7    [?]
58          -      -      -      -   73.4   76.1   79.0   82.3   85.8
59          -      -      -      -      -      -   81.3   84.9   88.7
60          -      -      -      -      -      -   83.6   87.4   91.6
61          -      -      -      -      -      -      -   90.0   94.5
62          -      -      -      -      -      -      -      -   97.4

* Throughout weights are given in lbs., heights in inches.

TABLE β. GLASGOW. BOYS. GROUP B. Weights for Height at each Age.

                                Actual Age.
Height      6      7      8      9     10     11     12     13     14
33        [?]      -      -      -      -      -      -      -      -
34       29.0      -      -      -      -      -      -      -      -
35        [?]    [?]      -      -      -      -      -      -      -
36        [?]    [?]      -      -      -      -      -      -      -
37        [?]    [?]    [?]      -      -      -      -      -      -
38        [?]    [?]   36.6      -      -      -      -      -      -
39       36.8    [?]    [?]    [?]      -      -      -      -      -
40       38.3   39.4   40.0   40.1   39.8      -      -      -      -
41       39.9    [?]    [?]    [?]    [?]      -      -      -      -
42        [?]   42.6    [?]    [?]    [?]    [?]      -      -      -
43       43.0    [?]   45.0   45.5   45.7   45.5      -      -      -
44       44.6   45.8   46.7    [?]    [?]   47.6      -      -      -
45        [?]   47.4   48.4    [?]   49.5    [?]    [?]      -      -
46       47.7   49.0   50.1    [?]   51.5    [?]    [?]    [?]      -
47       49.2   50.6   51.8   52.7   53.4    [?]    [?]    [?]    [?]
48       50.8   52.2   53.5   54.5   55.4   56.0   56.5   56.7   56.8
49       52.4   53.8   55.2   56.3   57.3    [?]    [?]   59.3    [?]
50       53.9   55.4   56.8   58.1   59.2   60.2   61.1    [?]    [?]
51          -   57.0   58.5   59.9   61.2   62.4   63.4   64.4    [?]
52          -   58.7   60.2    [?]    [?]   64.5   65.7   66.9   68.1
53          -      -   61.9   63.5   65.1   66.6   68.1   69.5   70.9
54          -      -      -   65.3   67.0   68.7   70.4   72.0   73.7
55          -      -      -   67.1   68.9   70.8   72.7   74.6    [?]
56          -      -      -      -   70.9   72.9   75.0    [?]   79.3
57          -      -      -      -   72.8   75.0   77.3    [?]   82.1
58          -      -      -      -      -    [?]   79.6   82.5    [?]
59          -      -      -      -      -   79.3   81.9   84.8   87.8
60          -      -      -      -      -      -   84.3   87.3   90.6
61          -      -      -      -      -      -      -   89.9   93.4
62          -      -      -      -      -      -      -   92.4   96.2
63          -      -      -      -      -      -      -   95.0   99.0

TABLE γ. GLASGOW. BOYS. GROUP C. Weights for Height at each Age.

                                Actual Age.
Height      6      7      8      9     10     11     12     13     14
35       30.5      -      -      -      -      -      -      -      -
36        [?]    [?]      -      -      -      -      -      -      -
37       33.7   36.0      -      -      -      -      -      -      -
38       35.2   37.5   38.6      -      -      -      -      -      -
39       36.8   39.0   40.1      -      -      -      -      -      -
40       38.4   40.5   41.7   41.8      -      -      -      -      -
41       40.0   42.0   43.2   43.5      -      -      -      -      -
42        [?]   43.5   44.7   45.1   44.7      -      -      -      -
43       43.1   45.0   46.3   46.7   46.5      -      -      -      -
44       44.7   46.6   47.8   48.4   48.3   47.5      -      -      -
45       46.2   48.1   49.3   50.0   50.1   49.6   48.5      -      -
46       47.8   49.6   50.8   51.6   51.9   51.7   50.9      -      -
47       49.4   51.1   52.4   53.2   53.7   53.7   53.3      -      -
48       51.0   52.6   53.9   54.9   55.5   55.8   55.7   55.4      -
49          -   54.1   55.4   56.5   57.3   57.9   58.2   58.2      -
50          -   55.6   57.0   58.1   59.1   59.9   60.6   61.0      -
51          -      -   58.5   59.8   60.9   62.0   63.0   63.8   64.6
52          -      -   60.0   61.4   62.7   64.1   65.4   66.7   67.9
53          -      -   61.6   63.0   64.5   66.1   67.8   69.5   71.2
54          -      -   63.1   64.7   66.4   68.2   70.2   72.3   74.6
55          -      -      -   66.3   68.2   70.3   72.6   75.1   77.9
56          -      -      -      -   70.0   72.3   75.0   77.9   81.2
57          -      -      -      -      -   74.4   77.4   80.8   84.5
58          -      -      -      -      -   76.5   79.8   83.6   87.8
59          -      -      -      -      -      -   82.2   86.4   91.2
60          -      -      -      -      -      -   84.6   89.2   94.5
61          -      -      -      -      -      -      -   92.1   97.8
62          -      -      -      -      -      -      -   94.9  101.1
63          -      -      -      -      -      -      -   97.7  104.4
64          -      -      -      -      -      -      -      -    [?]
65          -      -      -      -      -      -      -      -    [?]

TABLE 3. Glasgow. Height and Weight of Boys of Group A. Weight of Boys of 7.5-8.5 years.
TABLE 4. Glasgow. Height and Weight of Boys of Group A. Weight of Boys of 8.5-9.5 years.
TABLE 5. Glasgow. Height and Weight of Boys of Group A. Weight of Boys of 9.5-10.5 years.
TABLE 6. Glasgow. Height and Weight of Boys of Group A. Weight of Boys of 10.5-11.5 years.
TABLE 7. Glasgow. Height and Weight of Boys of Group A. Weight of Boys of 11.5-12.5 years.
TABLE 8. Glasgow. Height and Weight of Boys of Group A. Weight of Boys of 12.5-13.5 years.
TABLE 9. Glasgow. Height and Weight of Boys of Group A. Weight of Boys of 13.5-14.5 years.
TABLE 10. Glasgow. Height and Weight of Boys of Group B. Weight of Boys of 5.5-6.5 years.
TABLE 11. Glasgow. Height and Weight of Boys of Group B. Weight of Boys of 6.5-7.5 years.
TABLE 12. Glasgow. Height and Weight of Boys of Group B.
Weight of Boys of 7.5-8.5 years.
TABLE 13. Glasgow. Height and Weight of Boys of Group B. Weight of Boys of 8.5-9.5 years.
TABLE 14. Glasgow. Height and Weight of Boys of Group B. Weight of Boys of 9.5-10.5 years.
TABLE 15. Glasgow. Height and Weight of Boys of Group B. Weight of Boys of 10.5-11.5 years.
TABLE 16. Glasgow. Height and Weight of Boys of Group B. Weight of Boys of 11.5-12.5 years.
TABLE 17. Glasgow. Height and Weight of Boys of Group B. Weight of Boys of 12.5-13.5 years.
TABLE 18. Glasgow. Height and Weight of Boys of Group B. Weight of Boys of 13.5-14.5 years.
TABLE 19. Glasgow. Height and Weight of Boys of Group C.
Weight of Boys of 5.5-6.5 years.
TABLE 20. Glasgow. Height and Weight of Boys of Group C. Weight of Boys of 6.5-7.5 years.
TABLE 21. Glasgow. Height and Weight of Boys of Group C. Weight of Boys of 7.5-8.5 years.
TABLE 22. Glasgow. Height and Weight of Boys of Group C. Weight of Boys of 8.5-9.5 years.
TABLE 23. Glasgow. Height and Weight of Boys of Group C. Weight of Boys of 9.5-10.5 years.
TABLE 24. Glasgow. Height and Weight of Boys of Group C. Weight of Boys of 10.5-11.5 years.
TABLE 25. Glasgow. Height and Weight of Boys of Group C. Weight of Boys of 11.5-12.5 years.
TABLE 26. Glasgow. Height and Weight of Boys of Group C. Weight of Boys of 12.5-13.5 years.
TABLE 27. Glasgow. Height and Weight of Boys of Group C. Weight of Boys of 13.5-14.5 years.
TABLE 28. Glasgow. Height and Weight of Boys of Group D. Weight of Boys of 5.5-6.5 years.
TABLE 29.
Glasgow. Height and Weight of Boys of Group D. Weight of Boys of 6.5-7.5 years.
TABLE 30. Glasgow. Height and Weight of Boys of Group D. Weight of Boys of 7.5-8.5 years.
TABLE 31. Glasgow. Height and Weight of Boys of Group D. Weight of Boys of 8.5-9.5 years.
TABLE 32. Glasgow. Height and Weight of Boys of Group D.
Weight of Boys of 9.5-10.5 years.
TABLE 33. Glasgow. Height and Weight of Boys of Group D. Weight of Boys of 10.5-11.5 years.

TABLE 38. Glasgow. Height and Weight of Girls of Group A. Weight of Girls of 6.5-7.5 years.
TABLE 39. Glasgow. Height and Weight of Girls of Group A. Weight of Girls of 7.5-8.5 years.
TABLE 40. Glasgow. Height and Weight of Girls of Group A. Weight of Girls of 8.5-9.5 years.
TABLE 41. Glasgow. Height and Weight of Girls of Group A. Weight of Girls of 9.5-10.5 years.
TABLE 42. Glasgow. Height and Weight of Girls of Group A. Weight of Girls of 10.5-11.5 years.
TABLE 43. Glasgow. Height and Weight of Girls of Group A. Weight of Girls of 11.5-12.5 years.
TABLE 44. Glasgow. Height and Weight of Girls of Group A. Weight of Girls of 12.5-13.5 years.
TABLE 45. Glasgow. Height and Weight of Girls of Group A. Weight of Girls of 13.5-14.5 years.
TABLE 46. Glasgow. Height and Weight of Girls of Group B. Weight of Girls of 5.5-6.5 years.
TABLE 47. Glasgow. Height and Weight of Girls of Group B. Weight of Girls of 6.5-7.5 years.
TABLE 48. Glasgow. Height and Weight of Girls of Group B. Weight of Girls of 7.5-8.5 years.
TABLE 49. Glasgow. Height and Weight of Girls of Group B.
Weight of Girls of 8.5-9.5 years.
TABLE 50. Glasgow. Height and Weight of Girls of Group B. Weight of Girls of 9.5-10.5 years.
TABLE 51. Glasgow. Height and Weight of Girls of Group B. Weight of Girls of 10.5-11.5 years.
TABLE 52. Glasgow. Height and Weight of Girls of Group B. Weight of Girls of 11.5-12.5 years.
TABLE 53. Glasgow. Height and Weight of Girls of Group B. Weight of Girls of 12.5-13.5 years.
TABLE 54. Glasgow. Height and Weight of Girls of Group B. Weight of Girls of 13.5-14.5 years.
Ns ee 8 0 GaGa mee — — — — — 53 ,, Se ee sy Oe) I A Oh i ae Se SS 54 ,, SD oe LT SC oe one ele eel DOI. = Te A a ida: SS 56 ,. ee ee ee cieihien) Oe eS De Oe, —- 1— 1 TL a Ge Ge 2 lO ae Oe 2 ee eee | — 58 ,, 1 ae er i BY ay GT eee 59 ae 1 Th Fe Ae lla | GO ,, = it oe 4 il A SI 61 ,, = 1 peeeen MO Si RB Aye Op ee ee — ie 1 eo 680 = = SS ey a ee Ss a LD | | Totals 1 1 7 11 21 25 44 59 44 45 47 31 21 21 10 5 8 — 2 E. M. Exvprerron . 331 TABLE 55. Glasgow. Height and Weight of Girls of Group C. Weight of Girls of 5°5—6°5 years Height | | | | | | J Totals . ae = = => = > > 1 | ’ | al it uw? il al il | JL Sie AS eh ial 9 ™ iD KD D 9 2D © S 1 le Oe 4 } E29 ee a 6 Bet ID 2a des Cia nO, tee ge en pee ee eee 22 ee ee ee ee ee ee eee i eee eee | nae ee. “OO eS eae ee te ee = = 34 a SD 24 1G; 9) 0 5 eee ae ee ne ime Oey Ime Gs Oe) 1 — | 8 ee ee Pe NN sy Oph pe go live ae ee SD A 75 162 1G ey ee = 1 =) 8 7 59°99 38 64° enn ey Be eS ee | a ee eee Oa eee Ss i en ee 2 iss 2 E. M. ELpErtToN 337 TABLE 67. Glasgow. Height and Weight of Girls of Group D. Weight of Girls of 8°5—9°5 years Benet ed fh. all do periisetly salerale ols alae | | | | Totals | lp 18) 1p 1S 1S th 1s 1s 1 As tp 1B 1s 16 1 tH 1H 1H 1G Pap OS EO ESO eer eOp eT: Or NS) 2S) OX st SD, AQ eS SS) Sf So) La eS Gy Se SON west est estes stg CY, UG Ry GS LG ECS) “RO GOs KO AO! t= RS RS GAS, BS CD 1 1 = 1 1 De ee = 2165 : a 56 4 ee hes 17) 19013) 8-8 ae ae = | 59 ope 3 13 1819 Gar gaat Ee ee 70 VN et 1 POG HIG WT 1Oje7 —— 2) VA a Ti Vee eT 8 1813. 17. 84 3 5 8 Se 73 HO) | | gig ie an aera WaOb en dome, Ord At ee 69 | Sea oMlOL bo ya O64 a Te ed Ag S| (ti em esl el ee Dron a: Ae el ee re 15 Ome ee eave alle ele wi af ee ME SD 5 — ee (OSes ae = Se ee ee 1 6 ee a ee De Aiba [yee ee ice eee 4 Oe |e ner ee ee ee ee Sk oe 1 TABLE 68. Glasgow. Height and Weight of Girls of Group D. 
Weight of Girls of 9°5—10°5 years Height eee yd potas WW UD My AQ AQ WO ID ID WD ID ID W ID iD iS > 1 e) a) a) 41§—| — 1 — — SS SS 1 42, | 2 — i Sse a _ 3 eal he = = ee ener ae Abby a. lle at UGens Hote a Lees — — — 9 45 5, — 5 2213 1—-—— 2Z— — — — —- — — — 16 HS 5 aes UALS ey OSE, Ssh ots}e a} ee — — SS 44 Ula eA AG Ged no, ook 3 SS 68 48 ,, —— 4 4 612 919 15 4 3 2—— ~— — — — — — — — — — 78 Loe. — — —— 4 713 21131714 6 4 2 — 102 50.,, | — — — — — Gree As opal SA 7, — il — — 92 él, J|————— — = IG Gi 18 2 64 52 45 fF — — — — — — — — é 2 8 62 21—-—1%1—— 42 2.3 213 —-—— — — — 1 18 — 2 By i 6 k= 12 | — al — 2 — 6 eS ee ee eee ee ee See) Se 1 24 30 49 61 57 52 560 338 Height and Weight of School Children in Glasgow TABLE 69. Glasgow. Height and Weight of Girls of Group D. Weight of Girls of 10°5—11°5 years wel Ca yor lon = ame ee Ss ee ee eee eg |. eae a AS ee ae aay Wes, Nor Me Mites 5 2 Wa re 2 La) —— 1 2 Lm 4 Je. | 3.19 2 1 oe SS SS SS eee 11 Ue = 0 96. ae gel ee oe =o Sees 4g, [ee Ll -4 510 4827053 1 ie eee 39 Or ———— 1 9141614 5 5 38 21 1’ — — — — — — — 72 20. ;, —— 1 —— 4 7 8141015 7 1 131 1TH— 1 ~— — — — — — 74 41, J — — — — — 1 6 6 412141011 4 5— 2 2 — — — — — — — — 77 Gy ie Nee See 1.413 8 8 810 °3 2°20 = oo = ee Bore oo a a ee ae 8 3.77 3059" oe Gael eee 55 TS Raien| Pee, fon, ey Le 391-347. 5 29'9 4 3 oe eee Lem (enema ee a ee, ee at Sa 3156 1. 3.49) 2 1 18 Cauca (eee seer ee Say, hat Q: 4. = SE ae 8 ie a Ss ee eee Td Ws eae es Se eS ee i 2 a ee ihe 4 Se eS a 1 Totals} 1 2 9 10 15 24 44 43 44 54 47 43 34 35 30 1418138 4 5 3 2 1 2 — 1 | 498 TABLE 70. Glasgow. Height and Weight of Girls of Group D. Weight of Girls of 11°5—12°5 years Height Totals a a a a a eh Ss fee Sa ee ce ee 6i— a 4 eee 2 i 15 48 ” 4 = = ll 18 on 7-5 11 el 36 50 ,, a 12 1 ied 9.02. 55 areas 8 514 Ose 78 52 ,. = 8 By a) 9S) 1 63 : 58 == 9 ee) 78 ob 5; — 5 21-17-15 4 95 55 —— 1 45/6; 72 5 10 53 56. eee IGS Be Sib} 47 evans eG 4 27 aoa ee 4 11 59 ,, ee al i nial) 5 6. 
Totals (Table 70): 4 15 30 42 61 77 76 65 62 47 26 38 | 594

TABLE 71. Glasgow. Height and Weight of Girls of Group D. Weight of Girls of 12.5–13.5 years. [Table body illegible in the source scan.]
Totals: 538

TABLE 72. Glasgow. Height and Weight of Girls of Group D. Weight of Girls of 13.5–14.5 years. [Table body illegible in the source scan.]
Totals: 1 1 3 6 7 19 43 33 41 47 50 24 30 20 11 13 10 5 2 2 – 1 | 369

NUMERICAL ILLUSTRATIONS OF THE VARIATE DIFFERENCE CORRELATION METHOD.

By BEATRICE M. CAVE AND KARL PEARSON, F.R.S.

In 1904 Miss F. E. Cave in a memoir on the correlation of barometric heights, published in the R. S. Proc. Vol. LXXIV. pp. 407 et seq., endeavoured to get rid of seasonal change by correlating first differences of daily readings at two stations. A similar method was used by Mr R. H. Hooker in a paper published some time later in the Journal of the Royal Statistical Society, Vol. LXVII. pp. 396 et seq., 1905.
This method was generalised by “Student” in the last number of Biometrika (Vol. X. pp. 179, 180). He showed that if there were two variates x and y, such that
$$x = \phi(t) + X, \qquad y = f(t) + Y,$$
where X and Y are the parts of x and y independent of the time t, then the spurious correlation arising from x and y being both functions of the time could be got rid of by correlating the differences of x and y, and that ultimately, when m is sufficiently large:
$$r_{\Delta^m x\,\Delta^m y} = r_{\Delta^{m+1} x\,\Delta^{m+1} y} = \text{etc.} = r_{XY},$$
so that the correlation of x and y, free from the spurious time (or it might be position) correlation, i.e. $r_{XY}$, could be found by correlating the successive differences of x and y. When the correlations of the differences remain steady for several successive values, then we may reasonably suppose that we have reached the correlation $r_{XY}$*.

* Having been in communication with “Student,” while he was writing his paper, I know that the interpretation put by Dr Anderson (Biometrika, Vol. X. p. 279) on “Student’s” words (Ibid. p. 180) is incorrect. “Student” had in mind, if he did not clearly express it, the ultimate steadiness of $r_{\Delta^m x\,\Delta^m y}$ for a succession of values of m. K. P.

This method is still further developed by Dr Anderson of Petrograd, who in a valuable memoir published in this Journal has provided the probable errors of the successive difference correlations of a system of variables:
$$X_1, X_2, X_3, \ldots X_n, \qquad Y_1, Y_2, Y_3, \ldots Y_n,$$
where the correlations of random pairs of values of the variates, or the product sums,
$$S\{(X_s - \bar{X})(X_{s'} - \bar{X})\}, \qquad S\{(Y_s - \bar{Y})(Y_{s'} - \bar{Y})\},$$
are both zero. Dr Anderson has further provided us with the values of the standard deviations of the successive differences, i.e. $\sigma_{\Delta^m X}$ and $\sigma_{\Delta^m Y}$, which represent the ultimate values of $\sigma_{\Delta^m x}$ and $\sigma_{\Delta^m y}$ when we have carried m so far that the time effect has been eliminated.
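The procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the computation used in the paper: the function names are hypothetical, the data are simulated (a common quadratic time trend plus residuals X and Y with population correlation 0.6), and the ratio $\sqrt{\binom{2m}{m}}$ for the standard deviation of the mth difference is the standard result for uncorrelated values of equal variance, which may not be the form in which Dr Anderson states it.

```python
import math
import random


def differences(series, m):
    """Return the m-th finite differences of a sequence (m = 0 gives a copy)."""
    out = list(series)
    for _ in range(m):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def correlation(u, v):
    """Product-moment correlation of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    s_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    s_uu = sum((a - mean_u) ** 2 for a in u)
    s_vv = sum((b - mean_v) ** 2 for b in v)
    return s_uv / math.sqrt(s_uu * s_vv)


def difference_correlations(x, y, max_m):
    """Correlations of the 0th, 1st, ..., max_m-th differences of x and y."""
    return [correlation(differences(x, m), differences(y, m))
            for m in range(max_m + 1)]


def sd_ratio_of_mth_difference(m):
    """sigma(m-th difference) / sigma(X) for uncorrelated X of equal variance."""
    return math.sqrt(math.comb(2 * m, m))


def simulate(n=5000, r_xy=0.6, seed=1):
    """Hypothetical data: x = phi(t) + X, y = f(t) + Y with corr(X, Y) = r_xy."""
    rng = random.Random(seed)
    x, y = [], []
    for t in range(n):
        big_x = rng.gauss(0.0, 1.0)
        big_y = r_xy * big_x + math.sqrt(1.0 - r_xy ** 2) * rng.gauss(0.0, 1.0)
        trend = 1e-3 * t * t  # assumed form of the time functions
        x.append(trend + big_x)
        y.append(2.0 * trend + big_y)
    return x, y


if __name__ == "__main__":
    x, y = simulate()
    for m, r in enumerate(difference_correlations(x, y, 6)):
        print(f"m = {m}: r = {r:+.3f}")
```

With a trend of this kind the correlation of the raw values is close to unity, while the correlations of the second and higher differences should settle near the residual correlation assumed in the simulation.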
The new method appears to be one of very great importance, and like many new methods it has been developed in a co-operative manner, which is a good reason for not entitling it by the name of any single contributor. We prefer to term it the Variate Difference Correlation Method.

With the exception of a few illustrations given by “Student,” no numerical work on the correlation of the higher differences has yet been attempted. It is clear that much numerical work will have to be undertaken before we can feel complete confidence in our knowledge of the range and of the limitations of the new method. We have yet to ascertain how far in different types of material a real stability of difference correlations is ultimately reached, and how far various assumptions made in the course of the fundamental demonstration apply in dealing practically with actual statistical data. One of the most important assumptions made, if there be n values of the variates, is that arising from the reduction in the number of values as we take the means which occur in successive differences, and a like assumption is made in the case of standard deviations. Thus for example:
$$\frac{1}{n} S_{1}^{n}(X_s) = \bar{X},$$
but
$$\frac{1}{n-1} S_{1}^{n-1}(X_s - X_{s+1}) = (X_1 - X_n)/(n-1),$$
and will not be sensibly zero, although it is assumed to be, unless n be very large. Similar remarks apply to the sums used in the standard deviations, i.e. we assume in the proof
$$\frac{1}{n-1} S_{1}^{n-1}(X_s^2) = \frac{1}{n-1} S_{1}^{n-1}(X_{s+1}^2) = \frac{1}{n} S_{1}^{n}(X_s^2).$$
Ultimately with the mth differences we come in the proof to relations of the type
$$\frac{1}{n-m} S_{1}^{n-m}(X_s) = \frac{1}{n} S_{1}^{n}(X_s) \quad \text{and} \quad \frac{1}{n-m} S_{1}^{n-m}(X_s^2) = \frac{1}{n} S_{1}^{n}(X_s^2).$$
Now such relations will undoubtedly be very approximately true, if the X’s are random variates uncorrelated to each other, and provided m is small compared with n.
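The point about the mean of the first differences can be checked by direct arithmetic. The sketch below uses exact fractions and a short, made-up series of ten residual values (not data from the paper); the function names are hypothetical. It shows that the mean of the differences $X_s - X_{s+1}$ reduces to $(X_1 - X_n)/(n-1)$, and that the mean of the first n − m values need not equal the mean of all n when n is small.

```python
from fractions import Fraction


def mean_of_first_differences(xs):
    """Mean of X_s - X_{s+1} for s = 1 .. n-1, computed term by term."""
    diffs = [a - b for a, b in zip(xs, xs[1:])]
    return sum(diffs, Fraction(0)) / len(diffs)


def telescoped_mean(xs):
    """The same mean obtained by telescoping: (X_1 - X_n) / (n - 1)."""
    return Fraction(xs[0] - xs[-1], len(xs) - 1)


def truncated_mean_gap(xs, m):
    """Mean of the first n-m values minus the mean of all n values."""
    n = len(xs)
    head = xs[:n - m]
    return Fraction(sum(head), len(head)) - Fraction(sum(xs), n)


# Made-up residuals, n = 10; an assumption for illustration only.
xs = [4, -1, 3, -2, 0, 5, -3, 2, -4, 1]

print(mean_of_first_differences(xs), telescoped_mean(xs))  # the two agree
print(truncated_mean_gap(xs, 2))  # not zero for so short a series
```

For a series of some thousands of daily readings both quantities would be negligible; for twenty or thirty yearly values they need not be, which is the difficulty the text raises.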
These conditions seem amply satisfied when we proceed to fourth or sixth differences in barometric pressures, taken, say, over ten or twelve years; the addition of four or five daily pressures will hardly affect sensibly either the mean or the standard deviation. But such extensive data, besides involving a great deal of labour in the difference work*, are not those which, perhaps, most frequently demand the attention of the statistician, whether he be economist, sociologist or a student of scientific agriculture. In such cases it not infrequently happens that the available data only provide a range of 20 to, perhaps, at most 50 years; and we need to discover whether there is a true relationship between our variates, apart from a continuous change in both due to the time factor. At present accurate statistics of annual trade or revenue, or satisfactory annual demographic data hardly extend at most beyond a period of 50 years. Very often, under even approximately like methods of record, we shall hardly have more than twenty years’ trustworthy returns. Not only has the method of record been changed, but the conditions of transit and trade may have been immensely modified and in a manner which we could not suppose to be even approximately represented by a continuous function of the time.

The object of the present paper is to illustrate the theory of the variate difference correlation method in its present stage of development on a short series of economic data, in order to test what approximation there is in such short series to stability, and further how nearly Dr Anderson’s values for the successive standard deviations apply to such cases.

We have selected as our data ten economic indices of Italian prosperity for the years 1885 to 1912, together with a “Synthetic Index,” formed by taking the arithmetic mean of the ten economic indices referred to.
These eleven indices are given by Professor Giorgio Mortara in an interesting memoir: “Sintomi statistici delle condizioni economiche d’Italia,” which was published in the Giornale degli Economisti e Rivista di Statistica, for February, 1914, and form Tabella I of that memoir, which we here reproduce in part as Table I. The indices in each case are obtained by dividing the returns for any year by the mean of the returns for the years 1901–05, inclusive, and multiplying as usual by 100. The indices are for returns of (i) Gross Receipts of Railways, (ii) Shipping, loaded and unloaded at the ports, (iii) Effective Revenue of the State, (iv) International Commerce, Value of Imports and Exports, (v) Number of postal Letters and private Telegrams, (vi) Amount of Stamp Duties, (vii) Savings Banks’ Returns, (viii) Importation of Coal, (ix) Gross returns of consumption of Tobacco, (x) Returns of Coffee imported. Professor Mortara has drawn attention to the very high correlations of these individual indices with each other, and of each of them with the “Synthetic Index.” The latter correlation is, however, to a certain extent spurious. For if $I_1, I_2, \ldots I_{10}$ be the individual indices and $I_s$ the synthetic index, then
$$I_s = \tfrac{1}{10}(I_1 + I_2 + \ldots + I_{10}),$$
and any individual index $I_q$, if there be no correlation between such individual indices, would give
$$\frac{1}{n} S\{(I_s - \bar{I}_s)(I_q - \bar{I}_q)\} = \frac{1}{10}\,\frac{1}{n} S(I_q - \bar{I}_q)^2 = \frac{1}{10}\,\sigma_q^2,$$
$$\sigma_s^2 = \frac{1}{n} S(I_s - \bar{I}_s)^2 = \frac{1}{100}\left(\sigma_1^2 + \sigma_2^2 + \ldots + \sigma_{10}^2\right),$$
and accordingly
$$r_{sq} = \frac{\sigma_q}{\sqrt{\sigma_1^2 + \sigma_2^2 + \ldots + \sigma_{10}^2}}.$$

* A discussion of the correlations of the higher differences in barometric pressures will, we hope, be shortly issued.

TABLE I. Professor Mortara’s Table of Index Values of Italy. Numerical Index (Mean 1901–1905) = 100.

(A query marks a figure illegible in the source scan. The rows after 1901 are missing from the source.)

Year  Railways  Shipping  Revenue  Commerce  Post  Stamp Duties  Savings  Coal  Tobacco  Coffee  Synthetic Index
1885     61        63       78        72      38        82          47     53      82      98         67.4
1886     62        63       79        74      40        87          53     52      86      94         69.0
1887     68        73       82        78      42        94          55     64      88      91         73.5
1888      ?         ?       83        62      43        98          57     69      86      81         72.0
1889     71        77       85        70      45        99          58     71      86      78         74.0
1890     72        78       86        66      46        97          59     78      87      81         75.0
1891     72        72       85        60      48        96          60     70      88      80         73.1
1892      ?         ?       86        64      51        96          64     69      89      80         74.5
1893      ?         ?        ?        64      55        95          65     66      90      73          ?
1894     72        72       86        63      57        93          65     84      89      71         75.2
1895     73        76       89        66      60        92          68     77      88      70         75.9
1896     75         ?       90        67      64        94          70      ?       ?       ?          ?
1897     78        80       90        68      68        96          73     76       ?       ?         79.1
1898     81        84        ?         ?       ?        96          75     79      89      78          ?
1899     86        88       93        88      75        96          79     87      91      82         86.5
1900     89        89       94        91      78        96          83     88      92      82         88.2
1901     90         ?        ?         ?       ?         ?           ?      ?       ?       ?          ?

[…] .85), with Importation of Coal (? > .75), and International Commerce (? < .68), and fairly highly with Revenue (c. .55). On the other hand the sixth difference correlations with Post (c. .15), with Stamp Duties (c. .24) and with Savings (> .23) are all such that they might well have arisen from the spurious element in the Synthetic Index correlations, and are all less than their Andersonian (steady value) probable errors. Almost the same may be said of the Railway Index; it is not beyond suspicion of being spurious, and is scarcely significant having regard to its probable error. The Consumption of Coffee is also not very closely associated with the Synthetic Index; it is only about twice its probable error (.427 ± .205), and a good deal of its value may be spurious. Further in the case of both Coffee and Railways, the correlations are still falling between .04 and .05 for each difference. The last individual index remaining is that for Consumption of Tobacco and although the correlation of sixth differences is not really significant it is negative (−.247 ± .235), and is exhibiting a steady negative rise.
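The size of the spurious element in a correlation with the Synthetic Index can be put in numbers. The sketch below evaluates the ratio of one index's standard deviation to the root of the sum of the ten squared standard deviations, which is what the argument for mutually uncorrelated indices leads to. The function name and the standard deviations used are hypothetical; equal standard deviations are an assumption made only to obtain a round figure.

```python
import math


def spurious_r(sigmas, q):
    """Correlation of index q with the mean of all the indices, when the
    indices are mutually uncorrelated and have standard deviations `sigmas`."""
    return sigmas[q] / math.sqrt(sum(s * s for s in sigmas))


# Ten uncorrelated indices of equal variability (an assumption).
equal = [1.0] * 10
print(round(spurious_r(equal, 0), 3))  # 0.316, i.e. 1/sqrt(10)

# One index three times as variable as each of the other nine (hypothetical).
unequal = [3.0] + [1.0] * 9
print(spurious_r(unequal, 0))
```

On the equal-variability assumption a correlation of about .3 with the Synthetic Index would arise with no real association at all, so values such as c. .15 or c. .24 carry no weight; an index more variable than its fellows would show a larger spurious value.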
Stripped therefore of the common time factor the Synthetic Index will be seen to be no very appropriate measure of trade, business activity, and spare money for savings and luxuries. With Post, Stamp Duties and Savings, it has probably only a spurious relationship, expenditure on railways has little influence, that on luxuries is very slightly significant, or indeed in the case of tobacco negative. It is, however, closely related to variations in external trade, i.e. imports (including coal), to exports and shipping and to effective revenue. It appears to us that a suitable general index of prosperity, which will distinguish between a continuous growth in all factors with the time, and favourable and unfavourable fluctuations from this growth, can only be obtained, when there has been far more ample study of the associations of individual indices among themselves, and of these indices after they have been freed from the time factor, i.e. of associations between high difference correlations. From this standpoint the study of General Index theories is at present in its infancy.

TABLE IV. Italian Indices. Correlations and Probable Errors and Standard Deviations. [The table, printed sideways in the original, is illegible in the source scan. For each pair among the Synthetic Index, Railways, Shipping, Revenue, International Commerce, Post, Stamp Duties, Savings, Coal, Tobacco and Coffee it gives the correlation and its probable error for the quantities themselves and for the first to sixth differences, together with the standard deviations.]

(b) Railway Index. This index is very noteworthy in the nature of its associations after removal of the time factor. We have reached a steady correlation (c. .62) with Shipping, but beyond this no values of first class importance appear. The relation of Railways and Revenue after falling practically to zero, now stands at something greater than .42 and might rise higher, but the relation to International Commerce as a whole is zero, which suggests that the goods imported and exported are not in the bulk carried by rail. Further although the final value of the Railway and Post correlation is scarcely sensible (−.214 ± .239), it has been continuously negative from the second difference, and thus suggests that increased expenditure on the post means lessened profit for the railways. This might be interpreted in two ways: (i) that business conducted by post or telegram lessens rail intercommunication by person, or (ii) that in the case of state-railways, there is not an increased profit to the railways from carrying larger mails. But still more remarkable are the negative correlations of Stamp Duties, Savings, Tobacco and Coffee with Railways; none of them are very large, and all but Savings are, perhaps, of the order of their probable errors. But taken as a whole they suggest that when the Italian spends little money in going about, then he saves more, or spends more on such luxuries as tobacco and coffee. Lastly we have the Coal Index. It might be supposed that a year with great coal importation would signify great railway activity, and this is the judgment which would be made from the raw correlations of these variates.
But the actual facts are exhibited in a correlation still falling at the sixth difference and hardly significant having regard to its probable error. The inferences formed must be: (i) that imported coal is used largely at the ports of disembarkation or travels inland by other than railway transit, (ii) that the imported coal is largely used on the railways themselves and that its cost is a heavy tax on their resources.

(c) Shipping Index. As we might anticipate this is highly correlated with (i) Railways (c. .62), (ii) Revenue (c. .75) and less highly but very significantly with (iii) International Commerce (c. .54) and (iv) Coal (c. .58), but it appears to have no relation whatever with Post, Stamp Duties and Savings, and when we come to luxuries, their importation is clearly not a factor of shipping prosperity. Neither in the case of Tobacco nor of Coffee are the correlations really significant; with the former we have an increasing negative correlation and with the latter a decreasing positive one already below its probable error. Thus we see that neither directly by bulk of importation nor indirectly by immediate increase of consumption, does a rise of shipping mark significant rises in the use of luxuries such as Tobacco and Coffee. It would be of interest to ascertain whether increased consumption of luxuries does not rather follow than accompany favourable trade fluctuations.

(d) Revenue Index. This index as we might expect is fairly highly correlated with Shipping (c. .75). It has relatively small relation to Railways (.422 ± .206) at least at the sixth difference and a somewhat similar value (c. .42 ± .20) for Coffee. Thus the suggestions arise that revenue is but little produced by the railways and that coffee is not a very large factor of the custom dues.
It is astonishing to find, however, that Post, Stamps and Savings have negative correlations with Revenue of −.388 ± .213, −.255 ± .234 and −.154 ± .244 respectively, which, if scarcely significant, have been in each case for several differences persistent in sign. Even the correlation with Tobacco is small, falling and insignificant (< .115 ± .247), and that with Coal which might be supposed to be high as marking good trade times is hardly significant although apparently rising (? > .270 ± .232). Lastly the correlation of Revenue with International Commerce is again small, falling, and insignificant (< .214 ± .239). Thus Revenue or the “entrato effectivo dello stato” seems to provide an index which has little valuable relation to any other characteristic of prosperity beyond shipping.

(e) International Commerce. Here we find no single final individual index correlation greater than .54, which is that for Shipping. The next most important correlations are with Post (> .47) and Stamp Duty (c. .46). With Railways the correlation is zero, and with Revenue also falling and insignificant. With Savings, Coal, Tobacco and Coffee the correlations are all insignificant; in fact in the last three cases not only are the values less than their probable errors, but they are still falling. It is thus clear that in Italy the total of Exports and Imports is no measure of all-round prosperity; they do not immediately increase either savings or the consumption of luxuries.

(f) Post and Telegrams. Here we have the lowest series of correlations we have yet reached. Post values have no significant relation to fluctuations in Railway (c. −.20 ± .24), to Shipping (−.059 ± .249), Stamp Duties (−.027 ± .250), Coal (−.050 ± .250), Tobacco (+.108 ± .247) or Coffee (−.133 ± .246) Indices. It is significantly correlated only with International Commerce (> +.47 ± .19) and, perhaps, significantly with Savings (+.336 ± .222) but negatively with Revenue (c. −.38 ± .21).
In short the number of letters and telegrams in Italy is hardly a mark of any other favourable fluctuation in prosperity, beyond International Commerce.

(g) Stamp Duties. This Index is correlated positively and significantly with International Commerce (c. +·46 ± ·20) and positively, and doubtfully with Savings (c. +·35 ± ·22). It is correlated insignificantly and negatively with Railways (−·261 ± ·233), Shipping (−·009 ± ·250), Revenue (−·255 ± ·234), Post (−·027 ± ·250), and Tobacco (−·129 ± ·246); it is correlated positively and insignificantly with Coal (+·052 ± ·250) and Coffee (+·222 ± ·238). Thus again freed from continuous time changes, fluctuations in the Stamp Duty Index are of small value as a measure of contemporaneous general prosperity.

(h) Savings Bank Index. There are practically only two correlations of any importance with Savings and these are both negative, namely those with Railways (−·431 ± ·204) and with Tobacco (−·431 ± ·204). Hence it would appear that when the Italian people is in a saving mood, it spares on transit by rail and on the consumption of tobacco, and when it expends on these luxuries, then it does not save. Savings have small and possibly not significant correlations with Post (> +·33 ± ·22) and Stamp Duties (< +·353 ± ·219), and insignificant and positive correlations with International Commerce (> +·27 ± ·23), Coal (> +·19 ± ·24) and Coffee (c. +·05 ± ·25); they have insignificant negative correlations with Shipping (< −·03 ± ·25) and Revenue (< −·15 ± ·24). Savings are thus, apart from continual time change, no very satisfactory measure of general prosperity, and a fluctuating increase is usually accompanied by a reduction of luxuries.

(i) Coal Index. The importation of coal has little relation to any factor of prosperity besides Shipping (c. +·58 ± ·17).
With Railways the correlation is not quite double the probable error and the value, even at the sixth difference, appears still falling. The correlation with Revenue only just exceeds the probable error (+·270 ± ·232). With International Commerce (+·171 ± ·243), Stamp Duties (+·052 ± ·250), Savings (+·196 ± ·241) and Coffee (+·152 ± ·245) the correlations are less than their probable errors, small and in some cases still falling. With the Postal Index, the correlation is negative, insignificant and falling. Alone in the case of the Tobacco Index does the correlation appear to be nearly as significant as in that of Shipping, but it is negative and increasing* (−·514 ± ·184), while in the case of Shipping it was steady. It is singular to find that Coal, the increased import of which should mark increased industrial activity, is, beyond the naturally influenced Shipping, alone effectively associated with the consumption of Tobacco.

(j) Tobacco Index. This is of considerable interest as marking the association of indices of trade prosperity with the consumption of a luxury. With four exceptions Tobacco is negatively correlated, although often insignificantly, with the other indices. Revenue (+·115 ± ·247), International Commerce (+·015 ± ·250), and Post (+·108 ± ·247) are all positive, insignificant, and in the first two cases still falling. The correlation with Coffee is positive and might, perhaps, be significant (+·326 ± ·224), but it appears to be still falling. With Coal and Savings there are probably significant negative correlations (−·514 ± ·184, and −·431 ± ·204 respectively); with Railways (−·243 ± ·236), Shipping (−·271 ± ·229) and Stamp Duties (−·129 ± ·246) there are insignificant negative correlations, but they tend to confirm each other in sign.
Thus we see that the consumption of tobacco can hardly be considered as a measure of general prosperity; it appears to be greatest when trade conditions are unfavourable, and in particular when savings are least and manufacturing conditions as measured by the importation of coal are slack. The result suggests the pipe of the unemployed at the street corner, rather than the increased expenditure of the fully occupied artisan.

(k) Coffee Index. This is another luxury and the results are very similar. There appears a significant correlation with Revenue (+·400 ± ·210), which might easily be explained, and there is a falling but possibly significant correlation with Tobacco (+·326 ± ·224). With all other indices the relationships are insignificant: Railways (−·204 ± ·240), Shipping (+·232 ± ·237), International Commerce (+·142 ± ·245), Post (−·133 ± ·246), Stamp Duties (+·222 ± ·238), Savings (+·046 ± ·250) and Coal (+·152 ± ·245). Apart therefore from the general increase of consumption with the time, during which time the general prosperity of the nation has increased, it would not appear that the consumption of a luxury has any organic relationship to prosperity. We do not find that a favourable trade fluctuation is associated with increased consumption of luxuries. In fact the suggestion arises that in the case of tobacco the consumption may be greater in a period of depression.

* It is, perhaps, hard to believe that so much smuggling could be carried on in colliers, that it would seriously affect the profits of the tobacco monopoly!

Conclusions. While we lay no special stress on any of the results suggested by the difference correlations above studied (far more intimate economic knowledge of Italian affairs and methods of measurement would be requisite) we yet venture to insist on one or two general considerations.
The very superficial statements, so frequently met with, that such and such variates, both changing rapidly with the time, are essentially causative will doubtless cease to have any scientific currency, directly the method of variate differences is fully appreciated. We shall no longer assert that the fall of the phthisis death-rate can be off-hand causatively associated with the contemporaneous rise in the number of persons dying in institutions, or that the increased expenditure on luxuries is necessarily a measure of increased national prosperity. If we turn as in the present paper to the actual correlations of the indices themselves, we find in every case an arid and scarcely undulating waste of high correlation. No one can obtain any nourishment whatever from the statement that the Tobacco Index is correlated with the Revenue Index to the amount of ·983 and with the Savings Bank Index to the extent of ·984! The organic relationship between these variates is wholly obscured by the continuous increase of all three of them with the time. But when we proceed to sixth differences and see that the consumption of tobacco has little, if any, relation to revenue, and is associated substantially but negatively with savings, we seem to touch realities, and realities of some worth. Again what can we learn, if we are told that the Shipping Index is correlated to the extent of ·99 with both the Revenue and the Savings Bank Indices? We might imagine, that increase of shipping was not only the primary cause of increase in Italian revenue, but also the essential origin of any increase in the Italian peasant’s and artisan’s savings! An appeal to the variate difference method shows how fallacious such imaginings would be!
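The contrast drawn above, between a near-perfect correlation of the raw indices and a small correlation of their higher differences, can be illustrated with a minimal sketch. The two series below are hypothetical (they are not the Italian indices): each is a smooth growth with the time plus an independent random fluctuation, and the names `pearson_r`, `difference` and `demo` are introduced here purely for illustration.

```python
import math
import random


def pearson_r(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def difference(series, order):
    """Difference a series `order` times; each pass shortens it by one value."""
    out = list(series)
    for _ in range(order):
        out = [later - earlier for earlier, later in zip(out, out[1:])]
    return out


def demo(seed=0, n=28, order=6):
    """Correlate two hypothetical indices before and after differencing.

    Both series grow smoothly with the time (an assumed quadratic trend)
    but their fluctuations are drawn independently of one another.
    """
    rng = random.Random(seed)
    a = [100 + 10 * t + 0.5 * t * t + rng.gauss(0, 5) for t in range(n)]
    b = [50 + 4 * t + 0.3 * t * t + rng.gauss(0, 5) for t in range(n)]
    raw = pearson_r(a, b)
    diffed = pearson_r(difference(a, order), difference(b, order))
    return raw, diffed


if __name__ == "__main__":
    raw, diffed = demo()
    print(f"raw correlation: {raw:.3f}; sixth-difference correlation: {diffed:.3f}")
```

Under these assumptions the raw correlation is produced almost wholly by the common trend, while differencing removes any polynomial trend of lower degree than the order of differencing and leaves only the correlation of the fluctuations.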
An examination of the sixth difference correlations shows that while prosperity of the revenue is closely associated with trade as measured by shipping (·77), the correlation is not nearly perfect; on the other hand there appears to be no significant organic correlation at all (−·154 ± ·244) between the prosperity of the revenue and the savings of the Italian populace. As we have noted a knowledge of local conditions and methods of reckoning quantities might enable us to put other and, perhaps, more luminous interpretations on our results. But there can be small doubt that to proceed from the actual correlation of such indices to the correlations of their higher differences gives the feeling of clearing away the sand of the desert, and reaching all the ordered arrangements of an excavated town below; the slight undulations of the waste above are really fallacious, and enable us to appreciate nothing of the actual topography of the city. The method is at present in its infancy, but it gives hope of greater results than almost any recent development of statistics, for there has been no source more fruitful of fallacious statistical argument than the common influence of the time factor.
One sees at once how the method may be applied to growth problems in man and in lower forms of life with a view to measuring common extraneous influences, to a whole variety of economic and medical problems obscured by the influences of the national growth factor, and to a great range of questions in social affairs where contemporaneous change of the community in innumerable factors has been interpreted as a causative nexus, or society assumed to be at least an organic whole; the flowers in a meadow would undoubtedly exhibit highly correlated development, but it is not a measure of mutual causation, and the development of various social factors has to be freed from the time effect, before we can really appreciate their organic relationships.

In the present paper we have dealt only with very sparse “populations” (only 28 values of the variates), but this has enabled us to consider not only a very large number of correlations, but to see the practical influence of terminal conditions on our theory. This may we think be summed up in the statement that the Andersonian formulae for the standard deviations will hardly in many practical cases be more than very roughly approximated before the size of the population becomes too small to make the deductions reliable. Further in most cases our difference correlations have hardly even with the sixth differences reached a steady state. Possibly they have done so in the cases of Rail and Shipping, Shipping and Post, Shipping and Coal, Revenue and Post, International Commerce and Stamp Duties, International Commerce and Savings, Savings and Coffee, and in one or other additional pair. But in the great bulk of instances there is still a more or less steady rising or falling appreciable in the difference correlations, and all we can really say is that the final value, the true $r_{xy}$, will be somewhat greater or less than a given number.
From an examination of the actual numerical working of the correlations, it appears to us that the terminal values are in the case of these short series of very great importance. It is further clear that the theory as given by “Student” depends upon certain equalities which are not fulfilled in practice in short series. We await with much interest the complete publication of Dr Anderson’s work, and hope to find a fuller discussion of the allowance to be made in short series for the influence of the terminal state of affairs* on the steadiness of the series and on the approach to the standard-deviation formulae. But apart from these lesser points, our present numerical investigation has convinced us of the very great value of the new method of Variate Difference Correlations.

* For example if we measure $X$ from its mean,
$$\sigma^2_{\Delta X} = \frac{1}{n-1}\, S_1^{\,n-1} (X_s - X_{s+1})^2 - (\overline{\Delta X})^2 = \frac{1}{n-1}\left\{ S_1^{\,n}(X_s^2) - X_n^2 + S_1^{\,n}(X_s^2) - X_1^2 \right\} - (\overline{\Delta X})^2,$$
since $S(X_s X_{s+1})$ is by hypothesis zero,
$$= 2\sigma^2_X + \frac{2}{n-1}\left\{ \sigma^2_X - \tfrac{1}{2}(X_1^2 + X_n^2) \right\} - (\overline{\Delta X})^2 .$$
The first term $2\sigma^2_X$ gives Dr Anderson’s value of $\sigma^2_{\Delta X}$. Now $\overline{\Delta X}$ equals $\frac{1}{n-1}\, S_1^{\,n-1}(X_s - X_{s+1}) = \frac{1}{n-1}(X_1 - X_n)$. Thus the remainder is
$$\frac{2}{n-1}\left[ \sigma^2_X - \tfrac{1}{2}(X_1^2 + X_n^2) - \frac{1}{2(n-1)}(X_1 - X_n)^2 \right].$$
Now the average value on many trials of $\tfrac{1}{2}(X_1^2 + X_n^2)$ will be $\sigma^2_X$, and of $(X_1 - X_n)^2$, $2\sigma^2_X$, so that the remainder may be $-\dfrac{2\sigma^2_X}{(n-1)^2}$, and small for $n$ large; but for $n$ small as above such a relation as $\sigma^2_{\Delta X} = 2\sigma^2_X$ and the similar but more complex relations of the standard deviation formulae for the higher differences need not hold for any individual case, and thus the steadiness of the difference correlation series, and the approach to the Andersonian formulae are very far from attained.

AN EXAMINATION OF SOME RECENT STUDIES OF THE INHERITANCE FACTOR IN INSANITY

By DAVID HERON, D.Sc.*

In the last few years a number of studies of the inheritance factor in insanity have been published in America, Germany and England.
The value of investigation of such a topic cannot be overestimated. We are quite certain that the prevalence of insanity is not falling; many of us indeed believe that the statistics suffice to demonstrate that it is substantially increasing, and that we can attribute this increase not in the first place to the intenser strain of modern life, but to the greater power of modern treatment to check or temporarily cure attack, and thus allow wider possibility of reproduction to members of affected stocks. Indeed the problem seems closely associated with an essential difficulty of modern civilisation, the greater protection of physically and mentally degenerate stocks unaccompanied by any adequate limitation of their thereby increased power of procreation; the inheritance factor thus tends to aid the relatively greater survival of the socially unfit. The studies we have referred to would be of great importance from this aspect of eugenics if (i) the data were collected without conscious or unconscious bias, and (ii) the inferences drawn from them followed logically from the data thus collected.

Unfortunately it is not only in the interpretation of statistics that adequate training is required. It is equally important that in the actual collection of them we should proceed, not only free from the bias which arises from the hurried acceptance of dogmatic theories of heredity, but what is often still more needful, free from the bias which is almost certain to waylay our progress, if we have not initially considered with trained insight the fallacies which may result from our method of recording or even tabulating our material. The day of the amateur in science is gone; no one now pays any attention to men who propound elaborate atomic theories or stellar hypotheses, without having had preliminary training in physical or astronomical science.
There are still, however, some who appear willing to accept the statement of statistical data or the inferences drawn from those data by men who have clearly had no adequate training in statistical science. The craniologist, the anthropologist, even the biological student of heredity and evolution are recognising that a statistical training is needful for the true interpretation of many of the facts in their special fields of research. The physiologist still appears to believe that he can deal with the average effects of diverse dietaries or the pathologist with the “mass-phenomena” of the hereditary factor in insanity without any training in statistical method. A physicist might just as logically assume that without mathematical training he could give an adequate mathematical account of a physical phenomenon, or a cosmic theorist suppose that he was effectively furnished for astronomical research by the perusal of a popular primer on the stars! The statistical calculus cannot be mastered by any easier road than the differential calculus, or, to put a more apt illustration, statistical training is as needful a preliminary to the handling of statistics, as time spent in a physiological laboratory to the effective handling of tissues. In twenty years it will be unnecessary to insist on these points, they will be universally recognised in the courts of science; but at present it is not only necessary to reiterate unpleasant truths, but to emphasise their validity by illustrations which bring home forcibly to scientist and layman alike the danger of amateur statistical handling. To state that a man is in error is not sufficient, if he continues time after time to repeat his assertions, apparently under the belief that incessant repetition will convince the world of the value of his theories.

* This paper formed the second portion of a lecture given at the Galton Laboratory on March 3, 1914.
In the case of the inheritance factor in insanity we are not dealing with any purely academic question of science. We are up against one of the most difficult problems of modern life, where true advice is of urgent importance to the nation as well as to the individual. It is not only the medical man but the layman who seeks guidance in the question of the marriage of members of insane stocks, and a laboratory like the Galton Laboratory knows how often advice on such points is sought. It is disheartening when help is rendered to the seeker to be faced with the criticism: “But Professor ____ says I may marry if I take a wife of sound stock,” or “Dr ____ recommends marriage, although my father was insane, because I am over twenty-five and still sane myself.” When teaching of this kind, arising solely from false interpretation of defective data, is spread widecast in a dozen different papers or journals, it is not sufficient to issue a brief statement of its futility. It is needful to give it the coup de grâce by a more lengthy criticism of its fallacies and their illustration in a form more likely to impress the imagination.

The attempt is made in this paper to deal with only one of the authors, who have contributed fallacious eugenic rules to those seeking knowledge on the influence of the hereditary factor in insanity. In a long series of papers Dr F. W. Mott, Pathologist to the London County Asylums, has stated that when the children of insane parents become insane, they do so at a much earlier age than did their parents, and on the basis of this assertion he has drawn some very sweeping conclusions for practical conduct. Thus in the British Medical Journal of May 11, 1912 (p.
1060), he states that “this signal tendency of insane offspring to suffer with a more intense form of the disease and at an early age, as shown in the above figures and tables, is of great importance for the following reasons: first, it is one of Nature’s methods of ending or mending a degenerate stock; secondly, it is of importance to the physician, for he can say that there is a diminishing risk of the child of an insane parent becoming insane after he has passed 25, a matter of great importance in the question of marriage; thirdly, it is of importance in connection with the subject of social surgery of the insane, for when the first attack of insanity occurs in the parent the children for the most part have all been born.... Sterilization would therefore be applicable to relatively few parents admitted to asylums.”

Put briefly, Dr Mott’s views are that in “Antedating” or “Anticipation,” in this alleged tendency of the offspring to become insane at an earlier age than their parents, we have Nature’s method of purifying degenerate stocks, that the children of insane parents who are still normal at the age of 25 may safely marry*, and that it is useless to take any special measures to limit the reproduction of the insane since nearly all their children are born before the onset of insanity.

These conclusions, if proved to be correct, would be of the utmost importance to the Eugenist. If the Law of Antedating or Anticipation really acts in the way Dr Mott has suggested, then it would seem to be unnecessary to take any special Eugenic action in the case of the insane and indeed the “Law” has already been used in support of this view.
Thus in a leading article in the British Medical Journal†, which deals with Dr Mott’s work, it is stated that “This intensification of mental disease in the young – this ‘anticipation’ as it is called, which is one of Nature’s methods of ending or mending a degenerate stock, is specially important in connection with sterilization, as the figures given by Dr Mott show that when the first attack of insanity occurs in the parent the children have for the most part all been born. Sterilization, therefore, would be applicable in relatively few cases.”

It is at least obvious that when views such as these are taken of the “Law of Anticipation,” it merits the most careful examination.

* See for instance Problems in Eugenics, p. 426.
† May 11, 1912, p. 1089.

Let us consider, then, first of all, Dr Mott’s presentation of the case for anticipation. For some years past Dr Mott has been engaged in the collection of cases in which two or more members of a family are or have been resident in London County Asylums, and has noted wherever possible the age of onset of the insanity. Information was thus obtained regarding 217 pairs of father and offspring, and 291 pairs of mother and offspring and the results are summed up in the following table. Thus in comparing the age at onset of insanity in father and offspring, we find that among the fathers only 1·4% became insane before the age of 20, while among the offspring the percentage was 26·2. These figures are also shown graphically in Figs. 1 and 2*. Here the horizontal scale represents the age of onset in 5-year groups, the vertical scale the percentages of cases occurring in each age group.

TABLE I.
Percentages of Cases whose First Attack of Insanity occurred within Various Age-periods.

Age-periods        Father       Offspring    Mother        Offspring
                   Per cent.    Per cent.    Per cent.     Per cent.
Under 20 years      1·4         26·2          0·6          27·8
20–24 years         0·4         18·0          3·4          15·7
25–29 years         1·4         18·0          4·4          18·2          Adolescence
30–34 years         9·6         13·0         [illegible]   13·4
35–39 years        11·5          7·3          9·2          10·0
40–44 years         9·2          6·4         10·3           5·8
45–49 years        14·3          6·0         12·0           3·7
50–54 years        17·5          0·9         12·3           2·4          Involutional period
55–59 years        13·8          3·7         14·0          [illegible]
60–64 years        10·1          -           11·6           1·3
65–69 years         5·0          -            8·8           -
70–74 years         4·6          0·4          3·1           -
75–79 years         0·4          -            1·3           -
80 and over         0·4          -            0·6           -

I have been obliged to follow Dr Mott in treating the “under 20” group as a 5-year group as otherwise my diagrams would bear no resemblance to his, but this procedure is far from satisfactory when such a large proportion of the cases in this group are congenital cases in which the age of onset should be taken at 0 years.

The tables and diagrams show that among the parents more than half the cases occur after the age of 50, while among the offspring, more than half occur before 30, and this is taken to prove that there is Anticipation or Antedating in Insanity. This will perhaps be made more evident if the percentages of those who became insane before the age of 25 are given in each case. Among the fathers, 2%, and among the mothers, 4% became insane before the age of 25. Among the offspring, on the other hand, the percentage is 44.

Another way of looking at the matter is to take the average age of onset of insanity in each case. Dr Mott gives a Table showing these averages but unfortunately has omitted the congenital cases so that the extent of anticipation is considerably under-estimated, and the form in which the data are given does not permit of an accurate calculation of the actual averages. From the information given it appears, however, that the average age at onset of insanity among the parents is about 50 years, among the offspring about 26 years, showing an anticipation or antedating of some 24 years.

* I am very grateful to Miss H. Gertrude Jones, the Hon. Secretary of the Galton Laboratory, for the diagrams which illustrate this lecture.
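The cumulative percentages quoted above (about 2 per cent. of fathers and 44 per cent. of offspring insane before 25; more than half the parents after 50 and more than half the offspring before 30) can be checked by simple addition. The sketch below does so for the father and offspring-of-fathers columns only; the figures are a transcription of Table I and should be treated as an assumption, and the helper names are hypothetical.

```python
# Lower bound of each age-period of Table I ("Under 20" is entered as 0).
AGE_FROM = [0, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80]

# Per cent. of cases in each age-period, as transcribed from Table I
# (an assumption of this sketch); nil entries are taken as 0.
FATHER = [1.4, 0.4, 1.4, 9.6, 11.5, 9.2, 14.3, 17.5, 13.8, 10.1, 5.0, 4.6, 0.4, 0.4]
OFFSPRING = [26.2, 18.0, 18.0, 13.0, 7.3, 6.4, 6.0, 0.9, 3.7, 0.0, 0.0, 0.4, 0.0, 0.0]


def share_before(column, age):
    """Per cent. of cases with onset before `age` (a boundary of an age-period)."""
    return sum(p for start, p in zip(AGE_FROM, column) if start < age)


def share_from(column, age):
    """Per cent. of cases with onset at or after `age` (a boundary of an age-period)."""
    return sum(p for start, p in zip(AGE_FROM, column) if start >= age)


if __name__ == "__main__":
    print("Fathers before 25:   %.1f" % share_before(FATHER, 25))
    print("Offspring before 25: %.1f" % share_before(OFFSPRING, 25))
    print("Fathers from 50:     %.1f" % share_from(FATHER, 50))
    print("Offspring before 30: %.1f" % share_before(OFFSPRING, 30))
```

The helpers are only meaningful when `age` coincides with the start of an age-period, since the table gives no finer subdivision.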
[Figs. 1 and 2. Diagrams to illustrate the Distribution of Age at Onset of Insanity in Parent and Offspring. (Mott.) Key: Fathers; Offspring.]

Now these conclusions, if satisfactorily demonstrated, would obviously be of the highest importance, but they were immediately challenged by Professor Karl Pearson in a letter which appeared in Nature of November 21, 1912 (p. 334). Professor Pearson’s letter is as follows:

On an Apparent Fallacy in the Statistical Treatment of “Antedating” in the Inheritance of Pathological Conditions.

The problem of the antedating of family diseases is one of very great interest, and is likely to be more studied in the near future than ever it has been in the past. The idea of antedating, i.e. the appearance of an hereditary disease at an earlier age in the offspring than in the parent, has been referred to by Darwin and has no doubt been considered by others before him. Quite recently, studying the subject on insanity, Dr F. W. Mott speaks of antedating or anticipation as “Nature’s method of eliminating unsound elements in a stock” (“Problems in Eugenics,” papers communicated to the First International Eugenics Congress, 1912, p. 426).

I am unable to follow Dr Mott’s proof of the case for antedating in insanity. It appears to me to depend upon a statistical fallacy, but this apparent fallacy may not be real, and I should like more light on the matter. This is peculiarly desirable, because I understand further evidence in favour of antedating is soon forthcoming for other diseases, and will follow much the same lines of reasoning.

Let us consider the whole of one generation of affected persons at any time in the community, and let $n_s$ represent the number who develop the disease at age $s$, then the generation is represented by
$$n_0,\ n_1,\ n_2,\ \ldots\ n_s,\ \ldots\ n_{100},\ \text{say.}$$
Possibly some of these groups will not appear at all, but that is of little importance for our present purpose.
Let us make the assumptions (1) that there is no antedating at all; (2) that there is no inheritance of age of onset; thus each individual reproduces the population of the affected reduced in the ratio of $p$ to 1. Then the family of any affected person, whatever the age at which he developed the disease, would represent on the average the distribution
$$pn_0,\ pn_1,\ pn_2,\ \ldots\ pn_s,\ \ldots\ pn_{100}.$$
The sum of such families would give precisely the age distribution at onset of the preceding generation.

Now let us suppose that for any reason certain of the groups of the first generation do not produce offspring at all, or only in reduced numbers. Say that $g_s$ only of the $n_s$ are able to reproduce their kind; then of the older generation, limited to parents, the distribution will be
$$g_0 n_0 + g_1 n_1 + g_2 n_2 + \ldots + g_s n_s + \ldots + g_{100} n_{100};$$
but the younger generation will be
$$p\,(g_0 n_0 + g_1 n_1 + g_2 n_2 + \ldots + g_s n_s + \ldots + g_{100} n_{100})\,(n_0 + \ldots + n_s + \ldots + n_{100});$$
i.e. the relative proportions will remain absolutely the same.

The average age at onset and the frequency distribution of the older generation, that of the parents, will be entirely different from that of the offspring and will depend wholly on what values we give to the $g$’s. If frequency curves be formed of the two generations they will differ substantially from each other. This difference is not a result or a demonstration of any physiological principle of antedating but is solely due to the fact that those who develop the disease at different ages are not equally likely to marry and become parents.

A quite striking instance of the fallacy, if it be such, would be to consider the antedating of “violent deaths.” Fully a quarter of such deaths in males, nearly a half in females, occur before the age of twenty years.
Consider now the parents and offspring who die from violent deaths; clearly there would be no representative of death from violence under twenty in the parent generation, and we should have a most marked case of antedating, because the offspring generation would contain all the infant deaths from violence.

In the case of insanity, is the man or woman who develops insanity at an early age as likely to become a parent as one who develops it at a later age? I think there is no doubt as to the answer to be given; those who become insane before twenty-five, even if they recover, are far less likely to become parents than those who become insane at late ages; many, indeed, of them considering the high death-rate of the insane, will die before they could become parents of large families. Now Dr Mott took 508 pairs of parents and offspring, “collected from the records of 464 insane parents whose 500 insane offspring had also been resident in the County Council Asylums,” and ascertained the age of first attack. As at present advised, it seems to me that his data must indicate a most marked antedating of disease in the offspring, but an antedating which is wholly spurious. There is, I think, a further grievous fallacy involved in this method of considering the problem, but before discussing that I should like to see if my criticism of this method of approaching the problem of antedating can be met.

KARL PEARSON.
Biometric Laboratory, University College, London, November 11, 1912.

Dr Mott has referred to this letter in his Report for 1912*, but it will be more convenient to deal with his reply after we have examined the method by which his data have been collected and the use made of the data.

Let us consider first of all how the data were obtained. Dr Mott in describing his material says that it consists of a collection of cases in the London County Asylums where two or more persons are related to one another.
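The argument of Professor Pearson’s letter quoted above can be put into figures with a short sketch. It assumes, as the letter does, no antedating and no inheritance of age of onset; the onset ages, the numbers at each age and the proportions able to become parents are hypothetical values chosen only to illustrate the point, and the function names are introduced here for the purpose.

```python
def mean_age(ages, weights):
    """Weighted mean age of onset."""
    return sum(a * w for a, w in zip(ages, weights)) / sum(weights)


def generation_means(ages, n, g):
    """Return (mean onset age among parents, mean onset age among offspring).

    ages : onset ages s
    n    : number n_s of one generation developing the disease at each age
    g    : assumed proportion g_s of each group able to become parents
    """
    # The older generation, limited to parents, is distributed as g_s * n_s.
    parents = [gs * ns for gs, ns in zip(g, n)]
    # Every family reproduces the distribution n_s (the common factor p and the
    # total number of parents cancel when only proportions are wanted).
    offspring = list(n)
    return mean_age(ages, parents), mean_age(ages, offspring)


# Hypothetical illustration: equal numbers at each onset age, with early
# onset sharply reducing the chance of becoming a parent.
AGES = [15, 25, 35, 45, 55, 65]
N = [100] * 6
G = [0.0, 0.2, 0.6, 1.0, 1.0, 1.0]

if __name__ == "__main__":
    parents, offspring = generation_means(AGES, N, G)
    print("parents: %.1f years, offspring: %.1f years" % (parents, offspring))
```

With these assumed values the parents appear to develop the disease some ten years later than their offspring although, by construction, nothing but unequal chances of parenthood distinguishes the two generations; setting every proportion to 1 removes the difference entirely.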
Thus Dr Mott has dealt—not with a series of complete pedigrees in which every member is included, whether insane or normal, but with a series of cases in which two or more members of a family are known to have been in London County Asylums. No notice is taken of those who are normal throughout their lives and no allowance is made for those who are normal at the time the record is made but who may afterwards become insane. Do cases selected in this way provide a complete or impartial view of the facts? Some of Dr Mott’s own comments on his data throw a considerable amount of light on this point. In his Report for 1909+ he says: “From all the Asylums I have received valuable reports, but in the case of the older asylums it has been a matter of the utmost difficulty to trace the records of so many years back,” and in his Report for 1910+ he says, ‘Some of the asylum authorities have gone through their case books for a number of years back, but the results have not been satisfactory owing to the difficulty of obtaining particulars without a living repre- sentative of the family being resident in the asylum—for instance, 110 old cases * Annual Report of the London County Council for 1912, Vol. 11. p. 62. + Twentieth Annual Report of the Asylums Committee of the L.C.C., p. 90. + Twenty-first Report of the Asylums Committee of the L.C.C., p. 94. D. HERon 363 reported from Bexley have been rejected as the relatives in the other London County Asylums could not be traced, for no instance has been included unless full particulars could be obtained.” It is thus clear that not all the cases could be traced and that there was special difficulty in tracing the older cases. What is the effect of a selection of this kind? A study of the following hypothetical cases may serve to throw some light on this point. TABLE ILI. Anticipation or Antedating in Insanity. Hypothetical Laamples to show the Effect of Dr Mott's Selection of Cases. 
| First Example | Second Example Mother: Born Sod ss we oi te. | 1873 1833 Married... a ey aay oe 1893 1853 Became Insane and admitted to Asylum 1913 1873 Age at First Attack seis ise Pot 40 40 Died fee oe if ave eee 1914 1874 Son: Born... ace 900 See Scie oc 1894 1854 Became Insane ... “ine ies nor 1894* 1914 Admitted to Asylum... apo oe 1914 1914 Age at First Attack ... a ne 0 60 The mothers in those two examples have exactly parallel careers. In each case the mother became insane at the age of 40 and only lived one year in the asylum. In the first case the son was a congenital idiot but was only admitted to an asylum at the age of 20. The age of onset in this case is taken at 0 years and the case shows marked “ anticipation.” In the second case the mother also became insane at the age of 40, the son not till the age of 60, 40 years after his mother’s death. The second example thus tells against the Law of Anticipation. Are these two cases equally likely to appear in Dr Mott’s data ? In the first case mother and son are in the asylum at the same time and were admitted within a year of each other. It is very improbable that the relationship would escape notice and such a case is almost certain to be recorded. In the second case, however, the son is not admitted to an asylum till 40 years after his mother’s death. Even if the family remained in the same area for 40 years after the mother’s death, it would obviously be very difficult to connect the histories of mother and son. This case, which tells against the Law of Anticipation, is almost certain to escape notice. A spurious anticipation or antedating is thus inevitable owing to the method of collecting the data. It has also been pointed out that Dr Mott has made no allowance for those who are mentally normal at the time the record is made but may subsequently * Congenital Idiot. 364 Recent Studies of the Inheritance Factor in Insanity become insane, and this introduces further spurious anticipation. 
Another hypothetical example will perhaps make this clear. Let us take the case of a mother with six children, five of whom have become insane as follows:

TABLE III.

                             Mother                   Children
                                        1      2      3      4      5      6
Born                          1830    1850   1852   1854   1856   1858   1860
Became Insane                 1860      -    1872   1896   1914   1888   1860†
Age at Onset of Insanity        30      -*     20     42     58     30      0

* Alive, 64 years of age and still normal.
† Congenital Idiot.

The extent to which this family would show anticipation or antedating would depend very largely on the time at which the record was made, as is shown in the following table.

TABLE IV.

Date of     Age of Onset of Insanity in     Average for    Amount of
Record      Mother     Children             Children       Anticipation
1860          30       0                        0             30
1872          30       0, 20                   10             20
1888          30       0, 20, 30               16.7           13.3
1896          30       0, 20, 30, 42           23.0            7.0
1914          30       0, 20, 30, 42, 58       30              0

If the case were noted in 1860 then the age of onset of insanity in the mother is 30 years and of the child 0 years, a clear case of anticipation, and nothing would be known of the fact that four other children will afterwards become insane and will bring the average age of onset in the children up to 30 years, exactly the same as that of the mother. Nor is the record even now complete, for if the eldest child ever becomes insane, the age of onset in his case must be at least 64 years and this will further increase the average age of onset in the children.

It is thus clear that in dealing with incomplete families and ignoring the possibility that those who are normal at the time of record may afterwards become insane, Dr Mott has introduced a further spurious anticipation or antedating.

If we examine carefully the first pedigree given by Dr Mott at the Eugenics Congress‡, we see clearly how much of the anticipation recorded by Dr Mott has probably arisen.

‡ Problems in Eugenics, p. 413.
Unfortunately this is the only pedigree for which sufficient details have been given to enable its completeness to be tested. The pedigree and Dr Mott’s description of it are as follows:

“A.B., an alien Jew, aged 54, was admitted to an asylum for the first time suffering from involutional melancholia; he has a sister who has not been in an asylum, but, as events turned out, bore the latent seeds of insanity. The man is married to a healthy woman who bore him a large family; the first five are quite healthy, then comes a congenital imbecile epileptic (cong.)*, then two healthy children followed by a daughter who becomes insane at 23, then a son insane at 22, and lastly two children who are up to the present free from any taint. The sister of A.B. is married and has a family of ten, seven girls and three boys; one of the females was admitted to the asylum at the age of 19, and since this pedigree was constructed a brother of hers has been admitted aged 24. Half-black† circles are insane. The pedigree is instructive; it shows direct and collateral heredity; it also shows remarkably well the signal tendency to the occurrence of insanity at an early age in the children of an insane and potentially insane parent.”

[Pedigree diagram. Labels: 3 Brothers, 6 Sisters; ages at onset 23, 22, 19; filled circle = Insane; 13 children: 9 Alive (4 Sons, 5 Daughters), 4 Dead; 3 Insane, 1 Cong.]

Fig. 3. Pedigree to illustrate the effect of Dr Mott’s selection of cases. F. W. Mott: “Heredity and Eugenics in relation to Insanity.” Problems in Eugenics, p. 413.

This pedigree was given as above in July 1912, and in an address previously delivered before the Manchester Medical Society on Oct. 4, 1911, Dr Mott gave the same pedigree, but without any reference to the nephew of A.B. (brother of the girl who became insane at 19) who became insane “since the pedigree was constructed,” so that this man became insane between 1911 and 1912 and this serves to “date” the pedigree. Now it should be noted that at least five of the children of A.B.
are over 23 years of age and up to the present time healthy. But all these children are alive and if any one of them afterwards becomes insane, the average age of onset of insanity in the children will be raised; and it is clear that the more incomplete the pedigree the greater the amount of spurious anticipation. Again Dr Mott states that in nephews and nieces the age of onset is earlier than in uncles and aunts. In 1911 this pedigree gives a case in which an uncle became insane at 54, his niece at 19; but one year later a nephew who became insane at 24 has to be added, thus raising the average, and there are eight more children some at least of whom may become insane at later ages. As before the incompleteness of the pedigree introduces an artificial and spurious anticipation or antedating. The remedy is obvious; we must only deal with completed families.

* This does not agree with Dr Mott’s pedigree which gives the congenital case as the seventh instead of the sixth child.
† According to our usual custom, they are represented by full black circles in Fig. 3.

A further fallacy involved in Dr Mott’s method of work must now be noted. In directly comparing the age of onset in parent and child, Dr Mott has ignored the fact that in the parent the incidence of insanity is for all practical purposes limited to the age of 20 and over, since cases of congenital defect and of adolescent insanity hardly ever marry. Among the general population of asylums, however, 12% become insane before the age of 20 and in Dr Mott’s selected data the percentage rises to 27, or more than a quarter of the whole become insane before 20. This in itself causes a very marked spurious anticipation. As Professor Pearson has shown (p. 361 above), if we were to investigate the age at death in parent and child from accident or violence, we should find the same spurious anticipation.
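The effect of the date of record on an incomplete family can be checked by direct arithmetic. The following Python sketch recomputes, for the hypothetical family of Table III, the average age at onset among the children already insane at each date of record and the resulting apparent anticipation, as tabulated in Table IV. The data layout and the function name are illustrative assumptions introduced here, not part of the original analysis.

```python
# Hypothetical family of Table III: year of birth and year of onset of
# insanity for the mother and her six children (None = still normal in 1914).
MOTHER = {"born": 1830, "onset": 1860}
CHILDREN = [
    {"born": 1850, "onset": None},
    {"born": 1852, "onset": 1872},
    {"born": 1854, "onset": 1896},
    {"born": 1856, "onset": 1914},
    {"born": 1858, "onset": 1888},
    {"born": 1860, "onset": 1860},  # congenital case, onset taken at age 0
]


def apparent_anticipation(record_year):
    """Return (mean onset age of children, anticipation) as seen at record_year.

    Only children already insane at the date of record are counted; this
    truncation is the source of the spurious anticipation.
    """
    mother_age = MOTHER["onset"] - MOTHER["born"]
    ages = [
        c["onset"] - c["born"]
        for c in CHILDREN
        if c["onset"] is not None and c["onset"] <= record_year
    ]
    mean_age = sum(ages) / len(ages)
    return mean_age, mother_age - mean_age


for year in (1860, 1872, 1888, 1896, 1914):
    mean_age, gap = apparent_anticipation(year)
    print(year, round(mean_age, 1), round(gap, 1))
```

The later the record is made, the more of the late cases are included and the smaller the apparent anticipation becomes, which is the point of Table IV.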
There are thus three fallacies involved in Dr Mott’s work. In the first place a spurious anticipation or antedating arises from the inclusion in the record of families whose history has not yet been completed, for those who become insane at late ages in the younger generation do not appear. Secondly, even with families whose history is completed, those cases in which the insanity of parent and child is contemporaneous are far more likely to be recorded than those in which the child becomes insane long after the parent*, and thus the cases which show anticipation are more likely to appear in the record than those which tell against Dr Mott’s views. Thirdly, by directly comparing parent and child, he has practically limited one of the two groups which are being compared to ages at onset of over 20 years and has thus obtained further spurious anticipation.

* Dr Mott states (Archives of Neurology, Vol. VI. p. 82) that “the main bulk of the cards (i.e. his records), however, refer to parents and offspring admitted to the asylums within the last fifteen years.”

Dr Mott also lays stress on the appearance of insanity in a more intense form in the younger generation. “I have proved,” he says†, “that there is a signal tendency in the insane offspring of insane parents for the insanity to occur at an earlier age and in a more intense form in a large proportion of cases, for the form of insanity is usually either congenital imbecility, insanity of adolescence, or the more severe form of dementia praecox, the primary dementia of adolescence, which is generally an incurable disease.”

† Archives of Neurology, Vol. VI. p. 82.

But we have already seen that Dr Mott’s method of collecting his data is such that an enormous preponderance of early cases of insanity in the younger generation is inevitable and of course such cases are largely incurable. Type of disease is very closely related to the age of onset and
by selecting the latter we can alter the proportion of any particular type of insanity. Dr Mott has obtained his material in such a way that, in the younger generation, cases of insanity coming on late in life are much less likely to be recorded than those which appear in early life, and hence the early cases are in a majority, but the change in age of onset, and consequently of the type, is entirely spurious and arises solely from the way in which the material has been obtained.

We can now deal with the reply Dr Mott has made to Professor Pearson’s criticisms. In his Annual Report for 1912 (p. 62), Dr Mott says: “Professor Karl Pearson, writing to Nature, November 21, 1912, ‘On an apparent fallacy in the statistical treatment of “Antedating” in the inheritance of pathological conditions,’ criticises on mathematical grounds the evidence of anticipation. I do not feel myself competent to reply to the opinion of such an eminent authority on mathematics applied to biometrics, but it does not militate against my conclusions, nor explain away the fact that a large proportion of the insane offspring of insane parents are affected with imbecility or adolescent insanity; for granting the assumption that there is no antedating at all, we might rightly expect the ages at onset of insane offspring of insane parents to be comparable with the ages at onset of all the admissions to the asylums during the same period*. This is by no means the case, for amongst the insane offspring there is a far greater proportion affected early in life, as is shown in the following figures and curves” (they appear here as Fig. 4 and Table V).
According to these figures the onset of insanity among the recorded insane offspring of insane parents is considerably earlier than among the general admissions to asylums, but it has already been shown that this is due to the fact that the data have been selected in such a way that the early cases in the younger generation are the most likely to appear. Further, if Dr Mott’s argument be a valid one, we might also expect the ages at onset of the insane parents of these insane offspring to be comparable with the ages at onset of all the admissions to asylums during the same period. This is by no means the case, as is shown in Fig. 5 below (see also Tables I and V). We see here that the insanity of the parents comes on at a much later period than among the general admissions to asylums and that a far smaller proportion is affected early in life. If Dr Mott’s method of argument be sound, he has not only to deal with an antedating of insanity among the offspring but also a post-dating of insanity among the parents. Both are of course spurious and arise from the peculiar selection of the data and from the fact that, owing to differential death-rates, the ages at onset of “admissions” will never be the same as the ages at onset of the admitted, i.e. the asylum population, at any time.

* “We might rightly expect” these ages to be different, because “admissions” are not the same as the population in the country who have at one time or another been insane. The percentages of total cases of acute mania, of senile insanity, of congenital idiocy, and of melancholia, who reach the asylums, are not the same. The reader has to distinguish between the population of admissions, the population of admitted, and the insane population of the country. A sample of the latter may be reached from completed family histories, but not from records on admission or from records of an asylum population.
[Fig. 4: All Admissions compared with Insane Children of Insane Parents. Fig. 5: All Admissions compared with Insane Parents of Insane Children. Horizontal axis in each diagram: Age at Onset of Insanity, in groups 25-, 35-, 45-, 55-, 65-, 75-.]

Figs. 4 and 5. Diagram to illustrate the Distribution of Age at Onset of Insanity among: (1) The Insane Offspring of Insane Parents. (2) The Insane Parents of Insane Offspring. (3) All Admissions to L.C.C. Asylums.

TABLE V.
Percentage Comparison of the Age at time of Onset of Insanity in the Insane Offspring of Insane Parents and the General Admissions to the London County Asylums.

                          MALE                        FEMALE                       TOTAL
Age at Onset     4489 direct   274 insane    5097 direct   389 insane    9579 direct   663 insane
of Insanity      admissions    offspring     admissions    offspring     admissions    offspring
                 during last   of insane     during last   of insane     during last   of insane
                 four years    parents       four years    parents       four years    parents
Under 25            20.0         43.8           20.2         44.2           20.1         44.0
25-34               19.9         27.7           19.9         28.0           19.9         27.9
35-44               21.9         13.8           21.5         16.7           21.7         15.5
45-54               17.7         10.2           18.6          7.4           18.2          8.5
55-64               13.3          3.6           12.4          2.8           12.7          3.2
65-74                5.7          0.7            5.9          0.8            5.8          0.7
75-                  1.5           -             1.6           -             1.5           -

41 male imbeciles out of 274 offspring.
54 female imbeciles out of 389 offspring.
95 male and female imbeciles out of 663 offspring.

It is possible to illustrate the various fallacies which vitiate Dr Mott’s conclusions regarding anticipation by considering the age at death of parent and child. I do not know whether it is generally recognised that it is exceedingly difficult to get any considerable body of data in which the ages at death of a parent and all his children are given, for of course the record is incomplete and biassed until the death of the last surviving member, and in some cases to get a complete record we must trace the history of a family for over 150 years.
George the IIIrd, for instance, was born in 1738 and all but one of his 15 children were still alive in 1810, 72 years afterwards, and the last surviving son, Duke of Cumberland and King of Hanover, did not die till 1851, 113 years after his father’s birth; and this is by no means an extreme case. In the material I am about to describe I found one case where the interval was 160 years. Another difficulty which arises is the tendency in practically all family histories to omit infant deaths, so that we do not get a complete record. It seems probable that the deaths of minors are not represented in such records in anything like their true proportion and that the differences are greater than might be expected to arise from differences of physique and nurture due to class. Thus records of the Landed Gentry give 31 deaths per 1000 males under 20 years* while actual experience shows 163 to 197 per 1000†. But in the records of the reigning families of Europe we get a practically complete record of all members and therefore from von Behr’s Genealogie der in Europa regierenden Fürstenhäuser‡, I have extracted particulars of the age at death of over 2000 individuals, all belonging to the 18th century. There was here no selection: every child was entered and every family had been traced from the birth of the parents till the death of the last survivor.

Now in Dr Mott’s data we have already seen that cases in which the age at onset of insanity in parent and child is contemporaneous are most likely to be recorded. We can test the effect of a selection of this kind by investigating the effect of selecting, from our data regarding the age at death among those royal families, only those individuals who died within a certain number of years of their father’s death, and the results are given below in Table VI, p. 370.
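The selection just described amounts to a simple filter on the interval between the father’s death and the child’s death. The Python sketch below is an illustration only, under stated assumptions: the records are a small hypothetical set invented for the purpose and are not the royal-family data, and the record layout and function name are likewise assumptions.

```python
# Hypothetical records (not the source data): each pair is
# (child's age at death, years by which the child survived the father;
#  a negative interval means the child died in the father's lifetime).
RECORDS = [
    (1, -30), (3, -25), (15, -10), (18, 5),
    (35, 10), (45, 25), (55, 35), (62, 45),
    (70, 55), (78, 65), (85, 70),
]


def share_dying_under(records, age_limit=20, max_interval=None):
    """Fraction of retained children who died under age_limit.

    With max_interval set, only children who died fewer than max_interval
    years after the father are retained; max_interval=0 keeps only those
    who died in the father's lifetime. None keeps every record.
    """
    kept = [
        age for age, interval in records
        if max_interval is None or interval < max_interval
    ]
    return sum(age < age_limit for age in kept) / len(kept)


for limit in (None, 60, 40, 20, 0):
    print(limit, round(share_dying_under(RECORDS, max_interval=limit), 3))
```

The early deaths are never removed by such a filter while the late deaths progressively are, so the proportion dying young can only rise as the selection becomes more stringent.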
* See Pearson: Proc. R. S. Vol. 65, p. 291.
† Statistics of Families, p. 73.
‡ Tauchnitz, Leipzig, 1870.

TABLE VI.
Illustrating the Effect of Selection of Material on the Distribution of Age at Death.
(Reigning Houses in Europe, 18th Century.)

                                                         Children who died:
                   All Cases,         within 60 years    within 40 years    within 20 years    in their
Age at Death       Unselected         of their           of their           of their           father’s
                   Data               father’s death     father’s death     father’s death     lifetime
                   Numbers    %       Numbers    %       Numbers    %       Numbers    %       Numbers    %
Under 20             680     37.2       680     39.1       680     46.7       680     62.4       648     82.7
20-39                277     15.1       277     15.9       277     19.0       254     23.3       121     15.4
40-59                336     18.4       336     19.3       274     18.8       127     11.6        15      1.9
60-79                450     24.6       395     22.7       214     14.7        29      2.7         -       -
80 and over           86      4.7        49      2.8        10      0.7         -       -          -       -
Totals              1829       -       1737       -       1455       -       1090       -        784       -
Average Age
at Death*            35.9               33.7               26.9               16.2                7.7

When we deal with the whole of the data, absolutely unselected, every family being complete and traced to the death of the last surviving member, we find that 680 out of 1829 or 37.2% died under 20 years of age. Let us now apply a very slight selection to the data and reject the 92 cases in which the interval between the deaths of father and child was at least 60 years. We find now that 680 out of the remaining 1737 died under 20 years of age, or 39.1%. Thus the effect of a selection of this kind is to cause a slight increase in the proportion of deaths at the early ages. If we make the selection slightly more stringent, by taking only those who died within 40 years of their father’s death, the percentage of individuals dying under 20 years of age rises to 46.7, and if we go still further and consider only those who died in their father’s lifetime, then the percentage rises to 82.7%. Looking at the matter in another way we find that the average age at death has fallen from 35.9 years to 7.7 years. The same facts are given in Fig.
6, which shows that as the selection of cases becomes more stringent, there is a regular increase in the proportion of deaths at the younger ages. In exactly the same way, the fact that cases where the insanity of parent and child is contemporaneous are the most likely to appear in Dr Mott’s records causes a spurious exaggeration of the cases of insanity at early ages in the younger generation and consequently a spurious exaggeration of the number of cases of imbecility and adolescent insanity.

* These averages were calculated, not from the five age groups given above, but from the same material classified in 15 age groups.

We can also investigate directly the question of anticipation or antedating on this material. In order to avoid the heavy weighting of large families which would arise if every child were entered, I have taken only one child from each family. Let us consider first of all the distribution of age at death of Fathers and their First-born Children. The facts are given in Table VII. We have altogether 294 cases in which we know the age at death of a father and his first-born child. None of the fathers died before 20 but of the children 106 out of 294 or 36.1% died before 20. The average age at death among the fathers is 61 years, but among the children it is only 36 years, so that there is an anticipation of 25 years. To borrow Dr Mott’s words, the figures clearly show the signal tendency among the offspring to die at a much earlier age than their parents; that is to say, anticipation or antedating is the rule.

[Diagram: percentages dying in each age group (Age at Death: 0-, 20-, 40-, 60-, 80-), with panels including children who died within 40 years of their fathers’ death, within 20 years of their fathers’ death, and in their fathers’ lifetime.]

Fig. 6.
Diagram to illustrate the Effect of Selection of Material upon the Distribution of Age at Death. (Reigning Houses in Europe, 18th Century.)

TABLE VII.
Showing Anticipation in Age at Death. A. Fathers and Children.
(Reigning Houses in Europe, 18th Century.)

Age at Death                   Fathers   First-born   Fathers   First Sons who
                                         Children               had Children
0-9                               -          95          -           -
10-19                             -          11          -           -
20-29                             6          21          4           8
30-39                            16          18          8          15
40-49                            45          31         31          39
50-59                            70          34         54          39
60-69                            77          37         58          44
70-79                            62          33         46          54
80-89                            18          14         12          13
90 and over                       -           -          -           1
Totals                          294         294        213         213
Percentage dying under 20         0          36.1        0           0
Average Age at Death             61          36         60          59
Anticipation                          25                      1

Now in this material there is no selection of families. Every family was taken and the age at death of every first-born is known, so that we are only left with the third of Dr Mott’s fallacies, in that no allowance has been made for the fact that the parental group is limited to ages over 20 while more than a third of the offspring die under 20. The effect of this selection can be removed almost entirely by taking, instead of the first-born child, the first son who married and had at least one child. There are in all 213 such cases and we see that there is now no anticipation. The difference between the average ages at death is less than a year and by removing the artificial selection we have got rid of all anticipation or antedating.

These facts are also shown graphically in Figs. 7 and 8. The horizontal scale gives the age at death in 10-year groups while the vertical scale gives the actual numbers of parents and offspring dying in each age group. The diagram on the left shows marked anticipation, and should be compared with Dr Mott’s diagram (Fig. 1) in which the ages at onset of insanity of father and child are compared. When, however, we get rid of the selection of cases by taking only sons who have had children, then there is no anticipation.
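The summary rows of Table VII can be checked roughly from the grouped counts. The Python sketch below does so under stated assumptions: each 10-year group is taken at its mid-point and the open group “90 and over” at 95, so the means need not agree exactly with the published averages (infant deaths in particular lie well below the mid-point of the 0-9 group); the count of 77 fathers dying at 60-69 is inferred from the column total of 294. All names are illustrative.

```python
# Lower bounds of the 10-year age groups of Table VII (0-9, ..., 90 and over).
AGE_GROUPS = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]

# Counts per age group, transcribed from Table VII.
FATHERS = [0, 0, 6, 16, 45, 70, 77, 62, 18, 0]  # 77 inferred from the total
FIRST_BORN = [95, 11, 21, 18, 31, 34, 37, 33, 14, 0]
FATHERS_OF_SONS = [0, 0, 4, 8, 31, 54, 58, 46, 12, 0]
FIRST_SONS_WITH_CHILDREN = [0, 0, 8, 15, 39, 39, 44, 54, 13, 1]


def percent_under_20(counts):
    """Percentage of deaths falling in the 0-9 and 10-19 groups."""
    return 100 * sum(counts[:2]) / sum(counts)


def grouped_mean(counts):
    """Approximate mean age at death, taking each group at its mid-point."""
    total = sum(n * (low + 5) for low, n in zip(AGE_GROUPS, counts))
    return total / sum(counts)


for label, parents, offspring in [
    ("first-born children", FATHERS, FIRST_BORN),
    ("first sons who had children", FATHERS_OF_SONS, FIRST_SONS_WITH_CHILDREN),
]:
    gap = grouped_mean(parents) - grouped_mean(offspring)
    print(label, round(percent_under_20(offspring), 1), round(gap, 1))
```

Even with this coarse grouping the contrast is plain: a large gap between fathers and first-born children, and a gap of about a year once both generations are restricted to those who lived to have children.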
[Figs. 7 and 8. Reigning Houses in Europe, 18th Century. Frequency of age at death of fathers and of first-born children (Fig. 7, average age of first-born children 36, showing anticipation) and of fathers and of first sons who had children (Fig. 8). Horizontal axis: age at death; vertical axis: frequency.]

If we compare the distributions of age at death in mothers and children we get exactly the same results. The facts are shown in Table VIII.

TABLE VIII.
Showing Anticipation in Age at Death. B. Mothers and Children.
(Reigning Houses in Europe, 18th Century.)

Age at Death                   Mothers       First-born    Mothers   First Sons to
                                             Children                have Children
0-9                               -             122           -           -
10-19                             2              13           1           -
20-29                            47              26          21           8
30-39                            49              22          30          16
40-49                            43              32          25          41
50-59                        [illegible]         35          39          40
60-69                            80              42          52          46
70-79                        [illegible]     [illegible]     41          54
80-89                        [illegible]     [illegible]     10          14
90 and over                  [illegible]     [illegible]      1           1
Totals                          345             345         220         220
Percentage dying under 20         0.6            39.1         0.5         0
Average Age at Death             53              35          55          59
Anticipation                           18                         -4

We see that the first-born children died on an average 18 years before their mothers, but when we compare the age at death of mothers and the first son in each case to have children, then the sons live four years longer than their mothers. It would have been better in this case to have compared the mothers with the first daughters to have children but unfortunately von Behr gives very little information regarding the female lives, except in special cases. The figures show a marked anticipation in age at death when we directly compare, as Dr Mott has done, mother and child, but this vanishes when we remove the arbitrary selection.
The same facts are shown graphically in Figs. 9 and 10. If we combine these figures we can compare the age at death of parent and child and the results are shown graphically in Figs. 11 and 12. Fig. 11 shows that Dr Mott’s limitation of one of the two generations he is comparing to adults, without imposing a similar limitation on the other generation, introduces an artificial and spurious anticipation. The average age at death of the parents is 56 years and of their first-born children only 35 years, so that we get an anticipation of 21 years. If, however, we make the two generations almost directly comparable by dealing only with sons who have children, there is no significant difference between the two averages (58 against 59 years).

In these cases we have dealt only with completed families and have taken every family without selection. If, however, we consider only the cases in which the eldest child died in his father’s lifetime the amount of anticipation is greatly increased. The facts are shown in Table IX and in Fig. 13.

[Fig. 13. Reigning Houses in Europe, 18th Century. Age at death of fathers and of first-born children who died in their fathers’ lifetime. Average age at death of fathers 62, of children 10; anticipation 52. Vertical axis: frequency.]

We see here that among the fathers none died under 30 while 87% of their children died under 30; the average age at death among the fathers was 62, among the children only 10, showing an anticipation of 52 years.

TABLE IX.
Showing Anticipation in Age at Death. C. Fathers and First-born Children who died in their Fathers’ Lifetime.
(Reigning Houses of Europe: 18th Century.)
Age at Death                   Fathers   First-born Children dying
                                         in their Fathers’ lifetime
0-9                               -                 89
10-19                             -                 11
20-29                             -                 16
30-39                             6                 10
40-49                            20                  5
50-59                            32                  2
60-69                            33                  -
70-79                            32                  -
80-89                            10                  -
Totals                          133                133
Percentage dying under 20         0                 75.2
Average Age at Death             62                 10
Anticipation                            52

It is now possible to illustrate the effect of the principal fallacies which vitiate Dr Mott’s conclusions. In the first place he has dealt with families which are largely incomplete and has collected his material in such a way that cases in which the insanity of parent and child is contemporaneous are the most likely to be recorded; in the second place he has directly compared parent and child without allowing for the fact that practically no parent can become insane before 20, while there is no limitation of this kind among the offspring of these insane parents.

In Table IX and Fig. 13 we see the effect of dealing with incomplete families in which the children died in their fathers’ lifetime. There we get an anticipation of 52 years. If we get rid of the first and second fallacies involving a selection of cases by dealing with every family, as shown in Table VII and Fig. 7, the anticipation falls to 25 years. If we get rid of the third source of fallacy also, by comparing the fathers with the first sons who have children, as in Table VII and Fig. 8, then the anticipation falls to about a year. The Law of Anticipation or Antedating has thus in Dr Mott’s case no foundation; in fact it is a spurious result of the mode of collecting and interpreting data.

Now Dr Mott has not only asserted that this “Law” applies to insanity but has also drawn the conclusion that the offspring of insane parents if still normal at the age of 25 may safely marry.
In an address delivered before the First International Eugenics Congress*, he said: “You will observe that 47.8% of the 500 offspring had their first attack (of insanity) at or before the age of 25 years and as you see in the curves of parents and offspring, the liability of the child of an insane parent becoming insane tends rapidly to fall. Now besides the fact that this shows Nature’s method of eliminating unsound elements of a stock, it has another important bearing, for it shows that after twenty-five there is a greatly decreasing liability of the offspring of insane parents to become insane and therefore in the question of advising marriage of the offspring of an insane parent this is of great importance. Sir George Savage recently said that this statistical proof [sic!] of mine entirely accorded with his own experiences, and that if an individual who had such an hereditary history had passed twenty-five and never previously shown any signs (of insanity) he would probably be free and he would offer no objection to marriage.”

Now I entirely fail to understand how anyone could recommend marriage in such cases, even on Dr Mott’s own figures; for if it be true that 48% become insane before 25, it must be equally true that 52% become insane after that age and this very important point seems to have been forgotten. These figures, however, are taken from Dr Mott’s selected data, selected in such a way that the early cases are enormously exaggerated. Until Dr Mott publishes a series of complete pedigrees, it will be safer to assume that the age at onset of insanity among the offspring of insane parents does not differ widely from that of all admissions to Asylums and there we find that only 21% become insane before 25, and 79% after 25. But surely at a Eugenics Congress of all places some thought might have been given to the mental condition of the children resulting from such matings, before advising marriage.
It would not have been difficult for Dr Mott to have extracted all the available cases of this kind from his collection of pedigrees, i.e. all cases in which an individual had an insane parent and was normal at the age of 25, and so have discovered the probable fate of the offspring from such matings.

* Problems in Eugenics, p. 425. This is one of many illustrations of the evil done by that Congress; attention was directed and much weight given to hasty statements and ill-digested material.

Unfortunately the details given by Dr Mott regarding his pedigrees are usually so scanty that little use of them can be made, but two at least show the danger of the matings Sir George Savage and he sanction; these two pedigrees were given by Dr Mott in his lecture on Heredity in Relation to Insanity, delivered to the members of the London County Council. The first is shown in Fig. 14. (It appeared as Fig. 11, p. 18 of Dr Mott’s lecture.)

[Fig. 14. Pedigree. Legend: filled symbol = Insane; P = Pauper; B = Blind.]

[Fig. 15. Pedigree, generations I to IV. Legend: symbols for Tuberculosis, Suicide, Insanity and Unknown; Con = Congenital.]

In the first generation a man who became insane at 70 had four children. The eldest, a girl, became insane at 68 and was therefore normal long after the age of 25. Dr Mott does not state whether the marriage of this woman preceded or followed the onset of insanity in her father, but even if her father had become insane before her marriage, Dr Mott would have raised no objection to the marriage since the woman herself was not insane. There were in all six children from this marriage of which Dr Mott would have approved. Two became insane, three were blind and five are said to have been paupers. The eldest child remained normal till the age of 34 and although both his parents became insane Dr Mott apparently would not have objected to his marrying. He did so and one of his children became insane and eight out of nine are said to be paupers.
These nine children are apparently still young so that their ultimate fate is still uncertain.

The second pedigree I shall quote was given as Fig. 28, p. 33 of Dr Mott’s lecture, and appears here as Fig. 15. A man who had an insane father and an insane grandfather became insane at the age of 55. He was therefore normal at the age of 25* and Dr Mott would have sanctioned marriage in his case. He actually married twice. His first wife was tuberculous but not insane; they had two children, both insane. His second wife was normal and it is definitely stated that there was no insanity in her family; they had five children and one of these became insane. Yet Dr Mott would permit the children of insane parents to marry if only they are normal at the age of 25!

Again, Dr Mott has stated that it is useless to attempt to limit the fertility of the insane since most of their children are born before the onset of insanity, and therefore before any action can be taken. From his statistics of relatives in L.C.C. Asylums, Dr Mott has calculated the proportion of offspring who were born after the first attack of insanity in the parent and found that “Forty-six offspring out of 581 were born after the first attack of insanity in the parent, i.e., 7.9%. That is to say in the case of 529 insane parents, the birth of only one-twelfth of their 581 insane children would have been prevented by sterilisation or life segregation of the parent after the first attack of insanity. These figures refer to the offspring which become insane, but there are a large number of offspring which do not become insane and these would be cut off if life segregation or sterilisation were adopted†.”

But here again Dr Mott is using the data obtained from his index of relatives which shows a greatly exaggerated number of cases at the earlier ages among the offspring, and he thus greatly exaggerates the number of cases in which the children were born before the onset of insanity.
No conclusion can be drawn from any but complete records of families. But apart altogether from this, many of these parents are themselves the children of the insane and much could be done to discourage such marriages. Unfortunately, as we have seen, Dr Mott directly sanctions the marriage of those who remain normal till the age of 25.

In further support of his view Dr Mott has stated that out of 642 females admitted to three London County Asylums in 1911, 148 were recurrent cases and of these 32 (21%) had children between their respective dates of admission. "The inference that can be drawn," he says, "is that about one-fifth of the recurrent cases, or approximately one-twentieth of the female admissions have children after their first attack of insanity and of 31 such cases examined, 73 children were born after the first attack of insanity in the parent."

* If the term "age at onset" has any real meaning.
† The italics are Dr Mott's.

But have these 148 recurrent cases been followed up to the end of the reproductive period? Not at all. No ages are given and the cases are merely those which were admitted to Asylums in 1911, Dr Mott's remarks being made in June 1912, so that no attempt has been made to follow them up. There is no justification for Dr Mott's advice.

There are many other points in Dr Mott's work which deserve detailed examination, but time will not permit more than a brief account of a few of them. It should be noted, for instance, that Dr Mott has used his index of relatives in London County Asylums as an argument in favour of the importance of the inheritance factor in insanity. His argument is as follows: "At the present time in the London County Asylums there are 725 individuals so closely related as parents and offspring, brothers and sisters.
A priori, this, to my mind, is striking proof of the importance of heredity in relation to insanity, for we cannot suppose that 20,000 of the 4½ millions of people in London brought together from some random cause would show such a large number closely related …"

But Dr Mott has not attempted to give, and I doubt if he ever will be able to give, a satisfactory estimate of the number of relatives in even a random sample of the population, and the population of asylums is far from being a random sample of the general population; there is for instance an extraordinary divergence in age. Yet without definite information on this point it would be impossible to say whether insanity is inherited or not, that is, if we had to depend solely on Dr Mott's data.

It should also be noted that in these cases Dr Mott has clubbed together every form of insanity, from congenital idiocy to senile dementia, except of course cases due to specific infections or trauma. I myself think that course is the only possible one. To anyone who has studied even a few pedigrees of mental defect, nothing is more striking than the extraordinary number of different forms of mental defect that may appear in the same family. Seven years ago, in a First Study of the Statistics of Insanity and of the Inheritance of the Insane Diathesis*, I was confronted with the same problem, and after a full consideration of all the available data and of the opinions of those medical men who were best qualified to express an opinion came to the conclusion that the only possible course was to group all forms of insanity together, with, of course, the exceptions I have already indicated. The whole question was discussed very fully in my paper and it was there suggested that an even broader classification might be of service. This point of view met with some criticism at the time but nothing has occurred to alter it, and the study of the inheritance of insanity in general or of an even broader degeneracy must always remain the first object of our studies.

* Galton Memoirs, No. II. (Dulau and Co.)

Any investigation of the inheritance of special types of insanity or degeneracy can, however, only be carried out on unselected material, that is, on the records of complete families. The type of insanity is so closely related to the age of onset that any tendency to exaggerate the number of early cases, as in Dr Mott's material, will entirely vitiate the conclusions drawn. Thus Dr Schuster's conclusions as to the inheritance of special types of insanity based upon Dr Mott's data* must also be rejected on the above grounds.

Dr Mott's index of relatives in London County Asylums is unfortunately of very little value in the study of inheritance in insanity. Progress can only come from the study of complete pedigrees in which every member of the family is entered, whether insane or normal, and the ages of the normal at the time the record was made are just as important as the age at onset of insanity in the insane members, for a statement that a young man of 20 has not been insane is of a very different degree of importance from the statement that a man of 70 has not been insane.

In the papers I have cited the children of the insane if normal at 25 are advised to marry, and it is asserted that it is useless to attempt to discourage the reproduction of the insane since most of their children are born before the onset of insanity, and that we should rely on the Law of Anticipation to end or mend degenerate stocks. I have shown, I think, that the Law of Anticipation as applied to the insane has no foundation in the facts provided and that the advice given as to the marriage of the insane and of their normal offspring is fundamentally unsound and directly cacogenic.
Much yet remains to be learnt regarding the inheritance of the insane diathesis, but no one who has studied the family histories of the insane can doubt that in inheritance we have by far the most important element in the production of insanity, and in view of all the facts it is the obvious duty of the Eugenist to discourage, rather than to encourage, procreation by the insane and even by those of their offspring who appear to be normal.

* Report on the Statistical Investigation of Relative Cards, 21st Annual Report of the London County Council Asylums Committee (1910), p. 95.

ON THE PROBABLE ERROR OF THE BI-SERIAL EXPRESSION FOR THE CORRELATION COEFFICIENT.

By H. E. SOPER, M.A. Biometric Laboratory, University of London.

In a recent paper* Professor Pearson shows that where one character is in multiple graded grouping and the other in alternative categories, greater or less than a given magnitude, the correlation coefficient admits of simple expression; the assumptions being that the unmeasured character, B, has a normal distribution and that the measured character, A, has linear regression upon B. Under these conditions the data required are the numerical ratio of the alternative groups, the standard deviation of the measured character and the deviation from the general mean of this character of the mean of one of the groups.

This expression is subject to greater fluctuations of value in samples of N of the population than is the product moment form, especially where one of the groups is relatively small; and it is proposed to find formulae for the mean and second moment of the errors from this mean to a first approximation, that is to terms in 1/N. These will appear in terms of the correlation coefficient, r, of the original population (which will be supposed normally correlated) and the fractional frequency, f, in that population of the group possessing the greater or positive intensity of the character put into two classes.
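The bi-serial expression described above can be sketched numerically. The following is a minimal illustration only, assuming the expression takes the form r = (mean of y in the group − general mean of y)/σ_y × f/z, where z is the ordinate of the unit normal curve at the point of division; the function name `biserial_r` and the sample figures are hypothetical and are not taken from the paper.

```python
from statistics import NormalDist, fmean, pstdev


def biserial_r(y, in_group):
    """Bi-serial estimate of correlation (illustrative sketch).

    y        -- measured (graded) character, one value per individual
    in_group -- True where the individual falls in the class holding the
                greater intensity of the alternative character
    """
    n = len(y)
    group = [v for v, g in zip(y, in_group) if g]
    f = len(group) / n                 # fractional frequency of the group
    std = NormalDist()                 # unit normal curve
    a = std.inv_cdf(1.0 - f)           # abscissa cutting off upper-tail area f
    z = std.pdf(a)                     # ordinate of the normal curve at a
    # deviation of the group mean from the general mean, in units of sigma_y,
    # multiplied by f / z
    return (fmean(group) - fmean(y)) / pstdev(y) * f / z


# Hypothetical data: eight graded values, half of them in the group.
y = [1, 2, 3, 4, 5, 6, 7, 8]
in_group = [False, False, True, False, True, False, True, True]
r = biserial_r(y, in_group)
```

The population standard deviation (`pstdev`) is used because the moments in the paper are divided by N; with so small a hypothetical sample the value of `r` is of course only a check on the arithmetic.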
Let $y$ be the graded character and $x$ the alternative character the intenser value of which is possessed by the fraction $f$ of the population. Let $\bar x$, $\bar y$ be the general means and $\sigma_x$, $\sigma_y$ the standard deviations of $x$ and $y$. Then it is shown in the paper that if $\bar x'$, $\bar y'$ are the means of the group $f$,

$$r = \frac{\bar y' - \bar y}{\sigma_y} \Big/ \frac{\bar x' - \bar x}{\sigma_x} \tag{1}$$

on the assumption that the regression of $y$ upon $x$ is linear; and that if $x$ be normally distributed this is equivalent to

$$r = \frac{\bar y' - \bar y}{\sigma_y} \times \frac{f}{z} \tag{2}$$

where $z$ is the ordinate of the normal curve cutting off the area $f = \tfrac12(1-\alpha)$ or $\tfrac12(1+\alpha)$ as defined in Sheppard's Tables of the Probability Integral.

* Biometrika, Vol. VII. (1909), p. 96: "On a New Method of Determining Correlation between a Measured Character A, and a Character B, of which only the Percentage of Cases wherein B exceeds (or falls short of) a given intensity is recorded for each grade of A."

Now if $p_0, p_1, p_2$, etc. be the moment coefficients of the whole population with respect to the character $y$ defined by

$$p_u = S(n_s y_s^u)/N \tag{3}$$

[see bi-serial table below which is here to be looked upon as representing the general population] and $p_0', p_1', p_2'$, etc. are moment coefficients of the group $n'\ (= fN)$ defined by

$$p_u' = S(n_s' y_s^u)/N \tag{4}$$

we may write $f = p_0'$, $f\bar y' = p_1'$, $\bar y = p_1$, $\sigma_y = \sqrt{p_2 - p_1^2}$ and

$$r = \frac{p_1' - p_0' p_1}{\sqrt{p_2 - p_1^2}\; z} \tag{5}$$

Bi-serial Table.

    Grades of y                 y_1     y_2     y_3    ...   Totals
    Class of intenser x         n_1'    n_2'    n_3'   ...   n'
    Other class                 n_1''   n_2''   n_3''  ...   n''
    Totals                      n_1     n_2     n_3    ...   N

    n'/N = f                                                   (6)

In samples of $N$ the frequencies $n_s$, $n_s'$, $n_s''$ and consequently the moment coefficients $p$ and $p'$ and the ordinate $z$ are subject to fluctuation and the values of the correlation coefficient calculated from this formula will have a distribution of errors. Let $\bar r$ be the mean value in such samples, $\delta r$ the deviation from this mean value in any sample when $\delta p_0'$, $\delta p_1$, $\delta p_1'$, etc., $\delta z$ are the deviations in moments and ordinate.
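Then

$$\bar r + \delta r = \frac{p_1' + \delta p_1' - (p_0' + \delta p_0')(p_1 + \delta p_1)}{\sqrt{p_2 + \delta p_2 - (p_1 + \delta p_1)^2}\;(z + \delta z)} \tag{7}$$

To express $\delta z$ in terms of deviations of the moments we have

$$f = \frac{1}{\sqrt{2\pi}} \int_a^\infty e^{-\frac12 x^2}\,dx \tag{8}$$

Hence to second order terms in powers and products of deviations

$$z + \delta z = \frac{1}{\sqrt{2\pi}}\, e^{-\frac12 (a + \delta a)^2} = z\{1 - a\,\delta a - \tfrac12 (1 - a^2)(\delta a)^2\},$$

$$-\delta f = \frac{1}{\sqrt{2\pi}} \int_a^{a + \delta a} e^{-\frac12 x^2}\,dx = z \int_0^{\delta a} \{1 - a\xi - \tfrac12(1 - a^2)\xi^2 + \dots\}\,d\xi = z\{\delta a - \tfrac12 a (\delta a)^2\} \tag{9}$$

It follows that

$$z + \delta z = z + a\,\delta f - \frac{1}{2z}(\delta f)^2 = z + a\,\delta p_0' - \frac{1}{2z}(\delta p_0')^2 \tag{10}$$

At the same time that this value is put in (7) we may simplify the expression and subsequent algebra by supposing the graded character $y$ to be measured from its universal mean value as origin and in terms of its standard deviation as unit of measurement, in which case $p_1 = 0$, $p_2 = \sigma_y^2 = 1$, and by (5)

$$p_1' = zr \tag{11}$$

and since $p_0' = f$, (7) becomes

$$\bar r + \delta r = \frac{zr + \delta p_1' - f\,\delta p_1 - \delta p_0'\,\delta p_1}{\sqrt{1 + \delta p_2 - (\delta p_1)^2}\;\{z + a\,\delta p_0' - \frac{1}{2z}(\delta p_0')^2\}} \tag{12}$$

Expanding to second order of deviations we find

$$\bar r + \delta r = r + \theta_1 + \theta_2 \tag{13}$$

where $\theta_1$, $\theta_2$ are the first order and second order expressions

$$\theta_1 = \frac{1}{z}\{\delta p_1' - f\,\delta p_1 - \tfrac12 zr\,\delta p_2 - ar\,\delta p_0'\},$$

$$\theta_2 = \frac{1}{z}\Big\{\tfrac12 ar\,\delta p_2\,\delta p_0' - \frac{a}{z}\,\delta p_1'\,\delta p_0' + \frac{af}{z}\,\delta p_1\,\delta p_0' - \tfrac12\,\delta p_1'\,\delta p_2 + \tfrac12 f\,\delta p_1\,\delta p_2 - \delta p_0'\,\delta p_1 + \tfrac12 zr\,(\delta p_1)^2 + \tfrac38 zr\,(\delta p_2)^2 + (1 + 2a^2)\frac{r}{2z}(\delta p_0')^2\Big\} \tag{14}$$

Taking mean values

$$\bar r = r + \text{mean }\theta_2 \tag{15}$$

mean $\theta_1$ being zero since by (3), (4)

$$\text{mean }\delta p_u = S(\text{mean }\delta n_s\, y_s^u)/N = 0, \qquad \text{mean }\delta p_u' = S(\text{mean }\delta n_s'\, y_s^u)/N = 0.$$

Mean $\theta_2$ is to be evaluated by the formulae

$$\left.\begin{aligned} \text{mean }\delta p_u\,\delta p_v &= (p_{u+v} - p_u p_v)/N \\ \text{mean }\delta p_u'\,\delta p_v' &= (p'_{u+v} - p_u' p_v')/N \\ \text{mean }\delta p_u\,\delta p_v' &= (p'_{u+v} - p_u p_v')/N \end{aligned}\right\} \tag{16}$$

of which the first two are well known* and the third may be proved thus:

$$N\,\delta p_u = S(\delta n_s\, y_s^u) = S(\delta n_s'\, y_s^u) + S(\delta n_s''\, y_s^u), \qquad N\,\delta p_v' = S(\delta n_s'\, y_s^v),$$

$$\therefore\ N^2\,\delta p_u\,\delta p_v' = S\{(\delta n_s')^2\, y_s^{u+v}\} + S\{\delta n_s'\,\delta n'_{s'}\, y_s^u\, y_{s'}^v\} + S\{\delta n_s''\,\delta n'_{s'}\, y_s^u\, y_{s'}^v\},$$

where in the third sum $s'$ may or may not equal $s$. But

$$\text{mean }(\delta n_s')^2 = n_s'(1 - n_s'/N), \qquad \text{mean }(\delta n_s'\,\delta n'_{s'}) = -n_s' n'_{s'}/N, \qquad \text{mean }(\delta n_s''\,\delta n'_{s'}) = -n_s'' n'_{s'}/N,$$

the last whether $s = s'$ or not.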
Thus we find†

$$\text{mean }\delta p_u\,\delta p_v' = (p'_{u+v} - p_u' p_v' - p_u'' p_v')/N = (p'_{u+v} - p_u p_v')/N.$$

Evaluating mean $\theta_2$ by these formulae we find

$$\text{mean }\theta_2 = \frac{1}{zN}\Big\{\tfrac12 ar\,(p_2' - p_2 p_0') - \frac{a}{z}(p_1' - p_1' p_0') + \frac{af}{z}(p_1' - p_1 p_0') - \tfrac12 (p_3' - p_1' p_2) + \tfrac12 f\,(p_3 - p_1 p_2) - (p_1' - p_0' p_1) + \tfrac12 zr\,(p_2 - p_1^2) + \tfrac38 zr\,(p_4 - p_2^2) + (1 + 2a^2)\frac{r}{2z}(p_0' - p_0'^2)\Big\} \tag{17}$$

in which the undashed moments, being those of a normal curve with unit standard deviation, about its mean, have the values

$$p_1 = 0, \quad p_2 = 1, \quad p_3 = 0, \quad p_4 = 3,$$

and the dashed moments beyond the first two, $p_0' = f$, $p_1' = zr$, have values depending upon the nature of the frequency distribution of $y$ and $x$. Assuming $x$, $y$ normally distributed‡

$$p_2' = \int_a^\infty\!\!\int_{-\infty}^{\infty} y^2\, \frac{1}{2\pi\sqrt{1 - r^2}}\, e^{-\frac{x^2 - 2rxy + y^2}{2(1 - r^2)}}\,dx\,dy, \qquad p_3' = \int_a^\infty\!\!\int_{-\infty}^{\infty} y^3\, \frac{1}{2\pi\sqrt{1 - r^2}}\, e^{-\frac{x^2 - 2rxy + y^2}{2(1 - r^2)}}\,dx\,dy \tag{18}$$

* See Biometrika, Vol. II. (1903), p. 275: "On the Probable Errors of Frequency Constants." K. Pearson. The second follows in exactly the same manner as the first, since the constancy of the total frequency dealt with is only involved in deducing the relations (i) and (ii) of p. 274 of that paper.
† $p_u'' = S(n_s''\, y_s^u)/N$.
‡ Since the moments appear in the term containing $1/N$ any errors in their calculation due to incorrect assumption of normality will not affect the present approximate formulae provided such errors are of the order $1/N$.

Putting $y = rx + \eta$ and integrating with respect to $\eta$ for constant $x$,

$$p_2' = \int_a^\infty \{(rx)^2 + (1 - r^2)\}\, \frac{1}{\sqrt{2\pi}}\, e^{-\frac12 x^2}\,dx = f + azr^2, \qquad p_3' = \int_a^\infty \{(rx)^3 + 3(rx)(1 - r^2)\}\, \frac{1}{\sqrt{2\pi}}\, e^{-\frac12 x^2}\,dx = 3zr + (a^2 - 1)\,zr^3 \tag{19}$$

When these values are put in (17), and terms collected, the mean value of the bi-serial correlation coefficient in samples of $N$ is found to be

$$\bar r = r\Big[1 + \frac{1}{N}\Big\{(1 + 2a^2)\frac{ff'}{2z^2} - (1 - 2f)\frac{a}{z} - \tfrac34 + \tfrac12 r^2\Big\}\Big] \tag{20}$$

where $f' = 1 - f = n''/N$.

In the work of obtaining this approximation all powers and products of deviations above the second order have been neglected.
The means of such terms in samples of $N$ involve second and higher powers of $1/N$* and the present result is correct to the first approximation.

Again squaring (13) and taking mean values and subtracting the square of (15) we find to the same approximation as before

$$\sigma_r^2 = \text{mean }(\delta r)^2 = \text{mean }\theta_1^2 \tag{21}$$

The evaluation of mean $\theta_1^2$ being carried out precisely in the same way as mean $\theta_2$, the result is the second moment of deviations of the bi-serial correlation coefficient in samples of $N$,

$$\sigma_r^2 = \frac{1}{N}\Big[\frac{ff'}{z^2} - \Big\{\tfrac52 + (1 - 2f)\frac{a}{z} - \frac{ff'a^2}{z^2}\Big\}\, r^2 + r^4\Big] \tag{22}$$

Writing the two results (20) and (22)

$$\bar r = r\{1 + (\phi_a + \tfrac12 r^2)/N\}, \qquad \sigma_r^2 = (\chi_a^2 - \psi_a r^2 + r^4)/N \tag{23}$$

the values of $\phi_a$, $\chi_a^2$† and $\psi_a$ for values of $\tfrac12(1 - \alpha)$ [= the smaller of $n'/N$, $n''/N$] from .50 to .01 are to be found in table (24).

* See Biometrika, Vol. IX. (1913), pp. 97-99.
† $\chi_a$ for $\tfrac12(1 + \alpha)$ was tabled in Biometrika, Vol. IX. p. 27, and the table is reproduced in Tables for Statisticians and Biometricians, p. 35, Cambridge University Press.

    ½(1−α)   φ_a      χ_a²     ψ_a     |  ½(1−α)   φ_a       χ_a²      ψ_a
    .50      .0354    1.5708   2.5000  |  .20      -.0871    2.0414    2.8578
    .49      .0353    1.5711   2.5003  |  .19      -.1001    2.0898    2.8951
    .48      .0350    1.5722   2.5011  |  .18      -.1146    2.1437    2.9364
    .47      .0346    1.5741   2.5024  |  .17      -.1308    2.2035    2.9825
    .46      .0339    1.5766   2.5043  |  .16      -.1490    2.2703    3.0341
    .45      .0331    1.5799   2.5068  |  .15      -.1696    2.3453    3.0923
    .44      .0321    1.5839   2.5098  |  .14      -.1931    2.4303    3.1582
    .43      .0309    1.5886   2.5134  |  .13      -.2201    2.5272    3.2337
    .42      .0295    1.5943   2.5177  |  .12      -.2513    2.6389    3.3208
    .41      .0279    1.6007   2.5225  |  .11      -.2880    2.7687    3.4224
    .40      .0260    1.6079   2.5279  |  .10      -.3317    2.9221    3.5427
    .39      .0240    1.6161   2.5341  |  .095     -.3568    3.0095    3.6115
    .38      .0217    1.6251   2.5409  |  .090     -.3844    3.1057    3.6873
    .37      .0191    1.6351   2.5484  |  .085     -.4153    3.2119    3.7713
    .36      .0163    1.6461   2.5568  |  .080     -.4499    3.3299    3.8648
    .35      .0132    1.6582   2.5659  |  .075     -.4887    3.4620    3.9696
    .34      .0098    1.6714   2.5759  |  .070     -.5324    3.6110    4.0879
    .33      .0062    1.6858   2.5868  |  .065     -.5822    3.7806    4.2294
    .32      .0021    1.7015   2.5986  |  .060     -.6403    3.9748    4.3777
    .31     -.0023    1.7186   2.6115  |  .055     -.7083    4.2002    4.5584
    .30     -.0070    1.7371   2.6256  |  .050     -.7897    4.4652    4.7723
    .29     -.0122    1.7573   2.6409  |  .045     -.8868    4.7829    5.0283
    .28     -.0179    1.7791   2.6575  |  .040     -1.0053   5.1715    5.3410
    .27     -.0241    1.8028   2.6755  |  .035     -1.1577   5.6568    5.7362
    .26     -.0308    1.8286   2.6952  |  .030     -1.3556   6.2859    6.2485
    .25     -.0382    1.8567   2.7166  |  .025     -1.6308   7.1347    6.9481
    .24     -.0462    1.8874   2.7399  |  .020     -2.0272   8.3600    7.9572
    .23     -.0550    1.9208   2.7654  |  .015     -2.5263   10.3024   9.4975
    .22     -.0647    1.9574   2.7933  |  .010     -3.8889   13.9393   12.6086
    .21     -.0753    1.9974   2.8240  |
                                                                        (24)

The bi-serial value of the correlation coefficient has the standard deviation

$$\sigma_r = \sqrt{\chi_a^2 - \psi_a r^2 + r^4}\,\big/\sqrt N,$$

whilst that of the product moment value is

$$\sigma_r = (1 - r^2)/\sqrt N.$$

In table (25) a comparison of the values of the numerator is made for five values of $r$, for divisions at 0, .5, ... 2.5 times the standard deviation from the mean of the ungraded character.

                            Values of √(χ_a² − ψ_a r² + r⁴) for ½(1−α) =
    Values of r   (1 − r²)    .500    .309    .159    .067    .023    .006
    .00           1.00        1.25    1.31    1.51    1.93    2.76    4.5
    .25           .9375       1.19    1.25    1.45    1.86    2.68    4.3
    .50           .750        1.00    1.06    1.26    1.65    2.42    4.0
    .75           .4375       .69     .75     .94     1.30    1.95    3.2
    1.00          .00         .27     .33     .49     .74     1.13    1.8
                                                                        (25)

Thus the effect of grouping and applying the bi-serial value of the correlation coefficient is to add 25% to the probable error in the most accordant case where $r$ is zero and the division equal, whilst if $r$ is as large as .5 and one group as small as 10% of the whole the probable error is nearly doubled.
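The tabled constants can be checked with a short computation. What follows is a sketch only, assuming the closed forms φ = (1 + 2a²)ff′/2z² − (1 − 2f)a/z − ¾, χ² = ff′/z² and ψ = 5/2 + (1 − 2f)a/z − a²ff′/z², where a is the abscissa and z the ordinate of the unit normal curve cutting off the tail area f; these forms are read here from results (20) and (22) and agree with the tabled figures tried, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def soper_constants(f):
    """Return (phi, chi2, psi) for a group of fractional frequency f."""
    std = NormalDist()
    a = std.inv_cdf(1.0 - f)       # abscissa of the point of division
    z = std.pdf(a)                 # ordinate of the normal curve there
    chi2 = f * (1.0 - f) / z ** 2
    slope = (1.0 - 2.0 * f) * a / z
    phi = 0.5 * (1.0 + 2.0 * a * a) * chi2 - slope - 0.75
    psi = 2.5 + slope - a * a * chi2
    return phi, chi2, psi


def biserial_sd_numerator(r, f):
    """Numerator of the standard deviation of the bi-serial r (times 1/sqrt N)."""
    _, chi2, psi = soper_constants(f)
    return sqrt(chi2 - psi * r * r + r ** 4)


def product_moment_sd_numerator(r):
    """Numerator of the standard deviation of the product moment r."""
    return 1.0 - r * r


# Equal division: chi2 = pi/2 and psi = 5/2, the first row of the table.
phi_half, chi2_half, psi_half = soper_constants(0.5)
```

For an equal division the abscissa is zero, so χ² reduces to π/2 ≈ 1.5708 and ψ to 2.5, and the ratio of `biserial_sd_numerator(0, 0.5)` to `product_moment_sd_numerator(0)` is about 1.25, which is the 25% addition to the probable error mentioned in the text.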
For higher values of $r$ the errors of sampling, in the case of the product moment formula, grow smaller and ultimately vanish when $r = 1$; but the bi-serial values are not invariable in samples drawn from a perfectly correlated population but possess a variability as high as $.27/\sqrt N$ in the most favourable case when the grouping is equal.

If the standard deviation be calculated from the approximate formula,

$$\sigma_r = \Big\{\frac{\sqrt{ff'}}{z} - r^2\Big\}\Big/\sqrt N \tag{26}$$

which may be written*

$$\sigma_r = (\chi_a - r^2)/\sqrt N \tag{27}$$

the error of computation will not be great for values of $f$ and $r$ commonly met with, as the following table compared with the last will show:

                    Values of χ_a − r² for ½(1−α) =
    Values of r     .500    .309    .159    .067    .023    .006
    .00             1.25    1.31    1.51    1.93    2.76    4.5
    .25             1.19    1.25    1.45    1.86    2.70    4.4
    .50             1.00    1.06    1.26    1.68    2.51    4.2
    .75             .69     .75     .95     1.36    2.20    3.9
    1.00            .25     .31     .51     .93     1.76    3.5
                                                            (28)

The difference between the two expressions only reaches 5% when the smaller group is less than 7% of the whole.

It will not be necessary, excepting in small samples, to apply a correction to the bi-serial formula for $r$ in virtue of the mean of samples differing from the population value. The correction is less than $1/N$th part of the value calculated unless one of the alternative classes is as small as 4% of the whole.

I have to thank fellow members of the Staff for assistance in calculating the tables.

* For a table of $\chi_a$ see Tables for Statisticians and Biometricians, p. 37.

ON THE PARTIAL CORRELATION RATIO. PART I. THEORETICAL.

By L. ISSERLIS, B.A.

§ 1. The theory of non-linear regression in the case of two correlated variables is due to Prof. Karl Pearson*. He shows that regression ceases to be linear when the correlation ratio $\eta$ differs sensibly from the correlation coefficient $r$ and establishes criteria for parabolic, cubic and higher forms of regression.
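The comparison of the correlation ratio with the correlation coefficient can be illustrated on a small table. This is a hedged sketch, assuming the usual definition of η² as the weighted mean square deviation of the array means of y from the general mean, divided by the variance of y; the function names and the six observations are hypothetical.

```python
from statistics import fmean, pvariance


def correlation_ratio_sq(x, y):
    """eta^2 of y on x: weighted variance of the array means over the variance of y."""
    arrays = {}
    for xi, yi in zip(x, y):
        arrays.setdefault(xi, []).append(yi)   # one array of y's per value of x
    y_bar = fmean(y)
    between = sum(len(v) * (fmean(v) - y_bar) ** 2 for v in arrays.values())
    return between / (len(y) * pvariance(y))


def correlation_sq(x, y):
    """Square of the product moment correlation coefficient."""
    x_bar, y_bar = fmean(x), fmean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


# Hypothetical arrays whose means (2, 1, 2) lie on a curve, not a straight line.
x = [-1, -1, 0, 0, 1, 1]
y = [1, 3, 0, 2, 1, 3]
eta_sq = correlation_ratio_sq(x, y)
r_sq = correlation_sq(x, y)
```

Here the product moment correlation is zero while η² is 2/11, so η² − r² is sensibly different from zero and a straight line cannot represent the means of the arrays.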
The present paper deals with the regression surface of three correlated variables $x$, $y$, $z$, where, though the regression of $z$ on $x$, $y$ cannot be adequately represented by an equation of type

$$\frac{\bar z_{xy} - \bar z}{\sigma_z} = \gamma_3\,\frac{x - \bar x}{\sigma_x} + \gamma_3'\,\frac{y - \bar y}{\sigma_y} \tag{1}$$

the regression of $z$ on $x$ for a constant $y$ and of $z$ on $y$ for a constant $x$ is linear. A large proportion of the non-linear cases that occur in practice fall into this class.

It will be remembered that, $\bar z_{xy}$ in (1) denoting the mean of the array of $z$'s for a given $x$ and $y$, the coefficients $\gamma_3$, $\gamma_3'$ are partial regression coefficients, and it will appear that just as it is necessary to introduce the correlation ratio $\eta$ for an adequate description of non-linear regression of two variables, there must be introduced multiple or partial $\eta$'s for the description of such regression in the case of more than two variables.

We recall the definition and principal properties of ${}_y\eta_x$, the correlation ratio of $y$ on $x$, $\sigma_{a_y}$ being the square root mean weighted square standard deviations of the arrays of $y$:

$$\sigma_{a_y}^2 = \frac{S(n_x \sigma_{n_x}^2)}{N} = \frac{SS\{n_{xy}(y - \bar y_x)^2\}}{N} = (1 - \eta^2)\,\sigma_y^2 \tag{2}$$

$$\eta^2 = \frac{\sigma_M^2}{\sigma_y^2} = \frac{S\{n_x(\bar y_x - \bar y)^2\}}{N\sigma_y^2} \tag{3}$$

and

$$N(\eta^2 - r^2)\,\sigma_y^2 = S\{n_x(\bar y_x - Y)^2\} \tag{4}$$

* Drapers' Company Research Memoirs. Mathematical Contributions to the Theory of Evolution. XIV. "On the General Theory of Skew Correlation and Non-linear Regression." 1905.

Here we are dealing with $N$ pairs of two characters $A$ and $B$. $n_x$ of these have the character $x$ of $A$. $\bar y_x$ is the mean of this $x$-array of $B$'s. $\sigma_{n_x}$ is the standard deviation of this array, $\sigma_M$ is the weighted standard deviation of the means of the arrays and $Y$ is the value of $y$ given by the regression straight line, i.e.

$$\frac{Y - \bar y}{\sigma_y} = r\,\frac{x - \bar x}{\sigma_x} \tag{5}$$

This is the "best fitting" straight line (in the Gaussian sense) to the means of the arrays, and is the regression line when the regression is linear.

§ 2. Consider now three correlated characters $A$, $B$, $C$.
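If $N$ combinations of $A$, $B$, $C$ are taken, we may denote by $n_x$ the number of these which have the character $A = x$ and by $n_{xy}$ the number in which $A$ has the value $x$ and $B$ has the value $y$. Let $\bar x$, $\bar y$, $\bar z$ be the mean values of the total population, and let $\bar z_{xy}$ be the mean of $z$ for a given $x$ and $y$. The frequency of $A = x$, $B = y$, $C = z$ is $n_{xyz}$.

We define the correlation ratio of $z$ on $x$ and $y$, which may be denoted by ${}_{xy}H_z$, or if no confusion is likely to arise by $H_z$, by the equation

$$1 - {}_{xy}H_z^2 = \frac{SSS\{n_{xyz}(z - \bar z_{xy})^2\}}{N\sigma_z^2} \tag{6}$$

The triple sum in the definition can be written

$$SSS\{n_{xyz}(z - \bar z + \bar z - \bar z_{xy})^2\} = SS\{n_{xy}(\bar z - \bar z_{xy})^2\} + 2\,SS\{(\bar z - \bar z_{xy}) \times S[n_{xyz}(z - \bar z)]\} + SSS\{n_{xyz}(z - \bar z)^2\}$$
$$= SS\{n_{xy}(\bar z - \bar z_{xy})^2\} - 2\,SS\{n_{xy}(\bar z - \bar z_{xy})^2\} + N\sigma_z^2.$$

Hence

$${}_{xy}H_z^2 = \frac{SS\{n_{xy}(\bar z_{xy} - \bar z)^2\}}{N\sigma_z^2} \tag{7}$$

This is a generalisation of the property of $\eta$ given by (3).

Further, the "best fitting" plane to the means $\bar z_{xy}$ is given by

$$\frac{Z - \bar z}{\sigma_z} = \gamma_3\,\frac{x - \bar x}{\sigma_x} + \gamma_3'\,\frac{y - \bar y}{\sigma_y} \tag{8}$$

where

$$\gamma_3 = \frac{r_{xz} - r_{yz} r_{xy}}{1 - r_{xy}^2} \tag{9}$$

$$\gamma_3' = \frac{r_{yz} - r_{xz} r_{xy}}{1 - r_{xy}^2} \tag{10}$$

Let ${}_{xy}R_z$ denote as usual the maximum correlation of $z$ with any linear function of $x$ and $y$, then

$${}_{xy}R_z^2 = \gamma_3 r_{zx} + \gamma_3' r_{zy} \tag{11}$$

$$= \frac{r_{xz}^2 + r_{yz}^2 - 2 r_{xz} r_{yz} r_{xy}}{1 - r_{xy}^2} \tag{12}$$

Subtract (11) from (7) after replacing $r_{zx}$ and $r_{zy}$ by the appropriate sums, and we obtain

$$[{}_{xy}H_z^2 - {}_{xy}R_z^2]\,N\sigma_z^2 = SS\{n_{xy}(\bar z_{xy} - \bar z)^2\} - S\Big\{n_{xz}(z - \bar z)(x - \bar x)\frac{\sigma_z}{\sigma_x}\gamma_3\Big\} - S\Big\{n_{yz}(z - \bar z)(y - \bar y)\frac{\sigma_z}{\sigma_y}\gamma_3'\Big\}$$
$$= SS\Big[n_{xy}(\bar z_{xy} - \bar z)^2 - n_{xy}(\bar z_{xy} - \bar z)(x - \bar x)\frac{\sigma_z}{\sigma_x}\gamma_3 - n_{xy}(\bar z_{xy} - \bar z)(y - \bar y)\frac{\sigma_z}{\sigma_y}\gamma_3'\Big].$$

Using (8) this can be written

$$[{}_{xy}H_z^2 - {}_{xy}R_z^2]\,N\sigma_z^2 = SS\{n_{xy}(\bar z_{xy} - \bar z)(\bar z_{xy} - Z)\} = SS\{n_{xy}(\bar z_{xy} - Z)^2\} + SS\{n_{xy}(Z - \bar z)(\bar z_{xy} - Z)\} \tag{13}$$

But

$$SS\{n_{xy}(Z - \bar z)(\bar z_{xy} - Z)\} = SS\Big[n_{xy}\,\sigma_z\Big(\gamma_3\frac{x - \bar x}{\sigma_x} + \gamma_3'\frac{y - \bar y}{\sigma_y}\Big)\Big(\bar z_{xy} - \bar z - \sigma_z\gamma_3\frac{x - \bar x}{\sigma_x} - \sigma_z\gamma_3'\frac{y - \bar y}{\sigma_y}\Big)\Big]$$
$$= N\sigma_z^2\,(\gamma_3 r_{zx} - \gamma_3^2 - \gamma_3\gamma_3' r_{xy} + \gamma_3' r_{yz} - \gamma_3\gamma_3' r_{xy} - \gamma_3'^2)$$
$$= N\sigma_z^2\,\{\gamma_3(r_{zx} - \gamma_3 - \gamma_3' r_{xy}) + \gamma_3'(r_{yz} - \gamma_3' - \gamma_3 r_{xy})\} = 0 \tag{14}$$

Hence

$${}_{xy}H_z^2 - {}_{xy}R_z^2 = \frac{SS\{n_{xy}(\bar z_{xy} - Z)^2\}}{N\sigma_z^2} \tag{15}$$

It follows that ${}_{xy}H_z^2 \ge {}_{xy}R_z^2$ and by (6) that ${}_{xy}H_z^2 \le 1$.

§ 3.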
These properties and definitions can be extended to the case of m variables #,, 2, ... fm. We now use » mt, for the mean of a, when a, a3, ... &m are given and denote by S a summation extending to the variables a, a, ... wp. Le. 497 If we define the correlation ratio of x, on a, #3, ... 2m by the equation UN Git mace eda) iS IC gel ali) Maan easiest lesesed (16). 1...m We can deduce in the same way as in § 2, the relation AM ote mev ds Wioc=3 112) (Gp coe tn i Oe (17). 2... In order to generalise (15) we recall that the “best fitting” linear function of the variables a, 3, ... 2m to the mean o5 mi, 18 Oe ye ie oeay meee ong ew me (18), on O71 G2 Oo; om 394 On the Partial Correlation Ratio where by, = a and Ryq 1s the minor with its proper sign of the element in the pth row and qth elamn of the determinant = ecg rime Fayre ec eee cere Pee: i ry lm Vme OOo Ih while the maximum correlation between any linear function of a, 73, ... ®, and TIS ast le, Were R-R —o3,..miy? = pe = bisa Ost is os ie Olin Tun ate oe (20), ll S {ny (@, — DB) (2, — %)} S {ns (a — Z,) (#3 — X,)} = by ce a Dis 6 7 Noo, Noo; 8 {7m (@ — &) (2m — Lm)} +... + Dig —— Neto = aa ee (21). Subtract (21) from (17), noting that n.= S {rom}, etc., we obtain* 3...m (, 3, eek FS iOn3 30. plate) No? = = oO a i = 8 {ae (%— oan | =f 2 |r = Bis (a, — X;) (a, a x) P 2 2... oO 3 ae apiece Ar S {ran --. Dine (ay ae 7.) (Gan ro Zn) Im om as SENG oO — ae =e == {No mM (&, = away +8 {n, ee Dis (a. = Lz) ae a x)| Tee 9 C2 2...m m ne oO pee = = +8 \" a bin (Ca ee Lm) ( m& fad a) = = Np o = 2 = = 8 |» (2, oe 2...) + Nem 4 Dy (2, aaa 2) (2. mA ae ;) 7 gsi | 2... 2 2... 2 Oj —t = 8 tm (Gone 2) jaan — a+ Drs = (2 — ae) Tee m Ge ee a Dim es (Ge zm) eae eee (22), using (18) this equation becomes (s, 3, Ravel motores ana) No? > (Hen Cena at 2) (2, me — X1)} 2... = 7/8) gen (om = DG) + 8 {fea an CG a X,) Goan aan X,)} 2... 
2...m * By an extension of the notation described at the beginning of this section S3., denotes a summation with regard to the variables 73,14, ... 2,3 11,2,...m iS the frequency of a particular com- bination of the characters w,, x2, ... x,, While ny, is the frequency of the combination 2, 2. L. IssERuis 395 But S {ts on (X, a ZL) Cann =e X,)} 2. IN ( = ee. geen Ly — Ly Xz — Vs er Ly — Ve =S ie (- Dig ——— 0, — dy O—---) (2. my —Mt+ Dis O,+... (24), o on o 2...m ( 2 and -Noyon%is= S {2m (@, —%;) (o—- ) 1... =S (eee (a2 = Hy) Cant aa X,) 2... with similar values for 7,3, 7.3, ete. .. the right hand side of (24) = = by? yp — Dy? oP — 043012047123 « -- oe Gy Osis bs0 or To3 — bis Oe Bis Dr Cie vaee = GOS (- Tro — Oya — Bis 05 — Oya t'n4g — « — Gina) + o;7b,s (= 113 — Dis — Diy Ig — «- ) Goh Gam - ~.nens seen nee eIAeER SIR CIMA CE Ie eRe Eero (25). Each line in (25) is identically zero from the definition of the b’s and the properties of the determinant in (19). S {Ns mM @ m% a aa) 2 2 2...m Hence payrelae = 2 ealagi => —— = WN Fe aii RIM 8s oie sks (ouslekele| oxsiejsie’s (26), Orie so that the fundamental properties proved by Professor Pearson in connection with the correlation ratio 7, hold for the generalised H defined in this section. In particular equation (26) shows that a necessary and sufficient condition for linear regression in multiple correlation of m variables is that O35 Pilla == pyr means For in this case the mean value of any array of 2, will lie on the “best fitting ” m-dimensional plane. § 4. The regression surface of z on xy being assumed of any particular type the constants in the equation may be determined (i) by the method of least squares, i.c. by making the sum of the squares of the deviations 2,,—@(xy) a minimum, z= ¢(w#,y) being the regression surface, or (11) by giving such values to the constants that the correlation between z and (a, y) shall be a maximum. 
When ¢ (a, y) is of the second degree the two methods lead to identical equations for the determination of the coefficients. The same equations are also obtained if the surface be “fitted” to the means by the method of moments. There is, however,a distinction to be observed. The equation z= (a, y) when the regression surface is of specified degree contains a definite number of constants and the first two methods will give exactly as many independent equations as there are constants to be determined. The method of moments will give as many equations as we please if sufficiently high moments 396 On the Partial Correlation Ratio are used, including of course the equations given by the “least squares” method or maximum correlation method. Even without introducing high moments, when there are three variable characters new equations may be obtained by the method of moments, by combinations of characters which do not arise in the other methods. The method of moments is most convenient for our purpose, but we shall only employ those equations which can also be justified by (say) the method of least squares. For convenience let the origin be taken at the mean of the three characters so that2—y—2— 0) Jet atytze denote SSS {Mayet y'2"} _ Pasty (27). a ee Noiiay Ge Ox Oy Cz With this notation 7,, and ga, are identical; when z does not appear in the product, it is sufficient to write @,s 4 The most reasonable next approximation to make when a linear function (a, y) does not adequately represent the statistics is (a, y)=a quadratic function of a, y. a 2 2 Let ELI rg OEE a (28). om Ox Gy Gxoy Oz Oy’ Multiply (28) by mz, sum for all values of w, y, z and divide by V OS + Cry HOA SF 200. -2. sess dea near eeeeee (29). 
’ : ales x De Z Multiply (28) in turn by “ae times —, 7, ae ae x and sum as before, ap IO i) Oa and we obtain Toy = 0 + OT yb CQnry Ap CGasit Pf nyte s0+efecioes seine nee enna (30), Pyz SO Olay + COmp + Oxy Hii Qys seceese osese nee Gnee eee eter (31), Qayz = Uxy + Aqary + OY aye + CYary + CYasy tI Yays veeceeceveseees (32), Gare = A+ Agast+ gary + CYary + CQat Hf Qary? vrecerececrscecscneas (33), Pye = AA Adaye + bq ys + COays + CYnry +S Qys -cescersensercereoees (34). Actual numerical fitting shows that in many cases e and f are small compared with c*. This is the case when the regression of z on « for a constant y and of z on y for a constant « is linear. We shall therefore confine ourselves in the present preliminary paper to the case where we may write 2 gi BY ee (35). Oz Ox Cy OyGy Here for constant # or y the regression of z on y or of z on @ is linear. * Cf. Census of Scotland, 1911, Vol. 111. p. xuvi1. where Mr G. Rae obtains by moments the regression of fertility on age of husband and wife. Let W=age of wife, H=age of husband, C=number of children in completed marriages. He finds C yy = 20°149493 — 0°555812W — 0173804 — 0:002846 W2 — 0:003494.H?2 + 0:012675 WH. See also the paper by E. M. Elderton in the current part of this Journal, pp. 291—295. L. IssERLIS 397 Equations (29) to (32) become, when the regression surface is given by (35), ORICA CTir lia oe ME sia a i sremiebe cin ccs ara ie aheteers sie yee scsgiee (36), Me MAO OI If we eliminate q,,z (which is a triple moment troublesome to caleuiae) between equations (45) and (49) we have ! ay? = OP zy + Dr yz + CGayz = OP ag + Dr yz + C (Gary — May) + ACQary + beqaye = Tox (3+ CO) + Tey (ys + Ch) + C (Gay2 — Tay) + (Y36 + 09) Yary + (Ys'¢ + CD) Qanye =O [Gaye — Tay + Oqary + Gaye] + ¢ [Ore + Pry + Ys Gary + 9 dey] + Ys" ex + Ys Vey 7 = yl? + C [Gare — Pry + Oqary + $Yxy2] by (46) and (11), — Pay t Fay — 2qery? Qty =. ay Ht? — ayl? at | dev — Ty tea He RZ : Hence C= $$ et ee (52). 
2 Paty + Paye — 24ary2QaruT ay Gary? — Pay — = Seeage me ay This value of c? is positive by (51). Equation (52) shows that ,,H7=,,R? is a necessary condition for linear regression, which we have already proved in equation (26). L. IsskEr.is 399 The regression surface of z on a, y is, with the values we have now obtained for the constants BZ Vg — Tyel ny © © Vyz—Vax2V ay Y as = = te om 1l- Prey Ox Us rey ae ar we el 2 ale: : {= ay? = Yary & e + Vay Yary — Cay Y — y Ee ee EY et ee a = pee 4 7 Payt + Paty — 2Qa2y Yay?? ay pales ae ae 7 sys 1 ay Ye 1 a Tey ( (w7—%)(y—-Y) + — pS “ey piavetanallslasons (53) Tiley The terms in the first line give the ordinary regression plane. In most cases the regression does not differ widely from linearity so that ,,H?—,R? is small. § 6. We must now get some idea of the relative magnitudes of qa, Jay, Ge2 and moments of lower order. First with regard to g.., which is equal to Leis : On Oy aye S (Nyt? y) = S (Nz ix") (54) Vary No? oy No? Oy Si aheieje Bhd S(n yo? ) Similar] gf a Zs : Me Yaty Nozo/ Be Pay; or, so far without approximation ai 9 _<¢ Y 5 ¢ us = S(nz,a?o7y,.) + S (ny yxy) i oe cy) Ae 2Naeo,7 ap $ (By ote B,’) Preys * It is noteworthy that the hypothesis that regression of z on a, y although of 2nd degree is such that regression of z on x for a constant y is linear leads to the result that the total regression of z on x is parabolic. + Pearson, l.c. p. 28, Eqn. (lxiii). { Pearson, l.c. Eqns. (li), (xlv) and (xiii). § Pearson, l.c. Eqn. (lxv). || It can be found fairly directly by tabling to the squares of the variates, when we need a simple product moment. In a later part of this paper some comparisons of actual and approximate values for numerical cases will be found. L. Isseruis 401 Now the mean values of oy, and o?y, are known to be Cy liter) -and,-o,° (L—73,,,), and the deviations from these are usually somewhat irregular. 
Rarely can we do anything better than assume them to vary with a slight linear variation from the mean. For example Oy, = Oy (1 —7°gy) (1 + Ax), where A is small. In such a case Were (1 — rg) (144.8) or to a fair degree of approximation*, we may put : Vx2y2 = 1 Tey ar 4 (B, ar (SEO) Pevy, and thus write Quy — May = 1+ %(B2+ Bs — 4) Pay = 1+ y+ $(8,— 3+ Bo — 8) roy. The latter part of this expression vanishes if the frequency of the # and y variates be mesokurtic. It can of course be retained if desired but its product with V wy? — ey RZ will usually be of the second order. If we write v=t(Bo- 3+ B, — 3), we find the approximate regression surface Lim aa PyzVay (@ — @) ns Pyz — VazV ay We Y) Oz 1- Tey Ox t= Tey Oy / se — ah? (w— 2) (y-Yy) PG (57) Tee Was | os = ae : This equation (57) enables us to express approximately the multiple ,,H, in terms of the simple 7, 292, xMys yNx- + §7. To obtain this connection between the multiple ,,H, and the simple 7’s we may proceed as follows : OF Zay =d f= ae , origin at mean, oz ox Gg y Oy Hence keeping y constant and summing for the «’s “Ud 4 a — +2 uf ce ae ee ee (58). z & y FnOy » _ SSS {ney (2) — 2) les SSS (ayz 27) «¢ ue Luar aon No? No? ie SSS {Nayz (Zey — 2)? — SSS {Nay2? pee Yy. y fend aye” oi ead ay He = No? BeUNag -~ * T.e. we are neglecting terms of the second order as AV Bi. 402 On the Partial Correlation Ratio page ls — yn? = SSs ka (“ (wy) | cy (a — =») (2d+a @ tty | 4 by ae ox oe oy oy Cy | Gyan = 5 (22) (a+ 2) [2 (a4 on) (24%) = ss |" (FS Ox #) (a+) 2 d+) + (a2) Ox ) . by By 0, — 2,2 CY\? Nay) 99 a ie ay y Cy Nay s {(a+ ale jo Ses Ox ta + as |e On? (a+) N} by CY\" | Nay LU — X,? =0+8 \(a+ 2) (s Neorg )t. Now S {Naya} = 8 {(w — dy + Ly)? Noy} = ny (on + 0+ 2,7), x we a : CY \7 Ox,? N, ay H? — yn =8 (a+ “f) at : On ns fo+ 70-9 y = ee + 2et 0), Cy oO, ay? — ne = (1 = yn2) (EO)... eee (59). Similarly eye ee = (8 —27,))(0F aC) nee eee (60). Remembering that a= y;+c0, b=+¥;' +ch we get from (59) and (60) a ra) be ay? 
— Ne woe See ee : SF le ¥5%3 (sh — ys 9) 1- Qa ; tiGe {ys o— 39 — Ob (ya = Ys, 0)} AERC (61). From the values of ys, y;, 9, @ in (9), (10), (41), (42) we obtain easily yzYary — Vaz Vay? / yz 4ary az Yay Yah — ys = "een Teeter Seay / Vaz Q x2 oan Tye Jay? y3 6-730 = 1 — 1 xy Ya — 29 — Of (ys — ys 9) — PaeQary — TyeGiy? (Txy Yay? — Yary) (Cay Jary — Yay?) Cyz ary — Vaz Gay?) LP ey (1 — rey)? and c? is given by (52). Hence (61) may be written (Qe —Vxz Try) (Tey Vay? — Yury) (7 az — Ty2? ay) @ wy Yay? — Ya? y) x lai x R? = x = ~ ie es (1 = xy)? CL = ya") (1 — ray)? (Ll — amy’) (TxzGary = Tyzay ) ne (Vey ary? = Qa? 2y) (? "ary Yay y — Yay? Cae, = "ae Gey)) 6 tly (L= 722) f 2 24 ae ay? —s 2 wy2Y a2 Vg ) (Gu22 — Tay) — CZ yt GF a = wary Vay | ay L. ISSERLIS 403 mau yz — “a2: ey) (7 xy Vary ane dynes CG wz — Tye ‘ny) (Vay Vay? — = x2 a) ane (1 ile oe (1 — yNax’) (_l— Trey) d ort y N22) (122 — TyzTay) (yz — Vr2V ay) (Tyz x2y — 22 J ny?) ( es Tr) pis (re = Vaz" ny) Pay ary — Ixy) xz —Py2"ny) Cry Yow? — dev} (62) avis a ; aa eee i (1 — rx)? (1 = yn2?) (1 — ray)? (1 — any’) 1 ays L— 1 yy eed) May ; yz ee. ( 1 ) 2 2 ( 1 i ay ee Smee eat yaar =r ar 1 > 4 2 Ar lie ya WMI May 1 (ume a) ie rey &) oy at ha SE Aan re a (64), Lips where 2, = Tay (ya! oe ay) ye ~My and d= (yn? —T ‘ew) (y Ma" zs ny) G2) Gary) | lr, (1 — ynz’) (1 =1%xy) : ane oy / , Seay while = Sire alae vay ats So Seat ease reine (64, where ),’, A. may be Agere on Ai, A» by an interchange of # and y in the suffixes. 1 1 Finally ~ ee ate antonnes (65), Quy? Fay t Paty — 24ay2 2x2 ihe. ~ ee ye r xy CAE xy 2 L— Prey where Ko = Pay + Prey = 24ay aryl xy Fay? t+ Pay = ‘Qaey? Ta? y “ay) zis a2 pal The suffixes of &,, &/,-’,, Xo, Ay, Xo’ and «, denote the order of smallness of these terms. (1 77 5))(Qa242 — Tny) (dur eee 404 On the Partial Correlation Ratio We make the corresponding substitutions in equation (62), including the value Tee hale ee 27 eV yx Manz Se ay age R? 
on the right-hand side and obtain the following et Tey accurate formula for ,,H?—,,R2: ¢ ; (7 "ye — Vaz i) © “ay Vary Yay2) G 1 ) 2 2 ve \" vy x en (ay Jal z wy R, ) | qd Tay 1—r & (xz yz s Gi ay a ary) (= ti) (1 cay Pe De an ie) {ste ~VyzFay? (Tyz Qa2y — Vaz Jay? 2) (7, ay Jay? — Ya? y) @ "ey Vary — Gy)| The ray (l-r wan) ) 1 ———— Qaty? — ay = (Tyz i? NazT ey) (Try Qaty ay?) ( re + Ny + 7) (1 = rxy)? a7, ne meine! yeVay) (Pry Jay? — x2 y) ( ers , ‘ (1-7) ee nae yz Jat ira ne Voesj2) (7. ez Speier) (Giip = ieee) (_-r a — ye tae — Bsa ye ay | ce = ae ey) (Tay Ja2y ~ Gov) ( oe ) (= 7a) (Lr)? barges (Taz — TyzT xy) (Tay Try? — uty) ( Se eae ‘| " v6 (l—7,,)° las + EY lili ees (66). The right-hand side of (66) apparently contains terms of the first order, while on the left the lowest order occurring is the second. But the coefficient of qz2, in the first order terms is 1 = yy ae aa Vee lacy) TayT zy ar (Cie — Pyzt 7) Tog + Lyz (Px oar Prehiey) (Gere a Toe Tay) xy 1 : ee (1— rzy)! {yz —Vxz ie) Vey + (Taz — Lyz Tny) Gz + Px — Papel Piha) — Pry and vanishes identically. Similarly the coefficient of qg,y: is zero and thus equation (66) is a relation between terms of second and higher orders. A first approximation then may be obtained by equating second order terms. This gives (ay H 2 ~~ ay R2) (Tyz— Paz xy) (Vay Yary — Yay?) — (Vez — Vyz" ay) ("xy Qry? — Yay) 3 (1 = 7x)? — Paz Qary — Vyz ay? | (1 = ray) (Gaye — May) _ (Vyz = Vaz xy) (Pay Very — Yay’) E yz + ae — = 2x27 yzV ry iB | = al a Tay)? 1 oy (ae an VyzV ay) Gra Gav aa ary) [a fol r yet T 2% — 2Px2VyzT ny é| (67) (1 = ,,? E A hens AA eat ore ; Ta L. ISSERLIS 405 The coefficient of ,,H,—,R,/ on the left reduces to 1 1-7, ( — ry) [Qu ylaz — Yay? 2Tyz| (1- cs i dhe ya Pony Pye tT a2 — 20 x2P yzP oy = Ne — Tay é (TyzV xy = Vee)” ae Uae Ee — 1-7, Lr ay l= Pp a yNe — 3 ”, aii (ya — Tiny) (fer ie =) (68) Bee (1 y 1 — yn Teter oes Gane , Similarly a tape ye a De ce Nye see (Ue ae TNC eee uz wel yz! 
wy er _ ale za wily z( pesca ) me 1 it os Tn) E, 1 —/ a il oof Tye 1 a fy (69). We shall still be correct to second order terms if when using (68) and (69) in (67) we replace 1 —,,y,? and 1— any occurring in denominators by 1 —7%,,; so that ae (oy He — sy 2) oe ey w-¥* Yr, ay V2 yo Yay? Lr Ye — Vyz21. xy 9 9 Vyz Vay — Vaz 9 9 = z ; = : 5D Gne == T*2y) a Ti = Fr Gar = Te) ay E Qa2yVaz — Yay?! a 7 ay Vay Vay? — Ya2ry Ve, = VyeT xy 2 3 YoeVay ~ Tyz . a (Car i Tt) a 1— 7, Ment = Ta) Yay?’ yz ~ Yx2y Vaz 1 — Pay § 8. This result is of importance, as it shows that the heavy labour of the direct calculation of the generalised correlation ratio can be replaced by the calculation of four simple correlation ratios. The coefficients involved are the ordinary coefficients of linear regression denoted above by y;, y; and expressions involving product moments of orders 3 and 4, To these latter we may approximate by the methods of § 6 Qaryx- 1. Array A good approximation for —“*—— 1s _ . Ifgreater accuracy be needed Gary? — May + py Jury — 1 aa (4(A.+ Bo - 6) + ZT ay Gury — Pry 1 +1 ay +4 (Bs + Bs — 6) Pay We saw in § 6 (equation 55) that Vaty = Vay VB, + VN Gp= a ay VB.—Bi—1 oS ) Yay? = Vay vB an Nae Tey v By — 6-1, approximately. 8, and @,’, which are zero in normal correlation, will in general be very small compared with gn’? —72y and yn? — 7xy so that Voy Vary ~ Yay? be) Vey V (Be zm 1) (cy? = Ty) a V (By —1) G Ne — ry) QaryVaz — Yxy?"yz Tzz V (Be — 1) @Iy? = Pay) — Tyz V (Be —1) (yn? — xy) we may use and Voy Vay? — Yary _ Vay V(By = OGie = fay) = V(Bs —1) (7,7 — Poy) Gay? yz — Quty lz Vyz Vv (22 Be —1) Gis Toy) V(Bo —INiGny Tv ee Biometrika x 52 406 On the Partial Correlation Ratio In the important case 8, = 8,/=0 and ,.=2n,=Te, these approximations fail, and so does the process by which (70) was obtained as @ and ¢ vanish and (61) is indeterminate. We must then fall back on equation (59) xy H?—- ye = (a? 
+ c*) (1 — yz”) =(ys +c’) (1 —1xy) since 0@=0 and 9, = Tay = 3 (1 — Mey) + = (1 + #3), Gary? } ar y 1-7, cy 2 Ya2ry? — Tay or Cay He TF wy Te?) € a iea)) neglecting 38rd order terms. The right-hand side reduces to , — 7?,,, so that 9 2 ¢ BE i: 4 9 9 ae GS) (ne =1)) ine (71) Yxty2 — 1 PA 9 2 bs ae aaa (yn? — 1°,,) approximately ; xy of course in these circumstances (60) would lead to the value 29, ) a 5 5 ee = TF cet ean PeR PROBA ARIA on. Jas0005000%0 (72), ay showing that if 8, = 8, =0 and if (,.7,? — 7,7) — (yn2? — Mxy) is of higher order than the first then (,7.2— 1.2) — (,n2 — 7°.,) is also of higher order than the first. We shall now seek relations between the six correlation ratios of three ) “hyperbolic” variates. From (59) and (60) we get, on eliminating ,,, H/, ye — af =P > COLO Une — 29) Heyes — 0 sn, = 3? — ys + ys ye? — Ysa Ny” +2 (ys9 y nx = Ys Px’) +0 (0 — 6 + ya? — any + Pyne — Fany?) = (5? — ys") (L= ay) + Ys" (ya —Pny) — Ya 2 (ey? — Tay) — 2¢ (79 — yb) (1 — xy) + {209738 (y nx? — xy) — 2erys'h (omy? — Tay) +O (P — b+ ye = aty + Pyne — Px Ny’) The terms in the second line are second order terms. Neglecting these and noting that (Ys? — ys?) (L = Pay) = yz — Mae and (39 — ys'b) (1 — ray) = Pye Gay? — Tez Qary, we obtain the following equation for c: (yne — Tey) — (ane? — Mex) — Ys? (ya? — Tray) + Ys? (ey? — Tay) =D Vary — Tye Vay) eee (73). L. Isseruis 407 24/2 P i Now (eH? - mR 2) fee 2c?7*,, 1f we neglect second order terms. If ary? a ey we use the value of c given by (73) in (70), and adopt the notation ,U, for ane — Vx, We have Tvy ty Up: 2 U, rae Ys y OF aF pee Gi = 2 (TuzQery — Py Gay?) (Pay Gary — Gey) Ys (ye — Ys? y Ur) — (Pry Gey? — Yury) Ys (@Uz — Y2?2Uy)} 0... (74). Let us write »x, for ,U,—;?,Uy and yx, for ,U, — y3°,Uz, then (74) becomes (pe =e apeay Tey = 2 Gee Gy — yz Jay?) {rys. Cary oF Joy?) yXz— Ys Cosy Yay? 
— Yury) aXet (75) is a relation between second order terms and it is sufficient to use equation (55) for qaz, with 8, replaced by zero and B, by 3, so that arty = "2,0 y, ay? = V2,Uz, (X2— y Xz)? Tay = 4 (Taz VUy eye VU) Ls y Xz (Tay VpUy i VU 2) = Wane Vay Vp Um— Nay) Weoeeon (76). This identity between »7-, yz, xn, and ,7, is symmetrical in w, y. Two more such identities may be obtained by interchanging the letters «, y, z in cyclic order*, There are therefore three identities between the six correlation ratios : yNa>. «Myr, yz. zNyr az, za I have not so far succeeded in reducing them to simpler forms, although possibly such exist. In special cases simplifications result. These are illustrated in the following section. §9. We defined y;, y,;' the regression coefficients of z on w, y by the equations Vaz — VyzVay 1 Tyz — VazVay (eer Ch te LE (10). a lay ee diaellv 1 Vee ~ Vy yz Dae Pe Se le a a ae Pye SE (77) _ Vyz — Vaz ay 1 Vay — ye" ee Dee 76 2 nk amare 2 — 1" xg 1-7, It will simplify the algebra if we use X”, p?, v°, XN, w, v? for yUy, -U,, 2U2, Ue, Buy. je respectively and P,Q, RP”, QR’ for yes eXys aXe aXe» aXu» vXe jeapectively so that P= - ye”, Q= fe Ya? W?, R=r—- YH) Dee Ne yy? oe, Q’ an Me es yo? y?, Rea nyt? J Siesietase seats * We are supposing here that the regression surfaces of # on y, z and of y on z, x are also hyper- boloids of type similar to (35). 52—2 408 On the Partial Correlation Ratio The three identities connecting the six simple 7’s become in this notation (R— RB’) ray = 4 (rz f= Pyzd) {ys BY (ray pw! — 2) = aR (Tyr — bY} --(79), (P= PP rye = 4 (ye ¥! = Tex) (yy P" (Pye ¥ = Bb) — HP (Tyee — v’Y} ---(80), (Q a Or =4 (12 — Vy V) {2 Q GaN —V) 9, Q Ca = ny} ...(81). (i) We can deduce from (79) that if the correlations of both # and y on z be linear and equal, then the correlation ratio of 2 on y and y on @, i.e. ny and yz are equal. 
Thus in biparental correlation, if the regression of the child on each parent be linear, then the correlation ratio of the father on the mother is equal to that of the mother on the father. For under the conditions stated az = yNz = Vea = Vey Hence y,= 3 =- Je and (79) becomes 1 ae Voy st DP = pee P rey = Arye (Mo A) [= 8A? (Pay — A) + YH? (Tay r — BY}, oe 3% — BYP A+ BYP Pay = Ary (X= WY {May Apel — 2? — Dye! — 7h, Us (A — bw’)? a (A+ WP — 4 (ray rp! — Ape’ — DV? — we?) | = 0, ey which reduces to No RG +2?A+(2—717%,,) pi ti 4” (27 ay + 3) (Tay + aa A-pwy ~ — : ; : = 0. ( i) | (es ap i) (Tay ap 2)? But rzy is numerically <1. .*. the factor in curved brackets is positive. Hence Niles Ope Cee eT (ii) An interesting deduction from the identities (79)—(81) is the following: “Tf any four of the six regression lines that occur in the mutual variation of three variables are linear, so are the other two.” We have to prove that if any four of the six quantities ), w, v, X, mw’, v’ vanish, then the remaining two vanish as well. Hirst let 4p = — Ae 10; (79) gives Toy? (Ys fe? + VP = Are w [Ye VEN ny ft — Yas 1°}, (80) NV Ty? = — ray nn”, (81) bg = From (80) yy [Aes Pye + yr 27y27] = 0. But Ary, Vay ar wees = (ee ee — at) >0. Wy =), and 0) and these satisfy (79). There are three cases of this type. L. ISseriis 409 The case X’= p’ = vr’ =X = 0 is proved in the same way. There are three cases of this type as well. Next take the three cases of type N=p=0, N= 0. Equations (79)—(81) become (79) i — se se ep yr. (80) Ny ry? = ey’ (yyy? (— vf, (81) ys”? = 4(—Tryv) {— Yeo y2V? (— vd}, (80) leads to (ry By Array eya) yt = 0. We have already shown that the first factor is positive, pv =0, and hence v= 0, and these values satisfy (81). The three cases of type X == pw’ = v'=0 lead to (79) yt 7, = 0, (80) Ny, = 0, (81). (ye? v? = y2?X?) Pye? = A (Pye — Pryv) [= 2 Yo? (Tae =v) + 222A? (Trev — VY}, whence \’ =v=0. There remain the three cases of type N—i— 0) (0. 
Here (79) is satisfied identically. (80) becomes OM? = YP YP Pye = — Aeryzps (ya (A? — 1?) (= wh, (81) (WP = YP A?P Tae = Aryzd" [= Yo (Mw? = yo ®”) (= VI, which reduce to (A? = 2?) [Pyed? = (WT yz + Ae’ Taz) w} = 0, (w? — 2? d’?) et — (279 2 + Ayo) "yz) 7} es The only common solution of these equations is 0. We have thus accounted for all the fifteen possible cases. (iii) Three regression curves linear. In six cases out of the possible 20 cases the linearity of three only of the regression curves involves the linearity of the remaining three. 410 On the Partial Correlation Ratio Let X= p= 0 and either v or v’ = 0. It follows from (79) that R= RB’, ie. v= v’. “. both v and v’ are zero. We have now four linear regression curves, .’. six are linear. Le p=v =0 and p =—0: Since; = v= 0, 3) P= or N= eesontliat P= — yee NeSore, P22 = per — yo Ren (79) becomes (v? + 932A?) ey = 40 yes? (Tay? — Ys"¥s V2), (80) becomes (ye? v? — rye? MP)? Pye = A (Tyzd — Pry V) Yayo ¥? [(Yov — Ya’) + Taz (q2'V — Y2X)]- The first reduces to {x Val Cee + 1 yP xy) Tey — : 2ryz Ven == (0) Se) a we and the second to (av? — yo! rN)? Ie Ayarye, (Mz — Tey vy) v? = 0, all Hence either 7}=v=0 or there must be a very special relation between Vays Vyzs Vee If instead of uw’ =0 we take v= 0 we get similar results, ie. in general the vanishing yu, v’, w’ or w, v’, v involves that of X, v, ’ or A, pw’, V. This accounts for six more cases. There are eight left. Of these six are typified by fp =7=0=y 50m ut and lead to the same conclusions. The remaining two are N= — 7 — 0 Or Ay The first supposition, \=~=v=0 gives Pa=-yv?, Q=— 9°, R= yp, P=nr2, One RR’ = p?, leading to (79) (vps My? Tye = bres Wye! (Pay? — Ys 7}, (80) 2+ yy?! 3 y, = Atay VO [Tyed? =v? | (81) (2 + yg? A)? 729, = Ary Neyo" {Taz fl'? —. Yas NJ, which give A’ =p’ =v'=0 or a very special condition to, he satisfied by the correlation coefficients ry, Tyz, Tex: L. 
We may conclude then that in general the linearity of any three of the six regression lines involves that of the remaining three.

(iv) If the regression surface of $z$ on $x$, $y$ reduces to a plane, the regression curves of $x$ on $y$ and $y$ on $x$ reduce to straight lines. We have as in § 7

$$\frac{\bar z_y}{\sigma_z} = d + a\,\frac{\bar x_y}{\sigma_x} + b\,\frac{y}{\sigma_y} + c\,\frac{\bar x_y\,y}{\sigma_x\sigma_y} \tag{58}$$

But … + terms of higher order. Now the regression of $z$ on $y$ for a constant $x$ is linear. Therefore the coefficient of $y^2$ is zero. To first order terms we may put $\beta_1 = 0$ and $\beta_2 = 3$, and thus … $= 0$. Similarly … $= 0$. But $c$ vanishes when ${}_{xy}H_z = {}_{xy}R_z$ by (52). Hence if ${}_{xy}H_z = {}_{xy}R_z$ it follows that ${}_x\eta_y = {}_y\eta_x = r_{xy}$.

We thus see that if the three generalised correlation ratios ${}_{xy}H_z$, ${}_{yz}H_x$, ${}_{zx}H_y$ are equal to ${}_{xy}R_z$, ${}_{yz}R_x$, ${}_{zx}R_y$ respectively, the six correlation ratios ${}_y\eta_x$, ${}_x\eta_y$, ${}_z\eta_x$, ${}_x\eta_z$, ${}_z\eta_y$, ${}_y\eta_z$ reduce to the corresponding correlation coefficients $r_{xy}$, $r_{xz}$, $r_{yz}$ and that the "linearity" of the three regression surfaces involves the linearity of the six regression lines.

MISCELLANEA.

I. On Spurious Values of Intra-class Correlation Coefficients arising from Disorderly Differentiation within the Classes.

By J. ARTHUR HARRIS, Ph.D., Carnegie Institution of Washington, U.S.A.

When the constants of the $x$ and $y$ characters of the population in $r_{xy}$ are quite indistinguishable, symmetrical tables* may be used, but not otherwise. Primarily and for the most part, however, the use of symmetrical tables has been restricted to cases in which the degree of interdependence between the measures of all possible pairs† drawn from a considerable series of associated individuals (in short, intra-class correlations‡) is sought.
The danger of spurious correlation due to the artificial symmetry of the surface is then much greater§.

Pearson|| long ago pointed out that when intra-class differentiation exists, for example, because of age in the case of characters determined upon the members of a fraternity, or of position on the axis in the case of serial organs, the values of $r$ may be to some extent spurious.

In the cases considered by Pearson differentiation is an orderly phenomenon, i.e. the magnitudes under consideration increase or decrease with age, position on the axis, or some other extrinsic characteristic with such regularity that the relationship can be expressed by an equation which may be used in correcting the raw values of $r$. In other cases, the problem is not so simple. Differentiation within the class may exist, but it may be difficult or impossible to arrange the individual measurements by any character outside of themselves to obtain the constants necessary for determining the true correlations from the spurious values deduced from the tables.

* R. Pearl, Biometrika, Vol. V. pp. 249–297, 1907; H. S. Jennings, Journ. Exp. Zool. Vol. XI. pp. 1–134, 1911; J. Arthur Harris, Biometrika, Vol. VII. pp. 325–328, 1910.
† K. Pearson and others, Phil. Trans., A, Vol. CXCVII. pp. 285–379, 1901; K. Pearson and A. Barrington, Eugenics Laboratory Memoirs, No. V, 1909.
‡ Biometrika, Vol. IX. pp. 446–472, 1913.
§ With only one pair of measures the probability of spurious correlation is, in cautious work, very slight, for the possibility of differentiation can be easily tested by the critical comparison of the physical constants.
|| Pearson, K., "On Homotyposis in Homologous but Differentiated Organs." Roy. Soc. Proc. Vol. LXXI. pp. 288–313, 1903.

Illustration I. The correlation between yields of wheat in variety testing.

In variety testing, the experimenter seeks (or should seek), among other things, to determine the correlation between yields of varieties in different years. If this correlation be 0 (and regression be linear) it is clear that the yield of a variety in one year furnishes no basis for prediction concerning its yield in any subsequent year. If, on the other hand, the correlation be high, prediction from a few years' test may be made with great probability of certainty.

Given a measure of the "performance" of a series of varieties during a number of years it would at first seem quite allowable to form symmetrical tables or to use the intra-class formulae of a former paper* to determine the intra-varietal correlation, and to regard this as a satisfactory measure of the differentiation of the varieties and of the average prediction value of a year's test. Such is, however, not the case, for while there may be no orderly change in yield throughout the period under consideration, the individual years differ greatly in their average yield for all the varieties. The influence of this "disorderly differentiation" upon $r$ is admirably shown by A. D. Hall's† table of the yield in bushels of wheat in the Rothamsted experiments.

TABLE I. Yield of Varieties of Wheat in Different Years.
[Table body not recoverable from the scan. Rows: wheat varieties. Columns: the years 1871 to 1881, followed for each variety by the number of yields $n$, $\Sigma(b)$ and $\Sigma(b^2)$.]

Let $b$ = yield in bushels per acre of any one of $m$ varieties in any one of $n$ years, $y_1$, $y_2$ be the "first" and the "second" years of a symmetrical intra-varietal correlation surface, $v_1$, $v_2$ be the "first" and "second" varieties of a symmetrical intra-annual correlation surface. Then $r_{b_{y_1} b_{y_2}}$ will be a (spurious) measure of the (persistent) differentiation of varieties, $r_{b_{v_1} b_{v_2}}$ a (spurious) measure of the differentiation (in the yield of all the varieties) of years.

Applying formulae (v) to (ix) of Biometrika, Vol. IX. p. 450, to these data, I find

$S[n(n-1)] = 2128$, $S[(n-1)\Sigma(b)] = 83122.5$, $S[(n-1)\Sigma(b^2)] = 3483626.4$, $S[\Sigma(b)]^2 = 3610204.57$, $S[\Sigma(b^2)] = 370820.13$, $\bar b = 39.0618$, $\sigma_b^2 = 111.257328$, $r_{b_{y_1} b_{y_2}} = -.032$.

The result is obviously spurious, for mere inspection of the entries in the table shows that some varieties regularly give heavier yields than others. The source of the spurious value is to be seen in the fact that an intra-class coefficient has been calculated from a symmetrical surface formed from classes (varieties) represented by a series of yields differentiated by annual variations in the growing conditions. By correcting for this source of differentiation by expressing each yield as a deviation from the mean yield of all the varieties for the particular year, i.e. $b' = b - \bar b_y$, where the bar denotes a mean and the subscript $y$ that it is for all the yields of a year, I have found‡

$r_{b'_{y_1} b'_{y_2}} = .266$.

Measuring the differentiation of years in terms of intra-annual correlation (intra-class correlation in which each class is defined by the year and its individuals are the yields of the different varieties grown), I find from Hall's table

$S[m(m-1)] = 4440$, $S[(m-1)\Sigma(b)] = 174129.2$, $S[(m-1)\Sigma(b^2)] = 7317531.92$, $S[\Sigma(b)]^2 = 7586436.21$, $S[\Sigma(b^2)] = 370820.13$, $\bar b = 39.2183$, $\sigma_b^2 = 110.017719$, $r_{b_{v_1} b_{v_2}} = .791$.

Since the varieties have been shown to be differentiated, this result must also be spurious. Let $b'' = b - \bar b_v$, where the $v$ indicates that the mean denoted by the bar is for the yield of the variety for all the years it was grown. Correcting for the influence of the differentiation of varieties in this way I have found§

$r_{b''_{v_1} b''_{v_2}} = $ …

Thus season is a far more important factor than variety in determining an individual yield.

* Biometrika, Vol. IX. pp. 446–472, 1913.
† Hall, A. D., The Book of the Rothamsted Experiments, p. 66, 1905.
‡ Science, N. S. Vol. XXXVI. pp. 318–320, 1912. Probably a better method of dealing with such cases will sometime be found. So far I have not succeeded.
§ Science, loc. cit.

Illustration II. Influence of Personal Equation upon the Correlation between the Grades assigned to the Same Paper by a Series of Instructors.

Stripped of the verbiage in which it has been clothed in discussions among pedagogues, one of the chief problems concerning the reliability of the grades assigned in examinations resolves itself into the statistical question: What is the correlation between the grades assigned to the same paper by different instructors?

Let $g$ be the grade assigned to any one of $m$ papers by any one of $n$ instructors, let $i_1$, $i_2$ be the "first" and "second" instructor (of a symmetrical intra-class table) passing judgment upon a paper, $p_1$, $p_2$ the "first" and "second" paper graded by the same instructor. Then from Table I of D. Starch* I deduce, by the intra-class formulae (v) to (ix) of Biometrika, Vol. IX. p. 450,

$r_{g_{i_1} g_{i_2}} = .659$, $r_{g_{p_1} g_{p_2}} = $ …

By using the deviation method as illustrated above, I have found

$r_{g'_{i_1} g'_{i_2}} = $ …, $r_{g''_{p_1} g''_{p_2}} = $ …

TABLE II. Grades of Papers Assigned by Various Instructors.
[Table body not recoverable from the scan. Columns: instructors. Rows: papers.]

Both of these results, in which an attempt was made to correct for the personal equation of the instructors in determining the correlation between the estimates of different instructors on the same paper, or to correct for the differences in merit of the papers in testing the individuality of the instructors, are higher than the raw values given above, which are clearly spurious. Similar results† are obtained from Jacoby's astronomical grades‡.

* Science, N. S. Vol. XXXVIII. p. 630, 1913.
† Personally, I can attach little pedagogical significance to series as short as those of either Starch or Jacoby. They serve here as illustrations of method merely because I know of no more extensive series.
‡ Science, N. S. Vol. XXXI. p. 819, 1910.

The essentials of this note may be summarized as follows:

In using intra-class coefficients care must be taken to guard against spurious values arising through differentiation among the individuals of the class.

Besides the orderly differentiation (due to age of individuals, position of organs on axis, etc.) for which Pearson has determined corrective formulae in terms of correlation coefficients, a disorderly differentiation for which such corrective formulae have not as yet been found sometimes obtains. Illustrations of such cases are here given.

Probably the empirical methods used here in correcting for this disorderly differentiation should be replaced by formulae with a sounder theoretical foundation. This I have not as yet been able to do.
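The deviation method described in this note can be sketched in a few lines of Python. This is only an illustrative sketch, not Harris's own computing scheme: the function names (`r_from_sums`, `intraclass_r`, `subtract_column_means`) and the small 3 × 4 yield table are hypothetical. The formula is assumed to be the usual symmetrical-table one, in which the mean and variance are taken over all entries of the table and the product sum within a class is $(\Sigma b)^2 - \Sigma(b^2)$. That assumption reproduces the intra-annual value .791 quoted above from the printed sums.

```python
def r_from_sums(pairs, weighted_sum, weighted_sumsq, sq_of_sums, sum_of_sq):
    """Intra-class r from the five sums of a symmetrical table.

    pairs          = S[n(n-1)]
    weighted_sum   = S[(n-1) * sum(b)]
    weighted_sumsq = S[(n-1) * sum(b^2)]
    sq_of_sums     = S[(sum(b))^2]
    sum_of_sq      = S[sum(b^2)]
    """
    mean = weighted_sum / pairs
    variance = weighted_sumsq / pairs - mean ** 2
    # Mean product over all ordered pairs, less the squared mean, over the variance.
    return ((sq_of_sums - sum_of_sq) / pairs - mean ** 2) / variance


def intraclass_r(classes):
    """Intra-class r for a list of classes (each a list of measurements)."""
    return r_from_sums(
        pairs=sum(len(c) * (len(c) - 1) for c in classes),
        weighted_sum=sum((len(c) - 1) * sum(c) for c in classes),
        weighted_sumsq=sum((len(c) - 1) * sum(v * v for v in c) for c in classes),
        sq_of_sums=sum(sum(c) ** 2 for c in classes),
        sum_of_sq=sum(sum(v * v for v in c) for c in classes),
    )


def subtract_column_means(table):
    """Express each entry of a complete table as a deviation from its column mean."""
    col_means = [sum(col) / len(col) for col in zip(*table)]
    return [[v - m for v, m in zip(row, col_means)] for row in table]


# Hypothetical yields: rows are varieties, columns are years.
# Varieties differ steadily by 2 bushels; years differ by 10 to 30 bushels.
yields = [
    [32, 52, 22, 42],
    [30, 50, 20, 40],
    [28, 48, 18, 38],
]

raw = intraclass_r(yields)                               # intra-varietal, uncorrected
corrected = intraclass_r(subtract_column_means(yields))  # after removing year means

# The intra-annual sums printed in the text (m varieties within each year).
reported = r_from_sums(4440, 174129.2, 7317531.92, 7586436.21, 370820.13)

print(round(raw, 3), round(corrected, 3), round(reported, 3))
```

In the hypothetical table the varieties are perfectly consistent from year to year, yet the uncorrected coefficient comes out negative because the year-to-year differences swamp the variety differences. Removing the year means recovers the differentiation of the varieties.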
The purpose of this note will have been served if it directs attention to a source of danger which may sometimes be encountered in the use of serviceable formulae, and indicates a method by which in the absence of more perfect methods practical results may be secured.

COLD SPRING HARBOR, N.Y.
February 3, 1914.

II. On an Extension of the Method of Correlation by Grades or Ranks.

By KARL PEARSON, F.R.S.

In a memoir published in 1907* I have shown how, on the hypothesis of normal distribution, the true correlation of variates $r$ may be ascertained from the correlation $\rho$ of grades. If $g_1$ and $g_2$ be the two grades, $\nu_1$ and $\nu_2$ the corresponding ranks, $x$ and $y$ the corresponding variates with means $\bar x$ and $\bar y$, and standard deviations $\sigma_1$ and $\sigma_2$, while

$$z = \frac{N}{2\pi\sigma_1\sigma_2\sqrt{1-r^2}}\,\exp\left\{-\frac{1}{2(1-r^2)}\left[\frac{(x-\bar x)^2}{\sigma_1^2} - \frac{2r(x-\bar x)(y-\bar y)}{\sigma_1\sigma_2} + \frac{(y-\bar y)^2}{\sigma_2^2}\right]\right\}$$

is the normal frequency surface of the variates, then

$$\bar g_1 = \bar g_2 = \tfrac{1}{2}N, \qquad \sigma_{g_1} = \sigma_{g_2} = \frac{N}{\sqrt{12}},$$
$$g_1 - \bar g_1 = \frac{N}{\sqrt{2\pi}\,\sigma_1}\int_0^{x-\bar x} e^{-x^2/2\sigma_1^2}\,dx, \qquad \sigma^2_{\nu_1} = \sigma^2_{\nu_2} = \tfrac{1}{12}(N^2-1).$$

Further I showed in the memoir just cited that

$$r = 2\sin\left(\frac{\pi}{6}\,\rho\right),$$

where a convenient method of finding $\rho$ was by the formula

$$\rho = 1 - \frac{6\,S(g_1 - g_2)^2}{N^3},$$

or again by

$$\rho = 1 - \frac{6\,S(\nu_1 - \nu_2)^2}{N(N^2-1)}.$$

* "On Further Methods of Determining Correlation," Drapers' Company Research Memoirs (Dulau and Co.), pp. 11, 12.

The problem has recently occurred of dealing with data where:

(i) One variate is given quantitatively, the other variate is given by ranks. For example, place in school-class has to be considered in relation to marks in examination, or the rank in a teacher's general appreciation has to be considered in relation to marks in examination.

(ii) One variate is given by broad categories, the other by ranks. For example, five or six categories of general intelligence are given as the basis of the teacher's classification of intelligence, and this has to be considered with regard to rank in, say, class or examination, possibly with regard to a special subject.

We require in both cases to deduce from the data the true variate correlation.

Case (i).
Let $x$ be the character measured by its grade, $y$ the character given quantitatively. Then with the notation above, if $\rho'$ equal the correlation of grade and of variate, $r$ the correlation of the two variates:

$$\rho' = \frac{p_{g_1 y}}{N\sigma_{g_1}\sigma_2}, \quad\text{where}\quad p_{g_1 y} = \int_{-\infty}^{+\infty}\!\!\int_{-\infty}^{+\infty} z\,(g_1 - \bar g_1)(y - \bar y)\,dx\,dy.$$

Integrating by parts after putting $\bar y = 0$ and writing*

$$\frac{dz}{dr} = \sigma_1\sigma_2\,\frac{d^2 z}{dx\,dy},$$

… Integrating again by parts: … Hence

$$\frac{d\rho'}{dr} = \sqrt{\frac{3}{\pi}}.$$

* Phil. Trans. A., Vol. 195, p. 25.

Thus since $\rho'$ vanishes with $r$,

$$\rho' = \sqrt{\frac{3}{\pi}}\;r.$$

Thus finally

$$r = \sqrt{\frac{\pi}{3}}\;\rho' = 1.0233\,\rho', \qquad \ldots$$

It will be clear from this that the correlation $\rho'$ between rank and quantitative variate can never be "perfect," for it cannot exceed the value .9772, otherwise the correlation $r$ would exceed unity. It will be seen that for practical purposes $r$ is very close to $\rho'$, but still from the theoretical standpoint, it is not without interest to discover that the correlation between ranks and a quantitative variate can never be perfect. For example, it is impossible to have perfect correlation between place in class and examination test, even if the boys were in the same order in class and examination. The defect, however, will be very slight.

Case (ii).

Let the subscript $C$ refer to any "broad" class and let $\eta$ be found from either of the formulae

$$\eta'^2 = \frac{12}{N^3}\,S\{n_C(\bar g_C - \bar g)^2\} \quad\text{or}\quad \eta''^2 = \frac{12}{N(N^2-1)}\,S\{n_C(\bar\nu_C - \bar\nu)^2\},$$

the first applying to grades and the second to ranks; then

$$r = 1.0233\,\eta' = 1.0233\sqrt{\frac{12}{N^3}\,S\{n_C(\bar g_C - \bar g)^2\}}, \qquad r = 1.0233\,\eta'' = 1.0233\sqrt{\frac{12}{N(N^2-1)}\,S\{n_C(\bar\nu_C - \bar\nu)^2\}},$$

according as grades or ranks are used.

In actual practice the values of $\eta'$ or $\eta''$ should be corrected for number of classes and for "broad" categories. See Biometrika, Vol. VIII. p. 256 and Vol. IX. p. 118.

Numerical illustrations will be provided later.
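Pending the promised numerical illustrations, the two conversions stated in this note can be checked with a short Python sketch. It is an illustration only: the function names are hypothetical, and the "class" of $N = 1000$ pupils is an artificial one in which place in class and examination mark are in exactly the same order, the marks being taken as normal quantiles. It uses only the relations $r = 2\sin(\pi\rho/6)$ and $r = \sqrt{\pi/3}\,\rho'$ from the text.

```python
import math
from statistics import NormalDist


def r_from_grade_correlation(rho):
    """Variate correlation from the correlation of two sets of grades."""
    return 2.0 * math.sin(math.pi * rho / 6.0)


def r_from_rank_and_measurement(rho_prime):
    """Variate correlation when one character is ranked and the other measured."""
    return math.sqrt(math.pi / 3.0) * rho_prime


def product_moment_r(xs, ys):
    """Ordinary product-moment correlation of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Artificial class: ranks 1..N, marks in the same order and normally distributed.
N = 1000
ranks = list(range(1, N + 1))
marks = [NormalDist().inv_cdf((i - 0.5) / N) for i in ranks]

rho_prime = product_moment_r(ranks, marks)       # falls short of unity
upper_limit = math.sqrt(3.0 / math.pi)           # the bound .9772 of the text
restored_r = r_from_rank_and_measurement(rho_prime)

print(rho_prime, upper_limit, restored_r)
```

Even with identical ordering the rank-and-mark correlation stays near the stated limit rather than reaching unity, and multiplying by 1.0233 brings it back to approximately 1.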
III. Correction of a Misstatement by Mr Major Greenwood, Junior.

In a recent paper by Mr Major Greenwood and Mrs Frances Wood “On changes in the Recorded Mortality from Cancer and their Possible Interpretation*” occur the following words:

“The case is evidently analogous to that studied by Professor Karl Pearson in his pamphlet, The Fight against Tuberculosis and the Death-rate from Phthisis (Dulau and Co.). Professor Pearson published three diagrams: (a) the general death-rate of England and Wales; (b) the phthisis death-rate; (c) the ratio of phthisis deaths to all deaths. The original figures seem to have been the crude rate for males and females separately from 1835 onwards.”

* Royal Society of Medicine, Proceedings, Vol. VII. Section of Epidemiology, pp. 79–170. March 27, 1914.

The “evident analogy” with what appears to me the wholly fallacious treatment of the authors in their paper above cited I do not now stay to discuss, but I wish to draw attention to the words: “The original figures seem to have been the crude rate for males and females separately from 1835 onwards.” Why the writer of these words should have assumed them without any inquiry of me, or any examination of the values of the crude death-rates (which are accessible to everybody) to be “crude death-rates,” I do not know, but they illustrate his readiness to form a biased judgment when his feelings are stirred by unfavourable criticism. As a matter of fact the rates were standardised rates reduced to the population of 1901, and most kindly provided at my special request by the General Register Office.
It is of interest to observe that Dr Weinberg of Stuttgart recently made precisely the same charge as Mr Major Greenwood with the same over-hasty assumption that the reality must be the desired, if undemonstrated, error*. With the German as with other foes, it is well to leave ample opportunity for their assuming you to be foolish; their assumption may lead them to run against hard reality.

K. P.

* Archiv für Rassen- und Gesellschafts-Biologie, IX. Jahrgang, S. 87. Leipzig, 1912.

IV. Note on Reproductive Selection.

By DAVID HERON, D.Sc.

The fact that in the case of man† fifty per cent. of one generation comes from twenty-five per cent. of the preceding one was first noted by Professor Karl Pearson in the Chances of Death (Vol. I. p. 80) and in dealing more fully with this important generalisation in the Groundwork of Eugenics, p. 27, he said: “It is very difficult from any English statistics to determine how many adults never marry. No information on this point is asked in the death schedule for males; it is asked but imperfectly answered in the case of the schedule for females.” In a footnote he adds: “The Registrar-General informs me that the record of civil condition in the case of female deaths is worthless and that no useful return can be made from it.” He found that in the Argentine and in Scotland 60 per cent. died unmarried, in the United States 51 per cent., and from the last two English Censuses and the Annual Reports 48 per cent., and added “This indirect method of reaching the result is, however, not very satisfactory. We may, I think, conclude in round numbers that 40 per cent. of the population dies before it reaches the age of 21 and that probably another 20 per cent. are never married.” On this assumption Professor Pearson proceeds to show that “about 12 per cent. of all the individuals born in the last generation provide half the next generation.”

† It has also been dealt with in various mammals. See the Groundwork of Eugenics, Eugenics Lecture Series II (Dulau and Co.), p. 29.

Some data published in Bulletin of Population and Vital Statistics No. 30 for the Commonwealth of Australia (Tables 48 and 84 a and b) prove that the assumptions made lie very close to the facts. The data are shown in the following table which gives the conjugal condition and issue of the males and females who died in Australia in 1912.

Conjugal     Children in    Deaths of    Total       Deaths of    Total
Condition    Each Family    Males        Children    Females      Children
Single            0           17404          -         10011          -
Married           0            1422          -          1317          -
   "              1            1036       1036          1083       1083
   "              2            1098       2196           992       1984
   "              3            1127       3381          1050       3150
   "              4            1147       4588          1001       4004
   "              5            1070       5350           976       4880
   "              6            1058       6348          1013       6078
   "              7            1040       7280           974       6818
   "              8             992       7936           881       7048
   "              9             819       7371           799       7191
   "             10             801       8010           622       6220
   "             11             473       5203           469       5159
   "             12             394       4728           314       3768
   "             13             196       2548           193       2509
   "             14             109       1526           101       1414
   "             15              50        750            57        855
   "             16              27        432            22        352
   "             17               7        119             8        136
   "             18               5         90             3         54
   "             19               6        114             2         38
   "             20               3         60             2         40
   "             21               -          -             1         21
   "             22               -          -             1         22
   "             23               1         23             -          -
Totals                        30285      69089         21892      62824

From this we find that half the total number of children came from 3337 of the parents (all those who had at least 9 children and part of those who had each 8 children). It thus appears that of the males 17,404 out of 30,285 = 57.5% died unmarried while half the total offspring came from 25.9% of those who married and 11.0% of the whole number of males, so that approximately three-fifths of the males born die unmarried and one-half of one generation comes from one-quarter of the married population or from one-ninth of all the males born in the preceding generation. The diagram gives a graphical illustration of the argument. In exactly the same way we find that nearly one-half of the females born in Australia die unmarried and that one-half of one generation comes from one-quarter of the married and from one-seventh of all the females born in the preceding generation.

Diagram to illustrate the fact that three-fifths of those born die unmarried and that one-ninth of one generation produce one-half of the next. (Deduced from records of males.)

[Diagram: two percentage scales, 0 to 100, one for the First Generation and one for the Second Generation; the legible labels are “Families of 8 and upwards” and “Families of 1–8”.]
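The arithmetic behind the male figures of Note IV can be retraced from the table of deaths in Australia in 1912. The sketch below is a hedged illustration in Python, not Dr Heron's own working: the dictionary holds the tabulated numbers of married males by size of family, the helper name is hypothetical, and the families of the marginal size are assumed to be split proportionally, which is one reading of “part of those who had each 8 children”.

```python
# Deaths of married males in Australia, 1912, keyed by children in the family.
MARRIED_MALES = {
    0: 1422, 1: 1036, 2: 1098, 3: 1127, 4: 1147, 5: 1070, 6: 1058, 7: 1040,
    8: 992, 9: 819, 10: 801, 11: 473, 12: 394, 13: 196, 14: 109, 15: 50,
    16: 27, 17: 7, 18: 5, 19: 6, 20: 3, 23: 1,
}
SINGLE_MALES = 17404


def parents_supplying_half(families):
    """Number of parents, largest families first, who supply half the children.

    The class of families in which the half-way point falls is split
    proportionally (an assumption of this sketch).
    """
    total_children = sum(size * count for size, count in families.items())
    target = total_children / 2.0
    parents = 0.0
    children = 0.0
    for size in sorted(families, reverse=True):
        if size == 0:
            break
        count = families[size]
        if children + size * count >= target:
            return parents + (target - children) / size
        children += size * count
        parents += count
    return parents


married = sum(MARRIED_MALES.values())
all_males = married + SINGLE_MALES
half_parents = parents_supplying_half(MARRIED_MALES)

print(f"parents supplying half the children: {half_parents:.0f}")
print(f"as a percentage of married males:    {100 * half_parents / married:.1f}")
print(f"as a percentage of all males:        {100 * half_parents / all_males:.1f}")
print(f"percentage dying unmarried:          {100 * SINGLE_MALES / all_males:.1f}")
```

Replacing the dictionary by the corresponding column for females, with 10,011 single, gives the parallel figures for the other sex.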
CONTENTS

(All Rights reserved)

I. A Piebald Family. By E. A. COCKAYNE, M.D., M.R.C.P. (With Plates.) 197
II. Clypeal Markings of Queens, Drones and Workers of Vespa vulgaris. By OSWALD H. LATTER, M.A. (With One Diagram in the text.) 201
III. Table of the Gaussian “Tail” Functions; when the “Tail” is larger than the Body. By ALICE LEE, D.Sc. (With One Diagram in the text.) 208
IV. Contribution to a Statistical Study of the Cruciferae. Variation in the Flowers of Lepidium draba Linnaeus. By JAMES J. SIMPSON, M.A., D.Sc. (With Eleven Diagrams in the text.) 215
V. Nochmals über “The Elimination of Spurious Correlation due to Position in Time or Space.” Von O. ANDERSON, Petrograd, Russland. 269
VI. Statistical Notes on the Influence of Education in Egypt. By M. HOSNY, M.A., B.Sc. 280
VII. Height and Weight of School Children in Glasgow. By ETHEL M. ELDERTON, Galton Fellow, University of London. (With Plate XIX and Two Diagrams in the text.) 288
VIII. Numerical Illustrations of the Variate Difference Correlation Method. By BEATRICE M. CAVE and KARL PEARSON, F.R.S. 340
IX. An Examination of some Recent Studies of the Inheritance Factor in Insanity. By DAVID HERON, D.Sc. (With Fifteen Diagrams in the text.)
X. On the Probable Error of the Bi-Serial Expression for the Correlation Coefficient. By H. E. SOPER, M.A., Biometrical Laboratory, University of London. 384
XI. On the Partial Correlation Ratio. Part I. Theoretical. By L. ISSERLIS, B.A. 391

Miscellanea:
(i) On Spurious Values of Intra-class Correlation Coefficients arising from Disorderly Differentiation within the Classes. By J. ARTHUR HARRIS, Ph.D., Carnegie Institution of Washington, U.S.A.
(ii) On an Extension of the Method of Correlation by Grades or Ranks. By KARL PEARSON, F.R.S. 416
(iii) Correction of a Misstatement made by Mr Major Greenwood, Junior. K. P. 418
(iv) Note on Reproductive Selection. By DAVID HERON, D.Sc. 419

The publication of a paper in Biometrika marks that in the Editor’s opinion it contains either in method or material something of interest to biometricians. But the Editor desires it to be distinctly understood that such publication does not mark assent to the arguments used or to the conclusions drawn in the paper.

Biometrika appears about four times a year. A volume containing about 500 pages, with Plates and tables, is issued annually.

Papers for publication and books and offprints for notice should be sent to Professor KARL PEARSON, University College, London. It is a condition of publication in Biometrika that the paper shall not already have been issued elsewhere, and will not be reprinted elsewhere without leave of the Editor. It is very desirable that a copy of all measurements made, not necessarily for publication, should accompany each manuscript. In all cases the papers themselves should contain not only the calculated constants, but the distributions from which they have been deduced. Diagrams and drawings should be sent in a state suitable for direct photographic reproduction, and if on decimal paper it should be blue ruled, and the lettering only pencilled.

Papers will be accepted in German, French or Italian. In the first case the manuscript should be in Roman not German characters.

Contributors receive 25 copies of their papers free. Fifty additional copies may be had on payment of 7/- per sheet of eight pages, or part of a sheet of eight pages, with an extra charge for Plates; these should be ordered when the final proof is returned.

The subscription price, payable in advance, is 30s. net per volume (post free); single numbers 10s. net. Volumes I, II, III, IV, V, VI, VII, VIII and IX (1902–13) complete, 30s. net per volume. Bound in Buckram 24/6 net per volume. Index to Volumes I to V, 2s. net. Subscriptions may be sent to C. F. Clay, Cambridge University Press, Fetter Lane, London, E.C., either direct or through any bookseller, and communications respecting advertisements should also be addressed to C. F. Clay. Till further notice, new subscribers to Biometrika may obtain Vols. I–IX together for £10 net, or bound in Buckram for £12 net.

The Cambridge University Press has appointed the University of Chicago Press Agents for the sale of Biometrika in the United States of America, and has authorised them to charge the following prices: $7.50 net per volume; single parts $2.50 net each.

Vol. X. Part IV. May, 1915. BIOMETRIKA. Price Ten Shillings net. [Issued June 4, 1915]
VOLUME X. MAY, 1915. No. 4

ASSOCIATION OF FINGER-PRINTS.

By H. WAITE, M.A., B.Sc.

1. Introduction. Certain papers have been published in recent years giving the results of research on the variability and correlation of the hand, notably (1) “A First Study of the Variability and Correlation of the Hand,” by Miss M. A. Whiteley, B.Sc., and Karl Pearson, F.R.S., Proceedings of the Royal Society, Vol. 65, pp. 126–151, and (2) “A Second Study of the Variability and Correlation of the Hand,” by M. A. Lewenz, B.A., and M. A. Whiteley, B.Sc., Biometrika, Vol. I, pp. 345–360. In the former the writers urge “the importance of putting on record all the quantitative measures we can possibly ascertain of variability and correlation” of characters of the human body. Although Finger-Prints, the characters dealt with in the present paper, cannot strictly claim to be quantitative it is hoped by the writer that the results may prove of some interest and use in the solution of the great Problem of Evolution in Man, especially when compared with the results obtained from the study of other measurements of the hand.

The principal motive underlying most of the work which has been done in the past on the subject of Finger-Prints has arisen from the development of means of identification and it was based on the fact that the general pattern and characteristics of the finger-prints of any individual are persistent throughout life.
As far as I am aware, however, no paper has yet been published attempting to measure the association between the various types of finger-prints in an individual or comparing these with the relations which have been found to exist between other measurements of the hand. These are the objects of the present paper.

2. Primary Classification of Finger-Prints. As primary classification Purkenje proposed nine types, Galton* three, each being divided into twenty-four sub-classes, and Henry† four, these also being sub-divided into a number of classes. For the purposes of this paper I have adopted the method of dividing all the prints into four primary classes; I have also adopted Henry’s definitions and nomenclature as far as they are required, and these follow in general those of Galton. Secondary classification with its minute details is not used in this paper.

* Fingerprint Directories, by Francis Galton, F.R.S. Macmillan, 1895.

† Classification and Uses of Finger Prints, by Sir E. R. Henry, C.V.O., C.S.I. Wyman and Sons, Third edition, 1905.

The four classes referred to above are Arches, Loops, Whorls and Composites.

In Arches the ridges run from side to side, consecutive ridges being roughly parallel and the curvature increasing in general from the base to the tip. (Plate XX. Fig. i.)

In Loops some of the ridges are doubled back upon themselves making a half turn or a little more, the two parts of the doubled ridge diverging from each other at the centre of the pattern. (Fig. ii.) Consequently this pattern has an open mouth directed downwards either towards the right or towards the left of the finger. The direction of this opening supplies a means of subdividing Loops into Radial and Ulnar Loops according as the direction is towards the radius or towards the ulna, that is, towards or away from the thumb. As will be seen later (p.
422B) the proportion of Radial Loops is very small except in the forefinger, so that this method of subdivision has been used only in dealing with that finger.

In Whorls some of the ridges make a complete circuit, either as closed concentric ovals or as a more or less continuous ridge forming a spiral. (Fig. iii.)

Composites consist of combinations of two or more of the other patterns. (Fig. iv.) In this class are also included those finger-prints which are too irregular in general outline to be placed in any one of the other main groups. This class also includes the bulk of those patterns about which Sir Francis Galton, in his book on Finger Prints*, p. 79, states: “They are as much Loops as Whorls, and properly ought to be relegated to a fourth class.” It is possible, however, that some of Galton’s “ambiguous cases” may have been classed in this paper with Loops.

For further details of these principal classes with their modifications and subdivisions reference may be made to the works mentioned in the footnotes on p. 421.

* Finger Prints, by Francis Galton, F.R.S., Macmillan, 1892.

3. Material. The material on which this investigation is based consists of two thousand complete sets of finger-prints of adult males, part of a much longer series in the Biometric Laboratory of University College, London. They belong to the lower type of artisan and labouring classes. No selection whatever has been made, except that a few sets, which were incomplete or which contained prints so damaged as to be indecipherable, have been rejected.

4. Symbols. The following symbols are used: A = Arch; SL = Small Loop; LL = Large Loop (see p. 423); W = Whorl; C = Composite; $L_r$ = Radial Loop; $L_u$ = Ulnar Loop; R = Right Hand; L = Left Hand. $R_1$, $R_2$, $R_3$, $R_4$, $R_5$ designate the thumb, forefinger, middle, ring and little finger respectively of the right hand, and $L_1$, $L_2$, $L_3$, $L_4$, $L_5$ represent the corresponding fingers of the left hand.

[Plate XX (Biometrika, Vol. X, Part IV). Illustrations of the four fundamental types of Finger-Print: Fig. (i) Arch; Fig. (ii) Loop; Fig. (iii) Whorl; Fig. (iv) Composite.]

5. Distribution of Classes of Finger-Prints. A preliminary survey of the prints brings to light a considerable clustering together of prints of the same kind. Thus, each of 241 sets contains prints of one class only; each of 329 sets has nine prints of one class, and each of 194 sets contains eight out of the ten prints of one class; that is, each of 764 sets, or over 38%, has at least eight prints of one class, the large majority of these being loops. Again, each of 892 sets contains prints of two classes only, so that each of 1133 sets, or nearly 57% of the whole, has representatives of not more than two of the four classes. On the other hand all four classes appear in only 95 sets, while the number of single hands, each of which contains at least one of every class, is only 23.

For the calculations which follow it has been found advisable to subdivide the loops into two classes, Small Loops and Large Loops (p. 423). Considering these as separate classes, giving five types in all, the distribution of numbers of types for the two hands is shown in the following Table:

TABLE 1. Distribution of Types in Right and Left Hands.

Number of Types          Number of Types in Right Hand
in Left Hand          1      2      3      4      5    Totals
      1              37     84     47      6      -      174
      2              65    465    360     61      4      955
      3              15    256    347     96      2      716
      4               1     36     83     30      1      151
      5               -      1      2      1      -        4
   Totals           118    842    839    194      7     2000

In this Table, taking as origin the cell (3, 2) containing 360 types, we have the following results:

Mean of Left Hand Types, .428; $\sigma_y = .7628$.
Mean of Right Hand Types, -.435; $\sigma_x = .7608$.

We thus find the correlation coefficient ($r$) to be .281 ± .014. The contingency coefficient ($c$), corrected for the number of cells, is .289. Hence we conclude that there is a distinct, though not very great tendency towards equality in the number of types in the two hands of an individual.
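The constants just quoted for Table 1 can be checked from the cell frequencies. The following Python sketch is offered as an illustration and is not the author's working: rows are taken as the number of types in the left hand and columns as the number in the right hand, blank cells are read as zero (so that the marginal totals are 174, 955, 716, 151, 4 and 118, 842, 839, 194, 7), and the probable error is assumed to be the usual $0.6745(1-r^2)/\sqrt{N}$. The function name is hypothetical.

```python
import math

# Table 1: rows = number of types in the left hand (1..5),
# columns = number of types in the right hand (1..5); blank cells read as 0.
TABLE_1 = [
    [37, 84, 47, 6, 0],
    [65, 465, 360, 61, 4],
    [15, 256, 347, 96, 2],
    [1, 36, 83, 30, 1],
    [0, 1, 2, 1, 0],
]


def correlation_from_table(table):
    """Means, standard deviations, product-moment r and its probable error."""
    size = len(table)
    levels = range(1, size + 1)
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(row[j] for row in table) for j in range(size)]

    mean_left = sum(k * t for k, t in zip(levels, row_totals)) / n
    mean_right = sum(k * t for k, t in zip(levels, col_totals)) / n
    sd_left = math.sqrt(
        sum(k * k * t for k, t in zip(levels, row_totals)) / n - mean_left ** 2
    )
    sd_right = math.sqrt(
        sum(k * k * t for k, t in zip(levels, col_totals)) / n - mean_right ** 2
    )
    product_sum = sum(
        (i + 1) * (j + 1) * table[i][j] for i in range(size) for j in range(size)
    )
    r = (product_sum / n - mean_left * mean_right) / (sd_left * sd_right)
    probable_error = 0.6745 * (1.0 - r * r) / math.sqrt(n)
    return n, mean_left, mean_right, sd_left, sd_right, r, probable_error


n, mean_left, mean_right, sd_left, sd_right, r, probable_error = (
    correlation_from_table(TABLE_1)
)
print(f"N = {n}")
print(f"left hand:  mean = {mean_left:.3f}, s.d. = {sd_left:.4f}")
print(f"right hand: mean = {mean_right:.3f}, s.d. = {sd_right:.4f}")
print(f"r = {r:.3f}, probable error = {probable_error:.3f}")
```

Measured from the origin cell (3, 2), the means 2.428 and 2.565 become .428 for the left hand and -.435 for the right, as stated in the text.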
It appears, however, that the divergence is rather greater in the right than in the left hand. The question now arises whether the difference in divergence in the two hands for the samples taken is significant. I have tested this by the method proposed by Professor Karl Pearson*.

* “On the Probability that Two Independent Distributions of Frequency are really Samples from the same Population,” by Karl Pearson, F.R.S., Biometrika, Vol. VIII, pp. 250–254, July, 1911.

TABLE 2. Divergence of Types in Right and Left Hands.

                       Number of Types
                  1      2      3      4      5    Totals
Right Hand      118    842    839    194      7     2000
Left Hand       174    955    716    151      4     2000

For this Table $\chi^2 = 33.72$, whence $P$ is less than .000,005. That is, the odds are more than 200,000 to 1 against the occurrence of two such divergent samples if they were random samples of the same population. In other words the right hand generally tends to have a greater divergence of types than the left.

The following Table gives the distribution of classes of prints for the various fingers of both hands:

TABLE 3. Distribution of Classes of Prints.

                            A     $L_u$   $L_r$     W       C
$R_1$                      46     1104       1     649     200
$R_2$                     352      537     456     481     174
$R_3$                     212     1399      38     274      77
$R_4$                      63     1015      17     729     176
$R_5$                      31     1631       3     228     107
Totals                    704     5686     515    2361     734

$L_1$                      91     1311       3     341     254
$L_2$                     313      732     383     437     135
$L_3$                     215     1408      35     240     102
$L_4$                      66     1283      12     491     148
$L_5$                      35     1727       -     150      88
Totals                    720     6461     433    1659     727

Totals for both hands    1424    12147     948    4020    1461

The most striking feature of this Table is the uneven distribution of the various classes, especially the large proportion of ulnar loops and the very small
A comparison of the distribution in the two hands shows considerable differences; e.g., in the left thumb the num- ber of arches is about double the number in the right; again, the whorls in each finger of the right hand are greatly in excess of those on the left, while the left hand has, in every case, an excess of ulnar loops. If we arrange the numbers of each class in order of magnitude, we see that the order for the arches is identical for the two hands and also for the ulnar loops. In each of the other classes there is one exception to the “identical” order. I have tested these distributions for each type by the method referred to in the footnote of p. 4224, with the following results :—In the arches the odds are more than 500 to 1 against the occurrence of two such divergent samples which are random samples taken from the same population; in the ulnar loops the odds are more than 200,000 to 1; in the radial loops about 5 to 2; in the whorls more than 1,000,000 to 1, and in the composites more than 1300 to 1. We may thus fairly conclude that with the exception of the radial loops the frequency distribution of the classes between the fingers is different in the two hands and the radial loops are so few, except in the forefinger, as to be almost negligible. 6. Subdivision of Loops. The great preponderance in the number of loops and the insignificance of the number of radial loops, except in the forefinger, make another subdivision of this class necessary. The method adopted is as follows :— All loops, in common with whorls and composites, contain certain well-defined points; these are (1) the “ delta,” or “outer terminus,” and (2) the “ point of the core,” or “inner terminus.” [See Henry, pp. 22—24.] The number of ridges mtervening between the delta of a loop and the point of the core may be anything from one up to about thirty; in only 38 cases out of the 13,095 loops does the number of ridges exceed 25; two of these are over 30, one being 32 and the other 35. 
The complete distribution of ridges is given in Table 4a. In dividing the loops into two sub-classes according to the number of ridges the nearest approach to equality is obtained by taking (a) those containing from 1 to 12 ridges, and (b) those containing 13 or more ridges. For brevity I have called these classes (a) Small Loops, and (b) Large Loops; the terms “Small” and “Large” have no reference to the relative sizes of the patterns. The numbers in the two groups, thus arranged, are 7033 and 6062 respectively.

Table 4b gives (1) the number of loops for each finger, (2) the means, (3) the standard deviations, and (4) the coefficients of variation in the numbers of ridges.

Examining the Table below consider first the means. The order which is identical in the two hands runs: (1) Thumb, (2) Ring Finger, (3) Little Finger, (4) Middle Finger, (5) Index. It will be noticed that this order of the means is quite different from that of the relative areas of the patterns.

TABLE 4a. Distribution of Ridges in Loops.

                     Right Hand                          Left Hand
Ridges   $R_1$  $R_2$  $R_3$  $R_4$  $R_5$     $L_1$  $L_2$  $L_3$  $L_4$  $L_5$
   1        3     19     18      8      7         ?     20     21      5     10
   2        7     68     47     27     24         ?     79     65     25     32
   3        7     76     50     26     49         ?      ?     54     40     42
   4        8     71     66     35     68         ?      ?     60     33     56
   5       24     67     57     58     70         ?     70     52     45     69
   6       14     47     80     36     92         ?     54     47     32     69
   7       24     45     76     44     77         ?     64     75     47     70
   8       29     50     85     45     73         ?     67     70     34     91
   9       25     40     97     36     83         ?     65     77     53     85
  10       43     47    109     55     94         ?     73    117     61    122
  11       53     55    116     48    100         ?     76    124     89    156
  12       53     69    120     71    133         ?     77    125    119    157
  13       65     62    116     66    111         ?     65    141     94    134
  14       89     60    128     76    143         ?     67    139    111    150
  15       68     60     84     79    103         ?     49     99     94    131
  16       91     43     75     73    106         ?     35     66    117    136
  17       86     38     62     60     97         ?     26     47     86     88
  18       84     33     28     65     73         ?     12     37     60     64
  19      100     16     11     34     46         ?     11     15     58     33
  20       61     12      5     34     48         ?      ?      6     33     17
  21       49      8      4     18     19         ?      ?      4     22      6
  22       34      2      2     19      3         ?      6      1     12      4
  23       33      4      -      8      4         ?      ?      -      9      2
  24       22      1      1      4      6         ?      1      1      8      1
  25       11      -      -      2      2         ?      -      -      2      2
  26        7      -      -      1      2         ?      -      -      1      -
  27        7      -      -      1      1         ?      -      -      1      -
  28        7      -      -      -      -         ?      -      -      1      -
Totals   1105    993   1437   1032   1634      1314   1115   1443   1295   1727

[The column for $L_1$ and the cells marked ? are illegible in this copy. The rows for 29, 30, 32 and 35 ridges are legible only in part; $R_1$ has 1 loop at 29 ridges.]

TABLE 4b.

                  Number of Loops          Means               Standard Deviations      Coefficients of Variation
                     R       L          R            L            R           L              R              L
Thumb              1105    1314    15.52±.10    13.27±.09    5.17±.07    4.63±.06     34.34± .53     34.85± .51
Index               993    1115     9.69±.12     8.83±.10    5.41±.08    4.88±.07     55.82±1.08     55.24±1.00
Middle Finger      1437    1443    10.41±.08    10.55±.08    4.46±.06    4.53±.06     42.80± .63     42.91± .63
Ring Finger        1032    1295    12.37±.12    12.77±.10    5.48±.08    5.09±.07     44.31± .78     39.85± .61
Little Finger      1634    1727    11.75±.08    11.53±.07    4.97±.06    4.46±.05     42.30± .58     38.71± .50

Comparing the two hands we see that the differences in the middle, ring and little fingers are insignificant; in the thumb and index, however, there is a marked difference in favour of the right hand.

The order of the standard deviations in the right hand is: (1) Ring Finger, (2) Index, (3) Thumb, (4) Little Finger, (5) Middle Finger.
In the left hand the order of the last two is reversed, but the difference is small. With the exception of the middle finger, where the difference between the two hands is only about equal to the probable error and is therefore insignificant, the standard deviation is in every case greater for the right hand than for the left; the differences are all of the same order of magnitude and range from about .39 to .54.

Coming now to the coefficients of variation, the order in the right hand is: (1) Index, (2) Ring Finger, (3) Middle Finger, (4) Little Finger, (5) Thumb. In the left hand the order of the ring and middle fingers is interchanged. Comparing the two hands we see that in three cases (the thumb, index, and middle finger) the differences are each less than the probable errors; in the other two cases the variability is considerably greater in the right hand than in the left. I have carefully revised the calculations involved but have been unable to detect any error; neither can I suggest a reason for the large differences.

In “A First Study of the Variability and Correlation of the Hand” (see p. 421), the writers find that the variability of bone lengths is closely related to the relative utility of the fingers, the least variability being that of the most useful finger. There appears, however, to be no such simple relationship between the ridges of the loops and the relative utility of the fingers.

I have compared the distribution of ridges in the loops of the thumbs by Professor Pearson’s method (p. 422, footnote), which gives $\chi^2 = 166.64$; hence the odds are much greater than 1,000,000 to 1 against the occurrence of two such divergent samples if they were random samples taken from the same population.

The distribution, absolute and percentage, of the five groups is now as follows (Table 5).
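The comparison of the two thumb distributions can be illustrated with a minimal sketch. It assumes that the method referred to is the usual $\chi^2$ test that two frequency distributions are samples from one population, $\chi^2 = S\{N N' (f/N - f'/N')^2 / (f + f')\}$; the function names and the ridge counts below are hypothetical and are not taken from Table 4a.

```python
def pearson_two_sample_chi2(f, g):
    """Chi-square for two frequency distributions over the same classes.

    Assumed form: sum over classes of N*M*(f_i/N - g_i/M)^2 / (f_i + g_i),
    where N and M are the two sample totals.
    """
    if len(f) != len(g):
        raise ValueError("both samples must use the same classes")
    n, m = sum(f), sum(g)
    total = 0.0
    for a, b in zip(f, g):
        if a + b == 0:
            continue  # an empty class contributes nothing
        total += n * m * (a / n - b / m) ** 2 / (a + b)
    return total


def homogeneity_chi2(f, g):
    """The same statistic computed as observed-versus-expected on a 2 x k table."""
    n, m = sum(f), sum(g)
    grand = n + m
    total = 0.0
    for a, b in zip(f, g):
        col = a + b
        if col == 0:
            continue
        for obs, row_total in ((a, n), (b, m)):
            expected = row_total * col / grand
            total += (obs - expected) ** 2 / expected
    return total


# Hypothetical grouped ridge counts for two hands (illustration only).
right = [30, 80, 120, 90, 40]
left = [60, 110, 100, 60, 20]

chi2 = pearson_two_sample_chi2(right, left)
degrees_of_freedom = len(right) - 1
print(round(chi2, 2), degrees_of_freedom)
```

The two functions are algebraically the same statistic; the second is included only as a cross-check on the first. Turning the value into odds requires the $\chi^2$ distribution for the stated degrees of freedom, which is left out of this sketch.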
In comparing the large and small loops it will be seen that in both hands there is an excess of large loops in the thumb, ring and little fingers, and an excess of small loops in the index and middle fingers. The order of these classes agrees in the two hands with one exception in each case.

TABLE 5.

                        Arches        Small Loops    Large Loops      Whorls       Composites
                       No.     %      No.     %      No.     %      No.     %      No.     %
R1                      46   2.30     290  14.50     815  40.75     649  32.45     200  10.00
R2                     352  17.60     654  32.70     339  16.95     481  24.05     174   8.70
R3                     212  10.60     921  46.05     516  25.80     274  13.70      77   3.85
R4                      63   3.15     489  24.45     543  27.15     729  36.45     176   8.80
R5                      31   1.55     870  43.50     764  38.20     228  11.40     107   5.35
Totals                 704           3224           2977           2361            734

L1                      91   4.55     547  27.35     767  38.35     341  17.05     254  12.70
L2                     313  15.65     833  41.65     282  14.10     437  21.85     135   6.75
L3                     215  10.75     887  44.35     556  27.80     240  12.00     102   5.10
L4                      66   3.30     583  29.15     712  35.60     491  24.55     148   7.40
L5                      35   1.75     959  47.95     768  38.40     150   7.50      88   4.40
Totals                 720           3809           3085           1659            727

Totals for both hands 1424           7033           6062           4020           1461

An approximate measure of the relationship existing between the various combinations of digits is given by the number of cases in which two particular digits on the same or on opposite hands have the same pattern. Table 6a gives the percentages for the same hand and for digits of the same name on opposite hands; the readings for other combinations of digits on opposite hands are given in Table 6d, p. 431, where all the patterns are grouped in three classes for the sake of comparison with Galton’s results.

Remarks on Table 6a.

(a) The percentages vary greatly with different combinations and with different patterns.
(b) The means and totals for digits of the same name on opposite hands are all much greater than the corresponding readings for the right or for the left hand; the means, with one exception, and also the totals for particular combinations on the left hand are all greater than the corresponding readings for the right.

(c) The order of magnitude of the totals is nearly the same for the two hands, those of the combinations including the thumb being, with one exception in each hand, the lowest. Hence, judging the relationship by the totals, it appears that (1) digits of the same name on opposite hands are the most closely related, the magnitude falling in order from the little fingers to the thumbs; (2) omitting the thumbs, two consecutive digits are generally more closely related than others more widely separated; (3) the digits of the left hand are more closely related than those of the right.

(d) The relationship between the thumb and any other digit seems to be less close than that between any pair of digits not including the thumb; also, in both hands, the thumb appears to be most closely related to the ring finger, then to the little finger, next to the middle and least to the fore-finger.

TABLE 6a. Percentage of Cases in which various pairs of Digits possess the same Class of Pattern.
(A = Arches, SL = Small Loops, LL = Large Loops, W = Whorls, C = Composites.)

                                          Right Hand                               Left Hand
Couplet                          A     SL     LL     W      C   Totals     A     SL     LL     W      C   Totals
Thumb and fore-finger           1.5    6.3    7.5   13.0   1.4   29.7     2.4   15.7    7.2    8.3   1.3   34.9
Thumb and middle finger         1.3    8.7   11.2    8.0    .7   29.9     1.6   16.2   13.4    4.9   1.4   37.5
Thumb and ring finger            .7    6.5   12.4   17.8   1.1   38.5      .7   13.2   17.1    7.9   1.5   40.4
Thumb and little finger          .4    9.9   15.0    6.9    .9   33.1      .7   19.0   15.7    2.7    .9   39.0
Fore-finger and middle finger   6.4   22.7    7.2    8.9   1.2   46.4     5.8   26.3    7.9    8.6   1.2   49.8
Fore-finger and ring finger     2.3   13.7    6.4   16.3    .9   39.6     2.5   17.8    7.1   12.7   1.5   41.6
Fore-finger and little finger   1.2   20.1    9.1    6.8    .9   38.1     1.3   26.3    8.2    4.3    .3   40.4
Middle and ring finger          2.5   16.8    9.1   12.0    .7   41.1     2.6   21.1   15.4    8.8    .8   48.7
Middle and little finger        1.0   26.8   14.0    4.6    .4   46.8     1.2   28.9   15.3    2.9    .6   48.9
Ring and little finger           .8   20.0   15.7   10.0   1.0   47.5      .9   23.7   19.5    6.0    .8   50.9
Means                           1.8   15.2   10.8   10.4    .9   39.1     2.0   20.8   12.7    6.7   1.0   43.2

Couplet                          A     SL     LL     W      C   Totals
Two thumbs                      1.5   10.2   23.4   13.5   2.7   51.3
Two fore-fingers                9.3   22.7    5.6   14.4   1.4   53.4
Two middle fingers              5.8   31.8   14.8    7.0    .9   60.3
Two ring fingers                1.9   18.2   18.4   21.2   1.5   61.2
Two little fingers               .9   36.1   27.2    5.0   1.4   70.6
Means                           3.9   23.8   17.9   12.2   1.6   59.3

Another method of investigating the approximate relationship between the various digits is by means of a “centesimal” scale, as in Galton’s Finger Prints, Ch. VIII. Table 6b gives such scale readings for small loops, large loops and whorls, for pairs of digits on the same hand and for digits of the same name on opposite hands. I have not considered it necessary to include other couplets from opposite hands, because, as is shown later, the relationship between any pair of digits from opposite hands is practically the same as between the corresponding pair on the same hand. I have also omitted arches and composites from this part of the inquiry as the numbers belonging to these classes are, as a rule, comparatively small.

The scale reading for any pair of digits is calculated as follows:

Take, for example, the whorls on the right thumb and right fore-finger; the former has 32.5 and the latter 24 per cent. of whorls, while 13 per cent. of right hands have whorls on both thumb and fore-finger.
Now from independent probability we shall expect $\frac{32.5}{100} \times \frac{24}{100} \times 100$, or 7.8 per cent. of “double whorls” in this combination of digits, and we therefore conclude that the remaining 5.2 per cent. of double whorls are due to a relationship between the digits. If we set aside the 7.8 per cent. out of the 32.5 and 24, we see that from the remaining 24.7 and 16.2 per cent., the greatest possible percentage of double whorls would be 16.2; but as the actual percentage in addition to the 7.8 is 5.2, the centesimal measure of the relationship is $\frac{5.2 \times 100}{16.2}$, or 32, to the nearest unit.

TABLE 6b. Approximate Measures of Relationship between various pairs of Digits on a Centesimal Scale.
(Entries marked ? are illegible in the source.)

                                  Right Hand                 Left Hand
Couplet                       SL    LL    W   Means      SL    LL    W   Means
Thumb and fore-finger         16     6    32    18       27    21    34    27
Thumb and middle finger       26     0    38    21       26    16    28    23
Thumb and ring finger         27     8    29    21       27    16    29    24
Thumb and little finger       44     ?     ?     ?        ?     ?     ?     ?
Fore and middle fingers       44    22    54    40       34    39    64    46
Fore and ring fingers         30    15    49    33       33    23    44    33
Fore and little fingers       32    25    47    35       29    32    46    36
Middle and ring fingers       42    11    80    44       50    31    64    48
Middle and little fingers     29    26    31    29       33    27    30    30
Ring and little fingers       67    32    81    60       64    26    74    55

Couplet                       SL    LL    W   Means
Two thumbs                    59    34    69    54
Two fore-fingers              48    27    55    43
Two middle fingers            47    41    52    47
Two ring fingers              64    50    78    64
Two little fingers            67    53    62    61

Most of the remarks on Table 6a will be found applicable to Table 6b, with, at most, but slight modification; the chief differences are that the relationship between the middle and little fingers is not so high in Table 6b as in Table 6a, and the order for pairs of like digits is not the same in the two Tables.

Comparison of results with those of Galton.

In order to compare with Galton’s results it is necessary to put large and small loops into one class and to include composites with whorls.
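Returning for a moment to the centesimal scale used in Table 6b, the worked example for whorls on the right thumb and right fore-finger can be sketched as follows. The function name and argument names are assumptions made for this illustration; the text does not say how a deficiency below the independent-probability expectation is treated, so the sketch simply returns the raw (possibly negative) value in that case.

```python
def centesimal_measure(pct_a, pct_b, pct_both):
    """Centesimal measure of relationship between two digits for one pattern.

    pct_a, pct_b : percentage of hands showing the pattern on each digit
    pct_both     : percentage of hands showing it on both digits
    """
    # Percentage of doubles expected if the two digits were independent.
    expected = pct_a * pct_b / 100.0
    # Observed doubles over and above the independent expectation.
    excess = pct_both - expected
    # Greatest possible excess: the rarer digit limits the number of doubles.
    ceiling = min(pct_a, pct_b) - expected
    if ceiling <= 0:
        raise ValueError("no room for association beyond independence")
    return 100.0 * excess / ceiling


# Figures from the worked example: 32.5 and 24 per cent. of whorls,
# 13 per cent. of right hands with whorls on both digits.
print(round(centesimal_measure(32.5, 24.0, 13.0)))  # 32
```

On this scale 0 corresponds to independence and 100 to the case in which every occurrence on the rarer digit is accompanied by the same pattern on the other.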
Making some allowance for the difference of classification, and for any slight variation which may be due to the fact that the material is drawn from very different classes of the population, it will be found that there is almost perfect agreement between our data on all essential points. The relative frequency found in the two investigations was :— Galton Waite Arches 65 per cent. 71 per cent. Loops CGO GBA A Whorls PABA QA The differences are small in comparison with some found by Galton when examining the finger-prints of different races. For example, 1332 Hebrew children had arches on the right fore-finger in 13°6 per cent. of the cases, while only 7°9 per cent. of 250 English children had arches on that finger. TABLE 6c. Percentage Frequency of Arches, Loops and Whorls on the different Digits. Gatton* | WAItE From observations of the 5000 From observations of 20000 digits of digits of 500 persons 2000 persons Arches Loops Whorls Arches Loops Whorls Digit FARR Eile reali | | ote Wad | Oh RON | Fore-finger... | 17 | 17 53 53 30 pa i ad Ue 49°7| 55°7| 32°7| 28°6 Middle finger | 7| 8] 78] 76] 15] 16] 106 | 10°7 | 71°9) 72:2] 17°5| 17-1 Little se 1 2 86 90 13 8 1°5 a7 81°7| 86°4] 16°38) 11°9 Thumb 3 5 53 65 44 30 2°3 4°5 55°2 | 65°7) 42°5 | 29°8 Ring finger... 2 3 53 66 45 31 3°2 3°3 51°6 | 64°7| 45°2} 32-0 Totals SOs EsoMo2eooON 4 yell lise Sbs2el 8o:9) | SLO) 344275) 15479) 119° ie = | Galton arranged the digits as in Table 6c, in order to bring out certain peculiarities. He says :— “The digits are seen to fall into two well-marked groups ; the one including the fore, middle, and little fingers, the other including the thumb and ring finger. As regards the first group, the frequency with which any pattern occurs in any named digit is statistically the same, whether * From Finger Prints, p. 116, Table II. 430 Association of Finger-Prints that digit be on the right or on the left hand; as regards the second group, the frequency differs greatly in the two hands. 
But though in the first group the two fore-fingers, the two middle, and the two little fingers of the right hand are severally circumstanced alike in the frequency with which their various patterns occur, the difference between the frequency of the patterns on a fore, a middle, and a little finger, respectively, is very great. “Tn the second group, though the thumbs on opposite hands do not resemble each other in the statistical frequency of the A. L. W. patterns, nor do the ring fingers, there is a great resemblance between the respective frequencies in the thumbs and ring fingers ; for instance, the whorls on either of these fingers on the left hand are only two-thirds as common as those on the right. The figures in each line and in each column are consistent throughout in expressing these curious differences, which must therefore be accepted as facts, and not as statistical accidents, whatever may be their explanation.” (Galton, Finger Prints, p. 116.) These remarks apply with equal force to my figures although the actual percentages differ somewhat in certain cases, the most marked being in the middle finger arches and the little finger whorls. The following points of agreement in the distribution of the patterns are also noticed by reference to Table 6 c. The frequency of arches on the fore-fingers is much greater than on any other of the four digits. “It amounts to 17 per cent. on the fore-fingers, while on the thumbs and on the remaining fingers the frequency diminishes in a ratio that roughly accords with the distance of each digit from the fore-finger. “The frequency of Loops has two maxima; the principal one is on the little finger, the secondary on the middle finger. “ Whorls are most common on the thumb and the ring-finger, most rare on the middle and little fingers.” (Finger Prints, p. 117.) 
In discussing radial and ulnar loops, which Galton describes as loops having “inner” and “outer” slopes, respectively, he says :— “Tn all digits except the fore-fingers, the inner slope is much the more rare of the two; but in the fore-fingers the inner slope appears two-thirds as frequently as the outer slope. Out of the percentage of 53 loops of the one or other kind on the right fore-finger, 21 of them have an inner and 32 an outer slope; out of the percentage of 55 loops on the left fore-finger, 21 have inner and 34 have outer slopes. These subdivisions 21-21 and 32-34 corroborate the strong statistical similarity that was observed to exist between the frequency of the several patterns on the right and left fore-fingers; a condition which was also found to characterise the middle and little fingers.” (Finger Prints, p. 118.) These statements are true, in general, of my Table 8, but my percentages on the right fore-finger are 22°8 radial and 26-9 ulnar; on the left they are 19°2 and 36°6 respectively. Close agreement is also observed in Table 6d which shows the tendency of digits to resemble one another in their various combinations. Galton omits combinations into which the little finger enters “because the overwhelming H. Waite 431 frequency of loops in the little fingers would make the results of comparatively little interest, while their insertion would greatly increase the size of the table.” (Finger Prints, p. 119.) I have included them, however, for the sake of com- parison and completeness. My percentages are readily obtained from Tables LVI to C in the Appendix. TABLE 6d. Percentage of Cases in which the same Class of Pattern occurs in various Couplets of Digits. 
GaLTON* WaItE Arches in Loops in Whorls in Arches in Loops in Whorls in Couplet Same | Opposite | Same | Opposite | Same | Opposite] Same| Opposite Same Opposite Same Opposite hand | hand hand | hand | hand | hand fhand| hand |hand| hand |hand| hand Two thumbs — 2 — 48 — 24 — 1°6 = 47 °4 = 24°5 ,, fore-fingers = 9 — 38 — 20 — 9°3 — 36°2 = 20°4 » middle fingers — 3 — 65 -- 9 — 5°8 — 60°6 — 10°5 » ring + re 2 — 46 = | ae _ 1g, — 46°3 — 27°9 » little Sac || — ~- — — — —_— ‘9 = 63°2 — 63 | Thumb and fore-finger | 2 2 35 33 16 15 1SOi) WeSh 36:8 Shey i828 1745 Ps mid-finger 1 1 48 47 9 8 1-4 15 47°0,; 46°7 |10°9} 10°5 x ring finger 1 1 40 38 20 18 af 6 41:0} 39°4 | 20°8| 19°0 Fore and mid-finger . 5 5 48 46 12 a 6:1 5°5 ANID} || alos | ileats) || 12383 H a ring finger ... 2 2 35 35 ily 17 2°4 2°3 36°5 | 35°7 | 20°8} 20°2 Middle and ring finger} 2 2 50 50 13 12 2°5 2°4 48°3) 47:1 | 14:7} 13:7 Thumb and little finger | — -- — —- | = oo "52 "45 |54:2|) 53°6 8°8 81 Fore and little finger... | — — eel eae —_— 1-20} 1:15 |47°7) 47:2 9°3 8:1 Middle and little finger el | = il 1-0 63:9 | 63-5 6°5 6'1 Ring and little finger... | — — — — | — — 8 7 56") 54:8 | 12°9')) “11-8 |e In commenting on his results in Table 6d, Galton says:—*“ The agreement in the above entries is so curiously close as to have excited grave suspicion that it was due to some absurd blunder, by which the same figures were made in- advertently to do duty twice over, but subsequent checking disclosed no error. 
Though the unanimity of the results is wonderful, they are fairly arrived at, and leave no doubt that the relationship of any one particular digit, whether thumb, fore, middle, ring or little finger, to any other particular digit, is the same, whether the two digits are on the same or on opposite hands.” It will be noticed, however, that while exactly half of Galton’s eighteen pairs of percentages, which are worked to the nearest unit only, are in strict agreement, in all the other cases the result is one or two units less for two digits on opposite hands than for the corresponding digits on the same hand. In my figures the percentage for two digits on opposite hands is in every case the lower, and although the differences are small, ranging only up to 1.8 while four-fifths of them are less than 1, the consistency of the results suggests a slightly closer relationship between a pair of digits on the same hand than between the corresponding pair on opposite hands. This view is further supported at a later stage of this paper. (See Remark (d) on Tables 14-16, p. 450.)

* Finger Prints, p. 120, Tables VIa and VIb.

One further comparison is of interest, namely, the measure of relationship between the various digits on a centesimal scale. It should be noted, however, that while Galton’s means are based on loops and whorls only, omitting arches from his three groups, mine are based on small loops, large loops and whorls, omitting arches and composites from my five groups; also Galton gives no results for those combinations which include the little finger.

TABLE 6e. Approximate Measures of Relationship between the various Digits, on a Centesimal Scale.

                                   Galton          Waite
Couplet                            Means       Right     Left
Thumb and fore-finger                24          18        27
Thumb and middle finger              27          21        23
Thumb and ring finger                39          21        24
Fore and middle finger               60          40        46
Fore and ring finger                 23          33        33
Middle and ring finger               52          44        48

Right and left thumbs                61               54
Right and left fore-fingers          48               43
Right and left middle fingers        43               47
Right and left ring fingers          65               64

For the reasons given above we could hardly expect that these readings would be even approximately equal, but for all that, the same general relations are seen to hold good in the two sets of results.

It is convenient at this stage to summarize a few of the most important points which have been brought to light in the foregoing pages. These are: (a) …

… $\Sigma^2$ instead of $\sigma_y^2$, where

$$\Sigma^2 = \frac{SS\,(y - {}_a\bar{y}_i)^2}{N}.$$

We may write

$$\Sigma^2 = \frac{SS\,(y - {}_a\bar{y}_i)^2}{N} = \frac{SS\,(y - \bar{y}_a)^2}{N} + \frac{S\{n_x(\bar{y}_a - {}_a\bar{y}_i)^2\}}{N} + \frac{2\,SS\,(y - \bar{y}_a)(\bar{y}_a - {}_a\bar{y}_i)}{N} = \frac{S(n_x\sigma_a^2)}{N} + \frac{S\{n_x(\bar{y}_a - {}_a\bar{y}_i)^2\}}{N},$$

since the third term vanishes; hence

$$\eta^2 = \frac{S\{n_x(\bar{y}_a - {}_a\bar{y}_i)^2\}/N}{S(n_x\sigma_a^2)/N + S\{n_x(\bar{y}_a - {}_a\bar{y}_i)^2\}/N}.$$

But

$$\frac{S(n_x\sigma_a^2)}{N\sigma_y^2} = 1 - \eta_c^2, \quad\text{and}\quad \frac{S\{n_x(\bar{y}_a - {}_a\bar{y}_i)^2\}}{N\sigma_y^2} = \eta_p^2,$$

where $\eta_c$ is the crude $\eta$ found by the ordinary method.

We have, therefore, the value of the correlation ratio of restricted Tables given by

$$\eta^2 = \frac{\eta_p^2}{1 - \eta_c^2 + \eta_p^2}, \quad\text{or}\quad \eta = \frac{\eta_p}{\sqrt{1 - \eta_c^2 + \eta_p^2}}.$$

The correlation ratio has been found by the method described above for all the restricted Tables; it has also been determined by the ordinary method for a few of the other Tables, but no correction for number of arrays has been applied. The results, together with the coefficients of correlation and of contingency, are given in Tables 8 and 9.

Regression curves for all the restricted Tables are given on Plates (a) to (e). The continuous line is the independent probability curve and the broken line the curve of the observed means. It follows that the area between the curves, weighted, of course, with the marginal totals, gives a measure of the correlation ratio between the two characters. Each set of three figures for two particular characters, namely, those for the right hand, left hand, and both hands respectively, will generally be found to resemble each other closely.
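As a check on the expression for the correlation ratio of restricted Tables, the short sketch below assumes that it reads $\eta = \eta_p / \sqrt{1 - \eta_c^2 + \eta_p^2}$ and applies it to three right-hand rows of Table 8. The function name is hypothetical, and the three triples are read from the first set of $\eta_c$, $\eta_p$, $\eta$ columns of that Table.

```python
import math


def restricted_correlation_ratio(eta_c, eta_p):
    """Correlation ratio for a restricted table (assumed reading of the formula).

    eta_c : crude correlation ratio found by the ordinary method
    eta_p : ratio built on deviations of the observed array means from the
            independent-probability array means
    """
    return eta_p / math.sqrt(1.0 - eta_c ** 2 + eta_p ** 2)


# (eta_c, eta_p, printed eta) as read from Table 8, right hand:
rows = [
    (0.273, 0.194, 0.198),  # Arches and Large Loops
    (0.438, 0.074, 0.082),  # Small Loops and Large Loops
    (0.239, 0.079, 0.081),  # Large Loops and Whorls
]

for eta_c, eta_p, printed in rows:
    computed = restricted_correlation_ratio(eta_c, eta_p)
    print(eta_c, eta_p, round(computed, 3), printed)
```

For these rows the computed values agree with the printed ones to three decimals, which supports this reading of the formula; some other rows of the Table do not reproduce exactly, possibly because individual digits are misread in this copy.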
Irregularities occur chiefly with composites but this is not surprising if we consider the nature of this class.

8c. Coefficients of Correlation of Restricted Tables.

A glance at the diagrams of means of the restricted Tables, Plates (a) to (e), shows that the regression is generally non-linear; it is also evident that a sensible value of r is introduced by the restriction*. Hence the value of r as found by the ordinary product-moment method is (i) too small because of the skewness of regression and (ii) too large on account of the restriction. These two contrary causes render the coefficient of correlation of restricted Tables unreliable and therefore quite valueless; for even if it sometimes agrees fairly closely with the correlation ratio and the contingency coefficient, this agreement is probably due to the fact that the two sources of error counterbalance each other.

In the remaining Tables, for which the results are given in Table 9, the regression is frequently skew; for this reason and for those given above, I have rejected the values of the coefficient of correlation in the sequence and have based my conclusions on the contingency coefficients, confirmed in general by the correlation ratio.

* For example, in small loops and large loops, left, the case in which the difference between r and C is the greatest, the independent probability numbers have the correlation coefficient -.512 (instead of the theoretical value zero), as compared with -.507 of the observed numbers. In the case of arches and small loops, both hands, r for the independent probability numbers is -.148, as against +.147 of Table 8.

[Plates (a) to (e): Regression Curves for Finger-Prints, Figs. 1 to 30, one for each restricted Table of the Appendix (right hand, left hand, and both hands). Continuous lines = independent probability curves; broken lines = actual curves.]

TABLE 8.

                                          yx     yx     yx     xy     xy     xy                   Table in
                                   r     η_c    η_p     η     η_c    η_p     η      η      C     Appendix
Right Hand
Arches and Small Loops           .062   .192   .242   .227   .228   .266   .264   .251   .305    I
Arches and Large Loops           .273   .273   .194   .198   .298   .215   .220   .209   .246    II
Arches and Whorls                .317   .359   .283   .290   .353   .286   .292   .291   .335    III
Arches and Composites            .146   .154   .139   .139   .157   .140   .140   .140   .154    IV
Small Loops and Large Loops      .422   .438   .074   .082   .430   .073   .080   .081   .166    V
Small Loops and Whorls           .585   .622   .313   .371   .574   .315   .385   .378   .408    VI
Small Loops and Composites       .270   .273   .207   .210   .273   .197   .201   .205   .228    VII
Large Loops and Whorls           .234   .239   .079   .081   .305   .112   .116   .097   .162    VIII
Large Loops and Composites       .093   .138   .093   .093   .104   .045   .045   .065   .128    IX
Whorls and Composites            .020   .162   .117   .126   .032   .061   .061   .088   .137    X
Left Hand
Arches and Small Loops           .038   .197   .271   .267   .249   .301   .292   .281   .335    XI
Arches and Large Loops           .333   .344   .253   .261   .375   .293   .302   .280   .309    XII
Arches and Whorls                .255   .286   .232   .235   .281   .232   .235   .235   .274    XIII
Arches and Composites            .154   .159   .144   .145   .164   .147   .148   .146   .162    XIV
Small Loops and Large Loops     -.507   .534   .095   .112   .510   .028   .025   .055   .120    XV
Small Loops and Whorls           .565   .623   .364   .422   .570   .310   .353   .386   .419    XVI
Small Loops and Composites       .365   .392   .298   .309   .401   .281   .293   .301   .311    XVII
Large Loops and Whorls           .160   .193   .129   .131   .260   .157   .160   .145   .236    XVIII
Large Loops and Composites       .080   .131   .063   .063   .088   .007   .007   .021   .108    XIX
Whorls and Composites            .115   .208   .217   .217   .173   .202   .201   .209   .244    XX
Both Hands
Arches and Small Loops           .147   .340   .387   .381   .279   .358   .350   .365   .440    XLVI
Arches and Large Loops           .364   .375   .298   .306   .482   .364   .384   .343   .383    XLVII
Arches and Whorls                .319   .409   .355   .363   .397   .341   .348   .355   .402    XLVIII
Arches and Composites            .203   .227   .220   .220   .226   .212   .213   .216   .239    XLIX
Small Loops and Large Loops      .471   .527   .162   .187   .476   .059   .071   .115   .234    L
Small Loops and Whorls           .638   .707   .421   .511   .670   .412   .478   .495   .503    LI
Small Loops and Composites       .382   .393   .331   .339   .402   .333   .341   .340   .365    LII
Large Loops and Whorls           .147   .194   .178   .178   .323   .228   .234   .204   .333    LIII
Large Loops and Composites       .020   .109   .085   .085   .087   .066   .066   .075   .181    LIV
Whorls and Composites            .150   .280   .295   .294   .195   .285   .233   .260   .320    LV

Remarks on Table 8. A comparison of the Correlation Ratio with the Contingency Coefficient of the Restricted Tables.

(a) The values of η and C are generally in very close agreement.

(b) The value obtained for η is, however, always less than that for C.

(c) In only three cases does the difference between η and C exceed .1. The probable error of η ranges from .015 for the smallest values to .011 for the largest; it will also be remembered that no corrections have been applied to η nor to C, since we do not yet know what these corrections should be for restricted Tables.
We may assume, however, that, as with ordinary Tables, correction would modify m less than it would diminish C, and the corrected values of 7 and C would thus, in all probability, agree somewhat more closely than at present. Biometrika x The 57 444 Association of Finger-Prints TABLE 9. Right and Left Hands. Table in Oe Ct 1 Appendix Arches 2 and Arches Z ... ... | +°686 + 008 664 688 = XxI ie Small Loops Z ... | +°160+°015 "285 302 | -2344+-014| XXII a Large Loops Z... | —"297+ 014 "322 337 — XXIII a Whorls ZZ... .... | —°257+°014 ‘290 | -307 -- XXIV Composites Z ... | —*140+°015 “118 ‘161 -- XXV Small Loops R and Arches L ... | +°185+-015 *309 325 | -2834+:014| XXVI A Small Loops Z| +:711+ 007 631 635 2 XXVII ss Large Loops Z| — °378+°013 382 “393 = XXVIII by Whorls Z ... | —'494+ ‘011 “499 506 = XXIX Composites Z | — -290+-014 292 | -309 aoe XXX Large Loops Rand Arches L ...| —'275+:014 297 314 = XXXI i Small Loops L | —-217+-014 | -262 | +282 ee 0.8 Gk " Large Loops Z| + °550+:011 519 525 — XXXIII 2 Whorls Z . | —*123+°015 ‘210 235 | 159+ 015 | XXXIV Composites ZL | —-017+°015 ‘000 ‘089 — XXXV Whorls # and Arches L ... ... | — 3808+ 014 BB B51 = XXXVI 5 Small Loops Z — 555 + °010 534 “540 = XXXVII 5) Large Loops Z +°021+°015 283. ‘301 | 170+ °015 |XX XVIII x Whorls Z . +°741+:007 ‘670 672 = XXXIX Composites ib +°280+ °014 296 313 = XL Composites Rand Arches L ... | —°146+ °015 15 |, 59 = XLI Small Loops L| —:188 +°014 72 203 = XLII x Large Loops Z|} + 131 +:°015 25) ‘166 a XLII Whorls £ . | +°059+°015 PLA 168 | -105+:015| XLIV - Composites Z... | +°250+°014 367 379 =— XLV Further Remarks on Tables 8 and 9. The results given in these Tables show:— (a) A general agreement between the correlations for the same pair of classes of prints whether obtained by different methods from the same Table (omitting values of r in Table 8), or from different Tables, the principal exceptions being those for which the correlation ratio has been calculated in Table 9. 
(b) A wide range in the magnitude of the results for different pairs of prints. (c) ‘The association between any class of print in one hand and the same class in the other is, in general, as might be expected, much higher than any other association of these Tables. Omitting the composites the remaining four con- tingency coefficients between the same class in different hands are, with one exception, each greater than any others; the same may be said of the correlation coefficients, the exception in each case being the correlation between whorls in the right and small loops in the left hand, which is slightly greater than the correlation between the large loops in the right and left. Even with the com- posites the contingency for the two hands is greater than that for composites with * Values of contingency coefficients corrected for number of cells. + Values of contingency coefficients not corrected for number of cells, given for the sake of comparison with other Tables. + The value of 7 is in all cases Nyx Ney H. WattE 445 any other class found from any of the Tables, while the correlation coefficients have five exceptions to this general rule. (d) The contingency coefficients given in Table 8, where the two hands are taken together, are, with two exceptions, greater than the corresponding coefficients in other parts of Tables 8 and 9. The exceptions are (1) the contingency co- efficient ‘234 for small loops with large loops of Table 8 is slightly less than those in Table 9; and (2) the coefficient 503 for small loops with whorls in Table 8 is rather less than that for whorls (right) with small loops (left) of Table 9. A further study of the above Tables shows that :— Large loops are closest to arches. Arches * r whorls. Whorls FS FA small loops. Small loops ": a whorls and then to arches. Composites is » small loops and then to arches. 
The suggestion thus arises that arches and whorls have the closest natural resemblance to intermediate sized loops, and also that the "natural order" of the classes of finger-prints is:

(1) Large Loops, (2) Arches, (3) Whorls, (4) Small Loops, (5) Composites.

This is more clearly seen from the following arrangement of the contingency coefficients.

TABLE 10. Contingency Coefficients of Right Hand.

               Large Loops   Arches   Whorls   Small Loops   Composites
Large Loops        1          .246     .162      .166*         .128
Arches            .246         1       .335      .305          .154*
Whorls            .162        .335      1        .408          .137
Small Loops       .166        .305     .408       1            .228
Composites        .128        .154     .137      .228           1

TABLE 11. Contingency Coefficients of Left Hand.

               Large Loops   Arches   Whorls   Small Loops   Composites
Large Loops        1          .309     .236      .120          .103
Arches            .309         1       .274      .335*         .162
Whorls            .236        .274      1        .419          .244
Small Loops       .120        .335     .419       1            .311
Composites        .103        .162     .244      .311           1

* Coefficients which do not agree with the proposed "natural order."

TABLE 12a. Contingency Coefficients of Right Hand with Left. (Corrected for number of Cells.)

                                      Right Hand
Left Hand      Large Loops   Arches   Whorls   Small Loops   Composites
Large Loops       .519        .322     .283      .382*         .125*
Arches            .297        .664     .337      .309          .115
Whorls            .210        .290     .670      .499          .127
Small Loops       .262*       .285     .534      .631          .172
Composites        .000        .118     .296*     .292          .367

TABLE 12b. Contingency Coefficients of Right Hand with Left. (Not corrected for number of Cells.)

                                      Right Hand
Left Hand      Large Loops   Arches   Whorls   Small Loops   Composites
Large Loops       .525        .337     .301      .393*         .166*
Arches            .314        .688     .351      .325          .159
Whorls            .235        .307     .672      .506          .168
Small Loops       .282*       .302     .540      .635          .203
Composites        .089        .161     .313*     .309          .379

TABLE 13. Contingency Coefficients of both Hands taken together.

               Large Loops   Arches   Whorls   Small Loops   Composites
Large Loops        1          .383     .333      .234          .181
Arches            .383         1       .402      .440*         .239
Whorls            .333        .402      1        .503          .320
Small Loops       .234        .440     .503       1            .361
Composites        .181        .289     .320      .361           1

* Coefficients which do not agree with the proposed "natural order."

The contingency coefficients of the right hand with the left have been given both corrected and uncorrected for the number of cells and both sets of results point to the same conclusion.

The proposed "natural order" of the types is supported by the above Tables, only eight coefficients out of the fifty-five not being in complete agreement. In four of these cases the difference is very small, most likely well within the probable errors, and they may therefore be regarded as insignificant. A similar arrangement of the correlation coefficients still further supports the proposed order, though not quite so conclusively, probably on account of spurious correlations.

9. Association between the various Fingers.

In this section I have calculated the contingency coefficients only, the classes being arranged in the order found in Section 8, p. 445. It would, of course, be possible to obtain Tables with much finer grouping either by further subdivision of the loops or by making use of the "secondary classification" described by Galton or Henry (see footnote, p. 421). All such finer grouping would raise the contingency; the extra labour involved by the addition of some three or four rows and columns to each Table would, however, be so considerable that the question arises whether some allowance can be made for the coarser grouping employed. This can only be done if we may suppose a "natural order" of some kind with a frequency roughly approaching the normal. This gives a rough upper limit to the contingency and is the purport of the work in the earlier sections on "natural order" and corrections.
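As an illustration of the quantities tabulated above, the following sketch computes a coefficient of mean-square contingency from a table of counts and then applies a correction for the number of cells. The correction is an assumption and is not stated in the text: it subtracts (rows - 1)(columns - 1)/N from the mean-square contingency, taking six classes per hand (0 to 5 fingers) and N = 2000 as in the appendix tables. Under that assumption several uncorrected entries of Table 12b lead to the corresponding entries of Table 12a. The function names are illustrative only.

```python
import math


def contingency_coefficient(table):
    """Mean-square contingency C = sqrt(phi2 / (1 + phi2)) for a table of counts."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    phi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                phi2 += (observed - expected) ** 2 / expected
    phi2 /= n  # chi-square divided by the total frequency
    return math.sqrt(phi2 / (1.0 + phi2))


def correct_for_cells(c, rows, cols, n):
    """Assumed correction: subtract (rows-1)(cols-1)/n from phi2, never below zero."""
    phi2 = c * c / (1.0 - c * c)
    phi2_corrected = max(phi2 - (rows - 1) * (cols - 1) / n, 0.0)
    return math.sqrt(phi2_corrected / (1.0 + phi2_corrected))


# Uncorrected values from Table 12b, 6 x 6 classes, 2000 individuals.
for uncorrected in (0.525, 0.635, 0.089):
    print(uncorrected, "->", round(correct_for_cells(uncorrected, 6, 6, 2000), 3))
```

The floor at zero is a design choice of the sketch: it is what turns the uncorrected .089 for composites with large loops into a corrected .000.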
As an example of the effect which finer grouping has on contingency I have found the contingency between the index fingers of the two hands by means of a "seven by seven" Table, the radial and ulnar loops being separated, and also by means of a "five by five" Table in which no distinction is drawn between the radial and ulnar loops. The results in this case, not corrected for grouping, are .653 and .626; when corrected for grouping these results become .704 and .698, respectively. They are so nearly identical as to suggest that no very material advantage would be gained by a further subdivision of classes.

On the assumption that there is a certain degree of continuity in the distribution I have corrected all the results for grouping as well as for the number of cells. The method employed for the former correction is fully described by Professor Pearson in Biometrika*.

The following Tables give the contingency coefficients for each finger with each other finger. The two sets of coefficients are included, viz. those which are not corrected for grouping, that is, which are obtained without any assumption of a "natural order" and those which are so corrected, in order that the conclusions based on the latter may be compared with those based on the former.

* "On the Measurement of the Influence of 'Broad Categories' on Correlation," by Karl Pearson, F.R.S., Biometrika, Vol. IX. pp. 116-139.

TABLE 14a. Contingency Coefficients of Right Hand. (Corrected for Grouping.)

        R1            R2      R3      R4      R5
R1       1        .429±.011  .455    .469    .473
R2      .429          1      .645    .576    .519
R3      .455        .645      1      .665    .565
R4      .469        .576     .665     1      .690
R5      .473        .519     .565    .690     1

TABLE 14b. Contingency Coefficients of Right Hand. (Not corrected for Grouping.)

        R1            R2      R3      R4      R5
R1       1        .373±.012  .379    .400    .385
R2      .373          1      .561    .511    .441
R3      .379        .561      1      .568    .460
R4      .400        .511     .568     1      .576
R5      .385        .441     .460    .576     1

TABLE 15a. Contingency Coefficients of Left Hand. (Corrected for Grouping.)

        L1      L2      L3      L4        L5
L1       1     .503    .465    .474    .508±.012
L2      .503     1     .675    .609    .539
L3      .465   .675      1     .724    .585
L4      .474   .609    .724      1     .711
L5      .508   .539    .585    .711      1

TABLE 15b. Contingency Coefficients of Left Hand. (Not corrected for Grouping.)

        L1      L2      L3      L4        L5
L1       1     .435    .390    .401    .410±.014
L2      .435     1     .582    .529    .447
L3      .390   .582      1     .611    .471
L4      .401   .529    .611      1     .577
L5      .410   .447    .471    .577      1

TABLE 16a. Contingency Coefficients of Right Hand with Left. (Corrected for Grouping.)

        R1      R2                    R3      R4            R5
L1     .777    .441                  .440    .446          .424
L2     .479    .698 [5×5 Table]      .640    .559          .521
               .704 [7×7 Table]
L3     .427    .608                  .786    .669          .561
L4     .446    .587                  .663    .814          .675
L5     .501    .515                  .537    [illegible]   .899

TABLE 16b. Contingency Coefficients of Right Hand with Left. (Not corrected for Grouping.)

        R1      R2                    R3            R4      R5
L1     .649    .385                  .368          .383    .347
L2     .412    .626 [5×5 Table]      [illegible]   .493    .439
               .653 [7×7 Table]
L3     .356    .530                  .656          .572    .459
L4     .375    .514                  .558          .702    .556
L5     .402    .432                  .431          .534    .707

Remarks on Tables 14a, 15a, and 16a.

(a) It will be seen from these Tables that the association of types between corresponding fingers of the two hands is, with one exception, always closer than that between any other pair of fingers. The order of magnitude of these associations is:

(1) Little Finger, (2) Ring Finger, (3) Middle Finger, (4) Thumb, (5) Index Finger.

(b) If we omit the thumb for the present, leaving it for separate comment, and consider the association between corresponding fingers as of the "first order," that between fingers of consecutive rank, such as R2 and R3, or R2 and L3, as of the "second order," and so on, we notice a significant relation between any particular association and its "order." Thus:

First order associations range from .899 to .704 or .698,
Second order associations range from .724 to .608,
Third order associations range from .609 to .537,
Fourth order associations range from .539 to .515.

The amount of overlapping in these ranges appears to be quite insignificant.
(c) It follows from (b) that if in any of these Tables we start from a first order association and pass in any direction through those of other orders we find a continuous and rapid fall; that is, a finger is always more closely related to a consecutive finger than to one more remote (but see (a)); and the greater the difference in rank between two fingers, whether on the same or on different hands, the less close is the association between them.

(d) The association between any pair of fingers in one hand is, in general, closer than either of the corresponding associations between a finger of the right and one of the left hand. There is one exception to this rule in associations of the second order, one in the third and one in the fourth.

(e) The associations of the left hand are in every case closer than the corresponding associations of the right.

(f) The associations of either thumb with any finger all fall below those of the fourth order of (b), and the range of the sixteen coefficients is only from .424 to .508. As it is difficult to base any conclusions on these figures as to the relations between the thumb and the various fingers, I have carefully checked them by reworking the whole of the calculations involved, but have in every case arrived at the same result. I have also found the probable error* for the largest and for one of the smallest coefficients of the set. As the contingency coefficients are all of the same order of magnitude and the number of individuals the same in all cases, the probable errors of all will be of about the same magnitude and it is unnecessary to calculate more. The probable errors in the two cases being of the order .011 the differences in the contingency coefficients may be regarded as insignificant.
Although in three cases out of the four the contingencies of the thumb with the middle, ring and little finger respectively are in ascending order of magnitude, the differences are so small in comparison with the probable errors that no conclusion can be drawn as to the relations between the thumb and the various fingers. We may notice, however, that the rule (d) holds good for the thumbs with but two exceptions.

The contingency coefficients given in Tables 14b, 15b, and 16b are all smaller than the corresponding results of the other series, but a careful study will show that the remarks (a) to (g) almost invariably apply to these Tables also.

Note. In some preliminary work on this paper I classified the types as follows: (1) Arches and loops with 1-3 ridges, (2) Loops with 4-10 ridges, (3) Loops with 11-14 ridges, (4) Loops with 15 or more ridges, (5) Whorls, (6) Composites. With this classification the following contingency coefficients were found for corresponding fingers of the two hands: Thumb .686, Forefinger .642, Middle finger .686, Ring finger .730, Little finger .738. These results, which were not corrected for grouping, are seen to agree very closely with those of Table 16b, the values being rather larger probably on account of the slightly finer grouping.

* The method employed is that given in Biometrika, Vol. V. Parts I. and II., "On the Probable Error of Mean-Square Contingency," by John Blakeman and Karl Pearson.

10. Comparison with Results of Previous Work.

It would be well to compare briefly some of my results with those of the two works mentioned on p. 421. Whiteley and Pearson arrived at the following conclusions:

(i) The hand is a very highly correlated organ, far more highly correlated than the skull and even somewhat more so than the long bones.

(ii) The parts of the left hand are distinctly more closely correlated than those of the right.

(iii) The order of correlation of the first finger joints is identical for both hands.
This order is as follows :— (a) The external fingers have the least correlation and the little finger always less than the index. (6) 408 r) 203 B 124 a 92 = 21 Totals H. WaltTE 4 TABLE XXV. Arches, Right, and Composites, Left. Arches, R. 2 3 4 5 Totals = Obie esr 16. |) 5. fide 2B 13 4 0) 0) 434 2 2 1 0 0 103 2 1 0 0 4) 0 22 Q4 0) 0 0 0 4 g 0 0 ) 0) 1 [e) = aD Totals J 1541 | 294 111 33 16 7) 2000 | TABLE XXVI. Small Loops, Right, and Arches, Left. Small Loops, R. 0 1 2 3 4 5 Totals 558 369 QO liter 129 30 1542 ot Dae aon |) va | me | by | tL |b 290 gf Mieilbesisfe nis) ul) aoe |) 01 4 | 102 3 Tech aley ihe ie 3 0 49 e 3 5 7 4 1 0 20 “a 2 2 0 0 0 0 4 593 453 | 392 306 211 45 2000 TABLE XXVII. Small Loops, Right, and Small Loops, Left. Totals 496 425 366 315 283 115 Small Loops, L. Aes WeRS Totals 2000 TABLE XXVIII. Small Loops, Right, and Large Loops, Left. Small Loops, &. Totals 515 577 423 311 141 33 Large Loops, L. 2000 Biometrika x 59 9 460 Whorls, L. Arches, L. Small Loops, L. Composites, L. Association of Finger-Prints TABLE XXIX. Small Loops, Right, and Whorls, Left. Small Loops, BR. 0 1 2 3 J 5 0 159 217 286 | 249 198 43 i 130 143 79 | 48 12 1 Q 112 58 19 13 I 0 } 92 25 6 | 1 10) 10) y 79 10 eo 0 1 5 21 0) O 10) 0 0) Totals | 5938 453 392 306 211 45 TABLE XXX, Small Loops, Right, and Composites, Left. Small Loops, AR. 0 1 | 2 8 | 4 5 Totals 0 | 323 | 311 | 299 | 267 | 191 | 45 | 1436 1 194 LOD IF 82 32 | 17 0) 434 2 60 22) | ll | vi | 3 0 103 3 12 10 | 0) O 0 0 22 . J 4 0 | Oo 0 0 0 4 5 Oneal 0 0 OueaO 1 Totals | 593 | 453 392 306 | 211 45 2000 TABLE XXXI. Large Loops, Right, and Arches, Left. Large Loops, &. | 0 | 1 | 2 3 | } | 5 Totals 0 290 483 394 | 255 1030 ell 1542 I 22 ; 3 J ii) Totals} 457. | 659 | 478 | 280 | 108 | 18 | 2000 TABLE XXXII. Lurge Loops, Right, and Small Loops, Left. Large Loops, R. Totals 496 425 366 B15 283 115 Totals 3 2000 — oe H. WaAItTE TABLE XXXIII. Large Loops, Right, and Large Loops, Left. Large Loops, R. 
— a baer eral | | | lO A OR is | isa) 3 ome 7 121 | 956 | 135 | 59 5 Sa 2 53 | 135 | 149 | 67 | 18 4 3 17 | 66 | 101 77 =| “Al ) 4 2 20 36 AT 32 SS 5, 0 4 ¢ 4 Totals | 457 | 659 . TABLE XXXIV. Large Loops, Right, and Whorls, Left. Large Loops, R. Whorls, L. Meer Cs DMS Totals TABLE XXXV. Large Loops, Right, and Composites, Left. | | 0 | ae 3 J | 5 | Totals | ——— — . 0 Boel ATeeasoSn elo 1 Sle | 13 1436 cs 1 90 | 140 | 107 70 93--) 4 434 | 2 Hh eels ees, alg) 4" 103 Pima 4 St Rates Ce eal Olle 20 22 S 4 2 2 Olea OF Oe |e 4 5 5 Ss ieee Oita |e): OM | 102 Inn 20 1 O | | aot Totals} 457 | 659 | 478 280 | 10s | 18 | 2000 TABLE XXXVI. Whorls, Right, and Arches, Left. . Whorls, R. S 0 | 526 | 405 | 262 | 169 | 130 | 50 | 1542 al 193 t6On hoya sie O:| 0 4 290 Ms 2 80 | 19 3. || 20 One 20) AP 102 S| 3 38 4 0 0) One) (0) 42 | J 20; nO OF). 20 OF 0 20 nl ode deh OG |e 20 O20) 70%), 10 4 | Totals | 861 | 497 292 |170 | 130 | 50 | 2000 462 Association of Finger-Prints TABLE XXXVII. Whorls, Right, and Small Loops, Left. Whorls, R. S— & O nD | jor 2 | alii | 80 5 = 202 | 65 15 1 S 99 | 16 O 0 g NM Totals 497 292 170 130 50 TABLE XXXVIII. Whorls, Right, and Large Loops, Left. Whorls, R. 3 Large Loops, L. | Totals | 497 TABLE XXXIX. Whorls, Right, and Whorls, Left. Whorls, R. Whorls, L. Mey to OWNS | Totals | 861 497 292 170 130 50 | 2000 TABLE XL. Whorls, Right, and Composites, Left. Whorls, R. 0 i 2 3 4 5 NS ‘y Lard 9° 2 0 744 | 331 | 185 93 55 28 | 1436 ey 1 98 | 132 78 58 52 16 434 | 2 16 28 19 16 19 5 103 oa 3 el es 7 2 4 1 22 Es 4 Ouran a1 2 1 0 0 4 5 5 0 0 1 0 0 0) 1 S | ES EET SEES Totals | 861 | 497 | 292 | 170 | 130 50 | 2000 H. Walt : TABLE XLI. Composites, Right, and Arches, Left. Composites, R. 0 il 2 8 4 5 Totals | - 0 1042 | 378 | 107 12 3 O | 1542 ial ae 933 | 46 9 2 0 0 | 290 a 2 84| 16 2 ) 0 0 102 a 3 40 2 0 0 ) ) 42 S 4 20 0 0 0 0 0 20 pals ae 2 Ole 0 0 33 5 | Totals | 1423 | 442 118 TA tS (0) 2000 TABLE XLIV. Composites, Right, and Whorls, Left. Composites, R. 
Whorls, Z. ABW Ss WHO Totals J 1423 | 442 118 14 3 0) 2000 463 A464 Small Loops. Large Loops. Association of Finger- Prints TABLE XLV. Composites, Right, and Composites, Left. Composites, R. 0 eee 3 J = 0 1100 12765 | 53) 4 6 1 a 1 958°| 130°) 432) 43 0) ay 9g by 28 14 | 3 1 S 3 8} 8 4 2 ) = 4 0 0 3 lal 30 1 3 5 ) ) eid 0 0) 0 Totals | 1423 | 442 118 14 3 ar TABLE XLVI. Arches and Small Loops, Both Hands. Arches. i) 1 2 B 4 5 6 if 8 9 10 {Totals 0 361 7 1 1 1 |-0 0 0 0 0 2 373 1 205 | 19 7 6 3. k= 30 ) 0 0 5 — 245 g 168 33 12 7 ‘| (0) 5 2 5 — — 223 } 139 36 13 9 S| 6 4 10 — — — 225 105 40 25 10 2. | 6 10 — — —_ — 198 102 | 33 25 18 10 10 — — — — — 198 88 | 37 21 18 15 — — — — — — 179 74 47 23 21 — —- — — — —_— 165 638] 298. |. TS ha) ee eT 45] QI = =< pe = = _ = _ = 66 19-|° = = — se = = =5 — a _ 19 1369 | 291 | 145 90 40 22 19 12 5 5 2 | 2000 TABLE XLVII. | Arches and Large Loops, Both Hands. Arches. § 0 109 | 32 24 32 19 13 12 11 5 5 | 2 264 if 164 | 55 34 23 ll 6 5 1 0) 0 — 299 2 230 70. _|__33 18 4 3 2 (0) 0) — — 360 3 225 41 27 9 4 (0) 0) 0) — — — 306 4 212 | 45 16 5 1 0 0 = = = — 279 5 160 | 22 9 2 OS XO = _ a= = = 193 6 119 13 2; 1 1 | — — — — — — 136 7 86 9 0 0) —_— — — — — — —_— 95 8 49 3 (0) — — — — == =e 9 1 — — — —_ — pes 10 sy Pees ee | ey ee = oe Sle ee =. eS ae Ven eee H. WaAITE 465 TABLE XLVIILI. Arches and Whorls, Both Hands. Arches. | 0 | 2 1 pee eS ee ae ae ee 0 351 | 160 100 66 30 21 16 12 ii) 5 2 768 1 240 69 25 16 7 1 2 0) O O — 360 2 191 38 ll 5 2 (0) 1 0 O — — 248 BS 8 146 14 8 Dy 1 10) 0) O — — — 171 Fly 124] 6 1 1 0 0 63 = 132 2 5 86 3 (0) (0) (6) O — — — — — 89 = 6 ah 1 (6) (0) O — — — 78 t 71 O O 10) — — — —_ — — — 71 8 47| 0 0 ae = 47 9 23 (0) — = — — | 23 10 13 Sly _— — 13 | Totals | 1369 | 291 145 90 40 22, 19 12 5 5 2 2000 | TABLE XLIX. Arches and Composites, Both Hands. Arches. eee ee ote.) 7 |. 
2 | 650 |193 | 101 | 62 | 33 1 40g | 66 | 34 | 19 6 ‘ 202 | 92 ale ty i 8 3 1 g 2 0 1 oy 0 0 0 g 0) (0) (0) } 0 0 ot e) (0) — a 40 TABLE Small Loops and Large Loops, Both Hands. L. Small Loops. 0 1 2 3 4 5 6 ” 8 9 10 0 44 13 10 18 17 20 16 | cot 31 30 19 1 44 19 13 19 14 30 37 45 42 BG alse a 2 60 94 | 24 S10) | Bin 42 54 55 SG ip hee a| 3 5) 31 25 39 43 47 32 Onn ees || ee eee 5 y 49 34 | 41 44 | 46 40 25 ea mene iL se om 5 35 39 38 33 29 19 = = = = aa AG 34 | 32 34. 22 TOES | st TR ee al a ec a (ae = 7 24 | 99 | 29 Oe | reise fl, (aed RS ae ne eS 8 16 20 13 ye ee oh | eee ee 9 9 Oe RS cee Se tL" a a (bee 10 By a a a ee ee = j | Totals] 373 | 245 | 223 | 225 | 198 | 198 | 179 | 165 | 109 66 19 Totals 264 299 360 306 279 193. 136 95 52 13 466 Association of Finger-Prints TABLE LI. Small Loops and Whorls, Both Hands. Small Loops. Whorls. Totals 24! 2° Hi 179 TABLE LILI. Small Loops and Composites, Both Hands. Small Loops. | Beales | 7 | 8 | 9 | 10 |Totals 0 1098 1 536 E; a 240 mt Mer es nD 4 6 S. i) = ‘ j=) iS) Totals | 373 TABLE LIII. Large Loops and Whorls, Both Hands. ‘Large Loops. an si 2 8 J a | @ | y | 8 9 | 10 [Totals 0 156 138 133 91 73 51 43 42 29 9 3 1 24 45 54 63 56 39 30 29 16 4 — 2 12 24 36 35 50 3) 30 19 7 — — a 3 12 16 30 21 35 35 17 5 — a= — cele 8 7 | 20° | 32 | 96 | 23 | 16 fe 5 3 8 20 20mg 928 10 — -- — — |— = 6 4 9 24 30 11 — —_— if 10 18 29 14 — — — aoe 8 11 22 14 — — 9 11 12 — a = -- — — 10 13 (etna ie | ra eee Ba || — Totals J 264 | 299 360 306 279 193 136 95 52 13 3 Composites. Composites. H. WaAItTE TABLE LIV. Large Loops and Composites, Both Hands. Large Loops. Whorls and Composites, Both Hands. TABLE LV. Whorls. 0 tr | 2 3 5 5 6 : By ano 10 |Totals 0 26 1098 1 40 536 2 12 240 3 8 85 y 3 26 5 ) Tl 6 ai 6 7 as l 8 ai 1 9 ae 0 10 be 0 Totals 248 | 171 | 132 | 89 2000 TABLE LVI. Right Thumb and Indea. Right Thumb. A SL LL W (6! 
Totals Pe = A 29 97 148 50 28 352 a Sh 12 125 320 139 58 654 x LE 2 27 149 125 36 339 a W 1 26 144 | 260 50 481] 20 C0 2 15 54 75 28 174 aa aD | Totals | 46 290 | 815 | 649 200 | 2000 Biometrika x 60 468 Right Middle Finger. Right Little Finger. Right Middle Finger. Right Ring Finger. Association of Finger-Prints TABLE LVII. Right Thumb and Middle Finger. Right Thumb. A f SL 18 LL 1 W 2 C 0) Totals 46 290 SL LL 428 223 69 20 815 W 31 215 208 160 35 649 200 Totals TABLE LVIII. fight Thumb and Ring Finger. Right Thumb. | A SL bh We C Totals BL 5 60 | 248 166 64 543 W 8 55 | 929 | 355 82 729 eee 2 22 73 57 22 176 | Totals | 46 290 | 815 | 649 | 200 2000 TABLE LIX. Right Thumb and Little Finger. Right Thumb. A SL LL Ww G Totals A 8 ‘14 5 3 i 31 Se 30 198 | 414 | 162 66 870 BD, 5 61 300 | 304 94 764 W i 10 58 137 22 228 Oo 2 4 38 43 17 107 Totals | 46 290 | 815 | 649 | 200 2000 TABLE LX. Right Index and Middle Finger. Right Index. A SL THs W C Totals A 127 69 9 5 2 212 SE 187 | 453 119 104 58 921 Tes 30 | 108 144 | 1792 62 516 W 4 16 48 | 178 28 274 C 4 8 19 22 24 thy Totals | 352 | 654 | 339 | 481 | 174 | 2000 —_— sl a H. Waitt 469 TABLE LXI.- Right Indea and Ring Finger. Right Index. Totals Right Ring Finger. TABLE LXII. Right Index and Little Finger. Right Index. ie isn en ewe) eo leretats Right Little Finger. TABLE LXIII. Right Middle and Ring Fingers. Right Middle Finger. Right Ring Finger. TABLE LXIV. Right Middle and Inttle Fingers. Right Middle Finger. | Ww Right Little Finger. 60—2 470 Association of Finger-Prints TABLE LXV. Right Ring and Little Fingers. Right Ring Finger. S SL Totals & ey 31 o | 870 = | 764 tS | 228 7m | 107 a aad i] 543 2000 = | TABLE LXVL. Left Thumb and Index. Left Thumb. A SL LIL W C Totals eal ne A733 78 25 | 30 313 SS SL 32 313 355 68 65 833 iG LL 5 40 143 AT AT 282 2 W 3 46 136 166 86 437 ro) C 4 15 55 35 26 135 =| Totals | 91 547 767 341 254 2000 TABLE LXVII. Left Thumb and Middle Finger. Left Thumb. 
i SL Totals to} 0) A=] y ew © SL es | ie o| W = C 3 | Total | otals Left Ring Finger. | Totals TABLE LXVIIL Left Thumb and Ring Finger. Left Thumb. 7 264 | 198 152 | 342 60 | 172 33 48 | 547 767 5 38 110 158 30 341 Totals 66 583 712 491 H. WaItE TABLE LXIX. Left Thumb and Little Finger. Left Thumb. a Totals oO bp 42) 35 Falla aCye 959 ae LL 768 BS W 150 I C 88 ney 3 Totals TABLE LXX. Left Index and Middle Finger. Left Index. 3 A SE ae W © | Totals = ty A 117 84 9 Z 3 215 © SL 166 525 80 79 37 887 oS IDE, 19 187 157 139 54 556 ac) Ww 6 Q5 21 171 17 240 SHieo 5 12 15 46 | 24 102 ~ 8 Totals } | 282 2000 TABLE LXXI. Left Index and Ring Finger. Left Index. Left Ring Finger. SL LL 833 TABLE LXXII. Index and Little Finger. Left Index. Left Little Finger. SL LL Ww Totals 471 472 Left Little Finger. Left Ring Finger. Left Little Finger. Left Thumb. Association of Finger-Prints TABLE LXXIII. Left Middle and Ring Fingers. Left Middle Finger. Totals | A S L 13 421 311 88 54 215 887 LL W 556 240 102 Totals 66 583 712 491 148 2000 TABLE LXXIV. Left Middle and Inttle Fingers. Left Middle Finger. SLI W Totals Totals | 215 887 556 240 102 2000 ——— TABLE LXXV. Left Ring and Inttle Fingers. Left Ring Finger. | PO Si Pe | W | Cc | Totals c (0) 1 35 99 44 959 212 73 768 120 15 150 60 15 88 Totals | 66 | 583 | 712 | 491 | 148 | 2000 TABLE LXXVI. Right Thumb and Left Thumb. Right Thumb. A Gn i Wee | motte): A 91 SL 547 LL 767 Ww 341 C 254 Totals 2000 aa de H. WaItE TABLE LXXVII. Right Thumb and Left Index. Right Thumb. A SL LL WwW C Totals 3 A 313 s SL 833 & aE 282 ie W 437 Gay D 135 om e [SS Totals 290 | 815 | 649 2000 TABLE LXXVIIL. Right Thumb and Left Middle Finger. Right Thumb. S A Speen | Ww Cc | Totals | E Fes fy 23 60 86 32 14 215 ie 17 167. | 414 | 210 79 887 = 0 40 | 232 | 214 70 556 oS 5 18 57 141 19 240 S 1 5 26 | 52 18 102 2 3 46 290 815 | 649 | 200 2000 TABLE LXXIX. Right Thumb and Left Ring Finger. Right Thumb . Left Ring Finger. TABLE LXXX. 
Right Thumb and Left Little Finger. Right Thumb. ra o Ey o_ A Ee isu foe a, ~~ W 4 C ae os Totals A SL LL W C 10 17 6 2 (0) 29 207 468 181 74 2 54 275 341 96 3 8 40 80 19 2 4 26 45 11 46 290 815 649 200 473 474 Left Index. Association of Finger-Prints TABLE LXXXI. Right Index and Left Thumb. Right Index. Totals A SL LL W C a lalla 91 = SL 547 ale Ae 767 iS W 341 o C 254 SSS SST Totals 352 654 339 481 174 2000 TABLE LXXXIL. Right Index and Left Index. Right Index. | aol st. sz, \ cee) re wr omlinee A 313 SL, 295 SLy 538 1H 88 LL, 194 hee 437 | C 135 Totals 352 282 372 174 165 481 174 2000 TABLE LXXXIII. Right Index and Left Middle Finger. Right Index. Be on S i A © SL = LL Ss W S| @ — 3 Totals TABLE LXXXIV. Right Index and Left Ring Finger. Right Index. Left Ring Finger. Totals | 66 583 712 491 148 H. Wattr TABLE LXXXV. Right Index and Left Little Finger. Right Index. TABLE LXXXVI. Right Middle Finger and Left Thumb. Right Middle Finger. x | ee Se ye ech Totals. | 0 | | = A ¢ 35 | Ba SE) i 236.425 12 nae) 192 50 959 | =| LL 81 | 183 | 176 | 224 | 104 768 | Ss W 5 21 30 86 8 150 | HH 6 6 5 17 49 11 88 | = | | —j | Totals}: 352 | 654 | 339 | 481 Totals fe) S 91 = 547 a 767 a 341 Bs 254 4 2000 TABLE LXXXVII. Right Middle Finger and Left Index. Right Middle Finger. Hea aed W. | .0% | -Totals: | i I | i | A 10 | 176 | 21 1 5 313. | TSM) VG 86 | 529 | 177 2a 7 833 eS | LL 7 95 125 42 13 282 2 | Ww 5 80 144 180 28 437° | Ss C 4 41 49 27 14 135 | TABLE LXXXVIILI. Right Middle and Left Middle Fingers. Right Middle Finger. 5 A SIAL, kW. C | Totals oe e eee A 115 94 4 0 2 215 ay UE 84 635 129 25 14 887 wr | LL 8 152 295 70 31 556 ae) W 3 20 65 140 12 240 Ss C 2: 20 23 39 18 102 & SI Totals 212 921 516. 274 77 2000 Biometrika x 61 476 Left Thumb. Left Little Finger. Left Ring Finger, Left Index. Association of Finger-Prints TABLE LXXXIX. Right Middle and Left Ring Fingers. Right Middle Finger. 
A SL | LL W C Totals A 49 16 1 0 0 66 SL 110 398 | 59 8 8 583 LIL 39 346 | 241 63 23 712 W 9 102 165 184 31 491 C 5 59 |. 50 19 15 148 Totals | 212 921 516 274 77 2000 TABLE XC. Right Middle and Left Inttle Fingers. Right Middle Finger. SL Totals 13 35 959 289 260 137 47 768 30 57 56 3 150 18 3l 28 8 88 2000 921 516 274 77 TABLE XCI. Right Ring Finger and Left Thumb. Right Ring Finger. Totals 17 SL 121 49 547 LL 262 Tes 767 W 3 28 88 199 23 341 C g 130 254 Totals 729 2000 TABLE XCII. Right Ring Finger and Left Index. Right Ring Finger. 4 ae aa w | @ | Totals 313 833 282 437 135 2000 Left Thumb. Left Ring Finger. Left Middle Finger. Left Little Finger. Right Ring and Left Middle Fingers. Right. Ring Finger. H. Waire TABLE XCIII. | Ay ese eet ye Ce “1 Totals | 47 122 SS | Peg) race et 215 15 886 meas |e ais 83 887 1 24 | 204 | 266 61 556 0 5 96 |), 1949/15 240 0 2 1 77 12 102 63 | 489 | 543 | 729 | 176 } 2000 TABLE XCIV. Right Ring and Left Ring Fingers. Right Ring Finger. A | SZ | LL | w | Cc | Totals | 66 | 583 | 712 491 Totals 148 TABLE XCV. Right Ring and Left Little Fingers. Right Ring Finger. | A SL LL W Cc Totals A 17 15 3 0 0) 35 SL 42 399 250 194 | 74 959 LL 4 68 276 339 81 768 W 0 5 9 125 11 150 C 0) 2 5 71 10 88 Totals 63 489 543 729 176 [ 2000 TABLE XCVI. Right Little Finger and Left Thumb. Right Little Finger. in soi W Totals 91 547 767 341 254 61—2 477 4 ( 8 Association of Finger-Prints TABLE XCVII. Right Little Finger and Left Index. Right Little Finger. pA uate ST | LL Wo} C€ | Totals | a 22 | 204 76 6 5 S130 .| 3 8 477 | 269 44 35 833 | aa 0 76 | 155 34 17 282. 2 1 73) |e 198e) a128 37 437 || o 0 40 66 16 13 seul 4 eer | ‘ = | | Totals} 31 | 870 | 764 | 228 | 107 | 2000 | TABLE XCVIII. Right Inttle and Left Middle Fingers. Right Little Finger. 
5 ASM asia eee W | @ | Totals | ee 3 —— a A 17 5) OSes 2 3 215 ie Ys 12 507-291 42 35 887 sally ea) 0 152-301 71 32 556 Sy) 2 al eo a4 26 240 sl 0 14 48 29 11 102 = | | | 3 Totals | 31 870 | 764 | 228 | 107 | 2000 TABLE XCIX. Right Inttle and Left Ring Fingers. | Right Little Finger. i SEE W C | Totals a | | = 50 0 1 66 os 440 10 9 13 583 op P 255 | 398 30 27 712 = G0. | ST 78e leas 50 491 pa 1 36 79 16 16 148 2 o | | =a) 31 870 | 764 | 928.0!) 107 TABLE OC. Right Little and Left Little Fingers. Right Little Finger. 2 | | 4 | sz | zz | w | @ | Totals 2 3 : = A 18 17 Oy | 0 0 35 3 SL 11 ol a6 21 30 959 = LL 1 115 | 543 74 35 768 = W 1 13 22 99 15 150 4 C 0 4 23 34 27 88 cs Ss | ee | Totals | 31 870 | 764 | 928 | 107 | 2000 i ON THE PROBLEM OF SEXING OSTEOMETRIC -MATERIAL By KARL PEARSON, FE.RS. It is well known that anthropometric, particularly craniometric measurements give frequency series, which for moderate sized populations follow closely the normal or Laplace-Gaussian distribution. Measurements of stature, cubit head-length, cephalic index, etc., etc., obey with sufficient accuracy for most purposes of science the normal law. This statement may with a high degree of certainty be extended to practically almost all measurements on the adult skeleton. But a new difficulty arises in dealing with the parts of the skeleton: the sexing of the several bones of the human body is by no means certain, and this is especially the case when we come to deal—not with the cranium or the pelvis but with the long bones. In order to get over this difficulty, and to find the constants for each sex, it occurred to me some years back when the sexing of the long bones had presented this problem very forcibly to workers in my laboratory, that the method of my first contribution to the mathematical theory of evolution* might be applied. 
Namely, we might take the unsexed material and assume it to consist of a compound of male and female data, the frequency curve for each of these being normal; the two components might then be found in the manner of the paper just referred to. The method was especially likely to be successful when the series was otherwise homogeneous, the numbers large and the character dealt with substantially differentiated sexually. Of course the method does not give the sex of each individual bone, but I have shown in another memoir†, how four to six characters thus resolved form a basis for determining the probable sex of each bone, and this with an accuracy which is very probably as great as, or even greater than, anatomical appreciation unbased on a system of numerical measurement.

One of the few objections to the method is the labour involved in the process. While the analysis required in the application of the method is not so severe that it has not been applied in a large number of cases by workers in the Biometric Laboratory, it is still considerably beyond the powers of most of the present workers in anthropometry, and probably no anatomist of the present day has the mathematical knowledge requisite for the solution of the reducing nonic, or the arithmetical patience required for the calculation of its coefficients.

* Phil. Trans. Vol. 185, A, pp. 71-110.
† To appear in the next number of this Journal.

It has occurred to me, however, that the work might be considerably shortened by the following considerations. The bones usually dealt with are those found in ancient cemeteries, in plague pits, clearance pits or crypts. It is probable, though by no means certain, that adult female bones in such cases would be rather more numerous than male.
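On the other hand, being somewhat smaller, they are asserted by some writers as likely to be more frequently broken, and they certainly may more readily escape preservation or measurement. If we take these two causes as counteracting each other, we may assume as a first approximation that the numbers of male and female bones will be equal. In the next place it is a result of much anthropometric experience that male and female variations, i.e. their standard deviations, are closely alike. These again we can take equal to a first approximation. Accordingly, to this first approximation, our osteometric series may be considered to consist of two equal normal components with different means.

Let the mean of the unsexed material be $M$, and let the actual means of the sexed components be $m_1$, $m_2$, their standard deviations be $\sigma_1$, $\sigma_2$, and their total frequencies $n_1$ and $n_2$, where the subscript 1 refers, say, to the males, and 2 to the females. Then $m_1$, $m_2$, $\sigma_1$, $\sigma_2$, $n_1$ and $n_2$ are the quantities we desire to discover. Let the moment-coefficients of the total material be, in the usual notation, $\mu_2$, $\mu_3$, $\mu_4$, $\mu_5$ and let $N\,(= n_1 + n_2)$ be the total unsexed population. We shall write as customary

$$\beta_1 = \mu_3^2/\mu_2^3, \qquad \beta_2 = \mu_4/\mu_2^2, \qquad \beta_3 = \mu_3\mu_5/\mu_2^4.$$

Then, if our hypothesis be correct and the material consist very nearly of two equal normal distributions, $\beta_1$ and $\beta_3$ ought to be very small, while $\beta_2$ will be large in relation to them. It is convenient also to write:

$$\epsilon_1 = \tfrac{1}{2}(3 - \beta_2), \qquad \epsilon_2 = 10\beta_1 - \beta_3 \qquad \text{(i)},$$

$$m_1 = M + \gamma_1, \qquad m_2 = M + \gamma_2 \qquad \text{(ii)},$$

$$q_2 = \gamma_1\gamma_2/\mu_2, \qquad q_1 = (\gamma_1 + \gamma_2)/\sqrt{\mu_2} \qquad \text{(iii)},$$

and to note that $\gamma_1$, $\gamma_2$ are the roots of

$$\gamma^2 - q_1\sqrt{\mu_2}\,\gamma + q_2\mu_2 = 0 \qquad \text{(iv)}.$$

Then the fundamental nonic may be written:

$$q_2^9 - 7\epsilon_1 q_2^7 + \tfrac{3}{2}\beta_1 q_2^6 - 3(\epsilon_2 - 5\epsilon_1^2)\,q_2^5 - \left(37\beta_1\epsilon_1 + \tfrac{3}{4}\frac{\epsilon_2^2}{\beta_1}\right) q_2^4 + 3(4\beta_1^2 - 3\epsilon_1\epsilon_2 - 3\epsilon_1^3)\,q_2^3 + 3(\beta_1\epsilon_2 - \tfrac{7}{2}\beta_1\epsilon_1^2)\,q_2^2 + 8\beta_1^2\epsilon_1 q_2 - \tfrac{1}{3}\beta_1^3 = 0 \qquad \text{(v)}.$$

Further:

$$q_1 = \frac{\sqrt{\beta_1}\,\left\{\beta_1 - 6\epsilon_1 q_2 - \tfrac{3}{2}\dfrac{\epsilon_2}{\beta_1}\, q_2^2 - 4 q_2^3\right\}}{q_2\,\{2\beta_1 - 3\epsilon_1 q_2 + q_2^3\}} \qquad \text{(vi)},$$

where the sign of $\sqrt{\beta_1}$ is determined by that of $\mu_3$. Again

$$n_1 = -\frac{N\gamma_2}{\gamma_1 - \gamma_2}, \qquad n_2 = \frac{N\gamma_1}{\gamma_1 - \gamma_2} \qquad \text{(vii)}.$$

Lastly:

$$\sigma_1^2 = \mu_2(1 + q_2) - \tfrac{1}{3}\frac{\mu_3}{\gamma_2} - \tfrac{1}{3}\sqrt{\mu_2}\,q_1\gamma_1, \qquad \sigma_2^2 = \mu_2(1 + q_2) - \tfrac{1}{3}\frac{\mu_3}{\gamma_1} - \tfrac{1}{3}\sqrt{\mu_2}\,q_1\gamma_2 \qquad \text{(viii)}.$$

Equations (ii), (iii), (iv), (v), (vi), (vii) and (viii) form the complete solution of the problem when we make no approximations whatever*. If, however, $\beta_1 = \beta_3 = 0$, then, the two components being equal, we have†:

$$n_1 = n_2 = \tfrac{1}{2}N, \qquad \gamma_1 = -\gamma_2 = \sqrt{\mu_2}\,\epsilon_1^{1/4}, \qquad \sigma_1 = \sigma_2 = \sqrt{\mu_2}\,\{1 - \sqrt{\epsilon_1}\}^{1/2} \qquad \text{(ix)}.$$

It will be seen that it is needful in order that the solution may be real that $\epsilon_1$ should be positive or $\beta_2 < 3$, i.e. the total frequency should be platykurtic.

Now let us suppose that the values given by (ix) are a first approximation and that we need a second approximation in which the two normal curves will be unequal in frequency, mean and standard deviation. Write:

$$n = \tfrac{1}{2}N, \qquad \gamma = \sqrt{\mu_2}\,\epsilon_1^{1/4}, \qquad \sigma = \sqrt{\mu_2}\,(1 - \sqrt{\epsilon_1})^{1/2} \qquad \text{(ix)}',$$

and suppose:

$$n_1 = n + \delta n_1, \quad n_2 = n + \delta n_2, \quad \gamma_1 = \gamma + \delta\gamma_1, \quad \gamma_2 = -\gamma + \delta\gamma_2, \quad \sigma_1 = \sigma + \delta\sigma_1, \quad \sigma_2 = \sigma + \delta\sigma_2,$$

where the differentials represent small quantities of which the squares and products may be neglected to a second approximation. Our equations are‡:

$$n_1 + n_2 = N,$$
$$n_1\gamma_1 + n_2\gamma_2 = 0,$$
$$n_1(\gamma_1^2 + \sigma_1^2) + n_2(\gamma_2^2 + \sigma_2^2) = N\mu_2,$$
$$n_1(\gamma_1^3 + 3\gamma_1\sigma_1^2) + n_2(\gamma_2^3 + 3\gamma_2\sigma_2^2) = N\mu_3,$$
$$n_1(\gamma_1^4 + 6\gamma_1^2\sigma_1^2 + 3\sigma_1^4) + n_2(\gamma_2^4 + 6\gamma_2^2\sigma_2^2 + 3\sigma_2^4) = N\mu_4,$$
$$n_1(\gamma_1^5 + 10\gamma_1^3\sigma_1^2 + 15\gamma_1\sigma_1^4) + n_2(\gamma_2^5 + 10\gamma_2^3\sigma_2^2 + 15\gamma_2\sigma_2^4) = N\mu_5.$$

* They are, in a somewhat better form, those originally given by me in Phil. Trans. Vol. 185, A, 1894, pp. 71-110; see Equations (14), (15), (18), (19), (27) and (29) of that memoir.
† Loc. cit. footnote, p. 91.
‡ Loc. cit. p. 182.

We now differentiate these and after differentiation put $n_1 = n_2 = n$, $\gamma_1 = -\gamma_2 = \gamma$, $\sigma_1 = \sigma_2 = \sigma$. Hence we find:

$$\delta n_1 + \delta n_2 = 0 \qquad \text{(x)},$$
$$n(\delta\gamma_1 + \delta\gamma_2) + 2\gamma\,\delta n_1 = 0 \qquad \text{(xi)},$$
$$2n\gamma(\delta\gamma_1 - \delta\gamma_2) + 2n\sigma(\delta\sigma_1 + \delta\sigma_2) = 0 \qquad \text{(xii)},$$
$$3n(\delta\gamma_1 + \delta\gamma_2)(\gamma^2 + \sigma^2) + 2\,\delta n_1\,\gamma(\gamma^2 + 3\sigma^2) + 6n\gamma\sigma(\delta\sigma_1 - \delta\sigma_2) = N\mu_3 \qquad \text{(xiii)},$$
$$\gamma(\gamma^2 + 3\sigma^2)(\delta\gamma_1 - \delta\gamma_2) + 3\sigma(\gamma^2 + \sigma^2)(\delta\sigma_1 + \delta\sigma_2) = 0 \qquad \text{(xiv)},$$
$$n(\delta\gamma_1 + \delta\gamma_2)(5\gamma^4 + 30\gamma^2\sigma^2 + 15\sigma^4) + 2\,\delta n_1\,\gamma(\gamma^4 + 10\gamma^2\sigma^2 + 15\sigma^4) + n(\delta\sigma_1 - \delta\sigma_2)\,20\gamma\sigma(\gamma^2 + 3\sigma^2) = N\mu_5 \qquad \text{(xv)},$$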
where it must be remembered that the differential terms are introduced solely to account for the asymmetry as represented by $\mu_3$ and $\mu_5$, assumed to be zero to a first approximation. But (xii) and (xiv) show us that we must have:

$$\delta\gamma_1 = \delta\gamma_2, \qquad \delta\sigma_1 = -\delta\sigma_2.$$

Hence from (xi):

$$\delta n_1 = -n\,\delta\gamma_1/\gamma \tag{xvi}$$

(xiii) now becomes:

$$2\gamma^2\,\delta\gamma_1 + 6\gamma\sigma\,\delta\sigma_1 = \mu_3,$$

and (xv):

$$4\gamma^2\,\delta\gamma_1(\gamma^2 + 5\sigma^2) + 20\gamma\sigma(\gamma^2 + 3\sigma^2)\,\delta\sigma_1 = \mu_5.$$

Whence solving we find:

$$\delta\gamma_1 = \delta\gamma_2 = \gamma\left\{\frac{5}{4}\left(1 + 3\frac{\sigma^2}{\gamma^2}\right)\frac{\mu_3}{\gamma^3} - \frac{3}{8}\frac{\mu_5}{\gamma^5}\right\} \tag{xvii}$$

$$\delta\sigma_1 = -\delta\sigma_2 = -\sigma\,\frac{\gamma^2}{\sigma^2}\left\{\frac{1}{4}\left(1 + 5\frac{\sigma^2}{\gamma^2}\right)\frac{\mu_3}{\gamma^3} - \frac{1}{8}\frac{\mu_5}{\gamma^5}\right\} \tag{xviii}$$

$$\delta n_1 = -\delta n_2 = -\frac{n}{\gamma}\,\delta\gamma_1 \tag{xix}$$

These form together with (ix)$'$ the complete solution of the problem.

The following example illustrates the procedure: 541 measurements were made of the bicondylar width of English femora, right and left, male and female being mixed. The frequency below resulted.

Frequency Distribution of 541 Femora for Bicondylar Width.

mm.   Frequency     mm.   Frequency     mm.   Frequency
61      1           71      23          81      28
62      1           72      33.5        82      23
63      1.5         73      25          83      19
64      5           74      22          84      17.5
65      13.5        75      36          85      19.5
66      14          76      25.5        86      16.5
67      15.5        77      29.5        87      7.5
68      22          78      32.5        88      3
69      31          79      19.5        89      3.5
70      19          80      33          90      0.5

The constants of this distribution were:

$$M = 75.8152, \quad \mu_2 = 37.692112, \quad \mu_3 = -2.587693, \quad \mu_4 = 3020.898695, \quad \mu_5 = -83.260992.$$

Hence we deduce:

$$\beta_1 = .000125047, \quad \beta_3 = .000106750, \quad \beta_2 = 2.126349, \quad \epsilon_1 = .4368255, \quad \epsilon_2 = .00114372.$$

Clearly $\beta_1$ and $\beta_3$ are so small that the distribution fulfils our condition of being very closely symmetrical. The nonic, equation (v) above, is:

$$q_2^9 - 3.057789\,q_2^7 + .00018757\,q_2^6 + 2.8588174\,q_2^5 - .009867\,q_2^4 - .754678\,q_2^3 - .000250114\,q_2^2 + .0000000546\,q_2 - .000000000002 = 0,$$

the last two terms being written down to many figures to show their inappreciableness. The root required is:

$$q_2 = -.65679,$$

which by (vi) leads to:

$$\gamma^2 + .5580507\,\gamma - 24.755802 = 0,$$

and provides the solution:

                        Females.       Males.
Mean:                   70.547 mm.     80.526 mm.
Total Frequency:        255.4          285.6          (A)
Standard Deviation:     3.4842 mm.     3.6944 mm.
Modal Ordinate*:        29.24          30.84

* $y_0 = n/(\sqrt{2\pi}\,\sigma)$ of the normal curve.

We have now to inquire how far the same result would be reached, if we had supposed as a first approximation equal Gaussian components and then proceeded to determine a second approximation by aid of (xvii) to (xix). Equations (ix) give us:

$$n_1 = n_2 = n = 270.5, \qquad \gamma_1 = -\gamma_2 = \gamma = 4.9912, \qquad \sigma_1 = \sigma_2 = 3.5750.$$

Thus to a first approximation:

                        Females.       Males.
Mean:                   70.824 mm.     80.806 mm.
Total Frequency:        270.5          270.5          (B)
Standard Deviation:     3.5750 mm.     3.5750 mm.
Modal Ordinate:         30.19          30.19

(B), statistically speaking, is so close to (A) that it gives every confidence of a second approximation practically reproducing (A). We find:

$$\frac{\mu_3}{\gamma^3} = -.0208112, \qquad \frac{\mu_5}{\gamma^5} = -.0260386, \qquad \frac{\sigma^2}{\gamma^2} = .5130374, \qquad \frac{\gamma^2}{\sigma^2} = 1.949228.$$

Hence by (xvii) to (xix):

$$\delta\gamma_1 = \delta\gamma_2 = -\gamma \times .056308 = -.2810,$$
$$\delta\sigma_1 = -\delta\sigma_2 = \sigma \times .029809 = .1066,$$
$$\delta n_1 = -\delta n_2 = n \times .056308 = 15.231.$$

It will be seen from these results that:

$$\frac{\delta\gamma_1}{\gamma} = -.0563, \qquad \frac{\delta\sigma_1}{\sigma} = .0298, \qquad \frac{\delta n_1}{n} = .0563,$$

may be considered fairly small quantities, and that they justify our assumption. We have accordingly:

                        Females.       Males.
Mean:                   70.543 mm.     80.525 mm.
Total Frequency:        255.27         285.73         (C)
Standard Deviation:     3.4684 mm.     3.6816 mm.
Modal Ordinate:         29.36          30.96

It is clear that the solutions (C) and (A) are for all practical purposes identical. Thus the short method is justified in the problem of sexing osteometric material. An improper extension of the method to material in which the sexes occur in very unequal groups may be guarded against by simply observing whether $\beta_1$ and $\beta_3$ are very small quantities.

In conclusion it may be desirable to compare the values of these sex-constants as found mathematically with sexing by anatomical appreciation. I owe an anatomical sexing of the same bones to my colleague, Dr Derry. The following values of the constants resulted:

                        Females.       Males.
Mean:                   70.098 mm.     79.764 mm.
Total Frequency:        221            320            (D)
Standard Deviation:     3.5148 mm.     4.1254 mm.
Modal Ordinate:         24.55          30.95

It will be seen that the mathematically deduced constants are not widely divergent from those obtained anatomically, but the accordance, if fair, is not ideal. The accompanying diagram exhibits the differences in the frequency distributions found by the two methods of sexing. The chief difference lies in the transfer by the anatomist of the larger female bones of the mathematical sexing to the male group.

[Diagram: London 17th Century Femora, Bicondylar Width. Comparison of Mathematical and Anatomical Sexing. Abscissa: bicondylar width in mm.; ordinate: frequency. Mathematical sexing: Gaussian for females (AA), Gaussian for males (BB) and the combined curve. Anatomical sexing: female bones, male bones and the combined data.]

I do not propose to discuss here the relative advantages of the two methods, but would draw attention to a few points of interest:

(i) The solution (D) makes no appeal to measurement in the sexing; it is based purely on an anatomical appreciation. It would therefore be subject to personal equation, depending on the features upon which the experience of the individual anatomist leads him to lay most stress. The solution (C) is unique, that is to say, given the same data, all statisticians would reach the same values, of course apart from errors in arithmetic or from the number of decimal places retained in the working. It eliminates the factor of personal equation.

Frequency Distributions of Bicondylar Width in Male and Female Femora sexed by Anatomical Appreciation.

mm.   ♀      ♂       mm.   ♀      ♂       mm.   ♀      ♂
61    1      -       71    20.5   2.5     81    -      28
62    1      -       72    29.5   4       82    -      23
63    1.5    -       73    16     9       83    -      19
64    5      -       74    11.5   10.5    84    1      16.5
65    13.5   -       75    10     26      85    -      19.5
66    14     -       76    7      18.5    86    -      16.5
67    15.5   -       77    2.5    27      87    -      7.5
68    22     -       78    1      31.5    88    -      3
69    31     -       79    1      18.5    89    -      3.5
70    15.5   3.5     80    1      32      90    -      0.5

(ii) (C) would, however, be influenced by the fact that our material is not perfectly homogeneous except for sex; because (a) there is a mixture of right and left bones, and, to judge by the anatomical sexing, this may involve a difference of .7 to .9 mm. in the means and .08 to .24 mm. in the standard deviations; this would add to the heterogeneity, (b) our bones may be due to somewhat mixed classes and possibly mixed periods, (c) the bicondylar width is liable to be injured by rough treatment of the bone, and this injury will most affect the weaker, and therefore probably the younger, bones. These bones might then be treated as female, a classification which most anatomical sexing also favours. While the total number of these London femora is nearly 800, the bicondylar width could only be measured in 541 cases. This selection will not necessarily be random as to size or sex, and may modify our constants found mathematically from the distribution. On the other hand it would affect also the anatomical appreciation of sex, but only in as far as it was based on the size of the condyles.

(iii) We know from very considerable sexed data that the variation of man and woman is very nearly the same. The coefficients of variation measured in the usual way, i.e. by 100 standard deviation divided by mean, gave:

                        ♀        ♂        Δ
Mathematical Sexing:    4.92     4.57     +.35
Anatomical Sexing:      5.01     5.17     -.16

There was thus closer sexual accord from the anatomical method.
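The arithmetic of solutions (B) and (C), and of the coefficients of variation just quoted, can be retraced with a short computation. What follows is a minimal sketch in Python and no part of the original memoir: the function names are illustrative, the moments are those printed in the text, and the second approximation is obtained by solving directly the two linear equations to which (xiii) and (xv) reduce. Differences from the printed figures in the last decimal places are to be expected.

```python
from math import sqrt

# Moment constants of the 541 bicondylar widths, as printed in the text.
N = 541.0
M = 75.8152
MU2, MU3, MU4, MU5 = 37.692112, -2.587693, 3020.898695, -83.260992


def first_approximation(mu2, mu4):
    """Equal Gaussian components, equations (ix): returns (gamma, sigma)."""
    eps1 = 0.5 * (3.0 - mu4 / mu2 ** 2)       # eps1 = (3 - beta2) / 2
    if eps1 <= 0.0:
        raise ValueError("beta2 must be below 3 (platykurtic total frequency)")
    gamma = sqrt(mu2) * eps1 ** 0.25
    sigma = sqrt(mu2) * sqrt(1.0 - sqrt(eps1))
    return gamma, sigma


def second_approximation(n_total, mean, mu2, mu3, mu4, mu5):
    """Unequal components, to first order in the small corrections.

    Solves   2 g^2 dg + 6 g s ds                             = mu3
             4 g^2 (g^2 + 5 s^2) dg + 20 g s (g^2 + 3 s^2) ds = mu5
    for dg (the common shift of both means) and ds (the change in the
    standard deviation of component 1, here the males).
    """
    g, s = first_approximation(mu2, mu4)
    n = 0.5 * n_total
    a11, a12 = 2 * g ** 2, 6 * g * s
    a21 = 4 * g ** 2 * (g ** 2 + 5 * s ** 2)
    a22 = 20 * g * s * (g ** 2 + 3 * s ** 2)
    det = a11 * a22 - a12 * a21
    dg = (mu3 * a22 - a12 * mu5) / det        # Cramer's rule
    ds = (a11 * mu5 - a21 * mu3) / det
    dn = -n * dg / g                          # equation (xvi)
    males = {"mean": mean + g + dg, "sd": s + ds, "n": n + dn}
    females = {"mean": mean - g + dg, "sd": s - ds, "n": n - dn}
    return males, females


def coefficient_of_variation(sd, mean):
    """100 x standard deviation / mean."""
    return 100.0 * sd / mean


if __name__ == "__main__":
    print("first approximation (gamma, sigma):", first_approximation(MU2, MU4))
    for label, c in zip(("males", "females"),
                        second_approximation(N, M, MU2, MU3, MU4, MU5)):
        print(label, {k: round(v, 3) for k, v in c.items()},
              "C.V.", round(coefficient_of_variation(c["sd"], c["mean"]), 2))
```

Solving the pair of linear equations numerically, rather than coding the closed forms, keeps the sketch independent of any particular way of writing (xvii) and (xviii).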
But when the same anatomical sexing was applied to the character of the head of the femur in the vertical plane, I found for right bones:

♀ 5.05, ♂ 6.37, Δ = -1.32,

and for left bones:

♀ 4.91, ♂ 6.10, Δ = -1.19,

differences far greater than occur in the mathematical sexing from the bicondylar widths. Accordingly no great stress can be laid on inequalities in the coefficients of variation deduced from either process of sexing.

It would appear to me that we have reached on the whole a reasonable biometric method of sexing. To what extent it can replace the sexing by anatomical appreciation must be left to the future. But it is clear that when anatomists themselves prefer to that appreciation an appeal to a single character, e.g. to the measurement of the femoral head, and only settle by anatomical appreciation the sex of femora with diameters between 45 and 47 mm., then they do not show much confidence in their own method of sexing. An interesting experiment could be made if some 400 to 500 sexed bones were available, and then, without knowledge of the real sex, two or three anatomists and a statistician were to be asked independently to determine the mean and variability of two or three characters of the bones of each sex in this material.

I have cordially to acknowledge the help of my colleague Mr H. E. Soper in the determination of equations (xvii)-(xix) and in their solution (C) in the numerical case for which I had reached the solution (A); also the labour of my colleague Miss H. Gertrude Jones in the preparation of the diagram which contrasts graphically the mathematical and anatomical solutions of the problem.

FURTHER EVIDENCE OF NATURAL SELECTION IN MAN.

By ETHEL M. ELDERTON, Galton Research Fellow, AND KARL PEARSON, F.R.S.
(1) The second author of the present paper, writing in 1894 a commentary on the statement that “no man, as far as we know, has ever seen natural selection at work,” remarked: “Every man who has lived through a hard winter, every man who has examined a mortality table, every man who has studied the history of nations has probably seen natural selection at work*.” The emphasis is here to be laid on the word “probably,” because the seeing depends on the power and validity of the scientific means adopted to analyse the observed facts. In a paper communicated by the same author to the Royal Society in June 1912†, it was shown from the Registrar-General's series of ten yearly life-tables that when allowance was made for change of environment in the course of the fifty years a very high association existed between the deaths in the first year of life and the deaths in childhood (1 to 5 years). This association was such that if the infantile deathrate increased by 10 % the child deathrate decreased by 5.3 % in males, while in females the fall in the child deathrate was almost 1 % for every rise of 1 % in the infantile deathrate. The method of investigating by life-tables could not be extended beyond 1900, because the life-tables for the next ten years (1901-1910) were not then out, and indeed have only just appeared (December 1914). While the infantile deathrate as shown from the life-tables had risen from 1871-1900, the child deathrate had fallen for the same period. During the next decade 1900-1910 both deathrates have fallen together; such a secular change does not in any way modify the argument of the paper, which lies in the statement that whether two deathrates rise together or rise and fall simultaneously we can draw no inferences at all, until they have been corrected for secular change. Most economic, demographic and physical variates are changing continuously with time, and no comparison of time graphs or calculation of correlations will demonstrate of necessity anything but spurious association, until the time factor has been eliminated. It is the deviations from the continuous curves of secular change which may turn out on careful analysis to be truly indicative of causal relationship between the variates under consideration.

* The Chances of Death and other Studies in Evolution, Vol. I. p. 166.
† “The Intensity of Natural Selection in Man.” R. S. Proc. B. Vol. 85, pp. 469-476.

The first attempt to get rid of secular change by a method of differences was made by Miss F. E. Cave in 1904 in a paper on barometric correlations*, and shortly afterwards Mr R. H. Hooker published a paper dealing with the same point†. Both these authors used only first differences and gave no general theory of the method. Quite recently “Student” has published a paper‡ giving the fundamental formulae, and indicating how by taking successive differences of two variates and correlating them, we free ourselves from the time or locality influence, and approach the true and probably causal relationship between them. When the correlation of the differences becomes steady, then we have reached the actual correlation of the variates corrected for the time factor, provided an assumption is made which we shall discuss at greater length below: see footnote, p. 495. Meanwhile Dr Anderson of Petrograd has been working on the subject, and in a most valuable memoir§ he has added to “Student's” results a number of new theorems; for example, the probable errors of the successive difference correlations when they become steady, and the relations which should be fulfilled between the squares of the standard deviations of successive differences, when the series has become steady. We have thus a double means of ascertaining whether the desired object, the elimination of the time-factor, has been approximately achieved. A third additional test will be indicated in this paper.
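The second of these checks rests on a property of purely random series which is readily verified: if the X's are uncorrelated and of common variance, the variance of their sth differences is the binomial coefficient C(2s, s) times that variance, so that the ratio of the squared standard deviations of the (s + 1)th and sth differences should settle down to (4s + 2)/(s + 1). The Python sketch below tabulates these limiting ratios. It illustrates the stated property only and is not a transcript of Dr Anderson's own formulae; the function names are ours.

```python
from math import comb


def difference(series, order):
    """Return the differences of the given order (each term minus its predecessor)."""
    out = list(series)
    for _ in range(order):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def variance_factor(order):
    """Variance of the order-th difference of an uncorrelated series,
    expressed in units of the variance of the series itself."""
    return comb(2 * order, order)


def limiting_ratio(order):
    """var(delta_{order+1} X) / var(delta_order X) for an uncorrelated series."""
    return variance_factor(order + 1) / variance_factor(order)


if __name__ == "__main__":
    for s in range(1, 7):
        print(s, variance_factor(s), round(limiting_ratio(s), 4))
```

When the observed ratios of successive squared standard deviations have come close to these values, the differencing may be taken to have removed the secular part of the series.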
This new statistical process has been termed the Variate Difference Correlation Method||, and there is small doubt that it is the most important contribution to the apparatus of statistical research which has been made for a number of years past. Its field of application to physical problems alone seems inexhaustible. We are no longer limited to the method of partial correlation, nor compelled to seek for factors which rendered constant will remove the changing influence of environment. In the present case, that of the influence of infantile mortality on child mortality, Pearson endeavoured to eliminate the influence of continual environmental improvement by making the expectation of life at six years constant¶. Snow achieved the same object by correlating the deathrates of one sex for a constant deathrate of the other**. In both these cases substantial evidence of Natural Selection was obtained from the mortality tables. The object of the present paper is to demonstrate by the still more complete elimination of the time factor involved in the variate difference correlation method that a selective deathrate plays even in highly civilised states a marked part in the natural history of man.

* R. S. Proc. Vol. LXXIV. pp. 407 et seq.
† Royal Statistical Society Journal, Vol. LXVIII. pp. 396 et seq. 1905.
‡ Biometrika, Vol. X. pp. 179, 180.
§ Ibid. pp. 269-279.
|| Pearson and Cave: “Numerical Illustrations of the Variate Difference Correlation Method.” Biometrika, Vol. X. pp. 340-355.
¶ R. S. Proc. B. Vol. 85, p. 472.
** “The Intensity of Natural Selection in Man.” Drapers' Company Research Memoirs, Dulau & Co., 1911.
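The manner in which differencing removes a common secular trend may be seen on artificial data. The Python sketch below is an illustration only: the two series are simulated (a common smooth trend plus chance deviations constructed to have a correlation of -0.7), none of the figures are taken from the Registrar-General's returns, and the helper names are hypothetical. The raw series correlate highly and positively, while their higher differences give a negative correlation in the neighbourhood of that of the chance parts.

```python
import random
from math import sqrt


def difference(series, order):
    """Return the differences of the given order (each term minus its predecessor)."""
    out = list(series)
    for _ in range(order):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def correlation(xs, ys):
    """Product-moment correlation of two equally long series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def simulate(n=300, rho=-0.7, seed=1915):
    """Two series sharing a smooth falling trend, with chance parts correlated rho."""
    rng = random.Random(seed)
    xs, ys = [], []
    for t in range(n):
        trend = 100.0 - 0.3 * t + 0.0005 * t * t     # assumed secular change
        e = rng.gauss(0.0, 1.0)
        f = rho * e + sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        xs.append(trend + e)
        ys.append(0.5 * trend + f)
    return xs, ys


if __name__ == "__main__":
    x, y = simulate()
    print("raw series:", round(correlation(x, y), 3))
    for s in range(1, 7):
        r = correlation(difference(x, s), difference(y, s))
        print("difference", s, ":", round(r, 3))
```

A quadratic trend is annihilated exactly by the third and higher differences, which is why the later correlations in such a run depend on the chance parts alone.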
(2) The material dealt with in this investigation consists of the Registrar-General's returns for births in England and Wales and of deaths in the first five years of life from 1859 to 1908, with the addition of as many years before 1859 as were requisite to make our highest differences fifty in number, and with the addition of as many years after 1908 as were requisite for following up the births of that year to the fifth year of life. Thus actually our data extended from 1850 to 1912. The reason for this procedure lies in the desirability of using a constant population, and not reducing by one a relatively small number like 50 on each differencing. As a result of this process we had to modify Dr Anderson's values for the probable errors for the steady values of the difference correlations, because in our case the size of the population does not change as we proceed to higher differences*. The second cause which requires extension of the data is a very important one, and must be illustrated numerically. Consider the table:

Deaths of those born in a given year.

Year    Male Births    0-1       1-2       2-3      3-4      4-5
1908    478,410        63,594    -         -        -        -
1909    -              -         14,146    -        -        -
1910    -              -         -         5,020    -        -
1911    -              -         -         -        3,449    -
1912    -              -         -         -        -        [illegible]

Now the deaths of infants 0-1 in 1908 are not necessarily of infants all born in 1908, but the total deaths 63,594 must represent closely the deaths in the 478,410 infants born in that year. Disregarding immigration and emigration, this gives a deathrate per 1000 of 132.928 and leaves 414,816 children alive. Of this group 14,146 may be taken to die in the second year of life, giving a deathrate of 34.102 per mille. There remain 400,670 children who reach the third year of life in 1910, of whom 5,020 die, giving a deathrate of 12.529, and 395,650 survivors. These survivors are followed into 1911 and 1912 in the same manner, and thus we obtain approximately the deathrate up to the fifth year of the male children born in 1908.
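The bookkeeping just described is easily mechanised. The following Python sketch (an illustration, with a function name of our own choosing) follows the 1908 birth group of the table through its first four years of life, using only the counts printed there and, as in the text, disregarding migration. The rates it returns agree with the 1908 entries of Table I.

```python
def cohort_deathrates(births, deaths_by_year_of_life):
    """Deathrates per 1000 of those entering each successive year of life.

    Returns the list of rates and the number of survivors remaining.
    """
    alive = births
    rates = []
    for deaths in deaths_by_year_of_life:
        rates.append(1000.0 * deaths / alive)
        alive -= deaths                      # survivors enter the next year of life
    return rates, alive


if __name__ == "__main__":
    rates, survivors = cohort_deathrates(478_410, [63_594, 14_146, 5_020, 3_449])
    print([round(r, 3) for r in rates], survivors)
    # prints [132.928, 34.102, 12.529, 8.717] 392201
```

The denominators shrink from year to year, so that each rate refers to the same group of children and not to the population living in the calendar year.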
We thus in bulk follow the same group of children through the first five years of life. Tables I and II give the deathrates for males and females respectively under the heading of the birth year of each group. These deathrates have been taken to three decimal places for the purpose of determining the higher differences correctly to one decimal place. The successive differences of these deathrates up to the sixth and, in a few cases, to the tenth were then formed. In our notation $m_r$ is the deathrate in the $r$th year of life, i.e. from $r - 1$ to $r$ years of age, and $\delta_s m_r$ is the $s$th difference of this deathrate. As we have five deathrates for each sex this involves 10 means, 10 standard deviations and 20 correlation coefficients, but as we have used six successive differences these numbers must be multiplied by seven. The calculation of these differences and of upwards of 150 correlation coefficients has meant very strenuous labour. It must, indeed, be admitted that the application of the variate difference correlation method is not, even with small populations, a light task, but the change from the high positive to low negative and then to high negative values of the correlation is of extraordinary interest, and indicates the stages by which the associations are freed from the spurious influence of the time-factor.

* All the probable errors of the difference correlations given in this memoir are these modified Andersonian values, i.e. they are the probable errors on the assumption that the difference correlations have reached steady values.

TABLE I. Deathrates in each Year of Life for groups born in the Year of 1st column. Males.

Year    0-1        1-2        2-3        3-4        4-5
1850    159.781    63.935     34.092     22.138     19.021
1851    168.706    64.977     33.882     26.657     18.209
1852    173.324    63.533     40.936     24.316     16.022
1853    174.808    74.802     35.464     22.298     16.720
1854    170.890    60.598     30.740     21.688     21.329
1855    169.151    59.696     33.004     29.546     19.780
1856    156.756    64.317     39.551     25.594     13.751
1857    168.486    67.928     36.777     19.471     13.923
1858    172.591    68.712     30.566     19.857     16.268
1859    167.106    58.887     31.650     22.325     21.962
1860    162.642    70.401     35.297     30.083     21.325
1861    167.634    65.785     41.390     27.654     16.236
1862    156.684    73.848     37.326     22.013     14.982
1863    163.183    67.537     32.774     21.088     12.552
1864    166.309    66.462     34.408     17.660     15.991
1865    174.356    68.369     28.278     21.473     17.015
1866    173.659    60.603     32.159     22.751     18.105
1867    166.905    63.790     32.731     23.220     16.085
1868    168.064    62.988     32.357     20.185     12.821
1869    169.022    65.716     30.186     17.450     11.624
1870    174.287    62.401     26.760     16.344     16.052
1871    171.840    59.853     25.503     21.635     14.727
1872    162.321    5?.267     30.754     18.731     12.655
1873    163.676    61.415     28.007     17.275     12.502
1874    164.976    60.316     [illeg.]   16.028     13.499
1875    173.145    58.422     25.400     18.848     13.187
1876    160.415    56.088     27.992     17.044     13.263
1877    149.627    63.344     26.104     17.171     11.738
1878    166.266    58.104     27.853     15.280     13.049
1879    149.754    66.188     22.245     16.741     12.115
1880    167.313    48.181     25.909     16.036     11.920
1881    142.532    59.164     23.333     15.261     10.430
1882    153.154    54.365     24.026     14.362     9.416
1883    151.184    58.292     23.015     13.823     11.042
1884    160.381    53.952     22.077     14.792     9.533
1885    151.175    57.960     23.052     13.615     10.073
1886    163.081    55.611     21.009     13.974     10.569
1887    158.243    50.713     22.169     14.394     9.894
1888    150.177    56.882     22.958     13.536     10.150
1889    157.476    56.279     22.338     13.750     10.647
1890    164.757    59.098     22.604     14.551     9.623
1891    163.761    54.255     20.821     12.615     8.757
1892    162.112    52.776     19.442     12.497     10.604
1893    173.333    47.035     20.343     14.121     8.546
1894    149.633    55.787     21.271     11.750     8.439
1895    176.280    51.404     19.535     11.912     8.973
1896    160.989    50.293     18.742     11.802     9.366
1897    170.291    50.986     19.124     12.396     8.608
1898    175.183    48.227     19.380     11.496     8.657
1899    176.606    49.837     17.094     11.565     7.165
1900    168.685    44.728     17.400     9.759      7.553
1901    165.617    42.597     15.570     10.241     7.135
1902    146.791    40.581     16.572     9.587      6.853
1903    144.567    45.517     15.368     9.260      6.949
1904    158.684    38.921     15.234     10.194     6.458
1905    141.193    39.326     15.702     8.893      6.886
1906    144.819    38.037     14.222     8.858      5.655
1907    130.259    36.615     15.159     7.643      6.230
1908    132.928    34.102     12.529     8.717      5.962

TABLE II. Deathrates in each Year of Life for groups born in the Year of 1st column. Females.

Year    0-1        1-2        2-3        3-4        4-5
1850    130.477    62.235     34.147     22.625     19.278
1851    138.306    62.106     33.992     27.087     16.762
1852    142.185    61.812     39.787     23.805     16.148
1853    144.270    71.939     35.186     22.686     16.948
1854    [illeg.]   59.024     30.863     22.226     21.907
1855    [illeg.]   [illeg.]   34.057     29.375     20.501
1856    129.877    61.902     39.758     26.146     14.305
1857    142.203    65.853     36.712     19.990     14.391
1858    143.595    64.513     29.716     20.796     17.089
1859    138.477    55.535     32.024     24.485     21.454
1860    131.914    66.902     36.060     29.941     20.765
1861    137.339    61.860     41.243     27.727     16.287
1862    127.203    70.313     36.543     21.905     15.398
1863    133.321    63.438     32.637     21.623     12.702
1864    138.232    63.395     34.846     18.217     15.547
1865    145.388    66.401     28.484     21.437     17.145
1866    144.879    58.180     32.392     23.086     17.536
1867    137.712    61.641     33.243     23.081     15.653
1868    141.756    58.624     31.892     20.060     12.524
1869    141.450    60.720     31.004     18.108     11.732
1870    144.596    59.845     26.855     16.235     15.196
1871    143.353    56.379     25.062     21.493     13.903
1872    136.453    52.652     30.213     18.454     12.571
1873    134.079    57.739     27.549     16.950     11.421
1874    135.939    57.795     25.498     16.031     13.253
1875    142.730    54.032     24.563     18.848     12.833
1876    131.718    51.057     27.976     16.685     12.651
1877    121.936    59.535     25.135     17.441     11.214
1878    137.958    52.463     27.646     15.186     12.491
1879    120.643    62.324     21.792     16.911     11.880
1880    137.691    45.620     25.319     15.556     11.335
1881    117.221    55.413     22.443     15.359     10.396
1882    127.627    50.057     23.539     14.178     9.586
1883    122.766    54.048     22.948     13.643     10.606
1884    132.701    49.967     21.259     14.913     9.401
1885    123.667    53.415     22.740     13.251     10.030
1886    134.814    51.108     19.985     13.800     10.295
1887    130.689    46.402     21.538     14.594     10.012
1888    122.312    53.369     22.456     13.926     10.146
1889    129.148    53.307     21.783     14.170     10.569
1890    135.839    54.850     21.607     14.731     9.478
1891    132.763    51.567     20.841     12.646     8.900
1892    132.414    49.387     19.174     12.708     10.370
1893    143.346    44.511     19.842     14.293     8.643
1894    123.522    52.267     21.499     12.183     8.300
1895    144.326    49.339     18.643     11.817     8.499
1896    133.535    46.691     18.439     12.184     9.282
1897    140.755    47.998     18.109     12.739     8.642
1898    145.001    44.832     18.436     11.490     8.865
1899    148.000    45.928     16.954     11.724     7.164
1900    139.148    41.901     16.902     9.712      7.611
1901    136.346    39.527     15.156     10.606     7.184
1902    118.479    37.064     16.168     9.891      6.792
1903    118.004    42.270     14.774     9.315      7.615
1904    131.477    36.598     14.136     9.793      6.453
1905    114.641    37.084     15.066     8.710      6.902
1906    119.668    36.006     13.552     9.168      5.513
1907    104.487    33.904     13.789     7.493      6.071
1908    107.495    31.990     11.939     8.546      5.939

(3) All our correlations are given in Table III (p. 497), but it is desirable to discuss in detail certain groups of them. We take first the correlations of the deathrates in successive years. They are:

                     Male.             Female.
$r_{m_1 m_2}$        +.398 ± .080      +.390 ± .081
$r_{m_2 m_3}$        +.859 ± .025      +.864 ± .024
$r_{m_3 m_4}$        +.924 ± .014      +.928 ± .013
$r_{m_4 m_5}$        +.911 ± .016      +.917 ± .015

All these are positive, all are significant and, the first excepted, are very high correlations. There is no significant difference between male and female. The least important is the relation between deaths in infancy and deaths in the first year of childhood. We have in these correlation coefficients the numerical expression of what is obvious in Tables I and II, i.e. as the deathrate in any year of age falls so does the deathrate of the same group in the following year.
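The probable errors attached to these four pairs of correlations agree with the customary expression .67449 (1 - r²)/√n when n is taken as the 50 birth-years. A minimal check in Python follows. That the authors used precisely this expression for the undifferenced deathrates is an inference from the printed figures and not a statement of the text, and the function name is ours.

```python
from math import sqrt


def probable_error_r(r, n):
    """Customary probable error of a correlation coefficient r from n pairs."""
    return 0.67449 * (1.0 - r * r) / sqrt(n)


if __name__ == "__main__":
    male = [0.398, 0.859, 0.924, 0.911]
    female = [0.390, 0.864, 0.928, 0.917]
    for rm, rf in zip(male, female):
        print(f"{rm:+.3f} ± {probable_error_r(rm, 50):.3f}   "
              f"{rf:+.3f} ± {probable_error_r(rf, 50):.3f}")
```

The probable errors of the difference correlations are not reproduced by this expression, being the modified Andersonian values mentioned in the footnote to section (2).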
It is this fact which has led to the erroneous idea that natural selection plays no part in man. The fact, however, simply expresses the continuous change of environment which has been in progress since 1860. During the half-century improved economic conditions, bettered sanitation, and developed medical care have lowered the deathrate at each age*. It is therefore impossible to deduce any argument as to natural selection in man from these correlations until we have removed this continuous influence of the time-factor. This is achieved by the variate difference correlation method. In every case a preliminary examination of Tables I and II shows that the correlation of the first differences of the deathrates of successive years is negative, and as we take higher and higher differences the intensity of this negative correlation increases, until with the sixth differences it reaches to the very substantial value of about -.7. In other words a rise in the deathrate of one year of life means a fall in the deathrate of the following year of a most marked kind. While with the sixth differences we are approaching fairly closely steady values, it may be doubted whether we have reached them in any case but that of $r_{\delta_6 m_4 \cdot \delta_6 m_5}$. The following are the sixth difference correlations in the case of the deathrates of successive years:

                                           Male.             Female.
$r_{\delta_6 m_1 \cdot \delta_6 m_2}$      -.688 ± .090      -.719 ± .081
$r_{\delta_6 m_2 \cdot \delta_6 m_3}$      -.673 ± .092      -.660 ± .095
$r_{\delta_6 m_3 \cdot \delta_6 m_4}$      -.703 ± .085      -.731 ± .078
$r_{\delta_6 m_4 \cdot \delta_6 m_5}$      -.695 ± .087      -.736 ± .077

Again the male and female results are in excellent agreement, and we grasp the startling manner in which the new method reverses a judgment based on relations which have been deduced without any regard to secular change.

* As we have already remarked the infantile deathrate showed little of this improvement till 1905. It was about this same year that the absolute number of births in England and Wales began to decline, so that while the population has increased by something like 3½ millions, that population produces about 76,000 fewer babies annually.

(4) The question naturally arises: How far are these the “steady” values of the difference correlations measuring the organic relation apart from the time-factor of the deathrates in different years of infancy and childhood?

There are three fundamental tests:

(i) The correlation coefficients of successive differences should have ceased to be markedly rising or falling. Table III (p. 497) shows that this is approximately but not absolutely the case, but we have reached a stage in which any further changes are certainly of the order of the probable errors and thus of little significance. The unsteadiness, as will be indicated later in better tests, is greatest in the differences of the deathrates in the first and second years of life. Here the correlations were taken to the seventh and eighth differences and gave:

                                           Male.             Female.
$r_{\delta_7 m_1 \cdot \delta_7 m_2}$      -.696 ± .090      -.729 ± .082
$r_{\delta_8 m_1 \cdot \delta_8 m_2}$      -.692 ± .094      -.731 ± .084

which appear to have reached practical steadiness. Actually the final correlations must be somewhat greater than those obtained from the sixth differences. To push the process further, however, would be of small advantage because higher differences involve introducing earlier data, and the birthrate data before 1855 become more and more unreliable. Again in the extremely high differences, the additional year required for an additional difference, if not appertaining to relatively smooth data, may in itself, when we have only a small total frequency of 50, produce a certain amount of unsteadiness.

(ii) We may consider the mean values of the differences.

If our first variable be taken* as $x = \phi_1(t) + X$, where $X$ is the intrinsic value of $x$ as apart from the time change, then mean $\delta_{r+1}x$ after steadiness has set in is equal to mean $\delta_{r+1}X$, and this (taking, as we have done, ‘backward’ differences) is given (the $C$'s being the usual binomial coefficients) by

$$\text{mean } \delta_{r+1}X = \frac{1}{n}\Big\{(X_n - C_1X_{n-1} + C_2X_{n-2} - \dots \pm C_rX_{n-r}) - (X_0 - C_1X_{-1} + C_2X_{-2} - \dots \pm C_rX_{-r})\Big\}.$$

* One of the bases of the variate difference correlation method lies in the assumption that the intrinsic variation is superposed on a secular change of a continuous character; the causes which determined the intrinsic variation $X$ are supposed to be sensibly independent of the time for the period under consideration. We conceive the secular change as given by a parabola, say, of the $s$th order, but the deviations from this curve are supposed in magnitude and sense to be independent of the time, i.e. due to chance causes which are the same in 1850 as in 1900. This assumption is an important one and must lead to our seeking relatively short periods consistent with a numerical frequency sufficient for significance. It can be roughly tested, of course, by considering $\sigma_X$ as found from, say, the first and second halves of our observations. In our own case we found:

Values of $\sigma_X$ deduced from Sixth Differences for 1st 25, for 2nd 25, and for all 50 years.

                   ($m_1$)         ($m_2$)         ($m_3$)         ($m_4$)         ($m_5$)
                   ♂      ♀        ♂      ♀        ♂      ♀        ♂      ♀        ♂      ♀
1st 25 years       7.32   6.94     5.51   5.61     2.09   2.30     1.52   1.67     1.05   0.91
All 50 years       8.61   7.83     4.71   4.63     1.59   1.77     1.17   1.28     0.86   0.78
2nd 25 years       9.70   8.61     3.73   3.37     0.83   0.98     0.66   0.68     0.63   0.62

These values are less steady than we had originally hoped for. Clearly the variability of the $X$ portion of the infantile deathrate has grown greater, and that of the four child deathrates has grown sensibly smaller with the time. The fundamental hypothesis of the variate difference method is therefore only approximately true for this material. We have made some investigations on the assumption that $x = \phi_1(t) + (a + bt)X$, but the values of $a$ and $b$ obtained were by no means satisfactory. We have in hand a further investigation of the problem by the method, originally suggested by one of us, before the difference method was started; namely to subtract from $x$ the value obtained by the best fitting parabola of the $s$th order in the time and so to reach the actual values of $X$. The relation of these to the time can then be found with some degree of accuracy. To the male deathrates of the second and fourth years of life we applied parabolae of the third order in the time, and obtained excellent fits; we then subtracted the ordinates of these parabolae from the deathrates and correlated the remainders, $d_2$ and $d_4$ say. We found $r_{d_2 d_4} = +.312 \pm .088$, a value corresponding more nearly with $r_{\delta_5 m_2 \cdot \delta_5 m_4}$ than $r_{\delta_6 m_2 \cdot \delta_6 m_4}$, and indicating that we might more rapidly approach final values by this method than by that of variate differences. But the fitting of high order parabolae is very laborious; at the same time the graphs give excellent tests of the accuracy of the work, and we obtain the actual values of what we have termed $X$ and $Y$, as represented by $d_2$ and $d_4$. We then correlated the numerical value of $d_2$ with the time and found $r_{d_2 t} = -.284 \pm .089$. It is clear that with correlations of this order with the time, $r_{d_2 d_4}$ would not be modified by the extent of its probable error if we found the partial correlation ${}_t r_{d_2 d_4}$, or corrected the correlation of $d_2$ and $d_4$ for the time. There is another point, however, which justifies us in disregarding this variation of $X$ and $Y$ with the time as of secondary importance. The correlation of $X$ with the time is positive in the first year's mortality and negative in the following four years; thus while it would certainly tend to give a negative value to $r_{XY}$ for the 1st and 2nd years of life, it would tend to give a positive value to the correlation for all successive pairs of years beyond the 1st and 2nd. Now all such successive pairs of years have high negative values, which are therefore minimum values, but these values are all in excellent agreement (roughly equal to -.7) with that found for the 1st and 2nd years of life. We therefore concluded that the influence of the time on the deviations from the secular curve of change, although very sensible, is of no substantial importance for the correlations.

Now if we remember that the $X$'s have chance values uncorrelated with each other, then we shall have for the squared standard deviation of the mean $\delta_{r+1}X$,

$$\sigma^2_{\text{mean }\delta_{r+1}X} = \frac{2\sigma_X^2}{n^2}\,(1 + C_1^2 + C_2^2 + \dots + C_r^2).$$

Or, the probable error of the mean $(r + 1)$th difference after the steady values have been reached

$$= .67449\,\sqrt{\frac{2\,(2r)!}{r!\,r!}}\;\frac{\sigma_X}{n}.$$

At first sight this appears of no value, because $\sigma_X$ is unknown, but Dr Anderson has given $\sigma_{\delta_r X}$ in terms of $\sigma_X$ when steady values have been reached*, i.e.

$$\sigma^2_{\delta_r X} = \frac{(2r)!}{r!\,r!}\,\sigma_X^2, \qquad\text{so that the probable error} = .67449\,\sqrt{2}\;\frac{\sigma_{\delta_r X}}{n},$$

when we assume steadiness reached.

The values of the means of the differences with their probable errors on the assumption of steadiness are given in Table IV, and the ratio of the means to their probable errors in Table V. It will be seen that the positive and negative signs are not scattered quite as much at random as we might have hoped and that this is especially the case in the infantile mortality differences†.
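The probable error of a mean difference, and the ratio which the mean absolute value of a normally distributed quantity bears to its probable error, are both simple to evaluate. The Python sketch below assumes that the probable error of the mean (r + 1)th difference of n terms is .67449 √(2 C(2r, r)) σ_X / n, which is what uncorrelated X's of common standard deviation σ_X imply so long as n exceeds r. The function names and the illustrative values σ_X = 1, n = 50 are ours and not the authors'.

```python
from math import comb, pi, sqrt

PE_FACTOR = 0.67449   # probable error = 0.67449 x standard deviation


def probable_error_of_mean_difference(r, sigma_x, n):
    """Probable error of the mean (r+1)th difference of n terms.

    The sum of n consecutive (r+1)th differences telescopes to the difference
    of two r-th differences, each of variance C(2r, r) * sigma_x**2; the two
    are independent provided they share no X, i.e. provided n > r.
    """
    if n <= r:
        raise ValueError("n must exceed r for the two end differences to be independent")
    return PE_FACTOR * sqrt(2 * comb(2 * r, r)) * sigma_x / n


def mean_deviation_over_probable_error():
    """Expected |z| divided by the probable error of z, for a normal variate z."""
    return sqrt(2.0 / pi) / PE_FACTOR


if __name__ == "__main__":
    for r in range(0, 6):
        print(r + 1, round(probable_error_of_mean_difference(r, 1.0, 50), 4))
    print(round(mean_deviation_over_probable_error(), 3))
```

If the means of the differences are purely chance quantities, the average of their ratios to their probable errors, taken without regard to sign, should lie near the second figure printed.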
If we take all the ratios of the means to their probable errors except the first difference, we find their average value 1.16; it should be of course 1.18. Of these ratios 33 are positive and 25 negative. If we omit the ratios for the first year of life, we find 24 negative and 20 positive, while the mean value = .98 as against 1.18, the theoretical ratio of the mean to its probable error. It is obvious that the infantile mortality differences are those which are anomalous. Otherwise the mean differences vary fairly satisfactorily round zero in the required manner. The interest of this test is that we see that the bulk of the time effect has been removed even when we reach the second difference, a result confirmed by the fact that the correlation of the deathrates' second differences is in every case already substantially negative.

* Biometrika, Vol. x. p. 272.
† It may be noted that at the beginning of the period we have the disturbing influence of war and at the end of the period wholly changed conditions due to a great limitation of births. The means depend on differences of mortality under these conditions.

TABLE III. Correlations of Deathrates and Differences of Deathrates. Columns: First Year ($m_1$), Second Year ($m_2$), Third Year ($m_3$), Fourth Year ($m_4$), Fifth Year ($m_5$). [Table entries not recoverable from the source.]

TABLE IV. Means and their Probable Errors. Columns: Year 0-1 ($m_1$), Year 1-2 ($m_2$), Year 2-3 ($m_3$), Year 3-4 ($m_4$), Year 4-5 ($m_5$), each for ♂ and ♀; rows: Actual Variate, 1st Difference to 8th Difference. [Table entries not recoverable from the source.] The differences of higher orders follow the same rule as the first.

TABLE V. Ratio of Means of Differences to their Probable Errors. Columns as in Table IV; rows: 1st Difference to 8th Difference. [Table entries not recoverable from the source.]

(iii) A third set of tests are those which are based on the standard deviations of the differences. In the first place if we assume steadiness to have set in, we can calculate $\sigma_X$, the intrinsic standard deviation, from the known value of $\sigma_{\delta_r X}$ by means of Dr Anderson's formula cited above (p. 496). Table VI gives the intrinsic values of $\sigma_X$, i.e. $\sigma_X$ as deduced from the variability of the differences. It will be at once observed that for the third difference the mortality ratios of the third, fourth and fifth years of life reach steady standard deviations. In the case of the first year of life it is not till the eighth difference that this result is reached, while in the case of the second year, it can hardly be said to have been obtained with the ninth difference. A distinction should be noted here of which the exact physical significance is not obvious to us. In the third, fourth and fifth years the intrinsic standard deviations fall to steady values, but in the first and second years they rise towards those values and these are just the cases where steady values are not absolutely reached.

TABLE VI. Intrinsic Standard Deviations ($\sigma_X$). Columns: Year 0-1 ($m_1$), Year 1-2 ($m_2$), Year 2-3 ($m_3$), Year 3-4 ($m_4$), Year 4-5 ($m_5$).

| Order of Difference | $m_1$ ♂ | $m_1$ ♀ | $m_2$ ♂ | $m_2$ ♀ | $m_3$ ♂ | $m_3$ ♀ | $m_4$ ♂ | $m_4$ ♀ | $m_5$ ♂ | $m_5$ ♀ |
|---|---|---|---|---|---|---|---|---|---|---|
| 1st | 7.62 | 6.96 | 3.90 | 3.86 | 1.75 | 1.83 | 1.53 | 1.52 | 1.23 | 1.09 |
| 2nd | 7.89 | 7.22 | 4.13 | 4.09 | 1.67 | 1.81 | 1.29 | 1.34 | 1.00 | .83 |
| 3rd | 8.14 | 7.45 | 4.32 | 4.28 | 1.62 | [illegible] | [illegible] | 1.29 | .93 | .80 |
| 4th | 8.34 | 7.63 | 4.47 | 4.42 | 1.59 | 1.76 | 1.18 | 1.29 | .87 | .77 |
| 5th | 8.50 | 7.76 | 4.60 | 4.54 | 1.58 | 1.76 | 1.17 | 1.28 | .86 | .77 |
| 6th | 8.61 | 7.83 | 4.71 | 4.63 | 1.59 | [illegible] | [illegible] | 1.28 | .86 | .78 |
| 7th | 8.66 | 7.84 | 4.80 | 4.72 | | | | | | |
| 8th | 8.68 | 7.85 | 4.88 | 4.78 | | | | | | |
| 9th | | | 4.97 | 4.82 | | | | | | |

(iv) There is another test for the standard deviations of the differences deduced by Cave and Pearson from the Andersonian results and used by them in their memoir on Italian Index Values*, namely as steadiness is approached the ratio of the squares of standard deviation of successive differences should approach closer and closer to 4, the exact value being

$$\frac{\sigma^2_{\delta_s m}}{\sigma^2_{\delta_{s-1} m}} = 4 - \frac{2}{s}.$$

* Biometrika, Vol. x. p. 346.

Table VII shows how rapidly the system approximates to the theoretical values in the case of the higher differences. On the basis of all the tests we have applied we may, we think, conclude that by the sixth difference we have reached values for the correlation of deathrates in successive years which are in all probability close to the organic or intrinsic values. Only in the first and second years of life is steadiness not absolutely reached, but for practical purposes but little change can be anticipated in the correlation coefficients.

TABLE VII. Ratio of Squared Standard Deviations.
| $s$ | $m_1$ ♂ | $m_1$ ♀ | $m_2$ ♂ | $m_2$ ♀ | $m_3$ ♂ | $m_3$ ♀ | $m_4$ ♂ | $m_4$ ♀ | $m_5$ ♂ | $m_5$ ♀ | Mean ♂ | Mean ♀ | Theory |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | .949 | .956 | .354 | .369 | .142 | .144 | .194 | .192 | .211 | .181 | .370 | .368 | 2 |
| 2 | 3.199 | 3.221 | 3.374 | 3.384 | 2.731 | 2.934 | 2.127 | 2.317 | 1.996 | 1.728 | 2.685 | 2.717 | 3 |
| 3 | 3.547 | 3.552 | 3.638 | 3.633 | 3.133 | 3.240 | 2.944 | 3.086 | 2.896 | 3.109 | 3.232 | 3.324 | 3.333 |
| 4 | 3.676 | 3.673 | 3.754 | 3.741 | 3.363 | 3.428 | 3.341 | 3.504 | 3.054 | 3.227 | 3.438 | 3.515 | 3.500 |
| 5 | 3.738 | 3.723 | 3.811 | 3.793 | 3.591 | 3.604 | 3.509 | 3.562 | 3.504 | 3.641 | 3.631 | 3.665 | 3.600 |
| 6 | 3.756 | 3.734 | 3.848 | 3.828 | 3.690 | 3.683 | 3.708 | 3.648 | 3.691 | 3.758 | 3.739 | 3.730 | 3.667 |

(5) We can look at the association of deathrates in successive years from another standpoint. We can ask if there be an increase of 10 points in the deathrate for a given year, what increase or decrease will there be of deathrate in the same group in the following year? In Table VIII below the second column gives the spurious change which is apparent in the crude data, the third column gives the real organic change which is discovered when the time-factor is removed.

TABLE VIII. Association of Deathrates without and with Annulment of Time-factor. Result of an increase of 10 deaths per mille in one year of life on the deaths per mille in the next year.

| Increase of 10 in Deathrate of | Disregarding Time-factor ♂ | Disregarding Time-factor ♀ | Annulling Time-factor ♂ | Annulling Time-factor ♀ |
|---|---|---|---|---|
| 1st Year on 2nd Year | Increase 3.3 | Increase 3.8 | Decrease 3.7 | Decrease 4.3 |
| 2nd Year on 3rd Year | Increase 6.1 | Increase 6.6 | Decrease 2.2 | Decrease 2.5 |
| 3rd Year on 4th Year | Increase 6.9 | Increase 6.7 | Decrease 5.2 | Decrease 5.3 |
| 4th Year on 5th Year | Increase 7.0 | Increase 6.8 | Decrease 5.1 | Decrease 4.5 |

It is easy to see how those who contented themselves with crude deathrates, making no allowance for the betterment of deathrates with the time, interpreted a higher deathrate in one year to mean a higher deathrate in the next year of life, and so questioned whether natural selection applied to civilised man. As a matter of fact we see that the true organic relationship of deathrates is much more probably summed up in the statement that a decrease or an increase of deathrate in one year of infancy or childhood is in each case followed by an increase or a decrease in the deathrate of the survivors of the same group in the following year. Disregarding the time-factor we have a result quite incompatible with natural selection; annulling the time-factor, we have a result not only compatible with natural selection, but very difficult of any other interpretation than that of a selective deathrate, i.e. a heavy mortality means a selection of the weaker members, and the exposure to risk in the following year of a selected or stronger population, which has accordingly a lesser deathrate.

(6) We now turn to the problem of how far this influence extends, or probably it would be better to phrase it: how far this influence can be traced. It is not only that the age group we follow does not absolutely consist of the same individuals but even with those members that are the same there is very often change of environment due not to time but to a change of locality or of economic condition affecting individuals. Added to this there is a continuous immigration and emigration. But beyond these causes weakening the association, there is another difficulty of great importance arising from what has happened in the intervening years. We wish to find out how an increase of deathrate in the $s$th year of life affects the deathrate in the $(s+2)$th year of life, but the events in the $(s+1)$th year will largely dominate and, perhaps, screen the results we are seeking.
Such problems are always arising in statistical research. For example, a child may resemble its grandfather simply because both grandfather and child are like the child's father. We know that the problem is answered statistically by inquiring what is the relation between a character in the child and the grandparent for a constant value of the character in the parent. In precisely the same manner we must in the present problem inquire: What is the correlation between the deathrates in the $s$th and $(s+2)$th year of life for constant deathrate in the $(s+1)$th year of life?

TABLE IX. Influence of Natural Selection at Interval of Two Years.

| Partial Correlation of | For constant | ♂ | ♀ |
|---|---|---|---|
| $\delta_6 m_1$ and $\delta_6 m_3$ | $\delta_6 m_2$ | -.4307 | -.5242 |
| $\delta_6 m_2$ and $\delta_6 m_4$ | $\delta_6 m_3$ | -.2555 | -.2058 |
| $\delta_6 m_3$ and $\delta_6 m_5$ | $\delta_6 m_4$ | -.1798 | -.3129 |

We shall of course work with the sixth difference correlations in order to free ourselves substantially from the time-factor. Here again the judgment based on the partial correlation of the crude deathrates is in all six cases reversed. For every one of the partial coefficients of crude deathrates shows that for intervening year with a constant deathrate, an increase of deathrate in the earlier year means an increase, not a decrease in the later year. Actually an increase in the one year is shown in Table X in all cases to be followed by a decrease at two years' interval.

TABLE X. Influence of Natural Selection at Interval of Two Years. Result of an increase of 10 deaths per mille in the second following year.

| Increase of 10 in Deathrate of | For constant deathrate of | ♂ | ♀ |
|---|---|---|---|
| 1st Year on that of 3rd Year | 2nd Year | Decrease .81 | Decrease 1.28 |
| 2nd Year on that of 4th Year | 3rd Year | Decrease .61 | Decrease .52 |
| 3rd Year on that of 5th Year | 4th Year | Decrease .99 | Decrease 1.4 |

It will be seen that these values are appreciable although far less important than the decreases produced in a following year by an increase in the immediately preceding year. Thus we judge that a selection of the weakly children in one year is largely influential on the deathrate of the immediately following year, and diminishes, as we might anticipate, with increase of time.

Some objection might, however, be taken to the sixth difference correlations, when we consider deathrates of the same group two years apart. They are

| | Male | Female |
|---|---|---|
| $r_{\delta_6 m_1 \cdot \delta_6 m_3}$ | +.227 ± .159 | +.200 ± .161 |
| $r_{\delta_6 m_2 \cdot \delta_6 m_4}$ | +.339 ± .149 | +.377 ± .144 |
| $r_{\delta_6 m_3 \cdot \delta_6 m_5}$ | +.397 ± .142 | +.393 ± .142 |

It will be seen that while they are all of the same sign and fairly accordant for both sexes the probable errors are becoming very substantial relative to the coefficients. We have indeed too limited a range of years.

(7) If now we take out the correlation coefficients of the sixth differences for three years' interval, and again for four years' interval we find great irregularities.

| | Male | Female |
|---|---|---|
| $r_{\delta_6 m_1 \cdot \delta_6 m_4}$ | +.205 ± .161 | +.035 ± .168 |
| $r_{\delta_6 m_2 \cdot \delta_6 m_5}$ | [sign illegible] .030 ± .168 | [sign illegible] .072 ± .167 |
| $r_{\delta_6 m_1 \cdot \delta_6 m_5}$ | -.181 ± .163 | -.251 ± .158 |

The correlations now do not agree in sign, they are insignificant having regard to their probable errors, and there is no close correspondence for the two sexes. We should need a far longer period than 50 years to determine certainly even the signs of these correlations, and their real magnitudes would require still ampler data.
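The reversal of sign between the total and the partial correlations at two years' interval follows from the usual formula for a partial coefficient, and a small numerical sketch may make it plainer. The adjacent-year value of -.7 used below is a rounded, hypothetical figure standing in for the actual sixth-difference correlations, so the result is not expected to reproduce Table IX; only the male total correlation +.227 is taken from the text, and `partial_correlation` is an illustrative name.

```python
from math import sqrt


def partial_correlation(r13, r12, r23):
    """Correlation of variables 1 and 3 for a constant value of variable 2."""
    return (r13 - r12 * r23) / sqrt((1 - r12 ** 2) * (1 - r23 ** 2))


# Hypothetical adjacent-year correlations, rounded to -0.7 for illustration only.
r12 = r23 = -0.7

# If years 1 and 3 were linked solely through year 2, the total correlation
# would be r12 * r23 = +0.49 and the partial correlation would vanish.
chain_only = partial_correlation(r12 * r23, r12, r23)

# A total correlation smaller than +0.49 (here the male value +.227) therefore
# implies a negative association once the intervening year is held constant.
observed = partial_correlation(0.227, r12, r23)

print("chain only:", round(chain_only, 4))
print("with r13 = +.227:", round(observed, 4))
```

The design point is simply that a positive total correlation at two years' interval is what two successive negative links would produce of themselves; it is the shortfall below their product that carries the evidence of selection.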
It would appear impossible to assert on the basis of the above values of the correlations at three and four years' intervals more than the insignificance of the associations between deathrates of the same groups at intervals of more than two years*. In other words the effect of intense selection appears to be exhausted after an interval of two years. The word "appears" is used purposely because there must be some spurious weakening of the effect due to our not being able to follow absolutely the same individuals.

* Actually the partial correlations of the sixth differences at three years' interval based on the above values are:

| Correlation of | For constant | ♂ | ♀ |
|---|---|---|---|
| $\delta_6 m_1$ and $\delta_6 m_4$ | $\delta_6 m_2$ and $\delta_6 m_3$ | +.526 | +.181 |
| $\delta_6 m_2$ and $\delta_6 m_5$ | $\delta_6 m_3$ and $\delta_6 m_4$ | +.251 | +.485 |

These are certainly all positive, but they are irregular as between the sexes and probably quite unreliable for the reasons already given. Should a more extended experience show that there is a real if slight positive correlation between deathrates at three years' interval, while there is considerable negative correlation at one and two years' intervals, we should be compelled to discuss whether there may not be something periodic in the nature of the heavy and light deathrates of infancy and childhood. We have been unable to trace any sign of such periodicity either in the deathrates or in the graphs drawn, but we do not believe that a very short periodicity would be eliminated by the variate difference method using any moderate number of differences. We cannot on this point accept Dr Anderson's view. See Biometrika, Vol. x. p. 279.

(8) We have further studied to some extent the relationship between the male and female deathrates. There is almost perfect correlation between male and female deathrates in any given year of life after we annul the time-factor. Thus, if we represent female deathrates by $m'$, we have as illustrations:

$r_{\delta_6 m_1 \cdot \delta_6 m_1'} = +.9905$, $r_{\delta_6 m_2 \cdot \delta_6 m_2'} = +.9880$, $r_{\delta_6 m_3 \cdot \delta_6 m_3'} = +.9687$, $r_{\delta_6 m_4 \cdot \delta_6 m_4'} = +.9800$.

Of course the sole significance of these values lies in the fact that years of stress, whether due to climatic or epidemic causes, affect equally infants or children of both sexes of the same age. But these very high values in our opinion cast considerable doubt on the partial correlations derived from them. We have in fact

$${}_3 r_{12} = \frac{r_{12} - r_{13} r_{23}}{\sqrt{1 - r_{13}^2}\,\sqrt{1 - r_{23}^2}} = \frac{N}{D},$$

and if we suppose $r_{12}$ and $r_{23}$ nearly equal, then if $r_{13}$ be of the above high value $N$ will be extremely small, but $D$ is also, owing to the presence of the factor $\sqrt{1 - r_{13}^2}$, very small. Thus ${}_3 r_{12}$ although it may be very considerable is the ratio of two small quantities and any disturbing cause which but slightly modifies the value of either $r_{12}$ or $r_{23}$ may even change the sign of $N$ and so swing ${}_3 r_{12}$ over from a considerable positive to a considerable negative value*.

We can consider the correlations between the female deathrate in one year and the male deathrate in a second year, supposing of course time influence annulled. We have

$r_{\delta_6 m_1 \cdot \delta_6 m_2'} = -.6674$ ($r_{\delta_6 m_1 \cdot \delta_6 m_2} = -.6879$),
$r_{\delta_6 m_1' \cdot \delta_6 m_2} = -.7337$ ($r_{\delta_6 m_1' \cdot \delta_6 m_2'} = -.7188$),
$r_{\delta_6 m_3 \cdot \delta_6 m_4'} = -.7313$ ($r_{\delta_6 m_3 \cdot \delta_6 m_4} = -.7032$),
$r_{\delta_6 m_3' \cdot \delta_6 m_4} = -.7278$ ($r_{\delta_6 m_3' \cdot \delta_6 m_4'} = -.7313$).

Thus we see that the same remarkably high negative correlations exist between the male and female deathrates of successive years of groups born in the same year as exist between male and male or female and female deathrates within the same group in successive years. In fact in two out of the four correlations the cross relationships are higher than the direct, although the differences are scarcely significant. Here again there is nothing noteworthy, considering the very high correlations just noted to exist between the male and female deathrates of groups born in the same year.
We can, however, endeavour to correct such values by finding the relationship between the deathrate in females in the first year of life and males born in the same year in their second year of life for a constant deathrate of males in the first year of life. Or still more stringently between the deathrates of females in the first year of life with males in the second year of life for constant male deathrate in the first year of life and constant female deathrate in the second year of life. We should anticipate that such values would come out small or insignificant, if our interpretation of the high negative correlations between deathrates of the same group in successive years of life be a correct one, i.e. that the high deathrate leaves a stronger population. For a heavy deathrate in the females of one year should not leave a stronger population of males for the following year after correction by partial correlation. We obtained the following correlations:

${}_{\delta_6 m_1} r_{\delta_6 m_1' \cdot \delta_6 m_2} = -.5240 \pm .0692$,
${}_{\delta_6 m_1'} r_{\delta_6 m_1 \cdot \delta_6 m_2'} = +.4665 \pm .0746$.

* The reader must note that we say a "disturbing cause"; it is not the mere result of random sampling affecting $N$. The probable error of $N = r_{12} - r_{13} r_{23}$ for a sample of size $n$ is given by

$$.67449\,\sigma_N = \frac{.67449}{\sqrt{n}}\left\{D^2 - N^2\left[2(1 - r_{13}^2) + 2(1 - r_{23}^2) + 1 - r_{12}^2 - 3\right]\right\}^{\frac{1}{2}},$$

and is thus quite easy to calculate. We have tested it on a number of cases of partial correlations worked out for this paper and find that if $.67449\,\sigma_N$ is of the same order as $N$, then $.67449\,\sigma_{{}_3 r_{12}}$ is of much the same order as ${}_3 r_{12}$. In other words, if $N$ is so small relative to its probable error that it might easily have a reversed sign, then ${}_3 r_{12}$ is insignificant as compared to its probable error also. For example, $N = .0446$ and $D = .0956$ leads to ${}_3 r_{12} = .4665$ with a probable error of .0746. ${}_3 r_{12}$ is accordingly considerable and significant, but the probable error of $N$ is only .0105, and we can hardly suppose the sign of ${}_3 r_{12}$ due to a random sampling variation in the sign of $N$.

These values were so startling and so contradictory, that we proceeded to eighth differences with the results:

${}_{\delta_8 m_1} r_{\delta_8 m_1' \cdot \delta_8 m_2} = -.6018 \pm .0609$,
${}_{\delta_8 m_1'} r_{\delta_8 m_1 \cdot \delta_8 m_2'} = +.5481 \pm .0667$,

which emphasised as well as confirmed the previous results. Now it seems absurd to suppose that the deaths of female infants in one year can organically influence the deaths of males of the same group in the next year, or male infants the deaths of females in the successive year. But the extraordinary feature of these results is that while a high deathrate of female infants lessens the deathrate of males in the second year of life of the same group, a high deathrate of male infants increases the deathrate of females in the second year of life of the same group.

In order to throw further light on the matter we investigated male and female deathrate correlations in the third and fourth years of life. We found

${}_{\delta_6 m_3} r_{\delta_6 m_3' \cdot \delta_6 m_4} = -.2640 \pm .0887$,
${}_{\delta_6 m_3'} r_{\delta_6 m_3 \cdot \delta_6 m_4'} = -.0082 \pm .0954$.

The second is practically zero, the first of no importance having regard to the high values of the correlation of deathrates of groups of the same sex in the third and fourth years of life (♂: $-.703 \pm .085$; ♀: $-.731 \pm .078$). Had we come to these values at first we should have been content, but the cross relation between the infant deaths of one sex and the deaths in the second year of life of the opposite sex was undoubtedly puzzling.

We then proceeded to still further limit our conditions by determining the partial correlation between female infants in one year and males in the second year of life of the same birth-year when the deathrates of the males in the first year of life and of the females in the second were both constant. We obtained

${}_{\delta_6 m_1 \cdot \delta_6 m_2'} r_{\delta_6 m_1' \cdot \delta_6 m_2} = +.1632 \pm .0928$,
${}_{\delta_6 m_1' \cdot \delta_6 m_2} r_{\delta_6 m_1 \cdot \delta_6 m_2'} = +.2997 \pm .0868$.

Having regard to their probable errors these are of a quite different and negligible significance when compared with the values of ${}_{\delta_6 m_1} r_{\delta_6 m_1' \cdot \delta_6 m_2}$ and ${}_{\delta_6 m_1'} r_{\delta_6 m_1 \cdot \delta_6 m_2'}$ given above. It is worth while noting that

${}_{\delta_6 m_2'} r_{\delta_6 m_1' \cdot \delta_6 m_2} = -.2188 \pm .0908$,
${}_{\delta_6 m_2} r_{\delta_6 m_1 \cdot \delta_6 m_2'} = +.1088 \pm .0943$

also give values of no practical importance. Or, to annul the spurious influence of infantile deaths of one sex, A, on deaths in the second year of sex, B, of the same group, it is more effective to render constant the deaths of A in the second year of life than of B in the first year of life.

In the light of this result we have found the correlations between deathrates of sex A in the third and sex B in the fourth year of life, for constant deathrate of sex A in the fourth year of life. We have

${}_{\delta_6 m_4'} r_{\delta_6 m_3' \cdot \delta_6 m_4} = -.0818 \pm .0948$,
${}_{\delta_6 m_4} r_{\delta_6 m_3 \cdot \delta_6 m_4'} = -.1477 \pm .0933$.

Both of these may be taken as zero, having regard to their probable errors. Thus on the whole, while the relation between the deathrate of a group of one sex in one year and the deathrate of the remainder in the following year of life appears after the annulment of the time-factor to be very considerable and negative, there does not appear to be any organic relation between the deathrate of sex A in one year and sex B in the following year, if we proceed by the method of partial correlation. But at the same time we believe that this method must be used with very considerable caution, and that to avoid erroneous conclusions the whole problem must be investigated from a variety of standpoints in cases like the present where one of the three total correlations is extremely high. The numerator $N$ ranges in the cases we have been discussing from about .01 to .05 and with a small total frequency like 50, any disturbing cause, apart from random variation, may have marked influence*.
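The instability spoken of in this section can be seen by recomputing two of the partial coefficients from the total correlations quoted earlier in it. The sketch below uses the sixth-difference values for the first and second years of life as we read them from the text (primes denoting females); the perturbation of .05 is purely illustrative, and `partial_parts` is a hypothetical helper.

```python
from math import sqrt


def partial_parts(r12, r13, r23):
    """Return (N, D, N/D) for the correlation of 1 and 2 with 3 held constant."""
    numerator = r12 - r13 * r23
    denominator = sqrt(1 - r13 ** 2) * sqrt(1 - r23 ** 2)
    return numerator, denominator, numerator / denominator


# Sixth-difference total correlations as read from the text (f = female).
r_m1_m1f = 0.9905    # male and female infants, same year
r_m1_m2 = -0.6879    # males, first year with second year
r_m1f_m2f = -0.7188  # females, first year with second year
r_m1f_m2 = -0.7337   # female infants with males in second year
r_m1_m2f = -0.6674   # male infants with females in second year

# Female infants and males in the second year, male infants constant.
n_a, d_a, partial_a = partial_parts(r_m1f_m2, r_m1_m1f, r_m1_m2)
# Male infants and females in the second year, female infants constant.
n_b, d_b, partial_b = partial_parts(r_m1_m2f, r_m1_m1f, r_m1f_m2f)
# The same coefficient after shifting one total correlation by only .05.
n_c, d_c, partial_shifted = partial_parts(r_m1_m2f - 0.05, r_m1_m1f, r_m1f_m2f)

print("N, D, partial (a):", round(n_a, 4), round(d_a, 4), round(partial_a, 4))
print("N, D, partial (b):", round(n_b, 4), round(d_b, 4), round(partial_b, 4))
print("partial (b) after a shift of .05:", round(partial_shifted, 4))
```

Both coefficients agree with the printed values to within the rounding of the inputs, and the last line shows a coefficient of nearly +.47 changing sign when a single total correlation is altered by .05, which is the caution the authors urge.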
(9) The conclusion which we have formed is that in the present problem of natural selection it is probably better to annul the environmental factor by the variate difference method rather than to proceed by the method of partial correlation as we have hitherto done. By the former method we have shown that for both sexes a heavy deathrate in one year of life means a markedly lower deathrate in the same group in the following year of life, and that this extends in a lessened degree to the year following that, but is not by the present method easy to trace further. It is difficult to believe that this important fact can be due to any other source than the influence of natural selection, i.e. a heavy mortality leaves behind it a stronger population. Nature is not concerned with the moral or the immoral, which are standards of human conduct, and the duty of the naturalist is to point out what goes on in Nature. There can now scarcely be a doubt that even in highly organised human communities the deathrate is selective, and physical fitness is the criterion for survival. To assert the existence of this selection and measure its intensity must be distinguished from advocacy of a high infant mortality as a factor of racial efficiency. This reminder is the more needful as there are not wanting those who assert that demonstrating the existence of natural selection in man is identical with decrying all efforts to reduce the infantile deathrate.

We have to acknowledge the great assistance we have received from our colleague Miss Beatrice M. Cave in the laborious arithmetical work of this paper.

* If $F = N/D$, where $N$ and $D$ are both small, but $F$ finite, then $\delta F/F = \delta N/N - \delta D/D$ and small disturbances produce great results in $F$.

FREQUENCY DISTRIBUTION OF THE VALUES OF THE CORRELATION COEFFICIENT IN SAMPLES FROM AN INDEFINITELY LARGE POPULATION.

By R. A. FISHER.

1.
My attention was drawn to the problem of the frequency distribution of the correlation coefficient by an article published by Mr H. E. Soper* in 1913. Seeing that the problem might be attacked by means of geometrical ideas, which I had previously found helpful in the consideration of samples, I have examined the two articles by "Student,"† upon which Mr Soper's more elaborate work was based, with a view to checking and verifying the conclusions there attained.

"Student," if I do not mistake his intention, desiring primarily to obtain a just estimate of the accuracy to be ascribed to the mean of a small sample, found it necessary to allow for the fact that the mean square error of such a sample is not generally equal to the standard deviation of the normal population from which it is drawn. He was led, in fact, to study the frequency distribution of the mean square error. He calculated algebraically the first four moments of this frequency curve, both about the zero point, and about its mean, observed a simple law to connect the successive moments, and discovered a frequency curve, which fitted his moments, and gave the required law.

Thus if $x_1, x_2, \dots x_n$ are the members of a sample,

$$n\bar{x} = x_1 + x_2 + \dots + x_n,$$

and

$$n\mu^2 = (x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \dots + (x_n - \bar{x})^2,$$

the frequency with which the mean square error lies in the range $d\mu$ is proportional to

$$\mu^{n-2}\, e^{-\frac{n\mu^2}{2\sigma^2}}\, d\mu.$$

This result, although arrived at by empirical methods, was established almost beyond reasonable doubt in the first of "Student's" papers. It is, however, of interest to notice that the form establishes itself instantly, when the distribution of the sample is viewed geometrically.

* Biometrika, Vol. IX. p. 91.
† Ibid. Vol. VI. pp. 1 and 302.

In the second of these two papers the more difficult problem of the frequency distribution of the correlation coefficient is attempted.
For samples of 2 the frequency distribution between the only two possible values $-1$ and $+1$ was determined by Sheppard's theorem to be in the ratio

$$\frac{\pi}{2} + \sin^{-1}\rho \;:\; \frac{\pi}{2} - \sin^{-1}\rho,$$

where $\rho$ is the correlation of the population. Besides this theoretical result, "Student" appeals only to experimental data. From these he derives an empirical form for the distribution when $\rho = 0$, and makes several valuable suggestions. It has been the greatest pleasure and interest to myself to observe with what accuracy "Student's" insight has led him to the right conclusions. The form when $\rho = 0$ is absolutely correct, and as a further instance I may quote the remark* "I have dealt with the cases of samples of 2 at some length, because it is possible that this limiting value of the distribution, with its mean of $\frac{2}{\pi}\sin^{-1}\rho$ and its second moment coefficient of $1 - \left(\frac{2}{\pi}\sin^{-1}\rho\right)^2$, may furnish a clue to the distribution when $n$ is greater than 2." As a matter of fact it is just these quantities with which we shall be concerned.

To Mr Soper's laborious and intricate paper I cannot hope to do justice. I have been able to establish the substantial accuracy and value of his approximations. It is one of the advantages of approaching a problem from opposite standpoints that Mr Soper's forms are most accurate for those larger values of $n$, where the exact formulae become most complicated.

2. The problem of the frequency distribution of the correlation coefficient $r$, derived from a sample of $n$ pairs, taken at random from an infinite population, may be solved, when that population can be represented by a normal surface, with the aid of certain very general conceptions derived from the geometry of $n$ dimensional space. In this paper the general form will first be demonstrated, and for a few important cases some of the successive moments will be derived.
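The result for samples of 2 lends itself to a simple experimental check, much in the spirit of "Student's" own sampling experiments. The sketch below draws many samples of two pairs from a normal population with correlation $\rho$, records whether $r$ is $-1$ or $+1$, and compares the mean and second moment coefficient with the expressions quoted above. The value $\rho = .5$, the number of trials and the seed are arbitrary choices of ours, and the function names are hypothetical.

```python
import math
import random


def r_from_sample_of_two(rho, rng):
    """Correlation coefficient (+1 or -1) of two pairs drawn from a normal surface."""
    def draw_pair():
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        # y is built so that x and y have unit variances and correlation rho.
        return z1, rho * z1 + math.sqrt(1 - rho * rho) * z2

    (x1, y1), (x2, y2) = draw_pair(), draw_pair()
    # With two pairs the points lie on a line, so r is the sign of its slope.
    return 1 if (x1 - x2) * (y1 - y2) > 0 else -1


def simulate(rho, trials=40000, seed=0):
    """Return the sample mean and second moment coefficient of r over many samples of 2."""
    rng = random.Random(seed)
    values = [r_from_sample_of_two(rho, rng) for _ in range(trials)]
    mean = sum(values) / trials
    second_moment = sum((v - mean) ** 2 for v in values) / trials
    return mean, second_moment


rho = 0.5
observed_mean, observed_second_moment = simulate(rho)
expected_mean = 2 / math.pi * math.asin(rho)
expected_second_moment = 1 - expected_mean ** 2

print("mean of r:", round(observed_mean, 3), "theory:", round(expected_mean, 3))
print("second moment:", round(observed_second_moment, 3),
      "theory:", round(expected_second_moment, 3))
```

Being a sampling experiment, the agreement is only approximate and will vary a little with the seed and the number of trials.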
Incidentally it will be of interest to compare the exact form with Mr Soper's approximation, and with reference to the experimental data supplied by "Student."

If the frequency distribution of the population be specified by the form

$$df = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}}\; e^{-\frac{1}{1-\rho^2}\left\{\frac{(x-m_1)^2}{2\sigma_1^2} - \frac{2\rho(x-m_1)(y-m_2)}{2\sigma_1\sigma_2} + \frac{(y-m_2)^2}{2\sigma_2^2}\right\}}\, dx\, dy,$$

where $df$ is the chance that any observation should fall into the range $dx\,dy$, then the chance that $n$ pairs should fall within their specified elements is

$$\frac{1}{\left(2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}\right)^n}\; e^{-\frac{1}{1-\rho^2}\sum\left\{\frac{(x-m_1)^2}{2\sigma_1^2} - \frac{2\rho(x-m_1)(y-m_2)}{2\sigma_1\sigma_2} + \frac{(y-m_2)^2}{2\sigma_2^2}\right\}}\, dx_1\, dy_1 \dots dx_n\, dy_n \qquad \text{(I)},$$

and this we interpret as a simple density distribution in $2n$ dimensions.

* Biometrika, Vol. VI. p. 304.

For the variables $x$ and $y$ it is now necessary to substitute the statistical derivatives determined by the equations

$$n\bar{x} = \Sigma(x), \qquad n\bar{y} = \Sigma(y),$$
$$n\mu_1^2 = \Sigma(x - \bar{x})^2, \qquad n\mu_2^2 = \Sigma(y - \bar{y})^2,$$
$$n r \mu_1 \mu_2 = \Sigma(x - \bar{x})(y - \bar{y}),$$

and it is evident that the only difficulty lies in the expression of an element of volume in $2n$ dimensional space in terms of these derivatives. The five quantities above defined have, in fact, an exceedingly beautiful interpretation in generalised space, which we may now examine.

3. Considering first the space of $n$ dimensions in which the variations of $x$ are represented, the mean and mean square error of $n$ observations are determined by the relations of P, the point representing the $n$ observations, to the line

$$x_1 = x_2 = x_3 = \dots = x_n,$$

for the perpendicular PM drawn from P upon this line will lie in the region

$$x_1 + x_2 + \dots + x_n = n\bar{x},$$

and will meet it at the point M, where

$$x_1 = x_2 = \dots = x_n = \bar{x};$$

further, since

$$PM^2 = (x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \dots + (x_n - \bar{x})^2,$$

the length of PM is $\mu_1\sqrt{n}$. An element of volume in this $n$ dimensional space may now without difficulty be specified in terms of $\bar{x}$ and $\mu_1$; for, given $\bar{x}$ and $\mu_1$, P must lie on a sphere in $n - 1$ dimensions, lying at right angles to the line OM, and the element of volume is

$$C\mu_1^{n-2}\, d\mu_1\, d\bar{x},$$

where $C$ is some constant, which need not be determined.
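The geometrical statements of this section are easily verified on a small numerical case. The sketch below takes a hypothetical sample of four pairs, forms the vector MP of deviations from the mean, and checks that it is at right angles to the line of equal coordinates, that its length is $\mu_1\sqrt{n}$, and that $r$ as defined by $n r \mu_1 \mu_2 = \Sigma(x - \bar{x})(y - \bar{y})$ is the cosine of the angle between the two deviation vectors. The data and the function name are illustrative assumptions only.

```python
from math import sqrt


def sample_geometry(xs, ys):
    """Return the quantities needed to check the geometrical reading of a sample."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    dx = [x - x_bar for x in xs]  # vector MP for the x observations
    dy = [y - y_bar for y in ys]  # corresponding vector for the y observations
    mu1 = sqrt(sum(d * d for d in dx) / n)
    mu2 = sqrt(sum(d * d for d in dy) / n)
    length_pm = sqrt(sum(d * d for d in dx))
    # r from its definition: n r mu1 mu2 = sum (x - x_bar)(y - y_bar).
    r = sum(a * b for a, b in zip(dx, dy)) / (n * mu1 * mu2)
    # Cosine of the angle between the two deviation vectors.
    cosine = sum(a * b for a, b in zip(dx, dy)) / (
        sqrt(sum(a * a for a in dx)) * sqrt(sum(b * b for b in dy)))
    return {"n": n, "dx": dx, "mu1": mu1, "mu2": mu2,
            "length_pm": length_pm, "r": r, "cosine": cosine}


# A hypothetical sample of four pairs.
xs = [1.0, 2.0, 4.0, 7.0]
ys = [2.0, 1.0, 5.0, 6.0]
g = sample_geometry(xs, ys)

print("PM:", round(g["length_pm"], 6), " mu1*sqrt(n):", round(g["mu1"] * sqrt(g["n"]), 6))
print("r:", round(g["r"], 6), " cosine:", round(g["cosine"], 6))
print("MP . (1,1,...,1):", round(sum(g["dx"]), 12))
```

The vanishing of the last quantity expresses that PM is perpendicular to the line $x_1 = x_2 = \dots = x_n$, since the direction of that line is that of the vector whose coordinates are all unity.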
The point in $2n$ dimensional space which is represented by the $n$ pairs of observations must be such that its projection on the $n$ dimensional space, in which $x$ is represented, lies upon a certain sphere of radius $\mu_1\sqrt{n}$, and on the space in which $y$ is represented, upon another sphere of radius $\mu_2\sqrt{n}$, and now, when we come to the interpretation of $r$, we must observe that to each point on the first sphere there corresponds a certain point on the second sphere, to which it bears the relation

$$\frac{x_1-\bar{x}}{\mu_1}=\frac{y_1-\bar{y}}{\mu_2},\quad \frac{x_2-\bar{x}}{\mu_1}=\frac{y_2-\bar{y}}{\mu_2},\quad\ldots$$

In general this relation does not hold for the $n$ pairs of observations, and the two projections will not fall at corresponding points on the two spheres. If now one of the spheres be turned round so as to occupy the same space as the other, and so that the lines upon which $x_1$ and $y_1$, and the other pairs of coordinates, are measured, coincide, then corresponding points will lie on the same radii, and the correlation coefficient $r$ measures the cosine of the angle between the radii to the two points specified by the observations. Taking one of the projections as fixed at any point on the sphere of radius $\mu_2\sqrt{n}$, the region for which $r$ lies in the range $dr$ is a zone, on the other sphere in $n-1$ dimensions, of radius $\mu_1\sqrt{n}\sqrt{1-r^2}$, and of width $\mu_1\sqrt{n}\,dr/\sqrt{1-r^2}$, and therefore having a volume proportional to $\mu_1^{\,n-2}(1-r^2)^{\frac{n-4}{2}}\,dr$.

4. We may now turn to the direct simplification of the expression (I), at each stage discarding any factors which do not involve $r$. The expression

$$e^{-\frac{1}{1-\rho^2}S\left\{\frac{(x-m_1)^2}{2\sigma_1^2}-\frac{2\rho(x-m_1)(y-m_2)}{2\sigma_1\sigma_2}+\frac{(y-m_2)^2}{2\sigma_2^2}\right\}}\,dx_1\,dy_1\,dx_2\,dy_2\ldots dx_n\,dy_n$$

may be reduced to

$$e^{-\frac{n}{1-\rho^2}\left\{\frac{(\bar{x}-m_1)^2+\mu_1^2}{2\sigma_1^2}-\frac{2\rho\{r\mu_1\mu_2+(\bar{x}-m_1)(\bar{y}-m_2)\}}{2\sigma_1\sigma_2}+\frac{(\bar{y}-m_2)^2+\mu_2^2}{2\sigma_2^2}\right\}}\,d\bar{x}\,d\bar{y}\,\mu_1^{\,n-2}\,d\mu_1\,\mu_2^{\,n-2}\,d\mu_2\,(1-r^2)^{\frac{n-4}{2}}\,dr,$$

or to

$$e^{-\frac{n}{1-\rho^2}\left\{\frac{\mu_1^2}{2\sigma_1^2}-\frac{2\rho r\mu_1\mu_2}{2\sigma_1\sigma_2}+\frac{\mu_2^2}{2\sigma_2^2}\right\}}\,\mu_1^{\,n-2}\,\mu_2^{\,n-2}\,(1-r^2)^{\frac{n-4}{2}}\,d\mu_1\,d\mu_2\,dr.$$
In order to integrate this expression from 0 to 0 , with respect to mw, and pe, let Habe oe Hao O10," fed,” and we have 0 09 = =a (cosh z— pr) ¢ a OZ Wi iG One (1 -7’) axes) 0 d n—-4 - is z or Se (a) 2 0 (cosh z— pr)” * R. A. FISHER 511 which, on substituting cos @ for — pr, may be expressed in terms of a Legendre function in the form n—4 (i cosec 0)" Qn_o(t.cotO).(L—9®) 2 dr ceececccecesesseeceee ily, Nak I a dz teed. EES 9 cosh z+cos @~ sin @’ - dz 1 Git Eee Oe) Eo bee I, (cosh z + cos 0)" D —2 (aa 78) sin 0’ and since this is a function of pr only, we may express the frequency distribution by the convenient expression Fart / 6 Ne ig Ra aD or" fe a) ue Professor Pearson has shown that this last result can be obtained directly from Sheppard’s theorem* that ao 1 Ma 2 be 1 [ [e ~ 3 ) (5 aes a) Id, 2.V1— R? 0 Jo making the substitutions pol po pnd pa = ose) 1 ” n (=Rh)z2° d= p°)ay” it s n (— B)S)— poe’ R mrp (—R)>,5, (1—p2)a,0,’ which give R= pr and cos (— Rh) = 6, we obtain n My? 2prey fy | Me” 2 po — oe 2 2 i ia) sana 7102 F62 - ~ e 2 o1,02(1—p)Jo Jo and hence differentiating (nm — 2) times with respect to 7, the required expression is obtained. 6 OU se 5. The form which we have now obtained may be applied without difficulty to all small even values of n, and in such cases is peculiarly suitable for the calculation of moments. When n= 2 the ordinate of the curve, with abscissa 7, is SOE (1— 71°) sin 8’ which becomes hyperbolic in the neighbourhoods of —1 and +1. The value * Phil. Trans, Vol. 192, A, p. 141. 512 Distribution of the Correlation Coefficients of Samples of r is, therefore, as we know, either —1 or +1, and the proportion, in which these occur, depends upon p. The ratio of the infinite areas included with the asymptotes of the above curve is cos" p eos (= p)’ so that the mean value of a number of observations is SU 2 When n=4 there is still no approach to normality, the curve takes the form — (0 — 3 cot 6 + 36 cot? 
@), which, when r is positive, increases regularly from its value of ;4 when 6=0, to infinity, to which it approaches as @ approaches 7. Unless p is actually equal to 1, in which case r is also 1 of necessity, the curve has finite ordinates at both extremes. For calculating the number of values which should fall within any given range, the integral, earl — @cot@), may be directly tabulated, as has been done in forming the accompanying table of “ Student’s” observations, and the corresponding expectations. The values given by Mr Soper’s formula are apposed for comparison. Table for comparison with p. 114, Biometrika, Vol. IX. H.E.Sopevr’s Calculated ae 3 i r frequency | Observed ri Sees oe approxi- Diasec g m : m mation g m ‘905—1 | 202-1 175-5 230°3 ge *805—905| 1249 | 136-5 } eRe 69 | “98-9 } pie 20 705—805| 88-7 84 } 72°1 | — 38 09 +203 | 318 -605— 65:1 66 57°6 Peoer =e a t +123. { 173 |. Zee \ +118 | 158 305— 30°6 245 34°3 x 205— 24°8 24:5 } = Oe Te | 99-7 i Boe 2 105— 20°5 19 25°6 2 aoe aa 7 \ =11-6 9). 2:55" sllemsets } 21:6 | 9-80 1-905— 145 22 18°8 2 1-705 — 10:7 ee oo NE at ee * ef a ‘o re 5 . . i . . oe he iG } 412-7 | 10-54 oe $121 | 9°21 Pape oe ca } +51] 219 He + 86 | 8-80 1105— 5-1 ES NS : 19 . ; ie ee ae ot ee lets 2 } +105 | 44-10 — 745 — 23°61 — = 84°17 R. A. FIsHER 513 6. The direct process of integration by parts applied to such expressions as -4 ae 1 (eB > : on 02 ' | ae ey, and es (1 —r*) T ani 9 Os : : is: oP & when n is even, merely introduces the sums and differences of the terms AP at the extremes, where r is —1 or +1, with coefficients which are, in any particular case, easily calculable. Thus, » being 6, OLE bat Cd AE CGE a. Oe yn ie =) ga= ja 5 |. a - (25 oI, = 2 x the sum of the extreme values of 6 7p 9 — 3 cot 0 + 30 cot? A) (1 — @cot 8). 
— 2x the difference of the extreme values of 3 r If p=sin a, so that the extreme values of 6 are a= a and 5% a, the sums and differences may readily be expressed in terms of a, and the first few may here be tabulated: the table has been carried back as far as is necessary for the calculation of the fourth moment. sum difference sin?6 (7+26? 7-66? m cot? 6 po Bee OROs= 2 = ————— 3 te an? ap? a 36 cot 6 7 cot of m (a+3 tan a+3 a tan?a) ; 62 2 = {6 +#(1 -5) cot a cota (1+a tan a) cota {a-2ton a+ (F +0’) tan a} 6? 1 : 3 Zi +a? Ta snd 5 mw tana 2a tan a = 7 ae 6 cot 6) 2 tan? a (1+a tan a) m tan? a — Gs 3 cot 6 +36 cot? 6) a tan? a (143 tan? a) 2 tan? a(a+3 tan a+3a tan? a) a (4-96 cot 6415 cot? 6-156 cot? @) | 2tanta(4+9atana+15tan2a+ 15a tana) | r tanta (9 tan a+15 tan’ a) There are here two natural series, which appear alternately as sums and differences; the simpler, which may be expressed in the form 7 sin? a (. ae 2 cos ada ‘ 514 Distribution of the Correlation Coefficients of Samples is essentially a series of Legendre functions of the first kind; and may be expressed as Be i = . tan? a ea Py (1 tan a) ; and it is these only which occur in the evaluation of the even moments. 7. It is, however, desirable to obtain general expressions for these integrals in terms of 1 and p, and to evaluate them when n is odd. For this purpose let us introduce a quantity ¢, such that cos ¢ = cos 0 —k, then, when & is sufficiently small, we may expand ¢? by Taylor’s theorem, so that Pa Cee | aC eat a: oP NOE gat * 59903 t/3 (sna) 3 to Now let k= phv1—7, eesti re Cn =") ( 0 y o hen Bene Se anim s 2 sin 000) 21°” and differentiating twice with respect to h F ; 4) 2 ¢ pee 2 a) 2 @2 i 3 6) 3 @2 pa Teer) (e ae =e -)( sag) 5 thea) Gam! ge whence, dividing by (1 — r)2, we obtain p Gein) SOR arenes ( Was (sana) 5 as & a) 2 (1 — 72) \sin 7) Dal Ue sin na 2 pale 4 ( oe Ee wate (sn 000) 3 a n—4 4 Q : “3 3 gr e so that {Ee r? (1-7?) 
aaa g ar may be obtained by multiplying by |x —3 the coefficient of h”-* in ef r? dr il —¢ cot d =1 A) Pay sind’ when cos ¢ = cos 0 — phV1 —r?=—p(r+hv1—7°). Our object might equally be achieved by the evaluation of the integral ie ner ( (oy ) fe -1 1—7°\sing sin @/" The quantity ¢ is determined by the equation cos ¢ = cos 0 — ph V1 — 7°, that is cosh =—p(r+hv1—7"). R. A. FISHER 515 If now r=sin B, h=tane, then cos = — psin Bf, cos $=—pV1+h?sin(@+e)=—pV1+/?sin f’, and as_ sr passes from —1 to +1, 8 passes from -5 tomer 2 y 6 from 5 —a to xt a, Bp from —v 46 to = and thence to = Das 2 2 2 7 7 and co) from g—%tos +a and thence back to — st a, where sina =p V1 +42, ¢ oscillates in the same manner as 6, with a somewhat greater amplitude, and slightly in advance in respect of phase. 1 dr P si? VI- 9 The expression may now be reduced to pe ae : pa ’ e| - aag (. 1 & g sin a sue .) ae’ = +e sin? d zi —sin?a’ sin? 8’ (1 —sin?a’ sin? B’)? us 2 ane] ats sina’ sin 8’ dp’ = (0 ie 2 28 : ; $ zs » 1 —sin?a’ sin’ — sin’ a’ sin (1 — sin? a’ sin? 8’)? +p fi (f) sin a’ sin PB’ dp’ a an — sin? a’ sin? 6’)? 2 2 a7 4 7 2 _puw | 7p ae (=) + ™p (1 — cosa’) cos a” cos?a’ \cosa cos? a’ pr sin a tan é€ ~ cos? a’ cos a but cos? a’ = 1 — p?(1 + h?) = cos?a— sin? a tan’e, ,ftil-¢cotd dr om tan? a so that jes ——_ as 1 sitd vVl—,~ 1l-Atane From this evaluation we deduce the general form ie (1 ae Foe e Cie, CAN YO cy dae anlaw sie (III). Biometrika x 516 Distribution of the Correlation Coefficients of Samples The absolute frequency df, with which + falls in the range dr, is therefore - n-1 net —p?) 2 ) n—2 @ SES Co) ae ie aie 8. Ido not see how to integrate the other expressions of the type en rv dr es sin?d VJ — 7?’ although a form could probably be obtained when p is even. The general expression for the second moment may, however, be deduced by means of a reduction formula. By a process of integration by parts it appears that, if we write n-4 ne 23 Q n—1 2 ie (ies 2)? 
ares Nl 2 dr =Ln.p, then Lnse.2 = Into.0+ WLn,9 —u(n— Dies : tan® and since i — an (= “—tana+a), we may obtain successively tan?® tan? I 6.9 = 247 ee eesee “+ tana—a), 4 3 an7 7? 3 io 20e (= green CPrges “—tana-+a), 6 5 3 and so on, yielding, when n is even, the expression a In.g = In. —T |n _ 2{ tan" xd, era oe) a form which may well hold when 1 is odd. The above expressions are useful in tabulating the numerical values of the second moment, 7+ 07, of the unit curve, which may easily be calculated in succession for different values of n when tan?a is taken to have some simple value. 9. Before leaving this aspect of the subject it is worth while to give a more detailed examination of the mean of the frequency curves of r when n= 4. Two formulae are arrived at by Mr Soper, which are equivalent approximations of the second degree a fat 3 PA elie 1—p? 3 L r= p|[1- ay {1+ 7 +8p)} [=p [1-95 {r+ pga +3e9} |, 1. F=p [1 HOF {tga M9 J=0[1-*Gf1-70-o} R. A. FISHER 517 and these we shall compare with the form ET: 7= 2 (a + cot a— acot?a), p | 1000 | -2000 | -3000 | -4000 | 5000 | 6000 | 7000 | 8000 | 9000 | -9500 I | 0853 | 1710 | -2578 | 3463 | 4377 | ‘5333 | ‘6347 | ‘7443 | 8649 | ‘9304 II | 0847 | 1697 | 2555 | 3419 | 4310 5241 | 6236 | °7330 | 8566 | -9254 TIT | 0850 | 1704 | 2570 | 3451 | 4360 | 5301 | 6290 ‘7357 | 8540 | 9209 It will be observed that the approximations le on either side of the exact value over the greater part of the range, and that the error of the first approximation increases up to the value when p="9. The second formula gives the correct value somewhere between ‘8 and ‘9, and is thereafter too large. For the particular case p = 6608, I find (formula IIT) 7 ='5897, nearly the maximum difference from p, Mr Soper gives (p. 109) the value 5933 and the experimental data ‘5609. The two theoretical values are much nearer to each other than either is to the experimental value. 
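The row marked III in the table above can be recomputed from the exact form for $n=4$. The sketch below reads that form as $\bar{r}=\frac{2}{\pi}(\alpha+\cot\alpha-\alpha\cot^2\alpha)$ with $\rho=\sin\alpha$; the factor $\frac{2}{\pi}$ is an assumption about the damaged print, adopted because it reproduces the tabulated figures, and the function name is illustrative.

```python
import math


def mean_r_for_samples_of_four(rho):
    """Exact mean of r for n = 4, read as (2/pi)(a + cot a - a cot^2 a), rho = sin a."""
    alpha = math.asin(rho)
    cot = math.cos(alpha) / math.sin(alpha)
    return (2.0 / math.pi) * (alpha + cot - alpha * cot * cot)


# Values of rho taken from the table, and the particular case rho = .6608.
checks = {rho: mean_r_for_samples_of_four(rho) for rho in (0.1, 0.5, 0.9, 0.6608)}
```

To four places the results agree with the entries .0850, .4360 and .8540 of row III, and with the value near .5897 quoted for $\rho=.6608$.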
On the whole, it is obvious that even in this unfavour- able case Mr Soper’s formulae possess remarkable accuracy. 10. The use of the correlation coefficient r as independent variable of these frequency curves is in some respects highly unsatisfactory. For high values of r the curve becomes extremely distorted and cramped, and although this very cramping forces the mean 7 to approach p, the difference compared with 1 —p becomes inordinately great. Even for high values of n, the distortion in this region becomes extreme, and since at the same time the curve rapidly changes its shape, the values of the mean and standard deviation cease to have any very useful meaning. It would appear essential in order to draw just conclusions from an observed high value of the correlation coefficient, say 99, that the frequency curves should be reasonably constant in form. The previous paragraphs suggest that more natural variables for the treatment of our formulae are afforded by the transformations ip PEA oo irceaer( l-r p 7 = tan a= —___., “/ The expression for the frequency curve (II) aaa 6 ; moe n—-1 ff2 : G8)” laa) ae 66—2 518 Distribution of the Correlation Coefficients of Samples now becomes ( 0 ie 62 dt " n-1 sin 000 2 (+e) = and the range of the curve is extended from — to +o. It is interesting that in the important case, r=0, the frequency reduces to dt ~—~n=1 and the curves are identical with those found by “Student” for z, d+?) the probability integral of which he has tabulated in his first paper. 11. 
The moments of these curves are obtained by the evaluation of the expressions i ( 0 ie @ dt fe ( 0 Va @ tt in@ ; a » \sin 000 5 haneeeaie —« \sin@0@ 2 dQ+e)2 J —» \sin@00 2 +e2 and so on; of these the first is known already (III) to have the value vg In —3 preareaenii=l Chea yo)) 2 and the others may be obtained in succession, for Greys i << Cre Cedi, oh Cm | Sut Ney G2 aeaateone CP eee SiO OO) cara (14 pyr op” aoe | ee eS TO ptt as i tae ano p annem so that the first moment i ( ri) Nae tdt fy) 7 |r — 4 ain . a 5 n=l ~ 9, ° n= n=4 sin 806 (1+ #)? 2 (Ch) (ae) »n—1 n—1 eae 2 (1+#)2 4(n—2)p_ —@ geal gar ail ss n-3V/1—— n-3 hence t= T. The mean, therefore, is greater than the true value 7 by a constant fraction of its value. And this fraction decreases in the simplest possible manner as n increases. In the same way, we may evaluate the second moment, COs = array Hea) ag) preatie| IH ise 3) all and a= flts + Ooo ot the third moment ae (n—2)r Re au i )))) VB.e ~ (n—3)(n—4)(n—5) |sa+e)+ (n—3p J’ and the fourth moment R. A. FIsHEr 519 3 6 (n — 2) 7? ah a Se eS 2\2 ae = (n—4)(n—6) {a a) GA — 3)(n — 5) (n— 3)!(n—5) For high values of n, all but the first terms tend to vanish; §, tends to vary as p’?, and @, tends to become independent of p. In effect for high values of 7, where p? is nearly equal to unity, the form of the curve is nearly constant, but the skewness measured by , decreases to zero at the origin, and changes its sense, when 7 and p change their sign. (l+7°)+ Tables are appended for inspection rather than for reference which show the nature and extent of these changes in the form of the curves. Table of o°. 
P= ‘O01 03 ‘10 *30 1:00 3°00 10°00 30°00 100°00 8 °2531 2593 *2810 3430 “5600 1°140 3°350 9°550 | 31°250 13 "1123 °1148 1234 1481 2344 ‘4811 | 1344 3811 | 12°444 18 ‘07219 | °07372 | -07908 | :09438 | °1479 *3010 8365 2°367 7722 23 705319 | °05429 | -05817 | °06925 | -1080 2188 6066 1714 5°592 3S 03484 03555 | -03805 | 04518 | "7015 1415 3912 17105 3°602 43 02590 | -02643 | -02827 | °03353 -05194 1045 “2886 *8146 | 2°655 53 02062 02103 | °02249 -02666 | -04123 08288 | °2287 *6451 | 2°103 Table of By. eS 01 O03 ‘10 30 1:00 3°00 10°00 | 30°00 | 100-00 oe r= & | 05685 | °1662 ‘5076 =| 1°230 2°450 3°788 3°965 | 4°153 | 4°184 | 4°252 13 | 01517 | 04776 1376 *3400 *7058 | 1:018 1°205 | 1271 | 1296 | 1°3065 18 | 008399 | :02463 ‘07645 | °*1914 “4016 5857 *6990 | °7395| °7546| °7619 23 | ‘005757 | ‘01691 05247 | °1317 “3016 4093 "4910 | °5208| °5314| °5361 38 | °003518 | 010385 038214 | -08100) ‘1731 "2559 *3031 | °3260} 3334 | °3366 43 | °002530 | -007435 | *02315 05841 | +1251 "1858 ‘2237 | °2376| °2429 | °2452 53 | :001973 | 005798 | *01807 04562 | -09800] +1458 1757 | *1868} +1910] °1928 Table of Bs. 7=| -00 ‘01 03 “10 30 1:00 3°00 10°00 30°00 | 100°00 or) r= 8 | 60000 | 6°1137 | 6°3179 | 70179 | 8:4767 | 10°9668 | 12-9652 | 14°1116 | 14°5024 | 14°6508 | 14°7159 13 | 3°8571 | 3°8802 | 3-9248 | 4:0663 | 4°3770 | 4:9397 | 5°4240| 5°7147| 5°8186| 5°8578| 5:8750 18 | 3°5000 | 3°5121 | 3°5356 | 3°6104 | 3°7937 | 4°0828| 4°3532) 4:5186| 4°5783 | 4°6009| 4-6109 23 | 3°3529 | 3°3612 | 3°3768 | 3:4271 | 3°5556 | 3:7486 | 3°9356) 4:0511 | 4:0930] 4:1089]| 4:1159 33 | 3°2222 | 32271 | 3:2365 | 3°2667 | 3°3343.| 3:4619| 3°5773| 3°6493| 3°6756| 3°6856| 3-6899 43 | 3°1622 | 31656 | 3°1723 | 3°1938 | 3°2422 | 3:3261 | 3°4172|] 3:4692| 3°4886| 3:4958] 3:°4991 53 | 31277 | 3°1303 | 3°1356 | 3°1522 | 3°1898 | 3:2640] 3°3281| 3°3676| 3°3826] 3:3883| 3-3909 520 Distribution of the Correlation Coefficients of Samples 12. 
The fact that the mean value $\bar{r}$ of the observed correlation coefficient is numerically less than $\rho$ might have been interpreted as meaning that given a single observed value $r$, the true value of the correlation coefficient of the population from which the sample is drawn is likely to be greater than $r$. This reasoning is altogether fallacious. The mean $\bar{r}$ is not an intrinsic feature of the frequency distribution. It depends upon the choice of the particular variable $r$ in terms of which the frequency distribution is represented. When we use $t$ as variable, the situation is reversed. Whereas in using $r$ we cramp all the high values of the correlation into the small space in the neighbourhood of $r=1$, producing a frequency curve which trails out in the negative direction and so tending to reduce the value of the mean, by using $t$, we spread out the region of high values, producing asymmetry in the opposite sense, and obtain a value $\bar{t}$ which is greater than $\tau$. The mean might, in fact, be brought to any chosen point, by stretching and compressing different parts of the scale in the required manner. For the interpretation of a single observation the relation between $\bar{t}$ and $\tau$ is in no way superior to that between $\bar{r}$ and $\rho$.

The variable $t$ has been chosen primarily in order to give stability of form to the frequency curves in different parts of the scale. It is in addition a variable to which the analysis naturally leads us, and which enables the mean and moments to be readily calculated, and so a comparison to be made with the standard Pearson curves, but it is not, with these advantages, in a unique position. In some respects the function, $\log\tan\frac{1}{2}\left(\alpha+\frac{\pi}{2}\right)$, is its superior as independent variable.

I have given elsewhere* a criterion, independent of scaling, suitable for obtaining the relation between an observed correlation of a sample and the most probable value of the correlation of the whole population.
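The reversal described above, with the mean of $r$ falling below $\rho$ while the mean of $t=r/\sqrt{1-r^2}$ rises above $\tau=\rho/\sqrt{1-\rho^2}$, can be seen in a small simulation. This is a sketch only: the sample size, the value of $\rho$, the seed and the function names are assumptions made for illustration.

```python
import math
import random


def sample_r(n, rho, rng):
    """Correlation coefficient of one sample of n pairs from a normal population."""
    root = math.sqrt(1.0 - rho * rho)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        xs.append(x)
        ys.append(rho * x + root * rng.gauss(0.0, 1.0))
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def mean_r_and_mean_t(n, rho, trials=20_000, seed=7):
    """Average r and average t = r / sqrt(1 - r^2) over many simulated samples."""
    rng = random.Random(seed)
    total_r = 0.0
    total_t = 0.0
    for _ in range(trials):
        r = sample_r(n, rho, rng)
        total_r += r
        total_t += r / math.sqrt(1.0 - r * r)
    return total_r / trials, total_t / trials


n, rho = 8, 0.6
tau = rho / math.sqrt(1.0 - rho * rho)   # 0.75 when rho = 0.6
mean_r, mean_t = mean_r_and_mean_t(n, rho)
```

The same samples thus give a mean below the true value on one scale and above it on the other, which is the sense in which the mean is not an intrinsic feature of the distribution.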
Since the chance of any observation falling in the range $dr$ is proportional to

$$(1-\rho^2)^{\frac{n-1}{2}}(1-r^2)^{\frac{n-4}{2}}\left(\frac{\partial}{\sin\theta\,\partial\theta}\right)^{n-1}\frac{\theta^2}{2}\,dr,$$

for variations of $\rho$, we must find that value of $\rho$ for which this quantity is a maximum, and thereby obtain the equation

$$\frac{\partial}{\partial\rho}\left\{(1-\rho^2)^{\frac{n-1}{2}}\left(\frac{\partial}{\sin\theta\,\partial\theta}\right)^{n-1}\frac{\theta^2}{2}\right\}=0.$$

Since

$$\int_0^\infty\frac{dx}{(\cosh x+\cos\theta)^n}=\frac{1}{(n-1)!}\left(\frac{\partial}{\sin\theta\,\partial\theta}\right)^{n}\frac{\theta^2}{2},$$

this may be written

$$\frac{\partial}{\partial\rho}\left\{(1-\rho^2)^{\frac{n-1}{2}}\int_0^\infty\frac{dx}{(\cosh x+\cos\theta)^{n-1}}\right\}=0,$$

* R. A. Fisher, "On an absolute criterion for fitting frequency curves," Messenger of Mathematics, February, 1912.

which leads by a process of simplification to the equation

$$\int_0^\infty\frac{dx}{(\cosh x-\rho r)^n}\,(r-\rho\cosh x)=0.$$

Since $\cosh x$ is always greater than $\rho r$, the factor in the numerator, $r-\rho\cosh x$, must change sign in the range of integration. We therefore see that $r$ is greater than $\rho$. Further an approximate solution may be obtained for large values of $n$. The integrand is negligible save when $x$ is very small, and we may write $1+\frac{x^2}{2}$ for $\cosh x$ and $(1-\rho r)^n\,e^{\frac{nx^2}{2(1-\rho r)}}$ for $(\cosh x-\rho r)^n$. Then

$$r\int_0^\infty e^{-\frac{nx^2}{2(1-\rho r)}}\,dx=\rho\int_0^\infty\left(1+\frac{x^2}{2}\right)e^{-\frac{nx^2}{2(1-\rho r)}}\,dx,$$

and in consequence, as a first approximation,

$$r=\rho\left(1+\frac{1-\rho r}{2n}\right),\quad\text{or}\quad r=\rho\left(1+\frac{1-r^2}{2n}\right).$$

The corresponding relation between $t$ and $\tau$ is evidently

$$t=\tau\left(1+\frac{1}{2n}\right).$$

It is now apparent that the most likely value of the correlation will in general be less than that observed, but the difference will be only half of that suggested by the mean, $\bar{t}$. It might plausibly be urged that in the choice of an independent variable we should aim at making the relation between the mean and the true value approach the above equation, or rather that to which the above is an approximation, or that we should aim at reducing the asymmetry of the curves, or at approximate constancy of the standard deviation.
In these respects the function $\log\tan\frac{1}{2}\left(\alpha+\frac{\pi}{2}\right)$, that is, $\tanh^{-1}\rho$, is not a little attractive, but so far as I have examined it, it does not tend to simplify the analysis, and approaches relative constancy at the expense of the constancy proportionate to the variable, which the expressions in $\tau$ exhibit*.

* [It may be worth noting that Mr Fisher's $\tau$ is the $\phi$, the square root of the mean square contingency, of the more usual notation, and is the expression used in determining the probability that correlated material has been obtained by random sampling from uncorrelated material. ED.]

ON THE DISTRIBUTION OF THE STANDARD DEVIATIONS OF SMALL SAMPLES: APPENDIX I. TO PAPERS BY "STUDENT" AND R. A. FISHER.

(EDITORIAL.)

Consider the population distributed according to the law

$$y=\frac{N}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{(x-m)^2}{2\sigma^2}}\quad\ldots(\text{i}),$$

and let a sample of $n$ represented by the variate values $x_1, x_2 \ldots x_n$ be taken from it. Then the probability $\delta P$ that this sample will lie between $x_1$ and $x_1+\delta x_1$, $x_2$ and $x_2+\delta x_2$, ... $x_n$ and $x_n+\delta x_n$, is

$$\delta P=\text{const.}\times e^{-\frac{1}{2}\frac{S(x_r-m)^2}{\sigma^2}}\,\delta x_1\,\delta x_2\ldots\delta x_n\quad\ldots(\text{ii}),$$

or, where $\bar{x}=\frac{1}{n}S(x_r)$ and $\Sigma^2=\frac{1}{n}S(x_r-\bar{x})^2$, we may write:

$$\delta P=\text{const.}\times e^{-\frac{n}{2}\left(\frac{(\bar{x}-m)^2}{\sigma^2}+\frac{\Sigma^2}{\sigma^2}\right)}\,\delta x_1\,\delta x_2\ldots\delta x_n\quad\ldots(\text{iii}).$$

Changing as Mr Fisher does (see p. 510 above) to $\bar{x}$ and $\Sigma$ as coordinates we have:

$$\delta P=\text{const.}\times e^{-\frac{n}{2}\left(\frac{(\bar{x}-m)^2}{\sigma^2}+\frac{\Sigma^2}{\sigma^2}\right)}\,\Sigma^{n-2}\,\delta\bar{x}\,\delta\Sigma.$$

We see at once from this* that the law of distribution of samples of means is the normal curve

$$y=y_0\,e^{-\frac{n(\bar{x}-m)^2}{2\sigma^2}}\quad\ldots(\text{iv}),$$

with mean $=m$, the mean of the population, and with standard deviation $=\sigma/\sqrt{n}$, a well-known result. On the other hand the distribution of samples of standard deviations is

$$y=y_0\,\Sigma^{n-2}\,e^{-\frac{n\Sigma^2}{2\sigma^2}}\quad\ldots(\text{v}).$$

* Of course the form reached above shows that for normal distributions there is no correlation between deviations in the mean and in the standard deviation of samples, a familiar fact.

This curve was first reached by "Student" as a highly probable result following from the relations he had obtained from the moments of $\Sigma^2$*. Mr Fisher's work thus enables us to justify "Student's" assumption.
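The curve for the standard deviation, $y=y_0\,\Sigma^{n-2}e^{-n\Sigma^2/2\sigma^2}$, can be explored numerically. The sketch below takes $\sigma=1$, integrates by Simpson's rule to obtain the mean, and finds the mode by setting the derivative of the curve to zero; the function names, the upper limit of integration and the number of steps are illustrative assumptions.

```python
import math


def sd_curve(s, n):
    """Unnormalised ordinate of the standard-deviation curve, with sigma = 1."""
    return s ** (n - 2) * math.exp(-0.5 * n * s * s)


def simpson(f, a, b, steps):
    """Composite Simpson's rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


def mean_over_sigma(n, upper=6.0, steps=6000):
    """Mean standard deviation of samples of n, in units of sigma."""
    area = simpson(lambda s: sd_curve(s, n), 0.0, upper, steps)
    first = simpson(lambda s: s * sd_curve(s, n), 0.0, upper, steps)
    return first / area


def mode_over_sigma(n):
    """Mode: (n - 2)/s - n*s = 0 gives s = sqrt((n - 2)/n)."""
    return math.sqrt((n - 2) / n)


mean_4, mode_4 = mean_over_sigma(4), mode_over_sigma(4)
mean_20, mode_20 = mean_over_sigma(20), mode_over_sigma(20)
```

For $n=4$ this gives a mean near .7979 and a mode near .7071, and for $n=20$ a mean near .9619 and a mode near .9487, so that both fall short of $\sigma$ in small samples.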
“Student” has discussed at some length the distribution curve for >. He has obtained the values of the moment coefficients p., ws; and py, and the general expressions for the means when n is even and odd. The whole problem is of such importance that it seems worth reconsidering, and providing tables showing the approach of the distribution curve to normality as n rises from 4 to 100. The following investigation largely repeats work given by “Student,” but it expresses the values for ws, w,, and @, and 8, in a different formt. We shall not use approximate expressions for the constants, for the order of terms in 1/n depends so largely on the relative magnitude of their coefficients, that such expressions become unreliable for values of n under 100. Clearly (v) is a skew curve with range limited at one end, }=0, and not at the other, = 00. See Figure p. 524. We shall write the standard deviation of %, os, and the moments of the frequency about the end of the range O as M,’, M,’, etc., while the moment- coefficients about Q will be as usual w,(=0), 2, etc. Obviously w.=cs% It is desirable to ascertain S, =, os and the skewness as well as §, and ®, for the distribution. We do this to show the rapidity of change to a normal distribution. It is well, however, to notice a priori that for n large the distribution does become normal. * «Student’s” approximate values for 6, and fy (loc. cit. p. 10) are, we fear, erroneous. He gives iManeee aly but it is needful to have a further term in we a) ene 72 2 order to obtain 6; and fy, correctly to the second approximation in =. If this further term be p/n?, then: 1 64p —3 : A sere 3 oi on (a + i). as against ‘‘Student’s On (a - i) ; 16p 1 Bo oF lgerers ” ” ” 3 (1- 7a). An examination of our table (p. 529) shows that ‘‘Student’s” corrections are not of the right sign to agree with the facts, and that further no constant value of p would give good results even for fairly high values of n, i.e. it is probable that the term in 5 in D? 
is of equal importance with that in t+ “The Probable Error of a Mean,” Biometrika, Vol. v1. pp. 1—25, more especially pp. 4, 6, and 8 to 10. ; Biometrika x 67 524 Standard Deviations of Small Samples yb | i} } i} ' | i} 1 ! 1 ) U ie) P a OP =mode=, O0Q=mean=2. To obtain this approximation to (v) let us assume & =>+ e, and suppose e small. Expanding log y we find: log y = log y, +(n — 2) log 3S —4n(S/o) + eo. -, n Ee n-2o¢? = ne n—2 6? : = @ + 5 )+ terms in e% 2c? n > Hence since > is at our choice we will take it so that and thus: n>? e ww _— 4 Ss Y=Yur2e ~ 7 @ * o*/(2n) + ete. Or, if ¢ be small compared with o, the distribution is the normal curve: 2 y=yle FR =| = om <— <. =: = =x) ce Ee with mean at aay rma: and standard deviation o/V2n. If n as usual be considerable, this agrees with the ordinary result, ie. Z=o and cy =o/V2n, the distribution being treated as normal. We will now deal with the full result (v). We have: é nz? M, = ySrds=y{ PAs al Pee CEP Rane pec n s ccc (viii), 0 and clearly M,’ depends on a knowledge of Ly= | ule du detasen indeocer aaa eee (ix), 0 for we have: P fon n+p—-1 Mea cal PAs EDITORIAL 525 Integrating by parts we find: Ly = (q—1) Ly-2 ...d.1 L,, if g be even (Ghee a q Mea ep pe sad: Now I= [e-¥du=,/5, 0 2 L, =| ue *” du =1, 0 thus M,’ is determined, and will depend on whether 7+ p be even or odd. poh cn n+1 ie oe n+1 J But My = Yo a Ln = Yo C (n-1) Ens, ims go \?-1 M, =H Yo (=) Una —l1 pe = M,'/M)' = o° ; n and Hence To find the modal value } we must differentiate (v), and we have which gives a result in agreement with the mean > of the approximate solution (vi), as we should anticipate. It now remains to find WM’ and WM,’ absolutely on ea M, =Y% \J/n HRS n—1 My = % Ga Lino Suppose n even, then Lya=(n—2)(n—4)...2x1 TEN PSN ase 1 vA PT ee TL ee (x11). 2; Hence for n even 3 p= MM = oe n—-2n—4 VE eG eG eal we ttteesaeeees (x1V A). Again for n odd Ln = (n — 2)(n— 4)... 
1 x Ve Ins =(n — 8) (n—5)...2 x1, and .hence _ o (n—2)(n—4)...1 /o ReaD Bel fe 14) 526 Standard Deviations of Small Samples These accurate values of =, the mean standard deviation of samples, were first given by “Student” (loc. cit. p. 8). Now by Wallis’ Theorem Ny Gea £ product-of even numbers up to 2n gv" ~ product of odd numbers up to 2n —1° Thus (xivA) for n large tends to become SS ae eeu Vn He 5 me Vaio N° These values, however, really only suffice to show the approach of > to o, as M II and (xivB) they depend on the neglect of terms of the order * as compared to 1, and we should get absurd results for os? by subtracting the square of the above values of > from #2 in (xi). All they really tell us is that for n large & =o, but they give no true, approximation in a If we use Stirling’s Theorem up to the third termt, Le. a 1 wl= Bmore (1+ 55+ Me 12% 288.2? é = 3} (foes we obtain L=a (1 Teamie 39,8) esi eee (xv), 2 1 oc? = om (1 — a) | deed telatne oe eee (xvi), ape ae M3 = ane? ie ee Bar nomhouldet lled’to antreducouNeut _ 189" Sito, Sunnie ut we snou e compe ed to introduce e term — 5184025 g expression to reach the second terms in p; and «4. As we have indicated (p. 523, ftn.), such a term, even if used, will not lead to profitable results. It is better to work with the full formulae. It is desirable to find the full third and fourth moment coefficients in order to determine @, and 8, and so measure as 7 increases the rapidity of approach to the normal curve. * « Student” has used an extension of Wallis’ Theorem, which will suffice for certain constants only. + We can write (xiv a) < o gin-1 ln —1)? >= ( [dese J: Fis aeteeseet cuneate en er ee eeeee (xvii), Jn |n-2 T and (xivB) PO: (n-2)[n-38 = el ™ ee Jn (g-3) | 3 (m ios 3))2 ie ig ieielsleieieleltis oloieseleieleteeieivieleleieieintelersteteletete (xviii), and then apply Stirling’s Theorem. EDITORIAL 527 We have: f é N+3 N+3 May a Lins = Yo (5) (n+ 1) Ly = = (n + 1) M,’. 
2 ’ 4(n2—1 : Hence ps = Mi /M, = = (n+1) pe = ae ee eros taste (xix), n+2 N+2 M; = Yo ca Lnia = Yo a ) MEn- = Ss nM’. n Hence |g EG DLE ope) RRR a eT reer eR OED (xx). Transferring to mean: b= bs. a 2a fer. a fe [ny SS GnnY (1 — 2¢3°/o? — jae *) S a (1 z =e eae an Ga Thus yp; will grow small, not only owing to the factor z but because oc? tends to equal o?/2n as n increases. i os? \? eae o>: ( iz OF Now Bi mr bs? i. n2 os! i Or B= 8n 2 (1 - Set aoa) Shoes ay Semana (xxil). Here =/o is of the form 1— 2S and pee (ay aXe 2259} ( 2) ‘ and thus @, tends as n increases to take the form 8y,?/n, but as y. may be a considerable numerical coefficient 8y,” may be commensurable with n till n is very considerable. We next turn to «4, and shall endeavour to express it in terms of u.= is only 1°5°/, from the usually adopted value o, and the average standard deviation cy only 0°3°/, from its customary value o|V2n. Further 8, and f, are 0105 and 3:0003 respectively, or for all practical purposes have reached their normal values. We think it must be concluded that for samples of 50 the usual theory of the probable error of the standard deviation holds satisfactorily, and that to apply it for the case of n= 25 would not lead to any error which would be of importance in the majority of statistical problems. On the other hand, if a small sample, n < 20 say, of a population be taken, the value of the standard deviation found from it will be usually less than the standard n~ deviation of the true population. If we take the most probable value, >, as that EDITORIAL 529 which has most likely been observed, then the result should be divided by the number in the column entitled mode &/o to obtain the most reasonable value for a. For example, if } be observed, and n = 20, then the most reasonable value to give o is /°9487. The paper by Mr Fisher and the accompanying table more or less complete the work on the distribution of standard-deviations outlined by “Student” in 1908. 
Table of Values of the Constants of the Frequency Distribution of the Standard Deviations of Samples drawn at random from a Normal Population.

Columns: (1) size of sample, $n$; (2) mode, $\tilde{\Sigma}/\sigma$; (3) mean, $\bar{\Sigma}/\sigma$; (4) standard deviation, $\sigma_\Sigma/\sigma$; (5) standard deviation, $\sigma_\Sigma/(\sigma/\sqrt{2n})$; (6) to (8) measures of deviation from normality: skewness, $\beta_1$, $\beta_2$.

  n     Mode    Mean    S.D.    S.D. ratio   Skewness       beta_1   beta_2
  4    .7071   .7979   .3367    .9524        .2696          .2359    3.1082
  5    .7746   .8407   .3052    .9651        .2168          .1646    3.0593
  6    .8165   .8686   .2808    .9725        .1857          .1255    3.0370
  7    .8452   .8882   .2612    .9774        .1648          .1011    3.0251
  8    .8660   .9027   .2452    .9808        .1495          .0845    3.0181
  9    .8819   .9139   .2318    .9834        [illegible]    .0725    3.0136
 10    .8944   .9227   .2203    .9853        .1285          .0634    3.0106
 11    .9045   .9300   .2104    .9868        .1209          .0564    3.0085
 12    .9129   .9359   .2017    .9881        .1144          .0507    3.0070
 13    .9199   .9410   .1940    .9891        .1088          .0461    3.0059
 14    .9258   .9453   .1871    .9900        .1041          .0422    3.0049
 15    .9309   .9490   .1809    .9907        .0998          .0390    3.0042
 16    .9354   .9523   .1752    .9914        .0961          .0362    3.0036
 17    .9393   .9551   .1701    .9919        .0927          .0337    3.0032
 18    .9428   .9576   .1654    .9924        .0897          .0316    3.0028
 19    .9459   .9599   .1611    .9928        .0869          .0297    3.0025
 20    .9487   .9619   .1570    .9932        .0844          .0281    3.0022
 25    .9592   .9696   .1407    .9948        .0745          .0219    3.0014
 30    .9661   .9748   .1285    .9956        .0674          .0180    3.0009
 35    .9710   .9784   .1191    .9963        .0620          .0153    3.0007
 40    .9747   .9811   .1114    .9967        .0577          .0132    3.0005
 45    .9775   .9832   .1051    .9971        .0541          .0117    3.0004
 50    .9798   .9849   .0997    .9974        .0512          .0105    3.0003
 55    .9816   .9863   .0951    .9977        .0488          .0095    3.0003
 60    .9832   .9874   .0911    .9979        .0467          .0087    3.0002
 65    .9845   .9884   .0875    .9980        .0447          .0080    3.0002
 70    .9856   .9892   .0844    .9982        .0430          .0074    3.0002
 75    .9866   .9900   .0815    .9983        .0415          .0069    3.0001
 80    .9874   .9906   .0789    .9984        .0402          .0064    3.0001
 85    .9882   .9911   .0766    .9985        .0389          .0060    3.0001
 90    .9888   .9916   .0744    .9986        .0378          .0057    3.0001
 95    .9894   .9921   .0725    .9987        .0367          .0054    3.0001
100    .9899   .9925   .0706    .9987        .0358          .0051    3.0000

TUBERCULOSIS AND SEGREGATION.

By ALICE LEE, D.Sc.

(1) In his book The Prevention of Tuberculosis (London: Methuen, no date on the issue we have used) Dr A.
Newsholme has examined the influence of segregation on Tuberculosis. This is the topic of Chapter xxxv. In the opening of this chapter, he writes:

“The exact measure of institutional segregation of phthisis is the ratio stating how many of the total days of sickness (number of patients and number of days of sickness) are passed in institutions. This ratio and the equivalents for it which have to be used in practice may for convenience be called the segregation ratio. The need for equivalents for the ratio as stated above arises from the fact that we are dealing with actual recorded experience, and the material has to be taken from the records as they happen to exist.” (p. 266.)

After noting the incompleteness of the records, Dr Newsholme continues:

“It becomes necessary therefore to select other figures which vary approximately with the total days of tuberculous sickness and the total days of tuberculous sickness passed in institutions.” (p. 266.)

We shall discuss below what “indirect measures of segregation” Dr Newsholme selects, but he gives the following most proper caution with regard to them:

“In using these indirect measures of institutional treatment of tuberculosis and of its prevalence it must be remembered that they are indirect and approximate. Thus, for instance, figures for institutional treatment usually give the number of cases and not days of treatment, and while they tell how many people were segregated in institutions do not show the average duration, still less the quality of the treatment. Any of these indirect forms of segregation ratio has therefore to be verified wherever possible by the application to the same community and period of one or more other forms of the ratio, and checked wherever practicable by a special examination of sample constituent communities whose figures are included in the total.” (p. 268.)
Dr Newsholme in the course of his chapter gives a number of very high correlations between the phthisis deathrate and the indirect forms of the segregation ratio he has selected, and he interprets these as well as a long series of graphs as demonstrating that institutional segregation has been a most important factor in the diminution of the phthisis deathrate. Now any two variates which are changing continuously with the time (say, the consumption of bananas per head of the population and the fall in the birthrate) will exhibit high correlation and will show graphically very high association, if plotted to appropriate scales and on a common time basis. Until the time factor has been removed, either by partial correlation or otherwise, it would be most unwise to interpret such cases as providing any causal relationship. It seemed accordingly worth while to reinvestigate Dr Newsholme’s problems with the aid of a rather more adequate statistical apparatus.

(2) We must frankly confess at the outset that we have had great difficulty in following Dr Newsholme’s description of the methods he has adopted to measure the amount of segregation. His charts do not seem always in accordance with his tables, and both are occasionally out of agreement with his definitions. As he does not give the raw data on which his correlations are based, but only condensed versions of them in his tables and graphs, it is impossible to test his conclusions without returning to the original sources, which are not always stated, and when we have found them and our results differ, we are unable to say whether the difference is due to failure in his or in our arithmetic, or to divergences between his and our records.

Dr Newsholme uses in all some six measures of the segregation ratio, four intentionally and two apparently by inadvertence.

Let $P$ = total population of a given area, $\phi$ = the total number of annual deaths from phthisis.
Then $\phi/P$ multiplied by 10,000 or 100,000, as the case may be, gives the crude deathrate from phthisis. Let $D_i$ be the deaths from all causes which occur in institutions and $D$ the total deaths in the same area, then $100D_i/D$ is Dr Newsholme’s first approximation to the segregation ratio*. On p. 270 he gives two tables which show in (a) England and Wales as a whole, (b) in London, that, while in the course of forty years $1000\phi/P$ has practically halved, $100D_i/D$ has practically doubled. The data, Dr Newsholme tells us, show “not only a very close correspondence between the increase of total institutional segregation measured by the ratio in question and the decrease of phthisis, but an even more striking similarity in the ratio at which these changes have occurred” (p. 271). This is illustrated by a graph on p. 271, in which the logarithms of the phthisis deathrate are plotted to time against the logarithms of the indices of institutional deaths to all deaths†. We do not know why Dr Newsholme has chosen this method of representation; it certainly, with his choice of scales, makes the two curves roughly parallel, but this does not demonstrate the “similarity in the ratios at which these changes have occurred.” For, if the actual values be plotted to the time, the curve of phthisis deathrate is convex and the institutional deathrate concave to the time axis; in other words, while the rate of one is increasing, the rate of the other is decreasing during the period in question, always on the supposition that we plot the results as Dr Newsholme has done with reversed directions of increasing scales for the two indices.

* The assumption made appears to be that for the period in question $D_i$ is proportional to the institutional deaths from phthisis, a very big assumption.

† The logarithms of the ratios of institutional deaths to all deaths appear to be either wrongly plotted or wrongly calculated.
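The earlier caution, that any two variates changing continuously with the time will show high correlation, can be illustrated with a small simulation. This is only a sketch under stated assumptions: the two series are synthetic (a falling and a rising linear trend with independent random disturbances, none of it taken from Dr Newsholme's tables), and correlating successive differences is used here as one simple way of removing the time factor. The names `pearson`, `differences` and `simulate` are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def differences(series, order=1):
    """Successive differences of the given order (order=1: x[t+1] - x[t])."""
    out = list(series)
    for _ in range(order):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def simulate(n, seed=0):
    """Two causally unrelated series: one falls with time, one rises."""
    rng = random.Random(seed)
    falling = [100.0 - 1.0 * t + rng.gauss(0, 3) for t in range(n)]
    rising = [10.0 + 0.5 * t + rng.gauss(0, 3) for t in range(n)]
    return falling, rising


if __name__ == "__main__":
    falling, rising = simulate(38)  # 38 yearly entries, as an illustration
    print("crude:", round(pearson(falling, rising), 3))
    for k in (1, 2, 3):
        r = pearson(differences(falling, k), differences(rising, k))
        print("difference order", k, ":", round(r, 3))
```

The crude coefficient is large and negative purely because both series move with the time; once the trend is differenced away, only the relation between the independent disturbances is left.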
He states that “the experience is summarised in the high correlation coefficients of .91 for England and Wales (1878-1903) and .90 for London (1866-1904)” (p. 271). The correlations found from his actual tables do not appear to agree with these, being, for example, -.93 for England and Wales with the negative sign as we should anticipate; but as Dr Newsholme does not give the same years for his correlation coefficients as in his tables, he may have worked out his coefficients for individual years. It is impossible to test the matter, as neither the figures nor their source are provided. If, however, we take his Tables LXII and LXIII, and apply the variate difference method* to Dr Newsholme’s data as they stand in his book, which are all the data available, we find:

Correlation of Phthisis Deathrate and Ratio of Deaths in Institutions to Total Deaths.
England and Wales: Third Differences -.174 ± .293,
London: Second Differences -.094 ± .252.

In other words the data show no significant relationship between this measure of segregation and the phthisis deathrate, when the time-factor is annulled, even with the early differences. It is impossible to press the matter further because the data are far too sparse for difference treatment, but the results, such as they are, are sufficient to indicate that Dr Newsholme’s high correlations are solely due to the fact that both variates are continuously changing with the time†.

(3) As a second measure of segregation Dr Newsholme takes $100\phi_i/\phi$, and $1000\phi/P$ is then correlated with this, $\phi_i$ being the deaths from phthisis in institutions. On p. 275 Dr Newsholme gives very meagre data for Brighton, Sheffield and Salford in groups of years, six pairs of values for Sheffield, five for Brighton and four for Salford. It is thus impossible to test these for annulment of the time-factor, and no references are given to the sources of the original data. On p.
276 we read:

“Coefficients of correlation summarising this correspondence for long series of single years work out at .67 for Salford from 1884 to 1904 and .80 for Sheffield from 1876 to 1905‡.”

If the arithmetical values be correct, they should certainly have negative signs, but even then they would not demonstrate anything but the increasing use of institutions and the decreasing prevalence of phthisis during the years in question.

* Biometrika, Vol. x. pp. 179, 341.

† These values might be modified if we could go to higher differences, but this is impossible on the very limited data which Dr Newsholme provides. On these data all we can state is that no evidence of organic relationship between the variates, such as is asserted by Dr Newsholme to exist, can be demonstrated.

‡ There is no statement as to why Brighton has been omitted.

There is, however, a much graver criticism to be made of Dr Newsholme’s method in this measure of segregation. He proposes to correlate $100\phi_i/\phi$ and $1000\phi/P$, and interprets the high correlations as a sign of the value of segregation in reducing the phthisis deathrate. We have not his data to test his conclusions by, but we can compare them against certain results for 38 years in (i) England and Wales, (ii) Scotland, and (iii) Ireland. Here they are:

Correlation of Phthisis Deathrate and Ratio of Institutional Phthisical to all Phthisical Deaths.

  Years        District             Correlation
  1866-1903    Scotland             -.9815 ± .0040
  1866-1903    England and Wales    -.9750 ± .0054
  1866-1903    Ireland              -.8720 ± .0262
  1876-1905    Sheffield            -.80 ± .0443
  1884-1904    Salford              -.67 ± .0811

The reader may imagine in this table a confirmation of Dr Newsholme’s results, for the larger material gives higher values of the correlations.
On the contrary, these correlations have been obtained by taking as the measure of segregation the ratio

  (Mean institutional deaths per annum from phthisis, 1866-1903) / (Annual total deaths from phthisis).

Now it is clear that this index never varies with the increasing percentage of institutional deaths from phthisis. Yet all the correlations are greater than Dr Newsholme’s! We have little doubt that he would get higher values than he has done, if he replaced the actual institutional deaths per annum by the constant mean value. In other words the results reached by him are of no significance, for we get higher correlations by putting a single fictitious value for the annual institutional deathrate. The real source of his result is not the strong influence of segregation on phthisis, but the spurious correlation introduced by using the phthisis deaths, $\phi$, in the numerator of one variate, $1000\phi/P$, and in the denominator of the other, $100\phi_i/\phi$. Thus no scientific results of value can be found from Dr Newsholme’s second measure of segregation.

In discussing this second measure of segregation, Dr Newsholme lays great stress on the part played by asylums for the insane in segregating the tuberculous. He notes that the percentage of lunatics treated privately with relatives and others was 18.4 in 1859 and fell to 5.5 in 1902, thus marking increasing segregation during the period of fall in the phthisis deathrate. He states (p. 274) that: “the deathrate from tuberculosis in borough and county asylums in 1901 was 15.8 per cent.
of the inmates, and over ten times as great as in the general population.” Now Dr Newsholme’s figure appears to be quoted from the 56th Annual Report of the Commissioners in Lunacy, and in this case it should read 15.8 per 1000 and not per 100, and although Dr Newsholme appears to have made a similar slip in dealing with the deathrate in the general population, he seems to be comparing deaths from all forms of tuberculosis among the insane (some of which have possibly a direct relation to their insanity) with deaths from phthisis alone in the general population. Further he has made no allowance for the very marked difference between the age distributions of the two groups he is comparing. The difference is so great that a phthisis deathrate of 1.46 per 1000 in the general male population is equivalent to one of 2.41 per 1000 among the insane population of males. Even if the corrected deathrate among the insane for phthisis were ten times its magnitude among the sane, we fail to understand what Dr Newsholme means when he asserts that: “the segregation of each tuberculous lunatic has been equivalent to the withdrawal of ten ordinary tuberculous persons” (p. 274). Because tuberculosis among lunatics is ten times as frequent (judging by deaths, and accepting for the purpose of argument Dr Newsholme’s figures), why should the isolation of one tuberculous lunatic be equivalent to the withdrawal of ten sane tuberculous persons? That must suppose a tuberculous lunatic capable of spreading ten times the infection of a tuberculous but sane individual. All Dr Newsholme could say would be that from the standpoint of segregation it is ten times more desirable to segregate any lunatic, than any sane person, for the former is ten times as likely to die of tuberculosis. Dr Newsholme brings no evidence to show that the individual tuberculous lunatic is ten times as dangerous as the individual tuberculous sane person.
As a matter of fact we still need very careful investigation of the relation of lunacy to tuberculosis, not only having regard to some forms of tuberculosis as possible sources of feeble-mindedness, if not of insanity, but also having regard to whether the old idea of asylum segregation as a possible cause of the spread of tuberculosis among lunatics is wholly erroneous, and we might further examine whether the new idea that the majority of tuberculous lunatics were tuberculous on admission is in its turn wholly sound*. In the present state of our knowledge we think the assertion that the increased segregation of lunatics has substantial relation to the decrease in the phthisis deathrate is quite unproven.

* Many lunatics enter and re-enter asylums; it does not follow because they died of tuberculosis and were tuberculous on last admission that their tuberculosis was there on first admission.

(4) Dr Newsholme’s third approximation to the segregation ratio is the index $100p_i/p_t$, where $p_i$ is the number of paupers in institutions and $p_t$ is the total number of paupers, indoor and outdoor. Unfortunately Dr Newsholme’s usage does not agree with his definition. The index he appears to use is generally $100p_t/p_i$, and the values of this are given in the last column of Table LXV (p. 277) and Table LXVII (p. 279). In Table LXVI (on p. 277), however, the 100 factor is dropped and $p_i/p_t$ again used in the heading to the central column, although the figures in that column appear to refer to $100p_t/p_i$. Below this table occur the words:

“This experience for the entire series of individual years is expressed by a coefficient of correlation of -.94 between segregation measured by the fraction of pauper population treated in institutions and the phthisis deathrate.” (p. 277.)

The correlation to support Dr Newsholme’s views should be negative if $100p_i/p_t$ has been used, and positive if $100p_t/p_i$ has been used.
But as many of his other correlations are given with the wrong sign, it is difficult to discover what measure of segregation he actually has used. To add to the confusion the index actually plotted is $\log p_t/p_i$, and not $100p_i/p_t$, which is what Dr Newsholme defines as his index. We have accordingly in our analysis of the figures, to be given later, used both indices $100p_i/p_t$ and $100p_t/p_i$.

It is very difficult to appreciate how the ratio $100p_i/p_t$ can effectively measure the segregation ratio; it is indeed impossible to agree with Dr Newsholme’s view that any of his indices “measure with approximate accuracy the ratio which states how many of total days of tuberculous sickness are passed in institutions.” The policy of compelling as many paupers as possible to go into the workhouse was directly adopted with a view to diminishing the total pauperism as well as abuses connected with outdoor relief, and that policy is the source of increase in the index $100p_i/p_t$. Had Dr Newsholme examined his own Tables LXV, LXVII and LXIX carefully, he would have seen that the percentage of indoor paupers on the general population has remained almost constant for the period in question, while the total paupers per cent. of the general population in England with Wales and in Scotland have decreased. If the same relative number of paupers are segregated now as formerly, how can this segregation have diminished the chances of infection in the community? We can hardly assume that all paupers are tuberculous, or markedly so relatively to other men, so that the reduction of the number of outside paupers by indoor segregation is equivalent practically to a reduction pro tanto (note the extraordinarily high correlations!) of the number of tuberculous in the community. If so, then the reduction of the tuberculous deathrate would be due not to the segregation, but to the large decrease in the total pauperism relative to the population of this country.
The correlation, as we shall demonstrate, is not between the segregation of paupers and the phthisis deathrate, but between the diminution of total pauperism and the phthisis deathrate. We shall investigate how far this relationship between total pauperism and the phthisis deathrate is “organic,” i.e. continues after the annulment of the time-factor, or is purely due to the fact that both pauperism and phthisis have diminished during the forty-year period under consideration.

It was this third definition of a segregation ratio in conjunction with the fourth segregation ratio to be considered later that led us to realise that the whole problem must be dealt with afresh, and the modern methods of partial correlation and variate difference correlation applied to its various aspects. We have taken the period used by Dr Newsholme, 1866-1903 inclusive, and have used the figures for each individual year, thus obtaining 38 entries, which are few indeed, but the best we can probably do with data of this kind, and therefore directly comparable with Dr Newsholme’s results, for he seems to have used individual years for his correlations although he does not always say so (cf. pp. 271 and 280), and notwithstanding that his tables are all given for five-year periods.

The population numbers for England and Wales (Table A) were taken from the Registrar-General’s Annual Report for 1909, and the phthisis deaths from the Reports for 1866-1903; the average of each five years’ period agrees with Dr Newsholme’s values for phthisis, but the values for indoor and for total paupers do not quite agree with his. Dr Newsholme was therefore written to and asked whence he obtained his numbers. He was kind enough to reply, but said that he was unable to refer at the moment to the original tables, but that undoubtedly the data were the statistics given in the Annual Reports of the Registrars-General for England, Scotland and Ireland.
We then examined the Local Government Board returns and found that Dr Newsholme apparently had used the pauper returns for the January quarter of each year. We kept therefore to the Registrar-General’s Report, as the numbers there given are based on the Local Government Board’s returns for the whole year, which are a fairer measure of pauperism than those for the January quarter alone.

For Scotland, our numbers (Table A) agree with Dr Newsholme’s for both phthisis and indoor paupers, except when we take the first five-year period (1866-70), where they differ slightly. In the case of total paupers for the periods 1866-70, 1881-85, and 1896-1900 our figures do not agree*. We cannot find any reason for these divergences except a slip in his or our arithmetic, or the possibility that a wrong number of outside paupers has been taken by one or other of us. We do not think the differences in the values are such as to invalidate a comparison of results. In Ireland the only serious discrepancy in our values is in the total number of paupers for the period 1876-80. These discrepancies, however, emphasise the very necessary rules for statistical treatment: (i) that the ultimate raw data should be published with every inquiry, and (ii) it should be stated exactly where they are taken from, and how they have been treated.

* We are unable to compare his and our data for individual years, because Dr Newsholme has only published his data for five-year periods.

Table A gives our raw data, Table B our deathrates and indices based thereon.

We have correlated the phthisis deathrate taken as $10^5\phi/P$ with $100p_i/p_t$ and $100p_t/p_i$. Taking first England and Wales, and calling these three indices respectively $I_\phi$, $I_i$ and $I_t$, we find:

Correlation of $I_\phi$ and $I_i$ = -.9664 ± .0072,
Correlation of $I_\phi$ and $I_t$ = +.9298 ± .0148.

Dr Newsholme gives -.94 as the coefficient of correlation “between segregation measured by the fraction of pauper population treated in institutions and the phthisis deathrate” (p.
277). Having regard to his confusion of $I_i$ and $I_t$ and his frequent interchange of the signs of correlation coefficients, we can only say our results confirm his high numerical value, but not his actual figure. But does this actual figure mean that there is any real relationship between segregation and the phthisis deathrate? To test this, we replaced the index $I_i$ by $\bar{I}_i$, where

$$\bar{I}_i = 100 \times \frac{\text{Mean number of indoor paupers per } 10^5 \text{ of the population, 1866-1903}}{10^5 \times p_t/P} = 100\,\overline{(p_i/P)}\,/\,(p_t/P).$$

In this index the relative number of indoor paupers is assumed to remain absolutely constant. We found:

Correlation for England and Wales of $I_\phi$ and $\bar{I}_i$ = -.9459 ± .0115,

that is to say we get substantially the same value, a value higher than Dr Newsholme’s, by putting the number of indoor paupers relative to the general population constant throughout the period. It is very difficult, in the face of such a result, to suppose that segregation of paupers has anything whatever to do with the diminution of the phthisis deathrate. It is clearly due to a negative correlation of a high magnitude between $1/(p_t/P)$ and $\phi/P$, or to a positive correlation between $p_t/P$ and $\phi/P$, i.e. to a correlation between a high total pauper rate and a high phthisis deathrate. Dr Newsholme’s result merely reduces to the statement that total pauperism in England and Wales has diminished contemporaneously with phthisis.

If the result has nothing to do with segregation, can we assert that the reduction of phthisis is causally related to the reduction in total pauperism?

Overlooking for a moment a new objection to be raised later, let us apply the variate difference method to the correlation of $\phi/P$ with $100p_i/p_t$ and $100p_t/p_i$ in the cases of England with Wales, of Scotland, and of Ireland; also to the correlation of $\phi/P$ with the index $100\,\overline{(p_i/P)}/(p_t/P)$ in the case of England with Wales. The following are the results:

TABLE I.
Correlation of $10^5\phi/P$ with the pauper indices. ($I_i = 100p_i/p_t$, $I_t = 100p_t/p_i$; $\bar{I}_i$ is the index in which the indoor paupers per head of population are fixed at their mean.)

             England with Wales                            Scotland                      Ireland
             $\bar{I}_i$    $I_i$          $I_t$           $I_i$          $I_t$          $I_i$          $I_t$
  Crude      -.946 ± .012   -.966 ± .007   +.930 ± .015    -.952 ± .010   +.920 ± .017   -.881 ± .024   +.893 ± .022
  Δ_1        +.090 ± .134   -.258 ± .126   +.340 ± .120    -.265 ± .126   +.250 ± .127   -.280 ± .125   +.235 ± .128
  Δ_2        -.201 ± .149   -.461 ± .123   +.542 ± .110    -.240 ± .147   +.182 ± .151   -.264 ± .145   +.180 ± .151
  Δ_3        -.335 ± .153   -.508 ± .127   +.567 ± .116    -.205 ± .164   +.086 ± .170   -.226 ± .163   +.162 ± .167
  Δ_4        -.407 ± .155   -.518 ± .136   +.547 ± .130    -.186 ± .179   +.024 ± .185   -.182 ± .179   +.133 ± .182
  Δ_5        -.475 ± .153   -.528 ± .143   +.529 ± .142    -.182 ± .191   -.003 ± .198   -.145 ± .194   +.108 ± .195
  Δ_6        -.538 ± .149   -.543 ± .147   +.583 ± .151    (none)         (none)         -.112 ± .206   +.081 ± .208
  Δ_7        -.584 ± .145   -.562 ± .150   +.539 ± .156    (none)         (none)         (none)         +.044 ± .219
  Δ_8        -.614 ± .143   -.587 ± .151   +.557 ± .159    (none)         (none)         (none)         -.004 ± .230

It will be seen from this table that whether we use the index $I_i$ or its inverse $I_t$, we get practically the same results, naturally with changed sign. But the results themselves are of extraordinary interest. For both Scotland and Ireland, when we proceed to annul the time-factor by correlating successive differences, we find that the high correlations interpreted by Dr Newsholme as marking a relation between pauper segregation and phthisis deathrate entirely disappear or become less than their probable errors. There is thus no organic relation between these variates as measured by the above indices. In the case of England and Wales, however, while there is a reduction on annulment of the time-factor to roughly two-thirds of the high value noted by Dr Newsholme, this value does not tend to disappear with increasing differences. Thus in England with Wales, as apart from the remainder of Great Britain, there would at first sight appear to be an organic relation between segregation of paupers and the phthisis deathrate.
But our first column under the England with Wales section shows that if we fix the percentage in the general population of these indoor paupers and then annul the time-factor, we reach a slightly higher value of this apparent organic relation. It has therefore nothing to do with segregation. Thus Dr Newsholme’s interpretation of his original high correlations appears in every case fallacious.

There are two methods of testing this result, i.e. the absence of organic relationship between indoor pauperism and phthisis. Suppose we correlate the crude numbers of phthisis deaths per annum and of indoor paupers per annum; the resulting coefficient will have very small logical value because both these variates are continuously changing with the time*. But now suppose we annul the time-factor by correlating the differences of these variates; then we shall free ourselves from the influence of the time-variate, and in doing this we shall also free ourselves practically from the influence of change of population, which is a time change. The following table resulted from this investigation.

* It is noteworthy that the England with Wales and the Scotland correlation coefficients for these crude variates are high and negative, but for Ireland the coefficient is moderate and positive. Thus the factors at work must be totally different in the two Islands. Since indoor paupers relative to the population have remained singularly constant the increase of phthisis deaths must have been much slower than the population increase in Great Britain, but somewhat faster in Ireland.

TABLE II.
Correlation of Crude Phthisis Deaths ($\phi$) and Indoor Paupers ($p_i$).
  Variates   England with Wales   Scotland        Ireland
  Crude      -.932 ± .014         -.718 ± .053    +.457 ± .086
  Δ_1        -.376 ± .116         -.206 ± .130    -.092 ± .134
  Δ_2        -.302 ± .141         -.219 ± .148    -.103 ± .154
  Δ_3        -.213 ± .164         -.180 ± .166    -.143 ± .168
  Δ_4        -.100 ± .183         -.157 ± .181    -.147 ± .181
  Δ_5        -.016 ± .198         -.158 ± .193    -.140 ± .194

It will be seen that for all three countries, whether we start with the positive correlation of the Irish or the negative correlation of the English and Scottish returns, there is no remaining significant correlation after annulment of the time-factor between indoor pauperism and phthisis.

A second method of verifying our conclusions is to find the partial correlation between indoor pauperism and phthisis deaths for a constant value of the total population and a constant value of total pauperism. We thus ask the question whether with a constant population and a constant amount of total pauperism, an increase of indoor pauperism would organically affect the number of deaths from phthisis. By making the population and the total pauperism constant we are largely producing an annulment of the time-factor and ascertaining whether a change in the number of indoor paupers due to causes other than temporal influences the number of deaths from phthisis. The system of correlation coefficients given in Table III, p. 540, was determined.

Here the values of ${}_{P\,p_t}r_{p_i\phi}$ for England with Wales and for Scotland confirm the conclusions we have reached by other methods, i.e. there is no significant relationship at all between phthisis and indoor pauperism. The value for Ireland is, perhaps, significant, but having regard to its smallness (-.3 ± .1) and the size of its probable error, no one can lay real stress on it, in opposition to the results of the other two countries. In general the coefficients for the Irish data appear very anomalous, and certainly divergent from those for Great Britain.
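The partial correlation used above can be sketched as follows. The formulas are the standard first-order and second-order partial correlation expressions, and the probable error is taken as .6745(1 - r²)/√n; neither formula is written out in the paper, so both are assumptions, though the probable error reproduces the paper's ± values for n = 38 yearly entries. The function names are illustrative. The check uses three total coefficients reported for England with Wales in Table III (total paupers with phthisis deaths +.577, total paupers with population -.674, phthisis deaths with population -.950).

```python
import math


def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y for constant z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def partial_r2(r_xy, r_xz, r_yz, r_xw, r_yw, r_zw):
    """Second-order partial correlation of x and y for constant z and w,
    built by applying the first-order formula twice."""
    r_xy_z = partial_r(r_xy, r_xz, r_yz)
    r_xw_z = partial_r(r_xw, r_xz, r_zw)
    r_yw_z = partial_r(r_yw, r_yz, r_zw)
    return partial_r(r_xy_z, r_xw_z, r_yw_z)


def probable_error(r, n):
    """Assumed probable error of a correlation coefficient from n pairs."""
    return 0.6745 * (1 - r ** 2) / math.sqrt(n)


if __name__ == "__main__":
    # Total paupers and phthisis deaths, population held constant.
    r = partial_r(0.577, -0.674, -0.950)
    print(round(r, 3), "+/-", round(probable_error(r, 38), 3))
```

With the rounded inputs this gives about -.27, close to the -.277 ± .101 the paper reports for the same coefficient; the small difference is what one would expect from working with three-figure inputs, since the numerator is a difference of two nearly equal quantities.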
Thus our investigation of the relation between indoor pauperism and phthisis appears to be entirely opposed to Dr Newsholme’s conclusions. We find the segregation of paupers to have no substantial influence on deaths from phthisis. The one outstanding point at present, the relation between $p_t/P$ and $\phi/P$ after annulment of the time-factor (see our p. 538), has no bearing on the segregation problem of Dr Newsholme.

TABLE III.
Total and Partial Correlation Coefficients of Crude Numbers of Indoor and Total Paupers ($p_i$ and $p_t$), Total Population ($P$), and Phthisis Deaths ($\phi$).

  Coefficients                   England with Wales   Scotland               Ireland
  Total coefficients
    $r_{p_i\phi}$                -.932 ± .014         -.718 ± .053           +.457 ± .087
    $r_{p_i P}$                  +.955 ± .010         +.831 ± .034           +.763 ± .046
    $r_{\phi P}$                 -.950 ± .011         -.896 ± .022           +.479 ± .084
    $r_{p_i p_t}$                -.544 ± .077         -.528 ± .079           -.251 ± .103
    $r_{\phi p_t}$               +.577 ± .073         +.780 ± .043           +.070 ± .109
    $r_{p_t P}$                  -.674 ± .060         -.805 ± .038           -.684 ± .058
  Partial coefficients
    ${}_{P}r_{p_i\phi}$          -.287 ± .100         +.111 ± .108           +.162 ± .107
    [label illegible]            [illegible] ± .073   +.492 ± [illegible]    [illegible]
    ${}_{P\,p_t}r_{p_i\phi}$     [illegible]          -.017 ± .109           -.305 ± .099

To approach nearer to the meaning of the relation between total pauperism and phthisis we determined the correlation between $p_t$ and $\phi$ for constant $P$, and found

${}_{P}r_{p_t\phi} = -.277 \pm .101,$

which is barely significant having regard to its probable error.

Now after elimination of the time-factor, we found for the correlation of $\phi/P$ and $\bar{I}_i$ at the eighth difference -.614 ± .143, but this is the same as the correlation of $1/(p_t/P)$ and $\phi/P$. Hence the correlation of $p_t/P$ and $\phi/P$ must be very significant, positive and of the order .6. Now if $p_t$ and $\phi$ after the removal of the time-factor were practically independent of each other, there would be a high positive correlation between $p_t/P$ and $\phi/P$, due to the fact that $P$, when it takes (after annulment of the time-factor) any random deviation, appears in both variates’ denominators. In other words, we are inclined to believe that the high negative correlation between $\phi/P$ and $\bar{I}_i$ is solely due to spurious correlation arising from the nature of the indices used.

To throw still more light on the matter we have investigated the correlation between the total number of paupers and the total number of deaths from phthisis when the time-factor approaches annulment. It will be seen from the table below that for both Scotland and Ireland there is finally no relationship at all between phthisis deaths and total pauperism. On the other hand, England with Wales is tending to a value at least approaching to the crude correlation. We have therefore this noteworthy result: England with Wales starts with a considerable value and concludes with an equally great value, Scotland starts with a high value and ends with a zero value, Ireland starts with an insignificant value and ends with an insignificant value.

TABLE IV.
Relation between Total Pauperism ($p_t$) and Deaths from Phthisis ($\phi$).

                 England with Wales   Scotland        Ireland
  Crude values   +.577 ± .073         +.780 ± .043    +.070 ± .109
  Δ_1            -.095 ± .134         +.025 ± .135    +.164 ± .132
  Δ_2            +.174 ± .151         +.025 ± .156    +.144 ± .152
  Δ_3            +.286 ± .158         +.012 ± .172    +.131 ± .169
  Δ_4            +.347 ± .163         +.033 ± .185    +.110 ± .183
  Δ_5            +.413 ± .164         +.027 ± .198    +.090 ± .196

If pauperism were causative of phthisis, it is hard to believe that this would not manifest itself in the Scottish and Irish returns; these negative any such hypothesis. It would appear that there are essential differences in the treatment of pauperism in the three countries. I suggest, but I cannot demonstrate the view, that phthisis itself leads to pauperism in England, i.e. that the relatives of the phthisical breadwinner more often are allowed to become paupers in England than in the sister countries.
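The spurious correlation invoked above, where a random deviation in $P$ enters the denominators of both $p_t/P$ and $\phi/P$, can be illustrated by simulation. This is a sketch on synthetic numbers only: three mutually independent positive variates are drawn with equal coefficients of variation, which is an assumption made for the illustration and not a property of the pauperism data. The closed-form approximation included for comparison is Pearson's approximate formula for the correlation of two ratios with a common denominator when the three variates are uncorrelated; all names are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ratio_correlation_demo(n=5000, seed=1):
    """Correlate two independent variates, then the same two after
    dividing each by a third independent variate."""
    rng = random.Random(seed)
    paupers = [rng.uniform(80, 120) for _ in range(n)]
    deaths = [rng.uniform(80, 120) for _ in range(n)]
    population = [rng.uniform(80, 120) for _ in range(n)]
    r_raw = pearson(paupers, deaths)
    r_ratio = pearson(
        [a / c for a, c in zip(paupers, population)],
        [b / c for b, c in zip(deaths, population)],
    )
    return r_raw, r_ratio


def approx_spurious_r(v1, v2, v3):
    """Approximate correlation of x1/x3 and x2/x3 for uncorrelated x1, x2, x3
    with coefficients of variation v1, v2, v3."""
    return v3 ** 2 / math.sqrt((v1 ** 2 + v3 ** 2) * (v2 ** 2 + v3 ** 2))


if __name__ == "__main__":
    r_raw, r_ratio = ratio_correlation_demo()
    print("independent counts:", round(r_raw, 3))
    print("after dividing by a common variate:", round(r_ratio, 3))
    print("approximation:", approx_spurious_r(0.1, 0.1, 0.1))
```

With equal coefficients of variation the approximation gives .5, so two unrelated counts can show a correlation of that order merely because both have been divided by the same fluctuating quantity.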
In other words, that the only organic relationship between pauperism and phthisis we have been able to discover may be due to a relatively harsh treatment in England of the dependents of the phthisical.

To show how effectively the variate difference correlation method removes time influence, we may note that we correlated total population ($P$) with total pauperism ($p_t$) and total phthisis deaths ($\phi$) with total population by this method, with a view to ascertaining whether the relation between $p_t$ and $\phi$ would be modified, if we determined it for constant population. The following results were reached:

TABLE IV bis.
England with Wales.

                  Total Population and      Total Population and
                  Total Pauperism           Phthisis Deaths
                  ($P$ and $p_t$)           ($P$ and $\phi$)
  Crude values    −.674 ± .060              −.950 ± .011
  $\Delta_1$      +.457 ± .107              −.039 ± .135
  $\Delta_2$      −.016 ± .156              −.205 ± .149
  $\Delta_3$      −.022 ± .171              −.089 ± .170
  $\Delta_4$      −.031 ± .185              +.002 ± .185

Thus we see that apart from the time-factor there is no relation whatever between either pauperism or phthisis and population. In the relation between total pauperism and phthisis deaths, no further correction for population is needful than that obtained by the annulment of the time-factor as in Table IV. Table IV bis shows us that neither pauperism nor phthisis is organically related to population, although we might well have anticipated that greater density of population would influence pauperism and provide greater chances of infection, and so of deaths, in the case of phthisis.

(5) We now come to Dr Newsholme's fourth and last measure of segregation. It is "the ratio in which the number of paupers treated in workhouses and workhouse infirmaries stand to the total number of deaths in the community" (p. 276). In our notation this is $p_i/\phi$, or as an index $100p_i/\phi$. But in the figures actually given in Table LXV (p. 277), and headed Segregation Ratio, Dr Newsholme appears to be using $100\phi/p_i$. The same remark applies to Tables LXVIII and LXIX (pp. 280-281).
Thus it is difficult to be certain of what Dr Newsholme intends to be taken as his fourth measure of segregation. In our discussion below we have used both $100\phi/p_i$ and $100p_i/\phi$ to provide for both contingencies and to check our results.

Unfortunately Dr Newsholme makes little attempt to justify either his third or fourth ratio as an approximate measure of segregation. It will be remembered that he has defined the true method of measuring segregation to consist in forming the ratio "stating how many of the total days of sickness (number of patients and number of days of sickness) are passed in institutions" (p. 267). In this fourth index of segregation he replaces phthisical patients in institutions by indoor paupers, and total of phthisical patients by total deaths from phthisis, dropping any question of the number of days of sickness. At the very least this seems to involve two assumptions, (a) either that all indoor paupers are phthisical or that for the period in question the proportion of indoor paupers who are phthisical has remained constant, (b) that for the period in question the number of deaths from phthisis has remained a constant fraction of the total number of cases of phthisis. It is difficult to see how, without such assumptions, such figures can "measure with approximate accuracy the ratio which states how many of the total days of tuberculous sickness are passed in institutions" (p. 267). Yet in another paragraph Dr Newsholme quotes with apparent approval the statement of Mr Fleming, who speaks of the "great change in the character of workhouse inmates during recent years.... The able-bodied inmates are gone and the sick inmates have come" (p. 273). Such a statement is absolutely inconsistent with the assumption (a) above.
To justify (b) we must assert that for the last fifty years of the nineteenth century there has been no change in efficiency of treatment in the case of tuberculosis, for without this we cannot assume that deaths from phthisis are even an approximate measure of the number of cases (p. 267). The fact that the reduction in the phthisis deathrate has been substantially different for the different age groups, and is especially marked in the case of children, seems to indicate that recovery, at least from puerile phthisis, is more frequent now than formerly.

However, not to spend more time on these assumptions (which, it appears to us, Dr Newsholme has by no means justified), let us examine whither this fourth method of approximately measuring segregation leads us. Table V gives the necessary coefficients.

TABLE V.
Correlation and Difference Correlations of $10^5\phi/P$ and $100p_i/\phi$ or $100\phi/p_i$.

                England with Wales              Scotland                        Ireland
                $100p_i/\phi$   $100\phi/p_i$   $100p_i/\phi$   $100\phi/p_i$   $100p_i/\phi$   $100\phi/p_i$
  Crude         −.760 ± .046    +.976 ± .005    −.861 ± .028    +.944 ± .012    −.712 ± .054    +.666 ± .061
  $\Delta_1$    −.868 ± .033    +.848 ± .038    −.755 ± .058    +.772 ± .055    −.819 ± .045    +.707 ± .068
  $\Delta_2$    −.879 ± .035    +.875 ± .037    −.824 ± .050    +.834 ± .047    −.922 ± .023    +.755 ± .067
  $\Delta_3$    −.895 ± .034    +.874 ± .041    −.809 ± .059    +.824 ± .055    −.954 ± .015    +.791 ± .064
  $\Delta_4$    −.895 ± .037    +.860 ± .048    −.811 ± .064    +.805 ± .065    −.964 ± .013    +.805 ± .065
  $\Delta_5$    −.898 ± .038    +.847 ± .056    −.786 ± .076    +.788 ± .075    −.970 ± .012    +.831 ± .061
  $\Delta_6$    −.907 ± .037    +.850 ± .058    −.788 ± .079    +.794 ± .077    −.973 ± .011    +.848 ± .059
  $\Delta_7$    −.917 ± .035    +.835 ± .067    −.792 ± .082    +.791 ± .082    (not given)     +.857 ± .056*

Now this table at any rate demonstrates a very high correlation between $\phi/P$ and $p_i/\phi$, while the previous table for Dr Newsholme's third approximate segregation ratio led in the case of England with Wales to the value −.587, and in the case of Scotland and Ireland to negligibly small
values! Dr Newsholme himself writes: "Any of these indirect forms of segregation ratio has therefore to be verified wherever possible by the application to the same community and period of one or more other forms of the ratio, and checked where practicable by a special examination of sample constituent communities whose figures are included in the total. This has been done so far as the information obtainable allowed. It will be seen that the results obtained by applying different ratios to the experience of the same country and period are usually, though not invariably, in good agreement" (p. 268). What is quite clear from the above results is that, while in the case of Dr Newsholme's two chief measures of segregation there is a very sensible difference in the case of England with Wales, there is an absolute discordance in the cases of both Scotland and Ireland. Accordingly on the basis of his own axiom, that we must check our results by application of one or more other forms of the ratio, we are bound to reject these ratios as even approximate measures of segregation*.

* (Footnote to Table V.) This correlation continues to rise until it reaches .929 with the thirteenth difference, but with such high differences the "population" is so reduced that the method ceases really to be reliable.

But it would not be satisfactory to leave the matter here and not provide some explanation of why this fourth segregation ratio, both before and after the annulment of the time-factor, leads to such high correlations. Luckily the matter is capable of a perfectly straightforward and obvious explanation, which would have been anticipated had Dr Newsholme had in mind the danger of "spurious correlation." What he is correlating are essentially $\phi/P$ and $p_i/\phi$. The latter may be written $(p_i/P)/(\phi/P)$. Now $p_i/P$ is practically constant during the period in question. Hence Dr Newsholme is correlating $\phi/P$ with $1/(\phi/P)$, or a variate with its reciprocal.
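The reciprocal argument can be illustrated by a small simulation. What follows is a modern sketch and no part of the original memoir: the series length of 38, the spread given to the simulated death-rate, and the 2% wobble allowed in $p_i/P$ are all assumptions chosen only for illustration, and the function names are hypothetical.

```python
import math
import random


def pearson_r(xs, ys):
    """Product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def reciprocal_demo(n=38, seed=1):
    """Correlate a simulated phi/P with (p_i/P)/(phi/P) over n yearly values."""
    rng = random.Random(seed)
    # Hypothetical death-rate phi/P, varying by up to 20% either way.
    death_rate = [rng.uniform(0.8, 1.2) for _ in range(n)]
    # Hypothetical p_i/P, "practically constant": within 2% of unity.
    indoor_rate = [rng.uniform(0.98, 1.02) for _ in range(n)]
    # Fourth segregation ratio p_i/phi written as (p_i/P)/(phi/P).
    fourth_ratio = [c / x for c, x in zip(indoor_rate, death_rate)]
    return pearson_r(death_rate, fourth_ratio)


if __name__ == "__main__":
    print(round(reciprocal_demo(), 3))
```

Under these assumptions the coefficient comes out strongly negative, although the simulated death-rate and the simulated $p_i/P$ were drawn quite independently of one another; the correlation is produced by the form of the index alone.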
In other words we may anticipate something very closely approaching perfect correlation. The deviation from such correlation arises from the fact that $p_i/P$ is not absolutely steady, although its variations are very probably nearly random. The assertion therefore that this fourth measure of segregation assists in demonstrating the close relation between the fall in the phthisis death-rate and institutional segregation is based on a fallacy which entirely overlooks "spurious correlation."

It will be seen therefore that not one of Dr Newsholme's methods of reaching an approximate measure of the segregation is satisfactory, and they lead to contradictory and inconclusive results. Whether there is any really substantial relation between the prevalence of phthisis and institutional segregation we do not yet know. All we can say is that Dr Newsholme has entirely failed to demonstrate it, if it actually exists.

(6) Before concluding this paper it may be of interest to judge how far it justifies the application of the method of variate difference correlation to such problems as are here dealt with.

In the first place, the correlations of successive differences should approach steady values. This is generally, as the reader can judge by examining Tables I, II, IV and V, but not invariably, the case. The test cannot, however, be completed, as the method ought not to be pressed to such high differences that the order of the difference is a large percentage of the original "population." We doubt whether it is advisable to carry differences beyond the 8th in a population of 38. A reduction of 20% to 25% in the population is surely as much as it is safe to allow where the original population is so small in number.
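The working of the variate difference method may be sketched as follows. This is an illustration under stated assumptions and not the computation of the memoir: the two series are simulated, share nothing but a linear drift in time, and all names are hypothetical. Each differencing shortens a series by one term, so that the 8th difference of 38 terms leaves 30, a loss of about 21%.

```python
import math
import random


def pearson_r(xs, ys):
    """Product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def difference(series, order):
    """Return the order-th successive difference of a series."""
    out = list(series)
    for _ in range(order):
        out = [later - earlier for earlier, later in zip(out, out[1:])]
    return out


def difference_correlations(xs, ys, max_order):
    """Correlations of xs and ys at difference orders 0 (crude) to max_order."""
    return [
        pearson_r(difference(xs, m), difference(ys, m))
        for m in range(max_order + 1)
    ]


def trend_demo(n=38, seed=3, max_order=4):
    """Two simulated series related only through opposite linear time trends."""
    rng = random.Random(seed)
    xs = [0.5 * t + rng.gauss(0.0, 1.0) for t in range(n)]   # rising with time
    ys = [-0.3 * t + rng.gauss(0.0, 1.0) for t in range(n)]  # falling with time
    return difference_correlations(xs, ys, max_order)


if __name__ == "__main__":
    for order, r in enumerate(trend_demo()):
        print(order, round(r, 3))
```

The crude coefficient (order 0) is large and negative because both series move steadily with the time. After differencing, the coefficients scatter about zero, with a scatter that is considerable when only some 38 terms are available.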
It is true that a population of 38 itself is capable of exciting the derision of trained statisticians, and ought never to be used where hard work can produce larger numbers. But in annual returns, as has been indicated by others, a period of 30 to 50 years is often the maximum attainable, and we must take what we can get. In the present case the probable errors of the difference correlations, based on the Andersonian formulae for steady conditions, show us that we can form fairly legitimate conclusions from the results reached.

* Under the circumstances it is, perhaps, unnecessary to draw attention to Dr Newsholme's statement that "the specific result of pauper segregation must have been lower in Ireland than in England or Scotland" (p. 282). Free of the time-factor the correlations of phthisis deathrate and Dr Newsholme's fourth segregation ratio are higher in Ireland than in England or Scotland. This criticism as well as Dr Newsholme's original remark are of no importance, because the fourth segregation ratio correlation is entirely spurious.

A second test that we have applied is the approach to the theoretical values in the function $\sigma^2_{\delta_m x}/\sigma^2_{\delta_{m-1} x}$, where $\delta_m x$ is the $m$th difference of the variate $x$. The following table shows that there is a reasonable approach to these theoretical values in the calculated standard deviations of the differences, and suffices to justify the application of the variate difference method within the limits of practical statistics. We have continued the differences beyond the values used in some of the correlation results to indicate the sort of irregularities which may be expected to occur when using high differences in small populations.
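The second test can be sketched numerically. For a series of independent random values the variance of the $m$th difference is $\binom{2m}{m}$ times the variance of the series, so that the ratio of successive variances is $(4m-2)/m$. The code below is a modern illustration on a simulated random series, not the authors' computation; the function names and the simulated data are assumptions.

```python
import random
from math import comb


def difference(series, order):
    """Return the order-th successive difference of a series."""
    out = list(series)
    for _ in range(order):
        out = [later - earlier for earlier, later in zip(out, out[1:])]
    return out


def variance(values):
    """Mean squared deviation of the values about their mean."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n


def theoretical_ratio(m):
    """var(m-th difference) / var((m-1)-th difference) for a random series."""
    return comb(2 * m, m) / comb(2 * m - 2, m - 1)  # equals (4m - 2) / m


def observed_ratios(series, max_order):
    """Observed ratios of successive difference variances, orders 1..max_order."""
    variances = [variance(difference(series, m)) for m in range(max_order + 1)]
    return [variances[m] / variances[m - 1] for m in range(1, max_order + 1)]


def random_series(n, seed=5):
    """Simulated independent normal values standing in for a trend-free variate."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


if __name__ == "__main__":
    ratios = observed_ratios(random_series(38), 8)
    for m, ratio in enumerate(ratios, start=1):
        print(m, round(theoretical_ratio(m), 3), round(ratio, 3))
```

The theoretical ratios begin 2, 3, 3.333, 3.5 and rise slowly towards 4. With a long simulated series the observed ratios follow them closely; with only 38 terms they wander, and the more so at the higher differences.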
Terminal irregularities then begin to affect the uniform rise of $\sigma^2_{\delta_m x}/\sigma^2_{\delta_{m-1} x}$.

TABLE VI. Values of $\sigma^2_{\delta_m x}/\sigma^2_{\delta_{m-1} x}$ and the approach to the theoretical values, for the phthisis death-rate, the third and fourth segregation ratios and their reciprocals, in England and Wales, Scotland and Ireland. [Table printed sideways; its entries are not recoverable from this copy.]

TABLE A. Year by year, for England with Wales, Scotland and Ireland: number of deaths from phthisis, population, number of indoor paupers, number of total paupers. [Table printed sideways; its entries are not recoverable from this copy.]

TABLE B. Year by year, for England with Wales, Scotland and Ireland: ratios formed from the phthisis deaths, population, indoor paupers and total paupers of Table A. [Table printed sideways; its entries are not recoverable from this copy.]

THE INFLUENCE OF ISOLATION ON THE DIPHTHERIA ATTACK- AND DEATH-RATES.

By ETHEL M. ELDERTON, Galton Research Fellow, AND KARL PEARSON, F.R.S.

(1) Introductory. The problem of the advantages of isolation, not only in the case of diphtheria but of other diseases of an infectious character, is likely, owing to modern views as to "carriers" and other sources of transmission, to be much discussed in the near future. It is therefore well to consider what may be learnt from the statistics available. The questions which naturally arise are of the following kind:

(i) In districts with a maximum of isolation is there a minimum of incidence?
(ii) In districts with a maximum of isolation is there a minimum deathrate from the disease isolated?

There cannot be the slightest doubt that, if these two questions were answered in the affirmative and we could show that the incidence was markedly less and the deathrate significantly smaller in districts where isolation was most stringently carried out, then these results would be advanced as a strong argument in favour of isolation. To the trained statistician, however, no conclusion based upon such results without much further analysis would have any validity. To illustrate this point, let us consider the hypothetical case that medical or popular opinion in a given town has been persistently in favour of increasing the isolation-rate, and further suppose that in this district improved economic conditions have increased the immunity, or bettered sanitation lowered the incidence, while at the same time new methods of treatment have lowered the deathrate of the disease; it will be clear that in considering the statistical results over a course of years we should find a high isolation-rate negatively correlated with both the incidence- and the death-rates. Thus if we considered this correlational as a causal nexus, we should be raising an apparently strong argument in favour of a maximum of isolation, which would be based on the statistical fallacy, that when two quantities are both changing continuously with the time, this must of itself denote a causal relation.

In precisely the same way a positive correlation between the isolation rate and the attack- or death-rates by no means justifies us in asserting that isolation is worse than non-effective.
It is conceivable that in the period or the district under consideration with an increasing isolation-rate there might be decreased immunity in the population, greater virulence of the disease, or even a limit to the available isolation accommodation, so that in the case of attacks of an epidemic nature the isolation rate would not increase proportionately to the cases, or indeed might even diminish*. Further, if apart from the changes in a single district, we consider a great variety of districts, it may chance that the greatest isolation-rate occurs in those districts where the disease has been found most prevalent, because it appeared the most obvious remedy, and thus a greater attack- or death-rate would be no real measure of the futility of high isolation.

* For example, if there were only 100 hospital beds available, and out of 100 cases 50 were sent to hospital, the isolation-rate would be 50%; but if in the next year there were 300 cases and all the beds were used, the isolation-rate would be only 33%. Thus limited accommodation may tend to produce a negative correlation between isolation-rate and attack-rate, so that a positive correlation between these two rates may be of more importance than its apparent significance. It is extremely probable that some of the falls in isolation-rates are really due to an increase of incidence, so that the same percentage of cases cannot be met by the available hospital accommodation.

If, however, it should turn out that on the whole the higher isolation rate is associated with the higher attack-rate or the higher death-rate then it will be clear (i) that there is ground for demanding a closer investigation as to the advantages of isolation, and (ii) that we may be overlooking the real method, or at least one or more important factors, of the transmission of the disease. It is conceivable that isolation of all cases during attack may be of far less importance than isolation of certain special cases for a shorter or longer period well subsequent to the attack, and after they would normally have resumed their ordinary avocations†.

† It is, on the hypothesis of natural selection, a plausible view that the parasites (including under this term all disease organisms) which ultimately survive must tend to become innocuous to their hosts, and thus the decreasing virulence of certain diseases may be accounted for. The organism is destroyed owing to the death of the host or its own death at his recovery, or it has been modified by selection so as to become innocuous to its host relative to his immunity. But immunity is a matter of personal equation, and thus the function of the "carrier" in preserving and spreading a conceivably less nocuous form of the organism becomes clearer. We are not unaware of the view that the organism remains the same, but that the immunity is increased owing to "practise" of the leucocytes, but such a view requires the assumption of inheritance of acquired characters to explain reduced disease virulence, and further compels us to assume two types of immunity, the one which destroys the organism, and the other which without modifying it, establishes so to speak a mutual modus vivendi.

The main problems which arise are accordingly these:

(i) Have isolation-, attack- and death-rates changed continuously with the time, and are the apparent correlations really suggestive of causal relationships?

(ii) Are associations between isolation-rate and attack- and death-rates really spurious arising from the fact that where the attack- and death-rates have been severe there the remedy which appeared nearest to hand was more isolation?

(iii) Are the districts which have adopted most isolation really urban districts where isolation was easiest to adopt and where possibly economic or social conditions favoured the spread of the disease or, in the case of the death-rate, the disease encountered a less resistant population?

(iv) What evidence is there to show that the districts which have rapidly increased their isolation-rates have subsequently lower attack- or death-rates?

If no one of these problems can be fully answered, even in the case of a single disease, with the data at present available, at least light can be thrown on the lines which their solution in the future must take; and further something can be done to prevent hasty generalisation and excessive dogmatism as to the advantages or disadvantages of the isolation system. It can never be too strongly insisted upon, because it is so often forgotten, that preventive medicine is essentially an experimental science, and that in nine cases out of ten the efficiency of any line of action can only be adequately tested by statistics and by statistics collected after the expenditure of many thousand pounds, possibly spread over a long period of years, in carrying out this line of action*.

(2) Material. In endeavouring to throw some light on the above problems we have fortunately received data of very considerable value from Dr E. H. Snell, the Medical Officer of Health for the City of Coventry. He obtained for a period of nine years, 1904-1912 inclusive, for about eighty towns or districts of large population but of very varying local conditions, (i) the annual number of diphtheria cases, (ii) the number removed to hospital, (iii) the number of deaths. We have added to this material the estimated population of the town or district, and further certain data as to the economic and social conditions.
Unfortunately there is no existing adequate measure of the general sanitary condition of individual towns, although the construction of a general sanitary "index number" would be of remarkable value in many forms of inquiry. We took as our measures of social condition:

(a) Death-rate of infants under a year.

(b) Amount of overcrowding, that is to say the percentage of the population in private families living more than two in a room.

(c) Density of population, i.e. the number of persons to the acre.

(d) Economic prosperity as measured by the number of indoor and outdoor servants of both sexes per 100 private families.

Our data are based on the census of 1911 as providing more ample information on these points.

* Assert that it is most desirable to test the effect of sanatoria and of tuberculin in cases of tuberculosis, but do not dogmatically proclaim them as "cures" for phthisis, until statistics have been collected in sufficient amount and have been adequately and dispassionately examined to prove or disprove your statements. Insist on compulsory inoculation for enteric in the case of all recruits, but do not make it optional and then publish letters in the newspapers giving perfectly idle statistics, or go round to the camps giving popular lantern lectures to the recruits showing the gravestones of uninoculated persons, the portraits of persons dying of enteric, or much enlarged pictures of bacilli! If you think it experimentally worth doing, inoculate; but don't bring inoculation about by emphasising the dread of pain or the fear of death, both of which it is the first essential for a soldier wholly to disregard.

It will we think be admitted that the list of towns dealt with provides a very fair sample of the urban populations of this country.
It ranges from manufacturing towns* like Preston, Rochdale and Bolton, mining and iron towns like Rhondda, Wigan and Middlesbrough, sea-ports like Hull, Liverpool and Southampton, to county towns like York and Reading, watering places like Brighton and Blackpool, suburban districts like Acton and Hornsey, and residential towns like Oxford or Bath. We ought from such a list to be able to throw some light on the relation of isolation to incidence under a variety of social conditions, if indeed these latter are factors in the problem at all.

(3) What are the crude correlations between Isolation-Rate, Attack-Rate and Mortality-Rate? The isolation-rate ($I$) has been measured as the average per cent. of cases removed to hospital during a five or four year period. We have two such periods, the earlier period 1904-1908 and the later period 1909-1912. The attack-rate has been measured per 1000 of the population, uncorrected for age distribution. Since diphtheria is largely a disease of infancy and childhood this neglect of the age correction (the reduction to a standard population) may seem serious. But in the first place we had not the age incidence in the individual districts, and in the second place we satisfied ourselves that such correction, if it could have been made, would not substantially modify any argument we have based on our data. For we calculated the attack-rate ($A'$) on the population under 15 years of age, as well as the attack-rate ($A$) on the total population of the districts. We found the correlation between the two methods of measuring the attack-rate was +.972, which indicates how close is the relation between the two methods of measuring the attack-rate and how little influence small variations in the proportion of less immune persons in the population due to age differences could have on the results†. The attack-rate ($A$) has been measured as the number of cases per 1000 of the population.
The mortality-rate has been measured in two different ways; first as the population mortality, the death-rate in the ordinary sense (M) or the deaths per 1000 of the population; and secondly the case death-rate or the mortality (m) per 100 attacked. We now give the crude correlations between I and A. They are:

First Period: 1904-1908, $r_{IA} = +.427 \pm .063$,
Second Period: 1909-1912, $r_{IA} = +.290 \pm .069$.

* See table, p. 567, for 76 of the 80 towns, the four others with full data only for the second period being Reading, Stoke, Dewsbury and Edmonton.

† The formula giving the juvenile attack-rate A′ in terms of the crude attack-rate A is: A′ = 1.3094 A + .0164, with a probable error of .1369. Thirty-three towns were selected at random out of the 80, and gave the following results for A′ calculated from A and A′ as observed. The theoretical mean error = .162; the mean error of the defects is −.153 and of the excesses +.134; this shows very fair accordance, 17 deviations being positive and 16 negative. The greatest deviations occur in Hornsey, Bath and Brighton, where residential neighbourhoods show fewer children, and in Edmonton, Walthamstow and Rhondda where there is probably an excess of children. On the whole the general order is very well maintained, and the general attack-rate closely fixes the juvenile attack-rate. In any further collection of material, it would of course be well to have the age-distribution of cases.

Town            Observed A′   Calculated A′   Δ
Derby           [?]           4.25            −.17
Southampton     2.67          [?]             [?]
Hornsey         [?]           1.86            −.82
Bristol         [?]           2.23            −.10
Reading         [?]           1.95            −.18
Nottingham      2.09          1.99            −.10
Salford         2.04          2.16            +.12
Ilford          1.98          2.09            +.11
Brighton        1.94          1.68            −.26
Stockton        1.90          [?]             +.22
Ipswich         1.87          1.89            +.02
Grimsby         1.85          1.90            +.05
Walthamstow     1.85          2.10            +.25
Coventry        1.84          1.93            +.09
Plymouth        1.61          1.56            −.05
Wakefield       1.40          1.39            −.01
Smethwick       [?]           [?]             [?]
Edmonton        1.36          1.80            +.44
Bath            [?]           [?]             [?]
Newport         [?]           1.25            +.11
Rhondda         [?]           1.34            +.24
Bury            1.08          .91             −.17
Rotherham       1.06          1.18            +.12
Dewsbury        1.01          1.01            ±.00
Blackburn       .99           .91             [?]
Manchester      .94           .97             [?]
Oxford          [?]           .79             −.10
Bolton          .86           .85             −.01
Rochdale        .82           .75             −.07
Northampton     .78           .76             −.02
Barnsley        [?]           [?]             [?]
Wigan           .58           .64             +.06
W. Bromwich     .45           .53             +.08

(Δ is the calculated less the observed value; [?] marks an entry illegible in the source.)

Thus both periods show significant if not very large correlation*. The difference (.137 ± .093) between the coefficients for the two periods is, however, probably not significant. Thus in towns with greater isolation-rate there is certainly a higher attack-rate, and equally certainly no argument can be based on the crude figures to prove that the more the isolation the less prevalent is diphtheria.

* We endeavoured to see whether the correlation of isolation- and attack-rates would be modified if we took the attack-rate on children under 15 years. This made little difference, r being raised only from +.290 ± .069 to +.315 ± .068.

We will now turn to the death-rate M, and we find:

First Period: 1904-1908, $r_{IM} = +.153 \pm .075$,
Second Period: 1909-1912, $r_{IM} = -.012 \pm .075$.

In the first period isolation was associated with a higher diphtheria death-rate, in the second period with a lower diphtheria death-rate, but neither is of any real significance. Thus all we can conclude from the crude figures is that they show no evidence that isolation has reduced the general death-rate from diphtheria.

We next take the case death-rate (m) and we have for the two periods:

First Period: 1904-1908, $r_{Im} = -.509 \pm .057$, $r_{Am} = -.527 \pm .056$,
Second Period: 1909-1912, $r_{Im} = -.534 \pm .054$, $r_{Am} = -.4[?]5 \pm .057$.

According to these correlations, when or where the isolation-rate is high, the case mortality is low.
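The figure following each ± is a probable error. The paper does not state the formula used; the sketch below assumes the conventional probable error of a correlation coefficient, 0.6745(1 − r²)/√n, and assumes 76 towns in the first period and 80 in the second (read from the footnote on the 80 towns). Under those assumptions it reproduces the printed probable errors of the isolation-rate and attack-rate correlations. The function name is illustrative only.

```python
import math


def probable_error_r(r, n):
    """Conventional probable error of a correlation r from n observations.

    Assumed formula: 0.6745 * (1 - r^2) / sqrt(n); not quoted from the paper.
    """
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


# Crude correlations of isolation-rate with attack-rate, as quoted in the text.
r_first, r_second = 0.427, 0.290

# Town counts are an assumption taken from the footnote (76 of the 80 towns).
pe_first = probable_error_r(r_first, 76)
pe_second = probable_error_r(r_second, 80)

# Probable error of the difference of two independent estimates.
difference = r_first - r_second
pe_difference = math.hypot(pe_first, pe_second)

print(f"first period:  {r_first:+.3f} +/- {pe_first:.3f}")
print(f"second period: {r_second:+.3f} +/- {pe_second:.3f}")
print(f"difference:    {difference:+.3f} +/- {pe_difference:.3f}")
```

On these assumptions the difference is only about one and a half times its probable error, which is consistent with the text treating it as probably not significant.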
Further when or where the attack-rate is high the case mortality is low. Now we know that:

I = 100 × isolated cases ÷ all cases,
A = 1000 × all cases ÷ population,
m = 100 × deaths ÷ all cases.

Hence if we selected the number attacked at random and chose the deaths to be simply some number less than this, we should expect to find a considerable negative correlation between A and m; and as we actually do find such a correlation, we cannot be certain that the actually observed values of $r_{Am}$ are not due to "spurious correlation." If they were "organic" we should interpret them to mean that a widespread epidemic (A large) was a less virulent epidemic (m small). On the other hand the spurious correlations of I and m would be positive in value, while the actual correlations are negative. Thus it would seem that while a high isolation-rate is associated with high attack-rate, it must be "organically" associated with a lessened case mortality. In other words while isolation does not, on the crude figures, appear to lessen the frequency of disease, it does appear to lessen the mortality among the attacked. This result appeared to be of such very great importance, if thoroughly established, that we determined to inquire into it further. It seemed reasonable to believe that the bulk of persons attacked might have better care in a hospital than in their own homes and thus isolation indirectly lessen the ill effects of the disease.

We accordingly endeavoured to approach the problem from a somewhat different standpoint: Given two districts with the same total number of persons attacked (a), will that district with more isolated (i) have fewer or more deaths (d)? The answer to this question depends on whether the partial correlation coefficient of total isolated cases with total deaths for constant number attacked is negative or positive.
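The partial correlation referred to here can be obtained from three total correlations by the standard first-order formula. The sketch below is not taken from the paper: the function name is hypothetical, and the inputs are the three-decimal coefficients printed in the table that follows, so the results can agree with the printed partial correlations (+.066 and −.220) only to within rounding.

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))


# x = isolated cases (i), y = deaths (d), z = number attacked (a).
# Arguments: r_id, r_ia, r_ad as printed for each period.
first_period = partial_corr(0.860, 0.937, 0.907)
second_period = partial_corr(0.867, 0.968, 0.918)

print(f"1904-1908: {first_period:+.3f}")
print(f"1909-1912: {second_period:+.3f}")
```

Because all three total correlations are close to unity, the numerator is a small difference of nearly equal quantities, so the third decimal of the inputs matters a good deal; this is one reason to read such partial coefficients cautiously.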
We found:

Correlations                                                       First Period 1904-1908   Second Period 1909-1912
$r_{id}$ = Isolated Cases and Deaths                                +.860 ± .020             +.867 ± .019
$r_{ia}$ = Isolated Cases and Attacked                              +.937 ± .010             +.968 ± .005
$r_{ad}$ = Attacked and Deaths                                      +.907 ± .014             +.918 ± .012
${}_a r_{id}$ = Isolated Cases and Deaths for constant Attacked     +.066 ± .077             −.220 ± .072

Thus in the first period for a given number of attacked more isolation was associated with more deaths, and in the second period for a given number of attacked, with fewer deaths; but in both periods, having regard to the probable errors, we cannot assert any real significance, or be reasonably certain that where there is more isolation, there recovery is more likely to occur.

We shall see later that the correlation between I and m for constant total number of attacks is not the same thing as the correlation of the total isolated and total deaths for constant total number of attacks. And this divergence, often in a marked degree, of partial correlations for rates and for absolute numbers is not unfamiliar to those who have had to deal with disease statistics. In the present case it renders still more obscure any argument drawn in favour of isolation from apparently lesser case mortality.

(4) On the degree to which "spurious correlation" may be influencing the attack- and death-rates. It seemed desirable if possible to throw further light on this point and accordingly we correlated attack- and death-rates with the total population. It will be remembered that:

A = 1000 × cases ÷ population,
M = 1000 × deaths ÷ population,

and accordingly if A and M be correlated with the population P, we might anticipate that if cases and deaths had no relation to population, there would be a high negative correlation arising from A and M both varying inversely as P. We were comforted by finding practically insignificant positive correlations.
Thus:

Correlations                                   First Period 1904-1908   Second Period 1909-1912
$r_{PA}$ = Population and Attack-rate          +.137 ± .075             +.054 ± .075
$r_{PM}$ = Population and Death-rate           [?]                      +.116 ± .074
$r_{PI}$ = Population and Isolation-rate       [?]52 ± .075             +.102 ± .075

([?] marks figures illegible in the source.)

The last correlation coefficients show us that there is very little relation between the size of a population and the amount of isolation practised. Further these isolation correlations in which there is no obvious source of spurious correlation are as significant as those of population with attack- and death-rates where the possibility of "spurious correlation" is manifest. We conclude accordingly that risk is more uniformly distributed over population than we had anticipated, and that the correlations between the three rates I, A and M are really open to "organic" interpretation.

The next point which arises for discussion is whether the presence of the total number attacked (a) in the rates I and m can produce spurious correlation. If so we should anticipate that the absolute number a would be negatively correlated with both isolation and case mortality rates. We found:

Correlations                                     First Period 1904-1908   Second Period 1909-1912
$r_{aI}$ = Total attacked and Isolation-rate     +.264 ± .072             +.226 ± .072
$r_{am}$ = Total attacked and Case-Mortality     [?]                      −.203 ± .072

The first set of these coefficients are not even negative and therefore cannot be due to "spurious correlation," although such correlation may have reduced their organic values. They admit, however, of an easy interpretation, namely that where the number attacked has been large the isolation has been more practised. The second set of coefficients might be due to spurious correlation, but they again admit of a simple interpretation as apart from "spurious correlation," namely that when the attacks are numerous the deaths are relatively few, because a wide-spread epidemic means a mild epidemic.
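The reasoning about "spurious correlation" can be illustrated by simulation. In the sketch below the numbers attacked (a), isolated (i) and dying (d) are drawn independently of one another, which is an assumption made purely for illustration and not a model of the diphtheria data; all names and ranges are hypothetical. Even so, the rates I = 100 i/a and m = 100 d/a come out negatively correlated with a and positively correlated with each other, which is the purely spurious pattern against which the observed coefficients are being compared.

```python
import math
import random


def pearson_r(xs, ys):
    """Product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_spurious(n_towns=5000, seed=1):
    """Correlations among rates built from mutually independent counts."""
    rng = random.Random(seed)
    # Hypothetical ranges, chosen so that isolated and deaths never exceed attacked.
    attacked = [rng.uniform(50, 500) for _ in range(n_towns)]
    isolated = [rng.uniform(5, 50) for _ in range(n_towns)]
    deaths = [rng.uniform(5, 50) for _ in range(n_towns)]
    isolation_rate = [100 * i / a for i, a in zip(isolated, attacked)]
    case_mortality = [100 * d / a for d, a in zip(deaths, attacked)]
    return {
        "a_with_I": pearson_r(attacked, isolation_rate),
        "a_with_m": pearson_r(attacked, case_mortality),
        "I_with_m": pearson_r(isolation_rate, case_mortality),
    }


spurious = simulate_spurious()
for name, value in spurious.items():
    print(f"{name}: {value:+.3f}")
```

The observed coefficients run the other way for two of these pairs (a with I is positive, I with m is negative), which is why the text concludes that they cannot be due to the shared divisor alone.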
All four coefficients are significant, and pair and pair they are quite consistent but in no case are they of any marked importance. They enable us, however, to correlate the isolation-rate and the case-mortality for a constant total attacked, i.e. to find the partial correlation ${}_a r_{Im}$. We have the following results:

                                                                                                   First Period 1904-1908   Second Period 1909-1912
${}_a r_{Im}$ = Correlation of Isolation-rate and Case mortality for constant number attacked      −.474 ± .056             −.512 ± .057

while we have already found:

${}_a r_{id}$ = Correlation of number isolated with number of deaths for constant number attacked   +.066 ± .077             −.220 ± .072

Correlation of Total Numbers Isolated and Total Registered Deaths.

[The body of this correlation table is not recoverable from the source. Columns: total numbers isolated, in classes of 250 from 0-250 to 3000-3250. Rows: deaths registered, in classes of 75 from 0-75 to 750-825. Row totals in order: 94, 36, 9, 4, 6, 5, 1, 1, 0, 0, 1 (total 157). Legible means of deaths in the successive arrays of numbers isolated: 54.2, 59.8, 112.5, 133.9, [?]32.5, 312.5, 382.5; general mean 103.4.]

It will therefore be clear that removing the variation in number attacked has made only slight reductions in the values of the correlation coefficients between isolation-rate and case-mortality. The discrepancy between the absolute numbers' and the rates' correlations is not to be accounted for by "spurious correlation" involved in the use of total numbers attacked in both rates. It must therefore be due to: (i) lack of linearity in certain of the regressions, (ii) high values in the coefficients of variation in certain of the quantities under discussion, or to a combination of these causes. With the small size of the populations under discussion it is by no means easy to test the true linearity of the regressions, even if we do what appears legitimate in this case, namely pool our data for the earlier and later periods.
Our actual correlations have all been found without grouping by the direct product-moment formula, but we give on pp. 556 and 558 two grouped tables to illustrate the difficulties which arise in analysis. Our first table is for the total numbers isolated and the total deaths registered. It will be seen at once that the marginal distributions are intensely skew, crowding up into the corner of few deaths and few cases isolated, so that they appear to asymptote to the zero values of the coordinates. Further, Diagram I shows that the regression curve of deaths on total number isolated is, if just sensibly, still not markedly skew.

[Diagram I: regression of total deaths on total isolated; horizontal axis, total isolated, 0 to 2500; vertical axis, total deaths.]

Turning to the actual numbers given by this table we have the following series of constants:

                                       Numbers Isolated (i)       Registered Deaths (d)
Mean                                   $\bar{i} = 475.33$         $\bar{d} = 103.42$
Standard Deviation                     $\sigma_i = 571.25$        $\sigma_d = 118.61$
Coefficient of Variation (= S.D./Mean) $v_i = 1.20$               $v_d = 1.15$
Correlation Coefficient and Ratio      $r_{id} = .8348^* \pm .0163$   $\eta_{di} = .8564$†

* Agrees reasonably well with the non-grouped values for the two separate periods.
† Found by taking means of all 13 column-arrays.

Clearly these results are of much interest; they show that the difference of η for deaths on isolation over r is not as great numerically as, perhaps, the graph suggests, but they indicate the markedly high values for the coefficients of variation. Now it is quite straightforward algebra to prove that

${}_a r_{i/a,\,d/a} = {}_a r_{i,d}$

provided we may neglect terms of the square and product order in $v_i$ and $v_d$ compared with unity, and this is perfectly legitimate when these coefficients of variation are, as is usual in anthropometric measurements, quite small quantities.
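As a check on the table of constants, the coefficients of variation follow directly from the printed means and standard deviations. The short sketch below (variable names are illustrative) also prints their squares, which are the terms the approximation just stated would have to neglect in comparison with unity.

```python
# Printed constants for numbers isolated (i) and registered deaths (d).
mean_i, sd_i = 475.33, 571.25
mean_d, sd_d = 103.42, 118.61

# Coefficient of variation = standard deviation / mean.
v_i = sd_i / mean_i
v_d = sd_d / mean_d

print(f"v_i = {v_i:.2f}, v_i^2 = {v_i ** 2:.2f}")
print(f"v_d = {v_d:.2f}, v_d^2 = {v_d ** 2:.2f}")
```

Both squares exceed unity, so they cannot be treated as small corrections here.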
But in the present case these quantities are greater than unity and their squares are not negligible as compared with unity, thus we need not be surprised at the marked inequality of ${}_a r_{i/a,\,d/a}$* and ${}_a r_{i,d}$† found above. The values of the former show a marked relation between the case mortality and the isolation-rate, and the values of the latter indicate no appreciable betterment in the deaths due to increased isolation. Before we consider which of these coefficients gives us in the present case the better result as a guide to practical conduct, let us examine the correlation table for isolation-rate and case-mortality for the same 157 observations.

Correlation of Isolation-Rate I and Case-Mortality m.

[The body of this correlation table is only partly recoverable from the source. Columns: isolation-rate in classes of 10, from 0-10 to 80-90. Rows: case-mortality in classes of 4. Column totals: 23, 4, 11, [?], [?], 26, 28, 22, 16 (total 157). Mean case-mortality in the successive isolation-rate arrays: 19.04, 14.00, 16.18, 15.60, 14.00, 11.08, 11.71, 11.09, 9.25; general mean 13.26.]

The following constants were found for this table:

                                     Isolation-Rate                  Case Mortality
Mean                                 $\bar{I}$ = [?]                 $\bar{m} = 13.26$
Standard Deviation                   $\sigma_I = 25.52$              $\sigma_m = 5.58$
Coefficient of Variation             $v_I$ = [?]                     $v_m = 0.42$
Correlation Coefficient and Ratio    $r_{Im} = -.5291 \pm .038$      $\eta_{mI} = .5546$

The graph of the regression of case mortality on isolation-rate shows small evidence of skew regression (see Diagram II), and this is again confirmed by the difference between $r_{Im}$ and $\eta_{mI}$ being fairly small. The marginal frequency distributions show, however, considerable skewness, and that for the isolation-rate is lumped up at the end where there is no isolation: more than half the numerator of $\eta_{mI}$ being contributed by the towns with little or no isolation. It is desirable to consider these towns further.
* This is the ${}_a r_{Im}$ of our p. 556.
† The values are given on our p. 554.

They have an attack-rate of .76, which is sensibly less than the mean attack-rate (1.30), but they have a case-mortality of 19.04 as against the average case-mortality of 13.26; the 17 towns* with no isolation at all give a case-mortality of 19.4. It would thus appear that the towns with little or no isolation are those with a lower average attack-rate, but with rare exceptions their case-mortality is high.

* South Shields (1st and 2nd Periods), Sunderland (1st and 2nd Periods), Barrow (1st Period), Preston (1st Period), Wigan (1st Period), Smethwick (1st and 2nd Periods), Walsall (1st and 2nd Periods), West Bromwich (1st and 2nd Periods), Coventry (1st and 2nd Periods), Barnsley (1st and 2nd Periods). Of these towns West Bromwich in the 1st period had the highest case-mortality recorded of any of our 80 towns, while Smethwick in both periods, and Coventry and Barnsley in the 2nd period with no isolation had case-mortalities below the general average.

[Diagram II: regression of case-mortality on isolation-rate; horizontal axis, isolation-rate, 0 to 100; vertical axis, case-mortality.]

To test the influence of these towns with little or no isolation, we have removed the column 0-10 isolation-rate group and recalculated $r_{Im}$ and $\eta_{mI}$; we find

$r_{Im} = -.4120 \pm .0484$, $\eta_{mI} = .4810$.

Thus while the correlations are somewhat reduced by excluding the towns with little or no isolation there is still in the towns which do isolate a very sensible relation between the degree of isolation and the case-mortality, and this relation exhibits rather more skewness.

We may sum up as follows: The relation between greater isolation and a lessened case-mortality appears to be a real one. We have shown that it is hardly due to spurious correlation, as this would have produced a positive correlation, and further no great changes are made when we correct for inequality in the numbers attacked. The regression is roughly linear and only very partially due to the high case-mortality in towns with no isolation. It is probable that where there is a large amount of isolation, the care of patients falls largely into the hands of a few men with a more extensive experience of the disease, and that this reduces the case-mortality.

Against this may be set the fact that the correlations between the absolute numbers of deaths and of cases isolated for constant numbers attacked are insignificant. The divergence between the two methods of approaching the problem is, however, explicable because the coefficients of variation of the absolute numbers are greater than unity, and the identity of the correlations reached by the two methods depends on the neglect of the squares and products of the coefficients of variation compared to unity. It may be asked why in this case we prefer the partial correlation found from the rates to that found from the absolute numbers. We reply: Because the partial correlation coefficient for the absolute numbers depends on very high total correlations, and if these correlations be, as we have shown, non-linear, then the partial correlation coefficient not only loses its full meaning, but may, as experience has shown us, easily change its sign as well as its magnitude. We would suggest that in a minor sense total mortalities and total isolations are bound to give "restricted tables," for deaths and isolated cases are perforce less than the numbers attacked, and that in such "restricted" tables, there is a general tendency to skew correlation and to a spurious factor*. On the other hand it is true that case-mortalities and isolation-rates cannot exceed 100 per cent. or fall short of 0 per cent., but these limits are the same for every array and do not vary from array to array as in the previous case.
On the whole we think it safe to say that isolation is associated with greater prevalence of the disease and with a lessened case-mortality.

* See especially the illustrations of such "restricted" tables and their regression lines in a paper by Waite on Finger-Prints: Biometrika, Vol. x, pp. 421-478.

(5) Is there any significant Relation between Isolation-Rate and General Diphtheria Death-Rate? We have seen (p. 553) that insignificant correlations exist between I and M, and it is difficult to understand how a spurious factor could have modified this result. In the first place the small values of $r_{PI}$ and $r_{PM}$ on p. 555 show us that the value of ${}_P r_{IM}$ is sensibly the same as $r_{IM}$; thus, for a constant population there is no sensible association between diphtheria mortality and isolation. But now let us ask whether for a constant attack-rate, isolation does not lessen general diphtheria mortality. We have:

Correlation                                    First Period   Second Period
$r_{IM}$ = Isolation-rate and Death-rate       +.1532         −.0119
$r_{IA}$ = Isolation-rate and Attack-rate      +.4268         +.2905
$r_{MA}$ = Death-rate and Attack-rate          +.6772         +.6879

Hence

${}_A r_{IM}$ = Isolation-rate and Death-rate for constant Attack-rate: [figures illegible in the source; the three coefficients above give approximately −.204 and −.305 by the usual first-order partial correlation formula].

Both of these values may be considered significant and negative, and hence when the attack-rate is constant there is a sensible, if not very close relationship between increased isolation and reduced general mortality from diphtheria. This confirms the view already reached that while isolation is associated with higher attack-rate its effect is to lessen the number of deaths whether they be reckoned as case-mortality or general population death-rate.

(6) What is the meaning of the Association between Isolation and increasing prevalency of Diphtheria? The analysis of this problem is more complicated.
The obvious answer of those who advocate increasing isolation would be that it has been adopted in those districts where the disease is most prevalent, and this of course may turn out to be correct. But we may ask in turn upon what statistics they depend to demonstrate their view that isolation lessens the prevalence of the disease and is therefore advantageous, if our data demonstrate that where there is more isolation, there there is more diphtheria? It can only be by an analysis of no simple character that it is possible to deduce from such data that the practice of isolation has lessened the amount of the disease.

There is, however, a preliminary problem to be dealt with. The isolation-rate has been increasing very sensibly from 1904 to 1912, the attack-rate has lessened although very slightly, the case-mortality has lessened and the mortality on the population is considerably less. These facts are exhibited in the following table:

                                                   Means                      Standard Deviations
Variate                              Symbol   1904-1908   1909-1912      1904-1908   1909-1912
Attack-rate per 1000 population      A        [?]         [?]            .657        .639
Isolation-rate per 100 attacked      I        42.4        55.7           25.52       25.18
Mortality per 1000 population        M        .174        .138           .080        .061
Mortality per 100 attacked           m        14.6        [?]            [?]         5.01

([?] marks figures illegible in the source.)

Now it may well be, since the attack-rate has changed so little, that in the towns with increasing attack-rate there has been increasing isolation, both quantities changing with the time, but having no causal relation the one to the other. It is of some interest therefore to consider the type of districts in which isolation is most practised. In the first place we ask if any known bad social conditions are associated with prevalence of diphtheria. We took as our measure of sanitary conditions (i) the infant death-rate, or the deaths of children under one year per 1000 births, (ii) overcrowding, or the percentage of the population in private families with more than two in a room.
We found the following results:

Variates Correlated                     First Period 1904-1908   Second Period 1909-1912
Attack-rate and Infant Death-rate       −.206 ± .074             −.206 ± .072
Attack-rate and Overcrowding            −.153 ± .075             −.186 ± .074

These are not very considerable, but they are consistent, and indicate, as far as they go, that the incidence of diphtheria is not dependent upon such measures as the above of unfavourable sanitary conditions. If we now turn to the correlation between the mortality-rate on the population and these measures of unfavourable sanitary conditions we find:

Variates Correlated                                   First Period 1904-1908   Second Period 1909-1912
Death-rate from Diphtheria and Infant Death-rate      +.081 ± .076             +.118 ± .074
Death-rate from Diphtheria and Overcrowding           +.061 ± .07[?]           +.004 ± .075

All these are indeed positive, but they are of no significance and if they were significant would be so small as to be of no importance. The first indeed might have been anticipated to show a higher value, for a certain number of deaths from diphtheria must be deaths of infants. We can only conclude that as far as these measures of unsanitary conditions are concerned they do not in any way determine the diphtheria death-rate.

We now turn to the isolation-rate and find:

Variates Correlated                        First Period 1904-1908   Second Period 1909-1912
Isolation-rate and Infant Death-rate       −.414 ± .064             −.375 ± .065
Isolation-rate and Overcrowding            −.236 ± .073             −.235 ± .071

These are significant although not very large and we conclude that most isolation is practised in those districts which have the lowest infant death-rate and the least overcrowding. In other words the towns with better health conditions have adopted more extensively the practice of isolating diphtheria cases.
It seemed further of interest to determine: (i) whether diphtheria and isolation were more or less associated with urban conditions, and we took for this purpose the number of persons per acre, and (ii) whether the well-to-do character of the district, as measured by the number of domestic servants, indoor and outdoor, male and female per 100 private families, has any influence on the incidence of, mortality from, or the isolation of diphtheria. We found:

Variates Correlated                       First Period 1904-1908   Second Period 1909-1912
Persons per Acre and Attack-rate          +.165 ± .075             +.043 ± .075
Persons per Acre and Death-rate           +.169 ± .075             +.115 ± .074
Persons per Acre and Isolation-rate       +.073 ± .076             +.053 ± .075

Not one of these correlations is of any importance, if indeed any of them can be considered significant. It is thus clear that the intensity of urban conditions has very little to do with the prevalence of diphtheria, for if anything the suburban conditions have the lesser death-rate; clearly isolation has no sensible relation to number of persons per acre.

Turning now to our measure of the prosperity of the district, we find that it has no influence on the attack-rate, that it sensibly, but not very intensely affects the mortality, the higher death-rate occurring in the poorer districts, and that isolation is associated quite significantly with the prosperity of the district, i.e. the more well-to-do the district the more isolation is practised*.

Variates Correlated                                   First Period 1904-1908   Second Period 1909-1912
Number of Domestic Servants and Attack-rate           +.095 ± .076             +.024 ± .075
Number of Domestic Servants and Death-rate            −.219 ± .073             −.308 ± .068
Number of Domestic Servants and Isolation-rate        +.437 ± .062             +.363 ± .065

We conclude therefore that the more prosperous and generally healthier districts are associated with fuller isolation, and that the more prosperous, but not necessarily the more healthy districts, have the less diphtheria death-rate. On the other hand the incidence of the disease seems independent of the prosperity or density of population of the district and to be somewhat greater in those towns where the sanitary conditions as judged by infant death-rate and overcrowding are better. Thus as far as our measures go, we must conclude that diphtheria is not to be considered as a disease of markedly urban districts, of overcrowded or of insanitary districts. It would appear that the more prosperous and healthy districts have the greater isolation and that these are subject to somewhat the greater incidence.

* Of course this may largely mean that the more prosperous towns introduce isolation to remove the supposed danger of infection when servants of the families of the well-to-do are attacked.

† In order to ascertain whether the variates persons per acre ($p_a$) and overcrowding (O) were merely measures of the size of the town population (P) we correlated P with $p_a$ and with O and found:

$r_{Pp_a} = +.404 \pm .064$ (1904-8), $= +.402 \pm .063$ (1909-12),
$r_{PO} = +.091 \pm .076$ (1904-8), $= +.074 \pm .075$ (1909-12).

Thus overcrowding has no relation to the size of the town, the larger towns do not show more overcrowding. There is, however, a considerable association of persons per acre with total population, the larger towns having more persons per acre without exhibiting any more significant overcrowding. Making the population constant we find:

First Period 1904-1908                                                    Second Period 1909-1912
Total Correlation              Partial Correlation                        Total Correlation              Partial Correlation
$r_{Ap_a} = +.165 \pm .075$    ${}_P r_{Ap_a} = +.122 \pm .076$           $r_{Ap_a} = +.043 \pm .075$    ${}_P r_{Ap_a} = +.023 \pm .075$
$r_{AO} = -.153 \pm .075$      ${}_P r_{AO} = -.167 \pm .075$             $r_{AO} = -.186 \pm .074$      ${}_P r_{AO} = -.140 \pm .074$

Thus correcting for population only makes the relation between persons per acre and incidence still more insignificant, while the relation between incidence and less overcrowding becomes slightly greater, without rising to any real importance.

‡ This result must not offhand be extended to subdistricts of our towns, it is an inter-urban and not intra-urban statement.

It will be seen at once that this conclusion opens up new problems:

(i) Is the greater isolation the outcome of greater incidence, the only remedy suggestable for greater incidence being a more complete isolation?

(ii) Is the greater incidence in some manner a result of the greater isolation, and does it really tell against isolation as a remedy against the spread of diphtheria? The association of greater isolation with local prosperity would then be merely a measure of the economic capacity of the district for carrying out the accepted sanitary code.

(iii) If (ii) is to be answered in the negative, then is there any factor in prosperity which makes for a greater diphtheritic incidence?

The final answers to these problems can probably not be given on the basis of the present data. The correlations under discussion although significant are not of such a marked character as to provide more than provisional statements, or indeed more than suggestions for further inquiry and tabulation.

(7) Does greater Isolation follow increasing Incidence, or greater Incidence follow increasing Isolation? The problem is a much more subtle one than appears at first sight.
What we have established is that those towns with the higher isolation-rate have the higher attack-rate. It does not follow from this that the individual town which increases its isolation-rate will increase its attack-rate. To determine whether this is so we took as our variates: increase in isolation-rate (ΔI) between the periods 1904-8 and 1909-12*, and the similar increase ΔA of the attack-rate. We found:

$r_{\Delta I\,\Delta A}$ = [illegible in the source],

a value probably significant, although not quite so large as that found for the inter-urban relation:

$r_{IA} = +.427 \pm .063$ (for 1904-8), $= +.290 \pm .069$ (for 1909-12).

* That is the total number treated in hospital × 100 and divided by the total number of attacks was taken for the first period and for the second period, and their difference (second period less first period) was treated as increase in isolation-rate. In the same way the sum of the totals attacked for the years of the first period × 1000 and divided by the sum of the calculated intercensal populations for the same years was treated as the attack-rate, and the difference of second and first period values taken as the increase in the attack-rate.

We can, we think, therefore conclude that the towns which increase their isolation-rate are those with increasing attack-rate, just as the towns with higher isolation-rate are those with higher attack-rate. But this does not answer the question as to which is "the cart" and which "the horse"! Does the increased attack-rate precede or follow the increased isolation-rate?

To answer this question we divided our material into three periods each of three years, let us say $T_1$, $T_2$ and $T_3$. Then the attack-rate increase between $T_1$ and $T_2$ was correlated with the isolation-rate in $T_3$, and the isolation-rate increase between $T_1$ and $T_2$ with the attack-rate in $T_3$. In other words we asked whether towns with most rapid increase of attack-rate in the early periods had most isolation in the later period, and whether the towns with most rapid increase of isolation-rate in the earlier periods had most incidence in the later period. We found:

$r_{\Delta A_{1-2}\,I_3}$ = [sign illegible] .004 ± .077, $r_{\Delta I_{1-2}\,A_3} = +.085 \pm .077$.

Thus there is no significant relation whatever between either increase of attack-rate or increase of isolation-rate in the first periods, and the isolation or the incidence in the following period.

As criticism of this result it might, perhaps, be suggested that the correlation of $\Delta A_{1-2}$ and $I_3$ will be influenced by what has been the course of I in the periods $T_1$ and $T_2$ and the nature of A in $T_3$; we have accordingly, in order to test this, made the isolation-rate constant in the first two periods and the attack-rate constant in the third period and find

${}_{I_1 I_2 A_3} r_{\Delta A_{1-2}\,I_3} = +.147 \pm .076$.

This is still of no real significance, although the sign appears to indicate that where the isolation-rate has been constant then increasing attack-rate in the earlier period is followed by very slightly more isolation in the third period, even if the attack-rate in that period be itself constant. Similarly we determined the corresponding partial correlation of $\Delta I_{1-2}$ with $A_3$:

[sign and subscripts illegible in the source] .077 ± .077.

This coefficient shows that towns which have increased their isolation-rate during a period of constant attack are not liable to sensibly heavier attack in the following period.

It would thus seem that our first two problems are both to be answered in the negative. Towns which increase their isolation are not those which in the following period have most incidence, nor are those which have increasing incidence markedly those of most isolation in the following period. Attack and isolation appear to have no causative relation, and the association we have found between more isolation and more incidence seems to be contemporaneous rather than successional. We are, it seems, compelled to search for something in the environment, which favours incidence and at the same time isolation.
The only common factor that we have been able to reach at present is the prosperity and general healthy condition of the town. Under these circumstances there appears to be economic possibility of greater isolation, but why should there be greater incidence? Is it possible that in the more prosperous towns there is greater consumption of some easily contaminated commodities, which may act as carriers of the disease, or more concourse of those of susceptible ages at places of public amusement or instruction?

(8) Test of the "organic" nature of the correlation of Isolation- and Attack-rates by the method of Variate Differences. If the suggestion made at the end of the last section be correct we should anticipate that by the use of the method of variate differences we should free ourselves from the influence of the time factor, if attack-rate and isolation-rate increase simultaneously in the more prosperous towns, but without organic association. We have nine years' returns, but the epidemic nature of diphtheria in many cases does not give one great confidence in applying the method of variate differences to individual years. We considered that it would not be wise to deal with smaller intervals than three-year periods, and should have preferred had the data been available to work with five-year intervals. As it is, we cannot with three-year intervals for each town go beyond the second differences. We have accordingly 228 isolation-rates and attack-rates obtained from 76 towns for each of three three-year periods, 152 first differences, and 76 second differences. We may symbolise them as $I'$ and $A'$, $\delta_1 I'$ and $\delta_1 A'$, and $\delta_2 I'$ and $\delta_2 A'$. We found the following results:

$$r_{I'A'} = +.332 \pm .040, \qquad r_{\delta_1 I'\,\delta_1 A'} = +.236 \pm .052, \qquad r_{\delta_2 I'\,\delta_2 A'} = +.159 \pm .075.$$

The first of these results compares reasonably with the previous results for the first and second periods on p. 552, i.e.

$$\text{1904-1908: } r_{IA} = +.427 \pm .063, \qquad \text{1909-1912: } r_{IA} = +.290 \pm .069,$$

with a mean value of +.358. And this is the more true because the values of $r_{IA}$ were found by the product moment method without grouping, while $r_{I'A'}$ was obtained from grouping in a correlation table*.

* It may be noted that $r_{\delta_1 I'\,\delta_1 A'}$ was also found from a correlation table, but $r_{\delta_2 I'\,\delta_2 A'}$, as having only 76 cases, by product moment without grouping.

Now the above values bring out very markedly that when we endeavour to remove the influence of the time factor and to obtain a purely organic relationship between $I$ and $A$, we more than halve the correlation between them by proceeding to the second difference only. If we might suppose that a hyperbola would give the asymptotic value of $r_{\delta_n I'\,\delta_n A'}$ from the above three known correlations we should have

$$r_{\delta_n I'\,\delta_n A'} = \frac{7.084}{n + 8.105} - .542,$$

which indicates, although no stress can be laid on actual numbers, that at about the fifth difference $r_{\delta_n I'\,\delta_n A'}$ would tend to become negative. All we think it possible to say would be that if the time factor be eliminated there is very little positive organic association between high isolation-rate and high attack-rate to be cleared up, certainly not more than is indicated by the correlation on p. 565:

$${}_{I_1 I_2 A_3}r_{A_{1-2} I_3} = +.147 \pm .076,$$

which seems to suggest that other things being constant increasing incidence is to some slight extent followed (probably as the only suggested remedy) by the higher isolation rates*.

(9) Can any other Factors be determined which measure the Relation between urban conditions and the Incidence of Diphtheria? It is worth while from this standpoint to place the towns with which we have dealt in the order of incidence, each town being credited with the mean of the three attack-rates for each of three-year periods.
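The hyperbola used in section (8) can be recovered from the three variate-difference correlations alone. The sketch below assumes the form r(n) = a + b/(n + c), with n the order of the difference (n = 0 for the undifferenced rates). That form and the function names are assumptions, chosen because they reproduce the printed constants; they are not a statement of how the fit was actually carried out.

```python
def fit_hyperbola(r0, r1, r2):
    """Fit r(n) = a + b / (n + c) exactly through (0, r0), (1, r1), (2, r2).

    From r0 - r1 = b / (c (c + 1)) and r1 - r2 = b / ((c + 1) (c + 2))
    the ratio of the two drops gives (c + 2) / c, hence c, then b, then a.
    """
    d1 = r0 - r1
    d2 = r1 - r2
    c = 2.0 * d2 / (d1 - d2)
    b = d1 * c * (c + 1.0)
    a = r0 - b / c
    return a, b, c


def zero_crossing(a, b, c):
    """Order of difference n at which a + b / (n + c) vanishes."""
    return -b / a - c


# Correlations of the rates, their first and their second differences.
a, b, c = fit_hyperbola(0.332, 0.236, 0.159)
n_zero = zero_crossing(a, b, c)

print(f"r(n) = {b:.3f} / (n + {c:.3f}) {a:+.3f}")
print(f"r(n) changes sign near n = {n_zero:.2f}")
```

With the three correlations as inputs the constants come out close to 7.08, 8.11 and −.542, and the curve crosses zero just short of n = 5, in agreement with the remark about the fifth difference. The small discrepancy in the third decimal of the numerator arises from rounding c to 8.105 before computing b.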
Seventy-six Towns in order of their Diphtheria Incidence Rates 1904-1912.

 1 West Bromwich (.40)  | 20 Rotherham (.91)      | 39 Birkenhead (1.20)      | 58 Brighton
 2 Northampton (.45)    | 21 South Shields (.92)  | 40 Rhondda (1.24)         | 59 Stockton
 3 Wigan (.48)          | 22 Preston (.93)        | 41 Smethwick (1.25)       | 60 Grimsby
 4 Walsall (.49)        | 23 Wallasey (.94)       | 42 Barrow (1.25)          | 61 Leyton
 5 Stockport (.53)      | 24 Bath (.95)           | 43 Newport (1.25)         | 62 West Ham
 6 Oldham (.59)         | 25 Bootle (.96)         | 44 Wimbledon (1.30)       | 63 Salford
 7 Bolton (.59)         | 26 York (.99)           | 45 Great Yarmouth (1.31)  | 64 Nottingham
 8 Oxford (.70)         | 27 Blackpool (.99)      | 46 Southend-on-Sea (1.32) | 65 St Helens
 9 Barnsley (.71)       | 28 Tynemouth (1.00)     | 47 Birmingham (1.32)      | 66 Walthamstow
10 Southport (.72)      | 29 Tottenham (1.02)     | 48 Gillingham (1.34)      | 67 Ilford
11 Rochdale (.73)       | 30 Halifax (1.03)       | 49 Ipswich (1.36)         | 68 Southampton
12 Leicester (.76)      | 31 Sheffield (1.03)     | 50 Liverpool (1.37)       | 69 Cardiff
13 Manchester (.79)     | 32 Plymouth (1.07)      | 51 Hornsey (1.39)         | 70 Enfield
14 Bury (.80)           | 33 Coventry (1.11)      | 52 Darlington (1.39)      | 71 Hull
15 Blackburn (.83)      | 34 Warrington (1.11)    | 53 Acton (1.43)           | 72 Bristol
16 Wolverhampton (.86)  | 35 Devonport (1.13)     | 54 Newcastle (1.45)       | 73 Croydon
17 Burnley (.86)        | 36 Sunderland (1.14)    | 55 Burton on Trent (1.48) | 74 Portsmouth
18 Huddersfield (.89)   | 37 Bournemouth (1.17)   | 56 Bradford (1.56)        | 75 Derby
19 Wakefield (.91)      | 38 Middlesbrough (1.20) | 57 Willesden (1.56)       | 76 Lincoln

* It is perhaps worth while putting on record the additional statistical constants obtained in deducing the above correlations, as they are probably fairly reliable values and should be compared with the two period constants on p. 561:

$A'$: Mean Attack-rate 1.26; Standard Deviation, Attack-rate .655
$I'$: Mean Isolation-rate 47.75; Standard Deviation, Isolation-rate 26.341
$\delta_1 A'$: Mean Increase in Attack-rate −.086; Standard Deviation of change in Attack-rate .648
$\delta_1 I'$: Mean Increase in Isolation-rate 9.03; Standard Deviation of Increase in Isolation-rate 1.05

Thus while most towns have been sensibly increasing their amount of isolation by 17% to 18% of its mean value, the decrease in the attack-rate has only been 6% to 7% of the mean incidence, and the correlations show that this decrease has not occurred in the towns with marked increase of isolation.

Now an examination of the four columns of this table shows that, with the exception of Oxford (which has a child incidence, .89 as compared to .70, considerably above the population incidence owing to relatively few children), the towns with the least diphtheria are the Midland, and particularly the Northern manufacturing towns. These constitute practically the whole of the first column of 19 towns. The last column contains the big ports and certain suburban metropolitan districts, indeed all these for which we have data except Plymouth, Devonport and Tottenham fall into the second half of the table, and there can be little doubt that on the whole sea-port conditions and the big new neighbourhoods round London favour, while manufacturing conditions restrict, the incidence of diphtheria. We have not data, however, available upon which we could test water and milk supply, or extent of consumption of milk and fish in these towns. The results for Derby and Lincoln are remarkable, but they are high for all three periods, and this notwithstanding the rapid increase of isolation in those towns.

At first sight it seemed to us that the towns in the first column were markedly those in which there had been a greatly restricted birthrate*, while those in the last column were towns of greater fertility. Taking the births per 100 married women from 15 to 45 ($B$) we found:

$$r_{AB} = +.013 \pm .075.$$

Thus there is no association between incidence and the well-to-do character of a town as estimated by a low birthrate.
* We had partially in view here also the possibility that restricted birthrate meant employment of women and thus less breast-feeding and greater use of milk, so that cross-currents might be at work.

Again having regard to the character of the towns in our first column, it occurred to us to test the incidence in relation to the employment of males in manufacturing processes involving smoke. We took out of the 1911 census the percentage ($S$) of males over 10 years of age, who fell under a rough test of smoke-producing occupations, namely IX. 1, X. 1-2, 5-8, XIV. 1, XV. and XVIII. 1-6 of the Registrar-General's classification, and we found:

$$r_{AS} = -.180 \pm .073.$$

This is possibly significant and would undoubtedly be emphasised had we included as a factor the women engaged in textile industries. There seems therefore some slight reason to suppose that the conditions favourable to smoke production are unfavourable to the spread of diphtheria.

If the data could be procured, it would be worth while considering water and milk supply and the extent of fish consumption in the towns we have dealt with. If these were found to be of little influence, the road would certainly be clearer for dealing with the chronic diphtheritic human carrier as the chief source of the spread of the diphtheria bacillus.

(10) Conclusions.

(a) No influence of greater isolation in reducing the attack-rate from diphtheria is discoverable. In fact there is a sensible, if not large, positive association between the isolation-rate and attack-rate.

(b) The case mortality is somewhat less where there is more isolation. This may very probably be accounted for by more cases coming under specialised medical care.

(c) The attack-rate appears to be greater in the more prosperous towns and in towns of somewhat better sanitary conditions.
We have not found the prevalence of diphtheria associated with overcrowding or with the conditions leading to high infant mortality.

(d) While a low birthrate, taken either as a measure of prosperity, or as a measure of the employment of women and so of the prevalence of hand feeding, appears to have no significance for the attack-rate of diphtheria, smoke-producing manufactures are probably unfavourable to the prevalence of the disease, which appears to attach itself in the main to the large ports and metropolitan suburban districts.

(e) The association between the attack- and isolation-rates observed is not very significant, and while it might, to a very small extent, be due to increased isolation following or accompanying increased attack, it is more probably an association due to the more prosperous towns practising more isolation, and also to there being some element in prosperity which assists the spread of the disease.

Generally all the correlations are of a low order; they contain, however, nothing to support the theory that isolation markedly limits the incidence of diphtheria; the disease itself does not appear where overcrowding is greatest nor where the population is most dense; on the other hand isolation is most practised in those towns where domestic servants are most common and which may be supposed to be most prosperous. The chief argument for isolation which can be drawn from the present data is a lessened case-mortality, but such mortality might be obtained in all probability by specialised medical service as apart from isolation.

MISCELLANEA.

I. On the Probable Error of a Coefficient of Mean Square Contingency.

By KARL PEARSON, F.R.S.

Let the sampled population be considered as to two variates and be represented by the total $M$ and the cell-frequency $m_{pq}$ for the $p$th row and $q$th column cell. Further let the vertical marginal frequencies be given by $m_{.q}$ and the horizontal marginal frequencies by $m_{p.}$, so that

$$m_{1q} + m_{2q} + \ldots + m_{pq} + \ldots = m_{.q}, \qquad m_{p1} + m_{p2} + \ldots + m_{pq} + \ldots = m_{p.}.$$

Let the corresponding quantities for the sample be $N$, $n_{pq}$, $n_{.q}$ and $n_{p.}$. Then we know that the mean square contingency $\phi^2$ is given by

$$\phi^2 = S\left\{\frac{\left(n_{pq} - N\,\dfrac{m_{.q}\,m_{p.}}{M^2}\right)^2}{N \times N\,\dfrac{m_{.q}\,m_{p.}}{M^2}}\right\} \qquad \text{(i)},$$

summed for every cell. Now in the great bulk of statistical phenomena we do not know more of the sampled population than is given by the sample, and thus to determine $\phi^2$ we must put $m_{.q}/M$ and $m_{p.}/M$ equal to the most probable values known to us*, namely $n_{.q}/N$ and $n_{p.}/N$. Doing this we obtain the usual value for the mean square contingency

$$\phi^2 = S\left\{\frac{\left(n_{pq} - \dfrac{n_{.q}\,n_{p.}}{N}\right)^2}{n_{.q}\,n_{p.}}\right\} \qquad \text{(ii)}.$$

Starting from (ii) Blakeman and Pearson have found† the probable error of the mean square contingency. The process is admittedly very laborious and although it has now been used fairly often, it must be confessed that its chief value is to obtain appreciation of the probable errors of contingency coefficients in general, rather than in any usefulness in recording significant differences between long series of individual coefficients. But it has not been sufficiently recognised that the probable error thus found is that of the approximate value of the mean square contingency (ii) and not that of its true value (i). It is indeed the probable error of the expression actually used, but it is not the probable error of the true value as given by (i). The latter is easy to find and deserves consideration.

* Pearson, Philosophical Magazine, Vol. L. 1900, pp. 164 et seq.
† Biometrika, Vol. V. pp. 191 et seq.

Let us write

$$\mu_{pq} = N\,\frac{m_{.q}\,m_{p.}}{M^2} \qquad \text{(iii)},$$

then

$$\phi_t^2 = S\left\{\frac{(n_{pq} - \mu_{pq})^2}{N\,\mu_{pq}}\right\},$$

where we shall use $\phi_t^2$ and $\phi_a^2$ for the true and approximate values of the mean square contingency. Thus

$$\phi_t^2 = \frac{1}{N}\,S\left(\frac{n_{pq}^2}{\mu_{pq}}\right) - 1.$$

Now for a sample of constant size $\mu_{pq}$ is constant and therefore representing small deviations by differentials

$$\delta\phi_t^2 = \frac{2}{N}\,S\left(\frac{n_{pq}\,\delta n_{pq}}{\mu_{pq}}\right).$$

Square, add for all samples and divide by the number of such samples and we have

$$\sigma^2_{\phi_t^2} = \frac{4}{N^2}\left\{S\left(\frac{n_{pq}^2}{\mu_{pq}^2}\,\sigma^2_{n_{pq}}\right) + 2\,S'\left(\frac{n_{pq}\,n_{p'q'}}{\mu_{pq}\,\mu_{p'q'}}\,\sigma_{n_{pq}}\,\sigma_{n_{p'q'}}\,r_{n_{pq}n_{p'q'}}\right)\right\},$$

where $\sigma_{n_{pq}}$ is the standard deviation of $n_{pq}$ and $r_{n_{pq}n_{p'q'}}$ is the correlation of deviations in $n_{pq}$ and $n_{p'q'}$; $S$ is a summation for every cell and $S'$ for every pair of cells. But it is well known* that

$$\sigma^2_{n_{pq}} = \chi N\,\frac{m_{pq}}{M}\left(1 - \frac{m_{pq}}{M}\right), \qquad \sigma_{n_{pq}}\,\sigma_{n_{p'q'}}\,r_{n_{pq}n_{p'q'}} = -\chi N\,\frac{m_{pq}\,m_{p'q'}}{M^2},$$

where $\chi$ is the factor $1 - (N-1)/(M-1)$, usually put unity, since $M$ is as a rule large compared with $N$, and which will be here put unity for the remainder of the work.

* The values here given are the true values before we approximate by putting $m_{pq}/M = n_{pq}/N$, etc.

Hence

$$\sigma^2_{\phi_t^2} = \frac{4}{N}\,S\left(\frac{n_{pq}^2\,m_{pq}}{\mu_{pq}^2\,M}\right) - \frac{4}{N}\,S\left(\frac{n_{pq}^2\,m_{pq}^2}{\mu_{pq}^2\,M^2}\right) - \frac{8}{N}\,S'\left(\frac{n_{pq}\,n_{p'q'}\,m_{pq}\,m_{p'q'}}{\mu_{pq}\,\mu_{p'q'}\,M^2}\right)$$

$$= \frac{4}{N}\left\{S\left(\frac{n_{pq}^2\,m_{pq}}{\mu_{pq}^2\,M}\right) - \left[S\left(\frac{n_{pq}\,m_{pq}}{\mu_{pq}\,M}\right)\right]^2\right\} \qquad \text{(iv)}.$$

This is the standard deviation of the true value of the mean square contingency, and in most cases will be of no service, for we do not know the true values of $m_{pq}$ and $\mu_{pq}$. If we put these equal to the values obtained from the actual sample under consideration we obtain the approximate value of the standard deviation of the true mean square contingency, which we may represent by the symbol $(\sigma_{\phi_t^2})_a$ and compare with what Blakeman and Pearson found, i.e. $(\sigma_{\phi_a^2})_a$. Thus our alternatives are

$$\phi_a^2 \pm .67449\,(\sigma_{\phi_t^2})_a \qquad \text{and} \qquad \phi_a^2 \pm .67449\,(\sigma_{\phi_a^2})_a.$$

The real thing is $\phi_t^2 \pm .67449\,\sigma_{\phi_t^2}$. Shall we obtain a better insight into the variation of this by taking the approximate values of both $\phi_t^2$ and $\sigma_{\phi_t^2}$, or by taking the probable error of $\phi_a^2$? The problem is a subtle one, and, perhaps, only to be solved by experiment, not by theory. Of course when we take numerous samples and calculate $\phi_a^2$, then $\sigma_{\phi_a^2}$ will measure their variability. But this is not what we seek. We use $\phi_a^2$ as an approximation to $\phi_t^2$, and it is the variability of the true value that we want. Are we not right in choosing $(\sigma_{\phi_t^2})_a$ as its best value? In short would not, on the average of a great number of samples, $(\sigma_{\phi_t^2})_a$ give us a closer result to $\sigma_{\phi_t^2}$ than $(\sigma_{\phi_a^2})_a$?

Returning now to equation (iv) and putting in the observed values for $m_{pq}$, $\mu_{pq}$ we have

$$(\sigma_{\phi_t^2})_a^2 = \frac{4}{N}\left\{S\left(\frac{N\,n_{pq}^3}{(n_{.q}\,n_{p.})^2}\right) - \left[S\left(\frac{n_{pq}^2}{n_{.q}\,n_{p.}}\right)\right]^2\right\} \qquad \text{(v)}.$$

Or, after some reductions

$$(\sigma_{\phi_t^2})_a^2 = \frac{4}{N}\left\{\phi_a^2 + \psi_a^2 - \phi_a^4\right\} \qquad \text{(vi)},$$

where

$$\psi_a^2 = S\left\{\frac{N\left(n_{pq} - \dfrac{n_{.q}\,n_{p.}}{N}\right)^3}{(n_{.q}\,n_{p.})^2}\right\} \qquad \text{(vii)}$$

and

$$\phi_a^2 = S\left\{\frac{\left(n_{pq} - \dfrac{n_{.q}\,n_{p.}}{N}\right)^2}{n_{.q}\,n_{p.}}\right\} \qquad \text{(viii)}.$$

Again we have $\sigma_\phi = \sigma_{\phi^2}/(2\phi)$, and therefore

$$(\sigma_{\phi_t})_a = \frac{1}{\sqrt{N}}\left\{\frac{\psi_a^2}{\phi_a^2} + 1 - \phi_a^2\right\}^{1/2} \qquad \text{(ix)}.$$

Now what we usually need is the probable error of the contingency coefficient

$$C = \frac{\phi}{(1 + \phi^2)^{1/2}}.$$

But

$$\sigma_C = \sigma_\phi\,(1 - C^2)^{3/2} = \frac{\sigma_\phi}{(1 + \phi^2)^{3/2}}.$$

Thus the probable error of the coefficient of mean square contingency

$$.67449 \times \sigma_{C_t} = \frac{.67449}{\sqrt{N}}\left\{\frac{\psi_a^2/\phi_a^2 + 1 - \phi_a^2}{(1 + \phi_a^2)^3}\right\}^{1/2} \qquad \text{(x)}.$$

This expression is much simpler than that for the probable error of the actually used value as given by Blakeman and Pearson*. It is not, however, asserted that it possesses greater theoretical validity. Those authors illustrate their formula by calculating the probable error of the contingency coefficient in the case of the association of handwriting and general intelligence in 1801 schoolgirls. They find $C = .2957 \pm .0192$. In the course of their work they deduce

$$\phi_a^2 = .09580, \qquad (\sigma_{\phi_a})_a = .03268, \qquad \psi_a^2 = .14865.$$

Using these values we have from (ix)

$$(\sigma_{\phi_t})_a = \frac{1}{\sqrt{1801}}\left\{\frac{.14865}{.09580} + .90420\right\}^{1/2} = .0369.$$

It is clear therefore that $(\sigma_{\phi_t})_a$ does not differ very substantially from $(\sigma_{\phi_a})_a$. Calculating from (x) the probable error of $C_t$, we find it = .0217, while the Blakeman-Pearson process gave .0192. The two values only differ by .0025, which is unlikely to be of importance in the case of most inferences in practical statistics.
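The arithmetic of (vii) to (x) is easily checked by machine. The following Python sketch is offered as an illustration under stated assumptions, not as the procedure used in the note: the function names are hypothetical, the contingency table is supplied as a plain list of rows, and the only data taken from the text are the three constants quoted for the 1801 schoolgirls.

```python
import math

PROBABLE_ERROR_FACTOR = 0.67449


def phi_psi_squared(table):
    """Return (N, phi_a^2, psi_a^2) for a contingency table given as rows.

    phi_a^2 follows (viii) and psi_a^2 follows (vii), both written with the
    expected cell frequency e = n_.q * n_p. / N.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n_total = sum(row_totals)
    phi2 = 0.0
    psi2 = 0.0
    for p, row in enumerate(table):
        for q, n_pq in enumerate(row):
            expected = row_totals[p] * col_totals[q] / n_total
            deviation = n_pq - expected
            phi2 += deviation ** 2 / (n_total * expected)
            psi2 += deviation ** 3 / (n_total * expected ** 2)
    return n_total, phi2, psi2


def probable_error_true_c(n_total, phi2, psi2):
    """Return (sigma_phi from (ix), probable error of C from (x))."""
    sigma_phi = math.sqrt((psi2 / phi2 + 1.0 - phi2) / n_total)
    sigma_c = sigma_phi / (1.0 + phi2) ** 1.5
    return sigma_phi, PROBABLE_ERROR_FACTOR * sigma_c


# Constants quoted in the text for the handwriting and intelligence table.
N, PHI2, PSI2 = 1801, 0.09580, 0.14865
sigma_phi, pe_c = probable_error_true_c(N, PHI2, PSI2)
c_coefficient = math.sqrt(PHI2 / (1.0 + PHI2))

# The older value, from the quoted sigma of phi_a = .03268.
pe_c_blakeman_pearson = PROBABLE_ERROR_FACTOR * 0.03268 / (1.0 + PHI2) ** 1.5

print(f"C = {c_coefficient:.4f}")
print(f"sigma_phi by (ix) = {sigma_phi:.4f}")
print(f"probable error of C by (x) = {pe_c:.4f}")
print(f"probable error of C, Blakeman-Pearson = {pe_c_blakeman_pearson:.4f}")
```

With the quoted constants the sketch gives C = .2957, a standard deviation of .0369 from (ix), and probable errors of .0217 and .0192 for the two processes, as stated above. The helper `phi_psi_squared` is included only to show how $\phi_a^2$ and $\psi_a^2$ might be obtained from the cell frequencies of any table one has to hand.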
Beyond the knowledge of $\phi_a^2$ only $\psi_a^2$ is required by the present process. This may be written

$$\psi_a^2 = \frac{1}{N}\,S\left\{\frac{\left(n_{pq} - \dfrac{n_{.q}\,n_{p.}}{N}\right)^2}{\dfrac{n_{.q}\,n_{p.}}{N}} \times \frac{n_{pq} - \dfrac{n_{.q}\,n_{p.}}{N}}{\dfrac{n_{.q}\,n_{p.}}{N}}\right\}.$$

In finding the mean square contingency $\phi_a^2$, however, the three expressions

$$\frac{\left(n_{pq} - \dfrac{n_{.q}\,n_{p.}}{N}\right)^2}{\dfrac{n_{.q}\,n_{p.}}{N}}, \qquad n_{pq} - \frac{n_{.q}\,n_{p.}}{N} \qquad \text{and} \qquad \frac{n_{.q}\,n_{p.}}{N}$$

must have been written down for each cell and thus $\psi_a^2$ can be readily calculated. We can also treat $\psi_a^2$ as given by

$$\psi_a^2 = S\left(\frac{N\,n_{pq}^3}{(n_{.q}\,n_{p.})^2}\right) - 1 - 3\phi_a^2;$$

but the cubing of the often rather large cell frequencies is troublesome, just as it is rather more troublesome to calculate

$$\phi_a^2 = S\left(\frac{n_{pq}^2}{n_{.q}\,n_{p.}}\right) - 1$$

than

$$\phi_a^2 = \frac{1}{N}\,S\left\{\frac{\left(n_{pq} - \dfrac{n_{.q}\,n_{p.}}{N}\right)^2}{\dfrac{n_{.q}\,n_{p.}}{N}}\right\},$$

owing to the largeness of the squares in the former expression.

* loc. cit. p. 194.

II. Measurements of Medieval English Femora.

In a forthcoming memoir on the English Long Bones there will be a good deal to be said about the conclusions reached by Dr Parsons in his recent paper on the Rothwell femora. Meanwhile he has started an attack on the Biometric School in a Journal whose columns are not open to adequate reply (i.e. to a reply of not greater length than the published attack) from members of that school. In his communication he suggested that I was unacquainted with the condition of affairs at Rothwell, and behind this charge tried to escape any answers to the essential questions I asked him, and thus those questions still remain unanswered. The communication I made ran as follows:

My informant who I hope is trustworthy speaks of (i) “the great mass of bones beneath the church at Rothwell” and (ii) of “the great collection of human bones beneath the old parish church at Rothwell”; further (iii) “there are probably some 5000 or 6000 individuals represented in the vault at Rothwell, either altogether or in part”; and again (iv) “The stack varies in height and breadth, but is nowhere as high or broad as that at Hythe, although it is much longer.
I know that at Hythe there are the remains of rather over 4000 people, ... I think that this collection contains more than this, partly because the stack is so much longer, partly because the bones are so much more decomposed and have therefore settled more.”

Manouvrier after much piecing and mending, while only able to measure the lengths of about 16 femora from the neolithic burial places of Montigny and Esbly, was yet able to determine the pilastric index of 127, and the platymeric index in 127 bones, that is to say in eight times as many bones as those for which he could obtain the maximum length. And had he dealt fully with the head and neck and the popliteal region, the multiplying factor would probably have been ten. Had piecing, mending and a maximum of care in handling been used, I can hardly believe that what Manouvrier achieved at Montigny was not possible for Dr Parsons at Rothwell.

Dr Parsons writes: “If the remains of femurs, whether they are fit or unfit for measurement, are counted it will be found that females are quite as numerous as males though measurable male femurs from their stronger build are less liable to break in being extricated from the pile of bones, and so there are more of them available for measurement.” The italics are mine. Much depends on the method of ‘extrication,’ and if the capacity of a bone to stand a hole being drilled in it with a bradawl be part of the necessary fitness for measurement then the number might undoubtedly be limited. But trusting to what I know has been achieved by the French, I feel convinced that if Dr Parsons could measure 277 femoral heads where the femoral length was measurable, he could easily have measured 2000 heads in all and thus have ascertained, definitely, whether his Rothwell series is unique in showing a significant depression in frequency between 45 and 47 mm.
Further he could on such material, by dealing with numbers 8 to 10 times those he has provided, have given definite answers to many of the problems concerning platymery and the pilastric and popliteal indices, which other observers have been vainly trying to solve on far less adequate and in many cases far more fragmentary material.

I would note that Dr Parsons gives no reply at all to my question of why he used Dwight's measurements as a criterion of sex when they referred to bones with the cartilages attached, because without this reply his careful attention to ‘other points’ when the head fell between 45 and 47 mm. seems one-sided, and of no value in sexing the collection as a whole. He further gives no reply whatever to my question of why it is the male end, not the dwarf end, of his female distribution which is lacking, if absence of females be due to breakage.

I would also state (i) that I have not sexed the Rothwell bones and therefore cannot say how far I should or should not agree with Dr Parsons. Dr Lee using the best available mathematical process found 145 ♀s and 133 ♂s, while Dr Parsons has 103 ♀s and 174 ♂s. How this shows any agreement I fail to perceive; (ii) that I have made no assertion about the bones being of the 13th and 14th centuries. I merely headed my letter with Dr Parsons' heading “Measurements of Medieval English Femora,” and asked why, if Dr Parsons holds these bones to be such, he considers them without cartilages comparable with the mixed results of a modern American dissecting room plus the cartilages.

K. P.

CAMBRIDGE: PRINTED BY JOHN CLAY, M.A. AT THE UNIVERSITY PRESS