THE PSYCHOLOGICAL METHODS OF TESTING INTELLIGENCE

No. 13

By WILLIAM STERN

TRANSLATED FROM THE GERMAN BY GUY MONTROSE WHIPPLE
Assistant Professor of Educational Psychology, Cornell University

BALTIMORE
WARWICK & YORK
1914

Copyright, 1914
By WARWICK & YORK, Inc.

CONTENTS

Author's Preface
Translator's Preface
Introduction: Nature and Problem of Intelligence Testing
  1. Intelligence and Intelligence Testing
  2. Practical Problems of Intelligence Testing
I. Single Tests and Series of Tests
  1. Single Tests
  2. The Inadequacy of the Single Test
  3. Series of Tests
II. The Method of Age-Gradation (Binet-Simon Method)
  1. The Principle of the Method and the Tests Employed
  2. The Resultant Values (Mental Age, etc.)
  3. Results with Normal Children
    (a) General Distribution of the Level of Intelligence
    (b) Different Age-Levels and Nationalities
    (c) Children of Different Social Strata
    (d) Intelligence and School Performance
    (e) Sex Differences
    (f) Repeated Tests with the Same Children
  4. Abnormal Children
    (a) Mental Arrest and Retardation. Mental Quotient
    (b) Relation to the Several Tests
    (c) Intelligence and School Ability
  5. Points of View for the Reorganization and Improvement of the Gradation Method
    (a) Selection and Appraisement of the Tests
    (b) The Composition of Series for the Several Years
    (c) The Extension of the System
    (d) The Computation of the Final Values
III. Estimation and Testing of Finer Gradations of Intelligence (Method of Ranks)
  1. The Problem
  2. The Teacher's Estimation of Intelligence
  3. Estimated Intelligence and School Performance
  4. Rank-Orders of Intelligence Obtained by Tests
Bibliography
Appendix I
Appendix II
Index
AUTHOR'S PREFACE

I undertook for the last German congress of psychology, held at Berlin, April, 1912, a general review of the psychological methods of testing intelligence. As I had only an hour at my disposal in my address, I could at that time do little more than outline certain of the main features of this very broad field. It seemed to me, however, hardly desirable to publish the address in the form in which it was given. I felt, on the contrary, that in view of the now ever-increasing interest displayed in the theme both in Germany and elsewhere, and in view of the extraordinarily scattered nature of the literature (much of which, by the way, is difficult of access), an exposition of the topic on a wider scale was demanded.

So I have tried to elaborate my original review to this larger scale. I have treated in it three main topics: single tests, the serial method (after Binet-Simon) and the methods of correlation and estimation. In the form of my treatment, also, I have overstepped the bounds of the mere "general review." I have not confined myself to setting down what now exists, but have myself taken an attitude toward the problem, have offered criticisms of the methods and made proposals for their modification and development.

In making these criticisms and suggestions I have been able to use the experience that has come from the tests of intelligence which have been in progress at Breslau for some years past. Many of these experiments, in which psychologists, educators and physicians have cooperated in a gratifying manner, have already been published; others are still in progress. Yet, thanks to the courtesy of these workers, I am able to make a preliminary report of some of these as yet unfinished investigations.
I have also taken the opportunity to incorporate some minor contributions to the problem that have originated in the exercises of the Psychological Seminary at Breslau.

The subject under discussion is limited to some extent by the circumstance that tests of intelligence have been almost always restricted to children and youths. But it is just the peculiarity of the psychological methods of intelligence testing (psychological in the narrower sense, in contrast, e. g., to the psychiatrical methods) that they take their start from the mental life of the child, though later, of course, the attempt is made to carry them over into test methods for adults. On this account I have treated in some detail the results that accrue to pedagogy, and not only to the pedagogy of auxiliary classes and of the subnormal child, but also to the pedagogy of the normal child.

In my judgment, intelligence testing is one of the most promising fields of applied psychology, using that term in the strictest sense. For this reason I wanted to make this survey of it accessible to wider circles of readers outside the psychological profession, especially to teachers of normal and of backward children, to school administrative authorities, to school physicians, to specialists in nervous and in children's diseases, and to those engaged in child welfare work. This special edition, accordingly, has been arranged. I hope that it will demonstrate to the workers in these circles the great importance and fruitfulness of the psychologist's methods and at the same time show them the difficulties and the gaps in the present status of this work, and that so plainly as to prevent overhasty attempts at practical application.

W. STERN.
Breslau, October, 1912.
TRANSLATOR'S PREFACE

This translation of Stern's Die psychologischen Methoden der Intelligenzprüfung has been undertaken because the monograph, though dealing with a different topic, aims, like my previous translation of Offner's Mental Fatigue, to collate, systematize and appraise a mass of scattered and to most readers inaccessible material that bears upon a problem of unquestioned importance.

Professor Stern was one of the pioneers and most active expositors of the investigation of the psychology of testimony, for the furtherance of which he instituted a new periodical, Beiträge zur Psychologie der Aussage, which was later enlarged to cover the wider field of applied psychology in general (Zeitschrift für angewandte Psychologie). Stern is likewise well-known for his contributions to individual psychology, notably for his important work on individual differences (Ueber Psychologie der individuellen Differenzen), published originally in 1900 and completely rewritten in 1911 under the title, Die differentielle Psychologie, and for his numerous significant contributions to the psychology of childhood. From his Psychological Seminary at Breslau have appeared many researches, some of which are reported for the first time in the present monograph. In conjunction with Lipmann he has also founded the Institut für angewandte Psychologie, which aims to serve as a museum and clearing house for the collection and dissemination of methods and materials for studying and recording the mental processes of individuals and for facilitating the application of psychology to various practical problems.
What Stern has aimed to do in the present monograph is sufficiently set forth in his own preface, but it may be added here that his book affords what is, so far as I know, the best, and in fact almost the only authoritative, critical and compact general survey of the literature of intelligence testing which is adapted for lay readers as well as for professional psychologists.

In perfecting this translation I have received much valuable aid from the members of my class in German Educational Psychology, in which the monograph was used as a text, and from my colleagues, Professor P. E. Pope, of the German Department, and Mr. D. K. Fraser, assistant in Educational Psychology.

GUY MONTROSE WHIPPLE.
Cornell University, January 1st, 1914.

INTRODUCTION

Nature and Problem of Intelligence Testing

1. Intelligence and Intelligence Testing

Modern experimental psychology, which started with the study of sense-perception and then undertook that of ideas and feelings, has in the last decade begun to deal with intellectual functions themselves. And it is worthy of note that general theoretical psychology and differential applied psychology took this step forward at the same time, though for the most part independently. In the former there was developed a psychology of thinking, in the latter there appeared the investigation of differences in intelligence.

Our discussion must be restricted to the second problem with which alone we are concerned. To the other branch of psychology we may confidently leave the question of the general nature of intellectual activity and the investigation of the phenomena that constitute thinking as such. What we are interested in is not intelligence as a phenomenon, but intelligence as a capacity and particularly a capacity with respect to which men differ one from another. And intelligence testing is the determination of the degree of this capacity in a given individual.
The objection is often made that the problem of intellectual diagnosis can in no way be successfully dealt with until we have exact knowledge of the general nature of intelligence itself. But this objection does not seem to me pertinent. In science there is no such precise sequence of the different research problems. We measure electro-motive force without knowing what electricity is, and we diagnose with very delicate test methods many diseases of whose real nature we know as yet very little. Indeed, it may be asserted, quite on the contrary, that progress in testing intelligence may shed light from a new angle upon the theoretical study of intelligence and thus supplement the psychology of thinking in a valuable manner. If it turns out, for instance, that certain symptoms are relevant and others irrelevant for the differentiation of the intelligence shown by different persons; if, again, one series of these symptoms exhibits a high degree, another series a less degree of intercorrelation, then our knowledge of the structure of intelligence must thereby be little by little increased, and thus there will develop a fruitful reciprocity between the two phases of investigation, theoretical and applied.

Naturally, we cannot begin our work without a preliminary definition of intelligence, however provisional it may be. And this definition must be neither too broad nor too narrow.

Many psychiatrists have used a definition of intelligence that is too broad. They use intelligence, in fact, to include mental attainments of all kinds, all those mental qualities, then, that are not volitional or emotional.
If this position be taken, it follows, evidently, that the examination of immediate memory, of ability to learn, of range of information, of fidelity of report, or of discriminative sensitivity is just as much a constituent part of intelligence testing as the examination of ability to apprehend, to synthetize, of capacity to judge, to conclude, to define, to criticize, etc. Again, a question that is very important for us, viz.: to what extent intelligence really enters into these first-named activities, and whether and in what way it shows signs of its presence in them, becomes absurd. But the advance made in the recent development of intelligence testing, in contrast to the uncritical determination of mental level by any sort of questions and tests, consists in the fact that we not only limit intelligence by setting it over against the emotive and volitional nature of an individual, but also ascribe to it a definitely restricted place within the mental functions.

This delimitation of the sphere of intelligence that is even now essential cannot be effected, in my opinion, from a phenomenological, but only from a teleological point of view. In fact, my definition is this: Intelligence is a general capacity of an individual consciously to adjust his thinking to new requirements: it is general mental adaptability to new problems and conditions of life.

This definition differentiates intelligence clearly from other mental capacities.

The fact that the adjustment is made to the new distinguishes intelligence from memory, whose fundamental teleological feature is the conservation and utilization of conscious contents already given.
The fact of adaptation, again, emphasizes the dependence of the performances upon external factors, on the problems and demands of life, and thus distinguishes intelligence from genius, whose nature is to create the new spontaneously.

Finally, the fact that the capacity is a general capacity distinguishes intelligence from talent, the characteristic of which is precisely the limitation of efficiency to one kind of content. He is intelligent, on the contrary, who is able easily to effect mental adaptation to new requirements under the most varied conditions and in the most varied fields. If talent be a material efficiency, intelligence is a formal efficiency.

I trust that these distinctions may serve to lessen the confusion that has been current. It is not so long ago, indeed, that in psychiatry 'information tests' were carried on as 'intelligence tests,' thereby confusing memory and intelligence. And we often, even nowadays, find intelligence and talent confused in everyday life. In the school, for instance, a teacher of a special subject like mathematics, who perceives the special gift of a pupil in that field, may easily come to believe without further evidence that this pupil has general ability, or in other words, to rate him as an intelligent pupil.

But we should not interpret this delimitation to mean the erection of sharply distinct faculties, as in the old faculty theory. Intelligence, for instance, does not function by itself and memory by itself; rather, every operation of memory is more or less impregnated with intellectual functions and vice versa: the extent of this interconnection can be indicated only by the correlation of the tested symptoms.
But just on account of this composite character of every actual mental process it seems to me that the definition of intelligence I have given above is indispensable as a regulative principle for further investigation: I mean that any sort of perceptive, memorial or attentive activity is at the same time an intelligent activity just in so far as it includes a new adjustment to new demands.

We must add one final limitation: we are considering only those phases of intelligence testing that deal with a scale of degrees. This does not mean to minimize in the slightest the importance of qualitative differences in types of intelligence (analytic-synthetic, objective-subjective, etc.); we need only refer to the importance of the essay as a means of testing for these phases.[1] But we shall discuss in this monograph only those forms of procedure that permit us to say of a given person that his intelligence is of such and such degree.

[1] On this aspect of intelligence, consult the general review and bibliography given in my earlier discussion (1: pp. 203-213, 433-4). [Note: numbers in parentheses refer, unless otherwise indicated, to the reference list at the end of this monograph. Translator.]

As the title of the book indicates, the problem of method will be prominent throughout our presentation. We can thus best do justice to the present status of the question, for the significance of the results thus far obtained lies particularly in the fact that they serve to provide new suggestions for the perfecting of our methods.

2. Practical Problems of Intelligence Testing

Since we have to do here not with methods designed for purely theoretical investigations, but with methods that are to be employed in daily life, their form is determined, at least in part, by the practical needs that are to be satisfied by intelligence testing.
We must distinguish four groups that arise from the combination of the two pairs of terms: abnormal and normal, adult and child.[2]

[2] A similar division is used by Meumann (15), though he, to be sure, defines intelligence somewhat more broadly than do we.

(a) Adult, abnormal individuals form the chief material of the psychiatrists, who in consequence were the first to want to test intelligence.[3] Not only have they invented single methods, but they have also devised whole series or systems of examination (Rieger, Kraepelin, Sommer, Ziehen, Gregor, Bernstein, Rossolimo, et al.). The contents of these systems are such as to bring them only partially within our scope; by far the greater portion of them take on the character of questions and qualitative tests rather than that of quantitatively gradable tests; even where these latter have been used, comparative material for normal persons is often enough wanting. Whether the outcome of any one of these tests might really indicate an abnormally weak intelligence was frequently judged on the basis of a preconceived opinion as to how normal men might be expected to react to the test in question.

[3] An extensive general summary of the more important methods of intelligence testing used by alienists will be found in Jaspers.

In recent years this has been remedied. Rodenwaldt (22) showed with regard to a group of information tests how much of what had a priori been deemed abnormal really lay within the bounds of normality. Many psychiatrists have sought to obtain comparative standards for their methods by extensive application of them to normal persons (Sommer, 26; Ziehen, 30; Ranschburg; Rossolimo, 23-25). Others have turned to account the fact that certain methods had already been tried out extensively by psychologists upon normal persons, e. g., Ebbinghaus' completion method, the report experiment.[4] But how far all this comes from meeting the need of the alienist himself is shown by the decision of the International Congress of Physicians to turn to the psychologists in order to secure normal series for the various psychiatrical tests of intelligence. This task has been undertaken by the Institute for Applied Psychology.

[4] For a general account of these methods, see the translator's Manual of Mental and Physical Tests, Baltimore, 2d ed., 1914.

(b) Abnormal children have become, just in the last few decades, a center of pedagogical, socio-political, and medical interest. The whole pedagogy of the subnormal, the schema of auxiliary schools and special classes, the juvenile court and the various protective and corrective institutions are, indeed, matters of very recent development, but they are demanding a more exact study of the individuality of the child, both for purposes of mental diagnosis and for 'psychotechnic' purposes (training, treatment, punishment, etc.). To meet these needs, the determination of degree of intelligence is, though not the only, at least a most important factor.

The weaknesses of the psychiatrical methods mentioned above were doubled when these methods were applied to these new problems. With adults we knew little enough of the normal standard to which the performances of abnormal subjects were to be compared, but with children we knew nothing at all. What is more, one normal standard is not enough in this case; every age-year must have its own standard. The magnitude of a defect of intelligence in a nine-year old child can be determined only by comparing it with the normal nine-year old intelligence, and so with other ages. The consequent demand for the creation of normal test-series for each year of childhood was met, as a matter of fact, not from the side of psychiatry, but from that of psychology.
Alfred Binet, with the cooperation of the physician, Simon, has created such a graded series of tests; and although the system as it now stands may be far from final, its fundamental conception will retain its permanent value and will doubtless lead us ultimately to a completely satisfactory solution. His method has already attained international usage. We shall discuss it fully in the second part of our treatment.

(c) Normal children and youths. It is not to be supposed, however, that intelligence testing of normal children has merely the secondary importance of supplying standards of comparison for investigations of the feeble-minded. On the contrary, the gradation of intelligence within the range of normality is an entirely independent problem that is closely connected with practical pedagogical interests. The ordinary school examinations afford a notion of the pupil's knowledge and of his external accomplishments, but they do not afford an index of his inner endowment, of his mental maturity and power; it is here that psychological tests must supplement other forms of examination. This need is especially evident at entrance examinations, but it exists within the ordinary administration of the school as well, for the demand, nowadays so emphatically voiced, that instruction shall be individualized to the fullest possible extent, presupposes a fuller insight into the nature of individualities.
Very recently, in fact, serious attempts have been made to make divisions into classes and sections on a psychological and qualitative basis (special classes for the subnormal, classes for the backward, separate classes for the specially gifted, 'parallel' classes with normal and minimal courses of instruction for pupils of different degrees of ability in particular subjects); these attempts demand, as an indispensable prerequisite, the possibility of very exact determination of the actual degree of mental ability.[5]

[5] All these pedagogical reform-movements that are related to the problem of intelligence were the general subject of discussion at the first German Congress for Child Training and Paidology (Kongress für Jugendbildung und Jugendkunde) that was conducted by the School Reform Association (Bund für Schulreform) at Dresden, 1911. The addresses and discussions of this congress have been published in separate form (11): the special problem of testing intelligence was discussed in the addresses of Meumann, Kramer, and the author.

In this connection we must, of course, guard against the danger which is apt to arise of supposing that we have grasped the individuality of a pupil in its totality when we have tested his intelligence. The fact that intelligence can be more easily treated quantitatively than can other individual capacities must not lead us to overestimate its import. Nevertheless, the fact that we can deal with intelligence by itself does serve to disclose the structure of the individuality. We can determine whether a performance of greater or lesser degree depends on talent or on intelligence; we can investigate what degree of correspondence exists between the experimental results and the teachers' judgments of the intelligence of pupils; we can delimit the extent to which general school efficiency is dependent on intelligence itself on the one hand and on non-intellectual factors on the other hand, a delimitation that, as will be shown later, forms one of the chief merits of the psychological methods.

The studies of normal children that bear directly upon our problem were first carried on by separate tests: this method, originated in Germany, has been very extensively employed and further developed in France and especially in America. Then arose in France Binet's system of tests with age gradations that we have already mentioned. England has lately joined the movement to good effect by giving us the correlation method for use in the more precise testing of intelligence (Pearson, Spearman, et al.). These three main lines of activity will furnish the principle of division of our subsequent treatment.

(d) Normal adults. Here we find ourselves in a realm whose exploitation is entirely in the future, for the tests of intelligence thus far administered to normal adults have not been undertaken for the sake of these persons, but only to get comparative standards for abnormal persons. Yet even now new developments are to be noted. Münsterberg shows how important an exact knowledge of individuality would be for determining choice of a vocation and he has already suggested ways in which the vocational bureaus that exist in America might arrange psychological tests (19, 20). And Captain Meyer (17, 18) sees in intelligence testing a method that ought to help the recruiting office to keep unfit candidates off the enlistment rolls.
These last considerations show that the chief emphasis of intelligence testing, which has hitherto lain wholly within psychopathology, must in the future be shifted distinctly toward normal psychology: so the labor expended by psychology in securing a reliable method will benefit not only physicians and those concerned in teaching the abnormal, but also jurists, military officials, those concerned in teaching the normal child and others.

But just this anticipated extension of the practical applicability of intelligence tests necessitates several words of warning.

(a) We are still in the midst of our preliminary work on method. The methods that now prevail (and this is true also of the Binet-Simon system) are not yet to be regarded as diagnostic canons that admit of official prescription. The law passed in New Jersey that directs the use of intelligence tests with all pupils suspected of backwardness seems on this account very premature. So, too, it will be long, very long, before we realize the optimistic hope that Spearman attaches to the correlation method of testing intelligence, when he says: "Indeed, it seems possible to foresee the day when there will be an annual official determination of the 'intellectual index' of every child in the empire" (Hart-Spearman; 75, p. 78).

(b) It must be understood that tests of intelligence are not easy to conduct. Their administration demands extended practise, psychological training, and a critical mind. Thus, for instance, the average teacher, whose work has been with the wholly different methods of pedagogical questioning and examining, is very apt to apply psychological tests in those forms in which their value would be positively illusory. If, accordingly, the use of tests for practical purposes shall attain any very large currency, the training of a specially psychologically drilled personnel will become a necessity.
School psychologists would then take their place side by side with the school physicians.[6]

[6] On the demand for school psychologists, see 11, p. 19. [The situation in America is discussed by J. E. W. Wallin in two interesting papers, Jour. of Educ. Psychol., 2: 1911, 121 and 191. Translator.]

What erroneous ideas prevail concerning the ease of conducting tests is illustrated, e. g., in the declaration of Captain Meyer that in military enlistment tests of intelligence could someday be carried on quite mechanically by subalterns. But, as a matter of fact, a psychological test is quite a different thing than the determination of weight or of stature which might very well be carried out by minor military officers.

(c) Psychological tests must not be overestimated, as if they were complete and automatically operative measures of mind. At most they are the psychographic minimum that gives us a first orientation concerning individuals about whom nothing else is known, and they are of service to complement and to render comparable and objectively gradable other observations (psychological, pedagogical, medical), not to replace these.[7]

[7] Similar warnings against the overestimation, mechanization and dilettante employment of tests are to be found in Myers (21), Bobertag (40), and also in Binet's last work (37, pp. 155 ff.). [Cf. also the translator's Manual of Mental and Physical Tests, Ch. 1.]

I. Single Tests and Series of Tests

1. Single Tests

All psychological experiments may be divided, according to their problem, into research experiments and test experiments.
The latter are now generally known as "tests;" their aim is "to determine for a given individual his mental constitution or personality or to determine a single one of his mental traits."[1] Tests include, of course, not only experiments in the narrower meaning of an investigation carried out with the aid of instruments, but also simple methods of procedure that do not involve the use of instruments (questions, problems, presentation of pictures, and the like), provided that these are administered in a systematic and scientifically regulated manner and that their results are recorded.

[1] See my earlier text (1, p. 87).

Now, in no field have so many tests been proposed and put into operation as in the field of intelligence testing. To give a complete exposition of all these test methods and of the results that have been gained through them would exceed the bounds of this monograph. But this is not necessary, after all, because, as will be shown in a moment, the fundamental significance of our whole problem lies not in the single tests, but in the construction of well-considered systems of tests, for which single tests merely supply the raw material. So we shall content ourselves in this part of our essay with a cursory survey without any pretense at all to completeness.[2]

The varied nature of the proposals and test investigations thus far made is due to the fact that the same problem has been approached in very different ways.

(a) For a long time we started from the erroneous presupposition that any psychological method of experimentation would be really usable as a test. It was thought that all that was necessary was to alter the direction, so to speak, of the plan of investigation. When a large number of measurements had been secured by a single method on a few persons in the laboratory, the same method was applied to many persons, but only once or a few times to each of them.
If it turned out from such a mass experiment that the more intelligent persons obtained, all things considered, better average scores than the less intelligent, then it was assumed that the method would answer for testing intelligence. Nearly all the methods that were familiar to the psychological experimenter have been tested out in this way, especially in the earlier periods of investigation by mental tests, e. g., measurements of reaction-time, determinations of the threshold of differential sensitivity in the different modalities, optical illusions, experiments on motor skill or strength, association experiments, tachistoscopic experiments, learning of syllables, etc. In some cases, it is true, numerous results of interest were secured, but it must be admitted that a good deal of energy has been expended to little avail in these experiments.

[2] For all the literature on single tests, see my text on differential psychology (1, 426 ff.); also in Appendix II of that book there is a survey of the relation of the single tests to school performance. Fifty-four different tests, with numerous sub-types, are described, together with their methods and chief results, in Whipple's Manual (28). A very large collection of materials for testing was exhibited by the Institute for Applied Psychology at the Berlin Congress, Easter, 1912, for information about which Lipmann's catalog in the report of the Congress may be consulted. Since the meeting, this exhibit has been made a permanent one and has been assigned a room in the exhibition by the Prussian Ministry of Education of German material for instruction, at Berlin, 126 Friedrichstrasse. The exhibit can be seen at that place by previous appointment with the Secretary of the Institute (Dr. Lipmann, Telephone Potsdam, No. 8).
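The screening logic criticized in paragraph (a) can be made concrete with a small sketch. It is an illustration only: the scores, the "high"/"low" ratings and the function name `group_mean_difference` are invented for the purpose and are not taken from any of the studies cited. The sketch computes the difference between the average score of persons rated more intelligent and the average score of those rated less intelligent, which is the single quantity on which a method was accepted or rejected under this procedure.

```python
from statistics import mean


def group_mean_difference(scores, ratings):
    """Return mean score of the 'high' group minus mean score of the 'low' group.

    scores  -- one test score per person (hypothetical data)
    ratings -- 'high' or 'low' for each person, e.g. from a prior estimate
    """
    high = [s for s, r in zip(scores, ratings) if r == "high"]
    low = [s for s, r in zip(scores, ratings) if r == "low"]
    if not high or not low:
        raise ValueError("need at least one person in each group")
    return mean(high) - mean(low)


# Invented example: six persons, one score each on some single test.
scores = [12, 15, 9, 14, 8, 10]
ratings = ["high", "high", "low", "high", "low", "low"]

diff = group_mean_difference(scores, ratings)
# A positive difference was taken as a sign that the method "would answer".
print(f"difference of group means: {diff:.2f}")
```

A positive difference shows only that the groups differ on average in this one sample. It does not show that the method brings intelligence into operation, which is the weakness the text points out.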
(b) A significant advance was made when it was finally recognized that this blind probing about could not lead us farther, that, on the contrary, tests of intelligence must be definitely selected on the basis of certain presuppositions that were to be made concerning the nature of intelligence. Investigators, therefore, sought then for exact methods of experimentation that would bring intelligence into direct and manifest operation. To be sure, the problem was at first conceived of in a still too simple form, in that intelligence was thought to be exhibited as a definite clean-cut mental phenomenon and the plan of testing was directed to the examination of this assumed special phenomenon.

The best-known instance of this is the so-called 'combination method' of Ebbinghaus, now better designated as the 'completion method' (5). In Ebbinghaus' view, every true instance of intellectual ability may be reduced in the last analysis to an act of 'combining,' i. e., to a process of synthetizing conscious contents that previously had been present separately; accordingly, he invented that method in which the subject of the test is to supply the correct connections between the separated parts of a text in which gaps have been introduced.

This principle of combination or completion has been used by many other investigators as a basis for various forms of test. Thus, Ries (78) used two tests to measure the ability to bring two terms into a logical relation. A. Pairs of words were presented that had a logical connection, e. g., fire-smoke, flood-need; then a test was made whether the naming of the first member of the pair reinstated the second by dint of the logical connection. B. Single words were given to which such words were to be adjoined as would form a causally connected pair.
A similar method is that of Winteler, in which a term is to be named that is superordinate, subordinate or coordinate to the word given. The combination test of Masselon, in which a meaningful sentence is to be made from three given words, has been extensively used. Recently, Meumann (16) has elaborated this method in a special fashion; he presents words so chosen that they can be joined in a sentence either in a banal and logically rather crude way or in a logically pertinent way, e. g., ass, blows; poor solution: "The ass receives blows." Good solution: "The lazy ass receives blows." The tendency toward the former or the latter rendition is taken as an index of intelligence.

Heilbronner's picture-test (8, 27) examines ability to complete in the sphere of vision: the outline of an object is shown on a series of small cards and in such a way that there is a progressive development from an initial very fragmentary outline by successively more detailed stages up to a complete picture of the object. The idea is to find out at what stage of incomplete delineation the object can be recognized. To this class of tests belongs also the fitting together of cut-up pictures (method of the Russian alienists, Bernstein and Rossolimo).

Other psychologists, however, have considered other and quite different mental functions to be the touch-stone of intelligence. Thus, in an earlier stage of his work Binet (2) believed that the essence of intelligence was capacity to adjust attention: for this reason he used tests of attention, like the cancellation of letters in a specified text (the Bourdon test), the copying of sentences, the esthesiometer (Binet regarded the discrimination of two near-lying compass-points as a phenomenon of attention, not of sensation), the sorting of cards containing the alphabet, or numbers, etc.
In the work of Meumann (14) we note at times the laying of a certain one-sided emphasis on the understanding of the abstract as being the root of intelligence. This was why he specially recommended the use in testing of the retention of abstract words. Quite a number of investigators have directed their attention particularly to capacity to apprehend as the index of intelligence, and hence have preferred to use for tests such things as the apprehension of pictures or ability to perceive linguistic material of different contents and extents.

(c) We may consider as a third main class of tests those patterned after familiar pedagogical tasks. There are, indeed, certain school activities that admit of relatively precise grading, since they can be rated both in terms of quantity (amount done within a given time) and in terms of quality (frequency of mistakes). Those schoolroom tasks are most obviously adaptable for psychological purposes within which the course of activity is fairly homogeneous, e. g., the computation of specified arithmetical problems, writing from dictation, committing to memory of vocabularies and poems, and all these tasks have, in fact, been used for testing intelligence. Evidently a chief objection to this method is that the activities mentioned are dependent to a large degree upon external conditions of the instruction, so that the intelligence of individuals that are working, or that have worked, under different school conditions cannot be subjected to comparative tests by means of these activities.

(d) A fourth main class of tests is still farther removed from the precision of the laboratory experiment, but is thereby more nearly allied to real life. These tests aim to secure records of such evidences of intelligence as are accepted in ordinary life as special evidence of it.
These direct tests of intellect have been specially developed by the psychiatrists: they comprise such things as defining, comparing, differentiating, the understanding of proverbs, grasping the point of a joke, seeing absurdities in verbal or pictorial presentations.

These tests have the advantage that in them intelligence is undoubtedly much more directly operative than in the others: but on this account it is impossible in most of them to scale the results: they are "alternative tests," that admit of but the rough differentiation into right or wrong (+ or −). The single test of this sort, therefore, does not make it possible to secure any very precise characterization of the person tested, or to rank him in a scale.

2. The Inadequacy of the Single Test

A critique of all these confusingly many attempts might be undertaken by examining them, test by test, to see which ones deserve to be recommended as indicators of intelligence. But we feel that far more important than such a special scrutiny of single tests is the laying of emphasis upon a general critical position: no single test, no matter how good it may be, should ever be made the instrument for testing the intelligence of an individual.3 For the single test tests, on the one hand, more and, on the other hand, less than it really ought to test.

More, because the mental activity that is aroused in a subject by an experimental task, a test-question, or the like, is the fused resultant of quite varied antecedent conditioning factors: and we do not know what share that particular conditioning factor that we call intelligence played in the performance. In this equivocal nature of the object under investigation lies the too often little noted distinction between tests and laboratory experiments.
If I arrange an investigation of memory in the laboratory, I know that I am actually examining memory and not something else, because in numerous single experiments I vary in a measurable way certain conditions only of the function of memory while I keep all the other conditions constant. But when, on the contrary, I administer a test of learning or a test of immediate memory by itself to a person, the outcome is affected by the real capacity of retention, understanding of the material, attention, interest, etc., all without control, and this quite regardless of the disposition of the subject at the time. Or, take another example: suppose that a subject has made a good record in the filling out of gaps in a text (Ebbinghaus' completion test), does this good performance depend predominantly upon a real capacity for logical combination? Or upon a specially large vocabulary? Or upon a fine feeling for language? Or upon practise in guessing riddles?

Footnote 3: Cf. Binet (36, p. 201): "One test has no meaning, but five or six tests do mean something. * * * The attention of psychologists must, then, be called especially to this principle of the multiplicity of tests."

The only way to analyze out from this fused resultant the ability we are after (in this case, for instance, the ability to effect combinations) is obviously to add several more tests of a different kind that will also involve the process of combining, but that will in addition involve mental processes of quite different sorts. Correspondences that may appear in the results of these different tests may then be ascribed with probability to their common factor, in our example, to the ability to effect combinations. The ability sought for must, therefore, be plotted, as it were, from different positions.

Too little.
But suppose that we have succeeded in determining a subject's ability to make combinations not by a single test but by a smaller number of different 'combination' tests, have we then measured his intelligence? By no means, for we have now determined far too little. Intelligence, it is to be noted, means an all-round ability; it refers to the general mental attitude toward new demands, and combining is only one side of this attitude. The other sides possess equal significance, e. g., the grasping by consciousness of a newly presented object (apprehension, apperception, understanding), the dividing of a whole into its parts (analysis), the taking of an intellectual attitude toward a content (judging, criticizing, deliberating, and deciding), etc.

These functions of intelligence must, then, be considered in their totality; and the actual testing of them ought not to be omitted unless we were certain that they had already been examined by implication along with some other tested function. Suppose that in a group of persons it had been possible to show that X had the best ability to combine; is it then certain that he would also take first place in other forms of activity involving intelligence and that he might, accordingly, be ranked first in total intelligence?

To ask this question is enough to insure a negative reply. I feel, I admit, that Spearman (75, 77, 80) is right in asserting that intelligence does really signify a general capacity which colors in a definite way the whole mental behavior of an individual. But we must not force this idea (nor does Spearman) so far as to assume that all the separate constituent functions of intelligence in the different fields are mechanically of equivalent degree.
Such a view is, indeed, contradicted by the circumstance that there is operative in each individual bit of behavior not only a given quantity of intelligence, but also the special quality of intelligence of the person tested, and besides these a varied number of other mental traits. Thus, there are persons who have a pretty high grade of general intelligence, but who manifest it much better in analytic and critical than in synthetic work; again, there are persons in whom the receptive activities of intelligence (apprehending and understanding) are superior to the more spontaneous activities, and so on.

However, everyday life shows that we can disregard these qualitative differences and nevertheless may characterize the general grade of intelligence that a man possesses. When we do this we make, even unconsciously, certain compensations: two persons may have an intelligence of the same value, but of somewhat different kinds. In tests there must be introduced a kind of systematic compensation like this. We must test the different phases of the activity of intelligence and seek to construct a general picture of the degree of intelligence from the different results, partially accordant, partially variant as they will be.

This has given us a clear idea of what is wanted in the methodics of intelligence testing. Negatively, it must be declared that the method of isolated tests, the idea of basing everything on a single test, is methodologically no better than such a procedure as judging the total character of a man on the strength of the single arbitrarily selected symptom of his handwriting (graphology).
Positively, three things are evident: first, series of tests must be arranged that will set in play the various constituent functions of intelligence; secondly, for this purpose there must be a wise selection of tests; out of the immense number of possible tests only those should be chosen that afford a decided and a reliable symptomatic value, general applicability, and possibility of objective evaluation; thirdly, there must be created a system by means of which the several particular results of the testing can be united into one resultant value, i. e., a value that shows the grade of intelligence of the subject objectively in an inclusive formula in which performances of different degrees of value shall in some way be compensated.

3. Series of Tests

The first of these positive requirements has already been met for a long time since; in particular, since Rieger, numerous test series have been used by the psychiatrists for testing intelligence. These series have been based as a rule upon a psychological schema, though this schema has varied a good deal from one investigator to another. For illustration two such series may be mentioned, both of them quite recent.

Sommer (26), in an article just published on the methods of intelligence testing, discusses in order the materials for testing the following aspects of the problem: relation of memory, of school information, of arithmetical ability and of association to understanding, also attention, capacity to apprehend, completeness of complexes, analysis of complexes, redintegration of complexes, mechanical knowledge (cleverness), constructive knowledge, logical subordination and superordination, notion of cause and effect, intellectual interests, understanding of the environment.
Ziehen (30), in the last (3d) edition of his Prinzipien und Methoden der Intelligenzprüfung, makes the following classification: retention, development and differentiation of ideas (generalization, isolation and complexion of ideas), reproduction and combination, and describes the numerous forms of questions and tests used in his clinic for each of these divisions.

Although one cannot deny that these and other series devised by psychiatrists are quite comprehensive, yet they are open to criticism in other respects: the second and third of the requirements that were laid down above are met by them only partially or not at all.

For all these series give the impression that the selection of tests may have been more a matter of chance or arbitrary choice than something determined by actual gauging of their value. As a rule the selection was based upon a priori reasoning that a certain capacity, e. g., retention or combination, which had been assumed to belong to intelligence was 'hit' by a certain kind of test. Very seldom was any actual preliminary investigation made to see whether this particular test was really superior to so and so many others by virtue of the precision, constancy and significance of the particular values that it afforded. Moreover, this chance selection evidently explains why there is so little agreement between the test series of different investigators: every psychiatrical clinic has its own special method of testing intelligence; every specialist in nervous diseases, every physician in charge of classes for subnormals chooses his tests to suit his fancy, and thus it has been impossible, so far, to effect any real comparison, corroboration and standardization of the results of different investigators.

Finally, the usual psychiatrical test-series suffer from the lack of any principle by which to summarize the results in a single value.
The psychiatrists recognize that it is possible to set a value on the intelligence of a person as a whole, for they apply such predicates as "poor in judgment," "mentally feeble," "imbecile," "idiotic"; but if we watch the way in which, in the individual case, they arrive at the general conclusion that they draw from the data of their test-series, we note a yawning gap. The mosaic of test results is, and remains, only raw material; no fundamental methodological principle, but only intuition, routine and subjective estimation of their results, dictates the final decision concerning the intelligence of the subject. In a certain sense, there is an advantage in deciding in this way, for the gift (wellnigh an artistic gift) of intuitive appreciation and sympathetic understanding is peculiarly indispensable to the psychiatrist. But if we leave [...] there remains a decided disadvantage, because every conclusion arrived at by this method then remains a subjective one that cannot be controlled or subjected to generalization. On this account we are justified in demanding that, at least in addition to this intuitive diagnosis, there should also be a method for making an objective evaluation of the results. To meet this demand these mere collocations of tests will have to be replaced by a closed system of tests which will permit the derivation of a final general index of intelligence from the results obtained from any subject whomsoever, and that in accordance with prescribed rules that can be applied in a comparable way in all places and on men of different grades of intelligence.

An alienist has come forward lately with an attempt of this sort, i. e., an attempt to join together a series of tests systematically so as to furnish a 'picture' of an individuality.
I refer to the so-called 'profile-method' of the Russian, Rossolimo (23-24a): a method that really includes more than mere tests of intelligence and comes, therefore, but partially within our scope. Rossolimo has contrived ten tests for each of ten different mental functions. The results obtained from the single subject are set out graphically by erecting ordinates corresponding to the number of the tests achieved for each of the functions under test. The ends of these ordinates are then joined to make a curve that Rossolimo calls the 'individual profile.' This profile line is supposed to furnish a pictorial representation of the total nature of a patient. Thus, for instance, in those disorders in which the capacity of immediate reproduction is decidedly reduced while the other capacities remain unaffected, the profile will show a sharp notch at a definite point, and so on.

The tests proposed by Rossolimo have many commendable features; we may note, for example, the little puzzles, like the separating of two interlaced wire nooses, etc., that are used to test technical ability. Yet, on the whole, the principle of the construction of the profile is too superficial and the coordination of certain tests to certain mental functions, e. g., to volitional acts, is not precise enough to allow us to hope for much success.

This demand for a system of tests presents such an exceedingly difficult scientific problem that it is perfectly evident that alienists and educators can not solve it as a side issue of their professional work, but that psychology itself will have to undertake the task.

It is interesting in this connection to note how psychology attacked the problem along two very different lines. I feel that it is important to consider them separately in what follows.
Neither of these two lines of effort should be regarded as the only correct one; each method has its advantages and its disadvantages, and, what is particularly important, each has its special aim for which it is fitted. The method of age-gradation of Binet and Simon permits of a rough gradation of intelligence for the whole range of development of the child; it is for use in a comparable manner with children of different ages, of different nationality and cultural level, with normal and with feeble-minded children of all grades. The method of rank correlation, on the other hand, is limited thus far to a comparison of the members of a small homogeneous group, but renders it possible to test the gradation of intelligence within this group with a precision that the Binet method can not approximate. A considerable amount of material is already available for the first of these methods, and we shall have to deal with it at some length for that reason. With the second method, on the contrary, our discussion will center upon the outlook for its future development.

Both methods have been tried out so far almost exclusively upon school children; but it is to be expected that they will find use also in testing the intelligence of adults, both normal and mentally deficient.

II. The Method of Age-Gradation (Binet-Simon Method)1

1. The Principle of the Method and the Tests Employed in It

In the nineties Binet and Simon conceived the idea of constructing a "graded scale of intelligence" (Échelle métrique de l'intelligence) that should be especially planned for testing the intelligence of children. The requirements to be satisfied by the method were the following. A series of tests should be found for each year of childhood the passing of which could be considered normal and typical for children of just precisely this age.
The tests must be relatively uninfluenced by external and chance conditions, especially by school learning, so that the result might bring out as purely as possible the real mental endowment of the child; they must admit of as uniform use as possible in different nations, languages or grades of culture; they should be easy to carry out, not necessitate laboratory apparatus or instruments of precision, should not exact too much time of the child, should not impose hardship on him or tire him, and yet must possess sufficient accuracy to make possible comparison and checking of the investigations undertaken by different persons; and, finally, they should make it possible to work out a final value for each subject tested that could be deemed a measure of his general intelligence.

Footnote 1: Comprehensive descriptions of his method are given by Binet (partly in conjunction with Simon) in references 33 to 37. A general review of the development of the method is given by Bobertag (39). [Also recently by Meumann, Arch. f. d. ges. Psych., 25: 1912 (Literatur, 85 ff.). Translator.]

It seems, at first blush, as if the fulfilling of so many different demands would raise insurmountable difficulties. Above all, there was no preliminary information available as to what intellectual performance might be expected, even approximately, from a child of a given age. If some time you ask a teacher or some one who has been dealing with children of different ages for a long time at what age a child could be expected to give correctly the difference between two designated objects, e. g., wood and glass, and at what age he would be able to explain the difference between two abstract concepts, e. g., lies and mistakes, he would either be silent or make a blind guess at it. Here, then, was virgin land to explore.
When to that are added the conditions that have just been stated, many of which are hard to reconcile with one another (freedom from school training, general ease of application, brevity, precision, possibility of quantitative evaluation), there can be no doubt that there was laid down here one of the hardest problems that applied psychology had set for itself up to this time.

Nevertheless, the difficulty has, in principle, been overcome. Of course this does not mean that the present form of the method can be regarded as a final form: it will doubtless suffer so many modifications in the near future that it will hardly be recognized in the end. But we know that we are on the right track, and in some future decades it can be fully appreciated what praise Binet and his co-worker Simon have deserved by directing us along this path.

A short time ago, on October 18, 1911, the gifted and highly esteemed creator of the method died. His all-too-early demise, that we mourn most bitterly, compels others now to pick up the threads that he had spun. At such a moment it is appropriate to summarize briefly what has been gained and to point out the steps that are to be taken for further advance.

After many years of preliminary empirical investigation to determine what tests might be considered normal for given ages, Binet and Simon published (33) in the year 1908 the first complete account of their system of tests. It comprised a series of from five to seven tests for each age from three to thirteen years. A revised draft appeared in 1911 (35, 36) in which many tests are modified, many shifted to different age-years and the number of tests for each age-grade brought uniformly to five. The 1911 system replaces tests for 11, 12 and 13-year-olds by tests for 13 and 15-year-olds and adults.

A list of all the investigations conducted on the B. S. tests to date is given in the bibliography at the end.
In the appendix there are brought together in comparative form the series of tests proposed for each age by Binet and Simon in 1908 and 1911, by Bobertag, and by Terman and Childs.

As a glance at the list of tests shows, almost all of them are of the alternative type, i. e., they are tests in which performance can not be graduated, but can only be scored right or wrong (+ or −). Failure to reply at all is counted 'minus' just as much as an expressly given wrong answer. It must be admitted also that it is often quite hard to decide in a given case whether to rate an answer + or −: the only way to do this with certainty is to practise for a long time and to observe uniformly the criteria that have been chosen for the decision.

The tests are extremely varied in nature. Memory is tested, on the one hand as immediate memory for digits and sentences of different lengths, for a story that is read, and for three simple orders given together, and on the other hand as possession of simple everyday knowledge (days of the week, months, coins, right and left). Size and availability of vocabulary is determined by the number of words that can be named in three minutes. Since 1911 a test of suggestibility (judgment of line-lengths) has been introduced. Motor abilities are tested by some tests of drawing from copy, paper cutting and writing. Practical accomplishments are involved in counting coins, making change for a larger coin, executing the three commissions just mentioned.

Most of the tests, however, aim more directly at intellectual activities. Comparison and discrimination are dealt with in various forms, e. g., sensory comparison (of small boxes of like appearance, but unlike weight), logical discrimination from memory, both between concrete terms (wood and glass, fly and butterfly) and between abstract terms (lies and mistakes); esthetic comparison (drawings of beautiful and ugly faces).
There are also tested defining of both concrete and abstract terms, the completing of omissions in a text, the combining of three words into a sentence, orderly arranging both of sensory material (putting five little boxes in order according to their weight), and of logical, verbal material (placing jumbled-up words in a sentence); the intelligent apprehension of a picture; critical apprehension, both optical (noting omissions in drawings of persons), and logical (recognizing inconsistencies in certain sentences); practical moral intelligence (by questions in the form: 'What's the thing to do when so-and-so happens?').

Many of the tests recur in different age-levels in such a way that the standard of performance demanded is varied. Thus, the pictures are presented to subjects of all ages; enumeration of the pictured objects corresponds to the 3-year old level, a description of the action that the persons are carrying on, to the 7-year old level, a comprehension of the total meaning of the picture to the 12-year old level. The defining of concrete terms appears in the 6 and the 9-year stages; in the former, definition in terms of use suffices, e. g., "What is a horse?" "To ride"; in the latter something superior to this is demanded, e. g., "What is a horse?" "An animal." Finally, the memory span tests for digits and sentences are graded into several classes according to their length; thus, after once hearing the digits, the 3-year old child should be able to repeat two, the 4-year three, the 7-year five, the 12-year seven digits.

The individual tests are of unequal value. Many are of exceptional merit, e. g., defining, describing pictures, answering questions that put a premium on intelligence. It is also a very meritorious feature that there are tests among them whose solution does not depend on readiness in the use of speech, e. g., the arrangement of the five weights, esthetic comparison, recognizing omissions in pictures: we are as a rule altogether too much inclined to identify control of verbal expression with intelligence, an inference that is often false. Others of the tests, however, are more dependent than we could wish on external, particularly on home influences, e. g., knowing coins, or are too much mere functions of pure mechanical memory (reciting the days of the week), so that it would be better to supplant them by others in the future. It must be recognized that any change in the selection and arrangement of these tests presents a difficulty of quite another sort than as if they were mere collocations of tests: for, since each of these tests is a factor in the determination of the final score, it is possible that a change may destroy the equilibrium of the whole system. This is easy to be seen in the supplementary investigation of Binet and Simon themselves, when they tried to correct their system by the omission, insertion and transference of particular tests: for trials, e. g., those of Terman and Childs and of Chotzen, have shown that the second edition (1911) is in many respects less useful than the earlier form (1908). What remedies can be devised for this situation will be discussed below (Section 5a).

The technique of the Binet-Simon method is by no means so easy as the simplicity of the material used would lead one at first to suppose. It is to be recommended that, so far as is in any way feasible, the examiner should always do his work with the aid of an assistant to keep the record, so as to avoid the undesirable division of attention between testing and recording. Both of these experimenters must have gained a high degree of practise and be well used to one another before they proceed to actual testing.
The examiner must have an almost mechanical exactness and uniformity in the formulation of the continually recurring questions, in the modulation of his voice, etc., yet he must be prepared for the many individual variations that appear in consequence of different reactions of the subjects, and must have definite measures in readiness for use in these junctures. Never must he permit it to be seen that some answers are more, others less satisfactory to him: rather must he maintain an attitude of uniform and quiet friendliness. The recorder should not confine himself to the mere noting of plus and minus signs to show the net outcome of each test, but should also note down as fully as possible what the subject says and also such features of his behavior and attitude toward the tests as are worth noting. This is necessary both because it is often impossible to decide whether to credit a test 'plus' or 'minus' until later on, after quiet consideration (and the material must be available for that) and also because it should make possible a qualitative analysis of the examinee.

The individual subject ought, of course, to be tested not only with the tests of his age, but also with a considerable part of the whole series, on account of the area of scattered distribution to be discussed in a moment. The examiner should begin with tests that are neither too easy nor too difficult, avoid monotony and introduce short pauses if fatigue becomes noticeable. The testing of a single individual takes, for normals, 20 to 30 minutes, according to age and circumstances, for abnormals from one-half to three-quarters of an hour, on account of the slower response.

In mass experiments there is a source of difficulty in the possibility of communication between those already tested and those to be tested.
It is true that the danger of such a 'psychic infection' is not very great, on account of the peculiar character of the material used for the testing; nevertheless, one should avoid, as far as may be, the possibility of any spreading of information. Thus, for instance, it is not advisable to test the pupils of one class on several days in succession. If it is desired to examine a rather large number of children that belong in the same group, the plan followed at Breslau seems useful: four experimenters (with their clerks), all of whom had been trained to conduct the tests in the same way, carried on tests the same afternoon in different rooms. Each experimenter could deal with four or five subjects in this time, and each subject was obliged to go home directly after his examination; in this way, 16 to 20 members of the class were tested without there being possible any exchange of ideas between them.

For all further details of the technique of these tests the directions for using them that are already available for different nations must be consulted. Such directions have been given for the examinations of French children by Binet and Simon in 1911 (35, 36), for English and American children by Whipple in his Manual (28), by Wallin (67) and more briefly by Huey (9), and for Italians by Treves-Saffiotti (66). For use in Germany Lipmann first followed the original instructions as precisely as possible, and then Bobertag (40) described very fully his elaboration of them as based on practical tests, an elaboration that differs from Binet and Simon to advantage in some particulars, e. g., in the choice of pictures. The extended directions for testing and questioning that Bobertag has prepared should form the basis of all future investigations in Germany.2

2 The simple set of materials needed for carrying on the German tests, after Bobertag (lists of questions, tests of memory span, pictures, set of small boxes for weights, etc.), may be had of the Institute for Applied Psychology at Klein-Glienicke.

2. The Resultant Values: Mental Age, Mental Retardation, Advance, and Arrest; Mental Quotient

We must now pass on to note how the grade of intelligence of a subject can be derived from his performances in the tests.

Considering the problem schematically, we might think that the grade of intelligence could be expressed by the stage whose tests could just be passed by the child: a subject who readily passed all the tests up through the 9-year ones, but failed with the 10-year and subsequent ones, would, accordingly, possess a nine-year grade of intelligence. But things are never quite so simple in actuality as they are in theory. The varying tests of any given age-level (we may call them a, b, c, d, e) are not all equally difficult for all children, but there are, on the contrary, quite remarkable individual variations. One child passes a to d, but fails with e; another passes a, c and e, but not b and d. This is due in part to momentary fluctuations of attention, fatigue, etc., that must, of course, always be reckoned with, but in part also to qualitative differences in intelligence. The correlation between the different phases of intellectual functions is truly never so high that a positive accomplishing of test a must necessarily entail a like accomplishing of the approximately 'equally difficult' tests b, c and d.
And so it comes about that there is no hard and fast boundary between the age-level that a child passes completely and the levels that are unquestionably beyond his powers; rather is there an intermediate territory of greater or less extent within which successes and failures are scattered in irregular fashion: we shall call this the area of irregularity (Gebiet der Staffelstreuung). It is impossible to derive a mean or average value from the data afforded by this area without proceeding in a somewhat arbitrary manner, but the formula proposed by Binet and Simon seems to have answered very well so far.

According to it, one first ascertains up to what age-level the tests are passed without failure (save that possible failure with a single test is not counted, because such failure may have been due to a momentary lapse of attention). This age-level is taken as the basis, but every five tests passed in levels above it are counted as one more year. If, then, a child should pass all tests (save a single one) to and including the six-year level and in addition three tests each in the 7th, the 8th, and the 9th year and one test also in the 10th year, these ten additional tests would be counted as two years, and the child would obtain for the net value of his intelligence 6 + 2 years, i. e., his intelligence would be rated as that of an 8-year old child.

This net value in terms of which the total intelligence of the subject is graded has, therefore, the significance of an age-designation: it indicates that the intelligence of the child tested is equivalent to the average intelligence of the children of the age stated. We thus arrive at the concept of mental age (Intelligenzalter, niveau intellectuel), which is the cardinal feature of the method of graded tests.
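The Binet-Simon rule just described can be sketched in a few lines of Python. This is only an illustration: the function name `mental_age`, its parameters, and the choice to ignore any leftover tests that do not make up a full group of five are assumptions, since the passage does not say how a remainder is to be treated. The basal level is supplied by the examiner, who has already applied the one-failure allowance.

```python
def mental_age(base_level, passes_above, tests_per_year=5):
    """Mental age by the rule in the text: the basal age-level plus one
    year for every five tests passed in the levels above it.

    base_level     -- highest age-level passed without failure (a single
                      failure being disregarded), as judged by the examiner
    passes_above   -- mapping of higher age-level -> tests passed there
    tests_per_year -- tests counted as one year (five, per the text)

    Assumption: tests left over after grouping into fives are ignored.
    """
    extra_tests = sum(passes_above.values())
    return base_level + extra_tests // tests_per_year


# Worked example from the text: basis at 6 years, three tests passed in
# each of the 7th, 8th and 9th years and one in the 10th (ten in all).
example = mental_age(6, {7: 3, 8: 3, 9: 3, 10: 1})
print(example)  # 6 + 2 = 8
```

Keeping the basal level as an input, rather than deriving it from raw pass and fail marks, leaves the judgment about the disregarded single failure with the examiner, where the text places it.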
Now mental age must not, of course, be thought of as an absolutely unequivocal determination of a subject's intelligence, but only as a very rough quantitative characterization of its value, without any implication as to qualitative differences, because one and the same mental age can be figured from the most varied sorts of distribution of passed and failed tests. But this very thing appears to constitute an advantage, rather than a disadvantage, of the concept of mental age, for it gives expression to a fundamental psychological fact (already mentioned above) that, on account of the purely formal character of intelligence and the lack of complete correlation among its constituent capacities, there never is a real phenomenological equivalence between the intelligence of two persons: what we do have is rather a teleological equivalence, when measured in terms of the single function of all intelligence, namely, adaptation to new requirements. And for this equivalence of two intelligences mental age furnishes an approximate measure, despite the fact that their equivalence does not mean their identity.

The area of irregularity yet further affects the computation of mental age, and in a way to which sufficient attention has not always been given. In order to equalize possible omissions in the lower test-levels, one must always have at one's disposal tests in higher levels. Now the original Binet-Simon series comprised tests up to 13 years only: it follows that mental age 12 or 13 cannot be correctly computed, for tests from yet higher levels might perhaps have raised the total performance to a higher value. In using the 1908 Binet series, accordingly, computations ought to be carried up to mental age eleven only.

The area of irregularity, again, affords another value in addition to mental age, viz.: the range of irregularity (Streuungsbreite).
A child whose successes and failures are strewn irregularly over test-levels from 6 to 10 years has the same mental age, to be sure, but a very different range of irregularity, when compared with another whose mixture of successes and failures lies in the 7th to the 9th years only. Bobertag, who first gave attention to the importance of differences in ranges of irregularity, has devised a way of computing this factor; I have myself suggested another way, but neither has been published as yet.

Yet, even with these methods, qualitative differences in the area of irregularity are not touched, and for this reason it will be necessary in many cases to enter into a detailed analysis of the testing as well as to state the two resultant values (mental age and range of irregularity). It will often be distinctly worth while to determine in which tests there was special difficulty, in which special success. Moreover, the value of observing the child during the testing must not be underestimated, for in many of the tests there are ways of setting about the task that may be of great interest (and for medical or pedagogical judgment of the case, too), though these things would not be evident in the mere plus or minus set down for the outcome of the tests. We may allude, in this connection, among other things, to the kind of description given to the pictures, to the enumeration of the 60 words, as well as to the behavior of the child when he arranges in order the five weights of like appearance but unlike weight. In this last it is not nearly so important whether the child finally gets the order right as it is to observe the child's manner of going to work: whether and how quickly he grasps the unaccustomed problem, whether he compares just two weights each time, or compares each weight with all the others when he puts it in place, or what not.
In these investigations we should be warned, then, against the bare pursuit of numerical values: computation of such values and qualitative analysis must supplement one another, though, naturally, now the former and now the latter will receive special stress, according to the setting of the problem.

But let us return to mental age. The full significance of this final value is disclosed only when we consider it in relation to other circumstances. It can evidently be related to other quantitative scales, like chronological age, school grade and school standing, or we can find out how it varies with certain qualitative conditions, like social level, type of school, nationality and the like.

Doubtless most significant is the relation of mental age to the actual chronological age of the subject, for, as already said, a certain mental level goes normally with a certain age, so that the relation of mental to chronological age indicates the amount of discrepancy between the intelligence present and that required (in the sense of a norm to be expected), and in this way affords an expression for the degree of the child's intellectual endowment.

Up to now this discrepancy has always been computed in the simple form of the difference between the two ages, which, when negative, gave the absolute mental retardation, when positive, the absolute mental advance of the child in terms of years. Thus, if mental retardation = -2, the child's mental development is two years behind the normal level of his age.

It is perfectly clear how valuable the measurement of mental retardation is, particularly in the investigation of abnormal children. It has, however, been shown recently that the simple computation of the absolute difference between the two ages is not entirely adequate for this purpose, because this difference does not mean the same thing at different ages (compare what is said in Section 4a, pp. 70 ff.). Only
Only 42 PSYCHOLOGICAL METHODS OP TESTING INTELLIGENCE when children of approximately equal age-levels are under investigation can this value suffice: for all other cases the introduction of the mental quotient will be recommended farther on (cf. pp. 80 ff.). This value expresses not the difference, but the ratio of mental to chronological age and is thus partially in- dependent of the absolute magnitude of chronological age. The formula is, then : mental quotient = mental age -r- chronological age. With children who are just at their normal level, the value is 1, with those who are advanced, the value is greater than unity, with those mentally retarded, a proper fraction. The more pronounced the feeble-mindedness, the smaller the value of the fraction. Another and last concept that * mental age' sup- plies is that of mental arrest. This applies only to feeble-minded individuals and means a mental age that is not exceeded, despite increase of chronolog- ical age. 3. Results with Normal Children The investigation of normal children forms a pre- condition of the whole method, since the norm for each age must first be determined upon them. Yet, at the same time, investigations of these children have already brought out a series of results that permit us to set no slight value on the future worth of intelligence tests for the problems of normal peda- gogy. Thus far, tests have been made chiefly upon children in the common schools of both sexes and of different ages, less often upon pupils of the higher schools. THE METHOD OF AGE GRADATION 43 (a) General distribution of the level of intelli- gence. In those investigations where there have been tested a large number of elementary school pupils of different ages and with no attempt at spe- cial selection, there could be worked out general sta- tistics of the number of children that are at, above or below the mental level of their age. I bring to- gether in the following table the percentages thus far obtained. TABLE I. 
DISTRIBUTION OF THE LEVEL OF INTELLIGENCE FOR ALL AGES COLLECTIVELY.

                               Difference, in Years, Between
                               Mental and Chronological Age
                               -2(5)    -1      0      +1     +2(5)
Binet (203 Children)           [figures illegible in the source]
Bobertag (261 Children)3        3      19      52     22.5    2.5
Goddard (1277 Children)4       11      20.5    41.5   21.5    5.5

3 Children from 5 to 10 years old.
4 Children from 5 to 11 years old.
5 Includes two or more years below or above age.

Binet (37, p. 112) has brought together a frequency distribution of 203 normal children (ages not given). In this distribution we may note a remarkable symmetry: almost exactly one-half of the children are 'at age,' a good quarter are 'below age,' and a scant quarter are 'above age.' Bobertag has called attention to this peculiarly simple symmetrical numerical distribution that he had noted first in his own results and then found confirmed in Binet.

Bobertag has just published his own frequency distribution. I take from it (40, II, Table I) the figures for 261 children between 5 and 10 years. While here, again, the 'at age' children comprise half of all the cases, the divergence between the two other groups is but slight: the 'advanced' are somewhat more numerous than the 'retarded' children.

A third set of data, derived from a much more extensive material, has been given us by Goddard (48), who has tested all the school children of a small American city (Vineland, N. J.). The distribution curve that Goddard has prepared from his raw figures is certainly not usable, because he has included in it also the age-levels of 12 years and over, i. e., children for whom no more adequate tests were available from higher levels; the data for these subjects must therefore necessarily be thrown out. If we bring together only those children whose area of irregularity is of satisfactory scope, children, then, between 4 and 11 years old, we shall have 1277 children, and it is their percentual distribution that I have computed.
In it the percentage of children 'at age' is somewhat less, that of children 'above age' is approximately the same as in Bobertag, while that of children 'below age' shows a plain, though not very large increase.

When we stop to think that we have to do in these three investigations not only with children of different nationality, but also with different examiners, each of whom had his own way of setting the tests and of evaluating them, we cannot lay great stress upon what discrepancy exists between the three sets of statistics: we may conclude from them that when a sufficiently large number of non-selected children of different ages are tested, their degree of intelligence will be distributed in a somewhat symmetrical fashion. Approximately one-half (in America a scant half) stand at the level of their age; about a fifth (to a fourth) are a year retarded and the like number a year advanced; only a small percentage (at the most 11 per cent.) show more than one year of retardation and a still smaller fraction (at the most 5.5 per cent.) is mentally advanced by more than one year.

One must be careful not to regard the 'at age' child and the 'normal' child as synonymous terms: on the contrary, the statistical results themselves show that the 'at age' children simply constitute the middle section of normality, while the children that are one year advanced or retarded are still completely within the bounds of normality.

It is worth noting that the distribution just cited bears a certain resemblance to the simplest type of distribution known as Gauss' frequency curve, for this latter is not only a symmetrical curve, but it is also divided by the value known as the 'probable error' into three sections, and in such a manner that the middle section comprises the half, and the two end-sections each one-quarter of all the cases.
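The quarter, half, quarter division of the Gaussian curve by the probable error can be checked numerically. The following sketch uses Python's standard-library `statistics.NormalDist`; the variable names are illustrative and not taken from the source, and the unit standard deviation is chosen only for convenience.

```python
from statistics import NormalDist

# Standard Gaussian curve: mean 0, standard deviation 1.
curve = NormalDist(mu=0.0, sigma=1.0)

# The probable error is the deviation that half of all cases fall
# within, i.e. the 75th percentile of the symmetrical curve.
probable_error = curve.inv_cdf(0.75)

# Shares of cases in the three sections marked off by -PE and +PE.
lower_share = curve.cdf(-probable_error)
middle_share = curve.cdf(probable_error) - curve.cdf(-probable_error)
upper_share = 1.0 - curve.cdf(probable_error)

print(round(probable_error, 4))
print(round(lower_share, 2), round(middle_share, 2), round(upper_share, 2))
```

The probable error comes out at roughly two-thirds of a standard deviation, and the three sections hold one-quarter, one-half and one-quarter of the cases, as the text states.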
Even a generation ago Galton advanced the hypothesis that the abilities of a large group of non-selected individuals would be distributed symmetrically in the form of the Gaussian curve. It is true that Galton thought the Gaussian law of distribution could be extended to apply to a very detailed gradation (16 grades) of ability, whereas statistics at present available only make it probable for a few main groups. Bobertag has supplemented our knowledge of this matter by discovering that a similar distribution holds good on other occasions when a fairly large number of individuals is divided into good, medium and poor groups. In statistics of marks pertaining to 2772 pupils, it turned out that marks of "better than satisfactory" were assigned to 25.7 per cent., of "satisfactory" to 50.8 per cent., and of "unsatisfactory" to 23.5 per cent. of all cases (40, II, Table IV).

Yet it is well not to ascribe too great significance to these ratios of distribution. In the first place, the empirical data now at our disposal are not enough to warrant as yet the assumption of a general conformity to law; and even for the data now at our disposal the formula holds only as a rough approximation and merely as a general tendency for a rather large number of cases within which the numerous irregularities compensate each other (compare on this point the next section). Nevertheless, the findings already secured are of sufficient interest to be followed up farther.6

6 Further discussion of this principle of symmetrical distribution and its relation to the Gaussian curve may be found in one of my previous articles (I, 248 ff.) and in Bobertag (40, II).
The principle governing this distribution has, however, heuristic value even now in two ways: (1) When we are obliged to divide a group of persons on the basis of their mental ability into a good, a medium, and a poor group, the convenient and common division into three groups of equal size is certainly less close to the actual gradations than the setting off of a good and a poor quarter from the homogeneous middle half of medium ability. (2) A requirement, e. g., of a test or of a series of tests, may stand as normal for a given group of individuals when approximately 75 per cent. of the members of the group meet it in a satisfactory or more than satisfactory manner. This idea has been used by Bobertag7 for the standardization of tests.

7 See Section 5a for details.

(b) Different age-levels and nationalities. Goddard has thought that the symmetry of the curve of which we have just been speaking might be deemed proof that the Binet-Simon arrangement of tests represents in a way an ideal series, because it has afforded on empirical test a distribution that was theoretically to have been expected. But this conclusion is unjustifiable. The curve of symmetry applies primarily only to all children taken collectively, without regard to age; but the Binet-Simon tests should really embrace normal standards for children of every one of the series of ages, and their correctness would be demonstrated only provided the symmetrical distribution were disclosed for normal unselected children of each single year. But this is by no means the case, and, as a matter of fact, least of all in Goddard's own results.
Rather is it true, as is evident from closer consideration, that the symmetrical curve above mentioned owes its existence to the fact that the varying results of different years practically compensate each other.8 In truth, the results of almost all who have tried out the Binet-Simon method, regardless of the nationality tested, agree that the series set for the lower years are too easy, those for the higher too difficult. The evidence for this, so far as known to me, I have introduced in Table II.

From Goddard's (48, p. 243), Bobertag's (40, II, Table I) and Miss Johnstone's9 raw tables I have computed the percentages of frequency for American, German and English children. Goddard's data I have figured for each year separately; those for the two other investigators by bringing two or three years together, on account of the smaller number of cases (Table II). It will be seen that in the lower years many more are 'advanced' than there should be: in Goddard and in Johnstone the advanced outnumber not only the retarded, but even the 'at-age' children. Thus, for instance, in Goddard more than half the 5-year old children attain a mental age of 6 years or over, clear evidence that these tests are much too easy. In Bobertag the lack of symmetry is not so pronounced. The area of excessive percentage of advanced children (and thus of excessive ease of the tests) extends in Goddard through the 7th, in Bobertag through the 8th year; in Johnstone it has not entirely disappeared even at the 9th year. Then comes a sudden reversal: in the higher years the number of retarded children increases: the tests are therefore too hard.

With the 155 subjects examined by Bloch and Preiss (38) there was made at the outset a selection such that only children of medium ability and school performance were tested. Consequently, retardation in mental age appeared almost not at all, but advance did appear, though in diminishing frequency with advancing age.
From the data given it can be computed that there were above the level of their age a full 50 per cent. of the 7-year olds, 20 per cent. of the 8 and 9-year olds and only 14 per cent. of the 10 and 11-year olds.

8 Ayres (31) calls attention to this point in his critique of Goddard.
9 In Miss Johnstone's original work (52) the quantitative data are not given sufficiently clearly, but these data are given by Binet (36, p. 196), where will be found a table of distribution for 146 Sheffield school girls as imparted in a letter from Miss Johnstone.

TABLE II.
PERCENTAGES RETARDED, AT AGE AND ADVANCED AT DIFFERENT YEARS.

Investigator    Chronological Age    Retarded    At Age    Advanced
Goddard                5               12          35        53
                       6               20.5        30        49.5
                       7               13          58        29
                       8               44          41        15
                       9               40          28        32
                      10               27.5        56        16.5
                      11               56          36         8
Bobertag              5-6              11          60        29
                      7-8               7          48.5      44.5
                      9-11             34          50        16
Johnstone             6-7              12          20        68
                      8-9              20          40        40
                     10-11             62          25        13

In the work of the Americans, Terman and Childs (64), and of Mlle. Descoeudres (46), of Geneva, we find another method of presenting data, but the same result. The first-named tested 396 unselected children and figured the average value of each age; they found that the young children attained a much too high level, the older children a too low average level of intelligence, so that, on the whole, the mental levels were more like one another than were the chronological levels. It follows that the tests fail to bring fully to light the actual differences between the children. Mlle. Descoeudres had tested in all only 24 children of six different ages; the results showed differences of only two to four years in the mental ages of children in the youngest and oldest groups, though the chronological ages differed by six years.

All these findings show, first of all, that the arrangement of the tests set forth by Binet and Simon in 1908 suffers from not inconsiderable errors that must be removed.
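The reading of Table II given above (lower years too easy, higher years too hard) can be reproduced mechanically from the printed percentages. The sketch below is illustrative only: the dictionary `TABLE_II`, the function `verdict` and the simple decision rule (a level counts as 'too easy' when the advanced outnumber the retarded, 'too hard' in the reverse case) are assumptions made for this example, not a procedure given in the source.

```python
# Percentages (retarded, at age, advanced) as printed in Table II.
TABLE_II = {
    "Goddard": {
        "5": (12, 35, 53), "6": (20.5, 30, 49.5), "7": (13, 58, 29),
        "8": (44, 41, 15), "9": (40, 28, 32), "10": (27.5, 56, 16.5),
        "11": (56, 36, 8),
    },
    "Bobertag": {
        "5-6": (11, 60, 29), "7-8": (7, 48.5, 44.5), "9-11": (34, 50, 16),
    },
    "Johnstone": {
        "6-7": (12, 20, 68), "8-9": (20, 40, 40), "10-11": (62, 25, 13),
    },
}


def verdict(retarded, at_age, advanced):
    """Hypothetical rule: compare the two outer groups of one age-level."""
    if advanced > retarded:
        return "too easy"
    if retarded > advanced:
        return "too hard"
    return "balanced"


for investigator, rows in TABLE_II.items():
    for age, percentages in rows.items():
        # Each printed row should account for all the children tested.
        assert abs(sum(percentages) - 100) < 1e-9
        print(investigator, age, verdict(*percentages))
```

Under this rule the turning point falls after the 7th year in Goddard, after the 8th in Bobertag and after the 9th in Johnstone, which agrees with the description in the text.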
Binet himself has recognized these defects, too, at least in part, for he subsequently relegated the tests for 11, 12 and 13-year subjects to higher age-levels.

TABLE III.
AVERAGES FOR CERTAIN YEARS. (TERMAN AND CHILDS.)

Chronological Age    Mental Age
      4.75              6.50
      7.50              8.00
     12.33             11.00

But far more important is a positive result, viz.: the international accordance in the judgment as to special ease or special difficulty of certain test-levels. It is certainly not of minor significance that the 6-year old tests were too easy and the 11-year old as uniformly too difficult, with the 8 and 9-year old apparently forming a between-lying zone, in the case of children in the common schools of America, Germany, France and England, all without exception. That, despite the differences in race and language, despite the divergences in school organization and in methods of instruction, there should be so decided agreement in the reactions of the children is, in my opinion, the best vindication of the principle of the tests that one could imagine, because this agreement demonstrates that the tests do actually reach and discover the general developmental conditions of intelligence (so far as these are operative in public school children of the present cultural epoch), and not mere fragments of knowledge and attainments acquired by chance. And this confirmation of the principle may also lead us confidently to expect that the discrepancies that have been revealed at the same time in some of the details of the system can be obviated in the future.

(c) Children of different social strata. Social differences turn out otherwise than do differences of nationality, for they come out more or less conspicuously in the results of the tests. The task of making comparative investigations by the graded tests of children of different social levels was undertaken in 1910 by Binet (36, p.
187) and by Breslau teachers, simultaneously.

The incentive that led Binet to undertake this problem arose in certain investigations conducted by Decroly and Mlle. Degand in a private school at Brussels (45), the results of which seemed to cast a measure of doubt upon the value of Binet's tests, since the tests turned out, all of them, to be too easy. To be explicit, of 45 children tested, no one was below, 9 were at, and the rest were above the level of their age (13 by one year, 17 by two years, and 9 even by three years).10 Binet now points out, and rightly, that these figures present no argument whatsoever against the value of his tests, but merely afford a positive contribution to the study of the differentiation conditioned by social factors. For all these Belgian children were sprung from the circles of the cultured middle class, whereas the Parisian children to whom the tests were 'fitted' belonged to lower classes. Binet, on this basis, reckons the average difference in mental age between children of the higher and lower classes at approximately a year and a half. Of course, this figure can stand only as a rough approximation; it will vary, particularly at different levels of chronological age, a point to which Binet, unfortunately, does not refer.

10 See the review by Bobertag, Zeits. f. angew. Psych., 5 : p. 205,

Binet, himself, also induced some school directors of his acquaintance to take up this question in Paris. As a matter of fact, children in the superior schools were not considered, and the attempt was made merely to ascertain whether an influence of social environment could be discerned within the common schools. It is to be regretted that these tests were carried out upon but an extraordinarily small number of children.

One investigation (p. 194) that was restricted to a single school came to no result.
In this study there were examined 54 children, classified into four groups on the basis of social status. It may be mere accident that relatively more advanced children were found among the poorest than among the other groups; but at any rate there was no trace of any positive relationship between mental age and social position. Probably, as Binet himself has already pointed out, the social differences present in this study were too small to affect the outcome.

TABLE IV.
DISTRIBUTION OF TWO GROUPS OF 30 PUBLIC SCHOOL CHILDREN EACH.

                        Retarded                    Advanced
                     2 Years   1 Year   At Age   1 Year   2 Years
Poor Neighborhood       1        11       13        4        1
Good Neighborhood       1         3       10       10        6

On the other hand, a clear difference was revealed when comparison was made of two public schools (p. 198), one of which was situated in the poorest quarter, the other in a relatively well-to-do neighborhood of Paris. There were tested from each school 30 children of corresponding ages, selected without reference to their school performance. Table IV shows how much more numerous were the cases of retarded intelligence in the poorer school. Binet figures the average superiority in mental age of the children of the better situated school to be three quarters of a year.11

A question as interesting as it is difficult to answer arises when we seek the causes of these differences in performance. It would obviously be very premature to assume as already positively demonstrated that the intelligence, considered as innate mental ability, was of lower grade in children of the lower and poorer classes. Of course, it is not impossible that this may have been operative as a causal factor. One might, perhaps, assume that the very rise into the higher and better-off classes would itself predicate a certain intellectual selection, and that thus the children of these classes would have come into the world equipped with a superior intellectual endowment.
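Returning for a moment to Table IV, the average difference between the two schools can be recomputed from the printed counts. The sketch below is an illustration under stated assumptions: the names `OFFSETS`, `TABLE_IV` and `mean_offset` are invented for the example, and Binet's own method of averaging is not described in the passage, so the plain weighted mean used here may differ from his.

```python
# Displacement of mental from chronological age, in years, for the
# five columns of Table IV (2 and 1 years retarded, at age, 1 and 2
# years advanced).
OFFSETS = (-2, -1, 0, 1, 2)

# Number of children in each column, as printed in Table IV.
TABLE_IV = {
    "poor": (1, 11, 13, 4, 1),
    "good": (1, 3, 10, 10, 6),
}


def mean_offset(counts):
    """Weighted mean displacement, in years, for one group of children."""
    total = sum(counts)
    return sum(o * c for o, c in zip(OFFSETS, counts)) / total


poor_mean = mean_offset(TABLE_IV["poor"])   # -7/30 of a year
good_mean = mean_offset(TABLE_IV["good"])   # 17/30 of a year
gap = good_mean - poor_mean                 # 24/30 of a year

print(round(poor_mean, 2), round(good_mean, 2), round(gap, 2))
```

The gap comes to 0.8 of a year, which is close to the three quarters of a year that Binet reports.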
But, on the other hand, it must be remembered that no series of tests, however skillfully selected it may be, does reach the innate intellectual endowment, stripped of all complications, but rather this endowment in conjunction with all the influences to which the examinee has been subjected up to the moment of the testing. And it is just these external influences that are different in the lower social classes. Children of higher social status are much more often in the company of adults, are stimulated in manifold ways, are busy in play and amusements with things that require thinking, acquire a totally different vocabulary and a notable command of language, and receive better school instruction; all this must bring it about that they meet the demands of the tests better than children of the uncultured classes.

11 Since Stern assembled this material there has appeared an American study that does not confirm the general principle of environmental influence (J. Weintrob and R. Weintrob. The Influence of Environment on Mental Ability as Shown by Binet-Simon Tests. Jour. of Educ. Psych., 3: 1912, 577-583). The subjects were 210 children: 70 from the Horace Mann School of Teachers College, Columbia University, representing children from wealthy, or at least very well-to-do families; 70 from the Speyer School, representing families of the "comfortable middle class" (wage-earners and small-business men); and 70 from the Hebrew Sheltering Orphan Asylum of New York, who were children springing from a very unfavorable environment. While the relatively small number of cases and the difference of nationality may render the outcome less conclusive, the results from the three institutions "showed very small and inconsistent differences." The original article should be consulted for further analysis of the data. (Translator.)
Presumably, each of these factors, internal and external, endowment and environmental influences, plays a role in the result; but we shall have to wait until very many more extensive investigations have been made before we can secure more exact knowledge of the actual amount and range of influence possessed by the one or the other of them. The way to approach this problem is by special analysis of the data: it will be necessary to find out in which tests the superiority of the children of the cultured classes is particularly evident and which tests are passed with equal facility by children of both classes.

The material for a preliminary comparison of this sort has been drawn by Binet (36, p. 191) from the tables of Decroly and Degand. In them it is very interesting to note that the special superiority shown by better situated children is in those tests that involve thinking in the true sense of the term (apprehension, comparison, criticism, formation of concepts, synthesis, etc.), though, it must be admitted, most of them put a premium on linguistic readiness. The tests here included are: description and explanation of pictures, comparison of two objects, definition of abstract terms, recognition of omissions in drawings, criticism of absurd statements, arrangement of the five weights, naming 60 words in three minutes. To these are to be added certain tests that obviously depend more upon external circumstances, like knowing the days of the week, the months of the year and coins. On the other hand, the tests that Binet designates as revealing social differences only slightly are for the most part those that hinge on school instruction, as copying, writing from dictation, counting backwards, making change, drawing a diamond. Only a single one of the tests that fail to reveal social differences is a real test of intelligence: the completion of gaps in a text.
However, in view of the small number of children that could be used to base these results upon, any generalization of the conclusions from them is to be avoided.

This problem of social differences and their effect upon intelligence leads over directly to certain practical pedagogical principles. We may think, in this connection, for instance, of the demand [in Germany] for the establishment of the 'common' school (Einheitsschule), in which children of all classes of society shall be included without distinction.[12] It seems to me that in the discussion of this problem, just as in the problem of co-education, the purely psychological presuppositions are kept too little in mind because the socio-ethical phase of the question tends to claim first attention.

But how the psychological methods of testing intelligence can become of direct service for these practical questions will be shown, I hope, by an investigation with the Binet-Simon tests that is now being undertaken by a group of teachers in Breslau. The problem under study is that of a systematic comparison of pupils in a Volksschule and those in a Vorschule, i. e., the younger pupils in the Vorschule of a Gymnasium.[13] The aim is to find out whether there exist typical differences of intelligence between groups of children of the same age, and what magnitude these differences attain at different ages. In Prussia, pupils may enter the Sexta (lowest class) of the Gymnasium after three years in the Vorschule, but only after four years in the Volksschule. The tests were also aimed to discover to what extent this rule is psychologically justified, not only by the differences in the curricula of the two schools, but also by the general mental maturity of the children.

[12] See footnote 13. (Translator.)

[13] As it is impossible to render these terms in English equivalents, it is proper to explain that the German Volksschule is the elementary public school attended by children of the laboring or lower business classes. In it attendance is absolutely compulsory from 6 to 14, unless the child is otherwise instructed. The Gymnasium is one of a number of so-called 'higher' or 'secondary' schools with a 9-year curriculum (ages 9 to 18, or more), and preparatory for entrance into the university. Children of the better classes, destined for higher education, enter the Gymnasium (or some variant of it) after a preliminary three-year training (ages 6 to 9) in a Vorschule, which is thus virtually a special elementary school for better-class children. Relatively infrequently do children started in the Volksschule later enter the Gymnasium. A demand is now being made by certain interests in Germany for the abolishment of these distinctions, at least in part, by compelling all children to begin school instruction in the same school (Einheitsschule), a proposal which has been, and is, the occasion of very active, and even bitter discussion. I have given a somewhat fuller explanation of the German school system in Appendix II of my translation of Offner's Mental Fatigue, an earlier number of this series of Educational Psychology Monographs. (Translator.)

Five groups were tested that had been carefully planned to be comparable in the matter of age, viz.: 7 and 9 year old pupils of the Vorschule, and 7, 9 and 10 year old pupils in the Volksschule, in all about 150 boys. (See above, pp. 35 f., for some of the precautionary rules observed in testing.) The results are now being worked out; but, thanks to the courtesy of the investigators, I have been able to get some of them now, and from them I have prepared the figures that appear in Table V.
These figures, which must be regarded as strictly provisional, merely indicate with what percentage frequency all the tests, taken collectively, for which I have the data, have been passed, and this, it should be noted, for the three older groups of children only.[14]

[14] For many very important tests, as for instance, description of the pictures, no results are at my disposal yet.

TABLE V. PERCENTAGE OF TESTS PASSED IN CERTAIN AGE-LEVELS AT BRESLAU.

                                   Age-levels of the tests
                                9-12     9 and 10    11 and 12
9-Year Vorschule Pupils          70         77          64
9-Year Volksschule Pupils        60         81          34
10-Year Volksschule Pupils       70         86          46

The first column shows that the 9-year old Volksschule pupils rank in the number of tests passed 10 per cent below the pupils of the same age in the better school, while the 10-year old Volksschule pupils attain the same measure of success as the Vorschule pupils a year younger. That, however, there is no real equality in this relationship is shown by the two other columns, in which the percentages for the easier tests (levels 9 and 10) and the harder ones (levels 11 and 12) are calculated separately. While in the easier tasks the Vorschule pupils, curiously enough, rank a little below the Volksschule pupils of their own age and 9 per cent below the older pupils in that school, the outcome is quite different when we pass to the harder tasks (third column). These tests, which lie above the age-level of the subjects, are passed by the Vorschule pupils nearly twice as well as by their mates of like age in the Volksschule, and even the older pupils in these tests fall 18 per cent behind the younger children of the better school. If this interesting result should be confirmed again in the detailed computations, as it probably will be, we should then say: children of different social classes differ from each other less in the performances appropriate to their age than in the mastery of tasks that really lie above their level.
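The comparisons drawn from Table V reduce to simple differences and one ratio between rows. The following sketch merely recomputes them from the provisional figures; the dictionary layout, the column labels "all", "easy" and "hard", and the helper `gap` are hypothetical conveniences, not part of the Breslau investigation.

```python
# Table V: percentage of tests passed (provisional figures).
# "all" = levels 9-12, "easy" = levels 9 and 10, "hard" = levels 11 and 12.
TABLE_V = {
    "Vorschule 9": {"all": 70, "easy": 77, "hard": 64},
    "Volksschule 9": {"all": 60, "easy": 81, "hard": 34},
    "Volksschule 10": {"all": 70, "easy": 86, "hard": 46},
}


def gap(column, group, reference="Vorschule 9"):
    """Percentage points by which the reference group exceeds the given group."""
    return TABLE_V[reference][column] - TABLE_V[group][column]


for column in ("all", "easy", "hard"):
    for group in ("Volksschule 9", "Volksschule 10"):
        print(f"{column:>4}  {group:<15} {gap(column, group):+d}")

# Ratio of success on the harder tests between pupils of the same age.
hard_ratio = TABLE_V["Vorschule 9"]["hard"] / TABLE_V["Volksschule 9"]["hard"]
print(f"hard-test ratio, same age: {hard_ratio:.2f}")
```

A positive gap means that the Vorschule pupils passed more tests; the negative gaps in the "easy" column correspond to the slight advantage of the Volksschule pupils on the tests of levels 9 and 10.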
We would have, then, a numerical demonstration for that well-known early ripening of children of the higher classes, for the anticipation of phenomena of developmental stages yet to come before the content of the current stage of development is fully exhausted.

We may look forward with interest to the final result of these investigations. The material of Table V also furnishes further confirmation of a law of differential psychology: the more complex a mental function, the more difficult to bring it into action, the later its appearance in the course of development, then so much the greater is its variability, and so much the more definitely are men and groups of men differentiated by it (cf. 1, pp. 258 and 269).

(d) Intelligence and school performance. The relation of these two factors is easily the most important problem presented by our theme for practical pedagogy. For at this point we may hope to get an insight into the factors that condition the progress of children within the school, the place that they take among their fellow-pupils on the basis of their work, and the way their marks turn out. People are generally inclined to think there is a very close connection between intellectual ability and school ability: good pupils are forthwith regarded as intelligent, and good school work is, with a certain obviousness, expected of intelligent children, poor school work of the poor groups. So long, of course, as we had no special means of testing intelligence, there was no foundation on which to build up more exact knowledge of these interrelations: we had to content ourselves with opinions and with the generalization of occasional observations. But now we are beginning to get on firmer ground. Tests of intelligence have already taught us that the relations between intelligence and school ability are by no means so strict and uniform as most persons had thought.
Just here we are concerned with the conclusions reached by the Binet-Simon method with normal children, but we shall encounter the same result later on in two places (II, 4c and III, 3). We have two measures for the school capacity of a child that we want to compare with his intelligence: his pedagogical age and his marks.

Pedagogical age is the normal age of the class to which the child belongs. If we assume that schooling begins at 6, then the pedagogical age of a class that is just entering upon its fourth school year is 6 + 3 = 9 years. If there is in this class a child 11 years old, he then has a pedagogical retardation of two years, while an 8-year old classmate has an advancement, or acceleration, of one year. The latter is very rare with us, on account of the exact way in which promotions are regulated; cases of it appear mostly when a child enters after private preparation or from another school. Outside of Germany cases of pedagogical advance seem to be more common. Retardations are, however, quite frequent in consequence of non-promotion, long illness, etc., and sometimes they reach a considerable degree.

Comparisons of pedagogical and mental age have been made by Binet and by Goddard.

TABLE VI. RELATION OF PEDAGOGICAL AND MENTAL AGE (BINET).

                                 Mental Age
Pedagogical Age     Retarded    At Level    Advanced    Total
Retarded               14           9           1         24
Normal                 16          33          16         65
Advanced                0           5           7         12
Total                  30          47          24        101

Binet (36, p. 162) presents a distribution-table for 101 pupils and regards the agreement as tolerably satisfactory (Table VI). In fact, we do note that there are no paradoxical cases: no one of the children with mental retardation is pedagogically advanced, and only a single mentally advanced child turns out to be pedagogically retarded (and that case may be conditioned by illness).
Yet in the remainder of the Table there are divergences of considerable magnitude: only a scant third of the mentally advanced are also pedagogically advanced; less than half of the mentally retarded are likewise pedagogically retarded, while, of the pupils 'at age' pedagogically, one quarter surpass and another quarter fall short of the mental level of their age.

An exact computation of these relations can be made by using the method of contingency.[15] Contingency means the degree of correspondence between two intersecting groups. If all children pedagogically retarded should also exhibit mental retardation, or if the converse should occur, then the contingency would be absolute (= 1); if among the pedagogically retarded children there were relatively no more mentally retarded than among the children with normal or with superior school attainments, then the contingency would be = 0. The degree of correspondence can be shown by a number lying between 0 and 1, termed the coefficient of contingency.

[15] The formula for it is developed in another of my treatises (1, 308 ff.).

From the above tables I have computed the following values:

First Factor               Second Factor              Degree of Correspondence
Pedagogical retardation    Mental retardation                  0.41
Mental retardation         Pedagogical retardation             0.30
Pedagogical advance        Mental advance                      0.45
Mental advance             Pedagogical advance                 0.19

The index of correspondence, then, is but moderately large at best and even that only when we pass from school ability to intelligence, not in the reverse direction. Hence, to draw conclusions about school status from varying intellectual abilities is even less permissible than to draw conclusions about intellectual ability from varying school status.

In his mass-experiment, Goddard (48) came to a similar result.
He found that more than half of all the children tested were in classes that did not correspond to their mental age; most of them, as a matter of fact, were in a lower and only a few in a higher class.

Bobertag compared mental age with the school marks (40, II, p. 501, Table II). In his table of distribution (Table VII), too, there are no paradoxical cases. As for the rest, the coefficients of contingency are, according to my calculation, higher than with Binet, but still, however, of only moderate magnitude:

TABLE VII. RELATION OF MENTAL AGE AND SCHOOL MARKS (BOBERTAG).

                                 Mental Age
School Marks        Retarded    At Level    Advanced    Total
Poor                   29          17           0         46
Satisfactory           26          79          21        126
Good                    0          13          31         44
Total                  55         109          52        216

First Factor           Second Factor          Degree of Correspondence
Poor marks             Mental retardation              0.52
Mental retardation     Poor marks                      0.40
Good marks             Mental advance                  0.59
Mental advance         Good marks                      0.47

Here, again, it appears that inference from school performance to mental ability is safer than from mental ability to school performance, though here the correspondence between intelligence and the school performance is not so slight as with Binet, as above cited.

What, now, is the significance of this lack of complete agreement between school efficiency and the outcome of the tests of intelligence?

In the first place one might say that this was another proof of the defectiveness of the tests; that, since pedagogical age and school marks are the condensed formulation or expression of the long-continued and many-sided efficiency of the child and hence much more characteristic than the outcome of a half-hour's testing, we would place confidence in the latter only if it agreed with the former; that if it did not, then the tests or at least the gradation derived from them would amount to nothing.
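The coefficients of contingency quoted for Tables VI and VII can be approximated from the tabulated counts. Stern's own formula is given only in his other treatise (1, 308 ff.) and is not reproduced in the text, so the sketch below rests on an assumed form: the excess of the conditional proportion over the base rate, scaled so that complete correspondence gives 1 and no excess gives 0. The function name `contingency` and the tuple layout are hypothetical.

```python
def contingency(joint, first_total, second_total, grand_total):
    """Assumed coefficient (p - q) / (1 - q), not Stern's stated formula.

    p = share of the first-factor cases that also show the second factor,
    q = share of all cases that show the second factor (the base rate).
    """
    p = joint / first_total
    q = second_total / grand_total
    return (p - q) / (1 - q)


# (label, joint count, first-factor total, second-factor total, printed value)
TABLE_VI = [  # Binet, 101 pupils
    ("ped. retardation -> mental retardation", 14, 24, 30, 0.41),
    ("mental retardation -> ped. retardation", 14, 30, 24, 0.30),
    ("ped. advance -> mental advance", 7, 12, 24, 0.45),
    ("mental advance -> ped. advance", 7, 24, 12, 0.19),
]
TABLE_VII = [  # Bobertag, 216 pupils
    ("poor marks -> mental retardation", 29, 46, 55, 0.52),
    ("mental retardation -> poor marks", 29, 55, 46, 0.40),
    ("good marks -> mental advance", 31, 44, 52, 0.59),
    ("mental advance -> good marks", 31, 52, 44, 0.47),
]

for rows, n in ((TABLE_VI, 101), (TABLE_VII, 216)):
    for label, joint, first, second, printed in rows:
        value = contingency(joint, first, second, n)
        print(f"{label:<42} computed {value:.3f}   printed {printed:.2f}")
```

With this assumed form the values for Table VI agree with the printed ones to within 0.01, and those for Table VII to within about 0.03. The agreement suggests, but does not prove, that the coefficient used was of this kind; the remaining differences may come from rounding or from a somewhat different formula.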
Now we have already alluded in what has gone before to the weakness of the gradations of intelligence discovered by the Binet-Simon method, and it is entirely probable that this insufficiency has contributed in part to the lack of agreement with school performances.[16] Since, for example, the tests for 7-year old children are too easy, many less gifted 7-year old children will reach the level of their age as a result of the testing, although they do not rank as "satisfactory" in the school. With the older children the reverse will obtain. Nevertheless, I do not think that this is the only cause of the lack of agreement: the true cause lies in something more fundamental.

[16] This point has been made by Bobertag and others as well.

In the second place, one might believe that a true picture of mental endowment was given only by the tests, and that the blame for the disagreement should be ascribed entirely to the school; that the teachers had estimated the pupils wrongly when they assigned them marks not in accord with their mental level, and had treated them wrongly when they kept them back in a class beyond which they should have gone according to their mental level. In this vein, for instance, Goddard writes, for he refers this phenomenon almost entirely to a faulty system of promotion (48, pp. 241 and 249). But to dispose of the matter in that way is to "pour out the baby with the bath." Of course, the teachers, being human, make mistakes and not a few of the measures they adopt may be based upon mistaken judgment of the mental maturity of the pupil, but it is inconceivable that half of all children should be victims of such mistakes.

It seems to me, rather, that the results we have just been discussing themselves show that both of the opinions just cited are wrong.
Complete agreement between school ability and intellectual ability is not to be expected at all nor even to be desired, because performance in the school depends not only upon intelligence, but also upon certain other and quite different factors. Thus, strength of memory, which, as is well known, is correlated only to a moderate degree with intelligence, certainly plays a large (perhaps too large) role in the carrying on of school activities and in the estimate of their worth; the various special talents, too, cut across and modify the action of general intelligence. But beside this there are concerned factors that have nothing at all to do with intellect, but belong to the domain of will, in the widest sense of that term: I mean the degree and duration of attention, industry and conscientiousness, sense of duty and capacity to fit into the social group.

These are the essential elements that must be added to intelligence in order to transform mere potential to actual accomplishment, and these same elements are enough, even when conjoined with intellectual ability of lesser degrees, to produce efficiency of a worthy degree. This is true in life, and it is true also even in the school; and it is good that for once these relations should be brought out clearly by numerical evidence. For the figures in the tables above do show just this, that intelligence is never more than a partial factor in school activity: and this demonstration may serve to refute that one-sided intellectualism that notes and values in pupils only their intellectual ability.
Not that intellectual endowment is not still to be regarded as a factor of chief importance: in truth, when by tests of intelligence and other psychological devices we shall have obtained a more exact knowledge of it, there will be much of profit for the schools, and many mistakes and wrong courses of procedure can be prevented, and this so much the more as we get clear ideas of the range and limits of its meaning and importance. If, for instance, a given pupil shows only a moderate success in the tests of intelligence but does distinctly good work at school, and if there is no chance that a special talent might have exerted a decided influence (which could easily be recognized if existent), then there is a probability approximating to certainty that this pupil's strength is to be sought primarily in qualities of character and will.

Accordingly, the lack of agreement between tests of intelligence and school performance is really calculated to increase our confidence in the psychological test-methods. In this connection Kramer very pertinently remarks:[17] "Had we found a strict parallelism between the results of the testing of intelligence and the school performance, we should have had to feel the greatest distrust of the method. It would have raised the suspicion that we were doing nothing more than testing the school attainments themselves, either directly or indirectly, in which event the method would be futile for testing native endowment and its application would be superfluous, for we would need only to resort to the school performance directly for the information."

[17] See reference 54, pp. 30-31. Kramer was alluding to the examination of abnormal children, but what he says applies to normal cases as well.

(e) Sex differences.
Comparisons of the mental abilities of boys and girls have already been carried out in large numbers in experimental psychology, but they have been almost entirely confined to single tests,[18] whereas the Binet-Simon serial tests have been used to but a surprisingly slight extent in the comparison of the sexes and have not yet led to positive conclusions. I confine myself to a brief exposition of the material in question.

[18] The literature has been brought together by me elsewhere (1: Bibliography, Section VI); here we may cite the general summaries of the results of tests by Meyer and Wreschner, and the extensive original studies of Cohn and Dieffenbacher (Nos. 1048, 1072 and 104 in the bibliography just cited). As one pretty generally confirmed result may be mentioned, among others, that with the Ebbinghaus completion method girls are clearly inferior to boys of the same age.

Goddard tested 835 boys and 712 girls. Unfortunately, he has thrown together the data for the different ages: it follows that his figures (48, p. 250) lose much of their value for comparative purposes, because retardation and advance have quite different meanings at different age-levels. Nevertheless, we may reproduce here the table of distribution for the children (which I have converted into percents). The tabular results suggest a slight inferiority of the boys, most evident in the group of those retarded one year, to which 23 per cent of the boys, but only 17 per cent of the girls belong; the girls show a correspondingly greater percentage of their number at or above age.
TABLE VIII. SEX DIFFERENCES AS SHOWN BY BINET TESTS (GODDARD).

                  Retarded                            Advanced
            Two Years     One                     One     Two Years
             or More      Year      At Age        Year     or More
Boys          18.5         23        34.5          20         4
Girls         18.5         17        36.5          23         5

Goddard's statement that retardation of marked amount is more frequent with boys is not borne out by his own tables, for the percentage of boys and girls is here the same, 18.5.[19]

[19] It must, of course, be borne in mind that these are pupils of schools for normal children, but the statement appears to be equally untrue for abnormal children. From one of Chotzen's tables (44, p. 462) I have calculated that excessive retardation, 5 years or more, appeared in 7 of 158 feeble-minded boys (4.5 per cent), but in 11 of 122 girls (9 per cent).

All the other investigators that have treated the question of sex differences have obtained results more favorable for the boys.

Particularly decisive are the results obtained by Bloch and Preiss (38) upon Volksschule children in the manufacturing city of Kattowitz, in Upper Silesia. They tested 79 boys and 71 girls aged 7 to 11 years, all of whom displayed average native ability and average school ability. The percentages passing successfully the various tests show almost in every one of them a very decided inferiority of the girls. In Table IX I have brought together all the tests for which Bloch and Preiss report the results separately for the two sexes. No particular sex difference appears in the description of pictures and in the definition of abstract terms, and there is a slight superiority of the girls in the "hard" problem-questions; but in all the other tests the boys afford much higher percentages of success, often more than twice as high.
Take, for instance, the 8-year old children: more than half of the boys, but not a single girl can arrange the five weights correctly; four-fifths of all 8-year old boys recall correctly what they have read, solve the easy problem-questions and state correctly the difference in things recalled in memory, whereas the percentage of girls passing these tests successfully is only 28, 55 and 55, respectively. Where the same test runs through several years, the sex difference is nearly always greater in the younger than in the older children. This corresponds, again, with the psychological law that mental differences stand out more clearly in difficult than in easy tasks.

TABLE IX. SEX DIFFERENCES AS SHOWN BY BINET TESTS (BLOCH AND PREISS).

                                                 Percentage of Children
                                                    Passing the Test
Test                                      Age       Boys       Girls
Description of Pictures                              No difference
Memory of Story Read                       7         No difference
                                           8          80         28
                                           9         No difference
Arranging Three Weights                    7          73         33
Arranging Five Weights                     8          56          0
                                           9          66         29
                                          10          70         44
                                          11          77         42
Easy Problem-questions                     8          81         55
                                           9          90         76
                                          10         100        100
Hard Problem-questions                     9          25         41
                                          10          70         80
                                          11          70         80
Defining Abstract Terms                              No difference
Making a Sentence with Three Words        10          70         38
                                          11          82         40
Arranging Words into Sentence             11         100        100
Naming 60 Words in 3 Minutes                          70         33
Detecting Absurdities                     11          76         50
Comparing Objects from Memory              ?          77         40
                                           ?          60         50
                                           ?          80         55

Bloch and Preiss themselves point out that the number of persons upon which these results are based is too small to warrant final conclusions, but it is surely worthy of note that the inferiority of the girls extends to so many different kinds of tests.

Bobertag (40, II, pp. 503-4) compared the same number of boys and girls of each age that ranked average in their school work.
In each age the mental age of the boys turned out to be slightly above that of the girls; the difference amounted to 1/7 year in the 8, 9 and 12-year old pupils, and to 1/5 year in the 10 and 11-year old pupils.

Mlle. Descoeudres (46) compared a very small number of pupils: one intelligent and one unintelligent boy and a like pair of girls from each of six chronological ages. Taking all the right answers together, the boys had 52, the girls 48 per cent. There is here, then, also, a superiority of the boys, though the amount of the difference is not, of course, significant.

(f) Repeated tests of the same children. Attention must be called to one other important experiment included in the article of Bobertag's already mentioned (40, II), an experiment that differs fundamentally from all that have been conducted heretofore. In the following year Bobertag retested a large number (83 in all) of the children that he had tested in 1909. The reapplication of the same tests does not seem to have caused any noticeable difficulty, because the memory of the details of the testing of the year before had as good as entirely disappeared. This experiment throws light upon three problems.

In the first place it sheds an unexpectedly favorable light upon the reliability of the test method. Bobertag arranged the 83 children in order on the basis of the number of tests solved by each of them and found that the order in the two years coincided very closely; in fact the correlation amounted to 0.95. Accordingly, even if the absolute grading into the different age-levels of intelligence that the Binet-Simon method affords is still somewhat uncertain, yet it is demonstrably very certain in its relative gradings. The position that a child takes in a group of children on the basis of a single testing of his intelligence may be deemed to possess a high degree of reliability.
In the second place there comes to light a clear relation between the mental status of a child and the rate of his subsequent intellectual development. Those children that ranked 'at age' in the first testing had advanced next year exactly one year, on the average, while the retarded children had advanced only two-thirds of a year, and the advanced children one year and a quarter in the same period.

In the third place Bobertag found that the number of children that deviated, either above or below the level of their age, increased as their age increased. It follows from this that, as chronological age increases, the gradation of ages becomes progressively of less significance as a standard of variation: an intelligence that in the earlier years deviates above or below the level of its age by even less than a single year will in later years exceed this unit of deviation, which has then become relatively smaller. The same result had already been arrived at in investigating abnormal children, as will be shown in the following section.

4. Abnormal Children

(a) Mental arrest and mental retardation. The mental quotient. When Binet devised his system of tests, he had particularly in mind the testing of abnormal children in order that children of this type could be recognized opportunely and transferred to the special classes and to the institutions for the feeble-minded. Furthermore, Binet, together with Simon, tried out his method upon a large number of such children. Unfortunately, he has given us no detailed account of this investigation, but he did draw conclusions from his experiments that express the relation of feeble-mindedness to his method in very simple formulas.

One of these theses refers to mental retardation and runs thus (38, p. 113): "I am for my part of the opinion that every mental retardation amounting to two years can be regarded as a serious deficiency." A second of these theses refers to mental arrest and declares that the imbecile does not progress beyond the mental age of seven, the moron (feeble-minded in the narrower sense) beyond the mental age of nine.*

* By other investigators and elsewhere by Binet the upper limit of moronity is placed at 12 years. (Translator.)

The second investigation of feeble-minded children, that of Goddard (47), likewise suffers from lack of sufficiently detailed data. Goddard tested the children and adults in the Institution for the Feeble-Minded at Vineland, N. J., nearly 400 persons in all, using the 1908 Binet series. He reports, however, only the frequency with which the several age-levels were reached and does not relate these data to the chronological ages of his subjects, so that it is quite impossible to determine the degree of retardation from his tables. We can only derive certain conclusions that will be mentioned later on.

The only thorough investigations that have thus far been made upon large numbers of abnormal children are, accordingly, the tests made at Breslau by the psychiatrist Kramer (54) and Chotzen (in conjunction with Nicolauer (43, 44)). These investigators, by testing children of different types, have supplemented each other's work in a fortunate manner. Kramer's material consisted partly of young persons who had been brought before the juvenile court and referred thence to the psychiatrist for expert opinion, partly of children that had visited the clinics on account of mental or nervous affections. Chotzen applied the tests in his capacity of city school medical inspector for special classes: he, therefore, tested all the children that were newly turned over to the special school for defectives.
While Kramer had to do mostly with older children, Chotzen's business led him to deal mostly with children aged eight and nine years, but he extended his investigation by including some of the older children in the special school. The technique was patterned exactly after that followed by Bobertag, who had himself tested a group of abnormal children as well as normal children.

Both of these investigators express a favorable opinion of the value of the method for their purposes. Thus Kramer writes:

"In summing up our results we might say, first of all, that we are very much satisfied with the method for our purposes. Leaving the quantitative results entirely out of consideration, we came in the course of the testing, on account of the varied nature of the tests, to get acquainted with the peculiarities of the child's make-up, to understand surprisingly well his response to requirements of a varied sort, and acquired valuable insight into the qualitative differences in the method of reaction displayed by the feeble-minded. In the case of the children sent by the central office for corrective treatment, most of whom we could get hold of but for a single examination, the relatively short time that was needed (about 45 minutes to one hour) to reach a reliable judgment concerning their intelligence proved to be an exceptionally agreeable feature of the work. In all the cases in which a judgment concerning the intelligence could be reached by anamnestic data or on the basis of clinical observations themselves, there resulted with but few exceptions no contradictions with the outcome of the Binet testing" (54, p. 27).

To turn, now, to the figures: To begin with the second of the two theses of Binet that we cited above, his assertion of the existence of a "mental arrest" has also found confirmation in other directions.
This thesis may be stated thus: For every feeble-minded child there is a level which, once attained, represents a definite terminus for his capacities to meet the demands of mental tests. That is, even though his age advances, his capacities do not advance further than this level.

Goddard found that the inmates of his institution were distributed in terms of mental age rather uniformly over the age-levels from one to nine years (with approximately 10 to 11 per cent. in each year), whereas the levels 10 to 12, taken together, comprised only 7 per cent. of the total number. Though here, again, he unfortunately put together those children whose age was such that they might perhaps have been able later to transcend the level in which they were then found and the other inmates whose development had for long been completely checked, yet his results do at least demonstrate that the feeble-minded only rarely transcend the mental age of nine.

By comparing these mental ages with the diagnoses of the physicians he arrived at the following schema:

                            Imbeciles
                       Low-     Middle-   High-
Type          Idiots   grade    grade     grade     Morons
Mental ages   1 to 2   3 to 4   5         6 to 7    8 to 12

Goddard's 'morons' coincide with our 'mentally feeble' (Debilen). The figures just given show that by far the greater number of them have a mental age of 8 and 9 years. Kramer (54, p. 29) and Chotzen (44, p. 494) reached similar results.

Goddard compared the experimentally determined mental age with the general impression that the children had made on the teachers and officers of the institution and found a very satisfying amount of agreement. The children of a given mental age formed a fairly homogeneous group, both in respect to their every-day accomplishments and their ability to adapt themselves to the demands of institutional life.
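Goddard's schema amounts to a lookup from mental age to diagnostic grade. A minimal sketch in Python, using the terminology of the period exactly as the schema gives it; the names `GODDARD_SCHEMA` and `goddard_grade` are illustrative labels chosen here and do not come from Goddard's report:

```python
# Goddard's schema as reported in the text: inclusive mental-age ranges
# (in years) paired with the diagnostic grades of the period.
# The identifiers below are illustrative assumptions, not part of the source.
GODDARD_SCHEMA = [
    (1, 2, "idiot"),
    (3, 4, "low-grade imbecile"),
    (5, 5, "middle-grade imbecile"),
    (6, 7, "high-grade imbecile"),
    (8, 12, "moron"),
]


def goddard_grade(mental_age: int) -> str:
    """Return the grade whose mental-age range contains `mental_age`.

    Raises ValueError for mental ages outside the 1 to 12 span of the schema.
    """
    for low, high, grade in GODDARD_SCHEMA:
        if low <= mental_age <= high:
            return grade
    raise ValueError(f"mental age {mental_age} lies outside the schema (1 to 12)")


if __name__ == "__main__":
    for age in (2, 5, 7, 9):
        print(age, goddard_grade(age))
```

The ranges are taken as inclusive whole years, which is how the schema is printed; fractional mental ages would need a convention that the text does not supply.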
He adds to this a description of what can be expected in the line of practical behavior of a child of a given mental age. But all of these statements stand very much in need of further testing.

It is well to give here explicit warning against a certain false conception of the term "arrest." An imbecile who, during his life, never progresses past the mental age of seven, is not on that account to be thought of as the same as a seven-year-old child. He does grow beyond that status in many respects: he acquires experiences that a normal 7-year-old child does not possess, picks up many accomplishments, experiences the awakening within himself of impulses and needs that come with increasing years. The arrest, then, pertains only to that specific group of mental abilities that are tested by the tests. And even some of these abilities may show some development (cf. in this connection p. 86), only there still remain so many defects of a fundamental nature that, all in all, it is impossible for him to rise above the mental age of seven.

Of importance is, furthermore, the discovery that Goddard made concerning the mental age of a special group, the morally feeble-minded. It turned out that this group was recruited solely from the upper of the age-levels represented in the institution. Of 22 such individuals, 15 had the mental age 9, 5 the age 10 and one each the ages 11 and 12. The circumstance that moral defects do not extend down beyond the mental age nine is explained by Goddard in the following way: Certain immoral instincts, like the impulse to lie, to steal, etc., normally awaken about the ninth year; later on reasoning develops and puts these tendencies under inhibition.
With children whose mental age is below nine those instincts are not yet developed, whereas with children who are arrested at about the mental age of nine, the instincts do show themselves without getting far enough along to develop the inhibition and so become a moral defect.

We may leave undecided the question of the correctness of this explanation, but in any event the fact remains that pronounced retardation in morality is not associated with equally pronounced intellectual deficiency. The moral deficiency therefore displays a certain independence in its existence, and to that extent the old designation "moral insanity" was not utterly devoid of significance.

We may allude, also, at this point, to a very similar conclusion reached by Kramer, who must, naturally, have encountered this type frequently among his criminal subjects. He says: "We have to do here with individuals whose defectiveness is on the moral side and in whom there can be noted even from their early youth a decided lack of moral ideas and altruistic spirit. In raising the question as to in how far these moral defects exist independent of intellectual deficiency, it is worth noting that in the examination a number of these children obtained a result that corresponded with their actual age. And even in the cases in which the mental ability fell below the norm, there was no parallelism at all between the two kinds of deficiency."20

20 See Reference 54, p. 28. We may mention also in this connection the results obtained by Frau Dosai-Révész (4) with separate tests. She compared the efficiency in computation, memory and report of normal children, simple feeble-minded and morally feeble-minded and found that the results for the last-named group fell almost entirely between the results for the two other groups.
But this discussion has already led us from the consideration of mental arrest to the question of the mental retardation of the feeble-minded. Binet used as the measure of retardation simply the difference between mental age and chronological age and was so convinced of the general application of this measure that he looked upon the value "2 years" as a general expression for a definite and in fact serious deficiency.

Binet's successors also made use of this standard, but their own results teach us that we can not be satisfied with it. For it has become evident that one and the same absolute difference, e. g., a mental retardation of three years, means very different things at different years. Thus Kramer (54) remarks: "It should not be concluded that a 12-year-old child with a mental age of 9 is of the same degree of feeble-mindedness as an 8-year21 child with a mental age of 5. In the case of the children turned over to us for examination by the Central Child Welfare Bureau (Jugendfürsorgezentrale) it came out clearly that the differences revealed among the younger children were for the most part but small, but among the older children always greater, although the actual defects in these two groups, so far as we could judge them by other criteria, by no means revealed any corresponding difference, but seemed, on the average, to be about the same."

21 Page 29. In the text there is a typographical error here, 7 instead of 8-year.

Chotzen (44, p. 493) also corroborates this view: "On account of a checking of development, the mental age of feeble-minded children lags progressively more and more behind their chronological age: the younger they are, the more, and the older they are, the less does a year's retardation mean in actual defectiveness.
" How considerable the fluctuations are may be shown by some figures (Table X) that I have derived from one of Chotzen's tables (p. 485). In addition to the tests, or rather independently of them, Chotzen examined all the pupils of the special school as the physician and the psychiatrist ordinarily would, and classified them, on the basis of this examination, into the stock groups: moron, imbecile, idiot. Some he had to class outside of these groups by designating them as 'not feeble-minded' or as 'doubtful feeble-minded.' Now, it might be supposed that the members of any group, e. g., the morons, would necessarily show at least approximately the same degree of mental endowment, regardless of differences in their chronological ages. But Table X shows that their mental retardation, computed as the absolute difference, has very different values with the older than with the younger children, and Table XI, in which the average value of this measure of retardation has been figured for each age-level, reveals a rapid increase in the magnitude of the value, so that the 12-year-old imbeciles are retarded by twice as many years as the 8-year-old imbeciles (4.7 as against 2.3 years).

From this it seems to me to follow that the absolute difference can be used only when we are dealing with children of a given age. If, for example, it should sometime be arranged to carry out tests of intelligence upon all 6-year-old children when they entered school, then the designations "retarded one year" or "advanced one year" would have an unequivocal meaning.

TABLE X
FREQUENCY OF MENTAL RETARDATION IN DIFFERENT FORMS OF FEEBLE-MINDEDNESS AND DIFFERENT CHRONOLOGICAL AGES

[Table X: number of children at each chronological age (8 to 13 years) by years of retardation, for the groups not feeble-minded (0 to 3 years), doubtfully defective (1 to 4 years), morons (1 to 4 years) and imbeciles (1 to 5 years). The cell values cannot be recovered reliably from this copy.]

But it is another matter when we have to consider children of quite different ages or when we want to express the degree of backwardness in a formula of general validity. The value based on absolute difference, if given by itself, may mean very different things, so that at least the chronological age ought always to be stated to enable the reader to figure out the degree of importance to be attached to the difference. To what prolixity of statement this method leads one may be illustrated by the following sentence from Chotzen (pp. 493-4): "Children of 8 to 9 years can suffer a deficiency of one year, those of 10 to 12 years one of two years without feeble-mindedness being present, but a backwardness of two, or of three years, respectively, for these ages, certainly cannot coexist with normal intelligence."

TABLE XI
AVERAGE RETARDATION, IN YEARS, OF THE CHILDREN IN TABLE X

Chronological   Not             Doubtful
Age             Feeble-minded   Defect     Morons   Imbeciles
8               0.65            1.3        1.9      2.3
9               1.4             1.7        2.1      3.1
10              2.0             2.0        2.6      3.8
11              3.0             3.5        3.2      4.0
12              2.0             3.0        3.3      4.7
13              3.5 (single value; its column cannot be determined from this copy)

That the size of the absolute difference for the same degree of feeble-mindedness should increase as age increases is psychologically easily intelligible, for, since feeble-mindedness consists essentially in a condition of development that is below the normal condition, the rate of development will also be a slower one, and thus every added year of age must magnify the difference in question, at least as long as there is present anything that could be called mental development at all. With this in mind it is but a step to the idea of measuring the backwardness by the relative difference, i. e., by the ratio between mental and chronological age, instead of by the absolute difference. Bobertag had already conceived a plan of this sort, while Kramer (54, p.
30) hints at something of the sort, though very guardedly: "Whether perhaps there might be devised a specific method of calculation for relating the difference in years to chronological age and which would then give us an absolute measure for degree of feeble-mindedness, seems to me a matter of doubt."

The results of Chotzen that now lie before us permit us to test the feasibility of a relative measure of this sort. I should like to recommend the relating to chronological age not of the difference, but of the mental age itself. We would then obtain the mental quotient that has already been mentioned (p. 42). This quotient shows what fractional part of the intelligence normal to his age a feeble-minded child attains.

Mental quotient = mental age ÷ chronological age.

An 8-year-old child with a mental age of six has, then, a mental quotient of 6/8 = 0.75. A 12-year-old child with a mental age of 9 has the same mental quotient.

TABLE XII
AVERAGE MENTAL QUOTIENT OF THE CHILDREN IN TABLES X AND XI

Chronological   Not             Doubtful
Age             Feeble-minded   Defect     Morons    Imbeciles
8               0.92            0.84       0.76      0.71
9               0.85            0.81       0.77      0.67
10              (0.80)          (0.80)     0.74      0.62
11              (0.73)          (0.68)     0.71      (0.64)
12              (0.75)          (0.75)     (0.73)    (0.61)
13              (0.73) (single value; its column cannot be determined from this copy)

Now when we turn into quotients the values calculated from Chotzen in Table XI, we obtain Table XII. The idiots have been omitted for reasons that will appear later. The figures in brackets are those that cannot be deemed reliable averages on account of too few individuals included in them. The table reveals mental quotients for the two main forms of feeble-mindedness that are, it is true, not constant, but that are, however, very similar through several chronological years. The morons, in especial, show surprisingly uniform values; their average quotient varies only within the narrow range 0.71 to 0.77 for the five years 8 to 12.
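The arithmetic behind the mental quotient, and behind the conversion of Table XI into Table XII, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and only the morons' column is transcribed. Because every child in one row has the same chronological age, the quotient of the average mental age equals the average of the individual quotients, so the Table XI averages should reproduce Table XII up to rounding.

```python
def mental_quotient(mental_age: float, chronological_age: float) -> float:
    """Mental quotient = mental age / chronological age."""
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    return mental_age / chronological_age


def quotient_from_retardation(chronological_age: float, retardation: float) -> float:
    """Quotient of a child who is `retardation` years behind his chronological age."""
    return mental_quotient(chronological_age - retardation, chronological_age)


# Morons' column only: average retardation (Table XI) and average
# quotient (Table XII), keyed by chronological age in years.
MORON_RETARDATION = {8: 1.9, 9: 2.1, 10: 2.6, 11: 3.2, 12: 3.3}
MORON_QUOTIENT = {8: 0.76, 9: 0.77, 10: 0.74, 11: 0.71, 12: 0.73}

if __name__ == "__main__":
    # The two worked cases from the text: both give 0.75.
    print(mental_quotient(6, 8), mental_quotient(9, 12))
    # Kramer's pair: each child is three years behind, yet the quotients differ.
    print(mental_quotient(9, 12), mental_quotient(5, 8))
    # Table XI averages converted to quotients, beside the Table XII values.
    for age, years_behind in MORON_RETARDATION.items():
        computed = quotient_from_retardation(age, years_behind)
        print(age, f"{computed:.3f}", MORON_QUOTIENT[age])
```

On Kramer's pair the absolute difference is three years in both cases, while the quotients come out as 0.75 and 0.625, which is the distinction the relative measure is meant to capture.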
Roughly expressed, therefore, their intelligence, measured by that of normal persons, is a 'three-quarter intelligence.' The imbeciles show somewhat greater variations, but their mental quotients are in quite fair agreement, at least for the years 9 to 11. They entitle their possessors, again roughly speaking, to a scant 'two-thirds intelligence.'

The first two of Chotzen's groups are represented by too few cases to permit consideration of their averages, save at most for the younger ages. In these ages the mental quotient agrees finely with the medical diagnosis of the children. Those designated as "not feeble-minded" have a mental quotient of about 0.90, while the doubtfully-defective, whose quotient lies between 0.80 and 0.84, form a real intermediate grade between the 'not-abnormal' and the true morons. The isolated cases of older children (7 in all) that Chotzen classified in these two groups, are ranked by their quotient largely in the morons. It is possible that the mental quotient may supplement uncertain medical diagnoses in cases of this sort.

Now the objection might be raised to the above series of quotients that they comprise only averages and that these have been derived in part from a too small number of values. To meet this objection I have made another computation in which I have worked out the mental quotients of individual children, and then have recorded their frequency-distribution. In this computation I have disregarded chronological age, and have combined in each case the values for 10 points on the scale, e. g., the quotients lying between 0.91 and 1.00, between 0.81 and 0.90, etc.

TABLE XIII
DISTRIBUTION OF MENTAL QUOTIENTS IN DIFFERENT GROUPS OF FEEBLE-MINDED

                  Not
                  Feeble-minded   Doubtful       Morons         Imbeciles
Mental Quotient   Abs.   Rel.     Abs.   Rel.    Abs.   Rel.    Abs.   Rel.
0.91-1.00           6     18
0.81-0.90          19     57       14     48       5      9       6     5.5
0.71-0.80           8     25       13     45      37     67      30    27
0.61-0.70                           2      7      13     24      49    44
0.51-0.60                                                        15    13.5
0.41-0.50                                                         9     8
0.31-0.40                                                         2     2
Totals             33    100       29    100      55    100     111   100

Table XIII shows the distribution of the results obtained in this way, both in absolute numbers and in percentages: Figure 1 also shows the distribution of the percentages graphically.

[FIG. 1. DISTRIBUTION OF MENTAL QUOTIENTS DERIVED FROM CHOTZEN'S RESULTS. Separate curves for the not feeble-minded, doubtful, moron and imbecile groups.]

There appears a clear separation of the points of maximal frequency for the chief groups, and, it is to be noted, the mental quotient of the 'not-abnormal' children lies mostly between 0.81 and 0.90, that of the morons between 0.71 and 0.80, that of the imbeciles between 0.61 and 0.70, all quite in accordance with our earlier figures. In the case of the imbeciles the range of the quotients is wider than with the other groups, as the average values had already shown. Attention should be called to the fairly symmetrical form of the three curves: this brings it about that the point of maximal frequency and the average tend to coincide within each group. The transitional character of the group of doubtfully defective also finds expression typically in that its members are distributed fairly uniformly over the regions that are characteristic on the one hand of the normals and on the other hand of the undoubted morons.

The number of the children tested by Chotzen is not yet large enough and particularly their distribution over the different age-levels is not wide enough to consider the above figures as having conclusive value for other sets of material, yet they do seem to me so far removed from objection as to demonstrate that the mental quotient is a very much more useful measure of backwardness than the commonly used absolute difference.
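The relative frequencies and the points of maximal frequency discussed above can be recomputed from the absolute counts. The sketch below is illustrative only: the counts are transcribed from Table XIII as far as they can be read, and the names `COUNTS` and `summarize` are hypothetical. Percentages computed this way agree with the printed relative values to within one percentage point; the printed figures appear to have been adjusted so that each column sums to 100.

```python
# Absolute frequencies per quotient interval, transcribed from Table XIII.
# Intervals with no children in a group are simply omitted.
COUNTS = {
    "not feeble-minded": {"0.91-1.00": 6, "0.81-0.90": 19, "0.71-0.80": 8},
    "doubtful": {"0.81-0.90": 14, "0.71-0.80": 13, "0.61-0.70": 2},
    "morons": {"0.81-0.90": 5, "0.71-0.80": 37, "0.61-0.70": 13},
    "imbeciles": {
        "0.81-0.90": 6,
        "0.71-0.80": 30,
        "0.61-0.70": 49,
        "0.51-0.60": 15,
        "0.41-0.50": 9,
        "0.31-0.40": 2,
    },
}


def summarize(group_counts):
    """Return (total, percentage per interval, interval of maximal frequency)."""
    total = sum(group_counts.values())
    percentages = {k: 100 * n / total for k, n in group_counts.items()}
    modal_interval = max(group_counts, key=group_counts.get)
    return total, percentages, modal_interval


if __name__ == "__main__":
    for group, group_counts in COUNTS.items():
        total, percentages, modal_interval = summarize(group_counts)
        shares = ", ".join(f"{k}: {p:.1f}%" for k, p in percentages.items())
        print(f"{group} (n={total}), mode {modal_interval}: {shares}")
```

The modal intervals come out one step apart for the three chief groups (0.81-0.90, 0.71-0.80, 0.61-0.70), which is the separation the text describes.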
The quotient does not seem, however, to afford an actually constant expression of degree of feeble-mindedness, but shows a tendency to fall in value as age increases. This tendency, it is evident, is but slight within the limits of age that have been mentioned, so that for many problems it can be neglected. Before and after these ages the fall in the value seems to take place more rapidly. In the case of the later age-levels this is easily intelligible, for once the stage of arrest that we have previously discussed is reached (for morons at the mental age of 9), the quotient obtained by dividing mental by chronological age must decrease as chronological age increases. The feeble-minded child, it must be remembered, not only has a slower rate of development than the normal child, but also reaches a stage of arrest at an age when the normal child's intelligence is still pushing forward in its development. At this time, then, the cleft between the two will be markedly widened.

From these considerations it follows that the mental quotient can hold good as an index of feeble-mindedness only during that period when the development of the feeble-minded individual is still in progress. It is for this reason that there is no sense in calculating the quotient for idiots, because, in their case, the stage of arrested development has been entered upon long before the ages at which they are being subjected to examination. The above-mentioned gradual tendency of the mental quotient to sink during the progress of development shows that this development approaches the final level of arrest at a progressively decreasing rate.22 Whether we shall succeed some time in finding a formula for a truly constant coefficient of feeble-mindedness must be left for the future.
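Why the quotient must fall once arrest has set in can be shown with a deliberately simplified model. The sketch below assumes, purely for illustration, that mental age grows at a constant fraction of the normal rate until it reaches the level of arrest and then stays fixed; the text itself says that the real rate declines gradually, so the model is not the author's. The arrest level of 9 is the figure given above for morons, while the rate of 0.75 and the name `quotient_series` are assumptions.

```python
def quotient_series(rate: float, arrest_mental_age: float, ages):
    """Mental quotient at each chronological age under a toy model.

    Mental age is taken to be `rate * age` until it reaches
    `arrest_mental_age`, after which it no longer increases.
    """
    series = {}
    for age in ages:
        mental_age = min(rate * age, arrest_mental_age)
        series[age] = mental_age / age
    return series


if __name__ == "__main__":
    # Assumed rate 0.75 ("three-quarter intelligence"), arrest at mental age 9.
    for age, quotient in quotient_series(0.75, 9, range(6, 19, 3)).items():
        print(age, round(quotient, 2))
```

Under these assumptions the quotient holds at 0.75 up to the chronological age of 12, when the mental age of 9 is reached, and then sinks to 0.6 at 15 and 0.5 at 18, although the child's capacities have not changed at all.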
22 In his last article (40, II) Bobertag lays special stress on this progressive retardation in the rate of development of the feeble-minded and attempts to present it in graphic form.

(b) Relation to the several tests. It must not be thought that the significance of the Binet-Simon method for the study of feeble-mindedness is restricted to the possibility of grading them quantitatively. Perhaps even more important than this is the qualitative analysis of the individual subject that the method allows and the discovery of how the several tests have participated in the final values. Chotzen's investigation, the first to attack this problem, has shown how confusingly many special problems and matters of interest are to be unearthed in this field.

At the very outset, for example, there is thrust insistently upon us the question: Have we any right at all to equate a 10-year-old feeble-minded child with a 7-year-old normal child just because the result of testing gives him a mental age of 7 years: in other words, can we say that feeble-mindedness is actually mere 'backwardness'? It is, indeed, quite often asserted that this expression is misleading because feeble-mindedness is something qualitatively different from normality. But the Binet-Simon method makes it possible for us to work out the comparison between the two mental conditions exactly.

And in fact comparison does show that the mental age of 7 years is not reached in the tests in quite the same way that the normal 7-year-old child reaches the same mental age, for the area of irregular distribution is very much wider with the feeble-minded than with the normal child. Bobertag, in an as yet unpublished discussion, reckons the distribution at twice the area of that of a normal child.
In other words we may say that the 'hits' and 'misses' of the older feeble-minded children are scattered over very many more age-levels than are those of younger normal children: the defective fails unexpectedly to pass some quite easy tests, but succeeds here and there in meeting much higher requirements. There appears a certain dissociation of abilities that are normally more strictly intercorrelated.

We are now in a position, moreover, to discover a general principle obtaining in this dissociation. There are certain abilities that are essentially a function of age, relatively independent of intelligence: there are other abilities that are conditioned entirely by specific degrees of intellectual development, regardless of the age at which this development is attained. A child of 9 or 10 years of age, even if he be defective, will be farther advanced than a normal child of 6 or 7 years of age in abilities of the first sort; but a normal child necessarily surpasses a feeble-minded child in abilities of the second sort.

A priori we should expect that to the first sort of abilities (those conditioned by age) would belong those dependent upon a mass of experiences frequently had and activities frequently discharged in everyday life. But a priori opinions of this sort are of no great service to us, and it will be of correspondingly great value for us to be able to discover by an analysis of the results of Binet-Simon tests which of the tests applied to the feeble-minded correlate more with age and which more with real intelligence. Up to now the results of Chotzen are alone available for this purpose and even they afford but an incomplete survey because Chotzen had to deal almost entirely with feeble-minded children of a single age-group (8 to 9 years).
Chotzen gives us a whole series of computations to show the worth of the different tests for the diagnosis of feeble-mindedness: the perusal of his difficult exposition will afford the reader a new idea of the complications that arise when one really tries to analyze the serial system of tests to the last details. Because a repetition of investigations of this sort, especially with feeble-minded children of more advanced ages, is very much to be desired, we feel warranted in introducing here a brief account of the methods that Chotzen pursued in evaluating the tests.

The simplest thing is, of course, the direct comparison of feeble-minded with normal children of the same age (using Bobertag's data). From such a comparison it appeared (44, p. 440) that the backwardness of the feeble-minded was least in the following tests: telling forenoon from afternoon, defining in terms of use, knowing own age, esthetic judgment, telling the number of the fingers, describing a picture, counting 13 pennies; the backwardness was, on the other hand, very pronounced in the following: memory-span for 16 syllables and for 5 digits, making change (80 Pfennige for 1 Mark), counting backward from 20 to 0, definition by superordinate terms, comparison of two objects from memory, recall of a short story, naming the months and arranging the five weights.

With children of other ages these lists would presumably change. Thus the explanation of the picture which is demanded of older children would doubtless bring out a decided difference between normal and feeble-minded children, though the description of the picture which is demanded of the younger children did not bring out such a difference, according to Chotzen.
However, even these lists of Chotzen's suffice to show that the differences between the two types of children turn out to be small in those tests that relate to frequently practiced activities (counting, telling how old they are) and to common experiences of everyday life (number of fingers, forenoon and afternoon); on the other hand, the deficiency of the feeble-minded is at once revealed in its entirety the moment that something unusual is demanded, that something new is presented and that attention must be sharply concentrated.

A similar comparison can be carried out, in the next place, amongst the special-class pupils themselves, i. e., between the different groups of feeble-minded that the medical diagnosis had established: Chotzen found out which tests exhibited a specially decided drop from one group to another in the feeble-minded children of the same age. I mention only those that showed a clear falling off of one-half in passing from the "not feeble-minded" to the morons and from the morons to the imbeciles.

For 8- and 9-year-old children: drawing a diamond, repeating five digits, easy problem-questions. There was a somewhat smaller falling off in counting five coins and comparing two objects.

For older children (Chotzen had also tested a series of older children for purposes of comparison): comparison, reproduction of the item in the newspaper, arranging five weights, making change, defining by superordinate terms, knowing various pieces of money, repeating five digits.

Thirdly and lastly, Chotzen figured out comparative results for those subjects of the same mental, but of different chronological age, as might happen, for instance, if an 8-year-old child were retarded two years, a 9-year-old child three years, or a 10-year-old child four years.
He found that when children of a single mental level were considered, some tests show a clear increase in capacity with increase in chronological age, others no alteration, while yet others an actual decrease. Tests of the first sort, those that have an 'age-increase,' are doubtless tests that have least to do with intelligence, because, given the same intelligence, they are nevertheless better done by the older children. On the contrary, the other tests plainly stand in correlation with intelligence, more particularly the tests in which the older children actually turn out poorer.

The following are the results: Decided increase with age is shown in copying, writing from dictation, the recall of two items of a story, naming the days of the week.

"The tests accompanied by strong increase with age relate, then, almost exclusively to matters of information, particularly of school-information, the assimilation of which depends on the extent of instruction. Where only a slight increase is to be detected, information also plays a role in some of the tests (five coins, knowing age), but for the most part the tests are such that not only practise, but also the natural increase of efficiency will improve the results, e. g., execution of three orders, counting backwards, repeating 16 syllables. In all of these the increase with age is slight. No increase at all is present with tests that demand ability to judge and to combine or with such as put severe demands upon apprehension: comparison, problem-questions, noting omissions, repeating five digits" (44, p. 453).

To this last category probably belong also: recall of six details of a story, arrangement of the five weights, explanation of a picture, making change, though the figures are too small in these cases to permit positive conclusions.
When we compare with each other these different lists obtained in different ways, we note, it is true, deviations in many details, yet, taken as a whole, the same tests keep cropping up as the ones in which defective intelligence is laid bare, unconcealed and uncompensated, while in the other tests the defect of intelligence can be made good by greater age.

When investigations of this kind shall have been carried out with a large number of feeble-minded individuals of different chronological ages, we may hope to reach a far deeper insight into the whole structure of defective intelligence in its different stages of development and degrees of enfeeblement.

(c) Intelligence and school ability. The problem we have already met with the normal children (pp. 57 ff.) meets us again with the abnormal and leads us to quite similar conclusions. That is, only a partial correspondence exists between the magnitude of the mental defect and the reduction in school ability. Kramer states that a large number of children were retarded in their school classes by the same number of years as they were retarded in intelligence, yet there were a good many who were more backward in school status than in intelligence (the opposite condition, less backward in the school, almost never obtained). In fact, there were some children completely incompetent for school work in whom no corresponding mental defect could be made out. Similarly, among the 8- and 9-year-old children turned over to the special classes (auxiliary school) Chotzen found a large number that did not have the two years of backwardness demanded by Binet for such a condition, but who, nevertheless, certainly belonged in the special school, because they failed completely in the regular school.
This pedagogical retardation that is non-intellectually conditioned is, as will be understood, in some cases a product of external conditions, in particular of poor home conditions, neglect, change of residence and school, long illness, etc. In other cases, however, what is lacking is something internal: those volitional attributes that must supplement intelligence to produce useful men are not developed to the same degree as the intelligence. There are, then, the morally feeble-minded: "children of this type, as one might expect, shirk their lessons, are up to all sorts of mischief in the class, are quite unaffected by punishments, and so forth, so that, despite good intelligence, they more or less often fail of promotion. Those cases in which these mental anomalies are accompanied by intellectual deficiency of a small degree prove to be especially unpropitious" (Kramer, 54, p. 31).

5. Points of View for the Reorganization and Improvement of the Gradation Method

Our discussion has revealed already a series of more or less serious defects in the Binet-Simon method, nor have these defects been removed by the revision made by Binet himself in 1911. Nearly every user of the method has called attention to weaknesses of some sort in it; moreover, many do more than merely criticize; they make proposals for modifying or supplementing the method, or even make use themselves without more ado of modified methods of conducting the tests at this or that point.
But it would become a very serious matter if individual investigators, on mere grounds of personal preference or chance bits of criticism, should be forever making changes in an instrument of investigation that has attained international usage; on the one hand such tinkering will destroy the balance of the whole system in which every test is peculiarly bound up with every other test, and on the other hand it will put an end to the comparison of the results of different investigators.

For these reasons it is to be recommended that we proceed in the future in this way: wherever our object is to lay the emphasis on the substance of the results secured, as in the testing of children for practical purposes, let us for the present still continue to use the old system, despite its evident defects. But independently from this, let investigations directed to methodological issues be undertaken with the aim of constructing a gradation system that shall be revised in every particular. But this task is beyond the ability of the individual investigator: the problems to be solved are too many and varied and the number of individuals that should be tested is too great. Rather is it true that here, if anywhere, is there opportunity for that community and division of work that is everywhere now demanded in psychology.

To prepare the way for the carrying out of such a program I enumerate here the chief points to be considered in this work of reconstruction and also offer for discussion some specific proposals of my own for modifications in the system.

(a) Selection and appraisement of the various tests. The criticism that has been passed upon the various tests has been based sometimes on theoretical considerations, sometimes on practical results. The critique of Ayres (31), who has done no work himself with the method, is an instance of the first type.
He complains that the tests have too seldom a direct relation to practical intelligence, that they principally concern such things as fluent use of language, memory span, response to problem questions that are quite foreign to real life, and also in part attainments that are to a great degree dependent on instruction and on influences of the home environment, and also work with abstract concepts, something, he says, with which only philosophers have to deal (!), whereas they do not touch the ability to get on with the activities of life; he wants more "doing tests" introduced. Although Ayres' criticism is justified in many respects, yet he seems to have overlooked the fundamental fact that intelligence is a formal activity, and that of necessity it is operative also in tasks whose content is not such as appears in real life. Indeed, problems of this sort have the methodological advantage that there is certainly no uncontrollable influence of training in them.
More important are the criticisms that proceed from the empirical retrial of the tests. It has been shown that, as a matter of fact, there is too intimate dependence with school and environmental influences in many of the tests; others could not be assigned positively to a specific age-level or showed no clear differences in the performances of children of unmistakably different intelligence. Again, objection is to be raised to those tests in which there is a strong probability that the right answer may be a matter of mere chance, like the tests: "Show me your right hand, your left ear" and "Is it forenoon or afternoon?"
The fitness of a test to be employed at all, and its assignment to a given age-level is something that we shall be able in the future to work out by different methods.
In the first place we have to make use of the relation between the age-level and the test.
In general, for a test to be valid for a certain level, the requirement is that approximately 75 per cent of all children of this age shall be able to pass the test. This requirement would correspond to the normal standard of validity previously mentioned (p. 45 f.), and Bobertag, as more recently Bell (32), has actually checked up the assignment of given tests to given age-levels in accordance with this principle; Terman and Childs (63) take 66 per cent for the critical value, though this would seem, for the reasons already cited, to be less appropriate.
Taken alone, however, this principle is inadequate, for it does not inform us whether the test would be characteristic for just this age-level only and not just as much or nearly as much for another age-level. To determine this we must discover with what frequency the test is passed in other ages; and that test is most useful that shows the most decided advance with age (a helpful methodological device to which Bobertag was the first to call attention).
For the sake of illustration let us invent an example. Suppose two tests have each been passed successfully by 75 per cent of 9-year-old children, but that the one test shows little, the other decided difference in the frequency with which it is passed by 8- and 10-year-old children.
If Test A be passed by 65, 75 and 80 per cent of 8-, 9- and 10-year-old children, respectively, and Test B by 45, 75 and 90 per cent in the same three ages, respectively, then the latter test sets a task whose performance is just normal for the 9-year-old, as compared with the 8-year-old children, and practically a self-evident activity for 10-year-old children; Test B is then the more useful test.
Since the process of mental development brings into maturity in succession a series of different part-functions, it follows that for each age there should be a series of tests to correspond to the phenomena of development that have just appeared; it must be possible, with the aid of the principle of decided advance with age, to pick these tests out from a number of others.
Again, the matter of correspondence between the results of different investigations must be considered in the selection of the tests. A test that grades the same or nearly the same with German, French, English and American children has naturally more claim to be included in the final system than one that varies markedly with the examiner or with the examinees. The table that Bell (32) has prepared is instructive in this connection. He presents, side by side, the age-rank that each of the Binet-Simon tests would have attained on the basis of the results of Binet, Levistre and Morle, Johnstone, Goddard, Bobertag, and Terman and Childs.23
In many of the tests the variations are quite large; thus, the test of "comparing two objects from memory" ranges from the 6th year (Johnstone) to the 9th year (Terman and Childs), the test of "naming 60 words in three minutes" from the 10th year (Goddard) to the 15th year (Levistre and Morle, Terman and Childs). The assignment of such a test to a single age-level becomes, then, evidently an arbitrary matter. Over against these are other tests that show great constancy, at least so far.
Thus, "counting 13 pennies," "esthetic comparison," "showing right hand and left ear" fluctuate in rank-assignment only between the 6th and the 7th years, the "recognition of omissions in drawings" only between the 7th and the 8th years, the test of "counting backward from 20" only between the 8th and the 9th year, that of "naming the months" only between the 9th and the 10th year. The "hard problem-questions" test is ranked by all these investigators save Goddard in the 12th year, etc. It will be seen that these are for the most part tests in which verbal formulation plays little or no part.
23 It must be remembered that the tables and materials from which Bell had to construct his summary have been assembled so differently by the different investigators that their gauging of the several tests is not really directly comparable, so that Bell's tables must be regarded merely as a preliminary attempt at checking up the results of various investigations.
It is, indeed, quite natural that where the problem and its answer are intimately connected with verbal expression, national peculiarities must make themselves evident; but it will be possible to reduce this source of error if more heed is given in the future to the transference of the tests from the one language to the other in such a way as to fit as exactly as possible the linguistic and cultural tone of the second nation and thus secure equal difficulty in the problems: the actual verbal translation that has been used by many investigators has often failed to meet this requirement. Thus, for instance, the rather free adaptation of the method that Bobertag made for Germany has yielded in many cases results in closer accord with those of Binet than have the literal translations of the Americans.
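The contrast drawn above between tests of widely varying and of nearly constant age-assignment can be tabulated in a few lines. What follows is only an illustrative sketch: it uses just the extreme years quoted in the text from Bell's summary, and the names `reported_years`, `spread` and `CONSTANT_MAX_SPREAD`, together with the one-year limit for calling a test "constant," are assumptions introduced here and are not taken from Bell's table.

```python
# Lowest and highest year to which each test was assigned by the
# investigators compared, as quoted in the text above.
reported_years = {
    "comparing two objects from memory": (6, 9),
    "naming 60 words in three minutes": (10, 15),
    "counting 13 pennies": (6, 7),
    "recognition of omissions in drawings": (7, 8),
    "counting backward from 20": (8, 9),
    "naming the months": (9, 10),
}

# Assumed cut-off: a test counts as "constant" if the assignments
# differ by no more than one year.
CONSTANT_MAX_SPREAD = 1


def spread(low, high):
    """Number of years between the lowest and highest assignment."""
    return high - low


constant_tests = sorted(
    name for name, (low, high) in reported_years.items()
    if spread(low, high) <= CONSTANT_MAX_SPREAD
)
variable_tests = sorted(
    name for name, (low, high) in reported_years.items()
    if spread(low, high) > CONSTANT_MAX_SPREAD
)

for name in variable_tests:
    low, high = reported_years[name]
    print(f"{name}: years {low} to {high}, spread {spread(low, high)}")
```

Under these assumptions only the two tests named in the text as varying widely fall into `variable_tests`; a different cut-off would of course divide the tests differently.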
Finally, we shall have also to judge the value of a test according as it does succeed in bringing plainly to light differences in intelligence that are known from other sources to exist. On this point Mlle. Descoeudres (46) has carried out a study of the present Binet tests, though, to be sure, upon but a limited number of children. She tested one "intelligent" and one "unintelligent" child from each of the six years of a boys' and of a girls' Volksschule: the selection was determined by the teachers' estimates of the pupils' intelligence. When, now, she compared the results of the tests for the 12 unintelligent and the 12 intelligent children, taken as groups, she found that the differentiation of the two groups appeared with quite unequal clearness in the different tests. Those tests in which the intelligent had the clearest advantage over the unintelligent (and that therefore have the most claim for consideration as tests of intelligence) are cited in the first column of Table XIV.
TABLE XIV
BINET TESTS WHEREIN A CLEAR DIFFERENCE IS SHOWN
Between Intelligent and Unintelligent Normal Children (Descoeudres): Arranging 5 weights. Definition superior to use. Counting backward. Explanation of picture. Noting omissions in drawings. Detecting absurdities.
Between Normal and Feeble-Minded Children (Chotzen): Arranging 5 weights. Definition superior to use. Counting backward. Comparison of two objects from memory. Repeating five digits, 16 syllables, and the story. Counting coins. Making change.
Between Different Grades of Feeble-Minded Children (Descoeudres): Definitions. Description of picture. Comparison of two objects from memory. Problem-questions.
Mlle. Descoeudres has also undertaken a study of feeble-minded children (73), which may be introduced here for comparison. (We shall have occasion to discuss it in more detail later in another connection.)
The children were arranged in order of the estimated degree of their feeble-mindedness and with this was compared their capacity in 15 different tests. Among these tests were six from the Binet-Simon series, four of which yielded extraordinarily high correlations (between 0.80 and 0.88) with the estimated intelligence. These four tests are listed in the third column of Table XIV. With the tests of "knowing coins" and "naming of 60 words in three minutes" the correspondence was of lesser degree.
And thirdly, we must call to mind the results of Chotzen to which we have already referred (p. 87), in which certain tests gave far clearer expression than others to the difference between normal and feeble-minded children of the same age. These tests are listed in the second column of the table.
It is worth noting that most of the tests appear several times in the three columns, despite the fact that the three investigations were carried out with children of quite different ages and under otherwise varying conditions. This shows that certain tests are particularly fitted to bring differences in intelligence out in clear relief, and at the same time it shows us a way to pick out these true tests of intelligence from the rest.24
24 There is one other point of correspondence that ought to be mentioned, viz., that the differentiation of the children according to their social status was also revealed for the greater part by these same characteristic tests as revealed the intellectual differentiation (see pp. 53f.).
It is unnecessary to add that there is no reason why controls like these should be limited only to the tests already used by Binet; in fact, comparisons between intelligent and unintelligent pupils have already been carried out for the most varied sorts of tests by Meumann, Winteler, Cohn-Dieffenbacher and many foreign investigators. From these and other future investigations like them there will surely
be found certain tests of so decided a symptomatic value that they will deserve to be adapted for introduction into the graded system. In this connection we may allude among others to the modifications of the Masselon test recently proposed by Meumann (v. p. 16), a test that Meumann believes is usually solved with logical insight by the intelligent but not so by the unintelligent.
Investigations in correlational psychology of which we shall speak in the next section likewise afford many tests whose results exhibit decided correspondence with estimated intelligence. These tests are evidently not such as can be introduced directly into the system of graded tests because they deal with fine gradation, whereas the Binet scale recognizes only tests that present merely the alternatives "right" or "wrong." Possibly, however, they will admit of rearrangement into a simpler form appropriate to the scale.
By using all the methodological resources that we have cited we shall gradually succeed in selecting tests that are far more characteristic of the intelligence of a given age-level than those now in use and that are homogeneous for the different cultural groups and nations to be tested.
(b) The composition of series for the several years. Since intelligence is a formal capacity that can be determined only by multiform testing, care must be taken that each single age-level should have a manifold of tests. It is not enough, therefore, to put together any sort of separate tests that happen to be passed by 75 per cent of those of the age-level in question. If the tests are too similar to one another, their combination does little more than the testing by any one of them would do. Binet and Simon did not keep this principle sufficiently in mind: some of their age-levels contain only linguistic tests and no tests of activity.
Furthermore, the age-levels, considered as wholes, must also be adjusted, for to demand that particular tests be passed and to demand that all five tests of a given age-level shall be passed are two entirely different things. This adjustment is rendered more difficult by the fact that in computing mental age one must not only deal with the tests of one age-level, but also make supplementary use of tests from the higher levels; accordingly, in this adjustment of the levels as a whole attention must be paid to the interrelation of tests that come into consideration in connection with different near-by age-levels.
The controlling principle for the adjustment or standardization of the age-levels is that approximately symmetrical distribution of the mental ages must prevail for each level. That is, the tests are properly arranged and skillfully assembled into a system if, when a large number of unselected normal children of a given age are tested, a large middle group stand 'at age' and the rest are divided fairly equally between advanced and retarded cases. To carry out such investigations practically it will be necessary to try as many tests as possible with each pupil; in this way it will be feasible to assign the passing of each particular test to this or that age-level and to discover the general arrangement of the tests that furnishes the closest to a symmetrical distribution.
The investigation can be made with more precision if the curve of distribution be based upon the mental quotient instead of the mental age, for we should then anticipate fairly good correspondence between the curves of distribution of the different age-levels. The mental quotients for each age-level would then be grouped together in 10 per cent ranges, i. e., we should have first the children with mental quotients ranging from 0.91 to 1.00 and 1.01 to 1.10 that would form the compact middle group, then on either side of them groups of rapidly diminishing frequencies, those with mental quotients 0.81 to 0.90, 0.71 to 0.80, etc., below, and those with quotients 1.11 to 1.20, 1.21 to 1.30, etc., above.
In the older form of the Binet-Simon scale the number of tests assigned to each year differed. In 1911 Binet put five tests in every age; it is to be recommended that this idea of uniformity be followed in the future because the computation of the final status is much simplified in that way.25
25 Cf. also the provisional new arrangement of Bobertag in Appendix II.
(c) The extension of the system. In the next place the system of tests is to be extended beyond its present limits and in different directions.
Thus far the lack of tests has been most seriously felt in the upper years. The tests that Binet and others have devised above the 11th year have been thus far quite tentative and provisional; at the best they could furnish us the necessary supplementary material for the ascertainment of mental ages 10 and 11, but they absolutely fail to provide a direct measurement for the mental ages 12 to 15. We must admit that the discovery of appropriate tests for these higher levels of mental maturity is much more difficult than for the younger children, but the difficulty is to be overcome. Thus, Terman and Childs (64) have recently proposed a series of tests, each one of which is susceptible to diverse gradings with respect to the capacities that it requires, so that it can be employed up to mental age 15.
Among these tests are arithmetical reasoning, familiarity with a list of selected words, a generalization test (discovering the 'moral' of a fable that is read to the subject) and the Ebbinghaus completion test with the task made progressively more difficult.26 Let us hope that in such a way we may gradually advance from one year to another and may finally create a series for adults as the termination of the whole scale. However, this problem is certainly not so easy of solution as Binet thought when he transferred to higher ages tests that he had originally developed for the years 11, 12 and 13, and made the last of these groups over into tests for "adults" by the addition of two new ones.
Another thing that is greatly to be desired is an extension of the system by the creation of parallel series of tests for each year.27 How gladly would we use the method to trace the mental development of the same children through several years; but there are difficulties in the way of this, because, of course, when the same tests are repeated, the child confronts them with a different attitude.28 But if we had at our disposal other equivalent series of tests, it would be possible to make repeated testings of the same individuals more frequently. In the same way, when group tests were carried on, those children between whom there might be danger of collusion could be tested with different series. Finally, it is valuable to have a supplementary series at hand in case an investigation is rendered worthless by disturbance or ineptitude of any sort.
26 See Appendix II.
27 Cf. Binet, 36, p. 163.
When we shall have undertaken simply those try-outs of a considerable number of single tests suggested above (cf. pp. 92 ff.), we shall certainly have enough at our command to arrange parallel series for each year: though there will be some difficulty in securing an approximate equivalence between the corresponding scales.
There has been some demand for yet another kind of extension of the scale. As it has appeared that the mental differences are extreme between one year and another in the case of the younger children, the need has been felt of intermediate stages, as for instance for specific standards for such age-levels as 6.5, 7.5 and 8.5 years. In our opinion this need is to be satisfied in another way, viz., by use of the mental quotient, since this permits us to take fractions of ages into account without special half-year steps (see p. 105, below).
28 Binet had five 9-year-old children tested twice with the same tests with a 14-day interval. On the average, the children passed 2.5 tests more on the second trial, an amount that would signify an increase of a half-year in mental age (pp. 164-5). As Bobertag has shown, the danger resident in repetition is not so great as this when the interval is longer (see above, p. 69); yet even under these conditions the use of the same tests is but a make-shift and a second and a third repetition of the same tests would be surely quite out of the question.
Finally, mention may be made of other desires, curae posteriores: differentiation of the scales for children of different social strata, for the two sexes, and especially series devoid of the speech factor for the testing of the deaf and dumb, etc.
(d) The computation of the final values. There are two main difficulties that demand our attention here.
The one consists in the limitation of the measures of the mental age and the chronological age to whole numbers. This necessity of using whole numbers must often entail an arbitrariness that renders impossible the carrying out of the method with precision. For instance, a child who, when tested, lacks four months of completing his eighth year of life, must of necessity be classed as "8 years."
If he passes the 7-year-old tests and two more, he still receives the mental age of 7 years, and is, accordingly, credited with a mental retardation of one year, although in reality there is practically no retardation at all. Bobertag29 tried to circumvent this difficulty by taking for his testing only those children that were close to their birthday (at least within 2 months). But usually there is no chance for free choice like this: there are certain children to be tested, regardless of what their age happens to be at the time. Besides, that kind of selection at most only lessens the difficulty for the chronological age, not for the mental age. The failure to consider the two or three excess tests passed still remains as a defect in the calculations.
29 40, I, p. 110.
Hence, however enticingly simple they may be, we shall have to give up the use of the rough whole-year designations, like 1, 2, 3 years of retardation, and make use of fractional values: it is enough, of course, to carry them to the first decimal place. In figuring mental age each single test passed in excess must, then, represent a fraction of a year. If, for example, two of the five 8-year tests are passed, then 2/5 is to be added to the mental age; the child in our example just above would then have obtained a mental age of 7.4 years. Terman and Childs (64) are already making use of such a mode of calculation, only theirs is made rather awkward by the presence of different fractional values in the several years: when the year contains seven tests, each test has only the value one-seventh, when five tests, one-fifth. This feature, too, confirms our desideratum already expressed that every one of the years should contain just five tests; then each test would have the same value, 0.2 of its year.
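The fractional reckoning just described can be set out as a short calculation. The following is only an illustrative sketch, not a published procedure: the function name `fractional_mental_age` and its parameter names are assumptions introduced here, and the only figures taken from the text are the five tests per year and the example of the child who passes the 7-year tests and two of the 8-year tests.

```python
def fractional_mental_age(basal_age, excess_tests_passed, tests_per_year=5):
    """Return a mental age carried to the first decimal place.

    basal_age: highest age-level at which all tests were passed
    excess_tests_passed: number of tests passed above that level
    tests_per_year: tests assigned to each year (five in Binet's 1911 scale)
    """
    if tests_per_year <= 0:
        raise ValueError("tests_per_year must be positive")
    # Each excess test counts as an equal fraction of one year.
    return round(basal_age + excess_tests_passed / tests_per_year, 1)


# The child of the example: all 7-year tests passed, plus two 8-year tests.
example = fractional_mental_age(7, 2)

# Where a year contains seven tests, each test is worth only one-seventh.
seven_test_year = fractional_mental_age(7, 2, tests_per_year=7)

print(example, seven_test_year)
```

With five tests to the year the same two excess tests are worth 0.4 of a year; with seven tests to the year they are worth about 0.3, which is the unevenness that a uniform five tests per year would remove.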
But now, once the use of the convenient whole numbers be given up, every objection against the introduction of the mental quotient is removed, for this furnishes us a single fractional value in place of the two fractional values, chronological age and mental age. This quotient lies for normal children in the neighborhood of 1.00 and grades off continuously from this value in both directions. As compared with the older method of dividing by the rough units of the age-levels, the use of the quotient has surely the advantage of affording a certain smoothness and continuity in the results, since the fraction (mental age divided by chronological age), when the decimals are used in each term, may assume any value whatever. Thus the mental quotient becomes not only a useful methodological device for the testing of abnormal children, but also a device to be recommended for use with normal individuals. We have already mentioned (p. 101) a case illustrative of its application.
The other difficulty of calculation pertains to the way in which scattered distribution of tests passed is handled in figuring mental age. As is well known, five scattered tests must be passed in order to add one year to the mental age, but no attention is then paid to the years in which these additional tests lie. Let us compare the two hypothetical examples which follow:
CHILD A.
All tests through the 6th year are passed: hence the basis for computation is a mental level of 6 years
also passed in Age 7: two tests
also passed in Age 8: three tests
also passed in Age 9: three tests
also passed in Age 10: two tests
total of 10 tests = 2 years
Resulting mental level: 8 years
CHILD B.
All tests through the 6th year are passed: hence the basis for computation is a mental level of 6 years
also passed in Age 7: three tests
also passed in Age 8: five tests
also passed in Age 9: two tests
also passed in Age 10: no tests
total of 10 tests = 2 years
Resulting mental level: 8 years
There seems no justification for equating these two children, because the first one really stands decidedly higher mentally by dint of his conspicuously good capacities in the higher levels. To be correct, we must credit more difficult tests (those lying in higher levels) with a larger fractional value than the tests normal to the age in question when we figure in these higher tests for addition to a lower age-level. We may propose a method of calculation for this purpose, that is not too complicated and that, like the mental quotient, takes account of the relation of the several years to each other. A test from a higher level used to supplement a failure in a lower level shall be counted not merely as one test, but as a quotient of the two years in question.
In our example just given, then, Child A would be figured out thus: basal point, mental age of 6 years; the tests from the four following years would be counted in this way: Level 7 is formed by 2 tests from the 7th year (each of these counted therefore as "1 test") and by 3 tests from the 8th year (each of which is to be counted as 8/7 test). The 8th year would be formed by 3 tests from the 9th year (each counting 9/8 test) and 2 tests from the 10th year (each counting 10/8 test). We get, therefore, as total additional credits:
2 x 1 = 2 tests
3 x 8/7 = 3.4 tests
3 x 9/8 = 3.4 tests
2 x 10/8 = 2.5 tests
Total: 11.3 tests
Since every 5 tests are worth one mental year, the above value indicates a supplement of 11.3 ÷ 5 = 2.3 mental years: so Child A gets a mental age of 8.3 years.
With Child B it works out thus: the 2 tests from age 9 serve to
supplement age 7, and therefore have the value 9/7 each; the remaining 3 tests in the 7th year and likewise the 5 tests in the 8th year count for their own level, and thus figure 1 each.
3 x 1 = 3 tests
2 x 9/7 = 2.6 tests
5 x 1 = 5 tests
Total: 10.6 tests
These 10.6 tests indicate a supplement of 2.1 years; so Child B gets a mental age of 8.1 years and his inferiority to Child A is now brought out statistically.
III. Estimation and Testing of Finer Gradations of Intelligence (With the aid of the method of ranks)
1. The Problem
The different degrees of intelligence that are revealed by the Binet method are relatively gross: within any one of its age-levels there are possible other and very much finer gradations that escape detection by its tests. Yet these very differences are often enough just the ones of consequence, particularly whenever we are dealing with the members of a relatively homogeneous group. If, for instance, we are comparing the pupils of a school grade that are of approximately the same age and of corresponding school training, these pupils fall mostly into the same mental level according to the Binet-Simon tests, yet they occupy a finely graded scale of ranks within this level. Hence, the question what place a pupil occupies among those of his age or of his class in respect to intelligence must be answered by other methods that seek to establish a rank-order of the individuals concerned.
Rank-orders of the pupils of a class can be established in quite different ways. In the first place there is the school or pedagogical rank-order that is based on school performances.
Thus we number the pupils according to the outcome of a school exercise: the departmental teacher ranks them at the end of the term on the totality of their work in his subject: finally, all these ranks in different subjects are combined into a rank-order for the school certificate in which every pupil is assigned his "class-place."
Since these rank-orders are always available, it is but natural to employ them for our problem, and this is what actually happened very often in the early stages of the experimental study of intelligence. Thus, for example, Ebbinghaus (5) divided his subjects into three sections on the basis of their class-marks in order to determine whether the different groups responded with different degrees of success to his completion tests. Other investigators had the teachers select a number of 'good' and 'poor' scholars in order to make comparison of their behavior under experimentation.
Yet, however convenient this ever-ready classification may be, it is not at all adequate for our purposes, because the implicit assumption that underlies such uses of the class-marks, namely that school performance is an absolutely accurate indication of intelligence, is unjustified. The results obtained by the Binet-Simon method have already shown this (see p. 59), and other statistical data will confirm it. As a matter of fact, every school man who is blessed with psychological insight knows it himself.
We need, then, a rank-order that is based directly upon the degree of intelligence of the pupils. Such an order does not exist in the ordinary school system, and must therefore first be created ad hoc.
There are available, again, two different ways of accomplishing this aim: either the teacher, on the basis of everything that he knows about his pupils, may estimate their intelligence and arrange them according to his estimate (see the next section) or we can apply experimental tests of intelligence, the outcome of which admits of arranging the pupils in a series. Rank-orders of intelligence are therefore divided into orders based on estimates and orders based on tests.
In the last resort the second of these divisions brings us to the question: Is it possible, on the basis of a short examination with a series of tests to arrive at a gradation of pupils that corresponds with their actual differences of intelligence and such that the rank that each gains is sufficiently characteristic