P-, Q-, AND R-TECHNIQUES 173

i. The Disparity of Units of Measurement. His main criticism¹ is based on the fact that the units of measurement for different tests are generally incommensurable. But that objection, as it seems to me, comes from taking the word 'test' too literally. For purposes of expounding the alternative method it is no doubt convenient to draw a sharp contrast between 'persons' and 'tests.' Yet, after all, what we correlate when we 'correlate tests' are measurements for certain persons' traits; and what we correlate when we 'correlate persons' are in theory measurements for the same traits in the same persons. The only difficulty, therefore, is to select traits and to find units which shall be consistent with the particular form of statistical analysis in view.

After all, most examiners, I imagine, would be quite as ready to compare the same examiner's marks for different school subjects (or 'tests') as they would be to compare different examiners' marks for the same subject; and every teacher is continually contrasting the different abilities of one and the same individual: 'John is much better at Latin than he is at French'; 'Joan did not do so well in the arithmetic tests as she did in reading and dictation.' If, to take Thomson's own instances of incommensurable measurements (p. 201), the marks obtained in an 'analogies test' were insuperably disparate from those obtained in a 'dotting test' (i.e. if marks for such tests could not possibly be transmuted into commensurable units), how could we ever cross-multiply the figures to obtain the product-moment correlation between those tests?

But, it may be said, in correlating a couple of tests we begin by averaging marks for the same test, whereas in correlating persons we must first average marks from a number of different tests. How can we average such figures if the units of measurement are dissimilar?
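The point about cross-multiplying marks in disparate units can be made concrete with a small sketch. It assumes that each test is first reduced to standard measure (deviation from the test mean divided by the test's standard deviation), and it uses invented marks for five persons; the test names, figures, and function names are illustrative assumptions only, not data or notation from the text. The product-moment correlation is then the mean cross-product of the standard scores, so a change of unit in either test leaves it unaltered. The closing lines apply the same conversion before averaging one person's marks over several tests and before correlating two persons.

```python
from statistics import mean, pstdev


def standardize(scores):
    """Reduce raw scores to standard measure: (x - mean) / SD."""
    m, sd = mean(scores), pstdev(scores)
    return [(x - m) / sd for x in scores]


def product_moment(x, y):
    """Product-moment correlation as the mean cross-product of standard scores."""
    zx, zy = standardize(x), standardize(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(zx)


# Hypothetical raw marks for five persons on three tests in unrelated units.
analogies = [12, 15, 9, 20, 14]        # items correct
dotting = [310, 342, 298, 365, 350]    # dots made in a fixed time
reading = [48, 55, 40, 52, 60]         # marks on a reading scale

# Correlating tests: cross-multiply standard scores for the same persons.
r_tests = product_moment(analogies, dotting)

# Expressing the dotting test in tens of dots changes the unit, not r.
dotting_in_tens = [d / 10 for d in dotting]
r_rescaled = product_moment(analogies, dotting_in_tens)

# Correlating persons: standardize each test first, then read off each
# person's profile across the tests.
z_by_test = [standardize(t) for t in (analogies, dotting, reading)]
profiles = [[z[i] for z in z_by_test] for i in range(len(analogies))]

# Averaging one person's marks over different tests is now meaningful,
# because every mark is expressed in the same standard unit.
person_means = [mean(p) for p in profiles]

# Correlation between the first two persons over the three tests
# (far too few tests for a real analysis; shown only for the mechanics).
r_persons = product_moment(profiles[0], profiles[1])

print(round(r_tests, 3), round(r_rescaled, 3), round(r_persons, 3))
```

With only three tests the correlation between persons is of course unstable; the sketch is meant to show the order of operations, not a usable sample size.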
As a matter of fact, the answer is still the same; and Thomson himself really supplies it in an earlier chapter of his book ([132], pp. 114 f.). There he proposes to find a weighted average for four such heterogeneous tests as picture completion, geometrical analogies, a reading test, and the Stanford-Binet test: his method is, first, "we reduce the scores to standard measure" (p. 116). Once the arbitrary measurements furnished by the tests have been changed to standard measure, we can average the different tests with or without an additional weighting. That is all that is required for correlating by persons. This or its equivalent was, indeed, the plan suggested in my Memorandum for the Examinations Inquiry Council, namely, "first to reduce the crude scores to terms of the

¹ With his discussion of an important side-issue, the so-called reciprocity principle, I shall deal in Part III.