BULLETINS OF THE
Zoological Society of San Diego

No. 17

Four Papers on the Applications of Statistical Methods to Herpetological Problems

I. THE FREQUENCY DISTRIBUTIONS OF CERTAIN HERPETOLOGICAL VARIABLES
II. ILLUSTRATIONS OF THE RELATIONSHIP BETWEEN POPULATIONS AND SAMPLES
III. THE CORRELATION BETWEEN SCALATION AND LIFE ZONES IN SAN DIEGO COUNTY SNAKES
IV. THE RATTLESNAKES LISTED BY LINNAEUS IN 1758

By L. M. KLAUBER
Consulting Curator of Reptiles, Zoological Society of San Diego

SAN DIEGO, CALIFORNIA
October 15, 1941

ZOOLOGICAL SOCIETY OF SAN DIEGO

BOARD OF DIRECTORS
W. C. Crandall, President
F. L. Annable, First Vice-President
Gordon Gray, Second Vice-President
Fred Kunzel, Secretary
T. M. Russell, Treasurer
Thos. O. Burger, M.D.
C. F. Cotant
J. Waldo Malmberg
F. T. Olmstead
Mrs. Robert P. Scripps
Robert J. Sullivan
Milton Wegeforth

HONORARY VICE-PRESIDENTS
G. Allan Hancock
Osa Johnson
Fred E. Lewis

STAFF OF ZOOLOGICAL GARDEN
Mrs. Belle J. Benchley, Executive Secretary
Ralph Virden, Superintendent of Maintenance
C. B. Perkins, Herpetologist
Milton Leeper, Head Gardener
Karl L. Koch, Ornithologist
Mrs. Lena P. Crouse, Head of Educational Department
Peter March, Head Keeper
Frank D. McKenney, D.V.M., Veterinary Pathologist

THE PURPOSES OF THE SOCIETY
1. To advance a sincere and scientific study of nature.
2. To foster and stimulate interest in the conservation of wild life.
3. To maintain a permanent Zoological Exhibit in San Diego.
4. To stimulate public interest in the building and maintenance of a Zoological Hospital.
5. To provide for the delivery of lectures, exhibit of pictures and publication of literature dealing with natural history and science.
6. To operate a society for the mutual benefit of its members for non-lucrative purposes.

TABLE OF CONTENTS

I. The Frequency Distribution of Certain Herpetological Variables
    Introduction
    Importance of Homogeneity
    Tests of Normality
    Scale Rows
    Ventrals
    Subcaudals
    Labials and Other Head Scales
    Lizard Scales
    Turtle Scutes
    Pattern
    Broods
    Rattles
    Hemipenes
    Measurements
    Illustrative Sampling
    Acknowledgments
    Summary
    Appendix
    Bibliography of Statistical Texts
II. Illustrations of the Relationship Between Populations and Samples
    Introduction
    Artificial Populations
    The Methods of Random Sampling
    Variations of the Mean
    Ranges of Variation
    Dispersions of Samples Compared to Those of Populations
    Sampling of Alternative Attributes
    Summary
III. The Correlation Between Scalation and Life Zones in San Diego County Snakes
IV. The Rattlesnakes Listed by Linnaeus in 1758
    Introduction
    Past Usages
    Linnaeus' Method and Type Specimens
    Studies of Scale Counts
    Horridus
    Dryinas
    Durissus
    Conclusions
    Acknowledgments
    Summary
    Bibliography

APPLICATIONS OF STATISTICAL METHODS TO HERPETOLOGICAL PROBLEMS

I. THE FREQUENCY DISTRIBUTIONS OF CERTAIN HERPETOLOGICAL VARIABLES

Introduction

The methods of mathematical statistics may be used to advantage in the investigation of a number of herpetological problems; they will often validate conclusions as to species relationships, morphology, ontogeny, and genetics.
They are particularly valuable in assessing the significance of differences, relative degrees of variation, and the reality of correlations.

But the accuracy of several of the formulas most commonly used depends to some extent on how closely the distribution of the variates adheres to the normal probability curve.[1] For example, in taxonomic problems, one of the most frequently used formulas is that for determining the significance of the difference between two means, or the related problem of the probability that two samples were drawn from the same population and therefore represent the same species. This formula assumes a normal distribution of the population variates, although it gives satisfactory results with moderate departures from normality.[2] Similarly, normality is assumed in applying the correlation coefficient.[3] Certain descriptive indicators, such as the interquartile range, do not give a satisfactory picture of a distribution unless that distribution is substantially normal.

[1] For references covering the statistical terms and methods used see the appendix and bibliography. In this discussion it is to be understood that a "normal" distribution is one in which the frequency distribution of the variates follows the normal probability curve.
[2] Kenney, vol. 2, p. 141.
[3] Treloar, p. 104.

Since, in taxonomy, we are interested primarily in the population which a sample represents, rather than the sample itself, it is usually desirable to have some indication of the probability that the sample was drawn from a normally distributed population. For the fortuities of sampling cause deviations from a normal distribution in the sample, even though the population from which the sample has been drawn is normally distributed. In herpetological work of the kind here under consideration, the sample may comprise from one to several hundred specimens of preserved laboratory material available for study; the population is the much larger, but unknown, group of live animals which were in the wild at the time the sample specimens were collected.

It is the purpose of this paper to set forth the results of an investigation of some typical herpetological distributions of scale counts, morphological features, and pattern, to see whether normal distributions are frequent, and particularly to determine the nature of the deviations from normal in certain especially important characters. For if substantially normal distributions are the rule, as demonstrated in large samples representing these characters (ventral scale counts, for example), we may with some assurance presume normality in taxonomic problems involving closely related species, even though the available samples are too small to warrant final conclusions with respect to normality. (Larger samples permit greater assurance than smaller as to the probability of non-normality in the basic population.) If it be indicated that the variates in a certain character are distributed normally in several species, we need not investigate completely the distribution in some related form; if a visual inspection shows the dispersion to be substantially normal we may safely use any formulas which give accurate results with approximate normality, as, for example, that for determining the significance of the difference between averages.

Importance of Homogeneity

In an investigation of the shape of a dispersion curve we should be sure of the homogeneity of the sample, otherwise an inaccurate conclusion may be drawn. Care must be exercised not to complicate the situation by the introduction of extraneous variables or stratification.
Thus, if sexual dimorphism be present, the sexes should be tested separately; for if the ventral scutes in each sex of a certain snake be distributed normally, but there is a sex difference, then combining the sexes will produce a platykurtic (flat-topped) distribution, or even one which is bimodal. Hence, in such a combination a non-normal result may merely be an inefficient proof of sexual dimorphism. Similarly, geographically widespread samples are usually to be avoided in testing dispersion curves, for kurtosis or skewness may be only a cumbersome proof of geographic variation or incipient speciation.

If there be doubt as to the existence of sexual or territorial dimorphism, one of the usual tests for the significance of differences should be made before treating the entire collection as a homogeneous unit. This may seem like arguing in a circle, since one of the purposes of determining normality is to validate the significance test. However, the latter is substantially accurate even with some departure from normality, provided the distribution is unimodal and not strongly skewed.

Sometimes a platykurtic distribution may in itself suggest an unrecognized heterogeneity, if the same character is known to be normally distributed in other populations. For example, if the ventral scale counts of male rattlesnakes ordinarily have a normal distribution and they are found to be markedly platykurtic in a certain territory, one might suspect the presence of an unrecognized species confused with the one being investigated. Such a test would have indicated the composite character of Crotalus cinereous in Arizona, before the recognition and acceptance of C. scutulatus as a valid species. However, the same result can usually be achieved somewhat more simply by comparing the coefficients of variation in suspected populations with samples from other areas wherein the populations are assuredly homogeneous.
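The flattening produced by pooling two sexes can be checked numerically. The following is a minimal sketch and not part of the original study: it assumes two equally numerous sexes whose counts are normal with a common standard deviation, and the means and standard deviation are hypothetical values chosen only for illustration. For such an equal mixture the excess kurtosis works out to -2d^4 / (sigma^2 + d^2)^2, where d is half the difference between the sex means, so it is negative (platykurtic) whenever the means differ.

```python
from statistics import NormalDist


def mixture_excess_kurtosis(mean_m, mean_f, sigma, step=0.01, span=10.0):
    """Excess kurtosis of a 50:50 mixture of two normal curves (numeric)."""
    male, female = NormalDist(mean_m, sigma), NormalDist(mean_f, sigma)
    lo = min(mean_m, mean_f) - span * sigma
    hi = max(mean_m, mean_f) + span * sigma
    n = int(round((hi - lo) / step))
    xs = [lo + i * step for i in range(n + 1)]
    # Density of the pooled (sexes combined) population at each grid point.
    ws = [0.5 * male.pdf(x) + 0.5 * female.pdf(x) for x in xs]
    total = sum(ws)
    mean = sum(w * x for w, x in zip(ws, xs)) / total
    m2 = sum(w * (x - mean) ** 2 for w, x in zip(ws, xs)) / total
    m4 = sum(w * (x - mean) ** 4 for w, x in zip(ws, xs)) / total
    return m4 / m2 ** 2 - 3.0


def closed_form(mean_m, mean_f, sigma):
    """Same quantity from the formula -2 d^4 / (sigma^2 + d^2)^2."""
    d = abs(mean_m - mean_f) / 2.0
    return -2.0 * d ** 4 / (sigma ** 2 + d ** 2) ** 2


# Hypothetical ventral-count parameters (assumptions, not from the paper).
SIGMA = 3.5
same = mixture_excess_kurtosis(170.0, 170.0, SIGMA)    # no sex difference
pooled = mixture_excess_kurtosis(170.0, 177.0, SIGMA)  # sexes differ by 7
print(round(same, 6), round(pooled, 6), closed_form(170.0, 177.0, SIGMA))
```

With no sex difference the pooled curve remains normal (excess kurtosis zero); with a difference between the means the curve is flat-topped, which is the effect described above.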
Sometimes it may be desired to study distributions within a species or subspecies as a whole, even though territorial variations are known to exist intraspecifically. In such cases an attempt should be made to draw equal samples from each territorial or other element of the population, lest the result be distorted by the over-emphasis of the more numerous sections of the general sample.

Accuracy in making counts and in sexing is essential if the results are to have value. For example, if there be sexual dimorphism in a certain character, inaccurate sexing will cause the distribution in each sex to be skewed toward the other. Also, there must be a uniform method of making counts. Herpetological methods are not highly standardized; if data are accumulated from more than one source there must be assurance that uniform rules were employed in deciding questionable counts.

Tests of Normality

Two types of tests of normality are in current use: (1) the chi-square test for goodness of fit; and (2) the comparison of certain moments of the sample distribution with the corresponding moments of the normal curve. Although involving somewhat less computation than a test of moments, the chi-square method has been criticized because of the necessity of grouping the small edge-frequencies, and because it ignores the signs and distribution of the differences of the class frequencies from normal; that is, it does not distinguish between kurtosis and skewness, or their directions (leptokurtosis from platykurtosis, or positive from negative skewness).[4] Both of these methods will be found discussed at length in statistical texts (see the appendix and bibliography) together with the nomenclature of aberrations from the normal curve (Figs. 1-5).
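As a rough sketch of the second type of test, the skewness and kurtosis of a frequency table can be computed from its moments and each compared with its standard error. The version below is an illustration under stated assumptions, not the procedure of the texts cited: it uses the large-sample standard errors sqrt(6/n) and sqrt(24/n) and a normal approximation for a two-sided probability, so its P values need not agree with the published ones. The data are the supraocular counts of the Pierre series of C. v. viridis quoted later in this paper.

```python
import math
from statistics import NormalDist


def moment_test(freq):
    """Skewness and kurtosis of a frequency table, with approximate P values.

    freq maps each value of the variate to its frequency of occurrence.
    """
    n = sum(freq.values())
    mean = sum(x * f for x, f in freq.items()) / n

    def central(k):
        # k-th moment about the mean
        return sum(f * (x - mean) ** k for x, f in freq.items()) / n

    m2, m3, m4 = central(2), central(3), central(4)
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3.0   # zero for a normal curve
    unit = NormalDist()

    def two_sided(stat, se):
        # Ratio of the departure to its standard error, referred to the
        # normal curve (large-sample approximation; an assumption here).
        return 2.0 * (1.0 - unit.cdf(abs(stat) / se))

    return {
        "n": n,
        "skewness": skew,
        "excess_kurtosis": excess_kurt,
        "p_skewness": two_sided(skew, math.sqrt(6.0 / n)),
        "p_kurtosis": two_sided(excess_kurt, math.sqrt(24.0 / n)),
    }


pierre_supraoculars = {1: 7, 2: 145, 3: 370, 4: 138, 5: 12}
result = moment_test(pierre_supraoculars)
for key, value in result.items():
    print(key, value)
```

A positive skewness indicates a longer tail toward the higher counts; a positive excess kurtosis indicates a curve more peaked than the normal, a negative one a flat-topped curve.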
Graphic methods, such as plotting the points of a distribution against the normal curve of best fit, either on rectangular co-ordinates or using a probability scale, will also give a picture of the fit, although they will not indicate the probability that the parent population is non-normal.[5] In fact, it will often be found worthwhile to plot the theoretical against the actual frequencies, after a chi-square calculation has been made, in order to visualize the departure from normality. This is of particular value if one desires to find whether a certain attribute maintains a similar quality of deviation from the normal curve through several species; that is, whether this kind of deviation is characteristic of the attribute. Occasionally it will be desired to investigate such characteristic deviations further by fitting to other curves than the normal probability curve. See Croxton and Cowden, pp. 293-304; Elderton, pp. 58-127.

[4] Geary and Pearson, p. 1. Although the chi-square test is criticized because of the subjective factor involved in grouping edge-classes, there are cases where one or two freak specimens too greatly affect moment determinations. These, often juveniles which probably would not survive, should usually be eliminated.
[5] Codex Book Co. Arithmetic Probability Paper No. 3127 will be found useful; also the Otis Normal Percentile Chart of the World Book Co. Pearl (1940), p. 382, suggests another method of making a visual comparison between an empirical distribution and the normal curve.

Scale Rows

The tapering bodies of most species of snakes are usually correlated with changes in the number of dorsal scale rows. In order to discuss the shape of the dispersion curve of this character it is necessary to define the particular count which is to be used as a basis of investigation.
In most taxonomic work either the number of rows at mid-body is cited, or the number at the neck, at mid-body, and just before the vent.

There are four types of suppression (or lack of suppression) of scale rows between the neck and the vent: (1) a constant number; (2) a constant number from neck to mid-body, followed by a decrease to the vent; (3) a continuously decreasing number from neck to vent; (4) an increase from neck (at its point of least diameter) to mid-body, followed by a decrease toward the vent. While many species, and even genera, adhere to only one of these methods, others may follow two or more, since, after all, there is no very sharp line between them. In fact, there is not entire agreement as to the definitions of the three points at which the rows are to be counted; usually they are (1) on the neck one head-length posterior to the hinge of the jaw; (2) at mid-body half-way between the head and vent; and (3) a short distance anterior to the vent, to avoid the irregularities involved in the considerable diminution in body diameter at that point.

Complete studies of scale rows involve, not a determination of the number of rows at these arbitrary and somewhat ill-defined points, but rather a two-dimensional picture presenting the sequence number of each row suppressed (considering the row bordering the ventrals as No. 1), and the point of suppression, the latter being located by the number of the ventral scute (counting from the head toward the tail) opposite which the suppression occurs. This introduces a more complicated set of variables than can be the subject of the present investigation which, therefore, will be restricted to the number of rows at mid-body. But the term will be used in the rather broad sense of the maximum rows evident in a transverse band at approximately the central part of the body, rather than at a single carefully determined mid-point.
This will avoid variations produced by a rigid definition rather than a true condition of the scale rows; it will not differentiate between suppression just anterior or posterior to the exact mid-body.

Another difficulty in dealing with the distribution of scale rows lies in the strong tendency toward uneven (odd-numbered) rows in nearly all genera. This results from the fact that most counts are made up of a mid-dorsal row and two equal sets of lateral rows on either side, thus producing an odd-numbered total. Thus only in genera wherein the mid-dorsal row is occasionally suppressed, as for example in Trimorphodon, or in individuals having unequal numbers of lateral rows, is there an even-numbered total. Even where the mid-body count is considered to be the maximum found in a band extending both anterior and posterior to the true mid-point there are cases of unbalance, that is, bilateral asymmetry, wherein a row is suppressed much sooner on one side than the other, or fails entirely to appear on one side. Sometimes a row may be represented by only a few scattered scales.

But these even-numbered specimens are the exception rather than the rule; they will rarely reach eight per cent of the total, and in most species will be fewer. In checking the distribution of these variates against normality these even-numbered specimens may be allocated to the uneven classifications next above and below; that is, if there are 10 specimens with 24 rows, add 5 to the number with 23 and 5 to the number with 25. The series of uneven numbers can then be tested for normality of distribution. However, it is best, in those genera where there is true bilateral asymmetry (not the suppression of the mid-dorsal itself, as in Trimorphodon), to consider the laterals, rather than the total dorsals, as the variable. This is done by deducting the mid-dorsal row and determining the distribution of the laterals.
In this method a snake with, say, 24 scale rows is presumed to have a mid-dorsal, 12 laterals on one side and 11 on the other; while, of course, one with 25 rows has a mid-dorsal and two sets of 12 laterals each. Thus we can make the type of transformation shown in Table 1. This distribution of laterals can then be checked for normality in the usual way.

TABLE 1
San Diego County Crotalus viridis oreganus
Conversion of Dorsals to Laterals

Number of     Number of        Number of lateral rows
dorsal rows   specimens      11      12      13      14
    23             7         14
    24             5          5       5
    25           440                880
    26            37                 37      37
    27           121                        242
    28             1                          1       1
    29             2                                  4
  Total          613         19     922     280       5

The scale rows of most snakes at mid-body are too nearly invariant to require or justify an investigation of normality. The smaller and slimmer colubrids are often almost or quite without variation. For example, 213 specimens of Sonora occipitalis from eastern San Diego County all have 15 scale rows. The distribution in 274 specimens of Diadophis amabilis similis from western San Diego County is 13(12), 14(7), 15(255).[6] Of 334 specimens of Phyllorhynchus decurtatus perkinsi from desert San Diego County all have 19 scale rows except four, which are distributed as follows: 17(1), 18(2), and 21(1). Of 202 specimens of Rhinocheilus lecontei from southern California all but four have 23 scale rows; of the four aberrants, three have 25, and the other 24 scale rows.

Some of the larger species have a greater diversity. For example, 431 specimens of Lampropeltis getulus californiae (both pattern phases) from cismontane San Diego County have the following distribution: 22(2), 23(354), 24(27), 25(48); or, expressed as laterals, 10(2), 11(737), 12(123). This distribution does not exhibit enough variation to warrant a test for normality, but some of the larger colubrids may have a sufficient spread to justify such a test.
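The two ways of handling even-numbered dorsal counts described above can be sketched in a few lines. This is an illustrative reading of the method rather than the author's own worksheet; the function names are hypothetical, and the conversion assumes, as the text does, that an even total always means unequal lateral counts on the two sides (it does not apply where the mid-dorsal row itself is suppressed, as in Trimorphodon).

```python
from collections import Counter


def dorsals_to_laterals(dorsal_freq):
    """Convert a distribution of total dorsal rows into one of lateral rows.

    Each specimen contributes two lateral counts (left and right) after the
    mid-dorsal row is deducted. An even total is taken to mean unequal
    sides, e.g. 24 rows = mid-dorsal + 12 + 11.
    """
    laterals = Counter()
    for rows, specimens in dorsal_freq.items():
        remaining = rows - 1                  # deduct the mid-dorsal row
        low = remaining // 2
        high = remaining - low
        laterals[low] += specimens
        laterals[high] += specimens
    return dict(sorted(laterals.items()))


def split_even_classes(dorsal_freq):
    """Alternative: allocate each even class half to each odd neighbour."""
    out = Counter()
    for rows, specimens in dorsal_freq.items():
        if rows % 2:
            out[rows] += specimens
        else:
            out[rows - 1] += specimens / 2
            out[rows + 1] += specimens / 2
    return dict(sorted(out.items()))


# Dorsal-row frequencies as given in Table 1 and for L. g. californiae.
table_1 = {23: 7, 24: 5, 25: 440, 26: 37, 27: 121, 28: 1, 29: 2}
getulus = {22: 2, 23: 354, 24: 27, 25: 48}

print(dorsals_to_laterals(table_1))     # {11: 19, 12: 922, 13: 280, 14: 5}
print(dorsals_to_laterals(getulus))     # {10: 2, 11: 737, 12: 123}
print(split_even_classes({24: 10}))     # {23: 5.0, 25: 5.0}
```

The first two results reproduce the lateral totals of Table 1 and the lateral distribution quoted for Lampropeltis getulus californiae; the third reproduces the example of 10 specimens with 24 rows.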
For example, 178 specimens of Pituophis catenifer annectens from coastal San Diego County have the following dispersion of laterals: 13(2), 14(22), 15(111), 16(169), 17(48), 18(4). A chi-square test indicates that the dispersion probably approximates a normal distribution (P = 0.25).[7]

It is to be presumed that some of the larger species of the Boidae, with their high numbers of scale rows, would have some interesting variations, but securing sufficient data presents obvious difficulties. Among the smaller boids we have the following distribution of the laterals of 103 specimens of Lichanura roseofusca roseofusca from western San Diego County: 17(2), 18(5), 19(35), 20(93), 21(67), 22(4). By the chi-square test P = 0.13. In Charina bottae 142 specimens are distributed as follows: 19(17), 20(88), 21(89), 22(41), 23(36), 24(13). Here P is less than 0.001, for the distribution is skewed; however, these data represent a territorially non-homogeneous population from a large area, which has affected the result.

[6] Throughout this paper, in expressing distributions in this way, the values of the variate will be stated first, followed in parentheses by the frequency of occurrence of that value.
[7] The chi-square test for normality does not state the probability that a certain distribution is normal; it answers the question in this way: "If the population from which this sample was drawn were indeed normal, what percentage or proportion of similarly sized samples, drawn at random, would exhibit as great, or a greater, departure from normality than this one?" So in a way it gives a negative rather than a positive answer to the question. If we take the often-used probability limit of 0.05 we merely determine (when P is above 0.05) that there is not a strong indication that the parent distribution is not normal.

The rattlesnakes exhibit a moderate degree of variation in scale rows, no doubt because of their thick bodies; for it can be shown that among snakes there is often a positive intrageneric, and even intrafamily, correlation of the number of scale rows with adult body diameter; and, assuming a constant coefficient of variation, the thicker species will have a higher number of scale-row classes. Table 2 sets forth the data on the five largest homogeneous series of rattlers available to me.

TABLE 2
Distribution of Lateral Scale Rows in Homogeneous Series of Rattlesnakes

Number of      Platteville    Pierre         Pateros         S. D. County    San Lucan
lateral rows   series         series         series          series          series
               C.v. viridis   C.v. viridis   C.v. oreganus   C.v. oreganus   C. lucasensis
    11               4                            12              19
    12             493            387            894             922              15
    13            1113            895            322             280             523
    14              56             64              2               5             148
    15                                                                             5
    16                                                                             3
  Totals          1666           1346           1230            1226             694

These distributions are unimodal, but the dispersions are not great enough (that is, there are not enough classes) to determine whether there is a tendency away from normality.

Summarizing the scale-row study, it may be said that in most species of snakes the scale rows are too constant to permit useful determinations of whether such variation as there is follows a normal dispersion. Probably only the largest boids would produce results of interest. The significance of differences in scale rows may best be ascertained by means of a chi-square test of an R×2 table,[8] rather than by determining the difference between means. No doubt the location of the termination of dropped rows will warrant statistical examination in some cases. Sexual dimorphism is sometimes present, either in the number of rows, or the point of termination of suppressed rows.

Ventrals

The most important character employed in intrageneric classification is the ventral scale count, for it may be determined with accuracy and usually has a high degree of constancy in any territorially homogeneous series, the coefficient of variation approximating 2 per cent.
Yet it is subject to sufficient plasticity to show the effects of ecological and other changes.

[8] For the use of this method see such texts as Mills, p. 633; Snedecor, p. 164; Simpson and Roe, p. 295; Pearl, p. 329.

Chi-square tests of a number of species of colubrids indicate that normality of distribution is probably the rule, the results being shown in Table 3 for several homogeneous series.

TABLE 3
Evidence of Normality in the Distribution of Ventral Scale Counts of Example Colubrids

                                                          Number of    Chi-square
Species                        Area                Sex    specimens    probability P
Diadophis a. similis           Coastal S. D. Co.    M        128          0.22
                                                    F        131          0.82
Phyllorhynchus d. perkinsi     Desert S. D. Co.     M        126          0.73
                                                    F         99          0.29
Pituophis c. annectens         Coastal S. D. Co.    M         96          0.76
                                                    F         80          0.05
Thamnophis hammondii           San Diego Co.        M        170          0.61
                                                    F        159          0.93
Thamnophis o. ordinoides       Nw. Oregon           M        149          0.81
                                                    F        151          0.41
Lampropeltis g. californiae    San Diego Co.        M        202          0.46
                                                    F        171          0.60
Geophis nasalis                Volcan Zunil         M        124          0.26
                                                    F         89          0.45

P in the table indicates the proportion of similarly sized samples that would show a departure from normality at least as great as that shown by the available sample, if the population sampled were truly normal. Thus the evidence for normality of distribution is quite strong in this group of colubrids.

The five homogeneous series of rattlesnakes have been investigated by the alternative method of moments. The results are shown in Table 4.

TABLE 4
Evidence of Skewness and Kurtosis in Ventral Scale Counts in Homogeneous Series of Rattlesnakes by the Method of Moments

                                          Number of        Probability [9]
Species          Series           Sex     specimens     Skewness     Kurtosis
C.v. viridis     Platteville       M         441          0.44         0.73
                                   F         392          0.64         0.08
C.v. viridis     Pierre            M         342          0.79         0.007
                                   F         331          0.45         0.44
C.v. oreganus    Pateros           M         326          0.80         0.05
                                   F         289         <0.00001      0.81
C.v. oreganus    San Diego Co.     M         292          0.57         0.16
                                   F         278          0.74         0.72
C. lucasensis    San Lucan         M         168          0.0009       0.0004
                                   F         125         <0.00001     <0.00001

[9] As in the case of the chi-square test, the method of moments tests the evidence of non-normality, rather than normality, by determining the ratios of certain departures from normality to their standard errors. From these ratios the significance of the departures may be determined. In the above table, if the probability is above 0.05, there is assumed to be no strong evidence for non-normality in the parent population. Any result smaller than 0.01 is taken to indicate a high probability of non-normality.

These dispersions were also checked by the chi-square method and all were found to be well above the 5 per cent limit toward normality, except in the case of the Pateros females, but including lucasensis, which Table 4 shows to be non-normal. Thus we find substantial agreement between the two methods except in the case of the lucasensis series. Here it is determined that the low probability disclosed by the moment method results from two specimens (defective young) in each sex. These, of course, are grouped with others in the edge-classes when employing the chi-square method, and hence have small effect on the result. If we drop them out and recalculate the results by the moment method we have the following:

                 Male    Female
P (skewness)     0.43     0.62
P (kurtosis)     0.46     0.63

Thus there is little evidence of non-normality in lucasensis when these aberrant individuals are omitted.

As to the directions of the deviations, we find that the skewnesses are all positive, indicating a long tail toward the right, that is, a surplus of the higher ventral counts. With respect to kurtosis, six cases are positive (leptokurtic), the other four negative; obviously there is no weight of evidence for a trend toward either a peaked or a flat-topped curve being the mode.

In the worm snakes, genus Leptotyphlops, the dorsals, rather than the ventrals, are used in taxonomy.
The chi-square test applied to 56 specimens of L. h. humilis from western San Diego County, and to 40 specimens of L. h. cahuilae or cahuilae-humilis intergrades from the desert slope of the county, indicates normal distributions, although the number of specimens is too small to permit a fully adequate determination. 108 specimens of L. dulcis dulcis from Texas have a definitely flat-topped distribution (P less than 0.001). However, this population may not be homogeneous, as discussed elsewhere.[10]

[10] Trans. S. D. Soc. Nat. Hist., Vol. 9, No. 18, p. 103, 1940.

Subcaudals

The distributions of the subcaudals in some typical series of colubrids are shown in Table 5. The evidence is in favor of normality in most cases.

TABLE 5
Evidence of Normality in the Distribution of Subcaudal Scale Counts in Colubrids

                                                          Number of    Chi-square
Species                        Area                Sex    specimens    probability P
Diadophis a. similis           Coastal S. D. Co.    M        119          0.02
                                                    F        118          0.28
Phyllorhynchus d. perkinsi     Desert S. D. Co.     M        126          0.05
                                                    F         94          0.53
Pituophis c. annectens         Coastal S. D. Co.    M         88          0.64
                                                    F         80          0.48
Thamnophis hammondii           San Diego Co.        M        149          0.74
                                                    F        133          0.92
Thamnophis o. ordinoides       Nw. Oregon           M        110          0.44
                                                    F        122          0.35
Lampropeltis g. californiae    San Diego Co.        M        184          0.90
                                                    F        154          0.27
Geophis nasalis                Volcan Zunil         M        115          0.06
                                                    F         84         <0.001

The same result is apparent in the subcaudals of the rattlesnakes (Table 6), analyzed by the method of moments, although the foreshortened tails of the rattlesnakes render this character less important than in the sharp-tailed snakes.

TABLE 6
Evidence of Skewness and Kurtosis in Subcaudal Scale Counts in Homogeneous Series of Rattlesnakes by the Method of Moments

                                          Number of          Probability
Species          Series           Sex     specimens     Skewness     Kurtosis
C.v. viridis     Platteville       M         441          0.76         0.21
                                   F         390          0.09        <0.001
C.v. viridis     Pierre            M         342          0.41         0.51
                                   F         331          0.46         0.85
C.v. oreganus    Pateros           M         324          0.57         0.93
                                   F         289          0.002        0.07
C.v. oreganus    S. D. County      M         294          0.38         0.29
                                   F         279          0.72         0.24
C. lucasensis    San Lucan         M         168          0.22         0.50
                                   F         123          0.85         0.04

Thus, while not every test results in a probability above 0.05, the general indication is toward normality. I have rechecked the Platteville and Pateros females, the two cases which show the greatest probability of non-normality. I find that in both there are several high counts, possibly the result of inaccurately sexing juveniles, always a possibility in handling some hundreds of these little specimens. If we employ the chi-square test, which groups these end-classes, we find for the Platteville series P = 0.82, and for the Pateros series P = 0.55. Thus the appearance of non-normality results from these few aberrant specimens, totaling only about one per cent of the available material.

These distributions afford good examples of the extent of the departures from the normal curve of sets of variates in which there is strong evidence that the parent populations are truly normal. For instance, these are the detailed figures for the female Pierre viridis:

Number of     Actual          Theoretical          Theoretical
subcaudals    distribution    distribution (A)     distribution (B)
   16               1               1.84*                2.18*
   17              10               9.36                10.00
   18              34              32.19                32.80
   19              65              68.53                67.86
   20              95              90.88                89.17
   21              74              74.92                74.04
   22              36              38.37                38.86
   23              13              12.24                12.98
   24               2               2.42                 2.71
   25               1               0.32*                0.39*
  Total           331             331.07               330.99

Column A gives the theoretical distribution by ordinates, Column B, by areas. I have stated that in cases such as this, where the variates can take only integral values, I have usually determined the theoretical distribution by the first method. It will be observed that the area method gives a slightly more platykurtic curve than the other.
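The two ways of deriving the theoretical frequencies can be sketched as follows. This is an illustrative reconstruction and not the author's computation: it assumes the normal curve is fitted with the sample mean and a standard deviation computed with divisor n, without any grouping correction, and it does not pool the starred edge classes, so the figures agree with Columns A and B only approximately.

```python
from statistics import NormalDist

# Subcaudal counts of the female Pierre viridis, from the table above.
observed = {16: 1, 17: 10, 18: 34, 19: 65, 20: 95,
            21: 74, 22: 36, 23: 13, 24: 2, 25: 1}

n = sum(observed.values())
mean = sum(x * f for x, f in observed.items()) / n
# Assumption: divisor n and no grouping correction; the paper does not
# state which form of the standard deviation was used.
sigma = (sum(f * (x - mean) ** 2 for x, f in observed.items()) / n) ** 0.5
curve = NormalDist(mean, sigma)

# (A) By ordinates: height of the fitted curve at each integral value.
by_ordinates = {x: n * curve.pdf(x) for x in observed}

# (B) By areas: area under the curve between x - 0.5 and x + 0.5.
by_areas = {x: n * (curve.cdf(x + 0.5) - curve.cdf(x - 0.5))
            for x in observed}

print("count  actual  ordinates  areas")
for x, f in observed.items():
    print(f"{x:5d}  {f:6d}  {by_ordinates[x]:9.2f}  {by_areas[x]:6.2f}")
```

Because each area averages the curve over a unit interval, the area method lowers the modal class and raises the classes toward the tails, which is the slightly flatter curve noted in the text.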
The actual distribution is exceedingly close to either, closer in fact than would be obtained in nine out of ten random samples if the parent population is indeed normally distributed; for P = 0.95 when the value of chi-square is obtained by comparing with the ordinate curve, or P = 0.93 if the area curve is taken.

* In the table of subcaudals of the female Pierre viridis, the starred figures are for 16 or less and for 25 or more.

Labials and Other Head Scales

The labials, in homogeneous series of colubrids, seldom have a sufficient variation to indicate more than a strong unimodal tendency. In the rattlers there is more variation, usually five or more classes being present, and from these the trend toward normality can be ascertained. The distributions in the five series which have been used before are shown in Table 7.

TABLE 7
Distribution of Labial Scale Counts in Homogeneous Series of Rattlesnakes

Supralabials         Platteville   Pierre   Pateros   S. D. Co.   San Lucan
   12                     11           6                    1           2
   13                     89          65       18          28           4
   14                    542         362      190         225          13
   15                    728         625      600         569         103
   16                    256         233      351         320         308
   17                     35          54       67          69         203
   18                      4           1        3          11          56
   19                                                                   4
  Total                 1665        1346     1229        1223         693
  P (by chi-square)     0.01      <0.001     0.07       0.025        0.08

Infralabials         Platteville   Pierre   Pateros   S. D. Co.   San Lucan
   12                      1
   13                     12           9                    1           1
   14                    106          90       30          51           1
   15                    487         428      232         301          11
   16                    641         503      532         502          61
   17                    382         254      363         295         198
   18                     79          56       71          48         275
   19                      6           5        2          12         124
   20                                                       2          20
   21                                                                   4
  Total                 1664        1345     1230        1212         695
  P (by chi-square)     0.24        0.02     0.89        0.31        0.20

It will be noted that there is a greater indication of non-normality in the supralabials than the infralabials. A study of the former by the method of moments indicates that the trend away from normality is brought about by a leptokurtosis, the peak being sharper than that of a normal curve.

Sometimes it is desirable to investigate entire species to determine curve shapes, especially if, in a character, there is reason to believe that little territorial variation is involved; for the greater number of specimens will give greater assurance of the curve shape.
However, if there be territorial variation it is obvious that there will be a tendency toward greater dispersion in the larger samples; so that if homogeneous segments of the population have normal distributions, the combinations will tend toward platykurtosis.

There is not much variation in the labials of C. v. viridis, especially in the northern part of its range. I have therefore investigated the curve shapes of the labials in all the specimens available to me, to see whether these larger samples tend to verify the non-normality of the supralabials and the normality of the infralabials, as indicated by the Platteville and Pierre series. The data are as follows:

                 10    11    12    13     14     15     16    17    18    19   Total
Supralabials      2     8    33   279   1460   2125    817   153    10  ....    4887
Infralabials   ....     1     2    47    284   1434   1834   897   214    14    4827

The results are as follows:

                 P (chi-square)        P (moments)
                                   Skewness    Kurtosis
Supralabials        0.0001-         0.037      0.000001-
Infralabials        0.05            0.415      0.222

It will be seen that there is strong evidence that the supralabial frequency is not normal; in fact, the value of P for kurtosis is very much smaller than the figure given, for the distribution is strongly peaked. On the other hand, these tests, especially that based on moments, indicate that the infralabial distribution is probably normal.

Occasionally other head scales vary sufficiently to warrant investigation. For example, consider this distribution of the minimum scales between the supraoculars in the Pierre series of C. v. viridis: 1(7), 2(145), 3(370), 4(138), 5(12). By the method of moments we find P (skewness) to be 0.20, and P (kurtosis) 0.69. In the Platteville series of 831 specimens the distribution is 1(4), 2(71), 3(381), 4(311), 5(51), 6(12), 7(1). This distribution is definitely skewed, and P (chi-square) is 0.01. A species with a high number of scales in the supraocular bridge is C.
ruber, in which the distribution among 243 specimens from all areas is 4(6), 5(35), 6(96), 7(75), 8(29), 9(2). This is a markedly skewed distribution. The distribution in C. lucasensis is more symmetrical: 3(3), 4(34), 5(70), 6(135), 7(70), 8(24), 9(1); total 337.

C. m. molossus and C. scutulatus are two forms in which these minimum scales across the frontal area are strongly skewed. For example, in 148 C. m. molossus the variation is 2(86), 3(29), 4(21), 5(10), 6(1), 7(1), giving what is known as a J-shaped curve. C. scutulatus is even more strongly skewed: 1(3), 2(324), 3(37), 4(5), 5(1); total 370. This is quite a different distribution from that of C. cinereous (atrox), which in all specimens available to me is as follows: 3(76), 4(276), 5(230), 6(90), 7(11), 8(1); total 684. In determining the significance of the difference between two such non-normal distributions as these, it is best to use an R x 2 table, rather than a comparison of means. Such a test would show the probability of a common origin to be far below 0.0001, clearly demonstrating the validity of scutulatus.

The scales in the prefrontal area on the snout of a rattlesnake may be distributed normally, although usually they are either skewed, platykurtic, or both. In C. scutulatus there is slight skewness (P by chi-square = 0.07 in 274 specimens). But in the Pierre series of viridis the distribution in 672 specimens is decidedly flat-topped and P (chi-square) is less than 0.001.

A few species of snakes have considerable diversity in loreals, although most are rather constant. A form exhibiting variation is Lichanura r. roseofusca, wherein we find the following distribution: 2(1), 3(22), 4(56), 5(69), 6(11), 7(5). This indicates the possibility of a normal distribution in the parent population, for P (chi-square) is found to be 0.09.
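The chi-square test of normality, as its conventions are set out in the Appendix, can be sketched on the Lichanura loreal counts just given. The sketch assumes the sample standard deviation (divisor N), theoretical frequencies by ordinates, pooling of edge classes until each theoretical frequency is at least 5, and degrees of freedom equal to the number of pooled classes less 3. The helper names are illustrative; chi2_sf uses the standard closed-form series for integral degrees of freedom in place of printed tables.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of chi-square for a positive integer df."""
    if df % 2 == 0:
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= (x / 2.0) / k
            total += term
        return math.exp(-x / 2.0) * total
    total = math.erfc(math.sqrt(x / 2.0))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return total

def normality_chi2(freq, min_expected=5.0):
    """Chi-square, degrees of freedom and P for fit to a normal curve."""
    n = sum(freq.values())
    mean = sum(x * f for x, f in freq.items()) / n
    sd = math.sqrt(sum(f * (x - mean) ** 2 for x, f in freq.items()) / n)
    # Ordinates over a range wide enough to absorb both tails.
    xs = range(min(freq) - 20, max(freq) + 21)
    exp = [n * math.exp(-0.5 * ((x - mean) / sd) ** 2)
           / (sd * math.sqrt(2.0 * math.pi)) for x in xs]
    obs = [freq.get(x, 0) for x in xs]
    # Pool edge classes inward until the theoretical frequency reaches 5.
    while len(exp) > 1 and exp[0] < min_expected:
        exp[1] += exp[0]
        obs[1] += obs[0]
        del exp[0], obs[0]
    while len(exp) > 1 and exp[-1] < min_expected:
        exp[-2] += exp[-1]
        obs[-2] += obs[-1]
        del exp[-1], obs[-1]
    df = len(exp) - 3
    if df < 1:
        raise ValueError("too few classes remain after pooling")
    chi2 = sum((o - e) ** 2 / e for o, e in zip(obs, exp))
    return chi2, df, chi2_sf(chi2, df)

loreals = {2: 1, 3: 22, 4: 56, 5: 69, 6: 11, 7: 5}
chi2, df, p = normality_chi2(loreals)
print(f"chi-square = {chi2:.2f}, df = {df}, P = {p:.2f}")
```

With so few classes only one degree of freedom remains after pooling, which is why a moderate discrepancy already gives a fairly low probability.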
Also in this species we have an ocular ring, with no definite distinction to be drawn between supra-, pre-, sub-, or post-oculars. The distribution in 163 counts is as follows: 7(5), 8(16), 9(70), 10(61), 11(9), and 12(2). P is 0.17 and a normal distribution is therefore possible.

In Pituophis the prefrontals are subject to considerable variation. For example, in 120 specimens of P. c. annectens from San Diego County the distribution is 2(4), 3(7), 4(91), 5(6), 6(9), 7(2), 8(1). This clearly is not a normal distribution, being sharply peaked at 4, and P is much less than 0.001.

Lizard Scales

Lizard scale distributions are also useful in taxonomic problems and may be checked for normality by the same methods. Thus in 56 specimens of Sceloporus j. jarrovii the scales around the body indicate a normal distribution (P = 0.59). In 760 specimens of Cnemidophorus t. tessellatus from all areas, the distribution of ventrals is surprisingly close to normal (P = 0.88). In the same species 1424 counts of the scales on the fourth toe show a leptokurtic distribution as follows: 17(12), 18(92), 19(337), 20(673), 21(205), 22(86), 23(15), 24(4). In this distribution, by the chi-square test, P is found to be much below 0.001. Similarly the number of dorsal scale rows in a large series of Anniella p. pulchra is peaked in distribution. In 1506 counts in C. t. tessellatus (sexes combined), the femoral pores indicate that the distribution may be normal (P = 0.27).

I think there will be a growing tendency to apply statistical methods in future taxonomic studies of the lizards, particularly in verifying the significance of differences, for they have a greater number of countable characters than snakes.

Turtle Scutes

Southern California is a territory notably poor in chelonians and I have no original data on turtles.
I have tested the distributions of the scutes of Lepidochelys olivacea, cited in the "Tetrapod Reptiles of Ceylon" by P. Deraniyagala. The total scutes in 378 specimens (p. 133) are not distributed normally (P less than 0.001), for the distribution is decidedly platykurtic. The costals (p. 137) in 756 counts are both platykurtic and skewed (P less than 0.001). The vertebrals may possibly be normally distributed (P = 0.08), although this particular sample is platykurtic.

Pattern

Where the pattern of a snake or lizard includes bands, saddles, blotches, or spots, the numbers on the body, tail, or both are frequently used in taxonomy. They often approach normality in distribution. Thus in 180 specimens of Pituophis c. annectens we find for the body blotches P = 0.06; in the tail spots of 89 males P = 0.32 and in 80 females P = 0.90.

In the five large series of rattlesnakes a normal distribution is indicated, as shown in Table 8. In one case (the Pateros oreganus female tail rings) the distribution is strongly skewed; in three others the variation is limited to only three classes and the tendency is indeterminate.

TABLE 8
Chi-Square Test of Normality of Body Blotches and Tail Rings in Homogeneous Series of Rattlesnakes

                                                     Tail Rings
                          Body Blotches        Males            Females
Series                    Number     P      Number     P     Number     P
Platteville viridis         832    0.09       440    0.20      392    0.41
Pierre viridis              672    0.28       342    0.32      330    0.81
Pateros oreganus            616    0.22       326    0.33      290    0.001
San Diego Co. oreganus      579    0.90       285    0.15      283    ....
San Lucan lucasensis        339    0.13       198    ....      146    ....

Broods

The sizes of broods of young snakes have been shown to be correlated with the size of the mothers, larger females having more young.11 Thus the frequency distribution of brood sizes may depend on the dispersion of fertile females, as well as on the variation for any given size of mother.
The only series which I have available, large enough to afford a determination, comprises data on broods and developing eggs of Crotalus v. viridis. Of these there are a total of 303 sets. The distribution is found to be both platykurtic and positively skewed; P (chi-square) is 0.006. It may well be that a more nearly normal distribution would be in evidence in a series from mothers within a narrow length-range. The data available are not sufficient to permit checking this possibility.

11 Occ. Papers S. D. Soc. Nat. Hist., No. 1, p. 16, 1936.

Rattles

One rattle-variable is the number of rattles in adult strings. Data on this subject have been given in a discussion of the rattle.12 The distributions in two series, the Platteville and San Lucan, including both complete and broken strings, are found to be leptokurtic in the first case (chi-square P less than 0.001) and possibly normal in the second (P = 0.13). True normality could not be expected in these cases since, with very large series, a considerable number of snakes should have less than no rattles, an obvious impossibility.

Hemipenes

In some species the spines and fringes of the hemipenes are quite variable, although the spines, if non-uniform, are often difficult to count with accuracy. In a series of C. scutulatus the spines seem normally distributed (P = 0.29) and the fringes likewise (P = 0.72).

Measurements

Thus far I have dealt with scales, blotches, and other countable quantities, characters which can only take integral values. Because of their scalation most reptiles are well supplied with such countable characters (only fish equal them), a fortunate provision from the standpoint of ascertaining the significance of differences. However, even in herpetology, dispersion curves of measurements and weights are of interest.
They are of importance in determining factors affecting variation and correlation, and in calculating the frequency of expectation of unusual specimens. But in using measurements in most difference problems we are confronted by the complication that proportionalities are subject to ontogenetic variation. We are seldom concerned with the total variation within a species from birth to death; more often we wish to know the extent of variation at a single age or life period, since it is only within such limitations that a three-dimensional variation can be reduced to two dimensions, permitting an analysis of the frequency distribution. For usually the measurements to be used are of a relative or proportional nature involving two variables, for example the ratio between some part of the body and the whole; and although such ratios eliminate the unit of measurement, they do not obviate the effects of ontogeny. It is seldom that a body part remains in constant ratio with another part (or with the body as a whole) throughout life.

Where such ontogenetic changes in proportionality are in evidence, to determine dispersion we must either have a very large assemblage of specimens at a single value of the independent variable (that is to say, at a given age or size) or we must make some assumptions and convert the available specimens to a standard value of the independent variable. This usually involves the determination of the probable size of a body part at a standard value of the body size. This may then be followed by a study of the frequency distribution of the dependent variable at what is equivalent to a cross-section of the dispersion surface.

12 Occ. Papers S. D. Soc. Nat. Hist., No. 6, pp. 18-19, 1940.
For example, if it be desired to determine the dispersion of the head size of a snake as a proportion of body size, we first set a standard body size (usually somewhere in the adult range) and then translate the head lengths of all the available specimens to what they would probably be at this standard body size. The translation is effected by determining the regression line for all specimens and then assuming that any specimen, in growing to (or returning to) the standard size, would do so by maintaining a constant percentage deviation from the regression line. This assumption is supported by the fact that the coefficients of dispersion of characters of this type seem to remain substantially constant through life. The results of several studies of this nature, with example computations, have already been published.13 It is only by the use of such calculations that sufficient data can be secured to permit the study of frequency distributions.

In a population of snakes (rattlesnakes, for example) the lengths do not approach a normal distribution. On the contrary, the distribution is bimodal, for the young of the year are rather sharply differentiated from the adolescents and adults.14 The young of the year taken by themselves are probably normally distributed with respect to body length, as shown by the following tests on homogeneous series:

                           Number of
Series                     Individuals     P (chi-square)
Zacatecas nigrescens            82              0.51
San Patricio cinereous         139              0.64
Pierre viridis                 152              0.77
Platteville viridis            229              0.01

Only the Platteville series is distributed non-normally; it is negatively skewed.

A set of miscellaneous broods totaling 320 young snakes, when the individual lengths were expressed as percentages of the mean of each brood, had a dispersion giving a chi-square value of P = 0.25.
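The translation of head lengths to a standard body size, described above, can be sketched as follows. This is an illustration under stated assumptions, not the author's published computation: the regression is taken to be an ordinary least-squares straight line, and the measurements are hypothetical values invented for the example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line y = a + b * x (assumed form of the regression)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return a, b

def standardize(body, part, standard_body):
    """Translate each body-part measurement to the standard body size.

    Each specimen keeps its own percentage deviation from the regression
    line: part / predicted(part at its body size) is held constant.
    """
    a, b = fit_line(body, part)
    at_standard = a + b * standard_body
    return [at_standard * p / (a + b * x) for x, p in zip(body, part)]

# Hypothetical body lengths and head lengths in millimeters (not from the paper).
body = [300, 450, 600, 750, 900]
head = [16.5, 21.0, 27.5, 31.0, 38.0]
standardized = standardize(body, head, standard_body=700)
print([round(v, 2) for v in standardized])
```

The standardized values all refer to one body size, so their frequency distribution can be examined directly; a specimen lying exactly on the regression line is translated to the line's value at the standard size.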
Starting with young of the year having substantially normal distributions, it would be interesting if we could trace the progress of each age-class as it passes through maturity until it finally disappears, losing individuals continuously along the way, to see whether the distribution continues normal. But studies have shown that it is impossible to segregate successive classes by size after the first year; for even in their second year the most rapidly growing adolescents will have overtaken the smallest adults of the preceding year, thus preventing an accurate segregation. Subsequently the adults grow so slowly (probably never stopping growth entirely as do mammals and birds), compared to individual variations, that the separation of the age-classes becomes continually smaller and the overlap between successive years greater. While we may assume that the distribution of each age-class remains normal, since they start with such a distribution as young of the year, and second year individuals having complete strings of 5 rattles have been found to approach normality, we cannot prove this continuity. A complete population of adolescents and adults is both positively skewed and platykurtic, as might be expected from the nature of a curve comprising the sum of several normal curves of successively decreasing areas.

13 Ratio of weight to length, Occ. Papers S. D. Soc. Nat. Hist., No. 3, p. 47, 1937; ratio of head length to body length overall, idem, No. 4, p. 22, 1938; ratio of fang length to head and body length, idem, No. 5, p. 36, 1939.
14 Occ. Papers S. D. Soc. Nat. Hist., No. 3, p. 20, 1937.

It would be useful if the sizes of an adult population of snakes could be shown to have a normal distribution, and the mean and standard deviation could be determined; for from such parameters we could determine the probable frequency of occurrence of unusually large specimens, certainly a matter of interest.
But this is a difficult assignment. First, very large samples would be required, some hundreds of specimens of each sex, at least; for as there is sexual dimorphism in size in most species, the sexes must be treated separately. There must be no conscious selection with respect to size; the sample must be truly representative of the population as a whole. This will at once eliminate the larger species from consideration owing to the practical difficulty of collecting, preserving, and measuring great numbers of large individuals. There is the further complication that incomplete tails are numerous among the larger specimens of many species, particularly of those with slender tails such as the racers. This tends to distort the true frequency distribution. But most important of all, there is the difficulty of segregating the adolescents, as already mentioned. To avoid this complication there is some possibility of investigating only the right hand half of the curve of distribution; that is, the half above the mean which contains the largest specimens. Probably the garter snakes, because of their occurrence in large numbers, and their ease of capture around certain ponds and lakes, will offer the best material for an investigation of this kind. To determine with any degree of certainty whether the distribution is normal, and particularly whether the largest specimens occur with greater or less frequency than would be expected with a normal distribution, would probably require the measurements of at least 500 specimens of each sex.

I have no such series available, but I have checked the two best homogeneous series at hand, although admittedly they are quite inadequate in numbers to afford conclusive evidence respecting the shape of the dispersion curve. These are series of the little snakes Diadophis amabilis similis and Phyllorhynchus decurtatus perkinsi.
In the interest of homogeneity I have restricted the investigation to specimens from San Diego County, since territorial variations in size are evident in many species. By rather arbitrary methods I have attempted to eliminate adolescents. The statistics of the adult populations are as follows, all lengths being given in millimeters:

                                      Diadophis a. similis    Phyllorhynchus d. perkinsi
                                      Males      Females      Males      Females
Number                                 101          86         119          70
Mean length                           294.1       340.9       392.1       406.6
Standard deviation                    38.64       50.11       53.10       32.85
Length of an individual 2 standard
  deviations above the mean           371.4       441.1       498.3       474.3
Theoretical number greater than
  this length (2.28%)                  2.30        1.96        2.71        1.60
Actual number greater than
  this length                            2           4           0           2

It will be noted that in two cases (Diadophis males and Phyllorhynchus females) the actual number of specimens at least two standard deviations larger than the mean is as near the theoretical number as possible. The number of large Diadophis females is four instead of two as calculated; while in the case of the Phyllorhynchus males there should be about three specimens above 498.3 mm. long, whereas actually there are none so large. But it is interesting to note that the largest specimens come close to expectation, for the three largest are 490, 491, and 495 mm., respectively. At least we can say, for these admittedly inadequate tests, that they offer no particular evidence that the size distributions of these adult populations are not normal with respect to the presence of unusually large individuals.

One of the important correlative studies that may be made is that of weight on length. I have determined 15 that the dispersions around the regression line of 818 individuals of the Platteville viridis (standardized as discussed above) are probably not normally distributed, for P (chi-square) is 0.003. The distribution is skewed.
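The two derived rows of the table of adult lengths above (the length two standard deviations above the mean, and the theoretical number of larger specimens) follow from the number, mean, and standard deviation alone. A minimal sketch, assuming a normal curve and using illustrative function names:

```python
import math

# (number, mean length, standard deviation) in millimeters, from the table above.
series = {
    "Diadophis males": (101, 294.1, 38.64),
    "Diadophis females": (86, 340.9, 50.11),
    "Phyllorhynchus males": (119, 392.1, 53.10),
    "Phyllorhynchus females": (70, 406.6, 32.85),
}

def upper_tail(k):
    """Fraction of a normal population more than k standard deviations above the mean."""
    return 0.5 * math.erfc(k / math.sqrt(2.0))

def large_specimen_expectation(n, mean, sd, k=2.0):
    """Return the length at mean + k*sd and the expected number of larger specimens."""
    return mean + k * sd, n * upper_tail(k)

for name, (n, mean, sd) in series.items():
    length, expected = large_specimen_expectation(n, mean, sd)
    print(f"{name}: {length:.1f} mm, expected number above = {expected:.2f}")
```

Three of the four computed lengths agree with the table. For the Phyllorhynchus females the tabulated mean and standard deviation give 472.3 mm rather than the printed 474.3 mm, so one of the printed figures may be in error.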
Head length dispersions are found to be normally distributed about the regression lines of head on body length over-all, in two series investigated.16

833 Platteville viridis    P (chi-square) 0.893
715 Pierre viridis         P (chi-square) 0.226

15 Occ. Papers S. D. Soc. Nat. Hist., No. 3, p. 47, 1937.
16 Idem, p. 18. A graphic illustration of one of the distributions is given.

Similarly the distributions of fang lengths about either the fang-head or the fang-body regression lines are probably normal. The results in the Platteville series were as follows:

Fang on head length, 519 specimens, P (chi-square) = 0.351
Fang on body length overall, 526 specimens, P (chi-square) = 0.165

One measurement which remains unchanged in each individual during life is that of rattle width; that is, the width of any specific rattle of the sequence. Using only specimens with complete strings, so that the sequence number of each ring is known, the frequency distribution of measurements of any particular ring can be determined. An investigation of a number of series, upon which I hope to publish some notes later, would indicate that the distribution approximates normality. For example, the chi-square P for 448 buttons (No. 1 rings) of the Platteville series is 0.155. This particular series is somewhat platykurtic, but not excessively so.

Illustrative Sampling

To illustrate the variations in a series of random samples from a truly normal population, I have assumed a hypothetical homogeneous population comprising 100,000 snakes (all of one sex) with a mean ventral scale count of 100 and a coefficient of variation of 2 per cent, which is a degree of variation closely approached by many species. Thus, the standard deviation is two scales. Then, by the use of random sampling numbers (Tippett, 1927; Fisher and Yates, 1938, pp. 18 and 82) I have selected ten random samples, each comprising 100 specimens.
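A comparable experiment can be run by machine. The sketch below builds the hypothetical population by ordinates and draws ten samples of 100 with a pseudo-random generator; this substitutes for the printed random sampling numbers, so the samples will not reproduce those drawn by the author. Drawing with replacement and the seed value are assumptions made for the illustration.

```python
import math
import random

MEAN, SD, POPULATION = 100, 2.0, 100_000
values = list(range(91, 110))  # ventral counts 91 through 109

def ordinate_count(x):
    """Number of snakes with x ventrals: population times the normal ordinate."""
    z = (x - MEAN) / SD
    return POPULATION * math.exp(-0.5 * z * z) / (SD * math.sqrt(2.0 * math.pi))

composition = [ordinate_count(x) for x in values]

rng = random.Random(1941)  # arbitrary seed, chosen only for repeatability
# Each draw is weighted by the class size; with 100,000 individuals, sampling
# with replacement differs negligibly from sampling without replacement.
samples = [rng.choices(values, weights=composition, k=100) for _ in range(10)]

for number, sample in enumerate(samples, start=1):
    print(number, round(sum(sample) / len(sample), 2), min(sample), max(sample))
```

Each printed line gives the sample number, its mean, and its lowest and highest counts, so the scatter of sample means about 100 can be seen at a glance.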
The distributions of the entire normal population and each of the samples are shown in Table 9. The fit for all samples taken together is, by the chi-square test, P = 0.77, which is quite high; that is, the fit is very close. If we take only the first five sets of samples, the fit is not so good, for P = 0.20, the greatest deviation from normal being the low number of specimens with 101 ventrals (drawn 73; expected 88.0). Some of the individual samples will be observed to have still poorer fits.

TABLE 9
Distribution of Ventrals in a Hypothetical Population of 100,000 Snakes Normally Distributed, Together with 10 Random Samples Each Composed of 100 Specimens

Number of   Composition of                                         Total of
Ventrals    Population*      Samples 1 to 10                       Samples
 91                1
 92                7
 93               44
 94              222         1 2                                        3
 95              876         1 1 1 2 3 1 1                             10
 96            2,700         5 4 4 4 4 3 3 4 1                         32
 97            6,476         7 7 8 6 5 5 9 8 3 6                       64
 98           12,098         15 12 15 9 15 12 10 11 5 10              114
 99           17,603         20 18 20 18 22 24 21 14 18 17            192
100           19,946         16 17 17 20 16 21 17 23 20 19            186
101           17,603         16 14 13 16 14 16 15 18 20 23            165
102           12,098         9 17 16 14 8 8 15 12 13 9                121
103            6,476         6 8 3 8 8 7 8 9 9 6                       72
104            2,700         4 1 4 3 4 2 1 1 4 2                       26
105              876         1 1 1 3 1 1 2 3                           13
106              222         1 1                                        2
107               44
108                7
109                1
Total        100,000         100 100 100 100 100 100 100 100 100 100  1000

* Calculated by ordinates, not areas.
Rows with fewer than ten sample entries list them in printed order; the positions of the blank cells are not preserved.

Acknowledgments

I wish to acknowledge my indebtedness to Messrs. Charles E. Shaw, James Deuel, and Laurence H. Cook for scale counts, and to Mrs. Elizabeth Leslie and Alice G. Klauber for assistance in computations. Mr. Joseph R. Slevin was kind enough to furnish the scale counts of Geophis nasalis. Mr. C. B. Perkins made several editorial suggestions for improvement.
Summary

A considerable number of tests, both by the chi-square method and the method of moments, indicate that many of the countable variable characters studied in herpetology, particularly in problems of taxonomy, follow a normal distribution, or one closely approximating such a distribution. Amongst others this is found to be the case with ventral scale counts, probably the most important single character used in herpetological classification.

APPENDIX

I have tried, as far as possible, to eliminate descriptions of routine statistical methods from the herpetological discussion, mentioning only unusual points. Statistical texts of such number and variety have lately appeared that extensive references are no longer necessary. However, some references are given below for the use of those not familiar with these methods; they are limited to a few on each separate element.

The characteristics of the normal curve: Walker (199-211), Treloar (76-83), Simpson and Roe (70-75), Croxton and Cowden (265-271).

Skewness and kurtosis: Croxton and Cowden (234-245), Treloar (32-35), Goulden (28-31).

Tables of the normal curve: Abridged tables will be found in nearly every statistical text, a particularly convenient set being those of Camp (380-385). The following are more detailed and extensive: Davenport and Ekas (164-172), Kelley (14-114), Glover (392-411), Pearson, Part 2 (2-10).

Biological approximations to the normal curve: Simpson and Roe (129-132), Treloar (34-35). See also the interesting comment in the Preface to Kelley's Tables.

Fitting the normal curve to data: (a) by areas, Arkin and Colton (106-108), Chaddock and Croxton (123-126), Croxton and Cowden (275-280); (b) by ordinates, Arkin and Colton (108-109), Croxton and Cowden (271-275).

The chi-square test for normality: Arkin and Colton (109-112), Mills (626-630), Treloar (219-226).
Chi-square tables: Fisher (118-119), Fisher and Yates (27), Pearson, Part 1 (26-28), Davis and Nelson (399-405).

The moment tests for skewness and kurtosis: Arkin and Colton (145-149), Geary and Pearson (1-15), Yule and Kendall (154-166), Tippett (33-42), Goulden (27-32), Fisher (54-56; 74-79), Madow (515-517).

In applying the chi-square test I have used the standard deviation of the sample, rather than the estimated standard deviation of the population (Fisher, 53). Edge classes have been combined until the theoretical frequency was at least 5 in each group. The classes, as suppressed, in all cases numbered less than 20 (Kenney, Vol. 2, p. 170). The degrees of freedom were taken at 3 less than the number of classes as suppressed (Rider, pp. 109-110), since the theoretical distribution is made to conform to the actual in total number of variates, mean, and standard deviation. In fitting a normal curve to the data, except in a few cases where grouping has been necessary, I have used the ordinate, rather than the area method, as seems to be preferable for discrete variates (Baten, p. 94). For this reason Sheppard's correction was not made in calculating the standard deviation of the sample. The differences involved in employing the two methods will usually be unimportant, unless near some assumed critical level of significance. I have used Fisher's methods in determining the significance of skewness and kurtosis.

BIBLIOGRAPHY OF STATISTICAL TEXTS

Arkin, H. and Colton, R. R. Statistical Methods, Fourth Edition, New York, 1940.
Baten, W. D. Elementary Mathematical Statistics, New York, 1938.
Camp, B. H. The Mathematical Part of Elementary Statistics, New York, 1931.
Chaddock, R. E. and Croxton, F. E. Exercises in Statistical Methods, Cambridge, Mass., 1928.
Croxton, F. E. and Cowden, D. J. Applied General Statistics, New York, 1939.
Dahlberg, G. Statistical Methods for Medical and Biological Students, London, 1940.
Davenport, C. B. and Ekas, M. P. Statistical Methods in Biology, Medicine and Psychology, Fourth Edition, New York, 1936.
Davies, G. R. and Yoder, D. Business Statistics, New York, 1937.
Davis, H. T. and Nelson, W. F. C. Elements of Statistics, Bloomington, Ind., 1935.
Dunlap, J. W. and Kurtz, A. K. Handbook of Statistical Nomographs, Tables and Formulas, Yonkers-on-Hudson, 1932.
Elderton, W. P. Frequency Curves and Correlation, Cambridge, England, 1938.
Ezekiel, M. Methods of Correlation Analysis, New York, 1930.
Fisher, R. A. Statistical Methods for Research Workers, Seventh Edition, Edinburgh and London, 1938.
Fisher, R. A. and Yates, F. Statistical Tables for Biological, Agricultural and Medical Research, Edinburgh and London, 1938.
Geary, R. C. and Pearson, E. S. Tests of Normality, Cambridge, England, 1938.
Glover, J. W. Tables of Applied Mathematics in Finance, Insurance, Statistics, Ann Arbor, 1923.
Goulden, C. H. Methods of Statistical Analysis, New York, 1939.
Kelley, T. L. The Kelley Statistical Tables, New York, 1938.
Kenney, J. F. Mathematics of Statistics, 2 vols., New York, 1939.
Kurtz, A. K. and Edgerton, H. A. Statistical Dictionary of Terms and Symbols, New York, 1939.
Madow, W. G. Note on Tests of Departure from Normality, Jour. Am. Statistical Assn., Vol. 35, No. 211, pp. 515-517, Sept., 1940.
Mills, F. C. Statistical Methods Applied to Economics and Business, Revised Edition, New York, 1938.
Otis, A. S. Normal Percentile Chart, Yonkers-on-Hudson, 1938.
Pearson, K. Tables for Statisticians and Biometricians, Part 1, Cambridge, England, 1914 (also Third Edition, 1930); Part 2, 1931.
Pearl, R. Introduction to Medical Biometry and Statistics, Third Edition, Philadelphia, 1940.
Peters, C. C. and Van Voorhis, W. R. Statistical Procedures and their Mathematical Bases, New York, 1940.
Rider, P. R. An Introduction to Modern Statistical Methods, New York, 1939.
Simpson, G. G. and Roe, Anne. Quantitative Zoology, New York, 1939.
Snedecor, G. W. Statistical Methods, Applied to Experiments in Agriculture and Biology, Third Edition, Ames, Iowa, 1940.
Thurstone, L. L. The Fundamentals of Statistics, New York, 1927.
Tippett, L. H. C. Tracts for Computers, XV: Random Sampling Numbers, Cambridge, England, 1927; The Methods of Statistics, Second Edition, London, 1937.
Treloar, A. E. Elements of Statistical Reasoning, New York, 1939.
Walker, Helen M. Mathematics Essential for Elementary Statistics, New York, 1934.
Yule, G. U. and Kendall, M. G. An Introduction to the Theory of Statistics, Eleventh Edition, London, 1937.

[Figures 1 to 5. Figure 1: Normal Curve. Figure 2: Leptokurtic Distribution. Figure 3: Platykurtic Distribution. Figure 4: Positive Skewness. Figure 5: Negative Skewness.]

Klauber: Populations and Samples 33

II. ILLUSTRATIONS OF THE RELATIONSHIP BETWEEN POPULATIONS AND SAMPLES

Introduction

The relationship between a sample, whether an individual specimen or a series of specimens, and the total population out of which it was collected, is always somewhat uncertain. We have the sample before us; it is tangible; we know as much about it as our senses and our methods of investigation permit us to learn. Behind it lies the population which the sample represents, indefinite and nebulous, and in many ways unknown. It is true that the methods of mathematical statistics permit the approximate definition of a population from a sample; yet even with these formulas the population is presented only as a sort of shadow, outlined by statements to the effect that it probably has such and such characteristics, and there is a certain percentage of chance that it falls within such and such limits. But its exact form, character, and limitations we can never know.

In herpetology, as in other branches of biology, we deal with samples.
They are the particular specimens which we have been able to acquire for study. Some of these samples, often the first collected, are assigned special importance by being selected as taxonomic or nomenclatorial reference guides or anchors. These are the types. But all the while we are studying and classifying these samples, and attempting to differentiate them from others, we are not really thinking of the samples themselves, but of the populations, still in the wild, which the samples represent. For if taxonomy is to have any real purpose, it is not primarily the determination of the similarities and differences between two or more individuals which we have at hand, but is a judgment respecting these similarities and differences as they are manifested in the original populations from which the samples were drawn. Whether we use the mathematical formulas for estimating the characteristics of populations from samples, or draw inferences as to these characteristics somewhat unconsciously, we are nonetheless really aiming at a definition of the population rather than the sample.

Of course it is individual variation that leads to the uncertainty. Were the animals of a single kind invariant we would know at once all about a population (except its number) from a single sample. But all animals, however closely related, differ in some degree from each other; the problem is to estimate, from the known spread in a sample, how wide the range of these differences becomes in an entire population. Even if two samples are different we cannot be sure, without investigation, that both may not be included within the spread or range of the entire population. For example, one snake may have 150 ventral scutes and another specimen 160.
How are we to know whether the entire population (a single homogeneous group of these snakes) does not contain individuals running from as low as 145 to as high as 170 scutes, thus including both? Parenthetically, I may say that this discussion has nothing to do with the interpretation of the extent of these differences into classification, that is, whether a given difference is great enough to warrant subspecific or specific recognition, or whether it should be considered merely an intrasubspecific territorial or ecological variation. The problem has to do with the determination of differences rather than their interpretation in taxonomy and nomenclature.

The reasons why the relationships between samples and parent populations are sometimes ignored, even today when the statisticians have made available the formulas governing such relationships, are, first, their expression in mathematical terms, which may seem too abstract to permit visualizing the result; and secondly, the absence of the population itself for comparison. For the latter remains always hazy and ill-defined, and we never know how well our description, based on the samples, really fits it. We may presume that the larger the sample the more representative it becomes of the population (that is, the more closely its characteristics are likely to approach those of the population), but we must accept this largely on faith or inherent common sense. This leads to a tendency to treat the sample as if it were the population, and to draw unwarranted conclusions with respect to identities, similarities, and differences from other populations.
While in actual practice we can never secure an entire population for study (except of an animal approaching extinction) we may experiment with theoretical or artificial populations, in any size and form desired, by setting up large groups of individuals segregated into classes premised on the variation in some particular character. An example of such a population, using subcaudal scales as the basic character, would be the following, comprising, for the sake of simplicity, only four classes:

Number of Subcaudals     Number of Specimens
        13                        185
        14                      2,149
        15                    131,650
        16                        477
Total population              134,461

From such a population we may then select samples quite at random and thus see in operation, without recourse to mathematical formulas, the principles which cause samples to resemble their parent populations; and how they fluctuate and differ from other samples drawn from the same or different populations. Thus, we will have before us, for continuous examination and comparison, both the entire parent population, simplified and perfected in character as compared to a real population, and the sample. We can watch the sample grow (in numbers, not in the size of its individuals) and witness the favorable effects of larger samples or the adverse effects of heterogeneity in samples. All the while the formulas of the mathematicians will be at work, but we will not be using them; we will see only their results.

To carry out such a program the tests on artificial populations which follow have been made; they will serve to illustrate some of the principles of the relationship between populations and samples.

An artificial population has certain fundamental advantages over any real one.
First, as previously mentioned, it may be visualized in its entirety, and thus be made available for comparison with samples; secondly, it may be very large, so large in fact, that it may be assumed to remain unchanged in composition when a few individuals have been withdrawn as a sample; and lastly, it may be designed to follow any scheme of variation, and to fit that scheme perfectly, avoiding the complicating peculiarities found in every real population. For in considering a real population it is often difficult to divorce the relationship to be demonstrated from the particular characteristics of that species, its variations and morphology.

When these tests on theoretical populations are made there is no guarantee that the result will follow a particular course, for each sample will be a truly random or chance sample. Thus, one may start out to prove a given point (for instance that the means of two separate samples tend to approach each other as the samples are increased in size) and by some freak of chance the first trial might prove exactly the opposite. But repeated trials will surely demonstrate the truth of this proposition; in the long run the results will follow the mathematical formulas, but without their seeming complications. For this reason several tests will usually be made to illustrate each type of relationship.

It should be stated for the benefit of those unfamiliar with statistical methods that there is nothing having the slightest originality or novelty developed herein with respect to the mathematical relationships between populations and samples. And of course I am not presenting these illustrative examples for the purpose of proving the validity of mathematical formulas; such proofs are available in any statistical text. The purpose of the tests is to give a direct picture of the relationship, without recourse to the formulas. However, from time to time, after demonstrating a relationship by example, I have pointed out,
in the interest of clarity, what formula is involved, and how well the illustrative test follows it. But certainly this is with no idea of furnishing a proof where none is needed.

Artificial Populations

Populations may vary in several ways, such as the number of individuals included, the arithmetical mean or average value of a character, and the extent and nature of its variability, that is, how closely and in what way it is dispersed about the mean. In the present discussion, in the interest of simplicity, I shall prepare my populations according to rather rigid specifications. First, only one character at a time will be considered; for example, a population will be made up of a group of individuals having different numbers of ventral scutes and all other features of real reptiles will for the time be ignored. Secondly, with a few exceptions, each population will contain exactly 100,000 individuals, this number being sufficiently large to demonstrate the results of sampling without being cumbersome. However, the removal of a few individual specimens will not change the relative class-compositions of the remaining population, which is treated, in this regard, as if it were infinite. Next, the characters discussed will be of the type always expressible in integral terms, as far as any single specimen is concerned; that is, they will be countable (rather than measurable) characters, such as ventrals, subcaudals, body blotches, etc. Further, in the interest of simplification, the population will always be so arranged that its arithmetical mean or average will be expressible as an integer; for example, the mean in these hypothetical populations will always be exactly 166 ventrals, or 48 blotches, instead of 166.352 ventrals, or 48.713 blotches, as would be the case in a real population.
This simplification can be made without in any way interfering with the demonstration of the trends of samples, but has the important practical advantage that the number of individuals in each class above the mean equals the number in the corresponding class below, if the distribution be symmetrical. Finally, the populations will be normally distributed, that is, the variations will follow the normal curve of error. This distribution has several advantages: it has been extensively investigated and tabled, and thus the populations can be set up with only the most elementary calculations; its distribution may be fully described and fixed by three simple statistics (the mean, the number of individuals, and the standard deviation); and finally, this type of distribution is closely approximated by many characters in natural history, and herpetology is no exception, as shown by investigations previously made.[1]

I have stated that these normal populations may be completely defined by three statistics: the number of individuals, the mean, and the standard deviation. It will be desirable to show how populations change when one of these statistics or parameters varies while the other two remain constant.[2] In Tables 1, 2, and 3, such variations are set forth. I have chosen to consider the populations as representing the distribution of the ventral scutes in a hypothetical species of snake; of course, they might equally well have signified any other character. Table 1 shows three normal populations, each about 100 times as large as the next. To avoid confusion these have not been given the slight adjustment necessary to cause them to total exactly to the nearest even thousand.

[1] See Part I of this series.
[2] The word "statistic" is usually taken to refer to a numerical characteristic of a sample, while "parameter" refers to the corresponding characteristic of a population.
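The construction of such populations can be sketched in a few lines. The example below is only an illustration: it assumes, as the notes accompanying Table 3 state, that each class frequency is taken from the ordinate of the normal curve at the class value, and the function name `normal_population` and its `half_width` argument are introduced here for illustration and do not come from the original.

```python
import math


def normal_population(n, mean, sd, half_width=10):
    """Class frequencies for an artificial population of about n individuals.

    Each integral class x receives n times the normal ordinate at x,
    rounded to a whole number of specimens; empty classes are dropped.
    (Hypothetical helper; half_width only limits how far from the mean we look.)
    """
    counts = {}
    for x in range(mean - half_width, mean + half_width + 1):
        z = (x - mean) / sd
        ordinate = math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))
        specimens = round(n * ordinate)
        if specimens > 0:
            counts[x] = specimens
    return counts


small = normal_population(1_000, 100, 1)
for ventrals, specimens in small.items():
    print(ventrals, specimens)
print("Total", sum(small.values()))
```

With 1,000 individuals, a mean of 100, and a standard deviation of 1, the sketch gives seven classes totaling 999, which agrees with Population No. 1 of Table 1; no adjustment to an even total is attempted.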
TABLE 1
Effect of Variation in the Number of Individuals in a Population.

Number of     Population   Population   Population
Ventrals         No. 1        No. 2        No. 3
    95                                          15
    96                            13         1,338
    97               4           443        44,318
    98              54         5,399       539,910
    99             242        24,197     2,419,707
   100             399        39,894     3,989,423
   101             242        24,197     2,419,707
   102              54         5,399       539,910
   103               4           443        44,318
   104                            13         1,338
   105                                          15
Total              999        99,998     9,999,999

An interesting feature of Table 1 is the increase in the over-all range (minimum to maximum) that follows an increased population.

In Table 2 the mean of one population has been shifted from 100 to 98. It will be observed that the sizes of the groups or classes of variates remain otherwise unaffected.

TABLE 2
Effect of Variation in the Mean of a Population.

Number of     Population   Population
Ventrals         No. 1        No. 2
              Mean = 100   Mean = 98
    94                            13
    95                           443
    96              13         5,399
    97             443        24,197
    98           5,399        39,894
    99          24,197        24,197
   100          39,894         5,399
   101          24,197           443
   102           5,399            13
   103             443
   104              13
Total           99,998        99,998

In Table 3 the effect of changing the standard deviation (σ) of a population is shown, the number of specimens remaining constant at 100,000, and the mean at 100. It will be noted that the spread or scatter increases, for the standard deviation is a measure of dispersion. In this and subsequent populations I have in some instances made slight adjustments in the last figure of one or two of the most populous classes to cause the total to equal 100,000 exactly. This facilitates comparisons without changing to an appreciable degree the chances involved in drawing random samples from that population.

TABLE 3
Effect of Variation in the Standard Deviation of a Population.

Number of     Population   Population   Population
Ventrals         No. 1        No. 2        No. 3
               σ = 0.8      σ = 1.0     σ = 1.333
    94                                           1
    95                                          26
    96                            13           332
    97              44           443         2,380
    98           2,191         5,399         9,714
    99          22,831        24,197        22,586
   100          49,868        39,896        29,922
   101          22,831        24,197        22,586
   102           2,191         5,399         9,714
   103              44           443         2,380
   104                            13           332
   105                                          26
   106                                           1
Total          100,000       100,000       100,000

A few notes on the method of forming these populations should be recorded. Since we are dealing with characters that can take only discrete values, the ordinates rather than the areas of the normal curve have been used in segregating a population into classes. If one sets up a population by areas, the square of the standard deviation will always come out too high by 1/12, this being the amount of Sheppard's correction for grouping. But in the present instance, since the variates are integral, they always take a central position in each class and no correction should be made. A comparison of distributions by ordinates and areas is given in Table 4 for one value of the standard deviation (σ = 1). It will be noted that the area basis gives a slightly wider dispersion.

TABLE 4
Effect of Calculating Distributions by Ordinates and by Areas.

Number of     Population     Population
Ventrals         No. 1          No. 2
              by Ordinates    by Areas
    96              13             23
    97             443            598
    98           5,399          6,060
    99          24,197         24,173
   100          39,896         38,292
   101          24,197         24,173
   102           5,399          6,060
   103             443            598
   104              13             23
Total          100,000        100,000

The standard deviation of the population as finally set up is never exactly equal to that sought, even when the ordinate method is employed. However, no population that I have used differs in its standard deviation from the figure desired by more than 0.001. For example, in the third population given in Table 3, the standard deviation calculated, after the population was set up, was found to be 1.33312 instead of 1.33333. Obviously, this slight difference will not affect the results in the sampling tests that I have made. The particular table employed in setting up the populations is that of W. F.
Sheppard in Karl Pearson: Tables for Statisticians and Biometricians, Part 1, Ed. 3, 1930, pp. 2-8.

It will be observed that changing the mean or the number of individuals in a population utilizes the same or proportionate numbers in each class; only a change in the standard deviation requires a completely new set of figures. For the purposes of this investigation, populations of 100,000 with the following 9 values of the standard deviation were set up: 0.667, 0.8, 1, 1.333, 2, 2.857, 4, 6.667, 10. It was found that these would fit almost any variable character used in herpetological classification closely enough to test an illustrative sampling problem.

Before leaving these populations for the experiments in sampling, I wish to show the effect of heterogeneity on a composite population, each component of which is normally distributed. Let us assume a population composed of 100,000 males and an equal number of females, and observe the effect on the distribution of the combined ventral scale counts, as sexual dimorphism increases. The standard deviation will be taken as 1.333 in both sexes, but the means will be caused to diverge. In the first composite population both sexes average 100 ventrals; in the second, the females average 101 and the males 99; in the third the females 102 and males 98, etc. In each successive distribution the difference between the means increases by 2, but the composite mean remains at 100. The results are shown in Table 5. It will be observed in a case such as this, that, if the means are different, the composite distribution ceases to be normal; it becomes flat-topped, and, as the difference between the means increases, it even becomes bimodal, as is evident in the last two columns. It can be shown that bimodality begins when the difference between the means exceeds twice the standard deviation. Estimating population characteristics from samples, using rules and formulas premised on the substantial normality of the population, will not produce accurate results when the normal components differ enough to produce marked abnormality in the composite group. Thus, combining sexes should usually be avoided in making comparisons, if there be an important degree of sexual dimorphism.

TABLE 5
Effect of Heterogeneity on Composite Populations when Each Component is Normally Distributed. σ = 1.333

Number of     Difference Between the Means of Males and Females
Ventrals            0           2           4           6
    91                                                  1
    92                                      1          26
    93                          1          26         332
    94              2          26         332       2,380
    95             52         333       2,380       9,714
    96            664       2,406       9,715      22,586
    97          4,760      10,046      22,612      29,923
    98         19,428      24,966      30,254      22,612
    99         45,172      39,636      24,966      10,046
   100         59,844      45,172      19,428       4,760
   101         45,172      39,636      24,966      10,046
   102         19,428      24,966      30,254      22,612
   103          4,760      10,046      22,612      29,923
   104            664       2,406       9,715      22,586
   105             52         333       2,380       9,714
   106              2          26         332       2,380
   107                          1          26         332
   108                                      1          26
   109                                                  1
Total         200,000     200,000     200,000     200,000

The Methods of Random Sampling

Random sampling tables are available and their use described in the following publications: (a) Tracts for Computers, No. 15, Random Sampling Numbers arranged by L. H. C. Tippett, pp. VIII + 24, Cambridge University Press, 1927; (b) Statistical Tables for Biological, Agricultural, and Medical Research, by Fisher and Yates, pp. 18-20, 82-87, London and Edinburgh, 1938. The first table comprises 208 columns, each column containing 50 4-figure numbers; the second 30 columns, each column containing 50 10-figure numbers.
The individual single-figure columns can be grouped in a great variety of ways; any five contiguous single-figure columns may be employed as random selections of 5-figure numbers, which may in turn be used directly in drawing individuals from a population of 100,000. For example, we set up our population with equivalent limiting numbers as given in Table 6. Then we have only to decide on a method of selecting five-figure numbers from one of the columns (or combinations of parts of columns) in either table and we have a series of ventral counts by chance. We may use dice, cards, a roulette wheel, or bingo numbers to select both the page and the group of columns which are to be used.

TABLE 6
An Example Application of Random Sampling Numbers.

Number of     Population      Inclusive
Ventrals      Distribution    Numbers
    96               13       00001*-00013
    97              443       00014-00456
    98            5,399       00457-05855
    99           24,197       05856-30052
   100           39,896       30053-69948
   101           24,197       69949-94145
   102            5,399       94146-99544
   103              443       99545-99987
   104               13       99988-100000
Total           100,000

* The number 00000 is recorded as 100,000.

Suppose our method of selection leads to the numbers in columns 25-29 on p. XII of Tippett. Then the first five samples are represented by the numbers 00433, 48901, 27228, 72094, and 13224. Referring to Table 6, we find that we have drawn, in order, snakes with counts of 97, 100, 99, 101, and 99 ventrals.

Random selections from heterogeneous populations may be made by using dice, the odd numbers representing males, for example, and the even females. Or, if the population components are not evenly divided, we may use cards or a roulette wheel, allocating the numbers in any desired ratios; or a two or three figure column, selected by lot from either of the tables of random numbers. It is only necessary that we play the game fairly and use a system without bias.
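The lookup described for Table 6 can be sketched as follows. This is an illustration only: the upper limits are copied from the "Inclusive Numbers" column of Table 6, while the names `UPPER_LIMITS`, `VENTRALS`, and `ventral_count` are hypothetical and appear nowhere in the original.

```python
from bisect import bisect_left

# Upper inclusive limit of each class in Table 6, with the matching ventral count.
UPPER_LIMITS = [13, 456, 5855, 30052, 69948, 94145, 99544, 99987, 100000]
VENTRALS = [96, 97, 98, 99, 100, 101, 102, 103, 104]


def ventral_count(random_number):
    """Translate a 5-figure random number into a ventral count via Table 6."""
    # Table 6 footnote: the number 00000 is recorded as 100,000.
    n = 100000 if random_number == 0 else random_number
    # First class whose upper limit is not below n.
    return VENTRALS[bisect_left(UPPER_LIMITS, n)]


# The five numbers quoted in the text from Tippett's tables.
drawn = [433, 48901, 27228, 72094, 13224]
print([ventral_count(n) for n in drawn])
```

Applied to the five numbers quoted above, the sketch reproduces the counts given in the text: 97, 100, 99, 101, and 99 ventrals. A binary search over the cumulative limits is merely one convenient choice; any unbiased lookup would serve.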
Many ways of solving such problems of selection will readily suggest themselves to anyone accustomed to games of chance. Even the tables of random sampling numbers may be supplanted by the use of numbered discs or balls, although this will slow selection and introduce the possibility of bias through mechanical imperfection. The totalizer wheels of an old speedometer may be used as a selector by freeing them from each other, but they must be well balanced to avoid concentration on particular numbers.

Variations of the Mean

So much for discussions of the methods of setting up populations and drawing random samples from them. We shall now put these schemes to work to illustrate various relationships between populations and samples, and how samples tend to vary. We shall, as far as possible, select illustrations approximating situations found in the herpetological field.

It is first desired to follow the trend in the mean of the ventral scale counts in samples representing a homogeneous population of rattlesnakes. The investigation is limited to one sex so that sexual dimorphism will not complicate the result. There are, as usual, 100,000 individuals in the population. The mean is 200 ventrals, and the coefficient of variation is 2 per cent, a figure representative of homogeneous series of rattlers. The standard deviation is then 4, since [...]

[Table residue, apparently the final scores of Table 10; the legible values are 194 211 192 207 191 205.]

Note: To visualize the population from which these samples were drawn see the second column in Table 18.

Table 11 is interesting in showing a considerable variation in the final scores; particularly Sample 2 reaches a range rather close to the true population range for so few specimens.

TABLE 11
Changes in the Range with Increasing Sizes of Samples. Body Blotches. Population Mean 40; Absolute Range 23 to 57.

Specimen Number; Sample No. 1 (Min., Max.); Sample No. 2 (Min., Max.); Sample No. 3 (Min., Max.); Sample No. 4 (Min., Max.)
1 39 39 43 43 44 44 43 43
2 36 ....
41 41 39
3 46
4 35 .... 36 40
5 .... 44 38
6 40
7
8 .... 45 49 37
9 46 36
10 36
11 35
12
13
14
15
16 29 45
17 .... 46
18 .... 50
19
20 32
21
22
23 47
24
25
Final Score 35 46 29 50 32 49 36 47

Note: For the population distribution see Table 13.

Before proceeding to the effects of heterogeneity I wish to carry further the range experiments on the population used in Table 11, by bringing the samples up to 200 each; but to conserve space I shall combine each series in groups of ten. Thus it will only be possible to tell within a range of 10 specimens when a new record was made. The results are set forth in Table 12. It will be seen how much greater are the ranges attained in this table with 200 specimens in each sample, as compared to those in Table 11 having 25 specimens per sample. This relationship between the number of specimens available and the extreme range of a character should always be given consideration in taxonomic work, not too much dependence being placed on over-all ranges derived from small samples.

TABLE 12
Changes in the Range with Increasing Sizes of Samples by Groups of 10. Body Blotches. Population Mean 40; Absolute Range 23 to 57.

Specimen        Sample No. 1   Sample No. 2   Sample No. 3   Sample No. 4
Number          Min.   Max.    Min.   Max.    Min.   Max.    Min.   Max.
1 to 10          33     44      34     50      36     50      34     45
11 to 20 through 191 to 200: 46 31 49 48 30 29 51 32 51 27 49 27 30 28 51
Final Score      30     49      27     51      28     51      27     51

Note: For the population distribution see Table 13.

A table giving the relationship between the standard deviation of a normal distribution and the mean range, as it varies with the number of specimens in the sample, is available in Pearson's Tables for Statisticians and Biometricians, Part II, Table XXII, pp. 165-166, 1931.
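The tabulated relationship between sample size and mean range can also be approximated numerically. The sketch below is not Pearson's method of tabulation; it assumes the standard expression for the expected range of n independent normal variates, E(R) = σ ∫ [1 − Φ(x)^n − (1 − Φ(x))^n] dx, and evaluates the integral by a simple trapezoid rule. The names `normal_cdf` and `mean_range_factor`, and the integration limits and step, are illustrative choices.

```python
import math


def normal_cdf(x):
    """Cumulative probability of the standard normal curve."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def mean_range_factor(n, limit=12.0, step=0.005):
    """Expected range of a sample of n normal variates, in units of sigma.

    Trapezoid-rule integration of 1 - F(x)**n - (1 - F(x))**n over
    [-limit, limit]; the integrand vanishes well inside these limits.
    """
    steps = int(round(2.0 * limit / step))
    total = 0.0
    for i in range(steps + 1):
        p = normal_cdf(-limit + i * step)
        value = 1.0 - p ** n - (1.0 - p) ** n
        total += value * (0.5 if i in (0, steps) else 1.0)
    return total * step


sigma = 4  # standard deviation of the body-blotch population
for n in (2, 5, 25, 60, 200, 1000):
    factor = mean_range_factor(n)
    print(n, round(factor, 2), round(factor * sigma, 1))
```

For samples of 25 and 200 the factors should come out close to 3.93 and 5.49, that is, mean ranges of about 15.7 and 22.0 blotches when σ is 4.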
From Pearson's table we find that the mean range attained in my Table 11 should be about 15.7; the actual ranges are 11, 21, 17, and 11 blotches, which gives a mean of 15. In Table 12 the final ranges were 19, 24, 23, and 24; mean 22.5. From Pearson's table we learn that the range should average 5.49 times the standard deviation (4), or 22.0; this is good agreement for only 4 samples. Theoretically a sample comprising 5 specimens will have about twice the range shown by the first two specimens, while 60 specimens will again double the range of the first five. But even 1000 specimens will not again double the range; they will add only 40 per cent.

This matter of range is thought to be of sufficient interest and importance to warrant the presentation of Table 13, which shows the actual population from which the samples in Tables 11 and 12 were drawn.

TABLE 13
Normal Distribution of Body Blotches in a Population of 100,000 Specimens. Mean 40; Standard Deviation 4.

Number of     Number of
Blotches      Specimens
    23               1
    24               3
    25               9
    26              22
    27              51
    28             111
    29             227
    30             438
    31             793
    32           1,350
    33           2,157
    34           3,238
    35           4,566
    36           6,049
    37           7,529
    38           8,802
    39           9,667
    40           9,974

The other half of the distribution is not given since it duplicates the first half in reverse; that is, there are 9,667 specimens with 41 blotches, 8,802 with 42, etc. A random sample of 4000 specimens drawn from this population had a range of 23 to 54 blotches; in other words, the lowest individual in the population was drawn, but the highest drawn was 3 below the absolute maximum contained in the population.

Table 14 represents the results of sampling the same composite population discussed under Table 9. The spread is increased and the results are more erratic than those disclosed in sampling homogeneous populations.
The results are somewhat similar to sampling a normal population with a higher standard deviation, but this is not exactly true, as the composite population is not normally distributed.

TABLE 14
Changes in the Range with Increasing Sizes of Samples, Showing the Effect of Heterogeneity. Ventral Scale Counts; Absolute Range 162 to 205.

Specimen Number; Sample No. 1 (Min., Max.); Sample No. 2 (Min., Max.); Sample No. 3 (Min., Max.); Sample No. 4 (Min., Max.)
1 187 187 188 188 185 185 183 183
2 191 185 189 182
3 184 193 193
4 184 178 178 178
5 182
6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
Final Score 181 171 175 193 77 176 171 191 175 193 177 193 176 193

It should be understood that while Tables 9 and 14 are drawn from the same composite population, they are based on separately selected samples.

A comparison of the trends in the means, as indicated by Tables 7, 8, and 9, with trends and variations in the over-all ranges as shown in Tables 10, 11, 12, and 14, will demonstrate how much more accurately the mean represents the population mean than the range in the sample represents the population range. For example, compare the results of Tables 8 and 11, Table 11 being used only up to and including Specimen 20:

                         Mean    Minimum   Maximum
Population Parameter     40.0      23        57
Sample 1                 39.9      35        46
Sample 2                 40.9      29        50
Sample 3                 40.2      32        49
Sample 4                 39.8      36        45

It is not only that the sample range fails to reach the population range, for this is to be expected, but there is a considerable discrepancy between the several sample ranges as compared with the slight variation between the sample means.

In subsequent studies of dispersion, comparisons are drawn between population parameters and sample statistics, with respect to the mean and range, where each sample comprises 100 specimens, instead of only 20 as above. The results are set forth in Table 22. The consistency of sample means, and the lack of dependability in the maxima and minima are there evident.
Further, it is to be remembered that these results are derived from populations that are truly normal. In real populations it is quite likely that there may be fringe specimens in greater number than are expected in a normal distribution, especially if juveniles are included. For juvenile specimens are occasionally so distorted and aberrant that they probably would not survive.

Although the range of a character is often used in taxonomic work, especially to show whether there is an overlap between two forms, it is really a rather poor indicator because of its lack of close adherence to the range of the population. It should never be used without a statement giving the number of specimens in the sample; otherwise it is almost without value. Later, in discussing dispersion in samples, the use of the interquartile range, as a dispersion indicator, will be examined, and additional examples of the relationship between population and sample ranges will be adduced.

Dispersions of Samples Compared to Those of Populations

Thus far I have discussed the relationships of samples and parent populations in respect of the two simplest and most frequently used herpetological statistics, the mean and the range of variation. Illustrations have been given showing how these sample statistics change, and in general tend to approach the population parameters as the samples increase in size. There remains the important attribute of dispersion, that is, the scatter of the variates about the mean: how closely they adhere to the mean and what the nature of the dispersion may be. This is an extremely important statistic in taxonomic problems, especially those having to do with the significance of differences between subspecies or species. For, given a certain difference between means, the extent of the overlap between two forms will obviously depend upon the extent to which the variates spread on each side of the mean.
Returning to the relationship between samples and the populations from which they were drawn, I again resort to the method of illustration by drawing a number of samples from each of several different populations, setting them beside each other in tabular form to facilitate a visual comparison. Since samples of appreciable size are rarely identical (each sample having its own individuality), no single sample can outline this relationship completely, which is the reason for drawing a number of illustrative samples from each population.

TABLE 15
Changes in the Dispersion of a Sample with Enlargement of the Sample. Infralabials: Mean = 16, σ = 1

Number of      Distribution                   Successive Sample Steps
Infralabials   of Population     1    2    3    4    5    6    7    8    9   10   11
    12                13
    13               443                                        1    1    1    1    1
    14             5,399                                                  2    5   19
    15            24,197              1    2    3    3    5   10   15   24   48  129
    16            39,896         1    1    1    1    1    1    7   15   38   74  199
    17            24,197                             1    3    4   16   32   61  125
    18             5,399                                  1    3    3    3    9   25
    19               443                                                      2    2
    20                13
Total            100,000         1    2    3    4    5   10   25   50  100  200  500

In presenting tables showing the trends of means and over-all ranges with the growth of samples, each specimen increment was shown separately. This is usually too cumbersome a method when illustrating the relationship between the dispersion in samples and populations; however, in Table 15 I have set forth such a growth of one sample by successive steps, using, as the population, an assumed distribution of infralabials in a species of rattlesnake, this particular distribution being closely approximated in several species. Table 15 shows how, as a sample grows, it tends more closely to approximate the parent population in the character of its dispersion. But even at 200 specimens the discrepancies are quite conspicuous; in this particular sample there are too few specimens with 14 infralabials and too many with 17.
By the time the sample has been built up to 500 specimens these imperfections have mostly disappeared, and the sample shows a closer resemblance to the parent population. In making such a comparison, however, it is important to recall how seldom we have a herpetological sample as large as 500 specimens.

TABLE 16
Dispersions of 10 Samples, Each Comprising 100 Individuals. Infralabials: Mean = 16, σ = 1

Number of Infralabials; Distribution of Population; Sample Number 1 2 3 4 5 6 7 8 9 10; Total
12 13
13 443 2 2 1 1 6
14 5,399 1 9 7 4 9 3 3 1 7 7 51
15 24,197 29 24 15 15 29 30 22 34 21 16 235
16 39,896 40 37 39 47 37 36 38 39 43 42 398
17 24,197 24 25 37 30 20 28 33 21 23 27 268
18 5,399 6 3 2 4 2 3 3 4 4 8 39
19 443 1 1 1 3
20 13
Total 100,000 100 100 100 100 100 100 100 100 100 100 1000

Table 16 uses the same population, drawing therefrom 10 new samples, each containing 100 specimens. A considerable variation amongst the samples will be noted; each seems to have an individuality of its own. In the summation of the ten samples, bringing the total to 1000 specimens, there is a rather marked deficiency in specimens having 18 infralabials, and correspondingly an overabundance of those with 17. Some of the samples are badly distorted, while others more closely follow the distribution of the population. Sample No. 9 is probably the closest fit to 1/1000 of the population, while No. 3 is particularly unbalanced. A chi-square test of the total of the 1000 specimens gives P = 0.21.[4] This is not a particularly good fit, for it shows that 79 similarly drawn samples out of 100 would more closely approximate the distribution of the population.

[4] In making these chi-square tests I have taken the degrees of freedom as one less than the number of classes as suppressed, since a fit has been forced only with respect to the totals of the theoretical and actual distributions.
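A chi-square comparison of this kind can be sketched as below. The pooling of the sparse edge classes (13 and under, 19 and over) is an assumption made here to obtain seven classes and 6 degrees of freedom; the original does not state exactly how its classes were combined. The helper names are likewise illustrative, and the closed-form tail probability used is valid only for an even number of degrees of freedom.

```python
import math

# Table 16: population per 100,000 and the totals observed in 1000 draws.
population = {12: 13, 13: 443, 14: 5399, 15: 24197, 16: 39896,
              17: 24197, 18: 5399, 19: 443, 20: 13}
observed = {13: 6, 14: 51, 15: 235, 16: 398, 17: 268, 18: 39, 19: 3}


def pool_edges(counts, low, high):
    """Combine all classes at or below low, and at or above high (assumed grouping)."""
    pooled = {}
    for value, count in counts.items():
        key = min(max(value, low), high)
        pooled[key] = pooled.get(key, 0) + count
    return pooled


def chi_square_tail_even(x, df):
    """P(chi-square > x) for even df, from the closed-form finite series."""
    if df % 2 != 0:
        raise ValueError("closed form used here needs an even df")
    half = x / 2.0
    term, series = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        series += term
    return math.exp(-half) * series


obs = pool_edges(observed, 13, 19)
pop = pool_edges(population, 13, 19)
n = sum(obs.values())
chi_square = sum((obs[k] - pop[k] * n / 100000) ** 2 / (pop[k] * n / 100000)
                 for k in pop)
df = len(pop) - 1
p_value = chi_square_tail_even(chi_square, df)
print(round(chi_square, 2), df, round(p_value, 2))
```

Under this assumed grouping the probability comes out at roughly 0.2, in line with the figure quoted in the text.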
But the principal point to be observed with respect to these samples, comparing each with the others, and with the parent population from which they were drawn, is the varying impression that a taxonomist might gain from them, having in mind the fact that the taxonomist would have only one sample before him, neither the population nor the other samples being available.

Tables 15 and 16 gave the results of random sampling from a population having a rather closely concentrated character; after all there is not much variation in the infralabials of any homogeneous series of rattlesnakes. Table 17 sets forth the results of sampling a population which is divided into a greater number of classes. The character considered is the subcaudal scale count; the mean is 30 and the standard deviation 2. Thus the coefficient of variation is 6.67 per cent. Such a distribution is not unusual; it is closely approximated, for example, by female Phyllorhynchus decurtatus perkinsi. It will be observed that, from the separate samples, one gains a less accurate suggestion of the parent population than was the case with the more concentrated character of Table 16. Sample 2 is particularly distorted; Sample 9 is good as far as the central classes are concerned, but poorly distributed in the edge-classes. The grand total of 1000 specimens rather closely approximates the parent curve (P by chi-square = 0.61), although there is an overabundance of specimens with 32 subcaudals, and a shortage of those with 28.

TABLE 17
Dispersions of 10 Samples, Each Comprising 100 Individuals. Subcaudals: Mean = 30, σ = 2

Number of Subcaudals; Distribution of Population; Sample Number 1 2 3 4 5 6 7 8 9 10; Total
21 1
— 22 7 — — — 23 44 1 1 24 222 1 1 2 25 876 1 1 1 1 2 — 2 1 2 11 26 2,700 4 2 2 3 5 5 2 2 3 2 30 27 6,476 7 4 7 10 8 6 1 6 9 7 65 28 12,098 11 4 12 15 4 12 12 9 9 10 98 29 17,603 16 26 19 15 17 17 12 18 18 16 174 30 19,946 18 24 18 22 19 13 21 23 22 18 198 31 17,603 15 16 21 15 20 15 25 15 17 14 173 32 12,098 15 11 12 10 11 17 14 16 10 17 133 33 6,476 6 6 3 4 10 11 6 8 5 10 69 34 2,700 5 5 2 4 4 1 4 1 7 3 36 35 876 1 2 2 1 1 1 8 36 222 1 .... 1 37 44 1 .... 1 3 8 7 .... .... 39 1 .... .... .... .... .... 100,000 100 100 100 100 100 100 100 100 100 100 Total 1000 58 Bulletin 17: Zoological Society of San Diego Passing on to a character with a still wider spread, Table 18 presents the distributions of the usual 10 samples of 100 individuals each, drawn from the same population which comprised the basis of Tables 7 and 10, namely the ventral scutes in a homogeneous series of rattlesnakes with TABLE 18 Dispersions of 10 Samples, Each Comprising 100 Individuals. Ventrals: Mean = 200, d HN o i—i o o rH OS oo Csl O OS s OO ON ON ON OS OS oo oo OS ON oo "a, E 33 i—l rH rH i—i rH rH rH rH 1-H 1—H rH H 03 o Vz*s 'Tt" CM o O o o as O o OS OS ON oo a s o O o o os o o OS OS ON On 03 (N (N < ON so UH r\ UH Vr, C-N vh 03 G 03 s riN no nO riN rO nO nO rO rO no Oh K O Oh oo d rH Cr, Lh "T" V-N nO Us Vo NO V**\ I”"1 § CM (N Cs| CM Csl Cn| Cs| Cs4 Cs| CM Csl M CQ l\ H c 1—H c O rO Csl CM '✓"S OS O VO rO Lh o c r ns o o (N O so © rH nO rH O Csl o VO 03 03 d H X o OO OO OO OO OS OO OO OS ON OO CD a s CM r"" 1-H T"H T"H •“ 1 1 rH 03 VO U . C/5 1) £ 03 03 Jh h CM no us NO r\ OO ON O 03 rH Ph o' C £ o 80.0-121.1 6 83 116 80.4-120.9 7 79 122 78.7-120.3 8 84 116 81.1-117.4 9 85 125 78.9-122.9 10 84 110 81.8-117.5 Total (Sample of 1000) 79 125 80.2-119.8 We see that the population minimum is 72, while the sample minimums vary from 79 to 88; the population maximum is 128 and the sample maximums vary from 110 to 12 5. 
On the other hand the sample statistics represented by M ± 3σ are much more consistent indicators of the corresponding population parameter; for the population low figure is 80.0 while the samples vary between 78.7 and 82.0, and the population high figure is 120, with the samples varying between 117.4 and 122.9. Thus, there is every reason to recommend the M ± 3σ statistic as a population indicator as compared to the over-all range. Admittedly the samples I have used in this test (100 individuals) are relatively large, but similar advantages in the use of this statistic will be found in the case of smaller samples. For example, in Table 26 I have built a sample up to 25 specimens, by adding one randomly selected individual at a time, all the while recording the trend in the over-all range and in the statistic M ± 3σ.

The population used is that shown in Table 13, that is, a homogeneous series of snakes having an average number of 40 body blotches, with a standard deviation of 4. All data are calculated from the sample as it grows, the population being assumed unavailable, as would be the case in actual practice. Thus, both M and σ change as each specimen is added; and in each calculation the factor $[N/(N-1)]^{1/2}$ is used in securing the optimum population value of σ.

TABLE 26
Changes in the Range and in M ± 3σ as a Sample Increases from 1 Specimen to 25. Body Blotches. Mean = 40, σ = 4.

                                 Over-all Range        Range
Specimen No.           Count     Min.      Max.        M ± 3σ
Population Parameter     40       23        57         28.0-52.0
     1                   44       44        44         44.0-44.0
     2                   43       43        44         42.0-44.0
     3                   41       41        44         38.1-47.3
     4                   37       37        44         32.0-50.5
     5                   34       34        44         27.2-52.4
     6                   44       34        44         28.1-52.9
     7                   41       34        44         29.2-51.9
     8                   44       34        44         30.0-52.0
     9                   35       34        44         28.2-52.3
    10                   33       33        44         26.3-52.9
    11                   39       33        44         27.0-52.2
    12                   37       33        44         27.1-51.5
    13                   42       33        44         27.6-51.5
    14                   38       33        44         27.8-51.0
    15                   33       33        44         26.9-51.2
    16                   31       31        44         25.3-51.7
    17                   38       31        44         25.7-51.3
    18                   36       31        44         25.8-50.8
    19                   42       31        44         26.1-51.0
    20                   42       31        44         26.4-51.0
    21                   42       31        44         26.7-51.1
    22                   40       31        44         27.0-50.9
    23                   45       31        45         26.9-51.4
    24                   40       31        45         27.2-51.2
    25                   32       31        45         26.4-51.4

Note: In determining the range M ± 3σ, the estimated standard deviation of the population was calculated from the sample.

We see that the extreme or over-all range never approaches closely to that of the population. Even after 25 specimens are accumulated the range is only from 5 above the mean to 9 below, a notable unbalance in itself. But the statistic M ± 3σ reaches a figure quite close to the population parameter after the accumulation of only 5 specimens, and remains consistently close to that parameter as long as specimens are added. So once more the consistency of a statistic which is a multiple of the standard deviation is demonstrated. However, it should be noted that these strictures upon the relative values of statistics of the over-all range as compared to some multiple of the standard deviation, are only pertinent when applied to variates having a substantially normal distribution, or at least one which is fairly symmetrical. The over-all range may be a better criterion in strongly skewed distributions.

These statistics of populations and samples are primarily necessary in taxonomic work to demonstrate the validity of differences, namely the chances that two supposed species overlap in a particular character and the extent of such overlap. I shall illustrate a typical case of overlap and how it becomes increasingly evident as more specimens are added to the available collections, that is, as the samples increase in size. Table 27 represents the results of sampling two populations coincidently, the same number of specimens being added to each. The populations sampled have similar characteristics, except with respect to their averages.
They represent ventral scale counts in two species of snakes, both having standard deviations of 4; but the mean of one population is 200 scutes and of the other 184. Thus, the difference between the means is 4 times the standard deviation of either. The vertical columns in the table are cumulative samples; that is, the first column shows the first sample drawn, the second, the first plus the second, etc. To conserve space individual drawings are shown up to Sample 5, then by twos, threes, fives, etc.

The interesting feature of this test is that one does not get the feeling that there is a probable overlap between the two forms until the seventh specimen of each has become available; and an actual overlap did not occur until the drawing of the eighteenth specimen. This is somewhat typical of our knowledge of rarer forms, which are often first thought to be quite well separated, and are so noted in keys, but which later are shown to overlap, when additional specimens have become available. The probability that such an overlap would eventually be evident might have been predicted by calculation as early as Specimen 5. Of course these remarks on the gradual evidence of an overlap have little to do with the validity of the two species, since even with the overlap, the difference between the means is sufficient to warrant recognition. According to Ginsburg’s criterion⁵ the small overlap (2.28%) would indicate full species. However, the discussion of the extent of divergence, its measure and significance, and ...

⁵ Zoologica, Vol. 23, pp. 253-286, 1938.

TABLE 27
Simultaneous Sampling of Two Populations. Ventral Scutes: M1 = 184, M2 = 200; σ = 4
[Body of table illegible in the source.]

Klauber: Scalation and Life Zones 77

... species. The only exception is Pituophis catenifer. The evidence continues to multiply that the two supposed subspecies, P. c. annectens and P. c. deserticola, meet or overlap, but do not intergrade in eastern San Diego County.

The results of the investigation of the number of ventral scutes are given in Table 1. Since sexual dimorphism is present in nearly all these forms, the sexes are treated separately, except in the case of Leptotyphlops humilis. Also in this species the dorsals, rather than the ventrals, are used, since they can be more accurately counted, and therefore are more often employed in taxonomic work.

From this tabulation of the thirteen forms we find that there is an almost universal tendency toward a higher number of ventrals in the desert specimens, as compared to those which were collected in the more humid⁴,⁵ cismontane region. No less than ten out of the thirteen forms show this trend in both sexes; and in every instance the significance is beyond the usually accepted level of P = 0.05, meaning less than one chance in 20 that the result has occurred through an accident of random sampling.
In the majority of cases the probability is below one in a thousand, leaving no doubt as to the reality of the trend.

The exceptions are three in number: Lichanura roseofusca, Coluber flagellum frenatus, and Crotalus mitchellii pyrrhus. In the first two we find that the males follow the usual trend, that is, the desert males have more ventrals than the coastal. The females show no significant territorial variation in Lichanura; while in C. f. frenatus, the desert females average lower than those from the coastal side of the mountains, although the difference is below the usually accepted level of significance.

The last species which fails to follow the trend of the majority is the rattlesnake C. m. pyrrhus. Here the desert males average slightly higher than the coastal, and the desert females somewhat lower, but the differences are below the significance level. It may be seriously doubted whether larger samples would reverse the condition noted in this species. Of the 13 forms listed in this study C. m. pyrrhus is the only one, besides Trimorphodon vandenburghi, which inhabits only a limited zone in the cismontane area; for although both may rarely be found in the coastal and inland valleys zones, their infrequency shows that they are more or less strays. Their headquarters are in the foothill zones on both sides of the mountains. Thus it may be said, in possible explanation of the deviation of C. m. pyrrhus from the trend followed by the majority of forms, that it has never been fully subjected to the coastal influence.

⁴ The difference in humidity is the outstanding difference between the two regions; however I do not claim to have shown that this is the cause of the observed difference in ventral scale counts, which might result from any of a number of secondary environmental characteristics, of which temperature is outstanding.

⁵ As several of the samples are rather small, especially in the case of some species which are not plentiful in the desert region, I have in all cases used the t-test and the method of pooling, in determining the significance of the difference. The equation is
$t = (M - M')\,[NN'(N + N' - 2)]^{1/2}\,[(N + N')(Nv + N'v')]^{-1/2}$,
where M and M' are the means of the two samples, N and N' the numbers of specimens in each of the samples, and v and v' are the variances (standard deviations squared) of the samples. The t-table is entered at N + N' - 2 degrees of freedom. (R. A. Fisher, Statistical Methods for Research Workers, Seventh Edition, 1938, p. 128; J. F. Kenney, Mathematics of Statistics, part 2, 1939, p. 140; P. R. Rider, An Introduction to Modern Statistical Methods, 1939, p. 91.)

It will be observed that the three smallest snakes, Leptotyphlops humilis, Hypsiglena ochrorhynchus, and Tantilla eiseni, have the highest coefficients of divergence.⁶ This is probably evidence of the trend, frequently observable, that slowly moving forms exhibit greater differences per unit of distance than more widely wandering species.

Of the three colubrids which fail to attain a significance of P = .001 (that is, whose probabilities remain greater than one in a thousand), two have not been fully influenced by zonal extremes; L. g. californiae is quite rare in the desert, and T. vandenburghi is virtually absent along the coast. Even so, both would probably attain a significance of .001 with larger samples, since, if the coefficient of divergence remains unchanged, the significance increases with the size of the sample.

Although a general tendency can be shown to exist in some genera (the rattlesnakes are an example) for larger species and subspecies to have higher scale counts, this is not a causative factor in increasing the number of ventrals in these desert specimens. In several cases desert specimens run somewhat smaller in size than in the cismontane region, as is the case, for example, in Crotalus ruber and Coluber flagellum frenatus.
The reverse is true of Leptotyphlops; in other forms the inhabitants of the two areas do not differ conspicuously in size.

The trend shown definitely to exist in the ventrals in at least ten of thirteen species is not repeated in any other characteristic, the following having been tested by the same method: scale rows, caudals, supralabials, infralabials, body blotches, and tail rings. Some have no variations at all, such being true of several species in the case of scale rows and labials; for many colubrids have little or no intraspecific variation in these characters. Many show differences below the level of significance, which I have taken at P = .05. But even where there is significance in one species, there is no consistent trend throughout the thirteen forms or even a majority of them.

Thus in the case of the dorsal scale rows we find that C. ruber has a significantly higher average on the coast, while C. m. pyrrhus has a correspondingly higher average in the desert. The others either have no differences, or such differences as there are fall below the level of significance.

With respect to the caudals the following show significant differences: Salvadora, Lampropeltis, Rhinocheilus, and Hypsiglena average higher in the desert than in the cismontane region, thus following the trend in the ventrals; Pituophis, on the other hand, has the opposite trend, for the coastal subspecies has a higher average. The study of the caudals is somewhat handicapped by lack of specimens; for many specimens have incomplete tails, thus always reducing the number below those available for the study of ventrals. Larger series may show more definite trends; but they are not likely to be as consistent as is found to be the case in the ventrals.

⁶ Defined as the difference between the means divided by half the sum of the means.

The labials are constant in a number of species, as is often the case in the colubrids.
Only Trimorphodon shows a significant difference, the coastal specimens having the greater number of supralabials. As to the infralabials A. e. occidentalis, T. vandenburghi, and C. ruber average significantly higher in the coastal region, while the contrary is true of Pituophis.

As regards pattern, those species which have rings or blotches (as differentiated from the striped forms) do exhibit differences, although not always above the level of significance. Thus Rhinocheilus and Pituophis have markedly fewer blotches in the desert than along the coast. However, the opposite is true in Arizona and Trimorphodon, although in both these cases the differences are somewhat below the significance level. Desert Hypsiglena also has a higher number of blotches than the cismontane form. Thus we see that the lightening of color in the desert individuals, which is universal in all 13 forms except Lampropeltis, is not secured by reducing the number of blotches, although this is the case with Pituophis, in which both fewer blotches and reduced pigment contribute to the lighter tone. Rather, it is effected by a reduction of pigment in blotches, ground color, or both.

It is worthy of note that the trend in ventrals, which has been shown to exist, is evident in that character which is relatively the most consistent of the really variable scale counts, that is, the ventrals have the lowest coefficients of variation.

I wish to acknowledge my indebtedness to Charles Shaw for making scale counts, Mrs. Elizabeth Leslie for statistical computations, and C. B. Perkins for his usual pertinent suggestions. Scale counts of particular specimens were received from Dr. R. B. Cowles, Charles M. Bogert, David Regnery, and J. R. Slevin.

IV. THE RATTLESNAKES LISTED BY LINNAEUS IN 1758
Introduction

Linnaeus’ descriptions of reptiles were so brief and so frequently based on composite material that, unless the type specimens are still extant, linking them with known species is often difficult and sometimes impossible. Yet, as the tenth edition of the Systema Naturae, 1758, is the foundation of all nomenclature, it is important that attempts be made to solve these problems of identification.

Linnaeus listed three species of rattlesnakes in the tenth edition: horridus, dryinas, and durissus. In the twelfth edition he added two more; these are not difficult to assign, being the species now known as Sistrurus miliarius and Lachesis muta, the latter not a rattlesnake. But the first three have been the source of much confusion among taxonomists, and even now there is not complete agreement respecting the proper applications of these names. It has occurred to me that the large collections of specimens at present available might justify a re-examination of the problems involved, since we can now more accurately define the scale-count ranges of the several species which may have been the real subjects of Linnaeus’ descriptions. We likewise have new statistical methods of determining degrees of difference.

The confusion primarily relates to the correct names to be assigned to five species and subspecies of rattlesnakes; these are the timber (or banded) rattlesnake of the eastern United States, the canebrake rattler (a southeastern subspecies of the timber rattler), the Florida diamondback, the Central American rattlesnake, and the South American rattlesnake, the last two being subspecies of the Neotropical rattlesnake. I am constrained, for the moment, to refer to these by their common names, since to use the technical names would involve the confusion I am trying to explain. Besides the three initiated by Linnaeus, another technical name, that of terrificus Laurenti, 1768, must also be considered.
Past Usages

Some past allocations have been as follows:

(a) The timber rattlesnake (to which the name h. horridus is now usually assigned) was identified as durissus by Holbrook, 1842, and by Dumeril, Bibron, and Dumeril, 1854.

(b) The Florida diamondback (now generally known as adamanteus) was called durissus by Boulenger, 1896, and in the Mission Scientifique, 1909; and terrificus by Le Conte, 1853.

(c) The Central American rattlesnake (now usually called durissus) was assigned to horridus by D., B., and D., 1854, and Gunther, 1902; and to terrificus by Boulenger, 1896.

(d) The South American rattlesnake (to which the name terrificus is usually applied) was referred to as durissus by Jan, 1859.

Dryinas has been adjudged so vague that the name was dropped at an early date and has not been used for many years.

I have mentioned the decisions of only a few herpetologists; the list and confusion could be greatly extended.

Linnaeus’ Method and Type Specimens

Linnaeus, in describing the snakes in the Systema Naturae, generally used a schedule comprising five parts, especially if the type specimen was contained in one of the collections to which he had access, as was the case with the rattlesnakes. These parts are:

(1) The sex and number of ventrals and subcaudals of the type specimen.

(2) A primary reference, in which a more complete description of the type, either by himself or some other author, may be found.

(3) Secondary references which Linnaeus assigned to the same species. However, these often lead to confusion, since they may refer to species other than that of the type, or they may refer to composite or indefinitely described material. When there is a conflict, these secondary references must obviously yield to 1 and 2. It is important that the primary reference be not confused with the secondary; it can usually be identified through the scale counts, as well as by its initial position.
Sometimes all references have an equal value, but such is not the case with the three rattlesnakes. The primary reference may contain still others, which may be considered tertiary.

(4) A habitat. Since this is expressed more as an over-all range than a type locality, as we know the latter today, the statement is usually too broad to be of any service in assigning names. For example, the habitats of all three rattlesnakes are given in the tenth edition simply as "America," and therefore do not facilitate the problems of identification.

(5) A description, usually including color notes. These are all too brief; they sometimes involve descriptions of specimens other than the type, and it is often clear that the specimens described were much faded. Sometimes the descriptive notes on the type specimen are supplanted by natural history notes culled from other references.

None of the three types of Linnaeus’ rattlesnakes is now available for study. The type of horridus was contained in the King Adolf Fredrik Museum, most of the material from which was eventually transferred to the Royal Museum in Stockholm. Andersson, 1899, p. 5, states that horridus is one of the types now missing. He mentions (p. 27) two jars labeled Crotalus horridus, one containing a head, which is not that of a rattler; and the other the tail of a rattlesnake, which presumably is not that of the type of horridus, since it has more rattles than the type had, and more subcaudals as well. I communicated with the Royal Natural History Museum in 1935, hoping the rattle might be a complete string which could be analyzed, but Count Nils Gyldenstolpe replied that the string was incomplete, and that there were no new developments with regard to the lost type of horridus, only the jars and their contents mentioned by Andersson remaining.

Lonnberg, 1896, states that the types of both dryinas and durissus are also lost (pp. 18 and 27).
These specimens were originally contained in two collections which were available to Linnaeus for study, the first the Adolf Fredrik Collection (not the same assemblage as the Adolf Fredrik Museum) and the second the Claudius Grill Collection, also referred to as the Surinam Collection. Both collections were eventually transferred to the Zoological Museum of the Royal University at Upsala, but the two rattlesnakes have disappeared.

Thus our investigation must be restricted to the original descriptions supplied by Linnaeus. In fact, it must be clear that the uncertainties respecting the proper applications of the Linnean names are present only because the types are gone. Were they available, they would take precedence over the inadequate and conflicting descriptions upon which we must now depend.

Studies of Scale Counts

The rattlesnake scale counts given by Linnaeus are as follows:

192. horridus. 167-23:2.
195. Dryinas. 165-30.
196. Durissus. 172-21:3.

In each case the figure preceding the name is the sum of the ventrals and subcaudals.¹ If there are two figures, separated by a colon, representing the subcaudals, it is to be understood that the first indicates the number of entire scales, and the second those which are divided. Only two other scale counts are made available by Linnaeus; in the primary reference the supralabials of dryinas are given as 14-14, and the infralabials 14-14 also. All three types are stated to be males.

Before matching these scale counts against the known dispersions of present day species and subspecies by the methods of mathematical statistics, we can narrow the field by some general considerations of the characters and ranges of the forms which are now recognized as valid. Hereafter I shall use the scientific names in their customary modern assignments (Klauber, 1936).
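The notation of the Linnean scale counts can be checked mechanically. The following is a small illustrative sketch in Python; the function name `parse_linnean_count` is hypothetical and is no part of the original method. It reads an entry such as 167-23:2 as ventrals, entire subcaudals, and divided subcaudals, and confirms that the index number preceding each name is their sum.

```python
def parse_linnean_count(entry):
    """Split 'ventrals-subcaudals[:divided]' into three integers."""
    ventrals, _, caudal_part = entry.partition("-")
    entire, _, divided = caudal_part.partition(":")
    # A count without a colon has no divided subcaudals recorded.
    return int(ventrals), int(entire), int(divided or 0)


# Index number, name, and scale count as quoted from the tenth edition.
linnean_entries = [
    (192, "horridus", "167-23:2"),
    (195, "dryinas", "165-30"),
    (196, "durissus", "172-21:3"),
]

for index, name, entry in linnean_entries:
    ventrals, entire, divided = parse_linnean_count(entry)
    total = ventrals + entire + divided
    print(f"{name}: {ventrals} + {entire} + {divided} = {total} (index {index})")
```

In all three entries the sum of the parts agrees with the index number, which gives some assurance that the counts have been transcribed as Linnaeus printed them.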
Of the more than forty species and subspecies of rattlesnakes now recognized, many can be eliminated from consideration on one of two counts: Either their ranges are so restricted or were so inaccessible to the early eighteenth century traveler as virtually to exclude the possibility of their being in the three collections which contained these types; or their ventral and subcaudal scale counts are so widely different from those of the three types as to preclude their being the species described. The first criterion practically eliminates all species except those found along the eastern coasts of North and South America, or territories not far inland; the second excludes many other forms.

¹ It is interesting to note that Linnaeus listed the snakes in each genus in the order of their total ventral plus subcaudal scales, beginning with the lowest number, in effect a sort of numerical index.

I think we may exclude C. viridis and all of its subspecies on the score of geographic inaccessibility. In fact, it seems to me highly significant that no amphibian or reptile specimen from what is now the United States was contained in any of the three collections which included the three rattlesnake types. In the tenth edition of the Systema Naturae, Linnaeus described six land reptiles (other than C. horridus) whose ranges center in the United States. Using their present-day names these are: Eumeces fasciatus, Coluber constrictor, Natrix sipedon, Thamnophis sirtalis, Chelydra serpentina, and Terrapene carolina. I omit Bufo marinus as being primarily Neotropical. The type descriptions of all of these were based on Kalm, Catesby, or Edwards, with the exception of C. serpentina (about the original of which type Linnaeus was not definite), none being described from specimens in the three Swedish collections containing the rattlers.
This would leave one to infer that the chance that, of all the reptiles in these collections, only one or more of the rattlers came from the United States, is somewhat remote; at least it would take rather strong evidence to balance the probability that they did not. In the twelfth edition of the Systema Naturae, Linnaeus described fourteen additional land reptiles from the United States, but all except one were premised on Catesby’s descriptions; thus up to 1766 no U. S. specimens had reached these collections which Linnaeus studied, although many Neotropical forms were included therein. Nevertheless I have not excluded the timber rattler and Florida diamondback as possibilities. But at least we are justified in eliminating such western forms as viridis and its subspecies.

Returning to other species which may be omitted from consideration on the score of rarity or geographical inaccessibility, I think we can exclude both molossus and scutulatus, which, although they are found in the vicinity of Mexico City, are rare so near the southern limits of their ranges. Neither reaches the east coast of Mexico. C. triseriatus and S. catenatus are eliminated on the score of scale counts.

The likeliest candidates remaining are the following:

C. d. durissus          Central American Rattlesnake
C. d. terrificus        South American Rattlesnake
C. unicolor             Aruba Island Rattlesnake
C. adamanteus           Florida Diamondback Rattlesnake
C. cinereous (atrox)    Western Diamond Rattlesnake
C. h. horridus          Timber Rattlesnake
C. h. atricaudatus      Canebrake Rattlesnake

I have included C. unicolor as a possibility because the color description of Linnaeus’ dryinas fits it well, although the chance that he had access to such an island form appears remote. However, unicolor may also occur on the mainland (Klauber, 1936, p. 197).
I now proceed to analyze the relative chances that the three species described by Linnaeus represent each of the seven species and subspecies listed as being possibilities. This analysis is made by taking the statistics of these forms, as deduced from scale counts now at hand, and calculating, by the t-test, the significance of the difference between the population mean and Linnaeus’ scale count. The formula is t = (M - X) / ...²

This scheme involves the following assumptions: That the scale counts (and sexes) as given by Linnaeus are accurate and were made by the same methods as those used today; that the dispersions of these characters are substantially normal, as seems to be the case (see Sec. 1 of this series); and finally that the scale counts included in my samples represent the areas from which Linnaeus’ specimens were derived, so that the tests are not adversely affected by intrasubspecific trends or clines.

With respect to the scale counts, aside from obvious slips, Linnaeus seems to have been quite accurate. When we compare Andersson’s checks on Linnean types we find that there is seldom a difference of more than one in either the ventrals or subcaudals. Usually when there is a difference, Andersson’s results are one higher than Linnaeus’. If there are any counts especially to be doubted they are the 14-14 of both supralabials and infralabials in the type of dryinas. This is a uniformity rarely met with in actuality.

Admittedly these conditions render conclusions somewhat hazardous; nevertheless, by this method we will be making the best use of the tenuous numerical data which Linnaeus has supplied.

Table 1 sets forth the statistics resulting from studies of the species and subspecies considered to be possible solutions.

² R. A. Fisher: Statistical Methods for Research Workers, Seventh Edition, pp. 104-106, 1938.

TABLE 1
Statistics of Scale Counts

                             Ventrals                Subcaudals
Subspecies               N      M       σ         N      M       σ
C. d. durissus           52   175.27   3.92       51    30.14   1.91
C. d. terrificus         18   170.39   3.26       18    28.44   2.18
C. unicolor              12   158.67   2.10       12    28.67   1.30
C. adamanteus            35   170.49   2.73       35    29.49   1.34
C. cinereous            147   178.63   3.24      147    25.88   1.47
C. h. horridus          154   167.66   3.29      154    24.69   1.82
C. h. atricaudatus       24   171.04   3.52       25    26.72   1.86

Supralabials Infralabials Subspecies N M