
Full text of "Introduction to quantitative genetics"






Agricultural Research Councils Unit of Animal Genetics 
University of Edinburgh 


All Rights Reserved 

No part of this book may be reproduced 

in any form without permission in writing 

from the Publisher 




Copyright © 1960 D. S. Falconer

Printed in Great Britain by 
Robert MacLehose and Company Limited, Glasgow 


My aim in writing this book has been to provide an introductory text- 
book of quantitative genetics, with the emphasis on general principles 
rather than on practical application, and one moreover that can be 
understood by biologists of no more than ordinary mathematical 
ability. In pursuit of this latter aim I have set out the mathematics in 
the form that I, being little of a mathematician, find most compre- 
hensible, hoping that the consequent lack of rigour and elegance will 
be compensated for by a wider accessibility. The reader is not, how- 
ever, asked to accept conclusions without proof. Though only the 
simplest algebra is used, all the mathematical deductions essential 
to the exposition of the subject are demonstrated in full. Some 
knowledge of statistics, however, is assumed, particularly of the ana- 
lysis of variance and of correlation and regression. Elementary 
knowledge of Mendelian genetics is also assumed. 

I have had no particular class of reader exclusively in mind, but 
have tried to make the book useful to as wide a range of readers as 
possible. In consequence some will find less detail than they require 
and others more. Those who intend to become specialists in this 
branch of genetics or in its application to animal or plant breeding 
will find all they require of the general principles, but will find little 
guidance in the techniques of experimentation or of breeding 
practice. Those for whom the subject forms part of a course of 
general genetics will find a good deal more detail than they require. 
The section headings, however, should facilitate the selection of what 
is relevant, and any of the following chapters could be omitted without 
serious loss of continuity: Chapters 4, 5, 10 (after p. 168), 12, 13,
and 15-20. 

The choice of symbols presented some difficulties because there 
are several different systems in current use, and it proved impossible 
to build up a self-consistent system entirely from these. I have 
accordingly adopted what seemed to me the most appropriate of the 


symbols in current use, but have not hesitated to introduce new 
symbols where consistency or clarity seemed to require them. I 
hope that my system will not be found unduly confusing to those 
accustomed to a different one. There is a list of symbols at the end, 
where some of the equivalents in other systems are given. 


Many people have helped me in various ways, to all of whom I 
should like to express my thanks. I am greatly indebted to Professor 
C. H. Waddington for his encouragement and for the facilities that I 
have enjoyed in his laboratory. It is no exaggeration to say that with- 
out Dr Alan Robertson's help this book could not have been written. 
Not only has his reading of the manuscript led to the elimination of 
many errors, but I have been greatly assisted in my understanding 
of the subject, particularly its more mathematical aspects, by frequent 
discussions with him. Dr R. C. Roberts read the whole manuscript 
with great care and his valuable suggestions led to many improve- 
ments being made. Parts of the manuscript were read also by Dr 
N. Bateman, Dr J. C. Bowman, Dr D. G. Gilmour, Dr J. H. Sang, 
and my wife, to all of whom I am grateful for advice. I owe much 
also to the Honours and Diploma students of Animal Genetics in 
Edinburgh between 1951 and 1957, whose questions led to improvements
of presentation at many points. Despite all the help I have
received, many imperfections remain and there can hardly fail to be 
some errors that have escaped detection: the responsibility for all
of these is entirely mine. To Mr E. D. Roberts I am indebted for 
drawing all the graphs and diagrams, and I greatly appreciate the 
care and skill with which he has drawn them. I am indebted also to 
the Director and Staff of the Commonwealth Bureau of Animal 
Breeding for assistance with the preparation of the list of references. 


Institute of Animal Genetics, Edinburgh 
December, 1958 





Frequencies of genes and genotypes
Hardy-Weinberg equilibrium

Migration
Mutation
Selection

3 SMALL POPULATIONS: I. Changes of gene frequency under simplified conditions
The idealised population
Sampling
Inbreeding

4 SMALL POPULATIONS: II. Less simplified conditions
Effective population size
Migration, Mutation, and Selection
Random drift in natural populations

5 SMALL POPULATIONS: III. Pedigreed populations and close inbreeding
Pedigreed populations
Regular systems of inbreeding

Metric characters
General survey of subject-matter




Population mean
Average effect
Breeding value
Dominance deviation
Interaction deviation

Genotypic and environmental variance
Genetic components of variance
Environmental variance

Genetic covariance
Environmental covariance
Phenotypic resemblance

Estimation of heritability
The precision of estimates of heritability
Identical twins

11 SELECTION: I. The response and its prediction
Response to selection
Measurement of response
Change of gene frequency under artificial selection

12 SELECTION: II. The results of experiments
Repeatability of response
Asymmetry of response
Long-term results of selection

13 SELECTION: III. Information from relatives
Methods of selection
Expected response
Relative merits of the methods

mean value
Inbreeding depression
Heterosis

I. Changes

Redistribution of genetic variance
Changes of environmental variance
Uniformity of experimental animals

tion of heterosis
Variance between crosses
Methods of selection for combining ability
Overdominance

Selection for threshold characters

Genetic and environmental correlations
Correlated response to selection
Genotype-environment interaction
Simultaneous selection for more than one character

Relation of metric characters to fitness
Maintenance of genetic variation
The genes concerned with quantitative variation
















Quantitative genetics is concerned with the inheritance of those differ- 
ences between individuals that are of degree rather than of kind, 
quantitative rather than qualitative. These are the individual differ- 
ences which, as Darwin wrote, "afford materials for natural selection 
to act on and accumulate, in the same manner as man accumulates in 
any given direction individual differences in his domestic produc- 
tions." An understanding of the inheritance of these differences is thus 
of fundamental significance in the study of evolution and in the appli- 
cation of genetics to animal and plant breeding; and it is from these 
two fields of enquiry that the subject has received the chief impetus to 
its growth. 

Virtually every organ and function of any species shows individual 
differences of this nature, the differences of size among ourselves or 
our domestic animals being an example familiar to all. Individuals 
form a continuously graded series from one extreme to the other and 
do not fall naturally into sharply demarcated types. Qualitative 
differences, in contrast, divide individuals into distinct types with 
little or no connexion by intermediates. Examples are the differ- 
ences between blue-eyed and brown-eyed individuals, between the 
blood groups, or between normally coloured and albino individuals. 
The distinction between quantitative and qualitative differences 
marks, in respect of the phenomena studied, the distinction between 
quantitative genetics and the parent stem of "Mendelian" genetics. 
In respect of the mechanism of inheritance the distinction is between 
differences caused by many or by few genes. The familiar Mendelian 
ratios, which display the fundamental mechanism of inheritance, can 
be seen only when a gene difference at a single locus gives rise to a 
readily detectable difference in some property of the organism. 
Quantitative differences, in so far as they are inherited, depend on gene 
differences at many loci, the effects of which are not individually dis- 
tinguishable. Consequently the Mendelian ratios are not exhibited 
by quantitative differences, and the methods of Mendelian analysis 
are inappropriate. 


It is, nevertheless, a basic premiss of quantitative genetics that the 
inheritance of quantitative differences depends on genes subject to 
the same laws of transmission and having the same general properties 
as the genes whose transmission and properties are displayed by
qualitative differences. Quantitative genetics is therefore an extension
of Mendelian genetics, resting squarely on Mendelian principles as its
foundation.

The methods of study in quantitative genetics differ from those 
employed in Mendelian genetics in two respects. In the first place, 
since ratios cannot be observed, single progenies are uninformative, 
and the unit of study must be extended to "populations," that is 
larger groups of individuals comprising many progenies. And, in the 
second place, the nature of the quantitative differences to be studied 
requires the measurement, and not just the classification, of the indi- 
viduals. The extension of Mendelian genetics into quantitative gene- 
tics may thus be made in two stages, the first introducing new con- 
cepts connected with the genetic properties of "populations" and the 
second introducing concepts connected with the inheritance of 
measurements. This is how the subject is presented in this book. In 
the first part, which occupies Chapters 1 to 5, the genetic properties of
populations are described by reference to genes causing easily identi- 
fiable, and therefore qualitative, differences. Quantitative differences 
are not discussed until the second part, which starts in Chapter 6. 
These two parts of the subject are often distinguished by different 
names, the first being referred to as "Population Genetics" and the 
second as "Biometrical Genetics" or "Quantitative Genetics." 
Some writers, however, use "Population Genetics" to refer to the 
whole. The terminology of this distinction is therefore ambiguous. 
The use of "Quantitative Genetics" to refer to the whole subject may 
be justified on the grounds that the genetics of populations is not just 
a preliminary to the genetics of quantitative differences, but an in- 
tegral part of it. 

The theoretical basis of quantitative genetics was established 
round about 1920 by the work of Fisher (1918), Haldane (1924-32,
summarised 1932) and Wright (1921). The development of the 
subject over the succeeding years, by these and many other gene- 
ticists and statisticians, has been mainly by elaboration, clarifica- 
tion, and the filling in of details, so that today we have a substantial 
body of theory accepted by the majority as valid. As in any healthily 
growing science, there are differences of opinion, but these are chiefly 


matters of emphasis, about the relative importance of this or that 

The theory consists of the deduction of the consequences of 
Mendelian inheritance when extended to the properties of popula- 
tions and to the simultaneous segregation of genes at many loci. The 
premiss from which the deductions are made is that the inheritance of 
quantitative differences is by means of genes, and that these genes 
are subject to the Mendelian laws of transmission and may have any 
of the properties known from Mendelian genetics. The property of 
"variable expression" assumes great importance and might be raised 
to the status of another premiss: that the expression of the genotype 
in the phenotype is modifiable by non-genetic causes. Other pro- 
perties whose consequences are to be taken into account include 
dominance, epistasis, pleiotropy, linkage, and mutation. 

These theoretical deductions enable us to state what will be the 
genetic properties of a population if the genes have the properties 
postulated, and to predict what will be the consequences of applying 
any specified plan of breeding. In principle we should then be able to 
make observations of the genetic properties of natural or experi- 
mental populations, and of the outcome of special breeding methods, 
and deduce from these observations what are the properties of the 
genes concerned. The experimental side of quantitative genetics, 
however, has lagged behind the theoretical in its development, and it 
is still some way from fulfilling this complementary function. The 
reason for this is the difficulty of devising diagnostic experiments 
which will unambiguously discriminate between the many possible 
situations envisaged by the theory. Consequently the experimental 
side has developed in a somewhat empirical manner, building general 
conclusions out of the experience of many particular cases. Never- 
theless there is now a sufficient body of experimental data to substan- 
tiate the theory in its main outlines; to allow a number of generalisa- 
tions to be made about the inheritance of quantitative differences; 
and to enable us to predict with some confidence the outcome of 
certain breeding methods. Discussion of all the difficulties would be 
inappropriate in an introductory treatment. The aim here is to 
describe all that is reasonably firmly established and, for the sake of 
clarity, to simplify as far as is possible without being misleading. 
Consequently the emphasis is on the theoretical side. Though con- 
clusions will often be drawn directly from experimental data, the 
experimental side of the subject is presented chiefly in the form of 


examples, chosen with the purpose of illustrating the theoretical 
conclusions. These examples, however, cannot always be taken as 
substantiating the postulates that underlie the conclusions they 
illustrate. Too often the results of experiments are open to more than 
one interpretation. 

No attempt has been made to give exhaustive references to pub- 
lished work in any part of the subject; or to indicate the origins, or 
trace the history, of the ideas. To have done this would have required 
a much longer book, and a considerable sacrifice of clarity. The chief 
sources, from which most of the material of the book is derived, are 
listed below. These sources are not regularly cited in the text. 
References are given in the text when any conclusion is stated without 
full explanation of its derivation. These references are not always to 
the original papers, but rather to the more recent papers where the 
reader will find a convenient point of entry to the topic under dis- 
cussion. References are also given to the sources of experimental 
data, but these, for reasons already explained, cover only a small part 
of the experimental side of the subject. In particular, a great deal 
more work has been done on plants and on farm animals than would 
appear from its representation among the experimental work cited. 

Chief Sources 

(For details see List of References) 

Fisher, R. A. (1930), The Genetical Theory of Natural Selection. 

Haldane, J. B. S. (1932), The Causes of Evolution. 

Kempthorne, O. (1957), An Introduction to Genetic Statistics. 

Lerner, I. M. (1950), Population Genetics and Animal Improvement. 

Li, C. C. (1955), Population Genetics. 

Lush, J. L. (1945), Animal Breeding Plans.

Malécot, G. (1948), Les Mathématiques de l'Hérédité.

Mather, K. (1949), Biometrical Genetics. 

Wright, S. (1921), Systems of Mating. Genetics 6: 111-178. 

(1931), Evolution in Mendelian Populations. Genetics 16: 





Frequencies of Genes and Genotypes 

To describe the genetic constitution of a group of individuals we 
should have to specify their genotypes and say how many of each geno- 
type there were. This would be a complete description, provided the 
nature of the phenotypic differences between the genotypes did not 
concern us. Suppose for simplicity that we were concerned with a 
certain autosomal locus, A, and that two different alleles at this locus,
$A_1$ and $A_2$, were present among the individuals. Then there would be
three possible genotypes, $A_1A_1$, $A_1A_2$, and $A_2A_2$. (We are concerned
here, as throughout the book, exclusively with diploid organisms.) 
The genetic constitution of the group would be fully described by 
the proportion, or percentage, of individuals that belonged to each 
genotype, or in other words by the frequencies of the three genotypes 
among the individuals. These proportions or frequencies are called 
genotype frequencies, the frequency of a particular genotype being its 
proportion or percentage among the individuals. If, for example, we 
found one quarter of the individuals in the group to be $A_1A_1$, the
frequency of this genotype would be 0.25, or 25 per cent. Naturally
the frequencies of all the genotypes together must add up to unity, or
100 per cent.

Example 1.1. The M-N blood groups in man are determined by two
alleles at a locus, and the three genotypes correspond with the three blood 
groups, M, MN, and N. The following figures, taken from the tabulation 
of Mourant (1954), show the blood group frequencies among Eskimoes 
of East Greenland and among Icelanders as follows: 


                 Blood group (frequency, %)
                 M        MN       N
Greenland        83.5     15.6     0.9
Iceland          31.2     51.5     17.3



Clearly the two populations differ in these genotype frequencies, the N 
blood group being rare in Greenland and relatively common in Iceland. 
Not only is this locus a source of variation within each of the two popula- 
tions, but it is also a source of genetic difference between the populations. 

A population, in the genetic sense, is not just a group of individuals, 
but a breeding group; and the genetics of a population is concerned 
not only with the genetic constitution of the individuals but also with 
the transmission of the genes from one generation to the next. In the 
transmission the genotypes of the parents are broken down and a new 
set of genotypes is constituted in the progeny, from the genes trans- 
mitted in the gametes. The genes carried by the population thus have 
continuity from generation to generation, but the genotypes in which 
they appear do not. The genetic constitution of a population, refer- 
ring to the genes it carries, is described by the array of gene frequencies; 
that is by specification of the alleles present at every locus and the 
numbers or proportions of the different alleles at each locus. If, 
for example, $A_1$ is an allele at the A locus, then the frequency of $A_1$
genes, or the gene frequency of $A_1$, is the proportion or percentage
of all genes at this locus that are the $A_1$ allele. The frequencies
of all the alleles at any one locus must add up to unity, or 100 per
cent.

The gene frequencies at a particular locus among a group of 
individuals can be determined from a knowledge of the genotype 
frequencies. To take a hypothetical example, suppose there are two 
alleles, $A_1$ and $A_2$, and we classify 100 individuals and count the
numbers in each genotype as follows: 

                            $A_1A_1$   $A_1A_2$   $A_2A_2$   Total
Number of individuals          30         60         10       100
Number of genes   $A_1$        60         60          0       120 }
                  $A_2$         0         60         20        80 } 200

Each individual contains two genes, so we have counted 200 representatives
of the genes at this locus. Each $A_1A_1$ individual contains
two $A_1$ genes and each $A_1A_2$ contains one $A_1$ gene. So there are 120 $A_1$
genes in the sample, and 80 $A_2$ genes. The frequency of $A_1$ is therefore
60 per cent or 0.6, and the frequency of $A_2$ is 40 per cent or 0.4.
To express the relationship in a more general form, let the frequencies 
of genes and of genotypes be as follows: 

Chap. I] 




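                    Genes                    Genotypes
               $A_1$    $A_2$      $A_1A_1$   $A_1A_2$   $A_2A_2$
Frequencies     $p$      $q$         $P$        $H$        $Q$

so that $p + q = 1$, and $P + H + Q = 1$. Since each individual contains
two genes, the frequency of $A_1$ genes is $\tfrac{1}{2}(2P + H)$, and the
relationship between gene frequency and genotype frequency among the
individuals counted is as follows:

$$p = P + \tfrac{1}{2}H, \qquad q = Q + \tfrac{1}{2}H \qquad (1.1)$$
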
Example 1.2. To illustrate the calculation of gene frequencies from
genotype frequencies we may take the M-N blood group frequencies given
in Example 1.1. The M and N blood groups represent the two homozygous
genotypes and the MN group the heterozygote. The frequency of the M
gene in Greenland is, from equation 1.1, $0.835 + \tfrac{1}{2}(0.156) = 0.913$, and the
frequency of the N gene is $0.009 + \tfrac{1}{2}(0.156) = 0.087$, the sum of the
frequencies being 1.000 as it should be. Doing the same for the Iceland
sample we find the following gene frequencies in the two populations,
expressed now as percentages:









                 Gene (frequency, %)
                 M        N
Greenland        91.3     8.7
Iceland          57.0     43.0

Thus the two populations differ in gene frequency as well as in genotype
frequencies.
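
The calculation in Examples 1.1 and 1.2 can be sketched in a few lines of Python. This is only an illustration of equation 1.1; the function name `gene_frequencies` is hypothetical, and the inputs are the blood group percentages quoted in Example 1.1 and the hypothetical counts of 30, 60 and 10 individuals used earlier.

```python
def gene_frequencies(P, H, Q):
    """Return (p, q) from the frequencies of A1A1 (P), A1A2 (H) and A2A2 (Q).

    Counts, percentages or proportions are all accepted, because the
    three values are first rescaled so that they add up to unity.
    """
    total = P + H + Q
    P, H, Q = P / total, H / total, Q / total
    p = P + 0.5 * H  # equation 1.1: each heterozygote carries one A1 gene
    q = Q + 0.5 * H
    return p, q


# Hypothetical sample of 100 individuals from the text.
print(gene_frequencies(30, 60, 10))

# M-N blood group frequencies (%) from Example 1.1.
samples = {
    "Greenland": (83.5, 15.6, 0.9),
    "Iceland": (31.2, 51.5, 17.3),
}
for name, (m, mn, n) in samples.items():
    p, q = gene_frequencies(m, mn, n)
    print(f"{name}: M = {100 * p:.1f}%, N = {100 * q:.1f}%")
```

Rescaling inside the function is a convenience of this sketch, not part of equation 1.1 itself; it simply lets the same function serve for counts and for percentages.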

The genetic properties of a population are influenced in the process
of transmission of genes from one generation to the next by a
number of agencies. These form the chief subject-matter of the next 
four chapters, but we may briefly review them here in order to have 
some idea of what factors are being left out of consideration in this 
chapter. The agencies through which the genetic properties of a 
population may be changed are these: 

Population size. The genes passed from one generation to the 
next are a sample of the genes in the parent generation. Therefore 
the gene frequencies are subject to sampling variation between suc- 
cessive generations, and the smaller the number of parents the greater 
is the sampling variation. The effects of sampling variation will be 
considered in Chapters 3-5, and meantime we shall exclude it from 



the discussion by supposing always that we are dealing with a "large
population," which means simply one in which sampling variation is 
so small as to be negligible. For practical purposes a "large popula- 
tion" is one in which the number of adult individuals is in the hundreds 
rather than in the tens. 
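
The dependence of sampling variation on the number of parents can be illustrated by a small simulation. The sketch below rests on an assumption that is not made in this chapter: that the 2N genes passed on by N parents are drawn independently and at random from a gene pool in which the frequency of one allele is p. The function names, the seed and the number of replicates are all illustrative choices.

```python
import random
import statistics


def sample_gene_frequency(p, n_parents, rng):
    """Gene frequency among 2N genes drawn at random from a pool with frequency p."""
    n_genes = 2 * n_parents
    count = sum(1 for _ in range(n_genes) if rng.random() < p)
    return count / n_genes


def sampling_variance(p, n_parents, replicates=1000, seed=1):
    """Variance of the sampled gene frequency over many independent samples."""
    rng = random.Random(seed)  # fixed seed so that the run is repeatable
    freqs = [sample_gene_frequency(p, n_parents, rng) for _ in range(replicates)]
    return statistics.pvariance(freqs)


for n in (10, 100, 500):
    print(f"N = {n:3d}: variance of gene frequency = {sampling_variance(0.5, n):.5f}")
```

Under these assumptions the variance falls roughly in proportion to 1/N, which is why sampling variation can be neglected once the number of adults runs into the hundreds.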

Differences of fertility and viability. Though we are not at 
present concerned with the phenotypic effects of the genes under dis- 
cussion, we cannot ignore their effects on fertility and viability, be- 
cause these influence the genetic constitution of the succeeding 
generation. The different genotypes among the parents may have 
different fertilities, and if they do they will contribute unequally to 
the gametes out of which the next generation is formed. In this way 
the gene frequency may be changed in the transmission. Further, 
the genotypes among the newly formed zygotes may have different 
survival rates, and so the gene frequencies in the new generation may 
be changed by the time the individuals are adult and themselves 
become parents. These processes are called selection, and will be 
described in Chapter 2. Meanwhile we shall suppose they are not 
operating. It is difficult to find examples of genes not subject to 
selection. For the purpose of illustration, however, we may take the 
human blood-group genes since the selective forces acting on these 
are probably not very strong. Genes that produce a mutant pheno- 
type which is abnormal in comparison with the wild-type are, in 
contrast, usually subject to much more severe selection. 

Migration and mutation. The gene frequencies in the popula- 
tion may also be changed by immigration of individuals from another 
population, and by gene mutation. These processes will be described 
in Chapter 2, and at this stage will also be supposed not to operate. 

Mating system. The genotypes in the progeny are determined 
by the union of the gametes in pairs to form zygotes, and the union of 
gametes is influenced by the mating of the parents. So the genotype 
frequencies in the offspring generation are influenced by the geno- 
types of the pairs that mate in the parent generation. We shall at 
first suppose that mating is at random with respect to the genotypes 
under discussion. Random mating, or panmixia, means that any 
individual has an equal chance of mating with any other individual in 
the population. The important points are that there should be no 
special tendency for mated individuals to be alike in genotype, or to 
be related to each other by ancestry. If a population covers a large 
geographic area individuals inhabiting the same locality are more 

likely to mate than individuals inhabiting different localities, and so 
the mated pairs tend to be related by ancestry. A widely spread 
population is therefore likely to be subdivided into local groups and 
mating is random only within the groups. The properties of sub- 
divided populations depend on the size of the local groups, and will 
be described under the effects of population size in Chapters 3-5. 

Hardy-Weinberg Equilibrium 

In a large random-mating population both gene frequencies and
genotype frequencies are constant from generation to generation, in
the absence of migration, mutation and selection; and the genotype
frequencies are determined by the gene frequencies. These properties
of a population were first demonstrated by Hardy and by Weinberg
independently in 1908, and are generally known as the Hardy-Weinberg
Law. (See Stern, 1943, where a translation of the relevant
part of Weinberg's paper will be found.) Such a population is said 
to be in Hardy-Weinberg equilibrium. Deduction of the Hardy- 
Weinberg Law involves three steps: (1) from the parents to the 
gametes they produce; (2) from the union of the gametes to the geno- 
types in the zygotes produced; and (3) from the genotypes of the 
zygotes to the gene frequency in the progeny generation. These steps, 
in detail, are as follows: 

1. Let the parent generation have gene and genotype frequencies
as follows:

                    Genes                    Genotypes
               $A_1$    $A_2$      $A_1A_1$   $A_1A_2$   $A_2A_2$
Frequencies     $p$      $q$         $P$        $H$        $Q$

Two sorts of gametes are produced, those bearing $A_1$ and those bearing
$A_2$. The frequencies of these gametic types are the same as the
gene frequencies, $p$ and $q$, in the generation producing them, for this
reason: $A_1A_1$ individuals produce only $A_1$ gametes, and $A_1A_2$ individuals
produce equal numbers of $A_1$ and $A_2$ gametes (provided, of
course, there is no anomaly of segregation). So the frequency of $A_1$
gametes produced by the whole population is $P + \tfrac{1}{2}H$, which by
equation 1.1 is the gene frequency of $A_1$.

2. Random mating between individuals is equivalent to random 
union among their gametes. We can think of a pool of gametes to 
which all the individuals contribute equally; zygotes are formed by 



random union between pairs of gametes from the pool. The genotype 
frequencies among the zygotes are then the products of the frequencies 
of the gametic types that unite to produce them. The genotype 
frequencies among the progeny produced by random mating can 
therefore be determined simply by multiplying the frequencies of the 
gametic types as shown in the following table: 

                                    Female gametes and their frequencies
                                    $A_1$ ($p$)            $A_2$ ($q$)
Male gametes and      $A_1$ ($p$)   $A_1A_1$: $p^2$        $A_1A_2$: $pq$
their frequencies     $A_2$ ($q$)   $A_1A_2$: $pq$         $A_2A_2$: $q^2$


We need not distinguish the union of $A_1$ eggs with $A_2$ sperms from
that of $A_2$ eggs with $A_1$ sperms; so the genotype frequencies of the
zygotes are

Genotype:     $A_1A_1$    $A_1A_2$    $A_2A_2$
Frequency:    $p^2$       $2pq$       $q^2$         (1.2)


Note that these genotype frequencies depend only on the gene fre- 
quency in the parents, and not on the parental genotype frequencies, 
provided the parents mate at random. 

3. Finally we use these genotype frequencies to determine the 
gene frequency in the offspring generation. Applying equation 1.1
we find the gene frequency of $A_1$ is $p^2 + \tfrac{1}{2}(2pq) = p(p + q) = p$, which is
the same as in the parent generation.

The properties of a population with respect to a single locus, expressed
in the Hardy-Weinberg law and demonstrated above, are:

(1) A large random-mating population, in the absence of migration,
mutation, and selection, is stable with respect to both gene and
genotype frequencies: there is no inherent tendency for its genetic 
properties to change from generation to generation. 

(2) The genotype frequencies in the progeny produced by random 
mating among the parents are determined solely by the gene fre- 
quencies among the parents. Consequently: 




(a) a population in Hardy-Weinberg equilibrium has the relationship
expressed in equation 1.2 between the gene and
genotype frequencies in any one generation. And,

(b) these Hardy-Weinberg genotype frequencies are established
by one generation of random mating, irrespective of the
genotype frequencies among the parents.
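
The two properties just listed can be checked numerically. The following sketch applies the three steps of the deduction (parents to gametes, gametes to zygotes, zygotes to gene frequency) to two hypothetical parent populations that share the gene frequency p = 0.6 but differ widely in genotype frequencies. The function name `random_mating` and the example frequencies are assumptions made for illustration.

```python
def random_mating(P, H, Q):
    """One generation of random mating in a large population.

    P, H and Q are the parental frequencies of A1A1, A1A2 and A2A2.
    Migration, mutation and selection are assumed to be absent.
    Returns the genotype frequencies among the zygotes.
    """
    p = P + 0.5 * H  # frequency of A1 gametes (equation 1.1)
    q = Q + 0.5 * H  # frequency of A2 gametes
    return p * p, 2 * p * q, q * q  # equation 1.2


# Same gene frequency (p = 0.6), very different genotype frequencies.
for parents in [(0.6, 0.0, 0.4), (0.3, 0.6, 0.1)]:
    offspring = random_mating(*parents)
    grand_offspring = random_mating(*offspring)
    print("parents        ", parents)
    print("offspring      ", tuple(round(x, 4) for x in offspring))
    print("grand-offspring", tuple(round(x, 4) for x in grand_offspring))
```

Both parent populations give the same offspring frequencies after a single generation, and a second generation of random mating leaves them unchanged, apart from floating-point rounding.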



Fig. 1.1. Relationship between genotype frequencies and gene
frequency for two alleles in a population in Hardy-Weinberg
equilibrium.

We shall later give another proof of the Hardy-Weinberg law by
a different method. Let us now first illustrate the properties of a
population in Hardy-Weinberg equilibrium, and then show to what
uses these properties can be put. The relationship between gene
frequency and genotype frequencies expressed in equation 1.2 is
illustrated graphically in Fig. 1.1, which shows how the frequencies
of the three genotypes for a locus with two alleles depend on the gene
frequency. As an example of the Hardy-Weinberg genotype frequencies
we shall take again the M-N blood groups in man.

Example 1.3. Race and Sanger (1954) quote the following frequencies 
(%) of the M-N blood groups in a sample of 1,279 English people. From 
the observed genotype (i.e. blood group) frequencies we can calculate the 
gene frequencies by equation 1.1. These gene frequencies are shown on 
the right. 

                  Blood group                    Gene
              M        MN       N            M         N
Observed    28.38    49.57    22.05       53.165    46.835
Expected    28.265   49.800   21.935

Now from the gene frequencies we can calculate the expected
Hardy-Weinberg genotype frequencies by equation 1.2, and we find that the
observed frequencies agree very closely with those expected for a population
in Hardy-Weinberg equilibrium.
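
The comparison made in Example 1.3 can be reproduced as follows. This is a minimal sketch: the function name `hardy_weinberg_expected` is hypothetical, and the observed percentages are those quoted from Race and Sanger in the example. It computes only the expected frequencies and does not attempt a significance test.

```python
def hardy_weinberg_expected(P, H, Q):
    """Gene frequencies and expected Hardy-Weinberg genotype frequencies.

    P, H and Q are the observed frequencies (or counts) of the two
    homozygotes and the heterozygote, in the order A1A1, A1A2, A2A2.
    """
    total = P + H + Q
    p = (P + 0.5 * H) / total  # equation 1.1
    q = (Q + 0.5 * H) / total
    expected = (p * p, 2 * p * q, q * q)  # equation 1.2
    return p, q, expected


observed = (28.38, 49.57, 22.05)  # M, MN, N blood groups, per cent
p, q, expected = hardy_weinberg_expected(*observed)

print(f"gene frequencies (%): M = {100 * p:.3f}, N = {100 * q:.3f}")
for group, obs, exp in zip(("M", "MN", "N"), observed, expected):
    print(f"{group:2s}  observed {obs:6.2f}  expected {100 * exp:7.3f}")
```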

Comparison of observed with expected genotype frequencies may
be regarded as a test of the fulfilment of the conditions on which the
Hardy-Weinberg equilibrium depends. These conditions are:
random mating among the parents of the individuals observed, equal
fertility of the different genotypes among the parents, and equal
viability of the different genotypes among the offspring from fertilisation
up to the time of observation. In addition, the classification of
individuals as to genotype must have been correctly made. The
blood group frequencies in Example 1.3 give no cause to doubt the
fulfilment of these conditions. It should be noted, however, that a
difference of fertility or of viability between the genotypes, though it
can be detected, cannot be measured from a comparison of observed
with expected frequencies (Wallace, 1958). The expected frequencies
are based on the observed gene frequencies after the differences of
fertility or viability have had their effect. In order to measure these effects
we should have to know the original gene or genotype frequencies.

At the beginning of the chapter we saw, in equation 1.1, how the
gene frequencies among a group of individuals can be determined 
from their genotype frequencies; but for this it was necessary to know 
the frequencies of all three genotypes. Consequently the relationship 
in equation 1.1 cannot be applied to the case of a recessive allele, 


Chap. I] 




when the heterozygote is indistinguishable from the dominant homo- 
zygote. Consideration of the population as a breeding unit, however, 
shows that when the conditions for Hardy- Weinberg equilibrium 
hold, only the frequency of one of the homozygous genotypes is 
needed to determine the gene frequency, and the difficulty of recessive 
genes is thus overcome. Let A 2 , for example, be a recessive gene 
with frequency q; then the frequency of A 2 A 2 homozygotes is q 2 . In 
other words the gene frequency is the square root of the homozygote 
frequency. Thus we can determine the gene frequency of recessive 
abnormalities, provided that selective mortality of the homozygote 
can be discounted or allowed for. But we can go further, and this is 
often the more important point: we can also determine the frequency 
of heterozygotes, or "carriers," of recessive abnormalities, which is f 
2q(i -q). It comes as a surprise to most people to discover how com-C- J^ 
mem heterozygotes of a rare recessive abnormality are. 


Example 1.4. Albinism in man is probably determined by a single
recessive autosomal gene, and the frequency of albinos is about 1/20,000
in human populations (see Stern, 1949). If $q$ is the frequency of the albino
gene, then $q^2 = 1/20{,}000$, and $q = 1/141$, if selective mortality is disregarded.
The frequency of heterozygotes is then $2q(1 - q)$, which works out to about
1/70. So about one person in seventy is a heterozygote for albinism,
though only one in twenty thousand is a homozygote.
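
The calculation in this example is short enough to check directly. The sketch below assumes Hardy-Weinberg equilibrium and no selective mortality, as the example does; the function name `recessive_gene_and_carrier_frequency` is a hypothetical label introduced only for illustration.

```python
import math


def recessive_gene_and_carrier_frequency(homozygote_frequency):
    """Return (q, 2q(1 - q)) given the frequency q^2 of recessive homozygotes."""
    q = math.sqrt(homozygote_frequency)
    carriers = 2.0 * q * (1.0 - q)
    return q, carriers


# Albinism: about one albino in 20,000 people.
q_albino, carriers_albino = recessive_gene_and_carrier_frequency(1 / 20000)
print(f"gene frequency    about 1 in {1 / q_albino:.0f}")
print(f"carrier frequency about 1 in {1 / carriers_albino:.0f}")
```

The same function applies to any recessive abnormality for which the homozygote frequency is known and the equilibrium assumptions are acceptable.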

Example 1.5. There is a recessive autosomal gene in the Ayrshire 
breed of cattle in Britain which causes dropsy in the new-born calf. The 
frequency of this abnormality is about 1 in 300 births (Donald, Deas, and 
Wilson, 1952). A means of reducing the frequency of the defect would 
obviously be the avoidance of the use of bulls known or thought to be 
heterozygous. We might first want to know what proportion of bulls 
would be expected to be heterozygotes. In this case the conditions for 
Hardy-Weinberg equilibrium are certainly not all fulfilled: the breed is not 
a single random-breeding population, and the abnormal homozygotes are 
not fully viable up to the time of birth. So we can only get a rough idea of 
the frequency of heterozygotes by assuming the observations to refer to a 
population in Hardy-Weinberg equilibrium. On this assumption, 
$q^2 = 0.0033$, so $q = 0.057$; the frequency of heterozygotes is $2q(1 - q) = 0.11$.
So we should expect, very approximately, one bull in ten to be a heterozygote.

Mating frequencies and another proof of the Hardy-Weinberg
law. Let us now look more closely into the breeding
structure of a random-mating population, distinguishing the types of
mating according to the genotypes of the pairs, and seeing what are
the genotype frequencies among the progenies of the different types
of mating. This provides a general method for relating genotype
frequencies in successive generations, which we shall use in a later
chapter. It also provides another proof of the Hardy-Weinberg law;
a proof more cumbersome than that already given but showing more
clearly how the Hardy-Weinberg frequencies arise from the
Mendelian laws of segregation. The procedure is to obtain first the
frequencies of all possible mating types according to the frequencies
of the genotypes among the parents, and then to obtain the
frequencies of genotypes among the progeny of each type of mating
according to the Mendelian ratios.

Consider a locus with two alleles, and let the frequencies of genes 
and genotypes in the parents be, as before, 

                     Genes                 Genotypes
                 $A_1$    $A_2$       $A_1A_1$   $A_1A_2$   $A_2A_2$

Frequencies       $p$      $q$          $P$        $H$        $Q$

There are altogether nine types of mating, and their frequencies 
when mating is random are found thus: 

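                                   Genotype and frequency of female parent
                                   $A_1A_1$     $A_1A_2$     $A_2A_2$
                                     $P$          $H$          $Q$

Genotype and     $A_1A_1$   $P$     $P^2$        $PH$         $PQ$
frequency of     $A_1A_2$   $H$     $PH$         $H^2$        $HQ$
male parent      $A_2A_2$   $Q$     $PQ$         $HQ$         $Q^2$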

Since the sex of the parent is irrelevant in this context, some of the
types of mating are equivalent, and the number of different types
reduces to six. By summation of the frequencies of equivalent types,
we obtain the frequencies of mating types in the first two columns of
Table 1.1. Now we have to consider the genotypes of offspring
produced by each type of mating, and find the frequency of each
genotype in the total progeny, assuming, of course, that all types of mating
are equally fertile and all genotypes equally viable. This is done in
the right hand side of Table 1.1. Thus, for example, matings of the
type $A_1A_1 \times A_1A_1$ produce only $A_1A_1$ offspring. So, of all the $A_1A_1$
genotypes in the total progeny, a proportion $P^2$ come from this type
of mating. Similarly a quarter of the offspring of $A_1A_2 \times A_1A_2$
matings are $A_1A_1$. So this type of mating, which has a frequency of
$H^2$, contributes a proportion $\tfrac{1}{4}H^2$ of the total $A_1A_1$ progeny. To find
the frequency of each genotype in the total progeny we add the

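Table 1.1

         Mating                       Genotype and frequency of progeny
Type                     Frequency    $A_1A_1$              $A_1A_2$                                   $A_2A_2$

$A_1A_1 \times A_1A_1$   $P^2$        $P^2$                 -                                          -
$A_1A_1 \times A_1A_2$   $2PH$        $PH$                  $PH$                                       -
$A_1A_1 \times A_2A_2$   $2PQ$        -                     $2PQ$                                      -
$A_1A_2 \times A_1A_2$   $H^2$        $\tfrac{1}{4}H^2$     $\tfrac{1}{2}H^2$                          $\tfrac{1}{4}H^2$
$A_1A_2 \times A_2A_2$   $2HQ$        -                     $HQ$                                       $HQ$
$A_2A_2 \times A_2A_2$   $Q^2$        -                     -                                          $Q^2$

Sums                                  $(P + \tfrac{1}{2}H)^2$   $2(P + \tfrac{1}{2}H)(Q + \tfrac{1}{2}H)$   $(Q + \tfrac{1}{2}H)^2$
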
frequencies contributed by each type of mating. The sums, after
simplification, are given at the foot of the table, and from the identity
given in equation 1.1 they are seen to be equal to $p^2$, $2pq$, and $q^2$.
These are the Hardy-Weinberg equilibrium frequencies, and we
have shown that they are attained by one generation of random mating,
irrespective of the genotype frequencies among the parents.
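
The mating-type argument can be checked mechanically. The sketch below is only an illustration of the procedure described above, not a reproduction of Table 1.1 itself: it enumerates all nine ordered matings, weights each by the product of the parental genotype frequencies, and distributes the offspring in Mendelian ratios. The names `A1_GAMETE` and `progeny_frequencies` are assumptions made for the example, and exact fractions are used so that the comparison with $p^2$, $2pq$, and $q^2$ is free of rounding.

```python
from fractions import Fraction
from itertools import product

# Probability that a parent of each genotype transmits an A1 gamete.
A1_GAMETE = {"A1A1": Fraction(1), "A1A2": Fraction(1, 2), "A2A2": Fraction(0)}


def progeny_frequencies(P, H, Q):
    """Genotype frequencies among the progeny of random mating.

    P, H, Q are the parental frequencies of A1A1, A1A2, A2A2.
    All nine ordered matings (female x male) are summed.
    """
    parents = {"A1A1": P, "A1A2": H, "A2A2": Q}
    progeny = {"A1A1": Fraction(0), "A1A2": Fraction(0), "A2A2": Fraction(0)}
    for (g_f, freq_f), (g_m, freq_m) in product(parents.items(), repeat=2):
        mating = freq_f * freq_m          # frequency of this type of mating
        a, b = A1_GAMETE[g_f], A1_GAMETE[g_m]
        progeny["A1A1"] += mating * a * b
        progeny["A1A2"] += mating * (a * (1 - b) + (1 - a) * b)
        progeny["A2A2"] += mating * (1 - a) * (1 - b)
    return progeny


# Parents far from Hardy-Weinberg proportions: no heterozygotes at all.
P, H, Q = Fraction(1, 2), Fraction(0), Fraction(1, 2)
p = P + H / 2
print(progeny_frequencies(P, H, Q), p * p, 2 * p * (1 - p), (1 - p) ** 2)
```

Whatever values of P, H, and Q are supplied, the progeny frequencies depend only on $p = P + \tfrac{1}{2}H$, which is the point of the proof.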

Multiple alleles. Restriction of the treatment to two alleles at a 
locus suffices for many purposes. If we are interested in one 
particular allele, as often happens, then all the other alleles at the 
locus can be treated as one. Formulation of the situation in terms of 
two alleles is therefore often possible even if there are in fact more 
than two. If we are interested in more than one allele we can still, if 
we like, treat the situation as a two-allele system by considering each 
allele in turn and lumping the others together. But the treatment can 
be easily extended to cover more than two alleles, and no new
principle is introduced. In general, if $q_1$ and $q_2$ are the frequencies of any
two alleles, $A_1$ and $A_2$, of a multiple series, then the genotype
frequencies under Hardy-Weinberg equilibrium are as follows (Li,

Genotype:    $A_1A_1$    $A_1A_2$     $A_2A_2$
Frequency:   $q_1^2$     $2q_1q_2$    $q_2^2$


These frequencies are also attained by one generation of random 
mating. This can readily be seen by reducing the situation to a two- 
allele system, and considering each allele in turn. Or it can be 
proved, though somewhat more laboriously, by the method explained 
above for the two-allele system. 

Example 1.6. The ABO blood groups in man are determined by a
series of allelic genes. For the purpose of illustration we shall recognise
three alleles, A, B, and O, and show how the gene frequencies can be
estimated from the blood group frequencies. Let the frequencies of the
A, B, and O genes be $p$, $q$, and $r$ respectively, so that $p + q + r = 1$. The
following table shows (1) the genotypes, (2) the blood groups (i.e.
phenotypes) corresponding to the different genotypes, (3) the expected
frequencies of the blood groups in terms of $p$, $q$, and $r$, on the assumption of
Hardy-Weinberg equilibrium, (4) observed frequencies of blood groups in
a sample of 190,177 United Kingdom airmen, quoted by Race and Sanger

Genotype           AA   AO         BB   BO         OO         AB

Blood group           A               B             O         AB

Frequency (%)

   expected      $p^2 + 2pr$     $q^2 + 2qr$      $r^2$      $2pq$

   observed        41.716           8.560         46.684      3.040

Calculation of the gene frequencies is rather more complicated than with
two alleles. The following is the simplest method: a more refined method
is described by Ceppellini et al. (1955). First the frequency of the O gene
is simply the square root of the frequency of the O group. Next it will be
seen that the sum of the frequencies of the B and O groups is
$q^2 + 2qr + r^2 = (q + r)^2 = (1 - p)^2$. So $p = 1 - \sqrt{B + O}$, where B and O
are the frequencies of the blood groups B and O. In the same way
$q = 1 - \sqrt{A + O}$, and we have seen that $r = \sqrt{O}$. This method gives
the following gene frequencies in the sample:

A gene:   $p = 0.2567$
B gene:   $q = 0.0598$
O gene:   $r = 0.6833$

Total         0.9998

As a result of sampling errors these frequencies do not add up exactly to
unity, but we shall not trouble to make an adjustment for so small a
discrepancy. We may now calculate the expected frequency of the AB blood
group, which has not been used in arriving at these gene frequencies, and
see whether the observed frequency agrees satisfactorily. The expected
frequency of AB from estimates of $p$ and $q$ is 3.070 per cent, which is in
good agreement with the observed frequency of 3.040 per cent. ($\chi^2 = 0.7$,
with 1 d.f., calculated by the method given by Race and Sanger.)
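
The simple square-root method of this example can be written out as follows. This is a sketch of that method only (not the more refined one the example mentions); the function name `abo_gene_frequencies` is an illustrative assumption, and the blood group frequencies are entered as proportions rather than percentages.

```python
import math


def abo_gene_frequencies(A, B, O):
    """Estimate (p, q, r) for the A, B, O genes from blood group frequencies.

    Uses r = sqrt(O), p = 1 - sqrt(B + O), q = 1 - sqrt(A + O),
    which assumes Hardy-Weinberg equilibrium.
    """
    r = math.sqrt(O)
    p = 1.0 - math.sqrt(B + O)
    q = 1.0 - math.sqrt(A + O)
    return p, q, r


# Observed proportions of groups A, B, and O in the sample of airmen.
p, q, r = abo_gene_frequencies(0.41716, 0.08560, 0.46684)
expected_AB_percent = 100.0 * 2.0 * p * q

print(f"p = {p:.4f}, q = {q:.4f}, r = {r:.4f}, total = {p + q + r:.4f}")
print(f"expected AB = {expected_AB_percent:.3f} per cent")
```

Because the AB group is not used in the estimation, comparing `expected_AB_percent` with the observed AB frequency serves as an independent check, as in the example.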

Sex-linked genes. With sex-linked genes the situation is rather 
more complex than with autosomal genes. The relationship between 
gene frequency and genotype frequency in the homogametic sex is 
the same as with an autosomal gene, but the heterogametic sex has 
only two genotypes and each individual carries only one gene instead 
of two. For this reason two-thirds of the sex-linked genes in the
population are carried by the homogametic sex and one-third by the
heterogametic. For the sake of brevity we shall now refer to the
heterogametic sex as male. Consider two alleles, $A_1$ and $A_2$, with
frequencies $p$ and $q$, and let the genotypic frequencies be as follows:


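                         Females                        Males
Genotype      $A_1A_1$   $A_1A_2$   $A_2A_2$       $A_1$    $A_2$
Frequency       $P$        $H$        $Q$           $R$      $S$

The frequency of $A_1$ among the females is then $p_f = P + \tfrac{1}{2}H$, and the
frequency among the males is $p_m = R$. The frequency of $A_1$ in the
whole population is

$$p = \tfrac{1}{3}(2p_f + p_m) = \tfrac{1}{3}(2P + H + R)$$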



Now, if the gene frequencies among males and among females are
different, the population is not in equilibrium. The gene frequency
in the population as a whole does not change, but its distribution
between the two sexes oscillates as the population approaches
equilibrium. The reason for this can be seen from the following
considerations. Males get their sex-linked genes only from their
mothers; therefore $p_m$ is equal to $p_f$ in the previous generation.
Females get their sex-linked genes equally from both parents;
therefore $p_f$ is equal to the mean of $p_m$ and $p_f$ in the previous generation.
Using primes to indicate the previous generation, we have

$$p_m = p'_f$$

$$p_f = \tfrac{1}{2}(p'_m + p'_f)$$

The difference between the frequencies in the two sexes is

$$p_f - p_m = \tfrac{1}{2}(p'_m + p'_f) - p'_f = -\tfrac{1}{2}(p'_f - p'_m)$$

i.e. half the difference in the previous generation, but in the other
direction. Therefore the distribution of the genes between the two
sexes oscillates, but the difference is halved in successive generations
and the population rapidly approaches an equilibrium in which the
frequencies in the two sexes are equal. The situation is illustrated
in Fig. 1.2, which shows the consequences of mixing females of one
sort (all $A_1A_1$) with males of another sort (all $A_2$) and letting them
breed at random.

Fig. 1.2. Approach to equilibrium under random mating for a
sex-linked gene, showing the gene frequency among females,
among males, and in the two sexes combined. The population
starts with females all of one sort ($q_f = 1$), and males all of the
other sort ($q_m = 0$).
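
The oscillating approach to equilibrium described above can be traced numerically. The sketch below simply iterates the two relations stated in the text (males take the previous female frequency; females take the mean of the two previous frequencies). The function name `sex_linked_approach` and the choice of six generations are assumptions made for illustration.

```python
def sex_linked_approach(freq_f, freq_m, generations):
    """Gene frequency in females and males over successive generations.

    Returns a list of (female, male) frequencies, starting with the
    initial values, for a sex-linked gene under random mating.
    """
    history = [(freq_f, freq_m)]
    for _ in range(generations):
        # Both right-hand sides use the previous generation's values.
        freq_f, freq_m = 0.5 * (freq_m + freq_f), freq_f
        history.append((freq_f, freq_m))
    return history


# Females all of one sort, males all of the other, as in the figure.
for t, (f, m) in enumerate(sex_linked_approach(1.0, 0.0, 6)):
    overall = (2.0 * f + m) / 3.0   # females carry two-thirds of the genes
    print(f"generation {t}: females {f:.4f}  males {m:.4f}  "
          f"difference {f - m:+.4f}  combined {overall:.4f}")
```

The printed difference changes sign and halves each generation, while the combined frequency stays constant at two-thirds.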

Example 1.7. Searle (1949) gives the frequencies of a number of
genes in a sample of cats in London. The animals examined were sent to
clinics for destruction; they were therefore not necessarily a random
sample. Among the genes studied was "yellow" ($y$) which is sex-linked
and for which all three genotypes in females are recognisable, the
heterozygote being tortoise-shell. The data were used to test for agreement with
Hardy-Weinberg equilibrium. The numbers observed in each phenotypic
class are shown in table (i). We may first see whether the gene frequency



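(i)                        Females
                       ++       +y       yy

Numbers observed      277       54        7
Numbers expected    269.6     64.5      3.9

[Table (i), males, and table (ii), number of genes counted and frequency ($q$) of $y$ in females and in males: only the count 311 survives.]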







is equal in the two sexes. The numbers of genes counted, and the
frequency ($q$) of the gene $y$, in each sex are as given in table (ii). The
$\chi^2$ testing difference in $q$ between the sexes is 0.4 which is quite
insignificant. There is therefore no reason to think the population is not
in equilibrium, and we may take the estimate of gene frequency from both
sexes combined: it is $q = 0.107$. From this estimate of $q$ the expected
numbers in the different phenotypic classes are calculated; they are shown
in table (i). Only the females are relevant to the test of random mating.
The $\chi^2$ testing agreement between observed and expected numbers in
females is 4.4, with 2 degrees of freedom. This has a probability of 0.1 and
cannot be judged significant. The data are therefore compatible with the
Hardy-Weinberg equilibrium, in spite of the deficiency of tortoise-shell
females. If the deficiency of heterozygous females were real we might
attribute it to the method of sampling and infer that the tortoise-shells
were sent for destruction less often than the other colours, on account of
human preference.
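
The test on the females can be approximately reproduced from the figures quoted in the example. The sketch below assumes the rounded estimate $q = 0.107$ and the three observed numbers of females; because of that rounding, the computed $\chi^2$ comes out slightly above the quoted 4.4. The upper-tail probability uses the fact that for 2 degrees of freedom it equals $e^{-\chi^2/2}$, so no statistics library is needed.

```python
import math

# Observed numbers of females: ++, +y (tortoise-shell), yy.
observed = [277, 54, 7]
q = 0.107                      # estimate from both sexes, as quoted
n = sum(observed)

# Expected numbers under Hardy-Weinberg proportions.
expected = [n * (1 - q) ** 2, n * 2 * q * (1 - q), n * q ** 2]

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# With 2 degrees of freedom, P(chi-square > x) = exp(-x / 2).
probability = math.exp(-chi_square / 2)

print([round(e, 1) for e in expected])
print(f"chi-square = {chi_square:.2f}, probability = {probability:.2f}")
```

The degrees of freedom are taken as 2 to follow the example; a different convention would change the probability but not the $\chi^2$ value.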

More than one locus. The attainment of the equilibrium in
genotype frequencies after one generation of random mating is true
of all autosomal loci considered separately. But it is not true of the
genotypes with respect to two or more loci considered jointly. To
illustrate the point, consider a population made up of equal numbers
of $A_1A_1B_1B_1$ and $A_2A_2B_2B_2$ individuals, of both sexes. The gene
frequency at both loci is then $\tfrac{1}{2}$, and if the individuals mated at
random only three out of the nine genotypes would appear in the
progeny; the genotype $A_1A_1B_2B_2$, for example, would be absent though
its frequency in an equilibrium population would be $\tfrac{1}{16}$. The missing
genotypes appear in subsequent generations, but not immediately
at their equilibrium frequencies. The approach to equilibrium is
described by Li (1955a) and here we shall only outline the conclusions.
Consider two loci each with two alleles, and let the frequencies of
the four types of gamete formed by the initial population be as follows:

type of gamete    $A_1B_1$    $A_1B_2$    $A_2B_1$    $A_2B_2$
frequency           $r$         $s$         $t$         $u$

Then if the population is in equilibrium, ru=st, as may be seen by 
writing the gametic frequencies in terms of the gene frequencies. 
The difference, ru - st, gives a measure of the extent of the departure 
from equilibrium. This difference is halved in each successive genera- 
tion of random mating, and the approach to equilibrium is thus fairly 
rapid (see Fig. 1.3). If, however, more than two loci are to be con- 
sidered jointly the approach to equilibrium becomes progressively 
slower as the number of loci increases. 

Linked loci. If two loci are linked the approach to equilibrium 
under random mating is slower in proportion to the closeness of the 
linkage. When equilibrium is reached the coupling and repulsion 
phases are equally frequent; the frequencies of the gametic types then 
depend only on the gene frequencies and not at all on the linkage. It 
is easy to suppose that association between two characters, as for 
example between hair colour and eye colour, is evidence of linkage 
between the genes concerned. Association between characters, 
however, is more often evidence of pleiotropy than of linkage. Link- 
age can give rise to association only after a mixture of populations, 
the length of time that the association persists depending on the 
closeness of the linkage. 

The approach to equilibrium after the mixture of populations
differing in respect of the genes at two linked loci can be described in
the manner of the preceding section. The departure from
equilibrium, $d$, is expressed as $d = ru - st$, where $ru$ is the frequency of
coupling heterozygotes and $st$ that of repulsion heterozygotes. If $c$
is the frequency of recombination between the two loci then the
difference, $d$, at generation $t$ is

$$d_t = (1 - c)\,d_{t-1}$$

Thus if, for example, there is 25 per cent recombination the difference
is reduced by one quarter in each generation; or if there is 10 per cent
recombination the difference is reduced by 10 per cent in each
generation. Closely linked loci will therefore continue for a
considerable time to show the effects of a past mixture of populations. The
approach to equality of coupling and repulsion phases with different
degrees of linkage is illustrated in Fig. 1.3.

Fig. 1.3. Approach to equilibrium under random mating of two
loci, considered jointly. The graphs show the difference of
frequency ($d$) between coupling and repulsion heterozygotes in
successive generations, starting with all individuals repulsion
heterozygotes. The five graphs refer to different degrees of linkage
between the two loci, as indicated by the recombination frequency
shown alongside each graph. The graph marked .5 refers to
unlinked loci.
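
The recurrence $d_t = (1 - c)\,d_{t-1}$ implies $d_t = (1 - c)^t d_0$, which makes the effect of close linkage easy to tabulate. The sketch below is illustrative only: the function names and the particular recombination frequencies are assumptions, and the starting value of $d$ is arbitrary since only the proportionate decline matters.

```python
import math


def disequilibrium_over_time(d0, c, generations):
    """Values d_0, d_1, ..., d_t of the departure from equilibrium."""
    return [d0 * (1.0 - c) ** t for t in range(generations + 1)]


def generations_until(c, remaining_fraction):
    """Smallest whole number of generations after which d has fallen
    to at most `remaining_fraction` of its starting value."""
    return math.ceil(math.log(remaining_fraction) / math.log(1.0 - c))


print(disequilibrium_over_time(1.0, 0.5, 3))    # unlinked loci: halved each time
for c in (0.5, 0.25, 0.1, 0.01):
    print(f"c = {c}: {generations_until(c, 0.5)} generation(s) to halve d")
```

With unlinked loci ($c = 0.5$) the difference is halved every generation, in agreement with the preceding section; with tight linkage the same reduction takes many generations.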


Assortative mating. Assortative mating is a form of non-random 
mating, but this is the most convenient place to mention it. If the 
mated pairs tend to be of the same genotype more often than would 
occur by chance this is called positive assortative mating, and if less 
often it is called negative assortative (or sometimes disassortative) 
mating. The consequences are described by Wright (1921) and
summarised by Li (1955a) and will be only briefly outlined here.
Positive assortative mating is of some importance in human populations,
where it occurs with respect to intelligence and other mental
characters. These however are not single gene differences such as can be
discussed in the present context. The consequences of assortative
mating with a single locus can be deduced from Table 1.1 by
appropriate modification of the frequencies of the types of mating to allow
for the increased frequency of matings between like genotypes. The 
effect on the genotype frequencies among the progeny is to increase 
the frequencies of homozygotes and reduce that of heterozygotes. 
In effect the population becomes partially subdivided into two 
groups, mating taking place more frequently within than between 
the groups. 



We have seen that a large random-mating population is stable with 
respect to gene frequencies and genotype frequencies, in the absence 
of agencies tending to change its genetic properties. We can now 
proceed to a study of the agencies through which changes of gene 
frequency, and consequently of genotype frequencies, are brought 
about. There are two sorts of process: systematic processes, which 
tend to change the gene frequency in a manner predictable both in 
amount and in direction; and the dispersive process, which arises in 
small populations from the effects of sampling, and is predictable in 
amount but not in direction. In this chapter we are concerned only 
with the systematic processes, and we shall consider only large random- 
mating populations in order to exclude the dispersive process from 
the picture. There are three systematic processes: migration, mutation, 
and selection. We shall study these separately at first, assuming that 
only one process is operating at a time, and then we shall see how the 
different processes interact. 


The effect of migration is very simply dealt with and need not
concern us much here, though we shall have more to say about it later,
in connexion with small populations. Let us suppose that a large
population consists of a proportion, $m$, of new immigrants in each
generation, the remainder, $1 - m$, being natives. Let the frequency
of a certain gene be $q_m$ among the immigrants and $q_0$ among the
natives. Then the frequency of the gene in the mixed population, $q_1$,
will be

$$q_1 = mq_m + (1 - m)q_0 \qquad (2.1)$$

The change of gene frequency, $\Delta q$, brought about by one generation
of immigration is the difference between the frequency before
immigration and the frequency after immigration. Therefore

$$\Delta q = q_1 - q_0 = m(q_m - q_0) \qquad (2.2)$$

Thus the rate of change of gene frequency in a population subject to
immigration depends, as must be obvious, on the immigration rate
and on the difference of gene frequency between immigrants and
natives.

The effect of mutation on the genetic properties of the population 
differs according to whether we are concerned with a mutational 
event so rare as to be virtually unique, or with a mutational step that 
recurs repeatedly. The first produces no permanent change, whereas 
the second does. 

Non-recurrent mutation. Consider first a mutational event
that gives rise to just one representative of the mutated gene or
chromosome in the whole population. This sort of mutation is of
little importance as a cause of change of gene frequency, because the
product of a unique mutation has an infinitely small chance of
surviving in a large population, unless it has a selective advantage. This
can be seen from the following consideration. As a result of the single
mutation there will be one $A_1A_2$ individual in a population all the
rest of which is $A_1A_1$. The frequency of the mutated gene, $A_2$, is
therefore extremely low. Now according to the Hardy-Weinberg
equilibrium the gene frequency should not change in subsequent
generations. But with this situation we can no longer ignore the 
variation of gene frequency due to sampling. With a gene at very low 
frequency the sampling variation, even though very small, may take 
the frequency to zero, and the gene will then be lost from the popu- 
lation. Though at each generation a single gene has an equal chance 
of surviving or being lost, the loss is permanent and the probability 
of the gene still being present decreases with the passage of genera- 
tions (see Li, 1955a). The conclusion, therefore, is that a unique 
mutation without selective advantage cannot produce a permanent 
change in the population. 

Recurrent mutation. It is with the second type of mutation,
recurrent mutation, that we are concerned as an agent for causing
change of gene frequency. Each mutational event recurs regularly 
with characteristic frequency, and in a large population the frequency 
of a mutant gene is never so low that complete loss can occur from 
sampling. We have, then, to find out what is the effect of this "pres- 
sure" of mutation on the gene frequency in the population. 

Suppose gene $A_1$ mutates to $A_2$ with a frequency $u$ per generation.
($u$ is the proportion of all $A_1$ genes that mutate to $A_2$ between one
generation and the next.) If the frequency of $A_1$ in one generation is
$p_0$ the frequency of newly mutated $A_2$ genes in the next generation is
$up_0$. So the new gene frequency of $A_1$ is $p_0 - up_0$, and the change of
gene frequency is $-up_0$. Now consider what happens when the genes
mutate in both directions. Suppose for simplicity that there are only
two alleles, $A_1$ and $A_2$, with initial frequencies $p_0$ and $q_0$. $A_1$ mutates
to $A_2$ at a rate $u$ per generation, and $A_2$ mutates to $A_1$ at a rate $v$.
Then after one generation there is a gain of $A_2$ genes equal to $up_0$ due
to mutation in one direction, and a loss equal to $vq_0$ due to mutation
in the other direction. Stated in symbols, we have the situation:

Mutation rate                 $A_1 \underset{v}{\overset{u}{\rightleftharpoons}} A_2$

Initial gene frequencies      $p_0$        $q_0$

Then the change of gene frequency in one generation is

$$\Delta q = up_0 - vq_0 \qquad (2.3)$$

It is easy to see that this situation leads to an equilibrium in gene
frequency at which no further change takes place, because if the
frequency of one allele increases fewer of the other are left to mutate
in that direction and more are available to mutate in the other
direction. The point of equilibrium can be found by equating the change
of frequency, $\Delta q$, to zero. Thus at equilibrium

$$pu = qv, \quad \text{or} \quad \frac{p}{q} = \frac{v}{u}, \quad \text{and} \quad q = \frac{u}{u + v} \qquad (2.4)$$


Three conclusions can be drawn from the effect of mutation on
gene frequency. Measurements of mutation rates indicate values
ranging between about $10^{-4}$ and $10^{-8}$ per generation (one in ten
thousand and one in a hundred million gametes). With normal
mutation rates, therefore, mutation alone can produce only very slow 
changes of gene frequency; on an evolutionary time-scale they might 
be important, but they could scarcely be detected by experiment 
unless with micro-organisms. The second conclusion concerns the 
equilibrium between mutation in the two directions. Studies of 
reverse mutation (from mutant to wild type) indicate that it is usually 
less frequent than forward mutation (from wild type to mutant), on 
the whole about one tenth as frequent (Muller and Oster, 1957). 
The equilibrium gene frequencies for such loci, resulting from 
mutation alone, would therefore be about 0.1 of the wild-type allele
and 0.9 of the mutant; in other words the "mutant" would be the
common form and the "wild type" the rare form. Since this is not 
the situation we find in natural populations it is clear that the fre- 
quencies of such genes are not the product of mutation alone. We 
shall see in the next section that the rarity of mutant alleles is attribu- 
table to selection. The third conclusion concerns the effects of an 
increase of mutation rates such as might be caused by an increase of 
the level of ionising radiation to which the population is subjected. 
Any loci at which the gene frequencies are in equilibrium from the 
effects of mutation alone will not be affected by a change of mutation 
rate, provided the change affects forward and reverse mutation pro- 
portionately. This can be seen from consideration of the equilibrium 
gene frequencies given in equation 2.4. 
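
The three conclusions can be illustrated numerically. The sketch below is a minimal model, assuming two alleles and mutation as the only force; the function names `mutate` and `equilibrium_q` and the particular rates ($u = 10^{-5}$ forward, $v = 10^{-6}$ reverse) are illustrative assumptions chosen so that reverse mutation is one tenth as frequent as forward mutation.

```python
def mutate(q, u, v, generations):
    """Frequency of A2 after repeated mutation (A1 -> A2 at rate u,
    A2 -> A1 at rate v), applying the one-generation change u*p - v*q."""
    for _ in range(generations):
        q = q + u * (1.0 - q) - v * q
    return q


def equilibrium_q(u, v):
    """Equilibrium frequency of A2 under mutation alone: u / (u + v)."""
    return u / (u + v)


u, v = 1e-5, 1e-6
print(f"equilibrium frequency of the mutant: {equilibrium_q(u, v):.3f}")
print(f"after 1000 generations from q = 0:   {mutate(0.0, u, v, 1000):.5f}")
# Raising both rates tenfold leaves the equilibrium where it was.
print(f"equilibrium with tenfold rates:      {equilibrium_q(10 * u, 10 * v):.3f}")
```

Even after a thousand generations the mutant frequency has moved only about one per cent of the way from zero, which is the sense in which mutation alone produces very slow change.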


Hitherto we have supposed that all individuals in the population 
contribute equally to the next generation. Now we must take account 
of the fact that individuals differ in viability and fertility, and that 
they therefore contribute different numbers of offspring to the next 
generation. The proportionate contribution of offspring to the next 
generation is called the fitness of the individual, or sometimes the 
adaptive value, or selective value. If the differences of fitness are in 
any way associated with the presence or absence of a particular gene 
in the individual's genotype, then selection operates on that gene. 
When a gene is subject to selection its frequency in the offspring is 
not the same as in the parents, since parents of different genotypes 
pass on their genes unequally to the next generation. In this way
selection causes a change of gene frequency, and consequently also of
genotype frequency. The change of gene frequency resulting from
selection is more complicated to describe than that resulting from
mutation, because the differences of fitness that give rise to the
selection are an aspect of the phenotype. We therefore have to take
account of the degree of dominance shown by the genes in question.
Dominance, in this connexion, means dominance with respect to
fitness, and this is not necessarily the same as the dominance with
respect to the main visible effects of the gene. Most mutant genes, for
example, are completely recessive to the wild type in their visible
effects, but this does not necessarily mean that the heterozygote has a
fitness equal to that of the wild-type homozygote. The meaning of
the different degrees of dominance with which we shall deal is
illustrated in Fig. 2.1.

Fig. 2.1. Degrees of dominance with respect to fitness.

It is most convenient to think of selection acting against the gene 
in question, in the form of selective elimination of one or other of the 
genotypes that carry it. This may operate either through reduced 
viability or through reduced fertility in its widest sense, including 
mating ability. In either case the outcome is the same: the genotype 
selected against makes a smaller contribution of gametes to form 
zygotes in the next generation. We may therefore treat the change of
gene frequency as taking place between the counting of genotypes
among the zygotes of the parent generation and the formation of
zygotes in the offspring generation. The intensity of the selection is
expressed as the coefficient of selection, $s$, which is the proportionate
reduction in the gametic contribution of a particular genotype
compared with a standard genotype, usually the most favoured. The
contribution of the favoured genotype is taken to be 1, and the
contribution of the genotype selected against is then $1 - s$. This
expresses the fitness of one genotype compared with the other.
Suppose, for example, that the coefficient of selection is $s = 0.1$; this
means that for every 100 zygotes produced by the favoured genotype,
only 90 are produced by the genotype selected against.

The fitness of a genotype with respect to any particular locus is 
not necessarily the same in all individuals. It depends on the en- 
vironmental circumstances in which the individual lives, and also on 
the genotype with respect to genes at other loci. When we assign a 
certain fitness to a genotype, this refers to the average fitness in the 
whole population. Though differences of fitness between individuals 
result in selection being applied to many, perhaps to all, loci simul- 
taneously, we shall limit our attention here to the effects of selection 
on the genes at a single locus, supposing that the average fitness of the 
different genotypes remains constant despite the changes resulting 
from selection applied simultaneously to other loci. The conclusions 
we shall reach apply equally to natural selection occurring under 
natural conditions without the intervention of man, and to artificial 
selection imposed by the breeder or experimenter through his choice 
of individuals as parents and through the number of offspring he 
chooses to rear from each parent. 

Change of gene frequency under selection. We have first to 
derive the basic formulae for the change of gene frequency brought 
about by one generation of selection. Then we can consider what they 
tell us about the effectiveness of selection. The different conditions 
of dominance have to be taken account of, but the method is the same 
for all, and we shall illustrate it by reference to the case of complete 
dominance with selection acting against the recessive homozygote. 
Let the genes $A_1$ and $A_2$ have initial frequencies $p$ and $q$, $A_1$ being
completely dominant to $A_2$, and let the coefficient of selection against
$A_2A_2$ individuals be $s$. Multiplying the initial frequency by the fitness
of each genotype we obtain the proportionate contribution of each
genotype to the gametes that will form the next generation, thus:

Genotypes               $A_1A_1$    $A_1A_2$    $A_2A_2$        Total

Initial frequencies     $p^2$       $2pq$       $q^2$           $1$
Fitness                 $1$         $1$         $1 - s$
Gametic contribution    $p^2$       $2pq$       $q^2(1 - s)$    $1 - sq^2$

Note that the total gametic contribution is no longer unity, because
there has been a proportionate loss of $sq^2$ due to the selection. To
find the frequency of $A_2$ gametes produced, and so the frequency of
$A_2$ genes in the progeny, we take the gametic contribution of $A_2A_2$
individuals plus half that of $A_1A_2$ individuals and divide by the new
total, i.e. we apply equation 1.1. Thus the new gene frequency is

$$q_1 = \frac{q^2(1 - s) + pq}{1 - sq^2}$$

The change of gene frequency, $\Delta q$, resulting from one generation of
selection is

$$\Delta q = q_1 - q = \frac{q^2(1 - s) + pq}{1 - sq^2} - q$$

which on simplification reduces to

$$\Delta q = -\frac{sq^2(1 - q)}{1 - sq^2}$$


From this we see that the effect of selection on gene frequency de- 
pends not only on the intensity of selection, s, but also on the initial 
gene frequency. But both relationships are somewhat complex, and 
the examination of their significance will be postponed till after the 
other situations have been dealt with. 
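
The derivation for selection against the recessive homozygote can be checked numerically before moving on. The sketch below is only an illustration: `next_q_recessive` follows the gametic-contribution argument step by step, `delta_q_recessive` is the simplified expression $-sq^2(1-q)/(1-sq^2)$ obtained by working through the algebra, and both names, like the trial values of $q$ and $s$, are assumptions made for the example.

```python
def next_q_recessive(q, s):
    """Frequency of A2 after one generation of selection against A2A2.

    Fitnesses are 1, 1, 1 - s for A1A1, A1A2, A2A2 (complete dominance).
    """
    p = 1.0 - q
    total = 1.0 - s * q * q                  # total gametic contribution
    return (q * q * (1.0 - s) + p * q) / total


def delta_q_recessive(q, s):
    """Simplified change of gene frequency, -s q^2 (1 - q) / (1 - s q^2)."""
    return -s * q * q * (1.0 - q) / (1.0 - s * q * q)


for q in (0.9, 0.5, 0.1, 0.01):
    s = 0.2
    step = next_q_recessive(q, s) - q
    print(f"q = {q:<5} change by derivation {step:+.6f}  "
          f"by formula {delta_q_recessive(q, s):+.6f}")
```

The two columns agree, and the change becomes very small when $q$ is small, which anticipates the dependence on initial gene frequency noted above.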

Selection may act against the dominant phenotype and favour the
recessive: we then put $1 - s$ for the fitness of $A_1A_1$ and of $A_1A_2$
genotypes. The expression for $\Delta q$ is given in Table 2.1. The difference
may best be appreciated by considering the effects of total elimination
($s = 1$). The expression for selection against the dominant allele then
reduces to $\Delta q = 1 - q$, which expresses the fact that if only the
recessive genotype survives to breed the frequency of the recessive allele
will become 1 after a single generation of selection. But, on the other 
hand, if there is complete elimination of the recessive genotype the 
frequency of the dominant allele does not reach 1 after a single 
generation. The difference between the effects of selection in oppo- 
site directions becomes less marked as the value of s decreases. 


If there is incomplete dominance the expression for $\Delta q$ is again
different. The case of exact intermediate dominance is given in
Table 2.1. Here we put $1 - \frac{1}{2}s$ for the fitness of $A_1A_2$, and $1 - s$ for
the fitness of the $A_2A_2$ genotype. For selection in the opposite direction
in this case we need only interchange the initial frequencies of the
two alleles, writing $p$ in the place of $q$.

Table 2.1

Change of gene frequency, $\Delta q$, after one generation of selection
under different conditions of dominance specified in Fig. 2.1.

Conditions of dominance       Initial frequencies and fitness          Change of frequency,
and selection                 of the genotypes                         $\Delta q$, of gene $A_2$

                              $A_1A_1$    $A_1A_2$            $A_2A_2$
                              $p^2$       $2pq$               $q^2$

No dominance,                 1           $1-\frac{1}{2}s$    $1-s$       $-\dfrac{\frac{1}{2}sq(1-q)}{1-sq}$          (1)
selection against $A_2$

Complete dominance,           1           1                   $1-s$       $-\dfrac{sq^2(1-q)}{1-sq^2}$                 (2)
selection against $A_2A_2$

Complete dominance,           $1-s$       $1-s$               1           $+\dfrac{sq^2(1-q)}{1-s(1-q^2)}$             (3)
selection against $A_1-$

Overdominance,                $1-s_1$     1                   $1-s_2$     $+\dfrac{pq(s_1p-s_2q)}{1-s_1p^2-s_2q^2}$    (4)
selection against
$A_1A_1$ and $A_2A_2$

When $s$ is small the denominators differ little from 1, and the numerators
alone can be taken to represent $\Delta q$ sufficiently accurately for most purposes.
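All four expressions of Table 2.1 follow from the same procedure: weight each genotype frequency by its fitness, then take the frequency of the gene among the resulting gametes. The sketch below is an illustration under stated assumptions, not the author's own method. The helper names `delta_q`, `table_formulae` and `fitness_sets` are hypothetical, and the numerical values of q, s, s1 and s2 are arbitrary. It computes the change directly from the fitnesses and sets it beside each closed-form expression.

```python
def delta_q(q, w11, w12, w22):
    """Change in the frequency q of A2 after one generation of selection.

    w11, w12, w22 are the fitnesses of A1A1, A1A2 and A2A2.
    """
    p = 1.0 - q
    mean_w = p * p * w11 + 2 * p * q * w12 + q * q * w22
    q_new = (q * q * w22 + p * q * w12) / mean_w
    return q_new - q


def table_formulae(q, s, s1, s2):
    """The four closed-form expressions of Table 2.1, keyed by number."""
    p = 1.0 - q
    return {
        1: -0.5 * s * q * (1 - q) / (1 - s * q),
        2: -s * q * q * (1 - q) / (1 - s * q * q),
        3: s * q * q * (1 - q) / (1 - s * (1 - q * q)),
        4: p * q * (s1 * p - s2 * q) / (1 - s1 * p * p - s2 * q * q),
    }


def fitness_sets(s, s1, s2):
    """Genotype fitnesses (A1A1, A1A2, A2A2) for the same four cases."""
    return {
        1: (1.0, 1.0 - 0.5 * s, 1.0 - s),
        2: (1.0, 1.0, 1.0 - s),
        3: (1.0 - s, 1.0 - s, 1.0),
        4: (1.0 - s1, 1.0, 1.0 - s2),
    }


q, s, s1, s2 = 0.3, 0.2, 0.1, 0.4  # illustrative values only
closed = table_formulae(q, s, s1, s2)
for case, fitnesses in fitness_sets(s, s1, s2).items():
    print(case, delta_q(q, *fitnesses), closed[case])
```

Each printed pair should agree to within rounding error. Dropping the denominators in `table_formulae` gives the small-s approximations mentioned in the note under the table.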

Finally, selection may favour the heterozygote, a condition known
as overdominance. In this case we put $1 - s_1$ and $1 - s_2$ for the fitness
of the two homozygotes. The expression for $\Delta q$ is given in Table 2.1.
This special case will be given more detailed attention later. The
different conditions of dominance to which the expressions in
Table 2.1 refer are illustrated diagrammatically in Fig. 2.1. Let us
now see what these equations tell us about the effectiveness of selection.

Effectiveness of selection. We see from the formulae that the
effectiveness of selection, i.e. the magnitude of $\Delta q$, depends on the
initial gene frequency, $q$. The nature of this relationship is best
appreciated from graphs showing $\Delta q$ at different values of $q$. Fig. 2.2



















Fig. 2.2. Change of gene frequency, $\Delta q$, under selection of intensity $s = 0.2$, at
different values of initial gene frequency, $q$. Upper figure: a gene with no
dominance. Lower figure: a gene with complete dominance. The graphs marked
(-) refer to selection against the gene whose frequency is $q$, so that $\Delta q$ is
negative. The graphs marked (+) refer to selection in favour of the gene, so that
$\Delta q$ is positive. (From Falconer, 1954a; reproduced by courtesy of the editor of
the International Union of Biological Sciences.)


shows these graphs for the cases of no dominance and complete
dominance. They also distinguish between selection in the two
directions. A value of $s = 0.2$ was chosen for the coefficient of selection
because, for reasons given in Chapter 12, this seems to be the
right order of magnitude for the coefficient of selection operating on
genes concerned with metric characters in laboratory selection experiments.
First we may note that with this value of $s$ there is never a
great difference in $\Delta q$ according to the direction of selection. The
two important points about the effectiveness of selection that these
graphs demonstrate are: (i) Selection is most effective at intermediate
gene frequencies and becomes least effective when $q$ is either large or
small. (ii) Selection for or against a recessive gene is extremely
ineffective when the recessive allele is rare. This is the consequence
of the fact, noted earlier, that when a gene is rare it is represented
almost entirely in heterozygotes.
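The two points can be illustrated without the graphs by evaluating the change of gene frequency over a grid of initial frequencies. This is a rough sketch only: `delta_q` and `peak_frequency` are hypothetical helper names, the grid spacing is arbitrary, and only selection against the gene is shown.

```python
def delta_q(q, w11, w12, w22):
    """Exact one-generation change in the frequency q of A2."""
    p = 1.0 - q
    mean_w = p * p * w11 + 2 * p * q * w12 + q * q * w22
    return (q * q * w22 + p * q * w12) / mean_w - q


def peak_frequency(w11, w12, w22, steps=1000):
    """Grid search for the q at which the magnitude of delta q is largest."""
    grid = [i / steps for i in range(1, steps)]
    return max(grid, key=lambda q: abs(delta_q(q, w11, w12, w22)))


s = 0.2
no_dominance = (1.0, 1.0 - s / 2, 1.0 - s)  # fitnesses of A1A1, A1A2, A2A2
recessive = (1.0, 1.0, 1.0 - s)             # A2 fully recessive

print("peak, no dominance:      ", peak_frequency(*no_dominance))
print("peak, complete dominance:", peak_frequency(*recessive))
for q in (0.01, 0.5):
    print(q, delta_q(q, *no_dominance), delta_q(q, *recessive))
```

With these settings the largest change falls at an intermediate frequency in both cases, and at q = 0.01 the change for the recessive gene is far smaller than for the gene with no dominance.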

Another way of looking at the effect of the initial gene frequency on 
the effectiveness of selection is to plot a graph showing the course of 
selection over a number of generations, starting from one or other 
extreme. Such graphs are shown in Fig. 2.3. They were constructed 
directly from those of Fig. 2.2, and refer again to a coefficient of
selection, $s = 0.2$. They show that the change due to selection is at
first very slow, whether one starts from a high or a low initial gene
frequency; it becomes more rapid at intermediate frequencies and
falls off again at the end. In the case of a fully dominant gene one is
chiefly interested in the frequency of the homozygous recessive
genotype, i.e. $q^2$. For this reason the graph shows the effect of selection
on $q^2$ instead of on $q$.
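The course of selection over many generations can be traced by applying the one-generation step repeatedly. The sketch below is illustrative and makes its own choices: the names `next_q` and `generations_until` are hypothetical, only selection against the gene is followed, and the three frequency intervals are arbitrary. It counts the generations spent crossing an early, a middle and a late stretch of the path.

```python
def next_q(q, w11, w12, w22):
    """Frequency of A2 after one generation of selection."""
    p = 1.0 - q
    mean_w = p * p * w11 + 2 * p * q * w12 + q * q * w22
    return (q * q * w22 + p * q * w12) / mean_w


def generations_until(q_start, q_stop, fitnesses, limit=100_000):
    """Generations of selection against A2 for q to fall from q_start to q_stop."""
    q, generations = q_start, 0
    while q > q_stop and generations < limit:
        q = next_q(q, *fitnesses)
        generations += 1
    return generations


s = 0.2
cases = {
    "no dominance": (1.0, 1.0 - s / 2, 1.0 - s),
    "complete dominance": (1.0, 1.0, 1.0 - s),
}
for label, fitnesses in cases.items():
    early = generations_until(0.99, 0.90, fitnesses)
    middle = generations_until(0.55, 0.45, fitnesses)
    late = generations_until(0.10, 0.01, fitnesses)
    print(label, early, middle, late)
```

In both cases the middle stretch is crossed in the fewest generations, and for the recessive gene the final stretch takes far longer than it does with no dominance.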

It is often useful to express the change of gene frequency, $\Delta q$,
under selection in a simplified form, which is a sufficiently good
approximation for many purposes. If either the coefficient of selection,
$s$, or the gene frequency, $q$, is small, then the denominators of
the equations in Table 2.1 become very nearly unity, and we can use
the numerators alone as expressions for $\Delta q$. Then for selection in
either direction we have, with no dominance:

$$\Delta q = \pm\tfrac{1}{2}sq(1-q) \quad \text{(approx.)} \tag{2.7}$$

and with complete dominance:

$$\Delta q = \pm sq^2(1-q) \quad \text{(approx.)} \tag{2.8}$$


















Fig. 2.3. Change of gene frequency during the course of selection from one
extreme to the other. Intensity of selection, $s = 0.2$. Upper figure: a gene with
no dominance. Lower figure: a gene with complete dominance, $q$ being the
frequency of the recessive allele and $q^2$ that of the recessive homozygote. The
graphs marked (-) refer to selection against the gene whose frequency is $q$, so
that $q$ or $q^2$ decreases. The graphs marked (+) refer to selection in favour of the
gene, so that $q$ or $q^2$ increases. (From Falconer, 1954a; reproduced by courtesy
of the editor of the International Union of Biological Sciences.)




Example 2.1. As an example of the change of gene frequency under 
selection we shall take the case of a sex-linked gene, in spite of the added 
complication, because there is no well documented case of an autosomal 
gene. Fig. 2.4 shows the change of the frequency of the recessive sex- 
linked gene "raspberry" in Drosophila melanogaster over a period of about 
eighteen generations, described by Merrell (1953). The population was 
started with a gene frequency of 0.5 in both sexes, and was therefore in


Fig. 2.4. Change of gene frequency under natural selection in
the laboratory, as described in Example 2.1. (Data from Merrell, 1953.)

equilibrium at the beginning (see p. 17). Counts were made at about 
monthly intervals, and the gene frequency in both sexes combined (by 
equation 1.3) is shown against the scale of days in the figure. Measure- 
ments of fitness were made by comparison of the relative viability of 
mutant and wild-type phenotypes, and of their relative success in mating. 
No differences of viability were detected, nor of the success of females in 




mating. But mutant males were only 50 per cent as successful as wild- 
type males in mating. The changes of gene frequency expected on the 
basis of this difference of fitness were then calculated generation by 
generation, and these calculated values are shown in the figure by the 
smooth curve, plotted against the scale of generations. From a similar 
experiment with a different mutant it was found that the calculated and 
observed curves coincided if a period of 24 days was taken as the interval 
between generations. For this reason 24 days to a generation was taken as 
the basis for superimposing the curves shown here. Since the calculated 
curve was to this extent made to fit the observed, the good agreement 
between the two cannot be taken as proof that selection operated only 
through the males' success in mating. But the similarity in their shapes 
illustrates well how the change of gene frequency is rapid at first, tails off 
as the gene frequency becomes lower, and becomes very slow when it 
approaches zero. 

Number of generations required. How many generations of
selection would be needed to effect a specified change of gene
frequency? An answer to this question is sometimes required in
connexion with breeding programmes or proposed eugenic measures.
We shall here consider only the case of selection against a recessive
when elimination of the unwanted homozygote is complete, i.e. $s = 1$.
This would apply to natural selection against a recessive lethal, and
artificial selection against an unwanted recessive in a breeding
programme. We shall also, for the moment, suppose that there is no
mutation. We had in equation 2.5 an expression for the new gene
frequency after one generation of selection against a recessive.
Substituting $s = 1$ in this equation and writing $q_0, q_1, q_2, \ldots, q_t$ for the
gene frequency after 0, 1, 2, ..., $t$ generations of selection we have

$$q_1 = \frac{q_0}{1 + q_0}$$

$$q_2 = \frac{q_1}{1 + q_1} = \frac{q_0}{1 + 2q_0}$$

by substituting for $q_1$ and simplifying. So in general

$$q_t = \frac{q_0}{1 + tq_0} \tag{2.9}$$

and the number of generations, $t$, required to change the gene
frequency from $q_0$ to $q_t$ is

$$t = \frac{q_0 - q_t}{q_0 q_t} = \frac{1}{q_t} - \frac{1}{q_0} \tag{2.10}$$

We may use this formula to illustrate the point already made, that 
when the frequency of a recessive gene is low selection is very slow 
to change it. 

Example 2.2. It is sometimes suggested, as a eugenic measure, that 
those suffering from serious inherited defects should be prevented from 
reproducing, since in this way the frequency of such defects would be 
reduced in future generations. Before deciding whether the proposal is a 
good one we ought to know what it would be expected to achieve. We 
cannot properly discuss this problem without taking mutation into ac- 
count, as we shall do later; the answer we get ignoring mutation, as we do 
now, shows what is the best that could be hoped for. Let us take albinism 
as an example, though it cannot be regarded as a very serious defect, and 
ask the question: how long would it take to reduce its frequency to half the 
present value? The present frequency is about 1/20,000, and this makes
$q_0 = 1/141$, as we saw in Example 1.4. The objective is $q_t^2 = 1/40,000$, which
makes $q_t = 1/200$. So, from equation 2.10, $t = 200 - 141 = 59$ generations.
With 25 years to a generation it would take nearly 1500 years to achieve
this modest objective. More serious recessive defects are generally even
less common than albinism and with them elimination would be still
slower.
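The arithmetic of this example can be reproduced in a few lines. This is a sketch under the example's own assumptions (complete elimination of the recessive homozygote, no mutation); the function names `generations_required` and `iterate_lethal` are hypothetical. It evaluates the closed-form count of generations and confirms it by stepping the one-generation recurrence.

```python
import math


def generations_required(q0, qt):
    """Generations of complete selection (s = 1) to go from q0 to qt."""
    return 1.0 / qt - 1.0 / q0


def iterate_lethal(q0, t):
    """Apply q -> q / (1 + q) for t generations."""
    q = q0
    for _ in range(t):
        q = q / (1.0 + q)
    return q


q0 = math.sqrt(1 / 20_000)  # present gene frequency, about 1/141
qt = math.sqrt(1 / 40_000)  # target gene frequency, 1/200

t = generations_required(q0, qt)
print(t, math.ceil(t) * 25)           # generations, and years at 25 per generation
print(iterate_lethal(q0, 59) <= qt)   # the recurrence reaches the target by generation 59
```

The closed form gives a little under 59 generations, which at 25 years to a generation is close to 1500 years.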

Balance between mutation and selection. Having described 
the effects of mutation and selection separately we must now compare 
them and consider them jointly. Which is the more effective process 
in causing change of gene frequency? Is it reasonable to attribute the 
low frequency of deleterious genes that we find in natural popula- 
tions to the balance between mutation tending to increase the fre- 
quency and selection tending to decrease it? The expressions already 
obtained for the change of gene frequency under mutation or selec- 
tion alone show that both depend on the initial gene frequency, but in 
different ways. Mutation to a particular gene is most effective in 
increasing its frequency when the mutant gene is rare (because there 




are more of the unmutated genes to mutate); but selection is least 
effective when the gene is rare. The relative effectiveness of the two 
processes depends therefore on the gene frequency, and if both pro- 
cesses operate for long enough a state of equilibrium will eventually 
be reached. So we must find what the gene frequency will be when 
equilibrium is reached. This is done by equating the two expressions 
for the change of gene frequency, because at equilibrium the change 
due to mutation will be equal and opposite to the change due to
selection.

Let us consider first a fully recessive gene with frequency $q$,
mutation rate to it $u$, and from it $v$, and selection coefficient against it
$s$. Then from equations (2.3) and (2.6) we have at equilibrium

$$u(1-q) - vq = \frac{sq^2(1-q)}{1 - sq^2} \tag{2.11}$$



This equation is too complicated to give a clear answer to our ques- 
tion. But we can make two simplifications with only a trivial sacrifice 
of accuracy. We are specifically interested in genes at low equilibrium 
frequencies. If q is small the term vq representing back mutation is 
relatively unimportant and can be neglected; and we can use the 
approximate expression (equation 2.8) for the selection effect. 
Making these simplifications we have the equilibrium condition for 
selection against a recessive gene 

$$u(1-q) = sq^2(1-q) \quad \text{(approx.)}$$

$$u = sq^2 \quad \text{(approx.)} \tag{2.12}$$

or

$$q = \sqrt{\frac{u}{s}} \quad \text{(approx.)} \tag{2.13}$$

For a gene with no dominance similar reasoning from equation (1) 
in Table 2.1 gives the equilibrium condition 

$$q = \frac{2u}{s} \quad \text{(approx.)} \tag{2.14}$$


Finally, consider selection against a completely dominant gene, the
frequency of the dominant gene being $1 - q$, and the mutation rate
to it being $v$. In this case $1 - q$ is very small and the term $u(1-q)$ in
equation 2.11 is negligible. We have therefore at equilibrium

$$vq = sq^2(1-q) \quad \text{(approx.)}$$

$$q(1-q) = \frac{v}{s} \quad \text{(approx.)}$$

or

$$H = \frac{2v}{s} \quad \text{(approx.)} \tag{2.15}$$

where $H$ is the frequency of heterozygotes. If the mutant gene is
rare $H$ is very nearly the frequency of the mutant phenotype in the
population.
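How good the approximation for a recessive gene is can be judged by iterating the full balance of mutation and selection until the gene frequency stops changing. The following is a sketch under stated assumptions: the names `next_q` and `equilibrium_by_iteration` are hypothetical, the values of u, v and s are illustrative (the back mutation rate v in particular is an arbitrary choice), and the changes due to mutation and selection are simply added within a generation, as in the text.

```python
import math


def next_q(q, u, v, s):
    """One generation of mutation plus selection against a recessive gene."""
    mutation = u * (1.0 - q) - v * q                        # to the gene, and back
    selection = -s * q * q * (1.0 - q) / (1.0 - s * q * q)  # loss of A2A2
    return q + mutation + selection


def equilibrium_by_iteration(q, u, v, s, generations=100_000):
    """Iterate long enough for q to settle at its equilibrium value."""
    for _ in range(generations):
        q = next_q(q, u, v, s)
    return q


u, v, s = 1e-5, 1e-6, 0.1  # illustrative values; v is an assumption
from_below = equilibrium_by_iteration(0.001, u, v, s)
from_above = equilibrium_by_iteration(0.1, u, v, s)
approximate = math.sqrt(u / s)
print(from_below, from_above, approximate)
```

Both starting points should settle on the same frequency, and it should lie very close to the square root of u/s, which is what neglecting back mutation and the denominator amounts to.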

Example 2.3. If the equilibrium state is accepted as applicable, we 
can use it to get an estimate of the mutation rate of dominant abnormalities 
for which the coefficient of selection is known. Among some human 
examples described by Haldane (1949) is the case of dominant dwarfism 
(chondrodystrophy) studied in Denmark. The frequency of dwarfs was
estimated at $10.7 \times 10^{-5}$, and their fitness ($1 - s$) at 0.196. The estimate of
fitness was made from the number of children produced by dwarfs compared
with their normal sibs. The mutation rate, by equation (2.15),
comes out at $4.3 \times 10^{-5}$. Though there is a possibility of serious error in
the estimate of frequency owing to prenatal mortality of dwarfs, the 
mutation rate is almost certainly estimated within the right order of magni- 
tude. For a discussion of the estimation of mutation rates in man see 
Crow (1956). 
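The estimate in this example is a one-line rearrangement of the equilibrium condition for a dominant gene, with the frequency of heterozygotes taken as the frequency of affected individuals. A minimal sketch, in which the function name `dominant_mutation_rate` is hypothetical:

```python
def dominant_mutation_rate(frequency_affected, fitness):
    """Mutation rate v from H = 2v/s, taking H as the frequency of affected."""
    s = 1.0 - fitness
    return frequency_affected * s / 2.0


v = dominant_mutation_rate(10.7e-5, 0.196)
print(f"{v:.2e}")  # 4.30e-05
```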

These expressions for the equilibrium gene frequency under the
joint action of mutation and selection show that the gene frequency
can have any value at equilibrium, depending on the relative magnitude
of the mutation rate and the coefficient of selection. But if
mutation rates are of the order of magnitude commonly accepted,
i.e. $10^{-5}$, or thereabouts, then only a mild selection against the mutant
gene will be needed to hold it at a very low equilibrium frequency.
For example, the following are the equilibrium frequencies of a
recessive gene and of the recessive homozygote under various intensities
of selection if the mutation rate is $10^{-5}$:

$s$ =

$q$ =

$q^2$ =

Thus, if a gene mutates at the rate of $10^{-5}$, a selective disadvantage of
10 per cent is enough to hold the frequency of the recessive homozygote
at one in ten thousand; and a 50 per cent disadvantage will
hold it at one in fifty thousand. It is quite clear therefore that the
low frequency of deleterious mutants in natural populations is in 
accord with what would be expected from the joint action of mutation 
and selection. A further conclusion is that mutation alone is most 
unlikely to be a cause of evolutionary change. It is not mutation, but 
selection, that chiefly determines whether a gene spreads through the 
population or remains a rare abnormality, unless the mutation rate 
is very much higher than seems to be the rule. 
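The two figures quoted above, one in ten thousand and one in fifty thousand, follow directly from the approximate equilibrium condition for a recessive gene, under which the homozygote frequency is the mutation rate divided by the coefficient of selection. A short sketch, with `recessive_equilibrium` as a hypothetical helper name and only the two selection intensities mentioned in the text:

```python
import math


def recessive_equilibrium(u, s):
    """Approximate equilibrium (q, q^2) for a recessive gene: q^2 = u / s."""
    q_squared = u / s
    return math.sqrt(q_squared), q_squared


u = 1e-5
for s in (0.1, 0.5):
    q, q_squared = recessive_equilibrium(u, s)
    print(f"s = {s}: q = {q:.4f}, homozygotes about 1 in {1 / q_squared:,.0f}")
```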

Let us now briefly consider two questions of social importance 
concerning the balance between selection and mutation: the effect of 
an increase of mutation rate, and the effect of a change in the intensity 
of selection against deleterious mutants. These questions are more 
fully discussed by Crow (1957). 

Increase of mutation rate. Since the products of mutation are 
predominantly deleterious, the process of mutation has a harmful 
effect on a proportion of the individuals in a population. When an 
individual dies or fails to reproduce in consequence of the reduced 
fitness of its genotype, we may refer to this as a "genetic death." An
increase in the frequency of genetic deaths would reduce the poten- 
tial reproductive rate and might thus reduce the speed with which a 
species could multiply in an unoccupied territory. But when the 
numbers of adults are held constant by density-dependent factors, 
even quite a high frequency of genetic deaths will not affect the 
ability of the population to perpetuate itself, especially if the reproductive
rate is high, because the death of some individuals leaves room
for others that would otherwise have died from lack of food or some
other cause. There is a species of Drosophila, for example (D.
tropicalis, from Central America), in which 50 per cent of individuals 
in a certain locality suffer genetic death, and yet the population 
flourishes (Dobzhansky and Pavlovsky, 1955). In species with low 
reproductive rates the frequency of genetic deaths is of greater conse- 
quence, particularly in ourselves, where the death of every individual 
is a matter of concern. Let us therefore consider what effect is to be
expected from an increase of mutation rate such as might be caused
by an increase in the amount of ionising radiation to which human 
populations are exposed. 

Let us take the case of a recessive gene with a mutation rate (to it)
of $u$, the gene being in equilibrium at a frequency of $q$. Then, if the
coefficient of selection against the homozygote is $s$, the frequency of
genetic deaths is $sq^2$. This is the proportionate loss due to selection,
as shown on p. 29, and it is equal to $u$, by equation 2.12. Thus the
frequency of genetic deaths, when equilibrium has been attained,
depends on the mutation rate alone, and is not influenced by the
degree of harmfulness of the gene. The reason for this apparent paradox
is that the more harmful genes come to equilibrium at lower
frequencies.

Now, if the mutation rate is increased, and maintained at the new
level, the gene will begin to increase toward a new point of equilibrium
at which $sq^2$ will be equal to the new mutation rate. Thus if
the mutation rate were doubled the frequency of genetic deaths would
also be doubled, when the new equilibrium had been reached. But
the approach to the new equilibrium would be very slow. The change
of gene frequency in the first generation is approximately

$$\Delta q = u(1-q) - sq^2(1-q)$$

$u$ being the new mutation rate (from equations 2.3 and 2.8, but
neglecting back mutation). To see what this means let us take a
mutation rate of $10^{-5}$ as being probably representative of many loci,
and let us suppose that this was doubled. We may with sufficient
accuracy take $1 - q$ as unity. Then, since $sq^2$ is still equal to the old
mutation rate,

$$\Delta q = 2 \times 10^{-5} - 10^{-5} = 10^{-5}$$

The immediate effect of the increase of mutation rate would therefore
be very small indeed.
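To give a feeling for how slow the approach to the new equilibrium is, the first-generation change can be computed and the same step then repeated. This is a sketch only: the coefficient of selection s = 0.1 is an assumed illustrative value not given in the text, `delta_q` is a hypothetical helper, and back mutation is neglected as above.

```python
import math


def delta_q(q, u, s):
    """Approximate change per generation for a recessive gene, no back mutation."""
    return u * (1.0 - q) - s * q * q * (1.0 - q)


s = 0.1                    # assumed coefficient of selection (illustrative)
u_old, u_new = 1e-5, 2e-5  # mutation rate before and after doubling

q_old = math.sqrt(u_old / s)  # equilibrium under the old mutation rate
q_new = math.sqrt(u_new / s)  # equilibrium under the new mutation rate
first_step = delta_q(q_old, u_new, s)

# Count the generations needed to cover half the distance to the new equilibrium.
q, generations = q_old, 0
halfway = q_old + 0.5 * (q_new - q_old)
while q < halfway:
    q += delta_q(q, u_new, s)
    generations += 1

print(first_step, generations)
```

The first step is very nearly the $10^{-5}$ of the text, and with this assumed s even half of the eventual change takes some hundreds of generations.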

Change of selection intensity. Intensification of selection is 
sometimes advocated as a eugenic measure in human populations, 
on the grounds that if sufferers from genetic defects were prevented 
from breeding the frequency of the defects would be reduced. We 
saw from Example 2.2. that the effect of selection against a recessive 
defect is very slow indeed, even when mutation is ignored. The true 
situation is even worse. We cannot reduce the frequency of an 
abnormality, whether dominant or recessive, below the new equili- 
brium frequency. The serious defects have already a fairly strong 
natural selection working on them, and the addition of artificial 
selection can do no more than make the coefficient of selection, s, 
equal to 1. This would probably seldom do more than double the 
present coefficient of selection, and the incidence of defects would be 
reduced to not less than half their present values (equations 2.13, 
2.14, 2.15). With a dominant gene the effect would be immediate, 




but with a recessive the approach to the new equilibrium would be 
extremely slow. 

The situation with respect to recessives is complicated by the 
fact that deleterious recessives are certainly not at their equilibrium 
frequencies in present-day human populations (Haldane, 1939). 
The reason is that modern civilisation has reduced the degree of 
subdivision (i.e. inbreeding) and so reduced the frequency of homo- 
zygotes, as will be explained in the next chapter. In consequence 
both the gene frequencies and the homozygote frequencies are below 
their equilibrium values, and must be presumed to be at present 
increasing slowly toward new equilibria at higher values. 

Perhaps the converse of the question posed above is one that 
should give us more concern, namely the consequences of the reduced 
intensity of natural selection under modern conditions. Minor 
genetic defects, such as colour-blindness, must presumably have had 
some selective disadvantage in the past but now have very little, if 
any, effect on fitness. Moreover, the development and extension of 
medical treatment prolongs the lives of many people with diseases 
that have at least some degree of genetic causation through genes that 
increase susceptibility. This relaxation of the selection operating on 
minor genetic defects and against genes concerned in the causation of 
disease suggests that the frequencies of these genes will increase 
toward new equilibria at higher values. If this is true we must expect 
the incidence of minor genetic defects to increase in the future, and 
also the proportion of people who need medical treatment for a 
variety of diseases. By applying humanitarian principles for our own 
good now we are perhaps laying up a store of inconvenience for our 
descendants in the distant future. 

Selection favouring heterozygotes. We have considered the 
effects of selection operating on genes that are partially or fully 
dominant with respect to fitness; but, though the appropriate for- 
mula was given in Table 2.1, we have not yet discussed the conse- 
quences of overdominance with respect to fitness; that is, when the 
heterozygote has a higher fitness than either homozygote. At first 
sight it may seem rather improbable that selection should favour the 
heterozygote of two alleles rather than one or other of the homo- 
zygotes, but there are reasons for thinking that this in fact is not at all 
an uncommon situation. Let us first examine the consequences of 
this form of selection, and then consider the evidence of its occur- 
rence in nature. 


Selection operating on a gene with partial or complete dominance
tends toward the total elimination of one or other allele, the final gene
frequency, in the absence of mutation, being 0 or 1. When selection
favours the heterozygote, however, the gene frequency tends toward
an equilibrium at an intermediate value, both alleles remaining in the
population, even without mutation. The reason is as follows. The
change of gene frequency after one generation was given in Table 2.1
as being

$$\Delta q = \frac{pq(s_1p - s_2q)}{1 - s_1p^2 - s_2q^2}$$

The condition for equilibrium is that $\Delta q = 0$, and this is fulfilled when
$s_1p = s_2q$. The gene frequencies at this point of equilibrium are
therefore

$$\frac{p}{q} = \frac{s_2}{s_1}$$

or

$$q = \frac{s_1}{s_1 + s_2} \tag{2.16}$$

Now, if $q$ is greater than its equilibrium value (but not 1), and $p$
therefore less, $s_1p$ will be less than $s_2q$, and $\Delta q$ will be negative; that is
to say $q$ will decrease. Similarly if $q$ is less than its equilibrium value
(but not 0) it will increase. Therefore when the gene frequency has
any value, except 0 or 1, selection changes it toward the intermediate
point of equilibrium given in equation 2.16, and both alleles remain
permanently in the population. Three or more alleles at a locus are 
maintained in the same way, provided the heterozygote of any pair 
is superior in fitness to both homozygotes of that pair (Kimura, 
1956). A feature of the equilibrium worthy of note is that the gene 
frequency depends not on the degree of superiority of the hetero- 
zygote but on the relative disadvantage of one homozygote compared 
with that of the other. Therefore there is a point of equilibrium at 
some more or less intermediate gene frequency whenever a hetero- 
zygote is superior to both the homozygotes, no matter by how little. 
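The stability of this equilibrium is easy to see numerically: starting from either side, repeated selection carries the gene frequency to the same intermediate value. The sketch below uses hypothetical names (`delta_q_overdominant`, `equilibrium`, `iterate`) and arbitrary illustrative coefficients of selection against the two homozygotes.

```python
def delta_q_overdominant(q, s1, s2):
    """Change of q in one generation when the heterozygote is favoured."""
    p = 1.0 - q
    return p * q * (s1 * p - s2 * q) / (1.0 - s1 * p * p - s2 * q * q)


def equilibrium(s1, s2):
    """Equilibrium gene frequency, q = s1 / (s1 + s2)."""
    return s1 / (s1 + s2)


def iterate(q, s1, s2, generations):
    """Apply the one-generation change repeatedly."""
    for _ in range(generations):
        q += delta_q_overdominant(q, s1, s2)
    return q


s1, s2 = 0.1, 0.3  # illustrative coefficients against A1A1 and A2A2
q_hat = equilibrium(s1, s2)
print(q_hat, iterate(0.01, s1, s2, 2000), iterate(0.99, s1, s2, 2000))

# Scaling both coefficients down tenfold leaves the equilibrium unchanged.
print(equilibrium(0.01, 0.03))
```

The last line illustrates the point made in the text that the equilibrium depends on the ratio of the two disadvantages and not on how large the heterozygote's superiority is.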

Our previous consideration of genes with complete dominance 
showed that the balance between selection and mutation satisfactorily 
accounts for the presence of deleterious genes at low frequencies, 
causing the appearance of rare abnormal, or mutant, individuals. 
Genes at intermediate frequencies, however, are common in very 
many species, and the presence of these cannot satisfactorily be 






accounted for in this way. But the intermediate frequencies are just 
what would be expected if selection favoured the heterozygotes. 
The existence in a population of individuals with readily discernible 
differences caused by genes at intermediate frequencies is referred to 
as polymorphism. The blood group differences of man are perhaps 
the best known examples, but antigenic differences are found also in 
many other species and are probably universal in animals. More 
striking forms of polymorphism are the colour varieties found in 
many species, particularly among insects, snails, and fishes. The 
genes causing polymorphism have usually no obvious advantage of 
one allele over another, all the genotypes being essentially normal, or 
"wild-type," individuals. In these circumstances, as we noted above, 
only a very slight superiority of the heterozygote would be sufficient 
to establish an equilibrium at an intermediate gene frequency. The 
properties of the genes concerned with polymorphism seem, there- 
fore, to accord well with the hypothesis that selection is operating on 
them in favour of the heterozygotes, and this is generally conceded to 
be the most probable reason for their intermediate frequencies. As a 
general cause of polymorphism, however, it cannot be taken as fully 
proved, because the superior fitness of heterozygotes has been 
demonstrated in relatively few cases, and there are other possible 
reasons for the existence of polymorphism. For example, the genes 
might be in a transitional stage of a change from one extreme to the 
other as a result of slow environmental change; or the intermediate 
frequencies might be the point of equilibrium between mutation in 
opposite directions, with virtually no selective advantage of one allele 
over the other. But these explanations seem improbable, particularly 
as some polymorphisms are known to be of very long standing. The 
polymorphism of shell colours in the land snail Cepaea nemoralis, for 
example, goes back to Neolithic times (Cain and Sheppard, 1954a). 
Another possible cause of polymorphism lies in the heterogeneity of 
the environment in which a population lives. If the differences of 
environment influence the selection coefficients in such a way that 
one allele is favoured in some conditions and another allele in other 
conditions, then polymorphism may result provided that mating is 
not entirely at random over the range of environments. (See Levene, 
1953; Li, 1955b; Mather, 1955a; Waddington, 1957.)

If heterozygotes are indeed superior in fitness, one naturally 
wants to enquire into the nature of their superiority. Unfortunately, 
however, very little is known about this, though evidence is accumu- 


lating, in the case of the human blood groups, that certain blood groups 
are associated with an increased susceptibility to certain diseases 
(Roberts, 1957); group O, for example, with duodenal ulcer and group 
A with pernicious anaemia. If one states this the other way round 
and says that the other alleles confer increased resistance to these 
diseases, then it is not unreasonable to suppose that each allele 
increases resistance to different diseases, and that the presence of two 
alleles increases the resistance to two different diseases, thereby 
giving a selective advantage to the heterozygote. 

Another question of interest concerns the evolutionary signifi- 
cance of polymorphism. Is it an "adaptive" feature of a species? 
Does it, in other words, confer some advantage over a population 
without it? Some think that it does. (See, particularly, Dobzhansky,
1951b.) Others, however, point out that the average fitness of a
population with polymorphism resulting from superior fitness of
heterozygotes is less than that of a population in which a single
allele performs the same function as the two different alleles in the
heterozygote (Cain and Sheppard, 1954b). On this view, polymorphism
is a situation that, once established, is perpetuated by selection
between individuals within the population, but is a disadvantage to 
the population as a whole in competition with another population 
lacking the polymorphism. 

The foregoing account of polymorphism leaves many problems 
unsolved, and does little more than sketch the outlines of a most 
interesting aspect of the genetics of populations. In particular, we 
have not mentioned the extensive and detailed investigations of poly- 
morphism in respect of inverted segments of chromosomes found in 
species of Drosophila and, to a lesser extent, in some other animals 
and plants. For a description of these studies, and also for a fuller 
general account of polymorphism, the reader must be referred to 
Dobzhansky (1951a). We conclude by giving one example of poly- 
morphism where the nature of the superiority of heterozygotes is 
clear. Other cases are described by Dobzhansky (1951a), Ford 
(1953), Lerner (1954), and Sheppard (1958). 

Example 2.4. Sickle-cell anaemia (Allison, 1955). There is a gene, 
found in American negroes and in the indigenous East Africans, which 
causes the formation of an abnormal type of haemoglobin. Homozygotes 
suffer from an anaemia, characterised by the "sickle" shape of the erythro- 
cytes; it is a severe disease from which many die. All the haemoglobin of 
homozygotes is of the abnormal type, though there is a variable admixture 




of foetal haemoglobin. Heterozygotes do not suffer from anaemia, but 
they can be recognised by the presence of sickle cells if the haemoglobin 
is deoxygenated. About 35 per cent of their haemoglobin is of the ab- 
normal type. With respect to haemoglobin synthesis, therefore, the sickle- 
cell gene is partially dominant, though with respect to the anaemia it is 
recessive, and with respect to fitness it has been proved to be over- 
dominant. In routine surveys the few surviving homozygotes are not 
readily distinguished from heterozygotes; we shall refer to the combined 
heterozygotes and surviving homozygotes as "abnormals." The frequency 
of abnormals varies very much with the locality: in American negroes it 
is about 9 per cent, and in different parts of Africa it varies from zero up 
to a maximum of about 40 per cent. In view of the severe disability of the 
homozygotes it is impossible to account for these high frequencies unless 
the heterozygotes have a quite substantial selective advantage over the 
normal homozygotes. The nature of this selective advantage has been 
shown to be connected with resistance to malaria. Heterozygotes are less 
susceptible to malaria than normal homozygotes, and the frequency of 
abnormals in different areas is correlated with the prevalence of malaria. 
Let us work out the gene frequency corresponding with the maximum 
frequency of 40 per cent abnormals, and then find the magnitude of the 
selective advantage of heterozygotes necessary to maintain this gene 
frequency in equilibrium. 

If the gene frequency is in equilibrium it will be the same after selec- 
tion has taken place as it was before. Therefore, if we assume that all the 
selection takes place before adulthood — an assumption that is not very far 
from the truth — we can estimate the gene frequency from the genotype 
frequencies in the adult population. But it is first necessary to know what 
proportion of abnormals are homozygotes. This has been estimated as 
being approximately 2.9 per cent (Allison, 1954). Thus, when the
frequency of abnormals is 0.4, the frequency of homozygotes is 0.012, and
that of heterozygotes is 0.388. The gene frequency, then, by equation 1.1,
is the frequency of homozygotes plus half the frequency of heterozygotes,
which comes to q = 0.206. If this gene frequency is the equilibrium value
maintained by natural selection favouring the heterozygotes, and if we
assume mating to be random, then the gene frequency is related to the
selection coefficients by equation 2.16. The fitness of sickle-cell
homozygotes, relative to that of heterozygotes, has been estimated from a
comparison of viability and fertility as being approximately 0.25.
Therefore the coefficient of selection against homozygotes is $s_2 = 0.75$.
Substituting this value of $s_2$, and the value of q found above, in
equation 2.16 gives $s_1 = 0.197$. This is the coefficient of selection
against normal homozygotes, relative to heterozygotes. If we want to
express the selective advantage of heterozygotes as the superiority of
heterozygotes, relative to normal homozygotes, we may do so, since the
fitness of heterozygotes relative to normal homozygotes is $1/(1 - s_1)$.
This is 1.24. Thus the selective advantage to be attributed to the
resistance of heterozygotes to malaria, if these are the forces holding
the gene in equilibrium, is 24 per cent.
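
The arithmetic of this example can be retraced in a few lines of Python.
This is only an illustrative sketch. Equation 2.16 is not reproduced in
this passage and is assumed here to have the form q = s1/(s1 + s2); the
function and variable names are invented for the purpose.

```python
def equilibrium_selection(abnormal_freq, homozygote_share, s2):
    """Retrace Example 2.4 under the assumptions stated in the lead-in.

    abnormal_freq    -- frequency of "abnormals" among adults
    homozygote_share -- proportion of abnormals that are homozygotes
    s2               -- coefficient of selection against sickle-cell homozygotes
    """
    homozygotes = abnormal_freq * homozygote_share
    heterozygotes = abnormal_freq - homozygotes
    # Gene frequency: homozygotes plus half the heterozygotes (equation 1.1).
    q = homozygotes + 0.5 * heterozygotes
    # Assumed form of equation 2.16, q = s1 / (s1 + s2), solved for s1.
    s1 = q * s2 / (1.0 - q)
    # Fitness of heterozygotes relative to normal homozygotes.
    het_advantage = 1.0 / (1.0 - s1)
    return q, s1, het_advantage


q, s1, advantage = equilibrium_selection(0.40, 0.029, 0.75)
print(round(q, 3), round(s1, 3), round(advantage, 2))
```

Under these assumptions the sketch gives a gene frequency close to 0.206
and a heterozygote advantage close to 1.24. The value of s1 comes out a
little below the 0.197 quoted above, a difference small enough that it may
perhaps be attributed to rounding in the original working.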

The presence of the sickle-cell gene in American negroes can be 
attributed to their African origin. The gene's present frequency of 0.046,
deduced in the manner described above, can be accounted for partly by 
racial mixture and partly by the change of habitat which, removing the 
advantage of heterozygotes, has exposed the gene to the full power of the 
selection against homozygotes. 

As an example of polymorphism the sickle-cell gene is not altogether 
typical, because the differences of fitness are rather large and one of the 
genotypes is clearly abnormal. But it illustrates in an exaggerated form 
the nature of the selective forces that are presumed to underlie the more 
usual forms of polymorphism. 



CHAPTER 3

SMALL POPULATIONS:
I. Changes of Gene Frequency under
Simplified Conditions

We have now to consider the last of the agencies through which gene 
frequencies can be changed. This is the dispersive process, which 
differs from the systematic processes in being random in direction, 
and predictable only in amount. In order to exclude this process 
from the previous discussions we have postulated always a "large" 
population, and we have seen that in a large population the gene 
frequencies are inherently stable. That is to say, in the absence of 
migration, mutation, or selection, the gene and genotype frequencies 
remain unaltered from generation to generation. This property of 
stability does not hold in a small population, and the gene frequencies 
are subject to random fluctuations arising from the sampling of 
gametes. The gametes that transmit genes to the next generation 
carry a sample of the genes in the parent generation, and if the sample 
is not large the gene frequencies are liable to change between one 
generation and the next. This random change of gene frequency 
is the dispersive process. 

The dispersive process has, broadly speaking, three important 
consequences. The first is differentiation between sub-populations. 
The inhabitants of a large area seldom in nature constitute a single 
large population, because mating takes place more often between 
inhabitants of the same region. Natural populations are therefore 
more or less subdivided into local groups or sub-populations, and the 
sampling process tends to cause genetic differences between these, if 
the number of individuals in the groups is small. Domesticated or 
laboratory populations, in the same way, are often subdivided — for 
example, into herds or strains — and in them the subdivision and its 
resultant differentiation are often more marked. The second con- 
sequence is a reduction of genetic variation within a small population. 
The individuals of the population become more and more alike in 
genotype, and this genetic uniformity is the reason for the widespread
use of inbred strains of laboratory animals in physiological and allied
fields of research. (An inbred strain, it may be noted, is a small 
population.) The third consequence of the dispersive process is an 
increase in the frequency of homozygotes at the expense of hetero- 
zygotes. This, coupled with the general tendency for deleterious 
alleles to be recessive, is the genetic basis of the loss of fertility and 
viability that almost always results from inbreeding. To explain 
these three consequences of the dispersive process is the chief purpose 
of this chapter. 

There are two different ways of looking at the dispersive process 
and of deducing its consequences. One is to regard it as a sampling 
process and to describe it in terms of sampling variance. The other 
is to regard it as an inbreeding process and describe it in terms of the 
genotypic changes resulting from matings between related indi- 
viduals. Of these, the first is probably the simpler for a description 
of how the process works, but the second provides a more convenient 
means of stating the consequences. The plan to be followed here is 
first to describe the general nature of the dispersive process from the 
point of view of sampling. This will show how the three chief con- 
sequences come about. Then we shall approach the process afresh 
from the point of view of inbreeding, and show how the two view- 
points connect with each other. In all this we shall confine our 
attention to the simplest possible situation, excluding migration, 
mutation, and selection. Thus we shall see what happens in small 
populations in the absence of other factors influencing gene frequency. 
In the next chapter we shall extend the conclusions to more realistic 
situations, by removing the restrictive simplifications, and we shall in 
particular consider the joint effects of the dispersive process and the 
systematic processes. Finally, in Chapter 5, we shall consider the 
special cases of pedigreed populations, and very small populations 
maintained by regular systems of close inbreeding. 

The Idealised Population 

In order to reduce the dispersive process to its simplest form we 
imagine an idealised population as follows. We suppose there to be 
initially one large population in which mating is random, and this 
population becomes subdivided into a large number of sub-popula- 
tions. The subdivision might arise from geographical or ecological 
causes under natural conditions, or from controlled breeding in
domesticated or laboratory populations. The initial random-mating
population will be referred to as the base population, and the sub- 
populations will be referred to as lines. All the lines together consti- 
tute the whole population, and each line is a "small population" in 
which gene frequencies are subject to the dispersive process. When a 
single locus is under discussion we cannot properly understand what 
goes on in one line except by considering it as one of a large number 
of lines. But what happens to the genes at one locus in a number of 
lines happens equally to those at a number of loci in one line, pro- 
vided they all start at the same gene frequency. So the consequences 
of the process apply equally to a single line provided we consider 
many loci in it. 

The simplifying conditions specified for the idealised population 
are the following: 

1. Mating is restricted to members of the same line. The lines
are thus isolated in the sense that no genes can pass from one line to 
another. In other words migration is excluded. 

2. The generations are distinct and do not overlap. 

3. The number of breeding individuals in each line is the same for 
all lines and in all generations. Breeding individuals are those that
transmit genes to the next generation. 

4. Within each line mating is random, including self-fertilisation 
in random amount. 

5. There is no selection at any stage. 

6. Mutation is disregarded. 

Fig. 3.1. Diagrammatic representation of the subdivision of a
single large population (the base population) into a number of
sub-populations, or lines.

The situation implied by these conditions is represented
diagrammatically in Fig. 3.1, and may be described thus: All breeding
individuals contribute equally to a pool of gametes from which zygotes
will be formed. Union of gametes is strictly random. Out of a 
potentially large number of zygotes only a limited number survive to 
become breeding individuals in the next generation, and this is the 
stage at which the sampling of the genes transmitted by the gametes 
takes place. Survival of zygotes is random, and consequently the 
contribution of the parents to the next generation is not uniform, but 
varies according to the chances of survival of their progeny. Since 
the population size is constant from generation to generation, the 
average number of progeny that reach breeding age is one per 
individual parent or two per mated pair of parents. For any particular 
zygote the chance of survival is small, and therefore the number of 
progeny contributed by individual parents, or by pairs of parents, has 
a Poisson distribution. 

The following symbols will be used in connexion with the 
idealised population. 

N=the number of breeding individuals in each line and genera- 
tion. This is the population size. 
t = time, in generations, starting from the base population at $t_0$.
q = frequency of a particular allele at a locus.
p = 1 - q = frequency of all other alleles at that locus. q and p refer
to the frequencies in any one line; $\bar{q}$ and $\bar{p}$ refer to the
frequencies in the whole population and are the means of q and p;
$q_0$ and $p_0$ are the frequencies in the base population.


Variance of gene frequency. The change of gene frequency 
resulting from sampling is random in the sense that its direction is 
unpredictable. But its magnitude can be predicted in terms of the 
variance of the change. Consider the formation of the lines from 
the base population. Each line is formed from a sample of N in- 
dividuals drawn from the base population. Since each individual 
carries two genes at a locus, the sub-division of the population 
represents a series of samples each of 2N genes, drawn at random 
from the base population. The gene frequencies in these samples 
will have an average value equal to that in the base population, i.e.
$q_0$, and will be distributed about this mean with a variance
$p_0 q_0/2N$, which is simply the variance of a ratio, the sample size
being in this case 2N. Thus the change of gene frequency, $\Delta q$,
resulting from sampling in one generation, can be stated in terms of its
variance as

$$\sigma^2_{\Delta q} = \frac{p_0 q_0}{2N} \qquad (3.1)$$

This variance of Aq expresses the magnitude of the change of gene 
frequency resulting from the dispersive process. It expresses the 
expected change in any one line, or the variance of gene frequencies 
that would be found among many lines after one generation. Its 
effect is a dispersion of gene frequencies among the lines; in other 
words the lines come to differ in gene frequency, though the mean 
in the population as a whole remains unchanged. 
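
The sampling argument can be checked numerically. The following Python
sketch is an illustration only: it draws 2N genes at random for each of
many hypothetical lines and compares the variance of the resulting gene
frequencies with the value of $p_0 q_0/2N$. The function names, the number
of lines, and the seed are all assumptions made for the example.

```python
import random


def sampling_variance(q0, n_individuals):
    """Expected variance of the change of gene frequency, p0*q0 / 2N."""
    return q0 * (1.0 - q0) / (2 * n_individuals)


def simulate_one_generation(q0, n_individuals, n_lines, seed=1):
    """Form n_lines lines, each a random sample of 2N genes from the base."""
    rng = random.Random(seed)
    n_genes = 2 * n_individuals
    freqs = []
    for _ in range(n_lines):
        # Each sampled gene is the allele in question with probability q0.
        count = sum(1 for _ in range(n_genes) if rng.random() < q0)
        freqs.append(count / n_genes)
    mean = sum(freqs) / n_lines
    variance = sum((f - mean) ** 2 for f in freqs) / n_lines
    return mean, variance


expected = sampling_variance(0.5, 16)
mean, observed = simulate_one_generation(0.5, 16, n_lines=20000)
print(expected, mean, observed)
```

With a large number of lines the mean gene frequency should stay close to
the initial value while the observed variance approaches the expected one.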

In the next generation the sampling process is repeated, but each 
line now starts from a different gene frequency and so the second 
sampling leads to a further dispersion. The variance of the change 
now differs among the lines, since it depends on the gene frequency, 
$q_1$, in the first generation of each line separately. The effect of
continued sampling through successive generations is that each line
fluctuates irregularly in gene frequency, and the lines spread apart pro- 
gressively, thus becoming differentiated. The erratic changes of gene 
frequency shown by the individual lines are exemplified in Fig. 3.2;
and the consequent differentiation, or spreading apart, of the lines
in Fig. 3.3. These changes of gene frequency resulting from sampling
in small populations are known as random drift (Wright, 1931).

Fig. 3.2. Random drift of the colour gene "non-agouti" in three
lines of mice, each maintained by 6 pairs of parents per generation.
(Original data.)

Fig. 3.3. Distributions of gene frequencies in 19 consecutive
generations among 105 lines of Drosophila melanogaster, each of 16
individuals. The gene frequencies refer to two alleles at the
"brown" locus ($bw^{75}$ and bw), with initial frequencies of 0.5. The
height of each black column shows the number of lines having the
gene frequency shown on the scale below, which gives the number of
$bw^{75}$ genes. (From Buri, 1956; reproduced by courtesy of the author
and the editor of Evolution.)

Fig. 3.4. Variance of gene frequencies among lines in the experiment
illustrated in Fig. 3.3. The circles are the observed values,
and the smooth curve shows the expected variance as given by
equation 3.2. The value taken for N is 11.5, which is the "effective
number," $N_e$, as explained in the next chapter. (Data from Buri, 1956.)

As the dispersive process proceeds, the variance of gene frequency
among the lines increases, as shown in Fig. 3.4. At any generation, t,
the variance of gene frequencies, $\sigma^2_q$, among the lines is as
follows (see Crow, 1954):

$$\sigma^2_q = p_0 q_0 \left[ 1 - \left( 1 - \frac{1}{2N} \right)^t \right] \qquad (3.2)$$


Since the mean gene frequency among all the lines remains unchanged,
$\bar{q} = q_0$. We may note a fact that will be needed later, and is
obvious from equation 3.2, namely that $\sigma^2_q = \sigma^2_p$. The
dispersion of the gene frequencies, which we have described by reference
to one locus in many lines, could equally well be described by reference
to the frequencies at a number of different loci in one line, provided
they all started from the same initial frequency, and were unlinked.
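
As a rough illustration of equation 3.2, the expected variance among lines
can be tabulated generation by generation. The Python sketch below takes
the equation in the form $p_0 q_0 [1 - (1 - 1/2N)^t]$ and uses the values
N = 11.5 and $q_0$ = 0.5 mentioned in connexion with Fig. 3.4; the function
name is hypothetical.

```python
def variance_among_lines(q0, n, t):
    """Expected variance of gene frequency among lines after t generations."""
    p0 = 1.0 - q0
    return p0 * q0 * (1.0 - (1.0 - 1.0 / (2.0 * n)) ** t)


# Tabulate the expected variance for an effective number of 11.5.
for t in range(0, 20, 2):
    print(t, round(variance_among_lines(0.5, 11.5, t), 4))
```

The variance is zero at the start, equals $p_0 q_0/2N$ after one
generation, and rises toward the limit $p_0 q_0$ as t becomes large.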

Fixation. There are limits to the spreading apart of the lines that
can be brought about by the dispersive process. The gene frequency
cannot change beyond the limits of 0 or 1, and sooner or later each
line must reach one or other of these limits. Moreover, the limits are
"traps" or points of no return, because once the gene frequency has
reached 0 or 1 it cannot change any more in that line. When a
particular allele has reached a frequency of 1 it is said to be fixed in
that line, and when it reaches a frequency of 0 it is lost. When an
allele reaches fixation no other allele can be present in that line, and
the line may then be said to be fixed. When a line is fixed all
individuals in it are of identical genotype with respect to that locus.
Eventually all lines, and all loci in a line, become fixed. The
individuals of a line are then genetically identical, and this is the
basis of the genetic uniformity of highly inbred strains.

The proportion of the lines in which different alleles at a locus are
fixed is equal to the initial frequencies of the alleles. If the base
population contains two alleles $A_1$ and $A_2$ at frequencies $p_0$ and
$q_0$ respectively, then $A_1$ will be fixed in the proportion $p_0$ of the
lines, and $A_2$ in the remaining proportion, $q_0$. The variance of the
gene frequency among the lines is then $p_0 q_0$, as may be seen from
equation 3.2 by putting t equal to infinity. (In Fig. 3.3 the lines in
which fixation or loss has just occurred are shown, but not those in which
it occurred earlier.)
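
That an allele is eventually fixed in a proportion of lines equal to its
initial frequency can be illustrated by simulation. The Python sketch
below is a simplified model, not a reconstruction of any experiment: every
hypothetical line starts exactly at the initial frequency, 2N genes are
sampled afresh each generation, and sampling continues until the allele is
fixed or lost. All names, sizes, and the seed are assumptions.

```python
import random


def drift_to_fixation(q0, n_individuals, rng):
    """Follow one line until the allele is fixed (True) or lost (False)."""
    n_genes = 2 * n_individuals
    count = round(q0 * n_genes)          # genes of the allele in the line
    while 0 < count < n_genes:
        q = count / n_genes
        # Sample 2N genes at random from the current generation.
        count = sum(1 for _ in range(n_genes) if rng.random() < q)
    return count == n_genes


def fixation_proportion(q0, n_individuals, n_lines, seed=7):
    """Proportion of lines in which the allele reaches a frequency of 1."""
    rng = random.Random(seed)
    fixed = sum(drift_to_fixation(q0, n_individuals, rng)
                for _ in range(n_lines))
    return fixed / n_lines


proportion_fixed = fixation_proportion(q0=0.25, n_individuals=8, n_lines=2000)
print(proportion_fixed)
```

With an initial frequency of 0.25 the proportion of lines fixed for the
allele should come out near one quarter, the remainder having lost it.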

When concerned with the attainment of genetic uniformity one 
wants to know how soon fixation takes place; what is the probability 
of a particular locus being fixed, or what proportion of all loci in a 
line will be fixed, after a certain number of generations. Considera- 
tion of the progressive nature of the dispersion, as illustrated in Fig. 
3.3, will show that fixation does not start immediately; the dispersion of 
gene frequencies must proceed some way before any line is likely to 
reach fixation. To deduce the probability of fixation is mathematically
complicated (see particularly Wright, 1931; Kimura, 1955), and
only an outline of the conclusions can be given here. There are two 
phases in the dispersive process: during the initial phase the gene 
frequencies are spreading out from the initial value; this leads to a 
steady phase, when the gene frequencies are evenly spread out over 
the range between the two limits, and all gene frequencies except the 
two limits are equally probable. The duration of the initial phase
in generations is a small multiple of the population size, depending
on the initial gene frequency. With $q_0 = 0.5$ it lasts about 2N
generations, and with $q_0 = 0.1$ it lasts about 4N generations
(Kimura, 1955). (In the experiment illustrated in Fig. 3.3 it lasted
till about the seventeenth generation.) The theoretical distributions
of gene frequency during the initial phase, with original frequencies
of 0.5 and 0.1, are shown in Fig. 3.5.

Fig. 3.5. Theoretical distributions of gene frequency among lines.
The initial and mean gene frequency is 0.5 in the left hand figure,
and 0.1 in the right hand figure. Previously fixed lines are excluded.
N = population size; T = time in generations. Note the general
agreement of the left hand figure with the observed distributions
shown in Fig. 3.3. (From Kimura, 1955; reproduced by courtesy
of the author and the editor of the Proc. Nat. Acad. Sci. Wash.)

To visualise the process one might think of a pile of dry sand in a 
narrow trough open at the two ends. Agitation of the trough will 
cause the pile to spread out along the trough, till eventually it is 
evenly spread along its length. Toward the end of the spreading out 
some of the sand will have fallen off the ends of the trough, and this 
represents fixation and loss. Continued agitation after the sand is
evenly spread will cause it to fall off the ends at a steady rate, and the
depth of sand left in the trough will be continually reduced at a
steady rate until in the end none is left. The initial gene frequency is
represented by the position of the initial pile of sand. If it is near one
end of the trough, much of the sand will have fallen off that end
before any reaches the other end, and the total amount falling off each
end will be in proportion to the relative distance of the initial pile
from the two ends. Relating this model to the diagram of the process
in Fig. 3.5, the position along the trough represents the horizontal
axis, or gene frequency, and the depth of the sand represents the
vertical axis, or the probability of a line having a particular gene
frequency. The graphs are thus analogous to longitudinal sections
through the trough and its sand.

Fig. 3.6. Fixation and loss occurring among 107 lines of Drosophila
melanogaster, during 19 generations. This is not the same
experiment as that illustrated in Figs. 3.3 and 3.4, but was similar
in nature. There were 16 parents per generation in each line, and
the effective number (see chapter 4) was 9. The closed circles
show the percentage of lines in which the $bw^{75}$ allele has become
fixed; the open circles show the percentage in which it has been
lost and the bw allele fixed. The smooth curve is the expected
amount of fixation of one or other allele, computed from the
effective number by equation 3.3. (Data from Buri, 1956.)

The probability of fixation at any time during the initial phase is 
too complicated for explanation here, and the reader is referred to 
the papers of Kimura (1954, 1955). After the steady phase has been 
reached fixation proceeds at a constant rate: a proportion 1/2N of the
lines previously unfixed become fixed in each generation. The
proportion of lines in which a gene with initial frequency $q_0$ is
expected to be fixed, lost, or to be still segregating is as follows
(Wright, 1952a):

$$\begin{aligned}
\text{fixed:} &\quad q_0 - 3 p_0 q_0 P \\
\text{lost:} &\quad p_0 - 3 p_0 q_0 P \\
\text{neither:} &\quad 6 p_0 q_0 P
\end{aligned} \qquad (3.3)$$

where $P = \left( 1 - \frac{1}{2N} \right)^t$.

Fig. 3.6 shows the progress of fixation and loss in an experiment 
with Drosophila. 
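
The steady-phase expressions for fixation and loss lend themselves to a
short calculation. In the Python sketch below, P is taken to be
$(1 - 1/2N)^t$, an assumption consistent with the constant rate of 1/2N per
generation stated above; the function name and the choice of N = 9,
$q_0$ = 0.5, and t = 19 are for illustration only. The expressions apply
only after the steady phase has been reached, and give meaningless
(negative) values for fixation if applied to the earliest generations.

```python
def fixation_state(q0, n, t):
    """Expected proportions of lines fixed, lost, and still segregating.

    Steady-phase approximation; P = (1 - 1/2N)**t is an assumed form.
    """
    p0 = 1.0 - q0
    big_p = (1.0 - 1.0 / (2.0 * n)) ** t
    fixed = q0 - 3.0 * p0 * q0 * big_p
    lost = p0 - 3.0 * p0 * q0 * big_p
    neither = 6.0 * p0 * q0 * big_p
    return fixed, lost, neither


fixed, lost, neither = fixation_state(q0=0.5, n=9, t=19)
print(round(fixed, 3), round(lost, 3), round(neither, 3))
```

The three proportions always sum to one, and with an initial frequency of
0.5 the proportions fixed and lost are equal.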

Genotype frequencies. Change of gene frequency leads to 
change of genotype frequencies; so the genotype frequencies in small 
populations follow the changes of gene frequency resulting from the 
dispersive process. In the idealised population, which we are still 
considering, mating is random within each of the lines. Consequently 
the genotype frequencies in any one line are the Hardy-Weinberg
frequencies appropriate to the gene frequency in the previous genera- 
tion of that line. As the lines drift apart in gene frequency they 
become differentiated also in genotype frequencies. But differentia- 
tion is not the only aspect of the change: the general direction of the 
change is toward an increase of homozygous, and a decrease of 
heterozygous, genotypes. The reason for this is the dispersion of gene 
frequencies from intermediate values toward the extremes. Hetero- 
zygotes are most frequent at intermediate gene frequencies (see 
Fig. 1.1), so the drift of gene frequencies toward the extremes leads, 
on the average, to a decline in the frequency of heterozygotes. 

The genotype frequencies in the population as a whole can be 
deduced from a knowledge of the variance of gene frequencies in the 
following way. If an allele has a frequency q in one particular line,
homozygotes of that allele will have a frequency of $q^2$ in that line.
The frequency of these homozygotes in the population as a whole will
therefore be the mean value of $q^2$ over all lines. We shall write this
mean frequency of homozygotes as $\overline{(q^2)}$. The value of
$\overline{(q^2)}$ can be found from a knowledge of the variance of gene
frequencies among the lines, by noting that the variance of a set of
observations is found by deducting the square of the mean from the mean
of the squared observations. Thus

$$\sigma^2_q = \overline{(q^2)} - \bar{q}^2$$

and

$$\overline{(q^2)} = \bar{q}^2 + \sigma^2_q \qquad (3.4)$$

where $\sigma^2_q$ is the variance of gene frequencies among the lines, as
given in equation 3.2, and $\bar{q}^2$ is the square of the mean gene
frequency. Since the mean gene frequency, $\bar{q}$, is equal to the
original, $q_0$, it follows that $\bar{q}^2$ or $q_0^2$ is the original
frequency of homozygotes in the base population. Thus in the population
as a whole the frequency of homozygotes of a particular allele increases,
and is always in excess of the original frequency by an amount equal to
the variance of the gene frequency among the lines. In a two-allele
system the same applies to the other allele, and the frequency of
heterozygotes is reduced correspondingly. Noting from equation 3.2 that
$\sigma^2_p = \sigma^2_q$ we therefore find the genotypic frequencies for
a locus with two alleles as follows:


Genotype        Frequency in whole population

$A_1A_1$        $p_0^2 + \sigma^2_q$
$A_1A_2$        $2 p_0 q_0 - 2\sigma^2_q$
$A_2A_2$        $q_0^2 + \sigma^2_q$

These genotype frequencies are no longer the Hardy-Weinberg
frequencies appropriate to the original or mean gene frequency. The
Hardy-Weinberg relationships between gene frequency and genotype
frequencies, though they hold good within each line separately, do 
not hold if the lines are taken together and regarded as a single 
population. This fact causes some difficulty in relating gene and 
genotype frequencies in natural populations, because they are often 
more or less subdivided and the degree of subdivision is seldom 
known. An example of the decrease of heterozygotes resulting from 
the dispersion of gene frequencies is shown in Fig. 3.7. 
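
The genotype frequencies in the population as a whole follow directly from
the variance of gene frequency, and a few lines of Python make the decline
of heterozygotes easy to inspect. The sketch assumes the variance of
equation 3.2 in the form $p_0 q_0 [1 - (1 - 1/2N)^t]$; the function name and
the values of N and $q_0$ are chosen only for illustration.

```python
def whole_population_genotypes(q0, n, t):
    """Genotype frequencies over all lines taken together, at generation t."""
    p0 = 1.0 - q0
    # Variance of gene frequency among lines (assumed form of equation 3.2).
    variance = p0 * q0 * (1.0 - (1.0 - 1.0 / (2.0 * n)) ** t)
    return {
        "A1A1": p0 ** 2 + variance,
        "A1A2": 2.0 * p0 * q0 - 2.0 * variance,
        "A2A2": q0 ** 2 + variance,
    }


for t in (0, 5, 10, 19):
    freqs = whole_population_genotypes(q0=0.5, n=11.5, t=t)
    print(t, {g: round(f, 3) for g, f in freqs.items()})
```

At generation 0 the frequencies are those of the base population; as t
increases each homozygote gains what the heterozygotes lose, and the three
frequencies continue to sum to one.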

The foregoing account of genotype frequencies describes the 
situation in terms of one locus in many lines. It can be regarded 
equally as referring to many loci in one line; then the change in any 
one line or small population is an increase in the number of loci at
which individuals are homozygous and a corresponding decrease in
the number at which they are heterozygous — in short an increase of 
homozygotes at the expense of heterozygotes. This change of geno- 
type frequencies resulting from the dispersive process is the genetic 
basis of the phenomenon of inbreeding depression, of which a full 
explanation will be found in Chapter 14. 



Fig. 3.7. Change of frequency of heterozygotes among 105 lines
of Drosophila melanogaster, each with 16 parents. The same
experiment as is illustrated in Figs. 3.3 and 3.4. The frequency of
heterozygotes refers to the population as a whole, all lines taken
together. The smooth curve is the expected frequency of
heterozygotes. (Data from Buri, 1956.)

We have now surveyed the general nature of the dispersive process 
and its three major consequences — differentiation of sub-populations, 
genetic uniformity within sub-populations, and overall increase in 
the frequency of homozygous genotypes. Let us now look at the 
process from another viewpoint, as an inbreeding process. Instead 
of regarding the increase of homozygotes as a consequence of the 
dispersion of gene frequencies, we shall now look directly at the 
manner in which the additional homozygotes arise. 



Inbreeding means the mating together of individuals that are 
related to each other by ancestry. That the degree of relationship 
between the individuals in a population depends on the size of the 
population will be clear by consideration of the numbers of possible 
ancestors. In a population of bisexual organisms every individual 
has two parents, four grand-parents, eight great-grandparents, etc., 
and t generations back it has $2^t$ ancestors. Not very many generations
back the number of individuals required to provide separate ancestors 
for all the present individuals becomes larger than any real popula- 
tion could contain. Any pair of individuals must therefore be related 
to each other through one or more common ancestors in the more or 
less remote past; and the smaller the size of the population in previous 
generations the less remote are the common ancestors, or the greater 
their number. Thus pairs mating at random are more closely related 
to each other in a small population than in a large one. This is why 
the properties of small populations can be treated as the consequences 
of inbreeding. 
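
How quickly the number of potential ancestors outruns any real population
can be seen from a small calculation. The Python sketch below simply finds
the first generation back at which $2^t$ exceeds a given population size;
the function name and the population sizes are hypothetical.

```python
def generations_until_ancestors_exceed(population_size):
    """Smallest t for which 2**t exceeds the given population size."""
    t = 0
    while 2 ** t <= population_size:
        t += 1
    return t


for size in (1_000, 1_000_000, 1_000_000_000):
    print(size, generations_until_ancestors_exceed(size))
```

Even for a population of a thousand million the number of separate
ancestors required is exceeded within thirty generations.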

The essential consequence of two individuals having a common 
ancestor is that they may both carry replicates of one of the genes 
present in the ancestor; and if they mate they may pass on these 
replicates to their offspring. Thus inbred individuals — that is to 
say, offspring produced by inbreeding — may carry two genes at a 
locus that are replicates of one and the same gene in a previous 
generation. Consideration of this consequence of inbreeding shows 
that there are two sorts of identity among allelic genes, and two sorts 
of homozygote. The sort of identity we have hitherto considered is a 
functional identity. Two genes are regarded as being identical if they 
are not recognisably different in their phenotypic effects, or by any 
other functional criterion; in other words, if they have the same 
allelomorphic state. Following the terminology of Crow (1954) they
may be called alike in state. An individual carrying a pair of such genes 
is a homozygote in the ordinary sense. The new sort of identity is 
one of replication. If two genes originated from the replication of one 
gene in a previous generation, they may be said to be identical by 
descent, or simply identical. An individual possessing two identical 
genes at a locus may be called an identical homozygote. Genes that 
are not identical by descent may be called independent, whether they
are alike in state or different alleles; and homozygotes of independent
genes may be called independent homozygotes. 

Identity by descent provides the basis for a measure of the dis- 
persive process, through the degree of relationship between the 
mating pairs. The measure is the coefficient of inbreeding, which is the 
probability that the two genes at any locus in an individual are identi- 
cal by descent. It refers to an individual and expresses the degree of 
relationship between the individual's parents. If the parents mated 
at random then the coefficient of inbreeding of the progeny is the 
probability that two gametes taken at random from the parent 
generation carry identical genes at a locus. The coefficient of in- 
breeding, generally symbolised by F, was first defined by Wright 
(1922) as the correlation between uniting gametes; the definition 
given here, which follows that of Malécot (1948) and Crow (1954), is
equivalent.

The degree of relationship expressed in the inbreeding coefficient 
is essentially a comparison between the population in question and 
some specified or implied base population. Without this point of 
reference it is meaningless, as the following consideration will show. 
On account of the limitation in the number of independent ancestors 
in any population not infinitely large, all genes now present at a locus 
in the population would be found to be identical by descent if traced 
far enough back into the remote past. Therefore the inbreeding 
coefficient only becomes meaningful if we specify some time in the 
past beyond which ancestries will not be pursued, and at which all 
genes present in the population are to be regarded as independent — 
that is, not identical by descent. This point is the base population 
and by its definition it has an inbreeding coefficient of zero. The 
inbreeding coefficient of a subsequent generation expresses the 
amount of the dispersive process that has taken place since the base 
population, and compares the degree of relationship between the 
individuals now, with that between individuals in the base population. 
Reference to the base population is not always explicitly stated, but is 
always implied. For example, we can speak of the inbreeding coeffi- 
cient of a population subdivided into lines. The comparison of 
relationship is between the individuals of a line and individuals 
taken at random from the whole population. The base population 
implied is a hypothetical population from which all the lines were
derived.

Inbreeding in the idealised population. Let us now return to
the idealised population and deduce the coefficient of inbreeding in
successive generations, starting with the base population and its
progeny constituting generation 1. The situation may be visualised
by thinking of a hermaphrodite marine organism, capable of
self-fertilisation, shedding eggs and sperm into the sea. There are N
individuals each shedding equal numbers of gametes which unite at
random. All the genes at a locus in the base population have to be
regarded as being non-identical; so, considering only one locus,
among the gametes shed by the base population there are 2N different
sorts, in equal numbers, bearing the genes $A_1$, $A_2$, $A_3$, etc. at
the A locus. The gametes of any one sort carry identical genes; those of
different sort carry genes of independent origin. What is the
probability that a pair of gametes taken at random carry identical genes?
This is the inbreeding coefficient of generation 1. Any gamete has a
chance of 1/2N of uniting with another of the same sort, so 1/2N is the
probability that uniting gametes carry identical genes, and is thus the
coefficient of inbreeding of the progeny. Now consider the second
generation. There are now two ways in which identical homozygotes can
arise, one from the new replication of genes and the other
from the previous replication. The probability of newly replicated
genes coming together in a zygote is again 1/2N. The remaining
proportion, 1 - 1/2N, of zygotes carry genes that are independent in
their origin from generation 1, but may have been identical in their
origin from generation 0. The probability of their identical origin in
generation 0 is what we have already deduced as the inbreeding
coefficient of generation 1. Thus the total probability of identical
homozygotes in generation 2 is

$$F_2 = \frac{1}{2N} + \left( 1 - \frac{1}{2N} \right) F_1$$

where $F_1$ and $F_2$ stand for the inbreeding coefficients of generations
1 and 2 respectively. The same argument applies to subsequent
generations, so that in general the inbreeding coefficient of individuals
in generation t is

$$F_t = \frac{1}{2N} + \left( 1 - \frac{1}{2N} \right) F_{t-1}$$

Thus the inbreeding coefficient is made up of two parts: an "incre- 
ment," i/zN, attributable to the new inbreeding, and a "remainder," 
attributable to the previous inbreeding and having the inbreeding
coefficient of the previous generation. In the idealised population the
"new inbreeding" arises from self-fertilisation, which brings together 
genes replicated in the immediately preceding generation. Exclusion 
of self-fertilisation simply shifts the replication one generation 
further back, so that the "new inbreeding" brings together genes 
replicated in the grand-parental generation; the coefficient of in- 
breeding is affected, but not very much, as we shall see later. The 
distinction between "new" and "old" inbreeding brings clearly to 
light a point which we note here in passing because it will be needed 
later and is often important in practice: if there is no "new inbreed- 
ing," as would happen if the population size were suddenly increased, 
the previous inbreeding is not undone, but remains where it was 
before the increase of population size. 

Let us call the "increment" or "new inbreeding" $\Delta F$, so that

$$\Delta F = \frac{1}{2N} \qquad (3.7)$$

Equation 3.6 may then be rewritten in the form

$$F_t = \Delta F + (1 - \Delta F) F_{t-1} \qquad (3.8)$$

Further rearrangement makes clearer the precise meaning of the
"increment," $\Delta F$.

$$\Delta F = \frac{F_t - F_{t-1}}{1 - F_{t-1}} \qquad (3.9)$$

From the equation written thus we see that the "increment," $\Delta F$,
measures the rate of inbreeding in the form of a proportionate increase.
It is the increase of the inbreeding coefficient in one generation, relative
to the distance that was still to go to reach complete inbreeding.
This measure of the rate of inbreeding provides a convenient way of
going beyond the restrictive simplifications of the idealised population,
and it thus provides a means of comparing the inbreeding effects
of different breeding systems. When the inbreeding coefficient is
expressed in terms of $\Delta F$, equation 3.8 is valid for any breeding system
and is not restricted to the idealised population, though only in the
idealised population is $\Delta F$ equal to $1/2N$.

So far we have done no more than relate the inbreeding coefficient 
in one generation to that of the previous generation. It remains to 
extend equation 3.8 back to the base population and so express the 
inbreeding coefficient in terms of the number of generations. This is
made easier by the use of a symbol, $P$, for the complement of the
inbreeding coefficient, $1 - F$, which is known as the panmictic index.
Substitution of $P = 1 - F$ in equation 3.8 gives

$$\frac{P_t}{P_{t-1}} = 1 - \Delta F \qquad (3.10)$$

Thus the panmictic index is reduced by a constant proportion in
each generation. Extension back to generation $t - 2$ gives

$$\frac{P_t}{P_{t-2}} = (1 - \Delta F)^2$$

and extension back to the base population gives

$$P_t = (1 - \Delta F)^t P_0 \qquad (3.11)$$

where $P_0$ is the panmictic index of the base population. The base
population is defined as having an inbreeding coefficient of 0, and
therefore a panmictic index of 1. The inbreeding coefficient in any
generation, $t$, referred to the base population, is therefore

$$F_t = 1 - (1 - \Delta F)^t \qquad (3.12)$$
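The step from the recurrence to the closed form can be checked numerically. The following is a minimal sketch, not part of the original text: it iterates F_t = dF + (1 - dF) F_{t-1} from F_0 = 0 and compares each value with F_t = 1 - (1 - dF)^t. The function names and the population size N = 10 are assumptions chosen for illustration only.

```python
def inbreeding_by_recurrence(delta_f, generations):
    """Iterate F_t = dF + (1 - dF) * F_{t-1}, starting from F_0 = 0.

    Returns the list [F_0, F_1, ..., F_generations].
    """
    f = 0.0  # the base population has F = 0 by definition
    history = [f]
    for _ in range(generations):
        f = delta_f + (1.0 - delta_f) * f
        history.append(f)
    return history


def inbreeding_closed_form(delta_f, t):
    """Closed form referred to the base population: F_t = 1 - (1 - dF)^t."""
    return 1.0 - (1.0 - delta_f) ** t


if __name__ == "__main__":
    n = 10                       # hypothetical idealised population size
    delta_f = 1.0 / (2 * n)      # rate of inbreeding in the idealised population
    steps = inbreeding_by_recurrence(delta_f, 20)
    for t in (1, 2, 5, 10, 20):
        print(t, round(steps[t], 4), round(inbreeding_closed_form(delta_f, t), 4))
```

The two columns printed agree at every generation, which is simply the statement that the panmictic index falls by the constant factor 1 - dF in each generation.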

The consequences of the dispersive process were described earlier 
from the viewpoint of sampling variance. Let us now look again at 
them, applying the rate of inbreeding and the inbreeding coefficient 
as measures of the process. Strictly speaking we should refer still 
to the idealised population, but the equating of the two viewpoints 
can be regarded as generally valid except in some very special and 
unlikely circumstances (see Crow, 1954). 

Variance of gene frequency. First, the variance of the change
of gene frequency in one generation, taken from equation 3.1 and
expressed in terms of the rate of inbreeding, becomes

$$\sigma^2_{\Delta q} = \frac{p_0 q_0}{2N} = p_0 q_0 \Delta F \qquad (3.13)$$

Similarly, the variance of gene frequencies among the lines at
generation $t$, taken from equation 3.2 and expressed in terms of the
inbreeding coefficient from 3.12, becomes

$$\sigma^2_q = p_0 q_0 \left[1 - \left(1 - \frac{1}{2N}\right)^t\right] = p_0 q_0 F \qquad (3.14)$$

Thus $\Delta F$ expresses the rate of dispersion and $F$ the cumulated effect
of random drift.
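The equivalence of the two viewpoints can be illustrated by an exact calculation for a very small idealised population. The sketch below is an illustration only, under the assumption that each generation is formed by binomial sampling of 2N gametes from the previous one. It follows the full probability distribution of the gene count through t generations and compares the resulting variance of gene frequency with p0 q0 F, where F = 1 - (1 - 1/2N)^t. The function names and the numbers used (2N = 10, initial frequency 0.4, 5 generations) are hypothetical.

```python
from math import comb


def drift_distribution(two_n, start_count, generations):
    """Exact distribution of the gene count among lines after repeated
    binomial sampling of 2N gametes per generation."""
    dist = [0.0] * (two_n + 1)
    dist[start_count] = 1.0          # every line starts at the same frequency
    for _ in range(generations):
        new = [0.0] * (two_n + 1)
        for i, prob_i in enumerate(dist):
            if prob_i == 0.0:
                continue
            q = i / two_n            # gene frequency in lines holding i copies
            for j in range(two_n + 1):
                new[j] += prob_i * comb(two_n, j) * q ** j * (1.0 - q) ** (two_n - j)
        dist = new
    return dist


def variance_of_frequency(dist):
    """Variance of gene frequency among lines, from the exact distribution."""
    two_n = len(dist) - 1
    mean = sum(p * j / two_n for j, p in enumerate(dist))
    return sum(p * (j / two_n - mean) ** 2 for j, p in enumerate(dist))


def variance_from_inbreeding(q0, two_n, generations):
    """p0 * q0 * F, with F = 1 - (1 - 1/2N)^t."""
    f = 1.0 - (1.0 - 1.0 / two_n) ** generations
    return (1.0 - q0) * q0 * f


if __name__ == "__main__":
    dist = drift_distribution(10, 4, 5)
    print(variance_of_frequency(dist), variance_from_inbreeding(0.4, 10, 5))
```

Under these assumptions the two printed values coincide apart from rounding error, since the expected value of pq falls by the factor 1 - 1/2N in every generation of binomial sampling.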

Genotype frequencies. Leaving fixation aside for the moment, 
let us consider next the genotype frequencies in the population as a 
whole. The genotype frequencies expressed in terms of the variance 
of gene frequency in equations 3.5 can be rewritten in terms of the 
coefficient of inbreeding from equation 3.14. The frequency of $A_2A_2$,
for example, is

$$\overline{q^2} = \bar{q}^2 + \sigma^2_q = q_0^2 + p_0 q_0 F$$

The genotype frequencies expressed in this way are entered in the
left-hand side of Table 3.1. As was explained before, this way of
writing the genotype frequencies shows how the homozygotes
increase at the expense of the heterozygotes.

Table 3.1

Genotype frequencies for a locus with two alleles, expressed
in terms of the inbreeding coefficient, F.

            Original       Change due
Genotype    frequencies    to inbreeding          Independent            Identical

$A_1A_1$    $p_0^2$        $+\ p_0 q_0 F$     $=\ p_0^2 (1 - F)$     $+\ p_0 F$
$A_1A_2$    $2 p_0 q_0$    $-\ 2 p_0 q_0 F$   $=\ 2 p_0 q_0 (1 - F)$
$A_2A_2$    $q_0^2$        $+\ p_0 q_0 F$     $=\ q_0^2 (1 - F)$     $+\ q_0 F$

Recognition of identity
by descent to which the inbreeding viewpoint led us means that we 
can now distinguish the two sorts of homozygote, identical and 
independent, among both the $A_1A_1$ and $A_2A_2$ genotypes. The
frequency of identical homozygotes among both genotypes together is
by definition the inbreeding coefficient, $F$; and it is clear that the
division between the two genotypes is in proportion to the initial
gene frequencies. So $p_0F$ is the frequency of $A_1A_1$ identical
homozygotes, and $q_0F$ that of $A_2A_2$ identical homozygotes. The remaining
genotypes, both homozygotes and heterozygotes, carry genes that are 
independent in origin and are therefore the equivalent of pairs of 
gametes taken at random from the population as a whole. Their 
frequencies are therefore the Hardy- Weinberg frequencies. Thus, 
from the inbreeding viewpoint, we arrive at the genotype frequencies 
shown in the right-hand columns of Table 3.1. This way of writing 
the genotype frequencies shows how homozygotes are divided between
those of independent and those of identical origin. The
equivalence of the two ways of expressing the genotype frequencies
can be verified from their algebraic identity. Both ways show equally
clearly how the heterozygotes are reduced in frequency in proportion
to $1 - F$. The term "heterozygosity" is often used to express the
frequency of heterozygotes at any time, relative to their frequency in
the base population. The heterozygosity is the same as the panmictic
index, $P$. Thus if $H_t$ and $H_0$ are the frequencies of heterozygotes
for a pair of alleles at generation $t$ and in the base population
respectively, then the heterozygosity at generation $t$ is

$$\frac{H_t}{H_0} = P_t \qquad (3.15)$$
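The algebraic identity of the two ways of writing the genotype frequencies is easily verified numerically. The sketch below is an illustration only (the function name and the values p0 = 0.3, F = 0.25 are hypothetical): it computes the frequencies both as "original frequency plus change due to inbreeding" and as "independent plus identical" and shows that they agree, that they sum to 1, and that the heterozygotes are reduced in proportion to 1 - F.

```python
def genotype_frequencies(p0, f):
    """Return the genotype frequencies at inbreeding coefficient f, written
    in the two equivalent ways of Table 3.1."""
    q0 = 1.0 - p0
    # Original Hardy-Weinberg frequency plus the change due to inbreeding.
    by_change = {
        "A1A1": p0 ** 2 + p0 * q0 * f,
        "A1A2": 2 * p0 * q0 - 2 * p0 * q0 * f,
        "A2A2": q0 ** 2 + p0 * q0 * f,
    }
    # Genes of independent origin plus genes identical by descent.
    by_origin = {
        "A1A1": p0 ** 2 * (1 - f) + p0 * f,
        "A1A2": 2 * p0 * q0 * (1 - f),
        "A2A2": q0 ** 2 * (1 - f) + q0 * f,
    }
    return by_change, by_origin


if __name__ == "__main__":
    by_change, by_origin = genotype_frequencies(0.3, 0.25)
    for genotype in by_change:
        print(genotype, round(by_change[genotype], 4), round(by_origin[genotype], 4))
    # Heterozygosity relative to the base population equals 1 - F.
    print(by_change["A1A2"] / (2 * 0.3 * 0.7))
```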

Fixation. There is little to add, from the inbreeding viewpoint, to 
the description of fixation given earlier. The rate of fixation — that 
is the proportion of unfixed loci that become fixed in any generation — 
is equal to $\Delta F$, after the steady phase has been reached and the
distribution of gene frequencies has become flat. The quantity $P$ in
equations 3.3, which give the probability of a gene having become
fixed or lost, is equal to $1 - F$. We may note, however, that the
probability of fixation is not very different from the inbreeding 
coefficient itself. The explanation comes more readily by considering 
the probability that a locus remains unfixed. This probability was 
given in equation 3.3 for a locus with two alleles after enough genera- 
tions have passed to take the population into the steady phase. 
Expressed in terms of the inbreeding coefficient, from equation 3.12,
it is $6p_0q_0(1 - F)$. Now, the value of $p_0q_0$ does not change very much
over quite a wide range of gene frequencies, and so the probability
that a locus is still unfixed is not very sensitive to the initial gene
frequency. The value of $6p_0q_0$ lies between 1.0 and 1.5 over a range
of gene frequency from 0.2 to 0.8, a range that is likely to cover many
situations. Consequently the probability that a line still segregates,
or the proportion of loci expected to remain unfixed, is likely to lie
between $(1 - F)$ and $1.5(1 - F)$. Thus the inbreeding coefficient gives
a good idea of the approximate probability of fixation, even in the
absence of a knowledge of the initial gene frequencies. That the
approximation may be quite close enough for practical purposes may
be seen by taking a specific example. In work involving immunological
reactions it may be necessary to produce a strain in which all
loci that determine the reactions have been fixed. One therefore
wants to know the inbreeding coefficient necessary to raise the
probability of fixation, or the proportion of loci expected to be fixed,
to a certain level, say 90 per cent. The inbreeding coefficient needed
to do this would, on the above considerations, lie between 0.90 and
0.93, and this would answer the question with quite enough accuracy
for most purposes.
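The arithmetic behind the limits 0.90 and 0.93 can be set out as follows. This is a sketch only, assuming the steady-phase expression 6 p0 q0 (1 - F) for the probability that a locus is still unfixed; that expression holds only after the steady phase has been reached, so the functions are not meaningful at low values of F. The function names are hypothetical.

```python
def probability_unfixed(p0, f):
    """Steady-phase probability that a locus with two alleles is still
    unfixed: 6 * p0 * q0 * (1 - F)."""
    q0 = 1.0 - p0
    return 6.0 * p0 * q0 * (1.0 - f)


def inbreeding_needed(p0, proportion_fixed):
    """Inbreeding coefficient at which the expected proportion of fixed
    loci reaches proportion_fixed, by solving 6 p0 q0 (1 - F) = 1 - proportion_fixed."""
    q0 = 1.0 - p0
    return 1.0 - (1.0 - proportion_fixed) / (6.0 * p0 * q0)


if __name__ == "__main__":
    # Initial gene frequencies spanning the range 0.2 to 0.8 discussed in the text.
    for p0 in (0.2, 0.35, 0.5, 0.65, 0.8):
        print(p0, round(inbreeding_needed(p0, 0.90), 3))
```

With initial frequencies between 0.2 and 0.8 the values obtained fall between about 0.90 and 0.93, the higher figure applying to a gene frequency of 0.5.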



II. Less Simplified Conditions 

In order to simplify the description of the dispersive process we 
confined our attention in the last chapter to an idealised population, 
and to do this we had to specify a number of restrictive conditions, 
which could seldom be fulfilled in real populations. The purpose of 
this chapter is to adapt the conclusions of the last chapter to situations 
in which the conditions imposed do not hold; in other words to 
remove the more serious restrictions and bring the conclusions closer 
to reality. The restrictive conditions were of two sorts, one sort 
being concerned with the breeding structure of the population and 
the other excluding mutation, migration, and selection from con- 
sideration. We shall first describe the effects of deviations from the 
idealised breeding structure, and then consider the outcome of the 
dispersive process when mutation, migration, or selection are oper- 
ating at the same time. 

Effective Population Size 

If the breeding structure does not conform to that specified for 
the idealised population, it is still possible to evaluate the dispersive 
process in terms of either the variance of gene frequencies or the rate 
of inbreeding. This can be done by the same general methods and 
no new principles are involved. We shall therefore give the con- 
clusions briefly and without detailed explanation. The most con- 
venient way of dealing with any particular deviation from the 
idealised breeding structure is to express the situation in terms of 
the effective number of breeding individuals, or the effective population 
size. This is the number of individuals that would give rise to the 
sampling variance or the rate of inbreeding appropriate to the con- 
ditions under consideration, if they bred in the manner of the 
idealised population. Thus, by converting the actual number, $N$, to
the effective number, $N_e$, we can apply the formulae deduced in the
last chapter. The rate of inbreeding, for example, is

$$\Delta F = \frac{1}{2N_e} \qquad (4.1)$$

just as for the idealised population $\Delta F = 1/2N$ (equation 3.7).

The relationships between actual and effective numbers in the 
situations most commonly met with are given below. The exact 
expressions are often complicated, but in most circumstances an 
approximation can be used with sufficient accuracy. We should first 
note that the actual number, $N$, refers to breeding individuals (the
breeding individuals of one generation), and it therefore cannot be
obtained directly from a census, unless the different age-groups are
distinguished.
Bisexual organisms: self-fertilisation excluded. The exclusion
of self-fertilisation makes very little difference to the rate of
inbreeding, unless $N$ is very small, as with close inbreeding. The
relationship of effective to actual numbers (Wright, 1931) is

$$N_e = N + \tfrac{1}{2} \quad \text{(approx.)} \qquad (4.2)$$

and the rate of inbreeding is

$$\Delta F = \frac{1}{2N + 1} \quad \text{(approx.)} \qquad (4.3)$$

The exact expression for the inbreeding coefficient in a bisexual 
population, and its derivation, are given by Malecot (1948). 

Different numbers of males and females. In domestic and 
laboratory animals the sexes are often unequally represented among 
the breeding individuals, since it is more economical, when possible, 
to use fewer males than females. The two sexes, however, whatever 
their relative numbers, contribute equally to the genes in the next 
generation. Therefore the sampling variance attributable to the two 
sexes must be reckoned separately. Since the sampling variance is 
proportional to the reciprocal of the number, the effective number is
twice the harmonic mean of the numbers of the two sexes (Wright,
1931), so that

$$\frac{1}{N_e} = \frac{1}{4N_m} + \frac{1}{4N_f} \qquad (4.4)$$

where $N_m$ and $N_f$ are the actual numbers of males and females
respectively. The rate of inbreeding is then

$$\Delta F = \frac{1}{8N_m} + \frac{1}{8N_f} \quad \text{(approx.)} \qquad (4.5)$$

This gives a close enough approximation unless both $N_m$ and $N_f$ are
very small, as with close inbreeding. It should be noted that the rate
of inbreeding depends chiefly on the numbers of the less numerous
sex. For example, if a population were maintained with an
indefinitely large number of females but only one male in each
generation, the effective number would be only about 4.
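The dependence of the effective number on the less numerous sex can be seen from a short calculation. The sketch below is illustrative only: it takes the effective number as twice the harmonic mean of the numbers of the two sexes, 1/Ne = 1/(4 Nm) + 1/(4 Nf), and the approximate rate of inbreeding as 1/(8 Nm) + 1/(8 Nf). The function names and the numbers tried are hypothetical.

```python
def effective_number_two_sexes(n_males, n_females):
    """Twice the harmonic mean of the numbers of the two sexes:
    1/Ne = 1/(4 Nm) + 1/(4 Nf), i.e. Ne = 4 Nm Nf / (Nm + Nf)."""
    return 4.0 * n_males * n_females / (n_males + n_females)


def rate_of_inbreeding_two_sexes(n_males, n_females):
    """Approximate rate of inbreeding, dF = 1/(8 Nm) + 1/(8 Nf)."""
    return 1.0 / (8.0 * n_males) + 1.0 / (8.0 * n_females)


if __name__ == "__main__":
    # One male with ever larger numbers of females: Ne approaches 4.
    for n_females in (1, 10, 100, 10_000):
        print(n_females, round(effective_number_two_sexes(1, n_females), 3))
    # Equal numbers of the two sexes: Ne equals the actual number.
    print(effective_number_two_sexes(50, 50))
```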

Unequal numbers in successive generations. The rate of
inbreeding in any one generation is given, as before, by $1/2N$. If the
numbers are not constant from generation to generation, then the
mean rate of inbreeding is the mean value of $1/2N$ in successive
generations. The effective number is the harmonic mean of the numbers in
each generation (Wright, 1939). Over a period of $t$ generations,

$$\frac{1}{N_e} = \frac{1}{t}\left[\frac{1}{N_1} + \frac{1}{N_2} + \frac{1}{N_3} + \dots + \frac{1}{N_t}\right] \quad \text{(approx.)} \qquad (4.6)$$

Thus the generations with the smallest numbers have the most effect. 
The reason for this can be seen by consideration of the "new" and 
"old" inbreeding referred to in connexion with equation 3.6. An 
expansion in numbers does not affect the previous inbreeding; it 
merely reduces the amount of new inbreeding. So, in a population 
with fluctuating numbers the inbreeding proceeds by steps of varying 
amount, and the present size of the population indicates only the 
present rate of inbreeding. 
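A short calculation shows how strongly a single small generation depresses the effective number. The sketch below is an illustration only, taking the effective number as the harmonic mean of the numbers in successive generations; the function names and the sequence of population sizes are hypothetical.

```python
def effective_number_over_generations(sizes):
    """Harmonic mean of the numbers in successive generations:
    1/Ne = (1/t) * (1/N_1 + 1/N_2 + ... + 1/N_t)."""
    t = len(sizes)
    return t / sum(1.0 / n for n in sizes)


def mean_rate_of_inbreeding(sizes):
    """Mean of 1/(2N) over the generations."""
    return sum(1.0 / (2.0 * n) for n in sizes) / len(sizes)


if __name__ == "__main__":
    steady = [100, 100, 100, 100]
    bottleneck = [100, 100, 10, 100]   # one generation reduced to 10
    print(effective_number_over_generations(steady))
    print(round(effective_number_over_generations(bottleneck), 1))
    print(mean_rate_of_inbreeding(bottleneck))
```

In this hypothetical case one generation of 10 individuals among three of 100 brings the effective number down to about 31, although the arithmetic mean of the numbers is 77.5.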

Non-random distribution of family size. This is probably the 
commonest and most important deviation from the breeding system 
of the idealised population. Its consequence is usually to render the 
effective number less than the actual, but in special circumstances it 
makes it greater. Family size means here the number of progeny of 
an individual parent or of a pair of parents, that survive to become 
breeding individuals. It will be remembered that each breeding 
individual in the idealised population contributes equally to the pool 
of gametes, and therefore equally also to the potential zygotes in the 
next generation. Survival of zygotes is random. The mean number 
of progeny surviving to breeding age is 1 for individual parents and 2
for pairs of parents. Since the chance of survival for any particular
zygote is small, the variation of family size follows a Poisson distribu- 
tion. The variance of family size is therefore equal to the mean family 
size, equality of mean and variance being a property of the Poisson 
distribution. Thus in a population of bisexual organisms, in which 
all other conditions of the idealised population are satisfied, family 
size will have a mean and a variance of 2. In natural populations the 
mean is not likely to differ much from 2, but the variance must be 
expected to be usually greater, for reasons of differing fertility be- 
tween the parent individuals and differing viability between the 
families. If the variance of family size is increased, a greater propor- 
tion of the following generation will be the progeny of a smaller 
number of parents, and the effective number of parents will be less 
than the actual number. Conversely, if the variance of family size is 
reduced below that of the idealised population, the effective number 
will be greater than the actual number. It can be shown that, when
the mean family size is 2, the effective number is as follows (Wright,
1940; Crow, 1954):

$$N_e = \frac{4N}{2 + \sigma^2_k} \qquad (4.7)$$

where $\sigma^2_k$ is the variance of family size. (Strictly speaking this is the
effective number as it affects variance of gene frequency and fixation;
for its effects on the inbreeding coefficient the expression is slightly
different. The difference is small and we shall ignore it.) Thus, when
there is equal fertility of the parents and random survival of the
progeny, $\sigma^2_k = 2$, and $N_e = N$. When differences of fertility and
viability make $\sigma^2_k$ greater than 2, as in most actual populations, then
$N_e$ is less than $N$. The
effective number under consideration here refers to a population with 
equal numbers of males and females, and with monogamous mating. 
If males are not restricted to a single mate, then the families of males 
are likely to be more variable in size than those of females. In these 
circumstances the relationship of effective to actual numbers will 
differ for male and female parents. 
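The effect of the variance of family size on the effective number can be illustrated numerically. This is a minimal sketch, assuming the relationship Ne = 4N / (2 + variance of family size), which applies when the mean family size is 2; the function name and the values tried are hypothetical.

```python
def effective_number_from_family_variance(n_actual, variance_of_family_size):
    """Ne = 4N / (2 + variance of family size), for a mean family size of 2."""
    return 4.0 * n_actual / (2.0 + variance_of_family_size)


if __name__ == "__main__":
    n = 40  # hypothetical actual number of breeding individuals
    # Variance 2 is the Poisson case of the idealised population;
    # variance 0 corresponds to exactly two progeny chosen from every family.
    for variance in (0.0, 1.0, 2.0, 4.0, 6.0):
        print(variance, effective_number_from_family_variance(n, variance))
```

With a variance of 2 the effective number equals the actual number; with a variance of 6 it is halved, and with a variance of 0 it is doubled.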

It is possible by controlled breeding to make the variance of
family size, $\sigma^2_k$, less than 2, and therefore to make the effective
number greater than the actual. If two members of each family are
deliberately chosen to be parents of the next generation, then the
variance of family size is zero. Under these special circumstances,
and if the sexes are equal in numbers, the effective number is twice
the actual:

$$N_e = 2N \qquad (4.8)$$

The rate of inbreeding is consequently half what it would be in an 
idealised population of equal size, and is usually less than half the 
rate of inbreeding under normal circumstances and random mating. 
Under this controlled breeding system the rate of inbreeding is the 
lowest possible with a given number of breeding individuals. The 
reduced variance of family size is the path through which the
"deliberate avoidance of inbreeding" works. The problem often arises
of keeping a stock with minimum inbreeding, but with a limitation of 
the actual population size imposed by the space or facilities available. 
A common practice under these circumstances is the deliberate 
avoidance of sib-matings and perhaps also of cousin-matings. One 
may go further and by the use of pedigrees (in the manner described 
in the next chapter) choose pairs for mating that have the least 
possible relationship with each other. Deliberate avoidance of in- 
breeding in this way has the effect of distributing the individuals 
chosen to be parents evenly over the available families, and thus 
reduces the variance of family size and the rate of inbreeding. The 
same result, however, can be achieved with less labour simply by 
ensuring that the available families are as far as possible equally 
represented among the individuals chosen to be the parents of the 
next generation. If, in addition, matings between close relatives are 
avoided, the inbreeding coefficient in any generation is slightly lower 
and is more uniform between the individuals in the generation than if 
matings between close relatives are allowed; but the rate of inbreeding 
is the same. 

If the sexes are unequal in numbers, but the individuals chosen as 
parents are equally distributed, in numbers and sexes, between the 
families, so that the variance of family size is still zero, then the rate 
of inbreeding is given by the following formula (Gowe, Robertson,
and Latter, 1959):

$$\Delta F = \frac{3}{32N_m} + \frac{1}{32N_f} \qquad (4.9)$$

where $N_m$ and $N_f$ are the actual numbers of male and female parents
respectively, and females are more numerous than males.

Example 4.1. Several flocks of poultry in the United States and in 
Canada, which are used as controls for breeding experiments, are
maintained by the following breeding system (Gowe, Robertson, and Latter,
1959):

There are 50 breeding males and 250 breeding females in each genera- 
tion. Every male is the son of a different father, and every female the 
daughter of a different mother, so that the variance of family size is zero. 
One of the objectives of this breeding system is to minimise the rate of 
inbreeding. Let us therefore find what the rate of inbreeding is, and then 
see how much is achieved in this respect by the deliberate equalisation of 
family size. By equation 4.9 the rate of inbreeding in these flocks is
$\Delta F = 0.002$. If there were no deliberate choice of breeding individuals, and
family size conformed to a Poisson distribution, the rate of inbreeding by
equation 4.5 would be $\Delta F = 0.003$. Thus, without the deliberate
equalisation of family size the rate of inbreeding would be 50 per cent greater. If
a low rate of inbreeding were the only objective, the number of females
could be substantially reduced without much effect. For example, if
there were no more females than males, with 50 of each sex ($N = 100$) and
with equalisation of family size, the rate of inbreeding from equation 4.8
would be $\Delta F = 0.0025$, which is not very much greater than with five times
as many females. This illustrates the point, mentioned earlier, that most 
of the inbreeding comes from the less numerous sex. 
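The three rates of inbreeding quoted in Example 4.1 can be reproduced as follows. This is a sketch under stated assumptions: equation 4.9 is taken to be dF = 3/(32 Nm) + 1/(32 Nf), the form that yields the value 0.002 quoted in the example; equation 4.5 is taken as dF = 1/(8 Nm) + 1/(8 Nf); and for equal sexes with equalised families the effective number is taken as twice the actual, so that dF = 1/(4N). The function names are hypothetical.

```python
def rate_equalised_families(n_males, n_females):
    """Family size equalised, sexes unequal (assumed form of equation 4.9):
    dF = 3/(32 Nm) + 1/(32 Nf)."""
    return 3.0 / (32.0 * n_males) + 1.0 / (32.0 * n_females)


def rate_poisson_families(n_males, n_females):
    """Family size following a Poisson distribution (equation 4.5):
    dF = 1/(8 Nm) + 1/(8 Nf)."""
    return 1.0 / (8.0 * n_males) + 1.0 / (8.0 * n_females)


def rate_equalised_equal_sexes(n_total):
    """Family size equalised, sexes equal: Ne = 2N, so dF = 1/(2 Ne) = 1/(4N)."""
    return 1.0 / (4.0 * n_total)


if __name__ == "__main__":
    print(rate_equalised_families(50, 250))   # flocks as actually maintained
    print(rate_poisson_families(50, 250))     # no equalisation of family size
    print(rate_equalised_equal_sexes(100))    # 50 males and 50 females
```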

Ratio of effective to actual number. When matings are con- 
trolled and pedigree records kept, the rate of inbreeding can readily 
be computed, as will be explained in the next chapter. But pedigree 
records are not available for natural populations, nor for laboratory 
populations kept by mass culture, as for example Drosophila
populations. How are we to estimate the rate of inbreeding in such
populations? We know the effective number is likely to be less than the
actual, but how much less? To estimate the effective number requires
a special experiment, and only the actual number is likely to be known.
Determinations of the ratio of effective to actual numbers, $N_e/N$,
from data on man, Drosophila, and the snail Lymnaea, led to values
ranging from 70 per cent to 95 per cent (Crow and Morton, 1955).
In the absence of specific knowledge, therefore, it would seem
reasonable to take the effective number as being, very roughly, about
three-quarters of the actual number. There are two methods by
which the ratio $N_e/N$ may be determined: (1) by the estimation of the
variance of family size, which yields $N_e$ by equation 4.7 (though
adjustment has to be made if the mean family size at the time of
measurement is not 2); and (2) by the estimation of the variance of the
changes of gene frequency during inbreeding, which yields $N_e$ by
equation 3.1. Both methods have been applied to Drosophila
melanogaster in laboratory cultures. The ratio $N_e/N$ for female
parents was 71 per cent by the first method and 76 per cent by the
second; and for male parents, 48 per cent and 35 per cent (Crow and
Morton, 1955). The ratio $N_e/N$ for the sexes jointly, determined by
the second method, ranged from 56 per cent to 83 per cent, with a
mean of 70 per cent, in five experiments with equal actual numbers
of males and females (Kerr and Wright, 1954a, b; Wright and Kerr,
1954; Buri, 1956). The low value of 56 per cent was found in rather
poor culture conditions of crowding, where there was more compe- 
tition (Buri, 1956). 

Example 4.2. As an illustration of the use of the ratio $N_e/N$ let us find
the expected rate of inbreeding in a population of Drosophila maintained
by 20 pairs of parents in each generation. The actual number is $N = 40$.
If the effective number were equal to the actual, the rate of inbreeding, by
equation 4.1, would be $\Delta F = 1/80 = 1.25$ per cent. If we take $N_e = 0.7N$, from
the experimental results cited above, then $N_e = 28$, and the rate of
inbreeding is $\Delta F = 1/56 = 1.786$ per cent. The coefficient of inbreeding after
10, 50, and 100 generations would then be (by equation 3.12) 17 per cent, 
59 per cent, and 84 per cent. 
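The working of Example 4.2 may be retraced step by step. The sketch below is illustrative only, with hypothetical function names: it converts the actual number to the effective number by the assumed ratio of 0.7, takes the rate of inbreeding as 1/(2 Ne), and applies F_t = 1 - (1 - dF)^t.

```python
def rate_of_inbreeding(n_actual, ratio_effective_to_actual=1.0):
    """dF = 1 / (2 Ne), with Ne = ratio * N."""
    n_effective = ratio_effective_to_actual * n_actual
    return 1.0 / (2.0 * n_effective)


def inbreeding_after(n_actual, ratio_effective_to_actual, generations):
    """F_t = 1 - (1 - dF)^t, referred to the base population."""
    delta_f = rate_of_inbreeding(n_actual, ratio_effective_to_actual)
    return 1.0 - (1.0 - delta_f) ** generations


if __name__ == "__main__":
    print(100 * rate_of_inbreeding(40))        # Ne taken equal to N, per cent
    print(100 * rate_of_inbreeding(40, 0.7))   # Ne = 0.7 N = 28, per cent
    for t in (10, 50, 100):
        print(t, round(100 * inbreeding_after(40, 0.7, t), 1))
```

Carried to one decimal place the coefficients come to roughly 16.5, 59.4, and 83.5 per cent, in agreement with the rounded figures of the example.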

Migration, Mutation, and Selection 

The description of the dispersive process given so far in this 
chapter and the previous one is conditional on the systematic
processes of mutation, migration, and selection being absent, and its
relevance to real populations is therefore limited. So let us now consider
the effects of the dispersive and systematic processes when acting 
jointly. The systematic processes, as we have seen in Chapter 2, 
tend to bring the gene frequencies to stable equilibria at particular 
values which would be the same for all populations under the same 
conditions. The dispersive process, in contrast, tends to scatter the 
gene frequencies away from these equilibrium values, and if not held 
in check by the systematic processes it would in the end lead to all 
genes being either fixed or lost in all populations not infinite in size. 
The tendency of the systematic processes to change the gene fre- 
quency toward its equilibrium value becomes stronger as the fre- 
quency deviates further from this value. For this reason the opposing
tendencies of the dispersive and systematic processes reach a point of
balance: a point at which the dispersion of the gene frequencies is 
held in check by the systematic processes. When this point of balance 
is reached there will be a certain degree of differentiation between 
sub-populations, but it will neither increase nor decrease so long as 
the conditions remain unchanged. The problem is therefore to find 
the distribution of gene frequencies among the lines of a subdivided 
population when this steady state has been reached. The solution is 
complicated mathematically, and we shall give only the main con- 
clusions, explaining their meaning but not their derivation. For 
details of the joint action of the dispersive and systematic processes, 
see Wright (1931, 1942, 1948, 1951).

Mutation and migration. Mutation and migration can be 
dealt with together because they change the gene frequency in the 
same manner. Consider again a population subdivided into many 
lines, all with an effective size $N_e$; and let a proportion, $m$, of the
breeding individuals of every generation in each line be immigrants
coming at random from all other lines. Consider two alleles at a
locus, with mean frequencies $p$ and $q$ in the population as a whole, and
with mutation rates $u$ and $v$ in the two directions. Then, when the
balance between dispersion on the one hand and mutation and
migration on the other is reached, the variance of the gene frequency
among the lines is given by the following expression (Wright, 1931;
Malecot, 1948):

$$\sigma^2_q = \frac{pq}{1 + 4N_e(u + v + m)} \qquad (4.10)$$

The degree of dispersion represented here by the variance of the gene
frequency can also be expressed as a coefficient of inbreeding, by
putting $\sigma^2_q = Fpq$, from equation 3.14. Then

$$F = \frac{1}{1 + 4N_e(u + v + m)} \qquad (4.11)$$


The theoretical distributions of the gene frequency appropriate to
four different values of $F$, when the mean gene frequency is 0.5, are
shown in Fig. 4.1. These distributions show how high $F$ must be for
there to be a substantial amount of fixation or of differentiation
between sub-populations. What the distributions depict can be stated
in three ways: (a) If we had a large number of sub-populations and we
determined the frequency of a particular gene in all of them, the
distribution curve is what we should obtain by plotting the percentage
of sub-populations showing each gene frequency. Or, in other words,
the height of the curve at a particular gene frequency shows the
probability of finding that gene frequency in any one sub-population.
(b) If we had one sub-population and measured the gene frequencies
at a large number of loci, all of which started with the same initial
frequency, the curve is the distribution of frequencies that we should
find. (c) If we had one sub-population and measured the frequency of
one particular gene repeatedly over a long period of time, the curve is
the distribution of frequencies that we should find. The distributions
describe the state of affairs when equilibrium between the systematic
and dispersive processes has been reached, and the population as a
whole is in a steady state. From the distributions shown in Fig. 4.1 it
will be seen that when $F$ is 0.005 there is very little differentiation,
and when $F$ is 0.048 there is a fair amount of differentiation but still
no fixation. When $F$ is 0.333 the distribution is flat, which means that
all gene frequencies are equally probable (including 0 and 1); thus
there is much differentiation, and in addition a substantial amount
of fixation and loss occurs.
When $F$ exceeds this critical value intermediate gene frequencies
become rarer, and a greater proportion of sub-populations have the
gene either fixed or lost. When mutation or migration occurs, fixation
or loss is not a permanent state in any one sub-population; the amount
of fixation or loss is what would be found at any one time.

Fig. 4.1. Theoretical distributions of gene frequency among
sub-populations, when dispersion is balanced by mutation or migration.
The states of dispersion to which the curves refer are indicated by the
values of $F$ in the figure. (Redrawn from Wright, 1951.)

Let us return now to the expression, 4.11, relating the coefficient
of inbreeding to the rates of mutation and migration when the
population has reached the steady state; and let us consider the rates
of mutation or migration, in relation to the effective population size,
that would just allow the dispersive process to go to the critical point
corresponding to the value of $F = 0.333$. Putting this value of $F$ in
equation 4.11 yields

$$u + v + m = \frac{1}{2N_e} \quad \text{(approx.)}$$

First let us consider mutation alone. If the sum of the mutation
rates in the two directions ($u + v$) were $10^{-5}$, which is a realistic value
to take according to what is known of mutation rates, then the critical
state of dispersion will be reached in sub-populations of effective size
$N_e = 50,000$. In other words, mutation rates of this order of
magnitude will arrest the dispersive process before the critical state only in
populations with effective numbers greater than 50,000. Populations 
smaller than this will show a substantial amount of fixation of genes 
having this mutation rate. In practice, therefore, mutation may be 
discounted as a force opposing dispersion in populations that would 
commonly be regarded as "small"; populations, that is, with effective 
numbers of the order of 100, or even 1,000. 

With migration the picture is different, because what would be 
considered a high rate of mutation would be judged a low rate of 
migration. The critical value of $F = 0.333$ will occur when $m = 1/2N_e$.
With this rate of migration there would be only one immigrant 
individual in every second generation, irrespective of the population 
size. Thus we see that only a small amount of interchange between 
sub-populations will suffice to prevent them from differentiating 
appreciably in gene frequency. 
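The critical values for mutation and for migration can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes the steady-state relation F = 1 / (1 + 4 Ne (u + v + m)) and solves it for the effective size at which F reaches the critical value of one-third. The function names and the example values are hypothetical.

```python
def steady_state_inbreeding(n_effective, u=0.0, v=0.0, m=0.0):
    """Steady-state coefficient of inbreeding under mutation and migration:
    F = 1 / (1 + 4 Ne (u + v + m))."""
    return 1.0 / (1.0 + 4.0 * n_effective * (u + v + m))


def critical_effective_size(rate_sum, f_critical=1.0 / 3.0):
    """Effective size at which the steady-state F equals f_critical,
    from solving F = 1 / (1 + 4 Ne x) for Ne, where x = u + v + m."""
    return (1.0 / f_critical - 1.0) / (4.0 * rate_sum)


if __name__ == "__main__":
    # Mutation alone, with u + v = 1e-5: the critical size is about 50,000.
    print(critical_effective_size(1e-5))
    # Migration alone at m = 1/(2 Ne), for several population sizes: F stays at
    # one-third, and the number of immigrants per generation, m * Ne, is one-half.
    for n_e in (20, 200, 2000):
        m = 1.0 / (2.0 * n_e)
        print(n_e, round(steady_state_inbreeding(n_e, m=m), 3), m * n_e)
```

The second loop shows why the critical rate of migration corresponds to one immigrant in every second generation whatever the population size: the product of m and Ne is one-half in every case.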

The situation to which this consideration of migration refers is 
known as the "island model." It pictures a discontinuous population
such as might be found inhabiting widely separated islands, inter- 
change taking place by occasional migrants from one sub-population 
to another. But differentiation of sub-populations by random drift 
can take place also in a continuous population if the motility of the 
organism is small in relation to the population density. This is known 
as "isolation by distance" or the "neighbourhood model" (Wright, 
1940; 1943; 1946; 1951). Clearly, if there is little dispersal over the
territory between one generation and the next the choice of mates is 
restricted and mating cannot be at random. The population is then 
subdivided into "neighbourhoods" (Wright, 1946) within which 


individuals find mates. A neighbourhood is an area within which 
mating is effectively random. The size of a neighbourhood depends 
on the distance covered by dispersal between one generation and the 
next. If the distances between localities inhabited by offspring and 
parents at corresponding stages of the life cycle are distributed with a 
variance $\sigma_d^2$, then the area of a neighbourhood is the area enclosed by a
circle of radius $2\sigma_d$, which is $\pi(2\sigma_d)^2$. The effective population size
of a neighbourhood is the number of breeding individuals in the 
area of a neighbourhood. The subdivision of a population into 
neighbourhoods leads to random drift, but the amount of local 
differentiation depends on the size of the whole population as well 
as on the effective number in the neighbourhood. If the whole 
population is not very much larger than the neighbourhood then the 
whole population will drift, and there will be little local differentiation 
within it. The conclusion to which the neighbourhood model leads 
is that a great amount of local differentiation will take place if the 
effective number in a neighbourhood is of the order of 20, and a 
moderate amount if it is of the order of 200; but with larger neigh- 
bourhoods it will be negligible. There will be much more local 
differentiation in a population inhabiting a linear territory, such as a 
river or shore line, because a neighbourhood is then open to immi- 
gration only from two directions instead of from all round. The extent 
of a neighbourhood in a population distributed in one dimension is
the square root of the area of a neighbourhood in a population
distributed in two dimensions. The effective population size is
therefore the number of breeding individuals in a distance
$2\sigma_d\sqrt{\pi}$ of territory.

Example 4.3. As an illustration of the computation of the effective
population size of a neighbourhood we may take some observations from
the detailed studies by Lamotte (1951) of the snail Cepaea nemoralis in
France. Marked individuals were released in spring and the distance
travelled from the point of release by those recaptured in the autumn was
noted. Since the snails are inactive in winter this represented the
displacement occurring in one year. The mean displacement was 8.1 metres,
and its standard deviation 9.4 m. The standard deviation of the
displacement between birth and mating, which usually takes place in the
second year of life, was estimated as $\sigma_d = 15$ m. The area occupied by a
neighbourhood is therefore $\pi(2\sigma_d)^2 = 12.5\sigma_d^2 = 2{,}813$ sq. m. The density
of individuals in two large colonies was found to be 2 per sq. m., and in
another 3 per sq. m. The effective population size of the neighbourhoods
in these colonies was therefore about 5,600 and 8,400. These figures are a
good deal larger than the size of neighbourhoods from which we would
expect differentiation within the colonies. Five colonies inhabiting
linear territories had densities ranging from 4.5 to 20 individuals per
metre. The effective population size of the neighbourhoods in these
colonies ranged from 236 to 1,050. These are approaching the size from
which differentiation within a colony would be expected.
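
The computation in Example 4.3 can be sketched as follows. The function names are hypothetical; the densities and $\sigma_d$ are those quoted in the example. Because the sketch uses the exact value of $\pi$, whereas the text rounds $4\pi$ to 12.5 and the neighbourhood length to about 52.5 m, the results come out slightly larger than the figures in the text.

```python
import math


def neighbourhood_area(sigma_d):
    """Area of a circle of radius 2 * sigma_d (two-dimensional territory)."""
    return math.pi * (2.0 * sigma_d) ** 2


def neighbourhood_length(sigma_d):
    """Extent of a neighbourhood in a linear territory: 2 * sigma_d * sqrt(pi)."""
    return 2.0 * sigma_d * math.sqrt(math.pi)


sigma_d = 15.0  # metres, displacement between birth and mating

area = neighbourhood_area(sigma_d)                # square metres
n_e_area = [d * area for d in (2.0, 3.0)]         # snails per sq. m.

length = neighbourhood_length(sigma_d)            # metres
n_e_linear = [d * length for d in (4.5, 20.0)]    # snails per metre

print(round(area), [round(n) for n in n_e_area])
print(round(length, 1), [round(n) for n in n_e_linear])
```

The linear extent is simply the square root of the two-dimensional area, as stated in the text.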

Selection. Selection operating on a locus in a large population
brings the gene frequency to an equilibrium; when selection against
a recessive or semidominant gene is balanced by mutation the
equilibrium is at a low gene frequency, and when selection favours
the heterozygote the equilibrium is more likely to be at an
intermediate frequency. The question we have now to consider is: How
much can the dispersive process disturb these equilibria and cause
small populations to deviate from the point of equilibrium? The
importance of this question lies in the fact that an increase of the
frequency of a deleterious gene will reduce the fitness (that is, will
increase the frequency of "genetic deaths") and the dispersive process
may therefore lead to non-adaptive changes in small populations. We
shall not attempt to cover the joint effects of selection and
dispersion in detail, but shall merely illustrate their general nature
by reference to a particular case of selection against a recessive gene
balanced by mutation. The effects of selection in favour of
heterozygotes will be discussed in the next chapter, because they have
more importance in connexion with close inbreeding.

Fig. 4.2 shows the state of dispersion of a gene among sub-populations
of three sizes under the following conditions. Mutation is supposed to
be the same in both directions, and the coefficient of selection
against the homozygote is supposed to be twenty times the mutation
rate. In a large population the balance between the mutation and the
selection would bring the gene frequency to equilibrium at about 0.2.
The population sizes to which the graphs refer are (a) $N_e = 50/s$,
(b) $N_e = 5/s$, and (c) $N_e = 0.5/s$. If we assumed a mutation rate of
$10^{-5}$ in both directions then the intensity of selection would be
$s = 20 \times 10^{-5}$, and the effective population sizes to which the graphs
refer would be (a) 250,000, (b) 25,000, and (c) 2,500. These graphs
show that with the largest value of $N_e$ there is little differentiation
between sub-populations; with the intermediate value of $N_e$ random
drift is strong enough to cause a good deal of differentiation; with the
smallest value of $N_e$ the effects of random drift predominate over those
of mutation and selection, intermediate gene frequencies are almost
absent, and in the majority of sub-populations the allele is either fixed
or lost. In this case, moreover, a fair proportion of the sub-populations
have the deleterious allele fixed in them. This illustrates how random
drift can overcome relatively weak selection and lead to fixation of a
deleterious gene.

Fig. 4.2. Theoretical distributions of gene frequency among
sub-populations when the dispersion is balanced by mutation and
selection. The graphs refer to a recessive gene with $u = v = s/20$, in
populations of size: (a) $N_e = 50/s$, (b) $N_e = 5/s$, and (c) $N_e = 0.5/s$.
(Redrawn from Wright,

This particular case illustrates in principle what will happen when 
the processes of random drift, selection, and mutation are all operating. 
But we need to have some idea of how intense the selection must be 
before it overcomes the effects of random drift. If we are content not 
to be very precise we can say that selection begins to be more im- 
portant than random drift when the coefficient of selection, s, is of the 
order of magnitude of $1/(4N_e)$. For example, in a population of effective
size 100, the critical value of $s$ would be about 0.0025. This is a very
low intensity of selection, quite beyond the reach of experimental 
detection. The conclusion to be drawn, therefore, is that in all but 
very small populations, even a very slight selective advantage of one 
allele over another will suffice to check the dispersive process before 
it causes an appreciable amount of fixation or of differentiation be- 
tween sub-populations. 
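
The figures used in the discussion of Fig. 4.2 and the rule of thumb $s \approx 1/(4N_e)$ can be reproduced with a few lines. This is only a sketch: the variable names are illustrative, and the equilibrium frequency is computed from the simple approximation $q \approx \sqrt{u/s}$ for a recessive gene, which ignores reverse mutation.

```python
import math

u = 1e-5        # mutation rate in each direction, as assumed in the text
s = 20 * u      # selection against the recessive homozygote: 20 x mutation

# Large-population equilibrium for a recessive gene (reverse mutation ignored).
q_hat = math.sqrt(u / s)

# Effective sizes of the three cases of Fig. 4.2: 50/s, 5/s and 0.5/s.
sizes = {label: k / s for label, k in (("a", 50), ("b", 5), ("c", 0.5))}


def critical_s(n_e):
    """Order of magnitude of s at which selection begins to outweigh drift."""
    return 1.0 / (4.0 * n_e)


print(round(q_hat, 3), {k: round(v) for k, v in sizes.items()})
print(critical_s(100))
```

The equilibrium comes out near 0.22, in line with the "about 0.2" of the text, and a population of effective size 100 gives a critical $s$ of 0.0025.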

Example 4.4. The opposing forces of dispersion and selection are
illustrated in Fig. 4.3, from an experiment with Drosophila melanogaster
(Wright and Kerr, 1954). The frequency of the sex-linked gene "Bar"
was followed for 10 generations in 108 lines each maintained by 4 pairs of
parents. (On account of the complication of sex-linkage, which increases
the rate of dispersion, the theoretical effective number was 6.765: the
effective number as judged from the actual rate of dispersion was
$N_e = 4.87$.) The initial gene frequency was 0.5. The circles in the figure
show the distribution of the gene frequency among the lines in the fourth
to tenth generations, when the distribution had reached its steady form.
The smooth curve shows the theoretical distribution based on $N_e = 5$ and a
coefficient of selection against Bar of $s = 0.17$. Previously fixed lines
are not included in the distributions. Altogether, at the tenth
generation, 95 of the 108 lines had become fixed for the wild-type allele
and 3 for Bar while 10 remained unfixed. Thus, despite a 17 per cent
selective disadvantage, the deleterious allele was fixed in about 3 per
cent of the lines.

Fig. 4.3. Distribution of gene frequencies under inbreeding and
selection, as explained in Example 4.4. (Data from Wright and
Kerr, 1954.)

Random Drift in Natural Populations 

Having described the dispersive process and its theoretical conse- 
quences, we may now turn to the more practical question of how far 
these consequences are actually seen in natural populations. The 
answering of this question is beset with difficulties, and the following 
comments are intended more to indicate the nature of these diffi- 
culties than to answer the question. 


The theory of small populations, outlined in this and the pre- 
ceding chapter, is essentially mathematical in nature and is un- 
questionably valid: given only the Mendelian mechanism of inheri- 
tance, the conclusions arrived at are a necessary consequence under 
the conditions specified. The question at issue, then, is whether the 
conditions in natural populations are often such as would allow the 
dispersion of gene frequencies to become detectable. The pheno- 
mena which would be expected to result from the dispersive process, 
if the conditions were appropriate, are differentiation between the 
inhabitants of different localities, and differences between successive 
generations. Both these phenomena are well known in subdivided or 
small isolated populations, and it is tempting to conclude that because 
they are the expected consequences of random drift, random drift 
must be their cause. But there are other possible causes: the en- 
vironmental conditions probably differ from one locality to another 
and from one season to another; so the intensity, or even the direction 
of selection may well vary from place to place and from year to year, 
and the differences observed could equally well be attributed to 
variation of the selection pressure. Before we can justifiably attribute 
these phenomena to random drift, therefore, we have to know (a) 
that the effective population size is small enough, (b) that the sub- 
populations are well enough isolated (or the size of the
"neighbourhoods" sufficiently small), and (c) that the genes concerned are subject
to very little selection. 

The estimation of the present size of a population, though not tech- 
nically easy, presents no difficulties of principle. But the present 
state of differentiation depends on the population size in the past, 
and this can generally only be guessed at. It is difficult to know how 
often the population may have been drastically reduced in size in 
unfavourable seasons, and the dispersion taking place in these 
generations of lowest numbers is permanent and cumulative. There 
is less difficulty in deciding whether the sub-populations are suffi- 
ciently well isolated. With a discontinuous population inhabiting 
widely separated islands, it is often possible to be reasonably sure 
that there is not too much immigration; and with a continuous 
population the size of the "neighbourhoods" is, at least in principle, 
measurable. The greatest difficulty lies in estimating the intensity of 
natural selection acting on the genes concerned. Selection of an 
intensity far lower than could be detected experimentally is sufficient 
to check dispersion in all but the smallest populations. It seems 




rather unlikely — though this is no more than an opinion — that any 
gene that modifies the phenotype enough to be recognised would 
have so little effect on fitness. The genes concerned with quantitative 
differences, which are not individually recognisable, may however be 
nearly enough neutral for random drift to take place. There is no 
doubt at all that genes of this sort do show random drift, at least in 
laboratory populations, as will be shown in Chapter 15. Of the 
individually recognisable genes, those concerned with polymorphism 
seem the most likely to show the effects of random drift. At inter- 
mediate frequencies a small displacement from the equilibrium would 
be detectable, and therefore a relatively small amount of dispersion 
of the gene frequency might well lead to recognisable differentiation. 
The following example will serve to illustrate the observed
differentiation of a natural population, as well as the difficulties of
its interpretation.
Example 4.5. The polymorphism in respect of the banding of the 
shell in the snail Cepaea nemoralis has been extensively studied by Lamotte 
(1951) in France. The population is subdivided into colonies with a high 
degree of isolation between them. The absence of dark-coloured bands 
on the shell is caused by a single recessive gene. The mean frequency of 
bandless snails is 29 per cent, but individual colonies range between the 
two extremes, some being entirely bandless and a few entirely banded. 
The colonies vary in the number of individuals that they contain, and 291 
colonies were divided into three groups according to their population 
size. The variation in the frequency of bandless snails was then compared 
in the three groups, as shown in Fig. 4.4. The variation between the 
colonies, which measures the degree of differentiation, was found to be 
greater among the small colonies than among the large. The variance of 
the frequency of bandless between colonies was 0.067 among colonies of
500-1,000 individuals, 0.048 among colonies of 1,000-3,000, and 0.037
among colonies of 3,000-10,000 individuals. This dependence of the 
degree of differentiation on the population size is interpreted by Lamotte 
as evidence that the differentiation is caused by random drift. 

Cain and Sheppard (1954a), on the other hand, offer a different 
interpretation, sustained by an equally thorough study of colonies in 
England. They show that predation by birds — chiefly thrushes — exerts a 
strong selection in favour of shell colours matching the background of the 
habitat. Though the polymorphism is maintained by selection, of an 
unknown nature, in favour of heterozygotes, the frequency of the different 
types in any colony is determined by selection in relation to the nature of 
the habitat. In the areas occupied by small colonies, they argue, there is 
less variation of habitat than in the areas occupied by large colonies.
Therefore the variation of habitat between small colonies is greater than
between large. This they regard as the cause of the greater
differentiation among small colonies than among large, selection bringing
the frequency of bandless forms to a value appropriate to the mean habitat
of the colony. It is not for us here to attempt an assessment of these two
conflicting interpretations.

Fig. 4.4. Distribution of the frequency of bandless snails among
colonies of three sizes. (Data from Lamotte, 1951.)

                               (a)          (b)            (c)
Population size                500-1,000    1,000-3,000    3,000-10,000
Mean frequency of bandless     0.292        0.256          0.211
Variance between colonies      0.067        0.048          0.037


III. Pedigreed Populations and Close Inbreeding 

In the two preceding chapters the genetic properties of small popu- 
lations were described by reference to the effective number of breeding 
individuals; and expressions were derived, in terms of the effective 
number, by means of which the state of dispersion of the gene 
frequencies could be expressed as the coefficient of inbreeding. The 
coefficient of inbreeding, which is the probability of any individual 
being an identical homozygote, was deduced from the population size 
and the specified breeding structure. It expressed, therefore, the 
average inbreeding coefficient of all individuals of a generation. 
When pedigrees of the individuals are known, however, the coeffi- 
cient of inbreeding can be more conveniently deduced directly from 
the pedigrees, instead of indirectly from the population size. This 
method has several advantages in practice. Knowledge is often re- 
quired of the inbreeding coefficient of individuals, rather than of the 
generation as a whole, and this is what the calculation from pedigrees 
yields. In domestic animals some individuals often appear as parents 
in two or more generations, and this overlapping of generations causes 
no trouble when the pedigrees are known. (Non-overlapping of 
generations was one of the conditions of the idealised population 
which we have not yet removed.) The first topic for consideration in 
this chapter is therefore the computation of inbreeding coefficients 
from pedigrees. The second topic concerns regular systems of close 
inbreeding. When self-fertilisation is excluded the rate of inbreeding 
expressed in terms of the population size is only an approximation, 
and the approximation is not close enough if the population size is 
very small. Under systems of close inbreeding, therefore, the rate of 
inbreeding must be deduced differently, and this is best done also 
by consideration of the pedigrees. 

When the coefficient of inbreeding is deduced from the pedigrees 
of real populations it does not necessarily describe the state of dis- 
persion of the gene frequencies. It is essentially a statement about 


the pedigree relationships, and its correspondence with the state of 
dispersion is dependent on the absence of the processes that counteract 
dispersion, in particular on selection being negligible. We were able 
to use the coefficient of inbreeding as a measure of dispersion in the 
preceding chapters because the necessary conditions for its relation- 
ship with the variance of gene frequencies were specified. 

Pedigreed Populations 

The inbreeding coefficient of an individual is the probability
that the pair of alleles carried by the gametes that produced it were
identical by descent. Computation of the inbreeding coefficient therefore
requires no more than the tracing of the pedigree back to common
ancestors of the parents and computing the probabilities at each
segregation. Consider the pedigree in Fig. 5.1. X is the individual
we are interested in, whose parents are P and Q. We want to know what
is the probability that X receives identical alleles transmitted
through P and Q from A. Consider first B and C. The probability that
they receive replicates of the same gene from A is $\tfrac{1}{2}$, and the
probability that they receive different genes is $\tfrac{1}{2}$. But if they
receive different genes from A, then the probability of these being
identical as a result of previous inbreeding is the inbreeding
coefficient of A. Therefore the total probability of B and C receiving
identical genes from A is $\tfrac{1}{2}(1 + F_A)$. Put in other words, this is
the probability that two gametes taken at random from A will contain
identical alleles. Now consider the rest of the path through B. The
probability that B passes the gene it got from A on to D is $\tfrac{1}{2}$; from
D to P is $\tfrac{1}{2}$, and from P to X is $\tfrac{1}{2}$. Similarly for the other side
of the ancestry through C and Q. Putting all this together we find the
probability that X receives identical alleles descended from A is
$\tfrac{1}{2}(1 + F_A)(\tfrac{1}{2})^{3+2}$, or $\tfrac{1}{2}(1 + F_A)(\tfrac{1}{2})^{n_1+n_2}$, where $n_1$ is the
number of generations from one parent back to the common ancestor and
$n_2$ from the other parent. If the two parents have more than one
ancestor in common the separate probabilities for each of the common
ancestors have to be summed to give the inbreeding coefficient of the
progeny of these parents.

Fig. 5.1. Pedigree of X, whose parents P and Q trace back to the
common ancestor A.
Thus the general expression for the inbreeding coefficient of an
individual is

$$F_X = \sum\left[\left(\tfrac{1}{2}\right)^{n_1+n_2+1}(1 + F_A)\right] \qquad (5.1)$$

(Wright, 1922). When inbreeding coefficients are computed in this 
way it is necessary, of course, to define the base population to which 
the present inbreeding is referred. The base population might be the 
individuals from which an experiment was started or a herd founded; 
or it might be those born before a certain date. The designation of an 
individual as belonging to the base population means that it will be 
assumed to have zero inbreeding coefficient. When pedigrees are 
long and complicated there may be very many common ancestors, but 
it is not necessary to trace back all lines of descent. A sufficiently 
accurate estimate can be got by sampling a limited number of lines of 
descent (Wright and McPhee, 1925). 

Example 5.1. As an illustration of the use of formula 5.1 let us consider
the hypothetical pedigree in Fig. 5.2. The relevant individuals in the
pedigree are indicated by letters. Individual Z is the one whose
inbreeding coefficient is to be computed. Its parents are X and Y, so we
have to trace the paths of common ancestry connecting X with Y. There are
four common ancestors, A, B, C, and H, and five paths connecting X with Y
through them. We assume A, B, and C to have zero inbreeding coefficients,
since the pedigree tells us nothing about their ancestry. Individual H,
however, has parents that are half sibs, and the inbreeding coefficient
of H is therefore $(\tfrac{1}{2})^{1+1+1} = \tfrac{1}{8}$. The computation of the separate
paths may now be made as shown in the table. By addition of the
contributions from the five paths we get the inbreeding coefficient of Z
as $F_Z = 0.05078$, or 5.1 per cent.

Fig. 5.2

Contribution to inbreeding of Z from each of the five paths:

    $(\tfrac{1}{2})^9$                      = 0.00195
    $(\tfrac{1}{2})^9$                      = 0.00195
    $(\tfrac{1}{2})^8$                      = 0.00391
    $(\tfrac{1}{2})^7$                      = 0.00781
    $(\tfrac{1}{2})^5 \times \tfrac{9}{8}$  = 0.03516  (common ancestor H)

    Total by summation                      = 0.05078
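
The summation over paths in formula 5.1 can be sketched in a few lines. The function name is hypothetical, and each path is described only by the total $n_1 + n_2$, since the formula depends only on that sum; the totals used below are the ones implied by the exponents in the example's table.

```python
from fractions import Fraction


def path_contribution(n1_plus_n2, f_ancestor=0):
    """One term of formula 5.1: (1/2)^(n1 + n2 + 1) * (1 + F_A)."""
    return Fraction(1, 2) ** (n1_plus_n2 + 1) * (1 + f_ancestor)


# H has half-sib parents: one common ancestor, n1 = n2 = 1, F of ancestor 0.
f_h = path_contribution(2)

# Five paths connecting X with Y: (n1 + n2, F of the common ancestor).
paths = [(8, 0), (8, 0), (7, 0), (6, 0), (4, f_h)]
f_z = sum(path_contribution(n, f) for n, f in paths)

print(f_h, f_z, float(f_z))
```

Exact fractions are used so that the total, 13/256, can be compared directly with the rounded value 0.05078 given in the example.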

"Coancestry." There is another method of computing inbreed- 
ing coefficients (Cruden, 1949; Emik and Terrill, 1949) which is more 
convenient for many purposes, and is also more readily adapted to 
a variety of problems. We shall use it later to work out the inbreeding 
coefficients under regular systems of close inbreeding. The method 
does not differ in principle from the formula 5.1 given above, but
instead of working from the present back to the common ancestors 
we work forward, keeping a running tally generation by generation, 
and compute the inbreeding that will result from the matings now 
being made. The inbreeding coefficient of an individual depends on 
the amount of common ancestry in its two parents. Therefore, 
instead of thinking about the inbreeding of the progeny, we can think 
of the degree of relationship by descent between the two parents.
This we shall call the coancestry of the two parents, and symbolise it
by $f$. It is identical with the inbreeding coefficient of the progeny,
and is the probability that two gametes taken one from one parent
and one from the other will contain alleles that are identical by
descent. (Malécot, 1948, calls this the "coefficient de parenté," but
the translation "coefficient of relationship" cannot be used because 
Wright (1922) has used this term with a different meaning.) 

    A x B     C x D
      |         |
      P    x    Q
           |
           X

        Fig. 5.3

Consider the generalised pedigree in Fig. 5.3. X is an individual with
parents P and Q and grandparents A, B, C, and D. Now, the coancestry of P
with Q is fully determined by the coancestries relating A and B with C
and D, and if these are known we need go no further back in the pedigree.
It can be shown that the coancestry of P with Q is simply the mean of the
four coancestries AC, AD, BC, and BD. This will be clearer if stated in
the form of probabilities, though the explanation is cumbersome when put
into words. Take one gamete at random from P and one from Q, and repeat
this many times. In half the cases P's gamete will carry a gene from A
and in half from B: similarly for Q's gamete. So the two gametes, one from
P and one from Q, will carry genes from A and C in a quarter of the cases,
from A and D in a quarter, from B and C in a quarter, and from B and D in
a quarter of the cases. Now the probability that two gametes taken at
random, one from A and the other from C, are identical by descent is the
coancestry of A with C, i.e. $f_{AC}$, etc. So, reverting now to symbols,

$$f_{PQ} = \tfrac{1}{4}f_{AC} + \tfrac{1}{4}f_{AD} + \tfrac{1}{4}f_{BC} + \tfrac{1}{4}f_{BD}$$

This gives the basic rule relating coancestries in one generation with
those in the next:

$$F_X = f_{PQ} = \tfrac{1}{4}(f_{AC} + f_{AD} + f_{BC} + f_{BD}) \qquad (5.2)$$


With this rule the experimenter can tabulate the coancestries genera- 
tion by generation, and this gives a basis for planning matings and 
computing inbreeding coefficients. More detailed accounts of the 
operation are given by Cruden (1949), Emik and Terrill (1949), and 
Plum (1954). 

If there is overlapping of generations it may happen that we must 
find the coancestry between individuals belonging to different 
generations. This situation is covered by the following supplementary 
rules, which can readily be deduced by a consideration of probabilities 
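in the manner explained above. Referring to the same pedigree
(Fig. 5.3),

$$\begin{aligned}
f_{PC} &= \tfrac{1}{2}(f_{AC} + f_{BC}) \\
f_{PD} &= \tfrac{1}{2}(f_{AD} + f_{BD}) \\
f_{PQ} &= \tfrac{1}{2}(f_{PC} + f_{PD})
\end{aligned} \qquad (5.3)$$

which by substitution reduces to the basic rule.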

Before we can apply this method to systems of close inbreeding
we have to see how the basic rule is to be applied when there are
fewer than four grandparents. As an example we shall consider the
coancestry between a pair of full sibs. The pedigree can be written
as in Fig. 5.4: A and B are parents of both $P_1$ and $P_2$, which are full
sibs and have an offspring X. Applying the basic rule (equation 5.2),
and noting that $f_{BA} = f_{AB}$, we have

$$F_X = f_{P_1P_2} = \tfrac{1}{4}(f_{AA} + f_{BB} + 2f_{AB}) \qquad (5.4)$$

       A x B
       /   \
     P1  x  P2
         |
         X

      Fig. 5.4


The meaning of $f_{AA}$, the coancestry of an individual with itself, is the
probability that two gametes taken at random from A will contain
identical alleles, and we have already seen that this probability is
equal to $\tfrac{1}{2}(1 + F_A)$. The value of $F_A$ will be known from the coancestry
of A's parents. The coancestry between offspring and parent can be
found in a similar way, by application of the supplementary rules in
5.3. Substituting the individuals in Fig. 5.4 for those in Fig. 5.3 and
applying the first two equations of 5.3 gives

$$\begin{aligned}
f_{PA} &= \tfrac{1}{2}(f_{AA} + f_{AB}) \\
f_{PB} &= \tfrac{1}{2}(f_{BB} + f_{AB})
\end{aligned} \qquad (5.5)$$

where P is equivalent to either $P_1$ or $P_2$; and applying the third
equation of 5.3 gives the coancestry between full sibs

$$f_{P_1P_2} = \tfrac{1}{2}(f_{PA} + f_{PB}) = \tfrac{1}{4}(f_{AA} + f_{BB} + 2f_{AB})$$


as above. We now have all the rules needed for computing the in- 
breeding coefficients in successive generations under regular systems 
of inbreeding. 
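
The rules of coancestry lend themselves to a short recursive sketch. Everything below is illustrative rather than a published procedure: the pedigree is hypothetical (two generations of full-sib mating from unrelated, non-inbred base parents A and B), individuals are assumed to be listed in order of birth, and distinct base individuals are assumed to have zero coancestry. The supplementary rule is always applied to the younger of the two individuals, so that the other can never be its descendant.

```python
from functools import lru_cache

# Hypothetical pedigree, listed in order of birth.
# Each individual maps to its two parents; base individuals map to None.
PEDIGREE = {
    "A": None, "B": None,
    "P1": ("A", "B"), "P2": ("A", "B"),
    "Q1": ("P1", "P2"), "Q2": ("P1", "P2"),
    "X": ("Q1", "Q2"),
}
ORDER = {name: i for i, name in enumerate(PEDIGREE)}


@lru_cache(maxsize=None)
def coancestry(a, b):
    """Coancestry f between individuals a and b."""
    if a == b:
        # Coancestry of an individual with itself: (1/2)(1 + F).
        return 0.5 * (1.0 + inbreeding(a))
    if ORDER[a] < ORDER[b]:
        a, b = b, a                      # make `a` the younger individual
    parents = PEDIGREE[a]
    if parents is None:
        return 0.0                       # assumed: base individuals unrelated
    p, q = parents
    # Supplementary rule: mean coancestry of a's parents with b.
    return 0.5 * (coancestry(p, b) + coancestry(q, b))


def inbreeding(x):
    """Inbreeding coefficient of x: the coancestry of its parents."""
    parents = PEDIGREE[x]
    return 0.0 if parents is None else coancestry(*parents)


print(coancestry("P1", "P2"), inbreeding("Q1"), inbreeding("X"))
```

For this pedigree the sketch gives a coancestry of 0.25 between the full sibs, and inbreeding coefficients of 0.25 and 0.375 in the first two generations of full-sib mating.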

Regular Systems of Inbreeding 

The consequences of regular systems of inbreeding have been 
the subject of much study. They were first described in detail by 
Wright (1921) in a series of papers which form the foundation of the
whole theory of small populations. Wright's studies were based on 
the method of path coefficients (Wright, 1934, 1954). Haldane 
(1937, 1955) and Fisher (1949) derived the consequences by the 
method of matrix algebra. The inbreeding coefficients in successive 
generations can, however, be more simply derived by application of 
the rules of coancestry explained in the previous section, and this is 
the method we shall follow here. We shall illustrate the application 
of the method for consecutive full-sib mating, which is one of the most 
commonly used systems, and give the results for some other systems. 
The inbreeding coefficients refer to autosomal genes; the results for 
sex-linked genes are described by Wright (1933) in a paper which 
also contains a useful summary of the results for autosomal genes in 
a great variety of mating systems. 

Table 5.1. Systems of mating and their recurrence equations.

System of mating                              Recurrence equation

Self-fertilisation, or repeated               $\tfrac{1}{2}(1 + F_{t-1})$
backcrosses to highly inbred line.

Full brother x sister, or offspring
x younger parent:
  Inbreeding coefficient.                     $\tfrac{1}{4}(1 + 2F_{t-1} + F_{t-2})$
  Probability of fixation
  (from Schafer, 1937).

Half sib (females half sisters).              $\tfrac{1}{8}(1 + 6F_{t-1} + F_{t-2})$

Repeated backcrosses to                       $\tfrac{1}{4}(1 + 2F_{t-1})$
random-bred individual.

Full-sib mating. The equation 5.4 given above for the coancestry
between full sibs can be applied to successive generations to
give the inbreeding coefficients under continued full-sib mating.
But it is more convenient to rearrange the equation so that the in- 
breeding coefficient is given in terms of the inbreeding coefficients 
of the previous generations. Note first that, because the mating sys- 
tem is regular, contemporaneous individuals have the same inbreeding 
coefficients and coancestries: so, referring again to the pedigree in 
Fig. 5.4, $f_{AA} = f_{BB}$, and $F_A = F_B$. Now, if we let $t$ be the generation to
which individual X belongs, then $f_{AB} = F_{t-1}$, and
$f_{AA} = f_{BB} = \tfrac{1}{2}(1 + F_{t-2})$.
The coancestry equation can therefore be rewritten to give the
inbreeding coefficient in any generation, $t$, in terms of the inbreeding
coefficients of the previous two generations, thus:

$$F_t = \tfrac{1}{4}(1 + 2F_{t-1} + F_{t-2}) \qquad (5.6)$$

This recurrence equation enables us to write down the inbreeding
coefficients in successive generations. In the first generation $F_{t-1}$
and $F_{t-2}$ are both zero and so $F_{(t=1)} = 0.25$. The inbreeding
coefficients in the first four generations are 0.25, 0.375, 0.50, and 0.59.
The rate of inbreeding is not constant in the first few generations, as
may be seen by computing $\Delta F$ from equation 3.9. For the first four
generations $\Delta F$ is 0.25, 0.17, 0.20, and 0.19. It later settles down
to a constant value of 0.191 (Wright, 1931). The inbreeding
coefficients over the first 20 generations of full-sib mating are given in
Table 5.1. 
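
The recurrence equation 5.6 is easily iterated. The sketch below uses illustrative function names and assumes that equation 3.9 defines the rate of inbreeding as $\Delta F = (F_t - F_{t-1})/(1 - F_{t-1})$, which reproduces the values quoted above.

```python
def full_sib_series(generations):
    """Inbreeding coefficients F_1 .. F_n under continued full-sib mating."""
    f = [0.0, 0.0]                       # F_{t-2} and F_{t-1} before inbreeding
    for _ in range(generations):
        f.append(0.25 * (1.0 + 2.0 * f[-1] + f[-2]))   # equation 5.6
    return f[2:]


def rate_of_inbreeding(series):
    """Delta F per generation, assuming (F_t - F_{t-1}) / (1 - F_{t-1})."""
    previous = [0.0] + series[:-1]
    return [(ft - fp) / (1.0 - fp) for ft, fp in zip(series, previous)]


f = full_sib_series(20)
delta_f = rate_of_inbreeding(f)

print([round(x, 3) for x in f[:4]])
print([round(x, 2) for x in delta_f[:4]], round(delta_f[-1], 3))
```

After the first few generations the computed rate settles at about 0.191, as stated in the text.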

Some other systems of mating may now be mentioned briefly. 
Self-fertilisation gives the most rapid inbreeding. If X is the
offspring of P, we have from the coancestry identities

$$F_X = f_{PP} = \tfrac{1}{2}(1 + F_P)$$

and the recurrence equation is therefore

$$F_t = \tfrac{1}{2}(1 + F_{t-1}) \qquad (5.7)$$

The inbreeding coefficients over the first ten generations of
self-fertilisation are given in Table 5.1. The rate of inbreeding is
constant from the beginning; $\Delta F = 0.5$ exactly.

Parent-offspring mating, in which offspring are mated to the 
younger parent, gives the same series of inbreeding coefficients as 
full-sib mating for autosomal genes, but for sex-linked genes it gives 
a slightly higher rate of inbreeding. For sex-linked genes $\Delta F$ is 0.293
after the first few generations (Wright, 1933).




Half-sib mating is usually between paternal half sibs, one male being
mated to two or more of his half sisters. If these females are half
sisters of each other the recurrence equation is

$$F_t = \tfrac{1}{8}(1 + 6F_{t-1} + F_{t-2})$$

The first 20 generations are given in Table 5.1. There are, however,
practical difficulties in the way of maintaining this system regularly,
and sometimes females that are full sisters of each other have to be
used. The inbreeding will then go a little faster. If full-sister females
are always used the recurrence equation is

$$F_t = \tfrac{1}{16}(3 + 8F_{t-1} + 4F_{t-2} + F_{t-3})$$

Repeated backcrosses to an individual or to a highly inbred line are
often made, for a variety of purposes. The resulting inbreeding is as
follows. The pedigree (Fig. 5.5) shows an individual, A, which will
probably be a male, mated to his daughter, C, his granddaughter, D, etc.
From the supplementary rule (5.5)

$$F_X = f_{AD} = \tfrac{1}{2}(f_{AA} + f_{AC})$$

Since $f_{AA} = \tfrac{1}{2}(1 + F_A)$, and $f_{AC}$ is the inbreeding
coefficient of D, the previous generation, the recurrence equation is
therefore

$$F_t = \tfrac{1}{4}(1 + F_A + 2F_{t-1})$$







Fig. 5.5 


where $F_A$ is the inbreeding coefficient of the individual to which the
repeated backcrosses are made. If A is an individual from the base
population and $F_A = 0$, the equation becomes

$$F_t = \tfrac{1}{4}(1 + 2F_{t-1})$$

The inbreeding coefficients over the first 9 generations are given in
Table 5.1. If A is an individual from a highly inbred line and $F_A = 1$,
the equation becomes

$$F_t = \tfrac{1}{2}(1 + F_{t-1})$$

which is identical with self-fertilisation. In this case A need not be
the same individual in successive generations: it can be any member
of the inbred line.
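
The two special cases can be compared numerically. The sketch below iterates the backcross recurrence $F_t = \tfrac{1}{4}(1 + F_A + 2F_{t-1})$ for $F_A = 0$ and $F_A = 1$ and sets the second beside the self-fertilisation recurrence $F_t = \tfrac{1}{2}(1 + F_{t-1})$. It assumes that the first female mated to A is unrelated to him and not inbred, so that the series starts from $F = 0$; the function names are illustrative only.

```python
def backcross_series(f_a, generations):
    """Inbreeding under repeated backcrossing to an individual with coefficient f_a."""
    f = 0.0  # assumption: the first mate of A is unrelated and non-inbred
    series = []
    for _ in range(generations):
        f = 0.25 * (1 + f_a + 2 * f)
        series.append(f)
    return series


def selfing_series(generations):
    """Inbreeding under continued self-fertilisation from a non-inbred start."""
    f = 0.0
    series = []
    for _ in range(generations):
        f = 0.5 * (1 + f)
        series.append(f)
    return series


to_base_individual = backcross_series(0.0, 9)
to_inbred_line = backcross_series(1.0, 9)
selfing = selfing_series(9)

print([round(x, 3) for x in to_base_individual])
print([round(x, 3) for x in to_inbred_line])
print([round(x, 3) for x in selfing])
```

With $F_A = 0$ the series approaches 0.5 and can go no further, whereas with $F_A = 1$ it coincides with the self-fertilisation series generation by generation.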


Example 5.2. As an example of the use of coancestry for computing 
inbreeding coefficients let us consider populations derived from "2-way" 
and from "4-way" crosses between highly inbred lines. In a 2-way cross 
two inbred lines are crossed and the population is maintained by random 
mating among the cross-bred individuals and subsequently among their 
progeny. In a 4-way cross four inbred lines are crossed in two pairs, and 
the two cross-bred groups are again crossed, subsequent generations being 
maintained by random mating. If the base population is taken to be a real, 
or hypothetical, random-bred population from which the inbred lines were 
derived, we may compute the inbreeding coefficients of the population 
derived from the cross, referring it to this base. The crosses and sub- 
sequent generations are shown schematically in the diagram below. 

Generation     2-way cross     4-way cross

    1            A x B           A x B         C x D

    2           X1 x X2         X1   X2       Y1   Y2
                                X1 x Y1       X2 x Y2

    3              O                 Z1 x Z2

    4                                   O

The inbred lines are represented by A, B, C, and D. If they are fully 
inbred, as we shall take them to be, the coefficient of inbreeding of the 
individuals from the lines is 1, and the coancestry of an individual with 
another of the same line is also 1. Therefore only one individual of each 
line need be represented in the scheme, even though any number may 
actually be used. The progeny of the crosses between the inbred lines are 
represented by X and Y, the suffices 1 and 2 indicating different individuals. 
In the 2-way cross the progeny of these cross-bred individuals are the 
foundation generation whose inbreeding coefficient we are to compute. 
They are represented by O. In the 4-way cross the two sorts of cross-bred 
individuals, X and Y, are crossed, one sort with the other. Two such 
matings are represented in the scheme. They produce the "double-cross" 
individuals, Z, whose progeny constitute the foundation generation repre- 
sented by O, whose inbreeding coefficient we are to compute. 

In the computation of the coancestries we shall omit the symbol $f$,
writing for example AB for $f_{AB}$, the coancestry of individual A with
individual B. The coancestries of the parents in generation 1 are

AA = BB = CC = DD = 1
AB = AC = AD = BC = BD = CD = 0

The coancestries in the second generation of the 2-way cross are

$X_1X_2 = \tfrac{1}{4}(AA + BB + AB + BA)$ (by equation 5.2)
$= \tfrac{1}{4}(1 + 1 + 0 + 0)$
$= \tfrac{1}{2}$

Therefore $F_O = 0.5$, which is the required inbreeding coefficient of the
foundation generation of the population derived from the 2-way cross.
The subsequent matings between the O individuals need produce no
further inbreeding provided enough 2nd generation matings are made.
The coancestries in the second generation of the 4-way cross are

$X_1X_2 = Y_1Y_2 = \tfrac{1}{2}$ (as shown for the 2-way cross)
$X_1Y_2 = X_2Y_1 = \tfrac{1}{4}(AC + AD + BC + BD) = 0$

The coancestries in the third generation are

$Z_1Z_2 = \tfrac{1}{4}(X_1X_2 + Y_1Y_2 + X_1Y_2 + X_2Y_1)$
$= \tfrac{1}{4}(\tfrac{1}{2} + \tfrac{1}{2} + 0 + 0)$
$= \tfrac{1}{4}$

Therefore the inbreeding coefficient of the foundation generation is
$F_O = 0.25$. Again, the inbreeding need not increase further, provided
enough third generation matings are made.

The meaning of these coefficients of inbreeding, with the base popula- 
tion as stated, may be clarified thus. If we made a large number of 2-way, 
or of 4-way, crosses each with a different set of inbred lines, the populations 
derived from the crosses would constitute a set of lines or sub-populations. 
The inbreeding coefficients would then indicate the expected amount of 
dispersion of gene frequencies among these lines. Populations derived
from 2-way crosses are equivalent to progenies of one generation of
self-fertilisation. The gene frequencies can therefore have only three
values, 0, 1/2, and 1. Populations derived from 4-way crosses are
equivalent to progenies of one generation of full-sib mating, and the gene
frequencies can have only five values, 0, 1/4, 1/2, 3/4, and 1.
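
The coancestry bookkeeping of Example 5.2 can be sketched in a few lines of code. The sketch assumes that the four lines are fully inbred and unrelated to one another with respect to the base population (coancestry 1 within a line, 0 between lines), and it applies the rule that the coancestry of two different individuals is the mean of the four coancestries between their parents. The dictionary layout and function names are illustrative, not part of the original method.

```python
def key(a, b):
    """Order-independent lookup key for a pair of individuals."""
    return tuple(sorted((a, b)))


# Generation 1: fully inbred, mutually unrelated lines (assumption).
f = {}
for a in "ABCD":
    for b in "ABCD":
        f[key(a, b)] = 1.0 if a == b else 0.0

# Parents of each cross-bred individual, as in the scheme of Example 5.2.
parents = {
    "X1": ("A", "B"), "X2": ("A", "B"),
    "Y1": ("C", "D"), "Y2": ("C", "D"),
    "Z1": ("X1", "Y1"), "Z2": ("X2", "Y2"),
}


def coancestry(p, q):
    """Mean of the four coancestries between the parents of p and of q."""
    total = sum(f[key(a, b)] for a in parents[p] for b in parents[q])
    return total / 4


# Generation 2 coancestries.
for p, q in [("X1", "X2"), ("Y1", "Y2"), ("X1", "Y2"), ("X2", "Y1")]:
    f[key(p, q)] = coancestry(p, q)

# The inbreeding coefficient of O equals the coancestry of its parents.
F_two_way = f[key("X1", "X2")]
F_four_way = coancestry("Z1", "Z2")
print(F_two_way, F_four_way)
```

Under these assumptions the two results are 0.5 and 0.25, the values obtained in the example.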

Reference to a different base population. Having computed
a coefficient of inbreeding with reference to a certain group of
individuals as the base population, one may then want to change the base
and refer the inbreeding coefficient to another group of individuals.
One might, for example, compute the inbreeding coefficient of a herd
of cattle referred to the foundation animals of the herd as the base,
and then want to recompute the inbreeding coefficient so as to refer to
the breed as a whole with a base population in the more remote past.
Let X represent the group of individuals whose inbreeding coefficient is
required, and let A and B represent ancestral groups, A being more
remote than B, as shown in Fig. 5.6. Then it follows from equation 3.11
that

$$P_{X.A} = P_{X.B} P_{B.A} \qquad (5.13)$$

where $P_{X.A} = 1 - F_{X.A}$, $F_{X.A}$ being the inbreeding coefficient of X
referred to A as base, and similarly for the other subscripts.

Fig. 5.6 (line of descent: A, then B, then X)

Example 5.3. A selection experiment with mice was started from a
foundation population made by a 4-way cross of highly inbred lines
(Falconer, 1953). According to the computation given above in Example
5.2, the inbreeding coefficient of this foundation population was reckoned
to be 25 per cent. On this basis the inbreeding coefficients of subsequent
generations were computed from the pedigrees by the coancestry method. The
inbreeding coefficient at generation 24, computed thus, was 58.8 per cent.
What would the inbreeding coefficient be if referred to the foundation
population as base, instead of to the more remote hypothetical population
from which the inbred lines were derived? The figures to be substituted in
equation 5.13 are $P_{X.A} = 0.412$ and $P_{B.A} = 0.75$. Therefore
$P_{X.B} = 0.412 / 0.75 = 0.549$. The inbreeding coefficient at generation 24,
referred to the foundation population as base, is therefore 45.1 per cent.

We may use this population of mice also to compare the rate of
inbreeding when computed by the two methods, from the pedigrees and from
the effective population size. Computed from the pedigrees, the average
rate of inbreeding over the 24 generations is found from equation 3.12
thus: $0.451 = 1 - (1 - \Delta F)^{24}$, whence $\Delta F = 2.47$ per cent. The
population was maintained by six pairs of parents in each generation.
Matings were made between individuals with the lowest coancestries and
this has the effect of equalising family size, as explained in the previous
chapter. Therefore, by equation 4.8, the effective number was twice the
actual, i.e. $N_e = 24$. The rate of inbreeding, by equation 4.1, is therefore
$\Delta F = 1/48 = 2.08$ per cent. The slightly higher rate of inbreeding as
computed directly from the pedigrees can be attributed to some
irregularities in the mating system, resulting from the sterility of some
parents and the death of some whole litters. The random drift of a colour
gene in this line, and two others maintained in the same manner, was shown
in Fig. 3.2.
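
The arithmetic of Example 5.3 can be retraced as follows. This is only a sketch of the calculation, using the figures given in the example; it takes the change of base to be $P_{X.A} = P_{X.B}P_{B.A}$ with $P = 1 - F$, the average rate from $F_t = 1 - (1 - \Delta F)^t$, and the rate from effective number as $\Delta F = 1/(2N_e)$. The variable names are illustrative.

```python
# Inbreeding coefficients referred to the remote base A (from the example).
F_XA = 0.588   # generation 24, computed from pedigrees
F_BA = 0.25    # foundation population (4-way cross)

# Change of base: P_XA = P_XB * P_BA, with P = 1 - F.
P_XB = (1 - F_XA) / (1 - F_BA)
F_XB = 1 - P_XB

# Average rate of inbreeding over t generations from F_t = 1 - (1 - dF)**t.
t = 24
dF_pedigree = 1 - P_XB ** (1 / t)

# Rate expected from effective number: six pairs of parents, family size
# equalised, so N_e is taken as twice the actual number.
N = 12
N_e = 2 * N
dF_effective = 1 / (2 * N_e)

print(round(P_XB, 3), round(100 * F_XB, 1))
print(round(100 * dF_pedigree, 2), round(100 * dF_effective, 2))
```

The two rates differ by about 0.4 of a percentage point, which is the discrepancy the example attributes to irregularities in the mating system.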

Fixation. One is often more interested in the probability of 
fixation as a consequence of inbreeding than in the inbreeding
coefficient. The inbreeding coefficient gives the probability of an
individual being a homozygote, which is $1 - 2pq(1 - F)$ from Table 3.1.
But one wants to know also how soon all individuals in a line can be 
expected to be homozygous for the same allele. This is the "purity" 
implied by the term "pure line" which is often used to mean highly 
inbred line. The degree of "purity" is the probability of fixation. 
The probability of fixation has been worked out by Haldane (1937, 
1955), Schafer (1937) and Fisher (1949). It depends on the number 
of alleles and their arrangement in the initial mating of the line. The 
probabilities of fixation over the first 20 generations of full-sib mating 
are given in Table 5.1, when 4 alleles were present in the initial 
mating. There cannot, of course, be more than 4 alleles in a sib- 
mated line, and when there are fewer the probability of fixation is 
greater (see Haldane, 1955). 

Linkage. Linkage introduces a problem in connexion with the 
consequences of inbreeding of which a solution is sometimes needed. 
Individuals heterozygous at a particular locus will also be hetero- 
zygous for a segment of chromosome in which the locus lies, and it 
may be of interest to know the average length of heterozygous 
segments. The form in which this problem most commonly arises is 
connected with the transference of a marker gene to an inbred line by 
repeated backcrosses, when one wants to know how much of the 
foreign chromosome is transferred along with the marker. This 
problem has been worked out by Bartlett and Haldane (1935). A 
dominant gene can be transferred by successive crosses of the
heterozygote to the strain into which it is to be introduced. In this case the
mean length of chromosome introduced with the gene after $t$ crosses
is $1/t$ cross-over units on each side of the gene. A recessive gene is
commonly transferred by alternating backcrosses and intercrosses
from which the homozygote is extracted. The mean length of foreign
chromosome in this case is $2/t$ cross-over units on each side, after $t$
cycles. Other cases are described in the paper cited. From this and a
knowledge of the total map length of the organism we can arrive at
the expected proportion of the total chromatin that is still
heterogeneous.
Example 5.4. What percentage of the total chromatin is expected to
be still heterogeneous after a dominant gene has been transferred to an
inbred strain of mice by five, and by ten successive backcrosses? The
expected length of heterogeneous chromosome associated with the gene
is 0.2 centimorgans after five crosses, and 0.1 cM after ten. The average
map length of the 20 chromosomes in male mice is 97.7 cM (Slizynski,
1955). Therefore 0.2 per cent of the chromosome will be heterogeneous
after five crosses, and 0.1 per cent after ten, assuming that the gene is
transferred through males, and taking the average as being the length of
the chromosome carrying the gene. The percentage of chromatin not
associated with the gene that is expected still to be heterogeneous can be
taken as approximately $1 - F_t$ from column A of Table 5.1: that is, 3.1 per
cent after five crosses and 0.1 per cent after ten. The total percentage of
heterogeneous chromatin is therefore 3.4 per cent after five crosses, and
0.2 per cent after ten.

Fig. 5.7. Theoretical models illustrating the distribution of
heterozygous segments of chromosome (shown black) after (a) 5
generations, (b) 10 generations, and (c) 20 generations of full-sib
mating, in an organism with twenty chromosomes, such as the
mouse. The total map-length is taken to be 2500 centimorgans,
and the chromosomes are assumed to be of equal genetic length.
The points marked A, B, C, D, in chromosomes I to IV are loci
held heterozygous by forced segregation, and the associated
heterozygous segments are cross-hatched. (From Fisher, The Theory of
Inbreeding, Oliver and Boyd, 1949; reproduced by courtesy of the
author and publishers.)


The more general problem of the mean length of heterozygous 
segments during inbreeding has been treated by Haldane (1936) and 
by Fisher (1949). It need not be discussed in detail here. The con- 
clusions are well illustrated in Fig. 5.7, which is Fisher's diagrammatic 
representation of the situation in an organism with 20 chromosomes, 
such as the mouse, after five, ten, and twenty generations of full-sib 
mating. The diagrams show the expected number and lengths of 
unfixed segments. The first four chromosomes are supposed to carry 
loci at which segregation is maintained by mating always hetero- 
zygotes with homozygotes. The slower reduction of the lengths of 
these unfixed segments can be seen. 

Mutation. After a long period of inbreeding mutation may be- 
come an important factor in determining the frequency of hetero- 
zygotes. If u is the mutation rate of a gene that has reached near- 
fixation in the line, then the frequency of heterozygotes at this locus 
due to mutation is $4u$ under self-fertilisation, and $12u$ under full-sib
mating, for autosomal loci (Haldane, 1936). These are very small 
frequencies if we are concerned with only one locus, but if the effects 
of all loci are taken together mutation is not entirely negligible as a 
source of heterozygosis in long inbred strains such as the widely used 
strains of mice. The practical consequences of the origin of hetero- 
geneity by mutation are that the characteristics of a line will slowly 
change through the fixation of mutant alleles, and that sub-lines will 
become differentiated. Examples are given in Chapter 15. 

Selection favouring heterozygotes. When close inbreeding is 
practised the object is generally to produce fixation, or homozygosis 
within the lines, and the experimenter is not usually interested in the 
differentiation between lines. It is therefore a matter of little concern 
which allele is fixed, so long as fixation occurs. Selection against a 
deleterious recessive may prevent the deleterious allele becoming 
fixed, but it will not prevent or delay the fixation of the more favour- 
able allele. Therefore the conclusions about selection reached in the 
previous chapter are of little relevance to close inbreeding. Selection 
that favours heterozygotes, however, is another matter. A conse- 
quence of inbreeding almost universally observed is a reduction of 
fitness, the reasons for which will be given in Chapter 14. Thus 
selection resists the inbreeding, since the more homozygous indi- 
viduals are the less fit, and this can only mean that selection favours 
heterozygotes — not necessarily heterozygotes of the loci taken singly, 
but heterozygotes of segments of chromosome. It is only necessary
to have two deleterious genes, recessive or partially recessive, linked
in repulsion, to confer a selective advantage on the heterozygote of 
the segment of chromosome within which the genes are located. It is 
therefore important to find out how the opposing tendencies of 
inbreeding and selection in favour of heterozygotes balance each 
other, in order to assess the reliability of the computed inbreeding 
coefficient as a measure of the probability of fixation. 

The outcome of the joint action of inbreeding and selection in 
favour of heterozygotes depends on whether there is replacement of 
the less fit lines by the more fit; in other words, on whether selection 
operates between lines or only within lines. Within any one line, 
selection against homozygotes only delays the progress toward 
fixation and cannot arrest it, the delay being roughly in proportion to 
the intensity of the selection (Reeve, 1955a). Table 5.2 shows the
rates of inbreeding with various intensities of selection, when there
are two alleles and selection acts equally against both homozygotes.
(The rate of inbreeding, $\Delta F$, is used here to mean the rate of dispersion
of gene frequencies and, after the first few generations when the
distribution of gene frequencies has become flat, it measures the rate
of fixation, i.e. the proportion of unfixed loci that become fixed in
each generation, as explained in Chapter 3.) The delay of fixation
caused by selection is least under the closest systems of inbreeding.
Thus the rate is halved under self-fertilisation when the coefficient
of selection is 0.67; under full-sib mating when it is 0.44; and under
half-sib mating when it is 0.35. It will be seen from the table that the
rate of inbreeding, though much reduced by intense selection, does
not become zero until the coefficient of selection rises to 1. If there is
only one line, therefore, fixation eventually goes to completion, unless
both homozygotes are entirely inviable or sterile.

Table 5.2

Rate of inbreeding, $\Delta F$ (%), with selection favouring the
heterozygote. (Except with self-fertilisation, the rates are
only approximate over the first few generations of inbreeding.)
Column headings: coefficient of selection against the homozygotes;
full sib; half sib*.

* Females full sisters to each other.

If there are many lines, however, selection may arrest the progress 
of fixation and lead to a state of equilibrium, for the following reason. 
The amount by which the inbreeding has changed the frequency of a 
particular gene from its original value differs at any one time from 
line to line. In other words, the state of dispersion of the locus has 
gone further in some lines than in others. Now, if those lines in 
which the dispersion has gone furthest, and which are consequently 
most reduced in fitness, die out or are discarded, and if they are 
replaced by sub-lines taken from the lines in which it has gone least
far, then the progress of the dispersive process will have been set 
back. When there is replacement of lines in this way, and the selec- 
tion is sufficiently intense, a state of balance between the opposing 
tendencies of inbreeding and selection is reached. The intensity of 
selection needed to arrest the dispersive process has been worked out 
for regular systems of close inbreeding (Hayman and Mather, 1953). 
Some of the conclusions, for the case of two alleles with equal selec- 
tion against the two homozygotes, are given in Table 5.3, which 
shows the intensity of selection against the homozygotes which will 
(a) just allow fixation to go eventually to completion, and (b) arrest
the dispersive process at a point of balance where the frequency of
heterozygotes is half its original value, i.e. where $P = 1 - F = 0.5$.
These figures show that only a moderate advantage of heterozygotes
will suffice to prevent complete fixation. Under full-sib mating, for
example, loci, or segments of chromosomes that do not recombine,
with a 25 per cent disadvantage in homozygotes will not all go to
fixation. And, of those with a 50 per cent disadvantage, only about
half will become fixed, no matter for how long the inbreeding is
continued.

Table 5.3

Balance between inbreeding and selection in favour of
heterozygotes, when selection operates between lines. The
figures are the selective disadvantages of homozygotes, $s$,
expressed as percentages. Column (a) shows the highest
value of $s$ compatible with complete fixation ($P = 0$). Column (b)
shows the value of $s$ that leads to a steady state at
$P = 1 - F = 0.5$. Rows: mating system, including half-sib mating
(females half sisters).

It must be stressed, however, that prevention of fixation in this 
way can only take place when there is replacement of lines and sub- 
lines. The following breeding methods, for example, would allow 
replacement of lines: if seed, set by self-fertilisation, were collected 
in bulk and a random sample taken for planting, and this were re- 
peated in successive generations; or, if sib pairs of mice were taken at 
random from all the surviving progeny, so that the same amount of 
breeding space was occupied in successive generations. 

The conclusions outlined above refer to a single locus. If there 
were more than a few loci on different chromosomes all subject to 
selection against homozygotes of an intensity sufficient to arrest or 
seriously delay the progress of inbreeding, the total loss of fitness 
from all the loci would be very severe. Inbred lines of organisms 
with a high reproductive rate, such as plants and Drosophila, might 
well stand up to a total loss of fitness sufficient to keep several loci 
or segments of chromosome permanently unfixed. But the loss of 
fitness involved in preventing the fixation of more than two or three 
loci in an organism such as the mouse would be crippling. Under 
laboratory conditions the highly inbred strains of mice, after 100 or 
more generations of sib-mating, have a fitness not much less than half 
that of non-inbred strains. It is conceivable that they might have one 
locus permanently unfixed, but it is difficult to believe that they can 
have more. Complete lethality or sterility of both homozygotes at 
one locus means a 50 per cent loss of progeny; at two unlinked loci, a 
75 per cent loss. A mouse strain with a mortality or sterility of 50 
per cent can be kept going, but hardly one with 75 per cent. 
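
The last two figures follow from simple multiplication, as the following sketch shows. It assumes that at each locus only heterozygotes survive and breed, so that every mating is between heterozygotes and half the progeny are homozygous at that locus, and that unlinked loci act independently. The function name is hypothetical.

```python
def proportion_lost(n_loci):
    """Fraction of progeny lost when both homozygotes are lethal or sterile
    at each of n_loci unlinked loci (heterozygote x heterozygote matings)."""
    surviving = 0.5 ** n_loci  # heterozygous at every one of the loci
    return 1 - surviving


for n in (1, 2, 3):
    print(n, proportion_lost(n))
```

On these assumptions the loss is 50 per cent for one locus and 75 per cent for two, and a third locus would raise it to 87.5 per cent.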




It will be obvious, to biologist and layman alike, that the sort of 
variation discussed in the foregoing chapters embraces but a small 
part of the naturally occurring variation. One has only to consider 
one's fellow men and women to realise that they all differ in countless 
ways, but that these differences are nearly all matters of degree and 
seldom present clear-cut distinctions attributable to the segregation 
of single genes. If, for example, we were to classify individuals ac- 
cording to their height, we could not put them into groups labelled 
"tall" and "short," because there are all degrees of height, and a 
division into classes would be purely arbitrary. Variation of this sort, 
without natural discontinuities, is called continuous variation, and 
characters that exhibit it are called quantitative characters or metric 
characters, because their study depends on measurement instead of 
on counting. The genetic principles underlying the inheritance of 
metric characters are basically those outlined in the previous chapters, 
but since the segregation of the genes concerned cannot be followed 
individually, new methods of study have had to be developed and 
new concepts introduced. A branch of genetics has consequently 
grown up, concerned with metric characters, which is called variously 
population genetics, biometrical genetics or quantitative genetics. The 
importance of this branch of genetics need hardly be stressed; most 
of the characters of economic value to plant and animal breeders are 
metric characters, and most of the changes concerned in micro- 
evolution are changes of metric characters. It is therefore in this 
branch that genetics has its most important application to practical 
problems and also its most direct bearing on evolutionary theory. 

How does it come about that the intrinsically discontinuous varia- 
tion caused by genetic segregation is translated into the continuous 
variation of metric characters? There are two reasons: one is the 
simultaneous segregation of many genes affecting the character, and 
the other is the superimposition of truly continuous variation arising 
from non-genetic causes. Consider, for example, a simplified
situation. Suppose there is segregation at six unlinked loci, each with two
alleles at frequencies of 0.5. Suppose that there is complete dominance
of one allele at each locus and that the dominant alleles each add
one unit to the measurement of a certain character. Then if the
segregation of these genes were the only cause of variation there would
be 7 discrete classes in the measurements of the character, according
to whether the individual had the dominant allele present at 0, 1, 2, ...
or 6 of the loci. The frequencies of the classes would be according to
the binomial expansion of $(\tfrac{1}{4} + \tfrac{3}{4})^6$, as shown in Fig. 6.1 (a). If our
measurements were sufficiently accurate we should recognise these
classes as being distinct and we should be able to place any individual
unambiguously in its class. If there were more genes segregating but
each had a smaller effect, there would be more classes with smaller
differences between them, as in Fig. 6.1 (b). It would then be more
difficult to distinguish the classes, and if the difference between the
classes became about as small as the error of measurement we should
no longer be able to recognise the discontinuities. In addition, metric
characters are subject to variation from non-genetic causes, and this
variation is truly continuous. Its effect is, as it were, to blur the edges
of the genetic discontinuity so that the variation as we see it becomes
continuous, no matter how accurate our measurements may be.

Fig. 6.1. Distributions expected from the simultaneous segregation
of two alleles at each of several or many loci: (a) 6 loci, (b) 24
loci. There is complete dominance of one allele over the other at
each locus, and the gene frequencies are all 0.5. Each locus, when
homozygous for the recessive allele, is supposed to reduce the
measurement by 1 unit in (a), and by 1/4 unit in (b). The horizontal
scale, representing the measurement, shows the number of loci
homozygous for the recessive allele, and the vertical axis shows the
probability, or the percentage of individuals expected in each class.
The probabilities are derived from the binomial expansion of
$(\tfrac{1}{4} + \tfrac{3}{4})^n$, where $n$ is the number of loci, and they are taken from the
tables of Warwick (1932).
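
The class frequencies described above can be generated directly from the binomial expansion. The sketch below assumes, as in the text, complete dominance and gene frequencies of 0.5, so that a locus is homozygous for the recessive allele with probability 1/4; it counts the number of such loci, which is the horizontal scale of Fig. 6.1. The function name is illustrative.

```python
from math import comb


def class_probabilities(n_loci, p_recessive=0.25):
    """Probability of 0, 1, ..., n_loci loci being homozygous recessive.

    With complete dominance and gene frequencies of 0.5, each locus is
    homozygous recessive with probability 1/4, independently of the others.
    """
    q = 1 - p_recessive
    return [
        comb(n_loci, k) * p_recessive**k * q ** (n_loci - k)
        for k in range(n_loci + 1)
    ]


six = class_probabilities(6)        # as in Fig. 6.1 (a)
twenty_four = class_probabilities(24)  # as in Fig. 6.1 (b)

print(len(six), len(twenty_four))   # number of classes in each case
print([round(100 * p, 1) for p in six])  # percentages in the 7 classes
```

Six loci give 7 classes and twenty-four loci give 25, spread over the same range of measurement when each locus in the second case has a quarter of the effect.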

Thus the distinction between genes concerned with Mendelian 
characters and those concerned with metric characters lies in the 
magnitude of their effects relative to other sources of variation. A 
gene with an effect large enough to cause a recognisable discontinuity 
even in the presence of segregation at other loci and of non-genetic 
variation can be studied by Mendelian methods, whereas a gene whose 
effect is not large enough to cause a discontinuity cannot be studied 
individually. This distinction is reflected in the terms major gene and 
minor gene. There are, however, all intermediate grades, genes that 
cannot properly be classed as major or as minor, such as the "bad 
genes" of Mendelian genetics. And, furthermore, as a result of 
pleiotropy the same gene may be classed as major with respect to one 
character and minor with respect to another character. The distinc- 
tion, though convenient, is therefore not a fundamental one, and there 
is no good evidence that there are two sorts of genes with different 
properties. Variation caused by the simultaneous segregation of 
many genes may be called polygenic variation, and the minor genes 
concerned are sometimes referred to as polygenes (see Mather, 1949). 

Metric Characters 

The metric characters that might be studied in any higher organ- 
ism are almost infinitely numerous. Any attribute that varies con- 
tinuously and can be measured might in principle be studied as a 
metric character — anatomical dimensions and proportions, physio- 
logical functions of all sorts, and mental or psychological qualities. 
The essential condition is that they should be measurable. The
technique of measurement, however, sets a practical limitation on
what can be studied. Usually rather large numbers of individuals
have to be measured and the study of any character whose measurement
requires an elaborate technique therefore becomes impracticable.
Consequently the characters that have been used in studies of
quantitative genetics are predominantly anatomical dimensions, or
physiological functions measured in terms of an end-product, such as
lactation, fertility, or growth rate.

Fig. 6.2. Frequency distributions of four metric characters, with
normal curves superimposed. The means are indicated by arrows.
The characters are as follows, the number of observations on which
each histogram is based being given in brackets:

(a) Mouse (♂♂): growth from 3 to 6 weeks of age. (380)

(b) Mouse: litter size (number of live young in 1st litters).

(c) Drosophila melanogaster (♀♀): number of bristles on ventral
surface of 4th and 5th abdominal segments, together. (900)

(d) Drosophila melanogaster (♀♀): number of facets in the eye of
the mutant "Bar". (488)

(a), (b), and (c) are from original data: (d) is from data of Zeleny

Some examples of metric characters are illustrated in Fig. 6.2. 
The variation is represented graphically by the frequency distribu- 
tion of measurements. The measurements are grouped into equally 
spaced classes and the proportion of individuals falling in each class 
is plotted on the vertical scale. The resulting histogram is discontinu- 
ous only for the sake of convenience in plotting. If the class ranges 
were made smaller and the number of individuals measured were in- 
creased indefinitely the histogram would become a smooth curve. 
The variation of some metric characters, such as bristle number or 
litter size, is not strictly speaking continuous because, being measured 
by counting, their values can only be whole numbers. Nevertheless, 
one can regard the measurements in such cases as referring to an 
underlying character whose variation is truly continuous though 
expressible only in whole numbers, in a manner analogous to the 
grouping of measurements into classes. For example, litter size may 
be regarded as a measure of the underlying, continuously varying 
character, fertility. For practical purposes such characters can be 
treated as continuously varying, provided the number of classes is 
not too small. When there are too few classes, as for example when 
susceptibility to disease is expressed as death or survival, different 
methods have to be employed, as will be explained in Chapter 18. 

The frequency distributions of most metric characters approxi- 
mate more or less closely to normal curves. This can be seen in 
Fig. 6.2, where the smooth curves drawn through the histograms are 
normal curves having means and variances calculated from the data. 
In the study of metric characters it is therefore possible to make use 
of the properties of the normal distribution and to apply the appro- 
priate statistical techniques. Sometimes, however, the scale of 
measurement must be modified if a distribution approximating to the 
normal is to be obtained. The distribution in Fig. 6.2 (d), for example,
would be skewed if measured and plotted simply as the number of
facets. But it becomes symmetrical, and approximates to a normal
distribution, if measured and plotted in logarithmic units. The
criteria on which the choice of a scale of measurement rests cannot be
fully appreciated at this stage, and will be explained in Chapter 17.
Meantime it will be assumed that any metric character under
discussion is measured on an appropriate scale and has a distribution
that is approximately normal.

General Survey of the Subject-matter 

There are two basic genetic phenomena concerned with metric
characters, both more or less familiar to all biologists, and each forms
the basis of a breeding method. The first is the resemblance between 
relatives. Everyone is familiar with the fact that relatives tend to 
resemble each other, and the closer the relationship, in general the 
closer the resemblance. Though it is only in our own species that 
resemblances are readily discernible without measurement, the 
phenomenon is equally present in other species. The degree of 
resemblance varies with the character, some showing more, some less. 
The resemblance between offspring and parents provides the basis 
for selective breeding. Use of the more desirable individuals as 
parents brings about an improvement of the mean level of the next 
generation, and just as some characters show more resemblance than 
others, so some are more responsive to selection than others. The 
degree of resemblance between relatives is one of the properties of a 
population that can be readily observed, and it is one of the aims of 
quantitative genetics to show how the degree of resemblance between 
different sorts of relatives can be used to predict the outcome of 
selective breeding and to point to the best method of carrying out the 
selection. This problem will form the central theme of the next 
seven chapters, the resemblance between relatives being dealt with in 
Chapters 9 and 10, and the effects of selection in Chapters 11-13.

The second basic genetic phenomenon is inbreeding depression,
with its converse hybrid vigour, or heterosis. This phenomenon is less
familiar to the layman than the first, since the laws against incest pre- 
vent its more obvious manifestations in our own species; but it is well 
known to animal and plant breeders. Inbreeding tends to reduce the 
mean level of all characters closely connected with fitness in animals 
and in naturally outbreeding plants, and to lead in consequence to loss 
of general vigour and fertility. Since most characters of economic 
value in domestic animals and plants are aspects of vigour or fertility, 
inbreeding is generally deleterious. The reduced vigour and fertility
of inbred lines is restored on crossing, and in certain circumstances
this hybrid vigour can be made use of as a means of improvement. 
The enormous improvement of the yield of commercially grown 
maize has been achieved by this means and represents probably the 
greatest practical achievement of genetics (see Mangelsdorf, 1951).
The effects of inbreeding and crossing will be described in Chapters
14-16.

The properties of a population that we can observe in connexion 
with a metric character are means, variances, and covariances. The 
natural subdivision of the population into families allows us to analyse 
the variance into components which form the basis for the measure- 
ment of the degree of resemblance between relatives. We can in 
addition observe the consequences of experimentally applied breed- 
ing methods, such as selection, inbreeding or cross-breeding. The 
practical objective of quantitative genetics is to find out how we can 
use the observations made on the population as it stands to predict 
the outcome of any particular breeding method. The more general 
aim is to find out how the observable properties of the population are 
influenced by the properties of the genes concerned and by the various 
non-genetic circumstances that may influence a metric character. The 
chief properties of genes that have to be taken account of are the 
degree of dominance, the manner in which genes at different loci 
combine their effects, pleiotropy, linkage, and fitness under natural 
selection. To take account of all these properties simultaneously, in 
addition to a variety of non-genetic circumstances, would make the 
problems unmanageably complex. We therefore have to simplify 
matters by dealing with one thing at a time, starting with the simpler
situations.

The plan to be followed in the succeeding chapters is this: we 
shall first show what determines the population mean, and then 
introduce two new concepts — average effect and breeding value — 
which are necessary to an understanding of the variance. Then we 
shall discuss the variance, its analysis into components, and the co- 
variance of relatives, which will lead us to the degree of resemblance 
between relatives. In all this we shall take full account of dominance 
from the beginning: the other complicating factors will be more 
briefly discussed when they become relevant. The most important 
simplification that we shall make concerns the effect of genes on 
fitness: we shall assume that Mendelian segregation is undisturbed 
by differential fitness of the genotypes. The description of means,
variances, and covariances will refer to a random breeding popula-
tion, with Hardy-Weinberg equilibrium genotype frequencies, with
no selection and no inbreeding. That is to say, we shall describe the 
population before any special breeding method is applied to it. Then 
in Chapters 11-13 we shall describe the effects of selection, and in
Chapters 14-16 the effects of inbreeding. This will cover the funda- 
mentals of quantitative genetics, and in the final chapters we shall 
discuss some special topics. 


CHAPTER 7

VALUES AND MEANS



We have seen in the early chapters that the genetic properties of a 
population are expressible in terms of the gene frequencies and geno- 
type frequencies. In order to deduce the connexion between these 
on the one hand and the quantitative differences exhibited in a metric 
character on the other, we must introduce a new concept, the concept 
of value, expressible in the metric units by which the character is
measured. The value observed when the character is measured on an
individual is the phenotypic value of that individual. All observations, 
whether of means, variances, or covariances, must clearly be based on 
measurements of phenotypic values. In order to analyse the genetic 
properties of the population we have to divide the phenotypic value 
into component parts attributable to different causes. Explanation of 
the meanings of these components is our chief concern in this chapter, 
though we shall also be able to find out how the population mean is 
influenced by the array of gene frequencies. 

The first division of phenotypic value is into components attribut- 
able to the influence of genotype and environment. The genotype is 
the particular assemblage of genes possessed by the individual, and 
the environment is all the non-genetic circumstances that influence the 
phenotypic value. Inclusion of all non-genetic circumstances under
the term environment means that the genotype and the environment
are by definition the only determinants of phenotypic value. The two
components of value associated with genotype and environment are 
the genotypic value and the environmental deviation. We may think 
of the genotype conferring a certain value on the individual and the 
environment causing a deviation from this, in one direction or the 
other. Or, symbolically, 

P=G+E (7.1) 

where P is the phenotypic value, G is the genotypic value, and E is the
environmental deviation. The mean environmental deviation in the
population as a whole is taken to be zero, so that the mean phenotypic
value is equal to the mean genotypic value. The term population
mean then refers equally to phenotypic or to genotypic values. When
dealing with successive generations we shall assume for simplicity that 
the environment remains constant from generation to generation, so 
that the population mean is constant in the absence of genetic change. 
If we could replicate a particular genotype in a number of individuals 
and measure them under environmental conditions normal for the 
population, their mean environmental deviations would be zero, and 
their mean phenotypic value would consequently be equal to the 
genotypic value of that particular genotype. This is the meaning of 
the genotypic value of an individual. In principle it is measurable, 
but in practice it is not, except when we are concerned with a single 
locus where the genotypes are phenotypically distinguishable, or with 
the genotypes represented in highly inbred lines. 

For the purposes of deduction we must assign arbitrary values to 
the genotypes under discussion. This is done in the following way. 
Considering a single locus with two alleles, A₁ and A₂, we call the
genotypic value of one homozygote +a, that of the other homozygote
-a, and that of the heterozygote d. (We shall adopt the convention
that A₁ is the allele that increases the value.) We thus have a scale of
genotypic values as in Fig. 7.1. The origin, or point of zero value, on
this scale is mid-way between the values of the two homozygotes.


Genotype           A₂A₂                A₁A₂      A₁A₁

Genotypic value    -a         0        d         +a


Fig. 7.1. Arbitrarily assigned genotypic values. 

The value, d, of the heterozygote depends on the degree of dominance.
If there is no dominance, d = 0; if A₁ is dominant over A₂, d is positive,
and if A₂ is dominant over A₁, d is negative. If dominance is com-
plete, d is equal to +a or -a, and if there is overdominance, d is
greater than +a or less than -a. The degree of dominance may be
expressed as d/a.

Example 7.1. For the purposes of illustration in this chapter, and also 
later on, we shall refer to a dwarfing gene in the mouse, known as "pygmy"
(symbol pg), described by King (1950, 1955), and by Warwick and Lewis
(1954). This gene reduces body-size and is nearly, but not quite, recessive
in its effect on size. It was present in a strain of small mice (MacArthur's)
at the time the studies cited above were made. The weights of mice of the
three genotypes at 6 weeks of age were approximately as follows (sexes
averaged):

                       ++      +pg     pgpg
Weight in grams        14      12      6

(The weight of heterozygotes given here is to some extent conjectural, but 
it is unlikely to be more than 1 gm. in error.) These are average weights 
obtained under normal environmental conditions, and they are therefore 
the genotypic values. The mid-point in genotypic value between the two 
homozygotes is 10 gm., and this is the origin, or zero-point, on the scale 
of values assigned as in Fig. 7.1. The value of a on this scale is therefore
4 gm., and that of d is 2 gm. 
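
The assignment of values in Example 7.1 can be sketched in a few lines of Python. This is only an illustration: the function name `assign_scale` and its argument names are assumptions introduced here, not notation from the text.

```python
def assign_scale(high_homozygote, heterozygote, low_homozygote):
    """Return (origin, a, d) on the scale of Fig. 7.1.

    The origin is the mid-point between the two homozygotes; a and d are
    the deviations of the higher homozygote and of the heterozygote from it.
    """
    origin = (high_homozygote + low_homozygote) / 2
    a = high_homozygote - origin
    d = heterozygote - origin
    return origin, a, d


# Weights (gm.) of ++, +pg and pgpg mice from Example 7.1.
origin, a, d = assign_scale(14, 12, 6)
print(origin, a, d)   # 10.0 4.0 2.0
print(d / a)          # ratio d/a for the pygmy gene: 0.5
```

The ratio d/a printed on the last line is positive and less than 1, which corresponds to the statement that pygmy is nearly, but not quite, recessive.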

Population Mean 

We can now see how the gene frequencies influence the mean of 
the character in the population as a whole. Let the gene frequencies 
of A₁ and A₂ be p and q respectively. Then the first two columns of
Table 7.1 show the three genotypes and their frequencies in a random 
breeding population, from formula 1.2. The third column shows the 
genotypic values as specified above.

Table 7.1

Genotype    Frequency    Value    Freq. × val.
A₁A₁        p²           +a       p²a
A₁A₂        2pq          d        2dpq
A₂A₂        q²           -a       -q²a

                         Sum = a(p - q) + 2dpq

The mean value in the whole
population is obtained by multiplying the value of each genotype by
its frequency and summing over the three genotypes. The reason why 
this yields the mean value may be understood by converting fre- 
quencies to numbers of individuals. Multiplying the value by the 
number of individuals in each genotype and summing over genotypes 
gives the sum of values of all individuals. The mean value would then 
be this sum of values divided by the total number of individuals. The 
procedure in working with frequencies is the same, but since the sum 
of the frequencies is 1, the sum of values x frequencies is the mean 
value. In other words, the division by the total number has already 
been made in obtaining the frequencies. Multiplication of values by 
frequencies to obtain the mean value is a procedure that will be often
used in this chapter and subsequent ones. Returning to the popula-
tion mean, multiplication of the value by the frequency of each
genotype is shown in the last column of Table 7.1. Summation of
this column is simplified by noting that p² - q² = (p + q)(p - q) = p - q.
The population mean, which is the sum of this column, is thus

M = a(p - q) + 2dpq                                     (7.2)


This is both the mean genotypic value and the mean phenotypic 
value of the population with respect to the character. 

The contribution of any locus to the population mean thus has 
two terms: a(p - q) attributable to homozygotes, and 2dpq attributable
to heterozygotes. If there is no dominance (d = 0) the second term is
zero, and the mean is proportional to the gene frequency: M = a(1 - 2q).
If there is complete dominance (d = a) the mean is proportional to the
square of the gene frequency: M = a(1 - 2q²). The total range of
values attributable to the locus is 2a, in the absence of overdominance.
That is to say, if A₁ were fixed in the population (p = 1) the popula-
tion mean would be a, and if A₂ were fixed (q = 1) it would be -a.
If the locus shows overdominance, however, the mean of an unfixed 
population is outside this range. 

Example 7.2. Let us take again the pygmy gene in mice, as described 
in Example 7.1, and see what effect this gene would have on the population 
mean when present at two particular frequencies. First, the total range is 
from 6 gm. to 14 gm.: a population consisting entirely of pygmy homo- 
zygotes would have a mean of 6 gm., and one from which the gene was 
entirely absent would have a mean of 14 gm. (These values refer speci- 
fically to MacArthur's Small Strain at the time the observations were 
made.) Now suppose the gene were present at a frequency of 0.1, so that
under random mating homozygotes would appear with a frequency of 1
per cent. The values to be substituted in equation 7.2 are p = 0.9, q = 0.1,
and a = 4 gm., d = 2 gm., as shown in Example 7.1. The population mean,
by equation 7.2, is therefore: M = 4 × 0.8 + 2 × 0.18 = 3.56. This value of
the mean, however, is measured from the mid-homozygote point, which is
10 gm., as origin. Therefore the actual value of the population mean is
13.56 gm. Next suppose the gene were present at a frequency of 0.4.
Substituting in the same way, we find M = 1.76, to which must be added
10 gm. for the origin, giving a value of 11.76 gm. Rough corroboration of
these figures is given by the records of the strain carrying the gene.
When the gene was present at a frequency of about 0.4 the mean weight
was about 12 gm. Two generations later, when the pygmy gene had been 
deliberately eliminated, the mean weight rose to about 14 gm. 
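
The arithmetic of Example 7.2 can be checked with a short Python sketch of the single-locus population mean, a(p - q) + 2dpq. The function name `population_mean` and the `origin` argument are assumptions made for this illustration only.

```python
def population_mean(a, d, q, origin=0.0):
    """Mean genotypic value for one locus under random mating.

    q is the frequency of the allele that decreases the value; origin is
    the mid-homozygote point, added to express the mean in absolute units.
    """
    p = 1.0 - q
    return origin + a * (p - q) + 2 * d * p * q


# Pygmy gene: a = 4 gm., d = 2 gm., mid-homozygote point 10 gm.
for q in (0.1, 0.4):
    deviation = population_mean(4, 2, q)
    absolute = population_mean(4, 2, q, origin=10)
    print(f"q = {q}: M = {deviation:.2f}, actual mean = {absolute:.2f} gm.")
```

Setting q to 0 or to 1 returns +a or -a respectively, which are the limits of the range in the absence of overdominance.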




Now we have to put together the contributions of genes at several 
loci and find their joint effect on the mean. This introduces the
question of how genes at different loci combine to produce a joint
effect on the character. For the moment we shall suppose that com-
bination is by addition, which means that the value of a genotype
with respect to several loci is the sum of the values attributable to the
separate loci. For example, if the genotypic value of A₁A₁ is a_A and
that of B₁B₁ is a_B, then the genotypic value of A₁A₁B₁B₁ is a_A + a_B.
The consequences of non-additive combination will be explained at 
the end of this chapter. With additive combination, then, the popu- 
lation mean resulting from the joint effects of several loci is the sum of 
the contributions of each of the separate loci, thus: 

M = Σa(p - q) + 2Σdpq                                   (7.3)


This is again both the genotypic and the phenotypic mean value. The
total range in the absence of overdominance is now 2Σa. If all alleles
that increase the value were fixed the mean would be +Σa, and if all
alleles that decrease the value were fixed it would be -Σa. These are
the theoretical limits to the range of potential variation in the popula-
tion. The origin from which the mean value in equation 7.3 is
measured is the mid-point of the total range. This is equivalent to 
the average mid-homozygote point of all the loci separately. 

Example 7.3. As an example of two loci that combine additively, and 
also of their joint effects on the population mean, we shall refer to two 
colour genes in mice, whose effects on the number of pigment granules 
have been described by Russell (1949). This is a metric character which 
reflects the intensity of pigmentation in the coat. The two genes are
"brown" (b) and "extreme dilution" (cᵉ), an allele of the albino series.
Measurements were made of the number of melanin granules per unit 
volume of hair, in wild-type homozygotes, in the two single mutant homo- 
zygotes, and in the double mutant homozygote. We shall assume both 
wild-type alleles to be completely dominant, so that only these four geno- 
types need be considered. The mean numbers of granules in the four 
genotypes were as follows: 



                        B-      bb      Difference = 2a_B
C-                      95      90      5
cᵉcᵉ                    38      34      4

Difference = 2a_C       57      56



The difference between the two figures in each row and in each column 
measures the homozygote difference, or 2a on the scale of values assigned 
as in Fig. 7.1. Apart from the trivial discrepancy of 1 unit, these differences 
are independent of the genotype at the other locus. In other words, the 
difference of value between B- and bb is the same among C- genotypes
as it is among cᵉcᵉ genotypes; and similarly the difference between C- and
cᵉcᵉ is the same in B- as it is in bb. Thus the two loci combine addi-
tively, and the value of a composite genotype can be rightly predicted
from knowledge of the values of the single genotypes. For example: the
bb genotype is 5 units less than the wild-type, and the cᵉcᵉ is 57 units less;
therefore bb cᵉcᵉ should be 62 units less, namely 33, which is almost iden-
tical with the observed value of 34.

We may use this example further to illustrate the effect of the two 
loci jointly on the population mean. Let us work out, from the effects of 
the loci taken separately, what would be the mean granule number in a 
population in which the frequency of bb was q_B² = 0.4, and that of cᵉcᵉ
was q_C² = 0.2. For the effects of the loci separately we shall take a_B = 2 and
a_C = 28. The population mean, considering one locus, is M = a(1 - 2q²),
when there is complete dominance. For the B locus this is M_B = 2 × 0.2
= 0.4; and for the C locus M_C = 28 × 0.6 = 16.8. The mean, considering both
loci together, is M_B + M_C = 17.2 (by equation 7.3). The point of origin from
which this is measured is the mid-point between the two double homo-
zygotes, which is ½(95 + 34) = 64.5. Thus the mean granule number in this
population would be 64.5 + 17.2 = 81.7. We may check this from the ob-
servations of the values of the joint genotypes. The four genotypes would
have the following frequencies and observed values: 


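                  B- C-     B- cᵉcᵉ     bb C-     bb cᵉcᵉ
Frequency         0.48      0.12        0.32      0.08
Observed value    95        38          90        34

The mean value is obtained by multiplying the values by the frequencies
and summing over the four genotypes. This yields a mean granule number
of 81.68, in close agreement with the value obtained above.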
Average Effect 

In order to deduce the properties of a population connected with 
its family structure we have to deal with the transmission of value 
from parent to offspring, and this cannot be done by means of geno- 
typic values alone, because parents pass on their genes and not their 
genotypes to the next generation, genotypes being created afresh in
each generation. A new measure of value is therefore needed which
will refer to genes and not to genotypes. This will enable us to assign 
a "breeding value" to individuals, a value associated with the genes
carried by the individual and transmitted to its offspring. The new 
measure is the "average effect." We can assign an average effect to a 
gene in the population, or to the difference between one gene and 
another of an allelic pair. The average effect of a gene is the mean 
deviation from the population mean of individuals which received 
that gene from one parent, the gene received from the other parent 
having come at random from the population. This may be stated in 
another way. Let a number of gametes all carrying A₁ unite at ran-
dom with gametes from the population; then the mean deviation from
the population mean of the genotypes so produced is equal to the
average effect of the gene A₁. The concept of average effect is perhaps
easier to grasp in the form of the average effect of a gene-substitution,
which can more conveniently be used when only two alleles at a locus
are under consideration. If we could change, say, A₂ genes into A₁ at
random in the population, and could then note the resulting change of 
value, this would be the average effect of the gene-substitution. It is 
equal to the difference between the average effects of the two genes 
involved in the substitution. A graphical representation of the average 
effect of a gene-substitution is given later in Fig. 7.2. 

It is important to realise that the average effect of a gene or a gene- 
substitution depends on the gene frequency, and that the average 
effect is therefore a property of the population as well as of the gene. 
The reason for this can be seen in the words "taken at random" in the 
definitions, because the content of the random sample depends on the 
gene frequency in the population. The point may perhaps be more 
easily understood from a specific example. Consider the substitution 
of a recessive gene, a, for its dominant allele, A. The substitution will 
change the value only when the individual already carries one reces- 
sive allele, in other words in heterozygotes. Changing AA into Aa 
will not affect the value, but changing Aa into aa will. Now, when the 
frequency of the recessive allele, a, is low there will be many AA 
individuals, which the substitution will not affect; but when the 
recessive is at high frequency there will be very few AA individuals, 
and most of the individuals in which a substitution can be made will 
be affected by it. Therefore the average effect of the substitution will 
be small when the frequency of the recessive allele is low and large 
when it is high. 




Let us see how the average effect is related to the genotypic 
values, a and d, in terms of which the population mean was expressed. 
This will help to make the concept clearer. The reasoning is set out 
in Table 7.2.

Table 7.2

Type of   Values and frequencies   Mean value     Population mean      Average effect
gamete    of genotypes produced    of genotypes   to be deducted       of gene
          A₁A₁    A₁A₂    A₂A₂
          a       d       -a

A₁        p       q                pa + qd        -[a(p - q) + 2dpq]   q[a + d(q - p)]
A₂                p       q        -qa + pd       -[a(p - q) + 2dpq]   -p[a + d(q - p)]

Consider a locus with two alleles, A₁ and A₂, at frequencies p and q
respectively, and take first the average effect of the gene A₁, for
which we shall use the symbol α₁. If gametes carrying A₁
unite at random with gametes from the population, the frequencies
of the genotypes produced will be p of A₁A₁ and q of A₁A₂. The
genotypic value of A₁A₁ is +a and that of A₁A₂ is d, and the mean of
these, taking account of the proportions in which they occur, is
pa + qd. The difference between this mean value and the population
mean is the average effect of the gene A₁. Taking the value of the
population mean from equation 7.2 we get

α₁ = pa + qd - [a(p - q) + 2dpq]

   = q[a + d(q - p)]                                    (7.4a)

Similarly the average effect of the gene A₂ is

α₂ = -p[a + d(q - p)]                                   (7.4b)

Now consider the average effect of the gene-substitution, letting A₁
be substituted for A₂. Of the A₂ genes taken at random from the
population for substitution, a proportion p will be found in A₁A₂
genotypes and a proportion q in A₂A₂ genotypes. In the former the
substitution will change the value from d to +a, and in the latter
from -a to d. The average change is therefore p(a - d) + q(d + a),
which on rearrangement becomes a + d(q - p). Thus the average
effect of the gene-substitution (written as α, without subscript) is

α = a + d(q - p)                                        (7.5)


The relation of α to α₁ and α₂ can be seen by comparing equations
7.5 and 7.4, whence

α = α₁ - α₂                                             (7.6)

α₁ = qα
α₂ = -pα                                                (7.7)

Example 7.4. Consider again the pygmy gene and its effect on body 
weight, for which a = 4 gm. and d = 2 gm. If the frequency of the pg gene
were q = 0.1, the average effect of substituting + for pg would be, by
equation 7.5, α = 4 + 2 × (-0.8) = 2.4 gm. And if the frequency were
q = 0.4, the average effect of the gene-substitution would be: α = 4 + 2 × (-0.2)
= 3.6 gm. Thus, the average effect is greater when the gene frequency is
greater. The average effects of the genes separately are, by equation 7.7:

                               q = 0.1     q = 0.4

Average effect of +  : α₁ =    +0.24       +1.44
Average effect of pg : α₂ =    -2.16       -2.16

(The identity of the average effects of pg at the two gene frequencies is 
only a coincidence.) 
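
As a check on Example 7.4, the average effects can be computed both from the closed forms (α = a + d(q - p), α₁ = qα, α₂ = -pα) and directly from the verbal definition of α₁. The following Python sketch does both; the function names are assumptions introduced for this illustration.

```python
def average_effects(a, d, q):
    """Return (alpha, alpha_1, alpha_2) from the closed-form expressions."""
    p = 1.0 - q
    alpha = a + d * (q - p)      # average effect of the gene-substitution
    return alpha, q * alpha, -p * alpha


def average_effect_of_A1_by_definition(a, d, q):
    """Mean of genotypes formed by A1 gametes, minus the population mean."""
    p = 1.0 - q
    population_mean = a * (p - q) + 2 * d * p * q
    return p * a + q * d - population_mean


for q in (0.1, 0.4):
    alpha, alpha_1, alpha_2 = average_effects(4, 2, q)
    print(f"q = {q}: alpha = {alpha:.2f}, "
          f"alpha_1 = {alpha_1:+.2f}, alpha_2 = {alpha_2:+.2f}")
    # The definition and the closed form should agree.
    assert abs(alpha_1 - average_effect_of_A1_by_definition(4, 2, q)) < 1e-12
```

The agreement of the two routes is the content of the step from the first to the second line of equation 7.4a.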

Breeding Value 

The usefulness of the concept of average effect arises from the fact, 
already noted, that parents pass on their genes and not their genotypes 
to their progeny. It is therefore the average effects of the parent's 
genes that determine the mean genotypic value of its progeny. The 
value of an individual, judged by the mean value of its progeny, is 
called the breeding value of the individual. Breeding value, unlike 
average effect, can therefore be measured. If an individual is mated 
to a number of individuals taken at random from the population then 
its breeding value is twice the mean deviation of the progeny from 
the population mean. The deviation has to be doubled because the 
parent in question provides only half the genes in the progeny, the 
other half coming at random from the population. Breeding values 
can be expressed in absolute units, but are usually more conveniently 
expressed in the form of deviations from the population mean, as 
defined above. Just as the average effect is a property of the gene 
and the population so is the breeding value a property of the individual 
and the population from which its mates are drawn. One cannot 
speak of an individual's breeding value without specifying the popu- 
lation in which it is to be mated. 




Defined in terms of average effects, the breeding value of an 
individual is equal to the sum of the average effects of the genes it 
carries, the summation being made over the pair of alleles at each 
locus and over all loci. Thus, for a single locus with two alleles the 
breeding values of the genotypes are as follows: 

Genotype        Breeding value
A₁A₁            2α₁ = 2qα
A₁A₂            α₁ + α₂ = (q - p)α
A₂A₂            2α₂ = -2pα

Example 7.5. Let us illustrate breeding values by reference to the 
pygmy gene in mice. The average effects of the + and pg genes were 
given in the last example. From these we may find the breeding values of 
the three genotypes as explained above. These breeding values, which are 
given below, are deviations from the population mean. The population 
means with gene frequencies of 0.1 and 0.4 were found in Example 7.2 and
are shown again below in the column headed M.

                          Breeding values
              M           ++        +pg       pgpg
q = 0.1       13.56       +0.48     -1.92     -4.32
q = 0.4       11.76       +2.88     -0.72     -4.32




(The breeding values of pygmy homozygotes are only hypothetical 
because in fact pygmy homozygotes are nearly all sterile: but this compli- 
cation may be overlooked in the present context.) 

Extension to a locus with more than two alleles is straightforward, 
the breeding value of any genotype being the sum of the average 
effects of the two alleles present. If all loci are to be taken into 
account, the breeding value of a particular genotype is the sum of the 
breeding values attributable to each of the separate loci. If there is 
non-additive combination of genotypic values a slight complication 
arises. We have given two definitions of breeding value, a practical 
one in terms of the measured value of the progeny and a theoretical 
one in terms of average effects. Non-additive combination renders 
these two definitions not quite equivalent. This point will be more 
fully explained in Chapter 9. 

Consideration of the definition of breeding value will show that 
in a population in Hardy-Weinberg equilibrium the mean breeding
value must be zero; or if breeding values are expressed in absolute
units the mean breeding value must be equal to the mean genotypic
value and to the mean phenotypic value. This can be verified from 
the breeding values listed above. Multiplying the breeding value by 
the frequency of each genotype and summing gives the mean breeding 
value (expressed as a deviation from the population mean) as 

2p²qα + 2pq(q - p)α - 2q²pα = 2pqα(p + q - p - q) = 0
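
The same verification can be made numerically. The Python sketch below computes the breeding values of the three genotypes for the pygmy gene and weights them by the Hardy-Weinberg frequencies; the function names are assumptions made for this illustration.

```python
def breeding_values(a, d, q):
    """Breeding values of A1A1, A1A2, A2A2 as deviations from the mean."""
    p = 1.0 - q
    alpha = a + d * (q - p)
    return [2 * q * alpha, (q - p) * alpha, -2 * p * alpha]


def genotype_frequencies(q):
    """Hardy-Weinberg frequencies of A1A1, A1A2, A2A2."""
    p = 1.0 - q
    return [p * p, 2 * p * q, q * q]


for q in (0.1, 0.4):
    values = breeding_values(4, 2, q)
    mean = sum(f * v for f, v in zip(genotype_frequencies(q), values))
    # The mean should vanish apart from floating-point rounding.
    print([round(v, 2) for v in values], abs(mean) < 1e-12)
```

For q = 0.1 and q = 0.4 the listed values agree with those of Example 7.5.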

The breeding value is sometimes referred to as the "additive 
genotype," and variation in breeding value ascribed to the "additive 
effects" of genes. Though we shall not use these terms we shall 
follow custom in using the term "additive" in connexion with the 
variation of breeding values to be discussed in the next chapter, and 
we shall use the symbol A to designate the breeding value of an
individual.

Dominance Deviation 

We have separated off the breeding value as a component part of 
the genotypic value of an individual. Let us consider now what 
makes up the remainder. When a single locus only is under con- 
sideration, the difference between the genotypic value, G, and the 
breeding value, A, of a particular genotype is known as the dominance 
deviation D, so that 

G = A + D                                               (7.8)

The dominance deviation arises from the property of dominance 
among the alleles at a locus, since in the absence of dominance breed- 
ing values and genotypic values coincide. From the statistical point 
of view the dominance deviations are interactions between alleles, or 
within-locus interactions. They represent the effect of putting genes 
together in pairs to make genotypes; the effect not accounted for by 
the effects of the two genes taken singly. Since the average effects of 
genes and the breeding values of genotypes depend on the gene 
frequency in the population, the dominance deviations are also 
dependent on gene frequency. They are therefore partly properties 
of the population and are not simply measures of the degree of
dominance.

Example 7.6. Continuing with the example of the pygmy gene, we 
may now list the genotypic values and the breeding values, and so obtain 
the dominance deviations of the three genotypes, by equation 7.8. These
values, all now expressed as deviations from the population mean, M, are
as follows:

q = 0.1: M = 13.56
                        ++        +pg       pgpg
Frequency               0.81      0.18      0.01
Genotypic value, G      +0.44     -1.56     -7.56
Breeding value, A       +0.48     -1.92     -4.32
Dominance dev., D       -0.04     +0.36     -3.24

q = 0.4: M = 11.76
                        ++        +pg       pgpg
Frequency               0.36      0.48      0.16
Genotypic value, G      +2.24     +0.24     -5.76
Breeding value, A       +2.88     -0.72     -4.32
Dominance dev., D       -0.64     +0.96     -1.44


Fig. 7.2. Graphical representation of genotypic values (closed
circles), and breeding values (open circles), of the genotypes for a
locus with two alleles, A₁ and A₂, at frequencies p and q, as ex-
plained in the text. Horizontal scale: number of A₁ genes in the
genotype. Vertical scales of value: on left, arbitrary values as-
signed as in Fig. 7.1; on right, deviations from the population
mean. The figure is drawn to scale for the values: d = ¾a, and q = ¼.

The relations between genotypic values, breeding values and
dominance deviations can be illustrated graphically, as in Fig. 7.2,
and the meaning of the dominance deviation is perhaps more easily
understood in this way. In the figure the genotypic value (black dots)
is plotted against the number of A₁ genes in the genotype. A straight
regression line is fitted by least squares to these points, each point 
being weighted by the frequency of the genotype it represents. The 
position of this line gives the breeding values of each genotype, as 
shown by the open circles. The differences between the breeding 
values and the genotypic values are the dominance deviations, indi- 
cated by vertical dotted lines. The cross marks the population mean. 
The average effect, α, of the gene-substitution is given by the differ-
ence in breeding value between A₂A₂ and A₁A₂, or between A₁A₂ and
A₁A₁, as indicated. The original definition of the average effect of a
gene-substitution was given by Fisher (1918, 1941) in terms of this
linear regression of genotypic value on number of genes. 
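
The regression interpretation can be tried out numerically. The sketch below fits a frequency-weighted least-squares line to the three genotypic values, using only the Python standard library; the helper `weighted_regression` and the choice of the pygmy gene at q = 0.4 as the worked case are assumptions made for this illustration.

```python
def weighted_regression(x, y, w):
    """Weighted least-squares line of y on x; returns (slope, fitted values)."""
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    var = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = cov / var
    fitted = [y_bar + slope * (xi - x_bar) for xi in x]
    return slope, fitted


a, d, q = 4.0, 2.0, 0.4          # pygmy gene at a frequency of 0.4
p = 1.0 - q
genes = [0, 1, 2]                # number of A1 genes: A2A2, A1A2, A1A1
values = [-a, d, a]              # assigned genotypic values
freqs = [q * q, 2 * p * q, p * p]

slope, fitted = weighted_regression(genes, values, freqs)
mean = sum(f * v for f, v in zip(freqs, values))
print(round(slope, 2))                         # compare with a + d(q - p)
print([round(v - mean, 2) for v in fitted])    # compare with the breeding values
```

With these inputs the slope comes out at 3.6 and the fitted deviations at -4.32, -0.72 and +2.88, which are the average effect and the breeding values found for q = 0.4 in the examples.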

The dominance deviation can be expressed in terms of the arbi- 
trarily assigned genotypic values a and d, by subtraction of the breed- 
ing value from the genotypic value, as shown in Table 7.3. The 

Table 7.3

Values of genotypes in a two-allele system, measured as
deviations from the population mean.
Population mean: M = a(p - q) + 2dpq
Average effect of gene-substitution: α = a + d(q - p)

Genotypes                 A₁A₁           A₁A₂                    A₂A₂
Frequencies               p²             2pq                     q²
Assigned values           +a             d                       -a

Deviations from
population mean:
  Genotypic value         2q(a - pd)     a(q - p) + d(1 - 2pq)   -2p(a + qd)
                          2q(α - qd)     (q - p)α + 2pqd         -2p(α + pd)
  Breeding value          2qα            (q - p)α                -2pα
  Dominance deviation     -2q²d          2pqd                    -2p²d


genotypic values must first be converted to deviations from the
population mean, because the breeding values have been expressed
in this way. The genotypic values, so converted, are given in two
forms: in terms of a and in terms of α. Let us take the genotype A₁A₁
to show how these are obtained and how the dominance deviation is
obtained by subtraction of the breeding value. The arbitrarily assigned
genotypic value of A₁A₁ is +a, and the population mean is
a(p - q) + 2dpq. Expressed as a deviation from the population mean,
the genotypic value is therefore

$$a - [a(p - q) + 2dpq] = a(1 - p + q) - 2dpq = 2qa - 2dpq = 2q(a - dp).$$

This may be expressed in terms of the average effect, α, by substituting
a = α - d(q - p) (from equation 7.5), and the genotypic value then
becomes 2q(α - qd). Subtraction of the breeding value, 2qα, gives the
dominance deviation as -2q²d. By similar reasoning the dominance
deviation of A₁A₂ is 2pqd, and that of A₂A₂ is -2p²d. Thus all the
dominance deviations are functions of d. If there is no dominance d
is zero and the dominance deviations are also all zero. Therefore in
the absence of dominance, breeding values and genotypic values are
the same. Genes that show no dominance (d = 0) are sometimes called
"additive genes," or are said to "act additively."

Since the mean breeding value and the mean genotypic value are 
equal, it follows that the mean dominance deviation is zero. This can 
be verified by multiplying the dominance deviation by the frequency 
of each genotype and summing. The mean dominance deviation is 

$$-2p^2q^2d + 4p^2q^2d - 2p^2q^2d = 0$$

Another fact, which will be needed later when we deal with 
variances, may be noted here: there is no correlation between the 
dominance deviation and the breeding value of the different genotypes. 
This can be shown by multiplying together the dominance deviation, 
the breeding value and the frequency of each genotype. Summation 
gives the sum of cross-products, and it works out to be zero, thus: 

$$-4p^2q^3\alpha d + 4p^2q^2(q - p)\alpha d + 4p^3q^2\alpha d = 4p^2q^2\alpha d(-q + q - p + p) = 0$$

Since the sum of cross-products is zero, breeding values and domin- 
ance deviations are uncorrelated. 
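Both results, the zero mean of the dominance deviations and the zero sum of cross-products, can be confirmed with exact arithmetic. The sketch below is an illustration only; the function name is an assumption made here, and the values a = 4, d = 2, q = 0.1 are taken to be those of the pygmy gene used in the examples.

```python
from fractions import Fraction as F

def deviations(a, d, q):
    """Frequencies, breeding values and dominance deviations of A1A1, A1A2
    and A2A2, as deviations from the population mean (hypothetical helper)."""
    p = 1 - q
    alpha = a + d * (q - p)                       # average effect
    freq = [p * p, 2 * p * q, q * q]
    breeding = [2 * q * alpha, (q - p) * alpha, -2 * p * alpha]
    dominance = [-2 * q * q * d, 2 * p * q * d, -2 * p * p * d]
    return freq, breeding, dominance

freq, A, D = deviations(F(4), F(2), F(1, 10))
mean_D = sum(f * x for f, x in zip(freq, D))              # mean dominance deviation
cross = sum(f * x * y for f, x, y in zip(freq, A, D))     # sum of cross-products
genotypic = [x + y for x, y in zip(A, D)]                 # G = A + D at one locus
```

Exact fractions are used so that the zeros are obtained exactly rather than to within rounding error; any other values of a, d and q between 0 and 1 could be substituted.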

Interaction Deviation 

When only a single locus is under consideration the genotypic 
value is made up of the breeding value and the dominance deviation 
only. But when the genotype refers to more than one locus the geno- 
typic value may contain an additional deviation due to non-additive 
combination. Let G_A be the genotypic value of an individual attributable
to one locus, G_B that attributable to a second locus, and G the
aggregate genotypic value attributable to both loci together. Then

$$G = G_A + G_B + I_{AB} \qquad (7.9)$$

where I_AB is the deviation from additive combination of these genotypic
values. In dealing with the population mean, earlier in this
chapter, we assumed that I was zero for all combinations of genotypes.
If I is not zero for any combination of genes at different loci,
those genes are said to "interact" or to exhibit "epistasis," the term
epistasis being given a wider meaning in quantitative genetics than
in Mendelian genetics. The deviation I is called the interaction
deviation or epistatic deviation. Loci may interact in pairs or in threes
or higher numbers, and the interactions may be of many different
sorts, as the behaviour of major genes shows. The complex nature of
the interactions, however, need not concern us, because in the aggregate
genotypic value interactions of all sorts are treated together as a
single interaction deviation. So for all loci together we can write

G=A + D + I (7.10) 

where A is the sum of the breeding values attributable to the separate 
loci, and D is the sum of the dominance deviations. 

If the interaction deviation is zero the genes concerned are said to 
"act additively" between loci. Thus "additive action" may mean two 
different things. Referred to genes at one locus it means the absence 
of dominance, and referred to genes at different loci it means the 
absence of epistasis. 

Example 7.7. As an example of non-additive combination of two loci 
we shall take the same two colour genes in mice that were used in Example 
7.3 to illustrate additive combination; but this time we refer to their effects 
on the size of the pigment granules, instead of their number (Russell,
1949). The mean size (diameter in µ) of the granules in the four genotypes
was as follows:





            C-        cᵉcᵉ
B-         1.44       0.94
bb         0.77       0.77








This time the differences are not independent of the other genotype: the
cᵉ gene for example has quite a large effect on the B- genotype, but none
at all on the bb genotype. Thus the two loci show epistatic interaction and
do not combine additively. Let us therefore work out the interaction
deviations. This is not altogether a straightforward matter because the
deviations depend on the gene frequencies in the population under discussion;
it does, however, help to clarify the meaning of the interaction
deviation.

If we were to measure the homozygote differences of these two loci
with the object of estimating the value of a for each, the results would
depend on the gene frequency at the other locus. For example, the difference
between B- and bb would be 0.67 if measured in C- genotypes, but
0.17 if measured in cᵉcᵉ genotypes. The value of a therefore depends on
the population in which it is measured. Let us take, for the sake of illustration,
a population in which the frequency of bb genotypes is q² = 0.4
and the frequency of cᵉcᵉ genotypes is q² = 0.2. Then the mean homozygote
difference for the B locus will be 2a_B = (0.67 × 0.8) + (0.17 × 0.2) =
0.57. Similarly, for the C locus, 2a_C = 0.30. The object now is to find for
each genotype the aggregate genotypic value, G, for the two loci combined
(i.e. the observed values given above); then the genotypic values, G_B and
G_C, derived from consideration of the two loci separately; and, finally, the
interaction deviation, I_BC, according to equation 7.9. The procedure is
simplified if all these values are expressed as deviations from the population
mean. The table gives, in line (1), the four genotypes (assuming again
complete dominance at both loci); in line (2), the frequency of each genotype
in the population; and in line (3), the observed value of granule size
in each genotype. The population mean is found by multiplying the value
by the frequency of each genotype and summing over the four genotypes.
This yields M = 1.112. Subtracting the population mean from the observed
value gives the aggregate genotypic value, G, as a deviation from
the population mean, shown in line (4).

(1) Genotypes          B- C-     B- cᵉcᵉ    bb C-     bb cᵉcᵉ
(2) Frequencies        0.48      0.12       0.32      0.08
(3) Observed values    1.44      0.94       0.77      0.77
(4) G                 +0.328    -0.172     -0.342    -0.342
(5) G_B + G_C         +0.288    -0.012     -0.282    -0.582
(6) I_BC              +0.040    -0.160     -0.060    +0.240

Now consider each locus separately, paying no regard to the other locus.
The genotypic values for a
single locus, expressed as deviations from the population mean, were given
in Table 7.3. With complete dominance these reduce to 2aq² for the two
dominant genotypes combined, and -2a(1 - q²) for the recessive homozygote.
Take the B- genotype for example: the value of 2a_B in the population
under consideration was shown above to be 0.57, and the value of
q² assumed is 0.4; therefore the genotypic value is 0.57 × 0.4 = +0.228.

This is the average value of the B- genotype irrespective of the other locus.
The other single-locus values, found in a similar way, are as follows:

         B-        bb                  C-        cᵉcᵉ
G_B:    +0.228    -0.342       G_C:   +0.060    -0.240

The values given in line (5) of the table as G_B + G_C are found by summation
of the two appropriate single-locus values. For example, the B- C-
genotype is +0.228 + 0.060 = +0.288. These are the genotypic values
expected if there were additive combination. It may be noted that their
mean, obtained by summation of (value × frequency) is zero, as is the mean
of the aggregate genotypic values in line (4). Finally, the interaction deviations,
I_BC, given in line (6) are obtained by subtracting the "expected"
values in line (5) from the "actual" values in line (4). The mean interaction
deviation is also zero.
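The arithmetic of this example can be retraced in a few lines. The sketch below is an illustration, not the author's own procedure: the four observed values are those implied by the mean M = 1.112 and the differences 0.67 and 0.17 quoted above, and the single-locus values are computed as frequency-weighted averages over the other locus, which gives the same figures as the 2aq² and -2a(1 - q²) expressions in the text. The dictionary names and genotype labels are assumptions made here.

```python
from fractions import Fraction as F

# Observed granule sizes (diameter in microns) for the four genotypes.
observed = {("B-", "C-"): F("1.44"), ("B-", "cece"): F("0.94"),
            ("bb", "C-"): F("0.77"), ("bb", "cece"): F("0.77")}
freq_b = {"B-": F("0.6"), "bb": F("0.4")}      # frequency of bb is 0.4
freq_c = {"C-": F("0.8"), "cece": F("0.2")}    # frequency of cece is 0.2

# Line (2): genotype frequencies, the two loci being combined at random.
freq = {(b, c): freq_b[b] * freq_c[c] for (b, c) in observed}
mean = sum(freq[k] * observed[k] for k in observed)

# Line (4): aggregate genotypic values as deviations from the mean.
G = {k: observed[k] - mean for k in observed}

# Single-locus values: average of G over the genotypes at the other locus.
G_B = {b: sum(freq_c[c] * G[(b, c)] for c in freq_c) for b in freq_b}
G_C = {c: sum(freq_b[b] * G[(b, c)] for b in freq_b) for c in freq_c}

# Line (6): interaction deviation = actual minus additive expectation.
I_BC = {(b, c): G[(b, c)] - (G_B[b] + G_C[c]) for (b, c) in observed}
mean_I = sum(freq[k] * I_BC[k] for k in observed)
```

Changing the two recessive frequencies in `freq_b` and `freq_c` shows directly how the interaction deviations depend on the population in which they are measured.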



CHAPTER 8

VARIANCE

The genetics of a metric character centres round the study of its
variation, for it is in terms of variation that the primary genetic 
questions are formulated. The basic idea in the study of variation is 
its partitioning into components attributable to different causes. The 
relative magnitude of these components determines the genetic 
properties of the population, in particular the degree of resemblance 
between relatives. In this chapter we shall consider the nature of 
these components and how the genetic components depend on the 
gene frequency. Then, in the next chapter, we shall show how the 
degree of resemblance between relatives is determined by the magni- 
tudes of the components. 

The amount of variation is measured and expressed as the vari- 
ance: when values are expressed as deviations from the population 
mean the variance is simply the mean of the squared values. The 
components into which the variance is partitioned are the same as the 
components of value described in the last chapter; so that, for 
example, the genotypic variance is the variance of genotypic values, 
and the environmental variance is the variance of environmental 
deviations. The total variance is the phenotypic variance, or the 
variance of phenotypic values, and is the sum of the separate com- 
ponents. The components of variance and the values whose variance 
they measure are listed in Table 8.1. 

Table 8.1

Components of Variance

Variance component    Symbol    Value whose variance is measured
Phenotypic            V_P       Phenotypic value
Genotypic             V_G       Genotypic value
Additive              V_A       Breeding value
Dominance             V_D       Dominance deviation
Interaction           V_I       Interaction deviation
Environmental         V_E       Environmental deviation

130 VARIANCE [Chap. 8 

The total variance is then, with certain qualifications to be men- 
tioned below, the sum of the components, thus: 

$$V_P = V_G + V_E = V_A + V_D + V_I + V_E \qquad (8.1)$$

Let us now consider these components of variance in detail. 

Genotypic and Environmental Variance 

The first division of phenotypic value that we made in the last
chapter was into genotypic value and environmental deviation,
P = G + E. The corresponding partition of the variance into genotypic
and environmental components formulates the problem of "heredity
versus environment" or "nature and nurture"; or, to put the question
more precisely, the relative importance of genotype and environment
in determining the phenotypic value. The "relative importance" of a
cause of variation means the amount of variation that it gives rise to,
as a proportion of the total. So the relative importance of genotype
as a determinant of phenotypic value is given by the ratio of genotypic
to phenotypic variance, V_G/V_P. The genotypic and environmental
components cannot be estimated directly from observations
on the population, but in certain circumstances they can be estimated
in experimental populations. If one or other component could be
completely eliminated, the remaining phenotypic variance would
provide an estimate of the remaining component. Environmental
variance cannot be removed because it includes by definition all
non-genetic variance, and much of this is beyond experimental
control. Elimination of genotypic variance can, however, be achieved
experimentally. Highly inbred lines, or the F₁ of a cross between two
such lines, provide individuals all of identical genotype and therefore
with no genotypic variance. If a group of such individuals is raised
under the normal range of environmental circumstances, their phenotypic
variance provides an estimate of the environmental variance
V_E. Subtraction of this from the phenotypic variance of a genetically
mixed population then gives an estimate of the genotypic variance
of this population.
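The logic of the subtraction can be illustrated with simulated data. The sketch below is an illustration under stated assumptions, not an analysis of real measurements: the standard deviations, the sample size and the normal distributions are hypothetical choices made here, and the environmental variance is assumed to be the same in both groups.

```python
import random
import statistics

random.seed(1)
N = 5000
SD_G, SD_E = 0.4, 0.45          # hypothetical values, for illustration only

# Genetically mixed population: every individual has its own genotypic value.
mixed = [random.gauss(0, SD_G) + random.gauss(0, SD_E) for _ in range(N)]
# Genetically uniform group (inbred line or F1): one shared genotypic value.
uniform = [0.3 + random.gauss(0, SD_E) for _ in range(N)]

v_p = statistics.variance(mixed)       # estimates V_G + V_E
v_e = statistics.variance(uniform)     # estimates V_E alone
v_g = v_p - v_e                        # estimate of V_G by subtraction
ratio = v_g / v_p                      # estimate of V_G / V_P
```

The estimate of V_G obtained in this way carries the sampling error of both variances, which is one reason why large groups are needed in practice.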

Example 8.1. Partitioning of the phenotypic variance into its genotypic
and environmental components has been done for several characters
in Drosophila melanogaster. The results are given later, in Table 8.2, but
here we may describe the results for one character in more detail in order
to show how the partitioning is made. The character is the length of the
thorax (in units of 1/100 mm.), which may be regarded as a measure of body-size.
The phenotypic variance was measured first in a genetically mixed
(i.e. a random-bred) population, and then in a genetically uniform population,
consisting of the F₁ generation of three crosses between highly
inbred lines. The first estimates the genotypic and environmental variance
together, and the second estimates the environmental variance alone. So,
by subtraction, an estimate of the genotypic variance is obtained. The
results, obtained by F. W. Robertson (1957b), were as follows:



Population              Components     Observed variance
Genetically mixed       V_G + V_E
Genetically uniform     V_E
Difference              V_G

V_G/V_P = 49%

Thus 49 per cent of the variation of thorax length in this genetically mixed
population is attributable to genetic differences between individuals, and
51 per cent to non-genetic differences.

Individuals of identical genotype are also provided by identical 
twins in man and cattle, but their use in partitioning the variance is 
very limited: they will be discussed in a later chapter when the 
problems that they raise will be better understood. Apart from the 
severely limited use of identical twins, the partitioning of the vari- 
ance into genotypic and environmental components depends on the 
availability of highly inbred lines, and is therefore restricted to experi- 
mental populations of plants or small animals. 

Three complications arise in connexion with the partitioning of 
the variance into genotypic and environmental components. They 
are all things that can usually be neglected or circumvented with little 
risk of error, but in some circumstances they may be important. The 
following account of them might well be omitted at a first reading, 
unless the reader is worried by the logical fallacies introduced by 
neglecting them. 

Dependence of environmental variance on genotype. Ex- 
periments of the type illustrated in Example 8.1 rest on the assump- 
tion that the environmental variance is the same in all genotypes, and 
this is certainly not always true. The environmental variance mea- 
sured in one inbred line or cross is that shown by this one particular
genotype, and other genotypes may be more or less sensitive to
environmental influences and may therefore show more or less 
environmental variance. The environmental variance of the mixed 
population may therefore not be the same as that measured in the 
genotypically uniform group. Not very much is yet known about this 
complication except that many characters show more environmental 
variance among inbred than among outbred individuals, inbreds being 
more sensitive or less well "buffered." The reality of the complica- 
tion is therefore not in doubt. Further discussion of the phenomenon 
will be found under the effects of inbreeding, in Chapter 15, where it 
more properly belongs. The existence of this complication means 
that when dealing with genotypically mixed populations we have to 
define the environmental component of variance as the mean en- 
vironmental variance of the genotypes in the population, and we have 
to recognise the possibility that if the frequencies of the genotypes 
are changed, as by selection, the environmental variance may also be 
changed in consequence. 

Genotype-environment correlation. Hitherto we have tacitly 
assumed that environmental deviations and genotypic values are 
independent of each other; in other words that there is no correlation 
between genotypic value and environmental deviation, such as would 
arise if the better genotypes were given better environments. Corre- 
lation between genotype and environment is seldom an important 
complication, and can usually be neglected in experimental popula- 
tions, where randomisation of environment is one of the chief objects 
of experimental design. There are some situations, however, in 
which the correlation exists. Milk-yield in dairy cattle provides an 
example. The normal practice of dairy husbandry is to feed cows 
according to their yield, the better phenotypes being given more 
food. This introduces a correlation between phenotypic value and 
environmental deviation; and, since genotypic and phenotypic values 
are correlated, there is also a correlation between genotypic value and 
environmental deviation. The complication of genotype-environment 
correlation is very simply overcome by regarding the special environ- 
ment — i.e. the feeding level in the case of cows — as part of the geno- 
type. This situation is covered by the definition of genotypic value, 
provided genotypic values are taken to refer to genotypes as they 
occur under the normal conditions of association with specific 
environments. If genotypic values were not so defined we could not 
treat the phenotypic variance as simply the sum of the genotypic and 

environmental variances, but we should have to include a covariance
term, thus:

$$V_P = V_G + V_E + 2\,\mathrm{cov}_{GE}$$

where cov_GE is the covariance of genotypic values and environmental
deviations. If the genotypic variance is estimated, as in Example 8.1,
by the comparison of genetically identical with genetically mixed 
groups, then the covariance would be eliminated with the genotypic 
variance from the genetically identical group, and the estimate ob- 
tained will be of genotypic variance together with twice the co- 
variance. Thus, while on theoretical grounds it is convenient, on 
practical grounds it is unavoidable, to regard any covariance that may 
arise from genotype-environment correlation as being part of the 
genotypic variance. 
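The role of the covariance term can be seen in a small simulation. This is a sketch under assumptions made here, not a model of dairy practice: the rule by which better genotypes receive better environments, the coefficient 0.5 and the normal distributions are all hypothetical.

```python
import random
import statistics

random.seed(2)
N = 2000
G = [random.gauss(0, 1.0) for _ in range(N)]
# Hypothetical rule: the environment given improves with the genotypic value.
E = [0.5 * g + random.gauss(0, 1.0) for g in G]
P = [g + e for g, e in zip(G, E)]

def covariance(x, y):
    """Sample covariance with the same divisor (n - 1) as statistics.variance."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

v_p = statistics.variance(P)
v_g = statistics.variance(G)
v_e = statistics.variance(E)
cov_ge = covariance(G, E)
excess = v_p - (v_g + v_e)      # equals 2 * cov_ge apart from rounding error
```

When the environments are allotted at random the covariance is expected to be zero, and the phenotypic variance reduces to the simple sum of the two components.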

Genotype-environment interaction. Another assumption that 
we have made, which is not always justifiable, is that a specific differ- 
ence of environment has the same effect on different genotypes; or, in 
other words, that we can associate a certain environmental deviation 
with a specific difference of environment, irrespective of the genotype 
on which it acts. When this is not so there is an interaction, in the 
statistical sense, between genotypes and environments. There are 
several forms which this interaction may take (Haldane, 1946). For 
example, a specific difference of environment may have a greater 
effect on some genotypes than on others; or there may be a change in 
the order of merit of a series of genotypes when measured under 
different environments. That is to say, genotype A may be superior 
to genotype B in environment X, but inferior in environment Y, as in 
the following example. 

Example 8.2. The following figures show the growth, between 3 and 
6 weeks of age, of two strains of mice reared on two levels of nutrition 
(original data): 





            Good nutrition    Bad nutrition
Strain A    17.2 gm.          12.6 gm.
Strain B    16.6 gm.          13.3 gm.

Strain A grows better than strain B under good conditions, but worse 
under bad conditions. 
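The interaction in these figures can be expressed as a single contrast, the strain difference under good nutrition minus the strain difference under bad nutrition. The sketch below is an illustration only; the variable names and the labels "good" and "bad" are assumptions made here.

```python
# Growth (gm.) between 3 and 6 weeks, by strain and level of nutrition.
growth = {("A", "good"): 17.2, ("A", "bad"): 12.6,
          ("B", "good"): 16.6, ("B", "bad"): 13.3}

diff_good = growth[("A", "good")] - growth[("B", "good")]   # A minus B, good nutrition
diff_bad = growth[("A", "bad")] - growth[("B", "bad")]      # A minus B, bad nutrition

# Zero if the difference of environment had the same effect on both strains.
interaction_contrast = diff_good - diff_bad
# True when the order of merit of the strains changes between environments.
rank_reversal = diff_good * diff_bad < 0
```

A contrast different from zero indicates interaction in the statistical sense; a change of sign between the two differences indicates the more extreme form in which the order of merit is reversed.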


An interaction between genotype and environment, whatever its 
nature, gives rise to an additional component of variance. This 
interaction variance can be isolated and measured only under rather 
artificial circumstances. We may replicate genotypes by the use of 
inbred lines or F₁'s, and replicate specific environments by the control
of such factors as nutrition or temperature. Then an analysis of
variance in a two-way classification of genotypes x environments will 
yield estimates of the genotypic variance (between genotypes), the 
environmental variance (between environments) and the variance 
attributable to interaction of genotypes with environments. The 
specific environments in such an experiment are, however, more in 
the nature of "treatments" because a population under genetical 
study would not normally encounter so wide a range of environments 
as that provided by the different treatments. It is therefore the 
genotype-environment interaction occurring within one such treat- 
ment that is relevant to the genetical study of a population, and this 
cannot be measured because the separate elements of the environ- 
ment cannot be isolated and controlled. In an experiment such as 
that of Example 8.1, which removes the genotypic variance by the 
use of inbred lines or F₁'s, the interaction variance remains with the
environmental in the phenotypic variance measured in the genetically 
uniform individuals. In normal circumstances, therefore, the vari- 
ance due to genotype-environment interaction, since it cannot be 
separately measured, is best regarded as part of the environmental 
variance. When large differences of environment, such as between 
different habitats, are under consideration, the presence of genotype- 
environment interaction becomes important in connexion with the 
specialisation of breeds or varieties to local conditions. This matter 
will be taken up again later, in Chapter 19, because it can be more 
profitably discussed from a different viewpoint. 
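The two-way analysis mentioned above can be sketched for a balanced experiment. The data, the line names and the treatment labels below are entirely hypothetical and serve only to show how the total sum of squares divides into parts for genotypes, environments, interaction, and replicates within cells. The conversion of mean squares into estimates of variance components depends on the statistical model adopted and is not attempted here.

```python
from statistics import fmean

# Hypothetical replicated data: data[genotype][environment] = measurements.
data = {
    "line1": {"low": [10.1, 9.8, 10.4], "high": [14.2, 13.9, 14.5]},
    "line2": {"low": [11.0, 11.3, 10.7], "high": [12.1, 12.4, 11.8]},
}
genotypes = list(data)
environments = list(data[genotypes[0]])
n = len(data[genotypes[0]][environments[0]])      # replicates per cell

values = [y for g in genotypes for e in environments for y in data[g][e]]
grand = fmean(values)
g_mean = {g: fmean([y for e in environments for y in data[g][e]]) for g in genotypes}
e_mean = {e: fmean([y for g in genotypes for y in data[g][e]]) for e in environments}
cell = {(g, e): fmean(data[g][e]) for g in genotypes for e in environments}

# Sums of squares of the balanced two-way classification.
ss_total = sum((y - grand) ** 2 for y in values)
ss_g = n * len(environments) * sum((g_mean[g] - grand) ** 2 for g in genotypes)
ss_e = n * len(genotypes) * sum((e_mean[e] - grand) ** 2 for e in environments)
ss_ge = n * sum((cell[g, e] - g_mean[g] - e_mean[e] + grand) ** 2
                for g in genotypes for e in environments)
ss_within = sum((y - cell[g, e]) ** 2
                for g in genotypes for e in environments for y in data[g][e])
```

The interaction sum of squares is the part of the variation between cell means that is left after the separate effects of genotype and environment have been removed.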

Genetic Components of Variance 

The partition into genotypic and environmental variance does not 
take us far toward an understanding of the genetic properties of a 
population, and in particular it does not reveal the cause of resem- 
blance between relatives. The genotypic variance must be further 
divided according to the division of genotypic value into breeding 
value, dominance deviation, and interaction deviation. Thus we have: 

Values:                G     =   A     +   D     +   I
Variance components:   V_G   =   V_A   +   V_D   +   V_I      (8.4)
                   (genotypic) (additive) (dominance) (interaction)

The additive variance, which is the variance of breeding values, is the 
important component since it is the chief cause of resemblance be- 
tween relatives and therefore the chief determinant of the observable 
genetic properties of the population and of the response of the popu- 
lation to selection. Moreover, it is the only component that can be 
readily estimated from observations made on the population. In prac- 
tice, therefore, the important partition is into additive genetic variance 
versus all the rest, the rest being non-additive genetic and environmental
variance. This partitioning is most conveniently expressed
as the ratio of additive genetic to total phenotypic variance, V_A/V_P, a
ratio called the heritability.

Estimation of the additive variance rests on observation of the 
degree of resemblance between relatives and will be described later 
when we have discussed the causes of resemblance between relatives. 
Our immediate concern here is to show how the genetic components 
of variance are influenced by the gene frequency. To do this we have 
to express the variance in terms of the gene frequency and the as- 
signed genotypic values a and d. We shall consider first a single locus 
with two alleles, thus excluding interaction variance for the moment. 

Additive and dominance variance. The information needed to 
obtain expressions for the variance of breeding values and the variance 
of dominance deviations was given in the last chapter in Table 7.3 
(p. 125). This table gives the breeding values and dominance devia- 
tions of the three genotypes, expressed as deviations from the popu- 
lation mean. It will be remembered that the means of both breeding 
values and dominance deviations are zero. Therefore no correction 
for an assumed mean is needed, and the variance is simply the mean 
of the squared values. The variances are thus obtained by squaring 
the values in the table, multiplying by the frequency of the genotype 
concerned, and summing over the three genotypes. (The procedure 
of multiplying values by frequencies to obtain the mean was explained 
on p. 114.) The additive variance, which is the variance of breeding 
values, is obtained as follows: 

$$\begin{aligned}
V_A &= \alpha^2[4p^2q^2 + (q - p)^2 \cdot 2pq + 4p^2q^2] \\
    &= 2pq\alpha^2(2pq + p^2 + q^2 - 2pq + 2pq) \\
    &= 2pq\alpha^2 \qquad &(8.5a) \\
    &= 2pq[a + d(q - p)]^2 \qquad &(8.5b)
\end{aligned}$$

Similarly the variance of dominance deviations is

$$\begin{aligned}
V_D &= d^2(4q^4p^2 + 8p^3q^3 + 4p^4q^2) \\
    &= (2pqd)^2 \qquad &(8.6)
\end{aligned}$$

It was noted in the last chapter that breeding values and dominance
deviations are uncorrelated. From this it follows that the genotypic
variance is simply the sum of the additive and dominance variances.

$$\begin{aligned}
V_G &= V_A + V_D \\
    &= 2pq[a + d(q - p)]^2 + [2pqd]^2 \qquad &(8.7)
\end{aligned}$$

Example 8.3. To illustrate the genetic components of variance arising
from a single locus let us return to the pygmy gene in mice, used for
several examples in the last chapter. From the values tabulated in Example
7.6 (p. 123) we may compute the components of variance directly.
Since the values are expressed as deviations from the population mean, the
variance is obtained by multiplying the frequency of each genotype by
the square of its value, and summing over the three genotypes. For example,
the genotypic variance when q = 0.1 is 0.81(0.44)² + 0.18(-1.56)² +
0.01(-7.56)² = 1.1664. The additive variance is obtained in the same way
from the variance of breeding values, and the dominance variance from
the variance of dominance deviations. The variances obtained are as
follows:

                    q = 0.1     q = 0.4
Genotypic, V_G      1.1664      7.1424
Additive, V_A       1.0368      6.2208
Dominance, V_D      0.1296      0.9216

The variances may be obtained also, and with less trouble, by use of the
formulae given above in equations 8.5, 8.6 and 8.7. The values to be substituted
were given in Example 7.1; namely, a = 4 and d = 2. Notice that
the dominance variance is quite small in comparison with the additive.
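The use of the formulae can be shown in a few lines. The sketch below is an illustration only; the function name is an assumption made here, and exact fractions are used so that the results can be compared digit for digit with the tabulated variances.

```python
from fractions import Fraction as F

def variance_components(a, d, q):
    """Return (V_A, V_D, V_G) for a single locus with two alleles,
    q being the frequency of the recessive allele (hypothetical helper)."""
    p = 1 - q
    alpha = a + d * (q - p)            # average effect of the gene-substitution
    v_a = 2 * p * q * alpha ** 2       # additive variance
    v_d = (2 * p * q * d) ** 2         # dominance variance
    return v_a, v_d, v_a + v_d         # genotypic variance is their sum

a, d = F(4), F(2)                      # pygmy gene: a = 4, d = 2
at_q_01 = variance_components(a, d, F(1, 10))
at_q_04 = variance_components(a, d, F(2, 5))
```

The two tuples hold the additive, dominance and genotypic variances for q = 0.1 and q = 0.4 respectively.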

The ways in which the gene frequency and the degree of dominance
influence the magnitude of the genetic components of
variance can best be appreciated from graphical representations of the
relationships derived above, in equations 8.5, 8.6, and 8.7. The
graphs in Fig. 8.1 show the amounts of genotypic, additive, and
dominance variance arising from a single locus with two alleles,
plotted against the gene frequency. Three cases are shown to illustrate
the effect of different degrees of dominance: in graph (a) there
is no dominance (d = 0); in graph (b) there is complete dominance
(d = a); and in graph (c) there is "pure" overdominance (a = 0).
In the first case the genotypic variance is all additive, and it is
greatest when p = q = 0.5. In the second case the dominance variance
is maximal when p = q = 0.5, and the additive is maximal when the
frequency of the recessive allele is q = 0.75. In the third case the
dominance variance is the same as in the second and is maximal








































Fig. 8.1. Magnitude of the genetic components of variance
arising from a single locus with two alleles, in relation to the gene
frequency. Genotypic variance: thick lines; additive variance:
thin lines; dominance variance: broken lines. The gene frequency,
q, is that of the recessive allele. The degrees of dominance
are: in (a) no dominance (d = 0); in (b) complete dominance (d = a);
and in (c) "pure" overdominance (a = 0). The figures on the vertical
scale, showing the amount of variance, are to be multiplied by a²
in graphs (a) and (b), and by d² in graph (c).


when p = q = 0.5. The additive variance, however, is zero when
p = q = 0.5, and has two maxima, one at q = 0.15 and the other at
q = 0.85. The genotypic variance, in this case, remains practically
constant over a wide range of gene frequency, though its composition
changes profoundly. The general conclusion to be drawn from these
graphs is that genes contribute much more variance when at intermediate
frequencies than when at high or low frequencies: recessives
at low frequency, in particular, contribute very little variance.
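The positions of the maxima can be located by evaluating equations 8.5 and 8.6 over a range of gene frequencies. The following is a rough numerical sketch, not a derivation: the grid of 1001 frequencies and the function names are assumptions made here, and a and d are set to 1 or 0 simply to pick out the cases of complete dominance and "pure" overdominance.

```python
def additive_variance(a, d, q):
    """V_A = 2pq[a + d(q - p)]^2, with q the frequency of the recessive allele."""
    p = 1 - q
    return 2 * p * q * (a + d * (q - p)) ** 2

def dominance_variance(d, q):
    """V_D = (2pqd)^2."""
    return (2 * (1 - q) * q * d) ** 2

grid = [i / 1000 for i in range(1001)]      # gene frequencies 0.000 ... 1.000

def argmax(f, qs):
    """Gene frequency in qs at which f is greatest."""
    return max(qs, key=f)

# Complete dominance (d = a = 1).
q_add_complete = argmax(lambda q: additive_variance(1, 1, q), grid)
q_dom_complete = argmax(lambda q: dominance_variance(1, q), grid)

# "Pure" overdominance (a = 0, d = 1): one maximum on each side of q = 0.5.
lower = [q for q in grid if q <= 0.5]
upper = [q for q in grid if q >= 0.5]
q_add_over_low = argmax(lambda q: additive_variance(0, 1, q), lower)
q_add_over_high = argmax(lambda q: additive_variance(0, 1, q), upper)
```

A finer grid would place the maxima more precisely, but the search is only as accurate as the spacing of the frequencies tried.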

A possible misunderstanding about the concept of additive genetic
variance, to which the terminology may give rise, should be
mentioned here. The concept of additive variance does not carry
with it the assumption of additive gene action; and the existence of 
additive variance is not an indication that any of the genes act addi- 
tively (i.e. show neither dominance nor epistasis). No assumption is 
made about the mode of action of the genes concerned. Additive 
variance can arise from genes with any degree of dominance or epis- 
tasis, and only if we find that all the genotypic variance is additive can 
we conclude that the genes show neither dominance nor epistasis. 

The existence of more than two alleles at a locus introduces no 
new principle, though it complicates the theoretical description of the 
effect of the locus. Expressions for the additive and dominance 
variances are given by Kempthorne (1955a). The locus contributes 
additive variance arising from the average effects of its several alleles, 
and dominance variance arising from the several dominance deviations.

To arrive at the variance components expressed in the population 
the separate effects of all loci that contribute variance have to be 
combined. The additive variance arising from all loci together is the 
sum of the additive variances attributable to each locus separately; 
and the dominance variance is similarly the sum of the separate contri- 
butions. But when more than one locus is under consideration then 
the interaction deviations, if present, give rise to another component 
of variance, the interaction variance, which is the variance of the 
interaction deviations. 

Interaction variance. We shall treat the interaction variance as 
a complication, like genotype-environment correlation or inter- 
action, to be circumvented: that is to say, we shall not discuss its 
properties in detail, but we shall show what happens to it if it is 
ignored. It is only comparatively recently that the properties of the 
interaction variance have been worked out (see Cockerham, 1954; 
Kempthorne, 1954, 1955a, b) and little is yet known about its importance
in relation to the other components. It seems probable, however,
that the amount of variance contributed by it is usually rather
small, and that neglect of it is therefore not likely to lead to serious 
error. Description of the properties of interaction variance rests on 
its further subdivision into components. It is first subdivided ac- 
cording to the number of loci involved: two-factor interaction arises 
from the interaction of two loci, three-factor from three loci, etc. 
Interactions involving larger numbers of loci contribute so little 
variance that they can be ignored, and we shall confine our attention
to two-factor interactions since these suffice to illustrate the principles
involved. The next subdivision of the interaction variance is according
to whether the interaction involves breeding values or dominance
deviations. There are thus three sorts of two-factor interactions.
Interaction between the two breeding values gives rise to additive ×
additive variance, V_AA; interaction between the breeding value of one
locus and the dominance deviation of the other gives rise to additive ×
dominance variance, V_AD; and interaction between the two dominance
deviations gives rise to dominance × dominance variance, V_DD.
So the interaction variance is broken down into components thus:

$$V_I = V_{AA} + V_{AD} + V_{DD} + \text{etc.}$$


the terms designated "etc." being similar components arising from 
interactions between more than two loci. At the moment we cannot 
go further than this in the description of the interaction variance, but 
we shall show later how it affects the resemblance between relatives 
and what happens to it when components of variance are estimated 
from observations on the population. 

That completes the description of the nature of the genetic com- 
ponents of variance. The practical value of the partitioning of the 
variance will not yet be fully apparent because it arises from the 
causes of resemblance between relatives, which is the subject of the 
next chapter. The partitioning we have made is essentially a theo- 
retical one, and before we pass on we should consider how much of it 
can actually be made in practice. When observations of resemblance 
between relatives are available we can estimate the additive variance 
and so make the partition $V_A : (V_D + V_I + V_E)$. And if inbred lines
are available we can estimate the environmental variance and so make
the partition $V_G : V_E$. If both these partitions are made we can
separate the additive genetic from the rest of the genetic variance, and
so make the three-fold partition into additive genetic, non-additive
genetic, and environmental variance, $V_A : (V_D + V_I) : V_E$, the domin-
ance and interaction components being lumped together as non-
additive genetic variance. Examples of this partitioning are given in
Table 8.2, although at this stage the method by which the additive 
component is estimated will not be understood. This partitioning is 
as far as we can go by means of relatively simple experiments. By 
more elaborate techniques, requiring large numbers of observations, 
it may be possible to go some way toward separating the dominance 
from the interaction components, or at least to get an idea of their
relative importance. (See, in particular, Robinson and Comstock,
1955; Hayman, 1955, 1958; Cockerham, 1956b.)
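
The arithmetic of the three-fold partition can be sketched as follows. This is only an illustration under stated assumptions: the function name and the numerical variances are hypothetical, and the phenotypic variance is taken to be simply $V_G + V_E$, that is, genotype-environment correlation and interaction are ignored.

```python
def three_fold_partition(v_p, v_a, v_e):
    """Split V_P into additive, non-additive and environmental percentages.

    v_p: total phenotypic variance
    v_a: additive variance (as estimated from resemblance between relatives)
    v_e: environmental variance (as estimated from inbred lines)
    Assumes V_P = V_G + V_E (no genotype-environment correlation or interaction).
    """
    if v_p <= 0:
        raise ValueError("phenotypic variance must be positive")
    v_g = v_p - v_e                # genotypic variance, from the partition V_G : V_E
    v_non_additive = v_g - v_a     # V_D + V_I, obtained by difference
    if min(v_a, v_e, v_non_additive) < 0:
        raise ValueError("the estimates are mutually inconsistent")
    return {
        "additive": 100.0 * v_a / v_p,
        "non_additive": 100.0 * v_non_additive / v_p,
        "environmental": 100.0 * v_e / v_p,
    }


# Hypothetical variances, not taken from the text.
parts = three_fold_partition(v_p=10.0, v_a=4.0, v_e=5.0)
for name, percent in parts.items():
    print(f"{name:>14}: {percent:.0f} per cent")
```

Because the non-additive component is obtained by difference, the sampling errors of both estimates fall on it, which is one reason why it is usually the least precisely known of the three.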

Table 8.2 

Partitioning of the variance of four characters in Drosophila
melanogaster. Components as percentages of the total,
phenotypic, variance.

Total phenotypic, $V_P$
Additive genetic, $V_A$
Non-additive genetic, $V_D + V_I$
Environmental, $V_E$

(1) Number of bristles on 4th + 5th abdominal segments (Clayton, Morris, 
and Robertson, 1957; Reeve and Robertson, 1954). 

(2) Length of thorax (F. W. Robertson, 1957b).

(3) Size of ovaries, i.e. number of ovarioles in both ovaries (F. W.
Robertson, 1957a).

(4) Number of eggs laid in 4 days (4th to 8th after emergence) (F. W.
Robertson, 1957b).

Environmental Variance 

Environmental variance, which by definition embraces all varia- 
tion of non-genetic origin, can have a great variety of causes and its 
nature depends very much on the character and the organism studied. 
Generally speaking, environmental variance is a source of error that 
reduces precision in genetical studies and the aim of the experimenter 
or breeder is therefore to reduce it as much as possible by careful 
management or proper design of experiments. Nutritional and
climatic factors are the commonest external causes of environmental 
variation, and they are at least partly under experimental control. 
Maternal effects form another source of environmental variation that 
is sometimes important, particularly in mammals, but is less sus- 
ceptible to control. Maternal effects are prenatal and postnatal 
influences, mainly nutritional, of the mother on her young: we shall 
have more to say about them in the next chapter in connexion with
resemblance between relatives. Error of measurement is another 
source of variation, though it is usually quite trivial. When a charac- 
ter can be measured in units of length or weight it is usually measured 
so accurately that the variance attributable to measurement is neg- 
ligible in comparison with the rest of the variance. Some characters, 
however, cannot strictly speaking be measured, but have to be graded 
by judgement into classes. Carcass qualities of livestock are an ex- 
ample. With such characters the variance due to measurement may 
be considerable. 

In addition to the variation arising from recognisable causes, such 
as those mentioned, there is usually also a substantial amount of 
non-genetic variation whose cause is unknown, and which therefore 
cannot be eliminated by experimental design. This is generally 
referred to as "intangible" variation. Some of the intangible varia- 
tion may be caused by "environmental" circumstances, in the common 
meaning of the word — that is, by circumstances external to the 
individual — even though their nature is not known. Some, however, 
may arise from "developmental" variation: variation, that is, which 
cannot be attributed to external circumstances, but is attributed, in 
ignorance of its exact nature, to "accidents" or "errors" of develop- 
ment as a general cause. Characters whose intangible variation is 
predominantly developmental are those connected with anatomical 
structure, which do not change after development is complete, such 
as skeletal form, pigmentation, or bristle number in Drosophila. 
Characters more susceptible to the influences of the external environ- 
ment, in contrast, are those connected with metabolic processes, such 
as growth, fertility, and lactation. 

Example 8.4. Human birth weight provides an example of a character 
subject to much environmental variation whose nature has been analysed 
in detail (Penrose, 1954; Robson, 1955). The partitioning of the pheno- 
typic variance given in the table shows the relative importance of all the 
identified sources of variation, birth weight being regarded as a character 
of the child. All the environmental variation is "maternal" in the sense 
that it is connected with the prenatal environment, but several distinct 
components of the maternal environment are distinguished. "Maternal 
genotype," which accounts for 20 per cent of the total phenotypic variance, 
reflects genetic variation (chiefly additive) between mothers in the birth 
weight of their children; i.e. birth weight regarded as a character of the 
mother. "Maternal environment, general," which accounts for another 
18 per cent, reflects non-genetic variation between mothers in the same way.
These two components, totalling 38 per cent, are maternal causes of varia- 
tion in birth weight that affect all children of the same mother alike. 
"Maternal environment, immediate" means causes attributable to the 
mother but differing in successive pregnancies. Two causes of the same 
nature, "age of mother" and "parity" (i.e. whether the child is the first,
second, etc.), are separately identifiable. Finally, the "intangible"
variation is all the remainder, of which the cause cannot be identified. To
explain how these various components were estimated would take too
much space, and could not properly be done until the end of Chapter 10.
It must suffice to say that the estimates all come from comparisons of the
degree of resemblance between identical twins, fraternal twins, full sibs,
children of sisters, and other sorts of cousins.

Partitioning of variance of human birth-weight. Components as
percentages of the total, phenotypic, variance.

Cause of variation

    Non-additive (approx)
    Total genotypic

    Maternal genotype
    Maternal environment, general
    Maternal environment, immediate
    Age of mother
    Parity
    Intangible
    Total environmental

Multiple measurements. When more than one measurement 
of the character can be made on each individual, the phenotypic 
variance can be partitioned into variance within individuals and 
variance between individuals. This subdivision serves to show how 
much is to be gained by the repetition of measurements, and it may 
also throw light on the nature of the environmental variation. There 
are two ways by which the repetition of a character may provide 
multiple measurements: by temporal repetition and by spatial repe- 
tition. Milk-yield and litter size are examples of characters repeated 
in time. Milk-yield can be measured in successive lactations, and
litter size in successive pregnancies. Several measurements of each in- 
dividual can thus be obtained. The variance of yield per lactation, or 
of the number of young per litter, can then be analysed into a com- 
ponent within individuals, measuring the differences between the 
performances of the same individual, and a component between in- 
dividuals, measuring the permanent differences between individuals. 
The within-individual component is entirely environmental in 
origin, caused by temporary differences of environment between suc- 
cessive performances. The between-individual component is partly 
environmental and partly genetic, the environmental part being 
caused by circumstances that affect the individuals permanently. By 
this analysis, therefore, the variance due to temporary environmental 
circumstances is separated from the rest, and can be measured. 
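
The analysis just described can be sketched in code. The following is a minimal illustration, assuming a balanced design in which every individual has the same number of repeated measurements; the function names and the data are hypothetical and are not taken from the text. The components are obtained from a one-way analysis of variance in the usual way, the within-individual component being the within mean square and the between-individual component being the difference of the two mean squares divided by the number of measurements per individual.

```python
from statistics import mean


def variance_components(records):
    """Estimate (between, within) components from repeated measurements.

    records: one list per individual, each holding the same number n >= 2
    of repeated measurements (balanced one-way analysis of variance).
    """
    k = len(records)
    n = len(records[0])
    if k < 2 or n < 2 or any(len(r) != n for r in records):
        raise ValueError("need at least 2 individuals with equal numbers (>= 2) of measurements")
    grand_mean = mean([x for r in records for x in r])
    individual_means = [mean(r) for r in records]
    # Mean square between individuals, on k - 1 degrees of freedom.
    ms_between = n * sum((m - grand_mean) ** 2 for m in individual_means) / (k - 1)
    # Mean square within individuals, on k(n - 1) degrees of freedom.
    ms_within = sum(
        (x - m) ** 2 for r, m in zip(records, individual_means) for x in r
    ) / (k * (n - 1))
    within = ms_within
    between = (ms_between - ms_within) / n  # may come out negative by sampling error
    return between, within


def repeatability(between, within):
    """Between-individual component as a proportion of the total."""
    return between / (between + within)


# Hypothetical records: two successive measurements on each of four individuals.
data = [[10, 12], [6, 8], [11, 9], [5, 7]]
between, within = variance_components(data)
print(f"between = {between:.3f}, within = {within:.3f}, "
      f"ratio = {repeatability(between, within):.2f}")
```

With so few individuals the estimate would of course be very imprecise; the sketch is meant only to show where the two components come from.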

Characters repeated in space are chiefly structural or anatomical, 
and are found more often in plants than in animals. For example, 
plants that bear more than one fruit yield more than one measure- 
ment of any character of the fruit, such as its shape or seed content. 
Spatial repetition in animals is chiefly found in characters that can be 
measured on the two sides of the body or on serially repeated parts, 
such as the number of bristles on the abdominal segments of Droso- 
phila. With spatially repeated characters the within-individual 
variance is again entirely environmental in origin but, unlike that of 
temporally repeated characters, it represents the "developmental" 
variation arising from localised circumstances operating during 

In order that we may discuss both temporal and spatial repetition
together we shall use the term special environmental variance, $V_{Es}$, to
refer to the within-individual variance arising from temporary or
localised circumstances; and the term general environmental variance,
$V_{Eg}$, to refer to the environmental variance contributing to the
between-individual component and arising from permanent or non-
localised circumstances. The ratio of the between-individual com-
ponent to the total phenotypic variance measures the correlation (r)
between repeated measurements of the same individual, and is
known as the repeatability of the character. The partitioning of the
phenotypic variance expressed by the repeatability is thus into two
components, $V_{Es}$ versus $(V_G + V_{Eg})$, so that the repeatability is

$$r = \frac{V_G + V_{Eg}}{V_P}$$


The repeatability therefore expresses the proportion of the variance 
of single measurements that is due to permanent, or non-localised, 
differences between individuals, both genetic and environmental. 
The repeatability differs very much according to the nature of the 
character, and also, of course, according to the genetic properties of 

Table 8.3 
Some Examples of Repeatability 

Organism and character 


Drosophila melanogaster : 

Abdominal bristle number (see Example 8.6). 
Ovary size (F. W. Robertson, 19570). 




Weight at 6 weeks (repeated on 4 consecutive days. 

Original data). 
Litter size (see Example 8.5). 




Weight of fleece, measured in different years (Morley, 



Milk-yield (Johansson, 1950). -40 

the population and the environmental conditions under which the 
individuals are kept. The estimates in Table 8.3 give some idea of 
the sort of values that may be found with various characters, and two 
cases are described in more detail in the following examples. 

Example 8.5. Litter size in mice will serve as an example of a character 
repeated in time. The number of live young born in first and in second 
litters was recorded in 296 mice of a genetically heterogeneous stock, and 
yielded the following components of variance (original data): 

Between mice 3.58
Within mice 4.44

(The procedure for estimating the components of variance from an 
analysis of variance is described by Snedecor (1956, Section 10.12) and is 
outlined below, in Chapter 10, p. 173.) The repeatability of litter size is 
given by the ratio of the between-mice component to the sum of the be-
tween-mice and the within-mice components: i.e.

$$r = \frac{3.58}{3.58 + 4.44} = 0.45$$

Example 8.6. The number of bristles on the ventral surfaces of the 
abdominal segments is a character that has been much studied in Droso- 
phila melanogaster, because it is technically convenient and its genetic 
properties are relatively simple. We have already mentioned it several 
times but have not yet used it as an example. There are about 20 bristles 
on each of 3 segments in males and each of 4 segments in females. The 
number of bristles per segment can therefore be treated as a spatially 
repeated character. The sources of variation in this character have been 
studied in detail by Reeve and Robertson (1954), and the following com- 
ponents of variance were found: 



Total phenotypic, $V_P$
Between flies, $V_G + V_{Eg}$
Within flies, $V_{Es}$

Estimation of the repeatability of a character separates off the
component of variance due to special environment, $V_{Es}$, but it leaves
the other component of environmental variance (that due to general
environment, $V_{Eg}$) confounded with the genotypic variance, as
shown in the above example. The component due to general en-
vironment can be separately estimated only if the genotypic variance
(i.e. including the non-additive components) has been estimated, in
the manner explained in Example 8.1. This has been done with two
characters in Drosophila, and the results are given in Table 8.4.

Table 8.4

Partitioning of the environmental variance of two charac-
ters in Drosophila melanogaster into components due to
general, $V_{Eg}$, and special, $V_{Es}$, environment. The charac-
ters are: abdominal bristle-number (Reeve and Robertson,
1954) as explained in Example 8.6, and ovary size (F. W.
Robertson, 1957a), measured in the two ovaries by the
number of ovarioles, or "egg strings."

Total environmental, $V_E$
General environmental, $V_{Eg}$
Special environmental, $V_{Es}$

The nature of the environmental variation revealed by these results is
remarkable. With both characters less than 10 per cent of the 
environmental variance is general — that is, due to causes influencing 
the individual as a whole. These characters are therefore very little 
influenced by the conditions of the external environment: or, perhaps 
it would be more accurate to say that the experimental technique of 
rearing the flies has been very successful in eliminating unwanted 
sources of environmental variation. Yet, fully half the phenotypic 
variation of one measurement (one segment or ovary) is non-genetic, 
or environmental in the wide sense, as shown in Table 8.2; and, 
moreover, is due to strictly localised causes that influence the seg- 
ments or ovaries independently. Whether this developmental 
variation represents a real indeterminacy of development, or has 
material causes still undetected but in principle controllable, is quite 
unknown. Nor is it known whether the situation revealed in these 
two characters is at all general. We cannot here pursue further the 
biological nature of the non-genetic variation: a general discussion of 
these problems will be found in Waddington (1957). 

We must return to the repeatability and consider its uses. 
Knowledge of the repeatability of a character is useful in two ways. 
First, it sets upper limits to the values of the two ratios, $V_A/V_P$ and
$V_G/V_P$. The first (additive genetic to total phenotypic variance), is the
heritability, which as we shall see in later chapters is of great practical 
importance. The second (genotypic to phenotypic variance) measures 
the degree of genetic determination of the character. The repeatability 
is usually much easier to determine than either of these two ratios, 
and it may often be known when they are not. 

The second way in which knowledge of the repeatability is useful 
is that it indicates the gain in accuracy to be expected from multiple 
measurements. Suppose that each individual is measured n times, 
and that the mean of these n measurements is taken to be the pheno-
typic value of the individual, say $P_{(n)}$. Then the phenotypic variance
is made up of the genotypic variance, the general environmental
variance, and one $n$th of the special environmental variance:

$$V_{P(n)} = V_G + V_{Eg} + \frac{1}{n}V_{Es} \qquad (8.10)$$

Thus, increasing the number of measurements reduces the amount 
of variance due to special environment that appears in the pheno- 
typic variance, and this reduction of the phenotypic variance repre-
sents the gain in accuracy. The variance of the mean of n measure- 
ments as a proportion of the variance of one measurement can be 
expressed in terms of the repeatability, as follows: 


$$\frac{V_{P(n)}}{V_P} = \frac{1 + r(n - 1)}{n} \qquad (8.11)$$


where r is the repeatability, or the correlation between the measure-
ments of the same individual. Fig. 8.2 shows how the phenotypic
variance is reduced by multiple measurements, with characters of
different repeatabilities. When the repeatability is high, and there is
therefore little special environmental variance, multiple measure-
ments give little gain in accuracy. When the repeatability is low,
multiple measurements may lead to a worth-while gain in accuracy.
The gain in accuracy, however, falls off rapidly as the number of
measurements increases, and it is seldom worth while to make more
than two measurements.

Fig. 8.2. Gain in accuracy from multiple measurements of each
individual. The vertical scale gives the variance of the mean of n
measurements as a percentage of the variance of one measurement.
The horizontal scale gives the number of measurements, up to 10.
The four graphs refer to characters of different repeatability.
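
The falling-off of the gain can be seen by tabulating the ratio $[1 + r(n - 1)]/n$ for a few values of r and n. The sketch below is illustrative only: the function name and the particular repeatabilities chosen are assumptions made for the purpose of the illustration.

```python
def variance_ratio(r, n):
    """Variance of the mean of n measurements as a proportion of the
    variance of a single measurement: (1 + r(n - 1)) / n."""
    if not 0.0 <= r <= 1.0:
        raise ValueError("repeatability must lie between 0 and 1")
    if n < 1:
        raise ValueError("number of measurements must be at least 1")
    return (1.0 + r * (n - 1)) / n


# Illustrative repeatabilities (hypothetical, chosen to span low to high).
measurements = (1, 2, 3, 4, 10)
print("   r  " + "".join(f"  n={n:<3}" for n in measurements))
for r in (0.25, 0.50, 0.75):
    row = "".join(f"{100 * variance_ratio(r, n):7.1f}" for n in measurements)
    print(f"{r:5.2f} {row}")
```

As n increases the ratio approaches r itself, since only the special environmental variance is reduced by repetition; the greater part of the reduction is already obtained with the second measurement.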

Example 8.7. Studies of abdominal bristle number in Drosophila are 
generally based on two measurements, i.e. of the fourth and fifth seg- 
ments, and the phenotypic values are expressed as the sum of the two 
counts. As an illustration of the nature of the advantage gained by the 
double measurement we may compare the percentage composition of the 
phenotypic variance when phenotypic values are based on counts of one 
or of two segments: 






Total phenotypic, $V_P$
Additive genetic, $V_A$
Non-additive genetic, $V_D + V_I$
Environmental, general, $V_{Eg}$
Environmental, special, $V_{Es}$

By reducing the amount of environmental variance, the making of two 
measurements increases the proportionate amount of genetic variance: in 
practice it is the increase of the proportion of additive variance — in this 
case from 34 per cent to 52 per cent — that is the important consideration. 

There is an important assumption implicit in the idea of repeata- 
bility, which we have not yet mentioned. It is the assumption that 
the multiple measurements are indeed measurements of what is 
genetically the same character. Consider for example milk-yield in 
successive lactations. If the assumption were valid it would mean that 
the genes that influence yield in first lactations are entirely the same 
as those that influence yield in second or later lactations; or, to put the 
matter in another way, that yield in all lactations is dependent on 
identical developmental and physiological processes. If this assump- 
tion is not valid, as it certainly is not for milk-yield in cattle, then the 
variation within individuals is not purely environmental, and equation 
8.11 is erroneous. The variance between the means of individuals 
will be augmented by additional variance arising from what may 
formally be regarded as interaction between genotype and "environ-
ment," that is between genotype and the time or location of the 
measurement. And this additional variance may be enough to 
counteract the reduction of environmental variance which we have 
described as the chief advantage to be gained from multiple measure- 
ments. Consequently an increase in the proportion of additive genetic 
variance from multiple measurements cannot be relied on until the 
genetical identity of the character measured has been established. 
The number of bristles on the abdominal segments of Drosophila has 
been proved to be genetically the same character, as will be explained 
in Chapter 19, and the conclusions reached in Example 8.7 are valid. 
Milk-yield in cattle, in contrast, is not the same character in suc- 
cessive lactations, and the proportion of additive variance is actually 
less for the mean of several lactations than for first lactations only. 
(See Rendel, et al., 1957.)

CHAPTER 9

RESEMBLANCE BETWEEN RELATIVES



The resemblance between relatives is one of the basic genetic pheno- 
mena displayed by metric characters, and the degree of resemblance 
is a property of the character that can be determined by relatively 
simple measurements made on the population without special experi- 
mental techniques. The degree of resemblance provides the means 
of estimating the amount of additive variance, and it is the propor- 
tionate amount of additive variance (i.e. the heritability) that chiefly 
determines the best breeding method to be used for improvement. 
An understanding of the causes of resemblance between relatives is 
therefore fundamental to the practical study of metric characters and 
to its application in animal and plant improvement. In this chapter, 
therefore, we shall examine the causes of resemblance between rela- 
tives, and show in principle how the amount of additive variance can 
be estimated from the observed degree of resemblance, leaving the 
more practical aspects of the estimation of the heritability for con- 
sideration in the next chapter. 

In the last chapter we saw how the phenotypic variance can be 
partitioned into components attributable to different causes. These 
components we shall call causal components of variance, and denote 
them as before by the symbol V. The measurement of the degree of 
resemblance between relatives rests on the partitioning of the pheno- 
typic variance in a different way, into components corresponding to 
the grouping of the individuals into families. These components can 
be estimated directly from the phenotypic values and for this reason 
we shall call them observational components of phenotypic variance, 
and denote them by the symbol $\sigma^2$ in order to keep the distinction
clear. Consider, for example, the grouping of individuals into 
families of full sibs. By the analysis of variance we can partition the 
total observed variance into two components, within groups and 
between groups. The within-group component is the variance of 
individuals about their group means, and the between-group com- 
ponent is the variance of the "true" means of the groups about the
population mean. The true mean of a group is the mean estimated 
without error from a very large number of individuals. An explana- 
tion of the estimation of these two components will be given, with 
examples, in the next chapter. Now, the resemblance between related 
individuals, i.e. between full sibs in the case under discussion, can be 
looked at either as similarity of individuals in the same group, or as 
difference between individuals in different groups. The greater the 
similarity within the groups, the greater in proportion will be the 
difference between the groups. The degree of resemblance can 
therefore be expressed as the between-group component as a pro-
portion of the total variance. This is the intra-class correlation coeffi-
cient and is given by

$$t = \frac{\sigma^2_B}{\sigma^2_B + \sigma^2_W}$$

where $\sigma^2_B$ is the between-group component and $\sigma^2_W$ the within-group
component. (It is customary to use the symbol t for the intra-class 
correlation of phenotypic values in order to avoid confusion with 
other types of correlation for which the symbol r is used.) The 
between-group component expresses the amount of variation that is 
common to members of the same group, and it can equally well be 
referred to as the covariance of members of the groups. In the case of 
the resemblance between offspring and parents the grouping of the 
observations is into pairs rather than groups; one parent, or the mean 
of two parents, paired with one offspring or the mean of several 
offspring. It is then more convenient to compute the covariance of 
offspring with parents from the sum of cross-products, rather than 
from the between-pair component of variance. With offspring- 
parent relationships, also, it is usually more convenient to express the 
degree of resemblance as the regression coefficient of offspring on 
parent, instead of the correlation between them, the regression being 
given by 


$$b_{OP} = \frac{cov_{OP}}{\sigma^2_P}$$

where $cov_{OP}$ is the covariance of offspring and parents, and $\sigma^2_P$ is the
variance of parents.
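
The two measures of resemblance can be sketched in code as follows. This is only an illustration: the function names and the paired values are hypothetical, and the covariance is computed from the sum of cross-products of deviations, as described above.

```python
def intraclass_correlation(sigma2_between, sigma2_within):
    """t = between-group component as a proportion of the total variance."""
    return sigma2_between / (sigma2_between + sigma2_within)


def regression_offspring_on_parent(parents, offspring):
    """b_OP = cov_OP / variance of parents, from paired observations."""
    if len(parents) != len(offspring) or len(parents) < 2:
        raise ValueError("need at least two parent-offspring pairs")
    n = len(parents)
    mean_p = sum(parents) / n
    mean_o = sum(offspring) / n
    # Sum of cross-products and sum of squares, both as deviations from the means.
    cross_products = sum((p - mean_p) * (o - mean_o) for p, o in zip(parents, offspring))
    squares_p = sum((p - mean_p) ** 2 for p in parents)
    cov_op = cross_products / (n - 1)
    var_p = squares_p / (n - 1)
    return cov_op / var_p


# Hypothetical paired values: one parent with the mean of its offspring.
parents = [10.0, 12.0, 14.0, 16.0]
offspring = [11.0, 12.0, 13.0, 14.0]
print("b_OP =", regression_offspring_on_parent(parents, offspring))
print("t    =", intraclass_correlation(1.0, 3.0))
```

The divisor n - 1 cancels in the regression, so that the coefficient is simply the ratio of the sum of cross-products to the sum of squares of the parents.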

Thus, the covariance of related individuals is the new property 
of the population that we have to deduce in seeking the cause of 
resemblance between relatives, whether sibs or offspring and parents. 

The covariance, being simply a portion of the total phenotypic 
variance, is composed of the causal components described in the last 
chapter, but in amounts and proportions differing according to the 
sort of relationship. By finding out how the causal components con- 
tribute to the covariance we shall see how an observed covariance can 
be used to estimate the causal components of which it is composed. 

Both genetic and environmental sources of variance contribute to 
the covariance of relatives. We shall consider the genetic causes of 
resemblance first, then the environmental causes, and finally, by 
putting the two causes together, arrive at the phenotypic covariance 
and the degree of resemblance that can be observed from measure- 
ments of phenotypic values. A general description of the covariance, 
applicable to any sort of relationship, is given by Kempthorne 
(1955a). Here we shall consider only four sorts of relationship: (1) 
between offspring and one parent, (2) between half sibs, (3) between 
offspring and the mean of the two parents, and (4) between full sibs. 
These are the most important relationships in practice. Identical 
twins will be considered in the next chapter, because the problems 
they raise will be better understood then. 

Genetic Covariance 

Our object now is to deduce from theoretical considerations the 
covariance of relatives arising from genetic causes, neglecting for the 
time being any non-genetic causes of resemblance that there may be. 
This means that we have to deduce the covariance of the genotypic 
values of the related individuals. This will be done by reference to 
two alleles at a locus, but the conclusions are equally valid for loci 
with any number of alleles. We shall at first omit interaction deviations 
and the interaction component of variance from consideration, but 
we shall describe its effects briefly later. 

Offspring and one parent. The covariance to be deduced is 
that of the genotypic values of individuals with the mean genotypic 
values of their offspring produced by mating at random in the popu- 
lation. If values are expressed as deviations from the population 
mean, then the mean value of the offspring is by definition half the 
breeding value of the parent, as explained in Chapter 7. Therefore 
the covariance to be computed is that of an individual's genotypic 
value with half its breeding value, i.e. the covariance of G with $\tfrac{1}{2}A$.
Since G = A + D (D being the dominance deviation) the covariance
is that of (A + D) with $\tfrac{1}{2}A$. Taking the sum of cross-products, we
have

$$\text{sum of cross-products} = \sum \tfrac{1}{2}A(A + D) = \tfrac{1}{2}\sum A^2 + \tfrac{1}{2}\sum AD$$

Since A and D are uncorrelated (see p. 125), the term $\tfrac{1}{2}\sum AD$ is
zero. Then if we divide both sides by the number of paired observa-
tions we have

$$cov_{OP} = \tfrac{1}{2}V_A \qquad (9.1)$$

since $\sum A^2$ is the sum of squares of breeding values. The genetic
covariance of offspring and one parent is therefore half the additive
variance.
The covariance may be derived by another method, which though 
less concise is perhaps more explicit. Table 9.1 gives the genotypes 
of the parents, their frequencies in the population, and their geno- 
typic values expressed as deviations from the population mean (from 
Table 7.3). The right-hand column gives the mean genotypic values 


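Table 9.1

Parents                                                  Offspring
Genotype    Frequency   Genotypic value                  Mean genotypic value
$A_1A_1$    $p^2$       $2q(\alpha - qd)$                $q\alpha$
$A_1A_2$    $2pq$       $(q - p)\alpha + 2pqd$           $\tfrac{1}{2}(q - p)\alpha$
$A_2A_2$    $q^2$       $-2p(\alpha + pd)$               $-p\alpha$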

of the offspring, which are half the breeding values of the parents as 
given in Table 7.3. The covariance of offspring and parent is then the 
mean cross-product, and is obtained by multiplying together the 
three columns — frequency x genotypic value of parent x genotypic 
value of offspring — and summing over the three genotypes of the 
parents. After collecting together the terms in $\alpha^2$ and the terms in $\alpha d$
we obtain

$$cov_{OP} = pq\alpha^2(p^2 + 2pq + q^2) + 2p^2q^2\alpha d(-q + q - p + p) = pq\alpha^2 = \tfrac{1}{2}V_A$$

since from equation 8.5, $V_A = 2pq\alpha^2$. Summing over all loci we again
reach the conclusion that the covariance of offspring and one parent 
is equal to half the additive variance. 


Half sibs. Half sibs are individuals that have one parent in com- 
mon and the other parent different. A group of half sibs is therefore 
the progeny of one individual mated at random and having one 
offspring by each mate. Thus the mean genotypic value of the group 
of half sibs is by definition half the breeding value of the common 
parent. The covariance is the variance of the means of the half-sib 
groups, and is therefore the variance of half the breeding values of the 
parents; this is a quarter of the additive variance: 

$$cov_{(HS)} = V_{\frac{1}{2}A} = \tfrac{1}{4}V_A \qquad (9.2)$$

This covariance also can be demonstrated by the longer method, 
from the values already given in Table 9.1. The covariance is the 
variance of the means of the groups of offspring listed in the right- 
hand column. Squaring the offspring values and multiplying by their 
frequencies we get 

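$$\begin{aligned}
\text{Variance of means of half-sib families} &= p^2q^2\alpha^2 + 2pq \cdot \tfrac{1}{4}(q - p)^2\alpha^2 + q^2p^2\alpha^2 \\
&= pq\alpha^2[pq + \tfrac{1}{2}(q - p)^2 + pq] \\
&= pq\alpha^2[\tfrac{1}{2}(p + q)^2] \\
&= \tfrac{1}{2}pq\alpha^2
\end{aligned}$$

Therefore, since $2pq\alpha^2 = V_A$ (from equation 8.5),

$$cov_{(HS)} = \tfrac{1}{4}V_A$$

summation being made over all loci.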
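
Both results can be checked numerically for a single locus by working directly from the three rows of Table 9.1. The sketch below is an illustration under stated assumptions: the function name and the values chosen for p, a and d are hypothetical, and $\alpha$ is taken as $a + d(q - p)$, the form in which it appears in the derivation of equation 9.3.

```python
def single_locus_check(p, a, d):
    """Return (V_A, cov_OP, cov_HS) for one locus with two alleles.

    p: frequency of A1; a, d: genotypic values of A1A1 and A1A2
    (A2A2 being -a). Random mating is assumed.
    """
    q = 1.0 - p
    alpha = a + d * (q - p)           # average effect of the gene substitution
    v_a = 2.0 * p * q * alpha ** 2    # additive variance, V_A = 2pq * alpha^2
    # Rows of Table 9.1: frequency, genotypic value of the parent, and mean
    # genotypic value of its offspring, the values being deviations from the
    # population mean.
    rows = [
        (p * p,       2.0 * q * (alpha - q * d),             q * alpha),
        (2.0 * p * q, (q - p) * alpha + 2.0 * p * q * d,     0.5 * (q - p) * alpha),
        (q * q,       -2.0 * p * (alpha + p * d),            -p * alpha),
    ]
    # Offspring and one parent: mean cross-product of the last two columns.
    cov_op = sum(f * g * o for f, g, o in rows)
    # Half sibs: variance of the offspring means.
    cov_hs = sum(f * o * o for f, g, o in rows)
    return v_a, cov_op, cov_hs


# Hypothetical gene frequency and genotypic values.
v_a, cov_op, cov_hs = single_locus_check(p=0.3, a=2.0, d=1.0)
print(f"V_A = {v_a:.4f}  cov_OP = {cov_op:.4f}  cov_HS = {cov_hs:.4f}")
```

Whatever values of p, a and d are tried, the first covariance should come out as half, and the second as a quarter, of the additive variance, the terms in d cancelling as in the algebra above.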

Offspring and mid-parent. The covariance of the mean of the 
offspring and the mean of both parents (commonly called the "mid-
parent") may be deduced in the following way. Let O be the mean of
the offspring, and P and P' be the values of the two parents. Then
we want to find $cov_{O\bar{P}}$, that is, the covariance of O with $\tfrac{1}{2}(P + P')$.
This is equal to $\tfrac{1}{2}(cov_{OP} + cov_{OP'})$. If P and P' have the same variance,
then $cov_{OP} = cov_{OP'}$ and $cov_{O\bar{P}} = cov_{OP}$. Thus, provided the two sexes
have equal variances, the covariance of offspring and mid-parent is
the same as that of offspring with one parent, which we have seen is 
equal to half the additive variance. This conclusion may be extended 
to other sorts of relative: the covariance of any individual with the 
mean value of a number of relatives of the same sort is equal to its 
covariance with one of those relatives. 

The longer method of demonstrating the covariance of offspring 
with mid-parent is rather laborious, but it must be given since it will
be needed for arriving at the covariance of full sibs. We shall, how- 
ever, omit some of the steps of algebraic reduction. A table (Table 
9.2) is made in the same manner as for offspring and one parent, but 
now we have to tabulate types of mating and their frequencies, in- 
stead of single parents. This was done in Chapter 1 (Table 1.1). 
Against each type of mating we put the mean genotypic value of the 
two parents, i.e. the mid-parent value; then the genotypes of the pro- 
geny and the mean genotypic value of the progeny. The working is 
made easier by writing the genotypic values in terms of a and d 
instead of as deviations from the population mean. In the last two 
columns of the table we put the product of progeny-mean x mid- 
parent, and the square of the progeny for later use. Now, to get the 
covariance of progeny-mean and mid-parent value, we take the pro- 
duct of progeny-mean x mid-parent and multiply it by the frequency 
of the mating type, and then sum over mating types. This gives the 
mean product (M.P.) from which we have to deduct a correction for 
the population mean, since values are not here expressed as deviations 
from the mean. The correction is simply the square of the population 
mean ($M^2$) since the means of parents and of progeny are equal.
Both the M.P. and $M^2$ contain terms in $a^2$, in $ad$, and in $d^2$. By
collecting together these terms and simplifying a little we obtain

$$\begin{aligned}
\text{M.P.} &= a^2[p^3(p+q) + q^3(p+q)] + 2adpq(p^2 - q^2) + d^2pq(p^2 + 2pq + q^2) \\
M^2 &= a^2(p^2 - 2pq + q^2) + 4adpq(p - q) + 4d^2p^2q^2
\end{aligned}$$

Then,

$$\begin{aligned}
cov_{O\bar{P}} &= \text{M.P.} - M^2 \\
&= a^2pq - 2adpq(p - q) + d^2pq(p - q)^2 \\
&= pq[a + d(q - p)]^2 \\
&= pq\alpha^2 \\
&= \tfrac{1}{2}V_A \qquad (9.3)
\end{aligned}$$

when summed over all loci. 

So the genetic covariance of offspring with the mean of their parents 
is equal to half the additive genetic variance. That this covariance 
comes out the same as that of offspring and one parent need cause no 
surprise when we note that the variance of mid-parent values is half 
the variance of individual values (see below, p. 162). 

Full sibs. The covariance of full sibs is the variance of the means 
of full-sib families, and is got with little additional work from Table 
9.2. The last column shows the squares of progeny means and it will 
be seen that these squares are all exactly the same as the products of
progeny-mean × mid-parent, except for the two entries in the middle
involving terms in $d^2$. The mean square (M.S.) can therefore be got
from the mean product (M.P.) already calculated, thus

$$\begin{aligned}
\text{M.S.} &= \text{M.P.} + d^2\cdot 2p^2q^2 - \tfrac{1}{4}d^2\cdot 4p^2q^2 \\
&= \text{M.P.} + d^2p^2q^2
\end{aligned}$$

The correction for the mean is the same as before, so we have

$$\begin{aligned}
cov_{(FS)} &= \text{M.S.} - M^2 \\
&= cov_{O\bar{P}} + d^2p^2q^2 \\
&= pq\alpha^2 + d^2p^2q^2
\end{aligned}$$

Since $2pq\alpha^2 = V_A$ (from equation 8.5) and $4d^2p^2q^2 = V_D$ (from
equation 8.6) the covariance of full sibs is

$$cov_{(FS)} = \tfrac{1}{2}V_A + \tfrac{1}{4}V_D$$

summing over all loci. 

So the genetic covariance of full sibs is equal to half the additive 
genetic variance plus a quarter of the dominance variance. This is the 
only one of the relationships that we have considered where we find 
the dominance variance contributing to the resemblance. The reason 
is that full sibs have both parents in common, and a pair of full sibs 
have a quarter chance of having the same genotype for any locus. 
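
The two derivations from the table of mating types can be reproduced mechanically. What follows is a sketch under stated assumptions: one locus with two alleles, random mating, genotypic values +a, d, -a, and mating-type frequencies written out from the expansion of (p + q)^4; the function names and the numerical inputs are illustrative, not part of the original text.

```python
from fractions import Fraction

def sib_covariances(p, a, d):
    """Return (cov of progeny mean with mid-parent, cov of full sibs), one locus."""
    q = 1 - p
    # (frequency of mating type, mid-parent value, mean value of progeny)
    matings = [
        (p**4, a, a),                               # A1A1 x A1A1
        (4 * p**3 * q, (a + d) / 2, (a + d) / 2),   # A1A1 x A1A2
        (2 * p**2 * q**2, 0, d),                    # A1A1 x A2A2
        (4 * p**2 * q**2, d, d / 2),                # A1A2 x A1A2
        (4 * p * q**3, (d - a) / 2, (d - a) / 2),   # A1A2 x A2A2
        (q**4, -a, -a),                             # A2A2 x A2A2
    ]
    mean = a * (p - q) + 2 * d * p * q                       # population mean M
    mean_product = sum(f * mp * o for f, mp, o in matings)   # M.P.
    mean_square = sum(f * o * o for f, _, o in matings)      # M.S.
    return mean_product - mean**2, mean_square - mean**2

def variance_components(p, a, d):
    """Return (V_A, V_D) for one locus."""
    q = 1 - p
    alpha = a + d * (q - p)
    return 2 * p * q * alpha**2, (2 * p * q * d) ** 2

# Illustrative values only.
p, a, d = Fraction(3, 10), Fraction(2), Fraction(1)
cov_mid_parent, cov_full_sibs = sib_covariances(p, a, d)
v_a, v_d = variance_components(p, a, d)
print(cov_mid_parent, v_a / 2)
print(cov_full_sibs, v_a / 2 + v_d / 4)
```

Only the third and fourth mating types have a progeny mean that differs from the mid-parent value, which is why the dominance term appears in the full-sib covariance and not in the mid-parent covariance.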

Covariance due to epistatic interaction. Before we turn to the 
environmental causes of resemblance between relatives let us briefly 
examine the role of interaction variance arising from epistasis. In 
Chapter 8 we noted that the interaction variance, $V_I$, is subdivided
into components according to the number of loci interacting, and 
according to whether the interaction is between breeding values or 
dominance deviations. The covariances of relatives, with the
contributions of the two-factor interactions included, are shown in
Table 9.3 (for details see Kempthorne, 1955a, b). The offspring-parent
covariance refers equally to one parent and to mid-parent values.
For the sake of clarity the components of variance are shown at the
heads of the columns and their coefficients in the covariances are
listed below. For example, the offspring-parent covariance is
$\tfrac{1}{2}V_A + \tfrac{1}{4}V_{AA}$. The contributions of interaction to the
covariances are expressible in a simple general form, shown in the
bottom line of the table. If the covariance contains $xV_A$ then it
contains also $x^2V_{AA}$; and if it contains $yV_D$ it contains also
$xyV_{AD}$ and $y^2V_{DD}$. Interactions involving more than two loci
contribute progressively smaller proportions as the number of loci
increases. The effect of the interaction variance on the resemblance
between relatives is, in principle, that the offspring-parent
covariance is not twice the half-sib covariance, but a little more
than twice; and that the excess of the full-sib covariance over the
half-sib represents not only dominance variance but also some of the
interaction variance.

Table 9.3
Covariances of relatives including the contributions of
two-factor interactions.

                               Variance components and the
                               coefficients of their contributions

                               V_A     V_D     V_AA    V_AD    V_DD
Offspring-parent: cov_OP       1/2     0       1/4     0       0
Half sibs:        cov_(HS)     1/4     0       1/16    0       0
Full sibs:        cov_(FS)     1/2     1/4     1/4     1/8     1/16
General:          cov          x       y       x^2     xy      y^2
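
The general rule in the bottom line of the table can be written as a small function. This is a sketch only: the function name and the dictionary layout are hypothetical, and the inputs x and y are the coefficients of V_A and V_D already derived for each sort of relative.

```python
from fractions import Fraction

def covariance_coefficients(x, y):
    """Coefficients of the two-factor components, given x (of V_A) and y (of V_D)."""
    return {"V_A": x, "V_D": y, "V_AA": x * x, "V_AD": x * y, "V_DD": y * y}

relatives = {
    "offspring-parent": (Fraction(1, 2), Fraction(0)),
    "half sibs": (Fraction(1, 4), Fraction(0)),
    "full sibs": (Fraction(1, 2), Fraction(1, 4)),
}
table = {name: covariance_coefficients(x, y) for name, (x, y) in relatives.items()}

for name, row in table.items():
    print(name, {component: str(value) for component, value in row.items()})

# Excess of the offspring-parent covariance over twice the half-sib covariance,
# expressed as a coefficient of V_AA.
excess = table["offspring-parent"]["V_AA"] - 2 * table["half sibs"]["V_AA"]
print("excess coefficient of V_AA:", excess)
```

The positive excess is the reason the offspring-parent covariance is a little more than twice the half-sib covariance when additive × additive interaction is present.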

When the interaction variance was first discussed in Chapter 8 
we said we would regard it as a complication to be circumvented, 
noting only the consequences of neglecting it. These consequences 
are now apparent. First, only small fractions of it contribute to the 
covariances and therefore its effect on the resemblance between rela- 
tives is small unless the amount of interaction variance is large in 
comparison with the other components. And second, it appears that 
there is little we can do in practice except ignore it, because, apart 
from the special experimental methods mentioned on p. 139, there is 
no practicable means of separating the interaction from the other 
components. The consequences of ignoring the interaction variance
are thus that any estimate of $V_A$ made from offspring-parent
regressions will contain also $\tfrac{1}{2}V_{AA} + \tfrac{1}{4}V_{AAA}$ + etc.; any
estimate of $V_A$ from half-sib correlations will contain also
$\tfrac{1}{4}V_{AA} + \tfrac{1}{16}V_{AAA}$ + etc.; and any
estimate of $V_D$ obtained from a full-sib correlation will contain also
portions of the interaction components. We noted in Chapter 7 that
the two definitions of breeding value given there are not equivalent 
if there is interaction between loci. We can now see how this comes 
about. Defined in terms of the measured values of progeny — the 
practical definition — breeding value includes additive x additive 
interaction deviations in addition to the average effects of the genes 
carried by the parents; whereas, defined in terms of the average 
effects of genes — the theoretical definition — it does not. 

Effect of linkage. Throughout the discussion of the covariances
of relatives we have ignored the effects of linkage, assuming always
that the loci concerned segregate independently. The effects of
linkage in a random-mating population, where the coupling and
repulsion phases are in equilibrium, are as follows (Cockerham,
1956a). The covariances of offspring and parents are not affected,
but the covariances of half and full sibs are increased; the closer the
linkage the greater the increase. The additional covariance due to
linkage appears with the interaction component. Therefore what is
formally attributed to epistatic interaction may be in part due to
linkage.

Environmental Covariance 

Genetic causes are not the only reasons for resemblance between 
relatives; there are also environmental circumstances that tend to 
make relatives resemble each other, some sorts of relatives more than 
others. If members of a family are reared together, as with human 
families or litters of pigs or mice, they share a common environment. 
This means that some environmental circumstances that cause 
differences between unrelated individuals are not a cause of difference 
between members of the same family. In other words there is a com- 
ponent of environmental variance that contributes to the variance 
between means of families but not to the variance within the families, 
and it therefore contributes to the covariance of the related individuals. 
This between-group environmental component, for which we shall 
use the symbol $V_{Ec}$, is usually called the common environment, a term
that seems more appropriate when we think of the component as a 
cause of similarity between members of a group than when we think 
of it as a cause of difference between members of different groups. 
The remainder of the environmental variance, which we shall denote 
by $V_{Ew}$, arises from causes of difference that are unconnected with
whether the individuals are related or not. It therefore appears in 
the within-group component of variance, but does not contribute to 
the between-group component, which is the variance of the true 
means of the groups. In considerations of the resemblance between 
relatives, therefore, the environmental variance must be divided into 
two components: 

$$V_E = V_{Ec} + V_{Ew}$$

one of the components, $V_{Ec}$, contributing to the covariance of the
related individuals. 
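
The way the common-environment component enters the covariance can be illustrated by simulation. The sketch below assumes purely environmental deviations that are normally distributed, with one deviation shared by both members of a family and one peculiar to each individual; the function names, the seed and the variances 1.0 and 3.0 are arbitrary choices made for illustration.

```python
import random
from statistics import mean

def simulate_sib_pairs(n_families, v_ec, v_ew, seed=1):
    """Environmental deviations of two members of each of n_families families."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_families):
        common = rng.gauss(0.0, v_ec ** 0.5)          # shared by the family
        first = common + rng.gauss(0.0, v_ew ** 0.5)  # plus individual part
        second = common + rng.gauss(0.0, v_ew ** 0.5)
        pairs.append((first, second))
    return pairs

def covariance(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

pairs = simulate_sib_pairs(20000, v_ec=1.0, v_ew=3.0)
firsts = [x for x, _ in pairs]
seconds = [y for _, y in pairs]
cov_between_sibs = covariance(firsts, seconds)   # expected near V_Ec = 1
total_variance = covariance(firsts, firsts)      # expected near V_Ec + V_Ew = 4
print(round(cov_between_sibs, 2), round(total_variance, 2))
```

Only the shared part contributes to the covariance of family members, while both parts contribute to the variance of individuals.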

The sources of common environmental variance are many and 
varied, and only a few examples can be mentioned. Soil conditions 
may differentiate families of plants when the members of a family are 
grown together on the same plot: similarly the conditions of the cul- 
ture medium may differentiate families of Drosophila or other small 
animals. With farm animals, related individuals are likely to have 
been reared on the same farm, and differences of climate or of manage- 
ment contribute to the resemblance between the relatives. "Maternal 
effects" are a frequent source of environmental difference between 
families, especially with mammals. The young are subject to a 
maternal environment during the first stages of their life, and this 
influences the phenotypic values of many metric characters even 
when measured on the adult, causing offspring of the same mother to 
resemble each other. Finally, members of the same family tend to be 
contemporaneous, and changes of climatic or nutritional conditions 
tend to differentiate members of different families. This source of 
common environmental variation is especially important in animals 
that produce their young in broods or litters. 

These various sources of common environmental variation con- 
tribute chiefly to the resemblance between sibs, though some may 
also cause resemblance between parent and offspring. Maternal 
effects, in particular, often cause a resemblance between mother and 
offspring as well as among the offspring themselves. Body size in mice 
and other mammals provides an example. Large mothers tend to 
provide better nutrition for their young, both before and after birth, 
than small mothers. Therefore the young of large mothers tend to 
grow faster, and the effect of the rapid early growth may persist, so 
that when adult their body size is larger. Thus mothers and offspring 
tend to resemble each other in body size. 

It will be seen from the examples given that the nature of the
component of variance due to common environment differs according
to the circumstances. What we designate as the $V_{Ec}$ component
depends on the way in which individuals are grouped when we
estimate the observational components of phenotypic variance.
Whatever the form of the analysis, the part of the variance between the
means of groups that can be ascribed to environmental causes is
called the $V_{Ec}$ component. The nature of this component thus
depends on the form of the analysis applied. If the groups in the
analysis are full-sib families then the $V_{Ec}$ component represents
environmental causes of similarity between full sibs; if the groups are
half sibs it represents causes of similarity between half sibs. And in
parent-offspring relationships a comparable covariance term
represents environmental causes of resemblance between offspring and
parent. Thus, whenever we measure a phenotypic covariance with
the object of using it to estimate a causal component of variance we
have to decide whether it includes an appreciable component due to
common environment, and this is often a matter of judgment based
on a biological understanding of the organism and the character. In
experiments, much of the $V_{Ec}$ component can often be eliminated by
suitable design. For example, members of the same family need not
always be reared in the same vial, cage, or plot; they can be
randomised over the rearing environments. Or, by replication, the $V_{Ec}$
component can be measured and suitable allowance made for it in the
resemblance between the relatives.

Thus relatives of all sorts may in principle be subject to an en- 
vironmental source of resemblance. In what follows, however, we 
shall make the simplification of disregarding the $V_{Ec}$ component for
all relatives except full sibs, though from time to time we shall put in 
a reminder of its possible presence. Full sibs are subject to a com- 
mon maternal environment and this is often the most troublesome 
source of environmental resemblance to overcome by experimental 
design. Consequently a $V_{Ec}$ component contributes more often and
in greater amount to the covariance of full sibs than to that of any 
other sort of relative. The simplification of disregarding all other 
sources of common environmental variance is therefore not entirely 

Phenotypic Resemblance 

The covariance of phenotypic values is the sum of the covariances 
arising from genetic and from environmental causes. Thus by 
putting together the conclusions of the two preceding sections we 
arrive at the phenotypic covariances given in Table 9.4. (It will be 
remembered that some possible sources of environmental covariance 
are being neglected, particularly in offspring-parent relationships 
involving the mother.) In all these relationships except that of full 
sibs the covariance is either a half or a quarter of the additive genetic 
variance. By observing the phenotypic covariance of relatives we can
thus estimate the amount of additive variance in the population and
make the partition of the variance into additive versus the rest. 

To arrive at the degree of resemblance expressed as a regression or
correlation coefficient we have to divide the covariance by the
appropriate variance. The resemblance between sibs is expressed as a
correlation and the covariance is divided by the total phenotypic
variance. The correlation between half sibs, for example, is therefore
$\tfrac{1}{4}V_A/V_P$. The resemblance between offspring and parent is
expressed as the regression of offspring on parent, and the covariance
is therefore divided by the variance of parents. In the case of single
parents this is again the phenotypic variance, and the regression of
offspring on one parent is thus $\tfrac{1}{2}V_A/V_P$. In a random-breeding
population the phenotypic variance of parents and offspring is the
same, and then the correlation between offspring and one parent is the
same as the regression. The case of mid-parent values, however, is a
little different. The covariance has to be divided by the variance of
mid-parent values, and this is half the phenotypic variance, for the
following reason. Let X and Y stand for the phenotypic values of male
and female parents respectively. Then $\sigma_X^2 = \sigma_Y^2 = V_P$. The
mid-parent value is $\tfrac{1}{2}X + \tfrac{1}{2}Y$, and the variance of mid-parent
values, assuming X and Y to be uncorrelated, is therefore
$\tfrac{1}{4}\sigma_X^2 + \tfrac{1}{4}\sigma_Y^2 = \tfrac{1}{2}\sigma_X^2 = \tfrac{1}{2}V_P$. Thus
the regression of offspring on mid-parent is
$\tfrac{1}{2}V_A / \tfrac{1}{2}V_P = V_A/V_P$. The correlation between offspring
and mid-parent values, however, is $\tfrac{1}{2}V_A/(\sigma_{\bar{P}}\sigma_O)$,
where $\sigma_{\bar{P}}$ and $\sigma_O$ are the square roots of the phenotypic
variances of mid-parents and offspring respectively, and this is not
the same as the regression of offspring on mid-parent.

Table 9.4
Phenotypic Resemblance between Relatives

Relatives                   Covariance                   Regression (b) or correlation (t)
Offspring and one parent    (1/2)V_A                     b = (1/2) V_A / V_P
Offspring and mid-parent    (1/2)V_A                     b = V_A / V_P
Half sibs                   (1/4)V_A                     t = (1/4) V_A / V_P
Full sibs                   (1/2)V_A + (1/4)V_D + V_Ec   t = [(1/2)V_A + (1/4)V_D + V_Ec] / V_P


The regressions of offspring on parents and the correlations of 
sibs are shown in Table 9.4. All except the full-sib correlation are 
simple fractions of the ratio $V_A/V_P$. Thus the different degrees of
resemblance between different sorts of relatives become apparent. 
For example, the regression of offspring on one parent is twice the 
correlation between half sibs, and the correlation between full sibs is 
twice the correlation between half sibs if there is no dominance and 
no common environment. 
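
These relations are easy to tabulate for any assumed set of variance components. The sketch below is illustrative only: the function name and the numerical components are hypothetical, parents are assumed to be unselected and mated at random, and common environment is assumed to affect full sibs alone.

```python
import math

def resemblance(v_a, v_d, v_ec, v_p):
    """Regressions (b) and correlations (t) between relatives from variance components."""
    cov_offspring_parent = 0.5 * v_a   # same for one parent and for mid-parent
    var_mid_parent = 0.5 * v_p         # variance of (X + Y) / 2, X and Y uncorrelated
    return {
        "b, offspring on one parent": cov_offspring_parent / v_p,
        "b, offspring on mid-parent": cov_offspring_parent / var_mid_parent,
        "t, half sibs": 0.25 * v_a / v_p,
        "t, full sibs": (0.5 * v_a + 0.25 * v_d + v_ec) / v_p,
    }

# Hypothetical components: no dominance and no common environment.
simple = resemblance(v_a=40.0, v_d=0.0, v_ec=0.0, v_p=100.0)
# The same additive variance with dominance and common environment added.
inflated = resemblance(v_a=40.0, v_d=20.0, v_ec=10.0, v_p=100.0)

for label, result in (("simple", simple), ("inflated", inflated)):
    print(label, {name: round(value, 3) for name, value in result.items()})
```

In the first case the full-sib correlation is exactly twice the half-sib correlation; in the second it exceeds that value, while the other three measures are unchanged.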

The difference between the full-sib covariance and twice the 
half-sib covariance can, in principle, be used to estimate the dominance
variance, $V_D$, provided there is no variance due to common
environment, though some of the variance due to epistatic interaction 
would be included, as may be seen from Table 9.3. In practice, 
however, it is usually very difficult to be certain that there is no 
variance due to common environment, and estimates of the domin- 
ance variance obtained in this way are generally to be regarded as 
upper limits rather than as precise estimates. 
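
As a worked step, the difference can be written out in full. The expansion below assumes the two-factor interaction coefficients given by the general rule (x² for V_AA, xy for V_AD, y² for V_DD) and a common-environment term in the full-sib covariance only.

```latex
\begin{aligned}
cov_{(FS)} - 2\,cov_{(HS)}
  &= \left(\tfrac{1}{2}V_A + \tfrac{1}{4}V_D + \tfrac{1}{4}V_{AA}
     + \tfrac{1}{8}V_{AD} + \tfrac{1}{16}V_{DD} + V_{Ec}\right)
     - 2\left(\tfrac{1}{4}V_A + \tfrac{1}{16}V_{AA}\right) \\
  &= \tfrac{1}{4}V_D + \tfrac{1}{8}V_{AA} + \tfrac{1}{8}V_{AD}
     + \tfrac{1}{16}V_{DD} + V_{Ec}
\end{aligned}
```

Four times this difference therefore equals $V_D$ only when the interaction and common-environment terms are absent; otherwise it overestimates $V_D$, which is why such estimates are treated as upper limits.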

Table 9.5
The Resemblance between Relatives for some Characters in Man

[Table body lost in extraction; the surviving labels are the column
"Full sib" and the rows "Length of forearm" and "Birth weight".]

(1) Pearson and Lee (1903).
(2) Unweighted averages of several estimates, cited by Penrose (1949).
(3) Quoted from Robson (1955).

The chief use of measurements of the degree of resemblance
between relatives is to estimate the proportionate amount of additive
genetic variance, $V_A/V_P$, which is the heritability. The meaning of the
heritability and the methods of estimating it will be considered more
fully in the next chapter. To conclude this chapter we give in Table
9.5 some examples of correlations between relatives in man. These
are undoubtedly complicated by covariance due to common
environment, and also by assortative mating. The correlation between
husband and wife for intelligence, for example, is as high as 0.58
(see Penrose, 1949). For these reasons human correlations cannot
easily be used to partition the variation into its components.

CHAPTER 10

HERITABILITY




The heritability of a metric character is one of its most important 
properties. It expresses, as we have seen, the proportion of the total 
variance that is attributable to the average effects of genes, and this is 
what determines the degree of resemblance between relatives. But 
the most important function of the heritability in the genetic study 
of metric characters has not yet been mentioned, namely its predictive 
role, expressing the reliability of the phenotypic value as a guide to 
the breeding value. Only the phenotypic values of individuals can 
be directly measured, but it is the breeding value that determines their 
influence on the next generation. Therefore if the breeder or experi- 
menter chooses individuals to be parents according to their pheno- 
typic values, his success in changing the characteristics of the popu- 
lation can be predicted only from a knowledge of the degree of corre- 
spondence between phenotypic values and breeding values. This 
degree of correspondence is measured by the heritability, as the fol- 
lowing considerations will show. 

The heritability is defined as the ratio of additive genetic variance 
to phenotypic variance: 

$$h^2 = \frac{V_A}{V_P}$$

(The customary symbol $h^2$ stands for the heritability itself and not for
its square. The symbol derives from Wright's (1921) terminology,
where h stands for the corresponding ratio of standard deviations.)
An equivalent meaning of the heritability is the regression of breeding
value on phenotypic value:

$$h^2 = b_{AP} \qquad (10.2)$$

The equivalence of these meanings can be seen from reasoning similar 
to that by which we derived the genetic covariance of offspring and 
one parent on p. 153. If we split the phenotypic value into breeding
value and a remainder (R) consisting of the environmental, dominance,
and interaction deviations, then P = A + R. Since A and R are
uncorrelated, $cov_{AP} = V_A$ and so $b_{AP} = V_A/V_P$.

We may note also that the correlation between breeding values 
and phenotypic values, r AP , is equal to the square root of the heri- 
tability. This follows from the general relationship between corre- 
lation and regression coefficients, which gives 

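$$r_{AP} = b_{AP}\,\frac{\sigma_P}{\sigma_A}
        = \frac{V_A}{V_P}\cdot\frac{\sigma_P}{\sigma_A}
        = \frac{\sigma_A}{\sigma_P}
        = h \qquad (10.3)$$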

By regarding the heritability as the regression of breeding value
on phenotypic value we see that the best estimate of an individual's
breeding value is the product of its phenotypic value and the
heritability:

$$A_{(expected)} = h^2P \qquad (10.4)$$

breeding values and phenotypic values both being reckoned as
deviations from the population mean. In other words the heritability
expresses the reliability of the phenotypic value as a guide to the
breeding value, or the degree of correspondence between phenotypic
value and breeding value. For this reason the heritability enters into
almost every formula connected with breeding methods, and many
practical decisions about procedure depend on its magnitude. These
matters, however, will be considered in the next chapters; here we
are concerned only to point out that the determination of the
heritability is one of the first objectives in the genetic study of a
metric character.

It is important to realise that the heritability is a property not 
only of a character but also of the population and of the environ- 
mental circumstance to which the individuals are subjected. Since 
the value of the heritability depends on the magnitude of all the com- 
ponents of variance, a change in any one of these will affect it. All 
the genetic components are influenced by gene frequencies and may 
therefore differ from one population to another, according to the past 
history of the population. In particular, small populations maintained 
long enough for an appreciable amount of fixation to have taken place 
are expected to show lower heritabilities than large populations. 
The environmental variance is dependent on the conditions of culture
or management: more variable conditions reduce the heritability,
more uniform conditions increase it. So, whenever a value is stated 
for the heritability of a given character it must be understood to refer 
to a particular population under particular conditions. Values found 
in other populations under other circumstances will be more or less 
the same according to whether the structure of the population and the 
environmental conditions are more or less alike. 

Very many determinations of heritabilities have been made for a 
variety of characters, chiefly in farm animals. Some representative 
examples are given in Table 10.1. Different determinations of the
heritability of the same character show a considerable range of varia- 
tion. This is partly due to statistical sampling, but some of the 
variation reflects real differences between the populations or the 
conditions under which they are studied. For these reasons, and be- 
cause estimations of heritabilities can seldom be very precise, the 
figures quoted in the table are rounded to the nearest 5 per cent. 
From Table 10.1 it can be seen that the magnitude of the heritability
shows some connexion with the nature of the character. On the 
whole, the characters with the lowest heritabilities are those most 
closely connected with reproductive fitness, while the characters 
with the highest heritabilities are those that might be judged on bio- 
logical grounds to be the least important as determinants of natural 
fitness. This is well seen in the gradation of the four characters of 

Table 10.1

Approximate values of the heritability of various characters
in domestic and laboratory animals.

Cattle
  Amount of white spotting in Friesians (Briquet and Lush, 1947)       0.95
  Butterfat % (Johansson, 1950)                                        0.6
  Milk-yield (Johansson, 1950)                                         0.3
  Conception rate (in 1st service) (A. Robertson, 1957a)               0.01

Pigs
  Thickness of back fat (Fredeen and Jonsson, 1957)                    0.55
  Body length (Fredeen and Jonsson, 1957)                              0.5
  Weight at 180 days (Whatley, 1942)                                   0.3
  Litter size (Lush and Molln, 1942)                                   0.15

Sheep (Australian Merino)
  Length of wool (Morley, 1955)                                        0.55
  Weight of fleece (Morley, 1955)                                      0.4
  Body weight (Morley, 1955)                                           0.35

Poultry (White Leghorn)
  Egg weight (Lerner and Cruden, 1951)                                 0.6
  Age at laying of first egg (King and Henderson, 1954b)               0.5
  Egg-production (annual, of surviving birds)
    (King and Henderson, 1954b)                                        0.3
  Egg-production (annual, of all birds)
    (King and Henderson, 1954b)                                        0.2
  Body weight (Lerner and Cruden, 1951)                                0.2
  Viability (Robertson and Lerner, 1949)                               0.1

Rats
  Expression of hooded gene (amount of white)
    (from data of Castle and Wright, 1916)                             0.4
  Ovary response to gonadotrophic hormone (Chapman, 1946)              0.35
  Age at puberty in females (Warren and Bogart, 1952)                  0.15

Mice
  Tail length at 6 weeks (Falconer, 1954b)                             0.6
  Body weight at 6 weeks (Falconer, 1953)                              0.35
  Litter size (1st litters) (Falconer, 1955)                           0.15

Drosophila melanogaster
  Abdominal bristle number (Clayton, Morris, and Robertson, 1957)      0.5
  Body size (thorax length) (F. W. Robertson, 1957b)                   0.4
  Ovary size (F. W. Robertson, 1957a)                                  0.3
  Egg production (F. W. Robertson, 1957b)                              0.2

Estimation of Heritability 

Let us first compare the merits of the different sorts of relatives 
for estimating either the additive genetic variance from the covariance, 
or the heritability from the regression or correlation coefficient. 
Table 10.2 shows again the composition of the phenotypic covariances,
and shows also the regression or correlation expressed in terms of the
heritability.

Table 10.2

Relatives                   Covariance                   Regression (b) or correlation (t)
Offspring and one parent    (1/2)V_A                     b = (1/2)h^2
Offspring and mid-parent    (1/2)V_A                     b = h^2
Half sibs                   (1/4)V_A                     t = (1/4)h^2
Full sibs                   (1/2)V_A + (1/4)V_D + V_Ec   t >= (1/2)h^2

The choice depends on the circumstances. In addition
to the practical matter of which sorts of relatives are in fact
obtainable, there are two points to consider: sampling error and
environmental sources of covariance. The statistical precision of the estimate
depends on the experimental design and also on the magnitude of the 
heritability being estimated, and so no hard and fast rule can be 
made. The matter of statistical precision will be further considered 
in a later section of this chapter. The question of environmental 
sources of covariance is generally more important than the statistical 
precision of the estimate, because it may introduce a bias which 
cannot be overcome by statistical procedure. From considerations of 
the biology of the character and the experimental design we have to 
decide which covariance is least likely to be augmented by an en- 
vironmental component, a matter already discussed in the last 
chapter. Generally speaking the half-sib correlation and the regres- 
sion of offspring on father are the most reliable from this point of 
view. The regression of offspring on mother is sometimes liable to 
give too high an estimate on account of maternal effects, as it would, 
for example, with body size in most mammals. The full-sib corre- 
lation, which is the only relationship for which an environmental 
component of covariance is shown in the table, is the least reliable of 
all. The component due to common environment is often present in 
large amount and is difficult to overcome by experimental design; 
and the full-sib covariance is further augmented by the dominance 
variance. The full-sib correlation can therefore seldom do more than 
set an upper limit to the heritability. 
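
The conversions in Table 10.2 amount to multiplying the observed coefficient by a constant, and the standard error is multiplied by the same constant. A minimal sketch follows; the dictionary, the function name and the example figures are hypothetical and serve only to show the arithmetic.

```python
import math

# Multiplier that turns a regression (b) or correlation (t) into an estimate of h^2.
MULTIPLIER = {
    "offspring on one parent": 2.0,   # b = h^2 / 2
    "offspring on mid-parent": 1.0,   # b = h^2
    "half sibs": 4.0,                 # t = h^2 / 4
    "full sibs": 2.0,                 # t >= h^2 / 2, so 2t is only an upper limit
}

def heritability_estimate(relationship, coefficient, standard_error=None):
    """Return (estimate of h^2, its standard error or None)."""
    k = MULTIPLIER[relationship]
    scaled_error = None if standard_error is None else k * standard_error
    return k * coefficient, scaled_error

# Hypothetical half-sib correlation of 0.12 with standard error 0.03.
estimate, error = heritability_estimate("half sibs", 0.12, 0.03)
print(estimate, error)
```

The fourfold multiplier for half sibs shows why a half-sib correlation must be estimated quite precisely before it gives a useful estimate of the heritability.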

Example 10.1. The heritability of abdominal bristle number in
Drosophila melanogaster has been determined by three different methods,
applied to the same population (Clayton, Morris, and Robertson, 1957),
with the following results:

Method of estimation            Heritability
Offspring-parent regression     0.51 ± 0.07
Half-sib correlation            0.48 ± 0.11
Full-sib correlation            0.53 ± 0.07
Combined estimate               0.52

The estimates obtained by the three methods are in very satisfactory
agreement. In this case, the character, bristle number, is free of
complications arising from maternal effects and common environment.

Let us now consider briefly some technical matters concerning the 
translation of observational data into estimates of heritability. We 
shall deal first with the estimation of the heritability; and we shall 
later discuss the standard error of the estimate, and the design that 
gives an experiment its greatest precision. 

Selection of parents and assortative mating. In the treatment 
of resemblance between relatives we have supposed the parents to be 
a random sample of their generation and to be mated at random. Quite 
often, however, one or other of these conditions does not hold, and 
the choice of which sort of relative to use in the estimation of herita- 
bility is then somewhat restricted. In experimental and domesticated 
populations the parents are often a selected group and consequently 
the phenotypic variance among the parents is less than that of the 
population as a whole and less than that of the offspring. The
regression of offspring on parents, however, is not affected by the
selection of parents because the covariance is reduced to the same
extent as the variance of the parents, so that the slope of the
regression line is unaltered. Thus the regression of offspring on one
parent is a valid measure of $\tfrac{1}{2}h^2$, and that of offspring on
mid-parent is a valid measure of $h^2$. But the covariance is not a valid
measure of $V_A$, nor the variance of parents of $V_P$; moreover, the
correlation and regression coefficients are not equal.
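
The point about selected parents can be illustrated by simulation. The sketch below makes several assumptions that are not in the text: all deviations are normally distributed, one offspring is measured per parent, the part of the offspring's value not due to half the parent's breeding value is lumped into a single term of variance (3/4)V_A + V_E, and selection is by truncation at an arbitrary point. Names, seed and numerical values are illustrative.

```python
import random
from statistics import mean

def cov(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def simulate_parent_offspring(n, v_a, v_e, seed=3):
    """Return (parent phenotype, offspring phenotype) pairs, one offspring per parent."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        a = rng.gauss(0.0, v_a ** 0.5)                     # parent's breeding value
        parent = a + rng.gauss(0.0, v_e ** 0.5)            # parent's phenotype
        rest = rng.gauss(0.0, (0.75 * v_a + v_e) ** 0.5)   # other parent, segregation, environment
        pairs.append((parent, 0.5 * a + rest))
    return pairs

def summary(pairs):
    """Return (variance of parents, covariance, regression of offspring on parent)."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    var_parent = cov(xs, xs)
    covariance = cov(xs, ys)
    return var_parent, covariance, covariance / var_parent

everyone = simulate_parent_offspring(100_000, v_a=50.0, v_e=50.0)   # h^2 = 0.5
selected = [pair for pair in everyone if pair[0] > 5.0]             # truncation selection
all_stats = summary(everyone)
selected_stats = summary(selected)
for label, stats in (("all parents", all_stats), ("selected parents", selected_stats)):
    print(f"{label}: var={stats[0]:.1f} cov={stats[1]:.1f} b={stats[2]:.3f}")
```

Variance and covariance both shrink in the selected group, but their ratio, the regression, stays near half the heritability.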

Sometimes the mating of parents is not made at random but 
according to their phenotypic resemblance, a system known as 
assortative mating. There is then a correlation between the pheno- 
typic values of the mated pairs. The consequences of assortative 
mating are described by Reeve (1955b) but they are too complicated
to explain in detail here. They can be deduced by modification of 
Table 9.2, the frequencies of the different types of mating being 
altered according to the correlation between the mated pairs. The 
variance of mid-parent values is increased and consequently also the
covariance of full sibs. The regression of offspring on mid-parent,
however, is very little affected and it can be taken as a valid measure
of $h^2$. The increased variance of mid-parent values under assortative
mating has the practical advantage of reducing the sampling error of 
the regression coefficient and thus of the estimate of heritability. 

Offspring-parent relationship. The estimation of heritability 
from the regression of offspring on parent is comparatively straight- 
forward and needs little comment apart from the points mentioned in 
the preceding paragraphs. The data are obtained in the form of 
measurements of parents and the mean values of their offspring. The 
covariance is then computed in the usual way from the cross-products 
of the paired values. The mean values of offspring may be weighted 
according to the number of offspring in each family, if the numbers 
differ. The appropriate weighting is discussed by Kempthorne and 
Tandon (1953) and by Reeve (1955c). 
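
The computation from cross-products can be sketched as follows. The example is unweighted, so it assumes families of equal size; the function name and the five pairs of values are hypothetical and are given only to show the arithmetic.

```python
def offspring_parent_regression(parent_values, offspring_means):
    """Regression of offspring family means on parental values (unweighted)."""
    n = len(parent_values)
    mean_parent = sum(parent_values) / n
    mean_offspring = sum(offspring_means) / n
    # Sum of cross-products and sum of squares, both as deviations from the means.
    sum_products = sum(
        (x - mean_parent) * (y - mean_offspring)
        for x, y in zip(parent_values, offspring_means)
    )
    sum_squares = sum((x - mean_parent) ** 2 for x in parent_values)
    return sum_products / sum_squares

# Hypothetical mid-parent values and the means of their progeny.
mid_parents = [10.0, 12.0, 14.0, 16.0, 18.0]
family_means = [12.0, 13.5, 13.5, 15.0, 16.0]
b = offspring_parent_regression(mid_parents, family_means)
print(b)
```

With mid-parent values on the horizontal axis the slope is itself the estimate of the heritability; had the values been those of single parents, the slope would be doubled.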

Fig. 10.1. Regression of offspring on mid-parent for wing-length
in Drosophila, as explained in Example 10.2. Mid-parent values are 
shown along the horizontal axis, and mean value of offspring along 
the vertical axis. (Drawn from data kindly supplied by Dr E. C. 
R. Reeve.) 

Example 10.2. Fig. 10.1 illustrates the regression of offspring on
mid-parent values for wing length in Drosophila melanogaster (Reeve and
Robertson, 1953). There are 37 pairs of parents and a mean of 273
offspring were measured from each pair of parents. The parents were
mated assortatively, with the result that the variance of mid-parent values
is greater than it would be if mating had been at random. Each point on
the graph represents the mean value of one pair of parents (measured along
the horizontal axis), and the mean value of their offspring (measured along
the vertical axis). The axes are marked at intervals of 1/100 mm., and they
intersect at the mean value of all parents and all offspring. The sloping
line is the linear regression of offspring on mid-parent. The slope of this
line estimates the heritability, and has the value (± standard error):

$$h^2 = b_{O\bar{P}} = 0.577 \pm 0.07$$

A complication in the use of the regression of offspring on mid- 
parent arises if the variance is not equal in the two sexes. We noted 
in the previous chapter that the genetic covariance of offspring and 
mid-parent is equal to half the additive variance on condition that the 
sexes are equal in variance. If this is not so, the regression on mid- 
parent cannot, strictly speaking, be used, and the heritability must 
be estimated separately for each sex from the regression of daughters 
on mothers and of sons on fathers. If the heritabilities are found to 
be equal in the two sexes, then a joint estimate can be made from the 
regression on mid-parent, by taking the mean value of the offspring 
as the unweighted mean of males and females. 

Sib analysis. The estimation of heritability from half sibs is 
more complicated than appears at first sight and needs more detailed 
comment. A common form in which data are obtained with animals 
is the following. A number of males (sires) are each mated to several 
females (dams), and a number of offspring from each female are 
measured to provide the data. The individuals measured thus form a 
population of half-sib and full-sib families. An analysis of variance 
is then made by which the phenotypic variance is divided into
observational components attributable to differences between the
progeny of different males (the between-sire component, $\sigma^2_S$); to
differences between the progeny of females mated to the same male
(between-dam, within-sires, component, $\sigma^2_D$); and to differences
between individual offspring of the same female (within-progenies
component, $\sigma^2_W$). The form of the analysis is shown in Table 10.3.
There are supposed to be s sires, each mated to d dams, which
produce k offspring each. The values of the mean squares are
denoted by $MS_S$, $MS_D$, and $MS_W$. The mean square within progenies
is itself the estimate of the within-progeny variance component,
$\sigma^2_W$; but the other mean squares are not the variance components.
The compositions of the mean squares in terms of the observational
components of variance are shown in the right-hand column of the
table, consideration of which will show how the variance components
are to be estimated. The between-dam mean square, for example, is
made up of the within-progeny component together with k times the
between-dam component; so the between-dam component is
estimated as $\sigma^2_D = (1/k)(MS_D - MS_W)$, i.e. we deduct the mean square for
progenies from the mean square for dams and divide by the number
of offspring per dam. Similarly the between-sire component is
estimated as $\sigma^2_S = (1/dk)(MS_S - MS_D)$, where dk is the number of
offspring per sire. If there are unequal numbers of offspring from the
dams, or of dams in the sire groups, the exact solution, which is
described by King and Henderson (1954a), Williams (1954), and
Snedecor (1956, section 10.17) becomes too complicated for
description here. We can, however, use the mean values of d and k with
little error, provided the inequality of numbers is not very great.

Table 10.3
Form of Analysis of Half-Sib and Full-Sib Families

Source                         d.f.         Mean Square    Composition of Mean Square

Between sires                  s - 1        $MS_S$         $= \sigma^2_W + k\sigma^2_D + dk\sigma^2_S$
Between dams (within sires)    s(d - 1)     $MS_D$         $= \sigma^2_W + k\sigma^2_D$
Within progenies               sd(k - 1)    $MS_W$         $= \sigma^2_W$

s = number of sires
d = number of dams per sire
k = number of offspring per dam
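The estimation of the observational components from the mean squares can be sketched in Python as follows, assuming a balanced design with d dams per sire and k offspring per dam. The function name is hypothetical; the mean squares in the usage lines are those that appear in the component formulae of Example 10.3.

```python
def variance_components(ms_sires, ms_dams, ms_within,
                        dams_per_sire, offspring_per_dam):
    """Observational components (sigma2_S, sigma2_D, sigma2_W) for a
    balanced sib analysis, following the compositions in Table 10.3."""
    sigma2_w = ms_within
    # MS_D = sigma2_W + k * sigma2_D
    sigma2_d = (ms_dams - ms_within) / offspring_per_dam
    # MS_S = sigma2_W + k * sigma2_D + d * k * sigma2_S
    sigma2_s = (ms_sires - ms_dams) / (dams_per_sire * offspring_per_dam)
    return sigma2_s, sigma2_d, sigma2_w


# Mean squares of Example 10.3 (2 dams per sire, 2 male offspring per dam).
s2_s, s2_d, s2_w = variance_components(6.03, 3.81, 2.87, 2, 2)
print(round(s2_s, 3), round(s2_d, 2), round(s2_w, 2))
```

A negative estimate can arise by chance when a true component is small; the sketch returns it unchanged rather than truncating it at zero.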

The next step is to deduce the connexions between the observa- 
tional components that have been estimated from the data and the 
causal components, in particular the additive genetic variance, the 
estimation of which is the main purpose of the analysis. Though all 
the information needed has already been given, the interpretation of 
the observational components, which is given in Table 10.4, is not 
immediately apparent without explanation. The first point to note 
is that the estimate of the phenotypic variance is given by the sum
($\sigma^2_T$) of the three observational components: $V_P = \sigma^2_T = \sigma^2_S + \sigma^2_D + \sigma^2_W$.
This is not necessarily equal to the observed variance as estimated
from the total sum of squares, though the two seldom differ by much.
Now consider the interpretation of the between-sire component,
$\sigma^2_S$. This is the variance between the means of half-sib families and
it therefore estimates the phenotypic covariance of half sibs, $\mathrm{cov}_{(HS)}$,
which is $\frac{1}{4}V_A$. Thus $\sigma^2_S = \frac{1}{4}V_A$. Next consider the within-progeny
component, $\sigma^2_W$. Since any between-group variance component is
equal to the covariance of the members of the groups, it follows that a
within-group component is equal to the total variance minus the
covariance of members of the groups. The progenies of the dams are
full-sib families and so the within-progeny variance estimates
$V_P - \mathrm{cov}_{(FS)}$. This leads to the interpretation $\sigma^2_W = \frac{1}{2}V_A + \frac{3}{4}V_D + V_{Ew}$.
Finally, there remains the between-dam component, and what it
estimates can be found by subtraction as follows:

$\sigma^2_D = \sigma^2_T - \sigma^2_S - \sigma^2_W = \mathrm{cov}_{(FS)} - \mathrm{cov}_{(HS)} = \frac{1}{4}V_A + \frac{1}{4}V_D + V_{Ec}$

Consideration of the between-sire and between-dam components will
show that their sum gives an estimate of the full-sib covariance,
$\mathrm{cov}_{(FS)}$; but this provides no new information for estimating the causal
components. These conclusions about the connexion between
observational and causal components of variance are summarised in
Table 10.4. The contributions of the interaction variance to the
observational components are given by Kempthorne (1955a), and
can be deduced from the contributions to the covariances given in
Table 9.3.

Table 10.4

Interpretation of the observational components of variance
in a sib analysis

Observational component                                   Covariance and causal components

Sires:         $\sigma^2_S$                               $= \mathrm{cov}_{(HS)} = \frac{1}{4}V_A$
Dams:          $\sigma^2_D$                               $= \mathrm{cov}_{(FS)} - \mathrm{cov}_{(HS)} = \frac{1}{4}V_A + \frac{1}{4}V_D + V_{Ec}$
Progenies:     $\sigma^2_W$                               $= V_P - \mathrm{cov}_{(FS)} = \frac{1}{2}V_A + \frac{3}{4}V_D + V_{Ew}$
Total:         $\sigma^2_T = \sigma^2_S + \sigma^2_D + \sigma^2_W$    $= V_P = V_A + V_D + V_{Ec} + V_{Ew}$
Sires + Dams:  $\sigma^2_S + \sigma^2_D$                  $= \mathrm{cov}_{(FS)} = \frac{1}{2}V_A + \frac{1}{4}V_D + V_{Ec}$
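The interpretation in Table 10.4 leads directly to the sib correlations and to three estimates of heritability. The following Python sketch shows the arithmetic; the function and key names are hypothetical. The dam-component and sire-plus-dam estimates are valid only on the assumption that dominance variance and common-environment variance are negligible. The usage lines take the components of Example 10.3 as input.

```python
def sib_analysis_estimates(sigma2_s, sigma2_d, sigma2_w):
    """Sib correlations and heritability estimates from the three
    observational components of a sib analysis (Table 10.4)."""
    total = sigma2_s + sigma2_d + sigma2_w  # estimate of V_P
    return {
        "t_half_sib": sigma2_s / total,
        "t_full_sib": (sigma2_s + sigma2_d) / total,
        # sigma2_S estimates (1/4) V_A
        "h2_sire": 4 * sigma2_s / total,
        # sigma2_D estimates (1/4) V_A only if V_D and V_Ec are negligible
        "h2_dam": 4 * sigma2_d / total,
        # sigma2_S + sigma2_D estimates (1/2) V_A under the same assumption
        "h2_sire_plus_dam": 2 * (sigma2_s + sigma2_d) / total,
    }


estimates = sib_analysis_estimates(0.555, 0.47, 2.87)
for name, value in estimates.items():
    print(f"{name}: {value:.3f}")
```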

Example 10.3. As an illustration of the estimation of heritability from 
a sib analysis we refer to the study of Danish Landrace pigs based on the 
records of the Danish Pig Progeny Testing Stations (Fredeen and Jonsson, 
1957). The data came from 468 sires each mated to 2 dams, the analysis 
being made on the records of 2 male and 2 female offspring from each 
dam. Only one such analysis is given here: that of body length in the male 
offspring. The analysis, shown in the table, was made within stations and
within years, and this accounts for the degrees of freedom being fewer than
would appear appropriate from the numbers stated above. The
interpretation of the analysis, shown at the foot of the table, has been slightly
simplified by the omission of some minor adjustments not relevant for us
at this stage. The between-dam component is not greater than the
between-sire component, so there cannot be much non-additive genetic variance or
variance due to common environment. The two estimates of the
heritability, from the sire and dam components respectively, can therefore be
regarded as equally reliable, and their combination based on the
resemblance between full sibs may be taken as the best estimate.

Sib analysis of body length in Danish Landrace pigs; data
for male offspring only (from Fredeen and Jonsson, 1957).

Source                         Mean Square    Component of variance

Between sires                  6.03           $\sigma^2_S = \frac{1}{4}(6.03 - 3.81) = 0.555$
Between dams, within sires     3.81           $\sigma^2_D = \frac{1}{2}(3.81 - 2.87) = 0.47$
Within progenies               2.87           $\sigma^2_W = 2.87$
                                              $\sigma^2_T = 3.895$

Interpretation of analysis

Sib correlations
Half sibs:  $t_{(HS)} = \sigma^2_S / \sigma^2_T = 0.142$
Full sibs:  $t_{(FS)} = (\sigma^2_S + \sigma^2_D) / \sigma^2_T = 0.263$

Estimates of heritability
Sire-component:  $h^2 = 4\sigma^2_S / \sigma^2_T = 0.57$
Dam-component:   $h^2 = 4\sigma^2_D / \sigma^2_T = 0.48$
Sire + Dam:      $h^2 = 2(\sigma^2_S + \sigma^2_D) / \sigma^2_T = 0.53$

Example 10.4. We have not yet had an example to illustrate the effect 
of common environment in augmenting the full-sib correlation. This is 
provided by body size in mice. The analysis given in table (i) refers to the
weight of female mice at 6 weeks of age (J. C. Bowman, unpublished).
There were 719 offspring from 74 sires and 192 dams, each with one
litter. These were spread over 4 generations and the analysis was made
within generations. The analysis is complicated by the inequality of the
number of offspring per dam and of dams per sire. We shall not attempt
to explain the adjustments made for these inequalities, but simply give
the compositions of the mean squares from which the components are
estimated. The dam component is much greater than the sire component,
indicating a substantial amount of variance due to common environment.
Therefore only the sire component can be used to estimate the heritability.
The estimate obtained is $h^2 = 4 \times 0.48 / 5.14 = 0.37$. Let us now use the analysis
to estimate the causal components according to the interpretation given
in Table 10.4, but with the assumption that non-additive genetic variance
is negligible in amount. Table (ii) gives the estimates and shows how they
are derived. The percentage contribution of each component to the total
variance is given in the right-hand column. It will be seen that the
variance due to common environment ($V_{Ec}$) amounts to 39 per cent of the
total, and is greater than the environmental variance within full-sib
families ($V_{Ew}$) which amounts to only 24 per cent of the total.

Table (i)

Source              Mean Square    Composition of M.S.                                 Component

Between sires       17.10          $\sigma^2_W + k'\sigma^2_D + dk'\sigma^2_S$        $\sigma^2_S = 0.48$
Between dams        10.79          $\sigma^2_W + k\sigma^2_D$                          $\sigma^2_D = 2.47$
Within progenies    2.19           $\sigma^2_W$                                        $\sigma^2_W = 2.19$

k = 3.48; k' = 4.16; d = 2.33                                                          $\sigma^2_T = 5.14$

Table (ii)

Component                                        Estimate    Per cent

$V_P = \sigma^2_T$                               5.14        100
$V_A = 4\sigma^2_S$                              1.92        37
$V_{Ec} = \sigma^2_D - \sigma^2_S$               1.99        39
$V_{Ew} = \sigma^2_W - 2\sigma^2_S$              1.23        24
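The partition in table (ii) can be reproduced with a few lines of Python. This is only a sketch of the arithmetic, under the assumption stated in the example that non-additive genetic variance is negligible; the function name is hypothetical.

```python
def causal_components_additive_only(sigma2_s, sigma2_d, sigma2_w):
    """Causal components from a sib analysis, assuming V_D and the
    interaction variance are negligible (interpretation of Table 10.4)."""
    v_p = sigma2_s + sigma2_d + sigma2_w
    v_a = 4 * sigma2_s            # sigma2_S = (1/4) V_A
    v_ec = sigma2_d - sigma2_s    # sigma2_D = (1/4) V_A + V_Ec
    v_ew = sigma2_w - 2 * sigma2_s  # sigma2_W = (1/2) V_A + V_Ew
    return {"V_P": v_p, "V_A": v_a, "V_Ec": v_ec, "V_Ew": v_ew}


# Observational components of Example 10.4, table (i).
parts = causal_components_additive_only(0.48, 2.47, 2.19)
for name, value in parts.items():
    print(f"{name} = {value:.2f} ({100 * value / parts['V_P']:.0f} per cent)")
```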

Intra-sire regression of offspring on dam. The heritability 
can be estimated from the offspring-parent relationship in a popula- 
tion with the structure described in the foregoing section, but a slight 
modification is necessary. Since each male is mated to several females, 
the regression of offspring on mid-parent is inappropriate; and, since 
there are usually rather few male parents, the simple regressions on 
one or other parent are both unsuitable. The heritability can, how- 
ever, be satisfactorily estimated from the average regression of off- 
spring on dams, calculated within sire groups. That is to say, the 
regression of offspring on dam is calculated separately for each set of 
dams mated to one sire, and the regressions from each set pooled in a 
weighted average. This method is commonly used for the estimation 
of heritabilities in farm animals. The intra-sire regression of
offspring on dam estimates half the heritability, as the following
consideration will show. The progeny of one sire has a mean deviation
from the population mean equal to half the breeding value of the sire,
provided the females he is mated to are a random sample from the 
population. The progeny of one dam deviates from the mean of the 
sire group by half the breeding value of the dam. Therefore the 
within-sire covariance of offspring and dam is equal to half the 
additive variance of the population as a whole; and the within-sire 
regression of offspring on dam is equal to half the heritability, just 
like the simple regression of offspring on one parent. The validity 
of the estimate is, of course, dependent on the absence of maternal 
effects contributing to the resemblance between daughters and dams. 
Inequality of the variance of males and females calls for an adjustment 
if the heritability is to be estimated from the intra-sire regression of 
male offspring on dams. The regression coefficient should then be 
multiplied by the ratio of the phenotypic standard deviation of females 
to that of males. 
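A minimal Python sketch of the intra-sire regression follows. It pools the within-sire sums of squares and cross-products, which is one common way of forming the weighted average of the separate regressions (each sire group is weighted by the sum of squares of its dams); this choice of weighting, the function names and the data are assumptions made for illustration. The second function applies the correction for unequal variance in the two sexes described above.

```python
from collections import defaultdict


def intra_sire_regression(records):
    """Pooled within-sire regression of offspring on dam.

    records: iterable of (sire_id, dam_value, offspring_mean) tuples.
    """
    by_sire = defaultdict(list)
    for sire, dam_value, offspring_mean in records:
        by_sire[sire].append((dam_value, offspring_mean))

    sum_sq_dams = 0.0
    sum_cross = 0.0
    for pairs in by_sire.values():
        # Deviations are taken from the means of the sire group.
        mean_dam = sum(x for x, _ in pairs) / len(pairs)
        mean_off = sum(y for _, y in pairs) / len(pairs)
        sum_sq_dams += sum((x - mean_dam) ** 2 for x, _ in pairs)
        sum_cross += sum((x - mean_dam) * (y - mean_off) for x, y in pairs)
    if sum_sq_dams == 0.0:
        raise ValueError("no within-sire variation among dams")
    return sum_cross / sum_sq_dams


def correct_for_sex_variance(b_son_dam, sd_females, sd_males):
    """Multiply the son-dam regression by the ratio of female to male
    phenotypic standard deviations."""
    return b_son_dam * sd_females / sd_males


# Hypothetical records: two sires, three dams each (not data from the text).
records = [("A", 10.0, 5.0), ("A", 12.0, 5.5), ("A", 14.0, 6.0),
           ("B", 11.0, 7.0), ("B", 13.0, 7.5), ("B", 15.0, 8.0)]
b_intra = intra_sire_regression(records)
print("regression:", b_intra, "heritability estimate:", 2 * b_intra)
```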

Example 10.5. The heritability of abdominal bristle-number in 
Drosophila melanogaster, estimated from the offspring-parent regression, 
was cited in Example 10.1. This was in fact a joint estimate based on 
intra-sire regressions of daughters on dams and of sons on dams, the latter 
being corrected for inequality of variance in the two sexes (Clayton, Morris, 
and Robertson, 1957). The separate regression coefficients, with the cor- 
rection for inequality of variances, and the estimates of the heritability 
are given in the table. 

                                                                      Estimate of

Standard deviation: females
Standard deviation: males
Standard deviation: female/male                1.17
Regression coefficient: daughter-dam
Regression coefficient: son-dam                0.206
Regression coefficient: son-dam corrected      0.206 × 1.17 = 0.241
Joint estimate, as given in Example 10.1,






The Precision of Estimates of Heritability 

It is of the greatest importance to know the precision of any esti- 
mate of heritability. When an estimate has been obtained one wants 
to be able to indicate its precision by the standard error. And when
an experiment aimed at estimating a heritability is being planned one
wants to choose the method and design the experiment so that the 
estimate will have the greatest possible precision within the limita- 
tions imposed by the scale of the experiment. The precision of an 
estimate depends on its sampling variance, the lower the sampling 
variance the greater the precision; and the standard error is the square 
root of the sampling variance. Estimates of heritability are derived 
from estimates of either a regression coefficient or an intra-class cor- 
relation coefficient, and the sampling variances of these are given in 
textbooks of statistics. We shall therefore present the necessary 
formulae without explanation of their derivation. The information on 
the design of experiments given here is derived from the paper by A.
Robertson (1959a) on this subject.

The problems of experimental design are, first, the choice of 
method and, second, the decision of how many individuals in each 
family are to be measured. Since the total number of individuals 
measured cannot be increased indefinitely, an increase of the number 
of individuals per family necessarily entails a reduction of the number 
of families. The problem is therefore to find the best compromise 
between large families and many families. In assessing the relative 
efficiencies of different methods and designs we have to compare 
experiments made on the same scale; that is to say, with the same 
total expenditure in labour or cost. We must therefore decide first 
what are the circumstances that limit the scale of the experiment. If 
the labour of measurement is the limiting factor, as for example in 
experiments with Drosophila, then the limitation is in the total 
number of individuals measured, including the parents if they are 
measured. If, on the other hand, breeding and rearing space is the 
limiting factor, as it generally is with larger animals, the limitation 
may be either in the number of families or in the total number of 
offspring that can be produced for measurement, and measurements 
of the parents may be included without additional cost. We cannot 
here take account of all the possible ways in which the scale of the 
experiment may be limited. Therefore for the sake of illustration we 
shall consider only a limitation of the total number of individuals 
measured. That is to say, we shall assume the total number of in- 
dividuals measured to be the same for all methods and all experi- 
mental designs. What we have to do, then, is to consider each method 
on this basis and see what design and which method will give an 
estimate of the heritability with the lowest sampling variance. 



Offspring-parent regression. Consider first estimates based on 
the regression of offspring on parents. Let X be the independent
variate, which may be either the value of a single parent or the
mid-parent value. Let Y be the dependent variate, which may be either a
single offspring of each parent or the mean of n offspring. Let $\sigma^2_X$
and $\sigma^2_Y$ be the variances of X and Y respectively; let b be the
regression of Y on X, and N the number of paired observations of X and Y,
which is equivalent to the number of families in the experiment.
Let T be the total number of individuals measured, which is fixed by
the scale of the experiment. The number of offspring measured is
nN, and the number of parents N or 2N according to whether the
regression is on one parent or on the mid-parent value. So, with one
parent measured, T = N(n + 1), and with both parents measured
T = N(n + 2). With these symbols, the variance of the estimate of the
regression coefficient is

$\sigma^2_b = \frac{1}{N-2}\left(\frac{\sigma^2_Y}{\sigma^2_X} - b^2\right)$ (10.5)

For use as a guide to design this formula is more convenient 
if put in a simplified and approximate form. The regression coeffi- 
cient is usually small enough that b 2 can be ignored; and we may sup- 
pose that N is fairly large, so that the variance of the estimate may be 
put, approximately, as 

$\sigma^2_b = \frac{1}{N}\,\frac{\sigma^2_Y}{\sigma^2_X}$ (approx.) (10.6)

When only one parent is measured the variance of parental values is
equal to the phenotypic variance, i.e. $\sigma^2_X = V_P$. When both parents
are measured (provided they were not mated assortatively) the
variance of mid-parent values is half the phenotypic variance, i.e.
$\sigma^2_X = \frac{1}{2}V_P$. The variance of the offspring values, $\sigma^2_Y$, is the variance of
the means of families of n individuals. This depends on the
phenotypic correlation, t, between members of families, in a manner that
will be explained in Chapter 13 (see Table 13.2), where it will be
shown that

$\sigma^2_Y = V_P\,\frac{1 + (n-1)t}{n}$


Therefore by substitution for $\sigma^2_X$ and $\sigma^2_Y$ in equation 10.6 the sampling
variance of the regression on one parent becomes

$\sigma^2_b = \frac{1 + (n-1)t}{nN}$ (approx.) (10.7)

and that of the regression on mid-parent is twice as great. Since the 
phenotypic correlation, t, depends on the heritability it will not 
generally be known at the time an experiment is being planned. 
Therefore the best design cannot be exactly determined in advance. 
We can, however, get an approximate idea of how many offspring of 
each parent should be measured. On the assumption already stated, 
that the total number of individuals measured including the parents 
is fixed, it can be shown that the sampling variance given in equation 
10.7 is minimal when $n = \sqrt{(1-t)/t}$ if one parent is measured and when
$n = \sqrt{2(1-t)/t}$ if both parents are measured. Consider, for example, a
character with a heritability of 20 per cent and no variance due to
common environment, so that the phenotypic correlation in full-sib
families is t = 0.1. Then the optimal family size works out to be
n = 3 when only one parent is measured and n = 4 when both parents
are measured. If we had taken a higher heritability the optimal family 
size would have been lower. Large families are advantageous only 
for the estimation of very low heritabilities. For example, full-sib 
families of about 10 or 14 would be optimal for estimating a herit- 
ability of 2 per cent. 
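The optimal family sizes quoted above follow from the two square-root expressions, as the short Python sketch below illustrates. The function name is hypothetical, and the result is left unrounded so that the rounding to whole animals is visible to the reader.

```python
import math


def optimal_family_size(t, both_parents=False):
    """Family size minimising the sampling variance of the offspring-parent
    regression (equation 10.7) when the total number measured is fixed.

    t is the phenotypic correlation between members of a family.
    """
    factor = 2.0 if both_parents else 1.0
    return math.sqrt(factor * (1.0 - t) / t)


for h2 in (0.20, 0.02):
    t = h2 / 2  # full-sib correlation with no common environment
    print(f"h2 = {h2}: one parent n = {optimal_family_size(t):.2f}, "
          f"both parents n = {optimal_family_size(t, True):.2f}")
```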

So far we have considered only the sampling variance of the 
regression coefficient, and how this can be reduced by the design of 
the experiment. Now let us consider the sampling variance of the 
estimate of heritability, so that we can compare methods, i.e. the use 
of one parent or of mid-parent values. A just comparison can only 
be made on the assumption of the optimal design for each method, 
and therefore we can only illustrate the comparison by reference to a 
particular case. We shall consider the particular case mentioned 
above where the phenotypic correlation is t = 0.1, which would be
found in full-sib families when the heritability is 20 per cent. The 
optimal family sizes are 3 or 4 as stated above. For the purpose of 
comparison we have to express the sampling variance of the regression 
coefficient given in equation 10.7 in terms of the total number of 
individuals measured, T, since this is assumed to be the same for all 
methods. We therefore substitute in equation 10.7 as follows. When 
one parent is measured N = T/(n + 1), and n = 3. When both parents are
measured N = T/(n + 2), and n = 4. Substitution in equation 10.7 then
yields $\sigma^2_b = 4.8/3T$ when one parent is measured, and $\sigma^2_b = 3.9/T$ when both
are measured. The regression on one parent must be doubled to give
the estimate of heritability, but the regression on mid-parent is itself 
the estimate. So the sampling variances of the estimates of herit- 
ability, in the special case under consideration, are: 

By regression on one parent: $\sigma^2_{h^2} = 4\sigma^2_b = 6.4/T$ (approx.)
By regression on mid-parent: $\sigma^2_{h^2} = \sigma^2_b = 3.9/T$ (approx.)

Thus the estimate based on mid-parent values has considerably less 
sampling variance. A regression on mid-parent values, in general, 
yields a more precise estimate of heritability for a given total number 
of individuals measured. 
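The two figures 6.4/T and 3.9/T can be checked with the following Python sketch, which applies the approximate equation 10.7 and the relation between N and T given above. The function name is hypothetical; the value returned is the sampling variance of the heritability estimate multiplied by T, so that it does not depend on the scale of the experiment.

```python
def var_h2_times_total(n, t, both_parents):
    """T times the approximate sampling variance of the heritability
    estimated from an offspring-parent regression (equation 10.7)."""
    parents_per_family = 2 if both_parents else 1
    # N * var(b): equation 10.7, doubled for the mid-parent regression
    # because the variance of mid-parent values is half V_P.
    var_b_times_n = (1 + (n - 1) * t) / n
    if both_parents:
        var_b_times_n *= 2
    # T = N * (n + number of parents measured per family)
    var_b_times_total = var_b_times_n * (n + parents_per_family)
    # h2 = 2b for one parent (variance multiplied by 4); h2 = b for mid-parent.
    multiplier = 1 if both_parents else 4
    return multiplier * var_b_times_total


print(var_h2_times_total(3, 0.1, both_parents=False))
print(var_h2_times_total(4, 0.1, both_parents=True))
```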

Sib analyses. Now let us consider estimates obtained from the 
intra-class correlation of full-sib or half-sib families. We shall at 
first suppose for simplicity that half-sib families are not subdivided 
into full-sib families; i.e. that only one offspring from each dam is 
measured in paternal half-sib families. In the case of full-sib families 
we shall assume that there is no variance due to common environ- 
ment so that the estimate of heritability is a valid one. Let N be the 
number of families, and n the number of individuals per family, so 
that the total number of individuals measured is T=nN. Let the 
intra-class correlation be t. The sampling variance of the intra-class 
correlation is then 

$\sigma^2_t = \frac{2[1 + (n-1)t]^2 (1-t)^2}{n(n-1)(N-1)}$ (10.8)




When the value of T=nN is limited by the size of the experiment it 
can be shown that the sampling variance of the intra-class correlation 
is minimal when n = 1/t, approximately. Therefore the optimal family
size depends on the heritability. In the case of full-sib families
$h^2 = 2t$, and in the case of half-sib families, $h^2 = 4t$. So the most
efficient design has the following family sizes:

With full-sib families: $n = 2/h^2$

With half-sib families: $n = 4/h^2$

Since prior knowledge of the heritability will be at the best only 
approximate, the optimal family size cannot be exactly determined 
before-hand. The loss of efficiency, however, is much greater if the
family size is below the optimum than if it is above. It is therefore
better to err on the side of having too large families. A. Robertson 
(1959a) shows that, in the absence of prior knowledge of the herita- 
bility, half-sib analyses should generally be designed with families of 
between 20 and 30. 

If the experiment has the most efficient design, with n = 1/t, then
the sampling variance of the intra-class correlation is approximately

$\sigma^2_t = \frac{8t}{T}$ (10.9)

Therefore under optimal design the sampling variances of the
estimates of heritability are as follows:

From full-sib families: $\sigma^2_{h^2} = 4\sigma^2_t = \frac{16h^2}{T}$ (approx.)

From half-sib families: $\sigma^2_{h^2} = 16\sigma^2_t = \frac{32h^2}{T}$ (approx.)

Thus, other things being equal, an estimate from full-sib families is 
twice as precise as one from half-sib families. 
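The statement that the sampling variance of the intra-class correlation is minimal near n = 1/t, and then approximately 8t/T, can be examined numerically. The Python sketch below evaluates the usual large-sample formula for the sampling variance of an intra-class correlation with T = nN held fixed; the function names and the values t = 0.05 and T = 2000 are assumptions chosen only for illustration, and the number of families N = T/n is treated as a continuous quantity.

```python
def var_intraclass_t(t, n, total):
    """Sampling variance of the intra-class correlation t for families of
    size n, with the total number measured fixed at total = n * N."""
    families = total / n  # N, not rounded to a whole number
    return (2 * (1 + (n - 1) * t) ** 2 * (1 - t) ** 2
            / (n * (n - 1) * (families - 1)))


def best_family_size(t, total):
    """Whole-number family size with the lowest sampling variance."""
    candidates = range(2, total // 2 + 1)
    return min(candidates, key=lambda n: var_intraclass_t(t, n, total))


t, total = 0.05, 2000
n_best = best_family_size(t, total)
print("best n:", n_best, "compared with 1/t =", 1 / t)
print("exact variance at n = 1/t:", var_intraclass_t(t, 20, total))
print("approximation 8t/T:       ", 8 * t / total)
```

The approximation 8t/T slightly overstates the variance in this instance, because it replaces the factors (1 - t) and (n - 1)/n by unity.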

At this point let us compare the precision of estimates from sib 
analyses with those from offspring-parent regressions, assuming 
optimal design in each case. Again we have to choose a specific case 
for illustration of the comparison. Let us for simplicity suppose as we 
did before that the heritability to be estimated is 20 per cent. And, 
though perhaps not very representative of situations likely to arise in 
practice, let us compare an estimate obtained from a half-sib analysis 
with one obtained from the regression of offspring on one parent 
when the offspring consist of full-sib families. The variance of the 
estimate of heritability from the half-sib analysis would then be 6.4/T
by substitution in the formula given above, and from the regression of
offspring on one parent it would also be 6.4/T, as we found previously.
In this case, therefore, these two methods would give equally precise 
estimates for a given total number of individuals measured. If we had 
considered a higher heritability, then the regression method would 
have had the lower sampling variance. The comparison we have made, 
though referring to a particular case, illustrates the general conclusion, 
which is that the regression method is preferable for estimating 
moderately high heritabilities and the sib correlation method is 
preferable for low heritabilities, the critical heritability being, very
roughly, about 20 per cent when the comparison is made on the basis
of an equal total number of individuals measured. 
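The comparison can be repeated for other heritabilities with the Python sketch below. It uses only the approximate formulae of this section: 32h²/T for the half-sib analysis under optimal design, and equation 10.7 with the optimal (unrounded) family size for the regression on one parent with full-sib offspring and no common environment. The function names are hypothetical, and both functions return the sampling variance multiplied by T.

```python
import math


def half_sib_variance_times_total(h2):
    """T times the sampling variance of h2 from an optimally designed
    half-sib analysis (approximate)."""
    return 32 * h2


def one_parent_variance_times_total(h2):
    """T times the sampling variance of h2 from the regression on one
    parent, with full-sib offspring and optimal family size (approximate)."""
    t = h2 / 2                      # full-sib correlation, no common environment
    n = math.sqrt((1 - t) / t)      # optimal family size, left unrounded
    var_b_times_total = (1 + (n - 1) * t) / n * (n + 1)
    return 4 * var_b_times_total    # h2 = 2b


for h2 in (0.1, 0.2, 0.4):
    print(h2,
          round(half_sib_variance_times_total(h2), 2),
          round(one_parent_variance_times_total(h2), 2))
```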

Finally let us consider briefly a sib analysis where the half-sib 
families are subdivided into full-sib families. The situation is then 
more complicated, and for details the reader should consult the papers
of Osborne and Paterson (1952) and A. Robertson (1959a). The
conclusions are as follows. In many cases the estimation of
heritability will be based only on the between-sire component, i.e. the
half-sib correlation. This will arise when common environment
renders the full-sib correlation unsuitable. The most efficient design
then has only one offspring per dam, and is exactly the same as the
half-sib analysis discussed above. If there is no common
environment and it is desired to estimate the correlations from sire and from
dam components with equal precision, then the optimal design has
3 or 4 dams per sire with the number of offspring per dam equal to
$2/h^2$. In the absence of prior knowledge of the heritability the analysis
should be planned with 3 or 4 dams per sire, and 10 offspring per dam.

Identical Twins 

Identical twins seem at first sight to provide, for man and cattle, a 
means of estimating the genotypic variance. They provide individuals 
of identical genotype, just as inbred lines, or crosses between lines, do 
for laboratory animals or for plants. The phenotypic variance within 
pairs of identical twins should, therefore, estimate the environmental 
variance and so allow the partition of the phenotypic variance into 
genotypic and environmental components to be made. (This would 
not estimate the heritability, but the use of identical twins seems 
nevertheless most appropriately discussed at this point.) Many 
studies of human twins have been made, and have shown the mem- 
bers of the pairs to be extremely alike in most characters, even when 
reared apart from childhood (see Stern, 1949, Ch. 23, for review and 
references). Studies of cattle twins, though on a much smaller scale, 
show the same thing (see Hancock, 1954; Brumby, 1958). Taken at 
their face value these studies seem to indicate a very high degree of 
genetic determination — up to 90 per cent or even more — for many 
characters. The use of identical twins in this way is, however, vitiated 
by the additional similarity due to common environment. Twins 
share a common environment from conception to birth, and over the
period during which they are reared together, so that the within-pair
variance contains only a part, and perhaps only a small part, of the 
total environmental variance. This difficulty may be partially over- 
come by the comparison of identical with fraternal twins. Fraternal 
twins are full sibs which share a common environment to approxi- 
mately the same extent as identical twins. Let us therefore consider 
how the causal components of variance contribute to the observa- 
tional components between pairs and within pairs for the two sorts of 
twins. The compositions of the observational components are given
in Table 10.5, the between-pair component being the phenotypic
covariance. The environmental components are shown as being the 
same for fraternal as for identical twins. This is not necessarily true, 
but one can proceed only on the assumption that it is. 

Table 10.5 

Composition of the components of variance between and 
within pairs of twins. 

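              Between pairs                                    Within pairs

Identicals    $V_A + V_D + V_{Ec}$                             $V_{Ew}$
Fraternals    $\frac{1}{2}V_A + \frac{1}{4}V_D + V_{Ec}$       $\frac{1}{2}V_A + \frac{3}{4}V_D + V_{Ew}$
Difference    $\frac{1}{2}V_A + \frac{3}{4}V_D$                $\frac{1}{2}V_A + \frac{3}{4}V_D$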

The contributions of the interaction variance, which for simplicity 
are omitted, can be added from Table 9.3 (p. 157). If the environmental
components are the same for the two sorts of twins, then the differ- 
ence between identicals and fraternals in either of the two components 
estimates half the additive variance together with three-quarters of 
the dominance variance (and more than three-quarters of the inter- 
action variance). To take the partitioning further it is necessary to 
have an estimate of the additive variance, reliably free from admixture 
with variance due to common environment. By subtraction of half 
the additive variance we may then obtain an estimate of three-quarters 
of the dominance variance together with more than three-quarters of 
the interaction variance. This would give at least an approximate idea 
of the amount of non-additive genetic variance. There is, however, a 
difficulty with cattle in comparisons between identical and fraternal 
twins, connected again with the environmental components of 
variance. Vascular anastomoses frequently occur in the placentae of 
both sorts of twins, so that the blood of the two twins is mixed. This 
will not make identicals any more alike, but it may make fraternals 
more alike than they would otherwise be. 




Some results of twin-studies are quoted in Table 10.6, in order to 
illustrate the degree of resemblance between identical and between 
fraternal twins in both man and cattle. The difference between the
correlation coefficients of identicals and fraternals, given in the
right-hand column, could be taken as an estimate of half the heritability if
there were no non-additive genetic variance and if there were no 
complications arising from a common circulation. But since non- 
additive variance cannot reasonably be assumed to be absent, the 
difference can only be regarded as setting an upper limit to half the 
heritability. The vascular anastomoses in cattle twins may, however, 
render the estimates of the heritability, or of its upper limit, too low. 
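The reasoning of this paragraph amounts to a single line of arithmetic, sketched here in Python. The function name and the two correlations are hypothetical, since they are not values from Table 10.6; the result is to be read as an upper limit to the heritability, on the assumption that the environmental components are the same for the two sorts of twins.

```python
def heritability_upper_limit_from_twins(r_identical, r_fraternal):
    """Twice the difference between the correlations of identical and
    fraternal twins. The difference estimates (1/2)V_A + (3/4)V_D as a
    proportion of V_P, so doubling it gives an upper limit to h2."""
    return 2 * (r_identical - r_fraternal)


# Hypothetical correlations, for illustration only.
limit = heritability_upper_limit_from_twins(0.8, 0.5)
print(f"upper limit to the heritability: {limit:.2f}")
```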

Table 10.6 

Resemblance between Twins 

Correlation coefficients 
Character Reference Identicals Fraternals Difference 


Birth weight 


Milk-yield, 1st lactation 
Butterfat-yield, 1st lactation 
Fat % in milk, 1st lactation 
Weight at 96 weeks 
Body length at 96 weeks 
















(1) Newman, Freeman, and Holzinger (1937). Based on 50 pairs of 
identicals and 50 pairs of fraternals, corrected for age differences. 

(2) Quoted from Robson (1955).

(3) Brumby and Hancock (1956). Based on 10 pairs of identicals and 11 
pairs of fraternals. 


I. The Response and its Prediction 

Up to this point in our treatment of metric characters we have been 
concerned with the description of the genetic properties of a popula- 
tion as it exists under random mating, with no influences tending to 
change its properties; now we have to consider the changes brought 
about by the action of breeder or experimenter. There are two ways, 
as we noted in Chapter 6, in which the action of the breeder can change 
the genetic properties of the population; the first by the choice of 
individuals to be used as parents, which constitutes selection, and the 
second by control of the way in which the parents are mated, which 
embraces inbreeding and cross breeding. We shall consider selection 
first, and in doing so we shall ignore the effects of inbreeding, even 
though we cannot realistically suppose that we are always dealing 
with a population large enough for its effects to be negligible. 

The basic effect of selection is to change the array of gene fre- 
quencies in the manner described in Chapter 2. The changes of gene 
frequency themselves, however, are now almost completely hidden 
from us because we cannot deal with the individual loci concerned 
with a metric character. We therefore have to describe the effects of 
selection in a different manner, in terms of the observable properties 
— means, variances and covariances — though without losing sight of 
the fact that the underlying cause of the changes we describe is the 
change of gene frequencies. Before we come to details let us consider 
the change of gene frequencies a little further in general terms. 

To describe the change of the genetic properties from one genera- 
tion to the next we have to compare successive generations at the same 
point in the life cycle of the individuals, and this point is fixed by the 
age at which the character under study is measured. Most often the 
character is measured at about the age of sexual maturity or on the 
young adult individuals. The selection of parents is made after the 
measurements, and the gene frequencies among these selected individuals
are different from what they were in the whole population
before selection. If there are no differences of fertility among the
selected individuals or of viability among their progeny, then the gene 
frequencies are the same in the offspring generation as in the selected 
parents. Thus artificial selection — that is, selection resulting from 
the action of the breeder in the choice of parents — produces its change 
of gene frequency by separating the adult individuals of the parent 
generation into two groups, the selected and the discarded, that differ 
in gene frequencies. Natural selection, operating through differences 
of fertility among the parent individuals or of viability among their 
progeny, may cause further changes of gene frequency between the 
parent individuals and the individuals on which measurements are 
made in the offspring generation. Thus there are three stages at 
which a change of gene frequency may result from selection: the first 
through artificial selection among the adults of the parent generation; 
the second through natural differences of fertility, also among the 
adults of the parent generation; and the third through natural differ- 
ences of viability among the individuals of the offspring generation. 
Though natural differences of fertility and viability are always present 
they are not necessarily always relevant, because they are not neces- 
sarily connected with the genes concerned with the metric character. 


Response to Selection 

The change produced by selection that chiefly interests us is the 
change of the population mean. This is the response to selection, 
which we shall symbolise by R; it is the difference of mean phenotypic 
value between the offspring of the selected parents and the whole of 
the parental generation before selection. The measure of the selec- 
tion applied is the average superiority of the selected parents, which 
is called the selection differential, and will be symbolised by S. It is 
the mean phenotypic value of the individuals selected as parents 
expressed as a deviation from the population mean, that is from the 
mean phenotypic value of all the individuals in the parental genera- 
tion before selection was made. To deduce the connexion between 
response and selection differential let us imagine two successive 
generations of a population mating at random, as represented diagrammatically
in Fig. 11.1. Each point represents a pair of parents
and their progeny, and is positioned according to the mid-parent
value measured along the horizontal axis and the mean value of the
progeny measured along the vertical axis. The origin represents the
population mean, which is assumed to be the same in both generations.
The sloping line is the regression line of offspring on mid-parent.
(A diagram of this sort, plotted from actual data, was given in Fig.
10.1.) Now let us regard a group of individuals in the parental
generation as having been selected, say those with the highest
values. These pairs of parents and their offspring are indicated by
solid dots in the figure.

Fig. 11.1. Diagrammatic representation of the mean values of
progeny plotted against the mid-parent values, to illustrate the
response to selection, as explained in the text.

The parents have been selected on the basis
of their own phenotypic values, without regard to the values of their
progeny or of any other relatives. (This chapter deals exclusively 
with selection made in this way: other methods will be described in 
Chapter 13.) Let S be the mean phenotypic value of these selected 
parents, expressed as a deviation from the population mean. And 
similarly let R be the mean deviation of their offspring from the 
population mean. Then S is the selection differential and R is the 
response. The point marked by the cross represents the mean value 
of the selected parents and of their progeny, and it lies on the regres- 
sion line. The regression coefficient of offspring on parents is thus 
equal to R/S. Therefore the connexion between response and selection 
differential is

    $R = b_{OP} S$    (11.1)

where $b_{OP}$ is the regression coefficient of offspring on mid-parent.

We saw in the last chapter that the regression of offspring on mid- 
parent is equal to the heritability, provided there is no non-genetic 
cause of resemblance between offspring and parents. To this we must 
add the further condition that there should be no natural selection: 
that is to say, that fertility and viability are not correlated with the 
phenotypic value of the character under study. Provided these 
conditions hold, therefore, the ratio of response to selection differential
is equal to the heritability, and the response is given by

    $R = h^2 S$    (11.2)

The connexion between the response and the selection differential,
expressed in equation 11.2, follows directly from the meaning of
the heritability. We noted in the last chapter (equation 10.2) that the
heritability is equivalent to the regression of an individual's breeding
value on its phenotypic value. The deviation of the progeny from
the population mean is, by definition, the breeding value of the
parents, and so the response is equivalent to the breeding value of the
parents. Thus it follows that the expected value of the progeny is
given by $R = h^2 S$.
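
As a small numerical sketch of the relation between selection differential and response, the Python below computes S from a set of phenotypic values and then the predicted response. The ten phenotypic values, the function names, and the choice of the two highest individuals as parents are illustrative assumptions, not data from the text; the heritability of 0.52 is the bristle-number estimate quoted in this chapter.

```python
from statistics import mean

def selection_differential(population, selected):
    """S: mean of the selected parents as a deviation from the population mean."""
    return mean(selected) - mean(population)

def predicted_response(h2, S):
    """R = h^2 * S; assumes no natural selection and no non-genetic
    cause of resemblance between offspring and parents."""
    return h2 * S

# Hypothetical phenotypic values for ten measured individuals.
population = [10, 12, 14, 16, 18, 20, 22, 24, 26, 28]
# Hypothetical choice: the two highest individuals become parents.
selected = sorted(population)[-2:]

S = selection_differential(population, selected)
R = predicted_response(0.52, S)
print(f"S = {S:.2f}, predicted R = {R:.2f}")
```

With these made-up values the selected parents exceed the population mean by 8 units, and a little over half of that superiority is expected to appear in their progeny.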

There is one point at which the situation envisaged in deducing 
the equations of response does not coincide with what is actually 
done in selection. We supposed the individuals of the parent genera- 
tion to have mated at random and the selection to have been applied 
subsequently. In practice, however, the selection is usually made 
before mating, on the basis of the individuals' values and not the 
mid-parent values. The effect of this is that the individuals, when 
regarded as part of the whole parental population, have been mated 
assortatively. Assortative mating, however, has very little effect on 
the offspring-parent regression, as we noted in the last chapter, and 
this feature of selection procedure can therefore be disregarded. 

Prediction of response. The chief use of these equations of 
response is for predicting the response to selection. Let us consider a 
little further the nature of the prediction that can be made. First, it 
is clear that equation 11.1 is not a prediction but simply a description, 
because the regression of offspring on parent cannot be measured 
until the offspring generation has been reared. We could, however, 
measure the regression, $b_{OP}$, in a previous generation, and then use
the equation $R = b_{OP} S$ to predict the response to selection. There is
no genetics involved in this; it is simply an extrapolation of direct
observation, and the only conditions on which it depends are the
absence of environmental change and the absence of genetic change
between the generations from which the regression was estimated and
the generation to which selection is applied. The equation $R = h^2 S$,
however, provides a means of prediction based on observations made
only on the individuals of the parent generation before selection. Its
validity rests on obtaining a reliable estimate of $h^2$ from the resemblance
between relatives, such as half sibs; and on the truth of the
identity $b_{OP} = h^2$.

Example 11.1. The selection for abdominal bristle number in Drosophila
melanogaster, by Clayton, Morris, and Robertson (1957), will provide
an illustration of the prediction of the response, and will serve also to
indicate the extent of the agreement between observation and prediction.
(The data for this example were kindly supplied by Dr G. A. Clayton.)
The heritability of bristle number was first estimated from the base
population before selection, and the value found was 0.52, as stated in
Example 10.1. Five samples of 100 males and 100 females were taken from
the base population, and selection for high and for low bristle number was
made in each of the five samples, the 20 most extreme individuals of each
sex being selected as parents. The mean deviations of these selected individuals
from the mean of the sample out of which they were selected are
given in the table in the columns headed S, the negative signs under downward
selection being omitted. These are the selection differentials. The
expected responses are obtained by multiplying the selection differentials
by the heritability, according to equation 11.2. The observed responses
are the differences between the progeny means and the sample means out
of which the parents were selected. The expected and observed responses
are also given in the table, negative signs being again omitted.

                 Downward selection
Line             Exp.      Obs.
1                2.41      2.44
2                2.38      2.29
3                2.27      0.67
4                2.91      1.13
5                2.14      2.68
Mean             2.42      1.84

[Columns S (both directions) and Exp., Obs. for upward selection: values
not recoverable, apart from one entry, 4.81.]

Comparison of the observed with the expected responses shows that on the whole there
is fairly good agreement, though in some lines (particularly lines 3 and 4
selected downward) there are quite serious discrepancies. These discrepancies,
which are typical of selection experiments, illustrate the fact that
a single generation of selection in only one line cannot be relied on to
follow the prediction at all closely.
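
The comparison made in Example 11.1 can be sketched in a few lines of Python. The figures are the expected and observed responses quoted in the example's table for the lines selected downward; treating the rows as lines 1 to 5 in order is an assumption based on the text's remark about lines 3 and 4, and the variable names are illustrative.

```python
from statistics import mean

# Expected and observed responses (bristles) for the five lines selected
# downward, as quoted in Example 11.1 (signs omitted, as in the text).
# Assumption: the rows are lines 1 to 5 in order.
expected = [2.41, 2.38, 2.27, 2.91, 2.14]
observed = [2.44, 2.29, 0.67, 1.13, 2.68]

for line, (e, o) in enumerate(zip(expected, observed), start=1):
    # A ratio near 1 means the line followed the prediction closely.
    print(f"line {line}: expected {e:.2f}, observed {o:.2f}, ratio {o / e:.2f}")

mean_expected = mean(expected)
mean_observed = mean(observed)
print(f"mean: expected {mean_expected:.2f}, observed {mean_observed:.2f}")
```

The per-line ratios make the point of the example visible: individual lines scatter widely about the prediction, while the mean over the five replicates is steadier.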

The prediction of response is valid, in principle, for only one 
generation of selection. The response depends on the heritability of 
the character in the generation from which the parents are selected. 
The basic effect of the selection is to change the gene frequencies, so 
the genetic properties of the offspring generation, in particular the 
heritability, are not the same as in the parent generation. Since the 
changes of gene frequency are unknown we cannot strictly speaking 
predict the response to a second generation of selection without re- 
determining the heritability. Experiments have shown, however, 
that the response is usually maintained with little change over several 
generations — up to five, ten, or even more. This will be seen in the 
graphs of responses to selection given later in this chapter and in the 
next. In practice, therefore, the prediction may be expected to hold 
good over several generations. The effects of selection over longer 
periods, and also its effects on properties other than the mean, will be 
discussed in a later section. 

The selection differential. We have seen that the change of the 
population mean brought about by selection — i.e. the response — 
depends on the heritability of the character and on the amount of 
selection applied as measured by the selection differential. The 
selection differential will not be known, however, until the selection 
among the parental generation has actually been made. So the equa- 
tions of response in the form given above are only of limited useful- 
ness for predicting the response. To be able to predict further ahead 
we need to know what determines the magnitude of the selection 
differential. Consideration of the factors that influence the selection 
differential will also enable us to see more clearly the means by which 
the breeder may improve the response to selection. 

The magnitude of the selection differential depends on two fac- 
tors: the proportion of the population included among the selected 
group, and the phenotypic standard deviation of the character. The 
dependence of the selection differential on these two factors is illus- 
trated diagrammatically in Fig. 11.2. The graphs show the distribu- 
tion of phenotypic values, which is assumed to be normal. The 
individuals with the highest values are supposed to be selected, so 
that the distribution is sharply divided at a point of truncation, all 
individuals above this value being selected and all below rejected.
The arrow in each figure marks the mean value of the selected group,
and S is the selection differential. In graph (a) half the population is
selected, and the selection differential is rather small: in graph (b)
only 20 per cent of the population is selected, and the selection differential
is much larger. In graph (c) 20 per cent is again selected, but
the character represented is less variable and the selection differential
is consequently smaller. The standard deviation in (c) is half as great
as in (b) and the selection differential is also half as great.

Fig. 11.2. Diagrams to show how the selection differential, S,
depends on the proportion of the population selected, and on the
variability of the character. All the individuals in the stippled
areas, beyond the points of truncation, are selected. The axes are
marked in hypothetical units of measurement.

(a) 50% selected; standard deviation 2 units: S = 1.6 units

(b) 20% selected; standard deviation 2 units: S = 2.8 units

(c) 20% selected; standard deviation 1 unit: S = 1.4 units

The standard deviation, which measures the variability, is a 
property of the character and the population, and it sets the units in 
which the response is expressed — i.e. so many pounds, millimetres, 
bristles, etc. The response to selection may be generalised if both 
response and selection differential are expressed in terms of the 
phenotypic standard deviation, $\sigma_P$. Then $R/\sigma_P$ is a generalised measure
of the response, by means of which we can compare different
characters and different populations; and $S/\sigma_P$ is a generalised measure
of the selection differential, by means of which we can compare
different methods or procedures for carrying out the selection. The
"standardised" selection differential, $S/\sigma_P$, will be called the intensity
of selection, symbolised by i. The equation of response (11.2) then
becomes

    $\frac{R}{\sigma_P} = \frac{S}{\sigma_P} h^2$

or

    $R = i \sigma_P h^2$    (11.3)

By noting that $h = \sigma_A / \sigma_P$, where $\sigma_A$ is the standard deviation of breeding
values (square root of the additive genetic variance), we may write
this equation in the form

    $R = i h \sigma_A$    (11.4)

which is sometimes used in comparisons of different methods of
selection.

Fig. 11.3. Intensity of selection in relation to proportion selected.
The intensity of selection is the mean deviation of the selected
individuals, in units of phenotypic standard deviations. The upper
graph refers to selection out of a large total number of individuals
measured: the lower two graphs refer to selection out of totals of 20
and 10 individuals respectively.

The intensity of selection, i, depends only on the proportion of
the population included in the selected group, and, provided the
distribution of phenotypic values is normal, it can be determined
from tables of the properties of the normal distribution. If p is the
proportion selected (i.e. the proportion of the population falling
beyond the point of truncation) and z is the height of the ordinate at
the point of truncation, then it follows from the mathematical
properties of the normal distribution that

    $\frac{S}{\sigma_P} = i = \frac{z}{p}$    (11.5)

Thus, given only the proportion selected, p, we can find out by how 
many standard deviations the mean of the selected individuals will 
exceed the mean of the population before selection: that is to say, the 
intensity of selection, i. The graphs in Fig. 11.3 show the relationship
between i and p; the value of i for any given value of p can be
read from the graphs with sufficient accuracy for most purposes. The
relationship between i and p given in equation 11.5 applies, strictly
speaking, only to a large sample: that is to say, when a large number of 
individuals have been measured, among which the selection is to be 
made. When selection is made out of a small number of measured 
individuals, the mean deviation of the selected group is a little less. 
The intensity of selection can be found from tables of deviations of 
ranked data (Table XX of Fisher and Yates, 1943). The two lower
curves in Fig. 11.3 show the intensity of selection for samples of 10
and 20. Selection intensities for samples smaller than 10 are given
in Table 11.1.

Table 11.1

Intensities of selection when selection is made out of a small
number of individuals measured. The figures in the table
are values of $i = S/\sigma_P$ = mean deviation in standard measure.
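
The relation between the intensity of selection and the proportion selected can be sketched numerically instead of being read from graphs or tables. The Python below is a minimal illustration under stated assumptions: phenotypic values are taken to be exactly normal, the large-sample intensity is computed as the ordinate at the truncation point divided by the proportion selected, and the small-sample intensity is approximated by simulation rather than taken from tables of ranked data. The function names, the number of replicates, and the seed are all illustrative choices.

```python
import random
from statistics import NormalDist

def intensity_large_sample(p):
    """i = z / p for a normal distribution truncated so that a
    proportion p (0 < p < 1) of a large population is selected."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    std_normal = NormalDist()            # mean 0, standard deviation 1
    x = std_normal.inv_cdf(1.0 - p)      # point of truncation
    z = std_normal.pdf(x)                # height of the ordinate there
    return z / p

def intensity_small_sample(n_selected, n_measured, replicates=20000, seed=1):
    """Monte Carlo estimate of the mean standardised deviation of the
    best n_selected individuals out of n_measured (an approximation)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(replicates):
        sample = sorted(rng.gauss(0.0, 1.0) for _ in range(n_measured))
        total += sum(sample[-n_selected:]) / n_selected
    return total / replicates

large = intensity_large_sample(0.20)
small = intensity_small_sample(2, 10)    # the same 20 per cent, but 2 out of 10
print(f"large sample: i = {large:.2f}; 2 selected out of 10: i = {small:.2f}")
```

Under these assumptions the simulated value for 2 out of 10 falls below the large-sample value for the same proportion, which is the direction of the difference described in the text.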




Example 11.2. A comparison of the expected and observed responses
under different intensities of selection was made by Clayton, Morris, and
Robertson (1957), studying abdominal bristle number in Drosophila. The
heritability was first determined by three methods which yielded a combined
estimate of 0.52 (see Example 10.1). The standard deviation of
bristle number (average of the two sexes) was 3.35. Selection at four
different intensities was carried on for five generations, both upward and
downward (i.e. both for increased and for decreased bristle number). In
each case 20 males and 20 females were selected as parents, the intensity
being varied by the number out of which these were selected, as shown in
the first column of the table. The intensities of selection corresponding to
these proportions selected may be read off the graphs in Fig. 11.3. They
are given in the second column of the table. The expected responses are
then found from equation 11.3. Under the most intense selection, for
example, it is R = 1.4 × 3.35 × 0.52 = 2.44. There were five replicate lines
in both directions under the most intense selection, and three replicates
under the other intensities. The observed responses are quoted in the last
two columns of the table. Although they do not agree very precisely
with expectation, they show how the change made by selection falls off as
the intensity of selection is reduced, and the data serve to illustrate the
computation of the expected response.

Proportion selected, p     Intensity of selection, i     Mean response per generation

20/100 = 0.20
20/75 = 0.267
20/50 = 0.40
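
The computation of the expected response in Example 11.2 can be sketched as follows. The heritability (0.52) and phenotypic standard deviation (3.35) are those quoted in the example; the intensity is computed from the large-sample normal formula rather than read from the graphs, so values other than the first may differ slightly from those the author tabulated. Only the three proportions legible in the table are used, and the function names are illustrative.

```python
from statistics import NormalDist

H2 = 0.52        # heritability of bristle number (Example 11.2)
SIGMA_P = 3.35   # phenotypic standard deviation (Example 11.2)

def intensity(p):
    """Large-sample i = z / p for truncation selection on a normal character."""
    nd = NormalDist()
    return nd.pdf(nd.inv_cdf(1.0 - p)) / p

def expected_response(p, sigma_p=SIGMA_P, h2=H2):
    """Expected response per generation: R = i * sigma_P * h^2."""
    return intensity(p) * sigma_p * h2

# 20 parents of each sex chosen out of 100, 75 or 50 measured.
for measured in (100, 75, 50):
    p = 20 / measured
    print(f"p = 20/{measured}: i = {intensity(p):.2f}, "
          f"expected R = {expected_response(p):.2f}")
```

The expected response falls as the proportion selected rises, which is the pattern the observed responses in the example are said to follow.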

It will now be clear that there are two methods open to the breeder 
for improving the rate of response to selection: one by increasing the 
heritability and the other by reducing the proportion selected and so 
increasing the intensity of selection. The heritability can be increased 
only by reducing the environmental variation through attention to the 
technique of rearing and management. Reducing the proportion 
selected seems at first sight to be a straightforward means of improv- 
ing the response, but there are several factors to be considered which 
set a limit to what the breeder can do in this way. First is the matter 
of population size and inbreeding. This sets a lower limit to the 
number of individuals to be used as parents. In experimental work, 
for example, one might decide to use not less than 10 or even 20 pairs
of parents; and in livestock improvement, particularly if artificial
insemination came into general use as a means of intense selection on 
males, care would have to be taken not to restrict the number of 
males too much. For this reason the intensity of selection can be 
increased above a certain point only by increasing the total number of 
individuals measured, out of which the selection is made. With 
organisms that have a high reproductive rate, such as Drosophila and 
plants, very large numbers can, in principle, be measured; but in 
practice a limit is set to the intensity of selection by the time and 
labour required for the measurement. With organisms that have a 
low reproductive rate the limit to the intensity of selection is set by 
the reproductive rate, since the proportion saved can never be less 
than the proportion needed for replacement; that is to say, two 
individuals are needed on the average to replace each pair of parents. 
Usually fewer males are needed than females, because each male can 
mate with several females, and so the males leave more offspring than 
the females. A higher intensity of selection can then be made on 
males than on females. Suppose, for example, that females leave on 
the average 5 offspring, and each male mates with 10 females, so that 
males leave on the average 50 offspring. Then the proportion of 
females selected cannot be less than 1/5, but only 1/50 of the males 
need be selected. The upper limits of the intensity of selection in this 
case would be 1.40 for females, and 2.64 for males.

The number of offspring produced by a pair of parents depends 
not only on their reproductive rate but also on how long the breeder 
is willing to wait before he makes the selection. This introduces a 
new factor — the interval of time between generations — which we 
have not yet taken into account in the treatment of the response to 
selection, and which we must now consider. 

Generation interval. The progress per unit of time is usually 
more important in practice than the progress per generation, so the 
interval between generations is an important factor in reckoning the 
response to selection. The generation interval is the interval of time 
between corresponding stages of the life cycle in successive genera- 
tions, and it is most conveniently reckoned as the average age of the 
parents when the offspring are born that are destined to become 
parents in the next generation. By waiting until more offspring have 
been reared before he makes the selection the breeder can increase the 
intensity of selection and the response per generation; but in doing so 
he inevitably increases the generation interval and may thereby
reduce the response per unit of time. There is thus a conflict of
interest between intensity of selection and generation interval, and 
the best compromise must be found between the two. Increasing the 
number of offspring will pay up to a certain point, and beyond this 
point it will not. The optimal number of offspring cannot be stated 
in general terms, and each case must be worked out according to its 
special circumstances. The procedure is explained in the following 
example, referring to mice. 

Example 11.3. Let us suppose that selection is to be applied to some
character in mice, and that speed of progress per unit of time is the aim.
The question is: how many litters should be raised? To find the number of
litters that will give the maximum speed of progress we have to find the
intensity of selection and the generation interval. The ratio of the two will
then give the relative speed. The actual speed could be obtained by multiplying
by the heritability and the standard deviation, but these factors will
be assumed to be independent of the number of litters raised. A comparison
of the expected rates of progress per week is made in the table. The comparison
is made for three different average sizes of litter, meaning the
number of young reared per litter. It is assumed that the character to be
selected can be measured before sexual maturity, and that first litters are
born when the parents are 9 weeks old, subsequent litters following at
intervals of 4 weeks. It is assumed also that the population is large enough
to be treated as a large sample in reckoning the intensity of selection; and
that equal numbers of males and females are selected. The optimal
number of litters differs according to the number reared per litter.

N = 4

L      t      p        i       i/t
1      9      0.50     0.80    0.089
2     13      0.25     1.27    0.098
3     17      0.167    1.50    0.088
4     21      0.125    1.65    0.079

[Columns for N = 2 and N = 6: values not recoverable.]

Column headings:
L = number of litters raised.
t = generation interval in weeks.
p = proportion selected.
i = intensity of selection.
i/t = relative speed of progress.
N = number of young reared per litter.

If 6 young are reared the maximum speed is attained by rearing only one
litter. If 4 young are reared it is worth while to wait for second litters
before making the selection, but not for third litters. If only 2 young are
reared per litter, raising three litters gives the maximum speed of progress.
Most mouse stocks are able to rear 6 young per litter, so under most circumstances
it is best to make the selection from the first litters, and not to
wait for second litters. This conclusion could hardly have been guessed at
without the computations shown in the table.
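
The reasoning of Example 11.3 can be sketched in Python. Two working assumptions are made explicit here because the example does not spell them out as formulas: two offspring must be kept to replace each pair of parents, so the proportion selected is taken as p = 2/(N × L); and the generation interval is taken as the parents' age when the last litter is born, t = 9 + 4(L − 1) weeks. The intensity uses the large-sample normal formula, and all function names are illustrative.

```python
from statistics import NormalDist

def intensity(p):
    """Large-sample intensity of selection, i = z / p; zero if all are kept."""
    if p >= 1.0:
        return 0.0
    nd = NormalDist()
    return nd.pdf(nd.inv_cdf(1.0 - p)) / p

def relative_speed(young_per_litter, litters):
    """i / t for litter size N and number of litters L.

    Assumptions: p = 2 / (N * L), and t = 9 + 4 * (L - 1) weeks.
    """
    p = 2.0 / (young_per_litter * litters)
    t = 9 + 4 * (litters - 1)
    return intensity(p) / t

def best_number_of_litters(young_per_litter, max_litters=4):
    """Number of litters (1..max_litters) giving the greatest i / t."""
    return max(range(1, max_litters + 1),
               key=lambda litters: relative_speed(young_per_litter, litters))

for n_young in (2, 4, 6):
    speeds = [f"{relative_speed(n_young, litters):.3f}" for litters in range(1, 5)]
    print(f"N = {n_young}: i/t for L = 1..4: {speeds}, "
          f"best L = {best_number_of_litters(n_young)}")
```

Under these assumptions the optimum shifts from three litters when 2 young are reared, to two litters with 4 young, to a single litter with 6 young, in line with the conclusions stated in the example. Small differences in the third decimal place from the printed table are to be expected, since the table's intensities were rounded before dividing.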

Measurement of Response 

When one or more generations of selection have been made the 
measurement of the response actually obtained introduces several 
problems. These are matters of procedure rather than of principle 
and will be only briefly discussed. 

Variability of generation means. The first problem to be 
solved arises from the variability of generation means. Inspection of 
any of the graphs of selection given in the examples shows that the 
generation means do not progress in a simple regular fashion, but 
fluctuate erratically and more or less violently. There are two main 
causes of this variation between the generation means: sampling 
variation, depending on the number of individuals measured; and 
environmental change, which is usually the more important of the 
two. The consequence of this variation between generation means is 
that the response can seldom be measured with any pretence of 
accuracy until several generations of selection have been made. The 
best measure of the average response per generation is then obtained 
from the slope of a regression line fitted to the generation means, the 
assumption being made that the true response is constant over the 
period. The variation between generation means appears as error 
variation about the regression line, and the standard error of the 
estimate of response is based on it. Variation due to changes of 
environment can, of course, be overcome, or at least reduced, by the 
use of a control population. The measurement of the response can, 
however, be improved in accuracy if the "control" is not an un- 
selected population but is selected in the opposite direction. This is 
known as a "two-way" selection experiment. The response measured 
from the divergence of the two lines is then about twice as great as 
that of the lines separately, and the variation between generations is 
reduced to the extent that the environmental changes affect both lines 
alike. An unselected control is, however, preferable if for practical 
reasons one is interested only in the change in one direction, because 
the response is not always equal in the two directions. This point will 
be discussed in the next chapter. 
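
The estimate of the average response per generation described above, the slope of a regression line fitted to the generation means with a standard error taken from the scatter about the line, can be sketched as follows. The generation means used here are hypothetical numbers chosen only to show the arithmetic, and the function name is illustrative; the true response is assumed constant over the period, as in the text.

```python
from math import sqrt

def response_per_generation(generation_means):
    """Least-squares slope of generation mean on generation number,
    with its standard error from the residual variation about the line."""
    n = len(generation_means)
    if n < 3:
        raise ValueError("need at least three generation means")
    xs = range(n)
    x_bar = sum(xs) / n
    y_bar = sum(generation_means) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, generation_means))
    slope = sxy / sxx
    residual_ss = sum((y - (y_bar + slope * (x - x_bar))) ** 2
                      for x, y in zip(xs, generation_means))
    std_error = sqrt(residual_ss / (n - 2) / sxx)
    return slope, std_error

# Hypothetical generation means (generation 0 is the unselected base).
means = [10.0, 10.5, 10.9, 11.6, 12.0]
slope, se = response_per_generation(means)
print(f"response per generation = {slope:.2f} +/- {se:.3f}")
```

In a two-way experiment the same function could be applied to the difference between the "up" and "down" generation means, so that environmental fluctuations shared by both lines cancel before the line is fitted.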

Example 11.4. Fig. 11.4 shows the results of 11 generations of two-way
selection for body weight in mice (Falconer, 1953). On the left the "up"
and "down" lines are shown separately, and on the right the divergence between
the two is shown. Linear regression lines are fitted to the observed
generation means. (The first generation of selection is disregarded because
the method of selection was different.) The estimates of the average
response per generation, with their standard errors, are as follows:

                 Response ± standard error
                 in grams per generation
Up               0.27 ± 0.050
Down             0.62 ± 0.046
Divergence       0.88 ± 0.036

The difference between the upward and downward responses will be discussed
in the next chapter.

Fig. 11.4. Two-way selection for 6-week weight in mice. Explanation
in Example 11.4. (Redrawn from Falconer, 1953.)

The foregoing example shows how the variation of the generation 
means can be reduced when the response is measured from the difference
between two lines, each acting in the manner of a control for the
other. Controls, however, are not always available, and then a more
serious difficulty may arise from progressive changes of environment.
This makes it difficult to assess the effectiveness of selection in the
improvement of domesticated animals, and to a lesser extent of plants,
because in the absence of a control there is no sure way of deciding
how much of the improvement is due to selection and how much to a
progressive change in the conditions of management. 

Example 11.5. Lush (1950) has assembled a number of graphs show- 
ing the improvement of farm animals that has taken place during the 
present century. Instead of reproducing any of these graphs we give in 
the table an indication of the increase of yield per individual over a period 
of years, as a percentage of the initial yield. It is difficult to avoid the con- 
clusion that much of the improvement of these characters is the result of 
selection, but in the absence of any standard of comparison it is very 
difficult to decide how much is due to selection and how much to improved 
methods of feeding and management. 




Improvement, %
1920-1944
New Zealand
Fat % in milk
Efficiency of growth
Body length
Fleece weight
Egg production

Weighting the selection differential. In experimental selection 
the selection differential as well as the response has to be measured 
because it is the relationship between the two, and not the response 
alone, that is of interest from the genetic point of view. We have to 
distinguish between the expected and the effective selection differ- 
ential, because in practice the individual parents do not contribute 
equally to the offspring generation. Differences of fertility are always 
present so that some parents contribute more offspring than others. 
To obtain a measure of the selection differential that is relevant to 
the response observed in the mean of the offspring generation we 
therefore have to weight the deviations of the parents according to 
the number of their offspring that are measured. The expected 
selection differential is the simple mean phenotypic deviation of the 
parents as defined at the beginning of this chapter; the effective 
selection differential is the weighted mean deviation of the parents, 
the weight given to each parent, or pair of parents, being their pro- 
portionate contribution to the individuals that are measured in the 
next generation. 
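
The distinction between the expected and the effective selection differential can be sketched numerically. In the Python below the parental deviations and the numbers of measured offspring are hypothetical values chosen for illustration, and the function names are not from the text; the weighting follows the definition just given, each pair of parents counting in proportion to its measured offspring.

```python
def expected_selection_differential(deviations):
    """Simple (unweighted) mean deviation of the selected parents."""
    return sum(deviations) / len(deviations)

def effective_selection_differential(deviations, offspring_measured):
    """Mean deviation weighted by each parent's (or pair's) share of the
    offspring that are measured in the next generation."""
    total = sum(offspring_measured)
    if total == 0:
        raise ValueError("no offspring were measured")
    weighted = sum(d * k for d, k in zip(deviations, offspring_measured))
    return weighted / total

# Hypothetical data: deviations of three selected pairs from the population
# mean, and the number of measured offspring each pair contributed.
deviations = [3.0, 2.0, 1.0]
offspring_measured = [2, 4, 6]

expected_S = expected_selection_differential(deviations)
effective_S = effective_selection_differential(deviations, offspring_measured)
print(f"expected S = {expected_S:.2f}, effective S = {effective_S:.2f}")
```

In this made-up case the most extreme pair leaves the fewest offspring, so the effective differential is smaller than the expected one; a gap of that kind is what would suggest natural selection working against the artificial selection.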

The weighting of the selection differential takes account of a good 
part of the effects of natural selection. If the differences of fertility
are related to the parents' phenotypic values for the character being
selected, then this natural selection will either help or hinder the 
artificial selection. If, for example, the more extreme phenotypes are 
less fertile or more frequently sterile, then natural selection is working 
against artificial selection. By weighting the selection differential we 
measure the joint effects of natural and artificial selection together. 
A comparison of the effective (i.e. weighted) with the expected selec- 
tion differential may thus be used to discover whether natural selec- 
tion is operative. 

Example 11.6. In an experiment with mice, selection for body size
(weight at 6 weeks) was carried through 30 generations in the upward 
direction and 24 generations in the downward direction (see Falconer, 
1955). Comparisons are made in the table between the effective (weighted) 
and the expected (unweighted) selection differentials in the two lines. The 
period of selection is divided into two parts and the comparisons are made 
separately in each. Throughout the whole of the upward selection there 
was virtually no difference between the effective and expected selection 
differential, and we can conclude that natural selection was unimportant 
as a factor influencing the response. The situation in the downward 
selected line, however, is different, the effective selection differential being 
less than the expected, especially in the second part. From this we can 
conclude that natural selection was operating in favour of large size, thus 
hindering the artificial selection and reducing the response obtained, 
particularly in the latter part of the experiment. The cause of the natural 
selection and the reason why it operated only in the downward selected 
line were as follows. Large mice produce larger litters than small mice; but 
for the purpose of standardisation, litters were artificially reduced to 8 
young at birth. At the beginning, and throughout the whole period in the 
upward selected line, there were few litters with less than 8 young, and so
the differential fertility had no consequence in the upward selected line.
In the downward selected line, however, there was soon no standardisation
because there were few litters with as many as 8 young. Thus the smaller
mice produced fewer young and this reduced the effective selection differ-
ential. In the second part of the experiment the smallest mice did not
breed at all and this reduced the effective selection differential still further.

[Table: direction of selection; selection differential per generation (gms.), expected and effective.]

The weighting of the selection differential does not take account 
of the whole effect of natural selection. We noted at the beginning of 
the chapter that natural selection may operate at two stages, through 
differences of fertility among the parents and through differences of 
viability among the offspring. The effect of differences of viability
among the offspring is not accounted for in the effective selection
differential. For further examples and a fuller account of the inter- 
action of natural and artificial selection see Lerner (1954, 1958). 

Realised heritability. The equation of response, $R = h^2 S$ (11.2),
which we discussed earlier from the point of view of predicting the 
response, can be looked at the other way round, as a means of esti- 
mating the heritability from the result of selection already carried 
out, the heritability being estimated as the ratio of response to selec- 
tion differential: 

$$h^2 = \frac{R}{S} \qquad (11.5)$$

The same conditions are necessary for the valid use of the equation 
for estimating heritability as for predicting response, except that now 
by weighting the selection differential a good part of the effects of 
natural selection can be taken account of. There is also the condition 
that the observed response should not be confounded with systematic 
changes of generation mean due to the environment or the effects of 
inbreeding. This, and the absence of maternal effects, are the im- 
portant conditions for the valid estimation of heritability from the 
response to selection. 

The ratio of response to selection differential, however, has an 
intrinsic interest of its own, quite apart from whether it provides a 
valid estimate of the heritability. It provides the most useful empiri- 
cal description of the effectiveness of selection, which allows com- 
parison of different experiments to be made even when the intensity 
of selection is not the same. The term realised heritability will be used 
to denote the ratio R/S, irrespective of its validity as a measure of the 
true heritability. The realised heritability is estimated as follows. 
The generation means are plotted against the cumulated selection 
differential. That is to say, the selection differentials, appropriately
weighted, are summed over successive generations so as to give the 
total selection applied up to the generation in question. A regression 
line is then fitted to the points and the slope of this line measures the 
average value of R/S, the realised heritability. 
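
The procedure just described can be sketched in code. This is a minimal illustration, assuming that the first generation mean belongs to the unselected base generation (cumulated differential zero) and using invented figures; the function and variable names are hypothetical. The slope is an ordinary least-squares regression of generation mean on cumulated selection differential.

```python
def realised_heritability(selection_differentials, generation_means):
    """Slope of generation mean on cumulated selection differential.

    selection_differentials : weighted differential applied in each generation
    generation_means        : mean of the base generation followed by the mean
                              of each generation produced by selection
    """
    cumulated = [0.0]
    for s in selection_differentials:
        cumulated.append(cumulated[-1] + s)
    if len(cumulated) != len(generation_means):
        raise ValueError("need one more generation mean than selection differentials")
    n = len(cumulated)
    mean_x = sum(cumulated) / n
    mean_y = sum(generation_means) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(cumulated, generation_means))
    sxx = sum((x - mean_x) ** 2 for x in cumulated)
    return sxy / sxx


# Hypothetical figures: four generations, each with a weighted differential of 2 units.
weighted_s = [2.0, 2.0, 2.0, 2.0]
means = [20.0, 20.6, 21.2, 21.8, 22.4]
h2_realised = realised_heritability(weighted_s, means)
print(round(h2_realised, 3))
```

Fitting a line through all the points, rather than dividing the final response by the final cumulated differential, uses every generation and is less sensitive to the chance position of any single generation mean.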

Example 11.7. Fig. 11.5 shows the results of 21 and 18 generations
of two-way selection for 6-week weight in mice (Falconer, 1954a). The
generation means are plotted against the cumulated selection differential
and linear regression lines are fitted to the points. The realised herit-
abilities, estimated from the slopes of these lines, are:

Upward selection: 0.175 ± 0.0161
Downward selection: 0.518 ± 0.0231

The difference between the upward and downward selection is referred to
in the next chapter.

Fig. 11.5. Two-way selection for 6-week weight in mice. Response
plotted against cumulated selection differential, as explained
in Example 11.7. (From Falconer, 1954a; reproduced by courtesy
of the editor of the International Union of Biological Sciences.)

Change of Gene Frequency under Artificial Selection 

It was pointed out at the beginning of this chapter that the change 
of the population mean resulting from selection is brought about 
through changes of the gene frequencies at the loci which influence 
the character selected. But since the effects of the loci cannot be
individually identified, the changes of gene frequency cannot in 
practice be followed. Consequently the process of selection for a 
metric character had to be described in terms of the selection differ- 
ential, or the intensity of selection, and of the change of the popula- 
tion mean, representing the combined effects of all the loci. This 
leaves unanswered the fundamental question: How great are the 
changes of gene frequency underlying the response of a metric 
character to selection? To answer this question, and so to bridge the 
gap between the treatment of selection given in this chapter and that 
given earlier in Chapter 2, we have to find the connexion between the 
intensity of selection (i) and the coefficient of selection (s) operating 
on a particular locus. 

The effect of selection for a metric character on one of the loci
concerned may best be pictured in the manner illustrated in Fig.
11.6. This refers to a locus with two alleles of which one ($A_1$) is com-
pletely dominant. With respect to this locus, therefore, the popula-
tion is divided into two portions which differ in their mean pheno-
typic values by an amount $2a$, this being the difference between the
two homozygotes in the notation of earlier chapters (see Fig. 7.1,
p. 113). It is assumed that the residual variance within each portion is
the same, this residual variance arising from all the other loci as well
as from environmental causes. The proportion of individuals in the
two portions depends on the gene frequency at the locus, $q^2$ being in
the portion consisting of $A_2A_2$ genotypes, and $1 - q^2$ in the portion
containing $A_1A_1$ and $A_1A_2$ genotypes. When artificial selection is
applied, a proportion of the whole population lying beyond the point
of truncation is cut off, and the proportion of $A_2A_2$ genotypes is lower
among this selected group than in the population as a whole, selec-
tion acting in the case illustrated against the $A_2$ allele. Now, the new
gene frequency, $q_1$, is the frequency of $A_2$ genes among the selected
group of individuals. This may be found by deducing the regression
of gene frequency on phenotypic value, $b_{qP}$. The selected group
deviates in mean phenotypic value from the population mean by an
amount $S$, which is the selection differential. The gene frequency
among the selected group will then be given by the regression equa-
tion

$$q_1 = q + b_{qP} S \qquad (11.6)$$

Fig. 11.6. Selection for a metric character operating on one of
the loci concerned. The frequency of $A_2A_2$ as depicted is $q^2$ = I.

The regression of gene frequency on phenotypic value is found as
follows. The three genotypes are listed in Table 11.2 with their
frequencies in the whole population.

Table 11.2

Genotype     Frequency    q      G
$A_1A_1$     $p^2$        0      $+a$
$A_1A_2$     $2pq$        ½      $d$
$A_2A_2$     $q^2$        1      $-a$

The third column of the table
gives the frequency of the $A_2$ allele among each of the three geno-
types, which is simply 0, ½, and 1. The last column gives the geno-
typic values. Provided there is no correlation between genotype and
environment, these are also the mean phenotypic values of each
genotype. There is now no assumption of complete dominance.
The covariance of gene frequency with phenotypic value is obtained
from the sum of the products of $q$ and $P$, each multiplied by the
frequency of the genotype. From this sum of products must be
deducted the product of the means of the gene frequency and the
phenotypic value. Thus the covariance is $cov_{qP} = pqd - q^2a - qM$,
where $M$ is the population mean. Substituting the value of $M$ from
equation 7.2, the covariance reduces to $-pq[a + d(q - p)] = -pq\alpha$,
where $\alpha$ is the average effect of the gene substitution (see equation 7.5).
The regression of gene frequency on phenotypic value is therefore

$$b_{qP} = \frac{cov_{qP}}{\sigma_P^2} = -\frac{pq\alpha}{\sigma_P^2}$$

where $\sigma_P^2$ is the phenotypic variance.
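
The reduction of the covariance to $-pq\alpha$ can be checked numerically. The sketch below assumes the usual assignment of genotypic values in the notation of Fig. 7.1 ($+a$, $d$, $-a$ for $A_1A_1$, $A_1A_2$, $A_2A_2$) and Hardy-Weinberg frequencies; the function names and the numerical values of $q$, $a$, and $d$ are chosen arbitrarily for illustration.

```python
def covariance_q_with_value(q, a, d):
    """Covariance of A2 gene frequency with genotypic value, computed
    directly from the three genotypes (frequency, q within genotype, value)."""
    p = 1.0 - q
    genotypes = [
        (p * p, 0.0, a),       # A1A1
        (2 * p * q, 0.5, d),   # A1A2
        (q * q, 1.0, -a),      # A2A2
    ]
    mean_q = sum(freq * gq for freq, gq, _ in genotypes)
    mean_value = sum(freq * value for freq, _, value in genotypes)
    sum_of_products = sum(freq * gq * value for freq, gq, value in genotypes)
    return sum_of_products - mean_q * mean_value


def minus_pq_alpha(q, a, d):
    """The reduced form -pq[a + d(q - p)] given in the text."""
    p = 1.0 - q
    return -p * q * (a + d * (q - p))


direct = covariance_q_with_value(q=0.3, a=1.0, d=0.5)
reduced = minus_pq_alpha(q=0.3, a=1.0, d=0.5)
print(direct, reduced)
```

The two values agree for any choice of $q$, $a$, and $d$, so the result does not depend on the degree of dominance.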

Next, we substitute this regression coefficient in equation 11.6,
putting also $S = i\sigma_P$ from equation 11.5. This gives the gene frequency
among the selected parents as

$$q_1 = q - i\,pq\,\frac{\alpha}{\sigma_P}$$

and the change of gene frequency resulting from the selection
reduces to

$$\Delta q = -i\,pq\,\frac{\alpha}{\sigma_P} \qquad (11.8)$$


The change is negative because selection is acting against the allele
$A_2$ whose frequency is $q$. This formula enables us to translate the
intensity of selection, $i$, into the coefficient of selection, $s$, against $A_2$,
because equations for the change of gene frequency in terms of $s$ were
given in Chapter 2. We shall take the approximate equations given
in 2.7 and 2.8. If dominance is complete, $d = a$ and $\alpha = 2qa$. Then
equating 11.8 with 2.8 gives

$$i\,pq\,\frac{2qa}{\sigma_P} = sq^2(1 - q)$$

If there is no dominance $d = 0$ and $\alpha = a$. Then equating 11.8 with
2.7 gives

$$i\,pq\,\frac{a}{\sigma_P} = \tfrac{1}{2}sq(1 - q)$$

Both these equations, on simplification, reduce to

$$s = i\,\frac{2a}{\sigma_P} \qquad (11.9)$$


Thus we find that the two ways of expressing the "force" of selection,
by the intensity and by the coefficient of selection, are very simply
related to each other. The coefficient of selection operating on any
locus is directly proportional to the intensity of selection and to the
quantity $2a/\sigma_P$. This quantity is the difference of value between the two
homozygotes expressed in terms of the phenotypic standard deviation.
For want of a more suitable term we shall refer to this, rather loosely,
as the "proportionate effect" of the locus. There is nothing more
that we can do with the relationship expressed in equation 11.9 at
the moment, but we shall use it in the next chapter to draw some
tentative conclusions about the "proportionate effects" of loci con-
cerned with metric characters.
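
The relation between the intensity of selection and the coefficient of selection can be verified with a short calculation. This is a sketch only: the forms assumed here for the Chapter 2 approximations (a change of $-\tfrac{1}{2}sq(1-q)$ with no dominance and $-sq^2(1-q)$ with complete dominance) are the ones implied by the equalities in this section, and all function names and numerical values are hypothetical.

```python
def average_effect(a, d, q):
    """Average effect of the gene substitution: a + d(q - p)."""
    p = 1.0 - q
    return a + d * (q - p)


def delta_q_from_intensity(i, q, a, d, sigma_p):
    """Change of gene frequency from intensity of selection: -i p q alpha / sigma_P."""
    p = 1.0 - q
    return -i * p * q * average_effect(a, d, q) / sigma_p


def selection_coefficient(i, a, sigma_p):
    """Coefficient of selection against A2: s = i * 2a / sigma_P."""
    return i * 2.0 * a / sigma_p


# Hypothetical locus: proportionate effect 2a/sigma_P = 0.2, intensity i = 1.
i, q, a, sigma_p = 1.0, 0.3, 0.1, 1.0
s = selection_coefficient(i, a, sigma_p)

no_dominance = delta_q_from_intensity(i, q, a, d=0.0, sigma_p=sigma_p)
complete_dominance = delta_q_from_intensity(i, q, a, d=a, sigma_p=sigma_p)

# Assumed approximate forms of the Chapter 2 equations in terms of s.
via_s_no_dominance = -0.5 * s * q * (1.0 - q)
via_s_complete_dominance = -s * q * q * (1.0 - q)

print(s, no_dominance, complete_dominance)
```

With a proportionate effect of 0.2 and an intensity of 1, the coefficient of selection is 0.2, and both routes give the same change of gene frequency.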



II. The Results of Experiments 

In the last chapter we saw that the theoretical deductions about the 
effects of artificial selection are limited to the change of the popula- 
tion mean, and strictly speaking over only one generation. By chang- 
ing the gene frequencies selection changes the genetic properties of 
the population upon which the effects of further selection depend. 
And, because the effects of the individual loci are unknown, the 
changes of gene frequency cannot be predicted, and so the response 
to selection can be predicted only for as long as the genetic properties 
remain substantially unchanged. Thus there are many consequences 
of selection that can be discovered only by experiment. The object of 
this chapter is to describe briefly what seem to be the most general 
conclusions about these consequences that have emerged from 
experimental studies of selection. It should be noted, however, that 
the drawing of conclusions from the results of experiments in the 
field of quantitative genetics is to some extent a matter of personal 
judgement. Many of the conclusions put forward in this chapter 
therefore represent a personal viewpoint, and are not necessarily 
accepted generally. The most important questions to be answered 
by experiment concern the long-term effects of selection. For how 
long does the response continue? By how much can the population 
mean ultimately be changed ? What is the genetic nature of the limit 
to further progress? These questions will be dealt with in the latter 
part of the chapter. First we shall consider two questions raised by 
the examples in the last chapter. 

Repeatability of Response 

In Example 11.1 we saw that the response in one generation of
selection was very variable when the selection was replicated in a 
number of lines. Though the average response agreed fairly well
with the prediction, the responses of the individual lines did not. 
This raises the question: How consistent, or repeatable, are the 
results of selection ? If selection is applied to different samples drawn 
from the same population, how closely will the results agree? Part 
of the problem here concerns sampling variation — the extent to 
which the samples differ in gene frequencies, both initially and 
during the course of the continued selection. This depends, of course, 
on the size of the populations, or lines, during the course of the 
selection; but it depends also on the initial gene frequencies in the 
base population from which the samples were drawn. If most of the 
loci concerned with the character have genes at more or less inter- 
mediate frequencies then the response to selection is not likely to be 
much influenced by sampling variation. On the other hand, if there 
are loci with genes at low frequency then these will be included in 
some samples drawn from the initial population but will be absent 
from others. Then, if any of these low-frequency genes have a fairly 
large effect on the character their presence or absence may appreciably 
influence the outcome of selection. The experiment on abdominal
bristle-number in Drosophila, whose first generation was quoted in
Example 11.1, provides the only evidence on this point (Clayton,
Morris, and Robertson, 1957). Fig. 12.1 shows the responses in the
five up and the five down lines over 20 generations. The responses 
are reasonably consistent over the first 5 generations in the up lines 
and over about 10 generations in the down lines. Thereafter the 
lines begin to differentiate, and by the twentieth generation there are 
substantial differences between them. The conclusion suggested by 
the early similarity and the later divergence between the replicate 
lines is that the early response is governed chiefly by genes at more or 
less intermediate frequencies, but in the later stages genes at initially 
low frequencies begin to come into play, the initial sampling having 
caused differences between the lines in respect of these genes. 

The question of repeatability of the response to selection may be 
extended to differences between populations. This is not a matter of 
sampling variation but of the differences in the genetic properties of 
populations. We noted in Chapter 10 that heritabilities frequently 
differ between populations, and consequently we should not expect 
the responses to selection to be the same. It is of interest nevertheless 
to compare the results of selection applied to different populations 
and to see how they do actually differ. Fig. 12.2 shows the results of
selection for thorax length in Drosophila melanogaster applied to three
different wild populations (F. W. Robertson, 1955). The responses
of the three populations, both upward and downward, are fairly alike.
It is not possible to discuss further the degree of repeatability
between the responses found in these two experiments, because there
is no objective criterion for deciding how closely the responses ought
to agree. One can therefore only regard them as empirical evidence
of what in practice does occur.

Fig. 12.1. Selection for abdominal bristle number in Drosophila
melanogaster, replicated in 5 lines in each direction. The broken
lines refer to suspended selection and the thin continuous lines to
inbreeding without selection. (From Clayton, Morris, and Robert-
son, 1957; reproduced by courtesy of the authors and the editor of
the Journal of Genetics.)

Fig. 12.2. Selection for thorax length in Drosophila melanogaster
from three different base populations. The broken lines refer to
reversed selection and the dotted lines to suspended selection.
(From F. W. Robertson, 1955; reproduced by courtesy of the
author and the editor of the Cold Spring Harbor Symposia on
Quantitative Biology.)

Asymmetry of Response 

A surprising feature of the experimental results illustrated in the 
last chapter is the inequality of the responses to selection in opposite 
directions, seen particularly well in Fig. 11.5. This asymmetry of 
response has been found in many two-way selection experiments, 
but its cause is not yet known. For this reason we shall not discuss 
the phenomenon in detail, but shall merely note the possible causes, 
of which there are several. These possible causes are, briefly, as
follows:

1. Selection differential. The selection differential may differ
between the upward and downward selected lines, for several reasons.
(i) Natural selection may aid artificial selection in one direction or
hinder it in the other. (ii) The fertility may change so that a higher
intensity of selection is achieved in one direction than in the other.
(iii) The variance may change as a result of the change of mean: the
selection differential will increase as the variance increases and de-
crease as it decreases. This is a "scale-effect," to be discussed more
fully in Chapter 17. These three causes operating through the selec-
tion differential were all found in the experiment with mice cited in
the last chapter, but they operated in the direction opposite to that of
the asymmetry found. The selection differential was greater in the
upward selection but the response was greater in the downward
selection. Differences of the selection differential influence the
response per generation, but they affect the realised heritability only a
little. Therefore if the response is plotted against the cumulated
selection differential and there is still much asymmetry, as in Fig.
11.5, it cannot be attributed to any cause operating through the
selection differential.

2. "Genetic asymmetry." There are two sorts of asymmetry 
in the genetic properties of the initial population that could give rise
to asymmetry of the responses to selection (Falconer, 1954a). These 
concern the dominance and the gene frequencies of the loci concerned 
with the character. The dominant alleles at each locus may be mostly 
those that affect the character in one direction, instead of being more 
or less equally distributed between those that increase and those that 
decrease it. We shall refer to this situation as directional dominance. 
If the initial gene frequencies were about 0.5, the response would be
expected to be greater in the direction in which the alleles tend to be 
recessive. It will be shown in Chapter 14 that this is also the direction 
in which the mean is expected to change on inbreeding. Therefore 
we should, in general, expect characters that show inbreeding 
depression to respond more rapidly to downward selection than to 
upward selection. There may also be asymmetry in the distribution of 
gene frequencies. The more frequent alleles at each locus may be 
mostly those that affect the character in one direction — a situation 
that we shall refer to as directional gene frequencies. In the absence of 
directional dominance this would be expected to cause a more rapid 
response to selection in the direction of the less frequent alleles. 
Under natural selection the less favourable alleles, in respect to fit- 
ness, will have been brought to lower frequencies. Therefore if 
selection in one direction reduces fitness more than selection in the 
other, we should expect a more rapid response in the direction of the 
greater loss of fitness. The asymmetry of the response to selection 
theoretically expected from these two causes may be seen by con- 
sideration of Fig. 2.3, which shows the expected response arising 
from one locus. Neither of these two causes — directional dominance 
and directional gene frequencies — would, however, be expected to 
give rise to immediate asymmetry; that is, in the first few generations 
of selection. The asymmetry would appear only as the gene fre- 
quencies in the upward and downward selected lines become differ- 
entiated. The asymmetry found in some experiments undoubtedly 
appears sooner than would be expected from these causes. 

3. Selection for heterozygotes. If selection in one direction 
favours heterozygotes at many loci, or at a few loci with important 
effects, the response would become slow as the gene frequencies ap- 
proach their equilibrium values. But the response in the other direc- 
tion would be rapid until the favoured alleles approach fixation. This 
situation, which is a form of directional dominance, would also be 
expected to give rise to an asymmetrical response (Lerner, 1954); but, 
again, not immediately. 



4. Inbreeding depression. Most experiments on selection are 
made with populations not very large in size, and there is usually 
therefore an appreciable amount of inbreeding during the progress
of the selection. If the character selected is one subject to inbreeding
depression, there will be a tendency for the mean to decline through 
inbreeding. This will reduce the rate of response in the upward 
direction and increase it in the downward direction, thus giving rise 
to asymmetry. An unselected control population will reveal how 
much asymmetry can be attributed to this cause. Inbreeding 
depression has been shown to be an insufficient cause of the asymmetry 
in the experiments cited in the last chapter. 

5. Maternal effects. Characters complicated by a maternal effect 
may show an asymmetry of response associated with the maternal 
component of the character. The situation envisaged may best 
be explained by reference to the selection for body weight in 
mice (Falconer, 1955), which showed the strong asymmetry illus-
trated in Fig. 11.5. The character selected, 6-week weight, may be
divided into two components, weaning weight and post-weaning 
growth, the former being maternally determined. It was found that 
all the asymmetry resided in the weaning weight and none in the 
post-weaning growth. The weaning weight increased hardly at all 
in the large line but decreased very much in the small line. Thus it 
was the mothering ability that changed asymmetrically under selec- 
tion and not the growth of the young themselves. To attribute an 
asymmetrical response to maternal effects does not, however, solve 
the problem, because the asymmetry has merely been shifted from 
the character selected to another, and is still just as much in need of 
an explanation. 

These, then, are the possible causes of asymmetry that may be 
suggested. There are probably others. Until the causes of asym- 
metry are better understood it is clear that predictions of the rate of 
response to selection must be made with caution. Where there is 
asymmetry of response the mean of the realised heritabilities in the 
two directions will presumably correspond with the heritability 
estimated from the resemblance between relatives. Therefore the 
response predicted will presumably be about the mean of the two- 
way responses actually obtained. If the asymmetry found in the 
mouse experiment should prove to be characteristic of selection for 
economically desirable characters in mammals, it means that we must 
expect actual progress to fall short of the predicted progress. In this
experiment the mean realised heritability was 35 per cent, but the 
upward progress was only at the rate of 18 per cent. In other words 
the progress made was only about half as rapid as would, presumably, 
have been predicted. 
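
The arithmetic behind these percentages can be retraced from the realised heritabilities quoted in Example 11.7 of the last chapter (0.175 upward and 0.518 downward). The lines below are only a check of that arithmetic; the variable names are chosen for illustration.

```python
upward_h2 = 0.175    # realised heritability, upward selection (Example 11.7)
downward_h2 = 0.518  # realised heritability, downward selection (Example 11.7)

# Mean of the two directions, taken as the heritability that would
# presumably be estimated from the resemblance between relatives.
mean_h2 = (upward_h2 + downward_h2) / 2

# Upward progress achieved as a fraction of the progress predicted.
fraction_of_prediction = upward_h2 / mean_h2

print(round(mean_h2, 4), round(fraction_of_prediction, 2))
```

The mean comes to about 0.35 and the upward rate is about half of it, as stated.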

Long-term Results of Selection 

The response to selection cannot be expected to continue in- 
definitely. Sooner or later it is to be expected that all the favourable 
alleles originally segregating will be brought to fixation. As they 
approach fixation the genetic variance should decline and the rate of 
response diminish, till, when fixation is complete, the response should 
cease. The population should then fail also to respond to selection in 
the opposite direction, and further response to selection in either 
direction will depend on the origin of new genetic variation by 
mutation. But how many generations must elapse before the response 
ceases, and how great will be the total response are questions that can 
be answered only by experiment. Let us first see what evidence is 
available on these points, and then see how far the long-term effects 
of selection conform to the simple theoretical picture outlined above. 

Total response and duration of response. When the response 
to selection has ceased, the population is said to be at the selection 
limit. It is usually impossible to decide exactly at what point the 
limit is reached, because the limit is approached gradually, the res- 
ponse becoming progressively slower. The total response, and par- 
ticularly the duration of the response, can therefore be estimated only 
approximately. Bearing this in mind, we may examine the results of 
four two-way selection experiments, two with Drosophila and two 
with mice, given in Table 12.1. The asymmetry of the responses is 
disregarded, and the total response is taken as the sum of the total 
responses in the two directions. This is the difference between the 
upper and lower selection limits, and may be called the total range. In 
the table the total range is expressed in three ways: as a percentage of
the initial population mean, $M$; in terms of the phenotypic standard
deviation, $\sigma_P$, in the initial population; and in terms of the standard
deviation of breeding values, $\sigma_A$ (i.e. the square root of the additive
variance) in the initial population. To draw general conclusions from
these four experiments would be rash, because the experiments 
differed in several ways (in the intensity of selection, the population
size, and the nature of the initial population), all of which would be
expected to affect the duration of response and the total range.

Table 12.1
Total Responses in four Selection Experiments

Experiment                 Duration         Total range
                           (generations)    $R/M$ (%)   $R/\sigma_P$   $R/\sigma_A$
(1) abdominal bristles     30               189         20             28
(2) thorax length          20               24          12             22
(3) 6-week weight
(4) 60-day weight

(1) Clayton and Robertson (1957).
(2) F. W. Robertson (1955).
(3) Falconer (1955).
(4) MacArthur (1949); Butler (1952).

Despite these differences, however, the picture they give is fairly
consistent. The response continues for about 20 to 30 generations;
and the total range is between 15 and 30 times the square root of the
additive variance, or about 10 to 20 times the phenotypic standard 
deviation in the initial population. The relationship between the 
total range and the original population mean, however, is quite 

The total response produced by selection in these experiments, 
though it may be impressive when reckoned in terms of the variation 
present in the original population, is not at all spectacular when com- 
pared with the achievements of the breeders of domestic animals. 
For example, the upper limits of body weight of the mice in the 
experiments quoted are 2 to 3 times the lower limits; but the weights 
of the largest breeds of dog are about 75 times greater than those of 
the smallest (Sierts-Roth, 1953). The reason for the disappointing 
results of experimental selection when viewed against the differences 
between the breeds of domestic animals is that experiments are 
carried out with closed populations of not very large size. The limits 
are set by the gene content of the foundation individuals, since no 
genes are brought in after selection has been started. The breeder of 


Chap. 12] 



domestic animals, in contrast, by intermittent crossing casts his net 
far wider in the search for genes favourable to his purposes. 

The effects of inbreeding during the selection have been ignored 
in this account of selection limits. It is clear on theoretical grounds 
that inbreeding will tend to cause fixation of unfavourable alleles at 
some loci. Both the total response and the duration of the response 
must therefore be expected to be reduced if the selection is carried 
out in a small population with a fairly high rate of inbreeding. There 
is, however, little experimental evidence on the magnitude of this 
effect of inbreeding. The four experiments discussed above were all 
carried out on fairly large populations, so that the rate of inbreeding 
was fairly low. 

Number of "loci." When the total range has been determined 
by experiment it is possible, in principle, to deduce the number of 
loci that gave rise to the response, and the magnitude of their effects. 
The estimates that can be made in practice, however, are only rough 
ones, because the properties of the individual loci are unknown and 
have to be guessed at. But even though we can do no more than 
establish the order of magnitude of the number and effects of the loci, 
this is better than no estimate at all; so let us see how these estimates 
may be obtained. The limitations will become apparent as we pro-
ceed.

The estimates come from a comparison of the total range with 
the amount of additive genetic variance in the original population. In 
principle it is clear that with a given amount of initial variation a 
small number of genes will produce less total response than a larger 
number; and that if a given amount of variation is produced by few 
genes the magnitude of their effects must be greater than if it is pro- 
duced by many. It is clear, also, that linkage is an important factor 
in the relationship between variance and total response. Some seg- 
ments of chromosome that segregate as units in the initial popula- 
tion will recombine during the selection and appear as many genes 
contributing to the total response. Other segments may fail to re- 
combine and will be counted as single genes. In order to emphasise 
this limitation, the estimate of the number of loci may be referred to 
as the number of "effective factors" or as the "segregation index." 
There are, however, other uncertainties, and we shall simply refer to 
it as the number of "loci," letting the inverted commas serve to 
remind us of the unavoidable limitations and qualifications. 

We must first suppose that there has been no inbreeding and when
the selection limits have been reached all loci are fixed for the
favourable allele. The total range is then $2\Sigma a$, where $2a$ is the
difference of genotypic value between the two most extreme
homozygotes at a particular locus, and is the precise meaning of what
we have loosely called the "effect" of the locus. If $R$ is the total
range and $n$ is the number of loci that have contributed to the
response, then

    $R = 2n\bar{a}$    (12.1)

where $\bar{a}$ is the mean value of $a$. Next we must suppose that each
locus has only two alleles. The additive variance arising from one
locus is then $\sigma_A^2 = 2pq[a + d(q - p)]^2$, from equation 8.5. (We
shall use $\sigma^2$ here to denote variance instead of $V$, because it
simplifies the formulation when standard deviations are involved.)
The gene frequencies at the individual loci thus enter the picture.
Unless the initial population was made from crosses between inbred
lines, the gene frequencies are not known and we shall therefore have
to insert hypothetical values. We shall suppose that all segregating
genes are at frequencies of 0.5, as they would be if the initial
population were made from a cross between two inbred lines. The
additive variance contributed by one locus then becomes
$\sigma_A^2 = \frac{1}{2}a^2$, and the degree of dominance becomes
irrelevant. Next we have to suppose there is no linkage between the
loci, so that the additive variance due to all $n$ loci together is

    $\sigma_A^2 = \frac{1}{2}n\overline{(a^2)}$    (12.2)

where $\overline{(a^2)}$ is the mean of the squares of $a$ for each locus.
Finally we shall suppose that all loci have equal effects, so that
equations 12.1 and 12.2 become

    $R = 2na$    (12.3)

    $\sigma_A^2 = \frac{1}{2}na^2$    (12.4)

Squaring equation 12.3 and substituting $a^2 = (2/n)\sigma_A^2$ from
equation 12.4 gives $R^2 = 8n\sigma_A^2$, whence

    $n = \frac{R^2}{8\sigma_A^2}$    (12.5)

This equation gives the basis for estimating the number of "loci."
Their effects may then be estimated from equation 12.4. The most
meaningful measure of the "effect" of a locus, however, is what we
have earlier called the "proportionate effect," $2a/\sigma_P$, which is the
difference between the homozygotes expressed in terms of the
phenotypic standard deviation. By rearrangement of equation 12.4 this
becomes

    $\frac{2a}{\sigma_P} = 2h\sqrt{\frac{2}{n}}$

where $h$ is the square root of the heritability.
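
The two estimates can be worked through numerically. The following is a minimal sketch in Python, not part of the original analysis: the function names are our own, and the input values (a total range of 20 phenotypic standard deviations and a heritability of 0.5) are hypothetical figures chosen only to make the arithmetic easy to follow. They are not taken from Table 12.1.

```python
import math


def number_of_loci(total_range, additive_variance):
    """Equation 12.5: n = R^2 / (8 * sigma_A^2)."""
    return total_range ** 2 / (8.0 * additive_variance)


def proportionate_effect(n_loci, heritability):
    """Rearranged equation 12.4: 2a / sigma_P = 2 * h * sqrt(2 / n)."""
    return 2.0 * math.sqrt(heritability) * math.sqrt(2.0 / n_loci)


# Hypothetical inputs, expressed in units of the phenotypic standard deviation.
sigma_P = 1.0
h2 = 0.5                         # assumed heritability
sigma_A2 = h2 * sigma_P ** 2     # additive variance implied by h2
R = 20.0 * sigma_P               # assumed total range between the limits

n = number_of_loci(R, sigma_A2)
effect = proportionate_effect(n, h2)

# Consistency check against equation 12.3: R should equal 2 * n * a.
a = effect * sigma_P / 2.0
print(n, effect, 2.0 * n * a)
```

With these made-up inputs the sketch gives 100 "loci" and a proportionate effect of about one-fifth of a standard deviation. All the assumptions listed above (gene frequencies of 0.5, no linkage, equal effects) are built into the two functions.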

Let us see what results these theoretical deductions yield when
applied to the experiments quoted in Table 12.1. The estimates of
the number of "loci" and of the proportionate effects of the genes are
given in Table 12.2. Since the estimation of the number of "loci" is
necessarily so imprecise it does not seem worth while to discuss in
detail its limitations or the errors that may have been introduced by
the assumptions that were made. These matters are discussed by
Wright (1952b). The results given in Table 12.2, then, suggest that
the responses to selection in these experiments have resulted from
about 100 loci (i.e. more nearly 100 than 10 or 1,000); and that on the
average the difference in value between homozygotes at one locus
amounts to about one-fifth of the phenotypic standard deviation.

[Table 12.2: number of "loci" and proportionate effect ($2a/\sigma_P$) for
Drosophila (1) abdominal bristles and (2) thorax length, and for
(3) 6-week weight and (4) 60-day weight. Of the numerical entries only
"59" is legible. For references to experiments see Table 12.1.]

Nature of the selection limit. The deductions made in the last 
section from the observed total response were based on the assump- 
tion that the selection limit represents fixation of all favourable 
alleles. The simple theoretical expectation is that selection should 
lead to fixation with the consequent loss of genetic variance. Let us 
now consider the evidence from experiments about the nature of the 
selection limit and see how far it conforms to this simple theoretical 
picture. If the genetic variance declines as the limit is approached
this ought to be apparent in a decline of phenotypic variance. In
many experiments, however, the phenotypic variance has been found
not to decline, even when the selection limit has been reached, and
when due allowance for "scale effects" has been made as will be
explained in Chapter 17. A fairly typical example is provided by the
experiment with mice which was described in the last chapter (Fig.
11.5). The phenotypic variance is shown in Fig. 12.3, expressed in
the form of the coefficient of variation in order to eliminate scale
effects. The variance in the large line remains at the same level
throughout the experiment, and after the limit has been reached at
about the twenty-fifth generation a comparison with the unselected
control shows the variance not to have declined at all. The variance
in the small line shows a sudden and large increase, but we shall
return to this point later. An example from Drosophila is provided
by the experiment on abdominal bristle-number illustrated in Fig.
12.1. The phenotypic variance in the base population and in the most
extreme of the high and of the low lines after 35 and 34 generations
respectively is illustrated by frequency distributions in Fig. 12.4. In
this case the variance not only failed to decline but increased very
much during the selection in both directions.

Fig. 12.3. Coefficient of variation of 6-week weight in mice. The
thin continuous line starting at generation 23 refers to the
unselected control. The broken lines refer to reversed selection and
the dotted lines to suspended selection. (From Falconer, 1955;
reproduced by courtesy of the editor of the Cold Spring Harbor
Symposia on Quantitative Biology.)

Fig. 12.4. Frequency distributions of abdominal bristle number
in Drosophila melanogaster (females), in the base population and in
the most extreme high and low lines after 35 and 34 generations
of selection. (From Clayton, Morris, and Robertson, 1957;
reproduced by courtesy of the authors and the editor of the Journal
of Genetics.)

Before we consider the
reasons for this behaviour of the variance we shall mention another
fact often found in selection experiments. It is that when the response
to continued selection has ceased the population will often respond
to selection in the reverse direction and will often respond rapidly.
This is well illustrated in Fig. 12.2, where the three lines selected for
increased thorax length returned rapidly to the unselected level
when the direction of selection was reversed after the upward
responses had ceased. The lines selected for reduced thorax length,
however, did not respond to reversed selection. From this brief
outline of the evidence it is clear that the simple theoretical picture of
the selection limit is not substantiated by experiment. Instead, we
find, not always but often, no loss of phenotypic variance and the
ability to respond rapidly to reversed selection. Let us now consider
what may be the possible reasons for these facts, and what conclusions
about the genetic nature of the selection limit can be drawn from
them.

1. The failure of the phenotypic variance to decline may be due
to an increase of non-genetic variance compensating for the expected
reduction of genetic variance. With the approach to fixation of the
loci concerned, and of others linked to them, the frequency of
homozygotes will increase. There is evidence, mentioned in Chapter 8
and to be discussed more fully in Chapter 15, that homozygotes are 
sometimes more variable from environmental causes than hetero- 
zygotes. This could cause an increase of environmental variance which 
might counterbalance a reduction of genetic variance; but there is 
little experimental evidence concerning the matter. 

2. If the population, after the selection limit has been reached, 
responds to reversed selection we can only conclude that genetic 
variance of some sort remains. The continued presence of genetic 
variance could result from the following causes: 

(i) We saw in Example 11.6 how natural selection opposed the 
artificial selection for small size in mice, partly because small mice 
are less fertile than large ones and partly because the smallest mice 
were sterile. Natural selection acting in this sort of way may increase 
as the population mean changes further from the original level, until 
it becomes strong enough to counteract completely the artificial 
selection. The response would then cease, but reversed selection 
would be aided by natural selection and the population would respond.

(ii) Selection may favour heterozygotes at some loci. At the 
selection limit the genes would be in equilibrium at more or less 
intermediate frequencies, and they would give rise to genetic vari- 
ance. But the variance would be non-additive, and there would be 
no immediate response to reversed selection. If reversed selection 
were continued a response would slowly develop and become more 
rapid as the gene frequencies changed away from the equilibrium 
values. The behaviour of populations at the selection limit, however, 
does not seem commonly to be of this sort. 

(iii) If there is superiority of heterozygotes arising from the com- 
bined action of artificial and natural selection then the situation is 
quite different. Consider a locus at which the heterozygote $A_1A_2$ is
superior in the character selected to the homozygote $A_1A_1$, and the
homozygote $A_2A_2$ is inviable or sterile. Artificial selection will choose
$A_1A_2$, or perhaps $A_2A_2$ if it is viable, but natural selection will reject
$A_2A_2$, so that under the combined effect of artificial and natural
selection the heterozygote is superior. The pygmy gene in mice
which was used for several examples in Chapter 7 provides just such a
case, when artificial selection is in the direction of small size.
Heterozygotes are favoured because they are smaller than normal
homozygotes; homozygous pygmies are smaller still but are sterile. When the
selection limit is reached under this situation there will be genetic 
variance due to the gene, but no further response. When selection is 
reversed, however, it is only the artificial selection that is reversed in 
direction, and one homozygote will be favoured. The population will 
therefore respond immediately. This may be regarded as an extreme 
form of asymmetrical response to selection. It leads to the anomaly of 
a high heritability — about 50 per cent — estimated from the offspring- 
parent regression, but a realised heritability of zero in one direction 
and up to 100 per cent in the other direction. The anomaly, however, 
is only apparent because the estimation of heritability and the pre- 
diction of the response to selection are valid only if natural selection 
does not interfere with the appearance of the genotypes in their proper 
Mendelian ratios. 
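
How such a locus comes to rest at an intermediate frequency can be illustrated with a short calculation. The sketch below is a deliberate simplification and not the author's own treatment: it folds the artificial and the natural selection together into a single set of constant relative fitnesses, with the heterozygote at 1, the favoured-by-nature homozygote at $1 - s_1$, and the sterile homozygote at $1 - s_2$ with $s_2 = 1$. The value $s_1 = 0.5$ and the starting frequency are hypothetical. Under these assumptions the standard result for heterozygote superiority gives an equilibrium frequency of $s_1/(s_1 + s_2)$ for the allele carried by the sterile homozygote.

```python
def next_frequency(q, s1, s2):
    """One generation of selection at a two-allele locus.

    q  : frequency of allele A2 before selection
    s1 : selection coefficient against A1A1 (relative fitness 1 - s1)
    s2 : selection coefficient against A2A2 (relative fitness 1 - s2)
    The heterozygote A1A2 has relative fitness 1.
    """
    p = 1.0 - q
    mean_fitness = p * p * (1.0 - s1) + 2.0 * p * q + q * q * (1.0 - s2)
    # Frequency of A2 among the gametes of the surviving, fertile parents.
    return (p * q + q * q * (1.0 - s2)) / mean_fitness


s1 = 0.5   # hypothetical net disadvantage of A1A1 under artificial selection
s2 = 1.0   # A2A2 sterile or inviable

q = 0.05   # hypothetical low initial frequency of A2
for _ in range(200):
    q = next_frequency(q, s1, s2)

q_equilibrium = s1 / (s1 + s2)
print(q, q_equilibrium)
```

The frequency settles at the equilibrium value and stays there, so the locus goes on segregating although the line no longer responds. Reversing the artificial selection corresponds, in this simplified picture, to removing the disadvantage of $A_1A_1$, after which $A_2$ is simply selected against.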

The situation described above was proved to exist in one of the 
lines of Drosophila selected for high bristle number in the experiment 
illustrated in Fig. 12.1. There was a gene present which was lethal 
in the homozygote and which in the heterozygote increased bristle 
number by 22, which is 5.8 times the original phenotypic standard
deviation (Clayton and Robertson, 1957). The line carrying this 
gene was the one whose distribution is shown in Fig. 12.4, and the 
bimodality of the distribution can be seen. It seems probable that 
in cases like this the gene does not have so large an effect in the original 
population, but that the effect of the heterozygote is enhanced during 
the selection, either by "modifying" genes or by a cross-over which 
separates a linked gene whose effect is in the opposite direction. A 
mechanism of this sort seems to be required to account for the very 
great increase of variance often found in selected lines (F. W.
Robertson and Reeve, 1952a; Clayton and Robertson, 1957).

The selection of heterozygotes at one or a few loci with major 
effects through the combined action of artificial and natural selection 
in the manner explained above seems to be a common situation in 
Drosophila populations at the selection limit. Whether it occurs as 
frequently in other organisms is not known because the genetic 
analyses required to detect it are more difficult to make. The increase 
of variance in the mice selected for small size shown in Fig. 12.3 may 
well have been due to this cause. 

The deleterious effect on fitness is an essential part of the situa- 
tion, so genes of this sort will always be at low frequencies in the 
initial population. The appearance of any particular gene in a selected
line will therefore depend very much on the chances of sampling, or
on its occurring later by mutation. Consequently such genes will be a 
cause of differences between replicated lines, such as we noted at the 
beginning of this chapter in the experiment on Drosophila bristle 
number, and they will render the selection limit to a large extent 
unpredictable in its level and its precise genetic nature. 

Relevance of selection limits to animal and plant improve- 
ment. It may be thought that experimental studies of long continued 
selection are of little relevance to the practice of selection in animal 
and plant improvement, because the breeder is concerned only with 
the first five or ten generations. This, however, is not necessarily so. 
The breeds of animals and varieties of plants which he seeks to im- 
prove have already been under selection for more or less the same 
characters over a long time. They may therefore by now be approach- 
ing, if they are not already at, the selection limits. An understanding 
of the nature of the selection limit and of the behaviour of populations 
at the selection limit may therefore be very relevant in the field of 



III. Information from Relatives 

In our consideration of selection we have up to now supposed that 
individuals are measured for the character to be selected and that the 
best are chosen to be parents in accordance with the individual pheno- 
typic values. An individual's own phenotypic value, however, is not 
the only source of information about its breeding value; additional 
information is provided by the phenotypic values of relatives, particu- 
larly by those of full or half sibs. With some characters, indeed, the 
values of relatives provide the only available information. Milk- 
yield, to take an obvious example, cannot be measured in males, so 
the breeding value of a male can only be judged from the phenotypic 
values of its female relatives. Ovarian response to gonadotropic 
hormone, a character for which selection has been applied in rats 
(Kyle and Chapman, 1953), cannot be measured on the living animal, 
so selection can only be based on the phenotypic values of female 
relatives. The use of information from relatives is of great importance 
in the application of selection to animal breeding, for two reasons. 
First, the characters to be selected are often ones of low heritability, 
and with these the mean value of a number of relatives often provides 
a more reliable guide to breeding value than the individual's own 
phenotypic value. And, second, when the outcome of selection is a 
matter of economic gain even quite a small improvement of the 
response will repay the extra effort of applying the best technique. 
In this chapter we shall outline the principles underlying the use of 
information from relatives and the choice of the best method of 
selection, but we shall not discuss the technical details of procedure 
in the application of selection to animal breeding. 

Methods of Selection 

If the family structure of the population is taken into account we 
can compute the mean phenotypic value of each family; this is known
as the family mean. Suppose, then, that we have a population in which
the individuals are grouped in families, which may be full or half sibs, 
and we have measurements of each individual and of the means of 
every family. A choice of procedure for applying selection to this 
population is then open, according to the use we make of the family 
means. Let us first look at the problem from the point of view of the 
additional information provided by the values of relatives. Suppose, 
for example, that we have an individual whose own value puts it on 
the border-line between selection and rejection, and it has a number 
of sibs with high values, so that the family to which it belongs has a 
high mean. We may interpret the situation in one of two ways. 
Either we may say that the individual's own rather poor value has 
been due to poor environmental circumstances, and that the high 
family mean suggests that its breeding value is likely to be a good deal 
better than its phenotypic value. Or we may say that the high family 
mean has been due to a favourable common environment, provided 
perhaps by a good mother, from which the individual in question 
must also have benefited; on this interpretation, therefore, the in- 
dividual's breeding value is likely to be less good than its phenotypic 
value. In the first case we should regard the information from the 
relatives as favourable and we should select the individual in question, 
while in the second case we should regard it as unfavourable and 
should reject the individual. Here then is the problem: how do we 
decide which is the correct interpretation? It turns out that only three
things need be known: the kind of family (whether full or half sibs), 
the number of individuals in the families (i.e. the family size), and the 
phenotypic correlation between members of the families with respect 
to the character. The choice of method is thus a relatively simple 
matter in practice. But the explanation of the principles underlying 
the choice is more complicated. Before embarking on this explana- 
tion we shall therefore give a brief general account of the different 
methods of selection according to the use made of the information 
from relatives, indicating the circumstances to which each method 
is specially suited. Then we shall explain how the response expected 
under each method is deduced; and finally we shall compare the 
relative merits of the methods under different circumstances. 

The phenotypic value of an individual, $P$, measured as a deviation
from the population mean, is the sum of two parts: the deviation of
its family mean from the population mean, $P_f$, and the deviation of the
individual from the family mean, $P_w$ (the within-family deviation);
so that

    $P = P_f + P_w$



The procedure of selection, then, varies according to the attention
paid, or the weight given, to these two parts. If we select on the basis
of individual values only, as assumed in the last two chapters, we give
equal weight to the two components $P_f$ and $P_w$ of the individual's
value $P$. This is known as individual selection. We may, alternatively,
select on the basis of the family mean $P_f$ alone, disregarding the
within-family deviation $P_w$ entirely. This is known as family selection
and it corresponds to the procedure adopted in the first case discussed
above. Again, we may select on the basis of the within-family
deviation $P_w$ alone, disregarding the family mean $P_f$ entirely. This is
known as within-family selection and it corresponds to the second case
discussed above. Finally, we may take account of both components
$P_f$ and $P_w$ but give them different weights chosen so as to make the
best use of the two sources of information. This is known as selection
by optimum combination, or combined selection. It represents the
general solution for obtaining the maximum rate of response, and the
other three simpler methods are special cases in which the weights
given to the two sources of information are either 1 or 0. It is
therefore in principle always the best method. But its advantage over one
or other of the simpler methods is never very great, and it is a
refinement that is not often worth while in practice. Beyond showing why
this is so, we shall therefore not give very much attention to combined
selection.
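
The partition of a phenotypic value into its two parts, and the way the three simpler methods arise from weights of 1 or 0, can be sketched as follows. This is an illustration only: the function names and the nine measurements (three families of three) are hypothetical, and the weights for combined selection are not worked out here.

```python
def decompose(families):
    """Split every value into (P_f, P_w), both as deviations.

    families : list of families, each a list of phenotypic values.
    P_f is the family mean minus the population mean;
    P_w is the individual value minus its family mean.
    """
    all_values = [v for fam in families for v in fam]
    pop_mean = sum(all_values) / len(all_values)
    parts = []
    for fam in families:
        fam_mean = sum(fam) / len(fam)
        for v in fam:
            parts.append((fam_mean - pop_mean, v - fam_mean))
    return parts


def criterion(parts, w_f, w_w):
    """Selection criterion: weighted sum of the two components."""
    return [w_f * p_f + w_w * p_w for p_f, p_w in parts]


# Hypothetical measurements: three families of three individuals.
families = [[11.0, 10.0, 9.0], [7.0, 6.0, 5.0], [3.0, 2.0, 1.0]]
parts = decompose(families)

individual = criterion(parts, 1, 1)   # individual selection: P_f + P_w = P
family = criterion(parts, 1, 0)       # family selection: P_f only
within = criterion(parts, 0, 1)       # within-family selection: P_w only
print(individual, family, within)
```

Ranking the individuals on each list in turn shows how the same measurements lead to different choices of parents under the three methods.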

The salient features of the three simpler methods are as follows, 
the differences of procedure between them being illustrated
diagrammatically in Fig. 13.1.

Individual selection. Individuals are selected solely in accord- 
ance with their own phenotypic values. This method is usually the 
simplest to operate and in many circumstances it yields the most rapid 
response. It should therefore be used unless there are good reasons 
for preferring another method. Mass selection is a term often used for 
individual selection, especially when the selected individuals are put 
together en masse for mating, as for example Drosophila in a bottle. 
The term individual selection is used more specifically when the 
matings are controlled or recorded, as with mice or larger animals. 

Family selection. Whole families are selected or rejected as
units according to the mean phenotypic value of the family.
Individual values are thus not acted on except in so far as they determine
the family mean. In other words the within-family deviations are 
given zero weight. The families may be of full sibs or half sibs, families 
of more remote relationship being of little practical significance. The 
use of full-sib families is dependent on a high reproductive rate and 
with slow-breeding organisms half sibs must generally be used. 

Fig. 13.1. Diagram to illustrate the different methods of selection.
The dots and circles represent individuals plotted on a
vertical scale of merit, those with the best measurements being at
the top. The individuals to be selected are those shown as dots.
There are 5 families each with 5 individuals; (a), (b), and (c) show
identical arrangements of the same 25 individuals. The families 
are separated laterally, with the individuals of each family placed 
one above the other. The mean of each family is shown by a cross- 
bar. The situation in which within-family selection is most useful 
is shown in (d), where the variation between families is very great 
in comparison with the variation within families. (Redrawn from 
Falconer, 1957a.) 

The chief circumstance under which family selection is to be pre- 
ferred is when the character selected has a low heritability. The 
efficacy of family selection rests on the fact that the environmental 
deviations of the individuals tend to cancel each other out in the mean
value of the family. So the phenotypic mean of the family comes close
to being a measure of its genotypic mean, and the advantage gained is 
greater when environmental deviations constitute a large part of the 
phenotypic variance, or in other words when the heritability is low. 
On the other hand, environmental variation common to members of a 
family impairs the efficacy of family selection. If this component is 
large, as illustrated in Fig. 13.1 (d), it will tend to swamp the genetic
differences between families and family selection will be corre- 
spondingly ineffective. Another important factor in the efficacy of 
family selection is the number of individuals in the families, or the 
family size. The larger the family the closer is the correspondence 
between mean phenotypic value and mean genotypic value. So the 
conditions that favour family selection are low heritability, little 
variation due to common environment, and large families. 
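
The effect of family size and of common environment on the family mean can be put in numbers. The sketch below rests on an assumption not spelled out in this passage: that the environmental variance can be split into a part common to all members of a family and a part that varies independently within families. On that assumption the independent part is divided by the family size in the variance of the family mean, while the common part is not reduced at all. The function name, the argument names and the variances used are hypothetical.

```python
def env_variance_of_family_mean(v_common, v_within, family_size):
    """Environmental variance of a family mean.

    v_common    : environmental variance common to members of a family
    v_within    : environmental variance within families, assumed independent
    family_size : number of individuals measured in each family
    """
    return v_common + v_within / family_size


# Hypothetical variances, in arbitrary squared units.
for n in (1, 4, 16):
    no_common = env_variance_of_family_mean(0.0, 8.0, n)
    with_common = env_variance_of_family_mean(2.0, 8.0, n)
    print(n, no_common, with_common)
```

Without common environment the environmental variance of the mean falls steadily as the family grows. With it, the variance cannot fall below the common component however large the family, which is why common environment impairs family selection.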

There are practical difficulties in the application of family selec- 
tion, particularly in laboratory populations. They arise from the 
conflict between the intensity of selection and the avoidance of in- 
breeding. It is generally desirable to keep the rate of inbreeding as 
low as possible. If the minimum number of parents is fixed by con- 
siderations of inbreeding — say at ten pairs — then under family 
selection ten families must be selected, since each family represents 
only one pair of parents in the previous generation. And, if a reason- 
ably high intensity of selection is to be achieved, the number of 
families bred and measured must be perhaps twice to four times this 
number. Family selection is thus costly of space, and if breeding space 
is limited the intensity of selection that can be achieved under family 
selection may be quite small. The two following methods are variants 
of family selection. 

Sib selection. Some characters, we have already noted, cannot 
be measured on the individuals that are to be used as parents, and 
selection can only be based on the values of relatives. This amounts 
to family selection but with the difference that now the selected indi- 
viduals have not contributed to the estimate of their family mean. 
The difference affects the way in which the response is influenced by 
family size. Where the distinction is of consequence we shall use the 
term sib selection when the selected individuals are not measured and 
family selection when they are measured and included in the family 
mean. When families are very large the two methods are equivalent, 
and the term family selection is then to be understood to cover both. 

Progeny testing is a method of selection widely applied in animal
breeding. We shall not discuss it in detail, except in so far as it
can be treated as a form of family selection. The criterion of selection, 
as the name implies, is the mean value of an individual's progeny. 
At first sight this might seem to be the ideal method of selection and 
the easiest to evaluate because, as we saw in Chapter 7, the mean 
value of an individual's offspring comes as near as we can get to a 
direct measure of its breeding value, and is in fact the operational 
definition of breeding value. In practice, however, it suffers from the 
serious drawback of a much lengthened generation interval, because 
the selection of the parents cannot be carried out until the offspring 
have been measured. The evaluation of selection by progeny testing 
is apt to be rather confusing because of the inevitable overlapping of 
generations, and because of a possible ambiguity about which genera- 
tion is being selected, the parents or the progeny. The progeny, 
whose mean is used to judge the parents, are ready to be used as 
parents just when the parents have been tested and await selection. 
Thus both the selected parents and their progeny are used con- 
currently as parents. The difficulty of interpretation may be partially 
overcome by regarding progeny testing as a modified form of family 
selection. The progenies are families, usually of half sibs, and selec- 
tion is made between them on the basis of the family means in the 
manner described above. The only difference is that the selected 
families are increased in size by allowing their parents to go on breed- 
ing. The additional, younger, members of the families do not con- 
tribute to the estimates of the family means and are therefore selected 
by sib selection. Increasing the size of the selected families by un- 
measured individuals does not improve the accuracy of the selection, 
but it reduces the replacement rate and so increases the intensity of 
selection that can be achieved. This is the principal advantage of 
progeny testing, but it can only be realised in operations on a large 
scale, when the danger of inbreeding is not introduced by limitation 
of space. 

Within-family selection. The criterion of selection is the 
deviation of each individual from the mean value of the family to 
which it belongs, those that exceed their family mean by the greatest 
amount being regarded as the most desirable. This is the reverse of 
family selection, the family means being given zero weight. The chief 
condition under which this method has an advantage over the others 
is a large component of environmental variance common to members 
of a family. Fig. 13.1 (d) shows how within-family selection would be
applied in this situation. Pre-weaning growth of pigs or mice might
be cited as examples of such a character. A large part of the variation 
of individuals' weaning weights is attributable to the mother and is 
therefore common to members of a family. Selection within families 
would eliminate this large non-genetic component from the variation 
operated on by selection. An important practical advantage of selec- 
tion within families, especially in laboratory experiments, is that it 
economises breeding space, for the same reason that family selection 
is costly of space. If single-pair matings are to be made, then two 
members of every family must be selected in order to replace the 
parents. This means that every family contributes equally to the 
parents of the next generation, a system that we saw in Chapter 4 
renders the effective population size twice the actual. Thus when 
selection within families is practised, the breeding space required to 
keep the rate of inbreeding below a certain value is only half as great 
as would be required under individual selection. 
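
The saving of breeding space can be checked with a rough calculation. The sketch below uses the usual approximation for the effective population size when family size varies, $N_e = 4N/(2 + V_k)$, where $V_k$ is the variance of the number of offspring per parent that themselves become parents, together with the rate of inbreeding $1/(2N_e)$. The formula is quoted here from general knowledge rather than from this section, and the comparison case assumes that family size varies at random ($V_k = 2$), which is an idealisation of what happens under individual selection. The function names are our own.

```python
def effective_size(n_parents, family_size_variance):
    """Approximate effective population size, N_e = 4N / (2 + V_k)."""
    return 4.0 * n_parents / (2.0 + family_size_variance)


def inbreeding_rate(n_e):
    """Rate of inbreeding per generation, 1 / (2 * N_e)."""
    return 1.0 / (2.0 * n_e)


N = 20  # ten pairs of parents, as in the example given for family selection

ne_equal = effective_size(N, 0.0)    # every family contributes two parents
ne_random = effective_size(N, 2.0)   # assumed random variation of family size

print(ne_equal, inbreeding_rate(ne_equal))
print(ne_random, inbreeding_rate(ne_random))
```

With equal contributions the effective size is twice the actual number of parents, so half as many pairs give the same rate of inbreeding as the full number would under the assumed random variation of family size.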

Expected Response 

To evaluate the relative merits of the different methods of selec- 
tion we have to deduce the response expected from each. There is 
nothing to be added here about individual selection to what was said 
in Chapter 11. The expected response was given in equation 11.3 as
$R = i\sigma_P h^2$, where $i$ is the intensity of selection (i.e. the selection
differential in standard deviations), $\sigma_P$ is the standard deviation, and
$h^2$ the heritability, of the phenotypic values of individuals. The
response expected under family selection or within-family selection
is arrived at in an analogous manner. Under family selection, the
criterion of selection is the mean phenotypic value of the members of
a family, so the expected response to family selection is

    $R_f = i\sigma_f h_f^2$    (13.2)

where $i$ is the intensity of selection, $\sigma_f$ is the observed standard
deviation of family means, and $h_f^2$ is the heritability of family means.
In the same way the expected response to within-family selection is

    $R_w = i\sigma_w h_w^2$

where $\sigma_w$ is the standard deviation, and $h_w^2$ the heritability, of
within-family deviations.
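
Since the three expressions have the same form, one small function serves for all of them. The following sketch is illustrative only: the function name is our own and the intensity, standard deviations and heritabilities are hypothetical values, not estimates from any experiment.

```python
def expected_response(intensity, standard_deviation, heritability):
    """Expected response, R = i * sigma * h^2.

    The standard deviation and the heritability must both refer to the
    quantity on which selection is based: individual values, family
    means, or within-family deviations.
    """
    return intensity * standard_deviation * heritability


i = 1.4  # hypothetical intensity of selection, taken as the same throughout

# Hypothetical standard deviations and heritabilities for each criterion.
R_individual = expected_response(i, 2.0, 0.25)
R_family = expected_response(i, 1.0, 0.40)
R_within = expected_response(i, 1.7, 0.15)

print(R_individual, R_family, R_within)
```

In practice the standard deviation and heritability appropriate to family means and to within-family deviations are not free inputs: they depend on the individual heritability, the kind of family, the family size and the phenotypic correlation between family members.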


The concept of heritability applied to family means or to within-
family deviations introduces no new principle. It is simply the
proportion of the phenotypic variance of these quantities that is made
up of additive genetic variance. These heritabilities can be expressed
in terms of the heritability of individual values (which we shall
continue to refer to simply as the heritability, with symbol $h^2$), the
phenotypic correlation between members of families, and the number of
individuals in the families, all of which can be estimated by
observation. To arrive at the appropriate expressions we have to consider
again how the observational components of variance are made up of
the causal components, as explained in Chapters 9 and 10 (see in
particular Tables 9.4 and 10.4). First let us simplify matters by
supposing that all families contain a large number of individuals, so
that the means of all families are estimated without error. Consider
first the phenotypic variance. The intra-class correlation, $t$, between
members of families is the between-group component divided by the
total variance: $t = \sigma_B^2/\sigma_T^2$. Therefore the between-group component
can be expressed as $\sigma_B^2 = t\sigma_T^2$, and the within-group component as
$\sigma_W^2 = (1 - t)\sigma_T^2$. This expresses the partitioning of the phenotypic
variance into its observational components. The total variance,
written here as $\sigma_T^2$, is the phenotypic variance which we shall write
as $V_P$ in the context of causal components. Now, the partitioning of
the additive variance between and within families can be expressed
in the same way, in terms of the correlation of breeding values, for
which we shall use the symbol $r$. (The meaning of this correlation
will be explained in a moment.) Thus the additive variance between
families is $rV_A$ and the additive variance within families is $(1 - r)V_A$.
The dual partitioning is summarised in Table 13.1.

Table 13. i 

Partitioning of the variance between and within families of 
large size. 

Observational component Additive variance Phenotypic variance 

Between families, 0% rV A tV P 

Within families, al (i-r)V A (i-t)V P 

This partitioning of both the additive and the phenotypic variance 
leads at once to the heritabilities of family means and of within- 
family deviations, since these heritabilities are simply the ratios of 
the additive variance to the phenotypic variance. Thus, when the
families are large, the heritability of family means is $rV_A/tV_P$, or
$(r/t)h^2$, since $V_A/V_P$ is the heritability of individual values, $h^2$.

The correlation of breeding values between members of families
is a measure of the degree of relationship, usually called the
"coefficient of relationship." The correlation between the breeding
values of relatives in a random-mating population is twice their
coancestry:

$$r = 2f$$

that is to say, twice the inbreeding coefficient of their progeny if the
relatives were mated together. Its values in full-sib and half-sib
families can be seen from Table 9.4; for full sibs it is $\tfrac{1}{2}$ and for
half sibs it is $\tfrac{1}{4}$. In order to be able to discuss full-sib and
half-sib families at the same time in what follows, we shall retain the
symbol r in the formulae instead of inserting the appropriate values of
$\tfrac{1}{2}$ or $\tfrac{1}{4}$.

The foregoing account of the heritabilities of family means and 
within-family deviations was simplified by the supposition of large 
families. This simplification is not justified in practice and we must 
now remove it by considering families of finite size. We shall, how- 
ever, suppose that all families are of equal size. The number of 
individuals in a family — called the family size — has to be taken into 
consideration for the following reason. If selection is based on the 
family mean, or on the deviations from the family mean, then it is the 
observed mean that we are concerned with and not the true mean. In 
other words we are not concerned with the observational components 
of variance which we have hitherto discussed, but with the variance of 
the observed means and of the observed within-family deviations. 
The observed means of groups are subject to sampling variance which 
comes from the within-group variance. If there are n individuals in a 
group then the sampling variance of the group-mean is $(1/n)\sigma_W^2$, where
$\sigma_W^2$ is the component of variance within the group. Thus the variance
of observed group-means is augmented by $(1/n)\sigma_W^2$, and the variance of
observed deviations within groups is correspondingly diminished by
the same amount. The observed variances, with family size n, are
therefore made up of the observational components as shown in
Table 13.2. The causal components entering into the observed
variances can now be found by translating the observational
components into causal components from Table 13.1. They are shown in
the two right-hand columns of Table 13.2.

Table 13.2
Composition of observed variances with families of size n.

                               Observational                           Causal components
Observed variance              components                              Additive                     Phenotypic

of family means                $\sigma_B^2 + \frac{1}{n}\sigma_W^2$    $\frac{1+(n-1)r}{n}V_A$      $\frac{1+(n-1)t}{n}V_P$

of within-family deviations    $\frac{n-1}{n}\sigma_W^2$               $\frac{(n-1)(1-r)}{n}V_A$    $\frac{(n-1)(1-t)}{n}V_P$

To find the heritabilities of family means and of within-family 
deviations we have only to divide the additive component by the 
phenotypic component of the observed variances. Thus the
heritability of family means is

$$h_f^2 = \frac{1+(n-1)r}{1+(n-1)t}\,h^2$$

and the heritability of within-family deviations is

$$h_w^2 = \frac{1-r}{1-t}\,h^2$$
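
As a numerical illustration of these two expressions, the following Python sketch evaluates them for one family structure. The function names and the sample values (full sibs, a heritability of 0.2, families of 8, and no common-environment variance, so that t = r·h²) are assumptions chosen for illustration and do not come from the text.

```python
def heritability_family_means(h2, r, t, n):
    """h_f^2 = h^2 * [1 + (n-1) r] / [1 + (n-1) t]."""
    return h2 * (1 + (n - 1) * r) / (1 + (n - 1) * t)


def heritability_within_family(h2, r, t):
    """h_w^2 = h^2 * (1 - r) / (1 - t)."""
    return h2 * (1 - r) / (1 - t)


if __name__ == "__main__":
    # Hypothetical values: full sibs (r = 1/2), h^2 = 0.2, families of 8.
    h2, r, n = 0.2, 0.5, 8
    t = r * h2  # assumes no variance due to common environment
    print("family means:      ", heritability_family_means(h2, r, t, n))
    print("within-family dev.:", heritability_within_family(h2, r, t))
```

With families of one individual the first function returns the individual heritability, and as n becomes very large it approaches the large-family value (r/t)h².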

At this point sib selection has to be distinguished from family 
selection. The foregoing account referred to family selection where 
the individuals to be selected were themselves measured and contributed 
to the observed family mean. Sib selection differs in that the individuals 
selected are not measured. This does not affect the phenotypic com- 
ponent, because this is simply the observed variance of what is 
measured. But it does affect the additive component, because the 
mean breeding value with which we are concerned is not that of the 
individuals whose phenotypic values have been measured, but of 
others that have not been measured. Therefore the appropriate 
variance of mean breeding values is simply the between-family com- 
ponent of additive variance, rV A , irrespective of the number of other 
individuals that have been measured. The heritability of family 
means appropriate to sib selection is therefore 

$$h_s^2 = \frac{nr}{1+(n-1)t}\,h^2$$

The heritabilities of the different methods of selection, whose deriva- 
tions have now been explained, are listed in Table 13.3. 

Table 13.3

Heritability and expected response under different methods
of selection.

Method of
selection        Heritability                                 Expected response

Individual       $h^2$                                        $R = i\sigma_P h^2$

Family           $h_f^2 = h^2\,\frac{1+(n-1)r}{1+(n-1)t}$     $R_f = i\sigma_P h^2 \cdot \frac{1+(n-1)r}{\sqrt{n[1+(n-1)t]}}$

Sib              $h_s^2 = h^2\,\frac{nr}{1+(n-1)t}$           $R_s = i\sigma_P h^2 \cdot \frac{nr}{\sqrt{n[1+(n-1)t]}}$

Within-family    $h_w^2 = h^2\,\frac{1-r}{1-t}$               $R_w = i\sigma_P h^2 \cdot (1-r)\sqrt{\frac{n-1}{n(1-t)}}$

Combined                                                      $R_c = i\sigma_P h^2 \cdot \sqrt{1 + \frac{(r-t)^2}{1-t}\cdot\frac{n-1}{1+(n-1)t}}$

i = intensity of selection (selection differential in standard measure):
    assumed to be equal for all methods, but not necessarily so.
$\sigma_P$ = standard deviation in phenotypic values of individuals.
$h^2$ = heritability of individual values.
r: with full-sib families, $r = \tfrac{1}{2}$
   with half-sib families, $r = \tfrac{1}{4}$
t = correlation of phenotypic values of members of the families.
n = number of individuals in the families.

To deduce the expected response is now a simple matter. Let us
take family selection for illustration. The expected response was
given in equation 13.2 as

$$R_f = i\sigma_f h_f^2$$

where $\sigma_f$ is the standard deviation of observed family means. This
expression, however, is not much use as it stands, because it does not
readily allow a comparison to be made with the other methods. It
will be most convenient to cast it into a form that facilitates
comparison with individual selection. This can be done by substituting
the expression for the heritability of family means, $h_f^2$, given above,
and by putting the standard deviation of observed family means, $\sigma_f$, in
terms of the standard deviation of individual phenotypic values,
$\sigma_P$ ($= \sqrt{V_P}$), from the right-hand column of Table 13.2. The expected
response then becomes

$$R_f = i\,\sigma_P\sqrt{\frac{1+(n-1)t}{n}} \times h^2\,\frac{1+(n-1)r}{1+(n-1)t}$$

which reduces to

$$R_f = i\sigma_P h^2\left[\frac{1+(n-1)r}{\sqrt{n\{1+(n-1)t\}}}\right]$$

The term $i\sigma_P h^2$ is equivalent to the expected response under
individual selection, so the expression within the square brackets is the
factor that compares family selection with individual selection. The
expression looks very complicated but it contains only three simple
quantities: n, which is the family size; r, which is $\tfrac{1}{2}$ for full-sib and
$\tfrac{1}{4}$ for half-sib families; and t, which is the phenotypic intra-class
correlation.

The expected responses under the different methods of selection 
are listed in Table 13.3, all expressed in this manner which allows 
the comparisons to be made with individual selection. The relative 
merits of the different methods will be discussed in the next section: 
first we must deal with combined selection. 
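
The comparison factors of Table 13.3 are easy to evaluate numerically. The sketch below is only an illustration: the function name, the dictionary keys and the sample values of r, t and n are assumptions, and the intensity of selection is taken to be the same for every method, as in the table.

```python
import math


def relative_responses(r, t, n):
    """Expected response of each method as a multiple of i*sigma_P*h^2,
    the response to individual selection (factors from Table 13.3)."""
    b = 1 + (n - 1) * t
    return {
        "individual": 1.0,
        "family": (1 + (n - 1) * r) / math.sqrt(n * b),
        "sib": n * r / math.sqrt(n * b),
        "within": (1 - r) * math.sqrt((n - 1) / (n * (1 - t))),
        "combined": math.sqrt(1 + (r - t) ** 2 / (1 - t) * (n - 1) / b),
    }


if __name__ == "__main__":
    # Hypothetical case: full-sib families of 8 with a low correlation.
    for method, factor in relative_responses(r=0.5, t=0.1, n=8).items():
        print(f"{method:10s} {factor:.3f}")
```

Values above 1 indicate a method expected to do better than individual selection under the stated assumption of equal intensity.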

Combined selection. We shall deal very briefly with combined 
selection, referring the reader to Lush (1947), Lerner (1950) and 
A. Robertson (1955a) for details. First we have to find what are the 
appropriate weighting factors to be used in its application. We saw 
before that the phenotypic value of an individual is made up of two 
parts, the family mean and the within-family deviation, $P = P_f + P_w$,
and that each part gives some information about the individual's
breeding value. In Chapter 10 we saw that the heritability is
equivalent to the regression of breeding value on phenotypic value
(equation 10.2), so that the best estimate of an individual's breeding
value to be derived from its phenotypic value is $h^2P$. This idea can
be applied separately to the two parts of the phenotypic value, since
these are uncorrelated and supply independent information about the
breeding value. Therefore, taking both parts of the phenotypic value
into account, the best estimate of an individual's breeding value is
given by the multiple regression equation

$$\text{expected breeding value} = h_f^2P_f + h_w^2P_w$$

($P_f$ being measured as a deviation from the population mean, and $P_w$
as a deviation from the family mean). The weighting factors that
make the most efficient use of the two sources of information are
therefore the two heritabilities, appropriate to family means and to
within-family deviations respectively. The criterion of selection
under combined selection is thus an index, I, in the form

$$I = h_f^2P_f + h_w^2P_w$$


If the values of the heritabilities are inserted from Table 13.3 it will
be seen that the term $h^2$ is common to both weighting factors, and
this term may therefore be omitted without affecting the relative
weighting. We then have an index for the computation of which only
n, r, and t need be known. In practice it is more convenient to work
with the individual values in place of the within-family deviations,
and to assign them a weight of 1. The family mean is thus used in the
manner of a correction, supplementing the information provided by
the individual itself. Rearrangement of the appropriate weighting
factor for the family mean leads to an index made up as follows (Lush,
1947):

$$I = P + \left[\frac{r-t}{1-r}\cdot\frac{n}{1+(n-1)t}\right]P_f \qquad (13.6)$$

where P is the individual value and $P_f$ the family mean, in which the
individual itself is included.

This solution of the problem of how we can best make use of the 
information provided by relatives is now cast in precisely the form 
in which the problem was introduced at the beginning of this 
chapter. The expression in the square brackets in equation 13.6,
which contains nothing but easily measurable quantities, shows how 
we can best use the family mean to supplement the individual values 
in making the selection. 

The expected response to combined selection, cast in a form 
suitable for comparison with individual selection, is given at the foot 
of Table 13.3. For its derivation see Lush (1947). 
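
The index of equation 13.6 can be sketched in a few lines of Python. The function names and the sample family are hypothetical, and the phenotypic values are assumed to be expressed as deviations from the population mean.

```python
def family_mean_weight(r, t, n):
    """Weight given to the family mean in equation 13.6;
    the individual value itself has a weight of 1."""
    return (r - t) / (1 - r) * n / (1 + (n - 1) * t)


def combined_index(values, r, t):
    """Index I = P + w * P_f for every member of one family.

    `values` are the phenotypic values of the family members, taken
    here as deviations from the population mean (an assumption)."""
    n = len(values)
    family_mean = sum(values) / n
    w = family_mean_weight(r, t, n)
    return [p + w * family_mean for p in values]


if __name__ == "__main__":
    # Hypothetical full-sib family of four, with t = 0.2.
    print(combined_index([2.0, -1.0, 0.5, 1.5], r=0.5, t=0.2))
```

The weight is positive when t is less than r, zero when t equals r (the index then reduces to individual selection), and negative when t exceeds r, in which case a high family mean counts against the individual.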

Relative Merits of the Methods 

The formulae for the expected responses that we have derived 
enable us to compare one method of selection with another and dis- 
cover what are the conditions that determine the choice of the best 
method. Before making detailed comparisons let us note the reason 
for individual selection being usually better than either family
selection or within-family selection. The reason is that the standard
deviations of family means and of within-family deviations are both
bound to be less than the standard deviation of individual values; 
and the standard deviation of the criterion of selection is one of the 
factors governing the response. If we compare, for example, family 
selection with individual selection by writing the expected responses 
in the form 

        $R = i\sigma_P h^2$         (for individual selection)
and     $R_f = i\sigma_f h_f^2$     (for family selection)

then it is clear that family selection cannot be better than individual
selection unless the heritability of family means, $h_f^2$, is greater than
the heritability of individual values, $h^2$, by an amount great enough
to counterbalance the lower standard deviation of family means. 
And the same applies to within-family selection. 

A general picture of the circumstances that make one method 
better than another can best be obtained from graphical representa- 
tions of the relative responses: that is, the response expected from 
one method expressed as a proportion of the response expected from 
another, the expected responses being taken from Table 13.3. In 
making these comparisons we shall assume that the intensity of 
selection is the same for all methods. Though not necessarily true, 
this simplification is unavoidable because no generalisation can be 
made about the proportions selected under the different methods. 
We shall make the comparisons separately for full-sib families ($r = \tfrac{1}{2}$)
and for half-sib families ($r = \tfrac{1}{4}$). Then the relative responses depend
only on two factors, the family size, n, and the intra-class correlation
of phenotypic values, t. If there is no variance due to common
environment contributing to the variance of family means, then the
correlation in full-sib families is equal to half the heritability, and that 
in half-sib families to one quarter of the heritability. This lets us see 
in a general way how the heritability of the character influences the 
relative response. It is, however, the correlation and not the herit- 
ability that is the determining factor, so only the correlation need be 
known when a choice of method is to be made. 

Fig. 13.2. Relative merits of the different methods of selection,
with full-sib families. Responses relative to that for combined
selection plotted against the phenotypic intra-class correlation, t.
(a) n = ∞; (b) n = 2.
I = individual selection; F = family selection; W = within-family
selection.

Fig. 13.2 gives a general picture of all the methods, showing how
their relative merits depend on the phenotypic correlation. The
graphs refer only to full-sib families and only to the two extremes of
family size: infinitely large families in (a) and families of 2 in (b).
The comparisons are made here with combined selection since this is
necessarily the method that gives the greatest response. The graphs
therefore show the ratio of the response for each method to that for
combined selection: e.g. for family selection, the ratio $R_f/R_c$. The
general picture indicated by the graphs is as follows. The relative
merit of individual selection is greatest when the correlation is 0.5
and falls off as the correlation drops below or rises above this value.
The relative merit of family selection is greatest when the correlation
is low, and that of within-family selection when the correlation is
high. Now, a low correlation between sibs can only result from a
character of low heritability, and with very little variance due to 
common environment. These therefore are the circumstances that 
favour family selection. A high correlation can only result from a 
large amount of variance due to common environment. Even if the 
heritability were 100 per cent the correlation between full sibs could 
not exceed 0.5 without augmentation by common environment. A
large amount of variation due to common environment is therefore 
the circumstance that favours within-family selection. We shall 
examine the three simpler methods in more detail in a moment. 
First let us look at what may be gained from combined selection. 
Though combined selection is always as good as or superior to any 
other method, its superiority is never very great. With large families 
its superiority is greatest when the correlation is close to 0.25 or 0.75,
but even then its superiority is not much more than 10 per cent.
With families of 2 its superiority reaches 20 per cent when the
correlation is 0.875. Thus the range of circumstances under which
combined selection is more than a few per cent better than one or 
other of the simpler methods is very narrow. In general, therefore, 
there is little to be gained from the extra trouble of applying combined 
selection, and we shall not give it any further consideration. 

Let us now examine the simpler methods in more detail. The 
most useful comparison to make now is with individual selection. 
The expected responses will therefore be expressed as a proportion 
of the response to individual selection. We shall examine each method 
in turn, commenting on the special questions that arise in connexion 
with each. 

Family selection. Fig. 13.3 shows the relative response $R_f/R$
plotted against the family size, n, for full-sib families in (a) and
for half-sib families in (b). These graphs therefore show primarily the
effect of family size on the relative merit of family selection, but the
magnitude of the correlation, t, is taken into account by separate
curves for different correlations. Only the circumstances when family
selection is superior to individual selection are shown on the graphs.
The chief points made clear by the graphs are these. (i) As we saw
from Fig. 13.2, there is a critical value of the correlation, above which
family selection cannot be superior to individual selection. From the
expected responses in Table 13.3 it is easy to show that when the
families are large the relative response expected is $R_f/R = r/\sqrt{t}$. So,
with large families, family selection becomes superior to individual
selection when r exceeds $\sqrt{t}$. The critical value of the correlation, t,
depends a little on the family size and differs between full-sib and
half-sib families. Family selection with full sibs is very little better
than individual selection unless the correlation is below 0.2; and with
half sibs unless it is below 0.05. (ii) The effect of family size is greatest
when the correlation is low. Therefore there is little to be gained
from very large families unless the correlation is well below the critical
value. There is, however, another consideration in connexion with
the family size which will be explained later. (iii) Finally, there is the
question whether full sibs or half sibs are to be preferred for family
selection. This depends so much on the special circumstances that
general conclusions cannot be drawn. From the graphs it would
appear that full sibs must always be better than half sibs. But the
full-sib correlation is more likely to be increased by common
environment, and full-sib families are likely to be a good deal smaller
than half-sib families. Both these factors work in favour of half-sib
families. It has been shown that in selection for egg-production in
poultry the factor of family size makes half-sib families superior to
full sibs (Osborne, 1957a).
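
The critical correlation mentioned under point (i) can be worked out from the family-selection factor in Table 13.3 by setting it equal to 1 and solving for t. The following sketch does this; the function names are hypothetical, and equal intensity of selection under the two methods is assumed.

```python
import math


def family_relative_response(r, t, n):
    """R_f / R from Table 13.3, assuming equal intensity of selection."""
    return (1 + (n - 1) * r) / math.sqrt(n * (1 + (n - 1) * t))


def critical_correlation(r, n):
    """Value of t at which R_f / R = 1 for families of size n (n > 1).

    Obtained by solving [1 + (n-1) r]^2 = n [1 + (n-1) t] for t.
    A negative result means that family selection cannot be superior
    to individual selection at that family size."""
    return ((1 + (n - 1) * r) ** 2 - n) / (n * (n - 1))


if __name__ == "__main__":
    for n in (2, 4, 8, 30, 1000):
        print(n, critical_correlation(0.5, n), critical_correlation(0.25, n))
```

As the family size increases the critical value approaches r², in agreement with the large-family condition that r must exceed √t.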



Fig. 13.3. Responses expected under family selection relative to
that for individual selection, plotted against family size. The
separate curves refer to different values of the phenotypic
correlation, t, as indicated. The corresponding values of the
heritability, $h^2$, in the absence of variation due to common environment,
are also given. (a) full-sib families; (b) half-sib families.

Sib selection. The use of this method is usually dictated by 
necessity rather than by choice, and comparisons with other methods 
are of less interest. The chief practical question that arises concerns
the family size: how many sibs should be measured? Or, how far is it
worth while increasing family size? The effect of family size on the
response to sib selection is shown in Fig. 13.4. The graphs show the
response with family size n as a percentage of the response with
infinitely large families, which would be the maximum possible
response. The graphs are valid for both full and half sibs. Again the
effect of increasing family size is greatest when the correlation is low.
But with sib selection as with family selection there is another
consideration to be taken into account in connexion with the family size,
which will now be explained.

Fig. 13.4. Effect of family size on the response to sib selection,
with either full- or half-sib families. The expected response is
shown as a percentage of the response with infinitely large families.
The separate curves refer to different values of the phenotypic
correlation, t, as indicated.
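
The quantity plotted in Fig. 13.4 follows from the sib-selection response in Table 13.3: dividing the response with families of size n by its limit for very large families, r/√t, gives √{nt/[1+(n-1)t]}, in which r has cancelled. A minimal sketch, with a hypothetical function name and sample values, is:

```python
import math


def sib_selection_percent_of_maximum(t, n):
    """Response to sib selection with families of size n, as a percentage
    of the response with infinitely large families.

    The ratio is sqrt(n t / [1 + (n-1) t]); r cancels, which is why the
    same curves serve for full and half sibs."""
    return 100 * math.sqrt(n * t / (1 + (n - 1) * t))


if __name__ == "__main__":
    for t in (0.05, 0.25, 0.5):  # illustrative correlations
        row = [round(sib_selection_percent_of_maximum(t, n), 1) for n in (2, 5, 10, 30)]
        print(t, row)
```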


Optimal family size. Though the graphs suggest that the larger 
the family size the greater will be the response, under both family 
selection and sib selection, this is not so in practice because the in- 
tensity of selection is involved as a factor in the following way. In 
practice there is always a limitation on the amount of breeding space 
or facilities for measurement. The total available space can be filled 
with a large number of small families, or with a small number of large 
families. Considerations of inbreeding set a lower limit to the number 
of families that will be selected, so the larger the number of families 
measured the greater will be the intensity of selection. Therefore 
there is a conflict of advantage between the size of the families and 
the intensity of selection: large families lead to a lower intensity of 
selection. When the intensity of selection is taken into consideration 
it turns out that there is an optimal family size which gives the 
greatest expected response. The optimal family size with half-sib
families can be found approximately from the following simple
formula (A. Robertson, 1957b):

$$n = 0.56\sqrt{\frac{T}{Nh^2}} \qquad (13.7)$$

where n is the optimal family size, T is the total number of individuals
that can be accommodated and measured, N is the number of families
to be selected, and $h^2$ is the heritability of the character.

Within-family selection. Fig. 13.5 shows the relative response,
$R_w/R$, for within-family selection applied to full-sib families.
Half-sib families need not be considered since the method is unlikely to be
applied to them. The graphs show primarily the effect of the
phenotypic correlation, t, on the response. Four graphs are given
representing family sizes between 2 and 30, and it can be seen that the
family size does not have a great effect. The relative response when
the families are very large can be shown from the expected responses
given in Table 13.3 to be $R_w/R = (1-r)/\sqrt{1-t}$. So, with large
families, within-family selection will be superior to individual
selection if $(1-r)$ exceeds $\sqrt{1-t}$. The graphs in Fig. 13.5 show that the
correlation, t, in full-sib families would have to exceed about 0.75 to
0.85, according to the family size. Correlations as high as this cannot
arise without a large amount of variation due to common
environment. Correlations high enough to make within-family selection
superior to individual selection are, however, not commonly found,
and the advantage of within-family selection therefore comes chiefly
from the reduced rate of inbreeding which was mentioned earlier.
Fig. 13.5 shows how much will be sacrificed in the rate of response if
within-family selection is applied. Most characters have full-sib
correlations below about 0.5, and within-family selection is then only
about half as effective as individual selection.
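
The correlation above which within-family selection becomes superior can be calculated from the within-family factor in Table 13.3. The sketch below is illustrative only; the function names are hypothetical and equal intensity of selection is assumed for the two methods.

```python
import math


def within_family_relative_response(r, t, n):
    """R_w / R from Table 13.3, assuming equal intensity of selection."""
    return (1 - r) * math.sqrt((n - 1) / (n * (1 - t)))


def critical_correlation_within(r, n):
    """Value of t above which R_w / R exceeds 1 for families of size n.

    Obtained by solving (1 - r)^2 (n - 1) / n = 1 - t for t."""
    return 1 - (1 - r) ** 2 * (n - 1) / n


if __name__ == "__main__":
    for n in (2, 4, 10, 30):  # full-sib families, r = 1/2
        print(n, round(critical_correlation_within(0.5, n), 3))
    # Relative response at a more typical full-sib correlation
    print(within_family_relative_response(0.5, 0.3, 10))
```

For full sibs the critical value falls from 0.875 with families of 2 towards 0.75 as the families become very large, which is consistent with the range read from the graphs.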








Fig. 13.5. Response expected under within-family selection
relative to that of individual selection, plotted against the phenotypic
correlation, t. The separate curves refer to different family sizes,
as indicated.

Weights to be attached to families of different size. Throughout
this chapter we have assumed that all families whose mean values
are to be used in selection have equal numbers of individuals in them; 
i.e. n is the same for all families. This is a reasonable enough assump- 
tion to make when we are considering the expected response from the 
point of view of the planner who has to decide on the method of 
selection to be applied. But, in practice, families are very seldom of 
equal size and if we are to apply any method of selection based on 
family means we are immediately faced with the problem of how to 
make allowance for different numbers in the families. Obviously the
mean of a large family is more reliable than that of a small one, and 
should be given more weight when the selection is being made. The 
solution of the problem comes from a consideration of the heritability
as the regression of breeding value on phenotypic value. The best
estimate of the breeding value of a family is obtained by multiplying 
the family mean (measured as a deviation from the population mean) 
by the heritability of family means. The appropriate weighting factor 
for family means is therefore the heritability of family means, cal- 
culated separately for each family according to its size. Quantities 
that are constant for all families may be omitted without altering the 
relative weights. Thus, in the application of family selection, each 
family mean, calculated as a deviation from the population mean, 
should be weighted by $[1+(n-1)r]/[1+(n-1)t]$, and in sib selection by
$n/[1+(n-1)t]$. The heritability of within-family deviations does not
contain the term n, and is therefore unaffected by family size. Thus no
weighting is required in the application of within-family selection. The 
weighting factor to be used in combined selection has already been 
given in equation 13.6. 
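
The weighting of family means can be illustrated as follows. This is a sketch under stated assumptions: the function names and the three families are invented for the example, the family means are deviations from the population mean, and the sib-selection weight is taken as the heritability appropriate to sib selection with the constant factors r and h² dropped.

```python
def family_selection_weight(n, r, t):
    """Weight for the mean of a family of size n under family selection:
    [1 + (n-1) r] / [1 + (n-1) t]."""
    return (1 + (n - 1) * r) / (1 + (n - 1) * t)


def sib_selection_weight(n, t):
    """Weight for the mean of n measured sibs under sib selection:
    n / [1 + (n-1) t] (constant factors omitted)."""
    return n / (1 + (n - 1) * t)


if __name__ == "__main__":
    # Hypothetical full-sib families: (mean as deviation, family size)
    families = {"A": (1.2, 3), "B": (1.0, 12), "C": (-0.4, 8)}
    r, t = 0.5, 0.1
    criteria = {
        name: family_selection_weight(n, r, t) * mean
        for name, (mean, n) in families.items()
    }
    # Families ranked by weighted mean, best first
    print(sorted(criteria, key=criteria.get, reverse=True))
```

In this invented case the larger family B ranks above family A although its unweighted mean is lower, because its mean is the more reliable.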

We conclude this chapter with an example from a laboratory 
experiment which compared the responses actually obtained under 
different methods of selection. 

Example 13.1. In an experiment with Drosophila melanogaster selec- 
tion for abdominal bristle-number was made by three methods (Clayton, 
Morris, and Robertson, 1957). The responses to individual selection at 
different intensities were quoted in Example 11.2. Sib selection was also 
applied in both full-sib and half-sib families and the responses compared 
with expectation. Here we shall compare the responses under sib selection 
with the response under individual selection, according to the formula in 
Table 13.3. The same proportion of the population was selected in each 
case, namely 20 per cent, but the intensities of selection under sib selection
were lower than under individual selection because there was a smaller
total number of families than of individuals: 10 half-sib families, 20
full-sib families, and 100 individuals. The intensity of selection under
individual selection was 1.40. Those under sib selection are given in the
table, together with the other data needed for calculating the expected
responses under sib selection relative to that under individual selection.

                 Relative response, $R_s/R$

                 Full sibs    Half sibs
Exp.             0.832        0.614
Obs. up          0.618        0.527
Obs. down        0.919        0.635

In applying the formula from Table 13.3 we have to take account of the
intensity of selection, multiplying by the ratio of the intensity under sib
selection to the intensity under individual selection. It will be seen that
the correlation of breeding values, r, between half sibs is a little greater
than $\tfrac{1}{4}$. This is because the females mated to a male were not entirely
unrelated to each other. The ratios of the responses expected and observed 
are given in the right-hand half of the table. The expectation is that in- 
dividual selection should be the best method, and so it proved to be. 
There is, however, some discrepancy between the upward and downward 
responses, of which the reason is not known. 



I. Changes of Mean Value 

We turn our attention now to inbreeding, the second of the two ways 
open to the breeder for changing the genetic constitution of a popula- 
tion. The harmful effects of inbreeding on reproductive rate and 
general vigour are well known to breeders and biologists, and were 
mentioned in Chapter 6 as one of the two basic genetic phenomena 
displayed by metric characters. The opposite, or complementary, 
phenomenon of hybrid vigour resulting from crosses between inbred 
lines or between different races or varieties is equally well known, and 
forms an important means of animal and plant improvement. The 
production of lines for subsequent crossing in the utilisation of 
hybrid vigour is one of two main purposes for which inbreeding may 
be carried out. The other is the production of genetically uniform 
strains, particularly of laboratory animals, for use in bioassay and in 
research in a variety of fields. Inbreeding in itself, however, is almost 
universally harmful and the breeder or experimenter normally seeks 
to avoid it as far as possible, unless for some specific purpose. Men- 
tion should be made here of naturally self-fertilising plants, to which 
much of the discussion in this chapter is inapplicable. Since
inbreeding is their normal mating system they cannot be further inbred: they
can, however, be crossed, but they do not regularly show hybrid
vigour.
In the treatment of inbreeding given in Chapter 3 the conse- 
quences were described in terms of the expected changes of gene 
frequencies and of genotype frequencies. Here we have to show how 
the changes of gene and genotype frequencies are expected to affect 
metric characters. And at the same time we have to consider the 
observed consequences of inbreeding and crossing, and see what 
light they throw on the properties of the genes concerned with 
metric characters. We shall first consider the changes of mean value 
and then, in the next chapter, the changes of variance resulting from 
inbreeding and crossbreeding. Finally, in Chapter 16, we shall
consider the combination of selection with inbreeding and crossbreeding
by means of which hybrid vigour may be utilised in animal and plant 

Inbreeding Depression 

The most striking observed consequence of inbreeding is the 
reduction of the mean phenotypic value shown by characters con- 
nected with reproductive capacity or physiological efficiency, the 
phenomenon known as inbreeding depression. Some examples of
inbreeding depression are given in Table 14.1, from which one can see
what sort of characters are subject to inbreeding depression, and,
very roughly, the magnitude of the effect. From the results of these
and many other studies we can make the generalisation that
inbreeding tends to reduce fitness. Thus, characters that form an important
component of fitness, such as litter size or lactation in mammals,
show a reduction on inbreeding; whereas characters that contribute
little to fitness, such as bristle number in Drosophila, show little or no
change.

In saying that a certain character shows inbreeding depression, 
we refer to the average change of mean value in a number of lines. 
The separate lines are commonly found to differ to a greater or lesser 
extent in the change they show, as, indeed, we should expect in 
consequence of random drift of gene frequencies. This matter of dif- 
ferentiation of lines will be discussed later when we deal with changes 
of variance. It is mentioned here only to emphasise the fact that the 
changes of mean value now to be discussed refer to changes of the 
mean value of a number of lines derived from one base population. 
As in our earlier account of inbreeding we have to picture the "whole 
population" consisting of many lines. The population mean then 
refers to the whole population and inbreeding depression refers to a 
reduction of this population mean. Let us now consider the theoreti- 
cal basis of the change of population mean on inbreeding. 

First, we may recall and extend some of the conclusions from 
Chapter 3, supposing at first that selection does not in any way inter- 
fere with the dispersion of gene frequencies. Since the gene fre- 
quencies in the population as a whole do not change on inbreeding, 
any change of the population mean must be attributed to the changes
of genotype frequencies. Inbreeding causes an increase in the
frequencies of homozygous genotypes and a decrease of heterozygous genotypes.

Chap. 14] 



Table 14.1

Some Examples of Inbreeding Depression

The figures given show approximately the decrease of mean
phenotypic value per 10 per cent increase of the coefficient
of inbreeding: column (1) in absolute units; column (2) as
percentage of non-inbred mean; column (3) in terms of the
original phenotypic standard deviation (data not available
for all characters).

Character                                Inbreeding depression per
                                         10% increase of F
                                         (1)                 (2)
                                         units               %

Cattle (A. Robertson, 1954)
  Milk-yield                             29.6 gal.           3.2

Pigs (Dickerson et al., 1954)
  Litter size at birth                   0.38 young
  Weight at 154 days                     3.64 lb.

Sheep (Morley, 1954)
  Fleece weight                          0.64 lb.
  Length of wool                         0.12 cm.
  Body weight at 1 year                  2.91 lb.

Poultry (Shoffner, 1948)
  …                                      9.26 eggs
  Body weight                            0.04 lb.

Mice (Original data)
  Litter size at birth                   0.60 young
  Weight at 6 weeks (♀♀)                 0.58 gm.

Drosophila melanogaster
(Tantawy and Reeve, 1956)
  Fertility (per pair per day)           2.2 offspring
  Viability (egg to adult)               2.6 %
  Wing length                            2.8 (1/100) mm.

Drosophila subobscura
(Hollingsworth and Smith, 1955)
  Fertility (per pair per day)           6.0 offspring
  Egg hatchability                       8.3 %


Therefore a change of population mean on inbreeding must be con- 
nected with a difference of genotypic value between homozygotes and 
heterozygotes. Let us now see more precisely how the population 
mean depends on the degree of inbreeding, which we may conveniently
express as the inbreeding coefficient, F.

Consider a population, subdivided into a number of lines, with a 
coefficient of inbreeding, F. The expression for the population mean 
is derived by putting together the reasoning set out in Tables 3.1 and 
7.1, in the following way. Table 14.2 shows the three genotypes of a 
two-allele locus with their genotypic frequencies in the whole popula- 
tion. These frequencies come from Table 3.1, p and q being the 
gene frequencies in the whole population. Then the third column 
gives the genotypic values assigned as in Fig. 7.1. The value and
frequency of each genotype are multiplied together in the right-hand
column, the summation of which gives the contribution of this locus
to the population mean.

Table 14.2

Genotype     Frequency         Value     Frequency × Value

$A_1A_1$     $p^2 + pqF$       $+a$      $p^2a + pqaF$
$A_1A_2$     $2pq - 2pqF$      $d$       $2pqd - 2pqdF$
$A_2A_2$     $q^2 + pqF$       $-a$      $-q^2a - pqaF$

Sum $= a(p - q) + 2dpq - 2dpqF$
    $= a(p - q) + 2dpq(1 - F)$

Thus, referring still to the effects of a single
locus, we find that a population with inbreeding coefficient F has a
mean genotypic value:

$$M_F = a(p - q) + 2dpq(1 - F) \tag{14.1}$$

$$M_F = M_0 - 2dpqF \tag{14.2}$$

where $M_0$ is the population mean before inbreeding, from equation
7.2. The change of mean resulting from inbreeding is therefore
$-2dpqF$. This shows that a locus will contribute to a change of mean
value on inbreeding only if d is not zero; in other words if the value
of the heterozygote differs from the average value of the homozygotes.
This conclusion, though demonstrated in detail only for two alleles
at a locus, is equally valid for loci with more than two alleles. The 
following general conclusions can therefore be drawn: that a change 
of mean value on inbreeding is a consequence of dominance at the 
loci concerned with the character, and that the direction of the change 


Chap. 14] 



is toward the value of the more recessive alleles. The dominance may 
be partial or complete, or it may be overdominance; all that is neces- 
sary for a locus to contribute to a change of mean is that the heterozy- 
gote should not be exactly intermediate between the two homozygotes. 
Equation 14.2 shows also that the magnitude of the change of mean 
depends on the gene frequencies. It is greatest when pq is maximal: 
that is, when $p = q = \tfrac{1}{2}$. Genes at intermediate frequencies therefore
contribute more to a change of mean than genes at high or low fre- 
quencies, other things being equal. 
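
The single-locus result can be checked numerically. The following is only a minimal sketch: the function names and the values chosen for p, a, d and F are illustrative assumptions, not taken from the text. It builds the mean from the genotype frequencies of Table 14.2 and compares the change with $-2dpqF$.

```python
def mean_under_inbreeding(p, a, d, F):
    """Mean genotypic value of one two-allele locus at inbreeding coefficient F."""
    q = 1.0 - p
    # Genotype frequencies in the whole population, as in Table 14.2.
    freqs = {
        "A1A1": p * p + p * q * F,
        "A1A2": 2 * p * q - 2 * p * q * F,
        "A2A2": q * q + p * q * F,
    }
    # Genotypic values +a, d, -a.
    values = {"A1A1": a, "A1A2": d, "A2A2": -a}
    return sum(freqs[g] * values[g] for g in freqs)


def inbreeding_change(p, d, F):
    """Predicted change of mean, -2dpqF (equation 14.2)."""
    q = 1.0 - p
    return -2.0 * d * p * q * F


# Illustrative (hypothetical) values.
p, a, d, F = 0.5, 1.0, 0.5, 0.25
m0 = mean_under_inbreeding(p, a, d, 0.0)
mf = mean_under_inbreeding(p, a, d, F)
print(mf - m0, inbreeding_change(p, d, F))  # the two numbers should agree
```

Setting d to zero in the same functions leaves the mean unchanged whatever the value of F, which is the point made above about loci without dominance.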

Now let us consider the combined effect of all the loci that affect 
the character. In so far as the genotypic values of the loci combine 
additively, the population mean is given by summation of the contri- 
butions of the separate loci, thus: 

$$M_F = \sum a(p - q) + 2\left(\sum dpq\right)(1 - F) \tag{14.3}$$

$$M_F = M_0 - 2F\sum dpq \tag{14.4}$$

and the change of mean on inbreeding is $-2F\sum dpq$.

These expressions show what are the circumstances under which 
a metric character will show a change of mean value on inbreeding. 
The chief one is if the dominance of the genes concerned is pre- 
ponderantly in one direction; i.e. if there is directional dominance. 
If the genes that increase the value of the character are dominant 
over their alleles that reduce the value, then inbreeding will result in 
a reduction of the population mean, i.e. a change in the direction of 
the more recessive alleles. The contribution of each locus, however, 
depends also on its gene frequencies, those with intermediate fre- 
quencies having the greatest effect on the change of mean value. 
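
The role of directional dominance may be illustrated with a short sketch. The loci listed here are hypothetical, and additive combination of loci is assumed as in the text. The change of mean is computed as $-2F\sum dpq$ for a set of loci whose dominance is all in one direction and for a set in which it cancels.

```python
def change_of_mean(loci, F):
    """Change of mean on inbreeding, -2F * sum(d*p*q), for loci given as (p, d) pairs."""
    return -2.0 * F * sum(d * p * (1.0 - p) for p, d in loci)


# Hypothetical loci as (gene frequency p, dominance deviation d).
directional = [(0.5, 0.4), (0.3, 0.2), (0.8, 0.3)]  # increasing alleles dominant at every locus
balanced = [(0.5, 0.4), (0.5, -0.4)]                # dominance in opposite directions

for F in (0.25, 0.50):
    print(F, change_of_mean(directional, F), change_of_mean(balanced, F))
```

With directional dominance the mean falls, and the fall at F = 0.50 is twice that at F = 0.25; with the balanced set the contributions cancel and no change of mean appears in spite of dominance at each locus.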

We have now reached two conclusions about the effects of in- 
breeding, one from observation — that inbreeding reduces fitness; the 
other from theory — that the change is in the direction of the more 
recessive alleles. Putting these two conclusions together leads to the 
generalisation, already familiar from Mendelian genetics, that dele- 
terious alleles tend to be recessive. 

Another conclusion that can be drawn from equation 14.4 is that 
when loci combine additively the change of mean on inbreeding 
should be directly proportional to the coefficient of inbreeding. In 
other words the change of mean should be a straight line when 
plotted against F. Two examples of experimentally observed inbreed- 
ing depression are illustrated in Fig. 14.1. 

On the whole the observed inbreeding depression does tend to be
linear with respect to F, and this might be taken as evidence that 
epistatic interaction between loci is not of great importance. There 
are, however, several practical difficulties that stand in the way of 
drawing firm conclusions from observations of the rate of inbreeding 
depression. One is that as inbreeding proceeds and reproductive 
capacity deteriorates, it soon becomes impossible to avoid the loss of 

Fig. 14.1. Examples of inbreeding depression affecting fertility.
(a) Litter-size in mice (original data). Mean number born alive in
1st litters, plotted against the coefficient of inbreeding of the litters.
The first generation was by double-first-cousin mating; thereafter
by full-sib mating. No selection was practised. (b) Fertility in
Drosophila subobscura. Mean number of adult progeny per pair per
day, plotted against the inbreeding coefficient of the parents.
Consecutive full-sib matings. (Redrawn from Hollingsworth &
Smith, 1955.)

some lines. The survivors are then a selected group to which the 
theoretical expectations no longer apply. Thus precise measurement 
of the rate of inbreeding depression can generally be made only over 
the early stages, before the inbreeding coefficient reaches high levels. 
Another difficulty, met with particularly in the study of mammals, 
arises from maternal effects. Maternal qualities are among the most 
sensitive characters to inbreeding depression. The effect of inbreed- 
ing on another character that is influenced by maternal effects is 
therefore two-fold: part being attributable to the inbreeding of the 
individuals measured and part to the inbreeding in the mothers. So 
the relationship between the character measured and the coefficient 
of inbreeding cannot be depicted in any simple manner. In consequence
of these difficulties reliable conclusions cannot easily be
drawn from the exact form of the inbreeding depression observed in

Example 14.1. The complications arising from maternal effects may
be illustrated by litter size in pigs and mice. Litter size is a composite 
character, which is partly an attribute of the mother and partly an attribute 
of the young in the litter. It is therefore influenced both by the inbreeding 
of the mother and by the inbreeding of the young, and these two influences 
are difficult to disentangle in practice. Studies on pigs (Dickerson et al., 
1954) have shown that the reduction of litter size due to inbreeding in the 
mother alone is about 0.20 young per 10 per cent of inbreeding; and the
reduction due to inbreeding in the young alone is about 0.17 young per 10
per cent of inbreeding. Thus the effects of inbreeding in the mother and in 
the young are about equally important. A small experiment with mice 
(original data) gave much the same picture. A rough separation of the 
effects of inbreeding in the mother and in the young was made by means of 
crosses between lines after 2 or 3 generations of sib mating. (The justifi- 
cation for regarding this as a measure of the inbreeding depression will be 
explained in the next section.) The mean litter sizes, arranged according 
to the coefficient of inbreeding of the mothers and of the young, are given 
in the table. 

Inbreeding coefficient of mothers 

0%        37.5%        50%





The three comparisons in the first row show the effect of inbreeding in the 
mothers, and give values of 0.19, 0.18 and 0.16 for the reduction of litter
size per 10 per cent of inbreeding. The comparisons in the second and
third columns show the effect of inbreeding in the young, and give values
of 0.24 and 0.25 for the reduction per 10 per cent of inbreeding. Thus
inbreeding in the young had rather more effect than inbreeding in the 
mother. These results, however, should not be taken as being character- 
istic of mice in general. 

The effect of selection. The neglect of selection during in- 
breeding is an unrealistic omission because natural selection cannot 
be wholly avoided even in laboratory experiments. Since inbreeding 
tends to reduce fitness, natural selection is likely to oppose the in- 
breeding process by favouring the least homozygous individuals. 


The balance between selection and the dispersion of gene frequencies 
was discussed in Chapter 4, and the only further point that need be 
added here is that the operation of natural selection makes the in- 
breeding depression dependent on the rate of inbreeding. One must 
distinguish between the state of dispersion of gene frequencies and 
the coefficient of inbreeding as computed from the population size or 
the pedigree relationships. The state of dispersion is what determines 
the amount of inbreeding depression; the coefficient of inbreeding is a 
measure of the state of dispersion only in the absence of selection. 
When selection operates, the state of dispersion will be less than that 
indicated by the coefficient of inbreeding, and the discrepancy be- 
tween the two will be greater when the rate of inbreeding is slower, 
because the selection will then be relatively more potent. Therefore 
one must expect the inbreeding depression caused by a given increase 
of the computed coefficient of inbreeding to be less when inbreeding 
is slow than when it is rapid. 


Complementary to the phenomenon of inbreeding depression is 
its opposite, "hybrid vigour" or heterosis. When inbred lines are 
crossed, the progeny show an increase of those characters that previ- 
ously suffered a reduction from inbreeding. Or, in general terms, the 
fitness lost on inbreeding tends to be restored on crossing. That the 
phenomenon of heterosis is simply inbreeding depression in reverse 
can be seen by consideration of how the population mean depends on 
the coefficient of inbreeding, as shown in equation 14.4. Consider, as 
before, a population subdivided into a number of lines. If the lines 
are crossed at random, the average coefficient of inbreeding in the 
cross-bred progeny reverts to that of the base population. Thus, if a 
number of crosses are made at random between the lines, the mean 
value of any character in the cross-bred progeny is expected to be the 
same as the population mean of the base population. In other words, 
the heterosis on crossing is expected to be equal to the depression on 
inbreeding. Furthermore, if the population is continued after the 
crossing by random mating among the cross-bred and subsequent 
generations, the coefficient of inbreeding will remain unchanged, and 
the population mean is consequently expected to remain at the level 
of the base population. We may, thus, make the following generalisation
on theoretical grounds: that, in the absence of selection, inbreeding
followed by crossing of the lines in a large population is not
expected to make any permanent change in the population mean.

Example 14.2. An experiment with mice (R. C. Roberts, unpublished) 
was designed to test the theoretical expectation that in the absence of 
selection the heterosis on crossing should be equal to the depression on 
inbreeding. The character studied was litter size. Thirty lines taken from 
a random-bred population were inbred by 3 consecutive generations of 
full-sib mating, bringing the coefficient of inbreeding up to 50 per cent in 
the litters and 37.5 per cent in the mothers. No selection was practised
during the inbreeding, and only 2 of the 30 lines were lost as a conse- 
quence of their inbreeding depression. 

                                Litter size
Before inbreeding                   8.1
Inbred (litters: F = 50%)           5.7
Cross-bred                          8.5

After the third generation of inbreeding, crosses were made at random 
between the lines, and in the next generation crosses between the $F_1$'s were
made so as to give cross-bred mothers with non-inbred young. The mean
litter sizes observed at the different stages are given in the table. The
inbreeding depression was 2.4 and the heterosis 2.8; the two are equal
within the limits of experimental error. 
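
The arithmetic of this example can be retraced in a few lines. This is a sketch under two assumptions. First, the inbreeding coefficients are generated with the usual full-sib recurrence $F_t = \tfrac{1}{4}(1 + 2F_{t-1} + F_{t-2})$, which is not stated in this section. Second, the table means are read as 8.1, 5.7 and 8.5, the reading that agrees with the differences of 2.4 and 2.8 quoted above.

```python
def full_sib_F(generations):
    """Inbreeding coefficients over successive generations of full-sib mating.

    Uses the standard recurrence F_t = (1 + 2*F_{t-1} + F_{t-2}) / 4,
    starting from a non-inbred base (an assumption of this sketch).
    """
    f_prev2, f_prev1 = 0.0, 0.0
    coefficients = []
    for _ in range(generations):
        f_now = 0.25 * (1.0 + 2.0 * f_prev1 + f_prev2)
        coefficients.append(f_now)
        f_prev2, f_prev1 = f_prev1, f_now
    return coefficients


# Mean litter sizes as read from the table of Example 14.2.
before, inbred, crossbred = 8.1, 5.7, 8.5
depression = before - inbred      # loss of mean on inbreeding
heterosis = crossbred - inbred    # gain of mean on crossing

print(full_sib_F(3))
print(round(depression, 1), round(heterosis, 1))
```

The third value of the recurrence is the 50 per cent quoted for the litters and the second is the 37.5 per cent quoted for their mothers.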

Single crosses. The foregoing theoretical conclusions refer to 
the average of a large number of crosses between lines derived from a 
single base population. In practice, however, one is often interested 
in a somewhat different problem, namely the heterosis shown by a 
particular cross between two lines, or between two populations which 
may have no known common origin. To refer the changes of mean 
value to changes of inbreeding coefficient would be inappropriate 
under these circumstances, and the theoretical basis of the heterosis is 
better expressed in terms of the gene frequencies in the two lines. 
We may recall from Chapter 3 that inbreeding leads to a dispersion of 
gene frequencies among the lines, the lines becoming differentiated 
in gene frequency as inbreeding proceeds; and the coefficient of 
inbreeding is a means of expressing the degree of differentiation 
(equation 3.14). In turning from the inbreeding coefficient to the 
gene frequencies as a basis for discussion we are therefore turning 
from the general, or average, consequence of crossing, to the particu- 
lar circumstances in two lines. 


Let us, then, consider two populations, referred to as the "parent
populations," both random-bred though not necessarily large. The
parent populations are crossed to produce an $F_1$ or "first cross-bred
generation," and the $F_1$ individuals are mated together at random to
produce an $F_2$ or "second cross-bred generation." The amount of
heterosis shown by the $F_1$ or the $F_2$ will be measured as the deviation
from the mid-parent value, i.e. as the difference from the mean of the
two parent populations. First consider the effects of a single locus
with two alleles whose frequencies are p and q in one population, and
p' and q' in the other. Let the difference of gene frequency between
the two populations be y, so that y = p - p' = q' - q. The algebra is
then simplified by writing the gene frequencies p' and q' in the second
population as (p - y) and (q + y). Let the genotypic values be a, d, -a,
as before. They are assumed to be the same in the two populations,
epistatic interaction being disregarded. We have to find the
mean of each parent population and the mid-parent value; then the
mean of the $F_1$ and the mean of the $F_2$. The parental means, $M_{P_1}$ and
$M_{P_2}$, are found from equation 7.2. They are

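$$M_{P_1} = a(p - q) + 2dpq$$

$$\begin{aligned} M_{P_2} &= a(p - y - q - y) + 2d(p - y)(q + y) \\ &= a(p - q - 2y) + 2d[pq + y(p - q) - y^2] \end{aligned}$$

The mid-parent value is

$$\begin{aligned} M_{\bar{P}} &= \tfrac{1}{2}(M_{P_1} + M_{P_2}) \\ &= a(p - q - y) + d[2pq + y(p - q) - y^2] \end{aligned} \tag{14.5}$$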

When the two populations are crossed to produce the $F_1$, individuals
taken at random from one population are mated to individuals
taken at random from the other population. This is equivalent
to taking genes at random from the two populations, as shown in
Table 14.3.

Table 14.3
Frequencies of Zygotes in the $F_1$

                                  Gametes from $P_1$
                                  $A_1$            $A_2$
                                  p                q

Gametes       $A_1$   p - y       p(p - y)         q(p - y)
from $P_2$    $A_2$   q + y       p(q + y)         q(q + y)

The $F_1$ is therefore constituted as follows:

Genotypes             $A_1A_1$        $A_1A_2$              $A_2A_2$
Frequencies           p(p - y)        2pq + y(p - q)        q(q + y)
Genotypic values      a               d                     -a


The mean genotypic value of the $F_1$ is therefore:

$$\begin{aligned} M_{F_1} &= a(p^2 - py - q^2 - qy) + d[2pq + y(p - q)] \\ &= a(p - q - y) + d[2pq + y(p - q)] \end{aligned} \tag{14.6}$$

The amount of heterosis, expressed as the difference between the $F_1$
and the mid-parent values, is obtained by subtracting equation 14.5
from equation 14.6:

$$\begin{aligned} H_{F_1} &= M_{F_1} - M_{\bar{P}} \\ &= dy^2 \end{aligned} \tag{14.7}$$


Thus heterosis, just like inbreeding depression, depends for its occur- 
rence on dominance. Loci without dominance (i.e. loci for which 
d = 0) cause neither inbreeding depression nor heterosis. The amount
of heterosis following a cross between two particular lines or popula- 
tions depends on the square of the difference of gene frequency (y) 
between the populations. If the populations crossed do not differ in 
gene frequency there will be no heterosis, and the heterosis will be 
greatest when one allele is fixed in one population and the other allele 
in the other population. 
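
A numerical check of the single-locus result may be helpful. The sketch below uses hypothetical gene frequencies and genotypic values, and the function names are illustrative. It forms the $F_1$ by combining gametes at random from the two parent populations and compares the deviation from the mid-parent value with $dy^2$.

```python
def parent_mean(p, a, d):
    """Mean of a random-bred population: a(p - q) + 2dpq (equation 7.2)."""
    q = 1.0 - p
    return a * (p - q) + 2.0 * d * p * q


def f1_mean(p1, p2, a, d):
    """Mean of the F1 formed by uniting one gamete from each parent population."""
    q1, q2 = 1.0 - p1, 1.0 - p2
    freq_a1a1 = p1 * p2
    freq_a1a2 = p1 * q2 + q1 * p2
    freq_a2a2 = q1 * q2
    return a * freq_a1a1 + d * freq_a1a2 - a * freq_a2a2


def f1_heterosis(p1, p2, a, d):
    """Deviation of the F1 mean from the mid-parent value."""
    mid_parent = 0.5 * (parent_mean(p1, a, d) + parent_mean(p2, a, d))
    return f1_mean(p1, p2, a, d) - mid_parent


# Hypothetical values: gene frequencies 0.8 and 0.3, so y = 0.5.
p1, p2, a, d = 0.8, 0.3, 1.0, 0.6
y = p1 - p2
print(f1_heterosis(p1, p2, a, d), d * y ** 2)  # the two numbers should agree
```

Putting p1 equal to p2 gives no heterosis, and putting p1 = 1 with p2 = 0 gives the maximum, equal to d, as stated above.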

Now consider the joint effects of all loci at which the two parent 
populations differ. In so far as the genotypic values attributable to 
the separate loci combine additively, we may represent the heterosis 
produced by the joint effects of all the loci as the sum of their separate 
contributions. Thus the heterosis in the $F_1$ is

$$H_{F_1} = \sum dy^2 \tag{14.8}$$


If some loci are dominant in one direction and some in the other their 
effects will tend to cancel out, and no heterosis may be observed, in 
spite of the dominance at the individual loci. The occurrence of 
heterosis on crossing is therefore, like inbreeding depression, de- 
pendent on directional dominance, and the absence of heterosis is not 
sufficient ground for concluding that the individual loci show no dominance.

Before we go on to consider the $F_2$ it is perhaps worth noting that
the formulation of the heterosis in terms of the square of the difference
of gene frequency, in equations 14.7 and 14.8, is quite in line
with the previous formulation of the inbreeding depression in terms
of the coefficient of inbreeding. If we envisage once more the whole
population subdivided into lines, and we suppose pairs of lines to be
taken at random, then the mean squared difference of gene frequency
between the pairs of lines will be equal to twice the variance of gene
frequency among the lines. That is: $\overline{y^2} = 2\sigma_q^2$. And, by equation
3.14, $2\sigma_q^2 = 2pqF$. Therefore the mean amount of heterosis shown by
crosses between random pairs of lines is equal to the inbreeding 
depression as given in equation 14.2, though of opposite sign. 
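
The step from the squared difference to the variance can be verified directly. In the sketch below the gene frequencies of the lines are hypothetical. Pairs are drawn with replacement, so that a line may be paired with itself; this makes the identity exact, and with many lines the distinction is of no consequence.

```python
from itertools import product
from statistics import pvariance


def mean_squared_difference(line_freqs):
    """Mean of (x - y)^2 over all ordered pairs of lines, drawn with replacement."""
    pairs = list(product(line_freqs, repeat=2))
    return sum((x - y) ** 2 for x, y in pairs) / len(pairs)


def mean_heterosis(d, line_freqs):
    """Average F1 heterosis, d times the mean squared difference of gene frequency."""
    return d * mean_squared_difference(line_freqs)


# Hypothetical gene frequencies in six lines after some inbreeding.
line_freqs = [0.0, 0.2, 0.5, 0.5, 0.9, 1.0]
variance = pvariance(line_freqs)  # variance of gene frequency among the lines

print(mean_squared_difference(line_freqs), 2 * variance)  # should agree
print(mean_heterosis(0.5, line_freqs))
```

If the variance among lines is then written as pqF, the average heterosis becomes 2dpqF, which is the inbreeding depression of equation 14.2 with the sign reversed.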

Now let us consider the $F_2$ of a particular cross of two parent
populations, the $F_2$ being made by random mating among the individuals
of the $F_1$. In consequence of the random mating, the genotype
frequencies in the $F_2$ will be the Hardy-Weinberg frequencies
corresponding to the gene frequency in the $F_1$. The mean genotypic
value of the $F_2$ is then easily derived by application of equation 7.2.
The gene frequency in the $F_1$, being the mean of the gene frequencies
in the two parent populations, is $(p - \tfrac{1}{2}y)$ for one allele, and $(q + \tfrac{1}{2}y)$
for the other. Putting these gene frequencies in place of p and q
respectively in equation 7.2 gives the mean genotypic value of the
$F_2$ as:

$$\begin{aligned} M_{F_2} &= a(p - \tfrac{1}{2}y - q - \tfrac{1}{2}y) + 2d(p - \tfrac{1}{2}y)(q + \tfrac{1}{2}y) \\ &= a(p - q - y) + d[2pq + y(p - q) - \tfrac{1}{2}y^2] \end{aligned} \tag{14.9}$$

The amount of heterosis shown by the $F_2$ is the difference between
the $F_2$ and mid-parent values. So, from equations 14.5 and 14.9,

$$\begin{aligned} H_{F_2} &= M_{F_2} - M_{\bar{P}} \\ &= \tfrac{1}{2}dy^2 \\ &= \tfrac{1}{2}H_{F_1} \end{aligned} \tag{14.10}$$

We find therefore that the heterosis shown by the $F_2$ is only half as
great as that shown by the $F_1$. In other words, the $F_2$ is expected to
drop back half-way from the $F_1$ value toward the mid-parent value.
At first sight this conclusion may seem to contradict the one arrived
at earlier, when we were considering crosses between many lines, the
$F_1$ and $F_2$ means then being equal. The difference between the two
situations is that an $F_2$ made by random mating among a large number
of different crosses has the same inbreeding coefficient as the $F_1$.
But an $F_2$ made from an $F_1$ derived from a single cross has inevitably
an increased inbreeding coefficient. If the inbreeding coefficient is
worked out in the manner described in Example 5.2, it will be found 
to be half the inbreeding coefficient of the parent lines. The change 
of mean from $F_1$ to $F_2$ may therefore be regarded as inbreeding depression.
It cannot be overcome by having a large number of parents
of the $F_2$ because the restriction of population size that causes the
inbreeding has already been made in the single cross of only two lines,
or parent populations. There need, however, be no further rise of the
inbreeding coefficient in the $F_3$ and subsequent generations. Provided,
therefore, that there is no other reason for the gene frequency
to change, the population mean will be the same in the generations
following as in the $F_2$.
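
The halving of the heterosis in the second cross-bred generation can be confirmed for a single locus with a short sketch. The numerical values are hypothetical and the function names illustrative; epistasis is disregarded, as in the derivation.

```python
def hw_mean(p, a, d):
    """Mean of a random-mating population with gene frequency p (equation 7.2)."""
    q = 1.0 - p
    return a * (p - q) + 2.0 * d * p * q


def cross_means(p1, p2, a, d):
    """Return (mid-parent, F1, F2) means for a cross of two random-bred populations."""
    q1, q2 = 1.0 - p1, 1.0 - p2
    mid_parent = 0.5 * (hw_mean(p1, a, d) + hw_mean(p2, a, d))
    # F1: one gamete drawn from each parent population.
    f1 = a * (p1 * p2 - q1 * q2) + d * (p1 * q2 + q1 * p2)
    # F2: Hardy-Weinberg frequencies at the F1 gene frequency, the mean of p1 and p2.
    f2 = hw_mean(0.5 * (p1 + p2), a, d)
    return mid_parent, f1, f2


# Hypothetical cross: gene frequencies 0.9 and 0.1, a = 1.0, d = 0.5.
mid_parent, f1, f2 = cross_means(0.9, 0.1, 1.0, 0.5)
print(f1 - mid_parent, f2 - mid_parent)  # the second should be half the first
```

Applying hw_mean again at the same gene frequency gives the $F_3$ mean, which is therefore equal to that of the $F_2$.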

That the heterosis expected in the $F_2$ is half that found in the $F_1$
is equally true when the joint effects of all loci are considered, pro- 
vided that epistatic interaction is absent. The conclusion for a single 
locus was based on the principle that Hardy- Weinberg equilibrium 
is attained by a single generation of random mating. It will be 
remembered from Chapter 1 (p. 19), however, that this is not true 
with respect to genotypes at more than one locus considered jointly. 
Therefore if there is epistatic interaction, the population mean will 
not reach its equilibrium value in the $F_2$, but will approach it more or
less rapidly according to the number of interacting loci and the 
closeness of the linkage between them. The existence of epistatic 
interaction is intimately connected with the scale of measurement, 
but this matter will not be discussed until Chapter 17. Here we need 
only note that for reasons connected with the scale of measurement 
the halving of the heterosis in the $F_2$ expected on theoretical grounds
is not often found at all exactly in practice, though the $F_2$ usually falls
somewhere between the $F_1$ and mid-parent values. Some examples
from plants of the heterosis observed in the $F_1$ and $F_2$ generations are
illustrated in Fig. 14.2. It will be noticed that with some of the
characters shown, the $F_1$ and $F_2$ are lower in value than the mid-parent,
and the heterosis is consequently negative in sign. This is in
no way inconsistent with our definition of heterosis as the difference
between the $F_1$ or $F_2$ and the mid-parent value. The sign of the
difference depends simply on the nature of the measurement. For 
example, the character "days to first fruit," represented in the lower 
graphs, shows heterosis of negative sign: but if the character were 
called "speed of development" and expressed as a reciprocal of time 
the heterosis would be positive in sign. 

The relative amount of heterosis observed in the $F_1$ and $F_2$
generations is complicated also by the existence of maternal effects, 
particularly in mammals. A character subject to a maternal effect, 








Fig. 14.2. Some illustrations of heterosis observed in crosses
between pairs of highly inbred strains of plants. The points show
the mean values of the two parent strains, the $F_1$ and the $F_2$
generations. The mid-parent values are shown by horizontal lines.
Graph (a) refers to tobacco, Nicotiana rustica (data from Smith,
1952). All the other graphs refer to tomatoes, Lycopersicon (data
from Powers, 1952). The characters represented are:

(a) Height of plant (in.)
(b) Mean weight of one fruit (gm.)
(c) Number of locules per fruit
(d) Mean weight per locule (gm.)
(e)-(h) Mean time in days between the planting of the seed and
the ripening of the first fruit, in 4 different crosses.

such as litter size, is divided between two generations. The maternally 
determined component of the character may be expected to follow the
same general pattern of heterosis in the $F_1$ and $F_2$ as we have just
discussed, but it will be one generation out of phase with the non-maternal
part of the character. Thus the heterosis observed in the $F_1$
is attributable to the non-maternal part, the maternal effect being still
at the inbred level. In the $F_2$, however, the non-maternal part will
lose half the heterosis as explained above, but the maternal effect will
now show the full effect of its heterosis since the mothers are now in
the $F_1$ stage. This rather complicated situation may perhaps be more















Fig. 14.3. Diagram of the heterosis expected in a character subject
to a maternal effect, when two lines are crossed and the $F_2$ is
made by random mating among the $F_1$. The maternal and non-maternal
components of the character separately are here supposed
to show equal amounts of heterosis, and to combine by simple
addition to give the character as it is measured.

readily grasped from the diagrammatic representation in Fig. 14.3. 
As a result of maternal effects, therefore, the loss of heterosis in the 
$F_2$ and subsequent generations is usually less noticeable with animals
than with plants, and experiments of great precision would be re- 
quired to detect any regular pattern. 
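
The pattern described for a character with a maternal effect can be tabulated with a small sketch. It rests on the assumptions of the argument above: each component shows its full heterosis in the $F_1$ and half in the $F_2$ and later generations, the maternal component reflects the generation to which the mothers belong, and the two components combine additively. The function name and the unit amounts of heterosis are illustrative only.

```python
def expected_heterosis(h_direct, h_maternal, generations=3):
    """Expected heterosis by generation for a character with a maternal component.

    Returns rows of (generation, non-maternal part, maternal part, total).
    """
    def fraction(gen):
        # Fraction of the full heterosis shown by individuals of a generation:
        # parent lines 0, F1 all of it, F2 and later one half.
        if gen <= 0:
            return 0.0
        if gen == 1:
            return 1.0
        return 0.5

    rows = []
    for gen in range(1, generations + 1):
        direct = h_direct * fraction(gen)
        # The mothers of generation gen belong to generation gen - 1.
        maternal = h_maternal * fraction(gen - 1)
        rows.append((gen, direct, maternal, direct + maternal))
    return rows


for row in expected_heterosis(1.0, 1.0):
    print(row)
```

With equal heterosis in the two components the total is greater in the $F_2$ than in the $F_1$ and returns in the $F_3$ to the $F_1$ level, so that no simple halving is to be seen in the character as measured.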

Wide crosses. We have seen that the amount of heterosis shown 
by a particular cross depends, among other things, on the differences 
of gene frequency between the two populations crossed. This would 
seem to indicate that the amount of heterosis would increase with the 


degree of genetic differentiation between the two populations and 
would be limited only by the barrier of interspecific sterility. This, 
however, is not true. Crosses between subspecies, or between local 
races, taken from the wild often fail to show heterosis, particularly 
in characters closely related to fitness which show heterosis in crosses 
between less differentiated laboratory populations. Indeed the $F_1$'s
of wide crosses are often less fit than the parent populations. Much
of the evidence about such crosses comes from studies of wild
populations of Drosophila pseudoobscura and other species (see
Dobzhansky, 1950; Wallace and Vetukhiv, 1955). Though wide 
crosses may not show heterosis in fitness, they do often show hetero- 
sis in certain characters, particularly growth rate in plants. Dob- 
zhansky (1950, 1952), who drew attention to this, refers to heterosis 
in fitness as "euheterosis" and to heterosis in a character that does not 
confer greater fitness as "luxuriance." 

The error in extending our earlier conclusion to wide crosses 
arises from the fact that we have assumed epistatic interaction be- 
tween loci to be negligible, an assumption that is probably justified 
for crosses between breeds of domestic animals or between laboratory 
populations, but is obviously not justified in the case of crosses be- 
tween differentiated wild populations. The existing genetic differen- 
tiation between wild populations has, for the most part, arisen by 
evolutionary adaptation to the local conditions. Adaptation to local 
conditions or to a particular way of life involves many different 
characters, both structural and functional, because the fitness of the 
organism depends on the harmonious interrelations of all its parts. 
If two populations adapted to different ways of life are crossed, the 
cross-bred individuals will be adapted to neither, and will conse- 
quently be less fit than either of the parent populations. The effect 
of this evolutionary adaptation on the genetic structure of the populations
is as follows. The genes $A_1$ and $B_1$, say, are selected in one
population because together they increase fitness, though either one
separately may not; while, in another population living under different
conditions, the genes $A_2$ and $B_2$ are selected for similar reasons.
In respect of fitness, therefore, there is epistatic interaction between
these two loci. But if these pairs of genes become fixed throughout the
two populations, $A_1$ and $B_1$ in one and $A_2$ and $B_2$ in the other, and so
become part of their constant genetic structure, the variation arising 
from this interaction will disappear. Within any one population, 
therefore, we may find very little epistatic variation, and the interaction
will become apparent as a cause of variation between individuals
only in a cross-bred population in which there is segregation at both 
interacting loci. 

The idea that the genetic structure of a natural population evolves 
as a whole, so that the selection pressure on any one locus is depend- 
ent on the alleles present at many of the other loci, is expressed in the 
terms "coadaptation" and "integration," used to describe the genetic 
structure of natural populations. (For general discussions of these 
concepts, see Dobzhansky, 1951b; Lerner, 1954, 1958; Wright,
1956.) The important point for us to note is this. The property of 
coadaptation, or integration, assumes primary importance only when 
different populations are to be compared and when the results of 
crossing adaptively differentiated populations are to be studied; it is 
of less importance in the genetic study of a single population. In 
this book we are chiefly concerned with the genetic variation within a 
population: that is, the variation arising from the segregation of genes 
in the population. Some of this variation arises from epistatic interaction
between the genes segregating at different loci, which is the
raw material, as it were, from which coadaptation could evolve if the 
population were to become subdivided. But the amount of this epi- 
static variation within a population is probably seldom very large, 
and moreover it is seldom necessary to distinguish it from other 
sources of non-additive genetic variance. 




II. Changes of Variance 

The effect of inbreeding on the genetic variance of a metric character 
is apparent, in its general nature, from the description of the changes 
of gene frequency given in Chapter 3. Again, we have to imagine the 
whole population, consisting of many lines. Under the dispersive 
effect of inbreeding, or random drift, the gene frequencies in the 
separate lines tend toward the extreme values of 0 or 1, and the lines
become differentiated in gene frequency. Since the mean genotypic 
value of a metric character depends on the gene frequencies at the 
loci affecting it, the lines become differentiated, or drift apart, in 
mean genotypic value. And, since the genetic components of vari- 
ance diminish as the gene frequencies tend toward extreme values 
(see Fig. 8.1), the genetic variance within the lines decreases. The 
general consequence of inbreeding, therefore, is a redistribution of the 
genetic variance; the component appearing between the means of 
lines increases, while the component appearing within the lines 
decreases. In other words, inbreeding leads to genetic differentiation 
between lines and genetic uniformity within lines. The differentia- 
tion is illustrated from experimental data in Fig. 15.1. 

The subdivision of an inbred population into lines introduces an 
additional observational component of variance, the between-line 
component, and it is not surprising that this adds a considerable 
complication to the theoretical description of the components of 
genetic variance. Indeed, a full theoretical treatment of the redistri- 
bution of variance has not yet been achieved. Here we shall attempt 
no more than a brief description of the main outlines, and for this we 
shall have to make some simplifications. In particular we shall 
entirely neglect the interaction component of genetic variance arising 
from epistasis. For detailed treatment of various aspects of the 
problem, and for references, see Kempthorne (1957, Ch. 17). After 
this description of the redistribution of genetic variance we shall 
consider changes of environmental variance. The greater sensitivity 
of inbred individuals to environmental sources of variation was 
mentioned earlier, in Chapter 8. This phenomenon interferes with 
the experimental study of the changes of variance, and until it is 
better understood we cannot put much reliance on the theoretical 
expectations concerning variance being manifest in the observable 
phenotypic variance. Finally, in this chapter, we shall discuss the use 
of inbred animals for experimental purposes. 

[Figure 15.1: graph of mean bristle number against generations of inbreeding.] 

Fig. 15.1. Differentiation between lines by random drift, shown 
by abdominal bristle number in Drosophila melanogaster. The 
graphs show the mean bristle number in each of 10 lines during 
full-sib inbreeding without artificial selection. (From Rasmuson, 
1952; reproduced by courtesy of the author and the editor of Acta 


Redistribution of Genetic Variance 

The redistribution of variance arising from additive genes (i.e. 
genes with no dominance) is easily deduced. This is because with 
additive genes the proportions in which the original variance is 
distributed within and between lines do not depend on the original 
gene frequencies. When there is dominance, however, we cannot 
deduce the changes of variance without a knowledge of the initial 
gene frequencies. This not only adds considerably to the mathematical 
complexity, but it renders a general solution impossible. We shall 
first consider the case of additive genes, and then very briefly indicate 
the conclusions arrived at for dominant genes. The effect of selection 
will not be specifically discussed. We need only note that natural 
selection will tend to render the actual state of dispersion of gene 
frequencies less than that indicated by the inbreeding coefficient 
computed from the population size or pedigree relationships. Therefore 
we must expect the redistribution of genetic variance to proceed 
at a slower rate than the theoretical expectation, and we must expect 
the discrepancy to be greater when inbreeding is slow than when it is 
rapid. 

No dominance. What follows refers to the variance arising from 
additive genes: it does not apply to the additive variance arising from 
genes with dominance. The conclusions therefore apply, strictly 
speaking, only to characters which show no non-additive variance. 
They serve, however, to indicate the general effect of inbreeding on 
variance, and may be taken as a fair approximation to what is expected 
of characters such as bristle number in Drosophila, that show little 
non-additive genetic variance. The description to be given refers to 
slow inbreeding, and is not strictly true of rapid inbreeding by sib- 
mating or self-fertilisation. The redistribution of the variance under 
rapid inbreeding is, however, not very different except in the first few 
generations. 

Consider first a single locus. When there is no dominance the 
genotypic variance in the base population, given in equation 8.7, 
becomes 

$$V_G = 2p_0 q_0 a^2$$ 

The variance within any one line is 

$$V_G = 2pqa^2$$ 

where p and q are the gene frequencies in that line. The mean 
variance within lines is 

$$V_{Gw} = 2\overline{(pq)}\,a^2$$ 

where $\overline{(pq)}$ is the mean value of pq over all lines. Now, $2\overline{(pq)}$ is the 
overall frequency of heterozygotes in the whole population, which, by 
Table 3.1, is equal to $2p_0 q_0 (1 - F)$, where F is the coefficient of 
inbreeding. Therefore 

$$V_{Gw} = 2p_0 q_0 a^2 (1 - F) = V_G (1 - F)$$ 




and this remains true when summation of the variances is made over 
all loci. Thus the within-line variance is $(1 - F)$ times the original 
variance, and as F approaches unity the within-line variance approaches 
zero. 

Now let us consider the between-line variance. This is the variance 
of the true means of lines, and would be estimated from an 
analysis of variance as the between-line component. For a single 
locus, still with no dominance, the mean genotypic value of a line 
with gene frequencies p and q is obtained from equation 7.2 as 

$$M = a(p - q) = a(1 - 2q)$$ 

Thus we want to find the variance of $(a - 2aq)$. Now, in general, 
$\sigma^2_{(X-Y)} = \sigma^2_X + \sigma^2_Y$, if X and Y are uncorrelated. Since in this case a is 
constant from line to line (epistasis being assumed absent) it has no 
variance, and so 

$$\sigma^2_M = \sigma^2_{(2aq)}$$ 

Again, in general, $\sigma^2_{KX} = K^2 \sigma^2_X$ when K is a constant. So 

$$\sigma^2_M = 4a^2 \sigma^2_q = 4a^2 p_0 q_0 F \ \text{(from 3.14)} = 2FV_G$$ 

and this also remains true when summation is made over all loci. 
Thus the between-line genetic variance is $2F$ times the genetic 
variance in the base population. 

The partitioning of the genetic variance into components as 
explained above is summarised in Table 15.1. The total genetic 
variance in the whole population is the sum of the within-line and 
between-line components, and is equal to $(1 + F)$ times the original 
genetic variance. (This is true also of close inbreeding.) Thus when 
inbreeding is complete the genetic variance in the population as a 
whole is doubled, and all of it appears as the between-line component. 

Table 15.1 

Partitioning of the variance due to additive genes in a 
population with inbreeding coefficient F, when the variance 
due to additive genes in the base population is $V_G$. 

    Between lines    $2FV_G$ 
    Within lines     $(1 - F)V_G$ 
    Total            $(1 + F)V_G$ 

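
The entries of Table 15.1 can be checked numerically. The following is a minimal sketch and not part of the original treatment. It assumes an idealised population in which every line is propagated by binomial sampling of 2N genes each generation, so that F after t generations is 1 - (1 - 1/2N)^t, and it follows the exact distribution of gene frequencies among lines for a single additive locus. The function names and the parameter values (2N = 20, initial frequency 0.3, ten generations) are illustrative assumptions.

```python
from math import comb

def drift_distribution(two_n, start_count, generations):
    """Exact distribution of gene counts among lines after random drift.

    Assumes each line starts with `start_count` copies of the gene out of
    `two_n`, and that each generation is a binomial sample of `two_n`
    genes from the previous one (an idealised population).
    """
    dist = [0.0] * (two_n + 1)
    dist[start_count] = 1.0
    for _ in range(generations):
        new = [0.0] * (two_n + 1)
        for i, weight in enumerate(dist):
            if weight == 0.0:
                continue
            q = i / two_n
            for j in range(two_n + 1):
                new[j] += weight * comb(two_n, j) * q**j * (1 - q) ** (two_n - j)
        dist = new
    return dist

def additive_components(two_n, start_count, a, generations):
    """Return (within-line, between-line) variance for one additive locus."""
    dist = drift_distribution(two_n, start_count, generations)
    freqs = [j / two_n for j in range(two_n + 1)]
    mean_q = sum(w * q for w, q in zip(dist, freqs))
    mean_pq = sum(w * q * (1 - q) for w, q in zip(dist, freqs))
    var_q = sum(w * (q - mean_q) ** 2 for w, q in zip(dist, freqs))
    within = 2 * mean_pq * a**2   # mean over lines of 2pq a^2
    between = 4 * a**2 * var_q    # variance of the line means a(1 - 2q)
    return within, between

def expected_components(two_n, start_count, a, generations):
    """Expectations from Table 15.1: ((1 - F) V_G, 2 F V_G)."""
    q0 = start_count / two_n
    v_g = 2 * q0 * (1 - q0) * a**2
    f = 1 - (1 - 1 / two_n) ** generations
    return (1 - f) * v_g, 2 * f * v_g

if __name__ == "__main__":
    print(additive_components(20, 6, 1.0, 10))
    print(expected_components(20, 6, 1.0, 10))
```

Under these assumptions the two printed pairs should agree to within rounding error, and their sum is (1 + F) times the original variance. The agreement is exact only for additive genes and in the absence of selection, as the text stipulates.
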
The genetic variance within lines, before inbreeding is complete, 
is partitioned within and between the families of which the lines are 
composed. Under slow inbreeding with random mating within the 
lines, it is partitioned equally within and between full-sib families. 
The covariance of relatives within the lines is just as described in 
Chapter 9, each line being a separate random-breeding population 
with a total genetic variance of $(1 - F)V_G$, on the average. From this 
we can deduce what the heritability is expected to be within any one 
line. It will be $(1 - F)V_G / [(1 - F)V_G + V_E]$, and this reduces to 

$$h_t^2 = \frac{h_0^2 (1 - F_t)}{1 - h_0^2 F_t} \qquad (15.1)$$ 

where $h_t^2$ and $F_t$ are the heritability within lines and the inbreeding 
coefficient at time t, and $h_0^2$ is the original heritability in the base 
population. This shows how the heritability is expected to decline 
with the inbreeding in a small population. The formula, however, is 
applicable only to characters with no non-additive variance, and in 
the absence of selection. The operation of natural selection renders 
the reduction of the heritability less than expected, especially under 
slow inbreeding. This point has been demonstrated experimentally 
with Drosophila (Tantawy and Reeve, 1956). 
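
The expected decline of the heritability can be tabulated with a few lines of code. This is a small sketch under the same restrictions as the formula itself (additive genes only, no selection, constant environmental variance). The function names and the starting heritability of 0.5 are assumptions made for illustration.

```python
def heritability_within_lines(h0_sq, f):
    """Expected within-line heritability: h0^2 (1 - F) / (1 - h0^2 F)."""
    return h0_sq * (1 - f) / (1 - h0_sq * f)

def heritability_from_variances(v_g, v_e, f):
    """The same quantity from the variances: (1 - F) V_G / [(1 - F) V_G + V_E]."""
    return (1 - f) * v_g / ((1 - f) * v_g + v_e)

if __name__ == "__main__":
    for f in (0.0, 0.25, 0.5, 0.75):
        print(f, round(heritability_within_lines(0.5, f), 3))
```

The two functions give the same value whenever the original heritability equals V_G / (V_G + V_E), which is the reduction step used in the text.
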

Dominance. The components of variance arising from additive 
genes will have been seen to be independent of the gene frequencies 
in the base population. When we consider genes with any degree of 
dominance, however, we find that the changes of variance on in- 
breeding depend on the initial gene frequencies, and this makes it 
impossible to give a general solution in terms of the genetic variance 
present in the base population. We shall therefore do no more than 
give the conclusions arrived at by A. Robertson (1952) for the case of 
fully dominant genes, when the recessive allele is at low frequency. 
This is the situation most likely to apply to variation in fitness arising 
from deleterious recessive genes, though the effects of selection are 
here disregarded. Fig. 15.2 shows the redistribution of variance 
arising from recessive genes at a frequency of q = 0.1 in the base 
population. Fig. 15.2(a) refers to full-sib mating with only one 
family in each line, and Fig. 15.2(b) refers to slow inbreeding. A 
surprising feature of the conclusions is that the within-line variance 
at first increases, reaching a maximum when the coefficient of 
inbreeding is a little under 0.5, and it remains at a fairly high level until 
the coefficient of inbreeding approaches 1. The reason, in general 
terms, for the apparent anomaly that the variation within lines 
increases during the first stages of inbreeding, can be seen from a 
consideration of the relationship between the gene frequency and the 
variance arising from a dominant gene shown in Fig. 8.1(b). The 
gene frequency is taken to start at a value of 0.1, and on inbreeding it 
will increase in some lines and decrease in others, the increase being 
on the average equal in amount to the decrease. But examination of 
the graph shows that an increase of gene frequency by a certain 
amount will increase the variance more than a decrease of the same 
amount will reduce it. Therefore, on the average, the variance within 
the lines will increase in the early stages of inbreeding. This increase 
of variance would be detectable in practice only if a substantial part 
of the genetic variance were due to recessive genes at low frequencies. 

Fig. 15.2. Redistribution of variance arising from a single fully 
recessive gene with initial frequency q = 0.1. (a) with full-sib 
mating, (b) with slow inbreeding. (From A. Robertson, 1952; 
reproduced by courtesy of the author and the editor of Genetics.) 

$V_t$ = total genetic variance. 
$V_b$ = between-line component. 
$V_w$ = within-line component. 
$V_a$ = additive genetic variance within lines. 

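
The asymmetry described above can be illustrated numerically. The sketch below is an illustration only. It assumes a single fully recessive gene with genotypic values +a, +a and -a, random mating within the line, and equal and opposite changes of gene frequency of 0.05 about the starting value of 0.1. The function name and the size of the step are assumptions.

```python
def genotypic_variance_recessive(q, a=1.0):
    """Genotypic variance from one fully recessive gene at frequency q.

    Genotypic values follow the +a, d, -a convention with d = a (complete
    dominance), so only the recessive homozygote, at frequency q^2,
    differs from the other genotypes.
    """
    p = 1.0 - q
    freqs = (p * p, 2 * p * q, q * q)
    values = (a, a, -a)
    mean = sum(f * v for f, v in zip(freqs, values))
    return sum(f * (v - mean) ** 2 for f, v in zip(freqs, values))

q0, step = 0.1, 0.05
gain = genotypic_variance_recessive(q0 + step) - genotypic_variance_recessive(q0)
loss = genotypic_variance_recessive(q0) - genotypic_variance_recessive(q0 - step)
print(gain > loss)  # the rise in one line exceeds the fall in another
```

With these values the gain in variance from a rise in frequency exceeds the loss from an equal fall, so the average within-line variance goes up. The sketch deals with the total genotypic variance of one locus and does not reproduce the full calculation behind Fig. 15.2.
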
Practical considerations. The extent to which the theoretical 
changes of variance described in this chapter can be observed in 
practice depends on how much environmental variance is present. 
The precise estimation of variance requires a large number of obser- 
vations and the estimates obtained in practice are usually subject to 


rather large deviations due to the chances of sampling. Consequently 
the changes of variance must usually be quite substantial before they 
are likely to be readily detected. The genotypic variance, moreover, 
seldom constitutes the major part of the phenotypic variance. 
Therefore, in relation to the original phenotypic variance, the expected 
changes due to inbreeding are usually rather small, and this renders 
their detection all the more difficult. Furthermore, the detection of 
the expected changes of phenotypic variance is entirely dependent on 
the constancy of the environmental variance, and this cannot be 
assumed without evidence, as we shall show in the next section. For 
these reasons, and also because of the simplifications we have had to 
make, we must bear in mind the uncertainties in the connexion 
between what is expected and what may be observed in the pheno- 
typic variance. 
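
How small the expected change is in relation to the phenotypic variance can be seen from a short calculation. The sketch assumes additive genes only and, contrary to what may happen in practice, an environmental variance that does not change on inbreeding. The function name and the example values (heritability 0.3, F = 0.25) are illustrative assumptions.

```python
def within_line_phenotypic_ratio(h0_sq, f):
    """Expected within-line phenotypic variance relative to the base population.

    Assumes additive genes only and an unchanged environmental variance:
    [(1 - F) V_G + V_E] / (V_G + V_E) = 1 - F * h0^2.
    """
    return 1.0 - f * h0_sq

print(round(within_line_phenotypic_ratio(0.3, 0.25), 3))
```

With a heritability of 0.3 and F = 0.25 the phenotypic variance within lines is expected to fall by only 7.5 per cent, a difference that would need very large samples to detect.
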

Changes of Environmental Variance 

Several times in previous chapters we have referred to the fact that 
the environmental component of variance may differ according to 
the genotype; in particular that inbred individuals often show more 
environmental variation than non-inbred individuals. This fact has 
been revealed by many experiments in which the variances of inbreds 
and of hybrids have been compared. Any difference of phenotypic 
variance between highly inbred lines and the $F_1$ between them (i.e. 
the "hybrid") must be attributed to a difference of the environmental 
component, because the genetic variance is negligible in amount in 
the hybrids as well as in the inbred lines. The greater susceptibility 
of inbreds than of hybrids to environmental sources of variation has 
been observed in a wide variety of characters and organisms. Some 
examples are cited in Table 15.2; others will be found in the review 
by Lerner (1954). 

The cause of the greater environmental variance of inbreds is not 
yet fully understood. It has been suggested that the possession of 
different alleles at specific loci endows the hybrids with greater 
"biochemical versatility" (Robertson and Reeve, 1952b), which 
enables them to adjust their development and physiological mech- 
anisms to the circumstances of the environment: in other words that 
developmental and physiological homeostasis is improved by allelic 
diversity. On the other hand, it has been suggested (Mather, 1953a) 

that the reduced homeostatic power of inbreds is to be regarded as a 
manifestation of inbreeding depression: homeostatic power is likely 
to be an important aspect of fitness, and would therefore be expected, 
like other aspects of fitness, to decline on inbreeding. The underlying 
mechanism, we may presume, would be directional dominance, 
genes that increase homeostatic power tending on the average to be 
dominant over their alleles that decrease it. Lerner (1954) sees a 
causal connexion between variability and fitness. He believes greater 
stability to be a general property of heterozygotes and regards it as 
the cause of their greater fitness. Though the increase of environmental 
variance on inbreeding is a phenomenon of great theoretical 
interest and some practical importance, too little is known about it to 
justify a more detailed discussion of its causes here. Comprehensive 
discussions will be found in Lerner (1954) and Waddington (1957). 

Table 15.2 

Comparisons of Phenotypic Variance in Inbreds and Hybrids 

The figures are the averages of the inbred lines, and of the 
$F_1$'s where more than one cross was made. (C.V.)$^2$ = squared 
coefficient of variation. 
[Numerical entries for the Inbreds and Hybrids columns not recoverable.] 

Drosophila melanogaster: wing length 
(Robertson and Reeve, 1952b). (C.V.)$^2$. 
6 inbreds and 6 $F_1$'s 

Mice: duration of "Nembutal" anaesthesia 
(McLaren and Michie, 1956b). Log minutes. 
2 inbreds and 1 $F_1$ 

Mice: age at opening of vagina 
(Yoon, 1955). Days. 
3 inbreds and 2 $F_1$'s 

Mice: weight at ages given (3 weeks; 60 days) 
(Chai, 1957). (C.V.)$^2$. 
2 inbreds and 1 $F_1$ 

Rats: weight at 90 days 
(Livesay, 1930). (C.V.)$^2$. 
3 inbreds and 2 $F_1$'s 

There are, however, two further points in connexion with the 
phenomenon that should be mentioned. The first is a technical 
matter. If the mean value of the character differs between inbreds 
and hybrids, as it frequently does, then it may be difficult to decide 
on a proper basis for the comparison of the variances. It is necessary 
to find a measure of the variance that does not merely reflect the 
difference of mean value, and for this purpose the coefficient of 
variation is often an appropriate measure. The problem is basically a 
matter of the choice of scale, and will be discussed again in a later chapter. 
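
The effect of using the coefficient of variation can be shown with a small example. The measurements below are invented purely for illustration and come from no experiment. The hybrid values are simply twice the inbred values, so the variances differ fourfold while the squared coefficients of variation are the same.

```python
from statistics import mean, variance

def squared_cv(values):
    """Squared coefficient of variation: sample variance divided by mean squared."""
    m = mean(values)
    return variance(values) / (m * m)

# Hypothetical measurements, invented purely for illustration.
inbred = [9.0, 10.0, 11.0, 10.0]
hybrid = [18.0, 20.0, 22.0, 20.0]

print(variance(inbred), variance(hybrid))      # variances differ with the mean
print(squared_cv(inbred), squared_cv(hybrid))  # (C.V.)^2 removes that difference
```

Whether the coefficient of variation is the appropriate measure in a real case depends on how the variance is related to the mean, which is the question of scale referred to in the text.
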

The second point concerns the nature of the environmental 
variation that is being measured. There is a distinction to be made 
between the "developmental" variation arising from "accidents of 
development" on the one hand, and adaptive responses to changed 
conditions on the other. The developmental variation is a mani- 
festation of incomplete buffering, or canalisation, of development and 
is generally regarded as being harmful. Inbreds, in so far as they 
show a greater amount of developmental variation, are therefore less 
fit than hybrids; they are less well able to adjust their development to 
different conditions of the environment so as to achieve the optimal 
phenotype. An adaptive response, in contrast, is a modification of the 
phenotypic value that is beneficial to the individual, such as for 
example the thickening of the coat of mammals in response to low 
temperature. If the greater fitness of hybrids over inbreds extends to 
adaptive responses we should therefore expect hybrids to show more 
variation of this sort than inbreds. Thus the nature of the environ- 
mental variation has an important bearing on the interpretation of a 
difference of variability between inbreds and hybrids. 

Uniformity of Experimental Animals 

Inbred strains of laboratory animals, particularly of mice, are 
widely used as experimental material in pharmacological, physio- 
logical, and nutritional laboratories, when uniformity of biological 
material is desired. In some kinds of work, for example work which 
demands the absence of immunological reactions, it is genetic 
uniformity that is required, and abundant experience has shown that the 
inbred strains of mice fully satisfy this requirement. In spite of 
doubts about how effective natural selection for heterozygotes may be 
in delaying the progress towards homozygosity, these strains have 
been proved in practice to be genetically uniform. In the course of 
their maintenance, however, strains inevitably become split up into 
sublines, and it is only within a subline that their genetic uniformity 
can be relied on. Recent work, described in the two following 
examples, has revealed genetic differentiation within two widely used 
strains of mice, and has shown that differences can sometimes be 
detected between sublines separated by only a few generations. 

Fig. 15.3. Differentiation between sublines of the C3H inbred 
strain of mice, in the number of lumbar vertebrae. Each circle 
represents a sample of individuals classified for the number of 
lumbar vertebrae. The proportions of black and white in the 
circles show the proportions of individuals with 6 and with 5 
lumbar vertebrae respectively. (Small proportions of asymmetrical 
individuals are included with the 5-vertebra classes.) The circles 
are positioned according to the date of classification, and arranged 
according to their pedigree relationships. (Data from McLaren 
and Michie, 1954.) 


Example 15.1. The inbred strain of mice known as C3H exhibits 
variability in the number of lumbar vertebrae, and the sublines differ 
markedly in this character. Some sublines consist entirely of mice with 
5 vertebrae, others entirely of mice with 6, and others with different pro- 
portions. The strain originated in 1920 and was split into three main 
groups of sublines in about 1930, each group being later subdivided 
further. The number of lumbar vertebrae has been studied in 16 sublines 
maintained in America and Britain (McLaren and Michie, 1954). The 
pedigree relationships between these sublines, and the proportions of the 
two vertebral types in them, are shown in Fig. 15.3. One of the three main 
groups of sublines has predominantly 6 lumbar vertebrae, and the other 
two groups predominantly 5. This differentiation between the main 
groups may have been due to residual segregation in the strain at the time 
when the main groups became separated. The strain had, however, been 
full-sib mated for 10 years — probably between 20 and 30 generations — 
before the separation of the groups, and residual segregation therefore 
seems unlikely. The sublines within the main groups are differentiated in 
a manner that points to mutation rather than residual segregation as the 
cause. The mutational origin of differentiation is more clearly proved in 
the study described in the next example. 

Example 15.2. Another inbred strain of mice, known as C57BL, has 
been the subject of a thorough study by Grüneberg and co-workers (Deol, 
Grüneberg, Searle, and Truslove, 1957; Carpenter, Grüneberg, and 
Russell, 1957). Twenty-seven skeletal characters were examined in four main 
groups of sublines, three maintained in America and one in Britain, the 
British group being studied in greater detail. The nature and extent of the 
differentiation found cannot be easily summarised, and therefore we shall 
only state the conclusions reached about the cause of the differentiation. 
Each of the four main groups differed from the others in between 7 and 17 
out of the 27 characters. The following conclusions were drawn: (1) The 
differentiation could not reasonably be attributed to residual segregation 
before the separation of the sublines; and segregation following an acci- 
dental outcross was conclusively disproved. (2) Sublines that had been 
separated for a longer time tended to differ by a greater number of charac- 
ters than sublines more recently separated. But the magnitude of the 
difference in any one character was no greater between long-separated 
sublines than between sublines only recently separated. From this it was 
concluded that the differences in each character were caused by mutations 
at single loci. The average difference caused by one mutational step 
amounted to about 0.6 standard deviation of the character affected. 

The study cited in the above example shows that the differences 
between sublines, though they may be readily detectable, are probably 
caused by rather few loci. The differentiation is quite small in 
comparison with the differences between strains or between indi- 
viduals in a non-inbred population. 

In much of the work for which inbred strains are used it is not 
the genetic uniformity alone that matters, but the phenotypic uni- 
formity. The more variable the animals the larger the number that 
must be used to attain a given degree of precision in measuring their 
mean response to a treatment. The value of uniformity is therefore 
in reducing the number of animals that must be used in an experi- 
ment or a test. Inbred animals, however, are costly to produce 
because of their poor breeding qualities, and the advantage gained 
from genetic uniformity has to be weighed against the extra cost of 
the material. If the character to be measured is one of which the 
phenotypic variance is chiefly environmental in origin, then the 
absence of genetic variation in an inbred strain will reduce the pheno- 
typic variance by only a small amount. The extra cost of the inbred 
animals may then outweigh the advantage of their being slightly 
more uniform than non-inbred animals. The phenotypic uniformity 
of inbred animals, however, has been taken on trust from the genetical 
theory of inbreeding, and it seems now that this trust has, to some 
extent at least, been misplaced. In some characters inbred animals 
are more phenotypically variable than non-inbred (see Table 15.2) 
on account of their greatly increased environmental variation. It 
seems now that for some, perhaps for many, characters the greatest 
phenotypic uniformity is found in hybrids (i.e. $F_1$'s) produced by 
crossing two inbred strains. The value of hybrids for work requiring 
phenotypic uniformity has been discussed by Grüneberg (1954); and 
by Biggers and Claringbold (1954). 
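
The connexion between variability and the number of animals needed can be made concrete. The sketch below assumes that precision is judged by the standard error of a mean, whose square is the phenotypic variance divided by the number of animals. The function name and the three variances are hypothetical figures chosen only to show the arithmetic.

```python
from math import ceil

def animals_needed(phenotypic_variance, target_se):
    """Smallest n whose standard error of the mean is at most target_se.

    Uses SE^2 = V_P / n, so n is V_P / SE^2 rounded up.
    """
    return ceil(phenotypic_variance / target_se**2)

# Hypothetical phenotypic variances, invented for illustration only.
for label, v_p in (("non-inbred", 100.0), ("inbred", 90.0), ("hybrid", 60.0)):
    print(label, animals_needed(v_p, 2.0))
```

In this invented case the inbred strain saves only two animals out of twenty-five, which is the situation described in the text for characters whose variance is chiefly environmental.
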

One final point about the use of inbred and hybrid animals may 
be noted. An inbred strain or the $F_1$ of two inbred strains has a 
unique genotype; and that of an inbred, moreover, is one that cannot 
occur in a natural population. Testing the response to any treatment 
on one inbred strain or one hybrid is therefore testing it on one geno- 
type. If there are appreciable differences of response between 
different genotypes, the experimenter is then not justified in describ- 
ing his results as referring, for example, to "the mouse." 



III. The Utilisation of Heterosis 

The crossing of inbred lines plays a major role in the present methods 
of plant improvement, though in animal improvement it plays a much 
less important part. In this chapter the genetic principles underlying 
the use of inbreeding and crossing will be explained, and the various 
methods described in outline. Technical details, however, will not be 
given: for these the reader should consult a textbook of plant breeding 
(e.g. Hayes, Immer, and Smith, 1955). We shall be concerned with 
outbreeding plants and with animals. But since at first sight the 
methods applicable to naturally self-fertilising plants are super- 
ficially rather like those applicable to outbreeding plants and animals, 
it will be advisable first to consider very briefly the improvement of 
self-fertilising plants. 

Self-fertilising plants. Each variety of a naturally self-fertilising 
plant is a highly inbred line, and the only genetic variation within it is 
that arising from mutation. Genetic improvement can therefore be 
made only by choosing the best of the existing varieties or by crossing 
different varieties. The purpose of the crossing is to produce genetic 
variation on which selection can operate. After a cross has been 
made, the $F_1$ and subsequent generations are allowed to self-fertilise 
naturally. A new population, subdivided into lines, is thus made, and 
the lines become differentiated as the inbreeding proceeds. Selection 
is applied by choosing the best lines, which become new and im- 
proved varieties. The essential point to note is that what is sought is 
an improved inbred line, and not a superior crossbred generation: the 
purpose of the crossing is to provide genetic variation and not to 
produce heterosis. The process of crossing and selection among the 
subsequent lines may be repeated cyclically. If two good lines are 
selected out of the first cross, these may be crossed and a second cycle 
of selection applied to the derived lines. The genetic properties of a 
population derived from a cross of two highly inbred lines, such as 
two varieties of a self-fertilising plant, are peculiar in that all 
segregating genes have a frequency of 0.5 in the population as a whole. 
This greatly simplifies the theoretical description of the variances and 
covariances. Special methods of analysis applicable to such popula- 
tions have been developed which lead to a separation of the additive, 
dominance, and epistatic effects, and so provide a guide to the possi- 
bilities of improvement in the population of lines derived from a 
particular cross. For a description of these methods, see Mather 
(1949), Hayman (1958), and Kempthorne (1957, Ch. 21) where other 
references are given. 

Outbreeding plants, and animals. Applied to naturally out- 
breeding plants and to animals, the purpose of crossing inbred lines is 
to produce superior cross-bred, or $F_1$, individuals. The utilisation of 
heterosis in this way depends on selection as well as on the inbreeding 
and crossing. The selection is applied, in principle, to the crosses, 
with the aim of finding pairs of lines that cross well, so that the lines 
may be perpetuated and provide cross-bred individuals for com- 
mercial use. In practice, however, the performance of the lines 
themselves has to be taken into account, because the lines must be 
reasonably productive if they are to be maintained and used for 
crossing. This method has been very successful with plants, and has 
led to an improvement of 50 per cent in the yield of maize grown 
commercially in the United States, since hybrid seed started to be 
used in the early 1930's (Mangelsdorf, 1951). Its success with 
animals, however, has been much less notable. The reasons probably 
lie chiefly in the greater amount of space and labour required by 
animals and in their lower reproductive rate, both of which add 
greatly to the difficulty of producing and testing the inbred lines. 
During the inbreeding a large proportion of the lines die out from 
inbreeding depression before a reasonably high degree of inbreeding 
has been attained. Consequently the inbreeding programme must 
start with a very large number of lines if enough are to be left after the 
wastage to give some scope for the selection of good crosses. Another 
point is that with plants that can be self-fertilised, such as maize, the 
inbreeding proceeds much faster than with animals. To attain an 
inbreeding coefficient of, say, 90 per cent would require only 4 
years for maize, but 11 years for pigs or chickens, and about 50 years 
for cattle with a 4- or 5-year generation interval. 
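
The figures quoted can be reproduced from the usual recurrence equations for the inbreeding coefficient. The sketch below rests on assumptions not stated in this passage: that maize is self-fertilised, that the animals are full-sib mated, and that the base population is non-inbred. The function names are illustrative.

```python
def inbreeding_selfing(generations):
    """F after t generations of self-fertilisation: F_t = (1 + F_{t-1}) / 2."""
    f = 0.0
    for _ in range(generations):
        f = (1.0 + f) / 2.0
    return f

def inbreeding_full_sib(generations):
    """F after t generations of full-sib mating:
    F_t = (1 + 2 F_{t-1} + F_{t-2}) / 4, starting from a non-inbred base."""
    f_prev2, f_prev1 = 0.0, 0.0
    for _ in range(generations):
        f_prev2, f_prev1 = f_prev1, (1.0 + 2.0 * f_prev1 + f_prev2) / 4.0
    return f_prev1

def generations_to_reach(target, f_of_t):
    """Smallest number of generations at which f_of_t reaches the target."""
    t = 0
    while f_of_t(t) < target:
        t += 1
    return t

print(generations_to_reach(0.9, inbreeding_selfing))   # selfing
print(generations_to_reach(0.9, inbreeding_full_sib))  # full-sib mating
```

On these assumptions selfing needs 4 generations and full-sib mating 11. With one generation a year this gives 4 years for maize and 11 years for pigs or chickens, and with a generation interval of 4 or 5 years it gives 44 to 55 years for cattle.
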

Let us now consider the genetic principles on which the utilisa- 
tion of heterosis depends. It was shown in Chapter 14 that crosses 
made at random between lines inbred without selection are expected 


to have a mean value equal to that of the base population. This is 
the reason why inbreeding and crossing alone cannot be expected to 
lead to an improvement, but must be supplemented by selection. In 
practice some improvement can be expected from the effects of 
natural selection. It eliminates lethal and severely deleterious genes 
during the inbreeding, and in so far as these genes affect the desired 
character an improvement of the cross-bred mean over that of the 
base population is to be expected. But this improvement will not be 
very great, because the deleterious genes eliminated will have been at 
low frequencies in the base population — and the more harmful, the 
lower the frequency — so that their effect on the population mean will 
be small. It has been calculated, on the basis of assumptions about 
the number of loci concerned and their mutation rates, that an im- 
provement of 5 per cent in fitness is the most that could be expected 
from the elimination of deleterious recessive genes (Crow, 1948, 1952). 
The bulk of the improvement, therefore, must come from artificial 
selection applied to the economically desirable characters. 

The crossing of inbred lines produces no genotypes that could not 
occur in the base population. But whereas the best genotypes occur 
only in certain individuals in the base population, they are replicated 
in every individual of certain crosses. It is in this replication of a 
desirable genotype that the chief merit of the method lies. Let us, for 
simplicity, consider crosses between fully inbred lines. The gametes 
produced by a highly inbred line are all identical, except for mutation. 
And the gene content of the gametes of any one line could in principle 
be found in a gamete from the base population. Therefore the 
genotype of the $F_1$ of two lines could in principle be found in an individual 
of the base population. Thus, provided there has been no selection 
during the inbreeding, a set of crosses made at random is genetically 
equivalent to a set of individuals taken at random from the base popu- 
lation; and the individuals of one cross are replicates of one individual 
in the base population. This replication of a genotype in the indi- 
viduals of a cross allows the genotypic value to be measured with little 
error; whereas the genotypic value of an individual in the base popu- 
lation is only crudely measured by its phenotypic value. Further, it is 
the genotypic value that is measured in the cross and can be repro- 
duced indefinitely, as long as the inbred lines are maintained; whereas 
only the breeding value can be reproduced by selection of individuals 
in a non-inbred population. Therefore the condition under which 
inbreeding and crossing are likely to be a better means of improvement 




than selection without inbreeding is when much of the genetic 
variance of the character is non-additive. 

The amount of improvement that can be made by selection among 
a number of crosses depends on the amount of variation between the 
crosses. The same relationship holds between the intensity of selec- 
tion, the standard deviation, and the selection differential as was 
described in Chapter 11 and illustrated in Fig. 11.3. In the following 
section the variance between crosses made at random between pairs 
of lines inbred without selection will be examined. 

Variance between Crosses 

The variance between crosses to be considered is the variance of 
the true means of the crosses, or the between-cross component as 
estimated from an analysis of variance. The variance of the observed 
means will contain a fraction of the within-cross component for the 
reasons explained in connexion with family selection in Chapter 13. 
We shall assume that the experimental design has eliminated all 
non-genetic sources of variation from the between-cross component. 

If the lines crossed are fully inbred there will be no genetic vari- 
ance within the crosses, and the variance between crosses will be 
equal to the genotypic variance in the base population, since each 
cross is equivalent to an individual of the base population. When the 
lines are only partially inbred, however, some genetic variance will 
appear within the crosses, and the between-cross variance will be less 
than with fully inbred lines. It is therefore important to know in 
what manner the between-cross variance increases as inbreeding 
proceeds, since this will tell us how much is to be gained by proceed- 
ing to high levels of inbreeding. 

We noted that crosses between fully inbred lines are genetically 
equivalent to single individuals of the base population. Crosses 
between partially inbred lines are analogous, not to individuals, but 
to families, with degrees of relationship dependent on the inbreeding 
coefficient of the lines. The variance between families can be formu- 
lated in terms of the degree of relationship in the families (Kemp- 
thorne, 1954), and this formulation may be extended to crosses by 
regarding the crosses as families with a relationship depending on the 
inbreeding coefficient of the lines. The following expression is then 
obtained for the component of variance between crosses: 

$$\text{Between-cross variance} = F\,V_A + F^2 V_D + F^2 V_{AA} + F^3 V_{AD} + F^4 V_{DD} + \dots \qquad (16.1)$$

In this expression $V_A$ and $V_D$ are the additive and dominance variances
in the base population; $V_{AA}$, $V_{AD}$ and $V_{DD}$ are the interaction
components as explained in Chapter 8; and F is the inbreeding
coefficient of the lines as specified below. The interaction components
are included because epistasis may have important effects. Only
two-factor interactions, however, are shown: the higher interactions
have coefficients in correspondingly higher powers of F. (For every
A in the subscript there is a factor F, and for every D a factor $F^2$.)
The formulation in equation 16.1 is conditional on the following
specifications about how the crosses are made. 1. All lines have the
same coefficient of inbreeding. 2. All lines have independent ancestry 
back to the base population; i.e. there is no relationship between the 
lines. 3. Each cross is made from many individuals of the parent 
lines; and these individuals are not related to each other within their 
lines. This means that the genetic variance within the lines is fully 
represented within the crosses. 4. The coefficient of inbreeding, F, 
refers not to the individuals used as parents of the crosses, but to their 
progeny if they were mated within their own lines; in other words, F 
is the inbreeding coefficient of the next generation of the lines. 

Let us now examine the expression 16.1 and consider what it tells 
us about the variance between crosses. When the inbreeding coeffi- 
cient is unity the between-cross variance is, as we have already stated, 
simply the sum of all the components of genetic variance in the base 
population. During the progress of the inbreeding the contribution 
of the additive variance increases linearly with F; those of the dominance
variance and of A × A interactions increase with the square of
F; and the other interaction components with the third or fourth
power of F. This means that the dominance and interaction components
contribute proportionately more at higher levels of inbreeding
than at lower levels. If the character is one with predominantly
non-additive variance, the crosses will differ little in merit during the 
early stages but will differentiate rapidly in the final stages. Since this 
is the sort of character for which inbreeding and crossing is likely to 
be the most effective means of improvement, it is clear that inbreed- 
ing must be taken to a fairly high level if anything approaching its full 
benefit is to be realised. Some idea of the level of inbreeding required 
can be obtained by noting that with F = 0.5 the between-cross variance
is equal to the variance between full-sib families in the base
population. At this level of inbreeding, therefore, the best cross would
do no more than replicate the best full-sib family in a non-inbred
population.
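
The behaviour of expression 16.1 can be checked numerically. The sketch below evaluates the between-cross variance at several levels of inbreeding and compares it, at F = 0.5, with the variance between full-sib families, for which the usual coefficients of the covariance of full sibs (1/2, 1/4, 1/4, 1/8, 1/16) are taken. The variance components are hypothetical, chosen to represent a character with much non-additive variance.

```python
import math


def between_cross_variance(F, VA, VD, VAA=0.0, VAD=0.0, VDD=0.0):
    """Expression 16.1: a factor F for every A, and F squared for every D."""
    return F * VA + F**2 * VD + F**2 * VAA + F**3 * VAD + F**4 * VDD


def full_sib_family_variance(VA, VD, VAA=0.0, VAD=0.0, VDD=0.0):
    """Variance between full-sib families (covariance of full sibs)."""
    return VA / 2 + VD / 4 + VAA / 4 + VAD / 8 + VDD / 16


# Hypothetical components of variance in the base population.
components = dict(VA=2.0, VD=6.0, VAA=1.0, VAD=0.5, VDD=0.5)
total_genotypic = sum(components.values())

for F in (0.25, 0.5, 0.75, 1.0):
    fraction = between_cross_variance(F, **components) / total_genotypic
    print(f"F = {F:.2f}: between-cross variance is {fraction:.1%} of the genotypic variance")

same_as_full_sibs = math.isclose(
    between_cross_variance(0.5, **components),
    full_sib_family_variance(**components),
)
print("F = 0.5 equals full-sib family variance:", same_as_full_sibs)
```

With components of this sort most of the differentiation between crosses arises in the later stages of inbreeding.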

Combining ability. The components of genetic variance making 
up the between-cross variance that we have been discussing are 
causal components, in the sense explained in Chapter 9. The vari- 
ance between crosses, however, can also be analysed into observa- 
tional components in the following way. Suppose a set of lines are 
crossed at random, each line being simultaneously crossed with a 
number of others. We can then calculate for each line its mean performance,
i.e. the mean value of the F1's in crosses with other lines.
This is known as the general combining ability of the line. The 
performance of a particular cross may deviate from the average 
general combining ability of the two lines, and this deviation is 
known as the special (or specific) combining ability of the cross. Or, if 
we measure the mean values as deviations from the general mean of 
all crosses, we can express the value of a certain cross as the sum of 
the general combining abilities of the two lines and the special 
combining ability of the pair of lines. Thus the mean value of the 
cross of line X with line Y is 

$$M_{XY} = \mathrm{G.C.}_X + \mathrm{G.C.}_Y + \mathrm{S.C.}_{XY}$$


where G.C. and S.C. stand for the general and special combining 
abilities. The variance between crosses can therefore be analysed 
into two components: variance of general combining abilities and 
variance of special combining abilities; the latter being, in statistical 
terms, the interaction component. 
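
The partition of cross means into general and special combining abilities may be sketched as follows. The example assumes a complete set of single crosses among n lines with reciprocals pooled, and it uses the least-squares convention that the general combining abilities sum to zero; the factor (n - 1)/(n - 2), which allows for the fact that no line is crossed with itself, is an assumption of this sketch and is not given in the text. The cross means are hypothetical.

```python
from itertools import combinations


def combining_abilities(cross_means):
    """cross_means maps frozenset({X, Y}) to the mean of the cross X x Y.
    Returns (general mean, G.C. of each line, S.C. of each cross)."""
    lines = sorted({line for pair in cross_means for line in pair})
    n = len(lines)
    if n < 3:
        raise ValueError("at least three lines are needed")
    if set(cross_means) != {frozenset(p) for p in combinations(lines, 2)}:
        raise ValueError("every pair of lines must be crossed exactly once")

    general_mean = sum(cross_means.values()) / len(cross_means)
    gca = {}
    for line in lines:
        line_mean = sum(v for pair, v in cross_means.items() if line in pair) / (n - 1)
        # Least-squares estimate; a line is never crossed with itself.
        gca[line] = (n - 1) / (n - 2) * (line_mean - general_mean)
    sca = {
        pair: value - general_mean - sum(gca[line] for line in pair)
        for pair, value in cross_means.items()
    }
    return general_mean, gca, sca


# Hypothetical means of the six single crosses among four lines.
means = {
    frozenset({"A", "B"}): 55.0, frozenset({"A", "C"}): 51.0,
    frozenset({"A", "D"}): 50.0, frozenset({"B", "C"}): 50.0,
    frozenset({"B", "D"}): 47.0, frozenset({"C", "D"}): 47.0,
}
general_mean, gca, sca = combining_abilities(means)
for line, value in gca.items():
    print(f"G.C. of line {line}: {value:+.2f}")
for pair, value in sca.items():
    print(f"S.C. of cross {' x '.join(sorted(pair))}: {value:+.2f}")
```

By construction each cross mean is recovered as the general mean plus the two general combining abilities plus the special combining ability of the pair.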

The observational components of variance attributable to general 
and special combining ability are made up of the causal components 
in the following way. 


Variance of crosses attributable to:

General combining ability $= F V_A + F^2 V_{AA} + \dots$

Special combining ability $= F^2 V_D + F^3 V_{AD} + F^4 V_{DD} + \dots$

So differences of general combining ability are due to the additive 
genetic variance in the base population, and to A × A interactions;
and differences of special combining ability are attributable to the 
non-additive genetic variance. Consequently the variance of general 


combining ability increases linearly with F (apart from the interaction 
component), while the variance of special combining ability increases 
with higher powers of F. It is therefore the special, and not the 
general, combining ability that is expected to increase more rapidly 
as the inbreeding reaches high levels. 
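
The different rates of increase of the two observational components can be seen in a short calculation. The sketch assumes the composition of the components given above, truncated at two-factor interactions, and uses hypothetical values for the causal components.

```python
def gca_variance(F, VA, VAA=0.0):
    """Variance of general combining ability: F*VA + F^2*VAA."""
    return F * VA + F**2 * VAA


def sca_variance(F, VD, VAD=0.0, VDD=0.0):
    """Variance of special combining ability: F^2*VD + F^3*VAD + F^4*VDD."""
    return F**2 * VD + F**3 * VAD + F**4 * VDD


# Hypothetical causal components: equal additive and dominance variance.
VA, VD = 4.0, 4.0
special_share = {}
for F in (0.25, 0.5, 0.75, 1.0):
    general = gca_variance(F, VA)
    special = sca_variance(F, VD)
    special_share[F] = special / (general + special)
    print(f"F = {F:.2f}: general = {general:.2f}, special = {special:.2f}, "
          f"special share = {special_share[F]:.0%}")
```

In the absence of epistasis the ratio of the special to the general component is simply F multiplied by the ratio of dominance to additive variance, so the share due to special combining ability rises steadily as inbreeding proceeds.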

Example 16.1. An analysis of egg-laying in crosses between highly 
inbred lines of Drosophila melanogaster is reported by Gowen (1952). 
Five lines were crossed in all ways, including reciprocals, and the numbers 
of eggs laid by females in the fifth to ninth days of adult life were recorded. 
The analysis of the crosses yielded the following percentage composition 
of the variance of egg number: 

Variance component                  % of total
General combining ability              11.3
Special combining ability               9.7
Differences between reciprocals         2.3
Within crosses                         76.6

Thus about half the variance between crosses was due to general, and half 
to special, combining ability. 
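
The arithmetic behind the closing statement of the example can be set out as follows. The percentages are those tabulated above; the only assumption is that the variance between crosses is taken as the sum of the general and special components, leaving reciprocal differences aside.

```python
# Percentage composition of the variance of egg number (Example 16.1).
components = {
    "General combining ability": 11.3,
    "Special combining ability": 9.7,
    "Differences between reciprocals": 2.3,
    "Within crosses": 76.6,
}

general = components["General combining ability"]
special = components["Special combining ability"]
between_crosses = general + special

general_share = general / between_crosses
special_share = special / between_crosses
print(f"general: {general_share:.1%} of the between-cross variance")
print(f"special: {special_share:.1%} of the between-cross variance")
```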

Some of the methods of improvement by crossing aim at utilising 
only the variance of general combining ability, and then the measure- 
ment of the general combining ability of the lines becomes an im- 
portant procedure. In addition to the making of specific crosses 
between the lines, there are two other methods of measuring general 
combining ability. A method convenient for use with plants is known 
as the polycross method. A number of plants from all the lines to be 
tested are grown together and allowed to pollinate naturally, self- 
pollination being prevented by the natural mechanism for cross- 
pollination, or by the arrangement of the plants in the plot. The seed 
from the plants of one line are therefore a mixture of random crosses 
with other lines, and their performance when grown tests the general 
combining ability of that line. Another method, applicable also to 
animals, is known as top-crossing. Individuals from the line to be 
tested are crossed with individuals from the base population. The 
mean value of the progeny then measures the general combining 
ability of the line, because the gametes of individuals from the base 
population are genetically equivalent to the gametes of a random set 
of inbred lines derived without selection from the base population. 

These methods are essentially methods for comparing the general 
combining abilities of different lines, and so leading to the choice of 
the lines most likely to yield the best cross, among all the crosses that 
might be made between the available lines. But if much of the varia- 
tion between crosses is due to special combining ability, then the 
general combining ability of two lines will not provide a reliable 
guide to the performance of their cross. 

Methods of Selection for Combining Ability 

The methods of improvement by inbreeding and crossing fall into 
two groups, according to whether they are designed to utilise only 
the variation in general combining ability or to utilise also the varia- 
tion in special combining ability. 

Selection for general combining ability. When the improve- 
ment of general combining ability only is sought the procedure of 
selection is much simplified. The general combining abilities of all 
available lines can be measured, as already explained, without the 
necessity of making and testing all the possible crosses between them. 
Some selection can usefully be applied to the lines before they are 
tested in crosses. There is some degree of correlation between a line's 
performance as an inbred and its general combining ability, so a 
proportion of lines can be discarded on the basis of their own per- 
formance before the crosses are made. And, finally, there is less 
to be lost by making the crosses at a relatively low coefficient of in- 
breeding. Selection for general combining ability may be repeated 
in cycles, a procedure known in plant breeding as recurrent selection. 
(In animal breeding this term has come to have a different meaning, 
as will be explained below.) Lines are inbred by self-fertilisation 
for one or two generations and their general combining abilities 
tested. The lines with the best general combining abilities are then 
crossed and a second cycle of inbreeding and selection carried out. 
A review of the progress made by this method is given by Sprague
(1952).

The seed for commercial use is usually not made by a single cross 
of two lines, but by a 3-way or 4-way cross. The object of this is to 
overcome the generally low production of an inbred used as seed 
parent. In a 3-way cross the F1 of two lines is used as seed parent and
crossed with a third inbred line. In a 4-way cross two F1's of different
pairs of lines are crossed. The performance of 3-way and 4-way
crosses can be reliably predicted from the performance of the con-
stituent single crosses.
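
The text does not state how the prediction is made. One commonly used rule, assumed here for illustration only, takes the mean of those single crosses that combine one line from each side of the final cross, leaving out the single crosses that serve as parents. The sketch below applies this rule to hypothetical single-cross means.

```python
def single(cross_means, x, y):
    """Mean of the single cross between lines x and y (reciprocals pooled)."""
    return cross_means[frozenset({x, y})]


def predict_three_way(cross_means, a, b, c):
    """Predicted mean of (a x b) x c: average of a x c and b x c."""
    return (single(cross_means, a, c) + single(cross_means, b, c)) / 2


def predict_four_way(cross_means, a, b, c, d):
    """Predicted mean of (a x b) x (c x d): average of the four
    non-parental single crosses a x c, a x d, b x c and b x d."""
    pairs = ((a, c), (a, d), (b, c), (b, d))
    return sum(single(cross_means, x, y) for x, y in pairs) / 4


# Hypothetical single-cross means for four inbred lines.
means = {
    frozenset({"A", "B"}): 55.0, frozenset({"A", "C"}): 51.0,
    frozenset({"A", "D"}): 50.0, frozenset({"B", "C"}): 50.0,
    frozenset({"B", "D"}): 47.0, frozenset({"C", "D"}): 47.0,
}
three_way = predict_three_way(means, "A", "B", "C")
four_way = predict_four_way(means, "A", "B", "C", "D")
print(f"(A x B) x C predicted at {three_way:.2f}")
print(f"(A x B) x (C x D) predicted at {four_way:.2f}")
```

The rule rests on the fact that every individual of the final cross receives one gamete from each side, so that its genotype is, locus by locus, that of one of the constituent single crosses.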

Even though selection for general combining ability is widely 
used in plant breeding and has abundantly proved its success, it is 
not, perhaps, altogether clear why it is preferred to selection without 
inbreeding, made either by individual selection or by family selection. 
Since the variation in general combining ability is attributable to 
additive variance in the population from which the lines were derived, 
selection should be effective without inbreeding. Comparisons of the 
two methods by experiment have not been made on a scale sufficient 
to prove convincingly the superiority of selection with inbreeding 
(see Robinson and Comstock, 1955). 

Selection for general and specific combining ability. The 
specific combining ability of a cross cannot be measured without 
making and testing that particular cross. Therefore to achieve a 
reasonably high intensity of selection for specific combining ability a 
large number of crosses must be made and tested. Is no short-cut 
possible? Could the superior combining ability not be, as it were, 
built into the lines by selection? From the causes of heterosis ex- 
plained in Chapter 14 it is clear that what is wanted is two lines that 
differ widely in the gene frequencies at all loci that affect the character 
and that show dominance. It should therefore be possible to build 
up these differences of gene frequency in two lines by selection. 
Instead of the differences of gene frequency being produced by the 
random process of inbreeding, they would be produced by the directed 
process of selection, which would be both more effective and more 
economical. Two methods based on this idea have been devised. 
These methods, though originating from plant breeding, provide — 
in theory at least — the most hopeful means of utilising heterosis in 
animals. We shall first describe the method known as reciprocal 
recurrent selection, or simply as reciprocal selection. In outline, the 
procedure is as follows. 

The start is made from two lines, say A and B. (We shall call 
them "lines" even though they will not be deliberately inbred.) 
Crosses are made reciprocally, a number of A ♂♂ being mated to
B ♀♀, and a number of B ♂♂ to A ♀♀. The cross-bred progeny are then
measured for the character to be improved and the parents are judged
from the performance of their progeny. The best parents are selected
and the rest discarded, together with all the cross-bred progeny, which
are used only to test the combining ability of the parents. The selected
individuals must then be remated, to members of their own line, to pro- 
duce the next generation of parents to be tested. These are crossed 
again as before and the cycle repeated. It is seldom practicable to select 
among the female parents, and the selection is chiefly applied to the 
males. Each male is mated to several females of the other line so that 
the judgment of his combining ability may be based on a reasonably 
large number of progeny. Most of these females are needed to mate to 
the selected males of their own line for the continuation of the line. 
Deliberate inbreeding is avoided as far as possible, for the reason to be 
explained below. The use of all the females as parents in their own lines 
helps to reduce the rate of inbreeding and allows relatively few males to 
be used, which intensifies the selection. 

An essential prerequisite is that there should be some difference of 
gene frequency between the two lines at the beginning, or else selec- 
tion for combining ability will be unable to produce a differentiation 
of the lines. Any locus at which the gene frequencies are the same in 
the two lines will be in equilibrium, though an unstable equilibrium. 
Any shift in one direction or the other will give the selection something 
to act on and the difference will be increased. The initial difference 
between the lines may be obtained by starting from two different 
breeds or varieties, choosing two that already cross well; or by de- 
liberate inbreeding, up to perhaps 25 per cent, and relying on random 
differentiation of gene frequencies. 
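
The need for an initial difference of gene frequency can be illustrated by a deliberately simplified model, which is an assumption of this sketch and not part of the theory given here. One locus is considered, with genotypic values +a, d and -a; the lines are taken to be large, so that there is no random drift; and in each generation the gene frequency of each line is supposed to change in proportion to p(1 - p) multiplied by the rate at which the mean of the cross changes with that frequency. The values of a, d, the step size and the starting frequencies are hypothetical.

```python
def cross_mean(p_a, p_b, a, d):
    """Mean of the cross between lines A and B, whose frequencies of the
    allele A1 are p_a and p_b (values: A1A1 = +a, A1A2 = d, A2A2 = -a)."""
    q_a, q_b = 1.0 - p_a, 1.0 - p_b
    return a * (p_a * p_b - q_a * q_b) + d * (p_a * q_b + q_a * p_b)


def reciprocal_selection(p_a, p_b, a, d, generations, step=0.5):
    """Toy deterministic model: each line's frequency moves up the slope of
    the cross mean, scaled by p(1 - p). Both lines are updated together."""
    for _ in range(generations):
        slope_a = a + d * (1.0 - 2.0 * p_b)  # change of cross mean with p_a
        slope_b = a + d * (1.0 - 2.0 * p_a)  # change of cross mean with p_b
        new_a = p_a + step * p_a * (1.0 - p_a) * slope_a
        new_b = p_b + step * p_b * (1.0 - p_b) * slope_b
        p_a = min(1.0, max(0.0, new_a))
        p_b = min(1.0, max(0.0, new_b))
    return p_a, p_b


# Hypothetical overdominant locus: a = 0, d = 1.
same_start = reciprocal_selection(0.30, 0.30, a=0.0, d=1.0, generations=200)
different_start = reciprocal_selection(0.55, 0.45, a=0.0, d=1.0, generations=200)
print("identical lines end at", same_start)
print("slightly different lines end at", different_start)
print("cross mean after differentiation:", cross_mean(*different_start, a=0.0, d=1.0))
```

Under these assumptions two lines that start alike remain alike, whereas a small initial difference is amplified until the lines approach fixation for different alleles and the cross consists almost wholly of heterozygotes.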

Though the performance of the cross is expected to increase 
under this method of selection, the performance of the lines them- 
selves in respect of the character selected is expected to decrease, for 
this reason. Characters to which selection would be applied in this 
way are those subject to inbreeding depression and heterosis; that is 
to say, those in which dominance is directional. The changes of gene 
frequency brought about by the selection are toward the extremes, 
and consequently the mean values of the lines will decline for the 
reasons explained in connexion with inbreeding in Chapter 14. This 
decline in the performance of the lines, however, should not be quite 
as deleterious as the effects of deliberate inbreeding. Inbreeding, as a 
random process, affects all loci, and the mean values of all characters 
showing directional dominance decline. But under reciprocal selec- 
tion it is only the selected character that should decline, except in so 
far as linked loci are carried along. Nevertheless, reproductive fitness 
is nearly always a component of economic value, and it is doubtful
how far the distinction will hold. This, however, is the reason why
deliberate inbreeding of the lines is to be avoided. 

The second method is simpler in procedure than reciprocal 
selection described above. It was devised as a modification of recur- 
rent selection, intended to utilise special as well as general combining 
ability (Hull, 1945), and as yet it has no distinctive name. It is known 
variously as "Hull's modification of recurrent selection," "recurrent
selection to inbred tester," "recurrent selection for special combining 
ability," and in animal breeding simply as "recurrent selection." It 
differs from reciprocal selection in the following way. Instead of 
starting with two lines and selecting both for combining ability with 
the other, one starts with only one line and selects it for combining 
ability with a "tester" line which has previously been inbred. This 
reduces the amount of effort spent on the testing, and is expected to 
yield more rapid progress at the beginning because the initial differ- 
ences of gene frequency between the line and the tester are likely to 
be more marked. But the ultimate gain is expected to be less than 
under reciprocal selection, because the general combining ability of 
the tester line is predetermined, and only the general combining 
ability of the selected line and the special combining ability of the 
cross can be improved. 

The two methods of selection for special combining ability de- 
scribed in this section are comparatively new methods of improvement 
and very little practical experience of them has yet been gained. The 
account of them given here is consequently based almost entirely on 
theory. Theoretical assessments of their merits in relation to other 
methods have been made by Comstock, Robinson, and Harvey 
(1949) and by Dickerson (1952). Though on theoretical grounds 
they seem promising, the results of the only experiments so far pub- 
lished (Bell, Moore, and Warren, 1955; Rasmuson, 1956) are not 

Before we leave the subject of inbreeding we must give some 
further consideration to the particular genetic property that makes 
selection with inbreeding and crossing preferable to selection without 
inbreeding. From the theoretical point of view, and leaving all prac- 
tical considerations aside, the crucial genetic property is over- 
dominance of the genes concerned. The following section is devoted 
to a consideration of overdominance and its significance. 

Overdominance is the property shown by two alleles when the 
heterozygote lies outside the range of the two homozygotes in 
genotypic value with respect to the character under discussion. Its 
meaning was illustrated in Fig. 2.3 with respect to fitness as the 
character, and it has been mentioned from time to time in other 
chapters. We saw in Chapter 2 how selection favouring hetero- 
zygotes leads to a stable gene frequency at an intermediate value, and 
how this overdominance with respect to fitness probably accounts for 
much of the stable polymorphism found in natural populations. 
And in Chapter 12 we saw how overdominance may be a source of 
non-additive genetic variance in populations that have reached their 
limit under artificial selection. It is, however, in connexion with the 
utilisation of heterosis by inbreeding and crossing, or by reciprocal 
selection, that overdominance has its most important practical conse- 
quences. In earlier chapters two basic methods of improvement 
were distinguished, one being selection without inbreeding, and the 
other inbreeding followed by crossing. In this chapter we have seen 
that selection is an integral part of the second method also. The 
essential distinction therefore lies in the crossing, rather than in the 
selection. Now, crossing two lines in which different alleles are fixed 
gives an F1 in which all individuals are heterozygotes; and this is the
only way of producing a group of individuals that are all heterozy- 
gotes. In a non-inbred population no more than 50 per cent of the 
individuals can be heterozygotes for a particular pair of alleles. 
Consequently, if heterozygotes of a particular pair of alleles are 
superior in merit to homozygotes, inbreeding and crossing will be a 
better means of improvement than selection without inbreeding. 
Furthermore, it is only when there is overdominance with respect to 
the desired character, or combination of characters, that inbreeding 
and crossing can achieve what selection without inbreeding cannot. 
Under any other conditions of dominance the best genotype is one of 
the homozygotes, and all individuals can be made homozygous by 
selection, without the disadvantages attendant on inbreeding and 
much more simply than by methods dependent on crossing. It was 
stated earlier in this chapter that the potentialities of inbreeding and 
crossing are greatest when there is much non-additive genetic variance
and little additive. Now we see that this is only part of the truth:
in principle inbreeding and crossing can surpass selection without
inbreeding only when a substantial part of the non-additive variance is
due to overdominance. It is therefore of great practical importance
to know whether overdominance with respect to economically 
desirable characters is a major source of variation. It is also of great 
theoretical interest to know whether overdominance with respect to 
natural fitness is a common phenomenon affecting many loci, because 
natural selection favouring heterozygotes would be a potent factor 
tending to maintain genetic variation in populations. This point will 
be discussed further in Chapter 20. 
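
The argument about overdominance can be followed numerically for a single locus. The sketch assumes a random-mating population with genotypic values +a, d and -a measured from the mid-point of the two homozygotes, so that the population mean is a(p - q) + 2dpq, and it compares the best mean attainable at any gene frequency with the value d of an F1 between lines fixed for different alleles. The values of a and d are hypothetical.

```python
def population_mean(p, a, d):
    """Mean of a random-mating population: a(p - q) + 2dpq."""
    q = 1.0 - p
    return a * (p - q) + 2.0 * p * q * d


def best_population_mean(a, d, steps=1000):
    """Highest mean attainable by changing the gene frequency alone."""
    return max(population_mean(i / steps, a, d) for i in range(steps + 1))


def max_heterozygote_frequency(steps=1000):
    """Largest value of 2pq over all gene frequencies."""
    return max(2.0 * (i / steps) * (1.0 - i / steps) for i in range(steps + 1))


print("highest frequency of heterozygotes:", max_heterozygote_frequency())
for a, d in ((1.0, 0.5), (1.0, 1.0), (1.0, 1.5)):
    f1_value = d  # all individuals of the F1 are heterozygotes
    best = best_population_mean(a, d)
    print(f"a = {a}, d = {d}: best population mean = {best:.3f}, F1 = {f1_value:.3f}")
```

Only in the last case, where d exceeds a, does the F1 surpass what selection without inbreeding could achieve at this locus.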

The contribution of overdominance to the variance, and the pro- 
portion of loci that show overdominance, are really two different 
questions. Genes that are overdominant with respect to fitness will be 
at intermediate frequencies and will therefore contribute much 
more variation than genes at low frequencies. So overdominance may 
be a major source of variation and yet be a property of only a few
loci.

The evidence concerning overdominance has been compre- 
hensively reviewed by Lerner (1954), who reaches the conclusion that 
overdominance with respect to fitness and characters closely con- 
nected with it is widespread and very important. A contrary view is 
expressed by Mather (1955b) on the grounds that much of what
appears to be overdominance with respect to certain characters in 
plants can be attributed to epistatic interaction. These two conflicting 
opinions will be enough to show that the problem of overdominance 
remains still an open question. The aim here is not to discuss the 
opinions, but to indicate briefly the nature of the evidence. 

The evidence concerning overdominance is broadly speaking of 
two sorts, direct and indirect. The direct evidence comes from the 
comparison of heterozygotes and homozygotes in identifiable geno- 
types. The indirect evidence comes from the study of the expected 
consequences of overdominance as they affect the genetic properties of 
a population, or the outcome of certain breeding methods. Both sorts 
of evidence are complicated by linkage. We have to distinguish 
between overdominance as a property of a single locus, and over- 
dominance as a property of a segment of chromosome, which we shall 
refer to as apparent overdominance. Unequivocal evidence of over- 
dominance arising from a single locus is scarce because it can only be 
obtained from a locus that has mutated in a highly inbred line, or 
from a population in which coupling and repulsion linkages are in
equilibrium. The segregation that can be observed in practice, and 
that gives rise to the genetic variation in a population, is usually not a 
segregation of single loci but of segments of chromosome, longer or 
shorter according to the amount of crossing-over. These segments 
of chromosome, or units of segregation, can show overdominance 
even though the separate loci do not. All that is needed to produce 
some degree of apparent overdominance is two genes, linked in 
repulsion, and both partially recessive. Its most extreme form is pro- 
duced by two lethal genes linked in repulsion — a "balanced lethal" 
system — when the heterozygote of the segment spanned by the two 
loci is the only viable genotype. 
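
Apparent overdominance of a segment can be shown with two loci. The sketch assumes two completely linked loci whose effects are additive between loci, each with genotypic values +a, d and -a according to the number of favourable alleles, and with the unfavourable allele partially recessive (d between 0 and a). The numerical values are hypothetical.

```python
def locus_value(favourable_alleles, a, d):
    """Genotypic value at one locus for 0, 1 or 2 favourable alleles."""
    return {0: -a, 1: d, 2: a}[favourable_alleles]


def segment_value(segment_1, segment_2, a, d):
    """Value of an individual carrying two chromosome segments, each written
    as a tuple with 1 for the favourable allele and 0 for the unfavourable
    allele at each locus; effects are summed over loci."""
    return sum(locus_value(x + y, a, d) for x, y in zip(segment_1, segment_2))


# Repulsion phase: each segment carries one favourable and one unfavourable gene.
segment_Ab = (1, 0)
segment_aB = (0, 1)
a, d = 1.0, 0.5  # unfavourable alleles partially recessive at both loci

homozygote_1 = segment_value(segment_Ab, segment_Ab, a, d)
homozygote_2 = segment_value(segment_aB, segment_aB, a, d)
heterozygote = segment_value(segment_Ab, segment_aB, a, d)
print("segment homozygotes:", homozygote_1, homozygote_2)
print("segment heterozygote:", heterozygote)
```

The heterozygote of the segment exceeds both segment homozygotes although neither locus is overdominant by itself.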

In considering the direct evidence it is necessary to recognise that 
overdominance may be manifested at different "levels" according to 
the complexity of the character under discussion. A pair of alleles 
with pleiotropic effects may be found not to exhibit overdominance 
when any of the characters they affect is examined separately; yet if 
natural fitness or economic merit is founded on a combination of 
these characters, the alleles may show overdominance with respect to 
fitness or merit. Thus there may be no overdominance at the lower 
level of the simpler characters, but overdominance at the higher level 
of the more complex character. 

Example 16.2. An example of overdominance due to pleiotropy is 
provided by the pygmy gene in mice, already referred to in several ex- 
amples in earlier chapters. The gene reduces body size and in the homo- 
zygote it causes sterility (King, 1955). In respect of body size it is nearly, 
but not quite, recessive. In respect of sterility it is probably also nearly 
recessive, though this was not proved. In neither body size nor sterility 
separately is there overdominance. But if small size were desirable (as it 
was in the experiment in which the gene was discovered), then under these 
conditions the genotype with the highest merit is the heterozygote, since 
the sterile homozygotes cannot reproduce. With respect to merit, or fitness 
under these conditions, the gene therefore shows overdominance. The 
lethal gene in the line of Drosophila selected for high bristle number, 
mentioned in Chapter 12, is another case of the same sort of overdomin- 
ance; and so also is the sickle-cell anaemia described in Example 2.4. 

The observations that provide direct evidence concerning over- 
dominance may be briefly summarised as follows. The experience 
of Mendelian genetics shows that mutant genes are not commonly 
overdominant with respect to their main effects. Nor is overdominance
with respect to natural fitness at all obvious. Indeed, if there
were more than a mild degree of overdominance with respect to
fitness a gene would not be rare enough to be classed as a "mutant." 
Though the evidence of Mendelian genetics suggests that overdominance
is not a very common property of genes, many cases are nevertheless
known. Cases of overdominance due to pleiotropy, such as those
mentioned in the above example, are not infrequent. And overdominance
with respect to certain components of natural fitness has
been proved for some of the blood group genes in poultry (see Briles,
Allen, and Millen, 1957; Gilmour, 1958).

The nature of the indirect evidence concerning overdominance 
is, in brief summary, as follows. 

1. Experiments on the rate of loss of genetic variance during inbreeding
point to the operation of natural selection in favour of
heterozygotes (Tantawy and Reeve, 1956; Briles, Allen, and Millen, 
1957; Gilmour, 1958). This indicates apparent overdominance, but 
it does not prove overdominance at the individual loci. 

2. Crow (1948, 1952) has given reasons for thinking that the 
yield of grain obtained from the best crosses between inbred lines of 
maize is too high to be accounted for without overdominance at some 
loci. The reasoning depends on assumptions about the number of 
loci affecting yield and the mutation rates, and the conclusion is 
therefore tentative. Robinson et al. (1956) point out that the reasoning
cannot justifiably be applied to maize crosses because the lines
crossed generally come from different varieties and not from the 
same base population as required by Crow's hypothesis. 

3. Comstock and Robinson (1952) have devised methods for 
measuring the average degree of dominance from measurements 
made on non-inbred populations. Preliminary results from maize 
(Robinson and Comstock, 1955) suggest that there cannot be over- 
dominance (as distinct from apparent overdominance) at more than a 
small proportion of the loci that influence the yield of grain. 

4. The existence of polymorphism in natural populations, as
described in Chapter 2, cannot readily be explained except by supposing
that the genes concerned are overdominant with respect to
fitness.

From the foregoing outline of the evidence it is clear that the 
problem of how important overdominance is remains unsolved. 
Some of the differences of opinion about it may arise from different 
views of what phenomena are to be included under the term — 
whether apparent overdominance due to linkage, or overdominance
due to pleiotropy, are to be regarded as overdominance or not.
Moreover, the question of how important overdominance is means
different things according to whether we are concerned with its
frequency as a property of genes, or with the amount of variation it
causes.



The choice of a suitable scale for the measurement of a metric charac- 
ter has been mentioned several times in the foregoing chapters. The 
explanation of what is involved in the choice of a scale and a discussion 
of the criteria of suitability have, however, been deferred till this 
point because these are matters that cannot be properly appreciated 
until the nature of the deductions to be made from the data are 
understood. In other words the choice of a scale has to be made in 
relation to the object for which the data are to be used. The data 
from any experimental or practical study are obtained in the form 
most convenient for the measurement of the character. That is to 
say the phenotypic values are recorded in grams, pounds, centimetres, 
days, numbers, or whatever unit of measurement is most convenient. 
The point at issue is whether these raw data should be transformed to 
another scale before they are subjected to analysis or interpretation. 
A transformation of scale means the conversion of the original units 
to logarithms, reciprocals, or some other function, according to what 
is most appropriate for the purpose for which the data are to be used. 

It is tempting to suppose that each character has its "natural" 
scale, the scale on which the biological process expressed in the 
character works. Thus, growth is a geometrical rather than an arith- 
metical process, and a geometric scale would appear to be the most 
"natural." For example, an increase of 1 gm. in a mouse weighing
20 gm. has not the same biological significance as an increase of 1 gm. 
in a mouse weighing 2 gm.: but an increase of 10 per cent has ap- 
proximately the same significance in both. For this reason a trans- 
formation to logarithms would seem appropriate for measurements of 
weight. This, however, is largely a subjective judgment, and some 
objective criterion for the choice of a scale is needed. There are 
several recognised criteria (see Wright, 1952b); but, as Wright points
out, the different criteria are often inconsistent in the scale they indi- 
cate. And, moreover, the same criterion applied to the same character 
may indicate different scales in different populations. Therefore the 

Chap. 17] 



idea that every character must have its "natural" and correct scale is 
largely illusory. 
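
The example of the two mice can be worked through to show what a transformation to logarithms does. The sketch uses logarithms to base 10, which is an arbitrary choice; any base gives the same comparison.

```python
import math


def change_on_log_scale(before, after):
    """Difference between two weights after transformation to logarithms."""
    return math.log10(after) - math.log10(before)


# An increase of 1 gm. in a mouse of 20 gm. and in a mouse of 2 gm.
one_gram_large = change_on_log_scale(20.0, 21.0)
one_gram_small = change_on_log_scale(2.0, 3.0)

# An increase of 10 per cent in each mouse.
ten_per_cent_large = change_on_log_scale(20.0, 22.0)
ten_per_cent_small = change_on_log_scale(2.0, 2.2)

print(f"1 gm. increase:       {one_gram_large:.4f} and {one_gram_small:.4f} log units")
print(f"10 per cent increase: {ten_per_cent_large:.4f} and {ten_per_cent_small:.4f} log units")
```

Equal increments in grams become very unequal on the logarithmic scale, whereas equal proportionate increases become equal increments.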

In the first chapter on metric characters, Chapter 6, it was stated 
that we should assume throughout that any metric character under 
discussion would be measured on an "appropriate" scale, the 
criterion being that the distribution of phenotypic values should 
approximate to a normal curve. This is, in principle, the chief 
criterion, and a markedly asymmetrical, or skewed, distribution is a 
certain indication that the data may have to be transformed if they are 
to be used in certain ways. But a transformation may still be required 
even if the distribution is not markedly asymmetrical: we shall see 
below that the most important criterion then is that the variance 
should be independent of the mean. We shall treat the choice of 
scale in this chapter by showing what will arise if the transformation 
required is not made. We shall find that certain phenomena arise, 
called scale effects, which disappear when the appropriate transforma- 
tion is made. For the sake of clarity we shall discuss in particular the 
logarithmic transformation which converts an arithmetic to a geo- 
metric scale. This is probably the commonest and most useful 
transformation. The general principles, outlined by reference to the 
log transformation, will, however, apply equally to other transforma- 
tions. Let us first consider the distribution of phenotypic values. 

Fig. 17.1 shows three distributions plotted as if from the original 
data on an arithmetic scale. They would all three be symmetrical 
and normal if the data were first transformed to logarithms, or plotted 
on logarithmic paper. There are two points of importance to notice. 
First, the degree of departure from normality depends on the amount 
of variation in relation to the mean. This may be seen from a com- 
parison of the two upper graphs, (a) and (b), which are not very 
noticeably asymmetrical, with the lower graph, (c), which is. The 
relationship between the amount of variation and the mean, which 
determines the degree of departure from normality, is best expressed 
as the coefficient of variation; i.e. the ratio of standard deviation to 
mean, often multiplied by 100 to bring it to a percentage. The 
coefficient of variation of the two upper graphs is 20 per cent, while 
that of the lower graph is 50 per cent. Thus, a transformation to 
logarithms does not make an appreciable difference to the shape of the 
distribution unless the coefficient of variation is fairly high — that is, 
above about 20 per cent or so. Consequently, statistical procedures 
which do not rely on a strictly normal distribution, such as the 
analysis of variance, can be carried out on the untransformed data when 
the coefficient of variation is not above about 20 per cent. Trans- 
formations to other scales are also less necessary when the coefficient 
of variation is low than when it is high. 
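
The link between the coefficient of variation and the degree of asymmetry can be illustrated numerically. The sketch below assumes the distribution is exactly log-normal (normal on the logarithmic scale), in which case the skewness on the arithmetic scale is a function of the coefficient of variation alone; the function name is hypothetical and chosen only for illustration.

```python
def lognormal_skewness(cv: float) -> float:
    """Skewness, on the arithmetic scale, of a distribution that is normal
    on a logarithmic scale. `cv` is the coefficient of variation as a
    fraction (0.20 for 20 per cent)."""
    # For a log-normal distribution, skewness = (C^2 + 3) * C.
    return (cv ** 2 + 3.0) * cv


# Compare the two coefficients of variation discussed in the text.
for cv in (0.20, 0.50):
    print(f"CV = {cv:.0%}: skewness = {lognormal_skewness(cv):.3f}")
```

Under this assumption a coefficient of variation of 20 per cent gives a skewness of about 0.6, and one of 50 per cent gives a skewness of about 1.6, which agrees with the visual impression that only the lower graph is markedly asymmetrical.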

Fig. 17.1. Distributions that are symmetrical and normal on a 
logarithmic scale shown plotted on an arithmetic scale. Explanation 
in text. 

The second point to notice in Fig. 17.1 is that the variance, when 
computed in arithmetic units, increases when the mean increases. 
This may be seen in the two upper graphs, (a) and (b). These have 
both the same variance in logarithmic units, but different means. 
The mean — or strictly speaking the mode — of (b) is double that of (a) 
and the standard deviation in arithmetic units is correspondingly 
doubled. Though the distributions are not very noticeably skewed 
and a transformation does not seem to be very strongly indicated, yet 
in consequence of the difference of mean the variances differ very 
greatly. Here, then, is one of the commonest scale effects, namely a 
change of variance following a change of the population mean. The 
two graphs (a) and (b) in Fig. 17.1 might well represent two popula- 
tions which have diverged by some generations of two-way selection, 
if the character were something like body weight measured in grams 
or pounds. Such characters are commonly found to increase in 
variance when the mean increases and to decrease in variance when 
the mean decreases. Fig. 17.2 shows an example from an experiment 
with mice (MacArthur, 1949), the character being weight at 60 days. 

Fig. 17.2. Distributions of body weight of male mice at 60 days. 
Centre: base population before selection. Left and right: small 
and large strains after 21 generations of two-way selection. (Re- 
drawn from MacArthur, 1949.) 

                          Small    Unselected    Large 
Standard deviation         1.71       2.56        5.10 
Coeff. of variation, %     14.3       11.1        12.8 

Phenomena such as the change of variance discussed above are 
called scale effects if they disappear when the measurements are 
appropriately transformed: in other words, if their cause can be 
attributed to the scale of measurement. But they are none the less 
real, though labelled as a scale effect or removed by transformation. 
The large mice, for example, are really more variable than the small 
when their weights are measured in grams. What is gained by recog- 
nising this as a scale effect is that there is no need to look deeper into 
the genetic properties of the character for an explanation. 

A convenient test for the appropriateness of a logarithmic trans- 
formation is provided by the proportionality of standard deviation 
and mean, which we noted in connexion with graphs (a) and (b) in 
Fig. 17.1. If two distributions have the same variance on a logarith- 
mic scale then the coefficients of variation in arithmetic units will be 
the same. Thus, constancy of the coefficient of variation indicates 
constancy of variance on a logarithmic scale. And, if variances are to 
be compared, we may simply compare the coefficients of variation 
instead of expressing the variances in logarithmic units. The stand- 
ard deviations and coefficients of variation of the distributions shown 
in Fig. 17.2 are given in the legend to the figure. The coefficients of 
variation, though not identical, are much more alike than the stand- 
ard deviations, and this shows that the changes of variance that have 
resulted from the selection can be attributed, in large part at least, to 
the scale of measurement. 
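
The proportionality of standard deviation and mean can be checked with a short calculation. The sketch below assumes two distributions that are exactly normal on a log10 scale with the same logarithmic standard deviation; the medians of 10 and 20 and the logarithmic standard deviation of 0.085 are illustrative values, not data from the text.

```python
import math


def arithmetic_sd_and_cv(median: float, sd_log10: float) -> tuple[float, float]:
    """Arithmetic-scale standard deviation and coefficient of variation of a
    distribution that is normal on a log10 scale."""
    s = sd_log10 * math.log(10)            # standard deviation in natural-log units
    mean = median * math.exp(s * s / 2)    # arithmetic mean
    cv = math.sqrt(math.exp(s * s) - 1)    # depends on the log-scale variance only
    return mean * cv, cv


# Same variance on the logarithmic scale, medians differing by a factor of two.
sd_a, cv_a = arithmetic_sd_and_cv(median=10.0, sd_log10=0.085)
sd_b, cv_b = arithmetic_sd_and_cv(median=20.0, sd_log10=0.085)
print(f"ratio of standard deviations: {sd_b / sd_a:.2f}")
print(f"coefficients of variation: {cv_a:.3f} and {cv_b:.3f}")
```

The standard deviation doubles with the mean while the coefficient of variation is unchanged, which is why the coefficient of variation can stand in for the variance in logarithmic units.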

The effect of scale on the connexion between variance and mean 
complicates the comparison of the variances of two populations that 
differ also in mean, as for example the comparison of the variances of 
inbreds and hybrids discussed in Chapter 15. If a difference of 
variance is to be unambiguously attributed to a difference of homeo- 
static power, for example, there must be independent grounds for 
believing that a similar difference would not be expected as a scale 
effect connected with the difference of mean. 

Let us return to the consequences of selection and pursue them a 
little further. If the variance changes with the change of mean as a 
result of selection, so also will the selection differential and the 
response. The response per generation of a character such as we have 
been considering would therefore be expected to increase with the 
progress of selection in the upward direction, and to decrease corre- 
spondingly in the downward direction. The response to two-way 
selection would then be asymmetrical. An example of an asymmetri- 
cal response which can most probably be attributed to a scale effect 
in this way is shown in Fig. 17.3. Plotted in arithmetic units, as in 
(a), the response is much greater in the upward than in the downward 
direction. A transformation to logarithms, shown in (b), renders the 
response much more nearly symmetrical. This does not do away 
with the fact that the character as measured increased much more than 
it decreased under selection. But it accounts for the asymmetry 
without the need for more elaborate hypotheses. A convenient way of 
eliminating scale effects from the graphical presentation of a response 
to selection is to plot the response in the form of the realised herit- 
ability, as explained in Chapter 11 and illustrated in Fig. 11.5. The 
realised heritability, which is the ratio of response to selection differ- 
ential, is very little influenced by scale effects (Falconer, 1954a). 

Fig. 17.3. Response to two-way selection for resistance to dental 
caries in rats. Resistance is measured in days and plotted on an arith- 
metic scale in (a), and on a logarithmic scale in (b). The arithmetic 
means were converted to logarithmic means by formula 17.1. The 
coefficient of variation was high (about 50%) and was approxi- 
mately constant. The reason why the upward selection has not 
covered so many generations as the downward is simply that the 
increased resistance lengthened the generation interval. (Data 
from Hunt, Hoppert, and Erwin, 1944.) 

When means or variances are to be compared, for example in a 
comparison of two populations or in following the changes resulting 
from selection, and a transformation to logarithms is indicated, it is 
not necessary to convert each individual measurement. On the other 
hand it is not sufficient to convert the arithmetic mean or variance to 
logarithms, unless the coefficient of variation is very low. The con- 
versions may be conveniently made by the two following formulae, 
given by Wright (1952b). The first converts the mean of arithmetic 
values to the mean of logarithmic values, and the second converts the 
variance as computed from the arithmetic values to the variance as it 
would be computed from logarithmic values. 

$$\overline{\log x} = \log \bar{x} - \tfrac{1}{2}\log(1 + C^2)$$

$$\sigma^2_{(\log x)} = 0.4343\,\log(1 + C^2) \qquad (17.1)$$

In these formulae C is 
the coefficient of variation in the form $\sigma/\bar{x}$ computed from arithmetic 
values, and the logarithms are to the base 10. 
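
Formula 17.1 translates directly into code. The following is a minimal sketch, assuming the data are log-normally distributed; the function name and the example values (a mean of 20 with a coefficient of variation of 50 per cent) are hypothetical.

```python
import math

LOG10_E = math.log10(math.e)  # the constant 0.4343 of formula 17.1


def to_log_scale(mean: float, sd: float) -> tuple[float, float]:
    """Apply formula 17.1: return (mean of log10 x, variance of log10 x)
    from the arithmetic mean and standard deviation."""
    c2 = (sd / mean) ** 2                                    # C squared
    mean_log = math.log10(mean) - 0.5 * math.log10(1 + c2)  # first formula
    var_log = LOG10_E * math.log10(1 + c2)                   # second formula
    return mean_log, var_log


# Illustrative values only: arithmetic mean 20, coefficient of variation 50%.
mean_log, var_log = to_log_scale(mean=20.0, sd=10.0)
print(f"log of the arithmetic mean: {math.log10(20.0):.4f}")
print(f"mean of the logarithms:     {mean_log:.4f}")
print(f"variance of the logarithms: {var_log:.4f}")
```

With a coefficient of variation as high as this, the mean of the logarithms is noticeably smaller than the logarithm of the arithmetic mean, which is the reason the simple conversion is not sufficient.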


We turn now to what is perhaps a more fundamental effect of a 
scale transformation: its effect on the apparent nature of the genetic 
variance. To understand this we must go back to a single locus and 
consider the effect, or mode of action, of the genes. Let us imagine a 
locus with two alleles whose mode of action is geometric, the geno- 
typic value of A₂A₂ being 50 per cent greater than A₁A₂ and that of 
A₁A₂ being also 50 per cent greater than A₁A₁. Thus on the logarith- 
mic scale there is no dominance, the heterozygote being exactly mid- 
way between the two homozygotes. Now suppose the genotypic 
values are measured in arithmetic units, such as grams, and that A₁A₁ 
has a value of 10 units. Then A₁A₂ will be 15 units and A₂A₂ 22.5 
units. On the arithmetic scale, therefore, A₁ is partially dominant to 
A₂, the heterozygote no longer falling mid-way between the homo- 
zygotes. Thus the degree of dominance is influenced by the scale of 
measurement, and so also is the proportionate amount of dominance 
variance. This effect of a scale transformation, however, is normally 
rather small. A gene that causes a 50 per cent difference between the 
genotypic values, such as we have considered, would be a major gene, 
easily recognisable individually. But even so the degree of dominance 
on the arithmetic scale is not very great. Minor genes with effects of 
perhaps 1 per cent or 10 per cent would be scarcely influenced in their 
degree of dominance. 
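
The size of this effect can be worked out for the locus just described. The sketch below expresses the degree of dominance as the ratio d/a, where a is half the difference between the homozygotes and d is the deviation of the heterozygote from their mid-point; the function name is hypothetical, and the 10 per cent gene is an added illustrative case.

```python
import math


def degree_of_dominance(a1a1: float, a1a2: float, a2a2: float) -> float:
    """Return d/a for the three genotypic values of one locus."""
    midpoint = (a1a1 + a2a2) / 2
    a = (a2a2 - a1a1) / 2      # half the difference between homozygotes
    d = a1a2 - midpoint        # heterozygote deviation from the mid-point
    return d / a


major = (10.0, 15.0, 22.5)     # each step 50 per cent greater
minor = (10.0, 11.0, 12.1)     # each step 10 per cent greater (illustrative)

for label, values in (("50% gene", major), ("10% gene", minor)):
    on_log = tuple(math.log10(v) for v in values)
    print(label,
          f"arithmetic d/a = {degree_of_dominance(*values):+.3f},",
          f"logarithmic d/a = {degree_of_dominance(*on_log):+.3f}")
```

On the arithmetic scale the major gene shows d/a of about -0.2 (the heterozygote lies towards the A₁A₁ homozygote), and the gene with 10 per cent steps shows only about -0.05; on the logarithmic scale both are zero apart from rounding.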

In the same way that the dominance is affected by the scale, so 
also is the epistatic interaction between different loci. Loci with 
geometric effects would combine without interaction if the genotypic 
values were measured in logarithmic units. But when measured in 
arithmetic units there would be interaction deviations due to epis- 
tasis. Thus the amount of interaction variance is also influenced by 
the scale of measurement. The following example illustrates the 
dependence of interaction on scale. 

Example 17.1. The pygmy gene in mice is a major gene affecting body 
size, homozygotes being much reduced in size. The effect of this gene 
was studied in different genetic backgrounds (King, 1955). The gene was 
transferred from the strain selected for small size where it arose, to a strain 
selected for large size, by repeated backcrosses. The mean difference be- 
tween pygmy homozygotes and normals (i.e. heterozygotes and normal 
homozygotes together) was measured in the two strains and during the 
transference, the comparisons being made between pygmies and normals 
in the same litters. The results are shown in Fig. 17.4. The difference 
between pygmies and normals increases with the weight of the normals. 
In the background of the small strain the pygmies were about 7 gm. smaller 
than normals, but in the background of the large strain they were about 
12 gm. smaller. Thus the pygmy gene shows epistatic interaction with the 
other genes that affect body size. But if the effect of the gene is expressed 
as a proportion, it is constant and independent of the other genes present. 

Fig. 17.4. Intra-litter comparisons of the 6-week weights of pyg- 
mies and normals. Mean of pygmies plotted against mean of nor- 
mals in the same litter. (From King, 1955; reproduced by courtesy 
of the author and the editor of the Journal of Genetics.) 

Pygmies are about half the weight of their normal litter-mates, no matter 
what the actual weights are. Thus if the comparisons are made in logar- 
ithmic units there is no epistatic interaction. 
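
The disappearance of the interaction on the logarithmic scale can be seen from a small calculation. The weights below are hypothetical round figures chosen to agree with the approximate differences quoted in the example (7 gm. and 12 gm., with pygmies about half the weight of normals); they are not King's data.

```python
import math


def gene_effect(normal: float, pygmy: float) -> tuple[float, float]:
    """Effect of the gene as a difference in grams and as a difference
    of log10 weights."""
    return normal - pygmy, math.log10(normal) - math.log10(pygmy)


# Hypothetical mean weights in grams: (normals, pygmies).
small_background = gene_effect(14.0, 7.0)
large_background = gene_effect(24.0, 12.0)

print(f"small strain: {small_background[0]:.0f} g, log difference {small_background[1]:.3f}")
print(f"large strain: {large_background[0]:.0f} g, log difference {large_background[1]:.3f}")
```

The difference in grams depends on the background, but the difference of logarithms is the same in both, so on that scale the gene combines with the background without interaction.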

In general, therefore, a scale transformation may remove or 
reduce the variance attributable to epistatic interaction, and this 
variance might then be labelled as a scale effect. A transformation 
which removes or reduces interaction variance may be useful if con- 
clusions are to be drawn from an analysis that depends for its validity 
on the absence of interaction. A detailed treatment of the relation- 
ship between scale and epistatic interaction is given by Horner, 
Comstock, and Robinson (1955). 

In this chapter we have outlined some of the scale effects most 
commonly met with, and have indicated the circumstances under 
which a transformation of scale may be helpful to the interpretation 
of results and the drawing of conclusions. Transformations of scale, 
however, should not be made without good reason. The first pur- 
pose of experimental observations is the description of the genetic 
properties of the population, and a scale transformation obscures 
rather than illuminates the description. If epistasis, for example, is 
found, this is an essential part of the description, and it is better 
labelled as epistasis than as a scale effect. The transformation of scale 
is essentially a statistical device to be employed for the purpose of 
simplifying the analysis of the data, or to make possible the drawing 
of valid conclusions from the analysis. It is sometimes helpful also in 
the interpretation of results. If epistasis, for example, were found to 
disappear on transformation to a logarithmic scale we could conclude 
that the effects of different loci combined by multiplication rather 
than by addition. Or, if there were good reasons for attributing a 
difference of variance to a scale effect we should not need to invoke 
more complicated genetic explanations. The choice of scale, how- 
ever, raises troublesome problems in connexion with the interpreta- 
tion of results. Logical justification of a scale transformation can 
only come from some criterion other than the property about which 
the conclusions are to be drawn. If there is no independent criterion 
the argument becomes circular, and the distinction between a scale 
effect and some other interpretation becomes meaningless. There is 
also a more fundamental difficulty: the scale appropriate for one 
population may not be appropriate for another, and the scale appro- 
priate to the genetic and environmental components of the variation 
may be different. This difficulty is strikingly illustrated by an analysis 
of the character "weight per locule" in a number of crosses between 
varieties of tomato (Powers, 1950). By the same criterion — normality 
of the distribution — this character was found to require an arithmetic 
scale in some crosses and a geometric scale in others; and, moreover, 
in the F₂ generations of some crosses the genetic variation required one 
scale while the environmental variation required another. 



There are many characters of biological interest or economic im- 
portance whose inheritance is multifactorial but whose distribution is 
discontinuous. For example: resistance to disease, a character ex- 
pressed either in survival or in death with no intermediate; "litter" 
size in the larger mammals that bear usually one young at a time but 
sometimes two or three; or the presence or absence of any organ or 
structure. Characters of this sort appear at first sight to be outside the 
realm of quantitative genetics because they do not exhibit continuous 
variation; yet when subjected to genetic analysis they are found to be 
under the influence of many genes just as any metric character. For 
this reason they have been called "quasi-continuous variations" 
(Grüneberg, 1952): the phenotypic values are discontinuous but the 
mode of inheritance is like that of a continuously varying character. 

The clue to the understanding of the inheritance of such characters 
lies in the idea that the character has an underlying continuity with a 
"threshold" which imposes a discontinuity on the visible expression 
of the character, as depicted in Fig. 18.1. The underlying continuous 
variation is both genetic and environmental in origin, and may be 
thought of as the concentration of some substance or the speed of some 
developmental process — of something, that is to say, that could in 
principle be measured and studied as a metric character in the 
ordinary way. The hypothetical measurement of this variation is 
supposed to be made on a scale that renders its distribution normal, 
and the unit of measurement is the standard deviation of the dis- 
tribution. This provides what may be called the underlying scale. We 
now have two scales for the description of the phenotypic values: the 
underlying scale which is continuous, and the visible scale which is 
discontinuous. The two are connected by the threshold, or point of 
discontinuity. This is a point on the continuous scale which corre- 
sponds with the discontinuity in the visible scale. The idea will be 
clearer from an inspection of Fig. 18.1, which depicts a character 
whose visible expression can take only two forms, such as alive versus 



[Chap. 18 

dead, or present versus absent. Individuals whose phenotypic values 
on the underlying scale exceed the threshold will appear in one visible 
class, while individuals below the threshold will appear in the other. 


Fig. 18.1. Illustrations of a threshold character with two visible 
classes. The vertical line marks the threshold between the two 
phenotypic classes, one of which is cross-hatched. The population 
depicted on the left has an incidence of 10%; that on the right, an 
incidence of 90%. 

On the visible scale individuals can have only two values, 0 or 1. 
Groups of individuals, however, such as families or the population as 
a whole can have any value, in the form of the proportion or percent- 
age of individuals in one or other class. This may be referred to as 
the incidence of the character. Susceptibility to disease, for example, 
can be expressed as the percentage mortality in the population or in 
a family. The incidence is quite adequate as a description of the 
population or group, but the percentage scale in which the incidence 
is expressed is inappropriate for some purposes because on a per- 
centage scale variances differ according to the mean. The interpre- 
tation of genetic analyses of threshold characters is therefore facili- 
tated by the transformation of incidences to values on the underlying 
scale. The transformation is easily made by reference to a table of 
probabilities of the normal curve. The threshold is a point of trun- 
cation whose deviation from the population mean can be found from 
the proportion of the population falling beyond it. A table of 
"probits" (Fisher and Yates, 1943, Table IX) is convenient to use because 
it refers to a single tail of the distribution and obviates confusion 
over the sign of the deviation. The transformation from the visible 
to the underlying scale enables us to state the mean phenotypic value 
of a population or family in terms of its standard deviation, and to 
compare the means of different populations or families provided they 
have the same standard deviation. It is convenient to take the posi- 
tion of the threshold as the origin, or zero-point, on the underlying 
scale and to express the mean as a deviation from the threshold. 
Thus if the incidence of the character is, for example, 10 per cent, a 
table of the normal curve shows that the threshold exceeds the mean 
by 1.28 standard deviations. The population mean, referred to the 
threshold as origin, is therefore -1.28σ. Or, if the incidence were 
90 per cent then the population mean would be +1.28σ, as shown in 
Fig. 18.1. For any comparison of means, however, it is necessary to 
assume that the populations compared have the same variance on the 
underlying scale. If reasons are known for the variances not being 
equal (in comparisons, for example, between inbreds, F₁'s and F₂'s) 
then the means cannot be expressed on a common scale that allows a 
valid comparison to be made. 
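
The table look-up described here can be replaced by the inverse of the normal distribution function. The following is a minimal sketch using Python's standard library; the function name is hypothetical, and the incidence is taken to be the proportion of individuals falling above the threshold.

```python
from statistics import NormalDist


def mean_from_threshold(incidence: float) -> float:
    """Population mean on the underlying scale, in standard deviations,
    with the threshold as origin. `incidence` is the proportion of
    individuals beyond the threshold (0 < incidence < 1)."""
    # If a proportion p lies above the threshold, the threshold lies
    # -inv_cdf(p) standard deviations above the mean, so the mean is
    # inv_cdf(p) when referred to the threshold as origin.
    return NormalDist().inv_cdf(incidence)


for p in (0.10, 0.90):
    print(f"incidence {p:.0%}: mean = {mean_from_threshold(p):+.2f} standard deviations")
```

For incidences of 10 and 90 per cent this reproduces the values of -1.28σ and +1.28σ given in the text.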

This is as far as we can go with a character that is visibly expressed 
in only two classes. The mean of a population or group can be stated, 
but not the variance, because the mean has to be stated in terms of the 
standard deviation. We can, however, subject the observed means of 
families to analysis and compute the heritability of the character. 
The heritability of threshold characters is treated by A. Robertson 
and Lerner (1949) and by Dempster and Lerner (1950), and will not 
be further discussed here. 

If a character has three classes in its visible scale then comparisons 
can be made between the variances of populations as well as between 
the means. The number of lumbar vertebrae in mice is a character 
of this sort that has been extensively studied (Green, 1951; McLaren 
and Michie, 1955). The number is usually either 5 or 6, but some 
individuals have 5 on one side and 6 on the other. This comes about 
through the last vertebra being sacralised on one side and not on the 
other. The asymmetrical mice have 5½ lumbar vertebrae and are 
regarded as being intermediate between the 5-class and the 6-class. 

When the visible scale has three classes there are two thresholds, 
as shown in Fig. 18.2. If the assumption is made that the difference 
between the two thresholds represents a constant difference on the 
underlying scale, then we have not only a fixed origin of the scale but 
also a fixed unit, and this provides a basis for the comparison of 
variances as well as of means. The underlying scale then has one of 
the thresholds as origin and the threshold difference as the unit of 
measurement. The idea is most easily explained by a numerical 
example. Consider the two populations illustrated in Fig. 18.2. Let 
their standard deviations on a common underlying scale be σ₁ and σ₂ 
respectively, and let them have the following incidences in the three 
visible classes, X, I, and Z, of which I is the intermediate class: 

                                   X      I      Z 
Incidence, %.   Population (1)    60     15     25 
                Population (2)    20     10     70 

The deviations of the thresholds from the population means, found 
from a table of the normal curve, are as follows: 

                   X/I threshold    I/Z threshold    Threshold interval 
Population (1)       +0.25σ₁          +0.67σ₁             0.42σ₁ 
Population (2)       -0.84σ₂          -0.52σ₂             0.32σ₂ 

The intervals between the two thresholds, given above on the right, 
are found by subtraction of the deviations of the two thresholds in 
each population. These threshold intervals are supposed by hypo- 
thesis to be equal on the common underlying scale. By assigning the 
threshold interval the value of one "threshold unit" we can therefore 
express the standard deviations of the two populations on a common 
basis in terms of threshold units. The standard deviations are then 

    σ₁ = 2.38 threshold units 
    σ₂ = 3.12 threshold units. 

The means of the populations can also be expressed in threshold 
units. Reckoned from the X/I threshold as origin they are 

    M₁ = -0.25σ₁ = -0.60 threshold units 
    M₂ = +0.84σ₂ = +2.62 threshold units. 

Fig. 18.2. Illustrations of a threshold character with three visible 
classes, in two populations with incidences as shown. The axes are 
marked in threshold units, and the population means are indicated 
by arrows. Further explanation in text. 

The standard deviation and population mean of a character with 
three visible classes may be put in general form in the following way. 
Let X be the incidence in one visible class, and Y the incidence in 
this class together with the intermediate class. Let the threshold 
between these two classes be the origin of the underlying scale. Let 
x and y be the deviations of the two thresholds, measured from the 
population mean in standard deviations, corresponding to the 
incidences X and Y respectively. Since Y includes X, y exceeds x and 
the threshold interval is y - x. Then the standard deviation is 

$$\sigma = \frac{1}{y - x} \ \text{threshold units}$$

and the mean is 

$$M = -x\sigma \ \text{threshold units}$$
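
The general form can be sketched as a short function. This is an illustration under stated assumptions: X and Y are cumulative incidences expressed as proportions, the deviations x and y are measured from the population mean, and the threshold interval is therefore y - x. The function name is hypothetical, and the incidences used in the example (60 and 75 per cent, then 20 and 30 per cent) are chosen to correspond to threshold deviations of about +0.25 and +0.67, and -0.84 and -0.52.

```python
from statistics import NormalDist


def three_class_scale(X: float, Y: float) -> tuple[float, float]:
    """Return (sigma, M) in threshold units.

    X is the incidence in the first visible class, Y the incidence in that
    class together with the intermediate class (both as proportions)."""
    nd = NormalDist()
    x = nd.inv_cdf(X)        # first threshold, deviation from the mean in sd
    y = nd.inv_cdf(Y)        # second threshold, deviation from the mean in sd
    sigma = 1.0 / (y - x)    # standard deviation in threshold units
    mean = -x * sigma        # mean reckoned from the first threshold
    return sigma, mean


for label, X, Y in (("Population (1)", 0.60, 0.75), ("Population (2)", 0.20, 0.30)):
    sigma, mean = three_class_scale(X, Y)
    print(f"{label}: sigma = {sigma:.2f}, M = {mean:+.2f} threshold units")
```

The results differ slightly in the second decimal place from values worked out with a two-figure table of the normal curve, because the deviations are not rounded before the subtraction.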



The comparison of variances in this way depends entirely, as we 
have pointed out, on the assumption that the interval between the 
two thresholds is constant from one population to another. If we 
think again of the hypothetical substance or process whose concentra- 
tion or rate determines the value on the underlying scale, the assump- 
tion is that the intermediate class spans the same difference of con- 
centration or of rate in the two populations compared. Whether this 
assumption is a reasonable one or not is hard to judge. It may, 
nevertheless, lead to reasonable results, as the following example 
shows. 

Example 18.1. The number of lumbar vertebrae was studied in two 
inbred lines of mice and their cross (Green and Russell, 1951). The inbred 
lines were a branch of the C3H strain with predominantly 5 lumbar 
vertebrae, and the C57BL strain with predominantly 6 lumbar vertebrae. 
Crosses were made reciprocally, and F₂ generations were made from each 
F₁. The incidences of the 5-vertebra class and of the intermediate class of 
asymmetrical mice with 5½ are given in the table. The reciprocal F₁'s 
were found to differ and are listed separately. The F₂'s did not differ and 
their results are pooled. The table gives also the positions of the two thresh- 
olds in standard deviations; and the mean and standard deviation com- 
puted in threshold units, the mean being reckoned from the threshold 
between the 5-class and the asymmetrical class as origin. 

                   Incidence, %    Deviation of thresholds    Mean and standard deviation 
                                   from mean, in σ            in threshold units 
                   5      5½       5/5½      5½/6             M        σ 

Inbreds C3H 
C3H♀ × C57♂ 
C57♀ × C3H♂ 

+1.87    +2.41    -3.44    +5.74 

+0.19    +0.61    +0.10    +0.85 

F₂ (pooled) 

+0.23    +0.27 

The distributions of the populations, based on the computed means and standard 
deviations, are shown graphically in Fig. 18.3. It should be noted that the 
means and standard deviations of the inbreds are not very precisely esti- 
mated because the incidences are low. The computed properties of the 
populations follow the expected pattern. The F₁ generation is intermediate 
in mean between the two parental populations, though there is a maternal 
effect causing a difference between the reciprocal F₁'s. This maternal 
effect has been further studied and confirmed by McLaren and Michie 
(1956a). The variance of the F₁ is somewhat lower than that of the 
parental inbreds, as might be expected from a reduction of environmental 
variance in the hybrids. This was further studied and confirmed by 
McLaren and Michie (1955). The F₂ is equal in mean to the F₁, but shows 
an increased variance as would be expected from the segregation of genes. 
If we take 2.00 as the mean standard deviation of the F₁, representing 
purely environmental variation, then the environmental variance is 4.00, 
and the total phenotypic variance given by the F₂ is 10.56; therefore the 
genotypic variance works out at 6.56, or 62 per cent of the total. Thus the 
analysis of the threshold character studied in this cross leads to very 
reasonable results, and the assumptions on which it rests do not seem to be 
very seriously wrong. 
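
The partition of variance at the end of the example is simple arithmetic and can be written out as follows. The sketch uses only the two figures quoted in the text (a mean F₁ standard deviation of 2.00 and an F₂ variance of 10.56, both in threshold units); the function name is hypothetical.

```python
def genotypic_share(sd_f1: float, var_f2: float) -> tuple[float, float]:
    """Return (genotypic variance, genotypic proportion of the total),
    taking the F1 variance as purely environmental and the F2 variance
    as the total phenotypic variance."""
    var_env = sd_f1 ** 2           # environmental variance
    var_gen = var_f2 - var_env     # genotypic variance by difference
    return var_gen, var_gen / var_f2


var_gen, share = genotypic_share(sd_f1=2.00, var_f2=10.56)
print(f"genotypic variance = {var_gen:.2f}, or {share:.0%} of the total")
```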

Fig. 18.3. Distributions of number of lumbar vertebrae in mice 
transformed to the underlying scale of threshold units. The upper 
distributions are two inbred lines, the two middle ones are the two 
reciprocal F₁'s, and the lower distribution is the F₂. (Data from 
Green & Russell, 1951.) See example 18.1 for further explanation. 

The meaning of the threshold unit in which values on the under- 
lying scale are expressed may conveniently be discussed by reference 
to the number of lumbar vertebrae in mice, described in the above 
example. From the graduation of the scale at the foot of Fig. 18.3 
it appears that the threshold interval corresponds to one vertebra. It 
is therefore tempting to regard the scale as indicating "potential" 
vertebrae, ranging from 5 at the origin to 15 at the upper extreme 
and to -5 at the lower extreme. We should then regard the develop- 
ing vertebral column as being protected by canalisation against this 
wide range of potential variation, so that the vertebrae actually 
formed are restricted to the narrow range between 5 and 6. This 
interpretation, however, assumes that individuals with a potential 
number anywhere between 5 and 6 will be asymmetrical with 5½ 
vertebrae; and for this there is no justification. The asymmetrical 
individuals may equally well, or more probably, be those with almost 
exactly 5½ potential vertebrae. Suppose, for example, that the range 
of potential vertebrae that gave rise to an asymmetrical individual 
were between 5.4 and 5.6. Then 1 threshold unit would correspond 
to 0.2 potential vertebrae; the origin of the underlying scale would 
be at 5.4 and the variation would range from 7.4 potential vertebrae 
at one extreme to 3.4 at the other. Or, if the asymmetrical individuals 
covered a range of only 0.1 potential vertebrae, the whole distribu- 
tion would lie within the potential numbers of 5 and 6, just as the 
actual range does. Thus the threshold unit is purely arbitrary in 
nature; though useful for the comparison of populations, it cannot be 
given any concrete interpretation. 

From what has been said so far in this chapter it will be clear that 
threshold characters do not provide ideal material for the study of 
quantitative genetics, because the genetic analyses to which they can 
be subjected are limited in scope and subject to assumptions that one 
would be unwilling to make except under the force of necessity. We 
turn now to a consideration of some aspects of selection for threshold 
characters, which has more practical importance than the genetic 
analyses that we have been considering, and does not involve the same 
theoretical difficulties. 

Selection for Threshold Characters 

Selection for threshold characters has some practical importance 
in connexion with the improvement of viability and with changing 
the response of experimental animals to treatments, such as, for 
example, increasing or decreasing drug resistance. We shall consider 
only characters with two visible classes; and we shall assume that 
there is no means of measuring some aspect of the character that 
varies continuously, such as measuring the time of survival instead of 
classifying simply dead versus alive. 





The response to selection depends in the usual way on the selection
differential. But the selection differential does not depend primarily
on the proportion selected, as with a continuously varying character,
but on the incidence, for the following reason. We may breed
exclusively from those individuals in the desired phenotypic class,
but we cannot discriminate between those with high and those with low
values on the underlying scale. The selected individuals are
therefore a random sample from the desired class, and the mean of
the selected individuals is the mean of the desired class, irrespective
of whether we select all of the desired class or only a portion of it.
The point will be made clearer by reference to Fig. 18.1, letting the
cross-hatching represent the desired class. Let us suppose that the
replacement rate allows us to select 10 per cent of the population. If
we select out of the population on the right, with an incidence of 90
per cent, the mean of the selected individuals will be the same as if
we had selected 90 per cent. But if we select out of the population
on the left, with an incidence of 10 per cent, we shall use all of the
individuals in the desired class and none of the others. The selection
differential will then be the same as if we had selected on the basis of
a continuously varying character. Thus the selection differential is
greatest when the incidence is exactly equal to the proportion selected.
If it is less we shall be forced to use some individuals of the
undesired class; and if it is greater we shall do no better than we should
by selecting the whole of the desired class.
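
The dependence of the selection differential on the incidence can be illustrated with a minimal sketch. It assumes a normally distributed underlying scale measured in standard deviations, and it assumes that when the incidence falls short of the proportion needed for breeding, the shortfall is made up by a random sample of the undesired class. The function name and the example incidences are illustrative only.

```python
from statistics import NormalDist


def threshold_selection_differential(incidence, proportion_selected):
    """Mean of the selected parents on the underlying scale, in standard
    deviations from the population mean (illustrative helper)."""
    unit_normal = NormalDist()
    # Threshold position such that a fraction `incidence` lies above it.
    threshold = unit_normal.inv_cdf(1.0 - incidence)
    ordinate = unit_normal.pdf(threshold)
    # Mean of the desired class (the tail above the threshold).
    mean_desired = ordinate / incidence
    if proportion_selected <= incidence:
        # Parents are a random sample of the desired class.
        return mean_desired
    # Otherwise all of the desired class is used, topped up at random
    # from the undesired class (assumption of this sketch).
    mean_undesired = -ordinate / (1.0 - incidence)
    extra = proportion_selected - incidence
    return (incidence * mean_desired + extra * mean_undesired) / proportion_selected


# Selecting 10 per cent from populations with different incidences.
for incidence in (0.90, 0.10, 0.05):
    print(incidence, threshold_selection_differential(incidence, 0.10))
```

Under these assumptions the differential is largest when the incidence equals the proportion selected, and falls away on either side of that point.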

With some characters, however, the incidence can be altered and 
this provides a means of improving the response to selection. If the 
character is, for example, a reaction to some treatment, the treatment 
can be increased or reduced in intensity, so that the incidence is 
altered. This is an alteration of the mean level of the environment, 
and its effect is in principle to shift the distribution of phenotypic 
values with respect to the fixed threshold. But it is more con- 
venient to regard it as changing the nature of the character and shift- 
ing the threshold with respect to a fixed mean phenotypic level. 
When the level of the threshold can be controlled in this way, the 
maximum speed of progress under selection will be attained by ad- 
justing the threshold so that the incidence is kept as nearly as possible 
equal to the minimum proportion that must be selected for breeding. 
The progress made can be assessed by subjecting the population, or 
part of it, to the original treatment under which the threshold is at its 
original level. 




Genetic assimilation. A very interesting result of the applica- 
tion of this principle of changing the threshold by environmental 
means is the phenomenon known as "genetic assimilation" (Wad- 
dington, 1953). If a threshold character appears as a result of an 
environmental stimulus, and selection is applied for this character, it 
may eventually be made to appear spontaneously, without the neces- 
sity of the environmental stimulus. In this way what was originally 
an "acquired character" becomes by perfectly orthodox principles of 
selection an "inherited character" (Waddington, 1942). In such a 
situation there are two thresholds, one spontaneous and the other
induced, as shown in Fig. 18.4.

Fig. 18.4. Diagram illustrating genetic assimilation of a threshold
character. Distributions on the underlying scale, which is marked
in standard deviations. The vertical lines show the positions of the
induced and spontaneous thresholds, and the arrows mark the
population means at three stages of selection.

(a) before selection: incidence, induced = 30 %, spontaneous = 0 %

(b) after some selection: incidence, induced = 80 %, spontaneous = 2 %

(c) after further selection: incidence, induced = 100 %, spontaneous = 95 %

The spontaneous threshold is at first
outside the range of variation of the population, so that there is no 
variation of phenotype and no selection can be applied, (Fig. 18.4, a). 
The induced threshold, however, is within the range of the under- 
lying scale covered by the population, and it allows individuals toward 
one end of the distribution to be picked out by selection. In this way 
the mean genotypic value of the population is changed. If this change 
goes far enough some individuals will eventually cross the spon- 
taneous threshold and appear as spontaneous variants, (Fig. 18.4, b). 
When the spontaneous incidence becomes high enough selection may
be continued without the aid of the environmental stimulus, and the
spontaneous incidence may be further increased, (Fig. 18.4, c). 

Example 18.2. An experimental demonstration of genetic assimilation 
in Drosophila melanogaster is described by Waddington (1953). The charac- 
ter was the absence of the posterior cross-vein of the wing. In the base 
population no flies with this abnormality were present, but treatment of 
the puparium by heat shock caused about 30 per cent of cross-veinless 
individuals to appear. Selection in both directions was applied to the 
treated flies, and after 14 generations the incidence of the induced character 
had risen to 80 per cent and fallen to 8 per cent. At this time cross-veinless 
flies began to appear in small numbers among untreated flies of the upward- 
selected line, and by generation 16 the spontaneous incidence was between 
1 and 2 per cent. Selection was then continued without treatment, the 
population being subdivided into a number of lines. The best four of the 
lines, selected without further treatment, reached spontaneous incidences 
ranging from 67 per cent to 95 per cent. The distributions in Fig. 18.4 
illustrate the progress of the upward selection. Graph (b) shows a spon- 
taneous incidence of 2 per cent and an induced incidence of 80 per cent 
and thus corresponds approximately with generation 16. On the assump-
tion of constant variance, the change of mean at this stage amounted to
1.36 standard deviations. Graph (c) shows a spontaneous incidence of
95 per cent and represents the line that finally showed the greatest pro-
gress. Its mean on the underlying scale is 5.15 standard deviations above
that of the initial population.
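
The changes of mean quoted in this example can be retraced from the incidences alone. The following is a sketch only, assuming a normal underlying scale with constant variance and using the rounded incidences of 30, 80, 2 and 95 per cent; the variable names are illustrative.

```python
from statistics import NormalDist

unit_normal = NormalDist()


def threshold_deviation(incidence):
    """Distance of a threshold above the population mean, in standard
    deviations, when a fraction `incidence` exceeds the threshold."""
    return unit_normal.inv_cdf(1.0 - incidence)


# Induced threshold: 30 per cent incidence before selection, 80 per cent at stage (b).
shift_b = threshold_deviation(0.30) - threshold_deviation(0.80)

# Spontaneous threshold: 2 per cent incidence at stage (b) fixes its position
# relative to the initial mean.
spontaneous_threshold = shift_b + threshold_deviation(0.02)

# Stage (c): 95 per cent spontaneous incidence.
shift_c = spontaneous_threshold - threshold_deviation(0.95)

print(round(shift_b, 2), round(shift_c, 2))
```

With these rounded incidences the first change of mean comes to about 1.37 standard deviations, in line with the figure in the text. The final mean comes to a little over 5 standard deviations, slightly below the 5.15 quoted, presumably because the text worked from less heavily rounded incidences.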

The idea of genetic assimilation is not confined to threshold 
characters; but for its wider significance the reader must be referred 
to Waddington (1957). 

CHAPTER 19

CORRELATED CHARACTERS




This chapter deals with the relationships between two metric charac- 
ters, in particular with characters whose values are correlated — 
either positively or negatively — in the individuals of a population. 
Correlated characters are of interest for three chief reasons. Firstly 
in connexion with the genetic causes of correlation through the 
pleiotropic action of genes: pleiotropy is a common property of major 
genes, but we have as yet had little occasion to consider its effects in 
quantitative genetics. Secondly in connexion with the changes 
brought about by selection: it is important to know how the im- 
provement of one character will cause simultaneous changes in other 
characters. And thirdly in connexion with natural selection: the 
relationship between a metric character and fitness is the primary 
agent that determines the genetic properties of that character in a 
natural population. This last point, however, will be discussed in 
the next chapter. 

Genetic and Environmental Correlations 

In genetic studies it is necessary to distinguish two causes of cor- 
relation between characters, genetic and environmental. The genetic 
cause of correlation is chiefly pleiotropy, though linkage is a cause of 
transient correlation particularly in populations derived from crosses 
between divergent strains. Pleiotropy is simply the property of a 
gene whereby it affects two or more characters, so that if the gene is 
segregating it causes simultaneous variation in the characters it 
affects. For example, genes that increase growth rate increase both 
stature and weight, so that they tend to cause correlation between 
these two characters. Genes that increase fatness, however, influence 
weight without affecting stature, and are therefore not a cause of 
correlation. The degree of correlation arising from pleiotropy
expresses the extent to which two characters are influenced by the same
genes. But the correlation resulting from pleiotropy is the overall, or
net, effect of all the segregating genes that affect both characters. 
Some genes may increase both characters, while others increase one 
and reduce the other; the former tend to cause a positive correlation, 
the latter a negative one. So pleiotropy does not necessarily cause a 
detectable correlation. The environment is a cause of correlation in 
so far as two characters are influenced by the same differences of 
environmental conditions. Again, the correlation resulting from en- 
vironmental causes is the overall effect of all the environmental 
factors that vary; some may tend to cause a positive correlation, others 
a negative one. 

The association between two characters that can be directly 
observed is the correlation of phenotypic values, or the phenotypic 
correlation. This is determined from measurements of the two 
characters in a number of individuals of the population. Suppose, 
however, that we knew not only the phenotypic values of the indi- 
viduals measured, but also their genotypic values and their environ- 
mental deviations for both characters. We could then compute the 
correlation between the genotypic values of the two characters and 
the correlation between the environmental deviations, and so assess 
independently the genetic and environmental causes of correlation. 
And if, in addition, we knew the breeding values of the individuals, we 
could determine also the correlation of breeding values. In principle 
there are also correlations between dominance deviations, and be- 
tween the various interaction deviations. To deal with all these cor- 
relations, even in theory, would be unmanageably complex, and 
fortunately is not necessary, since the practical problems can be quite 
adequately dealt with in terms of two correlations. These are the 
genetic correlation, which is the correlation of breeding values, and 
the environmental correlation, which is not strictly speaking the cor- 
relation of environmental deviations, but the correlation of environ- 
mental deviations together with non-additive genetic deviations. In 
other words, just as the partitioning of the variance of one charac- 
ter into the two components, additive genetic versus all the rest, 
was adequate for many purposes, so now the covariance of two 
characters need only be partitioned into these same two compon- 
ents. The "genetic" and "environmental" correlations thus
correspond to the partitioning of the covariance into the additive genetic
component versus all the rest. The methods of estimating these 
two correlations will be explained later. Let us consider first how
they combine together to give the directly observable phenotypic
correlation.

The following symbols will be used throughout this chapter:

X and Y: the two characters under consideration.

$r_P$: the phenotypic correlation between the two characters X and Y.

$r_A$: the genetic correlation between X and Y (i.e. the correlation
of breeding values).

$r_E$: the environmental correlation between X and Y (including
non-additive genetic effects).

$cov$: the covariance of the two characters X and Y, with subscripts
P, A, or E, having the same meaning as for the correlations.

$\sigma^2$ and $\sigma$: variance and standard deviation, with subscripts
P, A, or E, as above, and X or Y according to the character referred
to. E.g. $\sigma^2_{PX}$ = phenotypic variance of character X.

$h^2$: the heritability, with subscript X or Y, according to the
character.

$e^2 = 1 - h^2$.

(The customary symbol for the genetic correlation is $r_G$, but since the
genetic correlation is almost always the correlation of breeding values
we shall use the symbol $r_A$ for the sake of consistency with previous
chapters.)
A correlation, whatever its nature, is the ratio of the appropriate 
covariance to the product of the two standard deviations. For 
example, the phenotypic correlation is 


$$r_P = \frac{cov_P}{\sigma_{PX}\,\sigma_{PY}}$$

The phenotypic covariance is the sum of the genetic and environ-
mental covariances, so we can write the phenotypic correlation as

$$r_P = \frac{cov_A + cov_E}{\sigma_{PX}\,\sigma_{PY}}$$

The denominator can be differently expressed by the following
device: $\sigma^2_A = h^2\sigma^2_P$, and $\sigma^2_E = e^2\sigma^2_P$. So
$\sigma_P = \sigma_A/h = \sigma_E/e$. The phenotypic correlation then becomes

$$r_P = h_X h_Y \frac{cov_A}{\sigma_{AX}\,\sigma_{AY}} + e_X e_Y \frac{cov_E}{\sigma_{EX}\,\sigma_{EY}}$$

$$r_P = h_X h_Y r_A + e_X e_Y r_E \qquad (19.1)$$

This shows how the genetic and environmental causes of correlation 
combine together to give the phenotypic correlation. If both 
characters have low heritabilities then the phenotypic correlation is 
determined chiefly by the environmental correlation: if they have 
high heritabilities then the genetic correlation is the more important. 
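
The way the two causes of correlation combine, $r_P = h_X h_Y r_A + e_X e_Y r_E$ with $e^2 = 1 - h^2$, can be sketched in a few lines. The function names and the numerical values are hypothetical and serve only to show the arithmetic; the second function rearranges the same relationship to recover the environmental correlation.

```python
from math import sqrt


def phenotypic_correlation(h2_x, h2_y, r_a, r_e):
    """r_P = h_X h_Y r_A + e_X e_Y r_E, with e^2 = 1 - h^2."""
    h_x, h_y = sqrt(h2_x), sqrt(h2_y)
    e_x, e_y = sqrt(1.0 - h2_x), sqrt(1.0 - h2_y)
    return h_x * h_y * r_a + e_x * e_y * r_e


def environmental_correlation(h2_x, h2_y, r_p, r_a):
    """The same relationship solved for r_E."""
    h_x, h_y = sqrt(h2_x), sqrt(h2_y)
    e_x, e_y = sqrt(1.0 - h2_x), sqrt(1.0 - h2_y)
    return (r_p - h_x * h_y * r_a) / (e_x * e_y)


# Hypothetical values: low heritabilities, correlations of opposite sign.
r_p = phenotypic_correlation(0.25, 0.25, r_a=0.8, r_e=-0.2)
print(r_p)
print(environmental_correlation(0.25, 0.25, r_p, r_a=0.8))
```

In this hypothetical case a strong positive genetic correlation is almost cancelled by a weak negative environmental one, because with low heritabilities the environmental term carries the greater weight.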

The genetic and environmental correlations are often very differ- 
ent in magnitude and sometimes different even in sign, as may be 
seen from the examples given in Table 19.1. A difference in sign 
between the two correlations shows that genetic and environmental 
sources of variation affect the characters through different physio- 
logical mechanisms. The correlations between body-weight and egg- 
laying characters in poultry provide striking examples. Pullets that 
are larger at 18 weeks from genetic causes reach sexual maturity later 
and lay fewer eggs, but the eggs are larger. Pullets that are larger 
from environmental causes reach sexual maturity earlier and lay 
more eggs, which however are very little different in size. 

The dual nature of the phenotypic correlation makes it clear that 
the magnitude and even the sign of the genetic correlation cannot be 
determined from the phenotypic correlation alone. Let us therefore
consider the methods by which the genetic correlation can be
estimated.

Table 19.1

Some Examples of Phenotypic, Genetic, and
Environmental Correlations

The environmental correlations (except those marked *) were
calculated for this table from the genetic correlations and
heritabilities given in the papers cited, by equation 19.1.
They are not purely environmental in causation but include
correlation due to non-additive genetic causes, as explained
in the text. Those marked * are true environmental correla-
tions, estimated directly from the phenotypic correlation in
inbred lines and crosses.

                                                          r_P     r_A     r_E

Cattle (Johansson, 1950)
  Milk-yield : butterfat-yield
  Milk-yield : butterfat %
  Butterfat-yield : butterfat %

Pigs (Fredeen and Jonsson, 1957)
  Body length : backfat thickness
  Growth rate : feed efficiency
  Backfat thickness : feed efficiency

Sheep (Morley, 1955)
  Fleece weight : length of wool
  Fleece weight : crimps per inch
  Fleece weight : body weight

Poultry (Dickerson, 1957)
  Body weight (at 18 weeks) : egg-production
    (to 72 weeks of age)
  Body weight (at 18 weeks) : egg weight
  Body weight (at 18 weeks) : age at first egg           -.30     .29    -.50

Mice (Falconer, 1954b)
  Body weight : tail length (within litters)              .44     .59     .34

Drosophila melanogaster
  Bristle number, abdominal : sternopleural               .06     .08     .04
    (Clayton, Knight, Morris, and Robertson, 1957)
  Number of bristles on different abdominal segments       —      .96     .05
    (Reeve and Robertson, 1954)
  Thorax length : wing length                              —      .75     .50
    (Reeve and Robertson, 1953)

Estimation of the genetic correlation. The estimation of
genetic correlations rests on the resemblance between relatives in a
manner analogous to the estimation of heritabilities described in
Chapter 10. Therefore only the principle and not the details of the
procedure need be described here. Instead of computing the com-
ponents of variance of one character from an analysis of variance, we
compute the components of covariance of the two characters from an
analysis of covariance which takes exactly the same form as the ana-
lysis of variance. Instead of starting from the squares of the individual
values and partitioning the sums of squares according to the source
of variation, we start from the product of the values of the two
characters in each individual and partition the sums of products
according to the source of variation. This leads to estimates of the
observational components of covariance, whose interpretation in
terms of causal components of covariance is exactly the same as that
of the components of variance given in Table 10.4. Thus, in an
analysis of half-sib families the component of covariance between
sires estimates $\tfrac{1}{4}cov_A$, i.e. one quarter of the covariance of breeding
values of the two characters. For the estimation of the correlation
the components of variance of each character are also needed. Thus
the between-sire components of variance estimate $\tfrac{1}{4}\sigma^2_{AX}$ and $\tfrac{1}{4}\sigma^2_{AY}$.
Therefore the genetic correlation is obtained as

$$r_A = \frac{cov_{XY}}{\sqrt{var_X\,var_Y}} \qquad (19.2)$$

where var and cov refer to the components of variance and covariance.

The offspring-parent relationship can also be used for estimating
the genetic correlation. To estimate the heritability of one character
from the resemblance between offspring and parents we compute
the covariance of offspring and parent for the one character by
taking the product of the parent or mid-parent value and the mean
value of the offspring. To estimate the genetic correlation between
two characters we compute what might be called the "cross-
covariance," obtained from the product of the value of X in parents
and the value of Y in offspring. This "cross-covariance" is half the
genetic covariance of the two characters, i.e. $\tfrac{1}{2}cov_A$. The covariances
of offspring and parents for each of the characters separately are also
needed, and then the genetic correlation is given by

$$r_A = \frac{cov_{XY}}{\sqrt{cov_{XX}\,cov_{YY}}} \qquad (19.3)$$

where $cov_{XY}$ is the "cross-covariance," and $cov_{XX}$ and $cov_{YY}$ are the
offspring-parent covariances of each character separately.
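
Both estimates have the same shape: a covariance component divided by the geometric mean of two variance (or covariance) components. The fractions of one quarter or one half cancel between numerator and denominator. A minimal sketch follows, with hypothetical function names and with the offspring-parent version taking one parental value and one offspring mean per family as plain lists.

```python
from math import sqrt


def covariance(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / (n - 1)


def genetic_correlation_from_components(cov_xy, var_x, var_y):
    """Half-sib form: between-sire components of covariance and variance."""
    return cov_xy / sqrt(var_x * var_y)


def genetic_correlation_offspring_parent(parent_x, parent_y, offspring_x, offspring_y):
    """Offspring-parent form: cross-covariance of X in parents with Y in
    offspring, scaled by the two single-character covariances."""
    cov_xy = covariance(parent_x, offspring_y)
    cov_xx = covariance(parent_x, offspring_x)
    cov_yy = covariance(parent_y, offspring_y)
    return cov_xy / sqrt(cov_xx * cov_yy)


# Hypothetical between-sire components.
print(genetic_correlation_from_components(cov_xy=0.5, var_x=1.0, var_y=4.0))
```

With real data either denominator can turn out negative through sampling error, in which case the square root fails and no estimate is obtained; the sketch does not attempt to handle that case.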

The genetic correlation can also be estimated from responses to 
selection in a manner analogous to the estimation of realised herit- 
ability. This will be explained in the next section. 

Data that provide estimates of genetic correlations provide also 
estimates of the heritabilities of the correlated characters, and of the 
phenotypic correlations. The environmental correlation can then be 
found from equation 19.1. If highly inbred lines are available the
environmental correlations can be estimated directly from the
phenotypic correlation within the lines, or preferably within the F1's
of crosses between the lines.

Estimates of genetic correlations are usually subject to rather 


large sampling errors and are therefore seldom very precise. The 
sampling variance of genetic correlations is treated by Reeve (1955^) 
and by A. Robertson (1959b). The standard error of an estimate
is given approximately by the following formula:

$$\sigma_{(r_A)} = \frac{1 - r_A^2}{\sqrt{2}} \sqrt{\frac{\sigma_{(h_X^2)}\,\sigma_{(h_Y^2)}}{h_X^2\,h_Y^2}}$$

where $\sigma$ denotes standard error. Since the standard errors of the two
heritabilities appear in the numerator, an experiment designed to 
minimise the sampling variance of an estimate of heritability, in the 
manner described in Chapter 10, will also have the optimal design for 
the estimation of a genetic correlation. 
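
As a rough illustration of how large these sampling errors can be, the sketch below takes the approximate formula to have the form $\sigma_{(r_A)} = \frac{1 - r_A^2}{\sqrt{2}}\sqrt{\sigma_{(h_X^2)}\sigma_{(h_Y^2)} / (h_X^2 h_Y^2)}$. That form, the function name and the input values are assumptions made for the example, not results from any experiment.

```python
from math import sqrt


def se_genetic_correlation(r_a, h2_x, h2_y, se_h2_x, se_h2_y):
    """Approximate standard error of an estimated genetic correlation,
    given the two heritabilities and their standard errors."""
    return (1.0 - r_a ** 2) / sqrt(2.0) * sqrt((se_h2_x * se_h2_y) / (h2_x * h2_y))


# Hypothetical estimates: r_A = 0.5, both heritabilities 0.25 with s.e. 0.05.
print(se_genetic_correlation(0.5, 0.25, 0.25, 0.05, 0.05))
```

In this hypothetical case the standard error is about 0.11, a fifth of the estimate itself, even though each heritability is known to within 0.05.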

Correlated Response to Selection 

The next problem for consideration concerns the response to 
selection: if we select for character X, what will be the change of the 
correlated character Y? The expected response of a character, Y, 
when selection is applied to another character, X, may be deduced in 
the following way. The response of character X — i.e. the character 
directly selected — is equivalent to the mean breeding value of the 
selected individuals. This was explained in Chapter 11. The conse- 
quent change of character Y is therefore given by the regression of the 
breeding value of Y on the breeding value of X. This regression is 

$$b_{(A)YX} = \frac{cov_A}{\sigma^2_{AX}} = r_A \frac{\sigma_{AY}}{\sigma_{AX}}$$

The response of character X, directly selected, by equation 11.4, is

$$R_X = i h_X \sigma_{AX}$$

Therefore the correlated response of character Y is

$$CR_Y = b_{(A)YX} R_X = i h_X \sigma_{AX}\, r_A \frac{\sigma_{AY}}{\sigma_{AX}} = i h_X r_A \sigma_{AY} \qquad (19.4)$$

Or, by putting $\sigma_{AY} = h_Y \sigma_{PY}$, the correlated response becomes

$$CR_Y = i h_X h_Y r_A \sigma_{PY} \qquad (19.5)$$
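
A short sketch of the prediction may help. It takes the direct response as $R_X = i h_X \sigma_{AX} = i h_X^2 \sigma_{PX}$ and the correlated response as $CR_Y = i h_X h_Y r_A \sigma_{PY}$; the function names and all numerical values are hypothetical.

```python
from math import sqrt


def direct_response(i, h2_x, sd_p_x):
    """R_X = i * h_X^2 * sigma_PX, the response of the selected character."""
    return i * h2_x * sd_p_x


def correlated_response(i, h2_x, h2_y, r_a, sd_p_y):
    """CR_Y = i * h_X * h_Y * r_A * sigma_PY, the response of the
    character that is not itself selected."""
    return i * sqrt(h2_x) * sqrt(h2_y) * r_a * sd_p_y


# Hypothetical case: selection on X with intensity 1.4.
print(direct_response(i=1.4, h2_x=0.36, sd_p_x=3.0))
print(correlated_response(i=1.4, h2_x=0.36, h2_y=0.25, r_a=0.5, sd_p_y=2.0))
```

Both responses are in the units of the character concerned, since each is scaled by that character's own phenotypic standard deviation.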





Thus the response of a correlated character can be predicted if 
the genetic correlation and the heritabilities of the two characters are 
known. And, conversely, if the correlated response is measured by 
experiment, and the two heritabilities are known, the genetic corre- 
lation can be estimated. If the heritability of character Y is to be 
estimated as the realised heritability from the response to selection, 
then it is necessary to do a double selection experiment. Character X 
is selected in one line and character Y in another. Then both the 
direct and the correlated responses of each character can be measured. 
This type of experiment provides two estimates of the genetic corre-
lation (by equation 19.5), one from the correlated response of each
character; and the two estimates should agree if the theory of corre-
lated responses expressed in equation 19.5 adequately describes the
observed responses (Falconer, 1954b). A joint estimate of the genetic
correlation can be obtained from such double selection experiments,
without the need for estimates of the heritabilities, from the following
formula which may be easily derived from equations 11.4 and 19.4:

$$r_A^2 = \frac{CR_X}{R_X} \cdot \frac{CR_Y}{R_Y} \qquad (19.6)$$
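
The joint estimate, $r_A^2 = (CR_X/R_X)(CR_Y/R_Y)$, can be computed directly from the four observed responses of a double selection experiment. The sketch below uses a hypothetical function name and hypothetical responses; taking the sign of the estimate from the sign of the ratios is a convention adopted for the sketch.

```python
from math import sqrt


def joint_genetic_correlation(r_x, cr_x, r_y, cr_y):
    """Genetic correlation from direct responses (r_x, r_y) and correlated
    responses (cr_x, cr_y); heritabilities and selection intensities
    cancel in the product of the two ratios."""
    ratio_x = cr_x / r_x
    ratio_y = cr_y / r_y
    product = ratio_x * ratio_y
    if product < 0:
        # The two ratios disagree in sign, so no real square root exists.
        raise ValueError("correlated responses are inconsistent in sign")
    magnitude = sqrt(product)
    return magnitude if ratio_x >= 0 else -magnitude


# Hypothetical responses, each in the units of its own character.
print(joint_genetic_correlation(r_x=2.0, cr_x=1.2, r_y=3.0, cr_y=1.5))
```

Each ratio compares two responses of the same character, so the units cancel and the characters need not be measured on the same scale.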

Example 19.1. In a study of wing length and thorax length in Droso-
phila melanogaster, Reeve and Robertson (1953) estimated the genetic
correlation between these two measures of body size from the responses to
selection. There were two pairs of selection lines; one pair was selected for
increased and for decreased thorax length, and the other pair for increased
and for decreased wing length. In each line the correlated response of the
character not directly selected was measured, as well as the response of the
character directly selected. Two estimates of the genetic correlation were
obtained by equation 19.6, one from the responses to upward selection and
the other from the responses to downward selection. In addition, estimates
of the genetic correlation in the unselected population were obtained from
the offspring-parent covariance and also from the full-sib covariance. The
four estimates were as follows:

                           Genetic correlation
Offspring-parent
Full sib
Selection, upward
Selection, downward

The agreement between the estimates from selection and the estimates
from the unselected population shows that the correlated responses were
very close to what would have been predicted from the genetic analysis of
the unselected population.

Close agreement between observed and predicted correlated 
responses, such as was shown in the above example, cannot always be 
expected, particularly if the genetic correlation is low. With a low 
genetic correlation the expected response is small and is liable to be 
obscured by random drift (see Clayton, Knight, Morris and Robert- 
son, 1957). Also, if the genetic correlation is to any great extent 
caused by linkage, it is likely to diminish in magnitude through 
recombination, with a consequent diminution of the correlated 
response. There has not yet been enough experimental study of 
correlated responses to allow us to draw any conclusions about the 
number of generations over which they continue, nor about the total 
response when the limit is reached. 

Indirect selection. Consideration of correlated responses sug- 
gests that it might sometimes be possible to achieve more rapid pro- 
gress under selection for a correlated response than from selection for 
the desired character itself. In other words, if we want to improve 
character X, we might select for another character, Y, and achieve 
progress through the correlated response of character X. We shall 
refer to this as "indirect" selection; that is to say, selection applied to 
some character other than the one it is desired to improve. And we 
shall refer to the character to which selection is applied as the 
"secondary" character. The conditions under which indirect selec-
tion would be advantageous are readily deduced. Let $R_X$ be the
direct response of the desired character, if selection were applied
directly to it. And let $CR_X$ be the correlated response of character X
resulting from selection applied to the secondary character, Y. The
merit of indirect selection relative to that of direct selection may then
be expressed as the ratio of the expected responses, $CR_X/R_X$. Taking
the expected correlated response from equation 19.4 and the expected 
direct response from equation 11.4, we find

$$\frac{CR_X}{R_X} = \frac{i_Y h_Y r_A \sigma_{AX}}{i_X h_X \sigma_{AX}} = r_A \frac{i_Y h_Y}{i_X h_X} \qquad (19.7)$$

If the same intensity of selection can be achieved when selecting for
character Y as when selecting for character X, then the correlated 
response will be greater than the direct response if $r_A h_Y$ is greater
than $h_X$. Therefore indirect selection cannot be expected to be
superior to direct selection unless the secondary character has a 
substantially higher heritability than the desired character, and the 
genetic correlation between the two is high; or, unless a substantially 
higher intensity of selection can be applied to the secondary than to 
the desired character. The circumstances most likely to render 
indirect selection superior to direct selection are chiefly concerned 
with technical difficulties in applying selection directly to the desired 
character. Two such technical difficulties may be mentioned briefly. 

1. If the desired character is difficult to measure with precision,
the errors of measurement may so reduce the heritability that indirect 
selection becomes advantageous. Threshold characters in general are 
likely for this reason to repay a search for a suitable correlated charac- 
ter, unless the position of the threshold can be adjusted in the manner 
described in the last chapter. An interesting experimental result which 
may well prove to be an example of indirect selection being superior 
to direct selection concerns sex ratio in mice. The sex ratio among 
the progeny may be regarded as a metric character of the parents. 
Selection applied directly to sex ratio was ineffective in changing it 
(Falconer, 1954c), but selection for blood-pH produced a correlated 
change of sex ratio (Weir and Clark, 1955; Weir, 1955). The reason 
for the ineffectiveness of direct selection is probably that the true sex 
ratio of a family is subject to a large error of estimation resulting 
from the sampling variation, and the heritability is consequently very
low.

2. If the desired character is measurable in one sex only, but the 
secondary character is measurable in both, then a higher intensity of 
selection will be possible by indirect selection. Other things being 
equal, the intensity of selection would be twice as great by indirect 
as by direct selection; but a better plan would be to select one sex 
directly for the desired character and the other indirectly for the 
secondary character. 
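
The comparison of indirect with direct selection reduces to the ratio $CR_X/R_X = r_A\, i_Y h_Y / (i_X h_X)$, where X is the desired character and Y the secondary one. A minimal sketch, with a hypothetical function name and hypothetical heritabilities, shows the two situations described above.

```python
from math import sqrt


def relative_merit_of_indirect_selection(r_a, h2_desired, h2_secondary,
                                         i_desired=1.0, i_secondary=1.0):
    """Expected response of the desired character under indirect selection,
    as a multiple of its expected response under direct selection."""
    return r_a * (i_secondary * sqrt(h2_secondary)) / (i_desired * sqrt(h2_desired))


# Case 1: the desired character is poorly measured (low heritability),
# the secondary character has a much higher heritability.
print(relative_merit_of_indirect_selection(0.8, h2_desired=0.09, h2_secondary=0.36))

# Case 2: equal heritabilities, but twice the intensity of selection is
# possible on the secondary character.
print(relative_merit_of_indirect_selection(0.8, 0.25, 0.25, i_desired=1.0, i_secondary=2.0))
```

A value above 1 means that indirect selection is expected to be the faster route; both hypothetical cases give 1.6, but for different reasons.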

Though indirect selection has been presented above as an alterna- 
tive to direct selection, the most effective method in theory is neither 
one nor the other but a combination of the two. The most effective use 
that can be made of a correlated character is in combination with the 
desired character, as an additional source of information about the 
breeding values of individuals. This, however, is a special case of a
more general problem which will be dealt with in the final section of
this chapter. First we shall show how the idea of indirect selection 
can be extended to cover selection in different environments. 

Genotype-Environment Interaction 

The concept of genetic correlation can be applied to the solution 
of some problems connected with the interaction of genotype with 
environment. The meaning of interaction between genotype and 
environment was explained in Chapter 8, where it was discussed as a 
source of variation of phenotypic values, which in most analyses is 
inseparable from the environmental variance. The chief problem 
which it raises and which we are now in a position to discuss concerns 
adaptation to local conditions. The existence of genotype-environ-
ment interaction may mean that the best genotype in one environ-
ment is not the best in another environment. It is obvious, for 
example, that the breed of cattle with the highest milk-yield in 
temperate climates is unlikely also to have the highest yield in tropical 
climates. But it is not so obvious whether smaller differences of en- 
vironmental conditions also require locally adapted breeds; nor is it 
intuitively obvious how much of the improvement made in one 
environment will be carried over if the breed is then transferred to 
another environment. These matters have an important bearing on 
breeding policy. If selection is made under good conditions of feeding 
and management on the best farms and experimental stations, will 
the improvement achieved be carried over when the later generations 
are transferred to poorer conditions? Or would the selection be 
better done in the poorer conditions under which the majority of 
animals are required to live? The idea of genetic correlation provides
the basis for a solution of these problems in the following way. 

A character measured in two different environments is to be 
regarded not as one character but as two. The physiological mechan- 
isms are to some extent different, and consequently the genes re- 
quired for high performance are to some extent also different. For 
example, growth rate on a low plane of nutrition may be principally 
a matter of efficiency of food-utilisation, whereas on a high plane of 
nutrition it may be principally a matter of appetite. By regarding 
performance in different environments as different characters with 
genetic correlation between them we can in principle solve the
problems outlined above from a knowledge of the heritabilities of the
different characters and the genetic correlations between them 
(Falconer, 1952). If the genetic correlation is high, then performance 
in two different environments represents very nearly the same 
character, determined by very nearly the same set of genes. If it is 
low, then the characters are to a great extent different, and high 
performance requires a different set of genes. Here we shall con- 
sider only two environments, but the idea can be extended to an 
indefinite number of different environments (A. Robertson, 

Let us consider the problem of the "carry-over" of the improve-
ment from one environment to another. Let us suppose that we
select for character X — say growth rate on a high plane of nutrition — 
and we look for improvement in character Y — say growth rate on a 
low plane of nutrition. The improvement of character Y is simply a 
correlated response and the expected rate of improvement was given 
in equation 19.5 as

$$CR_Y = i h_X h_Y r_A \sigma_{PY}$$

The improvement of performance in an environment different from 
the one in which selection was carried out can therefore be predicted 
from a knowledge of the heritability of performance in each environ- 
ment and the genetic correlation between the two performances. We 
can also compare the improvement expected by this means with that 
expected if we had selected directly for character Y, i.e. for perfor- 
mance in the environment for which improvement is wanted. This 
is simply a comparison of indirect with direct selection, which was 
explained in the previous section. The comparison is made from the 
ratio of the two expected responses given in equation 19.7, i.e.

$$\frac{CR_Y}{R_Y} = r_A \frac{i_X h_X}{i_Y h_Y}$$

This shows how much we may expect to gain or lose by carrying out 
the selection in some environment other than the one in which the 
improved population is required to live. If we assume that the in- 
tensity of selection is not affected by the environment in which the 
selection is carried out, then the indirect method will be better if 
$r_A h_X$ is greater than $h_Y$, where $h_X$ is the square root of the heritability
in the environment in which selection is made, and $h_Y$ is the square
root of the heritability in the environment in which the population is
required subsequently to live. If the genetic correlation is high, then
the two characters can be regarded as being substantially the same; 
and if there are no special circumstances affecting the heritability or 
the intensity of selection it will make little difference in which en- 
vironment the selection is carried out. But if the genetic correlation 
is low, then it will be advantageous to carry out the selection in the 
environment in which the population is destined to live, unless the 
heritability or the intensity of selection in the other environment is 
very considerably higher. 
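
The comparison of indirect with direct selection is a matter of simple arithmetic, and may be sketched as follows. This is only an illustration: it assumes that the correlated response is CR_Y = i_X h_X h_Y r_A σ_PY and the direct response is R_Y = i_Y h²_Y σ_PY, and the function names and the numerical values are hypothetical, not taken from any experiment.

```python
import math


def correlated_response(i_x, h2_x, h2_y, r_a, sd_y):
    """Expected change in Y per generation when selection is applied to X."""
    return i_x * math.sqrt(h2_x) * math.sqrt(h2_y) * r_a * sd_y


def direct_response(i_y, h2_y, sd_y):
    """Expected change in Y per generation when selection is applied to Y itself."""
    return i_y * h2_y * sd_y


def indirect_over_direct(i_x, i_y, h2_x, h2_y, r_a):
    """Ratio of correlated to direct response: r_A * i_X h_X / (i_Y h_Y)."""
    return r_a * (i_x * math.sqrt(h2_x)) / (i_y * math.sqrt(h2_y))


# Hypothetical values, chosen only for illustration.
i = 1.4        # intensity of selection, assumed equal in both environments
h2_x = 0.50    # heritability in the environment where selection is made
h2_y = 0.20    # heritability in the environment where the stock must live
r_a = 0.70     # genetic correlation between the two performances
sd_y = 2.0     # phenotypic standard deviation of Y

cr_y = correlated_response(i, h2_x, h2_y, r_a, sd_y)
r_y = direct_response(i, h2_y, sd_y)
ratio = indirect_over_direct(i, i, h2_x, h2_y, r_a)

# With equal intensities, the indirect method is better only if r_A * h_X > h_Y.
indirect_is_better = r_a * math.sqrt(h2_x) > math.sqrt(h2_y)

print(f"CR_Y = {cr_y:.3f}, R_Y = {r_y:.3f}, ratio = {ratio:.3f}")
print("indirect selection better:", indirect_is_better)
```

With these assumed values the ratio exceeds 1, because the higher heritability in the environment of selection outweighs the imperfect genetic correlation. Lowering r_a reverses the verdict.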

This is the theoretical basis for dealing with selection in different 
environments. So far, however, there has been little experimental 
work to substantiate the theory. The results of the experiments that 
have been carried out do not appear to be fully in agreement with 
theoretical expectations, and this suggests that other factors not yet 
understood are probably operating. (See Falconer and Latyszewski, 
1952; Falconer, 1952.) 

Simultaneous Selection for more than one Character 

When selection is applied to the improvement of the economic 
value of animals or plants it is generally applied to several characters 
simultaneously and not just to one, because economic value depends 
on more than one character. For example, the profit made from a 
herd of pigs depends on their fertility, mothering ability, growth 
rate, efficiency of food-utilisation, and carcass qualities. How, then, 
should selection be applied to the component characters in order to 
achieve the maximum improvement of economic value? There are 
several possible procedures. One might select in turn for each 
character singly ("tandem" selection); or one might select for all the 
characters at the same time but independently, rejecting all individuals 
that fail to come up to a certain standard for each character regardless 
of their values for any other of the characters ("independent culling 
levels"). It has been shown, however, that the most rapid improvement
of economic value is expected from selection applied simultaneously
to all the component characters together, appropriate weight
being given to each character according to its relative economic
importance, its heritability and the genetic and phenotypic correlations
between the different characters (Hazel and Lush, 1942;
Hazel, 1943). The practice of selection for economic value is thus a
matter of some complexity. The component characters have to be
combined together into a score, or index, in such a way that selection 
applied to the index, as if the index were a single character, will yield 
the most rapid possible improvement of economic value. If the 
characters are uncorrelated there is no great problem: each character 
is weighted by the product of its relative economic value and its 
heritability. This is the best that can be done in the absence of 
information about the genetic correlations, but if the genetic correla- 
tions are known the efficiency of the index can be improved. The 
following account gives an outline of the principles on which the 
construction of a selection index is based. For a fuller account the 
reader should consult Lerner (1950) and the original papers of 
Fairfield Smith (1936) and Hazel (1943). 

For the sake of simplicity we shall consider only two component 
characters of economic value, but the conclusions can readily be ex- 
tended to any number of characters. Let the economic value be 
determined by two characters X and Y, and let w be the additional 
profit expected from one unit increase of Y relative to that from one 
unit increase of X. The aim of selection therefore is to pick out 
individuals with the highest values of $(A_X + wA_Y)$, where $A_X$ and $A_Y$
are the breeding values of the two characters X and Y. Let us call
this compound breeding value "merit," with the symbol H, so that

$$H = A_X + wA_Y \tag{19.8}$$

The problem is to find out how the phenotypic values, $P_X$ and $P_Y$, of
the two component characters are to be combined into an index that
gives the best estimate of an individual's merit, H. In Chapter 10 we
saw how the best estimate of the breeding value of an individual for
one character is the regression equation $A = b_{AP} P$, where $b_{AP}$ is the
regression of breeding value on phenotypic value, and is equal to the
heritability (see p. 166). The present problem is essentially the same,
only now we have to use partial regression coefficients. The multiple
regression equation giving the best estimate of merit is

$$H = b_{HX.Y}\,P_X + b_{HY.X}\,P_Y \tag{19.9}$$

where $P_X$ and $P_Y$ are phenotypic values measured as deviations from
the population mean. (In this formula, and in those that follow, the
symbol X has the same meaning as $P_X$, i.e. the phenotypic value of
character X; and similarly Y and $P_Y$ both mean the phenotypic value
of character Y. Thus, $b_{HX.Y}$ is the regression of merit on the phenotypic
value of X when the phenotypic value of Y is held constant, and
$b_{HY.X}$ has a similar meaning with X and Y interchanged.) In practice
it is convenient to have the index in a form that requires the manipulation
of only one of the phenotypic values, i.e. in the form

$$I = P_X + W P_Y \tag{19.10}$$

where I is the index by means of which individuals are to be chosen,
and W is a factor by which the phenotypic value of character Y is
to be multiplied. Since the absolute magnitude of the index is of no
importance, but only its relative magnitude in different individuals,
we can work with the phenotypic values as they stand instead of with
deviations from the population mean. And we can put equation
19.9 into the form of equation 19.10 simply by dividing through by
$b_{HX.Y}$. Then W in equation 19.10 is the ratio of the two partial
regression coefficients,

$$W = \frac{b_{HY.X}}{b_{HX.Y}}$$

and our task now is to find a way of expressing W in terms of the 
genetic properties of the two characters. 

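First let us put the partial regression coefficients in terms of the
total regression coefficients. For example,

$$b_{HY.X} = \frac{b_{HY} - b_{HX}\,b_{XY}}{1 - r_{XY}^2}$$

Now let us express these total regressions in terms of covariances and
variances. For example, $b_{HY} = \mathrm{cov}_{HY}/\sigma_Y^2$. After some simplification
the expression reduces to

$$W = \frac{\sigma_X^2\,\mathrm{cov}_{HY} - \mathrm{cov}_{HX}\,\mathrm{cov}_{XY}}{\sigma_Y^2\,\mathrm{cov}_{HX} - \mathrm{cov}_{HY}\,\mathrm{cov}_{XY}} \tag{19.11}$$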



The variances $\sigma_X^2$ and $\sigma_Y^2$ here, and in what follows, are the phenotypic
variances of characters X and Y. The covariances in the above
expression can be expressed in terms of the phenotypic variance and
the heritability of each character and of the phenotypic and genetic
correlations between the two characters, all of which quantities can
be estimated. Take, for example, the covariance of H with X. This
may be written as follows:

$$\begin{aligned}
\mathrm{cov}_{HX} &= \text{covariance of } (A_X + wA_Y) \text{ with } P_X \\
&= \mathrm{cov}_{(A_X,\,P_X)} + \mathrm{cov}_{(wA_Y,\,P_X)} \\
&= h_X^2\sigma_X^2 + w\,r_A\,h_X\sigma_X\,h_Y\sigma_Y
\end{aligned}$$

In this way the covariances in equation 19.11 can be expressed as
follows, $\sigma$ and $\sigma^2$ being phenotypic standard deviations and variances
throughout:

$$\begin{aligned}
\mathrm{cov}_{HX} &= h_X^2\sigma_X^2 + w\,r_A\,h_X h_Y\,\sigma_X\sigma_Y \\
\mathrm{cov}_{HY} &= w\,h_Y^2\sigma_Y^2 + r_A\,h_X h_Y\,\sigma_X\sigma_Y \\
\mathrm{cov}_{XY} &= r_P\,\sigma_X\sigma_Y
\end{aligned} \tag{19.12}$$

The procedure for selection is thus to compute the covariances
given in 19.12, substitute them in 19.11 and use the value of W so
obtained to compute the index of selection I given in 19.10. The
value of the index for each individual then forms the basis of selection. 
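
The procedure can be sketched in a few lines of Python. The sketch assumes the covariances cov_HX = h²_X σ²_X + w r_A h_X h_Y σ_X σ_Y, cov_HY = w h²_Y σ²_Y + r_A h_X h_Y σ_X σ_Y and cov_XY = r_P σ_X σ_Y, and takes W as (σ²_X cov_HY − cov_HX cov_XY) / (σ²_Y cov_HX − cov_HY cov_XY). The function names and all the numerical estimates are hypothetical.

```python
import math


def index_covariances(w, h2_x, h2_y, r_a, r_p, sd_x, sd_y):
    """Covariances of merit H with X and with Y, and of X with Y."""
    h_x, h_y = math.sqrt(h2_x), math.sqrt(h2_y)
    shared = r_a * h_x * h_y * sd_x * sd_y      # additive genetic covariance of X and Y
    cov_hx = h2_x * sd_x ** 2 + w * shared
    cov_hy = w * h2_y * sd_y ** 2 + shared
    cov_xy = r_p * sd_x * sd_y                  # phenotypic covariance of X and Y
    return cov_hx, cov_hy, cov_xy


def index_weight(w, h2_x, h2_y, r_a, r_p, sd_x, sd_y):
    """Weighting factor W for the index I = P_X + W * P_Y."""
    cov_hx, cov_hy, cov_xy = index_covariances(w, h2_x, h2_y, r_a, r_p, sd_x, sd_y)
    numerator = sd_x ** 2 * cov_hy - cov_hx * cov_xy
    denominator = sd_y ** 2 * cov_hx - cov_hy * cov_xy
    return numerator / denominator


def index_value(p_x, p_y, W):
    """Index score of one individual from its two phenotypic values."""
    return p_x + W * p_y


# Hypothetical estimates of the seven quantities required.
W = index_weight(w=2.0, h2_x=0.4, h2_y=0.2, r_a=-0.3, r_p=0.1, sd_x=3.0, sd_y=5.0)

# Hypothetical phenotypic values (P_X, P_Y) of two candidates for selection.
candidates = [(52.0, 10.0), (49.0, 14.0)]
scores = [index_value(p_x, p_y, W) for p_x, p_y in candidates]

# With no correlation, genetic or phenotypic, W reduces to w * h2_y / h2_x.
W_uncorrelated = index_weight(w=2.0, h2_x=0.4, h2_y=0.2, r_a=0.0, r_p=0.0,
                              sd_x=3.0, sd_y=5.0)
```

The last line checks the simple case of uncorrelated characters, where each character is weighted by the product of its relative economic value and its heritability.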

The index as formulated above is applicable only to individual 
selection. If family selection is applied then the heritabilities and 
correlations that go into the index must be those appropriate to the 
family means. Family selection, however, is not greatly improved by 
the use of an index, because the family heritabilities of the component 
characters are generally fairly high and the mean economic value of a 
family in terms of phenotypic values is not very different from its 
merit in terms of breeding values. Therefore family selection for 
economic value can be applied with little loss of efficiency if the 
phenotypic values are weighted only by w, the relative economic im- 
portance of each component character. 

The complexity of selection by means of an index need hardly be 
emphasised, especially when the index is extended to cover many 
component characters. Even with two characters, estimates of no 
fewer than seven quantities are required for the construction of the 
index. Since some of these, particularly the genetic correlation, 
cannot usually be estimated with any great precision, the index 
cannot be regarded as much more than a rough guide to procedure. 
But since selection has to be applied to economic value by some 
means, it seems better to use a selection index, however imprecise, 
than to base selection on a purely arbitrary combination of component
characters.

Use of a secondary character by means of an index. The
selection index described above can readily be adapted to meet the
case where improvement of only one character is sought, the other 
character being used merely as an aid to more efficient selection. 
The use of a secondary character in this way was mentioned earlier, 
in connexion with indirect selection. Let X be the character it is 
desired to improve, and Y the secondary character. Then the 
relative "economic" value of character Y is zero, and we can substitute 
w = 0 in the formulae of 19.12. Substitution of the covariances in
equation 19.11 then yields a formula which on simplification reduces
to

$$W = \frac{\sigma_X\,(r_A h_Y - r_P h_X)}{\sigma_Y\,(h_X - r_A r_P h_Y)} \tag{19.13}$$

The selection index of equation 19.10 is then used with this value of
W. The value of W in the index may be negative. This will arise if 
the phenotypic correlation between the two characters is chiefly 
environmental in origin. The secondary character then acts as an 
indicator of the environmental deviation rather than of the breeding 
value of the desired character (see Rendel, 1954; and Osborne, 

Genetic correlation and the selection limit. There is one 
important consequence of simultaneous selection for several charac- 
ters to be discussed before we leave the subject. Just as the herit- 
abilities are expected to change after selection has been applied for 
some time, so also are the genetic correlations. If selection has been 
applied to two characters simultaneously the genetic correlation 
between them is expected eventually to become negative, for the 
following reason. Those pleiotropic genes that affect both characters 
in the desired direction will be strongly acted on by selection and 
brought rapidly toward fixation. They will then contribute little to 
the variances or to the covariance of the two characters. The pleio- 
tropic genes that affect one character favourably and the other ad- 
versely will, however, be much less strongly influenced by selection 
and will remain for longer at intermediate frequencies. Most of the 
remaining covariance of the two characters will therefore be due to 
these genes, and the resulting genetic correlation will be negative. 
The consequence of a negative genetic correlation, whether produced 
by selection in this way or present from the beginning, is that the two 
characters may each show a heritability that is far from zero, and yet 
when selection is applied to them simultaneously neither responds.
We have already discussed, in Chapter 12, what is essentially the
same situation resulting from the combined effects of artificial and 
natural selection: a selection limit is reached even though the charac- 
ter to which artificial selection is applied still shows a substantial 
amount of additive genetic variance. 
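
The argument can be illustrated by a deliberately crude deterministic sketch. It assumes additive loci of two kinds only: those whose favourable allele improves both characters, and those whose allele improves one character and impairs the other to an equal extent. The change of gene frequency per generation is taken, as a weak-selection approximation, to be proportional to the net effect on the sum X + Y and to p(1 − p). The function names, the effects, and the strength of selection are all hypothetical.

```python
import math


def genetic_correlation(freqs, effects):
    """Additive genetic correlation; each locus contributes 2p(1-p) * a_x * a_y."""
    var_x = sum(2 * p * (1 - p) * ax * ax for p, (ax, ay) in zip(freqs, effects))
    var_y = sum(2 * p * (1 - p) * ay * ay for p, (ax, ay) in zip(freqs, effects))
    cov = sum(2 * p * (1 - p) * ax * ay for p, (ax, ay) in zip(freqs, effects))
    return cov / math.sqrt(var_x * var_y)


def select_on_both(freqs, effects, strength=0.05, generations=100):
    """Apply selection for X + Y and record the genetic correlation each generation."""
    history = [genetic_correlation(freqs, effects)]
    for _ in range(generations):
        # Assumed recursion: delta p = strength * (a_x + a_y) * p * (1 - p).
        freqs = [min(1.0, max(0.0, p + strength * (ax + ay) * p * (1 - p)))
                 for p, (ax, ay) in zip(freqs, effects)]
        history.append(genetic_correlation(freqs, effects))
    return freqs, history


# Five loci favourable to both characters, five favourable to X but adverse to Y.
effects = [(1.0, 1.0)] * 5 + [(1.0, -1.0)] * 5
start = [0.5] * len(effects)

final_freqs, r_a_history = select_on_both(start, effects)
print(f"r_A at start: {r_a_history[0]:.2f}, after selection: {r_a_history[-1]:.2f}")
```

Under these assumptions the correlation starts at zero and becomes strongly negative as the genes favourable to both characters approach fixation, while both characters retain additive variance from the loci with opposed effects.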

Example 19.2. A practical example from a commercial flock of poultry 
is described by Dickerson (1955). Selection for economic value had been 
applied for many years, but recent progress in the component characters 
was much less than was to be expected from their heritabilities, which were 
found to be moderately high. Estimations of the genetic correlations 
between the component characters showed that many of these were nega- 
tive. To take just one example, the relationships between egg-production 
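and egg weight were as follows:

                          Egg-production   Egg weight
  Heritability                 0.32           0.59

  Correlations between the two characters:
    Phenotypic                 0.04
    Genetic                   -0.39
    Environmental             +0.25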

In spite of the high heritabilities neither character had shown any improve- 
ment over the last 10-15 years. The high negative genetic correlation 
would account for this failure to respond, if selection was applied to both 
characters simultaneously. It is interesting to note that environmental 
variation, unlike genetic variation, affects both characters in the same way 
and leads to a positive environmental correlation. The phenotypic cor- 
relation, which is almost zero, gives no clue to the genetic relationship 
between the two characters, and the failure to respond to selection could 
not have been predicted from it alone.

A population which has been subjected over a long period to 
selection for economic value throws light on the genetic properties to 
be expected in natural populations subject to natural selection for 
fitness. Fitness is a compound character with many components — 
far more than would appear in the most elaborate assessment of 
economic value — and so we should expect negative genetic correla- 
tions between its major components, a conclusion to be developed 
further in the next chapter. It is interesting to note, however, that 
natural selection takes no account of heritabilities or genetic correla- 
tions, and is therefore, in theory, less efficient in improving fitness 
than artificial selection by means of an index is in improving economic 



Throughout the discussion of the genetic properties of metric 
characters, which has occupied the major part of the book, very little 
attention has been given to the effects of natural selection, and some- 
thing must now be done to remedy this omission. The absence of 
differential viability and fertility was specified as a condition in the 
theoretical development of the subject: that is to say, natural selection 
was assumed to be absent. Though for many purposes this assump- 
tion may lead to no serious error, a complete understanding of metric 
characters will not be reached until the effects of natural selection 
can be brought into the picture. The operation of natural selection 
on metric characters has, however, a much wider interest than just 
as a complication that may disturb the simple theoretical picture and 
the predictions based on it. It is to natural selection that we must look 
for an explanation of the genetic properties of metric characters which 
hitherto we have accepted with little comment. The genetic pro- 
perties of a population are the product of natural selection in the past, 
together with mutation and random drift. It is by these processes 
that we must account for the existence of genetic variability; and it is 
chiefly by natural selection that we must account for the fact that 
characters differ in their genetic properties, some having propor- 
tionately more additive variance than others, some showing in- 
breeding depression while others do not. These, however, are very 
wide problems which are still far from solution, and in this con- 
cluding chapter we can do little more than indicate their nature. Any 
discussion of them, moreover, cannot but be controversial; the reader 
should therefore understand that the contents of this chapter are to a 
large extent matters of personal opinion, and that any conclusions to 
which the discussion may lead are open to dispute. 

We shall refer throughout to a population that is in genetic equil- 
ibrium. Being in genetic equilibrium means that the gene frequencies 
are not changing, and therefore that the mean values of all metric
characters are constant. (Changes of environmental conditions are
assumed to be so slow as to be negligible.) The population is con- 
stantly subject to natural selection tending to increase fitness, but 
despite the selection the gene frequencies do not change and fitness 
does not improve. There can therefore be no additive genetic vari- 
ance of fitness: in other words, if we could measure fitness itself as a 
character we should find that its genetic variance was entirely non- 
additive. For the purposes of discussion we may regard any natural 
population as being in genetic equilibrium, at least approximately, 
and also any population that has been subject to artificial selection 
consistently over a long period of time, provided that fitness is 
defined in terms of both the artificial and the natural selection. 
Fitness, crudely defined, is the "character" selected for, whether by 
natural selection alone or by artificial and natural selection combined.

If a population is in genetic equilibrium it follows that a reduction 
of fitness must in principle result from any change in the array of gene 
frequencies, apart from any genes that may have no effect on fitness. 
Natural selection must therefore be expected to resist any tendency 
to change of the gene frequencies, such as must result from artificial 
selection applied to any metric character other than fitness itself. 
This principle has been called "genetic homeostasis," and its consequences
have been discussed by Lerner (1954). Thus if we change
any metric character by artificial selection we must expect a reduction 
of fitness as a correlated response. And if we then suspend the arti- 
ficial selection before any of the variation has been lost by fixation, 
we must expect the population mean to revert to its original value. 
On the whole, the experience of artificial selection is in general 
agreement with this expectation, though under laboratory conditions 
the reduction of fitness may not be apparent in the early stages, and 
some characters appear to revert very slowly, if at all, toward the 
original value. Our domesticated animals and plants are perhaps the 
best demonstration of the effects of the principle. The improvements 
that have been made by selection in these have clearly been accom- 
panied by a reduction of fitness for life under natural conditions, and 
only the fact that domestic animals and plants do not have to live 
under natural conditions has allowed these improvements to be 
made. The problems for discussion in this chapter must be seen 
against the background of this principle: that the existing array of 
gene frequencies, and consequently the existing genetic properties of
the population, represent the best total adjustment to existing conditions
that is possible with the available genetic variation.

The problem of how natural selection operates on metric charac- 
ters has two aspects: the relation between any particular metric 
character and fitness, and the way in which natural selection operates 
on the individual loci concerned with a metric character. This latter 
aspect is part of a wider problem which concerns the reasons for the 
existence of genetic variation. We shall discuss these two aspects 
separately, because any conclusions that may be drawn about the 
second will depend on what can be discovered about the first. 

Relation of Metric Characters to Fitness 

The fitness of an individual is the final outcome of all its develop- 
mental and physiological processes. The differences between indi- 
viduals in these processes are seen in variation of the measurable 
attributes which can be studied as metric characters. Thus the 
variation of each metric character reflects to a greater or lesser degree 
the variation of fitness; and the variation of fitness can theoretically 
be broken down into variation of metric characters. Let us consider 
for example a mammal such as the mouse, because this matter is more 
easily discussed in concrete terms. Fitness itself might be broken 
down into two or three major components, which could be measured 
and studied as metric characters. These might be the total number 
of young reared, and some measure of the quality of the young, such 
as their weaning weight. The variation of the major components 
would account for all the variation of fitness. Each of the major com- 
ponents might be broken down into other metric characters which 
would account for all their variation. Thus the total number of 
young weaned depends on the viability of the parent up to breeding 
age, its mating ability, average litter size, frequency of litters, and 
longevity. These characters in turn might be further broken down. 
For example, litter size depends on the number of eggs shed and the 
proportion that are brought to term. The number of eggs shed 
depends, again, on body size and endocrine activity, among other 
things. Thus each metric character has its place in one of a series of 
chains of causation converging toward fitness. And these chains of 
causation interconnect one with another: body size, for example, 
influences not only litter size, but also lactation, longevity, and probably
many other characters. The relationship between any particular
metric character and fitness is thus a very complicated matter. The
following discussion of the problem is based largely on the ideas put
forward by A. Robertson (1955b).

The way in which natural selection operates on a character 
depends on the part played by the character in the causation of differ- 
ences of fitness: that is to say, on the manner and degree by which 
differences of value of the metric character cause differences of fitness. 
This we shall refer to as the "functional relationship" between the 
character and fitness. The functional relationship expresses the 
mode of operation of natural selection on the metric character; but it 
is not necessarily also the relationship that would be revealed if we 
could measure the fitness of individuals and compare their fitness with 
their values for the metric character. This point, however, will be 
more easily explained by an example to be given in a moment. 
Different characters must be expected to have different functional 
relationships with fitness, according to the nature of the character. 
In explanation of the kinds of relationship that may be envisaged let 
us take some examples of different sorts of character at different 
positions in the chain of causation. 

1. Neutral characters. There may be some characters that have 
no functional relationship at all with fitness. This does not mean 
that, like vestigial organs, they have no function or use. It means that 
the variation in the character is not a cause of variation of fitness. 
Abdominal bristle number in Drosophila may be taken as an example 
of a character which is probably not far from this state, and two 
reasons can be given for regarding it thus. First, it is difficult to 
conceive of any biological reason why it should be important to have 
18 bristles, or thereabouts, on each segment rather than more or 
fewer. And second, if we change the bristle number by artificial 
selection and then suspend the selection, the mean bristle number 
does not return to its original value — or returns only very slowly — 
under the influence of natural selection, even though it could be 
brought back rapidly by artificial selection (Clayton, Morris, and 
Robertson, 1957). In other words, genetic homeostasis in respect of 
bristle number is weak or non-existent. Such a metric character may 
be termed "neutral" with respect to fitness. The mean value of a 
neutral character in the population has little or nothing to do with the 
character itself, but is the outcome of the pleiotropic effects of the 
genes whose frequencies are controlled by their effects on other
characters. Though a neutral character has no functional relationship
with fitness, we may nevertheless find that individuals with different 
values do in fact differ in fitness in a regular way. If the genetic 
variance of the character is predominantly additive then individuals 
with intermediate values will tend on the whole to be heterozygous at 
more loci than individuals with extreme values. Then if hetero- 
zygotes were superior in fitness for some other reason, unconnected 
with the character in question, this would result in intermediates 
being superior in fitness. At the level of observation there would be 
a relationship between values of the character and fitness, but this 
would not be a functional relationship because the values of the 
character are not the cause of the differences of fitness. The differ- 
ences of fitness are the result of the functional relationships of other 
characters affected by the pleiotropic action of the genes. 
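
That intermediates tend to be heterozygous at more loci can be verified by direct enumeration. The sketch below assumes a small number of unlinked loci with equal and strictly additive effects, gene frequencies of one half, and Hardy-Weinberg proportions at each locus; the function name and the number of loci are hypothetical.

```python
from collections import defaultdict
from itertools import product


def mean_heterozygous_loci_by_value(n_loci=4, p=0.5):
    """Mean number of heterozygous loci among individuals of each character value.

    A genotype is coded per locus as the number of increasing alleles (0, 1 or 2),
    and the character value is the total over loci (purely additive gene action).
    """
    genotype_prob = {0: (1 - p) ** 2, 1: 2 * p * (1 - p), 2: p ** 2}
    weight = defaultdict(float)    # total frequency of each character value
    het_sum = defaultdict(float)   # frequency-weighted count of heterozygous loci
    for genotype in product((0, 1, 2), repeat=n_loci):
        prob = 1.0
        for g in genotype:
            prob *= genotype_prob[g]
        value = sum(genotype)
        n_het = sum(1 for g in genotype if g == 1)
        weight[value] += prob
        het_sum[value] += prob * n_het
    return {value: het_sum[value] / weight[value] for value in sorted(weight)}


het_by_value = mean_heterozygous_loci_by_value()
for value, mean_het in het_by_value.items():
    print(f"value {value}: mean heterozygous loci = {mean_het:.2f}")
```

The two extreme values can be reached only by individuals homozygous at every locus, and the mean number of heterozygous loci is greatest at the central value. If heterozygotes were fitter for any unrelated reason, intermediates would therefore appear fitter without the character itself being the cause.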

2. Characters with intermediate optima. There are some 
characters for which an intermediate value is optimal for functional 
reasons. One might distinguish three sorts of intermediate optimum
according to the reasons for intermediates being superior in fitness.

(i) Optima determined by the character itself. As an example we 
might take any character that measures the thermal insulation of a 
mammalian coat. Too dense a coat would be disadvantageous and so 
would too sparse a coat. An intermediate density would confer the 
highest fitness as a consequence of the function of the coat in thermo- 
regulation. For such a character the mean value in the population is 
the optimal value, provided there are no complications of the sort to 
be considered later. Though irrefutable biological reasons might be 
given for supposing that a character such as the density of fur has an 
intermediate optimal value, we might nevertheless find that over the 
range of variation covered by the population there was very little 
variation in fitness. In practice therefore one could not expect always 
to draw a clear line between this sort of character and a neutral 
character such as we have taken bristle number to be. 

(ii) Optima imposed by the environment. As an example we may 
take the clutch size of birds. It has been shown, particularly for the 
European robin and swift, that a larger number of young are reared 
from nests containing the average number of eggs than from nests 
with larger or smaller clutches (Lack, 1954). Thus individuals with 
intermediate values appear to be the fittest. If a character such as 
this has an optimal value that is intermediate there must obviously
be some other factor interacting with it to determine fitness; for,
otherwise, the individuals that lay more eggs must inevitably be the 
fitter. The other factor in this case is the supply of insects for feeding 
the young and the length of daylight available for their capture. 
With characters of this sort natural selection tends to eliminate indi- 
viduals with extreme values and favours individuals with intermediate 
values. The mean value in the population is the optimal value under 
the environmental conditions to which the population is subjected. 
If the environment were to change, the population mean would 
change too in adjustment to the new optimum. In the case of clutch 
size it is noteworthy that the mean value varies with the latitude, 
being larger in the north than in the south. 
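
How an environmental limitation produces an intermediate optimum may be seen from a toy model. It is not fitted to any data: it merely assumes that the chance of a chick being reared falls off linearly as the clutch grows, reaching zero at a hypothetical clutch size, food_limit, set by the food supply. The function names and values are illustrative only.

```python
def young_reared(clutch_size, food_limit):
    """Young reared from one nest, under an assumed linear fall in chick survival.

    food_limit is a hypothetical clutch size at which no chick could be fed.
    """
    survival = max(0.0, 1.0 - clutch_size / food_limit)
    return clutch_size * survival


def best_clutch(food_limit, max_clutch=20):
    """Whole-number clutch size giving the largest number of young reared."""
    return max(range(1, max_clutch + 1),
               key=lambda clutch: young_reared(clutch, food_limit))


poorer = best_clutch(food_limit=10)   # scarcer insects or shorter days
richer = best_clutch(food_limit=14)   # more insects or longer days

print("optimal clutch, poorer environment:", poorer)
print("optimal clutch, richer environment:", richer)
```

Under this assumption the most productive clutch is an intermediate one, and it moves upward when the environment allows more young to be fed, which is the sense of the shift with latitude noted above.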

(iii) Optima imposed by a correlated character. Body size in mice 
may be taken as an example. Larger mice have larger litters and, 
under laboratory conditions, they rear more young. Therefore if 
there were no other factor involved, larger mice would be fitter. Since 
body size can, as we have seen, be readily increased by artificial 
selection, there must be some other factor that prevents its being 
increased by natural selection in the wild. The other factor in this case 
is probably not environmental, but another character negatively 
correlated with size, namely wildness. A change of body size under 
artificial selection is always accompanied by a correlated change of 
wildness. Large mice are phlegmatic and unreactive to disturbance, 
whereas small mice are alert and react energetically to disturbance 
(MacArthur, 1949; Falconer, 1953). Therefore under natural con- 
ditions larger mice would more readily fall prey to cats and owls than 
small mice, and the advantage of greater fertility would be offset by 
the disadvantage of being less well fitted to escape predators. The 
body size of wild mice, it may be suggested, represents the best 
compromise between these two correlated characters. If we could 
measure the relationship between size and fitness in wild mice we 
should find that those of intermediate size were fittest. With charac- 
ters of this sort also, the population mean represents the optimal 
value. But this value is optimal not because of this character itself but 
because of its genetic correlations with other characters. Large mice 
are selected against not because they are large but because, being 
large, they are inevitably also less wild. This example brings us to the 
point mentioned at the end of the last chapter: that we must expect to 
find negative genetic correlations between characters under simultaneous
selection. In this case we find a negative genetic correlation
between large size and wildness, both of which may reasonably be
supposed to be favoured by natural selection. These two characters
are "components" of fitness in the same way that characters of economic
importance are components of total economic merit. What
natural selection "aims at" is to increase both characters indefinitely, 
but the physiological connexions between them, which we see as a 
negative genetic correlation, limit the increase that is possible with 
the existing genetic variability. 

3. Major components of fitness. If we could measure fitness 
itself — which is technically very difficult — we should obviously find 
no "optimal" value; the individuals most favoured by natural 
selection would not be those nearest to the population mean, but the 
most extreme. In spite of the selection toward higher values the 
mean fails to change under natural selection because there is no 
additive variance of fitness. If we measure as a metric character 
something that is a major component of fitness, in the sense that it 
accounts for a large part of the variation of fitness, we should probably 
find the same sort of relationship. Fitness would increase as the 
value of the character increased. At the very highest values, however, 
fitness would probably decline again slightly. Egg-laying in Droso- 
phila might well be such a character, even if measured only over a 
few days, since the daily egg production is highly correlated with the 
total production (Gowen, 1952). We should almost certainly find 
that the fittest individuals were not those that laid an intermediate 
number of eggs, but those that laid almost the most. The most ex- 
treme individuals would probably be slightly less fit because of some 
environmental limitation or some correlated character, perhaps 
longevity. There must be many characters whose relationships with 
fitness fall between this and the previous type, characters with an 
optimal value above the population mean but yet below that of the 
most extreme individuals. 

The foregoing discussion will be enough to explain the nature of 
the problem of the relationship between a metric character and fitness 
and to indicate the sort of solution that may be sought. Let us turn 
now to the connexion between the relationship with fitness and the 
nature of the genetic variation of a metric character. When we first 
discussed the heritability as a property of a character in Chapter 10, 
we noted a tendency toward lower heritabilities among characters 
more closely connected with fitness. But the precise meaning of a 
"close connexion" with fitness was not explained. It may now be 



suggested that the meaning of a close connexion with fitness may 
perhaps be seen in the functional relationships discussed above. 
Characters with the closest connexion are of the third type where the 
population mean is not at an optimal value; characters with a less 
close connexion are nearer to the second type; while characters with 
the least connexion are the neutral or nearly neutral characters. On 
the whole it does seem that characters with high heritabilities are to 
be found among the first type and characters with low heritabilities 
among the third. Differences of heritability are, however, not really 
relevant here. It is the genetic variance with which we are concerned; 
and the differences in the proportion of the genotypic variance that 
is additive, that we want to account for. But so little is known about 
how the genotypic variance is partitioned into additive and non- 
additive components that we can scarcely begin to tackle the prob- 
lem. Four characters of Drosophila, however, seem to fit the picture 
fairly well (see Table 8.2). For bristle number, which we have taken 
as a neutral character of the first type, 85 per cent of the genotypic 
variance is additive. Thorax length, which might perhaps be of the 
second type, has about the same proportion. For ovary size, however, 
only 43 per cent of the genotypic variance is additive, and this 
character might well be between the second and third types. For 
egg laying, which we have taken to be of the third type, the propor- 
tion is 29 per cent. These comparisons, of course, cannot be given 
much weight because in fact we know almost nothing of the func- 
tional relations of the characters with fitness. But they do suggest 
that the solution of the problem of why characters differ in their 
genetic properties may lie along these lines. The reaction of a charac- 
ter to inbreeding seems also to be connected with the proportion of 
non-additive genetic variance, those with most non-additive variance 
being those that suffer the greatest inbreeding depression. Some, 
perhaps most, of the non-additive variance must be attributed to 
dominance. Reasons for expecting the effects of genes on characters 
closely connected with fitness to show dominance, while the effects on 
characters not closely connected with fitness do not, have been put 
forward by A. Robertson (1955b); but it would take too much space 
here to summarise the argument. There we must leave the problem 
of the nature of the genetic variance and pass on to the second aspect 
of the operation of natural selection on metric characters. 


Maintenance of Genetic Variation 

The second aspect of the operation of natural selection on metric 
characters — its effects on the individual loci — is part of a wider 
problem, which concerns the mechanisms by which genetic variation 
is maintained. Almost every metric character, of the many that have 
been studied both in natural populations and in domesticated animals 
and plants, exhibits genetic variation. What are the reasons for the 
existence of this genetic variation? The coexistence in a population 
of different alleles at a locus is governed by the three processes of 
mutation, random drift, and selection. Allelic differences originate 
by mutation and are extinguished by random drift, since no natural 
population is infinite in size. Natural selection may tend to eliminate 
the differences by favouring one allele over all others at a locus; or it 
may tend to perpetuate the differences by favouring heterozygotes. 
Let us discuss the role of natural selection first and the roles of muta- 
tion and random drift later. 

Effects of selection on individual loci. The way in which 
selection operates on any locus depends on the effects that the differ- 
ent alleles have on fitness itself, and not simply on their effects on one 
particular metric character. Therefore the functional relations be- 
tween characters and fitness, which were discussed above, can indi- 
cate the action of selection only on those loci which affect fitness 
through the character in question and not through any pleiotropic 
effects on other characters. Let us consider the three types of 
character in turn. 

i. Neutral characters. If there are genes whose only effects are 
on a neutral character, then selection plays no part in the existence of 
allelic differences at these loci. The gene frequencies at these loci 
must be controlled solely by mutation and random drift. 

2. Characters with intermediate optima. The consequences of 
selection favouring individuals of intermediate value have been 
examined from different aspects by Wright (1935a, b), Haldane 
(1954a), and by A. Robertson (1956), who reaches the following 
conclusions. If the intermediate optimum is the result of the func- 
tional relations of the character to fitness, and the optimum is deter- 
mined by the character itself or by the environment, then selection 
will tend toward fixation at all the loci whose only influence on fitness 
is through the character in question. This would apply to characters 

of type 2 (i) and (ii) described above and exemplified by the density 
of mammalian fur and by clutch size in birds. Selection will thus 
tend to eliminate rather than to conserve variability arising from loci 
which affect fitness only through such characters. The rate at which 
the gene frequencies are expected to change toward fixation is very 
slow, and so the rate at which variation would be eliminated is also 
very slow; but on an evolutionary time-scale it would not be negligible. 
Characters of type 2 (iii), where an intermediate optimum is deter- 
mined by a correlated character, have not yet been investigated in 
this connexion, and the mode of operation of selection on loci that 
affect them is not known. 

3. Major components of fitness. The essential feature of a major 
component of fitness is that the population mean is not at the opti- 
mum. But we cannot deduce, from this fact alone, how selection 
operates on the individual loci. If the genes that affect these charac- 
ters are at intermediate frequencies, it seems most probable that they 
are held there by selection favouring heterozygotes, because it seems 
hardly possible that the coefficients of selection are small enough to 
allow mutation alone to maintain intermediate frequencies. We do not 
know, however, whether these genes are at intermediate frequencies. 
It seems quite possible that a considerable portion of the genetic 
variation of these characters is due to genes at very low frequencies, 
where they are maintained by the balance between mutation and 
selection against the recessive homozygotes. Much evidence, how- 
ever, has been presented by Lerner (1954) in support of the view that 
heterozygotes in general are superior in fitness; and Haldane (1954b) 
has pointed out that a general superiority of heterozygotes is a very 
reasonable expectation from biochemical considerations of gene 
action. Though the matter is not yet settled, the weight of evidence 
at present seems to point to superior fitness of heterozygotes, and 
consequently to natural selection favouring heterozygotes at most of 
the loci that affect fitness through its major components. 
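
The two situations contrasted in this paragraph, intermediate frequencies held by selection favouring heterozygotes and low frequencies held by mutation balanced against selection, correspond to two familiar single-locus equilibria. The following Python sketch is offered only as an illustration of that contrast. The fitness scheme (A1A1: 1 - s1, A1A2: 1, A2A2: 1 - s2), the function names, and the numerical values are assumptions introduced here and are not taken from the experiments cited.

```python
import math


def overdominance_equilibrium(s1: float, s2: float) -> float:
    """Equilibrium frequency q of allele A2 under heterozygote advantage.

    Assumed fitnesses: A1A1 = 1 - s1, A1A2 = 1, A2A2 = 1 - s2.
    """
    if s1 <= 0 or s2 <= 0:
        raise ValueError("heterozygote advantage requires s1 > 0 and s2 > 0")
    return s1 / (s1 + s2)


def recessive_mutation_balance(u: float, s: float) -> float:
    """Approximate equilibrium frequency of a recessive deleterious allele.

    u is the mutation rate to the deleterious allele and s the coefficient
    of selection against the recessive homozygote.
    """
    if u < 0 or s <= 0:
        raise ValueError("need u >= 0 and s > 0")
    return math.sqrt(u / s)


def next_q(q: float, s1: float, s2: float) -> float:
    """Frequency of A2 after one generation of selection, random mating."""
    p = 1.0 - q
    mean_fitness = 1.0 - s1 * p * p - s2 * q * q
    # A2 genes come from heterozygotes (fitness 1) and A2A2 (fitness 1 - s2).
    return q * (p + q * (1.0 - s2)) / mean_fitness


def iterate_selection(q: float, s1: float, s2: float, generations: int) -> float:
    """Apply next_q repeatedly and return the final frequency of A2."""
    for _ in range(generations):
        q = next_q(q, s1, s2)
    return q
```

On these assumptions, selection coefficients of even a few per cent against both homozygotes hold the alleles at intermediate frequencies, whereas a recessive allele with u = 0.00001 and s = 0.1 is held at a frequency of only about 0.01.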

There are three other ways in which selection may influence 
genetic variability, to be discussed before we leave the subject. They 
are all subsidiary to the main effects on gene frequencies which we 
have been discussing; they may modify these main effects, but they 
do not in themselves provide a sufficient description of the operation 
of natural selection. 

Variable selection. If characters have optimal values these 
optima are likely to vary from time to time and from season to season 


according to the environmental conditions. The selection pressures 
on the individual genes are therefore likely to change from generation 
to generation. The consequence of variable selection coefficients has 
been shown (Kimura, 1954) to be a tendency toward fixation — or 
more strictly, near-fixation — the favoured allele being the one that 
gives the highest average fitness. In this aspect selection would 
therefore tend to eliminate variability. The optimal values are 
likely to vary also from place to place within each generation, especi- 
ally if different genotypes choose different environments in which to 
live, as Waddington (1957) suggests. This form of variable selection 
has been shown to be capable under certain conditions of maintaining 
stable polymorphism, as was mentioned in Chapter 2. Its effect on 
the variation of metric characters, however, has not been examined. 
It does not seem likely to be very great. 

Balanced linkage. Mather's theory of "polygenic balance" is 
based on the idea of selection favouring intermediate values of metric 
characters and the effect this is likely to have on linkage (see for 
example, Mather, 1949, 1953b). In considering linkage between the 
loci affecting a metric character we have to take account of the linkage 
phase. We may say that two genes on the same chromosome are in 
coupling if they affect the character in the same direction, and in 
repulsion if they affect it in opposite directions. The two phases will 
be represented in equal frequencies in a random-breeding population 
subject to no selection, as was shown in Chapter 1. Now, chromo- 
somes carrying genes in coupling will contribute more to the variation 
than chromosomes carrying genes in repulsion. And individuals with 
intermediate values will tend on the whole to carry repulsion chromo- 
somes rather than coupling chromosomes. Therefore, if intermediates 
are favoured for functional reasons, selection will favour repulsion 
chromosomes and thus tend to build up "balanced" combinations of 
genes: that is, combinations in predominantly repulsion linkage, 
which contribute the minimal amount of variance. In this way, 
according to Mather, "potential" genetic variability is stored in latent 
form, and a compromise is reached between the conflicting needs of 
uniformity in adaptation to present circumstances and flexibility in 
adaptation to changing circumstances. 
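
The statement that coupling chromosomes contribute more to the variation than repulsion chromosomes can be checked with a small calculation. The sketch below is a deliberately simplified illustration, not Mather's own formulation: it assumes two linked loci, each with one allele of effect +a and one of effect -a at a frequency of one half, and it takes the fraction of chromosomes in the coupling phase as a free parameter. The function name and parameters are hypothetical.

```python
def gamete_variance(coupling_fraction: float, a: float = 1.0) -> float:
    """Variance of chromosome (gamete) values for two linked loci.

    Assumptions: each locus has alleles of effect +a and -a at frequency 1/2;
    coupling chromosomes (+ + and - -) have values +2a and -2a, and
    repulsion chromosomes (+ - and - +) have value 0.
    """
    if not 0.0 <= coupling_fraction <= 1.0:
        raise ValueError("coupling_fraction must lie between 0 and 1")
    repulsion_fraction = 1.0 - coupling_fraction
    # (value, frequency) for the four chromosome types.
    classes = [
        (2.0 * a, coupling_fraction / 2.0),
        (-2.0 * a, coupling_fraction / 2.0),
        (0.0, repulsion_fraction / 2.0),
        (0.0, repulsion_fraction / 2.0),
    ]
    mean = sum(value * freq for value, freq in classes)
    return sum(freq * (value - mean) ** 2 for value, freq in classes)
```

With the two phases in equal frequencies, the random arrangement, the variance in this model is 2a²; with all chromosomes in repulsion it falls to zero, and with all in coupling it rises to 4a².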

If, however, this supposed tendency of selection to build up 
balanced combinations is to have any significant effect on genetic 
variability it is necessary that the selection should be strong enough 
to maintain the balanced combinations in the face of recombination 


which must tend continuously to reduce them to a random arrange- 
ment. The selective forces required have been examined by Wright 
(1952b). It is clear, without going into the details, that coefficients of 
selection of the same order of magnitude as the recombination fre- 
quencies would be required. The balancing of linkage by natural 
selection therefore seems from Wright's reasoning to be relevant only 
to very short segments of chromosome. Loci with more than about 1 
per cent recombination between them would not be expected to 
depart significantly from a random arrangement, unless they carried 
major genes with large effects on the character. Furthermore, if we 
consider a number of loci on the same chromosome, it is not clear 
how much difference of variance would be expected between fully 
balanced and fully random arrangements; it might well be very little. 
Experimental evidence on the matter is scanty. In two experiments, 
one with mice and the other with Drosophila, where artificial selection 
was applied for and against intermediates, no changes of variance 
were detected (Falconer and Robertson, 1956; Falconer, 1957b). 
Intensification of the selection against extremes therefore does not 
seem to have any effect on the variance within the time-span of a 
laboratory experiment. 
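
The force against which selection must work, the continuous reduction of balanced combinations by recombination, can be given a rough numerical form. The sketch below is not Wright's analysis. It uses only the standard two-locus result that, under random mating and with no selection, any excess of one linkage phase over the random arrangement is reduced each generation by a fraction equal to the recombination frequency. The function names are hypothetical.

```python
import math


def remaining_excess(initial_excess: float, recombination: float,
                     generations: int) -> float:
    """Excess of one linkage phase left after a number of generations.

    Assumes random mating and no selection, so that the excess is
    multiplied by (1 - recombination) in every generation.
    """
    return initial_excess * (1.0 - recombination) ** generations


def generations_to_halve(recombination: float) -> float:
    """Number of generations needed for the excess to fall to one half."""
    if not 0.0 < recombination < 1.0:
        raise ValueError("recombination must lie strictly between 0 and 1")
    return math.log(0.5) / math.log(1.0 - recombination)
```

With 1 per cent recombination the excess is halved in about 69 generations, whereas with free recombination (50 per cent) it is halved in a single generation. This gives some idea of why only closely linked loci could be held in balanced combinations by weak selection.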

Canalisation. Waddington's theory of "canalisation" is con- 
cerned with the developmental pathways through which the pheno- 
typic values come to their expression (see Waddington, 1957). If 
intermediates are favoured because of their values of the metric 
character in question, then deviation from the optimal value is dis- 
advantageous. Selection will therefore operate against the causes of 
deviation, and will tend to produce a greater stability so that develop- 
ment is canalised along the path that leads to the optimal phenotypic 
expression. The role ascribed to selection is its discrimination against 
alleles that increase variability. These may be at loci that affect the 
character in question or at other loci. Variation both of environ- 
mental and of genetic origin may be reduced in this way. The 
genetic variation is reduced not by eliminating the segregation, but by 
rendering the organism less sensitive to the effects of the segregation. 
A change in the proportion of genetic to environmental variation is 
therefore not necessarily to be expected. As a consequence of canalisa- 
tion we should expect to find some characters less variable than 
others, the less variable being those for which deviation from the 
optimum has the more serious effect on fitness. This expected con- 
sequence of canalisation, however, cannot easily be tested experi- 


mentally, because, as Waddington (1957) points out, it is difficult to 
find a logical basis for comparing the variability of different characters. 

Origin of variation by mutation. Before the reasons for the 
existence of genetic variability can be fully understood it will be 
necessary to know what part mutation plays in restoring what is lost 
by random drift or by selection. If there were no selection of any 
kind then the amount of genetic variation would come to equilibrium 
when its rate of origin was equal to its rate of extinction by random 
drift. The rate of extinction presents no very serious problem be- 
cause we need know the population size only approximately. If, 
therefore, we knew the rate of origin by mutation we could decide 
whether a significant amount of the existing variation can be ascribed 
to mutation. Very little, however, is known about the rate of origin 
by mutation. The only evidence comes from two studies of Drosophila 
by Clayton and Robertson (1955) and Paxman (1957), which yielded 
very similar results. The following discussion is based on the experi- 
ment of Clayton and Robertson. Selection for abdominal bristle 
number was applied to an inbred line derived from the same base 
population on which the other studies of this character were made. 
From the rate of response to selection it was concluded that the aver- 
age amount of variation arising by spontaneous mutation in one 
generation amounted to one thousandth part of the genetic variation 
present in the base population. In other words it would take about 
1000 generations for mutation to restore the genetic variation to its 
original level. (We may note in passing that this proves mutation to 
have a negligible influence on the response of non-inbred populations 
to artificial selection, apart from the rare occurrence of mutants with 
major effects.) Now consider the loss of variance due to random drift 
in a population of effective size N_e, subject to no selection. If all the 
genetic variance is additive, as it very nearly is in the case of bristle 
number, then the rate of loss per generation is equal to the rate of 
inbreeding, which is 1/(2N_e). (This follows from the reasoning given 
in Chapter 15, where the variance within a line was shown to be 
(1 - F) times the original variance.) Therefore the new variation 
arising by mutation at the rate found in this experiment would be lost 
at the same rate, if the rate of inbreeding were 1/1,000: that is, in a 
population of effective size 500. The base population was roughly 
ten times this size and therefore the expected rate of extinction by 
random drift is less than the observed rate of origin by mutation. In 
other words, mutation alone seems to be capable of accounting for 





more variation of bristle number than was actually present in the 
base population. Therefore selection favouring heterozygotes does 
not seem to have been an important cause of the genetic variability of 
bristle number. This suggests that little of the variation of bristle 
number is due to the pleiotropic effects of genes that affect the major 
components of fitness. It suggests, in other words, that much of the 
variation of bristle number is due to genes that are not far from being 
neutral with respect to fitness. This conclusion, though only tenta- 
tive, is in line with the fact, mentioned earlier, that bristle number 
shows little tendency to revert to the original mean value when 
artificial selection is relaxed. The conclusions to which the results of 
this experiment point cannot yet be extended to other characters. 
Characters more closely connected with fitness, when they have been 
studied from this point of view, may present a very different picture. 
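
The arithmetic of the balance between mutation and random drift described in this paragraph can be set out as a short calculation. The following Python sketch is an illustration only and forms no part of Clayton and Robertson's analysis. It assumes purely additive, neutral variation, so that in each generation a fraction 1/(2N_e) of the variance is lost by drift and a constant amount is added by mutation. Variance is expressed in units of the genetic variance of the base population, so that the mutational input is 0.001 per generation. The function names and the recursion itself are assumptions made for the purpose.

```python
def drift_loss_rate(effective_size: float) -> float:
    """Fraction of additive variance lost per generation by random drift."""
    return 1.0 / (2.0 * effective_size)


def equilibrium_variance(effective_size: float, mutational_input: float) -> float:
    """Variance at which loss by drift equals gain by mutation.

    Solves V * (1 / (2 * N_e)) = V_m for V, under the assumed neutral,
    additive model.
    """
    return 2.0 * effective_size * mutational_input


def variance_after(initial_variance: float, effective_size: float,
                   mutational_input: float, generations: int) -> float:
    """Iterate V(t+1) = V(t) * (1 - 1/(2 N_e)) + V_m for a number of generations."""
    variance = initial_variance
    for _ in range(generations):
        variance = variance * (1.0 - drift_loss_rate(effective_size)) + mutational_input
    return variance
```

On these assumptions a population of effective size 500 would just maintain a variance equal to that of the base population, while one of effective size 5,000 would tend toward an equilibrium ten times as great. This is the sense in which mutation alone seems capable of accounting for more variation than was actually present.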

Evolutionary significance of variability. There can be little 
doubt that the existence of genetic variation is advantageous to the 
evolutionary survival of a species, the advantage it confers being the 
ability to evolve rapidly and so to meet the needs of a changing 
environment, both through the course of time and in the colonisation 
of new localities. Sexual reproduction and outbreeding are necessary 
conditions for the continued existence of genetic variation and it is 
noteworthy that the naturally inbreeding species among the higher 
plants are of comparatively recent origin. This suggests that the 
possession of genetic variability is necessary for the continued exist- 
ence of a species over a long period of time; or in other words, that the 
prevalence of genetic variability among existing species is because 
those without it have not survived. The inbreeding plants, however, 
as we see them at present, compete successfully with the outbreeding 
species, and this proves that the possession of genetic variability does 
not confer much immediate advantage. The evolutionary significance 
of genetic variability, however, throws no light on the mechanisms 
that maintain it. It is these mechanisms, which have been discussed 
in this chapter, that are the concern of quantitative genetics. 

The Genes concerned with Quantitative Variation 

The genetic variation of metric characters appears from the re- 
sults of experimental selection to be the product of segregation at 
some hundreds of loci, or more probably some thousands if the 


variation of all characters is included. So natural populations prob- 
ably carry a variety of alleles at a considerable proportion of loci, even 
perhaps at virtually every locus. It seems unreasonable, therefore, to 
think of genes having the control of a metric character as their 
specific function: we cannot reasonably suppose that there are genes 
whose only functions are the adjustment of, say, body size to an 
optimal value. How, then, are we to think of the genes with which we 
are concerned in quantitative genetics? Our knowledge of these 
genes may be briefly summarised as follows. 

The distinction between "major" and "minor" genes marks the 
difference between those which we can study individually, and whose 
properties are therefore fairly easily discovered, and those which we 
cannot study individually and whose properties can only be deduced 
by indirect means. Both are concerned with quantitative variation. 
Among the major genes two sorts may be distinguished. There are 
genes with more or less severely deleterious effects on fitness, and 
these include nearly all the "mutants" of Mendelian genetics, as well 
as lethals. Each may have pleiotropic effects on a variety of metric 
characters. They are recessive, or nearly so, in their effects on fitness, 
but not necessarily also in their effects on metric characters. They are 
kept in equilibrium at low frequencies by natural selection balanced 
against mutation. Being at low frequencies they contribute, individu- 
ally, little to the genetic variance of any character; their total contri- 
bution, however, is unknown. They are probably an important cause 
of inbreeding depression. Major genes of the second sort are those 
responsible for the antigenic differences. The alleles at these loci are 
at intermediate frequencies where they are probably maintained by 
selection favouring heterozygotes. Their effects on fitness, however, 
are probably fairly small — certainly small enough for all to be 
regarded as "wild-type" alleles. Their effects on metric characters 
are almost unexplored, and their importance as sources of variation is 
consequently unknown. They presumably contribute to inbreeding 
depression if heterozygotes are superior in fitness, but again their 
relative importance in this respect is not known with certainty. 
About the minor genes little is known. They do not necessarily 
occupy loci different from those occupied by major genes. It seems 
more likely, on the contrary, that they are isoalleles, capable of 
mutating to major deleterious genes. They are performing their 
primary functions perfectly adequately and may differ only in the rate 
at which their primary product is synthesised. The variation of 

metric characters which they produce may be quite incidental to their 
main biochemical functions. There is no reason at present to think 
that these minor genes differ in any essential way from the genes that 
determine antigenic differences. The fact that their effects are not 
individually recognisable, whereas the antigenic differences are, may 
be due only to the inadequacy of the techniques available for detect- 
ing biochemical differences among essentially normal individuals. 

The problems that have been raised but left unanswered in this 
chapter will be sufficient indication of the directions which the future 
development of quantitative genetics may take. It does not seem to 
the present writer that much progress toward their solution is likely 
to be made by deductive reasoning, because most of the outstanding 
problems are not essentially theoretical in nature: the theoretical 
structure of the subject is now fairly clear, at least in its main out- 
lines. Some of the outstanding problems are beyond the reach of the 
experimental techniques now at our command. New techniques, both 
more penetrating and more discriminating, will therefore be needed. 
Other problems arise from the paucity of experimental data and the 
consequent difficulty of deciding what phenomena are general and 
what are due to special circumstances. These problems will be solved 
not so much by deliberately designed experiments, but rather from 
the accumulated experience of experiments extended to a wider 
variety of characters and of organisms. 



This list gives the meanings of most of the symbols used in the book. 
Many of the symbols listed are used also with other meanings in certain 
places, but these meanings, as well as the symbols not listed, do not appear 
more than a page or two removed from their definition. The more im- 
portant differences from current usage are indicated where the equivalent 
symbols used by Lerner (1950) — denoted by (L) — and by Mather (1949) 
— denoted by (M) — are given. 

A_1, A_2   Allelomorphic genes. 
A          Breeding value. = G (L). 
a          Genotypic value of the homozygote A_1A_1, as deviation from the 
           mid-homozygote value. = d (M). 
α          Average effect of a gene-substitution. 
α_1, α_2   Average effects of the alleles A_1 and A_2 respectively. 
b          Regression coefficient; e.g. b_OP = regression of offspring on parent. 
CR         Correlated response to selection. 
D          Dominance deviation. 
d          Genotypic value of the heterozygote A_1A_2, as deviation from the 
           mid-homozygote value. = h (M). 
Δ          Change of (a quantity), as Δq = change of gene frequency, ΔF = rate of 
           inbreeding. 
E          Environmental deviation. 
E_c        Common environment; i.e. environmental deviation of family mean 
           from population mean. = C (L). 
E_w        Within-family environment; i.e. environmental deviation of indi- 
           vidual from family mean. = E' (L). 
F          Coefficient of inbreeding. 
F_1        First generation of cross between lines or populations. 
F_2        Second generation of cross, by random mating among F_1. 
FS         Full sibs. 
f          Coancestry; i.e. inbreeding coefficient of the progeny of the indi- 
           viduals concerned. 
f          (Chap. 13): Subscript referring to selection between families. 
G          Genotypic value. = Ge (L). 
H          Frequency of heterozygous genotype (A_1A_2). 
H          Amount of heterosis; i.e. deviation of cross mean from mid-parent 
           value. 
HS         Half sibs. 
I          Interaction deviation, due to epistasis. 
I          (Chap. 13 & 19): Index for selection. 
i          Intensity of selection; i.e. selection differential in units of the 
           phenotypic standard deviation. = ī (L). 
M          Population mean. 
m          Immigration rate. 
N          Population size; i.e. number of breeding individuals in a population 
           or line. 
N          (Chap. 10 & 13): Number of families. 
N_e        Effective population size. 
n          Number in various contexts. In Chapters 10 and 13, specifically 
           number of offspring per family. 
P          Parent. P̄ = Mid-parent. 
P          Frequency of homozygous genotype (A_1A_1). 
P          Panmictic index, (= 1 - F). 
P          Phenotypic value. 
p          Gene frequency (of A_1). = u (M). 
p          (Chap. 11, part): proportion selected as parents from a normally 
           distributed population. = v (L). 
Q          Frequency of homozygous genotype (A_2A_2). 
q          Gene frequency (of A_2). = v (M). 
R          Response to selection; specifically to individual selection. = ΔG 
r          (Chap. 8): Repeatability; i.e. correlation between repeated measure- 
           ments of the same individual. 
r          (Chap. 13): Coefficient of relationship; i.e. correlation of breeding 
           values between related individuals. = r_G (L). 
r          (Chap. 19): Correlations between two characters: 
             r_A  additive genetic correlation. = r_G (L). 
             r_E  environmental correlation. 
             r_P  phenotypic correlation. = r (L). 
S          Selection differential in actual units of measurement. = i (L). 
s          Coefficient of selection against a particular genotype. 
s          (Chap. 13): subscript referring to sib-selection. 


Σ          Summation of the quantity following the sign. 
σ          Standard deviation (σ² = variance) of the quantity indicated by 
           subscript. Components of variance, from an analysis of vari- 
           ance, are indicated by subscripts as follows: 
             σ²_B  between groups, or families. 
             σ²_D  between dams, within sires. 
             σ²_S  between sires. 
             σ²_T  total; i.e. the sum of all components. 
             σ²_W  within groups, or families. 
t          Time in number of generations. As a subscript it means "at 
           generation t". 
t          Phenotypic correlation between members of families. 
u          Mutation rate (from A_1 to A_2). 
V          Variance (causal component) of the value or deviation indicated by 
           subscript. The most important are: 
             V_P  Phenotypic variance. = σ²_P (L), = V (M). 
             V_G  Genotypic variance. = σ²_Ge (L). 
             V_A  Additive genetic variance. = σ²_G (L), = ½D (M). 
             V_D  Dominance variance. = ¼H (M). 
             V_I  Interaction variance. = I (M). 
             V_E  Environmental variance. = σ²_E (L), = E (M). 
v          Mutation rate (from A_2 to A_1). 
w          (Chap. 13): subscript referring to selection within families. 
X          (Chap. 19): One of two correlated characters. 
Y          (Chap. 19): The other of two correlated characters. 
y          (Chap. 14): Difference of gene frequency between two lines. 
z          (Chap. 11): Height of the ordinate of a normal distribution, in 
           units of the standard deviation. 


The numbers in square brackets refer to pages in 
the text where the work is mentioned 

Allison, A. C. 1954. Notes on sickle-cell polymorphism. Ann. hum. Genet. 
        [Lond.], 19: 39-57. [45] 
    1955. Aspects of polymorphism in man. Cold Spr. Harb. Symp. quant. 
        Biol., 20: 239-252. [44] 

Bartlett, M. S., and Haldane, J. B. S. 1935. The theory of inbreeding 
        with forced heterozygosis. J. Genet., 31: 327-340. [97] 

Bell, A. E., Moore, C. H., and Warren, D. C. 1955. The evaluation of 
        new methods for the improvement of quantitative characteristics. 
        Cold Spr. Harb. Symp. quant. Biol., 20: 197-211. [286] 

Biggers, J. D., and Claringbold, P. J. 1954. Why use inbred lines? 
        Nature [Lond.], 174: 596. [275] 

Briles, W. E., Allen, C. P., and Millen, T. W. 1957. The B blood group 
        system of chickens. I. Heterozygosity in closed populations. 
        Genetics, 42: 631-648. [290] 

Briquet, R., and Lush, J. L. 1947. Heritability of amount of spotting in 
        Holstein-Friesian cattle. J. Hered., 38: 99-105. [167] 

Brumby, P. J. 1958. Monozygotic twins and dairy cattle improvement. 
        Anim. Breed. Abstr., 26: 1-12. [183] 

Brumby, P. J., and Hancock, J. 1956. A preliminary report of growth and 
        milk production in identical- and fraternal-twin dairy cattle. N.Z. 
        J. Sci. Tech., Agric., 38: 184-193. [185] 

Buri, P. 1956. Gene frequency in small populations of mutant Drosophila. 
        Evolution, 10: 367-402. [52, 53, 56, 59, 74] 

Butler, L. 1952. A study of size inheritance in the house mouse. II. Analysis 
        of five preliminary crosses. Canad. J. Zool., 30: 154-171. [216] 

Cain, A. J., and Sheppard, P. M. 1954a. Natural selection in Cepaea. 
        Genetics, 39: 89-116. [43, 83] 
    1954b. The theory of adaptive polymorphism. Amer. Nat., 88: 321- 
        326. [44] 

Carpenter, J. R., Grüneberg, H., and Russell, E. S. 1957. Genetical 
        differentiation involving morphological characters in an inbred 
        strain of mice. II. The American branches of the C57BL and 
        C57BR strains. J. Morph., 100: 377-388. [274] 

Castle, W. E., and Wright, S. 1916. Studies of inheritance in guinea-pigs 
        and rats. Publ. Carneg. Instn. Wash., No. 241: iv + 192 pp. [168] 

Ceppellini, R., Siniscalco, M., and Smith, C. A. B. 1955. The estimation 
        of gene frequencies in a random-mating population. Ann. hum. 
        Genet. [Lond.], 20: 97-115. [16] 


Chai, C. K. 1957. Developmental homeostasis of body growth in mice. 

Amer. Nat., 91: 49-55. [271] 

Chapman, A. B. 1946. Genetic and nongenetic sources of variation in the 

weight response of the immature rat ovary to a gonadotropic 

hormone. Genetics, 31: 494-507. [168] 

Clayton, G. A., Knight, G. R., Morris, J. A., and Robertson, A. 1957. 

An experimental check on quantitative genetical theory. III. 

Correlated responses. J. Genet., 55: 171-180. [316, 320] 

Clayton, G. A., Morris, J. A., and Robertson, A. 1957. An experimental 

check on quantitative genetical theory. I. Short-term responses to 

selection. J. Genet., 55: 131-151. 

[140, 168, 169, 177, 190, 195, 209, 210, 221, 245, 333] 

Clayton, G. A., and Robertson, A. 1955. Mutation and quantitative 

variation. Amer. Nat., 89: 1 51-158. [342] 

1957. An experimental check on quantitative genetical theory. II. 

The long-term effects of selection. J. Genet., 55: 152-170. 

[216, 223] 
Cockerham, C. C. 1 954. An extension of the concept of partitioning 
hereditary variance for analysis of covariances among relatives 
when epistasis is present. Genetics, 39: 859-882. [138] 

1956*2. Effects of linkage on the covariances between relatives. Genetics, 
41: 138-141. [159] 

19566. Analysis of quantitative gene action. Genetics in Plant Breeding. 
Brookhaven Symp. Biol., No. 9: 53-68. [140] 

Comstock, R. E., and Robinson, H. F. 1952. Estimation of average domi- 
nance of genes. Heterosis, ed. J. W. Gowen. Ames: Iowa State 
College Press. Pp. 494-516. [290] 

Comstock, R. E., Robinson, H. F., and Harvey, P. H. 1949. A breeding 
procedure designed to make maximum use of both general and 
specific combining ability. J. Amer. Soc. Agron., 41: 360-367. 


Crow, J. F. 1948. Alternative hypotheses of hybrid vigor. Genetics, 33: 

477-487. [278, 290] 

1952. Dominance and overdominance. Heterosis, ed. J. W. Gowen. 

Ames: Iowa State College Press. Pp. 282-297. [278, 290] 

1954. Breeding structure of populations. II. Effective population 

number. Statistics and Mathematics in Biology, ed. O. Kempthorne, 

T. A. Bancroft, J. W. Gowen, and J. L. Lush. Ames: Iowa State 

College Press. Pp. 543-556. [53, 60, 61, 64, 71] 

1956. The estimation of spontaneous and radiation-induced mutation 
rates in man. Eugen. Quart., 3: 201-208. [38] 

1957. Possible consequences of an increased mutation rate. Eugen. 
Quart., 4: 67-80. [39] 

Crow, J. F., and Morton, N. E. 1955. Measurement of gene frequency 
drift in small populations. Evolution, 9: 202-214. [73, 74] 

Cruden, D. 1949. The computation of inbreeding coefficients in closed 
populations. J. Hered., 40: 248-251. [88, 89] 




Dempster, E. R., and Lerner, I. M. 1950. Heritability of threshold charac- 
ters. Genetics, 35: 212-236. [303] 
Deol, M. S., Gruneberg, H., Searle, A. G., and Truslove, G. M. 1957. 
Genetical differentiation involving morphological characters in an 
inbred strain of mice. I. A British branch of the C57BL strain. 
J. Morph., 100: 345-376. [274] 
Dickerson, G. E. 1952. Inbred lines for heterosis tests? Heterosis, ed. 
J. W. Gowen. Ames: Iowa State College Press. Pp. 330-351. 

1955. Genetic slippage in response to selection for multiple objectives. 
Cold Spr. Harb. Symp. quant. Biol., 20: 213-224. [329] 

1957. (Two abstracts.) Poult. Sci., 36: 1112-1113. [316] 

Dickerson, G. E., et al. 1954. Evaluation of selection in developing inbred lines 
of swine. Res. Bull. Mo. agric. Exp. Sta., No. 551: 60 pp. [249, 253] 
Dobzhansky, Th. 1950. Genetics of natural populations. XIX. Origin of 
heterosis through natural selection in populations of Drosophila 
pseudoobscura. Genetics, 35: 288-302. [262] 

1951a. Genetics and the Origin of Species. New York: Columbia University 
Press. 3rd edn. xi+364 pp. [44] 
1951b. Mendelian populations and their evolution. Genetics in the 20th 
Century, ed. L. C. Dunn. New York: Macmillan Co. Pp. 573-589. 

[44, 263] 

1952. Nature and origin of heterosis. Heterosis, ed. J. W. Gowen. 
Ames: Iowa State College Press. Pp. 218-223. [262] 

Dobzhansky, Th., and Pavlovsky, O. 1955. An extreme case of heterosis 

in a Central American population of Drosophila tropicalis. Proc. 

nat. Acad. Sci. U.S.A., 41: 289-295. [39] 

Donald, H. P., Deas, D. W., and Wilson, A. L. 1952. Genetical analysis 

of the incidence of dropsical calves in herds of Ayrshire cattle. 

Brit. vet. J., 108: 227-245. [13] 

Emik, L. O., and Terrill, C. E. 1949. Systematic procedures for calculating 

inbreeding coefficients. J. Hered., 40: 51-55. [88, 89] 

Falconer, D. S. 1952. The problem of environment and selection. Amer. 

Nat., 86: 293-298. [323, 324] 

1953. Selection for large and small size in mice. J. Genet., 51: 470-501. 

[96, 168, 199, 335] 

1954a. Asymmetrical responses in selection experiments. Symposium 
on Genetics of Population Structure, Istituto di Genetica, Universita 
di Pavia, Italy, August 20-23, 1953. Un. int. Sci. biol., No. 15: 
16-41. [31, 33, 203, 213, 297] 

1954b. Validity of the theory of genetic correlation. An experimental 
test with mice. J. Hered., 45: 42-44. [168, 316, 319] 

1954c. Selection for sex ratio in mice and Drosophila. Amer. Nat., 88: 
385-397. [321] 

1955. Patterns of response in selection experiments with mice. Cold 
Spr. Harb. Symp. quant. Biol., 20: 178-196. 

[168, 201, 214, 216, 220] 


1957a. Breeding methods — I. Genetic considerations. The UFAW 
Handbook on the Care and Management of Laboratory Animals, 2nd 
edn., edd. A. N. Worden and W. Lane-Petter. London: Univer- 
sities Federation for Animal Welfare. Pp. 85-107. [228] 
1957b. Selection for phenotypic intermediates in Drosophila. J. Genet., 
55: 551-561. [341] 

Falconer, D. S., and Latyszewski, M. 1952. The environment in relation 
to selection for size in mice. J. Genet., 51: 67-80. [324] 

Falconer, D. S., and Robertson, A. 1956. Selection for environmental 
variability of body size in mice. Z. indukt. Abstamm.-u. Vererblehre, 

87: 385-391. [341] 

Fisher, R. A. 1918. The correlation between relatives on the supposition 
of Mendelian inheritance. Trans. roy. Soc. Edinb., 52: 399-433. 

[2, 124] 

1930. The Genetical Theory of Natural Selection. Oxford University 

Press. xiv+272pp. [4] 

1941. Average excess and average effect of a gene substitution. Ann. 

Eugen. [Lond.], 11: 53-63. [124] 

1949. The Theory of Inbreeding. Edinburgh: Oliver & Boyd, viii + 120 

pp. [90, 97, 99, 100] 

Fisher, R. A., and Yates, F. 1943. Statistical Tables. Edinburgh: Oliver & 

Boyd. 2nd edn. viii + 98 pp. [194, 302] 

Ford, E. B. 1953. The genetics of polymorphism in the Lepidoptera. 

Advanc. Genet., 5: 43-87. [44] 

Fredeen, H. T., and Jonsson, P. 1957. Genic variance and covariance in 

Danish Landrace swine as evaluated under a system of individual 

feeding of progeny test groups. Z. Tierz. Zuchtbiol., 70: 348-363. 

[167, 174, 175, 316] 
Gilmour, D. G. 1958. Maintenance of segregation of blood group genes 
during inbreeding in chickens. (Abstr.) Heredity, 12: 141-142. 


Gowe, R. S., Robertson, A., and Latter, B. D. H. 1959. Environment and 

poultry breeding problems. 5. The design of poultry control 

strains. Poult. Sci., 38: 462-471. [72, 73] 

Gowen, J. W. 1952. Hybrid vigor in Drosophila. Heterosis, ed. J. W. 

Gowen. Ames: Iowa State College Press. Pp. 474-493. 

[282, 336] 

Green, E. L. 1951. The genetics of a difference in skeletal type between 

two inbred strains of mice (BalbC and C57blk). Genetics, 36: 391- 

409. [303] 

Green, E. L., and Russell, W. L. 1951. A difference in skeletal type be- 
tween reciprocal hybrids of two inbred strains of mice (C57BLK 
and C3H). Genetics, 36: 641-651. [305, 307] 

Gruneberg, H. 1952. Genetical studies on the skeleton of the mouse. IV. 
Quasi-continuous variations. J. Genet., 51: 95-114. [301] 

1954. Variation within inbred strains of mice. Nature [Lond.], 173: 674. 




Haldane, J. B. S. 1924-32. A mathematical theory of natural and artificial 
selection. Proc. Camb. phil. Soc., 23: 19-41; 158-163; 363-372; 
607-615; 838-844; 26: 220-230; 27: 131-142; 28: 244-248. [2] 

1932. The Causes of Evolution. London: Longmans, Green & Co., Ltd. 

vii + 235pp. [2,4] 

Haldane, J. B. S. 1936. The amount of heterozygosis to be expected in an 

approximately pure line. J. Genet., 32: 375-391. [100] 

1937. Some theoretical results of continued brother-sister mating. J. 
Genet., 34: 265-274. [90, 97] 

1939. The spread of harmful autosomal recessive genes in human 
populations. Ann. Eugen. [Lond.], 9: 232-237. [41] 

1946. The interaction of nature and nurture. Ann. Eugen. [Lond.], 13: 
197-205. [133] 

1949. The rate of mutation of human genes. Proc. 8th int. Congr. Genet. 
1948 [Stockh.]. Lund: Issued as a supplementary volume of 
Hereditas, 1949. Pp. 267-273. [38] 

1954a. The measurement of natural selection. Proc. 9th int. Congr. 
Genet. [Bellagio (Como)], 1953, Pt. I (Suppl. to Caryologia, 6): 480- 
487. [338] 

1954b. The Biochemistry of Genetics. London: George Allen & Unwin 
Ltd. 144 pp. [339] 

1955. The complete matrices for brother-sister and alternate 
parent-offspring mating involving one locus. J. Genet., 53: 315-324. 

[90, 97] 
Hancock, J. 1954. Monozygotic twins in cattle. Advanc. Genet., 6: 141- 

181. [183] 

Hardy, G. H. 1908. Mendelian proportions in a mixed population. Science, 

28: 49-50. [9] 

Hayes, H. K., Immer, F. R., and Smith, D. C. 1955. Methods of Plant 

Breeding. New York: McGraw-Hill Book Co., Inc. 2nd edn. 

xi+551 pp. [276] 

Hayman, B. I. 1955. The description and analysis of gene action and 
interaction. Cold Spr. Harb. Symp. quant. Biol., 20: 79-84. [140] 
1958. The theory and analysis of diallel crosses. II. Genetics, 43: 63-85. 

[140, 277] 
Hayman, B. I., and Mather, K. 1953. The progress of inbreeding when 

homozygotes are at a disadvantage. Heredity, 7: 165-183. [102] 
Hazel, L. N. 1943. The genetic basis for constructing selection indexes. 

Genetics, 28: 476-490. [324, 325] 

Hazel, L. N., and Lush, J. L. 1942. The efficiency of three methods of 

selection. J. Hered., 33: 393-399. [324] 

Hollingsworth, M. J., and Smith, J. M. 1955. The effects of inbreeding 

on rate of development and on fertility in Drosophila subobscura. 

J. Genet., 53: 295-314. [249, 252] 

Horner, T. W., Comstock, R. E., and Robinson, H. F. 1955. Non-allelic 

gene interactions and the interpretation of quantitative genetic data. 

Tech. Bull. N.C. agric. Exp. Sta., No. 118: v + 117 pp. [299] 


Hull, F. H. 1945. Recurrent selection for specific combining ability in 
corn. J. Amer. Soc. Agron., 37: 134-145. [286] 

Hunt, H. R., Hoppert, C. A., and Erwin, W. G. 1944. Inheritance of 
susceptibility to caries in albino rats (Mus norvegicus). J. dent. Res., 

23: 385-401. [297] 

Johansson, I. 1950. The heritability of milk and butterfat yield. Anim. 
Breed. Abstr., 18: 1-12. [144, 167, 316] 

Kempthorne, O. 1954. The correlation between relatives in a random 
mating population. Proc. roy. Soc., B, 143: 103-113. [138, 279] 
1955a. The theoretical values of correlations between relatives in 
random mating populations. Genetics, 40: 153-167. 

[138, 152, 158, 174] 
1955b. The correlations between relatives in random mating 
populations. Cold Spr. Harb. Symp. quant. Biol., 20: 60-75. [138, 158] 
1957. An Introduction to Genetic Statistics. New York: John Wiley & Sons, 
Inc.; London: Chapman & Hall, Ltd. xvii + 545 pp. [4, 264, 277] 
Kempthorne, O., and Tandon, O. B. 1953. The estimation of heritability 
by regression of offspring on parent. Biometrics, 9: 90-100. [171] 
Kerr, W. E., and Wright, S. 1954a. Experimental studies of the distribu- 
tion of gene frequencies in very small populations of Drosophila 
melanogaster: I. Forked. Evolution, 8: 172-177. [74] 

1954b. Experimental studies of the distribution of gene frequencies in 
very small populations of Drosophila melanogaster. III. Aristapedia 
and spineless. Evolution, 8: 293-302. [74] 

Kimura, M. 1954. Process leading to quasi-fixation of genes in natural 
populations due to random fluctuation of selection intensities. 
Genetics, 39: 280-295. [57, 340] 

1955. Solution of a process of random genetic drift with a continuous 
model. Proc. nat. Acad. Sci. U.S.A., 41: 144-150. [54, 55, 57] 
1956. Rules for testing stability of a selective polymorphism. Proc. nat. 
Acad. Sci. U.S.A., 42: 336-340. [42] 

King, J. W. B. 1950. Pygmy, a dwarfing gene in the house mouse. J. Hered., 
41: 249-252. [113] 

1955. Observations on the mutant "pygmy" in the house mouse. J. 
Genet., 53: 487-497. [113, 289, 298, 299] 

King, S. C., and Henderson, C. R. 1954a. Variance components analysis 
in heritability studies. Poult. Sci., 33: 147-154. [173] 

1954b. Heritability studies of egg production in the domestic fowl. 
Poult. Sci., 33: 155-169. [168] 

Kyle, W. H., and Chapman, A. B. 1953. Experimental check of the effec- 
tiveness of selection for a quantitative character. Genetics, 38: 421- 
443. [225] 

Lack, D. 1954. The Natural Regulation of Animal Numbers. Oxford: 
Clarendon Press. viii+343pp. [334] 

Lamotte, M. 1951. Recherches sur la structure génétique des populations 
naturelles de Cepaea nemoralis (L.). Bull. biol., Suppl. 35: 238 pp. 

[78, 83, 84] 



Lerner, I. M. 1950. Population Genetics and Animal Improvement. Cam- 
bridge University Press, xviii + 342 pp. [4, 236, 325, 346] 

1954. Genetic Homeostasis. Edinburgh: Oliver & Boyd, vii + 134 pp. 

[44, 202, 213, 263, 270, 271, 288, 331, 339] 

1958. The Genetic Basis of Selection. New York: John Wiley & Sons, 

Inc. xvi+298 pp. [202, 263] 

Lerner, I. M., and Cruden, D. 1951. The heritability of egg weight: the 

advantages of mass selection and of early measurements. Poult. Sci., 

30: 34-41. [168] 

Levene, H. 1953. Genetic equilibrium when more than one ecological niche 

is available. Amer. Nat., 87: 331-333. [43] 

Li, C. C. 1955a. Population Genetics. Chicago: University of Chicago Press; 

London: Cambridge University Press, xi +366 pp. 

[4, 15, 20, 22, 24] 

1955b. The stability of an equilibrium and the average fitness of a 

population. Amer. Nat., 89: 281-296. [43] 

Livesay, E. A. 1930. An experimental study of hybrid vigor or heterosis in 

rats. Genetics, 15: 17-54. [271] 

Lush, J. L. 1945. Animal Breeding Plans. Ames: Iowa State College Press. 

3rd edn. viii+443pp. [4] 

1947. Family merit and individual merit as bases for selection. Pt. I, 

Pt. II. Amer. Nat., 81: 241-261; 362-379. [236, 237] 

1950. Genetics and animal breeding. Genetics in the Twentieth Century, 

ed. L. C. Dunn. New York: Macmillan Co. Pp. 493-525. [200] 

Lush, J. L., and Molln, A. E. 1942. Litter size and weight as permanent 

characteristics of sows. Tech. Bull. U.S. Dep. Agric., No. 836: 40 pp. 

MacArthur, J. W. 1949. Selection for small and large body size in the 

house mouse. Genetics, 34: 194-209. [216, 295, 335] 

McLaren, A., and Michie, D. 1954. Factors affecting vertebral variation 

in mice. 1. Variation within an inbred strain. J. Embryol. exp. 

Morph., 2: 149-160. [273, 274] 

1955. Factors affecting vertebral variation in mice. 2. Further evidence 
on intra-strain variation. J. Embryol. exp. Morph., 3: 366-375. 

[303, 306] 

1956a. Factors affecting vertebral variation in mice. 3. Maternal 

effects in reciprocal crosses. J. Embryol. exp. Morph., 4: 161-166. 

[306] 

1956b. Variability of response in experimental animals. J. Genet., 54: 

440-455. [271] 

Malécot, G. 1948. Les Mathématiques de l'Hérédité. Paris: Masson et Cie. 

vi+63 pp. [4, 61, 69, 75, 88] 

Mangelsdorf, P. C. 1951. Hybrid corn: its genetic basis and its significance 

in human affairs. Genetics in the Twentieth Century, ed. L. C. Dunn. 

New York: Macmillan Co. Pp. 555-571. [110, 277] 

Mather, K. 1949. Biometrical Genetics. London: Methuen & Co., Ltd. 

ix + 162 pp. [4, 106, 277, 340, 346] 


1953a. Genetical control of stability in development. Heredity, 7: 297- 

336. [270] 

1953b. The genetical structure of populations. Symp. Soc. exp. Biol., 7: 

66-95. [340] 

1955a. Polymorphism as an outcome of disruptive selection. Evolution, 

9: 52-61. [43] 

1955b. The genetical basis of heterosis. Proc. roy. Soc., B., 144: 143- 

150. [288] 

Merrell, D. J. 1953. Selective mating as a cause of gene frequency changes 

in laboratory populations of Drosophila melanogaster. Evolution, 7: 

287-296. [34] 

Morley, F. H. W. 1951. Selection for economic characters in Australian 

Merino sheep. (1) Estimates of phenotypic and genetic parameters. 

Sci. Bull. Dep. Agric. N.S.W., No. 73: 45 pp. [144] 

1954. Selection for economic characters in Australian Merino sheep. 

IV. The effect of inbreeding. Aust. J. agric. Res., 5: 305-316. 


1955. Selection for economic characters in Australian Merino sheep. 

V. Further estimates of phenotypic and genetic parameters. Aust. 
J. agric. Res., 6: 77-90. [168, 316] 

Mourant, A. E. 1954. The Distribution of the Human Blood Groups. Oxford: 
Blackwell. xxi+428pp. [5] 

Muller, H. J., and Oster, I. I. 1957. Principles of back mutation as 
observed in Drosophila and other organisms. Advances in Radiobiology. 
Proc. 5th int. Conf. Radiobiol. [Stockh.], 1956. Edinburgh: 
Oliver & Boyd. Pp. 407-415. [26] 

Newman, H. H., Freeman, F. N., and Holzinger, K. J. 1937. Twins: a 
Study of Heredity and Environment. Chicago: University of Chicago 
Press. xvi+369pp. [185] 

Osborne, R. 1957a. The use of sire and dam family averages in increasing 
the efficiency of selective breeding under a hierarchical mating 
system. Heredity, 11: 93-116. [241] 

1957b. Correction for regression on a secondary trait as a method of 
increasing the efficiency of selective breeding. Aust. J. biol. Sci., 
10: 365-366. [328] 

Osborne, R., and Paterson, W. S. B. 1952. On the sampling variance of 
heritability estimates derived from variance analyses. Proc. roy. 
Soc. Edinb., B., 64: 456-461. [183] 

Paxman, G. J. 1957. A study of spontaneous mutation in Drosophila 
melanogaster. Genetica, 29: 39-57. [342] 

Pearson, K., and Lee, A. 1903. On the laws of inheritance in man. I. In- 
heritance of physical characters. Biometrika, 2: 357-462. [163] 

Penrose, L. S. 1949. The Biology of Mental Defect. London: Sidgwick & 

Jackson. xiv+285pp. [163,164] 

1954. Some recent trends in human genetics. Proc. 9th int. Congr. 

Genet. [Bellagio (Como)], 1953, Pt. I (Suppl. to Caryologia, 6): 521- 

530. [141] 



Plum, M. 1954. Computation of inbreeding and relationship coefficients. 

J. Hered., 45: 92-94. [89] 

Powers, L. 1950. Determining scales and the use of transformations in 

studies on weight per locule of tomato fruit. Biometrics, 6: 145-163. 

1952. Gene recombination and heterosis. Heterosis, ed. J. W. Gowen. 
Ames: Iowa State College Press. Pp. 298-319. [260] 

Race, R. R., and Sanger, R. 1954. Blood Groups in Man. Oxford: Blackwell. 
2nd edn. xvi+400 pp. [12, 16] 
Rasmuson, M. 1952. Variation in bristle number of Drosophila melanogaster. 
Acta zool. [Stockh.], 33: 277-307. [265] 
1956. Recurrent reciprocal selection. Results of three model experi- 
ments on Drosophila for improvement of quantitative characters. 
Hereditas [Lund], 42: 397-414. [286] 
Reeve, E. C. R. 1955a. Inbreeding with homozygotes at a disadvantage. 
Ann. hum. Genet. [Lond.], 19: 332-346. [101] 
1955b. (Contribution to discussion.) Cold Spr. Harb. Symp. quant. 
Biol., 20: 76-78. [170] 
1955c. The variance of the genetic correlation coefficient. Biometrics, 
11: 357-374. [171, 318] 
Reeve, E. C. R., and Robertson, F. W. 1953. Studies in quantitative in- 
heritance. II. Analysis of a strain of Drosophila melanogaster 
selected for long wings. J. Genet., 51: 276-316. [171, 316, 319] 
1954. Studies in quantitative inheritance. VI. Sternite chaeta number 
in Drosophila: a metameric quantitative character. Z. indukt. Ab- 
stamm.-u. Vererblehre, 86: 269-288. [140, 145, 316] 
Rendel, J. M. 1954. The use of regressions to increase heritability. Aust. 
J. biol. Sci., 7: 368-378. [328] 
Rendel, J. M., Robertson, A., Asker, A. A., Khishin, S. S., and Ragab, 
M. T. 1957. The inheritance of milk production characteristics. 
J. agric. Sci., 48: 426-432. [149] 
Roberts, J. A. Fraser. 1957. Blood groups and susceptibility to disease: a 
review. Brit. J. prev. Soc. Med., 11: 107-125. [44] 
Robertson, A. 1952. The effect of inbreeding on the variation due to re- 
cessive genes. Genetics, 37: 189-207. [268, 269] 
1954. Inbreeding and performance in British Friesian cattle. Proc. 
Brit. Soc. Anim. Prod., 1954: 87-92. [249] 
1955a. Prediction equations in quantitative genetics. Biometrics, 11: 
95-98. [236] 
1955b. Selection in animals: synthesis. Cold Spr. Harb. Symp. quant. 
Biol., 20: 225-229. [333, 337] 
1956. The effect of selection against extreme deviants based on devia- 
tion or on homozygosis. J. Genet., 54: 236-248. [338] 
1957a. Genetics and the improvement of dairy cattle. Agric. Rev. 
[Lond.], 2 (8): 10-21. [167] 
1957b. Optimum group size in progeny testing and family selection. 
Biometrics, 13: 442-450. [243] 


1959a. Experimental design in the evaluation of genetic parameters. 

Biometrics, 15: 219-226. [178, 182, 183] 

1959b. The sampling variance of the genetic correlation coefficient. 

Biometrics, 15: 469-485. [318, 323] 

Robertson, A., and Lerner, I. M. 1949. The heritability of all-or-none 
traits: viability of poultry. Genetics, 34: 395-411. [168, 303] 

Robertson, F. W. 1955. Selection response and the properties of genetic 
variation. Cold Spr. Harb. Symp. quant. Biol., 20: 166-177. 

[211, 212, 216] 
1957a. Studies in quantitative inheritance. X. Genetic variation of ovary 
size in Drosophila. J. Genet., 55: 410-427. [140, 144, 145, 168] 

1957b. Studies in quantitative inheritance. XI. Genetic and environmental 
correlation between body size and egg production in 
Drosophila melanogaster. J. Genet., 55: 428-443. [131, 140, 168] 

Robertson, F. W., and Reeve, E. C. R. 1952a. Studies in quantitative in- 
heritance. I. The effects of selection of wing and thorax length in 
Drosophila melanogaster. J. Genet., 50: 414-448. [223] 

1952b. Heterozygosity, environmental variation and heterosis. Nature 
[Lond.], 170: 296. [270, 271] 

Robinson, H. F., and Comstock, R. E. 1955. Analysis of genetic variability 
in corn with reference to probable effects of selection. Cold Spr. 
Harb. Symp. quant. Biol., 20: 127-135. [140, 284, 290] 

Robinson, H. F., Comstock, R. E., Khalil, A., and Harvey, P. H. 1956. 
Dominance versus over-dominance in heterosis: evidence from 
crosses between open-pollinated varieties of maize. Amer. Nat., 
90: 127-131. [290] 

Robson, E. B. 1955. Birth weight in cousins. Ann. hum. Genet. [Lond.], 19: 
262-268. [141, 163, 185] 

Russell, E. S. 1949. A quantitative histological study of the pigment found 
in the coat-color mutants of the house mouse. IV. The nature of 
the effects of genic substitution in five major allelic series. Genetics, 
34: 146-166. [116, 126] 

Schäfer, W. 1937. Über die Zunahme der Isozygotie (Gleicherbarkeit) 
bei fortgesetzter Bruder-Schwester-Inzucht. Z. indukt. Abstamm.- 
u. Vererblehre, 72: 50-78. [91, 97] 

Searle, A. G. 1949. Gene frequencies in London's cats. J. Genet., 49: 
214-220. [18] 

Sheppard, P. M. 1958. Natural Selection and Heredity. London: Hutchin- 
son & Co. (Publishers) Ltd. 212 pp. [44] 

Shoffner, R. N. 1948. The reaction of the fowl to inbreeding. Poult. Sci., 
27: 448-452. [249] 

Sierts-Roth, U. 1953. Geburts- und Aufzuchtgewichte von Rassehunden. 
Z. Hundeforsch., 20: 1-122. [216] 

Slizynski, B. M. 1955. Chiasmata in the male mouse. J. Genet., 53: 597- 
605. [99] 

Smith, H. Fairfield. 1936. A discriminant function for plant selection. 
Ann. Eugen. [Lond.], 7: 240-250. [325] 




Smith, H. H. 1952. Fixing transgressive vigor in Nicotiana rustica. 
Heterosis, ed. J. W. Gowen. Ames: Iowa State College Press. Pp. 
161-174. [260] 

Snedecor, G. W. 1956. Statistical Methods. Ames: Iowa State College 
Press. 5th edn. xiii + 534 pp. [144,173] 

Sprague, G. F. 1952. Early testing and recurrent selection. Heterosis, ed. 
J. W. Gowen. Ames: Iowa State College Press. Pp. 400-417. 

Stern, C. 1943. The Hardy-Weinberg law. Science, 97: 137-138. [9] 

1949. Principles of Human Genetics. San Francisco: W. H. Freeman & 
Co. xi+617 pp. [13, 183] 

Tantawy, A. O., and Reeve, E. C. R. 1956. Studies in quantitative inheri- 
tance. IX. The effects of inbreeding at different rates in Drosophila 
melanogaster. Z. indukt. Abstamm.-u. Vererblehre, 87: 648-667. 

[249, 268, 290] 
Waddington, C. H. 1942. Canalisation of development and the inheritance 
of acquired characters. Nature [Lond.], 150: 563. [310] 

1953. Genetic assimilation of an acquired character. Evolution, 7: 118- 
126. [310,311] 

1957. The Strategy of the Genes. London: George Allen & Unwin, Ltd. 
ix + 262 pp. [43, 146, 271, 311, 340, 341, 342] 

Wallace, B. 1958. The comparison of observed and calculated zygotic 
distributions. Evolution, 12: 113-115. [12] 

Wallace, B., and Vetukhiv, M. 1955. Adaptive organization of the gene 
pools of Drosophila populations. Cold Spr. Harb. Symp. quant. 
Biol., 20: 303-309. [262] 

Warren, E. P., and Bogart, R. 1952. Effect of selection for age at time of 
puberty on reproductive performance in the rat. Sta. Tech. Bull. Ore. 
agric. Exp. Sta., No. 25: 27 pp. [168] 

Warwick, B. L. 1932. Probability tables for Mendelian ratios with small 
numbers. Bull. Tex. agric. Exp. Sta., No. 463: 28 pp. [105] 

Warwick, E. J., and Lewis, W. L. 1954. Increase in frequency of a de- 
leterious recessive gene in mice. J. Hered., 45: 143-145. [113] 
Weinberg, W. 1908. Über den Nachweis der Vererbung beim Menschen. 
Jh. Ver. vaterl. Naturk. Württemb., 64: 369-382. [9] 
Weir, J. A. 1955. Male influence on sex ratio of offspring in high and low 
blood-pH lines of mice. J. Hered., 46: 277-283. [321] 
Weir, J. A., and Clark, R. D. 1955. Production of high and low blood-pH 
lines of mice by selection with inbreeding. J. Hered., 46: 125-132. 

Whatley, J. A. 1942. Influence of heredity and other factors on 180- 
day weight in Poland China swine. J. agric. Res., 65: 249-264. 

Williams, E. J. 1954. The estimation of components of variability. Tech. 
Pap. Div. math. Statist. C.S.I.R.O. Aust., No. 1: 22 pp. [173] 

Wright, S. 1921. Systems of mating. Genetics, 6: 111-178. 

[2, 4, 22, 90, 165] 


1922. Coefficients of inbreeding and relationship. Amer. Nat., 56: 330- 
338. [22,87,88] 

1 93 1. Evolution in Mendelian populations. Genetics, 16: 97-159. 

[4, 53, 54, 69, 75, 92] 

1933. Inbreeding and homozygosis. Proc. nat. Acad. Sci. [Wash.], 19: 
411-420. [90,92] 

1934. The method of path coefficients. Ann. math. Statist., 5: 161-215. 


1935a. The analysis of variance and the correlations between relatives 

with respect to deviations from an optimum. J. Genet., 30: 243-256. 

19356. Evolution in populations in approximate equilibrium. J. Genet., 

30: 257-266. [338] 

1939. Statistical genetics in relation to evolution. Actualités scientifiques 
et industrielles, 802. Paris: Hermann et Cie. 63 pp. [70] 

1940. Breeding structure of populations in relation to speciation. 
Amer. Nat., 74: 232-248. [71, 77] 

1942. Statistical genetics and evolution. Bull. Amer. math. Soc., 48: 
223-246. [75, 79] 

1943. Isolation by distance. Genetics, 28: 114-138. [77] 
1946. Isolation by distance under different systems of mating. Genetics, 

31: 39-59. [77] 

1948. On the roles of directed and random changes in gene frequency 

in the genetics of populations. Evolution, 2: 279-294. [75] 

1951. The genetical structure of populations. Ann. Eugen. [Lond.], 15: 

323-354. [75, 76, 77] 

1952a. The theoretical variance within and among subdivisions of a 
population that is in a steady state. Genetics, 37: 312-321. [57] 
1952b. The genetics of quantitative variability. Quantitative Inheritance, 
edd. E. C. R. Reeve and C. H. Waddington. London: H.M.S.O. 
Pp. 5-41. [219, 292, 297, 341] 

1954. The interpretation of multivariate systems. Statistics and Mathe- 
matics in Biology, edd. O. Kempthorne, T. A. Bancroft, J. W. 
Gowen and J. L. Lush. Ames: Iowa State College Press. Pp. 11-33. 

1956. Modes of selection. Amer. Nat., 90: 5-24. [263] 

Wright, S., and Kerr, W. E. 1954. Experimental studies of the distribu- 
tion of gene frequencies in very small populations of Drosophila 
melanogaster. II. Bar. Evolution, 8: 225-240. [74, 80, 81] 

Wright, S., and McPhee, H. C. 1925. An approximate method of cal- 
culating coefficients of inbreeding and relationship from livestock 
pedigrees. J. agric. Res., 31: 377-383. [87] 

Yoon, C. H. 1955. Homeostasis associated with heterozygosity in the 
genetics of time of vaginal opening in the house mouse. Genetics, 
40: 297-309. [271] 

Zeleny, C. 1922. The effect of selection for eye facet number in the white 
bar-eye race of Drosophila melanogaster. Genetics, 7: 1-115. [107] 


Adaptive value, 26. 

action of genes, 126, 138; 

combination of loci, 116-7; 

effects, 122; 

genes, 124, 138; 

variance, 135-8. 
albinism in man, 13, 36. 
assimilation, genetic, 310-1. 
assortative mating, 22, 164, 170-1. 
asymmetry in selection response, 

as scale effect, 296-7. 
average effect, 117-20. 

Base population, 49, 61, 95-6. 
blood groups: 

in man, 5, 7, 12, 16, 44; 

in poultry, 290; 

selective advantage, 44. 
Breeding value, 120-5; 

difference between definitions, 

Canalisation, 272, 308, 341. 

cats, 18-19. 

cattle, dropsy in, 13. 

causal components of variance, 150. 

Cepaea nemoralis, 43, 78-9, 83-4. 

coadaptation, 263. 

coancestry, 88-90, 233. 


of inbreeding, see under Inbreed- 

of relationship, 233; 

of selection, 28. 
combining ability, 281-6. 
continuous variation, 104-11. 
correlated characters, 312-29, 335-6. 
correlation (between characters), 

genetic, 313-8. 
correlation (between relatives): 

of breeding values, 233; 

phenotypic, 151, 162-3. 
covariance, 151-2; 

environmental, 159-61; 

genetic, 152-9, 

offspring-parent, 152-6, 
sibs, 154, 156-7; 

phenotypic, 161-4. 

heterosis, 255-63; 

in plant and animal improvement, 

variance between crosses, 279-83. 

Developmental variation, 141, 143. 

dominance, 122-5; 

environmental, 112; 

interaction (epistatic), 125-8. 
discontinuous variation, 104, 108, 

dispersive process, 23, 47-8 (see 

also Inbreeding), 
dominance, 27, 113; 

deviation, 122-5; 

directional, 213; 

effect on variance, 137; 

and fitness, 337; 

and heterosis, 257; 

and inbreeding depression, 251; 

and scale, 298. 
drift, random, 50-7; 

in natural populations, 81-4. 
Drosophila melanogaster: 

Bar, 80-1, 107; 

bristle number: 

components of variance, 140, 

fitness relationship, 333, 343, 



frequency distribution, 107, 
heritability, 169-70, 177, 
mutation, 342, 
number of "loci", 219, 
random drift, 265, 
repeatability, 145, 
response to selection, 190, 195, 
209, 210, 216, 221, 223, 245- 
brown, 52, 53, 56, 59; 
effective population size, 73-4; 
egg number, 140, 282, 336; 
ovary size, 140, 145; 
raspberry, 34; 
thorax length: 

components of variance, 130, 

response to selection, 209, 211- 
212, 216, 219, 221, 319; 
wing length, 171-2, 319. 
Drosophila pseudoobscura, 262. 
Drosophila subobscura, 252. 
Drosophila tropicalis, 39. 
dwarfism (chondrodystrophy) in 
man, 38. 

Effective factors, number of, 217. 
effective population size (number), 

68-74; 
ratio of, to actual number, 73-4. 
environment, 112 (see also under 
common, 159-61. 
epistasis, 126 (see also under Inter- 

Hardy-Weinberg, 9-12; 
under inbreeding, 74-81, 

and selection for heterozygotes, 
with linked loci, 20-1; 
with more than one locus, 19-20; 
under mutation, 25-6, 

with selection, 36-41; 
under natural selection, 331-2; 
under selection for heterozygotes, 

eugenics, 36, 40. 
euheterosis, 262. 

Factors, effective number of, 217. 
family size: 

and heritability estimates, 177-83; 

and inbreeding, 70-3; 

and selection, 233-46. 
fitness, natural, 26, 167, 329, 330-43. 
fixation, 54-7, 66-7, 97; 

of deleterious genes, 80-1. 

Gene frequency, 6-22; 
change of, 23-36, 

by selection for metric character, 
directional, 213; 
distributions of: 

with inbreeding, 52, 55, 84, 
with inbreeding and mutation, 

with inbreeding and selection, 
79, 81; 
effect on variance, 137; 
sampling variance of, 50-4, 64. 
generation interval, 196-8. 
genetic death, 39-40. 
genotype, 112; 
frequencies, 5-7, 

with inbreeding, 57-9, 65-6; 
with random mating, 9-22. 
genotype-environment correlation, 

genotype-environment interaction, 

133-4, 148-9, 322-4. 
genotypic value, 112-4, 123-5, 132. 

Hardy-Weinberg law, 9-15. 
heritability, 135, 163, 165-7; 

estimation, 168-85; 

examples, 167-8; 

of family means, 232-7; 

after inbreeding, 268; 

precision of estimates, 177-83; 

realised, 202-3, 296-7; 

of threshold characters, 303; 

of within-family deviations, 232-7. 



heterosis, 254; 

examples, 260; 

in single crosses, 255-61; 

in wide crosses, 261-3; 

utilisation of, 276-86. 

frequency of, 

with inbreeding, 65-6, 

with random mating, 9-13, 38; 

selection for, see under selection, 

developmental, 270-2; 

genetic, 331. 
hybrid vigour, see heterosis, 
hybrids, uniformity of, 270-2, 275, 

Idealised population, 48-50. 

experimental use, 272, 275; 
sub-line differentiation, 273-4; 
variability, 270-1, 296. 
inbreeding, 60-7; 
coefficient, 61; 

from pedigrees, 86-8, 
from population size, 61-4, 
for regular systems, 90-5; 
depression, 247-54, 

examples, 249; 
rate of, 63, 69-70, 92, 96, 101-2; 
regular systems, 90-5; 
and variance, 265-72. 
index for selection, 325-8. 
incidence of threshold character, 302. 
intangible variation, 141. 
integration, 263. 
intensity of selection, 192. 

between loci (epistatic), 125-8, 
and heterosis, 259, 262, 
and inbreeding, 252, 
and scale, 298-9; 
deviation, 125-8; 

between genotype and environment, 133-4, 148-9, 322-4. 

island model, 77. 
isoalleles, 344. 
isolation by distance, 


Line (subdivision of a population), 


and correlation, 312, 320; 

and Hardy-Weinberg equilibrium, 

and inbreeding, 97-100; 
and polygenic balance, 340-1; 
and resemblance between relatives, 158-9. 
logarithmic transformation, 297. 
luxuriance, 262. 
Lycopersicon, 260, 300. 

Maize, 277, 290. 

albinism, 13, 36; 
birth weight, 141-2; 
blood groups, 5, 7, 12, 16, 44; 
dwarfism (chondrodystrophy), 38; 
sickle-cell anaemia, 44-6. 
maternal effects, 140-2, 160, 214, 

252-3, 260-1. 
mating, types of, 15. 
metric character, 104-11. 
migration, 23; 

in small populations, 75-9. 

blood-pH, 321; 
body weight: 

fitness relationship, 335-6, 
number of "loci", 219, 
realised heritability, 203, 
response to selection, 199, 214, 

216, 220, 
selection differentials, 201, 
sib-analysis, 175-6, 
variance and scale, 295; 
growth rate, 107, 133; 
litter size: 

frequency distribution, 107, 

heterosis, 255, 

inbreeding depression, 252-3, 



repeatability, 144; 
non-agouti, 51; 

pigment granules, 116-7, 126-8; 
pygmy, 113-5, 120-3, 136, 222-3, 

289, 299; 
sex ratio, 321; 
skeletal variants, 274; 
vertebrae, number of, 273-4, 305- 
multiple alleles, 15-17, 42, 138. 
multiple measurements, 142-9. 
mutation, 23; 

balanced against selection, 36-41; 
change of gene frequency by, 24-6; 
and inbreeding, 75-9, 100, 274-5; 
and origin of variation, 342-3; 

estimation of, 38, 
increase of, 26, 39. 

Neighbourhood model, 77-9. 
Nicotiana, 260. 

combination of genes, 125-8; 

variance, 139-40, 280, 287, 337. 

Observational components of variance, 150. 
overdominance, 27, 287-91; 
effect on variance, 137; 
equilibrium gene frequency, 41-6; 
and fitness, 339, 344; 
in selection experiments, 213, 222- 

Panmictic index, 64, 66. 

panmixia, 8. 

pedigrees and inbreeding, 85-90. 


body-length, 174-5; 

litter size, 253. 
pleiotropy, 289-91, 312-3, 328-9, 

polycross, 282. 
polygenes, 106. 
polygenic balance, 340-1. 
polygenic variation, 106. 


base-, 49, 61, 95-6; 

effective size (number), 68-74, 
ratio of, to actual size, 73-4; 

-mean, 113-7; 

size, 50. 
premisses, 2, 3. 
probit transformation, 302. 
progeny testing, 229-30. 
proportionate effect, 207, 219. 

Quantitative character, 104. 
Quasi-continuous variation, 301. 

Radiation, 26, 39. 
random drift, 51-7; 

in natural populations, 81-4. 
random mating, 8-21. 
range, total, 115, 116, 215-9. 
regression, offspring on parents, 

151, 162-3. 
relatives, resemblance between, 

repeatability, 143-9. 

Scale, 108-9, 292-300; 
-effects, 293; 
underlying, 301. 
segregation index, 217. 
selection, 23, 26, 186-7; 

balanced by mutation, 36-41; 
change of gene frequency, 28-36; 
coefficient of, 28, 

related to intensity of, 203-7; 
combined, 227, 236-7, 239-40; 
for combining ability, 283-6; 
correlated response to, 318-24; 
in different environments, 322-4; 
-differential, 187, 191-8, 

weighting of, 200-2; 
for economic value, 324-9; 
eugenic effects, 36, 40-1; 
family, 227-8 (see also Selection, 
and family size, 243-5; 
for heterozygotes, 41-6, 213, 222- 
223, 339, 



affecting inbreeding, 100-3, 

-index, 325-8; 
indirect, 320-4; 
individual, 227 (see also Selection, 

intensity of, 192, 

related to coefficient of, 203- 


for intermediates, 338-42; 

-limit, 215, 219-24, 328-9; 

long-term results, 215-4; 

mass, 227; 

methods (use of relatives), 225-31, 
heritabilities, 232-6, 
relative merits, 237-44, 
responses expected, 231-7; 

natural, 187, 200-2, 212, 253, 266, 

reciprocal, 284; 
recurrent, 283, 286; 
response, 187-91, 

asymmetry, 212-5, 296-7, 
duration, 215-7, 
measurement, 198-203, 
number of "loci", 217-9, 
prediction, 189-91, 214-5, 
repeatability, 208-12, 
total, 215-7; 
sib, 229 (see also Selection, 

in small populations, 79-81; 
for threshold characters, 308-11; 
variable, 339-40; 

within-family, 227 (see also Selection, methods), 
selective value, 26. 
self-fertilising plants, 247, 276-7. 
sex-linked genes, 17-19, 34. 
sickle-cell anaemia, 44-6. 
sib-analysis, 172-6. 
snails, 43, 78-9, 83-4. 

systematic processes, 23, 

in small populations, 74-81. 

Threshold characters, 301-11, 321. 
transformation of scale, 108-9, 292- 

logarithmic, 297; 

probit, 302. 
tobacco, 260. 
tomato, 260, 300. 
top-cross, 282. 
twins, 131, 183-5. 

Uniformity of inbred lines, 54, 66-7, 
97, 100-3. 


Value: 
genotypic, 112-4, 123-5; 
phenotypic, 112. 

variance: 
additive, 135-8; 
between crosses, 279-83; 
components, 129-30, 

causal, 150, 

genetic, 134-40, 

observational, 150; 
dominance, 135-8, 163, 
environmental, 130-4, 140-9, 

common, 159-61, 

general, 143-9, 

inbreeding effects, 270-2, 

special, 143-9; 
genotypic, 130-4; 
inbreeding effects, 265-72; 
interaction (epistatic), 138-40, 

and resemblance between relatives, 157-9; 
non-additive, 139-40, 280, 287, 


variation: 
continuous, 104-11; 
discontinuous, 104, 108, 301; 
quasi-continuous, 301. 

