INTRODUCTION TO
QUANTITATIVE GENETICS
D. S. FALCONER
Agricultural Research Council's Unit of Animal Genetics
University of Edinburgh
THE RONALD PRESS COMPANY • NEW YORK
All Rights Reserved
No part of this book may be reproduced
in any form without permission in writing
from the Publisher
FIRST PUBLISHED IN GREAT BRITAIN 1960
Copyright © 1960 D. S. Falconer
Printed in Great Britain by
Robert MacLehose and Company Limited, Glasgow
PREFACE
My aim in writing this book has been to provide an introductory
text-book of quantitative genetics, with the emphasis on general principles
rather than on practical application, and one moreover that can be
understood by biologists of no more than ordinary mathematical
ability. In pursuit of this latter aim I have set out the mathematics in
the form that I, being little of a mathematician, find most
comprehensible, hoping that the consequent lack of rigour and elegance will
be compensated for by a wider accessibility. The reader is not,
however, asked to accept conclusions without proof. Though only the
simplest algebra is used, all the mathematical deductions essential
to the exposition of the subject are demonstrated in full. Some
knowledge of statistics, however, is assumed, particularly of the
analysis of variance and of correlation and regression. Elementary
knowledge of Mendelian genetics is also assumed.
I have had no particular class of reader exclusively in mind, but
have tried to make the book useful to as wide a range of readers as
possible. In consequence some will find less detail than they require
and others more. Those who intend to become specialists in this
branch of genetics or in its application to animal or plant breeding
will find all they require of the general principles, but will find little
guidance in the techniques of experimentation or of breeding
practice. Those for whom the subject forms part of a course of
general genetics will find a good deal more detail than they require.
The section headings, however, should facilitate the selection of what
is relevant, and any of the following chapters could be omitted without
serious loss of continuity: Chapters 4, 5, 10 (after p. 168), 12, 13,
and 15-20.
The choice of symbols presented some difficulties because there
are several different systems in current use, and it proved impossible
to build up a self-consistent system entirely from these. I have
accordingly adopted what seemed to me the most appropriate of the
symbols in current use, but have not hesitated to introduce new
symbols where consistency or clarity seemed to require them. I
hope that my system will not be found unduly confusing to those
accustomed to a different one. There is a list of symbols at the end,
where some of the equivalents in other systems are given.
Acknowledgements
Many people have helped me in various ways, to all of whom I
should like to express my thanks. I am greatly indebted to Professor
C. H. Waddington for his encouragement and for the facilities that I
have enjoyed in his laboratory. It is no exaggeration to say that
without Dr Alan Robertson's help this book could not have been written.
Not only has his reading of the manuscript led to the elimination of
many errors, but I have been greatly assisted in my understanding
of the subject, particularly its more mathematical aspects, by frequent
discussions with him. Dr R. C. Roberts read the whole manuscript
with great care and his valuable suggestions led to many
improvements being made. Parts of the manuscript were read also by Dr
N. Bateman, Dr J. C. Bowman, Dr D. G. Gilmour, Dr J. H. Sang,
and my wife, to all of whom I am grateful for advice. I owe much
also to the Honours and Diploma students of Animal Genetics in
Edinburgh between 1951 and 1957, whose questions led to
improvements of presentation at many points. Despite all the help I have
received, many imperfections remain and there can hardly fail to be
some errors that have escaped detection: the responsibility for all
of these is entirely mine. To Mr E. D. Roberts I am indebted for
drawing all the graphs and diagrams, and I greatly appreciate the
care and skill with which he has drawn them. I am indebted also to
the Director and Staff of the Commonwealth Bureau of Animal
Breeding for assistance with the preparation of the list of references.
D. S. FALCONER
Institute of Animal Genetics, Edinburgh
December, 1958
CONTENTS
PREFACE
INTRODUCTION
1 GENETIC CONSTITUTION OF A POPULATION
    Frequencies of genes and genotypes
    Hardy-Weinberg equilibrium
2 CHANGES OF GENE FREQUENCY
    Migration
    Mutation
    Selection
3 SMALL POPULATIONS: I. Changes of gene frequency under simplified conditions
    The idealised population
    Sampling
    Inbreeding
4 SMALL POPULATIONS: II. Less simplified conditions
    Effective population size
    Migration, Mutation, and Selection
    Random drift in natural populations
5 SMALL POPULATIONS: III. Pedigreed populations and close inbreeding
    Pedigreed populations
    Regular systems of inbreeding
6 CONTINUOUS VARIATION
    Metric characters
    General survey of subject-matter
7 VALUES AND MEANS
    Population mean
    Average effect
    Breeding value
    Dominance deviation
    Interaction deviation
8 VARIANCE
    Genotypic and environmental variance
    Genetic components of variance
    Environmental variance
9 RESEMBLANCE BETWEEN RELATIVES
    Genetic covariance
    Environmental covariance
    Phenotypic resemblance
10 HERITABILITY
    Estimation of heritability
    The precision of estimates of heritability
    Identical twins
11 SELECTION: I. The response and its prediction
    Response to selection
    Measurement of response
    Change of gene frequency under artificial selection
12 SELECTION: II. The results of experiments
    Repeatability of response
    Asymmetry of response
    Long-term results of selection
13 SELECTION: III. Information from relatives
    Methods of selection
    Expected response
    Relative merits of the methods
14 INBREEDING AND CROSSBREEDING: I. Changes of mean value
    Inbreeding depression
    Heterosis
15 INBREEDING AND CROSSBREEDING: II. Changes of variance
    Redistribution of genetic variance
    Changes of environmental variance
    Uniformity of experimental animals
16 INBREEDING AND CROSSBREEDING: III. The utilisation of heterosis
    Variance between crosses
    Methods of selection for combining ability
    Overdominance
17 SCALE
18 THRESHOLD CHARACTERS
    Selection for threshold characters
19 CORRELATED CHARACTERS
    Genetic and environmental correlations
    Correlated response to selection
    Genotype-environment interaction
    Simultaneous selection for more than one character
20 METRIC CHARACTERS UNDER NATURAL SELECTION
    Relation of metric characters to fitness
    Maintenance of genetic variation
    The genes concerned with quantitative variation
GLOSSARY OF SYMBOLS
INDEXED LIST OF REFERENCES
SUBJECT INDEX
INTRODUCTION
Quantitative genetics is concerned with the inheritance of those
differences between individuals that are of degree rather than of kind,
quantitative rather than qualitative. These are the individual
differences which, as Darwin wrote, "afford materials for natural selection
to act on and accumulate, in the same manner as man accumulates in
any given direction individual differences in his domestic
productions." An understanding of the inheritance of these differences is thus
of fundamental significance in the study of evolution and in the
application of genetics to animal and plant breeding; and it is from these
two fields of enquiry that the subject has received the chief impetus to
its growth.
Virtually every organ and function of any species shows individual
differences of this nature, the differences of size among ourselves or
our domestic animals being an example familiar to all. Individuals
form a continuously graded series from one extreme to the other and
do not fall naturally into sharply demarcated types. Qualitative
differences, in contrast, divide individuals into distinct types with
little or no connexion by intermediates. Examples are the
differences between blue-eyed and brown-eyed individuals, between the
blood groups, or between normally coloured and albino individuals.
The distinction between quantitative and qualitative differences
marks, in respect of the phenomena studied, the distinction between
quantitative genetics and the parent stem of "Mendelian" genetics.
In respect of the mechanism of inheritance the distinction is between
differences caused by many or by few genes. The familiar Mendelian
ratios, which display the fundamental mechanism of inheritance, can
be seen only when a gene difference at a single locus gives rise to a
readily detectable difference in some property of the organism.
Quantitative differences, in so far as they are inherited, depend on gene
differences at many loci, the effects of which are not individually
distinguishable. Consequently the Mendelian ratios are not exhibited
by quantitative differences, and the methods of Mendelian analysis
are inappropriate.
It is, nevertheless, a basic premiss of quantitative genetics that the
inheritance of quantitative differences depends on genes subject to
the same laws of transmission and having the same general properties
as the genes whose transmission and properties are displayed by
qualitative differences. Quantitative genetics is therefore an extension
of Mendelian genetics, resting squarely on Mendelian principles as its
foundation.
The methods of study in quantitative genetics differ from those
employed in Mendelian genetics in two respects. In the first place,
since ratios cannot be observed, single progenies are uninformative,
and the unit of study must be extended to "populations," that is
larger groups of individuals comprising many progenies. And, in the
second place, the nature of the quantitative differences to be studied
requires the measurement, and not just the classification, of the
individuals. The extension of Mendelian genetics into quantitative
genetics may thus be made in two stages, the first introducing new
concepts connected with the genetic properties of "populations" and the
second introducing concepts connected with the inheritance of
measurements. This is how the subject is presented in this book. In
the first part, which occupies Chapters 1 to 5, the genetic properties of
populations are described by reference to genes causing easily
identifiable, and therefore qualitative, differences. Quantitative differences
are not discussed until the second part, which starts in Chapter 6.
These two parts of the subject are often distinguished by different
names, the first being referred to as "Population Genetics" and the
second as "Biometrical Genetics" or "Quantitative Genetics."
Some writers, however, use "Population Genetics" to refer to the
whole. The terminology of this distinction is therefore ambiguous.
The use of "Quantitative Genetics" to refer to the whole subject may
be justified on the grounds that the genetics of populations is not just
a preliminary to the genetics of quantitative differences, but an
integral part of it.
The theoretical basis of quantitative genetics was established
round about 1920 by the work of Fisher (1918), Haldane (1924-32,
summarised 1932) and Wright (1921). The development of the
subject over the succeeding years, by these and many other
geneticists and statisticians, has been mainly by elaboration,
clarification, and the filling in of details, so that today we have a substantial
body of theory accepted by the majority as valid. As in any healthily
growing science, there are differences of opinion, but these are chiefly
matters of emphasis, about the relative importance of this or that
aspect.
The theory consists of the deduction of the consequences of
Mendelian inheritance when extended to the properties of
populations and to the simultaneous segregation of genes at many loci. The
premiss from which the deductions are made is that the inheritance of
quantitative differences is by means of genes, and that these genes
are subject to the Mendelian laws of transmission and may have any
of the properties known from Mendelian genetics. The property of
"variable expression" assumes great importance and might be raised
to the status of another premiss: that the expression of the genotype
in the phenotype is modifiable by non-genetic causes. Other
properties whose consequences are to be taken into account include
dominance, epistasis, pleiotropy, linkage, and mutation.
These theoretical deductions enable us to state what will be the
genetic properties of a population if the genes have the properties
postulated, and to predict what will be the consequences of applying
any specified plan of breeding. In principle we should then be able to
make observations of the genetic properties of natural or
experimental populations, and of the outcome of special breeding methods,
and deduce from these observations what are the properties of the
genes concerned. The experimental side of quantitative genetics,
however, has lagged behind the theoretical in its development, and it
is still some way from fulfilling this complementary function. The
reason for this is the difficulty of devising diagnostic experiments
which will unambiguously discriminate between the many possible
situations envisaged by the theory. Consequently the experimental
side has developed in a somewhat empirical manner, building general
conclusions out of the experience of many particular cases.
Nevertheless there is now a sufficient body of experimental data to
substantiate the theory in its main outlines; to allow a number of
generalisations to be made about the inheritance of quantitative differences;
and to enable us to predict with some confidence the outcome of
certain breeding methods. Discussion of all the difficulties would be
inappropriate in an introductory treatment. The aim here is to
describe all that is reasonably firmly established and, for the sake of
clarity, to simplify as far as is possible without being misleading.
Consequently the emphasis is on the theoretical side. Though
conclusions will often be drawn directly from experimental data, the
experimental side of the subject is presented chiefly in the form of
examples, chosen with the purpose of illustrating the theoretical
conclusions. These examples, however, cannot always be taken as
substantiating the postulates that underlie the conclusions they
illustrate. Too often the results of experiments are open to more than
one interpretation.
No attempt has been made to give exhaustive references to
published work in any part of the subject; or to indicate the origins, or
trace the history, of the ideas. To have done this would have required
a much longer book, and a considerable sacrifice of clarity. The chief
sources, from which most of the material of the book is derived, are
listed below. These sources are not regularly cited in the text.
References are given in the text when any conclusion is stated without
full explanation of its derivation. These references are not always to
the original papers, but rather to the more recent papers where the
reader will find a convenient point of entry to the topic under
discussion. References are also given to the sources of experimental
data, but these, for reasons already explained, cover only a small part
of the experimental side of the subject. In particular, a great deal
more work has been done on plants and on farm animals than would
appear from its representation among the experimental work cited.
Chief Sources
(For details see List of References)
Fisher, R. A. (1930), The Genetical Theory of Natural Selection.
Haldane, J. B. S. (1932), The Causes of Evolution.
Kempthorne, O. (1957), An Introduction to Genetic Statistics.
Lerner, I. M. (1950), Population Genetics and Animal Improvement.
Li, C. C. (1955), Population Genetics.
Lush, J. L. (1945), Animal Breeding Plans.
Malécot, G. (1948), Les Mathématiques de l'Hérédité.
Mather, K. (1949), Biometrical Genetics.
Wright, S. (1921), Systems of Mating. Genetics 6: 111-178.
Wright, S. (1931), Evolution in Mendelian Populations. Genetics 16: 97-159.
CHAPTER 1
GENETIC CONSTITUTION OF A
POPULATION
Frequencies of Genes and Genotypes
To describe the genetic constitution of a group of individuals we
should have to specify their genotypes and say how many of each
genotype there were. This would be a complete description, provided the
nature of the phenotypic differences between the genotypes did not
concern us. Suppose for simplicity that we were concerned with a
certain autosomal locus, A, and that two different alleles at this locus,
A1 and A2, were present among the individuals. Then there would be
three possible genotypes, A1A1, A1A2, and A2A2. (We are concerned
here, as throughout the book, exclusively with diploid organisms.)
The genetic constitution of the group would be fully described by
the proportion, or percentage, of individuals that belonged to each
genotype, or in other words by the frequencies of the three genotypes
among the individuals. These proportions or frequencies are called
genotype frequencies, the frequency of a particular genotype being its
proportion or percentage among the individuals. If, for example, we
found one quarter of the individuals in the group to be A1A1, the
frequency of this genotype would be 0.25, or 25 per cent. Naturally
the frequencies of all the genotypes together must add up to unity, or
100 per cent.
Example 1.1. The MN blood groups in man are determined by two
alleles at a locus, and the three genotypes correspond with the three blood
groups, M, MN, and N. The following figures, taken from the tabulation
of Mourant (1954), show the blood group frequencies among Eskimoes
of East Greenland and among Icelanders as follows:

             Blood group frequency, %     Number of
              M       MN       N          individuals
Greenland    83.5    15.6      0.9           569
Iceland      31.2    51.5     17.3           747
Clearly the two populations differ in these genotype frequencies, the N
blood group being rare in Greenland and relatively common in Iceland.
Not only is this locus a source of variation within each of the two
populations, but it is also a source of genetic difference between the populations.
A population, in the genetic sense, is not just a group of individuals,
but a breeding group; and the genetics of a population is concerned
not only with the genetic constitution of the individuals but also with
the transmission of the genes from one generation to the next. In the
transmission the genotypes of the parents are broken down and a new
set of genotypes is constituted in the progeny, from the genes
transmitted in the gametes. The genes carried by the population thus have
continuity from generation to generation, but the genotypes in which
they appear do not. The genetic constitution of a population,
referring to the genes it carries, is described by the array of gene frequencies;
that is by specification of the alleles present at every locus and the
numbers or proportions of the different alleles at each locus. If,
for example, A1 is an allele at the A locus, then the frequency of A1
genes, or the gene frequency of A1, is the proportion or percentage
of all genes at this locus that are the A1 allele. The frequencies
of all the alleles at any one locus must add up to unity, or 100 per
cent.
The gene frequencies at a particular locus among a group of
individuals can be determined from a knowledge of the genotype
frequencies. To take a hypothetical example, suppose there are two
alleles, A1 and A2, and we classify 100 individuals and count the
numbers in each genotype as follows:

                            A1A1   A1A2   A2A2   Total
Number of individuals        30     60     10     100
Number of genes     A1       60     60      0     120
                    A2        0     60     20      80
                    Total                         200

Each individual contains two genes, so we have counted 200
representatives of the genes at this locus. Each A1A1 individual contains
two A1 genes and each A1A2 contains one A1 gene. So there are 120 A1
genes in the sample, and 80 A2 genes. The frequency of A1 is
therefore 60 per cent or 0.6, and the frequency of A2 is 40 per cent or 0.4.
To express the relationship in a more general form, let the frequencies
of genes and of genotypes be as follows:

                 Genes          Genotypes
                A1    A2      A1A1   A1A2   A2A2
Frequencies     p     q        P      H      Q

so that $p + q = 1$, and $P + H + Q = 1$. Since each individual contains
two genes, the frequency of A1 genes is $\frac{1}{2}(2P + H)$, and the
relationship between gene frequency and genotype frequency among the
individuals counted is as follows:

$$p = P + \tfrac{1}{2}H, \qquad q = Q + \tfrac{1}{2}H \qquad (1.1)$$
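
As a minimal sketch of equation 1.1, the counting argument above can be written as a short Python function. The function name `gene_frequencies` and its argument names are illustrative assumptions, not notation from the text; the counts are those of the hypothetical sample of 100 individuals.

```python
def gene_frequencies(n11, n12, n22):
    """Return (p, q), the frequencies of alleles A1 and A2.

    n11, n12, n22 are the numbers of A1A1, A1A2 and A2A2 individuals
    (hypothetical argument names). Each individual carries two genes.
    """
    total_genes = 2 * (n11 + n12 + n22)
    p = (2 * n11 + n12) / total_genes  # A1A1 carries two A1 genes, A1A2 one
    q = (2 * n22 + n12) / total_genes  # A2A2 carries two A2 genes, A1A2 one
    return p, q


# Hypothetical sample from the text: 30 A1A1, 60 A1A2, 10 A2A2.
p, q = gene_frequencies(30, 60, 10)
print(f"p = {p:.2f}, q = {q:.2f}")
```

Dividing the counts by the total number of individuals first gives P, H and Q, after which the same result follows from p = P + H/2 and q = Q + H/2.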
Example 1.2. To illustrate the calculation of gene frequencies from
genotype frequencies we may take the MN blood group frequencies given
in Example 1.1. The M and N blood groups represent the two homozygous
genotypes and the MN group the heterozygote. The frequency of the M
gene in Greenland is, from equation 1.1, $0.835 + \frac{1}{2}(0.156) = 0.913$, and the
frequency of the N gene is $0.009 + \frac{1}{2}(0.156) = 0.087$, the sum of the
frequencies being 1.000 as it should be. Doing the same for the Iceland
sample we find the following gene frequencies in the two populations,
expressed now as percentages:

                  Gene
              M        N
Greenland    91.3      8.7
Iceland      57.0     43.0

Thus the two populations differ in gene frequency as well as in genotype
frequencies.
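
The arithmetic of Example 1.2 can be checked with a few lines of Python. This is only a sketch: the dictionary `samples` and the helper `from_genotype_frequencies` are hypothetical names, and the genotype frequencies are those tabulated in Example 1.1, entered as proportions.

```python
def from_genotype_frequencies(P, H, Q):
    """Equation 1.1: gene frequencies (p, q) from genotype frequencies."""
    return P + H / 2, Q + H / 2


# Genotype frequencies (M, MN, N) from Example 1.1, as proportions.
samples = {
    "Greenland": (0.835, 0.156, 0.009),
    "Iceland": (0.312, 0.515, 0.173),
}

results = {name: from_genotype_frequencies(*freqs) for name, freqs in samples.items()}
for name, (p, q) in results.items():
    print(f"{name}: M gene {100 * p:.2f}%, N gene {100 * q:.2f}%")
```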
The genetic properties of a population are influenced in the
process of transmission of genes from one generation to the next by a
number of agencies. These form the chief subject-matter of the next
four chapters, but we may briefly review them here in order to have
some idea of what factors are being left out of consideration in this
chapter. The agencies through which the genetic properties of a
population may be changed are these:
Population size. The genes passed from one generation to the
next are a sample of the genes in the parent generation. Therefore
the gene frequencies are subject to sampling variation between
successive generations, and the smaller the number of parents the greater
is the sampling variation. The effects of sampling variation will be
considered in Chapters 3-5, and meantime we shall exclude it from
the discussion by supposing always that we are dealing with a "large
population," which means simply one in which sampling variation is
so small as to be negligible. For practical purposes a "large
population" is one in which the number of adult individuals is in the hundreds
rather than in the tens.
Differences of fertility and viability. Though we are not at
present concerned with the phenotypic effects of the genes under
discussion, we cannot ignore their effects on fertility and viability,
because these influence the genetic constitution of the succeeding
generation. The different genotypes among the parents may have
different fertilities, and if they do they will contribute unequally to
the gametes out of which the next generation is formed. In this way
the gene frequency may be changed in the transmission. Further,
the genotypes among the newly formed zygotes may have different
survival rates, and so the gene frequencies in the new generation may
be changed by the time the individuals are adult and themselves
become parents. These processes are called selection, and will be
described in Chapter 2. Meanwhile we shall suppose they are not
operating. It is difficult to find examples of genes not subject to
selection. For the purpose of illustration, however, we may take the
human blood-group genes since the selective forces acting on these
are probably not very strong. Genes that produce a mutant
phenotype which is abnormal in comparison with the wild-type are, in
contrast, usually subject to much more severe selection.
Migration and mutation. The gene frequencies in the
population may also be changed by immigration of individuals from another
population, and by gene mutation. These processes will be described
in Chapter 2, and at this stage will also be supposed not to operate.
Mating system. The genotypes in the progeny are determined
by the union of the gametes in pairs to form zygotes, and the union of
gametes is influenced by the mating of the parents. So the genotype
frequencies in the offspring generation are influenced by the
genotypes of the pairs that mate in the parent generation. We shall at
first suppose that mating is at random with respect to the genotypes
under discussion. Random mating, or panmixia, means that any
individual has an equal chance of mating with any other individual in
the population. The important points are that there should be no
special tendency for mated individuals to be alike in genotype, or to
be related to each other by ancestry. If a population covers a large
geographic area individuals inhabiting the same locality are more
likely to mate than individuals inhabiting different localities, and so
the mated pairs tend to be related by ancestry. A widely spread
population is therefore likely to be subdivided into local groups and
mating is random only within the groups. The properties of
subdivided populations depend on the size of the local groups, and will
be described under the effects of population size in Chapters 3-5.
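
Of the agencies listed above, the effect of population size can be given a rough numerical illustration. The sketch below assumes, as is not stated in this chapter, that the genes transmitted are a simple binomial sample of 2N genes drawn from N parents, so that the standard deviation of the gene frequency after one generation is the square root of pq/2N. The function name `sampling_sd` and the parent numbers are hypothetical.

```python
import math


def sampling_sd(p, n_parents):
    """Standard deviation of gene frequency after one generation.

    Assumption (not from this chapter): the 2N genes passed on are a
    binomial sample from a parent generation with gene frequency p.
    """
    q = 1.0 - p
    return math.sqrt(p * q / (2 * n_parents))


# Compare populations with parents in the tens and in the hundreds.
for n in (10, 50, 500):
    print(f"N = {n:3d}: s.d. of gene frequency = {sampling_sd(0.5, n):.4f}")
```

Under this assumption the sampling variation falls steadily as the number of parents rises, which is the sense in which a population numbering hundreds of adults may be treated as "large."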
Hardy-Weinberg Equilibrium
In a large random-mating population both gene frequencies and
genotype frequencies are constant from generation to generation, in
the absence of migration, mutation and selection; and the genotype
frequencies are determined by the gene frequencies. These properties
of a population were first demonstrated by Hardy and by Weinberg
independently in 1908, and are generally known as the
Hardy-Weinberg Law. (See Stern, 1943, where a translation of the relevant
part of Weinberg's paper will be found.) Such a population is said
to be in Hardy-Weinberg equilibrium. Deduction of the
Hardy-Weinberg Law involves three steps: (1) from the parents to the
gametes they produce; (2) from the union of the gametes to the
genotypes in the zygotes produced; and (3) from the genotypes of the
zygotes to the gene frequency in the progeny generation. These steps,
in detail, are as follows:
1. Let the parent generation have gene and genotype frequencies
as follows:

                 Genes          Genotypes
                A1    A2      A1A1   A1A2   A2A2
Frequencies     p     q        P      H      Q

Two sorts of gametes are produced, those bearing A1 and those
bearing A2. The frequencies of these gametic types are the same as the
gene frequencies, p and q, in the generation producing them, for this
reason: A1A1 individuals produce only A1 gametes, and A1A2
individuals produce equal numbers of A1 and A2 gametes (provided, of
course, there is no anomaly of segregation). So the frequency of A1
gametes produced by the whole population is $P + \frac{1}{2}H$, which by
equation 1.1 is the gene frequency of A1.
2. Random mating between individuals is equivalent to random
union among their gametes. We can think of a pool of gametes to
which all the individuals contribute equally; zygotes are formed by
random union between pairs of gametes from the pool. The genotype
frequencies among the zygotes are then the products of the frequencies
of the gametic types that unite to produce them. The genotype
frequencies among the progeny produced by random mating can
therefore be determined simply by multiplying the frequencies of the
gametic types as shown in the following table:

                                  Female gametes and their frequencies
                                     A1 (p)            A2 (q)
Male gametes and       A1 (p)      A1A1: $p^2$       A1A2: $pq$
their frequencies      A2 (q)      A1A2: $pq$        A2A2: $q^2$

We need not distinguish the union of A1 eggs with A2 sperms from
that of A2 eggs with A1 sperms; so the genotype frequencies of the
zygotes are

Genotype      A1A1      A1A2      A2A2
Frequency     $p^2$     $2pq$     $q^2$          (1.2)

Note that these genotype frequencies depend only on the gene
frequency in the parents, and not on the parental genotype frequencies,
provided the parents mate at random.
3. Finally we use these genotype frequencies to determine the
gene frequency in the offspring generation. Applying equation 1.1
we find the gene frequency of A1 is $p^2 + \frac{1}{2}(2pq) = p(p + q) = p$, which is
the same as in the parent generation.
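
The three steps above can be sketched in Python as follows. The function `one_generation` is a hypothetical helper, and the parental genotype frequencies (0.3, 0.6, 0.1) are those of the hypothetical sample used earlier in the chapter, chosen only for illustration.

```python
def one_generation(P, H, Q):
    """Follow one generation of random mating in a large population.

    Assumes no migration, mutation or selection, as in the text.
    Returns the gametic frequencies, the zygote genotype frequencies,
    and the gene frequency of A1 among the progeny.
    """
    # Step 1: gametic frequencies equal the parental gene frequencies (eq. 1.1).
    p = P + H / 2
    q = Q + H / 2
    # Step 2: random union of gametes gives the zygote frequencies (eq. 1.2).
    zygotes = (p * p, 2 * p * q, q * q)
    # Step 3: gene frequency of A1 in the progeny, again by eq. 1.1.
    p_next = zygotes[0] + zygotes[1] / 2
    return (p, q), zygotes, p_next


parents = (0.3, 0.6, 0.1)  # hypothetical frequencies of A1A1, A1A2, A2A2
(p, q), progeny, p_next = one_generation(*parents)
print("gametes:", p, q)
print("progeny genotypes:", progeny)
print("gene frequency of A1 in progeny:", p_next)

# A further generation of random mating leaves the genotype frequencies unchanged.
_, grand_progeny, _ = one_generation(*progeny)
print("next generation:", grand_progeny)
```

The parents in this sketch are not themselves in the proportions of equation 1.2, yet the gene frequency is unchanged in the progeny, and the progeny genotype frequencies repeat themselves in the following generation.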
The properties of a population with respect to a single locus,
expressed in the Hardy-Weinberg law and demonstrated above, are
these:
(1) A large random-mating population, in the absence of
migration, mutation, and selection, is stable with respect to both gene and
genotype frequencies: there is no inherent tendency for its genetic
properties to change from generation to generation.
(2) The genotype frequencies in the progeny produced by random
mating among the parents are determined solely by the gene
frequencies among the parents. Consequently:
(a) a population in Hardy-Weinberg equilibrium has the
relationship expressed in equation 1.2 between the gene and
genotype frequencies in any one generation. And,
(b) these Hardy-Weinberg genotype frequencies are established
by one generation of random mating, irrespective of the
genotype frequencies among the parents.
[Figure: curves for the genotypes A1A1, A1A2, and A2A2, plotted against the gene frequency of A2.]
Fig. 1.1. Relationship between genotype frequencies and gene
frequency for two alleles in a population in Hardy-Weinberg
equilibrium.
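
The curves of Fig. 1.1 can be reproduced numerically from equation 1.2. The following Python sketch tabulates the three genotype frequencies for a range of values of q, the gene frequency of A2; the function name `hardy_weinberg` is an illustrative assumption.

```python
def hardy_weinberg(q):
    """Genotype frequencies (A1A1, A1A2, A2A2) for gene frequency q of A2."""
    p = 1.0 - q
    return p * p, 2.0 * p * q, q * q


# Tabulate at q = 0.0, 0.1, ..., 1.0, matching the horizontal axis of the figure.
rows = [(i / 10, *hardy_weinberg(i / 10)) for i in range(11)]

print("  q    A1A1   A1A2   A2A2")
for q, f11, f12, f22 in rows:
    print(f"{q:.1f}   {f11:.3f}  {f12:.3f}  {f22:.3f}")
```

On this tabulation the heterozygote frequency never exceeds 0.5, a value reached only when the two alleles are equally frequent.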
We shall later give another proof of the Hardy-Weinberg law by
a different method. Let us now first illustrate the properties of a
population in Hardy-Weinberg equilibrium, and then show to what
uses these properties can be put. The relationship between gene
frequency and genotype frequencies expressed in equation 1.2 is
illustrated graphically in Fig. 1.1, which shows how the frequencies
of the three genotypes for a locus with two alleles depend on the gene
frequency. As an example of the Hardy-Weinberg genotype
frequencies we shall take again the MN blood groups in man.
Example 1.3. Race and Sanger (1954) quote the following frequencies
(%) of the MN blood groups in a sample of 1,279 English people. From
the observed genotype (i.e. blood group) frequencies we can calculate the
gene frequencies by equation 1.1. These gene frequencies are shown on
the right.

                 Blood group (%)                  Gene
              M         MN        N           M          N
Observed     28.38     49.57     22.05       0.53165    0.46835
Expected     28.265    49.800    21.935

Now from the gene frequencies we can calculate the expected
Hardy-Weinberg genotype frequencies by equation 1.2, and we find that the
observed frequencies agree very closely with those expected for a
population in Hardy-Weinberg equilibrium.
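
The expected frequencies of Example 1.3 can be recomputed as a check. A minimal Python sketch, in which the variable names are illustrative and the observed percentages are those quoted in the table:

```python
# Observed blood group frequencies (%) from Example 1.3.
observed = {"M": 28.38, "MN": 49.57, "N": 22.05}

# Equation 1.1: gene frequencies as proportions.
p = (observed["M"] + observed["MN"] / 2) / 100  # M gene
q = (observed["N"] + observed["MN"] / 2) / 100  # N gene

# Equation 1.2: expected Hardy-Weinberg genotype frequencies (%).
expected = {"M": 100 * p * p, "MN": 100 * 2 * p * q, "N": 100 * q * q}

print(f"gene frequencies: M = {p:.5f}, N = {q:.5f}")
for group in ("M", "MN", "N"):
    print(f"{group:2s}  observed {observed[group]:6.2f}   expected {expected[group]:7.3f}")
```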
Comparison of observed with expected genotype frequencies may
be regarded as a test of the fulfilment of the conditions on which the
Hardy-Weinberg equilibrium depends. These conditions are:
random mating among the parents of the individuals observed, equal
fertility of the different genotypes among the parents, and equal
viability of the different genotypes among the offspring from
fertilisation up to the time of observation. In addition, the classification of
individuals as to genotype must have been correctly made. The
blood group frequencies in Example 1.3 give no cause to doubt the
fulfilment of these conditions. It should be noted, however, that a
difference of fertility or of viability between the genotypes, though it
can be detected, cannot be measured from a comparison of observed
with expected frequencies (Wallace, 1958). The expected frequencies
are based on the observed gene frequencies after the differences of
fertility or viability have had their effect. In order to measure these effects
we should have to know the original gene or genotype frequencies.
At the beginning of the chapter we saw, in equation 1.1, how the
gene frequencies among a group of individuals can be determined
from their genotype frequencies; but for this it was necessary to know
the frequencies of all three genotypes. Consequently the relationship
in equation 1.1 cannot be applied to the case of a recessive allele,
when the heterozygote is indistinguishable from the dominant
homozygote. Consideration of the population as a breeding unit, however,
shows that when the conditions for Hardy-Weinberg equilibrium
hold, only the frequency of one of the homozygous genotypes is
needed to determine the gene frequency, and the difficulty of recessive
genes is thus overcome. Let $A_2$, for example, be a recessive gene
with frequency $q$; then the frequency of $A_2A_2$ homozygotes is $q^2$. In
other words the gene frequency is the square root of the homozygote
frequency. Thus we can determine the gene frequency of recessive
abnormalities, provided that selective mortality of the homozygote
can be discounted or allowed for. But we can go further, and this is
often the more important point: we can also determine the frequency
of heterozygotes, or "carriers," of recessive abnormalities, which is
$2q(1 - q)$. It comes as a surprise to most people to discover how
common heterozygotes of a rare recessive abnormality are.
Example 1.4. Albinism in man is probably determined by a single
recessive autosomal gene, and the frequency of albinos is about 1/20,000
in human populations (see Stern, 1949). If $q$ is the frequency of the albino
gene, then $q^2 = 1/20,000$, and $q = 1/141$, if selective mortality is disregarded.
The frequency of heterozygotes is then $2q(1 - q)$, which works out to about
1/70. So about one person in seventy is a heterozygote for albinism,
though only one in twenty thousand is a homozygote.
Example 1.5. There is a recessive autosomal gene in the Ayrshire
breed of cattle in Britain which causes dropsy in the newborn calf. The
frequency of this abnormality is about 1 in 300 births (Donald, Deas, and
Wilson, 1952). A means of reducing the frequency of the defect would
obviously be the avoidance of the use of bulls known or thought to be
heterozygous. We might first want to know what proportion of bulls
would be expected to be heterozygotes. In this case the conditions for
Hardy-Weinberg equilibrium are certainly not all fulfilled: the breed is not
a single random-breeding population, and the abnormal homozygotes are
not fully viable up to the time of birth. So we can only get a rough idea of
the frequency of heterozygotes by assuming the observations to refer to a
population in Hardy-Weinberg equilibrium. On this assumption,
$q^2 = 0.0033$, so $q = 0.057$; the frequency of heterozygotes is $2q(1 - q) = 0.11$.
So we should expect, very approximately, one bull in ten to be a
heterozygote.
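
The calculation shared by Examples 1.4 and 1.5 can be sketched as follows. This is an illustrative sketch only: the function name `carrier_frequency` is an assumption, and the result is only as good as the assumption of Hardy-Weinberg equilibrium that the text makes.

```python
import math


def carrier_frequency(homozygote_freq):
    """Return (q, 2q(1 - q)) from the frequency of recessive homozygotes,
    assuming Hardy-Weinberg equilibrium so that that frequency equals q^2."""
    q = math.sqrt(homozygote_freq)
    return q, 2 * q * (1 - q)


# Example 1.4: albinism, about 1 in 20,000.
q_albino, het_albino = carrier_frequency(1 / 20000)
print("albinism: q is about 1 in", round(1 / q_albino),
      "; carriers about 1 in", round(1 / het_albino))

# Example 1.5: dropsy in Ayrshire calves, q^2 taken as 0.0033 as in the text.
q_dropsy, het_dropsy = carrier_frequency(0.0033)
print("dropsy: q =", round(q_dropsy, 3), "; carriers =", round(het_dropsy, 2))
```

The unrounded carrier frequency for albinism comes out nearer 1 in 71, which the text rounds to "about 1/70".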
Mating frequencies and another proof of the Hardy-Weinberg
law. Let us now look more closely into the breeding
structure of a random-mating population, distinguishing the types of
mating according to the genotypes of the pairs, and seeing what are
the genotype frequencies among the progenies of the different types
of mating. This provides a general method for relating genotype
frequencies in successive generations, which we shall use in a later
chapter. It also provides another proof of the Hardy-Weinberg law;
a proof more cumbersome than that already given but showing more
clearly how the Hardy-Weinberg frequencies arise from the Mendelian
laws of segregation. The procedure is to obtain first the
frequencies of all possible mating types according to the frequencies
of the genotypes among the parents, and then to obtain the fre
quencies of genotypes among the progeny of each type of mating
according to the Mendelian ratios.
Consider a locus with two alleles, and let the frequencies of genes
and genotypes in the parents be, as before,
                      Genes                   Genotypes
                  $A_1$    $A_2$      $A_1A_1$   $A_1A_2$   $A_2A_2$
    Frequencies   $p$      $q$        $P$        $H$        $Q$
There are altogether nine types of mating, and their frequencies
when mating is random are found thus:

                                   Genotype and frequency of male parent
                                   $A_1A_1$     $A_1A_2$     $A_2A_2$
                                   $P$          $H$          $Q$
    Genotype and    $A_1A_1$  $P$     $P^2$        $PH$         $PQ$
    frequency of    $A_1A_2$  $H$     $PH$         $H^2$        $HQ$
    female parent   $A_2A_2$  $Q$     $PQ$         $HQ$         $Q^2$

Since the sex of the parent is irrelevant in this context, some of the
types of mating are equivalent, and the number of different types
reduces to six. By summation of the frequencies of equivalent types,
we obtain the frequencies of mating types in the first two columns of
Table 1.1. Now we have to consider the genotypes of offspring
produced by each type of mating, and find the frequency of each
genotype in the total progeny, assuming, of course, that all types of mating
are equally fertile and all genotypes equally viable. This is done in
the right hand side of Table 1.1. Thus, for example, matings of the
type $A_1A_1 \times A_1A_1$ produce only $A_1A_1$ offspring. So, of all the $A_1A_1$
genotypes in the total progeny, a proportion $P^2$ come from this type
of mating. Similarly a quarter of the offspring of $A_1A_2 \times A_1A_2$
matings are $A_1A_1$. So this type of mating, which has a frequency of
$H^2$, contributes a proportion $\tfrac{1}{4}H^2$ of the total $A_1A_1$ progeny. To find
the frequency of each genotype in the total progeny we add the
frequencies contributed by each type of mating. The sums, after
simplification, are given at the foot of the table, and from the identity
given in equation 1.1 they are seen to be equal to $p^2$, $2pq$, and $q^2$.
These are the Hardy-Weinberg equilibrium frequencies, and we
have shown that they are attained by one generation of random mating,
irrespective of the genotype frequencies among the parents.

Table 1.1

            Mating                       Genotype and frequency of progeny
    Type                    Frequency    $A_1A_1$                  $A_1A_2$                                      $A_2A_2$
    $A_1A_1 \times A_1A_1$  $P^2$        $P^2$                     -                                             -
    $A_1A_1 \times A_1A_2$  $2PH$        $PH$                      $PH$                                          -
    $A_1A_1 \times A_2A_2$  $2PQ$        -                         $2PQ$                                         -
    $A_1A_2 \times A_1A_2$  $H^2$        $\tfrac{1}{4}H^2$         $\tfrac{1}{2}H^2$                             $\tfrac{1}{4}H^2$
    $A_1A_2 \times A_2A_2$  $2HQ$        -                         $HQ$                                          $HQ$
    $A_2A_2 \times A_2A_2$  $Q^2$        -                         -                                             $Q^2$
    Sums                                 $(P + \tfrac{1}{2}H)^2$   $2(P + \tfrac{1}{2}H)(Q + \tfrac{1}{2}H)$     $(Q + \tfrac{1}{2}H)^2$
                                         $= p^2$                   $= 2pq$                                       $= q^2$
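
The proof by mating types can also be checked numerically. The sketch below is an illustration under the same assumptions as the text (random mating, equal fertility, equal viability); the function name `progeny_frequencies` and the starting frequencies are arbitrary choices, not taken from the book.

```python
from itertools import product


def progeny_frequencies(P, H, Q):
    """Sum progeny genotype frequencies over all nine types of mating."""
    parents = {"A1A1": P, "A1A2": H, "A2A2": Q}
    # Mendelian segregation: probability that a parent transmits A1.
    transmits_a1 = {"A1A1": 1.0, "A1A2": 0.5, "A2A2": 0.0}
    a1a1 = a1a2 = a2a2 = 0.0
    for female, male in product(parents, repeat=2):
        mating_freq = parents[female] * parents[male]  # random mating
        f, m = transmits_a1[female], transmits_a1[male]
        a1a1 += mating_freq * f * m
        a1a2 += mating_freq * (f * (1 - m) + (1 - f) * m)
        a2a2 += mating_freq * (1 - f) * (1 - m)
    return a1a1, a1a2, a2a2


# Parents deliberately far from Hardy-Weinberg proportions.
P, H, Q = 0.5, 0.1, 0.4
p = P + 0.5 * H  # equation 1.1
q = Q + 0.5 * H
progeny = progeny_frequencies(P, H, Q)
print("progeny:", [round(x, 4) for x in progeny])
print("p^2, 2pq, q^2:", [round(x, 4) for x in (p * p, 2 * p * q, q * q)])
```

Whatever values of P, H, and Q are supplied, the two printed rows should agree, which is the content of the Hardy-Weinberg law.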
Multiple alleles. Restriction of the treatment to two alleles at a
locus suffices for many purposes. If we are interested in one
particular allele, as often happens, then all the other alleles at the
locus can be treated as one. Formulation of the situation in terms of
two alleles is therefore often possible even if there are in fact more
than two. If we are interested in more than one allele we can still, if
we like, treat the situation as a two-allele system by considering each
allele in turn and lumping the others together. But the treatment can
be easily extended to cover more than two alleles, and no new
principle is introduced. In general, if $q_1$ and $q_2$ are the frequencies of any
two alleles, $A_1$ and $A_2$, of a multiple series, then the genotype
frequencies under Hardy-Weinberg equilibrium are as follows (Li,

    Genotype:     $A_1A_1$    $A_1A_2$     $A_2A_2$
    Frequency:    $q_1^2$     $2q_1q_2$    $q_2^2$
These frequencies are also attained by one generation of random
mating. This can readily be seen by reducing the situation to a
two-allele system, and considering each allele in turn. Or it can be
proved, though somewhat more laboriously, by the method explained
above for the two-allele system.
Example 1.6. The ABO blood groups in man are determined by a
series of allelic genes. For the purpose of illustration we shall recognise
three alleles, A, B, and O, and show how the gene frequencies can be
estimated from the blood group frequencies. Let the frequencies of the
A, B, and O genes be $p$, $q$, and $r$ respectively, so that $p + q + r = 1$. The
following table shows (1) the genotypes, (2) the blood groups (i.e.
phenotypes) corresponding to the different genotypes, (3) the expected
frequencies of the blood groups in terms of $p$, $q$, and $r$, on the assumption of
Hardy-Weinberg equilibrium, (4) observed frequencies of blood groups in
a sample of 190,177 United Kingdom airmen, quoted by Race and Sanger
(1954).

    Genotype         AA  AO         BB  BO         OO        AB
    Blood group      A              B              O         AB
    Frequency (%)
      expected       $p^2 + 2pr$    $q^2 + 2qr$    $r^2$     $2pq$
      observed       41.716         8.560          46.684    3.040

Calculation of the gene frequencies is rather more complicated than with
two alleles. The following is the simplest method: a more refined method
is described by Ceppellini et al. (1955). First, the frequency of the O gene
is simply the square root of the frequency of the O group. Next it will be
seen that the sum of the frequencies of the B and O groups is
$q^2 + 2qr + r^2 = (q + r)^2 = (1 - p)^2$. So $p = 1 - \sqrt{B + O}$, where B and O
are the frequencies of the blood groups B and O. In the same way
$q = 1 - \sqrt{A + O}$, and we have seen that $r = \sqrt{O}$. This method gives
the following gene frequencies in the sample:

    A gene:  $p$ = 0.2567
    B gene:  $q$ = 0.0598
    O gene:  $r$ = 0.6833
    Total        0.9998

As a result of sampling errors these frequencies do not add up exactly to
unity, but we shall not trouble to make an adjustment for so small a
discrepancy. We may now calculate the expected frequency of the AB blood
group, which has not been used in arriving at these gene frequencies, and
see whether the observed frequency agrees satisfactorily. The expected
frequency of AB from estimates of $p$ and $q$ is 3.070 per cent, which is in
good agreement with the observed frequency of 3.040 per cent. ($\chi^2 = 0.7$,
with 1 d.f., calculated by the method given by Race and Sanger.)
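
The "simplest method" of Example 1.6 translates directly into code. This is a sketch under the example's own assumption of Hardy-Weinberg equilibrium; the function name `abo_gene_frequencies` is illustrative, and the more refined method of Ceppellini et al. is not attempted here.

```python
import math


def abo_gene_frequencies(A, B, O):
    """Estimate p (A gene), q (B gene), r (O gene) from blood-group
    frequencies given as proportions, by the square-root method."""
    p = 1 - math.sqrt(B + O)  # since B + O = (q + r)^2 = (1 - p)^2
    q = 1 - math.sqrt(A + O)  # since A + O = (p + r)^2 = (1 - q)^2
    r = math.sqrt(O)          # since O = r^2
    return p, q, r


# Observed frequencies from Example 1.6, converted from per cent.
p, q, r = abo_gene_frequencies(A=0.41716, B=0.08560, O=0.46684)
expected_ab = 100 * 2 * p * q  # AB was not used in the estimation

print("p, q, r:", round(p, 4), round(q, 4), round(r, 4))
print("total:", round(p + q + r, 4))
print("expected AB (%):", round(expected_ab, 2))
```

Because the three estimates are obtained separately, nothing forces them to sum to exactly 1, which is the small discrepancy the text mentions.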
Sex-linked genes. With sex-linked genes the situation is rather
more complex than with autosomal genes. The relationship between
gene frequency and genotype frequency in the homogametic sex is
the same as with an autosomal gene, but the heterogametic sex has
only two genotypes and each individual carries only one gene instead
of two. For this reason two-thirds of the sex-linked genes in the
population are carried by the homogametic sex and one-third by the
heterogametic. For the sake of brevity we shall now refer to the
heterogametic sex as male. Consider two alleles, $A_1$ and $A_2$, with
frequencies $p$ and $q$, and let the genotypic frequencies be as follows:

                   Females                         Males
        $A_1A_1$   $A_1A_2$   $A_2A_2$        $A_1$   $A_2$
        $P$        $H$        $Q$             $R$     $S$

The frequency of $A_1$ among the females is then $p_f = P + \tfrac{1}{2}H$, and the
frequency among the males is $p_m = R$. The frequency of $A_1$ in the
whole population is

    $\bar{p} = \tfrac{2}{3}p_f + \tfrac{1}{3}p_m = \tfrac{1}{3}(2p_f + p_m)$        (1.3)
    $\bar{p} = \tfrac{1}{3}(2P + H + R)$        (1.4)
Now, if the gene frequencies among males and among females are
different, the population is not in equilibrium. The gene frequency
in the population as a whole does not change, but its distribution
between the two sexes oscillates as the population approaches
equilibrium. The reason for this can be seen from the following
considerations. Males get their sex-linked genes only from their
mothers; therefore $p_m$ is equal to $p_f$ in the previous generation.
Females get their sex-linked genes equally from both parents;
therefore $p_f$ is equal to the mean of $p_m$ and $p_f$ in the previous generation.
Using primes to indicate the previous generation, we have

    $p_m = p'_f$
    $p_f = \tfrac{1}{2}(p'_m + p'_f)$
The difference between the frequencies in the two sexes is

    $p_f - p_m = \tfrac{1}{2}(p'_m + p'_f) - p'_f$
    $p_f - p_m = -\tfrac{1}{2}(p'_f - p'_m)$

i.e. half the difference in the previous generation, but in the other
direction. Therefore the distribution of the genes between the two
sexes oscillates, but the difference is halved in successive generations
and the population rapidly approaches an equilibrium in which the
[Fig. 1.2: gene frequency plotted against generations, with separate
curves for females, for males, and for females and males combined.]
Fig. 1.2. Approach to equilibrium under random mating for a
sex-linked gene, showing the gene frequency among females,
among males, and in the two sexes combined. The population
starts with females all of one sort ($q_f = 1$), and males all of the
other sort ($q_m = 0$).
frequencies in the two sexes are equal. The situation is illustrated
in Fig. 1.2, which shows the consequences of mixing females of one
sort (all $A_1A_1$) with males of another sort (all $A_2$) and letting them
breed at random.
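
The oscillating approach to equilibrium can be traced by iterating the two recurrence relations. The sketch below is illustrative: the function name `sex_linked_approach` is an assumption, and the starting values follow the situation described for Fig. 1.2 (females all $A_1A_1$, males all $A_2$).

```python
def sex_linked_approach(p_f, p_m, generations):
    """Iterate p_m = p'_f and p_f = (p'_m + p'_f) / 2, returning the
    list of (p_f, p_m) pairs from generation 0 onwards."""
    history = [(p_f, p_m)]
    for _ in range(generations):
        # Tuple assignment evaluates the right-hand side with the
        # previous generation's values before either name is rebound.
        p_f, p_m = 0.5 * (p_m + p_f), p_f
        history.append((p_f, p_m))
    return history


history = sex_linked_approach(p_f=1.0, p_m=0.0, generations=6)
for t, (p_f, p_m) in enumerate(history):
    overall = (2 * p_f + p_m) / 3  # equation 1.3
    print(t, round(p_f, 4), round(p_m, 4), round(p_f - p_m, 4), round(overall, 4))
```

In the printed columns the difference between the sexes should change sign and halve each generation, while the frequency in the whole population stays at 2/3.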
Example 1.7. Searle (1949) gives the frequencies of a number of
genes in a sample of cats in London. The animals examined were sent to
clinics for destruction; they were therefore not necessarily a random
sample. Among the genes studied was "yellow" (y), which is sex-linked
and for which all three genotypes in females are recognisable, the
heterozygote being tortoiseshell. The data were used to test for agreement with
Hardy-Weinberg equilibrium. The numbers observed in each phenotypic
class are shown in table (i). We may first see whether the gene frequency
is equal in the two sexes. The numbers of genes counted, and the
frequency ($q$) of the gene y, in each sex are as given in table (ii).

    (i)                          Females                  Males
                           ++      +y      yy          +        y
    Numbers observed      277      54       7         311       42
    Numbers expected      269.6    64.5     3.9       315.2     37.8

    (ii)                   +        y       $q_y$
    in females            608       68      0.101
    in males              311       42      0.119
    total                 919      110      0.107

The $\chi^2$ testing the difference in $q$ between the sexes is 0.4, which is quite
insignificant. There is therefore no reason to think the population is not
in equilibrium, and we may take the estimate of gene frequency from both
sexes combined: it is $q = 0.107$. From this estimate of $q$ the expected
numbers in the different phenotypic classes are calculated; they are shown
in table (i). Only the females are relevant to the test of random mating.
The $\chi^2$ testing agreement between observed and expected numbers in
females is 4.4, with 2 degrees of freedom. This has a probability of 0.1 and
cannot be judged significant. The data are therefore compatible with the
Hardy-Weinberg equilibrium, in spite of the deficiency of tortoiseshell
females. If the deficiency of heterozygous females were real we might
attribute it to the method of sampling and infer that the tortoiseshells
were sent for destruction less often than the other colours, on account of
human preference.
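
The test of Example 1.7 can be retraced as follows. This is a sketch rather than the author's own working: it uses the unrounded gene frequency, so the resulting $\chi^2$ may differ slightly in the second figure from the value quoted in the text, and the probability uses the closed form $e^{-\chi^2/2}$, which holds for 2 degrees of freedom only.

```python
import math

# Observed numbers from table (i) of Example 1.7.
females = {"++": 277, "+y": 54, "yy": 7}
males = {"+": 311, "y": 42}

n_f = sum(females.values())
n_m = sum(males.values())

# Count y genes: one per tortoiseshell female, two per yellow female,
# one per yellow male. Females carry two genes each, males one.
y_genes = females["+y"] + 2 * females["yy"] + males["y"]
q = y_genes / (2 * n_f + n_m)
p = 1 - q

# Expected numbers among females under Hardy-Weinberg equilibrium.
expected_f = {"++": p * p * n_f, "+y": 2 * p * q * n_f, "yy": q * q * n_f}

chi2 = sum((females[g] - expected_f[g]) ** 2 / expected_f[g] for g in females)
prob = math.exp(-chi2 / 2)  # upper-tail probability for 2 d.f., as used in the text

print("q =", round(q, 3))
print("expected females:", {g: round(n, 1) for g, n in expected_f.items()})
print("chi-square =", round(chi2, 2), "probability =", round(prob, 2))
```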
More than one locus. The attainment of the equilibrium in
genotype frequencies after one generation of random mating is true
of all autosomal loci considered separately. But it is not true of the
genotypes with respect to two or more loci considered jointly. To
illustrate the point, consider a population made up of equal numbers
of $A_1A_1B_1B_1$ and $A_2A_2B_2B_2$ individuals, of both sexes. The gene
frequency at both loci is then $\tfrac{1}{2}$, and if the individuals mated at
random only three out of the nine genotypes would appear in the
progeny; the genotype $A_1A_1B_2B_2$, for example, would be absent though
its frequency in an equilibrium population would be $\tfrac{1}{16}$. The missing
genotypes appear in subsequent generations, but not immediately
at their equilibrium frequencies. The approach to equilibrium is
described by Li (1955a) and here we shall only outline the
conclusions.
Consider two loci each with two alleles, and let the frequencies of
the four types of gamete formed by the initial population be as
follows:

    type of gamete    $A_1B_1$   $A_1B_2$   $A_2B_1$   $A_2B_2$
    frequency         $r$        $s$        $t$        $u$

Then if the population is in equilibrium, $ru = st$, as may be seen by
writing the gametic frequencies in terms of the gene frequencies.
The difference, $ru - st$, gives a measure of the extent of the departure
from equilibrium. This difference is halved in each successive
generation of random mating, and the approach to equilibrium is thus fairly
rapid (see Fig. 1.3). If, however, more than two loci are to be
considered jointly the approach to equilibrium becomes progressively
slower as the number of loci increases.
Linked loci. If two loci are linked the approach to equilibrium
under random mating is slower in proportion to the closeness of the
linkage. When equilibrium is reached the coupling and repulsion
phases are equally frequent; the frequencies of the gametic types then
depend only on the gene frequencies and not at all on the linkage. It
is easy to suppose that association between two characters, as for
example between hair colour and eye colour, is evidence of linkage
between the genes concerned. Association between characters,
however, is more often evidence of pleiotropy than of linkage. Link
age can give rise to association only after a mixture of populations,
the length of time that the association persists depending on the
closeness of the linkage.
The approach to equilibrium after the mixture of populations
differing in respect of the genes at two linked loci can be described in
the manner of the preceding section. The departure from
equilibrium, $d$, is expressed as $d = ru - st$, where $ru$ is the frequency of
coupling heterozygotes and $st$ that of repulsion heterozygotes. If $c$
is the frequency of recombination between the two loci then the
difference, $d$, at generation $t$ is

    $d_t = (1 - c)\,d_{t-1}$

Thus if, for example, there is 25 per cent recombination the difference
is reduced by one quarter in each generation; or if there is 10 per cent
recombination the difference is reduced by 10 per cent in each
[Fig. 1.3: difference of frequency, $d$, plotted against generations,
with one graph for each recombination frequency.]
Fig. 1.3. Approach to equilibrium under random mating of two
loci, considered jointly. The graphs show the difference of
frequency ($d$) between coupling and repulsion heterozygotes in
successive generations, starting with all individuals repulsion
heterozygotes. The five graphs refer to different degrees of linkage
between the two loci, as indicated by the recombination frequency
shown alongside each graph. The graph marked 0.5 refers to
unlinked loci.
generation. Closely linked loci will therefore continue for a consider
able time to show the effects of a past mixture of populations. The
approach to equality of coupling and repulsion phases with different
degrees of linkage is illustrated in Fig. 1.3.
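
The recurrence $d_t = (1 - c)\,d_{t-1}$ implies $d_t = (1 - c)^t d_0$, and a short sketch shows how strongly the rate depends on the recombination frequency. The function names and the 1 per cent threshold below are illustrative assumptions, not values from the text.

```python
def disequilibrium(d0, c, generations):
    """Departure from equilibrium in generations 0..generations,
    applying d_t = (1 - c) * d_(t-1) repeatedly."""
    values = [d0]
    for _ in range(generations):
        values.append(values[-1] * (1 - c))
    return values


def generations_until(c, fraction):
    """Smallest number of generations after which d has fallen to
    `fraction` (or less) of its starting value."""
    t, remaining = 0, 1.0
    while remaining > fraction:
        remaining *= 1 - c
        t += 1
    return t


print(disequilibrium(d0=0.25, c=0.5, generations=4))
for c in (0.5, 0.1):
    print("c =", c, ": generations to fall to 1 per cent:", generations_until(c, 0.01))
```

With unlinked loci (c = 0.5) the departure is halved every generation; with 10 per cent recombination it takes several times as many generations to reach the same level.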
Assortative mating. Assortative mating is a form of non-random
mating, but this is the most convenient place to mention it. If the
mated pairs tend to be of the same genotype more often than would
occur by chance this is called positive assortative mating, and if less
often it is called negative assortative (or sometimes disassortative)
mating. The consequences are described by Wright (1921) and sum
marised by Li (1955^) and will be only briefly outlined here. Posi
tive assortative mating is of some importance in human populations,
where it occurs with respect to intelligence and other mental charac
ters. These however are not single gene differences such as can be
discussed in the present context. The consequences of assortative
mating with a single locus can be deduced from Table 1.1 by
appropriate modification of the frequencies of the types of mating to allow
for the increased frequency of matings between like genotypes. The
effect on the genotype frequencies among the progeny is to increase
the frequencies of homozygotes and reduce that of heterozygotes.
In effect the population becomes partially subdivided into two
groups, mating taking place more frequently within than between
the groups.
CHAPTER 2
CHANGES OF GENE FREQUENCY
We have seen that a large random-mating population is stable with
respect to gene frequencies and genotype frequencies, in the absence
of agencies tending to change its genetic properties. We can now
proceed to a study of the agencies through which changes of gene
frequency, and consequently of genotype frequencies, are brought
about. There are two sorts of process: systematic processes, which
tend to change the gene frequency in a manner predictable both in
amount and in direction; and the dispersive process, which arises in
small populations from the effects of sampling, and is predictable in
amount but not in direction. In this chapter we are concerned only
with the systematic processes, and we shall consider only large random
mating populations in order to exclude the dispersive process from
the picture. There are three systematic processes: migration, mutation,
and selection. We shall study these separately at first, assuming that
only one process is operating at a time, and then we shall see how the
different processes interact.
Migration
The effect of migration is very simply dealt with and need not con
cern us much here, though we shall have more to say about it later,
in connexion with small populations. Let us suppose that a large
population consists of a proportion, $m$, of new immigrants in each
generation, the remainder, $1 - m$, being natives. Let the frequency
of a certain gene be $q_m$ among the immigrants and $q_0$ among the
natives. Then the frequency of the gene in the mixed population, $q_1$,
will be

    $q_1 = mq_m + (1 - m)q_0$        (2.1)
The change of gene frequency, $\Delta q$, brought about by one generation
of immigration is the difference between the frequency before
immigration and the frequency after immigration. Therefore

    $\Delta q = q_1 - q_0 = m(q_m - q_0)$        (2.2)
Thus the rate of change of gene frequency in a population subject to
immigration depends, as must be obvious, on the immigration rate
and on the difference of gene frequency between immigrants and
natives.
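
Equations 2.1 and 2.2 can be checked against each other with a small sketch. The function name `migrate` and the numerical values are illustrative assumptions only.

```python
def migrate(q0, qm, m):
    """Return (q1, delta_q) after one generation of immigration, where
    q0 is the frequency among natives, qm among immigrants, and m is
    the proportion of immigrants (equation 2.1)."""
    q1 = m * qm + (1 - m) * q0
    return q1, q1 - q0


q0, qm, m = 0.2, 0.6, 0.1
q1, delta_q = migrate(q0, qm, m)
print("q1 =", round(q1, 4))
print("delta q =", round(delta_q, 4), "; m(qm - q0) =", round(m * (qm - q0), 4))
```

The two printed values of the change should coincide, as equation 2.2 requires.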
Mutation
The effect of mutation on the genetic properties of the population
differs according to whether we are concerned with a mutational
event so rare as to be virtually unique, or with a mutational step that
recurs repeatedly. The first produces no permanent change, whereas
the second does.
Non-recurrent mutation. Consider first a mutational event
that gives rise to just one representative of the mutated gene or
chromosome in the whole population. This sort of mutation is of
little importance as a cause of change of gene frequency, because the
product of a unique mutation has an infinitely small chance of
surviving in a large population, unless it has a selective advantage. This
can be seen from the following consideration. As a result of the single
mutation there will be one $A_1A_2$ individual in a population all the
rest of which is $A_1A_1$. The frequency of the mutated gene, $A_2$, is
therefore extremely low. Now according to the Hardy-Weinberg
equilibrium the gene frequency should not change in subsequent
generations. But with this situation we can no longer ignore the
variation of gene frequency due to sampling. With a gene at very low
frequency the sampling variation, even though very small, may take
the frequency to zero, and the gene will then be lost from the popu
lation. Though at each generation a single gene has an equal chance
of surviving or being lost, the loss is permanent and the probability
of the gene still being present decreases with the passage of genera
tions (see Li, 1955a). The conclusion, therefore, is that a unique
mutation without selective advantage cannot produce a permanent
change in the population.
Recurrent mutation. It is with the second type of mutation —
recurrent mutation — that we are concerned as an agent for causing
change of gene frequency. Each mutational event recurs regularly
with characteristic frequency, and in a large population the frequency
of a mutant gene is never so low that complete loss can occur from
sampling. We have, then, to find out what is the effect of this "pres
sure" of mutation on the gene frequency in the population.
Suppose gene $A_1$ mutates to $A_2$ with a frequency $u$ per generation.
($u$ is the proportion of all $A_1$ genes that mutate to $A_2$ between one
generation and the next.) If the frequency of $A_1$ in one generation is
$p_0$ the frequency of newly mutated $A_2$ genes in the next generation is
$up_0$. So the new gene frequency of $A_1$ is $p_0 - up_0$, and the change of
gene frequency is $-up_0$. Now consider what happens when the genes
mutate in both directions. Suppose for simplicity that there are only
two alleles, $A_1$ and $A_2$, with initial frequencies $p_0$ and $q_0$. $A_1$ mutates
to $A_2$ at a rate $u$ per generation, and $A_2$ mutates to $A_1$ at a rate $v$.
Then after one generation there is a gain of $A_2$ genes equal to $up_0$ due
to mutation in one direction, and a loss equal to $vq_0$ due to mutation
in the other direction. Stated in symbols, we have the situation:

    Mutation rate               $A_1 \underset{v}{\overset{u}{\rightleftharpoons}} A_2$
    Initial gene frequencies    $p_0$        $q_0$

Then the change of gene frequency in one generation is

    $\Delta q = up_0 - vq_0$

It is easy to see that this situation leads to an equilibrium in gene
frequency at which no further change takes place, because if the
frequency of one allele increases fewer of the other are left to mutate
in that direction and more are available to mutate in the other
direction. The point of equilibrium can be found by equating the change
of frequency, $\Delta q$, to zero. Thus at equilibrium

    $pu = qv$,  or  $\frac{p}{q} = \frac{v}{u}$        (2.3)

and

    $q = \frac{u}{u + v}$        (2.4)
Three conclusions can be drawn from the effect of mutation on
gene frequency. Measurements of mutation rates indicate values
ranging between about $10^{-4}$ and $10^{-8}$ per generation (one in ten
thousand and one in a hundred million gametes). With normal
mutation rates, therefore, mutation alone can produce only very slow
changes of gene frequency; on an evolutionary timescale they might
be important, but they could scarcely be detected by experiment
unless with microorganisms. The second conclusion concerns the
equilibrium between mutation in the two directions. Studies of
reverse mutation (from mutant to wild type) indicate that it is usually
less frequent than forward mutation (from wild type to mutant), on
the whole about one tenth as frequent (Muller and Oster, 1957).
The equilibrium gene frequencies for such loci, resulting from
mutation alone, would therefore be about 0.1 of the wild-type allele
and 0.9 of the mutant; in other words the "mutant" would be the
common form and the "wild type" the rare form. Since this is not
the situation we find in natural populations it is clear that the fre
quencies of such genes are not the product of mutation alone. We
shall see in the next section that the rarity of mutant alleles is attribu
table to selection. The third conclusion concerns the effects of an
increase of mutation rates such as might be caused by an increase of
the level of ionising radiation to which the population is subjected.
Any loci at which the gene frequencies are in equilibrium from the
effects of mutation alone will not be affected by a change of mutation
rate, provided the change affects forward and reverse mutation pro
portionately. This can be seen from consideration of the equilibrium
gene frequencies given in equation 2.4.
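
The three conclusions about mutation can be illustrated with a short sketch. The mutation rates below are assumed values chosen within the range quoted in the text, with reverse mutation one tenth as frequent as forward mutation; the function names are illustrative.

```python
def mutation_step(q, u, v):
    """One generation of two-way mutation: q gains u*p and loses v*q."""
    p = 1 - q
    return q + u * p - v * q


def mutation_equilibrium(u, v):
    """Equation 2.4: equilibrium frequency of the mutant allele."""
    return u / (u + v)


u, v = 1e-5, 1e-6  # assumed rates: forward (u) and reverse (v)
q_eq = mutation_equilibrium(u, v)
print("equilibrium q:", round(q_eq, 3))

# Slowness of change: start with no mutant genes and apply mutation alone.
q = 0.0
for _ in range(1000):
    q = mutation_step(q, u, v)
print("q after 1000 generations:", round(q, 4))

# Raising both rates in proportion leaves the equilibrium where it was.
print("equilibrium with both rates trebled:", round(mutation_equilibrium(3 * u, 3 * v), 3))
```

Even after a thousand generations the frequency has moved only about one hundredth of the way from zero, although the equilibrium itself lies near 0.9.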
Selection
Hitherto we have supposed that all individuals in the population
contribute equally to the next generation. Now we must take account
of the fact that individuals differ in viability and fertility, and that
they therefore contribute different numbers of offspring to the next
generation. The proportionate contribution of offspring to the next
generation is called the fitness of the individual, or sometimes the
adaptive value, or selective value. If the differences of fitness are in
any way associated with the presence or absence of a particular gene
in the individual's genotype, then selection operates on that gene.
When a gene is subject to selection its frequency in the offspring is
not the same as in the parents, since parents of different genotypes
pass on their genes unequally to the next generation. In this way
selection causes a change of gene frequency, and consequently also of
genotype frequency. The change of gene frequency resulting from
selection is more complicated to describe than that resulting from
mutation, because the differences of fitness that give rise to the
selection are an aspect of the phenotype. We therefore have to take
account of the degree of dominance shown by the genes in question.
Dominance, in this connexion, means dominance with respect to
fitness, and this is not necessarily the same as the dominance with
respect to the main visible effects of the gene. Most mutant genes, for
example, are completely recessive to the wild type in their visible
[Fig. 2.1: three panels, labelled no dominance, complete dominance,
and overdominance, each placing the genotypes $A_1A_1$, $A_1A_2$, and
$A_2A_2$ on a scale of fitness.]
Fig. 2.1. Degrees of dominance with respect to fitness.
effects, but this does not necessarily mean that the heterozygote has a
fitness equal to that of the wild-type homozygote. The meaning of
the different degrees of dominance with which we shall deal is
illustrated in Fig. 2.1.
It is most convenient to think of selection acting against the gene
in question, in the form of selective elimination of one or other of the
genotypes that carry it. This may operate either through reduced
viability or through reduced fertility in its widest sense, including
mating ability. In either case the outcome is the same: the genotype
selected against makes a smaller contribution of gametes to form
zygotes in the next generation. We may therefore treat the change of
gene frequency as taking place between the counting of genotypes
among the zygotes of the parent generation and the formation of
zygotes in the offspring generation. The intensity of the selection is
expressed as the coefficient of selection, s, which is the proportionate
reduction in the gametic contribution of a particular genotype com
pared with a standard genotype, usually the most favoured. The
contribution of the favoured genotype is taken to be 1, and the
contribution of the genotype selected against is then $1 - s$. This
expresses the fitness of one genotype compared with the other.
Suppose, for example, that the coefficient of selection is $s = 0.1$; this
means that for every 100 zygotes produced by the favoured genotype,
only 90 are produced by the genotype selected against.
The fitness of a genotype with respect to any particular locus is
not necessarily the same in all individuals. It depends on the en
vironmental circumstances in which the individual lives, and also on
the genotype with respect to genes at other loci. When we assign a
certain fitness to a genotype, this refers to the average fitness in the
whole population. Though differences of fitness between individuals
result in selection being applied to many, perhaps to all, loci simul
taneously, we shall limit our attention here to the effects of selection
on the genes at a single locus, supposing that the average fitness of the
different genotypes remains constant despite the changes resulting
from selection applied simultaneously to other loci. The conclusions
we shall reach apply equally to natural selection occurring under
natural conditions without the intervention of man, and to artificial
selection imposed by the breeder or experimenter through his choice
of individuals as parents and through the number of offspring he
chooses to rear from each parent.
Change of gene frequency under selection. We have first to
derive the basic formulae for the change of gene frequency brought
about by one generation of selection. Then we can consider what they
tell us about the effectiveness of selection. The different conditions
of dominance have to be taken account of, but the method is the same
for all, and we shall illustrate it by reference to the case of complete
dominance with selection acting against the recessive homozygote.
Let the genes $A_1$ and $A_2$ have initial frequencies $p$ and $q$, $A_1$ being
completely dominant to $A_2$, and let the coefficient of selection against
$A_2A_2$ individuals be $s$. Multiplying the initial frequency by the fitness
of each genotype we obtain the proportionate contribution of each
genotype to the gametes that will form the next generation, thus:

    Genotypes               $A_1A_1$   $A_1A_2$   $A_2A_2$        Total
    Initial frequencies     $p^2$      $2pq$      $q^2$           1
    Fitness                 1          1          $1 - s$
    Gametic contribution    $p^2$      $2pq$      $q^2(1 - s)$    $1 - sq^2$

Note that the total gametic contribution is no longer unity, because
there has been a proportionate loss of $sq^2$ due to the selection. To
find the frequency of $A_2$ gametes produced, and so the frequency of
$A_2$ genes in the progeny, we take the gametic contribution of $A_2A_2$
individuals plus half that of $A_1A_2$ individuals and divide by the new
total, i.e. we apply equation 1.1. Thus the new gene frequency is
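$$q_1 = \frac{q^2(1-s) + pq}{1 - sq^2} \tag{2.5}$$
The change of gene frequency, $\Delta q$, resulting from one generation of
selection is
$$\Delta q = q_1 - q = \frac{q^2(1-s) + pq}{1 - sq^2} - q$$
which on simplification reduces to
$$\Delta q = -\frac{sq^2(1-q)}{1 - sq^2} \tag{2.6}$$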
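The simplification can be followed by putting both terms over the common
denominator and writing $p = 1 - q$. This is only a worked sketch of the
step that the text leaves to the reader; it uses no quantities beyond those
already defined:

$$\Delta q = \frac{q^2(1-s) + q(1-q) - q(1 - sq^2)}{1 - sq^2}
         = \frac{q^2 - sq^2 + q - q^2 - q + sq^3}{1 - sq^2}
         = -\frac{sq^2(1-q)}{1 - sq^2}$$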
From this we see that the effect of selection on gene frequency depends
not only on the intensity of selection, $s$, but also on the initial
gene frequency. But both relationships are somewhat complex, and
the examination of their significance will be postponed till after the
other situations have been dealt with.
Selection may act against the dominant phenotype and favour the
recessive: we then put $1-s$ for the fitness of $A_1A_1$ and of $A_1A_2$
genotypes. The expression for $\Delta q$ is given in Table 2.1. The difference
may best be appreciated by considering the effects of total elimination
($s = 1$). The expression for selection against the dominant allele then
reduces to $\Delta q = 1 - q$, which expresses the fact that if only the
recessive genotype survives to breed the frequency of the recessive allele
will become 1 after a single generation of selection. But, on the other
hand, if there is complete elimination of the recessive genotype the
frequency of the dominant allele does not reach 1 after a single
generation. The difference between the effects of selection in opposite
directions becomes less marked as the value of $s$ decreases.
If there is incomplete dominance the expression for $\Delta q$ is again
different. The case of exact intermediate dominance is given in
Table 2.1. Here we put $1 - \tfrac{1}{2}s$ for the fitness of $A_1A_2$, and $1-s$ for
the fitness of the $A_2A_2$ genotype. For selection in the opposite direction
in this case we need only interchange the initial frequencies of the
two alleles, writing $p$ in the place of $q$.
Table 2.1
Change of gene frequency, $\Delta q$, after one generation of selection
under different conditions of dominance specified in Fig. 2.1.

Conditions of dominance     Initial frequencies and fitness            Change of frequency,
and selection               of the genotypes                           $\Delta q$, of gene $A_2$
                            $A_1A_1$   $A_1A_2$            $A_2A_2$
                            $p^2$      $2pq$               $q^2$

No dominance,               $1$        $1-\tfrac{1}{2}s$   $1-s$       $-\dfrac{\tfrac{1}{2}sq(1-q)}{1-sq}$                (1)
selection against $A_2$

Complete dominance,         $1$        $1$                 $1-s$       $-\dfrac{sq^2(1-q)}{1-sq^2}$                        (2)
selection against $A_2A_2$

Complete dominance,         $1-s$      $1-s$               $1$         $+\dfrac{sq^2(1-q)}{1-s(1-q^2)}$                    (3)
selection against $A_1-$

Overdominance,              $1-s_1$    $1$                 $1-s_2$     $+\dfrac{pq(s_1p-s_2q)}{1-s_1p^2-s_2q^2}$           (4)
selection against
$A_1A_1$ and $A_2A_2$

When $s$ is small the denominators differ little from 1, and the numerators
alone can be taken to represent $\Delta q$ sufficiently accurately for most purposes.
Finally, selection may favour the heterozygote, a condition known
as overdominance. In this case we put $1-s_1$ and $1-s_2$ for the fitness
of the two homozygotes. The expression for $\Delta q$ is given in Table 2.1.
This special case will be given more detailed attention later. The
different conditions of dominance to which the expressions in
Table 2.1 refer are illustrated diagrammatically in Fig. 2.1. Let us
now see what these equations tell us about the effectiveness of
selection.
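The method used for all four cases (multiply each initial genotype frequency
by its fitness, then take the homozygote contribution plus half the
heterozygote contribution and divide by the new total) can be sketched in a
few lines of Python. This is an illustrative sketch only: the function name
`delta_q` and the numerical values of `q`, `s`, `s1` and `s2` are assumptions
chosen for the demonstration, not values from the text. The printed columns
compare the result of the general method with the closed-form expressions of
Table 2.1.

```python
def delta_q(q, w11, w12, w22):
    """One-generation change in q, the frequency of allele A2.

    w11, w12 and w22 are the fitnesses of A1A1, A1A2 and A2A2.
    """
    p = 1.0 - q
    # Gametic contribution of each genotype: initial frequency x fitness.
    c11 = p * p * w11
    c12 = 2.0 * p * q * w12
    c22 = q * q * w22
    total = c11 + c12 + c22
    # New frequency of A2: homozygotes plus half the heterozygotes,
    # divided by the new total.
    q_new = (c22 + 0.5 * c12) / total
    return q_new - q


# Illustrative values only (assumptions, not taken from the text).
q, s, s1, s2 = 0.3, 0.2, 0.1, 0.3
p = 1.0 - q

from_fitnesses = {
    "no dominance, against A2": delta_q(q, 1.0, 1.0 - 0.5 * s, 1.0 - s),
    "complete dominance, against A2A2": delta_q(q, 1.0, 1.0, 1.0 - s),
    "complete dominance, against A1-": delta_q(q, 1.0 - s, 1.0 - s, 1.0),
    "overdominance": delta_q(q, 1.0 - s1, 1.0, 1.0 - s2),
}

# Closed-form expressions (1) to (4) of Table 2.1.
from_table = {
    "no dominance, against A2": -0.5 * s * q * (1 - q) / (1 - s * q),
    "complete dominance, against A2A2": -s * q**2 * (1 - q) / (1 - s * q**2),
    "complete dominance, against A1-": s * q**2 * (1 - q) / (1 - s * (1 - q**2)),
    "overdominance": p * q * (s1 * p - s2 * q) / (1 - s1 * p**2 - s2 * q**2),
}

for name in from_fitnesses:
    print(f"{name:34s} {from_fitnesses[name]:+.6f} {from_table[name]:+.6f}")
```

With complete elimination of the dominant phenotype, `delta_q(q, 0.0, 0.0, 1.0)`
gives $1 - q$, in agreement with the remark made above about total elimination.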
Effectiveness of selection. We see from the formulae that the
effectiveness of selection, i.e. the magnitude of $\Delta q$, depends on the
initial gene frequency, $q$. The nature of this relationship is best
appreciated from graphs showing $\Delta q$ at different values of $q$. Fig. 2.2
shows these graphs for the cases of no dominance and complete
dominance. They also distinguish between selection in the two
directions. A value of $s = 0.2$ was chosen for the coefficient of selection
because, for reasons given in Chapter 12, this seems to be the
right order of magnitude for the coefficient of selection operating on
genes concerned with metric characters in laboratory selection
experiments. First we may note that with this value of $s$ there is never a
great difference in $\Delta q$ according to the direction of selection. The
two important points about the effectiveness of selection that these
graphs demonstrate are: (i) Selection is most effective at intermediate
gene frequencies and becomes least effective when $q$ is either large or
small. (ii) Selection for or against a recessive gene is extremely
ineffective when the recessive allele is rare. This is the consequence
of the fact, noted earlier, that when a gene is rare it is represented
almost entirely in heterozygotes.

Fig. 2.2. Change of gene frequency, $\Delta q$, under selection of intensity
$s = 0.2$, at different values of initial gene frequency, $q$. Upper figure: a
gene with no dominance. Lower figure: a gene with complete dominance. The
graphs marked (-) refer to selection against the gene whose frequency is $q$,
so that $\Delta q$ is negative. The graphs marked (+) refer to selection in
favour of the gene, so that $\Delta q$ is positive. (From Falconer, 1954a;
reproduced by courtesy of the editor of the International Union of
Biological Sciences.)
Another way of looking at the effect of the initial gene frequency on
the effectiveness of selection is to plot a graph showing the course of
selection over a number of generations, starting from one or other
extreme. Such graphs are shown in Fig. 2.3. They were constructed
directly from those of Fig. 2.2, and refer again to a coefficient of
selection, $s = 0.2$. They show that the change due to selection is at
first very slow, whether one starts from a high or a low initial gene
frequency; it becomes more rapid at intermediate frequencies and
falls off again at the end. In the case of a fully dominant gene one is
chiefly interested in the frequency of the homozygous recessive
genotype, i.e. $q^2$. For this reason the graph shows the effect of
selection on $q^2$ instead of on $q$.
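A course of selection of this kind can be traced numerically by applying the
one-generation recurrence repeatedly. The sketch below is an illustration
under stated assumptions: the starting frequency of 0.99, the span of 60
generations, and the function names are choices made here, not values read
from Fig. 2.3. It follows selection against a recessive with $s = 0.2$ and
prints $q$ and $q^2$ every ten generations.

```python
def next_q(q, w11, w12, w22):
    """Frequency of A2 after one generation of selection."""
    p = 1.0 - q
    total = p * p * w11 + 2.0 * p * q * w12 + q * q * w22
    return (q * q * w22 + p * q * w12) / total


def course_of_selection(q0, w11, w12, w22, generations):
    """List of gene frequencies for generations 0, 1, ..., generations."""
    qs = [q0]
    for _ in range(generations):
        qs.append(next_q(qs[-1], w11, w12, w22))
    return qs


s = 0.2
# Selection against the recessive homozygote, starting near fixation
# (starting value and number of generations are assumptions).
recessive = course_of_selection(0.99, 1.0, 1.0, 1.0 - s, 60)

# Size of the change in each generation, to show slow-fast-slow behaviour.
steps = [abs(b - a) for a, b in zip(recessive, recessive[1:])]

for t in range(0, 61, 10):
    print(t, round(recessive[t], 3), round(recessive[t] ** 2, 3))
```

The list `steps` shows the pattern described above: the change per generation
is small at the start, largest at intermediate frequencies, and small again
toward the end.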
It is often useful to express the change of gene frequency, $\Delta q$,
under selection in a simplified form, which is a sufficiently good
approximation for many purposes. If either the coefficient of selection,
$s$, or the gene frequency, $q$, is small, then the denominators of
the equations in Table 2.1 become very nearly unity, and we can use
the numerators alone as expressions for $\Delta q$. Then for selection in
either direction we have, with no dominance:
$$\Delta q = \pm\tfrac{1}{2}sq(1-q) \quad \text{(approx.)} \tag{2.7}$$
and with complete dominance:
$$\Delta q = \pm sq^2(1-q) \quad \text{(approx.)} \tag{2.8}$$
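How good these approximations are can be judged by comparing them with the
exact expressions of Table 2.1. The following sketch does this for selection
against the gene; the function names and the particular values of $q$ are
illustrative assumptions. The ratio exact/approximate is simply the
reciprocal of the neglected denominator, so it approaches 1 as $s$ or $q$
becomes small.

```python
def exact_no_dominance(q, s):
    # Expression (1) of Table 2.1.
    return -0.5 * s * q * (1 - q) / (1 - s * q)


def approx_no_dominance(q, s):
    # Equation 2.7, selection against the gene.
    return -0.5 * s * q * (1 - q)


def exact_recessive(q, s):
    # Expression (2) of Table 2.1 (equation 2.6).
    return -s * q * q * (1 - q) / (1 - s * q * q)


def approx_recessive(q, s):
    # Equation 2.8, selection against the gene.
    return -s * q * q * (1 - q)


s = 0.2
for q in (0.05, 0.5, 0.95):  # illustrative gene frequencies
    r_nd = exact_no_dominance(q, s) / approx_no_dominance(q, s)
    r_rec = exact_recessive(q, s) / approx_recessive(q, s)
    print(f"q = {q:.2f}  ratio (no dominance) = {r_nd:.4f}  "
          f"ratio (recessive) = {r_rec:.4f}")
```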
Fig. 2.3. Change of gene frequency during the course of selection from one
extreme to the other, plotted against generations. Intensity of selection,
$s = 0.2$. Upper figure: a gene with no dominance. Lower figure: a gene with
complete dominance, $q$ being the frequency of the recessive allele and $q^2$
that of the recessive homozygote. The graphs marked (-) refer to selection
against the gene whose frequency is $q$, so that $q$ or $q^2$ decreases. The
graphs marked (+) refer to selection in favour of the gene, so that $q$ or
$q^2$ increases. (From Falconer, 1954a; reproduced by courtesy of the editor
of the International Union of Biological Sciences.)
Example 2.1. As an example of the change of gene frequency under
selection we shall take the case of a sex-linked gene, in spite of the added
complication, because there is no well documented case of an autosomal
gene. Fig. 2.4 shows the change of the frequency of the recessive
sex-linked gene "raspberry" in Drosophila melanogaster over a period of about
eighteen generations, described by Merrell (1953). The population was
started with a gene frequency of 0.5 in both sexes, and was therefore in
equilibrium at the beginning (see p. 17). Counts were made at about
monthly intervals, and the gene frequency in both sexes combined (by
equation 1.3) is shown against the scale of days in the figure.
Measurements of fitness were made by comparison of the relative viability of
mutant and wild-type phenotypes, and of their relative success in mating.
No differences of viability were detected, nor of the success of females in
mating. But mutant males were only 50 per cent as successful as wild-type
males in mating. The changes of gene frequency expected on the
basis of this difference of fitness were then calculated generation by
generation, and these calculated values are shown in the figure by the
smooth curve, plotted against the scale of generations. From a similar
experiment with a different mutant it was found that the calculated and
observed curves coincided if a period of 24 days was taken as the interval
between generations. For this reason 24 days to a generation was taken as
the basis for superimposing the curves shown here. Since the calculated
curve was to this extent made to fit the observed, the good agreement
between the two cannot be taken as proof that selection operated only
through the males' success in mating. But the similarity in their shapes
illustrates well how the change of gene frequency is rapid at first, tails off
as the gene frequency becomes lower, and becomes very slow when it
approaches zero.

Fig. 2.4. Change of gene frequency under natural selection in
the laboratory, as described in Example 2.1. Horizontal scales: generations
and days. (Data from Merrell, 1953.)
Number of generations required. How many generations of
selection would be needed to effect a specified change of gene
frequency? An answer to this question is sometimes required in
connexion with breeding programmes or proposed eugenic measures.
We shall here consider only the case of selection against a recessive
when elimination of the unwanted homozygote is complete, i.e. $s = 1$.
This would apply to natural selection against a recessive lethal, and
artificial selection against an unwanted recessive in a breeding
programme. We shall also, for the moment, suppose that there is no
mutation. We had in equation 2.5 an expression for the new gene
frequency after one generation of selection against a recessive.
Substituting $s = 1$ in this equation and writing $q_0, q_1, q_2, \ldots, q_t$ for the
gene frequency after $0, 1, 2, \ldots, t$ generations of selection we have
$$q_1 = \frac{q_0}{1 + q_0}$$
and
$$q_2 = \frac{q_1}{1 + q_1} = \frac{q_0}{1 + 2q_0}$$
by substituting for $q_1$ and simplifying. So in general
$$q_t = \frac{q_0}{1 + tq_0} \tag{2.9}$$
and the number of generations, $t$, required to change the gene
frequency from $q_0$ to $q_t$ is
$$t = \frac{q_0 - q_t}{q_0 q_t} = \frac{1}{q_t} - \frac{1}{q_0} \tag{2.10}$$
We may use this formula to illustrate the point already made, that
when the frequency of a recessive gene is low selection is very slow
to change it.
Example 2.2. It is sometimes suggested, as a eugenic measure, that
those suffering from serious inherited defects should be prevented from
reproducing, since in this way the frequency of such defects would be
reduced in future generations. Before deciding whether the proposal is a
good one we ought to know what it would be expected to achieve. We
cannot properly discuss this problem without taking mutation into
account, as we shall do later; the answer we get ignoring mutation, as we do
now, shows what is the best that could be hoped for. Let us take albinism
as an example, though it cannot be regarded as a very serious defect, and
ask the question: how long would it take to reduce its frequency to half the
present value? The present frequency is about 1/20,000, and this makes
$q_0 = 1/141$, as we saw in Example 1.4. The objective is $q_t^2 = 1/40,000$, which
makes $q_t = 1/200$. So, from equation 2.10, $t = 200 - 141 = 59$ generations.
With 25 years to a generation it would take nearly 1500 years to achieve
this modest objective. More serious recessive defects are generally even
less common than albinism and with them elimination would be still
slower.
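The arithmetic of Example 2.2 can be checked directly from equation 2.10, and
the result confirmed by iterating the one-generation recurrence
$q_{t+1} = q_t/(1 + q_t)$. The sketch below uses Python's `fractions.Fraction`
so that the check is exact; the function names are illustrative assumptions.

```python
from fractions import Fraction


def generations_required(q0, qt):
    """Equation 2.10: t = 1/qt - 1/q0, for s = 1 against a recessive."""
    return 1 / qt - 1 / q0


def frequency_after(q0, t):
    """Apply q -> q / (1 + q) (equation 2.5 with s = 1) t times."""
    q = q0
    for _ in range(t):
        q = q / (1 + q)
    return q


q0 = Fraction(1, 141)   # present gene frequency for albinism (Example 2.2)
qt = Fraction(1, 200)   # target: homozygote frequency halved to 1/40,000

t = generations_required(q0, qt)
print(t)                        # number of generations
print(t * 25)                   # years, at 25 years to a generation
print(frequency_after(q0, 59))  # gene frequency reached by iteration
```

The iteration reaches exactly 1/200 after 59 generations, which agrees with
the general expression 2.9.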
Balance between mutation and selection. Having described
the effects of mutation and selection separately we must now compare
them and consider them jointly. Which is the more effective process
in causing change of gene frequency? Is it reasonable to attribute the
low frequency of deleterious genes that we find in natural populations
to the balance between mutation tending to increase the frequency
and selection tending to decrease it? The expressions already
obtained for the change of gene frequency under mutation or selection
alone show that both depend on the initial gene frequency, but in
different ways. Mutation to a particular gene is most effective in
increasing its frequency when the mutant gene is rare (because there
are more of the unmutated genes to mutate); but selection is least
effective when the gene is rare. The relative effectiveness of the two
processes depends therefore on the gene frequency, and if both processes
operate for long enough a state of equilibrium will eventually
be reached. So we must find what the gene frequency will be when
equilibrium is reached. This is done by equating the two expressions
for the change of gene frequency, because at equilibrium the change
due to mutation will be equal and opposite to the change due to
selection.
Let us consider first a fully recessive gene with frequency $q$,
mutation rate to it $u$, and from it $v$, and selection coefficient against it
$s$. Then from equations (2.3) and (2.6) we have at equilibrium
$$u(1-q) - vq = \frac{sq^2(1-q)}{1 - sq^2} \tag{2.11}$$
This equation is too complicated to give a clear answer to our question.
But we can make two simplifications with only a trivial sacrifice
of accuracy. We are specifically interested in genes at low equilibrium
frequencies. If $q$ is small the term $vq$ representing back mutation is
relatively unimportant and can be neglected; and we can use the
approximate expression (equation 2.8) for the selection effect.
Making these simplifications we have the equilibrium condition for
selection against a recessive gene
$$u(1-q) = sq^2(1-q) \quad \text{(approx.)}$$
$$u = sq^2 \quad \text{(approx.)} \tag{2.12}$$
$$q = \sqrt{\frac{u}{s}} \quad \text{(approx.)} \tag{2.13}$$
For a gene with no dominance similar reasoning from equation (1)
in Table 2.1 gives the equilibrium condition
$$q = \frac{2u}{s} \quad \text{(approx.)} \tag{2.14}$$
Finally, consider selection against a completely dominant gene, the
frequency of the dominant gene being $1-q$, and the mutation rate
to it being $v$. In this case $1-q$ is very small and the term $u(1-q)$ in
equation 2.11 is negligible. We have therefore at equilibrium
$$vq = sq^2(1-q) \quad \text{(approx.)}$$
$$q(1-q) = \frac{v}{s} \quad \text{(approx.)}$$
or
$$H = \frac{2v}{s} \quad \text{(approx.)} \tag{2.15}$$
where $H$ is the frequency of heterozygotes. If the mutant gene is
rare $H$ is very nearly the frequency of the mutant phenotype in the
population.
Example 2.3. If the equilibrium state is accepted as applicable, we
can use it to get an estimate of the mutation rate of dominant abnormalities
for which the coefficient of selection is known. Among some human
examples described by Haldane (1949) is the case of dominant dwarfism
(chondrodystrophy) studied in Denmark. The frequency of dwarfs was
estimated at $10.7 \times 10^{-5}$, and their fitness ($1-s$) at 0.196. The estimate of
fitness was made from the number of children produced by dwarfs
compared with their normal sibs. The mutation rate, by equation (2.15),
comes out at $4.3 \times 10^{-5}$. Though there is a possibility of serious error in
the estimate of frequency owing to prenatal mortality of dwarfs, the
mutation rate is almost certainly estimated within the right order of
magnitude. For a discussion of the estimation of mutation rates in man see
Crow (1956).
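The estimate in Example 2.3 follows from rearranging equation 2.15,
$H = 2v/s$, to give $v = \tfrac{1}{2}sH$, with $H$ taken as the frequency of the
mutant phenotype. A minimal sketch of the calculation, with a function name
chosen here for illustration:

```python
def mutation_rate_dominant(phenotype_frequency, fitness):
    """Estimate v from equation 2.15 rearranged: v = s * H / 2.

    phenotype_frequency is used as H, which assumes the mutant gene is
    rare so that nearly all affected individuals are heterozygotes.
    """
    s = 1.0 - fitness
    return 0.5 * s * phenotype_frequency


# Values quoted in Example 2.3 for chondrodystrophic dwarfism.
v = mutation_rate_dominant(10.7e-5, 0.196)
print(f"{v:.2e}")
```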
These expressions for the equilibrium gene frequency under the
joint action of mutation and selection show that the gene frequency
can have any value at equilibrium, depending on the relative magnitude
of the mutation rate and the coefficient of selection. But if
mutation rates are of the order of magnitude commonly accepted,
i.e. $10^{-5}$, or thereabouts, then only a mild selection against the mutant
gene will be needed to hold it at a very low equilibrium frequency.
For example, the following are the equilibrium frequencies of a
recessive gene and of the recessive homozygote under various intensities
of selection if the mutation rate is $10^{-5}$:
$s$   =   0.001   0.01    0.1      0.5
$q$   =   0.1     0.03    0.01     0.0045
$q^2$ =   0.01    0.001   0.0001   $2 \times 10^{-5}$
Thus, if a gene mutates at the rate of $10^{-5}$, a selective disadvantage of
10 per cent is enough to hold the frequency of the recessive homozygote
at one in ten thousand; and a 50 per cent disadvantage will
hold it at one in fifty thousand. It is quite clear therefore that the
low frequency of deleterious mutants in natural populations is in
accord with what would be expected from the joint action of mutation
and selection. A further conclusion is that mutation alone is most
unlikely to be a cause of evolutionary change. It is not mutation, but
selection, that chiefly determines whether a gene spreads through the
population or remains a rare abnormality, unless the mutation rate
is very much higher than seems to be the rule.
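The equilibrium frequencies tabulated above follow from equation 2.13, and the
corresponding expressions for the other two cases from equations 2.14 and
2.15. The sketch below recomputes the table for a mutation rate of $10^{-5}$;
the function names are illustrative assumptions, and all three expressions are
the approximate forms, valid when the equilibrium frequency is low.

```python
import math


def q_recessive(u, s):
    """Equation 2.13: equilibrium frequency of a recessive gene."""
    return math.sqrt(u / s)


def q_no_dominance(u, s):
    """Equation 2.14: equilibrium frequency with no dominance."""
    return 2.0 * u / s


def heterozygotes_dominant(v, s):
    """Equation 2.15: equilibrium frequency of heterozygotes, dominant gene."""
    return 2.0 * v / s


u = 1e-5
rows = [(s, q_recessive(u, s), q_recessive(u, s) ** 2)
        for s in (0.001, 0.01, 0.1, 0.5)]

for s, q, q2 in rows:
    print(f"s = {s:<6} q = {q:.4f}  q^2 = {q2:.1e}")
```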
Let us now briefly consider two questions of social importance
concerning the balance between selection and mutation: the effect of
an increase of mutation rate, and the effect of a change in the intensity
of selection against deleterious mutants. These questions are more
fully discussed by Crow (1957).
Increase of mutation rate. Since the products of mutation are
predominantly deleterious, the process of mutation has a harmful
effect on a proportion of the individuals in a population. When an
individual dies or fails to reproduce in consequence of the reduced
fitness of its genotype, we may refer to this as a "genetic death." An
increase in the frequency of genetic deaths would reduce the potential
reproductive rate and might thus reduce the speed with which a
species could multiply in an unoccupied territory. But when the
numbers of adults are held constant by density-dependent factors,
even quite a high frequency of genetic deaths will not affect the
ability of the population to perpetuate itself, especially if the
reproductive rate is high, because the death of some individuals leaves room
for others that would otherwise have died from lack of food or some
other cause. There is a species of Drosophila, for example (D.
tropicalis, from Central America), in which 50 per cent of individuals
in a certain locality suffer genetic death, and yet the population
flourishes (Dobzhansky and Pavlovsky, 1955). In species with low
reproductive rates the frequency of genetic deaths is of greater
consequence, particularly in ourselves, where the death of every individual
is a matter of concern. Let us therefore consider what effect is to be
expected from an increase of mutation rate such as might be caused
by an increase in the amount of ionising radiation to which human
populations are exposed.
Let us take the case of a recessive gene with a mutation rate (to it)
of $u$, the gene being in equilibrium at a frequency of $q$. Then, if the
coefficient of selection against the homozygote is $s$, the frequency of
genetic deaths is $sq^2$. This is the proportionate loss due to selection,
as shown on p. 29, and it is equal to $u$, by equation 2.12. Thus the
frequency of genetic deaths, when equilibrium has been attained,
depends on the mutation rate alone, and is not influenced by the
degree of harmfulness of the gene. The reason for this apparent paradox
is that the more harmful genes come to equilibrium at lower
frequencies.
Now, if the mutation rate is increased, and maintained at the new
level, the gene will begin to increase toward a new point of equilibrium
at which $sq^2$ will be equal to the new mutation rate. Thus if
the mutation rate were doubled the frequency of genetic deaths would
also be doubled, when the new equilibrium had been reached. But
the approach to the new equilibrium would be very slow. The change
of gene frequency in the first generation is approximately
$$\Delta q = u(1-q) - sq^2(1-q)$$
$u$ being the new mutation rate (from equations 2.3 and 2.8, but
neglecting back mutation). To see what this means let us take a
mutation rate of $10^{-5}$ as being probably representative of many loci,
and let us suppose that this was doubled. In the first generation $sq^2$ is
still equal to the old mutation rate, and we may with sufficient
accuracy take $1-q$ as unity. Then
$$\Delta q = 2 \times 10^{-5} - 10^{-5} = 10^{-5}$$
The immediate effect of the increase of mutation rate would therefore
be very small indeed.
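The slowness of the approach to the new equilibrium can be illustrated by
iterating the approximate expression for $\Delta q$ given above. The sketch
below is an assumption-laden illustration: the text does not specify $s$, so
a recessive lethal ($s = 1$) is assumed here, and the function name and the
choice of "halfway to the new equilibrium" as a yardstick are also choices
made for this example.

```python
import math


def generations_to_halfway(u_old, u_new, s):
    """Count generations for q to move half-way from the old equilibrium
    to the new one, using dq = u(1 - q) - s q^2 (1 - q), with back
    mutation neglected. Returns (first-generation change, generations)."""
    q = math.sqrt(u_old / s)          # old equilibrium, equation 2.13
    q_new_eq = math.sqrt(u_new / s)   # new equilibrium, equation 2.13
    halfway = 0.5 * (q + q_new_eq)
    first_step = None
    n = 0
    while q < halfway:
        dq = u_new * (1 - q) - s * q * q * (1 - q)
        if first_step is None:
            first_step = dq
        q += dq
        n += 1
    return first_step, n


# Mutation rate doubled from 1e-5 to 2e-5; s = 1 is an assumed value.
first_step, n = generations_to_halfway(1e-5, 2e-5, s=1.0)
print(f"first-generation change: {first_step:.3e}")
print(f"generations to reach half-way: {n}")
```

The first-generation change agrees with the value of about $10^{-5}$ obtained
above, and many generations pass before even half of the eventual change has
taken place.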
Change of selection intensity. Intensification of selection is
sometimes advocated as a eugenic measure in human populations,
on the grounds that if sufferers from genetic defects were prevented
from breeding the frequency of the defects would be reduced. We
saw from Example 2.2 that the effect of selection against a recessive
defect is very slow indeed, even when mutation is ignored. The true
situation is even worse. We cannot reduce the frequency of an
abnormality, whether dominant or recessive, below the new equilibrium
frequency. The serious defects have already a fairly strong
natural selection working on them, and the addition of artificial
selection can do no more than make the coefficient of selection, $s$,
equal to 1. This would probably seldom do more than double the
present coefficient of selection, and the incidence of defects would be
reduced to not less than half their present values (equations 2.13,
2.14, 2.15). With a dominant gene the effect would be immediate,
but with a recessive the approach to the new equilibrium would be
extremely slow.
The situation with respect to recessives is complicated by the
fact that deleterious recessives are certainly not at their equilibrium
frequencies in present-day human populations (Haldane, 1939).
The reason is that modern civilisation has reduced the degree of
subdivision (i.e. inbreeding) and so reduced the frequency of
homozygotes, as will be explained in the next chapter. In consequence
both the gene frequencies and the homozygote frequencies are below
their equilibrium values, and must be presumed to be at present
increasing slowly toward new equilibria at higher values.
Perhaps the converse of the question posed above is one that
should give us more concern, namely the consequences of the reduced
intensity of natural selection under modern conditions. Minor
genetic defects, such as colour-blindness, must presumably have had
some selective disadvantage in the past but now have very little, if
any, effect on fitness. Moreover, the development and extension of
medical treatment prolongs the lives of many people with diseases
that have at least some degree of genetic causation through genes that
increase susceptibility. This relaxation of the selection operating on
minor genetic defects and against genes concerned in the causation of
disease suggests that the frequencies of these genes will increase
toward new equilibria at higher values. If this is true we must expect
the incidence of minor genetic defects to increase in the future, and
also the proportion of people who need medical treatment for a
variety of diseases. By applying humanitarian principles for our own
good now we are perhaps laying up a store of inconvenience for our
descendants in the distant future.
Selection favouring heterozygotes. We have considered the
effects of selection operating on genes that are partially or fully
dominant with respect to fitness; but, though the appropriate formula
was given in Table 2.1, we have not yet discussed the consequences
of overdominance with respect to fitness; that is, when the
heterozygote has a higher fitness than either homozygote. At first
sight it may seem rather improbable that selection should favour the
heterozygote of two alleles rather than one or other of the homozygotes,
but there are reasons for thinking that this in fact is not at all
an uncommon situation. Let us first examine the consequences of
this form of selection, and then consider the evidence of its occurrence
in nature.
Selection operating on a gene with partial or complete dominance
tends toward the total elimination of one or other allele, the final gene
frequency, in the absence of mutation, being 0 or 1. When selection
favours the heterozygote, however, the gene frequency tends toward
an equilibrium at an intermediate value, both alleles remaining in the
population, even without mutation. The reason is as follows. The
change of gene frequency after one generation was given in Table 2.1
as being
$$\Delta q = \frac{pq(s_1p - s_2q)}{1 - s_1p^2 - s_2q^2}$$
The condition for equilibrium is that $\Delta q = 0$, and this is fulfilled when
$s_1p = s_2q$. The gene frequencies at this point of equilibrium are
therefore
$$\frac{p}{q} = \frac{s_2}{s_1} \quad \text{or} \quad q = \frac{s_1}{s_1 + s_2} \tag{2.16}$$
Now, if $q$ is greater than its equilibrium value (but not 1), and $p$
therefore less, $s_1p$ will be less than $s_2q$, and $\Delta q$ will be negative; that is
to say $q$ will decrease. Similarly if $q$ is less than its equilibrium value
(but not 0) it will increase. Therefore when the gene frequency has
any value, except 0 or 1, selection changes it toward the intermediate
point of equilibrium given in equation 2.16, and both alleles remain
permanently in the population. Three or more alleles at a locus are
maintained in the same way, provided the heterozygote of any pair
is superior in fitness to both homozygotes of that pair (Kimura,
1956). A feature of the equilibrium worthy of note is that the gene
frequency depends not on the degree of superiority of the heterozygote
but on the relative disadvantage of one homozygote compared
with that of the other. Therefore there is a point of equilibrium at
some more or less intermediate gene frequency whenever a heterozygote
is superior to both the homozygotes, no matter by how little.
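The stability of this equilibrium can be illustrated numerically by applying
expression (4) of Table 2.1 generation after generation from two very
different starting frequencies. In the sketch below the selection
coefficients ($s_1 = 0.1$, $s_2 = 0.3$), the starting frequencies, and the
number of generations are hypothetical values chosen for the demonstration;
the function names are likewise assumptions.

```python
def next_q_overdominance(q, s1, s2):
    """One generation of selection with fitnesses 1 - s1, 1, 1 - s2
    for A1A1, A1A2, A2A2 (expression (4) of Table 2.1)."""
    p = 1.0 - q
    dq = p * q * (s1 * p - s2 * q) / (1.0 - s1 * p * p - s2 * q * q)
    return q + dq


def equilibrium_q(s1, s2):
    """Equation 2.16: equilibrium frequency of A2."""
    return s1 / (s1 + s2)


s1, s2 = 0.1, 0.3   # hypothetical selection coefficients
final = {}
for q_start in (0.01, 0.99):
    q = q_start
    for _ in range(500):
        q = next_q_overdominance(q, s1, s2)
    final[q_start] = q
    print(q_start, round(q, 6), equilibrium_q(s1, s2))
```

Both runs end at the same intermediate frequency, $s_1/(s_1 + s_2)$, whichever
extreme the population starts from.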
Our previous consideration of genes with complete dominance
showed that the balance between selection and mutation satisfactorily
accounts for the presence of deleterious genes at low frequencies,
causing the appearance of rare abnormal, or mutant, individuals.
Genes at intermediate frequencies, however, are common in very
many species, and the presence of these cannot satisfactorily be
accounted for in this way. But the intermediate frequencies are just
what would be expected if selection favoured the heterozygotes.
The existence in a population of individuals with readily discernible
differences caused by genes at intermediate frequencies is referred to
as polymorphism. The blood group differences of man are perhaps
the best known examples, but antigenic differences are found also in
many other species and are probably universal in animals. More
striking forms of polymorphism are the colour varieties found in
many species, particularly among insects, snails, and fishes. The
genes causing polymorphism have usually no obvious advantage of
one allele over another, all the genotypes being essentially normal, or
"wild-type," individuals. In these circumstances, as we noted above,
only a very slight superiority of the heterozygote would be sufficient
to establish an equilibrium at an intermediate gene frequency. The
properties of the genes concerned with polymorphism seem, therefore,
to accord well with the hypothesis that selection is operating on
them in favour of the heterozygotes, and this is generally conceded to
be the most probable reason for their intermediate frequencies. As a
general cause of polymorphism, however, it cannot be taken as fully
proved, because the superior fitness of heterozygotes has been
demonstrated in relatively few cases, and there are other possible
reasons for the existence of polymorphism. For example, the genes
might be in a transitional stage of a change from one extreme to the
other as a result of slow environmental change; or the intermediate
frequencies might be the point of equilibrium between mutation in
opposite directions, with virtually no selective advantage of one allele
over the other. But these explanations seem improbable, particularly
as some polymorphisms are known to be of very long standing. The
polymorphism of shell colours in the land snail Cepaea nemoralis, for
example, goes back to Neolithic times (Cain and Sheppard, 1954a).
Another possible cause of polymorphism lies in the heterogeneity of
the environment in which a population lives. If the differences of
environment influence the selection coefficients in such a way that
one allele is favoured in some conditions and another allele in other
conditions, then polymorphism may result provided that mating is
not entirely at random over the range of environments. (See Levene,
1953; Li, 1955b; Mather, 1955a; Waddington, 1957.)
If heterozygotes are indeed superior in fitness, one naturally
wants to enquire into the nature of their superiority. Unfortunately,
however, very little is known about this, though evidence is accumulating,
in the case of the human blood groups, that certain blood groups
are associated with an increased susceptibility to certain diseases
(Roberts, 1957); group O, for example, with duodenal ulcer and group
A with pernicious anaemia. If one states this the other way round
and says that the other alleles confer increased resistance to these
diseases, then it is not unreasonable to suppose that each allele
increases resistance to different diseases, and that the presence of two
alleles increases the resistance to two different diseases, thereby
giving a selective advantage to the heterozygote.
Another question of interest concerns the evolutionary significance
of polymorphism. Is it an "adaptive" feature of a species?
Does it, in other words, confer some advantage over a population
without it? Some think that it does. (See, particularly, Dobzhansky,
1951b.) Others, however, point out that the average fitness of a
population with polymorphism resulting from superior fitness of
heterozygotes is less than that of a population in which a single
allele performs the same function as the two different alleles in the
heterozygote (Cain and Sheppard, 1954b). On this view, polymorphism
is a situation that, once established, is perpetuated by selection
between individuals within the population, but is a disadvantage to
the population as a whole in competition with another population
lacking the polymorphism.
The foregoing account of polymorphism leaves many problems
unsolved, and does little more than sketch the outlines of a most
interesting aspect of the genetics of populations. In particular, we
have not mentioned the extensive and detailed investigations of
polymorphism in respect of inverted segments of chromosomes found in
species of Drosophila and, to a lesser extent, in some other animals
and plants. For a description of these studies, and also for a fuller
general account of polymorphism, the reader must be referred to
Dobzhansky (1951a). We conclude by giving one example of
polymorphism where the nature of the superiority of heterozygotes is
clear. Other cases are described by Dobzhansky (1951a), Ford
(1953), Lerner (1954), and Sheppard (1958).
Example 2.4. Sickle-cell anaemia (Allison, 1955). There is a gene,
found in American negroes and in the indigenous East Africans, which
causes the formation of an abnormal type of haemoglobin. Homozygotes
suffer from an anaemia, characterised by the "sickle" shape of the
erythrocytes; it is a severe disease from which many die. All the haemoglobin of
homozygotes is of the abnormal type, though there is a variable admixture
of foetal haemoglobin. Heterozygotes do not suffer from anaemia, but
they can be recognised by the presence of sickle cells if the haemoglobin
is deoxygenated. About 35 per cent of their haemoglobin is of the
abnormal type. With respect to haemoglobin synthesis, therefore, the
sickle-cell gene is partially dominant, though with respect to the anaemia it is
recessive, and with respect to fitness it has been proved to be
overdominant. In routine surveys the few surviving homozygotes are not
readily distinguished from heterozygotes; we shall refer to the combined
heterozygotes and surviving homozygotes as "abnormals." The frequency
of abnormals varies very much with the locality: in American negroes it
is about 9 per cent, and in different parts of Africa it varies from zero up
to a maximum of about 40 per cent. In view of the severe disability of the
homozygotes it is impossible to account for these high frequencies unless
the heterozygotes have a quite substantial selective advantage over the
normal homozygotes. The nature of this selective advantage has been
shown to be connected with resistance to malaria. Heterozygotes are less
susceptible to malaria than normal homozygotes, and the frequency of
abnormals in different areas is correlated with the prevalence of malaria.
Let us work out the gene frequency corresponding with the maximum
frequency of 40 per cent abnormals, and then find the magnitude of the
selective advantage of heterozygotes necessary to maintain this gene
frequency in equilibrium.
If the gene frequency is in equilibrium it will be the same after selection
has taken place as it was before. Therefore, if we assume that all the
selection takes place before adulthood (an assumption that is not very far
from the truth) we can estimate the gene frequency from the genotype
frequencies in the adult population. But it is first necessary to know what
proportion of abnormals are homozygotes. This has been estimated as
being approximately 2.9 per cent (Allison, 1954). Thus, when the frequency
of abnormals is 0.4, the frequency of homozygotes is 0.012, and
that of heterozygotes is 0.388. The gene frequency, then, by equation 1.1,
is the frequency of homozygotes plus half the frequency of heterozygotes,
which comes to q = 0.206. If this gene frequency is the equilibrium value
maintained by natural selection favouring the heterozygotes, and if we
assume mating to be random, then the gene frequency is related to the
selection coefficients by equation 2.16. The fitness of sickle-cell homozygotes,
relative to that of heterozygotes, has been estimated from a
comparison of viability and fertility as being approximately 0.25. Therefore
the coefficient of selection against homozygotes is $s_2 = 0.75$. Substituting
this value of $s_2$, and the value of q found above, in equation 2.16
gives $s_1 = 0.197$. This is the coefficient of selection against normal
homozygotes, relative to heterozygotes. If we want to express the selective
advantage of heterozygotes as the superiority of heterozygotes, relative to
normal homozygotes, we may do so, since the fitness of heterozygotes
relative to normal homozygotes is $1/(1 - s_1)$. This is 1.24. Thus the selective
advantage to be attributed to the resistance of heterozygotes to malaria,
if these are the forces holding the gene in equilibrium, is 24 per cent.
The presence of the sickle-cell gene in American negroes can be
attributed to their African origin. The gene's present frequency of 0.046,
deduced in the manner described above, can be accounted for partly by
racial mixture and partly by the change of habitat which, removing the
advantage of heterozygotes, has exposed the gene to the full power of the
selection against homozygotes.
As an example of polymorphism the sickle-cell gene is not altogether
typical, because the differences of fitness are rather large and one of the
genotypes is clearly abnormal. But it illustrates in an exaggerated form
the nature of the selective forces that are presumed to underlie the more
usual forms of polymorphism.
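The arithmetic of Example 2.4 can be retraced in a few lines. The sketch below is illustrative only: the function name and its arguments are hypothetical, and it assumes that equation 2.16 has the usual equilibrium form q = s1/(s1 + s2), which is then solved for s1. With the figures quoted in the example it gives q close to 0.206, s1 a little under 0.2 (the small difference from the 0.197 quoted presumably reflects rounding), and a relative fitness of heterozygotes close to 1.24.

```python
def heterozygote_advantage(abnormals, homozygote_share, homozygote_fitness):
    """Return (q, s1, fitness of heterozygotes relative to normal homozygotes)."""
    homozygotes = abnormals * homozygote_share   # sickle-cell homozygotes among adults
    heterozygotes = abnormals - homozygotes
    q = homozygotes + heterozygotes / 2          # gene frequency, as in equation 1.1
    s2 = 1 - homozygote_fitness                  # selection against sickle-cell homozygotes
    # Assumed form of equation 2.16: q = s1 / (s1 + s2), solved here for s1.
    s1 = q * s2 / (1 - q)
    return q, s1, 1 / (1 - s1)


q, s1, advantage = heterozygote_advantage(
    abnormals=0.4, homozygote_share=0.029, homozygote_fitness=0.25
)
print(f"q = {q:.3f}, s1 = {s1:.3f}, relative fitness = {advantage:.2f}")
```

Changing the frequency of abnormals shows how strongly the required advantage depends on the locality considered.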
CHAPTER 3
SMALL POPULATIONS:
I. Changes of Gene Frequency under
Simplified Conditions
We have now to consider the last of the agencies through which gene
frequencies can be changed. This is the dispersive process, which
differs from the systematic processes in being random in direction,
and predictable only in amount. In order to exclude this process
from the previous discussions we have postulated always a "large"
population, and we have seen that in a large population the gene
frequencies are inherently stable. That is to say, in the absence of
migration, mutation, or selection, the gene and genotype frequencies
remain unaltered from generation to generation. This property of
stability does not hold in a small population, and the gene frequencies
are subject to random fluctuations arising from the sampling of
gametes. The gametes that transmit genes to the next generation
carry a sample of the genes in the parent generation, and if the sample
is not large the gene frequencies are liable to change between one
generation and the next. This random change of gene frequency
is the dispersive process.
The dispersive process has, broadly speaking, three important
consequences. The first is differentiation between subpopulations.
The inhabitants of a large area seldom in nature constitute a single
large population, because mating takes place more often between
inhabitants of the same region. Natural populations are therefore
more or less subdivided into local groups or subpopulations, and the
sampling process tends to cause genetic differences between these, if
the number of individuals in the groups is small. Domesticated or
laboratory populations, in the same way, are often subdivided — for
example, into herds or strains — and in them the subdivision and its
resultant differentiation are often more marked. The second consequence
is a reduction of genetic variation within a small population.
The individuals of the population become more and more alike in
genotype, and this genetic uniformity is the reason for the widespread
use of inbred strains of laboratory animals in physiological and allied
fields of research. (An inbred strain, it may be noted, is a small
population.) The third consequence of the dispersive process is an
increase in the frequency of homozygotes at the expense of heterozygotes.
This, coupled with the general tendency for deleterious
alleles to be recessive, is the genetic basis of the loss of fertility and
viability that almost always results from inbreeding. To explain
these three consequences of the dispersive process is the chief purpose
of this chapter.
There are two different ways of looking at the dispersive process
and of deducing its consequences. One is to regard it as a sampling
process and to describe it in terms of sampling variance. The other
is to regard it as an inbreeding process and describe it in terms of the
genotypic changes resulting from matings between related individuals.
Of these, the first is probably the simpler for a description
of how the process works, but the second provides a more convenient
means of stating the consequences. The plan to be followed here is
first to describe the general nature of the dispersive process from the
point of view of sampling. This will show how the three chief consequences
come about. Then we shall approach the process afresh
from the point of view of inbreeding, and show how the two viewpoints
connect with each other. In all this we shall confine our
attention to the simplest possible situation, excluding migration,
mutation, and selection. Thus we shall see what happens in small
populations in the absence of other factors influencing gene frequency.
In the next chapter we shall extend the conclusions to more realistic
situations, by removing the restrictive simplifications, and we shall in
particular consider the joint effects of the dispersive process and the
systematic processes. Finally, in Chapter 5, we shall consider the
special cases of pedigreed populations, and very small populations
maintained by regular systems of close inbreeding.
The Idealised Population
In order to reduce the dispersive process to its simplest form we
imagine an idealised population as follows. We suppose there to be
initially one large population in which mating is random, and this
population becomes subdivided into a large number of subpopulations.
The subdivision might arise from geographical or ecological
causes under natural conditions, or from controlled breeding in
domesticated or laboratory populations. The initial random-mating
population will be referred to as the base population, and the subpopulations
will be referred to as lines. All the lines together constitute
the whole population, and each line is a "small population" in
which gene frequencies are subject to the dispersive process. When a
single locus is under discussion we cannot properly understand what
goes on in one line except by considering it as one of a large number
of lines. But what happens to the genes at one locus in a number of
lines happens equally to those at a number of loci in one line, provided
they all start at the same gene frequency. So the consequences
of the process apply equally to a single line provided we consider
many loci in it.
The simplifying conditions specified for the idealised population
are the following:
1. Mating is restricted to members of the same line. The lines
are thus isolated in the sense that no genes can pass from one line to
another. In other words migration is excluded.
2. The generations are distinct and do not overlap.
3. The number of breeding individuals in each line is the same for
all lines and in all generations. Breeding individuals are those that
transmit genes to the next generation.
4. Within each line mating is random, including self-fertilisation
in random amount.
5. There is no selection at any stage.
6. Mutation is disregarded.
[Fig. 3.1. Diagrammatic representation of the subdivision of a
single large population (the base population) into a number of
subpopulations, or lines. Diagram labels: base population (N = ∞);
gametes, 2N for each line; breeding individuals; generations 1 and 2.]
The situation implied by these conditions is represented diagrammatically
in Fig. 3.1, and may be described thus: All breeding
individuals contribute equally to a pool of gametes from which zygotes
will be formed. Union of gametes is strictly random. Out of a
potentially large number of zygotes only a limited number survive to
become breeding individuals in the next generation, and this is the
stage at which the sampling of the genes transmitted by the gametes
takes place. Survival of zygotes is random, and consequently the
contribution of the parents to the next generation is not uniform, but
varies according to the chances of survival of their progeny. Since
the population size is constant from generation to generation, the
average number of progeny that reach breeding age is one per
individual parent or two per mated pair of parents. For any particular
zygote the chance of survival is small, and therefore the number of
progeny contributed by individual parents, or by pairs of parents, has
a Poisson distribution.
The following symbols will be used in connexion with the
idealised population.
N = the number of breeding individuals in each line and generation.
This is the population size.
t = time, in generations, starting from the base population at $t_0$.
q = frequency of a particular allele at a locus.
p = 1 − q = frequency of all other alleles at that locus. q and p refer
to the frequencies in any one line; $\bar{q}$ and $\bar{p}$ refer to the
frequencies in the whole population and are the means of q and p;
$q_0$ and $p_0$ are the frequencies in the base population.
Sampling
Variance of gene frequency. The change of gene frequency
resulting from sampling is random in the sense that its direction is
unpredictable. But its magnitude can be predicted in terms of the
variance of the change. Consider the formation of the lines from
the base population. Each line is formed from a sample of N individuals
drawn from the base population. Since each individual
carries two genes at a locus, the subdivision of the population
represents a series of samples each of 2N genes, drawn at random
from the base population. The gene frequencies in these samples
will have an average value equal to that in the base population, i.e.
$q_0$, and will be distributed about this mean with a variance $p_0 q_0/2N$,
which is simply the variance of a ratio, the sample size being in this
case 2N. Thus the change of gene frequency, $\Delta q$, resulting from
sampling in one generation, can be stated in terms of its variance as
$$\sigma^2_{\Delta q} = \frac{p_0 q_0}{2N} \qquad (3.1)$$
This variance of Aq expresses the magnitude of the change of gene
frequency resulting from the dispersive process. It expresses the
expected change in any one line, or the variance of gene frequencies
that would be found among many lines after one generation. Its
effect is a dispersion of gene frequencies among the lines; in other
words the lines come to differ in gene frequency, though the mean
in the population as a whole remains unchanged.
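The statement that the sampling variance is simply the variance of a ratio can be checked by writing out the binomial distribution of the 2N sampled genes. The following is a minimal sketch; the function names and the values q0 = 0.3 and N = 10 are chosen only for illustration and do not come from the text.

```python
from math import comb


def sampling_distribution(q0, n_individuals):
    """Probability of each possible gene frequency in a sample of 2N genes."""
    genes = 2 * n_individuals
    return {
        k / genes: comb(genes, k) * q0**k * (1 - q0) ** (genes - k)
        for k in range(genes + 1)
    }


def mean_and_variance(distribution):
    """Mean and variance of a discrete distribution given as {value: probability}."""
    mean = sum(q * prob for q, prob in distribution.items())
    variance = sum((q - mean) ** 2 * prob for q, prob in distribution.items())
    return mean, variance


q0, n_individuals = 0.3, 10   # illustrative values only
mean_q, var_dq = mean_and_variance(sampling_distribution(q0, n_individuals))
expected = q0 * (1 - q0) / (2 * n_individuals)   # p0 q0 / 2N
print(mean_q, var_dq, expected)
```

The mean comes out equal to the base frequency and the variance equal to p0 q0/2N, apart from floating-point rounding, which is the sense in which the lines disperse while the mean of the whole population stays put.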
In the next generation the sampling process is repeated, but each
line now starts from a different gene frequency and so the second
sampling leads to a further dispersion. The variance of the change
now differs among the lines, since it depends on the gene frequency,
$q_1$, in the first generation of each line separately. The effect of continued
sampling through successive generations is that each line
fluctuates irregularly in gene frequency, and the lines spread apart progressively,
thus becoming differentiated. The erratic changes of gene
frequency shown by the individual lines are exemplified in Fig. 3.2;
and the consequent differentiation, or spreading apart, of the lines
in Fig. 3.3. These changes of gene frequency resulting from sampling
in small populations are known as random drift (Wright, 1931).
[Fig. 3.2. Random drift of the colour gene "non-agouti" in three
lines of mice, each maintained by 6 pairs of parents per generation.
(Original data.) Horizontal axis: generations.]
[Fig. 3.3. Distributions of gene frequencies in 19 consecutive
generations among 105 lines of Drosophila melanogaster, each of 16
individuals. The gene frequencies refer to two alleles at the
"brown" locus ($bw^{75}$ and bw), with initial frequencies of 0.5. The
height of each black column shows the number of lines having the
gene frequency shown on the scale below. (From Buri, 1956;
reproduced by courtesy of the author and the editor of Evolution.)
Horizontal axis: number of $bw^{75}$ genes.]
[Fig. 3.4. Variance of gene frequencies among lines in the experiment
illustrated in Fig. 3.3. The circles are the observed values,
and the smooth curve shows the expected variance as given by
equation 3.2. The value taken for N is 11.5, which is the "effective
number," $N_e$, as explained in the next chapter. (Data from Buri,
1956.) Horizontal axis: generations.]
As the dispersive process proceeds, the variance of gene frequency
among the lines increases, as shown in Fig. 3.4. At any generation, t,
the variance of gene frequencies, $\sigma^2_q$, among the lines is as follows
(see Crow, 1954):
$$\sigma^2_q = p_0 q_0 \left[ 1 - \left( 1 - \frac{1}{2N} \right)^t \right] \qquad (3.2)$$
Since the mean gene frequency among all the lines remains unchanged,
$\bar{q} = q_0$. We may note a fact that will be needed later, and is obvious
from equation 3.2, namely that $\sigma^2_q = \sigma^2_p$. The dispersion of the gene
frequencies, which we have described by reference to one locus in
many lines, could equally well be described by reference to the
frequencies at a number of different loci in one line, provided they all
started from the same initial frequency, and were unlinked.
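The growth of the variance given by equation 3.2 can be compared with a direct calculation in which the binomial sampling of 2N genes is repeated generation after generation. The sketch below follows the whole distribution of gene frequencies among lines exactly, rather than simulating individual lines. The function names are hypothetical, and the values N = 16 and an initial frequency of 0.5 merely echo the layout of the experiment of Fig. 3.3; the census number is used, not the effective number.

```python
from math import comb


def transition_matrix(n_individuals):
    """Row i: probabilities of j copies next generation, given i copies now."""
    genes = 2 * n_individuals
    matrix = []
    for i in range(genes + 1):
        q = i / genes
        matrix.append(
            [comb(genes, j) * q**j * (1 - q) ** (genes - j) for j in range(genes + 1)]
        )
    return matrix


def variance_by_generation(start_count, n_individuals, generations):
    """Exact variance of gene frequency among lines after 1, 2, ... generations."""
    genes = 2 * n_individuals
    matrix = transition_matrix(n_individuals)
    dist = [0.0] * (genes + 1)
    dist[start_count] = 1.0            # every line starts from the base frequency
    variances = []
    for _ in range(generations):
        dist = [
            sum(dist[i] * matrix[i][j] for i in range(genes + 1))
            for j in range(genes + 1)
        ]
        mean = sum(p * k / genes for k, p in enumerate(dist))
        variances.append(sum(p * (k / genes - mean) ** 2 for k, p in enumerate(dist)))
    return variances


def expected_variance(q0, n_individuals, t):
    """Equation 3.2: p0 q0 [1 - (1 - 1/2N)^t]."""
    return q0 * (1 - q0) * (1 - (1 - 1 / (2 * n_individuals)) ** t)


exact = variance_by_generation(start_count=16, n_individuals=16, generations=19)
formula = [expected_variance(0.5, 16, t) for t in range(1, 20)]
for t in (1, 5, 10, 19):
    print(t, round(exact[t - 1], 5), round(formula[t - 1], 5))
```

Under the conditions of the idealised population the two columns agree, and putting t very large makes the variance approach p0 q0.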
Fixation. There are limits to the spreading apart of the lines that
can be brought about by the dispersive process. The gene frequency
cannot change beyond the limits of 0 or 1, and sooner or later each
line must reach one or other of these limits. Moreover, the limits are
"traps" or points of no return, because once the gene frequency has
reached 0 or 1 it cannot change any more in that line. When a
particular allele has reached a frequency of 1 it is said to be fixed in
that line, and when it reaches a frequency of 0 it is lost. When an
allele reaches fixation no other allele can be present in that line, and
the line may then be said to be fixed. When a line is fixed all individuals
in it are of identical genotype with respect to that locus.
Eventually all lines, and all loci in a line, become fixed. The individuals
of a line are then genetically identical, and this is the basis of
the genetic uniformity of highly inbred strains.
The proportion of the lines in which different alleles at a locus are
fixed is equal to the initial frequencies of the alleles. If the base
population contains two alleles $A_1$ and $A_2$ at frequencies $p_0$ and $q_0$
respectively, then $A_1$ will be fixed in the proportion $p_0$ of the lines,
and $A_2$ in the remaining proportion, $q_0$. The variance of the gene
frequency among the lines is then $p_0 q_0$, as may be seen from equation
3.2 by putting t equal to infinity. (In Fig. 3.3 the lines in which
fixation or loss has just occurred are shown, but not those in which it
occurred earlier.)
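That an allele is eventually fixed in a proportion of lines equal to its initial frequency can be illustrated by letting a number of simulated lines drift until each has reached one limit or the other. This is a rough sketch under the conditions of the idealised population; the function names, the seed, and the values q0 = 0.3, N = 5 and 4000 lines are arbitrary choices made for illustration.

```python
import random


def drift_to_fixation(q0, n_individuals, rng):
    """Follow one line until the allele is fixed (True) or lost (False)."""
    genes = 2 * n_individuals
    q = q0
    while 0.0 < q < 1.0:
        # Sample 2N genes at random from the present generation of the line.
        q = sum(rng.random() < q for _ in range(genes)) / genes
    return q == 1.0


def proportion_fixed(q0, n_individuals, n_lines, seed=0):
    """Proportion of simulated lines in which the allele ends up fixed."""
    rng = random.Random(seed)
    fixed = sum(drift_to_fixation(q0, n_individuals, rng) for _ in range(n_lines))
    return fixed / n_lines


fixed_share = proportion_fixed(q0=0.3, n_individuals=5, n_lines=4000)
print(fixed_share)
```

The proportion printed should lie near 0.3, within the sampling error to be expected from a few thousand lines.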
When concerned with the attainment of genetic uniformity one
wants to know how soon fixation takes place; what is the probability
of a particular locus being fixed, or what proportion of all loci in a
line will be fixed, after a certain number of generations. Consideration
of the progressive nature of the dispersion, as illustrated in Fig.
3.3, will show that fixation does not start immediately; the dispersion of
gene frequencies must proceed some way before any line is likely to
reach fixation. To deduce the probability of fixation is mathematically
complicated (see particularly Wright, 1931; Kimura, 1955), and
only an outline of the conclusions can be given here. There are two
phases in the dispersive process: during the initial phase the gene
frequencies are spreading out from the initial value; this leads to a
steady phase, when the gene frequencies are evenly spread out over
the range between the two limits, and all gene frequencies except the
two limits are equally probable. The duration of the initial phase
in generations is a small multiple of the population size, depending
on the initial gene frequency. With $q_0 = 0.5$ it lasts about 2N generations,
and with $q_0 = 0.1$ it lasts about 4N generations (Kimura, 1955).
(In the experiment illustrated in Fig. 3.3 it lasted till about the
seventeenth generation.) The theoretical distributions of gene
frequency during the initial phase, with original frequencies of 0.5
and 0.1, are shown in Fig. 3.5.
[Fig. 3.5. Theoretical distributions of gene frequency among lines.
The initial and mean gene frequency is 0.5 in the left hand figure,
and 0.1 in the right hand figure. Previously fixed lines are excluded.
N = population size; T = time in generations. Note the general
agreement of the left hand figure with the observed distributions
shown in Fig. 3.3. (From Kimura, 1955; reproduced by courtesy
of the author and the editor of the Proc. Nat. Acad. Sci. Wash.)
Curves are drawn for T = N/10, N/5, N/2, N, 2N and 3N; horizontal
axis: gene frequency, from 0 to 1.]
To visualise the process one might think of a pile of dry sand in a
narrow trough open at the two ends. Agitation of the trough will
cause the pile to spread out along the trough, till eventually it is
evenly spread along its length. Toward the end of the spreading out
some of the sand will have fallen off the ends of the trough, and this
represents fixation and loss. Continued agitation after the sand is
evenly spread will cause it to fall off the ends at a steady rate, and the
depth of sand left in the trough will be continually reduced at a
steady rate until in the end none is left. The initial gene frequency is
represented by the position of the initial pile of sand. If it is near one
end of the trough, much of the sand will have fallen off that end before
any reaches the other end, and the total amount falling off each
end will be in proportion to the relative distance of the initial pile
from the two ends. Relating this model to the diagram of the process
in Fig. 3.5, the position along the trough represents the horizontal
axis, or gene frequency, and the depth of the sand represents the
vertical axis, or the probability of a line having a particular gene
frequency. The graphs are thus analogous to longitudinal sections
through the trough and its sand.
[Fig. 3.6. Fixation and loss occurring among 107 lines of Drosophila
melanogaster, during 19 generations. This is not the same
experiment as that illustrated in Figs. 3.3 and 3.4, but was similar
in nature. There were 16 parents per generation in each line, and
the effective number (see chapter 4) was 9. The closed circles
show the percentage of lines in which the $bw^{75}$ allele has become
fixed; the open circles show the percentage in which it has been
lost and the bw allele fixed. The smooth curve is the expected
amount of fixation of one or other allele, computed from the effective
number by equation 3.3. (Data from Buri, 1956.) Horizontal
axis: generations.]
The probability of fixation at any time during the initial phase is
too complicated for explanation here, and the reader is referred to
the papers of Kimura (1954, 1955). After the steady phase has been
reached fixation proceeds at a constant rate: a proportion 1/2N of the
lines previously unfixed become fixed in each generation. The
proportion of lines in which a gene with initial frequency $q_0$ is
expected to be fixed, lost, or to be still segregating is as follows
(Wright, 1952a):
fixed:    $q_0 - 3 p_0 q_0 P$
lost:     $p_0 - 3 p_0 q_0 P$
neither:  $6 p_0 q_0 P$          (3.3)
where $P = \left( 1 - \frac{1}{2N} \right)^t$.
Fig. 3.6 shows the progress of fixation and loss in an experiment
with Drosophila.
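The three expressions of equation 3.3 are easily evaluated for any generation. The sketch below takes the equation in the form: fixed, q0 − 3 p0 q0 P; lost, p0 − 3 p0 q0 P; neither, 6 p0 q0 P; with P = (1 − 1/2N)^t. The function name is hypothetical, and the values used in the example (N = 9, as for the effective number in Fig. 3.6, with an initial frequency of 0.5 assumed) are for illustration only.

```python
def fixation_proportions(q0, n_individuals, t):
    """Expected proportions of lines fixed, lost, or still segregating (equation 3.3)."""
    p0 = 1 - q0
    big_p = (1 - 1 / (2 * n_individuals)) ** t   # P = (1 - 1/2N)^t
    return {
        "fixed": q0 - 3 * p0 * q0 * big_p,
        "lost": p0 - 3 * p0 * q0 * big_p,
        "neither": 6 * p0 * q0 * big_p,
    }


for t in (19, 50, 200):
    print(t, fixation_proportions(q0=0.5, n_individuals=9, t=t))
```

The three proportions always add up to 1, and as t increases the fixed and lost proportions approach q0 and p0. Since the expressions refer to the steady phase, they should not be applied to the earliest generations, where they can give negative values.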
Genotype frequencies. Change of gene frequency leads to
change of genotype frequencies; so the genotype frequencies in small
populations follow the changes of gene frequency resulting from the
dispersive process. In the idealised population, which we are still
considering, mating is random within each of the lines. Consequently
the genotype frequencies in any one line are the Hardy-Weinberg
frequencies appropriate to the gene frequency in the previous generation
of that line. As the lines drift apart in gene frequency they
become differentiated also in genotype frequencies. But differentiation
is not the only aspect of the change: the general direction of the
change is toward an increase of homozygous, and a decrease of
heterozygous, genotypes. The reason for this is the dispersion of gene
frequencies from intermediate values toward the extremes. Heterozygotes
are most frequent at intermediate gene frequencies (see
Fig. 1.1), so the drift of gene frequencies toward the extremes leads,
on the average, to a decline in the frequency of heterozygotes.
The genotype frequencies in the population as a whole can be
deduced from a knowledge of the variance of gene frequencies in the
following way. If an allele has a frequency q in one particular line,
homozygotes of that allele will have a frequency of $q^2$ in that line.
The frequency of these homozygotes in the population as a whole will
therefore be the mean value of $q^2$ over all lines. We shall write this
mean frequency of homozygotes as $\overline{(q^2)}$. The value of $\overline{(q^2)}$ can be
found from a knowledge of the variance of gene frequencies among
the lines, by noting that the variance of a set of observations is found
by deducting the square of the mean from the mean of the squared
observations. Thus
$$\sigma^2_q = \overline{(q^2)} - \bar{q}^2$$
and
$$\overline{(q^2)} = \bar{q}^2 + \sigma^2_q \qquad (3.4)$$
where $\sigma^2_q$ is the variance of gene frequencies among the lines, as given
in equation 3.2, and $\bar{q}^2$ is the square of the mean gene frequency.
Since the mean gene frequency, $\bar{q}$, is equal to the original, $q_0$, it
follows that $\bar{q}^2$ or $q_0^2$ is the original frequency of homozygotes in the
base population. Thus in the population as a whole the frequency of
homozygotes of a particular allele increases, and is always in excess
of the original frequency by an amount equal to the variance of the
gene frequency among the lines. In a two-allele system the same
applies to the other allele, and the frequency of heterozygotes is
reduced correspondingly. Noting from equation 3.2 that $\sigma^2_q = \sigma^2_p$ we
therefore find the genotypic frequencies for a locus with two alleles
as follows:
Genotype      Frequency in whole population
$A_1A_1$      $p_0^2 + \sigma^2_q$
$A_1A_2$      $2 p_0 q_0 - 2\sigma^2_q$          (3.5)
$A_2A_2$      $q_0^2 + \sigma^2_q$
These genotype frequencies are no longer the Hardy-Weinberg
frequencies appropriate to the original or mean gene frequency. The
Hardy-Weinberg relationships between gene frequency and genotype
frequencies, though they hold good within each line separately, do
not hold if the lines are taken together and regarded as a single
population. This fact causes some difficulty in relating gene and
genotype frequencies in natural populations, because they are often
more or less subdivided and the degree of subdivision is seldom
known. An example of the decrease of heterozygotes resulting from
the dispersion of gene frequencies is shown in Fig. 3.7.
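The genotype frequencies in the population as a whole follow directly from the variance of gene frequency among the lines. The following sketch combines the variance of equation 3.2 with the three expressions of 3.5; the function name is hypothetical and the numerical values (initial frequency 0.5, N = 16, 19 generations) are for illustration only.

```python
def genotype_frequencies(q0, n_individuals, t):
    """Genotype frequencies over all lines together, after t generations."""
    p0 = 1 - q0
    # Variance of gene frequency among lines, equation 3.2.
    variance = p0 * q0 * (1 - (1 - 1 / (2 * n_individuals)) ** t)
    return {
        "A1A1": p0**2 + variance,
        "A1A2": 2 * p0 * q0 - 2 * variance,
        "A2A2": q0**2 + variance,
    }


freqs = genotype_frequencies(q0=0.5, n_individuals=16, t=19)
print(freqs)
```

The three frequencies add up to 1. Substituting the variance into the middle expression shows that the frequency of heterozygotes is 2 p0 q0 (1 − 1/2N)^t, so that it falls by the fraction 1/2N in each generation.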
The foregoing account of genotype frequencies describes the
situation in terms of one locus in many lines. It can be regarded
equally as referring to many loci in one line; then the change in any
one line or small population is an increase in the number of loci at
which individuals are homozygous and a corresponding decrease in
the number at which they are heterozygous; in short, an increase of
homozygotes at the expense of heterozygotes. This change of genotype
frequencies resulting from the dispersive process is the genetic
basis of the phenomenon of inbreeding depression, of which a full
explanation will be found in Chapter 14.
[Fig. 3.7. Change of frequency of heterozygotes among 105 lines
of Drosophila melanogaster, each with 16 parents. The same experiment
as is illustrated in Figs. 3.3 and 3.4. The frequency of
heterozygotes refers to the population as a whole, all lines taken
together. The smooth curve is the expected frequency of heterozygotes.
(Data from Buri, 1956.) Horizontal axis: generations.]
We have now surveyed the general nature of the dispersive process
and its three major consequences — differentiation of subpopulations,
genetic uniformity within subpopulations, and overall increase in
the frequency of homozygous genotypes. Let us now look at the
process from another viewpoint, as an inbreeding process. Instead
of regarding the increase of homozygotes as a consequence of the
dispersion of gene frequencies, we shall now look directly at the
manner in which the additional homozygotes arise.
Inbreeding
Inbreeding means the mating together of individuals that are
related to each other by ancestry. That the degree of relationship
between the individuals in a population depends on the size of the
population will be clear by consideration of the numbers of possible
ancestors. In a population of bisexual organisms every individual
has two parents, four grandparents, eight great-grandparents, etc.,
and t generations back it has $2^t$ ancestors. Not very many generations
back the number of individuals required to provide separate ancestors
for all the present individuals becomes larger than any real population
could contain. Any pair of individuals must therefore be related
to each other through one or more common ancestors in the more or
less remote past; and the smaller the size of the population in previous
generations the less remote are the common ancestors, or the greater
their number. Thus pairs mating at random are more closely related
to each other in a small population than in a large one. This is why
the properties of small populations can be treated as the consequences
of inbreeding.
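How few generations are needed for the number of ancestral places to outrun the population can be seen from a short calculation. The sketch below counts only the 2^t places in the pedigree of a single individual and compares them with a population of fixed size; the function name and the population sizes are hypothetical, chosen for illustration.

```python
def generations_until_ancestors_exceed(population_size):
    """Smallest t for which 2**t exceeds the given population size."""
    t = 0
    while 2**t <= population_size:
        t += 1
    return t


for size in (100, 1_000, 1_000_000):
    print(size, generations_until_ancestors_exceed(size))
```

Even for a population of a million the places in one individual's pedigree exceed the number of individuals available within some twenty generations, so that the same ancestors must appear more than once.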
The essential consequence of two individuals having a common
ancestor is that they may both carry replicates of one of the genes
present in the ancestor; and if they mate they may pass on these
replicates to their offspring. Thus inbred individuals — that is to
say, offspring produced by inbreeding — may carry two genes at a
locus that are replicates of one and the same gene in a previous
generation. Consideration of this consequence of inbreeding shows
that there are two sorts of identity among allelic genes, and two sorts
of homozygote. The sort of identity we have hitherto considered is a
functional identity. Two genes are regarded as being identical if they
are not recognisably different in their phenotypic effects, or by any
other functional criterion; in other words, if they have the same
allelomorphic state. Following the terminology of Crow (1954) they
may be called alike in state. An individual carrying a pair of such genes
is a homozygote in the ordinary sense. The new sort of identity is
one of replication. If two genes originated from the replication of one
gene in a previous generation, they may be said to be identical by
descent, or simply identical. An individual possessing two identical
genes at a locus may be called an identical homozygote. Genes that
are not identical by descent may be called independent, whether they
are alike in state or different alleles; and homozygotes of independent
genes may be called independent homozygotes.
Identity by descent provides the basis for a measure of the dispersive
process, through the degree of relationship between the
mating pairs. The measure is the coefficient of inbreeding, which is the
probability that the two genes at any locus in an individual are identical
by descent. It refers to an individual and expresses the degree of
relationship between the individual's parents. If the parents mated
at random then the coefficient of inbreeding of the progeny is the
probability that two gametes taken at random from the parent
generation carry identical genes at a locus. The coefficient of inbreeding,
generally symbolised by F, was first defined by Wright
(1922) as the correlation between uniting gametes; the definition
given here, which follows that of Malecot (1948) and Crow (1954), is
equivalent.
The degree of relationship expressed in the inbreeding coefficient
is essentially a comparison between the population in question and
some specified or implied base population. Without this point of
reference it is meaningless, as the following consideration will show.
On account of the limitation in the number of independent ancestors
in any population not infinitely large, all genes now present at a locus
in the population would be found to be identical by descent if traced
far enough back into the remote past. Therefore the inbreeding
coefficient only becomes meaningful if we specify some time in the
past beyond which ancestries will not be pursued, and at which all
genes present in the population are to be regarded as independent —
that is, not identical by descent. This point is the base population
and by its definition it has an inbreeding coefficient of zero. The
inbreeding coefficient of a subsequent generation expresses the
amount of the dispersive process that has taken place since the base
population, and compares the degree of relationship between the
individuals now, with that between individuals in the base population.
Reference to the base population is not always explicitly stated, but is
always implied. For example, we can speak of the inbreeding coefficient
of a population subdivided into lines. The comparison of
relationship is between the individuals of a line and individuals
taken at random from the whole population. The base population
implied is a hypothetical population from which all the lines were
derived.
Inbreeding in the idealised population. Let us now return to
62 SMALL POPULATIONS: I [Chap. 3
the idealised population and deduce the coefficient of inbreeding in
successive generations, starting with the base population and its
progeny constituting generation i. The situation may be visualised
by thinking of a hermaphrodite marine organism, capable of
self-fertilisation, shedding eggs and sperm into the sea. There are $N$
individuals each shedding equal numbers of gametes which unite at
random. All the genes at a locus in the base population have to be
regarded as being non-identical; so, considering only one locus,
among the gametes shed by the base population there are $2N$ different
sorts, in equal numbers, bearing the genes $A_1$, $A_2$, $A_3$, etc. at the A
locus. The gametes of any one sort carry identical genes; those of
different sorts carry genes of independent origin. What is the
probability that a pair of gametes taken at random carry identical genes?
This is the inbreeding coefficient of generation 1. Any gamete has a
$1/2N$ chance of uniting with another of the same sort, so $1/2N$ is the
probability that uniting gametes carry identical genes, and is thus the
coefficient of inbreeding of the progeny. Now consider the second
generation. There are now two ways in which identical homozygotes
can arise, one from the new replication of genes and the other
from the previous replication. The probability of newly replicated
genes coming together in a zygote is again $1/2N$. The remaining
proportion, $1 - 1/2N$, of zygotes carry genes that are independent in
their origin from generation 1, but may have been identical in their
origin from generation 0. The probability of their identical origin in
generation 0 is what we have already deduced as the inbreeding
coefficient of generation 1. Thus the total probability of identical
homozygotes in generation 2 is
$$F_2 = \frac{1}{2N} + \left(1 - \frac{1}{2N}\right)F_1$$
where $F_1$ and $F_2$ stand for the inbreeding coefficients of generations
1 and 2 respectively. The same argument applies to subsequent
generations, so that in general the inbreeding coefficient of individuals
in generation $t$ is
$$F_t = \frac{1}{2N} + \left(1 - \frac{1}{2N}\right)F_{t-1} \tag{3.6}$$
Thus the inbreeding coefficient is made up of two parts: an "increment,"
$1/2N$, attributable to the new inbreeding, and a "remainder,"
attributable to the previous inbreeding and having the inbreeding
coefficient of the previous generation. In the idealised population the
"new inbreeding" arises from self-fertilisation, which brings together
genes replicated in the immediately preceding generation. Exclusion
of self-fertilisation simply shifts the replication one generation
further back, so that the "new inbreeding" brings together genes
replicated in the grandparental generation; the coefficient of
inbreeding is affected, but not very much, as we shall see later. The
distinction between "new" and "old" inbreeding brings clearly to
light a point which we note here in passing because it will be needed
later and is often important in practice: if there is no "new
inbreeding," as would happen if the population size were suddenly increased,
the previous inbreeding is not undone, but remains where it was
before the increase of population size.
Let us call the "increment" or "new inbreeding" $\Delta F$, so that
$$\Delta F = \frac{1}{2N} \tag{3.7}$$
Equation 3.6 may then be rewritten in the form
$$F_t = \Delta F + (1 - \Delta F)F_{t-1} \tag{3.8}$$
Further rearrangement makes clearer the precise meaning of the
"increment," $\Delta F$.
$$\Delta F = \frac{F_t - F_{t-1}}{1 - F_{t-1}} \tag{3.9}$$
From the equation written thus we see that the "increment," $\Delta F$,
measures the rate of inbreeding in the form of a proportionate increase.
It is the increase of the inbreeding coefficient in one generation,
relative to the distance that was still to go to reach complete inbreeding.
This measure of the rate of inbreeding provides a convenient way of
going beyond the restrictive simplifications of the idealised
population, and it thus provides a means of comparing the inbreeding effects
of different breeding systems. When the inbreeding coefficient is
expressed in terms of $\Delta F$, equation 3.8 is valid for any breeding system
and is not restricted to the idealised population, though only in the
idealised population is $\Delta F$ equal to $1/2N$.
So far we have done no more than relate the inbreeding coefficient
in one generation to that of the previous generation. It remains to
extend equation 3.8 back to the base population and so express the
inbreeding coefficient in terms of the number of generations. This is
made easier by the use of a symbol, $P$, for the complement of the
inbreeding coefficient, $1 - F$, which is known as the panmictic index.
Substitution of $P = 1 - F$ in equation 3.8 gives
$$\frac{P_t}{P_{t-1}} = 1 - \Delta F \tag{3.10}$$
Thus the panmictic index is reduced by a constant proportion in
each generation. Extension back to generation $t - 2$ gives
$$\frac{P_t}{P_{t-2}} = (1 - \Delta F)^2$$
and extension back to the base population gives
$$P_t = (1 - \Delta F)^t P_0 \tag{3.11}$$
where $P_0$ is the panmictic index of the base population. The base
population is defined as having an inbreeding coefficient of 0, and
therefore a panmictic index of 1. The inbreeding coefficient in any
generation, $t$, referred to the base population, is therefore
$$F_t = 1 - (1 - \Delta F)^t \tag{3.12}$$
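The agreement between the generation-by-generation recurrence (equation 3.8) and the closed form (equation 3.12) can be checked numerically. The following is a minimal Python sketch and not part of the original text; the function names and the population size N = 20 are illustrative assumptions.

```python
def inbreeding_by_recurrence(delta_f, generations):
    """Iterate F_t = dF + (1 - dF) * F_{t-1}, starting from F_0 = 0 (equation 3.8)."""
    f = 0.0
    history = [f]
    for _ in range(generations):
        f = delta_f + (1.0 - delta_f) * f
        history.append(f)
    return history


def inbreeding_closed_form(delta_f, t):
    """F_t = 1 - (1 - dF)^t (equation 3.12)."""
    return 1.0 - (1.0 - delta_f) ** t


if __name__ == "__main__":
    n = 20                     # assumed size of an idealised population
    delta_f = 1.0 / (2 * n)    # equation 3.7
    f_values = inbreeding_by_recurrence(delta_f, 10)
    for t in (1, 2, 10):
        print(t, round(f_values[t], 4), round(inbreeding_closed_form(delta_f, t), 4))
```

The two columns printed for each generation should coincide, which is simply equation 3.12 restated.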
The consequences of the dispersive process were described earlier
from the viewpoint of sampling variance. Let us now look again at
them, applying the rate of inbreeding and the inbreeding coefficient
as measures of the process. Strictly speaking we should refer still
to the idealised population, but the equating of the two viewpoints
can be regarded as generally valid except in some very special and
unlikely circumstances (see Crow, 1954).
Variance of gene frequency. First, the variance of the change
of gene frequency in one generation, taken from equation 3.1 and
expressed in terms of the rate of inbreeding, becomes
$$\sigma^2_{\Delta q} = \frac{p_0 q_0}{2N} = p_0 q_0 \Delta F \tag{3.13}$$
Similarly, the variance of gene frequencies among the lines at
generation $t$, taken from equation 3.2 and expressed in terms of the
inbreeding coefficient from 3.12, becomes
$$\sigma^2_q = p_0 q_0\left[1 - \left(1 - \frac{1}{2N}\right)^t\right] = p_0 q_0 F \tag{3.14}$$
Thus $\Delta F$ expresses the rate of dispersion and $F$ the cumulated effect
of random drift.
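The relation between the variance of gene frequency among lines and the inbreeding coefficient (equation 3.14) can be illustrated by simulating random drift directly. The sketch below is an illustration only, under the assumptions of the idealised population; the function names, the seed, and the parameter values (N = 10, 2000 lines, 5 generations) are arbitrary choices and not taken from the text.

```python
import random
import statistics


def simulate_line_frequencies(n, p0, generations, n_lines, seed=1):
    """Return the gene frequency q in each line after the given number of generations.

    In every generation 2N genes are sampled at random from the previous
    generation's gene frequency (binomial sampling).
    """
    rng = random.Random(seed)
    final = []
    for _line in range(n_lines):
        q = 1.0 - p0
        for _gen in range(generations):
            count = sum(1 for _gene in range(2 * n) if rng.random() < q)
            q = count / (2 * n)
        final.append(q)
    return final


def expected_variance(n, p0, generations):
    """sigma^2_q = p0 * q0 * F, with F = 1 - (1 - 1/2N)^t (equations 3.12 and 3.14)."""
    f = 1.0 - (1.0 - 1.0 / (2 * n)) ** generations
    return p0 * (1.0 - p0) * f


if __name__ == "__main__":
    lines = simulate_line_frequencies(n=10, p0=0.5, generations=5, n_lines=2000)
    print("simulated variance:", round(statistics.pvariance(lines), 4))
    print("expected variance: ", round(expected_variance(10, 0.5, 5), 4))
```

With a finite number of simulated lines the two figures agree only approximately; the agreement improves as the number of lines is increased.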
Genotype frequencies. Leaving fixation aside for the moment,
let us consider next the genotype frequencies in the population as a
whole. The genotype frequencies expressed in terms of the variance
of gene frequency in equations 3.5 can be rewritten in terms of the
coefficient of inbreeding from equation 3.14. The frequency of $A_2A_2$,
for example, is
$$\overline{q^2} = q_0^2 + \sigma^2_q = q_0^2 + p_0 q_0 F$$
The genotype frequencies expressed in this way are entered in the
left-hand side of Table 3.1. As was explained before, this way of
writing the genotype frequencies shows how the homozygotes
increase at the expense of the heterozygotes.

Table 3.1
Genotype frequencies for a locus with two alleles, expressed
in terms of the inbreeding coefficient, $F$.

Genotype   Original frequencies +            Origin:
           change due to inbreeding          independent + identical
$A_1A_1$   $p_0^2 + p_0 q_0 F$         or    $p_0^2(1 - F) + p_0 F$
$A_1A_2$   $2p_0 q_0 - 2p_0 q_0 F$     or    $2p_0 q_0(1 - F)$
$A_2A_2$   $q_0^2 + p_0 q_0 F$         or    $q_0^2(1 - F) + q_0 F$

Recognition of identity
by descent to which the inbreeding viewpoint led us means that we
can now distinguish the two sorts of homozygote, identical and
independent, among both the $A_1A_1$ and $A_2A_2$ genotypes. The
frequency of identical homozygotes among both genotypes together is
by definition the inbreeding coefficient, $F$; and it is clear that the
division between the two genotypes is in proportion to the initial
gene frequencies. So $p_0 F$ is the frequency of $A_1A_1$ identical
homozygotes, and $q_0 F$ that of $A_2A_2$ identical homozygotes. The remaining
genotypes, both homozygotes and heterozygotes, carry genes that are
independent in origin and are therefore the equivalent of pairs of
gametes taken at random from the population as a whole. Their
frequencies are therefore the Hardy-Weinberg frequencies. Thus,
from the inbreeding viewpoint, we arrive at the genotype frequencies
shown in the right-hand columns of Table 3.1. This way of writing
the genotype frequencies shows how homozygotes are divided
between those of independent and those of identical origin. The
equivalence of the two ways of expressing the genotype frequencies
can be verified from their algebraic identity. Both ways show equally
clearly how the heterozygotes are reduced in frequency in proportion
to $1 - F$. The term "heterozygosity" is often used to express the
frequency of heterozygotes at any time, relative to their frequency in
the base population. The heterozygosity is the same as the panmictic
index, $P$. Thus if $H_t$ and $H_0$ are the frequencies of heterozygotes
for a pair of alleles at generation $t$ and in the base population
respectively, then the heterozygosity at generation $t$ is
$$\frac{H_t}{H_0} = P_t \tag{3.15}$$
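The algebraic identity of the two ways of writing the genotype frequencies in Table 3.1 can also be confirmed numerically. The following Python sketch is offered only as an illustration; the function name and the example values of $p_0$ and $F$ are assumptions made for the purpose.

```python
def genotype_frequencies(p0, f):
    """Return the genotype frequencies of Table 3.1 written in both ways.

    'change' is the original frequency plus the change due to inbreeding;
    'origin' separates genes of independent origin from identical ones.
    """
    q0 = 1.0 - p0
    change = {
        "A1A1": p0 ** 2 + p0 * q0 * f,
        "A1A2": 2 * p0 * q0 - 2 * p0 * q0 * f,
        "A2A2": q0 ** 2 + p0 * q0 * f,
    }
    origin = {
        "A1A1": p0 ** 2 * (1 - f) + p0 * f,
        "A1A2": 2 * p0 * q0 * (1 - f),
        "A2A2": q0 ** 2 * (1 - f) + q0 * f,
    }
    return change, origin


if __name__ == "__main__":
    change, origin = genotype_frequencies(p0=0.3, f=0.25)
    for genotype in ("A1A1", "A1A2", "A2A2"):
        print(genotype, round(change[genotype], 4), round(origin[genotype], 4))
    # Heterozygosity relative to the base population equals P = 1 - F.
    print("H_t / H_0 =", round(change["A1A2"] / (2 * 0.3 * 0.7), 4))
```

For any gene frequency and any value of $F$ between 0 and 1 the two columns agree and the three frequencies sum to 1.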
Fixation. There is little to add, from the inbreeding viewpoint, to
the description of fixation given earlier. The rate of fixation (that
is, the proportion of unfixed loci that become fixed in any generation)
is equal to $\Delta F$, after the steady phase has been reached and the
distribution of gene frequencies has become flat. The quantity $P$ in
equations 3.3, which give the probability of a gene having become
fixed or lost, is equal to $1 - F$. We may note, however, that the
probability of fixation is not very different from the inbreeding
coefficient itself. The explanation comes more readily by considering
the probability that a locus remains unfixed. This probability was
given in equation 3.3 for a locus with two alleles after enough
generations have passed to take the population into the steady phase.
Expressed in terms of the inbreeding coefficient, from equation 3.12,
it is $6p_0q_0(1 - F)$. Now, the value of $p_0q_0$ does not change very much
over quite a wide range of gene frequencies, and so the probability
that a locus is still unfixed is not very sensitive to the initial gene
frequency. The value of $6p_0q_0$ lies between 1.0 and 1.5 over a range
of gene frequency from 0.2 to 0.8, a range that is likely to cover many
situations. Consequently the probability that a line still segregates,
or the proportion of loci expected to remain unfixed, is likely to lie
between $(1 - F)$ and $1.5(1 - F)$. Thus the inbreeding coefficient gives
a good idea of the approximate probability of fixation, even in the
absence of a knowledge of the initial gene frequencies. That the
approximation may be quite close enough for practical purposes may
be seen by taking a specific example. In work involving
immunological reactions it may be necessary to produce a strain in which all
loci that determine the reactions have been fixed. One therefore
wants to know the inbreeding coefficient necessary to raise the
probability of fixation, or the proportion of loci expected to be fixed,
to a certain level, say 90 per cent. The inbreeding coefficient needed
to do this would, on the above considerations, lie between 0.90 and
0.93, and this would answer the question with quite enough accuracy
for most purposes.
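The range quoted for the 90 per cent example can be reproduced by solving $1 - 6p_0q_0(1 - F)$ for $F$. The sketch below is a hedged illustration: the function names are hypothetical, and the expression applies only in the steady phase, so it should not be used for small values of $F$.

```python
def proportion_unfixed(f, p0):
    """Steady-phase probability that a locus still segregates: 6 * p0 * q0 * (1 - F)."""
    return 6.0 * p0 * (1.0 - p0) * (1.0 - f)


def inbreeding_needed(target_fixed, p0):
    """Solve 1 - 6 * p0 * q0 * (1 - F) = target_fixed for F."""
    return 1.0 - (1.0 - target_fixed) / (6.0 * p0 * (1.0 - p0))


if __name__ == "__main__":
    # Initial gene frequencies spanning the range 0.2 to 0.8 considered in the text.
    for p0 in (0.2, 0.5, 0.8):
        print(p0, round(inbreeding_needed(0.90, p0), 3))
```

The values obtained for initial gene frequencies between 0.2 and 0.8 fall close to the range of 0.90 to 0.93 given above.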
CHAPTER 4
SMALL POPULATIONS:
II. Less Simplified Conditions
In order to simplify the description of the dispersive process we
confined our attention in the last chapter to an idealised population,
and to do this we had to specify a number of restrictive conditions,
which could seldom be fulfilled in real populations. The purpose of
this chapter is to adapt the conclusions of the last chapter to situations
in which the conditions imposed do not hold; in other words to
remove the more serious restrictions and bring the conclusions closer
to reality. The restrictive conditions were of two sorts, one sort
being concerned with the breeding structure of the population and
the other excluding mutation, migration, and selection from
consideration. We shall first describe the effects of deviations from the
idealised breeding structure, and then consider the outcome of the
dispersive process when mutation, migration, or selection are
operating at the same time.
Effective Population Size
If the breeding structure does not conform to that specified for
the idealised population, it is still possible to evaluate the dispersive
process in terms of either the variance of gene frequencies or the rate
of inbreeding. This can be done by the same general methods and
no new principles are involved. We shall therefore give the
conclusions briefly and without detailed explanation. The most
convenient way of dealing with any particular deviation from the
idealised breeding structure is to express the situation in terms of
the effective number of breeding individuals, or the effective population
size. This is the number of individuals that would give rise to the
sampling variance or the rate of inbreeding appropriate to the
conditions under consideration, if they bred in the manner of the
idealised population. Thus, by converting the actual number, $N$, to
the effective number, $N_e$, we can apply the formulae deduced in the
last chapter. The rate of inbreeding, for example, is
$$\Delta F = \frac{1}{2N_e} \tag{4.1}$$
just as for the idealised population $\Delta F = 1/2N$ (equation 3.7).
The relationships between actual and effective numbers in the
situations most commonly met with are given below. The exact
expressions are often complicated, but in most circumstances an
approximation can be used with sufficient accuracy. We should first
note that the actual number, $N$, refers to breeding individuals (the
breeding individuals of one generation) and it therefore cannot be
obtained directly from a census, unless the different age-groups are
distinguished.
Bisexual organisms: self-fertilisation excluded. The
exclusion of self-fertilisation makes very little difference to the rate of
inbreeding, unless $N$ is very small, as with close inbreeding. The
relationship of effective to actual numbers (Wright, 1931) is
$$N_e = N + \tfrac{1}{2} \quad \text{(approx.)} \tag{4.2}$$
and the rate of inbreeding is
$$\Delta F = \frac{1}{2N + 1} \quad \text{(approx.)} \tag{4.3}$$
The exact expression for the inbreeding coefficient in a bisexual
population, and its derivation, are given by Malecot (1948).
Different numbers of males and females. In domestic and
laboratory animals the sexes are often unequally represented among
the breeding individuals, since it is more economical, when possible,
to use fewer males than females. The two sexes, however, whatever
their relative numbers, contribute equally to the genes in the next
generation. Therefore the sampling variance attributable to the two
sexes must be reckoned separately. Since the sampling variance is
proportional to the reciprocal of the number, the effective number is
twice the harmonic mean of the numbers of the two sexes (Wright,
1931), so that
$$\frac{1}{N_e} = \frac{1}{4N_m} + \frac{1}{4N_f} \tag{4.4}$$
where $N_m$ and $N_f$ are the actual numbers of males and females
respectively. The rate of inbreeding is then
$$\Delta F = \frac{1}{8N_m} + \frac{1}{8N_f} \quad \text{(approx.)} \tag{4.5}$$
This gives a close enough approximation unless both $N_m$ and $N_f$ are
very small, as with close inbreeding. It should be noted that the rate
of inbreeding depends chiefly on the numbers of the less numerous
sex. For example, if a population were maintained with an
indefinitely large number of females but only one male in each
generation, the effective number would be only about 4.
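The dependence of the effective number on the less numerous sex is easily seen by evaluating equation 4.4 for a few cases. This is a minimal sketch in Python; the function names and the numbers of males and females chosen are illustrative assumptions.

```python
def effective_number_two_sexes(n_males, n_females):
    """Solve 1/N_e = 1/(4 N_m) + 1/(4 N_f) for N_e (equation 4.4)."""
    return 4.0 * n_males * n_females / (n_males + n_females)


def rate_of_inbreeding(n_e):
    """Delta F = 1 / (2 N_e) (equation 4.1)."""
    return 1.0 / (2.0 * n_e)


if __name__ == "__main__":
    # One male with an increasing number of females: N_e approaches 4.
    for n_females in (1, 10, 100, 1_000_000):
        n_e = effective_number_two_sexes(1, n_females)
        print(n_females, round(n_e, 3), round(rate_of_inbreeding(n_e), 4))
```

With equal numbers of the two sexes the function returns $N_m + N_f$, the actual number, as it should.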
Unequal numbers in successive generations. The rate of
inbreeding in any one generation is given, as before, by $1/2N$. If the
numbers are not constant from generation to generation, then the
mean rate of inbreeding is the mean value of $1/2N$ in successive
generations. The effective number is the harmonic mean of the numbers in
each generation (Wright, 1939). Over a period of $t$ generations,
therefore,
$$\frac{1}{N_e} = \frac{1}{t}\left[\frac{1}{N_1} + \frac{1}{N_2} + \frac{1}{N_3} + \dots + \frac{1}{N_t}\right] \quad \text{(approx.)} \tag{4.6}$$
Thus the generations with the smallest numbers have the most effect.
The reason for this can be seen by consideration of the "new" and
"old" inbreeding referred to in connexion with equation 3.6. An
expansion in numbers does not affect the previous inbreeding; it
merely reduces the amount of new inbreeding. So, in a population
with fluctuating numbers the inbreeding proceeds by steps of varying
amount, and the present size of the population indicates only the
present rate of inbreeding.
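The disproportionate effect of a generation of small numbers can be seen by evaluating the harmonic mean of equation 4.6. The sketch below is illustrative only; the function name and the sequence of population sizes are invented for the example.

```python
def effective_number_over_generations(sizes):
    """Harmonic mean of the numbers in successive generations (equation 4.6)."""
    t = len(sizes)
    return t / sum(1.0 / n for n in sizes)


if __name__ == "__main__":
    steady = [100, 100, 100, 100]
    bottleneck = [100, 100, 10, 100]   # one generation reduced to 10 individuals
    print(round(effective_number_over_generations(steady), 1))
    print(round(effective_number_over_generations(bottleneck), 1))
    print(sum(bottleneck) / len(bottleneck))   # arithmetic mean, for comparison
```

A single generation of 10 individuals pulls the effective number far below the arithmetic mean of the four generations.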
Non-random distribution of family size. This is probably the
commonest and most important deviation from the breeding system
of the idealised population. Its consequence is usually to render the
effective number less than the actual, but in special circumstances it
makes it greater. Family size means here the number of progeny of
an individual parent or of a pair of parents, that survive to become
breeding individuals. It will be remembered that each breeding
individual in the idealised population contributes equally to the pool
of gametes, and therefore equally also to the potential zygotes in the
next generation. Survival of zygotes is random. The mean number
of progeny surviving to breeding age is 1 for individual parents and 2
for pairs of parents. Since the chance of survival for any particular
zygote is small, the variation of family size follows a Poisson
distribution. The variance of family size is therefore equal to the mean family
size, equality of mean and variance being a property of the Poisson
distribution. Thus in a population of bisexual organisms, in which
all other conditions of the idealised population are satisfied, family
size will have a mean and a variance of 2. In natural populations the
mean is not likely to differ much from 2, but the variance must be
expected to be usually greater, for reasons of differing fertility
between the parent individuals and differing viability between the
families. If the variance of family size is increased, a greater
proportion of the following generation will be the progeny of a smaller
number of parents, and the effective number of parents will be less
than the actual number. Conversely, if the variance of family size is
reduced below that of the idealised population, the effective number
will be greater than the actual number. It can be shown that, when
the mean family size is 2, the effective number is as follows (Wright,
1940; Crow, 1954):
$$N_e = \frac{4N}{2 + \sigma_k^2} \tag{4.7}$$
where $\sigma_k^2$ is the variance of family size. (Strictly speaking this is the
effective number as it affects variance of gene frequency and fixation;
for its effects on the inbreeding coefficient the expression for $N_e$ is
slightly different. The difference is small and we shall ignore it.)
Thus, when there is equal fertility of the parents and random survival
of the progeny, $\sigma_k^2 = 2$, and
$N_e = N$. When differences of fertility and viability make $\sigma_k^2$ greater
than 2, as in most actual populations, then $N_e$ is less than $N$. The
effective number under consideration here refers to a population with
equal numbers of males and females, and with monogamous mating.
If males are not restricted to a single mate, then the families of males
are likely to be more variable in size than those of females. In these
circumstances the relationship of effective to actual numbers will
differ for male and female parents.
It is possible by controlled breeding to make the variance of
family size, $\sigma_k^2$, less than 2, and therefore to make the effective
number greater than the actual. If two members of each family are
deliberately chosen to be parents of the next generation, then the
variance of family size is zero. Under these special circumstances,
and if the sexes are equal in numbers, the effective number is twice
the actual:
$$N_e = 2N \tag{4.8}$$
The rate of inbreeding is consequently half what it would be in an
idealised population of equal size, and is usually less than half the
rate of inbreeding under normal circumstances and random mating.
Under this controlled breeding system the rate of inbreeding is the
lowest possible with a given number of breeding individuals. The
reduced variance of family size is the path through which the
"deliberate avoidance of inbreeding" works. The problem often arises
of keeping a stock with minimum inbreeding, but with a limitation of
the actual population size imposed by the space or facilities available.
A common practice under these circumstances is the deliberate
avoidance of sib-matings and perhaps also of cousin-matings. One
may go further and by the use of pedigrees (in the manner described
in the next chapter) choose pairs for mating that have the least
possible relationship with each other. Deliberate avoidance of
inbreeding in this way has the effect of distributing the individuals
chosen to be parents evenly over the available families, and thus
reduces the variance of family size and the rate of inbreeding. The
same result, however, can be achieved with less labour simply by
ensuring that the available families are as far as possible equally
represented among the individuals chosen to be the parents of the
next generation. If, in addition, matings between close relatives are
avoided, the inbreeding coefficient in any generation is slightly lower
and is more uniform between the individuals in the generation than if
matings between close relatives are allowed; but the rate of inbreeding
is the same.
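The effect of the variance of family size on the effective number (equation 4.7) can be tabulated for a few cases. The following Python sketch is an illustration only; the function name, the actual number N = 40, and the variances tried are assumptions, and the formula applies when the mean family size is 2.

```python
def effective_number_from_family_variance(n, variance_of_family_size):
    """N_e = 4N / (2 + variance of family size), for a mean family size of 2 (equation 4.7)."""
    return 4.0 * n / (2.0 + variance_of_family_size)


if __name__ == "__main__":
    n = 40   # assumed actual number of breeding individuals
    # Variance 0: equalised families; 2: Poisson, as in the idealised
    # population; 6: highly unequal family sizes.
    for variance in (0.0, 2.0, 6.0):
        n_e = effective_number_from_family_variance(n, variance)
        print(variance, n_e, round(1.0 / (2.0 * n_e), 5))
```

A variance of 2 returns the actual number, a variance of zero doubles it as in equation 4.8, and larger variances reduce it.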
If the sexes are unequal in numbers, but the individuals chosen as
parents are equally distributed, in numbers and sexes, between the
families, so that the variance of family size is still zero, then the rate
of inbreeding is given by the following formula (Gowe, Robertson,
and Latter, 1959):
$$\Delta F = \frac{3}{32N_m} + \frac{1}{32N_f} \tag{4.9}$$
where $N_m$ and $N_f$ are the actual numbers of male and female parents
respectively, and females are more numerous than males.
Example 4.1. Several flocks of poultry in the United States and in
Canada, which are used as controls for breeding experiments, are
maintained by the following breeding system (Gowe, Robertson, and Latter,
1959).
There are 50 breeding males and 250 breeding females in each
generation. Every male is the son of a different father, and every female the
daughter of a different mother, so that the variance of family size is zero.
One of the objectives of this breeding system is to minimise the rate of
inbreeding. Let us therefore find what the rate of inbreeding is, and then
see how much is achieved in this respect by the deliberate equalisation of
family size. By equation 4.9 the rate of inbreeding in these flocks is
$\Delta F = 0.002$. If there were no deliberate choice of breeding individuals, and
family size conformed to a Poisson distribution, the rate of inbreeding by
equation 4.5 would be $\Delta F = 0.003$. Thus, without the deliberate
equalisation of family size the rate of inbreeding would be 50 per cent greater. If
a low rate of inbreeding were the only objective, the number of females
could be substantially reduced without much effect. For example, if
there were no more females than males, with 50 of each sex ($N = 100$) and
with equalisation of family size, the rate of inbreeding from equation 4.8
would be $\Delta F = 0.0025$, which is not very much greater than with five times
as many females. This illustrates the point, mentioned earlier, that most
of the inbreeding comes from the less numerous sex.
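The three rates of inbreeding quoted in Example 4.1 can be recomputed as follows. This is a sketch under a stated assumption: equation 4.9 is taken to be $\Delta F = 3/(32N_m) + 1/(32N_f)$, the form attributed to Gowe, Robertson, and Latter (1959), which reproduces the figure of 0.002 given in the example. The function names are hypothetical.

```python
def rate_with_equalised_families(n_males, n_females):
    """Assumed form of equation 4.9: dF = 3/(32 N_m) + 1/(32 N_f)."""
    return 3.0 / (32.0 * n_males) + 1.0 / (32.0 * n_females)


def rate_with_poisson_families(n_males, n_females):
    """Equation 4.5: dF = 1/(8 N_m) + 1/(8 N_f)."""
    return 1.0 / (8.0 * n_males) + 1.0 / (8.0 * n_females)


def rate_equal_sexes_equalised(n_total):
    """Equations 4.8 and 4.1: N_e = 2N, so dF = 1/(4N)."""
    return 1.0 / (4.0 * n_total)


if __name__ == "__main__":
    print(round(rate_with_equalised_families(50, 250), 4))   # flocks as maintained
    print(round(rate_with_poisson_families(50, 250), 4))     # no equalisation
    print(round(rate_equal_sexes_equalised(100), 4))         # 50 of each sex
```

With 50 of each sex the assumed form of equation 4.9 gives the same value as equation 4.8, which is a useful check on its consistency.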
Ratio of effective to actual number. When matings are
controlled and pedigree records kept, the rate of inbreeding can readily
be computed, as will be explained in the next chapter. But pedigree
records are not available for natural populations, nor for laboratory
populations kept by mass culture, as for example Drosophila
populations. How are we to estimate the rate of inbreeding in such
populations? We know the effective number is likely to be less than the
actual, but how much less? To estimate the effective number requires
a special experiment, and only the actual number is likely to be known.
Determinations of the ratio of effective to actual numbers, $N_e/N$,
from data on man, Drosophila, and the snail Lymnaea, led to values
ranging from 70 per cent to 95 per cent (Crow and Morton, 1955).
In the absence of specific knowledge, therefore, it would seem
reasonable to take the effective number as being, very roughly, about
three-quarters of the actual number. There are two methods by
which the ratio $N_e/N$ may be determined: (1) by the estimation of the
variance of family size, which yields $N_e$ by equation 4.7 (though
adjustment has to be made if the mean family size at the time of
measurement is not 2); and (2) by the estimation of the variance of the
changes of gene frequency during inbreeding, which yields $N_e$ by
equation 3.1. Both methods have been applied to Drosophila
melanogaster in laboratory cultures. The ratio $N_e/N$ for female
parents was 71 per cent by the first method and 76 per cent by the
second; and for male parents, 48 per cent and 35 per cent (Crow and
Morton, 1955). The ratio $N_e/N$ for the sexes jointly, determined by
the second method, ranged from 56 per cent to 83 per cent, with a
mean of 70 per cent, in five experiments with equal actual numbers
of males and females (Kerr and Wright, 1954a, b; Wright and Kerr,
1954; Buri, 1956). The low value of 56 per cent was found in rather
poor culture conditions of crowding, where there was more
competition (Buri, 1956).
Example 4.2. As an illustration of the use of the ratio $N_e/N$ let us find
the expected rate of inbreeding in a population of Drosophila maintained
by 20 pairs of parents in each generation. The actual number is $N = 40$.
If the effective number were equal to the actual, the rate of inbreeding, by
equation 4.1, would be $\Delta F = 1/80 = 1.25$ per cent. If we take $N_e = 0.7N$, from
the experimental results cited above, then $N_e = 28$, and the rate of
inbreeding is $\Delta F = 1/56 = 1.786$ per cent. The coefficient of inbreeding after
10, 50, and 100 generations would then be (by equation 3.12) 17 per cent,
59 per cent, and 84 per cent.
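The arithmetic of Example 4.2 can be followed step by step in a few lines of Python. The sketch is illustrative; the function name is hypothetical, and the results agree with the percentages quoted in the example to within rounding.

```python
def inbreeding_coefficient(n_actual, ratio, generations):
    """F_t = 1 - (1 - dF)^t with dF = 1/(2 N_e) and N_e = ratio * N.

    Combines equations 4.1 and 3.12; 'ratio' is the assumed N_e / N.
    """
    n_e = ratio * n_actual
    delta_f = 1.0 / (2.0 * n_e)
    return 1.0 - (1.0 - delta_f) ** generations


if __name__ == "__main__":
    n = 40   # 20 pairs of parents
    print("dF if N_e = N:    ", round(100 / (2 * n), 3), "per cent")
    print("dF if N_e = 0.7 N:", round(100 / (2 * 0.7 * n), 3), "per cent")
    for t in (10, 50, 100):
        print(t, round(100 * inbreeding_coefficient(n, 0.7, t), 1), "per cent")
```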
Migration, Mutation, and Selection
The description of the dispersive process given so far in this
chapter and the previous one is conditional on the systematic
processes of mutation, migration, and selection being absent, and its
relevance to real populations is therefore limited. So let us now consider
the effects of the dispersive and systematic processes when acting
jointly. The systematic processes, as we have seen in Chapter 2,
tend to bring the gene frequencies to stable equilibria at particular
values which would be the same for all populations under the same
conditions. The dispersive process, in contrast, tends to scatter the
gene frequencies away from these equilibrium values, and if not held
in check by the systematic processes it would in the end lead to all
genes being either fixed or lost in all populations not infinite in size.
The tendency of the systematic processes to change the gene
frequency toward its equilibrium value becomes stronger as the
frequency deviates further from this value. For this reason the opposing
tendencies of the dispersive and systematic processes reach a point of
balance: a point at which the dispersion of the gene frequencies is
held in check by the systematic processes. When this point of balance
is reached there will be a certain degree of differentiation between
subpopulations, but it will neither increase nor decrease so long as
the conditions remain unchanged. The problem is therefore to find
the distribution of gene frequencies among the lines of a subdivided
population when this steady state has been reached. The solution is
complicated mathematically, and we shall give only the main
conclusions, explaining their meaning but not their derivation. For
details of the joint action of the dispersive and systematic processes,
see Wright (1931, 1942, 1948, 1951).
Mutation and migration. Mutation and migration can be
dealt with together because they change the gene frequency in the
same manner. Consider again a population subdivided into many
lines, all with an effective size $N_e$; and let a proportion, $m$, of the
breeding individuals of every generation in each line be immigrants
coming at random from all other lines. Consider two alleles at a
locus, with mean frequencies $\bar{p}$ and $\bar{q}$ in the population as a whole, and
with mutation rates $u$ and $v$ in the two directions. Then, when the
balance between dispersion on the one hand and mutation and
migration on the other is reached, the variance of the gene frequency
among the lines is given by the following expression (Wright, 1931;
Malecot, 1948):
$$\sigma^2_q = \frac{\bar{p}\bar{q}}{1 + 4N_e(u + v + m)} \quad \text{(approx.)} \tag{4.10}$$
The degree of dispersion represented here by the variance of the gene
frequency can also be expressed as a coefficient of inbreeding, by
putting $\sigma^2_q = F\bar{p}\bar{q}$, from equation 3.14. Then
$$F = \frac{1}{1 + 4N_e(u + v + m)} \quad \text{(approx.)} \tag{4.11}$$
The theoretical distributions of the gene frequency appropriate to
four different values of $F$, when the mean gene frequency is 0.5, are
shown in Fig. 4.1. These distributions show how high $F$ must be for
there to be a substantial amount of fixation or of differentiation
between subpopulations. What the distributions depict can be stated
in three ways: (a) If we had a large number of subpopulations and we
determined the frequency of a particular gene in all of them, the
distribution curve is what we should obtain by plotting the percentage
of subpopulations showing each gene frequency. Or, in other words,
the height of the curve at a particular gene frequency shows the
probability of finding that gene frequency in any one subpopulation.
(b) If we had one subpopulation and measured the gene frequencies
at a large number of loci, all of which started with the same initial
frequency, the curve is the distribution of frequencies that we should
find. (c) If we had one subpopulation and measured the frequency of
one particular gene repeatedly over a long period of time, the curve is
the distribution of frequencies that we should find. The distributions
describe the state of affairs when equilibrium between the systematic
and dispersive processes has been reached, and the population as a
whole is in a steady state. From the distributions shown in Fig. 4.1 it
will be seen that when $F$ is 0.005 there is very little differentiation,
and when $F$ is 0.048 there is a fair amount of differentiation but still
no fixation. When $F$ is 0.333 the distribution is flat, which means that
all gene frequencies are equally probable (including 0 and 1); thus
there is much differentiation, and in addition a substantial amount
of fixation and loss occurs.
When $F$ exceeds this critical value intermediate gene frequencies
become rarer, and a greater proportion of subpopulations have the
gene either fixed or lost. When mutation or migration occurs, fixation
or loss is not a permanent state in any one subpopulation; the amount
of fixation or loss is what would be found at any one time.
Fig. 4.1. Theoretical distributions of gene frequency among
subpopulations, when dispersion is balanced by mutation or migration.
The states of dispersion to which the curves refer are indicated by the
values of $F$ in the figure. (Redrawn from Wright, 1951.)
Let us return now to the expression, 4.11, relating the coefficient
of inbreeding to the rates of mutation and migration when the
population has reached the steady state; and let us consider the rates
of mutation or migration, in relation to the effective population size,
that would just allow the dispersive process to go to the critical point
corresponding to the value of $F = 0.333$. Putting this value of $F$ in
equation 4.11 yields
$$u + v + m = \frac{1}{2N_e} \quad \text{(approx.)} \tag{4.12}$$
First let us consider mutation alone. If the sum of the mutation
rates in the two directions (u + v) were io 5 , which is a realistic value
to take according to what is known of mutation rates, then the critical
state of dispersion will be reached in subpopulations of effective size
N e = 50,000. In other words, mutation rates of this order of magni
tude will arrest the dispersive process before the critical state only in
populations with effective numbers greater than 50,000. Populations
smaller than this will show a substantial amount of fixation of genes
having this mutation rate. In practice, therefore, mutation may be
discounted as a force opposing dispersion in populations that would
commonly be regarded as "small"; populations, that is, with effective
numbers of the order of 100, or even 1,000.
With migration the picture is different, because what would be
considered a high rate of mutation would be judged a low rate of
migration. The critical value of $F = 0.333$ will occur when $m = 1/(2N_e)$.
With this rate of migration there would be only one immigrant
individual in every second generation, irrespective of the population
size. Thus we see that only a small amount of interchange between
subpopulations will suffice to prevent them from differentiating
appreciably in gene frequency.
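
The arithmetic behind these two conclusions can be checked with a short sketch. It assumes only the approximate relation 4.12; the function names are illustrative and are not part of the original text.

```python
def critical_effective_size(total_rate: float) -> float:
    """Effective size at which F reaches 1/3, from u + v + m = 1/(2*N_e)."""
    return 1.0 / (2.0 * total_rate)


def immigrants_per_generation(n_e: float) -> float:
    """Immigrant individuals per generation when m = 1/(2*N_e)."""
    m = 1.0 / (2.0 * n_e)
    return n_e * m


# Mutation alone, u + v = 1e-5: the critical size is about 50,000.
print(round(critical_effective_size(1e-5)))

# Migration at the critical rate: half an immigrant per generation,
# whatever the population size.
for n_e in (50, 500, 5000):
    print(n_e, immigrants_per_generation(n_e))
```

The second loop makes the point of the text explicit: the number of immigrants at the critical rate does not depend on the size of the subpopulation.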
The situation to which this consideration of migration refers is
known as the "island model." It pictures a discontinuous population
such as might be found inhabiting widely separated islands, interchange
taking place by occasional migrants from one subpopulation
to another. But differentiation of subpopulations by random drift
can take place also in a continuous population if the motility of the
organism is small in relation to the population density. This is known
as "isolation by distance" or the "neighbourhood model" (Wright,
1940; 1943; 1946; 1951). Clearly, if there is little dispersal over the
territory between one generation and the next the choice of mates is
restricted and mating cannot be at random. The population is then
subdivided into "neighbourhoods" (Wright, 1946) within which
individuals find mates. A neighbourhood is an area within which
mating is effectively random. The size of a neighbourhood depends
on the distance covered by dispersal between one generation and the
next. If the distances between localities inhabited by offspring and
parents at corresponding stages of the life cycle are distributed with a
variance $\sigma_d^2$, then the area of a neighbourhood is the area enclosed by a
circle of radius $2\sigma_d$, which is $\pi(2\sigma_d)^2$. The effective population size
of a neighbourhood is the number of breeding individuals in the
area of a neighbourhood. The subdivision of a population into
neighbourhoods leads to random drift, but the amount of local
differentiation depends on the size of the whole population as well
as on the effective number in the neighbourhood. If the whole
population is not very much larger than the neighbourhood then the
whole population will drift, and there will be little local differentiation
within it. The conclusion to which the neighbourhood model leads
is that a great amount of local differentiation will take place if the
effective number in a neighbourhood is of the order of 20, and a
moderate amount if it is of the order of 200; but with larger
neighbourhoods it will be negligible. There will be much more local
differentiation in a population inhabiting a linear territory, such as a
river or shore line, because a neighbourhood is then open to
immigration only from two directions instead of from all round. The extent
of a neighbourhood in a population distributed in one dimension is
the square root of the area of a neighbourhood in a population
distributed in two dimensions. The effective population size is therefore
the number of breeding individuals in a distance $2\sigma_d\sqrt{\pi}$ of
territory.
Example 4.3. As an illustration of the computation of the effective
population size of a neighbourhood we may take some observations from
the detailed studies by Lamotte (1951) of the snail Cepaea nemoralis in
France. Marked individuals were released in spring and the distance
travelled from the point of release by those recaptured in the autumn was
noted. Since the snails are inactive in winter this represented the
displacement occurring in one year. The mean displacement was 8.1 metres,
and its standard deviation 9.4 m. The standard deviation of the
displacement between birth and mating, which usually takes place in the second
year of life, was estimated as $\sigma_d = 15$ m. The area occupied by a
neighbourhood is therefore $\pi(2\sigma_d)^2 = 12.5\,\sigma_d^2 = 2{,}813$ sq. m.
The density of individuals in two large colonies was found to be 2 per
sq. m., and in another 3 per sq. m. The effective population size of the
neighbourhoods in these colonies was therefore about 5,600 and 8,400.
These figures are a good deal larger than the size of neighbourhoods from
which we would expect differentiation within the colonies. Five colonies
inhabiting linear territories had densities ranging from 4.5 to 20
individuals per metre. The effective population size of the neighbourhoods
in these colonies ranged from 236 to 1,050. These are approaching the size
from which differentiation within a colony would be expected.
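
The computation in this example can be sketched as follows. The function names are illustrative only, and the sketch uses the exact value of $\pi$, so its results come out slightly larger than the rounded figures quoted in the example.

```python
import math


def neighbourhood_area(sigma_d: float) -> float:
    """Area of a circle of radius 2*sigma_d (two-dimensional habitat)."""
    return math.pi * (2.0 * sigma_d) ** 2


def neighbourhood_length(sigma_d: float) -> float:
    """Extent of a neighbourhood along a linear habitat: sqrt of the area."""
    return 2.0 * sigma_d * math.sqrt(math.pi)


sigma_d = 15.0  # metres, the estimate quoted for Cepaea nemoralis

area = neighbourhood_area(sigma_d)      # roughly 2,830 sq. m
length = neighbourhood_length(sigma_d)  # roughly 53 m

# Effective size = density of breeding individuals x area (or x length).
for density in (2, 3):                  # individuals per sq. m
    print("area colony", density, round(area * density))
for density in (4.5, 20):               # individuals per metre
    print("linear colony", density, round(length * density))
```

The conclusions of the example are unchanged by the rounding: the two-dimensional neighbourhoods run to several thousand individuals, the linear ones to a few hundred or about a thousand.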
Selection. Selection operating on a locus in a large population
brings the gene frequency to an equilibrium; when selection against
a recessive or semidominant gene is balanced by mutation the
equilibrium is at a low gene frequency, and when selection favours
the heterozygote the equilibrium is more likely to be at an
intermediate frequency. The question we have now to consider is: How
much can the dispersive process disturb these equilibria and cause
small populations to deviate from the point of equilibrium? The
importance of this question lies in the fact that an increase of the
frequency of a deleterious gene will reduce the fitness (that is,
will increase the frequency of "genetic deaths"), and the dispersive
process may therefore lead to nonadaptive changes in small
populations. We shall not attempt to cover the joint effects of
selection and dispersion in detail, but shall merely illustrate their
general nature by reference to a particular case of selection against
a recessive gene balanced by mutation. The effects of selection in
favour of heterozygotes will be discussed in the next chapter,
because they have more importance in connexion with close inbreeding.
Fig. 4.2 shows the state of dispersion of a gene among
subpopulations of three sizes under the following conditions. Mutation
is supposed to be the same in both directions, and the coefficient of
selection against the homozygote is supposed to be twenty times the
mutation rate. In a large population the balance between the
mutation and the selection would bring the gene frequency to
equilibrium at about 0.2. The population sizes to which the graphs refer
are (a) $N_e = 50/s$, (b) $N_e = 5/s$, and (c) $N_e = 0.5/s$. If we assumed a
mutation rate of $10^{-5}$ in both directions then the intensity of selection
would be $s = 20 \times 10^{-5}$, and the effective population sizes to which the
graphs refer would be (a) 250,000, (b) 25,000, and (c) 2,500. These graphs
show that with the largest value of $N_e$ there is little differentiation
between subpopulations; with the intermediate value of $N_e$ random
drift is strong enough to cause a good deal of differentiation; with the
smallest value of $N_e$ the effects of random drift predominate over those
of mutation and selection, intermediate gene frequencies are almost
absent, and in the majority of subpopulations the allele is either fixed
or lost. In this case, moreover, a fair proportion of the subpopulations
have the deleterious allele fixed in them. This illustrates how random
drift can overcome relatively weak selection and lead to fixation of a
deleterious gene.

Fig. 4.2. Theoretical distributions of gene frequency among
subpopulations when the dispersion is balanced by mutation and selection.
The graphs refer to a recessive gene with $u = v = s/20$, in populations of
size: (a) $N_e = 50/s$, (b) $N_e = 5/s$, and (c) $N_e = 0.5/s$. (Redrawn from
Wright, 1942.)
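
The figures quoted for this case follow from simple arithmetic, sketched below. The equilibrium estimate uses the usual approximation for a recessive gene, $q \approx \sqrt{u/s}$, which ignores reverse mutation; this approximation is an assumption of the sketch and is not a formula taken from this section.

```python
import math

mutation_rate = 1e-5      # u = v, the value assumed in the text
s = 20 * mutation_rate    # selection against the homozygote, twenty times u

# Effective sizes of the three graphs, N_e = k / s.
effective_sizes = {label: k / s for label, k in (("a", 50), ("b", 5), ("c", 0.5))}

# Rough large-population equilibrium, ignoring reverse mutation (assumed).
q_hat = math.sqrt(mutation_rate / s)

for label, n_e in effective_sizes.items():
    print(label, round(n_e))
print("approximate equilibrium gene frequency:", round(q_hat, 2))
```

The approximate equilibrium comes out a little above 0.2; allowing for reverse mutation would lower it slightly, in line with the value of "about 0.2" given in the text.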
This particular case illustrates in principle what will happen when
the processes of random drift, selection, and mutation are all operating.
But we need to have some idea of how intense the selection must be
before it overcomes the effects of random drift. If we are content not
to be very precise we can say that selection begins to be more important
than random drift when the coefficient of selection, s, is of the
order of magnitude of $1/(4N_e)$. For example, in a population of effective
size 100, the critical value of s would be about 0.0025. This is a very
low intensity of selection, quite beyond the reach of experimental
detection. The conclusion to be drawn, therefore, is that in all but
very small populations, even a very slight selective advantage of one
allele over another will suffice to check the dispersive process before
it causes an appreciable amount of fixation or of differentiation
between subpopulations.
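
As a rough illustration of this rule of thumb, the following sketch tabulates the order-of-magnitude threshold $s \approx 1/(4N_e)$ for a few effective sizes. The function name is illustrative, and the threshold is no more precise than the text claims it to be.

```python
def critical_selection_coefficient(n_e: float) -> float:
    """Order-of-magnitude threshold s ~ 1/(4*N_e) quoted in the text."""
    return 1.0 / (4.0 * n_e)


for n_e in (10, 100, 1_000, 10_000):
    print(n_e, critical_selection_coefficient(n_e))
```

For an effective size of 100 this gives the value 0.0025 quoted above; for larger populations the threshold falls in inverse proportion to the size.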
Example 4.4. The opposing forces of dispersion and selection are
illustrated in Fig. 4.3, from an experiment with Drosophila melanogaster
(Wright and Kerr, 1954). The frequency of the sex-linked gene "Bar"
was followed for 10 generations in 108 lines each maintained by 4 pairs of
parents. (On account of the complication of sex-linkage, which increases
the rate of dispersion, the theoretical effective number was 6.765: the
effective number as judged from the actual rate of dispersion was
$N_e = 4.87$.) The initial gene frequency was 0.5. The circles in the figure
show the distribution of the gene frequency among the lines in the fourth
to tenth generations, when the distribution had reached its steady form.
The smooth curve shows the theoretical distribution based on $N_e = 5$ and a
coefficient of selection against Bar of $s = 0.17$. Previously fixed lines
are not included in the distributions. Altogether, at the tenth
generation, 95 of the 108 lines had become fixed for the wild-type allele
and 3 for Bar while 10 remained unfixed. Thus, despite a 17 per cent
selective disadvantage, the deleterious allele was fixed in about 3 per
cent of the lines.

Fig. 4.3. Distribution of gene frequencies under inbreeding and
selection, as explained in Example 4.4. Horizontal axis: number of Bar
genes. (Data from Wright and Kerr, 1954.)
Random Drift in Natural Populations
Having described the dispersive process and its theoretical
consequences, we may now turn to the more practical question of how far
these consequences are actually seen in natural populations. The
answering of this question is beset with difficulties, and the following
comments are intended more to indicate the nature of these
difficulties than to answer the question.
The theory of small populations, outlined in this and the
preceding chapter, is essentially mathematical in nature and is
unquestionably valid: given only the Mendelian mechanism of
inheritance, the conclusions arrived at are a necessary consequence under
the conditions specified. The question at issue, then, is whether the
conditions in natural populations are often such as would allow the
dispersion of gene frequencies to become detectable. The phenomena
which would be expected to result from the dispersive process,
if the conditions were appropriate, are differentiation between the
inhabitants of different localities, and differences between successive
generations. Both these phenomena are well known in subdivided or
small isolated populations, and it is tempting to conclude that because
they are the expected consequences of random drift, random drift
must be their cause. But there are other possible causes: the
environmental conditions probably differ from one locality to another
and from one season to another; so the intensity, or even the direction
of selection may well vary from place to place and from year to year,
and the differences observed could equally well be attributed to
variation of the selection pressure. Before we can justifiably attribute
these phenomena to random drift, therefore, we have to know (a)
that the effective population size is small enough, (b) that the
subpopulations are well enough isolated (or the size of the
"neighbourhoods" sufficiently small), and (c) that the genes concerned
are subject to very little selection.
The estimation of the present size of a population, though not
technically easy, presents no difficulties of principle. But the present
state of differentiation depends on the population size in the past,
and this can generally only be guessed at. It is difficult to know how
often the population may have been drastically reduced in size in
unfavourable seasons, and the dispersion taking place in these
generations of lowest numbers is permanent and cumulative. There
is less difficulty in deciding whether the subpopulations are
sufficiently well isolated. With a discontinuous population inhabiting
widely separated islands, it is often possible to be reasonably sure
that there is not too much immigration; and with a continuous
population the size of the "neighbourhoods" is, at least in principle,
measurable. The greatest difficulty lies in estimating the intensity of
natural selection acting on the genes concerned. Selection of an
intensity far lower than could be detected experimentally is sufficient
to check dispersion in all but the smallest populations. It seems
rather unlikely (though this is no more than an opinion) that any
gene that modifies the phenotype enough to be recognised would
have so little effect on fitness. The genes concerned with quantitative
differences, which are not individually recognisable, may however be
nearly enough neutral for random drift to take place. There is no
doubt at all that genes of this sort do show random drift, at least in
laboratory populations, as will be shown in Chapter 15. Of the
individually recognisable genes, those concerned with polymorphism
seem the most likely to show the effects of random drift. At
intermediate frequencies a small displacement from the equilibrium would
be detectable, and therefore a relatively small amount of dispersion
of the gene frequency might well lead to recognisable differentiation.
The following example will serve to illustrate the observed
differentiation of a natural population, as well as the difficulties of
its interpretation.
Example 4.5. The polymorphism in respect of the banding of the
shell in the snail Cepaea nemoralis has been extensively studied by Lamotte
(1951) in France. The population is subdivided into colonies with a high
degree of isolation between them. The absence of dark-coloured bands
on the shell is caused by a single recessive gene. The mean frequency of
bandless snails is 29 per cent, but individual colonies range between the
two extremes, some being entirely bandless and a few entirely banded.
The colonies vary in the number of individuals that they contain, and 291
colonies were divided into three groups according to their population
size. The variation in the frequency of bandless snails was then compared
in the three groups, as shown in Fig. 4.4. The variation between the
colonies, which measures the degree of differentiation, was found to be
greater among the small colonies than among the large. The variance of
the frequency of bandless between colonies was 0.067 among colonies of
500-1,000 individuals, 0.048 among colonies of 1,000-3,000, and 0.037
among colonies of 3,000-10,000 individuals. This dependence of the
degree of differentiation on the population size is interpreted by Lamotte
as evidence that the differentiation is caused by random drift.
Cain and Sheppard (1954a), on the other hand, offer a different
interpretation, sustained by an equally thorough study of colonies in
England. They show that predation by birds (chiefly thrushes) exerts a
strong selection in favour of shell colours matching the background of the
habitat. Though the polymorphism is maintained by selection, of an
unknown nature, in favour of heterozygotes, the frequency of the different
types in any colony is determined by selection in relation to the nature of
the habitat. In the areas occupied by small colonies, they argue, there is
less variation of habitat than in the areas occupied by large colonies.
Therefore the variation of habitat between small colonies is greater than
between large. This they regard as the cause of the greater differentiation
among small colonies than among large, selection bringing the frequency of
bandless forms to a value appropriate to the mean habitat of the colony.
It is not for us here to attempt an assessment of these two conflicting
interpretations.
Fig. 4.4. Distribution of the frequency of bandless snails among
colonies of three sizes. Horizontal axis: frequency of bandless. (Data
from Lamotte, 1951.)

                                  (a)          (b)            (c)
Population size                500-1,000   1,000-3,000   3,000-10,000
Mean frequency of bandless       0.292        0.256          0.211
Variance between colonies        0.067        0.048          0.037
CHAPTER 5
SMALL POPULATIONS:
III. Pedigreed Populations and Close Inbreeding
In the two preceding chapters the genetic properties of small
populations were described by reference to the effective number of breeding
individuals; and expressions were derived, in terms of the effective
number, by means of which the state of dispersion of the gene
frequencies could be expressed as the coefficient of inbreeding. The
coefficient of inbreeding, which is the probability of any individual
being an identical homozygote, was deduced from the population size
and the specified breeding structure. It expressed, therefore, the
average inbreeding coefficient of all individuals of a generation.
When pedigrees of the individuals are known, however, the
coefficient of inbreeding can be more conveniently deduced directly from
the pedigrees, instead of indirectly from the population size. This
method has several advantages in practice. Knowledge is often
required of the inbreeding coefficient of individuals, rather than of the
generation as a whole, and this is what the calculation from pedigrees
yields. In domestic animals some individuals often appear as parents
in two or more generations, and this overlapping of generations causes
no trouble when the pedigrees are known. (Non-overlapping of
generations was one of the conditions of the idealised population
which we have not yet removed.) The first topic for consideration in
this chapter is therefore the computation of inbreeding coefficients
from pedigrees. The second topic concerns regular systems of close
inbreeding. When self-fertilisation is excluded the rate of inbreeding
expressed in terms of the population size is only an approximation,
and the approximation is not close enough if the population size is
very small. Under systems of close inbreeding, therefore, the rate of
inbreeding must be deduced differently, and this is best done also
by consideration of the pedigrees.
When the coefficient of inbreeding is deduced from the pedigrees
of real populations it does not necessarily describe the state of
dispersion of the gene frequencies. It is essentially a statement about
the pedigree relationships, and its correspondence with the state of
dispersion is dependent on the absence of the processes that counteract
dispersion, in particular on selection being negligible. We were able
to use the coefficient of inbreeding as a measure of dispersion in the
preceding chapters because the necessary conditions for its
relationship with the variance of gene frequencies were specified.
Pedigreed Populations
The inbreeding coefficient of an individual is the probability
that the pair of alleles carried by the gametes that produced it were
identical by descent. Computation of the inbreeding coefficient therefore
requires no more than the tracing of the pedigree back to common
ancestors of the parents and computing the probabilities at each
segregation. Consider the pedigree in Fig. 5.1. X is the individual
we are interested in, whose parents are P and Q. We want to know what is
the probability that X receives identical alleles transmitted through P
and Q from A. Consider first B and C. The probability that they
receive replicates of the same gene from A is $\tfrac{1}{2}$, and the
probability that they receive different genes is $\tfrac{1}{2}$. But
if they receive different genes from A, then the probability of these
being identical as a result of previous inbreeding is the inbreeding
coefficient of A. Therefore the total probability of B and C receiving
identical genes from A is $\tfrac{1}{2}(1 + F_A)$. Put in other words, this is
the probability that two gametes taken at random from A will contain
identical alleles. Now consider the rest of the path through B. The
probability that B passes the gene it got from A on to D is $\tfrac{1}{2}$;
from D to P is $\tfrac{1}{2}$, and from P to X is $\tfrac{1}{2}$. Similarly
for the other side of the ancestry through C and Q. Putting all this
together we find the probability that X receives identical alleles
descended from A is $\tfrac{1}{2}(1 + F_A)(\tfrac{1}{2})^{3+3}$, or
$\tfrac{1}{2}(1 + F_A)(\tfrac{1}{2})^{n_1+n_2}$, where $n_1$ is the number of
generations from one parent back to the common ancestor and $n_2$ from the
other parent. If the two parents have more than one ancestor in common the
separate probabilities for each of the common ancestors have to be summed
to give the inbreeding coefficient of the progeny of these parents.

Fig. 5.1. (Pedigree of X, tracing both parents, P and Q, back to the
common ancestor A.)

Thus the general expression for the inbreeding coefficient of an
individual is

$$F_X = \sum\left[\left(\tfrac{1}{2}\right)^{n_1+n_2+1}(1 + F_A)\right] \tag{5.1}$$

(Wright, 1922). When inbreeding coefficients are computed in this
way it is necessary, of course, to define the base population to which
the present inbreeding is referred. The base population might be the
individuals from which an experiment was started or a herd founded;
or it might be those born before a certain date. The designation of an
individual as belonging to the base population means that it will be
assumed to have zero inbreeding coefficient. When pedigrees are
long and complicated there may be very many common ancestors, but
it is not necessary to trace back all lines of descent. A sufficiently
accurate estimate can be got by sampling a limited number of lines of
descent (Wright and McPhee, 1925).
Example 5.1. As an illustration of the use of formula 5.1 let us consider
the hypothetical pedigree in Fig. 5.2. The relevant individuals in
the pedigree are indicated by letters. Individual Z is the one
whose inbreeding coefficient is to be computed. Its parents are X
and Y, so we have to trace the paths of common ancestry
connecting X with Y. There are four common ancestors, A, B, C,
and H, and five paths connecting X with Y through them. We
assume A, B, and C to have zero inbreeding coefficients, since the
pedigree tells us nothing about their ancestry. Individual H,
however, has parents that are half sibs, and the inbreeding
coefficient of H is therefore $(\tfrac{1}{2})^{1+1+1} = \tfrac{1}{8}$. The
computation of the separate paths may now be made as shown in the table.
By addition of the contributions from the five paths we get the inbreeding
coefficient of Z as $F_Z = 0.05078$, or 5.1 per cent.

Fig. 5.2

Common     Path from   Generations to       Inbreeding        Contribution
ancestor   X to Y      common ancestor:     coeff. of         to inbreeding
                       from X    from Y     common ancestor   of Z
A          KGCADHL       4         4        0                 $(\tfrac{1}{2})^{9} = 0.00195$
B          KHDBEJM       4         4        0                 $(\tfrac{1}{2})^{9} = 0.00195$
B          KHDBEL        4         3        0                 $(\tfrac{1}{2})^{8} = 0.00391$
C          KGCHL         3         3        0                 $(\tfrac{1}{2})^{7} = 0.00781$
H          KHL           2         2        $\tfrac{1}{8}$    $(\tfrac{1}{2})^{5} \times \tfrac{9}{8} = 0.03516$
                                            Total by summation   0.05078
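
The summation in Example 5.1 can be reproduced with a short sketch of formula 5.1. The function names and the representation of a path as a tuple (generations from one parent, generations from the other, inbreeding coefficient of the common ancestor) are choices made for this illustration only.

```python
from fractions import Fraction


def path_contribution(n1, n2, f_ancestor=0):
    """One term of formula 5.1: (1/2)^(n1 + n2 + 1) * (1 + F_A)."""
    return Fraction(1, 2) ** (n1 + n2 + 1) * (1 + f_ancestor)


def inbreeding_from_paths(paths):
    """Sum the contributions of all paths through common ancestors."""
    return sum((path_contribution(*path) for path in paths), Fraction(0))


# H has half-sib parents: one path, one generation on each side.
f_h = path_contribution(1, 1)           # 1/8

# The five paths of Example 5.1 connecting X with Y.
paths = [
    (4, 4, 0),    # through A
    (4, 4, 0),    # through B
    (4, 3, 0),    # through B
    (3, 3, 0),    # through C
    (2, 2, f_h),  # through H, itself inbred
]
f_z = inbreeding_from_paths(paths)
print(f_z, float(f_z))   # 13/256 0.05078125
```

Exact fractions are used so that the total can be compared directly with the tabulated value without rounding error.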
"Coancestry." There is another method of computing inbreed
ing coefficients (Cruden, 1949; Emik and Terrill, 1949) which is more
convenient for many purposes, and is also more readily adapted to
a variety of problems. We shall use it later to work out the inbreeding
coefficients under regular systems of close inbreeding. The method
does not differ in principle from the formula 5.1 given above, but
instead of working from the present back to the common ancestors
we work forward, keeping a running tally generation by generation,
and compute the inbreeding that will result from the matings now
being made. The inbreeding coefficient of an individual depends on
the amount of common ancestry in its two parents. Therefore,
instead of thinking about the inbreeding of the progeny, we can think
of the degree of relationship by descent between the two parents.
This we shall call the coancestry of the two parents, and symbolise it
by $f$. It is identical with the inbreeding coefficient of the progeny,
and is the probability that two gametes taken one from one parent
and one from the other will contain alleles that are identical by
descent. (Malécot, 1948, calls this the "coefficient de parenté," but
the translation "coefficient of relationship" cannot be used because
Wright (1922) has used this term with a different meaning.)
Consider the generalised pedigree in Fig. 5.3. X is an individual with
parents P and Q and grandparents A, B, C, and D. Now, the coancestry of P
with Q is fully determined by the coancestries relating A and B with C and
D, and if these are known we need go no further back in the pedigree. It
can be shown that the coancestry of P with Q is simply the mean of the
four coancestries AC, AD, BC, and BD.

    A x B     C x D
      |         |
      P    x    Q
           |
           X
Fig. 5.3

This will be clearer if stated in the form of probabilities, though the
explanation is cumbersome when put into words. Take one gamete
at random from P and one from Q, and repeat this many times. In
half the cases P's gamete will carry a gene from A and in half from B:
similarly for Q's gamete. So the two gametes, one from P and one
from Q, will carry genes from A and C in a quarter of the cases, from
A and D in a quarter, from B and C in a quarter, and from B and D
in a quarter of the cases. Now the probability that two gametes
taken at random, one from A and the other from C, are identical by
descent is the coancestry of A with C, i.e. $f_{AC}$, etc. So, reverting now
to symbols,

$$f_{PQ} = \tfrac{1}{4}f_{AC} + \tfrac{1}{4}f_{AD} + \tfrac{1}{4}f_{BC} + \tfrac{1}{4}f_{BD}$$

This gives the basic rule relating coancestries in one generation with
those in the next:

$$F_X = f_{PQ} = \tfrac{1}{4}(f_{AC} + f_{AD} + f_{BC} + f_{BD}) \tag{5.2}$$

With this rule the experimenter can tabulate the coancestries
generation by generation, and this gives a basis for planning matings and
computing inbreeding coefficients. More detailed accounts of the
operation are given by Cruden (1949), Emik and Terrill (1949), and
Plum (1954).
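
A minimal sketch of the basic rule 5.2 follows. The function name is illustrative, and the numerical case is hypothetical: it supposes that grandparents A and C are full sibs born of unrelated, non-inbred parents (so that their coancestry is taken as 1/4) and that B and D are unrelated to everyone.

```python
from fractions import Fraction


def coancestry_of_parents(f_ac, f_ad, f_bc, f_bd):
    """Basic rule 5.2: the mean of the four grandparental coancestries."""
    return Fraction(f_ac + f_ad + f_bc + f_bd) / 4


# Hypothetical case: P and Q are first cousins through A and C.
f_pq = coancestry_of_parents(Fraction(1, 4), 0, 0, 0)
print(f_pq)   # 1/16, the inbreeding coefficient of the progeny X
```

The result, 1/16, is the familiar inbreeding coefficient of the offspring of first cousins, which serves as a check on the rule.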
If there is overlapping of generations it may happen that we must
find the coancestry between individuals belonging to different
generations. This situation is covered by the following supplementary
rules, which can readily be deduced by a consideration of probabilities
in the manner explained above. Referring to the same pedigree
(Fig. 5.3),

$$\begin{aligned}
f_{PC} &= \tfrac{1}{2}(f_{AC} + f_{BC}) \\
f_{PD} &= \tfrac{1}{2}(f_{AD} + f_{BD}) \\
\text{and} \quad f_{PQ} &= \tfrac{1}{2}(f_{PC} + f_{PD})
\end{aligned} \tag{5.3}$$

which by substitution reduces to the basic rule.
Before we can apply this method to systems of close inbreeding
we have to see how the basic rule is to be applied when there are
fewer than four grandparents. As an example we shall consider the
coancestry between a pair of full sibs. The pedigree can be written
as in Fig. 5.4: A and B are parents of both $P_1$ and $P_2$, which are full
sibs and have an offspring X. Applying the basic rule (equation
5.2), and noting that $f_{BA} = f_{AB}$, we have

$$F_X = f_{P_1P_2} = \tfrac{1}{4}(f_{AA} + f_{BB} + 2f_{AB}) \tag{5.4}$$

      A x B
      /   \
    P1  x  P2
        |
        X
Fig. 5.4

The meaning of $f_{AA}$, the coancestry of an individual with itself, is the
probability that two gametes taken at random from A will contain
identical alleles, and we have already seen that this probability is
equal to $\tfrac{1}{2}(1 + F_A)$. The value of $F_A$ will be known from the
coancestry of A's parents. The coancestry between offspring and parent can
be found in a similar way, by application of the supplementary rules in
5.3. Substituting the individuals in Fig. 5.4 for those in Fig. 5.3 and
applying the first two equations of 5.3 gives

$$\begin{aligned}
f_{PA} &= \tfrac{1}{2}(f_{AA} + f_{AB}) \\
f_{PB} &= \tfrac{1}{2}(f_{BB} + f_{AB})
\end{aligned} \tag{5.5}$$

where P is equivalent to either $P_1$ or $P_2$; and applying the third
equation of 5.3 gives the coancestry between full sibs

$$f_{P_1P_2} = \tfrac{1}{2}(f_{PA} + f_{PB}) = \tfrac{1}{4}(f_{AA} + f_{BB} + 2f_{AB})$$

as above. We now have all the rules needed for computing the
inbreeding coefficients in successive generations under regular systems
of inbreeding.
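
The rules just given can be combined into a small routine that works forward through a pedigree. The following is only a sketch under stated assumptions: the function name, the dictionary representation of the pedigree and the birth-order list are all choices made for this illustration, and founders are assumed to be unrelated and non-inbred members of the base population.

```python
from fractions import Fraction
from functools import lru_cache


def make_coancestry(pedigree, birth_order):
    """Return a function f(x, y) giving the coancestry of x with y.

    pedigree    maps each non-founder to its (parent, parent) pair.
    birth_order lists every individual, parents before offspring.
    Founders are assumed unrelated and non-inbred (base population).
    """
    rank = {name: i for i, name in enumerate(birth_order)}

    @lru_cache(maxsize=None)
    def f(x, y):
        if x is None or y is None:        # unknown parent of a founder
            return Fraction(0)
        if x == y:                        # f_AA = 1/2 (1 + F_A)
            p, q = pedigree.get(x, (None, None))
            return Fraction(1, 2) * (1 + f(p, q))
        if rank[x] < rank[y]:             # make x the younger individual
            x, y = y, x
        p, q = pedigree.get(x, (None, None))
        return Fraction(1, 2) * (f(p, y) + f(q, y))   # supplementary rule

    return f


# Two generations of full-sib mating starting from founders A and B.
pedigree = {"P1": ("A", "B"), "P2": ("A", "B"),
            "Y1": ("P1", "P2"), "Y2": ("P1", "P2")}
f = make_coancestry(pedigree, ["A", "B", "P1", "P2", "Y1", "Y2"])
print(f("P1", "P2"))   # 1/4
print(f("Y1", "Y2"))   # 3/8
```

The routine always expands the younger of the two individuals through its parents, because the younger one cannot be an ancestor of the other. The coancestry of a pair is the inbreeding coefficient of any progeny they would have, so the two printed values are the coefficients after one and two generations of full-sib mating.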
Regular Systems of Inbreeding
The consequences of regular systems of inbreeding have been
the subject of much study. They were first described in detail by
Wright (1921) in a series of papers which form the foundation of the
whole theory of small populations. Wright's studies were based on
the method of path coefficients (Wright, 1934, 1954). Haldane
(1937, 1955) and Fisher (1949) derived the consequences by the
method of matrix algebra. The inbreeding coefficients in successive
generations can, however, be more simply derived by application of
the rules of coancestry explained in the previous section, and this is
the method we shall follow here. We shall illustrate the application
of the method for consecutive full-sib mating, which is one of the most
commonly used systems, and give the results for some other systems.
The inbreeding coefficients refer to autosomal genes; the results for
sex-linked genes are described by Wright (1933) in a paper which
also contains a useful summary of the results for autosomal genes in
a great variety of mating systems.
Full-sib mating. The equation 5.4 given above for the
coancestry between full sibs can be applied to successive generations to
give the inbreeding coefficients under continued full-sib mating.
But it is more convenient to rearrange the equation so that the
inbreeding coefficient is given in terms of the inbreeding coefficients
of the previous generations. Note first that, because the mating
system is regular, contemporaneous individuals have the same inbreeding
coefficients and coancestries: so, referring again to the pedigree in
Fig. 5.4, $f_{AA} = f_{BB}$, and $F_A = F_B$. Now, if we let $t$ be the generation
to which individual X belongs, then $f_{AB} = F_{t-1}$, and
$f_{AA} = f_{BB} = \tfrac{1}{2}(1 + F_{t-2})$. The coancestry equation can therefore
be rewritten to give the inbreeding coefficient in any generation, $t$, in
terms of the inbreeding coefficients of the previous two generations, thus:

$$F_t = \tfrac{1}{4}(1 + 2F_{t-1} + F_{t-2}) \tag{5.6}$$

This recurrence equation enables us to write down the inbreeding
coefficients in successive generations. In the first generation $F_{t-1}$
and $F_{t-2}$ are both zero and so $F_{(t=1)} = 0.25$. The inbreeding
coefficients in the first four generations are 0.25, 0.375, 0.50, and 0.59.
The rate of inbreeding is not constant in the first few generations, as
may be seen by computing $\Delta F$ from equation 3.9. For the first four
generations $\Delta F$ is 0.25, 0.17, 0.20, and 0.19. It later settles down
to a constant value of 0.191 (Wright, 1931). The inbreeding
coefficients over the first 20 generations of full-sib mating are given in
Table 5.1.

TABLE 5.1
Inbreeding coefficients under various systems of close inbreeding

Generation     A       B       B       C       D
   (t)                (1)     (2)
    0
    1        0.500   0.250           0.125   0.250
    2        0.750   0.375           0.219   0.375
    3        0.875   0.500   0.063   0.305   0.438
    4        0.938   0.594   0.172   0.381   0.469
    5        0.969   0.672   0.293   0.449   0.484
    6        0.984   0.734   0.409   0.509   0.492
    7        0.992   0.785   0.512   0.563   0.496
    8        0.996   0.826   0.601   0.611   0.498
    9        0.998   0.859   0.675   0.654   0.499
   10        0.999   0.886   0.736   0.691
   11                0.908   0.785   0.725
   12                0.926   0.826   0.755
   13                0.940   0.859   0.782
   14                0.951   0.886   0.806
   15                0.961   0.908   0.827
   16                0.968   0.925   0.846
   17                0.974   0.940   0.863
   18                0.979   0.951   0.878
   19                0.983   0.960   0.891
   20                0.986   0.968   0.903

Column   System of mating                                  Recurrence equation
A        Self-fertilisation, or repeated backcrosses       $\tfrac{1}{2}(1 + F_{t-1})$
         to highly inbred line.
B        Full brother × sister, or offspring × younger
         parent:
  (1)    Inbreeding coefficient.                           $\tfrac{1}{4}(1 + 2F_{t-1} + F_{t-2})$
  (2)    Probability of fixation (from Schäfer, 1937).
C        Half sib (females half sisters).                  $\tfrac{1}{8}(1 + 6F_{t-1} + F_{t-2})$
D        Repeated backcrosses to random-bred individual.   $\tfrac{1}{4}(1 + 2F_{t-1})$
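
The full-sib series can be generated directly from the recurrence equation 5.6, as in the sketch below. The rate of inbreeding is computed as $(F_t - F_{t-1})/(1 - F_{t-1})$, which is assumed here to be the form of equation 3.9 (that equation is not reproduced in this chapter); the function names are illustrative.

```python
def full_sib_series(generations):
    """Inbreeding coefficients under continued full-sib mating (eq. 5.6)."""
    f_prev2, f_prev1 = 0.0, 0.0       # F is zero before inbreeding starts
    series = []
    for _ in range(generations):
        f_now = 0.25 * (1 + 2 * f_prev1 + f_prev2)
        series.append(f_now)
        f_prev2, f_prev1 = f_prev1, f_now
    return series


def rate_of_inbreeding(series):
    """Delta F in each generation, assumed form (F_t - F_{t-1}) / (1 - F_{t-1})."""
    rates, f_prev = [], 0.0
    for f_now in series:
        rates.append((f_now - f_prev) / (1 - f_prev))
        f_prev = f_now
    return rates


series = full_sib_series(20)
rates = rate_of_inbreeding(series)
print([round(x, 3) for x in series[:4]])   # [0.25, 0.375, 0.5, 0.594]
print([round(x, 2) for x in rates[:4]])    # [0.25, 0.17, 0.2, 0.19]
print(round(rates[-1], 3))                 # 0.191
```

The printed values agree with the figures quoted in the text and with column B (1) of Table 5.1.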
Some other systems of mating may now be mentioned briefly.
Self-fertilisation gives the most rapid inbreeding. If X is the
offspring of P, we have from the coancestry identities

$$F_X = f_{PP} = \tfrac{1}{2}(1 + F_P)$$

and the recurrence equation is therefore

$$F_t = \tfrac{1}{2}(1 + F_{t-1}) \tag{5.7}$$

The inbreeding coefficients over the first ten generations of
self-fertilisation are given in Table 5.1. The rate of inbreeding is
constant from the beginning; $\Delta F = 0.5$ exactly.
Parent-offspring mating, in which offspring are mated to the
younger parent, gives the same series of inbreeding coefficients as
full-sib mating for autosomal genes, but for sex-linked genes it gives
a slightly higher rate of inbreeding. For sex-linked genes $\Delta F$ is 0.293
after the first few generations (Wright, 1933).
Half-sib mating is usually between paternal half sibs, one male being
mated to two or more of his half sisters. If these females are half
sisters of each other the recurrence equation is
$F_t = \tfrac{1}{8}(1 + 6F_{t-1} + F_{t-2})$ (5.8)
The first 20 generations are given in Table 5.1. There are, however,
practical difficulties in the way of maintaining this system regularly,
and sometimes females that are full sisters of each other have to be
used. The inbreeding will then go a little faster. If full-sister females
are always used the recurrence equation is
$F_t = \tfrac{1}{16}(3 + 8F_{t-1} + 4F_{t-2} + F_{t-3})$ (5.9)
Repeated backcrosses to an individual or to a highly
inbred line are often made, for a variety of purposes.
The resulting inbreeding is as follows. The pedigree
(Fig. 5.5) shows an individual, A, which will probably
be a male, mated to his daughter, C, his granddaughter,
D, etc. From the supplementary rule (5.5)
$F_X = f_{AD} = \tfrac{1}{2}(f_{AA} + f_{AC})$
The recurrence equation is therefore
$F_t = \tfrac{1}{4}(1 + F_A + 2F_{t-1})$ (5.10)
where $F_A$ is the inbreeding coefficient of the individual to which the
repeated backcrosses are made. If A is an individual from the base
population and $F_A = 0$, the equation becomes
$F_t = \tfrac{1}{4}(1 + 2F_{t-1})$ (5.11)
The inbreeding coefficients over the first 9 generations are given in
Table 5.1. If A is an individual from a highly inbred line and $F_A = 1$,
the equation becomes
$F_t = \tfrac{1}{2}(1 + F_{t-1})$ (5.12)
which is identical with self-fertilisation. In this case A need not be
the same individual in successive generations: it can be any member
of the inbred line.
[Fig. 5.5. Pedigree of repeated backcrosses to an individual, A.]
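
The recurrence equations 5.8 to 5.12 all have the same form and can be iterated by one small routine. The sketch below is illustrative only: the function names are hypothetical, and every generation before the first is assumed to have $F = 0$, an assumption about the starting conditions that the text does not state.

```python
def iterate_recurrence(step, generations, history=3):
    """Apply F_t = step(F_{t-1}, F_{t-2}, F_{t-3}) from a non-inbred start."""
    past = [0.0] * history            # most recent generation first
    series = []
    for _ in range(generations):
        f_t = step(*past)
        series.append(f_t)
        past = [f_t] + past[:-1]      # shift the history by one generation
    return series


def half_sib_half_sisters(f1, f2, f3):
    """Equation 5.8: females are half sisters of each other."""
    return (1 + 6 * f1 + f2) / 8


def half_sib_full_sisters(f1, f2, f3):
    """Equation 5.9: females are full sisters of each other."""
    return (3 + 8 * f1 + 4 * f2 + f3) / 16


def backcross(f_a):
    """Equation 5.10: repeated backcrosses to an individual with inbreeding f_a."""
    def step(f1, f2, f3):
        return (1 + f_a + 2 * f1) / 4
    return step


print(iterate_recurrence(half_sib_half_sisters, 5))
print(iterate_recurrence(half_sib_full_sisters, 5))
print(iterate_recurrence(backcross(0.0), 5))   # A from the base population
print(iterate_recurrence(backcross(1.0), 5))   # A from a fully inbred line
```

With `backcross(1.0)` the series is the same as that given by equation 5.7 for self-fertilisation, as the text states.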
Example 5.2. As an example of the use of coancestry for computing
inbreeding coefficients let us consider populations derived from "2-way"
and from "4-way" crosses between highly inbred lines. In a 2-way cross
two inbred lines are crossed and the population is maintained by random
mating among the crossbred individuals and subsequently among their
progeny. In a 4-way cross four inbred lines are crossed in two pairs, and
the two crossbred groups are again crossed, subsequent generations being
maintained by random mating. If the base population is taken to be a real,
or hypothetical, random-bred population from which the inbred lines were
derived, we may compute the inbreeding coefficients of the population
derived from the cross, referring it to this base. The crosses and
subsequent generations are shown schematically in the diagram below.

Generation    2-way cross           4-way cross
    1            A x B          A x B        C x D
    2           X1 x X2         X1, X2       Y1, Y2
                               X1 x Y1      X2 x Y2
    3              O              Z1    x     Z2
    4                                   O
The inbred lines are represented by A, B, C, and D. If they are fully
inbred, as we shall take them to be, the coefficient of inbreeding of the
individuals from the lines is 1, and the coancestry of an individual with
another of the same line is also 1. Therefore only one individual of each
line need be represented in the scheme, even though any number may
actually be used. The progeny of the crosses between the inbred lines are
represented by X and Y, the suffices 1 and 2 indicating different individuals.
In the 2way cross the progeny of these crossbred individuals are the
foundation generation whose inbreeding coefficient we are to compute.
They are represented by O. In the 4way cross the two sorts of crossbred
individuals, X and Y, are crossed, one sort with the other. Two such
matings are represented in the scheme. They produce the "double-cross"
individuals, Z, whose progeny constitute the foundation generation
represented by O, whose inbreeding coefficient we are to compute.
In the computation of the coancestries we shall omit the symbol $f$,
writing for example AB for $f_{AB}$, the coancestry of individual A with
individual B. The coancestries of the parents in generation 1 are
AA = BB = CC = DD = 1
and
AB = AC = AD = BC = BD = CD = 0
The coancestries in the second generation of the 2-way cross are
$X_1X_2 = \tfrac{1}{4}(AA + BB + AB + BA)$ (by equation 5.2)
$= \tfrac{1}{4}(1 + 1 + 0 + 0) = \tfrac{1}{2}$
Therefore $F_O = 0.5$, which is the required inbreeding coefficient of the
foundation generation of the population derived from the 2-way cross.
The subsequent matings between the O individuals need produce no
further inbreeding provided enough 2nd generation matings are made.
The coancestries in the second generation of the 4-way cross are
$X_1X_2 = Y_1Y_2 = \tfrac{1}{2}$ (as shown for the 2-way cross)
and
$X_1Y_2 = X_2Y_1 = \tfrac{1}{4}(AC + AD + BC + BD) = 0$
The coancestries in the third generation are
$Z_1Z_2 = \tfrac{1}{4}(X_1X_2 + Y_1Y_2 + X_1Y_2 + X_2Y_1)$
$= \tfrac{1}{4}(\tfrac{1}{2} + \tfrac{1}{2} + 0 + 0) = \tfrac{1}{4}$
Therefore the inbreeding coefficient of the foundation generation is
$F_O = 0.25$. Again, the inbreeding need not increase further, provided
enough third generation matings are made.
The meaning of these coefficients of inbreeding, with the base
population as stated, may be clarified thus. If we made a large number of 2-way,
or of 4-way, crosses each with a different set of inbred lines, the populations
derived from the crosses would constitute a set of lines or sub-populations.
The inbreeding coefficients would then indicate the expected amount of
dispersion of gene frequencies among these lines. Populations derived
from 2-way crosses are equivalent to progenies of one generation of
self-fertilisation. The gene frequencies can therefore have only three values,
0, 1/2, and 1. Populations derived from 4-way crosses are equivalent to
progenies of one generation of full-sib mating, and the gene frequencies
can have only five values, 0, 1/4, 1/2, 3/4, and 1.
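
The coancestry arithmetic of Example 5.2 can be reproduced with a short recursive routine. This is a sketch under the assumptions of the example (fully inbred, mutually unrelated lines A, B, C, D); the dictionary `PEDIGREE` and the function `coancestry` are illustrative names, and the routine handles only pairs of distinct individuals of the same generation, which is all the example requires.

```python
# Parents of each individual in the 2-way and 4-way cross schemes.
PEDIGREE = {
    "X1": ("A", "B"), "X2": ("A", "B"),
    "Y1": ("C", "D"), "Y2": ("C", "D"),
    "Z1": ("X1", "Y1"), "Z2": ("X2", "Y2"),
}


def coancestry(p, q):
    """Coancestry of two distinct contemporaries, or of two inbred lines."""
    if p not in PEDIGREE and q not in PEDIGREE:
        # Fully inbred lines: 1 within a line, 0 between different lines.
        return 1.0 if p == q else 0.0
    if p == q or p not in PEDIGREE or q not in PEDIGREE:
        raise ValueError("only distinct individuals of one generation are handled")
    # Mean of the four coancestries between the parents of p and the parents of q.
    return sum(coancestry(a, b) for a in PEDIGREE[p] for b in PEDIGREE[q]) / 4


# The inbreeding coefficient of the progeny equals the coancestry of the parents.
F_two_way = coancestry("X1", "X2")
F_four_way = coancestry("Z1", "Z2")
print(F_two_way, F_four_way)
```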
Reference to a different base population. Having computed
a coefficient of inbreeding with reference to a certain group of
individuals as the base population, one may then want to change the base
and refer the inbreeding coefficient to another group of individuals.
One might, for example, compute the inbreeding coefficient of a herd
of cattle referred to the foundation animals of the herd as the base,
and then want to recompute the inbreeding coefficient
so as to refer to the breed as a whole with a base population
in the more remote past. Let X represent the group
of individuals whose inbreeding coefficient is required,
and let A and B represent ancestral groups, A being more
remote than B, as shown in Fig. 5.6. Then it follows from
equation 3.11 that
$P_{X.A} = P_{X.B} P_{B.A}$ (5.13)
where $P_{X.A} = 1 - F_{X.A}$, $F_{X.A}$ being the inbreeding coefficient of X
referred to A as base, and similarly for the other subscripts.
[Fig. 5.6. Line of descent from ancestral group A, through B, to X.]
Example 5.3. A selection experiment with mice was started from a
foundation population made by a 4-way cross of highly inbred lines
(Falconer, 1953). According to the computation given above in Example
5.2, the inbreeding coefficient of this foundation population was reckoned
to be 25 per cent. On this basis the inbreeding coefficients of subsequent
generations were computed from the pedigrees by the coancestry method. The
inbreeding coefficient at generation 24, computed thus, was 58.8 per cent.
What would the inbreeding coefficient be if referred to the foundation
population as base, instead of to the more remote hypothetical population
from which the inbred lines were derived? The figures to be substituted in
equation 5.13 are $P_{X.A} = 0.412$ and $P_{B.A} = 0.75$. Therefore
$P_{X.B} = 0.412/0.75 = 0.549$. The inbreeding coefficient at generation 24,
referred to the foundation population as base, is therefore 45.1 per cent.
We may use this population of mice also to compare the rate of
inbreeding when computed by the two methods, from the pedigrees and from
the effective population size. Computed from the pedigrees, the average
rate of inbreeding over the 24 generations is found from equation 3.12
thus: $0.451 = 1 - (1 - \Delta F)^{24}$, whence $\Delta F = 2.47$ per cent. The population
was maintained by six pairs of parents in each generation. Matings were
made between individuals with the lowest coancestries and this has the
effect of equalising family size, as explained in the previous chapter.
Therefore, by equation 4.8, the effective number was twice the actual, i.e.
$N_e = 24$. The rate of inbreeding, by equation 4.1, is therefore
$\Delta F = \tfrac{1}{48} = 2.08$ per cent. The slightly higher rate of inbreeding
as computed directly
from the pedigrees can be attributed to some irregularities in the mating
system, resulting from the sterility of some parents and the death of some
whole litters. The random drift of a colour gene in this line, and two others
maintained in the same manner, was shown in Fig. 3.2.
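
The arithmetic of Example 5.3 can be followed step by step in the sketch below. The function names are illustrative; the only relations used are those stated in the example, namely $P = 1 - F$ with $P_{X.A} = P_{X.B} P_{B.A}$, the relation $F_t = 1 - (1 - \Delta F)^t$, and $\Delta F = 1/(2N_e)$.

```python
def rebase(f_x_a, f_b_a):
    """Inbreeding of X referred to base B, from P_XA = P_XB * P_BA with P = 1 - F."""
    p_x_b = (1 - f_x_a) / (1 - f_b_a)
    return 1 - p_x_b


def average_rate(f_t, generations):
    """Solve F_t = 1 - (1 - dF)**t for the average rate of inbreeding dF."""
    return 1 - (1 - f_t) ** (1 / generations)


# F at generation 24 referred to the remote base (0.588), and F of the
# foundation population referred to the same base (0.25).
f_24 = rebase(0.588, 0.25)
rate_from_pedigree = average_rate(f_24, 24)
rate_from_numbers = 1 / (2 * 24)          # six pairs, effective number 24

print(round(100 * f_24, 1))               # per cent, foundation population as base
print(round(100 * rate_from_pedigree, 2)) # per cent per generation, from pedigrees
print(round(100 * rate_from_numbers, 2))  # per cent per generation, from N_e
```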
Fixation. One is often more interested in the probability of
fixation as a consequence of inbreeding than in the inbreeding
coefficient. The inbreeding coefficient gives the probability of an
individual being a homozygote, which is $1 - 2p_0q_0(1 - F)$ from Table 3.1.
But one wants to know also how soon all individuals in a line can be
expected to be homozygous for the same allele. This is the "purity"
implied by the term "pure line" which is often used to mean highly
inbred line. The degree of "purity" is the probability of fixation.
The probability of fixation has been worked out by Haldane (1937,
1955), Schafer (1937) and Fisher (1949). It depends on the number
of alleles and their arrangement in the initial mating of the line. The
probabilities of fixation over the first 20 generations of full-sib mating
are given in Table 5.1, when 4 alleles were present in the initial
mating. There cannot, of course, be more than 4 alleles in a
sib-mated line, and when there are fewer the probability of fixation is
greater (see Haldane, 1955).
Linkage. Linkage introduces a problem in connexion with the
consequences of inbreeding of which a solution is sometimes needed.
Individuals heterozygous at a particular locus will also be
heterozygous for a segment of chromosome in which the locus lies, and it
may be of interest to know the average length of heterozygous
segments. The form in which this problem most commonly arises is
connected with the transference of a marker gene to an inbred line by
repeated backcrosses, when one wants to know how much of the
foreign chromosome is transferred along with the marker. This
problem has been worked out by Bartlett and Haldane (1935). A
dominant gene can be transferred by successive crosses of the
heterozygote to the strain into which it is to be introduced. In this case the
mean length of chromosome introduced with the gene after $t$ crosses
is $1/t$ crossover units on each side of the gene. A recessive gene is
commonly transferred by alternating backcrosses and intercrosses
from which the homozygote is extracted. The mean length of foreign
chromosome in this case is $2/t$ crossover units on each side, after $t$
cycles. Other cases are described in the paper cited. From this and a
knowledge of the total map length of the organism we can arrive at
the expected proportion of the total chromatin that is still
heterogeneous.
Example 5.4. What percentage of the total chromatin is expected to
be still heterogeneous after a dominant gene has been transferred to an
inbred strain of mice by five, and by ten successive backcrosses? The
expected length of heterogeneous chromosome associated with the gene
is 0.2 centimorgans after five crosses, and 0.1 cM after ten. The average
map length of the 20 chromosomes in male mice is 97.7 cM (Slizynski,
1955). Therefore 0.2 per cent of the chromosome will be heterogeneous
after five crosses, and 0.1 per cent after ten, assuming that the gene is
transferred through males, and taking the average as being the length of
the chromosome carrying the gene. The percentage of chromatin not
associated with the gene that is expected still to be heterogeneous can be
taken as approximately $1 - F_t$ from column A of Table 5.1: that is, 3.1 per
cent after five crosses and 0.1 per cent after ten. The total percentage of
heterogeneous chromatin is therefore 3.4 per cent after five crosses, and
0.2 per cent after ten.

Fig. 5.7. Theoretical models illustrating the distribution of
heterozygous segments of chromosome (shown black) after (a) 5
generations, (b) 10 generations, and (c) 20 generations of full-sib
mating, in an organism with twenty chromosomes, such as the
mouse. The total map-length is taken to be 2500 centimorgans,
and the chromosomes are assumed to be of equal genetic length.
The points marked A, B, C, D, in chromosomes I to IV are loci
held heterozygous by forced segregation, and the associated
heterozygous segments are cross-hatched. (From Fisher, The Theory of
Inbreeding, Oliver and Boyd, 1949; reproduced by courtesy of the
author and publishers.)
The more general problem of the mean length of heterozygous
segments during inbreeding has been treated by Haldane (1936) and
by Fisher (1949). It need not be discussed in detail here. The
conclusions are well illustrated in Fig. 5.7, which is Fisher's diagrammatic
representation of the situation in an organism with 20 chromosomes,
such as the mouse, after five, ten, and twenty generations of full-sib
mating. The diagrams show the expected number and lengths of
unfixed segments. The first four chromosomes are supposed to carry
loci at which segregation is maintained by mating always
heterozygotes with homozygotes. The slower reduction of the lengths of
these unfixed segments can be seen.
Mutation. After a long period of inbreeding mutation may
become an important factor in determining the frequency of
heterozygotes. If $u$ is the mutation rate of a gene that has reached near
fixation in the line, then the frequency of heterozygotes at this locus
due to mutation is $4u$ under self-fertilisation, and $12u$ under full-sib
mating, for autosomal loci (Haldane, 1936). These are very small
frequencies if we are concerned with only one locus, but if the effects
of all loci are taken together mutation is not entirely negligible as a
source of heterozygosis in long inbred strains such as the widely used
strains of mice. The practical consequences of the origin of
heterogeneity by mutation are that the characteristics of a line will slowly
change through the fixation of mutant alleles, and that sub-lines will
become differentiated. Examples are given in Chapter 15.
Selection favouring heterozygotes. When close inbreeding is
practised the object is generally to produce fixation, or homozygosis
within the lines, and the experimenter is not usually interested in the
differentiation between lines. It is therefore a matter of little concern
which allele is fixed, so long as fixation occurs. Selection against a
deleterious recessive may prevent the deleterious allele becoming
fixed, but it will not prevent or delay the fixation of the more
favourable allele. Therefore the conclusions about selection reached in the
previous chapter are of little relevance to close inbreeding. Selection
that favours heterozygotes, however, is another matter. A
consequence of inbreeding almost universally observed is a reduction of
fitness, the reasons for which will be given in Chapter 14. Thus
selection resists the inbreeding, since the more homozygous
individuals are the less fit, and this can only mean that selection favours
heterozygotes: not necessarily heterozygotes of the loci taken singly,
but heterozygotes of segments of chromosome. It is only necessary
to have two deleterious genes, recessive or partially recessive, linked
in repulsion, to confer a selective advantage on the heterozygote of
the segment of chromosome within which the genes are located. It is
therefore important to find out how the opposing tendencies of
inbreeding and selection in favour of heterozygotes balance each
other, in order to assess the reliability of the computed inbreeding
coefficient as a measure of the probability of fixation.
The outcome of the joint action of inbreeding and selection in
favour of heterozygotes depends on whether there is replacement of
the less fit lines by the more fit; in other words, on whether selection
operates between lines or only within lines. Within any one line,
selection against homozygotes only delays the progress toward
fixation and cannot arrest it, the delay being roughly in proportion to
the intensity of the selection (Reeve, 1955a). Table 5.2 shows the
rates of inbreeding with various intensities of selection, when there
are two alleles and selection acts equally against both homozygotes.
(The rate of inbreeding, $\Delta F$, is used here to mean the rate of dispersion
of gene frequencies and, after the first few generations when the
distribution of gene frequencies has become flat, it measures the rate
of fixation, i.e. the proportion of unfixed loci that become fixed in
each generation, as explained in Chapter 3.) The delay of fixation
caused by selection is least under the closest systems of inbreeding.
Thus the rate is halved under self-fertilisation when the coefficient
of selection is 0.67; under full-sib mating when it is 0.44; and under
half-sib mating when it is 0.35. It will be seen from the table that the
rate of inbreeding, though much reduced by intense selection, does
not become zero until the coefficient of selection rises to 1. If there is
only one line, therefore, fixation eventually goes to completion, unless
both homozygotes are entirely inviable or sterile.

Table 5.2
Rate of inbreeding, $\Delta F$ (%), with selection favouring the
heterozygote. (Except with self-fertilisation, the rates are
only approximate over the first few generations of inbreeding.)

Coefficient of selection
against the homozygotes (s)    Self-fertilisation    Full sib    Half sib*
        0                            50.00             19.10       13.01
        0.2                          44.44             14.88        9.32
        0.4                          37.50             10.32        5.6?
        0.6                          28.57              5.71        2.48
        0.75                         20.00              2.62        0.82
        0.8                          16.67              1.76        0.46

* Females full sisters to each other.
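
For self-fertilisation the effect of selection on the rate of fixation can be illustrated by a simple argument, which is an assumption of this sketch and is not taken from the text or from Reeve's paper. A selfed heterozygote gives progeny in the proportions 1/4 : 1/2 : 1/4; if each homozygote survives with relative fitness $1 - s$, the chance that the surviving parent of the next generation is again heterozygous is $1/(2 - s)$, and the rate of fixation is $(1 - s)/(2 - s)$. The function name is illustrative.

```python
def selfing_rate(s):
    """Rate of fixation per generation in a selfed line (assumed simple model).

    s is the coefficient of selection against both homozygotes.
    """
    heterozygotes = 0.5                  # share of Aa among progeny of Aa
    homozygotes = 0.5 * (1 - s)          # AA and aa together, after selection
    return 1 - heterozygotes / (heterozygotes + homozygotes)


for s in (0.0, 0.2, 0.4, 0.6, 0.75, 0.8):
    print(s, round(100 * selfing_rate(s), 2))   # rate in per cent

print(round(selfing_rate(2 / 3), 3))    # rate when s is about 0.67
print(selfing_rate(1.0))                # both homozygotes inviable
```

Under this model the rate falls to half its unselected value of 0.5 when $s$ is about 0.67 and reaches zero only when $s = 1$, which agrees with the statements made above for self-fertilisation.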
If there are many lines, however, selection may arrest the progress
of fixation and lead to a state of equilibrium, for the following reason.
The amount by which the inbreeding has changed the frequency of a
particular gene from its original value differs at any one time from
line to line. In other words, the state of dispersion of the locus has
gone further in some lines than in others. Now, if those lines in
which the dispersion has gone furthest, and which are consequently
most reduced in fitness, die out or are discarded, and if they are
replaced by sub-lines taken from the lines in which it has gone least
far, then the progress of the dispersive process will have been set
back. When there is replacement of lines in this way, and the
selection is sufficiently intense, a state of balance between the opposing
tendencies of inbreeding and selection is reached. The intensity of
selection needed to arrest the dispersive process has been worked out
for regular systems of close inbreeding (Hayman and Mather, 1953).
Some of the conclusions, for the case of two alleles with equal
selection against the two homozygotes, are given in Table 5.3, which
shows the intensity of selection against the homozygotes which will
(a) just allow fixation to go eventually to completion, and (b) arrest
the dispersive process at a point of balance where the frequency of
heterozygotes is half its original value, i.e. where $P = 1 - F = 0.5$.
These figures show that only a moderate advantage of heterozygotes
will suffice to prevent complete fixation. Under full-sib mating, for
example, loci, or segments of chromosomes that do not recombine,
with a 25 per cent disadvantage in homozygotes will not all go to
fixation. And, of those with a 50 per cent disadvantage, only about
half will become fixed, no matter for how long the inbreeding is
continued.

Table 5.3
Balance between inbreeding and selection in favour of
heterozygotes, when selection operates between lines. The
figures are the selective disadvantages of homozygotes, $s$,
expressed as percentages. Column (a) shows the highest
value of $s$ compatible with complete fixation. Column (b)
shows the value of $s$ that leads to a steady state at
$P = 1 - F = 0.5$.

Mating system                         (a)         (b)
                                    (P = 0)    (P = 0.5)
Self-fertilisation                   50.0        66.7
Full-sib                             23.7        44.6
Half-sib (females half sisters)      18.8        47.2
It must be stressed, however, that prevention of fixation in this
way can only take place when there is replacement of lines and
sub-lines. The following breeding methods, for example, would allow
replacement of lines: if seed, set by self-fertilisation, were collected
in bulk and a random sample taken for planting, and this were
repeated in successive generations; or, if sib pairs of mice were taken at
random from all the surviving progeny, so that the same amount of
breeding space was occupied in successive generations.
The conclusions outlined above refer to a single locus. If there
were more than a few loci on different chromosomes all subject to
selection against homozygotes of an intensity sufficient to arrest or
seriously delay the progress of inbreeding, the total loss of fitness
from all the loci would be very severe. Inbred lines of organisms
with a high reproductive rate, such as plants and Drosophila, might
well stand up to a total loss of fitness sufficient to keep several loci
or segments of chromosome permanently unfixed. But the loss of
fitness involved in preventing the fixation of more than two or three
loci in an organism such as the mouse would be crippling. Under
laboratory conditions the highly inbred strains of mice, after 100 or
more generations of sib-mating, have a fitness not much less than half
that of non-inbred strains. It is conceivable that they might have one
locus permanently unfixed, but it is difficult to believe that they can
have more. Complete lethality or sterility of both homozygotes at
one locus means a 50 per cent loss of progeny; at two unlinked loci, a
75 per cent loss. A mouse strain with a mortality or sterility of 50
per cent can be kept going, but hardly one with 75 per cent.
CHAPTER 6
CONTINUOUS VARIATION
It will be obvious, to biologist and layman alike, that the sort of
variation discussed in the foregoing chapters embraces but a small
part of the naturally occurring variation. One has only to consider
one's fellow men and women to realise that they all differ in countless
ways, but that these differences are nearly all matters of degree and
seldom present clear-cut distinctions attributable to the segregation
of single genes. If, for example, we were to classify individuals
according to their height, we could not put them into groups labelled
"tall" and "short," because there are all degrees of height, and a
division into classes would be purely arbitrary. Variation of this sort,
without natural discontinuities, is called continuous variation, and
characters that exhibit it are called quantitative characters or metric
characters, because their study depends on measurement instead of
on counting. The genetic principles underlying the inheritance of
metric characters are basically those outlined in the previous chapters,
but since the segregation of the genes concerned cannot be followed
individually, new methods of study have had to be developed and
new concepts introduced. A branch of genetics has consequently
grown up, concerned with metric characters, which is called variously
population genetics, biometrical genetics or quantitative genetics. The
importance of this branch of genetics need hardly be stressed; most
of the characters of economic value to plant and animal breeders are
metric characters, and most of the changes concerned in
micro-evolution are changes of metric characters. It is therefore in this
branch that genetics has its most important application to practical
problems and also its most direct bearing on evolutionary theory.
How does it come about that the intrinsically discontinuous
variation caused by genetic segregation is translated into the continuous
variation of metric characters? There are two reasons: one is the
simultaneous segregation of many genes affecting the character, and
the other is the superimposition of truly continuous variation arising
from non-genetic causes. Consider, for example, a simplified
situation. Suppose there is segregation at six unlinked loci, each with two
alleles at frequencies of 0.5. Suppose that there is complete
dominance of one allele at each locus and that the dominant alleles each add
one unit to the measurement of a certain character. Then if the
segregation of these genes were the only cause of variation there would
be 7 discrete classes in the measurements of the character, according
to whether the individual had the dominant allele present at 0, 1, 2, . . .
or 6 of the loci. The frequencies of the classes would be according to
the binomial expansion of $(\tfrac{1}{4} + \tfrac{3}{4})^6$, as shown in Fig. 6.1 (a). If our
measurements were sufficiently accurate we should recognise these
classes as being distinct and we should be able to place any individual
unambiguously in its class. If there were more genes segregating but
each had a smaller effect, there would be more classes with smaller
differences between them, as in Fig. 6.1 (b). It would then be more
difficult to distinguish the classes, and if the difference between the
classes became about as small as the error of measurement we should
no longer be able to recognise the discontinuities. In addition, metric
characters are subject to variation from non-genetic causes, and this
variation is truly continuous. Its effect is, as it were, to blur the edges
of the genetic discontinuity so that the variation as we see it becomes
continuous, no matter how accurate our measurements may be.

Fig. 6.1. Distributions expected from the simultaneous
segregation of two alleles at each of several or many loci: (a) 6 loci, (b) 24
loci. There is complete dominance of one allele over the other at
each locus, and the gene frequencies are all 0.5. Each locus, when
homozygous for the recessive allele, is supposed to reduce the
measurement by 1 unit in (a), and by 1/4 unit in (b). The horizontal
scale, representing the measurement, shows the number of loci
homozygous for the recessive allele, and the vertical axis shows the
probability, or the percentage of individuals expected in each class.
The probabilities are derived from the binomial expansion of
$(\tfrac{1}{4} + \tfrac{3}{4})^n$, where $n$ is the number of loci, and they are taken from the
tables of Warwick (1932).
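
The class frequencies described above follow directly from the binomial distribution and can be computed as in the sketch below. The function name is illustrative. With complete dominance and gene frequencies of 0.5, the probability that a locus is homozygous for the recessive allele is taken as $0.5^2 = 0.25$, which assumes random mating.

```python
from math import comb


def class_frequencies(n_loci, p_recessive=0.25):
    """Probability that k of n_loci are homozygous recessive, for k = 0..n_loci."""
    return [
        comb(n_loci, k) * p_recessive**k * (1 - p_recessive) ** (n_loci - k)
        for k in range(n_loci + 1)
    ]


six = class_frequencies(6)            # 7 classes
twenty_four = class_frequencies(24)   # 25 classes, each step a quarter as large

for k, freq in enumerate(six):
    print(k, round(100 * freq, 1))    # percentage of individuals in each class
```

With 24 loci there are 25 classes spread over the same range of measurement, so the steps between adjacent classes are much smaller and harder to detect.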
Thus the distinction between genes concerned with Mendelian
characters and those concerned with metric characters lies in the
magnitude of their effects relative to other sources of variation. A
gene with an effect large enough to cause a recognisable discontinuity
even in the presence of segregation at other loci and of non-genetic
variation can be studied by Mendelian methods, whereas a gene whose
effect is not large enough to cause a discontinuity cannot be studied
individually. This distinction is reflected in the terms major gene and
minor gene. There are, however, all intermediate grades, genes that
cannot properly be classed as major or as minor, such as the "bad
genes" of Mendelian genetics. And, furthermore, as a result of
pleiotropy the same gene may be classed as major with respect to one
character and minor with respect to another character. The
distinction, though convenient, is therefore not a fundamental one, and there
is no good evidence that there are two sorts of genes with different
properties. Variation caused by the simultaneous segregation of
many genes may be called polygenic variation, and the minor genes
concerned are sometimes referred to as polygenes (see Mather, 1949).
Metric Characters
The metric characters that might be studied in any higher
organism are almost infinitely numerous. Any attribute that varies
continuously and can be measured might in principle be studied as a
metric character: anatomical dimensions and proportions,
physiological functions of all sorts, and mental or psychological qualities.
The essential condition is that they should be measurable. The
technique of measurement, however, sets a practical limitation on
what can be studied. Usually rather large numbers of individuals
have to be measured and the study of any character whose
measurement requires an elaborate technique therefore becomes
impracticable. Consequently the characters that have been used in studies of
quantitative genetics are predominantly anatomical dimensions, or
physiological functions measured in terms of an end-product, such as
lactation, fertility, or growth rate.

Fig. 6.2. Frequency distributions of four metric characters, with
normal curves superimposed. The means are indicated by arrows.
The characters are as follows, the number of observations on which
each histogram is based being given in brackets:
(a) Mouse (♂♂): growth from 3 to 6 weeks of age. (380)
(b) Mouse: litter size (number of live young in 1st litters). (689)
(c) Drosophila melanogaster (♀♀): number of bristles on ventral
surface of 4th and 5th abdominal segments, together. (900)
(d) Drosophila melanogaster (♀♀): number of facets in the eye of
the mutant "Bar". (488)
(a), (b), and (c) are from original data: (d) is from data of Zeleny
(1922).
Some examples of metric characters are illustrated in Fig. 6.2.
The variation is represented graphically by the frequency
distribution of measurements. The measurements are grouped into equally
spaced classes and the proportion of individuals falling in each class
is plotted on the vertical scale. The resulting histogram is
discontinuous only for the sake of convenience in plotting. If the class ranges
were made smaller and the number of individuals measured were
increased indefinitely the histogram would become a smooth curve.
The variation of some metric characters, such as bristle number or
litter size, is not strictly speaking continuous because, being measured
by counting, their values can only be whole numbers. Nevertheless,
one can regard the measurements in such cases as referring to an
underlying character whose variation is truly continuous though
expressible only in whole numbers, in a manner analogous to the
grouping of measurements into classes. For example, litter size may
be regarded as a measure of the underlying, continuously varying
character, fertility. For practical purposes such characters can be
treated as continuously varying, provided the number of classes is
not too small. When there are too few classes, as for example when
susceptibility to disease is expressed as death or survival, different
methods have to be employed, as will be explained in Chapter 18.
The frequency distributions of most metric characters
approximate more or less closely to normal curves. This can be seen in
Fig. 6.2, where the smooth curves drawn through the histograms are
normal curves having means and variances calculated from the data.
In the study of metric characters it is therefore possible to make use
of the properties of the normal distribution and to apply the
appropriate statistical techniques. Sometimes, however, the scale of
measurement must be modified if a distribution approximating to the
normal is to be obtained. The distribution in Fig. 6.2 (d), for example,
would be skewed if measured and plotted simply as the number of
facets. But it becomes symmetrical, and approximates to a normal
distribution, if measured and plotted in logarithmic units. The
criteria on which the choice of a scale of measurement rests cannot be
fully appreciated at this stage, and will be explained in Chapter 17.
Meantime it will be assumed that any metric character under
discussion is measured on an appropriate scale and has a distribution
that is approximately normal.
General Survey of the Subject-matter
There are two basic genetic phenomena concerned with metric
characters, both more or less familiar to all biologists, and each forms
the basis of a breeding method. The first is the resemblance between
relatives. Everyone is familiar with the fact that relatives tend to
resemble each other, and the closer the relationship, in general the
closer the resemblance. Though it is only in our own species that
resemblances are readily discernible without measurement, the
phenomenon is equally present in other species. The degree of
resemblance varies with the character, some showing more, some less.
The resemblance between offspring and parents provides the basis
for selective breeding. Use of the more desirable individuals as
parents brings about an improvement of the mean level of the next
generation, and just as some characters show more resemblance than
others, so some are more responsive to selection than others. The
degree of resemblance between relatives is one of the properties of a
population that can be readily observed, and it is one of the aims of
quantitative genetics to show how the degree of resemblance between
different sorts of relatives can be used to predict the outcome of
selective breeding and to point to the best method of carrying out the
selection. This problem will form the central theme of the next
seven chapters, the resemblance between relatives being dealt with in
Chapters 9 and 10, and the effects of selection in Chapters 11-13.
The second basic genetic phenomenon is inbreeding depression,
with its converse hybrid vigour, or heterosis. This phenomenon is less
familiar to the layman than the first, since the laws against incest
prevent its more obvious manifestations in our own species; but it is well
known to animal and plant breeders. Inbreeding tends to reduce the
mean level of all characters closely connected with fitness in animals
and in naturally outbreeding plants, and to lead in consequence to loss
of general vigour and fertility. Since most characters of economic
value in domestic animals and plants are aspects of vigour or fertility,
inbreeding is generally deleterious. The reduced vigour and fertility
of inbred lines is restored on crossing, and in certain circumstances
this hybrid vigour can be made use of as a means of improvement.
The enormous improvement of the yield of commercially grown
maize has been achieved by this means and represents probably the
greatest practical achievement of genetics (see Mangelsdorf, 1951).
The effects of inbreeding and crossing will be described in Chapters
14-16.
The properties of a population that we can observe in connexion
with a metric character are means, variances, and covariances. The
natural subdivision of the population into families allows us to analyse
the variance into components which form the basis for the
measurement of the degree of resemblance between relatives. We can in
addition observe the consequences of experimentally applied
breeding methods, such as selection, inbreeding or crossbreeding. The
practical objective of quantitative genetics is to find out how we can
use the observations made on the population as it stands to predict
the outcome of any particular breeding method. The more general
aim is to find out how the observable properties of the population are
influenced by the properties of the genes concerned and by the various
nongenetic circumstances that may influence a metric character. The
chief properties of genes that have to be taken account of are the
degree of dominance, the manner in which genes at different loci
combine their effects, pleiotropy, linkage, and fitness under natural
selection. To take account of all these properties simultaneously, in
addition to a variety of nongenetic circumstances, would make the
problems unmanageably complex. We therefore have to simplify
matters by dealing with one thing at a time, starting with the simpler
situations.
The plan to be followed in the succeeding chapters is this: we
shall first show what determines the population mean, and then
introduce two new concepts — average effect and breeding value —
which are necessary to an understanding of the variance. Then we
shall discuss the variance, its analysis into components, and the
covariance of relatives, which will lead us to the degree of resemblance
between relatives. In all this we shall take full account of dominance
from the beginning: the other complicating factors will be more
briefly discussed when they become relevant. The most important
simplification that we shall make concerns the effect of genes on
fitness: we shall assume that Mendelian segregation is undisturbed
by differential fitness of the genotypes. The description of means,
variances, and covariances will refer to a random breeding
population, with Hardy-Weinberg equilibrium genotype frequencies, with
no selection and no inbreeding. That is to say, we shall describe the
population before any special breeding method is applied to it. Then
in Chapters 11-13 we shall describe the effects of selection, and in
Chapters 14-16 the effects of inbreeding. This will cover the
fundamentals of quantitative genetics, and in the final chapters we shall
discuss some special topics.
CHAPTER 7
VALUES AND MEANS
We have seen in the early chapters that the genetic properties of a
population are expressible in terms of the gene frequencies and
genotype frequencies. In order to deduce the connexion between these
on the one hand and the quantitative differences exhibited in a metric
character on the other, we must introduce a new concept, the concept
of value, expressible in the metric units by which the character is
measured. The value observed when the character is measured on an
individual is the phenotypic value of that individual. All observations,
whether of means, variances, or covariances, must clearly be based on
measurements of phenotypic values. In order to analyse the genetic
properties of the population we have to divide the phenotypic value
into component parts attributable to different causes. Explanation of
the meanings of these components is our chief concern in this chapter,
though we shall also be able to find out how the population mean is
influenced by the array of gene frequencies.
The first division of phenotypic value is into components
attributable to the influence of genotype and environment. The genotype is
the particular assemblage of genes possessed by the individual, and
the environment is all the nongenetic circumstances that influence the
phenotypic value. Inclusion of all nongenetic circumstances under
the term environment means that the genotype and the environment
are by definition the only determinants of phenotypic value. The two
components of value associated with genotype and environment are
the genotypic value and the environmental deviation. We may think
of the genotype conferring a certain value on the individual and the
environment causing a deviation from this, in one direction or the
other. Or, symbolically,
P=G+E (7.1)
where P is the phenotypic value, G is the genotypic value, and E is the
environmental deviation. The mean environmental deviation in the
population as a whole is taken to be zero, so that the mean phenotypic
value is equal to the mean genotypic value. The term population
mean then refers equally to phenotypic or to genotypic values. When
dealing with successive generations we shall assume for simplicity that
the environment remains constant from generation to generation, so
that the population mean is constant in the absence of genetic change.
If we could replicate a particular genotype in a number of individuals
and measure them under environmental conditions normal for the
population, their mean environmental deviations would be zero, and
their mean phenotypic value would consequently be equal to the
genotypic value of that particular genotype. This is the meaning of
the genotypic value of an individual. In principle it is measurable,
but in practice it is not, except when we are concerned with a single
locus where the genotypes are phenotypically distinguishable, or with
the genotypes represented in highly inbred lines.
For the purposes of deduction we must assign arbitrary values to
the genotypes under discussion. This is done in the following way.
Considering a single locus with two alleles, $A_1$ and $A_2$, we call the
genotypic value of one homozygote $+a$, that of the other homozygote
$-a$, and that of the heterozygote d. (We shall adopt the convention
that $A_1$ is the allele that increases the value.) We thus have a scale of
genotypic values as in Fig. 7.1. The origin, or point of zero value, on
this scale is mid-way between the values of the two homozygotes.
Genotype:          $A_2A_2$                $A_1A_2$    $A_1A_1$
Genotypic value:   $-a$          $0$       $d$         $+a$
Fig. 7.1. Arbitrarily assigned genotypic values.
The value, d, of the heterozygote depends on the degree of dominance.
If there is no dominance, $d = 0$; if $A_1$ is dominant over $A_2$, d is positive,
and if $A_2$ is dominant over $A_1$, d is negative. If dominance is
complete, d is equal to $+a$ or $-a$, and if there is overdominance d is
greater than $+a$ or less than $-a$. The degree of dominance may be
expressed as $d/a$.
Example 7.1. For the purposes of illustration in this chapter, and also
later on, we shall refer to a dwarfing gene in the mouse, known as "pygmy"
(symbol pg), described by King (1950, 1955), and by Warwick and Lewis
(1954). This gene reduces body-size and is nearly, but not quite, recessive
in its effect on size. It was present in a strain of small mice (MacArthur's)
at the time the studies cited above were made. The weights of mice of the
three genotypes at 6 weeks of age were approximately as follows (sexes
averaged):

                     ++      +pg     pgpg
Weight in grams      14      12      6

(The weight of heterozygotes given here is to some extent conjectural, but
it is unlikely to be more than 1 gm. in error.) These are average weights
obtained under normal environmental conditions, and they are therefore
the genotypic values. The midpoint in genotypic value between the two
homozygotes is 10 gm., and this is the origin, or zero-point, on the scale
of values assigned as in Fig. 7.1. The value of a on this scale is therefore
4 gm., and that of d is 2 gm.
Population Mean
We can now see how the gene frequencies influence the mean of
the character in the population as a whole. Let the gene frequencies
of $A_1$ and $A_2$ be p and q respectively. Then the first two columns of
Table 7.1 show the three genotypes and their frequencies in a random
breeding population, from formula 1.2. The third column shows the
genotypic values as specified above.

Table 7.1
Genotype      Frequency    Value    Freq. × val.
$A_1A_1$      $p^2$        $+a$     $p^2 a$
$A_1A_2$      $2pq$        $d$      $2pqd$
$A_2A_2$      $q^2$        $-a$     $-q^2 a$
                                    Sum $= a(p - q) + 2dpq$

The mean value in the whole
population is obtained by multiplying the value of each genotype by
its frequency and summing over the three genotypes. The reason why
this yields the mean value may be understood by converting
frequencies to numbers of individuals. Multiplying the value by the
number of individuals in each genotype and summing over genotypes
gives the sum of values of all individuals. The mean value would then
be this sum of values divided by the total number of individuals. The
procedure in working with frequencies is the same, but since the sum
of the frequencies is 1, the sum of values × frequencies is the mean
value. In other words, the division by the total number has already
been made in obtaining the frequencies. Multiplication of values by
frequencies to obtain the mean value is a procedure that will be often
used in this chapter and subsequent ones. Returning to the
population mean, multiplication of the value by the frequency of each
genotype is shown in the last column of Table 7.1. Summation of
this column is simplified by noting that $p^2 - q^2 = (p + q)(p - q) = p - q$.
The population mean, which is the sum of this column, is thus
$$M = a(p - q) + 2dpq \tag{7.2}$$
This is both the mean genotypic value and the mean phenotypic
value of the population with respect to the character.
The contribution of any locus to the population mean thus has
two terms: $a(p - q)$ attributable to homozygotes, and $2dpq$ attributable
to heterozygotes. If there is no dominance ($d = 0$) the second term is
zero, and the mean is proportional to the gene frequency: $M = a(1 - 2q)$.
If there is complete dominance ($d = a$) the mean is proportional to the
square of the gene frequency: $M = a(1 - 2q^2)$. The total range of
values attributable to the locus is $2a$, in the absence of overdominance.
That is to say, if $A_1$ were fixed in the population ($p = 1$) the
population mean would be $a$, and if $A_2$ were fixed ($q = 1$) it would be $-a$.
If the locus shows overdominance, however, the mean of an unfixed
population is outside this range.
Example 7.2. Let us take again the pygmy gene in mice, as described
in Example 7.1, and see what effect this gene would have on the population
mean when present at two particular frequencies. First, the total range is
from 6 gm. to 14 gm.: a population consisting entirely of pygmy
homozygotes would have a mean of 6 gm., and one from which the gene was
entirely absent would have a mean of 14 gm. (These values refer
specifically to MacArthur's Small Strain at the time the observations were
made.) Now suppose the gene were present at a frequency of 0.1, so that
under random mating homozygotes would appear with a frequency of 1
per cent. The values to be substituted in equation 7.2 are p = 0.9, q = 0.1,
and a = 4 gm., d = 2 gm., as shown in Example 7.1. The population mean,
by equation 7.2, is therefore: M = 4 × 0.8 + 2 × 0.18 = 3.56. This value of
the mean, however, is measured from the mid-homozygote point, which is
10 gm., as origin. Therefore the actual value of the population mean is
13.56 gm. Next suppose the gene were present at a frequency of 0.4.
Substituting in the same way, we find M = 1.76, to which must be added
10 gm. for the origin, giving a value of 11.76 gm. Rough corroboration of
these figures is given by the records of the strain carrying the gene.
When the gene was present at a frequency of about 0.4 the mean weight
was about 12 gm. Two generations later, when the pygmy gene had been
deliberately eliminated, the mean weight rose to about 14 gm.
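
The arithmetic of Example 7.2 can be checked with a short Python sketch. It is offered only as an illustration and is not part of the original text: the function name population_mean and the variable names are assumptions made here. The function evaluates equation 7.2, and the mid-homozygote origin of 10 gm. is then added to give the mean in grams.

```python
def population_mean(a, d, q):
    """Equation 7.2: M = a(p - q) + 2dpq, measured from the mid-homozygote point."""
    p = 1.0 - q
    return a * (p - q) + 2.0 * d * p * q


# Pygmy gene (Example 7.1): a = 4 gm, d = 2 gm, origin (mid-homozygote point) = 10 gm.
a, d, origin = 4.0, 2.0, 10.0

for q in (0.1, 0.4):
    m = population_mean(a, d, q)
    print(f"q = {q}: M = {m:.2f}, mean weight = {origin + m:.2f} gm")
```

With q = 0 or q = 1 the same function returns +a or -a, the two ends of the range described in the text.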
Now we have to put together the contributions of genes at several
loci and find their joint effect on the mean. This introduces the
question of how genes at different loci combine to produce a joint
effect on the character. For the moment we shall suppose that
combination is by addition, which means that the value of a genotype
with respect to several loci is the sum of the values attributable to the
separate loci. For example, if the genotypic value of $A_1A_1$ is $a_A$ and
that of $B_1B_1$ is $a_B$, then the genotypic value of $A_1A_1B_1B_1$ is $a_A + a_B$.
The consequences of nonadditive combination will be explained at
the end of this chapter. With additive combination, then, the
population mean resulting from the joint effects of several loci is the sum of
the contributions of each of the separate loci, thus:
$$M = \sum a(p - q) + 2\sum dpq \tag{7.3}$$
This is again both the genotypic and the phenotypic mean value. The
total range in the absence of overdominance is now $2\sum a$. If all alleles
that increase the value were fixed the mean would be $+\sum a$, and if all
alleles that decrease the value were fixed it would be $-\sum a$. These are
the theoretical limits to the range of potential variation in the
population. The origin from which the mean value in equation 7.3 is
measured is the midpoint of the total range. This is equivalent to
the average mid-homozygote point of all the loci separately.
Example 7.3. As an example of two loci that combine additively, and
also of their joint effects on the population mean, we shall refer to two
colour genes in mice, whose effects on the number of pigment granules
have been described by Russell (1949). This is a metric character which
reflects the intensity of pigmentation in the coat. The two genes are
"brown" (b) and "extreme dilution" ($c^e$), an allele of the albino series.
Measurements were made of the number of melanin granules per unit
volume of hair, in wild-type homozygotes, in the two single mutant
homozygotes, and in the double mutant homozygote. We shall assume both
wild-type alleles to be completely dominant, so that only these four
genotypes need be considered. The mean numbers of granules in the four
genotypes were as follows:

                B-      bb      $2a_B$
C-              95      90      5
$c^e c^e$       38      34      4
$2a_C$          57      56

The difference between the two figures in each row and in each column
measures the homozygote difference, or 2a on the scale of values assigned
as in Fig. 7.1. Apart from the trivial discrepancy of 1 unit, these differences
are independent of the genotype at the other locus. In other words, the
difference of value between B- and bb is the same among C- genotypes
as it is among $c^e c^e$ genotypes; and similarly the difference between C- and
$c^e c^e$ is the same in B- as it is in bb. Thus the two loci combine
additively, and the value of a composite genotype can be rightly predicted
from knowledge of the values of the single genotypes. For example: the
bb genotype is 5 units less than the wild-type, and the $c^e c^e$ is 57 units less;
therefore bb $c^e c^e$ should be 62 units less, namely 33, which is almost
identical with the observed value of 34.
We may use this example further to illustrate the effect of the two
loci jointly on the population mean. Let us work out, from the effects of
the loci taken separately, what would be the mean granule number in a
population in which the frequency of bb was $q_B^2 = 0.4$, and that of $c^e c^e$
was $q_C^2 = 0.2$. For the effects of the loci separately we shall take $a_B = 2$ and
$a_C = 28$. The population mean, considering one locus, is $M = a(1 - 2q^2)$,
when there is complete dominance. For the B locus this is
$M_B = 2 \times 0.2 = 0.4$; and for the C locus $M_C = 28 \times 0.6 = 16.8$. The mean,
considering both loci together, is $M_B + M_C = 17.2$ (by equation 7.3). The point
of origin from which this is measured is the midpoint between the two double
homozygotes, which is $\tfrac{1}{2}(95 + 34) = 64.5$. Thus the mean granule number in this
population would be $64.5 + 17.2 = 81.7$. We may check this from the
observations of the values of the joint genotypes. The four genotypes would
have the following frequencies and observed values:

Genotype          B- C-    B- $c^e c^e$    bb C-    bb $c^e c^e$
Frequency         0.48     0.12            0.32     0.08
Observed value    95       38              90       34

The mean value is obtained by multiplying the values by the frequencies
and summing over the four genotypes. This yields a mean granule number
of 81.68.
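
The two routes to the mean in Example 7.3 can be compared in a small Python sketch. It is an illustration only, not part of the original text; the function name, the variable names and the genotype labels (with "cc" standing for the extreme-dilution homozygote) are assumptions made here. The joint genotype frequencies are taken as the products of the single-locus frequencies, as in the example.

```python
def mean_complete_dominance(a, q_squared):
    """One-locus mean with complete dominance (d = a): M = a(1 - 2q^2)."""
    return a * (1.0 - 2.0 * q_squared)


# Values from Example 7.3; the names are illustrative.
a_B, a_C = 2.0, 28.0        # half the homozygote differences at the two loci
qq_B, qq_C = 0.4, 0.2       # frequencies of the bb and extreme-dilution homozygotes
origin = 0.5 * (95 + 34)    # midpoint between the two double homozygotes

# Route 1: sum of the separate contributions of the loci (equation 7.3).
predicted = origin + mean_complete_dominance(a_B, qq_B) + mean_complete_dominance(a_C, qq_C)

# Route 2: frequency-weighted mean of the four observed joint genotypes.
observed = {("B-", "C-"): 95, ("B-", "cc"): 38, ("bb", "C-"): 90, ("bb", "cc"): 34}
freq_B = {"B-": 1.0 - qq_B, "bb": qq_B}
freq_C = {"C-": 1.0 - qq_C, "cc": qq_C}
direct = sum(freq_B[b] * freq_C[c] * value for (b, c), value in observed.items())

print(f"predicted from separate loci: {predicted:.2f}")
print(f"direct from joint genotypes:  {direct:.2f}")
```

The small difference between the two results reflects the discrepancy of 1 unit noted in the table of observed values.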
Average Effect
In order to deduce the properties of a population connected with
its family structure we have to deal with the transmission of value
from parent to offspring, and this cannot be done by means of
genotypic values alone, because parents pass on their genes and not their
genotypes to the next generation, genotypes being created afresh in
each generation. A new measure of value is therefore needed which
will refer to genes and not to genotypes. This will enable us to assign
a "breeding value" to individuals, a value associated with the genes
carried by the individual and transmitted to its offspring. The new
measure is the "average effect." We can assign an average effect to a
gene in the population, or to the difference between one gene and
another of an allelic pair. The average effect of a gene is the mean
deviation from the population mean of individuals which received
that gene from one parent, the gene received from the other parent
having come at random from the population. This may be stated in
another way. Let a number of gametes all carrying $A_1$ unite at
random with gametes from the population; then the mean deviation from
the population mean of the genotypes so produced is equal to the
average effect of the gene $A_1$. The concept of average effect is perhaps
easier to grasp in the form of the average effect of a gene-substitution,
which can more conveniently be used when only two alleles at a locus
are under consideration. If we could change, say, $A_2$ genes into $A_1$ at
random in the population, and could then note the resulting change of
value, this would be the average effect of the gene-substitution. It is
equal to the difference between the average effects of the two genes
involved in the substitution. A graphical representation of the average
effect of a gene-substitution is given later in Fig. 7.2.
It is important to realise that the average effect of a gene or a
gene-substitution depends on the gene frequency, and that the average
effect is therefore a property of the population as well as of the gene.
The reason for this can be seen in the words "taken at random" in the
definitions, because the content of the random sample depends on the
gene frequency in the population. The point may perhaps be more
easily understood from a specific example. Consider the substitution
of a recessive gene, a, for its dominant allele, A. The substitution will
change the value only when the individual already carries one
recessive allele, in other words in heterozygotes. Changing AA into Aa
will not affect the value, but changing Aa into aa will. Now, when the
frequency of the recessive allele, a, is low there will be many AA
individuals, which the substitution will not affect; but when the
recessive is at high frequency there will be very few AA individuals,
and most of the individuals in which a substitution can be made will
be affected by it. Therefore the average effect of the substitution will
be small when the frequency of the recessive allele is low and large
when it is high.
Let us see how the average effect is related to the genotypic
values, a and d, in terms of which the population mean was expressed.
This will help to make the concept clearer. The reasoning is set out
in Table 7.2.

Table 7.2
Type of    Values and frequencies of         Mean value      Population mean          Average effect
gamete     genotypes produced                of genotypes    to be deducted           of gene
           $A_1A_1$   $A_1A_2$   $A_2A_2$    produced
           $+a$       $d$        $-a$
$A_1$      $p$        $q$                    $pa + qd$       $-[a(p - q) + 2dpq]$     $q[a + d(q - p)]$
$A_2$                 $p$        $q$         $-qa + pd$      $-[a(p - q) + 2dpq]$     $-p[a + d(q - p)]$

Consider a locus with two alleles, $A_1$ and $A_2$, at
frequencies p and q respectively, and take first the average effect of the
gene $A_1$, for which we shall use the symbol $\alpha_1$. If gametes carrying $A_1$
unite at random with gametes from the population, the frequencies
of the genotypes produced will be p of $A_1A_1$ and q of $A_1A_2$. The
genotypic value of $A_1A_1$ is $+a$ and that of $A_1A_2$ is d, and the mean of
these, taking account of the proportions in which they occur, is
$pa + qd$. The difference between this mean value and the population
mean is the average effect of the gene $A_1$. Taking the value of the
population mean from equation 7.2 we get
$$\alpha_1 = pa + qd - [a(p - q) + 2dpq] = q[a + d(q - p)] \tag{7.4a}$$
Similarly the average effect of the gene $A_2$ is
$$\alpha_2 = -p[a + d(q - p)] \tag{7.4b}$$
Now consider the average effect of the gene-substitution, letting $A_1$
be substituted for $A_2$. Of the $A_2$ genes taken at random from the
population for substitution, a proportion p will be found in $A_1A_2$
genotypes and a proportion q in $A_2A_2$ genotypes. In the former the
substitution will change the value from d to $+a$, and in the latter
from $-a$ to d. The average change is therefore $p(a - d) + q(d + a)$,
which on rearrangement becomes $a + d(q - p)$. Thus the average
effect of the gene-substitution (written as $\alpha$, without subscript) is
$$\alpha = a + d(q - p) \tag{7.5}$$
The relation of $\alpha$ to $\alpha_1$ and $\alpha_2$ can be seen by comparing equations
7.5 and 7.4, whence
$$\alpha = \alpha_1 - \alpha_2 \tag{7.6}$$
and
$$\alpha_1 = q\alpha, \qquad \alpha_2 = -p\alpha \tag{7.7}$$
Example 7.4. Consider again the pygmy gene and its effect on body
weight, for which a = 4 gm. and d = 2 gm. If the frequency of the pg gene
were q = 0.1, the average effect of substituting + for pg would be, by
equation 7.5, $\alpha = 4 + 2 \times (-0.8) = 2.4$ gm. And if the frequency were
q = 0.4, the average effect of the gene-substitution would be:
$\alpha = 4 + 2 \times (-0.2) = 3.6$ gm. Thus, the average effect is greater when the
gene frequency is greater. The average effects of the genes separately are,
by equation 7.7:

                                       q = 0.1    q = 0.4
Average effect of +:   $\alpha_1$ =    +0.24      +1.44
Average effect of pg:  $\alpha_2$ =    -2.16      -2.16

(The identity of the average effects of pg at the two gene frequencies is
only a coincidence.)
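
The definition of the average effect can also be followed step by step in Python. The sketch below is illustrative only and is not part of the original text; the function names are assumptions made here. It computes each average effect directly from the definition (gametes of one kind uniting at random with gametes from the population) and then checks that the difference agrees with equation 7.5.

```python
def population_mean(a, d, q):
    """Equation 7.2: M = a(p - q) + 2dpq."""
    p = 1.0 - q
    return a * (p - q) + 2.0 * d * p * q


def average_effects(a, d, q):
    """Return (alpha_1, alpha_2, alpha) for a locus with two alleles."""
    p = 1.0 - q
    m = population_mean(a, d, q)
    # A1 gametes meet A1 gametes (frequency p) or A2 gametes (frequency q).
    alpha_1 = (p * a + q * d) - m
    # A2 gametes meet A1 gametes (frequency p) or A2 gametes (frequency q).
    alpha_2 = (p * d - q * a) - m
    return alpha_1, alpha_2, alpha_1 - alpha_2


a, d = 4.0, 2.0  # pygmy gene, Example 7.1
for q in (0.1, 0.4):
    alpha_1, alpha_2, alpha = average_effects(a, d, q)
    # The difference alpha_1 - alpha_2 should equal a + d(q - p), equation 7.5.
    assert abs(alpha - (a + d * (q - (1.0 - q)))) < 1e-12
    print(f"q = {q}: alpha = {alpha:.2f}, alpha_1 = {alpha_1:+.2f}, alpha_2 = {alpha_2:+.2f}")
```

Running the loop over a finer grid of q values shows the dependence on gene frequency described earlier: with d positive, the average effect of the substitution grows as the frequency of the recessive allele rises.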
Breeding Value
The usefulness of the concept of average effect arises from the fact,
already noted, that parents pass on their genes and not their genotypes
to their progeny. It is therefore the average effects of the parent's
genes that determine the mean genotypic value of its progeny. The
value of an individual, judged by the mean value of its progeny, is
called the breeding value of the individual. Breeding value, unlike
average effect, can therefore be measured. If an individual is mated
to a number of individuals taken at random from the population then
its breeding value is twice the mean deviation of the progeny from
the population mean. The deviation has to be doubled because the
parent in question provides only half the genes in the progeny, the
other half coming at random from the population. Breeding values
can be expressed in absolute units, but are usually more conveniently
expressed in the form of deviations from the population mean, as
defined above. Just as the average effect is a property of the gene
and the population so is the breeding value a property of the individual
and the population from which its mates are drawn. One cannot
speak of an individual's breeding value without specifying the
population in which it is to be mated.
Defined in terms of average effects, the breeding value of an
individual is equal to the sum of the average effects of the genes it
carries, the summation being made over the pair of alleles at each
locus and over all loci. Thus, for a single locus with two alleles the
breeding values of the genotypes are as follows:

Genotype      Breeding value
$A_1A_1$      $2\alpha_1 = 2q\alpha$
$A_1A_2$      $\alpha_1 + \alpha_2 = (q - p)\alpha$
$A_2A_2$      $2\alpha_2 = -2p\alpha$

Example 7.5. Let us illustrate breeding values by reference to the
pygmy gene in mice. The average effects of the + and pg genes were
given in the last example. From these we may find the breeding values of
the three genotypes as explained above. These breeding values, which are
given below, are deviations from the population mean. The population
means with gene frequencies of 0.1 and 0.4 were found in Example 7.2 and
are shown again below in the column headed M.

                        Breeding values
              M         ++        +pg       pgpg
q = 0.1       13.56     +0.48     -1.92     -4.32
q = 0.4       11.76     +2.88     -0.72     -4.32

(The breeding values of pygmy homozygotes are only hypothetical
because in fact pygmy homozygotes are nearly all sterile: but this
complication may be overlooked in the present context.)
Extension to a locus with more than two alleles is straightforward,
the breeding value of any genotype being the sum of the average
effects of the two alleles present. If all loci are to be taken into
account, the breeding value of a particular genotype is the sum of the
breeding values attributable to each of the separate loci. If there is
nonadditive combination of genotypic values a slight complication
arises. We have given two definitions of breeding value, a practical
one in terms of the measured value of the progeny and a theoretical
one in terms of average effects. Nonadditive combination renders
these two definitions not quite equivalent. This point will be more
fully explained in Chapter 9.
Consideration of the definition of breeding value will show that
in a population in Hardy-Weinberg equilibrium the mean breeding
value must be zero; or if breeding values are expressed in absolute
units the mean breeding value must be equal to the mean genotypic
value and to the mean phenotypic value. This can be verified from
the breeding values listed above. Multiplying the breeding value by
the frequency of each genotype and summing gives the mean breeding
value (expressed as a deviation from the population mean) as
$$2p^2q\alpha + 2pq(q - p)\alpha - 2q^2p\alpha = 2pq\alpha(p + q - p - q) = 0$$
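
The same verification can be made numerically. The following Python sketch is an illustration and not part of the original text; the function names are assumptions made here. It computes the three breeding values for the pygmy gene (a = 4, d = 2) and weights them by the Hardy-Weinberg genotype frequencies.

```python
def breeding_values(a, d, q):
    """Breeding values of A1A1, A1A2, A2A2 as deviations from the population mean."""
    p = 1.0 - q
    alpha = a + d * (q - p)  # average effect of the gene-substitution, equation 7.5
    return (2.0 * q * alpha, (q - p) * alpha, -2.0 * p * alpha)


def genotype_frequencies(q):
    """Hardy-Weinberg frequencies of A1A1, A1A2, A2A2."""
    p = 1.0 - q
    return (p * p, 2.0 * p * q, q * q)


a, d = 4.0, 2.0  # pygmy gene, Example 7.1
for q in (0.1, 0.4):
    values = breeding_values(a, d, q)
    mean_bv = sum(f * v for f, v in zip(genotype_frequencies(q), values))
    print(f"q = {q}: breeding values = {[round(v, 2) for v in values]}")
    print(f"         mean breeding value is zero within rounding: {abs(mean_bv) < 1e-12}")
```

The check depends on the Hardy-Weinberg frequencies; with other genotype frequencies the weighted sum need not vanish.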
The breeding value is sometimes referred to as the "additive
genotype," and variation in breeding value ascribed to the "additive
effects" of genes. Though we shall not use these terms we shall
follow custom in using the term "additive" in connexion with the
variation of breeding values to be discussed in the next chapter, and
we shall use the symbol A to designate the breeding value of an
individual.
Dominance Deviation
We have separated off the breeding value as a component part of
the genotypic value of an individual. Let us consider now what
makes up the remainder. When a single locus only is under
consideration, the difference between the genotypic value, G, and the
breeding value, A, of a particular genotype is known as the dominance
deviation D, so that
$$G = A + D \tag{7.8}$$
The dominance deviation arises from the property of dominance
among the alleles at a locus, since in the absence of dominance
breeding values and genotypic values coincide. From the statistical point
of view the dominance deviations are interactions between alleles, or
withinlocus interactions. They represent the effect of putting genes
together in pairs to make genotypes; the effect not accounted for by
the effects of the two genes taken singly. Since the average effects of
genes and the breeding values of genotypes depend on the gene
frequency in the population, the dominance deviations are also
dependent on gene frequency. They are therefore partly properties
of the population and are not simply measures of the degree of
dominance.
Example 7.6. Continuing with the example of the pygmy gene, we
may now list the genotypic values and the breeding values, and so obtain
the dominance deviations of the three genotypes, by equation 7.8. These
values, all now expressed as deviations from the population mean, M, are
as follows:

                        q = 0.1: M = 13.56             q = 0.4: M = 11.76
                        ++       +pg      pgpg         ++       +pg      pgpg
Frequency               0.81     0.18     0.01         0.36     0.48     0.16
Genotypic value, G      +0.44    -1.56    -7.56        +2.24    +0.24    -5.76
Breeding value, A       +0.48    -1.92    -4.32        +2.88    -0.72    -4.32
Dominance dev., D       -0.04    +0.36    -3.24        -0.64    +0.96    -1.44

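
The subdivision G = A + D in Example 7.6 can be reproduced with a brief Python sketch. It is illustrative only and not part of the original text; the function name decompose and the variable names are assumptions made here. The dominance deviations are obtained by subtraction and compared with the expressions in terms of d alone, namely -2q²d, 2pqd and -2p²d.

```python
def decompose(a, d, q):
    """Return (G, A, D) for A1A1, A1A2, A2A2, all as deviations from the mean."""
    p = 1.0 - q
    mean = a * (p - q) + 2.0 * d * p * q          # equation 7.2
    alpha = a + d * (q - p)                       # equation 7.5
    genotypic = (a - mean, d - mean, -a - mean)
    breeding = (2.0 * q * alpha, (q - p) * alpha, -2.0 * p * alpha)
    dominance = tuple(g - b for g, b in zip(genotypic, breeding))  # equation 7.8
    return genotypic, breeding, dominance


a, d = 4.0, 2.0  # pygmy gene, Example 7.1
for q in (0.1, 0.4):
    p = 1.0 - q
    G, A, D = decompose(a, d, q)
    # Dominance deviations written as functions of d only.
    closed_form = (-2.0 * q * q * d, 2.0 * p * q * d, -2.0 * p * p * d)
    assert all(abs(x - y) < 1e-12 for x, y in zip(D, closed_form))
    print(f"q = {q}")
    print("  G:", [round(x, 2) for x in G])
    print("  A:", [round(x, 2) for x in A])
    print("  D:", [round(x, 2) for x in D])
```

Setting d = 0 in the call makes every dominance deviation zero, so that breeding values and genotypic values coincide.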
Fig. 7.2. Graphical representation of genotypic values (closed
circles), and breeding values (open circles), of the genotypes for a
locus with two alleles, $A_1$ and $A_2$, at frequencies p and q, as
explained in the text. Horizontal scale: number of $A_1$ genes in the
genotype. Vertical scales of value: on left, arbitrary values
assigned as in Fig. 7.1; on right, deviations from the population
mean. The figure is drawn to scale for the values: $d = \tfrac{3}{4}a$, and $q = \tfrac{1}{4}$.

The relations between genotypic values, breeding values and
dominance deviations can be illustrated graphically, as in Fig. 7.2,
and the meaning of the dominance deviation is perhaps more easily
understood in this way. In the figure the genotypic value (black dots)
is plotted against the number of A x genes in the genotype. A straight
regression line is fitted by least squares to these points, each point
being weighted by the frequency of the genotype it represents. The
position of this line gives the breeding values of each genotype, as
shown by the open circles. The differences between the breeding
values and the genotypic values are the dominance deviations, indi
cated by vertical dotted lines. The cross marks the population mean.
The average effect, a, of the genesubstitution is given by the differ
ence in breeding value between A 2 A 2 and A^, or between A X A 2 and
AjAi, as indicated. The original definition of the average effect of a
genesubstitution was given by Fisher (191 8, 1941) in terms of this
linear regression of genotypic value on number of genes.
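The regression definition can be checked numerically. The sketch below fits a frequency-weighted least-squares line of assigned genotypic value on the number of $A_1$ genes and reads off the slope and the fitted values. The function names are hypothetical, and the numerical inputs (a = 4, d = 2, q = 0.1) are borrowed from the pygmy example purely for illustration.

```python
# Sketch: breeding values as fitted points of a weighted regression of
# genotypic value on gene number. All names here are illustrative.

def weighted_regression(xs, ys, ws):
    """Weighted least-squares line y = intercept + slope * x."""
    total = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / total
    y_bar = sum(w * y for w, y in zip(ws, ys)) / total
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def breeding_values_by_regression(a, d, q):
    """Return (slope, breeding values) for genotypes A1A1, A1A2, A2A2."""
    p = 1.0 - q
    gene_counts = [2, 1, 0]                   # number of A1 genes per genotype
    assigned = [a, d, -a]                     # arbitrary assigned values
    freqs = [p * p, 2 * p * q, q * q]         # weights: genotype frequencies
    slope, intercept = weighted_regression(gene_counts, assigned, freqs)
    mean = sum(f * g for f, g in zip(freqs, assigned))
    fitted = [intercept + slope * x for x in gene_counts]
    return slope, [y - mean for y in fitted]  # deviations from the population mean


slope, bv = breeding_values_by_regression(a=4, d=2, q=0.1)
print(round(slope, 4), [round(x, 2) for x in bv])
```

Under these assumptions the slope should agree with $\alpha = a + d(q - p)$, and the fitted points with the breeding values $2q\alpha$, $(q - p)\alpha$ and $-2p\alpha$.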
The dominance deviation can be expressed in terms of the arbitrarily
assigned genotypic values a and d, by subtraction of the breeding
value from the genotypic value, as shown in Table 7.3. The
genotypic values must first be converted to deviations from the
population mean, because the breeding values have been expressed
in this way. The genotypic values, so converted, are given in two
forms: in terms of a and in terms of $\alpha$. Let us take the genotype
$A_1A_1$ to show how these are obtained and how the dominance deviation is
obtained by subtraction of the breeding value. The arbitrarily
assigned genotypic value of $A_1A_1$ is +a, and the population mean is
$a(p - q) + 2dpq$. Expressed as a deviation from the population mean,
the genotypic value is therefore

$$a - [a(p - q) + 2dpq] = a(1 - p + q) - 2dpq = 2qa - 2dpq = 2q(a - dp).$$

This may be expressed in terms of the average effect, $\alpha$, by
substituting $a = \alpha - d(q - p)$ (from equation 7.5), and the genotypic
value then becomes $2q(\alpha - qd)$. Subtraction of the breeding value,
$2q\alpha$, gives the dominance deviation as $-2q^2d$. By similar reasoning
the dominance deviation of $A_1A_2$ is $2pqd$, and that of $A_2A_2$ is
$-2p^2d$. Thus all the dominance deviations are functions of d. If there
is no dominance d is zero and the dominance deviations are also all zero.
Therefore in the absence of dominance, breeding values and genotypic
values are the same. Genes that show no dominance (d = 0) are sometimes
called "additive genes," or are said to "act additively."

Table 7.3
Values of genotypes in a two-allele system, measured as
deviations from the population mean.

Population mean: $M = a(p - q) + 2dpq$
Average effect of gene-substitution: $\alpha = a + d(q - p)$

Genotypes                $A_1A_1$              $A_1A_2$                      $A_2A_2$
Frequencies              $p^2$                 $2pq$                         $q^2$
Assigned values          $+a$                  $d$                           $-a$
Deviations from population mean:
  Genotypic value        $2q(a - pd)$          $a(q - p) + d(1 - 2pq)$       $-2p(a + qd)$
                         $2q(\alpha - qd)$     $(q - p)\alpha + 2pqd$        $-2p(\alpha + pd)$
  Breeding value         $2q\alpha$            $(q - p)\alpha$               $-2p\alpha$
  Dominance deviation    $-2q^2d$              $2pqd$                        $-2p^2d$
Since the mean breeding value and the mean genotypic value are
equal, it follows that the mean dominance deviation is zero. This can
be verified by multiplying the dominance deviation by the frequency
of each genotype and summing. The mean dominance deviation is
thus

$$-2p^2q^2d + 4p^2q^2d - 2p^2q^2d = 0$$

Another fact, which will be needed later when we deal with
variances, may be noted here: there is no correlation between the
dominance deviation and the breeding value of the different genotypes.
This can be shown by multiplying together the dominance deviation,
the breeding value and the frequency of each genotype. Summation
gives the sum of cross-products, and it works out to be zero, thus:

$$-4p^2q^3\alpha d + 4p^2q^2(q - p)\alpha d + 4p^3q^2\alpha d = 4p^2q^2\alpha d(-q + q - p + p) = 0$$

Since the sum of cross-products is zero, breeding values and
dominance deviations are uncorrelated.
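Both identities hold for any gene frequency, which can be confirmed numerically. The following sketch evaluates the frequency-weighted mean of the breeding values, the mean of the dominance deviations, and the sum of cross-products over a range of q. The values of a and d are arbitrary illustrative choices, not taken from the text, and the function name is hypothetical.

```python
# Sketch: numerical check that mean(A) = 0, mean(D) = 0 and sum(f * A * D) = 0.
# a and d below are arbitrary illustrative values.

def check_identities(a, d, q):
    """Return (mean breeding value, mean dominance deviation, sum of cross-products)."""
    p = 1.0 - q
    alpha = a + d * (q - p)
    freq = (p * p, 2 * p * q, q * q)
    A = (2 * q * alpha, (q - p) * alpha, -2 * p * alpha)
    D = (-2 * q * q * d, 2 * p * q * d, -2 * p * p * d)
    mean_A = sum(f * x for f, x in zip(freq, A))
    mean_D = sum(f * x for f, x in zip(freq, D))
    cross = sum(f * x * y for f, x, y in zip(freq, A, D))
    return mean_A, mean_D, cross


for q in (0.05, 0.25, 0.5, 0.9):
    # Each printed triple should be zero apart from rounding error.
    print(q, ["%.1e" % abs(v) for v in check_identities(a=1.0, d=0.75, q=q)])
```

A check of this kind does not replace the algebra above; it only guards against slips in signs or powers when the expressions are copied.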
Interaction Deviation
When only a single locus is under consideration the genotypic
value is made up of the breeding value and the dominance deviation
only. But when the genotype refers to more than one locus the
genotypic value may contain an additional deviation due to non-additive
combination. Let $G_A$ be the genotypic value of an individual
attributable to one locus, $G_B$ that attributable to a second locus, and
G the aggregate genotypic value attributable to both loci together. Then

$$G = G_A + G_B + I_{AB} \tag{7.9}$$

where $I_{AB}$ is the deviation from additive combination of these
genotypic values. In dealing with the population mean, earlier in this
chapter, we assumed that I was zero for all combinations of
genotypes. If I is not zero for any combination of genes at different loci,
those genes are said to "interact" or to exhibit "epistasis," the term
epistasis being given a wider meaning in quantitative genetics than
in Mendelian genetics. The deviation I is called the interaction
deviation or epistatic deviation. Loci may interact in pairs or in threes
or higher numbers, and the interactions may be of many different
sorts, as the behaviour of major genes shows. The complex nature of
the interactions, however, need not concern us, because in the
aggregate genotypic value interactions of all sorts are treated together as a
single interaction deviation. So for all loci together we can write

$$G = A + D + I \tag{7.10}$$

where A is the sum of the breeding values attributable to the separate
loci, and D is the sum of the dominance deviations.

If the interaction deviation is zero the genes concerned are said to
"act additively" between loci. Thus "additive action" may mean two
different things. Referred to genes at one locus it means the absence
of dominance, and referred to genes at different loci it means the
absence of epistasis.
Example 7.7. As an example of non-additive combination of two loci
we shall take the same two colour genes in mice that were used in Example
7.3 to illustrate additive combination; but this time we refer to their effects
on the size of the pigment granules, instead of their number (Russell,
1949). The mean size (diameter in μ) of the granules in the four
genotypes was as follows:

               B-      bb      Diff.
C-             1.44    0.77    0.67
c^e c^e        0.94    0.77    0.17
Diff.          0.50    0.00

This time the differences are not independent of the other genotype: the
c^e gene for example has quite a large effect on the B- genotype, but none
at all on the bb genotype. Thus the two loci show epistatic interaction and
do not combine additively. Let us therefore work out the interaction
deviations. This is not altogether a straightforward matter because the
deviations depend on the gene frequencies in the population under
discussion; it does, however, help to clarify the meaning of the interaction
deviations.
If we were to measure the homozygote differences of these two loci
with the object of estimating the value of a for each, the results would
depend on the gene frequency at the other locus. For example, the
difference between B- and bb would be 0.67 if measured in C- genotypes, but
0.17 if measured in c^e c^e genotypes. The value of a therefore depends on
the population in which it is measured. Let us take, for the sake of
illustration, a population in which the frequency of bb genotypes is
$q_B^2 = 0.4$ and the frequency of c^e c^e genotypes is $q_C^2 = 0.2$. Then
the mean homozygote difference for the B locus will be
$2a_B = (0.67 \times 0.8) + (0.17 \times 0.2) = 0.57$. Similarly, for the C
locus, $2a_C = 0.30$. The object now is to find for
each genotype the aggregate genotypic value, G, for the two loci combined
(i.e. the observed values given above); then the genotypic values, $G_B$ and
$G_C$, derived from consideration of the two loci separately; and, finally, the
interaction deviation, $I_{BC}$, according to equation 7.9. The procedure is
simplified if all these values are expressed as deviations from the
population mean. The table gives, in line (1), the four genotypes (assuming again
complete dominance at both loci); in line (2), the frequency of each
genotype in the population; and in line (3), the observed value of granule size
in each genotype. The population mean is found by multiplying the value
by the frequency of each genotype and summing over the four genotypes.
This yields M = 1.112. Subtracting the population mean from the
observed value gives the aggregate genotypic value, G, as a deviation from
the population mean, shown in line (4).

(1) Genotypes          B- C-     B- c^e c^e   bb C-     bb c^e c^e   Mean
(2) Frequencies        0.48      0.12         0.32      0.08
(3) Observed values    1.44      0.94         0.77      0.77         1.112
(4) G                 +0.328    -0.172       -0.342    -0.342        0
(5) G_B + G_C         +0.288    -0.012       -0.282    -0.582        0
(6) I_BC              +0.040    -0.160       -0.060    +0.240        0

Now consider each locus separately, paying no regard to the other locus.
The genotypic values for a single locus, expressed as deviations from the
population mean, were given in Table 7.3. With complete dominance these
reduce to $2aq^2$ for the two dominant genotypes combined, and
$-2a(1 - q^2)$ for the recessive homozygote. Take the B- genotype for
example: the value of $2a_B$ in the population under consideration was shown
above to be 0.57, and the value of $q^2$ assumed is 0.4; therefore the
genotypic value is 0.57 x 0.4 = +0.228.
This is the average value of the B- genotype irrespective of the other locus.
The other single-locus values, found in a similar way, are as follows:

G_B:    B-   +0.228      bb        -0.342
G_C:    C-   +0.060      c^e c^e   -0.240

The values given in line (5) of the table as $G_B + G_C$ are found by
summation of the two appropriate single-locus values. For example, the B- C-
genotype is +0.228 + 0.060 = +0.288. These are the genotypic values
expected if there were additive combination. It may be noted that their
mean, obtained by summation of (value x frequency) is zero, as is the mean
of the aggregate genotypic values in line (4). Finally, the interaction
deviations, $I_{BC}$, given in line (6) are obtained by subtracting the "expected"
values in line (5) from the "actual" values in line (4). The mean interaction
deviation is also zero.
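The steps of this example can be retraced in a few lines of code. The sketch below is one possible arrangement, assuming the observed granule sizes and the genotype frequencies stated in the example (bb at 0.4, c^e c^e at 0.2) and complete dominance at both loci; the dictionary keys and function name are illustrative labels only.

```python
# Sketch of Example 7.7: aggregate values, single-locus values and
# interaction deviations. Labels such as "cece" are illustrative.

observed = {
    ("B-", "C-"): 1.44, ("B-", "cece"): 0.94,
    ("bb", "C-"): 0.77, ("bb", "cece"): 0.77,
}
freq_B = {"B-": 0.6, "bb": 0.4}       # frequency of bb genotypes is 0.4
freq_C = {"C-": 0.8, "cece": 0.2}     # frequency of c^e c^e genotypes is 0.2


def interaction_deviations(observed, freq_B, freq_C):
    freq = {(b, c): freq_B[b] * freq_C[c] for (b, c) in observed}
    mean = sum(freq[k] * observed[k] for k in observed)
    G = {k: observed[k] - mean for k in observed}            # line (4)
    # Mean homozygote difference (2a) at each locus, averaged over the other.
    two_a_B = sum(freq_C[c] * (observed[("B-", c)] - observed[("bb", c)])
                  for c in freq_C)
    two_a_C = sum(freq_B[b] * (observed[(b, "C-")] - observed[(b, "cece")])
                  for b in freq_B)
    # Complete dominance: 2a*q^2 for dominants, -2a*(1 - q^2) for recessives.
    G_B = {"B-": two_a_B * freq_B["bb"], "bb": -two_a_B * (1 - freq_B["bb"])}
    G_C = {"C-": two_a_C * freq_C["cece"],
           "cece": -two_a_C * (1 - freq_C["cece"])}
    expected = {(b, c): G_B[b] + G_C[c] for (b, c) in observed}   # line (5)
    I = {k: G[k] - expected[k] for k in observed}                 # line (6)
    return mean, freq, G, expected, I


mean, freq, G, expected, I = interaction_deviations(observed, freq_B, freq_C)
print("M =", round(mean, 3))
for k in observed:
    print(k, round(G[k], 3), round(expected[k], 3), round(I[k], 3))
```

Changing the two recessive-genotype frequencies changes every line of the table, which is the point made in the text: the interaction deviations are properties of the population, not of the genotypes alone.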
CHAPTER 8
VARIANCE
The genetics of a metric character centres round the study of its
variation, for it is in terms of variation that the primary genetic
questions are formulated. The basic idea in the study of variation is
its partitioning into components attributable to different causes. The
relative magnitude of these components determines the genetic
properties of the population, in particular the degree of resemblance
between relatives. In this chapter we shall consider the nature of
these components and how the genetic components depend on the
gene frequency. Then, in the next chapter, we shall show how the
degree of resemblance between relatives is determined by the
magnitudes of the components.

The amount of variation is measured and expressed as the
variance: when values are expressed as deviations from the population
mean the variance is simply the mean of the squared values. The
components into which the variance is partitioned are the same as the
components of value described in the last chapter; so that, for
example, the genotypic variance is the variance of genotypic values,
and the environmental variance is the variance of environmental
deviations. The total variance is the phenotypic variance, or the
variance of phenotypic values, and is the sum of the separate
components. The components of variance and the values whose variance
they measure are listed in Table 8.1.
Table 8.1
Components of Variance

Variance component    Symbol    Value whose variance is measured
Phenotypic            $V_P$     Phenotypic value
Genotypic             $V_G$     Genotypic value
Additive              $V_A$     Breeding value
Dominance             $V_D$     Dominance deviation
Interaction           $V_I$     Interaction deviation
Environmental         $V_E$     Environmental deviation

The total variance is then, with certain qualifications to be
mentioned below, the sum of the components, thus:

$$V_P = V_G + V_E = V_A + V_D + V_I + V_E \tag{8.1}$$

Let us now consider these components of variance in detail.
Genotypic and Environmental Variance
The first division of phenotypic value that we made in the last
chapter was into genotypic value and environmental deviation,
P = G + E. The corresponding partition of the variance into genotypic
and environmental components formulates the problem of "heredity
versus environment" or "nature and nurture"; or, to put the question
more precisely, the relative importance of genotype and environment
in determining the phenotypic value. The "relative importance" of a
cause of variation means the amount of variation that it gives rise to,
as a proportion of the total. So the relative importance of genotype
as a determinant of phenotypic value is given by the ratio of
genotypic to phenotypic variance, $V_G/V_P$. The genotypic and
environmental components cannot be estimated directly from observations
on the population, but in certain circumstances they can be estimated
in experimental populations. If one or other component could be
completely eliminated, the remaining phenotypic variance would
provide an estimate of the remaining component. Environmental
variance cannot be removed because it includes by definition all
non-genetic variance, and much of this is beyond experimental
control. Elimination of genotypic variance can, however, be achieved
experimentally. Highly inbred lines, or the $F_1$ of a cross between two
such lines, provide individuals all of identical genotype and therefore
with no genotypic variance. If a group of such individuals is raised
under the normal range of environmental circumstances, their
phenotypic variance provides an estimate of the environmental variance
$V_E$. Subtraction of this from the phenotypic variance of a genetically
mixed population then gives an estimate of the genotypic variance
of this population.
Example 8.1. Partitioning of the phenotypic variance into its
genotypic and environmental components has been done for several characters
in Drosophila melanogaster. The results are given later, in Table 8.2, but
here we may describe the results for one character in more detail in order
to show how the partitioning is made. The character is the length of the
thorax (in units of 1/100 mm.), which may be regarded as a measure of body
size. The phenotypic variance was measured first in a genetically mixed
(i.e. a random-bred) population, and then in a genetically uniform
population, consisting of the $F_1$ generation of three crosses between highly
inbred lines. The first estimates the genotypic and environmental variance
together, and the second estimates the environmental variance alone. So,
by subtraction, an estimate of the genotypic variance is obtained. The
results, obtained by F. W. Robertson (1957b), were as follows:

Population     Components      Observed variance
Mixed          $V_G + V_E$     0.366
Uniform        $V_E$           0.186
Difference     $V_G$           0.180

$V_G/V_P = 49\%$

Thus 49 per cent of the variation of thorax length in this genetically mixed
population is attributable to genetic differences between individuals, and
51 per cent to non-genetic differences.
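The arithmetic of this partition is a single subtraction followed by a ratio. The following minimal sketch uses the two observed variances quoted in the example; the function name and the dictionary keys are illustrative only.

```python
# Sketch: partition of phenotypic variance from a mixed and a uniform group.
# Function and key names are illustrative.

def partition_variance(v_mixed, v_uniform):
    """Estimate V_G by subtraction, assuming V_E is the same in both groups."""
    v_g = v_mixed - v_uniform
    return {
        "V_P": v_mixed,                 # mixed population: V_G + V_E
        "V_E": v_uniform,               # genetically uniform group: V_E alone
        "V_G": v_g,
        "V_G/V_P": v_g / v_mixed,
    }


result = partition_variance(v_mixed=0.366, v_uniform=0.186)
print(round(result["V_G"], 3), "%.0f%%" % (100 * result["V_G/V_P"]))
```

The comment on equal environmental variance states the assumption on which the subtraction rests; the text returns to that assumption when it discusses the dependence of environmental variance on genotype.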
Individuals of identical genotype are also provided by identical
twins in man and cattle, but their use in partitioning the variance is
very limited: they will be discussed in a later chapter when the
problems that they raise will be better understood. Apart from the
severely limited use of identical twins, the partitioning of the
variance into genotypic and environmental components depends on the
availability of highly inbred lines, and is therefore restricted to
experimental populations of plants or small animals.

Three complications arise in connexion with the partitioning of
the variance into genotypic and environmental components. They
are all things that can usually be neglected or circumvented with little
risk of error, but in some circumstances they may be important. The
following account of them might well be omitted at a first reading,
unless the reader is worried by the logical fallacies introduced by
neglecting them.
Dependence of environmental variance on genotype.
Experiments of the type illustrated in Example 8.1 rest on the
assumption that the environmental variance is the same in all genotypes, and
this is certainly not always true. The environmental variance
measured in one inbred line or cross is that shown by this one particular
genotype, and other genotypes may be more or less sensitive to
environmental influences and may therefore show more or less
environmental variance. The environmental variance of the mixed
population may therefore not be the same as that measured in the
genotypically uniform group. Not very much is yet known about this
complication except that many characters show more environmental
variance among inbred than among outbred individuals, inbreds being
more sensitive or less well "buffered." The reality of the
complication is therefore not in doubt. Further discussion of the phenomenon
will be found under the effects of inbreeding, in Chapter 15, where it
more properly belongs. The existence of this complication means
that when dealing with genotypically mixed populations we have to
define the environmental component of variance as the mean
environmental variance of the genotypes in the population, and we have
to recognise the possibility that if the frequencies of the genotypes
are changed, as by selection, the environmental variance may also be
changed in consequence.
Genotype-environment correlation. Hitherto we have tacitly
assumed that environmental deviations and genotypic values are
independent of each other; in other words that there is no correlation
between genotypic value and environmental deviation, such as would
arise if the better genotypes were given better environments.
Correlation between genotype and environment is seldom an important
complication, and can usually be neglected in experimental
populations, where randomisation of environment is one of the chief objects
of experimental design. There are some situations, however, in
which the correlation exists. Milk-yield in dairy cattle provides an
example. The normal practice of dairy husbandry is to feed cows
according to their yield, the better phenotypes being given more
food. This introduces a correlation between phenotypic value and
environmental deviation; and, since genotypic and phenotypic values
are correlated, there is also a correlation between genotypic value and
environmental deviation. The complication of genotype-environment
correlation is very simply overcome by regarding the special
environment (i.e. the feeding level in the case of cows) as part of the
genotype. This situation is covered by the definition of genotypic value,
provided genotypic values are taken to refer to genotypes as they
occur under the normal conditions of association with specific
environments. If genotypic values were not so defined we could not
treat the phenotypic variance as simply the sum of the genotypic and
environmental variances, but we should have to include a covariance
term, thus:

$$V_P = V_G + V_E + 2\,\mathrm{cov}_{GE} \tag{8.2}$$

where $\mathrm{cov}_{GE}$ is the covariance of genotypic values and environmental
deviations. If the genotypic variance is estimated, as in Example 8.1,
by the comparison of genetically identical with genetically mixed
groups, then the covariance would be eliminated with the genotypic
variance from the genetically identical group, and the estimate
obtained will be of genotypic variance together with twice the
covariance. Thus, while on theoretical grounds it is convenient, on
practical grounds it is unavoidable, to regard any covariance that may
arise from genotype-environment correlation as being part of the
genotypic variance.
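The role of the covariance term can be seen in a small numerical illustration. In the sketch below the five genotypic values and environmental deviations are invented purely for illustration (they are not data from the text) and are chosen so that better genotypes tend to receive better environments. The helper names `pvar` and `pcov` are likewise hypothetical.

```python
# Sketch: V_P = V_G + V_E + 2 cov_GE when G and E are correlated.
# The values of G and E are hypothetical, chosen only for illustration.
from statistics import fmean


def pvar(xs):
    """Population variance: mean squared deviation from the mean."""
    m = fmean(xs)
    return fmean([(x - m) ** 2 for x in xs])


def pcov(xs, ys):
    """Population covariance: mean cross-product of deviations."""
    mx, my = fmean(xs), fmean(ys)
    return fmean([(x - mx) * (y - my) for x, y in zip(xs, ys)])


G = [-2.0, -1.0, 0.0, 1.0, 2.0]      # hypothetical genotypic values
E = [-1.0, -0.5, 0.5, 0.0, 1.0]      # hypothetical environmental deviations
P = [g + e for g, e in zip(G, E)]    # phenotypic values, P = G + E

v_p, v_g, v_e, cov_ge = pvar(P), pvar(G), pvar(E), pcov(G, E)
print(v_p, v_g + v_e + 2 * cov_ge)   # the two sides of equation 8.2
print(v_p - v_e, v_g + 2 * cov_ge)   # what subtraction of V_E would estimate
```

The second printed line illustrates the remark above: subtracting the environmental variance from the phenotypic variance leaves the genotypic variance together with twice the covariance, not the genotypic variance alone.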
Genotype-environment interaction. Another assumption that
we have made, which is not always justifiable, is that a specific
difference of environment has the same effect on different genotypes; or, in
other words, that we can associate a certain environmental deviation
with a specific difference of environment, irrespective of the genotype
on which it acts. When this is not so there is an interaction, in the
statistical sense, between genotypes and environments. There are
several forms which this interaction may take (Haldane, 1946). For
example, a specific difference of environment may have a greater
effect on some genotypes than on others; or there may be a change in
the order of merit of a series of genotypes when measured under
different environments. That is to say, genotype A may be superior
to genotype B in environment X, but inferior in environment Y, as in
the following example.

Example 8.2. The following figures show the growth, between 3 and
6 weeks of age, of two strains of mice reared on two levels of nutrition
(original data):

             Good nutrition    Bad nutrition
Strain A     17.2 gm.          12.6 gm.
Strain B     16.6 gm.          13.3 gm.

Strain A grows better than strain B under good conditions, but worse
under bad conditions.
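The two forms of interaction mentioned above, unequal effects of the same environmental difference and a change in the order of merit, can both be read from the four figures. The sketch below is a minimal illustration using the growth figures of the example; the dictionary layout and variable names are illustrative assumptions.

```python
# Sketch: interaction contrast and rank reversal for the two mouse strains.
# Growth figures (gm.) are those of the example; names are illustrative.

growth = {
    ("A", "good"): 17.2, ("A", "bad"): 12.6,
    ("B", "good"): 16.6, ("B", "bad"): 13.3,
}


def environment_effect(strain):
    """Effect of the good-versus-bad nutrition difference on one strain."""
    return growth[(strain, "good")] - growth[(strain, "bad")]


# Without interaction the two effects would be equal and this would be zero.
interaction_contrast = environment_effect("A") - environment_effect("B")

# A change in the order of merit between the two environments.
rank_reversal = (
    (growth[("A", "good")] > growth[("B", "good")])
    != (growth[("A", "bad")] > growth[("B", "bad")])
)

print(round(environment_effect("A"), 1), round(environment_effect("B"), 1))
print(round(interaction_contrast, 1), rank_reversal)
```

A non-zero contrast alone indicates interaction in the statistical sense; the rank reversal is the stronger case in which the better strain depends on the environment.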
An interaction between genotype and environment, whatever its
nature, gives rise to an additional component of variance. This
interaction variance can be isolated and measured only under rather
artificial circumstances. We may replicate genotypes by the use of
inbred lines or $F_1$'s, and replicate specific environments by the
control of such factors as nutrition or temperature. Then an analysis of
variance in a two-way classification of genotypes x environments will
yield estimates of the genotypic variance (between genotypes), the
environmental variance (between environments) and the variance
attributable to interaction of genotypes with environments. The
specific environments in such an experiment are, however, more in
the nature of "treatments" because a population under genetical
study would not normally encounter so wide a range of environments
as that provided by the different treatments. It is therefore the
genotype-environment interaction occurring within one such
treatment that is relevant to the genetical study of a population, and this
cannot be measured because the separate elements of the
environment cannot be isolated and controlled. In an experiment such as
that of Example 8.1, which removes the genotypic variance by the
use of inbred lines or $F_1$'s, the interaction variance remains with the
environmental in the phenotypic variance measured in the genetically
uniform individuals. In normal circumstances, therefore, the
variance due to genotype-environment interaction, since it cannot be
separately measured, is best regarded as part of the environmental
variance. When large differences of environment, such as between
different habitats, are under consideration, the presence of
genotype-environment interaction becomes important in connexion with the
specialisation of breeds or varieties to local conditions. This matter
will be taken up again later, in Chapter 19, because it can be more
profitably discussed from a different viewpoint.
Genetic Components of Variance
The partition into genotypic and environmental variance does not
take us far toward an understanding of the genetic properties of a
population, and in particular it does not reveal the cause of
resemblance between relatives. The genotypic variance must be further
divided according to the division of genotypic value into breeding
value, dominance deviation, and interaction deviation. Thus we have:

Values:                 G      =  A      +  D      +  I
Variance components:    $V_G$  =  $V_A$  +  $V_D$  +  $V_I$        (8.4)
                        (genotypic) (additive) (dominance) (interaction)

The additive variance, which is the variance of breeding values, is the
important component since it is the chief cause of resemblance
between relatives and therefore the chief determinant of the observable
genetic properties of the population and of the response of the
population to selection. Moreover, it is the only component that can be
readily estimated from observations made on the population. In
practice, therefore, the important partition is into additive genetic variance
versus all the rest, the rest being non-additive genetic and
environmental variance. This partitioning is most conveniently expressed
as the ratio of additive genetic to total phenotypic variance, $V_A/V_P$, a
ratio called the heritability.
Estimation of the additive variance rests on observation of the
degree of resemblance between relatives and will be described later
when we have discussed the causes of resemblance between relatives.
Our immediate concern here is to show how the genetic components
of variance are influenced by the gene frequency. To do this we have
to express the variance in terms of the gene frequency and the
assigned genotypic values a and d. We shall consider first a single locus
with two alleles, thus excluding interaction variance for the moment.
Additive and dominance variance. The information needed to
obtain expressions for the variance of breeding values and the variance
of dominance deviations was given in the last chapter in Table 7.3
(p. 125). This table gives the breeding values and dominance
deviations of the three genotypes, expressed as deviations from the
population mean. It will be remembered that the means of both breeding
values and dominance deviations are zero. Therefore no correction
for an assumed mean is needed, and the variance is simply the mean
of the squared values. The variances are thus obtained by squaring
the values in the table, multiplying by the frequency of the genotype
concerned, and summing over the three genotypes. (The procedure
of multiplying values by frequencies to obtain the mean was explained
on p. 114.) The additive variance, which is the variance of breeding
values, is obtained as follows:

$$V_A = \alpha^2[4q^2 \cdot p^2 + (q - p)^2 \cdot 2pq + 4p^2 \cdot q^2]$$
$$= 2pq\alpha^2(2pq + p^2 + q^2 - 2pq + 2pq)$$
$$= 2pq\alpha^2 \tag{8.5a}$$
$$= 2pq[a + d(q - p)]^2 \tag{8.5b}$$

Similarly the variance of dominance deviations is

$$V_D = d^2(4q^4p^2 + 8p^3q^3 + 4p^4q^2)$$
$$= (2pqd)^2 \tag{8.6}$$

It was noted in the last chapter that breeding values and dominance
deviations are uncorrelated. From this it follows that the genotypic
variance is simply the sum of the additive and dominance variances.
Thus

$$V_G = V_A + V_D$$
$$= 2pq[a + d(q - p)]^2 + [2pqd]^2 \tag{8.7}$$
Example 8.3. To illustrate the genetic components of variance arising
from a single locus let us return to the pygmy gene in mice, used for
several examples in the last chapter. From the values tabulated in
Example 7.6 (p. 123) we may compute the components of variance directly.
Since the values are expressed as deviations from the population mean, the
variance is obtained by multiplying the frequency of each genotype by
the square of its value, and summing over the three genotypes. For
example, the genotypic variance when q = 0.1 is
$0.81(0.44)^2 + 0.18(-1.56)^2 + 0.01(-7.56)^2 = 1.1664$. The additive
variance is obtained in the same way
from the variance of breeding values, and the dominance variance from
the variance of dominance deviations. The variances obtained are as
follows:

                     q = 0.1     q = 0.4
Genotypic, $V_G$     1.1664      7.1424
Additive, $V_A$      1.0368      6.2208
Dominance, $V_D$     0.1296      0.9216

The variances may be obtained also, and with less trouble, by use of the
formulae given above in equations 8.5, 8.6 and 8.7. The values to be
substituted were given in Example 7.1; namely, a = 4 and d = 2. Notice that
the dominance variance is quite small in comparison with the additive.
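The two routes to the variances described in this example can be compared directly. The sketch below computes the components once from the frequency-weighted squared deviations and once from the formulae for $V_A$, $V_D$ and $V_G$, assuming a = 4 and d = 2 as stated; the function names are illustrative.

```python
# Sketch: components of variance for one locus, computed two ways.
# Function names are illustrative; a = 4 and d = 2 as in the example.

def components_direct(a, d, q):
    """Mean squared deviation of G, A and D, weighted by genotype frequency."""
    p = 1.0 - q
    mean = a * (p - q) + 2 * d * p * q
    alpha = a + d * (q - p)
    freq = (p * p, 2 * p * q, q * q)
    G = (a - mean, d - mean, -a - mean)
    A = (2 * q * alpha, (q - p) * alpha, -2 * p * alpha)
    D = tuple(g - b for g, b in zip(G, A))
    v_g = sum(f * x * x for f, x in zip(freq, G))
    v_a = sum(f * x * x for f, x in zip(freq, A))
    v_d = sum(f * x * x for f, x in zip(freq, D))
    return v_g, v_a, v_d


def components_by_formula(a, d, q):
    """V_A = 2pq[a + d(q - p)]^2, V_D = (2pqd)^2, V_G = V_A + V_D."""
    p = 1.0 - q
    v_a = 2 * p * q * (a + d * (q - p)) ** 2
    v_d = (2 * p * q * d) ** 2
    return v_a + v_d, v_a, v_d


for q in (0.1, 0.4):
    print(q, [round(v, 4) for v in components_direct(4, 2, q)],
          [round(v, 4) for v in components_by_formula(4, 2, q)])
```

Agreement between the two routes depends on the breeding values and dominance deviations being uncorrelated, since only then is the genotypic variance the simple sum of the other two.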
The ways in which the gene frequency and the degree of
dominance influence the magnitude of the genetic components of
variance can best be appreciated from graphical representations of the
relationships derived above, in equations 8.5, 8.6, and 8.7. The
graphs in Fig. 8.1 show the amounts of genotypic, additive, and
dominance variance arising from a single locus with two alleles,
plotted against the gene frequency. Three cases are shown to
illustrate the effect of different degrees of dominance: in graph (a) there
is no dominance (d = 0); in graph (b) there is complete dominance
(d = a); and in graph (c) there is "pure" overdominance (a = 0).
In the first case the genotypic variance is all additive, and it is
greatest when p = q = 0.5. In the second case the dominance variance
is maximal when p = q = 0.5, and the additive is maximal when the
frequency of the recessive allele is q = 0.75. In the third case the
dominance variance is the same as in the second and is maximal
when p = q = 0.5. The additive variance, however, is zero when
p = q = 0.5, and has two maxima, one at q = 0.15 and the other at
q = 0.85. The genotypic variance, in this case, remains practically
constant over a wide range of gene frequency, though its composition
changes profoundly. The general conclusion to be drawn from these
graphs is that genes contribute much more variance when at
intermediate frequencies than when at high or low frequencies: recessives
at low frequency, in particular, contribute very little variance.

Fig. 8.1. Magnitude of the genetic components of variance
arising from a single locus with two alleles, in relation to the gene
frequency. Genotypic variance: thick lines; additive variance:
thin lines; dominance variance: broken lines. The gene
frequency, q, is that of the recessive allele. The degrees of dominance
are: in (a) no dominance (d = 0); in (b) complete dominance (d = a);
and in (c) "pure" overdominance (a = 0). The figures on the vertical
scale, showing the amount of variance, are to be multiplied by $a^2$
in graphs (a) and (b), and by $d^2$ in graph (c).
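The positions of the maxima described for the three graphs can be located by evaluating the variance formulae over a grid of gene frequencies. The sketch below is one simple way of doing so; the grid search, the step count and the function names are choices made for illustration, and a and d are set to 0 or 1 so that the variances are in units of $a^2$ or $d^2$.

```python
# Sketch: locate the gene frequencies at which V_A and V_D are greatest.
# Grid search and names are illustrative choices, not a method from the text.

def components(a, d, q):
    """Return (V_A, V_D, V_G) for a single locus with two alleles."""
    p = 1.0 - q
    alpha = a + d * (q - p)
    v_a = 2 * p * q * alpha ** 2
    v_d = (2 * p * q * d) ** 2
    return v_a, v_d, v_a + v_d


def q_at_maximum(a, d, which, lo=0.0, hi=1.0, steps=1000):
    """Gene frequency in [lo, hi] maximising one component.

    which: 0 = additive, 1 = dominance, 2 = genotypic.
    """
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda q: components(a, d, q)[which])


print("(a) no dominance, V_A:      ", q_at_maximum(1, 0, 0))
print("(b) complete dominance, V_A:", q_at_maximum(1, 1, 0))
print("(b) complete dominance, V_D:", q_at_maximum(1, 1, 1))
print("(c) overdominance, V_A:     ",
      round(q_at_maximum(0, 1, 0, hi=0.5), 3),
      round(q_at_maximum(0, 1, 0, lo=0.5), 3))
```

For case (c) the search is run separately below and above q = 0.5 because the additive variance has a maximum on each side of the point where it vanishes.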
A possible misunderstanding about the concept of additive
genetic variance, to which the terminology may give rise, should be
mentioned here. The concept of additive variance does not carry
with it the assumption of additive gene action; and the existence of
additive variance is not an indication that any of the genes act
additively (i.e. show neither dominance nor epistasis). No assumption is
made about the mode of action of the genes concerned. Additive
variance can arise from genes with any degree of dominance or
epistasis, and only if we find that all the genotypic variance is additive can
we conclude that the genes show neither dominance nor epistasis.
The existence of more than two alleles at a locus introduces no
new principle, though it complicates the theoretical description of the
effect of the locus. Expressions for the additive and dominance
variances are given by Kempthorne (1955a). The locus contributes
additive variance arising from the average effects of its several alleles,
and dominance variance arising from the several dominance
deviations.

To arrive at the variance components expressed in the population
the separate effects of all loci that contribute variance have to be
combined. The additive variance arising from all loci together is the
sum of the additive variances attributable to each locus separately;
and the dominance variance is similarly the sum of the separate
contributions. But when more than one locus is under consideration then
the interaction deviations, if present, give rise to another component
of variance, the interaction variance, which is the variance of the
interaction deviations.
Interaction variance. We shall treat the interaction variance as
a complication, like genotype-environment correlation or
interaction, to be circumvented: that is to say, we shall not discuss its
properties in detail, but we shall show what happens to it if it is
ignored. It is only comparatively recently that the properties of the
interaction variance have been worked out (see Cockerham, 1954;
Kempthorne, 1954, 1955a, b) and little is yet known about its
importance in relation to the other components. It seems probable,
however, that the amount of variance contributed by it is usually rather
small, and that neglect of it is therefore not likely to lead to serious
error. Description of the properties of interaction variance rests on
its further subdivision into components. It is first subdivided
according to the number of loci involved: two-factor interaction arises
from the interaction of two loci, three-factor from three loci, etc.
Interactions involving larger numbers of loci contribute so little
variance that they can be ignored, and we shall confine our attention
to two-factor interactions since these suffice to illustrate the principles
involved. The next subdivision of the interaction variance is
according to whether the interaction involves breeding values or dominance
deviations. There are thus three sorts of two-factor interactions.
Interaction between the two breeding values gives rise to additive x
additive variance, $V_{AA}$; interaction between the breeding value of one
locus and the dominance deviation of the other gives rise to additive x
dominance variance, $V_{AD}$; and interaction between the two
dominance deviations gives rise to dominance x dominance variance, $V_{DD}$.
So the interaction variance is broken down into components thus:

$$V_I = V_{AA} + V_{AD} + V_{DD} + \text{etc.} \tag{8.8}$$

the terms designated "etc." being similar components arising from
interactions between more than two loci. At the moment we cannot
go further than this in the description of the interaction variance, but
we shall show later how it affects the resemblance between relatives
and what happens to it when components of variance are estimated
from observations on the population.
That completes the description of the nature of the genetic components
of variance. The practical value of the partitioning of the
variance will not yet be fully apparent because it arises from the
causes of resemblance between relatives, which is the subject of the
next chapter. The partitioning we have made is essentially a theoretical
one, and before we pass on we should consider how much of it
can actually be made in practice. When observations of resemblance
between relatives are available we can estimate the additive variance
and so make the partition $V_A : (V_D + V_I + V_E)$. And if inbred lines
are available we can estimate the environmental variance and so make
the partition $V_G : V_E$. If both these partitions are made we can
separate the additive genetic from the rest of the genetic variance, and
so make the three-fold partition into additive genetic, non-additive
genetic, and environmental variance, $V_A : (V_D + V_I) : V_E$, the dominance
and interaction components being lumped together as non-additive
genetic variance. Examples of this partitioning are given in
Table 8.2, although at this stage the method by which the additive
component is estimated will not be understood. This partitioning is
as far as we can go by means of relatively simple experiments. By
more elaborate techniques, requiring large numbers of observations,
it may be possible to go some way toward separating the dominance
from the interaction components, or at least to get an idea of their
relative importance. (See, in particular, Robinson and Comstock,
1955; Hayman, 1955, 1958; Cockerham, 1956b.)
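
The three-fold partition described above is simple arithmetic once the two estimates are in hand. The following Python sketch is an illustration only: the function and argument names are hypothetical, and the figures are the percentages for bristle number in Table 8.2.

```python
def threefold_partition(v_p, v_a, v_e):
    """Split the phenotypic variance into additive, non-additive and
    environmental parts (hypothetical helper, not taken from the text)."""
    v_g = v_p - v_e              # genotypic variance, from the V_G : V_E partition
    v_nonadditive = v_g - v_a    # V_D + V_I, lumped together
    if v_nonadditive < 0:
        raise ValueError("estimates are inconsistent: V_A exceeds V_G")
    return {"additive": v_a, "non_additive": v_nonadditive, "environmental": v_e}


# Bristle number, Table 8.2: V_P = 100, V_A = 52, V_E = 39 (percentages).
parts = threefold_partition(100, 52, 39)
print(parts)
```

The non-additive part is obtained only by difference, so any error in either estimate is carried into it.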
Table 8.2
Partitioning of the variance of four characters in Drosophila
melanogaster. Components as percentages of the total,
phenotypic, variance.

                                            Character
                                     (1)        (2)       (3)      (4)
                                   Bristles   Thorax    Ovary    Eggs
Phenotypic             $V_P$         100        100       100      100
Additive genetic       $V_A$          52         43        30       18
Non-additive genetic   $V_D + V_I$     9          6        40       44
Environmental          $V_E$          39         51        30       38

Characters:
(1) Number of bristles on 4th + 5th abdominal segments (Clayton, Morris,
and Robertson, 1957; Reeve and Robertson, 1954).
(2) Length of thorax (F. W. Robertson, 1957b).
(3) Size of ovaries, i.e. number of ovarioles in both ovaries (F. W.
Robertson, 1957a).
(4) Number of eggs laid in 4 days (4th to 8th after emergence) (F. W.
Robertson, 1957b).
Environmental Variance
Environmental variance, which by definition embraces all variation
of non-genetic origin, can have a great variety of causes and its
nature depends very much on the character and the organism studied.
Generally speaking, environmental variance is a source of error that
reduces precision in genetical studies and the aim of the experimenter
or breeder is therefore to reduce it as much as possible by careful
management or proper design of experiments. Nutritional and
climatic factors are the commonest external causes of environmental
variation, and they are at least partly under experimental control.
Maternal effects form another source of environmental variation that
is sometimes important, particularly in mammals, but is less susceptible
to control. Maternal effects are prenatal and postnatal
influences, mainly nutritional, of the mother on her young: we shall
have more to say about them in the next chapter in connexion with
resemblance between relatives. Error of measurement is another
source of variation, though it is usually quite trivial. When a character
can be measured in units of length or weight it is usually measured
so accurately that the variance attributable to measurement is negligible
in comparison with the rest of the variance. Some characters,
however, cannot strictly speaking be measured, but have to be graded
by judgement into classes. Carcass qualities of livestock are an example.
With such characters the variance due to measurement may
be considerable.
In addition to the variation arising from recognisable causes, such
as those mentioned, there is usually also a substantial amount of
non-genetic variation whose cause is unknown, and which therefore
cannot be eliminated by experimental design. This is generally
referred to as "intangible" variation. Some of the intangible variation
may be caused by "environmental" circumstances, in the common
meaning of the word (that is, by circumstances external to the
individual), even though their nature is not known. Some, however,
may arise from "developmental" variation: variation, that is, which
cannot be attributed to external circumstances, but is attributed, in
ignorance of its exact nature, to "accidents" or "errors" of development
as a general cause. Characters whose intangible variation is
predominantly developmental are those connected with anatomical
structure, which do not change after development is complete, such
as skeletal form, pigmentation, or bristle number in Drosophila.
Characters more susceptible to the influences of the external environment,
in contrast, are those connected with metabolic processes, such
as growth, fertility, and lactation.
Example 8.4. Human birth weight provides an example of a character
subject to much environmental variation whose nature has been analysed
in detail (Penrose, 1954; Robson, 1955). The partitioning of the phenotypic
variance given in the table shows the relative importance of all the
identified sources of variation, birth weight being regarded as a character
of the child. All the environmental variation is "maternal" in the sense
that it is connected with the prenatal environment, but several distinct
components of the maternal environment are distinguished. "Maternal
genotype," which accounts for 20 per cent of the total phenotypic variance,
reflects genetic variation (chiefly additive) between mothers in the birth
weight of their children; i.e. birth weight regarded as a character of the
mother. "Maternal environment, general," which accounts for another
18 per cent, reflects non-genetic variation between mothers in the same way.
These two components, totalling 38 per cent, are maternal causes of variation
in birth weight that affect all children of the same mother alike.
"Maternal environment, immediate" means causes attributable to the
mother but differing in successive pregnancies. Two causes of the same
nature, "age of mother" and "parity" (i.e. whether the child is the first,
second, etc.), are separately identifiable. Finally, the "intangible"
variation is all the remainder, of which the cause cannot be identified. To
explain how these various components were estimated would take too
much space, and could not properly be done until the end of Chapter 10.
It must suffice to say that the estimates all come from comparisons of the
degree of resemblance between identical twins, fraternal twins, full sibs,
children of sisters, and other sorts of cousins.

Partitioning of variance of human birth weight. Components
as percentages of the total, phenotypic, variance.

Cause of variation                          % of total
Genetic
    Additive                                    15
    Non-additive (approx.)                       1
    Sex                                          2
    Total genotypic                                      18
Environmental
    Maternal genotype                           20
    Maternal environment, general               18
    Maternal environment, immediate              6
    Age of mother                                1
    Parity                                       7
    Intangible                                  30
    Total environmental                                  82
Multiple measurements. When more than one measurement
of the character can be made on each individual, the phenotypic
variance can be partitioned into variance within individuals and
variance between individuals. This subdivision serves to show how
much is to be gained by the repetition of measurements, and it may
also throw light on the nature of the environmental variation. There
are two ways by which the repetition of a character may provide
multiple measurements: by temporal repetition and by spatial repetition.
Milk-yield and litter size are examples of characters repeated
in time. Milk-yield can be measured in successive lactations, and
litter size in successive pregnancies. Several measurements of each individual
can thus be obtained. The variance of yield per lactation, or
of the number of young per litter, can then be analysed into a component
within individuals, measuring the differences between the
performances of the same individual, and a component between individuals,
measuring the permanent differences between individuals.
The within-individual component is entirely environmental in
origin, caused by temporary differences of environment between successive
performances. The between-individual component is partly
environmental and partly genetic, the environmental part being
caused by circumstances that affect the individuals permanently. By
this analysis, therefore, the variance due to temporary environmental
circumstances is separated from the rest, and can be measured.
Characters repeated in space are chiefly structural or anatomical,
and are found more often in plants than in animals. For example,
plants that bear more than one fruit yield more than one measurement
of any character of the fruit, such as its shape or seed content.
Spatial repetition in animals is chiefly found in characters that can be
measured on the two sides of the body or on serially repeated parts,
such as the number of bristles on the abdominal segments of Drosophila.
With spatially repeated characters the within-individual
variance is again entirely environmental in origin but, unlike that of
temporally repeated characters, it represents the "developmental"
variation arising from localised circumstances operating during
development.
In order that we may discuss both temporal and spatial repetition
together we shall use the term special environmental variance, $V_{Es}$, to
refer to the within-individual variance arising from temporary or
localised circumstances; and the term general environmental variance,
$V_{Eg}$, to refer to the environmental variance contributing to the
between-individual component and arising from permanent or non-localised
circumstances. The ratio of the between-individual component
to the total phenotypic variance measures the correlation (r)
between repeated measurements of the same individual, and is
known as the repeatability of the character. The partitioning of the
phenotypic variance expressed by the repeatability is thus into two
components, $V_{Es}$ versus $(V_G + V_{Eg})$, so that the repeatability is

\[ r = \frac{V_G + V_{Eg}}{V_P} \tag{8.9} \]

The repeatability therefore expresses the proportion of the variance
of single measurements that is due to permanent, or non-localised,
differences between individuals, both genetic and environmental.
The repeatability differs very much according to the nature of the
character, and also, of course, according to the genetic properties of
the population and the environmental conditions under which the
individuals are kept. The estimates in Table 8.3 give some idea of
the sort of values that may be found with various characters, and two
cases are described in more detail in the following examples.

Table 8.3
Some Examples of Repeatability

Organism and character                                      Repeatability
Drosophila melanogaster:
    Abdominal bristle number (see Example 8.6)                  .42
    Ovary size (F. W. Robertson, 1957a)                         .73
Mouse:
    Weight at 6 weeks (repeated on 4 consecutive days.
    Original data)                                              .95
    Litter size (see Example 8.5)                               .45
Sheep:
    Weight of fleece, measured in different years
    (Morley, 1951)                                              .74
Cattle:
    Milk-yield (Johansson, 1950)                                .40

Example 8.5. Litter size in mice will serve as an example of a character
repeated in time. The number of live young born in first and in second
litters was recorded in 296 mice of a genetically heterogeneous stock, and
yielded the following components of variance (original data):
    Between mice   3.58
    Within mice    4.44

(The procedure for estimating the components of variance from an
analysis of variance is described by Snedecor (1956, Section 10.12) and is
outlined below, in Chapter 10, p. 173.) The repeatability of litter size is
given by the ratio of the between-mice component to the sum of the
between-mice and the within-mice components: i.e.

\[ r = \frac{3.58}{3.58 + 4.44} = 0.45 \]

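
The arithmetic of this example can be sketched in Python. The sketch assumes balanced data (the same number of measurements on every individual) and the usual one-way analysis of variance, in which the within component is the within-individual mean square and the between component is the difference of the two mean squares divided by the number of measurements per individual. The function names and the small set of litter sizes are hypothetical; they are not the original records.

```python
def variance_components(records):
    """Balanced one-way analysis of variance.

    records: one list per individual, each holding that individual's
    repeated measurements (all lists of the same length).
    Returns the (between, within) components of variance.
    """
    k = len(records)       # number of individuals
    n = len(records[0])    # measurements per individual
    if any(len(r) != n for r in records):
        raise ValueError("balanced data required")
    values = [x for r in records for x in r]
    grand_mean = sum(values) / len(values)
    means = [sum(r) / n for r in records]
    ss_between = n * sum((m - grand_mean) ** 2 for m in means)
    ss_within = sum((x - m) ** 2 for r, m in zip(records, means) for x in r)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (k * (n - 1))
    return (ms_between - ms_within) / n, ms_within


def repeatability(between, within):
    """Between-individual component as a proportion of the total."""
    return between / (between + within)


# Hypothetical first and second litter sizes of four mice.
litters = [[8, 10], [6, 7], [11, 9], [5, 7]]
between, within = variance_components(litters)
print(repeatability(between, within))

# Components quoted in Example 8.5.
print(round(repeatability(3.58, 4.44), 2))
```

Only the ratio of the two components enters the repeatability, so the units in which the components are expressed do not matter.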
Example 8.6. The number of bristles on the ventral surfaces of the
abdominal segments is a character that has been much studied in Drosophila
melanogaster, because it is technically convenient and its genetic
properties are relatively simple. We have already mentioned it several
times but have not yet used it as an example. There are about 20 bristles
on each of 3 segments in males and each of 4 segments in females. The
number of bristles per segment can therefore be treated as a spatially
repeated character. The sources of variation in this character have been
studied in detail by Reeve and Robertson (1954), and the following components
of variance were found:

                                        Males     Females
Total phenotypic   $V_P$                4.24      5.44
Between flies      $V_G + V_{Eg}$       1.82      2.19
Within flies       $V_{Es}$             2.42      3.25
Repeatability                           0.429     0.403

Estimation of the repeatability of a character separates off the
component of variance due to special environment, $V_{Es}$, but it leaves
the other component of environmental variance (that due to general
environment, $V_{Eg}$) confounded with the genotypic variance, as
shown in the above example. The component due to general environment
can be separately estimated only if the genotypic variance
(i.e. including the non-additive components) has been estimated, in
the manner explained in Example 8.1. This has been done with two
characters in Drosophila, and the results are given in Table 8.4.

Table 8.4
Partitioning of the environmental variance of two characters
in Drosophila melanogaster into components due to
general, $V_{Eg}$, and special, $V_{Es}$, environment. The characters
are: abdominal bristle-number (Reeve and Robertson,
1954) as explained in Example 8.6, and ovary size (F. W.
Robertson, 1957a), measured in the two ovaries by the
number of ovarioles, or "egg strings."

                                         Bristle    Ovary
                                         number     size
Total environmental, $V_E$                 100       100
General environmental, $V_{Eg}$              3         9
Special environmental, $V_{Es}$             97        91

The nature of the environmental variation revealed by these results is
remarkable. With both characters less than 10 per cent of the
environmental variance is general, that is, due to causes influencing
the individual as a whole. These characters are therefore very little
influenced by the conditions of the external environment: or, perhaps
it would be more accurate to say that the experimental technique of
rearing the flies has been very successful in eliminating unwanted
sources of environmental variation. Yet, fully half the phenotypic
variation of one measurement (one segment or ovary) is non-genetic,
or environmental in the wide sense, as shown in Table 8.2; and,
moreover, is due to strictly localised causes that influence the segments
or ovaries independently. Whether this developmental
variation represents a real indeterminacy of development, or has
material causes still undetected but in principle controllable, is quite
unknown. Nor is it known whether the situation revealed in these
two characters is at all general. We cannot here pursue further the
biological nature of the non-genetic variation: a general discussion of
these problems will be found in Waddington (1957).
We must return to the repeatability and consider its uses.
Knowledge of the repeatability of a character is useful in two ways.
First, it sets upper limits to the values of the two ratios, $V_A/V_P$ and
$V_G/V_P$. The first (additive genetic to total phenotypic variance) is the
heritability, which as we shall see in later chapters is of great practical
importance. The second (genotypic to phenotypic variance) measures
the degree of genetic determination of the character. The repeatability
is usually much easier to determine than either of these two ratios,
and it may often be known when they are not.
The second way in which knowledge of the repeatability is useful
is that it indicates the gain in accuracy to be expected from multiple
measurements. Suppose that each individual is measured n times,
and that the mean of these n measurements is taken to be the phenotypic
value of the individual, say $P_{(n)}$. Then the phenotypic variance
is made up of the genotypic variance, the general environmental
variance, and one nth of the special environmental variance:

\[ V_{P(n)} = V_G + V_{Eg} + \frac{1}{n} V_{Es} \tag{8.10} \]

Thus, increasing the number of measurements reduces the amount
of variance due to special environment that appears in the phenotypic
variance, and this reduction of the phenotypic variance represents
the gain in accuracy. The variance of the mean of n measurements
as a proportion of the variance of one measurement can be
expressed in terms of the repeatability, as follows:

\[ \frac{V_{P(n)}}{V_P} = \frac{1 + r(n - 1)}{n} \tag{8.11} \]

where r is the repeatability, or the correlation between the measurements
of the same individual. Fig. 8.2 shows how the phenotypic
variance is reduced by multiple measurements, with characters of
different repeatabilities.

[Fig. 8.2. Gain in accuracy from multiple measurements of each
individual. The vertical scale gives the variance of the mean of n
measurements as a percentage of the variance of one measurement.
The horizontal scale gives the number of measurements, up to 10.
The four graphs refer to characters of different repeatability as
indicated: r = 0.75, r = 0.5, r = 0.25, and r = 0.1.]

When the repeatability is high, and there is
therefore little special environmental variance, multiple measurements
give little gain in accuracy. When the repeatability is low,
multiple measurements may lead to a worthwhile gain in accuracy.
The gain in accuracy, however, falls off rapidly as the number of
measurements increases, and it is seldom worth while to make more
than two measurements.
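
The behaviour described here follows directly from equation 8.11 and can be tabulated with a few lines of Python. This is a minimal sketch; the function name is hypothetical, and the repeatabilities are those labelled on the curves of Fig. 8.2.

```python
def variance_of_mean_ratio(r, n):
    """Variance of the mean of n measurements as a proportion of the
    variance of one measurement (equation 8.11)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return (1 + r * (n - 1)) / n


# Repeatabilities plotted in Fig. 8.2, for a few numbers of measurements.
for r in (0.75, 0.5, 0.25, 0.1):
    percentages = [round(100 * variance_of_mean_ratio(r, n)) for n in (1, 2, 3, 5, 10)]
    print(f"r = {r}: {percentages}")
```

As n increases the ratio approaches r, because only the special environmental variance is reduced by repetition.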
Example 8.7. Studies of abdominal bristle number in Drosophila are
generally based on two measurements, i.e. of the fourth and fifth segments,
and the phenotypic values are expressed as the sum of the two
counts. As an illustration of the nature of the advantage gained by the
double measurement we may compare the percentage composition of the
phenotypic variance when phenotypic values are based on counts of one
or of two segments:

                                          One        Two
                                          segment    segments
Phenotypic               $V_P$             100        100
Additive genetic         $V_A$              34         52
Non-additive genetic     $V_D + V_I$         6          9
Environmental, general   $V_{Eg}$            2          4
Environmental, special   $V_{Es}$           58         35

By reducing the amount of environmental variance, the making of two
measurements increases the proportionate amount of genetic variance: in
practice it is the increase of the proportion of additive variance (in this
case from 34 per cent to 52 per cent) that is the important consideration.
There is an important assumption implicit in the idea of repeatability,
which we have not yet mentioned. It is the assumption that
the multiple measurements are indeed measurements of what is
genetically the same character. Consider for example milk-yield in
successive lactations. If the assumption were valid it would mean that
the genes that influence yield in first lactations are entirely the same
as those that influence yield in second or later lactations; or, to put the
matter in another way, that yield in all lactations is dependent on
identical developmental and physiological processes. If this assumption
is not valid, as it certainly is not for milk-yield in cattle, then the
variation within individuals is not purely environmental, and equation
8.11 is erroneous. The variance between the means of individuals
will be augmented by additional variance arising from what may
formally be regarded as interaction between genotype and "environment,"
that is between genotype and the time or location of the
measurement. And this additional variance may be enough to
counteract the reduction of environmental variance which we have
described as the chief advantage to be gained from multiple measurements.
Consequently an increase in the proportion of additive genetic
variance from multiple measurements cannot be relied on until the
genetical identity of the character measured has been established.
The number of bristles on the abdominal segments of Drosophila has
been proved to be genetically the same character, as will be explained
in Chapter 19, and the conclusions reached in Example 8.7 are valid.
Milk-yield in cattle, in contrast, is not the same character in successive
lactations, and the proportion of additive variance is actually
less for the mean of several lactations than for first lactations only.
(See Rendel, et al., 1957.)
CHAPTER 9
RESEMBLANCE BETWEEN RELATIVES
The resemblance between relatives is one of the basic genetic phenomena
displayed by metric characters, and the degree of resemblance
is a property of the character that can be determined by relatively
simple measurements made on the population without special experimental
techniques. The degree of resemblance provides the means
of estimating the amount of additive variance, and it is the proportionate
amount of additive variance (i.e. the heritability) that chiefly
determines the best breeding method to be used for improvement.
An understanding of the causes of resemblance between relatives is
therefore fundamental to the practical study of metric characters and
to its application in animal and plant improvement. In this chapter,
therefore, we shall examine the causes of resemblance between relatives,
and show in principle how the amount of additive variance can
be estimated from the observed degree of resemblance, leaving the
more practical aspects of the estimation of the heritability for consideration
in the next chapter.
In the last chapter we saw how the phenotypic variance can be
partitioned into components attributable to different causes. These
components we shall call causal components of variance, and denote
them as before by the symbol V. The measurement of the degree of
resemblance between relatives rests on the partitioning of the phenotypic
variance in a different way, into components corresponding to
the grouping of the individuals into families. These components can
be estimated directly from the phenotypic values and for this reason
we shall call them observational components of phenotypic variance,
and denote them by the symbol $\sigma^2$ in order to keep the distinction
clear. Consider, for example, the grouping of individuals into
families of full sibs. By the analysis of variance we can partition the
total observed variance into two components, within groups and
between groups. The within-group component is the variance of
individuals about their group means, and the between-group component
is the variance of the "true" means of the groups about the
population mean. The true mean of a group is the mean estimated
without error from a very large number of individuals. An explanation
of the estimation of these two components will be given, with
examples, in the next chapter. Now, the resemblance between related
individuals, i.e. between full sibs in the case under discussion, can be
looked at either as similarity of individuals in the same group, or as
difference between individuals in different groups. The greater the
similarity within the groups, the greater in proportion will be the
difference between the groups. The degree of resemblance can
therefore be expressed as the between-group component as a proportion
of the total variance. This is the intraclass correlation coefficient
and is given by

\[ t = \frac{\sigma_B^2}{\sigma_B^2 + \sigma_W^2} \]

where $\sigma_B^2$ is the between-group component and $\sigma_W^2$ the within-group
component. (It is customary to use the symbol t for the intraclass
correlation of phenotypic values in order to avoid confusion with
other types of correlation for which the symbol r is used.) The
between-group component expresses the amount of variation that is
common to members of the same group, and it can equally well be
referred to as the covariance of members of the groups. In the case of
the resemblance between offspring and parents the grouping of the
observations is into pairs rather than groups; one parent, or the mean
of two parents, paired with one offspring or the mean of several
offspring. It is then more convenient to compute the covariance of
offspring with parents from the sum of cross-products, rather than
from the between-pair component of variance. With offspring-parent
relationships, also, it is usually more convenient to express the
degree of resemblance as the regression coefficient of offspring on
parent, instead of the correlation between them, the regression being
given by

\[ b_{OP} = \frac{cov_{OP}}{\sigma_P^2} \]

where $cov_{OP}$ is the covariance of offspring and parents, and $\sigma_P^2$ is the
variance of parents.
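
The two measures of resemblance just defined can be sketched in Python as follows. The sketch is illustrative only: the function names are hypothetical, the paired values are invented for the purpose and are not data from the text, and the covariance is computed from the sum of cross-products with the usual divisor of one less than the number of pairs.

```python
def mean(values):
    return sum(values) / len(values)


def covariance(xs, ys):
    """Sample covariance from the sum of cross-products."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def regression_offspring_on_parent(parents, offspring):
    """b_OP: covariance of offspring and parents over the variance of parents."""
    return covariance(parents, offspring) / covariance(parents, parents)


def intraclass_correlation(sigma2_between, sigma2_within):
    """t: between-group component as a proportion of the total variance."""
    return sigma2_between / (sigma2_between + sigma2_within)


# Hypothetical pairs: value of a parent and mean value of its offspring.
parents = [1.0, 2.0, 3.0, 4.0]
offspring = [1.5, 2.0, 2.5, 3.0]
b_op = regression_offspring_on_parent(parents, offspring)

# Hypothetical observational components for full-sib families.
t = intraclass_correlation(2.0, 6.0)
print(b_op, t)
```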
Thus, the covariance of related individuals is the new property
of the population that we have to deduce in seeking the cause of
resemblance between relatives, whether sibs or offspring and parents.
The covariance, being simply a portion of the total phenotypic
variance, is composed of the causal components described in the last
chapter, but in amounts and proportions differing according to the
sort of relationship. By finding out how the causal components contribute
to the covariance we shall see how an observed covariance can
be used to estimate the causal components of which it is composed.
Both genetic and environmental sources of variance contribute to
the covariance of relatives. We shall consider the genetic causes of
resemblance first, then the environmental causes, and finally, by
putting the two causes together, arrive at the phenotypic covariance
and the degree of resemblance that can be observed from measurements
of phenotypic values. A general description of the covariance,
applicable to any sort of relationship, is given by Kempthorne
(1955a). Here we shall consider only four sorts of relationship: (1)
between offspring and one parent, (2) between half sibs, (3) between
offspring and the mean of the two parents, and (4) between full sibs.
These are the most important relationships in practice. Identical
twins will be considered in the next chapter, because the problems
they raise will be better understood then.
Genetic Covariance
Our object now is to deduce from theoretical considerations the
covariance of relatives arising from genetic causes, neglecting for the
time being any nongenetic causes of resemblance that there may be.
This means that we have to deduce the covariance of the genotypic
values of the related individuals. This will be done by reference to
two alleles at a locus, but the conclusions are equally valid for loci
with any number of alleles. We shall at first omit interaction deviations
and the interaction component of variance from consideration, but
we shall describe its effects briefly later.
Offspring and one parent. The covariance to be deduced is
that of the genotypic values of individuals with the mean genotypic
values of their offspring produced by mating at random in the population.
If values are expressed as deviations from the population
mean, then the mean value of the offspring is by definition half the
breeding value of the parent, as explained in Chapter 7. Therefore
the covariance to be computed is that of an individual's genotypic
value with half its breeding value, i.e. the covariance of G with $\frac{1}{2}A$.
Since G = A + D (D being the dominance deviation) the covariance
is that of (A + D) with $\frac{1}{2}A$. Taking the sum of cross-products, we
have

\[ \text{sum of cross-products} = \sum \tfrac{1}{2}A(A + D) = \tfrac{1}{2}\sum A^2 + \tfrac{1}{2}\sum AD \]

Since A and D are uncorrelated (see p. 125), the term $\frac{1}{2}\sum AD$ is
zero. Then if we divide both sides by the number of paired observations
we have

\[ cov_{OP} = \tfrac{1}{2}V_A \tag{9.1} \]

since $\sum A^2$ is the sum of squares of breeding values. The genetic
covariance of offspring and one parent is therefore half the additive
variance.
The covariance may be derived by another method, which though
less concise is perhaps more explicit. Table 9.1 gives the genotypes
of the parents, their frequencies in the population, and their genotypic
values expressed as deviations from the population mean (from
Table 7.3). The right-hand column gives the mean genotypic values
of the offspring, which are half the breeding values of the parents as
given in Table 7.3.

Table 9.1

                Parents                                      Offspring
Genotype     Frequency    Genotypic value                    Mean genotypic value
$A_1A_1$     $p^2$        $2q(\alpha - qd)$                  $q\alpha$
$A_1A_2$     $2pq$        $(q - p)\alpha + 2pqd$             $\frac{1}{2}(q - p)\alpha$
$A_2A_2$     $q^2$        $-2p(\alpha + pd)$                 $-p\alpha$

The covariance of offspring and parent is then the
mean cross-product, and is obtained by multiplying together the
three columns (frequency × genotypic value of parent × genotypic
value of offspring) and summing over the three genotypes of the
parents. After collecting together the terms in $\alpha^2$ and the terms in $\alpha d$
we obtain

\[ cov_{OP} = pq\alpha^2(p^2 + 2pq + q^2) + 2p^2q^2\alpha d(-q + q - p + p) = pq\alpha^2 = \tfrac{1}{2}V_A \]

since from equation 8.5, $V_A = 2pq\alpha^2$. Summing over all loci we again
reach the conclusion that the covariance of offspring and one parent
is equal to half the additive variance.
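
The longer method can be checked numerically by forming the mean cross-product from the three rows of Table 9.1. The Python sketch below uses exact fractions; the function name and the particular values chosen for the gene frequency, the average effect, and d are hypothetical and serve only as an illustration.

```python
from fractions import Fraction


def offspring_parent_covariance(p, alpha, d):
    """Mean cross-product of parent value and offspring mean (Table 9.1)."""
    q = 1 - p
    # (frequency, genotypic value of parent, mean genotypic value of offspring)
    rows = [
        (p * p, 2 * q * (alpha - q * d), q * alpha),
        (2 * p * q, (q - p) * alpha + 2 * p * q * d, (q - p) * alpha / 2),
        (q * q, -2 * p * (alpha + p * d), -p * alpha),
    ]
    return sum(freq * parent * offspring for freq, parent, offspring in rows)


# Hypothetical values, chosen only for illustration.
p = Fraction(3, 10)
alpha = Fraction(2)
d = Fraction(1, 2)
q = 1 - p

cov_op = offspring_parent_covariance(p, alpha, d)
additive_variance = 2 * p * q * alpha ** 2   # equation 8.5
print(cov_op, additive_variance / 2)
```

Because the values in the table are deviations from the population mean, the mean cross-product is itself the covariance and no correction for the mean is needed.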
Half sibs. Half sibs are individuals that have one parent in common
and the other parent different. A group of half sibs is therefore
the progeny of one individual mated at random and having one
offspring by each mate. Thus the mean genotypic value of the group
of half sibs is by definition half the breeding value of the common
parent. The covariance is the variance of the means of the half-sib
groups, and is therefore the variance of half the breeding values of the
parents; this is a quarter of the additive variance:

\[ cov_{(HS)} = V_{\frac{1}{2}A} = \tfrac{1}{4}V_A \tag{9.2} \]

This covariance also can be demonstrated by the longer method,
from the values already given in Table 9.1. The covariance is the
variance of the means of the groups of offspring listed in the right-hand
column. Squaring the offspring values and multiplying by their
frequencies we get

\[
\begin{aligned}
\text{Variance of means of half-sib families}
  &= p^2 \cdot q^2\alpha^2 + 2pq \cdot \tfrac{1}{4}(q - p)^2\alpha^2 + q^2 \cdot p^2\alpha^2 \\
  &= pq\alpha^2\left[pq + \tfrac{1}{2}(q - p)^2 + pq\right] \\
  &= pq\alpha^2\left[\tfrac{1}{2}(p + q)^2\right] \\
  &= \tfrac{1}{2}pq\alpha^2
\end{aligned}
\]

Therefore, since $2pq\alpha^2 = V_A$ (from equation 8.5),

\[ cov_{(HS)} = \tfrac{1}{4}V_A \]

summation being made over all loci.
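
The same kind of numerical check can be applied to half sibs, taking the variance of the family means listed in the right-hand column of Table 9.1. As before this is only a sketch: the function name and the values of the gene frequency and the average effect are hypothetical.

```python
from fractions import Fraction


def half_sib_covariance(p, alpha):
    """Variance of the mean genotypic values of half-sib families."""
    q = 1 - p
    # (frequency of the common parent's genotype, mean value of its family)
    families = [
        (p * p, q * alpha),
        (2 * p * q, (q - p) * alpha / 2),
        (q * q, -p * alpha),
    ]
    overall_mean = sum(freq * value for freq, value in families)
    return sum(freq * (value - overall_mean) ** 2 for freq, value in families)


# Hypothetical values, chosen only for illustration.
p = Fraction(2, 5)
alpha = Fraction(3, 2)
q = 1 - p

cov_hs = half_sib_covariance(p, alpha)
additive_variance = 2 * p * q * alpha ** 2   # equation 8.5
print(cov_hs, additive_variance / 4)
```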
Offspring and mid-parent. The covariance of the mean of the
offspring and the mean of both parents (commonly called the "mid-parent")
may be deduced in the following way. Let O be the mean of
the offspring, and P and P' be the values of the two parents. Then
we want to find $cov_{O\bar{P}}$; that is, the covariance of O with $\frac{1}{2}(P + P')$.
This is equal to $\frac{1}{2}(cov_{OP} + cov_{OP'})$. If P and P' have the same variance,
then $cov_{OP} = cov_{OP'}$ and $cov_{O\bar{P}} = cov_{OP}$. Thus, provided the two sexes
have equal variances, the covariance of offspring and mid-parent is
the same as that of offspring with one parent, which we have seen is
equal to half the additive variance. This conclusion may be extended
to other sorts of relative: the covariance of any individual with the
mean value of a number of relatives of the same sort is equal to its
covariance with one of those relatives.
[Table 9.2. Types of mating and their frequencies, the midparent value, the genotypes and mean genotypic value of the progeny, the product progeny-mean × midparent, and the square of the progeny mean.]
The longer method of demonstrating the covariance of offspring
with midparent is rather laborious, but it must be given since it will
be needed for arriving at the covariance of full sibs. We shall, however,
omit some of the steps of algebraic reduction. A table (Table
9.2) is made in the same manner as for offspring and one parent, but
now we have to tabulate types of mating and their frequencies, instead
of single parents. This was done in Chapter 1 (Table 1.1).
Against each type of mating we put the mean genotypic value of the
two parents, i.e. the midparent value; then the genotypes of the progeny
and the mean genotypic value of the progeny. The working is
made easier by writing the genotypic values in terms of a and d
instead of as deviations from the population mean. In the last two
columns of the table we put the product of progeny-mean × midparent,
and the square of the progeny mean for later use. Now, to get the
covariance of progeny-mean and midparent value, we take the product
of progeny-mean × midparent and multiply it by the frequency
of the mating type, and then sum over mating types. This gives the
mean product (M.P.) from which we have to deduct a correction for
the population mean, since values are not here expressed as deviations
from the mean. The correction is simply the square of the population
mean ($M^2$) since the means of parents and of progeny are equal.
Both the M.P. and $M^2$ contain terms in $a^2$, in $ad$, and in $d^2$. By collecting
together these terms and simplifying a little we obtain
$$\mathrm{M.P.} = a^2[p^3(p+q) + q^3(p+q)] + 2adpq(p^2 - q^2) + d^2pq(p^2 + 2pq + q^2)$$
$$M^2 = a^2(p^2 - 2pq + q^2) + 4adpq(p - q) + 4d^2p^2q^2$$
Then,
$$\mathrm{cov}_{O\bar{P}} = \mathrm{M.P.} - M^2$$
$$= a^2pq - 2adpq(p - q) + d^2pq(p - q)^2$$
$$= pq[a + d(q - p)]^2$$
$$= pq\alpha^2$$
$$= \tfrac{1}{2}V_A \qquad (9.3)$$
when summed over all loci.
So the genetic covariance of offspring with the mean of their parents
is equal to half the additive genetic variance. That this covariance
comes out the same as that of offspring and one parent need cause no
surprise when we note that the variance of midparent values is half
the variance of individual values (see below, p. 162).
Full sibs. The covariance of full sibs is the variance of the means
of full-sib families, and is got with little additional work from Table
9.2. The last column shows the squares of progeny means and it will
be seen that these squares are all exactly the same as the products of
progeny-mean × midparent, except for the two entries in the middle
involving terms in $d^2$. The mean square (M.S.) can therefore be got
from the mean product (M.P.) already calculated, thus
$$\mathrm{M.S.} = \mathrm{M.P.} + d^2 \cdot 2p^2q^2 - \tfrac{1}{4}d^2 \cdot 4p^2q^2$$
$$= \mathrm{M.P.} + d^2p^2q^2$$
The correction for the mean is the same as before, so we have
$$\mathrm{cov}_{(FS)} = \mathrm{cov}_{O\bar{P}} + d^2p^2q^2$$
$$= pq\alpha^2 + d^2p^2q^2$$
Since $2pq\alpha^2 = V_A$ (from equation 8.5) and $4d^2p^2q^2 = V_D$ (from equation
8.6) the covariance of full sibs is
$$\mathrm{cov}_{(FS)} = \tfrac{1}{2}V_A + \tfrac{1}{4}V_D \qquad (9.4)$$
summing over all loci.
So the genetic covariance of full sibs is equal to half the additive
genetic variance plus a quarter of the dominance variance. This is the
only one of the relationships that we have considered where we find
the dominance variance contributing to the resemblance. The reason
is that full sibs have both parents in common, and a pair of full sibs
have a quarter chance of having the same genotype for any locus.
Covariance due to epistatic interaction. Before we turn to the
environmental causes of resemblance between relatives let us briefly
examine the role of interaction variance arising from epistasis. In
Chapter 8 we noted that the interaction variance, $V_I$, is subdivided
into components according to the number of loci interacting, and
according to whether the interaction is between breeding values or
dominance deviations. The covariances of relatives, with the contributions
of the two-factor interactions included, are shown in Table 9.3
(for details see Kempthorne, 1955a, b). The offspring-parent covariance
refers equally to one parent and to midparent values.

Table 9.3
Covariances of relatives including the contributions of
two-factor interactions.

                                          Variance components and the
                                          coefficients of their contributions
Relatives                                 $V_A$           $V_D$           $V_{AA}$         $V_{AD}$        $V_{DD}$
Offspring-parent: $\mathrm{cov}_{OP}$     $\frac{1}{2}$   -               $\frac{1}{4}$    -               -
Half sibs: $\mathrm{cov}_{(HS)}$          $\frac{1}{4}$   -               $\frac{1}{16}$   -               -
Full sibs: $\mathrm{cov}_{(FS)}$          $\frac{1}{2}$   $\frac{1}{4}$   $\frac{1}{4}$    $\frac{1}{8}$   $\frac{1}{16}$
General: $\mathrm{cov}$                   $x$             $y$             $x^2$            $xy$            $y^2$

For the sake of clarity the components of variance are shown at the
heads of the columns and their coefficients in the covariances are
listed below. For example, the offspring-parent covariance is
$\tfrac{1}{2}V_A + \tfrac{1}{4}V_{AA}$. The contributions of interaction to the covariances are
expressible in a simple general form, shown in the bottom line of the
table. If the covariance contains $xV_A$ then it contains also $x^2V_{AA}$; and
if it contains $yV_D$ it contains also $xyV_{AD}$ and $y^2V_{DD}$. Interactions
involving more than two loci contribute progressively smaller proportions
as the number of loci increases. The effect of the interaction
variance on the resemblance between relatives is, in principle, that
the offspring-parent covariance is not twice the half-sib covariance,
but a little more than twice; and that the excess of the full-sib
covariance over the half-sib represents not only dominance variance but
also some of the interaction variance.
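The general rule in the bottom line of Table 9.3 is easily put into computable form. The sketch below is only an illustration: the function name and the variance components supplied to it are hypothetical, and only two-factor interactions are included.

```python
from fractions import Fraction as F

# (x, y): coefficients of V_A and V_D in the covariance of each sort of relative.
COEFFICIENTS = {
    "offspring-parent": (F(1, 2), F(0)),
    "half sibs": (F(1, 4), F(0)),
    "full sibs": (F(1, 2), F(1, 4)),
}


def genetic_covariance(relatives, V_A, V_D, V_AA=0, V_AD=0, V_DD=0):
    """x*V_A + y*V_D + x^2*V_AA + x*y*V_AD + y^2*V_DD for the given relatives."""
    x, y = COEFFICIENTS[relatives]
    return x * V_A + y * V_D + x ** 2 * V_AA + x * y * V_AD + y ** 2 * V_DD


# Hypothetical variance components, chosen only to show the arithmetic.
components = dict(V_A=40, V_D=8, V_AA=16, V_AD=4, V_DD=2)
op = genetic_covariance("offspring-parent", **components)
hs = genetic_covariance("half sibs", **components)
fs = genetic_covariance("full sibs", **components)

# Excess of the offspring-parent covariance over twice the half-sib
# covariance: this is one eighth of V_AA.
print(op - 2 * hs)
# Excess of the full-sib covariance over twice the half-sib covariance:
# a quarter of V_D plus portions of the interaction components.
print(fs - 2 * hs)
```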
When the interaction variance was first discussed in Chapter 8
we said we would regard it as a complication to be circumvented,
noting only the consequences of neglecting it. These consequences
are now apparent. First, only small fractions of it contribute to the
covariances and therefore its effect on the resemblance between relatives
is small unless the amount of interaction variance is large in
comparison with the other components. And second, it appears that
there is little we can do in practice except ignore it, because, apart
from the special experimental methods mentioned on p. 139, there is
no practicable means of separating the interaction from the other
components. The consequences of ignoring the interaction variance
are thus that any estimate of $V_A$ made from offspring-parent regressions
will contain also $\tfrac{1}{2}V_{AA} + \tfrac{1}{4}V_{AAA} +$ etc.; any estimate of $V_A$ from
half-sib correlations will contain also $\tfrac{1}{4}V_{AA} + \tfrac{1}{16}V_{AAA} +$ etc.; and any
estimate of $V_D$ obtained from a full-sib correlation will contain also
portions of the interaction components. We noted in Chapter 7 that
the two definitions of breeding value given there are not equivalent
if there is interaction between loci. We can now see how this comes
about. Defined in terms of the measured values of progeny — the
practical definition — breeding value includes additive x additive
interaction deviations in addition to the average effects of the genes
carried by the parents; whereas, defined in terms of the average
effects of genes — the theoretical definition — it does not.
Effect of linkage. Throughout the discussion of the covariances
of relatives we have ignored the effects of linkage, assuming always
that the loci concerned segregate independently. The effects of
linkage in a random-mating population, where the coupling and
repulsion phases are in equilibrium, are as follows (Cockerham,
1956a). The covariances of offspring and parents are not affected,
but the covariances of half and full sibs are increased; the closer the
linkage the greater the increase. The additional covariance due to
linkage appears with the interaction component. Therefore what is
formally attributed to epistatic interaction may be in part due to
linkage.
Environmental Covariance
Genetic causes are not the only reasons for resemblance between
relatives; there are also environmental circumstances that tend to
make relatives resemble each other, some sorts of relatives more than
others. If members of a family are reared together, as with human
families or litters of pigs or mice, they share a common environment.
This means that some environmental circumstances that cause
differences between unrelated individuals are not a cause of difference
between members of the same family. In other words there is a component
of environmental variance that contributes to the variance
between means of families but not to the variance within the families,
and it therefore contributes to the covariance of the related individuals.
This between-group environmental component, for which we shall
use the symbol $V_{Ec}$, is usually called the common environment, a term
that seems more appropriate when we think of the component as a
cause of similarity between members of a group than when we think
of it as a cause of difference between members of different groups.
The remainder of the environmental variance, which we shall denote
by $V_{Ew}$, arises from causes of difference that are unconnected with
whether the individuals are related or not. It therefore appears in
the within-group component of variance, but does not contribute to
the between-group component, which is the variance of the true
means of the groups. In considerations of the resemblance between
relatives, therefore, the environmental variance must be divided into
two components:
$$V_E = V_{Ec} + V_{Ew} \qquad (9.5)$$
one of the components, $V_{Ec}$, contributing to the covariance of the
related individuals.
The sources of common environmental variance are many and
varied, and only a few examples can be mentioned. Soil conditions
may differentiate families of plants when the members of a family are
grown together on the same plot: similarly the conditions of the culture
medium may differentiate families of Drosophila or other small
animals. With farm animals, related individuals are likely to have
been reared on the same farm, and differences of climate or of management
contribute to the resemblance between the relatives. "Maternal
effects" are a frequent source of environmental difference between
families, especially with mammals. The young are subject to a
maternal environment during the first stages of their life, and this
influences the phenotypic values of many metric characters even
when measured on the adult, causing offspring of the same mother to
resemble each other. Finally, members of the same family tend to be
contemporaneous, and changes of climatic or nutritional conditions
tend to differentiate members of different families. This source of
common environmental variation is especially important in animals
that produce their young in broods or litters.
These various sources of common environmental variation contribute
chiefly to the resemblance between sibs, though some may
also cause resemblance between parent and offspring. Maternal
effects, in particular, often cause a resemblance between mother and
offspring as well as among the offspring themselves. Body size in mice
and other mammals provides an example. Large mothers tend to
provide better nutrition for their young, both before and after birth,
than small mothers. Therefore the young of large mothers tend to
grow faster, and the effect of the rapid early growth may persist, so
that when adult their body size is larger. Thus mothers and offspring
tend to resemble each other in body size.
It will be seen from the examples given that the nature of the
component of variance due to common environment differs according
to the circumstances. What we designate as the $V_{Ec}$ component
depends on the way in which individuals are grouped when we estimate
the observational components of phenotypic variance. Whatever
the form of the analysis, the part of the variance between the
means of groups that can be ascribed to environmental causes is
called the $V_{Ec}$ component. The nature of this component thus
depends on the form of the analysis applied. If the groups in the
analysis are full-sib families then the $V_{Ec}$ component represents
environmental causes of similarity between full sibs; if the groups are
half sibs it represents causes of similarity between half sibs. And in
parent-offspring relationships a comparable covariance term represents
environmental causes of resemblance between offspring and
parent. Thus, whenever we measure a phenotypic covariance with
the object of using it to estimate a causal component of variance we
have to decide whether it includes an appreciable component due to
common environment, and this is often a matter of judgment based
on a biological understanding of the organism and the character. In
experiments, much of the $V_{Ec}$ component can often be eliminated by
suitable design. For example, members of the same family need not
always be reared in the same vial, cage, or plot; they can be randomised
over the rearing environments. Or, by replication, the $V_{Ec}$
component can be measured and suitable allowance made for it in the
resemblance between the relatives.
Thus relatives of all sorts may in principle be subject to an environmental
source of resemblance. In what follows, however, we
shall make the simplification of disregarding the $V_{Ec}$ component for
all relatives except full sibs, though from time to time we shall put in
a reminder of its possible presence. Full sibs are subject to a common
maternal environment and this is often the most troublesome
source of environmental resemblance to overcome by experimental
design. Consequently a $V_{Ec}$ component contributes more often and
in greater amount to the covariance of full sibs than to that of any
other sort of relative. The simplification of disregarding all other
sources of common environmental variance is therefore not entirely
unrealistic.
Phenotypic Resemblance
The covariance of phenotypic values is the sum of the covariances
arising from genetic and from environmental causes. Thus by
putting together the conclusions of the two preceding sections we
arrive at the phenotypic covariances given in Table 9.4. (It will be
remembered that some possible sources of environmental covariance
are being neglected, particularly in offspring-parent relationships
involving the mother.) In all these relationships except that of full
sibs the covariance is either a half or a quarter of the additive genetic
variance. By observing the phenotypic covariance of relatives we can
thus estimate the amount of additive variance in the population and
make the partition of the variance into additive versus the rest.
To arrive at the degree of resemblance expressed as a regression or
correlation coefficient we have to divide the covariance by the appropriate
variance. The resemblance between sibs is expressed as a
correlation and the covariance is divided by the total phenotypic
variance. The correlation between half sibs, for example, is therefore
$\tfrac{1}{4}V_A/V_P$. The resemblance between offspring and parent is expressed
as the regression of offspring on parent, and the covariance is therefore
divided by the variance of parents. In the case of single parents
this is again the phenotypic variance, and the regression of offspring
on one parent is thus $\tfrac{1}{2}V_A/V_P$. In a random-breeding population the
phenotypic variance of parents and offspring is the same, and then the
correlation between offspring and one parent is the same as the regression.
The case of midparent values, however, is a little different.
The covariance has to be divided by the variance of midparent values,
and this is half the phenotypic variance, for the following reason. Let
X and Y stand for the phenotypic values of male and female parents
respectively. Then $\sigma^2_X = \sigma^2_Y = V_P$. The midparent value is $\tfrac{1}{2}X + \tfrac{1}{2}Y$,
and the variance of midparent values, assuming X and Y to be
uncorrelated, is therefore $\tfrac{1}{4}\sigma^2_X + \tfrac{1}{4}\sigma^2_Y = 2 \cdot \tfrac{1}{4}\sigma^2_X = \tfrac{1}{2}V_P$. Thus
the regression of offspring on midparent is $\tfrac{1}{2}V_A / \tfrac{1}{2}V_P = V_A/V_P$. The
correlation between offspring and midparent values, however, is
$\tfrac{1}{2}V_A/\sigma_{\bar{P}}\sigma_O$, where $\sigma_{\bar{P}}$ and $\sigma_O$ are the square roots of the phenotypic variances
of midparents and offspring respectively, and this is not the
same as the regression of offspring on midparent.

Table 9.4
Phenotypic Resemblance between Relatives

                                                                            Regression (b)
Relatives                   Covariance                                      or correlation (t)
Offspring and one parent    $\frac{1}{2}V_A$                                $b = \frac{1}{2}V_A/V_P$
Offspring and midparent     $\frac{1}{2}V_A$                                $b = V_A/V_P$
Half sibs                   $\frac{1}{4}V_A$                                $t = \frac{1}{4}V_A/V_P$
Full sibs                   $\frac{1}{2}V_A + \frac{1}{4}V_D + V_{Ec}$      $t = (\frac{1}{2}V_A + \frac{1}{4}V_D + V_{Ec})/V_P$

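The regressions and correlations of Table 9.4 follow from the variance components by simple division, and a short sketch may make the arithmetic concrete. The function name and the numerical components below are hypothetical; as in the text, the common-environment component is attached only to full sibs and the sexes are assumed equal in variance.

```python
def resemblance(V_A, V_D, V_Ec, V_P):
    """Regressions (b) and correlations (t) of Table 9.4 from variance components."""
    return {
        "b, offspring on one parent": (V_A / 2) / V_P,
        # The variance of midparent values is half the phenotypic variance.
        "b, offspring on midparent": (V_A / 2) / (V_P / 2),
        "t, half sibs": (V_A / 4) / V_P,
        "t, full sibs": (V_A / 2 + V_D / 4 + V_Ec) / V_P,
    }


# Hypothetical components: V_P is the total phenotypic variance.
table = resemblance(V_A=40.0, V_D=8.0, V_Ec=5.0, V_P=100.0)
for name, coefficient in table.items():
    print(f"{name}: {coefficient:.3f}")
```

With these figures the regression on one parent is twice the half-sib correlation, while the full-sib correlation exceeds twice the half-sib correlation because of the dominance and common-environment terms.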
The regressions of offspring on parents and the correlations of
sibs are shown in Table 9.4. All except the full-sib correlation are
simple fractions of the ratio $V_A/V_P$. Thus the different degrees of
resemblance between different sorts of relatives become apparent.
For example, the regression of offspring on one parent is twice the
correlation between half sibs, and the correlation between full sibs is
twice the correlation between half sibs if there is no dominance and
no common environment.
The difference between the full-sib covariance and twice the
half-sib covariance can, in principle, be used to estimate the dominance
variance, $V_D$, provided there is no variance due to common
environment, though some of the variance due to epistatic interaction
would be included, as may be seen from Table 9.3. In practice,
however, it is usually very difficult to be certain that there is no
variance due to common environment, and estimates of the dominance
variance obtained in this way are generally to be regarded as
upper limits rather than as precise estimates.
Table 9.5
The Resemblance between Relatives for some Characters in Man

                                        Correlation coefficient
Character             Reference     Parent-offspring     Full sib
Stature                  (1)             0.51              0.53
Span                     (1)             0.45              0.54
Length of forearm        (1)             0.42              0.48
Intelligence             (2)             0.49              0.49
Birth weight             (3)              -                0.50

(1) Pearson and Lee (1903).
(2) Unweighted averages of several estimates, cited by Penrose (1949).
(3) Quoted from Robson (1955).

The chief use of measurements of the degree of resemblance
between relatives is to estimate the proportionate amount of additive
genetic variance, $V_A/V_P$, which is the heritability. The meaning of the
heritability and the methods of estimating it will be considered more
fully in the next chapter. To conclude this chapter we give in Table
9.5 some examples of correlations between relatives in man. These
are undoubtedly complicated by covariance due to common environment,
and also by assortative mating. The correlation between
husband and wife for intelligence, for example, is as high as 0.58
(see Penrose, 1949). For these reasons human correlations cannot
easily be used to partition the variation into its components.
CHAPTER 10
HERITABILITY
The heritability of a metric character is one of its most important
properties. It expresses, as we have seen, the proportion of the total
variance that is attributable to the average effects of genes, and this is
what determines the degree of resemblance between relatives. But
the most important function of the heritability in the genetic study
of metric characters has not yet been mentioned, namely its predictive
role, expressing the reliability of the phenotypic value as a guide to
the breeding value. Only the phenotypic values of individuals can
be directly measured, but it is the breeding value that determines their
influence on the next generation. Therefore if the breeder or experimenter
chooses individuals to be parents according to their phenotypic
values, his success in changing the characteristics of the population
can be predicted only from a knowledge of the degree of correspondence
between phenotypic values and breeding values. This
degree of correspondence is measured by the heritability, as the following
considerations will show.
The heritability is defined as the ratio of additive genetic variance
to phenotypic variance:
$$h^2 = \frac{V_A}{V_P} \qquad (10.1)$$
(The customary symbol $h^2$ stands for the heritability itself and not for
its square. The symbol derives from Wright's (1921) terminology,
where h stands for the corresponding ratio of standard deviations.)
An equivalent meaning of the heritability is the regression of breeding
value on phenotypic value:
$$h^2 = b_{AP} \qquad (10.2)$$
The equivalence of these meanings can be seen from reasoning similar
to that by which we derived the genetic covariance of offspring and
one parent on p. 153. If we split the phenotypic value into breeding
value and a remainder (R) consisting of the environmental, dominance,
and interaction deviations, then P = A + R. Since A and R are
uncorrelated, $\mathrm{cov}_{AP} = V_A$ and so $b_{AP} = V_A/V_P$.
We may note also that the correlation between breeding values
and phenotypic values, $r_{AP}$, is equal to the square root of the heritability.
This follows from the general relationship between correlation
and regression coefficients, which gives
$$r_{AP} = b_{AP}\frac{\sigma_P}{\sigma_A} = h \qquad (10.3)$$
By regarding the heritability as the regression of breeding value
on phenotypic value we see that the best estimate of an individual's
breeding value is the product of its phenotypic value and the heritability:
$$A_{\text{(expected)}} = h^2 P \qquad (10.4)$$
breeding values and phenotypic values both being reckoned as
deviations from the population mean. In other words the heritability
expresses the reliability of the phenotypic value as a guide to the
breeding value, or the degree of correspondence between phenotypic
value and breeding value. For this reason the heritability enters into
almost every formula connected with breeding methods, and many
practical decisions about procedure depend on its magnitude. These
matters, however, will be considered in the next chapters; here we
are concerned only to point out that the determination of the heritability
is one of the first objectives in the genetic study of a metric
character.
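The two meanings of the heritability, and the relation $r_{AP} = h$, can be illustrated by a small simulation. The sketch below is an assumption-laden illustration rather than a method from the text: breeding values and the remainder R are drawn as independent normal deviates, and the heritability of 0.4, the sample size and the seed are arbitrary choices.

```python
import random


def simulate(h2, n=20000, V_P=1.0, seed=1):
    """Draw n breeding values A and phenotypic values P = A + R, with A and R uncorrelated."""
    rng = random.Random(seed)
    sd_A = (h2 * V_P) ** 0.5
    sd_R = ((1 - h2) * V_P) ** 0.5
    A = [rng.gauss(0.0, sd_A) for _ in range(n)]
    P = [a + rng.gauss(0.0, sd_R) for a in A]
    return A, P


def covariance(xs, ys):
    """Sample covariance; covariance(xs, xs) is the sample variance."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


h2 = 0.4
A, P = simulate(h2)
b_AP = covariance(A, P) / covariance(P, P)     # regression of A on P
r_AP = covariance(A, P) / (covariance(A, A) * covariance(P, P)) ** 0.5

# Both are subject to sampling error; they should lie close to
# h2 = 0.4 and to its square root respectively.
print(round(b_AP, 3), round(r_AP, 3))

# Equation 10.4: expected breeding value of an individual whose phenotypic
# value deviates by 2.0 units from the population mean.
print(h2 * 2.0)
```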
It is important to realise that the heritability is a property not
only of a character but also of the population and of the environmental
circumstance to which the individuals are subjected. Since
the value of the heritability depends on the magnitude of all the components
of variance, a change in any one of these will affect it. All
the genetic components are influenced by gene frequencies and may
therefore differ from one population to another, according to the past
history of the population. In particular, small populations maintained
long enough for an appreciable amount of fixation to have taken place
are expected to show lower heritabilities than large populations.
The environmental variance is dependent on the conditions of culture
or management: more variable conditions reduce the heritability,
more uniform conditions increase it. So, whenever a value is stated
for the heritability of a given character it must be understood to refer
to a particular population under particular conditions. Values found
in other populations under other circumstances will be more or less
the same according to whether the structure of the population and the
environmental conditions are more or less alike.
Very many determinations of heritabilities have been made for a
variety of characters, chiefly in farm animals. Some representative
examples are given in Table 10.1. Different determinations of the
heritability of the same character show a considerable range of variation.
This is partly due to statistical sampling, but some of the
variation reflects real differences between the populations or the
conditions under which they are studied. For these reasons, and because
estimations of heritabilities can seldom be very precise, the
figures quoted in the table are rounded to the nearest 5 per cent.
From Table 10.1 it can be seen that the magnitude of the heritability
shows some connexion with the nature of the character. On the
whole, the characters with the lowest heritabilities are those most
closely connected with reproductive fitness, while the characters
with the highest heritabilities are those that might be judged on biological
grounds to be the least important as determinants of natural
fitness. This is well seen in the gradation of the four characters of
Drosophila.
Table 10.1
Approximate values of the heritability of various characters
in domestic and laboratory animals.

Cattle
  Amount of white spotting in Friesians (Briquet and Lush, 1947)          0.95
  Butterfat % (Johansson, 1950)                                           0.6
  Milk-yield (Johansson, 1950)                                            0.3
  Conception rate (in 1st service) (A. Robertson, 1957a)                  0.01
Pigs
  Thickness of back fat (Fredeen and Jonsson, 1957)                       0.55
  Body length (Fredeen and Jonsson, 1957)                                 0.5
  Weight at 180 days (Whatley, 1942)                                      0.3
  Litter size (Lush and Molln, 1942)                                      0.15
Sheep (Australian Merino)
  Length of wool (Morley, 1955)                                           0.55
  Weight of fleece (Morley, 1955)                                         0.4
  Body weight (Morley, 1955)                                              0.35
Poultry (White Leghorn)
  Egg weight (Lerner and Cruden, 1951)                                    0.6
  Age at laying of first egg (King and Henderson, 1954b)                  0.5
  Egg-production (annual, of surviving birds)
    (King and Henderson, 1954b)                                           0.3
  Egg-production (annual, of all birds) (King and Henderson, 1954b)       0.2
  Body weight (Lerner and Cruden, 1951)                                   0.2
  Viability (Robertson and Lerner, 1949)                                  0.1
Rats
  Expression of hooded gene (amount of white)
    (from data of Castle and Wright, 1916)                                0.4
  Ovary response to gonadotrophic hormone (Chapman, 1946)                 0.35
  Age at puberty in females (Warren and Bogart, 1952)                     0.15
Mice
  Tail length at 6 weeks (Falconer, 1954b)                                0.6
  Body weight at 6 weeks (Falconer, 1953)                                 0.35
  Litter size (1st litters) (Falconer, 1955)                              0.15
Drosophila melanogaster
  Abdominal bristle number (Clayton, Morris, and Robertson, 1957)         0.5
  Body size (thorax length) (F. W. Robertson, 1957b)                      0.4
  Ovary size (F. W. Robertson, 1957a)                                     0.3
  Egg production (F. W. Robertson, 1957b)                                 0.2

Estimation of Heritability
Let us first compare the merits of the different sorts of relatives
for estimating either the additive genetic variance from the covariance,
or the heritability from the regression or correlation coefficient.
Table 10.2 shows again the composition of the phenotypic covariances,
and shows also the regression or correlation expressed in terms of the
heritability. The choice depends on the circumstances. In addition
to the practical matter of which sorts of relatives are in fact obtainable,
there are two points to consider: sampling error and environmental
sources of covariance. The statistical precision of the estimate
depends on the experimental design and also on the magnitude of the
heritability being estimated, and so no hard and fast rule can be
made. The matter of statistical precision will be further considered
in a later section of this chapter. The question of environmental
sources of covariance is generally more important than the statistical
precision of the estimate, because it may introduce a bias which
cannot be overcome by statistical procedure. From considerations of
the biology of the character and the experimental design we have to
decide which covariance is least likely to be augmented by an environmental
component, a matter already discussed in the last
chapter. Generally speaking the half-sib correlation and the regression
of offspring on father are the most reliable from this point of
view. The regression of offspring on mother is sometimes liable to
give too high an estimate on account of maternal effects, as it would,
for example, with body size in most mammals. The full-sib correlation,
which is the only relationship for which an environmental
component of covariance is shown in the table, is the least reliable of
all. The component due to common environment is often present in
large amount and is difficult to overcome by experimental design;
and the full-sib covariance is further augmented by the dominance
variance. The full-sib correlation can therefore seldom do more than
set an upper limit to the heritability.

Table 10.2

                                                                            Regression (b) or
Relatives                   Covariance                                      correlation (t)
Offspring and one parent    $\frac{1}{2}V_A$                                $b = \frac{1}{2}h^2$
Offspring and midparent     $\frac{1}{2}V_A$                                $b = h^2$
Half sibs                   $\frac{1}{4}V_A$                                $t = \frac{1}{4}h^2$
Full sibs                   $\frac{1}{2}V_A + \frac{1}{4}V_D + V_{Ec}$      $t \geq \frac{1}{2}h^2$

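The relations between the observed coefficients and the heritability can be put as a small conversion routine. This is a minimal sketch with hypothetical names and hypothetical input values; it merely inverts the relations $b = \tfrac{1}{2}h^2$, $b = h^2$, $t = \tfrac{1}{4}h^2$ and $t \geq \tfrac{1}{2}h^2$, and flags the full-sib figure as an upper limit.

```python
# Heritability estimate = multiplier x observed regression or correlation.
MULTIPLIER = {
    "offspring on one parent": 2,   # b = h^2 / 2
    "offspring on midparent": 1,    # b = h^2
    "half sibs": 4,                 # t = h^2 / 4
    "full sibs": 2,                 # t >= h^2 / 2, so 2t is only an upper limit
}


def heritability_estimate(relationship, coefficient):
    """Return (estimate of h^2, True if the estimate is only an upper limit)."""
    estimate = MULTIPLIER[relationship] * coefficient
    return estimate, relationship == "full sibs"


# Hypothetical observed coefficients.
for relationship, coefficient in [
    ("offspring on one parent", 0.25),
    ("half sibs", 0.12),
    ("full sibs", 0.30),
]:
    estimate, upper_limit = heritability_estimate(relationship, coefficient)
    note = " (upper limit)" if upper_limit else ""
    print(f"{relationship}: h2 = {estimate:.2f}{note}")
```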
Example 10.1. The heritability of abdominal bristle number in
Drosophila melanogaster has been determined by three different methods,
applied to the same population (Clayton, Morris, and Robertson, 1957),
with the following results:

Method of estimation               Heritability
Offspring-parent regression        0.51 ± 0.07
Half-sib correlation               0.48 ± 0.11
Full-sib correlation               0.53 ± 0.07
Combined estimate                  0.52

The estimates obtained by the three methods are in very satisfactory
agreement. In this case, the character, bristle number, is free of complications
arising from maternal effects and common environment.
Let us now consider briefly some technical matters concerning the
translation of observational data into estimates of heritability. We
shall deal first with the estimation of the heritability; and we shall
later discuss the standard error of the estimate, and the design that
gives an experiment its greatest precision.
Selection of parents and assortative mating. In the treatment
of resemblance between relatives we have supposed the parents to be
a random sample of their generation and to be mated at random. Quite
often, however, one or other of these conditions does not hold, and
the choice of which sort of relative to use in the estimation of heritability
is then somewhat restricted. In experimental and domesticated
populations the parents are often a selected group and consequently
the phenotypic variance among the parents is less than that of the
population as a whole and less than that of the offspring. The regression
of offspring on parents, however, is not affected by the selection
of parents because the covariance is reduced to the same extent as
the variance of the parents, so that the slope of the regression line is
unaltered. Thus the regression of offspring on one parent is a valid
measure of $\tfrac{1}{2}h^2$, and that of offspring on midparent is a valid measure
of $h^2$. But the covariance is not a valid measure of $V_A$, nor the variance
of parents of $V_P$; moreover, the correlation and regression coefficients
are not equal.
Sometimes the mating of parents is not made at random but
according to their phenotypic resemblance, a system known as
assortative mating. There is then a correlation between the phenotypic
values of the mated pairs. The consequences of assortative
mating are described by Reeve (1955b) but they are too complicated
to explain in detail here. They can be deduced by modification of
Table 9.2, the frequencies of the different types of mating being
altered according to the correlation between the mated pairs. The
variance of midparent values is increased and consequently also the
covariance of full sibs. The regression of offspring on midparent,
however, is very little affected and it can be taken as a valid measure
of $h^2$. The increased variance of midparent values under assortative
mating has the practical advantage of reducing the sampling error of
the regression coefficient and thus of the estimate of heritability.
Offspring-parent relationship. The estimation of heritability
from the regression of offspring on parent is comparatively straightforward
and needs little comment apart from the points mentioned in
the preceding paragraphs. The data are obtained in the form of
measurements of parents and the mean values of their offspring. The
covariance is then computed in the usual way from the cross-products
of the paired values. The mean values of offspring may be weighted
according to the number of offspring in each family, if the numbers
differ. The appropriate weighting is discussed by Kempthorne and
Tandon (1953) and by Reeve (1955c).
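The computation just described can be sketched as follows. The function name and the data are hypothetical, and the optional weighting by family size is a simple illustrative choice; it is not the weighting scheme worked out by Kempthorne and Tandon or by Reeve.

```python
def regression_slope(parent, offspring_mean, weights=None):
    """Slope of offspring means on parental values, from (weighted) cross-products."""
    n = len(parent)
    if weights is None:
        weights = [1.0] * n
    total = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, parent)) / total
    mean_y = sum(w * y for w, y in zip(weights, offspring_mean)) / total
    # Sum of cross-products and sum of squares of the parental values.
    s_xy = sum(
        w * (x - mean_x) * (y - mean_y)
        for w, x, y in zip(weights, parent, offspring_mean)
    )
    s_xx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, parent))
    return s_xy / s_xx


# Hypothetical midparent values and family means of offspring.
midparent = [10.0, 12.0, 14.0, 16.0]
offspring = [11.0, 12.0, 13.0, 14.0]
family_size = [5, 1, 1, 5]          # hypothetical numbers of offspring measured

b = regression_slope(midparent, offspring)
b_weighted = regression_slope(midparent, offspring, weights=family_size)
print(b, b_weighted)
```

With midparent values the slope itself estimates $h^2$; had the values been those of single parents, the estimate would be twice the slope.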
Fig. 10.1. Regression of offspring on midparent for wing length
in Drosophila, as explained in Example 10.2. Midparent values are
shown along the horizontal axis, and mean value of offspring along
the vertical axis. (Drawn from data kindly supplied by Dr E. C.
R. Reeve.)
Example 10.2. Fig. 10.1 illustrates the regression of offspring on
mid-parent values for wing length in Drosophila melanogaster (Reeve and
Robertson, 1953). There are 37 pairs of parents and a mean of 273
offspring were measured from each pair of parents. The parents were
mated assortatively, with the result that the variance of midparent values
is greater than it would be if mating had been at random. Each point on
the graph represents the mean value of one pair of parents (measured along
the horizontal axis), and the mean value of their offspring (measured along
the vertical axis). The axes are marked at intervals of 1/100 mm., and they
intersect at the mean value of all parents and all offspring. The sloping
line is the linear regression of offspring on mid-parent. The slope of this
line estimates the heritability, and has the value (± standard error):
$$h^2 = b_{O\bar{P}} = 0.577 \pm 0.07$$
A complication in the use of the regression of offspring on mid-parent
arises if the variance is not equal in the two sexes. We noted
in the previous chapter that the genetic covariance of offspring and
mid-parent is equal to half the additive variance on condition that the
sexes are equal in variance. If this is not so, the regression on mid-parent
cannot, strictly speaking, be used, and the heritability must
be estimated separately for each sex from the regression of daughters
on mothers and of sons on fathers. If the heritabilities are found to
be equal in the two sexes, then a joint estimate can be made from the
regression on midparent, by taking the mean value of the offspring
as the unweighted mean of males and females.
Sib analysis. The estimation of heritability from half sibs is
more complicated than appears at first sight and needs more detailed
comment. A common form in which data are obtained with animals
is the following. A number of males (sires) are each mated to several
females (dams), and a number of offspring from each female are
measured to provide the data. The individuals measured thus form a
population of half-sib and full-sib families. An analysis of variance
is then made by which the phenotypic variance is divided into observational
components attributable to differences between the progeny
of different males (the between-sire component, $\sigma^2_S$); to differences
between the progeny of females mated to the same male
(between-dam, within-sires, component, $\sigma^2_D$); and to differences
between individual offspring of the same female (within-progenies
component, $\sigma^2_W$). The form of the analysis is shown in Table 10.3.
There are supposed to be s sires, each mated to d dams, which
produce k offspring each. The values of the mean squares are denoted
by $MS_S$, $MS_D$, and $MS_W$. The mean square within progenies
is itself the estimate of the within-progeny variance component,
$\sigma^2_W$; but the other mean squares are not the variance components.
The compositions of the mean squares in terms of the observational
components of variance are shown in the right-hand column of the
table, consideration of which will show how the variance components
are to be estimated. The between-dam mean square, for example, is
made up of the within-progeny component together with k times the
between-dam component; so the between-dam component is estimated
as $\sigma^2_D = (1/k)(MS_D - MS_W)$, i.e. we deduct the mean square for
progenies from the mean square for dams and divide by the number
of offspring per dam. Similarly the between-sire component is
estimated as $\sigma^2_S = (1/dk)(MS_S - MS_D)$, where dk is the number of offspring
per sire. If there are unequal numbers of offspring from the
dams, or of dams in the sire groups, the exact solution, which is
described by King and Henderson (1954a), Williams (1954), and
Snedecor (1956, section 10.17) becomes too complicated for description
here. We can, however, use the mean values of d and k with
little error, provided the inequality of numbers is not very great.

Table 10.3
Form of Analysis of Half-Sib and Full-Sib Families

Source                         d.f.         Mean Square   Composition of Mean Square
Between sires                  s - 1        $MS_S$        $= \sigma^2_W + k\sigma^2_D + dk\sigma^2_S$
Between dams (within sires)    s(d - 1)     $MS_D$        $= \sigma^2_W + k\sigma^2_D$
Within progenies               sd(k - 1)    $MS_W$        $= \sigma^2_W$

s = number of sires
d = number of dams per sire
k = number of offspring per dam
The next step is to deduce the connexions between the observa
tional components that have been estimated from the data and the
causal components, in particular the additive genetic variance, the
estimation of which is the main purpose of the analysis. Though all
the information needed has already been given, the interpretation of
the observational components, which is given in Table 10.4, is not
immediately apparent without explanation. The first point to note
is that the estimate of the phenotypic variance is given by the sum
($\sigma^2_T$) of the three observational components: $V_P = \sigma^2_T = \sigma^2_S + \sigma^2_D + \sigma^2_W$.
This is not necessarily equal to the observed variance as estimated
from the total sum of squares, though the two seldom differ by much.
Now consider the interpretation of the between-sire component,
$\sigma^2_S$. This is the variance between the means of half-sib families and
it therefore estimates the phenotypic covariance of half sibs, $\mathrm{cov}_{(HS)}$,
which is $\tfrac{1}{4}V_A$. Thus $\sigma^2_S = \tfrac{1}{4}V_A$. Next consider the within-progeny
component, $\sigma^2_W$. Since any between-group variance component is
equal to the covariance of the members of the groups, it follows that a
within-group component is equal to the total variance minus the
covariance of members of the groups. The progenies of the dams are
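Table 10.4
Interpretation of the observational components of variance
in a sib analysis

Observational component                              Covariance and causal components estimated
Sires:         $\sigma^2_S$                                    $= \mathrm{cov}_{(HS)} = \tfrac{1}{4}V_A$
Dams:          $\sigma^2_D$                                    $= \mathrm{cov}_{(FS)} - \mathrm{cov}_{(HS)} = \tfrac{1}{4}V_A + \tfrac{1}{4}V_D + V_{Ec}$
Progenies:     $\sigma^2_W$                                    $= V_P - \mathrm{cov}_{(FS)} = \tfrac{1}{2}V_A + \tfrac{3}{4}V_D + V_{Ew}$
Total:         $\sigma^2_T = \sigma^2_S + \sigma^2_D + \sigma^2_W$         $= V_P = V_A + V_D + V_{Ec} + V_{Ew}$
Sires + Dams:  $\sigma^2_S + \sigma^2_D$                       $= \mathrm{cov}_{(FS)} = \tfrac{1}{2}V_A + \tfrac{1}{4}V_D + V_{Ec}$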
full-sib families and so the within-progeny variance estimates
$V_P - \mathrm{cov}_{(FS)}$. This leads to the interpretation $\sigma^2_W = \tfrac{1}{2}V_A + \tfrac{3}{4}V_D + V_{Ew}$.
Finally, there remains the between-dam component, and what it
estimates can be found by subtraction as follows:
$$\sigma^2_D = \sigma^2_T - \sigma^2_S - \sigma^2_W = \mathrm{cov}_{(FS)} - \mathrm{cov}_{(HS)} = \tfrac{1}{4}V_A + \tfrac{1}{4}V_D + V_{Ec}$$
Consideration of the between-sire and between-dam components will
show that their sum gives an estimate of the full-sib covariance,
$\mathrm{cov}_{(FS)}$; but this provides no new information for estimating the causal
components. These conclusions about the connexion between observational
and causal components of variance are summarised in
Table 10.4. The contributions of the interaction variance to the
observational components are given by Kempthorne (1955a), and
can be deduced from the contributions to the covariances given in
Table 9.3.
Example 10.3. As an illustration of the estimation of heritability from
a sib analysis we refer to the study of Danish Landrace pigs based on the
records of the Danish Pig Progeny Testing Stations (Fredeen and Jonsson,
1957). The data came from 468 sires each mated to 2 dams, the analysis
being made on the records of 2 male and 2 female offspring from each
dam. Only one such analysis is given here: that of body length in the male
offspring. The analysis, shown in the table, was made within stations and
within years, and this accounts for the degrees of freedom being fewer than
would appear appropriate from the numbers stated above. The interpretation
of the analysis, shown at the foot of the table, has been slightly
simplified by the omission of some minor adjustments not relevant for us
at this stage.

Sib analysis of body length in Danish Landrace pigs; data
for male offspring only (from Fredeen and Jonsson, 1957).

Source                        d.f.   Mean Square   Component of variance
Between sires                 432    6.03          $\sigma^2_S = \tfrac{1}{4}(6.03 - 3.81) = 0.555$
Between dams, within sires    468    3.81          $\sigma^2_D = \tfrac{1}{2}(3.81 - 2.87) = 0.47$
Within progenies              936    2.87          $\sigma^2_W = 2.87$
                                                   $\sigma^2_T = 3.895$

Interpretation of analysis

Sib correlations
Half sibs:        $t_{(HS)} = \sigma^2_S / \sigma^2_T = 0.142$
Full sibs:        $t_{(FS)} = (\sigma^2_S + \sigma^2_D) / \sigma^2_T = 0.263$

Estimates of heritability
Sire-component:   $h^2 = 4\sigma^2_S / \sigma^2_T = 0.57$
Dam-component:    $h^2 = 4\sigma^2_D / \sigma^2_T = 0.48$
Sire + Dam:       $h^2 = 2(\sigma^2_S + \sigma^2_D) / \sigma^2_T = 0.53$

The between-dam component is not greater than the between-sire
component, so there cannot be much non-additive genetic variance or
variance due to common environment. The two estimates of the heritability,
from the sire and dam components respectively, can therefore be
regarded as equally reliable, and their combination based on the resemblance
between full sibs may be taken as the best estimate.
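The arithmetic of a balanced sib analysis can be followed step by step in a short Python sketch. The function names are hypothetical and serve only for illustration; the mean squares and the values d = 2 and k = 2 are those quoted in Example 10.3, and the sketch assumes equal numbers of dams per sire and of offspring per dam.

```python
def sib_components(ms_sires, ms_dams, ms_within, d, k):
    """Observational components from the mean squares of a balanced
    sire/dam analysis (d dams per sire, k offspring per dam)."""
    var_w = ms_within
    var_d = (ms_dams - ms_within) / k
    var_s = (ms_sires - ms_dams) / (d * k)
    return var_s, var_d, var_w


def sib_estimates(var_s, var_d, var_w):
    """Sib correlations and the three heritability estimates."""
    var_t = var_s + var_d + var_w
    return {
        "t_half_sib": var_s / var_t,
        "t_full_sib": (var_s + var_d) / var_t,
        "h2_sire": 4.0 * var_s / var_t,
        "h2_dam": 4.0 * var_d / var_t,
        "h2_sire_plus_dam": 2.0 * (var_s + var_d) / var_t,
    }


# Mean squares for body length of male pigs, as quoted in Example 10.3.
var_s, var_d, var_w = sib_components(6.03, 3.81, 2.87, d=2, k=2)
estimates = sib_estimates(var_s, var_d, var_w)
```

A dam-component estimate much larger than the sire-component estimate would be the signal, discussed in the text, that common environment or non-additive variance is inflating the full-sib resemblance.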
Example 10.4. We have not yet had an example to illustrate the effect
of common environment in augmenting the full-sib correlation. This is
provided by body size in mice. The analysis given in table (i) refers to the
weight of female mice at 6 weeks of age (J. C. Bowman, unpublished).
There were 719 offspring from 74 sires and 192 dams, each with one
litter. These were spread over 4 generations and the analysis was made
within generations. The analysis is complicated by the inequality of the
number of offspring per dam and of dams per sire. We shall not attempt
to explain the adjustments made for these inequalities, but simply give
the compositions of the mean squares from which the components are
estimated.

Table (i)

Source       d.f.   Mean Square   Composition of M. S.                              Components
Sires        70     17.10         $\sigma^2_W + k'\sigma^2_D + dk'\sigma^2_S$       $\sigma^2_S = 0.48$
Dams         118    10.79         $\sigma^2_W + k\sigma^2_D$                        $\sigma^2_D = 2.47$
Progenies    527    2.19          $\sigma^2_W$                                      $\sigma^2_W = 2.19$
                                                                                    $\sigma^2_T = 5.14$

k = 3.48; k' = 4.16; d = 2.33

The dam component is much greater than the sire component,
indicating a substantial amount of variance due to common environment.
Therefore only the sire component can be used to estimate the heritability.
The estimate obtained is $h^2 = 4 \times 0.48/5.14 = 0.37$. Let us now use the analysis
to estimate the causal components according to the interpretation given
in Table 10.4, but with the assumption that non-additive genetic variance
is negligible in amount. Table (ii) gives the estimates and shows how they
are derived.

Table (ii)

$V_P = \sigma^2_T = 5.14 = 100\%$
$V_A = 4\sigma^2_S = 1.92 = 37\%$
$V_{Ec} = \sigma^2_D - \sigma^2_S = 1.99 = 39\%$
$V_{Ew} = \sigma^2_W - 2\sigma^2_S = 1.23 = 24\%$

The percentage contribution of each component to the total
variance is given in the right-hand column. It will be seen that the variance
due to common environment ($V_{Ec}$) amounts to 39 per cent of the
total, and is greater than the environmental variance within full-sib
families ($V_{Ew}$) which amounts to only 24 per cent of the total.
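The partition in table (ii) can be retraced with a few lines of Python. The sketch below is illustrative only: the function name is hypothetical, and it embodies the same assumption as the example, namely that non-additive genetic variance is negligible, so that the relations of Table 10.4 reduce to the three subtractions shown in the comments.

```python
def causal_components(var_s, var_d, var_w):
    """Causal components from observational components, assuming that
    non-additive genetic variance is negligible (V_D = 0)."""
    v_a = 4.0 * var_s            # sire component = (1/4) V_A
    v_ec = var_d - var_s         # dam component  = (1/4) V_A + V_Ec
    v_ew = var_w - 2.0 * var_s   # within progenies = (1/2) V_A + V_Ew
    v_p = var_s + var_d + var_w
    return {"V_P": v_p, "V_A": v_a, "V_Ec": v_ec, "V_Ew": v_ew}


# Observational components for 6-week weight of mice, from Example 10.4.
components = causal_components(var_s=0.48, var_d=2.47, var_w=2.19)
percentages = {
    name: round(100.0 * value / components["V_P"])
    for name, value in components.items()
}
```

The three causal components add up to the phenotypic variance by construction, which provides a simple check on the arithmetic.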
Intra-sire regression of offspring on dam. The heritability
can be estimated from the offspring-parent relationship in a population
with the structure described in the foregoing section, but a slight
modification is necessary. Since each male is mated to several females,
the regression of offspring on mid-parent is inappropriate; and, since
there are usually rather few male parents, the simple regressions on
one or other parent are both unsuitable. The heritability can, however,
be satisfactorily estimated from the average regression of offspring
on dams, calculated within sire groups. That is to say, the
regression of offspring on dam is calculated separately for each set of
dams mated to one sire, and the regressions from each set pooled in a
weighted average. This method is commonly used for the estimation
of heritabilities in farm animals. The intra-sire regression of offspring
on dam estimates half the heritability, as the following consideration
will show. The progeny of one sire has a mean deviation
from the population mean equal to half the breeding value of the sire,
provided the females he is mated to are a random sample from the
population. The progeny of one dam deviates from the mean of the
sire group by half the breeding value of the dam. Therefore the
within-sire covariance of offspring and dam is equal to half the
additive variance of the population as a whole; and the within-sire
regression of offspring on dam is equal to half the heritability, just
like the simple regression of offspring on one parent. The validity
of the estimate is, of course, dependent on the absence of maternal
effects contributing to the resemblance between daughters and dams.
Inequality of the variance of males and females calls for an adjustment
if the heritability is to be estimated from the intra-sire regression of
male offspring on dams. The regression coefficient should then be
multiplied by the ratio of the phenotypic standard deviation of females
to that of males.
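A minimal Python sketch of the intra-sire regression is given below. It is an illustration under assumptions that the text does not specify: the within-sire sums of products and of squares are simply added over sire groups, which amounts to weighting each sire's regression by the sum of squares of his dams' values. The function names and the two small sire groups are hypothetical.

```python
from statistics import mean


def intrasire_regression(sire_groups):
    """Pooled within-sire regression of offspring on dam.

    sire_groups holds one list per sire of (dam value, offspring value)
    pairs. Deviations are taken from each sire group's own means."""
    sum_xy = 0.0
    sum_xx = 0.0
    for pairs in sire_groups:
        if len(pairs) < 2:
            continue  # a single dam gives no within-sire information
        dam_mean = mean(dam for dam, _ in pairs)
        off_mean = mean(off for _, off in pairs)
        sum_xy += sum((dam - dam_mean) * (off - off_mean) for dam, off in pairs)
        sum_xx += sum((dam - dam_mean) ** 2 for dam, _ in pairs)
    if sum_xx == 0.0:
        raise ValueError("no variation among dams within sires")
    return sum_xy / sum_xx


def heritability_from_intrasire(b, sd_ratio=1.0):
    """Twice the regression; sd_ratio (female s.d. / male s.d.) is the
    adjustment for a regression of sons on dams."""
    return 2.0 * b * sd_ratio


# Hypothetical data: two sires whose progeny differ widely in mean.
groups = [
    [(10.0, 20.0), (12.0, 20.5), (14.0, 21.0)],
    [(9.0, 30.0), (11.0, 30.6), (13.0, 31.2)],
]
b_pooled = intrasire_regression(groups)
```

Because deviations are taken within sire groups, the large difference between the two progeny means in the hypothetical data does not enter the regression.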
Example 10.5. The heritability of abdominal bristle-number in
Drosophila melanogaster, estimated from the offspring-parent regression,
was cited in Example 10.1. This was in fact a joint estimate based on
intra-sire regressions of daughters on dams and of sons on dams, the latter
being corrected for inequality of variance in the two sexes (Clayton, Morris,
and Robertson, 1957). The separate regression coefficients, with the correction
for inequality of variances, and the estimates of the heritability
are given in the table.

                                                                        Estimate of
                                                                        heritability
Standard deviation: females                                   3.54
Standard deviation: males                                     3.03
Standard deviation: female/male                               1.17
Regression coefficient: daughter-dam                          0.269     0.54
Regression coefficient: son-dam                               0.206
Regression coefficient: son-dam corrected   0.206 x 1.17 =    0.241     0.48
Joint estimate, as given in Example 10.1                                0.51
The Precision of Estimates of Heritability
It is of the greatest importance to know the precision of any esti
mate of heritability. When an estimate has been obtained one wants
to be able to indicate its precision by the standard error. And when
an experiment aimed at estimating a heritability is being planned one
wants to choose the method and design the experiment so that the
estimate will have the greatest possible precision within the limita
tions imposed by the scale of the experiment. The precision of an
estimate depends on its sampling variance, the lower the sampling
variance the greater the precision; and the standard error is the square
root of the sampling variance. Estimates of heritability are derived
from estimates of either a regression coefficient or an intraclass cor
relation coefficient, and the sampling variances of these are given in
textbooks of statistics. We shall therefore present the necessary
formulae without explanation of their derivation. The information on
the design of experiments given here is derived from the paper by A.
Robertson (1959a) on this subject.
The problems of experimental design are, first, the choice of
method and, second, the decision of how many individuals in each
family are to be measured. Since the total number of individuals
measured cannot be increased indefinitely, an increase of the number
of individuals per family necessarily entails a reduction of the number
of families. The problem is therefore to find the best compromise
between large families and many families. In assessing the relative
efficiencies of different methods and designs we have to compare
experiments made on the same scale; that is to say, with the same
total expenditure in labour or cost. We must therefore decide first
what are the circumstances that limit the scale of the experiment. If
the labour of measurement is the limiting factor, as for example in
experiments with Drosophila, then the limitation is in the total
number of individuals measured, including the parents if they are
measured. If, on the other hand, breeding and rearing space is the
limiting factor, as it generally is with larger animals, the limitation
may be either in the number of families or in the total number of
offspring that can be produced for measurement, and measurements
of the parents may be included without additional cost. We cannot
here take account of all the possible ways in which the scale of the
experiment may be limited. Therefore for the sake of illustration we
shall consider only a limitation of the total number of individuals
measured. That is to say, we shall assume the total number of in
dividuals measured to be the same for all methods and all experi
mental designs. What we have to do, then, is to consider each method
on this basis and see what design and which method will give an
estimate of the heritability with the lowest sampling variance.
Offspring-parent regression. Consider first estimates based on
the regression of offspring on parents. Let X be the independent
variate, which may be either the value of a single parent or the mid-parent
value. Let Y be the dependent variate, which may be either a
single offspring of each parent or the mean of n offspring. Let $\sigma^2_X$
and $\sigma^2_Y$ be the variances of X and Y respectively; let b be the regression
of Y on X, and N the number of paired observations of X and Y,
which is equivalent to the number of families in the experiment.
Let T be the total number of individuals measured, which is fixed by
the scale of the experiment. The number of offspring measured is
nN, and the number of parents N or 2N according to whether the
regression is on one parent or on the mid-parent value. So, with one
parent measured, $T = N(n + 1)$, and with both parents measured
$T = N(n + 2)$. With these symbols, the variance of the estimate of the
regression coefficient is
$$\sigma^2_b = \frac{1}{N - 2}\left[\frac{\sigma^2_Y}{\sigma^2_X} - b^2\right] \qquad (10.5)$$
For use as a guide to design this formula is more convenient
if put in a simplified and approximate form. The regression coefficient
is usually small enough that $b^2$ can be ignored; and we may suppose
that N is fairly large, so that the variance of the estimate may be
put, approximately, as
$$\sigma^2_b = \frac{1}{N}\,\frac{\sigma^2_Y}{\sigma^2_X} \quad \text{(approx.)} \qquad (10.6)$$
When only one parent is measured the variance of parental values is
equal to the phenotypic variance, i.e. $\sigma^2_X = V_P$. When both parents
are measured (provided they were not mated assortatively) the variance
of mid-parent values is half the phenotypic variance, i.e.
$\sigma^2_X = \tfrac{1}{2}V_P$. The variance of the offspring values, $\sigma^2_Y$, is the variance of
the means of families of n individuals. This depends on the phenotypic
correlation, t, between members of families, in a manner that
will be explained in Chapter 13 (see Table 13.2), where it will be
shown that
$$\sigma^2_Y = \frac{1 + (n - 1)t}{n}\,V_P$$
Therefore by substitution for $\sigma^2_X$ and $\sigma^2_Y$ in equation 10.6 the sampling
variance of the regression on one parent becomes
$$\sigma^2_b = \frac{1 + (n - 1)t}{nN} \quad \text{(approx.)} \qquad (10.7)$$
and that of the regression on mid-parent is twice as great. Since the
phenotypic correlation, t, depends on the heritability it will not
generally be known at the time an experiment is being planned.
Therefore the best design cannot be exactly determined in advance.
We can, however, get an approximate idea of how many offspring of
each parent should be measured. On the assumption already stated,
that the total number of individuals measured including the parents
is fixed, it can be shown that the sampling variance given in equation
10.7 is minimal when $n = \sqrt{(1 - t)/t}$ if one parent is measured and when
$n = \sqrt{2(1 - t)/t}$ if both parents are measured. Consider, for example, a
character with a heritability of 20 per cent and no variance due to
common environment, so that the phenotypic correlation in full-sib
families is t = 0.1. Then the optimal family size works out to be
n = 3 when only one parent is measured and n = 4 when both parents
are measured. If we had taken a higher heritability the optimal family
size would have been lower. Large families are advantageous only
for the estimation of very low heritabilities. For example, full-sib
families of about 10 or 14 would be optimal for estimating a heritability
of 2 per cent.
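The optimal family sizes quoted above follow directly from the two square-root expressions, as the following Python sketch shows. The function name is hypothetical; the formula is the one stated in the text, with the factor 2 entering when both parents are measured.

```python
from math import sqrt


def optimal_family_size(t, parents_measured=1):
    """Family size n minimising the sampling variance of the regression
    when the total number of individuals measured is fixed:
    n = sqrt((1 - t) / t) with one parent measured,
    n = sqrt(2 (1 - t) / t) with both parents measured."""
    if not 0.0 < t < 1.0:
        raise ValueError("t must lie strictly between 0 and 1")
    if parents_measured not in (1, 2):
        raise ValueError("parents_measured must be 1 or 2")
    return sqrt(parents_measured * (1.0 - t) / t)


# Full-sib families with no common environment: t is half the heritability.
n_one_parent = optimal_family_size(0.1, parents_measured=1)   # h2 = 20 per cent
n_midparent = optimal_family_size(0.1, parents_measured=2)
n_low_h2_one = optimal_family_size(0.01, parents_measured=1)  # h2 = 2 per cent
n_low_h2_mid = optimal_family_size(0.01, parents_measured=2)
```

In practice the result is rounded to a whole number of offspring, since the sampling variance changes slowly near its minimum.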
So far we have considered only the sampling variance of the
regression coefficient, and how this can be reduced by the design of
the experiment. Now let us consider the sampling variance of the
estimate of heritability, so that we can compare methods, i.e. the use
of one parent or of midparent values. A just comparison can only
be made on the assumption of the optimal design for each method,
and therefore we can only illustrate the comparison by reference to a
particular case. We shall consider the particular case mentioned
above where the phenotypic correlation is t = 0.1, which would be
found in full-sib families when the heritability is 20 per cent. The
optimal family sizes are 3 or 4 as stated above. For the purpose of
comparison we have to express the sampling variance of the regression
coefficient given in equation 10.7 in terms of the total number of
individuals measured, T, since this is assumed to be the same for all
methods. We therefore substitute in equation 10.7 as follows. When
one parent is measured $N = T/(n + 1)$, and n = 3. When both parents are
measured $N = T/(n + 2)$, and n = 4. Substitution in equation 10.7 then
yields $\sigma^2_b = 4.8/3T$ when one parent is measured, and $\sigma^2_b = 3.9/T$ when both
are measured. The regression on one parent must be doubled to give
the estimate of heritability, but the regression on mid-parent is itself
the estimate. So the sampling variances of the estimates of heritability,
in the special case under consideration, are:
By regression on one parent: $\sigma^2_{h^2} = 4\sigma^2_b = 6.4/T$ (approx.)
By regression on mid-parent: $\sigma^2_{h^2} = \sigma^2_b = 3.9/T$ (approx.)
Thus the estimate based on midparent values has considerably less
sampling variance. A regression on midparent values, in general,
yields a more precise estimate of heritability for a given total number
of individuals measured.
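The substitution leading to the figures 6.4/T and 3.9/T can be checked numerically. The sketch below is illustrative and uses a hypothetical function name; it returns the sampling variance of the heritability estimate multiplied by T, so that the result does not depend on the scale of the experiment.

```python
def var_h2_times_T(n, t, parents_measured):
    """T times the approximate sampling variance of the heritability
    estimated from offspring-parent regression (equation 10.7)."""
    if parents_measured not in (1, 2):
        raise ValueError("parents_measured must be 1 or 2")
    # Equation 10.7 with N = T / (n + parents_measured), multiplied by T.
    var_b = (1.0 + (n - 1) * t) * (n + parents_measured) / n
    if parents_measured == 1:
        # h2 = 2b, so the variance of the estimate is 4 times that of b.
        return 4.0 * var_b
    # Mid-parent: the variance of b is twice as great, and h2 = b itself.
    return 2.0 * var_b


one_parent = var_h2_times_T(n=3, t=0.1, parents_measured=1)
midparent = var_h2_times_T(n=4, t=0.1, parents_measured=2)
```

The ratio of the two results, about 1.6, measures how much larger an experiment based on one parent would have to be in order to match the precision of one based on mid-parent values in this particular case.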
Sib analyses. Now let us consider estimates obtained from the
intraclass correlation of full-sib or half-sib families. We shall at
first suppose for simplicity that half-sib families are not subdivided
into full-sib families; i.e. that only one offspring from each dam is
measured in paternal half-sib families. In the case of full-sib families
we shall assume that there is no variance due to common environment
so that the estimate of heritability is a valid one. Let N be the
number of families, and n the number of individuals per family, so
that the total number of individuals measured is $T = nN$. Let the
intraclass correlation be t. The sampling variance of the intraclass
correlation is then
$$\sigma^2_t = \frac{2[1 + (n - 1)t]^2 (1 - t)^2}{n(n - 1)(N - 1)} \qquad (10.8)$$
When the value of $T = nN$ is limited by the size of the experiment it
can be shown that the sampling variance of the intraclass correlation
is minimal when $n = 1/t$, approximately. Therefore the optimal family
size depends on the heritability. In the case of full-sib families
$h^2 = 2t$, and in the case of half-sib families, $h^2 = 4t$. So the most
efficient design has the following family sizes:
With full-sib families: $n = 2/h^2$
With half-sib families: $n = 4/h^2$
Since prior knowledge of the heritability will be at the best only
approximate, the optimal family size cannot be exactly determined
beforehand. The loss of efficiency, however, is much greater if the
family size is below the optimum than if it is above. It is therefore
better to err on the side of having too large families. A. Robertson
(1959a) shows that, in the absence of prior knowledge of the heritability,
half-sib analyses should generally be designed with families of
between 20 and 30.
If the experiment has the most efficient design, with $n = 1/t$, then
the sampling variance of the intraclass correlation is approximately
$$\sigma^2_t = \frac{8t}{T} \qquad (10.9)$$
Therefore under optimal design the sampling variances of the estimates
of heritability are as follows:
From full-sib families: $\sigma^2_{h^2} = 4\sigma^2_t = 16h^2/T$ (approx.)
From half-sib families: $\sigma^2_{h^2} = 16\sigma^2_t = 32h^2/T$ (approx.)
Thus, other things being equal, an estimate from full-sib families is
twice as precise as one from half-sib families.
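A brief Python sketch may help to show how closely the approximation 10.9 follows the exact expression 10.8 at the optimal family size. The function names, and the experiment of 2,000 individuals in 100 half-sib families of 20, are hypothetical choices made only for illustration.

```python
def var_intraclass(t, n, N):
    """Sampling variance of the intraclass correlation (equation 10.8)
    for N families of n individuals each."""
    return 2.0 * (1.0 + (n - 1) * t) ** 2 * (1.0 - t) ** 2 / (n * (n - 1) * (N - 1))


def var_intraclass_optimal(t, T):
    """Approximate sampling variance under optimal design, n = 1/t
    (equation 10.9)."""
    return 8.0 * t / T


def var_h2_optimal(h2, T, family="half"):
    """Approximate sampling variance of the heritability estimate under
    optimal design: 16 h2 / T for full sibs, 32 h2 / T for half sibs."""
    if family == "full":
        return 16.0 * h2 / T
    if family == "half":
        return 32.0 * h2 / T
    raise ValueError("family must be 'full' or 'half'")


# Hypothetical half-sib experiment: h2 = 0.2, so t = 0.05 and n = 1/t = 20.
exact = var_intraclass(t=0.05, n=20, N=100)
approx = var_intraclass_optimal(t=0.05, T=2000)
```

For these figures the approximation overstates the exact variance by roughly a tenth, which is of little consequence for the planning of an experiment.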
At this point let us compare the precision of estimates from sib
analyses with those from offspringparent regressions, assuming
optimal design in each case. Again we have to choose a specific case
for illustration of the comparison. Let us for simplicity suppose as we
did before that the heritability to be estimated is 20 per cent. And,
though perhaps not very representative of situations likely to arise in
practice, let us compare an estimate obtained from a half-sib analysis
with one obtained from the regression of offspring on one parent
when the offspring consist of full-sib families. The variance of the
estimate of heritability from the half-sib analysis would then be 6.4/T
by substitution in the formula given above, and from the regression of
offspring on one parent it would also be 6.4/T as we found previously.
In this case, therefore, these two methods would give equally precise
estimates for a given total number of individuals measured. If we had
considered a higher heritability, then the regression method would
have had the lower sampling variance. The comparison we have made,
though referring to a particular case, illustrates the general conclusion,
which is that the regression method is preferable for estimating
moderately high heritabilities and the sib correlation method is
preferable for low heritabilities, the critical heritability being, very
roughly, about 20 per cent when the comparison is made on the basis
of an equal total number of individuals measured.
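The comparison can be extended to other heritabilities with the short Python sketch below. It is an illustration resting on the assumptions of the text: optimal design for each method, full-sib families with no common environment or dominance (so that t is half the heritability) for the regression on one parent, and the approximate formulae for both methods. The function names are hypothetical, and the family size is allowed to take fractional values for simplicity.

```python
from math import sqrt


def half_sib_var_times_T(h2):
    """T times the sampling variance of h2 from an optimally designed
    half-sib analysis: 32 h2."""
    return 32.0 * h2


def one_parent_regression_var_times_T(h2):
    """T times the sampling variance of h2 from the regression of
    full-sib family means on one parent, at the optimal family size."""
    t = h2 / 2.0                 # full-sib correlation, additive genes only
    n = sqrt((1.0 - t) / t)      # optimal family size, one parent measured
    var_b = (1.0 + (n - 1) * t) * (n + 1) / n   # equation 10.7 times T
    return 4.0 * var_b           # h2 = 2b


comparison = {
    h2: (half_sib_var_times_T(h2), one_parent_regression_var_times_T(h2))
    for h2 in (0.1, 0.2, 0.4)
}
```

Each entry of `comparison` holds the two variances (half-sib analysis first) for one value of the heritability, so the method with the smaller figure is the more precise at that heritability.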
Finally let us consider briefly a sib analysis where the half-sib
families are subdivided into full-sib families. The situation is then
more complicated, and for details the reader should consult the papers
of Osborne and Paterson (1952) and A. Robertson (1959a). The
conclusions are as follows. In many cases the estimation of heritability
will be based only on the between-sire component, i.e. the
half-sib correlation. This will arise when common environment
renders the full-sib correlation unsuitable. The most efficient design
then has only one offspring per dam, and is exactly the same as the
half-sib analysis discussed above. If there is no common environment
and it is desired to estimate the correlations from sire and from
dam components with equal precision, then the optimal design has
3 or 4 dams per sire with the number of offspring per dam equal to
$2/h^2$. In the absence of prior knowledge of the heritability the analysis
should be planned with 3 or 4 dams per sire, and 10 offspring per
dam.
Identical Twins
Identical twins seem at first sight to provide, for man and cattle, a
means of estimating the genotypic variance. They provide individuals
of identical genotype, just as inbred lines, or crosses between lines, do
for laboratory animals or for plants. The phenotypic variance within
pairs of identical twins should, therefore, estimate the environmental
variance and so allow the partition of the phenotypic variance into
genotypic and environmental components to be made. (This would
not estimate the heritability, but the use of identical twins seems
nevertheless most appropriately discussed at this point.) Many
studies of human twins have been made, and have shown the mem
bers of the pairs to be extremely alike in most characters, even when
reared apart from childhood (see Stern, 1949, Ch. 23, for review and
references). Studies of cattle twins, though on a much smaller scale,
show the same thing (see Hancock, 1954; Brumby, 1958). Taken at
their face value these studies seem to indicate a very high degree of
genetic determination — up to 90 per cent or even more — for many
characters. The use of identical twins in this way is, however, vitiated
by the additional similarity due to common environment. Twins
share a common environment from conception to birth, and over the
period during which they are reared together, so that the within-pair
variance contains only a part, and perhaps only a small part, of the
total environmental variance. This difficulty may be partially overcome
by the comparison of identical with fraternal twins. Fraternal
twins are full sibs which share a common environment to approximately
the same extent as identical twins. Let us therefore consider
how the causal components of variance contribute to the observational
components between pairs and within pairs for the two sorts of
twins. The compositions of the observational components are given
in Table 10.5, the between-pair component being the phenotypic
covariance. The environmental components are shown as being the
same for fraternal as for identical twins. This is not necessarily true,
but one can proceed only on the assumption that it is.
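Table 10.5
Composition of the components of variance between and
within pairs of twins.

              Between pairs                                      Within pairs
Identicals    $V_A + V_D + V_{Ec}$                               $V_{Ew}$
Fraternals    $\tfrac{1}{2}V_A + \tfrac{1}{4}V_D + V_{Ec}$       $\tfrac{1}{2}V_A + \tfrac{3}{4}V_D + V_{Ew}$
Difference    $\tfrac{1}{2}V_A + \tfrac{3}{4}V_D$                $\tfrac{1}{2}V_A + \tfrac{3}{4}V_D$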
The contributions of the interaction variance, which for simplicity
are omitted, can be added from Table 9.3 (p. 157). If the environmental
components are the same for the two sorts of twins, then the difference
between identicals and fraternals in either of the two components
estimates half the additive variance together with three-quarters of
the dominance variance (and more than three-quarters of the interaction
variance). To take the partitioning further it is necessary to
have an estimate of the additive variance, reliably free from admixture
with variance due to common environment. By subtraction of half
the additive variance we may then obtain an estimate of three-quarters
of the dominance variance together with more than three-quarters of
the interaction variance. This would give at least an approximate idea
of the amount of non-additive genetic variance. There is, however, a
difficulty with cattle in comparisons between identical and fraternal
twins, connected again with the environmental components of
variance. Vascular anastomoses frequently occur in the placentae of
both sorts of twins, so that the blood of the two twins is mixed. This
will not make identicals any more alike, but it may make fraternals
more alike than they would otherwise be.
Some results of twin-studies are quoted in Table 10.6, in order to
illustrate the degree of resemblance between identical and between
fraternal twins in both man and cattle. The difference between the
correlation coefficients of identicals and fraternals, given in the right-hand
column, could be taken as an estimate of half the heritability if
there were no non-additive genetic variance and if there were no
complications arising from a common circulation. But since non-additive
variance cannot reasonably be assumed to be absent, the
difference can only be regarded as setting an upper limit to half the
heritability. The vascular anastomoses in cattle twins may, however,
render the estimates of the heritability, or of its upper limit, too low.
Table 10.6
Resemblance between Twins

                                                  Correlation coefficients
Character                          Reference   Identicals   Fraternals   Difference
Man
  Height                           (1)         .93          .64          .29
  Weight                           (1)         .92          .63          .29
  Intelligence                     (1)         .88          .63          .25
  Birth weight                     (2)         .67          .58          .09
Cattle
  Milk-yield, 1st lactation        (3)         .91          .65          .26
  Butterfat-yield, 1st lactation   (3)         .90          .51          .39
  Fat % in milk, 1st lactation     (3)         .95          .86          .09
  Weight at 96 weeks               (3)         .83          .78          .05
  Body length at 96 weeks          (3)         .75          .62          .13

(1) Newman, Freeman, and Holzinger (1937). Based on 50 pairs of
identicals and 50 pairs of fraternals, corrected for age differences.
(2) Quoted from Robson (1955).
(3) Brumby and Hancock (1956). Based on 10 pairs of identicals and 11
pairs of fraternals.
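The use made of the differences in Table 10.6 can be put in a few lines of Python. The sketch is illustrative only and the function name is hypothetical. It rests on the reasoning of the text: the difference between the two correlations sets an upper limit to half the heritability, provided the environmental components are the same for both sorts of twins, so that twice the difference is an upper limit to the heritability itself.

```python
def heritability_upper_limit(r_identical, r_fraternal):
    """Twice the difference between the identical-twin and fraternal-twin
    correlations: an upper limit to the heritability, valid only if the
    two sorts of twins share environment to the same extent."""
    return 2.0 * (r_identical - r_fraternal)


# Three rows taken from Table 10.6: (identicals, fraternals).
twin_correlations = {
    "height (man)": (0.93, 0.64),
    "birth weight (man)": (0.67, 0.58),
    "milk-yield, 1st lactation (cattle)": (0.91, 0.65),
}
upper_limits = {
    character: round(heritability_upper_limit(r_i, r_f), 2)
    for character, (r_i, r_f) in twin_correlations.items()
}
```

For the cattle characters the caution given in the text applies: vascular anastomoses may make fraternals more alike, so that the limit computed in this way may be too low.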
CHAPTER 11
SELECTION:
I. The Response and its Prediction
Up to this point in our treatment of metric characters we have been
concerned with the description of the genetic properties of a popula
tion as it exists under random mating, with no influences tending to
change its properties; now we have to consider the changes brought
about by the action of breeder or experimenter. There are two ways,
as we noted in Chapter 6, in which the action of the breeder can change
the genetic properties of the population; the first by the choice of
individuals to be used as parents, which constitutes selection, and the
second by control of the way in which the parents are mated, which
embraces inbreeding and cross breeding. We shall consider selection
first, and in doing so we shall ignore the effects of inbreeding, even
though we cannot realistically suppose that we are always dealing
with a population large enough for its effects to be negligible.
The basic effect of selection is to change the array of gene
frequencies in the manner described in Chapter 2. The changes of gene
frequency themselves, however, are now almost completely hidden
from us because we cannot deal with the individual loci concerned
with a metric character. We therefore have to describe the effects of
selection in a different manner, in terms of the observable properties
— means, variances and covariances — though without losing sight of
the fact that the underlying cause of the changes we describe is the
change of gene frequencies. Before we come to details let us consider
the change of gene frequencies a little further in general terms.
To describe the change of the genetic properties from one generation
to the next we have to compare successive generations at the same
point in the life cycle of the individuals, and this point is fixed by the
age at which the character under study is measured. Most often the
character is measured at about the age of sexual maturity or on the
young adult individuals. The selection of parents is made after the
measurements, and the gene frequencies among these selected
individuals are different from what they were in the whole population
before selection. If there are no differences of fertility among the
selected individuals or of viability among their progeny, then the gene
frequencies are the same in the offspring generation as in the selected
parents. Thus artificial selection — that is, selection resulting from
the action of the breeder in the choice of parents — produces its change
of gene frequency by separating the adult individuals of the parent
generation into two groups, the selected and the discarded, that differ
in gene frequencies. Natural selection, operating through differences
of fertility among the parent individuals or of viability among their
progeny, may cause further changes of gene frequency between the
parent individuals and the individuals on which measurements are
made in the offspring generation. Thus there are three stages at
which a change of gene frequency may result from selection: the first
through artificial selection among the adults of the parent generation;
the second through natural differences of fertility, also among the
adults of the parent generation; and the third through natural differences
of viability among the individuals of the offspring generation.
Though natural differences of fertility and viability are always present
they are not necessarily always relevant, because they are not necessarily
connected with the genes concerned with the metric character.
Response to Selection
The change produced by selection that chiefly interests us is the
change of the population mean. This is the response to selection,
which we shall symbolise by R; it is the difference of mean phenotypic
value between the offspring of the selected parents and the whole of
the parental generation before selection. The measure of the selection
applied is the average superiority of the selected parents, which
is called the selection differential, and will be symbolised by S. It is
the mean phenotypic value of the individuals selected as parents
expressed as a deviation from the population mean, that is from the
mean phenotypic value of all the individuals in the parental generation
before selection was made. To deduce the connexion between
response and selection differential let us imagine two successive
generations of a population mating at random, as represented
diagrammatically in Fig. 11.1. Each point represents a pair of parents
and their progeny, and is positioned according to the mid-parent
value measured along the horizontal axis and the mean value of the
progeny measured along the vertical axis. The origin represents the
population mean, which is assumed to be the same in both generations.
The sloping line is the regression line of offspring on mid-parent.
(A diagram of this sort, plotted from actual data, was given in Fig.
10.1.) Now let us regard a group of individuals in the parental
generation as having been selected, say those with the highest
values. These pairs of parents and their offspring are indicated by
solid dots in the figure. The parents have been selected on the basis
of their own phenotypic values, without regard to the values of their
progeny or of any other relatives. (This chapter deals exclusively
with selection made in this way: other methods will be described in
Chapter 13.) Let S be the mean phenotypic value of these selected
parents, expressed as a deviation from the population mean. And
similarly let R be the mean deviation of their offspring from the
population mean. Then S is the selection differential and R is the
response. The point marked by the cross represents the mean value
of the selected parents and of their progeny, and it lies on the
regression line. The regression coefficient of offspring on parents is thus
equal to R/S. Therefore the connexion between response and selection
differential is

    $R = b_{OP} S$        (11.1)

Fig. 11.1. Diagrammatic representation of the mean values of
progeny plotted against the mid-parent values, to illustrate the
response to selection, as explained in the text.
We saw in the last chapter that the regression of offspring on
mid-parent is equal to the heritability, provided there is no non-genetic
cause of resemblance between offspring and parents. To this we must
add the further condition that there should be no natural selection:
that is to say, that fertility and viability are not correlated with the
phenotypic value of the character under study. Provided these
conditions hold, therefore, the ratio of response to selection
differential is equal to the heritability, and the response is given by

    $R = h^2 S$        (11.2)
The connexion between the response and the selection differential,
expressed in equation 11.2, follows directly from the meaning of
the heritability. We noted in the last chapter (equation 10.2) that the
heritability is equivalent to the regression of an individual's breeding
value on its phenotypic value. The deviation of the progeny from
the population mean is, by definition, the breeding value of the
parents, and so the response is equivalent to the breeding value of the
parents. Thus it follows that the expected value of the progeny is
given by $R = h^2 S$.
There is one point at which the situation envisaged in deducing
the equations of response does not coincide with what is actually
done in selection. We supposed the individuals of the parent generation
to have mated at random and the selection to have been applied
subsequently. In practice, however, the selection is usually made
before mating, on the basis of the individuals' values and not the
mid-parent values. The effect of this is that the individuals, when
regarded as part of the whole parental population, have been mated
assortatively. Assortative mating, however, has very little effect on
the offspring-parent regression, as we noted in the last chapter, and
this feature of selection procedure can therefore be disregarded.
Prediction of response. The chief use of these equations of
response is for predicting the response to selection. Let us consider a
little further the nature of the prediction that can be made. First, it
is clear that equation 11.1 is not a prediction but simply a description,
because the regression of offspring on parent cannot be measured
until the offspring generation has been reared. We could, however,
measure the regression, $b_{OP}$, in a previous generation, and then use
the equation $R = b_{OP} S$ to predict the response to selection. There is
no genetics involved in this; it is simply an extrapolation of direct
observation, and the only conditions on which it depends are the
absence of environmental change and the absence of genetic change
between the generations from which the regression was estimated and
the generation to which selection is applied. The equation $R = h^2 S$,
however, provides a means of prediction based on observations made
only on the individuals of the parent generation before selection. Its
validity rests on obtaining a reliable estimate of $h^2$ from the
resemblance between relatives, such as half sibs; and on the truth of the
identity $b_{OP} = h^2$.
Example 11.1. The selection for abdominal bristle number in
Drosophila melanogaster, by Clayton, Morris, and Robertson (1957), will provide
an illustration of the prediction of the response, and will serve also to
indicate the extent of the agreement between observation and prediction.
(The data for this example were kindly supplied by Dr G. A. Clayton.)
The heritability of bristle number was first estimated from the base
population before selection, and the value found was 0.52, as stated in
Example 10.1. Five samples of 100 males and 100 females were taken from
the base population, and selection for high and for low bristle number was
made in each of the five samples, the 20 most extreme individuals of each
sex being selected as parents. The mean deviations of these selected
individuals from the mean of the sample out of which they were selected are
given in the table in the columns headed S, the negative signs under
downward selection being omitted. These are the selection differentials. The
expected responses are obtained by multiplying the selection differentials
by the heritability, according to equation 11.2. The observed responses
are the differences between the progeny means and the sample means out
of which the parents were selected. The expected and observed responses
are also given in the table, negative signs being again omitted.

              Upward selection          Downward selection
                      Response                  Response
Line        S       Exp.    Obs.       S       Exp.    Obs.
1          5.29     2.75    2.60      4.63     2.41    2.44
2          5.12     2.66    2.23      4.58     2.38    2.29
3          4.44     2.31    2.43      4.36     2.27    0.67
4          4.32     2.25    3.12      5.60     2.91    1.13
5          4.88     2.54    2.68      4.12     2.14    2.68
Mean       4.81     2.50    2.61      4.66     2.42    1.84

Comparison of the observed with the expected responses shows that on
the whole there is fairly good agreement, though in some lines,
particularly lines 3 and 4 selected downward, there are quite serious
discrepancies. These discrepancies, which are typical of selection
experiments, illustrate the fact that
a single generation of selection in only one line cannot be relied on to
follow the prediction at all closely.
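
The expected responses in this example can be recomputed with a short Python sketch of equation 11.2. The variable and function names are illustrative assumptions; the selection differentials and observed responses are copied from the table above. The sketch also prints the ratio of the mean observed response to the mean selection differential in each direction, for comparison with the heritability of 0.52.

```python
H2 = 0.52  # heritability estimated in the base population (Example 10.1)

# Selection differentials (S) and observed responses, copied from the table
upward = {"S": [5.29, 5.12, 4.44, 4.32, 4.88],
          "obs": [2.60, 2.23, 2.43, 3.12, 2.68]}
downward = {"S": [4.63, 4.58, 4.36, 5.60, 4.12],
            "obs": [2.44, 2.29, 0.67, 1.13, 2.68]}

def expected_response(h2, s):
    """Equation 11.2: R = h2 * S."""
    return h2 * s

def mean(values):
    return sum(values) / len(values)

summary = {}
for label, data in (("upward", upward), ("downward", downward)):
    expected = [expected_response(H2, s) for s in data["S"]]
    summary[label] = {
        "expected": expected,
        "mean_expected": mean(expected),
        "mean_observed": mean(data["obs"]),
        # observed response per unit of selection differential
        "observed_ratio": mean(data["obs"]) / mean(data["S"]),
    }
    print(label, [f"{r:.2f}" for r in expected],
          f"mean exp. {summary[label]['mean_expected']:.2f}",
          f"mean obs. {summary[label]['mean_observed']:.2f}",
          f"obs. R/S {summary[label]['observed_ratio']:.2f}")
```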
The prediction of response is valid, in principle, for only one
generation of selection. The response depends on the heritability of
the character in the generation from which the parents are selected.
The basic effect of the selection is to change the gene frequencies, so
the genetic properties of the offspring generation, in particular the
heritability, are not the same as in the parent generation. Since the
changes of gene frequency are unknown we cannot strictly speaking
predict the response to a second generation of selection without
re-determining the heritability. Experiments have shown, however,
that the response is usually maintained with little change over several
generations — up to five, ten, or even more. This will be seen in the
graphs of responses to selection given later in this chapter and in the
next. In practice, therefore, the prediction may be expected to hold
good over several generations. The effects of selection over longer
periods, and also its effects on properties other than the mean, will be
discussed in a later section.
The selection differential. We have seen that the change of the
population mean brought about by selection — i.e. the response —
depends on the heritability of the character and on the amount of
selection applied as measured by the selection differential. The
selection differential will not be known, however, until the selection
among the parental generation has actually been made. So the equations
of response in the form given above are only of limited usefulness
for predicting the response. To be able to predict further ahead
we need to know what determines the magnitude of the selection
differential. Consideration of the factors that influence the selection
differential will also enable us to see more clearly the means by which
the breeder may improve the response to selection.
The magnitude of the selection differential depends on two factors:
the proportion of the population included among the selected
group, and the phenotypic standard deviation of the character. The
dependence of the selection differential on these two factors is
illustrated diagrammatically in Fig. 11.2. The graphs show the
distribution of phenotypic values, which is assumed to be normal. The
individuals with the highest values are supposed to be selected, so
that the distribution is sharply divided at a point of truncation, all
individuals above this value being selected and all below rejected.
The arrow in each figure marks the mean value of the selected group,
and S is the selection differential. In graph (a) half the population is
selected, and the selection differential is rather small: in graph (b)
only 20 per cent of the population is selected, and the selection
differential is much larger. In graph (c) 20 per cent is again selected, but
the character represented is less variable and the selection differential
is consequently smaller. The standard deviation in (c) is half as great
as in (b) and the selection differential is also half as great.

Fig. 11.2. Diagrams to show how the selection differential, S,
depends on the proportion of the population selected, and on the
variability of the character. All the individuals in the stippled
areas, beyond the points of truncation, are selected. The axes are
marked in hypothetical units of measurement.
(a) 50% selected; standard deviation 2 units: S = 1.6 units
(b) 20% selected; standard deviation 2 units: S = 2.8 units
(c) 20% selected; standard deviation 1 unit: S = 1.4 units
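
The three cases of Fig. 11.2 can be checked approximately by simulation. The following Python sketch is an illustration only, and the function name, sample size and seed are assumptions made here. It draws phenotypic values from a normal distribution, keeps the top fraction as the selected group, and takes the difference between the mean of the selected group and the mean of the whole sample.

```python
import random

def simulated_selection_differential(proportion, sd, n=100_000, seed=1):
    """Mean of the top `proportion` of n normal deviates minus the overall mean."""
    rng = random.Random(seed)
    values = sorted(rng.gauss(0.0, sd) for _ in range(n))
    n_selected = int(round(proportion * n))
    selected = values[n - n_selected:]      # truncation: the highest values
    return sum(selected) / n_selected - sum(values) / n

# (proportion selected, standard deviation) for graphs (a), (b) and (c)
cases = {"a": (0.50, 2.0), "b": (0.20, 2.0), "c": (0.20, 1.0)}
simulated = {
    graph: simulated_selection_differential(p, sd)
    for graph, (p, sd) in cases.items()
}

for graph, s in simulated.items():
    print(f"graph ({graph}): S is about {s:.2f} units")
```

Because the values are simulated, the results agree with the figures quoted in the caption only to within sampling error.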
The standard deviation, which measures the variability, is a
property of the character and the population, and it sets the units in
which the response is expressed, i.e. so many pounds, millimetres,
bristles, etc. The response to selection may be generalised if both
response and selection differential are expressed in terms of the
phenotypic standard deviation, $\sigma_P$. Then $R/\sigma_P$ is a generalised
measure of the response, by means of which we can compare different
characters and different populations; and $S/\sigma_P$ is a generalised measure
of the selection differential, by means of which we can compare
different methods or procedures for carrying out the selection. The
"standardised" selection differential, $S/\sigma_P$, will be called the intensity
of selection, symbolised by i. The equation of response (11.2) then
becomes

    $\frac{R}{\sigma_P} = \frac{S}{\sigma_P} h^2$

or

    $R = i \sigma_P h^2$        (11.3)

By noting that $h = \sigma_A / \sigma_P$, where $\sigma_A$ is the standard deviation of
breeding values (square root of the additive genetic variance), we may write
this equation in the form

    $R = i h \sigma_A$        (11.4)

which is sometimes used in comparisons of different methods of
selection.
Fig. 11.3. Intensity of selection in relation to proportion selected.
(Vertical axis: intensity of selection; horizontal axis: proportion
selected, %.) The intensity of selection is the mean deviation of the
selected individuals, in units of phenotypic standard deviations. The upper
graph refers to selection out of a large total number of individuals
measured: the lower two graphs refer to selection out of totals of 20
and 10 individuals respectively.

The intensity of selection, i, depends only on the proportion of
the population included in the selected group, and, provided the
distribution of phenotypic values is normal, it can be determined
from tables of the properties of the normal distribution. If p is the
proportion selected, i.e. the proportion of the population falling
beyond the point of truncation, and z is the height of the ordinate at
the point of truncation, then it follows from the mathematical
properties of the normal distribution that

    $\frac{S}{\sigma_P} = i = \frac{z}{p}$        (11.5)
Thus, given only the proportion selected, p, we can find out by how
many standard deviations the mean of the selected individuals will
exceed the mean of the population before selection: that is to say, the
intensity of selection, i. The graphs in Fig. 11.3 show the relationship
between i and p; the value of i for any given value of p can be
read from the graphs with sufficient accuracy for most purposes. The
relationship between i and p given in equation 11.5 applies, strictly
speaking, only to a large sample: that is to say, when a large number of
individuals have been measured, among which the selection is to be
made. When selection is made out of a small number of measured
individuals, the mean deviation of the selected group is a little less.
The intensity of selection can be found from tables of deviations of
ranked data (Table XX of Fisher and Yates, 1943). The two lower
curves in Fig. 11.3 show the intensity of selection for samples of 10
and 20. Selection intensities for samples smaller than 10 are given
in Table 11.1.

Table 11.1
Intensities of selection when selection is made out of a small
number of individuals measured. The figures in the table
are values of $i = S/\sigma_P$ = mean deviation in standard measure.

Number                        Size of sample
selected     9      8      7      6      5      4      3      2
   1        1.49   1.42   1.35   1.27   1.16   1.03   0.85   0.56
   2        1.21   1.14   1.06   0.96   0.83   0.67   0.42    -
   3        1.00   0.91   0.82   0.70   0.55   0.34    -      -
   4        0.82   0.72   0.62   0.48   0.29    -      -      -
   5        0.66   0.55   0.42   0.25    -      -      -      -
   6        0.50   0.38   0.23    -      -      -      -      -
   7        0.35   0.20    -      -      -      -      -      -
   8        0.19    -      -      -      -      -      -      -
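
Both ways of finding the intensity of selection can be sketched in Python using only the standard library. This is an illustration under stated assumptions, not the method used to compile the published tables: the function names are invented here, the large-sample value follows equation 11.5, and the small-sample value is taken as the mean of the expected values of the highest ranked normal deviates, obtained by simple numerical integration.

```python
from math import comb
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def intensity_large_sample(p):
    """Equation 11.5, i = z / p, for a proportion selected 0 < p < 1."""
    truncation_point = STD_NORMAL.inv_cdf(1.0 - p)
    z = STD_NORMAL.pdf(truncation_point)   # height of the ordinate
    return z / p

def expected_order_statistic(j, n, lo=-8.0, hi=8.0, steps=4000):
    """Expected value of the j-th smallest of n standard normal deviates,
    by trapezoidal integration of x times its density."""
    h = (hi - lo) / steps
    coeff = j * comb(n, j)                 # n! / ((j-1)! (n-j)!)
    total = 0.0
    for k in range(steps + 1):
        x = lo + k * h
        cdf = STD_NORMAL.cdf(x)
        f = (x * coeff * cdf ** (j - 1) * (1.0 - cdf) ** (n - j)
             * STD_NORMAL.pdf(x))
        total += f if 0 < k < steps else 0.5 * f
    return total * h

def intensity_small_sample(n_selected, n_measured):
    """Mean expected deviation of the n_selected highest out of n_measured."""
    top_ranks = range(n_measured - n_selected + 1, n_measured + 1)
    return sum(expected_order_statistic(j, n_measured)
               for j in top_ranks) / n_selected

print(f"large sample, p = 0.20: i = {intensity_large_sample(0.20):.2f}")
print(f"best 2 out of 9:        i = {intensity_small_sample(2, 9):.2f}")
```

The small-sample values should agree with Table 11.1 to within rounding in the second decimal place.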
Example 11.2. A comparison of the expected and observed responses
under different intensities of selection was made by Clayton, Morris, and
Robertson (1957), studying abdominal bristle number in Drosophila. The
heritability was first determined by three methods which yielded a
combined estimate of 0.52 (see Example 10.1). The standard deviation of
bristle number (average of the two sexes) was 3.35. Selection at four
different intensities was carried on for five generations, both upward and
downward (i.e. both for increased and for decreased bristle number). In
each case 20 males and 20 females were selected as parents, the intensity
being varied by the number out of which these were selected, as shown in
the first column of the table. The intensities of selection corresponding to
these proportions selected may be read off the graphs in Fig. 11.3. They
are given in the second column of the table. The expected responses are
then found from equation 11.3. Under the most intense selection, for
example, it is $R = 1.40 \times 3.35 \times 0.52 = 2.44$. There were five replicate lines
in both directions under the most intense selection, and three replicates
under the other intensities. The observed responses are quoted in the last
two columns of the table. Although they do not agree very precisely
with expectation, they show how the change made by selection falls off as
the intensity of selection is reduced, and the data serve to illustrate the
computation of the expected response.

                                    Mean response per generation
Proportion        Intensity of                    Observed
selected, p       selection, i      Expected     Up      Down
20/100 = 0.20        1.40             2.44       2.02    1.48
20/75  = 0.267       1.23             2.14       2.20    1.26
20/50  = 0.40        0.97             1.65       1.46    0.79
20/25  = 0.80        0.34             0.59       0.28    0.08
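
The computation worked in this example for the most intense selection can be written as a small Python sketch. The function names are illustrative assumptions. The second function uses the form of equation 11.4, with the standard deviation of breeding values derived from the heritability and the phenotypic standard deviation, to show that the two forms give the same expected response.

```python
from math import isclose, sqrt

H2 = 0.52        # heritability (Example 10.1)
SIGMA_P = 3.35   # phenotypic standard deviation of bristle number
I_MOST_INTENSE = 1.40   # intensity of selection for 20 selected out of 100

def response_from_h2(i, sigma_p, h2):
    """Equation 11.3: R = i * sigma_P * h2."""
    return i * sigma_p * h2

def response_from_sigma_a(i, sigma_p, h2):
    """Equation 11.4: R = i * h * sigma_A, with sigma_A = h * sigma_P."""
    h = sqrt(h2)
    sigma_a = h * sigma_p
    return i * h * sigma_a

r_113 = response_from_h2(I_MOST_INTENSE, SIGMA_P, H2)
r_114 = response_from_sigma_a(I_MOST_INTENSE, SIGMA_P, H2)
print(f"expected response, equation 11.3: {r_113:.2f} bristles per generation")
print(f"expected response, equation 11.4: {r_114:.2f} bristles per generation")
```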
It will now be clear that there are two methods open to the breeder
for improving the rate of response to selection: one by increasing the
heritability and the other by reducing the proportion selected and so
increasing the intensity of selection. The heritability can be increased
only by reducing the environmental variation through attention to the
technique of rearing and management. Reducing the proportion
selected seems at first sight to be a straightforward means of improving
the response, but there are several factors to be considered which
set a limit to what the breeder can do in this way. First is the matter
of population size and inbreeding. This sets a lower limit to the
number of individuals to be used as parents. In experimental work,
for example, one might decide to use not less than 10 or even 20 pairs
of parents; and in livestock improvement, particularly if artificial
insemination came into general use as a means of intense selection on
males, care would have to be taken not to restrict the number of
males too much. For this reason the intensity of selection can be
increased above a certain point only by increasing the total number of
individuals measured, out of which the selection is made. With
organisms that have a high reproductive rate, such as Drosophila and
plants, very large numbers can, in principle, be measured; but in
practice a limit is set to the intensity of selection by the time and
labour required for the measurement. With organisms that have a
low reproductive rate the limit to the intensity of selection is set by
the reproductive rate, since the proportion saved can never be less
than the proportion needed for replacement; that is to say, two
individuals are needed on the average to replace each pair of parents.
Usually fewer males are needed than females, because each male can
mate with several females, and so the males leave more offspring than
the females. A higher intensity of selection can then be made on
males than on females. Suppose, for example, that females leave on
the average 5 offspring, and each male mates with 10 females, so that
males leave on the average 50 offspring. Then the proportion of
females selected cannot be less than 1/5, but only 1/50 of the males
need be selected. The upper limits of the intensity of selection in this
case would be 1.40 for females, and 2.64 for males.
The number of offspring produced by a pair of parents depends
not only on their reproductive rate but also on how long the breeder
is willing to wait before he makes the selection. This introduces a
new factor — the interval of time between generations — which we
have not yet taken into account in the treatment of the response to
selection, and which we must now consider.
Generation interval. The progress per unit of time is usually
more important in practice than the progress per generation, so the
interval between generations is an important factor in reckoning the
response to selection. The generation interval is the interval of time
between corresponding stages of the life cycle in successive generations,
and it is most conveniently reckoned as the average age of the
parents when the offspring are born that are destined to become
parents in the next generation. By waiting until more offspring have
been reared before he makes the selection the breeder can increase the
intensity of selection and the response per generation; but in doing so
he inevitably increases the generation interval and may thereby
reduce the response per unit of time. There is thus a conflict of
interest between intensity of selection and generation interval, and
the best compromise must be found between the two. Increasing the
number of offspring will pay up to a certain point, and beyond this
point it will not. The optimal number of offspring cannot be stated
in general terms, and each case must be worked out according to its
special circumstances. The procedure is explained in the following
example, referring to mice.
Example 11.3. Let us suppose that selection is to be applied to some
character in mice, and that speed of progress per unit of time is the aim.
The question is: how many litters should be raised? To find the number of
litters that will give the maximum speed of progress we have to find the
intensity of selection and the generation interval. The ratio of the two will
then give the relative speed. The actual speed could be obtained by
multiplying by the heritability and the standard deviation, but these factors will
be assumed to be independent of the number of litters raised. A comparison
of the expected rates of progress per week is made in the table. The
comparison is made for three different average sizes of litter, meaning the
number of young reared per litter. It is assumed that the character to be
selected can be measured before sexual maturity, and that first litters are
born when the parents are 9 weeks old, subsequent litters following at
intervals of 4 weeks. It is assumed also that the population is large enough
to be treated as a large sample in reckoning the intensity of selection; and
that equal numbers of males and females are selected. The optimal
number of litters differs according to the number reared per litter. If 6
young are reared the maximum speed is attained by rearing only one
litter. If 4 young are reared it is worth while to wait for second litters
before making the selection, but not for third litters. If only 2 young are
reared per litter, raising three litters gives the maximum speed of progress.
Most mouse stocks are able to rear 6 young per litter, so under most
circumstances it is best to make the selection from the first litters, and not to
wait for second litters. This conclusion could hardly have been guessed at
without the computations shown in the table.

               N = 6                 N = 4                 N = 2
L     t     p      i     i/t      p      i     i/t      p      i     i/t
1     9    .333   1.10   .122    .50    0.80   .089    1.0    0.00   .000
2    13    .167   1.50   .115    .25    1.27   .098    .50    0.80   .062
3    17    .111   1.71   .101    .167   1.50   .088    .333   1.10   .065
4    21    .083   1.85   .088    .125   1.65   .079    .25    1.27   .060

Column headings:
L   = number of litters raised.
t   = generation interval in weeks.
p   = proportion selected.
i   = intensity of selection.
i/t = relative speed of progress.
N   = number of young reared per litter.
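
The comparison in this example can be repeated with a short Python sketch. It rests on assumptions that should be stated plainly: the function names are invented here; the proportion selected is taken as 2/(N x L), which reproduces the p columns of the table; the generation interval follows the t column (9 weeks for one litter and 4 weeks more for each further litter); and the intensity is computed from the large-sample normal formula instead of being read from Fig. 11.3, so the values of i/t may differ slightly in the third decimal place from those tabulated.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def intensity(p):
    """Large-sample intensity of selection, i = z / p; zero if all are kept."""
    if p >= 1.0:
        return 0.0
    return STD_NORMAL.pdf(STD_NORMAL.inv_cdf(1.0 - p)) / p

def generation_interval(litters, first_litter_weeks=9, spacing_weeks=4):
    """Generation interval in weeks, following the t column of the table."""
    return first_litter_weeks + spacing_weeks * (litters - 1)

def relative_speed(litters, young_per_litter):
    """Relative speed of progress per week, i / t."""
    p = 2.0 / (litters * young_per_litter)   # two replacements per pair
    return intensity(p) / generation_interval(litters)

def best_number_of_litters(young_per_litter, max_litters=4):
    """Number of litters (1 to max_litters) giving the greatest i / t."""
    return max(range(1, max_litters + 1),
               key=lambda litters: relative_speed(litters, young_per_litter))

for n_young in (6, 4, 2):
    speeds = [relative_speed(litters, n_young) for litters in range(1, 5)]
    print(f"N = {n_young}:", [f"{s:.3f}" for s in speeds],
          "best number of litters:", best_number_of_litters(n_young))
```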
Measurement of Response
When one or more generations of selection have been made the
measurement of the response actually obtained introduces several
problems. These are matters of procedure rather than of principle
and will be only briefly discussed.
Variability of generation means. The first problem to be
solved arises from the variability of generation means. Inspection of
any of the graphs of selection given in the examples shows that the
generation means do not progress in a simple regular fashion, but
fluctuate erratically and more or less violently. There are two main
causes of this variation between the generation means: sampling
variation, depending on the number of individuals measured; and
environmental change, which is usually the more important of the
two. The consequence of this variation between generation means is
that the response can seldom be measured with any pretence of
accuracy until several generations of selection have been made. The
best measure of the average response per generation is then obtained
from the slope of a regression line fitted to the generation means, the
assumption being made that the true response is constant over the
period. The variation between generation means appears as error
variation about the regression line, and the standard error of the
estimate of response is based on it. Variation due to changes of
environment can, of course, be overcome, or at least reduced, by the
use of a control population. The measurement of the response can,
however, be improved in accuracy if the "control" is not an unselected
population but is selected in the opposite direction. This is
known as a "two-way" selection experiment. The response measured
from the divergence of the two lines is then about twice as great as
that of the lines separately, and the variation between generations is
reduced to the extent that the environmental changes affect both lines
alike. An unselected control is, however, preferable if for practical
reasons one is interested only in the change in one direction, because
the response is not always equal in the two directions. This point will
be discussed in the next chapter.
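
The regression procedure described above can be sketched in Python. The generation means below are hypothetical figures invented for illustration, constructed so that both lines share the same environmental fluctuations; the function name is likewise an assumption. The sketch fits a least-squares line to the generation means of each line and of their divergence, and reports the slope with its standard error.

```python
from math import sqrt

def slope_with_standard_error(x, y):
    """Least-squares slope of y on x and the standard error of the slope."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((yi - intercept - slope * xi) ** 2
                      for xi, yi in zip(x, y))
    standard_error = sqrt(residual_ss / (n - 2) / sxx)
    return slope, standard_error

# Hypothetical generation means (arbitrary units), not data from the text
generations = [0, 1, 2, 3, 4, 5]
up_line = [20.0, 20.9, 20.3, 21.4, 20.8, 21.7]
down_line = [20.0, 20.0, 18.6, 19.1, 17.7, 17.6]
divergence = [u - d for u, d in zip(up_line, down_line)]

results = {
    "up": slope_with_standard_error(generations, up_line),
    "down": slope_with_standard_error(generations, down_line),
    "divergence": slope_with_standard_error(generations, divergence),
}
for name, (slope, se) in results.items():
    print(f"{name}: {slope:+.3f} +/- {se:.3f} per generation")
```

With these invented figures the divergence has a much smaller standard error than either line alone, because the shared fluctuations cancel in the difference.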
Example 11.4. Fig. 11.4 shows the results of 11 generations of two-way
selection for body weight in mice (Falconer, 1953). On the left the "up"
and "down" lines are shown separately, and on the right the divergence
between the two is shown. Linear regression lines are fitted to the observed
generation means. (The first generation of selection is disregarded
because the method of selection was different.) The estimates of the average
response per generation, with their standard errors, are as follows:

              Response ± standard error
              in grams per generation
Up                 0.27 ± 0.050
Down               0.62 ± 0.046
Divergence         0.88 ± 0.036

The difference between the upward and downward responses will be
discussed in the next chapter.

Fig. 11.4. Two-way selection for 6-week weight in mice. (Horizontal
axis: generations.) Explanation in Example 11.4. (Redrawn from
Falconer, 1953.)
The foregoing example shows how the variation of the generation
means can be reduced when the response is measured from the difference
between two lines, each acting in the manner of a control for the
other. Controls, however, are not always available, and then a more
serious difficulty may arise from progressive changes of environment.
This makes it difficult to assess the effectiveness of selection in the
improvement of domesticated animals, and to a lesser extent of plants,
because in the absence of a control there is no sure way of deciding
how much of the improvement is due to selection and how much to a
progressive change in the conditions of management.
Example 11.5. Lush (1950) has assembled a number of graphs showing
the improvement of farm animals that has taken place during the
present century. Instead of reproducing any of these graphs we give in
the table an indication of the increase of yield per individual over a period
of years, as a percentage of the initial yield. It is difficult to avoid the
conclusion that much of the improvement of these characters is the result of
selection, but in the absence of any standard of comparison it is very
difficult to decide how much is due to selection and how much to improved
methods of feeding and management.

Character                  Country        Period       Improvement, %
Cows:
  Milk-yield               Sweden         1920-1944         21
  Butterfat-yield          New Zealand    1910-1940         47
  Fat % in milk            Netherlands    1906-1945         22
Pigs:
  Efficiency of growth     Denmark        1922-1949         16
  Body length              Denmark        1926-1949          5
Sheep:
  Fleece weight            Australia      1881-1945         71
Hens:
  Egg production           U.S.A.         1909-1950         64
Weighting the selection differential. In experimental selection
the selection differential as well as the response has to be measured
because it is the relationship between the two, and not the response
alone, that is of interest from the genetic point of view. We have to
distinguish between the expected and the effective selection differential,
because in practice the individual parents do not contribute
equally to the offspring generation. Differences of fertility are always
present so that some parents contribute more offspring than others.
To obtain a measure of the selection differential that is relevant to
the response observed in the mean of the offspring generation we
therefore have to weight the deviations of the parents according to
the number of their offspring that are measured. The expected
selection differential is the simple mean phenotypic deviation of the
parents as defined at the beginning of this chapter; the effective
selection differential is the weighted mean deviation of the parents,
the weight given to each parent, or pair of parents, being their
proportionate contribution to the individuals that are measured in the
next generation.
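The two definitions can be compared in a short sketch. The function names and the parent data below are hypothetical, chosen only for illustration; each parent (or pair of parents) is represented by its phenotypic value and by the number of its offspring that are measured.

```python
def expected_selection_differential(parent_values, population_mean):
    """Simple (unweighted) mean phenotypic deviation of the selected parents."""
    deviations = [value - population_mean for value in parent_values]
    return sum(deviations) / len(deviations)


def effective_selection_differential(parent_values, offspring_counts, population_mean):
    """Mean parental deviation, weighted by the number of measured offspring."""
    total_offspring = sum(offspring_counts)
    if total_offspring == 0:
        raise ValueError("no measured offspring")
    weighted_sum = sum(
        count * (value - population_mean)
        for value, count in zip(parent_values, offspring_counts)
    )
    return weighted_sum / total_offspring


# Hypothetical data: the most extreme parents leave the fewest offspring.
population_mean = 20.0
parent_values = [24.0, 23.0, 22.0, 21.0]
offspring_counts = [2, 4, 6, 8]

s_expected = expected_selection_differential(parent_values, population_mean)
s_effective = effective_selection_differential(
    parent_values, offspring_counts, population_mean
)
print(s_expected, s_effective)
```

In these invented data the effective differential is smaller than the expected one, which is the pattern that would arise if fertility declined with increasing deviation from the mean.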
The weighting of the selection differential takes account of a good
part of the effects of natural selection. If the differences of fertility
are related to the parents' phenotypic values for the character being
selected, then this natural selection will either help or hinder the
artificial selection. If, for example, the more extreme phenotypes are
less fertile or more frequently sterile, then natural selection is working
against artificial selection. By weighting the selection differential we
measure the joint effects of natural and artificial selection together.
A comparison of the effective (i.e. weighted) with the expected
selection differential may thus be used to discover whether natural
selection is operative.
Example 11.6. In an experiment with mice, selection for body size
(weight at 6 weeks) was carried through 30 generations in the upward
direction and 24 generations in the downward direction (see Falconer,
1955). Comparisons are made in the table between the effective (weighted)
and the expected (unweighted) selection differentials in the two lines. The
period of selection is divided into two parts and the comparisons are made
separately in each. Throughout the whole of the upward selection there
was virtually no difference between the effective and expected selection
differential, and we can conclude that natural selection was unimportant
as a factor influencing the response. The situation in the downward
selected line, however, is different, the effective selection differential being
less than the expected, especially in the second part. From this we can
conclude that natural selection was operating in favour of large size, thus
hindering the artificial selection and reducing the response obtained,
particularly in the latter part of the experiment. The cause of the natural
selection and the reason why it operated only in the downward selected
line were as follows. Large mice produce larger litters than small mice; but
for the purpose of standardisation, litters were artificially reduced to 8
young at birth. At the beginning, and throughout the whole period in the
upward selected line, there were few litters with less than 8 young, and so
the differential fertility had no consequence in the upward selected line.
In the downward selected line, however, there was soon no standardisation
because there were few litters with as many as 8 young. Thus the smaller
mice produced fewer young and this reduced the effective selection
differential. In the second part of the experiment the smallest mice did not
breed at all and this reduced the effective selection differential still further.

Direction of    Generation    Selection differential per generation (gms.)
selection       numbers       Expected    Effective    Effective/Expected
Upwards         1-22          1.39        1.36         0.98
                23-30         1.08        1.09         1.01
Downwards       1-18          1.03        0.96         0.93
                19-24         0.82        0.70         0.86

The weighting of the selection differential does not take account
of the whole effect of natural selection. We noted at the beginning of
the chapter that natural selection may operate at two stages, through
differences of fertility among the parents and through differences of
viability among the offspring. The effects of differences of viability
among the offspring are not accounted for in the effective selection
differential. For further examples and a fuller account of the
interaction of natural and artificial selection see Lerner (1954, 1958).
Realised heritability. The equation of response, $R = h^2 S$ (11.2),
which we discussed earlier from the point of view of predicting the
response, can be looked at the other way round, as a means of
estimating the heritability from the result of selection already carried
out, the heritability being estimated as the ratio of response to
selection differential:
$$h^2 = \frac{R}{S} \tag{11.5}$$
The same conditions are necessary for the valid use of the equation
for estimating heritability as for predicting response, except that now
by weighting the selection differential a good part of the effects of
natural selection can be taken account of. There is also the condition
that the observed response should not be confounded with systematic
changes of generation mean due to the environment or the effects of
inbreeding. This, and the absence of maternal effects, are the
important conditions for the valid estimation of heritability from the
response to selection.
The ratio of response to selection differential, however, has an
intrinsic interest of its own, quite apart from whether it provides a
valid estimate of the heritability. It provides the most useful
empirical description of the effectiveness of selection, which allows
comparison of different experiments to be made even when the intensity
of selection is not the same. The term realised heritability will be used
to denote the ratio R/S, irrespective of its validity as a measure of the
true heritability. The realised heritability is estimated as follows.
The generation means are plotted against the cumulated selection
differential. That is to say, the selection differentials, appropriately
weighted, are summed over successive generations so as to give the
total selection applied up to the generation in question. A regression
line is then fitted to the points and the slope of this line measures the
average value of R/S, the realised heritability.
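The procedure just described can be sketched as follows. This is a minimal illustration with invented generation means and selection differentials, not data from any experiment; the function name is hypothetical, and an ordinary least-squares slope is assumed for the fitted regression line.

```python
from itertools import accumulate


def realised_heritability(generation_means, selection_differentials):
    """Slope of generation mean on cumulated selection differential.

    generation_means: means of generations 0, 1, ..., n
    selection_differentials: (weighted) differentials applied to the
        parents chosen in generations 0, 1, ..., n - 1
    """
    if len(generation_means) != len(selection_differentials) + 1:
        raise ValueError("need one more generation mean than differentials")
    # Cumulated selection differential applied up to each generation.
    x = [0.0] + list(accumulate(selection_differentials))
    y = list(generation_means)
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sum_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sum_xx = sum((xi - mean_x) ** 2 for xi in x)
    return sum_xy / sum_xx


# Hypothetical data: the means rise by 0.3 units per unit of selection applied.
means = [20.0, 20.36, 20.6, 20.9, 21.2]
differentials = [1.2, 0.8, 1.0, 1.0]
h2_realised = realised_heritability(means, differentials)
print(round(h2_realised, 3))
```

Because the slope is taken against the cumulated differential and not against generation number, generations with unequal selection differentials do not distort the estimate.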
Example 11.7. Fig. 11.5 shows the results of 21 and 18 generations
of two-way selection for 6-week weight in mice (Falconer, 1954a). The
generation means are plotted against the cumulated selection differential
and linear regression lines are fitted to the points. The realised
heritabilities, estimated from the slopes of these lines, are:
Upward selection:   0.175 ± 0.016
Downward selection: 0.518 ± 0.023
The difference between the upward and downward selection is referred to
in the next chapter.
Fig. 11.5. Two-way selection for 6-week weight in mice. Response
plotted against cumulated selection differential, as explained
in Example 11.7. (From Falconer, 1954a; reproduced by courtesy
of the editor of the International Union of Biological Sciences.)
Change of Gene Frequency under Artificial Selection
It was pointed out at the beginning of this chapter that the change
of the population mean resulting from selection is brought about
through changes of the gene frequencies at the loci which influence
the character selected. But since the effects of the loci cannot be
individually identified, the changes of gene frequency cannot in
practice be followed. Consequently the process of selection for a
metric character had to be described in terms of the selection
differential, or the intensity of selection, and of the change of the
population mean, representing the combined effects of all the loci. This
leaves unanswered the fundamental question: How great are the
changes of gene frequency underlying the response of a metric
character to selection? To answer this question, and so to bridge the
gap between the treatment of selection given in this chapter and that
given earlier in Chapter 2, we have to find the connexion between the
intensity of selection (i) and the coefficient of selection (s) operating
on a particular locus.
The effect of selection for a metric character on one of the loci
concerned may best be pictured in the manner illustrated in Fig.
11.6. This refers to a locus with two alleles of which one ($A_1$) is
completely dominant. With respect to this locus, therefore, the
population is divided into two portions which differ in their mean
phenotypic values by an amount $2a$, this being the difference between the
two homozygotes in the notation of earlier chapters (see Fig. 7.1,
p. 113). It is assumed that the residual variance within each portion is
the same, this residual variance arising from all the other loci as well
as from environmental causes. The proportion of individuals in the
two portions depends on the gene frequency at the locus, $q^2$ being in
the portion consisting of $A_2A_2$ genotypes, and $1 - q^2$ in the portion
containing $A_1A_1$ and $A_1A_2$ genotypes. When artificial selection is
applied, a proportion of the whole population lying beyond the point
of truncation is cut off, and the proportion of $A_2A_2$ genotypes is lower
among this selected group than in the population as a whole,
selection acting in the case illustrated against the $A_2$ allele. Now, the new
gene frequency, $q_1$, is the frequency of $A_2$ genes among the selected
group of individuals. This may be found by deducing the regression
of gene frequency on phenotypic value, $b_{qP}$. The selected group
deviates in mean phenotypic value from the population mean by an
amount $S$, which is the selection differential. The gene frequency
among the selected group will then be given by the regression equation
$$q_1 = q + b_{qP} S \tag{11.6}$$
Fig. 11.6. Selection for a metric character operating on one of
the loci concerned. The frequency of $A_2A_2$ as depicted is $q^2$ = I.
The regression of gene frequency on phenotypic value is found as
follows. The three genotypes are listed in Table 11.2 with their
frequencies in the whole population. The third column of the table
gives the frequency of the $A_2$ allele among each of the three
genotypes, which is simply 0, 1/2, and 1. The last column gives the
genotypic values. Provided there is no correlation between genotype and
environment, these are also the mean phenotypic values of each
genotype. There is now no assumption of complete dominance.

Table 11.2
Genotype     Frequency    q      G
$A_1A_1$     $p^2$        0      $+a$
$A_1A_2$     $2pq$        1/2    $d$
$A_2A_2$     $q^2$        1      $-a$

The covariance of gene frequency with phenotypic value is obtained
from the sum of the products of $q$ and $P$, each multiplied by the
frequency of the genotype. From this sum of products must be
deducted the product of the means of the gene frequency and the
phenotypic value. Thus the covariance is $cov_{qP} = pqd - q^2a - qM$,
where $M$ is the population mean. Substituting the value of $M$ from
equation 7.2, the covariance reduces to $-pq[a + d(q - p)] = -pq\alpha$,
where $\alpha$ is the average effect of the gene substitution (see equation 7.5).
The regression of gene frequency on phenotypic value is therefore
$$b_{qP} = -\frac{pq\alpha}{\sigma_P^2} \tag{11.7}$$
where $\sigma_P^2$ is the phenotypic variance.
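The reduction of the covariance can be written out step by step. The following is a sketch of the algebra only, assuming the genotypic values $+a$, $d$ and $-a$ of the earlier notation, the population mean $M = a(p - q) + 2dpq$ of equation 7.2, and $p + q = 1$:

```latex
\begin{aligned}
cov_{qP} &= p^2(0)(a) + 2pq(\tfrac{1}{2})(d) + q^2(1)(-a) - qM \\
         &= pqd - q^2a - q[a(p - q) + 2dpq] \\
         &= pqd - apq - 2dpq^2 \\
         &= -pq[a - d(1 - 2q)] \\
         &= -pq[a + d(q - p)] = -pq\alpha
\end{aligned}
```

The last step uses $1 - 2q = p - q$.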
Next, we substitute this regression coefficient in equation 11.6,
putting also $S = i\sigma_P$ from equation 11.5. This gives the gene frequency
among the selected parents as
$$q_1 = q - ipq\frac{\alpha}{\sigma_P}$$
and the change of gene frequency resulting from the selection
reduces to
$$\Delta q = -ipq\frac{\alpha}{\sigma_P} \tag{11.8}$$
The change is negative because selection is acting against the allele
$A_2$ whose frequency is $q$. This formula enables us to translate the
intensity of selection, $i$, into the coefficient of selection, $s$, against $A_2$,
because equations for the change of gene frequency in terms of $s$ were
given in Chapter 2. We shall take the approximate equations given
in 2.7 and 2.8. If dominance is complete, $d = a$ and $\alpha = 2qa$. Then
equating 11.8 with 2.8 gives
$$-ipq\frac{2qa}{\sigma_P} = -sq^2(1 - q).$$
If there is no dominance $d = 0$ and $\alpha = a$. Then equating 11.8 with
2.7 gives
$$-ipq\frac{a}{\sigma_P} = -\tfrac{1}{2}sq(1 - q).$$
Both these equations, on simplification, reduce to
$$s = i\frac{2a}{\sigma_P} \tag{11.9}$$
Thus we find that the two ways of expressing the "force" of selection,
by the intensity and the coefficient of selection, are very simply
related to each other. The coefficient of selection operating on any
locus is directly proportional to the intensity of selection and to the
quantity $2a/\sigma_P$. This quantity is the difference of value between the two
homozygotes expressed in terms of the phenotypic standard deviation.
For want of a more suitable term we shall refer to this, rather loosely,
as the "proportionate effect" of the locus. There is nothing more
that we can do with the relationship expressed in equation 11.9 at
the moment, but we shall use it in the next chapter to draw some
tentative conclusions about the "proportionate effects" of loci
concerned with metric characters.
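The relationship between the intensity and the coefficient of selection can be checked numerically. The sketch below is illustrative only and rests on assumptions that go beyond the text: phenotypic values within each genotype are taken to be normally distributed with a common residual standard deviation (`sigma_e`), selection is by truncation for high values, the intensity is computed as the ordinate at the truncation point divided by the proportion selected, and the locus contributes variance $2pq\alpha^2 + (2pqd)^2$. All function names and numerical values are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def locus_summary(q, a, d, sigma_e, proportion_selected):
    """Return (alpha, sigma_P, i) for one locus under the stated assumptions."""
    p = 1.0 - q
    alpha = a + d * (q - p)  # average effect of the gene substitution
    var_locus = 2 * p * q * alpha ** 2 + (2 * p * q * d) ** 2
    sigma_p = (sigma_e ** 2 + var_locus) ** 0.5
    # Intensity of selection under truncation of a normal distribution.
    x = STD_NORMAL.inv_cdf(1.0 - proportion_selected)
    i = STD_NORMAL.pdf(x) / proportion_selected
    return alpha, sigma_p, i


def approx_delta_q(q, a, d, sigma_e, proportion_selected):
    """Change of gene frequency from equation 11.8."""
    alpha, sigma_p, i = locus_summary(q, a, d, sigma_e, proportion_selected)
    return -i * (1.0 - q) * q * alpha / sigma_p


def exact_delta_q(q, a, d, sigma_e, proportion_selected):
    """Change of gene frequency found by truncating the three genotype classes."""
    p = 1.0 - q
    # (frequency, mean value, frequency of A2 within the genotype)
    genotypes = [(p * p, a, 0.0), (2 * p * q, d, 0.5), (q * q, -a, 1.0)]

    def selected(threshold):
        return [
            f * (1.0 - STD_NORMAL.cdf((threshold - m) / sigma_e))
            for f, m, _ in genotypes
        ]

    # Bisection for the truncation point giving the required proportion.
    reach = max(abs(a), abs(d)) + 10.0 * sigma_e
    low, high = -reach, reach
    for _ in range(100):
        mid = 0.5 * (low + high)
        if sum(selected(mid)) > proportion_selected:
            low = mid
        else:
            high = mid
    kept = selected(0.5 * (low + high))
    q1 = sum(k * g for k, (_, _, g) in zip(kept, genotypes)) / sum(kept)
    return q1 - q


q, a, sigma_e, proportion = 0.3, 0.05, 1.0, 0.2
for d in (0.0, a):  # no dominance, then complete dominance
    alpha, sigma_p, i = locus_summary(q, a, d, sigma_e, proportion)
    s = i * 2 * a / sigma_p  # equation 11.9
    chapter_2 = -s * q ** 2 * (1 - q) if d == a else -0.5 * s * q * (1 - q)
    print(d, approx_delta_q(q, a, d, sigma_e, proportion), chapter_2,
          exact_delta_q(q, a, d, sigma_e, proportion))
```

With a small proportionate effect such as the one assumed here, the three values printed for each case should lie close together; the approximation would be expected to deteriorate as $2a/\sigma_P$ becomes large.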
CHAPTER 12
SELECTION:
II. The Results of Experiments
In the last chapter we saw that the theoretical deductions about the
effects of artificial selection are limited to the change of the
population mean, and strictly speaking over only one generation. By
changing the gene frequencies selection changes the genetic properties of
the population upon which the effects of further selection depend.
And, because the effects of the individual loci are unknown, the
changes of gene frequency cannot be predicted, and so the response
to selection can be predicted only for as long as the genetic properties
remain substantially unchanged. Thus there are many consequences
of selection that can be discovered only by experiment. The object of
this chapter is to describe briefly what seem to be the most general
conclusions about these consequences that have emerged from
experimental studies of selection. It should be noted, however, that
the drawing of conclusions from the results of experiments in the
field of quantitative genetics is to some extent a matter of personal
judgement. Many of the conclusions put forward in this chapter
therefore represent a personal viewpoint, and are not necessarily
accepted generally. The most important questions to be answered
by experiment concern the long-term effects of selection. For how
long does the response continue? By how much can the population
mean ultimately be changed ? What is the genetic nature of the limit
to further progress? These questions will be dealt with in the latter
part of the chapter. First we shall consider two questions raised by
the examples in the last chapter.
Repeatability of Response
In Example 11.1 we saw that the response in one generation of
selection was very variable when the selection was replicated in a
number of lines. Though the average response agreed fairly well
with the prediction, the responses of the individual lines did not.
This raises the question: How consistent, or repeatable, are the
results of selection ? If selection is applied to different samples drawn
from the same population, how closely will the results agree? Part
of the problem here concerns sampling variation — the extent to
which the samples differ in gene frequencies, both initially and
during the course of the continued selection. This depends, of course,
on the size of the populations, or lines, during the course of the
selection; but it depends also on the initial gene frequencies in the
base population from which the samples were drawn. If most of the
loci concerned with the character have genes at more or less
intermediate frequencies then the response to selection is not likely to be
much influenced by sampling variation. On the other hand, if there
are loci with genes at low frequency then these will be included in
some samples drawn from the initial population but will be absent
from others. Then, if any of these low-frequency genes have a fairly
large effect on the character their presence or absence may appreciably
influence the outcome of selection. The experiment on abdominal
bristle-number in Drosophila whose first generation was quoted in
Example 11.1, provides the only evidence on this point (Clayton,
Morris, and Robertson, 1957). Fig. 12.1 shows the responses in the
five up and the five down lines over 20 generations. The responses
are reasonably consistent over the first 5 generations in the up lines
and over about 10 generations in the down lines. Thereafter the
lines begin to differentiate, and by the twentieth generation there are
substantial differences between them. The conclusion suggested by
the early similarity and the later divergence between the replicate
lines is that the early response is governed chiefly by genes at more or
less intermediate frequencies, but in the later stages genes at initially
low frequencies begin to come into play, the initial sampling having
caused differences between the lines in respect of these genes.
The question of repeatability of the response to selection may be
extended to differences between populations. This is not a matter of
sampling variation but of the differences in the genetic properties of
populations. We noted in Chapter 10 that heritabilities frequently
differ between populations, and consequently we should not expect
the responses to selection to be the same. It is of interest nevertheless
to compare the results of selection applied to different populations
and to see how they do actually differ. Fig. 12.2 shows the results of
selection for thorax length in Drosophila melanogaster applied to three
different wild populations (F. W. Robertson, 1955). The responses
of the three populations, both upward and downward, are fairly alike.
It is not possible to discuss further the degree of repeatability
between the responses found in these two experiments, because there
is no objective criterion for deciding how closely the responses ought
to agree. One can therefore only regard them as empirical evidence
of what in practice does occur.
Fig. 12.1. Selection for abdominal bristle number in Drosophila
melanogaster, replicated in 5 lines in each direction. The broken
lines refer to suspended selection and the thin continuous lines to
inbreeding without selection. (From Clayton, Morris, and Robertson,
1957; reproduced by courtesy of the authors and the editor of
the Journal of Genetics.)
Fig. 12.2. Selection for thorax length in Drosophila melanogaster
from three different base populations. [Panels: Ischia, Renfrew,
Crianlarich; each shows the large and small lines and the control
level plotted against generations, 0 to 20.] The broken lines refer to
reversed selection and the dotted lines to suspended selection.
(From F. W. Robertson, 1955; reproduced by courtesy of the
author and the editor of the Cold Spring Harbor Symposia on
Quantitative Biology.)
Asymmetry of Response
A surprising feature of the experimental results illustrated in the
last chapter is the inequality of the responses to selection in opposite
directions, seen particularly well in Fig. 11.5. This asymmetry of
response has been found in many two-way selection experiments,
but its cause is not yet known. For this reason we shall not discuss
the phenomenon in detail, but shall merely note the possible causes,
of which there are several. These possible causes are, briefly, as
follows.
1. Selection differential. The selection differential may differ
between the upward and downward selected lines, for several reasons,
(i) Natural selection may aid artificial selection in one direction or
hinder it in the other, (ii) The fertility may change so that a higher
intensity of selection is achieved in one direction than in the other,
(iii) The variance may change as a result of the change of mean: the
selection differential will increase as the variance increases and
decrease as it decreases. This is a "scale effect," to be discussed more
fully in Chapter 17. These three causes operating through the
selection differential were all found in the experiment with mice cited in
the last chapter, but they operated in the direction opposite to that of
the asymmetry found. The selection differential was greater in the
upward selection but the response was greater in the downward
selection. Differences of the selection differential influence the
response per generation, but they affect the realised heritability only a
little. Therefore if the response is plotted against the cumulated
selection differential and there is still much asymmetry, as in Fig.
11.5, it cannot be attributed to any cause operating through the
selection differential.
2. "Genetic asymmetry." There are two sorts of asymmetry
in the genetic properties of the initial population that could give rise
to asymmetry of the responses to selection (Falconer, 1954a). These
concern the dominance and the gene frequencies of the loci concerned
with the character. The dominant alleles at each locus may be mostly
those that affect the character in one direction, instead of being more
or less equally distributed between those that increase and those that
decrease it. We shall refer to this situation as directional dominance.
If the initial gene frequencies were about 0.5, the response would be
expected to be greater in the direction in which the alleles tend to be
recessive. It will be shown in Chapter 14 that this is also the direction
in which the mean is expected to change on inbreeding. Therefore
we should, in general, expect characters that show inbreeding
depression to respond more rapidly to downward selection than to
upward selection. There may also be asymmetry in the distribution of
gene frequencies. The more frequent alleles at each locus may be
mostly those that affect the character in one direction — a situation
that we shall refer to as directional gene frequencies. In the absence of
directional dominance this would be expected to cause a more rapid
response to selection in the direction of the less frequent alleles.
Under natural selection the less favourable alleles, in respect to
fitness, will have been brought to lower frequencies. Therefore if
selection in one direction reduces fitness more than selection in the
other, we should expect a more rapid response in the direction of the
greater loss of fitness. The asymmetry of the response to selection
theoretically expected from these two causes may be seen by
consideration of Fig. 2.3, which shows the expected response arising
from one locus. Neither of these two causes, directional dominance
and directional gene frequencies, would, however, be expected to
give rise to immediate asymmetry; that is, in the first few generations
of selection. The asymmetry would appear only as the gene
frequencies in the upward and downward selected lines become
differentiated. The asymmetry found in some experiments undoubtedly
appears sooner than would be expected from these causes.
3. Selection for heterozygotes. If selection in one direction
favours heterozygotes at many loci, or at a few loci with important
effects, the response would become slow as the gene frequencies
approach their equilibrium values. But the response in the other
direction would be rapid until the favoured alleles approach fixation. This
situation, which is a form of directional dominance, would also be
expected to give rise to an asymmetrical response (Lerner, 1954); but,
again, not immediately.
4. Inbreeding depression. Most experiments on selection are
made with populations not very large in size, and there is usually
therefore an appreciable amount of inbreeding during the progress
of the selection. If the character selected is one subject to inbreeding
depression, there will be a tendency for the mean to decline through
inbreeding. This will reduce the rate of response in the upward
direction and increase it in the downward direction, thus giving rise
to asymmetry. An unselected control population will reveal how
much asymmetry can be attributed to this cause. Inbreeding
depression has been shown to be an insufficient cause of the asymmetry
in the experiments cited in the last chapter.
5. Maternal effects. Characters complicated by a maternal effect
may show an asymmetry of response associated with the maternal
component of the character. The situation envisaged may best
be explained by reference to the selection for body weight in
mice (Falconer, 1955), which showed the strong asymmetry
illustrated in Fig. 11.5. The character selected, 6-week weight, may be
divided into two components, weaning weight and post-weaning
growth, the former being maternally determined. It was found that
all the asymmetry resided in the weaning weight and none in the
post-weaning growth. The weaning weight increased hardly at all
in the large line but decreased very much in the small line. Thus it
was the mothering ability that changed asymmetrically under
selection and not the growth of the young themselves. To attribute an
asymmetrical response to maternal effects does not, however, solve
the problem, because the asymmetry has merely been shifted from
the character selected to another, and is still just as much in need of
an explanation.
These, then, are the possible causes of asymmetry that may be
suggested. There are probably others. Until the causes of
asymmetry are better understood it is clear that predictions of the rate of
response to selection must be made with caution. Where there is
asymmetry of response the mean of the realised heritabilities in the
two directions will presumably correspond with the heritability
estimated from the resemblance between relatives. Therefore the
response predicted will presumably be about the mean of the
two-way responses actually obtained. If the asymmetry found in the
mouse experiment should prove to be characteristic of selection for
economically desirable characters in mammals, it means that we must
expect actual progress to fall short of the predicted progress. In this
experiment the mean realised heritability was 35 per cent, but the
upward progress was only at the rate of 18 per cent. In other words
the progress made was only about half as rapid as would, presumably,
have been predicted.
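The arithmetic behind "about half" can be written out from the equation of response, $R = h^2 S$, applied to the same selection differential $S$. This is a sketch using only the two percentages quoted above:

```latex
\frac{R_{\text{obtained}}}{R_{\text{predicted}}}
  = \frac{0.18\,S}{0.35\,S} \approx 0.51
```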
Long-term Results of Selection
The response to selection cannot be expected to continue
indefinitely. Sooner or later it is to be expected that all the favourable
alleles originally segregating will be brought to fixation. As they
approach fixation the genetic variance should decline and the rate of
response diminish, till, when fixation is complete, the response should
cease. The population should then fail also to respond to selection in
the opposite direction, and further response to selection in either
direction will depend on the origin of new genetic variation by
mutation. But how many generations must elapse before the response
ceases, and how great will be the total response are questions that can
be answered only by experiment. Let us first see what evidence is
available on these points, and then see how far the long-term effects
of selection conform to the simple theoretical picture outlined above.
Total response and duration of response. When the response
to selection has ceased, the population is said to be at the selection
limit. It is usually impossible to decide exactly at what point the
limit is reached, because the limit is approached gradually, the
response becoming progressively slower. The total response, and
particularly the duration of the response, can therefore be estimated only
approximately. Bearing this in mind, we may examine the results of
four two-way selection experiments, two with Drosophila and two
with mice, given in Table 12.1. The asymmetry of the responses is
disregarded, and the total response is taken as the sum of the total
responses in the two directions. This is the difference between the
upper and lower selection limits, and may be called the total range. In
the table the total range is expressed in three ways: as a percentage of
the initial population mean, $M$; in terms of the phenotypic standard
deviation, $\sigma_P$, in the initial population; and in terms of the standard
deviation of breeding values, $\sigma_A$ (i.e. the square root of the additive
variance) in the initial population. To draw general conclusions from
these four experiments would be rash, because the experiments
differed in several ways (in the intensity of selection, the population
size, and the nature of the initial population), all of which would be
expected to affect the duration of response and the total range.
Despite these differences, however, the picture they give is fairly
consistent. The response continues for about 20 to 30 generations;
and the total range is between 15 and 30 times the square root of the
additive variance, or about 10 to 20 times the phenotypic standard
deviation in the initial population. The relationship between the
total range and the original population mean, however, is quite
irregular.

Table 12.1
Total Responses in four Selection Experiments
Experiment                   Duration         Total range
                             (generations)    $R/M$ (%)   $R/\sigma_P$   $R/\sigma_A$
Drosophila:
  (1) abdominal bristles     30               189         20             28
  (2) thorax length          20               24          12             22
Mice:
  (3) 6-week weight          25               69          8              16
  (4) 60-day weight          20               122         10             21
References:
(1) Clayton and Robertson (1957).
(2) F. W. Robertson (1955).
(3) Falconer (1955).
(4) MacArthur (1949); Butler (1952).

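The last two columns of Table 12.1 express the same total range in two different units, so their ratio gives $\sigma_A/\sigma_P$ and hence a rough value of the heritability in the initial population, since $h^2 = \sigma_A^2/\sigma_P^2$. The sketch below is a consistency check that is not made in the text; the function and variable names are hypothetical, and because the tabulated values are rounded the results are only approximate.

```python
# Total range from Table 12.1, as (R / sigma_P, R / sigma_A).
TABLE_12_1 = {
    "abdominal bristles": (20, 28),
    "thorax length": (12, 22),
    "6-week weight": (8, 16),
    "60-day weight": (10, 21),
}


def implied_heritability(range_in_sigma_p, range_in_sigma_a):
    """h^2 = (sigma_A / sigma_P)^2, recovered from the two scaled ranges."""
    # (R / sigma_P) / (R / sigma_A) = sigma_A / sigma_P
    sigma_a_over_sigma_p = range_in_sigma_p / range_in_sigma_a
    return sigma_a_over_sigma_p ** 2


implied = {
    name: implied_heritability(*ranges) for name, ranges in TABLE_12_1.items()
}
for name, h2 in implied.items():
    print(f"{name}: implied heritability about {h2:.2f}")
```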
The total response produced by selection in these experiments,
though it may be impressive when reckoned in terms of the variation
present in the original population, is not at all spectacular when
compared with the achievements of the breeders of domestic animals.
For example, the upper limits of body weight of the mice in the
experiments quoted are 2 to 3 times the lower limits; but the weights
of the largest breeds of dog are about 75 times greater than those of
the smallest (Sierts-Roth, 1953). The reason for the disappointing
results of experimental selection when viewed against the differences
between the breeds of domestic animals is that experiments are
carried out with closed populations of not very large size. The limits
are set by the gene content of the foundation individuals, since no
genes are brought in after selection has been started. The breeder of
domestic animals, in contrast, by intermittent crossing casts his net
far wider in the search for genes favourable to his purposes.
The effects of inbreeding during the selection have been ignored
in this account of selection limits. It is clear on theoretical grounds
that inbreeding will tend to cause fixation of unfavourable alleles at
some loci. Both the total response and the duration of the response
must therefore be expected to be reduced if the selection is carried
out in a small population with a fairly high rate of inbreeding. There
is, however, little experimental evidence on the magnitude of this
effect of inbreeding. The four experiments discussed above were all
carried out on fairly large populations, so that the rate of inbreeding
was fairly low.
Number of "loci." When the total range has been determined
by experiment it is possible, in principle, to deduce the number of
loci that gave rise to the response, and the magnitude of their effects.
The estimates that can be made in practice, however, are only rough
ones, because the properties of the individual loci are unknown and
have to be guessed at. But even though we can do no more than
establish the order of magnitude of the number and effects of the loci,
this is better than no estimate at all; so let us see how these estimates
may be obtained. The limitations will become apparent as we
proceed.
The estimates come from a comparison of the total range with
the amount of additive genetic variance in the original population. In
principle it is clear that with a given amount of initial variation a
small number of genes will produce less total response than a larger
number; and that if a given amount of variation is produced by few
genes the magnitude of their effects must be greater than if it is
produced by many. It is clear, also, that linkage is an important factor
in the relationship between variance and total response. Some
segments of chromosome that segregate as units in the initial
population will recombine during the selection and appear as many genes
contributing to the total response. Other segments may fail to
recombine and will be counted as single genes. In order to emphasise
this limitation, the estimate of the number of loci may be referred to
as the number of "effective factors" or as the "segregation index."
There are, however, other uncertainties, and we shall simply refer to
it as the number of "loci," letting the inverted commas serve to
remind us of the unavoidable limitations and qualifications.
We must first suppose that there has been no inbreeding and when
the selection limits have been reached all loci are fixed for the
favourable allele. The total range is then $2\sum a$, where $2a$ is the
difference of genotypic value between the two most extreme homozygotes
at a particular locus, and is the precise meaning of what we have loosely
called the "effect" of the locus. If $R$ is the total range and $n$ is the
number of loci that have contributed to the response, then

    $R = 2n\bar{a}$    (12.1)

where $\bar{a}$ is the mean value of $a$. Next we must suppose that each locus
has only two alleles. The additive variance arising from one locus is
then $\sigma_A^2 = 2pq[a + d(q - p)]^2$, from equation 8.5. (We shall use
$\sigma^2$ here to denote variance instead of $V$, because it simplifies the
formulation when standard deviations are involved.) The gene frequencies
at the individual loci thus enter the picture. Unless the initial population
was made from crosses between inbred lines, the gene frequencies
are not known and we shall therefore have to insert hypothetical
values. We shall suppose that all segregating genes are at frequencies
of 0.5, as they would be if the initial population were made from a
cross between two inbred lines. The additive variance contributed by
one locus then becomes $\sigma_A^2 = \tfrac{1}{2}a^2$, and the degree of dominance
becomes irrelevant. Next we have to suppose there is no linkage
between the loci, so that the additive variance due to all $n$ loci together is

    $\sigma_A^2 = \tfrac{1}{2}n\overline{(a^2)}$    (12.2)

where $\overline{(a^2)}$ is the mean of the squares of $a$ for each locus. Finally we
shall suppose that all loci have equal effects, so that equations 12.1
and 12.2 become

    $R = 2na$    (12.3)

and

    $\sigma_A^2 = \tfrac{1}{2}na^2$    (12.4)

Squaring equation 12.3 and substituting $a^2 = (2/n)\sigma_A^2$ from equation
12.4 gives $R^2 = 8n\sigma_A^2$, whence

    $n = \dfrac{R^2}{8\sigma_A^2}$    (12.5)

This equation gives the basis for estimating the number of "loci."
Their effects may then be estimated from equation 12.4. The most
meaningful measure of the "effect" of a locus, however, is what we
have earlier called the "proportionate effect," $2a/\sigma_P$, which is the
difference between the homozygotes expressed in terms of the
phenotypic standard deviation. By rearrangement of equation 12.4 this
becomes

    $\dfrac{2a}{\sigma_P} = 2h\sqrt{\dfrac{2}{n}}$    (12.6)

where $h$ is the square root of the heritability.
Let us see what results these theoretical deductions yield when
applied to the experiments quoted in Table 12.1. The estimates of
the number of "loci" and of the proportionate effects of the genes are
given in Table 12.2.

Table 12.2

Experiment                    Number of "loci"    Proportionate effect ($2a/\sigma_P$)
Drosophila:
  (1) abdominal bristles            99                    0.21
  (2) thorax length                 59                    0.20
Mice:
  (3) 6-week weight                 35                    0.23
  (4) 60-day weight                 53                    0.19

(For references to experiments see Table 12.1.)

Since the estimation of the number of "loci" is
necessarily so imprecise it does not seem worth while to discuss in
detail its limitations or the errors that may have been introduced by
the assumptions that were made. These matters are discussed by
Wright (1952b). The results given in Table 12.2, then, suggest that
the responses to selection in these experiments have resulted from
about 100 loci (i.e. more nearly 100 than 10 or 1,000); and that on the
average the difference in value between homozygotes at one locus
amounts to about one-fifth of the phenotypic standard deviation.
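The arithmetic of equations 12.5 and 12.6 can be sketched in a few lines of Python. This is only an illustration under the assumptions already stated (no inbreeding, two alleles per locus at frequencies of 0.5, no linkage, equal effects). The function names and the numerical inputs are hypothetical and are not taken from Table 12.1.

```python
import math


def number_of_loci(total_range, additive_variance):
    """Equation 12.5: n = R^2 / (8 * sigma_A^2)."""
    return total_range ** 2 / (8.0 * additive_variance)


def proportionate_effect(heritability, n_loci):
    """Equation 12.6: 2a / sigma_P = 2 * h * sqrt(2 / n)."""
    h = math.sqrt(heritability)
    return 2.0 * h * math.sqrt(2.0 / n_loci)


# Hypothetical inputs, chosen only for illustration.
R = 20.0       # total range between the upper and lower limits
var_A = 0.5    # additive variance in the base population (same units squared)
h2 = 0.5       # heritability in the base population

n = number_of_loci(R, var_A)
effect = proportionate_effect(h2, n)
print(n, effect)
```

With these made-up inputs the estimate is 100 "loci", each with a proportionate effect of about one-fifth of a phenotypic standard deviation, which is the order of magnitude suggested by Table 12.2.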
Nature of the selection limit. The deductions made in the last
section from the observed total response were based on the
assumption that the selection limit represents fixation of all favourable
alleles. The simple theoretical expectation is that selection should
lead to fixation with the consequent loss of genetic variance. Let us
now consider the evidence from experiments about the nature of the
selection limit and see how far it conforms to this simple theoretical
picture. If the genetic variance declines as the limit is approached
this ought to be apparent in a decline of phenotypic variance. In
many experiments, however, the phenotypic variance has been found
not to decline, even when the selection limit has been reached, and
when due allowance for "scale effects" has been made as will be
explained in Chapter 17. A fairly typical example is provided by the
experiment with mice which was described in the last chapter (Fig.
11.5). The phenotypic variance is shown in Fig. 12.3, expressed in
the form of the coefficient of variation in order to eliminate scale
effects.

Fig. 12.3. Coefficient of variation of 6-week weight in mice, plotted
against generations. The thin continuous line starting at generation 23
refers to the unselected control. The broken lines refer to reversed
selection and the dotted lines to suspended selection. (From Falconer,
1955; reproduced by courtesy of the editor of the Cold Spring Harbor
Symposia on Quantitative Biology.)

The variance in the large line remains at the same level
throughout the experiment, and after the limit has been reached at
about the twenty-fifth generation a comparison with the unselected
control shows the variance not to have declined at all. The variance
in the small line shows a sudden and large increase, but we shall
return to this point later. An example from Drosophila is provided
by the experiment on abdominal bristle number illustrated in Fig.
12.1. The phenotypic variance in the base population and in the most
extreme of the high and of the low lines after 35 and 34 generations
respectively is illustrated by frequency distributions in Fig. 12.4. In
this case the variance not only failed to decline but increased very
much during the selection in both directions.

Fig. 12.4. Frequency distributions of abdominal bristle number
in Drosophila melanogaster (females), in the base population and in
the most extreme high and low lines after 35 and 34 generations
of selection. (From Clayton, Morris, and Robertson, 1957;
reproduced by courtesy of the authors and the editor of the Journal
of Genetics.)

Before we consider the
reasons for this behaviour of the variance we shall mention another
fact often found in selection experiments. It is that when the response
to continued selection has ceased the population will often respond
to selection in the reverse direction and will often respond rapidly.
This is well illustrated in Fig. 12.2, where the three lines selected for
increased thorax length returned rapidly to the unselected level
when the direction of selection was reversed after the upward
responses had ceased. The lines selected for reduced thorax length,
however, did not respond to reversed selection. From this brief
outline of the evidence it is clear that the simple theoretical picture of
the selection limit is not substantiated by experiment. Instead, we
find — not always but often — no loss of phenotypic variance and the
ability to respond rapidly to reversed selection. Let us now consider
what may be the possible reasons for these facts, and what conclusions
about the genetic nature of the selection limit can be drawn from
them.
1. The failure of the phenotypic variance to decline may be due
to an increase of non-genetic variance compensating for the expected
reduction of genetic variance. With the approach to fixation of the
loci concerned, and of others linked to them, the frequency of
homozygotes will increase. There is evidence, mentioned in Chapter 8
and to be discussed more fully in Chapter 15, that homozygotes are
sometimes more variable from environmental causes than
heterozygotes. This could cause an increase of environmental variance which
might counterbalance a reduction of genetic variance; but there is
little experimental evidence concerning the matter.
2. If the population, after the selection limit has been reached,
responds to reversed selection we can only conclude that genetic
variance of some sort remains. The continued presence of genetic
variance could result from the following causes:
(i) We saw in Example 11.6 how natural selection opposed the
artificial selection for small size in mice, partly because small mice
are less fertile than large ones and partly because the smallest mice
were sterile. Natural selection acting in this sort of way may increase
as the population mean changes further from the original level, until
it becomes strong enough to counteract completely the artificial
selection. The response would then cease, but reversed selection
would be aided by natural selection and the population would
respond.
(ii) Selection may favour heterozygotes at some loci. At the
selection limit the genes would be in equilibrium at more or less
intermediate frequencies, and they would give rise to genetic
variance. But the variance would be non-additive, and there would be
no immediate response to reversed selection. If reversed selection
were continued a response would slowly develop and become more
rapid as the gene frequencies changed away from the equilibrium
values. The behaviour of populations at the selection limit, however,
does not seem commonly to be of this sort.
(iii) If there is superiority of heterozygotes arising from the
combined action of artificial and natural selection then the situation is
quite different. Consider a locus at which the heterozygote $A_1A_2$ is
superior in the character selected to the homozygote $A_1A_1$, and the
homozygote $A_2A_2$ is inviable or sterile. Artificial selection will choose
$A_1A_2$, or perhaps $A_2A_2$ if it is viable, but natural selection will reject
$A_2A_2$, so that under the combined effect of artificial and natural
selection the heterozygote is superior. The pygmy gene in mice
which was used for several examples in Chapter 7 provides just such a
case, when artificial selection is in the direction of small size.
Heterozygotes are favoured because they are smaller than normal
homozygotes; homozygous pygmies are smaller still but are sterile. When the
selection limit is reached under this situation there will be genetic
variance due to the gene, but no further response. When selection is
reversed, however, it is only the artificial selection that is reversed in
direction, and one homozygote will be favoured. The population will
therefore respond immediately. This may be regarded as an extreme
form of asymmetrical response to selection. It leads to the anomaly of
a high heritability, about 50 per cent, estimated from the
offspring-parent regression, but a realised heritability of zero in one direction
and up to 100 per cent in the other direction. The anomaly, however,
is only apparent because the estimation of heritability and the
prediction of the response to selection are valid only if natural selection
does not interfere with the appearance of the genotypes in their proper
Mendelian ratios.
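The equilibrium described in case (iii) can be illustrated with a one-locus recursion. The sketch below is a deliberate simplification and not a calculation from the experiments: it assumes that the combined artificial and natural selection can be represented by constant relative fitnesses of 1 - s for $A_1A_1$, 1 for $A_1A_2$, and 0 for $A_2A_2$. The coefficient s, its value, and the function names are hypothetical. Under these assumptions the frequency of $A_2$ settles at s/(1 + s), so that the gene continues to segregate at the limit.

```python
def next_frequency(q, s):
    """One generation of selection on the frequency q of allele A2.

    Assumed relative fitnesses (hypothetical):
    A1A1 = 1 - s, A1A2 = 1, A2A2 = 0 (inviable or sterile).
    """
    p = 1.0 - q
    w11, w12, w22 = 1.0 - s, 1.0, 0.0
    mean_fitness = p * p * w11 + 2.0 * p * q * w12 + q * q * w22
    # A2 alleles transmitted: half of those from heterozygotes plus those
    # from A2A2, each genotype weighted by its fitness.
    return (p * q * w12 + q * q * w22) / mean_fitness


def equilibrium(q0, s, generations=200):
    """Iterate the recursion from a starting frequency q0."""
    q = q0
    for _ in range(generations):
        q = next_frequency(q, s)
    return q


s = 0.5                         # hypothetical disadvantage of A1A1
q_hat = equilibrium(0.01, s)    # the gene starts rare, as in a base population
print(q_hat, s / (1.0 + s))
```

The starting frequency of 0.01 reflects the point that such genes are rare in the initial population. The constant-fitness assumption is the weakest part of the sketch, since the advantage of the heterozygote under truncation selection depends on the selection intensity.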
The situation described above was proved to exist in one of the
lines of Drosophila selected for high bristle number in the experiment
illustrated in Fig. 12.1. There was a gene present which was lethal
in the homozygote and which in the heterozygote increased bristle
number by 22, which is 5.8 times the original phenotypic standard
deviation (Clayton and Robertson, 1957). The line carrying this
gene was the one whose distribution is shown in Fig. 12.4, and the
bimodality of the distribution can be seen. It seems probable that
in cases like this the gene does not have so large an effect in the original
population, but that the effect of the heterozygote is enhanced during
the selection, either by "modifying" genes or by a crossover which
separates a linked gene whose effect is in the opposite direction. A
mechanism of this sort seems to be required to account for the very
great increase of variance often found in selected lines (F. W.
Robertson and Reeve, 1952a; Clayton and Robertson, 1957).
The selection of heterozygotes at one or a few loci with major
effects through the combined action of artificial and natural selection
in the manner explained above seems to be a common situation in
Drosophila populations at the selection limit. Whether it occurs as
frequently in other organisms is not known because the genetic
analyses required to detect it are more difficult to make. The increase
of variance in the mice selected for small size shown in Fig. 12.3 may
well have been due to this cause.
The deleterious effect on fitness is an essential part of the
situation, so genes of this sort will always be at low frequencies in the
initial population. The appearance of any particular gene in a selected
line will therefore depend very much on the chances of sampling, or
on its occurring later by mutation. Consequently such genes will be a
cause of differences between replicated lines, such as we noted at the
beginning of this chapter in the experiment on Drosophila bristle
number, and they will render the selection limit to a large extent
unpredictable in its level and its precise genetic nature.
Relevance of selection limits to animal and plant
improvement. It may be thought that experimental studies of long-continued
selection are of little relevance to the practice of selection in animal
and plant improvement, because the breeder is concerned only with
the first five or ten generations. This, however, is not necessarily so.
The breeds of animals and varieties of plants which he seeks to
improve have already been under selection for more or less the same
characters over a long time. They may therefore by now be
approaching, if they are not already at, the selection limits. An understanding
of the nature of the selection limit and of the behaviour of populations
at the selection limit may therefore be very relevant in the field of
practice.
CHAPTER 13
SELECTION:
III. Information from Relatives
In our consideration of selection we have up to now supposed that
individuals are measured for the character to be selected and that the
best are chosen to be parents in accordance with the individual
phenotypic values. An individual's own phenotypic value, however, is not
the only source of information about its breeding value; additional
information is provided by the phenotypic values of relatives,
particularly by those of full or half sibs. With some characters, indeed, the
values of relatives provide the only available information. Milk
yield, to take an obvious example, cannot be measured in males, so
the breeding value of a male can only be judged from the phenotypic
values of its female relatives. Ovarian response to gonadotropic
hormone, a character for which selection has been applied in rats
(Kyle and Chapman, 1953), cannot be measured on the living animal,
so selection can only be based on the phenotypic values of female
relatives. The use of information from relatives is of great importance
in the application of selection to animal breeding, for two reasons.
First, the characters to be selected are often ones of low heritability,
and with these the mean value of a number of relatives often provides
a more reliable guide to breeding value than the individual's own
phenotypic value. And, second, when the outcome of selection is a
matter of economic gain even quite a small improvement of the
response will repay the extra effort of applying the best technique.
In this chapter we shall outline the principles underlying the use of
information from relatives and the choice of the best method of
selection, but we shall not discuss the technical details of procedure
in the application of selection to animal breeding.
Methods of Selection
If the family structure of the population is taken into account we
can compute the mean phenotypic value of each family; this is known
as the family mean. Suppose, then, that we have a population in which
the individuals are grouped in families, which may be full or half sibs,
and we have measurements of each individual and of the means of
every family. A choice of procedure for applying selection to this
population is then open, according to the use we make of the family
means. Let us first look at the problem from the point of view of the
additional information provided by the values of relatives. Suppose,
for example, that we have an individual whose own value puts it on
the borderline between selection and rejection, and it has a number
of sibs with high values, so that the family to which it belongs has a
high mean. We may interpret the situation in one of two ways.
Either we may say that the individual's own rather poor value has
been due to poor environmental circumstances, and that the high
family mean suggests that its breeding value is likely to be a good deal
better than its phenotypic value. Or we may say that the high family
mean has been due to a favourable common environment, provided
perhaps by a good mother, from which the individual in question
must also have benefited; on this interpretation, therefore, the
individual's breeding value is likely to be less good than its phenotypic
value. In the first case we should regard the information from the
relatives as favourable and we should select the individual in question,
while in the second case we should regard it as unfavourable and
should reject the individual. Here then is the problem: how do we
decide which is the correct interpretation? It turns out that only three
things need be known: the kind of family (whether full or half sibs),
the number of individuals in the families (i.e. the family size), and the
phenotypic correlation between members of the families with respect
to the character. The choice of method is thus a relatively simple
matter in practice. But the explanation of the principles underlying
the choice is more complicated. Before embarking on this
explanation we shall therefore give a brief general account of the different
methods of selection according to the use made of the information
from relatives, indicating the circumstances to which each method
is specially suited. Then we shall explain how the response expected
under each method is deduced; and finally we shall compare the
relative merits of the methods under different circumstances.
The phenotypic value of an individual, P, measured as a deviation
from the population mean, is the sum of two parts: the deviation of
its family mean from the population mean, $P_f$, and the deviation of the
individual from the family mean, $P_w$ (the within-family deviation);
so that

    $P = P_f + P_w$    (13.1)

The procedure of selection, then, varies according to the attention
paid, or the weight given, to these two parts. If we select on the basis
of individual values only, as assumed in the last two chapters, we give
equal weight to the two components $P_f$ and $P_w$ of the individual's
value $P$. This is known as individual selection. We may, alternatively,
select on the basis of the family mean $P_f$ alone, disregarding the
within-family deviation $P_w$ entirely. This is known as family selection
and it corresponds to the procedure adopted in the first case discussed
above. Again, we may select on the basis of the within-family
deviation $P_w$ alone, disregarding the family mean $P_f$ entirely. This is
known as within-family selection and it corresponds to the second case
discussed above. Finally, we may take account of both components
$P_f$ and $P_w$ but give them different weights chosen so as to make the
best use of the two sources of information. This is known as selection
by optimum combination, or combined selection. It represents the
general solution for obtaining the maximum rate of response, and the
other three simpler methods are special cases in which the weights
given to the two sources of information are either 1 or 0. It is
therefore in principle always the best method. But its advantage over one
or other of the simpler methods is never very great, and it is a
refinement that is not often worth while in practice. Beyond showing why
this is so, we shall therefore not give very much attention to combined
selection.
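The partition of equation 13.1 and the three simple methods, viewed as weights of 1 or 0 on $P_f$ and $P_w$, can be sketched as follows. The data, the variable names, and the ranking function are hypothetical, chosen only to make the arithmetic easy to follow.

```python
from statistics import mean

# Hypothetical phenotypic values: three families of three sibs each.
families = {
    "A": [12.0, 10.0, 11.0],
    "B": [9.0, 8.0, 10.0],
    "C": [7.0, 9.0, 5.0],
}

population_mean = mean(v for sibs in families.values() for v in sibs)

# Split each individual's deviation P into P_f and P_w (equation 13.1).
records = []
for name, sibs in families.items():
    family_mean = mean(sibs)
    for value in sibs:
        records.append({
            "family": name,
            "value": value,
            "P": value - population_mean,
            "P_f": family_mean - population_mean,
            "P_w": value - family_mean,
        })


def ranked(records, weight_f, weight_w):
    """Rank individuals, best first, on weight_f * P_f + weight_w * P_w."""
    def index(rec):
        return weight_f * rec["P_f"] + weight_w * rec["P_w"]
    return sorted(records, key=index, reverse=True)


individual = ranked(records, 1, 1)      # individual selection: equal weights
family = ranked(records, 1, 0)          # family selection: P_w ignored
within_family = ranked(records, 0, 1)   # within-family selection: P_f ignored

print(individual[0]["family"], individual[0]["value"])
print(family[0]["family"])
print(within_family[0]["family"], within_family[0]["value"])
```

With these made-up numbers individual selection ranks the value 12.0 in family A first, whereas within-family selection ranks the value 9.0 in family C first, because it exceeds its own family mean by the greatest amount. Combined selection would use the same index with weights other than 1 or 0.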
The salient features of the three simpler methods are as follows,
the differences of procedure between them being illustrated
diagrammatically in Fig. 13.1.
Individual selection. Individuals are selected solely in
accordance with their own phenotypic values. This method is usually the
simplest to operate and in many circumstances it yields the most rapid
response. It should therefore be used unless there are good reasons
for preferring another method. Mass selection is a term often used for
individual selection, especially when the selected individuals are put
together en masse for mating, as for example Drosophila in a bottle.
The term individual selection is used more specifically when the
matings are controlled or recorded, as with mice or larger animals.
Family selection. Whole families are selected or rejected as
units according to the mean phenotypic value of the family.
Individual values are thus not acted on except in so far as they determine
the family mean. In other words the within-family deviations are
given zero weight. The families may be of full sibs or half sibs, families
of more remote relationship being of little practical significance. The
use of full-sib families is dependent on a high reproductive rate and
with slow-breeding organisms half sibs must generally be used.

Fig. 13.1. Diagram to illustrate the different methods of selection.
Panels: (a) individual selection; (b) family selection; (c) and (d)
within-family selection. The dots and circles represent individuals
plotted on a vertical scale of merit, those with the best measurements
being at the top. The individuals to be selected are those shown as dots.
There are 5 families each with 5 individuals; (a), (b), and (c) show
identical arrangements of the same 25 individuals. The families
are separated laterally, with the individuals of each family placed
one above the other. The mean of each family is shown by a cross-bar.
The situation in which within-family selection is most useful
is shown in (d), where the variation between families is very great
in comparison with the variation within families. (Redrawn from
Falconer, 1957a.)

The chief circumstance under which family selection is to be
preferred is when the character selected has a low heritability. The
efficacy of family selection rests on the fact that the environmental
deviations of the individuals tend to cancel each other out in the mean
value of the family. So the phenotypic mean of the family comes close
to being a measure of its genotypic mean, and the advantage gained is
greater when environmental deviations constitute a large part of the
phenotypic variance, or in other words when the heritability is low.
On the other hand, environmental variation common to members of a
family impairs the efficacy of family selection. If this component is
large, as illustrated in Fig. 13.1 (d), it will tend to swamp the genetic
differences between families and family selection will be
correspondingly ineffective. Another important factor in the efficacy of
family selection is the number of individuals in the families, or the
family size. The larger the family the closer is the correspondence
between mean phenotypic value and mean genotypic value. So the
conditions that favour family selection are low heritability, little
variation due to common environment, and large families.
There are practical difficulties in the application of family
selection, particularly in laboratory populations. They arise from the
conflict between the intensity of selection and the avoidance of
inbreeding. It is generally desirable to keep the rate of inbreeding as
low as possible. If the minimum number of parents is fixed by
considerations of inbreeding, say at ten pairs, then under family
selection ten families must be selected, since each family represents
only one pair of parents in the previous generation. And, if a
reasonably high intensity of selection is to be achieved, the number of
families bred and measured must be perhaps twice to four times this
number. Family selection is thus costly of space, and if breeding space
is limited the intensity of selection that can be achieved under family
selection may be quite small. The two following methods are variants
of family selection.
Sib selection. Some characters, we have already noted, cannot
be measured on the individuals that are to be used as parents, and
selection can only be based on the values of relatives. This amounts
to family selection but with the difference that now the selected
individuals have not contributed to the estimate of their family mean.
The difference affects the way in which the response is influenced by
family size. Where the distinction is of consequence we shall use the
term sib selection when the selected individuals are not measured and
family selection when they are measured and included in the family
mean. When families are very large the two methods are equivalent,
and the term family selection is then to be understood to cover both.
Progeny testing is a method of selection widely applied in
animal breeding. We shall not discuss it in detail, except in so far as it
can be treated as a form of family selection. The criterion of selection,
as the name implies, is the mean value of an individual's progeny.
At first sight this might seem to be the ideal method of selection and
the easiest to evaluate because, as we saw in Chapter 7, the mean
value of an individual's offspring comes as near as we can get to a
direct measure of its breeding value, and is in fact the operational
definition of breeding value. In practice, however, it suffers from the
serious drawback of a much lengthened generation interval, because
the selection of the parents cannot be carried out until the offspring
have been measured. The evaluation of selection by progeny testing
is apt to be rather confusing because of the inevitable overlapping of
generations, and because of a possible ambiguity about which
generation is being selected, the parents or the progeny. The progeny,
whose mean is used to judge the parents, are ready to be used as
parents just when the parents have been tested and await selection.
Thus both the selected parents and their progeny are used
concurrently as parents. The difficulty of interpretation may be partially
overcome by regarding progeny testing as a modified form of family
selection. The progenies are families, usually of half sibs, and
selection is made between them on the basis of the family means in the
manner described above. The only difference is that the selected
families are increased in size by allowing their parents to go on
breeding. The additional, younger, members of the families do not
contribute to the estimates of the family means and are therefore selected
by sib selection. Increasing the size of the selected families by
unmeasured individuals does not improve the accuracy of the selection,
but it reduces the replacement rate and so increases the intensity of
selection that can be achieved. This is the principal advantage of
progeny testing, but it can only be realised in operations on a large
scale, when the danger of inbreeding is not introduced by limitation
of space.
Within-family selection. The criterion of selection is the
deviation of each individual from the mean value of the family to
which it belongs, those that exceed their family mean by the greatest
amount being regarded as the most desirable. This is the reverse of
family selection, the family means being given zero weight. The chief
condition under which this method has an advantage over the others
is a large component of environmental variance common to members
of a family. Fig. 13.1 (d) shows how within-family selection would be
applied in this situation. Pre-weaning growth of pigs or mice might
be cited as examples of such a character. A large part of the variation
of individuals' weaning weights is attributable to the mother and is
therefore common to members of a family. Selection within families
would eliminate this large non-genetic component from the variation
operated on by selection. An important practical advantage of
selection within families, especially in laboratory experiments, is that it
economises breeding space, for the same reason that family selection
is costly of space. If single-pair matings are to be made, then two
members of every family must be selected in order to replace the
parents. This means that every family contributes equally to the
parents of the next generation, a system that we saw in Chapter 4
renders the effective population size twice the actual. Thus when
selection within families is practised, the breeding space required to
keep the rate of inbreeding below a certain value is only half as great
as would be required under individual selection.
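The saving in breeding space can be put as rough arithmetic. The sketch below assumes the relation between the rate of inbreeding and effective population size, $\Delta F = 1/(2N_e)$. It takes $N_e$ equal to the actual number of parents $N$ under individual selection, which is an idealisation, and $N_e = 2N$ under within-family selection with equal family contributions, as stated above. The function name and the target rate of inbreeding are hypothetical.

```python
import math
from fractions import Fraction


def parents_needed(max_rate, ne_per_parent):
    """Smallest N with 1 / (2 * N_e) <= max_rate, where N_e = ne_per_parent * N."""
    return math.ceil(1 / (2 * max_rate * ne_per_parent))


# Hypothetical ceiling on the rate of inbreeding: 2.5 per cent per generation.
# A Fraction keeps the arithmetic exact.
max_rate = Fraction(1, 40)

n_individual = parents_needed(max_rate, 1)      # assumes N_e = N
n_within_family = parents_needed(max_rate, 2)   # N_e = 2N, equal contributions
print(n_individual, n_within_family)
```

Under these assumptions the same ceiling on inbreeding is met with half as many parents, and therefore half the breeding space, when selection is made within families.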
Expected Response
To evaluate the relative merits of the different methods of
selection we have to deduce the response expected from each. There is
nothing to be added here about individual selection to what was said
in Chapter 11. The expected response was given in equation 11.3 as
$R = i\sigma_P h^2$, where $i$ is the intensity of selection (i.e. the selection
differential in standard deviations), $\sigma_P$ is the standard deviation, and
$h^2$ the heritability, of the phenotypic values of individuals. The
response expected under family selection or within-family selection
is arrived at in an analogous manner. Under family selection, the
criterion of selection is the mean phenotypic value of the members of
a family, so the expected response to family selection is

    $R_f = i\sigma_f h_f^2$    (13.2)

where $i$ is the intensity of selection, $\sigma_f$ is the observed standard
deviation of family means, and $h_f^2$ is the heritability of family means.
In the same way the expected response to within-family selection is

    $R_w = i\sigma_w h_w^2$    (13.3)

where $\sigma_w$ is the standard deviation, and $h_w^2$ the heritability, of
within-family deviations.
The concept of heritability applied to family means or to
within-family deviations introduces no new principle. It is simply the
proportion of the phenotypic variance of these quantities that is made
up of additive genetic variance. These heritabilities can be expressed
in terms of the heritability of individual values (which we shall
continue to refer to simply as the heritability, with symbol $h^2$), the
phenotypic correlation between members of families, and the number of
individuals in the families, all of which can be estimated by
observation. To arrive at the appropriate expressions we have to consider
again how the observational components of variance are made up of
the causal components, as explained in Chapters 9 and 10 (see in
particular Tables 9.4 and 10.4). First let us simplify matters by
supposing that all families contain a large number of individuals, so
that the means of all families are estimated without error. Consider
first the phenotypic variance. The intraclass correlation, $t$, between
members of families is the between-group component divided by the
total variance: $t = \sigma_B^2/\sigma_T^2$. Therefore the between-group component
can be expressed as $\sigma_B^2 = t\sigma_T^2$, and the within-group component as
$\sigma_W^2 = (1 - t)\sigma_T^2$. This expresses the partitioning of the phenotypic
variance into its observational components. The total variance,
written here as $\sigma_T^2$, is the phenotypic variance which we shall write
as $V_P$ in the context of causal components. Now, the partitioning of
the additive variance between and within families can be expressed
in the same way, in terms of the correlation of breeding values, for
which we shall use the symbol $r$. (The meaning of this correlation
will be explained in a moment.) Thus the additive variance between
families is $rV_A$ and the additive variance within families is $(1 - r)V_A$.
The dual partitioning is summarised in Table 13.1.
Table 13.1
Partitioning of the variance between and within families of
large size.

Observational component          Additive variance   Phenotypic variance
Between families, $\sigma_B^2$   $rV_A$              $tV_P$
Within families, $\sigma_W^2$    $(1-r)V_A$          $(1-t)V_P$
This partitioning of both the additive and the phenotypic variance
leads at once to the heritabilities of family means and of within-family
deviations, since these heritabilities are simply the ratios of
the additive variance to the phenotypic variance. Thus, when the
families are large, the heritability of family means is $rV_A/tV_P$, or $(r/t)h^2$,
since $V_A/V_P$ is the heritability of individual values, $h^2$.
The correlation of breeding values between members of families
is a measure of the degree of relationship, usually called the "coefficient
of relationship." The correlation between the breeding values
of relatives in a random-mating population is twice their coancestry,
$$r = 2f \qquad (13.4)$$
that is to say, twice the inbreeding coefficient of their progeny if the
relatives were mated together. Its values in full-sib and half-sib
families can be seen from Table 9.4; for full sibs it is $\frac{1}{2}$ and for half
sibs it is $\frac{1}{4}$. In order to be able to discuss full-sib and half-sib families
at the same time in what follows, we shall retain the symbol $r$ in the
formulae instead of inserting the appropriate values of $\frac{1}{2}$ or $\frac{1}{4}$.
The foregoing account of the heritabilities of family means and
within-family deviations was simplified by the supposition of large
families. This simplification is not justified in practice and we must
now remove it by considering families of finite size. We shall, however,
suppose that all families are of equal size. The number of
individuals in a family, called the family size, has to be taken into
consideration for the following reason. If selection is based on the
family mean, or on the deviations from the family mean, then it is the
observed mean that we are concerned with and not the true mean. In
other words we are not concerned with the observational components
of variance which we have hitherto discussed, but with the variance of
the observed means and of the observed within-family deviations.
The observed means of groups are subject to sampling variance which
comes from the within-group variance. If there are $n$ individuals in a
group then the sampling variance of the group mean is $(1/n)\sigma_W^2$, where
$\sigma_W^2$ is the component of variance within the group. Thus the variance of
observed group means is augmented by $(1/n)\sigma_W^2$, and the variance of
observed deviations within groups is correspondingly diminished by
the same amount. The observed variances, with family size $n$, are
therefore made up of the observational components as shown in
Table 13.2. The causal components entering into the observed
variances can now be found by translating the observational components
into causal components from Table 13.1. They are shown in
the two right-hand columns of Table 13.2.

Table 13.2
Composition of observed variances with families of size n.

                                                                     Causal components
Observed variance             Observational components               Additive                      Phenotypic
of family means               $\sigma_B^2 + \frac{1}{n}\sigma_W^2$   $\frac{1+(n-1)r}{n}V_A$       $\frac{1+(n-1)t}{n}V_P$
of within-family deviations   $\frac{n-1}{n}\sigma_W^2$              $\frac{(n-1)(1-r)}{n}V_A$     $\frac{(n-1)(1-t)}{n}V_P$
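The translation from observational to causal components can be written out step by step. The lines below do this for the phenotypic column of Table 13.2, using only the relations $\sigma_B^2 = tV_P$ and $\sigma_W^2 = (1-t)V_P$ from Table 13.1; the additive column follows in the same way with $r$ and $V_A$ in place of $t$ and $V_P$.

```latex
\begin{align*}
\sigma_B^2 + \tfrac{1}{n}\sigma_W^2
  &= tV_P + \tfrac{1}{n}(1-t)V_P
   = \frac{1+(n-1)t}{n}\,V_P , \\[4pt]
\sigma_W^2 - \tfrac{1}{n}\sigma_W^2
  &= \frac{n-1}{n}\,(1-t)V_P
   = \frac{(n-1)(1-t)}{n}\,V_P .
\end{align*}
```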
To find the heritabilities of family means and of within-family
deviations we have only to divide the additive component by the
phenotypic component of the observed variances. Thus the heritability
of family means is
$$h_f^2 = \frac{1+(n-1)r}{1+(n-1)t}\,h^2$$
and the heritability of within-family deviations is
$$h_w^2 = \frac{1-r}{1-t}\,h^2$$
At this point sib selection has to be distinguished from family
selection. The foregoing account referred to family selection where
the individuals to be selected were themselves measured and contributed
to the observed family mean. Sib selection differs in that the individuals
selected are not measured. This does not affect the phenotypic component,
because this is simply the observed variance of what is
measured. But it does affect the additive component, because the
mean breeding value with which we are concerned is not that of the
individuals whose phenotypic values have been measured, but of
others that have not been measured. Therefore the appropriate
variance of mean breeding values is simply the between-family component
of additive variance, $rV_A$, irrespective of the number of other
individuals that have been measured. The heritability of family
means appropriate to sib selection is therefore
$$h_s^2 = \frac{nr}{1+(n-1)t}\,h^2$$
The heritabilities of the different methods of selection, whose derivations
have now been explained, are listed in Table 13.3.
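The three heritabilities just derived are easily evaluated for any particular case. The following is a minimal sketch in Python; the function name, argument names and the numerical values are illustrative assumptions and are not taken from the text.

```python
def family_heritabilities(h2, r, t, n):
    """Heritabilities of family means, sib means and within-family deviations.

    h2: heritability of individual values
    r:  correlation of breeding values within families (1/2 full sibs, 1/4 half sibs)
    t:  phenotypic intraclass correlation
    n:  family size, assumed equal for all families
    """
    denom = 1 + (n - 1) * t
    h2_family = h2 * (1 + (n - 1) * r) / denom  # selected individuals are measured
    h2_sib = h2 * n * r / denom                 # selected individuals are not measured
    h2_within = h2 * (1 - r) / (1 - t)          # does not depend on family size
    return h2_family, h2_sib, h2_within


# Hypothetical case: full sibs, h2 = 0.2, and no common environment so that t = r * h2.
h2_f, h2_s, h2_w = family_heritabilities(h2=0.2, r=0.5, t=0.1, n=8)
print(round(h2_f, 3), round(h2_s, 3), round(h2_w, 3))
```

With a family of one the first expression reduces to the heritability of individual values, as it should.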
To deduce the expected response is now a simple matter. Let us
take family selection for illustration. The expected response was
given in equation 13.2 as
$$R_f = i\sigma_f h_f^2$$
where $\sigma_f$ is the standard deviation of observed family means. This
expression, however, is not much use as it stands, because it does not
readily allow a comparison to be made with the other methods. It
will be most convenient to cast it into a form that facilitates comparison
with individual selection. This can be done by substituting the
expression for the heritability of family means, $h_f^2$, given above, and
by putting the standard deviation of observed family means, $\sigma_f$, in
terms of the standard deviation of individual phenotypic values,
$\sigma_P$ ($= \sqrt{V_P}$), from the right-hand column of Table 13.2. The expected
response then becomes
$$R_f = i\,\sigma_P\sqrt{\frac{1+(n-1)t}{n}} \times h^2\,\frac{1+(n-1)r}{1+(n-1)t}$$
which reduces to
$$R_f = i\sigma_P h^2\left[\frac{1+(n-1)r}{\sqrt{n\{1+(n-1)t\}}}\right]$$
The term $i\sigma_P h^2$ is equivalent to the expected response under individual
selection, so the expression within the square brackets is the
factor that compares family selection with individual selection. The
expression looks very complicated but it contains only three simple
quantities: $n$, which is the family size; $r$, which is $\frac{1}{2}$ for full-sib and
$\frac{1}{4}$ for half-sib families; and $t$, which is the phenotypic intraclass
correlation.

Table 13.3
Heritability and expected response under different methods
of selection.

Method of selection   Heritability                                Expected response
Individual            $h^2$                                       $R = i\sigma_P h^2$
Family                $h_f^2 = h^2\,\frac{1+(n-1)r}{1+(n-1)t}$    $R_f = i\sigma_P h^2\,\frac{1+(n-1)r}{\sqrt{n\{1+(n-1)t\}}}$
Sib                   $h_s^2 = h^2\,\frac{nr}{1+(n-1)t}$          $R_s = i\sigma_P h^2\,\frac{nr}{\sqrt{n\{1+(n-1)t\}}}$
Within-family         $h_w^2 = h^2\,\frac{1-r}{1-t}$              $R_w = i\sigma_P h^2\,(1-r)\sqrt{\frac{n-1}{n(1-t)}}$
Combined                                                          $R_c = i\sigma_P h^2\,\sqrt{1+\frac{(r-t)^2}{1-t}\cdot\frac{n-1}{1+(n-1)t}}$

$i$ = intensity of selection (selection differential in standard measure):
assumed to be equal for all methods, but not necessarily so.
$\sigma_P$ = standard deviation of phenotypic values of individuals.
$h^2$ = heritability of individual values.
$r$: with full-sib families, $r = \frac{1}{2}$; with half-sib families, $r = \frac{1}{4}$.
$t$ = correlation of phenotypic values of members of the families.
$n$ = number of individuals in the families.
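The comparison factors for all the methods can be evaluated in the same way. The sketch below assumes the expected responses of Table 13.3 in the form written in the code, each divided by $i\sigma_P h^2$; equal intensity of selection is assumed, and the function name and numerical values are illustrative only.

```python
import math


def relative_responses(r, t, n):
    """Expected responses relative to individual selection (equal intensity assumed).

    Each value is the factor that multiplies i * sigma_P * h2 for that method.
    """
    denom = math.sqrt(n * (1 + (n - 1) * t))
    return {
        "family": (1 + (n - 1) * r) / denom,
        "sib": n * r / denom,
        "within": (1 - r) * math.sqrt((n - 1) / (n * (1 - t))),
        "combined": math.sqrt(
            1 + (r - t) ** 2 / (1 - t) * (n - 1) / (1 + (n - 1) * t)
        ),
    }


# Hypothetical case: full-sib families of 10 with a low phenotypic correlation.
for method, factor in relative_responses(r=0.5, t=0.1, n=10).items():
    print(f"{method:9s} {factor:.3f}")
```

A factor above 1 means that the method is expected to do better than individual selection under the stated assumptions.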
The expected responses under the different methods of selection
are listed in Table 13.3, all expressed in this manner which allows
the comparisons to be made with individual selection. The relative
merits of the different methods will be discussed in the next section:
first we must deal with combined selection.
Combined selection. We shall deal very briefly with combined
selection, referring the reader to Lush (1947), Lerner (1950) and
A. Robertson (1955a) for details. First we have to find what are the
appropriate weighting factors to be used in its application. We saw
before that the phenotypic value of an individual is made up of two
parts, the family mean and the within-family deviation, $P = P_f + P_w$,
and that each part gives some information about the individual's
breeding value. In Chapter 10 we saw that the heritability is equivalent
to the regression of breeding value on phenotypic value
(equation 10.2), so that the best estimate of an individual's breeding
value to be derived from its phenotypic value is $h^2P$. This idea can
be applied separately to the two parts of the phenotypic value, since
these are uncorrelated and supply independent information about the
breeding value. Therefore, taking both parts of the phenotypic value
into account, the best estimate of an individual's breeding value is
given by the multiple regression equation
$$\text{expected breeding value} = h_f^2P_f + h_w^2P_w$$
($P_f$ being measured as a deviation from the population mean, and $P_w$
as a deviation from the family mean). The weighting factors that
make the most efficient use of the two sources of information are
therefore the two heritabilities, appropriate to family means and to
within-family deviations respectively. The criterion of selection
under combined selection is thus an index, $I$, in the form
$$I = h_f^2P_f + h_w^2P_w \qquad (13.5)$$
If the values of the heritabilities are inserted from Table 13.3 it will
be seen that the term $h^2$ is common to both weighting factors, and
this term may therefore be omitted without affecting the relative
weighting. We then have an index for the computation of which only
$n$, $r$, and $t$ need be known. In practice it is more convenient to work
with the individual values in place of the within-family deviations,
and to assign them a weight of 1. The family mean is thus used in the
manner of a correction, supplementing the information provided by
the individual itself. Rearrangement of the appropriate weighting
factor for the family mean leads to an index made up as follows (Lush,
1947):
$$I = P + \left[\frac{r-t}{1-r}\cdot\frac{n}{1+(n-1)t}\right]P_f \qquad (13.6)$$
where $P$ is the individual value and $P_f$ the family mean, in which the
individual itself is included.
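The index can be computed directly once $n$, $r$, and $t$ are known. The sketch below assumes the weighting factor in the form $\frac{r-t}{1-r}\cdot\frac{n}{1+(n-1)t}$, with both the individual value and the family mean taken as deviations from the population mean; the function names and the numerical values are hypothetical.

```python
def family_mean_weight(r, t, n):
    """Weight given to the family mean when the individual value has weight 1."""
    return (r - t) / (1 - r) * n / (1 + (n - 1) * t)


def combined_index(p, p_family, r, t, n):
    """Index I = P + w * P_f, both values being deviations from the population mean."""
    return p + family_mean_weight(r, t, n) * p_family


# Hypothetical values: an individual 2.0 units above the population mean,
# in a full-sib family of 6 whose mean is 0.5 units above it.
print(combined_index(p=2.0, p_family=0.5, r=0.5, t=0.2, n=6))
```

The weight is positive when $t$ is less than $r$ and negative when $t$ exceeds $r$, so that with a high phenotypic correlation the family mean is used to correct the individual value downwards.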
This solution of the problem of how we can best make use of the
information provided by relatives is now cast in precisely the form
in which the problem was introduced at the beginning of this
chapter. The expression in the square brackets in equation 13.6,
which contains nothing but easily measurable quantities, shows how
we can best use the family mean to supplement the individual values
in making the selection.
The expected response to combined selection, cast in a form
suitable for comparison with individual selection, is given at the foot
of Table 13.3. For its derivation see Lush (1947).
Relative Merits of the Methods
The formulae for the expected responses that we have derived
enable us to compare one method of selection with another and discover
what are the conditions that determine the choice of the best
method. Before making detailed comparisons let us note the reason
for individual selection being usually better than either family selection
or within-family selection. The reason is that the standard
deviations of family means and of within-family deviations are both
bound to be less than the standard deviation of individual values;
and the standard deviation of the criterion of selection is one of the
factors governing the response. If we compare, for example, family
selection with individual selection by writing the expected responses
in the form
$$R = i\sigma_P h^2 \quad \text{(for individual selection)}$$
and
$$R_f = i\sigma_f h_f^2 \quad \text{(for family selection)}$$
then it is clear that family selection cannot be better than individual
selection unless the heritability of family means, $h_f^2$, is greater than
the heritability of individual values, $h^2$, by an amount great enough
to counterbalance the lower standard deviation of family means.
And the same applies to within-family selection.
A general picture of the circumstances that make one method
better than another can best be obtained from graphical representations
of the relative responses: that is, the response expected from
one method expressed as a proportion of the response expected from
another, the expected responses being taken from Table 13.3. In
making these comparisons we shall assume that the intensity of
selection is the same for all methods. Though not necessarily true,
this simplification is unavoidable because no generalisation can be
made about the proportions selected under the different methods.
We shall make the comparisons separately for full-sib families ($r = \frac{1}{2}$)
and for half-sib families ($r = \frac{1}{4}$). Then the relative responses depend
only on two factors, the family size, $n$, and the intraclass correlation
of phenotypic values, $t$. If there is no variance due to common environment
contributing to the variance of family means, then the
correlation in full-sib families is equal to half the heritability, and that
in half-sib families to one quarter of the heritability. This lets us see
in a general way how the heritability of the character influences the
relative response. It is, however, the correlation and not the heritability
that is the determining factor, so only the correlation need be
known when a choice of method is to be made.
Fig. 13.2 gives a general picture of all the methods, showing how
their relative merits depend on the phenotypic correlation. The
graphs refer only to full-sib families and only to the two extremes of
family size: infinitely large families in (a) and families of 2 in (b).
The comparisons are made here with combined selection since this is
necessarily the method that gives the greatest response. The graphs
therefore show the ratio of the response for each method to that for
combined selection: e.g. for family selection, the ratio $R_f/R_c$. The
general picture indicated by the graphs is as follows. The relative
merit of individual selection is greatest when the correlation is 0.5
and falls off as the correlation drops below or rises above this value.
The relative merit of family selection is greatest when the correlation
is low, and that of within-family selection when the correlation is
high. Now, a low correlation between sibs can only result from a
character of low heritability, and with very little variance due to
common environment. These therefore are the circumstances that
favour family selection. A high correlation can only result from a
large amount of variance due to common environment. Even if the
heritability were 100 per cent the correlation between full sibs could
not exceed 0.5 without augmentation by common environment. A
large amount of variation due to common environment is therefore
the circumstance that favours within-family selection. We shall
examine the three simpler methods in more detail in a moment.

[Fig. 13.2. Relative merits of the different methods of selection,
with full-sib families. Responses relative to that for combined
selection plotted against the phenotypic intraclass correlation, $t$.
Panels: (a) $n = \infty$; (b) $n = 2$.
I = individual selection; F = family selection; W = within-family
selection.]

First let us look at what may be gained from combined selection.
Though combined selection is always as good as or superior to any
other method, its superiority is never very great. With large families
its superiority is greatest when the correlation is close to 0.25 or 0.75,
but even then its superiority is not much more than 10 per cent.
With families of 2 its superiority reaches 20 per cent when the correlation
is 0.875. Thus the range of circumstances under which
combined selection is more than a few per cent better than one or
other of the simpler methods is very narrow. In general, therefore,
there is little to be gained from the extra trouble of applying combined
selection, and we shall not give it any further consideration.
Let us now examine the simpler methods in more detail. The
most useful comparison to make now is with individual selection.
The expected responses will therefore be expressed as a proportion
of the response to individual selection. We shall examine each method
in turn, commenting on the special questions that arise in connexion
with each.
Family selection. Fig. 13.3 shows the relative response $R_f/R$
plotted against the family size, $n$, for full-sib families in (a) and
for half-sib families in (b). These graphs therefore show primarily the
effect of family size on the relative merit of family selection, but the
magnitude of the correlation, $t$, is taken into account by separate
curves for different correlations. Only the circumstances when family
selection is superior to individual selection are shown on the graphs.
The chief points made clear by the graphs are these. (i) As we saw
from Fig. 13.2, there is a critical value of the correlation, above which
family selection cannot be superior to individual selection. From the
expected responses in Table 13.3 it is easy to show that when the
families are large the relative response expected is $R_f/R = r/\sqrt{t}$. So,
with large families, family selection becomes superior to individual
selection when $r$ exceeds $\sqrt{t}$. The critical value of the correlation, $t$,
depends a little on the family size and differs between full-sib and
half-sib families. Family selection with full sibs is very little better
than individual selection unless the correlation is below 0.2; and with
half sibs unless it is below 0.05. (ii) The effect of family size is greatest
when the correlation is low. Therefore there is little to be gained
from very large families unless the correlation is well below the critical
value. There is, however, another consideration in connexion with
the family size which will be explained later. (iii) Finally, there is the
question whether full sibs or half sibs are to be preferred for family
selection. This depends so much on the special circumstances that
general conclusions cannot be drawn. From the graphs it would
appear that full sibs must always be better than half sibs. But the
full-sib correlation is more likely to be increased by common environment,
and full-sib families are likely to be a good deal smaller
than half-sib families. Both these factors work in favour of half-sib
families. It has been shown that in selection for egg-production in
poultry the factor of family size makes half-sib families superior to
full sibs (Osborne, 1957a).
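The dependence of the critical correlation on family size, mentioned under (i), can be made explicit by setting the family-selection factor of Table 13.3, $[1+(n-1)r]/\sqrt{n\{1+(n-1)t\}}$, equal to 1 and solving for $t$. The sketch below does this under the assumption of equal intensities of selection; the function name and the family sizes chosen are illustrative.

```python
def critical_correlation_family(r, n):
    """Phenotypic correlation t at which family and individual selection give
    equal expected responses, for families of size n > 1."""
    return ((1 + (n - 1) * r) ** 2 / n - 1) / (n - 1)


# Full sibs (r = 1/2) and half sibs (r = 1/4) at a few hypothetical family sizes.
for r in (0.5, 0.25):
    print(r, [round(critical_correlation_family(r, n), 3) for n in (2, 10, 50, 1000)])
```

As the family size increases the value approaches $r^2$, in agreement with the large-family condition that $r$ must exceed $\sqrt{t}$. A negative value means that, at that family size, family selection is not expected to surpass individual selection at any correlation.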
[Fig. 13.3. Responses expected under family selection relative to
that for individual selection, plotted against family size, $n$. The
separate curves refer to different values of the phenotypic correlation,
$t$, as indicated. The corresponding values of the heritability,
$h^2$, in the absence of variation due to common environment,
are also given. (a) full-sib families; (b) half-sib families.]
Sib selection. The use of this method is usually dictated by
necessity rather than by choice, and comparisons with other methods
are of less interest. The chief practical question that arises concerns
the family size: how many sibs should be measured? Or, how far is it
worth while increasing family size? The effect of family size on the
response to sib selection is shown in Fig. 13.4. The graphs show the
response with family size $n$, as a percentage of the response with
infinitely large families, which would be the maximum possible
response. The graphs are valid for both full and half sibs. Again the
effect of increasing family size is greatest when the correlation is low.
But with sib selection as with family selection there is another consideration
to be taken into account in connexion with the family size,
which will now be explained.

[Fig. 13.4. Effect of family size on the response to sib selection,
with either full- or half-sib families. The expected response is
shown as a percentage of the response with infinitely large families,
plotted against family size, $n$. The separate curves refer to different
values of the phenotypic correlation, $t$, as indicated.]
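The quantity plotted in Fig. 13.4 follows from the sib-selection response in Table 13.3. Dividing the response with family size $n$ by its limit for very large families, $r/\sqrt{t}$, gives $\sqrt{nt/\{1+(n-1)t\}}$, from which $r$ has cancelled; this is why one set of curves serves both kinds of family. The sketch below evaluates it; the function name and the values of $t$ and $n$ are illustrative.

```python
import math


def sib_selection_efficiency(t, n):
    """Response to sib selection with families of size n, as a fraction of the
    response with infinitely large families."""
    return math.sqrt(n * t / (1 + (n - 1) * t))


# Hypothetical correlations; r cancels, so the same values serve full and half sibs.
for t in (0.05, 0.25):
    print(t, [round(100 * sib_selection_efficiency(t, n)) for n in (2, 10, 30, 60)])
```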
Optimal family size. Though the graphs suggest that the larger
the family size the greater will be the response, under both family
selection and sib selection, this is not so in practice because the intensity
of selection is involved as a factor in the following way. In
practice there is always a limitation on the amount of breeding space
or facilities for measurement. The total available space can be filled
with a large number of small families, or with a small number of large
families. Considerations of inbreeding set a lower limit to the number
of families that will be selected, so the larger the number of families
measured the greater will be the intensity of selection. Therefore
there is a conflict of advantage between the size of the families and
the intensity of selection: large families lead to a lower intensity of
selection. When the intensity of selection is taken into consideration
it turns out that there is an optimal family size which gives the
greatest expected response. The optimal family size with half-sib
families can be found approximately from the following simple
formula (A. Robertson, 1957b):
$$n = 0.56\sqrt{\frac{T}{Nh^2}} \qquad (13.7)$$
where $n$ is the optimal family size, $T$ is the total number of individuals
that can be accommodated and measured, $N$ is the number of families
to be selected, and $h^2$ is the heritability of the character.
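As a numerical illustration, the sketch below evaluates the approximate formula, taken here in the form $n = 0.56\sqrt{T/(Nh^2)}$. The function name, argument names and the facilities assumed are hypothetical.

```python
import math


def optimal_half_sib_family_size(total_measured, families_selected, h2):
    """Approximate optimal family size, n = 0.56 * sqrt(T / (N * h2))."""
    return 0.56 * math.sqrt(total_measured / (families_selected * h2))


# Hypothetical facilities: 1000 individuals measured, 10 families selected, h2 = 0.25.
print(optimal_half_sib_family_size(1000, 10, 0.25))
```

Since the result is only approximate, it would in practice be rounded to a convenient whole number of sibs.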
Within-family selection. Fig. 13.5 shows the relative response,
$R_w/R$, for within-family selection applied to full-sib families. Half-sib
families need not be considered since the method is unlikely to be
applied to them. The graphs show primarily the effect of the phenotypic
correlation, $t$, on the response. Four graphs are given representing
family sizes between 2 and 30, and it can be seen that the
family size does not have a great effect. The relative response when
the families are very large can be shown from the expected responses
given in Table 13.3 to be $R_w/R = (1-r)/\sqrt{1-t}$. So, with large
families, within-family selection will be superior to individual selection
if $(1-r)$ exceeds $\sqrt{1-t}$. The graphs in Fig. 13.5 show that the
correlation, $t$, in full-sib families would have to exceed about 0.75 to
0.85, according to the family size. Correlations as high as this cannot
arise without a large amount of variation due to common environment.
Correlations high enough to make within-family selection
superior to individual selection are, however, not commonly found,
and the advantage of within-family selection therefore comes chiefly
from the reduced rate of inbreeding which was mentioned earlier.
Fig. 13.5 shows how much will be sacrificed in the rate of response if
within-family selection is applied. Most characters have full-sib
correlations below about 0.5, and within-family selection is then only
about half as effective as individual selection.
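The correlation that must be exceeded can be found for any family size by setting the within-family factor of Table 13.3, $(1-r)\sqrt{(n-1)/\{n(1-t)\}}$, equal to 1 and solving for $t$. The sketch below assumes equal intensities of selection; the function name is illustrative.

```python
def critical_correlation_within(r, n):
    """Phenotypic correlation t above which within-family selection is expected
    to surpass individual selection, for families of size n > 1."""
    return 1 - (1 - r) ** 2 * (n - 1) / n


# Full-sib families (r = 1/2) at the family sizes shown in Fig. 13.5.
print([round(critical_correlation_within(0.5, n), 3) for n in (2, 4, 10, 30)])
```

With very large families the value tends to $1-(1-r)^2$, which is 0.75 for full sibs.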
[Fig. 13.5. Response expected under within-family selection relative
to that of individual selection, plotted against the phenotypic
correlation, $t$. The separate curves refer to different family sizes
($n$ = 2, 4, 10, 30), as indicated.]
Weights to be attached to families of different size. Throughout
this chapter we have assumed that all families whose mean values
are to be used in selection have equal numbers of individuals in them;
i.e. $n$ is the same for all families. This is a reasonable enough assumption
to make when we are considering the expected response from the
point of view of the planner who has to decide on the method of
selection to be applied. But, in practice, families are very seldom of
equal size and if we are to apply any method of selection based on
family means we are immediately faced with the problem of how to
make allowance for different numbers in the families. Obviously the
mean of a large family is more reliable than that of a small one, and
should be given more weight when the selection is being made. The
solution of the problem comes from a consideration of the heritability
as the regression of breeding value on phenotypic value. The best
estimate of the breeding value of a family is obtained by multiplying
the family mean (measured as a deviation from the population mean)
by the heritability of family means. The appropriate weighting factor
for family means is therefore the heritability of family means, calculated
separately for each family according to its size. Quantities
that are constant for all families may be omitted without altering the
relative weights. Thus, in the application of family selection, each
family mean, calculated as a deviation from the population mean,
should be weighted by $[1+(n-1)r]/[1+(n-1)t]$, and in sib selection by
$nr/[1+(n-1)t]$. The heritability of within-family deviations does not
contain the term $n$, and is therefore unaffected by family size. Thus no
weighting is required in the application of within-family selection. The
weighting factor to be used in combined selection has already been
given in equation 13.6.
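The weighting may be illustrated with a short sketch. The function name, the `measured` flag and the family data below are hypothetical; the weights themselves are those just given, $[1+(n-1)r]/[1+(n-1)t]$ for family selection and $nr/[1+(n-1)t]$ for sib selection.

```python
def family_weight(n, r, t, measured=True):
    """Relative weight for the mean of a family of size n.

    measured=True  -> family selection (the candidates contribute to the mean)
    measured=False -> sib selection (the candidates are not measured)
    """
    numerator = 1 + (n - 1) * r if measured else n * r
    return numerator / (1 + (n - 1) * t)


# Hypothetical full-sib families: (size, mean as a deviation from the population mean).
families = [(4, 1.2), (8, 1.0), (12, 0.9)]
for n, deviation in families:
    print(n, round(family_weight(n, r=0.5, t=0.2) * deviation, 3))
```

The families would then be ranked on the weighted deviations in place of the raw family means.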
We conclude this chapter with an example from a laboratory
experiment which compared the responses actually obtained under
different methods of selection.
Example 13.1. In an experiment with Drosophila melanogaster selection
for abdominal bristle-number was made by three methods (Clayton,
Morris, and Robertson, 1957). The responses to individual selection at
different intensities were quoted in Example 11.2. Sib selection was also
applied in both full-sib and half-sib families and the responses compared
with expectation. Here we shall compare the responses under sib selection
with the response under individual selection, according to the formula in
Table 13.3. The same proportion of the population was selected in each
case, namely 20 per cent, but the intensities of selection under sib selection
were lower than under individual selection because there was a smaller
total number of families than of individuals: 10 half-sib families, 20
full-sib families, and 100 individuals. The intensity of selection under
individual selection was 1.40. Those under sib selection are given in the
table, together with the other data needed for calculating the expected
responses under sib selection relative to that under individual selection.

Data                                   Relative response, $R_s/R$
       Full sibs   Half sibs                       Full sibs   Half sibs
$i$    1.33        1.27                Exp.        0.832       0.614
$n$    12          20                  Obs. up     0.618       0.527
$r$    0.50        0.275               Obs. down   0.919       0.635
$t$    0.265       0.121

In applying the formula from Table 13.3 we have to take account of the
intensity of selection, multiplying by the ratio of the intensity under sib
selection to the intensity under individual selection. It will be seen that
the correlation of breeding values, $r$, between half sibs is a little greater
than $\frac{1}{4}$. This is because the females mated to a male were not entirely
unrelated to each other. The ratios of the responses expected and observed
are given in the right-hand half of the table. The expectation is that individual
selection should be the best method, and so it proved to be.
There is, however, some discrepancy between the upward and downward
responses, of which the reason is not known.
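The expected values in the example can be checked from the data. The sketch below takes the sib-selection factor of Table 13.3 as $nr/\sqrt{n\{1+(n-1)t\}}$ and multiplies it by the ratio of intensities, as described; the function name is illustrative.

```python
import math


def sib_vs_individual(i_sib, i_individual, n, r, t):
    """Expected R_s / R: the sib-selection factor times the ratio of intensities."""
    return (i_sib / i_individual) * n * r / math.sqrt(n * (1 + (n - 1) * t))


# Data of Example 13.1 (the intensity under individual selection was 1.40).
full = sib_vs_individual(i_sib=1.33, i_individual=1.40, n=12, r=0.50, t=0.265)
half = sib_vs_individual(i_sib=1.27, i_individual=1.40, n=20, r=0.275, t=0.121)
print(round(full, 3), round(half, 3))  # 0.832 0.614
```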
CHAPTER 14
INBREEDING AND CROSSBREEDING:
I. Changes of Mean Value
We turn our attention now to inbreeding, the second of the two ways
open to the breeder for changing the genetic constitution of a population.
The harmful effects of inbreeding on reproductive rate and
general vigour are well known to breeders and biologists, and were
mentioned in Chapter 6 as one of the two basic genetic phenomena
displayed by metric characters. The opposite, or complementary,
phenomenon of hybrid vigour resulting from crosses between inbred
lines or between different races or varieties is equally well known, and
forms an important means of animal and plant improvement. The
production of lines for subsequent crossing in the utilisation of
hybrid vigour is one of two main purposes for which inbreeding may
be carried out. The other is the production of genetically uniform
strains, particularly of laboratory animals, for use in bioassay and in
research in a variety of fields. Inbreeding in itself, however, is almost
universally harmful and the breeder or experimenter normally seeks
to avoid it as far as possible, unless for some specific purpose. Mention
should be made here of naturally self-fertilising plants, to which
much of the discussion in this chapter is inapplicable. Since inbreeding
is their normal mating system they cannot be further inbred: they
can, however, be crossed, but they do not regularly show hybrid
vigour.
In the treatment of inbreeding given in Chapter 3 the consequences
were described in terms of the expected changes of gene
frequencies and of genotype frequencies. Here we have to show how
the changes of gene and genotype frequencies are expected to affect
metric characters. And at the same time we have to consider the
observed consequences of inbreeding and crossing, and see what
light they throw on the properties of the genes concerned with
metric characters. We shall first consider the changes of mean value
and then, in the next chapter, the changes of variance resulting from
inbreeding and crossbreeding. Finally, in Chapter 16, we shall consider
the combination of selection with inbreeding and crossbreeding
by means of which hybrid vigour may be utilised in animal and plant
improvement.
Inbreeding Depression
The most striking observed consequence of inbreeding is the
reduction of the mean phenotypic value shown by characters connected
with reproductive capacity or physiological efficiency, the
phenomenon known as inbreeding depression. Some examples of inbreeding
depression are given in Table 14.1, from which one can see
what sort of characters are subject to inbreeding depression, and
(very roughly) the magnitude of the effect. From the results of these
and many other studies we can make the generalisation that inbreeding
tends to reduce fitness. Thus, characters that form an important
component of fitness, such as litter size or lactation in mammals,
show a reduction on inbreeding; whereas characters that contribute
little to fitness, such as bristle number in Drosophila, show little or no
change.
In saying that a certain character shows inbreeding depression,
we refer to the average change of mean value in a number of lines.
The separate lines are commonly found to differ to a greater or lesser
extent in the change they show, as, indeed, we should expect in
consequence of random drift of gene frequencies. This matter of differentiation
of lines will be discussed later when we deal with changes
of variance. It is mentioned here only to emphasise the fact that the
changes of mean value now to be discussed refer to changes of the
mean value of a number of lines derived from one base population.
As in our earlier account of inbreeding we have to picture the "whole
population" consisting of many lines. The population mean then
refers to the whole population and inbreeding depression refers to a
reduction of this population mean. Let us now consider the theoretical
basis of the change of population mean on inbreeding.
First, we may recall and extend some of the conclusions from
Chapter 3, supposing at first that selection does not in any way interfere
with the dispersion of gene frequencies. Since the gene frequencies
in the population as a whole do not change on inbreeding,
any change of the population mean must be attributed to the changes
of genotype frequencies. Inbreeding causes an increase in the frequencies
of homozygous genotypes and a decrease of heterozygous genotypes.

Table 14.1
Some Examples of Inbreeding Depression

The figures given show approximately the decrease of mean
phenotypic value per 10 per cent increase of the coefficient
of inbreeding: column (1) in absolute units; column (2) as
percentage of non-inbred mean; column (3) in terms of the
original phenotypic standard deviation (data not available
for all characters; missing entries are shown as -).

Character                            Inbreeding depression per
                                     10% increase of F
                                     (1)                (2)     (3)
                                     units              %       /σ_P

Cattle (A. Robertson, 1954)
  Milk-yield                         29.6 gal.          3.2     0.17
Pigs (Dickerson et al., 1954)
  Litter size at birth               0.38 young         4.6     0.15
  Weight at 154 days                 3.64 lb.           2.7     0.12
Sheep (Morley, 1954)
  Fleece weight                      0.64 lb.           5.5     0.51
  Length of wool                     0.12 cm.           1.3     0.14
  Body weight at 1 year              2.91 lb.           3.7     0.36
Poultry (Shoffner, 1948)
  Egg-production                     9.26 eggs          6.2     -
  Hatchability                       4.36 %             6.4     -
  Body weight                        0.04 lb.           0.8     -
Mice (original data)
  Litter size at birth               0.60 young         8.0     0.28
  Weight at 6 weeks (♀♀)             0.58 gm.           2.6     0.26
Drosophila melanogaster
(Tantawy and Reeve, 1956)
  Fertility (per pair per day)       2.2 offspring      6.7     -
  Viability (egg to adult)           2.6 %              3.7     -
  Wing length                        2.8 (1/100) mm.    1.4     0.80
Drosophila subobscura
(Hollingsworth and Smith, 1955)
  Fertility (per pair per day)       6.0 offspring      12.5    -
  Egg hatchability                   8.3 %              8.3     -

Therefore a change of population mean on inbreeding must be
connected with a difference of genotypic value between homozygotes and
heterozygotes. Let us now see more precisely how the population
mean depends on the degree of inbreeding, which we may
conveniently express as the inbreeding coefficient, F.
Consider a population, subdivided into a number of lines, with a
coefficient of inbreeding, F. The expression for the population mean
is derived by putting together the reasoning set out in Tables 3.1 and
7.1, in the following way. Table 14.2 shows the three genotypes of a
two-allele locus with their genotypic frequencies in the whole
population. These frequencies come from Table 3.1, p and q being the
gene frequencies in the whole population. Then the third column
gives the genotypic values assigned as in Fig. 7.1. The value and
frequency of each genotype are multiplied together in the right-hand
column, the summation of which gives the contribution of this locus
to the population mean.

Table 14.2

Genotype    Frequency       Value    Frequency × Value
A₁A₁        p² + pqF        +a       p²a + pqaF
A₁A₂        2pq − 2pqF      d        2pqd − 2pqdF
A₂A₂        q² + pqF        −a       −q²a − pqaF

                            Sum = a(p − q) + 2dpq − 2dpqF
                                = a(p − q) + 2dpq(1 − F)

Thus, referring still to the effects of a single
locus, we find that a population with inbreeding coefficient F has a
mean genotypic value:
\begin{align}
M_F &= a(p - q) + 2dpq(1 - F) \tag{14.1} \\
    &= M_0 - 2dpqF \tag{14.2}
\end{align}
where M_0 is the population mean before inbreeding, from equation
7.2. The change of mean resulting from inbreeding is therefore
−2dpqF. This shows that a locus will contribute to a change of mean
value on inbreeding only if d is not zero; in other words if the value
of the heterozygote differs from the average value of the homozygotes.
This conclusion, though demonstrated in detail only for two alleles
at a locus, is equally valid for loci with more than two alleles. The
following general conclusions can therefore be drawn: that a change
of mean value on inbreeding is a consequence of dominance at the
loci concerned with the character, and that the direction of the change
is toward the value of the more recessive alleles. The dominance may
be partial or complete, or it may be overdominance; all that is
necessary for a locus to contribute to a change of mean is that the
heterozygote should not be exactly intermediate between the two
homozygotes. Equation 14.2 shows also that the magnitude of the change
of mean depends on the gene frequencies. It is greatest when pq is
maximal: that is, when p = q = 1/2. Genes at intermediate frequencies
therefore contribute more to a change of mean than genes at high or
low frequencies, other things being equal.
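
The single-locus argument can be checked numerically. What follows is a minimal sketch in Python and is not part of the original treatment: the function names and the numerical values chosen for p, a, d and F are illustrative assumptions. It builds the mean from the genotype frequencies and values of Table 14.2 and compares it with the closed form of equation 14.2.

```python
def genotype_frequencies(p, F):
    """Genotype frequencies in the whole population (Table 14.2)."""
    q = 1.0 - p
    return {
        "A1A1": p * p + p * q * F,
        "A1A2": 2 * p * q * (1 - F),
        "A2A2": q * q + p * q * F,
    }


def mean_from_table(p, a, d, F):
    """Sum of frequency x value over the three genotypes."""
    freq = genotype_frequencies(p, F)
    value = {"A1A1": a, "A1A2": d, "A2A2": -a}
    return sum(freq[g] * value[g] for g in freq)


def mean_from_equation(p, a, d, F):
    """Closed form: mean before inbreeding minus 2dpqF (equation 14.2)."""
    q = 1.0 - p
    mean_before = a * (p - q) + 2 * d * p * q
    return mean_before - 2 * d * p * q * F


def change_of_mean(p, d, F):
    """Change of mean on inbreeding for one locus: -2dpqF."""
    return -2 * d * p * (1 - p) * F


if __name__ == "__main__":
    # Illustrative values only (assumptions, not data from the text).
    p, a, d, F = 0.3, 1.0, 0.5, 0.25
    print(mean_from_table(p, a, d, F), mean_from_equation(p, a, d, F))
    # The change vanishes without dominance and is largest at p = 1/2.
    for freq in (0.1, 0.5, 0.9):
        print(freq, change_of_mean(freq, d, F))
```

The two routes to the mean agree for any p and F, and setting d = 0 gives no change of mean, which is the point made in the text.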
Now let us consider the combined effect of all the loci that affect
the character. In so far as the genotypic values of the loci combine
additively, the population mean is given by summation of the
contributions of the separate loci, thus:
\begin{align}
M_F &= \Sigma a(p - q) + 2(\Sigma dpq)(1 - F) \tag{14.3} \\
    &= M_0 - 2F\Sigma dpq \tag{14.4}
\end{align}
and the change of mean on inbreeding is −2FΣdpq.
These expressions show what are the circumstances under which
a metric character will show a change of mean value on inbreeding.
The chief one is if the dominance of the genes concerned is
preponderantly in one direction; i.e. if there is directional dominance.
If the genes that increase the value of the character are dominant
over their alleles that reduce the value, then inbreeding will result in
a reduction of the population mean, i.e. a change in the direction of
the more recessive alleles. The contribution of each locus, however,
depends also on its gene frequencies, those with intermediate
frequencies having the greatest effect on the change of mean value.
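
The role of directional dominance can be illustrated with a short sketch. This is a hedged example under the additive-combination assumption stated above; the function name and the two hypothetical sets of loci are assumptions made for illustration, not data from any experiment.

```python
def change_of_mean_multilocus(loci, F):
    """Change of mean on inbreeding, -2F * sum(d*p*q), for additive loci.

    `loci` is an iterable of (p, d) pairs: the gene frequency and the
    dominance deviation at each locus.
    """
    return -2 * F * sum(d * p * (1 - p) for p, d in loci)


if __name__ == "__main__":
    # Hypothetical loci: dominance all in one direction ...
    directional = [(0.5, 1.0), (0.5, 1.0)]
    # ... and dominance in opposite directions at the two loci.
    mixed = [(0.5, 1.0), (0.5, -1.0)]
    for F in (0.25, 0.5):
        print(F, change_of_mean_multilocus(directional, F),
              change_of_mean_multilocus(mixed, F))
```

With dominance in one direction the mean falls in proportion to F; with dominance in opposite directions the contributions cancel, although each locus taken separately shows dominance.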
We have now reached two conclusions about the effects of
inbreeding, one from observation, that inbreeding reduces fitness; the
other from theory, that the change is in the direction of the more
recessive alleles. Putting these two conclusions together leads to the
generalisation, already familiar from Mendelian genetics, that
deleterious alleles tend to be recessive.
Another conclusion that can be drawn from equation 14.4 is that
when loci combine additively the change of mean on inbreeding
should be directly proportional to the coefficient of inbreeding. In
other words the change of mean should be a straight line when
plotted against F. Two examples of experimentally observed
inbreeding depression are illustrated in Fig. 14.1.
On the whole the observed inbreeding depression does tend to be
linear with respect to F, and this might be taken as evidence that
epistatic interaction between loci is not of great importance. There
are, however, several practical difficulties that stand in the way of
drawing firm conclusions from observations of the rate of inbreeding
depression. One is that as inbreeding proceeds and reproductive
capacity deteriorates, it soon becomes impossible to avoid the loss of
some lines. The survivors are then a selected group to which the
theoretical expectations no longer apply. Thus precise measurement
of the rate of inbreeding depression can generally be made only over
the early stages, before the inbreeding coefficient reaches high levels.
Another difficulty, met with particularly in the study of mammals,
arises from maternal effects. Maternal qualities are among the most
sensitive characters to inbreeding depression. The effect of
inbreeding on another character that is influenced by maternal effects is
therefore twofold: part being attributable to the inbreeding of the
individuals measured and part to the inbreeding in the mothers. So
the relationship between the character measured and the coefficient
of inbreeding cannot be depicted in any simple manner. In
consequence of these difficulties reliable conclusions cannot easily be
drawn from the exact form of the inbreeding depression observed in
experiments.

Fig. 14.1. Examples of inbreeding depression affecting fertility.
(a) Litter-size in mice (original data). Mean number born alive in
1st litters, plotted against the coefficient of inbreeding of the litters.
The first generation was by double-first-cousin mating; thereafter
by full-sib mating. No selection was practised. (b) Fertility in
Drosophila subobscura. Mean number of adult progeny per pair per
day, plotted against the inbreeding coefficient of the parents.
Consecutive full-sib matings. (Redrawn from Hollingsworth &
Smith, 1955.)

Example 14.1. The complications arising from maternal effects may
be illustrated by litter size in pigs and mice. Litter size is a composite
character, which is partly an attribute of the mother and partly an attribute
of the young in the litter. It is therefore influenced both by the inbreeding
of the mother and by the inbreeding of the young, and these two influences
are difficult to disentangle in practice. Studies on pigs (Dickerson et al.,
1954) have shown that the reduction of litter size due to inbreeding in the
mother alone is about 0.20 young per 10 per cent of inbreeding; and the
reduction due to inbreeding in the young alone is about 0.17 young per 10
per cent of inbreeding. Thus the effects of inbreeding in the mother and in
the young are about equally important. A small experiment with mice
(original data) gave much the same picture. A rough separation of the
effects of inbreeding in the mother and in the young was made by means of
crosses between lines after 2 or 3 generations of sib mating. (The
justification for regarding this as a measure of the inbreeding depression
will be explained in the next section.) The mean litter sizes, arranged
according to the coefficient of inbreeding of the mothers and of the young,
are given in the table.

Inbreeding coefficient      Inbreeding coefficient of mothers
of young                    0%        37.5%     50%
0%                          8.2       7.5       7.3
50%                         -         6.3       -
59%                         -         -         5.8

The three comparisons in the first row show the effect of inbreeding in the
mothers, and give values of 0.19, 0.18 and 0.16 for the reduction of litter
size per 10 per cent of inbreeding. The comparisons in the second and
third columns show the effect of inbreeding in the young, and give values
of 0.24 and 0.25 for the reduction per 10 per cent of inbreeding. Thus
inbreeding in the young had rather more effect than inbreeding in the
mother. These results, however, should not be taken as being
characteristic of mice in general.
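
The arithmetic of Example 14.1 can be retraced as follows. This is only a sketch: the litter sizes are those of the table in the example, while the function name and the dictionary layout are assumptions introduced for illustration.

```python
# Mean litter size keyed by (F of mothers in %, F of young in %),
# taken from the table in Example 14.1.
LITTER_SIZE = {
    (0.0, 0.0): 8.2,
    (37.5, 0.0): 7.5,
    (50.0, 0.0): 7.3,
    (37.5, 50.0): 6.3,
    (50.0, 59.0): 5.8,
}


def reduction_per_10_percent(mean_low, mean_high, f_low, f_high):
    """Reduction of the mean per 10 per cent increase of F (F in per cent)."""
    return (mean_low - mean_high) / (f_high - f_low) * 10


if __name__ == "__main__":
    # Effect of inbreeding in the mothers (young non-inbred, first row).
    for f1, f2 in ((0.0, 37.5), (0.0, 50.0), (37.5, 50.0)):
        r = reduction_per_10_percent(
            LITTER_SIZE[(f1, 0.0)], LITTER_SIZE[(f2, 0.0)], f1, f2)
        print("mothers", f1, f2, round(r, 2))
    # Effect of inbreeding in the young (same mothers, within a column).
    for f_mother, f_young in ((37.5, 50.0), (50.0, 59.0)):
        r = reduction_per_10_percent(
            LITTER_SIZE[(f_mother, 0.0)], LITTER_SIZE[(f_mother, f_young)],
            0.0, f_young)
        print("young", f_mother, f_young, round(r, 2))
```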
The effect of selection. The neglect of selection during
inbreeding is an unrealistic omission because natural selection cannot
be wholly avoided even in laboratory experiments. Since inbreeding
tends to reduce fitness, natural selection is likely to oppose the
inbreeding process by favouring the least homozygous individuals.
The balance between selection and the dispersion of gene frequencies
was discussed in Chapter 4, and the only further point that need be
added here is that the operation of natural selection makes the
inbreeding depression dependent on the rate of inbreeding. One must
distinguish between the state of dispersion of gene frequencies and
the coefficient of inbreeding as computed from the population size or
the pedigree relationships. The state of dispersion is what determines
the amount of inbreeding depression; the coefficient of inbreeding is a
measure of the state of dispersion only in the absence of selection.
When selection operates, the state of dispersion will be less than that
indicated by the coefficient of inbreeding, and the discrepancy
between the two will be greater when the rate of inbreeding is slower,
because the selection will then be relatively more potent. Therefore
one must expect the inbreeding depression caused by a given increase
of the computed coefficient of inbreeding to be less when inbreeding
is slow than when it is rapid.
Heterosis
Complementary to the phenomenon of inbreeding depression is
its opposite, "hybrid vigour" or heterosis. When inbred lines are
crossed, the progeny show an increase of those characters that
previously suffered a reduction from inbreeding. Or, in general terms, the
fitness lost on inbreeding tends to be restored on crossing. That the
phenomenon of heterosis is simply inbreeding depression in reverse
can be seen by consideration of how the population mean depends on
the coefficient of inbreeding, as shown in equation 14.4. Consider, as
before, a population subdivided into a number of lines. If the lines
are crossed at random, the average coefficient of inbreeding in the
crossbred progeny reverts to that of the base population. Thus, if a
number of crosses are made at random between the lines, the mean
value of any character in the crossbred progeny is expected to be the
same as the population mean of the base population. In other words,
the heterosis on crossing is expected to be equal to the depression on
inbreeding. Furthermore, if the population is continued after the
crossing by random mating among the crossbred and subsequent
generations, the coefficient of inbreeding will remain unchanged, and
the population mean is consequently expected to remain at the level
of the base population. We may, thus, make the following
generalisation on theoretical grounds: that, in the absence of selection,
inbreeding followed by crossing of the lines in a large population is not
expected to make any permanent change in the population mean.
Example 14.2. An experiment with mice (R. C. Roberts, unpublished)
was designed to test the theoretical expectation that in the absence of
selection the heterosis on crossing should be equal to the depression on
inbreeding. The character studied was litter size. Thirty lines taken from
a random-bred population were inbred by 3 consecutive generations of
full-sib mating, bringing the coefficient of inbreeding up to 50 per cent in
the litters and 37.5 per cent in the mothers. No selection was practised
during the inbreeding, and only 2 of the 30 lines were lost as a
consequence of their inbreeding depression.
After the third generation of inbreeding, crosses were made at random
between the lines, and in the next generation crosses between the F₁'s were
made so as to give crossbred mothers with non-inbred young. The mean
litter sizes observed at the different stages are given in the table.

                              Litter size
Before inbreeding             8.1
Inbred (litters: F = 50%)     5.7
Crossbred                     8.5

The inbreeding depression was 2.4 and the heterosis 2.8; the two are equal
within the limits of experimental error.
Single crosses. The foregoing theoretical conclusions refer to
the average of a large number of crosses between lines derived from a
single base population. In practice, however, one is often interested
in a somewhat different problem, namely the heterosis shown by a
particular cross between two lines, or between two populations which
may have no known common origin. To refer the changes of mean
value to changes of inbreeding coefficient would be inappropriate
under these circumstances, and the theoretical basis of the heterosis is
better expressed in terms of the gene frequencies in the two lines.
We may recall from Chapter 3 that inbreeding leads to a dispersion of
gene frequencies among the lines, the lines becoming differentiated
in gene frequency as inbreeding proceeds; and the coefficient of
inbreeding is a means of expressing the degree of differentiation
(equation 3.14). In turning from the inbreeding coefficient to the
gene frequencies as a basis for discussion we are therefore turning
from the general, or average, consequence of crossing, to the
particular circumstances in two lines.
Let us, then, consider two populations, referred to as the "parent
populations," both random-bred though not necessarily large. The
parent populations are crossed to produce an F₁ or "first crossbred
generation," and the F₁ individuals are mated together at random to
produce an F₂ or "second crossbred generation." The amount of
heterosis shown by the F₁ or the F₂ will be measured as the deviation
from the mid-parent value, i.e. as the difference from the mean of the
two parent populations. First consider the effects of a single locus
with two alleles whose frequencies are p and q in one population, and
p' and q' in the other. Let the difference of gene frequency between
the two populations be y, so that y = p − p' = q' − q. The algebra is
then simplified by writing the gene frequencies p' and q' in the second
population as (p − y) and (q + y). Let the genotypic values be a, d, −a,
as before. They are assumed to be the same in the two
populations, epistatic interaction being disregarded. We have to find the
mean of each parent population and the mid-parent value; then the
mean of the F₁ and the mean of the F₂. The parental means, M_{P_1} and
M_{P_2}, are found from equation 7.2. They are
\begin{align}
M_{P_1} &= a(p - q) + 2dpq \\
M_{P_2} &= a(p - y - q - y) + 2d(p - y)(q + y) \\
        &= a(p - q - 2y) + 2d[pq + y(p - q) - y^2]
\end{align}
The mid-parent value is
\begin{align}
M_{\bar{P}} &= \tfrac{1}{2}(M_{P_1} + M_{P_2}) \\
            &= a(p - q - y) + d[2pq + y(p - q) - y^2] \tag{14.5}
\end{align}
When the two populations are crossed to produce the F₁,
individuals taken at random from one population are mated to
individuals taken at random from the other population. This is equivalent
to taking genes at random from the two populations, as shown in
Table 14.3.

Table 14.3
Frequencies of Zygotes in the F₁

                                    Gametes from P₁
                                    A₁            A₂
                                    p             q
Gametes from P₂    A₁   p − y       p(p − y)      q(p − y)
                   A₂   q + y       p(q + y)      q(q + y)

The F₁ is therefore constituted as follows:

Genotypes           A₁A₁        A₁A₂              A₂A₂
Frequencies         p(p − y)    2pq + y(p − q)    q(q + y)
Genotypic values    a           d                 −a

The mean genotypic value of the F₁ is therefore:
\begin{align}
M_{F_1} &= a(p^2 - py - q^2 - qy) + d[2pq + y(p - q)] \\
        &= a(p - q - y) + d[2pq + y(p - q)] \tag{14.6}
\end{align}
The amount of heterosis, expressed as the difference between the F₁
and the mid-parent values, is obtained by subtracting equation 14.5
from equation 14.6:
\begin{align}
H_{F_1} &= M_{F_1} - M_{\bar{P}} \\
        &= dy^2 \tag{14.7}
\end{align}
Thus heterosis, just like inbreeding depression, depends for its
occurrence on dominance. Loci without dominance (i.e. loci for which
d = 0) cause neither inbreeding depression nor heterosis. The amount
of heterosis following a cross between two particular lines or
populations depends on the square of the difference of gene frequency (y)
between the populations. If the populations crossed do not differ in
gene frequency there will be no heterosis, and the heterosis will be
greatest when one allele is fixed in one population and the other allele
in the other population.
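
The result H = dy² for the F₁ can be verified directly from the gametic frequencies. The sketch below is illustrative only: the function names and the numerical gene frequencies are assumptions, and the two parent populations are taken to be random-mating with the same genotypic values a, d, −a, as in the text.

```python
def parental_mean(p, a, d):
    """Mean of a random-mating population: a(p - q) + 2dpq (equation 7.2)."""
    q = 1 - p
    return a * (p - q) + 2 * d * p * q


def f1_mean(p1, p2, a, d):
    """Mean of the F1 when gametes unite at random between populations."""
    q1, q2 = 1 - p1, 1 - p2
    freq_a1a1 = p1 * p2
    freq_a1a2 = p1 * q2 + q1 * p2
    freq_a2a2 = q1 * q2
    return a * freq_a1a1 + d * freq_a1a2 - a * freq_a2a2


def f1_heterosis(p1, p2, a, d):
    """Deviation of the F1 mean from the mid-parent value."""
    midparent = 0.5 * (parental_mean(p1, a, d) + parental_mean(p2, a, d))
    return f1_mean(p1, p2, a, d) - midparent


if __name__ == "__main__":
    # Illustrative values only; y = p1 - p2 is the difference of frequency.
    p1, p2, a, d = 0.8, 0.3, 1.0, 0.6
    y = p1 - p2
    print(f1_heterosis(p1, p2, a, d), d * y ** 2)
```

The heterosis computed from the zygote frequencies agrees with d·y², is zero when the two populations have the same gene frequency, and equals d when opposite alleles are fixed in the two populations.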
Now consider the joint effects of all loci at which the two parent
populations differ. In so far as the genotypic values attributable to
the separate loci combine additively, we may represent the heterosis
produced by the joint effects of all the loci as the sum of their separate
contributions. Thus the heterosis in the F₁ is
\[ H_{F_1} = \Sigma dy^2 \tag{14.8} \]
If some loci are dominant in one direction and some in the other their
effects will tend to cancel out, and no heterosis may be observed, in
spite of the dominance at the individual loci. The occurrence of
heterosis on crossing is therefore, like inbreeding depression,
dependent on directional dominance, and the absence of heterosis is not
sufficient ground for concluding that the individual loci show no
dominance.
Before we go on to consider the F₂ it is perhaps worth noting that
the formulation of the heterosis in terms of the square of the
difference of gene frequency, in equations 14.7 and 14.8, is quite in line
with the previous formulation of the inbreeding depression in terms
of the coefficient of inbreeding. If we envisage once more the whole
population subdivided into lines, and we suppose pairs of lines to be
taken at random, then the mean squared difference of gene frequency
between the pairs of lines will be equal to twice the variance of gene
frequency among the lines. That is: \(\overline{y^2} = 2\sigma_q^2\). And, by
equation 3.14, \(2\sigma_q^2 = 2pqF\). Therefore the mean amount of heterosis
shown by crosses between random pairs of lines is equal to the inbreeding
depression as given in equation 14.2, though of opposite sign.
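
The step from the squared difference of gene frequency to the variance among lines can be checked by enumeration. The following sketch assumes that the two lines of each pair are drawn independently from the same distribution of gene frequencies (that is, from a very large number of lines); the function names and the example distributions are hypothetical.

```python
from itertools import product


def variance_of_gene_frequency(freqs, weights):
    """Variance of gene frequency among lines; weights must sum to 1."""
    mean = sum(x * w for x, w in zip(freqs, weights))
    return sum(w * (x - mean) ** 2 for x, w in zip(freqs, weights))


def mean_squared_difference(freqs, weights):
    """Mean of y^2 over all ordered pairs of independently drawn lines."""
    lines = list(zip(freqs, weights))
    return sum(w1 * w2 * (x1 - x2) ** 2
               for (x1, w1), (x2, w2) in product(lines, repeat=2))


if __name__ == "__main__":
    # Complete fixation (F = 1) with initial gene frequency p = 0.3:
    # a proportion p of lines is fixed for one allele, q for the other.
    p = 0.3
    freqs, weights = [1.0, 0.0], [p, 1 - p]
    print(mean_squared_difference(freqs, weights),
          2 * variance_of_gene_frequency(freqs, weights),
          2 * p * (1 - p))
```

In the fully fixed case the variance among lines is pq, so the mean squared difference is 2pq, which is 2pqF with F = 1.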
Now let us consider the F₂ of a particular cross of two parent
populations, the F₂ being made by random mating among the
individuals of the F₁. In consequence of the random mating, the
genotype frequencies in the F₂ will be the Hardy-Weinberg frequencies
corresponding to the gene frequency in the F₁. The mean genotypic
value of the F₂ is then easily derived by application of equation 7.2.
The gene frequency in the F₁, being the mean of the gene frequencies
in the two parent populations, is (p − ½y) for one allele, and (q + ½y)
for the other. Putting these gene frequencies in place of p and q
respectively in equation 7.2 gives the mean genotypic value of the
F₂ as:
\begin{align}
M_{F_2} &= a(p - \tfrac{1}{2}y - q - \tfrac{1}{2}y) + 2d(p - \tfrac{1}{2}y)(q + \tfrac{1}{2}y) \\
        &= a(p - q - y) + d[2pq + y(p - q) - \tfrac{1}{2}y^2] \tag{14.9}
\end{align}
The amount of heterosis shown by the F₂ is the difference between
the F₂ and mid-parent values. So, from equations 14.5 and 14.9,
\begin{align}
H_{F_2} &= M_{F_2} - M_{\bar{P}} \\
        &= \tfrac{1}{2}dy^2 \\
        &= \tfrac{1}{2}H_{F_1} \tag{14.10}
\end{align}
We find therefore that the heterosis shown by the F₂ is only half as
great as that shown by the F₁. In other words, the F₂ is expected to
drop back halfway from the F₁ value toward the mid-parent value.
At first sight this conclusion may seem to contradict the one arrived
at earlier, when we were considering crosses between many lines, the
F₁ and F₂ means then being equal. The difference between the two
situations is that an F₂ made by random mating among a large number
of different crosses has the same inbreeding coefficient as the F₁.
But an F₂ made from an F₁ derived from a single cross has inevitably
an increased inbreeding coefficient. If the inbreeding coefficient is
worked out in the manner described in Example 5.2, it will be found
to be half the inbreeding coefficient of the parent lines. The change
of mean from F₁ to F₂ may therefore be regarded as inbreeding
depression. It cannot be overcome by having a large number of parents
of the F₂ because the restriction of population size that causes the
inbreeding has already been made in the single cross of only two lines,
or parent populations. There need, however, be no further rise of the
inbreeding coefficient in the F₃ and subsequent generations.
Provided, therefore, that there is no other reason for the gene frequency
to change, the population mean will be the same in the generations
following as in the F₂.
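
The halving of the heterosis in the F₂ follows from applying the Hardy-Weinberg mean to the averaged gene frequency. A minimal single-locus sketch, with hypothetical function names and illustrative frequencies, and assuming random mating and no epistasis as in the text:

```python
def hardy_weinberg_mean(p, a, d):
    """Mean of a random-mating population: a(p - q) + 2dpq (equation 7.2)."""
    q = 1 - p
    return a * (p - q) + 2 * d * p * q


def f2_heterosis(p1, p2, a, d):
    """Deviation of the F2 mean from the mid-parent value.

    The gene frequency in the F1, and hence in the F2, is the mean of the
    gene frequencies in the two parent populations.
    """
    p_cross = 0.5 * (p1 + p2)
    midparent = 0.5 * (hardy_weinberg_mean(p1, a, d)
                       + hardy_weinberg_mean(p2, a, d))
    return hardy_weinberg_mean(p_cross, a, d) - midparent


if __name__ == "__main__":
    # Illustrative values only.
    p1, p2, a, d = 0.8, 0.3, 1.0, 0.6
    y = p1 - p2
    print(f2_heterosis(p1, p2, a, d), 0.5 * d * y ** 2)
```

Because the gene frequency does not change after the F₁, applying the same function to later generations gives the same value, in keeping with the statement that the mean stays at the F₂ level.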
That the heterosis expected in the F₂ is half that found in the F₁
is equally true when the joint effects of all loci are considered,
provided that epistatic interaction is absent. The conclusion for a single
locus was based on the principle that Hardy-Weinberg equilibrium
is attained by a single generation of random mating. It will be
remembered from Chapter 1 (p. 19), however, that this is not true
with respect to genotypes at more than one locus considered jointly.
Therefore if there is epistatic interaction, the population mean will
not reach its equilibrium value in the F₂, but will approach it more or
less rapidly according to the number of interacting loci and the
closeness of the linkage between them. The existence of epistatic
interaction is intimately connected with the scale of measurement,
but this matter will not be discussed until Chapter 17. Here we need
only note that for reasons connected with the scale of measurement
the halving of the heterosis in the F₂ expected on theoretical grounds
is not often found at all exactly in practice, though the F₂ usually falls
somewhere between the F₁ and mid-parent values. Some examples
from plants of the heterosis observed in the F₁ and F₂ generations are
illustrated in Fig. 14.2. It will be noticed that with some of the
characters shown, the F₁ and F₂ are lower in value than the
mid-parent, and the heterosis is consequently negative in sign. This is in
no way inconsistent with our definition of heterosis as the difference
between the F₁ or F₂ and the mid-parent value. The sign of the
difference depends simply on the nature of the measurement. For
example, the character "days to first fruit," represented in the lower
graphs, shows heterosis of negative sign: but if the character were
called "speed of development" and expressed as a reciprocal of time
the heterosis would be positive in sign.

Fig. 14.2. Some illustrations of heterosis observed in crosses
between pairs of highly inbred strains of plants. The points show
the mean values of the two parent strains, the F₁ and the F₂
generations. The mid-parent values are shown by horizontal lines.
Graph (a) refers to tobacco, Nicotiana rustica (data from Smith,
1952). All the other graphs refer to tomatoes, Lycopersicon (data
from Powers, 1952). The characters represented are:
(a) Height of plant (in.)
(b) Mean weight of one fruit (gm.)
(c) Number of locules per fruit
(d) Mean weight per locule (gm.)
(e)-(h) Mean time in days between the planting of the seed and
the ripening of the first fruit, in 4 different crosses.

The relative amount of heterosis observed in the F₁ and F₂
generations is complicated also by the existence of maternal effects,
particularly in mammals. A character subject to a maternal effect,
such as litter size, is divided between two generations. The maternally
determined component of the character may be expected to follow the
same general pattern of heterosis in the F₁ and F₂ as we have just
discussed, but it will be one generation out of phase with the
non-maternal part of the character. Thus the heterosis observed in the F₁
is attributable to the non-maternal part, the maternal effect being still
at the inbred level. In the F₂, however, the non-maternal part will
lose half the heterosis as explained above, but the maternal effect will
now show the full effect of its heterosis since the mothers are now in
the F₁ stage. This rather complicated situation may perhaps be more
readily grasped from the diagrammatic representation in Fig. 14.3.
As a result of maternal effects, therefore, the loss of heterosis in the
F₂ and subsequent generations is usually less noticeable with animals
than with plants, and experiments of great precision would be
required to detect any regular pattern.

Fig. 14.3. Diagram of the heterosis expected in a character
subject to a maternal effect, when two lines are crossed and the F₂ is
made by random mating among the F₁. The maternal and
non-maternal components of the character separately are here supposed
to show equal amounts of heterosis, and to combine by simple
addition to give the character as it is measured. (Horizontal axis:
generation, F₁ to F₃. The curves show the character as measured,
the non-maternal component, and the maternal component.)

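
The one-generation lag of the maternal component can be made concrete with a small sketch. It follows the assumptions of Fig. 14.3 (components that combine by simple addition), with the further assumption from the preceding section that the non-maternal heterosis stays at half its F₁ value from the F₂ onward. The function names and the unit amounts of heterosis are hypothetical.

```python
def nonmaternal_heterosis(generation, h):
    """Heterosis of the non-maternal part: full in the F1, half thereafter.

    `generation` is 1 for the F1, 2 for the F2, and so on.
    """
    return h if generation == 1 else 0.5 * h


def maternal_heterosis(generation, h):
    """Heterosis of the maternal part, one generation out of phase.

    The mothers of the F1 belong to the parent lines (no heterosis);
    the mothers of generation g otherwise belong to generation g - 1.
    """
    if generation == 1:
        return 0.0
    return nonmaternal_heterosis(generation - 1, h)


def measured_heterosis(generation, h_nonmaternal, h_maternal):
    """Heterosis of the character as measured (components add)."""
    return (nonmaternal_heterosis(generation, h_nonmaternal)
            + maternal_heterosis(generation, h_maternal))


if __name__ == "__main__":
    # Equal heterosis (1 unit) in the two components, as in Fig. 14.3.
    for g in (1, 2, 3):
        print("F%d" % g, measured_heterosis(g, 1.0, 1.0))
```

Under these assumptions the measured character shows more heterosis in the F₂ than in the F₁ and only returns to the F₁ level in the F₃, which is why the expected loss of heterosis is hard to detect in animals.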
Wide crosses. We have seen that the amount of heterosis shown
by a particular cross depends, among other things, on the differences
of gene frequency between the two populations crossed. This would
seem to indicate that the amount of heterosis would increase with the
degree of genetic differentiation between the two populations and
would be limited only by the barrier of interspecific sterility. This,
however, is not true. Crosses between subspecies, or between local
races, taken from the wild often fail to show heterosis, particularly
in characters closely related to fitness which show heterosis in crosses
between less differentiated laboratory populations. Indeed the F₁'s
of wide crosses are often less fit than the parent populations. Much
of the evidence about such crosses comes from studies of wild
populations of Drosophila pseudoobscura and other species (see
Dobzhansky, 1950; Wallace and Vetukhiv, 1955). Though wide
crosses may not show heterosis in fitness, they do often show
heterosis in certain characters, particularly growth rate in plants.
Dobzhansky (1950, 1952), who drew attention to this, refers to heterosis
in fitness as "euheterosis" and to heterosis in a character that does not
confer greater fitness as "luxuriance."
The error in extending our earlier conclusion to wide crosses
arises from the fact that we have assumed epistatic interaction
between loci to be negligible, an assumption that is probably justified
for crosses between breeds of domestic animals or between laboratory
populations, but is obviously not justified in the case of crosses
between differentiated wild populations. The existing genetic
differentiation between wild populations has, for the most part, arisen by
evolutionary adaptation to the local conditions. Adaptation to local
conditions or to a particular way of life involves many different
characters, both structural and functional, because the fitness of the
organism depends on the harmonious interrelations of all its parts.
If two populations adapted to different ways of life are crossed, the
crossbred individuals will be adapted to neither, and will
consequently be less fit than either of the parent populations. The effect
of this evolutionary adaptation on the genetic structure of the
populations is as follows. The genes A₁ and B₁, say, are selected in one
population because together they increase fitness, though either one
separately may not; while, in another population living under
different conditions, the genes A₂ and B₂ are selected for similar reasons.
In respect of fitness, therefore, there is epistatic interaction between
these two loci. But if these pairs of genes become fixed throughout the
two populations, A₁ and B₁ in one and A₂ and B₂ in the other, and so
become part of their constant genetic structure, the variation arising
from this interaction will disappear. Within any one population,
therefore, we may find very little epistatic variation, and the
interaction will become apparent as a cause of variation between individuals
only in a crossbred population in which there is segregation at both
interacting loci.
The idea that the genetic structure of a natural population evolves
as a whole, so that the selection pressure on any one locus is
dependent on the alleles present at many of the other loci, is expressed in
the terms "coadaptation" and "integration," used to describe the genetic
structure of natural populations. (For general discussions of these
concepts, see Dobzhansky, 1951b; Lerner, 1954, 1958; Wright,
1956.) The important point for us to note is this. The property of
coadaptation, or integration, assumes primary importance only when
different populations are to be compared and when the results of
crossing adaptively differentiated populations are to be studied; it is
of less importance in the genetic study of a single population. In
this book we are chiefly concerned with the genetic variation within a
population: that is, the variation arising from the segregation of genes
in the population. Some of this variation arises from epistatic
interaction between the genes segregating at different loci, which is the
raw material, as it were, from which coadaptation could evolve if the
population were to become subdivided. But the amount of this
epistatic variation within a population is probably seldom very large,
and moreover it is seldom necessary to distinguish it from other
sources of nonadditive genetic variance.
CHAPTER 15
INBREEDING AND CROSSBREEDING:
II. Changes of Variance
The effect of inbreeding on the genetic variance of a metric character
is apparent, in its general nature, from the description of the changes
of gene frequency given in Chapter 3. Again, we have to imagine the
whole population, consisting of many lines. Under the dispersive
effect of inbreeding, or random drift, the gene frequencies in the
separate lines tend toward the extreme values of 0 or 1, and the lines
become differentiated in gene frequency. Since the mean genotypic
value of a metric character depends on the gene frequencies at the
loci affecting it, the lines become differentiated, or drift apart, in
mean genotypic value. And, since the genetic components of
variance diminish as the gene frequencies tend toward extreme values
(see Fig. 8.1), the genetic variance within the lines decreases. The
general consequence of inbreeding, therefore, is a redistribution of the
genetic variance; the component appearing between the means of
lines increases, while the component appearing within the lines
decreases. In other words, inbreeding leads to genetic differentiation
between lines and genetic uniformity within lines. The
differentiation is illustrated from experimental data in Fig. 15.1.
The subdivision of an inbred population into lines introduces an
additional observational component of variance, the between-line
component, and it is not surprising that this adds a considerable
complication to the theoretical description of the components of
genetic variance. Indeed, a full theoretical treatment of the redistribution
of variance has not yet been achieved. Here we shall attempt
no more than a brief description of the main outlines, and for this we
shall have to make some simplifications. In particular we shall
entirely neglect the interaction component of genetic variance arising
from epistasis. For detailed treatment of various aspects of the
problem, and for references, see Kempthorne (1957, Ch. 17). After
this description of the redistribution of genetic variance we shall
consider changes of environmental variance. The greater sensitivity
of inbred individuals to environmental sources of variation was
mentioned earlier, in Chapter 8. This phenomenon interferes with
the experimental study of the changes of variance, and until it is
better understood we cannot put much reliance on the theoretical
expectations concerning variance being manifest in the observable
phenotypic variance. Finally, in this chapter, we shall discuss the use
of inbred animals for experimental purposes.
Fig. 15.1. Differentiation between lines by random drift, shown
by abdominal bristle number in Drosophila melanogaster. The
graphs show the mean bristle number in each of 10 lines during
full-sib inbreeding without artificial selection. (From Rasmuson,
1952; reproduced by courtesy of the author and the editor of Acta
Zoologica.)
[Figure: mean bristle number plotted against generations of inbreeding.]
Redistribution of Genetic Variance
The redistribution of variance arising from additive genes (i.e.
genes with no dominance) is easily deduced. This is because with
additive genes the proportions in which the original variance is distributed
within and between lines do not depend on the original
gene frequencies. When there is dominance, however, we cannot
deduce the changes of variance without a knowledge of the initial
gene frequencies. This not only adds considerably to the mathematical
complexity, but it renders a general solution impossible. We shall
first consider the case of additive genes, and then very briefly indicate
the conclusions arrived at for dominant genes. The effect of selection
will not be specifically discussed. We need only note that natural
selection will tend to render the actual state of dispersion of gene
frequencies less than that indicated by the inbreeding coefficient
computed from the population size or pedigree relationships. Therefore
we must expect the redistribution of genetic variance to proceed
at a slower rate than the theoretical expectation, and we must expect
the discrepancy to be greater when inbreeding is slow than when it is
rapid.
No dominance. What follows refers to the variance arising from
additive genes: it does not apply to the additive variance arising from
genes with dominance. The conclusions therefore apply, strictly
speaking, only to characters which show no non-additive variance.
They serve, however, to indicate the general effect of inbreeding on
variance, and may be taken as a fair approximation to what is expected
of characters such as bristle number in Drosophila, that show little
non-additive genetic variance. The description to be given refers to
slow inbreeding, and is not strictly true of rapid inbreeding by sib-mating
or self-fertilisation. The redistribution of the variance under
rapid inbreeding is, however, not very different except in the first few
generations.
Consider first a single locus. When there is no dominance the
genotypic variance in the base population, given in equation 8.7, becomes
$$V_G = 2p_0q_0a^2$$
The variance within any one line is
$$V_G = 2pqa^2$$
where p and q are the gene frequencies in that line. The mean variance
within lines is
$$V_{Gw} = 2(\overline{pq})a^2$$
where $(\overline{pq})$ is the mean value of pq over all lines. Now, $2(\overline{pq})$ is the
overall frequency of heterozygotes in the whole population, which, by
Table 3.1, is equal to $2p_0q_0(1 - F)$, where F is the coefficient of inbreeding.
Therefore
$$V_{Gw} = 2p_0q_0a^2(1 - F) = V_G(1 - F)$$
and this remains true when summation of the variances is made over
all loci. Thus the within-line variance is (1 - F) times the original
variance, and as F approaches unity the within-line variance approaches
zero.
Now let us consider the between-line variance. This is the variance
of the true means of lines, and would be estimated from an
analysis of variance as the between-line component. For a single
locus, still with no dominance, the mean genotypic value of a line
with gene frequencies p and q is obtained from equation 7.2 as
$$M = a(p - q) = a(1 - 2q)$$
Thus we want to find the variance of $(a - 2aq)$. Now, in general,
$\sigma^2_{(X-Y)} = \sigma^2_X + \sigma^2_Y$, if X and Y are uncorrelated. Since in this case a is
constant from line to line (epistasis being assumed absent) it has no
variance, and so $\sigma^2_M = \sigma^2_{2aq}$. Again, in general,
$\sigma^2_{KX} = K^2\sigma^2_X$ when K is a constant. So
$$\sigma^2_M = 4a^2\sigma^2_q$$
$$= 4a^2p_0q_0F \quad \text{(from 3.14)}$$
$$= 2FV_G$$
and this also remains true when summation is made over all loci.
Thus the between-line genetic variance is 2F times the genetic variance
in the base population.
The partitioning of the genetic variance into components as
explained above is summarised in Table 15.1.

Table 15.1
Partitioning of the variance due to additive genes in a
population with inbreeding coefficient F, when the variance
due to additive genes in the base population is $V_G$.

    Between lines    $2FV_G$
    Within lines     $(1 - F)V_G$
    Total            $(1 + F)V_G$

The total genetic
variance in the whole population is the sum of the within-line and
between-line components, and is equal to (1 + F) times the original
genetic variance. (This is true also of close inbreeding.) Thus when
inbreeding is complete the genetic variance in the population as a
whole is doubled, and all of it appears as the between-line component.
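
The partition of the additive variance into within-line and between-line components can be checked numerically. The following is a minimal sketch and not part of the original treatment: it simulates random drift at a single additive locus in many independent lines and compares the observed components with (1 - F)V_G and 2FV_G. The function name `drift_partition`, the parameter values, and the use of F = 1 - (1 - 1/2N)^t for an idealised population of N parents per line are assumptions made for illustration.

```python
import random


def drift_partition(p0=0.5, a=1.0, n_parents=10, generations=10,
                    n_lines=4000, seed=1):
    """Simulate drift at one additive locus in many independent lines.

    Hypothetical helper: each line is an idealised population in which
    2N genes are drawn at random from the previous generation.
    """
    rng = random.Random(seed)
    n_genes = 2 * n_parents
    q0 = 1.0 - p0
    v_g = 2.0 * p0 * q0 * a * a                      # variance in the base population
    f = 1.0 - (1.0 - 1.0 / n_genes) ** generations   # assumed idealised-population F

    freqs = []
    for _ in range(n_lines):
        p = p0
        for _ in range(generations):
            # sample 2N genes at random from the line's previous generation
            p = sum(rng.random() < p for _ in range(n_genes)) / n_genes
        freqs.append(p)

    # mean within-line variance: average of 2pq a^2 over lines
    within = sum(2.0 * p * (1.0 - p) * a * a for p in freqs) / n_lines
    # between-line variance: variance of the line means M = a(p - q)
    means = [a * (p - (1.0 - p)) for p in freqs]
    grand = sum(means) / n_lines
    between = sum((m - grand) ** 2 for m in means) / n_lines

    return {
        "F": f,
        "within": within,
        "expected_within": (1.0 - f) * v_g,
        "between": between,
        "expected_between": 2.0 * f * v_g,
    }


if __name__ == "__main__":
    for key, value in drift_partition().items():
        print(f"{key:>17s}: {value:.4f}")
```

With a few thousand lines the simulated components should lie close to the expectations, the remaining discrepancy being sampling error among lines.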
The genetic variance within lines, before inbreeding is complete,
is partitioned within and between the families of which the lines are
composed. Under slow inbreeding with random mating within the
lines, it is partitioned equally within and between full-sib families.
The covariance of relatives within the lines is just as described in
Chapter 9, each line being a separate random-breeding population
with a total genetic variance of $(1 - F)V_G$, on the average. From this
we can deduce what the heritability is expected to be within any one
line. It will be $(1 - F)V_G / [(1 - F)V_G + V_E]$, and this reduces to
$$h_t^2 = \frac{h_0^2(1 - F_t)}{1 - h_0^2F_t} \qquad (15.1)$$
where $h_t^2$ and $F_t$ are the heritability within lines and the inbreeding
coefficient at time t, and $h_0^2$ is the original heritability in the base
population. This shows how the heritability is expected to decline
with the inbreeding in a small population. The formula, however, is
applicable only to characters with no non-additive variance, and in
the absence of selection. The operation of natural selection renders
the reduction of the heritability less than expected, especially under
slow inbreeding. This point has been demonstrated experimentally
with Drosophila (Tantawy and Reeve, 1956).
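
The expected decline of heritability within lines can be tabulated directly from the formula above. The sketch below is illustrative only: the function names and the numerical values are hypothetical, and the calculation assumes, as the text does, additive genes, no selection, and an environmental variance that does not change with inbreeding.

```python
def heritability_within_lines(h2_0, f):
    """Expected heritability within lines: h0^2 (1 - F) / (1 - h0^2 F).

    Undefined only in the degenerate case h2_0 == 1 and f == 1.
    """
    return h2_0 * (1.0 - f) / (1.0 - h2_0 * f)


def heritability_from_components(v_g, v_e, f):
    """Same quantity from the variance components: (1-F)V_G / [(1-F)V_G + V_E]."""
    within = (1.0 - f) * v_g
    return within / (within + v_e)


if __name__ == "__main__":
    h2_0 = 0.5  # hypothetical heritability in the base population
    for f in (0.0, 0.25, 0.5, 0.75, 0.9):
        print(f"F = {f:.2f}  expected h2 within lines = "
              f"{heritability_within_lines(h2_0, f):.3f}")
```

The two functions give the same result when h0^2 = V_G / (V_G + V_E), which is a convenient check on the algebra.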
Dominance. The components of variance arising from additive
genes will have been seen to be independent of the gene frequencies
in the base population. When we consider genes with any degree of
dominance, however, we find that the changes of variance on inbreeding
depend on the initial gene frequencies, and this makes it
impossible to give a general solution in terms of the genetic variance
present in the base population. We shall therefore do no more than
give the conclusions arrived at by A. Robertson (1952) for the case of
fully dominant genes, when the recessive allele is at low frequency.
This is the situation most likely to apply to variation in fitness arising
from deleterious recessive genes, though the effects of selection are
here disregarded. Fig. 15.2 shows the redistribution of variance
arising from recessive genes at a frequency of q = 0.1 in the base
population. Fig. 15.2(a) refers to full-sib mating with only one
family in each line, and Fig. 15.2(b) refers to slow inbreeding. A
surprising feature of the conclusions is that the within-line variance
at first increases, reaching a maximum when the coefficient of inbreeding
is a little under 0.5, and it remains at a fairly high level until
the coefficient of inbreeding approaches 1. The reason, in general
terms, for the apparent anomaly that the variation within lines increases
during the first stages of inbreeding, can be seen from a consideration
of the relationship between the gene frequency and the
variance arising from a dominant gene shown in Fig. 8.1(b). The
gene frequency is taken to start at a value of 0.1, and on inbreeding it
will increase in some lines and decrease in others, the increase being
on the average equal in amount to the decrease. But examination of
the graph shows that an increase of gene frequency by a certain
amount will increase the variance more than a decrease of the same
amount will reduce it. Therefore, on the average, the variance within
the lines will increase in the early stages of inbreeding. This increase
of variance would be detectable in practice only if a substantial part
of the genetic variance were due to recessive genes at low frequencies.
Fig. 15.2. Redistribution of variance arising from a single fully
recessive gene with initial frequency q = 0.1. (a) with full-sib
mating, (b) with slow inbreeding. (From A. Robertson, 1952;
reproduced by courtesy of the author and the editor of Genetics.)
[Horizontal axes: generations of inbreeding; inbreeding coefficient.]
V_t = total genetic variance.
V_b = between-line component.
V_w = within-line component.
V_a = additive genetic variance within lines.
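
The asymmetry on which this argument rests can be shown with a small calculation. The sketch below assumes the usual single-locus expressions for the additive and dominance variances, V_A = 2pq[a + d(q - p)]^2 and V_D = (2pqd)^2, with complete dominance (d = a) and q the frequency of the recessive allele. The function name and the shift of 0.1 in gene frequency are hypothetical choices made for illustration.

```python
def genotypic_variance(q, a=1.0, d=None):
    """V_G = V_A + V_D at one locus; q is the recessive-allele frequency.

    Assumes the standard single-locus expressions; d defaults to a,
    i.e. complete dominance.
    """
    if d is None:
        d = a
    p = 1.0 - q
    alpha = a + d * (q - p)            # average effect of the gene substitution
    v_a = 2.0 * p * q * alpha ** 2     # additive variance
    v_d = (2.0 * p * q * d) ** 2       # dominance variance
    return v_a + v_d


if __name__ == "__main__":
    q0, shift = 0.1, 0.1               # hypothetical starting frequency and drift
    start = genotypic_variance(q0)
    up = genotypic_variance(q0 + shift)
    down = genotypic_variance(q0 - shift)
    print(f"variance at q = {q0:.1f}: {start:.4f}")
    print(f"gain when q rises by {shift}: {up - start:.4f}")
    print(f"loss when q falls by {shift}: {start - down:.4f}")
    print(f"mean of the two drifted lines: {(up + down) / 2:.4f}")
```

Because the gain from an upward shift exceeds the loss from an equal downward shift, the mean variance of the two drifted lines is greater than the variance at the starting frequency.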
Practical considerations. The extent to which the theoretical
changes of variance described in this chapter can be observed in
practice depends on how much environmental variance is present.
The precise estimation of variance requires a large number of observations
and the estimates obtained in practice are usually subject to
rather large deviations due to the chances of sampling. Consequently
the changes of variance must usually be quite substantial before they
are likely to be readily detected. The genotypic variance, moreover,
seldom constitutes the major part of the phenotypic variance.
Therefore, in relation to the original phenotypic variance, the expected
changes due to inbreeding are usually rather small, and this renders
their detection all the more difficult. Furthermore, the detection of
the expected changes of phenotypic variance is entirely dependent on
the constancy of the environmental variance, and this cannot be
assumed without evidence, as we shall show in the next section. For
these reasons, and also because of the simplifications we have had to
make, we must bear in mind the uncertainties in the connexion
between what is expected and what may be observed in the phenotypic
variance.
Changes of Environmental Variance
Several times in previous chapters we have referred to the fact that
the environmental component of variance may differ according to
the genotype; in particular that inbred individuals often show more
environmental variation than non-inbred individuals. This fact has
been revealed by many experiments in which the variances of inbreds
and of hybrids have been compared. Any difference of phenotypic
variance between highly inbred lines and the F1 between them (i.e.
the "hybrid") must be attributed to a difference of the environmental
component, because the genetic variance is negligible in amount in
the hybrids as well as in the inbred lines. The greater susceptibility
of inbreds than of hybrids to environmental sources of variation has
been observed in a wide variety of characters and organisms. Some
examples are cited in Table 15.2; others will be found in the review
by Lerner (1954).
The cause of the greater environmental variance of inbreds is not
yet fully understood. It has been suggested that the possession of
different alleles at specific loci endows the hybrids with greater
"biochemical versatility" (Robertson and Reeve, 1952b), which
enables them to adjust their development and physiological mechanisms
to the circumstances of the environment: in other words that
developmental and physiological homeostasis is improved by allelic
diversity. On the other hand, it has been suggested (Mather, 1953a)
that the reduced homeostatic power of inbreds is to be regarded as a
manifestation of inbreeding depression: homeostatic power is likely
to be an important aspect of fitness, and would therefore be expected,
like other aspects of fitness, to decline on inbreeding. The underlying
mechanism, we may presume, would be directional dominance,
genes that increase homeostatic power tending on the average to be
dominant over their alleles that decrease it. Lerner (1954) sees a
causal connexion between variability and fitness. He believes greater
stability to be a general property of heterozygotes and regards it as
the cause of their greater fitness. Though the increase of environmental
variance on inbreeding is a phenomenon of great theoretical
interest and some practical importance, too little is known about it to
justify a more detailed discussion of its causes here. Comprehensive
discussions will be found in Lerner (1954) and Waddington (1957).

Table 15.2
Comparisons of Phenotypic Variance in Inbreds and Hybrids
The figures are the averages of the inbred lines, and of the
F1's where more than one cross was made. (C.V.)^2 = Squared
coefficient of variation.

                                                      Inbreds    Hybrids
Drosophila melanogaster: wing length
  (Robertson and Reeve, 1952b). (C.V.)^2.
  6 inbreds and 6 F1's                                2.35       1.24
Mice: duration of "Nembutal" anaesthesia
  (McLaren and Michie, 1956b). Log minutes.
  2 inbreds and 1 F1                                  0.0665     0.0165
Mice: age at opening of vagina
  (Yoon, 1955). Days.
  3 inbreds and 2 F1's
Mice: weight at ages given (birth, 3 weeks, 60 days)
  (Chai, 1957). (C.V.)^2.
  2 inbreds and 1 F1
Rats: weight at 90 days
  (Livesay, 1930). (C.V.)^2.
  3 inbreds and 2 F1's
[Entries for the last three characters are not legible in this copy; the surviving digits read 517, 174, J 9, 59, 98, 47, 24, !9, 522, 170.]

There are, however, two further points in connexion with the
phenomenon that should be mentioned. The first is a technical
matter. If the mean value of the character differs between inbreds
and hybrids, as it frequently does, then it may be difficult to decide
on a proper basis for the comparison of the variances. It is necessary
to find a measure of the variance that does not merely reflect the
difference of mean value, and for this purpose the coefficient of
variation is often an appropriate measure. The problem is basically a
matter of the choice of scale, and will be discussed again in Chapter
The second point concerns the nature of the environmental
variation that is being measured. There is a distinction to be made
between the "developmental" variation arising from "accidents of
development" on the one hand, and adaptive responses to changed
conditions on the other. The developmental variation is a manifestation
of incomplete buffering, or canalisation, of development and
is generally regarded as being harmful. Inbreds, in so far as they
show a greater amount of developmental variation, are therefore less
fit than hybrids; they are less well able to adjust their development to
different conditions of the environment so as to achieve the optimal
phenotype. An adaptive response, in contrast, is a modification of the
phenotypic value that is beneficial to the individual, such as for
example the thickening of the coat of mammals in response to low
temperature. If the greater fitness of hybrids over inbreds extends to
adaptive responses we should therefore expect hybrids to show more
variation of this sort than inbreds. Thus the nature of the environmental
variation has an important bearing on the interpretation of a
difference of variability between inbreds and hybrids.
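
The first, technical point made above, that variances should be compared on a basis that does not merely reflect a difference of mean, can be illustrated with the squared coefficient of variation. The sketch below is illustrative only: the measurements are invented, the function name is hypothetical, and the coefficient of variation is taken as a percentage, which is an assumption about the units.

```python
from statistics import mean, variance


def squared_cv(values):
    """Squared coefficient of variation, with the C.V. expressed in per cent."""
    m = mean(values)
    return 100.0 ** 2 * variance(values) / m ** 2


if __name__ == "__main__":
    # Invented measurements: the hybrids are exactly 1.5 times the inbreds,
    # so they differ in mean and in variance but not in relative variability.
    inbreds = [18, 20, 22, 19, 21]
    hybrids = [27, 30, 33, 28.5, 31.5]
    for name, data in (("inbreds", inbreds), ("hybrids", hybrids)):
        print(f"{name}: mean = {mean(data):.1f}, variance = {variance(data):.3f}, "
              f"(C.V.)^2 = {squared_cv(data):.1f}")
```

Here the raw variances differ simply because the means differ, while the squared coefficients of variation are equal; a comparison of raw variances alone would suggest a difference of variability that is only a matter of scale.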
Uniformity of Experimental Animals
Inbred strains of laboratory animals, particularly of mice, are
widely used as experimental material in pharmacological, physiological,
and nutritional laboratories, when uniformity of biological
material is desired. In some kinds of work, for example work which
demands the absence of immunological reactions, it is genetic uniformity
that is required, and abundant experience has shown that the
inbred strains of mice fully satisfy this requirement. In spite of
doubts about how effective natural selection for heterozygotes may be
in delaying the progress towards homozygosity, these strains have
been proved in practice to be genetically uniform. In the course of
their maintenance, however, strains inevitably become split up into
sublines, and it is only within a subline that their genetic uniformity
can be relied on. Recent work, described in the two following
examples, has revealed genetic differentiation within two widely used
strains of mice, and has shown that differences can sometimes be
detected between sublines separated by only a few generations.
Fig. 15.3. Differentiation between sublines of the C3H inbred
strain of mice, in the number of lumbar vertebrae. Each circle
represents a sample of individuals classified for the number of
lumbar vertebrae. The proportions of black and white in the
circles show the proportions of individuals with 6 and with 5
lumbar vertebrae respectively. (Small proportions of asymmetrical
individuals are included with the 5 vertebra classes.) The circles
are positioned according to the date of classification, and arranged
according to their pedigree relationships. (Data from McLaren
and Michie, 1954.)
[Figure legend: white = 5 vertebrae (+ asymmetrical); black = 6 vertebrae. Time scale marked at 1920, 1930 and 1950.]
Example 15.1. The inbred strain of mice known as C3H exhibits
variability in the number of lumbar vertebrae, and the sublines differ
markedly in this character. Some sublines consist entirely of mice with
5 vertebrae, others entirely of mice with 6, and others with different proportions.
The strain originated in 1920 and was split into three main
groups of sublines in about 1930, each group being later subdivided
further. The number of lumbar vertebrae has been studied in 16 sublines
maintained in America and Britain (McLaren and Michie, 1954). The
pedigree relationships between these sublines, and the proportions of the
two vertebral types in them, are shown in Fig. 15.3. One of the three main
groups of sublines has predominantly 6 lumbar vertebrae, and the other
two groups predominantly 5. This differentiation between the main
groups may have been due to residual segregation in the strain at the time
when the main groups became separated. The strain had, however, been
full-sib mated for 10 years (probably between 20 and 30 generations)
before the separation of the groups, and residual segregation therefore
seems unlikely. The sublines within the main groups are differentiated in
a manner that points to mutation rather than residual segregation as the
cause. The mutational origin of differentiation is more clearly proved in
the study described in the next example.
Example 15.2. Another inbred strain of mice, known as C57BL, has
been the subject of a thorough study by Grüneberg and co-workers (Deol,
Grüneberg, Searle, and Truslove, 1957; Carpenter, Grüneberg, and Russell,
1957). Twenty-seven skeletal characters were examined in four main
groups of sublines, three maintained in America and one in Britain, the
British group being studied in greater detail. The nature and extent of the
differentiation found cannot be easily summarised, and therefore we shall
only state the conclusions reached about the cause of the differentiation.
Each of the four main groups differed from the others in between 7 and 17
out of the 27 characters. The following conclusions were drawn: (1) The
differentiation could not reasonably be attributed to residual segregation
before the separation of the sublines; and segregation following an accidental
outcross was conclusively disproved. (2) Sublines that had been
separated for a longer time tended to differ by a greater number of characters
than sublines more recently separated. But the magnitude of the
difference in any one character was no greater between long-separated
sublines than between sublines only recently separated. From this it was
concluded that the differences in each character were caused by mutations
at single loci. The average difference caused by one mutational step
amounted to about 0.6 standard deviation of the character affected.
The study cited in the above example shows that the differences
between sublines, though they may be readily detectable, are probably
caused by rather few loci. The differentiation is quite small in
comparison with the differences between strains or between individuals
in a non-inbred population.
In much of the work for which inbred strains are used it is not
the genetic uniformity alone that matters, but the phenotypic uniformity.
The more variable the animals the larger the number that
must be used to attain a given degree of precision in measuring their
mean response to a treatment. The value of uniformity is therefore
in reducing the number of animals that must be used in an experiment
or a test. Inbred animals, however, are costly to produce
because of their poor breeding qualities, and the advantage gained
from genetic uniformity has to be weighed against the extra cost of
the material. If the character to be measured is one of which the
phenotypic variance is chiefly environmental in origin, then the
absence of genetic variation in an inbred strain will reduce the phenotypic
variance by only a small amount. The extra cost of the inbred
animals may then outweigh the advantage of their being slightly
more uniform than non-inbred animals. The phenotypic uniformity
of inbred animals, however, has been taken on trust from the genetical
theory of inbreeding, and it seems now that this trust has, to some
extent at least, been misplaced. In some characters inbred animals
are more phenotypically variable than non-inbred (see Table 15.4)
on account of their greatly increased environmental variation. It
seems now that for some, perhaps for many, characters the greatest
phenotypic uniformity is found in hybrids (i.e. F1's) produced by
crossing two inbred strains. The value of hybrids for work requiring
phenotypic uniformity has been discussed by Grüneberg (1954); and
by Biggers and Claringbold (1954).
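
The relation between variability and the number of animals required can be made concrete. The sketch below assumes that precision is measured by the standard error of a mean, sigma / sqrt(n), so that the number of animals needed is proportional to the phenotypic variance. The function name and all numerical values are hypothetical.

```python
import math


def animals_needed(phenotypic_variance, target_se):
    """Smallest n for which sqrt(variance / n) does not exceed target_se."""
    return math.ceil(phenotypic_variance / target_se ** 2)


if __name__ == "__main__":
    target_se = 2.0  # hypothetical precision required of the mean response
    # Invented phenotypic variances for three kinds of experimental material.
    for label, v_p in (("non-inbred stock", 100.0),
                       ("inbred strain", 80.0),
                       ("F1 hybrid", 50.0)):
        print(f"{label}: {animals_needed(v_p, target_se)} animals")
```

On this assumption a given proportional reduction of the phenotypic variance gives the same proportional saving of animals, which is the quantity to be weighed against the extra cost of the material.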
One final point about the use of inbred and hybrid animals may
be noted. An inbred strain or the F1 of two inbred strains has a
unique genotype; and that of an inbred, moreover, is one that cannot
occur in a natural population. Testing the response to any treatment
on one inbred strain or one hybrid is therefore testing it on one genotype.
If there are appreciable differences of response between
different genotypes, the experimenter is then not justified in describ
ing his results as referring, for example, to "the mouse."
CHAPTER 16
INBREEDING AND CROSSBREEDING:
III. The Utilisation of Heterosis
The crossing of inbred lines plays a major role in the present methods
of plant improvement, though in animal improvement it plays a much
less important part. In this chapter the genetic principles underlying
the use of inbreeding and crossing will be explained, and the various
methods described in outline. Technical details, however, will not be
given: for these the reader should consult a textbook of plant breeding
(e.g. Hayes, Immer, and Smith, 1955). We shall be concerned with
outbreeding plants and with animals. But since at first sight the
methods applicable to naturally self-fertilising plants are superficially
rather like those applicable to outbreeding plants and animals,
it will be advisable first to consider very briefly the improvement of
self-fertilising plants.
Self-fertilising plants. Each variety of a naturally self-fertilising
plant is a highly inbred line, and the only genetic variation within it is
that arising from mutation. Genetic improvement can therefore be
made only by choosing the best of the existing varieties or by crossing
different varieties. The purpose of the crossing is to produce genetic
variation on which selection can operate. After a cross has been
made, the F1 and subsequent generations are allowed to self-fertilise
naturally. A new population, subdivided into lines, is thus made, and
the lines become differentiated as the inbreeding proceeds. Selection
is applied by choosing the best lines, which become new and improved
varieties. The essential point to note is that what is sought is
an improved inbred line, and not a superior crossbred generation: the
purpose of the crossing is to provide genetic variation and not to
produce heterosis. The process of crossing and selection among the
subsequent lines may be repeated cyclically. If two good lines are
selected out of the first cross, these may be crossed and a second cycle
of selection applied to the derived lines. The genetic properties of a
population derived from a cross of two highly inbred lines, such as
two varieties of a self-fertilising plant, are peculiar in that all segregating
genes have a frequency of 0.5 in the population as a whole.
This greatly simplifies the theoretical description of the variances and
covariances. Special methods of analysis applicable to such populations
have been developed which lead to a separation of the additive,
dominance, and epistatic effects, and so provide a guide to the possibilities
of improvement in the population of lines derived from a
particular cross. For a description of these methods, see Mather
(1949), Hayman (1958), and Kempthorne (1957, Ch. 21) where other
references are given.
Outbreeding plants, and animals. Applied to naturally outbreeding
plants and to animals, the purpose of crossing inbred lines is
to produce superior crossbred, or F1, individuals. The utilisation of
heterosis in this way depends on selection as well as on the inbreeding
and crossing. The selection is applied, in principle, to the crosses,
with the aim of finding pairs of lines that cross well, so that the lines
may be perpetuated and provide crossbred individuals for commercial
use. In practice, however, the performance of the lines
themselves has to be taken into account, because the lines must be
reasonably productive if they are to be maintained and used for
crossing. This method has been very successful with plants, and has
led to an improvement of 50 per cent in the yield of maize grown
commercially in the United States, since hybrid seed started to be
used in the early 1930's (Mangelsdorf, 1951). Its success with
animals, however, has been much less notable. The reasons probably
lie chiefly in the greater amount of space and labour required by
animals and in their lower reproductive rate, both of which add
greatly to the difficulty of producing and testing the inbred lines.
During the inbreeding a large proportion of the lines die out from
inbreeding depression before a reasonably high degree of inbreeding
has been attained. Consequently the inbreeding programme must
start with a very large number of lines if enough are to be left after the
wastage to give some scope for the selection of good crosses. Another
point is that with plants that can be self-fertilised, such as maize, the
inbreeding proceeds much faster than with animals. To attain an
inbreeding coefficient of, say, 90 per cent would require only 4
years for maize, but 11 years for pigs or chickens, and about 50 years
for cattle with a 4 or 5 year generation interval.
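
The numbers of generations quoted above can be reproduced from the usual recurrence relations for the inbreeding coefficient. The sketch below assumes F_t = 1 - (1/2)^t for self-fertilisation and F_t = (1 + 2F_{t-1} + F_{t-2}) / 4 for full-sib mating, starting from a non-inbred base; these are the standard expressions, and it is an assumption that they are the ones underlying the figures in the text. The function names are hypothetical.

```python
def selfing_f(t):
    """Inbreeding coefficient after t generations of self-fertilisation."""
    return 1.0 - 0.5 ** t


def full_sib_f(t):
    """Inbreeding coefficient after t generations of full-sib mating."""
    f_prev2, f_prev1 = 0.0, 0.0        # non-inbred, unrelated base population
    for _ in range(t):
        f_prev2, f_prev1 = f_prev1, (1.0 + 2.0 * f_prev1 + f_prev2) / 4.0
    return f_prev1


def generations_to_reach(target, f_of_t):
    """Smallest number of generations at which f_of_t(t) >= target."""
    t = 0
    while f_of_t(t) < target:
        t += 1
    return t


if __name__ == "__main__":
    target = 0.9
    print("selfing:", generations_to_reach(target, selfing_f), "generations")
    print("full-sib mating:", generations_to_reach(target, full_sib_f), "generations")
```

Multiplying the number of generations by the generation interval of the species gives the time required, which is how the difference between an annual plant and cattle arises.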
Let us now consider the genetic principles on which the utilisation
of heterosis depends. It was shown in Chapter 14 that crosses
made at random between lines inbred without selection are expected
to have a mean value equal to that of the base population. This is
the reason why inbreeding and crossing alone cannot be expected to
lead to an improvement, but must be supplemented by selection. In
practice some improvement can be expected from the effects of
natural selection. It eliminates lethal and severely deleterious genes
during the inbreeding, and in so far as these genes affect the desired
character an improvement of the crossbred mean over that of the
base population is to be expected. But this improvement will not be
very great, because the deleterious genes eliminated will have been at
low frequencies in the base population — and the more harmful, the
lower the frequency — so that their effect on the population mean will
be small. It has been calculated, on the basis of assumptions about
the number of loci concerned and their mutation rates, that an improvement
of 5 per cent in fitness is the most that could be expected
from the elimination of deleterious recessive genes (Crow, 1948, 1952).
The bulk of the improvement, therefore, must come from artificial
selection applied to the economically desirable characters.
The crossing of inbred lines produces no genotypes that could not
occur in the base population. But whereas the best genotypes occur
only in certain individuals in the base population, they are replicated
in every individual of certain crosses. It is in this replication of a
desirable genotype that the chief merit of the method lies. Let us, for
simplicity, consider crosses between fully inbred lines. The gametes
produced by a highly inbred line are all identical, except for mutation.
And the gene content of the gametes of any one line could in principle
be found in a gamete from the base population. Therefore the genotype
of the F1 of two lines could in principle be found in an individual
of the base population. Thus, provided there has been no selection
during the inbreeding, a set of crosses made at random is genetically
equivalent to a set of individuals taken at random from the base population;
and the individuals of one cross are replicates of one individual
in the base population. This replication of a genotype in the individuals
of a cross allows the genotypic value to be measured with little
error; whereas the genotypic value of an individual in the base population
is only crudely measured by its phenotypic value. Further, it is
the genotypic value that is measured in the cross and can be reproduced
indefinitely, as long as the inbred lines are maintained; whereas
only the breeding value can be reproduced by selection of individuals
in a non-inbred population. Therefore the condition under which
inbreeding and crossing are likely to be a better means of improvement
than selection without inbreeding is when much of the genetic
variance of the character is non-additive.
The amount of improvement that can be made by selection among
a number of crosses depends on the amount of variation between the
crosses. The same relationship holds between the intensity of selection,
the standard deviation, and the selection differential as was
described in Chapter 11 and illustrated in Fig. 11.3. In the following
section the variance between crosses made at random between pairs
of lines inbred without selection will be examined.
Variance between Crosses
The variance between crosses to be considered is the variance of
the true means of the crosses, or the between-cross component as
estimated from an analysis of variance. The variance of the observed
means will contain a fraction of the within-cross component for the
reasons explained in connexion with family selection in Chapter 13.
We shall assume that the experimental design has eliminated all
non-genetic sources of variation from the between-cross component.
If the lines crossed are fully inbred there will be no genetic
variance within the crosses, and the variance between crosses will be
equal to the genotypic variance in the base population, since each
cross is equivalent to an individual of the base population. When the
lines are only partially inbred, however, some genetic variance will
appear within the crosses, and the between-cross variance will be less
than with fully inbred lines. It is therefore important to know in
what manner the between-cross variance increases as inbreeding
proceeds, since this will tell us how much is to be gained by
proceeding to high levels of inbreeding.
We noted that crosses between fully inbred lines are genetically
equivalent to single individuals of the base population. Crosses
between partially inbred lines are analogous, not to individuals, but
to families, with degrees of relationship dependent on the inbreeding
coefficient of the lines. The variance between families can be
formulated in terms of the degree of relationship in the families
(Kempthorne, 1954), and this formulation may be extended to crosses by
regarding the crosses as families with a relationship depending on the
inbreeding coefficient of the lines. The following expression is then
obtained for the component of variance between crosses:
$$\text{Between-cross variance} = F V_A + F^2 V_D + F^2 V_{AA} + F^3 V_{AD} + F^4 V_{DD} + \dots \tag{16.1}$$
In this expression $V_A$ and $V_D$ are the additive and dominance
variances in the base population; $V_{AA}$, $V_{AD}$ and $V_{DD}$ are the
interaction components as explained in Chapter 8; and $F$ is the inbreeding
coefficient of the lines as specified below. The interaction components
are included because epistasis may have important effects. Only
two-factor interactions, however, are shown: the higher interactions
have coefficients in correspondingly higher powers of $F$. (For every
A in the subscript there is a factor $F$, and for every D a factor $F^2$.)
The formulation in equation 16.1 is conditional on the following
specifications about how the crosses are made. 1. All lines have the
same coefficient of inbreeding. 2. All lines have independent ancestry
back to the base population; i.e. there is no relationship between the
lines. 3. Each cross is made from many individuals of the parent
lines; and these individuals are not related to each other within their
lines. This means that the genetic variance within the lines is fully
represented within the crosses. 4. The coefficient of inbreeding, F,
refers not to the individuals used as parents of the crosses, but to their
progeny if they were mated within their own lines; in other words, F
is the inbreeding coefficient of the next generation of the lines.
Let us now examine the expression 16.1 and consider what it tells
us about the variance between crosses. When the inbreeding
coefficient is unity the between-cross variance is, as we have already stated,
simply the sum of all the components of genetic variance in the base
population. During the progress of the inbreeding the contribution
of the additive variance increases linearly with F; those of the
dominance variance and of A × A interactions increase with the square of
F; and those of the other interaction components with the third or fourth
power of F. This means that the dominance and interaction
components contribute proportionately more at higher levels of inbreeding
than at lower levels. If the character is one with predominantly
non-additive variance, the crosses will differ little in merit during the
early stages but will differentiate rapidly in the final stages. Since this
is the sort of character for which inbreeding and crossing is likely to
be the most effective means of improvement, it is clear that inbreeding
must be taken to a fairly high level if anything approaching its full
benefit is to be realised. Some idea of the level of inbreeding required
can be obtained by noting that with F = 0.5 the between-cross
variance is equal to the variance between full-sib families in the base
population. At this level of inbreeding, therefore, the best cross would
do no more than replicate the best full-sib family in a non-inbred
population.
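The behaviour of expression 16.1 can be checked numerically. The
following is a minimal sketch in Python, not part of the original
treatment: the function name and the component values are invented for
illustration, and only two-factor interactions are included. It applies
the rule that every A in a subscript contributes a factor F and every D
a factor F squared.

```python
def between_cross_variance(F, components):
    """Evaluate expression 16.1 for inbreeding coefficient F.

    `components` maps a subscript such as "A", "D" or "AD" to the
    corresponding variance in the base population. The values used
    below are hypothetical, not data from the text.
    """
    total = 0.0
    for subscript, variance in components.items():
        # One factor F for every A, a factor F**2 for every D.
        power = subscript.count("A") + 2 * subscript.count("D")
        total += F ** power * variance
    return total


# Hypothetical character with predominantly non-additive variance.
components = {"A": 10.0, "D": 40.0, "AA": 5.0, "AD": 5.0, "DD": 5.0}

for F in (0.25, 0.5, 0.75, 1.0):
    print(F, between_cross_variance(F, components))
```

With these invented values the between-cross variance at F = 0.5 is
17.1875, which is the full-sib covariance (half of V_A, a quarter of
V_D and of V_AA, an eighth of V_AD, a sixteenth of V_DD), and is only
about a quarter of the 65 units available at F = 1.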
Combining ability. The components of genetic variance making
up the between-cross variance that we have been discussing are
causal components, in the sense explained in Chapter 9. The
variance between crosses, however, can also be analysed into
observational components in the following way. Suppose a set of lines are
crossed at random, each line being simultaneously crossed with a
number of others. We can then calculate for each line its mean
performance, i.e. the mean value of the F1's in crosses with other lines.
This is known as the general combining ability of the line. The
performance of a particular cross may deviate from the average
general combining ability of the two lines, and this deviation is
known as the special (or specific) combining ability of the cross. Or, if
we measure the mean values as deviations from the general mean of
all crosses, we can express the value of a certain cross as the sum of
the general combining abilities of the two lines and the special
combining ability of the pair of lines. Thus the mean value of the
cross of line X with line Y is
$$M_{XY} = \mathrm{G.C.}_X + \mathrm{G.C.}_Y + \mathrm{S.C.}_{XY} \tag{16.2}$$
where G.C. and S.C. stand for the general and special combining
abilities. The variance between crosses can therefore be analysed
into two components: variance of general combining abilities and
variance of special combining abilities; the latter being, in statistical
terms, the interaction component.
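The partition in equation 16.2 may be illustrated by a short Python
sketch. It is offered only as an illustration under stated assumptions:
every line is crossed with every other, reciprocals are pooled, there
are at least three lines, and the cross means are invented numbers. The
factor (n - 1)/(n - 2) is the usual least-squares adjustment for a set
of crosses without the parental lines; it is not given in the text,
which defines general combining ability simply as the mean performance
of a line in its crosses.

```python
def combining_abilities(cross_means):
    """Split cross means into general and special combining abilities.

    cross_means[i][j] is the mean of the cross of line i with line j;
    diagonal entries are ignored. Returns (general mean, list of
    G.C. values, dict of S.C. values keyed by (i, j) with i < j).
    Assumes a complete set of crosses among n >= 3 lines.
    """
    n = len(cross_means)
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    grand = sum(cross_means[i][j] for i, j in pairs) / len(pairs)
    line_means = [
        sum(cross_means[i][j] for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]
    # Least-squares scaling (an assumption, see the lead-in above).
    scale = (n - 1) / (n - 2)
    gca = [scale * (m - grand) for m in line_means]
    sca = {
        (i, j): cross_means[i][j] - grand - gca[i] - gca[j]
        for i, j in pairs
        if i < j
    }
    return grand, gca, sca


# Hypothetical means of the six crosses among four lines.
cross_means = [
    [None, 12.0, 11.5, 10.5],
    [12.0, None, 8.5, 7.5],
    [11.5, 8.5, None, 10.0],
    [10.5, 7.5, 10.0, None],
]

grand, gca, sca = combining_abilities(cross_means)
print(grand, gca, sca)
```

Each cross mean is recovered as the general mean plus the two general
combining abilities plus the special combining ability of the pair,
which is the form of equation 16.2 with values expressed as deviations
from the general mean.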
The observational components of variance attributable to general
and special combining ability are made up of the causal components
in the following way.
Variance of crosses attributable to:
$$
\begin{aligned}
\text{General combining ability} &= F V_A + F^2 V_{AA} + \dots \\
\text{Special combining ability} &= F^2 V_D + F^3 V_{AD} + F^4 V_{DD} + \dots
\end{aligned}
\tag{16.3}
$$
So differences of general combining ability are due to the additive
genetic variance in the base population, and to A × A interactions;
and differences of special combining ability are attributable to the
non-additive genetic variance. Consequently the variance of general
combining ability increases linearly with F (apart from the interaction
component), while the variance of special combining ability increases
with higher powers of F. It is therefore the special, and not the
general, combining ability that is expected to increase more rapidly
as the inbreeding reaches high levels.
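The different rates of increase of the two observational components can
be seen in a small numerical sketch of equations 16.3. The Python below
is illustrative only; the function name and the variance values are
hypothetical, and interactions above two factors are ignored.

```python
def combining_ability_variances(F, V):
    """Return (general, special) combining ability variances at
    inbreeding coefficient F, following equations 16.3.

    V maps the subscripts "A", "D", "AA", "AD", "DD" to the causal
    components in the base population (hypothetical values below).
    """
    general = F * V["A"] + F ** 2 * V["AA"]
    special = F ** 2 * V["D"] + F ** 3 * V["AD"] + F ** 4 * V["DD"]
    return general, special


# Equal additive and dominance variance, no epistasis (invented).
V = {"A": 10.0, "D": 10.0, "AA": 0.0, "AD": 0.0, "DD": 0.0}

half = combining_ability_variances(0.5, V)
full = combining_ability_variances(1.0, V)
print(half, full)
```

In this invented case, raising F from 0.5 to 1 doubles the variance of
general combining ability (5 to 10) but quadruples that of special
combining ability (2.5 to 10).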
Example 16.1. An analysis of egg-laying in crosses between highly
inbred lines of Drosophila melanogaster is reported by Gowen (1952).
Five lines were crossed in all ways, including reciprocals, and the numbers
of eggs laid by females in the fifth to ninth days of adult life were recorded.
The analysis of the crosses yielded the following percentage composition
of the variance of egg number:

Variance component                  % of total
General combining ability              11.3
Special combining ability               9.7
Differences between reciprocals         2.3
Within crosses                         76.6

Thus about half the variance between crosses was due to general, and half
to special, combining ability.
Some of the methods of improvement by crossing aim at utilising
only the variance of general combining ability, and then the
measurement of the general combining ability of the lines becomes an
important procedure. In addition to the making of specific crosses
between the lines, there are two other methods of measuring general
combining ability. A method convenient for use with plants is known
as the polycross method. A number of plants from all the lines to be
tested are grown together and allowed to pollinate naturally,
self-pollination being prevented by the natural mechanism for
cross-pollination, or by the arrangement of the plants in the plot. The seed
from the plants of one line is therefore a mixture of random crosses
with other lines, and its performance when grown tests the general
combining ability of that line. Another method, applicable also to
animals, is known as top-crossing. Individuals from the line to be
tested are crossed with individuals from the base population. The
mean value of the progeny then measures the general combining
ability of the line, because the gametes of individuals from the base
population are genetically equivalent to the gametes of a random set
of inbred lines derived without selection from the base population.
These methods are essentially methods for comparing the general
combining abilities of different lines, and so leading to the choice of
the lines most likely to yield the best cross, among all the crosses that
might be made between the available lines. But if much of the
variation between crosses is due to special combining ability, then the
general combining ability of two lines will not provide a reliable
guide to the performance of their cross.
Methods of Selection for Combining Ability
The methods of improvement by inbreeding and crossing fall into
two groups, according to whether they are designed to utilise only
the variation in general combining ability or to utilise also the
variation in special combining ability.
Selection for general combining ability. When the improvement
of general combining ability only is sought the procedure of
selection is much simplified. The general combining abilities of all
available lines can be measured, as already explained, without the
necessity of making and testing all the possible crosses between them.
Some selection can usefully be applied to the lines before they are
tested in crosses. There is some degree of correlation between a line's
performance as an inbred and its general combining ability, so a
proportion of lines can be discarded on the basis of their own
performance before the crosses are made. And, finally, there is less
to be lost by making the crosses at a relatively low coefficient of
inbreeding. Selection for general combining ability may be repeated
in cycles, a procedure known in plant breeding as recurrent selection.
(In animal breeding this term has come to have a different meaning,
as will be explained below.) Lines are inbred by self-fertilisation
for one or two generations and their general combining abilities
tested. The lines with the best general combining abilities are then
crossed and a second cycle of inbreeding and selection carried out.
A review of the progress made by this method is given by Sprague
(1952).
The seed for commercial use is usually not made by a single cross
of two lines, but by a 3-way or 4-way cross. The object of this is to
overcome the generally low production of an inbred used as seed
parent. In a 3-way cross the F1 of two lines is used as seed parent and
crossed with a third inbred line. In a 4-way cross two F1's of
different pairs of lines are crossed. The performance of 3-way and 4-way
crosses can be reliably predicted from the performance of the
constituent single crosses.
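The text does not state how the prediction is made. One commonly used
rule, assumed here and not taken from the text, is to average the
single crosses between lines that come from opposite parents of the
final cross, leaving out the crosses within each parent. The Python
sketch below applies that rule to invented single-cross means; the
function names are hypothetical.

```python
def single_cross(means, x, y):
    """Look up a single-cross mean regardless of the order of the lines."""
    return means[frozenset((x, y))]


def predict_three_way(means, a, b, c):
    """Predict (A x B) x C as the mean of A x C and B x C (assumed rule)."""
    return (single_cross(means, a, c) + single_cross(means, b, c)) / 2


def predict_four_way(means, a, b, c, d):
    """Predict (A x B) x (C x D) as the mean of the four single crosses
    between lines from opposite parents (assumed rule)."""
    crosses = [(a, c), (a, d), (b, c), (b, d)]
    return sum(single_cross(means, x, y) for x, y in crosses) / 4


# Hypothetical single-cross yields among four inbred lines.
means = {
    frozenset(("A", "B")): 80.0,
    frozenset(("A", "C")): 100.0,
    frozenset(("A", "D")): 90.0,
    frozenset(("B", "C")): 110.0,
    frozenset(("B", "D")): 95.0,
    frozenset(("C", "D")): 85.0,
}

print(predict_three_way(means, "A", "B", "C"))      # 105.0
print(predict_four_way(means, "A", "B", "C", "D"))  # 98.75
```

The single crosses A x B and C x D do not enter the 4-way prediction
under this rule, because no individual of the final cross receives
gametes from both lines of the same parent.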
Even though selection for general combining ability is widely
used in plant breeding and has abundantly proved its success, it is
not, perhaps, altogether clear why it is preferred to selection without
inbreeding, made either by individual selection or by family selection.
Since the variation in general combining ability is attributable to
additive variance in the population from which the lines were derived,
selection should be effective without inbreeding. Comparisons of the
two methods by experiment have not been made on a scale sufficient
to prove convincingly the superiority of selection with inbreeding
(see Robinson and Comstock, 1955).
Selection for general and specific combining ability. The
specific combining ability of a cross cannot be measured without
making and testing that particular cross. Therefore to achieve a
reasonably high intensity of selection for specific combining ability a
large number of crosses must be made and tested. Is no shortcut
possible? Could the superior combining ability not be, as it were,
built into the lines by selection? From the causes of heterosis
explained in Chapter 14 it is clear that what is wanted is two lines that
differ widely in the gene frequencies at all loci that affect the character
and that show dominance. It should therefore be possible to build
up these differences of gene frequency in two lines by selection.
Instead of the differences of gene frequency being produced by the
random process of inbreeding, they would be produced by the directed
process of selection, which would be both more effective and more
economical. Two methods based on this idea have been devised.
These methods, though originating from plant breeding, provide —
in theory at least — the most hopeful means of utilising heterosis in
animals. We shall first describe the method known as reciprocal
recurrent selection, or simply as reciprocal selection. In outline, the
procedure is as follows.
The start is made from two lines, say A and B. (We shall call
them "lines" even though they will not be deliberately inbred.)
Crosses are made reciprocally, a number of A ♂♂ being mated to
B ♀♀, and a number of B ♂♂ to A ♀♀. The crossbred progeny are then
measured for the character to be improved and the parents are judged
from the performance of their progeny. The best parents are selected
and the rest discarded, together with all the crossbred progeny, which
are used only to test the combining ability of the parents. The selected
individuals must then be remated, to members of their own line, to
produce the next generation of parents to be tested. These are crossed
again as before and the cycle repeated. It is seldom practicable to select
among the female parents, and the selection is chiefly applied to the
males. Each male is mated to several females of the other line so that
the judgment of his combining ability may be based on a reasonably
large number of progeny. Most of these females are needed to mate to
the selected males of their own line for the continuation of the line.
Deliberate inbreeding is avoided as far as possible, for the reason to be
explained below. The use of all the females as parents in their own lines
helps to reduce the rate of inbreeding and allows relatively few males to
be used, which intensifies the selection.
An essential prerequisite is that there should be some difference of
gene frequency between the two lines at the beginning, or else
selection for combining ability will be unable to produce a differentiation
of the lines. Any locus at which the gene frequencies are the same in
the two lines will be in equilibrium, though an unstable equilibrium.
Any shift in one direction or the other will give the selection something
to act on and the difference will be increased. The initial difference
between the lines may be obtained by starting from two different
breeds or varieties, choosing two that already cross well; or by
deliberate inbreeding, up to perhaps 25 per cent, and relying on random
differentiation of gene frequencies.
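The instability of equal gene frequencies can be illustrated with a
deliberately simple one-locus sketch in Python. Everything in it is an
assumption made for illustration and is not a model given in the text:
a single overdominant locus with genotypic values +a, d and -a, a
deterministic change of gene frequency in each line proportional to the
effect of a gene substitution on the mean of the cross, and an
arbitrary constant k standing for the strength of selection.

```python
def cross_mean(p1, p2, a, d):
    """Mean of the cross between lines with frequencies p1 and p2 of
    allele A1; genotypic values A1A1 = +a, A1A2 = d, A2A2 = -a."""
    q1, q2 = 1 - p1, 1 - p2
    return a * (p1 * p2 - q1 * q2) + d * (p1 * q2 + p2 * q1)


def reciprocal_selection(p1, p2, a, d, k=0.5, generations=100):
    """Deterministic caricature of reciprocal selection at one locus.

    Each line's gene frequency moves in the direction that raises the
    mean of the cross. The update rule and k are assumptions.
    """
    for _ in range(generations):
        effect_in_1 = a + d * (1 - 2 * p2)  # d(cross mean)/d(p1)
        effect_in_2 = a + d * (1 - 2 * p1)  # d(cross mean)/d(p2)
        p1, p2 = (
            p1 + k * p1 * (1 - p1) * effect_in_1,
            p2 + k * p2 * (1 - p2) * effect_in_2,
        )
    return p1, p2


# Pure overdominance: homozygotes equal, heterozygote superior.
a, d = 0.0, 1.0

equal_start = reciprocal_selection(0.6, 0.6, a, d)
small_difference = reciprocal_selection(0.52, 0.48, a, d)
print(equal_start, small_difference)
```

In this sketch two lines that start with the same gene frequency remain
identical and settle at 0.5, where the mean of the cross is 0.5; a small
initial difference is amplified until the lines approach fixation for
different alleles, and the mean of the cross approaches d = 1.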
Though the performance of the cross is expected to increase
under this method of selection, the performance of the lines
themselves in respect of the character selected is expected to decrease, for
this reason. Characters to which selection would be applied in this
way are those subject to inbreeding depression and heterosis; that is
to say, those in which dominance is directional. The changes of gene
frequency brought about by the selection are toward the extremes,
and consequently the mean values of the lines will decline for the
reasons explained in connexion with inbreeding in Chapter 14. This
decline in the performance of the lines, however, should not be quite
as deleterious as the effects of deliberate inbreeding. Inbreeding, as a
random process, affects all loci, and the mean values of all characters
showing directional dominance decline. But under reciprocal
selection it is only the selected character that should decline, except in so
far as linked loci are carried along. Nevertheless, reproductive fitness
is nearly always a component of economic value, and it is doubtful
how far the distinction will hold. This, however, is the reason why
deliberate inbreeding of the lines is to be avoided.
The second method is simpler in procedure than reciprocal
selection described above. It was devised as a modification of
recurrent selection, intended to utilise special as well as general combining
ability (Hull, 1945), and as yet it has no distinctive name. It is known
variously as "Hull's modification of recurrent selection," "recurrent
selection to inbred tester," "recurrent selection for special combining
ability," and in animal breeding simply as "recurrent selection." It
differs from reciprocal selection in the following way. Instead of
starting with two lines and selecting both for combining ability with
the other, one starts with only one line and selects it for combining
ability with a "tester" line which has previously been inbred. This
reduces the amount of effort spent on the testing, and is expected to
yield more rapid progress at the beginning because the initial
differences of gene frequency between the line and the tester are likely to
be more marked. But the ultimate gain is expected to be less than
under reciprocal selection, because the general combining ability of
the tester line is predetermined, and only the general combining
ability of the selected line and the special combining ability of the
cross can be improved.
The two methods of selection for special combining ability
described in this section are comparatively new methods of improvement
and very little practical experience of them has yet been gained. The
account of them given here is consequently based almost entirely on
theory. Theoretical assessments of their merits in relation to other
methods have been made by Comstock, Robinson, and Harvey
(1949) and by Dickerson (1952). Though on theoretical grounds
they seem promising, the results of the only experiments so far
published (Bell, Moore, and Warren, 1955; Rasmuson, 1956) are not
encouraging.
Before we leave the subject of inbreeding we must give some
further consideration to the particular genetic property that makes
selection with inbreeding and crossing preferable to selection without
inbreeding. From the theoretical point of view, and leaving all
practical considerations aside, the crucial genetic property is
overdominance of the genes concerned. The following section is devoted
to a consideration of overdominance and its significance.
OVERDOMINANCE
Overdominance is the property shown by two alleles when the
heterozygote lies outside the range of the two homozygotes in
genotypic value with respect to the character under discussion. Its
meaning was illustrated in Fig. 2.3 with respect to fitness as the
character, and it has been mentioned from time to time in other
chapters. We saw in Chapter 2 how selection favouring
heterozygotes leads to a stable gene frequency at an intermediate value, and
how this overdominance with respect to fitness probably accounts for
much of the stable polymorphism found in natural populations.
And in Chapter 12 we saw how overdominance may be a source of
nonadditive genetic variance in populations that have reached their
limit under artificial selection. It is, however, in connexion with the
utilisation of heterosis by inbreeding and crossing, or by reciprocal
selection, that overdominance has its most important practical
consequences. In earlier chapters two basic methods of improvement
were distinguished, one being selection without inbreeding, and the
other inbreeding followed by crossing. In this chapter we have seen
that selection is an integral part of the second method also. The
essential distinction therefore lies in the crossing, rather than in the
selection. Now, crossing two lines in which different alleles are fixed
gives an F1 in which all individuals are heterozygotes; and this is the
only way of producing a group of individuals that are all
heterozygotes. In a non-inbred population no more than 50 per cent of the
individuals can be heterozygotes for a particular pair of alleles.
Consequently, if heterozygotes of a particular pair of alleles are
superior in merit to homozygotes, inbreeding and crossing will be a
better means of improvement than selection without inbreeding.
Furthermore, it is only when there is overdominance with respect to
the desired character, or combination of characters, that inbreeding
and crossing can achieve what selection without inbreeding cannot.
Under any other conditions of dominance the best genotype is one of
the homozygotes, and all individuals can be made homozygous by
selection, without the disadvantages attendant on inbreeding and
much more simply than by methods dependent on crossing. It was
stated earlier in this chapter that the potentialities of inbreeding and
crossing are greatest when there is much non-additive genetic
variance and little additive. Now we see that this is only part of the truth:
in principle inbreeding and crossing can surpass selection without
inbreeding only when a substantial part of the non-additive variance is
due to overdominance. It is therefore of great practical importance
to know whether overdominance with respect to economically
desirable characters is a major source of variation. It is also of great
theoretical interest to know whether overdominance with respect to
natural fitness is a common phenomenon affecting many loci, because
natural selection favouring heterozygotes would be a potent factor
tending to maintain genetic variation in populations. This point will
be discussed further in Chapter 20.
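The statement made above, that no more than 50 per cent of the
individuals of a non-inbred population can be heterozygotes for a
particular pair of alleles, follows from the Hardy-Weinberg frequency
2pq. A brief Python check is given here as an illustration; random
mating is assumed and the function names are hypothetical.

```python
def heterozygote_frequency(p):
    """Hardy-Weinberg frequency of heterozygotes for two alleles with
    frequencies p and 1 - p, assuming random mating."""
    return 2 * p * (1 - p)


def f1_heterozygote_frequency(p1, p2):
    """Frequency of heterozygotes in the cross of two lines with gene
    frequencies p1 and p2."""
    return p1 * (1 - p2) + p2 * (1 - p1)


# Scan gene frequencies from 0 to 1 in steps of 0.01.
best = max(heterozygote_frequency(i / 100) for i in range(101))
print(best)                                 # 0.5, reached at p = 0.5
print(f1_heterozygote_frequency(1.0, 0.0))  # 1.0 for lines fixed for different alleles
```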
The contribution of overdominance to the variance, and the
proportion of loci that show overdominance, are really two different
questions. Genes that are overdominant with respect to fitness will be
at intermediate frequencies and will therefore contribute much
more variation than genes at low frequencies. So overdominance may
be a major source of variation and yet be a property of only a few
loci.
The evidence concerning overdominance has been
comprehensively reviewed by Lerner (1954), who reaches the conclusion that
overdominance with respect to fitness and characters closely
connected with it is widespread and very important. A contrary view is
expressed by Mather (1955b) on the grounds that much of what
appears to be overdominance with respect to certain characters in
plants can be attributed to epistatic interaction. These two conflicting
opinions will be enough to show that the problem of overdominance
is still an open question. The aim here is not to discuss the
opinions, but to indicate briefly the nature of the evidence.
The evidence concerning overdominance is broadly speaking of
two sorts, direct and indirect. The direct evidence comes from the
comparison of heterozygotes and homozygotes in identifiable
genotypes. The indirect evidence comes from the study of the expected
consequences of overdominance as they affect the genetic properties of
a population, or the outcome of certain breeding methods. Both sorts
of evidence are complicated by linkage. We have to distinguish
between overdominance as a property of a single locus, and
overdominance as a property of a segment of chromosome, which we shall
refer to as apparent overdominance. Unequivocal evidence of
overdominance arising from a single locus is scarce because it can only be
obtained from a locus that has mutated in a highly inbred line, or
from a population in which coupling and repulsion linkages are in
equilibrium. The segregation that can be observed in practice, and
that gives rise to the genetic variation in a population, is usually not a
segregation of single loci but of segments of chromosome, longer or
shorter according to the amount of crossing-over. These segments
of chromosome, or units of segregation, can show overdominance
even though the separate loci do not. All that is needed to produce
some degree of apparent overdominance is two genes, linked in
repulsion, and both partially recessive. Its most extreme form is
produced by two lethal genes linked in repulsion (a "balanced lethal"
system), when the heterozygote of the segment spanned by the two
loci is the only viable genotype.
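How two partially recessive genes linked in repulsion give apparent
overdominance can be shown by simple arithmetic. The Python sketch
below is an illustration only: it assumes that the two loci combine
additively, that both have the same effects, and that there is no
crossing-over within the segment; the values of a and d are invented.

```python
def locus_value(n_favourable, a, d):
    """Genotypic value at one locus: +a for two favourable alleles,
    d for the heterozygote, -a for two unfavourable alleles."""
    return {2: a, 1: d, 0: -a}[n_favourable]


def segment_value(segment_1, segment_2, a=1.0, d=0.5):
    """Value of an individual carrying two chromosome segments.

    Each segment is a tuple with 1 for the favourable allele and 0 for
    the partially recessive unfavourable allele at each locus. Loci
    are assumed to act additively (an assumption of this sketch).
    """
    return sum(
        locus_value(g1 + g2, a, d) for g1, g2 in zip(segment_1, segment_2)
    )


# Two segments with the unfavourable genes in repulsion.
s1 = (1, 0)
s2 = (0, 1)

print(segment_value(s1, s1))  # 0.0: homozygous for segment 1
print(segment_value(s2, s2))  # 0.0: homozygous for segment 2
print(segment_value(s1, s2))  # 1.0: heterozygous at both loci
```

Neither locus is overdominant, since d lies between -a and +a, yet the
individual heterozygous for the segment exceeds both segment
homozygotes.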
In considering the direct evidence it is necessary to recognise that
overdominance may be manifested at different "levels" according to
the complexity of the character under discussion. A pair of alleles
with pleiotropic effects may be found not to exhibit overdominance
when any of the characters they affect is examined separately; yet if
natural fitness or economic merit is founded on a combination of
these characters, the alleles may show overdominance with respect to
fitness or merit. Thus there may be no overdominance at the lower
level of the simpler characters, but overdominance at the higher level
of the more complex character.
Example 16.2. An example of overdominance due to pleiotropy is
provided by the pygmy gene in mice, already referred to in several
examples in earlier chapters. The gene reduces body size and in the
homozygote it causes sterility (King, 1955). In respect of body size it is nearly,
but not quite, recessive. In respect of sterility it is probably also nearly
recessive, though this was not proved. In neither body size nor sterility
separately is there overdominance. But if small size were desirable (as it
was in the experiment in which the gene was discovered), then under these
conditions the genotype with the highest merit is the heterozygote, since
the sterile homozygotes cannot reproduce. With respect to merit, or fitness
under these conditions, the gene therefore shows overdominance. The
lethal gene in the line of Drosophila selected for high bristle number,
mentioned in Chapter 12, is another case of the same sort of
overdominance; and so also is the sickle-cell anaemia described in Example 2.4.
The observations that provide direct evidence concerning
overdominance may be briefly summarised as follows. The experience
of Mendelian genetics shows that mutant genes are not commonly
overdominant with respect to their main effects. Nor is
overdominance with respect to natural fitness at all obvious. Indeed, if there
were more than a mild degree of overdominance with respect to
fitness a gene would not be rare enough to be classed as a "mutant."
Though the evidence of Mendelian genetics suggests that
overdominance is not a very common property of genes, many cases are
nevertheless known. Cases of overdominance due to pleiotropy, such as those
mentioned in the above example, are not infrequent. And
overdominance with respect to certain components of natural fitness has
been proved for some of the blood group genes in poultry (see Briles,
Allen, and Millen, 1957; Gilmour, 1958).
The nature of the indirect evidence concerning overdominance
is, in brief summary, as follows.
1. Experiments on the rate of loss of genetic variance during
inbreeding point to the operation of natural selection in favour of
heterozygotes (Tantawy and Reeve, 1956; Briles, Allen, and Millen,
1957; Gilmour, 1958). This indicates apparent overdominance, but
it does not prove overdominance at the individual loci.
2. Crow (1948, 1952) has given reasons for thinking that the
yield of grain obtained from the best crosses between inbred lines of
maize is too high to be accounted for without overdominance at some
loci. The reasoning depends on assumptions about the number of
loci affecting yield and the mutation rates, and the conclusion is
therefore tentative. Robinson et al. (1956) point out that the
reasoning cannot justifiably be applied to maize crosses because the lines
crossed generally come from different varieties and not from the
same base population as required by Crow's hypothesis.
3. Comstock and Robinson (1952) have devised methods for
measuring the average degree of dominance from measurements
made on noninbred populations. Preliminary results from maize
(Robinson and Comstock, 1955) suggest that there cannot be
overdominance (as distinct from apparent overdominance) at more than a
small proportion of the loci that influence the yield of grain.
4. The existence of polymorphism in natural populations, as
described in Chapter 2, cannot readily be explained except by
supposing that the genes concerned are overdominant with respect to
fitness.
From the foregoing outline of the evidence it is clear that the
problem of how important overdominance is remains unsolved.
Some of the differences of opinion about it may arise from different
views of what phenomena are to be included under the term —
whether apparent overdominance due to linkage, or overdominance
due to pleiotropy, are to be regarded as overdominance or not.
Moreover, the question of how important overdominance is means
different things according to whether we are concerned with its
frequency as a property of genes, or with the amount of variation it
causes.
CHAPTER 17
SCALE
The choice of a suitable scale for the measurement of a metric
character has been mentioned several times in the foregoing chapters. The
explanation of what is involved in the choice of a scale and a discussion
of the criteria of suitability have, however, been deferred till this
point because these are matters that cannot be properly appreciated
until the nature of the deductions to be made from the data are
understood. In other words the choice of a scale has to be made in
relation to the object for which the data are to be used. The data
from any experimental or practical study are obtained in the form
most convenient for the measurement of the character. That is to
say the phenotypic values are recorded in grams, pounds, centimetres,
days, numbers, or whatever unit of measurement is most convenient.
The point at issue is whether these raw data should be transformed to
another scale before they are subjected to analysis or interpretation.
A transformation of scale means the conversion of the original units
to logarithms, reciprocals, or some other function, according to what
is most appropriate for the purpose for which the data are to be used.
It is tempting to suppose that each character has its "natural"
scale, the scale on which the biological process expressed in the
character works. Thus, growth is a geometrical rather than an
arithmetical process, and a geometric scale would appear to be the most
"natural." For example, an increase of 1 gm. in a mouse weighing
20 gm. has not the same biological significance as an increase of 1 gm.
in a mouse weighing 2 gm.: but an increase of 10 per cent has
approximately the same significance in both. For this reason a
transformation to logarithms would seem appropriate for measurements of
weight. This, however, is largely a subjective judgment, and some
objective criterion for the choice of a scale is needed. There are
several recognised criteria (see Wright, 1952b); but, as Wright points
out, the different criteria are often inconsistent in the scale they
indicate. And, moreover, the same criterion applied to the same character
may indicate different scales in different populations. Therefore the
idea that every character must have its "natural" and correct scale is
largely illusory.
In the first chapter on metric characters, Chapter 6, it was stated
that we should assume throughout that any metric character under
discussion would be measured on an "appropriate" scale, the
criterion being that the distribution of phenotypic values should
approximate to a normal curve. This is, in principle, the chief
criterion, and a markedly asymmetrical, or skewed, distribution is a
certain indication that the data may have to be transformed if they are
to be used in certain ways. But a transformation may still be required
even if the distribution is not markedly asymmetrical: we shall see
below that the most important criterion then is that the variance
should be independent of the mean. We shall treat the choice of
scale in this chapter by showing what will arise if the transformation
required is not made. We shall find that certain phenomena arise,
called scale effects, which disappear when the appropriate
transformation is made. For the sake of clarity we shall discuss in
particular the logarithmic transformation which converts an arithmetic
to a geometric scale. This is probably the commonest and most useful
transformation. The general principles, outlined by reference to the
log transformation, will, however, apply equally to other
transformations. Let us first consider the distribution of phenotypic
values.
Fig. 17.1 shows three distributions plotted as if from the original
data on an arithmetic scale. They would all three be symmetrical
and normal if the data were first transformed to logarithms, or plotted
on logarithmic paper. There are two points of importance to notice.
First, the degree of departure from normality depends on the amount
of variation in relation to the mean. This may be seen from a
comparison of the two upper graphs, (a) and (b), which are not very
noticeably asymmetrical, with the lower graph, (c), which is. The
relationship between the amount of variation and the mean, which
determines the degree of departure from normality, is best expressed
as the coefficient of variation; i.e. the ratio of standard deviation to
mean, often multiplied by 100 to bring it to a percentage. The
coefficient of variation of the two upper graphs is 20 per cent, while
that of the lower graph is 50 per cent. Thus, a transformation to
logarithms does not make an appreciable difference to the shape of the
distribution unless the coefficient of variation is fairly high — that is,
above about 20 per cent or so. Consequently, statistical procedures
which do not rely on a strictly normal distribution, such as the
analysis of variance, can be carried out on the untransformed data when
the coefficient of variation is not above about 20 per cent.
Transformations to other scales are also less necessary when the
coefficient of variation is low than when it is high.
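The dependence of skewness on the coefficient of variation can be
checked numerically. The sketch below assumes that the character is
exactly normal on the logarithmic scale (a lognormal distribution on
the arithmetic scale), for which the skewness is C(3 + C^2), C being
the coefficient of variation as a fraction. The function names are
ours, not the text's.

```python
import math

def lognormal_skewness(cv):
    """Skewness, on the arithmetic scale, of a variable that is normal
    on the log scale; cv is the coefficient of variation as a fraction."""
    return cv * (3.0 + cv ** 2)

def log_scale_sd(cv):
    """Standard deviation of the natural logarithm of the same variable."""
    return math.sqrt(math.log(1.0 + cv ** 2))

for cv in (0.10, 0.20, 0.50):
    print(f"C = {cv:.0%}: skewness = {lognormal_skewness(cv):.2f}, "
          f"s.d. of ln x = {log_scale_sd(cv):.3f}")
```

Under this assumption the skewness rises from about 0.6 at C = 20 per
cent to about 1.6 at C = 50 per cent.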
The second point to notice in Fig. 17.1 is that the variance, when
computed in arithmetic units, increases when the mean increases.
This may be seen in the two upper graphs, (a) and (b). These have
both the same variance in logarithmic units, but different means.
The mean (or strictly speaking the mode) of (b) is double that of (a)
and the standard deviation in arithmetic units is correspondingly
doubled. Though the distributions are not very noticeably skewed
and a transformation does not seem to be very strongly indicated, yet
in consequence of the difference of mean the variances differ very
greatly. Here, then, is one of the commonest scale effects, namely a
change of variance following a change of the population mean. The
two graphs (a) and (b) in Fig. 17.1 might well represent two
populations which have diverged by some generations of two-way
selection, if the character were something like body weight measured
in grams or pounds. Such characters are commonly found to increase in
variance when the mean increases and to decrease in variance when
the mean decreases. Fig. 17.2 shows an example from an experiment
with mice (MacArthur, 1949), the character being weight at 60 days.
Fig. 17.1. Distributions that are symmetrical and normal on a
logarithmic scale shown plotted on an arithmetic scale. Explanation
in text.
[Fig. 17.2: three frequency distributions of body weight; horizontal
axis in grams.]
Fig. 17.2. Distributions of body weight of male mice at 60 days.
Centre: base population before selection. Left and right: small
and large strains after 21 generations of two-way selection.
(Redrawn from MacArthur, 1949.)
                          Small   Unselected   Large
Standard deviation         1.71      2.56       5.10
Coeff. of variation, %    14.3      11.1       12.8
Phenomena such as the change of variance discussed above are
called scale effects if they disappear when the measurements are
appropriately transformed: in other words, if their cause can be
attributed to the scale of measurement. But they are none the less
real, though labelled as a scale effect or removed by transformation.
The large mice, for example, are really more variable than the small
when their weights are measured in grams. What is gained by
recognising this as a scale effect is that there is no need to look deeper into
the genetic properties of the character for an explanation.
A convenient test for the appropriateness of a logarithmic
transformation is provided by the proportionality of standard deviation
and mean, which we noted in connexion with graphs (a) and (b) in
Fig. 17.1. If two distributions have the same variance on a
logarithmic scale then the coefficients of variation in arithmetic
units will be the same. Thus, constancy of the coefficient of variation
indicates constancy of variance on a logarithmic scale. And, if
variances are to be compared, we may simply compare the coefficients
of variation instead of expressing the variances in logarithmic units.
The standard deviations and coefficients of variation of the
distributions shown in Fig. 17.2 are given in the legend to the figure.
The coefficients of variation, though not identical, are much more
alike than the standard deviations, and this shows that the changes of
variance that have resulted from the selection can be attributed, in
large part at least, to the scale of measurement.
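As a rough check of this comparison, the figures in the legend to
Fig. 17.2 can be put side by side. The sketch assumes that the legend
values read 1.71, 2.56 and 5.10 gm. for the standard deviations and
14.3, 11.1 and 12.8 per cent for the coefficients of variation. The
means are not printed in the legend and are recovered here only
approximately, as the standard deviation divided by C.

```python
# Legend values of Fig. 17.2 (decimal points assumed as stated above).
strains = {
    "Small":      {"sd": 1.71, "cv_pct": 14.3},
    "Unselected": {"sd": 2.56, "cv_pct": 11.1},
    "Large":      {"sd": 5.10, "cv_pct": 12.8},
}

sds = [s["sd"] for s in strains.values()]
cvs = [s["cv_pct"] for s in strains.values()]

# Ratio of largest to smallest value: a crude index of how unequal they are.
sd_ratio = max(sds) / min(sds)
cv_ratio = max(cvs) / min(cvs)

for name, s in strains.items():
    approx_mean = s["sd"] / (s["cv_pct"] / 100.0)  # mean implied by s.d. and C
    print(f"{name:<11} s.d. = {s['sd']:.2f}  C = {s['cv_pct']:.1f}%  "
          f"implied mean = {approx_mean:.1f} gm.")

print(f"largest/smallest s.d. = {sd_ratio:.2f}")
print(f"largest/smallest C    = {cv_ratio:.2f}")
```

The standard deviations differ about threefold, the coefficients of
variation by less than a third.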
The effect of scale on the connexion between variance and mean
complicates the comparison of the variances of two populations that
differ also in mean, as for example the comparison of the variances of
inbreds and hybrids discussed in Chapter 15. If a difference of
variance is to be unambiguously attributed to a difference of
homeostatic power, for example, there must be independent grounds for
believing that a similar difference would not be expected as a scale
effect connected with the difference of mean.
Let us return to the consequences of selection and pursue them a
little further. If the variance changes with the change of mean as a
result of selection, so also will the selection differential and the
response. The response per generation of a character such as we have
been considering would therefore be expected to increase with the
progress of selection in the upward direction, and to decrease
correspondingly in the downward direction. The response to two-way
selection would then be asymmetrical. An example of an asymmetrical
response which can most probably be attributed to a scale effect
in this way is shown in Fig. 17.3. Plotted in arithmetic units, as in
(a), the response is much greater in the upward than in the downward
direction. A transformation to logarithms, shown in (b), renders the
response much more nearly symmetrical. This does not do away
with the fact that the character as measured increased much more than
it decreased under selection. But it accounts for the asymmetry
without the need for more elaborate hypotheses. A convenient way of
eliminating scale effects from the graphical presentation of a response
to selection is to plot the response in the form of the realised
heritability, as explained in Chapter 11 and illustrated in Fig. 11.5.
The realised heritability, which is the ratio of response to selection
differential, is very little influenced by scale effects (Falconer,
1954a).
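The origin of such an asymmetry can be sketched with a toy
calculation. It assumes, purely for illustration, that the
heritability and the intensity of selection stay constant and that the
phenotypic standard deviation remains a fixed fraction C of the mean,
so that the response per generation is i x h2 x C times the current
mean. The numerical values are hypothetical and are not taken from the
experiment shown in Fig. 17.3.

```python
import math

def select(mean0, generations, direction, h2=0.3, intensity=1.0, cv=0.5):
    """Generation means under selection when s.d. = cv * mean.
    direction is +1 for upward and -1 for downward selection."""
    means = [mean0]
    for _ in range(generations):
        sd = cv * means[-1]                      # s.d. follows the mean
        response = direction * intensity * h2 * sd
        means.append(means[-1] + response)
    return means

up = select(100.0, 5, +1)
down = select(100.0, 5, -1)

# Divergence from the starting mean, in arithmetic units.
gain, loss = up[-1] - up[0], down[0] - down[-1]

# The same divergence in logarithmic units. With a constant cv the log of
# the arithmetic mean differs from the mean of the logs by a constant,
# which cancels in these differences.
log_gain = math.log10(up[-1] / up[0])
log_loss = math.log10(down[0] / down[-1])

print(f"arithmetic:  up {gain:.1f}, down {loss:.1f}")
print(f"logarithmic: up {log_gain:.3f}, down {log_loss:.3f}")
```

On these assumptions the upward response in arithmetic units is nearly
twice the downward, while in logarithmic units the two are much more
nearly equal.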
[Fig. 17.3: two graphs of the response to selection, (a) and (b);
horizontal axes in generations.]
Fig. 17.3. Response to two-way selection for resistance to dental
caries in rats. Resistance is measured in days and plotted on an
arithmetic scale in (a), and on a logarithmic scale in (b). The
arithmetic means were converted to logarithmic means by formula 17.1.
The coefficient of variation was high (about 50%) and was
approximately constant. The reason why the upward selection has not
covered so many generations as the downward is simply that the
increased resistance lengthened the generation interval. (Data
from Hunt, Hoppert, and Erwin, 1944.)
When means or variances are to be compared, for example in a
comparison of two populations or in following the changes resulting
from selection, and a transformation to logarithms is indicated, it is
not necessary to convert each individual measurement. On the other
hand it is not sufficient to convert the arithmetic mean or variance to
logarithms, unless the coefficient of variation is very low. The
conversions may be conveniently made by the two following formulae,
given by Wright (1952b). The first converts the mean of arithmetic
values to the mean of logarithmic values, and the second converts the
variance as computed from the arithmetic values to the variance as it
would be computed from logarithmic values.
\[ \overline{\log x} = \log \bar{x} - \tfrac{1}{2} \log (1 + C^2) \tag{17.1} \]
\[ \sigma^2_{(\log x)} = 0.4343 \log (1 + C^2) \tag{17.2} \]
In these formulae C is the coefficient of variation in the form
$\sigma / \bar{x}$ computed from arithmetic values, and the logarithms
are to the base 10.
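Formulae 17.1 and 17.2 are simple to apply in a few lines of code. The
following is a minimal sketch, not part of the original text; the
function name and the example figures (a mean of 100 with C = 0.5) are
hypothetical. The constant 0.4343 is log10 e.

```python
import math

def to_log_scale(mean, sd):
    """Convert an arithmetic mean and s.d. to the mean and variance of
    log10 values, using formulae 17.1 and 17.2."""
    c = sd / mean                                   # coefficient of variation
    log_term = math.log10(1.0 + c ** 2)
    log_mean = math.log10(mean) - 0.5 * log_term    # formula 17.1
    log_var = math.log10(math.e) * log_term         # formula 17.2
    return log_mean, log_var

log_mean, log_var = to_log_scale(mean=100.0, sd=50.0)
print(f"mean of log x     = {log_mean:.4f}")
print(f"variance of log x = {log_var:.4f}")
print(f"log of the mean   = {math.log10(100.0):.4f}")
```

The last line is printed for comparison: with a coefficient of
variation as high as this, the logarithm of the arithmetic mean is
noticeably larger than the mean of the logarithms.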
We turn now to what is perhaps a more fundamental effect of a
scale transformation — its effect on the apparent nature of the genetic
variance. To understand this we must go back to a single locus and
consider the effect, or mode of action, of the genes. Let us imagine a
locus with two alleles whose mode of action is geometric, the
genotypic value of A2A2 being 50 per cent greater than A1A2 and that
of A1A2 being also 50 per cent greater than A1A1. Thus on the
logarithmic scale there is no dominance, the heterozygote being
exactly midway between the two homozygotes. Now suppose the genotypic
values are measured in arithmetic units, such as grams, and that A1A1
has a value of 10 units. Then A1A2 will be 15 units and A2A2 22.5
units. On the arithmetic scale, therefore, A1 is partially dominant to
A2, the heterozygote no longer falling midway between the
homozygotes. Thus the degree of dominance is influenced by the scale
of measurement, and so also is the proportionate amount of dominance
variance. This effect of a scale transformation, however, is normally
rather small. A gene that causes a 50 per cent difference between the
genotypic values, such as we have considered, would be a major gene,
easily recognisable individually. But even so the degree of dominance
on the arithmetic scale is not very great. Minor genes with effects of
perhaps 1 per cent or 10 per cent would be scarcely influenced in their
dominance.
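How small this induced dominance is can be checked directly. The
sketch below assumes a locus whose three genotypic values on the
arithmetic scale are in the ratio 1 : (1 + r) : (1 + r)^2, and
expresses the dominance as d/a, where a is half the difference between
the homozygotes and d is the deviation of the heterozygote from their
mid-point. The function name is illustrative.

```python
def dominance_ratio(r, base=10.0):
    """Degree of dominance d/a on the arithmetic scale for a locus whose
    effect is geometric: each A2 substituted multiplies the value by 1 + r."""
    a1a1 = base
    a1a2 = base * (1.0 + r)
    a2a2 = base * (1.0 + r) ** 2
    midpoint = (a1a1 + a2a2) / 2.0
    a = (a2a2 - a1a1) / 2.0
    d = a1a2 - midpoint          # negative: heterozygote nearer to A1A1
    return d / a

for r in (0.50, 0.10, 0.01):
    print(f"effect of {r:.0%} per allele: d/a = {dominance_ratio(r):+.3f}")
```

Algebraically d/a reduces to -r/(2 + r), so that the 50 per cent gene
of the text gives d/a = -0.2, and a 1 per cent gene gives a value close
to zero.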
In the same way that the dominance is affected by the scale, so
also is the epistatic interaction between different loci. Loci with
geometric effects would combine without interaction if the genotypic
values were measured in logarithmic units. But when measured in
arithmetic units there would be interaction deviations due to
epistasis. Thus the amount of interaction variance is also influenced by
the scale of measurement. The following example illustrates the
dependence of interaction on scale.
Example 17.1. The pygmy gene in mice is a major gene affecting body
size, homozygotes being much reduced in size. The effect of this gene
was studied in different genetic backgrounds (King, 1955). The gene was
transferred from the strain selected for small size where it arose, to a strain
selected for large size, by repeated backcrosses. The mean difference
between pygmy homozygotes and normals (i.e. heterozygotes and normal
homozygotes together) was measured in the two strains and during the
transference, the comparisons being made between pygmies and normals
in the same litters. The results are shown in Fig. 17.4. The difference
between pygmies and normals increases with the weight of the normals.
In the background of the small strain the pygmies were about 7 gm. smaller
than normals, but in the background of the large strain they were about
12 gm. smaller. Thus the pygmy gene shows epistatic interaction with the
other genes that affect body size. But if the effect of the gene is expressed
as a proportion, it is constant and independent of the other genes present.
Pygmies are about half the weight of their normal littermates, no matter
what the actual weights are. Thus if the comparisons are made in
logarithmic units there is no epistatic interaction.
[Fig. 17.4: mean weight of pygmies plotted against weight of normal
mice (g.).]
Fig. 17.4. Intra-litter comparisons of the 6-week weights of pygmies
and normals. Mean of pygmies plotted against mean of normals in the
same litter. (From King, 1955; reproduced by courtesy of the author
and the editor of the Journal of Genetics.)
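The point of Example 17.1 can be put in figures. The weights below are
illustrative round numbers, chosen only to agree with the statements
that pygmies are about half the weight of normals and that the
differences are about 7 gm. and 12 gm.; they are not King's data.

```python
import math

# Hypothetical mean weights (gm.) of normal mice in the two backgrounds.
normal_weight = {"small strain": 14.0, "large strain": 24.0}
PYGMY_FACTOR = 0.5   # assumed: pygmies weigh half as much as normals

effects = {}
for strain, normal in normal_weight.items():
    pygmy = PYGMY_FACTOR * normal
    effects[strain] = {
        "arithmetic": normal - pygmy,                            # gm.
        "logarithmic": math.log10(normal) - math.log10(pygmy),   # log units
    }
    print(f"{strain}: effect = {effects[strain]['arithmetic']:.1f} gm., "
          f"or {effects[strain]['logarithmic']:.3f} log units")

# Interaction = difference between backgrounds in the effect of the gene.
arith_interaction = (effects["large strain"]["arithmetic"]
                     - effects["small strain"]["arithmetic"])
log_interaction = (effects["large strain"]["logarithmic"]
                   - effects["small strain"]["logarithmic"])
```

The effect of the gene differs by 5 gm. between the two backgrounds on
the arithmetic scale, but is the same (log 2) in both on the
logarithmic scale.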
In general, therefore, a scale transformation may remove or
reduce the variance attributable to epistatic interaction, and this
variance might then be labelled as a scale effect. A transformation
which removes or reduces interaction variance may be useful if
conclusions are to be drawn from an analysis that depends for its
validity on the absence of interaction. A detailed treatment of the
relationship between scale and epistatic interaction is given by Horner,
Comstock, and Robinson (1955).
In this chapter we have outlined some of the scale effects most
commonly met with, and have indicated the circumstances under
which a transformation of scale may be helpful to the interpretation
of results and the drawing of conclusions. Transformations of scale,
however, should not be made without good reason. The first purpose
of experimental observations is the description of the genetic
properties of the population, and a scale transformation obscures
rather than illuminates the description. If epistasis, for example, is
found, this is an essential part of the description, and it is better
labelled as epistasis than as a scale effect. The transformation of scale
is essentially a statistical device to be employed for the purpose of
simplifying the analysis of the data, or to make possible the drawing
of valid conclusions from the analysis. It is sometimes helpful also in
the interpretation of results. If epistasis, for example, were found to
disappear on transformation to a logarithmic scale we could conclude
that the effects of different loci combined by multiplication rather
than by addition. Or, if there were good reasons for attributing a
difference of variance to a scale effect we should not need to invoke
more complicated genetic explanations. The choice of scale, however,
raises troublesome problems in connexion with the interpretation
of results. Logical justification of a scale transformation can
only come from some criterion other than the property about which
the conclusions are to be drawn. If there is no independent criterion
the argument becomes circular, and the distinction between a scale
effect and some other interpretation becomes meaningless. There is
also a more fundamental difficulty: the scale appropriate for one
population may not be appropriate for another, and the scale
appropriate to the genetic and environmental components of the
variation may be different. This difficulty is strikingly illustrated
by an analysis of the character "weight per locule" in a number of
crosses between varieties of tomato (Powers, 1950). By the same
criterion, normality of the distribution, this character was found to
require an arithmetic scale in some crosses and a geometric scale in
others; and, moreover, in the F2 generations of some crosses the
genetic variation required one scale while the environmental variation
required another.
CHAPTER 18
THRESHOLD CHARACTERS
There are many characters of biological interest or economic
importance whose inheritance is multifactorial but whose distribution is
discontinuous. For example: resistance to disease, a character
expressed either in survival or in death with no intermediate; "litter"
size in the larger mammals that bear usually one young at a time but
sometimes two or three; or the presence or absence of any organ or
structure. Characters of this sort appear at first sight to be outside the
realm of quantitative genetics because they do not exhibit continuous
variation; yet when subjected to genetic analysis they are found to be
under the influence of many genes just as any metric character. For
this reason they have been called "quasi-continuous variations"
(Grüneberg, 1952): the phenotypic values are discontinuous but the
mode of inheritance is like that of a continuously varying character.
The clue to the understanding of the inheritance of such characters
lies in the idea that the character has an underlying continuity with a
"threshold" which imposes a discontinuity on the visible expression
of the character, as depicted in Fig. 18.1. The underlying continuous
variation is both genetic and environmental in origin, and may be
thought of as the concentration of some substance or the speed of some
developmental process — of something, that is to say, that could in
principle be measured and studied as a metric character in the
ordinary way. The hypothetical measurement of this variation is
supposed to be made on a scale that renders its distribution normal,
and the unit of measurement is the standard deviation of the
distribution. This provides what may be called the underlying scale. We
now have two scales for the description of the phenotypic values: the
underlying scale which is continuous, and the visible scale which is
discontinuous. The two are connected by the threshold, or point of
discontinuity. This is a point on the continuous scale which
corresponds with the discontinuity in the visible scale. The idea will
be clearer from an inspection of Fig. 18.1, which depicts a character
whose visible expression can take only two forms, such as alive versus
dead, or present versus absent. Individuals whose phenotypic values
on the underlying scale exceed the threshold will appear in one visible
class, while individuals below the threshold will appear in the other.
[Fig. 18.1: two normal curves; horizontal axes in standard deviations.]
Fig. 18.1. Illustrations of a threshold character with two visible
classes. The vertical line marks the threshold between the two
phenotypic classes, one of which is cross-hatched. The population
depicted on the left has an incidence of 10%; that on the right, an
incidence of 90%.
On the visible scale individuals can have only two values, 0 or 1.
Groups of individuals, however, such as families or the population as
a whole can have any value, in the form of the proportion or
percentage of individuals in one or other class. This may be referred
to as the incidence of the character. Susceptibility to disease, for
example, can be expressed as the percentage mortality in the
population or in a family. The incidence is quite adequate as a
description of the population or group, but the percentage scale in
which the incidence is expressed is inappropriate for some purposes
because on a percentage scale variances differ according to the mean.
The interpretation of genetic analyses of threshold characters is
therefore facilitated by the transformation of incidences to values on
the underlying scale. The transformation is easily made by reference
to a table of probabilities of the normal curve. The threshold is a
point of truncation whose deviation from the population mean can be
found from the proportion of the population falling beyond it. A table
of "probits" (Fisher and Yates, 1943, Table IX) is convenient to use
because it refers to a single tail of the distribution and obviates
confusion over the sign of the deviation. The transformation from the
visible to the underlying scale enables us to state the mean
phenotypic value of a population or family in terms of its standard
deviation, and to compare the means of different populations or
families provided they have the same standard deviation. It is
convenient to take the position of the threshold as the origin, or
zero-point, on the underlying scale and to express the mean as a
deviation from the threshold. Thus if the incidence of the character
is, for example, 10 per cent, a table of the normal curve shows that
the threshold exceeds the mean by 1.28 standard deviations. The
population mean, referred to the threshold as origin, is therefore
-1.28σ. Or, if the incidence were 90 per cent then the population mean
would be +1.28σ, as shown in Fig. 18.1. For any comparison of means,
however, it is necessary to assume that the populations compared have
the same variance on the underlying scale. If reasons are known for
the variances not being equal (in comparisons, for example, between
inbreds, F1's and F2's) then the means cannot be expressed on a
common scale that allows a valid comparison to be made.
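In place of a printed table, the conversion can be sketched with the
inverse of the normal probability integral. The example assumes, as in
Fig. 18.1, that the class whose incidence is recorded lies above the
threshold. `NormalDist` is from the Python standard library (version
3.8 or later); the function name is ours.

```python
from statistics import NormalDist

def mean_from_incidence(incidence):
    """Population mean on the underlying scale, in standard deviations,
    with the threshold as origin. `incidence` is the proportion of
    individuals falling above the threshold (0 < incidence < 1)."""
    # The threshold cuts off `incidence` in the upper tail, so it lies
    # inv_cdf(1 - incidence) standard deviations from the mean.
    threshold_from_mean = NormalDist().inv_cdf(1.0 - incidence)
    return -threshold_from_mean

for pct in (10, 25, 90):
    print(f"incidence {pct}%: mean = {mean_from_incidence(pct / 100):+.2f} s.d.")
```

For incidences of 10 and 90 per cent this gives means of about -1.28
and +1.28 standard deviations.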
This is as far as we can go with a character that is visibly expressed
in only two classes. The mean of a population or group can be stated,
but not the variance, because the mean has to be stated in terms of the
standard deviation. We can, however, subject the observed means of
families to analysis and compute the heritability of the character.
The heritability of threshold characters is treated by A. Robertson
and Lerner (1949) and by Dempster and Lerner (1950), and will not
be further discussed here.
If a character has three classes in its visible scale then comparisons
can be made between the variances of populations as well as between
the means. The number of lumbar vertebrae in mice is a character
of this sort that has been extensively studied (Green, 1951; McLaren
and Michie, 1955). The number is usually either 5 or 6, but some
individuals have 5 on one side and 6 on the other. This comes about
through the last vertebra being sacralised on one side and not on the
other. The asymmetrical mice have 5½ lumbar vertebrae and are
regarded as being intermediate between the 5-class and the 6-class.
When the visible scale has three classes there are two thresholds,
as shown in Fig. 18.2. If the assumption is made that the difference
between the two thresholds represents a constant difference on the
underlying scale, then we have not only a fixed origin of the scale but
also a fixed unit, and this provides a basis for the comparison of
variances as well as of means. The underlying scale then has one of
the thresholds as origin and the threshold difference as the unit of
measurement. The idea is most easily explained by a numerical
example. Consider the two populations illustrated in Fig. 18.2. Let
their standard deviations on a common underlying scale be σ1 and σ2
respectively, and let them have the following incidences in the three
visible classes, X, I, and Z, of which I is the intermediate class:
                                     Class
                                X      I      Z
Incidence, %.  Population (1)   60     15     25
               Population (2)   20     10     70
The deviations of the thresholds from the population means, found
from a table of the normal curve, are as follows:
                     X/I         I/Z       Threshold interval
Population (1)     +0.25σ1     +0.67σ1          0.42σ1
Population (2)     -0.84σ2     -0.52σ2          0.32σ2
The intervals between the two thresholds, given above on the right,
are found by subtraction of the deviations of the two thresholds in
each population. These threshold intervals are supposed by
hypothesis to be equal on the common underlying scale. By assigning the
threshold interval the value of one "threshold unit" we can therefore
express the standard deviations of the two populations on a common
basis in terms of threshold units. The standard deviations then
become
    σ1 = 2.38 threshold units
    σ2 = 3.12 threshold units.
The means of the populations can also be expressed in threshold
units. Reckoned from the X/I threshold as origin they are
    M1 = -0.25σ1 = -0.60 threshold units
    M2 = +0.84σ2 = +2.62 threshold units.
[Fig. 18.2: two normal curves, populations (1) and (2); horizontal
axes in threshold units.]
Fig. 18.2. Illustrations of a threshold character with three visible
classes, in two populations with incidences as shown. The axes are
marked in threshold units, and the population means are indicated
by arrows. Further explanation in text.
The standard deviation and population mean of a character with
three visible classes may be put in general form in the following way.
Let X be the incidence in one visible class, and Y the incidence in
this class together with the intermediate class. Let the threshold
between these two classes be the origin of the underlying scale. Let
x and y be the deviations of the two thresholds corresponding to the
incidences X and Y respectively. Then the standard deviation is
\[ \sigma = \frac{1}{x - y} \text{ threshold units} \tag{18.1} \]
and the mean is
\[ M = x\sigma = \frac{x}{x - y} \text{ threshold units} \tag{18.2} \]
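The calculation can be sketched as follows for the two populations of
the numerical example. The code works with the positions of the
thresholds relative to the mean, as tabulated there, so that the
threshold interval is the upper position minus the lower, and the mean
reckoned from the lower threshold is minus the lower position times
the standard deviation. How these signs map on to x and y in formulae
18.1 and 18.2 is our assumption, chosen to reproduce the worked
figures. The function name is illustrative, and `NormalDist` is from
the Python standard library.

```python
from statistics import NormalDist

def threshold_scale(lower_class, intermediate_class):
    """Mean and s.d. in threshold units from the incidences (proportions)
    of the lower visible class and of the intermediate class."""
    inv = NormalDist().inv_cdf
    t_low = inv(lower_class)                        # lower threshold, in s.d. from the mean
    t_high = inv(lower_class + intermediate_class)  # upper threshold, likewise
    sigma = 1.0 / (t_high - t_low)                  # threshold interval = 1 unit
    mean = -t_low * sigma                           # origin at the lower threshold
    return mean, sigma

m1, s1 = threshold_scale(0.60, 0.15)   # population (1)
m2, s2 = threshold_scale(0.20, 0.10)   # population (2)
print(f"population (1): M = {m1:+.2f}, sigma = {s1:.2f} threshold units")
print(f"population (2): M = {m2:+.2f}, sigma = {s2:.2f} threshold units")
```

Unrounded normal deviates give values slightly different from those
worked out with two-decimal table entries; for population (2) the
standard deviation comes to about 3.15 rather than 3.12.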
The comparison of variances in this way depends entirely, as we
have pointed out, on the assumption that the interval between the
two thresholds is constant from one population to another. If we
think again of the hypothetical substance or process whose
concentration or rate determines the value on the underlying scale, the
assumption is that the intermediate class spans the same difference of
concentration or of rate in the two populations compared. Whether this
assumption is a reasonable one or not is hard to judge. It may,
nevertheless, lead to reasonable results, as the following example
shows.
Example 18.1. The number of lumbar vertebrae was studied in two
inbred lines of mice and their cross (Green and Russell, 1951). The inbred
lines were a branch of the C3H strain with predominantly 5 lumbar
vertebrae, and the C57BL strain with predominantly 6 lumbar vertebrae.
Crosses were made reciprocally, and F2 generations were made from each
F1. The incidences of the 5-vertebra class and of the intermediate class of
asymmetrical mice with 5½ are given in the table. The reciprocal F1's
were found to differ and are listed separately. The F2's did not differ and
their results are pooled. The table gives also the positions of the two
thresholds in standard deviations; and the mean and standard deviation
computed in threshold units, the mean being reckoned from the threshold
between the 5-class and the asymmetrical class as origin.
                         Incidence, %    Deviation of       Mean and standard
                                         thresholds from    deviation in
                                         mean, in σ         threshold units
Population                5      5½      5/5½     5½/6        M        σ
Inbreds  C3H            96.9     2.3    +1.87    +2.41      -3.44     1.84
         C57             1.3     2.0    -2.23    -1.84      +5.74     2.58
F1       C3H♀ × C57♂    57.4    15.5    +0.19    +0.61      -0.44     2.36
         C57♀ × C3H♂    29.0    25.0    -0.55    +0.10      +0.85     1.53
F2 (pooled)             46.7    12.2    -0.08    +0.23      +0.27     3.25
The distributions of the populations, based on the computed means and
standard deviations, are shown graphically in Fig. 18.3. It should be
noted that the means and standard deviations of the inbreds are not
very precisely estimated because the incidences are low. The computed
properties of the populations follow the expected pattern. The F1
generation is intermediate in mean between the two parental
populations, though there is a maternal effect causing a difference
between the reciprocal F1's. This maternal effect has been further
studied and confirmed by McLaren and Michie (1956a). The variance of
the F1 is somewhat lower than that of the parental inbreds, as might be
expected from a reduction of environmental variance in the hybrids.
This was further studied and confirmed by McLaren and Michie (1955).
The F2 is equal in mean to the F1, but shows an increased variance as
would be expected from the segregation of genes. If we take 2.00 as the
mean standard deviation of the F1, representing purely environmental
variation, then the environmental variance is 4.00, and the total
phenotypic variance given by the F2 is 10.56; therefore the genotypic
variance works out at 6.56, or 62 per cent of the total. Thus the
analysis of the threshold character studied in this cross leads to very
reasonable results, and the assumptions on which it rests do not seem
to be very seriously wrong.
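The closing arithmetic of Example 18.1 may be set out as a short
calculation. It takes the standard deviations as 2.00 threshold units
for the F1 (environmental variation only) and 3.25 for the F2 (total
variation), as stated in the example; the variable names are ours.

```python
sd_f1 = 2.00   # mean s.d. of the reciprocal F1's: environmental variation only
sd_f2 = 3.25   # s.d. of the pooled F2: genotypic plus environmental variation

v_environmental = sd_f1 ** 2
v_phenotypic = sd_f2 ** 2
v_genotypic = v_phenotypic - v_environmental
genotypic_fraction = v_genotypic / v_phenotypic

print(f"environmental variance = {v_environmental:.2f}")
print(f"phenotypic variance    = {v_phenotypic:.2f}")
print(f"genotypic variance     = {v_genotypic:.2f}")
print(f"genotypic share        = {genotypic_fraction:.0%}")
```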
[Fig. 18.3: five distributions on a common horizontal axis graduated
in threshold units and in vertebrae.]
Fig. 18.3. Distributions of number of lumbar vertebrae in mice
transformed to the underlying scale of threshold units. The upper
distributions are two inbred lines, the two middle ones are the two
reciprocal F1's, and the lower distribution is the F2. (Data from
Green & Russell, 1951.) See Example 18.1 for further explanation.
The meaning of the threshold unit in which values on the underlying
scale are expressed may conveniently be discussed by reference
to the number of lumbar vertebrae in mice, described in the above
example. From the graduation of the scale at the foot of Fig. 18.3
it appears that the threshold interval corresponds to one vertebra. It
is therefore tempting to regard the scale as indicating "potential"
vertebrae, ranging from 5 at the origin to 15 at the upper extreme
and to -5 at the lower extreme. We should then regard the developing
vertebral column as being protected by canalisation against this
wide range of potential variation, so that the vertebrae actually
formed are restricted to the narrow range between 5 and 6. This
interpretation, however, assumes that individuals with a potential
number anywhere between 5 and 6 will be asymmetrical with 5½
vertebrae; and for this there is no justification. The asymmetrical
individuals may equally well, or more probably, be those with almost
exactly 5½ potential vertebrae. Suppose, for example, that the range
of potential vertebrae that gave rise to an asymmetrical individual
were between 5.4 and 5.6. Then 1 threshold unit would correspond
to 0.2 potential vertebrae; the origin of the underlying scale would
be at 5.4 and the variation would range from 7.4 potential vertebrae
at one extreme to 3.4 at the other. Or, if the asymmetrical individuals
covered a range of only 0.1 potential vertebrae, the whole
distribution would lie within the potential numbers of 5 and 6, just as the
actual range does. Thus the threshold unit is purely arbitrary in
nature; though useful for the comparison of populations, it cannot be
given any concrete interpretation.
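The arbitrariness of the unit can be illustrated with the first two
cases discussed above. The sketch assumes, as the text does, that the
distributions extend about 10 threshold units on either side of the
origin; the function and its parameter names are hypothetical.

```python
def potential_range(asym_lower, asym_upper, span_units=10.0):
    """Range of 'potential vertebrae' covered by the underlying scale if
    asymmetrical mice are those between asym_lower and asym_upper.
    The origin is the lower threshold; one threshold unit is the width
    of the asymmetrical class."""
    unit = asym_upper - asym_lower
    origin = asym_lower
    return origin - span_units * unit, origin + span_units * unit

for lower, upper in ((5.0, 6.0), (5.4, 5.6)):
    low, high = potential_range(lower, upper)
    print(f"asymmetrical class {lower} to {upper}: "
          f"scale runs from {low:.1f} to {high:.1f} potential vertebrae")
```

The same set of threshold-unit values thus corresponds to quite
different ranges of potential vertebrae, according to the width
assumed for the asymmetrical class.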
From what has been said so far in this chapter it will be clear that
threshold characters do not provide ideal material for the study of
quantitative genetics, because the genetic analyses to which they can
be subjected are limited in scope and subject to assumptions that one
would be unwilling to make except under the force of necessity. We
turn now to a consideration of some aspects of selection for threshold
characters, which has more practical importance than the genetic
analyses that we have been considering, and does not involve the same
theoretical difficulties.
Selection for Threshold Characters
Selection for threshold characters has some practical importance
in connexion with the improvement of viability and with changing
the response of experimental animals to treatments, such as, for
example, increasing or decreasing drug resistance. We shall consider
only characters with two visible classes; and we shall assume that
there is no means of measuring some aspect of the character that
varies continuously, such as measuring the time of survival instead of
classifying simply dead versus alive.
The response to selection depends in the usual way on the selection
differential. But the selection differential does not depend primarily
on the proportion selected, as with a continuously varying
character, but on the incidence, for the following reason. We may
breed exclusively from those individuals in the desired phenotypic
class, but we cannot discriminate between those with high and those
with low values on the underlying scale. The selected individuals are
therefore a random sample from the desired class, and the mean of
the selected individuals is the mean of the desired class, irrespective
of whether we select all of the desired class or only a portion of it.
The point will be made clearer by reference to Fig. 18.1, letting the
crosshatching represent the desired class. Let us suppose that the
replacement rate allows us to select 10 per cent of the population. If
we select out of the population on the right, with an incidence of 90
per cent, the mean of the selected individuals will be the same as if
we had selected 90 per cent. But if we select out of the population
on the left, with an incidence of 10 per cent, we shall use all of the
individuals in the desired class and none of the others. The selection
differential will then be the same as if we had selected on the basis of
a continuously varying character. Thus the selection differential is
greatest when the incidence is exactly equal to the proportion selected.
If it is less we shall be forced to use some individuals of the undesired
class; and if it is greater we shall do no better than we should
by selecting the whole of the desired class.
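
The argument above can be checked numerically. The following is only a sketch: it assumes that the underlying scale is normally distributed with unit standard deviation, that the desired class lies above the threshold, and that any shortfall of parents is made up at random from the undesired class. The function name `selection_differential` is introduced here for illustration.

```python
from statistics import NormalDist

_UNDERLYING = NormalDist()  # assumed: standard normal underlying scale


def selection_differential(incidence: float, selected: float) -> float:
    """Mean of the selected parents on the underlying scale, in standard
    deviations from the population mean.

    incidence: proportion of the population in the desired class (0 < p < 1)
    selected:  proportion of the population that must be used as parents
    """
    # Threshold position, as a deviation from the population mean.
    x = _UNDERLYING.inv_cdf(1.0 - incidence)
    z = _UNDERLYING.pdf(x)  # ordinate of the normal curve at the threshold
    mean_desired = z / incidence
    if selected <= incidence:
        # A random sample of the desired class has the mean of that class.
        return mean_desired
    # Otherwise the whole desired class is used, and the remainder is a
    # random sample of the undesired class.
    mean_undesired = -z / (1.0 - incidence)
    extra = selected - incidence
    return (incidence * mean_desired + extra * mean_undesired) / selected


if __name__ == "__main__":
    for p in (0.05, 0.10, 0.50, 0.90):
        s = selection_differential(incidence=p, selected=0.10)
        print(f"incidence {p:.2f}: selection differential {s:.3f}")
```

With 10 per cent selected, the value is largest when the incidence is also 10 per cent, and falls away on either side, which is the point made in the text.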
With some characters, however, the incidence can be altered and
this provides a means of improving the response to selection. If the
character is, for example, a reaction to some treatment, the treatment
can be increased or reduced in intensity, so that the incidence is
altered. This is an alteration of the mean level of the environment,
and its effect is in principle to shift the distribution of phenotypic
values with respect to the fixed threshold. But it is more convenient
to regard it as changing the nature of the character and shifting
the threshold with respect to a fixed mean phenotypic level.
When the level of the threshold can be controlled in this way, the
maximum speed of progress under selection will be attained by adjusting
the threshold so that the incidence is kept as nearly as possible
equal to the minimum proportion that must be selected for breeding.
The progress made can be assessed by subjecting the population, or
part of it, to the original treatment under which the threshold is at its
original level.
Genetic assimilation. A very interesting result of the application
of this principle of changing the threshold by environmental
means is the phenomenon known as "genetic assimilation" (Waddington,
1953). If a threshold character appears as a result of an
environmental stimulus, and selection is applied for this character, it
may eventually be made to appear spontaneously, without the necessity
of the environmental stimulus. In this way what was originally
an "acquired character" becomes by perfectly orthodox principles of
selection an "inherited character" (Waddington, 1942). In such a
situation there are two thresholds, one spontaneous and the other
induced, as shown in Fig. 18.4. The spontaneous threshold is at first
outside the range of variation of the population, so that there is no
variation of phenotype and no selection can be applied (Fig. 18.4, a).
The induced threshold, however, is within the range of the underlying
scale covered by the population, and it allows individuals toward
one end of the distribution to be picked out by selection. In this way
the mean genotypic value of the population is changed. If this change
goes far enough some individuals will eventually cross the spontaneous
threshold and appear as spontaneous variants (Fig. 18.4, b).
When the spontaneous incidence becomes high enough selection may
be continued without the aid of the environmental stimulus, and the
spontaneous incidence may be further increased (Fig. 18.4, c).

Fig. 18.4. Diagram illustrating genetic assimilation of a threshold
character. Distributions on the underlying scale, which is marked
in standard deviations. The vertical lines show the positions of the
induced and spontaneous thresholds, and the arrows mark the
population means at three stages of selection.
(a) before selection: incidence: induced = 30 %, spontaneous = 0 %
(b) after some selection: incidence: induced = 80 %, spontaneous = 2 %
(c) after further selection: incidence: induced = 100 %, spontaneous = 95 %
Example 18.2. An experimental demonstration of genetic assimilation
in Drosophila melanogaster is described by Waddington (1953). The character
was the absence of the posterior crossvein of the wing. In the base
population no flies with this abnormality were present, but treatment of
the puparium by heat shock caused about 30 per cent of crossveinless
individuals to appear. Selection in both directions was applied to the
treated flies, and after 14 generations the incidence of the induced character
had risen to 80 per cent in the upward selected line and fallen to
8 per cent in the downward selected line. At this time crossveinless
flies began to appear in small numbers among untreated flies of the upward
selected line, and by generation 16 the spontaneous incidence was between
1 and 2 per cent. Selection was then continued without treatment, the
population being subdivided into a number of lines. The best four of the
lines, selected without further treatment, reached spontaneous incidences
ranging from 67 per cent to 95 per cent. The distributions in Fig. 18.4
illustrate the progress of the upward selection. Graph (b) shows a spontaneous
incidence of 2 per cent and an induced incidence of 80 per cent
and thus corresponds approximately with generation 16. On the assumption
of constant variance, the change of mean at this stage amounted to
1.36 standard deviations. Graph (c) shows a spontaneous incidence of
95 per cent and represents the line that finally showed the greatest progress.
Its mean on the underlying scale is 5.15 standard deviations above
that of the initial population.
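
The change of mean quoted for graph (b) can be retraced from the two induced incidences of the example. The sketch below assumes a normal underlying scale with constant variance, as the example does; the helper name `mean_above_threshold` is introduced here for illustration.

```python
from statistics import NormalDist


def mean_above_threshold(incidence: float) -> float:
    """Distance of the population mean above a fixed threshold, in standard
    deviations, for a given incidence (0 < incidence < 1).

    Assumes a normal distribution on the underlying scale.
    """
    return NormalDist().inv_cdf(incidence)


# Induced incidence before selection (30 %) and at about generation 16 (80 %).
shift = mean_above_threshold(0.80) - mean_above_threshold(0.30)

if __name__ == "__main__":
    print(f"change of mean: {shift:.2f} standard deviations")
```

The result is about 1.37 standard deviations, which agrees with the value in the example to within the rounding of the tabulated normal deviates. An incidence of exactly 0 or 100 per cent cannot be converted in this way, since the mean is then at an indeterminate distance from the threshold.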
The idea of genetic assimilation is not confined to threshold
characters; but for its wider significance the reader must be referred
to Waddington (1957).
CHAPTER 19
CORRELATED CHARACTERS
This chapter deals with the relationships between two metric characters,
in particular with characters whose values are correlated,
either positively or negatively, in the individuals of a population.
Correlated characters are of interest for three chief reasons. Firstly
in connexion with the genetic causes of correlation through the
pleiotropic action of genes: pleiotropy is a common property of major
genes, but we have as yet had little occasion to consider its effects in
quantitative genetics. Secondly in connexion with the changes
brought about by selection: it is important to know how the improvement
of one character will cause simultaneous changes in other
characters. And thirdly in connexion with natural selection: the
relationship between a metric character and fitness is the primary
agent that determines the genetic properties of that character in a
natural population. This last point, however, will be discussed in
the next chapter.
Genetic and Environmental Correlations
In genetic studies it is necessary to distinguish two causes of correlation
between characters, genetic and environmental. The genetic
cause of correlation is chiefly pleiotropy, though linkage is a cause of
transient correlation particularly in populations derived from crosses
between divergent strains. Pleiotropy is simply the property of a
gene whereby it affects two or more characters, so that if the gene is
segregating it causes simultaneous variation in the characters it
affects. For example, genes that increase growth rate increase both
stature and weight, so that they tend to cause correlation between
these two characters. Genes that increase fatness, however, influence
weight without affecting stature, and are therefore not a cause of
correlation. The degree of correlation arising from pleiotropy expresses
the extent to which two characters are influenced by the same
genes. But the correlation resulting from pleiotropy is the overall, or
net, effect of all the segregating genes that affect both characters.
Some genes may increase both characters, while others increase one
and reduce the other; the former tend to cause a positive correlation,
the latter a negative one. So pleiotropy does not necessarily cause a
detectable correlation. The environment is a cause of correlation in
so far as two characters are influenced by the same differences of
environmental conditions. Again, the correlation resulting from environmental
causes is the overall effect of all the environmental
factors that vary; some may tend to cause a positive correlation, others
a negative one.
The association between two characters that can be directly
observed is the correlation of phenotypic values, or the phenotypic
correlation. This is determined from measurements of the two
characters in a number of individuals of the population. Suppose,
however, that we knew not only the phenotypic values of the individuals
measured, but also their genotypic values and their environmental
deviations for both characters. We could then compute the
correlation between the genotypic values of the two characters and
the correlation between the environmental deviations, and so assess
independently the genetic and environmental causes of correlation.
And if, in addition, we knew the breeding values of the individuals, we
could determine also the correlation of breeding values. In principle
there are also correlations between dominance deviations, and between
the various interaction deviations. To deal with all these correlations,
even in theory, would be unmanageably complex, and
fortunately is not necessary, since the practical problems can be quite
adequately dealt with in terms of two correlations. These are the
genetic correlation, which is the correlation of breeding values, and
the environmental correlation, which is not strictly speaking the correlation
of environmental deviations, but the correlation of environmental
deviations together with nonadditive genetic deviations. In
other words, just as the partitioning of the variance of one character
into the two components, additive genetic versus all the rest,
was adequate for many purposes, so now the covariance of two
characters need only be partitioned into these same two components.
The "genetic" and "environmental" correlations thus correspond
to the partitioning of the covariance into the additive genetic
component versus all the rest. The methods of estimating these
two correlations will be explained later. Let us consider first how
they combine together to give the directly observable phenotypic
correlation.
The following symbols will be used throughout this chapter:
X and Y: the two characters under consideration.
$r_P$: the phenotypic correlation between the two characters,
X and Y.
$r_A$: the genetic correlation between X and Y (i.e. the
correlation of breeding values).
$r_E$: the environmental correlation between X and Y
(including nonadditive genetic effects).
cov: the covariance of the two characters X and Y, with
subscripts P, A, or E, having the same meaning as for
the correlations.
$\sigma^2$ and $\sigma$: variance and standard deviation, with subscripts
P, A, or E, as above, and X or Y according to the
character referred to. E.g. $\sigma^2_{PX}$ = phenotypic variance
of character X.
$h^2$: the heritability, with subscript X or Y, according to
the character.
$e^2 = 1 - h^2$.
(The customary symbol for the genetic correlation is $r_G$, but since the
genetic correlation is almost always the correlation of breeding values
we shall use the symbol $r_A$ for the sake of consistency with previous
chapters.)
A correlation, whatever its nature, is the ratio of the appropriate
covariance to the product of the two standard deviations. For
example, the phenotypic correlation is

$$ r_P = \frac{\mathrm{cov}_P}{\sigma_{PX}\,\sigma_{PY}} $$

The phenotypic covariance is the sum of the genetic and environmental
covariances, so we can write the phenotypic correlation as

$$ r_P = \frac{\mathrm{cov}_A + \mathrm{cov}_E}{\sigma_{PX}\,\sigma_{PY}} $$

The denominator can be differently expressed by the following
device: $\sigma^2_A = h^2\sigma^2_P$, and $\sigma^2_E = e^2\sigma^2_P$. So
$\sigma_P = \sigma_A/h = \sigma_E/e$. The phenotypic
correlation then becomes

$$ r_P = h_X h_Y \frac{\mathrm{cov}_A}{\sigma_{AX}\,\sigma_{AY}} + e_X e_Y \frac{\mathrm{cov}_E}{\sigma_{EX}\,\sigma_{EY}} $$

Therefore

$$ r_P = h_X h_Y r_A + e_X e_Y r_E \qquad (19.1) $$
This shows how the genetic and environmental causes of correlation
combine together to give the phenotypic correlation. If both
characters have low heritabilities then the phenotypic correlation is
determined chiefly by the environmental correlation: if they have
high heritabilities then the genetic correlation is the more important.
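
As a small numerical illustration of equation 19.1, the sketch below combines a genetic and an environmental correlation into the phenotypic correlation, and also rearranges the equation to recover the environmental correlation. The function names and the numerical values are hypothetical and are chosen only to show the arithmetic.

```python
import math


def phenotypic_correlation(h2_x: float, h2_y: float, r_a: float, r_e: float) -> float:
    """Equation 19.1: r_P = h_X h_Y r_A + e_X e_Y r_E, with e^2 = 1 - h^2."""
    h_x, h_y = math.sqrt(h2_x), math.sqrt(h2_y)
    e_x, e_y = math.sqrt(1.0 - h2_x), math.sqrt(1.0 - h2_y)
    return h_x * h_y * r_a + e_x * e_y * r_e


def environmental_correlation(h2_x: float, h2_y: float, r_p: float, r_a: float) -> float:
    """Equation 19.1 rearranged for r_E (requires both heritabilities < 1)."""
    h_x, h_y = math.sqrt(h2_x), math.sqrt(h2_y)
    e_x, e_y = math.sqrt(1.0 - h2_x), math.sqrt(1.0 - h2_y)
    return (r_p - h_x * h_y * r_a) / (e_x * e_y)


if __name__ == "__main__":
    # Hypothetical values: both heritabilities 0.25, r_A = 0.8, r_E = 0.2.
    r_p = phenotypic_correlation(0.25, 0.25, r_a=0.8, r_e=0.2)
    print(f"r_P = {r_p:.2f}")  # 0.5*0.5*0.8 + 0.75*0.2 = 0.35
```

With heritabilities as low as these, three-quarters of the weight falls on the environmental correlation, so that the phenotypic correlation lies much nearer to $r_E$ than to $r_A$.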
The genetic and environmental correlations are often very different
in magnitude and sometimes different even in sign, as may be
seen from the examples given in Table 19.1. A difference in sign
between the two correlations shows that genetic and environmental
sources of variation affect the characters through different physiological
mechanisms. The correlations between body-weight and egg-laying
characters in poultry provide striking examples. Pullets that
are larger at 18 weeks from genetic causes reach sexual maturity later
and lay fewer eggs, but the eggs are larger. Pullets that are larger
from environmental causes reach sexual maturity earlier and lay
more eggs, which however are very little different in size.
The dual nature of the phenotypic correlation makes it clear that
the magnitude and even the sign of the genetic correlation cannot be
determined from the phenotypic correlation alone. Let us therefore
consider the methods by which the genetic correlation can be
estimated.
Estimation of the genetic correlation. The estimation of
genetic correlations rests on the resemblance between relatives in a
manner analogous to the estimation of heritabilities described in
Chapter 10. Therefore only the principle and not the details of the
procedure need be described here. Instead of computing the components
of variance of one character from an analysis of variance, we
compute the components of covariance of the two characters from an
analysis of covariance which takes exactly the same form as the analysis
of variance. Instead of starting from the squares of the individual
values and partitioning the sums of squares according to the source
of variation, we start from the product of the values of the two
characters in each individual and partition the sums of products
according to the source of variation. This leads to estimates of the
observational components of covariance, whose interpretation in
Table 19.1
Some Examples of Phenotypic, Genetic, and
Environmental Correlations
The environmental correlations (except those marked *) were
calculated for this table from the genetic correlations and
heritabilities given in the papers cited, by equation 19.1.
They are not purely environmental in causation but include
correlation due to nonadditive genetic causes, as explained
in the text. Those marked * are true environmental correlations,
estimated directly from the phenotypic correlation in
inbred lines and crosses.
                                                          r_P     r_A     r_E
Cattle (Johansson, 1950)
  Milk-yield : butterfat-yield.
  Milk-yield : butterfat %.
  Butterfat-yield : butterfat %.
Pigs (Fredeen and Jonsson, 1957)
  Body length : backfat thickness.
  Growth rate : feed efficiency.
  Backfat thickness : feed efficiency.
Sheep (Morley, 1955)
  Fleece weight : length of wool.
  Fleece weight : crimps per inch.
  Fleece weight : body weight.
Poultry (Dickerson, 1957)
  Body weight (at 18 weeks) : egg-production
    (to 72 weeks of age).
  Body weight (at 18 weeks) : egg weight.
  Body weight (at 18 weeks) : age at first egg.          -.30     .29    -.50
Mice (Falconer, 1954b)
  Body weight : tail length (within litters).             .44     .59     .34
Drosophila melanogaster
  Bristle number, abdominal : sternopleural.              .06     .08     .04
    (Clayton, Knight, Morris, and Robertson, 1957)
  Number of bristles on different abdominal
    segments. (Reeve and Robertson, 1954)                  -      .96     .05
  Thorax length : wing length.
    (Reeve and Robertson, 1953)                            -      .75     .50
•93
•14
•23
•85
•20
•26
.96
•10
•22
•24
.84
•31
•47
•96
•28
•01
•50
.32
.30
•21
.36
•02
 II
117
•10
105
•09
•l6
•18
•l6
•50
•05
terms of causal components of covariance is exactly the same as that
of the components of variance given in Table 10.4. Thus, in an
analysis of half-sib families the component of covariance between
sires estimates $\tfrac{1}{4}\mathrm{cov}_A$, i.e. one quarter of the covariance of breeding
values of the two characters. For the estimation of the correlation
the components of variance of each character are also needed. Thus
the between-sire components of variance estimate $\tfrac{1}{4}\sigma^2_{AX}$ and $\tfrac{1}{4}\sigma^2_{AY}$.
Therefore the genetic correlation is obtained as

$$ r_A = \frac{\mathrm{cov}_{XY}}{\sqrt{\mathrm{var}_X\,\mathrm{var}_Y}} \qquad (19.2) $$

where var and cov refer to the components of variance and covariance.
The offspring-parent relationship can also be used for estimating
the genetic correlation. To estimate the heritability of one character
from the resemblance between offspring and parents we compute
the covariance of offspring and parent for the one character by
taking the product of the parent or mid-parent value and the mean
value of the offspring. To estimate the genetic correlation between
two characters we compute what might be called the "cross-covariance,"
obtained from the product of the value of X in parents
and the value of Y in offspring. This "cross-covariance" is half the
genetic covariance of the two characters, i.e. $\tfrac{1}{2}\mathrm{cov}_A$. The covariances
of offspring and parents for each of the characters separately are also
needed, and then the genetic correlation is given by

$$ r_A = \frac{\mathrm{cov}_{XY}}{\sqrt{\mathrm{cov}_{XX}\,\mathrm{cov}_{YY}}} \qquad (19.3) $$

where $\mathrm{cov}_{XY}$ is the "cross-covariance," and $\mathrm{cov}_{XX}$ and $\mathrm{cov}_{YY}$ are the
offspring-parent covariances of each character separately.
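
The two estimation formulae have the same form: a covariance component divided by the geometric mean of the two corresponding components for the characters taken singly. A minimal sketch follows, assuming one parental value and one offspring mean per family; the function names are introduced here for illustration, and the procedure is a bare outline rather than a full analysis of covariance.

```python
import math
from typing import Sequence


def sample_covariance(a: Sequence[float], b: Sequence[float]) -> float:
    """Unbiased sample covariance of two equally long series."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two series of equal length with at least 2 values")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def genetic_correlation(cov_xy: float, cov_xx: float, cov_yy: float) -> float:
    """r_A = cov_XY / sqrt(cov_XX * cov_YY).

    With offspring-parent covariances this is the offspring-parent formula;
    with between-sire components of covariance and variance it is the
    half-sib formula. Sampling error can give estimates outside -1 to +1.
    """
    return cov_xy / math.sqrt(cov_xx * cov_yy)


def genetic_correlation_offspring_parent(
    parent_x: Sequence[float],
    parent_y: Sequence[float],
    offspring_x: Sequence[float],
    offspring_y: Sequence[float],
) -> float:
    """Estimate r_A from family records (one entry per family in each series)."""
    cross = sample_covariance(parent_x, offspring_y)  # the "cross-covariance"
    cov_xx = sample_covariance(parent_x, offspring_x)
    cov_yy = sample_covariance(parent_y, offspring_y)
    return genetic_correlation(cross, cov_xx, cov_yy)
```

Only the cross-covariance of X in parents with Y in offspring is used here, as in the text; the factors of one half (or one quarter for sire components) cancel between numerator and denominator, so they need not be applied.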
The genetic correlation can also be estimated from responses to
selection in a manner analogous to the estimation of realised heritability.
This will be explained in the next section.
Data that provide estimates of genetic correlations provide also
estimates of the heritabilities of the correlated characters, and of the
phenotypic correlations. The environmental correlation can then be
found from equation 19.1. If highly inbred lines are available the
environmental correlations can be estimated directly from the
phenotypic correlation within the lines, or preferably within the $F_1$'s
of crosses between the lines.
Estimates of genetic correlations are usually subject to rather
large sampling errors and are therefore seldom very precise. The
sampling variance of genetic correlations is treated by Reeve (1955^)
and by A. Robertson (1959b). The standard error of an estimate
is given approximately by the following formula:

$$ \sigma_{(r_A)} \approx \frac{1 - r_A^2}{\sqrt{2}} \sqrt{\frac{\sigma_{(h^2_X)}\,\sigma_{(h^2_Y)}}{h^2_X\,h^2_Y}} $$

where $\sigma$ denotes standard error. Since the standard errors of the two
heritabilities appear in the numerator, an experiment designed to
minimise the sampling variance of an estimate of heritability, in the
manner described in Chapter 10, will also have the optimal design for
the estimation of a genetic correlation.
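
To show the order of magnitude involved, the approximate standard error can be evaluated for a hypothetical case. The sketch assumes the approximation takes the form $(1 - r_A^2)/\sqrt{2}$ multiplied by the square root of the product of the two standard errors of heritability divided by the product of the two heritabilities; the function name and the figures are illustrative only.

```python
import math


def se_genetic_correlation(
    r_a: float, h2_x: float, h2_y: float, se_h2_x: float, se_h2_y: float
) -> float:
    """Approximate standard error of an estimated genetic correlation.

    Assumed form: (1 - r_A^2)/sqrt(2) * sqrt(se(h2_X) se(h2_Y) / (h2_X h2_Y)).
    """
    return (1.0 - r_a ** 2) / math.sqrt(2.0) * math.sqrt(
        (se_h2_x * se_h2_y) / (h2_x * h2_y)
    )


if __name__ == "__main__":
    # Hypothetical experiment: both heritabilities 0.4 with standard error 0.1.
    se = se_genetic_correlation(r_a=0.5, h2_x=0.4, h2_y=0.4, se_h2_x=0.1, se_h2_y=0.1)
    print(f"standard error of r_A: {se:.3f}")
```

Even with heritabilities estimated to within 0.1, a genetic correlation of 0.5 would in this case carry a standard error of about 0.13.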
Correlated Response to Selection
The next problem for consideration concerns the response to
selection: if we select for character X, what will be the change of the
correlated character Y? The expected response of a character, Y,
when selection is applied to another character, X, may be deduced in
the following way. The response of character X, i.e. the character
directly selected, is equivalent to the mean breeding value of the
selected individuals. This was explained in Chapter 11. The consequent
change of character Y is therefore given by the regression of the
breeding value of Y on the breeding value of X. This regression is

$$ b_{(A)YX} = \frac{\mathrm{cov}_A}{\sigma^2_{AX}} = r_A \frac{\sigma_{AY}}{\sigma_{AX}} $$

The response of character X, directly selected, by equation 11.4, is

$$ R_X = i h_X \sigma_{AX} $$

Therefore the correlated response of character Y is

$$ CR_Y = b_{(A)YX} R_X = i h_X \sigma_{AX}\, r_A \frac{\sigma_{AY}}{\sigma_{AX}} = i h_X r_A \sigma_{AY} \qquad (19.4) $$

Or, by putting $\sigma_{AY} = h_Y \sigma_{PY}$, the correlated response becomes

$$ CR_Y = i h_X h_Y r_A \sigma_{PY} \qquad (19.5) $$
Thus the response of a correlated character can be predicted if
the genetic correlation and the heritabilities of the two characters are
known. And, conversely, if the correlated response is measured by
experiment, and the two heritabilities are known, the genetic correlation
can be estimated. If the heritability of character Y is to be
estimated as the realised heritability from the response to selection,
then it is necessary to do a double selection experiment. Character X
is selected in one line and character Y in another. Then both the
direct and the correlated responses of each character can be measured.
This type of experiment provides two estimates of the genetic correlation
(by equation 19.5), one from the correlated response of each
character; and the two estimates should agree if the theory of correlated
responses expressed in equation 19.5 adequately describes the
observed responses (Falconer, 1954b). A joint estimate of the genetic
correlation can be obtained from such double selection experiments,
without the need for estimates of the heritabilities, from the following
formula which may be easily derived from equations 11.4 and 19.4:

$$ r_A^2 = \frac{CR_X}{R_X} \cdot \frac{CR_Y}{R_Y} \qquad (19.6) $$
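
The prediction of a correlated response, and the joint estimate of the genetic correlation from a double selection experiment, can be sketched as follows. The function names and the figures are hypothetical; the joint estimate assumes that the same intensity of selection was applied in the two lines, so that the intensities cancel.

```python
import math


def direct_response(i: float, h2: float, sigma_p: float) -> float:
    """R = i h sigma_A, which equals i h^2 sigma_P since sigma_A = h sigma_P."""
    return i * h2 * sigma_p


def correlated_response(
    i: float, h2_x: float, h2_y: float, r_a: float, sigma_p_y: float
) -> float:
    """Equation 19.5: CR_Y = i h_X h_Y r_A sigma_PY, selection being applied to X."""
    return i * math.sqrt(h2_x) * math.sqrt(h2_y) * r_a * sigma_p_y


def joint_genetic_correlation(r_x: float, cr_x: float, r_y: float, cr_y: float) -> float:
    """Equation 19.6: r_A^2 = (CR_X / R_X)(CR_Y / R_Y).

    The sign of r_A is taken from the ratio CR_Y / R_Y. A negative product
    means the two lines disagree about the sign of the correlation.
    """
    ratio_x = cr_x / r_x
    ratio_y = cr_y / r_y
    product = ratio_x * ratio_y
    if product < 0:
        raise ValueError("the two correlated responses disagree in sign")
    return math.copysign(math.sqrt(product), ratio_y)


if __name__ == "__main__":
    # Hypothetical parameters for two characters X and Y.
    i, h2_x, h2_y, r_a = 1.4, 0.36, 0.25, 0.6
    sigma_p_x, sigma_p_y = 2.0, 3.0
    r_x = direct_response(i, h2_x, sigma_p_x)                   # line selected for X
    r_y = direct_response(i, h2_y, sigma_p_y)                   # line selected for Y
    cr_y = correlated_response(i, h2_x, h2_y, r_a, sigma_p_y)   # Y in the X line
    cr_x = correlated_response(i, h2_y, h2_x, r_a, sigma_p_x)   # X in the Y line
    print(round(joint_genetic_correlation(r_x, cr_x, r_y, cr_y), 3))
```

Feeding the predicted responses back into the joint formula returns the genetic correlation that was assumed, which is a check on the algebra rather than on any data.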
Example 19.1. In a study of wing length and thorax length in Drosophila
melanogaster, Reeve and Robertson (1953) estimated the genetic
correlation between these two measures of body size from the responses to
selection. There were two pairs of selection lines; one pair was selected for
increased and for decreased thorax length, and the other pair for increased
and for decreased wing length. In each line the correlated response of the
character not directly selected was measured, as well as the response of the
character directly selected. Two estimates of the genetic correlation were
obtained by equation 19.6, one from the responses to upward selection and
the other from the responses to downward selection. In addition, estimates
of the genetic correlation in the unselected population were obtained from
the offspring-parent covariance and also from the full-sib covariance. The
four estimates were as follows:

Method                    Genetic correlation
Offspring-parent          0.74
Full sib                  0.75
Selection, upward         0.71
Selection, downward       0.73

The agreement between the estimates from selection and the estimates
from the unselected population shows that the correlated responses were
very close to what would have been predicted from the genetic analysis of
the unselected population.
Close agreement between observed and predicted correlated
responses, such as was shown in the above example, cannot always be
expected, particularly if the genetic correlation is low. With a low
genetic correlation the expected response is small and is liable to be
obscured by random drift (see Clayton, Knight, Morris and Robertson,
1957). Also, if the genetic correlation is to any great extent
caused by linkage, it is likely to diminish in magnitude through
recombination, with a consequent diminution of the correlated
response. There has not yet been enough experimental study of
correlated responses to allow us to draw any conclusions about the
number of generations over which they continue, nor about the total
response when the limit is reached.
Indirect selection. Consideration of correlated responses suggests
that it might sometimes be possible to achieve more rapid progress
under selection for a correlated response than from selection for
the desired character itself. In other words, if we want to improve
character X, we might select for another character, Y, and achieve
progress through the correlated response of character X. We shall
refer to this as "indirect" selection; that is to say, selection applied to
some character other than the one it is desired to improve. And we
shall refer to the character to which selection is applied as the
"secondary" character. The conditions under which indirect selection
would be advantageous are readily deduced. Let $R_X$ be the
direct response of the desired character, if selection were applied
directly to it. And let $CR_X$ be the correlated response of character X
resulting from selection applied to the secondary character, Y. The
merit of indirect selection relative to that of direct selection may then
be expressed as the ratio of the expected responses, $CR_X/R_X$. Taking
the expected correlated response from equation 19.4 and the expected
direct response from equation 11.4, we find

$$ \frac{CR_X}{R_X} = \frac{i_Y h_Y r_A \sigma_{AX}}{i_X h_X \sigma_{AX}} = r_A \frac{i_Y h_Y}{i_X h_X} \qquad (19.7) $$

If the same intensity of selection can be achieved when selecting for
character Y as when selecting for character X, then the correlated
response will be greater than the direct response if $r_A h_Y$ is greater
than $h_X$. Therefore indirect selection cannot be expected to be
superior to direct selection unless the secondary character has a
substantially higher heritability than the desired character, and the
genetic correlation between the two is high; or, unless a substantially
higher intensity of selection can be applied to the secondary than to
the desired character. The circumstances most likely to render
indirect selection superior to direct selection are chiefly concerned
with technical difficulties in applying selection directly to the desired
character. Two such technical difficulties may be mentioned briefly.
1. If the desired character is difficult to measure with precision,
the errors of measurement may so reduce the heritability that indirect
selection becomes advantageous. Threshold characters in general are
likely for this reason to repay a search for a suitable correlated character,
unless the position of the threshold can be adjusted in the manner
described in the last chapter. An interesting experimental result which
may well prove to be an example of indirect selection being superior
to direct selection concerns sex ratio in mice. The sex ratio among
the progeny may be regarded as a metric character of the parents.
Selection applied directly to sex ratio was ineffective in changing it
(Falconer, 1954c), but selection for blood-pH produced a correlated
change of sex ratio (Weir and Clark, 1955; Weir, 1955). The reason
for the ineffectiveness of direct selection is probably that the true sex
ratio of a family is subject to a large error of estimation resulting
from the sampling variation, and the heritability is consequently very
low.
2. If the desired character is measurable in one sex only, but the
secondary character is measurable in both, then a higher intensity of
selection will be possible by indirect selection. Other things being
equal, the intensity of selection would be twice as great by indirect
as by direct selection; but a better plan would be to select one sex
directly for the desired character and the other indirectly for the
secondary character.
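
The relative merit of indirect selection, $CR_X/R_X = r_A\, i_Y h_Y / (i_X h_X)$, is easily evaluated for any proposed secondary character. The sketch below uses hypothetical values; the function name and its parameter names are introduced here for illustration.

```python
import math


def relative_merit_of_indirect_selection(
    r_a: float,
    h2_secondary: float,
    h2_desired: float,
    i_secondary: float = 1.0,
    i_desired: float = 1.0,
) -> float:
    """Ratio CR_X / R_X = r_A * (i_Y h_Y) / (i_X h_X).

    X is the desired character and Y the secondary character. A value
    above 1 means indirect selection is expected to do better than
    direct selection. Only the ratio of the two intensities matters.
    """
    return r_a * (i_secondary * math.sqrt(h2_secondary)) / (
        i_desired * math.sqrt(h2_desired)
    )


if __name__ == "__main__":
    # Hypothetical case 1: secondary character much more heritable.
    print(relative_merit_of_indirect_selection(0.8, h2_secondary=0.64, h2_desired=0.16))
    # Hypothetical case 2: equal heritabilities, so the ratio is r_A itself.
    print(relative_merit_of_indirect_selection(0.8, h2_secondary=0.30, h2_desired=0.30))
    # Hypothetical case 3: equal heritabilities but twice the intensity,
    # as when the secondary character can be measured in both sexes.
    print(relative_merit_of_indirect_selection(0.8, 0.30, 0.30, i_secondary=2.0))
```

In the first case the ratio is 1.6, so indirect selection would be expected to give the faster progress; in the second it is 0.8, and direct selection is to be preferred; in the third the doubled intensity raises the ratio to 1.6 again.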
Though indirect selection has been presented above as an alternative
to direct selection, the most effective method in theory is neither
one nor the other but a combination of the two. The most effective use
that can be made of a correlated character is in combination with the
desired character, as an additional source of information about the
breeding values of individuals. This, however, is a special case of a
more general problem which will be dealt with in the final section of
this chapter. First we shall show how the idea of indirect selection
can be extended to cover selection in different environments.
Genotype-Environment Interaction
The concept of genetic correlation can be applied to the solution
of some problems connected with the interaction of genotype with
environment. The meaning of interaction between genotype and
environment was explained in Chapter 8, where it was discussed as a
source of variation of phenotypic values, which in most analyses is
inseparable from the environmental variance. The chief problem
which it raises and which we are now in a position to discuss concerns
adaptation to local conditions. The existence of genotype-environment
interaction may mean that the best genotype in one environment
is not the best in another environment. It is obvious, for
example, that the breed of cattle with the highest milk-yield in
temperate climates is unlikely also to have the highest yield in tropical
climates. But it is not so obvious whether smaller differences of environmental
conditions also require locally adapted breeds; nor is it
intuitively obvious how much of the improvement made in one
environment will be carried over if the breed is then transferred to
another environment. These matters have an important bearing on
breeding policy. If selection is made under good conditions of feeding
and management on the best farms and experimental stations, will
the improvement achieved be carried over when the later generations
are transferred to poorer conditions? Or would the selection be
better done in the poorer conditions under which the majority of
animals are required to live? The idea of genetic correlation provides
the basis for a solution of these problems in the following way.
A character measured in two different environments is to be
regarded not as one character but as two. The physiological mechanisms
are to some extent different, and consequently the genes required
for high performance are to some extent also different. For
example, growth rate on a low plane of nutrition may be principally
a matter of efficiency of food-utilisation, whereas on a high plane of
nutrition it may be principally a matter of appetite. By regarding
performance in different environments as different characters with
genetic correlation between them we can in principle solve the problems
outlined above from a knowledge of the heritabilities of the
different characters and the genetic correlations between them
(Falconer, 1952). If the genetic correlation is high, then performance
in two different environments represents very nearly the same
character, determined by very nearly the same set of genes. If it is
low, then the characters are to a great extent different, and high
performance requires a different set of genes. Here we shall consider
only two environments, but the idea can be extended to an
indefinite number of different environments (A. Robertson,
Let us consider the problem of the ' 'carryover" of the improve
ment from one environment to another. Let us suppose that we
select for character X — say growth rate on a high plane of nutrition —
and we look for improvement in character Y — say growth rate on a
low plane of nutrition. The improvement of character Y is simply a
correlated response and the expected rate of improvement was given
in equation J9.5 as
CR Y — thjJtyTAVp?
The improvement of performance in an environment different from
the one in which selection was carried out can therefore be predicted
from a knowledge of the heritability of performance in each environ
ment and the genetic correlation between the two performances. We
can also compare the improvement expected by this means with that
expected if we had selected directly for character Y, i.e. for perfor
mance in the environment for which improvement is wanted. This
is simply a comparison of indirect with direct selection, which was
explained in the previous section. The comparison is made from the
ratio of the two expected responses given in equation 19.7, i.e.
$$\frac{CR_Y}{R_Y} = r_A \frac{i_X h_X}{i_Y h_Y}$$
This shows how much we may expect to gain or lose by carrying out
the selection in some environment other than the one in which the
improved population is required to live. If we assume that the in
tensity of selection is not affected by the environment in which the
selection is carried out, then the indirect method will be better if
$r_A h_X$ is greater than $h_Y$, where $h_X$ is the square root of the heritability
in the environment in which selection is made, and $h_Y$ is the square
root of the heritability in the environment in which the population is
required subsequently to live. If the genetic correlation is high, then
the two characters can be regarded as being substantially the same;
and if there are no special circumstances affecting the heritability or
the intensity of selection it will make little difference in which en
vironment the selection is carried out. But if the genetic correlation
is low, then it will be advantageous to carry out the selection in the
environment in which the population is destined to live, unless the
heritability or the intensity of selection in the other environment is
very considerably higher.
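As a rough numerical sketch of this comparison, the following Python computes the correlated response of equation 19.5 and the ratio of indirect to direct response of equation 19.7. The function names and the figures used are hypothetical, chosen only for illustration; heritabilities are entered as $h^2$ and converted to $h$, and the direct response is taken in the usual form $R = i h^2 \sigma_P$.

```python
import math


def correlated_response(i_x, h2_x, h2_y, r_a, sigma_p_y):
    """Expected response in Y when selection is applied to X (equation 19.5)."""
    return i_x * math.sqrt(h2_x) * math.sqrt(h2_y) * r_a * sigma_p_y


def direct_response(i_y, h2_y, sigma_p_y):
    """Expected response in Y under direct selection, R = i * h^2 * sigma_P."""
    return i_y * h2_y * sigma_p_y


def indirect_to_direct_ratio(r_a, i_x, h2_x, i_y, h2_y):
    """Ratio CR_Y / R_Y = r_A * (i_X h_X) / (i_Y h_Y) (equation 19.7)."""
    return r_a * (i_x * math.sqrt(h2_x)) / (i_y * math.sqrt(h2_y))


# Hypothetical figures: equal selection intensity in both environments,
# a higher heritability in the environment of selection (X), and a
# moderate genetic correlation between the two performances.
ratio = indirect_to_direct_ratio(r_a=0.6, i_x=1.4, h2_x=0.49, i_y=1.4, h2_y=0.25)
print(f"CR_Y / R_Y = {ratio:.2f}")
```

With these assumed figures $r_A h_X = 0.42$ is less than $h_Y = 0.5$, so the ratio falls below 1 and selection in the environment in which the population is to live would be expected to do better.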
This is the theoretical basis for dealing with selection in different
environments. So far, however, there has been little experimental
work to substantiate the theory. The results of the experiments that
have been carried out do not appear to be fully in agreement with
theoretical expectations, and this suggests that other factors not yet
understood are probably operating. (See Falconer and Latyszewski,
1952; Falconer, 1952.)
Simultaneous Selection for more than one Character
When selection is applied to the improvement of the economic
value of animals or plants it is generally applied to several characters
simultaneously and not just to one, because economic value depends
on more than one character. For example, the profit made from a
herd of pigs depends on their fertility, mothering ability, growth
rate, efficiency of food-utilisation, and carcass qualities. How, then,
should selection be applied to the component characters in order to
achieve the maximum improvement of economic value? There are
several possible procedures. One might select in turn for each
character singly ("tandem" selection); or one might select for all the
characters at the same time but independently, rejecting all individuals
that fail to come up to a certain standard for each character regardless
of their values for any other of the characters ("independent culling
levels"). It has been shown, however, that the most rapid improve
ment of economic value is expected from selection applied simul
taneously to all the component characters together, appropriate weight
being given to each character according to its relative economic
importance, its heritability and the genetic and phenotypic correlations
between the different characters (Hazel and Lush, 1942;
Hazel, 1943). The practice of selection for economic value is thus a
matter of some complexity. The component characters have to be
combined together into a score, or index, in such a way that selection
applied to the index, as if the index were a single character, will yield
the most rapid possible improvement of economic value. If the
characters are uncorrelated there is no great problem: each character
is weighted by the product of its relative economic value and its
heritability. This is the best that can be done in the absence of
information about the genetic correlations, but if the genetic correla
tions are known the efficiency of the index can be improved. The
following account gives an outline of the principles on which the
construction of a selection index is based. For a fuller account the
reader should consult Lerner (1950) and the original papers of
Fairfield Smith (1936) and Hazel (1943).
For the sake of simplicity we shall consider only two component
characters of economic value, but the conclusions can readily be ex
tended to any number of characters. Let the economic value be
determined by two characters X and Y, and let w be the additional
profit expected from one unit increase of Y relative to that from one
unit increase of X. The aim of selection therefore is to pick out
individuals with the highest values of $(A_X + wA_Y)$, where $A_X$ and $A_Y$
are the breeding values of the two characters X and Y. Let us call
this compound breeding value "merit," with the symbol H, so that
$$H = A_X + wA_Y \tag{19.8}$$
The problem is to find out how the phenotypic values, $P_X$ and $P_Y$, of
the two component characters are to be combined into an index that
gives the best estimate of an individual's merit, H. In Chapter 10 we
saw how the best estimate of the breeding value of an individual for
one character is the regression equation $A = b_{AP}P$, where $b_{AP}$ is the
regression of breeding value on phenotypic value, and is equal to the
heritability (see p. 166). The present problem is essentially the same,
only now we have to use partial regression coefficients. The multiple
regression equation giving the best estimate of merit is
$$H = b_{HX.Y} P_X + b_{HY.X} P_Y \tag{19.9}$$
where $P_X$ and $P_Y$ are phenotypic values measured as deviations from
the population mean. (In this formula, and in those that follow, the
symbol X has the same meaning as $P_X$, i.e. the phenotypic value of
character X; and similarly Y and $P_Y$ both mean the phenotypic value
of character Y. Thus, $b_{HX.Y}$ is the regression of merit on the phenotypic
value of X when the phenotypic value of Y is held constant, and
$b_{HY.X}$ has a similar meaning with X and Y interchanged.) In practice
it is convenient to have the index in a form that requires the manipulation
of only one of the phenotypic values, i.e. in the form
$$I = P_X + W P_Y \tag{19.10}$$
where I is the index by means of which individuals are to be chosen,
and W is a factor by which the phenotypic value of character Y is
to be multiplied. Since the absolute magnitude of the index is of no
importance, but only its relative magnitude in different individuals,
we can work with the phenotypic values as they stand instead of with
deviations from the population mean. And we can put equation
19.9 into the form of equation 19.10 simply by dividing through by
$b_{HX.Y}$. Then W in equation 19.10 is the ratio of the two partial
regression coefficients,
$$W = \frac{b_{HY.X}}{b_{HX.Y}}$$
and our task now is to find a way of expressing W in terms of the
genetic properties of the two characters.
First let us put the partial regression coefficients in terms of the
total regression coefficients. For example,
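$$b_{HY.X} = \frac{b_{HY} - b_{HX} b_{XY}}{1 - r_{XY}^2}$$
Therefore
$$W = \frac{b_{HY.X}}{b_{HX.Y}} = \frac{b_{HY} - b_{HX} b_{XY}}{b_{HX} - b_{HY} b_{YX}}$$
Now let us express these total regressions in terms of covariances and
variances. For example, $b_{HY} = \mathrm{cov}_{HY}/\sigma_Y^2$. After some simplification
the expression reduces to
$$W = \frac{\sigma_X^2\,\mathrm{cov}_{HY} - \mathrm{cov}_{HX}\,\mathrm{cov}_{XY}}{\sigma_Y^2\,\mathrm{cov}_{HX} - \mathrm{cov}_{HY}\,\mathrm{cov}_{XY}} \tag{19.11}$$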
The variances $\sigma_X^2$ and $\sigma_Y^2$ here, and in what follows, are the phenotypic
variances of characters X and Y. The covariances in the above
expression can be expressed in terms of the phenotypic variance and
the heritability of each character and of the phenotypic and genetic
correlations between the two characters, all of which quantities can
be estimated. Take, for example, the covariance of H with X. This
may be written as follows:
$$\begin{aligned}
\mathrm{cov}_{HX} &= \text{covariance of } (A_X + wA_Y) \text{ with } P_X \\
&= \mathrm{cov}(A_X, P_X) + \mathrm{cov}(wA_Y, P_X) \\
&= h_X^2\sigma_X^2 + w r_A h_X \sigma_X h_Y \sigma_Y
\end{aligned}$$
In this way the covariances in equation 19.11 can be expressed as
follows, $\sigma$ and $\sigma^2$ being phenotypic standard deviations and variances
throughout:
$$\begin{aligned}
\mathrm{cov}_{HX} &= h_X^2\sigma_X^2 + w r_A h_X h_Y \sigma_X \sigma_Y \\
\mathrm{cov}_{HY} &= w h_Y^2\sigma_Y^2 + r_A h_X h_Y \sigma_X \sigma_Y \\
\mathrm{cov}_{XY} &= r_P \sigma_X \sigma_Y
\end{aligned} \tag{19.12}$$
The procedure for selection is thus to compute the covariances
given in 19.12, substitute them in 19.11 and use the value of W so
obtained to compute the index of selection I given in 19.10. The
value of the index for each individual then forms the basis of selection.
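The procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the numerical values are hypothetical, and the seven quantities required (w, the two heritabilities, the genetic and phenotypic correlations, and the two phenotypic standard deviations) are supposed to have been estimated already.

```python
import math


def index_covariances(w, h2_x, h2_y, r_a, r_p, sigma_x, sigma_y):
    """Covariances of equations 19.12 from the estimated parameters."""
    h_x, h_y = math.sqrt(h2_x), math.sqrt(h2_y)
    # Term shared by cov_HX and cov_HY: r_A h_X h_Y sigma_X sigma_Y
    shared = r_a * h_x * h_y * sigma_x * sigma_y
    cov_hx = h2_x * sigma_x**2 + w * shared
    cov_hy = w * h2_y * sigma_y**2 + shared
    cov_xy = r_p * sigma_x * sigma_y
    return cov_hx, cov_hy, cov_xy


def index_weight(w, h2_x, h2_y, r_a, r_p, sigma_x, sigma_y):
    """Weighting factor W of equation 19.11."""
    cov_hx, cov_hy, cov_xy = index_covariances(
        w, h2_x, h2_y, r_a, r_p, sigma_x, sigma_y
    )
    numerator = sigma_x**2 * cov_hy - cov_hx * cov_xy
    denominator = sigma_y**2 * cov_hx - cov_hy * cov_xy
    return numerator / denominator


def selection_index(p_x, p_y, big_w):
    """Index I = P_X + W * P_Y of equation 19.10 for one individual."""
    return p_x + big_w * p_y


# Hypothetical parameter values, for illustration only.
W = index_weight(w=2.0, h2_x=0.4, h2_y=0.2, r_a=-0.3, r_p=0.1,
                 sigma_x=3.0, sigma_y=5.0)
# Hypothetical (P_X, P_Y) records for three individuals.
records = [(52.0, 110.0), (47.5, 121.0), (55.0, 98.0)]
ranked = sorted(records, key=lambda r: selection_index(r[0], r[1], W), reverse=True)
print(W, ranked)
```

When both correlations are set to zero the function returns $W = w h_Y^2 / h_X^2$, which agrees with the statement above that uncorrelated characters are weighted by the product of relative economic value and heritability.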
The index as formulated above is applicable only to individual
selection. If family selection is applied then the heritabilities and
correlations that go into the index must be those appropriate to the
family means. Family selection, however, is not greatly improved by
the use of an index, because the family heritabilities of the component
characters are generally fairly high and the mean economic value of a
family in terms of phenotypic values is not very different from its
merit in terms of breeding values. Therefore family selection for
economic value can be applied with little loss of efficiency if the
phenotypic values are weighted only by w, the relative economic im
portance of each component character.
The complexity of selection by means of an index need hardly be
emphasised, especially when the index is extended to cover many
component characters. Even with two characters, estimates of no
fewer than seven quantities are required for the construction of the
index. Since some of these, particularly the genetic correlation,
cannot usually be estimated with any great precision, the index
cannot be regarded as much more than a rough guide to procedure.
But since selection has to be applied to economic value by some
means, it seems better to use a selection index, however imprecise,
than to base selection on a purely arbitrary combination of component
characters.
Use of a secondary character by means of an index. The
selection index described above can readily be adapted to meet the
case where improvement of only one character is sought, the other
character being used merely as an aid to more efficient selection.
The use of a secondary character in this way was mentioned earlier,
in connexion with indirect selection. Let X be the character it is
desired to improve, and Y the secondary character. Then the
relative "economic" value of character Y is zero, and we can substitute
$w = 0$ in the formulae of 19.12. Substitution of the covariances in
equation 19.11 then yields a formula which on simplification reduces
to
$$W = \frac{\sigma_X (r_A h_Y - r_P h_X)}{\sigma_Y (h_X - r_A r_P h_Y)} \tag{19.13}$$
The selection index of equation 19.10 is then used with this value of
W. The value of W in the index may be negative. This will arise if
the phenotypic correlation between the two characters is chiefly
environmental in origin. The secondary character then acts as an
indicator of the environmental deviation rather than of the breeding
value of the desired character (see Rendel, 1954; and Osborne,
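The weighting factor for a secondary character can be checked numerically. The sketch below, with hypothetical function names and figures, computes W both from the simplified formula and by direct substitution of $w = 0$ in 19.12 and 19.11, and then shows a case in which W is negative because the phenotypic correlation is wholly environmental ($r_A = 0$).

```python
import math


def secondary_weight(h2_x, h2_y, r_a, r_p, sigma_x, sigma_y):
    """Simplified W when Y has no economic value of its own (w = 0)."""
    h_x, h_y = math.sqrt(h2_x), math.sqrt(h2_y)
    return (sigma_x * (r_a * h_y - r_p * h_x)) / (sigma_y * (h_x - r_a * r_p * h_y))


def weight_by_substitution(h2_x, h2_y, r_a, r_p, sigma_x, sigma_y):
    """The same W obtained by putting w = 0 in 19.12 and substituting in 19.11."""
    h_x, h_y = math.sqrt(h2_x), math.sqrt(h2_y)
    cov_hx = h2_x * sigma_x**2
    cov_hy = r_a * h_x * h_y * sigma_x * sigma_y
    cov_xy = r_p * sigma_x * sigma_y
    numerator = sigma_x**2 * cov_hy - cov_hx * cov_xy
    denominator = sigma_y**2 * cov_hx - cov_hy * cov_xy
    return numerator / denominator


# Hypothetical figures: the two routes should agree.
args = dict(h2_x=0.25, h2_y=0.64, r_a=0.5, r_p=0.3, sigma_x=2.0, sigma_y=3.0)
print(secondary_weight(**args), weight_by_substitution(**args))

# No genetic correlation, positive phenotypic correlation: W is negative,
# so Y is used to correct X for the environmental deviation they share.
print(secondary_weight(h2_x=0.3, h2_y=0.3, r_a=0.0, r_p=0.5,
                       sigma_x=1.0, sigma_y=1.0))
```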
Genetic correlation and the selection limit. There is one
important consequence of simultaneous selection for several charac
ters to be discussed before we leave the subject. Just as the herit
abilities are expected to change after selection has been applied for
some time, so also are the genetic correlations. If selection has been
applied to two characters simultaneously the genetic correlation
between them is expected eventually to become negative, for the
following reason. Those pleiotropic genes that affect both characters
in the desired direction will be strongly acted on by selection and
brought rapidly toward fixation. They will then contribute little to
the variances or to the covariance of the two characters. The pleio
tropic genes that affect one character favourably and the other ad
versely will, however, be much less strongly influenced by selection
and will remain for longer at intermediate frequencies. Most of the
remaining covariance of the two characters will therefore be due to
these genes, and the resulting genetic correlation will be negative.
The consequence of a negative genetic correlation, whether produced
by selection in this way or present from the beginning, is that the two
characters may each show a heritability that is far from zero, and yet
when selection is applied to them simultaneously neither responds.
We have already discussed, in Chapter 12, what is essentially the
same situation resulting from the combined effects of artificial and
natural selection: a selection limit is reached even though the charac
ter to which artificial selection is applied still shows a substantial
amount of additive genetic variance.
Example 19.2. A practical example from a commercial flock of poultry
is described by Dickerson (1955). Selection for economic value had been
applied for many years, but recent progress in the component characters
was much less than was to be expected from their heritabilities, which were
found to be moderately high. Estimations of the genetic correlations
between the component characters showed that many of these were negative.
To take just one example, the relationships between egg-production
and egg weight were as follows:

                    X (Production)    Y (Weight)
    $h^2$                0.32            0.59

    $r_P = -0.04$    $r_A = -0.39$    $r_E = +0.25$

In spite of the high heritabilities neither character had shown any improvement
over the last 10-15 years. The high negative genetic correlation
would account for this failure to respond, if selection was applied to both
characters simultaneously. It is interesting to note that environmental
variation, unlike genetic variation, affects both characters in the same way
and leads to a positive environmental correlation. The phenotypic cor
relation, which is almost zero, gives no clue to the genetic relationship
between the two characters, and the failure to respond to selection could
not have been predicted from it alone.
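The way in which a negative genetic correlation and a positive environmental correlation can cancel in the phenotypic correlation may be checked with the standard relation $r_P = h_X h_Y r_A + e_X e_Y r_E$, where $e^2 = 1 - h^2$. The short Python sketch below assumes the figures of the example to be $h_X^2 = 0.32$, $h_Y^2 = 0.59$, $r_A = -0.39$ and $r_E = +0.25$; the function name is hypothetical.

```python
import math


def phenotypic_correlation(h2_x, h2_y, r_a, r_e):
    """r_P = h_X h_Y r_A + e_X e_Y r_E, with e^2 = 1 - h^2."""
    h_x, h_y = math.sqrt(h2_x), math.sqrt(h2_y)
    e_x, e_y = math.sqrt(1.0 - h2_x), math.sqrt(1.0 - h2_y)
    return h_x * h_y * r_a + e_x * e_y * r_e


# Figures assumed from the example: egg-production (X) and egg weight (Y).
r_p = phenotypic_correlation(h2_x=0.32, h2_y=0.59, r_a=-0.39, r_e=0.25)
print(f"r_P = {r_p:.3f}")
```

The genetic and environmental contributions are of opposite sign and nearly equal in size, so the phenotypic correlation comes out close to zero.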
A population which has been subjected over a long period to
selection for economic value throws light on the genetic properties to
be expected in natural populations subject to natural selection for
fitness. Fitness is a compound character with many components —
far more than would appear in the most elaborate assessment of
economic value — and so we should expect negative genetic correla
tions between its major components, a conclusion to be developed
further in the next chapter. It is interesting to note, however, that
natural selection takes no account of heritabilities or genetic correla
tions, and is therefore, in theory, less efficient in improving fitness
than artificial selection by means of an index is in improving economic
value.
CHAPTER 20
METRIC CHARACTERS UNDER
NATURAL SELECTION
Throughout the discussion of the genetic properties of metric
characters, which has occupied the major part of the book, very little
attention has been given to the effects of natural selection, and some
thing must now be done to remedy this omission. The absence of
differential viability and fertility was specified as a condition in the
theoretical development of the subject: that is to say, natural selection
was assumed to be absent. Though for many purposes this assump
tion may lead to no serious error, a complete understanding of metric
characters will not be reached until the effects of natural selection
can be brought into the picture. The operation of natural selection
on metric characters has, however, a much wider interest than just
as a complication that may disturb the simple theoretical picture and
the predictions based on it. It is to natural selection that we must look
for an explanation of the genetic properties of metric characters which
hitherto we have accepted with little comment. The genetic pro
perties of a population are the product of natural selection in the past,
together with mutation and random drift. It is by these processes
that we must account for the existence of genetic variability; and it is
chiefly by natural selection that we must account for the fact that
characters differ in their genetic properties, some having propor
tionately more additive variance than others, some showing in
breeding depression while others do not. These, however, are very
wide problems which are still far from solution, and in this con
cluding chapter we can do little more than indicate their nature. Any
discussion of them, moreover, cannot but be controversial; the reader
should therefore understand that the contents of this chapter are to a
large extent matters of personal opinion, and that any conclusions to
which the discussion may lead are open to dispute.
We shall refer throughout to a population that is in genetic equil
ibrium. Being in genetic equilibrium means that the gene frequencies
are not changing, and therefore that the mean values of all metric
characters are constant. (Changes of environmental conditions are
assumed to be so slow as to be negligible.) The population is con
stantly subject to natural selection tending to increase fitness, but
despite the selection the gene frequencies do not change and fitness
does not improve. There can therefore be no additive genetic vari
ance of fitness: in other words, if we could measure fitness itself as a
character we should find that its genetic variance was entirely non
additive. For the purposes of discussion we may regard any natural
population as being in genetic equilibrium, at least approximately,
and also any population that has been subject to artificial selection
consistently over a long period of time, provided that fitness is
defined in terms of both the artificial and the natural selection.
Fitness, crudely defined, is the "character" selected for, whether by
natural selection alone or by artificial and natural selection com
bined.
If a population is in genetic equilibrium it follows that a reduction
of fitness must in principle result from any change in the array of gene
frequencies, apart from any genes that may have no effect on fitness.
Natural selection must therefore be expected to resist any tendency
to change of the gene frequencies, such as must result from artificial
selection applied to any metric character other than fitness itself.
This principle has been called "genetic homeostasis," and its conse
quences have been discussed, by Lerner (1954). Thus if we change
any metric character by artificial selection we must expect a reduction
of fitness as a correlated response. And if we then suspend the arti
ficial selection before any of the variation has been lost by fixation,
we must expect the population mean to revert to its original value.
On the whole, the experience of artificial selection is in general
agreement with this expectation, though under laboratory conditions
the reduction of fitness may not be apparent in the early stages, and
some characters appear to revert very slowly, if at all, toward the
original value. Our domesticated animals and plants are perhaps the
best demonstration of the effects of the principle. The improvements
that have been made by selection in these have clearly been accom
panied by a reduction of fitness for life under natural conditions, and
only the fact that domestic animals and plants do not have to live
under natural conditions has allowed these improvements to be
made. The problems for discussion in this chapter must be seen
against the background of this principle: that the existing array of
gene frequencies, and consequently the existing genetic properties of
the population, represent the best total adjustment to existing con
ditions that is possible with the available genetic variation.
The problem of how natural selection operates on metric charac
ters has two aspects: the relation between any particular metric
character and fitness, and the way in which natural selection operates
on the individual loci concerned with a metric character. This latter
aspect is part of a wider problem which concerns the reasons for the
existence of genetic variation. We shall discuss these two aspects
separately, because any conclusions that may be drawn about the
second will depend on what can be discovered about the first.
Relation of Metric Characters to Fitness
The fitness of an individual is the final outcome of all its develop
mental and physiological processes. The differences between indi
viduals in these processes are seen in variation of the measurable
attributes which can be studied as metric characters. Thus the
variation of each metric character reflects to a greater or lesser degree
the variation of fitness; and the variation of fitness can theoretically
be broken down into variation of metric characters. Let us consider
for example a mammal such as the mouse, because this matter is more
easily discussed in concrete terms. Fitness itself might be broken
down into two or three major components, which could be measured
and studied as metric characters. These might be the total number
of young reared, and some measure of the quality of the young, such
as their weaning weight. The variation of the major components
would account for all the variation of fitness. Each of the major com
ponents might be broken down into other metric characters which
would account for all their variation. Thus the total number of
young weaned depends on the viability of the parent up to breeding
age, its mating ability, average litter size, frequency of litters, and
longevity. These characters in turn might be further broken down.
For example, litter size depends on the number of eggs shed and the
proportion that are brought to term. The number of eggs shed
depends, again, on body size and endocrine activity, among other
things. Thus each metric character has its place in one of a series of
chains of causation converging toward fitness. And these chains of
causation interconnect one with another: body size, for example,
influences not only litter size, but also lactation, longevity, and prob
ably many other characters. The relationship between any particular
metric character and fitness is thus a very complicated matter. The
following discussion of the problem is based largely on the ideas put
forward by A. Robertson (1955b).
The way in which natural selection operates on a character
depends on the part played by the character in the causation of differ
ences of fitness: that is to say, on the manner and degree by which
differences of value of the metric character cause differences of fitness.
This we shall refer to as the "functional relationship" between the
character and fitness. The functional relationship expresses the
mode of operation of natural selection on the metric character; but it
is not necessarily also the relationship that would be revealed if we
could measure the fitness of individuals and compare their fitness with
their values for the metric character. This point, however, will be
more easily explained by an example to be given in a moment.
Different characters must be expected to have different functional
relationships with fitness, according to the nature of the character.
In explanation of the kinds of relationship that may be envisaged let
us take some examples of different sorts of character at different
positions in the chain of causation.
1. Neutral characters. There may be some characters that have
no functional relationship at all with fitness. This does not mean
that, like vestigial organs, they have no function or use. It means that
the variation in the character is not a cause of variation of fitness.
Abdominal bristle number in Drosophila may be taken as an example
of a character which is probably not far from this state, and two
reasons can be given for regarding it thus. First, it is difficult to
conceive of any biological reason why it should be important to have
18 bristles, or thereabouts, on each segment rather than more or
fewer. And second, if we change the bristle number by artificial
selection and then suspend the selection, the mean bristle number
does not return to its original value — or returns only very slowly —
under the influence of natural selection, even though it could be
brought back rapidly by artificial selection (Clayton, Morris, and
Robertson, 1957). In other words, genetic homeostasis in respect of
bristle number is weak or nonexistent. Such a metric character may
be termed "neutral" with respect to fitness. The mean value of a
neutral character in the population has little or nothing to do with the
character itself, but is the outcome of the pleiotropic effects of the
genes whose frequencies are controlled by their effects on other
characters. Though a neutral character has no functional relationship
with fitness, we may nevertheless find that individuals with different
values do in fact differ in fitness in a regular way. If the genetic
variance of the character is predominantly additive then individuals
with intermediate values will tend on the whole to be heterozygous at
more loci than individuals with extreme values. Then if hetero
zygotes were superior in fitness for some other reason, unconnected
with the character in question, this would result in intermediates
being superior in fitness. At the level of observation there would be
a relationship between values of the character and fitness, but this
would not be a functional relationship because the values of the
character are not the cause of the differences of fitness. The differ
ences of fitness are the result of the functional relationships of other
characters affected by the pleiotropic action of the genes.
2. Characters with intermediate optima. There are some
characters for which an intermediate value is optimal for functional
reasons. One might distinguish three sorts of intermediate opti
mum according to the reasons for intermediates being superior in
fitness.
(i) Optima determined by the character itself. As an example we
might take any character that measures the thermal insulation of a
mammalian coat. Too dense a coat would be disadvantageous and so
would too sparse a coat. An intermediate density would confer the
highest fitness as a consequence of the function of the coat in thermo
regulation. For such a character the mean value in the population is
the optimal value, provided there are no complications of the sort to
be considered later. Though irrefutable biological reasons might be
given for supposing that a character such as the density of fur has an
intermediate optimal value, we might nevertheless find that over the
range of variation covered by the population there was very little
variation in fitness. In practice therefore one could not expect always
to draw a clear line between this sort of character and a neutral
character such as we have taken bristle number to be.
(ii) Optima imposed by the environment. As an example we may
take the clutch size of birds. It has been shown, particularly for the
European robin and swift, that a larger number of young are reared
from nests containing the average number of eggs than from nests
with larger or smaller clutches (Lack, 1954). Thus individuals with
intermediate values appear to be the fittest. If a character such as
this has an optimal value that is intermediate there must obviously
be some other factor interacting with it to determine fitness; for,
otherwise, the individuals that lay more eggs must inevitably be the
fitter. The other factor in this case is the supply of insects for feeding
the young and the length of daylight available for their capture.
With characters of this sort natural selection tends to eliminate indi
viduals with extreme values and favours individuals with intermediate
values. The mean value in the population is the optimal value under
the environmental conditions to which the population is subjected.
If the environment were to change, the population mean would
change too in adjustment to the new optimum. In the case of clutch
size it is noteworthy that the mean value varies with the latitude,
being larger in the north than in the south.
(iii) Optima imposed by a correlated character. Body size in mice
may be taken as an example. Larger mice have larger litters and,
under laboratory conditions, they rear more young. Therefore if
there were no other factor involved, larger mice would be fitter. Since
body size can, as we have seen, be readily increased by artificial
selection, there must be some other factor that prevents its being
increased by natural selection in the wild. The other factor in this case
is probably not environmental, but another character negatively
correlated with size, namely wildness. A change of body size under
artificial selection is always accompanied by a correlated change of
wildness. Large mice are phlegmatic and unreactive to disturbance,
whereas small mice are alert and react energetically to disturbance
(MacArthur, 1949; Falconer, 1953). Therefore under natural con
ditions larger mice would more readily fall prey to cats and owls than
small mice, and the advantage of greater fertility would be offset by
the disadvantage of being less well fitted to escape predators. The
body size of wild mice, it may be suggested, represents the best
compromise between these two correlated characters. If we could
measure the relationship between size and fitness in wild mice we
should find that those of intermediate size were fittest. With charac
ters of this sort also, the population mean represents the optimal
value. But this value is optimal not because of this character itself but
because of its genetic correlations with other characters. Large mice
are selected against not because they are large but because, being
large, they are inevitably also less wild. This example brings us to the
point mentioned at the end of the last chapter: that we must expect to
find negative genetic correlations between characters under simul
taneous selection. In this case we find a negative genetic correlation
between large size and wildness, both of which may reasonably be
supposed to be favoured by natural selection. These two characters
are "components" of fitness in the same way that characters of economic
importance are components of total economic merit. What
natural selection "aims at" is to increase both characters indefinitely,
but the physiological connexions between them, which we see as a
negative genetic correlation, limit the increase that is possible with
the existing genetic variability.
3. Major components of fitness. If we could measure fitness
itself — which is technically very difficult — we should obviously find
no "optimal" value; the individuals most favoured by natural
selection would not be those nearest to the population mean, but the
most extreme. In spite of the selection toward higher values the
mean fails to change under natural selection because there is no
additive variance of fitness. If we measure as a metric character
something that is a major component of fitness, in the sense that it
accounts for a large part of the variation of fitness, we should probably
find the same sort of relationship. Fitness would increase as the
value of the character increased. At the very highest values, however,
fitness would probably decline again slightly. Egg-laying in Drosophila
might well be such a character, even if measured only over a
few days, since the daily egg production is highly correlated with the
total production (Gowen, 1952). We should almost certainly find
that the fittest individuals were not those that laid an intermediate
number of eggs, but those that laid almost the most. The most ex
treme individuals would probably be slightly less fit because of some
environmental limitation or some correlated character, perhaps
longevity. There must be many characters whose relationships with
fitness fall between this and the previous type, characters with an
optimal value above the population mean but yet below that of the
most extreme individuals.
The foregoing discussion will be enough to explain the nature of
the problem of the relationship between a metric character and fitness
and to indicate the sort of solution that may be sought. Let us turn
now to the connexion between the relationship with fitness and the
nature of the genetic variation of a metric character. When we first
discussed the heritability as a property of a character in Chapter 10,
we noted a tendency toward lower heritabilities among characters
more closely connected with fitness. But the precise meaning of a
"close connexion" with fitness was not explained. It may now be
suggested that the meaning of a close connexion with fitness may
perhaps be seen in the functional relationships discussed above.
Characters with the closest connexion are of the third type where the
population mean is not at an optimal value; characters with a less
close connexion are nearer to the second type; while characters with
the least connexion are the neutral or nearly neutral characters. On
the whole it does seem that characters with high heritabilities are to
be found among the first type and characters with low heritabilities
among the third. Differences of heritability are, however, not really
relevant here. It is the genetic variance with which we are concerned;
and the differences in the proportion of the genotypic variance that
is additive, that we want to account for. But so little is known about
how the genotypic variance is partitioned into additive and non-additive
components that we can scarcely begin to tackle the problem.
Four characters of Drosophila, however, seem to fit the picture
fairly well (see Table 8.2). For bristle number, which we have taken
as a neutral character of the first type, 85 per cent of the genotypic
variance is additive. Thorax length, which might perhaps be of the
second type, has about the same proportion. For ovary size, however,
only 43 per cent of the genotypic variance is additive, and this
character might well be between the second and third types. For
egg laying, which we have taken to be of the third type, the proportion
is 29 per cent. These comparisons, of course, cannot be given
much weight because in fact we know almost nothing of the functional
relations of the characters with fitness. But they do suggest
that the solution of the problem of why characters differ in their
genetic properties may lie along these lines. The reaction of a character
to inbreeding seems also to be connected with the proportion of
non-additive genetic variance, those with most non-additive variance
being those that suffer the greatest inbreeding depression. Some,
perhaps most, of the non-additive variance must be attributed to
dominance. Reasons for expecting the effects of genes on characters
closely connected with fitness to show dominance, while the effects on
characters not closely connected with fitness do not, have been put
forward by A. Robertson (1955b); but it would take too much space
here to summarise the argument. There we must leave the problem
of the nature of the genetic variance and pass on to the second aspect
of the operation of natural selection on metric characters.
Maintenance of Genetic Variation
The second aspect of the operation of natural selection on metric
characters — its effects on the individual loci — is part of a wider
problem, which concerns the mechanisms by which genetic variation
is maintained. Almost every metric character, of the many that have
been studied both in natural populations and in domesticated animals
and plants, exhibits genetic variation. What are the reasons for the
existence of this genetic variation? The coexistence in a population
of different alleles at a locus is governed by the three processes of
mutation, random drift, and selection. Allelic differences originate
by mutation and are extinguished by random drift, since no natural
population is infinite in size. Natural selection may tend to eliminate
the differences by favouring one allele over all others at a locus; or it
may tend to perpetuate the differences by favouring heterozygotes.
Let us discuss the role of natural selection first and the roles of mutation
and random drift later.
Effects of selection on individual loci. The way in which
selection operates on any locus depends on the effects that the different
alleles have on fitness itself, and not simply on their effects on one
particular metric character. Therefore the functional relations between
characters and fitness, which were discussed above, can indicate
the action of selection only on those loci which affect fitness
through the character in question and not through any pleiotropic
effects on other characters. Let us consider the three types of
character in turn.
1. Neutral characters. If there are genes whose only effects are
on a neutral character, then selection plays no part in the existence of
allelic differences at these loci. The gene frequencies at these loci
must be controlled solely by mutation and random drift.
2. Characters with intermediate optima. The consequences of
selection favouring individuals of intermediate value have been
examined from different aspects by Wright (1935a, b), Haldane
(1954a), and by A. Robertson (1956) who reaches the following
conclusions. If the intermediate optimum is the result of the functional
relations of the character to fitness, and the optimum is determined
by the character itself or by the environment, then selection
will tend toward fixation at all the loci whose only influence on fitness
is through the character in question. This would apply to characters
of type 2 (i) and (ii) described above and exemplified by the density
of mammalian fur and by clutch size in birds. Selection will thus
tend to eliminate rather than to conserve variability arising from loci
which affect fitness only through such characters. The rate at which
the gene frequencies are expected to change toward fixation is very
slow, and so the rate at which variation would be eliminated is also
very slow; but on an evolutionary timescale it would not be negligible.
Characters of type 2 (iii), where an intermediate optimum is determined
by a correlated character, have not yet been investigated in
this connexion, and the mode of operation of selection on loci that
affect them is not known.
3. Major components of fitness. The essential feature of a major
component of fitness is that the population mean is not at the optimum.
But we cannot deduce, from this fact alone, how selection
operates on the individual loci. If the genes that affect these characters
are at intermediate frequencies, it seems most probable that they
are held there by selection favouring heterozygotes, because it seems
hardly possible that the coefficients of selection are small enough to
allow mutation alone to maintain intermediate frequencies. We do not
know, however, whether these genes are at intermediate frequencies.
It seems quite possible that a considerable portion of the genetic
variation of these characters is due to genes at very low frequencies,
where they are maintained by the balance between mutation and
selection against the recessive homozygotes. Much evidence, however,
has been presented by Lerner (1954) in support of the view that
heterozygotes in general are superior in fitness; and Haldane (1954b)
has pointed out that a general superiority of heterozygotes is a very
reasonable expectation from biochemical considerations of gene
action. Though the matter is not yet settled, the weight of evidence
at present seems to point to superior fitness of heterozygotes, and
consequently to natural selection favouring heterozygotes at most of
the loci that affect fitness through its major components.
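
The contrast between the two possibilities may be made concrete with a little arithmetic. The following Python sketch is offered only as an illustration under stated assumptions, and forms no part of the evidence discussed above. It assumes the standard single-locus results: that with fitnesses of 1 - s1, 1, and 1 - s2 for the genotypes A1A1, A1A2, and A2A2 the frequency of A2 comes to equilibrium at s1/(s1 + s2); and that a fully recessive deleterious gene, arising by mutation at a rate u and selected against with coefficient s, is held at a frequency of about sqrt(u/s). The function names and the numerical values are hypothetical, chosen only for illustration.

```python
import math


def overdominant_equilibrium(s1, s2):
    """Equilibrium frequency of A2 when heterozygotes are favoured.

    Fitnesses are assumed to be 1 - s1 (A1A1), 1 (A1A2), 1 - s2 (A2A2).
    """
    return s1 / (s1 + s2)


def recessive_mutation_balance(u, s):
    """Approximate equilibrium frequency of a fully recessive gene held
    by the balance between mutation (rate u) and selection (coefficient s)."""
    return math.sqrt(u / s)


def next_generation(q, s1, s2):
    """Frequency of A2 after one generation of selection with random mating."""
    p = 1.0 - q
    mean_fitness = 1.0 - s1 * p * p - s2 * q * q
    return q * (p + q * (1.0 - s2)) / mean_fitness


def iterate(q, s1, s2, generations):
    """Apply selection repeatedly, starting from gene frequency q."""
    for _ in range(generations):
        q = next_generation(q, s1, s2)
    return q


if __name__ == "__main__":
    # Hypothetical coefficients of selection and mutation rate.
    print(overdominant_equilibrium(0.1, 0.3))
    print(iterate(0.01, 0.1, 0.3, 2000))
    print(recessive_mutation_balance(1e-5, 0.1))
```

With coefficients of selection of this size the locus with superior heterozygotes is held at an intermediate frequency from whatever frequency it starts, whereas the recessive gene is held at a frequency of about one per cent.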
There are three other ways in which selection may influence
genetic variability, to be discussed before we leave the subject. They
are all subsidiary to the main effects on gene frequencies which we
have been discussing; they may modify these main effects, but they
do not in themselves provide a sufficient description of the operation
of natural selection.
Variable selection. If characters have optimal values these
optima are likely to vary from time to time and from season to season
according to the environmental conditions. The selection pressures
on the individual genes are therefore likely to change from generation
to generation. The consequence of variable selection coefficients has
been shown (Kimura, 1954) to be a tendency toward fixation (or,
more strictly, near-fixation), the favoured allele being the one that
gives the highest average fitness. In this aspect selection would
therefore tend to eliminate variability. The optimal values are
likely to vary also from place to place within each generation, especially
if different genotypes choose different environments in which to
live, as Waddington (1957) suggests. This form of variable selection
has been shown to be capable under certain conditions of maintaining
stable polymorphism, as was mentioned in Chapter 2. Its effect on
the variation of metric characters, however, has not been examined.
It does not seem likely to be very great.
Balanced linkage. Mather's theory of "polygenic balance" is
based on the idea of selection favouring intermediate values of metric
characters and the effect this is likely to have on linkage (see, for
example, Mather, 1949, 1953b). In considering linkage between the
loci affecting a metric character we have to take account of the linkage
phase. We may say that two genes on the same chromosome are in
coupling if they affect the character in the same direction, and in
repulsion if they affect it in opposite directions. The two phases will
be represented in equal frequencies in a random-breeding population
subject to no selection, as was shown in Chapter 1. Now, chromosomes
carrying genes in coupling will contribute more to the variation
than chromosomes carrying genes in repulsion. And individuals with
intermediate values will tend on the whole to carry repulsion chromosomes
rather than coupling chromosomes. Therefore, if intermediates
are favoured for functional reasons, selection will favour repulsion
chromosomes and thus tend to build up "balanced" combinations of
genes: that is, combinations in predominantly repulsion linkage,
which contribute the minimal amount of variance. In this way,
according to Mather, "potential" genetic variability is stored in latent
form, and a compromise is reached between the conflicting needs of
uniformity in adaptation to present circumstances and flexibility in
adaptation to changing circumstances.
If, however, this supposed tendency of selection to build up
balanced combinations is to have any significant effect on genetic
variability it is necessary that the selection should be strong enough
to maintain the balanced combinations in the face of recombination
which must tend continuously to reduce them to a random arrangement.
The selective forces required have been examined by Wright
(1952b). It is clear, without going into the details, that coefficients of
selection of the same order of magnitude as the recombination frequencies
would be required. The balancing of linkage by natural
selection therefore seems from Wright's reasoning to be relevant only
to very short segments of chromosome. Loci with more than about 1
per cent recombination between them would not be expected to
depart significantly from a random arrangement, unless they carried
major genes with large effects on the character. Furthermore, if we
consider a number of loci on the same chromosome, it is not clear
how much difference of variance would be expected between fully
balanced and fully random arrangements; it might well be very little.
Experimental evidence on the matter is scanty. In two experiments,
one with mice and the other with Drosophila, where artificial selection
was applied for and against intermediates, no changes of variance
were detected (Falconer and Robertson, 1956; Falconer, 1957b).
Intensification of the selection against extremes therefore does not
seem to have any effect on the variance within the timespan of a
laboratory experiment.
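
The argument about coupling and repulsion can be illustrated for the simplest case of two loci. The Python sketch below is a deliberately simplified illustration, and is not Mather's or Wright's treatment. It assumes two linked loci of equal additive effect with gene frequencies of one half, so that coupling chromosomes have values of +2 or -2 units and repulsion chromosomes a value of zero. It assumes also the standard result that, in the absence of selection, a departure from the random arrangement declines by the fraction r, the recombination frequency, in each generation. The function names and the values used are hypothetical.

```python
def chromosome_variance(coupling_fraction, effect=1.0):
    """Variance of chromosome values for two loci of equal effect.

    coupling_fraction is the proportion of chromosomes in coupling phase
    (half '+ +' and half '- -'); the remainder are in repulsion ('+ -').
    """
    repulsion_fraction = 1.0 - coupling_fraction
    chromosomes = [
        (+2.0 * effect, coupling_fraction / 2.0),  # + + coupling
        (-2.0 * effect, coupling_fraction / 2.0),  # - - coupling
        (0.0, repulsion_fraction),                 # + - and - + repulsion
    ]
    mean = sum(value * freq for value, freq in chromosomes)
    return sum(freq * (value - mean) ** 2 for value, freq in chromosomes)


def remaining_excess(initial_excess, recombination, generations):
    """Excess of repulsion chromosomes left after a number of generations
    of random mating with no selection."""
    return initial_excess * (1.0 - recombination) ** generations


if __name__ == "__main__":
    for fraction in (0.0, 0.5, 1.0):
        # Fully balanced, random, and fully coupled arrangements.
        print(fraction, chromosome_variance(fraction))
    # Proportion of an initial excess surviving 100 generations.
    print(remaining_excess(1.0, 0.01, 100))
    print(remaining_excess(1.0, 0.10, 100))
```

Under these assumptions the fully balanced arrangement contributes no variance and the random arrangement contributes half of what the fully coupled arrangement would. With 1 per cent recombination rather more than a third of an initial excess of repulsion chromosomes survives 100 generations without selection; with 10 per cent recombination practically none does.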
Canalisation. Waddington's theory of "canalisation" is concerned
with the developmental pathways through which the phenotypic
values come to their expression (see Waddington, 1957). If
intermediates are favoured because of their values of the metric
character in question, then deviation from the optimal value is
disadvantageous. Selection will therefore operate against the causes of
deviation, and will tend to produce a greater stability so that development
is canalised along the path that leads to the optimal phenotypic
expression. The role ascribed to selection is its discrimination against
alleles that increase variability. These may be at loci that affect the
character in question or at other loci. Variation both of environmental
and of genetic origin may be reduced in this way. The
genetic variation is reduced not by eliminating the segregation, but by
rendering the organism less sensitive to the effects of the segregation.
A change in the proportion of genetic to environmental variation is
therefore not necessarily to be expected. As a consequence of canalisation
we should expect to find some characters less variable than
others, the less variable being those for which deviation from the
optimum has the more serious effect on fitness. This expected consequence
of canalisation, however, cannot easily be tested experimentally,
because, as Waddington (1957) points out, it is difficult to
find a logical basis for comparing the variability of different characters.
Origin of variation by mutation. Before the reasons for the
existence of genetic variability can be fully understood it will be
necessary to know what part mutation plays in restoring what is lost
by random drift or by selection. If there were no selection of any
kind then the amount of genetic variation would come to equilibrium
when its rate of origin was equal to its rate of extinction by random
drift. The rate of extinction presents no very serious problem because
we need know the population size only approximately. If,
therefore, we knew the rate of origin by mutation we could decide
whether a significant amount of the existing variation can be ascribed
to mutation. Very little, however, is known about the rate of origin
by mutation. The only evidence comes from two studies of Drosophila
by Clayton and Robertson (1955) and Paxman (1957), which yielded
very similar results. The following discussion is based on the experiment
of Clayton and Robertson. Selection for abdominal bristle
number was applied to an inbred line derived from the same base
population on which the other studies of this character were made.
From the rate of response to selection it was concluded that the average
amount of variation arising by spontaneous mutation in one
generation amounted to one thousandth part of the genetic variation
present in the base population. In other words it would take about
1000 generations for mutation to restore the genetic variation to its
original level. (We may note in passing that this proves mutation to
have a negligible influence on the response of non-inbred populations
to artificial selection, apart from the rare occurrence of mutants with
major effects.) Now consider the loss of variance due to random drift
in a population of effective size $N_e$, subject to no selection. If all the
genetic variance is additive, as it very nearly is in the case of bristle
number, then the rate of loss per generation is equal to the rate of
inbreeding, which is $1/(2N_e)$. (This follows from the reasoning given
in Chapter 15, where the variance within a line was shown to be
$(1 - F)$ times the original variance.) Therefore the new variation
arising by mutation at the rate found in this experiment would be lost
at the same rate, if the rate of inbreeding were 1/1,000: that is, in a
population of effective size 500. The base population was roughly
ten times this size and therefore the expected rate of extinction by
random drift is less than the observed rate of origin by mutation. In
other words, mutation alone seems to be capable of accounting for
more variation of bristle number than was actually present in the
base population. Therefore selection favouring heterozygotes does
not seem to have been an important cause of the genetic variability of
bristle number. This suggests that little of the variation of bristle
number is due to the pleiotropic effects of genes that affect the major
components of fitness. It suggests, in other words, that much of the
variation of bristle number is due to genes that are not far from being
neutral with respect to fitness. This conclusion, though only tentative,
is in line with the fact, mentioned earlier, that bristle number
shows little tendency to revert to the original mean value when
artificial selection is relaxed. The conclusions to which the results of
this experiment point cannot yet be extended to other characters.
Characters more closely connected with fitness, when they have been
studied from this point of view, may present a very different picture.
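
The arithmetic of this comparison may be set out as follows. The Python sketch below simply restates the figures quoted above. The function names are hypothetical, and the final step, which treats the variance as receiving a constant gain from mutation and losing by drift the fraction 1/(2Ne) of what is present, is a simplifying assumption added here for illustration only.

```python
def drift_loss_rate(effective_size):
    """Fraction of the additive variance lost per generation by random
    drift, taken as equal to the rate of inbreeding, 1/(2*Ne)."""
    return 1.0 / (2.0 * effective_size)


def balancing_effective_size(mutational_gain):
    """Effective size at which the loss by drift equals the gain by mutation.

    mutational_gain is the new variance arising per generation, expressed
    as a fraction of the variance present in the base population.
    """
    return 1.0 / (2.0 * mutational_gain)


def equilibrium_variance(mutational_gain, effective_size):
    """Variance, in multiples of the base-population variance, at which
    the gain by mutation equals the loss by drift (assumed simple model)."""
    return 2.0 * effective_size * mutational_gain


if __name__ == "__main__":
    gain = 1.0 / 1000.0                      # figure quoted in the text
    ne_balance = balancing_effective_size(gain)
    ne_base = 10.0 * ne_balance              # "roughly ten times this size"
    print(ne_balance)
    print(drift_loss_rate(ne_base))
    print(equilibrium_variance(gain, ne_base))
```

On these figures the loss by drift in the base population is about one ten-thousandth part per generation, a tenth of the gain by mutation, so that under the assumed model mutation and drift alone would come to balance at roughly ten times the variance actually present.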
Evolutionary significance of variability. There can be little
doubt that the existence of genetic variation is advantageous to the
evolutionary survival of a species, the advantage it confers being the
ability to evolve rapidly and so to meet the needs of a changing
environment, both through the course of time and in the colonisation
of new localities. Sexual reproduction and outbreeding are necessary
conditions for the continued existence of genetic variation and it is
noteworthy that the naturally inbreeding species among the higher
plants are of comparatively recent origin. This suggests that the
possession of genetic variability is necessary for the continued existence
of a species over a long period of time; or in other words, that the
prevalence of genetic variability among existing species is because
those without it have not survived. The inbreeding plants, however,
as we see them at present, compete successfully with the outbreeding
species, and this proves that the possession of genetic variability does
not confer much immediate advantage. The evolutionary significance
of genetic variability, however, throws no light on the mechanisms
that maintain it. It is these mechanisms, which have been discussed
in this chapter, that are the concern of quantitative genetics.
The Genes concerned with Quantitative Variation
The genetic variation of metric characters appears from the results
of experimental selection to be the product of segregation at
some hundreds of loci, or more probably some thousands if the
variation of all characters is included. So natural populations probably
carry a variety of alleles at a considerable proportion of loci, even
perhaps at virtually every locus. It seems unreasonable, therefore, to
think of genes having the control of a metric character as their
specific function: we cannot reasonably suppose that there are genes
whose only functions are the adjustment of, say, body size to an
optimal value. How, then, are we to think of the genes with which we
are concerned in quantitative genetics? Our knowledge of these
genes may be briefly summarised as follows.
The distinction between "major" and "minor" genes marks the
difference between those which we can study individually, and whose
properties are therefore fairly easily discovered, and those which we
cannot study individually and whose properties can only be deduced
by indirect means. Both are concerned with quantitative variation.
Among the major genes two sorts may be distinguished. There are
genes with more or less severely deleterious effects on fitness, and
these include nearly all the "mutants" of Mendelian genetics, as well
as lethals. Each may have pleiotropic effects on a variety of metric
characters. They are recessive, or nearly so, in their effects on fitness,
but not necessarily also in their effects on metric characters. They are
kept in equilibrium at low frequencies by natural selection balanced
against mutation. Being at low frequencies they contribute, individually,
little to the genetic variance of any character; their total contribution,
however, is unknown. They are probably an important cause
of inbreeding depression. Major genes of the second sort are those
responsible for the antigenic differences. The alleles at these loci are
at intermediate frequencies where they are probably maintained by
selection favouring heterozygotes. Their effects on fitness, however,
are probably fairly small, certainly small enough for all to be
regarded as "wild-type" alleles. Their effects on metric characters
are almost unexplored, and their importance as sources of variation is
consequently unknown. They presumably contribute to inbreeding
depression if heterozygotes are superior in fitness, but again their
relative importance in this respect is not known with certainty.
About the minor genes little is known. They do not necessarily
occupy loci different from those occupied by major genes. It seems
more likely, on the contrary, that they are isoalleles, capable of
mutating to major deleterious genes. They are performing their
primary functions perfectly adequately and may differ only in the rate
at which their primary product is synthesised. The variation of
metric characters which they produce may be quite incidental to their
main biochemical functions. There is no reason at present to think
that these minor genes differ in any essential way from the genes that
determine antigenic differences. The fact that their effects are not
individually recognisable, whereas the antigenic differences are, may
be due only to the inadequacy of the techniques available for detecting
biochemical differences among essentially normal individuals.
The problems that have been raised but left unanswered in this
chapter will be sufficient indication of the directions which the future
development of quantitative genetics may take. It does not seem to
the present writer that much progress toward their solution is likely
to be made by deductive reasoning, because most of the outstanding
problems are not essentially theoretical in nature: the theoretical
structure of the subject is now fairly clear, at least in its main outlines.
Some of the outstanding problems are beyond the reach of the
experimental techniques now at our command. New techniques, both
more penetrating and more discriminating, will therefore be needed.
Other problems arise from the paucity of experimental data and the
consequent difficulty of deciding what phenomena are general and
what are due to special circumstances. These problems will be solved
not so much by deliberately designed experiments, but rather from
the accumulated experience of experiments extended to a wider
variety of characters and of organisms.
GLOSSARY OF SYMBOLS
This list gives the meanings of most of the symbols used in the book.
Many of the symbols listed are used also with other meanings in certain
places, but these meanings, as well as the symbols not listed, do not appear
more than a page or two removed from their definition. The more important
differences from current usage are indicated where the equivalent
symbols used by Lerner (1950), denoted by (L), and by Mather (1949),
denoted by (M), are given.
$A_1$, $A_2$   Allelomorphic genes.
A    Breeding value. = G (L).
a    Genotypic value of the homozygote $A_1A_1$, as deviation from the
     mid-homozygote value. = d (M).
$\alpha$    Average effect of a gene-substitution.
$\alpha_1$, $\alpha_2$   Average effects of the alleles $A_1$ and $A_2$ respectively.
b    Regression coefficient; e.g. $b_{OP}$ = regression of offspring on parent.
CR   Correlated response to selection.
D    Dominance deviation.
d    Genotypic value of the heterozygote $A_1A_2$, as deviation from the
     mid-homozygote value. = h (M).
$\Delta$    Change of, as $\Delta q$ = change of gene frequency, $\Delta F$ = rate of
     inbreeding.
E    Environmental deviation.
$E_c$   Common environment; i.e. environmental deviation of family mean
     from population mean. = C (L).
$E_w$   Within-family environment; i.e. environmental deviation of
     individual from family mean. = E' (L).
F    Coefficient of inbreeding.
$F_1$   First generation of cross between lines or populations.
$F_2$   Second generation of cross, by random mating among $F_1$.
FS   Full sibs.
f    Coancestry; i.e. inbreeding coefficient of the progeny of the
     individuals concerned.
f    (Chap. 13): Subscript referring to selection between families.
G    Genotypic value. = Ge (L).
H    Frequency of heterozygous genotype ($A_1A_2$).
H    Amount of heterosis; i.e. deviation of cross mean from midparent
     value.
HS   Half sibs.
$h^2$   Heritability.
I    Interaction deviation, due to epistasis.
I    (Chap. 13 & 19): Index for selection.
i    Intensity of selection; i.e. selection differential in units of the
     phenotypic standard deviation. = $\bar{i}$ (L).
M    Population mean.
m    Immigration rate.
N    Population size; i.e. number of breeding individuals in a population
     or line.
N    (Chap. 10 & 13): Number of families.
$N_e$   Effective population size.
n    Number in various contexts. In Chapters 10 and 13, specifically
     number of offspring per family.
O    Offspring.
P    Parent. $\bar{P}$ = Midparent.
P    Frequency of homozygous genotype ($A_1A_1$).
P    Panmictic index (= $1 - F$).
P    Phenotypic value.
p    Gene frequency (of $A_1$). = u (M).
p    (Chap. 11, part): Proportion selected as parents from a normally
     distributed population. = v (L).
Q    Frequency of homozygous genotype ($A_2A_2$).
q    Gene frequency (of $A_2$). = v (M).
R    Response to selection, specifically to individual selection.
     = $\Delta G$ (L).
r    (Chap. 8): Repeatability; i.e. correlation between repeated
     measurements of the same individual.
r    (Chap. 13): Coefficient of relationship; i.e. correlation of breeding
     values between related individuals. = $r^G$ (L).
r    (Chap. 19): Correlations between two characters:
       $r_A$   additive genetic correlation. = $r_G$ (L).
       $r_E$   environmental correlation.
       $r_P$   phenotypic correlation. = r (L).
S    Selection differential in actual units of measurement. = i (L).
s    Coefficient of selection against a particular genotype.
s    (Chap. 13): Subscript referring to sib-selection.
$\Sigma$    Summation of the quantity following the sign.
$\sigma$    Standard deviation ($\sigma^2$ = variance) of the quantity indicated by
     subscript. Components of variance, from an analysis of variance,
     are indicated by subscripts as follows:
       $\sigma^2_B$   between groups, or families.
       $\sigma^2_D$   between dams, within sires.
       $\sigma^2_S$   between sires.
       $\sigma^2_T$   total; i.e. the sum of all components.
       $\sigma^2_W$   within groups, or families.
t    Time in number of generations. As a subscript it means "at
     generation t".
t    Phenotypic correlation between members of families.
u    Mutation rate (from $A_1$ to $A_2$).
V    Variance (causal component) of the value or deviation indicated by
     subscript. The most important are:
       $V_P$   Phenotypic variance. = $\sigma^2_P$ (L), = V (M).
       $V_G$   Genotypic variance. = $\sigma^2_{Ge}$ (L).
       $V_A$   Additive genetic variance. = $\sigma^2_G$ (L), = ½D (M).
       $V_D$   Dominance variance. = ¼H (M).
       $V_I$   Interaction variance. = I (M).
               ($V_D$ and $V_I$ are bracketed together against a single
               Lerner symbol, $\sigma^2_{[\ldots]}$ (L).)
       $V_E$   Environmental variance. = $\sigma^2_E$ (L), = E (M).
v    Mutation rate (from $A_2$ to $A_1$).
w    (Chap. 13): Subscript referring to selection within families.
X    (Chap. 19): One of two correlated characters.
Y    (Chap. 19): The other of two correlated characters.
y    (Chap. 14): Difference of gene frequency between two lines.
z    (Chap. 11): Height of the ordinate of a normal distribution, in
     units of the standard deviation.
INDEXED LIST OF REFERENCES
The numbers in square brackets refer to pages in
the text where the work is mentioned
Allison, A. C. 1954. Notes on sickle-cell polymorphism. Ann. hum. Genet.
    [Lond.], 19: 39-57. [45]
  1955. Aspects of polymorphism in man. Cold Spr. Harb. Symp. quant.
    Biol., 20: 239-252. [44]
Bartlett, M. S., and Haldane, J. B. S. 1935. The theory of inbreeding
    with forced heterozygosis. J. Genet., 31: 327-340. [97]
Bell, A. E., Moore, C. H., and Warren, D. C. 1955. The evaluation of
    new methods for the improvement of quantitative characteristics.
    Cold Spr. Harb. Symp. quant. Biol., 20: 197-211. [286]
Biggers, J. D., and Claringbold, P. J. 1954. Why use inbred lines?
    Nature [Lond.], 174: 596. [275]
Briles, W. E., Allen, C. P., and Millen, T. W. 1957. The B blood group
    system of chickens. I. Heterozygosity in closed populations.
    Genetics, 42: 631-648. [290]
Briquet, R., and Lush, J. L. 1947. Heritability of amount of spotting in
    Holstein-Friesian cattle. J. Hered., 38: 99-105. [167]
Brumby, P. J. 1958. Monozygotic twins and dairy cattle improvement.
    Anim. Breed. Abstr., 26: 1-12. [183]
Brumby, P. J., and Hancock, J. 1956. A preliminary report of growth and
    milk production in identical and fraternal-twin dairy cattle. N.Z.
    J. Sci. Tech., Agric., 38: 184-193. [185]
Buri, P. 1956. Gene frequency in small populations of mutant Drosophila.
    Evolution, 10: 367-402. [52, 53, 56, 59, 74]
Butler, L. 1952. A study of size inheritance in the house mouse. II. Analysis
    of five preliminary crosses. Canad. J. Zool., 30: 154-171. [216]
Cain, A. J., and Sheppard, P. M. 1954a. Natural selection in Cepaea.
    Genetics, 39: 89-116. [43, 83]
  1954b. The theory of adaptive polymorphism. Amer. Nat., 88: 321-326.
    [44]
Carpenter, J. R., Gruneberg, H., and Russell, E. S. 1957. Genetical
    differentiation involving morphological characters in an inbred
    strain of mice. II. The American branches of the C57BL and
    C57BR strains. J. Morph., 100: 377-388. [274]
Castle, W. E., and Wright, S. 1916. Studies of inheritance in guinea-pigs
    and rats. Publ. Carneg. Instn. Wash., No. 241: iv + 192 pp. [168]
Ceppellini, R., Siniscalco, M., and Smith, C. A. B. 1955. The estimation
    of gene frequencies in a random-mating population. Ann. hum.
    Genet. [Lond.], 20: 97-115. [16]
350 INDEXED LIST OF REFERENCES
Chai, C. K. 1957. Developmental homeostasis of body growth in mice.
Amer. Nat., 91: 4955. [271]
Chapman, A. B. 1946. Genetic and nongenetic sources of variation in the
weight response of the immature rat ovary to a gonadotropic
hormone. Genetics, 31: 494507. [168]
Clayton, G. A., Knight, G. R., Morris, J. A., and Robertson, A. 1957.
An experimental check on quantitative genetical theory. III.
Correlated responses. J. Genet., 55: 171180. [316, 320]
Clayton, G. A., Morris, J. A., and Robertson, A. 1957. An experimental
check on quantitative genetical theory. I. Shortterm responses to
selection. J. Genet., 55: 131151.
[140, 168, 169, 177, 190, 195, 209, 210, 221, 245, 333]
Clayton, G. A., and Robertson, A. 1955. Mutation and quantitative
variation. Amer. Nat., 89: 1 51158. [342]
1957. An experimental check on quantitative genetical theory. II.
The longterm effects of selection. J. Genet., 55: 152170.
[216, 223]
Cockerham, C. C. 1 954. An extension of the concept of partitioning
hereditary variance for analysis of covariances among relatives
when epistasis is present. Genetics, 39: 859882. [138]
1956*2. Effects of linkage on the covariances between relatives. Genetics,
41: 138141. [159]
19566. Analysis of quantitative gene action. Genetics in Plant Breeding.
Brookhaven Symp. Biol., No. 9: 5368. [140]
Comstock, R. E., and Robinson, H. F. 1952. Estimation of average domi
nance of genes. Heterosis, ed. J. W. Gowen. Ames: Iowa State
College Press. Pp. 494516. [290]
Comstock, R. E., Robinson, H. F., and Harvey, P. H. 1949. A breeding
procedure designed to make maximum use of both general and
specific combining ability. J. Amer. Soc. Agron., 41: 360367.
[286]
Crow, J. F. 1948. Alternative hypotheses of hybrid vigor. Genetics, 33:
477487. [278, 290]
1952. Dominance and overdominance. Heterosis, ed. J. W. Gowen.
Ames: Iowa State College Press. Pp. 282297. [278, 290]
1954. Breeding structure of populations. II. Effective population
number. Statistics and Mathematics in Biology, ed. O. Kempthorne,
T. A. Bancroft, J. W. Gowen, and J. L. Lush. Ames: Iowa State
College Press. Pp. 543-556. [53, 60, 61, 64, 71]
1956. The estimation of spontaneous and radiationinduced mutation
rates in man. Eugen. Quart., 3: 201208. [38]
1957. Possible consequences of an increased mutation rate. Eugen.
Quart., 4: 6780. [39]
Crow, J. F., and Morton, N. E. 1955. Measurement of gene frequency
drift in small populations. Evolution, 9: 202214. [73, 74]
Cruden, D. 1949. The computation of inbreeding coefficients in closed
populations. J. Hered., 40: 248251. [88, 89]
Dempster, E. R., and Lerner, I. M. 1950. Heritability of threshold charac
ters. Genetics, 35: 212236. [303]
Deol, M. S., Gruneberg, H., Searle, A. G., and Truslove, G. M. 1957.
Genetical differentiation involving morphological characters in an
inbred strain of mice. I. A British branch of the C57BL strain.
J. Morph., 100: 345376. [274]
Dickerson, G. E. 1952. Inbred lines for heterosis tests? Heterosis, ed.
J. W. Gowen. Ames: Iowa State College Press. Pp. 330351.
[286]
1955. Genetic slippage in response to selection for multiple objectives.
Cold Spr. Harb. Symp. quant. Biol., 20: 213-224. [329]
1957. (Two abstracts.) Poult. Sci., 36: 1112-1113. [316]
Dickerson, G. E., et al. 1954. Evaluation of selection in developing inbred lines
of swine. Res. Bull. Mo. agric. Exp. Sta., No. 551: 60 pp. [249, 253]
Dobzhansky, Th. 1950. Genetics of natural populations. XIX. Origin of
heterosis through natural selection in populations of Drosophila
pseudoobscura. Genetics, 35: 288302. [262]
1951a. Genetics and the Origin of Species. New York: Columbia
University Press. 3rd edn. xi+364 pp. [44]
1951b. Mendelian populations and their evolution. Genetics in the 20th
Century, ed. L. C. Dunn. New York: Macmillan Co. Pp. 573-589.
[44, 263]
1952. Nature and origin of heterosis. Heterosis, ed. J. W. Gowen.
Ames: Iowa State College Press. Pp. 218223. [262]
Dobzhansky, Th., and Pavlovsky, O. 1955. An extreme case of heterosis
in a Central American population of Drosophila tropicalis. Proc.
nat. Acad. Sci. U.S.A., 41: 289295. [39]
Donald, H. P., Deas, D. W., and Wilson, A. L. 1952. Genetical analysis
of the incidence of dropsical calves in herds of Ayrshire cattle.
Brit. vet. J., 108: 227245. [13]
Emik, L. O., and Terrill, C. E. 1949. Systematic procedures for calculating
inbreeding coefficients. J. Hered., 40: 51-55. [88, 89]
Falconer, D. S. 1952. The problem of environment and selection. Amer.
Nat., 86: 293298. [323, 324]
1953. Selection for large and small size in mice. J. Genet., 51: 470501.
[96, 168, 199, 335]
1954a. Asymmetrical responses in selection experiments. Symposium
on Genetics of Population Structure, Istituto di Genetica, Universita
di Pavia, Italy, August 2023, 1953. Un. int. Sci. biol., No. 15:
1641. [31, 33, 203, 213, 297]
1954b. Validity of the theory of genetic correlation. An experimental
test with mice. J. Hered., 45: 4244. [168, 316, 319]
1954c. Selection for sex ratio in mice and Drosophila. Amer. Nat., 88:
385397. [321]
1955. Patterns of response in selection experiments with mice. Cold
Spr. Harb. Symp. quant. Biol., 20: 178-196.
[168, 201, 214, 216, 220]
1957a. Breeding methods — I. Genetic considerations. The UFAW
Handbook on the Care and Management of Laboratory Animals, 2nd
edn., edd. A. N. Worden and W. LanePetter. London: Univer
sities Federation for Animal Welfare. Pp. 85107. [228]
1957b. Selection for phenotypic intermediates in Drosophila. J. Genet.,
55: 551-561. [341]
Falconer, D. S., and Latyszewski, M. 1952. The environment in relation
to selection for size in mice. J. Genet., 51: 6780. [324]
Falconer, D. S., and Robertson, A. 1956. Selection for environmental
variability of body size in mice. Z. indukt. Abstamm.u. Vererblehre,
87: 385-391. [341]
Fisher, R. A. 1918. The correlation between relatives on the supposition
of Mendelian inheritance. Trans. roy. Soc. Edinb., 52: 399-433.
[2, 124]
1930. The Genetical Theory of Natural Selection. Oxford University
Press. xiv+272pp. [4]
1941. Average excess and average effect of a gene substitution. Ann.
Eugen. [Lond.], 11: 53-63. [124]
1949. The Theory of Inbreeding. Edinburgh: Oliver & Boyd, viii + 120
pp. [90, 97, 99, 100]
Fisher, R. A., and Yates, F. 1943. Statistical Tables. Edinburgh: Oliver &
Boyd. 2nd edn. viii+98 pp. [194, 302]
Ford, E. B. 1953. The genetics of polymorphism in the Lepidoptera.
Advanc. Genet., 5: 4387. [44]
Fredeen, H. T., and Jonsson, P. 1957. Genie variance and covariance in
Danish Landrace swine as evaluated under a system of individual
feeding of progeny test groups. Z. Tierz. Zuchtbiol., 70: 348363.
[167, 174, 175, 316]
Gilmour, D. G. 1958. Maintenance of segregation of blood group genes
during inbreeding in chickens. (Abstr.) Heredity, 12: 141142.
[290]
Gowe, R. S., Robertson, A., and Latter, B. D. H. 1959. Environment and
poultry breeding problems. 5. The design of poultry control
strains. Poult. Sci., 38: 462471. [72, 73]
Gowen, J. W. 1952. Hybrid vigor in Drosophila. Heterosis, ed. J. W.
Gowen. Ames: Iowa State College Press. Pp. 474-493.
[282, 336]
Green, E. L. 1951. The genetics of a difference in skeletal type between
two inbred strains of mice (BalbC and C57blk). Genetics, 36: 391
409. [303]
Green, E. L., and Russell, W. L. 1951. A difference in skeletal type be
tween reciprocal hybrids of two inbred strains of mice (C57BLK
and C3H). Genetics, 36: 641651. [305, 307]
Gruneberg, H. 1952. Genetical studies on the skeleton of the mouse. IV.
Quasi-continuous variations. J. Genet., 51: 95-114. [301]
1954. Variation within inbred strains of mice. Nature [Lond.], 173: 674.
[275]
Haldane, J. B. S. 1924-32. A mathematical theory of natural and artificial
selection. Proc. Camb. phil. Soc., 23: 19-41; 158-163; 363-372;
607-615; 838-844; 26: 220-230; 27: 131-142; 28: 244-248. [2]
1932. The Causes of Evolution. London: Longmans, Green & Co., Ltd.
vii + 235pp. [2,4]
Haldane, J. B. S. 1936. The amount of heterozygosis to be expected in an
approximately pure line. J. Genet., 32: 375391. [100]
1937. Some theoretical results of continued brothersister mating. J.
Genet., 34: 265274. [90, 97]
1939. The spread of harmful autosomal recessive genes in human
populations. Ann. Eugen. [Lond.], 9: 232237. [41]
1946. The interaction of nature and nurture. Ann. Eugen. [Lond.], 13:
197205. [133]
1949. The rate of mutation of human genes. Proc. 8th int. Congr. Genet.
1948 [Stockh.]. Lund: Issued as a supplementary volume of
Hereditas, 1949. Pp. 267273. [38]
1954a. The measurement of natural selection. Proc. 9th int. Congr.
Genet. [Bellagio (Como)], 1953, Pt. I (Suppl. to Caryologia, 6): 480
487. [338]
1954b. The Biochemistry of Genetics. London: George Allen & Unwin
Ltd. 144 pp. [339]
1955. The complete matrices for brother-sister and alternate
parent-offspring mating involving one locus. J. Genet., 53: 315-324.
[90, 97]
Hancock, J. 1954. Monozygotic twins in cattle. Advanc. Genet., 6: 141
181. [183]
Hardy, G. H. 1908. Mendelian proportions in a mixed population. Science,
28: 49-50. [9]
Hayes, H. K., Immer, F. R., and Smith, D. C. 1955. Methods of Plant
Breeding. New York: McGrawHill Book Co., Inc. 2nd edn.
xi+551 pp. [276]
Hayman, B. I. 1955. The description and analysis of gene action and
interaction. Cold Spr. Harb. Symp. quant. Biol., 20: 79-84. [140]
1958. The theory and analysis of diallel crosses. II. Genetics, 43: 6385.
[140, 277]
Hayman, B. I., and Mather, K. 1953. The progress of inbreeding when
homozygotes are at a disadvantage. Heredity, 7: 165183. [102]
Hazel, L. N. 1943. The genetic basis for constructing selection indexes.
Genetics, 28: 476-490. [324, 325]
Hazel, L. N., and Lush, J. L. 1942. The efficiency of three methods of
selection. J. Hered., 33: 393-399. [324]
Hollingsworth, M. J., and Smith, J. M. 1955. The effects of inbreeding
on rate of development and on fertility in Drosophila subobscura.
J. Genet., 53: 295-314. [249, 252]
Horner, T. W., Comstock, R. E., and Robinson, H. F. 1955. Nonallelic
gene interactions and the interpretation of quantitative genetic data.
Tech. Bull. N.C. agric. Exp. Sta., No. 118: v + 117 pp. [299]
Hull, F. H. 1945. Recurrent selection for specific combining ability in
corn. J. Amer. Soc. Agron., 37: 134145. [286]
Hunt, H. R., Hoppert, C. A., and Erwin, W. G. 1944. Inheritance of
susceptibility to caries in albino rats (Mus norvegicus). J. dent. Res.,
23: 385-401. [297]
Johansson, I. 1950. The heritability of milk and butterfat yield. Anim.
Breed. Abstr., 18: 1-12. [144, 167, 316]
Kempthorne, O. 1954. The correlation between relatives in a random
mating population. Proc. roy. Soc., B, 143: 103-113. [138, 279]
1955a. The theoretical values of correlations between relatives in
random mating populations. Genetics, 40: 153-167.
[138, 152, 158, 174]
1955b. The correlations between relatives in random mating
populations. Cold Spr. Harb. Symp. quant. Biol., 20: 60-75. [138, 158]
1957. An Introduction to Genetic Statistics. New York: John Wiley & Sons,
Inc.; London: Chapman & Hall, Ltd. xvii + 545 pp. [4, 264, 277]
Kempthorne, O., and Tandon, O. B. 1953. The estimation of heritability
by regression of offspring on parent. Biometrics, 9: 90100. [171]
Kerr, W. E., and Wright, S. 1954a. Experimental studies of the distribu
tion of gene frequencies in very small populations of Drosophila
melanogaster: I. Forked. Evolution, 8: 172177. [74]
1954b. Experimental studies of the distribution of gene frequencies in
very small populations of Drosophila melanogaster. III. Aristapedia
and spineless. Evolution, 8: 293302. [74]
Kimura, M. 1954. Process leading to quasifixation of genes in natural
populations due to random fluctuation of selection intensities.
Genetics, 39: 280-295. [57, 340]
1955. Solution of a process of random genetic drift with a continuous
model. Proc. nat. Acad. Sci. U.S.A., 41: 144150. [54, 55, 57]
1956. Rules for testing stability of a selective polymorphism. Proc. nat.
Acad. Sci. U.S.A., 42: 336340. [42]
King, J. W. B. 1950. Pygmy, a dwarfing gene in the house mouse. J. Hered.,
41: 249-252. [113]
1955. Observations on the mutant "pygmy" in the house mouse. J.
Genet., 53: 487-497. [113, 289, 298, 299]
King, S. C., and Henderson, C. R. 1954a. Variance components analysis
in heritability studies. Poult. Sci., 33: 147154. [173]
1954b. Heritability studies of egg production in the domestic fowl.
Poult. Sci., 33: 155169. [168]
Kyle, W. H., and Chapman, A. B. 1953. Experimental check of the effec
tiveness of selection for a quantitative character. Genetics, 38: 421
443. [225]
Lack, D. 1954. The Natural Regulation of Animal Numbers. Oxford:
Clarendon Press. viii+343pp. [334]
Lamotte, M. 1951. Recherches sur la structure génétique des populations
naturelles de Cepaea nemoralis (L.). Bull. biol., Suppl. 35: 238 pp.
[78, 83, 84]
Lerner, I. M. 1950. Population Genetics and Animal Improvement.
Cambridge University Press. xviii + 342 pp. [4, 236, 325, 346]
1954. Genetic Homeostasis. Edinburgh: Oliver & Boyd. vii + 134 pp.
[44, 202, 213, 263, 270, 271, 288, 331, 339]
1958. The Genetic Basis of Selection. New York: John Wiley & Sons,
Inc. xvi+298 pp. [202, 263]
Lerner, I. M., and Cruden, D. 1951. The heritability of egg weight: the
advantages of mass selection and of early measurements. Poult. Sci.,
30: 34-41. [168]
Levene, H. 1953. Genetic equilibrium when more than one ecological niche
is available. Amer. Nat., 87: 331333. [43]
Li, C. C. 1955a. Population Genetics. Chicago: University of Chicago Press;
London: Cambridge University Press. xi+366 pp.
[4, 15, 20, 22, 24]
1955b. The stability of an equilibrium and the average fitness of a
population. Amer. Nat., 89: 281296. [43]
Livesay, E. A. 1930. An experimental study of hybrid vigor or heterosis in
rats. Genetics, 15: 1754. [271]
Lush, J. L. 1945. Animal Breeding Plans. Ames: Iowa State College Press.
3rd edn. viii+443pp. [4]
1947. Family merit and individual merit as bases for selection. Pt. I,
Pt. II. Amer. Nat., 81: 241261; 362379. [236, 237]
1950. Genetics and animal breeding. Genetics in the Twentieth Century,
ed. L. C. Dunn. New York: Macmillan Co. Pp. 493525. [200]
Lush, J. L., and Molln, A. E. 1942. Litter size and weight as permanent
characteristics of sows. Tech. Bull. U.S. Dep. Agric., No. 836: 40 pp.
[167]
MacArthur, J. W. 1949. Selection for small and large body size in the
house mouse. Genetics, 34: 194209. [216, 295, 335]
McLaren, A., and Michie, D. 1954. Factors affecting vertebral variation
in mice. 1. Variation within an inbred strain. J. Embryol. exp.
Morph., 2: 149160. [273, 274]
1955. Factors affecting vertebral variation in mice. 2. Further evidence
on intrastrain variation. J. Embryol. exp. Morph., 3: 366375.
[303, 306]
1956a. Factors affecting vertebral variation in mice. 3. Maternal
effects in reciprocal crosses. J. Embryol. exp. Morph., 4: 161-166.
[306]
1956b. Variability of response in experimental animals. J. Genet., 54:
440-455. [271]
Malécot, G. 1948. Les Mathématiques de l'Hérédité. Paris: Masson et Cie.
vi+63 pp. [4, 61, 69, 75, 88]
Mangelsdorf, P. C. 1951. Hybrid corn: its genetic basis and its significance
in human affairs. Genetics in the Twentieth Century, ed. L. C. Dunn.
New York: Macmillan Co. Pp. 555-571. [110, 277]
Mather, K. 1949. Biometrical Genetics. London: Methuen & Co., Ltd.
ix + 162 pp. [4, 106, 277, 340, 346]
1953a. Genetical control of stability in development. Heredity, 7: 297
336. [270]
1953b. The genetical structure of populations. Symp. Soc. exp. Biol., 7:
66-95. [340]
1955a. Polymorphism as an outcome of disruptive selection. Evolution,
9: 5261. [43]
1955b. The genetical basis of heterosis. Proc. roy. Soc., B, 144:
143-150. [288]
Merrell, D. J. 1953. Selective mating as a cause of gene frequency changes
in laboratory populations of Drosophila melanogaster. Evolution, 7:
287296. [34]
Morley, F. H. W. 1951. Selection for economic characters in Australian
Merino sheep. (1) Estimates of phenotypic and genetic parameters.
Sci. Bull. Dep. Agric. N.S.W., No. 73: 45 pp. [144]
1954. Selection for economic characters in Australian Merino sheep.
IV. The effect of inbreeding. Aust. J. agric. Res., 5: 305316.
[249]
1955. Selection for economic characters in Australian Merino sheep.
V. Further estimates of phenotypic and genetic parameters. Aust.
J. agric. Res., 6: 7790. [168, 316]
Mourant, A. E. 1954. The Distribution of the Human Blood Groups. Oxford:
Blackwell. xxi+428pp. [5]
Muller, H. J., and Oster, I. I. 1957. Principles of back mutation as
observed in Drosophila and other organisms. Advances in
Radiobiology. Proc. 5th int. Conf. Radiobiol. [Stockh.], 1956. Edinburgh:
Oliver & Boyd. Pp. 407415. [26]
Newman, H. H., Freeman, F. N., and Holzinger, K. J. 1937. Twins: a
Study of Heredity and Environment. Chicago: University of Chicago
Press. xvi+369pp. [185]
Osborne, R. 1957a. The use of sire and dam family averages in increasing
the efficiency of selective breeding under a hierarchical mating
system. Heredity, 11: 93116. [241]
1957b. Correction for regression on a secondary trait as a method of
increasing the efficiency of selective breeding. Aust. J. biol. Sci.,
10: 365366. [328]
Osborne, R., and Paterson, W. S. B. 1952. On the sampling variance of
heritability estimates derived from variance analyses. Proc. roy.
Soc. Edinb., B., 64: 456-461. [183]
Paxman, G. J. 1957. A study of spontaneous mutation in Drosophila
melanogaster. Genetica, 29: 3957. [342]
Pearson, K., and Lee, A. 1903. On the laws of inheritance in man. I. In
heritance of physical characters. Biometrika, 2: 357462. [163]
Penrose, L. S. 1949. The Biology of Mental Defect. London: Sidgwick &
Jackson. xiv+285pp. [163,164]
1954. Some recent trends in human genetics. Proc. 9th int. Congr.
Genet. [Bellagio (Como)], 1953, Pt. I (Suppl. to Caryologia, 6): 521
530. [Hi]
Plum, M. 1954. Computation of inbreeding and relationship coefficients.
J. Hered., 45: 92-94. [89]
Powers, L. 1950. Determining scales and the use of transformations in
studies on weight per locule of tomato fruit. Biometrics, 6: 145163.
[300]
1952. Gene recombination and heterosis. Heterosis, ed. J. W. Gowen.
Ames: Iowa State College Press. Pp. 298319. [260]
Race, R. R., and Sanger, R. 1954. Blood Groups in Man. Oxford:
Blackwell. 2nd edn. xvi+400 pp. [12, 16]
Rasmuson, M. 1952. Variation in bristle number of Drosophila melanogaster.
Acta zool. [Stockh.], 33: 277307. [265]
1956. Recurrent reciprocal selection. Results of three model experi
ments on Drosophila for improvement of quantitative characters.
Hereditas [Lund], 42: 397414. [286]
Reeve, E. C. R. 1955a. Inbreeding with homozygotes at a disadvantage.
Ann. hum. Genet. [Lond.], 19: 332346. [101]
1955b. (Contribution to discussion.) Cold Spr. Harb. Symp. quant.
Biol., 20: 76-78. [170]
1955c. The variance of the genetic correlation coefficient. Biometrics,
11: 357-374. [171, 318]
Reeve, E. C. R., and Robertson, F. W. 1953. Studies in quantitative in
heritance. II. Analysis of a strain of Drosophila melanogaster
selected for long wings. J. Genet., 51: 276316. [171, 316, 319]
1954. Studies in quantitative inheritance. VI. Sternite chaeta number
in Drosophila: a metameric quantitative character. Z. indukt. Ab
stamm.u. Vererblehre, 86: 269288. [140, 145, 316]
Rendel, J. M. 1954. The use of regressions to increase heritability. Aust.
J. biol. Sci., 7: 368378. [328]
Rendel, J. M., Robertson, A., Asker, A. A., Khishin, S. S., and Ragab,
M. T. 1957. The inheritance of milk production characteristics.
J. agric. Sci., 48: 426432. [149]
Roberts, J. A. Fraser. 1957. Blood groups and susceptibility to disease: a
review. Brit. J. prev. Soc. Med., 11: 107125. [44]
Robertson, A. 1952. The effect of inbreeding on the variation due to re
cessive genes. Genetics, 37: 189207. [268, 269]
1954. Inbreeding and performance in British Friesian cattle. Proc.
Brit. Soc. Anim. Prod., 1954: 8792. [249]
1955a. Prediction equations in quantitative genetics. Biometrics, 11:
9598. [236]
1955b. Selection in animals: synthesis. Cold Spr. Harb. Symp. quant.
Biol., 20: 225-229. [333, 337]
1956. The effect of selection against extreme deviants based on devia
tion or on homozygosis. J. Genet., 54: 236248. [338]
1957a. Genetics and the improvement of dairy cattle. Agric. Rev.
[Lond.], 2 (8): 1021. [167]
1957b. Optimum group size in progeny testing and family selection.
Biometrics, 13: 442450. [243]
1959a. Experimental design in the evaluation of genetic parameters.
Biometrics, 15: 219-226. [178, 182, 183]
1959b. The sampling variance of the genetic correlation coefficient.
Biometrics, 15: 469485. [318, 323]
Robertson, A., and Lerner, I. M. 1949. The heritability of allornone
traits: viability of poultry. Genetics, 34: 395411. [168, 303]
Robertson, F. W. 1955. Selection response and the properties of genetic
variation. Cold Spr. Harb. Symp. quant. Biol., 20: 166177.
[211, 212, 216]
1957a. Studies in quantitative inheritance. X. Genetic variation of ovary
size in Drosophila. J. Genet., 55: 410-427. [140, 144, 145, 168]
1957b. Studies in quantitative inheritance. XI. Genetic and
environmental correlation between body size and egg production in
Drosophila melanogaster. J. Genet., 55: 428-443. [131, 140, 168]
Robertson, F. W., and Reeve, E. C. R. 1952a. Studies in quantitative in
heritance. I. The effects of selection of wing and thorax length in
Drosophila melanogaster. J. Genet., 50: 414448. [223]
1952b. Heterozygosity, environmental variation and heterosis. Nature
[Lond.], 170: 296. [270, 271]
Robinson, H. F., and Comstock, R. E. 1955. Analysis of genetic variability
in corn with reference to probable effects of selection. Cold Spr.
Harb. Symp. quant. Biol., 20: 127135. [140, 284, 290]
Robinson, H. F., Comstock, R. E., Khalil, A., and Harvey, P. H. 1956.
Dominance versus overdominance in heterosis: evidence from
crosses between openpollinated varieties of maize. Amer. Nat.,
90: 127-131. [290]
Robson, E. B. 1955. Birth weight in cousins. Ann. hum. Genet. [Lond.], 19:
262268. [141, 163, 185]
Russell, E. S. 1949. A quantitative histological study of the pigment found
in the coatcolor mutants of the house mouse. IV. The nature of
the effects of genie substitution in five major allelic series. Genetics,
34: 146166. [116, 126]
Schafer, W. 1937. Über die Zunahme der Isozygotie (Gleicherbarkeit)
bei fortgesetzter Bruder-Schwester-Inzucht. Z. indukt. Abstamm.
u. Vererblehre, 72: 50-78. [91, 97]
Searle, A. G. 1949. Gene frequencies in London's cats. J. Genet., 49:
214220. [18]
Sheppard, P. M. 1958. Natural Selection and Heredity. London: Hutchin
son & Co. (Publishers) Ltd. 212 pp. [44]
Shoffner, R. N. 1948. The reaction of the fowl to inbreeding. Poult. Sci.,
27: 448452. [249]
Siertsroth, U. 1953. Geburts und Aufzuchtgewichte von Rassehunden.
Z. Hundeforsch., 20: 1 122. [216]
Slizynski, B. M. 1955. Chiasmata in the male mouse. J. Genet., 53: 597
605. [99]
Smith, H. Fairfield. 1936. A discriminant function for plant selection.
Ann. Eugen. [Lond.], 7: 240250. [325]
Smith, H. H. 1952. Fixing transgressive vigor in Nicotiana rustica.
Heterosis, ed. J. W. Gowen. Ames: Iowa State College Press. Pp.
161174. [260]
Snedecor, G. W. 1956. Statistical Methods. Ames: Iowa State College
Press. 5th edn. xiii + 534 pp. [144,173]
Sprague, G. F. 1952. Early testing and recurrent selection. Heterosis, ed.
J. W. Gowen. Ames: Iowa State College Press. Pp. 400-417.
[283]
Stern, C. 1943. The HardyWeinberg law. Science, 97: 137138. [9]
1949. Principles of Human Genetics. San Francisco: W. H. Freeman &
Co. xi+617 pp. [13, 183]
Tantawy, A. O., and Reeve, E. C. R. 1956. Studies in quantitative
inheritance. IX. The effects of inbreeding at different rates in Drosophila
melanogaster. Z. indukt. Abstamm.u. Vererblehre, 87: 648-667.
[249, 268, 290]
Waddington, C. H. 1942. Canalisation of development and the inheritance
of acquired characters. Nature [Lond.], 150: 563. [310]
1953. Genetic assimilation of an acquired character. Evolution, 7:
118-126. [310, 311]
1957. The Strategy of the Genes. London: George Allen & Unwin, Ltd.
ix + 262 pp. [43, 146, 271, 311, 340, 341, 342]
Wallace, B. 1958. The comparison of observed and calculated zygotic
distributions. Evolution, 12: 113115. [12]
Wallace, B., and Vetukhiv, M. 1955. Adaptive organization of the gene
pools of Drosophila populations. Cold Spr. Harb. Symp. quant.
Biol., 20: 303-309. [262]
Warren, E. P., and Bogart, R. 1952. Effect of selection for age at time of
puberty on reproductive performance in the rat. Sta. Tech. Bull. Ore.
agric. Exp. Sta., No. 25: 27 pp. [168]
Warwick, B. L. 1932. Probability tables for Mendelian ratios with small
numbers. Bull. Tex. agric. Exp. Sta., No. 463: 28 pp. [105]
Warwick, E. J., and Lewis, W. L. 1954. Increase in frequency of a de
leterious recessive gene in mice. J. Hered., 45: 143145. [113]
Weinberg, W. 1908. Über den Nachweis der Vererbung beim Menschen.
Jh. Ver. vaterl. Naturk. Württemb., 64: 369-382. [9]
Weir, J. A. 1955. Male influence on sex ratio of offspring in high and low
blood-pH lines of mice. J. Hered., 46: 277-283. [321]
Weir, J. A., and Clark, R. D. 1955. Production of high and low blood-pH
lines of mice by selection with inbreeding. J. Hered., 46: 125-132.
[321]
Whatley, J. A. 1942. Influence of heredity and other factors on 180
day weight in Poland China swine. J. agric. Res., 65: 249264.
[167]
Williams, E. J. 1954. The estimation of components of variability. Tech.
Pap. Div. math. Statist. C.S.I.R.O. Aust., No. 1: 22 pp. [173]
Wright, S. 1921. Systems of mating. Genetics, 6: 111-178.
[2, 4, 22, 90, 165]
1922. Coefficients of inbreeding and relationship. Amer. Nat., 56: 330
338. [22,87,88]
1931. Evolution in Mendelian populations. Genetics, 16: 97-159.
[4, 53, 54, 69, 75, 92]
1933. Inbreeding and homozygosis. Proc. nat. Acad. Sci. [Wash.], 19:
411420. [90,92]
1934. The method of path coefficients. Ann. math. Statist., 5: 161215.
[90]
1935a. The analysis of variance and the correlations between relatives
with respect to deviations from an optimum. J. Genet., 30: 243256.
[338]
1935b. Evolution in populations in approximate equilibrium. J. Genet.,
30: 257266. [338]
1939. Statistical genetics in relation to evolution. Actualités scientifiques
et industrielles, 802. Paris: Hermann et Cie. 63 pp. [70]
1940. Breeding structure of populations in relation to speciation.
Amer. Nat., 74: 232248. [71, 77]
1942. Statistical genetics and evolution. Bull. Amer. math. Soc., 48:
223246. [75, 79]
1943. Isolation by distance. Genetics, 28: 114-138. [77]
1946. Isolation by distance under different systems of mating. Genetics,
31: 39-59. [77]
1948. On the roles of directed and random changes in gene frequency
in the genetics of populations. Evolution, 2: 279294. [75]
1951. The genetical structure of populations. Ann. Eugen. [Lond.], 15:
323354. [75, 76, 77]
1952a. The theoretical variance within and among subdivisions of a
population that is in a steady state. Genetics, 37: 312321. [57]
1952b. The genetics of quantitative variability. Quantitative Inheritance,
edd. E. C. R. Reeve and C. H. Waddington. London: H.M.S.O.
Pp. 5-41. [219, 292, 297, 341]
1954. The interpretation of multivariate systems. Statistics and
Mathematics in Biology, edd. O. Kempthorne, T. A. Bancroft, J. W.
Gowen and J. L. Lush. Ames: Iowa State College Press. Pp. 11-33.
[90]
1956. Modes of selection. Amer. Nat., 90: 524. [263]
Wright, S., and Kerr, W. E. 1954. Experimental studies of the distribu
tion of gene frequencies in very small populations of Drosophila
melanogaster. II. Bar. Evolution, 8: 225240. [74, 80, 81]
Wright, S., and McPhee, H. C. 1925. An approximate method of
calculating coefficients of inbreeding and relationship from livestock
pedigrees. J. agric. Res., 31: 377-383. [87]
Yoon, C. H. 1955. Homeostasis associated with heterozygosity in the
genetics of time of vaginal opening in the house mouse. Genetics,
40: 297309. [271]
Zeleny, C. 1922. The effect of selection for eye facet number in the white
bareye race of Drosophila melanogaster. Genetics, 7: 1115. [107]
SUBJECT INDEX
Adaptive value, 26.
additive:
action of genes, 126, 138;
combination of loci, 1167;
effects, 122;
genes, 124, 138;
variance, 1358.
albinism in man, 13, 36.
assimilation, genetic, 3101.
assortative mating, 22, 164, 1701.
asymmetry in selection response,
2125;
as scale effect, 2967.
average effect, 11720.
Base population, 49, 61, 956.
blood groups:
in man, 5, 7, 12, 16, 44;
in poultry, 290;
selective advantage, 44.
Breeding value, 1205;
difference between definitions,
158.
Canalisation, 272, 308, 341.
cats, 1819.
cattle, dropsy in, 13.
causal components of variance, 150.
Cepaea nemoralis, 43, 789, 834.
coadaptation, 263.
coancestry, 8890, 233.
coefficient:
of inbreeding, see under Inbreeding;
of relationship, 233;
of selection, 28.
combining ability, 2816.
continuous variation, 104-11.
correlated characters, 312-29, 335-6.
correlation (between characters),
312-29;
genetic, 313-8.
correlation (between relatives):
of breeding values, 233;
phenotypic, 151, 162-3.
covariance, 151-2;
environmental, 159-61;
genetic, 152-9,
offspring-parent, 152-6,
sibs, 154, 156-7;
phenotypic, 161-4.
crossbreeding:
heterosis, 25563;
in plant and animal improvement,
27686;
variance between crosses, 27983.
Developmental variation, 141, 143.
deviation:
dominance, 1225;
environmental, 112;
interaction (epistatic), 1258.
discontinuous variation, 104, 108,
301.
dispersive process, 23, 47-8 (see
also Inbreeding).
dominance, 27, 113;
deviation, 1225;
directional, 213;
effect on variance, 137;
and fitness, 337;
and heterosis, 257;
and inbreeding depression, 251;
and scale, 298.
drift, random, 507;
in natural populations, 814.
Drosophila melanogaster:
Bar, 801, 107;
bristle number:
components of variance, 140,
148,
fitness relationship, 333, 343,
frequency distribution, 107,
heritability, 16970, 177,
mutation, 342,
number of "loci", 219,
random drift, 265,
repeatability, 145,
response to selection, 190, 195,
209, 210, 216, 221, 223, 245
246;
brown, 52, 53, 56, 59;
effective population size, 734;
egg number, 140, 282, 336;
ovary size, 140, 145;
raspberry, 34;
thorax length:
components of variance, 130,
140,
response to selection, 209, 211
212, 216, 219, 221, 319;
wing length, 171-2, 319.
Drosophila pseudoobscura, 262.
Drosophila subobscura, 252.
Drosophila tropicalis, 39.
dwarfism (chondrodystrophy) in
man, 38.
Effective factors, number of, 217.
effective population size (number),
68-74;
ratio of, to actual number, 73-4.
environment, 112 (see also under
Variance);
common, 159-61.
epistasis, 126 (see also under
Interaction).
equilibrium:
Hardy-Weinberg, 9-12;
under inbreeding, 7481,
and selection for heterozygotes,
1003;
with linked loci, 201;
with more than one locus, 1920;
under mutation, 256,
with selection, 3641;
under natural selection, 3312;
under selection for heterozygotes,
416.
eugenics, 36, 40.
euheterosis, 262.
Factors, effective number of, 217.
family size:
and heritability estimates, 17783;
and inbreeding, 703;
and selection, 23346.
fitness, natural, 26, 167, 329, 33043.
fixation, 547, 667, 97;
of deleterious genes, 801.
Gene frequency, 622;
change of, 2336,
by selection for metric character,
2037;
directional, 213;
distributions of:
with inbreeding, 52, 55, 84,
with inbreeding and mutation,
76,
with inbreeding and selection,
79, 81;
effect on variance, 137;
sampling variance of, 504, 64.
generation interval, 1968.
genetic death, 3940.
genotype, 112;
frequencies, 57,
with inbreeding, 579, 656;
with random mating, 922.
genotype-environment correlation,
132-3.
genotype-environment interaction,
133-4, 148-9, 322-4.
genotypic value, 112-4, 123-5, 132.
Hardy-Weinberg law, 9-15.
heritability, 135, 163, 1657;
estimation, 16885;
examples, 1678;
of family means, 2327;
after inbreeding, 268;
precision of estimates, 17783;
realised, 2023, 2967;
of threshold characters, 303;
of withinfamily deviations, 2327.
heterosis, 254;
examples, 260;
in single crosses, 25561;
in wide crosses, 2613;
utilisation of, 27686.
heterozygotes:
frequency of,
with inbreeding, 656,
with random mating, 913, 38;
selection for, see under selection.
homeostasis:
developmental, 2702;
genetic, 331.
hybrid vigour, see heterosis.
hybrids, uniformity of, 2702, 275,
296.
Idealised population, 4850.
inbreds:
experimental use, 272, 275;
subline differentiation, 2734;
variability, 2701, 296.
inbreeding, 607;
coefficient, 61;
computation,
from pedigrees, 868,
from population size, 614,
for regular systems, 905;
depression, 24754,
examples, 249;
rate of, 63, 6970, 92, 96, 1012;
regular systems, 905;
and variance, 26572.
index for selection, 3258.
incidence of threshold character, 302.
intangible variation, 141.
integration, 263.
intensity of selection, 192.
interaction:
between loci (epistatic), 1258,
263,
and heterosis, 259, 262,
and inbreeding, 252,
and scale, 2989;
deviation, 1258;
between genotype and environment,
133-4, 148-9, 322-4.
island model, 77.
isoalleles, 344.
isolation by distance, 77-9.
Line (subdivision of a population),
49.
linkage:
and correlation, 312, 320;
and Hardy-Weinberg equilibrium,
20-21;
and inbreeding, 97100;
and polygenic balance, 3401;
and resemblance between relatives,
158-9.
logarithmic transformation, 297.
luxuriance, 262.
Lycopersicon, 260, 300.
Maize, 277, 290.
Man:
albinism, 13, 36;
birth weight, 1412;
blood groups, 5, 7, 12, 16, 44;
dwarfism (chondrodystrophy), 38;
sicklecell anaemia, 446.
maternal effects, 1402, 160, 214,
2523, 2601.
mating, types of, 15.
metric character, 10411.
migration, 23;
in small populations, 759.
mouse:
blood-pH, 321;
body weight:
fitness relationship, 335-6,
number of "loci", 219,
realised heritability, 203,
response to selection, 199, 214, 216, 220,
selection differentials, 201,
sib-analysis, 175-6,
variance and scale, 295;
growth rate, 107, 133;
litter size:
frequency distribution, 107,
heterosis, 255,
inbreeding depression, 252-3,
repeatability, 144;
non-agouti, 51;
pigment granules, 116-7, 126-8;
pygmy, 113-5, 120-3, 136, 222-3, 289, 299;
sex ratio, 321;
skeletal variants, 274;
vertebrae, number of, 273-4, 305-308.
multiple alleles, 15-17, 42, 138.
multiple measurements, 142-9.
mutation, 23;
balanced against selection, 36-41;
change of gene frequency by, 24-6;
and inbreeding, 75-9, 100, 274-5;
and origin of variation, 342-3;
rate:
estimation of, 38,
increase of, 26, 39.
Neighbourhood model, 77-9.
Nicotiana, 260.
non-additive:
combination of genes, 125-8;
variance, 139-40, 280, 287, 337.
Observational components of variance, 150.
overdominance, 27, 287-91;
effect on variance, 137;
equilibrium gene frequency, 41-6;
and fitness, 339, 344;
in selection experiments, 213, 222-223.
Panmictic index, 64, 66.
panmixia, 8.
pedigrees and inbreeding, 85-90.
pigs:
body-length, 174-5;
litter size, 253.
pleiotropy, 289-91, 312-3, 328-9, 333.
polycross, 282.
polygenes, 106.
polygenic balance, 340-1.
polygenic variation, 106.
population:
base, 49, 61, 95-6;
effective size (number), 68-74,
ratio of, to actual size, 73-4;
mean, 113-7;
size, 50.
premisses, 2, 3.
probit transformation, 302.
progeny testing, 229-30.
proportionate effect, 207, 219.
Quantitative character, 104.
Quasi-continuous variation, 301.
Radiation, 26, 39.
random drift, 51-7;
in natural populations, 81-4.
random mating, 8-21.
range, total, 115, 116, 215-9.
regression, offspring on parents, 151, 162-3.
relatives, resemblance between, 150-64.
repeatability, 143-9.
Scale, 108-9, 292-300;
effects, 293;
underlying, 301.
segregation index, 217.
selection, 23, 26, 186-7;
balanced by mutation, 36-41;
change of gene frequency, 28-36;
coefficient of, 28,
related to intensity of, 203-7;
combined, 227, 236-7, 239-40;
for combining ability, 283-6;
correlated response to, 318-24;
in different environments, 322-4;
differential, 187, 191-8,
weighting of, 200-2;
for economic value, 324-9;
eugenic effects, 36, 40-1;
family, 227-8 (see also Selection, methods),
and family size, 243-5;
for heterozygotes, 41-6, 213, 222-223, 339,
affecting inbreeding, 100-3, 253-4;
index, 325-8;
indirect, 320-4;
individual, 227 (see also Selection, methods);
intensity of, 192,
related to coefficient of, 203-7;
for intermediates, 338-42;
limit, 215, 219-24, 328-9;
long-term results, 2154;
mass, 227;
methods (use of relatives), 225-31,
heritabilities, 232-6,
relative merits, 237-44,
responses expected, 231-7;
natural, 187, 200-2, 212, 253, 266, 329-43;
reciprocal, 284;
recurrent, 283, 286;
response, 187-91,
asymmetry, 212-5, 296-7,
duration, 215-7,
measurement, 198-203,
number of "loci", 217-9,
prediction, 189-91, 214-5,
repeatability, 208-12,
total, 215-7;
sib, 229 (see also Selection, methods);
in small populations, 79-81;
for threshold characters, 308-11;
variable, 339-40;
within-family, 227 (see also Selection, methods).
selective value, 26.
self-fertilising plants, 247, 276-7.
sex-linked genes, 17-19, 34.
sickle-cell anaemia, 44-6.
sib-analysis, 172-6.
snails, 43, 78-9, 83-4.
systematic processes, 23;
in small populations, 74-81.
Threshold characters, 301-11, 321.
transformation of scale, 108-9, 292-300;
logarithmic, 297;
probit, 302.
tobacco, 260.
tomato, 260, 300.
topcross, 282.
twins, 131, 183-5.
Uniformity of inbred lines, 54, 66-7, 97, 100-3.
Value:
genotypic, 112-4, 123-5;
phenotypic, 112.
variance:
additive, 135-8;
between crosses, 279-83;
components, 129-30,
causal, 150,
genetic, 134-40,
observational, 150;
dominance, 135-8, 163;
environmental, 130-4, 140-9,
common, 159-61,
general, 143-9,
inbreeding effects, 270-2,
special, 143-9;
genotypic, 130-4;
inbreeding effects, 265-72;
interaction (epistatic), 138-40,
and resemblance between relatives, 157-9;
non-additive, 139-40, 280, 287, 337.
variation:
continuous, 104-11;
discontinuous, 104, 108, 301;
quasi-continuous, 301.