INTRODUCTION TO
QUANTITATIVE GENETICS

D. S. FALCONER

Agricultural Research Council's Unit of Animal Genetics
University of Edinburgh

THE RONALD PRESS COMPANY • NEW YORK

No part of this book may be reproduced

in any form without permission in writing

from the Publisher

FIRST PUBLISHED IN GREAT BRITAIN 1960

Printed in Great Britain by
Robert MacLehose and Company Limited, Glasgow

PREFACE

My aim in writing this book has been to provide an introductory text-
book of quantitative genetics, with the emphasis on general principles
rather than on practical application, and one moreover that can be
understood by biologists of no more than ordinary mathematical
ability. In pursuit of this latter aim I have set out the mathematics in
the form that I, being little of a mathematician, find most compre-
hensible, hoping that the consequent lack of rigour and elegance will
be compensated for by a wider accessibility. The reader is not, how-
ever, asked to accept conclusions without proof. Though only the
simplest algebra is used, all the mathematical deductions essential
to the exposition of the subject are demonstrated in full. Some
knowledge of statistics, however, is assumed, particularly of the ana-
lysis of variance and of correlation and regression. Elementary
knowledge of Mendelian genetics is also assumed.

I have had no particular class of reader exclusively in mind, but
have tried to make the book useful to as wide a range of readers as
possible. In consequence some will find less detail than they require
and others more. Those who intend to become specialists in this
branch of genetics or in its application to animal or plant breeding
will find all they require of the general principles, but will find little
guidance in the techniques of experimentation or of breeding
practice. Those for whom the subject forms part of a course of
general genetics will find a good deal more detail than they require.
The section headings, however, should facilitate the selection of what
is relevant, and any of the following chapters could be omitted without
serious loss of continuity : Chapters 4, 5, 10 (after p. 168), 12, 13,
and 15-20.

The choice of symbols presented some difficulties because there
are several different systems in current use, and it proved impossible
to build up a self-consistent system entirely from these. I have
accordingly adopted what seemed to me the most appropriate of the
symbols in current use, but have not hesitated to introduce new
symbols where consistency or clarity seemed to require them. I
hope that my system will not be found unduly confusing to those
accustomed to a different one. There is a list of symbols at the end,
where some of the equivalents in other systems are given.

Acknowledgements

Many people have helped me in various ways, to all of whom I
should like to express my thanks. I am greatly indebted to Professor
C. H. Waddington for his encouragement and for the facilities that I
have enjoyed in his laboratory. It is no exaggeration to say that with-
out Dr Alan Robertson's help this book could not have been written.
Not only has his reading of the manuscript led to the elimination of
many errors, but I have been greatly assisted in my understanding
of the subject, particularly its more mathematical aspects, by frequent
discussions with him. Dr R. C. Roberts read the whole manuscript
with great care and his valuable suggestions led to many improve-
ments being made. Parts of the manuscript were read also by Dr
N. Bateman, Dr J. C. Bowman, Dr D. G. Gilmour, Dr J. H. Sang,
and my wife, to all of whom I am grateful for advice. I owe much
also to the Honours and Diploma students of Animal Genetics in
Edinburgh between 1951 and 1957, whose questions led to improvements
of presentation at many points. Despite all the help I have
received, many imperfections remain and there can hardly fail to be
some errors that have escaped detection : the responsibility for all
of these is entirely mine. To Mr E. D. Roberts I am indebted for
drawing all the graphs and diagrams, and I greatly appreciate the
care and skill with which he has drawn them. I am indebted also to
the Director and Staff of the Commonwealth Bureau of Animal
Breeding for assistance with the preparation of the list of references.

D. S. FALCONER

Institute of Animal Genetics, Edinburgh
December, 1958

CONTENTS

PREFACE

INTRODUCTION

1 GENETIC CONSTITUTION OF A POPULATION
    Frequencies of genes and genotypes
    Hardy-Weinberg equilibrium

2 CHANGES OF GENE FREQUENCY
    Migration
    Mutation
    Selection

3 SMALL POPULATIONS: I. Changes of gene frequency under simplified conditions
    The idealised population
    Sampling
    Inbreeding

4 SMALL POPULATIONS: II. Less simplified conditions
    Effective population size
    Migration, Mutation, and Selection
    Random drift in natural populations

5 SMALL POPULATIONS: III. Pedigreed populations and close inbreeding
    Pedigreed populations
    Regular systems of inbreeding

6 CONTINUOUS VARIATION
    Metric characters
    General survey of subject-matter

7 VALUES AND MEANS
    Population mean
    Average effect
    Breeding value
    Dominance deviation
    Interaction deviation

8 VARIANCE
    Genotypic and environmental variance
    Genetic components of variance
    Environmental variance

9 RESEMBLANCE BETWEEN RELATIVES
    Genetic covariance
    Environmental covariance
    Phenotypic resemblance

10 HERITABILITY
    Estimation of heritability
    The precision of estimates of heritability
    Identical twins

11 SELECTION: I. The response and its prediction
    Response to selection
    Measurement of response
    Change of gene frequency under artificial selection

12 SELECTION: II. The results of experiments
    Repeatability of response
    Asymmetry of response
    Long-term results of selection

13 SELECTION: III. Information from relatives
    Methods of selection
    Expected response
    Relative merits of the methods

14 INBREEDING AND CROSSBREEDING: I. Changes of mean value
    Inbreeding depression
    Heterosis

15 INBREEDING AND CROSSBREEDING: II. Changes of variance
    Redistribution of genetic variance
    Changes of environmental variance
    Uniformity of experimental animals

16 INBREEDING AND CROSSBREEDING: III. The utilisation of heterosis
    Variance between crosses
    Methods of selection for combining ability
    Overdominance

17 SCALE

18 THRESHOLD CHARACTERS
    Selection for threshold characters

19 CORRELATED CHARACTERS
    Genetic and environmental correlations
    Correlated response to selection
    Genotype-environment interaction
    Simultaneous selection for more than one character

20 METRIC CHARACTERS UNDER NATURAL SELECTION
    Relation of metric characters to fitness
    Maintenance of genetic variation
    The genes concerned with quantitative variation

GLOSSARY OF SYMBOLS

INDEXED LIST OF REFERENCES

SUBJECT INDEX

INTRODUCTION

Quantitative genetics is concerned with the inheritance of those differ-
ences between individuals that are of degree rather than of kind,
quantitative rather than qualitative. These are the individual differ-
ences which, as Darwin wrote, "afford materials for natural selection
to act on and accumulate, in the same manner as man accumulates in
any given direction individual differences in his domestic produc-
tions." An understanding of the inheritance of these differences is thus
of fundamental significance in the study of evolution and in the appli-
cation of genetics to animal and plant breeding; and it is from these
two fields of enquiry that the subject has received the chief impetus to
its growth.

Virtually every organ and function of any species shows individual
differences of this nature, the differences of size among ourselves or
our domestic animals being an example familiar to all. Individuals
form a continuously graded series from one extreme to the other and
do not fall naturally into sharply demarcated types. Qualitative
differences, in contrast, divide individuals into distinct types with
little or no connexion by intermediates. Examples are the differ-
ences between blue-eyed and brown-eyed individuals, between the
blood groups, or between normally coloured and albino individuals.
The distinction between quantitative and qualitative differences
marks, in respect of the phenomena studied, the distinction between
quantitative genetics and the parent stem of "Mendelian" genetics.
In respect of the mechanism of inheritance the distinction is between
differences caused by many or by few genes. The familiar Mendelian
ratios, which display the fundamental mechanism of inheritance, can
be seen only when a gene difference at a single locus gives rise to a
readily detectable difference in some property of the organism.
Quantitative differences, in so far as they are inherited, depend on gene
differences at many loci, the effects of which are not individually dis-
tinguishable. Consequently the Mendelian ratios are not exhibited
by quantitative differences, and the methods of Mendelian analysis
are inappropriate.

It is, nevertheless, a basic premiss of quantitative genetics that the
inheritance of quantitative differences depends on genes subject to
the same laws of transmission and having the same general properties
as the genes whose transmission and properties are displayed by
qualitative differences. Quantitative genetics is therefore an extension
of Mendelian genetics, resting squarely on Mendelian principles as its
foundation.

The methods of study in quantitative genetics differ from those
employed in Mendelian genetics in two respects. In the first place,
since ratios cannot be observed, single progenies are uninformative,
and the unit of study must be extended to "populations," that is
larger groups of individuals comprising many progenies. And, in the
second place, the nature of the quantitative differences to be studied
requires the measurement, and not just the classification, of the indi-
viduals. The extension of Mendelian genetics into quantitative gene-
tics may thus be made in two stages, the first introducing new con-
cepts connected with the genetic properties of "populations" and the
second introducing concepts connected with the inheritance of
measurements. This is how the subject is presented in this book. In
the first part, which occupies Chapters 1 to 5, the genetic properties of
populations are described by reference to genes causing easily identi-
fiable, and therefore qualitative, differences. Quantitative differences
are not discussed until the second part, which starts in Chapter 6.
These two parts of the subject are often distinguished by different
names, the first being referred to as "Population Genetics" and the
second as "Biometrical Genetics" or "Quantitative Genetics."
Some writers, however, use "Population Genetics" to refer to the
whole. The terminology of this distinction is therefore ambiguous.
The use of "Quantitative Genetics" to refer to the whole subject may
be justified on the grounds that the genetics of populations is not just
a preliminary to the genetics of quantitative differences, but an in-
tegral part of it.

The theoretical basis of quantitative genetics was established
round about 1920 by the work of Fisher (1918), Haldane (1924-32,
summarised 1932) and Wright (1921). The development of the
subject over the succeeding years, by these and many other gene-
ticists and statisticians, has been mainly by elaboration, clarifica-
tion, and the filling in of details, so that today we have a substantial
body of theory accepted by the majority as valid. As in any healthily
growing science, there are differences of opinion, but these are chiefly
matters of emphasis, about the relative importance of this or that
aspect.

The theory consists of the deduction of the consequences of
Mendelian inheritance when extended to the properties of popula-
tions and to the simultaneous segregation of genes at many loci. The
premiss from which the deductions are made is that the inheritance of
quantitative differences is by means of genes, and that these genes
are subject to the Mendelian laws of transmission and may have any
of the properties known from Mendelian genetics. The property of
"variable expression" assumes great importance and might be raised
to the status of another premiss: that the expression of the genotype
in the phenotype is modifiable by non-genetic causes. Other pro-
perties whose consequences are to be taken into account include
dominance, epistasis, pleiotropy, linkage, and mutation.

These theoretical deductions enable us to state what will be the
genetic properties of a population if the genes have the properties
postulated, and to predict what will be the consequences of applying
any specified plan of breeding. In principle we should then be able to
make observations of the genetic properties of natural or experi-
mental populations, and of the outcome of special breeding methods,
and deduce from these observations what are the properties of the
genes concerned. The experimental side of quantitative genetics,
however, has lagged behind the theoretical in its development, and it
is still some way from fulfilling this complementary function. The
reason for this is the difficulty of devising diagnostic experiments
which will unambiguously discriminate between the many possible
situations envisaged by the theory. Consequently the experimental
side has developed in a somewhat empirical manner, building general
conclusions out of the experience of many particular cases. Never-
theless there is now a sufficient body of experimental data to substan-
tiate the theory in its main outlines; to allow a number of generalisa-
tions to be made about the inheritance of quantitative differences;
and to enable us to predict with some confidence the outcome of
certain breeding methods. Discussion of all the difficulties would be
inappropriate in an introductory treatment. The aim here is to
describe all that is reasonably firmly established and, for the sake of
clarity, to simplify as far as is possible without being misleading.
Consequently the emphasis is on the theoretical side. Though con-
clusions will often be drawn directly from experimental data, the
experimental side of the subject is presented chiefly in the form of
examples, chosen with the purpose of illustrating the theoretical
conclusions. These examples, however, cannot always be taken as
substantiating the postulates that underlie the conclusions they
illustrate. Too often the results of experiments are open to more than
one interpretation.

No attempt has been made to give exhaustive references to pub-
lished work in any part of the subject; or to indicate the origins, or
trace the history, of the ideas. To have done this would have required
a much longer book, and a considerable sacrifice of clarity. The chief
sources, from which most of the material of the book is derived, are
listed below. These sources are not regularly cited in the text.
References are given in the text when any conclusion is stated without
full explanation of its derivation. These references are not always to
the original papers, but rather to the more recent papers where the
reader will find a convenient point of entry to the topic under dis-
cussion. References are also given to the sources of experimental
data, but these, for reasons already explained, cover only a small part
of the experimental side of the subject. In particular, a great deal
more work has been done on plants and on farm animals than would
appear from its representation among the experimental work cited.

Chief Sources

(For details see List of References)

Fisher, R. A. (1930), The Genetical Theory of Natural Selection.

Haldane, J. B. S. (1932), The Causes of Evolution.

Kempthorne, O. (1957), An Introduction to Genetic Statistics.

Lerner, I. M. (1950), Population Genetics and Animal Improvement.

Li, C. C. (1955), Population Genetics.

Lush, J. L. (1945), Animal Breeding Plans.

Malécot, G. (1948), Les Mathématiques de l'Hérédité.

Mather, K. (1949), Biometrical Genetics.

Wright, S. (1921), Systems of Mating. Genetics 6: 111-178.

Wright, S. (1931), Evolution in Mendelian Populations. Genetics 16: 97-159.

CHAPTER 1

GENETIC CONSTITUTION OF A
POPULATION

Frequencies of Genes and Genotypes

To describe the genetic constitution of a group of individuals we
should have to specify their genotypes and say how many of each geno-
type there were. This would be a complete description, provided the
nature of the phenotypic differences between the genotypes did not
concern us. Suppose for simplicity that we were concerned with a
certain autosomal locus, A, and that two different alleles at this locus,
A1 and A2, were present among the individuals. Then there would be
three possible genotypes, A1A1, A1A2, and A2A2. (We are concerned
here, as throughout the book, exclusively with diploid organisms.)
The genetic constitution of the group would be fully described by
the proportion, or percentage, of individuals that belonged to each
genotype, or in other words by the frequencies of the three genotypes
among the individuals. These proportions or frequencies are called
genotype frequencies, the frequency of a particular genotype being its
proportion or percentage among the individuals. If, for example, we
found one quarter of the individuals in the group to be A1A1, the
frequency of this genotype would be 0.25, or 25 per cent. Naturally
the frequencies of all the genotypes together must add up to unity, or
100 per cent.

Example 1.1. The M-N blood groups in man are determined by two
alleles at a locus, and the three genotypes correspond with the three blood
groups, M, MN, and N. The following figures, taken from the tabulation
of Mourant (1954), show the blood group frequencies among Eskimoes
of East Greenland and among Icelanders:

                Blood group frequency, %      Number of
                M       MN      N             individuals
Greenland       83.5    15.6    0.9           569
Iceland         31.2    51.5    17.3          747

Clearly the two populations differ in these genotype frequencies, the N
blood group being rare in Greenland and relatively common in Iceland.
Not only is this locus a source of variation within each of the two popula-
tions, but it is also a source of genetic difference between the populations.

A population, in the genetic sense, is not just a group of individuals,
but a breeding group; and the genetics of a population is concerned
not only with the genetic constitution of the individuals but also with
the transmission of the genes from one generation to the next. In the
transmission the genotypes of the parents are broken down and a new
set of genotypes is constituted in the progeny, from the genes trans-
mitted in the gametes. The genes carried by the population thus have
continuity from generation to generation, but the genotypes in which
they appear do not. The genetic constitution of a population, refer-
ring to the genes it carries, is described by the array of gene frequencies;
that is by specification of the alleles present at every locus and the
numbers or proportions of the different alleles at each locus. If,
for example, A1 is an allele at the A locus, then the frequency of A1
genes, or the gene frequency of A1, is the proportion or percentage
of all genes at this locus that are the A1 allele. The frequencies
of all the alleles at any one locus must add up to unity, or 100 per
cent.

The gene frequencies at a particular locus among a group of
individuals can be determined from a knowledge of the genotype
frequencies. To take a hypothetical example, suppose there are two
alleles, A1 and A2, and we classify 100 individuals and count the
numbers in each genotype as follows:

                          A1A1    A1A2    A2A2    Total
Number of individuals      30      60      10      100
Number of genes    A1      60      60       0      120 }
                   A2       0      60      20       80 } 200

Each individual contains two genes, so we have counted 200 representatives
of the genes at this locus. Each A1A1 individual contains
two A1 genes and each A1A2 contains one A1 gene. So there are 120 A1
genes in the sample, and 80 A2 genes. The frequency of A1 is therefore
60 per cent or 0.6, and the frequency of A2 is 40 per cent or 0.4.
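
The gene count just described can be written out as a short calculation. The following is only a minimal sketch in Python; the function name and argument names are illustrative and are not part of the text.

```python
def gene_frequencies_from_counts(n_a1a1, n_a1a2, n_a2a2):
    """Return (frequency of A1, frequency of A2) by counting genes.

    Each diploid individual carries two genes at the locus, so the
    sample contains twice as many genes as individuals.
    """
    total_genes = 2 * (n_a1a1 + n_a1a2 + n_a2a2)
    a1_genes = 2 * n_a1a1 + n_a1a2   # two from each A1A1, one from each A1A2
    a2_genes = 2 * n_a2a2 + n_a1a2   # two from each A2A2, one from each A1A2
    return a1_genes / total_genes, a2_genes / total_genes


# The hypothetical sample of 100 individuals used in the text.
p, q = gene_frequencies_from_counts(30, 60, 10)
print(f"frequency of A1 = {p:.2f}, frequency of A2 = {q:.2f}")
```

With the counts 30, 60, and 10 this gives 120 A1 genes and 80 A2 genes out of 200, in agreement with the figures in the text.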
To express the relationship in a more general form, let the frequencies
of genes and of genotypes be as follows:

                  Genes           Genotypes
                  A1    A2        A1A1    A1A2    A2A2
Frequencies       p     q         P       H       Q

so that p + q = 1, and P + H + Q = 1. Since each individual contains
two genes, the frequency of A1 genes is $\tfrac{1}{2}(2P + H)$, and the relationship
between gene frequency and genotype frequency among the
individuals counted is as follows:

$$p = P + \tfrac{1}{2}H, \qquad q = Q + \tfrac{1}{2}H \tag{1.1}$$

Example 1.2. To illustrate the calculation of gene frequencies from
genotype frequencies we may take the M-N blood group frequencies given
in Example 1.1. The M and N blood groups represent the two homozygous
genotypes and the MN group the heterozygote. The frequency of the M
gene in Greenland is, from equation 1.1, $0.835 + \tfrac{1}{2}(0.156) = 0.913$, and the
frequency of the N gene is $0.009 + \tfrac{1}{2}(0.156) = 0.087$, the sum of the
frequencies being 1.000 as it should be. Doing the same for the Iceland
sample we find the following gene frequencies in the two populations,
expressed now as percentages:

                Gene
                M       N
Greenland       91.3    8.7
Iceland         57.0    43.0

Thus the two populations differ in gene frequency as well as in genotype
frequencies.
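
The calculation in Example 1.2 can be repeated mechanically for both samples. The sketch below applies equation 1.1 to the blood group frequencies of Example 1.1; the function name, the dictionary layout, and the tolerance used in the check are illustrative assumptions.

```python
def gene_frequencies(P, H, Q):
    """Equation 1.1: gene frequencies from genotype frequencies.

    P, H, Q are the frequencies of the two homozygotes and the
    heterozygote, given as proportions that add up to 1.
    """
    if abs(P + H + Q - 1.0) > 1e-6:
        raise ValueError("genotype frequencies must add up to unity")
    p = P + H / 2   # each heterozygote carries one gene of each kind
    q = Q + H / 2
    return p, q


# Genotype frequencies (M, MN, N) from Example 1.1, as proportions.
samples = {
    "Greenland": (0.835, 0.156, 0.009),
    "Iceland": (0.312, 0.515, 0.173),
}

for name, (P, H, Q) in samples.items():
    p, q = gene_frequencies(P, H, Q)
    print(f"{name}: M gene = {p:.4f}, N gene = {q:.4f}")
```

The printed values are proportions; multiplied by 100 and rounded to one decimal place they should correspond to the percentages tabulated in Example 1.2.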

The genetic properties of a population are influenced in the process
of transmission of genes from one generation to the next by a
number of agencies. These form the chief subject-matter of the next
four chapters, but we may briefly review them here in order to have
some idea of what factors are being left out of consideration in this
chapter. The agencies through which the genetic properties of a
population may be changed are these:

Population size. The genes passed from one generation to the
next are a sample of the genes in the parent generation. Therefore
the gene frequencies are subject to sampling variation between suc-
cessive generations, and the smaller the number of parents the greater
is the sampling variation. The effects of sampling variation will be
considered in Chapters 3-5, and meantime we shall exclude it from
the discussion by supposing always that we are dealing with a "large
population," which means simply one in which sampling variation is
so small as to be negligible. For practical purposes a "large popula-
tion" is one in which the number of adult individuals is in the hundreds
rather than in the tens.

Differences of fertility and viability. Though we are not at
present concerned with the phenotypic effects of the genes under dis-
cussion, we cannot ignore their effects on fertility and viability, be-
cause these influence the genetic constitution of the succeeding
generation. The different genotypes among the parents may have
different fertilities, and if they do they will contribute unequally to
the gametes out of which the next generation is formed. In this way
the gene frequency may be changed in the transmission. Further,
the genotypes among the newly formed zygotes may have different
survival rates, and so the gene frequencies in the new generation may
be changed by the time the individuals are adult and themselves
become parents. These processes are called selection, and will be
described in Chapter 2. Meanwhile we shall suppose they are not
operating. It is difficult to find examples of genes not subject to
selection. For the purpose of illustration, however, we may take the
human blood-group genes since the selective forces acting on these
are probably not very strong. Genes that produce a mutant pheno-
type which is abnormal in comparison with the wild-type are, in
contrast, usually subject to much more severe selection.

Migration and mutation. The gene frequencies in the popula-
tion may also be changed by immigration of individuals from another
population, and by gene mutation. These processes will be described
in Chapter 2, and at this stage will also be supposed not to operate.

Mating system. The genotypes in the progeny are determined
by the union of the gametes in pairs to form zygotes, and the union of
gametes is influenced by the mating of the parents. So the genotype
frequencies in the offspring generation are influenced by the geno-
types of the pairs that mate in the parent generation. We shall at
first suppose that mating is at random with respect to the genotypes
under discussion. Random mating, or panmixia, means that any
individual has an equal chance of mating with any other individual in
the population. The important points are that there should be no
special tendency for mated individuals to be alike in genotype, or to
be related to each other by ancestry. If a population covers a large
geographic area individuals inhabiting the same locality are more
likely to mate than individuals inhabiting different localities, and so
the mated pairs tend to be related by ancestry. A widely spread
population is therefore likely to be subdivided into local groups and
mating is random only within the groups. The properties of sub-
divided populations depend on the size of the local groups, and will
be described under the effects of population size in Chapters 3-5.
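
The remark made under "Population size", that sampling variation is greater the smaller the number of parents, can be given a rough numerical form. The sketch below rests on an assumption that is not established in this chapter: that the 2N genes of the progeny are drawn independently from the parental genes, so that the gene frequency in the progeny has the binomial standard deviation sqrt(q(1 - q)/2N). The function name and the values of N are illustrative only.

```python
import math


def sampling_sd(q, n_parents):
    """Standard deviation of the gene frequency among the progeny,
    ASSUMING binomial sampling of 2N genes from a parental gene
    frequency q. This is an assumption of the sketch."""
    return math.sqrt(q * (1.0 - q) / (2 * n_parents))


# Parental gene frequency of 0.5 with tens, hundreds, and thousands of parents.
for n in (10, 100, 1000):
    print(f"N = {n:4d}: standard deviation of gene frequency = {sampling_sd(0.5, n):.4f}")
```

Under this assumption the standard deviation falls as the square root of the number of parents increases, which is consistent with treating a population numbered in the hundreds as "large" for practical purposes.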

Hardy-Weinberg Equilibrium

In a large random-mating population both gene frequencies and
genotype frequencies are constant from generation to generation, in
the absence of migration, mutation and selection; and the genotype
frequencies are determined by the gene frequencies. These properties
of a population were first demonstrated by Hardy and by Weinberg
independently in 1908, and are generally known as the Hardy-Weinberg
Law. (See Stern, 1943, where a translation of the relevant
part of Weinberg's paper will be found.) Such a population is said
to be in Hardy-Weinberg equilibrium. Deduction of the Hardy-
Weinberg Law involves three steps: (1) from the parents to the
gametes they produce; (2) from the union of the gametes to the geno-
types in the zygotes produced; and (3) from the genotypes of the
zygotes to the gene frequency in the progeny generation. These steps,
in detail, are as follows:

1. Let the parent generation have gene and genotype frequencies
as follows:

                  Genes           Genotypes
                  A1    A2        A1A1    A1A2    A2A2
Frequencies       p     q         P       H       Q

Two sorts of gametes are produced, those bearing A1 and those bearing
A2. The frequencies of these gametic types are the same as the
gene frequencies, p and q, in the generation producing them, for this
reason: A1A1 individuals produce only A1 gametes, and A1A2 individuals
produce equal numbers of A1 and A2 gametes (provided, of
course, there is no anomaly of segregation). So the frequency of A1
gametes produced by the whole population is $P + \tfrac{1}{2}H$, which by
equation 1.1 is the gene frequency of A1.

2. Random mating between individuals is equivalent to random
union among their gametes. We can think of a pool of gametes to
which all the individuals contribute equally; zygotes are formed by
random union between pairs of gametes from the pool. The genotype
frequencies among the zygotes are then the products of the frequencies
of the gametic types that unite to produce them. The genotype
frequencies among the progeny produced by random mating can
therefore be determined simply by multiplying the frequencies of the
gametic types as shown in the following table:

                                      Female gametes and
                                      their frequencies
                                      A1 (p)           A2 (q)
Male gametes and       A1 (p)         A1A1 ($p^2$)     A1A2 ($pq$)
their frequencies      A2 (q)         A1A2 ($pq$)      A2A2 ($q^2$)

We need not distinguish the union of A1 eggs with A2 sperms from
that of A2 eggs with A1 sperms; so the genotype frequencies of the
zygotes are

$$\begin{array}{ccc} A_1A_1 & A_1A_2 & A_2A_2 \\ p^2 & 2pq & q^2 \end{array} \tag{1.2}$$

Note that these genotype frequencies depend only on the gene fre-
quency in the parents, and not on the parental genotype frequencies,
provided the parents mate at random.

3. Finally we use these genotype frequencies to determine the
gene frequency in the offspring generation. Applying equation 1.1
we find the gene frequency of A1 is $p^2 + \tfrac{1}{2}(2pq) = p(p + q) = p$, which is
the same as in the parent generation.
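
The three steps of the deduction can be followed numerically. The sketch below takes a set of parental genotype frequencies, forms the gametic frequencies (step 1), multiplies them to obtain the genotype frequencies of the zygotes (step 2), and recovers the gene frequency of the progeny (step 3). The function names and the two sets of parental frequencies are illustrative assumptions; the second set is chosen to have the same gene frequency as the first but no heterozygotes.

```python
def gametic_frequencies(P, H, Q):
    """Steps 1 and 3: frequencies of A1 and A2 gametes (equation 1.1)."""
    return P + H / 2, Q + H / 2


def random_mating_progeny(P, H, Q):
    """Step 2: genotype frequencies among zygotes formed by random
    union of gametes (equation 1.2)."""
    p, q = gametic_frequencies(P, H, Q)
    return p * p, 2 * p * q, q * q


# Two hypothetical parent generations with the same gene frequency (p = 0.6)
# but quite different genotype frequencies.
parent_sets = [(0.30, 0.60, 0.10), (0.60, 0.00, 0.40)]

for parents in parent_sets:
    progeny = random_mating_progeny(*parents)
    p_parents, _ = gametic_frequencies(*parents)
    p_progeny, _ = gametic_frequencies(*progeny)
    print("parents:", parents)
    print("  progeny genotype frequencies:", tuple(round(x, 4) for x in progeny))
    print("  gene frequency of A1: parents", round(p_parents, 4),
          "progeny", round(p_progeny, 4))
```

Both parent generations should give the same progeny genotype frequencies and an unchanged gene frequency, since the progeny depend on the parents only through p and q.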

The properties of a population with respect to a single locus, expressed
in the Hardy-Weinberg law and demonstrated above, are
these:

(1) A large random-mating population, in the absence of migration,
mutation, and selection, is stable with respect to both gene and
genotype frequencies: there is no inherent tendency for its genetic
properties to change from generation to generation.

(2) The genotype frequencies in the progeny produced by random
mating among the parents are determined solely by the gene fre-
quencies among the parents. Consequently:

(a) a population in Hardy-Weinberg equilibrium has the relationship
expressed in equation 1.2 between the gene and
genotype frequencies in any one generation. And,

(b) these Hardy-Weinberg genotype frequencies are established
by one generation of random mating, irrespective of the
genotype frequencies among the parents.

[Figure: frequencies of the genotypes A1A1, A1A2, and A2A2 plotted against the gene frequency of A2.]

Fig. 1.1. Relationship between genotype frequencies and gene
frequency for two alleles in a population in Hardy-Weinberg
equilibrium.
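
The curves of Fig. 1.1 can be tabulated directly from equation 1.2. The following sketch lists the three genotype frequencies for gene frequencies of A2 from 0 to 1 in steps of 0.1; the function name and the step size are illustrative choices.

```python
def hardy_weinberg(q):
    """Genotype frequencies (A1A1, A1A2, A2A2) from equation 1.2,
    for a gene frequency q of A2 and p = 1 - q of A1."""
    p = 1.0 - q
    return p * p, 2 * p * q, q * q


# One row per gene frequency of A2: (q, A1A1, A1A2, A2A2).
rows = [(i / 10, *hardy_weinberg(i / 10)) for i in range(11)]

print(" q     A1A1    A1A2    A2A2")
for q, f11, f12, f22 in rows:
    print(f"{q:.1f}   {f11:.3f}   {f12:.3f}   {f22:.3f}")
```

In this tabulation the heterozygote frequency is greatest at q = 0.5, where it reaches one half, and at low values of q the heterozygotes are much more frequent than the A2A2 homozygotes.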

We shall later give another proof of the Hardy-Weinberg law by
a different method. Let us now first illustrate the properties of a
population in Hardy-Weinberg equilibrium, and then show to what
uses these properties can be put. The relationship between gene
frequency and genotype frequencies expressed in equation 1.2 is
illustrated graphically in Fig. 1.1, which shows how the frequencies
of the three genotypes for a locus with two alleles depend on the gene
frequency. As an example of the Hardy-Weinberg genotype frequencies
we shall take again the M-N blood groups in man.

Example 1.3. Race and Sanger (1954) quote the following frequencies
(%) of the M-N blood groups in a sample of 1,279 English people. From
the observed genotype (i.e. blood group) frequencies we can calculate the
gene frequencies by equation 1.1. These gene frequencies are shown on
the right.

              Blood group                    Gene
              M        MN       N            M        N
Observed      28.38    49.57    22.05        53.165   46.835
Expected      28.265   49.800   21.935

Now from the gene frequencies we can calculate the expected Hardy-Weinberg
genotype frequencies by equation 1.2, and we find that the
observed frequencies agree very closely with those expected for a population
in Hardy-Weinberg equilibrium.
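
The comparison in Example 1.3 can be reproduced in two steps: equation 1.1 gives the gene frequencies from the observed blood group percentages, and equation 1.2 gives the expected percentages. The sketch below is one way of setting this out; the function name and the dictionary keys are illustrative.

```python
def expected_hardy_weinberg(observed_percent):
    """Return (expected genotype percentages, (p, q)) for observed
    percentages of the M, MN, and N blood groups."""
    total = sum(observed_percent.values())
    P = observed_percent["M"] / total
    H = observed_percent["MN"] / total
    Q = observed_percent["N"] / total
    p = P + H / 2                      # equation 1.1
    q = Q + H / 2
    expected = {                       # equation 1.2
        "M": 100 * p * p,
        "MN": 100 * 2 * p * q,
        "N": 100 * q * q,
    }
    return expected, (p, q)


# Observed percentages quoted in Example 1.3.
observed = {"M": 28.38, "MN": 49.57, "N": 22.05}
expected, (p, q) = expected_hardy_weinberg(observed)

print(f"gene frequencies: M = {100 * p:.3f}%, N = {100 * q:.3f}%")
for group in ("M", "MN", "N"):
    print(f"{group:>2}: observed {observed[group]:.2f}%, expected {expected[group]:.3f}%")
```

The differences between observed and expected percentages can then be inspected directly, bearing in mind the limitation, noted in the text that follows, that such a comparison can detect but not measure differences of fertility or viability.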

Comparison of observed with expected genotype frequencies may
be regarded as a test of the fulfilment of the conditions on which the
Hardy-Weinberg equilibrium depends. These conditions are:
random mating among the parents of the individuals observed, equal
fertility of the different genotypes among the parents, and equal
viability of the different genotypes among the offspring from fertilisation
up to the time of observation. In addition, the classification of
individuals as to genotype must have been correctly made. The
blood group frequencies in Example 1.3 give no cause to doubt the
fulfilment of these conditions. It should be noted, however, that a
difference of fertility or of viability between the genotypes, though it
can be detected, cannot be measured from a comparison of observed
with expected frequencies (Wallace, 1958). The expected frequencies
are based on the observed gene frequencies after the differences of fertility
or viability have had their effect. In order to measure these effects
we should have to know the original gene or genotype frequencies.

At the beginning of the chapter we saw, in equation 1.1, how the
gene frequencies among a group of individuals can be determined
from their genotype frequencies; but for this it was necessary to know
the frequencies of all three genotypes. Consequently the relationship
in equation 1.1 cannot be applied to the case of a recessive allele,
when the heterozygote is indistinguishable from the dominant
homozygote. Consideration of the population as a breeding unit, however,
shows that when the conditions for Hardy-Weinberg equilibrium
hold, only the frequency of one of the homozygous genotypes is
needed to determine the gene frequency, and the difficulty of recessive
genes is thus overcome. Let $A_2$, for example, be a recessive gene
with frequency $q$; then the frequency of $A_2A_2$ homozygotes is $q^2$. In
other words the gene frequency is the square root of the homozygote
frequency. Thus we can determine the gene frequency of recessive
abnormalities, provided that selective mortality of the homozygote
can be discounted or allowed for. But we can go further, and this is
often the more important point: we can also determine the frequency
of heterozygotes, or "carriers," of recessive abnormalities, which is
$2q(1 - q)$. It comes as a surprise to most people to discover how
common heterozygotes of a rare recessive abnormality are.

Example 1.4. Albinism in man is probably determined by a single
recessive autosomal gene, and the frequency of albinos is about 1/20,000
in human populations (see Stern, 1949). If $q$ is the frequency of the albino
gene, then $q^2 = 1/20,000$, and $q = 1/141$, if selective mortality is disregarded.
The frequency of heterozygotes is then $2q(1 - q)$, which works out to about
1/70. So about one person in seventy is a heterozygote for albinism,
though only one in twenty thousand is a homozygote.

Example 1.5. There is a recessive autosomal gene in the Ayrshire
breed of cattle in Britain which causes dropsy in the new-born calf. The
frequency of this abnormality is about 1 in 300 births (Donald, Deas, and
Wilson, 1952). A means of reducing the frequency of the defect would
obviously be the avoidance of the use of bulls known or thought to be
heterozygous. We might first want to know what proportion of bulls
would be expected to be heterozygotes. In this case the conditions for
Hardy-Weinberg equilibrium are certainly not all fulfilled: the breed is not
a single random-breeding population, and the abnormal homozygotes are
not fully viable up to the time of birth. So we can only get a rough idea of
the frequency of heterozygotes by assuming the observations to refer to a
population in Hardy-Weinberg equilibrium. On this assumption,
$q^2 = 0.0033$, so $q = 0.057$; the frequency of heterozygotes is $2q(1 - q) = 0.11$.
So we should expect, very approximately, one bull in ten to be a hetero-
zygote.
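
The calculations of Examples 1.4 and 1.5 follow the same two steps and can be sketched in Python as below. The function name is our own, and the sketch assumes, as the examples do, that the Hardy-Weinberg frequencies hold and that selective mortality is disregarded.

```python
import math


def carrier_frequency(recessive_homozygote_freq):
    """Return (q, 2q(1 - q)) given the frequency q^2 of recessive homozygotes."""
    q = math.sqrt(recessive_homozygote_freq)
    return q, 2 * q * (1 - q)


# Example 1.4: albinism, about 1 in 20,000.
q_albino, het_albino = carrier_frequency(1 / 20000)
print("albinism: q = 1/%.0f, carriers = 1/%.0f" % (1 / q_albino, 1 / het_albino))

# Example 1.5: dropsy in Ayrshire cattle, q^2 taken as 0.0033.
q_dropsy, het_dropsy = carrier_frequency(0.0033)
print("dropsy: q = %.3f, carriers = %.2f" % (q_dropsy, het_dropsy))
```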

Mating frequencies and another proof of the Hardy-Weinberg
law. Let us now look more closely into the breeding
structure of a random-mating population, distinguishing the types of
mating according to the genotypes of the pairs, and seeing what are
the genotype frequencies among the progenies of the different types
of mating. This provides a general method for relating genotype
frequencies in successive generations, which we shall use in a later
chapter. It also provides another proof of the Hardy-Weinberg law;
a proof more cumbersome than that already given but showing more
clearly how the Hardy-Weinberg frequencies arise from the
Mendelian laws of segregation. The procedure is to obtain first the
frequencies of all possible mating types according to the frequencies
of the genotypes among the parents, and then to obtain the
frequencies of genotypes among the progeny of each type of mating
according to the Mendelian ratios.

Consider a locus with two alleles, and let the frequencies of genes
and genotypes in the parents be, as before,

                   Genes                 Genotypes
               $A_1$   $A_2$     $A_1A_1$   $A_1A_2$   $A_2A_2$
Frequencies      p       q          P          H          Q

There are altogether nine types of mating, and their frequencies
when mating is random are found thus:

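                                 Genotype and frequency of male parent
                                 $A_1A_1$     $A_1A_2$     $A_2A_2$
                                    P            H            Q
Genotype and     $A_1A_1$   P     $P^2$        $PH$         $PQ$
frequency of     $A_1A_2$   H     $PH$         $H^2$        $HQ$
female parent    $A_2A_2$   Q     $PQ$         $HQ$         $Q^2$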

Since the sex of the parent is irrelevant in this context, some of the
types of mating are equivalent, and the number of different types
reduces to six. By summation of the frequencies of equivalent types,
we obtain the frequencies of mating types in the first two columns of
Table 1.1. Now we have to consider the genotypes of offspring
produced by each type of mating, and find the frequency of each
genotype in the total progeny, assuming, of course, that all types of mating
are equally fertile and all genotypes equally viable. This is done in
the right hand side of Table 1.1. Thus, for example, matings of the
type $A_1A_1 \times A_1A_1$ produce only $A_1A_1$ offspring. So, of all the $A_1A_1$
genotypes in the total progeny, a proportion $P^2$ come from this type
of mating. Similarly a quarter of the offspring of $A_1A_2 \times A_1A_2$
matings are $A_1A_1$. So this type of mating, which has a frequency of
$H^2$, contributes a proportion $\frac{1}{4}H^2$ of the total $A_1A_1$ progeny. To find
the frequency of each genotype in the total progeny we add the
frequencies contributed by each type of mating. The sums, after
simplification, are given at the foot of the table, and from the identity
given in equation 1.1 they are seen to be equal to $p^2$, $2pq$, and $q^2$.
These are the Hardy-Weinberg equilibrium frequencies, and we
have shown that they are attained by one generation of random mating,
irrespective of the genotype frequencies among the parents.

Table 1.1

          Mating                     Genotype and frequency of progeny
Type                   Frequency     $A_1A_1$              $A_1A_2$                              $A_2A_2$
$A_1A_1 \times A_1A_1$   $P^2$       $P^2$                 -                                     -
$A_1A_1 \times A_1A_2$   $2PH$       $PH$                  $PH$                                  -
$A_1A_1 \times A_2A_2$   $2PQ$       -                     $2PQ$                                 -
$A_1A_2 \times A_1A_2$   $H^2$       $\frac{1}{4}H^2$      $\frac{1}{2}H^2$                      $\frac{1}{4}H^2$
$A_1A_2 \times A_2A_2$   $2HQ$       -                     $HQ$                                  $HQ$
$A_2A_2 \times A_2A_2$   $Q^2$       -                     -                                     $Q^2$

Sums                                 $(P + \frac{1}{2}H)^2$   $2(P + \frac{1}{2}H)(Q + \frac{1}{2}H)$   $(Q + \frac{1}{2}H)^2$
                                     $= p^2$               $= 2pq$                               $= q^2$
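
The bookkeeping of Table 1.1 can be checked mechanically. The Python sketch below (the function names and the coding of genotypes by their number of $A_2$ genes are our own) runs through the nine types of mating, applies the Mendelian ratios to each, and adds up the progeny; exact fractions are used so that the comparison with $p^2$, $2pq$ and $q^2$ involves no rounding.

```python
from fractions import Fraction as F
from itertools import product


def gametes(genotype):
    """Proportions of (A1, A2) gametes from a genotype coded 0, 1 or 2,
    the code being the number of A2 genes carried."""
    return {0: (F(1), F(0)), 1: (F(1, 2), F(1, 2)), 2: (F(0), F(1))}[genotype]


def progeny_after_random_mating(P, H, Q):
    """Genotype frequencies among the progeny of a random-mating population."""
    freq = {0: P, 1: H, 2: Q}
    progeny = [F(0), F(0), F(0)]
    for dam, sire in product(freq, repeat=2):        # the nine types of mating
        mating_freq = freq[dam] * freq[sire]
        a1_d, a2_d = gametes(dam)
        a1_s, a2_s = gametes(sire)
        progeny[0] += mating_freq * a1_d * a1_s                    # A1A1
        progeny[1] += mating_freq * (a1_d * a2_s + a2_d * a1_s)    # A1A2
        progeny[2] += mating_freq * a2_d * a2_s                    # A2A2
    return tuple(progeny)


# Arbitrary parental frequencies, far from Hardy-Weinberg proportions.
P, H, Q = F(1, 2), F(1, 10), F(2, 5)
p, q = P + H / 2, Q + H / 2

first = progeny_after_random_mating(P, H, Q)
second = progeny_after_random_mating(*first)
print(first, (p * p, 2 * p * q, q * q))
```

The first generation of progeny already has the frequencies $p^2$, $2pq$, $q^2$, and a second generation of random mating leaves them unchanged.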

Multiple alleles. Restriction of the treatment to two alleles at a
locus suffices for many purposes. If we are interested in one
particular allele, as often happens, then all the other alleles at the
locus can be treated as one. Formulation of the situation in terms of
two alleles is therefore often possible even if there are in fact more
than two. If we are interested in more than one allele we can still, if
we like, treat the situation as a two-allele system by considering each
allele in turn and lumping the others together. But the treatment can
be easily extended to cover more than two alleles, and no new
principle is introduced. In general, if $q_1$ and $q_2$ are the frequencies of any
two alleles, $A_1$ and $A_2$, of a multiple series, then the genotype
frequencies under Hardy-Weinberg equilibrium are as follows (Li,

Genotype:     $A_1A_1$     $A_1A_2$     $A_2A_2$
Frequency:    $q_1^2$      $2q_1q_2$    $q_2^2$

These frequencies are also attained by one generation of random
mating. This can readily be seen by reducing the situation to a two-
allele system, and considering each allele in turn. Or it can be
proved, though somewhat more laboriously, by the method explained
above for the two-allele system.
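
The extension to multiple alleles can be illustrated by a short Python sketch. The allele names and frequencies are invented for the illustration, and the function names are our own; the rule applied is simply $q_i^2$ for each homozygote and $2q_iq_j$ for each heterozygote.

```python
from itertools import combinations_with_replacement


def hw_multiple_alleles(freqs):
    """Hardy-Weinberg genotype frequencies for any number of alleles.

    freqs maps allele name -> gene frequency; the result maps an
    (allele, allele) pair -> genotype frequency.
    """
    genotypes = {}
    for a, b in combinations_with_replacement(sorted(freqs), 2):
        genotypes[(a, b)] = freqs[a] ** 2 if a == b else 2 * freqs[a] * freqs[b]
    return genotypes


def allele_frequency(genotypes, allele):
    """Gene frequency recovered by counting genes in the genotypes."""
    return sum(f * g.count(allele) / 2 for g, f in genotypes.items())


# Hypothetical three-allele series.
freqs = {"A1": 0.5, "A2": 0.3, "A3": 0.2}
genotypes = hw_multiple_alleles(freqs)
for genotype, frequency in genotypes.items():
    print(genotype, round(frequency, 4))
```

Counting the genes back out of the six genotypes returns the gene frequencies we started with, which is the sense in which these genotype frequencies are in equilibrium.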

Example 1.6. The ABO blood groups in man are determined by a
series of allelic genes. For the purpose of illustration we shall recognise
three alleles, A, B, and O, and show how the gene frequencies can be
estimated from the blood group frequencies. Let the frequencies of the
A, B, and O genes be $p$, $q$, and $r$ respectively, so that $p + q + r = 1$. The
following table shows (1) the genotypes, (2) the blood groups (i.e.
phenotypes) corresponding to the different genotypes, (3) the expected
frequencies of the blood groups in terms of $p$, $q$, and $r$, on the assumption of
Hardy-Weinberg equilibrium, (4) observed frequencies of blood groups in
a sample of 190,177 United Kingdom airmen, quoted by Race and Sanger
(1954).

Genotype           AA  AO          BB  BO          OO         AB
Blood group           A               B             O         AB
Frequency (%)
   expected      $p^2 + 2pr$     $q^2 + 2qr$     $r^2$      $2pq$
   observed        41.716           8.560        46.684      3.040

Calculation of the gene frequencies is rather more complicated than with
two alleles. The following is the simplest method: a more refined method
is described by Ceppellini et al. (1955). First, the frequency of the O gene
is simply the square root of the frequency of the O group. Next it will be
seen that the sum of the frequencies of the B and O groups is
$q^2 + 2qr + r^2 = (q + r)^2 = (1 - p)^2$. So $p = 1 - \sqrt{B + O}$, where B and O
are the frequencies of the blood groups B and O. In the same way
$q = 1 - \sqrt{A + O}$, and we have seen that $r = \sqrt{O}$. This method gives the
following gene frequencies in the sample:

A gene:  p = 0.2567
B gene:  q = 0.0598
O gene:  r = 0.6833

Total        0.9998

As a result of sampling errors these frequencies do not add up exactly to
unity, but we shall not trouble to make an adjustment for so small a
discrepancy. We may now calculate the expected frequency of the AB blood
group, which has not been used in arriving at these gene frequencies, and
see whether the observed frequency agrees satisfactorily. The expected
frequency of AB from estimates of $p$ and $q$ is 3.070 per cent, which is in
good agreement with the observed frequency of 3.040 per cent. ($\chi^2 = 0.7$,
with 1 d.f., calculated by the method given by Race and Sanger.)
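
The simple square-root method of Example 1.6 is easily retraced. The following Python sketch uses the observed blood group frequencies quoted in the example; the variable names are our own, and the $\chi^2$ test of Race and Sanger is not attempted.

```python
import math

# Observed blood group frequencies, as proportions (Example 1.6).
A, B, O, AB = 0.41716, 0.08560, 0.46684, 0.03040

r = math.sqrt(O)            # O gene: square root of the O group frequency
p = 1 - math.sqrt(B + O)    # A gene: B + O = (1 - p)^2
q = 1 - math.sqrt(A + O)    # B gene: A + O = (1 - q)^2

expected_AB = 2 * p * q     # the AB group was not used in the estimation

print("p = %.4f, q = %.4f, r = %.4f, total = %.4f" % (p, q, r, p + q + r))
print("expected AB = %.3f per cent" % (100 * expected_AB))
```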

Sex-linked genes. With sex-linked genes the situation is rather
more complex than with autosomal genes. The relationship between
gene frequency and genotype frequency in the homogametic sex is
the same as with an autosomal gene, but the heterogametic sex has
only two genotypes and each individual carries only one gene instead
of two. For this reason two-thirds of the sex-linked genes in the
population are carried by the homogametic sex and one-third by the
heterogametic. For the sake of brevity we shall now refer to the
heterogametic sex as male. Consider two alleles, $A_1$ and $A_2$, with
frequencies $p$ and $q$, and let the genotypic frequencies be as follows:

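                       Females                        Males
             $A_1A_1$   $A_1A_2$   $A_2A_2$       $A_1$    $A_2$
Frequency       P          H          Q             R        S

The frequency of $A_1$ among the females is then $p_f = P + \frac{1}{2}H$, and the
frequency among the males is $p_m = R$. The frequency of $A_1$ in the
whole population is

$$\bar{p} = \tfrac{1}{3}(2p_f + p_m) \tag{1.3}$$

$$\phantom{\bar{p}} = \tfrac{1}{3}(2P + H + R) \tag{1.4}$$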

Now, if the gene frequencies among males and among females are
different, the population is not in equilibrium. The gene frequency
in the population as a whole does not change, but its distribution
between the two sexes oscillates as the population approaches
equilibrium. The reason for this can be seen from the following
considerations. Males get their sex-linked genes only from their
mothers; therefore $p_m$ is equal to $p_f$ in the previous generation.
Females get their sex-linked genes equally from both parents;
therefore $p_f$ is equal to the mean of $p_m$ and $p_f$ in the previous generation.
Using primes to indicate the previous generation, we have

$$p_m = p'_f$$

$$p_f = \tfrac{1}{2}(p'_m + p'_f)$$

The difference between the frequencies in the two sexes is

$$p_f - p_m = \tfrac{1}{2}(p'_m + p'_f) - p'_f = -\tfrac{1}{2}(p'_f - p'_m)$$

i.e. half the difference in the previous generation, but in the other
direction. Therefore the distribution of the genes between the two
sexes oscillates, but the difference is halved in successive generations
and the population rapidly approaches an equilibrium in which the
frequencies in the two sexes are equal. The situation is illustrated
in Fig. 1.2, which shows the consequences of mixing females of one
sort (all $A_1A_1$) with males of another sort (all $A_2$) and letting them
breed at random.

[Figure: gene frequency plotted against generations, with separate
curves for females, for males, and for females and males combined.]

Fig. 1.2. Approach to equilibrium under random mating for a
sex-linked gene, showing the gene frequency among females,
among males, and in the two sexes combined. The population
starts with females all of one sort ($q_f = 1$), and males all of the
other sort ($q_m = 0$).
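
The oscillating approach to equilibrium can be followed numerically from the two recurrence relations given above. The Python sketch below is an illustration only: the function name is our own, and the starting point is that of Fig. 1.2, with the gene at frequency 1 in females and 0 in males.

```python
def sex_linked_approach(q_f, q_m, generations):
    """Follow the gene frequency in females (q_f) and males (q_m).

    Males take their frequency from the females of the previous generation;
    females take the mean of the two sexes of the previous generation.
    """
    history = [(q_f, q_m)]
    for _ in range(generations):
        q_f, q_m = 0.5 * (q_f + q_m), q_f
        history.append((q_f, q_m))
    return history


history = sex_linked_approach(1.0, 0.0, 6)
for generation, (q_f, q_m) in enumerate(history):
    combined = (2 * q_f + q_m) / 3      # two-thirds of the genes are in females
    print(generation, round(q_f, 4), round(q_m, 4), round(combined, 4))
```

The difference between the sexes changes sign and is halved at each step, while the combined frequency stays at two-thirds throughout.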

Example 1.7. Searle (1949) gives the frequencies of a number of
genes in a sample of cats in London. The animals examined were sent to
clinics for destruction; they were therefore not necessarily a random
sample. Among the genes studied was "yellow" (y) which is sex-linked
and for which all three genotypes in females are recognisable, the
heterozygote being tortoise-shell. The data were used to test for agreement with
Hardy-Weinberg equilibrium. The numbers observed in each phenotypic
class are shown in table (i). We may first see whether the gene frequency
is equal in the two sexes. The numbers of genes counted, and the
frequency ($q$) of the gene y, in each sex are as given in table (ii).

(i)                         Females                     Males
                      ++       +y       yy           +         y
Numbers observed     277       54        7          311        42
Numbers expected    269.6     64.5      3.9        315.2      37.8

(ii)               Number of + genes       q
in females               608             0.101
in males                 311             0.119
total                    919             0.107

The $\chi^2$ testing difference in $q$ between the sexes is 0.4 which is quite
insignificant. There is therefore no reason to think the population is not
in equilibrium, and we may take the estimate of gene frequency from both
sexes combined: it is $q = 0.107$. From this estimate of $q$ the expected
numbers in the different phenotypic classes are calculated; they are shown
in table (i). Only the females are relevant to the test of random mating.
The $\chi^2$ testing agreement between observed and expected numbers in
females is 4.4, with 2 degrees of freedom. This has a probability of 0.1 and
cannot be judged significant. The data are therefore compatible with the
Hardy-Weinberg equilibrium, in spite of the deficiency of tortoise-shell
females. If the deficiency of heterozygous females were real we might
attribute it to the method of sampling and infer that the tortoise-shells
were sent for destruction less often than the other colours, on account of
human preference.
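
The gene counting and the test on the females in Example 1.7 can be sketched in Python as follows. The variable names are our own. The sketch carries the unrounded estimate of $q$ through the calculation, so its $\chi^2$ comes out slightly above the 4.4 obtained from the rounded expected numbers in table (i); the test of the difference between the sexes is not attempted.

```python
# Numbers observed in each phenotypic class (table (i) of Example 1.7).
females = {"++": 277, "+y": 54, "yy": 7}
males = {"+": 311, "y": 42}

n_f = sum(females.values())          # each female carries two genes
n_m = sum(males.values())            # each male carries one gene

y_in_females = females["+y"] + 2 * females["yy"]
q_f = y_in_females / (2 * n_f)
q_m = males["y"] / n_m
q = (y_in_females + males["y"]) / (2 * n_f + n_m)   # both sexes combined

# Expected numbers of females under Hardy-Weinberg proportions.
expected_f = {
    "++": n_f * (1 - q) ** 2,
    "+y": n_f * 2 * q * (1 - q),
    "yy": n_f * q ** 2,
}
chi2_f = sum((females[g] - expected_f[g]) ** 2 / expected_f[g] for g in females)

print("q: females %.3f, males %.3f, combined %.3f" % (q_f, q_m, q))
print({g: round(e, 1) for g, e in expected_f.items()}, round(chi2_f, 2))
```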

More than one locus. The attainment of the equilibrium in
genotype frequencies after one generation of random mating is true
of all autosomal loci considered separately. But it is not true of the
genotypes with respect to two or more loci considered jointly. To
illustrate the point, consider a population made up of equal numbers
of $A_1A_1B_1B_1$ and $A_2A_2B_2B_2$ individuals, of both sexes. The gene
frequency at both loci is then $\frac{1}{2}$, and if the individuals mated at
random only three out of the nine genotypes would appear in the
progeny; the genotype $A_1A_1B_2B_2$, for example, would be absent though
its frequency in an equilibrium population would be $\frac{1}{16}$. The missing
genotypes appear in subsequent generations, but not immediately
at their equilibrium frequencies. The approach to equilibrium is
described by Li (1955a) and here we shall only outline the
conclusions.

Consider two loci each with two alleles, and let the frequencies of
the four types of gamete formed by the initial population be as fol-
lows:

type of gamete    $A_1B_1$    $A_1B_2$    $A_2B_1$    $A_2B_2$
frequency            r           s           t           u

Then if the population is in equilibrium, ru=st, as may be seen by
writing the gametic frequencies in terms of the gene frequencies.
The difference, ru - st, gives a measure of the extent of the departure
from equilibrium. This difference is halved in each successive genera-
tion of random mating, and the approach to equilibrium is thus fairly
rapid (see Fig. 1.3). If, however, more than two loci are to be con-
sidered jointly the approach to equilibrium becomes progressively
slower as the number of loci increases.
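
The halving of the difference $ru - st$ can be verified by following the gametes through random mating. The Python sketch below is our own illustration, not the treatment of Li: gametes unite at random, and each individual then produces its two parental types of gamete with total frequency $1 - c$ and the two recombinant types with total frequency $c$, where $c$ is a recombination frequency that we set to one half for unlinked loci. The starting population is the mixture of $A_1A_1B_1B_1$ and $A_2A_2B_2B_2$ individuals described above.

```python
from fractions import Fraction as F
from itertools import product

GAMETES = [("A1", "B1"), ("A1", "B2"), ("A2", "B1"), ("A2", "B2")]


def next_gametes(freq, c=F(1, 2)):
    """Gametic frequencies after one generation of random mating.

    freq maps each gamete type to its frequency; c is the recombination
    frequency (one half for unlinked loci).
    """
    new = {g: F(0) for g in GAMETES}
    for g1, g2 in product(GAMETES, repeat=2):
        zygote = freq[g1] * freq[g2]
        new[g1] += zygote * (1 - c) / 2             # parental gametes
        new[g2] += zygote * (1 - c) / 2
        new[(g1[0], g2[1])] += zygote * c / 2       # recombinant gametes
        new[(g2[0], g1[1])] += zygote * c / 2
    return new


def departure(freq):
    """The difference ru - st."""
    r, s, t, u = (freq[g] for g in GAMETES)
    return r * u - s * t


freq = {("A1", "B1"): F(1, 2), ("A1", "B2"): F(0),
        ("A2", "B1"): F(0), ("A2", "B2"): F(1, 2)}
d_values = []
for _ in range(4):
    d_values.append(departure(freq))
    freq = next_gametes(freq)
print(d_values)
```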

Linked loci. If two loci are linked the approach to equilibrium
under random mating is slower in proportion to the closeness of the
linkage. When equilibrium is reached the coupling and repulsion
phases are equally frequent; the frequencies of the gametic types then
depend only on the gene frequencies and not at all on the linkage. It
is easy to suppose that association between two characters, as for
example between hair colour and eye colour, is evidence of linkage
between the genes concerned. Association between characters,
however, is more often evidence of pleiotropy than of linkage. Link-
age can give rise to association only after a mixture of populations,
the length of time that the association persists depending on the
closeness of the linkage.

The approach to equilibrium after the mixture of populations
differing in respect of the genes at two linked loci can be described in
the manner of the preceding section. The departure from
equilibrium, $d$, is expressed as $d = ru - st$, where $ru$ is the frequency of
coupling heterozygotes and $st$ that of repulsion heterozygotes. If $c$
is the frequency of recombination between the two loci then the
difference, $d$, at generation $t$ is

$$d_t = (1 - c)\,d_{t-1}$$

Thus if, for example, there is 25 per cent recombination the difference
is reduced by one quarter in each generation; or if there is 10 per cent
recombination the difference is reduced by 10 per cent in each
generation. Closely linked loci will therefore continue for a
considerable time to show the effects of a past mixture of populations. The
approach to equality of coupling and repulsion phases with different
degrees of linkage is illustrated in Fig. 1.3.

[Figure: difference of frequency (d) plotted against generations, with
one graph for each recombination frequency.]

Fig. 1.3. Approach to equilibrium under random mating of two
loci, considered jointly. The graphs show the difference of
frequency (d) between coupling and repulsion heterozygotes in
successive generations, starting with all individuals repulsion
heterozygotes. The five graphs refer to different degrees of linkage
between the two loci, as indicated by the recombination frequency
shown alongside each graph. The graph marked .5 refers to
unlinked loci.
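
Repeated application of the relation between successive generations gives $d_t = (1 - c)^t d_0$, from which we can ask how long a past mixture remains detectable. The Python sketch below is illustrative only; the function names and the choice of "one hundredth of the initial difference" as a yardstick are our own.

```python
import math


def departure_after(d0, c, t):
    """Difference d after t generations: d_t = (1 - c)^t * d_0."""
    return d0 * (1 - c) ** t


def generations_to_reduce(c, fraction):
    """Whole generations needed before d falls to the given fraction of d_0."""
    return math.ceil(math.log(fraction) / math.log(1 - c))


for c in (0.5, 0.25, 0.1, 0.01):
    print("c = %.2f: %d generations to reach 1/100 of the initial difference"
          % (c, generations_to_reduce(c, 0.01)))
```

With unlinked loci ($c = 0.5$) seven generations suffice, whereas with 10 per cent recombination over forty are needed.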

Assortative mating. Assortative mating is a form of non-random
mating, but this is the most convenient place to mention it. If the
mated pairs tend to be of the same genotype more often than would
occur by chance this is called positive assortative mating, and if less
often it is called negative assortative (or sometimes disassortative)
mating. The consequences are described by Wright (1921) and
summarised by Li (1955a) and will be only briefly outlined here.
Positive assortative mating is of some importance in human populations,
where it occurs with respect to intelligence and other mental
characters. These however are not single gene differences such as can be
discussed in the present context. The consequences of assortative
mating with a single locus can be deduced from Table 1.1 by
appropriate modification of the frequencies of the types of mating to allow
for the increased frequency of matings between like genotypes. The
effect on the genotype frequencies among the progeny is to increase
the frequencies of homozygotes and reduce that of heterozygotes.
In effect the population becomes partially subdivided into two
groups, mating taking place more frequently within than between
the groups.

CHAPTER 2

CHANGES OF GENE FREQUENCY

We have seen that a large random-mating population is stable with
respect to gene frequencies and genotype frequencies, in the absence
of agencies tending to change its genetic properties. We can now
proceed to a study of the agencies through which changes of gene
frequency, and consequently of genotype frequencies, are brought
about. There are two sorts of process: systematic processes, which
tend to change the gene frequency in a manner predictable both in
amount and in direction; and the dispersive process, which arises in
small populations from the effects of sampling, and is predictable in
amount but not in direction. In this chapter we are concerned only
with the systematic processes, and we shall consider only large random-
mating populations in order to exclude the dispersive process from
the picture. There are three systematic processes: migration, mutation,
and selection. We shall study these separately at first, assuming that
only one process is operating at a time, and then we shall see how the
different processes interact.

Migration

The effect of migration is very simply dealt with and need not con-
cern us much here, though we shall have more to say about it later,
in connexion with small populations. Let us suppose that a large
population consists of a proportion, $m$, of new immigrants in each
generation, the remainder, $1 - m$, being natives. Let the frequency
of a certain gene be $q_m$ among the immigrants and $q_0$ among the
natives. Then the frequency of the gene in the mixed population, $q_1$,
will be

$$q_1 = mq_m + (1 - m)q_0 = m(q_m - q_0) + q_0 \tag{2.1}$$

The change of gene frequency, $\Delta q$, brought about by one generation
of immigration is the difference between the frequency before
immigration and the frequency after immigration. Therefore

$$\Delta q = q_1 - q_0 = m(q_m - q_0) \tag{2.2}$$

Thus the rate of change of gene frequency in a population subject to
immigration depends, as must be obvious, on the immigration rate
and on the difference of gene frequency between immigrants and
natives.
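
A short numerical sketch in Python may make the point concrete. The values of $q_0$, $q_m$ and $m$ are invented for the illustration, the function name is our own, and the continuation over ten generations assumes that the gene frequency among the immigrants stays constant.

```python
def after_migration(q0, q_m, m):
    """Equation 2.1: gene frequency after one generation of immigration."""
    return m * q_m + (1 - m) * q0


q0, q_m, m = 0.2, 0.6, 0.05          # illustrative values only
q1 = after_migration(q0, q_m, m)
delta_q = q1 - q0                    # equation 2.2: m * (q_m - q0)
print("after one generation: q = %.4f, change = %.4f" % (q1, delta_q))

# Continued immigration at the same rate from the same source.
q = q0
for _ in range(10):
    q = after_migration(q, q_m, m)
print("after ten generations: q = %.4f" % q)
```

Under that assumption the gap between natives and immigrants shrinks by the factor $1 - m$ in each generation.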

Mutation

The effect of mutation on the genetic properties of the population
differs according to whether we are concerned with a mutational
event so rare as to be virtually unique, or with a mutational step that
recurs repeatedly. The first produces no permanent change, whereas
the second does.

Non-recurrent mutation. Consider first a mutational event
that gives rise to just one representative of the mutated gene or
chromosome in the whole population. This sort of mutation is of
little importance as a cause of change of gene frequency, because the
product of a unique mutation has an infinitely small chance of
surviving in a large population, unless it has a selective advantage. This
can be seen from the following consideration. As a result of the single
mutation there will be one $A_1A_2$ individual in a population all the
rest of which is $A_1A_1$. The frequency of the mutated gene, $A_2$, is
therefore extremely low. Now according to the Hardy-Weinberg
equilibrium the gene frequency should not change in subsequent
generations. But with this situation we can no longer ignore the
variation of gene frequency due to sampling. With a gene at very low
frequency the sampling variation, even though very small, may take
the frequency to zero, and the gene will then be lost from the popu-
lation. Though at each generation a single gene has an equal chance
of surviving or being lost, the loss is permanent and the probability
of the gene still being present decreases with the passage of genera-
tions (see Li, 1955a). The conclusion, therefore, is that a unique
mutation without selective advantage cannot produce a permanent
change in the population.

Recurrent mutation. It is with the second type of mutation,
recurrent mutation, that we are concerned as an agent for causing
change of gene frequency. Each mutational event recurs regularly
with characteristic frequency, and in a large population the frequency
of a mutant gene is never so low that complete loss can occur from
sampling. We have, then, to find out what is the effect of this "pres-
sure" of mutation on the gene frequency in the population.

Suppose gene $A_1$ mutates to $A_2$ with a frequency $u$ per generation.
($u$ is the proportion of all $A_1$ genes that mutate to $A_2$ between one
generation and the next.) If the frequency of $A_1$ in one generation is
$p_0$ the frequency of newly mutated $A_2$ genes in the next generation is
$up_0$. So the new gene frequency of $A_1$ is $p_0 - up_0$, and the change of
gene frequency is $-up_0$. Now consider what happens when the genes
mutate in both directions. Suppose for simplicity that there are only
two alleles, $A_1$ and $A_2$, with initial frequencies $p_0$ and $q_0$. $A_1$ mutates
to $A_2$ at a rate $u$ per generation, and $A_2$ mutates to $A_1$ at a rate $v$.
Then after one generation there is a gain of $A_2$ genes equal to $up_0$ due
to mutation in one direction, and a loss equal to $vq_0$ due to mutation
in the other direction. Stated in symbols, we have the situation:

Mutation rate                 $A_1 \underset{v}{\overset{u}{\rightleftharpoons}} A_2$

Initial gene frequencies      $p_0$        $q_0$

Then the change of gene frequency in one generation is

$$\Delta q = up_0 - vq_0$$

It is easy to see that this situation leads to an equilibrium in gene
frequency at which no further change takes place, because if the
frequency of one allele increases fewer of the other are left to mutate
in that direction and more are available to mutate in the other
direction. The point of equilibrium can be found by equating the change
of frequency, $\Delta q$, to zero. Thus at equilibrium

$$pu = qv, \quad \text{or} \quad \frac{p}{q} = \frac{v}{u} \tag{2.3}$$

and

$$q = \frac{u}{u + v} \tag{2.4}$$
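
The approach to this equilibrium can be followed by applying the change $\Delta q = up - vq$ generation after generation. In the Python sketch below the function names are our own, and the mutation rates are deliberately exaggerated far beyond realistic values so that the equilibrium is reached in a manageable number of steps.

```python
def next_q(q, u, v):
    """Frequency of A2 after one generation of mutation in both directions."""
    p = 1 - q
    return q + u * p - v * q


def equilibrium_q(u, v):
    """Equation 2.4."""
    return u / (u + v)


u, v = 1e-2, 1e-3        # exaggerated rates, for illustration only
q = 0.0                  # start with no A2 genes at all
for _ in range(2000):
    q = next_q(q, u, v)

q_hat = equilibrium_q(u, v)
print("after 2000 generations: %.6f; equilibrium: %.6f" % (q, q_hat))
```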

Three conclusions can be drawn from the effect of mutation on
gene frequency. Measurements of mutation rates indicate values
ranging between about $10^{-4}$ and $10^{-8}$ per generation (one in ten
thousand and one in a hundred million gametes). With normal
mutation rates, therefore, mutation alone can produce only very slow
changes of gene frequency; on an evolutionary time-scale they might
be important, but they could scarcely be detected by experiment
unless with micro-organisms. The second conclusion concerns the
equilibrium between mutation in the two directions. Studies of
reverse mutation (from mutant to wild type) indicate that it is usually
less frequent than forward mutation (from wild type to mutant), on
the whole about one tenth as frequent (Muller and Oster, 1957).
The equilibrium gene frequencies for such loci, resulting from
mutation alone, would therefore be about 0.1 of the wild-type allele
and 0.9 of the mutant; in other words the "mutant" would be the
common form and the "wild type" the rare form. Since this is not
the situation we find in natural populations it is clear that the fre-
quencies of such genes are not the product of mutation alone. We
shall see in the next section that the rarity of mutant alleles is attribu-
table to selection. The third conclusion concerns the effects of an
increase of mutation rates such as might be caused by an increase of
the level of ionising radiation to which the population is subjected.
Any loci at which the gene frequencies are in equilibrium from the
effects of mutation alone will not be affected by a change of mutation
rate, provided the change affects forward and reverse mutation pro-
portionately. This can be seen from consideration of the equilibrium
gene frequencies given in equation 2.4.
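
The three conclusions can be given rough numerical form. Writing $\hat{q}$ for the equilibrium frequency of equation 2.4, the change $\Delta q = up - vq$ can be rearranged as $\Delta q = (u + v)(\hat{q} - q)$, so that the distance from equilibrium is multiplied by $1 - u - v$ in each generation. This rearrangement, the function names, and the mutation rate of $10^{-5}$ used below are our own additions for the purpose of illustration.

```python
import math


def equilibrium_q(u, v):
    """Equation 2.4: equilibrium frequency of the mutant allele."""
    return u / (u + v)


def generations_to_halve_gap(u, v):
    """Generations needed to halve the distance from equilibrium,
    given that the distance is multiplied by (1 - u - v) per generation."""
    return math.log(0.5) / math.log(1 - u - v)


u = 1e-5          # forward mutation rate, illustrative
v = u / 10        # reverse mutation one tenth as frequent

print("generations to halve the gap: %.0f" % generations_to_halve_gap(u, v))
print("equilibrium frequency of mutant: %.3f" % equilibrium_q(u, v))
print("with both rates ten times greater: %.3f" % equilibrium_q(10 * u, 10 * v))
```

With these rates tens of thousands of generations are needed to cover half the distance to equilibrium, and a proportionate increase of both rates leaves the equilibrium itself where it was.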

Selection

Hitherto we have supposed that all individuals in the population
contribute equally to the next generation. Now we must take account
of the fact that individuals differ in viability and fertility, and that
they therefore contribute different numbers of offspring to the next
generation. The proportionate contribution of offspring to the next
generation is called the fitness of the individual, or sometimes the
adaptive value, or selective value. If the differences of fitness are in
any way associated with the presence or absence of a particular gene
in the individual's genotype, then selection operates on that gene.
When a gene is subject to selection its frequency in the offspring is
not the same as in the parents, since parents of different genotypes
pass on their genes unequally to the next generation. In this way
selection causes a change of gene frequency, and consequently also of
genotype frequency. The change of gene frequency resulting from
selection is more complicated to describe than that resulting from
mutation, because the differences of fitness that give rise to the
selection are an aspect of the phenotype. We therefore have to take
account of the degree of dominance shown by the genes in question.
Dominance, in this connexion, means dominance with respect to
fitness, and this is not necessarily the same as the dominance with
respect to the main visible effects of the gene. Most mutant genes, for
example, are completely recessive to the wild type in their visible
effects, but this does not necessarily mean that the heterozygote has a
fitness equal to that of the wild-type homozygote. The meaning of
the different degrees of dominance with which we shall deal is
illustrated in Fig. 2.1.

[Figure: positions of the genotypes $A_1A_1$, $A_1A_2$ and $A_2A_2$ on a
scale of fitness under different degrees of dominance, including
complete dominance and overdominance.]

Fig. 2.1. Degrees of dominance with respect to fitness.

It is most convenient to think of selection acting against the gene
in question, in the form of selective elimination of one or other of the
genotypes that carry it. This may operate either through reduced
viability or through reduced fertility in its widest sense, including
mating ability. In either case the outcome is the same: the genotype
selected against makes a smaller contribution of gametes to form
zygotes in the next generation. We may therefore treat the change of
gene frequency as taking place between the counting of genotypes
among the zygotes of the parent generation and the formation of
zygotes in the offspring generation. The intensity of the selection is
expressed as the coefficient of selection, $s$, which is the proportionate
reduction in the gametic contribution of a particular genotype
compared with a standard genotype, usually the most favoured. The
contribution of the favoured genotype is taken to be 1, and the
contribution of the genotype selected against is then $1 - s$. This
expresses the fitness of one genotype compared with the other.
Suppose, for example, that the coefficient of selection is $s = 0.1$; this
means that for every 100 zygotes produced by the favoured genotype,
only 90 are produced by the genotype selected against.

The fitness of a genotype with respect to any particular locus is
not necessarily the same in all individuals. It depends on the en-
vironmental circumstances in which the individual lives, and also on
the genotype with respect to genes at other loci. When we assign a
certain fitness to a genotype, this refers to the average fitness in the
whole population. Though differences of fitness between individuals
result in selection being applied to many, perhaps to all, loci simul-
taneously, we shall limit our attention here to the effects of selection
on the genes at a single locus, supposing that the average fitness of the
different genotypes remains constant despite the changes resulting
from selection applied simultaneously to other loci. The conclusions
we shall reach apply equally to natural selection occurring under
natural conditions without the intervention of man, and to artificial
selection imposed by the breeder or experimenter through his choice
of individuals as parents and through the number of offspring he
chooses to rear from each parent.

Change of gene frequency under selection. We have first to
derive the basic formulae for the change of gene frequency brought
about by one generation of selection. Then we can consider what they
tell us about the effectiveness of selection. The different conditions
of dominance have to be taken account of, but the method is the same
for all, and we shall illustrate it by reference to the case of complete
dominance with selection acting against the recessive homozygote.
Let the genes $A_1$ and $A_2$ have initial frequencies $p$ and $q$, $A_1$ being
completely dominant to $A_2$, and let the coefficient of selection against
$A_2A_2$ individuals be $s$. Multiplying the initial frequency by the fitness
of each genotype we obtain the proportionate contribution of each
genotype to the gametes that will form the next generation, thus:

Genotypes                $A_1A_1$    $A_1A_2$    $A_2A_2$         Total
Initial frequencies      $p^2$       $2pq$       $q^2$            1
Fitness                  1           1           $1 - s$
Gametic contribution     $p^2$       $2pq$       $q^2(1 - s)$     $1 - sq^2$

Note that the total gametic contribution is no longer unity, because
there has been a proportionate loss of sq² due to the selection. To
find the frequency of A2 gametes produced, and so the frequency of
A2 genes in the progeny, we take the gametic contribution of A2A2
individuals plus half that of A1A2 individuals and divide by the new
total, i.e. we apply equation 1.1. Thus the new gene frequency is

$$ q_1 = \frac{q^2(1-s) + pq}{1 - sq^2} \tag{2.5} $$

The change of gene frequency, Δq, resulting from one generation of
selection is

$$ \Delta q = q_1 - q = \frac{q^2(1-s) + pq}{1 - sq^2} - q $$

which on simplification reduces to

$$ \Delta q = -\frac{sq^2(1-q)}{1 - sq^2} \tag{2.6} $$

From this we see that the effect of selection on gene frequency de-
pends not only on the intensity of selection, s, but also on the initial
gene frequency. But both relationships are somewhat complex, and
the examination of their significance will be postponed till after the
other situations have been dealt with.
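
The simplification that leads from equation 2.5 to equation 2.6 is stated without its intermediate steps. One way to fill them in, using nothing beyond p = 1 - q, is sketched here:

```latex
\begin{aligned}
\Delta q &= q_1 - q
          = \frac{q^2(1-s) + pq - q\,(1 - sq^2)}{1 - sq^2} \\
         &= \frac{q^2 - sq^2 + q(1-q) - q + sq^3}{1 - sq^2}
            \qquad (\text{putting } p = 1 - q) \\
         &= \frac{-sq^2 + sq^3}{1 - sq^2}
          = -\frac{sq^2(1-q)}{1 - sq^2}
\end{aligned}
```

The terms q² and q cancel in the second line, so that only the two terms containing s survive.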

Selection may act against the dominant phenotype and favour the
recessive: we then put 1 - s for the fitness of A1A1 and of A1A2 geno-
types. The expression for Δq is given in Table 2.1. The difference
may best be appreciated by considering the effects of total elimination
(s = 1). The expression for selection against the dominant allele then
reduces to Δq = 1 - q, which expresses the fact that if only the reces-
sive genotype survives to breed the frequency of the recessive allele
will become 1 after a single generation of selection. But, on the other
hand, if there is complete elimination of the recessive genotype the
frequency of the dominant allele does not reach 1 after a single
generation. The difference between the effects of selection in oppo-
site directions becomes less marked as the value of s decreases.


If there is incomplete dominance the expression for Δq is again
different. The case of exact intermediate dominance is given in
Table 2.1. Here we put 1 - ½s for the fitness of A1A2, and 1 - s for
the fitness of A2A2 genotype. For selection in the opposite direction
in this case we need only interchange the initial frequencies of the
two alleles, writing p in the place of q.

Table 2.1

Change of gene frequency, Δq, after one generation of selection
under different conditions of dominance specified in Fig. 2.1.

Conditions of dominance     Initial frequencies and          Change of frequency,
and selection               fitness of the genotypes         Δq, of gene A2

                            A1A1      A1A2      A2A2
                            p²        2pq       q²

No dominance,               1         1 - ½s    1 - s        -½sq(1 - q) / (1 - sq)               (1)
selection against A2

Complete dominance,         1         1         1 - s        -sq²(1 - q) / (1 - sq²)              (2)
selection against A2A2

Complete dominance,         1 - s     1 - s     1            +sq²(1 - q) / [1 - s(1 - q²)]        (3)
selection against A1-

Overdominance,              1 - s1    1         1 - s2       +pq(s1p - s2q) / (1 - s1p² - s2q²)   (4)
selection against
A1A1 and A2A2

When s is small the denominators differ little from 1, and the numerators
alone can be taken to represent Δq sufficiently accurately for most purposes.
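
The four expressions in Table 2.1 all come from the same procedure: weight each genotype frequency by its fitness, then count the A2 gametes. As a rough check, the sketch below applies that procedure numerically and compares it with the closed forms. The function and argument names (`delta_q`, `w11`, `expr_1` and so on) are illustrative choices and are not taken from the text.

```python
def delta_q(q, w11, w12, w22):
    """Exact one-generation change in q, the frequency of A2.

    w11, w12 and w22 are the fitnesses of A1A1, A1A2 and A2A2
    (argument names chosen for this sketch).
    """
    p = 1.0 - q
    # Gametic contribution = initial genotype frequency x fitness.
    c11 = p * p * w11
    c12 = 2.0 * p * q * w12
    c22 = q * q * w22
    total = c11 + c12 + c22
    # A2 gametes: all of the A2A2 contribution plus half of the A1A2 one.
    q_new = (c22 + 0.5 * c12) / total
    return q_new - q


# Closed-form expressions (1) to (4) of Table 2.1.
def expr_1(q, s):
    """No dominance, selection against A2."""
    return -0.5 * s * q * (1 - q) / (1 - s * q)


def expr_2(q, s):
    """Complete dominance, selection against A2A2."""
    return -s * q**2 * (1 - q) / (1 - s * q**2)


def expr_3(q, s):
    """Complete dominance, selection against the dominant phenotype."""
    return s * q**2 * (1 - q) / (1 - s * (1 - q**2))


def expr_4(q, s1, s2):
    """Overdominance, selection against both homozygotes."""
    p = 1 - q
    return p * q * (s1 * p - s2 * q) / (1 - s1 * p**2 - s2 * q**2)


if __name__ == "__main__":
    q, s = 0.3, 0.2
    print(delta_q(q, 1, 1 - s / 2, 1 - s), expr_1(q, s))
    print(delta_q(q, 1, 1, 1 - s), expr_2(q, s))
    print(delta_q(q, 1 - s, 1 - s, 1), expr_3(q, s))
    print(delta_q(q, 1 - 0.1, 1, 1 - 0.3), expr_4(q, 0.1, 0.3))
```

Each printed pair should agree to within rounding error, since the closed forms are only algebraic rearrangements of the general calculation.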

Finally, selection may favour the heterozygote, a condition known
as overdominance. In this case we put 1 - s1 and 1 - s2 for the fitness
of the two homozygotes. The expression for Δq is given in Table 2.1.
This special case will be given more detailed attention later. The
different conditions of dominance to which the expressions in
Table 2.1 refer are illustrated diagrammatically in Fig. 2.1. Let us
now see what these equations tell us about the effectiveness of selec-
tion.

Effectiveness of selection. We see from the formulae that the
effectiveness of selection, i.e. the magnitude of Δq, depends on the
initial gene frequency, q. The nature of this relationship is best
appreciated from graphs showing Δq at different values of q. Fig. 2.2

Fig. 2.2. Change of gene frequency, Δq, under selection of intensity s = 0.2, at
different values of initial gene frequency, q. Upper figure: a gene with no domi-
nance. Lower figure: a gene with complete dominance. The graphs marked
(-) refer to selection against the gene whose frequency is q, so that Δq is nega-
tive. The graphs marked (+) refer to selection in favour of the gene, so that
Δq is positive. (From Falconer, 1954a; reproduced by courtesy of the editor of
the International Union of Biological Sciences.)


shows these graphs for the cases of no dominance and complete
dominance. They also distinguish between selection in the two
directions. A value of s = 0.2 was chosen for the coefficient of selec-
tion because, for reasons given in Chapter 12, this seems to be the
right order of magnitude for the coefficient of selection operating on
genes concerned with metric characters in laboratory selection experi-
ments. First we may note that with this value of s there is never a
great difference in Δq according to the direction of selection. The
two important points about the effectiveness of selection that these
graphs demonstrate are: (i) Selection is most effective at intermediate
gene frequencies and becomes least effective when q is either large or
small, (ii) Selection for or against a recessive gene is extremely
ineffective when the recessive allele is rare. This is the consequence
of the fact, noted earlier, that when a gene is rare it is represented
almost entirely in heterozygotes.
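
Both points can be checked numerically from expressions (1) and (2) of Table 2.1. The sketch below scans q over a grid for s = 0.2 and reports where the magnitude of Δq is greatest. The grid search and the function names are assumptions made for illustration; they are not the method used to draw Fig. 2.2.

```python
def dq_no_dominance(q, s):
    """Expression (1) of Table 2.1: no dominance, selection against A2."""
    return -0.5 * s * q * (1 - q) / (1 - s * q)


def dq_complete_dominance(q, s):
    """Expression (2) of Table 2.1: selection against the recessive A2A2."""
    return -s * q**2 * (1 - q) / (1 - s * q**2)


def most_effective_q(dq, s, steps=1000):
    """Grid value of q (excluding 0 and 1) at which |dq| is largest."""
    grid = [i / steps for i in range(1, steps)]
    return max(grid, key=lambda q: abs(dq(q, s)))


S = 0.2  # the coefficient of selection used for Fig. 2.2

for q in (0.01, 0.1, 0.5, 0.9, 0.99):
    print(f"q = {q:<5} no dominance: {dq_no_dominance(q, S):+.5f}   "
          f"complete dominance: {dq_complete_dominance(q, S):+.5f}")

print("largest |dq|, no dominance, at q =",
      most_effective_q(dq_no_dominance, S))
print("largest |dq|, complete dominance, at q =",
      most_effective_q(dq_complete_dominance, S))
```

With the denominators ignored the maxima fall at q = 1/2 for no dominance and q = 2/3 for complete dominance, and the exact values lie close to these. At q = 0.01 the change for the recessive case is far smaller than for the gene with no dominance, which is point (ii).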

Another way of looking at the effect of the initial gene frequency on
the effectiveness of selection is to plot a graph showing the course of
selection over a number of generations, starting from one or other
extreme. Such graphs are shown in Fig. 2.3. They were constructed
directly from those of Fig. 2.2, and refer again to a coefficient of
selection, s = 0.2. They show that the change due to selection is at
first very slow, whether one starts from a high or a low initial gene
frequency; it becomes more rapid at intermediate frequencies and
falls off again at the end. In the case of a fully dominant gene one is
chiefly interested in the frequency of the homozygous recessive
genotype, i.e. q². For this reason the graph shows the effect of selec-
tion on q² instead of on q.
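
The course of selection over many generations can be sketched by applying the one-generation calculation repeatedly. The example below assumes selection against A2 with s = 0.2, a starting frequency of 0.99 and a target of 0.01; these end points, and the function names, are illustrative and are not read off Fig. 2.3.

```python
def next_q(q, w11, w12, w22):
    """Frequency of A2 after one generation of selection."""
    p = 1.0 - q
    total = p * p * w11 + 2 * p * q * w12 + q * q * w22
    return (q * q * w22 + p * q * w12) / total


def generations_to_reach(q_start, q_target, fitnesses, max_gen=100_000):
    """Generations of selection against A2 needed to bring q down to q_target."""
    q, t = q_start, 0
    while q > q_target and t < max_gen:
        q = next_q(q, *fitnesses)
        t += 1
    return t


S = 0.2
NO_DOMINANCE = (1.0, 1.0 - S / 2, 1.0 - S)   # fitnesses of A1A1, A1A2, A2A2
RECESSIVE = (1.0, 1.0, 1.0 - S)              # A2 recessive, selected against

for label, fitnesses in (("no dominance", NO_DOMINANCE),
                         ("complete dominance", RECESSIVE)):
    t = generations_to_reach(0.99, 0.01, fitnesses)
    print(f"{label}: {t} generations from q = 0.99 to q = 0.01")
```

The recessive case takes many more generations, almost all of them spent at low q, where the recessive allele is hidden in heterozygotes.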

It is often useful to express the change of gene frequency, Δq,
under selection in a simplified form, which is a sufficiently good
approximation for many purposes. If either the coefficient of selec-
tion, s, or the gene frequency, q, is small, then the denominators of
the equations in Table 2.1 become very nearly unity, and we can use
the numerators alone as expressions for Δq. Then for selection in
either direction we have, with no dominance:

$$ \Delta q = \pm\tfrac{1}{2}sq(1-q) \quad \text{(approx.)} \tag{2.7} $$

and with complete dominance:

$$ \Delta q = \pm sq^2(1-q) \quad \text{(approx.)} \tag{2.8} $$
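
To see how much is lost by dropping the denominators, the sketch below compares equations 2.7 and 2.8 with the exact expressions for selection against A2 and reports the relative error of the approximation. The function names and the string labels `"none"` and `"complete"` are choices made for this example.

```python
def exact_and_approx(q, s, dominance):
    """Magnitude of dq from Table 2.1 (exact) and from eq. 2.7 / 2.8 (approx.)."""
    if dominance == "none":
        numerator, denominator = 0.5 * s * q * (1 - q), 1 - s * q
    elif dominance == "complete":
        numerator, denominator = s * q**2 * (1 - q), 1 - s * q**2
    else:
        raise ValueError("dominance must be 'none' or 'complete'")
    return numerator / denominator, numerator


def relative_error(q, s, dominance):
    """Relative error incurred by using the numerator alone."""
    exact, approx = exact_and_approx(q, s, dominance)
    return abs(approx - exact) / exact


for s in (0.01, 0.2):
    for q in (0.05, 0.5, 0.95):
        print(f"s = {s:<5} q = {q:<5} "
              f"none: {relative_error(q, s, 'none'):.4f}   "
              f"complete: {relative_error(q, s, 'complete'):.4f}")
```

Algebraically the relative error is sq with no dominance and sq² with complete dominance, so it can never exceed s, and it is small whenever either s or q is small.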

Fig. 2.3. Change of gene frequency during the course of selection from one
extreme to the other. Intensity of selection, s = 0.2. Upper figure: a gene with
no dominance. Lower figure: a gene with complete dominance, q being the
frequency of the recessive allele and q² that of the recessive homozygote. The
graphs marked (-) refer to selection against the gene whose frequency is q, so
that q or q² decreases. The graphs marked (+) refer to selection in favour of the
gene, so that q or q² increases. (From Falconer, 1954a; reproduced by courtesy
of the editor of the International Union of Biological Sciences.)


Example 2.1. As an example of the change of gene frequency under
selection we shall take the case of a sex-linked gene, in spite of the added
complication, because there is no well documented case of an autosomal
gene. Fig. 2.4 shows the change of the frequency of the recessive sex-
linked gene "raspberry" in Drosophila melanogaster over a period of about
eighteen generations, described by Merrell (1953). The population was
started with a gene frequency of 0.5 in both sexes, and was therefore in

Fig. 2.4. Change of gene frequency under natural selection in
the laboratory, as described in Example 2.1. (Data from Merrell,
1953.)

equilibrium at the beginning (see p. 17). Counts were made at about
monthly intervals, and the gene frequency in both sexes combined (by
equation 1.3) is shown against the scale of days in the figure. Measure-
ments of fitness were made by comparison of the relative viability of
mutant and wild-type phenotypes, and of their relative success in mating.
No differences of viability were detected, nor of the success of females in


mating. But mutant males were only 50 per cent as successful as wild-
type males in mating. The changes of gene frequency expected on the
basis of this difference of fitness were then calculated generation by
generation, and these calculated values are shown in the figure by the
smooth curve, plotted against the scale of generations. From a similar
experiment with a different mutant it was found that the calculated and
observed curves coincided if a period of 24 days was taken as the interval
between generations. For this reason 24 days to a generation was taken as
the basis for superimposing the curves shown here. Since the calculated
curve was to this extent made to fit the observed, the good agreement
between the two cannot be taken as proof that selection operated only
through the males' success in mating. But the similarity in their shapes
illustrates well how the change of gene frequency is rapid at first, tails off
as the gene frequency becomes lower, and becomes very slow when it
approaches zero.
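
A generation-by-generation calculation of the kind described in this example can be sketched as follows. This is not Merrell's own computation. It is a minimal model built on stated assumptions: discrete generations, a recessive sex-linked gene, no selection in females, and mutant males having half the mating success of wild-type males. The frequencies in the two sexes are combined with weights of 2/3 for females and 1/3 for males, on the assumption that this is what equation 1.3 expresses. All function names are illustrative.

```python
def next_generation(q_f, q_m, male_success=0.5):
    """Advance the gene frequency in females (q_f) and males (q_m) one generation.

    male_success is the mating success of mutant males relative to wild type.
    """
    # Frequency of the gene among the X chromosomes transmitted by fathers,
    # after weighting mutant males by their reduced mating success.
    q_sires = male_success * q_m / (1 - (1 - male_success) * q_m)
    # Daughters receive one X from each parent; sons receive their X from the mother.
    return 0.5 * (q_f + q_sires), q_f


def combined(q_f, q_m):
    """Gene frequency in both sexes together: females carry two X chromosomes."""
    return (2 * q_f + q_m) / 3


def trajectory(generations=18, q_start=0.5, male_success=0.5):
    """Combined gene frequency at generation 0, 1, ..., generations."""
    q_f = q_m = q_start
    values = [combined(q_f, q_m)]
    for _ in range(generations):
        q_f, q_m = next_generation(q_f, q_m, male_success)
        values.append(combined(q_f, q_m))
    return values


for generation, q in enumerate(trajectory()):
    print(generation, round(q, 4))
```

Setting `male_success=1.0` leaves the frequency unchanged, which is a quick check that the decline in this sketch comes only from the males' mating disadvantage.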

Number of generations required. How many generations of
selection would be needed to effect a specified change of gene fre-
quency? An answer to this question is sometimes required in con-
nexion with breeding programmes or proposed eugenic measures.
We shall here consider only the case of selection against a recessive
when elimination of the unwanted homozygote is complete, i.e. s = 1.
This would apply to natural selection against a recessive lethal, and
artificial selection against an unwanted recessive in a breeding pro-
gramme. We shall also, for the moment, suppose that there is no
mutation. We had in equation 2.5 an expression for the new gene
frequency after one generation of selection against a recessive.
Substituting s = 1 in this equation and writing q0, q1, q2, ..., qt for the
gene frequency after 0, 1, 2, ..., t generations of selection we have

$$ q_1 = \frac{q_0}{1 + q_0} $$

and

$$ q_2 = \frac{q_1}{1 + q_1} = \frac{q_0}{1 + 2q_0} $$

by substituting for q1 and simplifying. So in general

$$ q_t = \frac{q_0}{1 + tq_0} \tag{2.9} $$


and the number of generations, t, required to change the gene
frequency from q0 to qt is

$$ t = \frac{q_0 - q_t}{q_0 q_t} = \frac{1}{q_t} - \frac{1}{q_0} \tag{2.10} $$

We may use this formula to illustrate the point already made, that
when the frequency of a recessive gene is low selection is very slow
to change it.

Example 2.2. It is sometimes suggested, as a eugenic measure, that
those suffering from serious inherited defects should be prevented from
reproducing, since in this way the frequency of such defects would be
reduced in future generations. Before deciding whether the proposal is a
good one we ought to know what it would be expected to achieve. We
cannot properly discuss this problem without taking mutation into ac-
count, as we shall do later; the answer we get ignoring mutation, as we do
now, shows what is the best that could be hoped for. Let us take albinism
as an example, though it cannot be regarded as a very serious defect, and
ask the question: how long would it take to reduce its frequency to half the
present value? The present frequency is about 1/20,000, and this makes
q0 = 1/141, as we saw in Example 1.4. The objective is q² = 1/40,000, which
makes qt = 1/200. So, from equation 2.10, t = 200 - 141 = 59 generations.
With 25 years to a generation it would take nearly 1500 years to achieve
this modest objective. More serious recessive defects are generally even
less common than albinism and with them elimination would be still
slower.
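
The arithmetic of this example, and the agreement between equation 2.9 and repeated application of the one-generation step, can be checked with a short sketch. The function names are illustrative, and the albinism frequencies are those quoted in the example.

```python
import math


def generations_needed(q0, qt):
    """Equation 2.10 (s = 1): generations needed to go from q0 to qt."""
    return 1.0 / qt - 1.0 / q0


def q_after(q0, t):
    """Equation 2.9 (s = 1): gene frequency after t generations."""
    return q0 / (1 + t * q0)


def iterate(q0, t):
    """Apply the one-generation step q -> q / (1 + q) t times."""
    q = q0
    for _ in range(t):
        q = q / (1 + q)
    return q


# Albinism: present frequency of homozygotes about 1/20,000, objective 1/40,000.
q0 = math.sqrt(1 / 20_000)
qt = math.sqrt(1 / 40_000)
t = generations_needed(q0, qt)

print(f"q0 = 1/{1 / q0:.0f}, qt = 1/{1 / qt:.0f}")
print(f"t = {t:.1f} generations, i.e. {math.ceil(t)} whole generations")
print(f"at 25 years to a generation: about {25 * math.ceil(t)} years")
print("closed form and iteration agree:",
      math.isclose(q_after(q0, 59), iterate(q0, 59)))
```

The closed form works because each generation of complete selection adds exactly 1 to the reciprocal of the gene frequency, so that 1/qt = 1/q0 + t.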

Balance between mutation and selection. Having described
the effects of mutation and selection separately we must now compare
them and consider them jointly. Which is the more effective process
in causing change of gene frequency? Is it reasonable to attribute the
low frequency of deleterious genes that we find in natural popula-
tions to the balance between mutation tending to increase the fre-
quency and selection tending to decrease it? The expressions already
obtained for the change of gene frequency under mutation or selec-
tion alone show that both depend on the initial gene frequency, but in
different ways. Mutation to a particular gene is most effective in
increasing its frequency when the mutant gene is rare (because there


are more of the unmutated genes to mutate); but selection is least
effective when the gene is rare. The relative effectiveness of the two
processes depends therefore on the gene frequency, and if both pro-
cesses operate for long enough a state of equilibrium will eventually
be reached. So we must find what the gene frequency will be when
equilibrium is reached. This is done by equating the two expressions
for the change of gene frequency, because at equilibrium the change
due to mutation will be equal and opposite to the change due to
selection.

Let us consider first a fully recessive gene with frequency q,
mutation rate to it u, and from it v; and selection coefficient against it
s. Then from equations 2.3 and 2.6 we have at equilibrium

$$ u(1-q) - vq = \frac{sq^2(1-q)}{1 - sq^2} \tag{2.11} $$

This equation is too complicated to give a clear answer to our ques-
tion. But we can make two simplifications with only a trivial sacrifice
of accuracy. We are specifically interested in genes at low equilibrium
frequencies. If q is small the term vq representing back mutation is
relatively unimportant and can be neglected; and we can use the
approximate expression (equation 2.8) for the selection effect.
Making these simplifications we have the equilibrium condition for
selection against a recessive gene

$$ u(1-q) = sq^2(1-q) \quad \text{(approx.)} $$

$$ u = sq^2 \quad \text{(approx.)} \tag{2.12} $$

$$ q = \sqrt{u/s} \quad \text{(approx.)} \tag{2.13} $$

For a gene with no dominance similar reasoning from equation (1)
in Table 2.1 gives the equilibrium condition

$$ q = \frac{2u}{s} \quad \text{(approx.)} \tag{2.14} $$

Finally, consider selection against a completely dominant gene, the
frequency of the dominant gene being 1 - q, and the mutation rate
to it being v. In this case 1 - q is very small and the term u(1 - q) in
equation 2.11 is negligible. We have therefore at equilibrium


$$ vq = sq^2(1-q) \quad \text{(approx.)} $$

$$ q(1-q) = \frac{v}{s} \quad \text{(approx.)} $$

$$ \text{or} \quad H = \frac{2v}{s} \quad \text{(approx.)} \tag{2.15} $$

where H is the frequency of heterozygotes. If the mutant gene is
rare H is very nearly the frequency of the mutant phenotype in the
population.
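
The two simplifications made for the recessive case are said to cost little accuracy. One way to check that claim is to solve equation 2.11 numerically and compare the root with the approximation of equation 2.13. The sketch below does this by bisection. The parameter values are illustrative, and the search interval assumes that the equilibrium lies below q = 0.5, which holds when s is much larger than u.

```python
def net_change(q, u, v, s):
    """Mutation gain minus selection loss for a recessive gene (eq. 2.11)."""
    mutation = u * (1 - q) - v * q
    selection = s * q**2 * (1 - q) / (1 - s * q**2)
    return mutation - selection


def equilibrium_recessive(u, v, s, lo=0.0, hi=0.5, iterations=200):
    """Root of equation 2.11 by bisection.

    Assumes the net change is positive at lo and negative at hi.
    """
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if net_change(mid, u, v, s) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


u, v, s = 1e-5, 1e-6, 0.1      # illustrative values, not from the text
exact = equilibrium_recessive(u, v, s)
approx = (u / s) ** 0.5        # equation 2.13

print(f"root of equation 2.11:          q = {exact:.6f}")
print(f"approximation by equation 2.13: q = {approx:.6f}")
print(f"no dominance, equation 2.14:    q = {2 * u / s:.6f}")
```

For these values the exact and approximate equilibria differ by well under one per cent.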

Example 2.3. If the equilibrium state is accepted as applicable, we
can use it to get an estimate of the mutation rate of dominant abnormalities
for which the coefficient of selection is known. Among some human
examples described by Haldane (1949) is the case of dominant dwarfism
(chondrodystrophy) studied in Denmark. The frequency of dwarfs was
estimated at 10.7 × 10⁻⁵, and their fitness (1 - s) at 0.196. The estimate of
fitness was made from the number of children produced by dwarfs com-
pared with their normal sibs. The mutation rate, by equation 2.15,
comes out at 4.3 × 10⁻⁵. Though there is a possibility of serious error in
the estimate of frequency owing to prenatal mortality of dwarfs, the
mutation rate is almost certainly estimated within the right order of magni-
tude. For a discussion of the estimation of mutation rates in man see
Crow (1956).
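
The arithmetic behind this estimate can be written out as follows, taking H to be the observed frequency of dwarfs, as the remark following equation 2.15 allows when the mutant gene is rare:

```latex
v = \tfrac{1}{2} sH
  = \tfrac{1}{2} \times (1 - 0.196) \times 10.7 \times 10^{-5}
  = \tfrac{1}{2} \times 0.804 \times 10.7 \times 10^{-5}
  \approx 4.3 \times 10^{-5}
```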

These expressions for the equilibrium gene frequency under the
joint action of mutation and selection show that the gene frequency
can have any value at equilibrium, depending on the relative magni-
tude of the mutation rate and the coefficient of selection. But if
mutation rates are of the order of magnitude commonly accepted,
i.e. 10⁻⁵, or thereabouts, then only a mild selection against the mutant
gene will be needed to hold it at a very low equilibrium frequency.
For example, the following are the equilibrium frequencies of a
recessive gene and of the recessive homozygote under various intensi-
ties of selection if the mutation rate is 10⁻⁵:

s  =    0.001     0.01      0.1       0.5

q  =    0.1       0.03      0.01      0.0045

q² =    0.01      0.001     0.0001    2 × 10⁻⁵

Thus, if a gene mutates at the rate of 10⁻⁵, a selective disadvantage of
10 per cent is enough to hold the frequency of the recessive homo-
zygote at one in ten thousand; and a 50 per cent disadvantage will


hold it at one in fifty thousand. It is quite clear therefore that the
low frequency of deleterious mutants in natural populations is in
accord with what would be expected from the joint action of mutation
and selection. A further conclusion is that mutation alone is most
unlikely to be a cause of evolutionary change. It is not mutation, but
selection, that chiefly determines whether a gene spreads through the
population or remains a rare abnormality, unless the mutation rate
is very much higher than seems to be the rule.
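
The equilibrium frequencies quoted above for a recessive gene follow directly from equations 2.12 and 2.13. A minimal sketch that recomputes them for a mutation rate of 10⁻⁵ is given here; the function name is illustrative.

```python
def equilibrium_q(u, s):
    """Approximate equilibrium frequency of a recessive gene (equation 2.13)."""
    return (u / s) ** 0.5


U = 1e-5  # mutation rate assumed in the text

for s in (0.001, 0.01, 0.1, 0.5):
    q = equilibrium_q(U, s)
    print(f"s = {s:<6} q = {q:.4f}   q^2 = {q * q:.1e}")
```

The homozygote frequency q² is simply u/s, which is why a tenfold increase in s reduces it tenfold.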

Let us now briefly consider two questions of social importance
concerning the balance between selection and mutation: the effect of
an increase of mutation rate, and the effect of a change in the intensity
of selection against deleterious mutants. These questions are more
fully discussed by Crow (1957).

Increase of mutation rate. Since the products of mutation are
predominantly deleterious, the process of mutation has a harmful
effect on a proportion of the individuals in a population. When an
individual dies or fails to reproduce in consequence of the reduced
fitness of its genotype, we may refer to this as a "genetic death." An
increase in the frequency of genetic deaths would reduce the poten-
tial reproductive rate and might thus reduce the speed with which a
species could multiply in an unoccupied territory. But when the
numbers of adults are held constant by density-dependent factors,
even quite a high frequency of genetic deaths will not affect the
ability of the population to perpetuate itself, especially if the repro-
ductive rate is high, because the death of some individuals leaves room
for others that would otherwise have died from lack of food or some
other cause. There is a species of Drosophila, for example (D.
tropicalis, from Central America), in which 50 per cent of individuals
in a certain locality suffer genetic death, and yet the population
flourishes (Dobzhansky and Pavlovsky, 1955). In species with low
reproductive rates the frequency of genetic deaths is of greater conse-
quence, particularly in ourselves, where the death of every individual
is a matter of concern. Let us therefore consider what effect is to be
expected from an increase of mutation rate such as might be caused
by an increase in the amount of ionising radiation to which human
populations are exposed.

Let us take the case of a recessive gene with a mutation rate (to it)
of u, the gene being in equilibrium at a frequency of q. Then, if the
coefficient of selection against the homozygote is s, the frequency of
genetic deaths is sq². This is the proportionate loss due to selection,


as shown on p. 29, and it is equal to u, by equation 2.12. Thus the
frequency of genetic deaths, when equilibrium has been attained,
depends on the mutation rate alone, and is not influenced by the
degree of harmfulness of the gene. The reason for this apparent para-
dox is that the more harmful genes come to equilibrium at lower
frequencies.

Now, if the mutation rate is increased, and maintained at the new
level, the gene will begin to increase toward a new point of equili-
brium at which sq² will be equal to the new mutation rate. Thus if
the mutation rate were doubled the frequency of genetic deaths would
also be doubled, when the new equilibrium had been reached. But
the approach to the new equilibrium would be very slow. The change
of gene frequency in the first generation is approximately

$$ \Delta q = u(1-q) - sq^2(1-q) $$

u being the new mutation rate (from equations 2.3 and 2.8, but
neglecting back mutation). To see what this means let us take a
mutation rate of 10⁻⁵ as being probably representative of many loci,
and let us suppose that this was doubled. We may with sufficient
accuracy take 1 - q as unity. Since the gene starts at the old
equilibrium, sq² is still equal to the old mutation rate, 10⁻⁵. Then

$$ \Delta q = 2 \times 10^{-5} - 10^{-5} = 10^{-5} $$

The immediate effect of the increase of mutation rate would there-
fore be very small indeed.
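
The slowness of the approach to the new equilibrium can be illustrated by iterating the expression for Δq given above. The text does not specify a selection coefficient, so the value s = 0.1 used below is purely hypothetical, as are the function names.

```python
def step(q, u, s):
    """One generation: q plus u(1 - q) - s q^2 (1 - q), back mutation neglected."""
    return q + (u - s * q * q) * (1 - q)


def run(q, u, s, generations):
    """Gene frequency after the given number of generations."""
    for _ in range(generations):
        q = step(q, u, s)
    return q


S = 0.1                       # hypothetical coefficient of selection
U_OLD, U_NEW = 1e-5, 2e-5     # mutation rate before and after doubling

q_old = (U_OLD / S) ** 0.5    # old equilibrium, equation 2.13
q_new = (U_NEW / S) ** 0.5    # new equilibrium, equation 2.13

first_change = step(q_old, U_NEW, S) - q_old
print(f"old equilibrium q = {q_old:.5f}, new equilibrium q = {q_new:.5f}")
print(f"change in the first generation = {first_change:.2e}")
for generations in (10, 100, 1000, 5000):
    print(generations, f"{run(q_old, U_NEW, S, generations):.5f}")
```

The first-generation change is close to 10⁻⁵, as in the text. The eventual doubling of genetic deaths, sq², is reached only after the gene frequency has crept up to the new equilibrium.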

Change of selection intensity. Intensification of selection is
sometimes advocated as a eugenic measure in human populations,
on the grounds that if sufferers from genetic defects were prevented
from breeding the frequency of the defects would be reduced. We
saw from Example 2.2 that the effect of selection against a recessive
defect is very slow indeed, even when mutation is ignored. The true
situation is even worse. We cannot reduce the frequency of an
abnormality, whether dominant or recessive, below the new equili-
brium frequency. The serious defects have already a fairly strong
natural selection working on them, and the addition of artificial
selection can do no more than make the coefficient of selection, s,
equal to 1. This would probably seldom do more than double the
present coefficient of selection, and the incidence of defects would be
reduced to not less than half their present values (equations 2.13,
2.14, 2.15). With a dominant gene the effect would be immediate,


but with a recessive the approach to the new equilibrium would be
extremely slow.

The situation with respect to recessives is complicated by the
fact that deleterious recessives are certainly not at their equilibrium
frequencies in present-day human populations (Haldane, 1939).
The reason is that modern civilisation has reduced the degree of
subdivision (i.e. inbreeding) and so reduced the frequency of homo-
zygotes, as will be explained in the next chapter. In consequence
both the gene frequencies and the homozygote frequencies are below
their equilibrium values, and must be presumed to be at present
increasing slowly toward new equilibria at higher values.

Perhaps the converse of the question posed above is one that
should give us more concern, namely the consequences of the reduced
intensity of natural selection under modern conditions. Minor
genetic defects, such as colour-blindness, must presumably have had
some selective disadvantage in the past but now have very little, if
any, effect on fitness. Moreover, the development and extension of
medical treatment prolongs the lives of many people with diseases
that have at least some degree of genetic causation through genes that
increase susceptibility. This relaxation of the selection operating on
minor genetic defects and against genes concerned in the causation of
disease suggests that the frequencies of these genes will increase
toward new equilibria at higher values. If this is true we must expect
the incidence of minor genetic defects to increase in the future, and
also the proportion of people who need medical treatment for a
variety of diseases. By applying humanitarian principles for our own
good now we are perhaps laying up a store of inconvenience for our
descendants in the distant future.

Selection favouring heterozygotes. We have considered the
effects of selection operating on genes that are partially or fully
dominant with respect to fitness; but, though the appropriate for-
mula was given in Table 2.1, we have not yet discussed the conse-
quences of overdominance with respect to fitness; that is, when the
heterozygote has a higher fitness than either homozygote. At first
sight it may seem rather improbable that selection should favour the
heterozygote of two alleles rather than one or other of the homo-
zygotes, but there are reasons for thinking that this in fact is not at all
an uncommon situation. Let us first examine the consequences of
this form of selection, and then consider the evidence of its occur-
rence in nature.


Selection operating on a gene with partial or complete dominance
tends toward the total elimination of one or other allele, the final gene
frequency, in the absence of mutation, being 0 or 1. When selection
favours the heterozygote, however, the gene frequency tends toward
an equilibrium at an intermediate value, both alleles remaining in the
population, even without mutation. The reason is as follows. The
change of gene frequency after one generation was given in Table 2.1
as being

$$ \Delta q = \frac{pq(s_1 p - s_2 q)}{1 - s_1 p^2 - s_2 q^2} $$

The condition for equilibrium is that Δq = 0, and this is fulfilled when
s1p = s2q. The gene frequencies at this point of equilibrium are
therefore

$$ \frac{p}{q} = \frac{s_2}{s_1} \quad \text{or} \quad q = \frac{s_1}{s_1 + s_2} \tag{2.16} $$

Now, if q is greater than its equilibrium value (but not 1), and p
therefore less, s1p will be less than s2q, and Δq will be negative; that is
to say q will decrease. Similarly if q is less than its equilibrium value
(but not 0) it will increase. Therefore when the gene frequency has
any value, except 0 or 1, selection changes it toward the intermediate
point of equilibrium given in equation 2.16, and both alleles remain
permanently in the population. Three or more alleles at a locus are
maintained in the same way, provided the heterozygote of any pair
is superior in fitness to both homozygotes of that pair (Kimura,
1956). A feature of the equilibrium worthy of note is that the gene
frequency depends not on the degree of superiority of the hetero-
zygote but on the relative disadvantage of one homozygote compared
with that of the other. Therefore there is a point of equilibrium at
some more or less intermediate gene frequency whenever a hetero-
zygote is superior to both the homozygotes, no matter by how little.
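
The stability of this equilibrium can be illustrated by iterating expression (4) of Table 2.1 from starting frequencies on either side of it. The coefficients s1 = 0.1 and s2 = 0.3 are hypothetical values chosen for this sketch, as are the function names.

```python
def next_q(q, s1, s2):
    """Frequency of A2 after one generation of selection favouring heterozygotes."""
    p = 1 - q
    dq = p * q * (s1 * p - s2 * q) / (1 - s1 * p * p - s2 * q * q)
    return q + dq


def run(q, s1, s2, generations=2000):
    """Gene frequency after the given number of generations."""
    for _ in range(generations):
        q = next_q(q, s1, s2)
    return q


S1, S2 = 0.1, 0.3             # hypothetical selection against A1A1 and A2A2
q_hat = S1 / (S1 + S2)        # equilibrium predicted by equation 2.16

print(f"predicted equilibrium q = {q_hat:.4f}")
for q_start in (0.01, 0.5, 0.99):
    print(f"start {q_start:<5} -> {run(q_start, S1, S2):.4f}")
# The boundaries 0 and 1 are also unchanged by selection, but nothing
# moves toward them: any intermediate start ends at q_hat.
print(run(0.0, S1, S2), run(1.0, S1, S2))
```

Because only the ratio of s1 to s2 enters equation 2.16, dividing both coefficients by ten leaves the equilibrium where it is and only slows the approach to it.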

Our previous consideration of genes with complete dominance
showed that the balance between selection and mutation satisfactorily
accounts for the presence of deleterious genes at low frequencies,
causing the appearance of rare abnormal, or mutant, individuals.
Genes at intermediate frequencies, however, are common in very
many species, and the presence of these cannot satisfactorily be


accounted for in this way. But the intermediate frequencies are just
what would be expected if selection favoured the heterozygotes.
The existence in a population of individuals with readily discernible
differences caused by genes at intermediate frequencies is referred to
as polymorphism. The blood group differences of man are perhaps
the best known examples, but antigenic differences are found also in
many other species and are probably universal in animals. More
striking forms of polymorphism are the colour varieties found in
many species, particularly among insects, snails, and fishes. The
genes causing polymorphism have usually no obvious advantage of
one allele over another, all the genotypes being essentially normal, or
"wild-type," individuals. In these circumstances, as we noted above,
only a very slight superiority of the heterozygote would be sufficient
to establish an equilibrium at an intermediate gene frequency. The
properties of the genes concerned with polymorphism seem, there-
fore, to accord well with the hypothesis that selection is operating on
them in favour of the heterozygotes, and this is generally conceded to
be the most probable reason for their intermediate frequencies. As a
general cause of polymorphism, however, it cannot be taken as fully
proved, because the superior fitness of heterozygotes has been
demonstrated in relatively few cases, and there are other possible
reasons for the existence of polymorphism. For example, the genes
might be in a transitional stage of a change from one extreme to the
other as a result of slow environmental change; or the intermediate
frequencies might be the point of equilibrium between mutation in
opposite directions, with virtually no selective advantage of one allele
over the other. But these explanations seem improbable, particularly
as some polymorphisms are known to be of very long standing. The
polymorphism of shell colours in the land snail Cepaea nemoralis, for
example, goes back to Neolithic times (Cain and Sheppard, 1954a).
Another possible cause of polymorphism lies in the heterogeneity of
the environment in which a population lives. If the differences of
environment influence the selection coefficients in such a way that
one allele is favoured in some conditions and another allele in other
conditions, then polymorphism may result provided that mating is
not entirely at random over the range of environments. (See Levene,
1953; Li, 1955b; Mather, 1955a; Waddington, 1957.)

If heterozygotes are indeed superior in fitness, one naturally
wants to enquire into the nature of their superiority. Unfortunately,
however, very little is known about this, though evidence is accumu-
lating, in the case of the human blood groups, that certain blood groups
are associated with an increased susceptibility to certain diseases
(Roberts, 1957); group O, for example, with duodenal ulcer and group
A with pernicious anaemia. If one states this the other way round
and says that the other alleles confer increased resistance to these
diseases, then it is not unreasonable to suppose that each allele
increases resistance to different diseases, and that the presence of two
alleles increases the resistance to two different diseases, thereby
giving a selective advantage to the heterozygote.

Another question of interest concerns the evolutionary signifi-
cance of polymorphism. Is it an "adaptive" feature of a species?
Does it, in other words, confer some advantage over a population
without it? Some think that it does. (See, particularly, Dobzhansky,
1951b). Others, however, point out that the average fitness of a
population with polymorphism resulting from superior fitness of
heterozygotes is less than that of a population in which a single
allele performs the same function as the two different alleles in the
heterozygote (Cain and Sheppard, 1954b). On this view, polymor-
phism is a situation that, once established, is perpetuated by selection
between individuals within the population, but is a disadvantage to
the population as a whole in competition with another population
lacking the polymorphism.

The foregoing account of polymorphism leaves many problems
unsolved, and does little more than sketch the outlines of a most
interesting aspect of the genetics of populations. In particular, we
have not mentioned the extensive and detailed investigations of poly-
morphism in respect of inverted segments of chromosomes found in
species of Drosophila and, to a lesser extent, in some other animals
and plants. For a description of these studies, and also for a fuller
general account of polymorphism, the reader must be referred to
Dobzhansky (1951a). We conclude by giving one example of poly-
morphism where the nature of the superiority of heterozygotes is
clear. Other cases are described by Dobzhansky (1951a), Ford
(1953), Lerner (1954), and Sheppard (1958).

Example 2.4. Sickle-cell anaemia (Allison, 1955). There is a gene,
found in American negroes and in the indigenous East Africans, which
causes the formation of an abnormal type of haemoglobin. Homozygotes
suffer from an anaemia, characterised by the "sickle" shape of the erythro-
cytes; it is a severe disease from which many die. All the haemoglobin of
homozygotes is of the abnormal type, though there is a variable admixture


of foetal haemoglobin. Heterozygotes do not suffer from anaemia, but
they can be recognised by the presence of sickle cells if the haemoglobin
is deoxygenated. About 35 per cent of their haemoglobin is of the ab-
normal type. With respect to haemoglobin synthesis, therefore, the sickle-
cell gene is partially dominant, though with respect to the anaemia it is
recessive, and with respect to fitness it has been proved to be over-
dominant. In routine surveys the few surviving homozygotes are not
readily distinguished from heterozygotes; we shall refer to the combined
heterozygotes and surviving homozygotes as "abnormals." The frequency
of abnormals varies very much with the locality: in American negroes it
is about 9 per cent, and in different parts of Africa it varies from zero up
to a maximum of about 40 per cent. In view of the severe disability of the
homozygotes it is impossible to account for these high frequencies unless
the heterozygotes have a quite substantial selective advantage over the
normal homozygotes. The nature of this selective advantage has been
shown to be connected with resistance to malaria. Heterozygotes are less
susceptible to malaria than normal homozygotes, and the frequency of
abnormals in different areas is correlated with the prevalence of malaria.
Let us work out the gene frequency corresponding with the maximum
frequency of 40 per cent abnormals, and then find the magnitude of the
selective advantage of heterozygotes necessary to maintain this gene
frequency in equilibrium.

If the gene frequency is in equilibrium it will be the same after selec-
tion has taken place as it was before. Therefore, if we assume that all the
selection takes place before adulthood — an assumption that is not very far
from the truth — we can estimate the gene frequency from the genotype
frequencies in the adult population. But it is first necessary to know what
proportion of abnormals are homozygotes. This has been estimated as
being approximately 2.9 per cent (Allison, 1954). Thus, when the frequency
of abnormals is 0.4, the frequency of homozygotes is 0.012, and
that of heterozygotes is 0.388. The gene frequency, then, by equation 1.1,
is the frequency of homozygotes plus half the frequency of heterozygotes,
which comes to q = 0.206. If this gene frequency is the equilibrium value
maintained by natural selection favouring the heterozygotes, and if we
assume mating to be random, then the gene frequency is related to the
selection coefficients by equation 2.16. The fitness of sickle-cell homo-
zygotes, relative to that of heterozygotes, has been estimated from a
comparison of viability and fertility as being approximately 0.25. Therefore
the coefficient of selection against homozygotes is s2 = 0.75. Substituting
this value of s2, and the value of q found above, in equation 2.16
gives s1 = 0.197. This is the coefficient of selection against normal
homozygotes, relative to heterozygotes. If we want to express the selective
advantage of heterozygotes as the superiority of heterozygotes, relative to
normal homozygotes, we may do so, since the fitness of heterozygotes
relative to normal homozygotes is 1/(1 - s1). This is 1.24. Thus the selective
advantage to be attributed to the resistance of heterozygotes to malaria,
if these are the forces holding the gene in equilibrium, is 24 per cent.
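
The arithmetic of this example can be retraced in a few lines of Python.
This is only a sketch: it assumes that equation 2.16 has the form
q = s1/(s1 + s2), which is not restated in this example, and the variable
names are ours. Carrying unrounded intermediate values gives a value of s1
slightly different from the figure quoted above, though the relative fitness
still rounds to 1.24.

```python
# Worked arithmetic for Example 2.4 (sickle-cell gene).
# Assumption: equation 2.16 has the form q = s1 / (s1 + s2).

abnormals = 0.40          # maximum frequency of "abnormals" among adults
homozygote_share = 0.029  # proportion of abnormals that are homozygotes
s2 = 1 - 0.25             # selection coefficient against sickle-cell homozygotes

homozygotes = abnormals * homozygote_share   # about 0.012
heterozygotes = abnormals - homozygotes      # about 0.388

# Equation 1.1: gene frequency = homozygotes + half the heterozygotes.
q = homozygotes + heterozygotes / 2

# Rearranging q = s1 / (s1 + s2) gives s1 = q * s2 / (1 - q).
s1 = q * s2 / (1 - q)

# Fitness of heterozygotes relative to normal homozygotes.
relative_fitness = 1 / (1 - s1)

print(f"q = {q:.3f}")
print(f"s1 = {s1:.3f}")
print(f"relative fitness = {relative_fitness:.2f}")
```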

The presence of the sickle-cell gene in American negroes can be
attributed to their African origin. The gene's present frequency of 0.046,
deduced in the manner described above, can be accounted for partly by
racial mixture and partly by the change of habitat which, removing the
advantage of heterozygotes, has exposed the gene to the full power of the
selection against homozygotes.

As an example of polymorphism the sickle-cell gene is not altogether
typical, because the differences of fitness are rather large and one of the
genotypes is clearly abnormal. But it illustrates in an exaggerated form
the nature of the selective forces that are presumed to underlie the more
usual forms of polymorphism.

CHAPTER 3

SMALL POPULATIONS:

I. Changes of Gene Frequency under
Simplified Conditions

We have now to consider the last of the agencies through which gene
frequencies can be changed. This is the dispersive process, which
differs from the systematic processes in being random in direction,
and predictable only in amount. In order to exclude this process
from the previous discussions we have postulated always a "large"
population, and we have seen that in a large population the gene
frequencies are inherently stable. That is to say, in the absence of
migration, mutation, or selection, the gene and genotype frequencies
remain unaltered from generation to generation. This property of
stability does not hold in a small population, and the gene frequencies
are subject to random fluctuations arising from the sampling of
gametes. The gametes that transmit genes to the next generation
carry a sample of the genes in the parent generation, and if the sample
is not large the gene frequencies are liable to change between one
generation and the next. This random change of gene frequency
is the dispersive process.

The dispersive process has, broadly speaking, three important
consequences. The first is differentiation between sub-populations.
The inhabitants of a large area seldom in nature constitute a single
large population, because mating takes place more often between
inhabitants of the same region. Natural populations are therefore
more or less subdivided into local groups or sub-populations, and the
sampling process tends to cause genetic differences between these, if
the number of individuals in the groups is small. Domesticated or
laboratory populations, in the same way, are often subdivided — for
example, into herds or strains — and in them the subdivision and its
resultant differentiation are often more marked. The second con-
sequence is a reduction of genetic variation within a small population.
The individuals of the population become more and more alike in
genotype, and this genetic uniformity is the reason for the widespread
use of inbred strains of laboratory animals in physiological and allied
fields of research. (An inbred strain, it may be noted, is a small
population.) The third consequence of the dispersive process is an
increase in the frequency of homozygotes at the expense of hetero-
zygotes. This, coupled with the general tendency for deleterious
alleles to be recessive, is the genetic basis of the loss of fertility and
viability that almost always results from inbreeding. To explain
these three consequences of the dispersive process is the chief purpose
of this chapter.

There are two different ways of looking at the dispersive process
and of deducing its consequences. One is to regard it as a sampling
process and to describe it in terms of sampling variance. The other
is to regard it as an inbreeding process and describe it in terms of the
genotypic changes resulting from matings between related indi-
viduals. Of these, the first is probably the simpler for a description
of how the process works, but the second provides a more convenient
means of stating the consequences. The plan to be followed here is
first to describe the general nature of the dispersive process from the
point of view of sampling. This will show how the three chief con-
sequences come about. Then we shall approach the process afresh
from the point of view of inbreeding, and show how the two view-
points connect with each other. In all this we shall confine our
attention to the simplest possible situation, excluding migration,
mutation, and selection. Thus we shall see what happens in small
populations in the absence of other factors influencing gene frequency.
In the next chapter we shall extend the conclusions to more realistic
situations, by removing the restrictive simplifications, and we shall in
particular consider the joint effects of the dispersive process and the
systematic processes. Finally, in Chapter 5, we shall consider the
special cases of pedigreed populations, and very small populations
maintained by regular systems of close inbreeding.

The Idealised Population

In order to reduce the dispersive process to its simplest form we
imagine an idealised population as follows. We suppose there to be
initially one large population in which mating is random, and this
population becomes subdivided into a large number of sub-popula-
tions. The subdivision might arise from geographical or ecological
causes under natural conditions, or from controlled breeding in
domesticated or laboratory populations. The initial random-mating
population will be referred to as the base population, and the sub-
populations will be referred to as lines. All the lines together consti-
tute the whole population, and each line is a "small population" in
which gene frequencies are subject to the dispersive process. When a
single locus is under discussion we cannot properly understand what
goes on in one line except by considering it as one of a large number
of lines. But what happens to the genes at one locus in a number of
lines happens equally to those at a number of loci in one line, pro-
vided they all start at the same gene frequency. So the consequences
of the process apply equally to a single line provided we consider
many loci in it.

The simplifying conditions specified for the idealised population
are the following:

1. Mating is restricted to members of the same line. The lines
are thus isolated in the sense that no genes can pass from one line to
another. In other words migration is excluded.

2. The generations are distinct and do not overlap.

3. The number of breeding individuals in each line is the same for
all lines and in all generations. Breeding individuals are those that
transmit genes to the next generation.

4. Within each line mating is random, including self-fertilisation
in random amount.

5. There is no selection at any stage.

6. Mutation is disregarded.

[Fig. 3.1. Diagrammatic representation of the subdivision of a
single large population (the base population) into a number of
sub-populations, or lines. The diagram runs from the base population
(N = ∞) in generation 0, through samples of 2N gametes, to the
breeding individuals of generations 1 and 2 in each line.]

The situation implied by these conditions is represented diagrammatically
in Fig. 3.1, and may be described thus: All breeding
individuals contribute equally to a pool of gametes from which zygotes
will be formed. Union of gametes is strictly random. Out of a
potentially large number of zygotes only a limited number survive to
become breeding individuals in the next generation, and this is the
stage at which the sampling of the genes transmitted by the gametes
takes place. Survival of zygotes is random, and consequently the
contribution of the parents to the next generation is not uniform, but
varies according to the chances of survival of their progeny. Since
the population size is constant from generation to generation, the
average number of progeny that reach breeding age is one per
individual parent or two per mated pair of parents. For any particular
zygote the chance of survival is small, and therefore the number of
progeny contributed by individual parents, or by pairs of parents, has
a Poisson distribution.

The following symbols will be used in connexion with the
idealised population.

N = the number of breeding individuals in each line and generation.
    This is the population size.
t = time, in generations, starting from the base population at t0.
q = frequency of a particular allele at a locus.
p = 1 - q = frequency of all other alleles at that locus. q and p refer
    to the frequencies in any one line; q̄ and p̄ refer to the frequencies
    in the whole population and are the means of q and p;
    q0 and p0 are the frequencies in the base population.

Sampling

Variance of gene frequency. The change of gene frequency
resulting from sampling is random in the sense that its direction is
unpredictable. But its magnitude can be predicted in terms of the
variance of the change. Consider the formation of the lines from
the base population. Each line is formed from a sample of N in-
dividuals drawn from the base population. Since each individual
carries two genes at a locus, the sub-division of the population
represents a series of samples each of 2N genes, drawn at random
from the base population. The gene frequencies in these samples
will have an average value equal to that in the base population, i.e.
q0, and will be distributed about this mean with a variance p0q0/2N,
which is simply the variance of a ratio, the sample size being in this
case 2N. Thus the change of gene frequency, Δq, resulting from
sampling in one generation, can be stated in terms of its variance as

$$\sigma^2_{\Delta q} = \frac{p_0 q_0}{2N} \tag{3.1}$$

This variance of Δq expresses the magnitude of the change of gene
frequency resulting from the dispersive process. It expresses the
expected change in any one line, or the variance of gene frequencies
that would be found among many lines after one generation. Its
effect is a dispersion of gene frequencies among the lines; in other
words the lines come to differ in gene frequency, though the mean
in the population as a whole remains unchanged.
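
As a check on equation 3.1, the variance of gene frequency among samples
of 2N genes can be computed directly from the binomial probabilities. The
sketch below assumes simple binomial sampling, as described above; the
function name and the values chosen for N and q0 are hypothetical and serve
only as an illustration.

```python
from math import comb

def sampling_variance_exact(n_individuals, q0):
    """Exact mean and variance of gene frequency among samples of 2N genes."""
    genes = 2 * n_individuals
    p0 = 1 - q0
    mean = 0.0
    mean_sq = 0.0
    for k in range(genes + 1):
        # Probability that a sample of 2N genes contains k copies of the allele.
        prob = comb(genes, k) * q0**k * p0**(genes - k)
        freq = k / genes
        mean += prob * freq
        mean_sq += prob * freq**2
    # Variance = mean of squares minus square of the mean.
    return mean, mean_sq - mean**2

N = 6      # hypothetical number of breeding individuals per line
q0 = 0.3   # hypothetical gene frequency in the base population

mean_q, var_q = sampling_variance_exact(N, q0)
formula = q0 * (1 - q0) / (2 * N)   # equation 3.1

print(mean_q, var_q, formula)
```

The mean of the sample frequencies should agree with q0, and the variance
with the value given by equation 3.1, apart from rounding error.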

In the next generation the sampling process is repeated, but each
line now starts from a different gene frequency and so the second
sampling leads to a further dispersion. The variance of the change
now differs among the lines, since it depends on the gene frequency,
q1, in the first generation of each line separately. The effect of continued
sampling through successive generations is that each line
fluctuates irregularly in gene frequency, and the lines spread apart pro-
gressively, thus becoming differentiated. The erratic changes of gene
frequency shown by the individual lines are exemplified in Fig. 3.2;
and the consequent differentiation, or spreading apart, of the lines
in Fig. 3.3. These changes of gene frequency resulting from sampling
in small populations are known as random drift (Wright, 1931).

[Fig. 3.2. Random drift of the colour gene "non-agouti" in three
lines of mice, each maintained by 6 pairs of parents per generation.
Horizontal axis: generations. (Original data.)]

[Fig. 3.3. Distributions of gene frequencies in 19 consecutive
generations among 105 lines of Drosophila melanogaster, each of 16
individuals. The gene frequencies refer to two alleles at the
"brown" locus (bw75 and bw), with initial frequencies of 0.5. The
height of each black column shows the number of lines having the
gene frequency shown on the scale below (number of bw75 genes).
(From Buri, 1956; reproduced by courtesy of the author and the
editor of Evolution.)]

[Fig. 3.4. Variance of gene frequencies among lines in the experiment
illustrated in Fig. 3.3. The circles are the observed values,
and the smooth curve shows the expected variance as given by
equation 3.2. The value taken for N is 11.5, which is the "effective
number," Ne, as explained in the next chapter. Horizontal axis:
generations. (Data from Buri, 1956.)]

As the dispersive process proceeds, the variance of gene frequency
among the lines increases, as shown in Fig. 3.4. At any generation, t,
the variance of gene frequencies, $\sigma_q^2$, among the lines is as follows
(see Crow, 1954):

$$\sigma_q^2 = p_0 q_0 \left[ 1 - \left( 1 - \frac{1}{2N} \right)^t \right] \tag{3.2}$$

Since the mean gene frequency among all the lines remains unchanged,
q̄ = q0. We may note a fact that will be needed later, and is obvious
from equation 3.2, namely that $\sigma_q^2 = \sigma_p^2$. The dispersion of the gene
frequencies, which we have described by reference to one locus in
many lines, could equally well be described by reference to the
frequencies at a number of different loci in one line, provided they all
started from the same initial frequency, and were unlinked.
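
The progressive dispersion described by equation 3.2 can be illustrated by
simulating a number of idealised lines. The following is a minimal sketch,
not a reconstruction of any experiment: each generation is formed by drawing
2N genes independently at the current frequency of the line, and the number
of lines, the population size, and the random seed are all hypothetical
choices.

```python
import random

def expected_variance(q0, n, t):
    """Equation 3.2: variance of gene frequency among lines at generation t."""
    return q0 * (1 - q0) * (1 - (1 - 1 / (2 * n)) ** t)

def simulate_lines(q0, n, generations, n_lines, seed=1):
    """Simulate drift in many idealised lines; return their final frequencies."""
    rng = random.Random(seed)
    genes = 2 * n
    finals = []
    for _line in range(n_lines):
        q = q0
        for _gen in range(generations):
            # Each of the 2N genes of the next generation is an independent
            # draw from the gamete pool of the current generation.
            count = sum(1 for _gene in range(genes) if rng.random() < q)
            q = count / genes
        finals.append(q)
    return finals

q0, n, t = 0.5, 8, 10   # hypothetical values
finals = simulate_lines(q0, n, t, n_lines=2000)

mean_q = sum(finals) / len(finals)
var_q = sum((q - mean_q) ** 2 for q in finals) / len(finals)

print("mean gene frequency:", mean_q)
print("observed variance:  ", var_q)
print("equation 3.2:       ", expected_variance(q0, n, t))
```

With a large number of lines the observed variance should lie close to the
expected value, while the mean gene frequency stays near q0.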

Fixation. There are limits to the spreading apart of the lines that
can be brought about by the dispersive process. The gene frequency
cannot change beyond the limits of 0 or 1, and sooner or later each
line must reach one or other of these limits. Moreover, the limits are
"traps" or points of no return, because once the gene frequency has
reached 0 or 1 it cannot change any more in that line. When a
particular allele has reached a frequency of 1 it is said to be fixed in
that line, and when it reaches a frequency of 0 it is lost. When an
allele reaches fixation no other allele can be present in that line, and
the line may then be said to be fixed. When a line is fixed all indi-
viduals in it are of identical genotype with respect to that locus.
Eventually all lines, and all loci in a line, become fixed. The individuals
of a line are then genetically identical, and this is the basis of
the genetic uniformity of highly inbred strains.

The proportion of the lines in which different alleles at a locus are
fixed is equal to the initial frequencies of the alleles. If the base
population contains two alleles A1 and A2 at frequencies p0 and q0
respectively, then A1 will be fixed in the proportion p0 of the lines,
and A2 in the remaining proportion, q0. The variance of the gene
frequency among the lines is then p0q0, as may be seen from equation
3.2 by putting t equal to infinity. (In Fig. 3.3 the lines in which
fixation or loss has just occurred are shown, but not those in which it
occurred earlier.)
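
That the proportion of lines in which an allele is eventually fixed equals
its initial frequency can be verified numerically without simulation, by
following the exact probability distribution of gene frequency from one
generation to the next. The sketch below is our own illustration under the
binomial sampling scheme of the idealised population; the function names,
the number of genes, and the starting frequency are hypothetical.

```python
from math import comb

def transition_matrix(genes):
    """matrix[i][j] = probability of passing from i to j copies of the allele
    among `genes` genes in one generation of binomial sampling."""
    matrix = []
    for i in range(genes + 1):
        q = i / genes
        row = [comb(genes, j) * q**j * (1 - q)**(genes - j)
               for j in range(genes + 1)]
        matrix.append(row)
    return matrix

def distribution_after(genes, start_copies, generations):
    """Distribution of the number of copies after the given generations."""
    matrix = transition_matrix(genes)
    dist = [0.0] * (genes + 1)
    dist[start_copies] = 1.0
    for _ in range(generations):
        dist = [sum(dist[i] * matrix[i][j] for i in range(genes + 1))
                for j in range(genes + 1)]
    return dist

genes = 16        # 2N, for a hypothetical line of N = 8
start_copies = 4  # initial frequency q0 = 4/16 = 0.25
dist = distribution_after(genes, start_copies, 600)

lost, fixed = dist[0], dist[-1]
variance = sum(p * (j / genes) ** 2 for j, p in enumerate(dist)) \
    - (start_copies / genes) ** 2

print("fixed:", fixed, "lost:", lost, "variance:", variance)
```

After many generations the fixed proportion should approach q0, the lost
proportion p0, and the variance among lines p0q0.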

When concerned with the attainment of genetic uniformity one
wants to know how soon fixation takes place; what is the probability
of a particular locus being fixed, or what proportion of all loci in a
line will be fixed, after a certain number of generations. Considera-
tion of the progressive nature of the dispersion, as illustrated in Fig.
3.3, will show that fixation does not start immediately; the dispersion of
gene frequencies must proceed some way before any line is likely to
reach fixation. To deduce the probability of fixation is mathematically
complicated (see particularly Wright, 1931; Kimura, 1955), and
only an outline of the conclusions can be given here. There are two
phases in the dispersive process: during the initial phase the gene
frequencies are spreading out from the initial value; this leads to a
steady phase, when the gene frequencies are evenly spread out over
the range between the two limits, and all gene frequencies except the
two limits are equally probable. The duration of the initial phase
in generations is a small multiple of the population size, depending
on the initial gene frequency. With q0 = 0.5 it lasts about 2N generations,
and with q0 = 0.1 it lasts about 4N generations (Kimura, 1955).
(In the experiment illustrated in Fig. 3.3 it lasted till about the
seventeenth generation.) The theoretical distributions of gene
frequency during the initial phase, with original frequencies of 0.5
and 0.1, are shown in Fig. 3.5.

[Fig. 3.5. Theoretical distributions of gene frequency among lines.
The initial and mean gene frequency is 0.5 in the left hand figure,
and 0.1 in the right hand figure. Previously fixed lines are excluded.
N = population size; T = time in generations. Curves are shown for
T = N/10, N/5, N/2, N, 2N, and 3N; the horizontal axis is gene
frequency, from 0.0 to 1.0. Note the general
agreement of the left hand figure with the observed distributions
shown in Fig. 3.3. (From Kimura, 1955; reproduced by courtesy
of the author and the editor of the Proc. Nat. Acad. Sci. Wash.)]

To visualise the process one might think of a pile of dry sand in a
narrow trough open at the two ends. Agitation of the trough will
cause the pile to spread out along the trough, till eventually it is
evenly spread along its length. Toward the end of the spreading out
some of the sand will have fallen off the ends of the trough, and this
represents fixation and loss. Continued agitation after the sand is
evenly spread will cause it to fall off the ends at a steady rate, and the
depth of sand left in the trough will be continually reduced at a
steady rate until in the end none is left. The initial gene frequency is
represented by the position of the initial pile of sand. If it is near one
end of the trough, much of the sand will have fallen off that end before
any reaches the other end, and the total amount falling off each
end will be in proportion to the relative distance of the initial pile
from the two ends. Relating this model to the diagram of the process
in Fig. 3.5, the position along the trough represents the horizontal
axis, or gene frequency, and the depth of the sand represents the
vertical axis, or the probability of a line having a particular gene
frequency. The graphs are thus analogous to longitudinal sections
through the trough and its sand.

[Fig. 3.6. Fixation and loss occurring among 107 lines of Drosophila
melanogaster, during 19 generations. This is not the same
experiment as that illustrated in Figs. 3.3 and 3.4, but was similar
in nature. There were 16 parents per generation in each line, and
the effective number (see chapter 4) was 9. The closed circles
show the percentage of lines in which the bw75 allele has become
fixed; the open circles show the percentage in which it has been
lost and the bw allele fixed. The smooth curve is the expected
amount of fixation of one or other allele, computed from the effective
number by equation 3.3. Horizontal axis: generations. (Data from
Buri, 1956.)]

The probability of fixation at any time during the initial phase is
too complicated for explanation here, and the reader is referred to
the papers of Kimura (1954, 1955). After the steady phase has been
reached fixation proceeds at a constant rate: a proportion 1/2N of the
lines previously unfixed become fixed in each generation. The
proportion of lines in which a gene with initial frequency q0 is
expected to be fixed, lost, or to be still segregating is as follows
(Wright, 1952a):

$$\begin{aligned}
\text{fixed:} \quad & q_0 - 3 p_0 q_0 P \\
\text{lost:} \quad & p_0 - 3 p_0 q_0 P \\
\text{neither:} \quad & 6 p_0 q_0 P
\end{aligned} \tag{3.3}$$

where $P = \left( 1 - \frac{1}{2N} \right)^t$.

Fig. 3.6 shows the progress of fixation and loss in an experiment
with Drosophila.
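
The three proportions of equation 3.3 are easily evaluated. The sketch
below is illustrative only: the function name is ours, and the values of N
and t are taken loosely from the description of Fig. 3.6 (effective number
9, 19 generations) merely to have something to compute. Since the
expressions apply only after the steady phase has been reached, they give
impossible values if t is small.

```python
def fixation_proportions(q0, n, t):
    """Equation 3.3: proportions of lines fixed, lost, or still segregating.

    Meaningful only once the steady phase has been reached.
    """
    p0 = 1 - q0
    P = (1 - 1 / (2 * n)) ** t
    fixed = q0 - 3 * p0 * q0 * P
    lost = p0 - 3 * p0 * q0 * P
    neither = 6 * p0 * q0 * P
    return fixed, lost, neither

# Illustrative values: q0 = 0.5, effective number 9, generation 19.
fixed, lost, neither = fixation_proportions(q0=0.5, n=9, t=19)

print("fixed:  ", round(fixed, 3))
print("lost:   ", round(lost, 3))
print("neither:", round(neither, 3))
```

The three proportions always sum to unity, and as t increases the
proportion still segregating declines toward zero while the fixed and lost
proportions approach q0 and p0.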

Genotype frequencies. Change of gene frequency leads to
change of genotype frequencies; so the genotype frequencies in small
populations follow the changes of gene frequency resulting from the
dispersive process. In the idealised population, which we are still
considering, mating is random within each of the lines. Consequently
the genotype frequencies in any one line are the Hardy-Weinberg
frequencies appropriate to the gene frequency in the previous genera-
tion of that line. As the lines drift apart in gene frequency they
become differentiated also in genotype frequencies. But differentia-
tion is not the only aspect of the change: the general direction of the
change is toward an increase of homozygous, and a decrease of
heterozygous, genotypes. The reason for this is the dispersion of gene
frequencies from intermediate values toward the extremes. Hetero-
zygotes are most frequent at intermediate gene frequencies (see
Fig. 1.1), so the drift of gene frequencies toward the extremes leads,
on the average, to a decline in the frequency of heterozygotes.

The genotype frequencies in the population as a whole can be
deduced from a knowledge of the variance of gene frequencies in the
following way. If an allele has a frequency q in one particular line,
homozygotes of that allele will have a frequency of q² in that line.
The frequency of these homozygotes in the population as a whole will
therefore be the mean value of q² over all lines. We shall write this
mean frequency of homozygotes as $\overline{q^2}$. The value of $\overline{q^2}$ can be
found from a knowledge of the variance of gene frequencies among
the lines, by noting that the variance of a set of observations is found
by deducting the square of the mean from the mean of the squared
observations. Thus

$$\sigma_q^2 = \overline{q^2} - \bar{q}^2$$

and

$$\overline{q^2} = \bar{q}^2 + \sigma_q^2 \tag{3.4}$$

where $\sigma_q^2$ is the variance of gene frequencies among the lines, as given
in equation 3.2, and $\bar{q}^2$ is the square of the mean gene frequency.
Since the mean gene frequency, q̄, is equal to the original, q0, it
follows that $\bar{q}^2$ or q0² is the original frequency of homozygotes in the
base population. Thus in the population as a whole the frequency of
homozygotes of a particular allele increases, and is always in excess
of the original frequency by an amount equal to the variance of the
gene frequency among the lines. In a two-allele system the same
applies to the other allele, and the frequency of heterozygotes is
reduced correspondingly. Noting from equation 3.2 that $\sigma_p^2 = \sigma_q^2$ we
therefore find the genotypic frequencies for a locus with two alleles
as follows:

Genotype    Frequency in whole population
A1A1        $p_0^2 + \sigma_q^2$
A1A2        $2 p_0 q_0 - 2\sigma_q^2$          (3.5)
A2A2        $q_0^2 + \sigma_q^2$

These genotype frequencies are no longer the Hardy-Weinberg
frequencies appropriate to the original or mean gene frequency. The
Hardy-Weinberg relationships between gene frequency and genotype
frequencies, though they hold good within each line separately, do
not hold if the lines are taken together and regarded as a single
population. This fact causes some difficulty in relating gene and
genotype frequencies in natural populations, because they are often
more or less subdivided and the degree of subdivision is seldom
known. An example of the decrease of heterozygotes resulting from
the dispersion of gene frequencies is shown in Fig. 3.7.
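
The relation between the variance of gene frequency and the genotype
frequencies in the whole population may be checked on any set of lines. In
the sketch below the genotype frequencies are obtained in two ways: directly,
by averaging the Hardy-Weinberg frequencies of the separate lines, and
indirectly, from the mean gene frequency and the variance as in the table
of 3.5 (with the mean q̄ standing in for q0). The five line frequencies are
hypothetical, and the function names are ours.

```python
def genotypes_direct(line_freqs):
    """Average the Hardy-Weinberg frequencies of the separate lines."""
    n = len(line_freqs)
    a1a1 = sum((1 - q) ** 2 for q in line_freqs) / n
    a1a2 = sum(2 * q * (1 - q) for q in line_freqs) / n
    a2a2 = sum(q ** 2 for q in line_freqs) / n
    return a1a1, a1a2, a2a2

def genotypes_from_variance(line_freqs):
    """The same frequencies from the mean gene frequency and its variance."""
    n = len(line_freqs)
    q_bar = sum(line_freqs) / n
    p_bar = 1 - q_bar
    var_q = sum((q - q_bar) ** 2 for q in line_freqs) / n
    return (p_bar ** 2 + var_q,
            2 * p_bar * q_bar - 2 * var_q,
            q_bar ** 2 + var_q)

lines = [0.1, 0.3, 0.5, 0.7, 0.9]  # hypothetical gene frequencies of five lines

direct = genotypes_direct(lines)
indirect = genotypes_from_variance(lines)

print("direct:  ", direct)
print("indirect:", indirect)
```

The two methods agree, and the frequency of heterozygotes falls short of
the Hardy-Weinberg value 2p̄q̄ by twice the variance among the lines.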

The foregoing account of genotype frequencies describes the
situation in terms of one locus in many lines. It can be regarded
equally as referring to many loci in one line; then the change in any
one line or small population is an increase in the number of loci at
which individuals are homozygous and a corresponding decrease in
the number at which they are heterozygous — in short an increase of
homozygotes at the expense of heterozygotes. This change of geno-
type frequencies resulting from the dispersive process is the genetic
basis of the phenomenon of inbreeding depression, of which a full
explanation will be found in Chapter 14.

[Fig. 3.7. Change of frequency of heterozygotes among 105 lines
of Drosophila melanogaster, each with 16 parents. The same experiment
as is illustrated in Figs. 3.3 and 3.4. The frequency of
heterozygotes refers to the population as a whole, all lines taken
together. The smooth curve is the expected frequency of heterozygotes.
Horizontal axis: generations. (Data from Buri, 1956.)]

We have now surveyed the general nature of the dispersive process
and its three major consequences — differentiation of sub-populations,
genetic uniformity within sub-populations, and overall increase in
the frequency of homozygous genotypes. Let us now look at the
process from another viewpoint, as an inbreeding process. Instead
of regarding the increase of homozygotes as a consequence of the
dispersion of gene frequencies, we shall now look directly at the
manner in which the additional homozygotes arise.


Inbreeding

Inbreeding means the mating together of individuals that are
related to each other by ancestry. That the degree of relationship
between the individuals in a population depends on the size of the
population will be clear by consideration of the numbers of possible
ancestors. In a population of bisexual organisms every individual
has two parents, four grand-parents, eight great-grandparents, etc.,
and t generations back it has 2^t ancestors. Not very many generations
back the number of individuals required to provide separate ancestors
for all the present individuals becomes larger than any real popula-
tion could contain. Any pair of individuals must therefore be related
to each other through one or more common ancestors in the more or
less remote past; and the smaller the size of the population in previous
generations the less remote are the common ancestors, or the greater
their number. Thus pairs mating at random are more closely related
to each other in a small population than in a large one. This is why
the properties of small populations can be treated as the consequences
of inbreeding.
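
How quickly the number of ancestors outruns the size of a population can be
seen from a little arithmetic. The sketch below considers a single
individual only and finds the first generation back at which 2^t separate
ancestors would exceed a given population size; the sizes used are
hypothetical.

```python
def generations_until_exceeds(population_size):
    """Smallest t for which 2**t separate ancestors exceed the population."""
    t = 0
    while 2 ** t <= population_size:
        t += 1
    return t

for size in (100, 10_000, 1_000_000):  # hypothetical population sizes
    print(size, generations_until_exceeds(size))
```

Even for a population of a million, the ancestors of one individual can no
longer all be separate individuals after about twenty generations.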

The essential consequence of two individuals having a common
ancestor is that they may both carry replicates of one of the genes
present in the ancestor; and if they mate they may pass on these
replicates to their offspring. Thus inbred individuals — that is to
say, offspring produced by inbreeding — may carry two genes at a
locus that are replicates of one and the same gene in a previous
generation. Consideration of this consequence of inbreeding shows
that there are two sorts of identity among allelic genes, and two sorts
of homozygote. The sort of identity we have hitherto considered is a
functional identity. Two genes are regarded as being identical if they
are not recognisably different in their phenotypic effects, or by any
other functional criterion; in other words, if they have the same
allelomorphic state. Following the terminology of Crow (1954) they
may be called alike in state. An individual carrying a pair of such genes
is a homozygote in the ordinary sense. The new sort of identity is
one of replication. If two genes originated from the replication of one
gene in a previous generation, they may be said to be identical by
descent, or simply identical. An individual possessing two identical
genes at a locus may be called an identical homozygote. Genes that
are not identical by descent may be called independent, whether they
are alike in state or different alleles; and homozygotes of independent
genes may be called independent homozygotes.

Identity by descent provides the basis for a measure of the dis-
persive process, through the degree of relationship between the
mating pairs. The measure is the coefficient of inbreeding, which is the
probability that the two genes at any locus in an individual are identi-
cal by descent. It refers to an individual and expresses the degree of
relationship between the individual's parents. If the parents mated
at random then the coefficient of inbreeding of the progeny is the
probability that two gametes taken at random from the parent
generation carry identical genes at a locus. The coefficient of in-
breeding, generally symbolised by F, was first defined by Wright
(1922) as the correlation between uniting gametes; the definition
given here, which follows that of Malécot (1948) and Crow (1954), is
equivalent.

The degree of relationship expressed in the inbreeding coefficient
is essentially a comparison between the population in question and
some specified or implied base population. Without this point of
reference it is meaningless, as the following consideration will show.
On account of the limitation in the number of independent ancestors
in any population not infinitely large, all genes now present at a locus
in the population would be found to be identical by descent if traced
far enough back into the remote past. Therefore the inbreeding
coefficient only becomes meaningful if we specify some time in the
past beyond which ancestries will not be pursued, and at which all
genes present in the population are to be regarded as independent —
that is, not identical by descent. This point is the base population
and by its definition it has an inbreeding coefficient of zero. The
inbreeding coefficient of a subsequent generation expresses the
amount of the dispersive process that has taken place since the base
population, and compares the degree of relationship between the
individuals now, with that between individuals in the base population.
Reference to the base population is not always explicitly stated, but is
always implied. For example, we can speak of the inbreeding coeffi-
cient of a population subdivided into lines. The comparison of
relationship is between the individuals of a line and individuals
taken at random from the whole population. The base population
implied is a hypothetical population from which all the lines were
derived.

Inbreeding in the idealised population. Let us now return to
the idealised population and deduce the coefficient of inbreeding in
successive generations, starting with the base population and its
progeny constituting generation 1. The situation may be visualised
by thinking of a hermaphrodite marine organism, capable of
self-fertilisation, shedding eggs and sperm into the sea. There are N
individuals each shedding equal numbers of gametes which unite at
random. All the genes at a locus in the base population have to be
regarded as being non-identical; so, considering only one locus,
among the gametes shed by the base population there are 2N different
sorts, in equal numbers, bearing the genes A1, A2, A3, etc. at the A
locus. The gametes of any one sort carry identical genes; those of
different sorts carry genes of independent origin. What is the
probability that a pair of gametes taken at random carry identical genes?
This is the inbreeding coefficient of generation 1. Any gamete has a
1/2N chance of uniting with another of the same sort, so 1/2N is the
probability that uniting gametes carry identical genes, and is thus the
coefficient of inbreeding of the progeny. Now consider the second
generation. There are now two ways in which identical homozygotes
can arise, one from the new replication of genes and the other
from the previous replication. The probability of newly replicated
genes coming together in a zygote is again 1/2N. The remaining
proportion, 1 - 1/2N, of zygotes carry genes that are independent in
their origin from generation 1, but may have been identical in their
origin from generation 0. The probability of their identical origin in
generation 0 is what we have already deduced as the inbreeding
coefficient of generation 1. Thus the total probability of identical
homozygotes in generation 2 is

$$F_2 = \frac{1}{2N} + \left(1 - \frac{1}{2N}\right) F_1$$

where F1 and F2 stand for the inbreeding coefficients of generations
1 and 2 respectively. The same argument applies to subsequent
generations, so that in general the inbreeding coefficient of individuals
in generation t is

$$F_t = \frac{1}{2N} + \left(1 - \frac{1}{2N}\right) F_{t-1} \tag{3.6}$$

Thus the inbreeding coefficient is made up of two parts: an "increment,"
1/2N, attributable to the new inbreeding, and a "remainder,"
attributable to the previous inbreeding and having the inbreeding
coefficient of the previous generation. In the idealised population the
"new inbreeding" arises from self-fertilisation, which brings together
genes replicated in the immediately preceding generation. Exclusion
of self-fertilisation simply shifts the replication one generation
further back, so that the "new inbreeding" brings together genes
replicated in the grand-parental generation; the coefficient of in-
breeding is affected, but not very much, as we shall see later. The
distinction between "new" and "old" inbreeding brings clearly to
light a point which we note here in passing because it will be needed
later and is often important in practice: if there is no "new inbreed-
ing," as would happen if the population size were suddenly increased,
the previous inbreeding is not undone, but remains where it was
before the increase of population size.

Let us call the "increment" or "new inbreeding" ΔF, so that

$$\Delta F = \frac{1}{2N} \tag{3.7}$$

Equation 3.6 may then be rewritten in the form

$$F_t = \Delta F + (1 - \Delta F) F_{t-1} \tag{3.8}$$

Further rearrangement makes clearer the precise meaning of the
"increment," ΔF.

$$\Delta F = \frac{F_t - F_{t-1}}{1 - F_{t-1}} \tag{3.9}$$

From the equation written thus we see that the "increment," ΔF,
measures the rate of inbreeding in the form of a proportionate increase.
It is the increase of the inbreeding coefficient in one generation, rela-
tive to the distance that was still to go to reach complete inbreeding.
This measure of the rate of inbreeding provides a convenient way of
going beyond the restrictive simplifications of the idealised popula-
tion, and it thus provides a means of comparing the inbreeding effects
of different breeding systems. When the inbreeding coefficient is
expressed in terms of ΔF, equation 3.8 is valid for any breeding system
and is not restricted to the idealised population, though only in the
idealised population is ΔF equal to 1/2N.

So far we have done no more than relate the inbreeding coefficient
in one generation to that of the previous generation. It remains to
extend equation 3.8 back to the base population and so express the
inbreeding coefficient in terms of the number of generations. This is
made easier by the use of a symbol, P, for the complement of the
inbreeding coefficient, 1 - F, which is known as the panmictic index.
Substitution of P = 1 - F in equation 3.8 gives

$$\frac{P_t}{P_{t-1}} = 1 - \Delta F \tag{3.10}$$

Thus the panmictic index is reduced by a constant proportion in
each generation. Extension back to generation t - 2 gives

$$\frac{P_t}{P_{t-2}} = (1 - \Delta F)^2$$

and extension back to the base population gives

$$P_t = (1 - \Delta F)^t P_0 \tag{3.11}$$

where P0 is the panmictic index of the base population. The base
population is defined as having an inbreeding coefficient of 0, and
therefore a panmictic index of 1. The inbreeding coefficient in any
generation, t, referred to the base population, is therefore

$$F_t = 1 - (1 - \Delta F)^t \tag{3.12}$$
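
The step-by-step relation of equation 3.8 and the closed form of equation 3.12 can be checked against each other numerically. The following is a minimal sketch only; the function names and the population size of 10 are illustrative assumptions and are not part of the text.

```python
def inbreeding_by_recurrence(delta_f, generations):
    """Apply F_t = dF + (1 - dF) * F_(t-1) repeatedly, starting from F_0 = 0."""
    f = 0.0  # the base population has F = 0 by definition
    values = []
    for _ in range(generations):
        f = delta_f + (1.0 - delta_f) * f
        values.append(f)
    return values


def inbreeding_closed_form(delta_f, t):
    """Equation 3.12: F_t = 1 - (1 - dF)^t."""
    return 1.0 - (1.0 - delta_f) ** t


n = 10                   # hypothetical size of an idealised population
delta_f = 1.0 / (2 * n)  # equation 3.7
by_steps = inbreeding_by_recurrence(delta_f, 5)
for t, f in enumerate(by_steps, start=1):
    # The two columns should agree in every generation.
    print(t, round(f, 4), round(inbreeding_closed_form(delta_f, t), 4))
```

The recurrence also shows the point made about "old" inbreeding: setting the increment to zero in a later generation leaves F where it was, and does not reduce it.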

The consequences of the dispersive process were described earlier
from the viewpoint of sampling variance. Let us now look again at
them, applying the rate of inbreeding and the inbreeding coefficient
as measures of the process. Strictly speaking we should refer still
to the idealised population, but the equating of the two viewpoints
can be regarded as generally valid except in some very special and
unlikely circumstances (see Crow, 1954).

Variance of gene frequency. First, the variance of the change
of gene frequency in one generation, taken from equation 3.1 and
expressed in terms of the rate of inbreeding, becomes

$$\sigma^2_{\Delta q} = \frac{p_0 q_0}{2N} = p_0 q_0 \, \Delta F \tag{3.13}$$

Similarly, the variance of gene frequencies among the lines at
generation t, taken from equation 3.2 and expressed in terms of the
inbreeding coefficient from 3.12, becomes

$$\sigma^2_q = p_0 q_0 \left[1 - \left(1 - \frac{1}{2N}\right)^t\right] = p_0 q_0 F \tag{3.14}$$

Thus ΔF expresses the rate of dispersion and F the cumulated effect
of random drift.
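
As a rough numerical illustration of equations 3.13 and 3.14, the sketch below computes the two variances for an idealised population. The function names and the example values (initial frequency 0.5, N = 10) are assumptions made for illustration.

```python
def variance_of_change(p0, delta_f):
    """Equation 3.13: variance of the change of gene frequency in one generation."""
    return p0 * (1.0 - p0) * delta_f


def variance_among_lines(p0, n, t):
    """Equation 3.14 for the idealised population: p0 * q0 * F_t."""
    f_t = 1.0 - (1.0 - 1.0 / (2 * n)) ** t  # equation 3.12 with dF = 1/2N
    return p0 * (1.0 - p0) * f_t


p0, n = 0.5, 10
print(variance_of_change(p0, 1.0 / (2 * n)))
for t in (1, 10, 100):
    # The variance rises toward its limit p0 * q0 as F approaches 1.
    print(t, round(variance_among_lines(p0, n, t), 4))
```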

Genotype frequencies. Leaving fixation aside for the moment,
let us consider next the genotype frequencies in the population as a
whole. The genotype frequencies expressed in terms of the variance
of gene frequency in equations 3.5 can be rewritten in terms of the
coefficient of inbreeding from equation 3.14. The frequency of A2A2,
for example, is

$$\overline{q^2} = q_0^2 + \sigma_q^2 = q_0^2 + p_0 q_0 F$$

The genotype frequencies expressed in this way are entered in the
left-hand side of Table 3.1. As was explained before, this way of
writing the genotype frequencies shows how the homozygotes
increase at the expense of the heterozygotes.

Table 3.1

Genotype frequencies for a locus with two alleles, expressed
in terms of the inbreeding coefficient, F.

            Original       Change due             Origin:
Genotype    frequencies    to inbreeding          Independent        Identical

A1A1        p0^2         + p0 q0 F          or    p0^2 (1 - F)     + p0 F
A1A2        2 p0 q0      - 2 p0 q0 F        or    2 p0 q0 (1 - F)
A2A2        q0^2         + p0 q0 F          or    q0^2 (1 - F)     + q0 F

Recognition of identity
by descent to which the inbreeding viewpoint led us means that we
can now distinguish the two sorts of homozygote, identical and
independent, among both the A1A1 or A2A2 genotypes. The fre-
quency of identical homozygotes among both genotypes together is
by definition the inbreeding coefficient, F; and it is clear that the
division between the two genotypes is in proportion to the initial
gene frequencies. So p0F is the frequency of A1A1 identical homozygotes,
and q0F that of A2A2 identical homozygotes. The remaining
genotypes, both homozygotes and heterozygotes, carry genes that are
independent in origin and are therefore the equivalent of pairs of
gametes taken at random from the population as a whole. Their
frequencies are therefore the Hardy-Weinberg frequencies. Thus,
from the inbreeding viewpoint, we arrive at the genotype frequencies
shown in the right-hand columns of Table 3.1. This way of writing
the genotype frequencies shows how homozygotes are divided between
those of independent and those of identical origin. The
equivalence of the two ways of expressing the genotype frequencies
can be verified from their algebraic identity. Both ways show equally
clearly how the heterozygotes are reduced in frequency in proportion
to 1 - F. The term "heterozygosity" is often used to express the
frequency of heterozygotes at any time, relative to their frequency in
the base population. The heterozygosity is the same as the panmic-
tic index, P. Thus if Ht and H0 are the frequencies of heterozygotes
for a pair of alleles at generation t and in the base population res-
pectively, then the heterozygosity at generation t is

$$\frac{H_t}{H_0} = P_t \tag{3.15}$$
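
The algebraic identity of the two ways of writing the genotype frequencies in Table 3.1 can also be confirmed numerically. This is a sketch under assumed names and example values (p0 = 0.3, F = 0.25), not a procedure given in the text.

```python
def frequencies_by_change(p0, f):
    """Left-hand form of Table 3.1: original frequency plus change due to inbreeding."""
    q0 = 1.0 - p0
    return {
        "A1A1": p0 ** 2 + p0 * q0 * f,
        "A1A2": 2 * p0 * q0 - 2 * p0 * q0 * f,
        "A2A2": q0 ** 2 + p0 * q0 * f,
    }


def frequencies_by_origin(p0, f):
    """Right-hand form of Table 3.1: independent origin plus identical origin."""
    q0 = 1.0 - p0
    return {
        "A1A1": p0 ** 2 * (1.0 - f) + p0 * f,
        "A1A2": 2 * p0 * q0 * (1.0 - f),
        "A2A2": q0 ** 2 * (1.0 - f) + q0 * f,
    }


p0, f = 0.3, 0.25
left = frequencies_by_change(p0, f)
right = frequencies_by_origin(p0, f)
for genotype in left:
    print(genotype, round(left[genotype], 4), round(right[genotype], 4))

# Heterozygosity (equation 3.15): heterozygote frequency relative to the base population.
print(left["A1A2"] / (2 * p0 * (1.0 - p0)))
```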

Fixation. There is little to add, from the inbreeding viewpoint, to
the description of fixation given earlier. The rate of fixation — that
is the proportion of unfixed loci that become fixed in any generation —
is equal to AF, after the steady phase has been reached and the dis-
tribution of gene frequencies has become flat. The quantity P in
equations 3.3 which give the probability of a gene having become
fixed or lost, is equal to 1 - F. We may note, however, that the
probability of fixation is not very different from the inbreeding
coefficient itself. The explanation comes more readily by considering
the probability that a locus remains unfixed. This probability was
given in equation 3.3 for a locus with two alleles after enough genera-
tions have passed to take the population into the steady phase.
Expressed in terms of the inbreeding coefficient, from equation 3.12,
it is 6p0q0(1 - F). Now, the value of p0q0 does not change very much
over quite a wide range of gene frequencies, and so the probability
that a locus is still unfixed is not very sensitive to the initial gene
frequency. The value of 6p0q0 lies between 1.0 and 1.5 over a range
of gene frequency from 0.2 to 0.8, a range that is likely to cover many
situations. Consequently the probability that a line still segregates,
or the proportion of loci expected to remain unfixed, is likely to lie
between (1 - F) and 1.5(1 - F). Thus the inbreeding coefficient gives
a good idea of the approximate probability of fixation, even in the
absence of a knowledge of the initial gene frequencies. That the
approximation may be quite close enough for practical purposes may
be seen by taking a specific example. In work involving immuno-
logical reactions it may be necessary to produce a strain in which all
loci that determine the reactions have been fixed. One therefore
wants to know the inbreeding coefficient necessary to raise the
probability of fixation, or the proportion of loci expected to be fixed,
to a certain level, say 90 per cent. The inbreeding coefficient needed
to do this would, on the above considerations, lie between 0.90 and
0.93, and this would answer the question with quite enough accuracy
for most purposes.
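
The bounds quoted for the immunological example can be reproduced by solving 6p0q0(1 - F) for F. The sketch below assumes the steady phase has been reached, as the text requires; the function names are illustrative assumptions.

```python
def probability_unfixed(p0, f):
    """Steady-phase probability that a locus is still unfixed: 6 * p0 * q0 * (1 - F)."""
    return 6.0 * p0 * (1.0 - p0) * (1.0 - f)


def inbreeding_needed(p0, target_fixed):
    """F at which the expected proportion of fixed loci reaches target_fixed."""
    return 1.0 - (1.0 - target_fixed) / (6.0 * p0 * (1.0 - p0))


for p0 in (0.2, 0.5, 0.8):
    f = inbreeding_needed(p0, 0.90)
    # Substituting F back should leave 10 per cent of loci unfixed.
    print(p0, round(f, 3), round(probability_unfixed(p0, f), 3))
```

With initial gene frequencies of 0.2 and 0.5 the required F comes out near 0.90 and 0.93 respectively, in line with the range stated above.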

CHAPTER 4

SMALL POPULATIONS:

II. Less Simplified Conditions

In order to simplify the description of the dispersive process we
confined our attention in the last chapter to an idealised population,
and to do this we had to specify a number of restrictive conditions,
which could seldom be fulfilled in real populations. The purpose of
this chapter is to adapt the conclusions of the last chapter to situations
in which the conditions imposed do not hold; in other words to
remove the more serious restrictions and bring the conclusions closer
to reality. The restrictive conditions were of two sorts, one sort
being concerned with the breeding structure of the population and
the other excluding mutation, migration, and selection from con-
sideration. We shall first describe the effects of deviations from the
idealised breeding structure, and then consider the outcome of the
dispersive process when mutation, migration, or selection are oper-
ating at the same time.

Effective Population Size

If the breeding structure does not conform to that specified for
the idealised population, it is still possible to evaluate the dispersive
process in terms of either the variance of gene frequencies or the rate
of inbreeding. This can be done by the same general methods and
no new principles are involved. We shall therefore give the con-
clusions briefly and without detailed explanation. The most con-
venient way of dealing with any particular deviation from the
idealised breeding structure is to express the situation in terms of
the effective number of breeding individuals, or the effective population
size. This is the number of individuals that would give rise to the
sampling variance or the rate of inbreeding appropriate to the con-
ditions under consideration, if they bred in the manner of the
idealised population. Thus, by converting the actual number, N, to
the effective number, Ne, we can apply the formulae deduced in the
last chapter. The rate of inbreeding, for example, is

$$\Delta F = \frac{1}{2N_e} \tag{4.1}$$

just as for the idealised population ΔF = 1/2N (equation 3.7).

The relationships between actual and effective numbers in the
situations most commonly met with are given below. The exact
expressions are often complicated, but in most circumstances an
approximation can be used with sufficient accuracy. We should first
note that the actual number, N, refers to breeding individuals (the
breeding individuals of one generation) and it therefore cannot be
obtained directly from a census, unless the different age-groups are
distinguished.

Bisexual organisms: self-fertilisation excluded. The ex-
clusion of self-fertilisation makes very little difference to the rate of
inbreeding, unless N is very small, as with close inbreeding. The
relationship of effective to actual numbers (Wright, 1931) is

$$N_e = N + \tfrac{1}{2} \quad \text{(approx.)} \tag{4.2}$$

and the rate of inbreeding is

$$\Delta F = \frac{1}{2N + 1} \quad \text{(approx.)} \tag{4.3}$$

The exact expression for the inbreeding coefficient in a bisexual
population, and its derivation, are given by Malecot (1948).

Different numbers of males and females. In domestic and
laboratory animals the sexes are often unequally represented among
the breeding individuals, since it is more economical, when possible,
to use fewer males than females. The two sexes, however, whatever
their relative numbers, contribute equally to the genes in the next
generation. Therefore the sampling variance attributable to the two
sexes must be reckoned separately. Since the sampling variance is
proportional to the reciprocal of the number, the effective number is
twice the harmonic mean of the numbers of the two sexes (Wright,
1931), so that

$$\frac{1}{N_e} = \frac{1}{4N_m} + \frac{1}{4N_f} \tag{4.4}$$

where Nm and Nf are the actual numbers of males and females
respectively. The rate of inbreeding is then

$$\Delta F = \frac{1}{8N_m} + \frac{1}{8N_f} \quad \text{(approx.)} \tag{4.5}$$

This gives a close enough approximation unless both Nm and Nf are
very small, as with close inbreeding. It should be noted that the rate
of inbreeding depends chiefly on the numbers of the less numerous
sex. For example, if a population were maintained with an in-
definitely large number of females but only one male in each genera-
tion, the effective number would be only about 4.
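
The dependence of the effective number on the less numerous sex is easily seen by evaluating equations 4.4 and 4.5 for a few cases. The sketch below uses assumed function names, and a very large number of females stands in for the "indefinitely large" number mentioned above.

```python
def effective_number_sexes(n_males, n_females):
    """Equation 4.4: 1/Ne = 1/(4 Nm) + 1/(4 Nf)."""
    return 1.0 / (1.0 / (4.0 * n_males) + 1.0 / (4.0 * n_females))


def rate_of_inbreeding_sexes(n_males, n_females):
    """Equation 4.5 (approximate): dF = 1/(8 Nm) + 1/(8 Nf)."""
    return 1.0 / (8.0 * n_males) + 1.0 / (8.0 * n_females)


# Equal sexes, a moderately unequal ratio, and one male with very many females.
for n_m, n_f in ((50, 50), (10, 90), (1, 10 ** 9)):
    print(n_m, n_f,
          round(effective_number_sexes(n_m, n_f), 2),
          round(rate_of_inbreeding_sexes(n_m, n_f), 5))
```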

Unequal numbers in successive generations. The rate of
inbreeding in any one generation is given, as before, by 1/2N. If the
numbers are not constant from generation to generation, then the
mean rate of inbreeding is the mean value of 1/2N in successive
generations. The effective number is the harmonic mean of the numbers in
each generation (Wright, 1939). Over a period of t generations,
therefore,

$$\frac{1}{N_e} = \frac{1}{t}\left[\frac{1}{N_1} + \frac{1}{N_2} + \frac{1}{N_3} + \dots + \frac{1}{N_t}\right] \quad \text{(approx.)} \tag{4.6}$$

Thus the generations with the smallest numbers have the most effect.
The reason for this can be seen by consideration of the "new" and
"old" inbreeding referred to in connexion with equation 3.6. An
expansion in numbers does not affect the previous inbreeding; it
merely reduces the amount of new inbreeding. So, in a population
with fluctuating numbers the inbreeding proceeds by steps of varying
amount, and the present size of the population indicates only the
present rate of inbreeding.
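
The effect of a single generation of small numbers can be illustrated by taking the harmonic mean of equation 4.6 directly. The sequence of population sizes below is hypothetical, chosen only to show a temporary reduction in numbers.

```python
def effective_number_over_generations(sizes):
    """Equation 4.6 (approximate): harmonic mean of the numbers in each generation."""
    return len(sizes) / sum(1.0 / n for n in sizes)


steady = [100, 100, 100, 100]
bottleneck = [100, 100, 10, 100]  # one generation reduced to 10 individuals

print(effective_number_over_generations(steady))
print(round(effective_number_over_generations(bottleneck), 2))
print(sum(bottleneck) / len(bottleneck))  # arithmetic mean, for comparison
```

The harmonic mean for the second sequence is far below the arithmetic mean, which is the sense in which the generations with the smallest numbers have the most effect.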

Non-random distribution of family size. This is probably the
commonest and most important deviation from the breeding system
of the idealised population. Its consequence is usually to render the
effective number less than the actual, but in special circumstances it
makes it greater. Family size means here the number of progeny of
an individual parent or of a pair of parents, that survive to become
breeding individuals. It will be remembered that each breeding
individual in the idealised population contributes equally to the pool
of gametes, and therefore equally also to the potential zygotes in the
next generation. Survival of zygotes is random. The mean number
of progeny surviving to breeding age is 1 for individual parents and 2
for pairs of parents. Since the chance of survival for any particular
zygote is small, the variation of family size follows a Poisson distribu-
tion. The variance of family size is therefore equal to the mean family
size, equality of mean and variance being a property of the Poisson
distribution. Thus in a population of bisexual organisms, in which
all other conditions of the idealised population are satisfied, family
size will have a mean and a variance of 2. In natural populations the
mean is not likely to differ much from 2, but the variance must be
expected to be usually greater, for reasons of differing fertility be-
tween the parent individuals and differing viability between the
families. If the variance of family size is increased, a greater propor-
tion of the following generation will be the progeny of a smaller
number of parents, and the effective number of parents will be less
than the actual number. Conversely, if the variance of family size is
reduced below that of the idealised population, the effective number
will be greater than the actual number. It can be shown that, when
the mean family size is 2, the effective number is as follows (Wright,
1940; Crow, 1954):

$$N_e = \frac{4N}{2 + \sigma_k^2} \tag{4.7}$$

where $\sigma_k^2$ is the variance of family size. (Strictly speaking this is the
effective number as it affects variance of gene frequency and fixation:
for its effects on the inbreeding coefficient, $N_e = (4N - 2)/(2 + \sigma_k^2)$. The
difference is small and we shall ignore it.) Thus, when there is equal
fertility of the parents and random survival of the progeny, $\sigma_k^2 = 2$ and
Ne = N. When differences of fertility and viability make $\sigma_k^2$ greater
than 2, as in most actual populations, then Ne is less than N. The
effective number under consideration here refers to a population with
equal numbers of males and females, and with monogamous mating.
If males are not restricted to a single mate, then the families of males
are likely to be more variable in size than those of females. In these
circumstances the relationship of effective to actual numbers will
differ for male and female parents.
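
The way in which the variance of family size alters the effective number can be seen by evaluating equation 4.7 for a few values. This is a minimal sketch; the function name and the variances chosen are assumptions for illustration, and a mean family size of 2 is taken for granted as in the text.

```python
def effective_number_family_size(n, variance):
    """Equation 4.7: Ne = 4N / (2 + variance of family size), mean family size 2."""
    return 4.0 * n / (2.0 + variance)


n = 100
for variance in (0.0, 2.0, 6.0):
    # Variance 2 is the Poisson case of the idealised population.
    print(variance, effective_number_family_size(n, variance))
```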

It is possible by controlled breeding to make the variance of
family size, $\sigma_k^2$, less than 2, and therefore to make the effective
number greater than the actual. If two members of each family are
deliberately chosen to be parents of the next generation, then the
variance of family size is zero. Under these special circumstances,
and if the sexes are equal in numbers, the effective number is twice
the actual:

$$N_e = 2N \tag{4.8}$$

The rate of inbreeding is consequently half what it would be in an
idealised population of equal size, and is usually less than half the
rate of inbreeding under normal circumstances and random mating.
Under this controlled breeding system the rate of inbreeding is the
lowest possible with a given number of breeding individuals. The
reduced variance of family size is the path through which the "deliberate
avoidance of inbreeding" works. The problem often arises
of keeping a stock with minimum inbreeding, but with a limitation of
the actual population size imposed by the space or facilities available.
A common practice under these circumstances is the deliberate
avoidance of sib-matings and perhaps also of cousin-matings. One
may go further and by the use of pedigrees (in the manner described
in the next chapter) choose pairs for mating that have the least
possible relationship with each other. Deliberate avoidance of in-
breeding in this way has the effect of distributing the individuals
chosen to be parents evenly over the available families, and thus
reduces the variance of family size and the rate of inbreeding. The
same result, however, can be achieved with less labour simply by
ensuring that the available families are as far as possible equally
represented among the individuals chosen to be the parents of the
next generation. If, in addition, matings between close relatives are
avoided, the inbreeding coefficient in any generation is slightly lower
and is more uniform between the individuals in the generation than if
matings between close relatives are allowed; but the rate of inbreeding
is the same.

If the sexes are unequal in numbers, but the individuals chosen as
parents are equally distributed, in numbers and sexes, between the
families, so that the variance of family size is still zero, then the rate
of inbreeding is given by the following formula (Gowe, Robertson,
and Latter, 1959):

$$\Delta F = \frac{3}{32N_m} + \frac{1}{32N_f} \tag{4.9}$$

where Nm and Nf are the actual numbers of male and female parents
respectively, and females are more numerous than males.

Example 4.1. Several flocks of poultry in the United States and in
Canada, which are used as controls for breeding experiments, are maintained
by the following breeding system (Gowe, Robertson, and Latter, 1959).

There are 50 breeding males and 250 breeding females in each genera-
tion. Every male is the son of a different father, and every female the
daughter of a different mother, so that the variance of family size is zero.
One of the objectives of this breeding system is to minimise the rate of
inbreeding. Let us therefore find what the rate of inbreeding is, and then
see how much is achieved in this respect by the deliberate equalisation of
family size. By equation 4.9 the rate of inbreeding in these flocks is
ΔF = 0.002. If there were no deliberate choice of breeding individuals, and
family size conformed to a Poisson distribution, the rate of inbreeding by
equation 4.5 would be ΔF = 0.003. Thus, without the deliberate equalisation
of family size the rate of inbreeding would be 50 per cent greater. If
a low rate of inbreeding were the only objective, the number of females
could be substantially reduced without much effect. For example, if
there were no more females than males, with 50 of each sex (N = 100) and
with equalisation of family size, the rate of inbreeding from equation 4.8
would be ΔF = 0.0025, which is not very much greater than with five times
as many females. This illustrates the point, mentioned earlier, that most
of the inbreeding comes from the less numerous sex.
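
The three rates of inbreeding in Example 4.1 can be recomputed as follows. This is a sketch only: the function names are illustrative, and the form taken here for equation 4.9, ΔF = 3/(32Nm) + 1/(32Nf), is an assumption that is consistent with the value of 0.002 quoted in the example.

```python
def rate_equalised_families(n_males, n_females):
    """Assumed form of equation 4.9: dF = 3/(32 Nm) + 1/(32 Nf), family-size variance zero."""
    return 3.0 / (32.0 * n_males) + 1.0 / (32.0 * n_females)


def rate_poisson_families(n_males, n_females):
    """Equation 4.5 (approximate): dF = 1/(8 Nm) + 1/(8 Nf)."""
    return 1.0 / (8.0 * n_males) + 1.0 / (8.0 * n_females)


def rate_equal_sexes_equalised(n_total):
    """Equations 4.8 and 4.1: Ne = 2N, so dF = 1/(2 Ne) = 1/(4N)."""
    return 1.0 / (4.0 * n_total)


print(round(rate_equalised_families(50, 250), 4))  # equalised family size
print(round(rate_poisson_families(50, 250), 4))    # Poisson family size
print(round(rate_equal_sexes_equalised(100), 4))   # 50 of each sex, equalised
```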

Ratio of effective to actual number. When matings are con-
trolled and pedigree records kept, the rate of inbreeding can readily
be computed, as will be explained in the next chapter. But pedigree
records are not available for natural populations, nor for laboratory
populations kept by mass culture, as for example Drosophila populations.
How are we to estimate the rate of inbreeding in such populations?
We know the effective number is likely to be less than the
actual, but how much less? To estimate the effective number requires
a special experiment, and only the actual number is likely to be known.
Determinations of the ratio of effective to actual numbers, Ne/N,
from data on man, Drosophila, and the snail Lymnaea, led to values
ranging from 70 per cent to 95 per cent (Crow and Morton, 1955).
In the absence of specific knowledge, therefore, it would seem
reasonable to take the effective number as being, very roughly, about
three-quarters of the actual number. There are two methods by
which the ratio Ne/N may be determined: (1) by the estimation of the
variance of family size, which yields Ne by equation 4.7 (though
adjustment has to be made if the mean family size at the time of
measurement is not 2); and (2) by the estimation of the variance of the
changes of gene frequency during inbreeding, which yields Ne by
equation 3.1. Both methods have been applied to Drosophila
melanogaster in laboratory cultures. The ratio Ne/N for female
parents was 71 per cent by the first method and 76 per cent by the
second; and for male parents, 48 per cent and 35 per cent (Crow and
Morton, 1955). The ratio Ne/N for the sexes jointly, determined by
the second method, ranged from 56 per cent to 83 per cent, with a
mean of 70 per cent, in five experiments with equal actual numbers
of males and females (Kerr and Wright, 1954a, b; Wright and Kerr,
1954; Buri, 1956). The low value of 56 per cent was found in rather
poor culture conditions of crowding, where there was more compe-
tition (Buri, 1956).

Example 4.2. As an illustration of the use of the ratio Ne/N let us find
the expected rate of inbreeding in a population of Drosophila maintained
by 20 pairs of parents in each generation. The actual number is N = 40.
If the effective number were equal to the actual, the rate of inbreeding, by
equation 4.1, would be ΔF = 1/80 = 1.25 per cent. If we take Ne = 0.7N, from
the experimental results cited above, then Ne = 28, and the rate of
inbreeding is ΔF = 1/56 = 1.786 per cent. The coefficient of inbreeding after
10, 50, and 100 generations would then be (by equation 3.12) 17 per cent,
59 per cent, and 84 per cent.
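
The arithmetic of Example 4.2 can be followed step by step in the sketch below. The variable names are illustrative assumptions; the ratio of 0.7 is the value adopted in the example.

```python
n_actual = 40                   # 20 pairs of parents
ratio = 0.7                     # Ne/N taken from the experimental results cited
n_effective = ratio * n_actual  # effective number, Ne
delta_f = 1.0 / (2.0 * n_effective)  # equation 4.1

print(round(100 * delta_f, 3), "per cent per generation")

inbreeding = {}
for t in (10, 50, 100):
    inbreeding[t] = 1.0 - (1.0 - delta_f) ** t  # equation 3.12
    print(t, round(100 * inbreeding[t], 1), "per cent")
```

The values obtained agree with those quoted in the example to within rounding; the figure for 10 generations comes out at about 16.5 per cent.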

Migration, Mutation, and Selection

The description of the dispersive process given so far in this
chapter and the previous one is conditional on the systematic processes
of mutation, migration, and selection being absent, and its relevance
to real populations is therefore limited. So let us now consider
the effects of the dispersive and systematic processes when acting
jointly. The systematic processes, as we have seen in Chapter 2,
tend to bring the gene frequencies to stable equilibria at particular
values which would be the same for all populations under the same
conditions. The dispersive process, in contrast, tends to scatter the
gene frequencies away from these equilibrium values, and if not held
in check by the systematic processes it would in the end lead to all
genes being either fixed or lost in all populations not infinite in size.
The tendency of the systematic processes to change the gene frequency
toward its equilibrium value becomes stronger as the frequency
deviates further from this value. For this reason the opposing
tendencies of the dispersive and systematic processes reach a point of
balance: a point at which the dispersion of the gene frequencies is
held in check by the systematic processes. When this point of balance
is reached there will be a certain degree of differentiation between
sub-populations, but it will neither increase nor decrease so long as
the conditions remain unchanged. The problem is therefore to find
the distribution of gene frequencies among the lines of a subdivided
population when this steady state has been reached. The solution is
complicated mathematically, and we shall give only the main con-
clusions, explaining their meaning but not their derivation. For
details of the joint action of the dispersive and systematic processes,
see Wright (1931, 1942, 1948, 1951).

Mutation and migration. Mutation and migration can be
dealt with together because they change the gene frequency in the
same manner. Consider again a population subdivided into many
lines, all with an effective size Ne; and let a proportion, m, of the
breeding individuals of every generation in each line be immigrants
coming at random from all other lines. Consider two alleles at a
locus, with mean frequencies p and q in the population as a whole, and
with mutation rates u and v in the two directions. Then, when the
balance between dispersion on the one hand and mutation and
migration on the other is reached, the variance of the gene frequency
among the lines is given by the following expression (Wright, 1931;
Malecot, 1948):

$$\sigma_q^2 = \frac{pq}{1 + 4N_e(u + v + m)} \quad \text{(approx.)} \tag{4.10}$$

The degree of dispersion represented here by the variance of the gene
frequency can also be expressed as a coefficient of inbreeding, by
putting $\sigma_q^2 = Fpq$, from equation 3.14. Then

$$F = \frac{1}{1 + 4N_e(u + v + m)} \quad \text{(approx.)} \tag{4.11}$$

The theoretical distributions of the gene frequency appropriate to
four different values of F, when the mean gene frequency is 0-5, are
shown in Fig. 4.1. These distributions show how high F must be for
there to be a substantial amount of fixation or of differentiation be-
tween sub-populations. What the distributions depict can be stated
in three ways: (a) If we had a large number of sub-populations and we
determined the frequency of a particular gene in all of them, the
distribution curve is what we should obtain by plotting the percentage
of sub-populations showing each gene frequency. Or, in other words,
the height of the curve at a particular gene frequency shows the
probability of finding that gene frequency in any one sub-population.
(b) If we had one sub-population and measured the gene frequencies
at a large number of loci, all of which started with the same initial
frequency, the curve is the distribution of frequencies that we should
find. (c) If we had one sub-population and measured the frequency of
one particular gene repeatedly over a long period of time, the curve
is the distribution of frequencies that we should find. The
distributions describe the state of affairs when equilibrium between
the systematic and dispersive processes has been reached, and the
population as a whole is in a steady state. From the distributions
shown in Fig. 4.1 it will be seen that when F is 0.005 there is very
little differentiation, and when F is 0.048 there is a fair amount of
differentiation but still no fixation. When F is 0.333 the distribution
is flat, which means that all gene frequencies are equally probable
(including 0 and 1); thus there is much differentiation, and in
addition a substantial amount of fixation and loss occurs.
When F exceeds this critical value intermediate gene frequencies
become rarer, and a greater proportion of sub-populations have the
gene either fixed or lost. When mutation or migration occurs, fixation
or loss is not a permanent state in any one sub-population; the amount
of fixation or loss is what would be found at any one time.

Fig. 4.1. Theoretical distributions of gene frequency among
sub-populations, when dispersion is balanced by mutation or migration.
The states of dispersion to which the curves refer are indicated by the
values of F in the figure. (Redrawn from Wright, 1951.)

Let us return now to the expression, 4.11, relating the coefficient
of inbreeding to the rates of mutation and migration when the
population has reached the steady state; and let us consider the rates
of mutation or migration, in relation to the effective population size,
that would just allow the dispersive process to go to the critical point
corresponding to the value of F = 0.333. Putting this value of F in
equation 4.11 yields

$$u + v + m = \frac{1}{2N_e} \quad \text{(approx.)} \tag{4.12}$$

First let us consider mutation alone. If the sum of the mutation
rates in the two directions (u + v) were $10^{-5}$, which is a realistic value
to take according to what is known of mutation rates, then the critical
state of dispersion will be reached in sub-populations of effective size
Ne = 50,000. In other words, mutation rates of this order of magni-
tude will arrest the dispersive process before the critical state only in
populations with effective numbers greater than 50,000. Populations
smaller than this will show a substantial amount of fixation of genes
having this mutation rate. In practice, therefore, mutation may be
discounted as a force opposing dispersion in populations that would
commonly be regarded as "small"; populations, that is, with effective
numbers of the order of 100, or even 1,000.

With migration the picture is different, because what would be
considered a high rate of mutation would be judged a low rate of
migration. The critical value of F = 0.333 will occur when m = 1/2Ne.
With this rate of migration there would be only one immigrant
individual in every second generation, irrespective of the population
size. Thus we see that only a small amount of interchange between
sub-populations will suffice to prevent them from differentiating
appreciably in gene frequency.
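
The arithmetic of the last two paragraphs can be checked with a short sketch. It assumes that equation 4.11 has the form F = 1/[4Ne(u + v + m) + 1], which is the form consistent with equation 4.12; the function names are illustrative only.

```python
def steady_state_f(ne, rate):
    """Steady-state F, assuming equation 4.11 is F = 1 / (4*Ne*rate + 1),
    where rate stands for u + v + m."""
    return 1.0 / (4.0 * ne * rate + 1.0)


def critical_ne(rate, f_crit=1.0 / 3.0):
    """Effective size at which the steady-state F equals f_crit
    (the same assumed form of equation 4.11, solved for Ne)."""
    return (1.0 / f_crit - 1.0) / (4.0 * rate)


# Mutation alone with u + v = 1e-5: the critical size is about 50,000.
ne_mut = critical_ne(1e-5)

# Migration alone at the critical rate m = 1/(2*Ne): the expected number
# of immigrants per generation, Ne * m, does not depend on Ne.
immigrants = [ne * (1.0 / (2.0 * ne)) for ne in (10, 100, 1000)]

print(round(ne_mut), immigrants)
```

The second calculation restates the point made in the text: half an immigrant per generation, that is one in every second generation, holds F at the critical value in a sub-population of any size.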

The situation to which this consideration of migration refers is
known as the "island model." It pictures a discontinuous population
such as might be found inhabiting widely separated islands, inter-
change taking place by occasional migrants from one sub-population
to another. But differentiation of sub-populations by random drift
can take place also in a continuous population if the motility of the
organism is small in relation to the population density. This is known
as "isolation by distance" or the "neighbourhood model" (Wright,
1940; 1943; 1946; 1951). Clearly, if there is little dispersal over the
territory between one generation and the next the choice of mates is
restricted and mating cannot be at random. The population is then
subdivided into "neighbourhoods" (Wright, 1946) within which
individuals find mates. A neighbourhood is an area within which
mating is effectively random. The size of a neighbourhood depends
on the distance covered by dispersal between one generation and the
next. If the distances between localities inhabited by offspring and
parents at corresponding stages of the life cycle are distributed with a
variance σ_d², then the area of a neighbourhood is the area enclosed by a
circle of radius 2σ_d, which is π(2σ_d)². The effective population size
of a neighbourhood is the number of breeding individuals in the
area of a neighbourhood. The subdivision of a population into
neighbourhoods leads to random drift, but the amount of local
differentiation depends on the size of the whole population as well
as on the effective number in the neighbourhood. If the whole
population is not very much larger than the neighbourhood then the
whole population will drift, and there will be little local differentiation
within it. The conclusion to which the neighbourhood model leads
is that a great amount of local differentiation will take place if the
effective number in a neighbourhood is of the order of 20, and a
moderate amount if it is of the order of 200; but with larger neigh-
bourhoods it will be negligible. There will be much more local
differentiation in a population inhabiting a linear territory, such as a
river or shore line, because a neighbourhood is then open to immi-
gration only from two directions instead of from all round. The extent
of a neighbourhood in a population distributed in one dimension is
the square root of the area of a neighbourhood in a population
distributed in two dimensions. The effective population size is
therefore the number of breeding individuals in a distance 2σ_d√π of
territory.

Example 4.3. As an illustration of the computation of the effective
population size of a neighbourhood we may take some observations from
the detailed studies by Lamotte (1951) of the snail Cepaea nemoralis in
France. Marked individuals were released in spring and the distance
travelled from the point of release by those recaptured in the autumn was
noted. Since the snails are inactive in winter this represented the
displacement occurring in one year. The mean displacement was 8.1 metres,
and its standard deviation 9.4 m. The standard deviation of the
displacement between birth and mating, which usually takes place in the
second year of life, was estimated as σ_d = 15 m. The area occupied by a
neighbourhood is therefore π(2σ_d)² = 12.5σ_d² = 2,813 sq. m. The density
of individuals in two large colonies was found to be 2 per sq. m., and in
another 3 per sq. m. The effective population size of the neighbourhoods
in these colonies was therefore about 5,600 and 8,400. These figures are
a good deal larger than the size of neighbourhoods from which we would
expect differentiation within the colonies. Five colonies inhabiting
linear territories had densities ranging from 4.5 to 20 individuals per
metre. The effective population size of the neighbourhoods in these
colonies ranged from 236 to 1,050. These are approaching the size from
which differentiation within a colony would be expected.
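
The neighbourhood sizes in Example 4.3 can be recomputed as a minimal sketch. The function names are illustrative, and π is used unrounded, so the results come out slightly larger than the rounded figures quoted in the example.

```python
import math


def neighbourhood_area(sigma_d):
    """Area of a circle of radius 2*sigma_d (population in two dimensions)."""
    return math.pi * (2.0 * sigma_d) ** 2


def neighbourhood_length(sigma_d):
    """Extent of a neighbourhood in a linear territory: the square root
    of the two-dimensional area, i.e. 2*sigma_d*sqrt(pi)."""
    return 2.0 * sigma_d * math.sqrt(math.pi)


sigma_d = 15.0  # metres, as estimated in Example 4.3

area = neighbourhood_area(sigma_d)
# Densities per square metre in the large colonies.
ne_2d = [density * area for density in (2, 3)]
# Lowest and highest densities per metre in the linear colonies.
ne_1d = [density * neighbourhood_length(sigma_d) for density in (4.5, 20)]

print(round(area), [round(n) for n in ne_2d], [round(n) for n in ne_1d])
```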

Selection. Selection operating on a locus in a large population
brings the gene frequency to an equilibrium; when selection against
a recessive or semidominant gene is balanced by mutation the
equilibrium is at a low gene frequency, and when selection favours
the heterozygote the equilibrium
is more likely to be at an inter-
mediate frequency. The question
we have now to consider is: How
much can the dispersive process
disturb these equilibria and cause
small populations to deviate from
the point of equilibrium? The
importance of this question lies
in the fact that an increase of the
frequency of a deleterious gene
will reduce the fitness — that is,
will increase the frequency of
"genetic deaths" — and the dis-
persive process may therefore
lead to non-adaptive changes in
small populations. We shall not
attempt to cover the joint effects
of selection and dispersion in
detail, but shall merely illustrate
their general nature by reference
to a particular case of selection

against a recessive gene balanced by mutation. The effects of selec-
tion in favour of heterozygotes will be discussed in the next chapter,
because they have more importance in connexion with close in-
breeding.

Fig. 4.2 shows the state of dispersion of a gene among sub-populations
of three sizes under the following conditions. Mutation is supposed to
be the same in both directions, and the coefficient of selection against
the homozygote is supposed to be twenty times the mutation rate. In a
large population the balance between the mutation and the selection
would bring the gene frequency to equilibrium at about 0.2. The
population sizes to which the graphs refer are (a) Ne = 50/s,
(b) Ne = 5/s, and (c) Ne = 0.5/s. If we assumed a mutation rate of 10⁻⁵
in both directions then the intensity of selection would be
s = 20 × 10⁻⁵, and the effective population sizes to which the graphs
refer would be (a) 250,000 (b) 25,000 and (c) 2,500. These graphs
show that with the largest value of Ne there is little differentiation
between sub-populations; with the intermediate value of Ne random
drift is strong enough to cause a good deal of differentiation; with the
smallest value of Ne the effects of random drift predominate over those
of mutation and selection, intermediate gene frequencies are almost
absent, and in the majority of sub-populations the allele is either fixed
or lost. In this case, moreover, a fair proportion of the sub-populations
have the deleterious allele fixed in them. This illustrates how random
drift can overcome relatively weak selection and lead to fixation of a
deleterious gene.

Fig. 4.2. Theoretical distributions of gene frequency among
sub-populations when the dispersion is balanced by mutation and
selection. The graphs refer to a recessive gene with u = v = s/20, in
populations of size: (a) Ne = 50/s, (b) Ne = 5/s, and (c) Ne = 0.5/s.
(Redrawn from Wright, 1942.)
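
The figures quoted for Fig. 4.2 can be checked with a small sketch. It assumes the usual large-population expressions for the change of gene frequency (mutation adds u(1 - q) - vq per generation, and selection against the recessive homozygote removes sq²(1 - q)/(1 - sq²)); these expressions and the function names are assumptions made for illustration, not taken from this section.

```python
def delta_q(q, s, u, v):
    """Net change of the recessive allele's frequency in one generation
    under the assumed mutation and selection expressions."""
    mutation = u * (1.0 - q) - v * q
    selection = s * q * q * (1.0 - q) / (1.0 - s * q * q)
    return mutation - selection


def equilibrium_q(s, u, v, lo=0.0, hi=0.5, steps=60):
    """Find the frequency at which delta_q is zero by bisection.
    The bracket [0, 0.5] is valid when u = v, as in Fig. 4.2."""
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if delta_q(mid, s, u, v) > 0.0:
            lo = mid   # frequency still rising: equilibrium lies higher
        else:
            hi = mid   # frequency falling: equilibrium lies lower
    return 0.5 * (lo + hi)


u = v = 1e-5
s = 20 * u                                  # twenty times the mutation rate
q_hat = equilibrium_q(s, u, v)              # close to 0.2
sizes = [k / s for k in (50, 5, 0.5)]       # Ne for graphs (a), (b), (c)

print(round(q_hat, 3), [round(n) for n in sizes])
```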

This particular case illustrates in principle what will happen when
the processes of random drift, selection, and mutation are all operating.
But we need to have some idea of how intense the selection must be
before it overcomes the effects of random drift. If we are content not
to be very precise we can say that selection begins to be more im-
portant than random drift when the coefficient of selection, s, is of the
order of magnitude of 1/4Ne. For example, in a population of effective
size 100, the critical value of s would be about 0.0025. This is a very
low intensity of selection, quite beyond the reach of experimental
detection. The conclusion to be drawn, therefore, is that in all but
very small populations, even a very slight selective advantage of one
allele over another will suffice to check the dispersive process before
it causes an appreciable amount of fixation or of differentiation be-
tween sub-populations.
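
As a rough illustration of this rule of thumb, the following sketch tabulates the order of magnitude of s at which selection begins to outweigh drift for a few effective sizes. The function name and the chosen sizes are illustrative only.

```python
def critical_s(ne):
    """Order-of-magnitude coefficient of selection, s = 1/(4*Ne), at
    which selection begins to be more important than random drift."""
    return 1.0 / (4.0 * ne)


for ne in (10, 100, 1000, 10000):
    print(ne, round(critical_s(ne), 6))
```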

Example 4.4. The opposing forces of dispersion and selection are
illustrated in Fig. 4.3, from an experiment with Drosophila melanogaster
(Wright and Kerr, 1954). The frequency of the sex-linked gene "Bar"
was followed for 10 generations in 108 lines each maintained by 4 pairs of
parents. (On account of the complication of sex-linkage, which increases
the rate of dispersion, the theoretical effective number was 6.765: the
effective number as judged from the actual rate of dispersion was
Ne = 4.87.) The initial gene frequency was 0.5. The circles in the figure
show the distribution of the gene frequency among the lines in the fourth
to tenth generations, when the distribution had reached its steady form.
The smooth curve shows the theoretical distribution based on Ne = 5 and a
coefficient of selection against Bar of s = 0.17. Previously fixed lines
are not included in the distributions. Altogether, at the tenth
generation, 95 of the 108 lines had become fixed for the wild-type allele
and 3 for Bar while 10 remained unfixed. Thus, despite a 17 per cent
selective disadvantage, the deleterious allele was fixed in about 3 per
cent of the lines.

Fig. 4.3. Distribution of gene frequencies under inbreeding and
selection, as explained in Example 4.4. Horizontal axis: number of Bar
genes (scale marked 0, 2, 4, 6, 8). (Data from Wright and Kerr, 1954.)

Random Drift in Natural Populations

Having described the dispersive process and its theoretical conse-
quences, we may now turn to the more practical question of how far
these consequences are actually seen in natural populations. The
answering of this question is beset with difficulties, and the following
comments are intended more to indicate the nature of these diffi-
culties than to answer the question.

The theory of small populations, outlined in this and the pre-
ceding chapter, is essentially mathematical in nature and is un-
questionably valid: given only the Mendelian mechanism of inheri-
tance, the conclusions arrived at are a necessary consequence under
the conditions specified. The question at issue, then, is whether the
conditions in natural populations are often such as would allow the
dispersion of gene frequencies to become detectable. The pheno-
mena which would be expected to result from the dispersive process,
if the conditions were appropriate, are differentiation between the
inhabitants of different localities, and differences between successive
generations. Both these phenomena are well known in subdivided or
small isolated populations, and it is tempting to conclude that because
they are the expected consequences of random drift, random drift
must be their cause. But there are other possible causes: the en-
vironmental conditions probably differ from one locality to another
and from one season to another; so the intensity, or even the direction
of selection may well vary from place to place and from year to year,
and the differences observed could equally well be attributed to
variation of the selection pressure. Before we can justifiably attribute
these phenomena to random drift, therefore, we have to know (a)
that the effective population size is small enough, (b) that the
sub-populations are well enough isolated (or the size of the
"neighbourhoods" sufficiently small), and (c) that the genes concerned are subject
to very little selection.

The estimation of the present size of a population, though not tech-
nically easy, presents no difficulties of principle. But the present
state of differentiation depends on the population size in the past,
and this can generally only be guessed at. It is difficult to know how
often the population may have been drastically reduced in size in
unfavourable seasons, and the dispersion taking place in these
generations of lowest numbers is permanent and cumulative. There
is less difficulty in deciding whether the sub-populations are suffi-
ciently well isolated. With a discontinuous population inhabiting
widely separated islands, it is often possible to be reasonably sure
that there is not too much immigration; and with a continuous
population the size of the "neighbourhoods" is, at least in principle,
measurable. The greatest difficulty lies in estimating the intensity of
natural selection acting on the genes concerned. Selection of an
intensity far lower than could be detected experimentally is sufficient
to check dispersion in all but the smallest populations. It seems
rather unlikely — though this is no more than an opinion — that any
gene that modifies the phenotype enough to be recognised would
have so little effect on fitness. The genes concerned with quantitative
differences, which are not individually recognisable, may however be
nearly enough neutral for random drift to take place. There is no
doubt at all that genes of this sort do show random drift, at least in
laboratory populations, as will be shown in Chapter 15. Of the
individually recognisable genes, those concerned with polymorphism
seem the most likely to show the effects of random drift. At inter-
mediate frequencies a small displacement from the equilibrium would
be detectable, and therefore a relatively small amount of dispersion
of the gene frequency might well lead to recognisable differentiation.
The following example will serve to illustrate the observed differen-
tiation of a natural population, as well as the difficulties of its inter-
pretation.

Example 4.5. The polymorphism in respect of the banding of the
shell in the snail Cepaea nemoralis has been extensively studied by Lamotte
(1951) in France. The population is subdivided into colonies with a high
degree of isolation between them. The absence of dark-coloured bands
on the shell is caused by a single recessive gene. The mean frequency of
bandless snails is 29 per cent, but individual colonies range between the
two extremes, some being entirely bandless and a few entirely banded.
The colonies vary in the number of individuals that they contain, and 291
colonies were divided into three groups according to their population
size. The variation in the frequency of bandless snails was then compared
in the three groups, as shown in Fig. 4.4. The variation between the
colonies, which measures the degree of differentiation, was found to be
greater among the small colonies than among the large. The variance of
the frequency of bandless between colonies was 0.067 among colonies of
500-1,000 individuals, 0.048 among colonies of 1,000-3,000, and 0.037
among colonies of 3,000-10,000 individuals. This dependence of the
degree of differentiation on the population size is interpreted by Lamotte
as evidence that the differentiation is caused by random drift.

Cain and Sheppard (1954a), on the other hand, offer a different
interpretation, sustained by an equally thorough study of colonies in
England. They show that predation by birds — chiefly thrushes — exerts a
strong selection in favour of shell colours matching the background of the
habitat. Though the polymorphism is maintained by selection, of an
unknown nature, in favour of heterozygotes, the frequency of the different
types in any colony is determined by selection in relation to the nature of
the habitat. In the areas occupied by small colonies, they argue, there is
less variation of habitat than in the areas occupied by large colonies.
Therefore the variation of habitat between small colonies is greater than between
large. This they regard as the cause of the greater differentiation among
small colonies than among large, selection bringing the frequency of band-
less forms to a value appropriate to the mean habitat of the colony. It is
not for us here to attempt an assessment of these two conflicting interpre-
tations.

Fig. 4.4. Distribution of the frequency of bandless snails among
colonies of three sizes. Horizontal axis: frequency of bandless.
(Data from Lamotte, 1951.)

                                 (a)          (b)            (c)
Population size               500-1,000   1,000-3,000   3,000-10,000
Mean frequency of bandless      0.292        0.256          0.211
Variance between colonies       0.067        0.048          0.037

CHAPTER 5

SMALL POPULATIONS:
III. Pedigreed Populations and Close Inbreeding

In the two preceding chapters the genetic properties of small popu-
lations were described by reference to the effective number of breeding
individuals; and expressions were derived, in terms of the effective
number, by means of which the state of dispersion of the gene
frequencies could be expressed as the coefficient of inbreeding. The
coefficient of inbreeding, which is the probability of any individual
being an identical homozygote, was deduced from the population size
and the specified breeding structure. It expressed, therefore, the
average inbreeding coefficient of all individuals of a generation.
When pedigrees of the individuals are known, however, the coeffi-
cient of inbreeding can be more conveniently deduced directly from
the pedigrees, instead of indirectly from the population size. This
method has several advantages in practice. Knowledge is often re-
quired of the inbreeding coefficient of individuals, rather than of the
generation as a whole, and this is what the calculation from pedigrees
yields. In domestic animals some individuals often appear as parents
in two or more generations, and this overlapping of generations causes
no trouble when the pedigrees are known. (Non-overlapping of
generations was one of the conditions of the idealised population
which we have not yet removed.) The first topic for consideration in
this chapter is therefore the computation of inbreeding coefficients
from pedigrees. The second topic concerns regular systems of close
inbreeding. When self-fertilisation is excluded the rate of inbreeding
expressed in terms of the population size is only an approximation,
and the approximation is not close enough if the population size is
very small. Under systems of close inbreeding, therefore, the rate of
inbreeding must be deduced differently, and this is best done also
by consideration of the pedigrees.

When the coefficient of inbreeding is deduced from the pedigrees
of real populations it does not necessarily describe the state of dis-
persion of the gene frequencies. It is essentially a statement about
the pedigree relationships, and its correspondence with the state of
dispersion is dependent on the absence of the processes that counteract
dispersion, in particular on selection being negligible. We were able
to use the coefficient of inbreeding as a measure of dispersion in the
preceding chapters because the necessary conditions for its relation-
ship with the variance of gene frequencies were specified.

Pedigreed Populations

The inbreeding coefficient of an individual is the probability
that the pair of alleles carried by the gametes that produced it were
identical by descent. Computation of the inbreeding coefficient therefore
requires no more than the tracing of the pedigree back to common
ancestors of the parents and computing the probabilities at each
segregation. Consider the pedigree in Fig. 5.1. X is the individual
we are interested in, whose parents are P and Q. We want to know what
is the probability that X receives identical alleles transmitted through
P and Q from A. Consider first B and C. The probability that they
receive replicates of the same gene from A is ½, and the probability
that they receive different genes is ½. But if they receive different
genes from A, then the probability of these being identical as a result
of previous inbreeding is the inbreeding coefficient of A. Therefore the
total probability of B and C receiving identical genes from A is
½(1 + F_A). Put in other words, this is the probability that two gametes
taken at random from A will contain identical alleles. Now consider
the rest of the path through B. The probability that B passes the gene
it got from A on to D is ½; from D to P is ½, and from P to X is ½.
Similarly for the other side of the ancestry through C and Q. Putting
all this together we find the probability that X receives identical
alleles descended from A is $\tfrac{1}{2}(1 + F_A)(\tfrac{1}{2})^{3+2}$, or
$\tfrac{1}{2}(1 + F_A)(\tfrac{1}{2})^{n_1+n_2}$, where $n_1$ is the number of
generations from one parent back to the common ancestor and $n_2$ from
the other parent. If the two parents have more than one ancestor in
common the separate probabilities for each of the common ancestors have
to be summed to give the inbreeding coefficient of the progeny of these
parents. Thus the general expression for the inbreeding coefficient of an
individual is

$$F_X = \sum \left[ (\tfrac{1}{2})^{n_1+n_2+1} (1 + F_A) \right] \qquad (5.1)$$

[Fig. 5.1. Pedigree in which A is a common ancestor of P (through B and
D) and of Q (through C); X is the offspring of P and Q.]

(Wright, 1922). When inbreeding coefficients are computed in this
way it is necessary, of course, to define the base population to which
the present inbreeding is referred. The base population might be the
individuals from which an experiment was started or a herd founded;
or it might be those born before a certain date. The designation of an
individual as belonging to the base population means that it will be
assumed to have zero inbreeding coefficient. When pedigrees are
long and complicated there may be very many common ancestors, but
it is not necessary to trace back all lines of descent. A sufficiently
accurate estimate can be got by sampling a limited number of lines of
descent (Wright and McPhee, 1925).

Example 5.1. As an illustration of the use of formula 5.1 let us consider
the hypothetical pedigree in Fig. 5.2. The relevant individuals in the
pedigree are indicated by letters. Individual Z is the one whose
inbreeding coefficient is to be computed. Its parents are X and Y, so we
have to trace the paths of common ancestry connecting X with Y. There are
four common ancestors, A, B, C, and H, and five paths connecting X with Y
through them. We assume A, B, and C to have zero inbreeding coefficients,
since the pedigree tells us nothing about their ancestry. Individual H,
however, has parents that are half sibs, and the inbreeding coefficient
of H is therefore (½)^(1+1+1) = ⅛. The computation of the separate paths
may now be made as shown in the table. By addition of the contributions
from the five paths we get the inbreeding coefficient of Z as
F_Z = 0.05078, or 5.1 per cent.

[Fig. 5.2. Hypothetical pedigree of individual Z.]

Common     Path from   Generations to     Inbreeding       Contribution
ancestor   X to Y      common ancestor:   coeff. of        to inbreeding
                       from X   from Y    common ancestor  of Z

A          KGCADHL       4        4            0           (1/2)^9 = 0.00195
B          KHDBEJM       4        4            0           (1/2)^9 = 0.00195
B          KHDBEL        4        3            0           (1/2)^8 = 0.00391
C          KGCHL         3        3            0           (1/2)^7 = 0.00781
H          KHL           2        2           1/8          (1/2)^5 (1 + 1/8) = 0.03516

Total by summation                                         0.05078
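
The path-counting in Example 5.1 can be reproduced with a minimal sketch of formula 5.1. Each path is described only by its two generation counts and the inbreeding coefficient of its common ancestor; the function names are illustrative, and exact fractions are used so that the total can be compared with the table.

```python
from fractions import Fraction


def path_contribution(n1, n2, f_ancestor=Fraction(0)):
    """One term of formula 5.1: (1/2)**(n1 + n2 + 1) * (1 + F_A)."""
    return Fraction(1, 2) ** (n1 + n2 + 1) * (1 + f_ancestor)


def inbreeding_from_paths(paths):
    """Sum the contributions of all paths; each path is (n1, n2, F_A)."""
    return sum((path_contribution(*p) for p in paths), Fraction(0))


# H's parents are half sibs: one common ancestor, one generation back
# on each side, and that ancestor is taken to be non-inbred.
f_h = inbreeding_from_paths([(1, 1, Fraction(0))])

# The five paths of Example 5.1:
# (generations from X, generations from Y, F of the common ancestor).
paths_z = [
    (4, 4, Fraction(0)),  # A: KGCADHL
    (4, 4, Fraction(0)),  # B: KHDBEJM
    (4, 3, Fraction(0)),  # B: KHDBEL
    (3, 3, Fraction(0)),  # C: KGCHL
    (2, 2, f_h),          # H: KHL
]
f_z = inbreeding_from_paths(paths_z)

print(f_h, f_z, float(f_z))
```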

"Coancestry." There is another method of computing inbreed-
ing coefficients (Cruden, 1949; Emik and Terrill, 1949) which is more
convenient for many purposes, and is also more readily adapted to
a variety of problems. We shall use it later to work out the inbreeding
coefficients under regular systems of close inbreeding. The method
does not differ in principle from the formula 5.1 given above, but
instead of working from the present back to the common ancestors
we work forward, keeping a running tally generation by generation,
and compute the inbreeding that will result from the matings now
being made. The inbreeding coefficient of an individual depends on
the amount of common ancestry in its two parents. Therefore,
instead of thinking about the inbreeding of the progeny, we can think
of the degree of relationship by descent between the two parents.
This we shall call the coancestry of the two parents, and symbolise it
by f. It is identical with the inbreeding coefficient of the progeny,
and is the probability that two gametes taken one from one parent
and one from the other will contain alleles that are identical by
descent. (Malecot, 1948, calls this the "coefficient de parente," but
the translation "coefficient of relationship" cannot be used because
Wright (1922) has used this term with a different meaning.)

Consider the generalised pedigree in Fig. 5.3. X is an individual with
parents P and Q and grandparents A, B, C, and D. Now, the coancestry of P
with Q is fully determined by the coancestries relating A and B with C
and D, and if these are known we need go no further back in the pedigree.
It can be shown that the coancestry of P with Q is simply the mean of the
four coancestries AC, AD, BC, and BD. This will be clearer if stated in
the form of probabilities, though the explanation is cumbersome when put
into words. Take one gamete at random from P and one from Q, and repeat
this many times. In half the cases P's gamete will carry a gene from A
and in half from B: similarly for Q's gamete. So the two gametes, one
from P and one from Q, will carry genes from A and C in a quarter of the
cases, from A and D in a quarter, from B and C in a quarter, and from B
and D in a quarter of the cases. Now the probability that two gametes
taken at random, one from A and the other from C, are identical by
descent is the coancestry of A with C, i.e. $f_{AC}$, etc. So, reverting
now to symbols,

$$f_{PQ} = \tfrac{1}{4}f_{AC} + \tfrac{1}{4}f_{AD} + \tfrac{1}{4}f_{BC} + \tfrac{1}{4}f_{BD}$$

This gives the basic rule relating coancestries in one generation with
those in the next:

$$F_X = f_{PQ} = \tfrac{1}{4}(f_{AC} + f_{AD} + f_{BC} + f_{BD}) \qquad (5.2)$$

    A x B     C x D
      |         |
      P    x    Q
           |
           X

         Fig. 5.3

With this rule the experimenter can tabulate the coancestries genera-
tion by generation, and this gives a basis for planning matings and
computing inbreeding coefficients. More detailed accounts of the
operation are given by Cruden (1949), Emik and Terrill (1949), and
Plum (1954).

If there is overlapping of generations it may happen that we must
find the coancestry between individuals belonging to different
generations. This situation is covered by the following supplementary
rules, which can readily be deduced by a consideration of probabilities
in the manner explained above. Referring to the same pedigree
(Fig. 5.3),

$$f_{PC} = \tfrac{1}{2}(f_{AC} + f_{BC})$$

$$f_{PD} = \tfrac{1}{2}(f_{AD} + f_{BD})$$

and

$$f_{PQ} = \tfrac{1}{2}(f_{PC} + f_{PD}) \qquad (5.3)$$

which by substitution reduces to the basic rule.

Before we can apply this method to systems of close inbreeding
we have to see how the basic rule is to be applied when there are
fewer than four grandparents. As an example we shall consider the
coancestry between a pair of full sibs. The pedigree can be written
as in Fig. 5.4: A and B are parents of both P1 and P2, which are full
sibs and have an offspring X. Applying the basic rule (equation 5.2),
and noting that $f_{BA} = f_{AB}$, we have

$$F_X = f_{P_1P_2} = \tfrac{1}{4}(f_{AA} + f_{BB} + 2f_{AB}) \qquad (5.4)$$

[Fig. 5.4. Pedigree of an individual X whose parents, P1 and P2, are
full sibs with parents A and B.]

The meaning of $f_{AA}$, the coancestry of an individual with itself, is
the probability that two gametes taken at random from A will contain
identical alleles, and we have already seen that this probability is
equal to $\tfrac{1}{2}(1 + F_A)$. The value of $F_A$ will be known from the
coancestry of A's parents. The coancestry between offspring and parent
can be found in a similar way, by application of the supplementary rules
in 5.3. Substituting the individuals in Fig. 5.4 for those in Fig. 5.3
and applying the first two equations of 5.3 gives

$$f_{PA} = \tfrac{1}{2}(f_{AA} + f_{AB}), \qquad f_{PB} = \tfrac{1}{2}(f_{BB} + f_{AB}) \qquad (5.5)$$

where P is equivalent to either P1 or P2; and applying the third
equation of 5.3 gives the coancestry between full sibs

$$f_{P_1P_2} = \tfrac{1}{2}(f_{PA} + f_{PB}) = \tfrac{1}{4}(f_{AA} + f_{BB} + 2f_{AB})$$

as above. We now have all the rules needed for computing the in-
breeding coefficients in successive generations under regular systems
of inbreeding.
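
The rules of coancestry lend themselves to a short recursive sketch. The pedigree below is the full-sib pedigree described in the text; the dictionary layout, the function names, and the convention that unknown parents are unrelated and non-inbred members of the base population are assumptions made for illustration.

```python
from functools import lru_cache

# Hypothetical pedigree: name -> (parent 1, parent 2); None marks an
# unknown parent from the base population. Parents must be listed
# before their offspring.
PEDIGREE = {
    "A": (None, None), "B": (None, None),
    "P1": ("A", "B"), "P2": ("A", "B"),
    "X": ("P1", "P2"),
}
ORDER = {name: i for i, name in enumerate(PEDIGREE)}


@lru_cache(maxsize=None)
def coancestry(i, j):
    """Coancestry of i with j, built from the rules in the text."""
    if i is None or j is None:
        return 0.0                      # base individuals: unrelated
    if i == j:
        p, q = PEDIGREE[i]
        return 0.5 * (1.0 + coancestry(p, q))   # f_AA = (1 + F_A)/2
    if ORDER[i] < ORDER[j]:
        i, j = j, i                     # always expand the younger one
    p, q = PEDIGREE[i]
    return 0.5 * (coancestry(p, j) + coancestry(q, j))  # rule 5.3


def inbreeding(x):
    """F of x is the coancestry of its two parents."""
    return coancestry(*PEDIGREE[x])


print(coancestry("P1", "A"), coancestry("P1", "P2"), inbreeding("X"))
```

Expanding the younger of the two individuals at each step guards against tracing back through an individual that is itself an ancestor of the other, which is what allows the same function to handle overlapping generations.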

Regular Systems of Inbreeding

The consequences of regular systems of inbreeding have been
the subject of much study. They were first described in detail by
Wright (1921) in a series of papers which form the foundation of the
whole theory of small populations. Wright's studies were based on
the method of path coefficients (Wright, 1934, 1954). Haldane
(1937, 1955) and Fisher (1949) derived the consequences by the
method of matrix algebra. The inbreeding coefficients in successive
generations can, however, be more simply derived by application of
the rules of coancestry explained in the previous section, and this is
the method we shall follow here. We shall illustrate the application
of the method for consecutive full-sib mating, which is one of the most
commonly used systems, and give the results for some other systems.
The inbreeding coefficients refer to autosomal genes; the results for
sex-linked genes are described by Wright (1933) in a paper which
also contains a useful summary of the results for autosomal genes in
a great variety of mating systems.

Full-sib mating. The equation 5.4 given above for the co-
ancestry between full sibs can be applied to successive generations to

Chap. 5]

REGULAR SYSTEMS OF INBREEDING

FABLE

5-1

Inbreeding

coefficients

under

various

systems of

nbreeding

Generation

(

(2)

0 0

•500

250

•125

•250

750

375

•219

*375

•875

500

•063

•305

•438

•938

594

•172

.381

•469

•969

672

•293

•449

•484

•984

734

•409

•509

•492

.992

785

•512

•563

•496

•996

826

•601

•611

•498

.998

859

•675

•654

•499

•999

886

736

•691

908

785

•725

926

•826

755

940

•859

•782

95i

•886

•806

961

•908

•827

968

•925

•846

974

•940

•863

979

•951

•878

983

•960

•891

986

•968

•903

Column
A

(1)
(2)

System of mating Recurrence equation

Self-fertilisation, or re-
peated backcrosses to highly |(i +Ft_1)
inbred line.

Full brother x sister, or off-
spring x younger parent:

Inbreeding coefficient. J(i + 2Ft_± +Ft_2)

Probability of fixation
(from Schafer, 1937).

Half sib (females half J(i +6Ft-1+Ft_2)
sisters).

Repeated backcrosses to J(i +zFt_1)
random-bred individual.

92 SMALL POPULATIONS: III [Chap. 5

give the inbreeding coefficients under continued full-sib mating.
But it is more convenient to rearrange the equation so that the in-
breeding coefficient is given in terms of the inbreeding coefficients
of the previous generations. Note first that, because the mating sys-
tem is regular, contemporaneous individuals have the same inbreeding
coefficients and coancestries: so, referring again to the pedigree in
Fig. 5.4, $f_{AA} = f_{BB}$, and $F_A = F_B$. Now, if we let $t$ be the generation to
which individual X belongs, then $f_{AB} = F_{t-1}$, and
$f_{AA} = f_{BB} = \tfrac{1}{2}(1 + F_{t-2})$.
The coancestry equation can therefore be rewritten to give the
inbreeding coefficient in any generation, $t$, in terms of the inbreeding
coefficients of the previous two generations, thus:

$$F_t = \tfrac{1}{4}(1 + 2F_{t-1} + F_{t-2}) \tag{5.6}$$

This recurrence equation enables us to write down the inbreeding
coefficients in successive generations. In the first generation $F_{t-1}$
and $F_{t-2}$ are both zero and so $F_{(t=1)} = 0.25$. The inbreeding
coefficients in the first four generations are 0.25, 0.375, 0.50, and 0.59.
The rate of inbreeding is not constant in the first few generations, as
may be seen by computing $\Delta F$ from equation 3.9. For the first four
generations $\Delta F$ is 0.25, 0.17, 0.20, and 0.19. It later settles down
to a constant value of 0.191 (Wright, 1931). The inbreeding coefficients
over the first 20 generations of full-sib mating are given in
Table 5.1.
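The full-sib series quoted above can be checked numerically. The following is a minimal sketch in Python, not part of the original text: the function names are illustrative, and it assumes that equation 3.9 has the usual form $\Delta F = (F_t - F_{t-1})/(1 - F_{t-1})$.

```python
def full_sib_inbreeding(generations):
    """Return [F_1, ..., F_n] from F_t = (1 + 2*F_{t-1} + F_{t-2}) / 4 (equation 5.6)."""
    f_prev2, f_prev1 = 0.0, 0.0  # F_{t-2} and F_{t-1}: both zero before inbreeding starts
    series = []
    for _ in range(generations):
        f_t = (1 + 2 * f_prev1 + f_prev2) / 4
        series.append(f_t)
        f_prev2, f_prev1 = f_prev1, f_t
    return series


def rate_of_inbreeding(series):
    """Delta F in each generation, assumed to be (F_t - F_{t-1}) / (1 - F_{t-1}), with F_0 = 0."""
    rates = []
    previous = 0.0
    for f_t in series:
        rates.append((f_t - previous) / (1 - previous))
        previous = f_t
    return rates


if __name__ == "__main__":
    f = full_sib_inbreeding(20)
    for t, (f_t, d_f) in enumerate(zip(f, rate_of_inbreeding(f)), start=1):
        print(f"t={t:2d}  F={f_t:.3f}  dF={d_f:.3f}")
```

Run as a script, this prints one line per generation, so the early fluctuation of the rate and its later constancy can be read off directly.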

Some other systems of mating may now be mentioned briefly.
Self-fertilisation gives the most rapid inbreeding. If X is the off-
spring of P, we have from the coancestry identities

$$F_X = f_{PP} = \tfrac{1}{2}(1 + F_P)$$

and the recurrence equation is therefore

$$F_t = \tfrac{1}{2}(1 + F_{t-1}) \tag{5.7}$$

The inbreeding coefficients over the first ten generations of
self-fertilisation are given in Table 5.1. The rate of inbreeding is
constant from the beginning; $\Delta F = 0.5$ exactly.

Parent-offspring mating, in which offspring are mated to the
younger parent, gives the same series of inbreeding coefficients as
full-sib mating for autosomal genes, but for sex-linked genes it gives
a slightly higher rate of inbreeding. For sex-linked genes $\Delta F$ is 0.293
after the first few generations (Wright, 1933).


Half-sib mating is usually between paternal half sibs, one male being
mated to two or more of his half sisters. If these females are half
sisters of each other the recurrence equation is

$$F_t = \tfrac{1}{8}(1 + 6F_{t-1} + F_{t-2}) \tag{5.8}$$

The first 20 generations are given in Table 5.1. There are, however,
practical difficulties in the way of maintaining this system regularly,
and sometimes females that are full sisters of each other have to be
used. The inbreeding will then go a little faster. If full-sister females
are always used the recurrence equation is

$$F_t = \tfrac{1}{16}(3 + 8F_{t-1} + 4F_{t-2} + F_{t-3}) \tag{5.9}$$

Repeated backcrosses to an individual or to a highly
inbred line are often made, for a variety of purposes.
The resulting inbreeding is as follows. The pedigree
(Fig. 5.5) shows an individual, A, which will probably
be a male, mated to his daughter, C, his granddaughter,
D, etc. From the supplementary rule (5.5)

$$F_X = f_{AD} = \tfrac{1}{2}(f_{AA} + f_{AC})$$

The recurrence equation is therefore

$$F_t = \tfrac{1}{4}(1 + F_A + 2F_{t-1}) \tag{5.10}$$

where $F_A$ is the inbreeding coefficient of the individual to which the
repeated backcrosses are made. If A is an individual from the base
population and $F_A = 0$, the equation becomes

$$F_t = \tfrac{1}{4}(1 + 2F_{t-1}) \tag{5.11}$$

The inbreeding coefficients over the first 9 generations are given in
Table 5.1. If A is an individual from a highly inbred line and $F_A = 1$,
the equation becomes

$$F_t = \tfrac{1}{2}(1 + F_{t-1}) \tag{5.12}$$

which is identical with self-fertilisation. In this case A need not be
the same individual in successive generations: it can be any member
of the inbred line.

[Fig. 5.5: pedigree showing individual A mated to his daughter, C, and to his granddaughter, D.]
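The recurrence equations for the other regular systems can be iterated in the same way. The sketch below is illustrative only: the helper `iterate` and the function names are not from the text, and every system is assumed to start from a base population with all earlier inbreeding coefficients equal to zero.

```python
def iterate(recurrence, order, generations):
    """Iterate F_t = recurrence(F_{t-1}, ..., F_{t-order}), starting from zeros."""
    history = [0.0] * order  # most recent generation first
    series = []
    for _ in range(generations):
        f_t = recurrence(*history)
        series.append(f_t)
        history = [f_t] + history[:-1]
    return series


def selfing(f1):
    """Equation 5.7 (also 5.12)."""
    return (1 + f1) / 2


def half_sib(f1, f2):
    """Equation 5.8: females half sisters of each other."""
    return (1 + 6 * f1 + f2) / 8


def half_sib_full_sisters(f1, f2, f3):
    """Equation 5.9: females full sisters of each other."""
    return (3 + 8 * f1 + 4 * f2 + f3) / 16


def backcross(f_a):
    """Equation 5.10 for a recurrent parent with inbreeding coefficient f_a."""
    return lambda f1: (1 + f_a + 2 * f1) / 4


if __name__ == "__main__":
    print([round(f, 3) for f in iterate(selfing, 1, 10)])
    print([round(f, 3) for f in iterate(half_sib, 2, 20)])
    print([round(f, 3) for f in iterate(backcross(0.0), 1, 9)])
```

Passing `backcross(1.0)` gives the same series as `selfing`, which is the identity between equations 5.12 and 5.7 noted in the text.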


Example 5.2. As an example of the use of coancestry for computing
inbreeding coefficients let us consider populations derived from "2-way"
and from "4-way" crosses between highly inbred lines. In a 2-way cross
two inbred lines are crossed and the population is maintained by random
mating among the cross-bred individuals and subsequently among their
progeny. In a 4-way cross four inbred lines are crossed in two pairs, and
the two cross-bred groups are again crossed, subsequent generations being
maintained by random mating. If the base population is taken to be a real,
or hypothetical, random-bred population from which the inbred lines were
derived, we may compute the inbreeding coefficients of the population
derived from the cross, referring it to this base. The crosses and sub-
sequent generations are shown schematically in the diagram below.

Generation     2-way cross              4-way cross

    1             A × B             A × B        C × D
    2            X1 × X2            X1  X2       Y1  Y2
                                    X1 × Y1      X2 × Y2
    3               O                    Z1 × Z2
    4                                       O

The inbred lines are represented by A, B, C, and D. If they are fully
inbred, as we shall take them to be, the coefficient of inbreeding of the
individuals from the lines is 1, and the coancestry of an individual with
another of the same line is also 1. Therefore only one individual of each
line need be represented in the scheme, even though any number may
actually be used. The progeny of the crosses between the inbred lines are
represented by X and Y, the suffices 1 and 2 indicating different individuals.
In the 2-way cross the progeny of these cross-bred individuals are the
foundation generation whose inbreeding coefficient we are to compute.
They are represented by O. In the 4-way cross the two sorts of cross-bred
individuals, X and Y, are crossed, one sort with the other. Two such
matings are represented in the scheme. They produce the "double-cross"
individuals, Z, whose progeny constitute the foundation generation repre-
sented by O, whose inbreeding coefficient we are to compute.

In the computation of the coancestries we shall omit the symbol $f$,
writing for example AB for $f_{AB}$, the coancestry of individual A with
individual B. The coancestries of the parents in generation 1 are

$$AA = BB = CC = DD = 1$$

and

$$AB = AC = AD = BC = BD = CD = 0$$

The coancestries in the second generation of the 2-way cross are

$$X_1X_2 = \tfrac{1}{4}(AA + BB + AB + BA) \quad \text{(by equation 5.2)}$$
$$= \tfrac{1}{4}(1 + 1 + 0 + 0) = \tfrac{1}{2}$$

Therefore $F_O = 0.5$, which is the required inbreeding coefficient of the
foundation generation of the population derived from the 2-way cross.
The subsequent matings between the O individuals need produce no
further inbreeding provided enough 2nd generation matings are made.
The coancestries in the second generation of the 4-way cross are

$$X_1X_2 = Y_1Y_2 = \tfrac{1}{2} \quad \text{(as shown for the 2-way cross)}$$

and

$$X_1Y_2 = X_2Y_1 = \tfrac{1}{4}(AC + AD + BC + BD) = 0$$

The coancestries in the third generation are

$$Z_1Z_2 = \tfrac{1}{4}(X_1X_2 + Y_1Y_2 + X_1Y_2 + X_2Y_1) = \tfrac{1}{4}$$

Therefore the inbreeding coefficient of the foundation generation is
$F_O = 0.25$. Again, the inbreeding need not increase further, provided
enough third generation matings are made.
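The coancestry arithmetic of this example can be sketched in a few lines of Python. The names below (`base`, `offspring_coancestry`, `two_and_four_way`) are illustrative and not from the text; the sketch assumes fully inbred, unrelated lines and applies the rule of equation 5.2, that the coancestry of two individuals is the mean of the four coancestries between their parents.

```python
from itertools import product

LINES = "ABCD"

# Coancestries among the fully inbred parental lines:
# 1 within a line, 0 between different lines.
base = {(p, q): 1.0 if p == q else 0.0 for p, q in product(LINES, repeat=2)}


def offspring_coancestry(f, parents_p, parents_q):
    """Mean of the four parental coancestries (the rule of equation 5.2)."""
    return sum(f[(a, b)] for a in parents_p for b in parents_q) / 4


def two_and_four_way():
    """Return (F of O in the 2-way cross, F of O in the 4-way cross)."""
    f_xx = offspring_coancestry(base, ("A", "B"), ("A", "B"))  # X1 with X2
    f_yy = offspring_coancestry(base, ("C", "D"), ("C", "D"))  # Y1 with Y2
    f_xy = offspring_coancestry(base, ("A", "B"), ("C", "D"))  # any X with any Y
    gen2 = {
        ("X1", "X2"): f_xx,
        ("Y1", "Y2"): f_yy,
        ("X1", "Y2"): f_xy,
        ("Y1", "X2"): f_xy,
    }
    # Z1 is the offspring of X1 and Y1; Z2 is the offspring of X2 and Y2.
    f_zz = offspring_coancestry(gen2, ("X1", "Y1"), ("X2", "Y2"))
    return f_xx, f_zz


if __name__ == "__main__":
    print(two_and_four_way())
```

The inbreeding coefficient of O is the coancestry of its parents, so the two returned values correspond to the 2-way and 4-way foundation generations respectively.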

The meaning of these coefficients of inbreeding, with the base popula-
tion as stated, may be clarified thus. If we made a large number of 2-way,
or of 4-way, crosses each with a different set of inbred lines, the populations
derived from the crosses would constitute a set of lines or sub-populations.
The inbreeding coefficients would then indicate the expected amount of
dispersion of gene frequencies among these lines. Populations derived
from 2-way crosses are equivalent to progenies of one generation of
self-fertilisation. The gene frequencies can therefore have only three values,
0, ½, and 1. Populations derived from 4-way crosses are equivalent to
progenies of one generation of full-sib mating, and the gene frequencies
can have only five values, 0, ¼, ½, ¾, and 1.

Reference to a different base population. Having computed
a coefficient of inbreeding with reference to a certain group of indi-
viduals as the base population, one may then want to change the base
and refer the inbreeding coefficient to another group of individuals.
One might, for example, compute the inbreeding coefficient of a herd
of cattle referred to the foundation animals of the herd as the base,
and then want to recompute the inbreeding coefficient
so as to refer to the breed as a whole with a base population
in the more remote past. Let X represent the group
of individuals whose inbreeding coefficient is required,
and let A and B represent ancestral groups, A being more
remote than B, as shown in Fig. 5.6. Then it follows from
equation 3.11 that

$$P_{X.A} = P_{X.B} P_{B.A} \tag{5.13}$$

where $P_{X.A} = 1 - F_{X.A}$, $F_{X.A}$ being the inbreeding coefficient of X
referred to A as base, and similarly for the other subscripts.

[Fig. 5.6: the ancestral groups A and B and the group X, in order of descent.]

Example 5.3. A selection experiment with mice was started from a
foundation population made by a 4-way cross of highly inbred lines
(Falconer, 1953). According to the computation given above in Example
5.2, the inbreeding coefficient of this foundation population was reckoned
to be 25 per cent. On this basis the inbreeding coefficients of subsequent
generations were computed from the pedigrees by the coancestry method. The
inbreeding coefficient at generation 24, computed thus, was 58.8 per cent.
What would the inbreeding coefficient be if referred to the foundation
population as base, instead of to the more remote hypothetical population
from which the inbred lines were derived? The figures to be substituted in
equation 5.13 are $P_{X.A} = 0.412$ and $P_{B.A} = 0.75$. Therefore
$P_{X.B} = 0.412/0.75 = 0.549$. The inbreeding coefficient at generation 24,
referred to the foundation population as base, is therefore 45.1 per cent.

We may use this population of mice also to compare the rate of
inbreeding when computed by the two methods, from the pedigrees and from
the effective population size. Computed from the pedigrees, the average
rate of inbreeding over the 24 generations is found from equation 3.12
thus: $0.451 = 1 - (1 - \Delta F)^{24}$, whence $\Delta F = 2.47$ per cent. The population
was maintained by six pairs of parents in each generation. Matings were
made between individuals with the lowest coancestries and this has the
effect of equalising family size, as explained in the previous chapter.
Therefore, by equation 4.8, the effective number was twice the actual, i.e.
$N_e = 24$. The rate of inbreeding, by equation 4.1, is therefore
$\Delta F = 1/(2N_e) = 1/48 = 2.08$ per cent. The slightly higher rate of
inbreeding as computed directly
from the pedigrees can be attributed to some irregularities in the mating
system, resulting from the sterility of some parents and the death of some
whole litters. The random drift of a colour gene in this line, and two others
maintained in the same manner, was shown in Fig. 3.2.
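The arithmetic of this example is short enough to check in code. The sketch below is illustrative: the function names are not from the text, and it assumes only the relations quoted in the example, namely $P = 1 - F$ with equation 5.13, $F_t = 1 - (1 - \Delta F)^t$, and $\Delta F = 1/(2N_e)$.

```python
def rebase(f_x_a, f_b_a):
    """F of X referred to base B, from P_X.A = P_X.B * P_B.A with P = 1 - F."""
    return 1 - (1 - f_x_a) / (1 - f_b_a)


def mean_rate(f_t, generations):
    """Average Delta F per generation, solving F_t = 1 - (1 - dF)**t for dF."""
    return 1 - (1 - f_t) ** (1 / generations)


def rate_from_effective_number(n_e):
    """Delta F = 1 / (2 * N_e)."""
    return 1 / (2 * n_e)


if __name__ == "__main__":
    f_xb = rebase(0.588, 0.25)
    print(f"F referred to the foundation population: {100 * f_xb:.1f} per cent")
    print(f"rate from pedigrees: {100 * mean_rate(0.451, 24):.2f} per cent")
    print(f"rate from N_e = 24:  {100 * rate_from_effective_number(24):.2f} per cent")
```

The pedigree rate is computed from the rounded value 0.451, as in the text, rather than from the unrounded result of `rebase`.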


Fixation. One is often more interested in the probability of
fixation as a consequence of inbreeding than in the inbreeding coefficient.
The inbreeding coefficient gives the probability of an individual
being a homozygote, which is $1 - 2p_0q_0(1 - F)$ from Table 3.1.
But one wants to know also how soon all individuals in a line can be
expected to be homozygous for the same allele. This is the "purity"
implied by the term "pure line" which is often used to mean highly
inbred line. The degree of "purity" is the probability of fixation.
The probability of fixation has been worked out by Haldane (1937,
1955), Schafer (1937) and Fisher (1949). It depends on the number
of alleles and their arrangement in the initial mating of the line. The
probabilities of fixation over the first 20 generations of full-sib mating
are given in Table 5.1, when 4 alleles were present in the initial
mating. There cannot, of course, be more than 4 alleles in a sib-
mated line, and when there are fewer the probability of fixation is
greater (see Haldane, 1955).

Linkage. Linkage introduces a problem in connexion with the
consequences of inbreeding of which a solution is sometimes needed.
Individuals heterozygous at a particular locus will also be hetero-
zygous for a segment of chromosome in which the locus lies, and it
may be of interest to know the average length of heterozygous
segments. The form in which this problem most commonly arises is
connected with the transference of a marker gene to an inbred line by
repeated backcrosses, when one wants to know how much of the
foreign chromosome is transferred along with the marker. This
problem has been worked out by Bartlett and Haldane (1935). A
dominant gene can be transferred by successive crosses of the hetero-
zygote to the strain into which it is to be introduced. In this case the
mean length of chromosome introduced with the gene after $t$ crosses
is $1/t$ cross-over units on each side of the gene. A recessive gene is
commonly transferred by alternating backcrosses and intercrosses
from which the homozygote is extracted. The mean length of foreign
chromosome in this case is $2/t$ cross-over units on each side, after $t$
cycles. Other cases are described in the paper cited. From this and a
knowledge of the total map length of the organism we can arrive at
the expected proportion of the total chromatin that is still hetero-
geneous.

Example 5.4. What percentage of the total chromatin is expected to
be still heterogeneous after a dominant gene has been transferred to an
inbred strain of mice by five, and by ten successive backcrosses? The
expected length of heterogeneous chromosome associated with the gene
is 0.2 centimorgans after five crosses, and 0.1 cM after ten. The average
map length of the 20 chromosomes in male mice is 97.7 cM (Slizynski,
1955). Therefore 0.2 per cent of the chromosome will be heterogeneous
after five crosses, and 0.1 per cent after ten, assuming that the gene is
transferred through males, and taking the average as being the length of
the chromosome carrying the gene. The percentage of chromatin not
associated with the gene that is expected still to be heterogeneous can be
taken as approximately $1 - F_t$ from column A of Table 5.1: that is, 3.1 per
cent after five crosses and 0.1 per cent after ten. The total percentage of
heterogeneous chromatin is therefore 3.4 per cent after five crosses, and
0.2 per cent after ten.

Fig. 5.7. Theoretical models illustrating the distribution of
heterozygous segments of chromosome (shown black) after (a) 5
generations, (b) 10 generations, and (c) 20 generations of full-sib
mating, in an organism with twenty chromosomes, such as the
mouse. The total map-length is taken to be 2500 centimorgans,
and the chromosomes are assumed to be of equal genetic length.
The points marked A, B, C, D, in chromosomes I to IV are loci
held heterozygous by forced segregation, and the associated
heterozygous segments are cross-hatched. (From Fisher, The Theory of
Inbreeding, Oliver and Boyd, 1949; reproduced by courtesy of the
author and publishers.)


The more general problem of the mean length of heterozygous
segments during inbreeding has been treated by Haldane (1936) and
by Fisher (1949). It need not be discussed in detail here. The con-
clusions are well illustrated in Fig. 5.7, which is Fisher's diagrammatic
representation of the situation in an organism with 20 chromosomes,
such as the mouse, after five, ten, and twenty generations of full-sib
mating. The diagrams show the expected number and lengths of
unfixed segments. The first four chromosomes are supposed to carry
loci at which segregation is maintained by mating always hetero-
zygotes with homozygotes. The slower reduction of the lengths of
these unfixed segments can be seen.

Mutation. After a long period of inbreeding mutation may be-
come an important factor in determining the frequency of hetero-
zygotes. If u is the mutation rate of a gene that has reached near-
fixation in the line, then the frequency of heterozygotes at this locus
due to mutation is $4u$ under self-fertilisation, and $12u$ under full-sib
mating, for autosomal loci (Haldane, 1936). These are very small
frequencies if we are concerned with only one locus, but if the effects
of all loci are taken together mutation is not entirely negligible as a
source of heterozygosis in long inbred strains such as the widely used
strains of mice. The practical consequences of the origin of hetero-
geneity by mutation are that the characteristics of a line will slowly
change through the fixation of mutant alleles, and that sub-lines will
become differentiated. Examples are given in Chapter 15.

Selection favouring heterozygotes. When close inbreeding is
practised the object is generally to produce fixation, or homozygosis
within the lines, and the experimenter is not usually interested in the
differentiation between lines. It is therefore a matter of little concern
which allele is fixed, so long as fixation occurs. Selection against a
deleterious recessive may prevent the deleterious allele becoming
fixed, but it will not prevent or delay the fixation of the more favour-
able allele. Therefore the conclusions about selection reached in the
previous chapter are of little relevance to close inbreeding. Selection
that favours heterozygotes, however, is another matter. A conse-
quence of inbreeding almost universally observed is a reduction of
fitness, the reasons for which will be given in Chapter 14. Thus
selection resists the inbreeding, since the more homozygous indi-
viduals are the less fit, and this can only mean that selection favours
heterozygotes — not necessarily heterozygotes of the loci taken singly,
but heterozygotes of segments of chromosome. It is only necessary
to have two deleterious genes, recessive or partially recessive, linked
in repulsion, to confer a selective advantage on the heterozygote of
the segment of chromosome within which the genes are located. It is
therefore important to find out how the opposing tendencies of
inbreeding and selection in favour of heterozygotes balance each
other, in order to assess the reliability of the computed inbreeding
coefficient as a measure of the probability of fixation.

The outcome of the joint action of inbreeding and selection in
favour of heterozygotes depends on whether there is replacement of
the less fit lines by the more fit; in other words, on whether selection
operates between lines or only within lines. Within any one line,
selection against homozygotes only delays the progress toward
fixation and cannot arrest it, the delay being roughly in proportion to
the intensity of the selection (Reeve, 1955a). Table 5.2 shows the
rates of inbreeding with various intensities of selection, when there
are two alleles and selection acts equally against both homozygotes.
(The rate of inbreeding, $\Delta F$, is used here to mean the rate of dispersion
of gene frequencies and, after the first few generations when the
distribution of gene frequencies has become flat, it measures the rate
of fixation, i.e. the proportion of unfixed loci that become fixed in
each generation, as explained in Chapter 3.) The delay of fixation
caused by selection is least under the closest systems of inbreeding.
Thus the rate is halved under self-fertilisation when the coefficient
of selection is 0.67; under full-sib mating when it is 0.44; and under
half-sib mating when it is 0.35. It will be seen from the table that the
rate of inbreeding, though much reduced by intense selection, does
not become zero until the coefficient of selection rises to 1. If there is
only one line, therefore, fixation eventually goes to completion, unless
both homozygotes are entirely inviable or sterile.

Table 5.2

Rate of inbreeding, $\Delta F$, with selection favouring the
heterozygote. (Except with self-fertilisation, the rates are
only approximate over the first few generations of inbreeding.)

Coefficient of                        $\Delta F$ (%)
selection against      Self-
the homozygotes        fertilisation    Full sib    Half sib*

      0                   50.00          19.10       13.01
      0.2                 44.44          14.88        9.32
      0.4                 37.50          10.32        5.6?
      0.6                 28.57           5.71        2.48
      0.75                20.00           2.62        0.82
      0.8                 16.67           1.76        0.46

* Females full sisters to each other.
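The self-fertilisation column of Table 5.2 can be reproduced by a simple closed form. This form is not stated in the text and is offered only as a sketch under an assumption: each line is continued by one surviving selfed offspring of a heterozygote, heterozygotes having fitness 1 and both homozygotes fitness $1 - s$, so that the chance of fixation in a generation is $(1 - s)/(2 - s)$. The function name is illustrative.

```python
def selfing_rate_with_selection(s):
    """Assumed rate of fixation under selfing: (1 - s) / (2 - s).

    A heterozygote yields offspring 1/2 heterozygous and 1/2 homozygous;
    homozygotes survive with relative fitness 1 - s, so the surviving
    offspring is homozygous with probability (1 - s) / (2 - s).
    """
    return (1 - s) / (2 - s)


if __name__ == "__main__":
    for s in (0, 0.2, 0.4, 0.6, 0.75, 0.8):
        print(f"s = {s:<4}  dF = {100 * selfing_rate_with_selection(s):.2f} per cent")
```

Under the same assumption the rate falls to half its unselected value, 25 per cent, when $s = 2/3$, which agrees with the figure of 0.67 quoted for self-fertilisation.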

If there are many lines, however, selection may arrest the progress
of fixation and lead to a state of equilibrium, for the following reason.
The amount by which the inbreeding has changed the frequency of a
particular gene from its original value differs at any one time from
line to line. In other words, the state of dispersion of the locus has
gone further in some lines than in others. Now, if those lines in
which the dispersion has gone furthest, and which are consequently
most reduced in fitness, die out or are discarded, and if they are
replaced by sub-lines taken from the lines in which it has gone least
far, then the progress of the dispersive process will have been set
back. When there is replacement of lines in this way, and the selec-
tion is sufficiently intense, a state of balance between the opposing
tendencies of inbreeding and selection is reached. The intensity of
selection needed to arrest the dispersive process has been worked out
for regular systems of close inbreeding (Hayman and Mather, 1953).
Some of the conclusions, for the case of two alleles with equal selection
against the two homozygotes, are given in Table 5.3, which
shows the intensity of selection against the homozygotes which will
(a) just allow fixation to go eventually to completion, and (b) arrest
the dispersive process at a point of balance where the frequency of
heterozygotes is half its original value, i.e. where $P = 1 - F = 0.5$.
These figures show that only a moderate advantage of heterozygotes
will suffice to prevent complete fixation. Under full-sib mating, for
example, loci, or segments of chromosomes that do not recombine,
with a 25 per cent disadvantage in homozygotes will not all go to
fixation. And, of those with a 50 per cent disadvantage, only about
half will become fixed, no matter for how long the inbreeding is
continued.

Table 5.3

Balance between inbreeding and selection in favour of
heterozygotes, when selection operates between lines. The
figures are the selective disadvantages of homozygotes, $s$,
expressed as percentages. Column (a) shows the highest
value of $s$ compatible with complete fixation. Column (b)
shows the value of $s$ that leads to a steady state at
$P = 1 - F = 0.5$.

                                     (a)          (b)
Mating system                      (P = 0)     (P = 0.5)

Self-fertilisation                   50.0         66.7
Full-sib                             23.7         44.6
Half-sib (females half sisters)      18.8         47.2

It must be stressed, however, that prevention of fixation in this
way can only take place when there is replacement of lines and sub-
lines. The following breeding methods, for example, would allow
replacement of lines: if seed, set by self-fertilisation, were collected
in bulk and a random sample taken for planting, and this were re-
peated in successive generations; or, if sib pairs of mice were taken at
random from all the surviving progeny, so that the same amount of
breeding space was occupied in successive generations.

The conclusions outlined above refer to a single locus. If there
were more than a few loci on different chromosomes all subject to
selection against homozygotes of an intensity sufficient to arrest or
seriously delay the progress of inbreeding, the total loss of fitness
from all the loci would be very severe. Inbred lines of organisms
with a high reproductive rate, such as plants and Drosophila, might
well stand up to a total loss of fitness sufficient to keep several loci
or segments of chromosome permanently unfixed. But the loss of
fitness involved in preventing the fixation of more than two or three
loci in an organism such as the mouse would be crippling. Under
laboratory conditions the highly inbred strains of mice, after 100 or
more generations of sib-mating, have a fitness not much less than half
that of non-inbred strains. It is conceivable that they might have one
locus permanently unfixed, but it is difficult to believe that they can
have more. Complete lethality or sterility of both homozygotes at
one locus means a 50 per cent loss of progeny; at two unlinked loci, a
75 per cent loss. A mouse strain with a mortality or sterility of 50
per cent can be kept going, but hardly one with 75 per cent.


CHAPTER 6

CONTINUOUS VARIATION

It will be obvious, to biologist and layman alike, that the sort of
variation discussed in the foregoing chapters embraces but a small
part of the naturally occurring variation. One has only to consider
one's fellow men and women to realise that they all differ in countless
ways, but that these differences are nearly all matters of degree and
seldom present clear-cut distinctions attributable to the segregation
of single genes. If, for example, we were to classify individuals ac-
cording to their height, we could not put them into groups labelled
"tall" and "short," because there are all degrees of height, and a
division into classes would be purely arbitrary. Variation of this sort,
without natural discontinuities, is called continuous variation, and
characters that exhibit it are called quantitative characters or metric
characters, because their study depends on measurement instead of
on counting. The genetic principles underlying the inheritance of
metric characters are basically those outlined in the previous chapters,
but since the segregation of the genes concerned cannot be followed
individually, new methods of study have had to be developed and
new concepts introduced. A branch of genetics has consequently
grown up, concerned with metric characters, which is called variously
population genetics, biometrical genetics or quantitative genetics. The
importance of this branch of genetics need hardly be stressed; most
of the characters of economic value to plant and animal breeders are
metric characters, and most of the changes concerned in micro-
evolution are changes of metric characters. It is therefore in this
branch that genetics has its most important application to practical
problems and also its most direct bearing on evolutionary theory.

How does it come about that the intrinsically discontinuous variation
caused by genetic segregation is translated into the continuous
variation of metric characters? There are two reasons: one is the
simultaneous segregation of many genes affecting the character, and
the other is the superimposition of truly continuous variation arising
from non-genetic causes. Consider, for example, a simplified situation.
Suppose there is segregation at six unlinked loci, each with two
alleles at frequencies of 0.5. Suppose that there is complete dominance
of one allele at each locus and that the dominant alleles each add
one unit to the measurement of a certain character. Then if the
segregation of these genes were the only cause of variation there would
be 7 discrete classes in the measurements of the character, according
to whether the individual had the dominant allele present at 0, 1, 2, . . .
or 6 of the loci. The frequencies of the classes would be according to
the binomial expansion of $(\tfrac{1}{4} + \tfrac{3}{4})^6$, as shown in Fig. 6.1 (a). If our
measurements were sufficiently accurate we should recognise these
classes as being distinct and we should be able to place any individual
unambiguously in its class. If there were more genes segregating but
each had a smaller effect, there would be more classes with smaller
differences between them, as in Fig. 6.1 (b). It would then be more
difficult to distinguish the classes, and if the difference between the
classes became about as small as the error of measurement we should
no longer be able to recognise the discontinuities. In addition, metric
characters are subject to variation from non-genetic causes, and this
variation is truly continuous. Its effect is, as it were, to blur the edges
of the genetic discontinuity so that the variation as we see it becomes
continuous, no matter how accurate our measurements may be.

Fig. 6.1. Distributions expected from the simultaneous segregation
of two alleles at each of several or many loci: (a) 6 loci, (b) 24
loci. There is complete dominance of one allele over the other at
each locus, and the gene frequencies are all 0.5. Each locus, when
homozygous for the recessive allele, is supposed to reduce the
measurement by 1 unit in (a), and by ¼ unit in (b). The horizontal
scale, representing the measurement, shows the number of loci
homozygous for the recessive allele, and the vertical axis shows the
probability, or the percentage of individuals expected in each class.
The probabilities are derived from the binomial expansion of
$(\tfrac{1}{4} + \tfrac{3}{4})^n$, where $n$ is the number of loci, and they are taken from the
tables of Warwick (1932).
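The class frequencies described here follow directly from the binomial expansion. The sketch below is illustrative, with a function name that is not from the text; it assumes, as in the simplified situation above, unlinked loci with complete dominance and gene frequencies of 0.5, so that each locus is homozygous for the recessive allele with probability ¼.

```python
from math import comb


def class_frequencies(n_loci, p_recessive_homozygote=0.25):
    """P(k loci homozygous recessive) for k = 0..n, from the expansion of (3/4 + 1/4)**n."""
    q = p_recessive_homozygote
    return [
        comb(n_loci, k) * q ** k * (1 - q) ** (n_loci - k)
        for k in range(n_loci + 1)
    ]


if __name__ == "__main__":
    for n in (6, 24):
        freqs = class_frequencies(n)
        print(f"{n} loci, {len(freqs)} classes")
        for k, f in enumerate(freqs):
            print(f"  {k:2d} loci recessive: {100 * f:6.2f} per cent")
```

With 6 loci there are 7 classes and with 24 loci there are 25, which illustrates how the steps between classes shrink as the number of segregating genes rises.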

Thus the distinction between genes concerned with Mendelian
characters and those concerned with metric characters lies in the
magnitude of their effects relative to other sources of variation. A
gene with an effect large enough to cause a recognisable discontinuity
even in the presence of segregation at other loci and of non-genetic
variation can be studied by Mendelian methods, whereas a gene whose
effect is not large enough to cause a discontinuity cannot be studied
individually. This distinction is reflected in the terms major gene and
minor gene. There are, however, all intermediate grades, genes that
cannot properly be classed as major or as minor, such as the "bad
genes" of Mendelian genetics. And, furthermore, as a result of
pleiotropy the same gene may be classed as major with respect to one
character and minor with respect to another character. The distinc-
tion, though convenient, is therefore not a fundamental one, and there
is no good evidence that there are two sorts of genes with different
properties. Variation caused by the simultaneous segregation of
many genes may be called polygenic variation, and the minor genes
concerned are sometimes referred to as polygenes (see Mather, 1949).

Metric Characters

The metric characters that might be studied in any higher organ-
ism are almost infinitely numerous. Any attribute that varies con-
tinuously and can be measured might in principle be studied as a
metric character — anatomical dimensions and proportions, physio-
logical functions of all sorts, and mental or psychological qualities.
The essential condition is that they should be measurable. The
technique of measurement, however, sets a practical limitation on
what can be studied. Usually rather large numbers of individuals
have to be measured and the study of any character whose measurement
requires an elaborate technique therefore becomes impracticable.
Consequently the characters that have been used in studies of
quantitative genetics are predominantly anatomical dimensions, or
physiological functions measured in terms of an end-product, such as
lactation, fertility, or growth rate.

Fig. 6.2. Frequency distributions of four metric characters, with
normal curves superimposed. The means are indicated by arrows.
The characters are as follows, the number of observations on which
each histogram is based being given in brackets:

(a) Mouse (♂♂): growth from 3 to 6 weeks of age. (380)

(b) Mouse: litter size (number of live young in 1st litters).
(689)

(c) Drosophila melanogaster (♀♀): number of bristles on ventral
surface of 4th and 5th abdominal segments, together. (900)

(d) Drosophila melanogaster (♀♀): number of facets in the eye of
the mutant "Bar". (488)

(a), (b), and (c) are from original data: (d) is from data of Zeleny
(1922).

Some examples of metric characters are illustrated in Fig. 6.2.
The variation is represented graphically by the frequency distribu-
tion of measurements. The measurements are grouped into equally
spaced classes and the proportion of individuals falling in each class
is plotted on the vertical scale. The resulting histogram is discontinu-
ous only for the sake of convenience in plotting. If the class ranges
were made smaller and the number of individuals measured were in-
creased indefinitely the histogram would become a smooth curve.
The variation of some metric characters, such as bristle number or
litter size, is not strictly speaking continuous because, being measured
by counting, their values can only be whole numbers. Nevertheless,
one can regard the measurements in such cases as referring to an
underlying character whose variation is truly continuous though
expressible only in whole numbers, in a manner analogous to the
grouping of measurements into classes. For example, litter size may
be regarded as a measure of the underlying, continuously varying
character, fertility. For practical purposes such characters can be
treated as continuously varying, provided the number of classes is
not too small. When there are too few classes, as for example when
susceptibility to disease is expressed as death or survival, different
methods have to be employed, as will be explained in Chapter 18.

The frequency distributions of most metric characters approxi-
mate more or less closely to normal curves. This can be seen in
Fig. 6.2, where the smooth curves drawn through the histograms are
normal curves having means and variances calculated from the data.
In the study of metric characters it is therefore possible to make use
of the properties of the normal distribution and to apply the appro-
priate statistical techniques. Sometimes, however, the scale of
measurement must be modified if a distribution approximating to the
normal is to be obtained. The distribution in Fig. 6.2 (d), for example,
would be skewed if measured and plotted simply as the number of
facets. But it becomes symmetrical, and approximates to a normal
distribution, if measured and plotted in logarithmic units. The
criteria on which the choice of a scale of measurement rests cannot be
fully appreciated at this stage, and will be explained in Chapter 17.
Meantime it will be assumed that any metric character under dis-
cussion is measured on an appropriate scale and has a distribution
that is approximately normal.
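
The effect of a change of scale can be sketched numerically. The fragment below is an illustration only: the facet counts are hypothetical values chosen to be symmetrical on a logarithmic scale (they are not Zeleny's data), and the helper name `skewness` is an assumption of this sketch. It compares the moment coefficient of skewness before and after taking logarithms.

```python
import math

def skewness(xs):
    """Moment coefficient of skewness: third central moment / (variance ** 1.5)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

# Hypothetical facet counts, spaced evenly on a logarithmic scale.
facets = [25, 50, 50, 100, 100, 100, 200, 200, 400]
log_facets = [math.log(x) for x in facets]

print(skewness(facets) > 0)                 # skewed to the right on the original scale
print(abs(skewness(log_facets)) < 1e-9)     # symmetrical on the logarithmic scale
```

With counts of this kind the long upper tail disappears on the logarithmic scale, which is the sense in which the transformed distribution approaches the normal.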

General Survey of the Subject-matter

There are two basic genetic phenomena concerned with metric
characters, both more or less familiar to all biologists, and each forms
the basis of a breeding method. The first is the resemblance between
relatives. Everyone is familiar with the fact that relatives tend to
resemble each other, and the closer the relationship, in general the
closer the resemblance. Though it is only in our own species that
resemblances are readily discernible without measurement, the
phenomenon is equally present in other species. The degree of
resemblance varies with the character, some showing more, some less.
The resemblance between offspring and parents provides the basis
for selective breeding. Use of the more desirable individuals as
parents brings about an improvement of the mean level of the next
generation, and just as some characters show more resemblance than
others, so some are more responsive to selection than others. The
degree of resemblance between relatives is one of the properties of a
population that can be readily observed, and it is one of the aims of
quantitative genetics to show how the degree of resemblance between
different sorts of relatives can be used to predict the outcome of
selective breeding and to point to the best method of carrying out the
selection. This problem will form the central theme of the next
seven chapters, the resemblance between relatives being dealt with in
Chapters 9 and 10, and the effects of selection in Chapters 11-13.

The second basic genetic phenomenon is inbreeding depression,
with its converse hybrid vigour, or heterosis. This phenomenon is less
familiar to the layman than the first, since the laws against incest pre-
vent its more obvious manifestations in our own species; but it is well
known to animal and plant breeders. Inbreeding tends to reduce the
mean level of all characters closely connected with fitness in animals
and in naturally outbreeding plants, and to lead in consequence to loss
of general vigour and fertility. Since most characters of economic
value in domestic animals and plants are aspects of vigour or fertility,
inbreeding is generally deleterious. The reduced vigour and fertility
of inbred lines is restored on crossing, and in certain circumstances
this hybrid vigour can be made use of as a means of improvement.
The enormous improvement of the yield of commercially grown
maize has been achieved by this means and represents probably the
greatest practical achievement of genetics (see Mangelsdorf, 1951).
The effects of inbreeding and crossing will be described in Chapters
14-16.

The properties of a population that we can observe in connexion
with a metric character are means, variances, and covariances. The
natural subdivision of the population into families allows us to analyse
the variance into components which form the basis for the measure-
ment of the degree of resemblance between relatives. We can in
addition observe the consequences of experimentally applied breed-
ing methods, such as selection, inbreeding or cross-breeding. The
practical objective of quantitative genetics is to find out how we can
use the observations made on the population as it stands to predict
the outcome of any particular breeding method. The more general
aim is to find out how the observable properties of the population are
influenced by the properties of the genes concerned and by the various
non-genetic circumstances that may influence a metric character. The
chief properties of genes that have to be taken account of are the
degree of dominance, the manner in which genes at different loci
combine their effects, pleiotropy, linkage, and fitness under natural
selection. To take account of all these properties simultaneously, in
addition to a variety of non-genetic circumstances, would make the
problems unmanageably complex. We therefore have to simplify
matters by dealing with one thing at a time, starting with the simpler
situations.

The plan to be followed in the succeeding chapters is this: we
shall first show what determines the population mean, and then
introduce two new concepts — average effect and breeding value —
which are necessary to an understanding of the variance. Then we
shall discuss the variance, its analysis into components, and the co-
variance of relatives, which will lead us to the degree of resemblance
between relatives. In all this we shall take full account of dominance
from the beginning: the other complicating factors will be more
briefly discussed when they become relevant. The most important
simplification that we shall make concerns the effect of genes on
fitness: we shall assume that Mendelian segregation is undisturbed
by differential fitness of the genotypes. The description of means,
variances, and covariances will refer to a random breeding popula-
tion, with Hardy-Weinberg equilibrium genotype frequencies, with
no selection and no inbreeding. That is to say, we shall describe the
population before any special breeding method is applied to it. Then
in Chapters 11-13 we shall describe the effects of selection, and in
Chapters 14-16 the effects of inbreeding. This will cover the funda-
mentals of quantitative genetics, and in the final chapters we shall
discuss some special topics.

CHAPTER 7

VALUES AND MEANS

We have seen in the early chapters that the genetic properties of a
population are expressible in terms of the gene frequencies and geno-
type frequencies. In order to deduce the connexion between these
on the one hand and the quantitative differences exhibited in a metric
character on the other, we must introduce a new concept, the concept
of value, expressible in the metric units by which the character is
measured. The value observed when the character is measured on an
individual is the phenotypic value of that individual. All observations,
whether of means, variances, or covariances, must clearly be based on
measurements of phenotypic values. In order to analyse the genetic
properties of the population we have to divide the phenotypic value
into component parts attributable to different causes. Explanation of
the meanings of these components is our chief concern in this chapter,
though we shall also be able to find out how the population mean is
influenced by the array of gene frequencies.

The first division of phenotypic value is into components attribut-
able to the influence of genotype and environment. The genotype is
the particular assemblage of genes possessed by the individual, and
the environment is all the non-genetic circumstances that influence the
phenotypic value. Inclusion of all non-genetic circumstances under
the term environment means that the genotype and the environment
are by definition the only determinants of phenotypic value. The two
components of value associated with genotype and environment are
the genotypic value and the environmental deviation. We may think
of the genotype conferring a certain value on the individual and the
environment causing a deviation from this, in one direction or the
other. Or, symbolically,

P=G+E (7.1)

where P is the phenotypic value, G is the genotypic value, and E is the
environmental deviation. The mean environmental deviation in the
population as a whole is taken to be zero, so that the mean phenotypic
value is equal to the mean genotypic value. The term population
mean then refers equally to phenotypic or to genotypic values. When
dealing with successive generations we shall assume for simplicity that
the environment remains constant from generation to generation, so
that the population mean is constant in the absence of genetic change.
If we could replicate a particular genotype in a number of individuals
and measure them under environmental conditions normal for the
population, their mean environmental deviations would be zero, and
their mean phenotypic value would consequently be equal to the
genotypic value of that particular genotype. This is the meaning of
the genotypic value of an individual. In principle it is measurable,
but in practice it is not, except when we are concerned with a single
locus where the genotypes are phenotypically distinguishable, or with
the genotypes represented in highly inbred lines.

For the purposes of deduction we must assign arbitrary values to
the genotypes under discussion. This is done in the following way.
Considering a single locus with two alleles, A1 and A2, we call the
genotypic value of one homozygote +a, that of the other homozygote
-a, and that of the heterozygote d. (We shall adopt the convention
that A1 is the allele that increases the value.) We thus have a scale of
genotypic values as in Fig. 7.1. The origin, or point of zero value, on
this scale is mid-way between the values of the two homozygotes.

Genotype           A2A2               A1A2    A1A1
                    |---------+---------|-------|
Genotypic value    -a         0         d      +a

Fig. 7.1. Arbitrarily assigned genotypic values.

The value, d, of the heterozygote depends on the degree of dominance.
If there is no dominance, d = 0; if A1 is dominant over A2, d is positive,
and if A2 is dominant over A1, d is negative. If dominance is com-
plete, d is equal to +a or -a, and if there is overdominance d is
greater than +a or less than -a. The degree of dominance may be
expressed as d/a.

Example 7.1. For the purposes of illustration in this chapter, and also
later on, we shall refer to a dwarfing gene in the mouse, known as "pygmy"
(symbol pg), described by King (1950, 1955), and by Warwick and Lewis
(1954). This gene reduces body-size and is nearly, but not quite, recessive
in its effect on size. It was present in a strain of small mice (MacArthur's)
at the time the studies cited above were made. The weights of mice of the
three genotypes at 6 weeks of age were approximately as follows (sexes
averaged):

                    + +    + pg    pg pg
Weight in grams      14     12       6

(The weight of heterozygotes given here is to some extent conjectural, but
it is unlikely to be more than 1 gm. in error.) These are average weights
obtained under normal environmental conditions, and they are therefore
the genotypic values. The mid-point in genotypic value between the two
homozygotes is 10 gm., and this is the origin, or zero-point, on the scale
of values assigned as in Fig. 7.1. The value of a on this scale is therefore
4 gm., and that of d is 2 gm.
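
The assignment of values in this example can be sketched in a few lines of Python. The function name `genotypic_scale` and its argument names are assumptions made for illustration; the arithmetic is the mid-homozygote convention of Fig. 7.1.

```python
def genotypic_scale(g_a1a1, g_a1a2, g_a2a2):
    """Return (origin, a, d) for one locus with two alleles.

    g_a1a1: genotypic value of the homozygote for the increasing allele
    g_a1a2: genotypic value of the heterozygote
    g_a2a2: genotypic value of the other homozygote
    """
    origin = (g_a1a1 + g_a2a2) / 2   # mid-point between the two homozygotes
    a = g_a1a1 - origin              # half the difference between homozygotes
    d = g_a1a2 - origin              # heterozygote measured from the origin
    return origin, a, d

# Weights in grams of ++, +pg and pgpg mice, as given in Example 7.1.
origin, a, d = genotypic_scale(14, 12, 6)
print(origin, a, d)   # 10.0 4.0 2.0
```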

Population Mean

We can now see how the gene frequencies influence the mean of
the character in the population as a whole. Let the gene frequencies
of A1 and A2 be p and q respectively. Then the first two columns of
Table 7.1 show the three genotypes and their frequencies in a random
breeding population, from formula 1.2. The third column shows the
genotypic values as specified above.

Table 7.1

Genotype    Frequency    Value    Freq. × val.
A1A1        p²           +a       p²a
A1A2        2pq          d        2pqd
A2A2        q²           -a       -q²a
                                  Sum = a(p - q) + 2dpq

The mean value in the whole
population is obtained by multiplying the value of each genotype by
its frequency and summing over the three genotypes. The reason why
this yields the mean value may be understood by converting fre-
quencies to numbers of individuals. Multiplying the value by the
number of individuals in each genotype and summing over genotypes
gives the sum of values of all individuals. The mean value would then
be this sum of values divided by the total number of individuals. The
procedure in working with frequencies is the same, but since the sum
of the frequencies is 1, the sum of values x frequencies is the mean
value. In other words, the division by the total number has already
been made in obtaining the frequencies. Multiplication of values by
frequencies to obtain the mean value is a procedure that will be often
used in this chapter and subsequent ones. Returning to the popula-
tion mean, multiplication of the value by the frequency of each
genotype is shown in the last column of Table 7.1. Summation of
this column is simplified by noting that p² - q² = (p + q)(p - q) = p - q.
The population mean, which is the sum of this column, is thus

$M = a(p - q) + 2dpq$    (7.2)

This is both the mean genotypic value and the mean phenotypic
value of the population with respect to the character.

The contribution of any locus to the population mean thus has
two terms: a(p - q) attributable to homozygotes, and 2dpq attributable
to heterozygotes. If there is no dominance (d = 0) the second term is
zero, and the mean is a linear function of the gene frequency: M = a(1 - 2q).
If there is complete dominance (d = a) the mean is a linear function of the
square of the gene frequency: M = a(1 - 2q²). The total range of
values attributable to the locus is 2a, in the absence of overdominance.
That is to say, if A1 were fixed in the population (p = 1) the popula-
tion mean would be a, and if A2 were fixed (q = 1) it would be -a.
If the locus shows overdominance, however, the mean of an unfixed
population is outside this range.

Example 7.2. Let us take again the pygmy gene in mice, as described
in Example 7.1, and see what effect this gene would have on the population
mean when present at two particular frequencies. First, the total range is
from 6 gm. to 14 gm.: a population consisting entirely of pygmy homo-
zygotes would have a mean of 6 gm., and one from which the gene was
entirely absent would have a mean of 14 gm. (These values refer speci-
fically to MacArthur's Small Strain at the time the observations were
made.) Now suppose the gene were present at a frequency of 0.1, so that
under random mating homozygotes would appear with a frequency of 1
per cent. The values to be substituted in equation 7.2 are p = 0.9, q = 0.1,
and a = 4 gm., d = 2 gm., as shown in Example 7.1. The population mean,
by equation 7.2, is therefore: M = 4 × 0.8 + 2 × 0.18 = 3.56. This value of
the mean, however, is measured from the mid-homozygote point, which is
10 gm., as origin. Therefore the actual value of the population mean is
13.56 gm. Next suppose the gene were present at a frequency of 0.4.
Substituting in the same way, we find M = 1.76, to which must be added
10 gm. for the origin, giving a value of 11.76 gm. Rough corroboration of
these figures is given by the records of the strain carrying the gene.
When the gene was present at a frequency of about 0.4 the mean weight
was about 12 gm. Two generations later, when the pygmy gene had been
deliberately eliminated, the mean weight rose to about 14 gm.
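
The arithmetic of this example can be checked with a short Python sketch. The function names are illustrative assumptions. The first function applies equation 7.2 directly, and the second multiplies the observed genotypic values by the Hardy-Weinberg frequencies, which should give the same mean once the origin of 10 gm. is added back.

```python
def population_mean(a, d, q):
    """Equation 7.2: mean measured from the mid-homozygote point."""
    p = 1 - q
    return a * (p - q) + 2 * d * p * q

def mean_from_frequencies(values, q):
    """Sum of value x frequency over A1A1, A1A2, A2A2 under random mating."""
    p = 1 - q
    freqs = (p * p, 2 * p * q, q * q)
    return sum(f * v for f, v in zip(freqs, values))

weights = (14, 12, 6)          # ++, +pg, pgpg in grams (Example 7.1)
for q in (0.1, 0.4):
    m = population_mean(4, 2, q)
    print(q, round(m, 2), round(10 + m, 2),
          round(mean_from_frequencies(weights, q), 2))
# 0.1 3.56 13.56 13.56
# 0.4 1.76 11.76 11.76
```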

Now we have to put together the contributions of genes at several
loci and find their joint effect on the mean. This introduces the
question of how genes at different loci combine to produce a joint
effect on the character. For the moment we shall suppose that com-
bination is by addition, which means that the value of a genotype
with respect to several loci is the sum of the values attributable to the
separate loci. For example, if the genotypic value of A1A1 is a_A and
that of B1B1 is a_B, then the genotypic value of A1A1B1B1 is a_A + a_B.
The consequences of non-additive combination will be explained at
the end of this chapter. With additive combination, then, the popu-
lation mean resulting from the joint effects of several loci is the sum of
the contributions of each of the separate loci, thus:

$M = \sum a(p - q) + 2 \sum dpq$    (7.3)

This is again both the genotypic and the phenotypic mean value. The
total range in the absence of overdominance is now 2Σa. If all alleles
that increase the value were fixed the mean would be +Σa, and if all
alleles that decrease the value were fixed it would be -Σa. These are
the theoretical limits to the range of potential variation in the popula-
tion. The origin from which the mean value in equation 7.3 is
measured is the mid-point of the total range. This is equivalent to
the average mid-homozygote point of all the loci separately.

Example 7.3. As an example of two loci that combine additively, and
also of their joint effects on the population mean, we shall refer to two
colour genes in mice, whose effects on the number of pigment granules
have been described by Russell (1949). This is a metric character which
reflects the intensity of pigmentation in the coat. The two genes are
"brown" (b) and "extreme dilution" (ce), an allele of the albino series.
Measurements were made of the number of melanin granules per unit
volume of hair, in wild-type homozygotes, in the two single mutant homo-
zygotes, and in the double mutant homozygote. We shall assume both
wild-type alleles to be completely dominant, so that only these four geno-
types need be considered. The mean numbers of granules in the four
genotypes were as follows:

            B-     bb     2a_B
C-          95     90       5
cece        38     34       4
2a_C        57     56

The difference between the two figures in each row and in each column
measures the homozygote difference, or 2a on the scale of values assigned
as in Fig. 7.1. Apart from the trivial discrepancy of 1 unit, these differences
are independent of the genotype at the other locus. In other words, the
difference of value between B- and bb is the same among C- genotypes
as it is among cece genotypes; and similarly the difference between C- and
cece is the same in B- as it is in bb. Thus the two loci combine addi-
tively, and the value of a composite genotype can be rightly predicted
from knowledge of the values of the single genotypes. For example: the
bb genotype is 5 units less than the wild-type, and the cece is 57 units less;
therefore bb cece should be 62 units less, namely 33, which is almost iden-
tical with the observed value of 34.

We may use this example further to illustrate the effect of the two
loci jointly on the population mean. Let us work out, from the effects of
the loci taken separately, what would be the mean granule number in a
population in which the frequency of bb was q² = 0.4, and that of cece
was q² = 0.2. For the effects of the loci separately we shall take a_B = 2 and
a_C = 28. The population mean, considering one locus, is M = a(1 - 2q²),
when there is complete dominance. For the B locus this is M_B = 2 × 0.2
= 0.4; and for the C locus M_C = 28 × 0.6 = 16.8. The mean, considering both
loci together, is M_B + M_C = 17.2 (by equation 7.3). The point of origin from
which this is measured is the mid-point between the two double homo-
zygotes, which is ½(95 + 34) = 64.5. Thus the mean granule number in this
population would be 64.5 + 17.2 = 81.7. We may check this from the ob-
servations of the values of the joint genotypes. The four genotypes would
have the following frequencies and observed values:

Genotype          B- C-    B- cece    bb C-    bb cece
Frequency         0.48     0.12       0.32     0.08
Observed value    95       38         90       34

The mean value is obtained by multiplying the values by the frequencies
and summing over the four genotypes. This yields a mean granule number
of 81.68.
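
Both routes to the two-locus mean in this example can be reproduced with the following Python sketch. The variable and function names are assumptions made for illustration. Multiplying the single-locus frequencies to obtain the joint frequencies assumes that the two loci are combined at random, as in the frequencies quoted above.

```python
def mean_complete_dominance(a, q2):
    """M = a(1 - 2q^2) for complete dominance; q2 is the recessive homozygote frequency."""
    return a * (1 - 2 * q2)

a_B, a_C = 2, 28            # values taken for the two loci in the example
q2_b, q2_c = 0.4, 0.2       # frequencies of bb and cece
origin = (95 + 34) / 2      # mid-point between the two double homozygotes

# Route 1: sum of the separate contributions of the two loci, plus the origin.
predicted = (origin
             + mean_complete_dominance(a_B, q2_b)
             + mean_complete_dominance(a_C, q2_c))

# Route 2: observed values of the four joint genotypes, weighted by frequency.
observed = {("B-", "C-"): 95, ("B-", "cece"): 38,
            ("bb", "C-"): 90, ("bb", "cece"): 34}
freq_B = {"B-": 1 - q2_b, "bb": q2_b}
freq_C = {"C-": 1 - q2_c, "cece": q2_c}
direct = sum(freq_B[b] * freq_C[c] * v for (b, c), v in observed.items())

print(round(predicted, 2), round(direct, 2))   # 81.7 81.68
```

The small difference between the two figures reflects the discrepancy of 1 unit in the observed homozygote differences.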

Average Effect

In order to deduce the properties of a population connected with
its family structure we have to deal with the transmission of value
from parent to offspring, and this cannot be done by means of geno-
typic values alone, because parents pass on their genes and not their
genotypes to the next generation, genotypes being created afresh in
each generation. A new measure of value is therefore needed which
will refer to genes and not to genotypes. This will enable us to assign
a "breeding value" to individuals, a value associated with the genes
carried by the individual and transmitted to its offspring. The new
measure is the "average effect." We can assign an average effect to a
gene in the population, or to the difference between one gene and
another of an allelic pair. The average effect of a gene is the mean
deviation from the population mean of individuals which received
that gene from one parent, the gene received from the other parent
having come at random from the population. This may be stated in
another way. Let a number of gametes all carrying A1 unite at ran-
dom with gametes from the population; then the mean deviation from
the population mean of the genotypes so produced is equal to the
average effect of the gene A1. The concept of average effect is perhaps
easier to grasp in the form of the average effect of a gene-substitution,
which can more conveniently be used when only two alleles at a locus
are under consideration. If we could change, say, A2 genes into A1 at
random in the population, and could then note the resulting change of
value, this would be the average effect of the gene-substitution. It is
equal to the difference between the average effects of the two genes
involved in the substitution. A graphical representation of the average
effect of a gene-substitution is given later in Fig. 7.2.

It is important to realise that the average effect of a gene or a gene-
substitution depends on the gene frequency, and that the average
effect is therefore a property of the population as well as of the gene.
The reason for this can be seen in the words "taken at random" in the
definitions, because the content of the random sample depends on the
gene frequency in the population. The point may perhaps be more
easily understood from a specific example. Consider the substitution
of a recessive gene, a, for its dominant allele, A. The substitution will
change the value only when the individual already carries one reces-
sive allele, in other words in heterozygotes. Changing AA into Aa
will not affect the value, but changing Aa into aa will. Now, when the
frequency of the recessive allele, a, is low there will be many AA
individuals, which the substitution will not affect; but when the
recessive is at high frequency there will be very few AA individuals,
and most of the individuals in which a substitution can be made will
be affected by it. Therefore the average effect of the substitution will
be small when the frequency of the recessive allele is low and large
when it is high.

Let us see how the average effect is related to the genotypic
values, a and d, in terms of which the population mean was expressed.
This will help to make the concept clearer. The reasoning is set out
in Table 7.2.

Table 7.2

            Values and frequencies
            of genotypes produced      Mean value      Population
Type of     A1A1    A1A2    A2A2       of genotypes    mean to be            Average effect
gamete       a       d      -a         produced        deducted              of gene

A1           p       q                 pa + qd         -[a(p - q) + 2dpq]    q[a + d(q - p)]
A2                   p       q         -qa + pd        -[a(p - q) + 2dpq]    -p[a + d(q - p)]

Consider a locus with two alleles, A1 and A2, at fre-
quencies p and q respectively, and take first the average effect of the
gene A1, for which we shall use the symbol α1. If gametes carrying A1
unite at random with gametes from the population, the frequencies
of the genotypes produced will be p of A1A1 and q of A1A2. The
genotypic value of A1A1 is +a and that of A1A2 is d, and the mean of
these, taking account of the proportions in which they occur, is
pa + qd. The difference between this mean value and the population
mean is the average effect of the gene A1. Taking the value of the
population mean from equation 7.2 we get

$\alpha_1 = pa + qd - [a(p - q) + 2dpq] = q[a + d(q - p)]$    (7.4a)

Similarly the average effect of the gene A2 is

$\alpha_2 = -p[a + d(q - p)]$    (7.4b)

Now consider the average effect of the gene-substitution, letting A1
be substituted for A2. Of the A2 genes taken at random from the
population for substitution, a proportion p will be found in A1A2
genotypes and a proportion q in A2A2 genotypes. In the former the
substitution will change the value from d to +a, and in the latter
from -a to d. The average change is therefore p(a - d) + q(d + a),
which on rearrangement becomes a + d(q - p). Thus the average
effect of the gene-substitution (written as α, without subscript) is

$\alpha = a + d(q - p)$    (7.5)

The relation of α to α1 and α2 can be seen by comparing equations
7.5 and 7.4, whence

$\alpha = \alpha_1 - \alpha_2$    (7.6)

and

$\alpha_1 = q\alpha, \qquad \alpha_2 = -p\alpha$    (7.7)

Example 7.4. Consider again the pygmy gene and its effect on body
weight, for which a = 4 gm. and d = 2 gm. If the frequency of the pg gene
were q = 0.1, the average effect of substituting + for pg would be, by
equation 7.5, α = 4 + 2 × (-0.8) = 2.4 gm. And if the frequency were
q = 0.4, the average effect of the gene-substitution would be: α = 4 + 2 × (-0.2)
= 3.6 gm. Thus, the average effect is greater when the gene frequency is
greater. The average effects of the genes separately are, by equation 7.7:

                                q = 0.1    q = 0.4
Average effect of + :  α1 =     +0.24      +1.44
Average effect of pg:  α2 =     -2.16      -2.16

(The identity of the average effects of pg at the two gene frequencies is
only a coincidence.)
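
The average effects in this example can be recomputed from the definition rather than from the closed formulae. In the Python sketch below (the function name is an illustrative assumption) each average effect is obtained as the mean value of the genotypes produced by one type of gamete, less the population mean, and the gene-substitution effect is obtained separately from equation 7.5, so that equations 7.6 and 7.7 can be checked against it.

```python
def average_effects(a, d, q):
    """Return (alpha, alpha1, alpha2) for gene frequencies p = 1 - q and q."""
    p = 1 - q
    pop_mean = a * (p - q) + 2 * d * p * q      # equation 7.2
    # A1 gametes meet A1 with frequency p (value a) and A2 with frequency q (value d).
    alpha1 = (p * a + q * d) - pop_mean
    # A2 gametes meet A1 with frequency p (value d) and A2 with frequency q (value -a).
    alpha2 = (p * d - q * a) - pop_mean
    alpha = a + d * (q - p)                     # equation 7.5
    return alpha, alpha1, alpha2

for q in (0.1, 0.4):
    alpha, alpha1, alpha2 = average_effects(4, 2, q)
    print(q, round(alpha, 2), round(alpha1, 2), round(alpha2, 2))
# 0.1 2.4 0.24 -2.16
# 0.4 3.6 1.44 -2.16
```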

Breeding Value

The usefulness of the concept of average effect arises from the fact,
already noted, that parents pass on their genes and not their genotypes
to their progeny. It is therefore the average effects of the parent's
genes that determine the mean genotypic value of its progeny. The
value of an individual, judged by the mean value of its progeny, is
called the breeding value of the individual. Breeding value, unlike
average effect, can therefore be measured. If an individual is mated
to a number of individuals taken at random from the population then
its breeding value is twice the mean deviation of the progeny from
the population mean. The deviation has to be doubled because the
parent in question provides only half the genes in the progeny, the
other half coming at random from the population. Breeding values
can be expressed in absolute units, but are usually more conveniently
expressed in the form of deviations from the population mean, as
defined above. Just as the average effect is a property of the gene
and the population so is the breeding value a property of the individual
and the population from which its mates are drawn. One cannot
speak of an individual's breeding value without specifying the popu-
lation in which it is to be mated.

Defined in terms of average effects, the breeding value of an
individual is equal to the sum of the average effects of the genes it
carries, the summation being made over the pair of alleles at each
locus and over all loci. Thus, for a single locus with two alleles the
breeding values of the genotypes are as follows:

Genotype    Breeding value
A1A1        2α1 = 2qα
A1A2        α1 + α2 = (q - p)α
A2A2        2α2 = -2pα

Example 7.5. Let us illustrate breeding values by reference to the
pygmy gene in mice. The average effects of the + and pg genes were
given in the last example. From these we may find the breeding values of
the three genotypes as explained above. These breeding values, which are
given below, are deviations from the population mean. The population
means with gene frequencies of 0.1 and 0.4 were found in Example 7.2 and
are shown again below in the column headed M.

                         Breeding values
               M         + +      + pg     pg pg
q = 0.1      13.56      +0.48    -1.92    -4.32
q = 0.4      11.76      +2.88    -0.72    -4.32

(The breeding values of pygmy homozygotes are only hypothetical
because in fact pygmy homozygotes are nearly all sterile: but this compli-
cation may be overlooked in the present context.)
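
The practical definition of breeding value, twice the mean deviation of the progeny from the population mean, can be sketched in Python for a single locus. The function names and the argument `a1_fraction` (the proportion of the parent's gametes that carry the increasing allele: 1 for ++, 0.5 for +pg, 0 for pgpg) are assumptions introduced for this illustration.

```python
def progeny_mean(a1_fraction, a, d, q):
    """Mean genotypic value of progeny when the parent is mated at random."""
    p = 1 - q
    mean_if_a1 = p * a + q * d      # parent contributes A1, mate contributes at random
    mean_if_a2 = p * d - q * a      # parent contributes A2, mate contributes at random
    return a1_fraction * mean_if_a1 + (1 - a1_fraction) * mean_if_a2

def breeding_value(a1_fraction, a, d, q):
    """Twice the deviation of the progeny mean from the population mean."""
    p = 1 - q
    pop_mean = a * (p - q) + 2 * d * p * q
    return 2 * (progeny_mean(a1_fraction, a, d, q) - pop_mean)

for q in (0.1, 0.4):
    row = [round(breeding_value(f, 4, 2, q), 2) for f in (1, 0.5, 0)]
    print(q, row)
# 0.1 [0.48, -1.92, -4.32]
# 0.4 [2.88, -0.72, -4.32]
```

For a single locus these figures agree with the values obtained by summing average effects, as the text states.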

Extension to a locus with more than two alleles is straightforward,
the breeding value of any genotype being the sum of the average
effects of the two alleles present. If all loci are to be taken into
account, the breeding value of a particular genotype is the sum of the
breeding values attributable to each of the separate loci. If there is
non-additive combination of genotypic values a slight complication
arises. We have given two definitions of breeding value, a practical
one in terms of the measured value of the progeny and a theoretical
one in terms of average effects. Non-additive combination renders
these two definitions not quite equivalent. This point will be more
fully explained in Chapter 9.

Consideration of the definition of breeding value will show that
in a population in Hardy-Weinberg equilibrium the mean breeding
value must be zero; or if breeding values are expressed in absolute
units the mean breeding value must be equal to the mean genotypic
value and to the mean phenotypic value. This can be verified from
the breeding values listed above. Multiplying the breeding value by
the frequency of each genotype and summing gives the mean breeding
value (expressed as a deviation from the population mean) as

$2p^2q\alpha + 2pq(q - p)\alpha - 2q^2p\alpha = 2pq\alpha(p + q - p - q) = 0$
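
The same result can be checked numerically over a range of gene frequencies. The sketch below is a check under the single-locus, two-allele assumptions of this chapter; the function names are illustrative only.

```python
def breeding_values(a, d, q):
    """Breeding values of A1A1, A1A2, A2A2 as deviations from the population mean."""
    p = 1 - q
    alpha = a + d * (q - p)          # average effect of the gene-substitution
    return (2 * q * alpha, (q - p) * alpha, -2 * p * alpha)

def mean_breeding_value(a, d, q):
    """Frequency-weighted mean of the breeding values under random mating."""
    p = 1 - q
    freqs = (p * p, 2 * p * q, q * q)
    return sum(f * bv for f, bv in zip(freqs, breeding_values(a, d, q)))

# The mean is zero (apart from rounding error) whatever the gene frequency.
print(all(abs(mean_breeding_value(4, 2, k / 10)) < 1e-12 for k in range(11)))   # True
```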

The breeding value is sometimes referred to as the "additive
genotype," and variation in breeding value ascribed to the "additive
effects" of genes. Though we shall not use these terms we shall
follow custom in using the term "additive" in connexion with the
variation of breeding values to be discussed in the next chapter, and
we shall use the symbol A to designate the breeding value of an
individual.

Dominance Deviation

We have separated off the breeding value as a component part of
the genotypic value of an individual. Let us consider now what
makes up the remainder. When a single locus only is under con-
sideration, the difference between the genotypic value, G, and the
breeding value, A, of a particular genotype is known as the dominance
deviation D, so that

$G = A + D$    (7.8)

The dominance deviation arises from the property of dominance
among the alleles at a locus, since in the absence of dominance breed-
ing values and genotypic values coincide. From the statistical point
of view the dominance deviations are interactions between alleles, or
within-locus interactions. They represent the effect of putting genes
together in pairs to make genotypes; the effect not accounted for by
the effects of the two genes taken singly. Since the average effects of
genes and the breeding values of genotypes depend on the gene
frequency in the population, the dominance deviations are also
dependent on gene frequency. They are therefore partly properties
of the population and are not simply measures of the degree of
dominance.

Example 7.6. Continuing with the example of the pygmy gene, we
may now list the genotypic values and the breeding values, and so obtain
the dominance deviations of the three genotypes, by equation 7.8. These
values, all now expressed as deviations from the population mean, M, are
as follows:

                        q = 0.1: M = 13.56           q = 0.4: M = 11.76
                        + +     + pg    pg pg        + +     + pg    pg pg
Frequency               0.81    0.18    0.01         0.36    0.48    0.16
Genotypic value, G     +0.44   -1.56   -7.56        +2.24   +0.24   -5.76
Breeding value, A      +0.48   -1.92   -4.32        +2.88   -0.72   -4.32
Dominance dev., D      -0.04   +0.36   -3.24        -0.64   +0.96   -1.44
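
The figures of this example can be regenerated with a short Python sketch. The function name `value_components` is an assumption made for illustration. It returns, for the three genotypes in the order ++, +pg, pgpg, the genotypic value G and the breeding value A as deviations from the population mean, and the dominance deviation D = G - A.

```python
def value_components(a, d, q):
    """Return (G, A, D) tuples for A1A1, A1A2, A2A2, as deviations from the mean."""
    p = 1 - q
    pop_mean = a * (p - q) + 2 * d * p * q
    alpha = a + d * (q - p)
    g = (a - pop_mean, d - pop_mean, -a - pop_mean)         # genotypic values
    bv = (2 * q * alpha, (q - p) * alpha, -2 * p * alpha)   # breeding values
    dom = tuple(gi - ai for gi, ai in zip(g, bv))           # equation 7.8 rearranged
    return g, bv, dom

for q in (0.1, 0.4):
    g, bv, dom = value_components(4, 2, q)
    print(q, [round(x, 2) for x in g],
          [round(x, 2) for x in bv], [round(x, 2) for x in dom])
# 0.1 [0.44, -1.56, -7.56] [0.48, -1.92, -4.32] [-0.04, 0.36, -3.24]
# 0.4 [2.24, 0.24, -5.76] [2.88, -0.72, -4.32] [-0.64, 0.96, -1.44]
```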

The relations between genotypic values, breeding values and
dominance deviations can be illustrated graphically, as in Fig. 7.2,
and the meaning of the dominance deviation is perhaps more easily
understood in this way. In the figure the genotypic value (black dots)
is plotted against the number of $A_1$ genes in the genotype. A straight
regression line is fitted by least squares to these points, each point
being weighted by the frequency of the genotype it represents. The
position of this line gives the breeding values of each genotype, as
shown by the open circles. The differences between the breeding
values and the genotypic values are the dominance deviations, indicated
by vertical dotted lines. The cross marks the population mean.
The average effect, $\alpha$, of the gene-substitution is given by the
difference in breeding value between $A_2A_2$ and $A_1A_2$, or between
$A_1A_2$ and $A_1A_1$, as indicated. The original definition of the
average effect of a gene-substitution was given by Fisher (1918, 1941)
in terms of this linear regression of genotypic value on number of genes.

Fig. 7.2. Graphical representation of genotypic values (closed
circles), and breeding values (open circles), of the genotypes for a
locus with two alleles, $A_1$ and $A_2$, at frequencies p and q, as
explained in the text. Horizontal scale: number of $A_1$ genes in the
genotype. Vertical scales of value: on left, arbitrary values assigned
as in Fig. 7.1; on right, deviations from the population mean. The
figure is drawn to scale for the values: $d = \tfrac{3}{4}a$, and
$q = \tfrac{1}{4}$.
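The regression interpretation can be checked numerically. The sketch below fits a frequency-weighted least-squares line of assigned genotypic value on the number of $A_1$ genes and compares the slope with $\alpha = a + d(q - p)$. The helper name `weighted_regression` and the illustrative values a = 4, d = 2, q = 0.4 are assumptions made for the example, not part of the original text.

```python
def weighted_regression(x, y, w):
    """Weighted least-squares line y = intercept + slope * x (weights sum to 1)."""
    mean_x = sum(wi * xi for wi, xi in zip(w, x))
    mean_y = sum(wi * yi for wi, yi in zip(w, y))
    s_xy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    s_xx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    slope = s_xy / s_xx
    return mean_y - slope * mean_x, slope


a, d, q = 4.0, 2.0, 0.4                # illustrative values only
p = 1.0 - q
gene_count = [2, 1, 0]                 # number of A1 genes in A1A1, A1A2, A2A2
assigned = [a, d, -a]                  # assigned genotypic values
freqs = [p * p, 2 * p * q, q * q]      # weights: genotype frequencies

intercept, slope = weighted_regression(gene_count, assigned, freqs)
pop_mean = sum(f * v for f, v in zip(freqs, assigned))

# Points on the fitted line, as deviations from the mean: the breeding values.
breeding_values = [intercept + slope * x - pop_mean for x in gene_count]

print("slope of regression:", round(slope, 4))
print("alpha = a + d(q - p):", round(a + d * (q - p), 4))
print("breeding values:", [round(b, 4) for b in breeding_values])
```

The slope agrees with the average effect, and the fitted points, taken as deviations from the mean, agree with $2q\alpha$, $(q-p)\alpha$ and $-2p\alpha$.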

The dominance deviation can be expressed in terms of the arbitrarily
assigned genotypic values a and d, by subtraction of the breeding
value from the genotypic value, as shown in Table 7.3.

Table 7.3

Values of genotypes in a two-allele system, measured as
deviations from the population mean.
Population mean: $M = a(p - q) + 2dpq$
Average effect of gene-substitution: $\alpha = a + d(q - p)$

Genotypes                 $A_1A_1$            $A_1A_2$                   $A_2A_2$
Frequencies               $p^2$               $2pq$                      $q^2$
Assigned values           $+a$                $d$                        $-a$
Deviations from population mean:
  Genotypic value         $2q(a - pd)$        $a(q-p) + d(1 - 2pq)$      $-2p(a + qd)$
                          $2q(\alpha - qd)$   $(q-p)\alpha + 2pqd$       $-2p(\alpha + pd)$
  Breeding value          $2q\alpha$          $(q-p)\alpha$              $-2p\alpha$
  Dominance deviation     $-2q^2d$            $2pqd$                     $-2p^2d$

The genotypic values must first be converted to deviations from the
population mean, because the breeding values have been expressed
in this way. The genotypic values, so converted, are given in two
forms: in terms of a and in terms of $\alpha$. Let us take the genotype
$A_1A_1$ to show how these are obtained and how the dominance deviation
is obtained by subtraction of the breeding value. The arbitrarily
assigned genotypic value of $A_1A_1$ is $+a$, and the population mean is
$a(p - q) + 2dpq$. Expressed as a deviation from the population mean,
the genotypic value is therefore

$$a - [a(p - q) + 2dpq] = a(1 - p + q) - 2dpq = 2qa - 2dpq = 2q(a - dp).$$

This may be expressed in terms of the average effect, $\alpha$, by
substituting $a = \alpha - d(q - p)$ (from equation 7.5), and the
genotypic value then becomes $2q(\alpha - qd)$. Subtraction of the
breeding value, $2q\alpha$, gives the dominance deviation as $-2q^2d$.
By similar reasoning the dominance deviation of $A_1A_2$ is $2pqd$, and
that of $A_2A_2$ is $-2p^2d$. Thus all the dominance deviations are
functions of d. If there is no dominance d is zero and the dominance
deviations are also all zero. Therefore in the absence of dominance,
breeding values and genotypic values are the same. Genes that show no
dominance (d = 0) are sometimes called "additive genes," or are said
to "act additively."

Since the mean breeding value and the mean genotypic value are
equal, it follows that the mean dominance deviation is zero. This can
be verified by multiplying the dominance deviation by the frequency
of each genotype and summing. The mean dominance deviation is
thus

$$-2p^2q^2d + 4p^2q^2d - 2p^2q^2d = 0$$

Another fact, which will be needed later when we deal with
variances, may be noted here: there is no correlation between the
dominance deviation and the breeding value of the different genotypes.
This can be shown by multiplying together the dominance deviation,
the breeding value and the frequency of each genotype. Summation
gives the sum of cross-products, and it works out to be zero, thus:

$$-4p^2q^3\alpha d + 4p^2q^2(q - p)\alpha d + 4p^3q^2\alpha d = 4p^2q^2\alpha d\,(-q + q - p + p) = 0$$

Since the sum of cross-products is zero, breeding values and domin-
ance deviations are uncorrelated.
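Both results, the zero mean of the dominance deviations and the zero sum of cross-products with breeding values, hold for any a, d and gene frequency. A short numerical check is sketched below; the function names and the particular trial values are illustrative assumptions.

```python
def breeding_and_dominance(a, d, q):
    """Frequencies, breeding values and dominance deviations for A1A1, A1A2, A2A2."""
    p = 1.0 - q
    alpha = a + d * (q - p)
    freqs = [p * p, 2 * p * q, q * q]
    A = [2 * q * alpha, (q - p) * alpha, -2 * p * alpha]
    D = [-2 * q * q * d, 2 * p * q * d, -2 * p * p * d]
    return freqs, A, D


def mean_and_cross_product(a, d, q):
    """Return (mean dominance deviation, frequency-weighted sum of A * D)."""
    freqs, A, D = breeding_and_dominance(a, d, q)
    mean_D = sum(f * x for f, x in zip(freqs, D))
    cross = sum(f * x * y for f, x, y in zip(freqs, A, D))
    return mean_D, cross


# Trial values chosen arbitrarily, covering no dominance, partial
# dominance and overdominance at several gene frequencies.
results = [
    mean_and_cross_product(a, d, q)
    for a in (1.0, 4.0)
    for d in (0.0, 2.0, 6.0)
    for q in (0.1, 0.4, 0.9)
]
largest = max(max(abs(m), abs(c)) for m, c in results)
print("largest absolute departure from zero:", largest)
```

Any departure from zero that is printed is floating-point rounding only.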

Interaction Deviation

When only a single locus is under consideration the genotypic
value is made up of the breeding value and the dominance deviation
only. But when the genotype refers to more than one locus the geno-
typic value may contain an additional deviation due to non-additive
combination. Let $G_A$ be the genotypic value of an individual
attributable to one locus, $G_B$ that attributable to a second locus,
and G the aggregate genotypic value attributable to both loci together.
Then

$$G = G_A + G_B + I_{AB} \tag{7.9}$$

where $I_{AB}$ is the deviation from additive combination of these
genotypic values. In dealing with the population mean, earlier in this
chapter, we assumed that I was zero for all combinations of genotypes.
If I is not zero for any combination of genes at different loci,
those genes are said to "interact" or to exhibit "epistasis," the term
epistasis being given a wider meaning in quantitative genetics than
in Mendelian genetics. The deviation I is called the interaction
deviation or epistatic deviation. Loci may interact in pairs or in threes
or higher numbers, and the interactions may be of many different
sorts, as the behaviour of major genes shows. The complex nature of
the interactions, however, need not concern us, because in the aggre-
gate genotypic value interactions of all sorts are treated together as a
single interaction deviation. So for all loci together we can write

$$G = A + D + I \tag{7.10}$$

where A is the sum of the breeding values attributable to the separate
loci, and D is the sum of the dominance deviations.

If the interaction deviation is zero the genes concerned are said to
"act additively" between loci. Thus "additive action" may mean two
different things. Referred to genes at one locus it means the absence
of dominance, and referred to genes at different loci it means the
absence of epistasis.

Example 7.7. As an example of non-additive combination of two loci
we shall take the same two colour genes in mice that were used in Example
7.3 to illustrate additive combination; but this time we refer to their effects
on the size of the pigment granules, instead of their number (Russell,
1949). The mean size (diameter in μ) of the granules in the four
genotypes was as follows:

               B-       bb       Diff.
C-             1.44     0.77     0.67
$c^ec^e$       0.94     0.77     0.17
Diff.          0.50     0.00

This time the differences are not independent of the other genotype: the
$c^e$ gene for example has quite a large effect on the B- genotype, but none
at all on the bb genotype. Thus the two loci show epistatic interaction and
do not combine additively. Let us therefore work out the interaction
deviations. This is not altogether a straightforward matter because the
deviations depend on the gene frequencies in the population under
discussion; it does, however, help to clarify the meaning of the
interaction deviations.

If we were to measure the homozygote differences of these two loci
with the object of estimating the value of a for each, the results would
depend on the gene frequency at the other locus. For example, the
difference between B- and bb would be 0.67 if measured in C- genotypes, but
0.17 if measured in $c^ec^e$ genotypes. The value of a therefore depends on
the population in which it is measured. Let us take, for the sake of
illustration, a population in which the frequency of bb genotypes is
$q_B^2 = 0.4$ and the frequency of $c^ec^e$ genotypes is $q_C^2 = 0.2$.
Then the mean homozygote difference for the B locus will be
$2a_B = (0.67 \times 0.8) + (0.17 \times 0.2) = 0.57$. Similarly, for the
C locus, $2a_C = 0.30$. The object now is to find for
each genotype the aggregate genotypic value, G, for the two loci combined
(i.e. the observed values given above); then the genotypic values, $G_B$ and
$G_C$, derived from consideration of the two loci separately; and, finally, the
interaction deviation, $I_{BC}$, according to equation 7.9. The procedure is
simplified if all these values are expressed as deviations from the
population mean. The table gives, in line (1), the four genotypes (assuming again
complete dominance at both loci); in line (2), the frequency of each
genotype in the population; and in line (3), the observed value of granule size
in each genotype. The population mean is found by multiplying the value
by the frequency of each genotype and summing over the four genotypes.
This yields M = 1.112. Subtracting the population mean from the
observed value gives the aggregate genotypic value, G, as a deviation from
the population mean, shown in line (4).

(1) Genotypes          B- C-     B- $c^ec^e$    bb C-     bb $c^ec^e$    Mean
(2) Frequencies        0.48      0.12           0.32      0.08
(3) Observed values    1.44      0.94           0.77      0.77           1.112
(4) G                 +0.328    -0.172         -0.342    -0.342
(5) $G_B + G_C$       +0.288    -0.012         -0.282    -0.582
(6) $I_{BC}$          +0.040    -0.160         -0.060    +0.240

Now consider each locus separately, paying no regard to the other locus.
The genotypic values for a single locus, expressed as deviations from the
population mean, were given in Table 7.3. With complete dominance these
reduce to $2aq^2$ for the two dominant genotypes combined, and
$-2a(1 - q^2)$ for the recessive homozygote. Take the B- genotype for
example: the value of $2a_B$ in the population under consideration was
shown above to be 0.57, and the value of $q^2$ assumed is 0.4; therefore
the genotypic value is $0.57 \times 0.4 = +0.228$. This is the average
value of the B- genotype irrespective of the other locus.
The other single-locus values, found in a similar way, are as follows:

$G_B$:    B-    +0.228      bb          -0.342
$G_C$:    C-    +0.060      $c^ec^e$    -0.240

The values given in line (5) of the table as $G_B + G_C$ are found by
summation of the two appropriate single-locus values. For example, the B- C-
genotype is +0.228 + 0.060 = +0.288. These are the genotypic values
expected if there were additive combination. It may be noted that their
mean, obtained by summation of (value x frequency) is zero, as is the mean
of the aggregate genotypic values in line (4). Finally, the interaction
deviations, $I_{BC}$, given in line (6) are obtained by subtracting the "expected"
values in line (5) from the "actual" values in line (4). The mean interaction
deviation is also zero.
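The steps of this example can be traced in code. The sketch below takes the observed granule sizes and the assumed phenotype frequencies at each locus, and works through lines (2) to (6) of the table. The variable names are illustrative assumptions, the string "cece" stands for the $c^ec^e$ genotype, and the two loci are taken to be independent in the population, as the frequencies in line (2) imply.

```python
# Observed mean granule size for each two-locus genotype.
observed = {
    ("B-", "C-"): 1.44, ("B-", "cece"): 0.94,
    ("bb", "C-"): 0.77, ("bb", "cece"): 0.77,
}
freq_B = {"B-": 0.6, "bb": 0.4}        # frequency of bb assumed to be 0.4
freq_C = {"C-": 0.8, "cece": 0.2}      # frequency of cece assumed to be 0.2

# Line (2): two-locus frequencies, treating the loci as independent.
freq = {(b, c): freq_B[b] * freq_C[c] for (b, c) in observed}

# Line (3), last column: the population mean.
M = sum(freq[g] * observed[g] for g in observed)

# Mean homozygote difference (2a) at each locus, averaged over the other locus.
two_a_B = sum(freq_C[c] * (observed[("B-", c)] - observed[("bb", c)]) for c in freq_C)
two_a_C = sum(freq_B[b] * (observed[(b, "C-")] - observed[(b, "cece")]) for b in freq_B)

# Single-locus genotypic values under complete dominance:
# 2a*q^2 for the dominant class and -2a*(1 - q^2) for the recessive homozygote.
G_B = {"B-": two_a_B * freq_B["bb"], "bb": -two_a_B * (1 - freq_B["bb"])}
G_C = {"C-": two_a_C * freq_C["cece"], "cece": -two_a_C * (1 - freq_C["cece"])}

G = {g: observed[g] - M for g in observed}                     # line (4)
expected = {g: G_B[g[0]] + G_C[g[1]] for g in observed}        # line (5)
interaction = {g: G[g] - expected[g] for g in observed}        # line (6)
mean_interaction = sum(freq[g] * interaction[g] for g in observed)

for g in observed:
    print(g, f"G {G[g]:+.3f}  G_B+G_C {expected[g]:+.3f}  I {interaction[g]:+.3f}")
```

The printed rows correspond to lines (4), (5) and (6), and `mean_interaction` comes out at zero apart from rounding.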

CHAPTER 8

VARIANCE

The genetics of a metric character centres round the study of its
variation, for it is in terms of variation that the primary genetic
questions are formulated. The basic idea in the study of variation is
its partitioning into components attributable to different causes. The
relative magnitude of these components determines the genetic
properties of the population, in particular the degree of resemblance
between relatives. In this chapter we shall consider the nature of
these components and how the genetic components depend on the
gene frequency. Then, in the next chapter, we shall show how the
degree of resemblance between relatives is determined by the magni-
tudes of the components.

The amount of variation is measured and expressed as the vari-
ance: when values are expressed as deviations from the population
mean the variance is simply the mean of the squared values. The
components into which the variance is partitioned are the same as the
components of value described in the last chapter; so that, for
example, the genotypic variance is the variance of genotypic values,
and the environmental variance is the variance of environmental
deviations. The total variance is the phenotypic variance, or the
variance of phenotypic values, and is the sum of the separate com-
ponents. The components of variance and the values whose variance
they measure are listed in Table 8.1.

Table 8.1

Components of Variance

Variance component     Symbol     Value whose variance is measured
Phenotypic             $V_P$      Phenotypic value
Genotypic              $V_G$      Genotypic value
Additive               $V_A$      Breeding value
Dominance              $V_D$      Dominance deviation
Interaction            $V_I$      Interaction deviation
Environmental          $V_E$      Environmental deviation

The total variance is then, with certain qualifications to be
mentioned below, the sum of the components, thus:

$$V_P = V_G + V_E = V_A + V_D + V_I + V_E \tag{8.1}$$

Let us now consider these components of variance in detail.

Genotypic and Environmental Variance

The first division of phenotypic value that we made in the last
chapter was into genotypic value and environmental deviation,
P = G + E. The corresponding partition of the variance into genotypic
and environmental components formulates the problem of "heredity
versus environment" or "nature and nurture"; or, to put the question
more precisely, the relative importance of genotype and environment
in determining the phenotypic value. The "relative importance" of a
cause of variation means the amount of variation that it gives rise to,
as a proportion of the total. So the relative importance of genotype
as a determinant of phenotypic value is given by the ratio of geno-
typic to phenotypic variance, VG/VP. The genotypic and environ-
mental components cannot be estimated directly from observations
on the population, but in certain circumstances they can be estimated
in experimental populations. If one or other component could be
completely eliminated, the remaining phenotypic variance would
provide an estimate of the remaining component. Environmental
variance cannot be removed because it includes by definition all
non-genetic variance, and much of this is beyond experimental
control. Elimination of genotypic variance can, however, be achieved
experimentally. Highly inbred lines, or the $F_1$ of a cross between two
such lines, provide individuals all of identical genotype and therefore
with no genotypic variance. If a group of such individuals is raised
under the normal range of environmental circumstances, their phenotypic
variance provides an estimate of the environmental variance
$V_E$. Subtraction of this from the phenotypic variance of a genetically
mixed population then gives an estimate of the genotypic variance
of this population.

Example 8.1. Partitioning of the phenotypic variance into its
genotypic and environmental components has been done for several characters
in Drosophila melanogaster. The results are given later, in Table 8.2, but
here we may describe the results for one character in more detail in order
to show how the partitioning is made. The character is the length of the
thorax (in units of 1/100 mm.), which may be regarded as a measure of
body-size. The phenotypic variance was measured first in a genetically mixed
(i.e. random-bred) population, and then in a genetically uniform
population, consisting of the $F_1$ generation of three crosses between highly
inbred lines. The first estimates the genotypic and environmental variance
together, and the second estimates the environmental variance alone. So,
by subtraction, an estimate of the genotypic variance is obtained. The
results, obtained by F. W. Robertson (1957b), were as follows:

Population      Components       Observed variance
Mixed           $V_G + V_E$      0.366
Uniform         $V_E$            0.186
Difference      $V_G$            0.180

$V_G/V_P = 49\%$

Thus 49 per cent of the variation of thorax length in this genetically mixed
population is attributable to genetic differences between individuals, and
51 per cent to non-genetic differences.
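The arithmetic of the partition is slight, but it may help to see it written out. The function below is a sketch with a hypothetical name; it assumes, as the example does, that the variance of the genetically uniform group estimates the environmental variance of the mixed population.

```python
def partition_variance(v_mixed, v_uniform):
    """Split a phenotypic variance into genotypic and environmental parts.

    v_mixed   : phenotypic variance of the genetically mixed population
    v_uniform : phenotypic variance of the genetically uniform group,
                taken as an estimate of the environmental variance
    Returns (V_G, V_E, V_G / V_P).
    """
    v_e = v_uniform
    v_g = v_mixed - v_uniform
    return v_g, v_e, v_g / v_mixed


v_g, v_e, ratio = partition_variance(v_mixed=0.366, v_uniform=0.186)
print(f"V_G = {v_g:.3f}, V_E = {v_e:.3f}, V_G/V_P = {ratio:.0%}")
```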

Individuals of identical genotype are also provided by identical
twins in man and cattle, but their use in partitioning the variance is
very limited: they will be discussed in a later chapter when the
problems that they raise will be better understood. Apart from the
severely limited use of identical twins, the partitioning of the vari-
ance into genotypic and environmental components depends on the
availability of highly inbred lines, and is therefore restricted to experi-
mental populations of plants or small animals.

Three complications arise in connexion with the partitioning of
the variance into genotypic and environmental components. They
are all things that can usually be neglected or circumvented with little
risk of error, but in some circumstances they may be important. The
following account of them might well be omitted at a first reading,
unless the reader is worried by the logical fallacies introduced by
neglecting them.

Dependence of environmental variance on genotype. Ex-
periments of the type illustrated in Example 8.1 rest on the assump-
tion that the environmental variance is the same in all genotypes, and
this is certainly not always true. The environmental variance
measured in one inbred line or cross is that shown by this one particular
genotype, and other genotypes may be more or less sensitive to
environmental influences and may therefore show more or less
environmental variance. The environmental variance of the mixed
population may therefore not be the same as that measured in the
genotypically uniform group. Not very much is yet known about this
complication except that many characters show more environmental
variance among inbred than among outbred individuals, inbreds being
more sensitive or less well "buffered." The reality of the complica-
tion is therefore not in doubt. Further discussion of the phenomenon
will be found under the effects of inbreeding, in Chapter 15, where it
more properly belongs. The existence of this complication means
that when dealing with genotypically mixed populations we have to
define the environmental component of variance as the mean en-
vironmental variance of the genotypes in the population, and we have
to recognise the possibility that if the frequencies of the genotypes
are changed, as by selection, the environmental variance may also be
changed in consequence.

Genotype-environment correlation. Hitherto we have tacitly
assumed that environmental deviations and genotypic values are
independent of each other; in other words that there is no correlation
between genotypic value and environmental deviation, such as would
arise if the better genotypes were given better environments. Corre-
lation between genotype and environment is seldom an important
complication, and can usually be neglected in experimental popula-
tions, where randomisation of environment is one of the chief objects
of experimental design. There are some situations, however, in
which the correlation exists. Milk-yield in dairy cattle provides an
example. The normal practice of dairy husbandry is to feed cows
according to their yield, the better phenotypes being given more
food. This introduces a correlation between phenotypic value and
environmental deviation; and, since genotypic and phenotypic values
are correlated, there is also a correlation between genotypic value and
environmental deviation. The complication of genotype-environment
correlation is very simply overcome by regarding the special environment
(i.e. the feeding level in the case of cows) as part of the genotype.
This situation is covered by the definition of genotypic value,
provided genotypic values are taken to refer to genotypes as they
occur under the normal conditions of association with specific
environments. If genotypic values were not so defined we could not
treat the phenotypic variance as simply the sum of the genotypic and
environmental variances, but we should have to include a covariance
term, thus:

$$V_P = V_G + V_E + 2\,\mathrm{cov}_{GE} \tag{8.2}$$

where $\mathrm{cov}_{GE}$ is the covariance of genotypic values and
environmental deviations. If the genotypic variance is estimated, as in
Example 8.1, by the comparison of genetically identical with genetically
mixed groups, then the covariance would be eliminated with the genotypic
variance from the genetically identical group, and the estimate
obtained will be of genotypic variance together with twice the
covariance. Thus, while on theoretical grounds it is convenient, on
practical grounds it is unavoidable, to regard any covariance that may
arise from genotype-environment correlation as being part of the
genotypic variance.
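The identity in equation 8.2, and the consequence that the subtraction method estimates the genotypic variance together with twice the covariance, can be illustrated with a small made-up data set. All the numbers below are hypothetical, chosen only so that the better genotypes tend to receive the better environments; the helper names are likewise assumptions.

```python
def mean(xs):
    return sum(xs) / len(xs)


def covariance(xs, ys):
    """Population covariance (divisor n); covariance(x, x) is the variance."""
    mx, my = mean(xs), mean(ys)
    return mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])


# Hypothetical genotypic values and environmental deviations for five
# individuals; the values are invented for illustration.
G = [-2.0, -1.0, 0.0, 1.0, 2.0]
E = [-1.0, 0.5, -0.5, 0.0, 1.0]
P = [g + e for g, e in zip(G, E)]      # phenotypic values, P = G + E

V_P = covariance(P, P)
V_G = covariance(G, G)
V_E = covariance(E, E)
cov_GE = covariance(G, E)

print("V_P                     =", V_P)
print("V_G + V_E + 2 cov_GE    =", V_G + V_E + 2 * cov_GE)
# What the comparison of mixed and uniform groups would estimate:
print("V_P - V_E               =", V_P - V_E)
print("V_G + 2 cov_GE          =", V_G + 2 * cov_GE)
```

With these figures the difference V_P - V_E exceeds V_G by exactly twice the covariance, which is the quantity the text proposes to count as part of the genotypic variance.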

Genotype-environment interaction. Another assumption that
we have made, which is not always justifiable, is that a specific differ-
ence of environment has the same effect on different genotypes; or, in
other words, that we can associate a certain environmental deviation
with a specific difference of environment, irrespective of the genotype
on which it acts. When this is not so there is an interaction, in the
statistical sense, between genotypes and environments. There are
several forms which this interaction may take (Haldane, 1946). For
example, a specific difference of environment may have a greater
effect on some genotypes than on others; or there may be a change in
the order of merit of a series of genotypes when measured under
different environments. That is to say, genotype A may be superior
to genotype B in environment X, but inferior in environment Y, as in
the following example.

Example 8.2. The following figures show the growth, between 3 and
6 weeks of age, of two strains of mice reared on two levels of nutrition
(original data):

               Good nutrition     Bad nutrition
Strain A       17.2 gm.           12.6 gm.
Strain B       16.6 gm.           13.3 gm.

Strain A grows better than strain B under good conditions, but worse
under bad conditions.
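One simple way to express the interaction in a two-by-two table of this kind is as a difference of differences: the effect of the change of nutrition on one strain, less its effect on the other. The sketch below computes that contrast for the figures above; the variable names are illustrative, and the contrast is offered as a descriptive summary, not as the variance component itself.

```python
# Growth (gm.) of each strain on each level of nutrition.
growth = {
    ("A", "good"): 17.2, ("A", "bad"): 12.6,
    ("B", "good"): 16.6, ("B", "bad"): 13.3,
}

# Effect of the environmental difference on each genotype.
effect_on_A = growth[("A", "good")] - growth[("A", "bad")]
effect_on_B = growth[("B", "good")] - growth[("B", "bad")]

# With no interaction the two effects would be equal and this would be zero.
interaction_contrast = effect_on_A - effect_on_B

# A change in the order of merit between environments.
a_better_when_good = growth[("A", "good")] > growth[("B", "good")]
a_better_when_bad = growth[("A", "bad")] > growth[("B", "bad")]
rank_reversal = a_better_when_good != a_better_when_bad

print(f"effect of nutrition on A: {effect_on_A:.1f} gm.")
print(f"effect of nutrition on B: {effect_on_B:.1f} gm.")
print(f"interaction contrast:     {interaction_contrast:.1f} gm.")
print("order of merit reversed: ", rank_reversal)
```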


An interaction between genotype and environment, whatever its
nature, gives rise to an additional component of variance. This
interaction variance can be isolated and measured only under rather
artificial circumstances. We may replicate genotypes by the use of
inbred lines or $F_1$'s, and replicate specific environments by the
control of such factors as nutrition or temperature. Then an analysis of
variance in a two-way classification of genotypes x environments will
yield estimates of the genotypic variance (between genotypes), the
environmental variance (between environments) and the variance
attributable to interaction of genotypes with environments. The
specific environments in such an experiment are, however, more in
the nature of "treatments" because a population under genetical
study would not normally encounter so wide a range of environments
as that provided by the different treatments. It is therefore the
genotype-environment interaction occurring within one such treat-
ment that is relevant to the genetical study of a population, and this
cannot be measured because the separate elements of the environ-
ment cannot be isolated and controlled. In an experiment such as
that of Example 8.1, which removes the genotypic variance by the
use of inbred lines or $F_1$'s, the interaction variance remains with the
environmental in the phenotypic variance measured in the genetically
uniform individuals. In normal circumstances, therefore, the vari-
ance due to genotype-environment interaction, since it cannot be
separately measured, is best regarded as part of the environmental
variance. When large differences of environment, such as between
different habitats, are under consideration, the presence of genotype-
environment interaction becomes important in connexion with the
specialisation of breeds or varieties to local conditions. This matter
will be taken up again later, in Chapter 19, because it can be more
profitably discussed from a different viewpoint.

Genetic Components of Variance

The partition into genotypic and environmental variance does not
take us far toward an understanding of the genetic properties of a
population, and in particular it does not reveal the cause of resem-
blance between relatives. The genotypic variance must be further
divided according to the division of genotypic value into breeding
value, dominance deviation, and interaction deviation. Thus we have:

Values:                $G = A + D + I$

Variance components:   $V_G = V_A + V_D + V_I$   (8.4)

                       (genotypic) (additive) (dominance) (interaction)

The additive variance, which is the variance of breeding values, is the
important component since it is the chief cause of resemblance be-
tween relatives and therefore the chief determinant of the observable
genetic properties of the population and of the response of the popu-
lation to selection. Moreover, it is the only component that can be
readily estimated from observations made on the population. In prac-
tice, therefore, the important partition is into additive genetic variance
versus all the rest, the rest being non-additive genetic and environ-
mental variance. This partitioning is most conveniently expressed
as the ratio of additive genetic to total phenotypic variance, $V_A/V_P$, a
ratio called the heritability.

Estimation of the additive variance rests on observation of the
degree of resemblance between relatives and will be described later
when we have discussed the causes of resemblance between relatives.
Our immediate concern here is to show how the genetic components
of variance are influenced by the gene frequency. To do this we have
to express the variance in terms of the gene frequency and the as-
signed genotypic values a and d. We shall consider first a single locus
with two alleles, thus excluding interaction variance for the moment.

Additive and dominance variance. The information needed to
obtain expressions for the variance of breeding values and the variance
of dominance deviations was given in the last chapter in Table 7.3
(p. 125). This table gives the breeding values and dominance devia-
tions of the three genotypes, expressed as deviations from the popu-
lation mean. It will be remembered that the means of both breeding
values and dominance deviations are zero. Therefore no correction
for an assumed mean is needed, and the variance is simply the mean
of the squared values. The variances are thus obtained by squaring
the values in the table, multiplying by the frequency of the genotype
concerned, and summing over the three genotypes. (The procedure
of multiplying values by frequencies to obtain the mean was explained
on p. 114.) The additive variance, which is the variance of breeding
values, is obtained as follows:

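$$\begin{aligned}
V_A &= \alpha^2\left[4p^2q^2 + (q - p)^2 \cdot 2pq + 4p^2q^2\right] \\
    &= 2pq\alpha^2\left(2pq + p^2 + q^2 - 2pq + 2pq\right) \\
    &= 2pq\alpha^2 \qquad\qquad\qquad (8.5a) \\
    &= 2pq[a + d(q - p)]^2 \qquad (8.5b)
\end{aligned}$$

Similarly the variance of dominance deviations is

$$V_D = d^2\left(4q^4p^2 + 8p^3q^3 + 4p^4q^2\right) = (2pqd)^2 \tag{8.6}$$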

It was noted in the last chapter that breeding values and dominance
deviations are uncorrelated. From this it follows that the genotypic
variance is simply the sum of the additive and dominance variances.
Thus

$$V_G = V_A + V_D = 2pq[a + d(q - p)]^2 + [2pqd]^2 \tag{8.7}$$

Example 8.3. To illustrate the genetic components of variance arising
from a single locus let us return to the pygmy gene in mice, used for
several examples in the last chapter. From the values tabulated in
Example 7.6 (p. 123) we may compute the components of variance directly.
Since the values are expressed as deviations from the population mean, the
variance is obtained by multiplying the frequency of each genotype by
the square of its value, and summing over the three genotypes. For
example, the genotypic variance when q = 0.1 is
$0.81(0.44)^2 + 0.18(-1.56)^2 + 0.01(-7.56)^2 = 1.1664$. The additive
variance is obtained in the same way from the variance of breeding values,
and the dominance variance from the variance of dominance deviations.
The variances obtained are as follows:

                      q = 0.1     q = 0.4
Genotypic, $V_G$      1.1664      7.1424
Additive, $V_A$       1.0368      6.2208
Dominance, $V_D$      0.1296      0.9216

The variances may be obtained also, and with less trouble, by use of the
formulae given above in equations 8.5, 8.6 and 8.7. The values to be
substituted were given in Example 7.1; namely, a = 4 and d = 2. Notice that
the dominance variance is quite small in comparison with the additive.
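Both routes to the variances, the formulae and the direct summation of squared deviations, are easily compared in code. The following sketch uses hypothetical function names; the inputs a = 4, d = 2 and the two gene frequencies are those of the example.

```python
def variance_components(a, d, q):
    """Additive, dominance and genotypic variance from equations 8.5 to 8.7."""
    p = 1.0 - q
    alpha = a + d * (q - p)
    V_A = 2 * p * q * alpha ** 2
    V_D = (2 * p * q * d) ** 2
    return V_A, V_D, V_A + V_D


def genotypic_variance_direct(a, d, q):
    """Genotypic variance as the frequency-weighted mean squared deviation."""
    p = 1.0 - q
    mean = a * (p - q) + 2 * d * p * q
    freqs = (p * p, 2 * p * q, q * q)
    values = (a, d, -a)
    return sum(f * (v - mean) ** 2 for f, v in zip(freqs, values))


for q in (0.1, 0.4):
    V_A, V_D, V_G = variance_components(a=4.0, d=2.0, q=q)
    direct = genotypic_variance_direct(a=4.0, d=2.0, q=q)
    print(f"q = {q}: V_A = {V_A:.4f}  V_D = {V_D:.4f}  "
          f"V_G = {V_G:.4f}  (direct: {direct:.4f})")
```

The agreement of the last two figures on each line reflects the absence of correlation between breeding values and dominance deviations.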

The ways in which the gene frequency and the degree of dominance
influence the magnitude of the genetic components of
variance can best be appreciated from graphical representations of the
relationships derived above, in equations 8.5, 8.6, and 8.7. The
graphs in Fig. 8.1 show the amounts of genotypic, additive, and
dominance variance arising from a single locus with two alleles,
plotted against the gene frequency. Three cases are shown to illustrate
the effect of different degrees of dominance: in graph (a) there
is no dominance (d = 0); in graph (b) there is complete dominance
(d = a); and in graph (c) there is "pure" overdominance (a = 0).
In the first case the genotypic variance is all additive, and it is
greatest when p = q = 0.5. In the second case the dominance variance
is maximal when p = q = 0.5, and the additive is maximal when the
frequency of the recessive allele is q = 0.75. In the third case the
dominance variance is the same as in the second and is maximal
when p = q = 0.5. The additive variance, however, is zero when
p = q = 0.5, and has two maxima, one at q = 0.15 and the other at
q = 0.85. The genotypic variance, in this case, remains practically
constant over a wide range of gene frequency, though its composition
changes profoundly. The general conclusion to be drawn from these
graphs is that genes contribute much more variance when at intermediate
frequencies than when at high or low frequencies: recessives
at low frequency, in particular, contribute very little variance.

Fig. 8.1. Magnitude of the genetic components of variance
arising from a single locus with two alleles, in relation to the gene
frequency, q. Genotypic variance: thick lines; additive variance:
thin lines; dominance variance: broken lines. The gene frequency, q,
is that of the recessive allele. The degrees of dominance
are: in (a) no dominance (d = 0); in (b) complete dominance (d = a);
and in (c) "pure" overdominance (a = 0). The figures on the vertical
scale, showing the amount of variance, are to be multiplied by $a^2$
in graphs (a) and (b), and by $d^2$ in graph (c).
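The positions of the maxima quoted in the text can be recovered by evaluating equations 8.5 and 8.6 over a grid of gene frequencies. The sketch below does this for complete dominance (a = d = 1) and for "pure" overdominance (a = 0, d = 1). The grid spacing of 0.001 and the helper names are arbitrary choices made for illustration.

```python
def additive_variance(a, d, q):
    p = 1.0 - q
    return 2.0 * p * q * (a + d * (q - p)) ** 2     # equation 8.5


def dominance_variance(d, q):
    p = 1.0 - q
    return (2.0 * p * q * d) ** 2                   # equation 8.6


GRID = [i / 1000.0 for i in range(1001)]            # q = 0.000, 0.001, ..., 1.000


def argmax(func, points):
    """Grid point at which func is largest."""
    return max(points, key=func)


# Complete dominance: d = a.
q_add_complete = argmax(lambda q: additive_variance(1.0, 1.0, q), GRID)
q_dom_complete = argmax(lambda q: dominance_variance(1.0, q), GRID)

# "Pure" overdominance: a = 0. The additive variance has one maximum on
# each side of q = 0.5, so the two halves are searched separately.
lower_half = [q for q in GRID if q <= 0.5]
upper_half = [q for q in GRID if q >= 0.5]
q_add_over_low = argmax(lambda q: additive_variance(0.0, 1.0, q), lower_half)
q_add_over_high = argmax(lambda q: additive_variance(0.0, 1.0, q), upper_half)

print("complete dominance: V_A maximal near q =", q_add_complete)
print("complete dominance: V_D maximal near q =", q_dom_complete)
print("overdominance: V_A maximal near q =", q_add_over_low, "and", q_add_over_high)
print("overdominance: V_A at q = 0.5 is", additive_variance(0.0, 1.0, 0.5))
```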

A possible misunderstanding about the concept of additive genetic
variance, to which the terminology may give rise, should be
mentioned here. The concept of additive variance does not carry
with it the assumption of additive gene action; and the existence of
additive variance is not an indication that any of the genes act addi-
tively (i.e. show neither dominance nor epistasis). No assumption is
made about the mode of action of the genes concerned. Additive
variance can arise from genes with any degree of dominance or epis-
tasis, and only if we find that all the genotypic variance is additive can
we conclude that the genes show neither dominance nor epistasis.
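A numerical illustration may make the point concrete. For a completely dominant gene (d = a) the sketch below computes the proportion of the genotypic variance that is additive at three gene frequencies. The function name and the chosen frequencies are illustrative assumptions; the formulae are equations 8.5 and 8.6.

```python
def additive_fraction(a, d, q):
    """Proportion of single-locus genotypic variance that is additive."""
    p = 1.0 - q
    V_A = 2.0 * p * q * (a + d * (q - p)) ** 2
    V_D = (2.0 * p * q * d) ** 2
    return V_A / (V_A + V_D)


# Complete dominance, with q the frequency of the recessive allele.
fractions = {q: additive_fraction(a=1.0, d=1.0, q=q) for q in (0.1, 0.5, 0.9)}
for q, frac in fractions.items():
    print(f"q = {q}: additive fraction of genotypic variance = {frac:.3f}")
```

Even with complete dominance, two-thirds of the genotypic variance is additive at q = 0.5, and the proportion is higher still when the recessive allele is common.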

The existence of more than two alleles at a locus introduces no
new principle, though it complicates the theoretical description of the
effect of the locus. Expressions for the additive and dominance
variances are given by Kempthorne (1955a). The locus contributes
additive variance arising from the average effects of its several alleles,
and dominance variance arising from the several dominance devia-
tions.

To arrive at the variance components expressed in the population
the separate effects of all loci that contribute variance have to be
combined. The additive variance arising from all loci together is the
sum of the additive variances attributable to each locus separately;
and the dominance variance is similarly the sum of the separate contri-
butions. But when more than one locus is under consideration then
the interaction deviations, if present, give rise to another component
of variance, the interaction variance, which is the variance of the
interaction deviations.

Interaction variance. We shall treat the interaction variance as
a complication, like genotype-environment correlation or inter-
action, to be circumvented: that is to say, we shall not discuss its
properties in detail, but we shall show what happens to it if it is
ignored. It is only comparatively recently that the properties of the
interaction variance have been worked out (see Cockerham, 1954;
Kempthorne, 1954, 1955a, b) and little is yet known about its importance
in relation to the other components. It seems probable, however,
that the amount of variance contributed by it is usually rather
small, and that neglect of it is therefore not likely to lead to serious
error. Description of the properties of interaction variance rests on
its further subdivision into components. It is first subdivided ac-
cording to the number of loci involved: two-factor interaction arises
from the interaction of two loci, three-factor from three loci, etc.
Interactions involving larger numbers of loci contribute so little
variance that they can be ignored, and we shall confine our attention
to two-factor interactions since these suffice to illustrate the principles
involved. The next subdivision of the interaction variance is according
to whether the interaction involves breeding values or dominance
deviations. There are thus three sorts of two-factor interactions.
Interaction between the two breeding values gives rise to additive ×
additive variance, $V_{AA}$; interaction between the breeding value of one
locus and the dominance deviation of the other gives rise to additive ×
dominance variance, $V_{AD}$; and interaction between the two dominance
deviations gives rise to dominance × dominance variance, $V_{DD}$.
So the interaction variance is broken down into components thus:

$$V_I = V_{AA} + V_{AD} + V_{DD} + \text{etc.} \tag{8.8}$$

the terms designated "etc." being similar components arising from
interactions between more than two loci. At the moment we cannot
go further than this in the description of the interaction variance, but
we shall show later how it affects the resemblance between relatives
and what happens to it when components of variance are estimated
from observations on the population.

That completes the description of the nature of the genetic com-
ponents of variance. The practical value of the partitioning of the
variance will not yet be fully apparent because it arises from the
causes of resemblance between relatives, which is the subject of the
next chapter. The partitioning we have made is essentially a theo-
retical one, and before we pass on we should consider how much of it
can actually be made in practice. When observations of resemblance
between relatives are available we can estimate the additive variance
and so make the partition $V_A : (V_D + V_I + V_E)$. And if inbred lines
are available we can estimate the environmental variance and so make
the partition $V_G : V_E$. If both these partitions are made we can
separate the additive genetic from the rest of the genetic variance, and
so make the three-fold partition into additive genetic, non-additive
genetic, and environmental variance, $V_A : (V_D + V_I) : V_E$, the dominance
and interaction components being lumped together as non-additive
genetic variance. Examples of this partitioning are given in
Table 8.2, although at this stage the method by which the additive
component is estimated will not be understood. This partitioning is
as far as we can go by means of relatively simple experiments. By
more elaborate techniques, requiring large numbers of observations,
it may be possible to go some way toward separating the dominance
from the interaction components, or at least to get an idea of their
relative importance. (See, in particular, Robinson and Comstock,
1955; Hayman, 1955, 1958; Cockerham, 1956b.)
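
The arithmetic of the three-fold partition can be sketched as follows. This is a minimal illustration, assuming that the phenotypic variance is simply the sum of genotypic and environmental variance (no genotype-environment correlation or interaction); the function name and the input values are hypothetical and are not taken from any of the studies cited.

```python
def threefold_partition(v_p, v_a, v_e):
    """Return the percentages of V_P that are additive genetic,
    non-additive genetic (V_D + V_I) and environmental.

    v_p: total phenotypic variance
    v_a: additive variance, e.g. from resemblance between relatives
    v_e: environmental variance, e.g. from inbred lines
    Assumes V_P = V_G + V_E, so that V_G is obtained by subtraction.
    """
    v_g = v_p - v_e                  # genotypic variance, from V_G : V_E
    v_non_additive = v_g - v_a       # what remains of V_G after removing V_A
    return {
        "additive": 100.0 * v_a / v_p,
        "non_additive": 100.0 * v_non_additive / v_p,
        "environmental": 100.0 * v_e / v_p,
    }


# Hypothetical estimates, chosen only to show the arithmetic.
parts = threefold_partition(v_p=10.0, v_a=4.0, v_e=5.0)
print(parts)
```

Because the non-additive part is obtained by difference, it carries the sampling errors of both estimates, which is one reason why a finer separation needs large numbers of observations.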

Table 8.2

Partitioning of the variance of four characters in Drosophila
melanogaster. Components as percentages of the total,
phenotypic, variance.

Character                              (1)        (2)      (3)     (4)
                                       Bristles   Thorax   Ovary   Eggs
Phenotypic                             100        100      100     100
Additive genetic
Non-additive genetic   $V_D + V_I$
Environmental

Characters:

(1) Number of bristles on 4th + 5th abdominal segments (Clayton, Morris,
and Robertson, 1957; Reeve and Robertson, 1954).

(2) Length of thorax (F. W. Robertson, 1957b).

(3) Size of ovaries, i.e. number of ovarioles in both ovaries (F. W.
Robertson, 1957a).

(4) Number of eggs laid in 4 days (4th to 8th after emergence) (F. W.
Robertson, 1957b).

Environmental Variance

Environmental variance, which by definition embraces all varia-
tion of non-genetic origin, can have a great variety of causes and its
nature depends very much on the character and the organism studied.
Generally speaking, environmental variance is a source of error that
reduces precision in genetical studies and the aim of the experimenter
or breeder is therefore to reduce it as much as possible by careful
management or proper design of experiments. Nutritional and
climatic factors are the commonest external causes of environmental
variation, and they are at least partly under experimental control.
Maternal effects form another source of environmental variation that
is sometimes important, particularly in mammals, but is less sus-
ceptible to control. Maternal effects are prenatal and postnatal
influences, mainly nutritional, of the mother on her young: we shall
have more to say about them in the next chapter in connexion with
resemblance between relatives. Error of measurement is another
source of variation, though it is usually quite trivial. When a charac-
ter can be measured in units of length or weight it is usually measured
so accurately that the variance attributable to measurement is neg-
ligible in comparison with the rest of the variance. Some characters,
however, cannot strictly speaking be measured, but have to be graded
by judgement into classes. Carcass qualities of livestock are an ex-
ample. With such characters the variance due to measurement may
be considerable.

In addition to the variation arising from recognisable causes, such
as those mentioned, there is usually also a substantial amount of
non-genetic variation whose cause is unknown, and which therefore
cannot be eliminated by experimental design. This is generally
referred to as "intangible" variation. Some of the intangible varia-
tion may be caused by "environmental" circumstances, in the common
meaning of the word — that is, by circumstances external to the
individual — even though their nature is not known. Some, however,
may arise from "developmental" variation: variation, that is, which
cannot be attributed to external circumstances, but is attributed, in
ignorance of its exact nature, to "accidents" or "errors" of develop-
ment as a general cause. Characters whose intangible variation is
predominantly developmental are those connected with anatomical
structure, which do not change after development is complete, such
as skeletal form, pigmentation, or bristle number in Drosophila.
Characters more susceptible to the influences of the external environ-
ment, in contrast, are those connected with metabolic processes, such
as growth, fertility, and lactation.

Example 8.4. Human birth weight provides an example of a character
subject to much environmental variation whose nature has been analysed
in detail (Penrose, 1954; Robson, 1955). The partitioning of the pheno-
typic variance given in the table shows the relative importance of all the
identified sources of variation, birth weight being regarded as a character
of the child. All the environmental variation is "maternal" in the sense
that it is connected with the prenatal environment, but several distinct
components of the maternal environment are distinguished. "Maternal
genotype," which accounts for 20 per cent of the total phenotypic variance,
reflects genetic variation (chiefly additive) between mothers in the birth
weight of their children; i.e. birth weight regarded as a character of the
mother. "Maternal environment, general," which accounts for another
18 per cent, reflects non-genetic variation between mothers in the same way.
These two components, totalling 38 per cent, are maternal causes of variation
in birth weight that affect all children of the same mother alike.
"Maternal environment, immediate" means causes attributable to the
mother but differing in successive pregnancies. Two causes of the same
nature, "age of mother" and "parity" (i.e. whether the child is the first,
second, etc.), are separately identifiable. Finally, the "intangible"
variation is all the remainder, of which the cause cannot be identified. To
explain how these various components were estimated would take too
much space, and could not properly be done until the end of Chapter 10.
It must suffice to say that the estimates all come from comparisons of the
degree of resemblance between identical twins, fraternal twins, full sibs,
children of sisters, and other sorts of cousins.

Partitioning of variance of human birth-weight. Components
as percentages of the total, phenotypic, variance.

Cause of variation                      % of total

Genetic
  Additive
  Non-additive (approx)
  Sex
  Total genotypic

Environmental
  Maternal genotype                     20
  Maternal environment, general         18
  Maternal environment, immediate
  Age of mother
  Parity
  Intangible                            30
  Total environmental

Multiple measurements. When more than one measurement
of the character can be made on each individual, the phenotypic
variance can be partitioned into variance within individuals and
variance between individuals. This subdivision serves to show how
much is to be gained by the repetition of measurements, and it may
also throw light on the nature of the environmental variation. There
are two ways by which the repetition of a character may provide
multiple measurements: by temporal repetition and by spatial repe-
tition. Milk-yield and litter size are examples of characters repeated
in time. Milk-yield can be measured in successive lactations, and
litter size in successive pregnancies. Several measurements of each individual
can thus be obtained. The variance of yield per lactation, or
of the number of young per litter, can then be analysed into a com-
ponent within individuals, measuring the differences between the
performances of the same individual, and a component between in-
dividuals, measuring the permanent differences between individuals.
The within-individual component is entirely environmental in
origin, caused by temporary differences of environment between suc-
cessive performances. The between-individual component is partly
environmental and partly genetic, the environmental part being
caused by circumstances that affect the individuals permanently. By
this analysis, therefore, the variance due to temporary environmental
circumstances is separated from the rest, and can be measured.

Characters repeated in space are chiefly structural or anatomical,
and are found more often in plants than in animals. For example,
plants that bear more than one fruit yield more than one measure-
ment of any character of the fruit, such as its shape or seed content.
Spatial repetition in animals is chiefly found in characters that can be
measured on the two sides of the body or on serially repeated parts,
such as the number of bristles on the abdominal segments of Droso-
phila. With spatially repeated characters the within-individual
variance is again entirely environmental in origin but, unlike that of
temporally repeated characters, it represents the "developmental"
variation arising from localised circumstances operating during
development.

In order that we may discuss both temporal and spatial repetition
together we shall use the term special environmental variance, $V_{Es}$, to
refer to the within-individual variance arising from temporary or
localised circumstances; and the term general environmental variance,
$V_{Eg}$, to refer to the environmental variance contributing to the
between-individual component and arising from permanent or non-localised
circumstances. The ratio of the between-individual component
to the total phenotypic variance measures the correlation (r)
between repeated measurements of the same individual, and is
known as the repeatability of the character. The partitioning of the
phenotypic variance expressed by the repeatability is thus into two
components, $V_{Es}$ versus $(V_G + V_{Eg})$, so that the repeatability is

$$r = \frac{V_G + V_{Eg}}{V_P} \tag{8.9}$$

The repeatability therefore expresses the proportion of the variance
of single measurements that is due to permanent, or non-localised,
differences between individuals, both genetic and environmental.
The repeatability differs very much according to the nature of the
character, and also, of course, according to the genetic properties of
the population and the environmental conditions under which the
individuals are kept. The estimates in Table 8.3 give some idea of
the sort of values that may be found with various characters, and two
cases are described in more detail in the following examples.

Table 8.3
Some Examples of Repeatability

Organism and character                                          Repeatability

Drosophila melanogaster:
  Abdominal bristle number (see Example 8.6)                    .42
  Ovary size (F. W. Robertson, 1957a)

Mouse:
  Weight at 6 weeks (repeated on 4 consecutive days;
    original data)                                              .95
  Litter size (see Example 8.5)                                 .45

Sheep:
  Weight of fleece, measured in different years (Morley, 1951)

Cattle:
  Milk-yield (Johansson, 1950)                                  .40

Example 8.5. Litter size in mice will serve as an example of a character
repeated in time. The number of live young born in first and in second
litters was recorded in 296 mice of a genetically heterogeneous stock, and
yielded the following components of variance (original data):

Between mice   3.58
Within mice    4.44

(The procedure for estimating the components of variance from an
analysis of variance is described by Snedecor (1956, Section 10.12) and is
outlined below, in Chapter 10, p. 173.) The repeatability of litter size is
given by the ratio of the between-mice component to the sum of the
between-mice and the within-mice components: i.e.

$$r = \frac{3.58}{3.58 + 4.44} = 0.45$$
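
The calculation in Example 8.5 can be sketched in code. The first function below is only the ratio of the two components. The second is a minimal sketch of how such components might be estimated from raw records by a balanced one-way analysis of variance, assuming every individual has the same number k of measurements; it is an illustration of the standard method, not a reproduction of the original analysis, and all names are hypothetical.

```python
def repeatability(var_between, var_within):
    """Between-individual component as a fraction of the total."""
    return var_between / (var_between + var_within)


def components_from_records(records):
    """Estimate (between, within) components of variance.

    records: one list of k measurements per individual (balanced data).
    Uses the one-way analysis of variance: the within component is the
    within mean square, and the between component is
    (MS_between - MS_within) / k.
    """
    n = len(records)
    k = len(records[0])
    if n < 2 or k < 2 or any(len(row) != k for row in records):
        raise ValueError("need balanced data: n >= 2 individuals, k >= 2 each")
    means = [sum(row) / k for row in records]
    grand_mean = sum(means) / n
    ss_between = k * sum((m - grand_mean) ** 2 for m in means)
    ss_within = sum((x - m) ** 2 for row, m in zip(records, means) for x in row)
    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (n * (k - 1))
    return (ms_between - ms_within) / k, ms_within


# Components quoted in Example 8.5 (litter size in mice).
print(round(repeatability(3.58, 4.44), 2))

# Tiny made-up data set: two individuals, two records each.
between, within = components_from_records([[1.0, 3.0], [5.0, 7.0]])
print(between, within, repeatability(between, within))
```

With very few individuals the estimated between component can come out negative through sampling error; the sketch makes no attempt to handle that case.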

Example 8.6. The number of bristles on the ventral surfaces of the
abdominal segments is a character that has been much studied in Droso-
phila melanogaster, because it is technically convenient and its genetic
properties are relatively simple. We have already mentioned it several
times but have not yet used it as an example. There are about 20 bristles
on each of 3 segments in males and each of 4 segments in females. The
number of bristles per segment can therefore be treated as a spatially
repeated character. The sources of variation in this character have been
studied in detail by Reeve and Robertson (1954), and the following com-
ponents of variance were found:

Total phenotypic                        4.24     5.44
Between flies      $V_G + V_{Eg}$       1.82     2.19
Within flies       $V_{Es}$             2.42     3.25
Repeatability                           0.429    0.403

Estimation of the repeatability of a character separates off the
component of variance due to special environment, $V_{Es}$, but it leaves
the other component of environmental variance (that due to general
environment, $V_{Eg}$) confounded with the genotypic variance, as
shown in the above example. The component due to general environment
can be separately estimated only if the genotypic variance
(i.e. including the non-additive components) has been estimated, in
the manner explained in Example 8.1. This has been done with two
characters in Drosophila, and the results are given in Table 8.4.

Table 8.4

Partitioning of the environmental variance of two characters
in Drosophila melanogaster into components due to
general, $V_{Eg}$, and special, $V_{Es}$, environment. The characters
are: abdominal bristle-number (Reeve and Robertson,
1954) as explained in Example 8.6, and ovary size (F. W.
Robertson, 1957a), measured in the two ovaries by the
number of ovarioles, or "egg strings."

                                        Bristle number   Ovary size
Total environmental, $V_E$              100              100
General environmental, $V_{Eg}$
Special environmental, $V_{Es}$

The nature of the environmental variation revealed by these results is
remarkable. With both characters less than 10 per cent of the
environmental variance is general — that is, due to causes influencing
the individual as a whole. These characters are therefore very little
influenced by the conditions of the external environment: or, perhaps
it would be more accurate to say that the experimental technique of
rearing the flies has been very successful in eliminating unwanted
sources of environmental variation. Yet, fully half the phenotypic
variation of one measurement (one segment or ovary) is non-genetic,
or environmental in the wide sense, as shown in Table 8.2; and,
moreover, is due to strictly localised causes that influence the seg-
ments or ovaries independently. Whether this developmental
variation represents a real indeterminacy of development, or has
material causes still undetected but in principle controllable, is quite
unknown. Nor is it known whether the situation revealed in these
two characters is at all general. We cannot here pursue further the
biological nature of the non-genetic variation: a general discussion of
these problems will be found in Waddington (1957).

We must return to the repeatability and consider its uses.
Knowledge of the repeatability of a character is useful in two ways.
First, it sets upper limits to the values of the two ratios, $V_A/V_P$ and
$V_G/V_P$. The first (additive genetic to total phenotypic variance), is the
heritability, which as we shall see in later chapters is of great practical
importance. The second (genotypic to phenotypic variance) measures
the degree of genetic determination of the character. The repeatability
is usually much easier to determine than either of these two ratios,
and it may often be known when they are not.

The second way in which knowledge of the repeatability is useful
is that it indicates the gain in accuracy to be expected from multiple
measurements. Suppose that each individual is measured n times,
and that the mean of these n measurements is taken to be the pheno-
typic value of the individual, say P(n). Then the phenotypic variance
is made up of the genotypic variance, the general environmental
variance, and one nth of the special environmental variance:

$$V_{P(n)} = V_G + V_{Eg} + \frac{1}{n} V_{Es} \tag{8.10}$$

Thus, increasing the number of measurements reduces the amount
of variance due to special environment that appears in the phenotypic
variance, and this reduction of the phenotypic variance represents
the gain in accuracy. The variance of the mean of n measurements
as a proportion of the variance of one measurement can be
expressed in terms of the repeatability, as follows:

$$\frac{V_{P(n)}}{V_P} = \frac{1 + r(n - 1)}{n} \tag{8.11}$$
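
The step from equation 8.10 to equation 8.11 can be written out as follows. It uses only the definition of the repeatability in equation 8.9, from which $V_G + V_{Eg} = rV_P$ and therefore $V_{Es} = (1 - r)V_P$.

```latex
\begin{align*}
V_{P(n)} &= V_G + V_{Eg} + \tfrac{1}{n} V_{Es} \\
         &= r V_P + \tfrac{1}{n}(1 - r) V_P , \\
\frac{V_{P(n)}}{V_P} &= \frac{nr + 1 - r}{n} = \frac{1 + r(n - 1)}{n} .
\end{align*}
```

As n becomes large the ratio approaches r, so that the repeatability itself is the lower limit of the proportionate variance attainable by repeating the measurements.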

where r is the repeatability, or the correlation between the measurements
of the same individual. Fig. 8.2 shows how the phenotypic
variance is reduced by multiple measurements, with characters of
different repeatabilities. When the repeatability is high, and there is
therefore little special environmental variance, multiple measurements
give little gain in accuracy. When the repeatability is low,
multiple measurements may lead to a worth-while gain in accuracy.
The gain in accuracy, however, falls off rapidly as the number of
measurements increases, and it is seldom worth while to make more
than two measurements.

Fig. 8.2. Gain in accuracy from multiple measurements of each
individual. The vertical scale gives the variance of the mean of n
measurements as a percentage of the variance of one measurement.
The horizontal scale gives the number of measurements, up to 10.
The four graphs refer to characters of different repeatability
(r = 0.75, 0.5, 0.25, and 0.1) as indicated.
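
The curves of Fig. 8.2 follow directly from equation 8.11 and can be tabulated with a few lines of code. The sketch below assumes nothing beyond that equation; the function name is hypothetical.

```python
def relative_variance(r, n):
    """Variance of the mean of n measurements as a fraction of the
    variance of a single measurement (equation 8.11)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return (1.0 + r * (n - 1)) / n


# One row per repeatability, for n = 1 to 10, expressed as percentages.
for r in (0.75, 0.5, 0.25, 0.1):
    percentages = [round(100 * relative_variance(r, n), 1) for n in range(1, 11)]
    print(f"r = {r}: {percentages}")
```

Each row starts at 100 per cent for a single measurement and declines toward 100r per cent, the largest drop coming with the second measurement.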

Example 8.7. Studies of abdominal bristle number in Drosophila are
generally based on two measurements, i.e. of the fourth and fifth seg-
ments, and the phenotypic values are expressed as the sum of the two
counts. As an illustration of the nature of the advantage gained by the
double measurement we may compare the percentage composition of the
phenotypic variance when phenotypic values are based on counts of one
or of two segments:

                                          One segment   Two segments
Phenotypic                                100           100
Additive genetic                          34            52
Non-additive genetic     $V_D + V_I$
Environmental, general   $V_{Eg}$
Environmental, special   $V_{Es}$

By reducing the amount of environmental variance, the making of two
measurements increases the proportionate amount of genetic variance: in
practice it is the increase of the proportion of additive variance — in this
case from 34 per cent to 52 per cent — that is the important consideration.

There is an important assumption implicit in the idea of repeata-
bility, which we have not yet mentioned. It is the assumption that
the multiple measurements are indeed measurements of what is
genetically the same character. Consider for example milk-yield in
successive lactations. If the assumption were valid it would mean that
the genes that influence yield in first lactations are entirely the same
as those that influence yield in second or later lactations; or, to put the
matter in another way, that yield in all lactations is dependent on
identical developmental and physiological processes. If this assump-
tion is not valid, as it certainly is not for milk-yield in cattle, then the
variation within individuals is not purely environmental, and equation
8.11 is erroneous. The variance between the means of individuals
will be augmented by additional variance arising from what may
formally be regarded as interaction between genotype and "environment,"
that is between genotype and the time or location of the
measurement. And this additional variance may be enough to
counteract the reduction of environmental variance which we have
described as the chief advantage to be gained from multiple measure-
ments. Consequently an increase in the proportion of additive genetic
variance from multiple measurements cannot be relied on until the
genetical identity of the character measured has been established.
The number of bristles on the abdominal segments of Drosophila has
been proved to be genetically the same character, as will be explained
in Chapter 19, and the conclusions reached in Example 8.7 are valid.
Milk-yield in cattle, in contrast, is not the same character in suc-
cessive lactations, and the proportion of additive variance is actually
less for the mean of several lactations than for first lactations only.
(See Rendel, et al., 1957.)

CHAPTER 9

RESEMBLANCE BETWEEN RELATIVES

The resemblance between relatives is one of the basic genetic pheno-
mena displayed by metric characters, and the degree of resemblance
is a property of the character that can be determined by relatively
simple measurements made on the population without special experi-
mental techniques. The degree of resemblance provides the means
of estimating the amount of additive variance, and it is the propor-
tionate amount of additive variance (i.e. the heritability) that chiefly
determines the best breeding method to be used for improvement.
An understanding of the causes of resemblance between relatives is
therefore fundamental to the practical study of metric characters and
to its application in animal and plant improvement. In this chapter,
therefore, we shall examine the causes of resemblance between rela-
tives, and show in principle how the amount of additive variance can
be estimated from the observed degree of resemblance, leaving the
more practical aspects of the estimation of the heritability for con-
sideration in the next chapter.

In the last chapter we saw how the phenotypic variance can be
partitioned into components attributable to different causes. These
components we shall call causal components of variance, and denote
them as before by the symbol V. The measurement of the degree of
resemblance between relatives rests on the partitioning of the pheno-
typic variance in a different way, into components corresponding to
the grouping of the individuals into families. These components can
be estimated directly from the phenotypic values and for this reason
we shall call them observational components of phenotypic variance,
and denote them by the symbol $\sigma^2$ in order to keep the distinction
clear. Consider, for example, the grouping of individuals into
families of full sibs. By the analysis of variance we can partition the
total observed variance into two components, within groups and
between groups. The within-group component is the variance of
individuals about their group means, and the between-group component
is the variance of the "true" means of the groups about the
population mean. The true mean of a group is the mean estimated
without error from a very large number of individuals. An explana-
tion of the estimation of these two components will be given, with
examples, in the next chapter. Now, the resemblance between related
individuals, i.e. between full sibs in the case under discussion, can be
looked at either as similarity of individuals in the same group, or as
difference between individuals in different groups. The greater the
similarity within the groups, the greater in proportion will be the
difference between the groups. The degree of resemblance can
therefore be expressed as the between-group component as a proportion
of the total variance. This is the intra-class correlation coefficient
and is given by

$$t = \frac{\sigma_B^2}{\sigma_B^2 + \sigma_W^2}$$

where $\sigma_B^2$ is the between-group component and $\sigma_W^2$ the within-group
component. (It is customary to use the symbol t for the intra-class
correlation of phenotypic values in order to avoid confusion with
other types of correlation for which the symbol r is used.) The
between-group component expresses the amount of variation that is
common to members of the same group, and it can equally well be
referred to as the covariance of members of the groups. In the case of
the resemblance between offspring and parents the grouping of the
observations is into pairs rather than groups; one parent, or the mean
of two parents, paired with one offspring or the mean of several
offspring. It is then more convenient to compute the covariance of
offspring with parents from the sum of cross-products, rather than
from the between-pair component of variance. With offspring-
parent relationships, also, it is usually more convenient to express the
degree of resemblance as the regression coefficient of offspring on
parent, instead of the correlation between them, the regression being
given by

$$b_{OP} = \frac{cov_{OP}}{\sigma_P^2}$$

where $cov_{OP}$ is the covariance of offspring and parents, and $\sigma_P^2$ is the
variance of parents.
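
The computation of the covariance from the sum of cross-products, and of the regression from it, can be sketched as follows. The function name and the paired values are hypothetical; the divisor n - 1 is the usual one for sample variances and covariances and cancels in the regression coefficient.

```python
def offspring_parent_regression(parents, offspring):
    """Return (covariance, variance of parents, regression coefficient)
    for paired parent and offspring values."""
    n = len(parents)
    if n != len(offspring) or n < 2:
        raise ValueError("need at least two parent-offspring pairs")
    mean_p = sum(parents) / n
    mean_o = sum(offspring) / n
    # Sum of cross-products of deviations from the two means.
    sum_cross = sum((p - mean_p) * (o - mean_o)
                    for p, o in zip(parents, offspring))
    sum_sq_p = sum((p - mean_p) ** 2 for p in parents)
    cov_op = sum_cross / (n - 1)
    var_p = sum_sq_p / (n - 1)
    return cov_op, var_p, cov_op / var_p


# Made-up values: each offspring lies half-way between its parent
# and a fixed point, so the regression is exactly one half.
cov_op, var_p, b_op = offspring_parent_regression(
    [1.0, 2.0, 3.0, 4.0], [2.0, 2.5, 3.0, 3.5])
print(cov_op, var_p, b_op)
```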

Thus, the covariance of related individuals is the new property
of the population that we have to deduce in seeking the cause of
resemblance between relatives, whether sibs or offspring and parents.

The covariance, being simply a portion of the total phenotypic
variance, is composed of the causal components described in the last
chapter, but in amounts and proportions differing according to the
sort of relationship. By finding out how the causal components con-
tribute to the covariance we shall see how an observed covariance can
be used to estimate the causal components of which it is composed.

Both genetic and environmental sources of variance contribute to
the covariance of relatives. We shall consider the genetic causes of
resemblance first, then the environmental causes, and finally, by
putting the two causes together, arrive at the phenotypic covariance
and the degree of resemblance that can be observed from measure-
ments of phenotypic values. A general description of the covariance,
applicable to any sort of relationship, is given by Kempthorne
(1955a). Here we shall consider only four sorts of relationship: (1)
between offspring and one parent, (2) between half sibs, (3) between
offspring and the mean of the two parents, and (4) between full sibs.
These are the most important relationships in practice. Identical
twins will be considered in the next chapter, because the problems
they raise will be better understood then.

Genetic Covariance

Our object now is to deduce from theoretical considerations the
covariance of relatives arising from genetic causes, neglecting for the
time being any non-genetic causes of resemblance that there may be.
This means that we have to deduce the covariance of the genotypic
values of the related individuals. This will be done by reference to
two alleles at a locus, but the conclusions are equally valid for loci
with any number of alleles. We shall at first omit interaction deviations
and the interaction component of variance from consideration, but
we shall describe its effects briefly later.

Offspring and one parent. The covariance to be deduced is
that of the genotypic values of individuals with the mean genotypic
values of their offspring produced by mating at random in the popu-
lation. If values are expressed as deviations from the population
mean, then the mean value of the offspring is by definition half the
breeding value of the parent, as explained in Chapter 7. Therefore
the covariance to be computed is that of an individual's genotypic
value with half its breeding value, i.e. the covariance of G with $\tfrac{1}{2}A$.
Since G = A + D (D being the dominance deviation) the covariance
is that of (A + D) with $\tfrac{1}{2}A$. Taking the sum of cross-products, we
have

$$\text{sum of cross-products} = \sum \tfrac{1}{2}A(A + D) = \tfrac{1}{2}\sum A^2 + \tfrac{1}{2}\sum AD$$

Since A and D are uncorrelated (see p. 125), the term $\tfrac{1}{2}\sum AD$ is
zero. Then if we divide both sides by the number of paired observations
we have

$$cov_{OP} = \tfrac{1}{2}V_A \tag{9.1}$$

since $\sum A^2$ is the sum of squares of breeding values. The genetic
covariance of offspring and one parent is therefore half the additive
variance.

The covariance may be derived by another method, which though
less concise is perhaps more explicit. Table 9.1 gives the genotypes
of the parents, their frequencies in the population, and their genotypic
values expressed as deviations from the population mean (from
Table 7.3). The right-hand column gives the mean genotypic values
of the offspring, which are half the breeding values of the parents as
given in Table 7.3.

Table 9.1

            Parents                                   Offspring
Genotype    Frequency    Genotypic value              Mean genotypic value
$A_1A_1$    $p^2$        $2q(\alpha - qd)$            $q\alpha$
$A_1A_2$    $2pq$        $(q - p)\alpha + 2pqd$       $\tfrac{1}{2}(q - p)\alpha$
$A_2A_2$    $q^2$        $-2p(\alpha + pd)$           $-p\alpha$

The covariance of offspring and parent is then the
mean cross-product, and is obtained by multiplying together the
three columns (frequency × genotypic value of parent × genotypic
value of offspring) and summing over the three genotypes of the
parents. After collecting together the terms in $\alpha^2$ and the terms in $\alpha d$
we obtain

$$\begin{aligned}
cov_{OP} &= pq\alpha^2(p^2 + 2pq + q^2) + 2p^2q^2\alpha d(-q + q - p + p) \\
&= pq\alpha^2 \\
&= \tfrac{1}{2}V_A
\end{aligned}$$

since from equation 8.5, $V_A = 2pq\alpha^2$. Summing over all loci we again
reach the conclusion that the covariance of offspring and one parent
is equal to half the additive variance.
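
The algebra of this longer method can be checked numerically. The sketch below takes the three parental genotypes with their Hardy-Weinberg frequencies, the genotypic values as deviations from the population mean, and the offspring means as half the breeding values, multiplies the three columns together and sums. The function name and the chosen values of p, alpha and d are hypothetical.

```python
def offspring_parent_covariance(p, alpha, d):
    """Mean cross-product of parental genotypic value and offspring
    mean, summed over the three parental genotypes at one locus."""
    q = 1.0 - p
    rows = [
        # (frequency, genotypic value of parent, mean value of offspring)
        (p * p,     2 * q * (alpha - q * d),           q * alpha),
        (2 * p * q, (q - p) * alpha + 2 * p * q * d,   0.5 * (q - p) * alpha),
        (q * q,     -2 * p * (alpha + p * d),          -p * alpha),
    ]
    return sum(freq * g * o for freq, g, o in rows)


p, alpha, d = 0.3, 1.2, 0.7           # arbitrary illustrative values
q = 1.0 - p
cov_op = offspring_parent_covariance(p, alpha, d)
half_v_a = p * q * alpha ** 2         # half of V_A = 2pq * alpha^2
print(cov_op, half_v_a)
```

The two printed numbers agree (up to rounding error) whatever value is given to d, which is the point of the derivation: the terms in alpha*d cancel.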

Half sibs. Half sibs are individuals that have one parent in com-
mon and the other parent different. A group of half sibs is therefore
the progeny of one individual mated at random and having one
offspring by each mate. Thus the mean genotypic value of the group
of half sibs is by definition half the breeding value of the common
parent. The covariance is the variance of the means of the half-sib
groups, and is therefore the variance of half the breeding values of the
parents; this is a quarter of the additive variance:

$$cov_{(HS)} = V_{\frac{1}{2}A} = \tfrac{1}{4}V_A \tag{9.2}$$

This covariance also can be demonstrated by the longer method,
from the values already given in Table 9.1. The covariance is the
variance of the means of the groups of offspring listed in the right-
hand column. Squaring the offspring values and multiplying by their
frequencies we get

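$$
\begin{aligned}
\text{Variance of means of half-sib families}
&= p^2q^2\alpha^2 + 2pq \cdot \tfrac{1}{4}(q - p)^2\alpha^2 + q^2p^2\alpha^2 \\
&= pq\alpha^2[pq + \tfrac{1}{2}(q - p)^2 + pq] \\
&= pq\alpha^2[\tfrac{1}{2}(p + q)^2] \\
&= \tfrac{1}{2}pq\alpha^2
\end{aligned}
$$

Therefore, since $2pq\alpha^2 = V_A$ (from equation 8.5),

$$\mathrm{cov}_{(HS)} = \tfrac{1}{4}V_A$$

summation being made over all loci.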
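
The same check can be made for half sibs by taking the variance of the three group means directly. This is a minimal sketch under the same single-locus assumptions; the function name and the trial values are hypothetical.

```python
def half_sib_cov(p, a, d):
    """Variance of the means of half-sib groups for one locus (hypothetical helper)."""
    q = 1.0 - p
    alpha = a + d * (q - p)  # average effect of the gene substitution
    # (frequency of the common parent's genotype, mean genotypic value of its offspring)
    groups = [
        (p * p, q * alpha),                  # common parent A1A1
        (2 * p * q, 0.5 * (q - p) * alpha),  # common parent A1A2
        (q * q, -p * alpha),                 # common parent A2A2
    ]
    mean = sum(freq * value for freq, value in groups)  # zero, apart from rounding
    variance = sum(freq * (value - mean) ** 2 for freq, value in groups)
    v_a = 2 * p * q * alpha ** 2  # additive variance, equation 8.5
    return variance, v_a


variance, v_a = half_sib_cov(p=0.3, a=1.0, d=0.5)
print(variance, 0.25 * v_a)  # the two numbers should agree
```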

Offspring and mid-parent. The covariance of the mean of the
offspring and the mean of both parents (commonly called the
"mid-parent") may be deduced in the following way. Let O be the mean of
the offspring, and P and P' be the values of the two parents. Then
we want to find $\mathrm{cov}_{O\bar{P}}$, that is, the covariance of O with
$\frac{1}{2}(P + P')$. This is equal to $\frac{1}{2}(\mathrm{cov}_{OP} + \mathrm{cov}_{OP'})$.
If P and P' have the same variance, then $\mathrm{cov}_{OP} = \mathrm{cov}_{OP'}$
and $\mathrm{cov}_{O\bar{P}} = \mathrm{cov}_{OP}$. Thus, provided the two sexes
have equal variances, the covariance of offspring and mid-parent is
the same as that of offspring with one parent, which we have seen is
equal to half the additive variance. This conclusion may be extended
to other sorts of relative: the covariance of any individual with the
mean value of a number of relatives of the same sort is equal to its
covariance with one of those relatives.

The longer method of demonstrating the covariance of offspring
with mid-parent is rather laborious, but it must be given since it will
be needed for arriving at the covariance of full sibs. We shall,
however, omit some of the steps of algebraic reduction. A table (Table
9.2) is made in the same manner as for offspring and one parent, but
now we have to tabulate types of mating and their frequencies,
instead of single parents. This was done in Chapter 1 (Table 1.1).
Against each type of mating we put the mean genotypic value of the
two parents, i.e. the mid-parent value; then the genotypes of the
progeny and the mean genotypic value of the progeny. The working is
made easier by writing the genotypic values in terms of a and d
instead of as deviations from the population mean. In the last two
columns of the table we put the product of progeny-mean × mid-parent,
and the square of the progeny-mean for later use.

Table 9.2

Type of mating          Frequency     Mid-parent             Progeny                        Mean value             Progeny-mean              Square of
(parents)               of mating     value                  A1A1     A1A2     A2A2         of progeny             × mid-parent              progeny-mean
$A_1A_1 × A_1A_1$       $p^4$         $a$                    1        0        0            $a$                    $a^2$                     $a^2$
$A_1A_1 × A_1A_2$       $4p^3q$       $\frac{1}{2}(a + d)$   1/2      1/2      0            $\frac{1}{2}(a + d)$   $\frac{1}{4}(a + d)^2$    $\frac{1}{4}(a + d)^2$
$A_1A_1 × A_2A_2$       $2p^2q^2$     $0$                    0        1        0            $d$                    $0$                       $d^2$
$A_1A_2 × A_1A_2$       $4p^2q^2$     $d$                    1/4      1/2      1/4          $\frac{1}{2}d$         $\frac{1}{2}d^2$          $\frac{1}{4}d^2$
$A_1A_2 × A_2A_2$       $4pq^3$       $\frac{1}{2}(-a + d)$  0        1/2      1/2          $\frac{1}{2}(-a + d)$  $\frac{1}{4}(-a + d)^2$   $\frac{1}{4}(-a + d)^2$
$A_2A_2 × A_2A_2$       $q^4$         $-a$                   0        0        1            $-a$                   $a^2$                     $a^2$

Now, to get the
covariance of progeny-mean and mid-parent value, we take the
product of progeny-mean × mid-parent and multiply it by the frequency
of the mating type, and then sum over mating types. This gives the
mean product (M.P.) from which we have to deduct a correction for
the population mean, since values are not here expressed as deviations
from the mean. The correction is simply the square of the population
mean ($M^2$) since the means of parents and of progeny are equal.
Both the M.P. and $M^2$ contain terms in $a^2$, in $ad$, and in $d^2$. By
collecting together these terms and simplifying a little we obtain

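$$
\begin{aligned}
\text{M.P.} &= a^2[p^3(p + q) + q^3(p + q)] + 2adpq(p^2 - q^2) + d^2pq(p^2 + 2pq + q^2) \\
M^2 &= a^2(p^2 - 2pq + q^2) + 4adpq(p - q) + 4d^2p^2q^2
\end{aligned}
$$

Then,

$$
\begin{aligned}
\mathrm{cov}_{O\bar{P}} &= \text{M.P.} - M^2 \\
&= a^2pq - 2adpq(p - q) + d^2pq(p - q)^2 \\
&= pq[a + d(q - p)]^2 \\
&= pq\alpha^2 \\
&= \tfrac{1}{2}V_A
\end{aligned}
\tag{9.3}
$$

when summed over all loci.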

So the genetic covariance of offspring with the mean of their parents
is equal to half the additive genetic variance. That this covariance
comes out the same as that of offspring and one parent need cause no
surprise when we note that the variance of mid-parent values is half
the variance of individual values (see below, p. 162).
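
The mean product and its correction can also be evaluated numerically from the six types of mating. The sketch below writes out, for each mating, the frequency, mid-parent value and progeny mean in terms of a and d as described in the text; the function name and trial values are assumptions made for illustration.

```python
def midparent_cov(p, a, d):
    """Covariance of progeny-mean and mid-parent for one locus (hypothetical helper)."""
    q = 1.0 - p
    # (frequency of mating, mid-parent value, mean value of progeny)
    matings = [
        (p ** 4, a, a),                                    # A1A1 x A1A1
        (4 * p ** 3 * q, 0.5 * (a + d), 0.5 * (a + d)),    # A1A1 x A1A2
        (2 * p ** 2 * q ** 2, 0.0, d),                     # A1A1 x A2A2
        (4 * p ** 2 * q ** 2, d, 0.5 * d),                 # A1A2 x A1A2
        (4 * p * q ** 3, 0.5 * (-a + d), 0.5 * (-a + d)),  # A1A2 x A2A2
        (q ** 4, -a, -a),                                  # A2A2 x A2A2
    ]
    mean_product = sum(f * mid * prog for f, mid, prog in matings)
    pop_mean = sum(f * prog for f, mid, prog in matings)  # same for parents and progeny
    cov = mean_product - pop_mean ** 2
    alpha = a + d * (q - p)
    v_a = 2 * p * q * alpha ** 2  # additive variance, equation 8.5
    return cov, v_a


cov, v_a = midparent_cov(p=0.4, a=1.0, d=0.75)
print(cov, 0.5 * v_a)  # the two numbers should agree
```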

Full sibs. The covariance of full sibs is the variance of the means
of full-sib families, and is got with little additional work from Table
9.2. The last column shows the squares of progeny means and it will
be seen that these squares are all exactly the same as the products of
progeny-mean × mid-parent, except for the two entries in the middle
involving terms in $d^2$. The mean square (M.S.) can therefore be got
from the mean product (M.P.) already calculated, thus

$$
\begin{aligned}
\text{M.S.} &= \text{M.P.} + d^2 \cdot 2p^2q^2 - \tfrac{1}{4}d^2 \cdot 4p^2q^2 \\
&= \text{M.P.} + d^2p^2q^2
\end{aligned}
$$

The correction for the mean is the same as before, so we have

$$
\begin{aligned}
\mathrm{cov}_{(FS)} &= \mathrm{cov}_{O\bar{P}} + d^2p^2q^2 \\
&= pq\alpha^2 + d^2p^2q^2
\end{aligned}
$$

Since $2pq\alpha^2 = V_A$ (from equation 8.5) and $4d^2p^2q^2 = V_D$ (from
equation 8.6) the covariance of full sibs is

$$\mathrm{cov}_{(FS)} = \tfrac{1}{2}V_A + \tfrac{1}{4}V_D \tag{9.4}$$

summing over all loci.

So the genetic covariance of full sibs is equal to half the additive
genetic variance plus a quarter of the dominance variance. This is the
only one of the relationships that we have considered where we find
the dominance variance contributing to the resemblance. The reason
is that full sibs have both parents in common, and a pair of full sibs
have a quarter chance of having the same genotype for any locus.
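
As a check on equation 9.4, the variance of full-sib family means can be built up by enumerating all pairs of parental genotypes under random mating, instead of reading the squares from a table. This is a sketch only; the function name, the dictionary layout and the trial values are assumptions, and Mendelian segregation at a single locus is taken for granted.

```python
from itertools import product


def full_sib_cov(p, a, d):
    """Variance of full-sib family means for one locus (hypothetical helper)."""
    q = 1.0 - p
    # genotype -> (frequency, probability that a gamete carries A1)
    genotypes = {
        "A1A1": (p * p, 1.0),
        "A1A2": (2 * p * q, 0.5),
        "A2A2": (q * q, 0.0),
    }
    mean_square = 0.0
    pop_mean = 0.0
    # Ordered pairs of parents; random mating gives frequency f1 * f2 to each pair
    for (f1, t1), (f2, t2) in product(genotypes.values(), repeat=2):
        prob_a1a1 = t1 * t2
        prob_a2a2 = (1 - t1) * (1 - t2)
        prob_a1a2 = 1 - prob_a1a1 - prob_a2a2
        family_mean = prob_a1a1 * a + prob_a1a2 * d - prob_a2a2 * a
        mean_square += f1 * f2 * family_mean ** 2
        pop_mean += f1 * f2 * family_mean
    cov = mean_square - pop_mean ** 2
    alpha = a + d * (q - p)
    v_a = 2 * p * q * alpha ** 2  # equation 8.5
    v_d = (2 * p * q * d) ** 2    # equation 8.6
    return cov, v_a, v_d


cov, v_a, v_d = full_sib_cov(p=0.4, a=1.0, d=0.75)
print(cov, 0.5 * v_a + 0.25 * v_d)  # the two numbers should agree
```

With d set to zero the dominance term vanishes and the full-sib covariance reduces to half the additive variance.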

Covariance due to epistatic interaction. Before we turn to the
environmental causes of resemblance between relatives let us briefly
examine the role of interaction variance arising from epistasis. In
Chapter 8 we noted that the interaction variance, $V_I$, is subdivided
into components according to the number of loci interacting, and
according to whether the interaction is between breeding values or
dominance deviations. The covariances of relatives, with the
contributions of the two-factor interactions included, are shown in Table 9.3
(for details see Kempthorne, 1955a, b). The offspring-parent
covariance refers equally to one parent and to mid-parent values.

Table 9.3
Covariances of relatives including the contributions of
two-factor interactions.

                                        Variance components and the
                                        coefficients of their contributions
Relatives                               $V_A$    $V_D$    $V_{AA}$    $V_{AD}$    $V_{DD}$
Offspring-parent: $cov_{OP}$ =          1/2      0        1/4         0           0
Half sibs: $cov_{(HS)}$ =               1/4      0        1/16        0           0
Full sibs: $cov_{(FS)}$ =               1/2      1/4      1/4         1/8         1/16
General: $cov$ =                        $x$      $y$      $x^2$       $xy$        $y^2$

For the sake of clarity the components of variance are shown at the
heads of the columns and their coefficients in the covariances are
listed below. For example, the offspring-parent covariance is
$\frac{1}{2}V_A + \frac{1}{4}V_{AA}$. The contributions of interaction to the covariances are
expressible in a simple general form, shown in the bottom line of the
table. If the covariance contains $xV_A$ then it contains also $x^2V_{AA}$; and
if it contains $yV_D$ it contains also $xyV_{AD}$ and $y^2V_{DD}$. Interactions
involving more than two loci contribute progressively smaller
proportions as the number of loci increases. The effect of the interaction
variance on the resemblance between relatives is, in principle, that
the offspring-parent covariance is not twice the half-sib covariance,
but a little more than twice; and that the excess of the full-sib
covariance over the half-sib represents not only dominance variance but
also some of the interaction variance.
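
The general rule for the two-factor coefficients is easily applied mechanically. The following sketch uses exact fractions; the function name and the dictionary keys are illustrative choices, not notation from the text.

```python
from fractions import Fraction


def covariance_coefficients(x, y):
    """Coefficients of the variance components, given x (for V_A) and y (for V_D)."""
    x, y = Fraction(x), Fraction(y)
    return {"VA": x, "VD": y, "VAA": x * x, "VAD": x * y, "VDD": y * y}


# x and y for the three relationships treated in this chapter
relatives = {
    "offspring-parent": (Fraction(1, 2), Fraction(0)),
    "half sibs": (Fraction(1, 4), Fraction(0)),
    "full sibs": (Fraction(1, 2), Fraction(1, 4)),
}

for name, (x, y) in relatives.items():
    print(name, covariance_coefficients(x, y))
```

For half sibs, for instance, x is a quarter and the coefficient of the additive × additive component is therefore one sixteenth.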

When the interaction variance was first discussed in Chapter 8
we said we would regard it as a complication to be circumvented,
noting only the consequences of neglecting it. These consequences
are now apparent. First, only small fractions of it contribute to the
covariances and therefore its effect on the resemblance between rela-
tives is small unless the amount of interaction variance is large in
comparison with the other components. And second, it appears that
there is little we can do in practice except ignore it, because, apart
from the special experimental methods mentioned on p. 139, there is
no practicable means of separating the interaction from the other
components. The consequences of ignoring the interaction variance
are thus that any estimate of $V_A$ made from offspring-parent
regressions will contain also $\frac{1}{2}V_{AA} + \frac{1}{4}V_{AAA}$ + etc.; any estimate of $V_A$ from
half-sib correlations will contain also $\frac{1}{4}V_{AA} + \frac{1}{16}V_{AAA}$ + etc.; and any
estimate of $V_D$ obtained from a full-sib correlation will contain also
portions of the interaction components. We noted in Chapter 7 that
the two definitions of breeding value given there are not equivalent
if there is interaction between loci. We can now see how this comes
about. Defined in terms of the measured values of progeny — the
practical definition — breeding value includes additive x additive
interaction deviations in addition to the average effects of the genes
carried by the parents; whereas, defined in terms of the average
effects of genes — the theoretical definition — it does not.

Effect of linkage. Throughout the discussion of the covariances
of relatives we have ignored the effects of linkage, assuming always
that the loci concerned segregate independently. The effects of
linkage in a random-mating population, where the coupling and
repulsion phases are in equilibrium, are as follows (Cockerham,
1956a). The covariances of offspring and parents are not affected,
but the covariances of half and full sibs are increased; the closer the
linkage the greater the increase. The additional covariance due to
linkage appears with the interaction component. Therefore what is
formally attributed to epistatic interaction may be in part due to
linkage.

Environmental Covariance

Genetic causes are not the only reasons for resemblance between
relatives; there are also environmental circumstances that tend to
make relatives resemble each other, some sorts of relatives more than
others. If members of a family are reared together, as with human
families or litters of pigs or mice, they share a common environment.
This means that some environmental circumstances that cause
differences between unrelated individuals are not a cause of difference
between members of the same family. In other words there is a com-
ponent of environmental variance that contributes to the variance
between means of families but not to the variance within the families,
and it therefore contributes to the covariance of the related individuals.
This between-group environmental component, for which we shall
use the symbol $V_{Ec}$, is usually called the common environment, a term
that seems more appropriate when we think of the component as a
cause of similarity between members of a group than when we think
of it as a cause of difference between members of different groups.
The remainder of the environmental variance, which we shall denote
by $V_{Ew}$, arises from causes of difference that are unconnected with
whether the individuals are related or not. It therefore appears in
the within-group component of variance, but does not contribute to
the between-group component, which is the variance of the true
means of the groups. In considerations of the resemblance between
relatives, therefore, the environmental variance must be divided into
two components:

$$V_E = V_{Ec} + V_{Ew} \tag{9.5}$$

one of the components, $V_{Ec}$, contributing to the covariance of the
related individuals.

The sources of common environmental variance are many and
varied, and only a few examples can be mentioned. Soil conditions
may differentiate families of plants when the members of a family are
grown together on the same plot: similarly the conditions of the cul-
ture medium may differentiate families of Drosophila or other small
animals. With farm animals, related individuals are likely to have
been reared on the same farm, and differences of climate or of manage-
ment contribute to the resemblance between the relatives. "Maternal
effects" are a frequent source of environmental difference between
families, especially with mammals. The young are subject to a
maternal environment during the first stages of their life, and this
influences the phenotypic values of many metric characters even
when measured on the adult, causing offspring of the same mother to
resemble each other. Finally, members of the same family tend to be
contemporaneous, and changes of climatic or nutritional conditions
tend to differentiate members of different families. This source of
common environmental variation is especially important in animals
that produce their young in broods or litters.

These various sources of common environmental variation con-
tribute chiefly to the resemblance between sibs, though some may
also cause resemblance between parent and offspring. Maternal
effects, in particular, often cause a resemblance between mother and
offspring as well as among the offspring themselves. Body size in mice
and other mammals provides an example. Large mothers tend to
provide better nutrition for their young, both before and after birth,
than small mothers. Therefore the young of large mothers tend to
grow faster, and the effect of the rapid early growth may persist, so
that when adult their body size is larger. Thus mothers and offspring
tend to resemble each other in body size.

It will be seen from the examples given that the nature of the
component of variance due to common environment differs according
to the circumstances. What we designate as the $V_{Ec}$ component
depends on the way in which individuals are grouped when we esti-
mate the observational components of phenotypic variance. What-
ever the form of the analysis, the part of the variance between the
means of groups that can be ascribed to environmental causes is
called the $V_{Ec}$ component. The nature of this component thus
depends on the form of the analysis applied. If the groups in the
analysis are full-sib families then the $V_{Ec}$ component represents
environmental causes of similarity between full sibs; if the groups are
half sibs it represents causes of similarity between half sibs. And in
parent-offspring relationships a comparable covariance term repre-
sents environmental causes of resemblance between offspring and
parent. Thus, whenever we measure a phenotypic covariance with
the object of using it to estimate a causal component of variance we
have to decide whether it includes an appreciable component due to
common environment, and this is often a matter of judgment based
on a biological understanding of the organism and the character. In
experiments, much of the $V_{Ec}$ component can often be eliminated by
suitable design. For example, members of the same family need not
always be reared in the same vial, cage, or plot; they can be
randomised over the rearing environments. Or, by replication, the $V_{Ec}$
component can be measured and suitable allowance made for it in the
resemblance between the relatives.

Thus relatives of all sorts may in principle be subject to an en-
vironmental source of resemblance. In what follows, however, we
shall make the simplification of disregarding the $V_{Ec}$ component for
all relatives except full sibs, though from time to time we shall put in
a reminder of its possible presence. Full sibs are subject to a com-
mon maternal environment and this is often the most troublesome
source of environmental resemblance to overcome by experimental
design. Consequently a $V_{Ec}$ component contributes more often and
in greater amount to the covariance of full sibs than to that of any
other sort of relative. The simplification of disregarding all other
sources of common environmental variance is therefore not entirely
unrealistic.

Phenotypic Resemblance

The covariance of phenotypic values is the sum of the covariances
arising from genetic and from environmental causes. Thus by
putting together the conclusions of the two preceding sections we
arrive at the phenotypic covariances given in Table 9.4. (It will be
remembered that some possible sources of environmental covariance
are being neglected, particularly in offspring-parent relationships
involving the mother.) In all these relationships except that of full
sibs the covariance is either a half or a quarter of the additive genetic
variance. By observing the phenotypic covariance of relatives we can
thus estimate the amount of additive variance in the population and
make the partition of the variance into additive versus the rest.

To arrive at the degree of resemblance expressed as a regression or
correlation coefficient we have to divide the covariance by the
appropriate variance. The resemblance between sibs is expressed as a
correlation and the covariance is divided by the total phenotypic
variance. The correlation between half sibs, for example, is therefore
$\frac{1}{4}V_A/V_P$. The resemblance between offspring and parent is expressed
as the regression of offspring on parent, and the covariance is
therefore divided by the variance of parents. In the case of single parents
this is again the phenotypic variance, and the regression of offspring
on one parent is thus $\frac{1}{2}V_A/V_P$. In a random-breeding population the
phenotypic variance of parents and offspring is the same, and then the
correlation between offspring and one parent is the same as the
regression. The case of mid-parent values, however, is a little different.
The covariance has to be divided by the variance of mid-parent values,
and this is half the phenotypic variance, for the following reason. Let
X and Y stand for the phenotypic values of male and female parents
respectively. Then $\sigma^2_X = \sigma^2_Y = V_P$. The mid-parent value is
$\frac{1}{2}X + \frac{1}{2}Y$, and the variance of mid-parent values, assuming X and Y to be
uncorrelated, is therefore
$\frac{1}{4}\sigma^2_X + \frac{1}{4}\sigma^2_Y = \frac{1}{2}\sigma^2_X = \frac{1}{2}V_P$. Thus
the regression of offspring on mid-parent is
$\frac{1}{2}V_A / \frac{1}{2}V_P = V_A/V_P$. The
correlation between offspring and mid-parent values, however, is
$\frac{1}{2}V_A/\sigma_{\bar{P}}\sigma_O$, where $\sigma_{\bar{P}}$ and $\sigma_O$ are the square roots
of the phenotypic variances of mid-parents and offspring respectively,
and this is not the same as the regression of offspring on mid-parent.

Table 9.4
Phenotypic Resemblance between Relatives

                                                                      Regression (b)
Relatives                     Covariance                              or correlation (t)
Offspring and one parent      $\frac{1}{2}V_A$                                $b = \frac{1}{2}V_A/V_P$
Offspring and mid-parent      $\frac{1}{2}V_A$                                $b = V_A/V_P$
Half sibs                     $\frac{1}{4}V_A$                                $t = \frac{1}{4}V_A/V_P$
Full sibs                     $\frac{1}{2}V_A + \frac{1}{4}V_D + V_{Ec}$              $t = (\frac{1}{2}V_A + \frac{1}{4}V_D + V_{Ec})/V_P$

The regressions of offspring on parents and the correlations of
sibs are shown in Table 9.4. All except the full-sib correlation are
simple fractions of the ratio $V_A/V_P$. Thus the different degrees of
resemblance between different sorts of relatives become apparent.
For example, the regression of offspring on one parent is twice the
correlation between half sibs, and the correlation between full sibs is
twice the correlation between half sibs if there is no dominance and
no common environment.
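
These relationships can be set out as a small calculation. The sketch below takes the causal components of variance as given and returns the four coefficients; the function name, the dictionary keys and the numerical values of the components are purely illustrative.

```python
def resemblance(v_a, v_d, v_ec, v_p):
    """Regressions (b) and correlations (t) from the causal components of variance."""
    return {
        "b, offspring on one parent": 0.5 * v_a / v_p,
        "b, offspring on mid-parent": v_a / v_p,
        "t, half sibs": 0.25 * v_a / v_p,
        "t, full sibs": (0.5 * v_a + 0.25 * v_d + v_ec) / v_p,
    }


# Hypothetical components: no dominance and no common environment
simple = resemblance(v_a=40.0, v_d=0.0, v_ec=0.0, v_p=100.0)
# The same additive variance, with dominance and common environment added
augmented = resemblance(v_a=40.0, v_d=10.0, v_ec=5.0, v_p=100.0)

for name in simple:
    print(name, simple[name], augmented[name])
```

In the first case the full-sib correlation is exactly twice the half-sib correlation; in the second it is greater than twice.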

The difference between the full-sib covariance and twice the
half-sib covariance can, in principle, be used to estimate the
dominance variance, $V_D$, provided there is no variance due to common
environment, though some of the variance due to epistatic interaction
would be included, as may be seen from Table 9.3. In practice,
however, it is usually very difficult to be certain that there is no
variance due to common environment, and estimates of the domin-
ance variance obtained in this way are generally to be regarded as
upper limits rather than as precise estimates.
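
The arithmetic behind this estimate follows from the covariances already given: the full-sib covariance less twice the half-sib covariance leaves a quarter of the dominance variance plus any common-environment variance. A minimal sketch, with a hypothetical function name and invented covariances, is:

```python
def dominance_variance_upper_limit(cov_full_sibs, cov_half_sibs):
    """Four times the excess of the full-sib covariance over twice the half-sib covariance."""
    return 4.0 * (cov_full_sibs - 2.0 * cov_half_sibs)


# Invented covariances built from V_A = 40, V_D = 10 and no common environment
print(dominance_variance_upper_limit(cov_full_sibs=22.5, cov_half_sibs=10.0))

# The same population with a common-environment variance of 3 added to the full sibs
print(dominance_variance_upper_limit(cov_full_sibs=25.5, cov_half_sibs=10.0))
```

In the second call the result exceeds the dominance variance used to build the figures, which is why such estimates are treated as upper limits.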

Table 9.5
The Resemblance between Relatives for some Characters in Man

                                        Correlation coefficient
Character             Reference         Parent-offspring     Full sib
Stature               (1)               0.51                 0.53
Span                  (1)               0.45                 0.54
Length of forearm     (1)               0.42                 0.48
Intelligence          (2)               0.49
Birth weight          (3)               -                    0.50

(1) Pearson and Lee (1903).
(2) Unweighted averages of several estimates, cited by Penrose (1949).
(3) Quoted from Robson (1955).

The chief use of measurements of the degree of resemblance
between relatives is to estimate the proportionate amount of additive
genetic variance, $V_A/V_P$, which is the heritability. The meaning of the
heritability and the methods of estimating it will be considered more
fully in the next chapter. To conclude this chapter we give in Table
9.5 some examples of correlations between relatives in man. These
are undoubtedly complicated by covariance due to common
environment, and also by assortative mating. The correlation between
husband and wife for intelligence, for example, is as high as 0.58
(see Penrose, 1949). For these reasons human correlations cannot
easily be used to partition the variation into its components.

CHAPTER 10

HERITABILITY

The heritability of a metric character is one of its most important
properties. It expresses, as we have seen, the proportion of the total
variance that is attributable to the average effects of genes, and this is
what determines the degree of resemblance between relatives. But
the most important function of the heritability in the genetic study
of metric characters has not yet been mentioned, namely its predictive
role, expressing the reliability of the phenotypic value as a guide to
the breeding value. Only the phenotypic values of individuals can
be directly measured, but it is the breeding value that determines their
influence on the next generation. Therefore if the breeder or experi-
menter chooses individuals to be parents according to their pheno-
typic values, his success in changing the characteristics of the popu-
lation can be predicted only from a knowledge of the degree of corre-
spondence between phenotypic values and breeding values. This
degree of correspondence is measured by the heritability, as the fol-
lowing considerations will show.

The heritability is defined as the ratio of additive genetic variance
to phenotypic variance:

$$h^2 = \frac{V_A}{V_P} \tag{10.1}$$

(The customary symbol $h^2$ stands for the heritability itself and not for
its square. The symbol derives from Wright's (1921) terminology,
where h stands for the corresponding ratio of standard deviations.)
An equivalent meaning of the heritability is the regression of breeding
value on phenotypic value:

$$h^2 = b_{AP} \tag{10.2}$$

The equivalence of these meanings can be seen from reasoning similar
to that by which we derived the genetic covariance of offspring and
one parent on p. 153. If we split the phenotypic value into breeding
value and a remainder (R) consisting of the environmental,
dominance, and interaction deviations, then $P = A + R$. Since A and R are
uncorrelated, $\mathrm{cov}_{AP} = V_A$ and so $b_{AP} = V_A/V_P$.

We may note also that the correlation between breeding values
and phenotypic values, $r_{AP}$, is equal to the square root of the
heritability. This follows from the general relationship between
correlation and regression coefficients, which gives

$$r_{AP} = b_{AP}\frac{\sigma_P}{\sigma_A} = h^2 \cdot \frac{1}{h} = h \tag{10.3}$$
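
Both meanings of the heritability can be illustrated by a small simulation. The sketch below assumes, purely for convenience, that breeding values and remainders are normally distributed and uncorrelated; the function names, the variances and the sample size are all invented for the purpose.

```python
import math
import random


def regression_and_correlation(x, y):
    """Regression of y on x, and the correlation between x and y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx, sxy / math.sqrt(sxx * syy)


def simulate(v_a, v_r, n, seed=1):
    """Draw breeding values A and phenotypic values P = A + R (normality assumed)."""
    rng = random.Random(seed)
    breeding = [rng.gauss(0.0, math.sqrt(v_a)) for _ in range(n)]
    phenotype = [a + rng.gauss(0.0, math.sqrt(v_r)) for a in breeding]
    return breeding, phenotype


# Illustrative variances giving a heritability of 0.4
breeding, phenotype = simulate(v_a=0.4, v_r=0.6, n=100_000)
b_ap, r_ap = regression_and_correlation(phenotype, breeding)  # A regressed on P
print(b_ap, r_ap)  # close to 0.4 and to the square root of 0.4
```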

By regarding the heritability as the regression of breeding value
on phenotypic value we see that the best estimate of an individual's
breeding value is the product of its phenotypic value and the heri-
tability:

$$A_{(\text{expected})} = h^2P \tag{10.4}$$

breeding values and phenotypic values both being reckoned as
deviations from the population mean. In other words the heritability
expresses the reliability of the phenotypic value as a guide to the
breeding value, or the degree of correspondence between phenotypic
value and breeding value. For this reason the heritability enters into
almost every formula connected with breeding methods, and many
practical decisions about procedure depend on its magnitude. These
matters, however, will be considered in the next chapters; here we
are concerned only to point out that the determination of the heri-
tability is one of the first objectives in the genetic study of a metric
character.
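
As a small worked case of equation 10.4, the expected breeding value is the phenotypic deviation multiplied by the heritability. The function name and all the figures below are hypothetical.

```python
def expected_breeding_value(phenotype, population_mean, h2):
    """Expected breeding value, as a deviation from the population mean (equation 10.4)."""
    return h2 * (phenotype - population_mean)


# Invented figures: an individual 2 units above a mean of 10, heritability 0.35
deviation = expected_breeding_value(phenotype=12.0, population_mean=10.0, h2=0.35)
print(deviation)  # expected breeding value as a deviation from the mean
```

With a heritability of 0.35, only about a third of the individual's phenotypic superiority is expected to be superiority of breeding value.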

It is important to realise that the heritability is a property not
only of a character but also of the population and of the environ-
mental circumstance to which the individuals are subjected. Since
the value of the heritability depends on the magnitude of all the com-
ponents of variance, a change in any one of these will affect it. All
the genetic components are influenced by gene frequencies and may
therefore differ from one population to another, according to the past
history of the population. In particular, small populations maintained
long enough for an appreciable amount of fixation to have taken place
are expected to show lower heritabilities than large populations.
The environmental variance is dependent on the conditions of culture
or management: more variable conditions reduce the heritability,
more uniform conditions increase it. So, whenever a value is stated
for the heritability of a given character it must be understood to refer
to a particular population under particular conditions. Values found
in other populations under other circumstances will be more or less
the same according to whether the structure of the population and the
environmental conditions are more or less alike.
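
The dependence of the heritability on the conditions is a matter of simple arithmetic, as the following sketch shows. The function name and the variances are invented, and the non-additive genetic variance is lumped into one term for brevity.

```python
def heritability(v_a, v_non_additive, v_e):
    """Ratio of additive genetic variance to total phenotypic variance."""
    v_p = v_a + v_non_additive + v_e
    return v_a / v_p


# The same genetic variances under uniform and under variable conditions
uniform = heritability(v_a=4.0, v_non_additive=1.0, v_e=5.0)
variable = heritability(v_a=4.0, v_non_additive=1.0, v_e=15.0)
print(uniform, variable)  # the second is the smaller
```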

Very many determinations of heritabilities have been made for a
variety of characters, chiefly in farm animals. Some representative
examples are given in Table 10.1. Different determinations of the
heritability of the same character show a considerable range of varia-
tion. This is partly due to statistical sampling, but some of the
variation reflects real differences between the populations or the
conditions under which they are studied. For these reasons, and be-
cause estimations of heritabilities can seldom be very precise, the
figures quoted in the table are rounded to the nearest 5 per cent.
From Table 10.1 it can be seen that the magnitude of the heritability
shows some connexion with the nature of the character. On the
whole, the characters with the lowest heritabilities are those most
closely connected with reproductive fitness, while the characters
with the highest heritabilities are those that might be judged on bio-
logical grounds to be the least important as determinants of natural
fitness. This is well seen in the gradation of the four characters of
Drosophila.

Table 10.1

Approximate values of the heritability of various characters
in domestic and laboratory animals.

Cattle
  Amount of white spotting in Friesians (Briquet and Lush, 1947)           0.95
  Butterfat % (Johansson, 1950)                                            0.6
  Milk-yield (Johansson, 1950)                                             0.3
  Conception rate (in 1st service) (A. Robertson, 1957a)                   0.01

Pigs
  Thickness of back fat (Fredeen and Jonsson, 1957)                        0.55
  Body length (Fredeen and Jonsson, 1957)                                  0.5
  Weight at 180 days (Whatley, 1942)                                       0.3
  Litter size (Lush and Molln, 1942)                                       0.15

Sheep (Australian Merino)
  Length of wool (Morley, 1955)                                            0.55
  Weight of fleece (Morley, 1955)                                          0.4
  Body weight (Morley, 1955)                                               0.35

Poultry (White Leghorn)
  Egg weight (Lerner and Cruden, 1951)                                     0.6
  Age at laying of first egg (King and Henderson, 1954b)                   0.5
  Egg-production (annual, of surviving birds) (King and Henderson, 1954b)  0.3
  Egg-production (annual, of all birds) (King and Henderson, 1954b)        0.2
  Body weight (Lerner and Cruden, 1951)                                    0.2
  Viability (Robertson and Lerner, 1949)                                   0.1

Rats
  Expression of hooded gene (amount of white)
    (from data of Castle and Wright, 1916)                                 0.4
  Ovary response to gonadotrophic hormone (Chapman, 1946)                  0.35
  Age at puberty in females (Warren and Bogart, 1952)                      0.15

Mice
  Tail length at 6 weeks (Falconer, 1954b)                                 0.6
  Body weight at 6 weeks (Falconer, 1953)                                  0.35
  Litter size (1st litters) (Falconer, 1955)                               0.15

Drosophila melanogaster
  Abdominal bristle number (Clayton, Morris, and Robertson, 1957)          0.5
  Body size (thorax length) (F. W. Robertson, 1957b)                       0.4
  Ovary size (F. W. Robertson, 1957a)                                      0.3
  Egg production (F. W. Robertson, 1957b)                                  0.2

Estimation of Heritability

Let us first compare the merits of the different sorts of relatives
for estimating either the additive genetic variance from the covariance,
or the heritability from the regression or correlation coefficient.
Table 10.2 shows again the composition of the phenotypic covariances,
and shows also the regression or correlation expressed in terms of the
heritability.

Table 10.2

                                                                      Regression (b) or
Relatives                     Covariance                              correlation (t)
Offspring and one parent      $\frac{1}{2}V_A$                                $b = \frac{1}{2}h^2$
Offspring and mid-parent      $\frac{1}{2}V_A$                                $b = h^2$
Half sibs                     $\frac{1}{4}V_A$                                $t = \frac{1}{4}h^2$
Full sibs                     $\frac{1}{2}V_A + \frac{1}{4}V_D + V_{Ec}$              $t \geq \frac{1}{2}h^2$

The choice depends on the circumstances. In addition
to the practical matter of which sorts of relatives are in fact
obtainable, there are two points to consider: sampling error and
environmental sources of covariance. The statistical precision of the estimate
depends on the experimental design and also on the magnitude of the
heritability being estimated, and so no hard and fast rule can be
made. The matter of statistical precision will be further considered
in a later section of this chapter. The question of environmental
sources of covariance is generally more important than the statistical
precision of the estimate, because it may introduce a bias which
cannot be overcome by statistical procedure. From considerations of
the biology of the character and the experimental design we have to
decide which covariance is least likely to be augmented by an en-
vironmental component, a matter already discussed in the last
chapter. Generally speaking the half-sib correlation and the regres-
sion of offspring on father are the most reliable from this point of
view. The regression of offspring on mother is sometimes liable to
give too high an estimate on account of maternal effects, as it would,
for example, with body size in most mammals. The full-sib corre-
lation, which is the only relationship for which an environmental
component of covariance is shown in the table, is the least reliable of
all. The component due to common environment is often present in
large amount and is difficult to overcome by experimental design;
and the full-sib covariance is further augmented by the dominance
variance. The full-sib correlation can therefore seldom do more than
set an upper limit to the heritability.
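
The translation of an observed coefficient into an estimate of heritability is then a matter of multiplying by the appropriate factor. The sketch below is illustrative only: the names and the trial coefficients are invented, and the full-sib figure is to be read as an upper limit for the reasons just given.

```python
# Multipliers that convert an observed regression or correlation into h^2
FACTORS = {
    "offspring and one parent": 2.0,  # b = (1/2) h^2
    "offspring and mid-parent": 1.0,  # b = h^2
    "half sibs": 4.0,                 # t = (1/4) h^2
    "full sibs": 2.0,                 # t >= (1/2) h^2, so 2t is an upper limit
}


def heritability_estimate(relationship, coefficient):
    """Estimate of heritability from an observed coefficient (hypothetical helper)."""
    return FACTORS[relationship] * coefficient


# Invented coefficients, for illustration only
print(heritability_estimate("half sibs", 0.12))
print(heritability_estimate("offspring and one parent", 0.25))
print(heritability_estimate("full sibs", 0.30))  # upper limit only
```

The factor of four for half sibs also multiplies the sampling error of the correlation.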

Example 10.1. The heritability of abdominal bristle number in
Drosophila melanogaster has been determined by three different methods,
applied to the same population (Clayton, Morris, and Robertson, 1957),
with the following results:

Method of estimation              Heritability
Offspring-parent regression       0.51 ± 0.07
Half-sib correlation              0.48 ± 0.11
Full-sib correlation              0.53 ± 0.07
Combined estimate                 0.52

The estimates obtained by the three methods are in very satisfactory
agreement. In this case, the character (bristle number) is free of
complications arising from maternal effects and common environment.

Let us now consider briefly some technical matters concerning the
translation of observational data into estimates of heritability. We
shall deal first with the estimation of the heritability; and we shall
later discuss the standard error of the estimate, and the design that
gives an experiment its greatest precision.

Selection of parents and assortative mating. In the treatment
of resemblance between relatives we have supposed the parents to be
a random sample of their generation and to be mated at random. Quite
often, however, one or other of these conditions does not hold, and
the choice of which sort of relative to use in the estimation of herita-
bility is then somewhat restricted. In experimental and domesticated
populations the parents are often a selected group and consequently
the phenotypic variance among the parents is less than that of the
population as a whole and less than that of the offspring. The
regression of offspring on parents, however, is not affected by the selection
of parents because the covariance is reduced to the same extent as the
variance of the parents, so that the slope of the regression line is
unaltered. Thus the regression of offspring on one parent is a valid
measure of $\frac{1}{2}h^2$, and that of offspring on mid-parent is a valid measure
of $h^2$. But the covariance is not a valid measure of $V_A$, nor the
variance of parents of $V_P$; moreover, the correlation and regression
coefficients are not equal.
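
The statistical point can be illustrated by simulation. The sketch below assumes a linear regression of offspring on parent with normally distributed, uniform residuals; the true slope, the variances, the proportion selected and all names are invented for the illustration.

```python
import math
import random


def slope_correlation_variance(x, y):
    """Regression of y on x, correlation of x and y, and variance of x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx, sxy / math.sqrt(sxx * syy), sxx / n


rng = random.Random(7)
true_slope = 0.25  # invented value for the regression of offspring on one parent
parents = [rng.gauss(0.0, 1.0) for _ in range(200_000)]
offspring = [true_slope * p + rng.gauss(0.0, 0.5) for p in parents]

b_all, r_all, var_all = slope_correlation_variance(parents, offspring)

# Keep only the parents in the top fifth of phenotypic values
cutoff = sorted(parents)[int(0.8 * len(parents))]
selected = [(p, o) for p, o in zip(parents, offspring) if p >= cutoff]
b_sel, r_sel, var_sel = slope_correlation_variance(
    [p for p, _ in selected], [o for _, o in selected]
)

print(b_all, b_sel)      # slopes nearly equal
print(var_all, var_sel)  # variance of parents much reduced by selection
print(r_all, r_sel)      # correlation reduced by selection
```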

Sometimes the mating of parents is not made at random but
according to their phenotypic resemblance, a system known as
assortative mating. There is then a correlation between the pheno-
typic values of the mated pairs. The consequences of assortative
mating are described by Reeve (1955b) but they are too complicated
to explain in detail here. They can be deduced by modification of
Table 9.2, the frequencies of the different types of mating being
altered according to the correlation between the mated pairs. The
variance of mid-parent values is increased and consequently also the
covariance of full sibs. The regression of offspring on mid-parent,
however, is very little affected and it can be taken as a valid measure
of $h^2$. The increased variance of mid-parent values under assortative
mating has the practical advantage of reducing the sampling error of
the regression coefficient and thus of the estimate of heritability.

Offspring-parent relationship. The estimation of heritability
from the regression of offspring on parent is comparatively straight-
forward and needs little comment apart from the points mentioned in
the preceding paragraphs. The data are obtained in the form of
measurements of parents and the mean values of their offspring. The
covariance is then computed in the usual way from the cross-products
of the paired values. The mean values of offspring may be weighted
according to the number of offspring in each family, if the numbers
differ. The appropriate weighting is discussed by Kempthorne and
Tandon (1953) and by Reeve (1955c).

Fig. 10.1. Regression of offspring on mid-parent for wing-length
in Drosophila, as explained in Example 10.2. Mid-parent values are
shown along the horizontal axis, and mean value of offspring along
the vertical axis. (Drawn from data kindly supplied by Dr E. C.
R. Reeve.)

Example 10.2. Fig. 10.1 illustrates the regression of offspring on
mid-parent values for wing length in Drosophila melanogaster (Reeve and
Robertson, 1953). There are 37 pairs of parents and a mean of 273
offspring were measured from each pair of parents. The parents were
mated assortatively, with the result that the variance of mid-parent values
is greater than it would be if mating had been at random. Each point on
the graph represents the mean value of one pair of parents (measured along
the horizontal axis), and the mean value of their offspring (measured along
the vertical axis). The axes are marked at intervals of 1/100 mm., and they
intersect at the mean value of all parents and all offspring. The sloping
line is the linear regression of offspring on mid-parent. The slope of this
line estimates the heritability, and has the value ( ± standard error):

$$h^2 = b_{O\bar{P}} = 0.577 \pm 0.07$$

A complication in the use of the regression of offspring on mid-
parent arises if the variance is not equal in the two sexes. We noted
in the previous chapter that the genetic covariance of offspring and
mid-parent is equal to half the additive variance on condition that the
sexes are equal in variance. If this is not so, the regression on mid-
parent cannot, strictly speaking, be used, and the heritability must
be estimated separately for each sex from the regression of daughters
on mothers and of sons on fathers. If the heritabilities are found to
be equal in the two sexes, then a joint estimate can be made from the
regression on mid-parent, by taking the mean value of the offspring
as the unweighted mean of males and females.

Sib analysis. The estimation of heritability from half sibs is
more complicated than appears at first sight and needs more detailed
comment. A common form in which data are obtained with animals
is the following. A number of males (sires) are each mated to several
females (dams), and a number of offspring from each female are
measured to provide the data. The individuals measured thus form a
population of half-sib and full-sib families. An analysis of variance
is then made by which the phenotypic variance is divided into
observational components attributable to differences between the
progeny of different males (the between-sire component, $\sigma^2_S$); to
differences between the progeny of females mated to the same male
(between-dam, within-sires, component, $\sigma^2_D$); and to differences
between individual offspring of the same female (within-progenies
component, $\sigma^2_W$). The form of the analysis is shown in Table 10.3.
There are supposed to be s sires, each mated to d dams, which
produce k offspring each. The values of the mean squares are
denoted by $MS_S$, $MS_D$ and $MS_W$. The mean square within progenies
is itself the estimate of the within-progeny variance component,
$\sigma^2_W$; but the other mean squares are not the variance components.
The compositions of the mean squares in terms of the observational
components of variance are shown in the right-hand column of the
table, consideration of which will show how the variance components
are to be estimated. The between-dam mean square, for example, is
made up of the within-progeny component together with k times the
between-dam component; so the between-dam component is estimated
as $\sigma^2_D = (1/k)(MS_D - MS_W)$, i.e. we deduct the mean square for
progenies from the mean square for dams and divide by the number
of offspring per dam. Similarly the between-sire component is
estimated as $\sigma^2_S = (1/dk)(MS_S - MS_D)$, where dk is the number of
offspring per sire. If there are unequal numbers of offspring from the
dams, or of dams in the sire groups, the exact solution, which is
described by King and Henderson (1954a), Williams (1954), and
Snedecor (1956, section 10.17) becomes too complicated for
description here. We can, however, use the mean values of d and k with
little error, provided the inequality of numbers is not very great.

Table 10.3
Form of Analysis of Half-Sib and Full-Sib Families

Source                         d.f.        Mean Square   Composition of Mean Square
Between sires                  s - 1       $MS_S$        $= \sigma^2_W + k\sigma^2_D + dk\sigma^2_S$
Between dams (within sires)    s(d - 1)    $MS_D$        $= \sigma^2_W + k\sigma^2_D$
Within progenies               sd(k - 1)   $MS_W$        $= \sigma^2_W$

s = number of sires
d = number of dams per sire
k = number of offspring per dam
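The steps by which the observational components are obtained from the mean squares may be easier to follow in code. What follows is a minimal sketch for the balanced case only (equal numbers of dams per sire and of offspring per dam); the function name, the layout of the data as nested lists, and the figures themselves are hypothetical and serve only to make the arithmetic visible.

```python
def sib_anova(data):
    """Balanced nested analysis of variance.

    data[i][j] is the list of k offspring values from dam j mated to sire i.
    Returns the three mean squares and the observational components.
    """
    s = len(data)          # number of sires
    d = len(data[0])       # dams per sire
    k = len(data[0][0])    # offspring per dam

    grand = sum(x for sire in data for dam in sire for x in dam) / (s * d * k)
    sire_means = [sum(x for dam in sire for x in dam) / (d * k) for sire in data]
    dam_means = [[sum(dam) / k for dam in sire] for sire in data]

    # Sums of squares for the three sources of variation
    ss_s = d * k * sum((m - grand) ** 2 for m in sire_means)
    ss_d = k * sum((dam_means[i][j] - sire_means[i]) ** 2
                   for i in range(s) for j in range(d))
    ss_w = sum((x - dam_means[i][j]) ** 2
               for i in range(s) for j in range(d) for x in data[i][j])

    ms_s = ss_s / (s - 1)
    ms_d = ss_d / (s * (d - 1))
    ms_w = ss_w / (s * d * (k - 1))

    # Observational components, from the compositions of the mean squares
    var_w = ms_w
    var_d = (ms_d - ms_w) / k
    var_s = (ms_s - ms_d) / (d * k)
    return {"MS_S": ms_s, "MS_D": ms_d, "MS_W": ms_w,
            "var_S": var_s, "var_D": var_d, "var_W": var_w}

# Hypothetical records: 2 sires, 2 dams per sire, 2 offspring per dam
example = [[[10, 12], [14, 16]],
           [[20, 22], [18, 16]]]
result = sib_anova(example)
print(result)
```

With unequal family sizes the simple divisors k and dk no longer apply, so this sketch should not be used for such data without the adjustments mentioned in the text.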

The next step is to deduce the connexions between the observa-
tional components that have been estimated from the data and the
causal components, in particular the additive genetic variance, the
estimation of which is the main purpose of the analysis. Though all
the information needed has already been given, the interpretation of
the observational components, which is given in Table 10.4, is not
immediately apparent without explanation. The first point to note
is that the estimate of the phenotypic variance is given by the sum
($\sigma^2_T$) of the three observational components: $V_P = \sigma^2_T = \sigma^2_S + \sigma^2_D + \sigma^2_W$.
This is not necessarily equal to the observed variance as estimated
from the total sum of squares, though the two seldom differ by much.
Now consider the interpretation of the between-sire component,
$\sigma^2_S$. This is the variance between the means of half-sib families and
it therefore estimates the phenotypic covariance of half sibs,
$\mathrm{cov}_{(HS)}$, which is $\tfrac{1}{4}V_A$. Thus $\sigma^2_S = \tfrac{1}{4}V_A$. Next consider the
within-progeny component, $\sigma^2_W$. Since any between-group variance
component is equal to the covariance of the members of the groups, it
follows that a within-group component is equal to the total variance
minus the covariance of members of the groups. The progenies of the
dams are full-sib families and so the within-progeny variance estimates
$V_P - \mathrm{cov}_{(FS)}$. This leads to the interpretation
$\sigma^2_W = \tfrac{1}{2}V_A + \tfrac{3}{4}V_D + V_{Ew}$.
Finally, there remains the between-dam component, and what it
estimates can be found by subtraction as follows:

$$\sigma^2_D = \sigma^2_T - \sigma^2_S - \sigma^2_W = \mathrm{cov}_{(FS)} - \mathrm{cov}_{(HS)} = \tfrac{1}{4}V_A + \tfrac{1}{4}V_D + V_{Ec}$$

Consideration of the between-sire and between-dam components will
show that their sum gives an estimate of the full-sib covariance,
$\mathrm{cov}_{(FS)}$, but this provides no new information for estimating the
causal components. These conclusions about the connexion between
observational and causal components of variance are summarised in
Table 10.4. The contributions of the interaction variance to the
observational components are given by Kempthorne (1955a), and
can be deduced from the contributions to the covariances given in
Table 9.3.

Table 10.4
Interpretation of the observational components of variance
in a sib analysis

Observational component                                          Covariance and causal components estimated
Sires:         $\sigma^2_S$                                      $= \mathrm{cov}_{(HS)}$                          $= \tfrac{1}{4}V_A$
Dams:          $\sigma^2_D$                                      $= \mathrm{cov}_{(FS)} - \mathrm{cov}_{(HS)}$    $= \tfrac{1}{4}V_A + \tfrac{1}{4}V_D + V_{Ec}$
Progenies:     $\sigma^2_W$                                      $= V_P - \mathrm{cov}_{(FS)}$                    $= \tfrac{1}{2}V_A + \tfrac{3}{4}V_D + V_{Ew}$
Total:         $\sigma^2_T = \sigma^2_S + \sigma^2_D + \sigma^2_W$   $= V_P$                                      $= V_A + V_D + V_{Ec} + V_{Ew}$
Sires + Dams:  $\sigma^2_S + \sigma^2_D$                         $= \mathrm{cov}_{(FS)}$                          $= \tfrac{1}{2}V_A + \tfrac{1}{4}V_D + V_{Ec}$

Example 10.3. As an illustration of the estimation of heritability from
a sib analysis we refer to the study of Danish Landrace pigs based on the
records of the Danish Pig Progeny Testing Stations (Fredeen and Jonsson,
1957). The data came from 468 sires each mated to 2 dams, the analysis
being made on the records of 2 male and 2 female offspring from each
dam. Only one such analysis is given here: that of body length in the male
offspring. The analysis, shown in the table, was made within stations and
within years, and this accounts for the degrees of freedom being fewer than
would appear appropriate from the numbers stated above. The interpretation
of the analysis, shown at the foot of the table, has been slightly
simplified by the omission of some minor adjustments not relevant for us
at this stage. The between-dam component is not greater than the
between-sire component, so there cannot be much non-additive genetic
variance or variance due to common environment. The two estimates of the
heritability, from the sire and dam components respectively, can therefore
be regarded as equally reliable, and their combination based on the
resemblance between full sibs may be taken as the best estimate.

Sib analysis of body length in Danish Landrace pigs; data
for male offspring only (from Fredeen and Jonsson, 1957).

Source                        d.f.   Mean Square   Component of variance
Between sires                 432    6.03          $\sigma^2_S = \tfrac{1}{4}(6.03 - 3.81) = 0.555$
Between dams, within sires    468    3.81          $\sigma^2_D = \tfrac{1}{2}(3.81 - 2.87) = 0.47$
Within progenies              936    2.87          $\sigma^2_W = 2.87$
                                                   $\sigma^2_T = 3.895$

Interpretation of analysis

Sib correlations                                                        Estimates of heritability
Half sibs: $t_{(HS)} = \sigma^2_S/\sigma^2_T = 0.142$                   Sire-component: $h^2 = 4\sigma^2_S/\sigma^2_T = 0.57$
                                                                        Dam-component:  $h^2 = 4\sigma^2_D/\sigma^2_T = 0.48$
Full sibs: $t_{(FS)} = (\sigma^2_S + \sigma^2_D)/\sigma^2_T = 0.263$    Sire + Dam:     $h^2 = 2(\sigma^2_S + \sigma^2_D)/\sigma^2_T = 0.53$
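The arithmetic of Example 10.3 can be retraced in a few lines. The sketch below takes the three mean squares quoted in the example, with d = 2 dams per sire and k = 2 male offspring per dam, and applies the interpretation of Table 10.4; the function and variable names are hypothetical.

```python
def components_from_mean_squares(ms_s, ms_d, ms_w, d, k):
    """Observational components of variance from a balanced sib analysis."""
    var_w = ms_w
    var_d = (ms_d - ms_w) / k
    var_s = (ms_s - ms_d) / (d * k)
    return var_s, var_d, var_w

# Mean squares for body length in male Danish Landrace pigs (Example 10.3)
var_s, var_d, var_w = components_from_mean_squares(6.03, 3.81, 2.87, d=2, k=2)
var_t = var_s + var_d + var_w

# Sib correlations
t_hs = var_s / var_t              # half sibs
t_fs = (var_s + var_d) / var_t    # full sibs

# Estimates of heritability
h2_sire = 4 * var_s / var_t
h2_dam = 4 * var_d / var_t
h2_both = 2 * (var_s + var_d) / var_t

for name, value in [("t(HS)", t_hs), ("t(FS)", t_fs), ("h2 sire", h2_sire),
                    ("h2 dam", h2_dam), ("h2 sire+dam", h2_both)]:
    print(f"{name:12s} {value:.3f}")
```

The dam-component estimate is valid only in so far as non-additive variance and common environment are negligible, as the example itself points out.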

Example 10.4. We have not yet had an example to illustrate the effect
of common environment in augmenting the full-sib correlation. This is
provided by body size in mice. The analysis given in table (i) refers to the
weight of female mice at 6 weeks of age (J. C. Bowman, unpublished).
There were 719 offspring from 74 sires and 192 dams, each with one
litter. These were spread over 4 generations and the analysis was made
within generations. The analysis is complicated by the inequality of the
number of offspring per dam and of dams per sire. We shall not attempt
to explain the adjustments made for these inequalities, but simply give
the compositions of the mean squares from which the components are
estimated.

Table (i)

Source      d.f.   Mean Square   Composition of M.S.                              Components
Sires       70     17.10         $\sigma^2_W + k'\sigma^2_D + dk'\sigma^2_S$      $\sigma^2_S = 0.48$
Dams        118    10.79         $\sigma^2_W + k\sigma^2_D$                       $\sigma^2_D = 2.47$
Progenies   527    2.19          $\sigma^2_W$                                     $\sigma^2_W = 2.19$
                                                                                  $\sigma^2_T = 5.14$

k = 3.48; k' = 4.16; d = 2.33

The dam component is much greater than the sire component,
indicating a substantial amount of variance due to common environment.
Therefore only the sire component can be used to estimate the heritability.
The estimate obtained is $h^2 = 4 \times 0.48/5.14 = 0.37$. Let us now use the
analysis to estimate the causal components according to the interpretation
given in Table 10.4, but with the assumption that non-additive genetic
variance is negligible in amount. Table (ii) gives the estimates and shows
how they are derived. The percentage contribution of each component to the
total variance is given in the right-hand column. It will be seen that the
variance due to common environment ($V_{Ec}$) amounts to 39 per cent of the
total, and is greater than the environmental variance within full-sib
families ($V_{Ew}$) which amounts to only 24 per cent of the total.

Table (ii)

$V_P    = \sigma^2_T$                   = 5.14 = 100%
$V_A    = 4\sigma^2_S$                  = 1.92 =  37%
$V_{Ec} = \sigma^2_D - \sigma^2_S$      = 1.99 =  39%
$V_{Ew} = \sigma^2_W - 2\sigma^2_S$     = 1.23 =  24%
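The partition of the variance in Example 10.4 follows directly from the three observational components once non-additive genetic variance is taken to be negligible. A short sketch of the calculation, with hypothetical function and variable names, is:

```python
def causal_components(var_s, var_d, var_w):
    """Causal components from the observational components of a sib analysis.

    Assumes, as the example does, that non-additive genetic variance
    is negligible.
    """
    v_p = var_s + var_d + var_w     # phenotypic variance
    v_a = 4 * var_s                 # additive genetic variance
    v_ec = var_d - var_s            # common-environment variance
    v_ew = var_w - 2 * var_s        # within-family environmental variance
    return {"V_P": v_p, "V_A": v_a, "V_Ec": v_ec, "V_Ew": v_ew}

# Observational components for 6-week weight of female mice (Example 10.4)
parts = causal_components(var_s=0.48, var_d=2.47, var_w=2.19)
shares = {name: 100 * value / parts["V_P"] for name, value in parts.items()}

for name in ("V_P", "V_A", "V_Ec", "V_Ew"):
    print(f"{name:5s} {parts[name]:5.2f} {shares[name]:5.0f}%")
```

The three causal components necessarily add up to the total, since the coefficients of the sire component (4, -1 and -2) sum to one.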

Intra-sire regression of offspring on dam. The heritability
can be estimated from the offspring-parent relationship in a popula-
tion with the structure described in the foregoing section, but a slight
modification is necessary. Since each male is mated to several females,
the regression of offspring on mid-parent is inappropriate; and, since
there are usually rather few male parents, the simple regressions on
one or other parent are both unsuitable. The heritability can, how-
ever, be satisfactorily estimated from the average regression of off-
spring on dams, calculated within sire groups. That is to say, the
regression of offspring on dam is calculated separately for each set of
dams mated to one sire, and the regressions from each set pooled in a
weighted average. This method is commonly used for the estimation
of heritabilities in farm animals. The intra-sire regression of
offspring on dam estimates half the heritability, as the following
consideration will show. The progeny of one sire has a mean deviation
from the population mean equal to half the breeding value of the sire,
provided the females he is mated to are a random sample from the
population. The progeny of one dam deviates from the mean of the
sire group by half the breeding value of the dam. Therefore the
within-sire covariance of offspring and dam is equal to half the
additive variance of the population as a whole; and the within-sire
regression of offspring on dam is equal to half the heritability, just
like the simple regression of offspring on one parent. The validity
of the estimate is, of course, dependent on the absence of maternal
effects contributing to the resemblance between daughters and dams.
Inequality of the variance of males and females calls for an adjustment
if the heritability is to be estimated from the intra-sire regression of
male offspring on dams. The regression coefficient should then be
multiplied by the ratio of the phenotypic standard deviation of females
to that of males.

Example 10.5. The heritability of abdominal bristle-number in
Drosophila melanogaster, estimated from the offspring-parent regression,
was cited in Example 10.1. This was in fact a joint estimate based on
intra-sire regressions of daughters on dams and of sons on dams, the latter
being corrected for inequality of variance in the two sexes (Clayton, Morris,
and Robertson, 1957). The separate regression coefficients, with the cor-
rection for inequality of variances, and the estimates of the heritability
are given in the table.

                                                                  Estimate of
                                                                  heritability
Standard deviation: females                  3.54
Standard deviation: males                    3.03
Standard deviation: female/male              1.17
Regression coefficient: daughter-dam         0.269                0.54
Regression coefficient: son-dam              0.206
Regression coefficient: son-dam corrected    0.206 x 1.17 = 0.241 0.48
Joint estimate, as given in Example 10.1,                         0.51
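The intra-sire regression and the correction for unequal variances in the two sexes can be sketched as follows. The pooling shown here, in which the sums of squares and products are computed about the means of each sire group and then added together, is one common way of forming the weighted average and is an assumption, since the text does not specify the weights. The first set of records is hypothetical; the second calculation uses the figures of Example 10.5.

```python
def intra_sire_regression(groups):
    """Pooled within-sire regression of offspring on dam.

    groups holds one list per sire of (dam value, offspring mean) pairs.
    Deviations are taken from each sire group's own means, so differences
    between sires do not enter the regression.
    """
    sxy = 0.0
    sxx = 0.0
    for pairs in groups:
        n = len(pairs)
        mean_dam = sum(x for x, _ in pairs) / n
        mean_off = sum(y for _, y in pairs) / n
        sxy += sum((x - mean_dam) * (y - mean_off) for x, y in pairs)
        sxx += sum((x - mean_dam) ** 2 for x, _ in pairs)
    return sxy / sxx

def corrected_for_sex_variance(b_son_dam, sd_females, sd_males):
    """Adjust a son-dam regression for unequal variance in the sexes."""
    return b_son_dam * sd_females / sd_males

# Hypothetical records for two sires, three dams each
groups = [
    [(10.0, 20.0), (12.0, 20.5), (14.0, 21.0)],
    [(10.0, 25.0), (12.0, 25.6), (14.0, 26.2)],
]
b_pooled = intra_sire_regression(groups)
h2_hypothetical = 2 * b_pooled   # the regression estimates half the heritability

# Figures quoted in Example 10.5
b_son_dam = corrected_for_sex_variance(0.206, sd_females=3.54, sd_males=3.03)
h2_daughters = 2 * 0.269
h2_sons = 2 * b_son_dam
print(round(b_pooled, 3), round(b_son_dam, 3),
      round(h2_daughters, 2), round(h2_sons, 2))
```

Note that the large difference between the offspring means of the two hypothetical sires has no influence on the pooled coefficient.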

The Precision of Estimates of Heritability

It is of the greatest importance to know the precision of any esti-
mate of heritability. When an estimate has been obtained one wants
to be able to indicate its precision by the standard error. And when
an experiment aimed at estimating a heritability is being planned one
wants to choose the method and design the experiment so that the
estimate will have the greatest possible precision within the limita-
tions imposed by the scale of the experiment. The precision of an
estimate depends on its sampling variance, the lower the sampling
variance the greater the precision; and the standard error is the square
root of the sampling variance. Estimates of heritability are derived
from estimates of either a regression coefficient or an intra-class cor-
relation coefficient, and the sampling variances of these are given in
textbooks of statistics. We shall therefore present the necessary
formulae without explanation of their derivation. The information on
the design of experiments given here is derived from the paper by A.
Robertson (1959a) on this subject.

The problems of experimental design are, first, the choice of
method and, second, the decision of how many individuals in each
family are to be measured. Since the total number of individuals
measured cannot be increased indefinitely, an increase of the number
of individuals per family necessarily entails a reduction of the number
of families. The problem is therefore to find the best compromise
between large families and many families. In assessing the relative
efficiencies of different methods and designs we have to compare
experiments made on the same scale; that is to say, with the same
total expenditure in labour or cost. We must therefore decide first
what are the circumstances that limit the scale of the experiment. If
the labour of measurement is the limiting factor, as for example in
experiments with Drosophila, then the limitation is in the total
number of individuals measured, including the parents if they are
measured. If, on the other hand, breeding and rearing space is the
limiting factor, as it generally is with larger animals, the limitation
may be either in the number of families or in the total number of
offspring that can be produced for measurement, and measurements
of the parents may be included without additional cost. We cannot
here take account of all the possible ways in which the scale of the
experiment may be limited. Therefore for the sake of illustration we
shall consider only a limitation of the total number of individuals
measured. That is to say, we shall assume the total number of in-
dividuals measured to be the same for all methods and all experi-
mental designs. What we have to do, then, is to consider each method
on this basis and see what design and which method will give an
estimate of the heritability with the lowest sampling variance.

Offspring-parent regression. Consider first estimates based on
the regression of offspring on parents. Let X be the independent
variate, which may be either the value of a single parent or the
mid-parent value. Let Y be the dependent variate, which may be either a
single offspring of each parent or the mean of n offspring. Let $\sigma^2_X$
and $\sigma^2_Y$ be the variances of X and Y respectively; let b be the
regression of Y on X, and N the number of paired observations of X and Y,
which is equivalent to the number of families in the experiment.
Let T be the total number of individuals measured, which is fixed by
the scale of the experiment. The number of offspring measured is
nN, and the number of parents N or 2N according to whether the
regression is on one parent or on the mid-parent value. So, with one
parent measured, T = N(n + 1), and with both parents measured
T = N(n + 2). With these symbols, the variance of the estimate of the
regression coefficient is

$$\sigma^2_b = \frac{1}{N-2}\left[\frac{\sigma^2_Y}{\sigma^2_X} - b^2\right] \qquad (10.5)$$
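For a given set of paired observations the regression coefficient and its standard error can be computed directly. The sketch below uses the standard expression for the sampling variance of a regression coefficient, namely the ratio of the variance of Y to that of X, less the square of b, all divided by N - 2. The function name and the five pairs of values are hypothetical.

```python
def regression_with_se(x, y):
    """Regression of y on x, with the standard error of the coefficient."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    b = sxy / sxx
    # syy / sxx equals the ratio of the two variances, the divisors cancelling
    var_b = (syy / sxx - b * b) / (n - 2)
    return b, var_b ** 0.5

# Hypothetical parental values (x) and offspring means (y)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
b, se = regression_with_se(x, y)
print(f"b = {b:.3f} +/- {se:.3f}")
```

If x holds mid-parent values, b is itself the estimate of heritability and se its standard error; if x holds the values of one parent, both are to be doubled.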

For use as a guide to design this formula is more convenient
if put in a simplified and approximate form. The regression coeffi-
cient is usually small enough that b2 can be ignored; and we may sup-
pose that N is fairly large, so that the variance of the estimate may be
put, approximately, as

$$\sigma^2_b = \frac{1}{N}\,\frac{\sigma^2_Y}{\sigma^2_X} \quad \text{(approx.)} \qquad (10.6)$$

When only one parent is measured the variance of parental values is
equal to the phenotypic variance, i.e. $\sigma^2_X = V_P$. When both parents
are measured (provided they were not mated assortatively) the variance
of mid-parent values is half the phenotypic variance, i.e.
$\sigma^2_X = \tfrac{1}{2}V_P$. The variance of the offspring values, $\sigma^2_Y$, is the
variance of the means of families of n individuals. This depends on the
phenotypic correlation, t, between members of families, in a manner that
will be explained in Chapter 13 (see Table 13.2), where it will be
shown that

$$\sigma^2_Y = \frac{1 + (n-1)t}{n}\,V_P$$

Therefore by substitution for $\sigma^2_X$ and $\sigma^2_Y$ in equation 10.6 the sampling
variance of the regression on one parent becomes

$$\sigma^2_b = \frac{1 + (n-1)t}{nN} \quad \text{(approx.)} \qquad (10.7)$$

and that of the regression on mid-parent is twice as great. Since the
phenotypic correlation, t, depends on the heritability it will not
generally be known at the time an experiment is being planned.
Therefore the best design cannot be exactly determined in advance.
We can, however, get an approximate idea of how many offspring of
each parent should be measured. On the assumption already stated,
that the total number of individuals measured including the parents
is fixed, it can be shown that the sampling variance given in equation
10.7 is minimal when $n = \sqrt{(1 - t)/t}$ if one parent is measured and when
$n = \sqrt{2(1 - t)/t}$ if both parents are measured. Consider, for example, a
character with a heritability of 20 per cent and no variance due to
common environment, so that the phenotypic correlation in full-sib
families is t = 0.1. Then the optimal family size works out to be
n = 3 when only one parent is measured and n = 4 when both parents
are measured. If we had taken a higher heritability the optimal family
size would have been lower. Large families are advantageous only
for the estimation of very low heritabilities. For example, full-sib
families of about 10 or 14 would be optimal for estimating a herit-
ability of 2 per cent.
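The optimal family sizes quoted above can be verified numerically. In the sketch below the sampling variance of the regression coefficient is multiplied by T, so that designs with the same total number of individuals can be compared, and the best whole-number family size is found by direct search. The function names are hypothetical, and the search limit of 50 is an arbitrary choice.

```python
from math import sqrt

def optimal_family_size(t, parents_measured):
    """Family size minimising the sampling variance of b when T is fixed."""
    return sqrt(parents_measured * (1 - t) / t)

def relative_variance(n, t, parents_measured):
    """Sampling variance of b multiplied by T.

    N = T / (n + parents_measured); with both parents measured the variance
    of mid-parent values is V_P / 2, which doubles the variance of b.
    """
    doubling = parents_measured   # 1 for one parent, 2 for mid-parent
    return doubling * (1 + (n - 1) * t) * (n + parents_measured) / n

def best_integer_size(t, parents_measured, limit=50):
    """Whole-number family size with the smallest relative variance."""
    return min(range(1, limit),
               key=lambda n: relative_variance(n, t, parents_measured))

for t in (0.1, 0.01):             # full-sib correlations for h2 = 0.2 and 0.02
    for parents in (1, 2):
        print(t, parents,
              round(optimal_family_size(t, parents), 2),
              best_integer_size(t, parents))
```

The direct search and the square-root formula should agree to the nearest whole number in each of the four cases.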

So far we have considered only the sampling variance of the
regression coefficient, and how this can be reduced by the design of
the experiment. Now let us consider the sampling variance of the
estimate of heritability, so that we can compare methods, i.e. the use
of one parent or of mid-parent values. A just comparison can only
be made on the assumption of the optimal design for each method,
and therefore we can only illustrate the comparison by reference to a
particular case. We shall consider the particular case mentioned
above where the phenotypic correlation is t = 0.1, which would be
found in full-sib families when the heritability is 20 per cent. The
optimal family sizes are 3 or 4 as stated above. For the purpose of
comparison we have to express the sampling variance of the regression
coefficient given in equation 10.7 in terms of the total number of
individuals measured, T, since this is assumed to be the same for all
methods. We therefore substitute in equation 10.7 as follows. When
one parent is measured N = T/(n + 1), and n = 3. When both parents are
measured N = T/(n + 2), and n = 4. Substitution in equation 10.7 then
yields $\sigma^2_b = 4.8/(3T)$ when one parent is measured, and $\sigma^2_b = 3.9/T$ when both
are measured. The regression on one parent must be doubled to give
the estimate of heritability, but the regression on mid-parent is itself
the estimate. So the sampling variances of the estimates of herit-
ability, in the special case under consideration, are:

By regression on one parent: $\sigma^2_{h^2} = 4\sigma^2_b = 6.4/T$ (approx.)
By regression on mid-parent: $\sigma^2_{h^2} = \sigma^2_b = 3.9/T$ (approx.)

Thus the estimate based on mid-parent values has considerably less
sampling variance. A regression on mid-parent values, in general,
yields a more precise estimate of heritability for a given total number
of individuals measured.
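The two sampling variances just compared can be reproduced step by step. The following sketch returns the sampling variance of the estimate of heritability multiplied by T, for a given family size n and phenotypic correlation t; the function name is hypothetical.

```python
def var_h2_times_T(n, t, both_parents):
    """Sampling variance of the heritability estimate, multiplied by T."""
    parents = 2 if both_parents else 1
    var_b_times_N = (1 + (n - 1) * t) / n            # regression on one parent
    if both_parents:
        var_b_times_N *= 2                           # mid-parent variance is V_P / 2
    var_b_times_T = var_b_times_N * (n + parents)    # since N = T / (n + parents)
    scale = 1 if both_parents else 4                 # h2 = b, or h2 = 2b
    return scale * var_b_times_T

one_parent = var_h2_times_T(n=3, t=0.1, both_parents=False)
mid_parent = var_h2_times_T(n=4, t=0.1, both_parents=True)
print(round(one_parent, 2), round(mid_parent, 2))
```

Dividing either figure by the total number of individuals measured gives the approximate sampling variance, and the square root of that the standard error.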

Sib analyses. Now let us consider estimates obtained from the
intra-class correlation of full-sib or half-sib families. We shall at
first suppose for simplicity that half-sib families are not subdivided
into full-sib families; i.e. that only one offspring from each dam is
measured in paternal half-sib families. In the case of full-sib families
we shall assume that there is no variance due to common environ-
ment so that the estimate of heritability is a valid one. Let N be the
number of families, and n the number of individuals per family, so
that the total number of individuals measured is T=nN. Let the
intra-class correlation be t. The sampling variance of the intra-class
correlation is then

$$\sigma^2_t = \frac{2[1 + (n-1)t]^2 (1-t)^2}{n(n-1)(N-1)} \qquad (10.8)$$

When the value of T=nN is limited by the size of the experiment it
can be shown that the sampling variance of the intra-class correlation
is minimal when n = 1/t, approximately. Therefore the optimal family
size depends on the heritability. In the case of full-sib families
$h^2 = 2t$, and in the case of half-sib families, $h^2 = 4t$. So the most
efficient design has the following family sizes:

With full-sib families: $n = 2/h^2$

With half-sib families: $n = 4/h^2$

Since prior knowledge of the heritability will be at the best only
approximate, the optimal family size cannot be exactly determined
before-hand. The loss of efficiency, however, is much greater if the
family size is below the optimum than if it is above. It is therefore
better to err on the side of having too large families. A. Robertson
(1959a) shows that, in the absence of prior knowledge of the herita-
bility, half-sib analyses should generally be designed with families of
between 20 and 30.

If the experiment has the most efficient design, with n = 1/t, then
the sampling variance of the intra-class correlation is approximately

$$\sigma^2_t = \frac{8t}{T} \qquad (10.9)$$

Therefore under optimal design the sampling variances of the
estimates of heritability are as follows:

From full-sib families: $\sigma^2_{h^2} = 4\sigma^2_t = 16h^2/T$ (approx.)

From half-sib families: $\sigma^2_{h^2} = 16\sigma^2_t = 32h^2/T$ (approx.)

Thus, other things being equal, an estimate from full-sib families is
twice as precise as one from half-sib families.
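The optimal family size for a sib analysis, and the closeness of the approximation in equation 10.9, can be examined numerically. In the sketch below the total T = 1000 is an arbitrary illustrative figure, the number of families N = T/n is allowed to be fractional, and the function names are hypothetical.

```python
def var_intraclass(n, N, t):
    """Sampling variance of the intra-class correlation (equation 10.8)."""
    return 2 * (1 + (n - 1) * t) ** 2 * (1 - t) ** 2 / (n * (n - 1) * (N - 1))

def best_family_size(T, t):
    """Whole-number family size minimising equation 10.8 with T = nN fixed."""
    return min(range(2, T // 2 + 1),
               key=lambda n: var_intraclass(n, T / n, t))

T = 1000
h2 = 0.2

t_half = h2 / 4                      # half-sib correlation
n_half = best_family_size(T, t_half)
t_full = h2 / 2                      # full-sib correlation, no common environment
n_full = best_family_size(T, t_full)

# Compare the exact expression at n = 1/t with the approximation 8t/T
exact = var_intraclass(20, T / 20, t_half)
approx = 8 * t_half / T
print(n_half, n_full, exact, approx)
```

The search should return family sizes within one of 1/t in both cases, and the exact variance at n = 1/t lies within about a tenth of the approximate value 8t/T.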

At this point let us compare the precision of estimates from sib
analyses with those from offspring-parent regressions, assuming
optimal design in each case. Again we have to choose a specific case
for illustration of the comparison. Let us for simplicity suppose as we
did before that the heritability to be estimated is 20 per cent. And,
though perhaps not very representative of situations likely to arise in
practice, let us compare an estimate obtained from a half-sib analysis
with one obtained from the regression of offspring on one parent
when the offspring consist of full-sib families. The variance of the
estimate of heritability from the half-sib analysis would then be 6.4/T
by substitution in the formula given above, and from the regression of
offspring on one parent it would also be 6.4/T, as we found previously.
In this case, therefore, these two methods would give equally precise
estimates for a given total number of individuals measured. If we had
considered a higher heritability, then the regression method would
have had the lower sampling variance. The comparison we have made,
though referring to a particular case, illustrates the general conclusion,
which is that the regression method is preferable for estimating
moderately high heritabilities and the sib correlation method is
preferable for low heritabilities, the critical heritability being, very
roughly, about 20 per cent when the comparison is made on the basis
of an equal total number of individuals measured.
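The comparison can be extended to other heritabilities. The sketch below sets the two approximate sampling variances side by side, each multiplied by T, on the assumptions used in the text: the offspring in the regression method are full-sib families with no variance due to common environment, and both methods have their optimal design. The family size is treated as a continuous quantity, and the function names are hypothetical.

```python
from math import sqrt

def var_h2_regression_one_parent(h2):
    """Variance of the h2 estimate times T, regression on one parent."""
    t = h2 / 2                       # full-sib correlation
    n = sqrt((1 - t) / t)            # optimal offspring number per parent
    var_b_times_T = (1 + (n - 1) * t) * (n + 1) / n
    return 4 * var_b_times_T         # h2 = 2b

def var_h2_half_sib(h2):
    """Variance of the h2 estimate times T, half-sib analysis."""
    return 32 * h2

for h2 in (0.05, 0.1, 0.2, 0.4, 0.6):
    print(h2,
          round(var_h2_regression_one_parent(h2), 2),
          round(var_h2_half_sib(h2), 2))
```

The two quantities are equal at a heritability of 20 per cent; below that the half-sib analysis has the smaller variance, and above it the regression.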

Finally let us consider briefly a sib analysis where the half-sib
families are subdivided into full-sib families. The situation is then
more complicated, and for details the reader should consult the papers
of Osborne and Paterson (1952) and A. Robertson (1959 a). The
conclusions are as follows. In many cases the estimation of
heritability will be based only on the between-sire component, i.e. the
half-sib correlation. This will arise when common environment
renders the full-sib correlation unsuitable. The most efficient design
then has only one offspring per dam, and is exactly the same as the
half-sib analysis discussed above. If there is no common environ-
ment and it is desired to estimate the correlations from sire and from
dam components with equal precision, then the optimal design has
3 or 4 dams per sire with the number of offspring per dam equal to
$2/h^2$. In the absence of prior knowledge of the heritability the analysis
should be planned with 3 or 4 dams per sire, and 10 offspring per
dam.

Identical Twins

Identical twins seem at first sight to provide, for man and cattle, a
means of estimating the genotypic variance. They provide individuals
of identical genotype, just as inbred lines, or crosses between lines, do
for laboratory animals or for plants. The phenotypic variance within
pairs of identical twins should, therefore, estimate the environmental
variance and so allow the partition of the phenotypic variance into
genotypic and environmental components to be made. (This would
not estimate the heritability, but the use of identical twins seems
nevertheless most appropriately discussed at this point.) Many
studies of human twins have been made, and have shown the mem-
bers of the pairs to be extremely alike in most characters, even when
reared apart from childhood (see Stern, 1949, Ch. 23, for review and
references). Studies of cattle twins, though on a much smaller scale,
show the same thing (see Hancock, 1954; Brumby, 1958). Taken at
their face value these studies seem to indicate a very high degree of
genetic determination — up to 90 per cent or even more — for many
characters. The use of identical twins in this way is, however, vitiated
by the additional similarity due to common environment. Twins
share a common environment from conception to birth, and over the
period during which they are reared together, so that the within-pair
variance contains only a part, and perhaps only a small part, of the
total environmental variance. This difficulty may be partially over-
come by the comparison of identical with fraternal twins. Fraternal
twins are full sibs which share a common environment to approxi-
mately the same extent as identical twins. Let us therefore consider
how the causal components of variance contribute to the observa-
tional components between pairs and within pairs for the two sorts of
twins. The compositions of the observational components are given
in Table 10.5, the between-pair component being the phenotypic
covariance. The environmental components are shown as being the
same for fraternal as for identical twins. This is not necessarily true,
but one can proceed only on the assumption that it is.

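Table 10.5

Composition of the components of variance between and
within pairs of twins.

              Between pairs                                     Within pairs
Identicals    $V_A + V_D + V_{Ec}$                              $V_{Ew}$
Fraternals    $\tfrac{1}{2}V_A + \tfrac{1}{4}V_D + V_{Ec}$      $\tfrac{1}{2}V_A + \tfrac{3}{4}V_D + V_{Ew}$
Difference    $\tfrac{1}{2}V_A + \tfrac{3}{4}V_D$               $\tfrac{1}{2}V_A + \tfrac{3}{4}V_D$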

The contributions of the interaction variance, which for simplicity
are omitted, can be added from Table 9.3 (p. 157). If the environmental
components are the same for the two sorts of twins, then the differ-
ence between identicals and fraternals in either of the two components
estimates half the additive variance together with three-quarters of
the dominance variance (and more than three-quarters of the inter-
action variance). To take the partitioning further it is necessary to
have an estimate of the additive variance, reliably free from admixture
with variance due to common environment. By subtraction of half
the additive variance we may then obtain an estimate of three-quarters
of the dominance variance together with more than three-quarters of
the interaction variance. This would give at least an approximate idea
of the amount of non-additive genetic variance. There is, however, a
difficulty with cattle in comparisons between identical and fraternal
twins, connected again with the environmental components of
variance. Vascular anastomoses frequently occur in the placentae of
both sorts of twins, so that the blood of the two twins is mixed. This
will not make identicals any more alike, but it may make fraternals
more alike than they would otherwise be.

Some results of twin-studies are quoted in Table 10.6, in order to
illustrate the degree of resemblance between identical and between
fraternal twins in both man and cattle. The difference between the
correlation coefficients of identicals and fraternals, given in the
right-hand column, could be taken as an estimate of half the heritability if
there were no non-additive genetic variance and if there were no
complications arising from a common circulation. But since non-
additive variance cannot reasonably be assumed to be absent, the
difference can only be regarded as setting an upper limit to half the
heritability. The vascular anastomoses in cattle twins may, however,
render the estimates of the heritability, or of its upper limit, too low.

Table 10.6

Resemblance between Twins

                                              Correlation coefficients
Character                       Reference  Identicals  Fraternals  Difference

Man
  Height                           (1)        0.93        0.64        0.29
  Weight                           (1)        0.92        0.63        0.29
  Intelligence                     (1)        0.88        0.63        0.25
  Birth weight                     (2)        0.67        0.58        0.09

Cattle
  Milk-yield, 1st lactation        (3)        0.91        0.65        0.26
  Butterfat-yield, 1st lactation   (3)        0.90        0.51        0.39
  Fat % in milk, 1st lactation     (3)        0.95        0.86        0.09
  Weight at 96 weeks               (3)        0.83        0.78        0.05
  Body length at 96 weeks          (3)        0.75        0.62        0.13

(1) Newman, Freeman, and Holzinger (1937). Based on 50 pairs of
identicals and 50 pairs of fraternals, corrected for age differences.

(2) Quoted from Robson (1955).

(3) Brumby and Hancock (1956). Based on 10 pairs of identicals and 11
pairs of fraternals.
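
As a rough illustration of the reasoning above, the following Python sketch doubles the difference between the identical and fraternal correlations of Table 10.6 to give the upper limit to the heritability. The dictionary and function names are illustrative only, and the result carries all the reservations stated in the text (equal environmental components for the two sorts of twins, and no effect of a common circulation).

```python
# Correlation coefficients from Table 10.6: (identicals, fraternals).
TWIN_CORRELATIONS = {
    "Man: height": (0.93, 0.64),
    "Man: weight": (0.92, 0.63),
    "Man: intelligence": (0.88, 0.63),
    "Man: birth weight": (0.67, 0.58),
    "Cattle: milk-yield, 1st lactation": (0.91, 0.65),
    "Cattle: butterfat-yield, 1st lactation": (0.90, 0.51),
    "Cattle: fat % in milk, 1st lactation": (0.95, 0.86),
    "Cattle: weight at 96 weeks": (0.83, 0.78),
    "Cattle: body length at 96 weeks": (0.75, 0.62),
}


def heritability_upper_limit(r_identical, r_fraternal):
    """Twice the difference of the two correlations.

    The difference itself sets an upper limit to half the heritability,
    so doubling it gives an upper limit to the heritability.
    """
    return 2.0 * (r_identical - r_fraternal)


for character, (r_i, r_f) in TWIN_CORRELATIONS.items():
    limit = heritability_upper_limit(r_i, r_f)
    print(f"{character:40s} upper limit of heritability = {limit:.2f}")
```

The small differences found for some of the cattle characters give very low limits, which is the direction of bias to be expected if vascular anastomoses make fraternal twins more alike.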

CHAPTER 11

SELECTION:
I. The Response and its Prediction

Up to this point in our treatment of metric characters we have been
concerned with the description of the genetic properties of a popula-
tion as it exists under random mating, with no influences tending to
change its properties; now we have to consider the changes brought
about by the action of breeder or experimenter. There are two ways,
as we noted in Chapter 6, in which the action of the breeder can change
the genetic properties of the population; the first by the choice of
individuals to be used as parents, which constitutes selection, and the
second by control of the way in which the parents are mated, which
embraces inbreeding and cross breeding. We shall consider selection
first, and in doing so we shall ignore the effects of inbreeding, even
though we cannot realistically suppose that we are always dealing
with a population large enough for its effects to be negligible.

The basic effect of selection is to change the array of gene fre-
quencies in the manner described in Chapter 2. The changes of gene
frequency themselves, however, are now almost completely hidden
from us because we cannot deal with the individual loci concerned
with a metric character. We therefore have to describe the effects of
selection in a different manner, in terms of the observable properties
— means, variances and covariances — though without losing sight of
the fact that the underlying cause of the changes we describe is the
change of gene frequencies. Before we come to details let us consider
the change of gene frequencies a little further in general terms.

To describe the change of the genetic properties from one genera-
tion to the next we have to compare successive generations at the same
point in the life cycle of the individuals, and this point is fixed by the
age at which the character under study is measured. Most often the
character is measured at about the age of sexual maturity or on the
young adult individuals. The selection of parents is made after the
measurements, and the gene frequencies among these selected in-
dividuals are different from what they were in the whole population
before selection. If there are no differences of fertility among the
selected individuals or of viability among their progeny, then the gene
frequencies are the same in the offspring generation as in the selected
parents. Thus artificial selection — that is, selection resulting from
the action of the breeder in the choice of parents — produces its change
of gene frequency by separating the adult individuals of the parent
generation into two groups, the selected and the discarded, that differ
in gene frequencies. Natural selection, operating through differences
of fertility among the parent individuals or of viability among their
progeny, may cause further changes of gene frequency between the
parent individuals and the individuals on which measurements are
made in the offspring generation. Thus there are three stages at
which a change of gene frequency may result from selection: the first
through artificial selection among the adults of the parent generation;
the second through natural differences of fertility, also among the
adults of the parent generation; and the third through natural differ-
ences of viability among the individuals of the offspring generation.
Though natural differences of fertility and viability are always present
they are not necessarily always relevant, because they are not neces-
sarily connected with the genes concerned with the metric character.

Response to Selection

The change produced by selection that chiefly interests us is the
change of the population mean. This is the response to selection,
which we shall symbolise by R; it is the difference of mean phenotypic
value between the offspring of the selected parents and the whole of
the parental generation before selection. The measure of the selec-
tion applied is the average superiority of the selected parents, which
is called the selection differential, and will be symbolised by S. It is
the mean phenotypic value of the individuals selected as parents
expressed as a deviation from the population mean, that is from the
mean phenotypic value of all the individuals in the parental genera-
tion before selection was made. To deduce the connexion between
response and selection differential let us imagine two successive
generations of a population mating at random, as represented dia-
grammatically in Fig. 11.1. Each point represents a pair of parents
and their progeny, and is positioned according to the mid-parent
value measured along the horizontal axis and the mean value of the
progeny measured along the vertical axis. The origin represents the
population mean, which is assumed to be the same in both generations.
The sloping line is the regression line of offspring on mid-parent.
(A diagram of this sort, plotted from actual data, was given in Fig.
10.1.) Now let us regard a group of individuals in the parental
generation as having been selected — say those with the highest
values. These pairs of parents and their offspring are indicated by
solid dots in the figure.

Fig. 11.1. Diagrammatic representation of the mean values of
progeny plotted against the mid-parent values, to illustrate the
response to selection, as explained in the text.

The parents have been selected on the basis
of their own phenotypic values, without regard to the values of their
progeny or of any other relatives. (This chapter deals exclusively
with selection made in this way: other methods will be described in
Chapter 13.) Let S be the mean phenotypic value of these selected
parents, expressed as a deviation from the population mean. And
similarly let R be the mean deviation of their offspring from the
population mean. Then S is the selection differential and R is the
response. The point marked by the cross represents the mean value
of the selected parents and of their progeny, and it lies on the regres-
sion line. The regression coefficient of offspring on parents is thus
equal to R/S. Therefore the connexion between response and selection
differential is

$R = b_{OP} S$          (11.1)


We saw in the last chapter that the regression of offspring on mid-
parent is equal to the heritability, provided there is no non-genetic
cause of resemblance between offspring and parents. To this we must
add the further condition that there should be no natural selection:
that is to say, that fertility and viability are not correlated with the
phenotypic value of the character under study. Provided these
conditions hold, therefore, the ratio of response to selection differ-
ential is equal to the heritability, and the response is given by

$R = h^2 S$          (11.2)

The connexion between the response and the selection differen-
tial, expressed in equation 11.2, follows directly from the meaning of
the heritability. We noted in the last chapter (equation 10.2) that the
heritability is equivalent to the regression of an individual's breeding
value on its phenotypic value. The deviation of the progeny from
the population mean is, by definition, the breeding value of the
parents, and so the response is equivalent to the breeding value of the
parents. Thus it follows that the expected value of the progeny is
given by $R = h^2 S$.

There is one point at which the situation envisaged in deducing
the equations of response does not coincide with what is actually
done in selection. We supposed the individuals of the parent genera-
tion to have mated at random and the selection to have been applied
subsequently. In practice, however, the selection is usually made
before mating, on the basis of the individuals' values and not the
mid-parent values. The effect of this is that the individuals, when
regarded as part of the whole parental population, have been mated
assortatively. Assortative mating, however, has very little effect on
the offspring-parent regression, as we noted in the last chapter, and
this feature of selection procedure can therefore be disregarded.

Prediction of response. The chief use of these equations of
response is for predicting the response to selection. Let us consider a
little further the nature of the prediction that can be made. First, it
is clear that equation 11.1 is not a prediction but simply a description,
because the regression of offspring on parent cannot be measured
until the offspring generation has been reared. We could, however,
measure the regression, $b_{OP}$, in a previous generation, and then use
the equation $R = b_{OP} S$ to predict the response to selection. There is
no genetics involved in this; it is simply an extrapolation of direct
observation, and the only conditions on which it depends are the
absence of environmental change and the absence of genetic change
between the generations from which the regression was estimated and
the generation to which selection is applied. The equation $R = h^2 S$,
however, provides a means of prediction based on observations made
only on the individuals of the parent generation before selection. Its
validity rests on obtaining a reliable estimate of $h^2$ from the resem-
blance between relatives, such as half sibs; and on the truth of the
identity $b_{OP} = h^2$.
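
The first kind of prediction, by extrapolation from a regression measured in an earlier generation, can be sketched in a few lines of Python. The figures below are invented purely for illustration (they are not data from any experiment quoted here), and the function name is an assumption of this sketch.

```python
def regression_slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    return sum_xy / sum_xx


# Hypothetical records from a previous generation: one entry per family.
mid_parent = [18.0, 19.5, 20.0, 21.0, 22.5, 23.0]    # mid-parent values
progeny_mean = [19.4, 19.9, 20.6, 20.8, 21.9, 21.8]  # means of their progeny

b_op = regression_slope(mid_parent, progeny_mean)

# Hypothetical selection differential applied in the present generation.
S = 2.0
predicted_R = b_op * S  # equation 11.1 used as a prediction

print(f"regression of offspring on mid-parent = {b_op:.2f}")
print(f"predicted response = {predicted_R:.2f}")
```

Such a prediction rests only on the two conditions named in the text: no environmental change and no genetic change between the generation that supplied the regression and the generation selected.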

Example 11.1. The selection for abdominal bristle number in Droso-
phila melanogaster, by Clayton, Morris, and Robertson (1957), will provide
an illustration of the prediction of the response, and will serve also to
indicate the extent of the agreement between observation and prediction.
(The data for this example were kindly supplied by Dr G. A. Clayton.)
The heritability of bristle number was first estimated from the base
population before selection, and the value found was 0.52, as stated in
Example 10.1. Five samples of 100 males and 100 females were taken from
the base population, and selection for high and for low bristle number was
made in each of the five samples, the 20 most extreme individuals of each
sex being selected as parents. The mean deviations of these selected indi-
viduals from the mean of the sample out of which they were selected are
given in the table in the columns headed S, the negative signs under down-
ward selection being omitted. These are the selection differentials. The
expected responses are obtained by multiplying the selection differentials
by the heritability, according to equation 11.2.

              Upward selection            Downward selection
                      Response                    Response
Line       S       Exp.    Obs.        S       Exp.    Obs.

1         5.29     2.75    2.60       4.63     2.41    2.44
2         5.12     2.66    2.23       4.58     2.38    2.29
3         4.44     2.31    2.43       4.36     2.27    0.67
4         4.32     2.25    3.12       5.60     2.91    1.13
5         4.88     2.54    2.68       4.12     2.14    2.68

Mean      4.81     2.50    2.61       4.66     2.42    1.84

The observed responses
are the differences between the progeny means and the sample means out
of which the parents were selected. The expected and observed responses
are also given in the table, negative signs being again omitted. Comparison
of the observed with the expected responses shows that on the whole there
is fairly good agreement, though in some lines — particularly lines 3 and 4
selected downward — there are quite serious discrepancies. These dis-
crepancies, which are typical of selection experiments, illustrate the fact that
a single generation of selection in only one line cannot be relied on to
follow the prediction at all closely.
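
The expected responses of Example 11.1 can be recomputed from the selection differentials with a short Python sketch. The numbers are those of the table in the example; the variable and function names are illustrative only.

```python
H2 = 0.52  # heritability estimated in the base population (Example 10.1)

# (selection differential S, observed response) for the five lines.
UP = [(5.29, 2.60), (5.12, 2.23), (4.44, 2.43), (4.32, 3.12), (4.88, 2.68)]
DOWN = [(4.63, 2.44), (4.58, 2.29), (4.36, 0.67), (5.60, 1.13), (4.12, 2.68)]


def expected_response(h2, s):
    """Equation 11.2: R = h^2 * S."""
    return h2 * s


for direction, lines in (("up", UP), ("down", DOWN)):
    for number, (s, observed) in enumerate(lines, start=1):
        expected = expected_response(H2, s)
        print(f"{direction:4s} line {number}: S = {s:.2f}  "
              f"expected = {expected:.2f}  observed = {observed:.2f}")
```

Printing the expected beside the observed response for each line makes the large discrepancies in lines 3 and 4 selected downward easy to pick out.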

The prediction of response is valid, in principle, for only one
generation of selection. The response depends on the heritability of
the character in the generation from which the parents are selected.
The basic effect of the selection is to change the gene frequencies, so
the genetic properties of the offspring generation, in particular the
heritability, are not the same as in the parent generation. Since the
changes of gene frequency are unknown we cannot strictly speaking
predict the response to a second generation of selection without re-
determining the heritability. Experiments have shown, however,
that the response is usually maintained with little change over several
generations — up to five, ten, or even more. This will be seen in the
graphs of responses to selection given later in this chapter and in the
next. In practice, therefore, the prediction may be expected to hold
good over several generations. The effects of selection over longer
periods, and also its effects on properties other than the mean, will be
discussed in a later section.

The selection differential. We have seen that the change of the
population mean brought about by selection — i.e. the response —
depends on the heritability of the character and on the amount of
selection applied as measured by the selection differential. The
selection differential will not be known, however, until the selection
among the parental generation has actually been made. So the equa-
tions of response in the form given above are only of limited useful-
ness for predicting the response. To be able to predict further ahead
we need to know what determines the magnitude of the selection
differential. Consideration of the factors that influence the selection
differential will also enable us to see more clearly the means by which
the breeder may improve the response to selection.

The magnitude of the selection differential depends on two fac-
tors: the proportion of the population included among the selected
group, and the phenotypic standard deviation of the character. The
dependence of the selection differential on these two factors is illus-
trated diagrammatically in Fig. 11.2. The graphs show the distribu-
tion of phenotypic values, which is assumed to be normal. The
individuals with the highest values are supposed to be selected, so
that the distribution is sharply divided at a point of truncation, all
individuals above this value being selected and all below rejected.
The arrow in each figure marks the mean value of the selected group,
and S is the selection differential. In graph (a) half the population is
selected, and the selection differential is rather small: in graph (b)
only 20 per cent of the population is selected, and the selection differ-
ential is much larger. In graph (c) 20 per cent is again selected, but
the character represented is less variable and the selection differential
is consequently smaller. The standard deviation in (c) is half as great
as in (b) and the selection differential is also half as great.

Fig. 11.2. Diagrams to show how the selection differential, S,
depends on the proportion of the population selected, and on the
variability of the character. All the individuals in the stippled
areas, beyond the points of truncation, are selected. The axes are
marked in hypothetical units of measurement.

(a) 50% selected; standard deviation 2 units: S = 1.6 units

(b) 20% selected; standard deviation 2 units: S = 2.8 units

(c) 20% selected; standard deviation 1 unit: S = 1.4 units
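
The dependence of the selection differential on these two factors can also be seen by direct simulation. The Python sketch below draws a large number of normally distributed phenotypic values, keeps the highest fraction, and takes the mean of the selected group as a deviation from the mean of all. The sample size, the seed and the function name are arbitrary choices made for this illustration.

```python
import random


def simulated_selection_differential(proportion, sd, n=100_000, seed=1):
    """Selection differential under truncation selection, by simulation.

    Draws n normal phenotypic values with standard deviation sd, selects
    the highest `proportion` of them, and returns the mean of the selected
    group expressed as a deviation from the mean of all n values.
    """
    rng = random.Random(seed)
    values = sorted(rng.gauss(0.0, sd) for _ in range(n))
    n_selected = max(1, round(proportion * n))
    selected = values[-n_selected:]  # everything beyond the truncation point
    return sum(selected) / n_selected - sum(values) / n


# The three cases described in the text: (a), (b) and (c).
for label, proportion, sd in (("a", 0.50, 2.0), ("b", 0.20, 2.0), ("c", 0.20, 1.0)):
    s = simulated_selection_differential(proportion, sd)
    print(f"({label}) {proportion:.0%} selected, s.d. {sd:g} units: S = {s:.2f} units")
```

With a sample of this size the simulated values should fall close to 1.6, 2.8 and 1.4 units, though they will vary a little with the seed.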

The standard deviation, which measures the variability, is a
property of the character and the population, and it sets the units in
which the response is expressed — i.e. so many pounds, millimetres,
bristles, etc. The response to selection may be generalised if both
response and selection differential are expressed in terms of the
phenotypic standard deviation, $\sigma_P$. Then $R/\sigma_P$ is a generalised mea-
sure of the response, by means of which we can compare different
characters and different populations; and $S/\sigma_P$ is a generalised measure
of the selection differential, by means of which we can compare
different methods or procedures for carrying out the selection. The
"standardised" selection differential, $S/\sigma_P$, will be called the intensity
of selection, symbolised by i. The equation of response (11.2) then
becomes

$R/\sigma_P = (S/\sigma_P)\,h^2$

or

$R = i\,\sigma_P h^2$          (11.3)

By noting that $h = \sigma_A/\sigma_P$, where $\sigma_A$ is the standard deviation of breed-
ing values (square root of the additive genetic variance), we may write
this equation in the form

$R = i\,h\,\sigma_A$          (11.4)

which is sometimes used in comparisons of different methods of
selection.

Fig. 11.3. Intensity of selection in relation to proportion selected.
The intensity of selection is the mean deviation of the selected
individuals, in units of phenotypic standard deviations. The upper
graph refers to selection out of a large total number of individuals
measured: the lower two graphs refer to selection out of totals of 20
and 10 individuals respectively. (Horizontal axis: proportion selected, %.)

The intensity of selection, i, depends only on the proportion of
the population included in the selected group, and, provided the
distribution of phenotypic values is normal, it can be determined
from tables of the properties of the normal distribution. If p is the
proportion selected — i.e. the proportion of the population falling
beyond the point of truncation — and z is the height of the ordinate at
the point of truncation, then it follows from the mathematical
properties of the normal distribution that

$S/\sigma_P = i = z/p$          (11.5)

Thus, given only the proportion selected, p, we can find out by how
many standard deviations the mean of the selected individuals will
exceed the mean of the population before selection: that is to say, the
intensity of selection, i. The graphs in Fig. 11.3 show the relation-
ship between i and p; the value of i for any given value of p can be
read from the graphs with sufficient accuracy for most purposes. The
relationship between i and p given in equation 11.5 applies, strictly
speaking, only to a large sample: that is to say, when a large number of
individuals have been measured, among which the selection is to be
made. When selection is made out of a small number of measured
individuals, the mean deviation of the selected group is a little less.
The intensity of selection can be found from tables of deviations of
ranked data (Table XX of Fisher and Yates, 1943). The two lower
curves in Fig. 11.3 show the intensity of selection for samples of 10
and 20. Selection intensities for samples smaller than 10 are given
in Table 11.1.

Table 11.1

Intensities of selection when selection is made out of a small
number of individuals measured. The figures in the table
are values of $i = S/\sigma_P$ = mean deviation in standard measure.

Number                       Size of sample
selected     9      8      7      6      5      4      3      2

   1        1.49   1.42   1.35   1.27   1.16   1.03   0.85   0.56
   2        1.21   1.14   1.06   0.96   0.83   0.67   0.42    —
   3        1.00   0.91   0.82   0.70   0.55   0.34    —
   4        0.82   0.72   0.62   0.48   0.29    —
   5        0.66   0.55   0.42   0.25    —
   6        0.50   0.38   0.23    —
   7        0.35   0.20
   8        0.19
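
Both ways of finding the intensity of selection can be checked numerically. The Python sketch below computes i = z/p for a large sample from the standard normal distribution, and, for a small sample, averages the expected values of the highest ranked normal deviates. The expected values are obtained here by a simple trapezoidal integration, which is a choice made for this sketch and not the method used to construct the published tables; the function names are likewise illustrative.

```python
from math import comb
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def intensity_large_sample(p):
    """Equation 11.5: i = z / p, for a proportion selected 0 < p < 1."""
    truncation_point = STD_NORMAL.inv_cdf(1.0 - p)
    z = STD_NORMAL.pdf(truncation_point)  # height of the ordinate
    return z / p


def expected_order_statistic(r, n, half_width=8.0, steps=16_000):
    """Expected value of the r-th smallest of n standard normal deviates.

    Integrates x * f(x) over [-half_width, half_width] by the trapezoidal
    rule, where f is the density of the r-th order statistic.
    """
    def integrand(x):
        cdf = STD_NORMAL.cdf(x)
        return x * cdf ** (r - 1) * (1.0 - cdf) ** (n - r) * STD_NORMAL.pdf(x)

    h = 2.0 * half_width / steps
    total = 0.5 * (integrand(-half_width) + integrand(half_width))
    total += sum(integrand(-half_width + k * h) for k in range(1, steps))
    return r * comb(n, r) * h * total


def intensity_small_sample(n_selected, n_measured):
    """Mean of the n_selected highest of n_measured ranked normal deviates."""
    top_ranks = range(n_measured - n_selected + 1, n_measured + 1)
    total = sum(expected_order_statistic(r, n_measured) for r in top_ranks)
    return total / n_selected


print(f"large sample, p = 0.20: i = {intensity_large_sample(0.20):.2f}")
print(f"2 selected out of 9:    i = {intensity_small_sample(2, 9):.2f}")
```

For the same proportion selected, the small-sample figure is somewhat lower than the large-sample one, in agreement with the text.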


Example 11.2. A comparison of the expected and observed responses
under different intensities of selection was made by Clayton, Morris, and
Robertson (1957), studying abdominal bristle number in Drosophila. The
heritability was first determined by three methods which yielded a com-
bined estimate of 0.52 (see Example 10.1). The standard deviation of
bristle number (average of the two sexes) was 3.35. Selection at four
different intensities was carried on for five generations, both upward and
downward (i.e. both for increased and for decreased bristle number). In
each case 20 males and 20 females were selected as parents, the intensity
being varied by the number out of which these were selected, as shown in
the first column of the table. The intensities of selection corresponding to
these proportions selected may be read off the graphs in Fig. 11.3. They
are given in the second column of the table.

                                      Mean response per generation
Proportion        Intensity of                       Observed
selected, p       selection, i      Expected       Up       Down

20/100 = 0.20        1.40             2.44         2.02     1.48
20/75 = 0.267        1.23             2.14         2.20     1.26
20/50 = 0.40         0.97             1.65         1.46     0.79
20/25 = 0.80         0.34             0.59         0.28    -0.08

The expected responses are
then found from equation 11.3. Under the most intense selection, for
example, it is $R = 1.4 \times 3.35 \times 0.52 = 2.44$. There were five replicate lines
in both directions under the most intense selection, and three replicates
under the other intensities. The observed responses are quoted in the last
two columns of the table. Although they do not agree very precisely
with expectation, they show how the change made by selection falls off as
the intensity of selection is reduced, and the data serve to illustrate the
computation of the expected response.
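
The computation of the expected response in Example 11.2 can be written out as a short Python sketch. It evaluates equation 11.3 and, as a check, the equivalent form 11.4; the constants are those quoted in the example and the function names are illustrative only.

```python
from math import sqrt

H2 = 0.52       # heritability (Example 10.1)
SIGMA_P = 3.35  # phenotypic standard deviation of bristle number


def response_eq_11_3(i, sigma_p, h2):
    """Equation 11.3: R = i * sigma_P * h^2."""
    return i * sigma_p * h2


def response_eq_11_4(i, sigma_p, h2):
    """Equation 11.4: R = i * h * sigma_A, where sigma_A = h * sigma_P."""
    h = sqrt(h2)
    sigma_a = h * sigma_p  # standard deviation of breeding values
    return i * h * sigma_a


# Most intense selection: 20 selected out of 100, intensity 1.40.
expected = response_eq_11_3(1.40, SIGMA_P, H2)
print(f"expected response per generation = {expected:.2f}")
```

The two forms are algebraically identical, so they differ only by rounding error in the arithmetic.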

It will now be clear that there are two methods open to the breeder
for improving the rate of response to selection: one by increasing the
heritability and the other by reducing the proportion selected and so
increasing the intensity of selection. The heritability can be increased
only by reducing the environmental variation through attention to the
technique of rearing and management. Reducing the proportion
selected seems at first sight to be a straightforward means of improv-
ing the response, but there are several factors to be considered which
set a limit to what the breeder can do in this way. First is the matter
of population size and inbreeding. This sets a lower limit to the
number of individuals to be used as parents. In experimental work,
for example, one might decide to use not less than 10 or even 20 pairs
of parents; and in livestock improvement, particularly if artificial
insemination came into general use as a means of intense selection on
males, care would have to be taken not to restrict the number of
males too much. For this reason the intensity of selection can be
increased above a certain point only by increasing the total number of
individuals measured, out of which the selection is made. With
organisms that have a high reproductive rate, such as Drosophila and
plants, very large numbers can, in principle, be measured; but in
practice a limit is set to the intensity of selection by the time and
labour required for the measurement. With organisms that have a
low reproductive rate the limit to the intensity of selection is set by
the reproductive rate, since the proportion saved can never be less
than the proportion needed for replacement; that is to say, two
individuals are needed on the average to replace each pair of parents.
Usually fewer males are needed than females, because each male can
mate with several females, and so the males leave more offspring than
the females. A higher intensity of selection can then be made on
males than on females. Suppose, for example, that females leave on
the average 5 offspring, and each male mates with 10 females, so that
males leave on the average 50 offspring. Then the proportion of
females selected cannot be less than 1/5, but only 1/50 of the males
need be selected. The upper limits of the intensity of selection in this
case would be 1.40 for females, and 2.42 for males.
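
The limits set by the reproductive rate in this illustration can be worked out with a short Python sketch, assuming a large sample and a normal distribution so that equation 11.5 applies. The minimum proportion is reckoned as in the text, as one replacement out of the average number of offspring left by each parent; the function names are illustrative.

```python
from statistics import NormalDist


def intensity(p):
    """Large-sample intensity of selection, i = z / p (equation 11.5)."""
    normal = NormalDist()
    return normal.pdf(normal.inv_cdf(1.0 - p)) / p


def minimum_proportion(offspring_per_parent):
    """Smallest proportion that can be saved, as reckoned in the text."""
    return 1.0 / offspring_per_parent


offspring_per_female = 5
females_per_male = 10
offspring_per_male = offspring_per_female * females_per_male  # 50

p_females = minimum_proportion(offspring_per_female)  # 1/5
p_males = minimum_proportion(offspring_per_male)      # 1/50

print(f"females: p = {p_females:.2f}, upper limit of i = {intensity(p_females):.2f}")
print(f"males:   p = {p_males:.2f}, upper limit of i = {intensity(p_males):.2f}")
```

Changing the two reproductive rates shows how quickly the attainable intensity on the male side rises with the number of females mated to each male.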

The number of offspring produced by a pair of parents depends
not only on their reproductive rate but also on how long the breeder
is willing to wait before he makes the selection. This introduces a
new factor — the interval of time between generations — which we
have not yet taken into account in the treatment of the response to
selection, and which we must now consider.

Generation interval. The progress per unit of time is usually
more important in practice than the progress per generation, so the
interval between generations is an important factor in reckoning the
response to selection. The generation interval is the interval of time
between corresponding stages of the life cycle in successive genera-
tions, and it is most conveniently reckoned as the average age of the
parents when the offspring are born that are destined to become
parents in the next generation. By waiting until more offspring have
been reared before he makes the selection the breeder can increase the
intensity of selection and the response per generation; but in doing so
he inevitably increases the generation interval and may thereby
reduce the response per unit of time. There is thus a conflict of
interest between intensity of selection and generation interval, and
the best compromise must be found between the two. Increasing the
number of offspring will pay up to a certain point, and beyond this
point it will not. The optimal number of offspring cannot be stated
in general terms, and each case must be worked out according to its
special circumstances. The procedure is explained in the following
example, referring to mice.

Example 11.3. Let us suppose that selection is to be applied to some
character in mice, and that speed of progress per unit of time is the aim.
The question is: how many litters should be raised? To find the number of
litters that will give the maximum speed of progress we have to find the
intensity of selection and the generation interval. The ratio of the two will
then give the relative speed. The actual speed could be obtained by multi-
plying by the heritability and the standard deviation, but these factors will
be assumed to be independent of the number of litters raised. A comparison
of the expected rates of progress per week is made in the table. The com-
parison is made for three different average sizes of litter, meaning the
number of young reared per litter. It is assumed that the character to be
selected can be measured before sexual maturity, and that first litters are
born when the parents are 9 weeks old, subsequent litters following at
intervals of 4 weeks. It is assumed also that the population is large enough
to be treated as a large sample in reckoning the intensity of selection; and
that equal numbers of males and females are selected. The optimal
number of litters differs according to the number reared per litter.

                    N = 6                   N = 4                   N = 2
L    t        p      i     i/t        p      i     i/t        p      i     i/t

1    9      0.333  1.10   0.122     0.50   0.80   0.089     1.0    0.00   0.000
2   13      0.167  1.50   0.115     0.25   1.27   0.098     0.50   0.80   0.062
3   17      0.111  1.71   0.101     0.167  1.50   0.088     0.333  1.10   0.065
4   21      0.083  1.85   0.088     0.125  1.65   0.079     0.25   1.27   0.060

Column headings: L = number of litters raised.
                 t = generation interval in weeks.
                 p = proportion selected.
                 i = intensity of selection.
                 i/t = relative speed of progress.
                 N = number of young reared per litter.

If 6
young are reared the maximum speed is attained by rearing only one
litter. If 4 young are reared it is worth while to wait for second litters
before making the selection, but not for third litters. If only 2 young are
reared per litter, raising three litters gives the maximum speed of progress.
Most mouse stocks are able to rear 6 young per litter, so under most cir-
cumstances it is best to make the selection from the first litters, and not to
wait for second litters. This conclusion could hardly have been guessed at
without the computations shown in the table.
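
The comparison made in Example 11.3 can be repeated with a short Python sketch. It assumes, as the example does, a large sample and equal numbers of the two sexes, so that two young must be saved from each pair of parents. The generation interval is taken as the age of the parents at the birth of the last litter raised (9, 13, 17 and 21 weeks), which is what the tabulated values of i/t imply; that reading, and the function names, are assumptions of this sketch.

```python
from statistics import NormalDist


def intensity(p):
    """Large-sample intensity of selection, i = z / p; zero if all are kept."""
    if p >= 1.0:
        return 0.0
    normal = NormalDist()
    return normal.pdf(normal.inv_cdf(1.0 - p)) / p


def relative_speed(young_per_litter, litters, first_litter_age=9, litter_gap=4):
    """Return (p, i, t, i/t) for a given number of litters raised."""
    # Two young are needed to replace each pair of parents.
    p = 2.0 / (young_per_litter * litters)
    i = intensity(p)
    # Assumed: t is the parents' age (weeks) when the last litter is born.
    t = first_litter_age + litter_gap * (litters - 1)
    return p, i, t, i / t


def best_number_of_litters(young_per_litter, max_litters=4):
    """Number of litters giving the greatest relative speed of progress."""
    return max(range(1, max_litters + 1),
               key=lambda litters: relative_speed(young_per_litter, litters)[3])


for n in (6, 4, 2):
    print(f"N = {n}: raise {best_number_of_litters(n)} litter(s)")
    for litters in range(1, 5):
        p, i, t, speed = relative_speed(n, litters)
        print(f"   L = {litters}  t = {t:2d}  p = {p:.3f}  i = {i:.2f}  i/t = {speed:.3f}")
```

The intensities computed here are not rounded before division, so the last figure of i/t may differ slightly from the table.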

Measurement of Response

When one or more generations of selection have been made the
measurement of the response actually obtained introduces several
problems. These are matters of procedure rather than of principle
and will be only briefly discussed.

Variability of generation means. The first problem to be
solved arises from the variability of generation means. Inspection of
any of the graphs of selection given in the examples shows that the
generation means do not progress in a simple regular fashion, but
fluctuate erratically and more or less violently. There are two main
causes of this variation between the generation means: sampling
variation, depending on the number of individuals measured; and
environmental change, which is usually the more important of the
two. The consequence of this variation between generation means is
that the response can seldom be measured with any pretence of
accuracy until several generations of selection have been made. The
best measure of the average response per generation is then obtained
from the slope of a regression line fitted to the generation means, the
assumption being made that the true response is constant over the
period. The variation between generation means appears as error
variation about the regression line, and the standard error of the
estimate of response is based on it. Variation due to changes of
environment can, of course, be overcome, or at least reduced, by the
use of a control population. The measurement of the response can,
however, be improved in accuracy if the "control" is not an un-
selected population but is selected in the opposite direction. This is
known as a "two-way" selection experiment. The response measured
from the divergence of the two lines is then about twice as great as
that of the lines separately, and the variation between generations is
reduced to the extent that the environmental changes affect both lines
alike. An unselected control is, however, preferable if for practical
reasons one is interested only in the change in one direction, because
the response is not always equal in the two directions. This point will
be discussed in the next chapter.
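
The measurement of response by regression, and the gain from a two-way experiment, can be illustrated with a short Python sketch. The generation means below are invented for the purpose: each line has a constant true response, and the same environmental deviation is added to both lines in each generation, which is an idealisation of environmental changes affecting both lines alike.

```python
from math import sqrt


def slope_with_standard_error(generations, means):
    """Least-squares slope of generation means on generation number.

    Returns the slope (average response per generation) and its standard
    error, based on the variation of the means about the fitted line.
    """
    n = len(generations)
    mean_x = sum(generations) / n
    mean_y = sum(means) / n
    sum_xx = sum((x - mean_x) ** 2 for x in generations)
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(generations, means))
    slope = sum_xy / sum_xx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((y - intercept - slope * x) ** 2
                      for x, y in zip(generations, means))
    standard_error = sqrt(residual_ss / (n - 2) / sum_xx)
    return slope, standard_error


# Hypothetical data: true responses of +0.7 (up) and -0.5 (down) per
# generation, with environmental deviations common to both lines.
generations = [0, 1, 2, 3, 4, 5]
environment = [0.0, 0.4, -0.3, 0.5, -0.2, 0.3]
up_line = [20.0 + 0.7 * g + e for g, e in zip(generations, environment)]
down_line = [20.0 - 0.5 * g + e for g, e in zip(generations, environment)]
divergence = [u - d for u, d in zip(up_line, down_line)]

for name, series in (("up", up_line), ("down", down_line), ("divergence", divergence)):
    slope, se = slope_with_standard_error(generations, series)
    print(f"{name:10s} response = {slope:+.3f}  s.e. = {se:.3f}")
```

In real data the environmental deviations are never exactly the same in the two lines, so the standard error of the divergence is reduced but not abolished.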


Fig. 11.4. Two-way selection for 6-week weight in mice. Ex-
planation in Example 11.4. (Redrawn from Falconer, 1953.)
(Horizontal axis: generations.)

Example 11.4. Fig. 11.4 shows the results of 11 generations of two-way
selection for body weight in mice (Falconer, 1953). On the left the "up"
and "down" lines are shown separately, and on the right the divergence be-
tween the two is shown. Linear regression lines are fitted to the observed
generation means. (The first generation of selection is disregarded be-
cause the method of selection was different.) The estimates of the average
response per generation, with their standard errors, are as follows:

                 Response ± standard error
                 in grams per generation

Up               0.27 ± 0.050
Down             0.62 ± 0.046
Divergence       0.88 ± 0.036

The difference between the upward and downward responses will be dis-
cussed in the next chapter.

The foregoing example shows how the variation of the generation
means can be reduced when the response is measured from the differ-
ence between two lines, each acting in the manner of a control for the
other. Controls, however, are not always available, and then a more
serious difficulty may arise from progressive changes of environment.
This makes it difficult to assess the effectiveness of selection in the
improvement of domesticated animals, and to a lesser extent of plants,
because in the absence of a control there is no sure way of deciding
how much of the improvement is due to selection and how much to a
progressive change in the conditions of management.

Example 11.5. Lush (1950) has assembled a number of graphs show-
ing the improvement of farm animals that has taken place during the
present century. Instead of reproducing any of these graphs we give in
the table an indication of the increase of yield per individual over a period
of years, as a percentage of the initial yield. It is difficult to avoid the con-
clusion that much of the improvement of these characters is the result of
selection, but in the absence of any standard of comparison it is very
difficult to decide how much is due to selection and how much to improved
methods of feeding and management.

Character                 Country        Period       Improvement, %

Cows:
  Milk-yield              Sweden         1920-1944
  Butterfat-yield         New Zealand    1910-1940
  Fat % in milk           Netherlands    1906-1945
Pigs:
  Efficiency of growth    Denmark        1922-1949
  Body length             Denmark        1926-1949
Sheep:
  Fleece weight           Australia      1881-1945
Hens:
  Egg production          U.S.A.         1909-1950

Weighting the selection differential. In experimental selection
the selection differential as well as the response has to be measured
because it is the relationship between the two, and not the response
alone, that is of interest from the genetic point of view. We have to
distinguish between the expected and the effective selection differ-
ential, because in practice the individual parents do not contribute
equally to the offspring generation. Differences of fertility are always
present so that some parents contribute more offspring than others.
To obtain a measure of the selection differential that is relevant to
the response observed in the mean of the offspring generation we
therefore have to weight the deviations of the parents according to
the number of their offspring that are measured. The expected
selection differential is the simple mean phenotypic deviation of the
parents as defined at the beginning of this chapter; the effective
selection differential is the weighted mean deviation of the parents,
the weight given to each parent, or pair of parents, being their pro-
portionate contribution to the individuals that are measured in the
next generation.
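
The distinction between the two selection differentials can be made concrete with a short sketch. The parental values, the population mean and the numbers of measured offspring are hypothetical, chosen only to illustrate the weighting; the function names are assumptions of this sketch and do not come from the text.

```python
def expected_selection_differential(parent_values, population_mean):
    """Simple (unweighted) mean phenotypic deviation of the selected parents."""
    deviations = [value - population_mean for value in parent_values]
    return sum(deviations) / len(deviations)


def effective_selection_differential(parent_values, offspring_counts, population_mean):
    """Mean deviation of the parents, each weighted by the number of its
    offspring that are measured in the next generation."""
    total_offspring = sum(offspring_counts)
    weighted_sum = sum(
        (value - population_mean) * count
        for value, count in zip(parent_values, offspring_counts)
    )
    return weighted_sum / total_offspring


# Hypothetical example: four selected parents (or pairs), population mean 22 g.
population_mean = 22.0
parent_values = [24.0, 25.0, 26.0, 27.0]
# The most extreme parents leave fewer measured offspring.
offspring_counts = [8, 8, 6, 2]

expected = expected_selection_differential(parent_values, population_mean)
effective = effective_selection_differential(parent_values, offspring_counts, population_mean)

print(f"expected:  {expected:.2f} g")
print(f"effective: {effective:.2f} g")
print(f"ratio effective/expected: {effective / expected:.2f}")
```

In this made-up case the extreme parents contribute least, so the effective differential falls below the expected one.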

The weighting of the selection differential takes account of a good
part of the effects of natural selection. If the differences of fertility
are related to the parents' phenotypic values for the character being
selected, then this natural selection will either help or hinder the
artificial selection. If, for example, the more extreme phenotypes are
less fertile or more frequently sterile, then natural selection is working
against artificial selection. By weighting the selection differential we
measure the joint effects of natural and artificial selection together.
A comparison of the effective (i.e. weighted) with the expected selec-
tion differential may thus be used to discover whether natural selec-
tion is operative.

Example 11.6. In an experiment with mice, selection for body size
(weight at 6 weeks) was carried through 30 generations in the upward
direction and 24 generations in the downward direction (see Falconer,
1955). Comparisons are made in the table between the effective (weighted)
and the expected (unweighted) selection differentials in the two lines. The
period of selection is divided into two parts and the comparisons are made
separately in each. Throughout the whole of the upward selection there
was virtually no difference between the effective and expected selection
differential, and we can conclude that natural selection was unimportant
as a factor influencing the response. The situation in the downward
selected line, however, is different, the effective selection differential being
less than the expected, especially in the second part. From this we can
conclude that natural selection was operating in favour of large size, thus
hindering the artificial selection and reducing the response obtained,
particularly in the latter part of the experiment. The cause of the natural
selection and the reason why it operated only in the downward selected
line were as follows. Large mice produce larger litters than small mice; but
for the purpose of standardisation, litters were artificially reduced to 8
young at birth. At the beginning, and throughout the whole period in the
upward selected line, there were few litters with fewer than 8 young, and so
the differential fertility had no consequence in the upward selected line.
In the downward selected line, however, there was soon no standardisation
because there were few litters with as many as 8 young. Thus the smaller
mice produced fewer young and this reduced the effective selection differ-
ential. In the second part of the experiment the smallest mice did not
breed at all and this reduced the effective selection differential still further.

                               Selection differential per
                               generation (gms.)
Direction of    Generation                               Effective /
selection       numbers        Expected    Effective     Expected

Upwards         1-22           1.39        1.36          0.98
                23-30          1.08        1.09          1.01
Downwards       1-18           1.03        0.96          0.93
                19-24          0.82        0.70          0.86

The weighting of the selection differential does not take account
of the whole effect of natural selection. We noted at the beginning of
the chapter that natural selection may operate at two stages, through
differences of fertility among the parents and through differences of
viability among the offspring. The effect of differences of viability
among the offspring are not accounted for in the effective selection
differential. For further examples and a fuller account of the inter-
action of natural and artificial selection see Lerner (1954, 1958).

Realised heritability. The equation of response, $R = h^2 S$ (11.2),
which we discussed earlier from the point of view of predicting the
response, can be looked at the other way round, as a means of esti-
mating the heritability from the result of selection already carried
out, the heritability being estimated as the ratio of response to selec-
tion differential:

$$h^2 = \frac{R}{S} \qquad (11.5)$$

The same conditions are necessary for the valid use of the equation
for estimating heritability as for predicting response, except that now
by weighting the selection differential a good part of the effects of
natural selection can be taken account of. There is also the condition
that the observed response should not be confounded with systematic
changes of generation mean due to the environment or the effects of
inbreeding. This, and the absence of maternal effects, are the im-
portant conditions for the valid estimation of heritability from the
response to selection.

The ratio of response to selection differential, however, has an
intrinsic interest of its own, quite apart from whether it provides a
valid estimate of the heritability. It provides the most useful empiri-
cal description of the effectiveness of selection, which allows com-
parison of different experiments to be made even when the intensity
of selection is not the same. The term realised heritability will be used
to denote the ratio R/S, irrespective of its validity as a measure of the
true heritability. The realised heritability is estimated as follows.
The generation means are plotted against the cumulated selection
differential. That is to say, the selection differentials, appropriately
weighted, are summed over successive generations so as to give the
total selection applied up to the generation in question. A regression
line is then fitted to the points and the slope of this line measures the
average value of R/S, the realised heritability.
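
The procedure just described can be sketched as follows. The selection differentials and generation means are hypothetical, and the function names are assumptions of this sketch; the regression is an ordinary least-squares fit of the generation means on the cumulated selection differential, the unselected base generation being given a cumulated differential of zero.

```python
def cumulate(selection_differentials):
    """Total selection applied up to each generation (generation 0 has none)."""
    totals = [0.0]
    for s in selection_differentials:
        totals.append(totals[-1] + s)
    return totals


def regression_slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def realised_heritability(selection_differentials, generation_means):
    """Slope of generation mean on cumulated (weighted) selection differential.

    generation_means must include the base generation, so it has one more
    entry than selection_differentials.
    """
    cumulated = cumulate(selection_differentials)
    if len(cumulated) != len(generation_means):
        raise ValueError("need one generation mean per cumulated differential")
    return regression_slope(cumulated, generation_means)


# Hypothetical weighted selection differentials (grams) for five generations,
# and the means of the base generation and the five generations that follow.
weighted_differentials = [1.2, 1.1, 1.3, 1.0, 1.2]
generation_means = [22.0, 22.3, 22.5, 22.9, 23.1, 23.4]

h2_realised = realised_heritability(weighted_differentials, generation_means)
print(f"realised heritability: {h2_realised:.3f}")
```

Fitting a single line through all the points, rather than taking the ratio R/S generation by generation, averages out the erratic variation of the individual generation means.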

Example 11.7. Fig. 11.5 shows the results of 21 and 18 generations
of two-way selection for 6-week weight in mice (Falconer, 1954a). The
generation means are plotted against the cumulated selection differential
and linear regression lines are fitted to the points. The realised herit-
abilities, estimated from the slopes of these lines, are:

Upward selection:    0.175 ± 0.016
Downward selection:  0.518 ± 0.023

The difference between the upward and downward selection is referred to
in the next chapter.

[Fig. 11.5. Two-way selection for 6-week weight in mice. Response
plotted against cumulated selection differential, as explained
in Example 11.7. (From Falconer, 1954a; reproduced by courtesy
of the editor of the International Union of Biological Sciences.)]

Change of Gene Frequency under Artificial Selection

It was pointed out at the beginning of this chapter that the change
of the population mean resulting from selection is brought about
through changes of the gene frequencies at the loci which influence
the character selected. But since the effects of the loci cannot be
individually identified, the changes of gene frequency cannot in
practice be followed. Consequently the process of selection for a
metric character had to be described in terms of the selection differ-
ential, or the intensity of selection, and of the change of the popula-
tion mean, representing the combined effects of all the loci. This
leaves unanswered the fundamental question: How great are the
changes of gene frequency underlying the response of a metric
character to selection? To answer this question, and so to bridge the
gap between the treatment of selection given in this chapter and that
given earlier in Chapter 2, we have to find the connexion between the
intensity of selection (i) and the coefficient of selection (s) operating
on a particular locus.

The effect of selection for a metric character on one of the loci
concerned may best be pictured in the manner illustrated in Fig.
11.6. This refers to a locus with two alleles of which one ($A_1$) is com-
pletely dominant. With respect to this locus, therefore, the popula-
tion is divided into two portions which differ in their mean pheno-
typic values by an amount $2a$, this being the difference between the
two homozygotes in the notation of earlier chapters (see Fig. 7.1,
p. 113). It is assumed that the residual variance within each portion is
the same, this residual variance arising from all the other loci as well
as from environmental causes. The proportion of individuals in the
two portions depends on the gene frequency at the locus, $q^2$ being in
the portion consisting of $A_2A_2$ genotypes, and $1 - q^2$ in the portion
containing $A_1A_1$ and $A_1A_2$ genotypes. When artificial selection is
applied, a proportion of the whole population lying beyond the point
of truncation is cut off, and the proportion of $A_2A_2$ genotypes is lower
among this selected group than in the population as a whole, selec-
tion acting in the case illustrated against the $A_2$ allele. Now, the new
gene frequency, $q_1$, is the frequency of $A_2$ genes among the selected
group of individuals. This may be found by deducing the regression
of gene frequency on phenotypic value, $b_{qP}$. The selected group
deviates in mean phenotypic value from the population mean by an
amount $S$, which is the selection differential. The gene frequency
among the selected group will then be given by the regression equa-
tion

$$q_1 = q + b_{qP} S \qquad (11.6)$$

[Fig. 11.6. Selection for a metric character operating on one of
the loci concerned. The frequency of $A_2A_2$ as depicted is $q^2$ = I.]

The regression of gene frequency on phenotypic value is found as
follows. The three genotypes are listed in Table 11.2 with their
frequencies in the whole population. The third column of the table
gives the frequency of the $A_2$ allele among each of the three geno-
types, which is simply 0, 1/2, and 1. The last column gives the geno-
typic values. Provided there is no correlation between genotype and
environment, these are also the mean phenotypic values of each
genotype. There is now no assumption of complete dominance.

Table 11.2

Genotype    Frequency    q      G

A1A1        p^2          0      +a
A1A2        2pq          1/2    d
A2A2        q^2          1      -a

The covariance of gene frequency with phenotypic value is obtained
from the sum of the products of $q$ and $P$, each multiplied by the
frequency of the genotype. From this sum of products must be
deducted the product of the means of the gene frequency and the
phenotypic value. Thus the covariance is $\mathrm{cov}_{qP} = pqd - q^2a - qM$,
where $M$ is the population mean. Substituting the value of $M$ from
equation 7.2, the covariance reduces to $-pq[a + d(q - p)] = -pq\alpha$,
where $\alpha$ is the average effect of the gene substitution (see equation 7.5).
The regression of gene frequency on phenotypic value is therefore

$$b_{qP} = \frac{\mathrm{cov}_{qP}}{\sigma_P^2} = -\frac{pq\alpha}{\sigma_P^2}$$

where $\sigma_P^2$ is the phenotypic variance.
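
The reduction of the covariance can be checked numerically. The sketch below builds the three rows of Table 11.2 for a given gene frequency and computes the covariance directly, as the sum of products less the product of the means, for comparison with the reduced form, minus pq times the average effect. The function names and the numerical values of q, a and d are assumptions made for illustration.

```python
def genotype_table(q, a, d):
    """Rows of Table 11.2: (genotype, frequency, frequency of A2, genotypic value)."""
    p = 1.0 - q
    return [
        ("A1A1", p * p, 0.0, a),
        ("A1A2", 2.0 * p * q, 0.5, d),
        ("A2A2", q * q, 1.0, -a),
    ]


def covariance_q_p(q, a, d):
    """Covariance of gene frequency with phenotypic value, computed directly."""
    rows = genotype_table(q, a, d)
    mean_gene_frequency = sum(freq * gq for _, freq, gq, _ in rows)   # equals q
    population_mean = sum(freq * value for _, freq, _, value in rows)  # M
    sum_of_products = sum(freq * gq * value for _, freq, gq, value in rows)
    return sum_of_products - mean_gene_frequency * population_mean


def average_effect(q, a, d):
    """Average effect of the gene substitution, a + d(q - p)."""
    p = 1.0 - q
    return a + d * (q - p)


# Hypothetical values: q = 0.3, a = 0.1, d = 0.05 (partial dominance).
q, a, d = 0.3, 0.1, 0.05
direct = covariance_q_p(q, a, d)
reduced = -(1.0 - q) * q * average_effect(q, a, d)
print(f"direct:  {direct:.6f}")
print(f"reduced: {reduced:.6f}")
```

The two values agree for any q, a and d, since the identity does not depend on the degree of dominance.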

Next, we substitute this regression coefficient in equation 11.6,
putting also $S = i\sigma_P$ from equation 11.5. This gives the gene frequency
among the selected parents as

$$q_1 = q - i\,pq\,\frac{\alpha}{\sigma_P}$$

and the change of gene frequency resulting from the selection
reduces to

$$\Delta q = -i\,pq\,\frac{\alpha}{\sigma_P} \qquad (11.8)$$

The change is negative because selection is acting against the allele
$A_2$ whose frequency is $q$. This formula enables us to translate the
intensity of selection, $i$, into the coefficient of selection, $s$, against $A_2$,
because equations for the change of gene frequency in terms of $s$ were
given in Chapter 2. We shall take the approximate equations given
in 2.7 and 2.8. If dominance is complete, $d = a$ and $\alpha = 2qa$. Then
equating 11.8 with 2.8 gives

$$i\,pq\,\frac{2qa}{\sigma_P} = s\,q^2(1 - q).$$

If there is no dominance $d = 0$ and $\alpha = a$. Then equating 11.8 with
2.7 gives

$$i\,pq\,\frac{a}{\sigma_P} = \tfrac{1}{2}\,s\,q(1 - q).$$

Both these equations, on simplification, reduce to

$$s = i\,\frac{2a}{\sigma_P} \qquad (11.9)$$

Thus we find that the two ways of expressing the "force" of selection,
by the intensity and by the coefficient of selection, are very simply
related to each other. The coefficient of selection operating on any
locus is directly proportional to the intensity of selection and to the
quantity $2a/\sigma_P$. This quantity is the difference of value between the two
homozygotes expressed in terms of the phenotypic standard deviation.
For want of a more suitable term we shall refer to this, rather loosely,
as the "proportionate effect" of the locus. There is nothing more
that we can do with the relationship expressed in equation 11.9 at
the moment, but we shall use it in the next chapter to draw some
tentative conclusions about the "proportionate effects" of loci con-
cerned with metric characters.
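
The equivalence expressed in equation 11.9 can be verified with a few lines of arithmetic. In the sketch below the change of gene frequency is computed in two ways: from the intensity of selection by equation 11.8, and from the coefficient of selection, using the approximate forms that appear on the right-hand sides of the two equalities above (one half of sq(1 - q) with no dominance, and sq²(1 - q) with complete dominance). The function names and numerical values are assumptions made for illustration.

```python
def average_effect(q, a, d):
    """Average effect of the gene substitution, a + d(q - p)."""
    p = 1.0 - q
    return a + d * (q - p)


def delta_q_from_intensity(i, q, a, d, sigma_p):
    """Change of gene frequency from equation 11.8."""
    p = 1.0 - q
    return -i * p * q * average_effect(q, a, d) / sigma_p


def coefficient_of_selection(i, a, sigma_p):
    """Equation 11.9: s = i * 2a / sigma_P."""
    return i * 2.0 * a / sigma_p


def delta_q_no_dominance(s, q):
    """Approximate change of gene frequency with no dominance."""
    return -0.5 * s * q * (1.0 - q)


def delta_q_complete_dominance(s, q):
    """Approximate change with complete dominance, selection against the recessive."""
    return -s * q * q * (1.0 - q)


# Hypothetical values: intensity 1.0, a = 0.1 phenotypic standard deviations.
i, q, a, sigma_p = 1.0, 0.3, 0.1, 1.0
s = coefficient_of_selection(i, a, sigma_p)

no_dom = delta_q_from_intensity(i, q, a, 0.0, sigma_p)
full_dom = delta_q_from_intensity(i, q, a, a, sigma_p)

print(f"s = {s:.3f}")
print(f"no dominance:       {no_dom:.5f} vs {delta_q_no_dominance(s, q):.5f}")
print(f"complete dominance: {full_dom:.5f} vs {delta_q_complete_dominance(s, q):.5f}")
```

With a locus whose homozygotes differ by 0.2 standard deviations and an intensity of selection of 1.0, the coefficient of selection is therefore 0.2, whatever the gene frequency.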

CHAPTER 12

SELECTION:

II. The Results of Experiments

In the last chapter we saw that the theoretical deductions about the
effects of artificial selection are limited to the change of the popula-
tion mean, and strictly speaking over only one generation. By chang-
ing the gene frequencies selection changes the genetic properties of
the population upon which the effects of further selection depend.
And, because the effects of the individual loci are unknown, the
changes of gene frequency cannot be predicted, and so the response
to selection can be predicted only for as long as the genetic properties
remain substantially unchanged. Thus there are many consequences
of selection that can be discovered only by experiment. The object of
this chapter is to describe briefly what seem to be the most general
conclusions about these consequences that have emerged from
experimental studies of selection. It should be noted, however, that
the drawing of conclusions from the results of experiments in the
field of quantitative genetics is to some extent a matter of personal
judgement. Many of the conclusions put forward in this chapter
therefore represent a personal viewpoint, and are not necessarily
accepted generally. The most important questions to be answered
by experiment concern the long-term effects of selection. For how
long does the response continue? By how much can the population
mean ultimately be changed? What is the genetic nature of the limit
to further progress? These questions will be dealt with in the latter
part of the chapter. First we shall consider two questions raised by
the examples in the last chapter.

Repeatability of Response

In Example 11.1 we saw that the response in one generation of
selection was very variable when the selection was replicated in a
number of lines. Though the average response agreed fairly well
with the prediction, the responses of the individual lines did not.
This raises the question: How consistent, or repeatable, are the
results of selection? If selection is applied to different samples drawn
from the same population, how closely will the results agree? Part
of the problem here concerns sampling variation — the extent to
which the samples differ in gene frequencies, both initially and
during the course of the continued selection. This depends, of course,
on the size of the populations, or lines, during the course of the
selection; but it depends also on the initial gene frequencies in the
base population from which the samples were drawn. If most of the
loci concerned with the character have genes at more or less inter-
mediate frequencies then the response to selection is not likely to be
much influenced by sampling variation. On the other hand, if there
are loci with genes at low frequency then these will be included in
some samples drawn from the initial population but will be absent
from others. Then, if any of these low-frequency genes have a fairly
large effect on the character their presence or absence may appreciably
influence the outcome of selection. The experiment on abdominal
bristle-number in Drosophila, whose first generation was quoted in
Example 11.1, provides the only evidence on this point (Clayton,
Morris, and Robertson, 1957). Fig. 12.1 shows the responses in the
five up and the five down lines over 20 generations. The responses
are reasonably consistent over the first 5 generations in the up lines
and over about 10 generations in the down lines. Thereafter the
lines begin to differentiate, and by the twentieth generation there are
substantial differences between them. The conclusion suggested by
the early similarity and the later divergence between the replicate
lines is that the early response is governed chiefly by genes at more or
less intermediate frequencies, but in the later stages genes at initially
low frequencies begin to come into play, the initial sampling having
caused differences between the lines in respect of these genes.

The question of repeatability of the response to selection may be
extended to differences between populations. This is not a matter of
sampling variation but of the differences in the genetic properties of
populations. We noted in Chapter 10 that heritabilities frequently
differ between populations, and consequently we should not expect
the responses to selection to be the same. It is of interest nevertheless
to compare the results of selection applied to different populations
and to see how they do actually differ. Fig. 12.2 shows the results of
selection for thorax length in Drosophila melanogaster applied to three
different wild populations (F. W. Robertson, 1955). The responses
of the three populations, both upward and downward, are fairly alike.
It is not possible to discuss further the degree of repeatability
between the responses found in these two experiments, because there
is no objective criterion for deciding how closely the responses ought
to agree. One can therefore only regard them as empirical evidence
of what in practice does occur.

[Fig. 12.1. Selection for abdominal bristle number in Drosophila
melanogaster, replicated in 5 lines in each direction. The broken
lines refer to suspended selection and the thin continuous lines to
inbreeding without selection. (From Clayton, Morris, and Robert-
son, 1957; reproduced by courtesy of the authors and the editor of
the Journal of Genetics.)]

[Fig. 12.2. Selection for thorax length in Drosophila melanogaster
from three different base populations (panels: Ischia, Renfrew,
Crianlarich; large and small lines shown relative to the control level,
plotted against generations, 0 to 20). The broken lines refer to
reversed selection and the dotted lines to suspended selection.
(From F. W. Robertson, 1955; reproduced by courtesy of the
author and the editor of the Cold Spring Harbor Symposia on
Quantitative Biology.)]

Asymmetry of Response

A surprising feature of the experimental results illustrated in the
last chapter is the inequality of the responses to selection in opposite
directions, seen particularly well in Fig. 11.5. This asymmetry of
response has been found in many two-way selection experiments,
but its cause is not yet known. For this reason we shall not discuss
the phenomenon in detail, but shall merely note the possible causes,
of which there are several. These possible causes are, briefly, as
follows.

1. Selection differential. The selection differential may differ
between the upward and downward selected lines, for several reasons.
(i) Natural selection may aid artificial selection in one direction or
hinder it in the other. (ii) The fertility may change so that a higher
intensity of selection is achieved in one direction than in the other.
(iii) The variance may change as a result of the change of mean: the
selection differential will increase as the variance increases and de-
crease as it decreases. This is a "scale effect," to be discussed more
fully in Chapter 17. These three causes operating through the selec-
tion differential were all found in the experiment with mice cited in
the last chapter, but they operated in the direction opposite to that of
the asymmetry found. The selection differential was greater in the
upward selection but the response was greater in the downward
selection. Differences of the selection differential influence the
response per generation, but they affect the realised heritability only a
little. Therefore if the response is plotted against the cumulated
selection differential and there is still much asymmetry, as in Fig.
11.5, it cannot be attributed to any cause operating through the
selection differential.

2. "Genetic asymmetry." There are two sorts of asymmetry
in the genetic properties of the initial population that could give rise
to asymmetry of the responses to selection (Falconer, 1954a). These
concern the dominance and the gene frequencies of the loci concerned
with the character. The dominant alleles at each locus may be mostly
those that affect the character in one direction, instead of being more
or less equally distributed between those that increase and those that
decrease it. We shall refer to this situation as directional dominance.
If the initial gene frequencies were about 0.5, the response would be
expected to be greater in the direction in which the alleles tend to be
recessive. It will be shown in Chapter 14 that this is also the direction
in which the mean is expected to change on inbreeding. Therefore
we should, in general, expect characters that show inbreeding
depression to respond more rapidly to downward selection than to
upward selection. There may also be asymmetry in the distribution of
gene frequencies. The more frequent alleles at each locus may be
mostly those that affect the character in one direction — a situation
that we shall refer to as directional gene frequencies. In the absence of
directional dominance this would be expected to cause a more rapid
response to selection in the direction of the less frequent alleles.
Under natural selection the less favourable alleles, in respect to fit-
ness, will have been brought to lower frequencies. Therefore if
selection in one direction reduces fitness more than selection in the
other, we should expect a more rapid response in the direction of the
greater loss of fitness. The asymmetry of the response to selection
theoretically expected from these two causes may be seen by con-
sideration of Fig. 2.3, which shows the expected response arising
from one locus. Neither of these two causes — directional dominance
and directional gene frequencies — would, however, be expected to
give rise to immediate asymmetry; that is, in the first few generations
of selection. The asymmetry would appear only as the gene fre-
quencies in the upward and downward selected lines become differ-
entiated. The asymmetry found in some experiments undoubtedly
appears sooner than would be expected from these causes.

3. Selection for heterozygotes. If selection in one direction
favours heterozygotes at many loci, or at a few loci with important
effects, the response would become slow as the gene frequencies ap-
proach their equilibrium values. But the response in the other direc-
tion would be rapid until the favoured alleles approach fixation. This
situation, which is a form of directional dominance, would also be
expected to give rise to an asymmetrical response (Lerner, 1954); but,
again, not immediately.

4. Inbreeding depression. Most experiments on selection are
made with populations not very large in size, and there is usually
therefore an appreciable amount of inbreeding during the progress
of the selection. If the character selected is one subject to inbreeding
depression, there will be a tendency for the mean to decline through
inbreeding. This will reduce the rate of response in the upward
direction and increase it in the downward direction, thus giving rise
to asymmetry. An unselected control population will reveal how
much asymmetry can be attributed to this cause. Inbreeding
depression has been shown to be an insufficient cause of the asymmetry
in the experiments cited in the last chapter.

5. Maternal effects. Characters complicated by a maternal effect
may show an asymmetry of response associated with the maternal
component of the character. The situation envisaged may best
be explained by reference to the selection for body weight in
mice (Falconer, 1955), which showed the strong asymmetry illus-
trated in Fig. 11.5. The character selected, 6-week weight, may be
divided into two components, weaning weight and post-weaning
growth, the former being maternally determined. It was found that
all the asymmetry resided in the weaning weight and none in the
post-weaning growth. The weaning weight increased hardly at all
in the large line but decreased very much in the small line. Thus it
was the mothering ability that changed asymmetrically under selec-
tion and not the growth of the young themselves. To attribute an
asymmetrical response to maternal effects does not, however, solve
the problem, because the asymmetry has merely been shifted from
the character selected to another, and is still just as much in need of
an explanation.

These, then, are the possible causes of asymmetry that may be
suggested. There are probably others. Until the causes of asym-
metry are better understood it is clear that predictions of the rate of
response to selection must be made with caution. Where there is
asymmetry of response the mean of the realised heritabilities in the
two directions will presumably correspond with the heritability
estimated from the resemblance between relatives. Therefore the
response predicted will presumably be about the mean of the two-
way responses actually obtained. If the asymmetry found in the
mouse experiment should prove to be characteristic of selection for
economically desirable characters in mammals, it means that we must
expect actual progress to fall short of the predicted progress. In this
experiment the mean realised heritability was 35 per cent, but the
upward progress was only at the rate of 18 per cent. In other words
the progress made was only about half as rapid as would, presumably,
have been predicted.
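
The arithmetic behind these figures can be set out briefly, taking the two realised heritabilities quoted in Example 11.7 (0.175 upward and 0.518 downward). The variable names are assumptions of this sketch, and the prediction is taken, as in the text, to rest on the mean of the two realised heritabilities.

```python
# Realised heritabilities from Example 11.7.
h2_up = 0.175
h2_down = 0.518

# The heritability that would presumably have been used for prediction.
mean_h2 = (h2_up + h2_down) / 2.0

# Upward progress actually made, relative to the progress predicted.
relative_progress = h2_up / mean_h2

print(f"mean realised heritability: {100 * mean_h2:.1f} per cent")
print(f"upward rate:                {100 * h2_up:.1f} per cent")
print(f"upward progress relative to prediction: {relative_progress:.2f}")
```

The ratio is close to one half, in agreement with the statement that the upward progress was only about half as rapid as would have been predicted.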

Long-term Results of Selection

The response to selection cannot be expected to continue in-
definitely. Sooner or later it is to be expected that all the favourable
alleles originally segregating will be brought to fixation. As they
approach fixation the genetic variance should decline and the rate of
response diminish, till, when fixation is complete, the response should
cease. The population should then fail also to respond to selection in
the opposite direction, and further response to selection in either
direction will depend on the origin of new genetic variation by
mutation. But how many generations must elapse before the response
ceases, and how great will be the total response are questions that can
be answered only by experiment. Let us first see what evidence is
available on these points, and then see how far the long-term effects
of selection conform to the simple theoretical picture outlined above.

Total response and duration of response. When the response
to selection has ceased, the population is said to be at the selection
limit. It is usually impossible to decide exactly at what point the
limit is reached, because the limit is approached gradually, the res-
ponse becoming progressively slower. The total response, and par-
ticularly the duration of the response, can therefore be estimated only
approximately. Bearing this in mind, we may examine the results of
four two-way selection experiments, two with Drosophila and two
with mice, given in Table 12.1. The asymmetry of the responses is
disregarded, and the total response is taken as the sum of the total
responses in the two directions. This is the difference between the
upper and lower selection limits, and may be called the total range. In
the table the total range is expressed in three ways: as a percentage of
the initial population mean, $M_0$; in terms of the phenotypic standard
deviation, $\sigma_P$, in the initial population; and in terms of the standard
deviation of breeding values, $\sigma_A$ (i.e. the square root of the additive
variance) in the initial population. To draw general conclusions from
these four experiments would be rash, because the experiments
differed in several ways (in the intensity of selection, the population
size, and the nature of the initial population), all of which would be
expected to affect the duration of response and the total range.
Despite these differences, however, the picture they give is fairly
consistent. The response continues for about 20 to 30 generations;
and the total range is between 15 and 30 times the square root of the
additive variance, or about 10 to 20 times the phenotypic standard
deviation in the initial population. The relationship between the
total range and the original population mean, however, is quite
irregular.

Table 12.1
Total Responses in four Selection Experiments

                               Duration         Total range
Experiment                     (generations)    /M0 (%)    /σP    /σA

Drosophila:
  (1) abdominal bristles       30               189        20     28
  (2) thorax length            20               24         12     22
Mice:
  (3) 6-week weight
  (4) 60-day weight                             122

References:
(1) Clayton and Robertson (1957).
(2) F. W. Robertson (1955).
(3) Falconer (1955).
(4) MacArthur (1949); Butler (1952).

The total response produced by selection in these experiments,
though it may be impressive when reckoned in terms of the variation
present in the original population, is not at all spectacular when com-
pared with the achievements of the breeders of domestic animals.
For example, the upper limits of body weight of the mice in the
experiments quoted are 2 to 3 times the lower limits; but the weights
of the largest breeds of dog are about 75 times greater than those of
the smallest (Sierts-Roth, 1953). The reason for the disappointing
results of experimental selection when viewed against the differences
between the breeds of domestic animals is that experiments are
carried out with closed populations of not very large size. The limits
are set by the gene content of the foundation individuals, since no
genes are brought in after selection has been started. The breeder of
domestic animals, in contrast, by intermittent crossing casts his net
far wider in the search for genes favourable to his purposes.

The effects of inbreeding during the selection have been ignored
in this account of selection limits. It is clear on theoretical grounds
that inbreeding will tend to cause fixation of unfavourable alleles at
some loci. Both the total response and the duration of the response
must therefore be expected to be reduced if the selection is carried
out in a small population with a fairly high rate of inbreeding. There
is, however, little experimental evidence on the magnitude of this
effect of inbreeding. The four experiments discussed above were all
carried out on fairly large populations, so that the rate of inbreeding
was fairly low.

Number of "loci." When the total range has been determined
by experiment it is possible, in principle, to deduce the number of
loci that gave rise to the response, and the magnitude of their effects.
The estimates that can be made in practice, however, are only rough
ones, because the properties of the individual loci are unknown and
have to be guessed at. But even though we can do no more than
establish the order of magnitude of the number and effects of the loci,
this is better than no estimate at all; so let us see how these estimates
may be obtained. The limitations will become apparent as we pro-
ceed.

The estimates come from a comparison of the total range with
the amount of additive genetic variance in the original population. In
principle it is clear that with a given amount of initial variation a
small number of genes will produce less total response than a larger
number; and that if a given amount of variation is produced by few
genes the magnitude of their effects must be greater than if it is pro-
duced by many. It is clear, also, that linkage is an important factor
in the relationship between variance and total response. Some seg-
ments of chromosome that segregate as units in the initial popula-
tion will recombine during the selection and appear as many genes
contributing to the total response. Other segments may fail to re-
combine and will be counted as single genes. In order to emphasise
this limitation, the estimate of the number of loci may be referred to
as the number of "effective factors" or as the "segregation index."
There are, however, other uncertainties, and we shall simply refer to
it as the number of "loci," letting the inverted commas serve to
remind us of the unavoidable limitations and qualifications.

We must first suppose that there has been no inbreeding and when
the selection limits have been reached all loci are fixed for the
favourable allele. The total range is then $2\sum a$, where $2a$ is the
difference of genotypic value between the two most extreme
homozygotes at a particular locus, and is the precise meaning of what
we have loosely called the "effect" of the locus. If $R$ is the total
range and $n$ is the number of loci that have contributed to the
response, then

$$R = 2n\bar{a} \qquad (12.1)$$

where $\bar{a}$ is the mean value of $a$. Next we must suppose that each
locus has only two alleles. The additive variance arising from one
locus is then $\sigma_A^2 = 2pq[a + d(q - p)]^2$, from equation 8.5. (We shall
use $\sigma^2$ here to denote variance instead of $V$, because it simplifies the
formulation when standard deviations are involved.) The gene
frequencies at the individual loci thus enter the picture. Unless the
initial population was made from crosses between inbred lines, the
gene frequencies are not known and we shall therefore have to insert
hypothetical values. We shall suppose that all segregating genes are
at frequencies of 0.5, as they would be if the initial population were
made from a cross between two inbred lines. The additive variance
contributed by one locus then becomes $\sigma_A^2 = \tfrac{1}{2}a^2$, and the degree
of dominance becomes irrelevant. Next we have to suppose there is no
linkage between the loci, so that the additive variance due to all $n$
loci together is

$$\sigma_A^2 = \tfrac{1}{2} n \overline{a^2} \qquad (12.2)$$

where $\overline{a^2}$ is the mean of the squares of $a$ for each locus. Finally we
shall suppose that all loci have equal effects, so that equations 12.1
and 12.2 become

$$R = 2na \qquad (12.3)$$

and

$$\sigma_A^2 = \tfrac{1}{2} n a^2 \qquad (12.4)$$

Squaring equation 12.3 and substituting $a^2 = (2/n)\sigma_A^2$ from equation
12.4 gives $R^2 = 8n\sigma_A^2$, whence

$$n = \frac{R^2}{8\sigma_A^2} \qquad (12.5)$$

This equation gives the basis for estimating the number of "loci."
Their effects may then be estimated from equation 12.4. The most
meaningful measure of the "effect" of a locus, however, is what we
have earlier called the "proportionate effect," $2a/\sigma_P$, which is the
difference between the homozygotes expressed in terms of the
phenotypic standard deviation. By rearrangement of equation 12.4 this
becomes

$$\frac{2a}{\sigma_P} = 2h\sqrt{\frac{2}{n}} \qquad (12.6)$$

where $h$ is the square root of the heritability.
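
The arithmetic of equations 12.5 and 12.6 can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the inputs are the rounded ratios of total range to the phenotypic and additive standard deviations listed for the two Drosophila experiments in Table 12.1. Because R enters only as a ratio, the number of "loci" is (R/sigma_A)^2 / 8, and h is obtained as (R/sigma_P) / (R/sigma_A).

```python
import math

def number_of_loci(r_over_sigma_a: float) -> float:
    """Equation 12.5, n = R^2 / (8 * sigma_A^2), with R expressed in units of sigma_A."""
    return r_over_sigma_a ** 2 / 8.0

def proportionate_effect(r_over_sigma_p: float, r_over_sigma_a: float) -> float:
    """Equation 12.6, 2a / sigma_P = 2h * sqrt(2 / n)."""
    n = number_of_loci(r_over_sigma_a)
    h = r_over_sigma_p / r_over_sigma_a  # h = sigma_A / sigma_P
    return 2.0 * h * math.sqrt(2.0 / n)

# (R / sigma_P, R / sigma_A) as listed in Table 12.1 (rounded values).
experiments = {
    "abdominal bristles": (20.0, 28.0),
    "thorax length": (12.0, 22.0),
}

for name, (r_p, r_a) in experiments.items():
    n = number_of_loci(r_a)
    effect = proportionate_effect(r_p, r_a)
    print(f"{name}: n = {n:.1f}, 2a/sigma_P = {effect:.3f}")
```

Since the inputs are rounded, the results need not agree to the last figure with the tabulated estimates, but they fall at the same order of magnitude: a number of "loci" nearer 100 than 10 or 1,000, and a proportionate effect of about one-fifth.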

Let us see what results these theoretical deductions yield when
applied to the experiments quoted in Table 12.1. The estimates of
the number of "loci" and of the proportionate effects of the genes are
given in Table 12.2.

Table 12.2

Experiment                   Number of "loci"   Proportionate effect ($2a/\sigma_P$)
Drosophila:
  (1) abdominal bristles                               0.21
  (2) thorax length                59                  0.20
Mice:
  (3) 6-week weight                                    0.23
  (4) 60-day weight                                    0.19

(For references to experiments see Table 12.1)

Since the estimation of the number of "loci" is
necessarily so imprecise it does not seem worth while to discuss in
detail its limitations or the errors that may have been introduced by
the assumptions that were made. These matters are discussed by
Wright (1952b). The results given in Table 12.2, then, suggest that
the responses to selection in these experiments have resulted from
about 100 loci (i.e. more nearly 100 than 10 or 1,000); and that on the
average the difference in value between homozygotes at one locus
amounts to about one-fifth of the phenotypic standard deviation.

Nature of the selection limit. The deductions made in the last
section from the observed total response were based on the assump-
tion that the selection limit represents fixation of all favourable
alleles. The simple theoretical expectation is that selection should
lead to fixation with the consequent loss of genetic variance. Let us
now consider the evidence from experiments about the nature of the
selection limit and see how far it conforms to this simple theoretical
picture. If the genetic variance declines as the limit is approached
this ought to be apparent in a decline of phenotypic variance. In
many experiments, however, the phenotypic variance has been found
not to decline, even when the selection limit has been reached, and
when due allowance for "scale effects" has been made as will be
explained in Chapter 17. A fairly typical example is provided by the
experiment with mice which was described in the last chapter (Fig.
11.5). The phenotypic variance is shown in Fig. 12.3, expressed in
the form of the coefficient of variation in order to eliminate scale
effects.

Fig. 12.3. Coefficient of variation of 6-week weight in mice, plotted
against generations. The thin continuous line starting at generation
23 refers to the unselected control. The broken lines refer to
reversed selection and the dotted lines to suspended selection. (From
Falconer, 1955; reproduced by courtesy of the editor of the Cold
Spring Harbor Symposia on Quantitative Biology.)

The variance in the large line remains at the same level
throughout the experiment, and after the limit has been reached at
about the twenty-fifth generation a comparison with the unselected
control shows the variance not to have declined at all. The variance
in the small line shows a sudden and large increase, but we shall
return to this point later. An example from Drosophila is provided
by the experiment on abdominal bristle-number illustrated in Fig.
12.1. The phenotypic variance in the base population and in the most
extreme of the high and of the low lines after 35 and 34 generations
respectively is illustrated by frequency distributions in Fig. 12.4. In
this case the variance not only failed to decline but increased very
much during the selection in both directions.

Fig. 12.4. Frequency distributions of abdominal bristle number
in Drosophila melanogaster (females), in the base population and in
the most extreme high and low lines after 35 and 34 generations
of selection. (From Clayton, Morris, and Robertson, 1957;
reproduced by courtesy of the authors and the editor of the Journal
of Genetics.)

Before we consider the reasons for this behaviour of the variance we
shall mention another fact often found in selection experiments. It is
that when the response to continued selection has ceased the
population will often respond to selection in the reverse direction
and will often respond rapidly. This is well illustrated in Fig. 12.2,
where the three lines selected for increased thorax length returned
rapidly to the unselected level when the direction of selection was
reversed after the upward responses had ceased. The lines selected for
reduced thorax length,
however, did not respond to reversed selection. From this brief
outline of the evidence it is clear that the simple theoretical picture of
the selection limit is not substantiated by experiment. Instead, we
find — not always but often — no loss of phenotypic variance and the
ability to respond rapidly to reversed selection. Let us now consider
what may be the possible reasons for these facts, and what conclusions
about the genetic nature of the selection limit can be drawn from
them.

1. The failure of the phenotypic variance to decline may be due
to an increase of non-genetic variance compensating for the expected
reduction of genetic variance. With the approach to fixation of the
loci concerned, and of others linked to them, the frequency of
homozygotes will increase. There is evidence, mentioned in Chapter 8
and to be discussed more fully in Chapter 15, that homozygotes are
sometimes more variable from environmental causes than hetero-
zygotes. This could cause an increase of environmental variance which
might counterbalance a reduction of genetic variance; but there is
little experimental evidence concerning the matter.

2. If the population, after the selection limit has been reached,
responds to reversed selection we can only conclude that genetic
variance of some sort remains. The continued presence of genetic
variance could result from the following causes:

(i) We saw in Example 11.6 how natural selection opposed the
artificial selection for small size in mice, partly because small mice
are less fertile than large ones and partly because the smallest mice
were sterile. Natural selection acting in this sort of way may increase
as the population mean changes further from the original level, until
it becomes strong enough to counteract completely the artificial
selection. The response would then cease, but reversed selection
would be aided by natural selection and the population would res-
pond.

(ii) Selection may favour heterozygotes at some loci. At the
selection limit the genes would be in equilibrium at more or less
intermediate frequencies, and they would give rise to genetic vari-
ance. But the variance would be non-additive, and there would be
no immediate response to reversed selection. If reversed selection
were continued a response would slowly develop and become more
rapid as the gene frequencies changed away from the equilibrium
values. The behaviour of populations at the selection limit, however,
does not seem commonly to be of this sort.

(iii) If there is superiority of heterozygotes arising from the com-
bined action of artificial and natural selection then the situation is
quite different. Consider a locus at which the heterozygote A1A2 is
superior in the character selected to the homozygote A1A1, and the
homozygote A2A2 is inviable or sterile. Artificial selection will choose
A1A2, or perhaps A2A2 if it is viable, but natural selection will reject
A2A2, so that under the combined effect of artificial and natural
selection the heterozygote is superior. The pygmy gene in mice
which was used for several examples in Chapter 7 provides just such a
case, when artificial selection is in the direction of small size.
Heterozygotes are favoured because they are smaller than normal
homozygotes; homozygous pygmies are smaller still but are sterile. When the
selection limit is reached under this situation there will be genetic
variance due to the gene, but no further response. When selection is
reversed, however, it is only the artificial selection that is reversed in
direction, and one homozygote will be favoured. The population will
therefore respond immediately. This may be regarded as an extreme
form of asymmetrical response to selection. It leads to the anomaly of
a high heritability — about 50 per cent — estimated from the offspring-
parent regression, but a realised heritability of zero in one direction
and up to 100 per cent in the other direction. The anomaly, however,
is only apparent because the estimation of heritability and the pre-
diction of the response to selection are valid only if natural selection
does not interfere with the appearance of the genotypes in their proper
Mendelian ratios.
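
The behaviour of such a locus at the limit can be followed in a small deterministic sketch. It rests on deliberately extreme assumptions that are not part of the account above: selection is taken to be perfectly efficient, so that under forward selection only heterozygotes become parents and under reversed selection only A1A1 individuals do; A1A1 is scored 0 and the heterozygote is given a hypothetical value of 10 units; and A2A2 offspring are simply removed. The function and variable names are illustrative only.

```python
from fractions import Fraction

def offspring(parents):
    """Genotype frequencies among surviving offspring of randomly mated parents.

    `parents` maps "A1A1" / "A1A2" to frequencies. A2A2 zygotes are formed
    but removed, standing in for natural selection against that homozygote.
    """
    total = sum(parents.values())
    q = Fraction(1, 2) * parents.get("A1A2", Fraction(0)) / total  # frequency of A2
    p = 1 - q
    zygotes = {"A1A1": p * p, "A1A2": 2 * p * q}  # the q*q of A2A2 is lost
    survivors = sum(zygotes.values())
    return {genotype: freq / survivors for genotype, freq in zygotes.items()}

def mean_value(pop, a):
    """Population mean when A1A1 scores 0 and A1A2 scores a."""
    return pop["A1A2"] * a

A = 10  # hypothetical value of the heterozygote on the scale of the character

# Forward selection: only heterozygotes are chosen as parents each generation.
forward_means = []
for _ in range(3):
    pop = offspring({"A1A2": Fraction(1)})
    forward_means.append(mean_value(pop, A))

# Reversed selection: only A1A1 individuals are chosen as parents.
reversed_pop = offspring({"A1A1": Fraction(1)})
reversed_mean = mean_value(reversed_pop, A)

print(forward_means, reversed_mean)
```

Under these assumptions the surviving offspring of heterozygous parents are always one-third A1A1 and two-thirds A1A2, so the mean stays put however long forward selection continues, although the locus is still segregating. A single generation of reversed selection removes the A2 allele and the mean falls at once to the A1A1 value.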

The situation described above was proved to exist in one of the
lines of Drosophila selected for high bristle number in the experiment
illustrated in Fig. 12.1. There was a gene present which was lethal
in the homozygote and which in the heterozygote increased bristle
number by 22, which is 5.8 times the original phenotypic standard
deviation (Clayton and Robertson, 1957). The line carrying this
gene was the one whose distribution is shown in Fig. 12.4, and the
bimodality of the distribution can be seen. It seems probable that
in cases like this the gene does not have so large an effect in the original
population, but that the effect of the heterozygote is enhanced during
the selection, either by "modifying" genes or by a cross-over which
separates a linked gene whose effect is in the opposite direction. A
mechanism of this sort seems to be required to account for the very
great increase of variance often found in selected lines (F. W. Robertson
and Reeve, 1952a; Clayton and Robertson, 1957).

The selection of heterozygotes at one or a few loci with major
effects through the combined action of artificial and natural selection
in the manner explained above seems to be a common situation in
Drosophila populations at the selection limit. Whether it occurs as
frequently in other organisms is not known because the genetic
analyses required to detect it are more difficult to make. The increase
of variance in the mice selected for small size shown in Fig. 12.3 may
well have been due to this cause.

The deleterious effect on fitness is an essential part of the situa-
tion, so genes of this sort will always be at low frequencies in the
initial population. The appearance of any particular gene in a selected
line will therefore depend very much on the chances of sampling, or
on its occurring later by mutation. Consequently such genes will be a
cause of differences between replicated lines, such as we noted at the
beginning of this chapter in the experiment on Drosophila bristle
number, and they will render the selection limit to a large extent
unpredictable in its level and its precise genetic nature.

Relevance of selection limits to animal and plant improve-
ment. It may be thought that experimental studies of long continued
selection are of little relevance to the practice of selection in animal
and plant improvement, because the breeder is concerned only with
the first five or ten generations. This, however, is not necessarily so.
The breeds of animals and varieties of plants which he seeks to im-
prove have already been under selection for more or less the same
characters over a long time. They may therefore by now be approach-
ing, if they are not already at, the selection limits. An understanding
of the nature of the selection limit and of the behaviour of populations
at the selection limit may therefore be very relevant in the field of
practice.

CHAPTER 13

SELECTION:

III. Information from Relatives

In our consideration of selection we have up to now supposed that
individuals are measured for the character to be selected and that the
best are chosen to be parents in accordance with the individual pheno-
typic values. An individual's own phenotypic value, however, is not
the only source of information about its breeding value; additional
information is provided by the phenotypic values of relatives, particu-
larly by those of full or half sibs. With some characters, indeed, the
values of relatives provide the only available information. Milk-
yield, to take an obvious example, cannot be measured in males, so
the breeding value of a male can only be judged from the phenotypic
values of its female relatives. Ovarian response to gonadotropic
hormone, a character for which selection has been applied in rats
(Kyle and Chapman, 1953), cannot be measured on the living animal,
so selection can only be based on the phenotypic values of female
relatives. The use of information from relatives is of great importance
in the application of selection to animal breeding, for two reasons.
First, the characters to be selected are often ones of low heritability,
and with these the mean value of a number of relatives often provides
a more reliable guide to breeding value than the individual's own
phenotypic value. And, second, when the outcome of selection is a
matter of economic gain even quite a small improvement of the
response will repay the extra effort of applying the best technique.
In this chapter we shall outline the principles underlying the use of
information from relatives and the choice of the best method of
selection, but we shall not discuss the technical details of procedure
in the application of selection to animal breeding.

Methods of Selection

If the family structure of the population is taken into account we
can compute the mean phenotypic value of each family; this is known
as the family mean. Suppose, then, that we have a population in which
the individuals are grouped in families, which may be full or half sibs,
and we have measurements of each individual and of the means of
every family. A choice of procedure for applying selection to this
population is then open, according to the use we make of the family
means. Let us first look at the problem from the point of view of the
additional information provided by the values of relatives. Suppose,
for example, that we have an individual whose own value puts it on
the border-line between selection and rejection, and it has a number
of sibs with high values, so that the family to which it belongs has a
high mean. We may interpret the situation in one of two ways.
Either we may say that the individual's own rather poor value has
been due to poor environmental circumstances, and that the high
family mean suggests that its breeding value is likely to be a good deal
better than its phenotypic value. Or we may say that the high family
mean has been due to a favourable common environment, provided
perhaps by a good mother, from which the individual in question
must also have benefited; on this interpretation, therefore, the in-
dividual's breeding value is likely to be less good than its phenotypic
value. In the first case we should regard the information from the
relatives as favourable and we should select the individual in question,
while in the second case we should regard it as unfavourable and
should reject the individual. Here then is the problem: how do we
decide which is the correct interpretation? It turns out that only three
things need be known: the kind of family (whether full or half sibs),
the number of individuals in the families (i.e. the family size), and the
phenotypic correlation between members of the families with respect
to the character. The choice of method is thus a relatively simple
matter in practice. But the explanation of the principles underlying
the choice is more complicated. Before embarking on this explana-
tion we shall therefore give a brief general account of the different
methods of selection according to the use made of the information
from relatives, indicating the circumstances to which each method
is specially suited. Then we shall explain how the response expected
under each method is deduced; and finally we shall compare the
relative merits of the methods under different circumstances.

The phenotypic value of an individual, P, measured as a deviation
from the population mean, is the sum of two parts: the deviation of
its family mean from the population mean, Pf, and the deviation of the
individual from the family mean, Pw (the within-family deviation);
so that

$$P = P_f + P_w \qquad (13.1)$$

The procedure of selection, then, varies according to the attention
paid, or the weight given, to these two parts. If we select on the basis
of individual values only, as assumed in the last two chapters, we give
equal weight to the two components Pf and Pw of the individual's
value P. This is known as individual selection. We may, alternatively,
select on the basis of the family mean Pf alone, disregarding the
within-family deviation Pw entirely. This is known as family selection
and it corresponds to the procedure adopted in the first case discussed
above. Again, we may select on the basis of the within-family devia-
tion Pw alone, disregarding the family mean Pf entirely. This is
known as within-family selection and it corresponds to the second case
discussed above. Finally, we may take account of both components
Pf and Pw but give them different weights chosen so as to make the
best use of the two sources of information. This is known as selection
by optimum combination, or combined selection. It represents the
general solution for obtaining the maximum rate of response, and the
other three simpler methods are special cases in which the weights
given to the two sources of information are either 1 or 0. It is therefore
in principle always the best method. But its advantage over one
or other of the simpler methods is never very great, and it is a refine-
ment that is not often worth while in practice. Beyond showing why
this is so, we shall therefore not give very much attention to combined
selection.
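
The way the methods differ only in the weights given to Pf and Pw can be sketched as follows. The measurements are invented for illustration (three families of three individuals), the function names are hypothetical, and the weights shown are the 1 or 0 values of the three simpler methods; the intermediate weights of combined selection are not derived here.

```python
from statistics import mean

def selection_criterion(families, w_family, w_within):
    """Score every individual as w_family * Pf + w_within * Pw.

    Returns a dict keyed by (family index, individual index).
    """
    population_mean = mean(x for family in families for x in family)
    scores = {}
    for i, family in enumerate(families):
        family_mean = mean(family)
        p_f = family_mean - population_mean      # deviation of family mean, Pf
        for j, value in enumerate(family):
            p_w = value - family_mean            # within-family deviation, Pw
            scores[(i, j)] = w_family * p_f + w_within * p_w
    return scores

def select_best(families, w_family, w_within, k):
    """Return the k individuals with the highest criterion."""
    scores = selection_criterion(families, w_family, w_within)
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Invented measurements: three families of three individuals each.
families = [[14, 11, 8], [10, 9, 5], [7, 4, 4]]

individual = select_best(families, 1, 1, 3)     # Pf and Pw weighted equally
family = select_best(families, 1, 0, 3)         # family mean only
within_family = select_best(families, 0, 1, 3)  # within-family deviation only

print(individual, family, within_family)
```

With these figures individual selection takes the three highest measurements regardless of family, family selection takes the whole of the best family, and within-family selection takes the best member of each family.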

The salient features of the three simpler methods are as follows,
the differences of procedure between them being illustrated
diagrammatically in Fig. 13.1.

Individual selection. Individuals are selected solely in accord-
ance with their own phenotypic values. This method is usually the
simplest to operate and in many circumstances it yields the most rapid
response. It should therefore be used unless there are good reasons
for preferring another method. Mass selection is a term often used for
individual selection, especially when the selected individuals are put
together en masse for mating, as for example Drosophila in a bottle.
The term individual selection is used more specifically when the
matings are controlled or recorded, as with mice or larger animals.

Family selection. Whole families are selected or rejected as
units according to the mean phenotypic value of the family. Individual
values are thus not acted on except in so far as they determine
the family mean. In other words the within-family deviations are
given zero weight. The families may be of full sibs or half sibs, families
of more remote relationship being of little practical significance. The
use of full-sib families is dependent on a high reproductive rate and
with slow-breeding organisms half sibs must generally be used.

[Figure 13.1: four panels. (a) Individual selection; (b) Family
selection; (c) and (d) Within-family selection.]

Fig. 13.1. Diagram to illustrate the different methods of selection.
The dots and circles represent individuals plotted on a
vertical scale of merit, those with the best measurements being at
the top. The individuals to be selected are those shown as dots.
There are 5 families each with 5 individuals; (a), (b), and (c) show
identical arrangements of the same 25 individuals. The families
are separated laterally, with the individuals of each family placed
one above the other. The mean of each family is shown by a cross-bar.
The situation in which within-family selection is most useful
is shown in (d), where the variation between families is very great
in comparison with the variation within families. (Redrawn from
Falconer, 1957a.)

The chief circumstance under which family selection is to be pre-
ferred is when the character selected has a low heritability. The
efficacy of family selection rests on the fact that the environmental
deviations of the individuals tend to cancel each other out in the mean
value of the family. So the phenotypic mean of the family comes close
to being a measure of its genotypic mean, and the advantage gained is
greater when environmental deviations constitute a large part of the
phenotypic variance, or in other words when the heritability is low.
On the other hand, environmental variation common to members of a
family impairs the efficacy of family selection. If this component is
large, as illustrated in Fig. 13.1 (d), it will tend to swamp the genetic
differences between families and family selection will be corre-
spondingly ineffective. Another important factor in the efficacy of
family selection is the number of individuals in the families, or the
family size. The larger the family the closer is the correspondence
between mean phenotypic value and mean genotypic value. So the
conditions that favour family selection are low heritability, little
variation due to common environment, and large families.

There are practical difficulties in the application of family selec-
tion, particularly in laboratory populations. They arise from the
conflict between the intensity of selection and the avoidance of in-
breeding. It is generally desirable to keep the rate of inbreeding as
low as possible. If the minimum number of parents is fixed by con-
siderations of inbreeding — say at ten pairs — then under family
selection ten families must be selected, since each family represents
only one pair of parents in the previous generation. And, if a reason-
ably high intensity of selection is to be achieved, the number of
families bred and measured must be perhaps twice to four times this
number. Family selection is thus costly of space, and if breeding space
is limited the intensity of selection that can be achieved under family
selection may be quite small. The two following methods are variants
of family selection.

Sib selection. Some characters, we have already noted, cannot
be measured on the individuals that are to be used as parents, and
selection can only be based on the values of relatives. This amounts
to family selection but with the difference that now the selected indi-
viduals have not contributed to the estimate of their family mean.
The difference affects the way in which the response is influenced by
family size. Where the distinction is of consequence we shall use the
term sib selection when the selected individuals are not measured and
family selection when they are measured and included in the family
mean. When families are very large the two methods are equivalent,
and the term family selection is then to be understood to cover both.

Progeny testing is a method of selection widely applied in animal
breeding. We shall not discuss it in detail, except in so far as it
can be treated as a form of family selection. The criterion of selection,
as the name implies, is the mean value of an individual's progeny.
At first sight this might seem to be the ideal method of selection and
the easiest to evaluate because, as we saw in Chapter 7, the mean
value of an individual's offspring comes as near as we can get to a
direct measure of its breeding value, and is in fact the operational
definition of breeding value. In practice, however, it suffers from the
serious drawback of a much lengthened generation interval, because
the selection of the parents cannot be carried out until the offspring
have been measured. The evaluation of selection by progeny testing
is apt to be rather confusing because of the inevitable overlapping of
generations, and because of a possible ambiguity about which genera-
tion is being selected, the parents or the progeny. The progeny,
whose mean is used to judge the parents, are ready to be used as
parents just when the parents have been tested and await selection.
Thus both the selected parents and their progeny are used con-
currently as parents. The difficulty of interpretation may be partially
overcome by regarding progeny testing as a modified form of family
selection. The progenies are families, usually of half sibs, and selec-
tion is made between them on the basis of the family means in the
manner described above. The only difference is that the selected
families are increased in size by allowing their parents to go on breed-
ing. The additional, younger, members of the families do not con-
tribute to the estimates of the family means and are therefore selected
by sib selection. Increasing the size of the selected families by un-
measured individuals does not improve the accuracy of the selection,
but it reduces the replacement rate and so increases the intensity of
selection that can be achieved. This is the principal advantage of
progeny testing, but it can only be realised in operations on a large
scale, when the danger of inbreeding is not introduced by limitation
of space.

Within-family selection. The criterion of selection is the
deviation of each individual from the mean value of the family to
which it belongs, those that exceed their family mean by the greatest
amount being regarded as the most desirable. This is the reverse of
family selection, the family means being given zero weight. The chief
condition under which this method has an advantage over the others
is a large component of environmental variance common to members
of a family. Fig. 13.1 (d) shows how within-family selection would be
applied in this situation. Pre-weaning growth of pigs or mice might
be cited as examples of such a character. A large part of the variation
of individuals' weaning weights is attributable to the mother and is
therefore common to members of a family. Selection within families
would eliminate this large non-genetic component from the variation
operated on by selection. An important practical advantage of selec-
tion within families, especially in laboratory experiments, is that it
economises breeding space, for the same reason that family selection
is costly of space. If single-pair matings are to be made, then two
members of every family must be selected in order to replace the
parents. This means that every family contributes equally to the
parents of the next generation, a system that we saw in Chapter 4
renders the effective population size twice the actual. Thus when
selection within families is practised, the breeding space required to
keep the rate of inbreeding below a certain value is only half as great
as would be required under individual selection.
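
The saving of breeding space can be put in figures with a short sketch. It assumes the approximate relation between the rate of inbreeding and the effective population size, delta F = 1/(2 Ne), and treats the effective size as equal to the number of parents under individual selection and as twice that number when every family contributes equally. The first of these is an idealisation adopted here for comparison only, and the function names are hypothetical.

```python
import math
from fractions import Fraction

def effective_size(n_parents, equal_family_contribution):
    """Effective size: 2N if every family contributes equally, else taken as N."""
    return 2 * n_parents if equal_family_contribution else n_parents

def inbreeding_rate(n_parents, equal_family_contribution):
    """Approximate rate of inbreeding per generation, delta F = 1 / (2 * Ne)."""
    return Fraction(1, 2 * effective_size(n_parents, equal_family_contribution))

def parents_needed(max_rate, equal_family_contribution):
    """Smallest number of parents keeping delta F at or below max_rate."""
    ne_required = 1 / (2 * max_rate)
    n = ne_required / 2 if equal_family_contribution else ne_required
    return math.ceil(n)

limit = Fraction(1, 80)  # hypothetical ceiling of 1.25 per cent per generation
print(parents_needed(limit, equal_family_contribution=False))  # individual selection
print(parents_needed(limit, equal_family_contribution=True))   # within-family selection
```

On these assumptions the number of parents needed under within-family selection is half that needed under individual selection for the same ceiling on the rate of inbreeding.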

Expected Response

To evaluate the relative merits of the different methods of selec-
tion we have to deduce the response expected from each. There is
nothing to be added here about individual selection to what was said
in Chapter 11. The expected response was given in equation 11.3 as
$R = i\sigma_P h^2$, where $i$ is the intensity of selection (i.e. the selection
differential in standard deviations), $\sigma_P$ is the standard deviation, and
$h^2$ the heritability, of the phenotypic values of individuals. The
response expected under family selection or within-family selection
is arrived at in an analogous manner. Under family selection, the
criterion of selection is the mean phenotypic value of the members of
a family, so the expected response to family selection is

$$R_f = i\sigma_f h_f^2 \tag{13.2}$$

where $i$ is the intensity of selection, $\sigma_f$ is the observed standard
deviation of family means, and $h_f^2$ is the heritability of family means.
In the same way the expected response to within-family selection is

$$R_w = i\sigma_w h_w^2 \tag{13.3}$$

where $\sigma_w$ is the standard deviation, and $h_w^2$ the heritability, of
within-family deviations.

The concept of heritability applied to family means or to
within-family deviations introduces no new principle. It is simply the
proportion of the phenotypic variance of these quantities that is made
up of additive genetic variance. These heritabilities can be expressed
in terms of the heritability of individual values (which we shall
continue to refer to simply as the heritability, with symbol $h^2$), the
phenotypic correlation between members of families, and the number of
individuals in the families, all of which can be estimated by
observation. To arrive at the appropriate expressions we have to consider
again how the observational components of variance are made up of
the causal components, as explained in Chapters 9 and 10 (see in
particular Tables 9.4 and 10.4). First let us simplify matters by
supposing that all families contain a large number of individuals, so
that the means of all families are estimated without error. Consider
first the phenotypic variance. The intra-class correlation, $t$, between
members of families is the between-group component divided by the
total variance: $t = \sigma_B^2/\sigma_T^2$. Therefore the between-group component
can be expressed as $\sigma_B^2 = t\sigma_T^2$, and the within-group component as
$\sigma_W^2 = (1 - t)\sigma_T^2$. This expresses the partitioning of the phenotypic
variance into its observational components. The total variance,
written here as $\sigma_T^2$, is the phenotypic variance which we shall write
as $V_P$ in the context of causal components. Now, the partitioning of
the additive variance between and within families can be expressed
in the same way, in terms of the correlation of breeding values, for
which we shall use the symbol $r$. (The meaning of this correlation
will be explained in a moment.) Thus the additive variance between
families is $rV_A$ and the additive variance within families is $(1 - r)V_A$.
The dual partitioning is summarised in Table 13.1.

Table 13.1

Partitioning of the variance between and within families of
large size.

Observational component            Additive variance    Phenotypic variance
Between families, $\sigma_B^2$     $rV_A$               $tV_P$
Within families, $\sigma_W^2$      $(1 - r)V_A$         $(1 - t)V_P$

This partitioning of both the additive and the phenotypic variance
leads at once to the heritabilities of family means and of within-
family deviations, since these heritabilities are simply the ratios of
the additive variance to the phenotypic variance. Thus, when the
families are large, the heritability of family means is $rV_A/tV_P$, or $(r/t)h^2$,
since $V_A/V_P$ is the heritability of individual values, $h^2$.
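The dual partitioning of Table 13.1, and the large-family heritabilities that follow from it, can be checked with a few lines of arithmetic. The following is only a sketch: the function names and the numerical values of $V_A$, $V_P$ and $t$ are hypothetical, chosen for illustration and not taken from any experiment.

```python
def partition(v_a, v_p, r, t):
    """Between- and within-family parts of the additive and phenotypic variance.

    r is the correlation of breeding values and t the phenotypic intra-class
    correlation between members of (large) families.
    """
    between = {"additive": r * v_a, "phenotypic": t * v_p}
    within = {"additive": (1 - r) * v_a, "phenotypic": (1 - t) * v_p}
    return between, within


def large_family_heritabilities(v_a, v_p, r, t):
    """Ratio of additive to phenotypic variance in each row of the partition."""
    between, within = partition(v_a, v_p, r, t)
    h2_family_means = between["additive"] / between["phenotypic"]
    h2_within = within["additive"] / within["phenotypic"]
    return h2_family_means, h2_within


# Hypothetical character: V_A = 40, V_P = 100 (so h2 = 0.4), full sibs, t = 0.3.
h2_f, h2_w = large_family_heritabilities(v_a=40.0, v_p=100.0, r=0.5, t=0.3)
print(round(h2_f, 4), round(h2_w, 4))
```

With these assumed values the heritability of family means is $(r/t)h^2$ and that of within-family deviations is $\{(1-r)/(1-t)\}h^2$, as the ratios in the table require.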

The correlation of breeding values between members of families
is a measure of the degree of relationship, usually called the "coeffi-
cient of relationship." The correlation between the breeding values
of relatives in a random-mating population is twice their coancestry

$$r = 2f \tag{13.4}$$

that is to say, twice the inbreeding coefficient of their progeny if the
relatives were mated together. Its values in full-sib and half-sib
families can be seen from Table 9.4; for full sibs it is $\tfrac{1}{2}$ and for half
sibs it is $\tfrac{1}{4}$. In order to be able to discuss full-sib and half-sib families
at the same time in what follows, we shall retain the symbol $r$ in the
formulae instead of inserting the appropriate values of $\tfrac{1}{2}$ or $\tfrac{1}{4}$.

The foregoing account of the heritabilities of family means and
within-family deviations was simplified by the supposition of large
families. This simplification is not justified in practice and we must
now remove it by considering families of finite size. We shall, how-
ever, suppose that all families are of equal size. The number of
individuals in a family — called the family size — has to be taken into
consideration for the following reason. If selection is based on the
family mean, or on the deviations from the family mean, then it is the
observed mean that we are concerned with and not the true mean. In
other words we are not concerned with the observational components
of variance which we have hitherto discussed, but with the variance of
the observed means and of the observed within-family deviations.
The observed means of groups are subject to sampling variance which
comes from the within-group variance. If there are $n$ individuals in a
group then the sampling variance of the group-mean is $(1/n)\sigma_W^2$, where
$\sigma_W^2$ is the component of variance within the group. Thus the variance of
observed group-means is augmented by $(1/n)\sigma_W^2$, and the variance of
observed deviations within groups is correspondingly diminished by
the same amount. The observed variances, with family size $n$, are
therefore made up of the observational components as shown in
Table 13.2. The causal components entering into the observed
variances can now be found by translating the observational
components into causal components from Table 13.1. They are shown in
the two right-hand columns of Table 13.2.

Table 13.2
Composition of observed variances with families of size n.

                               Observational                           Causal components
Observed variance              components                              Additive                       Phenotypic
of family means                $\sigma_B^2 + \frac{1}{n}\sigma_W^2$    $\frac{1+(n-1)r}{n}V_A$        $\frac{1+(n-1)t}{n}V_P$
of within-family deviations    $\frac{n-1}{n}\sigma_W^2$               $\frac{(n-1)(1-r)}{n}V_A$      $\frac{(n-1)(1-t)}{n}V_P$
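The composition of the observed variances can be verified numerically. The sketch below assumes the entries of Table 13.2 in the form that follows from the argument above (sampling variance $(1/n)\sigma_W^2$ added to the variance of family means and taken from the within-family variance); the function name and the numbers are hypothetical.

```python
def observed_variances(v_a, v_p, r, t, n):
    """Additive and phenotypic parts of the observed variances for family size n."""
    family_means = {
        "additive": (1 + (n - 1) * r) / n * v_a,
        "phenotypic": (1 + (n - 1) * t) / n * v_p,
    }
    within_deviations = {
        "additive": (n - 1) * (1 - r) / n * v_a,
        "phenotypic": (n - 1) * (1 - t) / n * v_p,
    }
    return family_means, within_deviations


# Hypothetical values: V_A = 40, V_P = 100, full sibs (r = 1/2), t = 0.3, n = 8.
fam, dev = observed_variances(v_a=40.0, v_p=100.0, r=0.5, t=0.3, n=8)

# What is added to the variance of family means is removed from the variance
# of within-family deviations, so the two parts still sum to the totals.
total_additive = fam["additive"] + dev["additive"]
total_phenotypic = fam["phenotypic"] + dev["phenotypic"]
print(total_additive, total_phenotypic)
```

The check that the two rows add up to $V_A$ and $V_P$ is a convenient safeguard against an error of algebra when the table is used with other family sizes.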

To find the heritabilities of family means and of within-family
deviations we have only to divide the additive component by the
phenotypic component of the observed variances. Thus the herit-
ability of family means is

$$h_f^2 = \frac{1+(n-1)r}{1+(n-1)t}\,h^2$$

and the heritability of within-family deviations is

$$h_w^2 = \frac{1-r}{1-t}\,h^2$$

At this point sib selection has to be distinguished from family
selection. The foregoing account referred to family selection where
the individuals to be selected were themselves measured and contributed
to the observed family mean. Sib selection differs in that the individuals
selected are not measured. This does not affect the phenotypic com-
ponent, because this is simply the observed variance of what is
measured. But it does affect the additive component, because the
mean breeding value with which we are concerned is not that of the
individuals whose phenotypic values have been measured, but of
others that have not been measured. Therefore the appropriate
variance of mean breeding values is simply the between-family com-
ponent of additive variance, rVA, irrespective of the number of other
individuals that have been measured. The heritability of family
means appropriate to sib selection is therefore

$$h_s^2 = \frac{nr}{1+(n-1)t}\,h^2$$

The heritabilities of the different methods of selection, whose deriva-
tions have now been explained, are listed in Table 13.3.
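The three heritabilities just derived can be written as short functions. This is a minimal sketch, assuming the expressions obtained by dividing the additive by the phenotypic component of the observed variances (with $rV_A$ as the additive component for sib selection); the function names and the example values are hypothetical.

```python
def h2_family(h2, r, t, n):
    """Heritability of family means when the selected individuals are measured."""
    return h2 * (1 + (n - 1) * r) / (1 + (n - 1) * t)


def h2_sib(h2, r, t, n):
    """Heritability of family means when the selected individuals are not measured."""
    return h2 * n * r / (1 + (n - 1) * t)


def h2_within(h2, r, t):
    """Heritability of within-family deviations; family size cancels out."""
    return h2 * (1 - r) / (1 - t)


# Hypothetical case: h2 = 0.2, full sibs with no common environment (t = h2/2).
h2, r, t = 0.2, 0.5, 0.1
for n in (2, 10, 50):
    print(n, round(h2_family(h2, r, t, n), 3), round(h2_sib(h2, r, t, n), 3))
print(round(h2_within(h2, r, t), 3))
```

As the family size increases, both `h2_family` and `h2_sib` approach the large-family value $(r/t)h^2$.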

Table 13.3

Heritability and expected response under different methods
of selection.

Method of selection   Heritability                                  Expected response
Individual            $h^2$                                         $R = i\sigma_P h^2$
Family                $h_f^2 = h^2\,\frac{1+(n-1)r}{1+(n-1)t}$      $R_f = i\sigma_P h^2\,\frac{1+(n-1)r}{\sqrt{n\{1+(n-1)t\}}}$
Sib                   $h_s^2 = h^2\,\frac{nr}{1+(n-1)t}$            $R_s = i\sigma_P h^2\,\frac{nr}{\sqrt{n\{1+(n-1)t\}}}$
Within-family         $h_w^2 = h^2\,\frac{1-r}{1-t}$                $R_w = i\sigma_P h^2\,(1-r)\sqrt{\frac{n-1}{n(1-t)}}$
Combined                                                            $R_c = i\sigma_P h^2\,\sqrt{1+\frac{(r-t)^2}{1-t}\cdot\frac{n-1}{1+(n-1)t}}$

$i$ = intensity of selection (selection differential in standard measure):
assumed to be equal for all methods, but not necessarily so.
$\sigma_P$ = standard deviation in phenotypic values of individuals.
$h^2$ = heritability of individual values.
$r$: with full-sib families, $r = \tfrac{1}{2}$
with half-sib families, $r = \tfrac{1}{4}$
$t$ = correlation of phenotypic values of members of the families.
$n$ = number of individuals in the families.

To deduce the expected response is now a simple matter. Let us
take family selection for illustration. The expected response was
given in equation 13.2 as

$$R_f = i\sigma_f h_f^2$$

where $\sigma_f$ is the standard deviation of observed family means. This
expression, however, is not much use as it stands, because it does not
readily allow a comparison to be made with the other methods. It
will be most convenient to cast it into a form that facilitates
comparison with individual selection. This can be done by substituting the
expression for the heritability of family means, $h_f^2$, given above, and
by putting the standard deviation of observed family means, $\sigma_f$, in
terms of the standard deviation of individual phenotypic values,
$\sigma_P$ $(= \sqrt{V_P})$, from the right-hand column of Table 13.2. The expected
response then becomes

$$R_f = i\,\sigma_P\sqrt{\frac{1+(n-1)t}{n}} \times h^2\,\frac{1+(n-1)r}{1+(n-1)t}$$

which reduces to

$$R_f = i\sigma_P h^2\left[\frac{1+(n-1)r}{\sqrt{n\{1+(n-1)t\}}}\right]$$

The term $i\sigma_P h^2$ is equivalent to the expected response under
individual selection, so the expression within the square brackets is the
factor that compares family selection with individual selection. The
expression looks very complicated but it contains only three simple
quantities: $n$, which is the family size; $r$, which is $\tfrac{1}{2}$ for full-sib and
$\tfrac{1}{4}$ for half-sib families; and $t$, which is the phenotypic intra-class
correlation.

The expected responses under the different methods of selection
are listed in Table 13.3, all expressed in this manner which allows
the comparisons to be made with individual selection. The relative
merits of the different methods will be discussed in the next section:
first we must deal with combined selection.
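The comparison factors can be put into a form ready for computation. The sketch below assumes the expected responses as they follow from the heritabilities and observed variances derived above, each divided by $i\sigma_P h^2$; the combined-selection factor is written in the form usually quoted from Lush (1947) and should be treated as an assumption to be checked against Table 13.3. The function names are hypothetical.

```python
from math import sqrt


def rel_family(r, t, n):
    """R_f / R: family selection relative to individual selection."""
    return (1 + (n - 1) * r) / sqrt(n * (1 + (n - 1) * t))


def rel_sib(r, t, n):
    """R_s / R: sib selection relative to individual selection."""
    return n * r / sqrt(n * (1 + (n - 1) * t))


def rel_within(r, t, n):
    """R_w / R: within-family selection relative to individual selection."""
    return (1 - r) * sqrt((n - 1) / (n * (1 - t)))


def rel_combined(r, t, n):
    """R_c / R: combined selection relative to individual selection."""
    return sqrt(1 + (r - t) ** 2 / (1 - t) * (n - 1) / (1 + (n - 1) * t))


# Hypothetical case: full sibs, t = 0.1, families of 10.
r, t, n = 0.5, 0.1, 10
print(round(rel_family(r, t, n), 3), round(rel_sib(r, t, n), 3),
      round(rel_within(r, t, n), 3), round(rel_combined(r, t, n), 3))
```

A useful check on the algebra is that the square of the combined factor equals the sum of the squares of the family and within-family factors, as it should if the two parts of the phenotypic value supply independent information.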

Combined selection. We shall deal very briefly with combined
selection, referring the reader to Lush (1947), Lerner (1950) and
A. Robertson (1955a) for details. First we have to find what are the
appropriate weighting factors to be used in its application. We saw
before that the phenotypic value of an individual is made up of two
parts, the family mean and the within-family deviation, $P = P_f + P_w$,
and that each part gives some information about the individual's
breeding value. In Chapter 10 we saw that the heritability is
equivalent to the regression of breeding value on phenotypic value
(equation 10.2), so that the best estimate of an individual's breeding
value to be derived from its phenotypic value is $h^2 P$. This idea can
be applied separately to the two parts of the phenotypic value, since
these are uncorrelated and supply independent information about the
breeding value. Therefore, taking both parts of the phenotypic value
into account, the best estimate of an individual's breeding value is
given by the multiple regression equation

expected breeding value $= h_f^2 P_f + h_w^2 P_w$

($P_f$ being measured as a deviation from the population mean, and $P_w$
as a deviation from the family mean). The weighting factors that
make the most efficient use of the two sources of information are
therefore the two heritabilities, appropriate to family means and to
within-family deviations respectively. The criterion of selection
under combined selection is thus an index, $I$, in the form

$$I = h_f^2 P_f + h_w^2 P_w \tag{13.5}$$

If the values of the heritabilities are inserted from Table 13.3 it will
be seen that the term h2 is common to both weighting factors, and
this term may therefore be omitted without affecting the relative
weighting. We then have an index for the computation of which only
n, r, and t need be known. In practice it is more convenient to work
with the individual values in place of the within-family deviations,
and to assign them a weight of 1. The family mean is thus used in the
manner of a correction, supplementing the information provided by
the individual itself. Rearrangement of the appropriate weighting
factor for the family mean leads to an index made up as follows (Lush,
1947):

$$I = P + \left[\frac{r-t}{1-r}\cdot\frac{n}{1+(n-1)t}\right]P_f \tag{13.6}$$

where P is the individual value and Pf the family mean, in which the
individual itself is included.
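The weight given to the family mean in the index can be computed directly. The following sketch assumes the weight in the form $\{(r-t)/(1-r)\}\{n/(1+(n-1)t)\}$, which is what results from dividing the index of equation 13.5 through by the heritability of within-family deviations and replacing $P_w$ by $P - P_f$; the function names and the example values are hypothetical.

```python
def family_mean_weight(r, t, n):
    """Weight of the family mean when the individual value has weight 1."""
    return (r - t) / (1 - r) * n / (1 + (n - 1) * t)


def combined_index(p, p_f, r, t, n):
    """Index for one individual.

    p and p_f are the individual value and its family mean, both taken as
    deviations from the population mean; the individual is included in p_f.
    """
    return p + family_mean_weight(r, t, n) * p_f


# Hypothetical case: full sibs, t = 0.1, families of 10.
w = family_mean_weight(r=0.5, t=0.1, n=10)
print(round(w, 3))
print(round(combined_index(p=2.0, p_f=0.5, r=0.5, t=0.1, n=10), 3))
```

When $t$ exceeds $r$ the weight is negative, so that the family mean is then used to correct the individual value downwards; this is the situation in which within-family selection is favoured.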

This solution of the problem of how we can best make use of the
information provided by relatives is now cast in precisely the form
in which the problem was introduced at the beginning of this
chapter. The expression in the square brackets in equation 13.6,
which contains nothing but easily measurable quantities, shows how
we can best use the family mean to supplement the individual values
in making the selection.

The expected response to combined selection, cast in a form
suitable for comparison with individual selection, is given at the foot
of Table 13.3. For its derivation see Lush (1947).

Relative Merits of the Methods

The formulae for the expected responses that we have derived
enable us to compare one method of selection with another and dis-
cover what are the conditions that determine the choice of the best
method. Before making detailed comparisons let us note the reason
for individual selection being usually better than either family selection
or within-family selection. The reason is that the standard
deviations of family means and of within-family deviations are both
bound to be less than the standard deviation of individual values;
and the standard deviation of the criterion of selection is one of the
factors governing the response. If we compare, for example, family
selection with individual selection by writing the expected responses
in the form

$R = i\sigma_P h^2$ (for individual selection)
and $R_f = i\sigma_f h_f^2$ (for family selection)

then it is clear that family selection cannot be better than individual
selection unless the heritability of family means, $h_f^2$, is greater than
the heritability of individual values, $h^2$, by an amount great enough
to counterbalance the lower standard deviation of family means.
And the same applies to within-family selection.

A general picture of the circumstances that make one method
better than another can best be obtained from graphical representa-
tions of the relative responses: that is, the response expected from
one method expressed as a proportion of the response expected from
another, the expected responses being taken from Table 13.3. In
making these comparisons we shall assume that the intensity of
selection is the same for all methods. Though not necessarily true,
this simplification is unavoidable because no generalisation can be
made about the proportions selected under the different methods.
We shall make the comparisons separately for full-sib families ($r = \tfrac{1}{2}$)
and for half-sib families ($r = \tfrac{1}{4}$). Then the relative responses depend
only on two factors, the family size, $n$, and the intra-class correlation
of phenotypic values, t. If there is no variance due to common en-
vironment contributing to the variance of family means, then the
correlation in full-sib families is equal to half the heritability, and that
in half-sib families to one quarter of the heritability. This lets us see
in a general way how the heritability of the character influences the
relative response. It is, however, the correlation and not the herit-
ability that is the determining factor, so only the correlation need be
known when a choice of method is to be made.

Fig. 13.2 gives a general picture of all the methods, showing how
their relative merits depend on the phenotypic correlation. The
graphs refer only to full-sib families and only to the two extremes of
family size: infinitely large families in (a) and families of 2 in (b).
The comparisons are made here with combined selection since this is
necessarily the method that gives the greatest response. The graphs
therefore show the ratio of the response for each method to that for
combined selection: e.g. for family selection, the ratio $R_f/R_c$. The
general picture indicated by the graphs is as follows. The relative
merit of individual selection is greatest when the correlation is 0.5
and falls off as the correlation drops below or rises above this value.
The relative merit of family selection is greatest when the correlation
is low, and that of within-family selection when the correlation is
high.

Fig. 13.2. Relative merits of the different methods of selection,
with full-sib families. Responses relative to that for combined
selection plotted against the phenotypic intra-class correlation, t.
(a) n = ∞; (b) n = 2. I = individual selection; F = family selection;
W = within-family selection.

Now, a low correlation between sibs can only result from a
character of low heritability, and with very little variance due to
common environment. These therefore are the circumstances that
favour family selection. A high correlation can only result from a
large amount of variance due to common environment. Even if the
heritability were 100 per cent the correlation between full sibs could
not exceed 0.5 without augmentation by common environment. A
large amount of variation due to common environment is therefore
the circumstance that favours within-family selection. We shall
examine the three simpler methods in more detail in a moment.
First let us look at what may be gained from combined selection.
Though combined selection is always as good as or superior to any
other method, its superiority is never very great. With large families
its superiority is greatest when the correlation is close to 0.25 or 0.75,
but even then its superiority is not much more than 10 per cent.
With families of 2 its superiority reaches 20 per cent when the
correlation is 0.875. Thus the range of circumstances under which
combined selection is more than a few per cent better than one or
other of the simpler methods is very narrow. In general, therefore,
there is little to be gained from the extra trouble of applying combined
selection, and we shall not give it any further consideration.
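How little is lost by ignoring combined selection can be seen by tabulating, for a few correlations, the response of the best of the simpler methods as a fraction of the response to combined selection. The sketch below assumes the relative responses of Table 13.3 in the form reconstructed from the derivations of this chapter; the function names are hypothetical, and a family size of 1000 is used merely to stand in for very large families.

```python
from math import sqrt


def relative_responses(r, t, n):
    """Expected responses relative to individual selection (taken as 1)."""
    return {
        "individual": 1.0,
        "family": (1 + (n - 1) * r) / sqrt(n * (1 + (n - 1) * t)),
        "within": (1 - r) * sqrt((n - 1) / (n * (1 - t))),
        "combined": sqrt(1 + (r - t) ** 2 / (1 - t) * (n - 1) / (1 + (n - 1) * t)),
    }


def best_simple_over_combined(r, t, n):
    """Response of the best simpler method as a fraction of combined selection."""
    rel = relative_responses(r, t, n)
    best = max(rel["individual"], rel["family"], rel["within"])
    return best / rel["combined"]


# Full-sib families (r = 1/2), families of 2 and very large families.
for n in (2, 1000):
    for t in (0.125, 0.25, 0.5, 0.75, 0.875):
        print(n, t, round(best_simple_over_combined(0.5, t, n), 3))
```

The fraction never exceeds 1, and with these assumed formulae it falls to about 0.8 only for families of 2 with a correlation near 0.875.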

Let us now examine the simpler methods in more detail. The
most useful comparison to make now is with individual selection.
The expected responses will therefore be expressed as a proportion
of the response to individual selection. We shall examine each method
in turn, commenting on the special questions that arise in connexion
with each.

Family selection. Fig. 13.3 shows the relative response $R_f/R$
plotted against the family size, n, for full-sib families in (a) and
for half-sib families in (b). These graphs therefore show primarily the
effect of family size on the relative merit of family selection, but the
magnitude of the correlation, t, is taken into account by separate
curves for different correlations. Only the circumstances when family
selection is superior to individual selection are shown on the graphs.
The chief points made clear by the graphs are these. (i) As we saw
from Fig. 13.2, there is a critical value of the correlation, above which
family selection cannot be superior to individual selection. From the
expected responses in Table 13.3 it is easy to show that when the
families are large the relative response expected is $R_f/R = r/\sqrt{t}$. So,
with large families, family selection becomes superior to individual
selection when $r$ exceeds $\sqrt{t}$. The critical value of the correlation, $t$,
depends a little on the family size and differs between full-sib and
half-sib families. Family selection with full sibs is very little better
than individual selection unless the correlation is below 0.2; and with
half sibs unless it is below 0.05. (ii) The effect of family size is greatest
when the correlation is low. Therefore there is little to be gained
from very large families unless the correlation is well below the critical
value. There is, however, another consideration in connexion with
the family size which will be explained later. (iii) Finally, there is the
question whether full sibs or half sibs are to be preferred for family
selection. This depends so much on the special circumstances that
general conclusions cannot be drawn. From the graphs it would
appear that full sibs must always be better than half sibs. But the
full-sib correlation is more likely to be increased by common
environment, and full-sib families are likely to be a good deal smaller
than half-sib families. Both these factors work in favour of half-sib
families. It has been shown that in selection for egg-production in
poultry the factor of family size makes half-sib families superior to
full sibs (Osborne, 1957a).
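The critical value of the correlation mentioned under (i) can be found for any family size by setting the relative response $R_f/R$ equal to 1 and solving for $t$. The sketch below assumes the family-selection factor $[1+(n-1)r]/\sqrt{n\{1+(n-1)t\}}$; the function name is hypothetical.

```python
def critical_t(r, n):
    """Correlation at which family and individual selection respond equally.

    Found by putting [1 + (n-1)r] / sqrt(n[1 + (n-1)t]) = 1 and solving for t;
    family selection is the better method only when t lies below this value.
    Requires n >= 2.
    """
    return ((1 + (n - 1) * r) ** 2 / n - 1) / (n - 1)


for label, r in (("full sibs", 0.5), ("half sibs", 0.25)):
    for n in (2, 10, 50, 10**6):
        print(label, n, round(critical_t(r, n), 4))
```

With very large families the critical value approaches $r^2$, in agreement with the condition that $r$ must exceed $\sqrt{t}$.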

Fig. 13.3. Responses expected under family selection relative to
that for individual selection, plotted against family size, n. The
separate curves refer to different values of the phenotypic
correlation, t, as indicated. The corresponding values of the
heritability, h², in the absence of variation due to common environment,
are also given. (a) full-sib families; (b) half-sib families.

Sib selection. The use of this method is usually dictated by
necessity rather than by choice, and comparisons with other methods
are of less interest. The chief practical question that arises concerns
the family size: how many sibs should be measured? Or, how far is it
worth while increasing family size? The effect of family size on the
response to sib selection is shown in Fig. 13.4. The graphs show the
response with family size $n$ as a percentage of the response with
infinitely large families, which would be the maximum possible
response.

Fig. 13.4. Effect of family size on the response to sib selection,
with either full- or half-sib families. The expected response is
shown as a percentage of the response with infinitely large families.
The separate curves refer to different values of the phenotypic
correlation, t, as indicated.

The graphs are valid for both full and half sibs. Again the
effect of increasing family size is greatest when the correlation is low.
But with sib selection as with family selection there is another con-
sideration to be taken into account in connexion with the family size,
which will now be explained.
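The effect of family size on the response to sib selection can be computed from the sib-selection factor of Table 13.3. The sketch below assumes that factor to be $nr/\sqrt{n\{1+(n-1)t\}}$, with $r/\sqrt{t}$ as its limit for very large families; the function names and the chosen correlations are hypothetical.

```python
from math import sqrt


def relative_sib_response(r, t, n):
    """R_s / R for families of size n."""
    return n * r / sqrt(n * (1 + (n - 1) * t))


def percent_of_maximum(r, t, n):
    """Response with family size n as a percentage of that with very large families."""
    maximum = r / sqrt(t)  # limit of relative_sib_response as n increases
    return 100 * relative_sib_response(r, t, n) / maximum


for t in (0.05, 0.1, 0.25):
    row = [round(percent_of_maximum(0.5, t, n), 1) for n in (5, 10, 20, 40)]
    print(t, row)
```

The percentage reduces to $100\sqrt{nt/\{1+(n-1)t\}}$, from which $r$ has disappeared; this is why one set of curves serves for both full and half sibs.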

Optimal family size. Though the graphs suggest that the larger
the family size the greater will be the response, under both family
selection and sib selection, this is not so in practice because the in-
tensity of selection is involved as a factor in the following way. In
practice there is always a limitation on the amount of breeding space
or facilities for measurement. The total available space can be filled
with a large number of small families, or with a small number of large
families. Considerations of inbreeding set a lower limit to the number
of families that will be selected, so the larger the number of families
measured the greater will be the intensity of selection. Therefore
there is a conflict of advantage between the size of the families and
the intensity of selection: large families lead to a lower intensity of
selection. When the intensity of selection is taken into consideration
it turns out that there is an optimal family size which gives the
greatest expected response. The optimal family size with half-sib
families can be found approximately from the following simple
formula (A. Robertson, 1957b):

$$n = 0.56\sqrt{\frac{T}{Nh^2}} \tag{13.7}$$

where $n$ is the optimal family size, $T$ is the total number of individuals
that can be accommodated and measured, $N$ is the number of families
to be selected, and $h^2$ is the heritability of the character.
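The approximate formula for the optimal size of half-sib families is easily evaluated. The sketch below assumes the formula in the form $n = 0.56\sqrt{T/(Nh^2)}$, with the symbols as defined in the text; the function name and the numerical values are hypothetical and serve only as an illustration.

```python
from math import sqrt


def optimal_family_size(total, n_selected, h2):
    """Approximate optimal size of half-sib families.

    total      -- T, number of individuals that can be accommodated and measured
    n_selected -- N, number of families to be selected
    h2         -- heritability of the character
    """
    return 0.56 * sqrt(total / (n_selected * h2))


# Hypothetical facilities: room for 400 individuals, 4 families to be selected.
for h2 in (0.1, 0.25, 0.5):
    print(h2, round(optimal_family_size(400, 4, h2), 1))
```

The lower the heritability, the larger the families should be, and consequently the fewer of them can be measured in the space available.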

Within-family selection. Fig. 13.5 shows the relative response,
Rw/R, for within-family selection applied to full-sib families. Half-
sib families need not be considered since the method is unlikely to be
applied to them. The graphs show primarily the effect of the
phenotypic correlation, $t$, on the response. Four graphs are given
representing family sizes between 2 and 30, and it can be seen that the
family size does not have a great effect. The relative response when
the families are very large can be shown from the expected responses
given in Table 13.3 to be $R_w/R = (1-r)/\sqrt{1-t}$. So, with large
families, within-family selection will be superior to individual
selection if $(1-r)$ exceeds $\sqrt{1-t}$. The graphs in Fig. 13.5 show that the
correlation, $t$, in full-sib families would have to exceed about 0.75 to
0.85, according to the family size. Correlations as high as this cannot
arise without a large amount of variation due to common environ-
ment. Correlations high enough to make within-family selection
superior to individual selection are, however, not commonly found,
and the advantage of within-family selection therefore comes chiefly
from the reduced rate of inbreeding which was mentioned earlier.
Fig. 13.5 shows how much will be sacrificed in the rate of response if
within-family selection is applied. Most characters have full-sib
correlations below about 0.5, and within-family selection is then only
about half as effective as individual selection.
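The relative response to within-family selection, and the correlation above which it would surpass individual selection, can be computed for any family size. The sketch below assumes the within-family factor $(1-r)\sqrt{(n-1)/\{n(1-t)\}}$ from Table 13.3; the function names are hypothetical.

```python
from math import sqrt


def relative_within_response(r, t, n):
    """R_w / R for families of size n."""
    return (1 - r) * sqrt((n - 1) / (n * (1 - t)))


def critical_t_within(r, n):
    """Correlation above which within-family selection beats individual selection.

    Found by putting relative_within_response equal to 1 and solving for t.
    """
    return 1 - (1 - r) ** 2 * (n - 1) / n


# Full-sib families of the sizes used in the graphs.
for n in (2, 4, 10, 30):
    print(n, round(critical_t_within(0.5, n), 3),
          round(relative_within_response(0.5, 0.25, n), 3))
```

On these assumptions the critical correlation for full sibs runs from 0.875 with families of 2 down towards 0.75 as the families become large, in broad agreement with the range read from the graphs.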

Fig. 13.5. Response expected under within-family selection
relative to that of individual selection, plotted against the phenotypic
correlation, t. The separate curves refer to different family sizes
(n = 2, 4, 10 and 30), as indicated.

Weights to be attached to families of different size. Throughout
this chapter we have assumed that all families whose mean values
are to be used in selection have equal numbers of individuals in them;
i.e. n is the same for all families. This is a reasonable enough assump-
tion to make when we are considering the expected response from the
point of view of the planner who has to decide on the method of
selection to be applied. But, in practice, families are very seldom of
equal size and if we are to apply any method of selection based on
family means we are immediately faced with the problem of how to
make allowance for different numbers in the families. Obviously the
mean of a large family is more reliable than that of a small one, and
should be given more weight when the selection is being made. The
solution of the problem comes from a consideration of the heritability
as the regression of breeding value on phenotypic value. The best
estimate of the breeding value of a family is obtained by multiplying
the family mean (measured as a deviation from the population mean)
by the heritability of family means. The appropriate weighting factor
for family means is therefore the heritability of family means, cal-
culated separately for each family according to its size. Quantities
that are constant for all families may be omitted without altering the
relative weights. Thus, in the application of family selection, each
family mean, calculated as a deviation from the population mean,
should be weighted by $[1+(n-1)r]/[1+(n-1)t]$, and in sib selection by
$nr/[1+(n-1)t]$. The heritability of within-family deviations does not
contain the term $n$, and is therefore unaffected by family size. Thus no
weighting is required in the application of within-family selection. The
weighting factor to be used in combined selection has already been
given in equation 13.6.
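The weighting of family means for unequal family size may be illustrated as follows. The sketch assumes the weights $[1+(n-1)r]/[1+(n-1)t]$ for family selection and $nr/[1+(n-1)t]$ for sib selection; the function names, the family labels and the data are hypothetical.

```python
def family_weight(r, t, n):
    """Weight for a family mean under family selection."""
    return (1 + (n - 1) * r) / (1 + (n - 1) * t)


def sib_weight(r, t, n):
    """Weight for a family mean under sib selection."""
    return n * r / (1 + (n - 1) * t)


# Hypothetical full-sib families: (family size, mean as a deviation from the
# population mean).
families = {"A": (4, 2.0), "B": (12, 1.5), "C": (8, -0.5)}
r, t = 0.5, 0.1

scores = {name: family_weight(r, t, n) * mean for name, (n, mean) in families.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(scores)
print(ranking)
```

In this made-up example family B ranks above family A although its mean is lower, because the mean of the larger family is the more reliable guide to its breeding value.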

We conclude this chapter with an example from a laboratory
experiment which compared the responses actually obtained under
different methods of selection.

Example 13.1. In an experiment with Drosophila melanogaster
selection for abdominal bristle-number was made by three methods (Clayton,
Morris, and Robertson, 1957). The responses to individual selection at
different intensities were quoted in Example 11.2. Sib selection was also
applied in both full-sib and half-sib families and the responses compared
with expectation. Here we shall compare the responses under sib selection
with the response under individual selection, according to the formula in
Table 13.3. The same proportion of the population was selected in each
case, namely 20 per cent, but the intensities of selection under sib selection
were lower than under individual selection because there was a smaller
total number of families than of individuals: 10 half-sib families, 20
full-sib families, and 100 individuals. The intensity of selection under
individual selection was 1.40. Those under sib selection are given in the
table, together with the other data needed for calculating the expected
responses under sib selection relative to that under individual selection.

              Data                               Relative response, $R_s/R$
       Full sibs    Half sibs                    Full sibs    Half sibs
i      1.33         1.27          Exp.           0.832        0.614
r      0.50         0.275         Obs. up        0.618        0.527
t      0.265        0.121         Obs. down      0.919        0.635

In applying the formula from Table 13.3 we have to take account of the
intensity of selection, multiplying by the ratio of the intensity under sib
selection to the intensity under individual selection. It will be seen that
the correlation of breeding values, r, between half sibs is a little greater
than 1/4. This is because the females mated to a male were not entirely
unrelated to each other. The ratios of the responses expected and observed
are given in the right-hand half of the table. The expectation is that
individual selection should be the best method, and so it proved to be.
There is, however, some discrepancy between the upward and downward
responses, of which the reason is not known.
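The calculation of the expected relative responses in this example may be sketched as follows. The intensities and correlations are those of the table, but the family sizes are not stated in this passage: the values n = 12 for full sibs and n = 20 for half sibs used below are assumptions, chosen only because they reproduce the quoted expectations closely, and should not be read as the sizes actually used in the experiment. The function name is hypothetical.

```python
from math import sqrt


def expected_relative_sib_response(i_sib, i_individual, r, t, n):
    """Expected R_s / R, allowing for unequal intensities of selection."""
    factor = n * r / sqrt(n * (1 + (n - 1) * t))  # sib-selection factor, Table 13.3
    return (i_sib / i_individual) * factor


# i, r and t from the table; n is an ASSUMED family size (not given in the text).
full_sibs = expected_relative_sib_response(1.33, 1.40, r=0.50, t=0.265, n=12)
half_sibs = expected_relative_sib_response(1.27, 1.40, r=0.275, t=0.121, n=20)
print(round(full_sibs, 3), round(half_sibs, 3))
```

Under these assumed family sizes the computed expectations agree with the tabulated values of 0.832 and 0.614 to within rounding.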

CHAPTER 14

INBREEDING AND CROSSBREEDING:

I. Changes of Mean Value

We turn our attention now to inbreeding, the second of the two ways
open to the breeder for changing the genetic constitution of a popula-
tion. The harmful effects of inbreeding on reproductive rate and
general vigour are well known to breeders and biologists, and were
mentioned in Chapter 6 as one of the two basic genetic phenomena
displayed by metric characters. The opposite, or complementary,
phenomenon of hybrid vigour resulting from crosses between inbred
lines or between different races or varieties is equally well known, and
forms an important means of animal and plant improvement. The
production of lines for subsequent crossing in the utilisation of
hybrid vigour is one of two main purposes for which inbreeding may
be carried out. The other is the production of genetically uniform
strains, particularly of laboratory animals, for use in bioassay and in
research in a variety of fields. Inbreeding in itself, however, is almost
universally harmful and the breeder or experimenter normally seeks
to avoid it as far as possible, unless for some specific purpose. Men-
tion should be made here of naturally self-fertilising plants, to which
much of the discussion in this chapter is inapplicable. Since inbreed-
ing is their normal mating system they cannot be further inbred: they
can, however, be crossed, but they do not regularly show hybrid
vigour.

In the treatment of inbreeding given in Chapter 3 the conse-
quences were described in terms of the expected changes of gene
frequencies and of genotype frequencies. Here we have to show how
the changes of gene and genotype frequencies are expected to affect
metric characters. And at the same time we have to consider the
observed consequences of inbreeding and crossing, and see what
light they throw on the properties of the genes concerned with
metric characters. We shall first consider the changes of mean value
and then, in the next chapter, the changes of variance resulting from
inbreeding and crossbreeding. Finally, in Chapter 16, we shall
consider the combination of selection with inbreeding and crossbreeding
by means of which hybrid vigour may be utilised in animal and plant
improvement.

Inbreeding Depression

The most striking observed consequence of inbreeding is the
reduction of the mean phenotypic value shown by characters con-
nected with reproductive capacity or physiological efficiency, the
phenomenon known as inbreeding depression. Some examples of in-
breeding depression are given in Table 14.1, from which one can see
what sort of characters are subject to inbreeding depression, and —
very roughly — the magnitude of the effect. From the results of these
and many other studies we can make the generalisation that inbreed-
ing tends to reduce fitness. Thus, characters that form an important
component of fitness, such as litter size or lactation in mammals,
show a reduction on inbreeding; whereas characters that contribute
little to fitness, such as bristle number in Drosophila, show little or no
change.

In saying that a certain character shows inbreeding depression,
we refer to the average change of mean value in a number of lines.
The separate lines are commonly found to differ to a greater or lesser
extent in the change they show, as, indeed, we should expect in
consequence of random drift of gene frequencies. This matter of dif-
ferentiation of lines will be discussed later when we deal with changes
of variance. It is mentioned here only to emphasise the fact that the
changes of mean value now to be discussed refer to changes of the
mean value of a number of lines derived from one base population.
As in our earlier account of inbreeding we have to picture the "whole
population" consisting of many lines. The population mean then
refers to the whole population and inbreeding depression refers to a
reduction of this population mean. Let us now consider the theoreti-
cal basis of the change of population mean on inbreeding.

First, we may recall and extend some of the conclusions from
Chapter 3, supposing at first that selection does not in any way inter-
fere with the dispersion of gene frequencies. Since the gene fre-
quencies in the population as a whole do not change on inbreeding,
any change of the population mean must be attributed to the changes
of genotype frequencies. Inbreeding causes an increase in the frequen-
cies of homozygous genotypes and a decrease of heterozygous genotypes.

Table 14.1

Some Examples of Inbreeding Depression

The figures given show approximately the decrease of mean
phenotypic value per 10 per cent increase of the coefficient
of inbreeding: column (1) in absolute units; column (2) as
percentage of non-inbred mean; column (3) in terms of the
original phenotypic standard deviation (data not available
for all characters).

                                       Inbreeding depression per
                                           10% increase of F
                                      (1)               (2)    (3)
Character                             units              %     /σP

Cattle (A. Robertson, 1954)
  Milk-yield                          29.6 gal.          3.2   0.17

Pigs (Dickerson et al., 1954)
  Litter size at birth                0.38 young         4.6   0.15
  Weight at 154 days                  3.64 lb.                 0.12

Sheep (Morley, 1954)
  Fleece weight                       0.64 lb.           5.5   0.51
  Length of wool                      0.12 cm.           1.3   0.14
  Body weight at 1 year               2.91 lb.                 0.36

Poultry (Shoffner, 1948)
  Egg-production                      9.26 eggs          6.2   -
  Hatchability                        4.36 %             6.4   -
  Body weight                         0.04 lb.           0.8   -

Mice (Original data)
  Litter size at birth                0.60 young         8.0   0.28
  Weight at 6 weeks (♀♀)              0.58 gm.           2.6   0.26

Drosophila melanogaster (Tantawy and Reeve, 1956)
  Fertility (per pair per day)        2.2 offspring      3.7   -
  Viability (egg to adult)            2.6 %                    -
  Wing length                         2.8 (1/100 mm.)    1.4   0.80

Drosophila subobscura (Hollingsworth and Smith, 1955)
  Fertility (per pair per day)        6.0 offspring     12.5   -
  Egg hatchability                    8.3 %              8.3   -

Therefore a change of population mean on inbreeding must be con-
nected with a difference of genotypic value between homozygotes and
heterozygotes. Let us now see more precisely how the population
mean depends on the degree of inbreeding, which we may con-
veniently express as the inbreeding coefficient, F.

Consider a population, subdivided into a number of lines, with a
coefficient of inbreeding, F. The expression for the population mean
is derived by putting together the reasoning set out in Tables 3.1 and
7.1, in the following way. Table 14.2 shows the three genotypes of a
two-allele locus with their genotypic frequencies in the whole popula-
tion. These frequencies come from Table 3.1, p and q being the
gene frequencies in the whole population. Then the third column
gives the genotypic values assigned as in Fig. 7.1. The value and
frequency of each genotype are multiplied together in the right-hand
column, the summation of which gives the contribution of this locus
to the population mean.

Table 14.2

Genotype    Frequency        Value    Frequency x Value

A1A1        p^2 + pqF        +a       p^2 a + pqaF
A1A2        2pq - 2pqF       d        2pqd - 2pqdF
A2A2        q^2 + pqF        -a       -q^2 a - pqaF

                             Sum = a(p - q) + 2dpq - 2dpqF
                                 = a(p - q) + 2dpq(1 - F)

Thus, referring still to the effects of a single
locus, we find that a population with inbreeding coefficient F has a
mean genotypic value:

$$M_F = a(p - q) + 2dpq(1 - F) \tag{14.1}$$

$$M_F = M_0 - 2dpqF \tag{14.2}$$

where M0 is the population mean before inbreeding, from equation
7.2. The change of mean resulting from inbreeding is therefore
-2dpqF. This shows that a locus will contribute to a change of mean
value on inbreeding only if d is not zero; in other words if the value
of the heterozygote differs from the average value of the homozygotes.
This conclusion, though demonstrated in detail only for two alleles
at a locus, is equally valid for loci with more than two alleles. The
following general conclusions can therefore be drawn: that a change
of mean value on inbreeding is a consequence of dominance at the
loci concerned with the character, and that the direction of the change
is toward the value of the more recessive alleles. The dominance may
be partial or complete, or it may be overdominance; all that is neces-
sary for a locus to contribute to a change of mean is that the heterozy-
gote should not be exactly intermediate between the two homozygotes.
Equation 14.2 shows also that the magnitude of the change of mean
depends on the gene frequencies. It is greatest when pq is maximal:
that is, when p = q = 1/2. Genes at intermediate frequencies therefore
contribute more to a change of mean than genes at high or low fre-
quencies, other things being equal.
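
The single-locus result can be checked numerically. The following is only a sketch: the function names and the values chosen for a, d, F and p are assumptions made for illustration and do not come from the text. It sums frequency times value over the three genotypes of Table 14.2 and compares the result with equation 14.2, using exact fractions so that the comparison is not blurred by rounding.

```python
from fractions import Fraction

def genotype_frequencies(p, F):
    """Frequencies of A1A1, A1A2, A2A2 at inbreeding coefficient F (as in Table 14.2)."""
    q = 1 - p
    return (p * p + p * q * F, 2 * p * q - 2 * p * q * F, q * q + p * q * F)

def mean_by_summation(p, a, d, F):
    """Sum of frequency x value over the three genotypes (values +a, d, -a)."""
    f11, f12, f22 = genotype_frequencies(p, F)
    return f11 * a + f12 * d - f22 * a

def mean_by_formula(p, a, d, F):
    """Equation 14.2: M_F = M_0 - 2dpqF, where M_0 = a(p - q) + 2dpq."""
    q = 1 - p
    m0 = a * (p - q) + 2 * d * p * q
    return m0 - 2 * d * p * q * F

# Illustrative values only (assumed, not taken from the text).
a, d, F = Fraction(4), Fraction(2), Fraction(1, 4)
for p in (Fraction(1, 10), Fraction(1, 2), Fraction(9, 10)):
    change = mean_by_formula(p, a, d, F) - mean_by_formula(p, a, d, 0)
    # The two routes to the mean must agree exactly.
    assert mean_by_summation(p, a, d, F) == mean_by_formula(p, a, d, F)
    print(f"p = {p}: M_F = {mean_by_formula(p, a, d, F)}, change = {change}")
```

With these assumed values the change of mean is largest in absolute size at p = 1/2 and smaller, and equal, at p = 1/10 and p = 9/10, in line with the dependence on pq.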

Now let us consider the combined effect of all the loci that affect
the character. In so far as the genotypic values of the loci combine
additively, the population mean is given by summation of the contri-
butions of the separate loci, thus:

$$M_F = \sum a(p - q) + 2\left(\sum dpq\right)(1 - F) \tag{14.3}$$

$$M_F = M_0 - 2F \sum dpq \tag{14.4}$$

and the change of mean on inbreeding is $-2F \sum dpq$.

These expressions show what are the circumstances under which
a metric character will show a change of mean value on inbreeding.
The chief one is if the dominance of the genes concerned is pre-
ponderantly in one direction; i.e. if there is directional dominance.
If the genes that increase the value of the character are dominant
over their alleles that reduce the value, then inbreeding will result in
a reduction of the population mean, i.e. a change in the direction of
the more recessive alleles. The contribution of each locus, however,
depends also on its gene frequencies, those with intermediate fre-
quencies having the greatest effect on the change of mean value.
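
The role of directional dominance in the summed expression can be illustrated with a small calculation. This is a sketch under stated assumptions: the function name, the four loci, and their values of p and d are hypothetical, and additive combination of loci is taken for granted as in the text.

```python
from fractions import Fraction

def change_of_mean(loci, F):
    """Equation 14.4: change = -2F * sum(d * p * q); loci is a list of (p, d) pairs."""
    return -2 * F * sum(d * p * (1 - p) for p, d in loci)

half = Fraction(1, 2)
# Hypothetical loci: dominance all in one direction, or in opposing directions.
directional = [(half, 1), (half, 1), (half, 1), (half, 1)]
mixed = [(half, 1), (half, -1), (half, 1), (half, -1)]

for F in (Fraction(1, 4), Fraction(1, 2)):
    print(f"F = {F}: directional {change_of_mean(directional, F)}, "
          f"mixed {change_of_mean(mixed, F)}")
```

In this sketch the loci with dominance in opposing directions cancel exactly, so the mean does not change although every locus shows dominance, while the directional set gives a change proportional to F.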

We have now reached two conclusions about the effects of in-
breeding, one from observation — that inbreeding reduces fitness; the
other from theory — that the change is in the direction of the more
recessive alleles. Putting these two conclusions together leads to the
generalisation, already familiar from Mendelian genetics, that dele-
terious alleles tend to be recessive.

Another conclusion that can be drawn from equation 14.4 is that
when loci combine additively the change of mean on inbreeding
should be directly proportional to the coefficient of inbreeding. In
other words the change of mean should be a straight line when
plotted against F. Two examples of experimentally observed inbreed-
ing depression are illustrated in Fig. 14.1.

On the whole the observed inbreeding depression does tend to be
linear with respect to F, and this might be taken as evidence that
epistatic interaction between loci is not of great importance. There
are, however, several practical difficulties that stand in the way of
drawing firm conclusions from observations of the rate of inbreeding
depression. One is that as inbreeding proceeds and reproductive
capacity deteriorates, it soon becomes impossible to avoid the loss of

Fig. 14.1. Examples of inbreeding depression affecting fertility.
(a) Litter-size in mice (original data). Mean number born alive in
1st litters, plotted against the coefficient of inbreeding of the litters.
The first generation was by double-first-cousin mating; thereafter
by full-sib mating. No selection was practised. (b) Fertility in
Drosophila subobscura. Mean number of adult progeny per pair per
day, plotted against the inbreeding coefficient of the parents.
Consecutive full-sib matings. (Redrawn from Hollingsworth &
Smith, 1955.)

some lines. The survivors are then a selected group to which the
theoretical expectations no longer apply. Thus precise measurement
of the rate of inbreeding depression can generally be made only over
the early stages, before the inbreeding coefficient reaches high levels.
Another difficulty, met with particularly in the study of mammals,
arises from maternal effects. Maternal qualities are among the most
sensitive characters to inbreeding depression. The effect of inbreed-
ing on another character that is influenced by maternal effects is
therefore two-fold: part being attributable to the inbreeding of the
individuals measured and part to the inbreeding in the mothers. So
the relationship between the character measured and the coefficient
of inbreeding cannot be depicted in any simple manner. In consequence
of these difficulties reliable conclusions cannot easily be
drawn from the exact form of the inbreeding depression observed in
experiments.

Example 14.1. The complications arising from maternal effects may
be illustrated by litter size in pigs and mice. Litter size is a composite
character, which is partly an attribute of the mother and partly an attribute
of the young in the litter. It is therefore influenced both by the inbreeding
of the mother and by the inbreeding of the young, and these two influences
are difficult to disentangle in practice. Studies on pigs (Dickerson et al.,
1954) have shown that the reduction of litter size due to inbreeding in the
mother alone is about 0.20 young per 10 per cent of inbreeding; and the
reduction due to inbreeding in the young alone is about 0.17 young per 10
per cent of inbreeding. Thus the effects of inbreeding in the mother and in
the young are about equally important. A small experiment with mice
(original data) gave much the same picture. A rough separation of the
effects of inbreeding in the mother and in the young was made by means of
crosses between lines after 2 or 3 generations of sib mating. (The justifi-
cation for regarding this as a measure of the inbreeding depression will be
explained in the next section.) The mean litter sizes, arranged according
to the coefficient of inbreeding of the mothers and of the young, are given
in the table.

                                Inbreeding coefficient of mothers
                                   0%       37.5%      50%

Inbreeding coefficient    0%       8.2       7.5       7.3
of young                 50%                 6.3
                         59%                           5.8

The three comparisons in the first row show the effect of inbreeding in the
mothers, and give values of 0.19, 0.18 and 0.16 for the reduction of litter
size per 10 per cent of inbreeding. The comparisons in the second and
third columns show the effect of inbreeding in the young, and give values
of 0.24 and 0.25 for the reduction per 10 per cent of inbreeding. Thus
inbreeding in the young had rather more effect than inbreeding in the
mother. These results, however, should not be taken as being character-
istic of mice in general.
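
The reductions quoted in Example 14.1 can be reproduced from the five mean litter sizes. The sketch below assumes the table is read with rows as the inbreeding coefficient of the young (0, 50, 59 per cent) and columns as that of the mothers (0, 37.5, 50 per cent), an arrangement inferred from the comparisons described in the example; the names used are hypothetical.

```python
# Mean litter sizes keyed by (F of mothers %, F of young %), as read from the table.
LITTER_SIZE = {
    (0.0, 0.0): 8.2,
    (37.5, 0.0): 7.5,
    (50.0, 0.0): 7.3,
    (37.5, 50.0): 6.3,
    (50.0, 59.0): 5.8,
}

def reduction_per_10_percent(size_before, size_after, f_before, f_after):
    """Reduction of litter size per 10 per cent increase of inbreeding."""
    return 10.0 * (size_before - size_after) / (f_after - f_before)

def mothers_effects():
    """The three comparisons within the first row (young not inbred)."""
    row = [(f_m, LITTER_SIZE[(f_m, 0.0)]) for f_m in (0.0, 37.5, 50.0)]
    pairs = [(row[0], row[1]), (row[0], row[2]), (row[1], row[2])]
    return [reduction_per_10_percent(s1, s2, f1, f2) for (f1, s1), (f2, s2) in pairs]

def young_effects():
    """Comparisons within the second and third columns (mothers' F held fixed)."""
    effects = []
    for f_m, f_y in ((37.5, 50.0), (50.0, 59.0)):
        not_inbred = LITTER_SIZE[(f_m, 0.0)]
        inbred = LITTER_SIZE[(f_m, f_y)]
        effects.append(reduction_per_10_percent(not_inbred, inbred, 0.0, f_y))
    return effects

print([round(x, 2) for x in mothers_effects()])
print([round(x, 2) for x in young_effects()])
```

Rounded to two decimals, the values agree with those given in the example, which supports this reading of the table.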

The effect of selection. The neglect of selection during in-
breeding is an unrealistic omission because natural selection cannot
be wholly avoided even in laboratory experiments. Since inbreeding
tends to reduce fitness, natural selection is likely to oppose the in-
breeding process by favouring the least homozygous individuals.
The balance between selection and the dispersion of gene frequencies
was discussed in Chapter 4, and the only further point that need be
added here is that the operation of natural selection makes the in-
breeding depression dependent on the rate of inbreeding. One must
distinguish between the state of dispersion of gene frequencies and
the coefficient of inbreeding as computed from the population size or
the pedigree relationships. The state of dispersion is what determines
the amount of inbreeding depression; the coefficient of inbreeding is a
measure of the state of dispersion only in the absence of selection.
When selection operates, the state of dispersion will be less than that
indicated by the coefficient of inbreeding, and the discrepancy be-
tween the two will be greater when the rate of inbreeding is slower,
because the selection will then be relatively more potent. Therefore
one must expect the inbreeding depression caused by a given increase
of the computed coefficient of inbreeding to be less when inbreeding
is slow than when it is rapid.

Heterosis

Complementary to the phenomenon of inbreeding depression is
its opposite, "hybrid vigour" or heterosis. When inbred lines are
crossed, the progeny show an increase of those characters that previ-
ously suffered a reduction from inbreeding. Or, in general terms, the
fitness lost on inbreeding tends to be restored on crossing. That the
phenomenon of heterosis is simply inbreeding depression in reverse
can be seen by consideration of how the population mean depends on
the coefficient of inbreeding, as shown in equation 14.4. Consider, as
before, a population subdivided into a number of lines. If the lines
are crossed at random, the average coefficient of inbreeding in the
cross-bred progeny reverts to that of the base population. Thus, if a
number of crosses are made at random between the lines, the mean
value of any character in the cross-bred progeny is expected to be the
same as the population mean of the base population. In other words,
the heterosis on crossing is expected to be equal to the depression on
inbreeding. Furthermore, if the population is continued after the
crossing by random mating among the cross-bred and subsequent
generations, the coefficient of inbreeding will remain unchanged, and
the population mean is consequently expected to remain at the level
of the base population. We may, thus, make the following generalisation
on theoretical grounds: that, in the absence of selection, inbreeding
followed by crossing of the lines in a large population is not
expected to make any permanent change in the population mean.

Example 14.2. An experiment with mice (R. C. Roberts, unpublished)
was designed to test the theoretical expectation that in the absence of
selection the heterosis on crossing should be equal to the depression on
inbreeding. The character studied was litter size. Thirty lines taken from
a random-bred population were inbred by 3 consecutive generations of
full-sib mating, bringing the coefficient of inbreeding up to 50 per cent in
the litters and 37.5 per cent in the mothers. No selection was practised
during the inbreeding, and only 2 of the 30 lines were lost as a conse-
quence of their inbreeding depression.

                               Litter size
Before inbreeding                 8.1
Inbred (litters: F = 50%)         5.7
Cross-bred                        8.5

After the third generation of inbreeding, crosses were made at random
between the lines, and in the next generation crosses between the F1's were
made so as to give cross-bred mothers with non-inbred young. The mean
litter sizes observed at the different stages are given in the table. The
inbreeding depression was 2.4 and the heterosis 2.8; the two are equal
within the limits of experimental error.

Single crosses. The foregoing theoretical conclusions refer to
the average of a large number of crosses between lines derived from a
single base population. In practice, however, one is often interested
in a somewhat different problem, namely the heterosis shown by a
particular cross between two lines, or between two populations which
may have no known common origin. To refer the changes of mean
value to changes of inbreeding coefficient would be inappropriate
under these circumstances, and the theoretical basis of the heterosis is
better expressed in terms of the gene frequencies in the two lines.
We may recall from Chapter 3 that inbreeding leads to a dispersion of
gene frequencies among the lines, the lines becoming differentiated
in gene frequency as inbreeding proceeds; and the coefficient of
inbreeding is a means of expressing the degree of differentiation
(equation 3.14). In turning from the inbreeding coefficient to the
gene frequencies as a basis for discussion we are therefore turning
from the general, or average, consequence of crossing, to the particu-
lar circumstances in two lines.

Let us, then, consider two populations, referred to as the "parent
populations," both random-bred though not necessarily large. The
parent populations are crossed to produce an F1 or "first cross-bred
generation," and the F1 individuals are mated together at random to
produce an F2 or "second cross-bred generation." The amount of
heterosis shown by the F1 or the F2 will be measured as the deviation
from the mid-parent value, i.e. as the difference from the mean of the
two parent populations. First consider the effects of a single locus
with two alleles whose frequencies are p and q in one population, and
p' and q' in the other. Let the difference of gene frequency between
the two populations be y, so that y = p - p' = q' - q. The algebra is
then simplified by writing the gene frequencies p' and q' in the second
population as (p - y) and (q + y). Let the genotypic values be a, d, -a,
as before. They are assumed to be the same in the two populations,
epistatic interaction being disregarded. We have to find the
mean of each parent population and the mid-parent value; then the
mean of the F1 and the mean of the F2. The parental means, $M_{P_1}$ and
$M_{P_2}$, are found from equation 7.2. They are

$$M_{P_1} = a(p - q) + 2dpq$$

$$M_{P_2} = a(p - y - q - y) + 2d(p - y)(q + y) = a(p - q - 2y) + 2d[pq + y(p - q) - y^2]$$

The mid-parent value is

$$M_{\bar{P}} = \tfrac{1}{2}(M_{P_1} + M_{P_2}) = a(p - q - y) + d[2pq + y(p - q) - y^2] \tag{14.5}$$

When the two populations are crossed to produce the F1, individuals
taken at random from one population are mated to individuals
taken at random from the other population. This is equivalent
to taking genes at random from the two populations, as shown in
Table 14.3.

Table 14.3
Frequencies of Zygotes in the F1

                                 Gametes from P1
                                 A1            A2
                                 p             q

Gametes      A1    p - y         p(p - y)      q(p - y)
from P2      A2    q + y         p(q + y)      q(q + y)

The F1 is therefore constituted as follows:

Genotypes            A1A1         A1A2              A2A2
Frequencies          p(p - y)     2pq + y(p - q)    q(q + y)
Genotypic values     a            d                 -a

The mean genotypic value of the F1 is therefore:

$$M_{F_1} = a(p^2 - py - q^2 - qy) + d[2pq + y(p - q)] = a(p - q - y) + d[2pq + y(p - q)] \tag{14.6}$$

The amount of heterosis, expressed as the difference between the F1
and the mid-parent values, is obtained by subtracting equation 14.5
from equation 14.6:

$$H_{F_1} = M_{F_1} - M_{\bar{P}} = dy^2 \tag{14.7}$$

Thus heterosis, just like inbreeding depression, depends for its occur-
rence on dominance. Loci without dominance (i.e. loci for which
d = 0) cause neither inbreeding depression nor heterosis. The amount
of heterosis following a cross between two particular lines or popula-
tions depends on the square of the difference of gene frequency (y)
between the populations. If the populations crossed do not differ in
gene frequency there will be no heterosis, and the heterosis will be
greatest when one allele is fixed in one population and the other allele
in the other population.
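
The dependence of the F1 heterosis on the square of the difference of gene frequency can be verified directly from the zygote frequencies of Table 14.3. The sketch below uses hypothetical function names and assumed values of a, d and the gene frequencies; it builds the F1 mean from gametes drawn one from each parent population and compares the deviation from the mid-parent value with dy^2.

```python
from fractions import Fraction

def parent_mean(p, a, d):
    """Equation 7.2 for a random-bred population: M = a(p - q) + 2dpq."""
    q = 1 - p
    return a * (p - q) + 2 * d * p * q

def f1_mean(p1, p2, a, d):
    """Mean of the F1 from the zygote frequencies of Table 14.3."""
    q1, q2 = 1 - p1, 1 - p2
    freq_a1a1 = p1 * p2
    freq_a1a2 = p1 * q2 + q1 * p2
    freq_a2a2 = q1 * q2
    return a * freq_a1a1 + d * freq_a1a2 - a * freq_a2a2

def f1_heterosis(p1, p2, a, d):
    """Deviation of the F1 mean from the mid-parent value."""
    mid_parent = (parent_mean(p1, a, d) + parent_mean(p2, a, d)) / 2
    return f1_mean(p1, p2, a, d) - mid_parent

a, d = 3, 2  # assumed genotypic values, for illustration only
p1 = Fraction(3, 4)
for y in (Fraction(0), Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)):
    h = f1_heterosis(p1, p1 - y, a, d)
    assert h == d * y ** 2  # equation 14.7
    print(f"y = {y}: heterosis = {h}")
```

The value of a drops out of the heterosis altogether, and with the two alleles fixed in opposite populations (y = 1) the heterosis reaches its maximum of d.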

Now consider the joint effects of all loci at which the two parent
populations differ. In so far as the genotypic values attributable to
the separate loci combine additively, we may represent the heterosis
produced by the joint effects of all the loci as the sum of their separate
contributions. Thus the heterosis in the F1 is

$$H_{F_1} = \sum dy^2 \tag{14.8}$$

If some loci are dominant in one direction and some in the other their
effects will tend to cancel out, and no heterosis may be observed, in
spite of the dominance at the individual loci. The occurrence of
heterosis on crossing is therefore, like inbreeding depression, de-
pendent on directional dominance, and the absence of heterosis is not
sufficient ground for concluding that the individual loci show no
dominance.

Before we go on to consider the F2 it is perhaps worth noting that
the formulation of the heterosis in terms of the square of the difference
of gene frequency, in equations 14.7 and 14.8, is quite in line
with the previous formulation of the inbreeding depression in terms
of the coefficient of inbreeding. If we envisage once more the whole
population subdivided into lines, and we suppose pairs of lines to be
taken at random, then the mean squared difference of gene frequency
between the pairs of lines will be equal to twice the variance of gene
frequency among the lines. That is: $\overline{y^2} = 2\sigma_q^2$. And, by equation
3.14, $2\sigma_q^2 = 2pqF$. Therefore the mean amount of heterosis shown by
crosses between random pairs of lines is equal to the inbreeding
depression as given in equation 14.2, though of opposite sign.
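
The step from the squared difference of gene frequency to the variance among lines can be checked on a small set of lines. This is a sketch only: the six line frequencies and the value of d are invented for illustration, and pairs are formed by independent draws (so a line may be paired with itself), which idealises the case of very many lines.

```python
from fractions import Fraction
from itertools import product

def variance(freqs):
    """Variance of gene frequency among the lines."""
    n = len(freqs)
    mean = sum(freqs) / n
    return sum((x - mean) ** 2 for x in freqs) / n

def mean_squared_difference(freqs):
    """Mean of y^2 over all ordered pairs of lines, drawn independently."""
    pairs = list(product(freqs, repeat=2))
    return sum((x - y) ** 2 for x, y in pairs) / len(pairs)

# Hypothetical gene frequencies in six lines derived from one base population.
lines = [Fraction(k, 10) for k in (1, 3, 5, 5, 7, 9)]
p_bar = sum(lines) / len(lines)
q_bar = 1 - p_bar
F = variance(lines) / (p_bar * q_bar)  # equation 3.14 rearranged
d = 2                                  # assumed dominance deviation

mean_heterosis = d * mean_squared_difference(lines)  # average of d * y^2
depression = 2 * d * p_bar * q_bar * F               # size of the change in equation 14.2
print(mean_heterosis, depression)
```

Under these assumptions the two quantities are identical, which is the sense in which the average heterosis of random crosses equals the inbreeding depression.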

Now let us consider the F2 of a particular cross of two parent
populations, the F2 being made by random mating among the indi-
viduals of the F1. In consequence of the random mating, the geno-
type frequencies in the F2 will be the Hardy-Weinberg frequencies
corresponding to the gene frequency in the F1. The mean genotypic
value of the F2 is then easily derived by application of equation 7.2.
The gene frequency in the F1, being the mean of the gene frequencies
in the two parent populations, is (p - y/2) for one allele, and (q + y/2)
for the other. Putting these gene frequencies in place of p and q
respectively in equation 7.2 gives the mean genotypic value of the
F2 as:

$$M_{F_2} = a(p - \tfrac{1}{2}y - q - \tfrac{1}{2}y) + 2d(p - \tfrac{1}{2}y)(q + \tfrac{1}{2}y) = a(p - q - y) + d[2pq + y(p - q) - \tfrac{1}{2}y^2] \tag{14.9}$$

The amount of heterosis shown by the F2 is the difference between
the F2 and mid-parent values. So, from equations 14.5 and 14.9,

$$H_{F_2} = M_{F_2} - M_{\bar{P}} = \tfrac{1}{2}dy^2 = \tfrac{1}{2}H_{F_1} \tag{14.10}$$

We find therefore that the heterosis shown by the F2 is only half as
great as that shown by the F1. In other words, the F2 is expected to
drop back half-way from the F1 value toward the mid-parent value.
At first sight this conclusion may seem to contradict the one arrived
at earlier, when we were considering crosses between many lines, the
F1 and F2 means then being equal. The difference between the two
situations is that an F2 made by random mating among a large number
of different crosses has the same inbreeding coefficient as the F1.
But an F2 made from an F1 derived from a single cross has inevitably
an increased inbreeding coefficient. If the inbreeding coefficient is
worked out in the manner described in Example 5.2, it will be found
to be half the inbreeding coefficient of the parent lines. The change
of mean from Fx to F2 may therefore be regarded as inbreeding de-
pression. It cannot be overcome by having a large number of parents
of the F2 because the restriction of population size that causes the
inbreeding has already been made in the single cross of only two lines,
or parent populations. There need, however, be no further rise of the
inbreeding coefficient in the F3 and subsequent generations. Pro-
vided, therefore, that there is no other reason for the gene frequency
to change, the population mean will be the same in the generations
following as in the F2.
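
The halving of the heterosis from the F1 to the F2 can be followed in a short calculation. The sketch assumes a single locus, no selection, and random mating from the F1 onward; the function names and the numerical values are hypothetical.

```python
from fractions import Fraction

def hw_mean(p, a, d):
    """Equation 7.2: mean of a random-mating population with gene frequency p."""
    q = 1 - p
    return a * (p - q) + 2 * d * p * q

def heterosis_f1_f2(p1, p2, a, d):
    """Heterosis of the F1 and of the F2, each as a deviation from the mid-parent value."""
    mid_parent = (hw_mean(p1, a, d) + hw_mean(p2, a, d)) / 2
    q1, q2 = 1 - p1, 1 - p2
    # F1: one gamete from each parent population (Table 14.3).
    f1_mean = a * p1 * p2 + d * (p1 * q2 + q1 * p2) - a * q1 * q2
    # F2: Hardy-Weinberg frequencies at the mean gene frequency of the parents.
    f2_mean = hw_mean((p1 + p2) / 2, a, d)
    return f1_mean - mid_parent, f2_mean - mid_parent

h1, h2 = heterosis_f1_f2(Fraction(3, 4), Fraction(1, 4), a=3, d=2)
print(h1, h2)
```

Because the gene frequency stays at the mean of the parental frequencies under continued random mating, the same function gives the F3 and later means as well, so no further loss is expected in this idealised case.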

That the heterosis expected in the F2 is half that found in the F1
is equally true when the joint effects of all loci are considered, pro-
vided that epistatic interaction is absent. The conclusion for a single
locus was based on the principle that Hardy-Weinberg equilibrium
is attained by a single generation of random mating. It will be
remembered from Chapter 1 (p. 19), however, that this is not true
with respect to genotypes at more than one locus considered jointly.
Therefore if there is epistatic interaction, the population mean will
not reach its equilibrium value in the F2, but will approach it more or
less rapidly according to the number of interacting loci and the
closeness of the linkage between them. The existence of epistatic
interaction is intimately connected with the scale of measurement,
but this matter will not be discussed until Chapter 17. Here we need
only note that for reasons connected with the scale of measurement
the halving of the heterosis in the F2 expected on theoretical grounds
is not often found at all exactly in practice, though the F2 usually falls
somewhere between the F1 and mid-parent values. Some examples
from plants of the heterosis observed in the F1 and F2 generations are
illustrated in Fig. 14.2. It will be noticed that with some of the
characters shown, the F1 and F2 are lower in value than the mid-
parent, and the heterosis is consequently negative in sign. This is in
no way inconsistent with our definition of heterosis as the difference
between the F1 or F2 and the mid-parent value. The sign of the
difference depends simply on the nature of the measurement. For
example, the character "days to first fruit," represented in the lower
graphs, shows heterosis of negative sign: but if the character were
called "speed of development" and expressed as a reciprocal of time
the heterosis would be positive in sign.

The relative amount of heterosis observed in the F1 and F2
generations is complicated also by the existence of maternal effects,
particularly in mammals. A character subject to a maternal effect,

Fig. 14.2. Some illustrations of heterosis observed in crosses
between pairs of highly inbred strains of plants. The points show
the mean values of the two parent strains, the F1 and the F2
generations. The mid-parent values are shown by horizontal lines.
Graph (a) refers to tobacco, Nicotiana rustica (data from Smith,
1952). All the other graphs refer to tomatoes, Lycopersicon (data
from Powers, 1952). The characters represented are:

(a) Height of plant (in.)

(b) Mean weight of one fruit (gm.)

(e)-(h) Mean time in days between the planting of the seed and
the ripening of the first fruit, in 4 different crosses.

such as litter size, is divided between two generations. The maternally
determined component of the character may be expected to follow the
same general pattern of heterosis in the F1 and F2 as we have just
discussed, but it will be one generation out of phase with the non-
maternal part of the character. Thus the heterosis observed in the F1
is attributable to the non-maternal part, the maternal effect being still
at the inbred level. In the F2, however, the non-maternal part will
lose half the heterosis as explained above, but the maternal effect will
now show the full effect of its heterosis since the mothers are now in
the F1 stage. This rather complicated situation may perhaps be more

Fig. 14.3. Diagram of the heterosis expected in a character sub-
ject to a maternal effect, when two lines are crossed and the F2 is
made by random mating among the F1. The maternal and non-
maternal components of the character separately are here supposed
to show equal amounts of heterosis, and to combine by simple
addition to give the character as it is measured. (Horizontal axis:
generation. Curves: character as measured; non-maternal component;
maternal component.)

readily grasped from the diagrammatic representation in Fig. 14.3.
As a result of maternal effects, therefore, the loss of heterosis in the
F2 and subsequent generations is usually less noticeable with animals
than with plants, and experiments of great precision would be re-
quired to detect any regular pattern.
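
The one-generation lag of the maternal component can be laid out as a short calculation. The sketch follows the suppositions stated for Fig. 14.3, namely that the two components show equal heterosis and combine by simple addition; the function names and the unit amount of heterosis are assumptions made for illustration.

```python
def component_heterosis(generation, h):
    """Heterosis of one component: 0 in the parent lines, h in the F1, h/2 from the F2 on."""
    if generation <= 0:
        return 0.0
    if generation == 1:
        return h
    return h / 2

def measured_heterosis(generation, h_direct=1.0, h_maternal=1.0):
    """Character as measured: the individuals' own component plus their mothers' component."""
    direct = component_heterosis(generation, h_direct)
    # The mothers belong to the preceding generation.
    maternal = component_heterosis(generation - 1, h_maternal)
    return direct + maternal

for g, label in enumerate(["parents", "F1", "F2", "F3", "F4"]):
    print(label, measured_heterosis(g))
```

In this idealised scheme the measured character is highest in the F2, where the full maternal heterosis coincides with half of the non-maternal heterosis, and only afterwards settles to its final level.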

Wide crosses. We have seen that the amount of heterosis shown
by a particular cross depends, among other things, on the differences
of gene frequency between the two populations crossed. This would
seem to indicate that the amount of heterosis would increase with the
degree of genetic differentiation between the two populations and
would be limited only by the barrier of interspecific sterility. This,
however, is not true. Crosses between subspecies, or between local
races, taken from the wild often fail to show heterosis, particularly
in characters closely related to fitness which show heterosis in crosses
between less differentiated laboratory populations. Indeed the F1's
of wide crosses are often less fit than the parent populations. Much
of the evidence about such crosses comes from studies of wild
populations of Drosophila pseudoobscura and other species (see
Dobzhansky, 1950; Wallace and Vetukhiv, 1955). Though wide
crosses may not show heterosis in fitness, they do often show hetero-
sis in certain characters, particularly growth rate in plants. Dob-
zhansky (1950, 1952), who drew attention to this, refers to heterosis
in fitness as "euheterosis" and to heterosis in a character that does not
confer greater fitness as "luxuriance."

The error in extending our earlier conclusion to wide crosses
arises from the fact that we have assumed epistatic interaction be-
tween loci to be negligible, an assumption that is probably justified
for crosses between breeds of domestic animals or between laboratory
populations, but is obviously not justified in the case of crosses be-
tween differentiated wild populations. The existing genetic differen-
tiation between wild populations has, for the most part, arisen by
evolutionary adaptation to the local conditions. Adaptation to local
conditions or to a particular way of life involves many different
characters, both structural and functional, because the fitness of the
organism depends on the harmonious interrelations of all its parts.
If two populations adapted to different ways of life are crossed, the
cross-bred individuals will be adapted to neither, and will conse-
quently be less fit than either of the parent populations. The effect
of this evolutionary adaptation on the genetic structure of the popu-
lations is as follows. The genes A1 and B1, say, are selected in one
population because together they increase fitness, though either one
separately may not; while, in another population living under differ-
ent conditions, the genes A2 and B2 are selected for similar reasons.
In respect of fitness, therefore, there is epistatic interaction between
these two loci. But if these pairs of genes become fixed throughout the
two populations, A1 and B1 in one and A2 and B2 in the other, and so
become part of their constant genetic structure, the variation arising
from this interaction will disappear. Within any one population,
therefore, we may find very little epistatic variation, and the interaction
will become apparent as a cause of variation between individuals
only in a cross-bred population in which there is segregation at both
interacting loci.
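
A toy two-locus model may make this point concrete. Everything in the sketch is hypothetical: the fitness function, its constants, and the assumption that the two loci are unlinked so that F2 genotype frequencies are products of the single-locus frequencies. Fitness is raised when A1 accompanies B1 or A2 accompanies B2, and lowered for mismatched combinations.

```python
from fractions import Fraction
from itertools import product

def fitness(n_a1, n_b1, base=Fraction(1), k=Fraction(1, 10)):
    """Hypothetical fitness as a function of the numbers of A1 and B1 alleles (0, 1 or 2)."""
    return base + k * (n_a1 - 1) * (n_b1 - 1)

def mean_and_variance(genotype_freqs):
    """Mean fitness and variance of fitness for a dict {(n_a1, n_b1): frequency}."""
    mean = sum(f * fitness(na, nb) for (na, nb), f in genotype_freqs.items())
    var = sum(f * (fitness(na, nb) - mean) ** 2 for (na, nb), f in genotype_freqs.items())
    return mean, var

parent_1 = {(2, 2): Fraction(1)}  # A1 and B1 fixed
parent_2 = {(0, 0): Fraction(1)}  # A2 and B2 fixed
f1 = {(1, 1): Fraction(1)}        # all double heterozygotes
one_locus = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
f2 = {(na, nb): one_locus[na] * one_locus[nb] for na, nb in product(one_locus, repeat=2)}

for name, pop in [("parent 1", parent_1), ("parent 2", parent_2), ("F1", f1), ("F2", f2)]:
    print(name, mean_and_variance(pop))
```

In this model neither parent population shows any variation in fitness, the F1 is less fit than either parent, and variation attributable to the interaction appears only in the F2, where both loci segregate.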

The idea that the genetic structure of a natural population evolves
as a whole, so that the selection pressure on any one locus is depend-
ent on the alleles present at many of the other loci, is expressed in the
terms "coadaptation" and "integration," used to describe the genetic
structure of natural populations. (For general discussions of these
concepts, see Dobzhansky, 1951b; Lerner, 1954, 1958; Wright,
1956.) The important point for us to note is this. The property of
coadaptation, or integration, assumes primary importance only when
different populations are to be compared and when the results of
crossing adaptively differentiated populations are to be studied; it is
of less importance in the genetic study of a single population. In
this book we are chiefly concerned with the genetic variation within a
population: that is, the variation arising from the segregation of genes
in the population. Some of this variation arises from epistatic interaction
between the genes segregating at different loci, which is the
raw material, as it were, from which coadaptation could evolve if the
population were to become subdivided. But the amount of this epi-
static variation within a population is probably seldom very large,
and moreover it is seldom necessary to distinguish it from other
sources of non-additive genetic variance.


CHAPTER 15

INBREEDING AND CROSSBREEDING:

II. Changes of Variance

The effect of inbreeding on the genetic variance of a metric character
is apparent, in its general nature, from the description of the changes
of gene frequency given in Chapter 3. Again, we have to imagine the
whole population, consisting of many lines. Under the dispersive
effect of inbreeding, or random drift, the gene frequencies in the
separate lines tend toward the extreme values of 0 or 1, and the lines
become differentiated in gene frequency. Since the mean genotypic
value of a metric character depends on the gene frequencies at the
loci affecting it, the lines become differentiated, or drift apart, in
mean genotypic value. And, since the genetic components of vari-
ance diminish as the gene frequencies tend toward extreme values
(see Fig. 8.1), the genetic variance within the lines decreases. The
general consequence of inbreeding, therefore, is a redistribution of the
genetic variance; the component appearing between the means of
lines increases, while the component appearing within the lines
decreases. In other words, inbreeding leads to genetic differentiation
between lines and genetic uniformity within lines. The differentia-
tion is illustrated from experimental data in Fig. 15.1.

The subdivision of an inbred population into lines introduces an
additional observational component of variance, the between-line
component, and it is not surprising that this adds a considerable
complication to the theoretical description of the components of
genetic variance. Indeed, a full theoretical treatment of the redistri-
bution of variance has not yet been achieved. Here we shall attempt
no more than a brief description of the main outlines, and for this we
shall have to make some simplifications. In particular we shall
entirely neglect the interaction component of genetic variance arising
from epistasis. For detailed treatment of various aspects of the
problem, and for references, see Kempthorne (1957, Ch. 17). After
this description of the redistribution of genetic variance we shall
consider changes of environmental variance. The greater sensitivity
of inbred individuals to environmental sources of variation was
mentioned earlier, in Chapter 8. This phenomenon interferes with
the experimental study of the changes of variance, and until it is
better understood we cannot put much reliance on the theoretical
expectations concerning variance being manifest in the observable
phenotypic variance. Finally, in this chapter, we shall discuss the use
of inbred animals for experimental purposes.

Fig. 15.1. Differentiation between lines by random drift, shown
by abdominal bristle number in Drosophila melanogaster. The
graphs show the mean bristle number in each of 10 lines during
full-sib inbreeding without artificial selection. Horizontal axis:
generations of inbreeding. (From Rasmuson, 1952; reproduced by
courtesy of the author and the editor of Acta Zoologica.)

Redistribution of Genetic Variance

The redistribution of variance arising from additive genes (i.e.
genes with no dominance) is easily deduced. This is because with
additive genes the proportions in which the original variance is
distributed within and between lines do not depend on the original
gene frequencies. When there is dominance, however, we cannot
deduce the changes of variance without a knowledge of the initial
gene frequencies. This not only adds considerably to the mathematical
complexity, but it renders a general solution impossible. We shall
first consider the case of additive genes, and then very briefly indicate
the conclusions arrived at for dominant genes. The effect of selection
will not be specifically discussed. We need only note that natural
selection will tend to render the actual state of dispersion of gene
frequencies less than that indicated by the inbreeding coefficient
computed from the population size or pedigree relationships. There-
fore we must expect the redistribution of genetic variance to proceed
at a slower rate than the theoretical expectation, and we must expect
the discrepancy to be greater when inbreeding is slow than when it is
rapid.

No dominance. What follows refers to the variance arising from
additive genes: it does not apply to the additive variance arising from
genes with dominance. The conclusions therefore apply, strictly
speaking, only to characters which show no non-additive variance.
They serve, however, to indicate the general effect of inbreeding on
variance, and may be taken as a fair approximation to what is expected
of characters such as bristle number in Drosophila, that show little
non-additive genetic variance. The description to be given refers to
slow inbreeding, and is not strictly true of rapid inbreeding by sib-
mating or self-fertilisation. The redistribution of the variance under
rapid inbreeding is, however, not very different except in the first few
generations.

Consider first a single locus. When there is no dominance the
genotypic variance in the base population, given in equation 8.7,
becomes

$$V_G = 2p_0q_0a^2$$

The variance within any one line is

$$V_G = 2pqa^2$$

where $p$ and $q$ are the gene frequencies in that line. The mean
variance within lines is

$$V_{Gw} = 2(\overline{pq})a^2$$

where $(\overline{pq})$ is the mean value of $pq$ over all lines. Now,
$2(\overline{pq})$ is the overall frequency of heterozygotes in the whole
population, which, by Table 3.1, is equal to $2p_0q_0(1 - F)$, where $F$
is the coefficient of inbreeding. Therefore

$$V_{Gw} = 2p_0q_0a^2(1 - F) = V_G(1 - F)$$


and this remains true when summation of the variances is made over
all loci. Thus the within-line variance is $(1 - F)$ times the original
variance, and as F approaches unity the within-line variance approaches
zero.

Now let us consider the between-line variance. This is the vari-
ance of the true means of lines, and would be estimated from an
analysis of variance as the between-line component. For a single
locus, still with no dominance, the mean genotypic value of a line
with gene frequencies $p$ and $q$ is obtained from equation 7.2 as

$$M = a(p - q) = a(1 - 2q)$$

Thus we want to find the variance of $(a - 2aq)$. Now, in general,
$\sigma^2_{(X - Y)} = \sigma^2_X + \sigma^2_Y$, if $X$ and $Y$ are uncorrelated.
Since in this case $a$ is constant from line to line (epistasis being
assumed absent) it has no variance, and so

$$\sigma^2_M = \sigma^2_{(2aq)}$$

Again, in general, $\sigma^2_{KX} = K^2\sigma^2_X$ when $K$ is a constant. So

$$\sigma^2_M = 4a^2\sigma^2_q = 4a^2p_0q_0F \ \text{(from 3.14)} = 2FV_G$$

and this also remains true when summation is made over all loci.
Thus the between-line genetic variance is $2F$ times the genetic
variance in the base population.
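
The two results just derived can be checked numerically. The sketch below is an illustration only and is not part of the original treatment. It assumes an idealised line in which the 2N genes of each generation are drawn by binomial sampling from the previous one, so that F after t generations is 1 - (1 - 1/2N)^t, and it follows the exact distribution of gene frequency over lines for a single additive locus. The function names and the values used (2N = 20, initial frequency 0.3, a = 1, eight generations) are hypothetical.

```python
from math import comb


def drift_distribution(two_n, start_count, generations):
    """Exact distribution of the gene count in a line after binomial sampling."""
    # transition[i][j] = P(j copies next generation | i copies now)
    transition = []
    for i in range(two_n + 1):
        q = i / two_n
        transition.append(
            [comb(two_n, j) * q**j * (1 - q) ** (two_n - j) for j in range(two_n + 1)]
        )
    dist = [0.0] * (two_n + 1)
    dist[start_count] = 1.0
    for _ in range(generations):
        dist = [
            sum(dist[i] * transition[i][j] for i in range(two_n + 1))
            for j in range(two_n + 1)
        ]
    return dist


def variance_components(two_n, start_count, generations, a=1.0):
    """Mean within-line variance (2pq a^2) and variance of line means a(1 - 2q)."""
    dist = drift_distribution(two_n, start_count, generations)
    qs = [j / two_n for j in range(two_n + 1)]
    mean_q = sum(w * q for w, q in zip(dist, qs))
    var_q = sum(w * (q - mean_q) ** 2 for w, q in zip(dist, qs))
    within = sum(w * 2 * q * (1 - q) * a**2 for w, q in zip(dist, qs))
    between = 4 * a**2 * var_q
    return within, between


two_n, start_count, t, a = 20, 6, 8, 1.0      # hypothetical values
q0 = start_count / two_n
v_g = 2 * q0 * (1 - q0) * a**2                # variance in the base population
f_t = 1 - (1 - 1 / two_n) ** t                # inbreeding coefficient after t generations
within, between = variance_components(two_n, start_count, t, a)

print("within lines :", within, "expected", (1 - f_t) * v_g)
print("between lines:", between, "expected", 2 * f_t * v_g)
print("total        :", within + between, "expected", (1 + f_t) * v_g)
```

Under these assumptions the computed components agree with (1 - F)V_G and 2FV_G, apart from rounding error, whatever initial frequency is chosen, which is the sense in which the partitioning is independent of the original gene frequencies.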

The partitioning of the genetic variance into components as
explained above is summarised in Table 15.1.

Table 15.1

Partitioning of the variance due to additive genes in a
population with inbreeding coefficient $F$, when the variance
due to additive genes in the base population is $V_G$.

Between lines     $2FV_G$
Within lines      $(1 - F)V_G$
Total             $(1 + F)V_G$

The total genetic variance in the whole population is the sum of the
within-line and between-line components, and is equal to $(1 + F)$ times
the original genetic variance. (This is true also of close inbreeding.)
Thus when inbreeding is complete the genetic variance in the population
as a whole is doubled, and all of it appears as the between-line component.
The genetic variance within lines, before inbreeding is complete,
is partitioned within and between the families of which the lines are
composed. Under slow inbreeding with random mating within the
lines, it is partitioned equally within and between full-sib families.
The covariance of relatives within the lines is just as described in
Chapter 9, each line being a separate random-breeding population
with a total genetic variance of $(1 - F)V_G$, on the average. From this
we can deduce what the heritability is expected to be within any one
line. It will be $(1 - F)V_G / [(1 - F)V_G + V_E]$, and this reduces to

$$h^2_t = \frac{h^2_0(1 - F_t)}{1 - h^2_0F_t} \qquad (15.1)$$

where $h^2_t$ and $F_t$ are the heritability within lines and the inbreeding
coefficient at time $t$, and $h^2_0$ is the original heritability in the base
population. This shows how the heritability is expected to decline
with the inbreeding in a small population. The formula, however, is
applicable only to characters with no non-additive variance, and in
the absence of selection. The operation of natural selection renders
the reduction of the heritability less than expected, especially under
slow inbreeding. This point has been demonstrated experimentally
with Drosophila (Tantawy and Reeve, 1956).
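
The expected decline of heritability within lines can be tabulated with a few lines of code. The sketch below is illustrative only: the variance components (V_G = 40, V_E = 60) are hypothetical, and the function names are not from the text. It evaluates the heritability both from the variance components directly and from the reduced form in terms of the original heritability, on the assumptions stated above (no non-additive variance, no selection, constant environmental variance).

```python
def heritability_within_lines(h2_base, f):
    """Reduced form: h2_base * (1 - F) / (1 - h2_base * F)."""
    return h2_base * (1 - f) / (1 - h2_base * f)


def heritability_from_components(v_g, v_e, f):
    """Direct form: (1 - F)V_G / [(1 - F)V_G + V_E]."""
    within = (1 - f) * v_g
    return within / (within + v_e)


v_g, v_e = 40.0, 60.0            # hypothetical variance components
h2_base = v_g / (v_g + v_e)      # original heritability in the base population

for f in (0.0, 0.25, 0.5, 0.75, 0.9):
    direct = heritability_from_components(v_g, v_e, f)
    reduced = heritability_within_lines(h2_base, f)
    print(f"F = {f:.2f}  direct = {direct:.4f}  reduced = {reduced:.4f}")
```

The two forms give the same value at every F, and the decline is slow at first: with these figures the heritability has fallen only from 0.40 to 0.25 when F has reached one-half.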

Dominance. The components of variance arising from additive
genes will have been seen to be independent of the gene frequencies
in the base population. When we consider genes with any degree of
dominance, however, we find that the changes of variance on in-
breeding depend on the initial gene frequencies, and this makes it
impossible to give a general solution in terms of the genetic variance
present in the base population. We shall therefore do no more than
give the conclusions arrived at by A. Robertson (1952) for the case of
fully dominant genes, when the recessive allele is at low frequency.
This is the situation most likely to apply to variation in fitness arising
from deleterious recessive genes, though the effects of selection are
here disregarded. Fig. 15.2 shows the redistribution of variance
arising from recessive genes at a frequency of q = 0.1 in the base
population. Fig. 15.2(a) refers to full-sib mating with only one
family in each line, and Fig. 15.2(b) refers to slow inbreeding. A
surprising feature of the conclusions is that the within-line variance
at first increases, reaching a maximum when the coefficient of inbreeding
is a little under 0.5, and it remains at a fairly high level until
the coefficient of inbreeding approaches 1. The reason, in general
terms, for the apparent anomaly that the variation within lines in-
creases during the first stages of inbreeding, can be seen from a con-
sideration of the relationship between the gene frequency and the
variance arising from a dominant gene shown in Fig. 8.1(b).

Fig. 15.2. Redistribution of variance arising from a single fully
recessive gene with initial frequency q0 = 0.1. (a) with full-sib
mating, (b) with slow inbreeding. Axis labels: generations of
inbreeding; inbreeding coefficient. (From A. Robertson, 1952;
reproduced by courtesy of the author and the editor of Genetics.)

Vt = total genetic variance.
Vb = between-line component.
Vw = within-line component.
Va = additive genetic variance within lines.

The gene frequency is taken to start at a value of 0.1, and on inbreeding it
will increase in some lines and decrease in others, the increase being
on the average equal in amount to the decrease. But examination of
the graph shows that an increase of gene frequency by a certain
amount will increase the variance more than a decrease of the same
amount will reduce it. Therefore, on the average, the variance within
the lines will increase in the early stages of inbreeding. This increase
of variance would be detectable in practice only if a substantial part
of the genetic variance were due to recessive genes at low frequencies.
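
The asymmetry on which this argument rests can be seen in a small calculation. The sketch below is illustrative and rests on assumptions not stated in the text: random mating within the line, genotypic values of +a for both genotypes carrying the dominant allele and -a for the recessive homozygote, and an arbitrary change of 0.05 in gene frequency in either direction. The function name is hypothetical.

```python
def genotypic_variance_recessive(q, a=1.0):
    """Genotypic variance from one fully recessive gene at frequency q."""
    # (frequency, genotypic value) for the dominant homozygote,
    # the heterozygote and the recessive homozygote
    classes = [((1 - q) ** 2, a), (2 * q * (1 - q), a), (q**2, -a)]
    mean = sum(freq * value for freq, value in classes)
    return sum(freq * (value - mean) ** 2 for freq, value in classes)


q0, step = 0.1, 0.05             # starting frequency and a hypothetical change
base = genotypic_variance_recessive(q0)
gain = genotypic_variance_recessive(q0 + step) - base   # frequency rises
loss = base - genotypic_variance_recessive(q0 - step)   # frequency falls

print("variance at q0            :", base)
print("gain when q rises by step :", gain)
print("loss when q falls by step :", loss)
```

With these assumptions the gain exceeds the loss, so that equal and opposite changes of gene frequency raise the average variance within lines.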
Practical considerations. The extent to which the theoretical
changes of variance described in this chapter can be observed in
practice depends on how much environmental variance is present.
The precise estimation of variance requires a large number of observations
and the estimates obtained in practice are usually subject to
rather large deviations due to the chances of sampling. Consequently
the changes of variance must usually be quite substantial before they
are likely to be readily detected. The genotypic variance, moreover,
seldom constitutes the major part of the phenotypic variance.
Therefore, in relation to the original phenotypic variance, the expected
changes due to inbreeding are usually rather small, and this renders
their detection all the more difficult. Furthermore, the detection of
the expected changes of phenotypic variance is entirely dependent on
the constancy of the environmental variance, and this cannot be
assumed without evidence, as we shall show in the next section. For
these reasons, and also because of the simplifications we have had to
make, we must bear in mind the uncertainties in the connexion
between what is expected and what may be observed in the pheno-
typic variance.

Changes of Environmental Variance

Several times in previous chapters we have referred to the fact that
the environmental component of variance may differ according to
the genotype; in particular that inbred individuals often show more
environmental variation than non-inbred individuals. This fact has
been revealed by many experiments in which the variances of inbreds
and of hybrids have been compared. Any difference of phenotypic
variance between highly inbred lines and the F1 between them (i.e.
the "hybrid") must be attributed to a difference of the environmental
component, because the genetic variance is negligible in amount in
the hybrids as well as in the inbred lines. The greater susceptibility
of inbreds than of hybrids to environmental sources of variation has
been observed in a wide variety of characters and organisms. Some
examples are cited in Table 15.2; others will be found in the review
by Lerner (1954).

The cause of the greater environmental variance of inbreds is not
yet fully understood. It has been suggested that the possession of
different alleles at specific loci endows the hybrids with greater
"biochemical versatility" (Robertson and Reeve, 1952b), which
enables them to adjust their development and physiological mech-
anisms to the circumstances of the environment: in other words that
developmental and physiological homeostasis is improved by allelic
diversity. On the other hand, it has been suggested (Mather, 1953a)
that the reduced homeostatic power of inbreds is to be regarded as a
manifestation of inbreeding depression: homeostatic power is likely
to be an important aspect of fitness, and would therefore be expected,
like other aspects of fitness, to decline on inbreeding. The underlying
mechanism, we may presume, would be directional dominance,
genes that increase homeostatic power tending on the average to be
dominant over their alleles that decrease it. Lerner (1954) sees a
causal connexion between variability and fitness. He believes greater
stability to be a general property of heterozygotes and regards it as
the cause of their greater fitness. Though the increase of environmental
variance on inbreeding is a phenomenon of great theoretical
interest and some practical importance, too little is known about it to
justify a more detailed discussion of its causes here. Comprehensive
discussions will be found in Lerner (1954) and Waddington (1957).

Table 15.2

Comparisons of Phenotypic Variance in Inbreds and Hybrids

The figures are the averages of the inbred lines, and of the F1's
where more than one cross was made. (C.V.)² = squared coefficient
of variation.

Character and source                                   Inbreds    Hybrids

Drosophila melanogaster: wing length                     2.35       1.24
  (Robertson and Reeve, 1952b). (C.V.)².
  6 inbreds and 6 F1's

Mice: duration of "Nembutal" anaesthesia                 0.0665     0.0165
  (McLaren and Michie, 1956b). Log minutes.
  2 inbreds and 1 F1

Mice: age at opening of vagina
  (Yoon, 1955). Days.
  3 inbreds and 2 F1's

Mice: weight at ages given
  (Chai, 1957). (C.V.)².
  2 inbreds and 1 F1
    Birth
    3 weeks
    60 days

Rats: weight at 90 days
  (Livesay, 1930). (C.V.)².
  3 inbreds and 2 F1's

[Entries for the last three characters are incomplete in this copy; the surviving figures, whose rows cannot be identified, read 517, 17-4, 522, 170.]

There are, however, two further points in connexion with the
phenomenon that should be mentioned. The first is a technical
matter. If the mean value of the character differs between inbreds
and hybrids, as it frequently does, then it may be difficult to decide
on a proper basis for the comparison of the variances. It is necessary
to find a measure of the variance that does not merely reflect the
difference of mean value, and for this purpose the coefficient of
variation is often an appropriate measure. The problem is basically a
matter of the choice of scale, and will be discussed again in a later chapter.

The second point concerns the nature of the environmental
variation that is being measured. There is a distinction to be made
between the "developmental" variation arising from "accidents of
development" on the one hand, and adaptive responses to changed
conditions on the other. The developmental variation is a mani-
festation of incomplete buffering, or canalisation, of development and
is generally regarded as being harmful. Inbreds, in so far as they
show a greater amount of developmental variation, are therefore less
fit than hybrids; they are less well able to adjust their development to
different conditions of the environment so as to achieve the optimal
phenotype. An adaptive response, in contrast, is a modification of the
phenotypic value that is beneficial to the individual, such as for
example the thickening of the coat of mammals in response to low
temperature. If the greater fitness of hybrids over inbreds extends to
adaptive responses we should therefore expect hybrids to show more
variation of this sort than inbreds. Thus the nature of the environ-
mental variation has an important bearing on the interpretation of a
difference of variability between inbreds and hybrids.

Uniformity of Experimental Animals

Inbred strains of laboratory animals, particularly of mice, are
widely used as experimental material in pharmacological, physio-
logical, and nutritional laboratories, when uniformity of biological
material is desired. In some kinds of work, for example work which
demands the absence of immunological reactions, it is genetic uniformity
that is required, and abundant experience has shown that the
inbred strains of mice fully satisfy this requirement. In spite of
doubts about how effective natural selection for heterozygotes may be
in delaying the progress towards homozygosity, these strains have
been proved in practice to be genetically uniform. In the course of
their maintenance, however, strains inevitably become split up into

Chap. 15]

UNIFORMITY OF EXPERIMENTAL ANIMALS

273

sublines, and it is only within a subline that their genetic uniformity
can be relied on. Recent work, described in the two following

1920,

1930

1950

©o ©

WHITE = 5 VERTEBRAE (+ ASYMMETRICAL)
BLACK= 6 VERTEBRAE

Fig. 15.3. Differentiation between sublines of the C3H inbred
strain of mice, in the number of lumbar vertebrae. Each circle
represents a sample of individuals classified for the number of
lumbar vertebrae. The proportions of black and white in the
circles show the proportions of individuals with 6 and with 5
lumbar vertebrae respectively. (Small proportions of asymmetrical
individuals are included with the 5 -vertebra classes.) The circles
are positioned according to the date of clasification, and arranged
according to their pedigree relationships. (Data from McLaren
and Michie, 1954.)

examples, has revealed genetic differentiation within two widely used
strains of mice, and has shown that differences can sometimes be
detected between sublines separated by only a few generations.


Example 15.1. The inbred strain of mice known as C3H exhibits
variability in the number of lumbar vertebrae, and the sublines differ
markedly in this character. Some sublines consist entirely of mice with
5 vertebrae, others entirely of mice with 6, and others with different pro-
portions. The strain originated in 1920 and was split into three main
groups of sublines in about 1930, each group being later subdivided
further. The number of lumbar vertebrae has been studied in 16 sublines
maintained in America and Britain (McLaren and Michie, 1954). The
pedigree relationships between these sublines, and the proportions of the
two vertebral types in them, are shown in Fig. 15.3. One of the three main
groups of sublines has predominantly 6 lumbar vertebrae, and the other
two groups predominantly 5. This differentiation between the main
groups may have been due to residual segregation in the strain at the time
when the main groups became separated. The strain had, however, been
full-sib mated for 10 years — probably between 20 and 30 generations —
before the separation of the groups, and residual segregation therefore
seems unlikely. The sublines within the main groups are differentiated in
a manner that points to mutation rather than residual segregation as the
cause. The mutational origin of differentiation is more clearly proved in
the study described in the next example.

Example 15.2. Another inbred strain of mice, known as C57BL, has
been the subject of a thorough study by Grüneberg and co-workers (Deol,
Grüneberg, Searle, and Truslove, 1957; Carpenter, Grüneberg, and Russell,
1957). Twenty-seven skeletal characters were examined in four main
groups of sublines, three maintained in America and one in Britain, the
British group being studied in greater detail. The nature and extent of the
differentiation found cannot be easily summarised, and therefore we shall
only state the conclusions reached about the cause of the differentiation.
Each of the four main groups differed from the others in between 7 and 17
out of the 27 characters. The following conclusions were drawn: (1) The
differentiation could not reasonably be attributed to residual segregation
before the separation of the sublines; and segregation following an acci-
dental outcross was conclusively disproved. (2) Sublines that had been
separated for a longer time tended to differ by a greater number of charac-
ters than sublines more recently separated. But the magnitude of the
difference in any one character was no greater between long-separated
sublines than between sublines only recently separated. From this it was
concluded that the differences in each character were caused by mutations
at single loci. The average difference caused by one mutational step
amounted to about 0.6 standard deviation of the character affected.

The study cited in the above example shows that the differences
between sublines, though they may be readily detectable, are probably
caused by rather few loci. The differentiation is quite small in
comparison with the differences between strains or between indi-
viduals in a non-inbred population.

In much of the work for which inbred strains are used it is not
the genetic uniformity alone that matters, but the phenotypic uni-
formity. The more variable the animals the larger the number that
must be used to attain a given degree of precision in measuring their
mean response to a treatment. The value of uniformity is therefore
in reducing the number of animals that must be used in an experi-
ment or a test. Inbred animals, however, are costly to produce
because of their poor breeding qualities, and the advantage gained
from genetic uniformity has to be weighed against the extra cost of
the material. If the character to be measured is one of which the
phenotypic variance is chiefly environmental in origin, then the
absence of genetic variation in an inbred strain will reduce the pheno-
typic variance by only a small amount. The extra cost of the inbred
animals may then outweigh the advantage of their being slightly
more uniform than non-inbred animals. The phenotypic uniformity
of inbred animals, however, has been taken on trust from the genetical
theory of inbreeding, and it seems now that this trust has, to some
extent at least, been misplaced. In some characters inbred animals
are more phenotypically variable than non-inbred (see Table 15.2)
on account of their greatly increased environmental variation. It
seems now that for some, perhaps for many, characters the greatest
phenotypic uniformity is found in hybrids (i.e. F1's) produced by
crossing two inbred strains. The value of hybrids for work requiring
phenotypic uniformity has been discussed by Grüneberg (1954) and
by Biggers and Claringbold (1954).
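
The relation between variability and the number of animals required can be made concrete. The sketch below is an illustration under simple assumptions that are not from the text: the observations are independent, the standard error of a mean of n animals is the square root of V_P/n, and the two phenotypic variances (100 and 36, in arbitrary squared units) and the target standard error are hypothetical.

```python
from math import ceil


def animals_needed(phenotypic_variance, target_se):
    """Smallest n for which sqrt(phenotypic_variance / n) does not exceed target_se."""
    return ceil(phenotypic_variance / target_se**2)


target_se = 2.0                      # hypothetical precision required of the mean
more_variable = animals_needed(100.0, target_se)
less_variable = animals_needed(36.0, target_se)

print("animals needed, more variable stock:", more_variable)
print("animals needed, less variable stock:", less_variable)
```

The number required is proportional to the phenotypic variance, so that a stock is worth a higher price per animal only in proportion as its total phenotypic variance, and not merely its genetic variance, is reduced.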

One final point about the use of inbred and hybrid animals may
be noted. An inbred strain or the F1 of two inbred strains has a
unique genotype; and that of an inbred, moreover, is one that cannot
occur in a natural population. Testing the response to any treatment
on one inbred strain or one hybrid is therefore testing it on one geno-
type. If there are appreciable differences of response between
different genotypes, the experimenter is then not justified in describ-
ing his results as referring, for example, to "the mouse."

CHAPTER 16

INBREEDING AND CROSSBREEDING:

III. The Utilisation of Heterosis

The crossing of inbred lines plays a major role in the present methods
of plant improvement, though in animal improvement it plays a much
less important part. In this chapter the genetic principles underlying
the use of inbreeding and crossing will be explained, and the various
methods described in outline. Technical details, however, will not be
given: for these the reader should consult a textbook of plant breeding
(e.g. Hayes, Immer, and Smith, 1955). We shall be concerned with
outbreeding plants and with animals. But since at first sight the
methods applicable to naturally self-fertilising plants are super-
ficially rather like those applicable to outbreeding plants and animals,
it will be advisable first to consider very briefly the improvement of
self-fertilising plants.

Self-fertilising plants. Each variety of a naturally self-fertilising
plant is a highly inbred line, and the only genetic variation within it is
that arising from mutation. Genetic improvement can therefore be
made only by choosing the best of the existing varieties or by crossing
different varieties. The purpose of the crossing is to produce genetic
variation on which selection can operate. After a cross has been
made, the F1 and subsequent generations are allowed to self-fertilise
naturally. A new population, subdivided into lines, is thus made, and
the lines become differentiated as the inbreeding proceeds. Selection
is applied by choosing the best lines, which become new and im-
proved varieties. The essential point to note is that what is sought is
an improved inbred line, and not a superior crossbred generation: the
purpose of the crossing is to provide genetic variation and not to
produce heterosis. The process of crossing and selection among the
subsequent lines may be repeated cyclically. If two good lines are
selected out of the first cross, these may be crossed and a second cycle
of selection applied to the derived lines. The genetic properties of a
population derived from a cross of two highly inbred lines, such as
two varieties of a self-fertilising plant, are peculiar in that all segregating
genes have a frequency of 0.5 in the population as a whole.
This greatly simplifies the theoretical description of the variances and
covariances. Special methods of analysis applicable to such popula-
tions have been developed which lead to a separation of the additive,
dominance, and epistatic effects, and so provide a guide to the possi-
bilities of improvement in the population of lines derived from a
particular cross. For a description of these methods, see Mather
(1949), Hayman (1958), and Kempthorne (1957, Ch. 21) where other
references are given.

Outbreeding plants, and animals. Applied to naturally out-
breeding plants and to animals, the purpose of crossing inbred lines is
to produce superior cross-bred, or F1, individuals. The utilisation of
heterosis in this way depends on selection as well as on the inbreeding
and crossing. The selection is applied, in principle, to the crosses,
with the aim of finding pairs of lines that cross well, so that the lines
may be perpetuated and provide cross-bred individuals for com-
mercial use. In practice, however, the performance of the lines
themselves has to be taken into account, because the lines must be
reasonably productive if they are to be maintained and used for
crossing. This method has been very successful with plants, and has
led to an improvement of 50 per cent in the yield of maize grown
commercially in the United States, since hybrid seed started to be
used in the early 1930's (Mangelsdorf, 1951). Its success with
animals, however, has been much less notable. The reasons probably
lie chiefly in the greater amount of space and labour required by
animals and in their lower reproductive rate, both of which add
greatly to the difficulty of producing and testing the inbred lines.
During the inbreeding a large proportion of the lines die out from
inbreeding depression before a reasonably high degree of inbreeding
has been attained. Consequently the inbreeding programme must
start with a very large number of lines if enough are to be left after the
wastage to give some scope for the selection of good crosses. Another
point is that with plants that can be self-fertilised, such as maize, the
inbreeding proceeds much faster than with animals. To attain an
inbreeding coefficient of, say, 90 per cent would require only 4
years for maize, but 11 years for pigs or chickens, and about 50 years
for cattle with a 4- or 5-year generation interval.
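
These figures can be reproduced from the standard recurrence relations for the inbreeding coefficient. The sketch below is illustrative: it assumes a non-inbred base population of unrelated individuals, self-fertilisation for maize and continued full-sib mating for the animals, and generation intervals of one year for maize, pigs and chickens and 4.5 years for cattle. The generation intervals and the function names are assumptions made for the illustration.

```python
def selfing_series(generations):
    """Self-fertilisation: F_t = (1 + F_{t-1}) / 2."""
    f, series = 0.0, []
    for _ in range(generations):
        f = (1 + f) / 2
        series.append(f)
    return series


def full_sib_series(generations):
    """Full-sib mating: F_t = (1 + 2 F_{t-1} + F_{t-2}) / 4."""
    prev2, prev1, series = 0.0, 0.0, []
    for _ in range(generations):
        f = (1 + 2 * prev1 + prev2) / 4
        series.append(f)
        prev2, prev1 = prev1, f
    return series


def generations_to_reach(series, target):
    """First generation (counting from 1) at which F reaches the target."""
    for t, f in enumerate(series, start=1):
        if f >= target:
            return t
    return None


target = 0.9
selfing_gens = generations_to_reach(selfing_series(30), target)
full_sib_gens = generations_to_reach(full_sib_series(30), target)

# Assumed generation intervals in years (hypothetical round figures)
print("maize, selfing           :", selfing_gens * 1.0, "years")
print("pigs or chickens, sibbing:", full_sib_gens * 1.0, "years")
print("cattle, sibbing          :", full_sib_gens * 4.5, "years")
```

The difference between plants and animals thus arises from two sources: the slower approach to homozygosity under sib-mating than under selfing, and the longer generation interval.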

Let us now consider the genetic principles on which the utilisa-
tion of heterosis depends. It was shown in Chapter 14 that crosses
made at random between lines inbred without selection are expected
to have a mean value equal to that of the base population. This is
the reason why inbreeding and crossing alone cannot be expected to
lead to an improvement, but must be supplemented by selection. In
practice some improvement can be expected from the effects of
natural selection. It eliminates lethal and severely deleterious genes
during the inbreeding, and in so far as these genes affect the desired
character an improvement of the cross-bred mean over that of the
base population is to be expected. But this improvement will not be
very great, because the deleterious genes eliminated will have been at
low frequencies in the base population — and the more harmful, the
lower the frequency — so that their effect on the population mean will
be small. It has been calculated, on the basis of assumptions about
the number of loci concerned and their mutation rates, that an im-
provement of 5 per cent in fitness is the most that could be expected
from the elimination of deleterious recessive genes (Crow, 1948, 1952).
The bulk of the improvement, therefore, must come from artificial
selection applied to the economically desirable characters.

The crossing of inbred lines produces no genotypes that could not
occur in the base population. But whereas the best genotypes occur
only in certain individuals in the base population, they are replicated
in every individual of certain crosses. It is in this replication of a
desirable genotype that the chief merit of the method lies. Let us, for
simplicity, consider crosses between fully inbred lines. The gametes
produced by a highly inbred line are all identical, except for mutation.
And the gene content of the gametes of any one line could in principle
be found in a gamete from the base population. Therefore the genotype
of the F1 of two lines could in principle be found in an individual
of the base population. Thus, provided there has been no selection
during the inbreeding, a set of crosses made at random is genetically
equivalent to a set of individuals taken at random from the base popu-
lation; and the individuals of one cross are replicates of one individual
in the base population. This replication of a genotype in the indi-
viduals of a cross allows the genotypic value to be measured with little
error; whereas the genotypic value of an individual in the base popu-
lation is only crudely measured by its phenotypic value. Further, it is
the genotypic value that is measured in the cross and can be repro-
duced indefinitely, as long as the inbred lines are maintained; whereas
only the breeding value can be reproduced by selection of individuals
in a non-inbred population. Therefore the condition under which
inbreeding and crossing are likely to be a better means of improvement
than selection without inbreeding is when much of the genetic
variance of the character is non-additive.

The amount of improvement that can be made by selection among
a number of crosses depends on the amount of variation between the
crosses. The same relationship holds between the intensity of selec-
tion, the standard deviation, and the selection differential as was
described in Chapter 11 and illustrated in Fig. 11.3. In the following
section the variance between crosses made at random between pairs
of lines inbred without selection will be examined.

Variance between Crosses

The variance between crosses to be considered is the variance of
the true means of the crosses, or the between-cross component as
estimated from an analysis of variance. The variance of the observed
means will contain a fraction of the within-cross component for the
reasons explained in connexion with family selection in Chapter 13.
We shall assume that the experimental design has eliminated all
non-genetic sources of variation from the between-cross component.
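
The difference between the variance of true means and that of observed means can be put as a one-line sketch. It assumes the usual analysis-of-variance relation, in which the variance of observed means, each based on n individuals, is the between-cross component plus 1/n of the within-cross component. The function and parameter names are illustrative.

```python
def variance_of_observed_means(between, within, n):
    """Variance of observed cross means, each mean based on n individuals."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # The true between-cross component plus the sampling contribution
    # from variation within crosses.
    return between + within / n


if __name__ == "__main__":
    # Hypothetical components: the excess over `between` shrinks as n grows.
    for n in (1, 4, 16, 64):
        print(n, variance_of_observed_means(between=2.0, within=8.0, n=n))
```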

If the lines crossed are fully inbred there will be no genetic vari-
ance within the crosses, and the variance between crosses will be
equal to the genotypic variance in the base population, since each
cross is equivalent to an individual of the base population. When the
lines are only partially inbred, however, some genetic variance will
appear within the crosses, and the between-cross variance will be less
than with fully inbred lines. It is therefore important to know in
what manner the between-cross variance increases as inbreeding
proceeds, since this will tell us how much is to be gained by proceed-
ing to high levels of inbreeding.

We noted that crosses between fully inbred lines are genetically
equivalent to single individuals of the base population. Crosses
between partially inbred lines are analogous, not to individuals, but
to families, with degrees of relationship dependent on the inbreeding
coefficient of the lines. The variance between families can be formu-
lated in terms of the degree of relationship in the families (Kemp-
thorne, 1954), and this formulation may be extended to crosses by
regarding the crosses as families with a relationship depending on the
inbreeding coefficient of the lines. The following expression is then
obtained for the component of variance between crosses:

Between-cross variance $= F V_A + F^2 V_D + F^2 V_{AA} + F^3 V_{AD} + F^4 V_{DD} + \dots$ (16.1)

In this expression VA and VD are the additive and dominance vari-
ances in the base population; VAA, VAD and VDD are the interaction
components as explained in Chapter 8; and F is the inbreeding
coefficient of the lines as specified below. The interaction components
are included because epistasis may have important effects. Only
two-factor interactions, however, are shown: the higher interactions
have coefficients in correspondingly higher powers of F. (For every
A in the subscript there is a factor F, and for every D a factor $F^2$.)
The formulation in equation 16.1 is conditional on the following
specifications about how the crosses are made. 1. All lines have the
same coefficient of inbreeding. 2. All lines have independent ancestry
back to the base population; i.e. there is no relationship between the
lines. 3. Each cross is made from many individuals of the parent
lines; and these individuals are not related to each other within their
lines. This means that the genetic variance within the lines is fully
represented within the crosses. 4. The coefficient of inbreeding, F,
refers not to the individuals used as parents of the crosses, but to their
progeny if they were mated within their own lines; in other words, F
is the inbreeding coefficient of the next generation of the lines.

Let us now examine the expression 16.1 and consider what it tells
us about the variance between crosses. When the inbreeding coeffi-
cient is unity the between-cross variance is, as we have already stated,
simply the sum of all the components of genetic variance in the base
population. During the progress of the inbreeding the contribution
of the additive variance increases linearly with F; those of the dominance
variance and of A × A interactions increase with the square of
F; and the other interaction components with the third or fourth
power of F. This means that the dominance and interaction com-
ponents contribute proportionately more at higher levels of inbreed-
ing than at lower levels. If the character is one with predominantly
non-additive variance, the crosses will differ little in merit during the
early stages but will differentiate rapidly in the final stages. Since this
is the sort of character for which inbreeding and crossing is likely to
be the most effective means of improvement, it is clear that inbreed-
ing must be taken to a fairly high level if anything approaching its full
benefit is to be realised. Some idea of the level of inbreeding required
can be obtained by noting that with F = 0.5 the between-cross variance
is equal to the variance between full-sib families in the base
population. At this level of inbreeding, therefore, the best cross would
do no more than replicate the best full-sib family in a non-inbred
population.
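
The behaviour of expression 16.1 can be tabulated with a short sketch. The coefficients follow the rule given in the text (a factor F for every A in the subscript and F squared for every D). The expression used for the variance between full-sib families is the standard covariance of full sibs, which is assumed here rather than taken from this passage, and the numerical values of the components are hypothetical.

```python
def between_cross_variance(F, VA, VD, VAA=0.0, VAD=0.0, VDD=0.0):
    """Expression 16.1, truncated at two-factor interactions."""
    return F * VA + F**2 * VD + F**2 * VAA + F**3 * VAD + F**4 * VDD


def full_sib_family_variance(VA, VD, VAA=0.0, VAD=0.0, VDD=0.0):
    """Variance between full-sib families in a non-inbred population (assumed form)."""
    return VA / 2 + VD / 4 + VAA / 4 + VAD / 8 + VDD / 16


if __name__ == "__main__":
    # Hypothetical character with predominantly non-additive variance.
    comps = dict(VA=10.0, VD=40.0, VAA=20.0, VAD=20.0, VDD=10.0)
    total = between_cross_variance(1.0, **comps)
    for F in (0.25, 0.5, 0.75, 0.9, 1.0):
        share = between_cross_variance(F, **comps) / total
        print(f"F = {F:.2f}: {share:.0%} of the variance at full inbreeding")
    print("full-sib families:", full_sib_family_variance(**comps))
```

With components of this kind most of the between-cross variance appears only in the later stages of inbreeding, which is the point made in the paragraph above.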

Combining ability. The components of genetic variance making
up the between-cross variance that we have been discussing are
causal components, in the sense explained in Chapter 9. The vari-
ance between crosses, however, can also be analysed into observa-
tional components in the following way. Suppose a set of lines are
crossed at random, each line being simultaneously crossed with a
number of others. We can then calculate for each line its mean performance,
i.e. the mean value of the F1's in crosses with other lines.
This is known as the general combining ability of the line. The
performance of a particular cross may deviate from the average
general combining ability of the two lines, and this deviation is
known as the special (or specific) combining ability of the cross. Or, if
we measure the mean values as deviations from the general mean of
all crosses, we can express the value of a certain cross as the sum of
the general combining abilities of the two lines and the special
combining ability of the pair of lines. Thus the mean value of the
cross of line X with line Y is

$M_{XY} = \mathrm{G.C.}_X + \mathrm{G.C.}_Y + \mathrm{S.C.}_{XY}$ (16.2)

where G.C. and S.C. stand for the general and special combining
abilities. The variance between crosses can therefore be analysed
into two components: variance of general combining abilities and
variance of special combining abilities; the latter being, in statistical
terms, the interaction component.
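
A minimal sketch of how equation 16.2 might be applied to a table of cross means is given below. It assumes a set of lines crossed in all combinations, with one mean per pair and reciprocals pooled. It also assumes a least-squares adjustment (divisor n - 2) that is not described in the text: the simple mean of a line's crosses carries part of the general combining ability of its partners, and the adjustment removes this. The function name and data layout are hypothetical.

```python
from itertools import combinations


def combining_abilities(cross_means):
    """Split cross means into general mean, G.C. per line and S.C. per cross.

    `cross_means` maps a frozenset of two line names to the mean of that
    cross; every pair of lines must be present (no selfs, no reciprocals).
    """
    lines = sorted({name for pair in cross_means for name in pair})
    n = len(lines)
    if n < 3:
        raise ValueError("at least three lines are needed")
    for pair in combinations(lines, 2):
        if frozenset(pair) not in cross_means:
            raise ValueError(f"missing cross {pair}")

    grand_total = sum(cross_means.values())
    general_mean = 2.0 * grand_total / (n * (n - 1))

    # Total of all crosses in which each line takes part.
    line_total = {
        x: sum(m for pair, m in cross_means.items() if x in pair) for x in lines
    }
    # G.C. as a deviation from the general mean, adjusted for the partners.
    gc = {x: (n * line_total[x] - 2.0 * grand_total) / (n * (n - 2)) for x in lines}

    # S.C. is what remains of each cross mean after the general mean and
    # the two general combining abilities have been removed (equation 16.2).
    sc = {
        pair: m - general_mean - sum(gc[x] for x in pair)
        for pair, m in cross_means.items()
    }
    return general_mean, gc, sc
```

The returned values satisfy equation 16.2 exactly: each cross mean equals the general mean plus the two general combining abilities plus the special combining ability of the pair.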

The observational components of variance attributable to general
and special combining ability are made up of the causal components
in the following way.

Variance of crosses attributable to:

General combining ability $= F V_A + F^2 V_{AA} + \dots$
Special combining ability $= F^2 V_D + F^3 V_{AD} + F^4 V_{DD} + \dots$ (16.3)

So differences of general combining ability are due to the additive
genetic variance in the base population, and to A × A interactions;
and differences of special combining ability are attributable to the
non-additive genetic variance. Consequently the variance of general
combining ability increases linearly with F (apart from the interaction
component), while the variance of special combining ability increases
with higher powers of F. It is therefore the special, and not the
general, combining ability that is expected to increase more rapidly
as the inbreeding reaches high levels.
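
The way the two observational components change with inbreeding can be sketched as follows. The partition follows expression 16.3, truncated at two-factor interactions; the component values in the example are hypothetical.

```python
def combining_ability_variances(F, VA, VD, VAA=0.0, VAD=0.0, VDD=0.0):
    """Expression 16.3: (general, special) components at inbreeding coefficient F."""
    general = F * VA + F**2 * VAA
    special = F**2 * VD + F**3 * VAD + F**4 * VDD
    return general, special


if __name__ == "__main__":
    # Hypothetical base population with equal additive and dominance variance.
    for F in (0.25, 0.5, 0.75, 1.0):
        g, s = combining_ability_variances(F, VA=1.0, VD=1.0)
        print(f"F = {F:.2f}: special share = {s / (g + s):.2f}")
```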

Example 16.1. An analysis of egg-laying in crosses between highly
inbred lines of Drosophila melanogaster is reported by Gowen (1952).
Five lines were crossed in all ways, including reciprocals, and the numbers
of eggs laid by females in the fifth to ninth days of adult life were recorded.
The analysis of the crosses yielded the following percentage composition
of the variance of egg number:

Variance component                  % of total
General combining ability              11.3
Special combining ability               9.7
Differences between reciprocals         2.3
Within crosses                         76.6

Thus about half the variance between crosses was due to general, and half
to special, combining ability.
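
The arithmetic behind this statement is shown below. The sketch takes the variance between crosses to be the sum of the two combining-ability components, leaving the differences between reciprocals aside; this reading of the summary is an assumption.

```python
def combining_ability_shares(general, special):
    """Fractions of (general + special) variance due to each component."""
    between = general + special
    return general / between, special / between


if __name__ == "__main__":
    # Percentages of the total variance quoted in Example 16.1.
    g_share, s_share = combining_ability_shares(11.3, 9.7)
    print(f"general: {g_share:.2f}, special: {s_share:.2f}")
```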

Some of the methods of improvement by crossing aim at utilising
only the variance of general combining ability, and then the measure-
ment of the general combining ability of the lines becomes an im-
portant procedure. In addition to the making of specific crosses
between the lines, there are two other methods of measuring general
combining ability. A method convenient for use with plants is known
as the polycross method. A number of plants from all the lines to be
tested are grown together and allowed to pollinate naturally, self-
pollination being prevented by the natural mechanism for cross-
pollination, or by the arrangement of the plants in the plot. The seed
from the plants of one line are therefore a mixture of random crosses
with other lines, and their performance when grown tests the general
combining ability of that line. Another method, applicable also to
animals, is known as top-crossing. Individuals from the line to be
tested are crossed with individuals from the base population. The
mean value of the progeny then measures the general combining
ability of the line, because the gametes of individuals from the base
population are genetically equivalent to the gametes of a random set
of inbred lines derived without selection from the base population.

Chap. 16]

VARIANCE BETWEEN CROSSES

283

These methods are essentially methods for comparing the general
combining abilities of different lines, and so leading to the choice of
the lines most likely to yield the best cross, among all the crosses that
might be made between the available lines. But if much of the varia-
tion between crosses is due to special combining ability, then the
general combining ability of two lines will not provide a reliable
guide to the performance of their cross.

Methods of Selection for Combining Ability

The methods of improvement by inbreeding and crossing fall into
two groups, according to whether they are designed to utilise only
the variation in general combining ability or to utilise also the varia-
tion in special combining ability.

Selection for general combining ability. When the improve-
ment of general combining ability only is sought the procedure of
selection is much simplified. The general combining abilities of all
available lines can be measured, as already explained, without the
necessity of making and testing all the possible crosses between them.
Some selection can usefully be applied to the lines before they are
tested in crosses. There is some degree of correlation between a line's
performance as an inbred and its general combining ability, so a
proportion of lines can be discarded on the basis of their own per-
formance before the crosses are made. And, finally, there is less
to be lost by making the crosses at a relatively low coefficient of in-
breeding. Selection for general combining ability may be repeated
in cycles, a procedure known in plant breeding as recurrent selection.
(In animal breeding this term has come to have a different meaning,
as will be explained below.) Lines are inbred by self-fertilisation
for one or two generations and their general combining abilities
tested. The lines with the best general combining abilities are then
crossed and a second cycle of inbreeding and selection carried out.
A review of the progress made by this method is given by Sprague (1952).

The seed for commercial use is usually not made by a single cross
of two lines, but by a 3-way or 4-way cross. The object of this is to
overcome the generally low production of an inbred used as seed
parent. In a 3-way cross the F1 of two lines is used as seed parent and
crossed with a third inbred line. In a 4-way cross two F1's of different
pairs of lines are crossed. The performance of 3-way and 4-way
crosses can be reliably predicted from the performance of the con-
stituent single crosses.
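
The text does not say how the prediction is made. One common rule, assumed here for illustration, is to average the single crosses between lines that enter the final cross from opposite sides, which is the expectation if epistasis can be neglected. The function names and the layout of the table of single-cross means are hypothetical.

```python
def predict_three_way(single, a, b, c):
    """Predicted mean of (a x b) x c from the single crosses a x c and b x c."""
    return (single[frozenset((a, c))] + single[frozenset((b, c))]) / 2.0


def predict_four_way(single, a, b, c, d):
    """Predicted mean of (a x b) x (c x d) from the four non-parental single crosses."""
    pairs = [(a, c), (a, d), (b, c), (b, d)]
    return sum(single[frozenset(p)] for p in pairs) / 4.0


if __name__ == "__main__":
    # Hypothetical single-cross means, keyed by the unordered pair of lines.
    single = {
        frozenset(("A", "C")): 10.0, frozenset(("A", "D")): 12.0,
        frozenset(("B", "C")): 14.0, frozenset(("B", "D")): 8.0,
    }
    print(predict_three_way(single, "A", "B", "C"))
    print(predict_four_way(single, "A", "B", "C", "D"))
```

Note that the single crosses a x b and c x d themselves do not enter the prediction of the 4-way cross, since no individual of the final cross receives both its gametes from the same F1.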

Even though selection for general combining ability is widely
used in plant breeding and has abundantly proved its success, it is
not, perhaps, altogether clear why it is preferred to selection without
inbreeding, made either by individual selection or by family selection.
Since the variation in general combining ability is attributable to
additive variance in the population from which the lines were derived,
selection should be effective without inbreeding. Comparisons of the
two methods by experiment have not been made on a scale sufficient
to prove convincingly the superiority of selection with inbreeding
(see Robinson and Comstock, 1955).

Selection for general and specific combining ability. The
specific combining ability of a cross cannot be measured without
making and testing that particular cross. Therefore to achieve a
reasonably high intensity of selection for specific combining ability a
large number of crosses must be made and tested. Is no short-cut
possible? Could the superior combining ability not be, as it were,
built into the lines by selection? From the causes of heterosis ex-
plained in Chapter 14 it is clear that what is wanted is two lines that
differ widely in the gene frequencies at all loci that affect the character
and that show dominance. It should therefore be possible to build
up these differences of gene frequency in two lines by selection.
Instead of the differences of gene frequency being produced by the
random process of inbreeding, they would be produced by the directed
process of selection, which would be both more effective and more
economical. Two methods based on this idea have been devised.
These methods, though originating from plant breeding, provide —
in theory at least — the most hopeful means of utilising heterosis in
animals. We shall first describe the method known as reciprocal
recurrent selection, or simply as reciprocal selection. In outline, the
procedure is as follows.

The start is made from two lines, say A and B. (We shall call
them "lines" even though they will not be deliberately inbred.)
Crosses are made reciprocally, a number of A ♂♂ being mated to
B ♀♀, and a number of B ♂♂ to A ♀♀. The cross-bred progeny are then
measured for the character to be improved and the parents are judged
from the performance of their progeny. The best parents are selected
and the rest discarded, together with all the cross-bred progeny, which
are used only to test the combining ability of the parents. The selected
individuals must then be remated, to members of their own line, to pro-
duce the next generation of parents to be tested. These are crossed
again as before and the cycle repeated. It is seldom practicable to select
among the female parents, and the selection is chiefly applied to the
males. Each male is mated to several females of the other line so that
the judgment of his combining ability may be based on a reasonably
large number of progeny. Most of these females are needed to mate to
the selected males of their own line for the continuation of the line.
Deliberate inbreeding is avoided as far as possible, for the reason to be
explained below. The use of all the females as parents in their own lines
helps to reduce the rate of inbreeding and allows relatively few males to
be used, which intensifies the selection.

An essential prerequisite is that there should be some difference of
gene frequency between the two lines at the beginning, or else selec-
tion for combining ability will be unable to produce a differentiation
of the lines. Any locus at which the gene frequencies are the same in
the two lines will be in equilibrium, though an unstable equilibrium.
Any shift in one direction or the other will give the selection something
to act on and the difference will be increased. The initial difference
between the lines may be obtained by starting from two different
breeds or varieties, choosing two that already cross well; or by de-
liberate inbreeding, up to perhaps 25 per cent, and relying on random
differentiation of gene frequencies.

Though the performance of the cross is expected to increase
under this method of selection, the performance of the lines them-
selves in respect of the character selected is expected to decrease, for
this reason. Characters to which selection would be applied in this
way are those subject to inbreeding depression and heterosis; that is
to say, those in which dominance is directional. The changes of gene
frequency brought about by the selection are toward the extremes,
and consequently the mean values of the lines will decline for the
reasons explained in connexion with inbreeding in Chapter 14. This
decline in the performance of the lines, however, should not be quite
as deleterious as the effects of deliberate inbreeding. Inbreeding, as a
random process, affects all loci, and the mean values of all characters
showing directional dominance decline. But under reciprocal selec-
tion it is only the selected character that should decline, except in so
far as linked loci are carried along. Nevertheless, reproductive fitness
is nearly always a component of economic value, and it is doubtful
how far the distinction will hold. This, however, is the reason why
deliberate inbreeding of the lines is to be avoided.
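
The contrast between the cross and the lines themselves can be illustrated with a one-locus sketch. It assumes, as a simplification not stated in this passage, a single locus with genotypic values +a, d and -a for the two homozygotes and the heterozygote, random mating within each line, and gene frequencies p1 and p2 that diverge symmetrically from 0.5. All names and numerical values are hypothetical.

```python
def line_mean(p, a, d):
    """Mean of a random-mating line with frequency p of the allele of value +a."""
    q = 1.0 - p
    return a * (p - q) + 2.0 * d * p * q


def cross_mean(p1, p2, a, d):
    """Mean of the cross between lines with gene frequencies p1 and p2."""
    q1, q2 = 1.0 - p1, 1.0 - p2
    return a * (p1 * p2 - q1 * q2) + d * (p1 * q2 + q1 * p2)


def heterosis(p1, p2, a, d):
    """Cross mean minus mid-parent value; algebraically d * (p1 - p2) ** 2."""
    mid_parent = (line_mean(p1, a, d) + line_mean(p2, a, d)) / 2.0
    return cross_mean(p1, p2, a, d) - mid_parent


if __name__ == "__main__":
    a, d = 1.0, 1.5                      # hypothetical overdominant locus (d > a)
    for shift in (0.0, 0.1, 0.3, 0.5):
        p1, p2 = 0.5 + shift, 0.5 - shift
        lines = (line_mean(p1, a, d) + line_mean(p2, a, d)) / 2.0
        cross = cross_mean(p1, p2, a, d)
        print(f"p1 = {p1:.1f}, p2 = {p2:.1f}: lines = {lines:.3f}, cross = {cross:.3f}")
```

As the gene frequencies of the two lines move apart the cross mean rises and the average of the line means falls, and the heterosis depends on the square of the difference of gene frequency.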

The second method is simpler in procedure than reciprocal
selection described above. It was devised as a modification of recur-
rent selection, intended to utilise special as well as general combining
ability (Hull, 1945), and as yet it has no distinctive name. It is known
variously as "Hull's modification of recurrent selection," "recurrent
selection to inbred tester," "recurrent selection for special combining
ability," and in animal breeding simply as "recurrent selection." It
differs from reciprocal selection in the following way. Instead of
starting with two lines and selecting both for combining ability with
the other, one starts with only one line and selects it for combining
ability with a "tester" line which has previously been inbred. This
reduces the amount of effort spent on the testing, and is expected to
yield more rapid progress at the beginning because the initial differ-
ences of gene frequency between the line and the tester are likely to
be more marked. But the ultimate gain is expected to be less than
under reciprocal selection, because the general combining ability of
the tester line is predetermined, and only the general combining
ability of the selected line and the special combining ability of the
cross can be improved.

The two methods of selection for special combining ability de-
scribed in this section are comparatively new methods of improvement
and very little practical experience of them has yet been gained. The
account of them given here is consequently based almost entirely on
theory. Theoretical assessments of their merits in relation to other
methods have been made by Comstock, Robinson, and Harvey
(1949) and by Dickerson (1952). Though on theoretical grounds
they seem promising, the results of the only experiments so far pub-
lished (Bell, Moore, and Warren, 1955; Rasmuson, 1956) are not
encouraging.

Before we leave the subject of inbreeding we must give some
further consideration to the particular genetic property that makes
selection with inbreeding and crossing preferable to selection without
inbreeding. From the theoretical point of view, and leaving all prac-
tical considerations aside, the crucial genetic property is over-
dominance of the genes concerned. The following section is devoted
to a consideration of overdominance and its significance.

OVERDOMINANCE

Overdominance is the property shown by two alleles when the
heterozygote lies outside the range of the two homozygotes in
genotypic value with respect to the character under discussion. Its
meaning was illustrated in Fig. 2.3 with respect to fitness as the
character, and it has been mentioned from time to time in other
chapters. We saw in Chapter 2 how selection favouring hetero-
zygotes leads to a stable gene frequency at an intermediate value, and
how this overdominance with respect to fitness probably accounts for
much of the stable polymorphism found in natural populations.
And in Chapter 12 we saw how overdominance may be a source of
non-additive genetic variance in populations that have reached their
limit under artificial selection. It is, however, in connexion with the
utilisation of heterosis by inbreeding and crossing, or by reciprocal
selection, that overdominance has its most important practical conse-
quences. In earlier chapters two basic methods of improvement
were distinguished, one being selection without inbreeding, and the
other inbreeding followed by crossing. In this chapter we have seen
that selection is an integral part of the second method also. The
essential distinction therefore lies in the crossing, rather than in the
selection. Now, crossing two lines in which different alleles are fixed
gives an F1 in which all individuals are heterozygotes; and this is the
only way of producing a group of individuals that are all heterozy-
gotes. In a non-inbred population no more than 50 per cent of the
individuals can be heterozygotes for a particular pair of alleles.
Consequently, if heterozygotes of a particular pair of alleles are
superior in merit to homozygotes, inbreeding and crossing will be a
better means of improvement than selection without inbreeding.
Furthermore, it is only when there is overdominance with respect to
the desired character, or combination of characters, that inbreeding
and crossing can achieve what selection without inbreeding cannot.
Under any other conditions of dominance the best genotype is one of
the homozygotes, and all individuals can be made homozygous by
selection, without the disadvantages attendant on inbreeding and
much more simply than by methods dependent on crossing. It was
stated earlier in this chapter that the potentialities of inbreeding and
crossing are greatest when there is much non-additive genetic variance
and little additive. Now we see that this is only part of the truth:
in principle inbreeding and crossing can surpass selection without in-
breeding only when a substantial part of the non-additive variance is
due to overdominance. It is therefore of great practical importance
to know whether overdominance with respect to economically
desirable characters is a major source of variation. It is also of great
theoretical interest to know whether overdominance with respect to
natural fitness is a common phenomenon affecting many loci, because
natural selection favouring heterozygotes would be a potent factor
tending to maintain genetic variation in populations. This point will
be discussed further in Chapter 20.
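
The argument of this paragraph can be followed numerically for a single locus. The sketch below assumes genotypic values +a, d and -a and random mating in the non-inbred population; it compares the highest mean that any gene frequency can give with the mean of an F1 between lines fixed for different alleles. The function names are illustrative.

```python
def best_mean_by_selection(a, d):
    """Highest mean of a random-mating population over all gene frequencies p."""
    def mean(p):
        return a * (2.0 * p - 1.0) + 2.0 * d * p * (1.0 - p)

    # The maximum on [0, 1] lies at an end point or at the stationary point.
    candidates = [0.0, 1.0]
    if d != 0.0:
        p_hat = (a + d) / (2.0 * d)
        if 0.0 < p_hat < 1.0:
            candidates.append(p_hat)
    return max(mean(p) for p in candidates)


def mean_of_cross_between_fixed_lines(a, d):
    """F1 of lines fixed for different alleles: every individual is a heterozygote."""
    return d


def max_heterozygote_frequency(steps=1000):
    """Largest value of 2pq in a random-mating population, found on a grid of p."""
    return max(2.0 * (i / steps) * (1.0 - i / steps) for i in range(steps + 1))


if __name__ == "__main__":
    print("maximum heterozygote frequency:", max_heterozygote_frequency())
    for a, d in ((1.0, 0.5), (1.0, 1.0), (1.0, 2.0)):   # partial, complete, over
        print(a, d, best_mean_by_selection(a, d), mean_of_cross_between_fixed_lines(a, d))
```

In this model the cross exceeds the best that selection alone can reach only when d is greater than a, that is, when there is overdominance.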

The contribution of overdominance to the variance, and the pro-
portion of loci that show overdominance, are really two different
questions. Genes that are overdominant with respect to fitness will be
at intermediate frequencies and will therefore contribute much
more variation than genes at low frequencies. So overdominance may
be a major source of variation and yet be a property of only a few
loci.
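
The point can be illustrated by computing the genotypic variance contributed by a single locus. The sketch assumes random mating and genotypic values +a, d and -a; the two loci compared are hypothetical, one overdominant at an intermediate gene frequency and one fully recessive gene at a low frequency.

```python
def genotypic_variance(p, a, d):
    """Variance of genotypic values at one locus under random mating."""
    q = 1.0 - p
    freqs = (p * p, 2.0 * p * q, q * q)      # two homozygotes and the heterozygote
    values = (a, d, -a)
    mean = sum(f * v for f, v in zip(freqs, values))
    return sum(f * (v - mean) ** 2 for f, v in zip(freqs, values))


if __name__ == "__main__":
    overdominant = genotypic_variance(p=0.5, a=0.0, d=1.0)
    rare_recessive = genotypic_variance(p=0.99, a=1.0, d=1.0)
    print("overdominant locus at p = 0.5:", overdominant)
    print("recessive gene at q = 0.01:", rare_recessive)
    print("ratio:", overdominant / rare_recessive)
```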

The evidence concerning overdominance has been compre-
hensively reviewed by Lerner (1954), who reaches the conclusion that
overdominance with respect to fitness and characters closely con-
nected with it is widespread and very important. A contrary view is
expressed by Mather (1955b) on the grounds that much of what
appears to be overdominance with respect to certain characters in
plants can be attributed to epistatic interaction. These two conflicting
opinions will be enough to show that the problem of overdominance
remains still an open question. The aim here is not to discuss the
opinions, but to indicate briefly the nature of the evidence.

The evidence concerning overdominance is broadly speaking of
two sorts, direct and indirect. The direct evidence comes from the
comparison of heterozygotes and homozygotes in identifiable geno-
types. The indirect evidence comes from the study of the expected
consequences of overdominance as they affect the genetic properties of
a population, or the outcome of certain breeding methods. Both sorts
of evidence are complicated by linkage. We have to distinguish
between overdominance as a property of a single locus, and over-
dominance as a property of a segment of chromosome, which we shall
refer to as apparent overdominance. Unequivocal evidence of over-
dominance arising from a single locus is scarce because it can only be
obtained from a locus that has mutated in a highly inbred line, or
from a population in which coupling and repulsion linkages are in
equilibrium. The segregation that can be observed in practice, and
that gives rise to the genetic variation in a population, is usually not a
segregation of single loci but of segments of chromosome, longer or
shorter according to the amount of crossing-over. These segments
of chromosome, or units of segregation, can show overdominance
even though the separate loci do not. All that is needed to produce
some degree of apparent overdominance is two genes, linked in
repulsion, and both partially recessive. Its most extreme form is pro-
duced by two lethal genes linked in repulsion — a "balanced lethal"
system — when the heterozygote of the segment spanned by the two
loci is the only viable genotype.
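
How two partially recessive genes linked in repulsion give apparent overdominance can be shown with a small sketch. It assumes that the two loci act additively, that no crossing-over occurs within the segment, and that each deleterious gene reduces the value by s in the homozygote and by h times s in the heterozygote, with h less than one half for partial recessiveness. These parameters and names are hypothetical.

```python
def locus_value(n_deleterious, s, h):
    """Value at one locus carrying 0, 1 or 2 copies of a deleterious gene."""
    return {0: 0.0, 1: -h * s, 2: -s}[n_deleterious]


def segment_values(s, h):
    """Values of the three segment genotypes when the haplotypes are A b and a B.

    Lower-case letters denote the deleterious genes, so each haplotype
    carries one of them (repulsion linkage), and the loci act additively.
    """
    hom_Ab = locus_value(0, s, h) + locus_value(2, s, h)    # A b / A b
    hom_aB = locus_value(2, s, h) + locus_value(0, s, h)    # a B / a B
    het = locus_value(1, s, h) + locus_value(1, s, h)       # A b / a B
    return hom_Ab, het, hom_aB


if __name__ == "__main__":
    hom_Ab, het, hom_aB = segment_values(s=1.0, h=0.2)
    print("A b / A b:", hom_Ab)
    print("A b / a B:", het)
    print("a B / a B:", hom_aB)
```

At each locus taken separately the heterozygote lies between the two homozygotes, yet the heterozygote for the segment exceeds both segment homozygotes whenever h is less than one half.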

In considering the direct evidence it is necessary to recognise that
overdominance may be manifested at different "levels" according to
the complexity of the character under discussion. A pair of alleles
with pleiotropic effects may be found not to exhibit overdominance
when any of the characters they affect is examined separately; yet if
natural fitness or economic merit is founded on a combination of
these characters, the alleles may show overdominance with respect to
fitness or merit. Thus there may be no overdominance at the lower
level of the simpler characters, but overdominance at the higher level
of the more complex character.

Example 16.2. An example of overdominance due to pleiotropy is
provided by the pygmy gene in mice, already referred to in several ex-
amples in earlier chapters. The gene reduces body size and in the homo-
zygote it causes sterility (King, 1955). In respect of body size it is nearly,
but not quite, recessive. In respect of sterility it is probably also nearly
recessive, though this was not proved. In neither body size nor sterility
separately is there overdominance. But if small size were desirable (as it
was in the experiment in which the gene was discovered), then under these
conditions the genotype with the highest merit is the heterozygote, since
the sterile homozygotes cannot reproduce. With respect to merit, or fitness
under these conditions, the gene therefore shows overdominance. The
lethal gene in the line of Drosophila selected for high bristle number,
mentioned in Chapter 12, is another case of the same sort of overdomin-
ance; and so also is the sickle-cell anaemia described in Example 2.4.

The observations that provide direct evidence concerning over-
dominance may be briefly summarised as follows. The experience
of Mendelian genetics shows that mutant genes are not commonly
overdominant with respect to their main effects. Nor is overdominance
with respect to natural fitness at all obvious. Indeed, if there
were more than a mild degree of overdominance with respect to
fitness a gene would not be rare enough to be classed as a "mutant."
Though the evidence of Mendelian genetics suggests that overdominance
is not a very common property of genes, many cases are nevertheless
known. Cases of overdominance due to pleiotropy, such as those
mentioned in the above example, are not infrequent. And overdominance
with respect to certain components of natural fitness has
been proved for some of the blood group genes in poultry (see Briles,
Allen, and Millen, 1957; Gilmour, 1958).

The nature of the indirect evidence concerning overdominance
is, in brief summary, as follows.

1. Experiments on the rate of loss of genetic variance during inbreeding
point to the operation of natural selection in favour of
heterozygotes (Tantawy and Reeve, 1956; Briles, Allen, and Millen,
1957; Gilmour, 1958). This indicates apparent overdominance, but
it does not prove overdominance at the individual loci.

2. Crow (1948, 1952) has given reasons for thinking that the
yield of grain obtained from the best crosses between inbred lines of
maize is too high to be accounted for without overdominance at some
loci. The reasoning depends on assumptions about the number of
loci affecting yield and the mutation rates, and the conclusion is
therefore tentative. Robinson et al. (1956) point out that the reasoning
cannot justifiably be applied to maize crosses because the lines
crossed generally come from different varieties and not from the
same base population as required by Crow's hypothesis.

3. Comstock and Robinson (1952) have devised methods for
measuring the average degree of dominance from measurements
made on non-inbred populations. Preliminary results from maize
(Robinson and Comstock, 1955) suggest that there cannot be over-
dominance (as distinct from apparent overdominance) at more than a
small proportion of the loci that influence the yield of grain.

4. The existence of polymorphism in natural populations, as
described in Chapter 2, cannot readily be explained except by sup-
posing that the genes concerned are overdominant with respect to
fitness.

From the foregoing outline of the evidence it is clear that the
problem of how important overdominance is remains unsolved.
Some of the differences of opinion about it may arise from different
views of what phenomena are to be included under the term —
whether apparent overdominance due to linkage, or overdominance
due to pleiotropy, are to be regarded as overdominance or not.
Moreover, the question of how important overdominance is means
different things according to whether we are concerned with its
frequency as a property of genes, or with the amount of variation it
causes.

CHAPTER 17

SCALE

The choice of a suitable scale for the measurement of a metric charac-
ter has been mentioned several times in the foregoing chapters. The
explanation of what is involved in the choice of a scale and a discussion
of the criteria of suitability have, however, been deferred till this
point because these are matters that cannot be properly appreciated
until the nature of the deductions to be made from the data is
understood. In other words the choice of a scale has to be made in
relation to the object for which the data are to be used. The data
from any experimental or practical study are obtained in the form
most convenient for the measurement of the character. That is to
say the phenotypic values are recorded in grams, pounds, centimetres,
days, numbers, or whatever unit of measurement is most convenient.
The point at issue is whether these raw data should be transformed to
another scale before they are subjected to analysis or interpretation.
A transformation of scale means the conversion of the original units
to logarithms, reciprocals, or some other function, according to what
is most appropriate for the purpose for which the data are to be used.

It is tempting to suppose that each character has its "natural"
scale, the scale on which the biological process expressed in the
character works. Thus, growth is a geometrical rather than an arith-
metical process, and a geometric scale would appear to be the most
"natural." For example, an increase of 1 gm. in a mouse weighing
20 gm. has not the same biological significance as an increase of 1 gm.
in a mouse weighing 2 gm.: but an increase of 10 per cent has ap-
proximately the same significance in both. For this reason a trans-
formation to logarithms would seem appropriate for measurements of
weight. This, however, is largely a subjective judgment, and some
objective criterion for the choice of a scale is needed. There are
several recognised criteria (see Wright, 1952b); but, as Wright points
out, the different criteria are often inconsistent in the scale they indi-
cate. And, moreover, the same criterion applied to the same character
may indicate different scales in different populations. Therefore the
idea that every character must have its "natural" and correct scale is
largely illusory.

In the first chapter on metric characters, Chapter 6, it was stated
that we should assume throughout that any metric character under
discussion would be measured on an "appropriate" scale, the
criterion being that the distribution of phenotypic values should
approximate to a normal curve. This is, in principle, the chief
criterion, and a markedly asymmetrical, or skewed, distribution is a
certain indication that the data may have to be transformed if they are
to be used in certain ways. But a transformation may still be required
even if the distribution is not markedly asymmetrical: we shall see
below that the most important criterion then is that the variance
should be independent of the mean. We shall treat the choice of
scale in this chapter by showing what will arise if the transformation
required is not made. We shall find that certain phenomena arise,
called scale effects, which disappear when the appropriate transforma-
tion is made. For the sake of clarity we shall discuss in particular the
logarithmic transformation which converts an arithmetic to a geo-
metric scale. This is probably the commonest and most useful
transformation. The general principles, outlined by reference to the
log transformation, will, however, apply equally to other transforma-
tions. Let us first consider the distribution of phenotypic values.

Fig. 17.1 shows three distributions plotted as if from the original
data on an arithmetic scale. They would all three be symmetrical
and normal if the data were first transformed to logarithms, or plotted
on logarithmic paper. There are two points of importance to notice.
First, the degree of departure from normality depends on the amount
of variation in relation to the mean. This may be seen from a com-
parison of the two upper graphs, (a) and (b), which are not very
noticeably asymmetrical, with the lower graph, (c), which is. The
relationship between the amount of variation and the mean, which
determines the degree of departure from normality, is best expressed
as the coefficient of variation; i.e. the ratio of standard deviation to
mean, often multiplied by 100 to bring it to a percentage. The
coefficient of variation of the two upper graphs is 20 per cent, while
that of the lower graph is 50 per cent. Thus, a transformation to
logarithms does not make an appreciable difference to the shape of the
distribution unless the coefficient of variation is fairly high — that is,
above about 20 per cent or so. Consequently, statistical procedures
which do not rely on a strictly normal distribution, such as the analysis
of variance, can be carried out on the untransformed data when
the coefficient of variation is not above about 20 per cent. Trans-
formations to other scales are also less necessary when the coefficient
of variation is low than when it is high.
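
The dependence of asymmetry on the coefficient of variation can be illustrated numerically. The following Python sketch is not part of the original argument. It assumes a character that is exactly normal on a logarithmic scale, and uses the standard expression for the skewness of such a (lognormal) distribution on the arithmetic scale, which depends only on the coefficient of variation. The function name is illustrative.

```python
def lognormal_skewness(cv: float) -> float:
    """Skewness, in arithmetic units, of a character that is normal on a log scale.

    For a lognormal variable exp(s^2) = 1 + cv^2, so the usual skewness formula
    (exp(s^2) + 2) * sqrt(exp(s^2) - 1) reduces to cv * (cv^2 + 3).
    `cv` is the coefficient of variation as a fraction (0.20 for 20 per cent).
    """
    return cv * (cv ** 2 + 3.0)


# Coefficients of variation like those of the graphs in Fig. 17.1, plus a lower one.
for cv in (0.10, 0.20, 0.50):
    print(f"CV = {cv:.0%}: skewness = {lognormal_skewness(cv):.3f}")
```

Under this assumption the skewness rises rather faster than the coefficient of variation, which agrees with the statement that the asymmetry becomes conspicuous only when the coefficient of variation is fairly high.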

Fig. 17.1. Distributions that are symmetrical and normal on a
logarithmic scale shown plotted on an arithmetic scale. Explanation
in text.

The second point to notice in Fig. 17.1 is that the variance, when
computed in arithmetic units, increases when the mean increases.
This may be seen in the two upper graphs, (a) and (b). These have
both the same variance in logarithmic units, but different means.
The mean — or strictly speaking the mode — of (b) is double that of (a)
and the standard deviation in arithmetic units is correspondingly
doubled. Though the distributions are not very noticeably skewed
and a transformation does not seem to be very strongly indicated, yet
in consequence of the difference of mean the variances differ very
greatly. Here, then, is one of the commonest scale effects, namely a
change of variance following a change of the population mean. The
two graphs (a) and (b) in Fig. 17.1 might well represent two popula-

Chap. 17]

SCALE

295

tions which have diverged by some generations of two-way selection,
if the character were something like body weight measured in grams
or pounds. Such characters are commonly found to increase in
variance when the mean increases and to decrease in variance when
the mean decreases. Fig. 17.2 shows an example from an experiment
with mice (MacArthur, 1949), the character being weight at 60 days.

[Figure 17.2: three frequency distributions plotted against weight in grams.]

Fig. 17.2. Distributions of body weight of male mice at 60 days.
Centre: base population before selection. Left and right: small
and large strains after 21 generations of two-way selection.
(Redrawn from MacArthur, 1949.)

                           Small   Unselected   Large
Standard deviation          1.71      2.56       5.10
Coeff. of variation, %      14.3      11.1       12.8

Phenomena such as the change of variance discussed above are
called scale effects if they disappear when the measurements are
appropriately transformed: in other words, if their cause can be
attributed to the scale of measurement. But they are none the less
real, though labelled as a scale effect or removed by transformation.
The large mice, for example, are really more variable than the small
when their weights are measured in grams. What is gained by recog-
nising this as a scale effect is that there is no need to look deeper into
the genetic properties of the character for an explanation.

A convenient test for the appropriateness of a logarithmic transformation
is provided by the proportionality of standard deviation
and mean, which we noted in connexion with graphs (a) and (b) in
Fig. 17.1. If two distributions have the same variance on a logarithmic
scale then the coefficients of variation in arithmetic units will be
the same. Thus, constancy of the coefficient of variation indicates
constancy of variance on a logarithmic scale. And, if variances are to
be compared, we may simply compare the coefficients of variation
instead of expressing the variances in logarithmic units. The stand-
ard deviations and coefficients of variation of the distributions shown
in Fig. 17.2 are given in the legend to the figure. The coefficients of
variation, though not identical, are much more alike than the stand-
ard deviations, and this shows that the changes of variance that have
resulted from the selection can be attributed, in large part at least, to
the scale of measurement.
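
As a rough check of this comparison, the figures in the legend to Fig. 17.2 may be set out as below (standard deviations read as 1.71, 2.56 and 5.10 gm.; coefficients of variation as 14.3, 11.1 and 12.8 per cent). This is a minimal sketch with illustrative names. The means are not given in the legend and are inferred here from the other two quantities.

```python
# Standard deviations (gm.) and coefficients of variation (%) from the legend of Fig. 17.2.
strains = {
    "small": {"sd": 1.71, "cv": 14.3},
    "unselected": {"sd": 2.56, "cv": 11.1},
    "large": {"sd": 5.10, "cv": 12.8},
}


def spread(values):
    """Ratio of the largest to the smallest value; 1.0 would mean they are identical."""
    values = list(values)
    return max(values) / min(values)


sd_spread = spread(s["sd"] for s in strains.values())
cv_spread = spread(s["cv"] for s in strains.values())

# Means implied by the legend, since CV (%) = 100 * sd / mean. These are inferred values.
implied_means = {name: 100.0 * s["sd"] / s["cv"] for name, s in strains.items()}

print(f"largest / smallest standard deviation:       {sd_spread:.2f}")
print(f"largest / smallest coefficient of variation: {cv_spread:.2f}")
for name, mean in implied_means.items():
    print(f"{name}: implied mean about {mean:.1f} gm.")
```

The standard deviations differ by a factor of about three, the coefficients of variation by much less, which is the sense in which the change of variance is attributed largely to scale.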

The effect of scale on the connexion between variance and mean
complicates the comparison of the variances of two populations that
differ also in mean, as for example the comparison of the variances of
inbreds and hybrids discussed in Chapter 15. If a difference of
variance is to be unambiguously attributed to a difference of homeo-
static power, for example, there must be independent grounds for
believing that a similar difference would not be expected as a scale
effect connected with the difference of mean.

Let us return to the consequences of selection and pursue them a
little further. If the variance changes with the change of mean as a
result of selection, so also will the selection differential and the
response. The response per generation of a character such as we have
been considering would therefore be expected to increase with the
progress of selection in the upward direction, and to decrease corre-
spondingly in the downward direction. The response to two-way
selection would then be asymmetrical. An example of an asymmetri-
cal response which can most probably be attributed to a scale effect
in this way is shown in Fig. 17.3. Plotted in arithmetic units, as in
(a), the response is much greater in the upward than in the downward
direction. A transformation to logarithms, shown in (b), renders the
response much more nearly symmetrical. This does not do away
with the fact that the character as measured increased much more than
it decreased under selection. But it accounts for the asymmetry
without the need for more elaborate hypotheses. A convenient way of
eliminating scale effects from the graphical presentation of a response
to selection is to plot the response in the form of the realised heritability,
as explained in Chapter 11 and illustrated in Fig. 11.5. The
realised heritability, which is the ratio of response to selection differential,
is very little influenced by scale effects (Falconer, 1954a).
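
The asymmetry expected from a scale effect can be sketched with a small calculation. The figures below are hypothetical and are not taken from the experiment shown in Fig. 17.3: two lines are supposed to respond by equal amounts per generation on a logarithmic scale, and the resulting means are then expressed in arithmetic units.

```python
import math

start_days = 100.0   # hypothetical starting mean, in days
log_step = 0.03      # hypothetical response per generation, in log10 units
generations = 10

# Equal and opposite responses on the logarithmic scale.
up = start_days * 10 ** (log_step * generations)
down = start_days * 10 ** (-log_step * generations)

gain = up - start_days
loss = start_days - down

print(f"upward line:   {up:.1f} days (gain {gain:.1f})")
print(f"downward line: {down:.1f} days (loss {loss:.1f})")
print(f"gain / loss in arithmetic units: {gain / loss:.2f}")
print(f"responses in log10 units: {math.log10(up / start_days):+.2f}, "
      f"{math.log10(down / start_days):+.2f}")
```

With these assumed figures the upward line gains about twice as much as the downward line loses, although the two responses are identical in logarithmic units.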

[Figure 17.3: panels (a) and (b), each plotted against generations, 0 to 10.]

Fig. 17.3. Response to two-way selection for resistance to dental
caries in rats. Resistance is measured in days and plotted on an
arithmetic scale in (a), and on a logarithmic scale in (b). The arithmetic
means were converted to logarithmic means by formula 17.1. The
coefficient of variation was high, about 50%, and was approximately
constant. The reason why the upward selection has not
covered so many generations as the downward is simply that the
increased resistance lengthened the generation interval. (Data
from Hunt, Hoppert, and Erwin, 1944.)

When means or variances are to be compared, for example in a
comparison of two populations or in following the changes resulting
from selection, and a transformation to logarithms is indicated, it is
not necessary to convert each individual measurement. On the other
hand it is not sufficient to convert the arithmetic mean or variance to
logarithms, unless the coefficient of variation is very low. The
conversions may be conveniently made by the two following formulae,
given by Wright (1952b). The first converts the mean of arithmetic
values to the mean of logarithmic values, and the second converts the
variance as computed from the arithmetic values to the variance as it
would be computed from logarithmic values.

$$\overline{\log x} = \log \bar{x} - \tfrac{1}{2} \log (1 + C^2) \qquad (17.1)$$

$$\sigma^2_{(\log x)} = 0.4343 \log (1 + C^2) \qquad (17.2)$$

In these formulae C is the coefficient of variation in the form σ/x̄
computed from arithmetic values, and the logarithms are to the base 10.
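
Formulae 17.1 and 17.2 can be tried out on a distribution whose logarithmic mean and variance are known in advance. The sketch below assumes a character that is exactly normal on the log10 scale, with hypothetical parameters; the arithmetic mean and standard deviation are obtained from the standard lognormal expressions, and the formulae are then used to recover the logarithmic values. The function and variable names are illustrative.

```python
import math


def log_mean_and_variance(arith_mean: float, arith_sd: float):
    """Apply formulae 17.1 and 17.2 (logarithms to base 10)."""
    c2 = (arith_sd / arith_mean) ** 2                    # C^2, squared coefficient of variation
    log_term = math.log10(1.0 + c2)
    mean_log = math.log10(arith_mean) - 0.5 * log_term   # formula 17.1
    var_log = 0.4343 * log_term                          # formula 17.2
    return mean_log, var_log


# Hypothetical population: mean 1.30 and standard deviation 0.10 in log10 units.
m10, s10 = 1.30, 0.10
ln10 = math.log(10.0)
mu, s = m10 * ln10, s10 * ln10                 # the same parameters in natural logarithms
arith_mean = math.exp(mu + s * s / 2.0)        # lognormal mean in arithmetic units
arith_sd = arith_mean * math.sqrt(math.exp(s * s) - 1.0)

mean_log, var_log = log_mean_and_variance(arith_mean, arith_sd)
naive = math.log10(arith_mean)                 # merely taking the log of the arithmetic mean

print(f"mean of logs from 17.1:     {mean_log:.4f}")
print(f"variance of logs from 17.2: {var_log:.5f}")
print(f"log of the arithmetic mean: {naive:.4f}")
```

The log of the arithmetic mean overstates the mean of the logarithms, which is why a simple conversion of the mean is not sufficient unless the coefficient of variation is very low.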

We turn now to what is perhaps a more fundamental effect of a
scale transformation — its effect on the apparent nature of the genetic
variance. To understand this we must go back to a single locus and
consider the effect, or mode of action, of the genes. Let us imagine a
locus with two alleles whose mode of action is geometric, the genotypic
value of A2A2 being 50 per cent greater than A1A2 and that of
A1A2 being also 50 per cent greater than A1A1. Thus on the logarithmic
scale there is no dominance, the heterozygote being exactly mid-way
between the two homozygotes. Now suppose the genotypic
values are measured in arithmetic units, such as grams, and that A1A1
has a value of 10 units. Then A1A2 will be 15 units and A2A2 22.5
units. On the arithmetic scale, therefore, A1 is partially dominant to
A2, the heterozygote no longer falling mid-way between the homo-
zygotes. Thus the degree of dominance is influenced by the scale of
measurement, and so also is the proportionate amount of dominance
variance. This effect of a scale transformation, however, is normally
rather small. A gene that causes a 50 per cent difference between the
genotypic values, such as we have considered, would be a major gene,
easily recognisable individually. But even so the degree of dominance
on the arithmetic scale is not very great. Minor genes with effects of
perhaps 1 per cent or 10 per cent would be scarcely influenced in their
dominance.
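
The arithmetic of this argument can be set out as a short sketch. The measure used below, the deviation of the heterozygote from the mid-point of the homozygotes divided by half the difference between the homozygotes, is adopted here for illustration; the function name and the base value of 10 units for minor genes are assumptions.

```python
def dominance_on_arithmetic_scale(ratio: float, base: float = 10.0):
    """Genotypic values and degree of dominance for a gene acting geometrically.

    `ratio` is the factor by which each substitution of A2 for A1 multiplies the
    genotypic value (1.5 for a 50 per cent effect). Returns (values, d_over_a):
    d is the deviation of the heterozygote from the mid-homozygote point and
    a is half the difference between the two homozygotes.
    """
    a1a1, a1a2, a2a2 = base, base * ratio, base * ratio ** 2
    midpoint = (a1a1 + a2a2) / 2.0
    a = (a2a2 - a1a1) / 2.0
    d = a1a2 - midpoint
    return (a1a1, a1a2, a2a2), d / a


for ratio in (1.5, 1.1, 1.01):
    values, degree = dominance_on_arithmetic_scale(ratio)
    shown = ", ".join(f"{v:.2f}" for v in values)
    print(f"effect {ratio - 1:.0%}: values {shown}; d/a = {degree:+.3f}")
```

A negative value means that the heterozygote lies nearer to A1A1, so that A1 appears partially dominant. The departure from the mid-point shrinks quickly as the effect of the gene becomes smaller.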

In the same way that the dominance is affected by the scale, so
also is the epistatic interaction between different loci. Loci with
geometric effects would combine without interaction if the genotypic
values were measured in logarithmic units. But when measured in
arithmetic units there would be interaction deviations due to epis-
tasis. Thus the amount of interaction variance is also influenced by
the scale of measurement. The following example illustrates the
dependence of interaction on scale.

Example 17.1. The pygmy gene in mice is a major gene affecting body
size, homozygotes being much reduced in size. The effect of this gene
was studied in different genetic backgrounds (King, 1955). The gene was
transferred from the strain selected for small size where it arose, to a strain
selected for large size, by repeated backcrosses. The mean difference be-
tween pygmy homozygotes and normals (i.e. heterozygotes and normal
homozygotes together) was measured in the two strains and during the
transference, the comparisons being made between pygmies and normals
in the same litters. The results are shown in Fig. 17.4. The difference
between pygmies and normals increases with the weight of the normals.

In the background of the small strain the pygmies were about 7 gm. smaller
than normals, but in the background of the large strain they were about
12 gm. smaller. Thus the pygmy gene shows epistatic interaction with the
other genes that affect body size. But if the effect of the gene is expressed
as a proportion, it is constant and independent of the other genes present.

[Figure 17.4: horizontal axis, weight of normal mice (g.), 10 to 30.]

Fig. 17.4. Intra-litter comparisons of the 6-week weights of pyg-
mies and normals. Mean of pygmies plotted against mean of nor-
mals in the same litter. (From King, 1955; reproduced by courtesy
of the author and the editor of the Journal of Genetics.)

Pygmies are about half the weight of their normal litter-mates, no matter
what the actual weights are. Thus if the comparisons are made in logar-
ithmic units there is no epistatic interaction.
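
The point of the example can be put in figures. The sketch below uses hypothetical round weights chosen to agree with the statements above (pygmies about half the weight of normals, and differences of about 7 gm. and 12 gm.); they are not the observed means.

```python
import math

# Hypothetical weights of normal mice (gm.) in the two genetic backgrounds.
normals = {"small-strain background": 14.0, "large-strain background": 24.0}
pygmy_fraction = 0.5   # pygmies taken as half the weight of their normal litter-mates

effects = {}
for background, normal in normals.items():
    pygmy = pygmy_fraction * normal
    arithmetic_effect = normal - pygmy                        # difference in grams
    log_effect = math.log10(normal) - math.log10(pygmy)       # difference in log10 units
    effects[background] = (arithmetic_effect, log_effect)
    print(f"{background}: {arithmetic_effect:.0f} gm., or {log_effect:.3f} log10 units")
```

In grams the effect of the gene depends on the background; in logarithmic units it is the same in both, so that the interaction is absent on that scale.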

In general, therefore, a scale transformation may remove or
reduce the variance attributable to epistatic interaction, and this
variance might then be labelled as a scale effect. A transformation
which removes or reduces interaction variance may be useful if con-
clusions are to be drawn from an analysis that depends for its validity
on the absence of interaction. A detailed treatment of the relation-
ship between scale and epistatic interaction is given by Horner,
Comstock, and Robinson (1955).

In this chapter we have outlined some of the scale effects most
commonly met with, and have indicated the circumstances under
which a transformation of scale may be helpful to the interpretation
of results and the drawing of conclusions. Transformations of scale,
however, should not be made without good reason. The first purpose
of experimental observations is the description of the genetic
properties of the population, and a scale transformation obscures
rather than illuminates the description. If epistasis, for example, is
found, this is an essential part of the description, and it is better
labelled as epistasis than as a scale effect. The transformation of scale
is essentially a statistical device to be employed for the purpose of
simplifying the analysis of the data, or to make possible the drawing
of valid conclusions from the analysis. It is sometimes helpful also in
the interpretation of results. If epistasis, for example, were found to
disappear on transformation to a logarithmic scale we could conclude
that the effects of different loci combined by multiplication rather
than by addition. Or, if there were good reasons for attributing a
difference of variance to a scale effect we should not need to invoke
more complicated genetic explanations. The choice of scale, how-
ever, raises troublesome problems in connexion with the interpreta-
tion of results. Logical justification of a scale transformation can
only come from some criterion other than the property about which
the conclusions are to be drawn. If there is no independent criterion
the argument becomes circular, and the distinction between a scale
effect and some other interpretation becomes meaningless. There is
also a more fundamental difficulty: the scale appropriate for one
population may not be appropriate for another, and the scale appro-
priate to the genetic and environmental components of the variation
may be different. This difficulty is strikingly illustrated by an analysis
of the character "weight per locule" in a number of crosses between
varieties of tomato (Powers, 1950). By the same criterion — normality
of the distribution — this character was found to require an arithmetic
scale in some crosses and a geometric scale in others; and, moreover,
in the F2 generations of some crosses the genetic variation required one
scale while the environmental variation required another.

CHAPTER 18

THRESHOLD CHARACTERS

There are many characters of biological interest or economic im-
portance whose inheritance is multifactorial but whose distribution is
discontinuous. For example: resistance to disease, a character ex-
pressed either in survival or in death with no intermediate; "litter"
size in the larger mammals that bear usually one young at a time but
sometimes two or three; or the presence or absence of any organ or
structure. Characters of this sort appear at first sight to be outside the
realm of quantitative genetics because they do not exhibit continuous
variation; yet when subjected to genetic analysis they are found to be
under the influence of many genes just as any metric character. For
this reason they have been called "quasi-continuous variations"
(Grüneberg, 1952): the phenotypic values are discontinuous but the
mode of inheritance is like that of a continuously varying character.

The clue to the understanding of the inheritance of such characters
lies in the idea that the character has an underlying continuity with a
"threshold" which imposes a discontinuity on the visible expression
of the character, as depicted in Fig. 18.1. The underlying continuous
variation is both genetic and environmental in origin, and may be
thought of as the concentration of some substance or the speed of some
developmental process — of something, that is to say, that could in
principle be measured and studied as a metric character in the
ordinary way. The hypothetical measurement of this variation is
supposed to be made on a scale that renders its distribution normal,
and the unit of measurement is the standard deviation of the dis-
tribution. This provides what may be called the underlying scale. We
now have two scales for the description of the phenotypic values: the
underlying scale which is continuous, and the visible scale which is
discontinuous. The two are connected by the threshold, or point of
discontinuity. This is a point on the continuous scale which corre-
sponds with the discontinuity in the visible scale. The idea will be
clearer from an inspection of Fig. 18.1, which depicts a character
whose visible expression can take only two forms, such as alive versus
dead, or present versus absent. Individuals whose phenotypic values
on the underlying scale exceed the threshold will appear in one visible
class, while individuals below the threshold will appear in the other.

[Figure 18.1: two normal distributions, horizontal axes in standard deviations, -3 to +3.]

Fig. 18.1. Illustrations of a threshold character with two visible
classes. The vertical line marks the threshold between the two
phenotypic classes, one of which is cross-hatched. The population
depicted on the left has an incidence of 10%; that on the right, an
incidence of 90%.

On the visible scale individuals can have only two values, 0 or 1.
Groups of individuals, however, such as families or the population as
a whole can have any value, in the form of the proportion or percent-
age of individuals in one or other class. This may be referred to as
the incidence of the character. Susceptibility to disease, for example,
can be expressed as the percentage mortality in the population or in
a family. The incidence is quite adequate as a description of the
population or group, but the percentage scale in which the incidence
is expressed is inappropriate for some purposes because on a per-
centage scale variances differ according to the mean. The interpre-
tation of genetic analyses of threshold characters is therefore facili-
tated by the transformation of incidences to values on the underlying
scale. The transformation is easily made by reference to a table of
probabilities of the normal curve. The threshold is a point of trun-
cation whose deviation from the population mean can be found from
the proportion of the population falling beyond it. A table of "probits"
(Fisher and Yates, 1943, Table ix) is convenient to use because
it refers to a single tail of the distribution and obviates confusion
over the sign of the deviation. The transformation from the visible
to the underlying scale enables us to state the mean phenotypic value
of a population or family in terms of its standard deviation, and to

Chap. 18]

THRESHOLD CHARACTERS

303

compare the means of different populations or families provided they
have the same standard deviation. It is convenient to take the posi-
tion of the threshold as the origin, or zero-point, on the underlying
scale and to express the mean as a deviation from the threshold.
Thus if the incidence of the character is, for example, 10 per cent, a
table of the normal curve shows that the threshold exceeds the mean
by 1.28 standard deviations. The population mean, referred to the
threshold as origin, is therefore -1.28σ. Or, if the incidence were
90 per cent then the population mean would be +1.28σ, as shown in
Fig. 18.1. For any comparison of means, however, it is necessary to
assume that the populations compared have the same variance on the
underlying scale. If reasons are known for the variances not being
equal, as in comparisons between inbreds, F1's and F2's for example,
then the means cannot be expressed on a common scale that allows a
valid comparison to be made.
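
The conversion of an incidence to a mean on the underlying scale can be sketched as follows. In place of a printed table of the normal curve the sketch uses `NormalDist` from Python's standard `statistics` module; the function name is illustrative, and the incidence is taken to be the proportion of individuals lying beyond (above) the threshold.

```python
from statistics import NormalDist


def mean_from_incidence(incidence: float) -> float:
    """Population mean on the underlying scale, in standard deviations,
    reckoned from the threshold as origin.

    `incidence` is the proportion of individuals above the threshold.
    """
    if not 0.0 < incidence < 1.0:
        raise ValueError("incidence must lie strictly between 0 and 1")
    # The threshold cuts off `incidence` in the upper tail, so it lies
    # inv_cdf(1 - incidence) standard deviations above the mean.
    threshold_from_mean = NormalDist().inv_cdf(1.0 - incidence)
    return -threshold_from_mean


for incidence in (0.10, 0.50, 0.90):
    print(f"incidence {incidence:.0%}: mean = {mean_from_incidence(incidence):+.2f} sd")
```

As in the text, means obtained in this way are comparable only between populations assumed to have the same variance on the underlying scale.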

This is as far as we can go with a character that is visibly expressed
in only two classes. The mean of a population or group can be stated,
but not the variance, because the mean has to be stated in terms of the
standard deviation. We can, however, subject the observed means of
families to analysis and compute the heritability of the character.
The heritability of threshold characters is treated by A. Robertson
and Lerner (1949) and by Dempster and Lerner (1950), and will not
be further discussed here.

If a character has three classes in its visible scale then comparisons
can be made between the variances of populations as well as between
the means. The number of lumbar vertebrae in mice is a character
of this sort that has been extensively studied (Green, 1951; McLaren
and Michie, 1955). The number is usually either 5 or 6, but some
individuals have 5 on one side and 6 on the other. This comes about
through the last vertebra being sacralised on one side and not on the
other. The asymmetrical mice have 5½ lumbar vertebrae and are
regarded as being intermediate between the 5-class and the 6-class.

When the visible scale has three classes there are two thresholds,
as shown in Fig. 18.2. If the assumption is made that the difference
between the two thresholds represents a constant difference on the
underlying scale, then we have not only a fixed origin of the scale but
also a fixed unit, and this provides a basis for the comparison of
variances as well as of means. The underlying scale then has one of
the thresholds as origin and the threshold difference as the unit of
measurement. The idea is most easily explained by a numerical
example. Consider the two populations illustrated in Fig. 18.2. Let
their standard deviations on a common underlying scale be σ1 and σ2
respectively, and let them have the following incidences in the three
visible classes, X, I, and Z, of which I is the intermediate class:

                                 Class
                                 X      I      Z
Incidence, %.  Population (1)    [values not recoverable]
               Population (2)    [values not recoverable]

The deviations of the thresholds from the population means, found
from a table of the normal curve, are as follows:

                       Threshold             Threshold interval
                    X/I          I/Z
Population (1)    +0.25σ1      +0.67σ1            0.42σ1
Population (2)    -0.84σ2      -0.52σ2            0.32σ2

[Figure 18.2: distributions of populations (1) and (2), horizontal axes in threshold units.]

Fig. 18.2. Illustrations of a threshold character with three visible
classes, in two populations with incidences as shown. The axes are
marked in threshold units, and the population means are indicated
by arrows. Further explanation in text.

The intervals between the two thresholds, given above on the right,
are found by subtraction of the deviations of the two thresholds in
each population. These threshold intervals are supposed by hypo-
thesis to be equal on the common underlying scale. By assigning the
threshold interval the value of one "threshold unit" we can therefore
express the standard deviations of the two populations on a common
basis in terms of threshold units. The standard deviations then
become

σ1 = 2.38 threshold units
σ2 = 3.12 threshold units.

The means of the populations can also be expressed in threshold
units. Reckoned from the X/I threshold as origin they are

M1 = -0.25σ1 = -0.60 threshold units
M2 = +0.84σ2 = +2.62 threshold units.

The standard deviation and population mean of a character with
three visible classes may be put in general form in the following way.
Let X be the incidence in one visible class, and Y the incidence in
this class together with the intermediate class. Let the threshold
between these two classes be the origin of the underlying scale. Let
x and y be the deviations of the two thresholds corresponding to the
incidences X and Y respectively. Then the standard deviation is

$$\sigma = \frac{1}{x - y} \ \text{threshold units} \qquad (18.1)$$

and the mean is

$$M = -x\sigma = \frac{-x}{x - y} \ \text{threshold units} \qquad (18.2)$$
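
The numerical example given earlier can be reworked as a check on formulae 18.1 and 18.2. In the sketch below x and y are taken as the signed deviations of the two thresholds from the population mean, as tabulated for populations (1) and (2), and the absolute value of the interval is used so that the standard deviation comes out positive whichever end class is called X; this sign convention, like the function name, is an assumption of the sketch. The incidences printed at the end are those implied by the tabulated deviations under a normal curve, computed here as a check rather than copied from the source.

```python
from statistics import NormalDist


def threshold_units(x: float, y: float):
    """Standard deviation and mean in threshold units.

    x and y are the signed deviations of the two thresholds from the population
    mean, in standard deviations. The threshold at x is taken as the origin.
    """
    sigma = 1.0 / abs(x - y)   # the interval between the thresholds is one unit
    mean = -x * sigma          # the mean lies x standard deviations below the origin
    return sigma, mean


# Deviations of the X/I and I/Z thresholds from the worked example.
deviations = {"(1)": (0.25, 0.67), "(2)": (-0.84, -0.52)}
results = {name: threshold_units(x, y) for name, (x, y) in deviations.items()}

std_normal = NormalDist()
for name, (x, y) in deviations.items():
    sigma, mean = results[name]
    in_x = std_normal.cdf(x)                 # proportion below the X/I threshold
    in_i = std_normal.cdf(y) - in_x          # proportion between the thresholds
    in_z = 1.0 - in_x - in_i
    print(f"Population {name}: sigma = {sigma:.3f}, M = {mean:+.3f} threshold units; "
          f"implied incidences {in_x:.0%}, {in_i:.0%}, {in_z:.0%}")
```

The results agree with the values quoted in the text to within rounding in the second decimal place.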

The comparison of variances in this way depends entirely, as we
have pointed out, on the assumption that the interval between the
two thresholds is constant from one population to another. If we
think again of the hypothetical substance or process whose concentra-
tion or rate determines the value on the underlying scale, the assump-
tion is that the intermediate class spans the same difference of con-
centration or of rate in the two populations compared. Whether this
assumption is a reasonable one or not is hard to judge. It may,
nevertheless, lead to reasonable results, as the following example
shows.

Example 18.1. The number of lumbar vertebrae was studied in two
inbred lines of mice and their cross (Green and Russell, 1951). The inbred
lines were a branch of the C3H strain with predominantly 5 lumbar
vertebrae, and the C57BL strain with predominantly 6 lumbar vertebrae.
Crosses were made reciprocally, and F2 generations were made from each
F1. The incidences of the 5-vertebra class and of the intermediate class of
asymmetrical mice with 5½ are given in the table. The reciprocal F1's
were found to differ and are listed separately. The F2's did not differ and
their results are pooled. The table gives also the positions of the two
thresholds in standard deviations; and the mean and standard deviation
computed in threshold units, the mean being reckoned from the threshold
between the 5-class and the asymmetrical class as origin.

                        Incidence, %    Deviation of thresholds   Mean and standard deviation
                                        from mean, in σ           in threshold units
Population               5       5½      5/5½       5½/6             M          σ

Inbreds  C3H            96.9     2.3    +1.87      +2.41           -3.44       1.84
         C57             1.3     2.0    -2.23      -1.84           +5.74       2.58

F1       C3H♀ × C57♂    57.4    15.5    +0.19      +0.61           -0.44       2.36
         C57♀ × C3H♂    29.0    25.0    -0.55      +0.10           +0.85       1.53

F2 (pooled)             46.7    12.2    -0.08      +0.23           +0.27       3.25

The distributions of the populations, based on the computed means and standard
deviations, are shown graphically in Fig. 18.3. It should be noted that the
means and standard deviations of the inbreds are not very precisely esti-
mated because the incidences are low. The computed properties of the
populations follow the expected pattern. The F1 generation is intermediate
in mean between the two parental populations, though there is a maternal
effect causing a difference between the reciprocal F1's. This maternal
effect has been further studied and confirmed by McLaren and Michie
(1956a). The variance of the F1 is somewhat lower than that of the
parental inbreds, as might be expected from a reduction of environmental
variance in the hybrids. This was further studied and confirmed by
McLaren and Michie (1955). The F2 is equal in mean to the F1, but shows
an increased variance as would be expected from the segregation of genes.
If we take 2.00 as the mean standard deviation of the F1, representing
purely environmental variation, then the environmental variance is 4.00,
and the total phenotypic variance given by the F2 is 10.56; therefore the
genotypic variance works out at 6.56, or 62 per cent of the total. Thus the
analysis of the threshold character studied in this cross leads to very
reasonable results, and the assumptions on which it rests do not seem to be
very seriously wrong.
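
The figures of this example can be recomputed from the incidences alone. The sketch below reads the incidences of the 5-class and the 5½-class as 96.9 and 2.3 (C3H), 1.3 and 2.0 (C57), 57.4 and 15.5, 29.0 and 25.0 (the reciprocal F1's), and 46.7 and 12.2 (F2). It uses `NormalDist` from the standard `statistics` module in place of a printed table; the names are illustrative, and the value of 2.00 for the F1 standard deviation is the rounded figure adopted in the text.

```python
from statistics import NormalDist


def from_incidences(pct_low: float, pct_mid: float):
    """Threshold deviations, mean and standard deviation from two incidences.

    pct_low is the percentage in the 5-vertebra class and pct_mid the percentage
    in the asymmetrical class. The 5/5½ threshold is taken as the origin.
    """
    z = NormalDist().inv_cdf
    x = z(pct_low / 100.0)                 # 5/5½ threshold, in sd from the mean
    y = z((pct_low + pct_mid) / 100.0)     # 5½/6 threshold, in sd from the mean
    sigma = 1.0 / (y - x)                  # standard deviation in threshold units
    return x, y, -x * sigma, sigma


table = {
    "C3H": (96.9, 2.3),
    "C57": (1.3, 2.0),
    "F1 C3H x C57": (57.4, 15.5),
    "F1 C57 x C3H": (29.0, 25.0),
    "F2": (46.7, 12.2),
}
results = {name: from_incidences(*pcts) for name, pcts in table.items()}
for name, (x, y, mean, sigma) in results.items():
    print(f"{name:13s} x = {x:+.2f}  y = {y:+.2f}  M = {mean:+.2f}  sigma = {sigma:.2f}")

# Partition of the F2 variance, with 2.00 taken as the F1 standard deviation.
environmental = 2.00 ** 2
phenotypic = 3.25 ** 2
genotypic = phenotypic - environmental
print(f"genotypic variance {genotypic:.2f}, or {genotypic / phenotypic:.0%} of the total")
```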

The meaning of the threshold unit in which values on the under-
lying scale are expressed may conveniently be discussed by reference
to the number of lumbar vertebrae in mice, described in the above
example. From the graduation of the scale at the foot of Fig. 18.3
it appears that the threshold interval corresponds to one vertebra. It
is therefore tempting to regard the scale as indicating "potential"
vertebrae, ranging from 5 at the origin to 15 at the upper extreme
and to -5 at the lower extreme. We should then regard the developing
vertebral column as being protected by canalisation against this
wide range of potential variation, so that the vertebrae actually
formed are restricted to the narrow range between 5 and 6. This
interpretation, however, assumes that individuals with a potential
number anywhere between 5 and 6 will be asymmetrical with 5½
vertebrae; and for this there is no justification. The asymmetrical
individuals may equally well, or more probably, be those with almost
exactly 5½ potential vertebrae. Suppose, for example, that the range
of potential vertebrae that gave rise to an asymmetrical individual
were between 5.4 and 5.6. Then 1 threshold unit would correspond
to 0.2 potential vertebrae; the origin of the underlying scale would
be at 5.4 and the variation would range from 7.4 potential vertebrae
at one extreme to 3.4 at the other. Or, if the asymmetrical individuals
covered a range of only 0.1 potential vertebrae, the whole distribution
would lie within the potential numbers of 5 and 6, just as the
actual range does. Thus the threshold unit is purely arbitrary in
nature; though useful for the comparison of populations, it cannot be
given any concrete interpretation.

Fig. 18.3. Distributions of number of lumbar vertebrae in mice
transformed to the underlying scale of threshold units. The upper
distributions are two inbred lines, the two middle ones are the two
reciprocal F1's, and the lower distribution is the F2. (Data from
Green & Russell, 1951.) See example 18.1 for further explanation.
(Horizontal axis: threshold units, with the corresponding numbers of
vertebrae also marked.)

From what has been said so far in this chapter it will be clear that
threshold characters do not provide ideal material for the study of
quantitative genetics, because the genetic analyses to which they can
be subjected are limited in scope and subject to assumptions that one
would be unwilling to make except under the force of necessity. We
turn now to a consideration of some aspects of selection for threshold
characters, which has more practical importance than the genetic
analyses that we have been considering, and does not involve the same
theoretical difficulties.

Selection for Threshold Characters

Selection for threshold characters has some practical importance
in connexion with the improvement of viability and with changing
the response of experimental animals to treatments, such as, for
example, increasing or decreasing drug resistance. We shall consider
only characters with two visible classes; and we shall assume that
there is no means of measuring some aspect of the character that
varies continuously, such as measuring the time of survival instead of
classifying simply dead versus alive.

The response to selection depends in the usual way on the selection
differential. But the selection differential does not depend primarily
on the proportion selected, as with a continuously varying
character, but on the incidence, for the following reason. We may
breed exclusively from those individuals in the desired phenotypic
class, but we cannot discriminate between those with high and those
with low values on the underlying scale. The selected individuals are
therefore a random sample from the desired class, and the mean of
the selected individuals is the mean of the desired class, irrespective
of whether we select all of the desired class or only a portion of it.
The point will be made clearer by reference to Fig. 18.1, letting the
cross-hatching represent the desired class. Let us suppose that the
replacement rate allows us to select 10 per cent of the population. If
we select out of the population on the right, with an incidence of 90
per cent, the mean of the selected individuals will be the same as if
we had selected 90 per cent. But if we select out of the population
on the left, with an incidence of 10 per cent, we shall use all of the
individuals in the desired class and none of the others. The selection
differential will then be the same as if we had selected on the basis of
a continuously varying character. Thus the selection differential is
greatest when the incidence is exactly equal to the proportion selected.
If it is less we shall be forced to use some individuals of the undesired
class; and if it is greater we shall do no better than we should
by selecting the whole of the desired class.
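
The dependence of the selection differential on the incidence can be illustrated numerically. The sketch below assumes a normal distribution on the underlying scale, with the selection differential expressed in standard deviations. It further assumes that, when the desired class is too small, the remaining parents are drawn at random from the undesired class. The function name selection_differential is hypothetical and serves only this illustration.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # underlying scale in standard deviations


def selection_differential(incidence, proportion_selected):
    """Mean of the selected parents, in sd from the population mean.

    incidence           -- proportion of the population in the desired class
    proportion_selected -- proportion that must be used for breeding
    """
    # Threshold position such that `incidence` of the population exceeds it.
    x = STANDARD_NORMAL.inv_cdf(1.0 - incidence)
    z = STANDARD_NORMAL.pdf(x)        # ordinate of the curve at the threshold
    mean_desired = z / incidence      # mean of the desired class
    if incidence >= proportion_selected:
        # The parents are a random sample of the desired class.
        return mean_desired
    # The whole desired class is used and the shortfall is made up at
    # random from the undesired class (an assumption of this sketch).
    mean_undesired = -z / (1.0 - incidence)
    shortfall = proportion_selected - incidence
    total = incidence * mean_desired + shortfall * mean_undesired
    return total / proportion_selected


# Proportion selected fixed at 10 per cent; the incidence is varied.
for q in (0.05, 0.10, 0.20, 0.90):
    print(q, round(selection_differential(q, 0.10), 2))
```

Under these assumptions the printed values are greatest at an incidence of 10 per cent and fall away on either side, which is the conclusion reached in the text.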

With some characters, however, the incidence can be altered and
this provides a means of improving the response to selection. If the
character is, for example, a reaction to some treatment, the treatment
can be increased or reduced in intensity, so that the incidence is
altered. This is an alteration of the mean level of the environment,
and its effect is in principle to shift the distribution of phenotypic
values with respect to the fixed threshold. But it is more con-
venient to regard it as changing the nature of the character and shift-
ing the threshold with respect to a fixed mean phenotypic level.
When the level of the threshold can be controlled in this way, the
maximum speed of progress under selection will be attained by ad-
justing the threshold so that the incidence is kept as nearly as possible
equal to the minimum proportion that must be selected for breeding.
The progress made can be assessed by subjecting the population, or
part of it, to the original treatment under which the threshold is at its
original level.

Genetic assimilation. A very interesting result of the applica-
tion of this principle of changing the threshold by environmental
means is the phenomenon known as "genetic assimilation" (Wad-
dington, 1953). If a threshold character appears as a result of an
environmental stimulus, and selection is applied for this character, it
may eventually be made to appear spontaneously, without the neces-
sity of the environmental stimulus. In this way what was originally
an "acquired character" becomes by perfectly orthodox principles of
selection an "inherited character" (Waddington, 1942). In such a
situation there are two thresholds, one spontaneous and the other
induced, as shown in Fig. 18.4. The spontaneous threshold is at first
outside the range of variation of the population, so that there is no
variation of phenotype and no selection can be applied (Fig. 18.4, a).
The induced threshold, however, is within the range of the underlying
scale covered by the population, and it allows individuals toward
one end of the distribution to be picked out by selection. In this way
the mean genotypic value of the population is changed. If this change
goes far enough some individuals will eventually cross the spontaneous
threshold and appear as spontaneous variants (Fig. 18.4, b).
When the spontaneous incidence becomes high enough selection may
be continued without the aid of the environmental stimulus, and the
spontaneous incidence may be further increased (Fig. 18.4, c).

Fig. 18.4. Diagram illustrating genetic assimilation of a threshold
character. Distributions on the underlying scale, which is marked
in standard deviations. The vertical lines show the positions of the
induced and spontaneous thresholds, and the arrows mark the
population means at three stages of selection.

(a) before selection: incidence, induced = 30 %, spontaneous = 0 %

(b) after some selection: incidence, induced = 80 %, spontaneous = 2 %

Example 18.2. An experimental demonstration of genetic assimilation
in Drosophila melanogaster is described by Waddington (1953). The charac-
ter was the absence of the posterior cross-vein of the wing. In the base
population no flies with this abnormality were present, but treatment of
the puparium by heat shock caused about 30 per cent of cross-veinless
individuals to appear. Selection in both directions was applied to the
treated flies, and after 14 generations the incidence of the induced character
had risen to 80 per cent in the upward-selected line and fallen to 8 per cent
in the downward-selected line. At this time cross-veinless
flies began to appear in small numbers among untreated flies of the upward-
selected line, and by generation 16 the spontaneous incidence was between
1 and 2 per cent. Selection was then continued without treatment, the
population being subdivided into a number of lines. The best four of the
lines, selected without further treatment, reached spontaneous incidences
ranging from 67 per cent to 95 per cent. The distributions in Fig. 18.4
illustrate the progress of the upward selection. Graph (b) shows a spon-
taneous incidence of 2 per cent and an induced incidence of 80 per cent
and thus corresponds approximately with generation 16. On the assumption
of constant variance, the change of mean at this stage amounted to
1.36 standard deviations. Graph (c) shows a spontaneous incidence of
95 per cent and represents the line that finally showed the greatest
progress. Its mean on the underlying scale is 5.15 standard deviations above
that of the initial population.
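
The change of mean at the stage of graph (b) can be checked from the two induced incidences. The sketch assumes a normal distribution of constant variance on the underlying scale and a fixed induced threshold; the function names are hypothetical. The position of the spontaneous threshold computed at the end follows from the same assumptions together with the spontaneous incidence of 2 per cent.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # underlying scale in standard deviations


def threshold_position(incidence):
    """Deviation of a threshold from the population mean, in sd,
    given the proportion of the population that exceeds it."""
    return STANDARD_NORMAL.inv_cdf(1.0 - incidence)


def shift_of_mean(incidence_before, incidence_after):
    """Change of mean, in sd, implied by a change of incidence at a
    fixed threshold when the variance stays constant."""
    return threshold_position(incidence_before) - threshold_position(incidence_after)


# Induced incidence rose from 30 to 80 per cent in the upward line.
induced_shift = shift_of_mean(0.30, 0.80)

# Spontaneous threshold measured from the initial mean: the shift so far
# plus the distance still separating the mean from a 2 per cent threshold.
spontaneous_threshold = induced_shift + threshold_position(0.02)
print(round(induced_shift, 2), round(spontaneous_threshold, 2))
```

The first value should agree closely with the change of about 1.36 standard deviations quoted in the example.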

The idea of genetic assimilation is not confined to threshold
characters; but for its wider significance the reader must be referred
to Waddington (1957).

CHAPTER 19

CORRELATED CHARACTERS

This chapter deals with the relationships between two metric charac-
ters, in particular with characters whose values are correlated —
either positively or negatively — in the individuals of a population.
Correlated characters are of interest for three chief reasons. Firstly
in connexion with the genetic causes of correlation through the
pleiotropic action of genes: pleiotropy is a common property of major
genes, but we have as yet had little occasion to consider its effects in
quantitative genetics. Secondly in connexion with the changes
brought about by selection: it is important to know how the im-
provement of one character will cause simultaneous changes in other
characters. And thirdly in connexion with natural selection: the
relationship between a metric character and fitness is the primary
agent that determines the genetic properties of that character in a
natural population. This last point, however, will be discussed in
the next chapter.

Genetic and Environmental Correlations

In genetic studies it is necessary to distinguish two causes of cor-
relation between characters, genetic and environmental. The genetic
cause of correlation is chiefly pleiotropy, though linkage is a cause of
transient correlation particularly in populations derived from crosses
between divergent strains. Pleiotropy is simply the property of a
gene whereby it affects two or more characters, so that if the gene is
segregating it causes simultaneous variation in the characters it
affects. For example, genes that increase growth rate increase both
stature and weight, so that they tend to cause correlation between
these two characters. Genes that increase fatness, however, influence
weight without affecting stature, and are therefore not a cause of
correlation. The degree of correlation arising from pleiotropy
expresses the extent to which two characters are influenced by the same
genes. But the correlation resulting from pleiotropy is the overall, or
net, effect of all the segregating genes that affect both characters.
Some genes may increase both characters, while others increase one
and reduce the other; the former tend to cause a positive correlation,
the latter a negative one. So pleiotropy does not necessarily cause a
detectable correlation. The environment is a cause of correlation in
so far as two characters are influenced by the same differences of
environmental conditions. Again, the correlation resulting from en-
vironmental causes is the overall effect of all the environmental
factors that vary; some may tend to cause a positive correlation, others
a negative one.

The association between two characters that can be directly
observed is the correlation of phenotypic values, or the phenotypic
correlation. This is determined from measurements of the two
characters in a number of individuals of the population. Suppose,
however, that we knew not only the phenotypic values of the indi-
viduals measured, but also their genotypic values and their environ-
mental deviations for both characters. We could then compute the
correlation between the genotypic values of the two characters and
the correlation between the environmental deviations, and so assess
independently the genetic and environmental causes of correlation.
And if, in addition, we knew the breeding values of the individuals, we
could determine also the correlation of breeding values. In principle
there are also correlations between dominance deviations, and be-
tween the various interaction deviations. To deal with all these cor-
relations, even in theory, would be unmanageably complex, and
fortunately is not necessary, since the practical problems can be quite
adequately dealt with in terms of two correlations. These are the
genetic correlation, which is the correlation of breeding values, and
the environmental correlation, which is not strictly speaking the cor-
relation of environmental deviations, but the correlation of environ-
mental deviations together with non-additive genetic deviations. In
other words, just as the partitioning of the variance of one charac-
ter into the two components, additive genetic versus all the rest,
was adequate for many purposes, so now the covariance of two
characters need only be partitioned into these same two components.
The "genetic" and "environmental" correlations thus correspond
to the partitioning of the covariance into the additive genetic
component versus all the rest. The methods of estimating these
two correlations will be explained later. Let us consider first how
they combine together to give the directly observable phenotypic
correlation.

The following symbols will be used throughout this chapter:

X and Y: the two characters under consideration.

$r_P$: the phenotypic correlation between the two characters, X and Y.

$r_A$: the genetic correlation between X and Y (i.e. the correlation of breeding values).

$r_E$: the environmental correlation between X and Y (including non-additive genetic effects).

cov: the covariance of the two characters X and Y, with subscripts P, A, or E, having the same meaning as for the correlations.

$\sigma^2$ and $\sigma$: variance and standard deviation, with subscripts P, A, or E, as above, and X or Y according to the character referred to. E.g. $\sigma^2_{PX}$ = phenotypic variance of character X.

$h^2$: the heritability, with subscript X or Y, according to the character.

$e^2 = 1 - h^2$.

(The customary symbol for the genetic correlation is rG, but since the
genetic correlation is almost always the correlation of breeding values
we shall use the symbol rA for the sake of consistency with previous
chapters.)

A correlation, whatever its nature, is the ratio of the appropriate
covariance to the product of the two standard deviations. For
example, the phenotypic correlation is

$$ r_P = \frac{\mathrm{cov}_P}{\sigma_{PX}\,\sigma_{PY}} $$

The phenotypic covariance is the sum of the genetic and environmental
covariances, so we can write the phenotypic correlation as

$$ r_P = \frac{\mathrm{cov}_A + \mathrm{cov}_E}{\sigma_{PX}\,\sigma_{PY}} $$

The denominator can be differently expressed by the following
device: $\sigma^2_A = h^2\sigma^2_P$, and $\sigma^2_E = e^2\sigma^2_P$. So
$\sigma_P = \sigma_A/h = \sigma_E/e$. The phenotypic
correlation then becomes

$$ r_P = h_X h_Y \frac{\mathrm{cov}_A}{\sigma_{AX}\,\sigma_{AY}} + e_X e_Y \frac{\mathrm{cov}_E}{\sigma_{EX}\,\sigma_{EY}} $$

Therefore

$$ r_P = h_X h_Y r_A + e_X e_Y r_E \qquad (19.1) $$

This shows how the genetic and environmental causes of correlation
combine together to give the phenotypic correlation. If both
characters have low heritabilities then the phenotypic correlation is
determined chiefly by the environmental correlation: if they have
high heritabilities then the genetic correlation is the more important.
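
The way the two causes of correlation combine can be tried out numerically. The sketch below codes equation 19.1 and its rearrangement for the environmental correlation. The function names are hypothetical and the heritabilities and correlations used are invented for illustration, not taken from any of the studies cited.

```python
from math import sqrt


def phenotypic_correlation(h2_x, h2_y, r_a, r_e):
    """Equation 19.1: r_P = h_X h_Y r_A + e_X e_Y r_E."""
    h_x, h_y = sqrt(h2_x), sqrt(h2_y)
    e_x, e_y = sqrt(1.0 - h2_x), sqrt(1.0 - h2_y)   # e^2 = 1 - h^2
    return h_x * h_y * r_a + e_x * e_y * r_e


def environmental_correlation(h2_x, h2_y, r_p, r_a):
    """Equation 19.1 rearranged to give r_E from r_P, r_A and the heritabilities."""
    h_x, h_y = sqrt(h2_x), sqrt(h2_y)
    e_x, e_y = sqrt(1.0 - h2_x), sqrt(1.0 - h2_y)
    return (r_p - h_x * h_y * r_a) / (e_x * e_y)


# Invented values: both heritabilities low, correlations of opposite sign.
r_p_low = phenotypic_correlation(0.1, 0.1, r_a=0.8, r_e=-0.2)
# The same correlations with both heritabilities high.
r_p_high = phenotypic_correlation(0.9, 0.9, r_a=0.8, r_e=-0.2)
print(round(r_p_low, 2), round(r_p_high, 2))
```

With low heritabilities the phenotypic correlation takes the sign of the environmental correlation, and with high heritabilities that of the genetic correlation.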

The genetic and environmental correlations are often very differ-
ent in magnitude and sometimes different even in sign, as may be
seen from the examples given in Table 19.1. A difference in sign
between the two correlations shows that genetic and environmental
sources of variation affect the characters through different physio-
logical mechanisms. The correlations between body-weight and egg-
laying characters in poultry provide striking examples. Pullets that
are larger at 18 weeks from genetic causes reach sexual maturity later
and lay fewer eggs, but the eggs are larger. Pullets that are larger
from environmental causes reach sexual maturity earlier and lay
more eggs, which however are very little different in size.

The dual nature of the phenotypic correlation makes it clear that
the magnitude and even the sign of the genetic correlation cannot be
determined from the phenotypic correlation alone. Let us therefore
consider the methods by which the genetic correlation can be
estimated.

Estimation of the genetic correlation. The estimation of
genetic correlations rests on the resemblance between relatives in a
manner analogous to the estimation of heritabilities described in
Chapter 10. Therefore only the principle and not the details of the
procedure need be described here. Instead of computing the com-
ponents of variance of one character from an analysis of variance, we
compute the components of covariance of the two characters from an
analysis of covariance which takes exactly the same form as the ana-
lysis of variance. Instead of starting from the squares of the individual
values and partitioning the sums of squares according to the source
of variation, we start from the product of the values of the two
characters in each individual and partition the sums of products
according to the source of variation. This leads to estimates of the
observational components of covariance, whose interpretation in

316 CORRELATED CHARACTERS [Chap. 19

Table 19. i

Some Examples of Phenotypic, Genetic, and
Environmental Correlations

The environmental correlations (except those marked*) were
calculated for this table from the genetic correlations and
heritabilities given in the papers cited, by equation ig.i.
They are not purely environmental in causation but include
correlation due to non-additive genetic causes, as explained
in the text. Those marked* are true environmental correla-
tions, estimated directly from the phenotypic correlation in
inbred lines and crosses.

rP rA rE

Cattle (Johansson, 1950)
Milk-yield : butterfat-yield.
Milk-yield : butterfat %.
Butterfat-yield : butterfat %.

Pigs (Fredeen and Jonsson, 1957)
Body length : backfat thickness.
Growth rate : feed efficiency.
Backfat thickness : feed efficiency.

Sheep (Morley, 1955)

Fleece weight : length of wool.
Fleece weight : crimps per inch.
Fleece weight : body weight.

Poultry (Dickerson, 1957)

Body weight : egg-production.

(at 18 weeks) (to 72 weeks of age)

Body weight : egg weight.

(at 18 weeks)

Body weight : age at first egg. - -30 -29 - -50

(at 18 weeks)

Mice (Falconer, 1954&)

Body weight : tail length (within litters). -44 -59 -34

Drosophila melanogaster

Bristle number, abdominal : sternopleural. -06 -08 -04

(Clayton, Knight, Morris, and Robert-
son, 1957)

Number of bristles on different abdominal

segments. (Reeve and Robertson, 1954) — '96 -05

Thorax length : wing length.

(Reeve and Robertson, 1953) — 75 '5°

•93
-•14

•23

•85

-•20

•26

.96

-•10

•22

-•24

-.84

•31

-•47
-•96

•28

-•01

-•50
.32

.30

-•21
.36

-•02
- -II

1-17

•10

1-05

•09

-•l6

•18

•l6

•50

-•05

Chap. 19] GENETIC AND ENVIRONMENTAL CORRELATIONS 317

terms of causal components of covariance is exactly the same as that
of the components of variance given in Table 10.4. Thus, in an
analysis of half-sib families the component of covariance between
sires estimates \covAy i.e. one quarter of the covariance of breeding
values of the two characters. For the estimation of the correlation
the components of variance of each character are also needed. Thus
the between-sire components of variance estimate laAX and \v\Y-
Therefore the genetic correlation is obtained as

s/varx varY

where var and cov refer to the components of variance and covariance.
The offspring-parent relationship can also be used for estimating
the genetic correlation. To estimate the heritability of one character
from the resemblance between offspring and parents we compute
the covariance of offspring and parent for the one character by
taking the product of the parent or mid-parent value and the mean
value of the offspring. To estimate the genetic correlation between
two characters we compute what might be called the "cross-
covariance," obtained from the product of the value of X in parents
and the value of Y in offspring. This "cross-co variance" is half the
genetic covariance of the two characters, i.e. \covA. The covariances
of offspring and parents for each of the characters separately are also
needed, and then the genetic correlation is given by

Cm^=^ (19-3)

vCOVxx covYy

where cov^y is the "cross-covariance," and covXx and covYy are tne
offspring-parent covariances of each character separately.
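
Both estimators described above have the same form, a covariance divided by the geometric mean of two variances or covariances, because the fractions one quarter or one half cancel. The sketch below shows this with a single helper. All the numbers are invented for illustration, and the function names covariance and genetic_correlation are hypothetical; a real analysis would obtain the half-sib components from an analysis of variance and covariance, which is not shown here.

```python
from math import sqrt


def covariance(a, b):
    """Sample covariance of two equally long sequences of paired values."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / (n - 1)


def genetic_correlation(cov_xy, comp_x, comp_y):
    """cov_xy / sqrt(comp_x * comp_y): the common form of the two estimators."""
    return cov_xy / sqrt(comp_x * comp_y)


# Half-sib analysis: invented between-sire components of covariance
# and of variance for the two characters.
r_a_half_sib = genetic_correlation(cov_xy=0.5, comp_x=2.0, comp_y=1.0)

# Offspring-parent data: invented parental values and offspring means.
parent_x = [1.0, 2.0, 3.0, 4.0, 5.0]
parent_y = [2.0, 1.0, 4.0, 3.0, 5.0]
offspring_x = [1.5, 2.0, 2.5, 3.5, 4.0]
offspring_y = [2.0, 2.0, 3.0, 3.0, 4.0]

cross_cov = covariance(parent_x, offspring_y)   # X in parents, Y in offspring
cov_xx = covariance(parent_x, offspring_x)      # offspring-parent, character X
cov_yy = covariance(parent_y, offspring_y)      # offspring-parent, character Y
r_a_offspring_parent = genetic_correlation(cross_cov, cov_xx, cov_yy)
print(round(r_a_half_sib, 3), round(r_a_offspring_parent, 3))
```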

The genetic correlation can also be estimated from responses to
selection in a manner analogous to the estimation of realised herit-
ability. This will be explained in the next section.

Data that provide estimates of genetic correlations provide also
estimates of the heritabilities of the correlated characters, and of the
phenotypic correlations. The environmental correlation can then be
found from equation 19.1. If highly inbred lines are available the
environmental correlations can be estimated directly from the
phenotypic correlation within the lines, or preferably within the F1's
of crosses between the lines.

Estimates of genetic correlations are usually subject to rather
large sampling errors and are therefore seldom very precise. The
sampling variance of genetic correlations is treated by Reeve (1955^)
and by A. Robertson (1959b). The standard error of an estimate
is given approximately by the following formula:

$$ \sigma_{(r_A)} \approx \frac{1 - r_A^2}{\sqrt{2}} \sqrt{\frac{\sigma_{(h^2_X)}\,\sigma_{(h^2_Y)}}{h^2_X\,h^2_Y}} $$

where $\sigma$ denotes standard error. Since the standard errors of the two
heritabilities appear in the numerator, an experiment designed to
minimise the sampling variance of an estimate of heritability, in the
manner described in Chapter 10, will also have the optimal design for
the estimation of a genetic correlation.
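
A rough idea of the precision to be expected can be had by putting numbers into an approximation of this kind. The sketch below assumes that the approximate standard error takes the form (1 - r_A^2)/sqrt(2) multiplied by the square root of the product of the two heritability standard errors over the product of the two heritabilities. That form is an assumption of this sketch, as are the function name and the input values, which are invented.

```python
from math import sqrt


def se_genetic_correlation(r_a, h2_x, h2_y, se_h2_x, se_h2_y):
    """Approximate standard error of r_A under the form assumed above."""
    return (1.0 - r_a ** 2) / sqrt(2.0) * sqrt((se_h2_x * se_h2_y) / (h2_x * h2_y))


# Invented inputs: r_A = 0.5, heritabilities 0.4 and 0.3, each with s.e. 0.1.
se_r_a = se_genetic_correlation(0.5, 0.4, 0.3, 0.1, 0.1)
print(round(se_r_a, 3))
```

Under this form, halving both heritability standard errors halves the standard error of the genetic correlation, which is why the two kinds of estimate share an optimal design.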

Correlated Response to Selection

The next problem for consideration concerns the response to
selection: if we select for character X, what will be the change of the
correlated character Y? The expected response of a character, Y,
when selection is applied to another character, X, may be deduced in
the following way. The response of character X — i.e. the character
directly selected — is equivalent to the mean breeding value of the
selected individuals. This was explained in Chapter 11. The conse-
quent change of character Y is therefore given by the regression of the
breeding value of Y on the breeding value of X. This regression is

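$$ b_{(A)YX} = \frac{\mathrm{cov}_A}{\sigma^2_{AX}} = r_A \frac{\sigma_{AY}}{\sigma_{AX}} $$

The response of character X, directly selected, by equation 11.4, is

$$ R_X = i h_X \sigma_{AX} $$

Therefore the correlated response of character Y is

$$ CR_Y = b_{(A)YX} R_X = i h_X \sigma_{AX}\, r_A \frac{\sigma_{AY}}{\sigma_{AX}} = i h_X r_A \sigma_{AY} \qquad (19.4) $$

Or, by putting $\sigma_{AY} = h_Y \sigma_{PY}$, the correlated response becomes

$$ CR_Y = i h_X h_Y r_A \sigma_{PY} \qquad (19.5) $$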
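
The prediction of a correlated response can be sketched in a few lines. The functions below code the direct response, in the form R = i h^2 sigma_P (equivalent to i h sigma_A), and the correlated response in the form i h_X h_Y r_A sigma_PY. The function names and all parameter values are invented for illustration.

```python
from math import sqrt


def direct_response(i, h2, sd_p):
    """Direct response R = i * h^2 * sigma_P, equal to i * h * sigma_A."""
    return i * h2 * sd_p


def correlated_response(i, h2_x, h2_y, r_a, sd_p_y):
    """Correlated response of Y to selection on X: i h_X h_Y r_A sigma_PY."""
    return i * sqrt(h2_x) * sqrt(h2_y) * r_a * sd_p_y


# Invented values: intensity 1.0, heritabilities 0.25 (X) and 0.36 (Y),
# genetic correlation 0.5, phenotypic sd of X = 1.0 and of Y = 2.0.
r_x = direct_response(1.0, 0.25, 1.0)
cr_y = correlated_response(1.0, 0.25, 0.36, 0.5, 2.0)
print(round(r_x, 3), round(cr_y, 3))
```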

Thus the response of a correlated character can be predicted if
the genetic correlation and the heritabilities of the two characters are
known. And, conversely, if the correlated response is measured by
experiment, and the two heritabilities are known, the genetic corre-
lation can be estimated. If the heritability of character Y is to be
estimated as the realised heritability from the response to selection,
then it is necessary to do a double selection experiment. Character X
is selected in one line and character Y in another. Then both the
direct and the correlated responses of each character can be measured.
This type of experiment provides two estimates of the genetic corre-
lation (by equation 19.5), one from the correlated response of each
character; and the two estimates should agree if the theory of correlated
responses expressed in equation 19.5 adequately describes the
observed responses (Falconer, 1954b). A joint estimate of the genetic
correlation can be obtained from such double selection experiments,
without the need for estimates of the heritabilities, from the following
formula which may be easily derived from equations 11.4 and 19.4:

$$ r_A^2 = \frac{CR_X}{R_X} \cdot \frac{CR_Y}{R_Y} \qquad (19.6) $$
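
The joint estimate can be illustrated with invented responses. The sketch assumes that the same intensity of selection was applied in both lines, so that the heritabilities cancel when the ratio of correlated to direct response for one character is multiplied by the corresponding ratio for the other, leaving the square of the genetic correlation. The function name is hypothetical, and the sign of the estimate is taken from the sign of the ratios.

```python
from math import sqrt


def joint_genetic_correlation(r_x, cr_x, r_y, cr_y):
    """Estimate r_A from a double selection experiment.

    r_x, r_y   -- direct responses of X and Y in their own selection lines
    cr_x, cr_y -- correlated responses of X (in the Y line) and Y (in the X line)
    Assumes the product of the two ratios estimates r_A squared.
    """
    ratio_x = cr_x / r_x
    ratio_y = cr_y / r_y
    product = ratio_x * ratio_y
    if product < 0:
        raise ValueError("ratios differ in sign; no consistent estimate")
    sign = 1.0 if ratio_x >= 0 else -1.0
    return sign * sqrt(product)


# Invented responses, generated from i = 1, h2_X = 0.25, h2_Y = 0.36,
# r_A = 0.5, phenotypic sd of X = 1.0 and of Y = 2.0.
estimate = joint_genetic_correlation(r_x=0.25, cr_x=0.15, r_y=0.72, cr_y=0.30)
print(round(estimate, 3))
```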

Example 19.1. In a study of wing length and thorax length in Drosophila
melanogaster, Reeve and Robertson (1953) estimated the genetic
correlation between these two measures of body size from the responses to
selection. There were two pairs of selection lines; one pair was selected for
increased and for decreased thorax length, and the other pair for increased
and for decreased wing length. In each line the correlated response of the
character not directly selected was measured, as well as the response of the
character directly selected. Two estimates of the genetic correlation were
obtained by equation 19.6, one from the responses to upward selection and
the other from the responses to downward selection. In addition, estimates
of the genetic correlation in the unselected population were obtained from
the offspring-parent covariance and also from the full-sib covariance. The
four estimates were as follows:

Method                   Genetic correlation

Offspring-parent         0.74
Full sib                 0.75
Selection, upward        0.71
Selection, downward      0.73

The agreement between the estimates from selection and the estimates
from the unselected population shows that the correlated responses were
very close to what would have been predicted from the genetic analysis of
the unselected population.

Close agreement between observed and predicted correlated
responses, such as was shown in the above example, cannot always be
expected, particularly if the genetic correlation is low. With a low
genetic correlation the expected response is small and is liable to be
obscured by random drift (see Clayton, Knight, Morris and Robert-
son, 1957). Also, if the genetic correlation is to any great extent
caused by linkage, it is likely to diminish in magnitude through
recombination, with a consequent diminution of the correlated
response. There has not yet been enough experimental study of
correlated responses to allow us to draw any conclusions about the
number of generations over which they continue, nor about the total
response when the limit is reached.

Indirect selection. Consideration of correlated responses sug-
gests that it might sometimes be possible to achieve more rapid pro-
gress under selection for a correlated response than from selection for
the desired character itself. In other words, if we want to improve
character X, we might select for another character, Y, and achieve
progress through the correlated response of character X. We shall
refer to this as "indirect" selection; that is to say, selection applied to
some character other than the one it is desired to improve. And we
shall refer to the character to which selection is applied as the
"secondary" character. The conditions under which indirect selec-
tion would be advantageous are readily deduced. Let Rx be the
direct response of the desired character, if selection were applied
directly to it. And let CRX be the correlated response of character X
resulting from selection applied to the secondary character, Y. The
merit of indirect selection relative to that of direct selection may then
be expressed as the ratio of the expected responses, $CR_X/R_X$. Taking
the expected correlated response from equation 19.4 and the expected
direct response from equation 11.4, we find

$$ \frac{CR_X}{R_X} = \frac{i_Y h_Y r_A \sigma_{AX}}{i_X h_X \sigma_{AX}} = r_A \frac{i_Y}{i_X} \frac{h_Y}{h_X} $$

If the same intensity of selection can be achieved when selecting for
character Y as when selecting for character X, then the correlated
response will be greater than the direct response if $r_A h_Y$ is greater
than $h_X$. Therefore indirect selection cannot be expected to be
superior to direct selection unless the secondary character has a
substantially higher heritability than the desired character, and the
genetic correlation between the two is high; or, unless a substantially
higher intensity of selection can be applied to the secondary than to
the desired character. The circumstances most likely to render
indirect selection superior to direct selection are chiefly concerned
with technical difficulties in applying selection directly to the desired
character. Two such technical difficulties may be mentioned briefly.

1. If the desired character is difficult to measure with precision,
the errors of measurement may so reduce the heritability that indirect
selection becomes advantageous. Threshold characters in general are
likely for this reason to repay a search for a suitable correlated charac-
ter, unless the position of the threshold can be adjusted in the manner
described in the last chapter. An interesting experimental result which
may well prove to be an example of indirect selection being superior
to direct selection concerns sex ratio in mice. The sex ratio among
the progeny may be regarded as a metric character of the parents.
Selection applied directly to sex ratio was ineffective in changing it
(Falconer, 1954c), but selection for blood-pH produced a correlated
change of sex ratio (Weir and Clark, 1955; Weir, 1955). The reason
for the ineffectiveness of direct selection is probably that the true sex
ratio of a family is subject to a large error of estimation resulting
from the sampling variation, and the heritability is consequently very
low.

2. If the desired character is measurable in one sex only, but the
secondary character is measurable in both, then a higher intensity of
selection will be possible by indirect selection. Other things being
equal, the intensity of selection would be twice as great by indirect
as by direct selection; but a better plan would be to select one sex
directly for the desired character and the other indirectly for the
secondary character.
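
The condition under which indirect selection is advantageous can be explored numerically. The sketch below computes the ratio of the correlated to the direct response as the genetic correlation multiplied by the ratio of the intensities and by the ratio of the square roots of the heritabilities. The function name and the parameter values are invented for illustration.

```python
from math import sqrt


def relative_merit(r_a, h2_secondary, h2_desired, i_secondary=1.0, i_desired=1.0):
    """Ratio CR_X / R_X: response of the desired character X under
    selection for the secondary character Y, relative to direct selection."""
    return r_a * (i_secondary / i_desired) * sqrt(h2_secondary / h2_desired)


# Secondary character much more heritable and closely correlated.
favourable = relative_merit(r_a=0.8, h2_secondary=0.6, h2_desired=0.1)
# Equal heritabilities: indirect selection can only do worse.
unfavourable = relative_merit(r_a=0.5, h2_secondary=0.3, h2_desired=0.3)
# Equal heritabilities again, but twice the intensity of selection.
doubled_intensity = relative_merit(0.5, 0.3, 0.3, i_secondary=2.0)
print(round(favourable, 2), round(unfavourable, 2), round(doubled_intensity, 2))
```

A ratio above 1 indicates that indirect selection would be expected to give the faster progress under the stated values.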

Though indirect selection has been presented above as an alterna-
tive to direct selection, the most effective method in theory is neither
one nor the other but a combination of the two. The most effective use
that can be made of a correlated character is in combination with the
desired character, as an additional source of information about the
breeding values of individuals. This, however, is a special case of a
more general problem which will be dealt with in the final section of
this chapter. First we shall show how the idea of indirect selection
can be extended to cover selection in different environments.

Genotype-Environment Interaction

The concept of genetic correlation can be applied to the solution
of some problems connected with the interaction of genotype with
environment. The meaning of interaction between genotype and
environment was explained in Chapter 8, where it was discussed as a
source of variation of phenotypic values, which in most analyses is
inseparable from the environmental variance. The chief problem
which it raises and which we are now in a position to discuss concerns
adaptation to local conditions. The existence of genotype-environment
interaction may mean that the best genotype in one environment
is not the best in another environment. It is obvious, for
example, that the breed of cattle with the highest milk-yield in
temperate climates is unlikely also to have the highest yield in tropical
climates. But it is not so obvious whether smaller differences of en-
vironmental conditions also require locally adapted breeds; nor is it
intuitively obvious how much of the improvement made in one
environment will be carried over if the breed is then transferred to
another environment. These matters have an important bearing on
breeding policy. If selection is made under good conditions of feeding
and management on the best farms and experimental stations, will
the improvement achieved be carried over when the later generations
are transferred to poorer conditions? Or would the selection be
better done in the poorer conditions under which the majority of
animals are required to live? The idea of genetic correlation provides
the basis for a solution of these problems in the following way.

A character measured in two different environments is to be
regarded not as one character but as two. The physiological mechan-
isms are to some extent different, and consequently the genes re-
quired for high performance are to some extent also different. For
example, growth rate on a low plane of nutrition may be principally
a matter of efficiency of food-utilisation, whereas on a high plane of
nutrition it may be principally a matter of appetite. By regarding
performance in different environments as different characters with
genetic correlation between them we can in principle solve the problems
outlined above from a knowledge of the heritabilities of the
different characters and the genetic correlations between them
(Falconer, 1952). If the genetic correlation is high, then performance
in two different environments represents very nearly the same
character, determined by very nearly the same set of genes. If it is
low, then the characters are to a great extent different, and high
performance requires a different set of genes. Here we shall con-
sider only two environments, but the idea can be extended to an
indefinite number of different environments (A. Robertson,

Let us consider the problem of the "carry-over" of the improvement
from one environment to another. Let us suppose that we
select for character X — say growth rate on a high plane of nutrition —
and we look for improvement in character Y — say growth rate on a
low plane of nutrition. The improvement of character Y is simply a
correlated response and the expected rate of improvement was given
in equation 19.5 as

$$CR_Y = i\,h_X h_Y r_A \sigma_{PY}$$

The improvement of performance in an environment different from
the one in which selection was carried out can therefore be predicted
from a knowledge of the heritability of performance in each environ-
ment and the genetic correlation between the two performances. We
can also compare the improvement expected by this means with that
expected if we had selected directly for character Y, i.e. for perfor-
mance in the environment for which improvement is wanted. This
is simply a comparison of indirect with direct selection, which was
explained in the previous section. The comparison is made from the
ratio of the two expected responses given in equation 19.7, i.e.

$$r_A \frac{i_X h_X}{i_Y h_Y}$$

This shows how much we may expect to gain or lose by carrying out
the selection in some environment other than the one in which the
improved population is required to live. If we assume that the in-
tensity of selection is not affected by the environment in which the
selection is carried out, then the indirect method will be better if
$r_A h_X$ is greater than $h_Y$, where $h_X$ is the square root of the heritability
in the environment in which selection is made, and $h_Y$ is the square
root of the heritability in the environment in which the population is
required subsequently to live. If the genetic correlation is high, then
the two characters can be regarded as being substantially the same;
and if there are no special circumstances affecting the heritability or
the intensity of selection it will make little difference in which en-
vironment the selection is carried out. But if the genetic correlation
is low, then it will be advantageous to carry out the selection in the
environment in which the population is destined to live, unless the
heritability or the intensity of selection in the other environment is
very considerably higher.
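
The comparison of selection in the two environments can be sketched numerically. What follows is only a minimal illustration, assuming the correlated response of equation 19.5, the ratio of equation 19.7, and the usual expression for the direct response, R = i h^2 sigma_P. The function names and every numerical value are hypothetical, chosen merely to show the arithmetic.

```python
import math


def correlated_response(i_x, h2_x, h2_y, r_a, sd_p_y):
    """Expected change in Y when selection is applied to X (equation 19.5)."""
    return i_x * math.sqrt(h2_x) * math.sqrt(h2_y) * r_a * sd_p_y


def direct_response(i_y, h2_y, sd_p_y):
    """Expected change in Y when selection is applied to Y itself."""
    return i_y * h2_y * sd_p_y


def indirect_to_direct_ratio(i_x, h2_x, i_y, h2_y, r_a):
    """Ratio CR_Y / R_Y, i.e. r_A * (i_X h_X) / (i_Y h_Y) (equation 19.7)."""
    return r_a * (i_x * math.sqrt(h2_x)) / (i_y * math.sqrt(h2_y))


# Hypothetical figures: X is growth on a high plane of nutrition,
# Y is growth on a low plane; the same intensity is assumed in both.
i = 1.4          # intensity of selection (assumed equal in both environments)
h2_x = 0.49      # heritability in the environment of selection
h2_y = 0.25      # heritability in the environment of use
r_a = 0.6        # genetic correlation between the two performances
sd_p_y = 2.0     # phenotypic standard deviation of Y

cr_y = correlated_response(i, h2_x, h2_y, r_a, sd_p_y)
r_y = direct_response(i, h2_y, sd_p_y)
ratio = indirect_to_direct_ratio(i, h2_x, i, h2_y, r_a)

# With equal intensities the indirect method wins only if r_A * h_X > h_Y.
indirect_is_better = r_a * math.sqrt(h2_x) > math.sqrt(h2_y)
print(cr_y, r_y, ratio, indirect_is_better)
```

With these invented figures the product of the genetic correlation and h_X falls short of h_Y, so selection in the environment of use would be expected to give the greater improvement, in spite of the higher heritability in the other environment.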

This is the theoretical basis for dealing with selection in different
environments. So far, however, there has been little experimental
work to substantiate the theory. The results of the experiments that
have been carried out do not appear to be fully in agreement with
theoretical expectations, and this suggests that other factors not yet
understood are probably operating. (See Falconer and Latyszewski,
1952; Falconer, 1952.)

Simultaneous Selection for more than one Character

When selection is applied to the improvement of the economic
value of animals or plants it is generally applied to several characters
simultaneously and not just to one, because economic value depends
on more than one character. For example, the profit made from a
herd of pigs depends on their fertility, mothering ability, growth
rate, efficiency of food-utilisation, and carcass qualities. How, then,
should selection be applied to the component characters in order to
achieve the maximum improvement of economic value? There are
several possible procedures. One might select in turn for each
character singly ("tandem" selection); or one might select for all the
characters at the same time but independently, rejecting all individuals
that fail to come up to a certain standard for each character regardless
of their values for any other of the characters ("independent culling
levels"). It has been shown, however, that the most rapid improvement
of economic value is expected from selection applied simultaneously
to all the component characters together, appropriate weight
being given to each character according to its relative economic
importance, its heritability and the genetic and phenotypic correlations
between the different characters (Hazel and Lush, 1942;
Hazel, 1943). The practice of selection for economic value is thus a
matter of some complexity. The component characters have to be
combined together into a score, or index, in such a way that selection
applied to the index, as if the index were a single character, will yield
the most rapid possible improvement of economic value. If the
characters are uncorrelated there is no great problem: each character
is weighted by the product of its relative economic value and its
heritability. This is the best that can be done in the absence of
information about the genetic correlations, but if the genetic correla-
tions are known the efficiency of the index can be improved. The
following account gives an outline of the principles on which the
construction of a selection index is based. For a fuller account the
reader should consult Lerner (1950) and the original papers of
Fairfield Smith (1936) and Hazel (1943).

For the sake of simplicity we shall consider only two component
characters of economic value, but the conclusions can readily be ex-
tended to any number of characters. Let the economic value be
determined by two characters X and Y, and let w be the additional
profit expected from one unit increase of Y relative to that from one
unit increase of X. The aim of selection therefore is to pick out
individuals with the highest values of $(A_X + wA_Y)$, where $A_X$ and $A_Y$
are the breeding values of the two characters X and Y. Let us call
this compound breeding value "merit," with the symbol H, so that

$$H = A_X + wA_Y \qquad (19.8)$$

The problem is to find out how the phenotypic values, Px and PY, of
the two component characters are to be combined into an index that
gives the best estimate of an individual's merit, H. In Chapter 10 we
saw how the best estimate of the breeding value of an individual for
one character is the regression equation $A = b_{AP}P$, where $b_{AP}$ is the
regression of breeding value on phenotypic value, and is equal to the
heritability (see p. 166). The present problem is essentially the same,
only now we have to use partial regression coefficients. The multiple
regression equation giving the best estimate of merit is

$$H = b_{HX.Y}P_X + b_{HY.X}P_Y \qquad (19.9)$$

where Px and PY are phenotypic values measured as deviations from
the population mean. (In this formula, and in those that follow, the
symbol X has the same meaning as Px, i.e. the phenotypic value of
character X; and similarly Y and PY both mean the phenotypic value
of character Y. Thus, $b_{HX.Y}$ is the regression of merit on the phenotypic
value of X when the phenotypic value of Y is held constant, and
$b_{HY.X}$ has a similar meaning with X and Y interchanged.) In practice
it is convenient to have the index in a form that requires the manipu-
lation of only one of the phenotypic values, i.e. in the form

$$I = P_X + WP_Y \qquad (19.10)$$

where I is the index by means of which individuals are to be chosen,
and W is a factor by which the phenotypic value of character Y is
to be multiplied. Since the absolute magnitude of the index is of no
importance, but only its relative magnitude in different individuals,
we can work with the phenotypic values as they stand instead of with
deviations from the population mean. And we can put equation
19.9 into the form of equation 19.10 simply by dividing through by
$b_{HX.Y}$. Then W in equation 19.10 is the ratio of the two partial
regression coefficients,

$$W = \frac{b_{HY.X}}{b_{HX.Y}}$$

and our task now is to find a way of expressing W in terms of the
genetic properties of the two characters.

First let us put the partial regression coefficients in terms of the
total regression coefficients. For example,

$$b_{HY.X} = \frac{b_{HY} - b_{HX}\,b_{XY}}{1 - r_{XY}^2}$$

Therefore

$$W = \frac{b_{HY.X}}{b_{HX.Y}} = \frac{b_{HY} - b_{HX}\,b_{XY}}{b_{HX} - b_{HY}\,b_{YX}}$$

Now let us express these total regressions in terms of covariances and
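variances. For example, $b_{HY} = \mathrm{cov}_{HY}/\sigma_Y^2$. After some simplification
the expression reduces to

$$W = \frac{\sigma_X^2\,\mathrm{cov}_{HY} - \mathrm{cov}_{HX}\,\mathrm{cov}_{XY}}{\sigma_Y^2\,\mathrm{cov}_{HX} - \mathrm{cov}_{HY}\,\mathrm{cov}_{XY}} \qquad (19.11)$$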

The variances $\sigma_X^2$ and $\sigma_Y^2$ here, and in what follows, are the phenotypic
variances of characters X and Y. The covariances in the above
expression can be expressed in terms of the phenotypic variance and
the heritability of each character and of the phenotypic and genetic
correlations between the two characters, all of which quantities can
be estimated. Take, for example, the covariance of H with X. This
may be written as follows:

$$\begin{aligned}
\mathrm{cov}_{HX} &= \text{covariance of } (A_X + wA_Y) \text{ with } P_X \\
&= \mathrm{cov}(A_X, P_X) + \mathrm{cov}(wA_Y, P_X) \\
&= h_X^2\sigma_X^2 + w\,r_A h_X \sigma_X h_Y \sigma_Y
\end{aligned}$$

In this way the covariances in equation 19.11 can be expressed as
follows, $\sigma$ and $\sigma^2$ being phenotypic standard deviations and variances
throughout:

$$\begin{aligned}
\mathrm{cov}_{HX} &= h_X^2\sigma_X^2 + w\,r_A h_X h_Y \sigma_X \sigma_Y \\
\mathrm{cov}_{HY} &= w\,h_Y^2\sigma_Y^2 + r_A h_X h_Y \sigma_X \sigma_Y \\
\mathrm{cov}_{XY} &= r_P\,\sigma_X \sigma_Y
\end{aligned} \qquad (19.12)$$

The procedure for selection is thus to compute the covariances
given in 19.12, substitute them in 19.11 and use the value of W so
obtained to compute the index of selection I given in 19.10. The
value of the index for each individual then forms the basis of selection.
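
The procedure just described can be set out as a short computation. The sketch below is an illustration only: it assumes the covariances of 19.12, the weighting factor of 19.11 and the index of 19.10, and the function names and numerical values are hypothetical. Note that seven quantities enter the calculation: w, the two heritabilities, the two phenotypic standard deviations, and the genetic and phenotypic correlations.

```python
import math


def index_covariances(w, h2_x, h2_y, r_a, r_p, sd_x, sd_y):
    """Covariances of 19.12; sd_x and sd_y are phenotypic standard deviations."""
    h_x, h_y = math.sqrt(h2_x), math.sqrt(h2_y)
    genetic_cov = r_a * h_x * h_y * sd_x * sd_y   # additive genetic covariance
    cov_hx = h2_x * sd_x ** 2 + w * genetic_cov
    cov_hy = w * h2_y * sd_y ** 2 + genetic_cov
    cov_xy = r_p * sd_x * sd_y
    return cov_hx, cov_hy, cov_xy


def index_weight(w, h2_x, h2_y, r_a, r_p, sd_x, sd_y):
    """Weighting factor W of equation 19.11."""
    cov_hx, cov_hy, cov_xy = index_covariances(
        w, h2_x, h2_y, r_a, r_p, sd_x, sd_y
    )
    numerator = sd_x ** 2 * cov_hy - cov_hx * cov_xy
    denominator = sd_y ** 2 * cov_hx - cov_hy * cov_xy
    return numerator / denominator


def selection_index(p_x, p_y, big_w):
    """Index I = P_X + W * P_Y of equation 19.10."""
    return p_x + big_w * p_y


# Hypothetical estimates for two characters X and Y.
W = index_weight(w=2.0, h2_x=0.4, h2_y=0.2, r_a=0.3, r_p=0.1,
                 sd_x=3.0, sd_y=5.0)

# Hypothetical phenotypic values (P_X, P_Y) of three individuals.
individuals = [(52.0, 110.0), (55.0, 101.0), (49.0, 118.0)]
ranked = sorted(individuals,
                key=lambda p: selection_index(p[0], p[1], W),
                reverse=True)
print(W, ranked)
```

When both correlations are set to zero the weighting factor reduces to w multiplied by the ratio of the heritabilities of Y and X, which agrees with the rule for uncorrelated characters stated earlier in this section.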

The index as formulated above is applicable only to individual
selection. If family selection is applied then the heritabilities and
correlations that go into the index must be those appropriate to the
family means. Family selection, however, is not greatly improved by
the use of an index, because the family heritabilities of the component
characters are generally fairly high and the mean economic value of a
family in terms of phenotypic values is not very different from its
merit in terms of breeding values. Therefore family selection for
economic value can be applied with little loss of efficiency if the
phenotypic values are weighted only by w, the relative economic im-
portance of each component character.

The complexity of selection by means of an index need hardly be
emphasised, especially when the index is extended to cover many
component characters. Even with two characters, estimates of no
fewer than seven quantities are required for the construction of the
index. Since some of these, particularly the genetic correlation,
cannot usually be estimated with any great precision, the index
cannot be regarded as much more than a rough guide to procedure.
But since selection has to be applied to economic value by some
means, it seems better to use a selection index, however imprecise,
than to base selection on a purely arbitrary combination of component
characters.

Use of a secondary character by means of an index. The
selection index described above can readily be adapted to meet the
case where improvement of only one character is sought, the other
character being used merely as an aid to more efficient selection.
The use of a secondary character in this way was mentioned earlier,
in connexion with indirect selection. Let X be the character it is
desired to improve, and Y the secondary character. Then the
relative "economic" value of character Y is zero, and we can substitute
w = 0 in the formulae of 19.12. Substitution of the covariances in
equation 19.11 then yields a formula which on simplification reduces
to

$$W = \frac{\sigma_X\,(r_A h_Y - r_P h_X)}{\sigma_Y\,(h_X - r_A r_P h_Y)} \qquad (19.13)$$

The selection index of equation 19.10 is then used with this value of
W. The value of W in the index may be negative. This will arise if
the phenotypic correlation between the two characters is chiefly
environmental in origin. The secondary character then acts as an
indicator of the environmental deviation rather than of the breeding
value of the desired character (see Rendel, 1954; and Osborne,
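
The weighting of a secondary character can be illustrated in the same way. The sketch below assumes the simplified expression obtained by setting w = 0 in the covariances of 19.12 and substituting them in 19.11; it also recomputes W from the covariances directly, as a check on the algebra. The function names and the numerical values are hypothetical, the values being chosen so that the phenotypic correlation is large relative to the genetic correlation.

```python
import math


def secondary_weight(h2_x, h2_y, r_a, r_p, sd_x, sd_y):
    """W for a secondary character Y of no economic value (w = 0)."""
    h_x, h_y = math.sqrt(h2_x), math.sqrt(h2_y)
    return (sd_x * (r_a * h_y - r_p * h_x)) / (sd_y * (h_x - r_a * r_p * h_y))


def secondary_weight_from_covariances(h2_x, h2_y, r_a, r_p, sd_x, sd_y):
    """The same W, computed from the covariances with w = 0."""
    h_x, h_y = math.sqrt(h2_x), math.sqrt(h2_y)
    cov_hx = h2_x * sd_x ** 2
    cov_hy = r_a * h_x * h_y * sd_x * sd_y
    cov_xy = r_p * sd_x * sd_y
    return ((sd_x ** 2 * cov_hy - cov_hx * cov_xy)
            / (sd_y ** 2 * cov_hx - cov_hy * cov_xy))


# Hypothetical case: weak genetic correlation, strong phenotypic correlation,
# so that the correlation is chiefly environmental in origin.
args = dict(h2_x=0.25, h2_y=0.25, r_a=0.1, r_p=0.5, sd_x=2.0, sd_y=4.0)
W_closed = secondary_weight(**args)
W_check = secondary_weight_from_covariances(**args)
print(W_closed, W_check)
```

In this invented case W is negative: an individual with a high value of the secondary character is presumed to owe part of its high value of the desired character to a favourable environment, and is marked down accordingly.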

Genetic correlation and the selection limit. There is one
important consequence of simultaneous selection for several charac-
ters to be discussed before we leave the subject. Just as the herit-
abilities are expected to change after selection has been applied for
some time, so also are the genetic correlations. If selection has been
applied to two characters simultaneously the genetic correlation
between them is expected eventually to become negative, for the
following reason. Those pleiotropic genes that affect both characters
in the desired direction will be strongly acted on by selection and
brought rapidly toward fixation. They will then contribute little to
the variances or to the covariance of the two characters. The pleio-
tropic genes that affect one character favourably and the other ad-
versely will, however, be much less strongly influenced by selection
and will remain for longer at intermediate frequencies. Most of the
remaining covariance of the two characters will therefore be due to
these genes, and the resulting genetic correlation will be negative.
The consequence of a negative genetic correlation, whether produced
by selection in this way or present from the beginning, is that the two
characters may each show a heritability that is far from zero, and yet
when selection is applied to them simultaneously neither responds.
We have already discussed, in Chapter 12, what is essentially the
same situation resulting from the combined effects of artificial and
natural selection: a selection limit is reached even though the charac-
ter to which artificial selection is applied still shows a substantial
amount of additive genetic variance.

Example 19.2. A practical example from a commercial flock of poultry
is described by Dickerson (1955). Selection for economic value had been
applied for many years, but recent progress in the component characters
was much less than was to be expected from their heritabilities, which were
found to be moderately high. Estimations of the genetic correlations
between the component characters showed that many of these were nega-
tive. To take just one example, the relationships between egg-production
and egg weight were as follows:

                               Production    Weight
Heritability                      0.32        0.59
Phenotypic correlation                  0.04
Genetic correlation                    -0.39
Environmental correlation              +0.25

In spite of the high heritabilities neither character had shown any improve-
ment over the last 10-15 years. The high negative genetic correlation
would account for this failure to respond, if selection was applied to both
characters simultaneously. It is interesting to note that environmental
variation, unlike genetic variation, affects both characters in the same way
and leads to a positive environmental correlation. The phenotypic cor-
relation, which is almost zero, gives no clue to the genetic relationship
between the two characters, and the failure to respond to selection could
not have been predicted from it alone.

A population which has been subjected over a long period to
selection for economic value throws light on the genetic properties to
be expected in natural populations subject to natural selection for
fitness. Fitness is a compound character with many components —
far more than would appear in the most elaborate assessment of
economic value — and so we should expect negative genetic correla-
tions between its major components, a conclusion to be developed
further in the next chapter. It is interesting to note, however, that
natural selection takes no account of heritabilities or genetic correla-
tions, and is therefore, in theory, less efficient in improving fitness
than artificial selection by means of an index is in improving economic
value.

CHAPTER 20

METRIC CHARACTERS UNDER
NATURAL SELECTION

Throughout the discussion of the genetic properties of metric
characters, which has occupied the major part of the book, very little
attention has been given to the effects of natural selection, and some-
thing must now be done to remedy this omission. The absence of
differential viability and fertility was specified as a condition in the
theoretical development of the subject: that is to say, natural selection
was assumed to be absent. Though for many purposes this assump-
tion may lead to no serious error, a complete understanding of metric
characters will not be reached until the effects of natural selection
can be brought into the picture. The operation of natural selection
on metric characters has, however, a much wider interest than just
as a complication that may disturb the simple theoretical picture and
the predictions based on it. It is to natural selection that we must look
for an explanation of the genetic properties of metric characters which
hitherto we have accepted with little comment. The genetic pro-
perties of a population are the product of natural selection in the past,
together with mutation and random drift. It is by these processes
that we must account for the existence of genetic variability; and it is
chiefly by natural selection that we must account for the fact that
characters differ in their genetic properties, some having propor-
tionately more additive variance than others, some showing in-
breeding depression while others do not. These, however, are very
wide problems which are still far from solution, and in this con-
cluding chapter we can do little more than indicate their nature. Any
discussion of them, moreover, cannot but be controversial; the reader
should therefore understand that the contents of this chapter are to a
large extent matters of personal opinion, and that any conclusions to
which the discussion may lead are open to dispute.

We shall refer throughout to a population that is in genetic equil-
ibrium. Being in genetic equilibrium means that the gene frequencies
are not changing, and therefore that the mean values of all metric
characters are constant. (Changes of environmental conditions are
assumed to be so slow as to be negligible.) The population is con-
stantly subject to natural selection tending to increase fitness, but
despite the selection the gene frequencies do not change and fitness
does not improve. There can therefore be no additive genetic vari-
ance of fitness: in other words, if we could measure fitness itself as a
character we should find that its genetic variance was entirely non-
additive. For the purposes of discussion we may regard any natural
population as being in genetic equilibrium, at least approximately,
and also any population that has been subject to artificial selection
consistently over a long period of time, provided that fitness is
defined in terms of both the artificial and the natural selection.
Fitness, crudely defined, is the "character" selected for, whether by
natural selection alone or by artificial and natural selection com-
bined.

If a population is in genetic equilibrium it follows that a reduction
of fitness must in principle result from any change in the array of gene
frequencies, apart from any genes that may have no effect on fitness.
Natural selection must therefore be expected to resist any tendency
to change of the gene frequencies, such as must result from artificial
selection applied to any metric character other than fitness itself.
This principle has been called "genetic homeostasis," and its consequences
have been discussed by Lerner (1954). Thus if we change
any metric character by artificial selection we must expect a reduction
of fitness as a correlated response. And if we then suspend the arti-
ficial selection before any of the variation has been lost by fixation,
we must expect the population mean to revert to its original value.
On the whole, the experience of artificial selection is in general
agreement with this expectation, though under laboratory conditions
the reduction of fitness may not be apparent in the early stages, and
some characters appear to revert very slowly, if at all, toward the
original value. Our domesticated animals and plants are perhaps the
best demonstration of the effects of the principle. The improvements
that have been made by selection in these have clearly been accom-
panied by a reduction of fitness for life under natural conditions, and
only the fact that domestic animals and plants do not have to live
under natural conditions has allowed these improvements to be
made. The problems for discussion in this chapter must be seen
against the background of this principle: that the existing array of
gene frequencies, and consequently the existing genetic properties of
the population, represent the best total adjustment to existing con-
ditions that is possible with the available genetic variation.

The problem of how natural selection operates on metric charac-
ters has two aspects: the relation between any particular metric
character and fitness, and the way in which natural selection operates
on the individual loci concerned with a metric character. This latter
aspect is part of a wider problem which concerns the reasons for the
existence of genetic variation. We shall discuss these two aspects
separately, because any conclusions that may be drawn about the
second will depend on what can be discovered about the first.

Relation of Metric Characters to Fitness

The fitness of an individual is the final outcome of all its develop-
mental and physiological processes. The differences between indi-
viduals in these processes are seen in variation of the measurable
attributes which can be studied as metric characters. Thus the
variation of each metric character reflects to a greater or lesser degree
the variation of fitness; and the variation of fitness can theoretically
be broken down into variation of metric characters. Let us consider
for example a mammal such as the mouse, because this matter is more
easily discussed in concrete terms. Fitness itself might be broken
down into two or three major components, which could be measured
and studied as metric characters. These might be the total number
of young reared, and some measure of the quality of the young, such
as their weaning weight. The variation of the major components
would account for all the variation of fitness. Each of the major com-
ponents might be broken down into other metric characters which
would account for all their variation. Thus the total number of
young weaned depends on the viability of the parent up to breeding
age, its mating ability, average litter size, frequency of litters, and
longevity. These characters in turn might be further broken down.
For example, litter size depends on the number of eggs shed and the
proportion that are brought to term. The number of eggs shed
depends, again, on body size and endocrine activity, among other
things. Thus each metric character has its place in one of a series of
chains of causation converging toward fitness. And these chains of
causation interconnect one with another: body size, for example,
influences not only litter size, but also lactation, longevity, and probably
many other characters. The relationship between any particular
metric character and fitness is thus a very complicated matter. The
following discussion of the problem is based largely on the ideas put
forward by A. Robertson (1955b).

The way in which natural selection operates on a character
depends on the part played by the character in the causation of differ-
ences of fitness: that is to say, on the manner and degree by which
differences of value of the metric character cause differences of fitness.
This we shall refer to as the "functional relationship" between the
character and fitness. The functional relationship expresses the
mode of operation of natural selection on the metric character; but it
is not necessarily also the relationship that would be revealed if we
could measure the fitness of individuals and compare their fitness with
their values for the metric character. This point, however, will be
more easily explained by an example to be given in a moment.
Different characters must be expected to have different functional
relationships with fitness, according to the nature of the character.
In explanation of the kinds of relationship that may be envisaged let
us take some examples of different sorts of character at different
positions in the chain of causation.

1. Neutral characters. There may be some characters that have
no functional relationship at all with fitness. This does not mean
that, like vestigial organs, they have no function or use. It means that
the variation in the character is not a cause of variation of fitness.
Abdominal bristle number in Drosophila may be taken as an example
of a character which is probably not far from this state, and two
reasons can be given for regarding it thus. First, it is difficult to
conceive of any biological reason why it should be important to have
18 bristles, or thereabouts, on each segment rather than more or
fewer. And second, if we change the bristle number by artificial
selection and then suspend the selection, the mean bristle number
does not return to its original value — or returns only very slowly —
under the influence of natural selection, even though it could be
brought back rapidly by artificial selection (Clayton, Morris, and
Robertson, 1957). In other words, genetic homeostasis in respect of
bristle number is weak or non-existent. Such a metric character may
be termed "neutral" with respect to fitness. The mean value of a
neutral character in the population has little or nothing to do with the
character itself, but is the outcome of the pleiotropic effects of the
genes whose frequencies are controlled by their effects on other
characters. Though a neutral character has no functional relationship
with fitness, we may nevertheless find that individuals with different
values do in fact differ in fitness in a regular way. If the genetic
variance of the character is predominantly additive then individuals
with intermediate values will tend on the whole to be heterozygous at
more loci than individuals with extreme values. Then if hetero-
zygotes were superior in fitness for some other reason, unconnected
with the character in question, this would result in intermediates
being superior in fitness. At the level of observation there would be
a relationship between values of the character and fitness, but this
would not be a functional relationship because the values of the
character are not the cause of the differences of fitness. The differ-
ences of fitness are the result of the functional relationships of other
characters affected by the pleiotropic action of the genes.

2. Characters with intermediate optima. There are some
characters for which an intermediate value is optimal for functional
reasons. One might distinguish three sorts of intermediate opti-
mum according to the reasons for intermediates being superior in
fitness.

(i) Optima determined by the character itself. As an example we
might take any character that measures the thermal insulation of a
mammalian coat. Too dense a coat would be disadvantageous and so
would too sparse a coat. An intermediate density would confer the
highest fitness as a consequence of the function of the coat in thermo-
regulation. For such a character the mean value in the population is
the optimal value, provided there are no complications of the sort to
be considered later. Though irrefutable biological reasons might be
given for supposing that a character such as the density of fur has an
intermediate optimal value, we might nevertheless find that over the
range of variation covered by the population there was very little
variation in fitness. In practice therefore one could not expect always
to draw a clear line between this sort of character and a neutral
character such as we have taken bristle number to be.

(ii) Optima imposed by the environment. As an example we may
take the clutch size of birds. It has been shown, particularly for the
European robin and swift, that a larger number of young are reared
from nests containing the average number of eggs than from nests
with larger or smaller clutches (Lack, 1954). Thus individuals with
intermediate values appear to be the fittest. If a character such as
this has an optimal value that is intermediate there must obviously
be some other factor interacting with it to determine fitness; for,
otherwise, the individuals that lay more eggs must inevitably be the
fitter. The other factor in this case is the supply of insects for feeding
the young and the length of daylight available for their capture.
With characters of this sort natural selection tends to eliminate indi-
viduals with extreme values and favours individuals with intermediate
values. The mean value in the population is the optimal value under
the environmental conditions to which the population is subjected.
If the environment were to change, the population mean would
change too in adjustment to the new optimum. In the case of clutch
size it is noteworthy that the mean value varies with the latitude,
being larger in the north than in the south.

(iii) Optima imposed by a correlated character. Body size in mice
may be taken as an example. Larger mice have larger litters and,
under laboratory conditions, they rear more young. Therefore if
there were no other factor involved, larger mice would be fitter. Since
body size can, as we have seen, be readily increased by artificial
selection, there must be some other factor that prevents its being
increased by natural selection in the wild. The other factor in this case
is probably not environmental, but another character negatively
correlated with size, namely wildness. A change of body size under
artificial selection is always accompanied by a correlated change of
wildness. Large mice are phlegmatic and unreactive to disturbance,
whereas small mice are alert and react energetically to disturbance
(MacArthur, 1949; Falconer, 1953). Therefore under natural con-
ditions larger mice would more readily fall prey to cats and owls than
small mice, and the advantage of greater fertility would be offset by
the disadvantage of being less well fitted to escape predators. The
body size of wild mice, it may be suggested, represents the best
compromise between these two correlated characters. If we could
measure the relationship between size and fitness in wild mice we
should find that those of intermediate size were fittest. With charac-
ters of this sort also, the population mean represents the optimal
value. But this value is optimal not because of this character itself but
because of its genetic correlations with other characters. Large mice
are selected against not because they are large but because, being
large, they are inevitably also less wild. This example brings us to the
point mentioned at the end of the last chapter: that we must expect to
find negative genetic correlations between characters under simul-
taneous selection. In this case we find a negative genetic correlation
between large size and wildness, both of which may reasonably be
supposed to be favoured by natural selection. These two characters
are "components" of fitness in the same way that characters of econ-
omic importance are components of total economic merit. What
natural selection "aims at" is to increase both characters indefinitely,
but the physiological connexions between them, which we see as a
negative genetic correlation, limit the increase that is possible with
the existing genetic variability.
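The idea of a compromise between two opposed components of fitness can be illustrated numerically. The following Python sketch is purely hypothetical: the functions `fertility` and `escape_probability`, their linear form, and every coefficient in them are invented for illustration and are not estimates from mice. It shows only that when one component rises with size and the other falls, their product is greatest at an intermediate size.

```python
# Hypothetical illustration: fitness as the product of two components
# that change in opposite directions with body size.
# All functions and coefficients are invented for the example.

def fertility(size):
    """Relative number of young reared; assumed to rise linearly with size."""
    return 0.5 + 0.05 * size

def escape_probability(size):
    """Chance of escaping predators; assumed to fall linearly with size."""
    return 1.0 - 0.04 * size

def fitness(size):
    """Fitness taken as the product of the two components."""
    return fertility(size) * escape_probability(size)

# Body sizes from 0 to 20 arbitrary units, in steps of 0.5.
sizes = [step / 2 for step in range(0, 41)]

# The size with the highest fitness lies inside the range, not at an extreme.
best_size = max(sizes, key=fitness)
print("optimal size:", best_size)
print("fitness at smallest, optimal, largest size:",
      fitness(sizes[0]), fitness(best_size), fitness(sizes[-1]))
```

With other assumed functions the optimum would lie elsewhere; the point is only that neither component alone has an intermediate optimum, but the character that links them does.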

3. Major components of fitness. If we could measure fitness
itself — which is technically very difficult — we should obviously find
no "optimal" value; the individuals most favoured by natural
selection would not be those nearest to the population mean, but the
most extreme. In spite of the selection toward higher values the
mean fails to change under natural selection because there is no
additive variance of fitness. If we measure as a metric character
something that is a major component of fitness, in the sense that it
accounts for a large part of the variation of fitness, we should probably
find the same sort of relationship. Fitness would increase as the
value of the character increased. At the very highest values, however,
fitness would probably decline again slightly. Egg-laying in Droso-
phila might well be such a character, even if measured only over a
few days, since the daily egg production is highly correlated with the
total production (Gowen, 1952). We should almost certainly find
that the fittest individuals were not those that laid an intermediate
number of eggs, but those that laid almost the most. The most ex-
treme individuals would probably be slightly less fit because of some
environmental limitation or some correlated character, perhaps
longevity. There must be many characters whose relationships with
fitness fall between this and the previous type, characters with an
optimal value above the population mean but yet below that of the
most extreme individuals.

The foregoing discussion will be enough to explain the nature of
the problem of the relationship between a metric character and fitness
and to indicate the sort of solution that may be sought. Let us turn
now to the connexion between the relationship with fitness and the
nature of the genetic variation of a metric character. When we first
discussed the heritability as a property of a character in Chapter 10,
we noted a tendency toward lower heritabilities among characters
more closely connected with fitness. But the precise meaning of a
"close connexion" with fitness was not explained. It may now be
suggested that the meaning of a close connexion with fitness may
perhaps be seen in the functional relationships discussed above.
Characters with the closest connexion are of the third type where the
population mean is not at an optimal value; characters with a less
close connexion are nearer to the second type; while characters with
the least connexion are the neutral or nearly neutral characters. On
the whole it does seem that characters with high heritabilities are to
be found among the first type and characters with low heritabilities
among the third. Differences of heritability are, however, not really
relevant here. It is the genetic variance with which we are concerned;
and the differences in the proportion of the genotypic variance that
is additive, that we want to account for. But so little is known about
how the genotypic variance is partitioned into additive and non-
additive components that we can scarcely begin to tackle the prob-
lem. Four characters of Drosophila, however, seem to fit the picture
fairly well (see Table 8.2). For bristle number, which we have taken
as a neutral character of the first type, 85 per cent of the genotypic
variance is additive. Thorax length, which might perhaps be of the
second type, has about the same proportion. For ovary size, however,
only 43 per cent of the genotypic variance is additive, and this
character might well be between the second and third types. For
egg laying, which we have taken to be of the third type, the propor-
tion is 29 per cent. These comparisons, of course, cannot be given
much weight because in fact we know almost nothing of the func-
tional relations of the characters with fitness. But they do suggest
that the solution of the problem of why characters differ in their
genetic properties may lie along these lines. The reaction of a charac-
ter to inbreeding seems also to be connected with the proportion of
non-additive genetic variance, those with most non-additive variance
being those that suffer the greatest inbreeding depression. Some,
perhaps most, of the non-additive variance must be attributed to
dominance. Reasons for expecting the effects of genes on characters
closely connected with fitness to show dominance, while the effects on
characters not closely connected with fitness do not, have been put
forward by A. Robertson (1955b); but it would take too much space
here to summarise the argument. There we must leave the problem
of the nature of the genetic variance and pass on to the second aspect
of the operation of natural selection on metric characters.


Maintenance of Genetic Variation

The second aspect of the operation of natural selection on metric
characters — its effects on the individual loci — is part of a wider
problem, which concerns the mechanisms by which genetic variation
is maintained. Almost every metric character, of the many that have
been studied both in natural populations and in domesticated animals
and plants, exhibits genetic variation. What are the reasons for the
existence of this genetic variation? The coexistence in a population
of different alleles at a locus is governed by the three processes of
mutation, random drift, and selection. Allelic differences originate
by mutation and are extinguished by random drift, since no natural
population is infinite in size. Natural selection may tend to eliminate
the differences by favouring one allele over all others at a locus; or it
may tend to perpetuate the differences by favouring heterozygotes.
Let us discuss the role of natural selection first and the roles of muta-
tion and random drift later.

Effects of selection on individual loci. The way in which
selection operates on any locus depends on the effects that the differ-
ent alleles have on fitness itself, and not simply on their effects on one
particular metric character. Therefore the functional relations be-
tween characters and fitness, which were discussed above, can indi-
cate the action of selection only on those loci which affect fitness
through the character in question and not through any pleiotropic
effects on other characters. Let us consider the three types of
character in turn.

1. Neutral characters. If there are genes whose only effects are
on a neutral character, then selection plays no part in the existence of
allelic differences at these loci. The gene frequencies at these loci
must be controlled solely by mutation and random drift.

2. Characters with intermediate optima. The consequences of
selection favouring individuals of intermediate value have been
examined from different aspects by Wright (1935a, b), Haldane
(1954a), and by A. Robertson (1956) who reaches the following
conclusions. If the intermediate optimum is the result of the func-
tional relations of the character to fitness, and the optimum is deter-
mined by the character itself or by the environment, then selection
will tend toward fixation at all the loci whose only influence on fitness
is through the character in question. This would apply to characters
of type 2 (i) and (ii) described above and exemplified by the density
of mammalian fur and by clutch size in birds. Selection will thus
tend to eliminate rather than to conserve variability arising from loci
which affect fitness only through such characters. The rate at which
the gene frequencies are expected to change toward fixation is very
slow, and so the rate at which variation would be eliminated is also
very slow; but on an evolutionary time-scale it would not be negligible.
Characters of type 2 (iii), where an intermediate optimum is deter-
mined by a correlated character, have not yet been investigated in
this connexion, and the mode of operation of selection on loci that
affect them is not known.

3. Major components of fitness. The essential feature of a major
component of fitness is that the population mean is not at the opti-
mum. But we cannot deduce, from this fact alone, how selection
operates on the individual loci. If the genes that affect these charac-
ters are at intermediate frequencies, it seems most probable that they
are held there by selection favouring heterozygotes, because it seems
hardly possible that the coefficients of selection are small enough to
allow mutation alone to maintain intermediate frequencies. We do not
know, however, whether these genes are at intermediate frequencies.
It seems quite possible that a considerable portion of the genetic
variation of these characters is due to genes at very low frequencies,
where they are maintained by the balance between mutation and
selection against the recessive homozygotes. Much evidence, how-
ever, has been presented by Lerner (1954) in support of the view that
heterozygotes in general are superior in fitness; and Haldane (1954b)
has pointed out that a general superiority of heterozygotes is a very
reasonable expectation from biochemical considerations of gene
action. Though the matter is not yet settled, the weight of evidence
at present seems to point to superior fitness of heterozygotes, and
consequently to natural selection favouring heterozygotes at most of
the loci that affect fitness through its major components.
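The two alternatives considered in this paragraph, intermediate frequencies held by selection favouring heterozygotes and low frequencies held by mutation against selection, can be compared numerically. The Python sketch below assumes the standard single-locus scheme in which the fitnesses of A1A1, A1A2 and A2A2 are 1 - s1, 1 and 1 - s2, and the usual approximation for a fully recessive deleterious allele. The coefficients (s1 = 0.1, s2 = 0.3, u = 0.00001, s = 1) are hypothetical and chosen only for illustration.

```python
import math

def overdominant_equilibrium(s1, s2):
    """Equilibrium frequency q of A2 when the fitnesses of A1A1, A1A2, A2A2
    are 1 - s1, 1, 1 - s2, i.e. when the heterozygote is favoured."""
    return s1 / (s1 + s2)

def next_q(q, s1, s2):
    """Frequency of A2 after one generation of selection with random mating."""
    p = 1.0 - q
    mean_fitness = p * p * (1 - s1) + 2 * p * q + q * q * (1 - s2)
    return (p * q + q * q * (1 - s2)) / mean_fitness

def recessive_mutation_balance(u, s):
    """Approximate equilibrium frequency of a recessive deleterious allele
    arising at rate u and selected against with coefficient s."""
    return math.sqrt(u / s)

# Hypothetical selection coefficients against the two homozygotes.
S1, S2 = 0.1, 0.3

# Iterating selection from q = 0.5 approaches the predicted equilibrium.
q = 0.5
for _ in range(500):
    q = next_q(q, S1, S2)
q_hat = overdominant_equilibrium(S1, S2)
print("iterated q:", round(q, 6), " predicted q:", q_hat)

# A recessive lethal (s = 1) with a hypothetical mutation rate of 1 in 100,000.
q_rare = recessive_mutation_balance(1e-5, 1.0)
print("mutation-selection balance q:", round(q_rare, 5))
```

Under these assumed values heterozygote advantage holds the allele at an intermediate frequency, whereas the balance between mutation and selection holds it at a frequency of a fraction of one per cent.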

There are three other ways in which selection may influence
genetic variability, to be discussed before we leave the subject. They
are all subsidiary to the main effects on gene frequencies which we
have been discussing; they may modify these main effects, but they
do not in themselves provide a sufficient description of the operation
of natural selection.

Variable selection. If characters have optimal values these
optima are likely to vary from time to time and from season to season
according to the environmental conditions. The selection pressures
on the individual genes are therefore likely to change from generation
to generation. The consequence of variable selection coefficients has
been shown (Kimura, 1954) to be a tendency toward fixation — or
more strictly, near-fixation — the favoured allele being the one that
gives the highest average fitness. In this aspect selection would
therefore tend to eliminate variability. The optimal values are
likely to vary also from place to place within each generation, especi-
ally if different genotypes choose different environments in which to
live, as Waddington (1957) suggests. This form of variable selection
has been shown to be capable under certain conditions of maintaining
stable polymorphism, as was mentioned in Chapter 2. Its effect on
the variation of metric characters, however, has not been examined.
It does not seem likely to be very great.

Balanced linkage. Mather's theory of "polygenic balance" is
based on the idea of selection favouring intermediate values of metric
characters and the effect this is likely to have on linkage (see for
example, Mather, 1949, 1953^). In considering linkage between the
loci affecting a metric character we have to take account of the linkage
phase. We may say that two genes on the same chromosome are in
coupling if they affect the character in the same direction, and in
repulsion if they affect it in opposite directions. The two phases will
be represented in equal frequencies in a random-breeding population
subject to no selection, as was shown in Chapter 1. Now, chromo-
somes carrying genes in coupling will contribute more to the variation
than chromosomes carrying genes in repulsion. And individuals with
intermediate values will tend on the whole to carry repulsion chromo-
somes rather than coupling chromosomes. Therefore, if intermediates
are favoured for functional reasons, selection will favour repulsion
chromosomes and thus tend to build up "balanced" combinations of
genes: that is, combinations in predominantly repulsion linkage,
which contribute the minimal amount of variance. In this way,
according to Mather, "potential" genetic variability is stored in latent
form, and a compromise is reached between the conflicting needs of
uniformity in adaptation to present circumstances and flexibility in
adaptation to changing circumstances.
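The statement that coupling chromosomes contribute more to the variation than repulsion chromosomes can be checked with the simplest possible case. The Python sketch below assumes two linked loci, each with a "+" allele and a "-" allele of equal and opposite effect, additive action, and equal frequencies of the chromosome types within each arrangement. The effect size `EFFECT` and the helper names are hypothetical.

```python
from statistics import pvariance

EFFECT = 1.0  # hypothetical effect of a "+" allele; a "-" allele has the opposite effect

def chromosome_value(alleles):
    """Sum of the allele effects carried on one chromosome, e.g. '+-' gives 0."""
    return sum(EFFECT if allele == "+" else -EFFECT for allele in alleles)

def variance_of(chromosomes):
    """Variance of chromosome values among equally frequent chromosome types."""
    return pvariance([chromosome_value(c) for c in chromosomes])

coupling = ["++", "--"]                    # alleles act in the same direction
repulsion = ["+-", "-+"]                   # alleles act in opposite directions
random_arrangement = coupling + repulsion  # both phases equally frequent

print("coupling only:     ", variance_of(coupling))
print("repulsion only:    ", variance_of(repulsion))
print("random arrangement:", variance_of(random_arrangement))
```

In this deliberately extreme case a population made up wholly of repulsion chromosomes shows no variance between chromosomes at all, although both loci are segregating; this is the sense in which the variability is held in "potential" form.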

If, however, this supposed tendency of selection to build up
balanced combinations is to have any significant effect on genetic
variability it is necessary that the selection should be strong enough
to maintain the balanced combinations in the face of recombination
which must tend continuously to reduce them to a random arrange-
ment. The selective forces required have been examined by Wright
(1952b). It is clear, without going into the details, that coefficients of
selection of the same order of magnitude as the recombination fre-
quencies would be required. The balancing of linkage by natural
selection therefore seems from Wright's reasoning to be relevant only
to very short segments of chromosome. Loci with more than about 1
per cent recombination between them would not be expected to
depart significantly from a random arrangement, unless they carried
major genes with large effects on the character. Furthermore, if we
consider a number of loci on the same chromosome, it is not clear
how much difference of variance would be expected between fully
balanced and fully random arrangements; it might well be very little.
Experimental evidence on the matter is scanty. In two experiments,
one with mice and the other with Drosophila, where artificial selection
was applied for and against intermediates, no changes of variance
were detected (Falconer and Robertson, 1956; Falconer, 1957b).
Intensification of the selection against extremes therefore does not
seem to have any effect on the variance within the time-span of a
laboratory experiment.

Canalisation. Waddington's theory of "canalisation" is con-
cerned with the developmental pathways through which the pheno-
typic values come to their expression (see Waddington, 1957). If
intermediates are favoured because of their values of the metric
character in question, then deviation from the optimal value is dis-
advantageous. Selection will therefore operate against the causes of
deviation, and will tend to produce a greater stability so that develop-
ment is canalised along the path that leads to the optimal phenotypic
expression. The role ascribed to selection is its discrimination against
alleles that increase variability. These may be at loci that affect the
character in question or at other loci. Variation both of environ-
mental and of genetic origin may be reduced in this way. The
genetic variation is reduced not by eliminating the segregation, but by
rendering the organism less sensitive to the effects of the segregation.
A change in the proportion of genetic to environmental variation is
therefore not necessarily to be expected. As a consequence of canalisa-
tion we should expect to find some characters less variable than
others, the less variable being those for which deviation from the
optimum has the more serious effect on fitness. This expected con-
sequence of canalisation, however, cannot easily be tested experi-
mentally, because, as Waddington (1957) points out, it is difficult to
find a logical basis for comparing the variability of different characters.

Origin of variation by mutation. Before the reasons for the
existence of genetic variability can be fully understood it will be
necessary to know what part mutation plays in restoring what is lost
by random drift or by selection. If there were no selection of any
kind then the amount of genetic variation would come to equilibrium
when its rate of origin was equal to its rate of extinction by random
drift. The rate of extinction presents no very serious problem be-
cause we need know the population size only approximately. If,
therefore, we knew the rate of origin by mutation we could decide
whether a significant amount of the existing variation can be ascribed
to mutation. Very little, however, is known about the rate of origin
by mutation. The only evidence comes from two studies of Drosophila
by Clayton and Robertson (1955) and Paxman (1957), which yielded
very similar results. The following discussion is based on the experi-
ment of Clayton and Robertson. Selection for abdominal bristle
number was applied to an inbred line derived from the same base
population on which the other studies of this character were made.
From the rate of response to selection it was concluded that the aver-
age amount of variation arising by spontaneous mutation in one
generation amounted to one thousandth part of the genetic variation
present in the base population. In other words it would take about
1000 generations for mutation to restore the genetic variation to its
original level. (We may note in passing that this proves mutation to
have a negligible influence on the response of non-inbred populations
to artificial selection, apart from the rare occurrence of mutants with
major effects.) Now consider the loss of variance due to random drift
in a population of effective size Ne, subject to no selection. If all the
genetic variance is additive, as it very nearly is in the case of bristle
number, then the rate of loss per generation is equal to the rate of
inbreeding, which is 1/2Ne. (This follows from the reasoning given
in Chapter 15, where the variance within a line was shown to be
(1 - F) times the original variance.) Therefore the new variation
arising by mutation at the rate found in this experiment would be lost
at the same rate, if the rate of inbreeding were 1/1,000: that is, in a
population of effective size 500. The base population was roughly
ten times this size and therefore the expected rate of extinction by
random drift is less than the observed rate of origin by mutation. In
other words, mutation alone seems to be capable of accounting for
more variation of bristle number than was actually present in the
base population. Therefore selection favouring heterozygotes does
not seem to have been an important cause of the genetic variability of
bristle number. This suggests that little of the variation of bristle
number is due to the pleiotropic effects of genes that affect the major
components of fitness. It suggests, in other words, that much of the
variation of bristle number is due to genes that are not far from being
neutral with respect to fitness. This conclusion, though only tenta-
tive, is in line with the fact, mentioned earlier, that bristle number
shows little tendency to revert to the original mean value when
artificial selection is relaxed. The conclusions to which the results of
this experiment point cannot yet be extended to other characters.
Characters more closely connected with fitness, when they have been
studied from this point of view, may present a very different picture.
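The arithmetic of the comparison between mutation and random drift may be set out as follows. The Python sketch uses only the figures quoted above: a gain by mutation of one thousandth of the base-population variance per generation, a loss by drift of 1/2Ne per generation, and a base population of roughly ten times the balancing size. The function names are hypothetical. The final step, an equilibrium variance reached when a constant gain is set against a proportional loss, is an added assumption made for illustration and is not a calculation given in the text.

```python
def drift_loss_rate(effective_size):
    """Fraction of the additive variance lost per generation by random drift,
    taken as the rate of inbreeding, 1/(2*Ne)."""
    return 1.0 / (2.0 * effective_size)

def balancing_effective_size(mutation_gain_rate):
    """Effective size at which the loss by drift equals the gain by mutation."""
    return 1.0 / (2.0 * mutation_gain_rate)

MUTATION_GAIN = 1.0 / 1000.0  # from the text: one thousandth of base variance per generation

ne_balance = balancing_effective_size(MUTATION_GAIN)
base_ne = 10 * ne_balance     # from the text: base population "roughly ten times this size"
loss_in_base = drift_loss_rate(base_ne)

print("effective size at which gain and loss balance:", ne_balance)
print("rate of loss by drift in the base population: ", loss_in_base)
print("rate of gain by mutation:                     ", MUTATION_GAIN)

# Assumption for illustration: a constant gain each generation (as a fraction of
# the base variance) against a loss proportional to the variance present gives
# an equilibrium variance of gain/loss, in units of the base-population variance.
equilibrium_ratio = MUTATION_GAIN / loss_in_base
print("equilibrium variance / base variance:", equilibrium_ratio)
```

On these figures the loss by drift in the base population is about one tenth of the gain by mutation, which is the sense in which mutation alone could account for more variation than was actually present.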
Evolutionary significance of variability. There can be little
doubt that the existence of genetic variation is advantageous to the
evolutionary survival of a species, the advantage it confers being the
ability to evolve rapidly and so to meet the needs of a changing
environment, both through the course of time and in the colonisation
of new localities. Sexual reproduction and outbreeding are necessary
conditions for the continued existence of genetic variation and it is
noteworthy that the naturally inbreeding species among the higher
plants are of comparatively recent origin. This suggests that the
possession of genetic variability is necessary for the continued exist-
ence of a species over a long period of time; or in other words, that the
prevalence of genetic variability among existing species is because
those without it have not survived. The inbreeding plants, however,
as we see them at present, compete successfully with the outbreeding
species, and this proves that the possession of genetic variability does
not confer much immediate advantage. The evolutionary significance
of genetic variability, however, throws no light on the mechanisms
that maintain it. It is these mechanisms, which have been discussed
in this chapter, that are the concern of quantitative genetics.

The Genes concerned with Quantitative Variation

The genetic variation of metric characters appears from the re-
sults of experimental selection to be the product of segregation at
some hundreds of loci, or more probably some thousands if the
variation of all characters is included. So natural populations prob-
ably carry a variety of alleles at a considerable proportion of loci, even
perhaps at virtually every locus. It seems unreasonable, therefore, to
think of genes having the control of a metric character as their
specific function: we cannot reasonably suppose that there are genes
whose only functions are the adjustment of, say, body size to an
optimal value. How, then, are we to think of the genes with which we
are concerned in quantitative genetics? Our knowledge of these
genes may be briefly summarised as follows.

The distinction between "major" and "minor" genes marks the
difference between those which we can study individually, and whose
properties are therefore fairly easily discovered, and those which we
cannot study individually and whose properties can only be deduced
by indirect means. Both are concerned with quantitative variation.
Among the major genes two sorts may be distinguished. There are
genes with more or less severely deleterious effects on fitness, and
these include nearly all the "mutants" of Mendelian genetics, as well
as lethals. Each may have pleiotropic effects on a variety of metric
characters. They are recessive, or nearly so, in their effects on fitness,
but not necessarily also in their effects on metric characters. They are
kept in equilibrium at low frequencies by natural selection balanced
against mutation. Being at low frequencies they contribute, individu-
ally, little to the genetic variance of any character; their total contri-
bution, however, is unknown. They are probably an important cause
of inbreeding depression. Major genes of the second sort are those
responsible for the antigenic differences. The alleles at these loci are
at intermediate frequencies where they are probably maintained by
selection favouring heterozygotes. Their effects on fitness, however,
are probably fairly small — certainly small enough for all to be
regarded as "wild-type" alleles. Their effects on metric characters
are almost unexplored, and their importance as sources of variation is
consequently unknown. They presumably contribute to inbreeding
depression if heterozygotes are superior in fitness, but again their
relative importance in this respect is not known with certainty.
About the minor genes little is known. They do not necessarily
occupy loci different from those occupied by major genes. It seems
more likely, on the contrary, that they are isoalleles, capable of
mutating to major deleterious genes. They are performing their
primary functions perfectly adequately and may differ only in the rate
at which their primary product is synthesised. The variation of
metric characters which they produce may be quite incidental to their
main biochemical functions. There is no reason at present to think
that these minor genes differ in any essential way from the genes that
determine antigenic differences. The fact that their effects are not
individually recognisable, whereas the antigenic differences are, may
be due only to the inadequacy of the techniques available for detect-
ing biochemical differences among essentially normal individuals.

The problems that have been raised but left unanswered in this
chapter will be sufficient indication of the directions which the future
development of quantitative genetics may take. It does not seem to
the present writer that much progress toward their solution is likely
to be made by deductive reasoning, because most of the outstanding
problems are not essentially theoretical in nature: the theoretical
structure of the subject is now fairly clear, at least in its main out-
lines. Some of the outstanding problems are beyond the reach of the
experimental techniques now at our command. New techniques, both
more penetrating and more discriminating, will therefore be needed.
Other problems arise from the paucity of experimental data and the
consequent difficulty of deciding what phenomena are general and
what are due to special circumstances. These problems will be solved
not so much by deliberately designed experiments, but rather from
the accumulated experience of experiments extended to a wider
variety of characters and of organisms.

GLOSSARY OF SYMBOLS

This list gives the meanings of most of the symbols used in the book.
Many of the symbols listed are used also with other meanings in certain
places, but these meanings, as well as the symbols not listed, do not appear
more than a page or two removed from their definition. The more im-
portant differences from current usage are indicated where the equivalent
symbols used by Lerner (1950) — denoted by (L) — and by Mather (1949)
— denoted by (M) — are given.

A1, A2   Allelomorphic genes.
A        Breeding value. = G (L).
a        Genotypic value of the homozygote A1A1, as deviation from the
         mid-homozygote value. = d (M).
α        Average effect of a gene-substitution.
α1, α2   Average effects of the alleles A1 and A2 respectively.
b        Regression coefficient; e.g. bOP = regression of offspring on parent.
CR       Correlated response to selection.
D        Dominance deviation.
d        Genotypic value of the heterozygote A1A2, as deviation from the
         mid-homozygote value. = h (M).
Δ        Change of, as Δq = change of gene frequency, ΔF = rate of
         inbreeding.
E        Environmental deviation.
Ec       Common environment; i.e. environmental deviation of family mean
         from population mean. = C (L).
Ew       Within-family environment; i.e. environmental deviation of
         individual from family mean. = E' (L).
F        Coefficient of inbreeding.
F1       First generation of cross between lines or populations.
F2       Second generation of cross, by random mating among F1.
FS       Full sibs.
f        Coancestry; i.e. inbreeding coefficient of the progeny of the
         individuals concerned.
f        (Chap. 13): Subscript referring to selection between families.
G        Genotypic value. = Ge (L).

H        Frequency of heterozygous genotype (A1A2).
H        Amount of heterosis; i.e. deviation of cross mean from mid-parent
         value.
HS       Half sibs.
h²       Heritability.
I        Interaction deviation, due to epistasis.
I        (Chap. 13 & 19): Index for selection.
i        Intensity of selection; i.e. selection differential in units of the
         phenotypic standard deviation. = ī (L).
M        Population mean.
m        Immigration rate.
N        Population size; i.e. number of breeding individuals in a population
         or line.
N        (Chap. 10 & 13): Number of families.
Ne       Effective population size.
n        Number in various contexts. In Chapters 10 and 13, specifically
         number of offspring per family.
O        Offspring.
P        Parent. P̄ = Mid-parent.
P        Frequency of homozygous genotype (A1A1).
P        Panmictic index (= 1 - F).
P        Phenotypic value.
p        Gene frequency (of A1). = u (M).
p        (Chap. 11, part): Proportion selected as parents from a normally
         distributed population. = v (L).
Q        Frequency of homozygous genotype (A2A2).
q        Gene frequency (of A2). = v (M).
R        Response to selection, specifically to individual selection.
         = ΔG (L).
r        (Chap. 8): Repeatability; i.e. correlation between repeated
         measurements of the same individual.
r        (Chap. 13): Coefficient of relationship; i.e. correlation of breeding
         values between related individuals. = rG (L).
r        (Chap. 19): Correlations between two characters:
           rA additive genetic correlation. = rG (L).
           rE environmental correlation.
           rP phenotypic correlation. = r (L).
S        Selection differential in actual units of measurement. = i (L).
s        Coefficient of selection against a particular genotype.
s        (Chap. 13): Subscript referring to sib-selection.

Σ        Summation of the quantity following the sign.
σ        Standard deviation (σ² = variance) of the quantity indicated by
         subscript. Components of variance, from an analysis of variance,
         are indicated by subscripts as follows:
           σ²B between groups, or families.
           σ²D between dams, within sires.
           σ²S between sires.
           σ²T total; i.e. the sum of all components.
           σ²W within groups, or families.
t        Time in number of generations. As a subscript it means "at
         generation t".
t        Phenotypic correlation between members of families.
u        Mutation rate (from A1 to A2).
V        Variance (causal component) of the value or deviation indicated by
         subscript. The most important are:
           VP Phenotypic variance. = σ²P (L), = V (M).
           VG Genotypic variance. = σ²Ge (L).
           VA Additive genetic variance. = σ²G (L), = ½D (M).
           VD Dominance variance. = ¼H (M).
           VI Interaction variance. = I (M).
           VE Environmental variance. = σ²E (L), = E (M).
v        Mutation rate (from A2 to A1).
w        (Chap. 13): Subscript referring to selection within families.
X        (Chap. 19): One of two correlated characters.
Y        (Chap. 19): The other of two correlated characters.
y        (Chap. 14): Difference of gene frequency between two lines.
z        (Chap. 11): Height of the ordinate of a normal distribution, in
         units of the standard deviation.

INDEXED LIST OF REFERENCES

The numbers in square brackets refer to pages in
the text where the work is mentioned

Allison, A. C. 1954. Notes on sickle-cell polymorphism. Ann. hum. Genet.

[Lond.], 19: 39-57- [45]

1955. Aspects of polymorphism in man. Cold Spr. Harb. Symp. quant.

Biol, 20: 239-252. [44]

Bartlett, M. S., and Haldane, J. B. S. 1935. The theory of inbreeding

with forced heterozygosis. J. Genet., 31: 327-340. [97]

Bell, A. E., Moore, C. H., and Warren, D. C. 1955. The evaluation of

new methods for the improvement of quantitative characteristics.

Cold Spr. Harb. Symp. quant. Biol., 20: 197-21 1. [286]

Biggers, J. D., and Claringbold, P. J. 1954. Why use inbred lines?

Nature [Lond.], 174: 596. [275]

Briles, W. E., Allen, C. P., and Millen, T. W. 1957. The B blood group

system of chickens. I. Heterozygosity in closed populations.

Genetics, 42: 631-648. [290]

Briquet, R., and Lush, J. L. 1947. Heritability of amount of spotting in

Holstein-Friesian cattle. J. Hered., 38: 99-105. [167]

Brumby, P. J. 1958. Monozygotic twins and dairy cattle improvement.

Anim. Breed. Abstr., 26: 1-12. [183]

Brumby, P. J., and Hancock, J. 1956. A preliminary report of growth and

milk production in identical- and fraternal-twin dairy cattle. N.Z.

J. Sci. Tech., Agric, 38: 184-193. [185]

Buri, P. 1956. Gene frequency in small populations of mutant Drosophila.

Evolution, 10: 367-402. [52, 53, 56, 59, 74]

Butler, L. 1952. A study of size inheritance in the house mouse. II. Analysis

of five preliminary crosses. Canad. J. Zool., 30: 154-171. [216]
Cain, A. J., and Sheppard, P. M. 1954a. Natural selection in Cepaea.

Genetics, 39: 89-116. [43, 83]

1954b. The theory of adaptive polymorphism. Amer. Nat., 88: 321-326. [44]

Carpenter, J. R., Gruneberg, H., and Russell, E. S. 1957. Genetical

differentiation involving morphological characters in an inbred

strain of mice. II. The American branches of the C57BL and

C57BR strains. J. Morph., 100: 377-388. [274]

Castle, W. E., and Wright, S. 1916. Studies of inheritance in guinea-pigs

and rats. Publ. Carneg. Instn. Wash., No. 241: iv + 192 pp. [168]
Ceppellini, R., Siniscalco, M., and Smith, C. A. B. 1955. The estimation

of gene frequencies in a random-mating population. Ann. hum.

Genet. [Lond.], 20: 97-115. [16]

Chai, C. K. 1957. Developmental homeostasis of body growth in mice.

Amer. Nat., 91: 49-55. [271]

Chapman, A. B. 1946. Genetic and nongenetic sources of variation in the

weight response of the immature rat ovary to a gonadotropic

hormone. Genetics, 31: 494-507. [168]

Clayton, G. A., Knight, G. R., Morris, J. A., and Robertson, A. 1957.

An experimental check on quantitative genetical theory. III.

Correlated responses. J. Genet., 55: 171-180. [316, 320]

Clayton, G. A., Morris, J. A., and Robertson, A. 1957. An experimental

check on quantitative genetical theory. I. Short-term responses to

selection. J. Genet., 55: 131-151.

[140, 168, 169, 177, 190, 195, 209, 210, 221, 245, 333]

Clayton, G. A., and Robertson, A. 1955. Mutation and quantitative

variation. Amer. Nat., 89: 151-158. [342]

1957. An experimental check on quantitative genetical theory. II.

The long-term effects of selection. J. Genet., 55: 152-170.

[216, 223]
Cockerham, C. C. 1954. An extension of the concept of partitioning
hereditary variance for analysis of covariances among relatives
when epistasis is present. Genetics, 39: 859-882. [138]

1956a. Effects of linkage on the covariances between relatives. Genetics,
41: 138-141. [159]

1956b. Analysis of quantitative gene action. Genetics in Plant Breeding.
Brookhaven Symp. Biol., No. 9: 53-68. [140]

Comstock, R. E., and Robinson, H. F. 1952. Estimation of average domi-
nance of genes. Heterosis, ed. J. W. Gowen. Ames: Iowa State
College Press. Pp. 494-516. [290]

Comstock, R. E., Robinson, H. F., and Harvey, P. H. 1949. A breeding
procedure designed to make maximum use of both general and
specific combining ability. J. Amer. Soc. Agron., 41: 360-367.

[286]

Crow, J. F. 1948. Alternative hypotheses of hybrid vigor. Genetics, 33:

477-487. [278, 290]

1952. Dominance and overdominance. Heterosis, ed. J. W. Gowen.

Ames: Iowa State College Press. Pp. 282-297. [278, 290]

1954. Breeding structure of populations. II. Effective population

number. Statistics and Mathematics in Biology, ed. O. Kempthorne,

T. A. Bancroft, J. W. Gowen, and J. L. Lush. Ames: Iowa State

College Press. Pp. 543-556. [53, 60, 61, 64, 71]

1956. The estimation of spontaneous and radiation-induced mutation
rates in man. Eugen. Quart., 3: 201-208. [38]

1957. Possible consequences of an increased mutation rate. Eugen.
Quart., 4: 67-80. [39]

Crow, J. F., and Morton, N. E. 1955. Measurement of gene frequency
drift in small populations. Evolution, 9: 202-214. [73, 74]

Cruden, D. 1949. The computation of inbreeding coefficients in closed
populations. J. Hered., 40: 248-251. [88, 89]

Dempster, E. R., and Lerner, I. M. 1950. Heritability of threshold charac-
ters. Genetics, 35: 212-236. [303]
Deol, M. S., Gruneberg, H., Searle, A. G., and Truslove, G. M. 1957.
Genetical differentiation involving morphological characters in an
inbred strain of mice. I. A British branch of the C57BL strain.
J. Morph., 100: 345-376. [274]
Dickerson, G. E. 1952. Inbred lines for heterosis tests? Heterosis, ed.
J. W. Gowen. Ames: Iowa State College Press. Pp. 330-351.

[286]
1955. Genetic slippage in response to selection for multiple objectives.
Cold Spr. Harb. Symp. quant. Biol., 20: 213-224. [329]

1957. (Two abstracts.) Poult. Sci., 36: 1112-1113. [316]

Dickerson, G. E., et al. 1954. Evaluation of selection in developing inbred lines
of swine. Res. Bull. Mo. agric. Exp. Sta., No. 551: 60 pp. [249, 253]
Dobzhansky, Th. 1950. Genetics of natural populations. XIX. Origin of
heterosis through natural selection in populations of Drosophila
pseudoobscura. Genetics, 35: 288-302. [262]

1951a. Genetics and the Origin of Species. New York: Columbia
University Press. 3rd edn. xi + 364 pp. [44]
1951b. Mendelian populations and their evolution. Genetics in the 20th
Century, ed. L. C. Dunn. New York: Macmillan Co. Pp. 573-589.

[44, 263]

1952. Nature and origin of heterosis. Heterosis, ed. J. W. Gowen.
Ames: Iowa State College Press. Pp. 218-223. [262]

Dobzhansky, Th., and Pavlovsky, O. 1955. An extreme case of heterosis

in a Central American population of Drosophila tropicalis. Proc.

nat. Acad. Sci. U.S.A., 41: 289-295. [39]

Donald, H. P., Deas, D. W., and Wilson, A. L. 1952. Genetical analysis

of the incidence of dropsical calves in herds of Ayrshire cattle.

Brit. vet. J., 108: 227-245. [13]

Emik, L. O., and Terrill, C. E. 1949. Systematic procedures for calculating

inbreeding coefficients. J. Hered., 40: 51-55. [88, 89]

Falconer, D. S. 1952. The problem of environment and selection. Amer.

Nat., 86: 293-298. [323, 324]

1953. Selection for large and small size in mice. J. Genet., 51: 470-501.

[96, 168, 199, 335]

1954a. Asymmetrical responses in selection experiments. Symposium
on Genetics of Population Structure, Istituto di Genetica, Universita
di Pavia, Italy, August 20-23, 1953. Un. int. Sci. biol., No. 15:
16-41. [31, 33, 203, 213, 297]

1954b. Validity of the theory of genetic correlation. An experimental
test with mice. J. Hered., 45: 42-44. [168, 316, 319]

1954c. Selection for sex ratio in mice and Drosophila. Amer. Nat., 88:
385-397. [321]

1955. Patterns of response in selection experiments with mice. Cold
Spr. Harb. Symp. quant. Biol., 20: 178-196.

[168, 201, 214, 216, 220]

1957a. Breeding methods — I. Genetic considerations. The UFAW
Handbook on the Care and Management of Laboratory Animals, 2nd
edn., edd. A. N. Worden and W. Lane-Petter. London: Univer-
sities Federation for Animal Welfare. Pp. 85-107. [228]
1957b. Selection for phenotypic intermediates in Drosophila. J. Genet.,
55: 551-561. [341]

Falconer, D. S., and Latyszewski, M. 1952. The environment in relation
to selection for size in mice. J. Genet., 51: 67-80. [324]

Falconer, D. S., and Robertson, A. 1956. Selection for environmental
variability of body size in mice. Z. indukt. Abstamm.-u. Vererblehre,

87: 385-391. [341]

Fisher, R. A. 1918. The correlation between relatives on the supposition
of Mendelian inheritance. Trans. roy. Soc. Edinb., 52: 399-433.

[2, 124]

1930. The Genetical Theory of Natural Selection. Oxford University

Press. xiv+272pp. [4]

1941. Average excess and average effect of a gene substitution. Ann.

Eugen. [Lond.], 11: 53-63. [124]

1949. The Theory of Inbreeding. Edinburgh: Oliver & Boyd. viii + 120

pp. [90, 97, 99, 100]

Fisher, R. A., and Yates, F. 1943. Statistical Tables. Edinburgh: Oliver &

Boyd. 2nd edn. viii + 98 pp. [194, 302]

Ford, E. B. 1953. The genetics of polymorphism in the Lepidoptera.

Advanc. Genet., 5: 43-87. [44]

Fredeen, H. T., and Jonsson, P. 1957. Genic variance and covariance in

Danish Landrace swine as evaluated under a system of individual

feeding of progeny test groups. Z. Tierz. Zuchtbiol., 70: 348-363.

[167, 174, 175, 316]
Gilmour, D. G. 1958. Maintenance of segregation of blood group genes
during inbreeding in chickens. (Abstr.) Heredity, 12: 141-142.

[290]

Gowe, R. S., Robertson, A., and Latter, B. D. H. 1959. Environment and

poultry breeding problems. 5. The design of poultry control

strains. Poult. Sci., 38: 462-471. [72, 73]

Gowen, J. W. 1952. Hybrid vigor in Drosophila. Heterosis, ed. J. W.

Gowen. Ames: Iowa State College Press. Pp. 474-493.

[282, 336]

Green, E. L. 1951. The genetics of a difference in skeletal type between

two inbred strains of mice (BalbC and C57blk). Genetics, 36: 391-

409. [303]

Green, E. L., and Russell, W. L. 1951. A difference in skeletal type be-
tween reciprocal hybrids of two inbred strains of mice (C57BLK
and C3H). Genetics, 36: 641-651. [305, 307]

Gruneberg, H. 1952. Genetical studies on the skeleton of the mouse. IV.
Quasi-continuous variations. J. Genet., 51: 95-114. [301]

1954. Variation within inbred strains of mice. Nature [Lond.], 173: 674.

[275]

Haldane, J. B. S. 1924-32. A mathematical theory of natural and artificial
selection. Proc. Camb. phil. Soc., 23: 19-41; 158-163; 363-372;
607-615; 838-844; 26: 220-230; 27: 131-142; 28: 244-248. [2]

1932. The Causes of Evolution. London: Longmans, Green & Co., Ltd.

vii + 235 pp. [2, 4]

Haldane, J. B. S. 1936. The amount of heterozygosis to be expected in an

approximately pure line. J. Genet., 32: 375-391. [100]

1937. Some theoretical results of continued brother-sister mating. J.
Genet., 34: 265-274. [90, 97]

1939. The spread of harmful autosomal recessive genes in human
populations. Ann. Eugen. [Lond.], 9: 232-237. [41]

1946. The interaction of nature and nurture. Ann. Eugen. [Lond.], 13:
197-205. [133]

1949. The rate of mutation of human genes. Proc. 8th int. Congr. Genet.
1948 [Stockh.]. Lund: Issued as a supplementary volume of
Hereditas, 1949. Pp. 267-273. [38]

1954a. The measurement of natural selection. Proc. 9th int. Congr.
Genet. [Bellagio (Como)], 1953, Pt. I (Suppl. to Caryologia, 6): 480-
487. [338]

1954b. The Biochemistry of Genetics. London: George Allen & Unwin
Ltd. 144 pp. [339]

1955. The complete matrices for brother-sister and alternate parent-
offspring mating involving one locus. J. Genet., 53: 315-324.

[90, 97]
Hancock, J. 1954. Monozygotic twins in cattle. Advanc. Genet., 6: 141-

181. [183]

Hardy, G. H. 1908. Mendelian proportions in a mixed population. Science,

28: 49-50. [9]

Hayes, H. K., Immer, F. R., and Smith, D. C. 1955. Methods of Plant

Breeding. New York: McGraw-Hill Book Co., Inc. 2nd edn.

xi+551 pp. [276]

Hayman, B. I. 1955. The description and analysis of gene action and inter-
action. Cold Spr. Harb. Symp. quant. Biol., 20: 79-84. [140]
1958. The theory and analysis of diallel crosses. II. Genetics, 43: 63-85.

[140, 277]
Hayman, B. I., and Mather, K. 1953. The progress of inbreeding when

homozygotes are at a disadvantage. Heredity, 7: 165-183. [102]
Hazel, L. N. 1943. The genetic basis for constructing selection indexes.

Genetics, 28: 476-490. [324, 325]

Hazel, L. N., and Lush, J. L. 1942. The efficiency of three methods of

selection. J. Hered., 33: 393-399. [324]

Hollingsworth, M. J., and Smith, J. M. 1955. The effects of inbreeding

on rate of development and on fertility in Drosophila subobscura.

J. Genet., 53: 295-314. [249, 252]

Horner, T. W., Comstock, R. E., and Robinson, H. F. 1955. Non-allelic

gene interactions and the interpretation of quantitative genetic data.

Tech. Bull. N.C. agric. Exp. Sta., No. 118: v + 117 pp. [299]

Hull, F. H. 1945. Recurrent selection for specific combining ability in
corn. J. Amer. Soc. Agron., 37: 134-145. [286]

Hunt, H. R., Hoppert, C. A., and Erwin, W. G. 1944. Inheritance of
susceptibility to caries in albino rats (Mus norvegicus). J. dent. Res.,

23: 385-401. [297]

Johansson, I. 1950. The heritability of milk and butterfat yield. Anim.
Breed. Abstr., 18: 1-12. [144, 167, 316]

Kempthorne, O. 1954. The correlation between relatives in a random
mating population. Proc. roy. Soc., B, 143: 103-113. [138, 279]
1955a. The theoretical values of correlations between relatives in
random mating populations. Genetics, 40: 153-167.

[138, 152, 158, 174]
1955b. The correlations between relatives in random mating
populations. Cold Spr. Harb. Symp. quant. Biol., 20: 60-75. [138, 158]
1957. An Introduction to Genetic Statistics. New York: John Wiley & Sons,
Inc.; London: Chapman & Hall, Ltd. xvii + 545 pp. [4, 264, 277]
Kempthorne, O., and Tandon, O. B. 1953. The estimation of heritability
by regression of offspring on parent. Biometrics, 9: 90-100. [171]
Kerr, W. E., and Wright, S. 1954a. Experimental studies of the distribu-
tion of gene frequencies in very small populations of Drosophila
melanogaster: I. Forked. Evolution, 8: 172-177. [74]

1954b. Experimental studies of the distribution of gene frequencies in
very small populations of Drosophila melanogaster. III. Aristapedia
and spineless. Evolution, 8: 293-302. [74]

Kimura, M. 1954. Process leading to quasi-fixation of genes in natural
populations due to random fluctuation of selection intensities.
Genetics, 39: 280-295. [57, 340]

1955. Solution of a process of random genetic drift with a continuous
model. Proc. nat. Acad. Sci. U.S.A., 41: 144-150. [54, 55, 57]
1956. Rules for testing stability of a selective polymorphism. Proc. nat.
Acad. Sci. U.S.A., 42: 336-340. [42]

King, J. W. B. 1950. Pygmy, a dwarfing gene in the house mouse. J. Hered.,
41: 249-252. [113]

1955. Observations on the mutant "pygmy" in the house mouse. J.
Genet., 53: 487-497. [113, 289, 298, 299]

King, S. C, and Henderson, C. R. 1954a. Variance components analysis
in heritability studies. Poult. Sci., 33: 147-154. [173]

1954b. Heritability studies of egg production in the domestic fowl.
Poult. Sci., 33: 155-169. [168]

Kyle, W. H., and Chapman, A. B. 1953. Experimental check of the effec-
tiveness of selection for a quantitative character. Genetics, 38: 421-
443. [225]

Lack, D. 1954. The Natural Regulation of Animal Numbers. Oxford:
Clarendon Press. viii+343pp. [334]

Lamotte, M. 1951. Recherches sur la structure génétique des populations
naturelles de Cepaea nemoralis (L.). Bull. biol., Suppl. 35: 238 pp.

[78, 83, 84]

Lerner, I. M. 1950. Population Genetics and Animal Improvement.
Cambridge University Press. xviii + 342 pp. [4, 236, 325, 346]

1954. Genetic Homeostasis. Edinburgh: Oliver & Boyd. vii + 134 pp.

[44, 202, 213, 263, 270, 271, 288, 331, 339]

1958. The Genetic Basis of Selection. New York: John Wiley & Sons,

Inc. xvi+298 pp. [202, 263]

Lerner, I. M., and Cruden, D. 1951. The heritability of egg weight: the

advantages of mass selection and of early measurements. Poult. Sci.,

30: 34-41. [168]

Levene, H. 1953. Genetic equilibrium when more than one ecological niche

is available. Amer. Nat., 87: 331-333. [43]

Li, C. C. 1955a. Population Genetics. Chicago: University of Chicago Press;

London: Cambridge University Press. xi + 366 pp.

[4, 15, 20, 22, 24]

1955b. The stability of an equilibrium and the average fitness of a

population. Amer. Nat., 89: 281-296. [43]

Livesay, E. A. 1930. An experimental study of hybrid vigor or heterosis in

rats. Genetics, 15: 17-54. [271]

Lush, J. L. 1945. Animal Breeding Plans. Ames: Iowa State College Press.

3rd edn. viii+443pp. [4]

1947. Family merit and individual merit as bases for selection. Pt. I,

Pt. II. Amer. Nat., 81: 241-261; 362-379. [236, 237]

1950. Genetics and animal breeding. Genetics in the Twentieth Century,

ed. L. C. Dunn. New York: Macmillan Co. Pp. 493-525. [200]

Lush, J. L., and Molln, A. E. 1942. Litter size and weight as permanent

characteristics of sows. Tech. Bull. U.S. Dep. Agric, No. 836: 40 pp.

[167]
MacArthur, J. W. 1949. Selection for small and large body size in the

house mouse. Genetics, 34: 194-209. [216, 295, 335]

McLaren, A., and Michie, D. 1954. Factors affecting vertebral variation

in mice. 1. Variation within an inbred strain. J. Embryol. exp.

Morph., 2: 149-160. [273, 274]

1955. Factors affecting vertebral variation in mice. 2. Further evidence
on intra-strain variation. J. Embryol. exp. Morph., 3: 366-375.

[303, 306]

1956a. Factors affecting vertebral variation in mice. 3. Maternal

effects in reciprocal crosses. J. Embryol. exp. Morph., 4: 161-166.

[306]

1956b. Variability of response in experimental animals. J. Genet., 54:

440-455. [271]

Malécot, G. 1948. Les Mathématiques de l'Hérédité. Paris: Masson et Cie.

vi+63 pp. [4, 61, 69, 75, 88]

Mangelsdorf, P. C. 1951. Hybrid corn: its genetic basis and its significance

in human affairs. Genetics in the Twentieth Century, ed. L. C. Dunn.

New York: Macmillan Co. Pp. 555-571. [110, 277]

Mather, K. 1949. Biometrical Genetics. London: Methuen & Co., Ltd.

ix + 162 pp. [4, 106, 277, 340, 346]

1953a. Genetical control of stability in development. Heredity, 7: 297-

336. [270]

1953b. The genetical structure of populations. Symp. Soc. exp. Biol., 7:

66-95. [340]

1955a. Polymorphism as an outcome of disruptive selection. Evolution,

9: 52-61. [43]

1955b. The genetical basis of heterosis. Proc. roy. Soc., B, 144: 143-

150. [288]

Merrell, D. J. 1953. Selective mating as a cause of gene frequency changes

in laboratory populations of Drosophila melanogaster. Evolution, 7:

287-296. [34]

Morley, F. H. W. 1951. Selection for economic characters in Australian

Merino sheep. (1) Estimates of phenotypic and genetic parameters.

Sci. Bull. Dep. Agric. N.S.W., No. 73: 45 pp. [144]

1954. Selection for economic characters in Australian Merino sheep.

IV. The effect of inbreeding. Aust. J. agric. Res., 5: 305-316.

[249]

1955. Selection for economic characters in Australian Merino sheep.

V. Further estimates of phenotypic and genetic parameters. Aust.
J. agric. Res., 6: 77-90. [168, 316]

Mourant, A. E. 1954. The Distribution of the Human Blood Groups. Oxford:
Blackwell. xxi+428pp. [5]

Muller, H. J., and Oster, I. I. 1957. Principles of back mutation as ob-
served in Drosophila and other organisms. Advances in Radio-
biology. Proc. 5th int. Conf. Radiobiol. [Stockh.], 1956. Edinburgh:
Oliver & Boyd. Pp. 407-415. [26]

Newman, H. H., Freeman, F. N., and Holzinger, K. J. 1937. Twins: a
Study of Heredity and Environment. Chicago: University of Chicago
Press. xvi+369pp. [185]

Osborne, R. 1957a. The use of sire and dam family averages in increasing
the efficiency of selective breeding under a hierarchical mating
system. Heredity, 11: 93-116. [241]

1957b. Correction for regression on a secondary trait as a method of
increasing the efficiency of selective breeding. Aust. J. biol. Sci.,
10: 365-366. [328]

Osborne, R., and Paterson, W. S. B. 1952. On the sampling variance of
heritability estimates derived from variance analyses. Proc. roy.
Soc. Edinb., B., 64: 456-461. [183]

Paxman, G. J. 1957. A study of spontaneous mutation in Drosophila
melanogaster. Genetica, 29: 39-57. [342]

Pearson, K., and Lee, A. 1903. On the laws of inheritance in man. I. In-
heritance of physical characters. Biometrika, 2: 357-462. [163]

Penrose, L. S. 1949. The Biology of Mental Defect. London: Sidgwick &

Jackson. xiv + 285 pp. [163, 164]

1954. Some recent trends in human genetics. Proc. 9th int. Congr.

Genet. [Bellagio (Como)], 1953, Pt. I (Suppl. to Caryologia, 6): 521-

530. [111]

Plum, M. 1954. Computation of inbreeding and relationship coefficients.

J. Hered., 45: 92-94. [89]

Powers, L. 1950. Determining scales and the use of transformations in

studies on weight per locule of tomato fruit. Biometrics, 6: 145-163.

[300]
1952. Gene recombination and heterosis. Heterosis, ed. J. W. Gowen.
Ames: Iowa State College Press. Pp. 298-319. [260]

Race, R. R., and Sanger, R. 1954. Blood Groups in Man. Oxford:
Blackwell. 2nd edn. xvi + 400 pp. [12, 16]
Rasmuson, M. 1952. Variation in bristle number of Drosophila melanogaster.
Acta zool. [Stockh.], 33: 277-307. [265]
1956. Recurrent reciprocal selection. Results of three model experi-
ments on Drosophila for improvement of quantitative characters.
Hereditas [Lund], 42: 397-414. [286]
Reeve, E. C. R. 1955a. Inbreeding with homozygotes at a disadvantage.
Ann. hum. Genet. [Lond.], 19: 332-346. [101]
1955b. (Contribution to discussion.) Cold Spr. Harb. Symp. quant.
Biol., 20: 76-78. [170]
1955c. The variance of the genetic correlation coefficient. Biometrics,
11: 357-374. [171, 318]
Reeve, E. C. R., and Robertson, F. W. 1953. Studies in quantitative in-
heritance. II. Analysis of a strain of Drosophila melanogaster
selected for long wings. J. Genet., 51: 276-316. [171, 316, 319]
1954. Studies in quantitative inheritance. VI. Sternite chaeta number
in Drosophila: a metameric quantitative character. Z. indukt. Ab-
stamm.-u. Vererblehre, 86: 269-288. [140, 145, 316]
Rendel, J. M. 1954. The use of regressions to increase heritability. Aust.
J. biol. Sci., 7: 368-378. [328]
Rendel, J. M., Robertson, A., Asker, A. A., Khishin, S. S., and Ragab,
M. T. 1957. The inheritance of milk production characteristics.
J. agric. Sci., 48: 426-432. [149]
Roberts, J. A. Fraser. 1957. Blood groups and susceptibility to disease: a
review. Brit. J. prev. Soc. Med., 11: 107-125. [44]
Robertson, A. 1952. The effect of inbreeding on the variation due to re-
cessive genes. Genetics, 37: 189-207. [268, 269]
1954. Inbreeding and performance in British Friesian cattle. Proc.
Brit. Soc. Anim. Prod., 1954: 87-92. [249]
1955a. Prediction equations in quantitative genetics. Biometrics, 11:
95-98. [236]
1955b. Selection in animals: synthesis. Cold Spr. Harb. Symp. quant.
Biol., 20: 225-229. [333, 337]
1956. The effect of selection against extreme deviants based on devia-
tion or on homozygosis. J. Genet., 54: 236-248. [338]
1957a. Genetics and the improvement of dairy cattle. Agric. Rev.
[Lond.], 2 (8): 10-21. [167]
1957b. Optimum group size in progeny testing and family selection.
Biometrics, 13: 442-450. [243]

1959a. Experimental design in the evaluation of genetic parameters.

Biometrics, 15: 219-226. [178, 182, 183]

1959b. The sampling variance of the genetic correlation coefficient.

Biometrics, 15: 469-485. [318, 323]

Robertson, A., and Lerner, I. M. 1949. The heritability of all-or-none
traits: viability of poultry. Genetics, 34: 395-411. [168, 303]

Robertson, F. W. 1955. Selection response and the properties of genetic
variation. Cold Spr. Harb. Symp. quant. Biol., 20: 166-177.

[211, 212, 216]
1957a. Studies in quantitative inheritance. X. Genetic variation of ovary
size in Drosophila. J. Genet., 55: 410-427. [140, 144, 145, 168]

1957b. Studies in quantitative inheritance. XI. Genetic and
environmental correlation between body size and egg production in
Drosophila melanogaster. J. Genet., 55: 428-443. [131, 140, 168]

Robertson, F. W., and Reeve, E. C. R. 1952a. Studies in quantitative in-
heritance. I. The effects of selection of wing and thorax length in
Drosophila melanogaster. J. Genet., 50: 414-448. [223]

1952b. Heterozygosity, environmental variation and heterosis. Nature
[Lond.], 170: 296. [270, 271]

Robinson, H. F., and Comstock, R. E. 1955. Analysis of genetic variability
in corn with reference to probable effects of selection. Cold Spr.
Harb. Symp. quant. Biol., 20: 127-135. [140, 284, 290]

Robinson, H. F., Comstock, R. E., Khalil, A., and Harvey, P. H. 1956.
Dominance versus over-dominance in heterosis: evidence from
crosses between open-pollinated varieties of maize. Amer. Nat.,
90: 127-131. [290]

Robson, E. B. 1955. Birth weight in cousins. Ann. hum. Genet. [Lond.], 19:
262-268. [141, 163, 185]

Russell, E. S. 1949. A quantitative histological study of the pigment found
in the coat-color mutants of the house mouse. IV. The nature of
the effects of genic substitution in five major allelic series. Genetics,
34: 146-166. [116, 126]

Schafer, W. 1937. Über die Zunahme der Isozygotie (Gleicherbarkeit)
bei fortgesetzter Bruder-Schwester-Inzucht. Z. indukt. Abstamm.-
u. Vererblehre, 72: 50-78. [91, 97]

Searle, A. G. 1949. Gene frequencies in London's cats. J. Genet., 49:
214-220. [18]

Sheppard, P. M. 1958. Natural Selection and Heredity. London: Hutchin-
son & Co. (Publishers) Ltd. 212 pp. [44]

Shoffner, R. N. 1948. The reaction of the fowl to inbreeding. Poult. Sci.,
27: 448-452. [249]

Sierts-Roth, U. 1953. Geburts- und Aufzuchtgewichte von Rassehunden.
Z. Hundeforsch., 20: 1-122. [216]

Slizynski, B. M. 1955. Chiasmata in the male mouse. J. Genet., 53: 597-
605. [99]

Smith, H. Fairfield. 1936. A discriminant function for plant selection.
Ann. Eugen. [Lond.], 7: 240-250. [325]

Smith, H. H. 1952. Fixing transgressive vigor in Nicotiana rustica.
Heterosis, ed. J. W. Gowen. Ames: Iowa State College Press. Pp.
161-174. [260]

Snedecor, G. W. 1956. Statistical Methods. Ames: Iowa State College
Press. 5th edn. xiii + 534 pp. [144, 173]

Sprague, G. F. 1952. Early testing and recurrent selection. Heterosis, ed.
J. W. Gowen. Ames: Iowa State College Press. Pp. 400-417.

[283]
Stern, C. 1943. The Hardy-Weinberg law. Science, 97: 137-138. [9]

1949. Principles of Human Genetics. San Francisco: W. H. Freeman &
Co. xi + 617 pp. [13, 183]

Tantawy, A. O., and Reeve, E. C. R. 1956. Studies in quantitative inheri-
tance. IX. The effects of inbreeding at different rates in Drosophila
melanogaster. Z. indukt. Abstamm.-u. Vererblehre, 87: 648-667.

[249, 268, 290]
Waddington, C. H. 1942. Canalisation of development and the inheritance
of acquired characters. Nature [Lond.], 150: 563. [310]

1953. Genetic assimilation of an acquired character. Evolution, 7: 118-
126. [310, 311]

1957. The Strategy of the Genes. London: George Allen & Unwin, Ltd.
ix + 262 pp. [43, 146, 271, 311, 340, 341, 342]

Wallace, B. 1958. The comparison of observed and calculated zygotic
distributions. Evolution, 12: 113-115. [12]

Wallace, B., and Vetukhiv, M. 1955. Adaptive organization of the gene
pools of Drosophila populations. Cold Spr. Harb. Symp. quant.
Biol., 20: 303-309. [262]

Warren, E. P., and Bogart, R. 1952. Effect of selection for age at time of
puberty on reproductive performance in the rat. Sta. Tech. Bull. Ore.
agric. Exp. Sta., No. 25: 27 pp. [168]

Warwick, B. L. 1932. Probability tables for Mendelian ratios with small
numbers. Bull. Tex. agric. Exp. Sta., No. 463: 28 pp. [105]

Warwick, E. J., and Lewis, W. L. 1954. Increase in frequency of a de-
leterious recessive gene in mice. J. Hered., 45: 143-145. [113]
Weinberg, W. 1908. Über den Nachweis der Vererbung beim Menschen.
Jh. Ver. vaterl. Naturk. Württemb., 64: 369-382. [9]
Weir, J. A. 1955. Male influence on sex ratio of offspring in high and low
blood-pH lines of mice. J. Hered., 46: 277-283. [321]
Weir, J. A., and Clark, R. D. 1955. Production of high and low blood-pH
lines of mice by selection with inbreeding. J. Hered., 46: 125-132.

[321]
Whatley, J. A. 1942. Influence of heredity and other factors on 180-
day weight in Poland China swine. J. agric. Res., 65: 249-264.

[167]
Williams, E. J. 1954. The estimation of components of variability. Tech.
Pap. Div. math. Statist. C.S.I.R.O. Aust., No. 1: 22 pp. [173]

Wright, S. 1921. Systems of mating. Genetics, 6: 111-178.

[2, 4, 22, 90, 165]

1922. Coefficients of inbreeding and relationship. Amer. Nat., 56: 330-
338. [22, 87, 88]

1931. Evolution in Mendelian populations. Genetics, 16: 97-159.

[4, 53, 54, 69, 75, 92]

1933. Inbreeding and homozygosis. Proc. nat. Acad. Sci. [Wash.], 19:
411-420. [90, 92]

1934. The method of path coefficients. Ann. math. Statist., 5: 161-215.

[90]

1935a. The analysis of variance and the correlations between relatives

with respect to deviations from an optimum. J. Genet., 30: 243-256.

[338]
1935b. Evolution in populations in approximate equilibrium. J. Genet.,

30: 257-266. [338]

1939. Statistical genetics in relation to evolution. Actualités scientifiques
et industrielles, 802. Paris: Hermann et Cie. 63 pp. [70]

1940. Breeding structure of populations in relation to speciation.
Amer. Nat., 74: 232-248. [71, 77]

1942. Statistical genetics and evolution. Bull. Amer. math. Soc., 48:
223-246. [75, 79]

1943. Isolation by distance. Genetics, 28: 114-138. [77]
1946. Isolation by distance under different systems of mating. Genetics,

31: 39-59. [77]

1948. On the roles of directed and random changes in gene frequency

in the genetics of populations. Evolution, 2: 279-294. [75]

1951. The genetical structure of populations. Ann. Eugen. [Lond.], 15:

323-354. [75, 76, 77]

1952a. The theoretical variance within and among subdivisions of a
population that is in a steady state. Genetics, 37: 312-321. [57]
1952b. The genetics of quantitative variability. Quantitative Inheritance,
edd. E. C. R. Reeve and C. H. Waddington. London: H.M.S.O.
Pp. 5-41. [219, 292, 297, 341]

1954. The interpretation of multivariate systems. Statistics and Mathe-
matics in Biology, edd. O. Kempthorne, T. A. Bancroft, J. W.
Gowen and J. L. Lush. Ames: Iowa State College Press. Pp. 11-33.

[90]
1956. Modes of selection. Amer. Nat., 90: 5-24. [263]

Wright, S., and Kerr, W. E. 1954. Experimental studies of the distribu-
tion of gene frequencies in very small populations of Drosophila
melanogaster. II. Bar. Evolution, 8: 225-240. [74, 80, 81]

Wright, S., and McPhee, H. C. 1925. An approximate method of cal-
culating coefficients of inbreeding and relationship from livestock
pedigrees. J. agric. Res., 31: 377-383. [87]

Yoon, C. H. 1955. Homeostasis associated with heterozygosity in the
genetics of time of vaginal opening in the house mouse. Genetics,
40: 297-309. [271]

Zeleny, C. 1922. The effect of selection for eye facet number in the white
bar-eye race of Drosophila melanogaster. Genetics, 7: 1-115. [107]

SUBJECT INDEX

Adaptive value, 26.
additive:

action of genes, 126, 138;

combination of loci, 116-7;

effects, 122;

genes, 124, 138;

variance, 135-8.
albinism in man, 13, 36.
assimilation, genetic, 310-1.
assortative mating, 22, 164, 170-1.
asymmetry in selection response,
212-5;

as scale effect, 296-7.
average effect, 117-20.

Base population, 49, 61, 95-6.
blood groups:

in man, 5, 7, 12, 16, 44;

in poultry, 290;

selective advantage, 44.
Breeding value, 120-5;

difference between definitions,
158.

Canalisation, 272, 308, 341.

cats, 18-19.

cattle, dropsy in, 13.

causal components of variance, 150.

Cepaea nemoralis, 43, 78-9, 83-4.

coadaptation, 263.

coancestry, 88-90, 233.

coefficient:

of inbreeding, see under Inbreed-
ing;

of relationship, 233;

of selection, 28.
combining ability, 281-6.
continuous variation, 104-11.
correlated characters, 312-29, 335-6.
correlation (between characters),
312-29;

genetic, 313-8.
correlation (between relatives):

of breeding values, 233;

phenotypic, 151, 162-3.
covariance, 151-2;

environmental, 159-61;

genetic, 152-9,

offspring-parent, 152-6,
sibs, 154, 156-7;

phenotypic, 161-4.
crossbreeding:

heterosis, 255-63;

in plant and animal improvement,
276-86;

variance between crosses, 279-83.

Developmental variation, 141, 143.
deviation:

dominance, 122-5;

environmental, 112;

interaction (epistatic), 125-8.
discontinuous variation, 104, 108,

301.
dispersive process, 23, 47-8 (see also Inbreeding).
dominance, 27, 113;

deviation, 122-5;

directional, 213;

effect on variance, 137;

and fitness, 337;

and heterosis, 257;

and inbreeding depression, 251;

and scale, 298.
drift, random, 50-7;

in natural populations, 81-4.
Drosophila melanogaster:

Bar, 80-1, 107;

bristle number:

components of variance, 140,

148,
fitness relationship, 333, 343,

frequency distribution, 107,
heritability, 169-70, 177,
mutation, 342,
number of "loci", 219,
random drift, 265,
repeatability, 145,
response to selection, 190, 195,
209, 210, 216, 221, 223, 245-
246;
brown, 52, 53, 56, 59;
effective population size, 73-4;
egg number, 140, 282, 336;
ovary size, 140, 145;
raspberry, 34;
thorax length:

components of variance, 130,

140,
response to selection, 209, 211-
212, 216, 219, 221, 319;
wing length, 171-2, 319.
Drosophila pseudoobscura, 262.
Drosophila subobscura, 252.
Drosophila tropicalis, 39.
dwarfism (chondrodystrophy) in
man, 38.

Effective factors, number of, 217.
effective population size (number),

68-74;
ratio of, to actual number, 73-4.
environment, 112 (see also under
Variance);
common, 159-61.
epistasis, 126 (see also under Interaction),
equilibrium:

Hardy-Weinberg, 9-12;
under inbreeding, 74-81,

and selection for heterozygotes,
100-3;
with linked loci, 20-1;
with more than one locus, 19-20;
under mutation, 25-6,

with selection, 36-41;
under natural selection, 331-2;
under selection for heterozygotes,
41-6.

eugenics, 36, 40.
euheterosis, 262.

Factors, effective number of, 217.
family size:

and heritability estimates, 177-83;

and inbreeding, 70-3;

and selection, 233-46.
fitness, natural, 26, 167, 329, 330-43.
fixation, 54-7, 66-7, 97;

of deleterious genes, 80-1.

Gene frequency, 6-22;
change of, 23-36,

by selection for metric character,
203-7;
directional, 213;
distributions of:

with inbreeding, 52, 55, 84,
with inbreeding and mutation,

76,
with inbreeding and selection,
79, 81;
effect on variance, 137;
sampling variance of, 50-4, 64.
generation interval, 196-8.
genetic death, 39-40.
genotype, 112;
frequencies, 5-7,

with inbreeding, 57-9, 65-6;
with random mating, 9-22.
genotype-environment correlation,

132-3.
genotype-environment interaction,

133-4, 148-9, 322-4.
genotypic value, 112-4, 123-5, 132.

Hardy-Weinberg law, 9-15.
heritability, 135, 163, 165-7;

estimation, 168-85;

examples, 167-8;

of family means, 232-7;

after inbreeding, 268;

precision of estimates, 177-83;

realised, 202-3, 296-7;

of threshold characters, 303;

of within-family deviations, 232-7.

heterosis, 254;

examples, 260;

in single crosses, 255-61;

in wide crosses, 261-3;

utilisation of, 276-86.
heterozygotes:

frequency of,

with inbreeding, 65-6,

with random mating, 9-13, 38;

selection for, see under selection,
homeostasis:

developmental, 270-2;

genetic, 331.
hybrid vigour, see heterosis,
hybrids, uniformity of, 270-2, 275,
296.

Idealised population, 48-50.
inbreds:

experimental use, 272, 275;
sub-line differentiation, 273-4;
variability, 270-1, 296.
inbreeding, 60-7;
coefficient, 61;
computation,

from pedigrees, 86-8,
from population size, 61-4,
for regular systems, 90-5;
depression, 247-54,

examples, 249;
rate of, 63, 69-70, 92, 96, 101-2;
regular systems, 90-5;
and variance, 265-72.
index for selection, 325-8.
incidence of threshold character, 302.
intangible variation, 141.
integration, 263.
intensity of selection, 192.
interaction:

between loci (epistatic), 125-8,
263,
and heterosis, 259, 262,
and inbreeding, 252,
and scale, 298-9;
deviation, 125-8;

between genotype and environ-
ment, 133-4, 148-9, 322-4.

island model, 77.
isoalleles, 344.
isolation by distance,

77-9.

Line (subdivision of a population),

49.
linkage:

and correlation, 312, 320;

and Hardy-Weinberg equilibrium,

20-21;
and inbreeding, 97-100;
and polygenic balance, 340-1;
and resemblance between rela-
tives, 158-9.
logarithmic transformation, 297.
luxuriance, 262.
Lycopersicon, 260, 300.

Maize, 277, 290.
Man:

albinism, 13, 36;
birth weight, 141-2;
blood groups, 5, 7, 12, 16, 44;
dwarfism (chondrodystrophy), 38;
sickle-cell anaemia, 44-6.
maternal effects, 140-2, 160, 214,

252-3, 260-1.
mating, types of, 15.
metric character, 104-11.
migration, 23;

in small populations, 75-9.
mouse:

blood-pH, 321;
body weight:

fitness relationship, 335-6,
number of "loci", 219,
realised heritability, 203,
response to selection, 199, 214,

216, 220,
selection differentials, 201,
sib-analysis, 175-6,
variance and scale, 295;
growth rate, 107, 133;
litter size:

frequency distribution, 107,

heterosis, 255,

inbreeding depression, 252-3,

repeatability, 144;
non-agouti, 51;

pigment granules, 116-7, 126-8;
pygmy, 113-5, 120-3, 136, 222-3,

289, 299;
sex ratio, 321;
skeletal variants, 274;
vertebrae, number of, 273-4, 305-308.
multiple alleles, 15-17, 42, 138.
multiple measurements, 142-9.
mutation, 23;

balanced against selection, 36-41;
change of gene frequency by, 24-6;
and inbreeding, 75-9, 100, 274-5;
and origin of variation, 342-3;
rate:

estimation of, 38,
increase of, 26, 39.

Neighbourhood model, 77-9.
Nicotiana, 260.
non-additive:

combination of genes, 125-8;

variance, 139-40, 280, 287, 337.

Observational components of vari-
ance, 150.
overdominance, 27, 287-91;
effect on variance, 137;
equilibrium gene frequency, 41-6;
and fitness, 339, 344;
in selection experiments, 213, 222-
223.

Panmictic index, 64, 66.

panmixia, 8.

pedigrees and inbreeding, 85-90.

pigs:

body-length, 174-5;

litter size, 253.
pleiotropy, 289-91, 312-3, 328-9,

333.
polycross, 282.
polygenes, 106.
polygenic balance, 340-1.
polygenic variation, 106.

population:

base-, 49, 61, 95-6;

effective size (number), 68-74,
ratio of, to actual size, 73-4;

-mean, 113-7;

size, 50.
premisses, 2, 3.
probit transformation, 302.
progeny testing, 229-30.
proportionate effect, 207, 219.

Quantitative character, 104.
Quasi-continuous variation, 301.

Radiation, 26, 39.
random drift, 51-7;

in natural populations, 81-4.
random mating, 8-21.
range, total, 115, 116, 215-9.
regression, offspring on parents,

151, 162-3.
relatives, resemblance between,

150-64.
repeatability, 143-9.

Scale, 108-9, 292-300;
-effects, 293;
underlying, 301.
segregation index, 217.
selection, 23, 26, 186-7;

balanced by mutation, 36-41;
change of gene frequency, 28-36;
coefficient of, 28,

related to intensity of, 203-7;
combined, 227, 236-7, 239-40;
for combining ability, 283-6;
correlated response to, 318-24;
in different environments, 322-4;
-differential, 187, 191-8,

weighting of, 200-2;
for economic value, 324-9;
eugenic effects, 36, 40-1;
family, 227-8 (see also Selection,
methods),
and family size, 243-5;
for heterozygotes, 41-6, 213, 222-
223, 339,

affecting inbreeding, 100-3,

253-4;
-index, 325-8;
indirect, 320-4;
individual, 227 (see also Selection,

methods);
intensity of, 192,

related to coefficient of, 203-7;

for intermediates, 338-42;

-limit, 215, 219-24, 328-9;

long-term results, 215-24;

mass, 227;

methods (use of relatives), 225-31,
heritabilities, 232-6,
relative merits, 237-44,
responses expected, 231-7;

natural, 187, 200-2, 212, 253, 266,

329-43;
reciprocal, 284;
recurrent, 283, 286;
response, 187-91,

asymmetry, 212-5, 296-7,
duration, 215-7,
measurement, 198-203,
number of "loci", 217-9,
prediction, 189-91, 214-5,
repeatability, 208-12,
total, 215-7;
sib, 229 (see also Selection,

methods);
in small populations, 79-81;
for threshold characters, 308-11;
variable, 339-40;

within-family, 227 (see also Selec-
tion, methods),
selective value, 26.
self-fertilising plants, 247, 276-7.
sex-linked genes, 17-19, 34.
sickle-cell anaemia, 44-6.
sib-analysis, 172-6.
snails, 43, 78-9, 83-4.

systematic processes, 23,

in small populations, 74-81.

Threshold characters, 301-11, 321.
transformation of scale, 108-9, 292-
300;

logarithmic, 297;

probit, 302.
tobacco, 260.
tomato, 260, 300.
top-cross, 282.
twins, 131, 183-5.

Uniformity of inbred lines, 54, 66-7,
97, 100-3.

Value:

genotypic, 112-4, 123-5;
phenotypic, 112.
variance:

additive, 135-8;
between crosses, 279-83;
components, 129-30,

causal, 150,

genetic, 134-40,

observational, 150;
dominance, 135-8, 163,
environmental, 130-4, 140-9,

common, 159-61,

general, 143-9,

inbreeding effects, 270-2,

special, 143-9;
genotypic, 130-4;
inbreeding effects, 265-72;
interaction (epistatic), 138-40,

and resemblance between rela-
tives, 157-9;
non-additive, 139-40, 280, 287,

337.
variation:

continuous, 104-11;
discontinuous, 104, 108, 301;
quasi-continuous, 301.
