\l^
NAVAL RESEARCH
L
2 0 J30
o
O
m
— o
mr
SEPTEMBER 1980
VOL. 27, NO. 3
OFFICE OF NAVAL RESEARCH
NAVSO P-1278
NAVAL RESEARCH LOGISTICS QUARTERLY
EDITORIAL BOARD
Marvin Denicoff, Office of Naval Research, Chairman Ex Officio Members
Murray A. Geisler, Logistics Management Institute
W. H. Marlow, The George Washington University
Thomas C. Varley, Office of Naval Research
Program Director
Seymour M. Selig, Office of Naval Research
Managing Editor
MANAGING EDITOR
Seymour M. Selig
Office of Naval Research
Arlington, Virginia 22217
ASSOCIATE EDITORS
Frank M. Bass, Purdue University
Jack Borsting, Naval Postgraduate School
Leon Cooper, Southern Methodist University
Eric Denardo, Yale University
Marco Fiorello, Logistics Management Institute
Saul I. Gass, University of Maryland
Neal D. Glassman, Office of Naval Research
Paul Gray, Southern Methodist University
Carl M. Harris, Center for Management and
Policy Research
Arnoldo Hax, Massachusetts Institute of Technology
Alan J. Hoffman, IBM Corporation
Uday S. Karmarkar, University of Chicago
Paul R. Kleindorfer, University of Pennsylvania
Darwin Klingman, University of Texas, Austin
Kenneth O. Kortanek, Carnegie-Mellon University
Charles Kriebel, Carnegie-Mellon University
Jack Laderman, Bronx, New York
Gerald J. Lieberman, Stanford University
Clifford Marshall, Polytechnic Institute of New York
John A. Muckstadt, Cornell University
William P. Pierskalla, University of Pennsylvania
Thomas L. Saaty, University of Pittsburgh
Henry Solomon, The George Washington University
Wlodzimierz Szwarc, University of Wisconsin, Milwaule
James G. Taylor, Naval Postgraduate School
Harvey M. Wagner, The University of North Carolina
John W. Wingate, Naval Surface Weapons Center, Whit 0a<
Shelemyahu Zacks, Virginia Polytechnic Institute and
State University
The Naval Research Logistics Quarterly is devoted to the dissemination of scientific information in logistic am
will publish research and expository papers, including those in certain areas of mathematics, statistics, and econciics
relevant to the over-all effort to improve the efficiency and effectiveness of logistics operations.
Information for Contributors is indicated on inside back cover.
The Naval Research Logistics Quarterly is published by the Office of Naval Research in the months of March, jnt
September, and December and can be purchased from the Superintendent of Documents, U.S. Government Pr:ting
Office, Washington, D.C. 20402. Subscription Price: $1 1.15 a year in the U.S. and Canada, $13.95 elsewhere. C<t of
individual issues may be obtained from the Superintendent of Documents.
The views and opinions expressed in this Journal are those of the authors and not necessarily those of the (fi«
of Naval Research.
Issuance of this periodical approved in accordance with Department of the Navy Publications and Printing Regula >ns,
P-35 (Revised 1-74).
ON THE RELIABILITY, AVAILABILITY AND BAYES CONFIDENCE
INTERVALS FOR MULTICOMPONENT SYSTEMS
William E. Thompson
Columbia Research Corporation
Arlington, Virginia
Robert D. Haynes
ARINC Research Corporation
Annapolis, Maryland
ABSTRACT
The problem of computing reliability and availability and their associated
confidence limits for multi-component systems has appeared often in the litera-
ture. This problem arises where some or all of the component reliabilities and
availabilities are statistical estimates (random variables) from test and other
data. The problem of computing confidence limits has generally been con-
sidered difficult and treated only on a case-by-case basis. This paper deals with
Bayes confidence limits on reliability and availability for a more general class of
systems than previously considered including, as special cases, series-parallel
and standby systems applications. The posterior distributions obtained are ex-
act in theory and their numerical evaluation is limited only by computing
resources, data representation and round-off in calculations. This paper collects
and generalizes previous results of the authors and others.
The methods presented in this paper apply both to reliability and availability
analysis. The conceptual development requires only that system reliability or
availability be probabilities defined in terms acceptable for a particular applica-
tion. The emphasis is on Bayes Analysis and the determination of the posterior
distribution functions. Having these, the calculation of point estimates and
confidence limits is routine.
This paper includes several examples of estimating system reliability and
confidence limits based on observed component test data. Also included is an
example of the numerical procedure for computing Bayes confidence limits for
the reliability of a system consisting of A' failure independent components con-
nected in series. Both an exact and a new approximate numerical procedure for
computing point and interval estimates of reliability are presented. A compari-
son is made of the results obtained from the two procedures. It is shown that
the approximation is entirely sufficient for most reliability engineering analysis.
INTRODUCTION
The problem of computing reliability, availability, and confidence limits for multicom-
ponent systems where some or all of the component reliabilities and availabilities are statistical
estimates from test and other data has appeared often in the literature. The problem of com-
puting these confidence limits has generally been considered difficult and treated only on a case
by case basis. The present paper deals with Bayes confidence limits on reliability and steady
state availability for a general class of fixed mission time, two-state systems including, as special
cases, series-parallel, stand-by and others that appear in the applications. Further, a fixed mis-
sion length is assumed. It is also assumed that neither reliability growth nor deterioration occur
during the life of the system and the system becomes as good as new after each repair. Finally,
we assume that no environmental changes, which could affect reliability occur. The posterior
345
346 WE. THOMPSON AND R.D. HAYNES
distributions obtained are exact in theory and their numerical evaluation is limited only by com-
puting resources, data representation and round-off in calculation. The present paper collects
and generalizes previous results of the authors and others.
The methods obtained in the following apply both to reliability and steady state availability
analysis and to avoid repeated reference to "reliability or availability", the discussion references
only reliability with the understanding that the terms system reliability R and component relia-
bility r, can be replaced by system availability A and component availability a,. The conceptual
development requires only that R and A be probabilities defined in terms acceptable for a par-
ticular application. The emphasis is on the determination of the posterior distribution func-
tions. Having these, the calculation of point estimates and confidence intervals is routine.
BAYES CONFIDENCE INTERVALS
In the Bayes inference model, the unknown probability, R, 0 < R ^ 1, is considered a
random variable whose posterior density is the result of combining prior information with test
data to obtain a probability density function f(R) for R. If the posterior density of R is seen to
be spread out, then relatively more uncertainty in the value of R obtains than when the poste-
rior density is concentrated closely about some particular value. The posterior density function
provides the most complete form of information about R, but sometimes summary information
is desired. A point estimate is one such form of summary information and this can be selected
in various ways and is analogous to the familiar statistical problem of characterizing an entire
population by some parameter value. Examples are mean, mode, median, etc. A point esti-
mate has the disadvantage of ignoring the information concerning the uncertainty in the un-
known reliability. Confidence intervals derived from /(/?) provide such additional information.
The true but unknown (and unknowable except with infinite data) reliability R0 is some specific
value of the random variable /?, 0 < R < 1. Conceptually, R0 can be considered a random
sample from 0 < /? < 1 made when the system was built. We can never know that R0 is, but
rR
f(R) gives a measure of the likelihood that RQ= R for each 0 < fl < 1. If F(R) = J
f{R)dR denotes the distribution function of R then
Prob {/?, ^ R0 ^ R2) = F(R2) - F(RX)
and [R], R2] is an interval estimate of R of confidence c = F(R2) - F(R{). The interpreta-
tion is simply that, based on the prior and current data the probability is c that the unknown
system reliability lies between R\ and R2. The interval [R\, R2] has been called [25] a Bayes c
level confidence interval. For R2= 1, R\ is called the lower c level confidence limit. For
R{ = 0, R2 is called the upper c level confidence limit. Given f(R) and F(R), Bayes
confidence limits for any c can be obtained by graphical or numerical methods and the pro-
cedure is generally not difficult. Numerical examples and discussion of numerical methods are
given in [25,27,8,26,28,29].
DEFINITION OF STRUCTURE FUNCTION
To establish the relationship between the reliabilities of the components of a system and
the reliability of the entire system, the way in which performance and failure of the com-
ponents affects performance and failure of the system must be specified. For this purpose, as
in [5,10,15], the state of any component is coded 1 when it performs and 0 when it fails. The
state of all A; components of the system can then be coded by a vector of N coordinates
X = \X\, X2, . . . , Xfri)
RELIABILITY FOR MULTICOMPONENT SYSTEMS 347
where x, = 0 means the i-th component fails and x, = 1 means that it does not fail. All possi-
ble states of the system are represented by the 2A/ different values this vector can assume.
Where an explicit mission time dependence is required, a random process y(t) =
[y\U), • • • , ^a/(^)} can be defined as in [15] so that to each component trajectory a measure x,
is assigned. Then, for example: x, = 1 if y,(t) is a failure-free process over some interval
O^T/i.^ t ^ 7/2, and xi = 0 if at least one failure occurs.
Some of the 2N states cause the system to fail and the others cause the system to perform.
The response of the system as a whole is written as a function </>(x) of x such that 0(x) = 0
when the system is failed in state x, and </>(x) = 1 when the system performs in state x. This
function <f>(x) is known in the literature [5,10,23] and has been called a structure function of
order N.
The structure function can be written in a systematic way for any series parallel system.
When the system is not too large the structure function can also be written by observation for
many more general systems. The structure function can always be written for a system of N
components by enumeration of its 2N states. For large systems this is at best very tedious, but
generally short cuts can be found which simplify the process. The structure function is con-
venient for conceptual development of the theory and provides a very general notation which is
why it is used here. What is required in the application of the present results is the formula for
system reliability in terms of component reliabilities as is done in [25,27, and 8]. The structure
function provides this formula in a general form but other methods are available. Some of
these methods are identified and referenced in [17] along with a new and useful algorithm
based on graph theory.
DEFINITION OF RELIABILITY FUNCTION
Assume that the components of the system are failure independent so that the elements
of the state vector x = (x1( ... , xN) are independent random variables with probability distri-
butions
Pr[x, = 1) = r,
Pr{x, = 0) = 1 - r,
where r, is the reliability of the i-th component.
The structure function 4>(x) is also a random variable with
Pr{(f>(x)= 1} = R
Pr{cf>(x) = 0} = 1- R
where R is the reliability of the system. R is the expected value of 4>(x) so that
(1) R = £{<Mx)} = £ <Mx)//' (1 - r0 1_Xl ... r» *" (1 - rN) l~x"
where the summation is over all 2A' states of the system.
In a particular application given the structure function and the values of all component
reliabilities, the system reliability, R, can be computed explicitly using (1). References [5],
[10], and [23] provide further discussion with examples of </> and R.
348 WE THOMPSON AND R.D. HAYNES
RELIABILITY ESTIMATION FROM TEST DATA
In many applications the system structure is known but some or all of the component reli-
abilities are unknown and must be estimated from tests and other data. As a result, statements
concerning these component and system reliabilities are subject to the uncertainties of statistical
estimation. A method of treating this uncertainty is provided by a Bayes analysis which consid-
ers the unknown component reliabilities as random variables and leads to Bayes confidence
intervals for both component and system reliabilities. The following is an extension and gen-
eralization of previous analysis of this kind [25,27,8,7,14,23,29].
BAYES MODEL
Assume a system of A7 failure independent components has a known structure function
(f>(x) and reliability function R (r) r — (r\, ...,%) of the form (1). Suppose that among the
N separate components of the system some are known to have identical reliabilities say ij, and
k, for example, then since r, = rj = rk, the symbols /) and rk can be replaced by r, everywhere
in (1). Finally, in this way there remain only N' < N different r's, one of each reliability
value. In addition, suppose that among the N' different component reliabilities N'—n are
known constants and thus there remain n different types of components with unknown reliabili-
ties. By a simple change in notation these n different, unknown reliabilities are denoted by
P = (P\,P2> ■■■ > Pn)-
By multiplying out factors (1 — p) and collecting terms, the system reliability (1) can then
be written in the equivalent form
(2) R(p) = T,a0lPx0]l ■••/>,/'"
j
where the constants ay, are integer for / ^ 0.
Using a Bayes inference model, the unknown p, are considered independent random vari-
ables with known posterior density functions,
' fM, 0< p{ < 1, /- 1, ..., n.
The system reliability, R(p) is then also a random variable, defined by (1) with unknown
distribution function H(R).
In applications, what is required is the calculation of H(R) given the f,(pi)\
/ = 1, 2, . . . , n. Having obtained //(/?), point estimates and confidence intervals on R can be
obtained directly. This result is also required for risk, cost and other analyses based on the
Bayes model. The method for an explicit numerical evaluation is presented in the following
section.
EVALUATION OF THE POSTERIOR DISTRIBUTION
The proposed method of evaluating the posterior distribution function H(R) is based on
an expansion of H(R) in Chebyshev polynomials of the second kind [1,16]. The main advan-
tages of this method lie in the rapid convergence properties of the Chebyshev expansion and
the convenient numerical computation for its evaluation. Although a description of the pro-
cedure has been presented in [8] and [7], for the sake of completeness, we shall outline the
main steps below.
RELIABILITY FOR MULTICOMPONENT SYSTEMS 349
Expansion by Chebyshev Polynomials
Let H{R) denote the posterior distribution,
H(R) = f h(R)dR, 0 ^ R ^ 1,
" o
where h(R) is the posterior density of the reliability of the overall system. By definition,
H(R) satisfies the boundary conditions:
(3) //(0) = 0; //(l) = 1
Let us introduce a new function Q (R ) defined by
(4) Q(R) = H(R)- R
the Q (R ) satisfies the boundary conditions
(5) 0(0) = 0(1) = 0
and can be expanded in a Fourier sine series of the following form:
(6) Q(R)= — sin9[b0+bl ;
77 ■" sin6»
sin (* + 1)0 j
k sin 9 "
where the angular variable 9 is related to R by the relation
(7) R = cos2 -|.
The coefficients /^ of the expansion (6) can be determined by:
(8) bk = JQ' [//(/?) -*] #JE (/?)£//?
where U* (R) = — is the shifted Chebyshev polynomial of the second kind [1,16]
sin#
which can be computed by the recursion relations:
(9) Ut+i (R) = (4R - 2) U*k (R) - U*k-i (R)
with
U$(R)=l U\ (R ) = - 2 + 4R
U$(R) = 3- 16/? + 16/?2
If we express U*k (R ) explicitly as a /cth order polynomial
(10) U*k (/?) = £ ClkRk
(=0
J RiH{R)dR- j R'+ldR
then Equation (8) becomes
, R'H(R)dR- f
'o -Jo
It can be shown, integrating by parts, that
(12) M\H{R)\ = -^t- {1 - Mi+1[h(R)]},
(ii) h = Hcik
f-0
350 W.E. THOMPSON AND R.D. HAYNES
Thus, Equation (11) becomes
(13) bk = £ Q
*=0
l-Mi+l[h(R)]
i + 1 i+2
Note that the Chebyshev coefficients Cik can be computed independently of the moments.
They may be stored in the form of a triangular matrix if sufficient storage space is available. A
simple algorithm for recursively calculating the coefficients is Ci>k+] =4C,_liA -2CU -Cik_h
Computations and Results
To complete the analysis it remains to compute the moments of h (R ) given the density
functions /■(/?,) and then use (13) to compute the br.
From (2) Rk(p) k = 1, 2, ... can be written as a finite sum
(14) **(/>) = ! Wi°"A ■■■PRa"lk-
j
where the aljk are independent of the pt and also integers for / ^ 0. Using this result and the
fact that the expected value of a sum is the sum of the expected values and the expected value
of a product of independent random variables is the product of the expected values, it follows
that
(15) Mk{h) = ^aojkMa]ik...Maiuh
j
where Ma , denotes the aijk 'th moment of p, .
Ilk <f
Having determined the coefficients bk we can write down the final expression for H(R)
from Equations (4) and (6) as follows:
(16) //(/?) = R + —y/R(l-R) {60+ b\U*(R) +... + bk Vt(R) + •••)•
77
This result is exact in the sense that the error can be made arbitrarily small by taking a :
sufficient number of terms. References [8] and [7] give a discussion of numerical considera-
tions and examples. Generally, (16) has been found very convenient for numerical calculation
using an electronic digital computer.
MODELS FOR APPLICATION
To evaluate H(R) the posterior distribution /•(/?,•) for each different component reliability
Pi is required. The derivation of these require application of Bayes inference procedures on a
case by case basis. The theory can be found in [20,4,19,2,3,24,6,18] and some specific applica-
tions in [25,27,8,7,14,1,16,12,23]. A tabulation for some familiar models of mathematical reli- |
ability theory is presented in the following.
Component With Constant Failure Rate
A single component has an unknown constant failure rate X and fixed mission time /. j
Component reliability p = exp(-\/) is regarded as a random variable. The natural conjugate
prior density function is
P(p)=Cpb'>lnO/pY°
RELIABILITY FOR MULTICOMPONENT SYSTEMS 351
with parameters b0 and r0 . When test data consists of T operating hours after r failures,
t= ti + t2+ ... + tr+ (m - r)tr.
Here tr is the time of the r-th failure among m initially on test. Failures are not replaced and
the test is terminated at the r-th failure. The resulting posterior density function of p is
f{p\aMa0,b0) = {b + X}T Pb(\n\lp)°,
Y{a + \)
where a = r + r0 and b = t/t + b0. The k-ih moment of f(p) is
Mk{f) = (b + l)a+1 (k + b + I)-""1.
The above results are from Reference [25].
Component Having Fixed Probability of Success
A single component has an unknown fixed probability of success, p. In testing, there
were observed m successes in n trials. For the natural conjugate Beta prior density function
with parameters m0 and n0 the posterior density function of p is
where
a = m + m0, b = n + n0 -a and B(a + 1, b + 1) = f pa{\ - p)b dp.
J o
The Ar-th moment of f{p\a,b),
k = 0,1, 2, ... is:
(b - a + \)\ (a + k)\ Y{b-a + 2) T(a + k + 1)
Mk{f)
kJ' a\ (b-a + kV. T(a + 1) Y{b-a + k + \)'
This result is from [26].
Steady State Availability of Component With Repair
A two state component has exponential distributions of life and of repair times. The
duration of intervals of operation and repair define two different statistically independent
sequences of identically distributed, mutually independent random variables. Both the mean-up
time, 1/X, and mean repair time l//x are unknown parameters estimated from test and prior
data.
The long term availability of the component is a function of the random variables p, and X
i.e.:
a = fi/ik + /x).
Assuming gamma priors for X and /x with snapshot, life and repair time data, the posterior den-
sity of availability a is the Euler density function:
f( I ^ (1-5)" aw-Hl- aY~l
0< a ^ !;/• > 0; w > 0, |8| < 1.
352 WE THOMPSON AND R.D. HAYNES
The parameters r, w and 8 are determined by test data and prior information as defined in
[25].
The moments of f(a) are given in [25] in terms of Gauss' hypergeometric function 2^1
(w + r, w + k; w + r 4- k\h). (Note the typographical error in [25] where k in 2^1 is
replaced by r.)
A special case of this availability model treating only "snapshot" data is given in [28],
Snapshot data defined in [25,28] records only the state of the system (up or down) at random
instants of time.
RULES OF COMBINATION FOR SOME BASIC SYSTEM ELEMENTS
Components are often combined to form system elements which are special in some
sense. For example, the same multicomponent element may appear several times as a unit in
the same system. In this case, it may be convenient to treat the element as a single system
component. Some simple multicomponent system elements are presented in the following:
N Identical Components in Series
The reliability, p, of N identical components in series is p = P\.
Component reliability p\ is a random variable in the Bayes representation with known pos-
terior density, f\(p\). The moments MkX{f\\\ k = 0,1, ... ; of f\(p\) are then also known.
The moments Mk{f) of the posterior density f(p) of p are related to moments of the f, by
Mk[f} = MNkA{fl};k = 0,1,2, ....
Using this result one can write the moments of the posterior density of series combinations of
any of the special components treated in the previous section.
N Identical Redundant Components
When only one is required to operate in order that the system operates, then the reliabil-
ity, p, of N identical failure independent redundant components is p = 1 - (1 - P\)N where P\
is the Bayes representation of the component reliability p\. It is shown in [8] that the moments
Mk\f) of the posterior density f(p) of p are related to the moments MkX{j\\ of the posterior
density J\{p\) of P] by the relation
Mk{f} = t ("l)j
7=0
MjilA
By alternately applying this result and the previous one for components in series, the
moments of the posterior density of any series parallel system of components can be obtained.
A "2 out of 3" Element
An element consisting of three identical failure independent components, which operates
if any two or more of the components operate, is sometimes called a "2 out of 3 voter," [21].
The structure function of this element is
4>(xhx2,xi) = 1 if X] 4- x2 + x3 > 2
= 0 if xi + x2 + x3 < 2
RELIABILITY FOR MULTICOMPONENT SYSTEMS
353
and the reliability p is
p = 3p? -2p\
where P\ is the component reliability. If the posterior density f\{p\) of p\ has moments
Mk\{f\\ then the moments Mk[f) of the posterior density f{p) of p are:
Mk{f) = lky£
7=0
M,+7k.\{f\
This result follows using the fact that for p = px, Mp{f(p)\ = MNkA{fx), when applied term by
term to the expansion of
(3p,2 -2pl)k.
Reference [21] gives the reliability function of the N- tuple Modular Redundant design
consisting of N replicated units feeding a (/? 4- l)-out-of-A/ voter. This case can also be treated
by the present methods.
Exactly L Out of N Element
An element consisting of N identical failure independent components which operates only
when exactly L out of N components operate is a rather unusual system. If L + 1 out of N
operate the system fails. Such a system is not a coherent structure in the sense of [5]. The
reliability p of this element is given by
M
L
p[(\-Pi)
N-L
The moments of the posterior density f(p) of the Bayes representation p in terms of the
moments MkX{fx] of the posterior density f\(p\) of the component reliability px can be shown
to be
Mk[f\ =
(N-L)k
I (-1V
7=0
(N-L)k
J
M
j+kL. 1
l/ll
This example serves to illustrate that the proposed evaluation is not restricted to coherent sys-
tems.
DEVELOPMENT OF AN APPROXIMATE PRIOR FOR TESTING AT SYSTEM LEVEL
Section 9.4.4 of NAVORD OD 44622, Reference [22], presents a procedure for develop-
ing the posterior beta distribution of system reliability for system level TECHEVAL/OPEVAL
testing. Reference [9] presents further discussion with an example. The observed system level
data is binomial i.e., r failures in n trials. The system level, natural conjugate prior is the beta
density. An exact prior for the system level tests is the posterior density function based on all
prior component tests and component priors and can be computed by the methods above. The
procedure recommended in OD44622 is to approximate the exact system prior with a beta den-
sity having the same first and second moments.
Equation (15) above provides a tractable tool for computing the required first and second
moments for extending the method to arbitrary system structures.
Let A/1 and M2 denote the .first and second moments computed as shown in this report
for the posterior density f(R) of system reliability, R, based on prior component data. The
354 WE. THOMPSON AND R.D. HAYNES
f(R) is considered the exact prior for determinations of a new posterior density based on bino-
mial system level data. What are required for the approximation are the parameters n' and r'
of the beta prior
t d \ i >\ R" r (\ — R)r
g(/?l";r)=B(y-r'+l, r'+l)
with the same first and second moments as f(R). Having computed Mx and M2 the answer is
direct using formulas on page 9.23 of NAVORD OD 44622 i.e.,
n'= [M,(l- MX)/(M2- M,2)]-l
/-'= (1- Mx)n'.
The gamma prior is treated in a similar way in the same reference.
The beta approximation can also be used directly as an approximation to the exact poste-
rior density function for complex systems based on component test data. The approximation
has been very good when compared with the exact result in examples treated by the authors.
The calculation is tractable for hand computation since only the first and second moments of
the exact posterior density function are required.
Numerical Example
Consider a system consisting of five components, A\ (/ = 1, . . . , 5) connected in series.
Components Ax, A2, A^, and A4 have unknown fixed probabilities of success, p,\ and in testing,
there were observed m, successes in n, trials. The fifth component, A5, has an unknown con-
stant failure rate A and has mission time /. In testing, component A5 failed r times in T operat-
ing hours. The following test data were observed:
« , = 20, w,= 18; /?2=30, w2= 25; //3= 20, m3= 20; /?4= 20, w4=19;r=38, /=6, r= 3
The resulting posterior density functions are:
/,(/?,) = 3990 R\% (1- Rx)2
f2(R2) = 4417686 R225 (1 - R2)5
f3(R}) = 21 Ri°
MRa) = 420 /?419 (1 - R4)
1
f5(R5) - 482.00823 fl519/3
ln«5
We know [25,26] that the Mellin integral transform of the posterior density function,
h (R ) for the system is the product of the Mellin integral transforms of the density functions of
the components. At this point we can determine h(R) exactly by means of the inverse Mellin
integral transform or we can approximate h (R ) with a Beta density function having the same
first and second moments as li(R).
The Mellin integral transforms of the density function for the components of the system
are:
mwMkl 21! TCS+ 18) iL#r#-/i»Mci 21! r(5 + 19)
RELIABILITY FOR MULTICOMPONENT SYSTEMS
355
M\f(R >|ci- 31! T(5 + 25)
M[f2(R2)\S]- 25,r(s + 31)
%4\f (p Mci 21! rCS + 20)
M[f5(R5)\S] =
(22/3)4
(s + i9/3r
The Mellin Integral transform of h(R) is M[h(R)\S] = fj A/[/) (/?,-) I S].
;=1
From [26] we know that the Mellin inversion integral yields directly
h(R) = -^-- f^'°° R-sM[h(R)\S]dS
27TI ve-i00
where the path of integration is any line parallel to the imaginary axis and lying to the right of
the real part of c, If b is greater than 1, the real part of c is greater than p, and p is any
number, then, [26]
1 Cc+
Itt'i Jc-i
C + ioo
R
-5
09 (s + p)1
dS =
Rp
In
b-\
To find h(R) we simply write M[h(R)\S] as the sum of its partial fractions [13] and
integrate each term using the above equation. Thus the exact posterior density function, h(R),
for system reliability is
h(R) = + 1094388844.948 i?18 + 30505643166.29 R]9.
- 12601708553.76 R19
19915799047.82 fl20
4
- 31650550963.66 R20
In
- 5114357474.61 R
20
4
2
+ 235122603.404 R25 - 354959810.01 R26
+ 249501799.456 R21 - 98389473.63 R2%
+ 21240815.37 R29 - 1974044.939 Ri0
- 22937.221 Rl9/3 + 78073.717 Rl9/3
2
In
42683.275 Rx9/i \\n-^
R
- 95839.296 Rl9/3
The exact distribution function, H(R), is found by integrating the density function.
'4
To obtain the approximate solutions for the system reliability density and distribution
functions, we recall that the first and second moments of h(R) are given by M[h(R)\2] and
M[h(R)\3] respectively. The beta density function, which is used to approximate h(R), is
h(R) =
Ra(\ - R)b
B(a + 1, b + 1)
where h{R) denotes the approximate system density function, (3 (a + I, b + 1) is the com-
plete beta function, and a and b are the parameters of the beta function. The first moment of
HR) is
356
WE. THOMPSON AND R.D HAYNES
a + 1
a + b + 2
and the second moment is
(a + 1) (a + 2)
(a + b + 2) (a + b + 3) '
We require that the first and second moments of h (R ) and h (R ) be equal. Thus we have
M[h(R)\2] =
M[h(R)\3] =
a + b + 2
(a + 1) (a + 2)
(a + b + 2) (a + b + 3)
Solving simultaneously for a and b yields the parameters for the beta density function. Thus
we have a = 6.43596 and b = 11.92734. Therefore we can now write h(R), the approximate
density function for system reliability:
^6.43596 n _ ^)11.92734
h(R) =
B (7.43596, 12.92734)
To determine the approximate distribution functions, H(R), for system reliability we simply
integrate h (R )
Table 1 provides the comparison between the results obtained by the exact solution and
the approximate solution.
TABLE 1 — Numerical Results Obtained from Exact
and Approximate Solutions
R
Density Function
Distribution Function
Exact
Approximate
Exact
Approximate
h(R)
h(R')
H(R)
H{R)
.0
.0
.0
.0
.0
.10
.079
.057
.001
.001
.20
1.213
1.208
.052
.048
.30
3.243
3.339
.278
.281
.40
3.429
3.382
.635
.641
.50
1.667
1.616
.896
.895
.60
.343
.365
.986
.984
.70
.020
.032
.999
.999
.80
.003
.001
1.000
1.000
.90
0.000
0.000
1.000
1.000
1.00
0.000
0.000
1.000
1.000
REFERENCES
[1] Abramowitz, M. and I. A. Stegun (Editors), "Handbook of Mathematical Functions,"
National Bureau of Standards, Applied Mathematics Series, 55, 782 (1964).
[2] Aitchison, J., "Two Papers on the Comparison of Bayesian and Frequentist Approaches to
Statistical Problems of Prediction," Journal of the Royal Society, Series B., 26, 161-175
(1964).
RELIABILITY FOR MULTICOMPONENT SYSTEMS 357
Bartholomew, D.J., "A Comparison of Some Bayesian and Frequentist Inferences," Biome-
trika 52, 1 and 2, 19-35 (1965).
Birnbaum, Z.W., "On the Probabilistic Theory of Complex Structures," Proceeding of the
Fourth Berkeley Symposium, 7, 49-55, University of California Press (1961).
Birnbaum, Z.W., J.D. Esary and S.C. Saunders, "Multi-Component Systems and Structures
and Their Reliability," Technometrics, 3, 55-77 (1961).
Brender, D.M., "Reliability Testing in a Bayesian Context," IEEE International Convention
Record, Part 9, 125-136 (1966).
Chang, E.Y. and W.E. Thompson, "Bayes Analysis of Reliability of Complex Systems,"
Operations Research, 24, 156-168 (1976).
Chang, E.Y. and W.E. Thompson, "Bayes Confidence Limits for Reliability of Redundant
Systems," Technometrics, 77(1975).
Cole, Peter Z.V., "A Bayesian Reliability Assessment of Complex Systems for Binomial
Sampling," IEEE Transactions on Reliability, R-24, 114-117 (1975).
Esary, J.D. and F. Proschan, "Coherent Structures of Non-Identical Components," Tech-
nometrics, 5, 191-209 (1963).
Esary, J.D. and F. Proschan, "The Reliability of Coherent Systems," Redundancy Tech-
niques for Computing Systems, Edited by R.H. Wilcox and W.C. Mann, SPARTAN
BOOKS, 47-61 (1962).
Fox, B.L., "A Bayesian Approach to Reliability Assessment," NASA Memorandum
RM/5084 - (1966).
Gardner, M.F. and J.L. Barnes, Transients In Linear Systems, John Wiley and Sons, New
York, 152-163 (1942).
Gaver, D.P. and M. Mazumdar, "Statistical Estimation in a Problem of System Reliability,"
Naval Research Logistics Quarterly, 14, 473-488 (1967).
Gnedenko, B.V., Yu.K. Belyayev and A.D. Solovyev, Mathematical Methods of Reliability
Theory, Academic Press, New York, 76-77 (1969).
Lanczos, C, Applied Analysis, Prentice Hall, Inc. Chapter IV and VII.
Lin, P.M., B.J. Leon and T.C. Huang, "A New Algorithm for Symbolic System Reliability
Analysis," IEEE Transactions on Reliability, R-25 (1976).
Lindley, D.V., "The Robustness of Internal Estimates," Bulletin of International Statistical
Institute, 38, 209-220 (1961).
Lindley, D.V., "The Use of Prior Probability Distributions in Statistical Inference and
Decisions," Proceeding of the Fourth Berkeley Symposium, 7, 453-468, University of
California Press (1961).
Maritz, J.S., "Empirical Bayes Methods," Methuen's Monographs on Applied Probability
and Statistics, Methuen and Co. Ltd., London (1970).
Matther, F.P. and Paulo T. deSousa, "Reliability Models of A^-tape Modular Redundancy
Systems," IEEE Transactions on Reliability, R-24, 108 (1975).
NAVORD OD 44622, Reliability Guide Series, 4. The Superintendent of Documents,
U.S. Government Printing Office, Washington, D.C.
Parker, J.B., "Bayesian Prior Distributions for Multi-Component Systems," Naval Research
Logistics Quarterly, 19 (1972).
Savage, L.J., "The Foundations of Statistics Reconsidered," Proceeding of the Fourth
Berkeley Symposium, 7, 575-585, University of California Press (1961).
Springer, M.D. and W.E. Thompson, "Bayesian Confidence Limits for the Reliability of
Cascade Exponential Subsystems," IEEE Transactions on Reliability, R-16, 86-89
(1967).
Springer, M.D. and W.E. Thompson, "Bayesian Confidence Limits for the Product of A'
Binomial Parameters," Biometrika 53, 3 and 4, 611 (1966).
Thompson, W.E. and P. A. Palicio, "Bayesian Confidence Limits for the Availability of Sys-
tems," IEEE Transactions on Reliability, R-24, 118-120 (1975).
358 W.E THOMPSON AND R.D. HAYNES
[28] Thompson, W.E. and M.D. Springer, "A Bayes Analysis of Availability for a System Con-
sisting of Several Independent Subsystems," IEEE Transactions on Reliability, R-21,
212-214 (1972).
[29] Wolf, J.E., "Bayesian Reliability Assessment From Test Data," Proceedings 1976 Annual
Reliability and Maintainability Symposium, Las Vegas, Nevada, 20-22, 411-419 (1976).
OPTIMAL REPLACEMENT OF PARTS HAVING OBSERVABLE
CORRELATED STAGES OF DETERIORATION*
L. Shaw
Polytechnic Institute of New York
Brooklyn, New York
C-L. Hsu
Minneapolis Honeywell
Minneapolis, Minnesota
S. G. Tyan
M/A COM Laboratories
Germantown, Maryland
ABSTRACT
A single component system is assumed to progress through a finite number
of increasingly bad levels of deterioration. The system with level i (0 < / < n)
starts in state 0 when new, and is definitely replaced upon reaching the worth-
less state n. It is assumed that the transition times are directly monitored and
the admissible class of strategies allows substitution of a new component only
at such transition times. The durations in various deterioration levels are
dependent random variables with exponential marginal distributions and a par-
ticularly convenient joint distribution. Strategies are chosen to maximize the
average rewards per unit time. For some reward functions (with the reward
rate depending on the state and the duration in this state) the knowledge of
previous state duration provides useful information about the rate of deteriora-
tion.
Many authors have studied optimal replacement rules for parts characterized by Marko-
vian deterioration, for example Kao [6] and Luss [9] and the many references found in those
papers. Kao minimized the expected average cost per unit time for semi-Markovian deteriorat-
ing system, and considered various combinations of state and age-dependent replacement rules.
Luss examined inspection and repair models, where he assumed that the operating costs
occurring during the system's life increase with the increasing deterioration. The holding times
in the various states were independently, identically, and exponentially distributed. The policies
examined include the scheduling of the next inspections (when an inspection reveals that the
state of the system is better than certain critical state k) and preventive repairs (when an
inspection reveals the state of the system being worse than or equal to k). The convenience of
a Poisson-type structure for the number of events-per-unit-time made it relatively easy to allow
general freedom in the selection of observation times.
The work studied here is based on a modification of the model used by Luss. Our model
for deterioration is more general, but the admissible strategies used here are more restricted.
Here we allow the exponentially distributed durations to have different mean values, and to be
positively correlated.
'This work was partially supported by Grant No. N00014-75-C-0858 from the Office of Naval Research.
359
360
L. SHAW, C-L. HSU AND S.G. TYAN
The introduction here of correlation between interval durations permits the modeling of a
rate of deterioration which can be estimated from a particular realization of the past durations.
However, the lack of a Poisson-type of structure for the events-per-unit-time makes it much
more difficult here to allow general freedom in the selection of observation times. At present,
only the simple case of direct and instantaneous observation of deterioration jumps has been
considered.
This model would be appropriate, for example, in a subsystem which functions, but with
reduced efficiency, when some redundant components have failed; and for which failure of one
component might indicate environmental stresses which increase the probability of failure for
other components. In addition, deterioration in correlated stages might be used as a simple
approximation for a continuously varying degradation which does not exhibit discrete stages.
Figure 1 shows a typical time history of deterioration and replacement. The duration in
state (/'— 1), prior to reaching state (/'), is r,_x. The intervals d, in Figure 1 represent the time
required to replace a component when it has entered state /'. The sequence {/•,■} will be Markov,
characterized by a multi-variate exponential distribution. Reward functions will be related to
the deterioration state and the time spent in each state. The decision rule specifies whether or
not to replace when entering each state /, on the basis of the history of r,-_i, /■/_2, .... The
Markov property simplifies the decision rule to be a collection of C, sets such that we replace
on entering state /' if and only if r-_] € (£,.
State
5i
4
3
2
1
'o
'0
Time
Figure 1. History of deterioration and replacement (n = 5).
(1)
(2)
The objective is to maximize the average reward per unit time:
1
L = lim -= (Total reward in (0, T)]
T~oo T
£ [Reward per renewal]
E [Duration between renewals]
(See Ross [11] page 160 for equivalence of (1) and (2).) The mean reward per renewal is
defined here as:
(3) % =
N- 1 r r
I Jo c>{t)dt-
/=0
OPTIMAL REPLACEMENT OF PARTS
361
in which:
A' = state at which replacement occurs (possibly random).
pN = replacement cost if replaced on entering state N (possibly random).
Cj(t) = reward rate when in state /'.
Figure 2 shows several reward rate time functions c(t) which have been considered.
When one of these c(t) functions is specified for a given problem, the c,{t) in (3) are assigned
values j8,c(/) with:
(4)
i30>i3l >£l> ■■■ >Pn-l >Pn = 0,
to assure greater reward rates in less deteriorated states. State n corresponds to a completely
failed or worthless component.
c(t)
(a) constant
(b) linear
(c) constant-after set up
Figure 2. Reward rate time functions.
The mean duration in (2) is defined as:
(5) 5) = E
A'-l
I r, + dN
i=0
to include a possibly random time dN for carrying out a replacement at state N.
While the ultimate objective is to choose 6, to maximize the L defined in (1), it is well
known that a related problem of maximizing:
(6) £„(«) = » -a %
is simpler [1]. Indeed, the £, which maximize L will be identical to those which maximize
£<)(«) for the a * such that:
(7)
£<? (« *) = 0, where £0° (a) A max £0(a).
362 L. SHAW, C-L. HSU AND S.G. TYAN
Section 1 considers a case in which it is found that deterioration rate information is not
useful (e.g., the optimal policy is independent of the amount of correlation between successive
state durations).
Sections 2 and 3 consider other penalty cost structures, e.g., assuming that more
deteriorated parts are rustier, hotter, or more brittle, and therefore more costly to replace. In
such cases the optimal policies do make use of estimates of the deterioration rates as well as of
observations of the deterioration level.
The Appendix describes useful properties of the multivariate exponential {/■,} sequence
which is used to model the correlated residence times in a sequence of deterioration states.
These durations have marginal distributions which are exponential with mean values tj,, and
i *— ■ i
correlations prr =p ".
1. CONSTANT REWARD RATE-STATE INDEPENDENT
REPLACEMENT PENALTIES
The constant reward rate case with c,(t) = (i, and with state-independent replacement
penalties (p, = p, d, = d) is particularly simple to analyze. We will see that as long as
£[//|//_i, r,_2, ••• 1 ^ 0 for all /', even if the r, are not exponentially distributed, the optimal
rule will be to replace the deteriorating part upon entering some critical state A-*, independent of
the observed durations r,.
Based on the problem statement, the optimal decision on entering state j must maximize
the mean future reward until the next renewal, £,(«), for a suitable a. Here:
(8) £,(«) = £
Immediately after a renewal, when j = 0, the expectations defining £0(a) are unconditional
The optimal decisions for each state will be found in terms of a, and then the proper a* (for
producing decisions which maximize L) is the one for which the maximum:
(9) max£()(a*) = £o0(«*) = 0.
Optimization by dynamic programming begins by considering the decisions at the last
step, i.e., on entering state (n - 1). There are two choices, to replace (R) or not to replace
(R), with corresponding values:
(10) £„_,(«;*) p-ad,
and:
£„_,(<*;#) = E[ (3 „_,/■„_, I r„_2] - aE[rn_x\rn_2]-p-ad
= E[{pn_x-a)rn_{\rn_2]-p-ad.
Clearly, the best decision is not to replace, if and only if, the difference
(12) An_,(a;rn_2) A £„_,(«;£)-£„_,(«;/?)
= {(Sn_x-<x)E[rn_x\rn_2\ > 0.
is non-negative. The sign of (12) will be the sign of (j8„_,-a), due to the non-negativity of
all interval durations. Thus the best decision depends on a and the reward parameter /8„_i, but
not on the previously observed duration. Two cases will be considered separately.
A'-l
Z /v,lo-i
— aE
N-]
i=j
. i=j
OPTIMAL REPLACEMENT OF PARTS 363
If /3„_! ^ a then the best decision at state (n — 1) is not to replace. We will now explain
why, under this condition, it is best not to replace at any state less than n. Consider the situa-
tion on entering (n — 2). We have already shown that it is better not to replace on entering
(n — 1). Thus the choice will be based on a A„_2 °f me form:
(13) A„_2(a;/-fl_3) = £[(/3fl_2-a)/-fl_2 + (0„_i -a)r„_, k_3).
Here we have:
(14) (j3n-2-«) > (/8„-i-a) > 0,
by assumption, and:
(15) £[/■„_, I rB_3] > 0 and £[r„_2| r„_3] > 0,
because all a, > 0 with probability one. Thus A„_2(a;r„_3) > 0 for all a-„_3 > 0, and it is also
better not to replace here. This argument can be repeated for states (n — 3),
(n -4) 1,0.
The other case to consider is yS,,^ < a, which requires replacement on entering state
(n — 1), if the system ever reaches that state. When we consider the decision on entering
(n - 2). the A„_2 is:
(16) A„_2(a;/'„_3) = £[(/3„_2-a)/-„_2|/-„_3],
which has the sign of (/3„_2-a). If (/3„_2-a) < 0, then replacement is optimal on entering
(n — 2) and in — 3) is considered next. This iteration may eventually reach a state (k — 1)
where (ftk-\ — a) > 0 and it is better not to replace. Arguments similar to those for the
j8n_! — a > 0 case show that nonreplacement is the optimal decision at all states preceding the
one which first arises as a nonreplacement state in this backward iteration.
In summary, in the constant reward rate-constant replacement penalty case S-0(a) is max-
imized by a decision rule which says replace on entering some state k ^ n which depends on
the reward parameters {{3,} and the a:
(17) k = min{/:(a - £,) > 0}.
Finally, we must choose a * so that £q (<* *) = 0, where:
(18) Z§(a) = -p-ad + kfi (p, -a)E[r,].
Figure 3 shows a typical plot of £<? (a) as a continuous, piecewise linear curve whose zero
crossing (£°(a*) = 0) defines a * and the optimal replacement state k* for maximizing L.
EXAMPLE. Figure 3 shows that the optimal average reward per unit time is L = 2—
when k* = 3, where /30 = 5, jSj = 4, /32 = 3, /33 = 2, /34 - 1, /85 = 0, p = 5, d = 1, t;,- - 2
(/' = 0,1,2,3,4) and n = 5. From Equation (18), the optimal k is a function of a, which
remains constant when a varies over each interval j6,+ i < a < /3,-, as shown in the figure.
2. INCREASING REPLACEMENT PENALTIES-CONSTANT REWARD RATE
Here we generalize the model of the previous section by allowing the replacement cost p,
and replacement duration d, to be functions of the replacement state (/), and to be random.
These parameters are assumed to have mean values E[pj] and E[dj] which are convex nonde-
creasing sequences in /, corresponding to the increased difficulty in replacing more deteriorated
parts which may be, e.g., rustier, hotter or more brittle. We also assume that the mean dura-
tions are ordered: tj0 ^ rj| ^ . . . > 17,,- 1, corresponding to faster transitions of more
deteriorated parts.
364
L. SHAW, C-L. HSU AND S.G. TYAN
J
ii°oU*, a)
30-
20-
k*(a)>
i
-5
-4
10-
\
-3
-2
o-
1
1 1
i :
1 >v 1
1
1
a
"* - Ltmx
^\
K
10-
20-
Figure 3. Optimal reward search: constant reward rate case.
OPTIMAL REPLACEMENT OF PARTS
365
The foregoing assumptions, together with properties of the assumed multivariate
exponential density for stage-durations (see Appendix), lead to an optimal decision policy with
a nice structure. That optimal policy prescribes replacement when entering state j, if and only
if /)_, < /}*_i, where the decision thresholds are ordered: 0 < /"o/^o ^ r°\/v\ ^ • • • ^ C-\l
f)n-\
The optimal decision on entering state j must maximize the mean future reward until the
next renewal, i. e., £,(«). For a suitable a, we have:
(19)
£>) = £ £ /3,
A'-l
i=J
aE
A'-l
E[pN + adN].
For notational simplicity we define e, = E[p, + ad,] and note that e, is also convex and nonde-
creasing since we are only interested in a > 0. The optimal decisions for each state will be
found in terms of a, and then the proper a * (for producing decisions which maximize L) is the
one for which the maximum £ vanishes:
(20)
£,?(«*) =- eN(a*) + E
A'-l
;'=0
n
-a*
A'-l
I
j=0
n
= 0.
Optimization by dynamic programming begins by considering the decision at the last step.
Since state n represents a failed component, we definitely replace the component when it enters
state n. Next, we consider the decision to be made on entering state n — 1. There are two
choices: to replace (R) or not to replace (/?), with corresponding values
(21) £„_,(«;/?) = -£-„_,,
(22) £„_,(«;£) = £[/3„_i''/»-i-"',«-ik-2] - en
for £„_|(a). Clearly, the best decision is not to replace, if and only if,
A„_,(/-„_2) A £„_,(«;£) - £„_,(«,/?)
is non-negative, i.e.,
(23) K-i(r„-2) = (/3„-i-a) E[rn.x\rn.2\ + (en_x - e„) > 0.
Referring to (A-6), A„_i(r„_2) is a linear function of r„_2, with
A„_,(0) = (jS^-ah^a ~ p) + (eB_j - O-
Figure 4 shows the possible shapes for this function. There can be no downward zero-crossing
at an rn_2 > 0.
Thus, depending on the numerical values of the parameters, there are three possible kinds
of optimal decision rules when entering state (n — 1):
(i) replace for no rfl_2 if A„_i ^ 0 for all /-„_2 ^ 0
(ii) replace for any rn_2 if A„_] < 0 for all r„_2 ^ 0
(iii) replace if and only if r,*_2 > r„_2 > 0, where A„_, (r,*_2 ) = 0.
In other words,
(24) £,_,(«) = {r„_2: rn_2 < C2],
where r„"_2 could be zero (case i) or infinite (case ii).
366
L. SHAW, C-L. HSU AND S.G. TYAN
A„_,(r„_2)
• P„-\ > a
r„-2
Figure 4. Possible shapes for &„_](rn_2).
Next we consider the optimal decision when entering state (n — 2), and assuming that the
optimal decision will be made at the subsequent stage. We consider cases of (/3„_| < a) and
(/3„_i ^ a) separately.
(a) (/3„_i < a) implies replacement on entering (n — 1), so
A„_2(r„_3) = (/3„_2-a) £[r„_2|r„_3] + (e„_2 - e„_i),
resulting in the same three possibilities listed above for state (n — 1).
(b) for (£„_! > a):
(25) A„_2(r„_3) = e„_2 + (j3„_2-a)£[rn_2|r„_3]
+ J . [(p„-i-a) E[rn_x\rn_2] - e„] /(r„_2|r„_3)rfr„_2
C*n-l
+ Jo (-e„-i) /(r„_2|rw_3)^„_2
Equation (25) can be simplified, with the aid of the notation (x)+ = maxOc, 0), to the form
(26) A„_2(r„_3) = (e„_2 - en_x) + (j8„_2-a) E[r„-2\r„-3]
+ £[(A„_1(rn_2)) + |r„_3].
Useful comparisons can be formed if normalized variables are introduced, namely
s, = rj-oc hM-\) = M'V-i)lr,._,U,/_,s/_,
We now prove
(a) 8„_2(s„_3) ^ 8„_,(s„_3)
(b) 8„_2(s„_3) is convex with at most one upward zero crossing at an 5 > 0.
There is no harm in writing 8„_,(s„_3) or 8„_,(s+) instead of 8„_,(s„_2) for purposes of
comparing functions.
OPTIMAL REPLACEMENT OF PARTS 367
To prove (a), consider
(27) 8„_2(s) - 8„_iCs) = [(e„_2 - e„-i) - (e„-i - <?„)] + £[(8„_1(s+))+|5]
+ [(/3w_2-a!)i7„_2- (^w_i-a)Tj„_i] £"[s+|s].
where 5+ represents the normalized duration following 5.
The terms on the right side of (27) are nonnegative due to the convexity of the eh i )+
^ 0, (A-6), and the assumed orderings of the /3, and tj,.
This completes the proof that (a) is true. It follows immediately that if (i) (preceding Eq.
(24)) applies for state in — 1), then it is also optimal not to replace in state in — 2) or any ear-
lier state. (Recall /3„_i < /3„_2 < . . . , and we are now considering a < (3„-\).
To prove (b), which is only of interest when an r*_2 > 0 exists, we refer to the theorem
in the appendix. The test difference 8„_2(s) can be written as
(28) 8„_2(s) = E[en_2 - en_x + (fi „_2- a)r) „-2 s+ + i8n^is+))+\s]
in which the integrand has the properties required by his) in the theorem. To see this, we
note that r*_2 > 0 implies that (8„_,(0))+ = 0, so the integrand is nonpositive at 5+ = 0.
Thus, 8„_2(s) has the shape stated in (b), implying that
(29) e„_3={ra_3:r„_3< r,;_3}
where r„*_3 may be zero, infinity, or the nonnegative value defined by 8 „_2(r„*_3 / '-q „_3) = 0.
The foregoing arguments can be repeated for /•„_4, rn_5 ... r0 to prove that the optimal
replacement policy has the form:
Replace on entering state /', if and only if, r, ^ r* where
0 < rj/rjo < /-'/tj, < ... < d/i7„_, = oo.
When repeating the proof for earlier stages, the ( )+ term in (27) and (28) is modified to the
form, e.g., [(8„_2(s+))+ - (8 „_t (5+))+] . This term is generally nonnegative, due to (a) at the
preceding iteration (next time step); and it is zero for s+ = 0 when proving (b), since then r„*_3
> 0. Thus the basic theorem is still applicable.
3. Computational Procedure
The preceding section derived the structure of the optimal decision rule for the case
where replacement is more difficult and more expensive when the part is more deteriorated.
The corresponding optimal decision thresholds can be formed as follows:
(a) choose an initial a.
(b) Find the r*ia) (;= n—l, n — 2, ...0) recursively, via numerical integration of
expressions like (26) (where r*_3 (a) is defined by the condition A„_2(v_3) = 0).
(c) Compute
n [ifi0-a) r0+ (A,(/-0))+] fir0)dr0.
o
(d) If |£o(a)l < e^ f°r sufficiently small e, say Lmax = a* = a: otherwise repeat the
computational cycle starting with a new a.
368
L. SHAW, C-L HSU AND S.G TYAN
The following properties of £o (a) can be used to generate an a -sequence which con-
verges to a *.
1. £q (**) 's monotone decreasing, since £0(a) nas this property for a fixed policy (see
Eq. (19)); and if £o ("2) ^ £0 (ai) f°r a2 > fli, then the policy used to achieve £q ^2) could
be used to achieve an £0(«|) > £<? («i) — a contradiction.
2. When p = 0, all r* are zero or infinite: replacement always occurs on arrival at a criti-
cal state /*. Use of that policy will achieve the same average reward for durations having any
value of p. Thus, a useful bound on a*(p) is a*(0) ^ a*(p); 0 < p < 1*.
3. When p = 1, future r, are completely predictable ( Var (/-,| r,_j) = 0 in (A-7)), so
a*(l) ^ a*(p). In this case there is essentially a single random variable r0, and the r," can be
calculated without the need for numerical integration of Bessel functions.
4. NUMERICAL EXAMPLE
Table I lists parameter values for a replacement problem which fall under the assumptions
of Section 2.
TABLE 1 — Numerical Example Parameters
CASE I (p = 0)
i
0
1
2
3
4
5
(Q,
5
4
3
2
1
17/
1
0.9
0.8
0.7
0.6
E[Pl\
2
2.2
2.4
2.6
2.8
E[d,]
1
1.1
1.2
1.3
1.4
Since future durations are independent of past ones, the optimal policy replaces when a
critical state i* is reached. The general optimal reward expression
<**(p) =
N-\
Z/V, - Pn
0
N-\
0
becomes, in this case
a*(0) = max
fcjB, 7,, - E[Pi]
= max ^ (J)
Direct evaluation shows
1.5 2.13 2.205 2.085 1.89
withy'*= 3 anda*(0) = 2.205.
OPTIMAL REPLACEMENT OF PARTS
369
CASE 2 (p = 1)
Since r, = r0 rj/rjo in this case, the optimal rule specifies a replacement state j(r0) as a
function for r0.
For any such policy
£0(a,j(r0)) = ErQ
T70 o
This expectation will be maximized if j(r0) maximizes the bracketed term for each r0. Making
the necessary comparisons for a sequence of a -values leads to the policy
j* =\, if ro < 0.2698
= 2, if 0.2698 ^ r0< 0.7083
= 3, if 0.7083 ^ r0
for which |£0I < 0.003 and a *(1) = 2.25.
< 2.25. A pilot calculation along the lines indicated in the
CASE 3
1
We know that 2.205 < a*
1
2
< 2.25.
previous section shows that rj
1
2
= o, >■;
1
2
. 9
(<**-
2)
— = oo for j ^ 2, and
r\ =
1(3 -a*)'
where a * is chosen to make the following £0(a) vanish.
oo .oo
£0(a) = 6.4 - 3a + f J
'-f
+ j (3-a)r,
2rn +
0.45
0.45
The known bounds on the optimal reward a
bounded, too: 0.290 < n
70(2.981 yf7^\)drxdrQ
imply that the optimal threshold r* is
< 0.375.
Similar study of other values of the correlation parameter p lead to the optimal policy pat-
tern described in Table II. One might say that as p increases, the past observations are more
informative, the optimal policy makes finer distinctions, and the optimal reward increases.
5. CONCLUSIONS
A multivariate exponential distribution has been used to describe successive stages of
deterioration. Optimal replacement strategies have been found for the class of decision rules
which can continuously observe the deterioration state, and which may make replacements only
370
L. SHAW, C-L. HSU AND S.G. TYAN
TABLE 2 - Optimal Policy Structure
Correlation Parameter p
0)
c
E
o
o
o.
a!
0
1/4
1/2
3/4
1
1
r0 < r0*(3/4)
r0 < r'Q (1)
2
r, < rrfl/2)
r, < rf(3/4)
'o < — r\ (!)
3
always
always
r, ^ r' (1/2)
/•, ^ r," (3/4)
r0 > — r, (1)
at the times of state transitions. Similar results have been found for the other reward rates
shown in Figure 2 (linear; and constant after an initial set-up interval for readjustment to the i
new state) [5].
The optimal replacement policy derived in Section 2 makes use of observations which
allow estimation of the current rate of deterioration for the correlated stages of deterioration.
The numerical example demonstrates how the optimal policy and reward are related to the
amount of correlation between the durations in successive deterioration states. For the model
used here, the optimal policy for p =0 will achieve the same reward (less than optimal) for j
any p. Depending on the application, the suboptimal approach may be satisfactory. The addi- j
tional reward achievable by the actual optimal policy is bounded by the easily computed optimal ,
reward for p = 1. However, it is possible that the small percentage improvement achievable
for the p = 1/2 case in the example could represent a significant gain in a particular applica-
tion.
The ordering of state dependent rewards, mean durations, etc. assumed here are physi-
cally reasonable, and lead to nice ordering of the decision regions. However, other
/?,■, 7],, ph d, orderings might be more appropriate in other situations. The model introduced
here for dependent stage durations could be used in those cases, together with dynamic pro-
gramming optimization, although the solutions may not have comparably neat structures.
We anticipate that the optimization approach and policy structure described here will also
be applicable to replacement problems having similar deterioration models. One easy extension
would be to change the correlation structure in (A-3) from p''-/l to something else, e.g.,
p \i-j\ + p |/-/l_ other changes could permit the r, to have non-exponential distributions, as
long as similar total-positivity properties exist to permit analogous simplifications in the
dynamic programming arguments.
Some of these other r, distributions are being studied now in the hope of finding similar
models which exhibit large percentage differences between the optimal rewards as PVi+1 changes
from zero to one. (Other choices of the numerical values in Table I have not revealed any
such cases for the current model).
One reasonable generalization would allow transitions from state /to any state j > /'. This,
would not change the form of the solution in the case of constant replacement penalties. How-
OPTIMAL REPLACEMENT OF PARTS 371
ever, the possibility of these additional transitions does ruin the structure when replacement
penalties increase with the deterioration state. (The 8„_2(s) > 8 „_)($) argument is no longer
valid.)
REFERENCES
[1] Barlow, R.E. and F. Proschan, "Mathematical Theory of Reliability," John Wiley and Sons
(1965).
[2] Barlow, R.E. and F. Proschan, "Statistical Theory of Reliability and Life Testing," Holt,
Rinehart, and Winston (1975).
[3] Griffith, R.C., "Infinitely Divisible Multivariate Gamma Distributions," Sankhya, Series
A, 52, 393-404 (1970).
[4] Gumbel, E.J., "Bivariate Exponential Distributions," Journal of the American Statistical
Association, 55, 678-707 (1960).
[5] Hsu, C-L., L. Shaw and S.G. Tyan, "Reliability Applications of Multivariate Exponential
Distributions," Technical Report, Poly-EE-77-036, Polytechnic Institute of New York
(1977).
[6] Kao, E.P., "Optimal Replacement Rules when Changes of States are Semi-Markovian,"
Operations Research, 21, 1231-1249 (1973).
[7] Karlin, S., "Total Positivity," Stanford University Press (1968).
[8] Kibble, W.F., "A Two-Variate Gamma Type Distribution," Sankhya, 5, 137-150 (1941).
[9] Luss, H., "Maintenance Policies when Deterioration Can Be Observed by Inspections,"
Operations Research, 24, 359-366 (1976).
[10] Marshall, A.W. and J. Olkin, "A Multivariate Exponential Distribution," Journal of the
American Statistical Association, 22, 30-44 (1967).
[11] Ross, S., "Applied Probability Models with Optimization Applications," Holden-Day
(1970).
APPENDIX
Dependence Relationships Among Multivariate
Exponential Variables
Many multivariate distributions have been described and applied to reliability problems
[4,8,10]. In each case the marginal univariate distributions are of the negative exponential
form. Properties of the distribution used here are most easily derived by exploiting its relation-
ship to multivariate normal distributions [3.5].
The multivariate exponential variables rx, r2 . . . , rn can be viewed as sums of squares:
(A-l) r, = w,2 + z,2,
where w and z are independent, zero mean, identically distributed normal vectors, each with
covariance matrix T. It follows that the r, have exponential marginal distributions with
(A-2) E[r,] = 2r„
We specialize to the case where the underlying normal sequences {w,} and {z,} are Markovian
(A-3) yij = Jy77JjPU-Jl
and find that {r,} is also Markov with the joint density
372
(A-4)
L. SHAW, C-L. HSU AND S.G. TYAN
/(ro,rlfr2, ••• . V-i) -
n-i l-i
/•-0 J
n-2
;=0
exp
1-
1
V rfiVi+
nn+\
\-p
I± + rJzl + y2iili±Pl
170 TJn-l
l»i
;» > 2,
Equation (A-4) uses the modified Bessel function 70( ) and the notations E[r,] = tj, and
Pr,r = P- (When « = 2, the summation in exp ( ) vanishes.)
(A-5)
The conditional density is easily shown to satisfy the Markov property and [5]
1
/(r,|/7_,) = [tj,(/ - p)] 'exp
(1 -p)
2 I p nn- 1
, 1 ~P V V,Vi-\
with
(A-6)
(A-7)
E[n\rHl] = 17, + (/-,_! - T7,_!)p Vi/Vi-\-
Varlr/lr,.!] = t,,2[(1 - p)2 + 2p(l - p),-,,,/-^,].
These conditional moments shows, e.g., that the conditional mean of r, exceeds its mean in
proportion to the amount by which r,_] exceeds its mean, and that conditional mean is a
linearly increasing function of rH\.
The dynamic programming arguments used here required calculations of conditional
expectations based on (A-5). As is often the case [2], the total positivity properties of
/0/|#/_i) are very useful for determining structural properties of the optimal policy.
It is straightforward to show that both fir,, rHi) and /(r/|r,-_i) are totally positive of all
orders (TP^), [5,7]. This means, for /(/•,, rHi), that the following determinants are nonnega-
tive for any Wand any ai < a2 ... < aN; ft{ < /32 . . . < /3#.
/(a,,/8,) /(a, -fl2) ... f(ah/3N)
> 0.
/(atf.Pi) A<*N,PN)
THEOREM: if h(y) is continuous and convex, and satisfies the bounds
(i) h (0) < 0
(ii) \h(y)\ < a + b v2m; a > 0, b > 0, y > 0, m = positive integer. g(x) =
I ^(y) /(yU) dy, and f(y\x) is jHPoc, then g(x) is continuous, convex, bounded in the sense
\g(x)\ < a' + b' x2m; a' > 0, b' > 0, x > 0;
and belongs to one of the three following categories:
OPTIMAL REPLACEMENT OF PARTS 373
(I) g(x) > 0 for all x^ 0,
(II) g(x) < 0 for all x ^ 0 except with a possible zero at x = 0,
(III) there exists a unique x*, 0 < x* < °o, such that g(x) > 0 for all x > x* ; and
g(x) < 0 for x < x* except for a possible zero at x = 0.
This theorem is used to define optimal decision regions according to the sign of a function like
g(x), with x* corresponding to a decision threshold.
I
STATISTICAL ANALYSIS OF A CONVENTIONAL FUZE TIMER
Edgar A. Cohen, Jr.
Naval Surface Weapons Center
White Oak
Silver Spring, Maryland
ABSTRACT
In this paper, a statistical analytic model for evaluation of the performance
of a standard electric bomb fuze timer is presented. The model is based on
what is called a selective design assembly, where one item, namely, a resistor,
is used to time the circuit. In such an assembly, the remaining components are
chosen a priori from predetermined distributions. Based on the analysis, a gen-
eral numerical integration scheme is utilized for assessing performance of the
timer. The results of a computer simulation are also given. In the last section
of the paper, a theory for evaluation of the yield of two or more timers
designed to operate in sequence is derived. To appraise such a scheme, a nu-
merical quadrature routine is developed.
1. INTRODUCTION AND PHYSICAL DESCRIPTION
In this paper, we shall be concerned with the statistical analysis of the bomb fuze timer
shown in Figure 1. As is common in practice, a standard, or precision, resistor is used to time
the circuit after the rest of the components have been assembled in a random fashion. Then,
to meet certain timing requirements to be discussed later, a resistor is selected and introduced
into the circuit. A number of tests must afterwards be performed in sequence to check the per-
formance of the product under differing environmental conditions. Such environmental
influences are, for example, temperature effects, effect of packaging, resistor incrementation (to
be discussed), and effect of vibration and moisture uptake. In addition, one might have several
timers which operate sequentially, all fed from the same energy storage capacitor CI of Figure
1. This paper is devoted to an analysis of such a timer in what is called the ambient tempera-
ture range, whose limits are 70°F and 80°F, respectively. We will also indicate the procedure
for treating analytically the assessment of performance of combinations of several timers. The
author has been involved in a Monte Carlo study for the Navy of such timers. Previous work
has involved reliability studies of an entire fuze assembly using these timers [2].
2. RESISTOR SELECTION PROCESS
The timer indicated in Figure 1 works once the potential difference across the two capaci-
tors C2 and C3 is sufficient to fire the cold cathode diode tube VT. Capacitors CI and C3 ini-
tially have the same potential across them. As time progresses, CI discharges through resistor
RES into C2, while C3 serves as a reference capacitor. Thus, the voltage across C2 builds up
375
376
E A. COHEN, JR
TIMER
OUTPUT
Figure 1. Fuze timer configuration
until the potential across tubes C2 and C3 is adequate to fire tube VT. The relationship
between firing time and the values of the circuit components can be derived from a simple first
order differential equation and is given by
(2.1)
where
RC,C
t =
Q + c2
In
VC>
VC,- (VT- V) (C,+ C2)
Cx = capacitance of capacitor CI
C2 = capacitance of capacitor C2
V — supply voltage (potential across C3 and potential initially across CI)
VT = firing voltage of cold cathode diode tube VT
R = resistance of resistor RES.
To illustrate the pertinent features of the process, write (2.1), for brevity, in the form
(2.2) t = RF(C,, C2, V, Vj).
Note that (2.2) is linear and homogeneous in R, so that R can be used as a scaling parameter.
This is precisely how it is used when the timer is first assembled.
In practice, the resistors are supplied in large numbers by the manufacturer, after which
they are tested and sorted by the user into a large number of bins. The resistors in each bin
have resistances, at a standard temperature, which fall into a certain interval. These intervals
are arranged to have the same "percent width", to-be described in more detail below. The timer
is to be designed to fire at a nominal time tN. Since capacitors CI and C2 are chosen at ran-
dom from a lot, their capacitances C\ and C2 may be treated as random variables. Likewise,
tube firing voltage VT may also be considered as a random variable. In general, we shall also
consider the supply voltage Fto be random.
ANALYSIS OF FUZE TIMER 377
Let us agree to denote by R0 that value of R obtained from relation (2.1) when t = tN
and Cb C2, V, and VT are given their expected values at some standard temperature, e.g.,
75°F. For convenience, R0 may be used as a reference resistance, and the bin to which refer-
ence resistor RES0, of resistance R0, belongs could be called the reference resistor bin. The
interval corresponding to this bin is to contain all resistances which fall between R0 (1 — e) and
R0(\ + e), where e is a preassigned small positive number. Our second bin will contain all
resistors whose resistances fall between 7?0(1 + e) and R0(\ + e)2/(l — e), and the third bin
those resistors whose resistances lie between R0(\ — e)2/(l -I- e) and R0(\ — e). In general,
our intervals are to be so constructed that the ratio of right endpoint to left endpoint is always
(1 + «)/(l — e), which, to first order accuracy, is just 1 + 2e. Alternatively, one may divide
the difference of the two endpoints by its midpoint to obtain precisely 2e. We shall, therefore,
say that each such interval has "percent width" 2e. In setting up the interval division scheme, a
percent increment e, is chosen a priori, and then e = e,/100. This e, is typically of the order of
1/2 to 1%. Figure 2 is a diagram of this scheme.
-o-
M-e)2 Ro(l-€) R0(J-€) R0 R0(l+€) R0(l+€) R0(l+€)
2
l + € l+€ 1-6 J-6
Figure 2. Resistance interval setup
Once again referring to our circuit configuration, where C\, C2, K and VT are random
variables, let us define
(2.3) f0- R0F(CU C2, V, VT).
Then, to achieve the nominal time tN, we define our nominal resistance to be
(2.4) RN - R0tN/t0.
Note that, since tQ is a random variable (being a function of the random variables Cj, C2, V,
and VT), RN is also a random variable. A technician may use relation (2.4) to determine RN.
Then he picks a resistor RESp at random from the bin to which resistor RES/y belongs and
integrates such resistor, of resistance Rp, into the circuit. This process is called, in fuze tech-
nology parlance, "resistor incrementation." Note that Rp is a random variable which is statisti-
cally dependent on RN inasmuch as Rp and RN must lie in the same interval. However, once
attention is restricted to a given interval of the scheme, it is clear that the value of RN in no
way influences the value of Rp, since one is free to select any resistor in the bin to which the
nominal resistor belongs. We shall reemphasize this fact in Section 3. For simplicity we index
the intervals by /', letting their left and right endpoints be r, and ri+\, respectively. To achieve
compatibility, the bins should initially be formed and kept at some standard temperature, and
the timer should be assembled at that same temperature. In practice, this will, in all likelihood,
not be the case, but one may compensate for this defect by studying the sensitivity of the timer
to changes in bin interval width. For example, if by doubling the interval width, the overall
change in performance is insignificant, it may be safely assumed that such a discrepancy was
unimportant (provided the distributions due to ambient temperature variations are of small
variance).
3. PROBABILITY INTERVALS AT THE STANDARD TEMPERATURE
The problem of determining the probability of operation of the timer within two given
times, say tx and t2, when there is no effect other than resistor selection is not difficult. (We
also ignore, in this section, the effect of tube firing voltage variation from one firing to the
378 E. A COHEN, JR.
next. This phenomenon will be discussed in some detail in Section 4.) The reason is that the
time is linear and homogeneous in resistance R. In fact, the bins have been designed to take
advantage of this feature, and we shall show that the probability interval is independent of the
bin in which resistor RESyv falls.
First of all, let r^'in and ^ax De the minimum and maximum times, respectively, obtain-
able when the nominal resistance RN and the picked resistance Rp come from a given bin /.
Also, let F^'/n and F^lx be the smallest and largest values of F, respectively, given only tN and
knowing that RN comes from that bin. It follows that
(3.1) fm'in — r, Fm'in = rjtN/rl+\
and
(3-2) /'max = r,+ i ^m'ax = ri+\ WO'
Therefore, given that RN and Rp lie in interval /,
(3.3) r,tN/r,+] ^ t ^ r/+1 tN/rr
Since rJrM = (1 - e)/(l + e),
(3.4) (1 - «)/(l + e) ^ tit* < (1 + e)/(l - e),
independent of bin interval. In other words, (3.4) is true with probability 1.
Generally, suppose that one is interested in the probability that firing time falls between
two prescribed limits about the nominal time. Consider once more a given bin /. Let us denote
by R^ and Rp(l) random variables derived from RN and Rp respectively under the condition
that RN and, therefore, Rp must lie in interval /. From our discussion in section 2, it is clear
that these new random variables must be independent. Let f] and t2 be the lower and upper
limits, respectively, on firing time. For any given value of the random variable Rh\ one can I
determine limits on the random variable Rpn so that the firing time lies between t\ and t-i. j
Since, by definition, tN = Rfj']F, it follows that Rp cannot be less than
(3.5) r,/F= tiR^/tn.
Similarly, Rp(,) cannot exceed
(3.6) t1R(Nl)/tN.
One must, of course, realize that (3.5) may be smaller than r, and (3.6) larger than ri+\ for
values of /?# ' close to r, and ri+\, respectively.
If we let g(R/v) be the density function of the random variable RN defined by (2.3) and
(2.4), whose range is a function of the domain of Cj, C2, V, and VT, then the induced random
variable /?#' has conditional density
(3.7) g('W>) = g(RN)/P(r, < RN ^ rl + ]) = g(RN)/ f'M g(RN)dRN.
The range of R^n is restricted to the interval [r„ r,+1]. Using the mean value theorem of
integral calculus, (3.7) becomes
(3.8) gU)(R^) = g(RN)/git)(ri+] - r,), r, < f < r,
If r,+1 — r, is sufficiently small, one sees that
(3.9) gU)(R^n) = l/(r,+1-r,).
/+!•
ANALYSIS OF FUZE TIMER
379
Similarly, let f{l)(RpU)) be the density function for picked resistance Rp{'\ whose range is like-
wise restricted to [r,, ri+l]. Then, with the knowledge that R^l] and Rp(l) are independent ran-
dom variables, and, letting Pj(t\ < t < t2) be the probability that firing time falls between /]
and t2 (given that RN and Rp come from interval /),
(3.10)
■',«#'/'
(ri < r < h) = f +' f 2 " '* g<»Ullp)fi»Ull»)dRydR!P
Jr, JtlRfr)/tN
We take the liberty of defining fU)(Rp(n) = 0 in (3.10) whenever /?„(/) tf [r„ r/+1]. This is done
purely for the sake of convenience of notation even though the range of RpU) is [r,, ri+x\.
The probability that the time falls between tx and t2 is expressed by
oo
(3.11) Pit, < t < t2) = X ^(r, < r < t2),
where /?,• is the probability of choosing bin /'.
As we have previously indicated, if ri+l — r, is sufficiently small, we can assume, for all
practical purposes, that R^ is a uniformly distributed random variable. The picked resistance
Rp{l) should also be a uniformly distributed random variable if all resistors in bin / are equally
likely to appear. In other words, let us assume that
(3.12)
.(/>i
«■( i i
(,),
g{,,W>) = fn(Rpw) = l/(r/+1-r,).
Suppose then that one asks for the probability that tx = tN{\ — 8) < t < tN{\ + 8) = t2 for a
given small 8. We proceed to derive closed form expressions for this probability. Three cases
naturally arise, the first of which is shown in Figure 3 below. For brevity, we shall drop the
superscript / in this figure and the two following figures. In this diagram, the interior of the
quadrilateral formed by the lines RN = r,, RN = ri+\, Rp = t\RN/tN, and Rp = t2RN/tN is the
region of integration. Note that, in the two hatched regions, f{l){Rp']) = 0, since then either
Rp < r, or Rn > r,
equivalent to
p < r, or Rp > r/+1. After a small computation, one sees that the inequality r^0) < ffi is
(3.13)
0 < 8 < €.
We also note that, using (3.12), (3.10) represents the normalized area of the interior of the
hexagon shown in Figure 3, bounded by the lines RN = rh RN
Rp = t2RNltN, Rp = rh and Rp = rl+]. Therefore,
Rp — t\^-NltN->
(3.14)
P,{tN{\ -8) < t ^ tN(\ +8))
1
+
C'i+1 " r,)2
r,/(l-5) J(
,/-,/(l-5)
r- J r
(\+h)R^
dR(,) dR^
(\+6)R^')
dRU) dRil]
(l-S)/?^'
+ f"+]n « f',+1 M«(,)«#J
Jr,+ I/(l+8) J(\-8)Rk<) N
8
86 2
(1 + e)2(2 + 8) (l-e)2(2-8)
1+8 1-8
It follows that P, is independent of /. From (3.11),
(3.15) PUi < t < t2) = P,{t\ < t < h).
0 < 8 < e.
380
E.A. COHEN, JR.
R^=t2RN/t
Rp Rp=ri+'~^\
ou
R^N
Rp2,=t.RN/tN
%=
ri r(0)r(l]
N N
r.
i+l
Figure 3. Picked resistance versus nominal resistance (Region 1)
The second case occurs when r, ^ r,y ^ rN ^ ri+\- This situation is indicated in Fig-
ure 4. One can also show that r^0) = r, + 1 when 8 = 2e/(l + e) and that r^l) = r, when
8 = 2e/(l — e). Therefore, the situation illustrated in Figure 4 occurs when
e < 8 ^ 2e/(l + e). A third case will occur when 2e/(l + e) < 8 < 2e/(l — e), as illustrated
in Figure 5, where the dotted region is now a pentagon. For 8 ^ 2e/(l — e), the dotted region
becomes the interior of a rectangle completely enclosed in the sector, so that the probability
becomes unity. In the third case, one sees that r, < r^n ^ r,+1 ^ r^0). When one integrates
over the interior of the quadrilateral outlined in Figure 4, one again obtains the closed form
given in (3.14). Therefore, (3.14) is valid whenever 0^8^ 2e/(l + e). The case illustrated
in Figure 5 is different. When we integrate over the interior of the pentagon, which is that por-
tion of the region of integration for which the integrand of (3.10) is nonzero, we find that
(3.16)
Pi(tN(\ - 8) < t < tN(\ + 8)) =
4e2 + 4e(l +e)8 - (1 - e)282
8e2(l +8)
2e
1 +€
< 8 <
2e
One easily shows that (3.16) becomes unity when 8 = 2e/(l — e) is substituted.
4. PROBABILITY INTERVALS AT AMBIENT TEMPERATURE BEFORE POTTING
The analysis of the timer when temperature and cold cathode diode firing voltage varia-1
tions are considered is different from that of the previous section, since all components except
for the resistor enter the time nonlinearly. It would then be necessary, at least in principle, to
take into consideration the probabilities p, of picking the bins as well as the probabilities for1
picked resistance once a bin has been selected. However, if the variations due to these effects
are relatively small, one should again see probabilities essentially independent of the bin
selected. Furthermore, in a situation like this wherein certain distributions are quite tight, i.e.,
are of small variance, some simplifying assumptions can be made. We shall get to these
presently. Again, as before, we assume that the bin intervals are so small that we may reason-
ably suppose that (3.12) is true. Note also that (2.3) and (2.4) express RN in terms of tN, C\,
C2, K, and VT. Assume now that C,, C2, V, and VT are independent, normally distributed ran-;
dom variables. Suppose, as is common in practice when coefficients of variation are
ANALYSIS OF FUZE TIMER
R(3, = R,
381
R = r. . .
p
•fHvu
Figure 4. Picked resistance versus
nominal resistance (Region 2)
Rl'^t2RN/t,
rL3, = rn
Ri2)=',fVtN
>-R,
Figure 5 Picked resistance versus
nominal resistance (Region 3)
small [4, pp. 246-251], that RN is linearized about the expected values of capacitances C\ and
C2, supply voltage V, and tube breakdown voltage VT. Now a linear function of independent,
normally distributed random variables is again a normally distributed random variable, and,
from (2.3) and (2.4), it follows [4, pg. 118] that
-l
(4.1)
and
(4.2)
Z7/-D \ ~ Ov(Q,£ + C2,£-)
E{RN) =
C,FC
l,£^2,£
in
VfC
£M,£
VEC\,E- (Vt.e- VE)(ChE + C2iE)
var (RN) =
dV
\ 2
var C] +
var V +
ac2
dVr
var C2
var VT.
382
E.A. COHEN, JR.
Here the subscript E indicates evaluation at expected values and var represents the variance
operator. Now, clearly,
dRN
(4.3)
dC,
*n dF . , .
= -^aq', = 1'2'
with similar expressions for dRN/d V and dRN/d VT. The relevant partial derivatives of F are
given by
(4.4)
dF Cl
1 VT-V
X
ec, cx + c2
C, + C2 ? Y
BF C,
[ C, C2(Kr-
- n
dC2 c, + g
c, + c2 'r ' r
dF C,C2
dVT Y
dF C,C2VT
dV VY '
and
* = /«
KG
KG- (Kr- K)(G + C2)
y= vcx-{vT- v){cx + c2).
Now /?, represents the probability of choosing bin /', and that is precisely the probability that the
random variable RN belongs to bin /. Furthermore, because we are now assuming that RN is a
linear function of the independent, normal random variables G, G, K anc* Kr, RN is likewise
normal. Therefore, letting £ = E(RN) and cr2 = \ar(RN), one has
2
1
(4.5)
where v.
Pi"
1
r
2
e
r-i
dr =
X.
e 2 d\,
(Ty/ln Jr< V27T
(rj — £)/o- and v2 = (r,+1 — £)/o\ so that p, may be readily calculated from tables.
Supposing that the picked resistor and the other components are subject to a temperature
change from the standard temperature, we must compute the effect of such a change, together
with the resistor incrementation effect of Section 3, in order to obtain the probability of satisfy-
ing the specification. It will be assumed in our analysis that the ambient temperature is a uni-
formly distributed random variable whose range is given by Tx ^ T ^ T2. If
P(t\ ^ t ^ t2\T) is the probability of meeting the time limits for a given temperature T, then,
clearly,
(4.6) P(t^t*:t2) = fT2PU]^t^t2\T)p(T)dT= \ fT2 Pit, < t ^ t2\T)dT.
Let us give an example of the computation of the nominal resistor distribution. Suppos-
ing in (4.1) that tN = 2.6 seconds, ChE = .44 /uf, C2E = .15fif, VE = 177v., and VTE = 235v.,
one finds that E{RN) = 40.16 megohms. Also, one finds from (4.3) and similar expressions,
upon inserting expected values, that dR^/dC\ = 8.65, dRN/dC2 = — 293.12, dR^/d V=— 1.26,
and dRNldVT= — 0.95. Let us assume the following standard deviations: <t(C\) = 0.0073,
o-(G) = 0.0025, a(V) = 0.17, and a(Vf) = 4.17, where Vf is used to denote the expected
breakdown voltage of a diode chosen from a lot. The expected values of the breakdown vol-
tages of all the tubes are themselves assumed to follow a normal distribution with expected
ANALYSIS OF FUZE TIMER 383
value 235v. and with the above o\ In addition, each tube has a firing voltage which varies
about its expected value. This new random variable, with expected value 0, we denote by A VE,
and it is assumed that A VE is also normally distributed. The random variable VT, which
represents the firing voltage of a tube selected from a lot, is actually formed as a sum
VT = Vj + A VE, where we shall suppose that A VE is independent of Vf. Also, tests per-
formed by fuze specialists indicate that the random variables A VE have the same distribution
from one tube to the next. Assuming that <t(AVe) = 0.24, it follows that a-(VT) = 4.17.
Then, from (4.2), var RN = 16.3235, or a-(RN) = 4.04. Therefore, the coefficient of variation
is 0.10, which is reasonably small.
We now develop a general method for evaluating the performance of the timer which is
based on a linear theory. Hopefully, this theory will yield at least conservative estimates. Our
formula is a generalization of that given in paragraph 3. First of all, from (2.3) and (2.4), it
follows that
(4.7) RN= tN/F(Ch C2, V, VP),
where V±l) = Vf+kV^. Therefore, solving (4.7) for Vf, where F(CU C2, V, Vj-X)) is given
through (2.1) and (2.2), one finds that
(4.8) Vf = VCJ{CX + C2) - (AKi]) - V) - VCX ^'"^^/(C, + C2).
Here Ceff = \/C\ + 1/C2 is the effective series capacitance of C\ and C2, and A V^x) denotes
that variation in tube firing voltage from its expected value which is associated with determina-
tion of the nominal resistance RN. For brevity, we let g(C\, C2, V, RN, A F^") represent the
right hand side of (4.8). There is, however, a second variation, which we shall denote by
A VE2), that occurs once a resistor has been selected from a bin and the timer actually operated.
These two variations must be taken into account carefully when assessing timer performance.
One may now make a 1-1 transformation from the space of (C,, C2, V, AK^n, AK^2), Vf) to
that of (Cb C2, V, A Kin, A V^2\ RN) through the map
(4.9) C, = C,, C2 = C2, V = V, A V(EX) = A V(EX\ A Vp] = A V£\
Vf= g(Clt C2, V, RN, AKi1}),
whose Jacobian is dVE/BRN. It follows [3, pp. 56-62] that the density function for the state
(C, C2, V, AKin, AV&2\ RN) is
(4.10) f(Cu C2, V, AKiu, A^i2), RN) = Px{Cx)p2{C2)pi{V)pMy(EX))
■ ps^V^p.igiC,, C2, V, RN, AV^}
■ \dVE/dRN\,
where /?,(C,) (/' = 1,2) are the densities for C,, p3 is the density for V, p4 the density for
^VE\ p5 the density for &VE2\ and p6 the density for Vf. These random variables are all
assumed to be independent. In addition, AK^n and A V^2) are identically distributed. Next
account must be taken of the fact that, because of a change in temperature, the capacitances C,
will change in value. In fact, we assume that C,(T), where T denotes temperature, is of the
form
(4.11) C,(D = C,(l + Ki(T - TE)/\00),
where Kj represents a random percent change per degree from the expected temperature TE.
Thus Cj(T) is a product convolution [3, pp. 56-62] of C, and the second factor, which we
denote by ACP,(T) (representing a percentage change in C, due to a temperature change from
expected value TE to T). We then form the joint density h(Cu C2, AC/VD, ACP2(D, K,
384 E.A. COHEN. JR.
A vp\ A Vp\ /?yy) from /and the densities for these percent changes. Afterwards, h is multi-
plied by p(Rp(T)), the convolution density of picked resistance at temperature, where
(4.12) Rp(T) = Rp(\ + C(T - TE)/\00)
and C is a random percent change per degree. Finally, if we are interested in the conditional
density for any given bin i, we must divide by ph the probability of choice of bin /. It is clear
that, in order for the time output of the timer to fall between two chosen values t\ and t2,
Rp(T) must lie between
tx/F{Cx{T), C2(T), V, V±2))
and
t2/F(C{(T), C2(T), V, Vj2)),
where V^ - Kf + A vp> with Kf given by (4.8). Also, from (4.11),
(4.13) C,(T) = CACPi(T).
Now let XT= (C,, C2, &CP;(T), ACP2(T), V, kV^\ AVp]). There follows the general
multiple integration formula, which expresses the probability P, that the time falls between ^
and t2 for bin i and conditioned on temperature T:
r t I F
(4.14) p,P,U, < t < t2\T) = X '+1 L Bv , X /V p(Pp(T))h(XT,RN)dRp(T)dXTdR„,
,"ri ** X-ftR '(-c»,oo) •"|/r
where R1{—°ot oo) represents the seven-fold Cartesian product of the real line. Finally,
(4.15) PO, ^ r < r2) = * £ A f ^ P,(/, < t ^ t2\ T)dT,
1 2 ~ l 1 -oo I
given that the temperature distribution is uniform. This integration procedure could be accom-
plished on a digital computer through use of numerical Gaussian quadrature and Gauss-
Hermite quadrature [5, pp. 130-132]. However, instead of using this general nonlinear
approach, we find it convenient, in the present context, to linearize the products given by
(4.11) and (4.12) and to make use of a linearized version of RN given by
(4.16) RN=RN(ClE, C2lE. VE, V$) + AX{CX- ChE) + A2(C2- C2>£)
+ A3(V- VE) + A4(VT- Vj$),
where, of course,
oR/v oR/\i o/?/v oRn
A i = , A-, = , A i = , .. , Aa =
6C, ' l bC2 ' J bV ' 4 bVT
are evaluated at the expected values for the components and VJ-X)E represents the expected value
of random variable VJ-X). (4.11) now becomes
(4.17) C,m = ChE{K, - KiE){T- r£)/100 + C,(l + KlE(T- TE)/\00).
where KiE represents the expected value of Kh and (4.12) becomes
(4.18) Rp(T) = [1 + CE(T- TE)/\00]Rp + RC(C - CE)(T - TE)/100,
where Rc is the center of the bin considered. Note that the effect of (4.17) and (4.18) is to
replace product convolutions by convolutions of sums of random variables when it comes to
computing densities. Also, supposing that t\ = tN{\ — 8) and t2 = tN{\ + 8), the limits on the
innermost integral of (4.14) become tx/F = (1 - 8)//v//7and t2/F= (1 +8)tN/F, respectively, i
ANALYSIS OF FUZE TIMER 385
The functional form tN/F is to be replaced by the linearized version (4.16) with C\(T), C2(T),
and Vt2) substituted for Cb C2, and VT, respectively. We have, therefore, after a small com-
putation,
(4.19) txlF= (l-8)[RN + A^CX{T) + A2bC2(T) + ^4(A V^2) - A V^)\
and, likewise,
(4.20) t2/F = (1 +B)[RN + AlbCl{T) + A2AC2(T) + ^4(A^2) - A^")],
where AC,(D = C,(T) — C,. When C\, C2, V, and Fr are independent, normally distributed
random variables, the analysis is a bit simpler, since it is easily seen that, in this case, the pair
(RN, A K^n) is bivariate normal [3, pg. 162]. In addition, one notes that (4.19) and (4.20) do
not depend on C\, C2, and V, in the linear analysis. In Section 6, we present a numerical
example following this procedure. It may be noted, by analogy with the development in para-
graph 3, that the condition txlF < R(T) ^ t-J F is equivalent to requiring that R(T) lie
between two hyperplanes in the six-dimensional (RN, ACj, AC2, A V^\ A VP\ R(T)) space.
5. PROBABILITY INTERVALS AT AMBIENT TEMPERATURE AFTER POTTING
When the timer is actually packaged, or potted, this procedure will produce statistical
changes in the component values. These changes are known in the trade as potting shifts.
Such shifts can be taken into account by convolutions of the densities previously determined
with those densities evolving from the operation of potting. This has an effect on such items as
the picked resistor, the capacitors, and the voltage regulator. Generally, potting shifts are
represented as percentage changes from previous values, and, therefore, strictly speaking, we
have another product convolution to consider. For example, we represent the value of resis-
tance due to temperature and potting by
(5.1) Rpol(T) = RP(T)(\ +CHG,/100),
where the subscript pot denotes potting and CHGi represents a random per cent change from
the value of picked resistance at temperature. If we linearize Rpot(T) about nominal values, we
find that
(5.2) RPoST) = (1 + CHG LE/ 100) R(T) + RE(T)(CHGX - CHG, f)/100,
where Re(T) is the expected value of picked resistance at temperature for the given bin and
CHG] £ is the expected value of CHGJ. From (4.18), this is given, to a first approximation, by
(5.3) RE(T) = [1 + CN{T - TN)l\00]Rc,
where, as before, Rc is the center of the bin interval. As for the capacitances, we assume a
form
(5.4) C,.p0l(r) -C,(D(1 + CFKVlOO),
so that we would linearize C, pot(D about nominal values in a manner analogous to that for
Rpol(T). Lastly, the voltage regulator value after potting is representable by
(5.5) Kpol = V + CHG3.
Hence, we need only go back through our analysis with Rp(T) replaced by flpot(r), C,(D
replaced by C,pot(D, and V replaced by Kpot. It is assumed that VT, the cold cathode diode
tube firing voltage, is unaffected by potting. One more integration, corresponding to CHG3, is
introduced in order to take account of the change in regulator voltage due to potting.
386 E.A. COHEN, JR.
6. NUMERICAL RESULTS
Using a CDC 6600 computer, we were able to develop a computer code which can be
used to predict efficiently the performance of the timer under the linearity assumptions outlined
in the two previous paragraphs. The integration scheme developed will, in this paragraph, be
discussed in some detail. A listing of the computer code used can be provided on request.
First of all, in (4.18), we assume that Rp has a uniform distribution across the bin which
is being considered and that C is normally distributed. Let us suppose, as an example, that
CE = -0.0235, TE = 75°F, and a-(C) = 0.0078. Then, of course, from (4.18),
(6.1) RP(T) = [1 - 0.0235(7* - 15)/\00)RP + RC(C + 0.0235)(T - 75)/100.
Therefore, Rp(T) is a sum of two independent random variables, one of which is uniform and
the other of which is normal and of mean 0. It follows that
(6.2) p(RAT)) =
2tt (0.000078) | T - 15\Rc(ri+l - r,)[\ - 0.000235(7- 75)]
2
X
( 1-0.000235(7-- 75) )r+) ~j
l-0.000235(r-75))r,
Rp(T)-u
R JO. 000078) (r-75)
du.
Letting
v= (u - /?„ (D)//?r (0.000078) | T - 75|,
(6.2) is converted into
(63> P(R'm>= V5F(„», - „)[■ -0.000235(7-75)] S^' ''^ "-■
where
(6.4) v, = [(1 - 0.000235(7- 75))r, - Rp (T)]/Rc (0.000078)1 T - 75|
and
(6.5) v2= [(1 - 0.000235(7- 75))r,+1 - Rp (T)]/Rc (0.000078)1 T - 75 1 .
Several cases now arise according to the value of RP(T) and according to whether or not
T ^ 75°F. We first consider the case when T ^ 75°. Let us develop an inequality which
allows us to assert that Vj < — 3. In fact, suppose that
(6.6) RP(T) - t; > kx r, (0.00023 5)(T - 75),
where kx is to be so determined that v, < — 3 is valid. Upon substituting (6.6) into (6.4), one
has
(6.7) v, ^ - {kx + l)r,(0.000235)//?f(0.000078).
Remembering that rjRc = 1 - e, we find that fcn = e/(l - e) will yield the requisite inequality.
Next let us obtain an inequality which will permit us to say that v2 ^ 3. Suppose that
(6.8) rf+I - RP(T) > fc2r,+1(0.000235)(r-75).
Then, from (6.5), we have
(6.9) v2> Wc2- l)(l + e).
The right side of (6.9) equals 3 when
£2= (2 + €)/(! +e).
ANALYSIS OF FUZE TIMER 387
Thus, if, for T > 75,
(6.10) r,-(e) = r,(l + -r1— (0.000235) (T - 75)) ^ R(T)
1 — €
< r,+1(l - 771 (0.000235)(T - 75)) = r/+1(e),
it follows from (6.3) that
(6U) P(R"(T)) = (r,+1-r,.)[l- 0.000235(7-75)]-
Now suppose that r < 75. Letting Rp(T) - r, > k3 r, (0.000235) (T - 75), it follows that
(6.12) v, < 3(/c3+ 1)(1 -c).
The right side equals - 3 when /c3 = - (2 - e)/(l - e). Again, assuming that rj+x —
RP(T) > k4 rl+x (0.000235)(r - 75), we have
(6.13) v2>-3(l+e)(Jfc4-l),
which equals 3 when k4 = e/(l + e). Therefore, when T < IS and
(6.14) r/(e) = r,(l - \:zJ- (0.000235)(r - 75)) < R(T)
1 — e
< rl+1(l - — !— (0.000235) (T - 75)) = r/+1(e),
1 + €
(6.11) is again satisfied. Next let us go back to the case when T ^ 75, but let us now require
that v2 < — 3. We find that such is true when
(6.15) RAT) > r,+1 - — f— rM (0.000235) (T - 75).
1 + €
Since v2 ^ — 3 also implies that vt < — 3, we can assume that p(R(T)) = 0 in this case.
Likewise, one finds that V) ^ 3 whenever
(6.16) RAT) ^ r, - |— ^ r, (0.000235) (r - 75),
1 — €
so that, in this range, p(Rp(T)) = 0, also. When T < 75, v2 < - 3 whenever
(6.17) RAT) > rM- —^ r/+1 (0.000235) (T - 75),
1 + €
and V! ^ 3 when
(6.18) /?„(D ^ r, + — ^— r. (0.000235) (T - 75).
1 — e
Again it follows that p(Rp(T)) = 0. Now there remain certain intervals in which p(Rp(T))
cannot be treated as constant for a given temperature. For example, it is found that, when
T ^ 75 and
(6.19) r,+1(e) = r/+1 (1 - 4^ (-000235)(r - 75))
1 +e
< Rp(T) ^ r,+1(l - — — (.000235) (T - 75)) = s,+1(e),
1 + e
- 3 < v2 < 3 while v, ^ - 3. Also, in the interval
388
(6.20)
E.A COHEN, JR.
S,-(e) = r,(l -
< r,{\ +
T — L (.000235)(r- 75)) < RAT)
1 — 6
(.000235) (T- 75)) = r,(e).
1 - 6
3 < v, < 3 while v2 > 3. When T < 75, p{RAT)) cannot be treated as constant whenever
(6.21) s/(e) - r,(l +
1 -€
(.000235)(T- 75)) < ^(D < r/(e)
or
(6.22) r/+1(e) < RAT) ^ ri+l (1 -
2 + 6
1 +6
(.000235) (T- 75)) = s/+i(e).
The intervals so developed, in which the behavior of p(Rp(T)) is examined, are very
important in the numerical study conducted on the CDC 6600. We now set up the precise pro-
cedure used in the computer study, first of all, referring to (4.19) and (4.20), we find it a little
more natural to integrate with respect to AC,(D or AC2(D first instead of~Rp(T). We see
then that our region of integration is fully specified by
(6.23) fi(bC2.Rp(T), R^.AV^.AV^) ^ AC, ^ /2(AC2> Rp(T), RN, A V^\ A V^)
— oo < AC2 < +°°
-oo < Rp(T) < +<*>
— oo < A VP} < oo
-oo < A VP] < oo
r, ^ RN ^ r/+]
Tx < T < T2,
where, for A , > 0,
1
(6.24)
/i =
b¥L-A4C2-RN-A4(AVF-*vP)
^l -A2AC2-RN- A4(* VP - A Vp)
and the inclusion of negative values for Rp(T) is merely a mathematical artifice. The density
function for this process then has the following form:
(6.25)
MAC,, AC2, RN, RP(T), A V^\ A Vp})
= pMC,)p2(AC2)pARp(T))p4(Rn, AVE\x))ps(AV(E2))lpi{T2- Tx).
The densities /?,, p2, and p5 are all normal densities. The mass function p3 was ascertained in
(6.3). p4 is a bivariate normal density, and p, is the probability of being in bin /'. It is easy to
determine the correlation coefficient p for p4. Multiplying A V^l) by R^, as given by (4.16), we
have
(6.26) E(RNA VP}) - A4 £(A F^")2
= ^4o-2(A^1)),
and, since the expected value of A V^ is zero, cov(RN, AV^U) = £(/?/vAK^1)). It follows,
using (6.26), that
(6.27) p = ^4o-(A^1))/o-(/?^).
ANALYSIS OF FUZE TIMER
389
The factors in (6.25), other than p3, are given by
pMCx) =
P2(AC2) =
1
(27r),/2o-(AC1)
1
(2tt)1/2o-(AC2)
exp
exp
AC, - £(AC.)
cr(AC,)
AC2- £(AC2)
(AC2)
P4(RN,±vn
Lih =
27ro-(i?^)o-(A^1))Vl -p2
exp
-1
2(1 -p2)
^?v — E(RN)
-2p
R,
(t(Rn)
E(RN)
+
[AKi»
- £(A^iu)
cr
(AKi»)
0"(/?yv)
AFi"
£(A^1))
o-(AKin)
/>5(A^2)) =
1
(27r)1/2o-(AKi2))
exp
f A V^
-£(A^2))
a
(A Vp])
where p is given by (6.27) and p4 is the well-known joint normal density for two variates [7,
pp. 111-114].
Now let K,E = .04 for / = 1,2 in (4.17) and trlK,) = .013, / = 1,2. Recall from our dis-
cussion in paragraph IV that C\ E = .44p,/, C2£=.15p,/, E{RN) = 40.16 megohms,
,4, = 8.65, A2 = - 293.12, ^4=-0.95, and a(RN) = 4.04. In addition, suppose that
£(A^n) = £(AF^2)) = 0ando-(AKin) = cr(AKi2)) = 0.2357. Then it is seen that
£(AC,) = 0.000176(7-75), o-(AC,) = 0.00005874| T- 75|,
£(AC2) = 0.00006(7-75), ando-(AC2) = 0.00002034|r-75|.
Next we make several changes of variable. Let
(6.28) u = (AC, - £(AC!))/V2(7(AC,)
w = (AC2 - £(AC2))/V2o-(AC2)
z = v/V2
w, = (/?„ - E(RN))/y/2(l - P2)<t(Rn)
wx= AKin/V2(l -p2)o-(AKin)
w2= A^2)/V2o-(AKi2)).
Then (6.25) becomes
(6.29)
*9-
vr
7r3(r,+1 - /-,)[! + Cyv(r- 75)/100]
-! ;.:
2 e--' dz
- ( « j | — 2p u | h> | + >v | )
e ■ e ' ' ' ' /Pi(T2 - T,).
Now one finds, by completing the square, that
(6.30)
-<U|2-2pW| W,+ >V|2) -(Hi|-pu,)2-(l-p2)u ,2
Next we let w3 = W[ - pw, and u2 = Vl - p2 w,. Our integrand becomes
390
(6.31)
h =
E A COHEN. JR.
1
7rHrl+l - r,)[l + CN(T - 75)/100]
2 2 2
e 3 • e 7 ■ e 1lp\T2- T}).
dz
For brevity, set Y = (w, h>2, w3), and let /?3(-oo,oo) denote the usual three-dimensional
Euclidean space. Also, put u2i = (r, - E{RN))/Jl v{RN) and «2./+i - C^+i - E(RN))/y/2
a(RN). Then our integration scheme becomes
(6.32)
where
P,(^(l -8) < f < f„(l +8)) = Ax + A2,
J75 /•"2/+1 T /'r/(e) fF7(Y.uvR)
r J J , ^K^2c/r j . . f sl h{u,Y,R,u2)dudR
,/■',(«) j.FAY.u^.R
F2(Y,urR)
xL,(«) /.FJY.UtR)
+ J.M L,v ,» h(u,Y,R,u2)dudR + I , ,. I ,v _, h{u,Y,R,u2)dudR
Jr^t) J F^kY.u^.Ri Jrl+iU) J F\(Y.u2.R) l
and /42 is obtained by using 75 and T2 for limits on the T integration in place of Tx and 75,
respectively, with primed quantities replaced by unprimed quantities. In addition, we have set
(6.34) F1(r,«2,JR)=F1(w,w2,w2(w3)/?)=[/1(AC2//?(AK^),A^1),/?^)-£(AC1)]/V2o-(AC1)
F2(Y,u2,R)=F2(w,u2,w2,wi,R)^[fMC2,R,AVP),AV^),RN)-E^C[)]/^2(r(^C0.
Now /, and f2 were defined in (6.24), and, from the changes of variable given by (6.28), we
have
(6.35) AC2= £(AC2) + V2wo-(AC2)
AFi2) = V2o-(AKi2,)w2
A^n = >/2(l -p2)o-(A^n)(»V3 + pw2/Vl -p2)
/?;v = £(/?/v) + J2<r(RN)u2.
Our computer code is just the implementation of a nesting procedure, making use of Gaussian
and Hermite-Gaussian quadrature routines, together with routines to evaluate the error integral
[5, pp. 130-132], [6, pp. 319-330], [8], [1, pg. 924]. It turned out to be convenient and numer-
ically accurate and timewise efficient to employ three Gauss points per integration step.
The effect of cold cathode diode firing voltage variations in this problem is more
significant than that of ambient temperature departures from nominal. In our case study, for
example, when e = .01 and 8 = .02, P, was essentially 91%. With 8 = .03, this figure was
increased to almost 100%. Results for six bins with e = .01 and 8 = .03 are given in Table 1.
TABLE 1 — Performance of Fuze Timer
for Representative Bins
Pi
/■/
r,+\
Re
.994848
37.4435
38.2000
37.8218
.995103
38.2000
38.9717
38.5858
.995221
38.9717
39.7590
39.3653
.995414
39.7590
40.5622
40.1606
.995452
40.5622
41.3817
40.9719
.995490
41.3817
42.2176
41.7997
ANALYSIS OF FUZE TIMER 391
It is seen that the probability is essentially the same independent of the bin. Running time for
this problem was approximately four seconds per bin. Indeed one would reason, as in para-
graph 3, that, at least approximately, each bin should yield the same probability for firing time,
given a 8 — e combination. This should occur if the nonlinearities are not too severe and the
distributions due to change in temperature and cold cathode diode firing voltage variations are
fairly compact. This would then mean that we need only examine one bin to determine the
performance of the timer, and our integration procedure could then represent a substantial time
saving over a Monte Carlo simulation.
Going back to (6.32), we can also give an error bound for the part neglected in the com-
putation of Ph Let us illustrate in one case what is happening. For instance, we have neglected
(6.36) L J J , J , \F(Y -, h{u,Y,R,u2)dudYdRdu2dT
Clearly, (6.36) is bounded above by
(6.37) f 2 f"2',+1 f A f°° , h{R,Z,u2)dRdZdu2dT,
Jis Jti2j ~'ze/?4(-°°,oo) *,s,+i(e)
where Z = (u, Y). Noting that h{u,w,R,W3,u2,w2) = g{u,w,WT,,u2,w2)p(R) and that
frf'f * g(Z,u2)dZdu2dT= 1,
We need only study the behavior of the integration with respect to R. Going back to (6.37),
when si+\ < R < °°, we know that V) ^ v2 < — 3. Therefore, it is easy to show that
(6.38) p(RAT)) < — L e~vl/2/Rc<r(C)\T- 751/100.
Vz7r
It follows {2, pg. 149] that
, ,p(R(T))dR(T) ^ -7== I e~x/2 dx = .00135.
s, + l(e) V27T J3
A similar result is obtained when R is restricted to the interval (— °°, s,(e)) and T > 75° or
when R lies in either (syV^e),00) or (-°o, 5/(e)) and T < 75°. The result is finally that the
portion neglected is bounded above by .0027, so that we are at most off in the third decimal
place.
7. THE CASE OF TWO OR MORE TIMERS
An interesting case study arises when there are two or more timers which are statistically
dependent. This occurs, for example, when, after the first timer is operated, a switch closes
and a second timer is started, the second one being fed by the same capacitor which fed the
first timer. Let us suppose, for instance, that capacitor CI in Figure 1 feeds the second fuze
timer indicated in Figure 6.
At the end of operation of the first timer, switch S in Figure 6 is thrown into the position indi-
cated, thus allowing CI to begin charging up C4. C5 serves as the reference capacitor. The
second timer is also governed by a simple first order differential equation, and one can show
that the time is given by
(7.D ,,^£i£ien
Letting 4n be the nominal time for the second timer, we find the nominal resistance for this
timer to be
C\V- C2(VT- V)
cxv-
- C2(VT- V)- {VTA- K)(C, + C4)
392
E.A. COHEN, JR.
TIMER
OUTPUT
Figure 6. Second fuze timer configuraiion
(7.2)
D (1) _
(C, + c4)^"
C\C*
In
VC] - (V}"~ V)C
vcx - (V±l)- V)C2
ivff
j - v ; ^ 2 ~ \rt,\ ~ V)(C\ + C4)
where K/n = Kf+AKJn and V$\ == Kf, +A^',). Then, substituting (4.8) into (7.2), we
derive the functional relationship
(7.3) R$] - /?;" (C,, C2l C4, V, RN,bV£", V}}{).
Solving (7.3) for Vfti, we have
(7.4) VfA = h(CuC2,C4, V, RN. R&1), LVp\ LVgl).
To determine the joint density for the process, we must, by analogy with the method in para-
graph 4, introduce a pair of diode firing variations A V^2) and A Vj£\. We then consider the fol-
lowing transformation of variables:
(7.5)
yf.\ ■
= h{C\, C2, C4
c,=
c,
c2 =
c2
c4 =
c4
V =
V
RN =
RN
AVP
A^»
*Vfi
A Vft
AKi2
_
afj»
a vm
_
Ayfl.
K /?„, CA^'.A^1,')
(1)
(1)
To compute the density, we employ (4.10) and the Jacobian of the transformation (7.5) to
obtain
ANALYSIS OF FUZE TIMER
393
(7.6) d5(Clt C2, C4, V, RN, R$\ LVP, AKft AKJ2), AK$)
= /(C„ C2, K.A^.AKjP,/^) rf,(C4) -dMV{ExD -dMV^)
d,iVfA)
bVfA
9/?;n
Also, if both /?,v" and RN are linearized about nominal values of capacitance, tube firing vol-
tages, and regulator voltage, then the map
(7.7) R^ = I,(C,, C2, C4, Vf^VP, V, VluLVg\)
RN= I2(C„ C2, V, Vl AKin)
AK<V=A^'»
A^" = AK^1'
shows that (/?^l>, ^/v, A V^\, A K^n) is a quadrivariate normal random vector [2, pg. 162]. The
reason is that all random variables on the right side of (7.7) are independent and normally dis-
tributed. At the nominal temperature, the density function is therefore generally representable
by
(7.8) d6(CuC2,C4, V,RN,R^,AV^,AV^\, Rp, R™ , AV£2\ A^j)
= </5(C„ C2, C4, V, RN, RJP.LVP.LVSlLVP.LVgi)
■p(Rp)-p«HR}l)),
where, for example,
p{Rp)= l/(r/+1- r,)
and
//» (/?;»)- 1/0$ -i>(l))
if picked resistance is equally likely across the bins. (7.8), also, obviously indicates that picked
resistances are statistically independent of the other component values. It will be possible to
reduce (7.8) to the simpler form
(7.9) d6(RN, R$\ AKi», A K^, Rp, R«\ A V?\ A Vg\)
= p{RN,Rkl\bVP,b,Vgl) p(Rp) VW*) ■ piAV^) -p(AV$)
when (7.7) is valid, p(RN, R^\ A KJn, A v£\) being the density for the quadrivariate normal
distribution [9, pg. 88]. From (7.7) the elements of the covariance matrix [9, pg. 88] can be
easily obtained.
Next account must be taken of changes in component values due to temperature changes
from the nominal value. We use the same ideas presented in paragraph 4, together with the
same notation. The density becomes
I
(7.10) d{Cx, C2, C4, V, RN, Rtf\ A VP, A V${, Rp(T), Rpil)(T),
ACP,(n, ACP2(T), ACP4(T))
= d5(Cu C2, C4,V, RN, R#\ AV?\ AKif7,AKi2>, AK^)
AK«)
.LVff.
■ p{ACPx{T)) ■ p(ACP2(T)) ■ d(ACP4(T)) ■ p(Rp(T)) ■ p{l)(RpiU(T)),
394 FA COHEN. JR.
where p{Rp(T)) and p{l)(Rp{l)(T)) are again convolution densities. We must now determine
limits of integration. One requires that the first timer fire in time t, where tx = tN{\ - 8) ^
/ ^ tN(\ + 8) = t2 and that the second timer fire in time tu\ where r/1' = ^"(1 - 8(1))
^ r(,) < 41}(1 + 8(l)) = '2<n- Therefore, we have
(7-11) r,/F< RP(T) ^ t-jF
,m/Fu) ^ Ru)(T) < ,(i)/F(i»
— °o < C/ < «\ / = 1,2,4
-oo < ACP,(T) < oo, /= i; 2, 4
- oo < A VP] < oo
- «3 < A Vg\ < oo
- oo < A VP] < oo
- oo < A K^ < oo
''/ ^ FN < rl+]
,.(1) <- p (1) ^ r(\)
- oo < J/ < oo,
where F = FiC^T), C2(D, K ^2>), F(1) = F(,l(C,(r), C2(D, C4(D, K, F/2\ K/2/) and
yp) = |/f + a ^2), J//2/ = VfA + A J^2/, as before. Kf is given by (4.8) and V$A by (7.5).
Also, (4.13) holds for / = 1,2, and 4.
An integration scheme patterned after (4.14) can then be recorded with pu = prob (RN €
bin /and R^]) 6 bin y) in place of /?,-. (4.15) would then be replaced by a double sum:
(7.12) />(/, ^ t < r2> r,(,) ^ f(1) < /2(n)
1 oo oo _ 7"
r,- r,
Also, in the case where (7.7) is valid, C{ C2, C4, and V are eliminated and ACP,(T) is to be
replaced by AC,(D. In addition, /(I)/F(1) and f/F become linear forms in AC^D, AC2(D,
AC4(r), /?/v, /^n, A^'\ A^2), Al^.y, and A^2|. In that case a sixteen-fold integral is
reduced to a twelve-fold integral.
ACKNOWLEDGMENTS
The author would like to thank personally Messrs. Larry Burkhardt and John Abell, of the
fuze group at this laboratory, for their generous support of this work and their helpful sugges-
tions. Also, he would like to thank Mr. William McDonald and Mr. Ted Orlow, also of this
laboratory, for their helpful comments and guiding ideas.
REFERENCES
[1] Abramowitz, Milton and Irene A. Stegun (Editors), Handbook of Mathematical Functions,
National Bureau of Standards Applied Mathematics Series, No. 55, June (1964).
[2] Cohen, Edgar A., Jr. and Ronald Goldstein, "A Component Reliability Model for Bomb
Fuze MK 344 Mod 1 and MK 376 Mod 0", NSWC/WOL/TR 75-123 (1975).
ANALYSIS OF FUZE TIMER 395
[3] Fisz, Marek, Probability Theory and Mathematical Statistics, John Wiley & Sons, Inc., New
York (1963).
[4] Hald, A., Statistical Theory with Engineering Applications, John Wiley & Sons, Inc., New
York, Chapman & Hall, Limited, London (1952).
[5] Hamming, R.W., Numerical Methods for Scientists and Engineers, McGraw-Hill, New York
(1962).
[6] Hildebrand, F.B., Introduction to Numerical Analysis, McGraw-Hill, New York (1956).
[7] Hogg, Robert V. and Allen T. Craig, Introduction to Mathematical Statistics, 3rd Edition,
MacMillan, London (1970).
[8] IBM 7094/7094 Operating System Version 13, IBJOB Processor, Appendix H: FORTRAN
IV Mathematics Subroutines, International Business Machines Corporation, New York
(1965).
[9] Parzen, Emanuel, Stochastic Processes, Holden-Day, San Francisco, London; Amsterdam
(1964).
THE ASYMPTOTIC SUFFICIENCY OF SPARSE
ORDER STATISTICS IN TESTS OF FIT
WITH NUISANCE PARAMETERS*
Lionel Weiss
Cornell University
Ithaca, New York
ABSTRACT
In an earlier paper, il was shown that for the problem of testing that a sam-
ple comes from a completely specified distribution, a relatively small number of
order statistics is asymptotically sufficient, and for all asymptotic probability cal-
culations the joint distribution of these order statistics can be assumed to be
normal In the present paper, these results are extended to certain cases where
the problem is to test the hypothesis that a sample comes from a distribution
which is a member of a specified parametric family of distributions, with the
parameters unspecified.
1. INTRODUCTION
For each n, the random variables X\{n), ... , Xn{n) are independent, identically distri-
buted, with unknown common probability density function and cumulative distribution function
f„(x), Fn(x) respectively. An m-parameter family of distributions, with pdf fQ(x\9u ... , 9 m)
and cdf F0(x;9{, .... 9m), is specified, and the problem is to test the hypothesis that f„(x) =
/q{x\9\, ..., 9 m) for all x, for some unspecified values of 9X, ...,9m.
In [5] the simpler problem of testing the hypothesis that f„{x) = fo(x), where fo(x) is
completely specified, was discussed. In this simpler case, the familiar probability integral
transformation can be used to reduce the problem to that of testing whether a sample comes
from a uniform distribution over (0,1). This type of reduction is not always available when the
hypothetical density is not completely specified. (See [1] for some cases where the reduction is
available.)
will
Since we will be interested in large sample theory, to keep the alternatives challenging we
assume that f„(x) = f0(x;9], ...,9%) (1 + r„(x)) for some unknown values
. . , 9®„ and some unknown function rn(x ) satisfying the conditions sup |/"„(x)| < n~* and
< n e for all n and for j = 1,2,3,4, where e is a fixed value in the open interval
d1
r„(x)
Su jj
X
dx>
\
1 1
3'
2
'Research supported by NSF Grant No. MCS76-06340.
397
398
L. WEISS
case. That is, f0(x;9h92) = — g
x - 0,
The case where m = 2, and 0,, 02 are location and scale parameters respectively is rela-
tively simple to analyze, and occurs often in practice, so until Section 5 we will discuss only this
with 9 2 > 0, and the pdf g(x) is completely
< Ai < °° for j =
specified. G(x) denotes g(t)dt. We assume that sup
J -co /
1,2,3,4, and that sup g{x) < A2 < °°.
d' t \
^8M
tions:
(1.1)
For each n, we choose positive quantities /?„, q„, and L„ satisfying the following condi-
p„ < q„ < 1 - «"e.
(1.2; npn, nq„, Ln, and K„ = are all integers.
(1.3)
(1.4)
, 3
lim — - — = 1 for some fixed 8 in the open interval
lim p„ = 0, lim qn = 1, lim npn = °o.
°H
(1.5)
(1.6)
(1.7)
bn = inf
g(x):G~'
Pn
l + n
<x < G'
Qn
1
— ri
-e
> n y for a fixed
positive y with — - e + 28 + 5y < 0.
lim
°o, lim
//-«» np„ ' n-oo A7 (1 — q„)
g(G-HP„))
g(x)
g(G-Hq„))
g(x)
> A3 > 0 for all x < G~l(p„), and
> A4 > 0 for all x > G~x{qn).
y,(n) < Y2{n) < < Y„(n) denote the ordered values of *,(«), • ..,X„(n). For
typographical simplicity, we denote Y,(n) by 7,. For j — 1, ... , Kn, let ^(z?) denote —
(Y*Pn+jL„ + YnPn+<J-l)L), and let D,(n) denote ( J^+yz.,, - YnPn+(i_x)L). For y =
1, ... , Kn — \, let W"(lj>), ... , W'(L„—l,j,n) denote the values of the L„ - 1 variables
among {X\(n), ... , Xn(n)} which fall in the open interval Yj(n)
D,{n) '
^, ?,<■> *
, written in random order: that is, the same order in which the corresponding elements
W'iiJ.n) - Yj(n)
of \X\(n), .... Xn(n)} are written. Define W(i,j,n) as „ , „ for i
Dt(n)
SPARSE ORDER STATISTICS 399
1, .... L„ - 1 and j : = 1, .... Kn, so - — < W(i,j,n) < — . Let Wij.n) denote the (L„ -
l)-dimensional vector { W{\,j,n), . . . , W{Ln — \J,n)} for j — 1, . .., K„. Let
Wi\,0,n), ... , Winp„ — 1,0, n) denote the values of the npn — 1 variables among
\X\(n), ... , X„(n)) which fall in the open interval (— °°, Ynp ) written in random order. Let
W(0,n) denote the vector [W(l,0,n) W(npn-l,0?n)}. Let W(\,Kn + \,n), . . . ,
Win — nq„,K„ + \,n) denote the values of the n — nq„ variables among \X\in) X„in)}
which fall in the open interval (Ynq ,°°), written in random order. Let W(K„ + \,n) denote
the vector { W(\,K„ + \, «),... , Win - nqn,K„ + l,n)}. Let Tin) denote the (K„ + l)-
dimensional vector { Ynp +JL ; j = 0, 1, . . . , Kn). Note that if we are given the Kn + 3 vectors
defined, we can compute the n order statistics Yx, ■ ■ ■ , Y„, so that any test procedure based on
the order statistics can also be based on the K„ + 3 vectors.
Let hn(j(n)) denote the joint pdf for the elements of the vector Tin), and let
h?n(w(i,n) \t(n)) denote the joint conditional pdf for the elements of the vector W(i,n), given
that Tin) =jin). Then the joint pdf for all n elements of all the vectors is hnijin))
hi'n(w(i>n) Li(«)), which we denote by hn{1).
Next we construct two different "artificial" joint pdfs for the n elements of the vectors.
In the first artificial joint pdf, the marginal pdf for Tin) and the conditional pdfs for
Wi0,n) and WiK„ + \,n) are the same as above. The pdfs for the elements of the other vec-
tors are constructed as follows.
Let a tin) denote G~
np„ +
1
7~T
, and y ,in) denote ^— — — — t x^ , for j = 1, .
g' ia /in))
In g2iaiin))
Kn. Let UiiJ) (/=1, .... L„—\;j=\, ... , K„) be IID random variables, independent of
Tin), W_iQ,n), W iK„ + \,n), and each with a uniform distribution over (0,1). Then the dis-
tribution of Wii,j,n) is to be the distribution of — — + (1 +y/(n)) UiiJ) - y iin) U2ii,j), for
/ = 1, .... Ln — 1 and /' = 1 Kn. Denote the resulting joint pdf for all n elements by
h(7)
"n
In the second artificial joint distribution, the marginal pdf for Tin) and the conditional
pdfs for Wi\,n), ... , WiKn,n) given Tin) are the same as in /?„(1). Given Tin), the np„ -
1 elements of Wi0,n) are distributed as IID random variables, each with pdf
giix-9?)/9$)/6$G (O%,n-0,o)/02°) for x < YnPn, zero if x> YnPn. Given T in), the n- nqn
elements of WiKn + l,n) are distributed as IID random variables, each with pdf
(U/02°) g iix - 0,o)/02°)/(l - G HYnqn - 9?)/9$)) for x > YnQn, zero if x < K„v Denote the
resulting joint pdf for all n elements by /?„<3).
If Sn is any measurable region in ^-dimensional space, let P {l)iSn) denote the probability
assigned to S„ by the pdf /z„(,). The next two sections are devoted to proving the following:
THEOREM 1: lim sup \P.i2)iS„) - P.U) iS„)\ = 0.
THEOREM 2: lim sup \P. („(5„) - PhW(S„)\ = 0.
400
L. WEISS
2. PROOF OF THEOREM 1
Let h„ denote the joint pdf which differs from h„ ' only in that y;(n) is replaced by
(2)
y;(n), denned as
Ln f'n(<Xj(n))
, where atj(n) = F'n
= f-1
npn +
J~l
It was shown in
[8] that lim sup \P (4>(S„) — P (\)(Sn)\ = 0, and thus Theorem 1 will be proved if we can show
n— *oo Sn "n "n
that lim sup \P (2) (S„) — P w(Sn)\ = 0. By the reasoning used in [8], this last equality will
n— *°° S n n
n
be demonstrated if we can show that
hnl2)(T(n), W(0,n) W(K„ + \,n))
log
h„w(T(n), W{Q,n) W(K„ + l,n))
= Rn,
say, converges stochastically to zero as n increases, when the joint pdf is actually h}2\ From .
the definitions above, and the formula in [8], for all sufficiently large n we can write Rn as
(2.1)
7 I I
Z 7-1 '=1
log[l+y}(n)-4yj(n)W(U,n)]
- \og[\+y j(n)-4yj(n)W(iJ,n)]
where W(i,j,n) have the same distribution as - — + (\+y l(n))U(i,j) - y j(n)U2(ij). We
show that the expression (2.1) converges stochastically to zero as n increases by means of three
lemmas. (The order symbol 0( ) used below has the usual interpretation.)
LEMMA 2.1: max |y,(«)|=0U 3+*+2>
PROOF: Directly from the assumptions and the definition of y ,(n).
LEMMA 2.2: sup | F~] (t) - {0i° + 02oCrl(/)}| - 0(/Te+?).
n = = ^ n
1
PROOF: Since /„(x) = ~ g
x-0f
e°2
x-0?
9°2
x-0
0
0?
(l + r„(x)), with \r„(x)\ < n €, we have F„{x) =
— — Cx 1
+ Rn(x), where R„(x) = J -y g
y2
t-v\
0 1
rn(t)dt, and thus \R„(x)\ < n~eG
. Then we can write F„ (x) = G
x-0
(! + /?„ (x)), where \R„(x)\ < n'1 for all
x. Fix any value / in the closed interval [p„,q„]. Writing Fn(x)= t = G
we have x = F„ Ht) and G
(2.2) G~l
We can write G~x
(^1(r)-01°)
x-0
0 \
0,°
(! + /?„ (x)),
l + R„(F-lU))
, so
1 + n
t
<'■-'<'>-"' <G
1 + A?"
0,°
= (T1*/) -
r/7"
1 + «"
1
■
t
J
1 - n~€
1
i
'(G~Ht*))
where r* is in the open
interval
i
1 + n
J
, and thus
SPARSE ORDER STATISTICS
1
401
t
- G~Kt)
t
g(G-Hf))
< ny , by assumption (1.5). Then sup
n „■£'&„
1 + n
that sup
p,,^1^,, [ 1 — «
using the inequalities (2.2).
= 0(/7 €+y). By a completely analogous argument, it can be shown
G~Ht)
= 0(/? e+y). Then the lemma follows immediately,
LEMMA 2.3: y An) - y,(«) + 8. ■ (/?), where max \b An)\ = 0U 3
i</<^„
+8-e+3y
PROOF: By lemma 2.2, we can write y ,(«) as
L^ f;,(9? + 0$al(n)+8l(n))
In ft(0? + e°a,(n)+8l(n)) '
where max |8,-(/i)| = 0(n~€+y), f'„(x) = —z g
\<j<Kn 6"
x - e
x - 9
o
99
r'„ (x) + (1 + /-„(*))
(020)2
, so we can write f'n{9\ + #2° <*/(«) + 8,(«)) as fn #'(<*,(«)) + 8 *(«), where
(02°)'
1
max \8'(n) | = 0(n~e+y). We can also write /„(6>,° + 6>2V(«) + §,•(«)) as — - g(a ,■(/»)) +
il/'S*,, #2
8, («), and thus fj(9? + 02°a/ («) + 8y(«» as — - j-y g2(ai(n)) + 8*(n), where max
\9 2 ) 1 </ ^ ATn
|8.-(/i)| = 0(n"e+y) and max |8*(n)| = 0(n~'+y). Thus we can write y An) as -^
i</<^n In
{((l/(02O)2)^(«/(«))+8*(«))/((l/(e2o)2)g2(a/(«))+8*(«))}, and the proof of the lemma fol-
lows directly from assumptions (1.3) and (1.5).
Now we complete the proof of Theorem 1 by applying the expansion log (1 + x) = x —
X2 X3 X4
~- + — for |x| < 1, where |a>| < 1, to each of the logarithms in the expres-
2 3 4(l+a>x)4
sion (2.1). This enables us to write the expression (2.1) as the sum of a finite number of
expressions, each of which can easily be shown to converge stochastically to zero as n increases,
using the lemmas. For example, two of these expressions are:
K„ L.-l
(2.3)
1
Z Z (y2(«)-y2(«))> and
/=! /=!
(2.4)
2Z Z (yj(n)-yj(n))W(i,j,n).
The expression (2.3) is the sum of K„(Ln-\) terms, where Kn{Ln-\) < n. A typical term
can be written as {y f{n)-y j{n)) (y j(n) +y7(«)), which by Lemmas 2.1 and 2.3 is
-y-€ + 28+5y ±-e + 2s + 5y
0(« ). So the whole expression (2.3) is 0(n
) and converges to zero as n
K„ Z...-1
increases, by assumption (1.5). The expected value of the expression (2.4) is 2 £ £
j= i /= l
1 ] *n l„-\
y z ( « ) , and the variance of the expression (2.4) is 4 £ £
f- i /= i
iy jin) - y j{n))
402
L. WEISS
(y ,(n) - y ,{n)):
1 yf(n)
12 180
This mean and variance can both be seen to converge to zero
as n increases by the same reasoning as in the analysis of the expression (2.3), and thus the
expression (2.4) converges stochastically to zero as n increases. The other expressions in the
sum comprising the expression (2.1) can be handled similarly, completing the proof of
Theorem 1.
3. PROOF OF THEOREM 2
In Section 2 we showed that we can write F„(x) = G
02°
(\ + R„(x)) where
|/?„(x)| < n e for all x. We now develop an analogous expression for 1 — F„(x). 1 — F„{x)
- £" f"{,)dt =
1-G
02°
9?
dt
x-9
< n
+
\-G
r '*«
X
-e?\
^2° |
*2°
dt, and since
we can write 1 — /\,(x) =
(1 + S„(x)) where |S„(x)| < n f for all x.
Theorem 2 will be proved if we can show that
log
/?„u,(7>), W(0,n),
W{K„ + \,n))
hp](T(n), W(0,n),
W(K„ + \,n))
= r:„
say, converges stochastically to zero as n increases, when the joint pdf is actually /?„(1). Assum-
ing /?„(1) is the joint pdf, the conditional (given Tin)) distribution of R' is the same as the dis-
np,r 1 _
tribution of &,(!) + Q„ (2), where Q„(\) = £ log(l + r„{Vi)) - (npn - Dlog (1 +
/-l
" — "in
R„(Y„P)), and Qn{2) = £ log(l + r„(Z,)) - (n - ngn)log(l + Sn(Yn(l)), and
"p„
I) Z\>
'n-nqn
are mutually independent, each V, with pdf
< Ynp , zero for v > Ynp , each Z, with pdf
fn(z)
Fn(Ynp)
for v
"p,i
F ) for 2 > YnQir zero for z < J^
n^'nq„'
LEMMA 3.1: Qn (1) converges stochastically to zero as n increases.
"/>„-i
PROOF: Define (?„(!) as 52 r»< K/) ~ (wa-1)/?„(K,V(). By assumption 1.6
;-l
|(?„(1) - Q„(l) | converges stochastically to zero as n increases. Thus the lemma will be provec
if we show that Qn{\) converges stochastically to zero as n increases.
I
At)
't-9^
E{rn{V}\T{n)) =■
0?
(\ + r„(t))dt
YnPn ~ 9 1
99
(l + Rn(Ynp))
SPARSE ORDER STATISTICS
403
"/>„
9?
°!
r. M-2e,
Rn(Ynp) + a
Y"P„ ~ 9 1
e2°
^-'
0,°
(i + /?„(yB„))
where |uj < 1. From this, it follows that |£{r„(F,)| r(w)} - K„ ( 5^) I = O^-2*). This
implies that E[Qn(\)\T(n)} converges to zero as n increases, and also that Variance
{rn(Vi)\T(n)} = 0pin~2e) which in turn implies that Variance [Q„ (1)| Tin)) converges sto-
chastically to zero as n increases. These facts clearly imply that Q„i\) converges stochastically
to zero as n increases.
LEMMA 3.2: Q„ (2) converges stochastically to zero as n increases.
_ "~"q» _
PROOF: Define £>„(2) as £ rniZ,) ~ (n ~ nQn^n (Ynq ). Just as in Lemma 3.1, all we
have _to do is to prove that Q„(2) converges stochastically to zero as n increases.
£{r„(Z,)|7>)} =
f°° 1 '-0i°
\-G
^«<?„ — ^ i
[\ + S„{Y„q)]
S,AY„a)
\-G
02°
+ (ii„n~
\-G
Y"q~e\
0?
\-G
Ynq., ~ ^ 1
9?
[l + S„{YnQ))
where |a»„| < 1. From this, it follows that \E{rn{Z)\T(n)} - S„(Ynq)\ = Qp(n~2€). The rest of
the proof is similar to the proof of Lemma 3.1.
Lemmas 3.1 and 3.2 imply that R* converges stochastically to zero as // increases, and this
proves Theorem 2.
4. CONSEQUENCES OF THE THEOREMS
Theorem 1 implies that a statistician who knows only the vectors Tin), \V(0,n),
W.{Kn + \,n) is asymptotically as well off as a statistician who knows all the vectors Tin),
WiO.n), Wi\,n), .... W(Kn + \,n). This is so because given Tin), using a table of random
numbers it is possible to generate additional random variables so the joint distribution of the
additional random variables and the elements of Tin), WiQ.n), W(K„ + \,n) is the joint dis-
tribution given by h}2). But Theorem 1 states that all probabilities computed using /?„(2) are
asymptotically the same as probabilities computed under the actual pdf /7„(1).
404
L. WEISS
Theorem 2 implies that asymptotically the order statistics {K1(
nq„+\>
Y,,} contain no information about r„(x). This is so because under //„(3) the conditional distribu-
tion (given T{n)) of these order statistics does not involve r„(x).
Taken together, the two theorems imply that a knowledge of T(n) is asymptotically as
good as a knowledge of the whole sample, for the purpose of testing whether rn(x) = 0. This
assumes that we have to deal only with the challenging alternatives described in Section 1, but
less challenging alternatives do not pose any problem asymptotically.
5. EXTENSION TO OTHER CASES
The results above were for the case where the unknown parameters are location and scale
parameters. In other cases, it may not be possible to choose p„ and qn that will guarantee that
assumptions (1.5) and (1.6) hold for all 9X, ... , 0,„, if we want lim p„ = 0 and lim q„ = 1.
/;— >oo 11—°°
But if we fix p and q with 0 < p < q~< 1, an analogue of Theorem 1 can often be proved with
p„ replaced by /?, q„ replaced by q, and a j(n), y ,-(«)
defined as F n '
np +
J-
. , 9,
L^ fo(ai(n)'Jl 9 J
In fi(a,(n)-9u ... , 9 J
respectively,
where 9X, ... , 9m are estimates of 9®, ... , 0° based on { Ynp, Y„p+L , ... , Ynq). Then, if we
are willing to ignore departures from the hypothesis in the tails of the distribution, we can stil
use only the order statistics { Y„p, Ynp+L) Y„q).
6. APPLICATIONS
For the case where m = 2 and 0b 02 are location and scale parameters respectively, vari
ous tests based on T(n) have been investigated in [2] and [6]. In particular, [2] contains vari
ous analogues of the familiar Wilk-Shapiro test, first proposed in [3]. The tests in [2] and [6
were based on T(n) because it made the analysis easier. The present paper gives a theoretics
justification for basing tests on these sparse order statistics alone.
For the location and scale parameter case, we can construct other tests, as follows. Fc
j= 0,1, .... K„, let V,(n) denote yfn f„
np„+jL„
let Z.(n) denote Vn — „ g
' 02°
nPn+JLn
Y p -
"P„+ <Ln "
Y p~ l
1 np +jL rn
npn+jL„
np„ + jLn
an
It was shown in [4] that for all asymptotic probability calculations, we can assume that the joii
distribution of { ^o^)- • • • > ^k (")) 's given by the normal pdf
c„exp
SPARSE ORDER STATISTICS
405
n(Ln- 1)
2L2
L„\q
L„vK
+
^T^tr + ,?(v'-v'-i):
1 15
Under the additional condition that — - e + 2y < 0, it can be shown that for all
asymptotic probability calculations we can assume that the joint distribution of
[Z0(n), . . . , ZK («)} is given by the normal pdf just described. Then, if we define Pi as
Qn
1 +
V np,,
Qo-
1
Vl - Qn
, p2 as
V np„
Pi, and the observable random variables Q0,
V^yj^\^/—(g(G-lip„)) Ynp+pxg{G-\qn))Ynq
L,2 I V np„
/n(L,-i:
«G-
J np„+jL„
Yn„+lL+P2g(G^(q„)Yl
'"'„
np„+ij- \)Ln
npn + (/-!)/.„
for 7 = 1, . . . , A',,, a straightforward computation shows that for all asymptotic probability cal-
culations we can assume that Q0, Qx, ... , QK are independent, each with a normal distribu-
tion with standard deviation 82, and with
/n(L,-\) 1 /7~
V L„2 I V npn
V
/»„(0)+P ,/»„(/:„)
£10/} =
n L" l [h„(j) - hn{j-\)+p2h„{K„)\, for j= 1, .... *„,
where /?„(/) = £
L oo
0/C-l
= 0;° + 02UG
:-1
np„+jL„
np» + jL„
np„+jL„
np„+jLn
. If the hypothesis is true, F„
, and in this case we can write E[Qt} = A„(j)9\ + Bn(j)92, where
An(J), Bn(J) are known, for j = 0, . . . , A",,. So we have reduced our hypothesis testing prob-
lem to the following: we observe random variables Q0, Q\, ... , QK which are independent
and normal, each with the same standard deviation 02°, which is unknown. The problem is to
test the hypothesis that £{(?,■} = An(J)0? + B„(J)9^, for some unknown 0,°, where A„(J) and
Bn(j) are known values, for j = 0,1, ... , K„, against alternatives that E{Qf) = An(j)9\ +
Bn(j)92 + A „(/"), where A„(/) is unknown.
The formulation of the problem just described makes it easy to construct various tests.
For example, suppose for convenience that Kn + 1 is a multiple of 4. Then it is possible to
find — (AT„ + 1) sets of nonrandom quantities
' = 0,
K„-3
X„(4/), X„(4/ + l), X„(4/ + 2), X„(4/ + 3);
such that the* — (K,, + \) quantities £>„(/) = X„(4/)Q, +X„(4/ + !)(?,+,+
L.
WEISS
K„-
3
1
—
0,
1
4
406
\„(4i + 2)Qi+2 + \n(4i + 3)Ql+i /' = 0, 1, ... , — 'Lj— can be assumed to be independent
normal random variables, each with unknown standard deviation 0°, and with E[Q„(i)} = £
— _ /=0
X„(4/4\/)A„(4/ +j) = A„(/), say, where A„(/) is unknown. Then the hypothesis to be tested
is that A„(/) = 0 for all /. But if we examine the development above, we see that (A„(/)} is
4/
not completely arbitrary. Instead, A„(/) = q„
, where q„ (v) is a continuous function
Kn-l
of v for 0 < v < 1. If we have some particular alternative qn (v) against which to test the
hypothesis, a likelihood ratio test can be constructed. If we want to test against a very wide
class of alternatives, we could apply one of various nonparametric tests. For example, we could
base a test on the total number of runs of positive and negative elements in the sequence
{(?„(/)}. If the hypothesis is true_, there should be a relatively large number of runs, but if the
hypothesis is false, neighboring Qn(iYs would tend to have the same sign, decreasing the total
number of runs. Other tests for an analogous problem are developed in [7].
j -£
In the case where g(x) — /— e 2 , all the conditions imposed above hold if we take p„
, _ x 1 1 Ai A2 A2
= 1 - q„ - 0(« "), e = - - A,, 8 = — - — - A2, y = — - A3, p = — - A3 - A4,
where Ab A2, A3, A4 are very small positive values chosen so that e > 0, 8 > 0, y > 0, and p
> 2A,.
REFERENCES
[1] Hensler, G.L., K.G. Mehrotra and J.E. Michalek, "A Goodness of Fit Test for Multivariate
Normality," Communications in Statistics, A6, 33-41 (1977).
[2] Jakobovits, R.H., "Goodness of Fit Tests for Composite Hypotheses Based on an Increasing
Number of Order Statistics," Ph.D. Thesis, Cornell University (1977).
[3] Shapiro, S.S. and M.B. Wilk, "An Analysis of Variance Test for Normality (Complete Sam-
ples)," Biometrika, 52, 591-611 (1965).
[4] Weiss, L., "Statistical Procedures Based on a Gradually Increasing Number of Order Statis-
tics," Communications in Statistics, 2, 95-114 (1973).
[5] Weiss, L. "The Asymptotic Sufficiency of a Relatively Small Number of Order Statistics in
Tests of Fit," Annals of Statistics, 2, 795-802 (1974).
[6] Weiss, L., "Testing Fit with Nuisance Location and Scale Parameters," Naval Research
Logistics Quarterly, 22, 55-63 (1975).
[7] Weiss, L., "Asymptotic Properties of Bayes Tests of Nonparametric Hypotheses," Statistical
Decision Theory and Related Topics, //Academic Press, 439-450 (1977).
[8] Weiss, L., "The Asymptotic Distribution of Order Statistics," Naval Research Logistics
Quarterly, 26, 437-445 (1979).
ON A CLASS OF NASH-SOLVABLE BIMATRIX GAMES
AND SOME RELATED NASH SUBSETS
Karen Isaacson and C. B. Millham
Washington State University
Pullman, Washington
ABSTRACT
This work is concerned with a particular class of bimatrix games, the set of
equilibrium points of which games possess many of the properties of solutions
to zero-sum games, including susceptibility to solution by linear programming.
Results in a more general setting are also included. Some of the results are be-
lieved to constitute interesting potential additions to elementary courses in
game theory.
1. INTRODUCTION
A bimatrix game is defined by an ordered pair <A,B> of m x n matrices over an ordered
field F, together with the Cartesian product X x Y of all m-dimensional probability vectors
x € X and all n-dimensional probability vectors y 6 Y. If player 1 chooses a strategy (probabil-
ity vector) x and player 2 chooses a strategy v, the payoffs to the two players, respectively, are
xAy and xBy, where x and y are interpreted appropriately as row or column vectors. A pair
<x*,y*> in A' x Y is an equilibrium point of the game <A,B> if x*Ay* ^ xAy* and x*By* ^
x*By, for all probability vectors x and y.
A Nash-solvable bimatrix game is one in which, if <x*,y*> and <x',y'> are both equili-
brium points, then so are <x*,y'> and <x',y*>. It is well known that 0-sum bimatrix games
(fly + bjj = 0, all i,j) are Nash-solvable, and that this property extends to constant-sum games
(fly -I- b,j = k, all ij, for some k € F). It is also well known that in the constant-sum case all
equilibrium points are equivalent in that they provide the same payoffs to both players. This
work generalizes, slightly, that contained in such sources as Luce and Raiffa (9) and Burger
(2), and represents a very small step toward the solution of the open problem of characterizing
Nash-solvable games. In the following, Ar will be the rth row of A and A.} the y'th column of
A, and similarly for B. The inner product of 2 vectors u, v in E" will be denoted by (w,v). The
ordered pair is <a,v>.
2. ROW-CONSTANT-SUM BIMATRIX GAMES
DEFINITION 1: An m x n bimatrix game <A,B> is row-constant-sum if, for each
I / = 1, ... m, there is a k, € Fsuch that au + by = k,, j = 1, . . . n.
THEOREM 1: Let <x*,y*> and <x',y'> be two equilibrium points for a row-constant-
sum game <A,B>. Then <x*,^*> and <x',y'> are interchangeable, and they are equivalent
m m
for PI (player 1). They are equivalent for P2 (player 2) if and only if £ x*k, = £ x,'/c,.
;=1 (=1
407
408 K. ISAACSON AND C.B. MILLHAM
PROOF: It is well known and easily proved that <x*,y*> is an equilibrium point for
<A,B> if and only if x* > 0 implies that (A/., y*) = max(Ak , y*) and y* > 0 implies that
k
(x*,B.i) = max(x*,B.k), for all i, j. Accordingly, let /3 * = x*By*. Then v* > 0 implies
(x*,fl.y) =/3* - £xffc, - (x*, /*.,) > £ xft - (x*„4.r) for all r, or (x*. >*.,) ^ (x*. ^.7), and
x* > 0 implies (y4,.,v*) = a* = max(/^., v*). If <x',v'> is any equilibrium point, then we
k
have that xMv* ^ x '4y* (because x* is in equilibrium with y*) ^ x'Ay' (because y' is in
equilibrium with x ' and by the above argument) ^ x*Ay' (because x'is in equilibrium with y')
^ x*Ay* (because y* is in equilibrium with x* and by the above argument). Thus <x*,y*>
and <x',y'> are interchangeable for PI, and equivalent for PI. To show they are interchange-
able for P2, note that xBy' = ]T x^kj — x'Ay' = £ x,'/c, — x'Ay*, or x'By' = x'By*. One can simi-
larly show that x*By* = x*By\ completing this part of the proof.
Suppose now that £ x,'/c, = £ x*k,. Since x*Ay* = x'Ay*, we have that £ x//c, —
x'Ay* = £ x* kj - x*/4v*, or x'By* = x* Z?v*, and equivalence follows.
i
On the other hand, suppose x*Ay* = x*Ay' = x'Ay* = x'Ay', x*By* = x*By' = x'By* =
x'By'. Then £ xfk, — x*Ay* = £ x,'/c, — x'/4y*. Since x'/ly* = x*Ay*, it follows that £ x*/c, =
/ i i
£ X/'A:/, and the proof is complete.
i
It is well known that, if A (=—B) is the payoff matrix for a zero-sum game, optimal stra-
tegies <x*,y*> for the game satisfy the so-called "saddle-point" property: x*Ay > x*Ay* ^
xAy* for all probability vectors x and y, and that, conversely, if <x*,y*> is a saddle-point of
the function xAy, then <x*,y*> is a solution to the game A.
THEOREM 2: <x*,y*> is an equilibrium point of the row-constant-sum game <A,B>
if, and only if, <x*,y*> is a saddle-point of the function <$>(x,y) = xAy.
PROOF: \[<x*,y*> is an equilibrium point of <A,B>, then x*Ay* ^ xAy* for all
x 6 J, from which half of one implication follows. Now, let K be the m x n matrix
K
*l
*,.
.A:,
*2
*2.
•*2
*„
*„.
./c„
of row constants £,, au + by = /c,.
Since x*By* > x*By for all y e K, we have x*(A^ - /D.y* ^ x*(K — A)y, from which
x*Ay ^ x*^v* [since x*Aj>* = x*Ky = £ Jc^fe/l . This completes one implication. Suppose now
that <x*,y*> is a saddle-point of <I>. From x*Ay ^ x*/4v* it follows that v*= 0 if' (x*,A.j) >
m
a = m\n(x*,A.k), from which, if y* > 0, £ x*A:, - (x*,A.,) ^ £ x*A:, - (x* ,A.k) for all k, or
* ,= i
(x*,^.,-) >(x*,B.k) for all fe Finally, it follows from x*Ay* ^ xAy* for all x that x*= 0 if
(/4,.,v*) < ma\(Ak.,y*), and the proof is complete.
k
The implication is that any solution of A as a 0-sum game is also an equilibrium point of
the row-constant-sum bimatrix game <A,B>, and conversely. Thus, a solution of A found by
NASH-SOLVABLE BIMATRIX GAMES 409
linear programming will provide an equilibrium point <x*,y*> for <A,B> and the payoff a
for PI. The payoff fi for P2 must be calculated via x*By*, or via £ x*kt — a.
i
3. A SOMEWHAT MORE GENERAL SETTING
We now consider the m x n matrix A, we let B be m x n (not necessarily in row-
constant-sum with A) and we henceforth let X x Y be the set of solutions to A as a 0-sum
game. The following theorem then follows.
THEOREM 3: Let <x*,y*> € X x Y. In order for <x*,y*> to be an equilibrium point
of <A,B> regarded as a bimatrix game, it is necessary and sufficient that x*By* ^ x*By for all
probability vectors y, or, for x*(— B)y > x*(—B)y*. It is clearly sufficient for <x*,y*> to also
be a solution to (-/?), regarded as a 0-sum game.
The proof is omitted, as it follows immediately from the definition of equilibrium point.
The following comment is made, however: if <A,B> is row-constant-sum, a point <x*,y*>
that solves A as a 0-sum game and is an equilibrium point of <A,B> , will not necessarily solve
(-B) as a 0-sum game, because the condition x*(— B)y* ^ x(—B)y* holds if and only if
m in
x*Ay* — £ xfkj ^ xAy* — £ Xjkj, or x* Ay* — xAy*^ ]£ kj(x* — x,). Thus, the condition
/= 1 i i- 1
that <x*,j>*> also solve (—5) as a 0-sum game is extremely strong. This illustrates a major
difference between the constant-sum case (in which the above condition will hold if <x*,y*>
solves A as a 0-sum game) and the row-constant-sum case. It is also logical to ask if there are
conditions on A and B which would cause an equilibrium point of <A,B> to also solve A and
-B as separate 0-sum games. The conditions are inescapable: yf > 0 must imply
(x*,A.f) = min (x*,A.k) and x* > 0 must imply {Br,y*) = min (Bk.,y*). Since, for example,
k k
to be an equilibrium point of <A,B> it is necessary that y* > 0 imply
(x*,B.f) = max (x*,B.k), any game satisfying these conditions must be heavily restricted.
k
Finally, it is noted that if there are common saddle-points of A and (—5), which are therefore
equilibrium points of the bimatrix game <A,B>, each of these saddle-points will necessarily
provide the same payoffs a, (3 to the respective players (note the contrast of the row-constant-
sum case with the constant-sum case).
DEFINITION 2: A Nash Subset for a game <A,B> is a set S = [<x,y>} of equili-
brium points for <A,B> such that, if <x,y> and <x',y'> are in S, so are <x,y'> and
<x',y>. See (6) and (13) for related material.
THEOREM 4: Let A and B be m x // matrices over the ordered field F, and let X x Kbe
the set of all solutions to A regarded as a 0-sum game. In order for X x Y to constitute a Nash
subset of equilibrium points for <A,B>, regarded as a bimatrix game, it is necessary and
sufficient that K(X) = [k\ {x,*A.k) = min (x*,A.,), all x* € X)c K' (X) = [k\ (x*.B.k) =
max (x*.B.i), all x* <E X ).
PROOF: Write K = K(X), K' = K'(X), and let KCK'. Then because any <x*,y*> in
X x Y solves A as a 0-sum game, x*Ay ^ x*Ay* ^ x4y* for all <x*,y*> in J x Y and all
probability vectors x, y. Also, .y*= 0 if {x*,A..) > min (x*,A.k), or if y # K C A". Hence
A:
y*= 0 if (x*,B.i) < max (x*,5.,) for all >>*€ X, any x* € A\ and <x*,.y*> is an equilibrium
i
point for </L5>, for all <x*,y*> € J x Y. Suppose there exists k' € K - K', so that for
410
K. ISAACSON AND C.B MILLHAM
some x* 6 X, (x*,B.k) < max (x*,5.,) but (x*,A.k) = m'm(x*,A.,). Since it is known that
there exists v' 6 Y (see (1), page 52) such that y£- > 0, it follows that y' cannot be in equili-
brium with x* for <A,B> regarded as a bimatrix game, a contradiction. This completes the
proof.
COROLLARY 1: Let X* x Y* be any subset of X x K, the set of all solutions to A
regarded as a 0-sum game. In order for X* x Y* to be a set of interchangeable equilibrium
points (a Nash subset) for <A,B> regarded as a bimatrix game, it is sufficient that K(X*) =
[k\ (x*,A.k) = min (x*,A.,) for all x* e X*} = K'(X*) = {k\(x*,B.k) = max (**,£.,) for all x*
€ X*}.
COROLLARY 2: Let X' C A, and let A" (A") be denned as above, and let
Y'=[y € r|v, > 0 implies 7 6 A'(A')). Then A" x r is a Nash subset for <A,B>.
Finally, we consider the construction of all matrices B such that A x Y, the set of solu-
tions to A as a 0-sum game, will also be a set of equilibrium points for <A,B> regarded as a
bimatrix game.
THEOREM 5: Let A be an m x n matrix over f, with Ax K its solutions as a 0-sum
game. Then a matrix B can be constructed such that Ax Y is a Nash subset for <A,B>
regarded as a bimatrix game. The equilibrium points <x,y> in A x Y may or may not be
equivalent for P2, depending on construction. Further, all matrices B such that Ax Y is a
Nash subset for <A,B> are constructed as described.
PROOF: Let x , x, . . . xk be the extreme points of A, and assume that x\ . . . xr, r ^ k,
xl
are a maximal linearly-independent subset of x ,
xk. Let x ~
be regarded as the
matrix of a linear transformation from Em to E\ taken with respect to a basis of unit vectors,
and let c\c2, ... cm~r be a basis for the nullspace of x- Let 01. /32, ■■■ftr be scalars. Let ,
v\ y2, ... v' be the extreme points of the set Y, and let KJ = {/| yj > 0}. Let KY = U A,,/.
Let D = UKx'.rf) =/3/(l ^ 7 < r}, and let d\ ... <T_r+1 be m - r + 1 (if some B, * 0)
linearly-independent solutions to the system of r equations in m variables. For j € Kv, let
m — r+\ m — r m— r + 1
B.j = £ a,,^' + £ \/7c' where £ a „= 1 for at least, £ a/7 = a for some a ^ Oj, all
r
y. Then, if x € A, there are scalars y,, / = 1, . . . r, such that x= £ y,x', and for
y € Ky,(x, B.,) =
= Iy/j8f Of «- D-
/• r m — r + \ m — r
r=l [/-i />=i i=l '=i
After all £.,, y€ATK, have been constructed, for j $ KY, let B., be such that
(x'.B.j) < (x',B.h), h € ATK, for all extreme points x', i = 1, ...A:. Then, for all y* € K, x*
€ A
with x'
Erfx'
, x*By* = ^y^3, ^ x*£v for all probability vectors v. Hence.
/=i
Ax Y is a set of interchangeable equilibrium points for <A,B> that would, for example, be
equivalent if /3, = |3, for all /, /
Finally, suppose there is a matrix 5 such that Ax Y is a Nash subset for <A,B> but
which does not have the above construction. Then there is a column B.r j € KY, such that j
NASH-SOLVABLE BIMATRIX GAMES 411
m—r+\ m—t •m—r+\ m—r
either B.t ^ £ «,-,</' + £ \y/c' for any coefficients a „, or #, = ]£ «,,</' + £ \/Vc' but
i=i 1-1 f=i /=i
ffi-c+i
£ a„ = a, ^ aA = 1, k€KY, k^j. In the first instance we note (x',B./) = 8h / = 1, .../•,
and we contradict the assumption that d] , ... a'"'"' + 1 are a maximal linearly-independent set of
m-r+\
solutions to ix'.d) = Bjt j = 1, ... r. In the second instance, if ]£ a/7 = a, ^ 1, let x =
]£ y,x'. Then (x.B.,) = a7£ y, B, ^ (x,5.A)=^ -y,/3, for other k € /Tr, so that any
/= i ' /= i i= l
equilibrium strategy y will either exclude j, or include y and exclude any k such that ak = 1.
Either contradicts the definition of Ky.
Note that the matrix A is used only to define X x K Given the set of Jf x y, it follows
that both A and 5 could be constructed as described, assuming the appropriate dimensionality
conditions.
4. CONCLUSIONS
It is hoped that this slight extension of previously published material regarding Nash-
solvable bimatrix games will lend itself to inclusion in future texts in game theory and opera-
tions research covering 2-person, 0-sum finite games (matrix games). Clearly, nearly any state-
ment that can be made about solutions of matrix games can also be made about the somewhat
more interesting row-constant-sum bimatrix case, and the usual methods for finding such solu-
tions carry over with the minor modifications indicated. The reader is also referred to the
excellent text by Vorobjev (21), and his discussion on "almost antagonistic" bimatrix games
(pp. 103-115) for related interesting material.
BIBLIOGRAPHY
[1] Bohnenblust, H.F., S. Karlin, and L.S. Shapley, "Solutions of Discrete, Two-Person
Games," Contributions to the Theory of Games, Annals of Mathematics, Studies 24,
Princeton University Press (1950).
[2] Burger, E. Theory of Games. Prentice-Hall, Englewood Cliffs, New Jersey (1963).
[3] Gale, D. and S. Sherman, "Solutions of Finite Two-Person Games," Contributions to the
Theory of Games, Annals of Mathematics, Studies 24, Princeton University Press
(1950).
[4] Heuer, G.A., "On Completely Mixed Strategies in Bimatrix Games," The Journal of the
London Mathematical Society, 2, 17-20 (1975).
[5] Heuer, G.A., "Uniqueness of Equilibrium Points in Bimatrix Games," International Jour-
nal of Game Theory, 8, 13-25 (1979).
[6] Heuer, G.A., and C.B. Millham, " On Nash Subsets and Mobility Chains in Bimatrix
Games," Naval Research Logistics Quarterly 23. 311-319 (1976).
[7] Kuhn, H.W., "An Algorithm for Equilibrium Points in Bimatrix Games," Proceedings of
the National Academy of Sciences, 47, 1656-1662 (1961).
[8] Lemke, C.E. and J.T. Howson, Jr., "Equilibrium Points of Bimatrix Games," Journal of the
Society for Industrial and Applied Mathematics, 12, 413-423 (1964).
[9] Luce, R.D., and H. Raiffa, Games and Decisions, John Wiley and Sons, New York (1957).
, [10] Mangasarian, O.L., "Equilibrium Points of Bimatrix Games," Journal of the Society for
Industrial and Applied Mathematics, 12, 778-780 (1964).
[11] Millham, C.B., "On the Structure of Equilibrium Points in Bimatrix Games," SIAM Review
10, 447-449 (1968).
412 K. ISAACSON AND C.B. MILLHAM
[12] Millham, C.B., "Constructing Bimatrix Games with Special Properties," Naval Research
Logistics Quarterly 19, 709-714 (1972).
[13] Millham, C.B., "On Nash Subsets of Bimatrix Games," Naval Research Logistics Quarterly
21, 307-317 (1974).
[14] Mills, H., "Equilibrium Points in Finite Games," Journal of the Society for Industrial and
Applied Mathematics, 8, 397-402 (1960).
[15] Nash, J.F. Jr., "Two-Person Cooperative Games," Econometrica, 21, 128-140 (1953).
[16] Owen, G., "Optimal Threat Strategies in Bimatrix Games," International Journal of Game
Theory, 1, 3-9 (1971).
[17] Pugh, G.E. and J. P. Mayberry, "Theory of Measure of Effectiveness for General-Purpose
Military Forces, Part I: A Zero-Sum Payoff Appropriate for Evaluating Combat Stra-
tegies," Operations Research 21, 867-885 (1973).
[18] Raghavan, T.E.S., "Completely Mixed Strategies in Bimatrix Games," The Journal of the
London Mathematical Society, 2, 709-712 (1970).
[19] von Neumann, J. and O. Morganstern, Theory of Games and Economic Behavior, Princeton
University Press, Princeton, New Jersey (1953), 3rd Ed.
[20] Vorobjev, N.N., "Equilibrium Points in Bimatrix Games," Theoriya Veroyatnostej i ee
Primeneniya 3, 318-331 (1958).
[21] Vorobjev, N.N., Game Theory Springer-Verlag, New York, Heidelberg, Berlin (1977).
OPTIMALITY CONDITIONS FOR CONVEX
SEMI-INFINITE PROGRAMMING PROBLEMS*
A. Ben-Tal
Department of Computer Science
Technion — Israel Institute of Technology
Haifa, Israel
L. Kerzner
National Defence
Ottawa, Canada
S. Zlobec
Department of Mathematics
McGili University
Montreal, Quebec, Canada
ABSTRACT
This paper gives characterizations of optimal solutions for convex semi-
infinite programming problems. These characterizations are free of a constraint
qualification assumption. Thus ihey overcome the deficiencies of the semi-
infinite versions of the Fritz John and the Kuhn-Tucker theories, which give
only necessary or sufficient conditions for optimality, but not both
1. INTRODUCTION
A mathematical programming problem with infinitely many constraints is termed a "semi-
infinite programming problem." Such problems occur in many situations including production
scheduling [10], air pollution problems [6], [7], approximation theory [5], statistics and proba-
bility [9]. For a rather extensive bibliography on semi-infinite programming the reader is
referred to [8].
The purpose of this paper is to give necessary and sufficient conditions of optimality for
convex semi-infinite programming problems. It is well known that the semi-infinite versions of
both the Fritz John and the Kuhn-Tucker theories fail to characterize optimality (even in the
linear case) unless a certain hypothesis, known as a "constraint qualification," is imposed on the
problem, e.g. [4], [12]. This paper gives a characterization of optimality without assuming a
constraint qualification.
*This research was partially supported by Project No. NR047-021, ONR Contract N00014-75-C0569 with the Center lor
Cybernetic Studies, The University of Texas and by the National Research Council of Canada.
413
414 A. BEN-TAL, L. KERZNER. AND S. ZLOBEC
Characterization theorems without a constraint qualification for ordinary (i.e. with a finite
number of constraints) mathematical programming problems have been obtained in [1]. It
should be noted that the analysis of the semi-infinite case is significantly different; the special
feature being here the topological properties of all constraint functions including the particular
role played by the nonbinding constraints.
The optimality conditions are given in Section 2 for differentiate convex semi-infinite
programming programs, whose constraint functions have the "uniform mean value property."
This class of programs is quite large and it includes programs with arbitrary convex objective
functions and linear or strictly convex constraints. For a particular class of such programs,
namely the programs with "uniformly decreasing" constraint functions, the optimality conditions
can be strengthened, as shown in Section 4. A comparison with the semi-infinite analogs of the
Fritz John and Kuhn-Tucker theories is presented in Section 5. An application to the problem
of best linear Chebyshev approximation with constraints is demonstrated in Section 6. A linear
semi-infinite problem taken from [4], for which the Kuhn-Tucker theory fails, is solved in this
section using our results.
2. OPTIMALITY CONDITIONS FOR PROGRAMS HAVING UNIFORM
MEAN VALUE PROPERTY
Consider the convex semi-infinite programming problem
(P)
Min f°(x)
s.t.
fk(x,t) < 0 for all f <E Tk, k € P A (1, ... , p}
x € R"
where
/" is convex and differentiable,
fk(x,t) is convex and differentiable in x for every / € Tk and continuous in / for every x,
Tk is a compact subset of R1 (/ ^ 1).
The feasible set of problem (P) is
F = (x € R":fk(x,t) ^ 0 for all t € Tk, k € P).
Note that F\s a convex set being the intersection of convex sets.
For x* € F,
T*k A [t € Tk: fk{x*,t) = 0},
P* A {k € P. T*k * 0).
A vector d € R" is called a feasible direction at x* if x* + d € F. For a given function fk(-,t),
k € {0} U P and for a fixed t 6 Tk, we define
Dk(x*,t) A [d e R". 3 a > 0 ^ fix* + ad,t) = fk(x*,t) for all 0 ^ a ^ a).
OPTIMALITY FOR SEMI-INFINITE PROGRAMMING 415
This set is called the cone of directions of constancy in [1], where it has been shown that, for a
differentiable convex function fk(-,t), it is a convex cone contained in the subspace
[d: d'Vfkix*,t) = 0}.
Furthermore, if fki-,t) is an analytic convex function, then Dk(x*,t) is a subspace (not depend-
ing on x*), see [1, Example 4]. In the sequel the derivative of / with respect to x, i.e.
V xfix,t), is denoted by Vfix.t).
Optimality conditions will be given for problem (P) if the constraint functions have the
"uniform mean value property" which is defined as follows.
DEFINITION 1: Let T be a compact set in R'. A function f\R" x T — R has the uni-
form mean value property at x € R" if, for every nonzero d € R" and every a > 0, there
exists a = a id, a), 0 < a ^ a such that
(MV) fix + adj) ~ f{x,t) > d' V/(x + ad,t) for every t <E T.
a
If f(-,t) is a linear function in x for every t € T, i.e. if /is of the form
f(x,t) = git) + Y,x,g,(t),
or if /(-,/) is a differentiable strictly convex function in x for every t € T, i.e. if
f(kx + (1 - \)y,r) < \/(x,/) + (1 - \)f(y,t) for every / € 7
where y € /?" is arbitrary, y ^ x, 0 < X < 1, and if fix, ■) is continuous in t for every x, then
/has the uniform mean value property. For a linear function / one finds d'Vfix + ad.t) =
n
£ djgtit) and (MV) is obviously satisfied. The mean value property for strictly convex func-
tions follows immediately from e.g. [14, Corollary 25.5.1 and Theorem 25.7].
EXAMPLE!: Function
fix,t) = t2[ix - t)2 - t2] for every K T = [0, 1]
is neither linear nor strictly convex in x € R for every t € T. However /' has the uniform
mean value property. Function
/2(x,,x2,f) =
x,2 4- tx2ix2 - t) if x2 < — t
X? + '_ (x2 - t+ 1) (x2- 1) if X2 ^ y /
for every / € T = [0, 1] does not have the uniform mean value property at the origin. Note
that f2 is convex and differentiable in x € R2 for every t € T and continuous in f G T for
every x. This function has provided counterexamples to some of our early conjectures.
Optimality conditions will now be given for problem iP).
THEOREM 1: Let x* be a feasible solution of problem iP) where fk, k € P* have the
uniform mean value property. Then x* is an optimal solution of iP) if, and only if, for every
a * > 0 the system
416 A BEN-TAL. L KERZNER, AND S. ZLOBEC
(A) d'Vfix*) < 0 ,
(B) d'Vfix* + a *d,t) < 0 for all t € T*k,
(c) d'vfix*+a*d,t) _ _L for all f € r VT>
/*UV) a* x
is inconsistent.
PROOF: We will show that x* is nonoptimal if, and only if, there exists a * > 0 such
that the system (A), (B), (C) is consistent. A feasible x* is nonoptimal if, and only if, there
exist a > 0 and d £ R'\ d ^ 0, such that
(1) f°(x* + ad) < fix*)
(2) fk(x* + ad.t) ^ 0 for every t 6 Tk,
k € P.
By the convexity of f and the gradient inequality, the existence of a > 0 satisfying (1) is
equivalent to
d'Vfix*) < 0.
By the continuity of fi-.t), k € P, the constraints with k € P\P* can be omitted from discus-
sion. We consider (2), for some given k € P*, and discuss separately the two cases: t € T*
and / € Tk\T*. Thus (2) can be written
(2-a) fix* + ad.t) ^ 0 for every t <E FJ
(2-b) fk{x* + ad.t) < 0 for every t € Tk\T*k.
Consider first (2-a) for some fixed k € P*. By the convexity and uniform mean value property
of A
(3) f(x* + ad.t) > fix*,t) + ad'Vfix* + akd,t) for all 1 € T*k
and for some
0 < ak < a.
Since t € T* and a > 0, (2-a) implies
(4) d'Vfix* + akd,t) < 0.
Denote
(5) a = min [ak\.
k£P*
Clearly, a always exists (since Pis finite) and it is positive. By the convexity of fi-.t), (5)
and (4),
(6) d'Vfix* + ad.t) < d'Vfix* '+ akd,t) ^ 0.
On the other hand, the existence of a* > 0 such that, for some / € T* and all k € P*,
d'Vfix* +a*d,t) < 0
implies (2-a) with 0 < a ^ a*.
OPTIMALITY FOR SEMI-INFINITE PROGRAMMING
417
It is left to show that the existence of a > 0, such that (2-b) holds, is equivalent to the
existence of a > 0, such that (C) holds. Suppose that (2-b) holds for some a > 0. Then, by
the convexity and uniform mean value property, for k € P*,
0 > fix* + ad,t) > fk{x*,t) + ad'Vfix* + akd,t) for all / € Tk\T*k
and for some
(7)
Hence,
(8)
0 < ak . < a.
d'Vfix* + akd,t) 1
— > — — , since t € Tk\T*
a
f{x*,t)
> - — , by (7).
a*
Denote
(9)
a = min [ak\ > 0.
kiP'
Using the monotonicity of the gradient of the convex function fk(-,t), one obtains here
(10)
d'Vfk{x* + ocd,t) . d'Vfk(x* + akd,t) _
rki * \ ^ rk< * x for every 0 < a < afc.
fk(x*,t) fh{x*,t)
This gives
— ^ , by (10) and (8)
f(x*,t)
>
a
, by (9)
which is (C) with a* = a.
Suppose now that (C) is true for some a* > 0. Using again the monotonicity of the gra-
dient of the convex function fk(-,t), and the fact that fk(x*,t) < 0 for t 6 Tk\T*k, one easily
obtains
(11)
But
fk(x*,t) + a*d'Vfk(x* + ad.t) < 0, for every 0 < a < a *.
fk(x* +a*d,t) = fk(x*,t) +a*d'Vfk(x* + akd,t),
for some particular 0 < ak < a*, ak = ak(t)
by the mean value theorem
< 0, by (11)
which is (2-b) with a = a*.
Summarizing the above results one derives the following conclusion: If x* is not optimal
then there exists a* = min{a,a) > 0 such that the system (A), (B) and (C) is consistent. If
there exists a* > 0 such that the system (A), (B) and (C) is consistent, then there exist
a0 > 0 and a > 0 such that
418
A. BEN-TAL, L KERZNER, AND S. ZLOBEC
f°(x* + a0d) < f°{x*)
fix* + ad.t) ^ 0 for every t € T*k
fix* + ad.t) ^ 0 for every t 6 Tk\T*k
k € P*.
If one denotes
a = min{a0,a} > 0
then, again by the convexity of fi',t), k € {0} U P, (12) can be written
fix* + ad) < fix*)
fix* + ad,t) < 0 for every t € Tk,
k e P*
implying that x* is not optimal.
REMARK 1: Since Vfix,-) is continuous for every x in some neighbourhood of x*
(this follows from e.g. [14, Theorem 25.7]), condition (C) in Theorem 1 needs checking only
at the points t € Tk which are in
Nk A U Nit*),
— i*i.Tl
where Nit*) is a fixed open neighbourhood of t*. For the points / in Tk\Nk one can always find
a * which satisfies (C). This follows from the fact that for every a,
(13)
d'Vfjx* + ad,t)
^ -M
fix*,t)
for some positive constant M, by the compactness of Tk\Nk. Choose M in (13) large enough,
so that
(14)
a* A — - ^ a.
= M
Now,
*fw + «**/) > d'v/y + a-d,^ and
fix*,t) fix*,t)
>
, by (13) and (14).
EXAMPLE 2: The purpose of this example is to show that Theorem 1 fails if the con-
straint functions do not have the uniform mean value property. Consider
Min — x2
subject to
fixux2,t) < 0 for all t € [0,1]
OPTIMALITY FOR SEMI-INFINITE PROGRAMMING
419
where
f(xux2,t) =
x2 + tx2(x2 - t)
,3
.2
t
x{ + (x2- t + \){x2- 1) if x2 > -t.
if x2 < -t
I
2
Function / satisfies the assumptions of problem (P) but it does not enjoy the uniform mean
value property. The feasible set is
F =
0 € x7 ^ 1
and the optimal solution is x = (0, 1)'. However, for every a * > 0, the system (A), (B) and
(C) is inconsistent at x* = 0, a nonoptimal point. Since T* = [0,1], condition (C) is here
redundant, while (A) and (B) become, respectively,
-d2 < 0
2a V,2 + t(2a*d2- t)d2 < 0if2aV2<7
,3
loi*d{ +
(2- t)1
(2a *d2 - t) d2 ^ 0 if 2a *d2 > t.
The above system cannot be consistent for some a* > 0, because, if it were, the last inequality
would be absurd for small t € [0,1].
When the constraint functions (but not necessarily the objective function) are linear, i.e.
when (P ) is of the form
(L)
Min f"(x)
s.t.
Soit) + £ x,g?{t) < 0, for all t € Tk, k 6 P
then Theorem 1 can be considerably simplified.
COROLLARY 1: Let x* be a feasible solution of problem (L). Then x* is optimal if, and
only if, the system
(A)
(B2)
(C,)
d'Vf°(x*) < 0
£ d,g?(t) < 0, for all / € T*k
i=\
Z d,g*U)
i=\
gait) + Z xfefu)
> -1, for all t € FA?*.
k e p*
is inconsistent.
420
A. BEN-TAL, L. KERZNER, AND S. ZLOBEC
PROOF: Recall that linear functions have the uniform mean value property. If fk(-,t) is
linear, then for every t € Tk
Dk(x,t) - [d € R": d'Vfk(x,t) = 0}.
Thus (B) reduces to (B2). The left hand side of (C) reduces to the left hand side of (Ci),
which does not depend on a *. Moreover, a * on the right hand side of (C) can be taken
a* = 1, because whenever d satisfies (A) and (B2), so does d = — r d.
a
□
In many practical situations the sets Tk, k € P are compact intervals and the sets T*,
k € P* are finite. (This is always the case when fix*, •) are analytic functions not identically
zero.) For such cases condition (B) can be replaced by a finite number of linear inequalities.
COROLLARY 2: Let x* be a feasible solution of problem (/>), where fk, k € P* have
the uniform mean value property. Suppose that all the sets T*, k € P* are finite. Then a
feasible solution x* of problem (P) is optimal if, and only if, for every a * > 0 and for every
subset £1A of T* the system
d'Vf°(x*) < 0
• d'Vfk{x*,t) < 0, t € nk
d € Dk{x*,t), t € T*k\VLk
(A)
(B3)
(C)
is inconsistent.
d'Vfk(x* + a*d,() > _ _1_
fk(x*,t) a'
for all t € TJSTl ,
k € P*
An important special case of Corollary 2 is when the sets Tk themselves are finite. Then
problem (P) can be reduced to a mathematical program of the form
(MP)
Min f°(x)
s.t.
fix) < 0, k 6 P.
This is obtained by setting Tk = {kltk2, ■
^carcrrj and identifying {/*(*,*,): k; € Tk,
k = 1,2 p]
with
fk(x): k € PA {1,2 JT card Tk)
fk(x*) = 0}. Also {Dk(x*,k,): k, € Tk, k = 1,2,
k=\
Here
{A: € P:
p) is denoted by {DA.(x*): k € P}.
The major difference between the semi-infinite problem (P) and the mathematical prob-
lem (MP) is that for the latter the condition (C) is redundant; Theorem 1 then reduces to the
following result obtained in [1, Theorem 1].
0PT1MALITY FOR SEMI-INFINITE PROGRAMMING 421
COROLLARY 3: Consider problem (MP), where {fk: k 6 {0} U P) are differentiable
convex functions: R" —> R. A feasible solution x* of (MP) is optimal if, and only if, for every
subset Q of P* the system
d'Vf°(x*) < 0
d'Vfk(x*) < 0, k € n
d € Dk(x*), k € P*\Q,
is inconsistent.
PROOF: Here condition (C) becomes
d'Vfk(x* + a*d) > _ J_ .
/*(x*) "* a * '
for some a * > 0. Since here the set P\P* is finite, and hence compact, the redundancy of
condition (C) is shown as in Remark 1.
□
The following result gives a characterization of a unique optimal solution of problem (P).
THEOREM 2: Let x* be a feasible solution of problem (P), where fk, k € P* have the
uniform mean value property. Then x* is a unique optimal solution of problem (P) if, and
only if, for every a* > 0 there is no d satisfying conditions (5), (C) and
(A,) d'Vf°(x*) < 0 or d € D0(x*).
PROOF: Suppose that the system (A,), (B), (C) is inconsistent. Then so is the system
(A), (B), (C). Hence, by Theorem 1, x* \s an optimal solution. Suppose that_x* is not a
unique optimal solution. Then there exist a > 0 and d ^ 0 such that x = x* + ad is feasible,
which implies that d satisfies (B), (C) and f°(x*) = f°(x* + ad) . Since the set of all optimal
solutions of a convex program is convex, the latter implies fix*) = f"(x* + ad) for all
0 ^ a < a, i.e., d € D0(x*). Thus (/satisfies (A,), (B) and (C), which is impossible. There-
fore x* is the unique optimum. The necessity follows by a similar argument.
□
3. OPTIMALITY CONDITIONS FOR STRICTLY CONVEX FUNCTIONS
IN THEIR ACTUAL VARIABLES
This section can be skipped without hindering the study of Section 4.
In order to state our next result, which is a characterization of optimality for a subclass of
convex functions, i.e. strictly convex functions in their "actual variables", we adopt some
notions from [1].
For every k € P and t 6 Tk, denote by [£](/) (read "block /c"), the following index sub-
set of P: j € [k](t) if, and only if, yk: R — R, defined by
yk(-) A (xi, ... , Xj-h ; XJ+l, ... , x„)
is not a constant function for some fixed x,, . . . , x,_,, xl+u . . . ,x„. Thus, for a given r € Tk,
\k](t) is the set of indices of those variables on which fk(-,t) actually depends. These "actual
422
A. BEN-TAL, L. KERZNER, AND S. ZLOBEC
variables" determine the vector x(*](,,, obtained from x = (x,, ... , x„)' by deleting the vari-
ables [xf. j % [£](/)}, without changing the order of the remaining ones. Similarly, we denote
by f[k]u): /?«*[*) -» r the restriction of/* to Rc^kl
DEFINITION 2: A function f*\ R" x Tk -* R is strictly convex in its actual variables if
for every / € Tk its restriction f[k](,)(-,t) is strictly convex.
The above concept will be illustrated by an example.
EXAMPLE 3: Consider
f{x,t) = x,2 + txl t € T= [0,1].
Note that function /'(•,/) is not strictly convex for every / € T. Here
{1} if / = 0
{1,2} if/ € (0,1],
(*,) if / = 0
if/ € (0,1]
[11(f) =
*[i](i>
and
/
HO)
f x,2 if / = 0
x,2 + /x22 ifr € (0,1],
clearly a strictly convex function in its actual variables for every / € T. Hence, /' is a strictly
convex function in its actual variables.
COROLLARY 4: Let x* be a feasible solution of problem (P), where /*(•,/), A: € P*are
strictly convex in their actual variables and have the uniform mean value property. Then x* is
an optimal solution of (P) if, and only if, for every a * > 0 and every subset ft* C T* the
system
(A)
(B, ft)
(C)
(D,ft)
is inconsistent.
d'\7f°{x*) < 0
d'Vfix* + a*d,t) < 0 for all t € Tf$lk
d'Vfk(x* + a*d,t) . 1 . .. , T,T%
rkt * ^ > * for a11 * € r^r*
fk(x*,t) a*
d[k](D = 0 for all / € ft k,
k £ P*
PROOF: We know, by Theorem 1, that-x* is nonoptimal if, and only if, there exists
a* > 0 such that the system (A), (B), (C) is consistent. In order to prove Corollary 4, it is
enough to show that (B) is consistent if, and only if, for some subsets ft* C 71*, k € P*, the
system (B, ft), (D,ft) is consistent. Suppose that (B) holds. For every k € P* define
ft* A [t € T*k: d'S7fk{x* + ad,t) = 0 for all 0 < a < a*}.
Hence, by the mean value theorem, for every /6ft*
fix* + ad.t) = fk(x*,t) for all 0 < a < a *.
OPTIMALITY FOR SEMI-INFINITE PROGRAMMING 423
Since fki-,t) is strictly convex in its actual variables, this is equivalent to
d{k)(t) = 0 for all t € Q.k.
If t € T*k\Clk, then obviously rf'V fkix* + ad.t) < 0 for some 0 < a < a *, by (B). Thus
(B, ft), (D.ft) holds for ft^ = ft^. (Note that some or all ft^'s may be empty.) The reverse
statement follows from the observation that d[k\(,) = 0 implies d'Vfkix* + a *d,t) = 0.
D
If a function fki-,t) is strictly convex (in all variables X\, ... , x„) for every t € Tk,
k € />*, then Dk(x*,t) = {0}. This implies that the system (A), (B, ft), (C), (D, ft) is incon-
sistent for every nonempty Clk, k € P*. Thus condition (D,ft) is redundant. In fact, condi-
tion (C) is also redundant, which follows by the following lemma.
LEMMA 1: Let fix.t) be convex and differentiate in x € R" for every t in a compact
set T C R1 and continuous in t for every x If for some d € R",
(15) d'Vf(x*,t) < 0 for all t € T* = {t: f{x*,t) = 0},
then there exists a > 0 such that
• (16) fix* + ad,t) < 0 for all t 6 T\T*.
PROOF: It is enough to show that the hypothesis (15) and the negation of the conclusion
! (16), which is
"For every a > 0 there is t - ti.a) € T\r*such that fix* + ad.tia)) > 0,"
are not simultaneously satisfied. If this were true one would have the following situation:
For every a„ of the sequence a„ = 2~" there is a t„ = tn(an) € T\r*such that
(17) f(x* + and,t„(a„)) > 0, n = 0, \, 2, ...
Since Tis compact, {tn} has an accumulation point t € T, i.e. there is a convergent subsequence
' {/„} with /as its limit point. We discuss separately two possibilities and arrive at contradictions
in each case.
CASE I: t € T*. Since f(x*j) = 0 and d'Vf(x*,t) < 0, by (15), there exists a > 0
such that
! (18) fix* +ad,h < 0.
For all large values of index /, a„ < a and
(19) fix*,tn) < 0,
since t„, € T\T*. This implies
(20) fix* + ad,t„) > 0.
(If (20) were not true, one would have, for some particular «,,
(21) fix* +ad,t„) < 0.
Nowa,, < a, (19), (21) and the convexity of /imply
fix* +a„d,t„) ^ 0
424 A. BEN-TAL, L. KERZNER, AND S. ZLOBEC
which contradicts (17).) But (18) and (20) contradict the continuity of fix* + ad,-)-
CASE II: tJE T\T*. Since fix*,t) < 0, there exists a > 0 such that (18) holds, by the
continuity of fi-,t). The rest of the proof is the same as in Case I.
□
r
A characterization of optimality for strictly convex constraints follows.
COROLLARY 5: Let x* be a feasible solution of problem (P), where fi-,t) are strictly
convex for every t € Tk, k € P*. Then x* is an optimal solution of (P) if, and only if, for
every a * > 0 the system
(A) d'Vf°(x*) < 0
(B,) d'Vfix*,t) < 0 for all t € T%
k € P*
is inconsistent.
PROOF: First we recall that f, k € P*, under the assumption of the corollary, have the
uniform mean value property. If x* is not optimal, then the system (A), (Bj), (C) is con-
sistent, by Corollary 4. This implies that the less restrictive system (A), (B^ is consistent.
Suppose that the system (A), (B]) is consistent. Then for every k € P* there is ak > 0 such
that
fix* + akd,t) ^ 0 for all t € Tk\T*k
by Lemma 1. Let
a* A min{a^: k € />*).
By the convexity of /*, it follows that
fix* + a*d,t) < 0 for all t € Tk\T*k and k 6 P*.
This is equivalent to (C) of Theorem 1 (see (2-b)). Therefore the system (A), (B]), (C) is
consistent. This implies that the system (A), (B), (C) is consistent. (The reader may verify
this statement by the technique used in the proof of Lemma 2.) Hence x*is optimal, by Corol-
lary 4.
D
REMARK 2: Differentiate strictly convex (in all variables!) functions f do have the
uniform mean value property. However, this is not necessarily true in the case of convex func-
tions with strictly convex restrictions. In particular, function
fixux2,t) =
x,2 + tx2ix2 - t) lf *2 < J*
S . ,
(2- t)2
x2 + -7rJ-—(x2- t + l)(x2- 1) if x2 ^ -t
is differentiate and has strictly convex restrictions for every t € [0,1]. Note that
{1} iff = 0
{1,2} if/ € (0,1].
[Jfc](r) =
OPTIMALITY FOR SEMI-INFINITE PROGRAMMING 425
But function /does not have the uniform mean value property. One can show, however, that a
differentiable function which is strictly convex in its actual variables and such that [/c](/) is
constant over all compact set T, does have the mean value property.
4. PROGRAMS WITH UNIFORMLY DECREASING CONSTRAINTS
The applicability of Theorem 1 is, in general, obscured by the appearance of parameter a *
in conditions (B) and (C). The purpose of this section is to point out some of the topological
difficulties which arise in the removing of a * from condition (B). A class of convex functions
for which the optimality conditions can be stated without reference to a * in condition (B) will
be called the uniformly decreasing functions.
In what follows we assume that f:R" x T — • R is convex and differentiable in x € R" for
every / of a compact set Tin Rm. Further, Vf(x*,t) denotes Vfx(x*,t).
DEFINITION 3: Let /: R" x T — R and x* € R" be such that T* * 0. Then for a
given d € R", d ^ 0, the function /is uniformly decreasing at x* in the direction d, if (i) the
set
S(x*,d) A(K T*: d'Vf(x*,t) < 0}
is compact and if (ii) there exists a > 0 such that fix* + ad,t) = 0 for all t G T* for which
d € D(x*,t).
It is not easy to recognize whether a general convex function /is uniformly decreasing.
EXAMPLE 4: Consider the following functions from R x R into R:
f(x,t) = t2[(x - t)2 - t2], t 6 T (used in Example 1)
f{xj) = x2 - tx, t € T
f(x,t) = -tx, t € T.
These functions are all convex, f2 is actually strictly convex and /3 linear in x for every / € T.
:If T = [0, 1], then neither function is uniformly decreasing at x* = 0 in the direction d = 1.
However, if T = [1,2] then all three functions are uniformly decreasing at x* = 0 in the same
direction d = 1.
As suggested by the above example, a convex function /is uniformly decreasing at x* in
the direction d ^ 0, whenever Vf(x*, •) is continuous and the set
E(x*,d) A {/ € T*: d'Vf(x*,t) = 0}
is empty. Its complement
S(x*,d) = T*\E{x*,d) - T*
s then compact. In particular, all analytic functions not identically zero are uniformly decreas-
ng. However, a characterization of optimality for problem (P) with such constraint functions
s already given by Corollary 4.
An important uniformity property of convex functions with compact S(x*,d) follows:
LEMMA 2: Let f(x,t) be convex and differentiable in x, for every / in a compact set
r C R"\ and continuous in t, for every x € R". Suppose further that for some x* and d ^ 0
n R", the set S(x*,d) is nonempty and compact. Then there exists a > 0 such that
426
A. BEN-TAL, L. KERZNER, AND S. ZLOBEC
(22) fix* +ad,t) < 0, 0 < a < a
for all t € S(x*,d).
PROOF: Suppose that such a > 0 does not exist. Then there exists a sequence
[tj) C S(x*,d) and a sequence {a,}, a, = a jit,) > 0 such that
fix* + aid,t,) - 0,
fix* + ad.t,) < 0, 0 < a < a,
and
(23) fix* +ad,tj) > 0, a > a,
with inf (a,} = 0. Since S(x*,d) is compact, (/,) contains a convergent subsequence {/,.}. Let
t 6 S(x*,d) be the limit point of [tj}. Now
d'Vfix*,t) < 0
implies that there exists a > 0 such that
fix* + ad.t) < 0, 0 < a ^ a.
In particular,
(24) fix* + ad,h < 0.
For any e > 0 there exists j0 = jflie) such that
(25) \tj — t\ < € and a, < a for all j > j0.
Now (23) and (25) imply
(26) fix* + ad.t,) > 0 for all j > j0.
But the inequalities (24) and (26) contradict the continuity of fix* + ad,-).
D
EXAMPLES: Consider again
fixj) = x2- tx, rG F= [1,2].
This function is uniformly decreasing at x* = 0 in the direction d — \. The inequality (22)
holds for every 0 < a < 1, in particular a = — . If the above interval T is replaced by
T — [0,1], then f7 is not uniformly decreasing at x* = 0 with rf = 1. An a > 0 satisfying (22)
here does not exist.
A characterization of optimality for programs (F), with constraint functions which have
the uniform mean value property and are uniformly decreasing, follows.
THEOREM 3: Let x* be a feasible solution of problem (/>), where fk, k € P* have the
uniform mean value property. Suppose also that /*, k € P* are uniformly decreasing at x* in
every feasible direction d. Then x* is an optimal solution of iP) if, and only if, for every
a * > 0 the system
(A) d'Vf°ix*) < 0,
d'Vfkix*,t) < 0 or d € Dkix*,t)
for all t € T% '
(B4)
OPTIMALITY FOR SEMI-INFINITE PROGRAMMING
427
(C)
is inconsistent.
d'VfHx* +a*d,t) > _ _1_
fk(x*,t) a*
for all t € Tk\T*k,
k e p*
PROOF: Parts (A) and (C) are proved as in the case of Theorem 1. It is left to show
that the existence of a > 0 satisfying (2-a) is equivalent to the consistency of (B4). It is clear
that (2-a) implies (B4). In order to show that (B4) implies (2-a) we use the assumption that the
functions {fk(x,t): k € P*} are uniformly decreasing at x* in the direction d. When (B4) holds,
then for every k € P* there exist ak > 0 and ak > 0 such that
(27)
by Lemma 2, and
(28)
fk(x* + ad.t) < 0, 0 < a < ak
for all t € Sk A [t € T*k: d'Vfk(x*,t) < 0),
fk(x* +ad,t) = 0, 0 < a < ag
for all t € T*k\Sk,
since d ^ 0. The latter follows by part (ii) of Definition 2 and the convexity of fk. Let
(29)
a A min(aA,aA0} > 0.
A€/>*
Clearly (27) and (28) can be written as the single statement (2-a) with a chosen as in (29).
□
The following example shows that the assumption that [fk{x,t): k € P*) be uniformly
decreasing at x* cannot be omitted in Theorem 3.
EXAMPLE 6: Consider the program
Min f°(x) = -x
s.t.
f(x,t) ^ 0, for all t € T = [0,1]
where
f(x,t) =
t{x - t)2 if x > t
0 if x ^ t.
The feasible set consists of the single point x* = 0, which is therefore optimal. One can verify,
after some manipulation, that the constraint function /has the uniform mean value property at
x*. (For every a > 0 there exists 0 < a ^ — a such that (MV) holds.) However, /is not
uniformly decreasing at x*. In order to demonstrate that Theorem 3 here fails, first we note
that T* = T = [0, 1], so the condition (C) is redundant. Since d = 1 is in the cone of direc-
tions of constancy D(x*,t) for every t € [0, 1], we conclude that the system (A), (B4) is here
consistent, contrary to the statement of the theorem. Therefore the assumption that the con-
straint functions be uniformly decreasing. cannot be omitted in Theorem 3.
428 A BEN-TAL, L. KERZNER, AND S. ZLOBEC
5. THE FRITZ JOHN AND KUHN-TUCKER THEORIES FOR
SEMI-INFINITE PROGRAMMING
In contrast to the characterizations of optimality stated in the preceding sections we will
now recall the Fritz John and Kuhn-Tucker theories for semi-infinite programming. In the
sequel we use the following concept from the duality theory of semi-infinite programming, e.g.
[3].
DEFINITION 4: Let / be an arbitrary index set, [p1: i € /} a collection of vectors in Rm
and [c,\ / € /} a collection of scalars. The linear inequality system
u'p1 ^ c„ for all / € /
is canonically closed if the set of coefficients {((/>')', c,): / € /} is compact in R'"+] and there
exists a point u° such that
(u")'p' < c„ for all i € /.
We will say that problem (P) is canonically closed at x*if the system
(B5) d'Vfk(x*,t) < 0 for all t € T%, k € P*
is canonically closed.
REMARK 3: All constraint functions of problem (P) can have the uniform mean value
property, or they can be uniformly decreasing, without problem (P) being canonically closed.
Lemma 3 below is a specialized version of Theorem 3 from [3], adjusted to our need. It
is related to the following pair of the semi-infinite linear programs:
(I) (ID
Inf u'p" Sup Y, CA/
s.t. s.t.
u'p'> c„ all / € / 2>'X,= P°
u € Rm k € S, X ^ 0,
where S is the vector space of all vectors (X,: /' € /) with only finitely many nonzero entries.
Denote by Vt and Vn the optimal values of (I) and (II), respectively.
LEMMA 3: Assume that the linear inequality system of problem (I) is canonically closed.
If the feasible set of problem (I) is nonempty and Vx is finite, then problem (II) is consistent
and Vu = Vx. Moreover, Vu is a maximum.
The concept of a canonically closed system' is used in the proof of the dual statement of
the following theorem.
THEOREM 4: ("The Fritz John Necessity Theorem") Let x* be an optimal solution of
problem (P). Then the system
(A) d'Vfix*) < 0
OPTIMALITY FOR SEMI-INFINITE PROGRAMMING
429
(B,)
d'Vf(x*,t) < 0 for all t € T*k,
k € P"
is inconsistent or, dually, the system
(FJ)
is consistent.
k°Vf°(x*) + Y £ ^l^7fk(x*,t) = 0
k£P' f€T'k
\°, [\tk: t € 71*, k 6 />*} nonnegative scalars,
not all zero and of which only finitely many are positive
PROOF: If x* is optimal, then the inconsistency of the system (A), (B,) is well-known,
e.g. [4, Lemma 1]. In order to prove the dual statement, we note that the inconsistency of
(A), (B,) is equivalent to fx * = 0 being the optimal value of the semi-infinite linear program
(I)
Min ix
s.t.
d'Vf°(x*) + fi > 0
d'Vfk(x*,t) + ai ^ 0, all t € T%, k <E P*
d
The dual of (/) is
(ft)
6 R" + [.
Max 0
s.t.
\°Vf°(x*) + £ £ ^l^fk(x*,t) = 0
fee/" /e rfc
\k > 0, only finitely many are positive.
The feasible set of problem (I) is clearly nonempty and canonically closed (d = 0, n = 1 satisfy
the constraints of (I) with strict inequalities). Lemma 3 is now readily applicable to the pair
(I), (II), which proves (FJ).
□
The dual statement in Theorem 4 is the Fritz John optimality condition for semi-infinite pro-
gramming. For an equivalent formulation the reader is referred to Gehner's paper [4].
Under various "constraint qualifications" such as Slater's condition:
3x € R" 3 fk{x,t) < 0 for all t € Tk, k € P
or the "Constraint Qualification II" of Gehner [4], one can set \0 = 1 in Theorem 4. In fact,
the same is possible if problem (P) is canonically closed at x*, i.e. if there exists dsuch that
430
A. BEN-TAL, L. KERZNER, AND S. ZLOBEC
(30)
d'Vfk(x*,t) < 0 for all t € T*k, k € P*.
This is easily seen by multiplying the equation in (FJ) by d satisfying (30). Note that the
canonical closedness assumption is a semi-infinite version of the Arrow-Hurwicz-Uzawa con-
straint qualification, e.g. [12]. The latter constraint qualification is implied by Slater's condition.
The Fritz John condition (FJ) with X0 = 1 is a semi-infinite version of the Kuhn-Tucker
condition, e.g. [12]. While the Fritz John condition is necessary but not sufficient, the Kuhn-
Tucker condition is sufficient but not necessary for optimality. If a constraint qualification is
assumed, then the Kuhn-Tucker condition is both necessary and sufficient for optimality for
problem (P). If a constraint qualification is not satisfied then the Fritz John condition fails to
establish the optimality and the Kuhn-Tucker condition fails to establish the nonoptimality of a
feasible point x*. In contrast, our results are applicable. This will be demonstrated by two
examples. (See also an example, taken from approximation theory, in Section 6.)
EXAMPLE 7: Consider the semi-infinite convex problem
Min f° = xx — x2
s.t.
/' = x,2 + tx2 - t2 ^ 0 for all t € Tx - [0, 1]
f - -x, - tx2 - t < 0 for all / € T2 = [0, 1].
The feasible set is
' 0
F =
A' 2
:-l < x, < 0
and the optimal solution is x* = (0,0)'. For this point
77= H= (0), P* = {1,2}.
The system (B5) is
0^0
-dx ^ 0,
obviously not canonically closed. The Kuhn-Tucker condition is
1
f 1
0
-1
0
-1
+ \,
0
+ x2
0
—
0
A-l > 0, X2 > 0
which clearly fails.
One can easily verify that the constraint functions /' and f2 have the uniform mean value
property. Also, these functions are uniformly decreasing at x* = 0 in every direction d ^ 0.
(The sets T{ and rj are singletons!) Thus Theorem 3 is applicable. Conditions (A), (B4) and
(C) are here
(A)
dx - d2 < 0
(B4)
0 < 0 or rf, = 0, d2 € R
-dx < 0
OPTIMALITY FOR SEMI-INFINITE PROGRAMMING 431
(C)
a *d\ + td2 1 ,
~ — - ^ - -±r for all t € (0, 1]
— tl OL*
—d\ - td2 1 „
! > - -1- for all / € (0, 1].
— / a*
This reduce
s to
dx = 0, d2 > 0
(31)
— > - -~ for all t 6 (0, 1]
-* > ~ £•
Since d2 > 0, the inequality (31) cannot hold for any a* > 0. Hence, by Theorem 3,
x* = (0,0)' is optimal. The optimality of a feasible point is thus established here using
Theorem 3 and not by the Kuhn-Tucker condition which here fails.
Consider now the point x* = (0,-1)'. Here
T*= {o}, T\= [0,1], P* = {1,2}.
It is easy to verify that the Fritz John condition is satisfied in spite of the fact that x* is not
optimal. Conditions (A), (B) and (C) are here
(A) dx - d2 < 0
0^0
(B)
-dx- td2 < 0 for all t € [0,1]
. . a *d\ + td2 1 ,
(C) ■ , > ~ -^ for all / € (0, 1].
—t—r a*
For a * = 1, these conditions are satisfied by c/[ = 0, J2 = 1- Hence, by Theorem 1, the point
x* = (0, 1)' is not optimal. Both the Fritz John and the Kuhn-Tucker theories fail to character-
ize optimality in this example because a constraint qualification (or a regularization condition,
e.g. [1]) is not here satisfied.
Although the Fritz John and Kuhn-Tucker theories fail to characterize optimality, they
can be used to formulate, respectively, either the necessary or the sufficient conditions of
optimality.
In the remainder of the section we will show that the ordinary Kuhn-Tucker condition
(i.e. the (FJ) condition with X0 = 1) can be weakened by assuming an asymptotic form. For a
related discussion in Banach spaces the reader is referred to [16].
THEOREM 5: ("The Kuhn-Tucker Sufficiency Theorem") Let x* be a feasible solution of
problem (P). Then x* is optimal if the system
(A) d'Vf°(x*) < 0
(B5) d'\7fk(x*,t) < 0 for all t € T*k,
k 6 P*
is inconsistent or, dually, if the system
432
A. BEN-TAL, L. KERZNER, AND S. ZLOBEC
(K-T)
V/°(x*) + £ £ X*V/*(xV) = 0
kZP" t£T*k
{X^: t € T*, k €P*} nonnegative scalars
of which only finitely many are positive
is consistent.
PROOF: If the system (A), (B5) is inconsistent, so is (A), (B). (Recall that
Dk{x*,t) C [d\ d'Vfk(x*,t) = 0}.) Hence, in particular, the system (A), (B), (C) is incon-
sistent. Following the proof of Theorem 1, one concludes that x* is optimal. The inconsistency
of (A), (B5) is equivalent to the consistency of (K-T), by e.g. [11, Corollary 5].
□
REMARK 4: The "asymptotic" form of the Kuhn-Tucker conditions (K—T) gives a
weaker sufficient condition for optimality than the familiar (i.e., without the closure) condition
(K-T)
v/°(x*) + £ £ x*v/*(xv) = o
k€P* t€T%
{X*: t € T%, k € P*} nonnegative scalars
of which only finitely many are positive.
In some situations the primal Kuhn-Tucker conditions (A), (B5) may be easier to apply
than (K-T). This will be illustrated on the following problem taken from [8, Example 2.4].
EXAMPLES: Consider
Min f° = 4x, + j (x4 + x6)
s.t.
/' = -x, - t\X2 - t2x3 - t}xA - txt2xs - t22xt + 3 - (t\ - t2)2(t\ + t2)2 ^ 0
t]
for all / € T, =
1 2
: -1 < t, ^ 1, /= 1,2
We will show, using the Kuhn-Tucker theory, that x* = (3,0,0,0,0,0)' is an optimal solution.
The optimality of x* has been established in [8] by a different approach.
First note that here
t
Tt =
' n r.
: t\ - t2 = 0 or tx + t2 = 0
The system (A), (B5) becomes
(A) 4dx +y d4
(B5) - dx- txd2 - t2d3 - t\di - txt2ds - t2d6 < 0
+| d6 < 0
for all t € T\.
0PTIMAL1TY FOR SEMI-INFINITE PROGRAMMING
433
Substitute in (B5) the following five points of T*\
1
0
1
1
-1
-1
0
'
1
'
-1
'
1'
-1
This gives
-</, < 0
— d\ — d2 — dj — d4 — d$ — d(, ^ 0
-dx - d2 + d}- d4+ d5- d6^ 0
-dx + d2 ~ d3 - d4 + d5 - d6 < 0
-d\ + d2 + d3 - d4 - d5 - d6 < 0.
Multiply the first inequality by ten thirds and each of the remaining four inequalities by one
sixth then add all five inequalities. We get
-Ad,
-7*4
-jd6<0
which contradicts (A). Thus the system (A), (B5) is inconsistent and x*= (3,0,0,0,0,0)' is
optimal, by Theorem 5.
Theorems 1 and 3 suggest that the presently used constraint qualifications for semi-
infinite programming problems are too restrictive because they do not employ the topological
properties of problem (P), such as the uniform mean value property or the uniformly decreas-
ing constraints.
6. AN APPLICATION TO CHEBYSHEV APPROXIMATION
It is well-known that there is a close connection between convex programming and
approximation theory, e.g. [5], [13]. In fact, many approximation problems can be formulated
as convex semi-infinite programming problems in which case the results of this paper are
readily applicable. In particular, the problem of linear Chebyshev approximation subject to side
constraints
(MM)
Min
max \f(t) ~ £ XjgjU)\
r£T
/=!
S.t.
lit) < £ x^U) < uit) for all t <E T
i-i
is equivalent to the linear semi-infinite programming problem
(L)
Min xn+\
s.t.
ft
-x„+] < £ x,g,(t) - fit) ^ Xn+l
i=\
for all / e T.
434
A. BEN-TAL, L. KERZNER. AND S ZLOBEC
Corollary 3 of this paper can be applied to (L) and it gives a characterization of the best approx-
imation for the problem (MM). Uniqueness of the best approximation can be checked using
Theorem 2. Rather than going into details we will illustrate this application by an example.
EXAMPLE 9: The approximation problem stated in this example is taken from [4], see
also [15]. It shows that there exist situations when the Kuhn-Tucker theory for semi-infinite
programming fails to establish optimum even in the case of linear constraints. However, the
optimality is established using the results of this paper.
The linear Chebyshev approximation problem is
Min
max 1 1
r€[0,l]
XX~ X2t
S.t.
< 0
< 0
for all 1 € [0,1].
-t < x, + x2t < t2, for all / € [0,1].
An equivalent linear semi-infinite programming problem is
Min f° = x3
s.t.
f = ,4 - x, - X2t - x3 < 0
f = -tA + x, + x2t - x3 ^ 0
f3 = -t2 + x{+x2t
f = -t - x, - x2t
Isx*= (0,0. D' optimal?
Here Tf= {1}, T*2= 0, T*3 = (0), T*4= {0} and P* = {1,3,4}. The system (A), (B5) is
(A) d3 < 0
-dx - d2- d3 ^ 0
d] < 0
-d\ < 0
(B5)
and it is clearly consistent (set e.g. d\ = 0, d2 = 1, d3 = -1). Therefore, Theorem 5 cannot be
applied. (Since the system (K—T) is inconsistent, x* = (0,0,1)' is not a "Kuhn-Tucker
point".) But the system
(A)
(B2)
(C,)
~d\
d3<0
d2 ~ d3 < o
< 0
^ 0
dot- d
t4- 1
d{ + d2t
dx + d2t
I > -1, for all t € [0,1)
> -1, for all / € (0,1]
> -1, for all t 6 (0,1]
OPTIMALITY FOR SEMI-INFINITE PROGRAMMING 435
is inconsistent. (First, d\ = 0, by the last two inequalities in (B2). Now (A) and (B2) imply
d2 > 0. This contradicts d2 ^ 0 obtained from the second inequality in (Ci).) Therefore
x* = (0,0, 1)' is optimal, by Corollary 1.
ACKNOWLEDGMENT
The authors are indebted to Professor G. Schmidt for providing some of the constraint
functions used in Examples 1, 3 and 4, Mr. H. Wolkowicz for providing a counter-example to
one of their earlier conjectures and the referee for his recommendations about organization of
the paper and providing a correct version of Lemma 3.
REFERENCES
[1] Ben-Tal. A., A. Ben-Israel and S. Zlobec, "Characterization of Optimality in Convex Pro-
gramming without a Constraint Qualification," Journal of Optimization Theory and
Applications 20, 417-437 (1976).
[2] Charnes, A., W.W. Cooper and K.O. Kortanek, "Duality, Haar Programs and Finite
Sequence Spaces," Proceedings of the National Academy of Science, 48, 783-786
(1962).
[3] Charnes, A., W.W. Cooper and K.O. Kortanek, "On the Theory of Semi-Infinite Program-
ming and a Generalization of the Kuhn-Tucker Saddle Point Theorem for Arbitrary
Convex Functions," Naval Research Logistics Quarterly, 16, 41-51 (1969).
[4] Gehner, K.R., "Necessary and Sufficient Conditions for the Fritz John Problem with
Linear Equality Constraints," SIAM Journal on Control, 12, 140-149 (1974).
[5] Gehner, K.R., "Characterization Theorems for Constrained Approximation Problems via
Optimization Theory," Journal of Approximation Theory, 14, 51-76 (1975).
[6] Gorr, W. and K.O. Kortanek, "Numerical Aspects of Pollution Abatement Problems: Con-
strained Generalized Moment Techniques," Carnegie-Mellon University, School of
Urban and Public Affairs, Institute of Physical Planning Research Report No. 12
(1970).
[7] Gustafson, S.A. and K.O. Kortanek, "Analytical Properties of Some Multiple-Source Urban
Diffusion Models," Environment and Planning, 4, 31-41 (1972).
[8] Gustafson, S.A. and K.O. Kortanek, "Numerical Treatment of a Class of Semi-Infinite Pro-
gramming Problems," Naval Research Logistics Quarterly, 20, 477-504 (1973).
[9] Gustafson, S.A. and J. Martna, "Numerical Treatment of Size Frequency Distributions
with Computer Machine," Geologiska Foreningens Forhandlingar, 84, 372-389 (1962).
[10] Kantorovich, L.V. and G.Sh. Rubinshtein, "Concerning a Functional Space and Some
Extremum Problems," Dokl. Akad. Nauk. SSSR 115, 1058-1061 (1957).
[11] Lehmann, R. and W. Oettli, "The Theorem of the Alternative, the Key-Theorem and the
Vector-Maximum Problem," Mathematical Programming, 8, 332-344 (1975).
[12] Mangasarian, O, Nonlinear Programming, McGraw Hill, New York (1969).
[13] Rabinowitz, P., "Mathematical Programming and Approximation," Approximation Theory,
A. Talbot (editor), Academic Press (1970).
[14] Rockafellar, R.T., Convex Analysis, Princeton University Press, Princeton, N.J. (1970).
[15] Taylor, G.D., "On Approximation by Polynomials Having Restricted Ranges," Journal on
Numerical Analysis, 5, 258-268 (1968).
[16] Zlobec, S., "Extensions of Asymptotic Kuhn-Tucker Conditions in Mathematical Program-
ming," SIAM Journal on Applied Mathematics, 21, 448-460 (1971).
SOLVING INCREMENTAL QUANTITY DISCOUNTED
TRANSPORTATION PROBLEMS BY VERTEX RANKING
Patrick G. McKeown
University of Georgia
Athens, Georgia
ABSTRACT
Logistics managers often encounter incremental quantity discounts when
choosing the best transportation mode to use. This could occur when there is a
choice of road, rail, or water modes to move freight from a set of supply points
to various destinations. The selection of mode depends upon the amount to be
moved and the costs, both continuous and fixed, associated with each mode.
This can be modeled as a transportation problem with a piecewise-linear objec-
tive function. In this paper, we present a vertex ranking algorithm to solve the
incremental quantity discounted transportation problem. Computational results
for various test problems are presented and discussed.
1. INTRODUCTION
Whenever a logistics manager is making a decision about the movement of freight, he is
often faced with choosing from among different modes of transportation. Movement of freight
by air or motor express may involve no fixed costs to the transporter, but will usually involve
relatively higher variable costs than either rail or water. However, both rail and water can
involve the investment of large sums for rail sidings or docking facilities. The problem of
selecting freight modes can be modeled as a transportation problem with a piecewise-linear
objective function. This problem has been termed the incremental quantity discounted tran-
sportation problem, since it is assumed that the variable costs decrease as the amount shipped
increases. This comes about due to the lower variable costs for rail or water modes relative to
air or road freight costs. The presence of fixed costs for the use of rail or water determines the
range of shipment levels over which each cost will be applicable. Figure 1 shows this type of
objective function.
In this paper we will present a vertex ranking algorithm to solve this type problem along
with the computational results from various sizes and types of problems. Background material
is discussed in Section 2, while the details of the algorithm are given in Section 3. An example
is worked out in Section 4 while Section 5 gives computational results.
2. BACKGROUND MATERIAL
The incremental quantity discounted transportation problem is a member of a general
class of math programming problems, i.e., the piecewise-linear programming problem. Vogt
and Even [15] considered the case of the piecewise-linear transportation problem derived from
U.S. freight rates. This problem is neither convex nor concave, and has sections of the objec-
tive function which are flat or "free." Figure 2 shows this case. Vogt and Evans used separable
437
438
PG. MCKEOWN
Total
Cost
Motor
Freight
_► Rail ►
Water
Flow (xu)
Quantity Shipped
Figure 1
J
i
"Free" Sections
Total
Cost
te-
Quantity Shipped
Figure 2
- Flow (xy)
nonconvex programming to reach an approximately optimal solution to this problem. Balachan-
dran and Perry [1] consider another version of this problem which they termed the all unit
quantity discount transportation problem. The main difference between this and the previous
case is the lack of the flat section of the objective function. The latter case is typical of some
foreign freight rates, and is shown in Figure 3 below.
Problems similar to this one have been mentioned in the plant location literature, e.g.,
Townsend [14], and Efroymson and Ray [5]. In these cases, it is suggested that the problem be
solved by considering multiple plants, one for each range of demand.
Balachandran and Perry presented a branch and bound algorithm for the all unit quantity
discount problem, which they show, will also work for the incremental quantity discount prob-
lem as well as fixed charge transportation problems. However, no computational results are
SOLVING TRANSPORTATION PROBLEMS BY VERTEX RANKING
439
Total
Cost
Flow (x„)
Quantity Shipped
Figure 3
given to demonstrate the efficiency of this algorithm. Here, we will consider a vertex ranking
algorithm for only the incremental quantity discount problem for two reasons. First, fixed
charge transportation problems have been handled in several other places in a manner that has
been shown to be superior to vertex ranking [2,8]. Secondly, the incremental quantity discount
transportation problem has a concave objective function; but, neither the problem considered
by Vogt and Evans, or the all unit quantity discount transportation problem, have nonconcave
objective functions. This is crucial to the use of vertex ranking since this procedure will only
consider vertices of the constraint set, and the optimal solution to problems with nonconcave
objective functions need not occur at a vertex.
The incremental quantity discount problem may be formulated as follows (following the
model proposed by Balachandran and Perry [1]):
(1)
(2)
(3)
(4)
Min Z = ££C* xu + ££ /j} yjj
i i i j
subject to £ Xjj = a, for /€/
£ Xjj = bj for j €7
C
C2
ku ^ XU < *•(/
C^ = <
o
ifx/r1 <
< \[j < °°,
440 P.G. MCKEOWN
(5) yu =
(6) 4 =
1 if A J"1 < x(J < \jj
0 otherwise
v-l
- C^A/p1 for k = 2,3 r
'</" './
and
(7) /j = 0, Xy ^ 0 for all /€ 1 and 7 €7,
where
7 = {1, . . . , «} = set of sinks,
/ = { 1 , .... m } = set of sources,
/? = {1, . . . , r} = set of cost intervals.
As may be easily seen, this is a generalization of the fixed charge transportation problem,
(see [1]), with a fixed charge, /$, and a continuous cost, Cjj, for each range of shipment
between source / and destination j. Since the situation which we are attempting to model, i.e.,
the choice of shipment mode, does involve various levels of fixed charge, (1) - (7) is the
proper formulation for this problem. It should be noted that we are implicitly assuming that
Cjj > C#+1 for all/ €7, 7 €7.
This is necessary for the concavity of the objective function. However, we would expect that
lower continuous costs would occur for higher shipment levels.
Balachandran and Perry [1] suggested that (1) - (7) may be solved by a branch and bound
algorithm. Their procedure is similar to that used to solve travelling salesmen problems by
driving out subtours [13]. They solve the transportation problem with all costs set to their
lowest value, i.e., Cjj. If any routes have flow below \[j~l, branching is done on one of these
variables. Two branches are used. Our branch forces the flow over the arc above the lower
limit for the cost level Cfj, i.e., Xy ^ Af1. In the other branch, the infeasible cost, Q, is
replaced by the feasible cost, Cjj. This continues until a solution is found where the arc flows
match the costs used. This is the optimal solution. However, the effectiveness of the pro-
cedure is unknown since the authors did not provide any computational results.
It would also appear that the work of Kennington [8] on the fixed charge transportation
problem could possibly be modified to solve this problem by having multiple arcs between each
set of nodes. Each arc would be bounded by \,y_1andX,y with multiple continuous costs and
fixed costs. However, this would lead to effectively larger problems, e.g., a problem with 60
arcs and five breakpoints would have 300 variables in the new problem.
3. SOLUTION PROCEDURE
Using the formulation of the incremental quantity discount transportation problem given
in (1) - (7), along with the assumption of decreasing costs, we have a problem with linear con-
straints and concave objective function. It is well known [7] that an optimal solution for prob-
lems of this type will occur at a vertex of the constraint set. Examples of other problems that
share this condition are the fixed charge problem, the quadratic transportation problem, and the
quadratic assignment problem. Murty [12] was the first to suggest a vertex ranking scheme for
a problem of this category. He showed that the fixed charge problem could be solved by rank-
ing the vertices of the constraint according to the objective value up to some upper bound. At
that point, the optimal solution would be found at one or more of the ranked vertices.
SOLVING TRANSPORTATION PROBLEMS BY VERTEX RANKING 441
We may formulate any problem with concave objective function and linear constraint as
below:
(8) Mm fix)
(9) s.t. x£S
(10) where S = {x\Ax = b,x ^ 0}.
Since no "direct" optimization techniques exist for the case where fix) is nonlinear, we
shall look at a procedure for searching the vertices of 5. To do this, we will use a linear
underapproximation of fix), say Lix), such that Lix) < fix), x€S. In this case, to show
that x*is an optimal solution to (8) - (10), we need only rank the vertices of S until a vertex of
x° is found such that Lix0) ^fix*). At this point, all vertices that could possibly be optimal
have been ranked. This is proved by Cabot and Francis [3].
In order to rank the extreme points of S, we need to use a result also first proved by
Murty as Theorem 1 below:
THEOREM 1: \{ Ex, E2, . . . , EK are the first K vertices of a linear underapproximation
problem which are ranked in nondecreasing order according to their objective value, then ver-
tex EK+i must be adjacent to one of Ex, E2, ■ ■ ■ , EK.
Simply put, this says that vertex 2 will be adjacent to the optimal solution to the linear
underapproximation and vertex 3 will be adjacent to vertex 1, or vertex 2, and so on. This,
then, gives us a procedure for ranking the vertices if all adjacent vertices can be found. It is
this "if that can cause problems. These problems arise due to the possibility of degeneracy in
5. If S is degenerate, then there may exist multiple bases for the same vertex. This implies
that all such bases must be available before one can be sure that all adjacent vertices have been
found. Finding all such bases for finding and "scanning" all adjacent vertices can be quite
cumbersome. However, a recent application of Chernikova's work [4,9] has been shown to be
a way around the problem of degeneracy.
Vertex ranking has been used by McKeown [10] to solve fixed charge problems and
Fluharty [6] to quadratic assignment problems. Cabot and Francis [3] also proposed the use of
vertex ranking to solve a certain class of nonconvex quadratic programming problems, e.g.,
quadratic transportation problems. For a survey of vertex ranking procedures, see [11].
In our problem, we need to determine the linear underapproximation to the first objective
function, (1). We may do this by first noting that
(11) Hy = min {a,, bj)
is an upper bound on xir We may then note that if Fixif) = Cjj x„ + fjjyjj then
F(u„) ~ FiO)
k-
uu
Ck- u ■■ + fk
(12)
uu
for A,*-1 ^ Ujj < kjj, is a linear underapproximation to F(xy)
We may now form a problem to rank vertices, i.e.,
rk u + fk
(13) Min Z = X E
u 2* 2* v V
subject to (2) - (7)
for\£-' < UiJ < kkj.
442
I'(i MCKIOWN
Using (13) and (2) - (7), we may rank vertices as discussed earlier until some vertex x°
is found such that Lix°) = ]T£ l^Xy > fix*) where x* is a candidate for optimality. We
i J
may start with x* equal to the optimal solution to (13) and (2) - (7), and then update it as new,
possibly better solutions to (1) - (7), are found by the ranking procedure. When all vertices x
are found such that L(x) < fix*), the solution procedure terminates with the present candi-
date being optimal.
EXAMPLE : As an example of our procedure, we will solve an incremental quantity
discount version of the example problem presented by Balachandran and Perry [1]. Table 1
below gives the supplies, demands, and costs, for each range of shipment. Table 2 gives the
optimal solution to the linear underapproximation problem. The values of lu are given in the
upper right-hand corner of each cell with shipment being circled in the basic cells.
TABLE 1
^•s. Destina-
^v. tion
Source ^\
1
2
3
4
Warehouse
Capacity
1
3[2(Kx,,<«>]
4[KKxM<20]
5[(Kx,i<10]
6[10<jc12<»]
7[5<x12< 10]
8[0<x12<5]
3[27<x13<°°]
4[15<xI3<27]
5[5<x13<15]
One price bracket 4
80
2
One price
bracket 6
5[65<Ar22<°°]
6[20^x22<65]
8[0<jr22<20]
8[10<x23^oo]
9l5<jfj3<10]
10[0<jr23<5]
One price bracket 15
90
3
l[27<jr31<°o]
2[20<.v3l<27]
3[0^a-3,<201
3[60<x32<°o]
4[30^x33<60]
5[0<x32<30]
10[20^x33<°°]
ll[10^.v33<20]
12[0<jf33< 10]
5[30<x34<°°]
6[20<x34<30]
7l0<jr34<20]
55
Market
Demand S
70
60
35
60
TABLE 2
^\Destina-
^\tion
Source ^^
1
2
3
4
Warehouse
Capacity
1
3.43 1
6.25 1
4.20|
4.00 1
80
©
©
2
6.00 1
6.67|
8.43 1
15.001
90
©
©
©
3
1.85 1
4.55 1
10.861
5.91 1
55
©
Market
Demands
70
60
35
60
SOLVING TRANSPORTATION PROBLEMS BY VERTEX RANKING 443
As an example of the calculation of the /,7 values, we will look at /M. First, it is necessary
to calculate f}\ and f\\ using (6). We will do f\\.
f\\ = Cu (\]\ — X\\) — C| i -X.|i
= (5)(10) - 4(10) = 10.
Similarly, /,3, = 30. i/„ = min {80, 70} = 70. Then, /„ = (3) (70^ + 30 = 3.43.
Now, if we solve this continuous transportation problem, we get a value of Z = 1042.20
with the circled cells being basic. If we compute the feasible value for this solution, Z = 1067.
Call this solution X\
Now, since this solution is nondegenerate, we may use simplex pivoting to look at each
nonbasic cell. The values of these adjacent vertices are given below:
Vertex Z-Value
(1,1)
1067.10
(1,2)
1118.40
(2,4)
1143.75
(3,2)
1154.40
(3,3)
1141.05
(3,4)
1069.80
Since the Z-value for each vertex is greater than the present value of Z, we do not need to rank
any other vertices, and Z= 1067.0 is the optimal solution value.
4. COMPUTATIONAL RESULTS
To test the vertex ranking procedure discussed here, randomly-generated problems were
run. These problems were generated by first generating supplies and demands uniformly
between upper and lower bounds, U and L. These supplies and demands were generated so
that they were all multiples of 5. This was done to insure the presence of degeneracy in some
,of the problems. All problems were set up to have discount ranges at 20, 50, 300, 1000, and
2000. By proper selection of L and U, various numbers of ranges could be tested.
The costs for each arc were generated by randomly generating mileages between each set
of nodes, and then, inputting discount cost-per-mile values for each range of flow, e.g., 10, 9,
8, etc. The final discount costs were found by multiplying the mileage between each arc times
the various costs. In this way, various supply-demand discount ranges and cost configurations
could be tested. These problems were generated and solved using a computer code in FOR-
TRAN run on the CYBER 70/74 using the FTN Compiler with OPT = 1.
(The problem characteristics and test results are given in Table 1. The second column
gives the solution time in seconds, while the third column shows the number of vertices of the
linear underapproximation other than the optimal solution that were ranked to solve each prob-
lem. The fourth column gives the size of the problem (m x n); the fifth column gives the
number of cost ranges that the arc flows would cover; the sixth column gives the cost per mile
for each range of flow, pfi; the seventh column gives the lower and upper ranges used to gen-
erate the supplies and demands; and finally, the last column gives the ranges used to generate
nileages. The Cjj values were determined by Cjj = pjj (mileage). As can be seen, the algo-
rithm successfully solved all problems tested. The most difficult problems were those with
three ranges and supplies/demands between 5 and 100. Problems 6 and 13 are identical, except
that 6 is over only 3 ranges, while 13 is over 5; but, problem 13 is solved in much less time. In
;fact, the linear underapproximating transportation problem was found to be optimal and no
pther extreme points were even ranked. This was also the case in problems 7, 9, 10, 11, and
444
P.G. MCKLOWN
12, even though the number of variables increased markedly. It is also interesting to note the
effect of costs in problems 5, 6, and 7. These are essentially the same problem, but with the
present decrease in cost for increasing flow being less in each case. The results are as expected
since in problem 7 the linear underapproximation will be closer to the actual objective function
than in problems 4 and 5.
TABLE 3 — Computational Results
Problem
Number
Vertices
Ranked
Solution
Time
mxn
Number of
Ranges
p6
U.L
Mileage
Range
1
13
2.604
6x8
3
10,9,8
1,50
100,200
2
39
10.289
8x8
3
10,9,8
1,50
100,200
3
39
34.103
9x9
3
10,9,8
1,50
100,200
4
257
42.938
6x8
3
10,9,8
1,100
100,200
5
247
39.799
6x8
3
20,18,17
1,100
100,200
6
84
13.353
6x 8
3
20,19,18
1,100
100,200
7
0
.121
4x6
5
20,19,18,17,16
400,500
50,100
8
18
.888
4x8
5
20,19,18,17,16
400,500
50,100
9
0
.196
6x8
5
20,19,18,17,16
400,500
50,100
10
0
.393
8 x 8
5
20,19,18,17,16
400,500
50,100
11
0
.518
9x9
5
20,19,18,17,16
400,500
50,100
12
0
.130
4x6
5
10,9,8,7,6
400,500
50,100
13
0
.213
6x8
5
20,19,18,17,16
400,500
100,200
It would appear from these results that vertex ranking does hold promise as a solution
procedure for incremental cost discount transportation problems. Neither size of problem nor
degeneracy appears to have any effect on solution time but cost patterns and number of cost
ranges do seem to have a marked effect.
Extensions of this work could be used to solve other concave linear programming prob-
lems. Walker [16] discusses the fact that these can be considered as generalizations of fixed
charge problems. The main difference would be that the first linear portion would have a posi-
tive fixed charge rather than zero, as in the problem discussed here. However, this would not
change the approach to the solution used here.
REFERENCES
[1] Balachandran, V. and A. Perry, "Transportation Type Problems with Quantity Discounts,"
Naval Research Logistics Quarterly, 23, 195-209 (1976).
[2] Barr, R.L., "The Fixed Charge Transportation Problem," presented at the Joint National
Meeting of ORSA/TIMS in Puerto Rico (Nov. 1974).
[3] Cabot, A.V. and R.L. Francis, "Solving Certain Nonconvex Quadratic Minimization Prob-
lems by Ranking the Extreme Points," Operations Research 18, 82-86 (1970).
[4] Chernikiva, N.V., "Algorithm for Finding a General Formula for the Non-negative Solu-
tions of a System of Linear Inequalities," U.S.S.R. Computational Mathematics and
Mathematical Physics.
,[5] Efroymson, M.A. and T.L. Ray, "A Branch-Bound Algorithm for Plant Location," Opera-
tions Research, 14, 361-368 (1966).
[6] Fluharty, R., "Solving Quadratic Assignment Problems by Ranking the Assignments,"
unpublished Master's Thesis, Ohio State University (1970).
[7] Hirsch, W.M. and A.J. Hoffman, "Extreme Varieties, Concave Functions, and The Fixed
Charge Problem," Communications on Pure and Applied Mathematics, 14, 355-370
(1961).
SOLVING TRANSPORTATION PROBLEMS BY VERTEX RANKING 445
[8] Kennington, J.L., "The Fixed Charge Transportation Problem: A Computational Study
with a Branch and Bound Code," AIIE Transactions, 8 (1976).
[9] McKeown, P.G. and D.S. Rubin, "Adjacent Vertices on Transportation Polytopes," Naval
Research Logistics Quarterly, 22, 365-374 (1975).
[10] McKeown, P.G., "A Vertex Ranking Procedure for Solving the Linear Fixed Charge Prob-
lem," Operations Research 23, 1183-1191 (1975).
[11] McKeown, P.G., "Extreme Point Ranking Algorithms: A Computational Survey,"
Proceedings of Bicentennial Conference on Mathematical Programming (1976).
[12] Murty, K., "Solving the Fixed Charge Problem by Ranking the Extreme Points," Opera-
tions Research, 16, 268-279 (1968).
[13] Shapiro, D., "Algorithms for the Solution of the Optimal Cost Travelling Salesmen Prob-
lem," Sc.D. Thesis, Washington University, St. Louis (1966).
[14] Townsend, W., "A Production Stocking Problem Analogous to Plant Location," Opera-
tions Research Quarterly, 26, 389-396 (1975).
[15] Vogt, L. and J. Evan, "Piecewise Linear Programming Solutions of Transportation Costs
as Obtained from Rate Traffic," AIIE Transactions, 4 (1972).
Walker, Warren E., "A Heuristic Adjacent Extreme Point Algorithm for the Fixed Charge
Problem," Management Science, 22, 587-596 (1976).
AUXILIARY PROCEDURES FOR SOLVING LONG
TRANSPORTATION PROBLEMS
J. Intrator and M. Berrebi
Bar-llan University
Ramat-Gan, Israel
ABSTRACT
An efficient auxiliary algorithm for solving transportation problems, based
on a necessary but not sufficient condition for optimum, is presented.
In this paper a necessary (but not sufficient) condition for a given feasible solution to a
transportation problem to be optimal is established, and a special algorithm for finding solutions
which satisfy this condition is adapted as an auxiliary procedure for the MODI method.
Experimental results presented show that finding an initial solution which satisfies this
necessary condition for problems with m < < n eliminates 70%-90% of the MODI iterations.
(See Table 1)
TABLE 1 — Matrix of Principal Results
>v n
m >^
20
30
40
50
100
200
300
4
0.65
0.69
0.72
0.74
0.88
0.91
0.93
5
0.61
0.67
0.69
0.71
0.84
0.87
0.90
6
0.59
0.65
0.66
0.68
0.80
0.82
0.85
8
0.61
0.62
0.64
0.66
0.76
0.80
0.82
10
0.57
0.65
0.66
0.69
0.73
0.77
0.80
20
0.25
0.27
0.31
0.36
0.45
0.50
0.52
Fraction of Modi iteration eliminated by using the method presented
in this paper.
The case when our algorithm is used during the solution process (especially for m — n) is
presently being examined. Our auxiliary procedure requires relatively little computational effort
in finding the appropriate candidate for the basis, eliminating entirely the need to calculate the
dual variables. It works with positive variables associated with one pair of rows at a time using
only the prices of these rows.
447
448 J. INTRATOR AND M. BERREB1
Once a loop for any given pair of rows is determined it may be used to insert numerous
non-basic cells in these two rows to the basis. The result is a considerable time reduction in
determining loops.
The storage and time requirements for the special lists needed in our auxiliary algorithm
are fully discussed in [1]. A rigorous proof presented in [1] shows that updating these lists
requires no more than 0(m log a?) computer operations per MODI iteration.
A Linear Programming Transportation Problem is characterized by a cost matrix C and
two positive requirement vectors a and b such' that ^ o, = ^ br The problem is to minimize
SZ cuxu subject to
i j
Z xu = bi j- 1,2 .... n
i
(A) £ Xjj : = a, i = 1,2 ... m
j
xu > 0 for all (ij).
A proper perturbation of our problem ensures that:
(1) each feasible basic solution of (A) contains exactly m + n — 1 positive variables xlh
(2) corresponding to each nonbasic cell (/,./) (xn = 0) there exists a unique loop of different
cells, say L(i,j) - (/,./',) (i2J\) diJi) (/3J2) ••••
(B) UrJr-l) UrJ) Oj)
which contains at most two cells in each row and column, where the cell (/',./) is the unique
nonbasic cell,
(3) there are no loops which contain basic cells only.
Notation: For fixed l,k (1 ^ / ^ k ^ m) we denote
V,= [j\x,j > 0) 1 <7< //
v*- V\ n yk = {j\x,j > 0, xkl > 0}.
With no loss of generality it is assumed that for each /, (1 < / < m) there exists at least one
destination (column) 76 V, such that (IJ) is the unique basic cell of column j. Otherwise, an
artificial destination, say J with xtj ~ bj — e where e is an infinitely small positive number will
be introduced.
It is easy to see that the feasible solution of the augmented problem of dimension
mxin + 1) satisfies 1), 2), 3) mentioned above.
DEFINITION 1: A destination with a unique basic cell will be called a fundamental desti-
nation.
The unique nonbasic cell (ij) of L(ij) will be considered for convenience to be the last
cell of L(ij). For each loop L(iJ), say loop (B), we introduce the notation:
<o cLUJ] = c,7| - c,Vl + ci2h - clil: + ...+ clrJr - Q.
SOLVING LONG TRANSPORTATION PROBLEMS 449
It is well to know that
Q(/j) = ui + vj ~ 0/ where w, and v7- are the dual prices.
DEFINITION 2: A loop with Q > 0 is called an improving loop.
DEFINITION 3: Let l,k be a fixed pair of numbers so that 1 ^ / ^ k ^ m, we define
Alk= [J\ x„ > 0, x^ = 0}- V,- Vlk
Dik(j)= C,j- CkJ j = 1,2, .... n.
THEOREM 1: The number of elements in Vlk is at most 1.
PROOF: Suppose that JUJ2£ Vlk (1 < /, ^ J2 < «) then the loop (/,/,) (k,J\) (k,J2)
(l,J2) is of only basic cells contradicting (3) above.
Let J\ be a fundamental destination of A/k, the purpose of Theorem 2 and Theorem 3 is
to show that all the simplex loop L(i,J) and the numbers Q,(,/) / = l,k; J€A/k U Akl are deter-
mined after the simplex loop L{k,J) is found.
THEOREM 2: CL{kj$ ~ D,kU2) = CUU]) - /),*(/,) for all J2eAlk
J i being the above fundamental destination of Alk.
PROOF:
CASE (a) Vlk ^0. Denote by J the unique member of Vlk (Th.l). We have j ^ J\\
j * J2 U\ zx\AJ2(iVk) and
L{k,J]) = (k,j) U,j) (/,•/,) UUi)
L{k,J2) = (kj) UJ) (l,J2) ik,J2)
: e.g.
Q(A.y2) ~~ Dlk{J2) = CL{kJ[) - Dlk{Jx).
CASE (b) Vlk = 0. Let L{k,Jx) be the loop (B). Note that ir = / (since column 7, con-
tains a basic cell in row /exclusively) and r > 2. Otherwise, i\ = i2= /and L{k,J\) = (kj)
(IJ) U,J\) (k,J\) means that J€ Vlk contradicting the fact that V,k = <f>.
I Consider the loop:
L = (kJO (/2J,) (i2J2) U3J2) (U-i) UJi) (U2), ('1 = k)
' obtained from L{k,J\) by substituting J2 for J\. Let us show that either L{k,J2) = L or
L(k,J2) can be obtained from L by deleting two identical cells.
At first, observe that all rows and columns of L (except perhaps J2) contain exactly two
different cells of L. The column J2 has not appeared previously (unless J2 = jr_\) because it
1 equals one of the previous members js, 1 ^ s ^ r — 2, then the loop (/s+|,75)
('s+iJs+i) • • • Ur,J2) will be a loop of basic cells only which contradicts (3).
Thus, only two possibilities exist:
(1) J2 9^ jr_} and L(k,J2) = L or
450 J. INTRATOR AND M. BERREBI
(2) J2 — jr-i and L(k,J2) is obtained from L by deleting the two identical cells (Ljr-\) and
(U2).
Since this deleting does not effect the value of CUkJ >, we have for both possibilities
Q(A-.y:> — DlkU2) = CL{kJ{) — D,kU\).
THEOREM 3: Let J\ and J2 be the destinations defined in Theorem 2 and J^Ak. We
shall prove that
Q(/.y,) = - iCL{kJl) ~ DlkU2)} + Dkl U3).
PROOF: Let L ik,J\) be the loop (B) with ir = / because J is fundamental.
Consider the loop L defined by
L = UrJr-0 (/,-„ Jr-l) ••• U2,j{) (/,J,) (*,/3) (W-
CASE (a) K/A ^ 0. Same proof as in Theorem 2.
CASE (b) ^ = 0. By the same argument as in Theorem 2 we can show that there are
only two possibilities.
1) y3 ^ /| which implies that L(/,y3) = L.
2) y3 = 7|. In this case (r > 2) and L(/,73) can be obtained from L by deleting the two identi-
cal cells (i\J\) and (k,J}).
In the two cases we have
Q(/.y,) = - ICitup-DftWl + AfC/s)
and by Theorem 2 we have
Q</.y,> = - IQ(W,) " DlkU2)] + Dw(73).
THEOREM 4: If DjkU2) > DlkU3) then either L(k,J2) or L(k,J}) is an improving sim-
plex loop.
PROOF: Since £/A(y3) = - Dkl(J3), it follows from Theorem 3 that
Cnu3i + Q.(fc/2) = DlkU2) - DlkU3) > 0
(Dlk(J2) > D/A (73)) and either Cu/y , or Q(A.y j is a positive number, e.g., either L(/,y3) or
L(k,J2) is an improving simplex loop (Definition 2).
COROLLARY: At optimum we have DlkU2) < DlkU3).
DEFINITION 4: Define J/k by
D,kUlk) = max Z),*0')-
7"€ K
SOLVING LONG TRANSPORTATION PROBLEMS 451
REMARK 1: We shall suppose that Dlk{j\) = D,k{j2) if and only if y, = j2. Otherwise,
a cost perturbed problem with C,j = Cy + em,+J can be considered and
m/+./'] mk+j\
AXC/.) " D,l(j2) = C, - Ck. + €m,+/l - e^+" - C, +
+ Ckj — e 2 + € 2 which for sufficiently small e > 0 is equal 0 only for y, = j2.
>2
THEOREM 5: If at the oplimality Vlk = $ then DtkUik) < D,kUki) U,k ^ JM), else
DlkU,k) = DlkUkl) Ulk = Jkl).
PROOF: If Vlk = 0 then D/k(J/k) ^ D/k(Jki) otherwise, (by cost-perturbation) Jlk = Jkl
and ^ ^ </>.
The first part of Theorem 5 follows now immediately from the corollary of Theorem 4.
If Vlk ^ 0 and j is the unique element of Vlk then by the definition of Jtk and from j€ V{ we
have A* 0') < DlkU,k).
Let us show that j = 7/A. Suppose that j ^ 7//v then we have D/k(J) < Dik{Jlk), (Definition 4)
and the simplex loop
L(kJik) = (kj) (l,j) (lJik) ik,Jlk) will be an improving simplex loop since
CL{Ulk) = A/0) + DlkUlk) = DlkUik) - Dlk(j) > 0
contradicting the fact that we have optimality.
Thus, j = J,k. By the same argument we have j = Jkl.
A simple algorithm consists of
1) Computing the differences D,k (J/k),
2) Comparing D!k(Jlk) to Dlk(Jk/).
If D,kUlk) > DlkUki) (or if Jlk ^ Jk/ for non-empty Vlk) we improve our solution, using
all the nonbasic cells (l,J) or (k,J) where 7€ (V,UVk) such that DlkU,k) < DlkU) < DlkUkl)
by searching only the first loop involving the rows /and k.
The other loops will be obtained by changing the last two cells keeping the 2k— 2 first cells
, in the same order or in the opposite order (Theorem 2 and Theorem 3).
REMARK: In order to assure that the first loop will not be a shortened loop, this loop
will be obtained by using a fundamental artificial destination /with only one basic cell in the k
row with xkJ = e.
The proposed technique was applied to each pair of rows (l,k) until Dik(Jlk) ^ D/A(yA/) for
all 1 ^ / ^ k ^ m. At that point the MODI method was implemented. Performing a MODI
• iteration frequently caused D,kUlk) > D/k(Jkl) for some 1 < / ^ k < m which would enable
further utilization of the proposed technique. However, for the purpose of the present experi-
ment the proposed technique was not reactivated after the initial processing. (See Table 1)
The storage and time requirements of the lists Jlk when updated at each MODI iteration
are fully discussed in [1].
452 J. INTRATOR AND M. BERREBI
One possible way to update this list may be described as follows: For each / the destina-
tions of ji V, are ordered in m — 1 sequences Plk (1 < k ^ / < m) of increasing Dlk(j).
Thus Pik = [jx\ j2 . . . ; fa) (N, — the number of elements in V,) D,k(jO < D,k{j2) ...
< A*0/v,) (equality excluded because of the supposed cost-perturbation). These Ptk sequences
are organized in heaps. Adding or deleting an item from a heap requires 0(log N,) < 0(log n)
computer operations. Since at each simplex iteration only one basic cell, say (o-,t), becomes
nonbasic and one nonbasic cell, say (s,t), becomes basic, we have to update 2(m — 1) heaps
(Pap and Psr for all p ^ o\ r ?± s), which amounts to 0(m log n) computer operations per
simplex iteration, (heaps, see [2]).
REFERENCES
[1] Brandt, A. and J. Intrator, "Fast Algorithms for Long Transportation Problems," Computers
and Operations Research 5, 263-271 (1978).
[2] Knuth, D.E., The Art of Computer Programming, 3, Sorting and Searching, Addison Wesley
(1973).
ON THE GENERATION OF DEEP DISJUNCTIVE CUTTING PLANES*
Hanif D. Sherali and C. M. Shetty
School of Industrial & Systems Engineering
Georgia Institute of Technology
Atlanta, Georgia
ABSTRACT
In ihis paper we address ihe quesiion of deriving deep cuts for nonconvex
disjunctive programs. These problems include logical constraints which restrict
the variables to at least one of a finite number of constraint sets. Based on the
works of Balas, Glover, and Jeroslow, we examine the set of valid inequalities
or cuts which one may derive in this context, and defining reasonable criteria
to measure depth of a cut we demonstrate how one may obtain the "deepest"
cut. The analysis covers the case where each constraint set in the logical state-
ment has only one constraint and is also extended for the case where each of
these constraint sets may have more than one constraint.
1. INTRODUCTION
A Disjunctive Program is an optimization problem where the constraints represent logical
conditions. In this study we are concerned with such conditions expressed as linear constraints.
Several well-known problems can be posed as disjunctive programs, including the zero-one
integer programs. The logical conditions may include conjunctive statements, disjunctive state-
ments, negation and implication as discussed in detail by Balas [1,2]. However, an implication
can be restated as a disjunction, and conjunctions and negations lead to a polyhedral constraint
set. Thus, this study deals with the harder problem involving disjunctive restrictions which are
essentially nonconvex problems.
It is interesting to note that disjunctive programming provides a powerful unifying theory
, for cutting plane methodologies. The approach taken by Balas [2] and Jeroslow [14] is to
characterize all valid cutting planes for disjunctive programs. As such, it naturally leads to a
statement which subsumes prior efforts at presenting an unified theory using convex sets, polar
sets and level sets of gauge functions [1,2,5,6,8,13,14]. On the other hand, the approach taken
1 by Glover [10] is to characterize all valid cutting planes through relaxations of the original dis-
junctive program. Constraints are added sequentially, and when all the constraints are con-
sidered Glover's, result is equivalent to that of Balas and Jeroslow. Glover's approach is a con-
structive procedure for generating valid cuts, and may prove useful algorithmically.
The principal thrust of the methodologies of disjunctive programming is the generation of
cutting planes based on the linear logical disjunctive conditions in order to solve the
corresponding nonconvex problem. Such methods have been discussed severally by Balas
[1,2,3], Glover [8], Glover, Klingman and Stutz [11], Jeroslow [14] and briefly by Owen [17].
But the most fundamental and important result of disjunctive programming has been stated by
*This paper is based upon work supported by the National Science Foundation under Grant No. ENG-77-23683.
453
454 H D SHERALI AND CM SHETTY
Balas [1,2] and Jeroslow [14], and in a different context by Glover [10]. It unifies and sub-
sumes several earlier statements made by other authors and is restated below. This result not
only provides a basis for unifying cutting plane theory, but also provides a different perspective
for examining this theory. In order to state this result, we will need to use the following nota-
tion and terminology.
Consider the linear inequality systems Sh, /?€// given by
(1.1) Sh = {x: Ahx > bh, x >0}, /?€//
where H is an appropriate index set. We may state a disjunction in terms of the sets Sh, h€H
as a condition which asserts that a feasible point must satisfy at least one of the constraint Sh,
/?€//. Notationally, we imply by such a disjunction, the restriction x€ U Sh. Based on this
h€H
disjunction, an inequality tt'x > tt0 will be considered a valid inequality or a valid disjunctive cut
if it is satisfied for each x€ U S,,. (The superscript t will throughout be taken to denote the
transpose operation). Finally, for a set of vectors {v/;: /?€//}, where v1' = (vj\ ... , v^O for
each //€//, we will denote by sup (v;0, the pointwise supremum v = (vi, .... v„) of the vec-
tors v\ /?€//, such that v, = sup (v/) for j = 1, . . . , n.
Before proceeding, we note that a condition which asserts that a feasible point must satisfy
at least p of some q sets, p < q, may be easily transformed into the above disjunctive statement
)y letting each Sh denote the conjunction of the q original sets taken p at a time. Thus, H =
' (J)
in this case. Now consider that following result.
THEOREM 1: (Basic Disjunctive Cut Principle) - Balas [1,2], Glover [10], Jeroslow
[14]
Suppose that we are given the linear inequality systems Sh, h €// of Equation (1.1), where
\H\ may or may not be finite. Further, suppose that a feasible point must satisfy at least one
of these systems. Then, for any choice of nonnegative vectors k'\ /?€//, the inequality
(1.2) (sup (\h)'Ah\ x ^ inf UW
[heH J /;€//
is a valid disjunctive cut. Furthermore, if every system Sh, /;€//is consistent, and if \H\ <
n
°o, then for any valid inequality £ ttjXj ^ 7r0, there exist nonnegative vectors k'\ /?€//such
7=1
that ttq < inf (kh)'bh and for 7= 1, ... , n, the j th component of sup (k'')'Ah does not
exceed ttj.
The forward part of the above theorem was originally proved by Balas [2] and the con-
verse part by Jeroslow [14]. This theorem has also been independently proved by Glover [10]
in a somewhat different setting. The theorem merely states that given a disjunction x€ U Sh,
one may generate a valid cut (1.2) by specifying any nonnegative values for the vectors kh,
h£H. The versatility of the latter choice is apparent from the converse which asserts that so
long as we can identify and delete any inconsistent systems, Sh, /?€//, then given any valid cut
tt'x > 7r0, we may generate a cut of the type (1.2) by suitably selecting values for the parame-
ters k'\ /;€// such that for any x belonging to the nonnegative orthant of R", if (1.2) holds
then we must have tt'x ^ n0. In other words, we can make a cut of the type (1.2) uniformly
dominate any given valid inequality or cut. Thus, any valid inequality is either a special case of
GENERATION OF DEEP DISJUNCTIVE CUTTING PLANES 455
(1.2) or may be strictly dominated by a cut of type (1.2). In this connection, we draw the
reader's attention to the work of Balas [1] in which several convexity/intersection cuts dis-
cussed in the literature are recovered from the fundamental disjunctive cut. Note that since the
inequality (1.2) defines a closed convex set, then for it to be valid, it must necessarily contain
the polyhedral set
(1.3) S = convex hull of U Sh.
Hence, one may deduce that a desirable deep cut would be a facet of S, or at least would sup-
port it. Indeed, Balas [3] has shown how one may generate with some difficulty cuts which
contain as a subset, the facets of S when \H\ < °°. Our approach to developing deep disjunc-
tive cuts will bear directly on Theorem 1. Specifically, we will be indicating how one may
specify values for parameters \h to provide supports of S, and will discuss some specific criteria
for choosing among supports. We will be devoting our attention to the following two disjunc-
tions titled DC1 and DC2. We remark that most disjunctive statements can be cast in the for-
mat of DC2. Disjunction DC1 is a special case of disjunction DC2, and is discussed first
because it facilitates our presentation.
DC1:
Suppose that each systems Sh is comprised of a single linear inequality, that is, let
(1.4) Sh =
*:£ a\jXj > b\, x > 0
7=1
for /?€//= {1, .... h)
where we assume that h = \H\ < °o and that each inequality in Sh, h£H\s stated with the ori-
gin as the current point at which the disjunctive cut is being generated. Then, the disjunctive
statement DC1 is that at least one of the sets Sh, h€H must be satisfied. Since the current
point (origin) does not satisfy this disjunction, we must have b'{ > 0 for each h € H. Further,
we will assume, without loss of generality, that for each /?€//, a\j > 0 for some
j € {1, . . . , n) or else, S,, is inconsistent and we may disregard it.
DC2:
Suppose each system S,, is comprised of a set of linear inequalities, that is, let
(1.5) Sh =
x : £ a1,] Xj ^ b-1 for each / 6 Qh , x ^ 0
7=1
for /?€//= {1,
where Qh, h€H are appropriate constraint index sets. Again, we assume that h = \H\ < oo
and that the representation in (1.5) is with respect to the current point as the origin. Then, the
disjunctive statement DC2 is that at least one of the sets Sh, h€H must be satisfied. Although
it is not necessary here for bj' > 0 for all i 6 Qh one may still state a valid disjunction by delet-
ing all constraints with bj' ^ 0, i£Qh from each set Sh, /?€//. Clearly a valid cut for the
relaxed constraint set is valid for the original constraint set. We will thus obtain a cut which
possibly is not as strong as may be derived from the original constraints. To aid in our develop-
ment, we will therefore assume henceforth that b,1' > 0, /€£)/,, h€H.
Before proceeding with our analysis, let us briefly comment on the need for deep cuts.
Although intuitively desirable, it is not always necessary to seek a deepest cut. For example, if
one is using cutting planes to implicitly search a feasible region of discrete points, then all cuts
which delete the same subset of this discrete region may be equally attractive irrespective of
their depth relative to the convex bull of this discrete region. Such a situation arises, for exam-
ple, in the work of Majthay and Whinston [16]. On the other hand, if one is confronted with
456 H.D. SHERALI AND CM. SHETTY
the problem of iteratively exhausting a feasible region which is not finite, as in [20] for exam-
ple, then indeed deep cuts are meaningful and desirable.
2. DEFINING SUITABLE CRITERIA FOR EVALUATING THE DEPTH OF A CUT
In this section, we will lay the foundation for the concepts we propose to use in deriving
deep cuts. Specifically, we will explore the following two criteria for deriving a deep cut:
(i) Maximize the euclidean distance between the origin and the nonnegative region
feasible to the cutting plane
(ii) Maximize the rectilinear distance between the origin and the nonnegative region
feasible to the cutting plane.
Let us briefly discuss the choice of these criteria. Referring to Figure 1(a) and (b), one
may observe that simply attempting to maximize the euclidean distance from the origin to the
cut can favor weaker over strictly stronger cuts. However, since one is only interested in the
subset of the nonnegative orthant feasible to the cuts, the choice of criterion (i) above avoids
such anamolies. Of course, as Figure Kb) indicates, it is possible for this criterion to be unable
to recognize dominance, and treat two cuts as alternative optimal cuts even through one cut
dominates the other.
Let us now proceed to characterize the euclidean distance from the origin to the nonnega-
tive region feasible to a cut
(2.1) ]£ ZjXj ^ z0, where z0 > 0, z, > 0 for some j€{l, ... ,
7=1
The required distance is clearly given by
n
(2.2) 6e = minimum {||x||: ]£ ZjXj ^ z0, x ^ 0}.
7=1
Consider the following result.
LEMMA 1: Let 0e be defined by Equations (2.1) and (2.2). Then
(2.3) ee = ^n-
Ibll
where,
(2.4) y - (ylt . . . , y„), vy = maximum {0, z,}, j = \, ... , n.
zo
PROOF: Note that the solution x* =
— . Moreover, ior any x ieasioie to u.4J, we nave, z0 ^ ^ ZjXj ^
7=1
, y is feasible to the problem in (2.2) with
LMI2 '
r—rr. Moreover, for any x feasible to (2.2), we have, z0 < £ ZjXj < £ yjXj ^
Lyll J=\ ' 7=i
1 1 v 1 1 | |x 1 1 , or that, | |x 1 1 ^ -n — rr- This completes the proof.
lb 1 1
Now, let us consider the second criterion. The motivation for this criterion is similar to
that for the first criterion and moreover, as we shall see below, the use of this criterion has
GENERATION OF DEEP DISJUNCTIVE CUTTING PLANES
457
intuitive appeal. First of all, given a cut (2.1), let us characterize the rectilinear distance from
the origin to the nonnegative region feasible to this cut. This distance is given by
(2.5)
9r = minimum {|x|: £ ZjXj ^ z0, x > 0}, when |x| £ xr
Consider the following result.
/-i
Criterion values
Figure 1. Recognition of dominance
Criterion value
for either cut
LEMMA 2: Let 9r be defined by Equations (2.1) and (2.5). Then,
(2.6) 9r = — where zm = maximum z,.
zm J=x "
z0
PROOF: Note that the solution x* = (0, ... , — , ...0), with the m th component
being non-zero, is feasible to the problem in (2.5) with |x*| = — . Moreover, for any x feasi-
ble to (2.5), we have,
This completes the proof.
Zn n Z n
Note from Equation (2.6) that the objective of maximizing Gr is equivalent to finding a
cut which maximizes the smallest positive intercept made on any axis. Hence, the intuitive
appeal of this criterion.
3. DERIVING DEEP CUTS FOR DC1
It is very encouraging to note that for the disjunction DC1 we are able to derive a cut
which not only simultaneously satisfies both the criterion of Section 2, but which is also a facet
! of the set S of Equation (1.3). This is a powerful statement since all valid inequalities are given
through (1.2) and none of these can strictly dominate a facet of S.
458 H.D. SHERALI AND CM. SHETTY
We will find it more convenient to state our results if we normalize the linear inequalities
(1.4) by dividing through by their respective, positive, right-hand-sides. Hence, let us assume
without loss of generality that
(3.1)
x:£ a'ijXj > 1, x > 0
7-1
for /?€//= {1, .... /?}.
Then the application of Theorem 1 to the disjunction DC1 yields valid cuts of the form:
n
(3.2) £
7-1
max kl'a'ii
hZH J
Xi ^ min {\/'}
J hZH .
where A [', h € H are nonnegative scalars. Again, there is no loss of generality in assuming that
(3.3) £ A/'= 1, A," ^ 0, /;€// = {1, .... h)
since we will not allow all A/', /?6// to be zero. This is equivalent to normalizing (3.2) by
dividing through by ]T A/'.
Theorem 2 below derives two cuts of the type (3.2), both of which simultaneously
achieve the two criteria of the foregoing section. However, the second cut uniformly dominates
the first cut. In fact, no cut can strictly dominate the second cut since it is shown to be a facet
of 5 defined by (1.3).
THEOREM 2: Consider the disjunctive statement DC1 where Sh is defined by (3.1) and is
assumed to be consistent for each /;€//. Then the following results hold:
(a) Both the criteria of Section 2 are satisfied by letting A/' = A ('* where
(3.4) A/'*= \/h for /?€//
in inequality (3.2) to obtain the cut
n
(3.5) y, a\jXj ^ 1, where a'Xj = max a1],, forj = 1, ... , n.
(b) Further, defining
(3.6) y('= minimum [a'j/a'ij} > 0, h£H
j:a'{j>0
and letting A/' = A/'", where
(3.7) A,/'"=y17£ yf for/?e/Y
p<LH
in inequality (3.2), we obtain a cut of the form
(3.8) T a[* x, ^ 1, where a** '= max a\, y/'fory =1, ... n
/-i /,e//
which again satisfies both the criteria of Section 2.
(c) The cut (3.8) uniformly dominates the cut (3.5); in fact,
= a \j if a \j > 0
(3.9) a,*,* ' ^ • r • ^ n , j = 1. .. , n.
" (< a,, if ai, < 0
GENERATION OF DEEP DISJUNCTIVE CUTTING PLANES 459
(d) The cut (3.8) is a facet of the set S of Equation (1.3).
PROOF:
(a) Clearly, \/'= \/h, h€H leads to the cut (3.5) from (3.2). Now consider the
euclidean distance criterion of maximizing 9e(or &l) of Equation (2.3). For cut (3.5), the
value of 9] is given by
(3.10) (O2 = l/Z (y/)2 > 0 where yj = maxfO.af,}, J = 1, .... n.
Now, for any choice X /', /?€//,
(3.11)
02 =
min(X/0
'T,y?-bpyi:yj2. say.
where ^ = max{0,max \('a'{j}. If Xf = 0, then 9e = 0 and noting (3.10), such a choice of
h€H
parameters X/', h€H is suboptimal. Hence, Xf > 0, whence (3.11) becomes 6>e2 = l/]£
But since (X/'/Xf) > 1 for each /?€//, we get
-V/
K\ )
yJ\P = max
1
A,*
0, max
^ max
/;€//
hi
0, maxaj',
/;€//
= •>>/•
Thus 0,? < (0/)2 so that the first criterion is satisfied.
Now consider the maximization of 9r of Equation (2.5), or equivalently Equation (2.6).
For the choice (3.4), the value of 9r is given by
1
(3.12)
> 0.
max a ly
j
Now, for any choice x/', h€H, from Equations (2.6), (3.2) we get
9r = [minX/i /[max max X/'oi',-] = Xf /max max X/'ai',-, say.
l/ie// J/ I 7 /!€// J) '/ y /i 6// J
As before, Xf = 0 implies a value of 9r inferior to 9*. Thus, assume Xf > 0. Then, 9r =
Oiy. But (x/'/Xf) > 1 for each h£H and in evaluating 0,., we are interested
1/ max max
j hZH
X/<
Af
only in those y€{l, ...,«} for which a('y > 0 for some //€//. Thus, 9r < 1/max max aj
0,, so that the second criterion is also satisfied. This proves part (a).
j hZH
(b) and (c). First of all, let us consider the values taken by y/1, /?€//. Note from the
assumption of consistency that yj\ h€H are well defined. From (3.5), (3.6), we must have
y\ ^ 1 for each /?€//. Moreover, if we define from (3.5)
(3.13) H*= {/?€//: a'{k= a\k > 0 for some A; 6 {1, .... n}}
then clearly H*^ {<£} and for h£H*, Equation (3.6) implies y," < 1. Thus,
460
H.D. SHERALI AND CM. SHETTY
(3.14)
Hence,
(3.15)
yf
= 1 for h€H*
> 1 for h $H*.
min y i = 1
hZH
or that, using (3.7) in (3.2) yields a cut of the type (3.8), where,
(3.16)
a i , ■ = max a , . • y , , /' = 1 , . . . , n.
Now, let us establish relationship (3.9). Note from (3.5) that if a\, ^ 0, then a\-, < 0
for each /?€//and hence, using (3.14), (3.16), we get that (3.9) holds. Next, consider a*, > 0
for some y € { 1 , ..., n). From (3.13), (3.14), (3.16), we get
(3.17) a[\*= max{maxai\, max af,y/'}
h€H h£H*
afj > 0
where we have not considered /?#//* with a^ ^ 0 since a** > 0. But for h$H*mlh a\, > 0,
we get from (3.5), (3.6)
(3.18)
„h ,, h _ h
min
k:a'{k>Q
w*xa[k
< at
a\k
max a i ,
«fc
= maxoi,.
Using (3.18) in (3.17) yields a** = a'n which establishes (3.9).
Finally, we show that (3.8) satisfies both the criteria of Section 2. This part follows
immediately from (3.9) by noting that the cut (3.5) yields 9e = 9* of (3.10) and 0, = 9* of
(3.12). This completes the proofs of parts (b) and (c).
(d) Note that since (3.8) is valid, any x€S satisfies (3.8). Hence, in order to show that
(3.8) defines a facet of S, it is sufficient to identify n affinely independent points of 5 which
satisfy (3.8) as an equality, since clearly, dim S = n. Define
(3.19) 7, = L/€{1 n\: a" > 0} and let J2 - {1, .... n) - /,.
Consider any p€J\, and let
(3.20)
ep= (0 — 0), p£J,
\p
have the non-zero term in the p'h position. Now, since p£J\, (3.9) yields
a\n = a\D = max a\
a\'p, say,
Hence, ep€Sh and so, ep€S and moreover, ep satisfies (3.8) as an equality. Thus, ep, p€J\
qualify as \J\\ of the n affinely independent points we are seeking.
Now consider a q€J7. Let us show that there exists an Sh satisfying
h h .. ,
y," axi = aXp for some p€J\
GENERATION OF DEEP DISJUNCTIVE CUTTING PLANES 461
and
(3.21) y!qafi"q = a".
From Equation (3.16), we get ai* = max a\a yP= a\qayx\ say. Then for this //„€//, Equation
h<zH
(3.6) yields y,'" = minimum {<**,■/«,'«} = a'p/a^p, say. Or, using (3.9), y,'" a\"p = a'p - a" >
0. Thus (3.21) holds. For convenience, let us rewrite the set Sh below as
(3.22) fy = U: af'x, + a?<x9 + £ afjx, > 1, x > 0}.
Now, consider the direction
(0, .... -Jr-. -4r 0) lf«,7< 0
(0, . . . , 0, . . . , A , . . . , 0) if a '* = 0
(3.23) da =
where A > 0. Let us show that dq is a direction for Sh . Clearly, if a\q = 0, then from (3.21)
a{q = 0 and thus (3.22) establishes (3.23). Further, if a]q < 0 then one may easily verify
from (3.21), (3.22), (3.23) that
ep = (0 y\"la"p , ... , 0) 6 Sh and ep + Sty,''"^] € Sh for each 8^0
where ep has the non-zero term at position p. Thus, dq is a direction for Sh . It can be easily
shown that this implies dq is a direction for S. Since ep = (0, . . . , — —, . . . , 0) of Equation
. (3.20) belongs to S, then so does (ep + dq). But (ep + dq) clearly satisfies (3.8) as an equality.
Hence, we have identified n points of S, which satisfy the cut (3.8) as an equality, of the type
■
e„= (0, ... , -^7 , ... , 0) for/^7,
<*\P
eq = dq + ep for some p€J\, for each (?€72
where dq is given by (3.23). Since these n points are clearly affinely independent, this com-
pletes the proof.
It is interesting to note that the cut (3.5) has been derived by Balas [2] and by Glover [9,
Theorem 1]. Further, the cut (3.8) is precisely the strengthened negative edge extension cut of
Glover [9, Theorem 2]. The effect of replacing A./'* defined in (3.4) by A./'" defined in (3.7) is
equivalent to the translation of certain hyperplanes in Glover's theorem. We have hence
shown through Theorem 2 how the latter cut may be derived in the context of disjunctive pro-
gramming, and be shown to be a facet of the convex hull of feasible points. Further, both
(3.5) and (3.8) have been shown to be alternative optima to the two criteria of Section 2.
In generalizing this to disjunction DC2, we find that such an ideal situation no longer
exists. Nevertheless, we are able to obtain some useful results. But before proceeding to DC2,
let us illustrate the above concepts through an example.
462 H.D. SHERALI AND CM. SHETTY
EXAMPLE: Let H = {1,2}, n = 3 and let DC1 be formulated through the sets
5, = {x: x, + 2x2 - 4x3 ^ 1, x ^ 0},S2 = {x: -y- + -y- - 2x3 > 1, x ^ 0}.
The cut (3.5), i.e., Eaf/X,- $5 1, is *i + 2x2 - 2x3 > 1. From (3.6),
y{ = min
1 1
1 ' 2
= 1 and y( = min
2 _
1 2
1/2' 1/3
2.
Thus, through (3.7), or more directly, from (3.16), the cut (3.8), i.e., I a'Jxj ^ 1 is
X] + 2x2 — 4x3 ^ 1. This cut strictly dominates the cut (3.5) in this example, though both
have the same values 1/V5 and 1/2 respectively for 9e and 9r of Equations (2.2) and (2.5).
4. DERIVING DEEP CUTS FOR DC2
To begin with, let us make the following interesting observation. Suppose that for con-
venience, we assume without loss of generality as before, that b^ = 1, i€Qh, //€//in Equation
(1.4). Thus, for each /?€//, we have the constraint set
(4.1)
x: 5>/;x; > 1, i£Qh, x > 0
M
Now for each /?€//, let us multiply the constraints of 5/, by corresponding scalars 8/' ^ 0, i€Qh
and add them up to obtain the surrogate constraint
(4.2)
I
7=1
Xj > £ 8,*, /?€#.
Further, assuming that not all 8/' are zero for /€ (?,,, (4.2) may be re-written as
(4.3)
I
7=1
I
'6 0*
8/'
1
a''
x, > 1, /?€//.
Finally, denoting 8/' /£ 8 * by X,/? for i€Qh, h € //, we may write (4.3) as
(4.4)
where,
(4.5)
z
/-I
I *,*flj
'€<?*
Xf ^ 1 for each /? € //
X X,*- 1 for each //€//, X," ^ 0 for /€&, h€H.
Observe that by surrogating the constraints of (4.1) using parameters X/', /€Q/7, /? € // satisfying
(4.5), we have essentially represented DC2 as DC1 through (4.4). In other words, since x£Sh
implies x satisfies (4.4) for each /?€//, then given X,/(, /€£>,,, h£H, DC2 implies that at least
one of (4.4) must be satisfied. Now, whereas Theorem 1 would directly employ (4.2) to derive
a cut, since we have normalized (4.2) to obtain (4.4), we know from the previous section that
the optimal strategy is to derive a cut (3.8) using inequalities (4.4).
Now let us consider in turn the two criteria of Section 2.
GENERATION OF DEEP DISJUNCTIVE CUTTING PLANES
463
4.1. Euclidean Distance-Based Criterion
Consider any selection of values for the parameters \/\ /€£?/,, h €// satisfying (4.5) and
let the corresponding disjunction DC1 derived from DC2 be that at least one of (4.4) must
hold. Then, Theorem 2 tells us through Equations (3.5), (3.10) that the euclidean distance cri-
terion value for the resulting cut (3.8) is
(4.6)
where,
(4.7)
and
(4.8)
9Ak)= 1
yj = max{0, z,}, j = 1,
max
,j=l
Thus, the criterion of Section 2 seeks to
(4.9) maximize {9e(\):X= (A/0 satisfies (4.5)}
or equivalently, to
(4.10) minimize {£v,2: (4.5), (4.7), (4.8) are satisfied}.
7=1
It may be easily verified that the problem of (4.10) may be written as
(4.11)
(4.12)
(4.13)
(4.14)
PD,:
minimize £ yj
y=i
subject to }) ^ £ A /' a'j for each h£H for each j = 1 ,
£ A/' = 1 for each //€//
a/' > 0 i£Qhl heH
Note that we have deleted the constraints vy ^ 0, j = 1, ... , « since for any feasible A/',
/€()/,, /7 6//, there exists a dominant solution with nonnegative yj = j = 1, ... , n. This relax-
ation is simply a matter of convenience in our solution strategy.
Before proposing a solution procedure for Problem PD2, let us make some pertinent
remarks. Note that Problem PD2 has the purpose of generating parameters \/', /€(?/,, /?€//
which are to be used to obtain the surrogate constraints (4.4). Thereafter, the cut that we
derive for the disjunction DC2 is the cut (3.8) obtained from the statement that at least one of
(4.4) must hold. Hence, Problem PD2 attempts to find values for A/', i€Qh, /?€//, such that
this resulting cut achieves the euclidean distance criterion.
Problem PD2 is a convex quadratic program for which the Kuhn-Tucker conditions are
both necessary and sufficient. Several efficient simplex-based quadratic programming pro-
cedures are available to solve such a problem. However, these procedures require explicit han-
dling of the potentially large number of constraints in Problem PD2. On the other hand, the
464
H D SHERAL1 AND CM. SHETTY
subgradient optimization procedure discussed below takes full advantage of the problem struc-
ture. We are first able to write out an almost complete solution to the Kuhn-Tucker system.
We will refer to this as a partial solution. In case we are unable to either actually construct a
complete solution or to assert that a feasible completion exists, then through the construction
procedure itself, we have a subgradient direction available. Moreover, this latter direction is
very likely to be a direction of ascent. We therefore propose to move in the negative of this
direction and if necessary, project back onto the feasible region. These iterative steps are now
repeated at this new point.
4.1.1 Kuhn- Tucker Systems for PD2 and Its Implications
Letting uf, /?€//, j = 1, ... , n denote the lagrangian multipliers for constraints (4.12),
th, h€H those for constraints (4.13), and wj', /€(),,, h€H those for constraints (4.14), we may
write the Kuhn-Tucker optimality conditions as
(4.15)
(4.16)
(4.17)
(4.18)
(4.19)
(4.20)
I uf-ty 7-1 n
n
2^ uj'alj + th - w/' = 0 for each i€Qh, and for each h€H
7=1
V ■
0 for each j = 1 , . . . , n and each h € H
\/'w/'= Ofor /€£>,,, /?€/¥
wj1 ^ 0 /€(?,,, h€H
uf> Oy- 1, ... , n, h£H.
Finally, Equations (4.12), (4.13), (4.14) must also hold. We will now consider the implications
of the above conditions. This will enable us to construct at least a partial solution to these con-
ditions, given particular values of \/', i€Qh, /?€//. First of all, note that Equations (4.7),
(4.10) and (4.20) imply that
(4.21)
(4.22)
yj ^ 0 for each j = 1 , ...
>j = maxjo, £ kfajj, /?€//
for j = I, ... , n.
Now, having determined values for yjr j = 1, . . . , n, let us define the sets
{0} ifjy-0
(4.23)
Hj-
for j = 1, . . . , n.
[h€H:yj= £ *iai > 0}
Now, consider the determination of w/', /?€//, j =' 1, . . . , /j. Clearly, Equations (4.15), (4.17)
and (4.20) along with the definition (4.23) imply that for each j — I, ■ ■ ■ , n
(4.24) uf = 0 for h € H/Hj and that £ «/ - 2yr uf > 0 for each h € fl).
/)€//.
Thus, for any y'€{l, ... , «}, if //, is either empty or a singleton, the corresponding values for
uf, h € H are uniquely determined. Hence, we have a choice in selecting values for uf, h 6 Hj
GENERATION OF DEEP DISJUNCTIVE CUTTING PLANES
465
only when \Hj\ ^ 2 for any y€{l, ... , n). Next, multiplying (4.16) by \/' and using (4.18),
we obtain
(4.25)
I
+ % £ X/'= 0 for each /?€//.
'"6 0/,
w/' = £ i// [flj - j;y] for each /€£>,„ h e //.
Using Equations (4.13), (4.17), this gives us
(4.26) 'a - - £ "yV, for each /?€//.
Finally, Equations (4.16), (4.26) yield
(4.27)
Notice that once the variables uj', h€H, j = 1, . . . , « are fixed to satisfy (4.24), all the vari-
ables are uniquely determined. We now show that if the variables w/', / €(),,, h € H so deter-
mined are nonnegative, we then have a Kuhn-Tucker solution. Since the objective function of
PD2 is convex and the constraints are linear, this solution is also optimal.
LEMMA 2: Let a primal feasible set of a/', /€()/,, h€H be given. Determine values for
all variables v7, uj\ th, w'' using Equations (4.22) through (4.27), selecting an arbitrary solution
in the case described in Equation (4.24) if \Hj\ > 2. If w/' > 0, KQh, /;€//, then A/', /€£>,,,
h £H solves Problem PD2.
PROOF: By construction Equations (4.12), through (4.17), and (4.20) clearly hold.
Thus, noting that in our problem the Kuhn-Tucker conditions are sufficient for optimality, all
we need to show is that if w = (h>/0 ^ 0 then (4.18) holds. But from (4.17) and (4.27) for
any /; € H, we have,
''€0/, /eo„
£ «/ [^ -
-^
= 1
"7
I a/'4-
-#
,/=!
7=1
'€G/,
= 0
for each /?€//. Thus, A/' ^ 0, w/' ^ 0 /€£>,,, /?€// imply that (4.18) holds and the proof is
complete.
The reader may note that in Section 4.1.4 we will propose another stronger sufficient con-
dition for a set of variables A./', /€£?/,, /?€//to be optimal. The development of this condition
is based on a subgradient optimization procedure discussed below.
4.1.2 Subgradient Optimization Scheme for Problem PD
For the purpose of this development, let us use (4.22) to rewrite Problem PD2 as follows.
First of all define
(4.28) A ={\ = (a/0: constraints (4.13) and (4.14) are satisfied )
and let /: A — R be defined by
(4.29)
/0O = £
7=1
maximum
0, £ a," 4, /?€//
'6G/,
466
H.D. SHERALI AND CM SHETTY
Then, Problem PD2 may be written as
minimize {/(X): A. € A).
Note that for each j = 1, . . ., n, gj(X) — max {0, Z Xfoy, h€H) is convex and nonnegative.
Thus, [g,(X)]2 is convex and so/(X) = Z [gj(X)]2 is also convex.
y=i
The main thrust of the proposed algorithm is as follows. Having a solution X at any stage,
we will attempt to construct a solution to the Kuhn-Tucker system using Equations (4.15) :
through (4.20). If we obtain nonnegative values iv* for the corresponding variables w'\ /'€Q/M
h€H, then by Lemma 2 above, we terminate. Later in Section 4.1.7, we will also use another
sufficient condition to check for termination. If we obtain no indication of optimality, we con- j
tinue. Theorem 3 below established that in any case, the vector w = w constitutes a subgra- i
dient of /(•) at the current point X. Following Poljak [18,19], we hence take a suitable step in
the negative subgradient direction and project back onto the feasible region A of Equation
(4.28). This completes one iteration. Before presenting Theorem 3, consider the following
definition.
DEFINITION 1: Let/: A — R be a convex function and let X € AC Rm. Then f € Rm '
is a subgradient of /(•) at X if
fix) ^ f(X) + £' (X - X) for each X € A.
THEOREM 3: Let X be a given point in A defined by (4.28) and let w be obtained from
Equations (4.22) through (4.27), with an arbitrary selection of a solution to (4.24).
_
Then, w is a subgradient of /(•) at X, where /:A — - R is defined in Equation (4.29).
PROOF. Let y and y be obtained through Equation (4.22) from X € A and X € A respec-
tively. Hence,
/<X)-£ y2mdf(X) = £y2.
Thus, from Definition 1, we need to show that
(4.30) £ £ wj1 (X/' - X?) < £ y} - £ y2.
hZH KQh 7=1 7=1
Noting from Equations (4.17), (4.27) that Z Z w/'x/'= 0, we have,
hZHitQ,,
X £ ^<x*-x,*>- £ £ »*a*- £ £ Z-yVW-Jvi
hdH <£Qh hZH i$Qh h€H i€Qh j=\
Z I */
/! € // /= 1
Z x,X
'■6 0*
z z
/) 6 W /= I
ufy, Z X/
Using (4.13) and (4.15), this yields
t>>
z *M
-2
- 2 z yj
/=i
GENERATION OF DEEP DISJUNCTIVE CUTTING PLANES
467
Combining this with (4.30), we need to show that
(4.31)
Z Z «/'
ye//y-i
Z Vol}
< Z yf + Z #
7-1 7=1
But Equations (4.15), (4.20), (4.22) imply that
IH
Z *M
hQm
/i 6 //v=l 7=1
UN < ItvlP+IUIP
heHj-l
so that Equation (4.31) holds. This completes the proof.
Although, given X € A, any solution to Equations (4.22) through (4.27) will yield a
subgradient of /(•) at the current point X, we would like to generate, without expending much
effort, a subgradient which is hopefully a direction of ascent. Hence, this would accelerate the
cut generation process. Later in Section 4.1.6 we describe one such scheme to determine a
suitable subgradient direction. For the present moment, let us assume that we have generated
a subgradient w and have taken a suitable step size 9 in the direction — w as prescribed by the
subgradient optimization scheme of Held, Wolfe, and Crowder [12]. Let
(4.32)
X = X - 9 w
be the new point thus obtained. To complete the iteration, we must now project X into A, that
is, we must determine a new X according to
(4.33)
Xnew = PX(X) = minimum [| |x — X||: X € A}.
The method of accomplishing this efficiently is presented in the next subsection.
4.1.3 Projection Scheme
For convenience, let us define the following linear manifold
(4.34)
Mh = ,
, /?€//
and let Mh be the intersection of Mh with the nonnegative orthant, that is,
(4.35) Mh - {x/\ /€&: Z */*- 1. */' > °. '€&}.
Note from Equation (4.28) that
(4.36) A = Mx x ... x M\H\.
Now, given X, we want to project it onto A, that is, determine Xnew from Equation (4.33).
Towards this end, for any vector a = (a,, /€/), where /is a suitable index set for the |/| com-
ponents of a, let P(a,I) denote the following problem:
(4.37)
PiaJY.
minimize
|Z (X,-a,)2:Z^= 1. h > 0- '€■
Then to determine Xnew, we need to find the solutions (X/,'e(t,),, /€0/, as projections onto Mh of
,= h
= h
X = (X,, i€Qh) through each of the \H\ separable Problems P(X , Qh). Thus, henceforth in
this section, we will consider only one such /?€//. Theorem 4 below is the basis of a finitely
convergent iterative scheme to solve Problem P(X , Qh).
468 H.D. SHERALI AND CM. SHETTY
THEOREM 4: Consider the solution of Problem P(J3k, Ik), where 0* = (/3f, /€/*), with
|/J ^ 1. Define
(4.38) Pk
1-1/3/
\h
and let
(4.39) pk = pk+{pk)lk
where 4 denotes a vector of \lk\ elements, each equal to unity. Further, define
(4.40) 4+1= {/€/*: j§f > 0).
Finally, let/3^1 defined below be a subvector of/?*,
(4.41) 0*+1= (j3k+], ieik+l)
where, pk+x = JS,^, r€4+1. Now suppose that^+1 solves P(j8fc+1, /*+,).
(a) If/3* ^ 0, then/3* solves P(J3k, Ik).
(b) Ifp _> 0, then /3 solves />(/3A:, 4), where /3 has components given by
J3k+\ if /€4+1 for each /€/*.
0 otherwise
(4.42) /3, =
PROOF: For the sake of convenience, let RP(a,I) denote the problem obtained by
relaxing the nonnegativity restrictions in P(a,I). That is, let
RP(a,I): minimize
J £ (X, - a,)2: £ X, - 1
First of all, note from Equations (4.38), (4.39) that /3* solves RP(J3k, lk) since fik is the projec-
tion of (lk onto the linear manifold
(4.43)
\= (\„ /€/*): £ \,= l|
which is the feasible region of RP(J3k, Ik). Thus, /3* ^ 0 implies that /3* also solves P(j3*, 4)-
This proves part (a).
Next, suppose that /3A ' > 0. Observe that /3 is feasible to P(fik, Ik) since from (4.42), we
get /3 ^ 0 and £ /3, = £ /3*+1 = 1 as /3*+1 solves P(fik+l, 4+1).
Now, consider any A. = (X„ /€/*) feasible to P(/3k, Ik). Then, by the Pythagorem
Theorem, since ftk is the projection of pk onto (4.43), we get
||\-/8*||2 = ||\-J8*||2 + ||J3*-/3*||2.
Hence, the optimal solution to P(fik, Ik) is also_optimal to P(fik, Ik). Now, suppose that we
can show that the optimal solution to Problem P(JBk, Ik) must satisfy
(4.44) X, = 0 for <?4+1.
Then, noting (4.41), (4.42), and using the hypothesis that J3k+l solves P(fik+], Ik+\), we will
have established part (b). Hence, let us prove that (4.44) must hold. Towards this end, con-
sider the following Kuhn-Tucker equations for Problem P(fik, Ik) with / and wn i€lk as the
appropriate lagrangian multipliers:
GENERATION OF DEEP DISJUNCTIVE CUTTING PLANES 469
(4.46) £ X, = 1, a, ^ 0 for each i£lk
(4.47) (X, - /§/*) + t - w, = 0 and w, ^ 0 for each /€/*
(4.48) X , w, = 0 for each i € 7A .
Now, since £ J8* = 1, we get from (4.45), (4.46) that
/e/t
^ 0.
But from (4.46), (4.47), and (4.48) we get for each /€/*,
0= w,X, = X,(X, + t-pf)
which implies that for each /'€/*, we must have,
either X, = 0, whence from (4.46), w, ■ = t — /3* must be nonnegative
or X, = J3* - /, whence from (4.46), w, = 0.
In either case above, noting (4.45), if /3* < 0, that is, if i&Ik+\, we must have X; = 0. This
completes the proof.
Using Theorem 4, one may easily validate the following procedure for finding k„ew of
Equation (4.33), given Xh. This procedure has to be repeated separately for each /?€//.
Initialization
Set k = 0, /3° = \\ I0 = Qh. Go to Step 1.
Step 1
Given pk, Ik, determine pk and ~jik from (4.38), (4.39). If /3* ^ 0, then terminate with
KJIew having components given by
pfi£iak
0 otherwise.
(\ " ) =
v/x new ' i
Otherwise, proceed to Step 2.
Step 2
Define Ik+\, (3k+l as in Equations (4.40), (4.41), increment k by one and return to Step 1.
Note that this procedure is finitely convergent as it results in a strictly decreasing, finite
sequence \lk\ satisfying \lk\ ^ 1 for each k, since £ j3k = 1 for each k.
EXAMPLE: Suppose we want to project X = (-2,3,1,2) on to A c R4. Then the above
procedure yields the following results.
Initialization
k = 0, 0° = (-2,3,1,2), 70= {1,2,3,4}.
470
H.D. SHERALI AND CM. SHETTY
Step 1
P0--3/4, j§° =
n 9 ] 5
4 ' 4' 4' 4
Step 2
ft: — 1. /x — {2,3,4}, /31
111
4' 4' 4
Step 1
1 _2 1
3 ' 3' 3
Step 2
/c = 2, /2= {2,4}, |82 =
1 1
3' 3
Step 1
1 o2
P2--J. P - (1.0) ^ 0
Thus, X* - (0,1,0,0).
4. /. 4 >4 Second Sufficient Condition for Termination
_ As indicated earlier in Section 4.1.2, we will now derive a second sufficient condition on w
for A. to solve PD2. For this purpose, consider the following lemma:
LEMMA 3: Let \ € A be given and suppose we obtain w using Equations (4.22) through
(4.27). Let w solve the problem.
PRh: minimize
\ £ (w,*- w/')2: £ w/'= 0, vv/' < 0 for l€JA for each htH
where,
(4.49) 7„ = {/€£?„: A/' = 0}, /;€//.
Then, if w = 0, A. solves Problem PD2.
PROOF. Since w = 0 solves /V?/,, /?€//, we have for each /?€//,
(4.50) £ (w/')2< £ (w/'~ w/')2
for all w/', /€Q/, satisfying £ w/*= 0, w/' < 0 for /'€/,,. Given any X € A and given any
/jl > 0 define,
(4.51) w/'= (\/'-X/')/At, /€(?;„ /?£//.
GENERATION OF DEEP DISJUNCTIVE CUTTING PLANES 471
Then, £ w/' = 0 for each h£H and since X," = 0 for /€/,,, /?€//, we get vv/' < 0 for i£Jh,
/;€//. Thus, for any A. € A, by substituting (4.51) into (4.50), we have,
(4.52) n2 Z (wfi2< £ (X/*-X,* + Atw/)2 for each /?€//.
But Equation (4.52) implies that for each /?€//, A.'' = X7' solves the problem
i
minimize
£ [\/'- (a/'-mw/')]2: £ X/' = 1, a," > 0 ,€£
for each /; € //.
In other words, the projection /^(X — Vv/a) of (X — vv/a) onto A is equal to X for any ix = 0.
In view of Poljak's result [18,19], since vv is a subgradient of /(■) at X, then X solves PD2.
This completes the proof.
Note that Lemma 3 above states that if the "closest" feasible direction — w to — vv is a zero
vector, then X solves PD2. Based on this result, we derive through Lemma 4 below a second
sufficient condition for X to solve PD2.
LEMMA 4: Suppose vv = 0 solves Problems PR,,, /;€// as in Lemma 3. Then for each
h € H, we must have
(4.53) (a) Wj1 = t,,, a constant, for each i$Jh
(b) wj' < th for each i£Jh
where J,, is given by Equation (4.49).
PROOF: Let us write the Kuhn-Tucker conditions for Problem PR,,, for any /?€//. We
obtain
(vv/'- iv/9 + fA- Ofor /<T/A
(w/' - vv/') + r/; - w," = 0 for iej,,
U-' ^ 0, /' € //, , «/' Wj1 =0 /€//,, ?,, unrestricted
£ w/' = 0, w/' ^ Ofor i£Jh.
'€0/,
If w = 0 solves P^/? , /?€//, then since PR,, has a convex objective function and linear con-
straints, then there must exist a solution to
vv/' = t,, for each / GJ,,
and
«/'= {th - vv/') ^ 0 for each i£J,,.
This completes the proof.
Thus Equation (4.53) gives us another sufficient condition for X to solve PD2. We illus-
trate the use of this condition through an example in Section 4.1.7.
4.1.5 Schema of an Algorithm to Solve Problem PD2
The procedure is depicted schematically below. In block 1, an arbitrary or preferably, a
good heuristic solution X € A is sought. For example, one may use X/' = \/\Q/, I for each
/€(),,, for h€H. For blocks 4 and 6, we recommend the procedural steps proposed by Held,
Wolfe and Crowder [12] for the subgradient optimization scheme.
472
H.D. SHERALI AND CM. SHETTY
For j = 1 n
determine yn
Uj, h € H. using
Equations (4.22),
(4.24). Hence,
determine w from
Equation (4.27)
3
Yes
Is w > 0 or
does w satisfy
Equation (4.53)'
No
Terminate with K
as an optimal solution
to PDi
Select 0
and let
\ = X - flw
5
Replace_
X by PU)
of Equations
(4.33)
Is a suitable
subgradient
optimization
termination
criterion
satisfied?
Yes
Terminate with A.
as an estimate of an
optimal solution to
PD,
No
4.1.6 Derivation of a Good Subgradient Direction
In our discussion in Section 4.1.1, we saw that given a X € A of Equation (4.28), we were
able to uniquely determine yh j = 1, ... , n through Equation (4.22). Thereafter, once we
fixed values 5* for «/', j — I, ... , n, h€H satisfying Equation (4.24), we were able to uniquely
determine values for the other variables in the Kuhn-Tucker System using Equations (4.26),
(4.27). Moreover, the only choice in determining uj1, j = I, . . . , «, /?€// arose in case \Hj\ ^
2 for some j € (l, . . . , n) in Equation (4.25). We also established that no matter what feasible
values we selected for uf, j€ {l,
n), h€H, the corresponding vector w obtained was a
subgradient direction. In order to select the best such subgradient direction, we are interested
in finding a vector w which has _the smallest euclidean norm among all possible vectors
corresponding to the given solution \ € A. However, this problem is not easy to solve. More-
over, since this step will merely be a subroutine at each iteration of the proposed scheme to
solve PD2, we will present a heuristic approach to this problem.
Towards this end, let us define for convenience, mutually exclusive but not uniquely
determined sets A//,,/?€//as follows:
(4.54) Nh C {y'€{l, ...,«}: heHj of Equation (4.23)}
(4.55) N, n Nj = {(f)} for any /, j€H and |J N„ - L/€{1, ... , n}:yj > 0}.
In other words, we take each j€[\, ... , n) which has 3>7 > 0, and assign it to some h£Hj,
that is, assign it to a set Nh, where h£Hj. Having done this, we let
(4.56)
"/
2yjifjeNh
0 otherwise for each ; e { 1, .
n), /?€//.
Note that Equation (4.56) yields values uf for «/, y€{l, . .. , «}, h£H which are feasible to
(4.24). Hence, having defined sets N,,, /;€//as in Equations (4.54), (4.55), we determine uf,
7"€{1, ... , «}, h£H through (4.56) and hence w through (4.27).
Thus, the proposed heuristic scheme commences with a vector w obtained through an
arbitrary selection of sets Nh, h 6 H satisfying Equations (4.54), (4.55). Thereafter, we attempt
to improve (decrease) the value of w'w in the following manner. We consider in turn each
y € { 1 , . . . , n) which satisfies \Hj\ ^ 2 and move it from its current set Nh , say, to another set
GENERATION OF DEEP DISJUNCTIVE CUTTING PLANES
473
Nh with h€Hj, h^ hj, if this results in a decrease w'w. If no such single movements result in
a decrease in w'w, we terminate with the incumbent solution w as the sought subgradient direc-
tion. This procedure is illustrated in the example given below.
4.1.7 Illustrative Example
The intention of this subsection is to illustrate the scheme of the foregoing section for
determining a good subgradient direction as well as the termination criterion of Section 4.1.4.
Thus, let // = {1,2}, n = 3, \QX\ = \Q2\ = 3 and consider the constraint sets
S,=
x: 2x\ — 3x2 + x3 ^ 1
— Xj + 2x2 + 3*3 ^ 1
3xj — x2 — x3 ^ 1
X\, x2, x3 > 0
and S2 =
x: 3xi — x2 — x3 ^ 1
2x] + x2 — 2x3 ^ 1
—X] + 3x2 + 3x3 ^ 1
x]t x2, x3 ^ 0
Further, suppose we are currently located at a point A with
\} = 0, \2' = 5/12, Xl = 7/12; \2 = 7/12, A22 - 0, \32 = 5/12.
Then the associated surrogate constraints are
(4.57)
4 1 2
— X| + — x2 + —x} ^ 1 for h = 1
4 2 2
—x, + — x2 + — x3 ^ 1 for h = 2.
Using Equations (4.22), (4.25), we find
y\= \ with//! = {1,2}, y2= j with//2= {2} and ^3 = j with //3 = {1,2}.
Note that the possible combinations of N] and N2 are as follows:
(i) #!= {1}, W2 = {2,3},
(ii) AT!- {0}, ^2= {1,2,3},
(Hi) #i= {1,3}, N2= {2}, and
(iv) ^ = {3}, /V2={1,2}.
A total enumeration of the values of u obtained for these sets through (4.56) and the
corresponding values for w are shown below.
NY N2
u},j€[l n)
wf, ieQ,„ heH
w'w
«/
u\
ul
u\
u\
u]
w/
w2
w3'
w2
w22
w32
{1} {2,3}
{0} {1,2,3}
{1,3} {2}
{3} {1,2}
8/3
0
8/3
0
0
0
0
0
0
0
4/3
4/3
0
8/3
0
8/3
4/3
4/3
4/3
4/3
4/3
4/3
0
0
16/9
0
20/9
-4/9
-56/9
0
-28/9
28/9
40/9
0
20/9
-20/9
-40/9
0
-20/9
20/9
-28/9
-4/3
4/9
20/9
56/9
0
28/9
-28/9
129.78
1.78
34.37
34.37
Thus, according to the proposed scheme, if we commence with Nx = {l}, N2 = {2,3}, then
picking j = 1 which has \Hj\ = 2, we can move 7=1 into N2 since 2€H\. This leads to an
improvement. As one can see from above, no further improvement is possible. In fact, the
474 H.D. SHERALI AND CM SHETTY
best solution shown above is accessible by the proposed scheme by all except the third case
which is a "local optimal".
We now illustrate the sufficient termination condition of Section 4.1.4. The vector w
A-l . _ h=\ h=l
obtained above is (0,0,0|0, -4/3, 0). Further the vector \ is ( 0, 5/12, 7/12|7/12, 0, 5/12).
Thus, even though w >_ 0, we see that the conditions (4.53) of Lemma 6 are satisfied for each
h€H = {1,2} and thus the given \ solves PD2.
The disjunctive cut (3.8) derived with this optimal solution A is obtained through (4.57)
as
(4.58) yx, + y;c2 + jx3 ^ 1.
It is interesting to compare this cut with that obtained through the parameter values \/' =
l/\Qi,\ for each i€Qh as recommended by Balas [1,2]. This latter cut is
(4.59) jxi+jc2 + x3 > 1-
Observe that (4.58) uniformly dominates (4.59).
4.2 Maximizing the Rectilinear Distance Between the Origin and the Disjunctive Cut
In this section, we will briefly consider the case where one desires to use rectilinear
instead of euclidean distances. Extending the developments of Sections 2, 3 and 4.1, one may
easily see that the relevant problem is
minimize {maximum v,: constraints (4.12), (4.13), (4.14) are satisfied}.
yell n) '
The reason why we consider this formulation is its intuitive appeal. To see this, note that the
above problem is separable in /; € H and may be rewritten as
PD|: minimize
£*: {h > £ Xi'ai) for each j - 1, ...,«,£ X,*- 1, X," > 0
'"€<?/, ''€0/,
for/€0„f Zh> 0
for each /; € H.
Thus, for each /?€//, PD] seeks X/', i€Qh such that the largest of the surrogate constraint
coefficients is minimized. Once such surrogate constraints are obtained, the disjunctive cut
(3.8) is derived using the principles of Section 3.
As far as the solution of Problem PD] is concerned, we merely remark that one may
either solve it as a linear program or rewrite it as the minimization of a piecewise linear convex
function subject to linear constraints and use a subgradient optimization technique.
BIBLIOGRAPHY
[1] Balas, E., "Intersection Cuts from Disjunctive Constraints," Management Science Research
Report, No. 330, Carnegie-Mellon University (1974).
[2] Balas, E., "Disjunctive Programming: Cutting Planes from Logical Conditions in Nonlinear
Programming," O.L. Mangasarian, R.R. Meyer, and S.M. Robinson, editors, Academic
Press, New York (1975).
GENERATION OF DEEP DISJUNCTIVE CUTTING PLANES 475
[3] Balas, E., "Disjunctive Programming: Facets of the Convex Hull of Feasible Points,"
Management Science Research Report, No. 348, Carnegie-Mellon University, (1974).
[4] Bazaraa, M.S. and CM. Shetty, "Nonlinear Programming: Theory and Algorithms," John
Wiley and Sons, New York (1979).
[5] Burdet, C, "Elements of a Theory in Non-Convex Programming," Naval Research Logis-
tics Quarterly, 24, 47-66 (1977).
[6] Burdet, C. "Convex and Polaroid Extensions," Naval Research Logistics Quarterly, 24, 67-
82 (1977).
[7] Dem'janov, V.F., "Seeking a Minimax on a Bounded Set," Soviet Mathematics Doklady,
11, 517-521 (1970) (English Translation).
[8] Glover, F., "Convexity Cuts for Multiple Choice Problems," Discrete Mathematics, 6,
221-234 (1973).
[9] Glover, F., "Polyhedral Convexity Cuts and Negative Edge Extensions," Zeitschrift fur
Operations Research, 18, 181-186 (1974).
[10] Glover, F., "Polyhedral Annexation in Mixed Integer and Combinatorial Programming,"
Mathematical Programming, 8, 161-188 (1975). See also MSRS Report 73-9, Univer-
sity of Colorado (1973).
[11] Glover, F., D. Klingman and J. Stutz, "The Disjunctive Facet Problem: Formulation and
Solutions Techniques," Management Science Research Report, No. 72-10, University of
Colorado (1972).
[12] Held, M., P. Wolfe and H.D. Crowder, "Validation of Subgradient Optimization,"
Mathematical Programming, 6, 62-88 (1974).
[13] Jeroslow, R.G., "The Principles of Cutting Plane Theory: Part I," (with an addendum),
Graduate School of Industrial Administration, Carnegie-Mellon University (1974).
[14] Jeroslow, R.G., "Cutting Plane Theory: Disjunctive Methods," Annals of Discrete
Mathematics, 1, 293-330 (1977).
[15] Karlin, S., "Mathematical Methods and Theory in Games, Programming and Economics,"
1, Addison-Wesley Publishing Company, Reading, Mass. (1959).
[16] Majthay, A. and A. Whinston, "Quasi-Concave Minimization Subject to Linear Con-
straints," Discrete Mathematics, 9, 35-59 (1974).
[17] Owen, G., "Cutting Planes for Programs with Disjunctive Constraints," Optimization
Theory and Its Applications, //, 49-55 (1973).
[18] Poljak, B.T., "A General Method of Solving Extremum Problems," Soviet Mathematics
Doklady, 8, 593-597 (1967). (English Translation).
[19] Poljak, B.T., "Minimization of Unsmooth Functionals," USSR Computational Mathematics
and Mathematical Physics, 9, 14-29 (1969). (English Translation).
[20] Vaish, H. and C. M. Shetty, "A Cutting Plane Algorithm for the Bilinear Programming
Problem," Naval Research Logistics Quarterly, 24, 83-94 (1975).
THE ROLE OF INTERNAL STORAGE CAPACITY
IN FIXED CYCLE PRODUCTION SYSTEMS
B. Lev*
Temple University
Philadelphia, Pennsylvania
D. I. Toof
Ernst & Ernst
Washington, D.C.
ABSTRACT
The reliability of a serial production line is optimized with respect to the lo-
cation of a single buffer. The problem was earlier defined and solved by Soy-
ster and Toof for the special case of an even number of machines all having
equal probability of failure. In this paper we generalize the results for any
number of machines and remove the restriction of identical machine reliabili-
ties. In addition, an analysis of multibuffer systems is presented with a closed
form solution for the reliability when both the number of buffers and their
capacity is limited. For the general multibuffer system we present an approach
for determining system reliability.
1. INTRODUCTION
Several types of production line models appear in the literature. Each one is a realization
of a different real life situation. A summary of the various types and the differences in the
mechanism of product flow among them appears in Buzacott [5], Koenigsberg [9], Toof [14] or
Buxey et al [1]. Recently Soyster and Toof [13] defined a serial production line, which is the
model analyzed in this paper.
The mechanism of product flow in a serial production line is described via Figure 1. An
unlimited source of raw material exists before machine 1. If machine 1 is capable of working
(i.e., not failed), an operator takes a unit of raw material and processes it on machine 1, after
which he moves to machine 2 and processes it on machine 2, if machine 2 is capable of work-
ing. He proceeds analagously until machine N where a finish product is completed. Let T, be
N
the process time on machine /. Then the cycle time of the system T = £ T,. Let q, be the
/=i
probability that at any cycle T machine / is capable of working and p,- = 1 — q, the probability of
failing. The serial production line with no buffer must stop working if any of the individual
machines on the line fails. The placement of a single buffer of capacity M after machine /
alleviates this situation. If any of the first i machines fail and the buffer is not empty, machines
*This study was done when the author was at the Department of Energy, Washington, D.C. under the provisions of the
Intergovernmental Personnel Act.
477
478
B LEV AND D.I. TOOF
M,
M2
M,
Ml+X
Mi+2
MN
Product Flow
Figure 1. Serial production line with N machines and a single buffer
i + l, / + 2, . . . , N can still function. Conversely, if any of the machines / + 1, .... TV fail
and the buffer is not full, the first /machines may still work and produce a semifinished good to
be stored in the buffer. One obviously would like to identify the optimal placement of this
buffer. Soyster and Toof [13] proved that if there are an even number of machines, all identi-
cally reliable (q, = q V /) then the optimal placement of the buffer is exactly in the middle of
the line. In section 2 we generalize these results for any number of machines not necessarily
identically reliable. Specifically, we prove that the optimal placement of a single buffer is at a
place which minimizes the absolute value of the difference between the reliability of the two
parts of the line separated by the buffer.
The optimal location /* is determined from (1)
(1)
n«,- n q,
/=! /-/*+!
Min
/=i
N
n *
A more difficult question is the optimal locations of several buffers. In section 3 we analyze a
special case of a two buffer system, each buffer having a capacity of one unit. In section 4 we
present an approach that can be used for any number of buffers with any capacity. The
approach we suggest is efficient as long as the number of buffers and their capacity remains
relatively small.
2. OPTIMAL LOCATION OF A SINGLE BUFFER
Let a single buffer with capacity M be placed after machine /'. Let a, = \\qj,
N
13, = [ qn p, ■ = (a, - a,/3,)/(/3, - a,/3,), and let X„ be the number of units in the buffer at
the beginning of cycle n. Soyster and Toof [13] have shown that X„ defines a finite Markov
Chain, presented its transition matrix and found that the reliability R (/) of the line is given by
(2) and (3):
(2)
(3)
R(i) = j3,a, +/3, (1 -a,)
Pi ~ Pi
M+\
1
R(i) =/3,a, +/3, (1 -a,)
~ Pi
M
M+\
M + 1
if «i *Pi
ifa, = /3,.
One has to maximize /?(;') with respect to /, that is, to identify the optimal location of the
/ N t N
buffer within the line. Since a,/3, = ] J qt [ q) = ]7<7/ is a constant and does not affect the
/-i ;-/+i /=i
location of the buffer, one can simply ignore this term from (2) and (3) in the optimization
phase. Thus, we want to find /'*that maximizes R(i) or:
(4)
R (/*) = Max R (/) = Max
0/(1 -«/)
0/(1 -«,)
Pi ~ P,
M+\
1-P.
M
M + 1
M+\
if a, *Pi
it a, = pr
INTERNAL STORAGE CAPACITY IN FIXED CYCLE PRODUCTION SYSTEMS
479
The approach we take to solve (4) for /'* is to show that R(i) is strictly increasing with a, for
a, < /3, and strictly decreasing with a, for a, > /3,; that a, = /3, occurs when R (/') reaches its
maximum value; and that R (/) is symmetric about the point i* where a,. = /3,..
Let
(5)
R(i) = 08; -a,.j8,.)
p-r+i _ i
when a, = /3,, p, = 1 and (5) becomes (6)
M
(6)
/?(/) = (/3, -a,/3,)
M+ 1
«/ * 0,
0,-iS,.
Note in (6) as M becomes large the total reliability of the line, which is equal to a,/3, + /?(/),
approaches (3r That is, the two segments of the line become independent of each other.
In this section the general strategy is to show that if a, > /3, or a, < (3, then the reliabil-
ity of (5) is smaller than the reliability of (6). Hence, we treat a, as a continuous variable and
show that the derivative of (5) with respect to a, is positive for a, < /S, and negative for
a, > /3,-.
The derivative of R (/') with respect to a, is:
dR (/)
da.
M+\
-otfii
(p,M+1-p,) +
{MPr+-(M + \)Pr + 1)
(p;
M+\
\y
i +
pfi,
LEMMA 1: The additional reliability function R(i), is strictly increasing with respect to
a, over the range
1/2
'(//? (/) '
da.
,1
0,
n?
That is, if 0 < a, < /3M then
, and strictly decreasing with respect to a, over the range
dR (/)
da.
> 0. Conversely, if /3, < a, ^ 1, then
< 0. The proof can be found in [14]. (The first range is closed from the left and open
from the right; the second range is open from the left and closed from the right).
THEOREM 1 : The optimal placement, /*, of a single buffer of integer capacity M in an N
machine line is where a *= /3 *
PROOF: The proof of this theorem is essentially complete. We must only show that (5)
is continuous at the point where a*= /3* By definition the additional reliability attributable to
the introduction of the buffer when a f= /3 * is:
M
(fi*-a*p*)
M + 1
As a, — ► /3, pi — - 1 so that in (5) the limit of the steady state probability as a, — - /3, is of the
indeterminate form 0/0. However, an application of L'Hospital's rule shows that:
lim
Prx-Pl
M + \
M
a, -0,. p'"^'-! M + 1
and thus the continuity is proven.
480 B. LEV AND D.I. TOOF
Theorem 1 defines an optimal though not necessarily feasible solution to the problem of
buffer placement. The condition a, = /3, may be impossible to satisfy. In the remainder of this
section we examine the symmetry of the reliability function defined by Equation (5), develop a
simple criterion that provides the best feasible solution and, lastly, we examine the special case
of identical machine reliability, i.e., q, = q V /.
LEMMA 2: Given Kx and K2 continuous variables such that a^ -/3K = /3K - aK .
Thenp*, ■ pKl = 1.
N
PROOF: Recall that afi, = ]^q,= Q a constant for all /'. Thus the condition
aK ~ &k = Pk-,~ aK may De rewritten aK *— = — * aK . This implies that:
1122 ' <*KX aK2 2
Q{aKx + aK)
"A", + a«, = or that aK aK = Q.
i 2 aK^K2 ' 2
Similarly, one obtains the result that fiK (3K = Q. We want to show that pK ■ pK =1. Sub-
stituting for pK and pK in the definition of p yields:
(aK]- Q)iaKi- Q)
P*lPKl ' ((}Ki-Q) (fiK2-Q)-
We then must show that:
(aK] - Q) (aKi - Q) - Q8*, - Q) (fiKi - Q)
or that:
«*,«*, - Q(aKl + <*k2) = /Sat, Pk2 ~ 003*, + Pi)-
The condition aK{— Pk^ Pk2~ <*k2 infers both that aK] + aKi = (3K] + jiKi and that
aKxaK2 = Pk{Pk2 = Q, and thus the proof is complete.
This leads directly to the following theorem:
THEOREM 2: For a continuous argument (/'), R (/') is symmetric about the point i*
where a,. = /3,..
The proof is in [14].
The placement of the buffer has been treated as a continuous variable. While this has led
to satisfying mathematical results, in reality one must develop an optimizing criterion which is
physically feasible. Unfortunately, the condition a,. = /3,. does not satisfy the feasibility
requirements. Rarely will i* be integer and what, for example, is the physical interpretation of
i* = 7.63. To this end, it will be shown in this section that the steady state reliability of the
line is maximized by placing the buffer after machine /* (/* integer) where /* satisfies the fol-
lowing condition:
\a,. - j8,-.| = min \a, - /3,|.
1 < ; < N
!• N
Note that if an integer i* exists such that a,. = ] J q, = q, = /3,., it would satisfy the
/=1 /=r+i
above criterion and be consistent with Theorem 1.
INTERNAL STORAGE CAPACITY IN FIXED CYCLE PRODUCTION SYSTEMS 481
To this end observe that \a, • — /3,| is a convex function of a, that obtains its minimum
point at a, • = j3, ■ = -J a fit = yfQ. Thus, for
a, < aj < y/Q, \ctj - /3,\ < \a, - j3,-|, and for JQ < a, < a,, \a ,- - (3 ,\ < \a, - j8,|.
THEOREM 3 (Fundamental): The optimal integer placement of a single buffer of capa-
city Min an N machine line is where \a, — /3,-| is minimized.
PROOF: From Theorem 1 we know that by treating / as a continuous variable the optimal
placement /'* satisfies a,* = j3,». If /'* is integer the theorem is evident. Assume that i* is not
integer. Examine the points [/*] and [/* -t- 1]. From lemma 1 and the convexity of \a, — /3,-|
we know that R([i*]) > R(K{) where a[r]>aK] and R ([/* + 1]) > R (K2) where
|[;*+n > aK ■ Thus, the only two candidate placements are [/'*] and [/* + 1].
If \a[j*] — fi[j*]\ = |a [,••+!] — /8r/.+ 1]| then the theorem holds and either placement is
optimal. Therefore, assume that ]«[,♦] — /3[,-.] I < |a(/*+i] — /8[/»+i]|. We want to show that
R([i*]) > R ([/*+l]). Assume the contrary, i.e., that R ([/* + 1]) > R ([/*]). From
Theorem 2 we know that there exists a point K* such that R{K*) = R([i* + 1]) and that
Wk*~ Pk*\= \au*+i]- Pn*+i]\- This implies that R (K*) >R ([/*]). We know that
\aK*~ $k*\ > lot [/*] — /S[/*]l and since both aK* and «[,»] must be greater than v 'a Wi tnis
implies that aK* > a [,-*]. By Theorem 2 this would infer that /?([/*]) > R{K*) which is a
contradiction. Similar results may be obtained by assuming that Icq,-*] — /3[/*]l > la[,*+i]
- P[r+ ill-
Theorem 3 details a simple, yet elegant criterion for the optimal placement of a single
buffer regardless of capacity so as to maximize the reliability of the system.
A Special Case: q, = q V /.
J
Consider the case where q, = V i. In this case:
a, = q'
13, = qN-'.
It follows from Theorems 1 and 3 that if A^ is even, the optimal placement would be where
a, = f3, which in this case is where q' = qN~' which is satisfied at / = A/2. This is consistent
with the results developed by Soyster and Toof [13].
Assume that N is odd. Then N is of the form 2K + 1 where K is integer and by
Theorem 3 the optimal placement is either after machine K or machine K + 1 since:
l«* -18*|- \qK- q2K+'-K\= \qK - qK^\
l«*+i - 0*+ll = l^ + ' - <72*+1-*-' \=\qK~ gK+l\.
We have just completed the proof for the optimal location of a single buffer on an N machine
serial line. The optimal location is for any N (even or odd) and for any q, (both when machine
reliability are identical or not identical for all machines). In the next section we generalize the
model to include more than one buffer.
3. TWO BUFFERS OF CAPACITY ONE UNIT
Consider a simpler case of the general model where N = 3k and q, = q for all /. The
placement of two buffers separates the line into three segments. Since N = 3k, one may arbi-
trarily place the first buffer immediately after machine k and the second immediately after
482
B. LEV AND D.l. TOOF
machine 2k. The placement of these two buffers has just defined the three stages of the sys-
tem. Each stage may be comprised of more than one machine; for a line of N = 3k, each stage
is comprised of k machines. The reliability of each stage is Q{ = Q2 = Q3 = qf = Q and
P = 1 - Q.
The two buffer system operates analogously to the one buffer system described in section
2. If all machines are up, then a unit of raw material is processed by stages one, two and three
and a finished good is produced. If, for example, stage three is down, stages one and two are up
and buffer two is not full, then both stages one and two operate and a semicompleted good
would be stored in buffer two. If buffer two had been full and buffer one had not, then
machine two would not operate; it would be blocked by the second buffer which is full. In this
case only machine one would operate and a semiprocessed good would be stored in buffer one.
Define an ordered pair (X, Y) where X represents the quantity of semifinished goods in
buffer one at the start of cycle /, and Y the quantity in buffer two at the start of cycle /. If we
assume that the maximum capacity of both buffers one and two is one, then the pair (X, Y)
may take on the following four values: (0,0), (1,0), (0,1), and (1,1). The one cycle transition
probability from state (X, Y) = (0,0) to all states is:
• Both are empty at the start of cycle t + 1 if either all stages are up, or if stage one is
down. Thus: P[(Xl+], K,+1) = (0,0) | {X,, Y,) = (0,0)] = Q3 + P.
• If stage one is up during cycle t but stage two is down, then a unit of raw material is
processed on stage one and the semicompleted good stored in buffer one. Thus:
P[(X,+], K,+1) = (1,0) I (X„ Y,) = (0,0)] = QP
• If both stages one and two are up but stage three is down, then a unit of raw material is
processed on both stages one and two and the semicompleted good stored in buffer two.
Thus: P[(X,+], Yl+]) = (0, 1) I (X,, Yt) = (0,0)] = Q2P
• Lastly, note that it is impossible for (Xl+\, Yl+X) to equal (1,1) given that
{X,, Y,) = (0, 0), as at most, one unit may be added to storage during any cycle. Thus:
P[(Xl+], Yl+l) = (1, 1) I (X„ Yt) = (0,0)] = 0.
One may compute the transition probabilities for all of the four possible states in an analogous
manner. The complete transition matrix is presented in Figure 2.
\ State
\t+l
State in t \^
(0,0)
(1,0)
(0,1)
(1,1)
(0,0)
Q3+P
QP
Q2P
0
(1,0)
Q2P
Q3+P
QP2
Q2P
(0,1)
QP
Q2P
Q3+P2
QP
(1,1)
0
QP
Q2P
Q3+P
Figure 2. Transition matrix — two buffer system
Let 77), 7T2, 7r3, 7r4 be the steady state probabilities of buffer states (0,0), (1,0), (0,1) and
(1,1) respectively. The system is in state (0,0) with probability 77 1, then a good is produced if
and only if all three stages are up. This event has a probability of Q3tt\. Similarly, with proba-
bility 7r2 the system is in state (1,0), then only stages two and three must be up for a finished
good to be produced. This event has probability Q2tt2- Lastly in both state (0,1) and
INTERNAL STORAGE CAPACITY IN FIXED CYCLE PRODUCTION SYSTEMS
483
(1,1), buffer two is not empty and thus the only condition for a successful cycle is that stage
three must be up. These events have probability £)t73 and QnA, respectively. The steady state
reliability, R, of the two buffer system where the capacity of both buffer one and buffer two is
one unit is equal to:
= <23ir, + <22tt2+ Qrr3 + Qtt ,.
(7)
R
Thus, upon determining the steady state probabilities, 77,, tt2, 77 3 and 774, one has an
exact formulation of the reliability of the three stage, two buffer system, where each buffer has
a capacity of one unit.
From the transition matrix presented as Figure 2 and basic finite Markov Chain theory,
one can calculate ttu tt2, 773, and 77 4 in the following manner.
First, we know that in the steady state ttB = it where B is the one step transition matrix
of the system (Figure 2) and
77 = (77 j, 77 jt W3, 77 4) .
This identity yields a system of four simultaneous equations of the form
(8) ir(B-I) = 0
where B is the form:
B =
However, (B-I) has no inverse as the rows are linearly dependent. The classical method of
solution to this problem is to drop one of the identity equations of 77 and substitute the fact
that the sum of the steady state probabilities must equal one. That is, 77, + tt2 + tt3 + tt4 = 1.
Making this substitution for column 3 of B-I yields the following system of simultaneous equa-
tions: ttA = (0,0,1,0), where:
1 0
Q3 + P
QP
Q2P
0
Q2P
Q3 + P
QP2
Q2P
QP
Q2P
Q3 + P2
QP
0
QP
Q2P
Q3 + P
Q3 + P -
1 QP
Q2P
Q3 + P -
- 1
QP
Q2P
0
QP
1 Q2P
Thus, 77 = (0,0, 1,0) A which reduces to 7r = A3
inverse matrix A~x. The solution to the last system of four equations and four variables is:
QP
Q3 + P - 1
where A^ is the third column of the
(9)
tt, = (Q2+ Q + \)/(4Q2 + 3Q + 5)
tt2= (Q2+ Q + 2)/(402 + 3Q + 5)
773= (Q2+ \)/(4Q2 + 3Q + 5)
it4= (Q2+ Q + 1)/(4Q2 + 3(3 + 5)
We are now able to directly compute the steady state reliability of a two buffer series sys-
tem where each stage has identical reliability, Q, distributed Bernoulli and each buffer a capacity
: of one unint. We have just proved Theorem 4 which results from (7) and (9).
484
B. LEV AND D.I. TOOF
THEOREM 4: For the series production system described above the steady state reliabil-
ity of the system /?, is equal to:
0s + 2Q4 + 4Q3 + 3Q2 + 1Q
4(?2 + 30 + 5
R =
4. EXTENSION OF THE GENERAL MUTLIBUFFER CASE
The previous sections have laid the groundwork for our analysis of a general multistage,
multibuffer system such as the one depicted in Figure 3. For ease of analysis let us assume
that the reliability of each stage has the Bernoulli distribution with parameter Q and further that
m
buffer /' has capacity Mr For a general N stage system with m buffers, there are ] [ (M, + 1)
possible buffer states; i.e., each buffer may take on M, + 1 values and there are m such buffers.
For example, if M, = 4 for all /, and m = 5 there would be 3,125 possible buffer states ranging
in value from (0,0,0,0,0) to (4,4,4,4,4). The question arises as to the viability of this form of
analysis for systems with large buffer capacity (M,), multiple buffers (m) or a combination of
the two. Clearly, the transition matrix for a large system would be relatively sparse (i.e., many
zero entries). For example, in a four stage (three buffer) system, where each buffer has a capa-
city of three units, there would be 43 = (3 + l)3 or 64 possible transition states. For the start-
ing state (1,1,1) there are 13 possible transitions (i.e., nonzero transition probabilities). The
feasible transitions from the state (1,1,1) are:
(0,1,1), (0,1,2), (0,2,1), (1,0,1), (1,0,2), (1,1,0), (1,1,1),
(1,1,2), (1,2,0), (1,2,1), (2,0,1), (2,1,0), and (2,1,1).
Raw
Material
M,
B
Mi
B
M
N-\
BN~
jV— 1 j
Mk
Finished
Goods
Product Flow
Figure 3. General muliisiage, mullibufTer system
While it is obvious that the method of analysis employed to this point is feasible, that is,
(1) definition of a one step transition matrix; (2) development of a reliability equation as a
function of stage reliability and the steady state transition probabilities, and (3) solving a system
of linear equations for the steady state transition probabilities; its application is, for the most
part, not practical.
Let us present the transition matrices for two or three buffer systems with capacity one or
two. For the system of two buffers of capacity two the transition matrix is given in Figure 4
and the steady state probabilities for various values of Q are given in Figure 5 where the relia-
bility R is:
R = (237r, + Qtt2 + £>7r3 + Q2ir4 + Qtts + £>7r6 + Q1-n1 + Q-n% + Qtt9.
Figure 5 was calculated by a small computer program. For various values of Q, we solved for
the unique rr-, and calculated /?, which appears in Figure 5. For the system of four stages, and
three buffers with capacity one, the transition matrix is given in Figure 6.
Again, using a small computer program we solved for n , and calculated R. The steady
state probabilities and the system reliability R is given in Figure 7 where
R
= n4
QV, + QlT2 + 037T3 + Q7T4 + Q37T5 + QlT^ + Q27T 7 + QtT g.
INTERNAL STORAGE CAPACITY IN FIXED CYCLE PRODUCTION SYSTEMS
485
Transition Matrix
\t+l
t ^\
(0,0)
(0,1)
(0,2)
(1,0)
(1,1)
(1,2)
(2,0)
(2,1)
(2,2)
(0,0)
Q3+P
Q2P
0
QP
0
0
0
0
0
(0,1)
QP
Q3+P2
Q2P
Q2P
QP2
0
0
0
0
(0,2)
0
QP
Q3+P2
0
Q2P
QP
0
0
0
(1,0)
Q2P
QP2
0
Q3+P2
Q2P
0
QP
0
0
(1,1)
0
Q2P
QP2
QP2
Q3+P3
Q2P
Q2P
QP2
0
(1,2)
0
0
Q2P
0
QP2
Q3+P2
0
Q2P
QP
(2,0)
0
0
0
Q2P
QP2
0
Q3+P
Q2P
0
(2,1)
0
0
0
0
Q2P
QP2
QP
Q3+P2
Q2P
(2,2)
0
0
0
0
0
Q2P
0
QP
Q3+P
Figure 4. Two buffers of maximum capacity two
Buffer Capacity Equals 2
Q = .9
(3=7
Q = .3
Q= A
7T,
.108
.105
.097
.092
1T?
.092
.091
.090
.089
T3
.060
.059
.060
.064
7T4
.126
.124
.118
.114
1TS
.106
.109
.121
.129
"6
.092
.091
.090
.089
TT1
.183
.191
.209
.218
T«
.126
.124
.118
.114
77q
.108
.105
.097
.092
R
.855
.596
.205
.061
Figure 5. Exact solutions to three stage
two buffer system
Maximum Buffer Capacity Equals One
\t+l
t ^\
(0,0,0)
(0,0,1)
(0,1,0)
(0,1,1)
(1,0,0)
(1,0,1)
(1,1,0)
(1,1,1)
(0,0,0)
Q4+P
Q3P
Q2P
0
QP
0
0
0
(0,0,1)
QP
Q4+P2
Q3P
Q2P
Q2P
QP2
0
0
(0,1,0)
Q2P
QP2
Q4+P2
Q3P
Q3P
Q2P2
QP
0
(0,1,1)
0
Q2P
QP2
Q4+P2
0
Q3P
Q2P
QP
(1,0,0)
Q3P
Q2P2
QP2
0
Q4+P
Q3P
Q2P
0
(1,0,1)
0
Q3P
Q2P2
QP2
QP
Q4+P2
Q3P
Q2P
(1,1,0)
0
0
Q3P
Q2P2
Q2P
QP2
Q4+P
Q3P
(1,1,1)
0
0
0
Q3P
0
Q2P
QP
Q4+P
Figure 6. Four stage, three buffer transition matrix
486
B LEV AND D.I. TOOF
Buffer Capacity of One Unit
7T4
7T7
7T8
(3 = .9
Q = .1
Q = .3
(3=1
.109
.100
.091
.080
.079
.075
.071
.073
.098
.098
.118
.138
.079
.080
.071
.021
.157
.156
.221
.270
.137
.147
.118
.193
.206
.213.
.221
.187
.136
.131
.091
.038
.819
.533
.143
.036
Figure 7. Reliability of a four stage.
three butTer svsiem
The approach we present here can be summarized as follows; for a given configuration of
a serial production line with multiple buffers and no restriction on their capacity, one can write
the one step transition probability matrix and solve for its steady state probabilities which yields
the reliability of the line. The method is efficient for a small number of buffers and small capa-
cities. In general, the number of state variables and the number of linear equations are
m
~[ (M, + 1) for m buffers with capacity Mr
[1
[2
[3
[4
[5
[6
[7
[8
[9
10
11
:i2
BIBLIOGRAPHY
Buxey, G.M., N.D. Slack and R. Wild, "Production Flow Line Systems Design — A
Review," AIIE Transactions (1973).
Buxey, G.M. and D. Sadjadi, "Simulation Studies of Conveyor Paced Assembly Lines with
Buffer Capacity," The International Journal of Production Research, 14 (1976).
Buzacott, J. A., "Automatic Transfer Lines with Buffer Stocks," International Journal of
Production Research, 5 (1967).
Buzacott, J. A., "The Role of Inventory Banks in Flow-Line Production Systems," The
International Journal of Production Research, 9 (1971).
Buzacott, J. A., "Models of Automatic Transfer Lines with Inventory Banks, A Review and
Comparison," AIIE Transactions, 10 (1978).
Gershwin, S.B., "The Efficiency of Transfer Lines Consisting of Three Unreliable
Machines and Finite Interstage Buffers," Presented at ORSA/TIMS Los Angeles Meet-
ing (1978).
Hatcher, J.M., "The Effect of Internal Storage on the Production Rate of a Series of Stages
Having Exponential Service Times," AIIE Transactions, 2, 150-156 (1969).
Ignall, E. and A. Silver, "The Output of a Two-Stage System with Unreliable Machines and
Limited Storage," AIIE Transactions, 9 (1977).
Koenigsberg, E., "Production Lines and Internal Storage — A Review," Management Sci-
ence, 5(1959).
Okamura, K. and H. Yamashina, "Analysis of the Effect of Buffer Storage Capacity in
Transfer Line Systems," AIIE Transactions, 9 (1977).
Rao, N.P., "Two Stage Production System with Intermediate Storage," AIIE Transactions,
7(1975).
Sheskin, T.J., "Allocation of Interstage Storage Along an Automatic Production Line,"
AIIE Transactions, 8 (1976).
INTERNAL STORAGE CAPACITY IN FIXED CYCLE PRODUCTION SYSTEMS 487
[13] Soyster, A.L. and D.I. Toof, "Some Comparative and Design Aspects of Fixed Cycle Pro-
duction Systems," Naval Research Logistics Quarterly, 23, 437-454 (1976).
[14] Toof, D.I., "Output Maximization of a Series Assembly Facility Through the Optimal
Placement of Buffer Capacity," Unpublished Ph.D. dissertation, Temple University
(1978).
SCHEDULING COUPLED TASKS
Roy D. Shapiro
Harvard University
Graduate School of Business Administration
Cambridge, Massachusetts
ABSTRACT
Consider a set of task pairs coupled in time: a first (initial) and second
(completion) tasks of known durations with a specified time between them. If
the operator or machine performing these tasks is able to process only one at a
time, scheduling is necessary to insure that no overlap occurs. This problem
has a particular application to production scheduling, transportation, and radar
operations (send-receive pulses are ideal examples of time-linked tasks requir-
ing scheduling). This article discusses several candidate techniques for schedule
determination, and these are evaluated in a specific radar scheduling applica-
tion.
This article considers the problem of scheduling task pairs, i.e., tasks which consist of two
coupled tasks, an initial task and a completion, separated by a known, fixed time interval. If
the operator or machine performing these tasks is only able to process one at a time, scheduling
is necessary to insure that a completion task of one pair does not arrive for processing while
one part of another task is being processed.
Consider, for example, a radar tracking aircraft approaching a large airport [1]. In order
to track adequately, it is necessary to transmit pulses and receive the reflection once every
specified update period. The radar cannot transmit a pulse at the same time that a reflected
pulse is arriving nor can two reflected pulses overlap. A possible strategy is to transmit to one
tracked object and wait for that pulse to return before another pulse is transmitted as shown in
Figure 1(a), but unless the number of objects being tracked is small, this may not allow all
objects to be tracked in each update period. A more efficient strategy is some form of inter-
laced scheduling like that shown in Figure Kb). Observe that the time between each pair of
transmit and receive pulses is the same in Figure Kb) as in Figure 1(a), yet the total transmis-
sion time is far less in Kb).
b).
Figure 1. Sample 4-pair schedules
489
490
R D SHAPIRO
1. NOTATION, CLASSIFICATION, AND COMPLEXITY
Our object is to generate a schedule for a given set of task pairs which allows that set to
be completed in the least possible time with no overlap between tasks (Figure 2). Formally, let
/, = the time of initiation of the rth task pair;
Sj = the duration of the initial task of the rth pair, / = 1, 2, . . . , N;
T, = the duration of the completion task of the rth pair, / = 1, 2, . . . , N;
dj = the "inter-task" duration, i.e., the time between the initiation of
the initial task of the rth pair and the initiation of that pair's completion.
rl
— .s,—
a, —
-* —
T,——
t, + s,
t, + d, t, + d, + T,
Figure 2. The rth task pair
The time between the initiation of the first task pair and the completion of the final pair we
refer to as the frame time (or makespan, cf. [3,4] denoted z). For convenience, we will set the
initiation time of the first pair to 0.
The scheduling problem may be stated as
find /, ^ 0, / = 1 /V to minimize
z = max, (/, + d, + T,)
subject to the constraint that no member of the set of intervals
{(/,, t, + sx u, + d,, t, + d, + r,)) / = i n
overlap with any other member.
To put this problem into context with much of the recent literature classifying scheduling
problems with regard to their computational complexity, we observe that the problem as stated
is equivalent to a job shop problem where A' jobs are to be scheduled on two machines with the
following characteristics*:
1. Each job requires three operations: the first (of duration S,) to be processed on
Machine 1; the second (of duration d, — S,) on Machine 2; the third (of duration T)
again on Machine 1.
2. Machine 1 may only process one operation at a time; Machine 2, however, has
infinite processing capacity.
"Under the classification scheme of Rinooy Kan [9], this problem is V 1 2 1 (7, no wan, V/2 iwn-bon |Cmax. See also [8].
SCHEDULING COUPLED TASKS 491
3. No waiting between operations is permitted. That is, once a job is begun, it must
proceed from Machine 1 to Machine 2 and back again to Machine 1 with no delay.
The problem can then be shown to be NP-complete by Theorem 5.7, pg. 93 in [9] or by a
reduction from KNAPSACK in [6]. NP-complete problems form an equivalence class of com-
binatorial problems for which no nonenumerative algorithms are known. If an "efficient" algo-
rithm were constructed which could solve any problem in this class, any other would also be
solvable in polynomial time (cf. [2,4,6,7]). Members of this class include the chromatic
number problem, the knapsack problem, and the traveling salesman problem.
The fact that a polynomial-bounded algorithm is not likely to exist motivates the construc-
tion of several polynomial-bounded algorithms which are presented and evaluated in Sections 2
and 3. An integer programming formulation leads to a straightforward branch and bound pro-
cedure which makes use of the problem's special structure. (See [11].) In view of the fact that
this optimal procedure is likely to be tractable only for very small problems, and not even then
for radar-like applications requiring real time solution, we proceed directly to consideration of
three suboptimal algorithms.
2. SUB-OPTIMAL ALGORITHMS
This section considers scheduling procedures which can be shown to be polynomially
bounded: Sequencing, Nesting and Fitting. After some discussion of their characteristics, they
will be evaluated on realistic examples in Section 3.
Sequencing
An ordered set of p task pairs are said to be sequenced when the completion tasks arrive
for processing in the same order as the initial tasks were scheduled, p pairs can be sequenced
whenever
(1) </, > £ S, and
1=1
(2) d, > d,-x+ T^-Si-x, i = 2, 3, ... , p.
If, as is the case for many applications, S, = T, for each task pair, (2) becomes simply
(3) d, > rf,_lf
and implementation of this procedure becomes quite easy.
We may think of this procedure as "jamming" initial tasks together until they run into the
completion task corresponding to the first initial task. The completion tasks are guaranteed not
to overlap since each succeeding d, is at least as large as the one before. Also, since this is a
"single-pass" procedure (cf. [3]), computation time is linear in A/.*
In any sequenced p-set, dead time can occur in two ways, as is shown in Figure 3. It
occurs between the last initial task and the first completion, and it occurs between successive
completions. The former can be written as
±s,
i=\
'Actually, computation time is 0(A log AO since the d, have to be ordered.
492
R.D SHAPIRO
S, S2 S3 V4 S< S6 S1 7",
r2 r3 r4
r5 t-6
7"7
and the latter as
Figure 3. Sequencing
Hence,
Z 4+i - 4 = 4» - ^i-
+ dp - d]
zseq - I CS, + r,) + rf,-ls,
- 1 7) + d„.
Hence, if N task pairs are sequenced in P p-sets, the Ath set having pk pairs, /c = 1, 2,
the total frame time may be represented as
zseq — 2L
, P,
Pk
yv p
= 1 7)-+I^-
;=1 /<=1
As an example, consider the following 7 task pairs with common durations for initial and
completion tasks, ordered by increasing d,.
i - 1: Si - F, - 2, d, - 9
/ = 2: S2= F2 = 1, rf2= 13
/ = 3: S3= F3 = 2, rf3= 15
/ = 4: S4 = F4 = 3, d4 = 15
/ = 5: S5= T5= 2, d5= 19
/ = 6: S6 = F6 = 4, </6 = 24
/ = 7: S7 = T7= 3, rf7 = 25.
Figure 4(a) shows their sequenced schedule.
For comparison, Figure 4(b) shows the optimal schedule for this set of task pairs as gen-
erated by the branching algorithm alluded to above. At the other extreme, if these pairs were
scheduled by waiting until each pair was completely processed before initiating the next, the
frame time would be 138.
£ id, + T,) = 138.
1=1
Nesting
An ordered set of p task pairs are said to be nested whenever the completion tasks arrive
for processing in the reverse of the order in which the initial tasks were scheduled, p pairs may
be nested if
SCHEDULING COUPLED TASKS 493
Sequencing:
v, v2.s, s4 r, t7 7-3 r4 s\ sb s^
n
?"„
7",
: = 58
III 11 1 1 1 1 11
1
II 1
0 2 3 5 8 "J 11 15 16 18 20 23 25 29 32
Oplimal
s2 s7 sh s4 s, r2 ss s3 r, r4 r7 r6 r, r5
42 44
49
>i 54 58
: - 37
M II 1 1 1 II 1 1 1 II 1 1 III
01 4 8 1113 14 16 18 202223 26 29 33 35 37
Nesting:
>7 ss SjS: /, r3 r5 r7 .•>„ s4
'4
n
h S2
n=
"Mill 1 1 1 1 1 1 1 II
II
= 70
11 3 5 7 9 16 18 20 22 24 25 28 32 35 47 50 52 56 57 69 70
Figure 4. Sequencing and nesting
(4) d, > di+l+ Ti+l + S„ /=1, ... , p-\. .
Applying this procedure to the 7-pair example discussed above gives the schedule shown
in Figure 4(c) with z = 70.
Fitting
This procedure, unlike the two discussed above, allows the user to specify a priority order-
ing, and corresponds intuitively to the simple process which one might use when scheduling
task pairs by hand. After setting the desired order and scheduling the first task pair at time 0,
each successive pair is scheduled at the earliest possible time not involving any overlap with
pairs already scheduled.
Let us consider this procedure for the above example, taking an arbitrary ordering:
2,6,7,4,3,1,5. As shown in Figure 5(a), the task pair is scheduled at time 0, and pairs 6 and 7
can successively be scheduled with no overlap. If we, however, try to schedule pair 4 at the
first available time, its completion would overlap with pair 6's completion (Figure 5(b)), so this
is not possible. The first available time for scheduling task pair 4 without overlap is time 18
(Figure 5 (c)). Pair 3, however, having task duration only 2, can be scheduled at time 8 (Fig-
ure 5(d)). Observe now that pair 1 can be scheduled nowhere in the existing schedule without
overlap, so it must be "tacked" onto the end, at time 36 (Figure 5(e)). Pair 5 is scheduled at
time 21, completing the schedule with z = 47 (Figure 5(f)).
3. TASK PAIR SIMULATION AND NUMERICAL RESULTS
In keeping with the radar application mentioned above, a simulation has been developed
to generate aircraft configurations suitable to radar operation. For each object, range, cross-
section, and velocity can be used to determine the necessary length of transmit and receive
pulses (of the order of 10-100 /usees.) as well as the inter-pulse distance (of the order of 300-
1300 /usees.). Thus, a list of task pairs can be generated for evaluation of the procedures out-
lined in the previous section. As an example, such a list is given in Table I for /V = 20.
For values of N shown in Table II, the simulation generated 50 such task pair lists, and
the average frame time and computation time were computed. Figure 6 presents this data
graphically. Note that, as one would expect, frame time is linear in N. This is not surprising
since in the best conceivable situation, that of no idle time between subtasks,
494
R.D. SHAPIRO
S, Sb S7
(a)
(b)
(c)
(d)
(e)
(f)
H
rh t-,
01 5 8 13 14
s2 s6 v7 .v4 r2
JL
25 29 30 33
t, Tb r7
01 5 8 1113
Si v6 s7
r2
n
Si
23 25 26 29 30 33
Th r7 r,
Dl 5 8 13 14 18 21 2< 2') 30 }} 36
n
1)1 5 8 10 13 14 18 21 23 25 29 30 33 36
.v. s6 s7 s, r2 s4 r} rh r7 7"4 s,
n
0
45 47
5 8 10 13 U 18 2! 23 25 2930 33 36 38
n
01 5 8 10 13 14 18 21 23 25 2930 33 36 38 40 42 45 47
Figure 5. Fitting
TABLE I - Sample Task Pair List (N - 20)
47
/ = :
I
~T
~~ 3
4i 5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
S, = T, (/usee)
70
70
80
50 1 70
80
70
70
70
40
50
70
60
60
60
70
50
40
50
40
</, (/usee)
1334
L. .
1258
1254
I159JI107
1022
954
884
791
750
709
674
631
623
621
555
513
498
465
387
TABLE II (a) - Average Simulated
Frame Times
N
20
SEQUENCE
4.4
(.414)
NEST
7.3
(1.387)
FIT
4.0
(.381)
50
8.6
(.559)
15.0
(1.830)
7.6
(.452)
100
15.5
(.759)
. 27.2
(2.747)
13.8
(.679)
200
29.2
(1.197)
52.2
(4.683)
27.4
(.990)
500
70.8
(2.188)
119.2
(8.391)
66.1
(1.590)
Frame times in msec.
Quantities in parentheses are standard deviations
SCHEDULING COUPLED TASKS
495
TABLE II (b) — Average Computation
Times (msec)
N
20
SEQUENCE
1.9
NEST
4.0
FIT
67.2
50
4.4
16.3
440.1
100
8.6
51.9
1742
200
16.9
195.3
7091
500
42.0
1064
44160
Z(MS)n
120--
90 —
60
30-
Z = Mean Frame Time
Nest
Z~ .232 N
— i 1 1 1 1 m-
100 200 300 400 500 H
Figure 6. Comparative frame times
z = £ (Si + T,) ~ kxN
and in the worst situation, that in which no task is performed until the previous task has been
completed,
* - Z d, + T, - k2N.
496 R.D SHAPIRO
An assumption made in the treatment of this example is that the radar operator knows the
values of Sf, T, and d, precisely. If there is any uncertainty, signals can overlap. A straightfor-
ward way of avoiding this problem in a real situation where uncertainty would obviously be
present would be to "open a window" around the pulse. That is, if the object is such that
transmit and receive pulses are estimated to be of 60 ^sec. duration, an interval longer than 60
Aisec. can be allotted to these pulses to accommodate (1) the possibility that a pulse length
longer than 60 Aisec. might be necessary or, more important, (2) the possibility that the receive
signal might arrive sooner or later than expected. This procedure offers no conceptual difficulty
since the window around the pulses may be made large enough to guarantee that the probability
of overlap is as small as required. In order to retain frame times small enough to allow updat-
ing every, say, 200 milliseconds, we must limit the size of the window somewhat. This does
not seem to be a severe restriction, however. For example, since frame time is linear in XT',,
opening a window around each pulse of twice that pulse's estimated duration would cause the
frame time to be no more than doubled. The frame times of sequenced pulses in Table 11(a)
indicate that even for large N, this is no problem.
A second possibly problematic characteristic of the example is that it is static, i.e., no
explicit consideration is given to new "jobs" added to the system during the scheduling process.
In job shop scheduling, this may present no problem if jobs are released to the shop at
predetermined times. In radar tracking, however, one cannot hold enemy missiles, and the
scheduler must be dynamic. This can be accomplished; the new targets may be inserted into
the queue of jobs to be processed, or, since this is likely to be time-consuming when jobs are
ordered (as in Sequencing and Nesting), all current jobs can be processed, followed by the
newly-arrived entries. This procedure will be especially efficient for sequencing since the rf;'s
are proportional to the distance between radar and target, and new targets will tend to appear at
approximately the same range.
The necessity to allow for search and discrimination as well as the tracking activity and
real-time schedule determination within a 200 milli-second period makes sequencing the only
viable alternative. Even when real-time processing is not required, one wonders whether the
slight improvement in frame time allowed by fitting warrants the extra computational burden.
A caveat is in order here: these results are somewhat application-dependent. It is quite
possible that other applications which produce task pairs with different structures will lead to
different conclusions.
CONCLUDING REMARKS
In the above discussion it has been assumed that the operator or machine can process
only one task segment at a time. This is appropriate for the application being considered, but
one might easily imagine instances in which there is some nonunit capacity constraint on the
operator. For example, if trucks are being loaded and unloaded at some central depot, labor or
space restrictions might limit the number of trucks being simultaneously processed.
Fortunately, the suboptimal procedures described above may be extended without any
problem.* Figure 7 shows how the example given in Section 2 may be sequenced if the operator
is limited to two tasks at a time. Note that due to the ordering of the inter-task durations,
sequencing guaranteed that since no more than two initial tasks can overlap, no more than two
final tasks will overlap.
The optimal enumerative procedure described in [11] is also easily extended.
SCHEDULING COUPLED TASKS
497
U
U-l
~1
t_
i) I i 1 5 8 9 II 1.1 14 Ih 17 18 20 22 24 24 30 .VI
/, =0 i<= .1
/, = 0 /„ = 5
*-; "-s
'4 - 2
Figure 7. Sequencing with operator capacity = 2
Another extension is to consider tasks which consist of more than two coupled segments.
The notation changes slightly: the rth task pair becomes a task set, the initial task of duration
5, followed by nt subtasks; the /th subtask is of duration Ty and the time at which it is initiated
is dy after the initiation of the initial task (Figure 8).
k-l
U
U.A
■*Vn
Figure 8. Multiple coupled subtasks
Fitting, as proposed above, works well in this case, but sequencing and nesting are waste-
ful since they treat the subtasks as one long task of duration dm + Tin — dn.
BIBLIOGRAPHY
[1] Air Traffic Control Advisory Committee, Report of the Department of Transportation, 1
(1969).
[2] Coffman, E.G., Jr. (editor), Computer and Jobshop Scheduling Theory, John Wiley, New
York, N.Y. (1976).
[3] Conway, R.W., W.L. Maxwell and L.W. Miller, Theory of Scheduling, Addison-Wesley,
Reading, Mass. (1967).
[4] Garey, M.R., D.S. Johnson and R. Sethi, "The Complexity of Flowshop and Jobshop
Scheduling," Mathematics of Operations Research, 1, 117-129 (1976).
[5] Heffes, J. and S. Horing, "Optimal Allocation of Tracking Pulses for an Array Radar,"
IEEE Transactions on Automatic Control, 15, 81-87 (1970).
[6] Karp, R.M., "Reducibility among Combinatorial Problems," R.E. Miller and J.W.
Thatcher, (editors), Complexity of Computer Computations, Plenum Press, New York,
N.Y., 85-104 (1972).
[7] Karp, R.M., "On the Computational Complexity of Combinatorial Problems," Networks, 5,
45-68 (1975).
498
R.D SHAPIRO
[8]
[9]
[10]
[11]
Reddi, S.S. and C.V. Ramamoorthy, "On the Flowshop Sequencing Problem with No Wait
in Process," Operational Research Quarterly, 23, 323-331 (1972).
Rinooy Kan, A.H.G., Machine Scheduling Problems, Martinus Nijhoff, The Hague (1976).
Schweppe, F.C. and D.L. Gray, "Radar Signal Design Subject to Simultaneous Peak and
Average Power Constraints," IEEE Transactions on Information Theory, 12, 13-26
(1966).
Shapiro, R.D., "Scheduling Coupled Tasks," Harvard Business School, Working Paper,
HBS 76-10.
SEQUENCING INDEPENDENT JOBS
WITH A SINGLE RESOURCE
Kenneth R. Baker
Dartmouth College
Hanover, New Hampshire
Henry L. W. Nuttle
North Carolina State University
Raleigh, North Carolina
ABSTRACT
This paper examines problems of sequencing n jobs for processing by a sin-
gle resource to minimize a function of job completion times, when the availa-
bility of the resource varies over time. A number of well-known results for
single-machine problems which can be applied with little or no modification to
the corresponding variable-resource problems are given. However, it is shown
that the problem of minimizing the weighted sum of completion times provides
an exception.
1. INTRODUCTION
We consider the problem of sequencing a set N = {1, 2, . . . , n) of jobs to be processed
using a single homogeneous resource, where the availability of the resource varies over time.
If t represents time (measured from some origin / = 0) then we denote by r{t) the resource
available at time t and by R (?),
R(t) = f' r(u)du
the cumulative availability as of time /, i.e., the area under the curve r(u) over the interval
[0,t]. See Figure 1.
Let pj, j = I, ... , n, denote the resource requirement of job / Once Pj units of
resource have been applied to job j, the job is considered complete. We denote the completion
time of job j by C,. In all problems treated the objective is is to minimize G, a function of the
completion times of the jobs, where G is assumed to be a regular measure (see [1], Chapter 2).
This model is a generalization of the single-machine sequencing model. The generaliza-
tion to a resource capacity that varies over time allows for situations in which machine availabil-
ity is interrupted for scheduled maintenance or temporarily reduced to conserve energy. It also
allows for a situation in which processing requirements are stated in terms of man-hours and
labor availability varies over time.
499
500
K.R BAKER AND H.L.W. NUTTLE
r(() n
3--
2-
H 1-
-l 1 1 1 h
C,
RU)k
SEQUENCING INDEPENDENT JOBS 501
In the single-machine case the resource profile r(t) is constant (typically r{t) = 1), and
the cumulative profile R{t) is a straight line with slope r(t). Time is measured in some basic
unit such as hours; and completion times, ready times, due dates and tardiness are expressed in
the same units. Resource requirements (processing times) are simply requirements for inter-
vals on the time-axis.
In the variable-resource problem, the exact correspondence between the requirement for a
unit of resource and the requirement for a unit interval on the time axis is lost. This lack of
correspondence arises from the fact that there may be a number of units of resource available
during a particular unit of time and a different number during the next. In the single-machine
problem if a job j is sequenced to follow jobs in B (where B is any subset of N ) then job j will
be complete at time Cn
C, = p(B) +Pj,
where p(B) = ]Tp,, and p, denotes the processing time for job i. In the variable-resource
problem it is appealing to analogously specify the completion time of job j by Cr
(1) C,= t(p(B) +/>,.)
where p, is the resource requirement of job / and t{Q) is the (smallest) point on the time axis
corresponding to R(t) = Q. See Figure 1. In effect, jobs are sequenced on the resource axis,
while their completion times are measured on the time axis. For the single-machine problem
; the completion point of job j is the same on both axes, but such is not the case for the
! variable-resource problem.
Notice that this specification implicity assumes that the resource available at any point in
time is devoted entirely to the processing of a single job. Thus, for example, if ten men were
available in a particular hour, all ten would be assigned to work simultaneously on the same
job. Also, if the available resource represents several machines, then this formulation permits
each job to be processed simultaneously on more than one machine. Equivalently, this means
that jobs must be divisible into portions that can be allocated equally to the number of
machines available. Such a formulation will be called a continuous-time model.
In order to allow for a wider range of applicability, we can re-formulate the model in
discrete time as follows.
(a) Unit intervals on the time axis (of Figure 1) are called periods, and job comple-
tion times are measured in periods.
(b) In a given period the resource availability is an integer number of units.
(c) Each job requires an integer number of resource-periods.
(d) Processing work is divisible only to the level of one resource-unit for one period.
Under this formulation, for example, the time unit might be days, the resource availability
might be crew size, and the processing requirement might be man-days. Property (d) then res-
tricts the refinement of a schedule to the assignment of each crew member's task on a day-by-
clay basis. Furthermore, a task requiring two man-days could be accomplished either by one
crew member working two days or by two members working one day each.
In the discrete-time context, we may regard sequencing as ordering jobs on the resource
scale in Figure 1, but taking the completion time of job j to be [Cf] vs [C,-], the smallest
502
K.R. BAKER AND H.L.W. NUTTLE
integer greater than or equal to Cj, where Cj itself is given by (1). In other words, we obtain a
sequence using the continuous-time framework, which assumes arbitrarily divisible jobs, but we
round up the resulting completion times when they are noninteger. Under this interpretation
of the model, due-dates are specific days and a job is "on time" as long as it is completed on or
before the specific day. Clearly, in the discrete time model several jobs can have the same
completion time.
To verify that a job sequence can be interpreted consistently with requirement (d), note
that the cumulative resource requirement and the cumulative resource availability by the end of
any period are both integers. It follows that the workload implied by the continuous-time solu-
tion can be shifted to meet the integer restrictions of the discrete-time model since the resource
availability in any period can be treated as a set of unit-resource availabilities. Then any frac-
tion of a day's work in the original solution can be rescheduled as a day's work for the same
proportion of the total resource units available. This rescheduling will consume an integer
number of resource-periods for each job.
As an example, consider the three-job problem shown below.
J
1
Pj
r(t)= 1 0 < t < 4
r{t) = 4 4 < t < 7
In Figure 2 we represent the sequence 1-2-3 assuming infinite divisibility. In Figure 3 we show
how the work is rescheduled to meet the integrality requirement of the discrete-time model.
As Figures 2 and 3 indicate, the discrete-time conditions can be incorporated by a minor adjust-
ment of continuous-time job assignments that essentially involves replacing vertical portions of
the schedule chart with horizontal portions whenever the available resource capacity is split
among two or more jobs within a period.
r(/),i
Figure 2.
Our purpose in this paper is to note that certain well-known results for the single-machine
model carry over with little or no modification to the variable-resource model. In fact, we
found only one exception. (See Section 3.)
A variable resource problem has also been examined by Gelders and Kleindorfer [6,7] in
the context of coordinating aggregate and detailed scheduling decisions. In their model the
variation in resource availability results from the explicit decision to schedule overtime. This
SEQUENCING INDEPENDENT JOBS
503
decision leads to a cumulative resource availability function consisting of segments with identi-
cal positive slope (corresponding to capacity available) separated by horizontal segments
(corresponding to unused overtime.) Their objective is to determine when and how much
overtime should be scheduled, and to determine the associated job sequence, so as to minimize
the sum of overtime, tardiness and flow-time costs. They also note that for a given overtime
schedule, shortest-first sequencing minimizes mean job completion time while nondecreasing
processing time-to-weight ratio sequencing may not minimize mean weighted job completion
time. These two results are encompassed in our general treatment of the variable-resource
model in Sections 2 and 3.
U)ji
Figure 3.
2. RESULTS THAT GENERALIZE TO VARIABLE RESOURCES
The following is a set of sequencing results for the variable-resource model that are ident-
ical to or slight modifications of their single-machine counterparts. It is not difficult to establish
that the results we give are valid for both the continuous-time and discrete-time models. How-
ever, proofs are omitted, since they are typically direct extensions of the original arguments in
the single-machine case.
The results involve sequences of jobs, or at least partial sequences. We reiterate that
these sequences can be viewed as applying to the resource axis in Figure 1 but can be converted
to completion time schedules in either the continuous-time or discrete-time case by means of
the appropriate transformation. We use Cj to denote the completion time of job j and r(p(B))
to denote the makespan for the jobs in /?, recognizing that in the discrete-time case these quan-
tities must be interpreted in the appropriate way.
Minimizing the Maximum Cost
One of the few efficient algorithms for a broad class of sequencing criteria is Lawler's pro-
cedure [9] for minimizing the maximum cost in the sequence. Formally, the criterion is to
minimize
G = max {gj(Cj)}
where £/(C,-) is the cost incurred by job j when it completes at C, and where gj(t) is nonde-
creasing in /. The solution procedure works by constructing a sequence from the back of the
schedule and the procedure is easily adapted to the variable-resource model, as shown below.
504 K.R. BAKER AND H.L.W. NUTTLE
1. Initially let A = <f>. (A denotes the set of jobs at the end of the schedule and
A' = N — A denotes its complement.)
2. Find M = t(p(A')). (Mis the makespan for unscheduled jobs.)
3. Identify job k satisfying gk{M) = min {g,(M)). (Considering only the
unscheduled jobs, job k is the one that achieves the minimum cost when
scheduled last.)
4. Schedule job k last among the jobs in A'. Then add job k to A and return to Step
2 until A = N.
A noteworthy special case is the criterion of maximum tardiness. The procedure
sequences jobs in nondecreasing order of due-dates in this case. Thus, as in the single machine
problem, earliest due-date (EDD) scheduling will minimize the maximum tardiness. It will
also find a schedule in which all jobs complete on time, if such a schedule exists.
Minimizing the Sum of Tardiness Penalties
Many problems of considerable interest for the single machine model may be regarded as
special cases of the problem of minimizing total tardiness penalty,
where T, = max(C, — d,\ 0) and wl > 0.
Several dominance properties, in the spirit of Emmons [3], can be shown to hold for the
variable resource problem. These in turn imply similar dominance properties for the various
special cases and, in some instances, optimizing (ranking) procedures. Let:
J — a set of jobs
J' = the complement of J
A, = the set of jobs known to follow job ;', by virtue of precedence conditions.
B, = the set of jobs known to precede job /', by virtue of precedence conditions.
Cj = the time required to process the jobs in set /, defined by R (C,) — £/?,
B' = B,U { j) = the set containing job j and all jobs known to precede job / by virtue of
the precedence conditions.
A* = A- — {j} = the set containing the complement of A,, but excluding job /
THEOREM 1: If wk < w. and dk ^ max(rf,, C..) then j precedes k in an optimal
sequence.
THEOREM 2: If dk ^ C . then j precedes k in an optimal sequence.
Ai
THEOREM 3: If p, < pk, w, > wk and d, < max (dk, Cfl.), then j precedes k in an
optimal sequence.
COROLLARY (Theorem 3): If Wj > wk, p, < pk and d, < dk, then j precedes k in an
optimal sequence. The corollary immediately yields an optimal ranking procedure for problems
derived by making constant any two of the three parameters. For example, when G = ^ T,
with w, = w and d, = d, an optimal sequence is determined by ordering the jobs by processing
!
SEQUENCING INDEPENDENT JOBS 505
requirement, smallest first (p\ ^ p2 ... < p„). When d = 0 we have T, = Cn i.e., the mean
flowtime problem, for which this sequence is called shortest processing time (SPT).
The problem of minimizing the total tardiness penalty when pt = p is also not difficult to
solve. Constant resource requirements imply a fixed sequence of completion times under any
sequence. In particular the first job completes at tip), the second job at t(2p), etc.; and an
optimal schedule may be found by assigning jobs to positions, as in Lawler [10]:
xu = 1 if job /appears in sequence position j
= 0 otherwise
Cy = the penalty for job /'when it appears in sequence position j, i.e. max {0, t{jp) — d,}.
The problem is to minimize £ ]L cuxU
i l
Subject to
Tfii = 1
i
An assignment algorithm can produce the optimal solution.
The most general version of the single-machine problem, with unequal due-dates, pro-
cessing times, and weights is binary NP-complete. The computational complexity of the cases
in which w, = w or d, = d > 0 is an open question. However, pseudo-polynomial algorithms
have been developed by Lawler [11] and Lawler and Moore [12]. The algorithms which have
demonstrated the most effective computational power for the problems are those found in [14].
These and other enumerative algorithms can be modified in a straightforward manner to accom-
modate the variable resource problem.
Minimizing The Weighted Number of Tardy Jobs
In this case we are interested in whether a job is tardy rather than the the length of time
by which it is tardy. Let 8(7}) = 1 indicate that job j is tardy and 8(7}) = 0 indicate that it is
completed on time. If each job has its own penalty for being tardy, i.e.,
G= ^w/iiTj),
then the single-machine problem is binary NP-complete, although it can be solved by a
pseudo-polynomial dynamic programming algorithm due to Lawler and Moore [12]. The algo-
rithm can easily be adapted to the variable-resource problem with no impact on computational
efficiency.
By restricting the data we obtain special cases that are solvable by ranking algorithms, just
as in the single-machine case:
THEOREM 4: When d, = d for all jobs, if the processing times and weights are agreeable
(Pi ^ pj whenever wt > w,) then an optimal sequence is obtained by scheduling the jobs in
order of processing requirement, shortest first (in order of weight, largest first).
COROLLARY (Theorem 4): When d, = dand pt = p, an optimal sequence is obtained by
' scheduling jobs in order of weight, largest first.
506
K.R. BAKER AND H.L.W. NUTTLE
When Wj = w for all jobs a sequence that minimizes the number of tardy jobs i.e.,
G = £8(7^), can be determined by generalizing an efficient algorithm due to Moore [13].
Since maximum tardiness is minimized by sequencing the jobs in EDD order, it follows that if
sequence S yields minimum G, then so will sequence S', in which the on-time jobs in S are
scheduled in EDD order followed by all the tardy jobs in S. Letting S„ represent the largest
possible set of on-time jobs (so that G = n — \S„\ is the minimum number of tardy jobs) S„
can be determined as follows:
Order and index the jobs in N such that dx < d2 ^ • • -^ d„ (where ties are bro-
ken arbitrarily). Set SQ = 0 and k = 1.
If k = n + 1 stop. S„ is an optimal set.
If t
Z p, + Pk
^ dk set Sk = Sk_\ U {k}, otherwise let pr = max
[Pj\j\ Sk.x U{k\] and set Sk = ^_, U[k] - {r}.
4. Set k = k + 1 and return to step 2.
Constrained (Secondary Criterion) Problems
Several authors have addressed the problem of sequencing n jobs on one machine so as to
optimize one criterion while restricting the set of sequences so that all or some jobs also satisfy
another. We include four such problems here. In particular,
(a) Minimize total (mean) flow time given that a subset E of the jobs are to be on time
(Burns and Noble [2] and Emmons [4], i.e.,
min G = £C,
s.t. C, < d„ i € E
(b) Minimize maximum tardiness given that a subset E of the jobs are to be on time
(Burns and Noble [2]), i.e.,
min G = max T,
s.t. C, < rf„ / 6 E
(c) Minimize mean flow time over all sequences which yield minimum maximum cost
(Emmons [5] and Heck and Roberts [8]), i.e.,
min G = £C,
s.t. g,(C,) < Gm, i e N
where g,(C) is a non-decreasing function of Cand G„, = min {max g,(C,)}
(d) Minimize the number of tardy jobs given that a subset E of the jobs is to be on time
(Sidney [15]), i.e.,
min G = J^Sd1,)
subject to C, < dj i € E.
SEQUENCING INDEPENDENT JOBS 507
In all cases the algorithms originally developed for the single-machine problem can easily
be adapted to the variable-resource problem.
The first three problems can be solved by a one pass algorithm which sequences jobs one
, at a time from last to first. Suppose that jobs have been assigned to positions k + 1 through n.
Let Nk be the set of jobs as yet unsequenced and Lk be the subset of Nk that can be assigned
position k without violating the constraint. A job from Lk, say job j, is then chosen according
to a certain rule and sequenced in position k. Then Nk_l = Nk — {j}, Lk-\ is generated, and a
job is sequenced in position k— 1, etc.
Letting Ek = Nk Q £and p(Nk) = £ pk, then for problems (a) and (b)
Lk = [Nk - Ek) U {j\j € Ek- dj > t(p(Nk)))
' while for problem (c)
Lk= U\j € Nk-gl(t(p(Nk))) ^ Gj
The rule for choosing the job for position k in problems (a) and (c) is choose j such that
Pj = max p,
while for problem (b), j is chosen such that
dj = max dj.
i€Lk
Problem (d) may be solved by modifying the due-dates to reflect the fact that if
d, ^ dk, k € £, and job / is to be on time in a feasible sequence then i must be completed by
! t(R(dk) — pk). Then Moore's algorithm can be applied, with an adjustment to assure that jobs
in £will be on time. This is essentially the procedure developed by Sidney [15].
Nonsimultaneous Arrivals
In the preceding sections all jobs are assumed to be available for sequencing at time zero.
We now consider problems in which job j is not available for processing until the beginning of
period rr where rt ^ 1. If, in this situation, it is possible to interrupt the processing of a job
and resume it later without loss of progress toward completion of the job, we say that the sys-
tem operates in a "prempt-resume" mode.
For single-machine problems with criteria maximum tardiness (G = max Tj) or total
(mean) completion time (G = £ C,) when prempt-resume applies; the static optimizing rules
EDD and SPT can be generalized in a straightforward manner to produce optimal sequences
when all jobs are not simultaneously available ([1] p. 82). The same generalizations apply when
resource availability varies with time, using the following procedure:
1. At time zero if one or more jobs are available assign the resource to process the
available job with the smallest (most urgent) priority. Otherwise leave the
resource idle until the first job is available.
2. At each job arrival, compare the priority of the newly available job j with the
priority of the job currently being processed. If the priority of job j is less, allow
job j to preempt the job being processed; otherwise add job j to the list of avail-
able jobs.
508
KR. BAKER AND H.L.W NUTTLE
3. At each job completion, examine the set of available jobs and assign the resource
to process the one with the smallest priority.
In order to minimize maximum tardiness, the priority of a job is taken to be its due-date, and
to minimize mean flowtime the priority is its remaining resource requirement.
3. MINIMIZING THE SUM OF WEIGHTED COMPLETION TIMES
One case for which the single-machine result does not generalize in a straightforward
manner to the corresponding variable-resource problem is the case sequencing to minimize the
sum of weighted completion times, where
G = £wyCy
when all jobs are available at time zero.
Sequencing jobs in nondecreasing ordef of the ratio p,/Wj, which will always minimize G
in the single-machine problem, need not yield an optimal sequence when the resource availabil-
ity varies with time. The following simple example demonstrates this fact.
EXAMPLE
j
1
2
3
Pi
7
3
6
Wj
5
2
4
Pj/Wj
1.4
1.5
1.5
rU) =1 0 ^ t < 4
,-(,) = 4 4^/^7 M =7
Sequencing the jobs in nondecreasing order of Pjlwj yields the order 1-2-3, for which the com-
pletion times are 4.75, 5.5 and 7. Therefore, G = 62.75. For the sequence 2-1-3 the comple-
tion times are 3, 5.5 and 7, with G = 61.5. (Under the discrete time framework G = 65 for
1-2-3 but G = 64 for 2-1-3.)
While the differences in G-values may seem almost insignificant it is possible to construct
an example in which sequencing by increasing ratios pjw, will yield an arbitrarily bad solution.
Consider the data for a two-job problem in which
1 2
Pi
10'"
5 x 102"'
Wj
1
10"'
Pj/Wj
10'"
5 x 10'"
r(t) = 5 x 102'", 0 ^ t < 1
, 1 < / < 1 + 10'".
< Pi/w2) and S\ the sequence 2-1, for large m
rU) = 1
Letting S represent the sequence 1-2 (p\/w
have
we
G w-J ~ 1 (If)"')
G(S') " 2U '"
SEQUENCING INDEPENDENT JOBS 509
For the special case in which the processing times and weights are agreeable (p, ^ p,
whenever w, ^ w,) sequencing by nondecreasing ratios of Pj/wj does produce an optimal solu-
tion (see Theorem 4). Otherwise the two examples given in this section reinforce the notion
that the single-machine result cannot be extended to even the simplest versions of the
variable-resource model. In one example the resource profile r(t) is nondecreasing, while in
the other example r(t) is nonincreasing. In both cases there is only one change in r(t). These
situations would appear to be among the least drastic ways of relaxing the constant resource
assumption; but, as we have demonstrated, the ratio rule still fails. At this point, we can con-
clude only that the minimization of Ew^C, involves more than a simple extension of the
single-machine result. Obviously, any optimal ordering rule (if one exists) would have to
involve information about the resource profile as well as information about processing require-
ments and weights. We conjecture that this problem is NP-complete.
4. COMMENTS
Although it is not possible to extend all single-machine results directly to the variable-
resource case, a few observations can be made. A look at Figure 1 indicates that the graph of
R (t) transforms processing times (on the horizontal axis) into resource consumptions (on the
vertical axis), and vice-versa. This transformation is at least order-preserving. In particular,
the makespan for a set A of jobs is at least as large as the makespan for set B when the jobs in
A have a total processing requirement that equals or exceeds the requirement of the jobs in B.
This property is fundamental to the proof of many single-machine results as they carry over to
variable-resource models. Moreover, the results for problems in which pj = p do not rely on
the precise nature of the transformation, but depend only on the fact that all solutions share a
common nondecreasing sequence of completion times.
In the single-machine case, R(t) is linear, implying that the mapping of resource con-
sumptions into processing times is proportionality-preserving as well as order-preserving. That
is, ratios of intervals on the resource axis convert to identical ratios on the time axis. This pro-
perty is not maintained in the variable-resource model, because the transformation distorts pro-
portionality. In particular, we have in the single-machine problem that pj pj < wi/w/ implies
AC//AC,- < Wj/wj, where AC, and AC,- denote the magnitude changes in the completion times
of adjacent jobs /' and j which are interchanged in sequence. This implication does not hold in
the variable-resource problem, so the pairwise interchange argument may fail.
These observations lead to the conclusion that single-machine results involving minimum
weighted sum of completion times cannot be directly extended. An open question is therefore
how to exploit the structure of this problem in the variable-resource case in order to find
optimal solutions.
REFERENCES
[1] Baker, K.R., Introduction to Sequencing and Scheduling, Wiley (1974).
[2] Burns, R.N. and K.J. Noble, "Single Machine Sequencing with a Subset of Jobs Completed
on Time," Working Paper, University of Waterloo, Canada (1975).
[3] Emmons, H., "One Machine Sequencing to Minimize Certain Functions of Job Tardiness,"
Operations Research, 77, 701-715 (1969).
[4] Emmons, H., "One Machine Sequencing to Minimize Mean Flow Time with Minimum
Number Tardy," Naval Research Logistics Quarterly, 22, 585-592 (1975).
[5] Emmons, H., "A Note on a Scheduling Program with Dual Criteria," Naval Research
Logistics Quarterly, 22, 615-616 (1975).
510 K.R. BAKER AND H.L.W. NUTTLE
[6] Gelders, L. and P. Kleindorfer, "Coordinating Aggregate and Detailed Scheduling in the
One Machine Job Shop: Part I," Operations Research, 22, 46-60 (1974).
[7] Gelders, L. and P. Kleindorfer, "Coordinating Aggregate and Detailed Scheduling in the
One-Machine Job Shop: Part II," Operations Research, 23, 312-324 (1975).
[8] Heck, H. and S. Roberts, "A Note on the Extension of a Result on Scheduling with a
Secondary Criteria," Naval Research Logistics Quarterly, 19, 403-405 (1972).
[9] Lawler, E.L., "Optimal Sequencing of a Single Machine Subject to Precedence Constraints,"
Management Science, 19, 544-546 (1973).
[10] Lawler, E.L., "On Scheduling Problems with Deferral Costs," Management Science, 11,
280-288 (1964).
[11] Lawler, E.L., "A Pseudopolynomial Algorithm for Sequencing Jobs to Minimize Total Tar-
diness," Annals of Discrete Mathematics, /, 331-342 (1977).
[12] Lawler, E.L. and J.M. Moore, "A Functional Equation and Its Application to Resource
Allocation and Sequencing Problems," Management Science, 16, 77-84 (1969).
[13] Moore, J.M. "An n Job, One Machine Sequencing Algorithm for Minimizing the Number
of Late Jobs," Management Science, 15, 102-109 (1968).
[14] Schrage, L.E. and K.R. Baker, "Dynamic Programming Solution of Sequencing Problems
with Precedence Constraints," Operations Research, 26, 444-449 (1978).
[15] Sidney, J.B., "An Extension of Moore's Due Date Algorithm," Symposium on the Theory
of Scheduling and Its Application, (S.E. Elmaghraby, editor) Lecture Notes on
Economics and Mathematical Systems 86, Springer-Verlag, Berlin, 393-398 (1973).
EVALUATION OF FORCE STRUCTURES UNDER UNCERTAINTY
Charles R. Johnson
Department of Economics and Institute for
Physical Science and Technology
University of Maryland
College Park, Maryland
Edward P. Loane
EPL Analysis
Olney, Maryland
ABSTRACT
A model, for assessing the effectiveness of alternative force structures in an
uncertain future conflict, is presented and exemplified. The methodology is ap-
propriate to forces (e.g., the attack submarine force) where alternative unit
types may be employed, albeit at differing effectiveness, in the same set of mis-
sions. Procurement trade-offs, and in particular the desirability of special pur-
pose units in place of some (presumably more expensive) general purpose
units, can be addressed by this model. Example calculations indicate an in-
crease in the effectiveness of a force composed of general purpose units, rela-
tive to various mixed forces, with increase in the uncertainty regarding future
conflicts.
INTRODUCTION
In planning the procurement of major weapons systems (submarines, aircraft, ships, etc.),
an argument, based upon relative cost-effectiveness in certain uses, may be made for the
development and purchase of some items which are less versatile and effective than the "best"
available components of an overall force. Assuming all relative costs and effectivenesses
known, such an argument is sound at least to the extent that the uses necessitated by a poten-
tial conflict are anticipated. However, under uncertainty about the nature of potential conflicts,
a question, in general more subtle, is raised regarding the optimal composition of forces. In
this case, a model is developed here to analyze the utility of "mixed" force structures, and
examples are given to support the intuitive notion that the less specific are the presumptions
about needs in a future conflict, the more valuable are the most versatile forces.
Our focus here is upon presenting a model able to capture the value, under uncertainty,
of versatile forces and not upon the equally important problem of determination of cost and
effectiveness parameters. The latter, as well as the mixture versus force level interaction, are
touched upon tangentially in an example. The parameter estimation problem, in general,
requires both large scale theoretical and empirical effort and has been addressed, in the subma-
rine case, in Reference [1].
511
512 CR. JOHNSON AND E.P. LOANE
By general purpose forces we shall mean the most versatile, advanced or effective com-
ponents which technology would currently allow in building a military force structure. Special
purpose forces, on the other hand, might be competitive in effectiveness with general purpose,
but only in some of the uses (which we shall call missions) which possible conflicts might
require. Naturally, we presume that the general purpose are more expensive than the special
purpose forces per item, and further that the special purpose forces are cost effective, in some
missions. It is assumed also that all costs are accounted for, e.g., development, production,
maintenance, operation, repair and logistical mobility, etc.
Examples of general versus special purpose forces include the following. In the case of
submarine forces, the general purpose would be the newest fully equipped nuclear submarine
while a special purpose alternative would be the conventional diesel submarine found in many
European navies. The former is presumed at least as effective in all missions (much more so in
some) while the latter is much less expensive and nearly as effective in some missions requiring
only low mobility. In the case of aircraft, a long-range fighter-bomber might be considered gen-
eral purpose and a plane designed primarily for ground attack would be special purpose.
The force planner must procure some mixture of forces, constrained, presumably, by a
fixed budget. In general there may be several force types, ranging from the very general to the
very special purpose, and we may think of the force structure as being a vector of inventories
of each type purchased. We think of a conflict as simply a collection of mission opportunities,
and the planner's problem is then to procure that force structure which permits the most
effective deployment for a conflict. For a specified conflict, this poses a deterministic optimiza-
tion problem which, if the conflict includes enough important mission opportunities in which
the special purpose forces are cost effective, will surely suggest a mixed force structure includ-
ing at least some special purpose units.
However, procurement of weapons systems must generally be decided upon long in
advance of potential conflicts. For a variety of additional reasons, there will likely be consider-
able uncertainty as to the precise nature of an actual conflict. We consider this uncertainty to
be characterized by a (known) distribution of potential conflicts, i.e., a distribution of mission
opportunities. We note that there are other ways in which uncertainty might be treated. For
example, if one's own force structure is known, a hostile adversary might be expected, to the
extent that circumstances allow, to bias a conflict in a direction which would render one's own
force least effective. This suggests a game theoretic approach. Although it is not pursued
further here and although its information requirements might be great, this would naturally fit
into the model context we outline below. It seems likely that such a treatment would value the
versatility of general purpose forces more so than the one we pursue. Another alternative
would be to treat the effectiveness of each unit type as unknown and characterize it by a proba-
bility distribution.
The planner's problem which we address is then to choose that affordable mixture of
forces which, assuming optimal deployment in any conflict, yields the largest expected
effectiveness in the uncertain conflict. It should be noted that, as stated, there is an implicit
assumption that the planner is willing to take the risk that the solution mixture will produce
unusually low effectiveness in some conflicts. (This is in contrast with the game theoretic
approach mentioned above.) However, to the extent that the planner is risk-adverse rather
than risk-neutral, other criteria may be substituted for "expected effectiveness" without concep-
tual difficulty and probably without operational difficulty in the development below. It should
also be mentioned that a measure of the value of the versatility of general purpose forces under
uncertainty lies in comparing the solution mixture of the above problem to the optimal mixture
when the expected conflict is assumed known (i.e., the case of certainty). In general the
"expected effectiveness" solution will differ from the "expected conflict" solution.
EVALUATING FORCE STRUCTURES 513
MODEL DESCRIPTION
We imagine n force types 7), j — 1, ..., n and m different mission categories Uh
/ = 1, . . . , aw, in which a component of the force might be engaged. Each 7} is more or less
effective in a given U, which, to the extent that total effectiveness is linear in the deployment
of force types to mission categories, suggests the definition of an m-by-n unit effectiveness matrix
E-(eu),
in which etj indicates the effectiveness of a unit of 7} employed in U, for a unit of conflict
(presuming opportunities available). We denote by a \-by-n vector s, a particular force composi-
tion in which Sj is the number of 7} available. At the time of a conflict, s is fixed and, there-
fore, provides a constraint on the total effectiveness attainable. A particular conflict is charac-
terized by the total opportunity for effectiveness which may be obtained from each mission category.
These bounds are summarized in an m-by-1 vector b in which 6, is the maximum opportunity
in Uh This bound is expressed in effectiveness units rather than force units because the
"opportunities" are opportunities to damage the opponent and the force types vary in their abil-
ity to do so in a given mission.
The m-by-n matrix A = (fly) summarizes the allocation (or deployment) of 7} to Uh i.e.,
Ojj is the amount of force type 7} allocated to mission category U, during a conflict. The atJ are
necessarily nonnegative but we do not assume them integral because of the possibility of
switching units among missions.
The problem of waging a given conflict is then to deploy the given force so as to maxim-
ize total effectiveness within the constraint of the opportunities the conflict presents. In general
(no linearity assumption), total effectiveness is some function
e = e {A )
of the allocation, and, furthermore,
e(A) = e{(A) + ... + em{A),
where e,{A) is the effectiveness A yields through the /th mission category. This means that
waging the known conflict b amounts to the optimization problem:
maximize e {A )
m
subject to £ a^ ^ Sj, j = 1, . . . , n
i=\
e,(A) < b,, i = 1, ... , m
a, > 0.
In case total effectiveness is linear in A, we have the linear programming problem:
m n
maximize ]£ £ a^e-^
<=1 7=1
m
subject to £ ay < Sj, j — I, ... , n
, m
£ flyfy < K
/ = 1,
7=1
a0 > 0.
-
514
C.R. JOHNSON AND E.P LOANE
In either case we denote the maximum achieved by Mis,b). Then, equicost force compositions
s may be compared, for a given conflict, by comparing the M{s,b). A good general reference
for relevant concepts in the linear case is Reference [2].
Uncertainty as to the nature of the conflict is characterized by a probability distribution for
b. For a given s, there is an M(s,b) for each possible value of b. These may then be averaged
according to the distribution of b to obtain the expected value:
Mis) = Eb{M{s,b)).
Comparisons among force compositions may- then be made by comparing the Mis), and the
planner's problem is to
maximize Mis)
subject to his budget constraint governing the possible forces 5 which may be purchased. In
general,
max EbiMis,b)) j* max Mis,Ebib)),
s s
and in the case that effectiveness is linear in A,
max EbiMis,b)) < max iMis,Ebib)).
s s
Thus, the maximum expected effectiveness problem has a different solution from the problem
of maximum effectiveness is an expected conflict, so that uncertainty makes a difference in
planning. We present examples which illustrate this, and in which the latter favors special pur-
pose forces while the former favors general purpose, presumably because of their greater ability
to defend against variation (uncertainty). The suggestion is that the more uncertainty there is,
the greater the value of general purpose forces.
EXAMPLES
We conclude by giving two examples. The first is primarily to illustrate the evaluation
model and some of the remarks made. The second includes a more thorough examination of
the model and its assumptions in a detailed example intended to be suggestive of a realistic case
which motivated this study.
EXAMPLE 1: Here we imagine three force types. Type T{ is the general purpose, and T2
and T3 are different special purpose forces. There are also three mission categories. Type T2 is
cost effective relative to T\ in mission U\, while r3 is cost effective relative to T\ in t/3. Total
effectiveness is assumed linear in allocations and the unit effectiveness matrix is
E =
1
.7
.1
1
.1
.1
. 1
.1
.7
We consider seven equicost force compositions
s1 = (9, 0, 0)
s2 = (6, 3, 3)
s3 = (6, 6, 0)
s4 = (6, 0, 6)
55 = (5> 4> 4)
f6 =
,7 =
(5, 8, 0)
(5, 0, 8).
EVALUATING FORCE STRUCTURES
515
Thus, the two special purpose forces cost half as much as the general purpose over the range of
procurement considered. (Actually, the outcome will not differ qualitatively if more alterna-
tives based upon the 2-for-l trade-off are considered.)
There are six possible conflicts
,1 _
0
6
6
12
0
6
b7 =
6
b3 =
0
b4 =
0
/>5 =
12
6
0
6
0
0
and b6 =
0
0
12
with the first three presumed to have probability 2/9 each and the last three probability 1/9
each. Thus, the expected conflict is
b =
Straightforward calculations then yield
Misx) = 9
Mis1) = Mis3) = Mis4) = 8.6, and
Mis5) = Mis6) = Mis1) = 8.47
so that
max Mis') = 9
KK7
is achieved at s\ the all general purpose force. On the other hand,
Mis\ b) = 9
while
Mis2, b) = 10.2, Mis3, b) = Mis\ b) = 10.1,
Mis5, b) = 10.6, and Mis6, b) = Mis1, b) = 9.3.
Thus, a mixed force s5 is optimal for the expected conflict. The conclusion, in this case, is that
general purpose forces are overall more cost efficient under uncertainty. It should be noted that
in calculating the Mis'), each other force had higher effectiveness than s1 for some conflicts
(but not overall) and all were better than s1 in the expected conflict. Thus, it is only the value
of versatility under uncertainty which makes s1 preferred.
EXAMPLE 2: This example is taken from the problem of submarine procurement and
again illustrates the effect of uncertainty on the attractiveness of special purpose forces.
For simplicity, we consider only two types of forces, general purpose and special purpose
units. In this setting, the distinction between new procurement general purpose or special pur-
pose forces might well be that between nuclear or diesel-electric propulsion. Equipment and
weapons could be identical, but the lower underwater mobility inherent in diesel-electric pro-
pulsion would limit effective employment of such forces to particular ASW missions. In the
actual planning process, the existing force structure must also be considered since in a future
conflict, presently existing units might be restricted to low vulnerability missions (presumably
being less capable than new procurement general purpose units) and thus constitute additional
categories of special purpose forces.
The present example considers four missions and measures unit effectiveness in each mis-
sion by a kill rate defined by:
516 C.R JOHNSON AND E.P LOANE
Kills (of enemy submarines) per unit
time by one on-station U.S. submarine
of type Tj engaged in mission Ut
,J [Number of surviving enemy submarines]
The above quantity is well defined for important submarine missions, being independent of
enemy force size and the number of U.S. submarines committed to U, over a substantial range
of values. For instance, considering a fixed barrier mission, the rate of enemy transits through
the barrier and thus the rate of opportunities for kill would be proportional to the number of
surviving units. Also, U.S. submarine probabilities of detection and kill given an opportunity
(here target passage through the barrier area assigned to the submarine) are, at least initially,
inversely proportional to the width of the barrier area assigned. In this circumstance, etj is well
defined. Of course, nonlinear effects are present and become significant as the number of U.S.
units is increased. One could argue that, as returns diminish, no additional submarines should
be assigned to the fixed barrier; this then determines the mission opportunities, bj. With units
of differing capabilities, bj is properly stated in terms of effectiveness obtained, not in some
fixed maximum number of units employed, since the onset of diminishing returns would occur
at different force levels for different unit effectivenesses. Finally, variations in bj (for the fixed
barrier mission) might arise from uncertainties in enemy basing, at-sea replenishment of sub-
marines, desirable barrier locations being untenable due to enemy ASW, etc.
Similar arguments apply for the direct support mission (submarines employed in the
defense of surface formations) and similar conclusions are obtained in the area search mission.
It should be noted that kill rates add, and that the summation
m n
i=\ 7=1
being an overall rate at which enemy submarines are being killed, is a sensible measure of
effectiveness for the entire U.S. submarine force. It is even plausible that the differing subma-
rine types would be assigned to missions so as to (approximately) maximize this sum. Finally,
to the extent that variations in bj reflect week-to-week changes within a single conflict (i.e., one
week large numbers of forces are required for direct support, the next week these same units
are used in a barrier) rather than uncertainty as to some long-term mix of missions that will be
required in an unspecified conflict, then the expected value
Eb(M(s,b))
can be interpreted as a time-average of force kill rate and again this is a preeminently sensible
measure.
It is the authors' belief that the use of kill rates as measures of unit effectiveness and the
linear formulation of force effectiveness, while necessarily involving some approximation, does
capture the important aspects of evaluating alternative submarine force structures. Of course,
in realistic applications, the evaluation of effectiveness for alternative forces is a substantial
effort. Reference [1] documents a major study effort which arrives at such estimates, although
not expressed as kill rates. Evaluation of force effectiveness is not addressed here. Quantita-
tive inputs to this second example, shown in the following tabulation, are completely hypotheti-
cal; and, while of reasonable relative magnitudes, are chosen to illustrate the theses of this
paper.
EVALUATING FORCE STRUCTURES
517
TABLE 1.
Unit Effectiveness, e,j
(Kill rates)
General
Purpose
Submarines
Special
Purpose
Submarines
Expected Total
Opportunity for
Effectiveness
EtUh)
Mission 1
1.0
.95
16
Mission 2
1.50
.50
16
Mission 3
.75
.375
12
Mission 4
.40
.20
Unlimited
TABLE 2.
Alternative Force Compositions, (s],s2)
(Numbers of Units on-station)
General
Purpose
Submarines
Special
Purpose
Submarines
35
0
25
10
20
17
15
24
; Unit effectivenesses and force compositions are stated in terms of on-station submarines; actual
numbers of operational units would be higher than, and not necessarily in proportion to, the
; numbers shown. The alternative forces shown might well be equal cost options if there were
, some fixed cost associated with deploying any special purpose submarines. The fourth mission
: is not limited in the number of forces which can be employed or the total effectiveness which
can be obtained. This might be thought of as undirected open-ocean search, which could
always be undertaken by any submarine not otherwise assigned.
The distributions of b, reflecting uncertainty, are represented by lists of 60 sample
vectors— each considered equally likely. The lists are not repeated here. Sample vectors were
generated by Monte-Carlo methods, assuming each b, is an independent truncated* Gaussian
random variable with the above stated mean and relative standard deviations of 35% and 60% in
the two cases considered. Effectiveness, for the alternative force compositions is shown in
Table 3, following.
The maximal effectiveness for each level of uncertainty is enclosed in dashes. Not
, surprisingly, the example values show a change in preference, from a mixed force to an all gen-
eral purpose force, as variability in mission opportunities increases. What is surprising is that
the changes, and differences are so small overall. This can be explained qualitatively, and is a
reflection of a real concern in procurement decisions.
'Both high and low values were discarded so as to preserve the mean value and assure that b, ^ 0.
518
C.R. JOHNSON \NI) E.P. LOANK
TABLE 3.
Force Compositions, 5
Force Effectiveness, ,
Mis)
(s\, s2)
(35, 0)
(25, 10)
(20, 17)
(15,24)
No Uncertainty
(mean value
b used)
38.2
37.9-
Relative
Standard
Deviation of
each b, = 35%
37.4
36.9
Relative
Standard
Deviation of
each b-, = 60%
36.5
35.8
39.1
37.6
36.2
37.9
37.1
36.0
In the present example, the attractiveness of special purpose units rests on the availability
of opportunities in mission 1; i.e., if b\ ^ 11.4 then forces including some special purpose
units are preferred to an all general purpose force. But mission 1 is a substantial (36%) of the
projected employment of submarines; if this were taken away, then the force is over-built and
any alternative composition is able to exploit the remaining attractive opportunities. That is, if
b\ — 0 then all force compositions entertained give about the same effectiveness; and as noted
above, if b\ ^ 11.4, compositions involving special purpose units are preferred. In this cir-
cumstance, i.e., with the numeric inputs to this example calculation, one cannot expect to see
dramatic changes in preferences among force compositions, with explicit consideration of
uncertainty.
As a final point, we note the suboptimality of separating questions of force composition
from questions of force levels. Although this raises an issue worthy of further study, we only
mention the issue here by extending the previous example. Using exactly the same unit
effectiveness and mission opportunity values stated previously, but considering alternative force
compositions which involve an additional 5 general purpose submarines, one obtains the follow-
ing results:
TABLE 4.
Force Compositions, s
Force Effectiveness,
Mis)
iS\, s2)
No Uncertainty
(mean value
b used)
Relative
Standard
Deviation of
Relative
Standard
Deviation of
each b; = 35%
each bi = 60%
(40, 0)
42.0
40.7
39.6
(30, 10)
(25, 17)
(20, 24)
41.6
40.3
39.0
39.6
42.8
41.0
41.7
40.7
39.7
In this case, the uncertainty considered does not lead to a preference for an all general purpose
force, although again the effects are very small. The tendency here is intuitively satisfying, i.e.,
special purpose units become more attractive as overall force levels are increased, relative to a
fixed job to be done. Notice also that increased uncertainty decreases the incremental
effectiveness of the additional five general purpose units, in every case.
EVALUATING FORCE STRUCTURES 519
REFERENCES
[1] Chief of Naval Operations, Future Submarine Employment Study (U), (29 December
1972)-SECRET.
[2] Dantzig, G.B., Linear Programming and Extensions, Princeton University Press (1963).
A NOTE
ON THE OPTIMAL REPLACEMENT
TIME OF DAMAGED DEVICES
Dror Zuckerman
The Hebrew University of Jerusalem
Israel
ABSTRACT
Abdel Hameed and Shimi [1] in a recent paper considered a shock model
with additive damage. This note generalizes the work of Abdel Hameed and
Shimi by showing that the w-priori restriction to replacement at a shock time
made in [1] is unnecessary.
1. INTRODUCTION
A recent paper by Abdel Hameed and Shimi [1] was concerned with determining the
optimal replacement time for a breakdown model under the following assumptions: A device is
subject to a sequence of shocks occurring randomly according to a Poisson process with parame-
ter A. Each shock causes a random amount of damage and these damages accumulate addi-
! tively. The successive shock magnitudes Yx, Y2, ■ ■ ■ , are positive, independent, identically dis-
tributed random variables having a known distribution function F(-). A breakdown can occur
only at the occurrence of a shock. Let 8 denote the failure time of the device. For / < 8 let
X(t) be the accumulated damage over the time duration [0,/]. The device fails when the accu-
mulated damage X{t) first exceeds Z. That is,
(1) 8 = inf{/ ^ 0; X(t) ^ Z\,
where Z is a random variable, independent of the accumulated damage process X, having a
known distribution function G() called the killing distribution. More explicitly, if X(t) = x
and a shock of magnitude y occurs, at time r, then the device fails with probability
n, G(x+y) - G(x)
, l-G(x) ■
Upon failure the device is immediately replaced by a new identical one with a cost of c. When
the device is replaced before failure, a smaller replacement cost is incurred. That cost depends
on the accumulated damage at the time of replacement and is denoted by c(x). That is to say
c(x) is the cost of replacement before failure when the accumulated damage equals x. It is
assumed that c(0) = 0 and c(x) is bounded above by c. Thus there is an incentive to attempt
to replace the device before failure. The condition c(0) =0 has to be interpreted as a policy of
no replacement if there is no damage.
In their paper Abdel Hameed and Shimi [1] derived an optimal replacement policy that
minimizes the expected cost per unit time under the restriction that the device can be replaced
only at shock point of time.
521
522
D ZUCKERMAN
In the present article we consider a similar breakdown model without the above restriction
made in [1]. We allow a controller to institute a replacement at any stopping time before failure
time. He must replace upon device failure. Throughout, we restrict attention to replacement
policies for which cost of replacement is solely a function of the accumulated damage. In some
shock models, replacement at a scheduled time offers potential benefits relative to replacement
at a random time. However, the problem of scheduled replacement in failure models with addi-
tive damage is an open problem and it is beyond the scope of the present study.
Let Fbe the replacement time. At time 7" the device is replaced by a new one having sta-
tistical properties identical with the original, and the replacement cycles are repeated
indefinitely. The collection of all permissible replacement policies described above will be
denoted by M. Our objective is to prove that an optimal policy replaces the system at shock
point of time. Thus the restriction about the class of permissible replacement policies made in
[1] can be omitted.
The following will be standard notation used throughout the paper: E[Y;A], where Yis a
random variable and A is an event, refers to the expectation E [IA Y] = E[Y\IA — \]P(A),
where lA is the set characteristic function of A.
2. THE OPTIMAL POLICY
By applying a standard renewal argument, the long run average cost per unit time when a
replacement policy Tis employed can be expressed as follows
E[c(X{T)); T < 8] + £[c; T = 8]
E[T)
(3)
«/>r =
Let i// * = inf <// r.
Clearly
.//* ^
E[c(X(T))\ T < 8] + E[c; T = 8]
£171
for every T € A/, and the optimal replacement policy that minimizes t// T over the set M is the
one that maximizes
(4)
0T = i}j*E[T] + E[c - c(X(T))- T < 8].
By applying Dynkin's formula (see Theorem 5.1 and its Corollary in Dynkin [2]) equation (4)
reduces to
(5)
where
(6) J(x) = i)j*-kc
0T= e\Jq J(X(s))ds
+ c,
J n
G(x+y)
G(x)
dF{y)
+ \
(x) - f c(x+y) — =—
J G(:
G(x+y)
x)
dF{y)
The proof of the above result follows a procedure similar to that used by the author in (Section
2 of [3]), and therefore is omitted.
In what follows we shall denote by S the state space of the stochastic process
[X(t); t < 8).
NOTE ON REPLACEMENT TIME
523
Let
(7)
and
(8)
S, = {x € S; J(x) > 0},
S2 = U € S; 7U) < 0}.
Let t\, t2, t^, . . . , be the shock points of time and define
W=[t,; i >l).
Let L be the subclass of replacement policies in which a decision can be taken only over the set
W.
We proceed with the following result:
THEOREM 1: For every replacement policy Tx £ L, there exists a replacement policy T2
6 L such that 9 T > 9 T .
'2 ' 1
PROOF: Let Tx be a replacement policy such that Tx $ L.
Let T(S2) be the hitting time of the set S2. That is
(9) T(S2) = inf{/ > 0; X{t) € S2).
(It is understood that when the set in braces is empty, then T(S2) = °°.)
Let
(10) T= inftf > Tf, t € W}
and define
(11) T2 = min{f, 7-(S2)}.
Clearly T2 € L. Next we show that 0 r > 0 r .
Using (5) we obtain
9T2-9T]= ElfJ2 J(X(s))ds
\s:
E\ J(X(s))ds
£"[// J(X(s))ds; T2= f]
(12)
fT(lS)J(X(s))ds\ t > T(S2)
Note that
I. {T2= f) implies that \T(S2) > t) and therefore E \fT J(X(s))ds; T2 = f 1 ^ 0
524
D. ZUCKERMAN
II.
J (X(s)) for T(S2) ^ s < T, is non-positive on the set {f > T(S2)}. Therefore
J ' J(X(s))ds\ t > T(s2)
^ 0.
Therefore, (using (12)) we obtain
9
r2-eT] > o
as desired.
Recalling that an optimal replacement policy T* is the one that maximizes 9 T and using
Theorem 1 it follows that T* € L. Hence, the optimal policy derived by [1] is the optimal one
among all possible replacement policies for which cost of replacement is solely a function of the
accumulated damage.
Finally it should be pointed out that if the benefits of scheduled replacement were con-
sidered, the conclusion reached, that an optimal policy replaces the device at a shock point of
time, would no longer generally hold.
REFERENCES
[1] Abdel Hameed, M. and I.N. Shimi, "Optimal Replacement of Damaged Devices," Journal of
Applied Probability 15, 153-161, (1978).
[2] Dynkin, E.G., Markov Processes I, Academic Press, New York, (1965).
[3] Zuckerman, D. "Replacement Models under Additive Damage," Naval Research Logistics
Quarterly, 24, 549-558, (1977).
■
A NOTE ON THE SENSITIVITY OF NAVY FIRST
TERM REENLISTMENT TO BONUSES, UNEMPLOYMENT
AND RELATIVE WAGES*
Les Cohen
Government Services Division
Kenneth Leventhal & Company
Washington, D.C.
Diane Erickson Reedy
Mathtech, Inc.
Rosslyn, Virginia
ABSTRACT
Multiple regression analysis of first term reenlistment rates over the period
1968-1977 confirms previous findings that reenlistment is highly sensitive to
unemployment at the time of reenlistment and shortly after enlistment, almost
four years earlier. Bonuses, particularly lump sum bonuses, were also shown to
be a significant determinant of reenlistment.
This note reports the results of cross-sectional multiple regression analysis of first term
Navy reenlistment. Equations which were estimated represent the completion of research con-
ducted by Cohen and Reedy [1] which analyzed the sensitivity of first term reenlistment to
fluctuations in economic conditions at the time of reenlistment and about the time of enlist-
ment, considering the effect of the latter on reenlistment behavior four years later. The princi-
pal finding of that study was that unemployment rates, both at the time of reenlistment and
about the time of enlistment four years earlier, were powerful predictors of reenlistment rates.
By comparison, measures of private sector versus military wages entered in the same equations
were generally found to be insignificant or, at best, relatively unimportant. That study did not,
however, take into account the influence of reenlistment bonuses which this follow-up note
addresses.
This note describes the results of regression equations, replicating those which were the
basis of the original Cohen-Reedy paper, which include reenlistment bonus variables to con-
sider their influence upon Navy reenlistment over the ten year period, 1968-1977.
Reenlistment rates were compiled from Navy Military Personnel Statistics ("The Green
Book"), quarterly by rating, separately for E-4's and E-5's. To help minimize spurious fluctua-
tions in the data, reenlistment rates were calculated only for those quarters which had an aver-
age of at least 10 eligibles per month. In addition, due to definitional and mensurational incon-
sistencies, ratings which include nuclear power and diver NEC's were eliminated and other
*This research was supported by the Office of the Chief of Naval Operations, Systems Analysis Division, under a con-
tract with Information Spectrum, Inc., Arlington, Virginia.
525
526 L COHEN AND D. REEDY
ratings which include 6 year obligors (6YO's) were analyzed separately. The resultant data base
consisted of 3110 observations for 4YO ratings, and 787 observations for 6YO ratings. Each
observation referred to a specific quarter, rating and pay grade, either E-4 or E-5.
Four multiple linear regression equations were estimated: one for 4YO ratings (including
E-4's and E-5's); one for 6YO ratings (including E-4's and E-5's); one for 4YO E-4's; and one
for 4YO E-5's. No attempt was made to estimate separate equations for each major occupa-
tional category as was done in the previous study. Given observed variations in earlier equa-
tions, collective treatment of ratings has probably resulted in depressed R2 statistics.
The dependent variable, RATE3, is the percentage deviation of the current quarter reen-
listment rate from the mean reenlistment rate for that rating and pay rate over the 10 years
under study, 1968-1977.
RATF3 = (Quartei"ly Reenlistment Rate - Mean (10 Year) Reenlistment Rate)
Mean (10 Year) Reenlistment Rate
This specification of the dependent variable was adopted to contend with wide variations in the
level of reenlistment rates from rating to rating. RATE3 describes relative changes in reenlist-
ment rates.
Independent variables included in the equations are listed and defined in Table 1.
TABLE 1 — Independent Variables
AUR current national unemployment rate
ARAUR average rate of change in unemployment (AUR) over the
past 6 quarters preceeding the reenlistment decision
AUR13 unemployment (AUR) 13 quarters prior to the reenlistment
decision (NOTE: Virtually uncorrelated with AUR.)
RW the ratio of military basic pay to private sector earnings
AWARD bonus award multiple
LS dummy variable indicating lump sum payment of bonuses
(LS = 1 for 1968 - 1974; LS = 0 for 1975 - 1977)
ELIG number of individuals eligible for reenlistment
PAYRATE dummy variable indicating rate
(PAYRATE = 1 for E - 5's; PAYRATE = 0 for E - 4's)
DRAFT number of persons drafted (all services) 18 quarters prior
to reenlistment decision
WAR dummy variable for Viet Nam War
(WAR = 1 for 1968 - 1972; WAR = 0 for 1973 - 1977)
QTR3 third quarter seasonal dummy
(QTR3 = 1 for 3rd calendar quarter only)
TIME time variable (TIME = Year - 67)
NOTE ON FIRST TERM REENLISTMENT
527
In the context of cross-sectional analysis, estimated coefficients do not pertain to the
impact of a given variable over time for a specific rating, but represent the typical impact of that
variable over the entire 10 years across all ratings which were included in the study.
Results of the estimation procedures are summarized in Table 2.
TABLE 2 — Reenlistment Equations: Coefficients, (t-statistics), and Means
EQUATION
4Y0
6YO
4Y0/E-4
4YO/E-5
INDEPENDENT
Coef.
Coef.
Coef.
Coef.
VARIABLE
(i)
Mean
(i)
Mean
(/)
Mean
(/)
Mean
AUR
14.52
(4.96)
.06
18.46
(5.07)
.06
15.77
(3.56)
.06
12.49
(3.41)
.06
ARAUR
-.84
(1.61)
.02
-1.56
(2.34)
.02
-.70
( .88)
.02
-1.38
(2.12)
.02
AUR13
29.38
(12.44)
.04
25.13
(8.00)
.05
24.21
(6.75)
.04
36.87
(12.50)
.04
RW
-.63
(1.12)
.77
.48
( .64)
.77
.94
(1.03)
.72
-.68
(1.03)
.82
AWARD
.03
(4.06)
1.15
.0004
( .05)
2.31
.03
(3.47)
1.15
.02
(2.43)
1.15
LS
.45
(6.32)
.76
.57
(5.81)
.67
.31
(2.90)
.76
.50
(5.62)
.76
ELIG
- 08E-2
(8.74)
103.76
- 03E-2
(3.83)
165.82
- .07E-2
(5.69)
123.71
- .001
(6.92)
83.74
PAYRATE
.03
( .45)
.50
- .05
( .66)
.50
DRAFT
- .04E-4
(6.69)
5.18E+4
- .04E-4
(4.82)
4.77E+4
- .04E-4
(5.03)
5.18E+4
- .04E-4
(5.51)
5.17E + 4
WAR
.26
(4.10)
.57
.36
(4.46)
.50
.21
(2.12)
.57
.37
(4.61)
.57
QTR3
- .09
(3.60)
.26
-.08
(2.46)
.26
- .08
(2.17)
.26
-.07
(2.32)
.26
TIME
.08
(4.49)
5.05
.08
(3.86)
5.63
.08
(2.94)
5.05
.06
(2.85)
5.06
CONSTANT
-2.26
(5.27)
- 3.03
(5.85)
- 3.10
(4.47)
- 2.39
(4.23)
R2
.34
.48
.40
.35
OBSERVATIONS
3110
787
1556
1554
The three unemployment variables, AUR, ARAUR and AUR13, were specified precisely
as in the earlier Cohen-Reedy study. Consistent with those results, the significance of the
unemployment rate variables and the magnitude of their apparent effect upon reenlistment are
striking. Taken literally, coefficients in the 4YO equation, for example, show a one point
increase in AUR 13 ( + .01) indicating a 29 point ( + .29) increase in RATE3. While it is real-
ized that these coefficients may overstate the real influence of unemployment, their equations,
like those which they are replicating, do indicate that reenlistment decisions may in fact be sen-
sitive to perceived costs of employment search and to the security of private sector employ-
ment.
The first compensation variable, RW, representing the ratio of military to private sector
wages, was calculated separately for E-4's and E-5's using basic pay for E-5's and E-6's respec-
tively as proxies for next-term earnings. RW was not a significant variable in any of the four
equations.
528
L COHEN AND D REEDY
The other two compensation variables, AWARD and LS, relate to bonuses. AWARD is
the multiple for a particular rating in a given quarter, ranging from 0 to 6. This multiple is the
factor which the Navy applies against an individual's monthly pay to compute the dollar amount
of his bonus payment. AWARD was significant in all three 4YO equations. LS is a dummy
variable which assumes a value of 1 through calendar 1974 during the period when lump sum
awards were paid to approximately 50% of those individuals who reenlisted. Beginning January
1, 1975, a new policy was initiated which reduced the percentage of lump sum bonus payments
to approximately 10% of those reenlisting. The coefficient of LS indicates that when bonuses
were paid in lump sums, the percentage difference between actual reenlistment rates and mean
(10 year) reenlistment rates was higher by .45 than when bonuses were paid in installments.
The variable ELIG was included in the equations simply to capture the observed relation-
ship between low numbers of eligibles and high reenlistment rates.
PAYRATE is a dummy variable which distinguishes between E-4's and E-5's (PAYRATE
= 1). TIME was included to capture the influence of factors which have changed steadily over
time such as the quality of life improvements effected by the Navy over the past several years.
These equations support the authors' earlier findings, notably that unemployment rates at
the time of the reenlistment decision and shortly after enlistment are important determinants of
reenlistment rates. Relative wages continue to appear unimportant. It appears, however, that
reenlistment bonuses have had a significant positive effect on reenlistment, particularly when
those bonuses have been awarded in lump sum payments.
Although by no means conclusive, the equations summarized in Table 2 suggest the fol-
lowing management initiatives:
— Experimentation is warranted in the use of lump sum bonuses to mitigate the effects
of low unemployment rates on reenlistment.
— Opportunities to reenlist might be timed to coincide with low points (periods of high
unemployment) in the business cycle.
— AUR13 and predicted AUR should be used to augment current information used for
projecting reenlistment rates.
— Based on the continued performance of the AUR13 variable, serious consideration
must be given to implementing new programs designed to effect enlistee career decision
making very early during the first term of service.
REFERENCES
[1] Cohen, L. and D. Reedy, "The Sensitivity of Navy First Term Reenlistment Rates to
Changes in Unemployment and Relative Wages," Naval Research Logistics Quarterly, 26,
695-709 (1979).
U.S. GOVERNMENT PRINTING OFFICE: 1980-311-493/3
INFORMATION FOR CONTRIBUTORS
The NAVAL RESEARCH LOGISTICS QUARTERLY is devoted to the dissemination of
scientific information in logistics and will publish research and expository papers, including those
in certain areas of mathematics, statistics, and economics, relevant to the over-all effort to improve
the efficiency and effectiveness of logistics operations.
Manuscripts and other items for publication should be sent to The Managing Editor, NAVAL
RESEARCH LOGISTICS QUARTERLY, Office of Naval Research, Arlington, Va. 22217.
Each manuscript which is considered to be suitable material tor the QUARTERLY is sent to one
or more referees.
Manuscripts submitted for publication should be typewritten, double-spaced, and the author
should retain a copy. Refereeing may be expedited if an extra copy of the manuscript is submitted
with the original.
A short abstract (not over 400 words) should accompany each manuscript. This will appear
at the head of the published paper in the QUARTERLY.
There is no authorization for compensation to authors for papers which have been accepted
for publication. Authors will receive 250 reprints of their published papers.
Readers are invited to submit to the Managing Editor items of general interest in the field
of logistics, for possible publication in the NEWS AND MEMORANDA or NOTES sections
of the QUARTERLY
NAVAL RESEARCH
LOGISTICS
QUARTERLY
SEPTEMBER
VOL. 27, N(
NAVSO P-127
CONTENTS
ARTICLES
On the Reliability, Availability and Bayes Confidence
Intervals for Multicomponent Systems
Optimal Replacement of Parts Having Observable
Correlated Stages of Deterioration
Statistical Analysis of a Conventional Fuze Timer
The Asymptotic Sufficiency of Sparse Order Statistics
in Tests of Fit with Nuisance Parameters
On a Class of Nash-Solvable Bimatrix Games
and Some Related Nash Subsets
Optimality Conditions for Convex Semi-Infinite
Programming Problems
W. E. THOMPSON
R. D. HAYNES
L. SHAW
S. G. TYAN
C-L. HSU
E. A. COHEN, JR.
L. WEISS
K. ISAACSON
C. B. MILLHAM
A. BEN-TAL
L. KERZNER
S. ZLOBEC
Solving Incremental Quantity Discounted Transportation P. G. MCKEOWN
Problems by Vertex Ranking
Auxiliary Procedures for Solving Long
Transportation Problems
On the Generation of Deep Disjunctive Cutting Planes
J. INTRATOR
M. BERREBI
The Role of Internal Storage Capacity in Fixed
Cycle Production Systems
Scheduling Coupled Tasks
Sequencing Independent Jobs With a
Single Resource
Evaluation of Force Structures Under Uncertainty
A Note on the Optimal Replacement Time
of Damaged Devices
A Note on the Sensitivity of Navy First Term
Reenlistment to Bonuses, Unemployment
and Relative Wages
H. D. SHERALI
C. M. SHETTY
B. LEV
D. I. TOOF
R. D. SHAPIRO
K. R. BAKER
H. L. W. NUTTLE
C. R. JOHNSON
E. P. LOANE
D. ZUCKERMAN
L. COHEN
D. E. REEDY
OFFICE OF NAVAL RESEARCH
Arlington, Va. 22217