
Full text of "Naval research logistics quarterly"



VOL. 27, NO. 3 


NAVSO P-1278 



Marvin Denicoff, Office of Naval Research, Chairman Ex Officio Members 

Murray A. Geisler, Logistics Management Institute 
W. H. Marlow, The George Washington University 

Thomas C. Varley, Office of Naval Research 
Program Director 

Seymour M. Selig, Office of Naval Research 
Managing Editor 


Seymour M. Selig 

Office of Naval Research 

Arlington, Virginia 22217 


Frank M. Bass, Purdue University 

Jack Borsting, Naval Postgraduate School 

Leon Cooper, Southern Methodist University 

Eric Denardo, Yale University 

Marco Fiorello, Logistics Management Institute 

Saul I. Gass, University of Maryland 

Neal D. Glassman, Office of Naval Research 

Paul Gray, Southern Methodist University 

Carl M. Harris, Center for Management and 

Policy Research 
Arnoldo Hax, Massachusetts Institute of Technology 
Alan J. Hoffman, IBM Corporation 
Uday S. Karmarkar, University of Chicago 
Paul R. Kleindorfer, University of Pennsylvania 
Darwin Klingman, University of Texas, Austin 

Kenneth O. Kortanek, Carnegie-Mellon University 
Charles Kriebel, Carnegie-Mellon University 
Jack Laderman, Bronx, New York 
Gerald J. Lieberman, Stanford University 
Clifford Marshall, Polytechnic Institute of New York 
John A. Muckstadt, Cornell University 
William P. Pierskalla, University of Pennsylvania 
Thomas L. Saaty, University of Pittsburgh 
Henry Solomon, The George Washington University 
Wlodzimierz Szwarc, University of Wisconsin, Milwaukee
James G. Taylor, Naval Postgraduate School
Harvey M. Wagner, The University of North Carolina
John W. Wingate, Naval Surface Weapons Center, White Oak
Shelemyahu Zacks, Virginia Polytechnic Institute and 
State University 

The Naval Research Logistics Quarterly is devoted to the dissemination of scientific information in logistics and
will publish research and expository papers, including those in certain areas of mathematics, statistics, and economics,
relevant to the over-all effort to improve the efficiency and effectiveness of logistics operations.

Information for Contributors is indicated on inside back cover. 

The Naval Research Logistics Quarterly is published by the Office of Naval Research in the months of March, June,
September, and December and can be purchased from the Superintendent of Documents, U.S. Government Printing
Office, Washington, D.C. 20402. Subscription Price: $11.15 a year in the U.S. and Canada, $13.95 elsewhere. Cost of
individual issues may be obtained from the Superintendent of Documents.

The views and opinions expressed in this Journal are those of the authors and not necessarily those of the Office
of Naval Research.

Issuance of this periodical approved in accordance with Department of the Navy Publications and Printing Regulations,
P-35 (Revised 1-74).


William E. Thompson 

Columbia Research Corporation 
Arlington, Virginia 

Robert D. Haynes 

ARINC Research Corporation 
Annapolis, Maryland 


The problem of computing reliability and availability and their associated 
confidence limits for multi-component systems has appeared often in the litera- 
ture. This problem arises where some or all of the component reliabilities and 
availabilities are statistical estimates (random variables) from test and other 
data. The problem of computing confidence limits has generally been con- 
sidered difficult and treated only on a case-by-case basis. This paper deals with 
Bayes confidence limits on reliability and availability for a more general class of 
systems than previously considered including, as special cases, series-parallel 
and standby systems applications. The posterior distributions obtained are ex- 
act in theory and their numerical evaluation is limited only by computing 
resources, data representation and round-off in calculations. This paper collects 
and generalizes previous results of the authors and others. 

The methods presented in this paper apply both to reliability and availability 
analysis. The conceptual development requires only that system reliability or 
availability be probabilities defined in terms acceptable for a particular applica- 
tion. The emphasis is on Bayes Analysis and the determination of the posterior 
distribution functions. Having these, the calculation of point estimates and 
confidence limits is routine. 

This paper includes several examples of estimating system reliability and 
confidence limits based on observed component test data. Also included is an 
example of the numerical procedure for computing Bayes confidence limits for 
the reliability of a system consisting of N failure independent components con-
nected in series. Both an exact and a new approximate numerical procedure for
computing point and interval estimates of reliability are presented. A compari- 
son is made of the results obtained from the two procedures. It is shown that 
the approximation is entirely sufficient for most reliability engineering analysis. 


The problem of computing reliability, availability, and confidence limits for multicom- 
ponent systems where some or all of the component reliabilities and availabilities are statistical 
estimates from test and other data has appeared often in the literature. The problem of com- 
puting these confidence limits has generally been considered difficult and treated only on a case 
by case basis. The present paper deals with Bayes confidence limits on reliability and steady 
state availability for a general class of fixed mission time, two-state systems including, as special 
cases, series-parallel, stand-by and others that appear in the applications. Further, a fixed mis-
sion length is assumed. It is also assumed that neither reliability growth nor deterioration occurs
during the life of the system and that the system becomes as good as new after each repair. Finally,
we assume that no environmental changes that could affect reliability occur. The posterior



distributions obtained are exact in theory and their numerical evaluation is limited only by com- 
puting resources, data representation and round-off in calculation. The present paper collects 
and generalizes previous results of the authors and others. 

The methods obtained in the following apply both to reliability and to steady state availability
analysis. To avoid repeated reference to "reliability or availability", the discussion refers
only to reliability, with the understanding that the terms system reliability $R$ and component
reliability $r_i$ can be replaced by system availability $A$ and component availability $a_i$. The
conceptual development requires only that $R$ and $A$ be probabilities defined in terms acceptable
for a particular application. The emphasis is on the determination of the posterior distribution
functions. Having these, the calculation of point estimates and confidence intervals is routine.


In the Bayes inference model, the unknown probability $R$, $0 \le R \le 1$, is considered a
random variable whose posterior density $f(R)$ is the result of combining prior information with
test data. If the posterior density of $R$ is spread out, there is relatively more uncertainty in
the value of $R$ than when the posterior density is concentrated closely about some particular
value. The posterior density function provides the most complete form of information about $R$,
but sometimes summary information is desired. A point estimate is one such form of summary
information. It can be selected in various ways and is analogous to the familiar statistical
problem of characterizing an entire population by some parameter value; examples are the mean,
mode, and median. A point estimate has the disadvantage of ignoring the information concerning
the uncertainty in the unknown reliability. Confidence intervals derived from $f(R)$ provide such
additional information. The true but unknown (and unknowable except with infinite data)
reliability $R_0$ is some specific value of the random variable $R$, $0 \le R \le 1$.
Conceptually, $R_0$ can be considered a random sample from $0 \le R \le 1$ made when the system
was built. We can never know what $R_0$ is, but $f(R)$ gives a measure of the likelihood that
$R_0 = R$ for each $0 \le R \le 1$. If

$$F(R) = \int_0^R f(R)\,dR$$

denotes the distribution function of $R$, then

$$\mathrm{Prob}\{R_1 \le R \le R_2\} = F(R_2) - F(R_1)$$

and $[R_1, R_2]$ is an interval estimate of $R$ of confidence $c = F(R_2) - F(R_1)$. The
interpretation is simply that, based on the prior and current data, the probability is $c$ that
the unknown system reliability lies between $R_1$ and $R_2$. The interval $[R_1, R_2]$ has been
called [25] a Bayes $c$ level confidence interval. For $R_2 = 1$, $R_1$ is called the lower $c$
level confidence limit. For $R_1 = 0$, $R_2$ is called the upper $c$ level confidence limit.
Given $f(R)$ and $F(R)$, Bayes confidence limits for any $c$ can be obtained by graphical or
numerical methods, and the procedure is generally not difficult. Numerical examples and
discussion of numerical methods are given in [25,27,8,26,28,29].
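As a rough illustration of the numerical route mentioned above, the following Python sketch finds a lower $c$ level confidence limit by bisection on a distribution function $F(R)$. The function name `lower_confidence_limit` and the example posterior $f(R) = 21R^{20}$ (so $F(R) = R^{21}$) are assumptions chosen for illustration; they are not taken from the paper's references.

```python
def lower_confidence_limit(cdf, c, tol=1e-10):
    """Return R1 such that Prob{R1 <= R <= 1} = 1 - F(R1) = c, by bisection.

    `cdf` is assumed to be a nondecreasing distribution function on [0, 1].
    """
    lo, hi = 0.0, 1.0
    target = 1.0 - c
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Illustrative posterior (an assumption): f(R) = 21 R**20, so F(R) = R**21.
F = lambda R: R ** 21

R1 = lower_confidence_limit(F, 0.90)   # lower 90 percent Bayes confidence limit
interval_conf = F(0.99) - F(0.90)      # confidence of the interval [0.90, 0.99]
```

For this particular $F$ the limit is also available in closed form, $R_1 = (1 - c)^{1/21}$, which gives a simple check on the bisection.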


To establish the relationship between the reliabilities of the components of a system and
the reliability of the entire system, the way in which performance and failure of the
components affect performance and failure of the system must be specified. For this purpose, as
in [5,10,15], the state of any component is coded 1 when it performs and 0 when it fails. The
state of all $N$ components of the system can then be coded by a vector of $N$ coordinates

$$x = (x_1, x_2, \ldots, x_N)$$

where $x_i = 0$ means the $i$-th component fails and $x_i = 1$ means that it does not fail. All
possible states of the system are represented by the $2^N$ different values this vector can assume.

Where an explicit mission time dependence is required, a random process
$y(t) = \{y_1(t), \ldots, y_N(t)\}$ can be defined as in [15] so that to each component trajectory
a measure $x_i$ is assigned. Then, for example, $x_i = 1$ if $y_i(t)$ is a failure-free process
over some interval $0 \le T_{i1} \le t \le T_{i2}$, and $x_i = 0$ if at least one failure occurs.

Some of the $2^N$ states cause the system to fail and the others cause the system to perform.
The response of the system as a whole is written as a function $\phi(x)$ of $x$ such that
$\phi(x) = 0$ when the system is failed in state $x$, and $\phi(x) = 1$ when the system performs
in state $x$. This function $\phi(x)$ is known in the literature [5,10,23] and has been called a
structure function of order $N$.

The structure function can be written in a systematic way for any series-parallel system.
When the system is not too large, the structure function can also be written by inspection for
many more general systems. The structure function can always be written for a system of $N$
components by enumeration of its $2^N$ states. For large systems this is at best very tedious, but
generally shortcuts can be found which simplify the process. The structure function is
convenient for conceptual development of the theory and provides a very general notation, which is
why it is used here. What is required in the application of the present results is the formula for
system reliability in terms of component reliabilities, as is done in [25,27,8]. The structure
function provides this formula in a general form, but other methods are available. Some of
these methods are identified and referenced in [17], along with a new and useful algorithm
based on graph theory.


Assume that the components of the system are failure independent so that the elements
of the state vector $x = (x_1, \ldots, x_N)$ are independent random variables with probability
distributions

$$\Pr\{x_i = 1\} = r_i, \qquad \Pr\{x_i = 0\} = 1 - r_i$$

where $r_i$ is the reliability of the $i$-th component.

The structure function $\phi(x)$ is also a random variable with

$$\Pr\{\phi(x) = 1\} = R, \qquad \Pr\{\phi(x) = 0\} = 1 - R$$

where $R$ is the reliability of the system. $R$ is the expected value of $\phi(x)$ so that

$$R = E\{\phi(x)\} = \sum_{x} \phi(x)\, r_1^{x_1}(1 - r_1)^{1 - x_1} \cdots r_N^{x_N}(1 - r_N)^{1 - x_N} \tag{1}$$

where the summation is over all $2^N$ states of the system.

In a particular application, given the structure function and the values of all component
reliabilities, the system reliability, $R$, can be computed explicitly using (1). References [5],
[10], and [23] provide further discussion with examples of $\phi$ and $R$.
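The enumeration in (1) can be sketched directly in Python. This is a minimal illustration that assumes a small $N$, since it visits all $2^N$ states; the function name `system_reliability` and the three example structure functions are hypothetical and chosen only to show the mechanics.

```python
from itertools import product


def system_reliability(phi, r):
    """Evaluate equation (1): sum over all 2**N states x of
    phi(x) * prod_i r_i**x_i * (1 - r_i)**(1 - x_i)."""
    total = 0.0
    for x in product((0, 1), repeat=len(r)):
        if phi(x):
            prob = 1.0
            for xi, ri in zip(x, r):
                prob *= ri if xi == 1 else 1.0 - ri
            total += prob
    return total


# Example structure functions (illustrative assumptions).
series = lambda x: int(all(x))            # performs only if every component performs
parallel = lambda x: int(any(x))          # performs if at least one component performs
two_of_three = lambda x: int(sum(x) >= 2)

R_series = system_reliability(series, (0.9, 0.8))
R_parallel = system_reliability(parallel, (0.9, 0.8))
R_voter = system_reliability(two_of_three, (0.9, 0.9, 0.9))
```

Because the structure function is passed in as an ordinary callable, the same routine also handles structures that are not series-parallel, at the cost of exponential work in $N$.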



In many applications the system structure is known but some or all of the component reli- 
abilities are unknown and must be estimated from tests and other data. As a result, statements 
concerning these component and system reliabilities are subject to the uncertainties of statistical 
estimation. A method of treating this uncertainty is provided by a Bayes analysis which consid- 
ers the unknown component reliabilities as random variables and leads to Bayes confidence 
intervals for both component and system reliabilities. The following is an extension and gen- 
eralization of previous analysis of this kind [25,27,8,7,14,23,29]. 


Assume a system of $N$ failure independent components has a known structure function
$\phi(x)$ and reliability function $R(r)$, $r = (r_1, \ldots, r_N)$, of the form (1). Suppose that
among the $N$ separate components of the system some are known to have identical reliabilities,
say components $i$, $j$, and $k$. Then, since $r_i = r_j = r_k$, the symbols $r_j$ and $r_k$ can be
replaced by $r_i$ everywhere in (1). In this way there remain only $N' \le N$ different $r$'s, one
for each reliability value. In addition, suppose that among the $N'$ different component
reliabilities $N' - n$ are known constants, so that there remain $n$ different types of components
with unknown reliabilities. By a simple change in notation these $n$ different, unknown
reliabilities are denoted by $p = (p_1, p_2, \ldots, p_n)$.

By multiplying out factors $(1 - p_i)$ and collecting terms, the system reliability (1) can then
be written in the equivalent form

$$R(p) = \sum_{j} a_{0j}\, p_1^{a_{1j}} \cdots p_n^{a_{nj}} \tag{2}$$

where the constants $a_{ij}$ are integers for $i \ge 0$.

Using a Bayes inference model, the unknown $p_i$ are considered independent random variables
with known posterior density functions

$$f_i(p_i), \quad 0 \le p_i \le 1, \quad i = 1, \ldots, n.$$

The system reliability, $R(p)$, is then also a random variable, defined by (1), with unknown
distribution function $H(R)$.

In applications, what is required is the calculation of $H(R)$ given the $f_i(p_i)$,
$i = 1, 2, \ldots, n$. Having obtained $H(R)$, point estimates and confidence intervals on $R$ can be
obtained directly. This result is also required for risk, cost and other analyses based on the
Bayes model. The method for an explicit numerical evaluation is presented in the following.


The proposed method of evaluating the posterior distribution function H(R) is based on 
an expansion of H(R) in Chebyshev polynomials of the second kind [1,16]. The main advan- 
tages of this method lie in the rapid convergence properties of the Chebyshev expansion and 
the convenient numerical computation for its evaluation. Although a description of the pro- 
cedure has been presented in [8] and [7], for the sake of completeness, we shall outline the 
main steps below. 


Expansion by Chebyshev Polynomials 

Let $H(R)$ denote the posterior distribution,

$$H(R) = \int_0^R h(R)\,dR, \quad 0 \le R \le 1,$$

where $h(R)$ is the posterior density of the reliability of the overall system. By definition,
$H(R)$ satisfies the boundary conditions:

$$H(0) = 0; \quad H(1) = 1 \tag{3}$$

Let us introduce a new function $Q(R)$ defined by

$$Q(R) = H(R) - R \tag{4}$$

Then $Q(R)$ satisfies the boundary conditions

$$Q(0) = Q(1) = 0 \tag{5}$$

and can be expanded in a Fourier sine series of the following form:

$$Q(R) = \frac{4}{\pi}\sin\theta\left[b_0 + b_1\frac{\sin 2\theta}{\sin\theta} + \cdots + b_k\frac{\sin (k+1)\theta}{\sin\theta} + \cdots\right] \tag{6}$$

where the angular variable $\theta$ is related to $R$ by the relation

$$R = \cos^2\frac{\theta}{2}. \tag{7}$$

The coefficients $b_k$ of the expansion (6) can be determined by:

$$b_k = \int_0^1 [H(R) - R]\, U_k^*(R)\,dR \tag{8}$$

where $U_k^*(R) = \dfrac{\sin (k+1)\theta}{\sin\theta}$ is the shifted Chebyshev polynomial of the second kind [1,16],
which can be computed by the recursion relations:

$$U_{k+1}^*(R) = (4R - 2)\,U_k^*(R) - U_{k-1}^*(R) \tag{9}$$

$$U_0^*(R) = 1, \qquad U_1^*(R) = -2 + 4R, \qquad U_2^*(R) = 3 - 16R + 16R^2.$$

If we express $U_k^*(R)$ explicitly as a $k$th order polynomial

$$U_k^*(R) = \sum_{i=0}^{k} C_{ik} R^i \tag{10}$$

then Equation (8) becomes

$$b_k = \sum_{i=0}^{k} C_{ik}\left[\int_0^1 R^i H(R)\,dR - \int_0^1 R^{i+1}\,dR\right]. \tag{11}$$

Writing $M_i\{g\} = \int_0^1 R^i g(R)\,dR$ for the $i$-th moment of a function $g$, it can be shown,
integrating by parts, that

$$M_i\{H(R)\} = \frac{1}{i+1}\,\{1 - M_{i+1}[h(R)]\}. \tag{12}$$

Thus, Equation (11) becomes

$$b_k = \sum_{i=0}^{k} C_{ik}\left[\frac{1 - M_{i+1}[h(R)]}{i+1} - \frac{1}{i+2}\right]. \tag{13}$$

Note that the Chebyshev coefficients $C_{ik}$ can be computed independently of the moments.
They may be stored in the form of a triangular matrix if sufficient storage space is available. A
simple algorithm for recursively calculating the coefficients is
$C_{i,k+1} = 4C_{i-1,k} - 2C_{i,k} - C_{i,k-1}$.

Computations and Results 

To complete the analysis it remains to compute the moments of $h(R)$ given the density
functions $f_i(p_i)$ and then use (13) to compute the $b_k$.

From (2), $R^k(p)$, $k = 1, 2, \ldots$, can be written as a finite sum

$$R^k(p) = \sum_{j} a_{0jk}\, p_1^{a_{1jk}} \cdots p_n^{a_{njk}} \tag{14}$$

where the $a_{ijk}$ are independent of the $p_i$ and also integers for $i \ge 0$. Using this result and the
fact that the expected value of a sum is the sum of the expected values and the expected value
of a product of independent random variables is the product of the expected values, it follows that

$$M_k(h) = \sum_{j} a_{0jk}\, M_{a_{1jk}} \cdots M_{a_{njk}} \tag{15}$$

where $M_{a_{ijk}}$ denotes the $a_{ijk}$-th moment of $p_i$.

Having determined the coefficients $b_k$ we can write down the final expression for $H(R)$
from Equations (4) and (6) as follows:

$$H(R) = R + \frac{8}{\pi}\sqrt{R(1-R)}\,\{b_0 + b_1 U_1^*(R) + \cdots + b_k U_k^*(R) + \cdots\}. \tag{16}$$

This result is exact in the sense that the error can be made arbitrarily small by taking a
sufficient number of terms. References [8] and [7] give a discussion of numerical considerations
and examples. Generally, (16) has been found very convenient for numerical calculation
using an electronic digital computer.
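The chain from moments to distribution function can be sketched in Python as follows. This is a minimal illustration, not the authors' program: it assumes the moments $M_j[h]$ are already available as exact fractions, builds the triangular table of $C_{ik}$ from the coefficient recursion, forms the $b_k$ of (13), and evaluates a truncated version of (16). The function names and the test density $h(R) = 2R$ (for which $H(R) = R^2$ and $M_j = 2/(j+2)$) are assumptions made for the example.

```python
from fractions import Fraction
import math


def chebyshev_coefficients(K):
    """rows[k][i] is the coefficient of R**i in U*_k(R), k = 0..K, using
    C[i][k+1] = 4*C[i-1][k] - 2*C[i][k] - C[i][k-1]."""
    rows = [[Fraction(1)], [Fraction(-2), Fraction(4)]]
    for k in range(1, K):
        cur, prev = rows[k], rows[k - 1]
        nxt = []
        for i in range(k + 2):
            c_im1 = cur[i - 1] if i >= 1 else 0
            c_i = cur[i] if i < len(cur) else 0
            p_i = prev[i] if i < len(prev) else 0
            nxt.append(4 * c_im1 - 2 * c_i - p_i)
        rows.append(nxt)
    return rows[:K + 1]


def expansion_coefficients(moments, K):
    """b_0..b_K from (13); moments[j] must hold M_j[h] for j = 0..K+1."""
    C = chebyshev_coefficients(K)
    return [sum(C[k][i] * ((1 - moments[i + 1]) / (i + 1) - Fraction(1, i + 2))
                for i in range(k + 1))
            for k in range(K + 1)]


def posterior_cdf(R, b):
    """Evaluate (16) truncated after len(b) terms, using recursion (9)."""
    u_prev, u_cur = 0.0, 1.0   # U*_{-1} = 0 (convention), U*_0 = 1
    total = 0.0
    for bk in b:
        total += float(bk) * u_cur
        u_prev, u_cur = u_cur, (4.0 * R - 2.0) * u_cur - u_prev
    return R + (8.0 / math.pi) * math.sqrt(R * (1.0 - R)) * total


# Test density (an assumption): h(R) = 2R, so H(R) = R**2 and M_j = 2/(j + 2).
K = 12
moments = [Fraction(2, j + 2) for j in range(K + 2)]
b = expansion_coefficients(moments, K)
H_half = posterior_cdf(0.5, b)   # should be close to 0.25
```

Exact rational arithmetic is used for the $C_{ik}$ and $b_k$ because the polynomial coefficients grow quickly with $k$ and the sum in (13) involves heavy cancellation; floating point is used only in the final evaluation.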


To evaluate $H(R)$ the posterior distribution $f_i(p_i)$ for each different component reliability
$p_i$ is required. The derivation of these requires application of Bayes inference procedures on a
case by case basis. The theory can be found in [20,4,19,2,3,24,6,18] and some specific
applications in [25,27,8,7,14,1,16,12,23]. A tabulation for some familiar models of mathematical
reliability theory is presented in the following.

Component With Constant Failure Rate 

A single component has an unknown constant failure rate $\lambda$ and fixed mission time $t$.
Component reliability $p = \exp(-\lambda t)$ is regarded as a random variable. The natural conjugate
prior density function is

$$P(p) = C\,p^{b_0}\left(\ln\frac{1}{p}\right)^{r_0}$$

with parameters $b_0$ and $r_0$. The test data consist of $T$ operating hours after $r$ failures,

$$T = t_1 + t_2 + \cdots + t_r + (m - r)\,t_r.$$

Here $t_r$ is the time of the $r$-th failure among $m$ items initially on test. Failures are not
replaced and the test is terminated at the $r$-th failure. The resulting posterior density function
of $p$ is

$$f(p \mid a, b) = \frac{(b+1)^{a+1}}{\Gamma(a+1)}\,p^{b}\left(\ln\frac{1}{p}\right)^{a},$$

where $a = r + r_0$ and $b = T/t + b_0$. The $k$-th moment of $f(p)$ is

$$M_k\{f\} = (b+1)^{a+1}(k+b+1)^{-(a+1)}.$$

The above results are from Reference [25].

Component Having Fixed Probability of Success 

A single component has an unknown fixed probability of success, $p$. In testing, there
were observed $m$ successes in $n$ trials. For the natural conjugate Beta prior density function
with parameters $m_0$ and $n_0$ the posterior density function of $p$ is

$$f(p \mid a, b) = \frac{p^{a}(1-p)^{b}}{B(a+1,\, b+1)},$$

where $a = m + m_0$, $b = n + n_0 - a$ and $B(a+1, b+1) = \int_0^1 p^{a}(1-p)^{b}\,dp$.

The $k$-th moment of $f(p \mid a, b)$, $k = 0, 1, 2, \ldots$, is:

$$M_k\{f\} = \frac{(a+b+1)!\,(a+k)!}{a!\,(a+b+k+1)!} = \frac{\Gamma(a+b+2)\,\Gamma(a+k+1)}{\Gamma(a+1)\,\Gamma(a+b+k+2)}.$$

This result is from [26].

Steady State Availability of Component With Repair 

A two state component has exponential distributions of life and of repair times. The
durations of intervals of operation and repair define two different statistically independent
sequences of identically distributed, mutually independent random variables. Both the mean up
time, $1/\lambda$, and the mean repair time, $1/\mu$, are unknown parameters estimated from test and prior
data.

The long term availability of the component is a function of the random variables $\mu$ and $\lambda$:

$$a = \frac{\mu}{\lambda + \mu}.$$

Assuming gamma priors for $\lambda$ and $\mu$ with snapshot, life and repair time data, the posterior
density of availability $a$ is the Euler density function:

$$f(a \mid r, w, \delta) = \frac{(1-\delta)^{w}}{B(r, w)}\,\frac{a^{w-1}(1-a)^{r-1}}{(1-\delta a)^{r+w}}, \quad 0 \le a \le 1;\ r > 0;\ w > 0;\ |\delta| < 1.$$

The parameters $r$, $w$ and $\delta$ are determined by test data and prior information as defined in

The moments of $f(a)$ are given in [25] in terms of Gauss' hypergeometric function
${}_2F_1(w + r,\, w + k;\, w + r + k;\, \delta)$. (Note the typographical error in [25] where $k$ in ${}_2F_1$ is
replaced by $r$.)

A special case of this availability model treating only "snapshot" data is given in [28].
Snapshot data, defined in [25,28], record only the state of the system (up or down) at random
instants of time.


Components are often combined to form system elements which are special in some 
sense. For example, the same multicomponent element may appear several times as a unit in 
the same system. In this case, it may be convenient to treat the element as a single system 
component. Some simple multicomponent system elements are presented in the following: 

N Identical Components in Series 

The reliability, $p$, of $N$ identical components in series is $p = p_1^N$.

Component reliability $p_1$ is a random variable in the Bayes representation with known
posterior density, $f_1(p_1)$. The moments $M_{k,1}\{f_1\}$, $k = 0, 1, \ldots$, of $f_1(p_1)$ are then also known.
The moments $M_k\{f\}$ of the posterior density $f(p)$ of $p$ are related to the moments of $f_1$ by

$$M_k\{f\} = M_{Nk,1}\{f_1\}; \quad k = 0, 1, 2, \ldots.$$

Using this result one can write the moments of the posterior density of series combinations of
any of the special components treated in the previous section.

N Identical Redundant Components 

When only one component is required to operate in order that the system operates, the
reliability, $p$, of $N$ identical failure independent redundant components is
$p = 1 - (1 - p_1)^N$, where $p_1$ is the Bayes representation of the component reliability. It is
shown in [8] that the moments $M_k\{f\}$ of the posterior density $f(p)$ of $p$ are related to the
moments $M_{k,1}\{f_1\}$ of the posterior density $f_1(p_1)$ of $p_1$ by the relation

$$M_k\{f\} = \sum_{j=0}^{k} (-1)^j \binom{k}{j} \sum_{i=0}^{Nj} (-1)^i \binom{Nj}{i}\, M_{i,1}\{f_1\}.$$

By alternately applying this result and the previous one for components in series, the
moments of the posterior density of any series-parallel system of components can be obtained.
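The two moment relations for series and redundant groups can be sketched as small Python helpers. The redundant-group relation is written here as the direct binomial expansion of $[1 - (1 - p_1)^N]^k$ applied term by term, which may be arranged differently from the form printed in [8]. The names `series_moment` and `redundant_moment`, and the two example component posteriors, are assumptions for illustration.

```python
from fractions import Fraction
from math import comb


def series_moment(k, N, comp_moment):
    """k-th moment of p = p1**N: the (N*k)-th moment of p1."""
    return comp_moment(N * k)


def redundant_moment(k, N, comp_moment):
    """k-th moment of p = 1 - (1 - p1)**N, expanding
    (1 - (1 - p1)**N)**k and then (1 - p1)**(N*j) binomially."""
    total = 0
    for j in range(k + 1):
        inner = sum((-1) ** i * comb(N * j, i) * comp_moment(i)
                    for i in range(N * j + 1))
        total += (-1) ** j * comb(k, j) * inner
    return total


# Example component posteriors (illustrative assumptions):
point_mass = lambda j: Fraction(9, 10) ** j   # p1 known to equal 0.9 exactly
beta_20 = lambda j: Fraction(21, 21 + j)      # f1(p1) = 21 p1**20

m_series = series_moment(2, 3, point_mass)
m_redundant = redundant_moment(2, 3, point_mass)
m_beta = redundant_moment(1, 2, beta_20)
```

Passing the component moments as a callable lets the helpers be nested, for example a series of redundant groups, which is the alternating application described above.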

A "2 out of 3" Element 

An element consisting of three identical failure independent components, which operates
if any two or more of the components operate, is sometimes called a "2 out of 3 voter" [21].
The structure function of this element is

$$\phi(x_1, x_2, x_3) = \begin{cases} 1 & \text{if } x_1 + x_2 + x_3 \ge 2 \\ 0 & \text{if } x_1 + x_2 + x_3 < 2 \end{cases}$$

and the reliability $p$ is

$$p = 3p_1^2 - 2p_1^3$$

where $p_1$ is the component reliability. If the posterior density $f_1(p_1)$ of $p_1$ has moments
$M_{k,1}\{f_1\}$, then the moments $M_k\{f\}$ of the posterior density $f(p)$ of $p$ are:

$$M_k\{f\} = \sum_{j=0}^{k} \binom{k}{j}\, 3^{k-j} (-2)^{j}\, M_{2k+j,1}\{f_1\}.$$

This result follows using the fact that for $p = p_1^N$, $M_k\{f(p)\} = M_{Nk,1}\{f_1\}$, when applied
term by term to the expansion of

$$(3p_1^2 - 2p_1^3)^k.$$

Reference [21] gives the reliability function of the N-tuple Modular Redundant design
consisting of $N$ replicated units feeding an $(n + 1)$-out-of-$N$ voter. This case can also be treated
by the present methods.
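The "term by term" argument can be sketched generically: raise the reliability polynomial in $p_1$ to the $k$-th power and replace each power $p_1^i$ by the $i$-th component moment. The Python below is a minimal illustration under that reading; `poly_power`, `element_moment`, and the example posterior $f_1(p_1) = 21p_1^{20}$ are assumptions, and the closed-form sum for the voter is included only as a cross-check.

```python
from fractions import Fraction
from math import comb


def poly_power(coeffs, k):
    """Coefficients (lowest degree first) of (sum_i coeffs[i] * x**i)**k."""
    result = [1]
    for _ in range(k):
        new = [0] * (len(result) + len(coeffs) - 1)
        for i, ri in enumerate(result):
            for j, cj in enumerate(coeffs):
                new[i + j] += ri * cj
        result = new
    return result


def element_moment(k, coeffs, comp_moment):
    """k-th moment of p = sum_i coeffs[i] * p1**i, applied term by term."""
    return sum(c * comp_moment(i)
               for i, c in enumerate(poly_power(coeffs, k)) if c)


def two_of_three_moment(k, comp_moment):
    """Closed form for p = 3*p1**2 - 2*p1**3."""
    return sum(comb(k, j) * 3 ** (k - j) * (-2) ** j * comp_moment(2 * k + j)
               for j in range(k + 1))


# Example component posterior (an assumption): f1(p1) = 21 p1**20.
comp = lambda j: Fraction(21, 21 + j)
voter_poly = [0, 0, 3, -2]   # 3*p1**2 - 2*p1**3
first_moment = element_moment(1, voter_poly, comp)
```

The generic route costs more arithmetic than the closed form but applies unchanged to any element whose reliability is a polynomial in a single component reliability.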

Exactly L Out of N Element 

An element consisting of $N$ identical failure independent components which operates only
when exactly $L$ out of $N$ components operate is a rather unusual system. If $L + 1$ out of $N$
operate the system fails. Such a system is not a coherent structure in the sense of [5]. The
reliability $p$ of this element is given by

$$p = \binom{N}{L}\, p_1^{L} (1 - p_1)^{N-L}.$$

The moments of the posterior density $f(p)$ of the Bayes representation $p$ in terms of the
moments $M_{k,1}\{f_1\}$ of the posterior density $f_1(p_1)$ of the component reliability $p_1$ can be shown
to be

$$M_k\{f\} = \binom{N}{L}^{k} \sum_{j=0}^{k(N-L)} (-1)^j \binom{k(N-L)}{j}\, M_{j+kL,1}\{f_1\}.$$

This example serves to illustrate that the proposed evaluation is not restricted to coherent
systems.
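A short Python sketch of the moment relation for the exactly-$L$-out-of-$N$ element follows, obtained by expanding $(1 - p_1)^{k(N-L)}$ binomially. The function name and the degenerate example posterior (component reliability known to be $1/2$) are assumptions used only to make the check exact.

```python
from fractions import Fraction
from math import comb


def exactly_L_of_N_moment(k, L, N, comp_moment):
    """k-th moment of p = C(N, L) * p1**L * (1 - p1)**(N - L)."""
    n = k * (N - L)
    return comb(N, L) ** k * sum(
        (-1) ** j * comb(n, j) * comp_moment(j + k * L) for j in range(n + 1))


# Degenerate example posterior (an assumption): p1 equals 1/2 with certainty,
# so the element reliability is exactly 3 * (1/2) * (1/2)**2 = 3/8 for L=1, N=3.
half = lambda j: Fraction(1, 2) ** j
second_moment = exactly_L_of_N_moment(2, 1, 3, half)
```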


Section 9.4.4 of NAVORD OD 44622, Reference [22], presents a procedure for developing
the posterior beta distribution of system reliability for system level TECHEVAL/OPEVAL
testing. Reference [9] presents further discussion with an example. The observed system level
data are binomial, i.e., $r$ failures in $n$ trials. The system level natural conjugate prior is the beta
density. An exact prior for the system level tests is the posterior density function based on all
prior component tests and component priors, and it can be computed by the methods above. The
procedure recommended in OD 44622 is to approximate the exact system prior with a beta
density having the same first and second moments.

Equation (15) above provides a tractable tool for computing the required first and second
moments for extending the method to arbitrary system structures.

Let $M_1$ and $M_2$ denote the first and second moments computed as shown in this report
for the posterior density $f(R)$ of system reliability, $R$, based on prior component data. The
density $f(R)$ is considered the exact prior for determination of a new posterior density based on
binomial system level data. What are required for the approximation are the parameters $n'$ and $r'$
of the beta prior

$$g(R \mid n', r') = \frac{R^{n'-r'}(1-R)^{r'}}{B(n'-r'+1,\; r'+1)}$$

with the same first and second moments as $f(R)$. Having computed $M_1$ and $M_2$, the answer is
direct using formulas on page 9.23 of NAVORD OD 44622, i.e.,

$$n' = \frac{M_1(1 - M_1)}{M_2 - M_1^2} - 1,$$

$$r' = (1 - M_1)\,n'.$$

The gamma prior is treated in a similar way in the same reference.

The beta approximation can also be used directly as an approximation to the exact
posterior density function for complex systems based on component test data. The approximation
has been very good when compared with the exact result in examples treated by the authors.
The calculation is tractable for hand computation since only the first and second moments of
the exact posterior density function are required.

Numerical Example 

Consider a system consisting of five components, $A_i$ ($i = 1, \ldots, 5$), connected in series.
Components $A_1$, $A_2$, $A_3$, and $A_4$ have unknown fixed probabilities of success, $p_i$, and in testing
there were observed $m_i$ successes in $n_i$ trials. The fifth component, $A_5$, has an unknown
constant failure rate $\lambda$ and has mission time $t$. In testing, component $A_5$ failed $r$ times in $T$
operating hours. The following test data were observed:

$$n_1 = 20,\ m_1 = 18;\quad n_2 = 30,\ m_2 = 25;\quad n_3 = 20,\ m_3 = 20;\quad n_4 = 20,\ m_4 = 19;\quad T = 38,\ t = 6,\ r = 3.$$

The resulting posterior density functions are:

$$f_1(R_1) = 3990\, R_1^{18} (1 - R_1)^2$$

$$f_2(R_2) = 4417686\, R_2^{25} (1 - R_2)^5$$

$$f_3(R_3) = 21\, R_3^{20}$$

$$f_4(R_4) = 420\, R_4^{19} (1 - R_4)$$

$$f_5(R_5) = 482.00823\, R_5^{19/3} \left(\ln\frac{1}{R_5}\right)^3$$

We know [25,26] that the Mellin integral transform of the posterior density function
$h(R)$ for the system is the product of the Mellin integral transforms of the density functions of
the components. At this point we can determine $h(R)$ exactly by means of the inverse Mellin
integral transform, or we can approximate $h(R)$ with a Beta density function having the same
first and second moments as $h(R)$.

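The Mellin integral transforms of the density functions for the components of the system are

$$M[f_1(R_1) \mid S] = \frac{21!}{18!}\,\frac{\Gamma(S + 18)}{\Gamma(S + 21)}, \qquad M[f_4(R_4) \mid S] = \frac{21!}{19!}\,\frac{\Gamma(S + 19)}{\Gamma(S + 21)},$$

$$M[f_2(R_2) \mid S] = \frac{31!}{25!}\,\frac{\Gamma(S + 25)}{\Gamma(S + 31)},$$

$$M[f_3(R_3) \mid S] = \frac{21!}{20!}\,\frac{\Gamma(S + 20)}{\Gamma(S + 21)},$$

$$M[f_5(R_5) \mid S] = \frac{(22/3)^4}{(S + 19/3)^4}.$$

The Mellin integral transform of $h(R)$ is $M[h(R) \mid S] = \prod_{i=1}^{5} M[f_i(R_i) \mid S]$.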


From [26] we know that the Mellin inversion integral yields directly

$$h(R) = \frac{1}{2\pi i}\int_{c - i\infty}^{c + i\infty} R^{-S}\, M[h(R) \mid S]\,dS$$

where the path of integration is any line parallel to the imaginary axis and lying to the right of
the real part of $c$. If $b$ is greater than 1, the real part of $c$ is greater than $-p$, and $p$ is any
number, then [26]

$$\frac{1}{2\pi i}\int_{c - i\infty}^{c + i\infty} \frac{R^{-S}}{(S + p)^{b}}\,dS = \frac{R^{p}}{\Gamma(b)}\left(\ln\frac{1}{R}\right)^{b-1}.$$

To find $h(R)$ we simply write $M[h(R) \mid S]$ as the sum of its partial fractions [13] and
integrate each term using the above equation. Thus the exact posterior density function, $h(R)$,
for system reliability is

$$\begin{aligned}
h(R) ={} & 1094388844.948\, R^{18} + 30505643166.29\, R^{19} - 12601708553.76\, R^{19} \ln\frac{1}{R} \\
& - 31650550963.66\, R^{20} - 19915799047.82\, R^{20} \ln\frac{1}{R} - 5114357474.61\, R^{20} \left(\ln\frac{1}{R}\right)^{2} \\
& + 235122603.404\, R^{25} - 354959810.01\, R^{26} + 249501799.456\, R^{27} - 98389473.63\, R^{28} \\
& + 21240815.37\, R^{29} - 1974044.939\, R^{30} \\
& - 22937.221\, R^{19/3} + 78073.717\, R^{19/3} \ln\frac{1}{R} - 95839.296\, R^{19/3} \left(\ln\frac{1}{R}\right)^{2} + 42683.275\, R^{19/3} \left(\ln\frac{1}{R}\right)^{3}.
\end{aligned}$$

The exact distribution function, $H(R)$, is found by integrating the density function.


To obtain the approximate solutions for the system reliability density and distribution
functions, we recall that the first and second moments of $h(R)$ are given by $M[h(R) \mid 2]$ and
$M[h(R) \mid 3]$ respectively. The beta density function, which is used to approximate $h(R)$, is

$$\hat{h}(R) = \frac{R^{a}(1 - R)^{b}}{B(a + 1,\, b + 1)}$$

where $\hat{h}(R)$ denotes the approximate system density function, $B(a + 1, b + 1)$ is the
complete beta function, and $a$ and $b$ are the parameters of the beta function. The first moment of
$\hat{h}(R)$ is

$$\frac{a + 1}{a + b + 2}$$

and the second moment is

$$\frac{(a + 1)(a + 2)}{(a + b + 2)(a + b + 3)}.$$

We require that the first and second moments of $h(R)$ and $\hat{h}(R)$ be equal. Thus we have

$$M[h(R) \mid 2] = \frac{a + 1}{a + b + 2},$$

$$M[h(R) \mid 3] = \frac{(a + 1)(a + 2)}{(a + b + 2)(a + b + 3)}.$$

Solving simultaneously for $a$ and $b$ yields the parameters for the beta density function. Thus
we have $a = 6.43596$ and $b = 11.92734$. Therefore we can now write $\hat{h}(R)$, the approximate
density function for system reliability:

$$\hat{h}(R) = \frac{R^{6.43596}(1 - R)^{11.92734}}{B(7.43596,\; 12.92734)}.$$

To determine the approximate distribution function, $\hat{H}(R)$, for system reliability we simply
integrate $\hat{h}(R)$.
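The moment matching above can be retraced with a short Python sketch. It assumes that the priors contribute nothing to the posterior parameters (that is, $m_0 = n_0 = 0$ and $b_0 = r_0 = 0$), which is consistent with the posterior densities listed for this example; the function names are hypothetical. The system moments are products of the component moments because the components are in series and independent.

```python
from fractions import Fraction


def binomial_moment(k, m, n):
    """k-th moment of p**a (1-p)**b / B(a+1, b+1) with a = m, b = n - m."""
    a, b = m, n - m
    out = Fraction(1)
    for j in range(k):
        out *= Fraction(a + 1 + j, a + b + 2 + j)
    return out


def exponential_moment(k, r, T, t):
    """k-th moment ((b+1)/(k+b+1))**(a+1) with a = r, b = T/t."""
    b = Fraction(T, t)
    return ((b + 1) / (k + b + 1)) ** (r + 1)


# Test data of the example: (successes m_i, trials n_i) for A1..A4.
binomial_data = [(18, 20), (25, 30), (20, 20), (19, 20)]


def system_moment(k):
    out = exponential_moment(k, r=3, T=38, t=6)   # component A5
    for m, n in binomial_data:
        out *= binomial_moment(k, m, n)
    return out


M1, M2 = system_moment(1), system_moment(2)

# For a beta density with exponents a and b: mean = (a+1)/(a+b+2) and
# variance = mean*(1-mean)/(a+b+3), so s below equals a + b + 2.
s = M1 * (1 - M1) / (M2 - M1 ** 2) - 1
a = float(M1 * s) - 1
b = float((1 - M1) * s) - 1
```

Under these assumptions the computed values agree with $a = 6.43596$ and $b = 11.92734$ to the printed precision.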

Table 1 provides the comparison between the results obtained by the exact solution and 
the approximate solution. 

TABLE 1 — Numerical Results Obtained from Exact 
and Approximate Solutions 


Density Function 

Distribution Function 

































































[1] Abramowitz, M. and I.A. Stegun (Editors), "Handbook of Mathematical Functions,"
National Bureau of Standards, Applied Mathematics Series, 55, 782 (1964).
[2] Aitchison, J., "Two Papers on the Comparison of Bayesian and Frequentist Approaches to
Statistical Problems of Prediction," Journal of the Royal Society, Series B., 26, 161-175
[3] Bartholomew, D.J., "A Comparison of Some Bayesian and Frequentist Inferences,"
Biometrika 52, 1 and 2, 19-35 (1965).
[4] Birnbaum, Z.W., "On the Probabilistic Theory of Complex Structures," Proceedings of the
Fourth Berkeley Symposium, 1, 49-55, University of California Press (1961).
[5] Birnbaum, Z.W., J.D. Esary and S.C. Saunders, "Multi-Component Systems and Structures
and Their Reliability," Technometrics, 3, 55-77 (1961).
[6] Brender, D.M., "Reliability Testing in a Bayesian Context," IEEE International Convention
Record, Part 9, 125-136 (1966).
[7] Chang, E.Y. and W.E. Thompson, "Bayes Analysis of Reliability of Complex Systems,"
Operations Research, 24, 156-168 (1976).
[8] Chang, E.Y. and W.E. Thompson, "Bayes Confidence Limits for Reliability of Redundant
Systems," Technometrics, 17 (1975).
[9] Cole, Peter Z.V., "A Bayesian Reliability Assessment of Complex Systems for Binomial
Sampling," IEEE Transactions on Reliability, R-24, 114-117 (1975).
[10] Esary, J.D. and F. Proschan, "Coherent Structures of Non-Identical Components,"
Technometrics, 5, 191-209 (1963).
[11] Esary, J.D. and F. Proschan, "The Reliability of Coherent Systems," Redundancy
Techniques for Computing Systems, Edited by R.H. Wilcox and W.C. Mann, SPARTAN
BOOKS, 47-61 (1962).
[12] Fox, B.L., "A Bayesian Approach to Reliability Assessment," NASA Memorandum
RM/5084 - (1966).
[13] Gardner, M.F. and J.L. Barnes, Transients In Linear Systems, John Wiley and Sons, New
York, 152-163 (1942).
[14] Gaver, D.P. and M. Mazumdar, "Statistical Estimation in a Problem of System Reliability,"
Naval Research Logistics Quarterly, 14, 473-488 (1967).
[15] Gnedenko, B.V., Yu.K. Belyayev and A.D. Solovyev, Mathematical Methods of Reliability
Theory, Academic Press, New York, 76-77 (1969).
[16] Lanczos, C., Applied Analysis, Prentice Hall, Inc. Chapter IV and VII.
[17] Lin, P.M., B.J. Leon and T.C. Huang, "A New Algorithm for Symbolic System Reliability
Analysis," IEEE Transactions on Reliability, R-25 (1976).
[18] Lindley, D.V., "The Robustness of Interval Estimates," Bulletin of International Statistical
Institute, 38, 209-220 (1961).
[19] Lindley, D.V., "The Use of Prior Probability Distributions in Statistical Inference and
Decisions," Proceedings of the Fourth Berkeley Symposium, 1, 453-468, University of
California Press (1961).
[20] Maritz, J.S., "Empirical Bayes Methods," Methuen's Monographs on Applied Probability
and Statistics, Methuen and Co. Ltd., London (1970).
[21] Matther, F.P. and Paulo T. deSousa, "Reliability Models of N-tuple Modular Redundancy
Systems," IEEE Transactions on Reliability, R-24, 108 (1975).
[22] NAVORD OD 44622, Reliability Guide Series, 4. The Superintendent of Documents,
U.S. Government Printing Office, Washington, D.C.
[23] Parker, J.B., "Bayesian Prior Distributions for Multi-Component Systems," Naval Research
Logistics Quarterly, 19 (1972).
[24] Savage, L.J., "The Foundations of Statistics Reconsidered," Proceedings of the Fourth
Berkeley Symposium, 1, 575-585, University of California Press (1961).
[25] Springer, M.D. and W.E. Thompson, "Bayesian Confidence Limits for the Reliability of
Cascade Exponential Subsystems," IEEE Transactions on Reliability, R-16, 86-89
[26] Springer, M.D. and W.E. Thompson, "Bayesian Confidence Limits for the Product of N
Binomial Parameters," Biometrika 53, 3 and 4, 611 (1966).
[27] Thompson, W.E. and P.A. Palicio, "Bayesian Confidence Limits for the Availability of
Systems," IEEE Transactions on Reliability, R-24, 118-120 (1975).
[28] Thompson, W.E. and M.D. Springer, "A Bayes Analysis of Availability for a System
Consisting of Several Independent Subsystems," IEEE Transactions on Reliability, R-21,
212-214 (1972).
[29] Wolf, J.E., "Bayesian Reliability Assessment From Test Data," Proceedings 1976 Annual
Reliability and Maintainability Symposium, Las Vegas, Nevada, 20-22, 411-419 (1976).


L. Shaw 

Polytechnic Institute of New York 
Brooklyn, New York 

C-L. Hsu 

Minneapolis Honeywell 
Minneapolis, Minnesota 

S. G. Tyan 

M/A COM Laboratories 
Germantown, Maryland 


A single component system is assumed to progress through a finite number
of increasingly bad levels of deterioration. The system with level $i$ ($0 \le i \le n$)
starts in state 0 when new, and is definitely replaced upon reaching the worthless
state $n$. It is assumed that the transition times are directly monitored and
the admissible class of strategies allows substitution of a new component only
at such transition times. The durations in various deterioration levels are
dependent random variables with exponential marginal distributions and a
particularly convenient joint distribution. Strategies are chosen to maximize the
average rewards per unit time. For some reward functions (with the reward
rate depending on the state and the duration in this state) the knowledge of
previous state duration provides useful information about the rate of deterioration.

Many authors have studied optimal replacement rules for parts characterized by Marko- 
vian deterioration, for example Kao [6] and Luss [9] and the many references found in those 
papers. Kao minimized the expected average cost per unit time for a semi-Markovian
deteriorating system, and considered various combinations of state and age-dependent replacement rules.

Luss examined inspection and repair models, where he assumed that the operating costs 
occurring during the system's life increase with the increasing deterioration. The holding times 
in the various states were independently, identically, and exponentially distributed. The policies 
examined include the scheduling of the next inspections (when an inspection reveals that the
state of the system is better than a certain critical state k) and preventive repairs (when an
inspection reveals that the state of the system is worse than or equal to k). The convenience of
a Poisson-type structure for the number of events-per-unit-time made it relatively easy to allow 
general freedom in the selection of observation times. 

The work studied here is based on a modification of the model used by Luss. Our model 
for deterioration is more general, but the admissible strategies used here are more restricted. 
Here we allow the exponentially distributed durations to have different mean values, and to be 
positively correlated. 

'This work was partially supported by Grant No. N00014-75-C-0858 from the Office of Naval Research. 




The introduction here of correlation between interval durations permits the modeling of a 
rate of deterioration which can be estimated from a particular realization of the past durations. 
However, the lack of a Poisson-type of structure for the events-per-unit-time makes it much 
more difficult here to allow general freedom in the selection of observation times. At present,
only the simple case of direct and instantaneous observation of deterioration jumps has been
considered.
This model would be appropriate, for example, in a subsystem which functions, but with 
reduced efficiency, when some redundant components have failed; and for which failure of one 
component might indicate environmental stresses which increase the probability of failure for 
other components. In addition, deterioration in correlated stages might be used as a simple 
approximation for a continuously varying degradation which does not exhibit discrete stages. 

Figure 1 shows a typical time history of deterioration and replacement. The duration in
state $(i-1)$, prior to reaching state $(i)$, is $r_{i-1}$. The intervals $d_i$ in Figure 1 represent the time
required to replace a component when it has entered state $i$. The sequence $\{r_i\}$ will be Markov,
characterized by a multivariate exponential distribution. Reward functions will be related to
the deterioration state and the time spent in each state. The decision rule specifies whether or
not to replace when entering each state $i$, on the basis of the history of $r_{i-1}, r_{i-2}, \ldots$. The
Markov property simplifies the decision rule to a collection of sets $\mathcal{C}_i$ such that we replace
on entering state $i$ if and only if $r_{i-1} \in \mathcal{C}_i$.







Figure 1. History of deterioration and replacement (n = 5). 



The objective is to maximize the average reward per unit time: 

(1) $L = \lim_{T \to \infty} \frac{1}{T}\,[\text{Total reward in } (0, T)]$

(2) $\phantom{L} = \dfrac{E[\text{Reward per renewal}]}{E[\text{Duration between renewals}]}$

(See Ross [11] page 160 for equivalence of (1) and (2).) The mean reward per renewal is 
defined here as: 

(3) $\mathcal{R} = E\left[\sum_{i=0}^{N-1} \int_0^{r_i} c_i(t)\,dt - p_N\right]$

in which:

$N$ = state at which replacement occurs (possibly random).
$p_N$ = replacement cost if replaced on entering state $N$ (possibly random).
$c_i(t)$ = reward rate when in state $i$.

Figure 2 shows several reward rate time functions $c(t)$ which have been considered.
When one of these $c(t)$ functions is specified for a given problem, the $c_i(t)$ in (3) are assigned
values $\beta_i c(t)$ with:

(4) $\beta_0 > \beta_1 > \beta_2 > \cdots > \beta_{n-1} > \beta_n = 0$,

to assure greater reward rates in less deteriorated states. State $n$ corresponds to a completely
failed or worthless component.


(a) constant 

(b) linear 

(c) constant-after set up 

Figure 2. Reward rate time functions. 

The mean duration in (2) is defined as:

(5) $\mathcal{D} = E\left[\sum_{i=0}^{N-1} r_i + d_N\right]$

to include a possibly random time $d_N$ for carrying out a replacement at state $N$.

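While the ultimate objective is to choose the $\mathcal{C}_i$ to maximize the $L$ defined in (1), it is well
known that a related problem of maximizing:

(6) $\mathcal{L}_0(\alpha) = \mathcal{R} - \alpha\,\mathcal{D}$

is simpler [1]. Indeed, the $\mathcal{C}_i$ which maximize $L$ will be identical to those which maximize
$\mathcal{L}_0(\alpha)$ for the $\alpha^*$ such that:

(7) $\mathcal{L}_0^o(\alpha^*) = 0$, where $\mathcal{L}_0^o(\alpha) \triangleq \max_{\{\mathcal{C}_i\}} \mathcal{L}_0(\alpha)$.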
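The link between the ratio $L$ and the parametric problem (6)-(7) can be illustrated numerically. The following is only a sketch under stated assumptions: it supposes a finite list of candidate policies, each summarized by a hypothetical pair (mean reward per renewal, mean renewal duration), and it uses the standard Dinkelbach iteration for ratio maximization, which is not a procedure taken from the paper. The function name and the numbers are illustrative.

```python
def dinkelbach(policies, alpha=0.0, tol=1e-12, max_iter=100):
    """policies: list of (reward, duration) pairs with duration > 0.

    Repeatedly maximizes reward - alpha * duration, then resets alpha to the
    ratio attained by the maximizer, until the maximum is numerically zero.
    """
    best = policies[0]
    for _ in range(max_iter):
        best = max(policies, key=lambda rd: rd[0] - alpha * rd[1])
        value = best[0] - alpha * best[1]
        if abs(value) < tol:
            return alpha, best
        alpha = best[0] / best[1]
    return alpha, best


# Hypothetical (reward, duration) pairs, chosen only for illustration.
candidates = [(5.0, 3.0), (13.0, 5.0), (19.0, 7.0), (23.0, 9.0)]
alpha_star, best_policy = dinkelbach(candidates)
```

At termination the maximum of reward minus alpha times duration is zero, and alpha equals the largest reward-to-duration ratio among the candidates.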


Section 1 considers a case in which it is found that deterioration rate information is not 
useful (e.g., the optimal policy is independent of the amount of correlation between successive 
state durations). 

Sections 2 and 3 consider other penalty cost structures, e.g., assuming that more 
deteriorated parts are rustier, hotter, or more brittle, and therefore more costly to replace. In 
such cases the optimal policies do make use of estimates of the deterioration rates as well as of 
observations of the deterioration level. 

The Appendix describes useful properties of the multivariate exponential $\{r_i\}$ sequence
which is used to model the correlated residence times in a sequence of deterioration states.
These durations have marginal distributions which are exponential with mean values $\eta_i$, and
correlations $\rho_{r_i r_j} = \rho^{|i-j|}$.


The constant reward rate case with $c_i(t) = \beta_i$, and with state-independent replacement
penalties ($p_i = p$, $d_i = d$) is particularly simple to analyze. We will see that as long as
$E[r_i \mid r_{i-1}, r_{i-2}, \ldots] \ge 0$ for all $i$, even if the $r_i$ are not exponentially distributed, the optimal
rule will be to replace the deteriorating part upon entering some critical state $k^*$, independent of
the observed durations $r_i$.

Based on the problem statement, the optimal decision on entering state $j$ must maximize
the mean future reward until the next renewal, $\mathcal{L}_j(\alpha)$, for a suitable $\alpha$. Here:

(8) $\mathcal{L}_j(\alpha) = E\left[\sum_{i=j}^{N-1} \beta_i r_i \,\middle|\, r_{j-1}\right] - \alpha E\left[\sum_{i=j}^{N-1} r_i \,\middle|\, r_{j-1}\right] - p - \alpha d.$

Immediately after a renewal, when $j = 0$, the expectations defining $\mathcal{L}_0(\alpha)$ are unconditional.
The optimal decisions for each state will be found in terms of $\alpha$, and then the proper $\alpha^*$ (for
producing decisions which maximize $L$) is the one for which the maximum:

(9) $\max \mathcal{L}_0(\alpha^*) = \mathcal{L}_0^o(\alpha^*) = 0.$

Optimization by dynamic programming begins by considering the decisions at the last
step, i.e., on entering state $(n-1)$. There are two choices, to replace ($R$) or not to replace
($\bar{R}$), with corresponding values:

(10) $\mathcal{L}_{n-1}(\alpha; R) = -p - \alpha d,$

(11) $\mathcal{L}_{n-1}(\alpha; \bar{R}) = E[\beta_{n-1} r_{n-1} \mid r_{n-2}] - \alpha E[r_{n-1} \mid r_{n-2}] - p - \alpha d$

$\phantom{(11)\ \mathcal{L}_{n-1}(\alpha; \bar{R})} = E[(\beta_{n-1} - \alpha) r_{n-1} \mid r_{n-2}] - p - \alpha d.$

Clearly, the best decision is not to replace if and only if the difference

(12) $\Delta_{n-1}(\alpha; r_{n-2}) \triangleq \mathcal{L}_{n-1}(\alpha; \bar{R}) - \mathcal{L}_{n-1}(\alpha; R) = (\beta_{n-1} - \alpha) E[r_{n-1} \mid r_{n-2}] \ge 0$

is non-negative. The sign of (12) will be the sign of $(\beta_{n-1} - \alpha)$, due to the non-negativity of
all interval durations. Thus the best decision depends on $\alpha$ and the reward parameter $\beta_{n-1}$, but
not on the previously observed duration. Two cases will be considered separately.



If $\beta_{n-1} \ge \alpha$ then the best decision at state $(n-1)$ is not to replace. We will now explain
why, under this condition, it is best not to replace at any state less than $n$. Consider the
situation on entering $(n-2)$. We have already shown that it is better not to replace on entering
$(n-1)$. Thus the choice will be based on a $\Delta_{n-2}$ of the form:

(13) $\Delta_{n-2}(\alpha; r_{n-3}) = E[(\beta_{n-2} - \alpha) r_{n-2} + (\beta_{n-1} - \alpha) r_{n-1} \mid r_{n-3}].$

Here we have:

(14) $(\beta_{n-2} - \alpha) > (\beta_{n-1} - \alpha) \ge 0,$

by assumption, and:

(15) $E[r_{n-1} \mid r_{n-3}] > 0$ and $E[r_{n-2} \mid r_{n-3}] > 0,$

because all $r_i > 0$ with probability one. Thus $\Delta_{n-2}(\alpha; r_{n-3}) > 0$ for all $r_{n-3} > 0$, and it is also
better not to replace here. This argument can be repeated for states $(n-3)$,
$(n-4), \ldots, 1, 0$.

The other case to consider is $\beta_{n-1} < \alpha$, which requires replacement on entering state
$(n-1)$, if the system ever reaches that state. When we consider the decision on entering
$(n-2)$, the $\Delta_{n-2}$ is:

(16) $\Delta_{n-2}(\alpha; r_{n-3}) = E[(\beta_{n-2} - \alpha) r_{n-2} \mid r_{n-3}],$

which has the sign of $(\beta_{n-2} - \alpha)$. If $(\beta_{n-2} - \alpha) < 0$, then replacement is optimal on entering
$(n-2)$ and $(n-3)$ is considered next. This iteration may eventually reach a state $(k-1)$
where $(\beta_{k-1} - \alpha) \ge 0$ and it is better not to replace. Arguments similar to those for the
$\beta_{n-1} - \alpha \ge 0$ case show that nonreplacement is the optimal decision at all states preceding the
one which first arises as a nonreplacement state in this backward iteration.

In summary, in the constant reward rate, constant replacement penalty case, $\mathcal{L}_0(\alpha)$ is
maximized by a decision rule which says replace on entering some state $k \le n$ which depends on
the reward parameters $\{\beta_i\}$ and the $\alpha$:

(17) $k = \min\{i : (\alpha - \beta_i) > 0\}.$

Finally, we must choose $\alpha^*$ so that $\mathcal{L}_0^o(\alpha^*) = 0$, where:

(18) $\mathcal{L}_0^o(\alpha) = -p - \alpha d + \sum_{i=0}^{k-1} (\beta_i - \alpha) E[r_i].$

Figure 3 shows a typical plot of $\mathcal{L}_0^o(\alpha)$ as a continuous, piecewise linear curve whose zero
crossing ($\mathcal{L}_0^o(\alpha^*) = 0$) defines $\alpha^*$ and the optimal replacement state $k^*$ for maximizing $L$.

EXAMPLE. Figure 3 shows that the optimal average reward per unit time is $L = 2\tfrac{5}{7}$
when $k^* = 3$, where $\beta_0 = 5$, $\beta_1 = 4$, $\beta_2 = 3$, $\beta_3 = 2$, $\beta_4 = 1$, $\beta_5 = 0$, $p = 5$, $d = 1$, $\eta_i = 2$
($i = 0, 1, 2, 3, 4$) and $n = 5$. From Equation (18), the optimal $k$ is a function of $\alpha$, which
remains constant when $\alpha$ varies over each interval $\beta_{i+1} < \alpha < \beta_i$, as shown in the figure.
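The example can be checked numerically. The sketch below uses illustrative function and variable names that do not come from the paper; it evaluates (17) and (18) with $E[r_i] = \eta_i$, locates the zero of the piecewise linear function by bisection, and compares the result with a direct enumeration of the reward-to-duration ratio for each fixed replacement state $k$.

```python
from fractions import Fraction

beta = [5, 4, 3, 2, 1, 0]      # reward rates beta_0..beta_5
eta = [2, 2, 2, 2, 2]          # mean durations eta_0..eta_4
p, d, n = 5, 1, 5              # replacement cost, replacement time, worst state


def critical_state(alpha):
    """Equation (17): first state whose reward rate falls below alpha (alpha > 0)."""
    return min(i for i in range(n + 1) if alpha - beta[i] > 0)


def l0_opt(alpha):
    """Equation (18) with E[r_i] = eta_i."""
    k = critical_state(alpha)
    return -p - alpha * d + sum((beta[i] - alpha) * eta[i] for i in range(k))


def find_alpha_star(lo=0.0, hi=5.0, iters=100):
    """Bisection on the decreasing, piecewise linear function l0_opt."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if l0_opt(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def ratio(k):
    """Average reward per unit time when always replacing on entering state k."""
    reward = sum(beta[i] * eta[i] for i in range(k)) - p
    return Fraction(reward, sum(eta[:k]) + d)


alpha_star = find_alpha_star()
k_star = max(range(1, n + 1), key=ratio)
```

Both routes agree: the enumeration gives the ratios 5/3, 13/5, 19/7, 23/9 and 25/11 for $k = 1, \ldots, 5$, so the best fixed state is $k = 3$ with value 19/7, which is also the root found by bisection.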


Here we generalize the model of the previous section by allowing the replacement cost $p_i$
and replacement duration $d_i$ to be functions of the replacement state ($i$), and to be random.
These parameters are assumed to have mean values $E[p_i]$ and $E[d_i]$ which are convex
nondecreasing sequences in $i$, corresponding to the increased difficulty in replacing more deteriorated
parts which may be, e.g., rustier, hotter or more brittle. We also assume that the mean
durations are ordered: $\eta_0 \ge \eta_1 \ge \cdots \ge \eta_{n-1}$, corresponding to faster transitions of more
deteriorated parts.




Figure 3. Optimal reward search: constant reward rate case. (Axes: $\mathcal{L}_0^o(\alpha)$ versus $\alpha$; the zero crossing is marked $\alpha^* = L_{\max}$.)



The foregoing assumptions, together with properties of the assumed multivariate
exponential density for stage-durations (see Appendix), lead to an optimal decision policy with
a nice structure. That optimal policy prescribes replacement when entering state $j$, if and only
if $r_{j-1} < r_{j-1}^*$, where the decision thresholds are ordered:
$0 \le r_0^*/\eta_0 \le r_1^*/\eta_1 \le \cdots \le r_{n-1}^*/\eta_{n-1}$.

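The optimal decision on entering state $j$ must maximize the mean future reward until the
next renewal, i.e., $\mathcal{L}_j(\alpha)$. For a suitable $\alpha$, we have:

(19) $\mathcal{L}_j(\alpha) = E\left[\sum_{i=j}^{N-1} \beta_i r_i - \alpha \sum_{i=j}^{N-1} r_i \,\middle|\, r_{j-1}\right] - E[p_N + \alpha d_N].$

For notational simplicity we define $e_i = E[p_i + \alpha d_i]$ and note that $e_i$ is also convex and
nondecreasing since we are only interested in $\alpha > 0$. The optimal decisions for each state will be
found in terms of $\alpha$, and then the proper $\alpha^*$ (for producing decisions which maximize $L$) is the
one for which the maximum $\mathcal{L}_0$ vanishes:

(20) $\mathcal{L}_0^o(\alpha^*) = -e_N(\alpha^*) + E\left[\sum_{i=0}^{N-1} (\beta_i - \alpha^*) r_i\right] = 0.$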

Optimization by dynamic programming begins by considering the decision at the last step.
Since state $n$ represents a failed component, we definitely replace the component when it enters
state $n$. Next, we consider the decision to be made on entering state $n-1$. There are two
choices: to replace ($R$) or not to replace ($\bar{R}$), with corresponding values

(21) $\mathcal{L}_{n-1}(\alpha; R) = -e_{n-1},$

(22) $\mathcal{L}_{n-1}(\alpha; \bar{R}) = E[\beta_{n-1} r_{n-1} - \alpha r_{n-1} \mid r_{n-2}] - e_n$

for $\mathcal{L}_{n-1}(\alpha)$. Clearly, the best decision is not to replace if and only if

$\Delta_{n-1}(r_{n-2}) \triangleq \mathcal{L}_{n-1}(\alpha; \bar{R}) - \mathcal{L}_{n-1}(\alpha; R)$

is non-negative, i.e.,

(23) $\Delta_{n-1}(r_{n-2}) = (\beta_{n-1} - \alpha) E[r_{n-1} \mid r_{n-2}] + (e_{n-1} - e_n) \ge 0.$

Referring to (A-6), $\Delta_{n-1}(r_{n-2})$ is a linear function of $r_{n-2}$, with

$\Delta_{n-1}(0) = (\beta_{n-1} - \alpha)\,\eta_{n-1}(1 - \rho) + (e_{n-1} - e_n).$

Figure 4 shows the possible shapes for this function. There can be no downward zero-crossing
at an $r_{n-2} > 0$.

Thus, depending on the numerical values of the parameters, there are three possible kinds
of optimal decision rules when entering state $(n-1)$:

(i) replace for no $r_{n-2}$ if $\Delta_{n-1} \ge 0$ for all $r_{n-2} \ge 0$

(ii) replace for any $r_{n-2}$ if $\Delta_{n-1} < 0$ for all $r_{n-2} \ge 0$

(iii) replace if and only if $r_{n-2}^* > r_{n-2} \ge 0$, where $\Delta_{n-1}(r_{n-2}^*) = 0$.

In other words,

(24) $\mathcal{C}_{n-1}(\alpha) = \{r_{n-2} : r_{n-2} < r_{n-2}^*\},$

where $r_{n-2}^*$ could be zero (case i) or infinite (case ii).
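Because (23) is linear in the previous duration, the last-stage threshold has a closed form. The sketch below is illustrative only: the function names and the numerical values in the usage note are hypothetical, and it assumes $0 \le \rho \le 1$ and $e_{n-1} \le e_n$ as in the text.

```python
import math


def cond_mean(r_prev, eta_i, eta_prev, rho):
    """Conditional mean of r_i given r_{i-1}, as in (A-6)."""
    return eta_i + (r_prev - eta_prev) * rho * eta_i / eta_prev


def last_stage_threshold(beta, alpha, eta_i, eta_prev, rho, e_i, e_next):
    """Threshold r* for the rule 'replace iff r_prev < r*' implied by (23).

    Returns 0.0 when replacement is never optimal (case i) and math.inf
    when it is always optimal (case ii). Assumes e_i <= e_next.
    """
    if e_i > e_next:
        raise ValueError("expected nondecreasing replacement penalties")
    slope = (beta - alpha) * rho * eta_i / eta_prev
    intercept = (beta - alpha) * eta_i * (1.0 - rho) + (e_i - e_next)
    if intercept >= 0 and slope >= 0:
        return 0.0                      # case (i): Delta >= 0 for all r_prev >= 0
    if slope <= 0:
        return math.inf                 # case (ii): Delta < 0 for all r_prev > 0
    return -intercept / slope           # case (iii): single upward zero crossing
```

With the hypothetical values `beta=2.0, alpha=1.5, eta_i=eta_prev=2.0, rho=0.5, e_i=2.0, e_next=4.0` the threshold is 6.0, and substituting `cond_mean(6.0, 2.0, 2.0, 0.5)` into (23) gives zero.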



Figure 4. Possible shapes for $\Delta_{n-1}(r_{n-2})$. (Curve label: $\beta_{n-1} > \alpha$.)

Next we consider the optimal decision when entering state $(n-2)$, assuming that the
optimal decision will be made at the subsequent stage. We consider cases of $(\beta_{n-1} < \alpha)$ and
$(\beta_{n-1} \ge \alpha)$ separately.

(a) $(\beta_{n-1} < \alpha)$ implies replacement on entering $(n-1)$, so

$\Delta_{n-2}(r_{n-3}) = (\beta_{n-2} - \alpha) E[r_{n-2} \mid r_{n-3}] + (e_{n-2} - e_{n-1}),$

resulting in the same three possibilities listed above for state $(n-1)$.

(b) for $(\beta_{n-1} \ge \alpha)$:

(25) $\Delta_{n-2}(r_{n-3}) = e_{n-2} + (\beta_{n-2} - \alpha) E[r_{n-2} \mid r_{n-3}]$

$\quad + \int_{r_{n-2}^*}^{\infty} \left[(\beta_{n-1} - \alpha) E[r_{n-1} \mid r_{n-2}] - e_n\right] f(r_{n-2} \mid r_{n-3})\, dr_{n-2}$

$\quad + \int_0^{r_{n-2}^*} (-e_{n-1})\, f(r_{n-2} \mid r_{n-3})\, dr_{n-2}.$

Equation (25) can be simplified, with the aid of the notation $(x)^+ = \max(x, 0)$, to the form

(26) $\Delta_{n-2}(r_{n-3}) = (e_{n-2} - e_{n-1}) + (\beta_{n-2} - \alpha) E[r_{n-2} \mid r_{n-3}] + E[(\Delta_{n-1}(r_{n-2}))^+ \mid r_{n-3}].$

Useful comparisons can be formed if normalized variables are introduced, namely

$s_i = r_i/\eta_i$ and $\delta_i(s_{i-1}) = \Delta_i(r_{i-1})\big|_{r_{i-1} = \eta_{i-1} s_{i-1}}.$

We now prove

(a) $\delta_{n-2}(s_{n-3}) \ge \delta_{n-1}(s_{n-3})$

(b) $\delta_{n-2}(s_{n-3})$ is convex with at most one upward zero crossing at an $s > 0$.

There is no harm in writing $\delta_{n-1}(s_{n-3})$ or $\delta_{n-1}(s^+)$ instead of $\delta_{n-1}(s_{n-2})$ for purposes of
comparing functions.


To prove (a), consider

(27) $\delta_{n-2}(s) - \delta_{n-1}(s) = [(e_{n-2} - e_{n-1}) - (e_{n-1} - e_n)] + E[(\delta_{n-1}(s^+))^+ \mid s]$

$\quad + [(\beta_{n-2} - \alpha)\eta_{n-2} - (\beta_{n-1} - \alpha)\eta_{n-1}]\, E[s^+ \mid s],$

where $s^+$ represents the normalized duration following $s$.

The terms on the right side of (27) are nonnegative due to the convexity of the $e_i$, $(\ )^+
\ge 0$, (A-6), and the assumed orderings of the $\beta_i$ and $\eta_i$.

This completes the proof that (a) is true. It follows immediately that if (i) (preceding Eq.
(24)) applies for state $(n-1)$, then it is also optimal not to replace in state $(n-2)$ or any
earlier state. (Recall $\beta_{n-1} < \beta_{n-2} < \cdots$, and we are now considering $\alpha \le \beta_{n-1}$.)
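Property (a) can be checked numerically for a particular parameter set. The sketch below is not the authors' computation: the parameter values are hypothetical (chosen to satisfy the stated orderings), the conditional density is assumed to have the Bessel-function form of (A-5) written in normalized variables, $I_0$ is evaluated by its power series, and the expectation in (26) is approximated by the trapezoidal rule.

```python
import math


def bessel_i0(x):
    """Modified Bessel function I_0 via its power series (adequate for moderate x)."""
    term, total, k = 1.0, 1.0, 0
    q = 0.25 * x * x
    while term > 1e-17 * total:
        k += 1
        term *= q / (k * k)
        total += term
    return total


def cond_density(s_next, s, rho):
    """Assumed density of the next normalized duration given the current one, 0 < rho < 1."""
    c = 1.0 - rho
    arg = 2.0 * math.sqrt(rho * s * s_next) / c
    return math.exp(-(s_next + rho * s) / c) * bessel_i0(arg) / c


def expect(h, s, rho, upper=40.0, steps=4000):
    """E[h(s_next) | s] by the composite trapezoidal rule on [0, upper]."""
    width = upper / steps
    total = 0.5 * (h(0.0) * cond_density(0.0, s, rho)
                   + h(upper) * cond_density(upper, s, rho))
    for j in range(1, steps):
        y = j * width
        total += h(y) * cond_density(y, s, rho)
    return total * width


# Hypothetical parameters satisfying the stated orderings.
alpha, rho = 1.5, 0.5
beta_n2, beta_n1 = 3.0, 2.0         # beta_{n-2} > beta_{n-1} >= alpha
eta_n2, eta_n1 = 2.0, 2.0           # eta_{n-2} >= eta_{n-1}
e_n2, e_n1, e_n = 1.0, 2.0, 4.0     # convex, nondecreasing e_i


def delta_n1(s):
    """Normalized form of (23); linear in s because of (A-6)."""
    return (beta_n1 - alpha) * eta_n1 * (1.0 + (s - 1.0) * rho) + (e_n1 - e_n)


def delta_n2(s):
    """Normalized form of (26)."""
    mean_next = 1.0 + (s - 1.0) * rho
    plus_part = expect(lambda y: max(delta_n1(y), 0.0), s, rho)
    return (e_n2 - e_n1) + (beta_n2 - alpha) * eta_n2 * mean_next + plus_part
```

For these values `delta_n2(s)` exceeds `delta_n1(s)` at every `s` tried, as (27) predicts; the quadrature settings `upper` and `steps` are ad hoc and would need revisiting for `rho` close to 1.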

To prove (b), which is only of interest when an $r_{n-2}^* > 0$ exists, we refer to the theorem
in the appendix. The test difference $\delta_{n-2}(s)$ can be written as

(28) $\delta_{n-2}(s) = E[e_{n-2} - e_{n-1} + (\beta_{n-2} - \alpha)\eta_{n-2} s^+ + (\delta_{n-1}(s^+))^+ \mid s]$

in which the integrand has the properties required by $h(s)$ in the theorem. To see this, we
note that $r_{n-2}^* > 0$ implies that $(\delta_{n-1}(0))^+ = 0$, so the integrand is nonpositive at $s^+ = 0$.
Thus, $\delta_{n-2}(s)$ has the shape stated in (b), implying that

(29) $\mathcal{C}_{n-2} = \{r_{n-3} : r_{n-3} < r_{n-3}^*\}$

where $r_{n-3}^*$ may be zero, infinity, or the nonnegative value defined by $\delta_{n-2}(r_{n-3}^*/\eta_{n-3}) = 0$.

The foregoing arguments can be repeated for $r_{n-4}, r_{n-5}, \ldots, r_0$ to prove that the optimal
replacement policy has the form:

Replace on entering state $i$ if and only if $r_{i-1} \le r_{i-1}^*$, where

$0 \le r_0^*/\eta_0 \le r_1^*/\eta_1 \le \cdots \le r_{n-1}^*/\eta_{n-1} = \infty.$

When repeating the proof for earlier stages, the $(\ )^+$ term in (27) and (28) is modified to the
form, e.g., $[(\delta_{n-2}(s^+))^+ - (\delta_{n-1}(s^+))^+]$. This term is generally nonnegative, due to (a) at the
preceding iteration (next time step); and it is zero for $s^+ = 0$ when proving (b), since then $r_{n-3}^*
> 0$. Thus the basic theorem is still applicable.

3. Computational Procedure 

The preceding section derived the structure of the optimal decision rule for the case
where replacement is more difficult and more expensive when the part is more deteriorated.
The corresponding optimal decision thresholds can be found as follows:

(a) Choose an initial $\alpha$.

(b) Find the $r_i^*(\alpha)$ ($i = n-1, n-2, \ldots, 0$) recursively, via numerical integration of
expressions like (26) (where $r_{n-3}^*(\alpha)$ is defined by the condition $\Delta_{n-2}(r_{n-3}^*) = 0$).

(c) Compute

$\mathcal{L}_0^o(\alpha) = -e_1 + \int_0^\infty \left[(\beta_0 - \alpha) r_0 + (\Delta_1(r_0))^+\right] f(r_0)\, dr_0.$

(d) If $|\mathcal{L}_0^o(\alpha)| < \epsilon$ for sufficiently small $\epsilon$, say $L_{\max} = \alpha^* = \alpha$; otherwise repeat the
computational cycle starting with a new $\alpha$.


The following properties of $\mathcal{L}_0^o(\alpha)$ can be used to generate an $\alpha$-sequence which
converges to $\alpha^*$.

1. $\mathcal{L}_0^o(\alpha)$ is monotone decreasing, since $\mathcal{L}_0(\alpha)$ has this property for a fixed policy (see
Eq. (19)); and if $\mathcal{L}_0^o(\alpha_2) \ge \mathcal{L}_0^o(\alpha_1)$ for $\alpha_2 > \alpha_1$, then the policy used to achieve $\mathcal{L}_0^o(\alpha_2)$ could
be used to achieve an $\mathcal{L}_0(\alpha_1) > \mathcal{L}_0^o(\alpha_1)$, a contradiction.

2. When $\rho = 0$, all $r_i^*$ are zero or infinite: replacement always occurs on arrival at a
critical state $i^*$. Use of that policy will achieve the same average reward for durations having any
value of $\rho$. Thus, a useful bound on $\alpha^*(\rho)$ is $\alpha^*(0) \le \alpha^*(\rho)$; $0 < \rho < 1$.

3. When $\rho = 1$, future $r_i$ are completely predictable ($\mathrm{Var}(r_i \mid r_{i-1}) = 0$ in (A-7)), so
$\alpha^*(1) \ge \alpha^*(\rho)$. In this case there is essentially a single random variable $r_0$, and the $r_i^*$ can be
calculated without the need for numerical integration of Bessel functions.
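The two bounds in properties 2 and 3 can be computed without Bessel functions. The sketch below is a hedged illustration with hypothetical parameter values (they are not the entries of the paper's Table 1): for $\rho = 0$ it enumerates fixed replacement states, and for $\rho = 1$ it uses $r_i = r_0\eta_i/\eta_0$ with $r_0$ exponential, picks the best replacement state for each $r_0$, integrates by the trapezoidal rule, and finds the root in $\alpha$ by bisection, relying on property 1.

```python
import math

# Hypothetical parameters, chosen only to satisfy the stated orderings.
beta = [5.0, 4.0, 3.0, 2.0, 1.0]        # reward rates beta_0..beta_{n-1}
eta = [2.0, 1.8, 1.6, 1.4, 1.2]         # mean durations, nonincreasing
p = [0.0, 1.0, 2.0, 4.0, 7.0, 11.0]     # E[p_j], j = 0..n (index 0 unused)
d = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]      # E[d_j], j = 0..n (index 0 unused)
n = len(beta)


def alpha_rho0():
    """rho = 0: best fixed replacement state, as a reward/duration ratio."""
    return max(
        (sum(b * h for b, h in zip(beta[:j], eta[:j])) - p[j]) / (sum(eta[:j]) + d[j])
        for j in range(1, n + 1)
    )


def l0_rho1(alpha, upper=40.0, steps=4000):
    """rho = 1: expectation over s = r_0/eta_0 ~ Exp(1) of the best per-s value."""
    gains = [0.0]
    for b, h in zip(beta, eta):
        gains.append(gains[-1] + (b - alpha) * h)   # sum over i < j of (beta_i - alpha) eta_i

    def best(s):
        return max(s * gains[j] - (p[j] + alpha * d[j]) for j in range(1, n + 1))

    width = upper / steps
    total = 0.5 * (best(0.0) + best(upper) * math.exp(-upper))
    total += sum(best(k * width) * math.exp(-k * width) for k in range(1, steps))
    return total * width


def bisect_decreasing(f, lo, hi, tol=1e-9):
    """Root of a decreasing function f with f(lo) >= 0 >= f(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


a0 = alpha_rho0()
a1 = bisect_decreasing(l0_rho1, a0 - 0.1, max(beta))
```

For these hypothetical values `a0` is 3.6 and `a1` comes out slightly larger, consistent with $\alpha^*(0) \le \alpha^*(\rho) \le \alpha^*(1)$; an intermediate $\rho$ would be searched for only inside that bracket.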


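Table I lists parameter values for a replacement problem which falls under the assumptions
of Section 2.

TABLE 1 — Numerical Example Parameters

[Table body missing; the only surviving row label is $E[p_i]$.]

CASE 1 ($\rho = 0$)

Since future durations are independent of past ones, the optimal policy replaces when a
critical state $i^*$ is reached. The general optimal reward expression

$\alpha^*(\rho) = \dfrac{E\left[\sum_{i=0}^{N-1} \beta_i r_i - p_N\right]}{E\left[\sum_{i=0}^{N-1} r_i + d_N\right]}$

becomes, in this case

$\alpha^*(0) = \max_j \dfrac{\sum_{i=0}^{j-1} \beta_i \eta_i - E[p_j]}{\sum_{i=0}^{j-1} \eta_i + E[d_j]} = \max_j \phi(j).$

Direct evaluation shows

$j$:        1     2     3      4      5
$\phi(j)$:  1.5   2.13  2.205  2.085  1.89

with $j^* = 3$ and $\alpha^*(0) = 2.205$.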



CASE 2 ($\rho = 1$)

Since $r_i = r_0\,\eta_i/\eta_0$ in this case, the optimal rule specifies a replacement state $j(r_0)$ as a
function of $r_0$.

For any such policy

$\mathcal{L}_0(\alpha, j(r_0)) = E_{r_0}\left[\dfrac{r_0}{\eta_0} \sum_{i=0}^{j(r_0)-1} (\beta_i - \alpha)\eta_i - e_{j(r_0)}\right].$

This expectation will be maximized if $j(r_0)$ maximizes the bracketed term for each $r_0$. Making
the necessary comparisons for a sequence of $\alpha$-values leads to the policy

$j^* = 1$, if $r_0 < 0.2698$

$\phantom{j^*} = 2$, if $0.2698 \le r_0 < 0.7083$

$\phantom{j^*} = 3$, if $0.7083 \le r_0$

for which $|\mathcal{L}_0| < 0.003$ and $\alpha^*(1) = 2.25$.

We know that $2.205 < \alpha^* < 2.25$. A pilot calculation along the lines indicated in the
previous section shows that $r_0^* = 0$, $r_j^*/\eta_j = \infty$ for $j \ge 2$, and

$r_1^* = [\ldots]\,(3 - \alpha^*)\,[\ldots],$

where $\alpha^*$ is chosen to make the following $\mathcal{L}_0(\alpha)$ vanish:

$\mathcal{L}_0(\alpha) = 6.4 - 3\alpha + \int\!\!\int [\ldots]\,(3 - \alpha) r_1\,[\ldots]\,2 r_0\,[\ldots]\, I_0(2.981\sqrt{r_0 r_1})\, dr_1\, dr_0.$

The known bounds on the optimal reward $\alpha^*$ imply that the optimal threshold $r_1^*$ is
bounded, too: $0.290 < r_1^* < 0.375$.

Similar study of other values of the correlation parameter $\rho$ leads to the optimal policy
pattern described in Table II. One might say that as $\rho$ increases, the past observations are more
informative, the optimal policy makes finer distinctions, and the optimal reward increases.


TABLE 2 - Optimal Policy Structure

[Table layout missing. Column heading: Correlation Parameter $\rho$. Surviving entries, row by row:]
$r_0 < r_0^*(3/4)$;  $r_0 < r_0^*(1)$
$r_1 < r_1^*(1/2)$;  $r_1 < r_1^*(3/4)$;  $r_0 < (\eta_0/\eta_1)\, r_1^*(1)$
$r_1 \ge r_1^*(1/2)$;  $r_1 \ge r_1^*(3/4)$;  $r_0 \ge (\eta_0/\eta_1)\, r_1^*(1)$

A multivariate exponential distribution has been used to describe successive stages of
deterioration. Optimal replacement strategies have been found for the class of decision rules
which can continuously observe the deterioration state, and which may make replacements only
at the times of state transitions. Similar results have been found for the other reward rates
shown in Figure 2 (linear; and constant after an initial set-up interval for readjustment to the
new state) [5].

The optimal replacement policy derived in Section 2 makes use of observations which
allow estimation of the current rate of deterioration for the correlated stages of deterioration.
The numerical example demonstrates how the optimal policy and reward are related to the
amount of correlation between the durations in successive deterioration states. For the model
used here, the optimal policy for $\rho = 0$ will achieve the same reward (less than optimal) for
any $\rho$. Depending on the application, the suboptimal approach may be satisfactory. The
additional reward achievable by the actual optimal policy is bounded by the easily computed optimal
reward for $\rho = 1$. However, it is possible that the small percentage improvement achievable
for the $\rho = 1/2$ case in the example could represent a significant gain in a particular application.

The orderings of state dependent rewards, mean durations, etc. assumed here are physically
reasonable, and lead to a nice ordering of the decision regions. However, other
$\beta_i$, $\eta_i$, $p_i$, $d_i$ orderings might be more appropriate in other situations. The model introduced
here for dependent stage durations could be used in those cases, together with dynamic
programming optimization, although the solutions may not have comparably neat structures.

We anticipate that the optimization approach and policy structure described here will also
be applicable to replacement problems having similar deterioration models. One easy extension
would be to change the correlation structure in (A-3) from $\rho^{|i-j|}$ to something else, e.g.,
$\rho_1^{|i-j|} + \rho_2^{|i-j|}$. Other changes could permit the $r_i$ to have non-exponential distributions, as
long as similar total-positivity properties exist to permit analogous simplifications in the
dynamic programming arguments.

Some of these other $r_i$ distributions are being studied now in the hope of finding similar
models which exhibit large percentage differences between the optimal rewards as $\rho_{r_i r_{i+1}}$ changes
from zero to one. (Other choices of the numerical values in Table I have not revealed any
such cases for the current model.)

One reasonable generalization would allow transitions from state $i$ to any state $j > i$. This
would not change the form of the solution in the case of constant replacement penalties.
However, the possibility of these additional transitions does ruin the structure when replacement
penalties increase with the deterioration state. (The $\delta_{n-2}(s) \ge \delta_{n-1}(s)$ argument is no longer
valid.)


[1] Barlow, R.E. and F. Proschan, "Mathematical Theory of Reliability," John Wiley and Sons
[2] Barlow, R.E. and F. Proschan, "Statistical Theory of Reliability and Life Testing," Holt,
Rinehart, and Winston (1975).
[3] Griffith, R.C., "Infinitely Divisible Multivariate Gamma Distributions," Sankhya, Series
A, 32, 393-404 (1970).
[4] Gumbel, E.J., "Bivariate Exponential Distributions," Journal of the American Statistical
Association, 55, 678-707 (1960).
[5] Hsu, C-L., L. Shaw and S.G. Tyan, "Reliability Applications of Multivariate Exponential
Distributions," Technical Report, Poly-EE-77-036, Polytechnic Institute of New York
[6] Kao, E.P., "Optimal Replacement Rules when Changes of States are Semi-Markovian,"
Operations Research, 21, 1231-1249 (1973).
[7] Karlin, S., "Total Positivity," Stanford University Press (1968).
[8] Kibble, W.F., "A Two-Variate Gamma Type Distribution," Sankhya, 5, 137-150 (1941).
[9] Luss, H., "Maintenance Policies when Deterioration Can Be Observed by Inspections,"
Operations Research, 24, 359-366 (1976).
[10] Marshall, A.W. and I. Olkin, "A Multivariate Exponential Distribution," Journal of the
American Statistical Association, 62, 30-44 (1967).
[11] Ross, S., "Applied Probability Models with Optimization Applications," Holden-Day


Dependence Relationships Among Multivariate 
Exponential Variables 

Many multivariate distributions have been described and applied to reliability problems
[4,8,10]. In each case the marginal univariate distributions are of the negative exponential
form. Properties of the distribution used here are most easily derived by exploiting its
relationship to multivariate normal distributions [3,5].

The multivariate exponential variables $r_1, r_2, \ldots, r_n$ can be viewed as sums of squares:

(A-1) $r_i = w_i^2 + z_i^2,$

where $w$ and $z$ are independent, zero mean, identically distributed normal vectors, each with
covariance matrix $\Gamma$. It follows that the $r_i$ have exponential marginal distributions with

(A-2) $E[r_i] = 2\gamma_{ii}.$

We specialize to the case where the underlying normal sequences $\{w_i\}$ and $\{z_i\}$ are Markovian

(A-3) $\gamma_{ij} = \sqrt{\gamma_{ii}\gamma_{jj}}\;\rho^{|i-j|}$

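and find that $\{r_i\}$ is also Markov with the joint density

(A-4) $f(r_0, r_1, r_2, \ldots, r_{n-1}) = \left[(1-\rho)^{n-1} \prod_{i=0}^{n-1} \eta_i\right]^{-1} \prod_{i=0}^{n-2} I_0\!\left(\dfrac{2}{1-\rho}\sqrt{\dfrac{\rho\, r_i r_{i+1}}{\eta_i \eta_{i+1}}}\right)$

$\quad \times \exp\left\{-\dfrac{1}{1-\rho}\left[\dfrac{r_0}{\eta_0} + \dfrac{r_{n-1}}{\eta_{n-1}} + \sum_{i=1}^{n-2} \dfrac{r_i(1+\rho)}{\eta_i}\right]\right\}; \quad n \ge 2.$

Equation (A-4) uses the modified Bessel function $I_0(\cdot)$ and the notations $E[r_i] = \eta_i$ and
$\rho_{r_i r_{i+1}} = \rho$. (When $n = 2$, the summation in $\exp(\cdot)$ vanishes.)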
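The sum-of-squares construction in (A-1) is easy to simulate. The sketch below is an illustration under assumptions of its own: it takes the underlying normal sequences to be stationary Gaussian AR(1) processes with common variance `gamma` and lag-one correlation `c` (both names are introduced here, not in the paper). It relies only on the standard fact that squares of zero-mean normal variables with correlation `c` have correlation `c**2`; how `c` maps onto the paper's $\rho$ is not asserted.

```python
import math
import random


def simulate_durations(n, gamma, c, rng):
    """One realization r_0..r_{n-1} with r_i = w_i**2 + z_i**2.

    w and z are independent stationary Gaussian AR(1) sequences with
    variance gamma and lag-one correlation c (illustrative names).
    """
    innov = math.sqrt(1.0 - c * c)
    w, z = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)   # unit-variance states
    out = []
    for _ in range(n):
        out.append(gamma * (w * w + z * z))            # scale to variance gamma
        w = c * w + innov * rng.gauss(0.0, 1.0)
        z = c * z + innov * rng.gauss(0.0, 1.0)
    return out


def sample_stats(samples=50000, gamma=1.0, c=0.8, seed=1):
    """Sample mean and variance of r_0 and the sample correlation of (r_0, r_1)."""
    rng = random.Random(seed)
    pairs = [simulate_durations(2, gamma, c, rng) for _ in range(samples)]
    m0 = sum(a for a, _ in pairs) / samples
    m1 = sum(b for _, b in pairs) / samples
    cov = sum((a - m0) * (b - m1) for a, b in pairs) / samples
    v0 = sum((a - m0) ** 2 for a, _ in pairs) / samples
    v1 = sum((b - m1) ** 2 for _, b in pairs) / samples
    return m0, v0, cov / math.sqrt(v0 * v1)


mean_r0, var_r0, corr_r = sample_stats()
```

With `gamma = 1` the sample mean should be close to 2, in line with (A-2), the sample variance close to 4 (the square of the mean, as for an exponential variable), and the sample correlation close to `c**2 = 0.64`.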


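The conditional density is easily shown to satisfy the Markov property and [5]

(A-5) $f(r_i \mid r_{i-1}) = [\eta_i(1-\rho)]^{-1} \exp\left\{-\dfrac{1}{1-\rho}\left[\dfrac{r_i}{\eta_i} + \dfrac{\rho\, r_{i-1}}{\eta_{i-1}}\right]\right\} I_0\!\left(\dfrac{2}{1-\rho}\sqrt{\dfrac{\rho\, r_i r_{i-1}}{\eta_i \eta_{i-1}}}\right),$

(A-6) $E[r_i \mid r_{i-1}] = \eta_i + (r_{i-1} - \eta_{i-1})\,\rho\,\eta_i/\eta_{i-1},$

(A-7) $\mathrm{Var}[r_i \mid r_{i-1}] = \eta_i^2\left[(1-\rho)^2 + 2\rho(1-\rho)\, r_{i-1}/\eta_{i-1}\right].$

These conditional moments show, e.g., that the conditional mean of $r_i$ exceeds its mean in
proportion to the amount by which $r_{i-1}$ exceeds its mean, and that the conditional mean is a
linearly increasing function of $r_{i-1}$.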
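The conditional moments can be written as two small functions and checked for internal consistency. This is a sketch with illustrative names; the check uses the law of total variance together with the exponential marginal of the previous duration (mean $\eta_{i-1}$, variance $\eta_{i-1}^2$), which should return the exponential variance $\eta_i^2$ for any $\rho$.

```python
def cond_mean(r_prev, eta_i, eta_prev, rho):
    """Conditional mean of r_i given r_{i-1}, as in (A-6)."""
    return eta_i + (r_prev - eta_prev) * rho * eta_i / eta_prev


def cond_var(r_prev, eta_i, eta_prev, rho):
    """Conditional variance of r_i given r_{i-1}, as in (A-7)."""
    return eta_i ** 2 * ((1.0 - rho) ** 2 + 2.0 * rho * (1.0 - rho) * r_prev / eta_prev)


def marginal_variance(eta_i, eta_prev, rho):
    """Var(r_i) rebuilt from the conditional moments via the law of total variance."""
    # cond_var is linear in r_prev, so its expectation is its value at E[r_prev].
    mean_of_var = cond_var(eta_prev, eta_i, eta_prev, rho)
    # cond_mean is linear in r_prev with slope rho * eta_i / eta_prev.
    var_of_mean = (rho * eta_i / eta_prev) ** 2 * eta_prev ** 2
    return mean_of_var + var_of_mean
```

For example, `marginal_variance(3.0, 2.0, 0.3)` returns 9.0 up to rounding, and `cond_var` vanishes at `rho = 1`, the fully predictable case.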

The dynamic programming arguments used here required calculations of conditional
expectations based on (A-5). As is often the case [2], the total positivity properties of
$f(r_i \mid r_{i-1})$ are very useful for determining structural properties of the optimal policy.

It is straightforward to show that both $f(r_i, r_{i-1})$ and $f(r_i \mid r_{i-1})$ are totally positive of all
orders ($TP_\infty$), [5,7]. This means, for $f(r_i, r_{i-1})$, that the following determinants are
nonnegative for any $N$ and any $\alpha_1 < \alpha_2 < \cdots < \alpha_N$; $\beta_1 < \beta_2 < \cdots < \beta_N$:

$\begin{vmatrix} f(\alpha_1, \beta_1) & f(\alpha_1, \beta_2) & \cdots & f(\alpha_1, \beta_N) \\ \vdots & & & \vdots \\ f(\alpha_N, \beta_1) & \cdots & & f(\alpha_N, \beta_N) \end{vmatrix} \ge 0.$

THEOREM: If $h(y)$ is continuous and convex, and satisfies the bounds

(i) $h(0) \le 0$

(ii) $|h(y)| < a + b\,y^{2m}$; $a > 0$, $b > 0$, $y > 0$, $m$ = positive integer;

$g(x) = \int_0^\infty h(y)\, f(y \mid x)\, dy$, and $f(y \mid x)$ is $TP_\infty$, then $g(x)$ is continuous, convex, bounded in the sense

$|g(x)| < a' + b'\,x^{2m}$; $a' > 0$, $b' > 0$, $x > 0$;

and belongs to one of the three following categories:

(I) $g(x) \ge 0$ for all $x \ge 0$,

(II) $g(x) < 0$ for all $x \ge 0$ except with a possible zero at $x = 0$,

(III) there exists a unique $x^*$, $0 < x^* < \infty$, such that $g(x) \ge 0$ for all $x \ge x^*$; and
$g(x) < 0$ for $x < x^*$ except for a possible zero at $x = 0$.

This theorem is used to define optimal decision regions according to the sign of a function like
$g(x)$, with $x^*$ corresponding to a decision threshold.



Edgar A. Cohen, Jr. 

Naval Surface Weapons Center 

White Oak 

Silver Spring, Maryland 


In this paper, a statistical analytic model for evaluation of the performance 
of a standard electric bomb fuze timer is presented. The model is based on 
what is called a selective design assembly, where one item, namely, a resistor, 
is used to time the circuit. In such an assembly, the remaining components are 
chosen a priori from predetermined distributions. Based on the analysis, a gen- 
eral numerical integration scheme is utilized for assessing performance of the 
timer. The results of a computer simulation are also given. In the last section 
of the paper, a theory for evaluation of the yield of two or more timers 
designed to operate in sequence is derived. To appraise such a scheme, a nu- 
merical quadrature routine is developed. 


In this paper, we shall be concerned with the statistical analysis of the bomb fuze timer 
shown in Figure 1. As is common in practice, a standard, or precision, resistor is used to time 
the circuit after the rest of the components have been assembled in a random fashion. Then, 
to meet certain timing requirements to be discussed later, a resistor is selected and introduced 
into the circuit. A number of tests must afterwards be performed in sequence to check the per- 
formance of the product under differing environmental conditions. Such environmental 
influences are, for example, temperature effects, effect of packaging, resistor incrementation (to 
be discussed), and effect of vibration and moisture uptake. In addition, one might have several 
timers which operate sequentially, all fed from the same energy storage capacitor C1 of Figure
1. This paper is devoted to an analysis of such a timer in what is called the ambient
temperature range, whose limits are 70°F and 80°F, respectively. We will also indicate the procedure
for treating analytically the assessment of performance of combinations of several timers. The 
author has been involved in a Monte Carlo study for the Navy of such timers. Previous work 
has involved reliability studies of an entire fuze assembly using these timers [2]. 


The timer indicated in Figure 1 works once the potential difference across the two capacitors
C2 and C3 is sufficient to fire the cold cathode diode tube VT. Capacitors C1 and C3 initially
have the same potential across them. As time progresses, C1 discharges through resistor
RES into C2, while C3 serves as a reference capacitor. Thus, the voltage across C2 builds up






Figure 1. Fuze timer configuration 

until the potential across capacitors C2 and C3 is adequate to fire tube VT. The relationship
between firing time and the values of the circuit components can be derived from a simple first 
order differential equation and is given by 




(2.1)    $t = R\,\dfrac{C_1 C_2}{C_1 + C_2}\,\ln\!\left[\dfrac{V C_1}{V C_1 - (V_T - V)(C_1 + C_2)}\right]$,

where

$C_1$ = capacitance of capacitor C1

$C_2$ = capacitance of capacitor C2

$V$ = supply voltage (potential across C3 and potential initially across C1)

$V_T$ = firing voltage of cold cathode diode tube VT

$R$ = resistance of resistor RES.

To illustrate the pertinent features of the process, write (2.1), for brevity, in the form 

(2.2)    $t = R\,F(C_1, C_2, V, V_T)$.

Note that (2.2) is linear and homogeneous in R, so that R can be used as a scaling parameter. 
This is precisely how it is used when the timer is first assembled. 
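
As a minimal sketch of (2.1) and (2.2), the following Python fragment evaluates $F$ and solves for the resistance that yields a prescribed firing time. The function names are illustrative and do not come from the paper. The component values are the expected values quoted in the numerical example of Section 4, with capacitances in microfarads and resistance in megohms, so that time comes out in seconds.

```python
import math


def timing_factor(c1, c2, v, v_t):
    """F(C1, C2, V, V_T) of (2.2): firing time per unit resistance."""
    y = v * c1 - (v_t - v) * (c1 + c2)
    if y <= 0:
        # The voltage across C2 can never reach V_T - V in this case.
        raise ValueError("tube never fires for these component values")
    return (c1 * c2 / (c1 + c2)) * math.log(v * c1 / y)


def firing_time(r, c1, c2, v, v_t):
    """Firing time t = R * F, linear and homogeneous in R."""
    return r * timing_factor(c1, c2, v, v_t)


def resistance_for_time(t, c1, c2, v, v_t):
    """Resistance that makes the timer fire at time t (R acts as a scale factor)."""
    return t / timing_factor(c1, c2, v, v_t)


# Expected values from the Section 4 example (microfarads, volts, seconds).
r_nominal = resistance_for_time(2.6, 0.44, 0.15, 177.0, 235.0)
print(f"resistance for t = 2.6 s: {r_nominal:.2f} megohms")
```

Because $t$ scales with $R$, doubling the resistance returned by `resistance_for_time` doubles the firing time, which is the property the bin scheme described next relies on.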

In practice, the resistors are supplied in large numbers by the manufacturer, after which
they are tested and sorted by the user into a large number of bins. The resistors in each bin
have resistances, at a standard temperature, which fall into a certain interval. These intervals
are arranged to have the same "percent width", to be described in more detail below. The timer
is to be designed to fire at a nominal time $t_N$. Since capacitors C1 and C2 are chosen at
random from a lot, their capacitances $C_1$ and $C_2$ may be treated as random variables. Likewise,
tube firing voltage $V_T$ may also be considered as a random variable. In general, we shall also
consider the supply voltage $V$ to be random.


Let us agree to denote by $R_0$ that value of $R$ obtained from relation (2.1) when $t = t_N$
and $C_1$, $C_2$, $V$, and $V_T$ are given their expected values at some standard temperature, e.g.,
75°F. For convenience, $R_0$ may be used as a reference resistance, and the bin to which the
reference resistor RES$_0$, of resistance $R_0$, belongs could be called the reference resistor bin. The
interval corresponding to this bin is to contain all resistances which fall between $R_0(1 - \epsilon)$ and
$R_0(1 + \epsilon)$, where $\epsilon$ is a preassigned small positive number. Our second bin will contain all
resistors whose resistances fall between $R_0(1 + \epsilon)$ and $R_0(1 + \epsilon)^2/(1 - \epsilon)$, and the third bin
those resistors whose resistances lie between $R_0(1 - \epsilon)^2/(1 + \epsilon)$ and $R_0(1 - \epsilon)$. In general,
our intervals are to be so constructed that the ratio of right endpoint to left endpoint is always
$(1 + \epsilon)/(1 - \epsilon)$, which, to first order accuracy, is just $1 + 2\epsilon$. Alternatively, one may divide
the difference of the two endpoints of an interval by its midpoint to obtain precisely $2\epsilon$. We
shall, therefore, say that each such interval has "percent width" $2\epsilon$. In setting up the interval
division scheme, a percent increment $\epsilon_1$ is chosen a priori, and then $\epsilon = \epsilon_1/100$. This $\epsilon_1$ is
typically of the order of 1/2 to 1%. Figure 2 is a diagram of this scheme.
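
The interval construction can be sketched in a few lines of Python. The integer indexing used here (bin 0 is the reference bin, positive indices lie above it, negative indices below it) is an assumption made for illustration; the paper only names the reference, second, and third bins. The values 40.16 megohms and $\epsilon = 0.005$ are example inputs.

```python
import math


def bin_ratio(eps):
    """Ratio of right endpoint to left endpoint, identical for every bin."""
    return (1 + eps) / (1 - eps)


def bin_edges(i, r0, eps):
    """Endpoints of bin i, where bin 0 is [r0*(1-eps), r0*(1+eps)]."""
    left = r0 * (1 - eps) * bin_ratio(eps) ** i
    return left, left * bin_ratio(eps)


def bin_index(r, r0, eps):
    """Index of the bin whose interval contains resistance r."""
    return math.floor(math.log(r / (r0 * (1 - eps))) / math.log(bin_ratio(eps)))


def percent_width(i, r0, eps):
    """Endpoint difference divided by the interval midpoint."""
    left, right = bin_edges(i, r0, eps)
    return (right - left) / ((left + right) / 2)


r0, eps = 40.16, 0.005
print(bin_edges(0, r0, eps), bin_edges(1, r0, eps), bin_edges(-1, r0, eps))
print(percent_width(3, r0, eps))
```

Because the endpoints form a geometric progression, the bin index follows from a single logarithm rather than a search over the bins.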


Figure 2. Resistance interval setup

Once again referring to our circuit configuration, where $C_1$, $C_2$, $V$, and $V_T$ are random
variables, let us define

(2.3)    $t_0 = R_0\,F(C_1, C_2, V, V_T)$.

Then, to achieve the nominal time $t_N$, we define our nominal resistance to be

(2.4)    $R_N = R_0\,t_N/t_0$.

Note that, since $t_0$ is a random variable (being a function of the random variables $C_1$, $C_2$, $V$,
and $V_T$), $R_N$ is also a random variable. A technician may use relation (2.4) to determine $R_N$.
Then he picks a resistor RES$_p$ at random from the bin to which resistor RES$_N$ belongs and
integrates such resistor, of resistance $R_p$, into the circuit. This process is called, in fuze
technology parlance, "resistor incrementation." Note that $R_p$ is a random variable which is
statistically dependent on $R_N$ inasmuch as $R_p$ and $R_N$ must lie in the same interval. However, once
attention is restricted to a given interval of the scheme, it is clear that the value of $R_N$ in no
way influences the value of $R_p$, since one is free to select any resistor in the bin to which the
nominal resistor belongs. We shall reemphasize this fact in Section 3. For simplicity we index
the intervals by $i$, letting their left and right endpoints be $r_i$ and $r_{i+1}$, respectively. To achieve
compatibility, the bins should initially be formed and kept at some standard temperature, and 
the timer should be assembled at that same temperature. In practice, this will, in all likelihood, 
not be the case, but one may compensate for this defect by studying the sensitivity of the timer 
to changes in bin interval width. For example, if by doubling the interval width, the overall 
change in performance is insignificant, it may be safely assumed that such a discrepancy was 
unimportant (provided the distributions due to ambient temperature variations are of small 


The problem of determining the probability of operation of the timer within two given
times, say $t_1$ and $t_2$, when there is no effect other than resistor selection is not difficult. (We
also ignore, in this section, the effect of tube firing voltage variation from one firing to the
next. This phenomenon will be discussed in some detail in Section 4.) The reason is that the
time is linear and homogeneous in resistance $R$. In fact, the bins have been designed to take
advantage of this feature, and we shall show that the probability interval is independent of the
bin in which resistor RES$_N$ falls.

First of all, let $t^{(i)}_{\min}$ and $t^{(i)}_{\max}$ be the minimum and maximum times, respectively,
obtainable when the nominal resistance $R_N$ and the picked resistance $R_p$ come from a given bin $i$.
Also, let $F^{(i)}_{\min}$ and $F^{(i)}_{\max}$ be the smallest and largest values of $F$, respectively, given only $t_N$ and
knowing that $R_N$ comes from that bin. It follows that

(3.1)    $t^{(i)}_{\min} = r_i F^{(i)}_{\min} = r_i t_N / r_{i+1}$

(3.2)    $t^{(i)}_{\max} = r_{i+1} F^{(i)}_{\max} = r_{i+1} t_N / r_i$.

Therefore, given that $R_N$ and $R_p$ lie in interval $i$,

(3.3)    $r_i t_N / r_{i+1} \le t \le r_{i+1} t_N / r_i$.

Since $r_i / r_{i+1} = (1 - \epsilon)/(1 + \epsilon)$,

(3.4)    $(1 - \epsilon)/(1 + \epsilon) \le t/t_N \le (1 + \epsilon)/(1 - \epsilon)$,

independent of bin interval. In other words, (3.4) is true with probability 1.
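
The bound (3.4) can be checked with a small Monte Carlo sketch of resistor incrementation. The normal distribution assumed for $R_N$ (mean 40.16, standard deviation 4.04, taken from the Section 4 example), the uniform pick within the bin, the seed, and the function name are all illustrative assumptions. The ratio $R_p/R_N$ equals $t/t_N$ because $t = R_p F$ and $t_N = R_N F$.

```python
import math
import random


def simulate_time_ratios(n, r0=40.16, eps=0.005, mean=40.16, sd=4.04, seed=1):
    """Return n samples of t/t_N = R_p/R_N under resistor incrementation."""
    rng = random.Random(seed)
    q = (1 + eps) / (1 - eps)          # right/left endpoint ratio of every bin
    ratios = []
    while len(ratios) < n:
        r_n = rng.gauss(mean, sd)      # nominal resistance (assumed normal)
        if r_n <= 0:
            continue                   # discard non-physical draws
        i = math.floor(math.log(r_n / (r0 * (1 - eps))) / math.log(q))
        left = r0 * (1 - eps) * q ** i
        r_p = rng.uniform(left, left * q)   # any resistor from the same bin
        ratios.append(r_p / r_n)
    return ratios


eps = 0.005
ratios = simulate_time_ratios(10_000, eps=eps)
lower, upper = (1 - eps) / (1 + eps), (1 + eps) / (1 - eps)
print(lower, min(ratios), max(ratios), upper)
```

Every simulated ratio should fall inside the interval of (3.4) regardless of which bin the nominal resistance lands in.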

Generally, suppose that one is interested in the probability that firing time falls between
two prescribed limits about the nominal time. Consider once more a given bin $i$. Let us denote
by $R_N^{(i)}$ and $R_p^{(i)}$ random variables derived from $R_N$ and $R_p$, respectively, under the condition
that $R_N$ and, therefore, $R_p$ must lie in interval $i$. From our discussion in Section 2, it is clear
that these new random variables must be independent. Let $t_1$ and $t_2$ be the lower and upper
limits, respectively, on firing time. For any given value of the random variable $R_N^{(i)}$, one can
determine limits on the random variable $R_p^{(i)}$ so that the firing time lies between $t_1$ and $t_2$.
Since, by definition, $t_N = R_N^{(i)} F$, it follows that $R_p^{(i)}$ cannot be less than

(3.5)    $t_1/F = t_1 R_N^{(i)}/t_N$.

Similarly, $R_p^{(i)}$ cannot exceed

(3.6)    $t_2 R_N^{(i)}/t_N$.

One must, of course, realize that (3.5) may be smaller than $r_i$ and (3.6) larger than $r_{i+1}$ for
values of $R_N^{(i)}$ close to $r_i$ and $r_{i+1}$, respectively.

If we let $g(R_N)$ be the density function of the random variable $R_N$ defined by (2.3) and
(2.4), whose range is a function of the domain of $C_1$, $C_2$, $V$, and $V_T$, then the induced random
variable $R_N^{(i)}$ has conditional density

(3.7)    $g^{(i)}(R_N^{(i)}) = g(R_N)/P(r_i \le R_N \le r_{i+1}) = g(R_N) \Big/ \int_{r_i}^{r_{i+1}} g(R_N)\,dR_N$.

The range of $R_N^{(i)}$ is restricted to the interval $[r_i, r_{i+1}]$. Using the mean value theorem of
integral calculus, (3.7) becomes

(3.8)    $g^{(i)}(R_N^{(i)}) = g(R_N)/[g(\xi)(r_{i+1} - r_i)]$,    $r_i < \xi < r_{i+1}$.

If $r_{i+1} - r_i$ is sufficiently small, one sees that

(3.9)    $g^{(i)}(R_N^{(i)}) \approx 1/(r_{i+1} - r_i)$.




Similarly, let $f^{(i)}(R_p^{(i)})$ be the density function for picked resistance $R_p^{(i)}$, whose range is
likewise restricted to $[r_i, r_{i+1}]$. Then, with the knowledge that $R_N^{(i)}$ and $R_p^{(i)}$ are independent
random variables, and, letting $P_i(t_1 \le t \le t_2)$ be the probability that firing time falls between $t_1$
and $t_2$ (given that $R_N$ and $R_p$ come from interval $i$),

(3.10)    $P_i(t_1 \le t \le t_2) = \int_{r_i}^{r_{i+1}} \int_{t_1 R_N^{(i)}/t_N}^{t_2 R_N^{(i)}/t_N} g^{(i)}(R_N^{(i)})\, f^{(i)}(R_p^{(i)})\, dR_p^{(i)}\, dR_N^{(i)}$.

We take the liberty of defining $f^{(i)}(R_p^{(i)}) = 0$ in (3.10) whenever $R_p^{(i)} \notin [r_i, r_{i+1}]$. This is done
purely for the sake of convenience of notation even though the range of $R_p^{(i)}$ is $[r_i, r_{i+1}]$.

The probability that the time falls between $t_1$ and $t_2$ is expressed by

(3.11)    $P(t_1 \le t \le t_2) = \sum_i p_i\,P_i(t_1 \le t \le t_2)$,

where $p_i$ is the probability of choosing bin $i$.

As we have previously indicated, if $r_{i+1} - r_i$ is sufficiently small, we can assume, for all
practical purposes, that $R_N^{(i)}$ is a uniformly distributed random variable. The picked resistance
$R_p^{(i)}$ should also be a uniformly distributed random variable if all resistors in bin $i$ are equally
likely to appear. In other words, let us assume that

(3.12)    $g^{(i)}(R_N^{(i)}) = f^{(i)}(R_p^{(i)}) = 1/(r_{i+1} - r_i)$.

Suppose then that one asks for the probability that $t_1 = t_N(1 - \delta) \le t \le t_N(1 + \delta) = t_2$ for a
given small $\delta$. We proceed to derive closed form expressions for this probability. Three cases
naturally arise, the first of which is shown in Figure 3 below. For brevity, we shall drop the
superscript $i$ in this figure and the two following figures. In this diagram, the interior of the
quadrilateral formed by the lines $R_N = r_i$, $R_N = r_{i+1}$, $R_p = t_1 R_N/t_N$, and $R_p = t_2 R_N/t_N$ is the
region of integration. Note that, in the two hatched regions, $f^{(i)}(R_p^{(i)}) = 0$, since then either
$R_p < r_i$ or $R_p > r_{i+1}$. Let $r_N^{(0)} = r_i/(1 - \delta)$ be the value of $R_N$ at which the line
$R_p = t_1 R_N/t_N$ meets $R_p = r_i$, and let $r_N^{(1)} = r_{i+1}/(1 + \delta)$ be the value at which the line
$R_p = t_2 R_N/t_N$ meets $R_p = r_{i+1}$. After a small computation, one sees that the inequality
$r_N^{(0)} \le r_N^{(1)}$ is equivalent to

(3.13)    $0 \le \delta \le \epsilon$.

We also note that, using (3.12), (3.10) represents the normalized area of the interior of the
hexagon shown in Figure 3, bounded by the lines $R_N = r_i$, $R_N = r_{i+1}$, $R_p = t_1 R_N/t_N$,
$R_p = t_2 R_N/t_N$, $R_p = r_i$, and $R_p = r_{i+1}$. Therefore,

(3.14)    $P_i(t_N(1 - \delta) \le t \le t_N(1 + \delta)) = \dfrac{1}{(r_{i+1} - r_i)^2}\left[\int_{r_i}^{r_i/(1-\delta)} \int_{r_i}^{(1+\delta)R_N} dR_p\,dR_N + \int_{r_i/(1-\delta)}^{r_{i+1}/(1+\delta)} \int_{(1-\delta)R_N}^{(1+\delta)R_N} dR_p\,dR_N + \int_{r_{i+1}/(1+\delta)}^{r_{i+1}} \int_{(1-\delta)R_N}^{r_{i+1}} dR_p\,dR_N\right]$

$= \dfrac{\delta}{8\epsilon^2}\left[\dfrac{(1 + \epsilon)^2(2 + \delta)}{1 + \delta} - \dfrac{(1 - \epsilon)^2(2 - \delta)}{1 - \delta}\right]$,    $0 \le \delta \le \epsilon$.

It follows that $P_i$ is independent of $i$. From (3.11),

(3.15)    $P(t_1 \le t \le t_2) = P_i(t_1 \le t \le t_2)$.



Figure 3. Picked resistance versus nominal resistance (Region 1)

The second case occurs when $r_i \le r_N^{(1)} \le r_N^{(0)} \le r_{i+1}$. This situation is indicated in
Figure 4. One can also show that $r_N^{(0)} = r_{i+1}$ when $\delta = 2\epsilon/(1 + \epsilon)$ and that $r_N^{(1)} = r_i$ when
$\delta = 2\epsilon/(1 - \epsilon)$. Therefore, the situation illustrated in Figure 4 occurs when
$\epsilon \le \delta \le 2\epsilon/(1 + \epsilon)$. A third case will occur when $2\epsilon/(1 + \epsilon) \le \delta \le 2\epsilon/(1 - \epsilon)$, as illustrated
in Figure 5, where the dotted region is now a pentagon. For $\delta \ge 2\epsilon/(1 - \epsilon)$, the dotted region
becomes the interior of a rectangle completely enclosed in the sector, so that the probability
becomes unity. In the third case, one sees that $r_i \le r_N^{(1)} \le r_{i+1} \le r_N^{(0)}$. When one integrates
over the interior of the quadrilateral outlined in Figure 4, one again obtains the closed form
given in (3.14). Therefore, (3.14) is valid whenever $0 \le \delta \le 2\epsilon/(1 + \epsilon)$. The case illustrated
in Figure 5 is different. When we integrate over the interior of the pentagon, which is that
portion of the region of integration for which the integrand of (3.10) is nonzero, we find that

(3.16)    $P_i(t_N(1 - \delta) \le t \le t_N(1 + \delta)) = \dfrac{4\epsilon^2 + 4\epsilon(1 + \epsilon)\delta - (1 - \epsilon)^2\delta^2}{8\epsilon^2(1 + \delta)}$,    $\dfrac{2\epsilon}{1 + \epsilon} \le \delta \le \dfrac{2\epsilon}{1 - \epsilon}$.

One easily shows that (3.16) becomes unity when $\delta = 2\epsilon/(1 - \epsilon)$ is substituted.
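
The closed forms (3.14) and (3.16) can be collected into one function and compared against a direct numerical evaluation of the normalized area in (3.10). This is only a sketch: the function names and the midpoint-rule check are illustrative, and the bin is rescaled so that its center is 1, which is harmless because the result does not depend on the bin.

```python
def p_within(delta, eps):
    """Closed-form probability that t_N(1-delta) <= t <= t_N(1+delta)."""
    if delta <= 0:
        return 0.0
    if delta <= 2 * eps / (1 + eps):          # Regions 1 and 2, formula (3.14)
        return delta / (8 * eps ** 2) * (
            (1 + eps) ** 2 * (2 + delta) / (1 + delta)
            - (1 - eps) ** 2 * (2 - delta) / (1 - delta)
        )
    if delta < 2 * eps / (1 - eps):           # Region 3, formula (3.16)
        return (
            4 * eps ** 2 + 4 * eps * (1 + eps) * delta - (1 - eps) ** 2 * delta ** 2
        ) / (8 * eps ** 2 * (1 + delta))
    return 1.0                                # band covers the whole bin


def p_within_numeric(delta, eps, n=2000):
    """Midpoint-rule area of the admissible (R_N, R_p) region, bin center = 1."""
    a, b = 1 - eps, 1 + eps
    h = (b - a) / n
    area = 0.0
    for k in range(n):
        x = a + (k + 0.5) * h                 # value of R_N
        hi = min((1 + delta) * x, b)          # upper limit on R_p
        lo = max((1 - delta) * x, a)          # lower limit on R_p
        area += max(hi - lo, 0.0) * h
    return area / (b - a) ** 2


for d in (0.002, 0.005, 0.008, 0.0099, 0.01, 0.0101):
    print(d, p_within(d, 0.005), p_within_numeric(d, 0.005))
```

For $\delta = \epsilon$ the closed form reduces to 3/4, which gives a quick hand check of the implementation.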


The analysis of the timer when temperature and cold cathode diode firing voltage variations
are considered is different from that of the previous section, since all components except
for the resistor enter the time nonlinearly. It would then be necessary, at least in principle, to
take into consideration the probabilities $p_i$ of picking the bins as well as the probabilities for
picked resistance once a bin has been selected. However, if the variations due to these effects
are relatively small, one should again see probabilities essentially independent of the bin
selected. Furthermore, in a situation like this wherein certain distributions are quite tight, i.e.,
are of small variance, some simplifying assumptions can be made. We shall get to these
presently. Again, as before, we assume that the bin intervals are so small that we may reasonably
suppose that (3.12) is true. Note also that (2.3) and (2.4) express $R_N$ in terms of $t_N$, $C_1$,
$C_2$, $V$, and $V_T$. Assume now that $C_1$, $C_2$, $V$, and $V_T$ are independent, normally distributed
random variables. Suppose, as is common in practice when coefficients of variation are


Figure 4. Picked resistance versus nominal resistance (Region 2)

Figure 5. Picked resistance versus nominal resistance (Region 3)

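small [4, pp. 246-251], that $R_N$ is linearized about the expected values of capacitances $C_1$ and
$C_2$, supply voltage $V$, and tube breakdown voltage $V_T$. Now a linear function of independent,
normally distributed random variables is again a normally distributed random variable, and,
from (2.3) and (2.4), it follows [4, p. 118] that

(4.1)    $E(R_N) = \dfrac{t_N\,(C_{1,E} + C_{2,E})}{C_{1,E}\,C_{2,E}\,\ln\!\left[\dfrac{V_E C_{1,E}}{V_E C_{1,E} - (V_{T,E} - V_E)(C_{1,E} + C_{2,E})}\right]}$

and

(4.2)    $\mathrm{var}(R_N) = \left(\dfrac{\partial R_N}{\partial C_1}\right)_E^2 \mathrm{var}\,C_1 + \left(\dfrac{\partial R_N}{\partial C_2}\right)_E^2 \mathrm{var}\,C_2 + \left(\dfrac{\partial R_N}{\partial V}\right)_E^2 \mathrm{var}\,V + \left(\dfrac{\partial R_N}{\partial V_T}\right)_E^2 \mathrm{var}\,V_T$.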



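Here the subscript $E$ indicates evaluation at expected values and var represents the variance
operator. Now, clearly,

(4.3)    $\dfrac{\partial R_N}{\partial C_i} = -\dfrac{t_N}{F^2}\,\dfrac{\partial F}{\partial C_i}$,    $i = 1, 2$,

with similar expressions for $\partial R_N/\partial V$ and $\partial R_N/\partial V_T$. The relevant partial derivatives of $F$ are
given by

(4.4)    $\dfrac{\partial F}{\partial C_1} = \dfrac{C_2^2}{C_1 + C_2}\left[\dfrac{X}{C_1 + C_2} - \dfrac{V_T - V}{Y}\right]$,

         $\dfrac{\partial F}{\partial C_2} = \dfrac{C_1}{C_1 + C_2}\left[\dfrac{C_1 X}{C_1 + C_2} + \dfrac{C_2 (V_T - V)}{Y}\right]$,

         $\dfrac{\partial F}{\partial V_T} = \dfrac{C_1 C_2}{Y}$,

         $\dfrac{\partial F}{\partial V} = -\dfrac{C_1 C_2 V_T}{V\,Y}$,

where

$X = \ln\!\left[\dfrac{V C_1}{V C_1 - (V_T - V)(C_1 + C_2)}\right]$

and

$Y = V C_1 - (V_T - V)(C_1 + C_2)$.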

Now $p_i$ represents the probability of choosing bin $i$, and that is precisely the probability that the
random variable $R_N$ belongs to bin $i$. Furthermore, because we are now assuming that $R_N$ is a
linear function of the independent, normal random variables $C_1$, $C_2$, $V$, and $V_T$, $R_N$ is likewise
normal. Therefore, letting $\xi = E(R_N)$ and $\sigma^2 = \mathrm{var}(R_N)$, one has

(4.5)    $p_i = \dfrac{1}{\sigma\sqrt{2\pi}} \int_{r_i}^{r_{i+1}} e^{-(r - \xi)^2/2\sigma^2}\,dr = \dfrac{1}{\sqrt{2\pi}} \int_{v_1}^{v_2} e^{-\lambda^2/2}\,d\lambda$,

where $v_1 = (r_i - \xi)/\sigma$ and $v_2 = (r_{i+1} - \xi)/\sigma$, so that $p_i$ may be readily calculated from tables.

Supposing that the picked resistor and the other components are subject to a temperature
change from the standard temperature, we must compute the effect of such a change, together
with the resistor incrementation effect of Section 3, in order to obtain the probability of
satisfying the specification. It will be assumed in our analysis that the ambient temperature is a
uniformly distributed random variable whose range is given by $T_1 \le T \le T_2$. If
$P(t_1 \le t \le t_2 \mid T)$ is the probability of meeting the time limits for a given temperature $T$, then

(4.6)    $P(t_1 \le t \le t_2) = \int_{T_1}^{T_2} P(t_1 \le t \le t_2 \mid T)\,p(T)\,dT = \dfrac{1}{T_2 - T_1} \int_{T_1}^{T_2} P(t_1 \le t \le t_2 \mid T)\,dT$.

Let us give an example of the computation of the nominal resistor distribution. Supposing
in (4.1) that $t_N = 2.6$ seconds, $C_{1,E} = .44\ \mu$f, $C_{2,E} = .15\ \mu$f, $V_E = 177$ v., and $V_{T,E} = 235$ v.,
one finds that $E(R_N) = 40.16$ megohms. Also, one finds from (4.3) and similar expressions,
upon inserting expected values, that $\partial R_N/\partial C_1 = 8.65$, $\partial R_N/\partial C_2 = -293.12$, $\partial R_N/\partial V = -1.26$,
and $\partial R_N/\partial V_T = -0.95$. Let us assume the following standard deviations: $\sigma(C_1) = 0.0073$,
$\sigma(C_2) = 0.0025$, $\sigma(V) = 0.17$, and $\sigma(V_T^E) = 4.17$, where $V_T^E$ is used to denote the expected
breakdown voltage of a diode chosen from a lot. The expected values of the breakdown voltages
of all the tubes are themselves assumed to follow a normal distribution with expected
value 235 v. and with the above $\sigma$. In addition, each tube has a firing voltage which varies
about its expected value. This new random variable, with expected value 0, we denote by $\Delta V_E$,
and it is assumed that $\Delta V_E$ is also normally distributed. The random variable $V_T$, which
represents the firing voltage of a tube selected from a lot, is actually formed as a sum
$V_T = V_T^E + \Delta V_E$, where we shall suppose that $\Delta V_E$ is independent of $V_T^E$. Also, tests performed
by fuze specialists indicate that the random variables $\Delta V_E$ have the same distribution
from one tube to the next. Assuming that $\sigma(\Delta V_E) = 0.24$, it follows that $\sigma(V_T) = 4.17$.
Then, from (4.2), var $R_N = 16.3235$, or $\sigma(R_N) = 4.04$. Therefore, the coefficient of variation
is 0.10, which is reasonably small.
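
The variance propagation of (4.2) is easy to reproduce from the sensitivities and standard deviations quoted above. The sketch below uses those quoted figures as inputs; the dictionary keys and variable names are illustrative. With the rounded inputs the variance comes out near 16.3, and the standard deviation and coefficient of variation agree with the quoted 4.04 and 0.10 to the precision given.

```python
import math

# Sensitivities dR_N/dx at expected values, as quoted in the text.
sensitivity = {"C1": 8.65, "C2": -293.12, "V": -1.26, "VT": -0.95}
# Standard deviations of the component values, as quoted in the text.
std_dev = {"C1": 0.0073, "C2": 0.0025, "V": 0.17, "VT": 4.17}

# Linearized variance: sum of squared (sensitivity * standard deviation).
contributions = {k: (sensitivity[k] * std_dev[k]) ** 2 for k in sensitivity}
var_rn = sum(contributions.values())
sigma_rn = math.sqrt(var_rn)
coeff_variation = sigma_rn / 40.16     # E(R_N) = 40.16 megohms

print(contributions)
print(var_rn, sigma_rn, coeff_variation)
```

The breakdown by component shows that the tube firing voltage term dominates the variance of $R_N$ in this example.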

We now develop a general method for evaluating the performance of the timer which is
based on a linear theory. Hopefully, this theory will yield at least conservative estimates. Our
formula is a generalization of that given in paragraph 3. First of all, from (2.3) and (2.4), it
follows that

(4.7)    $R_N = t_N/F(C_1, C_2, V, V_T^{(1)})$,

where $V_T^{(1)} = V_T^E + \Delta V_E^{(1)}$. Therefore, solving (4.7) for $V_T^E$, where $F(C_1, C_2, V, V_T^{(1)})$ is given
through (2.1) and (2.2), one finds that

(4.8)    $V_T^E = V C_1/(C_1 + C_2) - (\Delta V_E^{(1)} - V) - V C_1\,e^{-t_N/(R_N C_{\mathrm{eff}})}/(C_1 + C_2)$.

Here $C_{\mathrm{eff}}$, defined by $1/C_{\mathrm{eff}} = 1/C_1 + 1/C_2$, is the effective series capacitance of $C_1$ and $C_2$,
and $\Delta V_E^{(1)}$ denotes that variation in tube firing voltage from its expected value which is associated
with determination of the nominal resistance $R_N$. For brevity, we let $g(C_1, C_2, V, R_N, \Delta V_E^{(1)})$ represent the
right hand side of (4.8). There is, however, a second variation, which we shall denote by
$\Delta V_E^{(2)}$, that occurs once a resistor has been selected from a bin and the timer actually operated.
These two variations must be taken into account carefully when assessing timer performance.
One may now make a 1-1 transformation from the space of $(C_1, C_2, V, \Delta V_E^{(1)}, \Delta V_E^{(2)}, V_T^E)$ to
that of $(C_1, C_2, V, \Delta V_E^{(1)}, \Delta V_E^{(2)}, R_N)$ through the map

(4.9)    $C_1 = C_1$,  $C_2 = C_2$,  $V = V$,  $\Delta V_E^{(1)} = \Delta V_E^{(1)}$,  $\Delta V_E^{(2)} = \Delta V_E^{(2)}$,

         $V_T^E = g(C_1, C_2, V, R_N, \Delta V_E^{(1)})$,

whose Jacobian is $\partial V_T^E/\partial R_N$. It follows [3, pp. 56-62] that the density function for the state
$(C_1, C_2, V, \Delta V_E^{(1)}, \Delta V_E^{(2)}, R_N)$ is

(4.10)    $f(C_1, C_2, V, \Delta V_E^{(1)}, \Delta V_E^{(2)}, R_N) = p_1(C_1)\,p_2(C_2)\,p_3(V)\,p_4(\Delta V_E^{(1)})\,p_5(\Delta V_E^{(2)})\,p_6(g(C_1, C_2, V, R_N, \Delta V_E^{(1)}))\,\left|\partial V_T^E/\partial R_N\right|$,

where $p_i(C_i)$ ($i = 1, 2$) are the densities for $C_i$, $p_3$ is the density for $V$, $p_4$ the density for
$\Delta V_E^{(1)}$, $p_5$ the density for $\Delta V_E^{(2)}$, and $p_6$ the density for $V_T^E$. These random variables are all
assumed to be independent. In addition, $\Delta V_E^{(1)}$ and $\Delta V_E^{(2)}$ are identically distributed. Next
account must be taken of the fact that, because of a change in temperature, the capacitances $C_i$
will change in value. In fact, we assume that $C_i(T)$, where $T$ denotes temperature, is of the form

(4.11)    $C_i(T) = C_i(1 + K_i(T - T_E)/100)$,

where $K_i$ represents a random percent change per degree from the expected temperature $T_E$.
Thus $C_i(T)$ is a product convolution [3, pp. 56-62] of $C_i$ and the second factor, which we
denote by $ACP_i(T)$ (representing a percentage change in $C_i$ due to a temperature change from
expected value $T_E$ to $T$). We then form the joint density $h(C_1, C_2, ACP_1(T), ACP_2(T), V,
\Delta V_E^{(1)}, \Delta V_E^{(2)}, R_N)$ from $f$ and the densities for these percent changes. Afterwards, $h$ is
multiplied by $p(R_p(T))$, the convolution density of picked resistance at temperature, where

(4.12)    $R_p(T) = R_p(1 + C(T - T_E)/100)$

and $C$ is a random percent change per degree. Finally, if we are interested in the conditional
density for any given bin $i$, we must divide by $p_i$, the probability of choice of bin $i$. It is clear
that, in order for the time output of the timer to fall between two chosen values $t_1$ and $t_2$,
$R_p(T)$ must lie between

$t_1/F(C_1(T), C_2(T), V, V_T^{(2)})$

and

$t_2/F(C_1(T), C_2(T), V, V_T^{(2)})$,

where $V_T^{(2)} = V_T^E + \Delta V_E^{(2)}$ with $V_T^E$ given by (4.8). Also, from (4.11),

(4.13)    $C_i(T) = C_i\,ACP_i(T)$.

Now let $X_T = (C_1, C_2, ACP_1(T), ACP_2(T), V, \Delta V_E^{(1)}, \Delta V_E^{(2)})$. There follows the general
multiple integration formula, which expresses the probability $P_i$ that the time falls between $t_1$
and $t_2$ for bin $i$ and conditioned on temperature $T$:

(4.14)    $p_i\,P_i(t_1 \le t \le t_2 \mid T) = \int_{r_i}^{r_{i+1}} \int_{X_T \in R^7(-\infty,\infty)} \int_{t_1/F}^{t_2/F} p(R_p(T))\,h(X_T, R_N)\,dR_p(T)\,dX_T\,dR_N$,

where $R^7(-\infty, \infty)$ represents the seven-fold Cartesian product of the real line. Finally,

(4.15)    $P(t_1 \le t \le t_2) = \dfrac{1}{T_2 - T_1} \sum_{i=-\infty}^{\infty} p_i \int_{T_1}^{T_2} P_i(t_1 \le t \le t_2 \mid T)\,dT$,

given that the temperature distribution is uniform. This integration procedure could be
accomplished on a digital computer through use of numerical Gaussian quadrature and
Gauss-Hermite quadrature [5, pp. 130-132]. However, instead of using this general nonlinear
approach, we find it convenient, in the present context, to linearize the products given by
(4.11) and (4.12) and to make use of a linearized version of $R_N$ given by

(4.16)    $R_N = R_N(C_{1,E}, C_{2,E}, V_E, V_{T,E}^{(1)}) + A_1(C_1 - C_{1,E}) + A_2(C_2 - C_{2,E}) + A_3(V - V_E) + A_4(V_T - V_{T,E}^{(1)})$,

where, of course,

$A_1 = \dfrac{\partial R_N}{\partial C_1}$,  $A_2 = \dfrac{\partial R_N}{\partial C_2}$,  $A_3 = \dfrac{\partial R_N}{\partial V}$,  $A_4 = \dfrac{\partial R_N}{\partial V_T}$

are evaluated at the expected values for the components and $V_{T,E}^{(1)}$ represents the expected value
of random variable $V_T^{(1)}$. (4.11) now becomes

(4.17)    $C_i(T) = C_{i,E}(K_i - K_{i,E})(T - T_E)/100 + C_i(1 + K_{i,E}(T - T_E)/100)$,

where $K_{i,E}$ represents the expected value of $K_i$, and (4.12) becomes

(4.18)    $R_p(T) = [1 + C_E(T - T_E)/100]R_p + R_c(C - C_E)(T - T_E)/100$,

where $R_c$ is the center of the bin considered. Note that the effect of (4.17) and (4.18) is to
replace product convolutions by convolutions of sums of random variables when it comes to
computing densities. Also, supposing that $t_1 = t_N(1 - \delta)$ and $t_2 = t_N(1 + \delta)$, the limits on the
innermost integral of (4.14) become $t_1/F = (1 - \delta)t_N/F$ and $t_2/F = (1 + \delta)t_N/F$, respectively.
The functional form $t_N/F$ is to be replaced by the linearized version (4.16) with $C_1(T)$, $C_2(T)$,
and $V_T^{(2)}$ substituted for $C_1$, $C_2$, and $V_T$, respectively. We have, therefore, after a small
computation,

(4.19)    $t_1/F = (1 - \delta)[R_N + A_1 \Delta C_1(T) + A_2 \Delta C_2(T) + A_4(\Delta V_E^{(2)} - \Delta V_E^{(1)})]$

and, likewise,

(4.20)    $t_2/F = (1 + \delta)[R_N + A_1 \Delta C_1(T) + A_2 \Delta C_2(T) + A_4(\Delta V_E^{(2)} - \Delta V_E^{(1)})]$,

where $\Delta C_i(T) = C_i(T) - C_i$. When $C_1$, $C_2$, $V$, and $V_T$ are independent, normally distributed
random variables, the analysis is a bit simpler, since it is easily seen that, in this case, the pair
$(R_N, \Delta V_E^{(1)})$ is bivariate normal [3, p. 162]. In addition, one notes that (4.19) and (4.20) do
not depend on $C_1$, $C_2$, and $V$, in the linear analysis. In Section 6, we present a numerical
example following this procedure. It may be noted, by analogy with the development in
paragraph 3, that the condition $t_1/F \le R_p(T) \le t_2/F$ is equivalent to requiring that $R_p(T)$ lie
between two hyperplanes in the six-dimensional $(R_N, \Delta C_1, \Delta C_2, \Delta V_E^{(1)}, \Delta V_E^{(2)}, R_p(T))$ space.


When the timer is actually packaged, or potted, this procedure will produce statistical
changes in the component values. These changes are known in the trade as potting shifts.
Such shifts can be taken into account by convolutions of the densities previously determined
with those densities evolving from the operation of potting. This has an effect on such items as
the picked resistor, the capacitors, and the voltage regulator. Generally, potting shifts are
represented as percentage changes from previous values, and, therefore, strictly speaking, we
have another product convolution to consider. For example, we represent the value of
resistance due to temperature and potting by

(5.1)    $R_{\mathrm{pot}}(T) = R_p(T)(1 + CHG_1/100)$,

where the subscript pot denotes potting and $CHG_1$ represents a random percent change from
the value of picked resistance at temperature. If we linearize $R_{\mathrm{pot}}(T)$ about nominal values, we
find that

(5.2)    $R_{\mathrm{pot}}(T) = (1 + CHG_{1,E}/100)R_p(T) + R_E(T)(CHG_1 - CHG_{1,E})/100$,

where $R_E(T)$ is the expected value of picked resistance at temperature for the given bin and
$CHG_{1,E}$ is the expected value of $CHG_1$. From (4.18), this is given, to a first approximation, by

(5.3)    $R_E(T) = [1 + C_N(T - T_N)/100]R_c$,

where, as before, $R_c$ is the center of the bin interval. As for the capacitances, we assume a
similar relation,

(5.4)    $C_{i,\mathrm{pot}}(T) = C_i(T)(1 + CHG_2/100)$,

so that we would linearize $C_{i,\mathrm{pot}}(T)$ about nominal values in a manner analogous to that for
$R_{\mathrm{pot}}(T)$. Lastly, the voltage regulator value after potting is representable by

(5.5)    $V_{\mathrm{pot}} = V + CHG_3$.

Hence, we need only go back through our analysis with $R_p(T)$ replaced by $R_{\mathrm{pot}}(T)$, $C_i(T)$
replaced by $C_{i,\mathrm{pot}}(T)$, and $V$ replaced by $V_{\mathrm{pot}}$. It is assumed that $V_T$, the cold cathode diode
tube firing voltage, is unaffected by potting. One more integration, corresponding to $CHG_3$, is
introduced in order to take account of the change in regulator voltage due to potting.



Using a CDC 6600 computer, we were able to develop a computer code which can be 
used to predict efficiently the performance of the timer under the linearity assumptions outlined 
in the two previous paragraphs. The integration scheme developed will, in this paragraph, be 
discussed in some detail. A listing of the computer code used can be provided on request. 

First of all, in (4.18), we assume that $R_p$ has a uniform distribution across the bin which
is being considered and that $C$ is normally distributed. Let us suppose, as an example, that
$C_E = -0.0235$, $T_E = 75$°F, and $\sigma(C) = 0.0078$. Then, of course, from (4.18),

(6.1)    $R_p(T) = [1 - 0.0235(T - 75)/100]R_p + R_c(C + 0.0235)(T - 75)/100$.

Therefore, $R_p(T)$ is a sum of two independent random variables, one of which is uniform and
the other of which is normal and of mean 0. It follows that

(6.2)    $p(R_p(T)) = \dfrac{1}{\sqrt{2\pi}\,(0.000078)\,|T - 75|\,R_c\,(r_{i+1} - r_i)[1 - 0.000235(T - 75)]} \int_{(1 - 0.000235(T - 75))r_i}^{(1 - 0.000235(T - 75))r_{i+1}} \exp\left\{-\dfrac{1}{2}\left[\dfrac{R_p(T) - u}{R_c(0.000078)(T - 75)}\right]^2\right\} du$.

Letting

$v = (u - R_p(T))/R_c(0.000078)|T - 75|$,

(6.2) is converted into

(6.3)    $p(R_p(T)) = \dfrac{1}{\sqrt{2\pi}\,(r_{i+1} - r_i)[1 - 0.000235(T - 75)]} \int_{v_1}^{v_2} e^{-v^2/2}\,dv$,

where

(6.4)    $v_1 = [(1 - 0.000235(T - 75))r_i - R_p(T)]/R_c(0.000078)|T - 75|$

(6.5)    $v_2 = [(1 - 0.000235(T - 75))r_{i+1} - R_p(T)]/R_c(0.000078)|T - 75|$.
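
The density (6.3) is a difference of two standard normal distribution functions divided by the scaled bin width, so it can be evaluated directly with the error function. The sketch below is illustrative only: the function names are not from the paper, the default parameters are the example values $C_E = -0.0235$, $\sigma(C) = 0.0078$, and $T_E = 75$, and the bin is taken to be centered at 40.16 megohms with $\epsilon = 0.005$.

```python
import math


def normal_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def picked_resistance_density(r, r_lo, r_hi, r_c, temp,
                              c_e=-0.0235, sigma_c=0.0078, t_e=75.0):
    """Density of R_p(T) at r for a bin [r_lo, r_hi] with center r_c, per (6.3)."""
    scale = 1.0 + c_e * (temp - t_e) / 100.0            # factor multiplying R_p
    spread = r_c * sigma_c * abs(temp - t_e) / 100.0    # std. dev. of the normal part
    if spread == 0.0:
        # At the standard temperature the density is the plain uniform density.
        return 1.0 / (r_hi - r_lo) if r_lo <= r <= r_hi else 0.0
    v1 = (scale * r_lo - r) / spread                    # (6.4)
    v2 = (scale * r_hi - r) / spread                    # (6.5)
    return (normal_cdf(v2) - normal_cdf(v1)) / ((r_hi - r_lo) * scale)


r_c, eps = 40.16, 0.005
r_lo, r_hi = r_c * (1 - eps), r_c * (1 + eps)
plateau = picked_resistance_density(r_c, r_lo, r_hi, r_c, temp=80.0)
print(plateau, 1.0 / ((r_hi - r_lo) * (1 - 0.000235 * 5)))
```

Well inside the bin the two distribution functions saturate at 1 and 0, which is why the density can be treated as a constant there, as the case analysis that follows makes precise.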

Several cases now arise according to the value of $R_p(T)$ and according to whether or not
$T \ge 75$°F. We first consider the case when $T \ge 75$°. Let us develop an inequality which
allows us to assert that $v_1 \le -3$. In fact, suppose that

(6.6)    $R_p(T) - r_i \ge k_1 r_i (0.000235)(T - 75)$,

where $k_1$ is to be so determined that $v_1 \le -3$ is valid. Upon substituting (6.6) into (6.4), one
finds that

(6.7)    $v_1 \le -(k_1 + 1)r_i(0.000235)/R_c(0.000078)$.

Remembering that $r_i/R_c = 1 - \epsilon$ and that $0.000235/0.000078 \approx 3$, we find that $k_1 = \epsilon/(1 - \epsilon)$
will yield the requisite inequality. Next let us obtain an inequality which will permit us to say
that $v_2 \ge 3$. Suppose that

(6.8)    $r_{i+1} - R_p(T) \ge k_2 r_{i+1}(0.000235)(T - 75)$.

Then, from (6.5), we have

(6.9)    $v_2 \ge 3(k_2 - 1)(1 + \epsilon)$.

The right side of (6.9) equals 3 when $k_2 = (2 + \epsilon)/(1 + \epsilon)$. Thus, if, for $T \ge 75$,

(6.10)    $r_i(\epsilon) = r_i\left(1 + \dfrac{\epsilon}{1 - \epsilon}(0.000235)(T - 75)\right) \le R_p(T) \le r_{i+1}\left(1 - \dfrac{2 + \epsilon}{1 + \epsilon}(0.000235)(T - 75)\right) = r_{i+1}(\epsilon)$,

it follows from (6.3) that

(6.11)    $p(R_p(T)) = \dfrac{1}{(r_{i+1} - r_i)[1 - 0.000235(T - 75)]}$.

Now suppose that $T < 75$. Letting $R_p(T) - r_i \ge k_3 r_i(0.000235)(T - 75)$, it follows that

(6.12)    $v_1 \le 3(k_3 + 1)(1 - \epsilon)$.

The right side equals $-3$ when $k_3 = -(2 - \epsilon)/(1 - \epsilon)$. Again, assuming that
$r_{i+1} - R_p(T) \ge k_4 r_{i+1}(0.000235)(T - 75)$, we have

(6.13)    $v_2 \ge -3(1 + \epsilon)(k_4 - 1)$,

which equals 3 when $k_4 = \epsilon/(1 + \epsilon)$. Therefore, when $T < 75$ and

(6.14)    $r_i'(\epsilon) = r_i\left(1 - \dfrac{2 - \epsilon}{1 - \epsilon}(0.000235)(T - 75)\right) \le R_p(T) \le r_{i+1}\left(1 - \dfrac{\epsilon}{1 + \epsilon}(0.000235)(T - 75)\right) = r_{i+1}'(\epsilon)$,

(6.11) is again satisfied. Next let us go back to the case when $T \ge 75$, but let us now require
that $v_2 \le -3$. We find that such is true when

(6.15)    $R_p(T) \ge r_{i+1} - \dfrac{\epsilon}{1 + \epsilon}\,r_{i+1}(0.000235)(T - 75)$.

Since $v_2 \le -3$ also implies that $v_1 \le -3$, we can assume that $p(R_p(T)) = 0$ in this case.
Likewise, one finds that $v_1 \ge 3$ whenever

(6.16)    $R_p(T) \le r_i - \dfrac{2 - \epsilon}{1 - \epsilon}\,r_i(0.000235)(T - 75)$,

so that, in this range, $p(R_p(T)) = 0$, also. When $T < 75$, $v_2 \le -3$ whenever

(6.17)    $R_p(T) \ge r_{i+1} - \dfrac{2 + \epsilon}{1 + \epsilon}\,r_{i+1}(0.000235)(T - 75)$,

and $v_1 \ge 3$ when

(6.18)    $R_p(T) \le r_i + \dfrac{\epsilon}{1 - \epsilon}\,r_i(0.000235)(T - 75)$.

Again it follows that $p(R_p(T)) = 0$. Now there remain certain intervals in which $p(R_p(T))$
cannot be treated as constant for a given temperature. For example, it is found that, when
$T \ge 75$ and

(6.19)    $r_{i+1}(\epsilon) = r_{i+1}\left(1 - \dfrac{2 + \epsilon}{1 + \epsilon}(.000235)(T - 75)\right) < R_p(T) \le r_{i+1}\left(1 - \dfrac{\epsilon}{1 + \epsilon}(.000235)(T - 75)\right) = s_{i+1}(\epsilon)$,

$-3 < v_2 < 3$ while $v_1 \le -3$. Also, in the interval

(6.20)    $s_i(\epsilon) = r_i\left(1 - \dfrac{2 - \epsilon}{1 - \epsilon}(.000235)(T - 75)\right) < R_p(T) < r_i\left(1 + \dfrac{\epsilon}{1 - \epsilon}(.000235)(T - 75)\right) = r_i(\epsilon)$,

$-3 < v_1 < 3$ while $v_2 \ge 3$. When $T < 75$, $p(R_p(T))$ cannot be treated as constant whenever

(6.21)    $s_i'(\epsilon) = r_i\left(1 + \dfrac{\epsilon}{1 - \epsilon}(.000235)(T - 75)\right) < R_p(T) < r_i'(\epsilon)$

or

(6.22)    $r_{i+1}'(\epsilon) < R_p(T) \le r_{i+1}\left(1 - \dfrac{2 + \epsilon}{1 + \epsilon}(.000235)(T - 75)\right) = s_{i+1}'(\epsilon)$.
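
The four breakpoints that separate the zero, transition, and constant-density ranges of $R_p(T)$ can be tabulated with a short function. This is a sketch under the example constants used above (0.000235 per degree and a standard temperature of 75); the function name and the returned tuple layout are illustrative choices, not the layout of the original computer code.

```python
def density_breakpoints(r_i, r_ip1, eps, temp, k=0.000235, t_e=75.0):
    """Return (s_lo, plateau_lo, plateau_hi, s_hi) for a bin [r_i, r_ip1].

    Below s_lo and above s_hi the density is treated as zero; between
    plateau_lo and plateau_hi it is the constant (6.11); the two remaining
    intervals are the transition ranges (6.19)-(6.22).
    """
    c = k * (temp - t_e)
    if temp >= t_e:
        s_lo = r_i * (1 - (2 - eps) / (1 - eps) * c)          # s_i(eps), (6.20)
        plateau_lo = r_i * (1 + eps / (1 - eps) * c)          # r_i(eps), (6.10)
        plateau_hi = r_ip1 * (1 - (2 + eps) / (1 + eps) * c)  # r_{i+1}(eps), (6.10)
        s_hi = r_ip1 * (1 - eps / (1 + eps) * c)              # s_{i+1}(eps), (6.19)
    else:
        s_lo = r_i * (1 + eps / (1 - eps) * c)                # s_i'(eps), (6.21)
        plateau_lo = r_i * (1 - (2 - eps) / (1 - eps) * c)    # r_i'(eps), (6.14)
        plateau_hi = r_ip1 * (1 - eps / (1 + eps) * c)        # r_{i+1}'(eps), (6.14)
        s_hi = r_ip1 * (1 - (2 + eps) / (1 + eps) * c)        # s_{i+1}'(eps), (6.22)
    return s_lo, plateau_lo, plateau_hi, s_hi


r_c, eps = 40.16, 0.005
r_i, r_ip1 = r_c * (1 - eps), r_c * (1 + eps)
for temp in (70.0, 80.0):
    print(temp, density_breakpoints(r_i, r_ip1, eps, temp))
```

At the lower edge of the plateau the standardized limit $v_1$ of (6.4) equals $-0.000235/0.000078$, just beyond the three-standard-deviation cutoff on which the case analysis is built.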

The intervals so developed, in which the behavior of $p(R_p(T))$ is examined, are very
important in the numerical study conducted on the CDC 6600. We now set up the precise
procedure used in the computer study. First of all, referring to (4.19) and (4.20), we find it a little
more natural to integrate with respect to $\Delta C_1(T)$ or $\Delta C_2(T)$ first instead of $R_p(T)$. We see
then that our region of integration is fully specified by

(6.23)    $f_1(\Delta C_2, R_p(T), R_N, \Delta V_E^{(1)}, \Delta V_E^{(2)}) \le \Delta C_1 \le f_2(\Delta C_2, R_p(T), R_N, \Delta V_E^{(1)}, \Delta V_E^{(2)})$

          $-\infty < \Delta C_2 < +\infty$

          $-\infty < R_p(T) < +\infty$

          $-\infty < \Delta V_E^{(1)} < \infty$

          $-\infty < \Delta V_E^{(2)} < \infty$

          $r_i \le R_N \le r_{i+1}$

          $T_1 \le T \le T_2$,

where, for $A_1 > 0$,

(6.24)    $f_1 = \dfrac{1}{A_1}\left[\dfrac{R_p(T)}{1 + \delta} - A_2 \Delta C_2 - R_N - A_4(\Delta V_E^{(2)} - \Delta V_E^{(1)})\right]$

          $f_2 = \dfrac{1}{A_1}\left[\dfrac{R_p(T)}{1 - \delta} - A_2 \Delta C_2 - R_N - A_4(\Delta V_E^{(2)} - \Delta V_E^{(1)})\right]$

and the inclusion of negative values for $R_p(T)$ is merely a mathematical artifice. The density
function for this process then has the following form:

(6.25)    $f(\Delta C_1, \Delta C_2, R_N, R_p(T), \Delta V_E^{(1)}, \Delta V_E^{(2)}) = p_1(\Delta C_1)\,p_2(\Delta C_2)\,p_3(R_p(T))\,p_4(R_N, \Delta V_E^{(1)})\,p_5(\Delta V_E^{(2)})/p_i(T_2 - T_1)$.

The densities $p_1$, $p_2$, and $p_5$ are all normal densities. The mass function $p_3$ was ascertained in
(6.3). $p_4$ is a bivariate normal density, and $p_i$ is the probability of being in bin $i$. It is easy to
determine the correlation coefficient $\rho$ for $p_4$. Multiplying $\Delta V_E^{(1)}$ by $R_N$, as given by (4.16), we
find that

(6.26)    $E(R_N \Delta V_E^{(1)}) = A_4 E[(\Delta V_E^{(1)})^2] = A_4 \sigma^2(\Delta V_E^{(1)})$,

and, since the expected value of $\Delta V_E^{(1)}$ is zero, $\mathrm{cov}(R_N, \Delta V_E^{(1)}) = E(R_N \Delta V_E^{(1)})$. It follows,
using (6.26), that

(6.27)    $\rho = A_4 \sigma(\Delta V_E^{(1)})/\sigma(R_N)$.
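
The correlation coefficient (6.27) can be checked by simulation under the linear model (4.16). In the sketch below, the deviation of $R_N$ from its mean is written as $A_4 \Delta V_E^{(1)}$ plus an independent normal remainder whose variance makes the total equal to $\sigma^2(R_N)$; that decomposition, the sample size, and the seed are assumptions made for illustration. The inputs $A_4 = -0.95$, $\sigma(\Delta V_E^{(1)}) = 0.2357$, and $\sigma(R_N) = 4.04$ are the values used in the paper's numerical example.

```python
import math
import random


def rho_formula(a4, sd_dv, sd_rn):
    """Correlation of R_N and Delta V_E^(1) from (6.27)."""
    return a4 * sd_dv / sd_rn


def rho_monte_carlo(a4, sd_dv, sd_rn, n=50_000, seed=7):
    """Sample correlation under the linear model, using the known zero means."""
    rng = random.Random(seed)
    sd_rest = math.sqrt(sd_rn ** 2 - (a4 * sd_dv) ** 2)   # all other contributions
    sxy = sxx = syy = 0.0
    for _ in range(n):
        dv = rng.gauss(0.0, sd_dv)
        rn_dev = a4 * dv + rng.gauss(0.0, sd_rest)        # R_N - E(R_N)
        sxy += dv * rn_dev
        sxx += dv * dv
        syy += rn_dev * rn_dev
    return sxy / math.sqrt(sxx * syy)


rho = rho_formula(-0.95, 0.2357, 4.04)
rho_mc = rho_monte_carlo(-0.95, 0.2357, 4.04)
print(rho, rho_mc)
```

The correlation is small in magnitude here because the tube-to-tube spread of the expected breakdown voltage, rather than the firing-to-firing variation, accounts for most of the variance of $R_N$.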



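The factors in (6.25), other than $p_3$, are given by

$p_1(\Delta C_1) = \dfrac{1}{(2\pi)^{1/2}\sigma(\Delta C_1)} \exp\left\{-\dfrac{1}{2}\left[\dfrac{\Delta C_1 - E(\Delta C_1)}{\sigma(\Delta C_1)}\right]^2\right\}$

$p_2(\Delta C_2) = \dfrac{1}{(2\pi)^{1/2}\sigma(\Delta C_2)} \exp\left\{-\dfrac{1}{2}\left[\dfrac{\Delta C_2 - E(\Delta C_2)}{\sigma(\Delta C_2)}\right]^2\right\}$

$p_4(R_N, \Delta V_E^{(1)}) = \dfrac{1}{2\pi\sigma(R_N)\sigma(\Delta V_E^{(1)})\sqrt{1 - \rho^2}} \exp\left\{-\dfrac{1}{2(1 - \rho^2)}\left[\left(\dfrac{R_N - E(R_N)}{\sigma(R_N)}\right)^2 - 2\rho\left(\dfrac{R_N - E(R_N)}{\sigma(R_N)}\right)\left(\dfrac{\Delta V_E^{(1)} - E(\Delta V_E^{(1)})}{\sigma(\Delta V_E^{(1)})}\right) + \left(\dfrac{\Delta V_E^{(1)} - E(\Delta V_E^{(1)})}{\sigma(\Delta V_E^{(1)})}\right)^2\right]\right\}$

$p_5(\Delta V_E^{(2)}) = \dfrac{1}{(2\pi)^{1/2}\sigma(\Delta V_E^{(2)})} \exp\left\{-\dfrac{1}{2}\left[\dfrac{\Delta V_E^{(2)} - E(\Delta V_E^{(2)})}{\sigma(\Delta V_E^{(2)})}\right]^2\right\}$,

where $\rho$ is given by (6.27) and $p_4$ is the well-known joint normal density for two variates [7,
pp. 111-114].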

Now let $K_{iE} = .04$ for $i = 1,2$ in (4.17) and $\sigma(K_i) = .013$, $i = 1,2$. Recall from our
discussion in paragraph IV that $C_{1E} = .44\,\mu f$, $C_{2E} = .15\,\mu f$, $E(R_N) = 40.16$ megohms,
$A_1 = 8.65$, $A_2 = -293.12$, $A_4 = -0.95$, and $\sigma(R_N) = 4.04$. In addition, suppose that
$E(\Delta V_E^{(1)}) = E(\Delta V_E^{(2)}) = 0$ and $\sigma(\Delta V_E^{(1)}) = \sigma(\Delta V_E^{(2)}) = 0.2357$. Then it is seen that

$E(\Delta C_1) = 0.000176(T-75)$, $\sigma(\Delta C_1) = 0.00005874|T-75|$,

$E(\Delta C_2) = 0.00006(T-75)$, and $\sigma(\Delta C_2) = 0.00002034|T-75|$.
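
As a quick numerical check, the sketch below evaluates the correlation coefficient of (6.27) from the constants quoted above and reproduces the two quoted mean capacitance shifts. The function name `mean_delta_c` and the drift rule $C \cdot K \cdot (T-75)/100$ are assumptions made here for illustration; the rule is inferred from the quoted numbers, since (4.17) itself is not reproduced in this section.

```python
# Constants quoted in the text.
A4 = -0.95          # coefficient of the firing-voltage variation in R_N
SIGMA_DV1 = 0.2357  # standard deviation of the first firing-voltage variation
SIGMA_RN = 4.04     # standard deviation of R_N

# Equation (6.27): rho = A4 * sigma(dV1) / sigma(R_N).
rho = A4 * SIGMA_DV1 / SIGMA_RN


def mean_delta_c(c_nominal, k_coeff, temp):
    """Assumed drift rule C * K * (T - 75) / 100; it matches the quoted means."""
    return c_nominal * k_coeff * (temp - 75.0) / 100.0


print(round(rho, 4))                   # about -0.0554
print(mean_delta_c(0.44, 0.04, 76.0))  # mean shift of C1 per degree above 75
print(mean_delta_c(0.15, 0.04, 76.0))  # mean shift of C2 per degree above 75
```

The small magnitude of the correlation indicates that, for these constants, the bivariate density $p_4$ is close to a product of two univariate normal densities.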

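Next we make several changes of variable. Let

(6.28) $u = (\Delta C_1 - E(\Delta C_1)) / \sqrt{2}\,\sigma(\Delta C_1)$

$w = (\Delta C_2 - E(\Delta C_2)) / \sqrt{2}\,\sigma(\Delta C_2)$

$z = v / \sqrt{2}$

$u_1 = (R_N - E(R_N)) / \sqrt{2(1 - \rho^2)}\,\sigma(R_N)$

$w_1 = \Delta V_E^{(1)} / \sqrt{2(1 - \rho^2)}\,\sigma(\Delta V_E^{(1)})$

$w_2 = \Delta V_E^{(2)} / \sqrt{2}\,\sigma(\Delta V_E^{(2)}).$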

Then (6.25) becomes 




7r 3 (r, +1 - /-,)[! + Cyv(r- 75)/100] 

- ! ;.: 

2 e--' dz 

- ( « j | — 2p u | h> | + >v | ) 

e ■ e ' ' ' ' /Pi(T 2 - T,). 
Now one finds, by completing the square, that

$e^{-(u_1^2 - 2\rho u_1 w_1 + w_1^2)} = e^{-(w_1 - \rho u_1)^2} \cdot e^{-(1 - \rho^2) u_1^2}.$

Next we let $w_3 = w_1 - \rho u_1$ and $u_2 = \sqrt{1 - \rho^2}\, u_1$. Our integrand becomes


h = 


7rHr l+l - r,)[l + C N (T - 75)/100] 

2 2 2 

e 3 • e 7 ■ e 1 lp\T 2 - T } ). 


For brevity, set $Y = (w, w_2, w_3)$, and let $R^3(-\infty, \infty)$ denote the usual three-dimensional
Euclidean space. Also, put $u_{2i} = (r_i - E(R_N)) / \sqrt{2}\,\sigma(R_N)$ and $u_{2,i+1} = (r_{i+1} - E(R_N)) / \sqrt{2}\,\sigma(R_N)$.
Then our integration scheme becomes


P,(^(l -8) < f < f„(l +8)) = A x + A 2 , 

J 75 /•"2/+1 T /' r / (e) fF 7 (Y.u v R) 

r J J , ^K^ 2 c/r j . . f sl h{u,Y,R,u 2 )dudR 

,/■',(«) j.FAY.u^.R 

F 2 (Y,u r R) 
xL,(«) /.FJY.UtR) 

+ J.M L,v ,» h(u,Y,R,u 2 )dudR + I , ,. I , v _, h{u,Y,R,u 2 )dudR 

Jr^t) J F^kY.u^.Ri Jr l+i U) J F\(Y.u 2 .R) l 

and $A_2$ is obtained by using 75 and $T_2$ for limits on the $T$ integration in place of $T_1$ and 75,
respectively, with primed quantities replaced by unprimed quantities. In addition, we have set

(6.34) $F_1(Y, u_2, R) = F_1(w, u_2, w_2, w_3, R) = [f_1(\Delta C_2, R, \Delta V_E^{(2)}, \Delta V_E^{(1)}, R_N) - E(\Delta C_1)] / \sqrt{2}\,\sigma(\Delta C_1)$

$F_2(Y, u_2, R) = F_2(w, u_2, w_2, w_3, R) = [f_2(\Delta C_2, R, \Delta V_E^{(2)}, \Delta V_E^{(1)}, R_N) - E(\Delta C_1)] / \sqrt{2}\,\sigma(\Delta C_1).$

Now $f_1$ and $f_2$ were defined in (6.24), and, from the changes of variable given by (6.28), we
have

(6.35) $\Delta C_2 = E(\Delta C_2) + \sqrt{2}\, w\, \sigma(\Delta C_2)$

$\Delta V_E^{(2)} = \sqrt{2}\,\sigma(\Delta V_E^{(2)})\, w_2$

$\Delta V_E^{(1)} = \sqrt{2(1 - \rho^2)}\,\sigma(\Delta V_E^{(1)}) (w_3 + \rho u_2 / \sqrt{1 - \rho^2})$

$R_N = E(R_N) + \sqrt{2}\,\sigma(R_N)\, u_2.$

Our computer code is just the implementation of a nesting procedure, making use of Gaussian
and Hermite-Gaussian quadrature routines, together with routines to evaluate the error integral
[5, pp. 130-132], [6, pp. 319-330], [8], [1, pg. 924]. It turned out to be convenient, numerically
accurate, and efficient in running time to employ three Gauss points per integration step.
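
To illustrate the kind of nesting described above, here is a minimal sketch of a three-point Hermite-Gaussian rule applied to normal expectations, first in one dimension and then nested over two independent normal variables. It is not the original CDC 6600 code; the function names are hypothetical, and the nodes and weights are the standard three-point values for the weight $e^{-t^2}$.

```python
import math

# Three-point Gauss-Hermite rule for the weight exp(-t^2);
# it is exact for polynomials of degree five or less.
NODES = (-math.sqrt(1.5), 0.0, math.sqrt(1.5))
WEIGHTS = (
    math.sqrt(math.pi) / 6.0,
    2.0 * math.sqrt(math.pi) / 3.0,
    math.sqrt(math.pi) / 6.0,
)


def normal_expectation(f, mean, sigma):
    """Approximate E[f(X)] for X ~ N(mean, sigma^2).

    Uses the substitution x = mean + sqrt(2) * sigma * t, which turns the
    normal density into the Gauss-Hermite weight exp(-t^2) / sqrt(pi).
    """
    total = sum(
        w * f(mean + math.sqrt(2.0) * sigma * t) for t, w in zip(NODES, WEIGHTS)
    )
    return total / math.sqrt(math.pi)


def nested_expectation(f, mean1, sigma1, mean2, sigma2):
    """Nest two one-dimensional rules for two independent normal variables."""

    def inner(x):
        return normal_expectation(lambda y: f(x, y), mean2, sigma2)

    return normal_expectation(inner, mean1, sigma1)


print(normal_expectation(lambda x: x * x, 1.0, 2.0))               # E[X^2] = 1 + 4
print(nested_expectation(lambda x, y: x * y, 1.0, 2.0, 3.0, 0.5))  # E[X]E[Y] = 3
```

The substitution inside `normal_expectation` is the same scaling by $\sqrt{2}\,\sigma$ that appears in the changes of variable (6.28). With three points per level, a $k$-fold nest costs $3^k$ integrand evaluations.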

The effect of cold cathode diode firing voltage variations in this problem is more
significant than that of ambient temperature departures from nominal. In our case study, for
example, when $\epsilon = .01$ and $\delta = .02$, $P_i$ was essentially 91%. With $\delta = .03$, this figure was
increased to almost 100%. Results for six bins with $\epsilon = .01$ and $\delta = .03$ are given in Table 1.

TABLE 1 — Performance of Fuze Timer 
for Representative Bins 






























It is seen that the probability is essentially the same independent of the bin. Running time for
this problem was approximately four seconds per bin. Indeed one would reason, as in
paragraph 3, that, at least approximately, each bin should yield the same probability for firing time,
given a $\delta$-$\epsilon$ combination. This should occur if the nonlinearities are not too severe and the
distributions due to change in temperature and cold cathode diode firing voltage variations are
fairly compact. This would then mean that we need only examine one bin to determine the
performance of the timer, and our integration procedure could then represent a substantial time
saving over a Monte Carlo simulation.

Going back to (6.32), we can also give an error bound for the part neglected in the
computation of $P_i$. Let us illustrate in one case what is happening. For instance, we have neglected

(6.36) L J J , J , \ F(Y -, h{u,Y,R,u 2 )dudYdRdu 2 dT 
Clearly, (6.36) is bounded above by 

(6.37) f 2 f" 2 ' ,+1 f A f°° , h{R,Z,u 2 )dRdZdu 2 dT, 

Jis Jti 2j ~'ze/? 4 (-°°,oo) * ,s ,+i (e) 

where $Z = (u, Y)$. Noting that $h(u, w, R, w_3, u_2, w_2) = g(u, w, w_3, u_2, w_2)\,p(R)$ and that

$\int\!\!\int\!\!\int g(Z, u_2)\, dZ\, du_2\, dT = 1,$

we need only study the behavior of the integration with respect to $R$. Going back to (6.37),
when $s_{i+1} < R < \infty$, we know that $v_1 \le v_2 < -3$. Therefore, it is easy to show that

(6.38) p(RAT)) < — L e~ v l /2 /R c <r(C)\T- 751/100. 


It follows [2, pg. 149] that

$\int_{s_{i+1}(\epsilon)}^{\infty} p(R(T))\, dR(T) \le \frac{1}{\sqrt{2\pi}} \int_3^{\infty} e^{-x^2/2}\, dx = .00135.$

A similar result is obtained when $R$ is restricted to the interval $(-\infty, s_i(\epsilon))$ and $T > 75°$ or
when $R$ lies in either $(s'_{i+1}(\epsilon), \infty)$ or $(-\infty, s'_i(\epsilon))$ and $T < 75°$. The result is finally that the
portion neglected is bounded above by .0027, so that we are at most off in the third decimal
place.

An interesting case study arises when there are two or more timers which are statistically
dependent. This occurs, for example, when, after the first timer is operated, a switch closes
and a second timer is started, the second one being fed by the same capacitor which fed the
first timer. Let us suppose, for instance, that capacitor C1 in Figure 1 feeds the second fuze
timer indicated in Figure 6.

At the end of operation of the first timer, switch S in Figure 6 is thrown into the position
indicated, thus allowing C1 to begin charging up C4. C5 serves as the reference capacitor. The
second timer is also governed by a simple first order differential equation, and one can show
that the time is given by

(7.D ,,^£i£i en 

Letting $t_N^{(1)}$ be the nominal time for the second timer, we find the nominal resistance for this
timer to be

C\V- C 2 (V T - V) 

c x v- 

- C 2 (V T - V)- {V TA - K)(C, + C 4 ) 




Figure 6. Second fuze timer configuration


D (1) _ 

(C, + c 4 )^" 



VC ] - (V}"~ V)C 

vc x - (V± l) - V)C 2 


j - v ; ^ 2 ~ \ r t,\ ~ V)(C\ + C 4 ) 

where K/ n = Kf+AKJ n and V$\ == Kf, +A^', ) . Then, substituting (4.8) into (7.2), we 
derive the functional relationship 

(7.3) R$ ] - /?;" (C,, C 2l C 4 , V, R N ,bV£", V}}{). 
Solving (7.3) for Vf ti , we have 

(7.4) Vf A = h(C u C 2 ,C 4 , V, R N . R& 1) , LVp\ LVgl). 

To determine the joint density for the process, we must, by analogy with the method in para- 
graph 4, introduce a pair of diode firing variations A V^ 2) and A Vj£\. We then consider the fol- 
lowing transformation of variables: 


yf.\ ■ 

= h{C\, C 2 , C 4 



c 2 = 

c 2 

c 4 = 

c 4 

V = 


R N = 

R N 




A Vft 

AKi 2 



a vm 



K /?„, CA^'.A^ 1 ,') 



To compute the density, we employ (4.10) and the Jacobian of the transformation (7.5) to 



(7.6) d 5 (C lt C 2 , C 4 , V, R N , R$\ LVP, AKft AKJ 2) , AK$) 

= /(C„ C 2 , K.A^.AKjP,/^) rf,(C 4 ) -dMV { E x D -dMV^) 

d,iVf A ) 

bVf A 

9/?; n 

Also, if both $R_N^{(1)}$ and $R_N$ are linearized about nominal values of capacitance, tube firing
voltages, and regulator voltage, then the map

(7.7) R^ = I,(C,, C 2 , C 4 , Vf^VP, V, Vl u LVg\) 
R N = I 2 (C„ C 2 , V, Vl AKi n ) 


A^" = AK^ 1 ' 

shows that (/?^ l> , ^/v, A V^\, A K^ n ) is a quadrivariate normal random vector [2, pg. 162]. The 
reason is that all random variables on the right side of (7.7) are independent and normally dis- 
tributed. At the nominal temperature, the density function is therefore generally representable 

(7.8) d 6 (C u C 2 ,C 4 , V,R N ,R^,AV^,AV^\, R p , R™ , AV£ 2 \ A^j) 

= </ 5 (C„ C 2 , C 4 , V, R N , RJP.LVP.LVSlLVP.LVgi) 

■p(R p )-p«HR} l) ), 

where, for example, 

p{R p )= l/(r /+1 - r,) 


//» (/?;»)- 1/0$ -i> (l) ) 

if picked resistance is equally likely across the bins. (7.8), also, obviously indicates that picked 
resistances are statistically independent of the other component values. It will be possible to 
reduce (7.8) to the simpler form 

(7.9) d 6 (R N , R$\ AKi», A K^, R p , R«\ A V?\ A Vg\) 

= p{R N ,Rk l \bVP,b,Vgl) p(R p ) VW*) ■ piAV^) -p(AV$) 

when (7.7) is valid, p(R N , R^\ A KJ n , A v£\) being the density for the quadrivariate normal 
distribution [9, pg. 88]. From (7.7) the elements of the covariance matrix [9, pg. 88] can be 
easily obtained. 
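
The covariance computation referred to here follows a standard rule: if the output vector is a linear map of independent inputs, its covariance matrix is the coefficient matrix times the diagonal matrix of input variances times the transposed coefficient matrix. The sketch below shows this rule with purely hypothetical coefficients and variances; the actual linearization coefficients of (7.7) are not reproduced in this section.

```python
def linear_map_covariance(coeffs, variances):
    """Covariance matrix of y = coeffs * z for independent inputs z.

    coeffs[i][k] is the coefficient of input k in output i, and
    variances[k] is the variance of input k.
    """
    rows = len(coeffs)
    inputs = len(variances)
    return [
        [
            sum(coeffs[i][k] * coeffs[j][k] * variances[k] for k in range(inputs))
            for j in range(rows)
        ]
        for i in range(rows)
    ]


# Hypothetical example: y1 = 2*z1 + z2, y2 = z2, with var(z1) = 1 and var(z2) = 4.
cov = linear_map_covariance([[2.0, 1.0], [0.0, 1.0]], [1.0, 4.0])
print(cov)  # [[8.0, 4.0], [4.0, 4.0]]
```

An output that simply copies one of the inputs, like `y2` above, plays the role of a firing-voltage variation that appears both on its own and inside a linearized resistance.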

Next account must be taken of changes in component values due to temperature changes 
from the nominal value. We use the same ideas presented in paragraph 4, together with the 
same notation. The density becomes 


(7.10) d{C x , C 2 , C 4 , V, R N , Rtf\ A VP, A V${, R p (T), R p il) (T), 
ACP,(n, ACP 2 (T), ACP 4 (T)) 
= d 5 (C u C 2 , C 4 ,V, R N , R#\ AV?\ AKi f 7,AKi 2 >, AK^) 



■ p{ACP x {T)) ■ p(ACP 2 (T)) ■ d(ACP 4 (T)) ■ p(R p (T)) ■ p {l) (R p iU (T)), 


where $p(R_p(T))$ and $p^{(1)}(R_p^{(1)}(T))$ are again convolution densities. We must now determine
limits of integration. One requires that the first timer fire in time $t$, where $t_1 = t_N(1 - \delta) \le
t \le t_N(1 + \delta) = t_2$, and that the second timer fire in time $t^{(1)}$, where $t_1^{(1)} = t_N^{(1)}(1 - \delta^{(1)})
\le t^{(1)} \le t_N^{(1)}(1 + \delta^{(1)}) = t_2^{(1)}$. Therefore, we have

(7-11) r,/F< R P (T) ^ t-jF 

,m /F u) ^ R u) (T) < ,(i) /F (i» 

— °o < C/ < «\ / = 1,2,4 

-oo < ACP,(T) < oo, /= i; 2, 4 

- oo < A VP ] < oo 

- «3 < A Vg\ < oo 

- oo < A VP ] < oo 

- oo < A K^ < oo 

''/ ^ F N < r l+] 

,.(1) <- p (1) ^ r (\) 

- oo < J/ < oo, 

where F = FiC^T), C 2 (D, K ^ 2 >), F (1) = F (,l (C,(r), C 2 (D, C 4 (D, K, F/ 2 \ K/ 2 /) and 
yp) = |/f + a ^ 2) , J// 2 / = Vf A + A J^ 2 /, as before. Kf is given by (4.8) and V$ A by (7.5). 
Also, (4.13) holds for / = 1,2, and 4. 

An integration scheme patterned after (4.14) can then be recorded with $p_{ij} = \mathrm{prob}(R_N \in$
bin $i$ and $R_N^{(1)} \in$ bin $j)$ in place of $p_i$. (4.15) would then be replaced by a double sum:

(7.12) />(/, ^ t < r 2> r, (,) ^ f (1) < / 2 (n ) 

1 oo oo _ 7" 

r,- r, 

Also, in the case where (7.7) is valid, C { C 2 , C 4 , and V are eliminated and ACP,(T) is to be 
replaced by AC,(D. In addition, / (I) /F (1) and f/F become linear forms in AC^D, AC 2 (D, 
AC 4 (r), /?/v, /^ n , A^'\ A^ 2) , Al^.y, and A^ 2 |. In that case a sixteen-fold integral is 
reduced to a twelve-fold integral. 


The author would like to thank personally Messrs. Larry Burkhardt and John Abell, of the 
fuze group at this laboratory, for their generous support of this work and their helpful sugges- 
tions. Also, he would like to thank Mr. William McDonald and Mr. Ted Orlow, also of this 
laboratory, for their helpful comments and guiding ideas. 


[1] Abramowitz, Milton and Irene A. Stegun (Editors), Handbook of Mathematical Functions,
National Bureau of Standards Applied Mathematics Series, No. 55, June (1964).

[2] Cohen, Edgar A., Jr. and Ronald Goldstein, "A Component Reliability Model for Bomb
Fuze MK 344 Mod 1 and MK 376 Mod 0", NSWC/WOL/TR 75-123 (1975).

[3] Fisz, Marek, Probability Theory and Mathematical Statistics, John Wiley & Sons, Inc., New
York (1963).

[4] Hald, A., Statistical Theory with Engineering Applications, John Wiley & Sons, Inc., New
York, Chapman & Hall, Limited, London (1952).

[5] Hamming, R.W., Numerical Methods for Scientists and Engineers, McGraw-Hill, New York

[6] Hildebrand, F.B., Introduction to Numerical Analysis, McGraw-Hill, New York (1956).

[7] Hogg, Robert V. and Allen T. Craig, Introduction to Mathematical Statistics, 3rd Edition,
MacMillan, London (1970).

[8] IBM 7094/7094 Operating System Version 13, IBJOB Processor, Appendix H: FORTRAN
IV Mathematics Subroutines, International Business Machines Corporation, New York

[9] Parzen, Emanuel, Stochastic Processes, Holden-Day, San Francisco, London; Amsterdam





Lionel Weiss 

Cornell University 
Ithaca, New York 


In an earlier paper, it was shown that for the problem of testing that a
sample comes from a completely specified distribution, a relatively small number of
order statistics is asymptotically sufficient, and for all asymptotic probability
calculations the joint distribution of these order statistics can be assumed to be
normal. In the present paper, these results are extended to certain cases where
the problem is to test the hypothesis that a sample comes from a distribution
which is a member of a specified parametric family of distributions, with the
parameters unspecified.


For each $n$, the random variables $X_1(n), \ldots, X_n(n)$ are independent, identically
distributed, with unknown common probability density function and cumulative distribution function
$f_n(x)$, $F_n(x)$ respectively. An $m$-parameter family of distributions, with pdf $f_0(x; \theta_1, \ldots, \theta_m)$
and cdf $F_0(x; \theta_1, \ldots, \theta_m)$, is specified, and the problem is to test the hypothesis that $f_n(x) =
f_0(x; \theta_1, \ldots, \theta_m)$ for all $x$, for some unspecified values of $\theta_1, \ldots, \theta_m$.

In [5] the simpler problem of testing the hypothesis that $f_n(x) = f_0(x)$, where $f_0(x)$ is
completely specified, was discussed. In this simpler case, the familiar probability integral
transformation can be used to reduce the problem to that of testing whether a sample comes
from a uniform distribution over (0,1). This type of reduction is not always available when the
hypothetical density is not completely specified. (See [1] for some cases where the reduction is
available.)

Since we will be interested in large sample theory, to keep the alternatives challenging we 
assume that f„(x) = f (x;9], ...,9%) (1 + r„(x)) for some unknown values 
. . , 9®„ and some unknown function r n (x ) satisfying the conditions sup |/"„(x)| < n~* and 

< n e for all n and for j = 1,2,3,4, where e is a fixed value in the open interval 

d 1 


Su jj 




1 1 



'Research supported by NSF Grant No. MCS76-06340. 




The case where $m = 2$, and $\theta_1$, $\theta_2$ are location and scale parameters respectively is
relatively simple to analyze, and occurs often in practice, so until Section 5 we will discuss only this
case. That is, $f_0(x; \theta_1, \theta_2) = \frac{1}{\theta_2}\, g\left(\frac{x - \theta_1}{\theta_2}\right)$ with $\theta_2 > 0$, and the pdf $g(x)$ is completely
specified. $G(x)$ denotes $\int_{-\infty}^{x} g(t)\,dt$. We assume that $\sup_x \left| \frac{d^j}{dx^j} g(x) \right| < \Delta_1 < \infty$ for $j =
1,2,3,4$, and that $\sup_x g(x) < \Delta_2 < \infty$.


For each n, we choose positive quantities /?„, q„, and L„ satisfying the following condi- 

p„ < q„ < 1 - «" e . 

(1.2; np n , nq„, L n , and K„ = are all integers. 



, 3 

lim — - — = 1 for some fixed 8 in the open interval 

lim p„ = 0, lim q n = 1, lim np n = °o. 





b n = inf 



l + n 

<x < G' 



— ri 


> n y for a fixed 

positive y with — - e + 28 + 5y < 0. 


°o, lim 

//-«» np„ ' n-oo A7 (1 — q„) 

g(G-H P „)) 




> A 3 > for all x < G~ l (p„), and 

> A 4 > for all x > G~ x {q n ). 

$Y_1(n) < Y_2(n) < \ldots < Y_n(n)$ denote the ordered values of $X_1(n), \ldots, X_n(n)$. For
typographical simplicity, we denote $Y_i(n)$ by $Y_i$. For $j = 1, \ldots, K_n$, let $\bar{Y}_j(n)$ denote
$\frac{1}{2}(Y_{np_n + jL_n} + Y_{np_n + (j-1)L_n})$, and let $D_j(n)$ denote $(Y_{np_n + jL_n} - Y_{np_n + (j-1)L_n})$. For $j =
1, \ldots, K_n$, let $W'(1,j,n), \ldots, W'(L_n - 1, j, n)$ denote the values of the $L_n - 1$ variables
among $\{X_1(n), \ldots, X_n(n)\}$ which fall in the open interval
$\left(\bar{Y}_j(n) - \frac{D_j(n)}{2},\ \bar{Y}_j(n) + \frac{D_j(n)}{2}\right)$, written in random order: that is, the same order in which the
corresponding elements of $\{X_1(n), \ldots, X_n(n)\}$ are written. Define $W(i,j,n)$ as
$\frac{W'(i,j,n) - \bar{Y}_j(n)}{D_j(n)}$ for $i = 1, \ldots, L_n - 1$ and $j = 1, \ldots, K_n$, so $-\frac{1}{2} < W(i,j,n) < \frac{1}{2}$.
Let $W(j,n)$ denote the $(L_n - 1)$-dimensional vector $\{W(1,j,n), \ldots, W(L_n - 1, j, n)\}$ for
$j = 1, \ldots, K_n$. Let $W(1,0,n), \ldots, W(np_n - 1, 0, n)$ denote the values of the $np_n - 1$ variables among
$\{X_1(n), \ldots, X_n(n)\}$ which fall in the open interval $(-\infty, Y_{np_n})$, written in random order. Let
$W(0,n)$ denote the vector $\{W(1,0,n), \ldots, W(np_n - 1, 0, n)\}$. Let $W(1, K_n + 1, n), \ldots,
W(n - nq_n, K_n + 1, n)$ denote the values of the $n - nq_n$ variables among $\{X_1(n), \ldots, X_n(n)\}$
which fall in the open interval $(Y_{nq_n}, \infty)$, written in random order. Let $W(K_n + 1, n)$ denote
the vector $\{W(1, K_n + 1, n), \ldots, W(n - nq_n, K_n + 1, n)\}$. Let $T(n)$ denote the $(K_n + 1)$-dimensional
vector $\{Y_{np_n + jL_n};\ j = 0, 1, \ldots, K_n\}$. Note that if we are given the $K_n + 3$ vectors
defined, we can compute the $n$ order statistics $Y_1, \ldots, Y_n$, so that any test procedure based on
the order statistics can also be based on the $K_n + 3$ vectors.
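
The bookkeeping in this construction can be sketched as follows: pick out the sparse order statistics, collect the observations in the two tails, and rescale the observations between consecutive sparse order statistics to the interval from -1/2 to 1/2, keeping the original sample order. The function name `split_sample` and the sizes used in the example are hypothetical, and the sketch assumes no ties, as is the case with probability one for a continuous distribution.

```python
import random


def split_sample(xs, n_p, n_q, block):
    """Split xs into sparse order statistics, two tails, and rescaled blocks.

    n_p and n_q are the ranks n*p_n and n*q_n, and block is L_n; the ranks
    n_p + j*block for j = 0..K must end exactly at n_q.
    """
    ys = sorted(xs)
    k = (n_q - n_p) // block
    assert n_p + k * block == n_q, "n_q - n_p must be a multiple of block"
    # Sparse order statistics Y_{n_p + j*block}; ranks in the text are 1-indexed.
    sparse = [ys[n_p + j * block - 1] for j in range(k + 1)]
    lower = [x for x in xs if x < sparse[0]]   # left tail, original order
    upper = [x for x in xs if x > sparse[-1]]  # right tail, original order
    blocks = []
    for j in range(1, k + 1):
        lo, hi = sparse[j - 1], sparse[j]
        mid, width = 0.5 * (lo + hi), hi - lo
        # Observations strictly inside the interval, centred and scaled.
        blocks.append([(x - mid) / width for x in xs if lo < x < hi])
    return sparse, lower, blocks, upper


random.seed(1)
xs = [random.gauss(0.0, 1.0) for _ in range(100)]
sparse, lower, blocks, upper = split_sample(xs, n_p=10, n_q=90, block=20)
total = len(sparse) + len(lower) + sum(len(b) for b in blocks) + len(upper)
print(len(sparse), len(lower), [len(b) for b in blocks], len(upper), total)
```

The element counts add up to the sample size, which is the sense in which the vectors together carry the same information as the full set of order statistics.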

Let $h_n(t(n))$ denote the joint pdf for the elements of the vector $T(n)$, and let
$h_{i,n}(w(i,n) \mid t(n))$ denote the joint conditional pdf for the elements of the vector $W(i,n)$, given
that $T(n) = t(n)$. Then the joint pdf for all $n$ elements of all the vectors is
$h_n(t(n)) \prod_{i=0}^{K_n+1} h_{i,n}(w(i,n) \mid t(n))$, which we denote by $h_n^{(1)}$.

Next we construct two different "artificial" joint pdfs for the $n$ elements of the vectors.

In the first artificial joint pdf, the marginal pdf for $T(n)$ and the conditional pdfs for
$W(0,n)$ and $W(K_n + 1, n)$ are the same as above. The pdfs for the elements of the other
vectors are constructed as follows.

Let $\alpha_j(n)$ denote $G^{-1}\left(\frac{np_n + (j - \frac{1}{2})L_n}{n}\right)$, and $\gamma_j(n)$ denote
$\frac{L_n\, g'(\alpha_j(n))}{2n\, g^2(\alpha_j(n))}$, for $j = 1, \ldots, K_n$. Let $U(i,j)$ $(i = 1, \ldots, L_n - 1;\ j = 1, \ldots, K_n)$ be IID
random variables, independent of $T(n)$, $W(0,n)$, $W(K_n + 1, n)$, and each with a uniform distribution
over (0,1). Then the distribution of $W(i,j,n)$ is to be the distribution of
$-\frac{1}{2} + (1 + \gamma_j(n))\,U(i,j) - \gamma_j(n)\,U^2(i,j)$, for
$i = 1, \ldots, L_n - 1$ and $j = 1, \ldots, K_n$. Denote the resulting joint pdf for all $n$ elements by
$h_n^{(2)}$.
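
The quadratic transform of a uniform variable used in this construction is easy to examine numerically. The sketch below implements the map and the density it implies; the density formula and the moment values mentioned afterwards are derived here from the map (assuming the tilt parameter is smaller than one in absolute value) rather than quoted from the paper, and the function names are hypothetical.

```python
def w_from_uniform(u, gamma):
    """Map U in (0,1) to W = -1/2 + (1 + gamma) * U - gamma * U**2."""
    return -0.5 + (1.0 + gamma) * u - gamma * u * u


def w_density(w, gamma):
    """Density of W on (-1/2, 1/2) implied by the map when |gamma| < 1."""
    return (1.0 + gamma * gamma - 4.0 * gamma * w) ** -0.5


def moment(power, gamma, steps=20000):
    """Midpoint-rule approximation of E[W**power] under w_density."""
    h = 1.0 / steps
    points = (-0.5 + (k + 0.5) * h for k in range(steps))
    return sum((w ** power) * w_density(w, gamma) * h for w in points)


gamma = 0.3
mean = moment(1, gamma)
variance = moment(2, gamma) - mean ** 2
print(moment(0, gamma))  # total mass, close to 1
print(mean, variance)    # compare with gamma/6 and 1/12 + gamma**2/180
```

For small values of the tilt parameter the density is close to $1 + 2\gamma w$, a linear tilt of the uniform density, which is what one expects for observations falling between two nearby order statistics of a smooth distribution.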

In the second artificial joint distribution, the marginal pdf for $T(n)$ and the conditional
pdfs for $W(1,n), \ldots, W(K_n, n)$ given $T(n)$ are the same as in $h_n^{(1)}$. Given $T(n)$, the $np_n - 1$
elements of $W(0,n)$ are distributed as IID random variables, each with pdf
$\frac{1}{\theta_2^0}\, g\left(\frac{x - \theta_1^0}{\theta_2^0}\right) \Big/ G\left(\frac{Y_{np_n} - \theta_1^0}{\theta_2^0}\right)$ for $x < Y_{np_n}$, zero if $x > Y_{np_n}$. Given $T(n)$, the $n - nq_n$
elements of $W(K_n + 1, n)$ are distributed as IID random variables, each with pdf
$\frac{1}{\theta_2^0}\, g\left(\frac{x - \theta_1^0}{\theta_2^0}\right) \Big/ \left(1 - G\left(\frac{Y_{nq_n} - \theta_1^0}{\theta_2^0}\right)\right)$ for $x > Y_{nq_n}$, zero if $x < Y_{nq_n}$. Denote the
resulting joint pdf for all $n$ elements by $h_n^{(3)}$.

If $S_n$ is any measurable region in $n$-dimensional space, let $P_{h_n^{(i)}}(S_n)$ denote the probability
assigned to $S_n$ by the pdf $h_n^{(i)}$. The next two sections are devoted to proving the following:

THEOREM 1: $\lim_{n \to \infty} \sup_{S_n} |P_{h_n^{(2)}}(S_n) - P_{h_n^{(1)}}(S_n)| = 0$.

THEOREM 2: $\lim_{n \to \infty} \sup_{S_n} |P_{h_n^{(3)}}(S_n) - P_{h_n^{(1)}}(S_n)| = 0$.




Let $h_n^{(4)}$ denote the joint pdf which differs from $h_n^{(2)}$ only in that $\gamma_j(n)$ is replaced by
$\bar{\gamma}_j(n)$, defined as $\frac{L_n\, f_n'(\bar{\alpha}_j(n))}{2n\, f_n^2(\bar{\alpha}_j(n))}$, where $\bar{\alpha}_j(n) = F_n^{-1}\left(\frac{np_n + (j - \frac{1}{2})L_n}{n}\right)$. It was shown in
[8] that $\lim_{n \to \infty} \sup_{S_n} |P_{h_n^{(4)}}(S_n) - P_{h_n^{(1)}}(S_n)| = 0$, and thus Theorem 1 will be proved if we can show
that $\lim_{n \to \infty} \sup_{S_n} |P_{h_n^{(2)}}(S_n) - P_{h_n^{(4)}}(S_n)| = 0$. By the reasoning used in [8], this last equality will
be demonstrated if we can show that

$\log \frac{h_n^{(2)}(T(n), W(0,n), \ldots, W(K_n + 1, n))}{h_n^{(4)}(T(n), W(0,n), \ldots, W(K_n + 1, n))} = R_n,$

say, converges stochastically to zero as $n$ increases, when the joint pdf is actually $h_n^{(2)}$. From
the definitions above, and the formula in [8], for all sufficiently large $n$ we can write $R_n$ as

(2.1) $\frac{1}{2} \sum_{j=1}^{K_n} \sum_{i=1}^{L_n - 1} \left\{ \log[1 + \bar{\gamma}_j^2(n) - 4\bar{\gamma}_j(n) W(i,j,n)] - \log[1 + \gamma_j^2(n) - 4\gamma_j(n) W(i,j,n)] \right\}$

where $W(i,j,n)$ have the same distribution as $-\frac{1}{2} + (1 + \gamma_j(n))\,U(i,j) - \gamma_j(n)\,U^2(i,j)$. We
show that the expression (2.1) converges stochastically to zero as $n$ increases by means of three
lemmas. (The order symbol $O(\ )$ used below has the usual interpretation.)

LEMMA 2.1: max |y,(«)|=0U 3+ * +2> 

PROOF: Directly from the assumptions and the definition of $\gamma_j(n)$.

LEMMA 2.2: $\sup_{p_n \le t \le q_n} |F_n^{-1}(t) - \{\theta_1^0 + \theta_2^0 G^{-1}(t)\}| = O(n^{-\epsilon + \gamma})$.


PROOF: Since /„(x) = ~ g 






(l + r„(x)), with \r„(x)\ < n € , we have F„{x) = 

— — C x 1 

+ R n (x), where R„(x) = J -y g 

y 2 



r n (t)dt, and thus \R„(x)\ < n~ e G 

. Then we can write F„ (x) = G 


(! + /?„ (x)), where \R„(x)\ < n' 1 for all 

x. Fix any value / in the closed interval [p„,q„]. Writing F n (x)= t = G 

we have x = F„ Ht) and G 
(2.2) G~ l 

We can write G~ x 

(^ 1 (r)-0 1 °) 




(! + /?„ (x)), 

l + R„(F- l U)) 

, so 

1 + n 


<'■-'<'>-"' < G 

1 + A?" 


= (T 1 */) - 


1 + «" 





1 - n~ € 



where r* is in the open 



1 + n 


, and thus 




- G~Kt) 



< n y , by assumption (1.5). Then sup 

n „■£'&„ 

1 + n 

that sup 

p,,^ 1 ^,, [ 1 — « 

using the inequalities (2.2). 

= 0(/7 €+y ). By a completely analogous argument, it can be shown 


= 0(/? e+y ). Then the lemma follows immediately, 

LEMMA 2.3: y An) - y,(«) + 8. ■ (/?), where max \b An)\ = 0U 3 



PROOF: By lemma 2.2, we can write y ,(«) as 

L^ f;,(9? + 0$a l (n)+8 l (n)) 
In ft(0? + e°a,(n)+8 l (n)) ' 

where max |8,-(/i)| = 0(n~ €+y ), f'„(x) = —z g 
\<j<K n 6" 

x - e 

x - 9 



r'„ (x) + (1 + /-„(*)) 

(0 2 ) 2 

, so we can write f' n {9\ + #2° <*/(«) + 8,(«)) as fn #'(<*,(«)) + 8 *(«), where 



max \8'(n) | = 0(n~ e+y ). We can also write /„(6>,° + 6> 2 V(«) + §,•(«)) as — - g(a ,■(/»)) + 
il/'S*,, #2 

8, («), and thus fj(9? + 2 °a/ («) + 8y(«» as — - j-y g 2 (a i (n)) + 8*(n), where max 

\9 2 ) 1 </ ^ AT n 

|8.-(/i)| = 0(n" e+y ) and max |8*(n)| = 0(n~' +y ). Thus we can write y An) as -^ 

i</<^ n In 

{((l/(02 O ) 2 )^(« / («))+8*(«))/((l/(e 2 o ) 2 )g 2 (a / («))+8*(«))}, and the proof of the lemma fol- 
lows directly from assumptions (1.3) and (1.5). 

Now we complete the proof of Theorem 1 by applying the expansion
$\log(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4(1 + \omega x)^4}$ for $|x| < 1$, where $|\omega| < 1$, to each of the logarithms in the
expression (2.1). This enables us to write the expression (2.1) as the sum of a finite number of
expressions, each of which can easily be shown to converge stochastically to zero as $n$ increases,
using the lemmas. For example, two of these expressions are:

(2.3) $\frac{1}{2} \sum_{j=1}^{K_n} \sum_{i=1}^{L_n - 1} (\bar{\gamma}_j^2(n) - \gamma_j^2(n))$, and

(2.4) $2 \sum_{j=1}^{K_n} \sum_{i=1}^{L_n - 1} (\gamma_j(n) - \bar{\gamma}_j(n))\, W(i,j,n)$.

The expression (2.3) is the sum of K„(L n -\) terms, where K n {L n -\) < n. A typical term 
can be written as {y f {n)-y j{n)) (y j(n) +y 7 («)), which by Lemmas 2.1 and 2.3 is 

-y-€ + 28+5y ±- e + 2 s + 5y 

0(« ). So the whole expression (2.3) is 0(n 

) and converges to zero as n 

K„ Z...-1 

increases, by assumption (1.5). The expected value of the expression (2.4) is 2 £ £ 

j= i /= l 

1 ] *n l„-\ 

y z ( « ) , and the variance of the expression (2.4) is 4 £ £ 

f- i /= i 

iy jin) - y j{n)) 



(y ,(n) - y ,{n)) : 

1 yf(n) 

12 180 

This mean and variance can both be seen to converge to zero 

as n increases by the same reasoning as in the analysis of the expression (2.3), and thus the 
expression (2.4) converges stochastically to zero as n increases. The other expressions in the 
sum comprising the expression (2.1) can be handled similarly, completing the proof of 
Theorem 1. 


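In Section 2 we showed that we can write $F_n(x) = G\left(\frac{x - \theta_1^0}{\theta_2^0}\right)(1 + R_n(x))$ where
$|R_n(x)| < n^{-\epsilon}$ for all $x$. We now develop an analogous expression for $1 - F_n(x)$.
$1 - F_n(x) = \int_x^{\infty} f_n(t)\,dt = \int_x^{\infty} \frac{1}{\theta_2^0}\, g\left(\frac{t - \theta_1^0}{\theta_2^0}\right)(1 + r_n(t))\,dt$, and since $|r_n(t)| < n^{-\epsilon}$
we can write $1 - F_n(x) = \left[1 - G\left(\frac{x - \theta_1^0}{\theta_2^0}\right)\right](1 + S_n(x))$ where $|S_n(x)| < n^{-\epsilon}$ for all $x$.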

Theorem 2 will be proved if we can show that

$\log \frac{h_n^{(1)}(T(n), W(0,n), \ldots, W(K_n + 1, n))}{h_n^{(3)}(T(n), W(0,n), \ldots, W(K_n + 1, n))} = R_n',$

say, converges stochastically to zero as $n$ increases, when the joint pdf is actually $h_n^{(1)}$.
Assuming $h_n^{(1)}$ is the joint pdf, the conditional (given $T(n)$) distribution of $R_n'$ is the same as the
distribution of $Q_n(1) + Q_n(2)$, where
$Q_n(1) = \sum_{i=1}^{np_n - 1} \log(1 + r_n(V_i)) - (np_n - 1)\log(1 + R_n(Y_{np_n}))$, and
$Q_n(2) = \sum_{i=1}^{n - nq_n} \log(1 + r_n(Z_i)) - (n - nq_n)\log(1 + S_n(Y_{nq_n}))$, and
$V_1, \ldots, V_{np_n - 1}, Z_1, \ldots, Z_{n - nq_n}$ are mutually independent, each $V_i$ with pdf
$\frac{f_n(v)}{F_n(Y_{np_n})}$ for $v < Y_{np_n}$, zero for $v > Y_{np_n}$, each $Z_i$ with pdf
$\frac{f_n(z)}{1 - F_n(Y_{nq_n})}$ for $z > Y_{nq_n}$, zero for $z < Y_{nq_n}$.

LEMMA 3.1: Q n (1) converges stochastically to zero as n increases. 

"/>„ - i 

PROOF: Define (?„(!) as 52 r »< K /) ~ (wa-1)/?„(K, V( ). By assumption 1.6 


|(?„(1) - Q„(l) | converges stochastically to zero as n increases. Thus the lemma will be provec 
if we show that Q n {\) converges stochastically to zero as n increases. 




E{r n {V}\T{n)) =■ 


(\ + r„(t))dt 

Y nPn ~ 9 1 


(l + R n (Y np )) 






r. M -2e, 

R n(Ynp) + a 

Y "P„ ~ 9 1 

e 2 ° 



(i + /?„(y B „)) 

where |uj < 1. From this, it follows that |£{r„(F,)| r(w)} - K„ ( 5^) I = O^ -2 *). This 
implies that E[Q n (\)\T(n)} converges to zero as n increases, and also that Variance 
{r n (Vi)\T(n)} = p in~ 2e ) which in turn implies that Variance [Q„ (1)| Tin)) converges sto- 
chastically to zero as n increases. These facts clearly imply that Q„i\) converges stochastically 
to zero as n increases. 

LEMMA 3.2: Q„ (2) converges stochastically to zero as n increases. 

_ "~" q » _ 

PROOF: Define £>„(2) as £ r niZ,) ~ ( n ~ n Qn^n (Y nq ). Just as in Lemma 3.1, all we 

have _to do is to prove that Q„(2) converges stochastically to zero as n increases. 
£{r„(Z,)|7>)} = 

f°° 1 '-0i° 


^«<?„ — ^ i 

[\ + S„{Y„ q )] 




+ (ii„n~ 


Y "q~ e \ 



Ynq., ~ ^ 1 


[l + S„{Y nQ )) 

where |a»„| < 1. From this, it follows that \E{r n {Z)\T(n)} - S„(Y nq )\ = Q p (n~ 2€ ). The rest of 
the proof is similar to the proof of Lemma 3.1. 

Lemmas 3.1 and 3.2 imply that $R_n'$ converges stochastically to zero as $n$ increases, and this
proves Theorem 2.


Theorem 1 implies that a statistician who knows only the vectors $T(n)$, $W(0,n)$,
$W(K_n + 1, n)$ is asymptotically as well off as a statistician who knows all the vectors $T(n)$,
$W(0,n)$, $W(1,n), \ldots, W(K_n + 1, n)$. This is so because given $T(n)$, using a table of random
numbers it is possible to generate additional random variables so the joint distribution of the
additional random variables and the elements of $T(n)$, $W(0,n)$, $W(K_n + 1, n)$ is the joint
distribution given by $h_n^{(2)}$. But Theorem 1 states that all probabilities computed using $h_n^{(2)}$ are
asymptotically the same as probabilities computed under the actual pdf $h_n^{(1)}$.



Theorem 2 implies that asymptotically the order statistics $\{Y_1, \ldots, Y_{np_n - 1}, Y_{nq_n + 1}, \ldots,
Y_n\}$ contain no information about $r_n(x)$. This is so because under $h_n^{(3)}$ the conditional
distribution (given $T(n)$) of these order statistics does not involve $r_n(x)$.

Taken together, the two theorems imply that a knowledge of T(n) is asymptotically as 
good as a knowledge of the whole sample, for the purpose of testing whether r n (x) = 0. This 
assumes that we have to deal only with the challenging alternatives described in Section 1, but 
less challenging alternatives do not pose any problem asymptotically. 


The results above were for the case where the unknown parameters are location and scale
parameters. In other cases, it may not be possible to choose $p_n$ and $q_n$ that will guarantee that
assumptions (1.5) and (1.6) hold for all $\theta_1, \ldots, \theta_m$, if we want $\lim_{n \to \infty} p_n = 0$ and $\lim_{n \to \infty} q_n = 1$.
But if we fix $p$ and $q$ with $0 < p < q < 1$, an analogue of Theorem 1 can often be proved with
$p_n$ replaced by $p$, $q_n$ replaced by $q$, and $\alpha_j(n)$, $\gamma_j(n)$

defined as F n ' 

np + 


. , 9, 

L^ fo(a i (n)'J l 9 J 

In fi( a ,(n)-9 u ... , 9 J 


where $\hat{\theta}_1, \ldots, \hat{\theta}_m$ are estimates of $\theta_1^0, \ldots, \theta_m^0$ based on $\{Y_{np}, Y_{np+L_n}, \ldots, Y_{nq}\}$. Then, if we
are willing to ignore departures from the hypothesis in the tails of the distribution, we can still
use only the order statistics $\{Y_{np}, Y_{np+L_n}, \ldots, Y_{nq}\}$.


For the case where $m = 2$ and $\theta_1$, $\theta_2$ are location and scale parameters respectively,
various tests based on $T(n)$ have been investigated in [2] and [6]. In particular, [2] contains
various analogues of the familiar Wilk-Shapiro test, first proposed in [3]. The tests in [2] and [6]
were based on $T(n)$ because it made the analysis easier. The present paper gives a theoretical
justification for basing tests on these sparse order statistics alone.

For the location and scale parameter case, we can construct other tests, as follows. Fc 

j= 0,1, .... K„, let V,(n) denote yfn f„ 


let Z.(n) denote Vn — „ g 

' 2 ° 


Y p - 

"P„ + < L n " 

Y p~ l 

1 np +jL r n 

np n +jL„ 

np„ + jL n 


It was shown in [4] that for all asymptotic probability calculations, we can assume that the joint
distribution of $\{V_0(n), \ldots, V_{K_n}(n)\}$ is given by the normal pdf




n(L n - 1) 

2L 2 


L„v K 


^T^tr + ,? (v '- v '- i): 

1 15 

Under the additional condition that — - e + 2y < 0, it can be shown that for all 

asymptotic probability calculations we can assume that the joint distribution of 
[Z (n), . . . , Z K («)} is given by the normal pdf just described. Then, if we define Pi as 


1 + 

V np,, 



Vl - Qn 

, p 2 as 

V np„ 

Pi, and the observable random variables Q , 

V^yj^\^/—(g(G- l ip„)) Y np +p xg {G-\q n ))Y nq 
L, 2 I V np„ 

/ n(L,-i: 


J np„+jL„ 

Y n „ +lL + P2 g(G^(q„)Y l 


np„+ij- \)L n 

np n + (/-!)/.„ 

for $j = 1, \ldots, K_n$, a straightforward computation shows that for all asymptotic probability
calculations we can assume that $Q_0, Q_1, \ldots, Q_{K_n}$ are independent, each with a normal
distribution with standard deviation $\theta_2^0$, and with

/ n(L,-\) 1 /7~ 
V L„ 2 I V np n 


/»„(0)+ P ,/»„(/:„) 

£10/} = 

n L " l [h„(j) - h n {j-\)+p 2 h„{K„)\, for j= 1, .... *„, 

where /?„(/) = £ 

L oo 


= 0;° + 2 U G 



np» + jL„ 


np„+jL n 

. If the hypothesis is true, F„ 

, and in this case we can write $E\{Q_j\} = A_n(j)\theta_1^0 + B_n(j)\theta_2^0$, where
$A_n(j)$, $B_n(j)$ are known, for $j = 0, \ldots, K_n$. So we have reduced our hypothesis testing
problem to the following: we observe random variables $Q_0, Q_1, \ldots, Q_{K_n}$ which are independent
and normal, each with the same standard deviation $\theta_2^0$, which is unknown. The problem is to
test the hypothesis that $E\{Q_j\} = A_n(j)\theta_1^0 + B_n(j)\theta_2^0$, for some unknown $\theta_1^0$, $\theta_2^0$, where $A_n(j)$ and
$B_n(j)$ are known values, for $j = 0, 1, \ldots, K_n$, against alternatives that $E\{Q_j\} = A_n(j)\theta_1^0 +
B_n(j)\theta_2^0 + \Delta_n(j)$, where $\Delta_n(j)$ is unknown.

The formulation of the problem just described makes it easy to construct various tests. 
For example, suppose for convenience that K n + 1 is a multiple of 4. Then it is possible to 

find — (AT„ + 1) sets of nonrandom quantities 

' = 0, 


X„(4/), X„(4/ + l), X„(4/ + 2), X„(4/ + 3); 

such that the* — (K,, + \) quantities £>„(/) = X„(4/)Q, +X„(4/ + !)(?,+,+ 











\„(4i + 2)Q i+2 + \ n (4i + 3)Q l+i /' = 0, 1, ... , — ' L j— can be assumed to be independent 

normal random variables, each with unknown standard deviation 0°, and with E[Q„(i)} = £ 

— _ /=0 

X„(4/4\/)A„(4/ +j) = A„(/), say, where A„(/) is unknown. Then the hypothesis to be tested 

is that A„(/) = for all /. But if we examine the development above, we see that (A„(/)} is 


not completely arbitrary. Instead, A„(/) = q„ 

, where q„ (v) is a continuous function 

K n -l 

of $v$ for $0 < v < 1$. If we have some particular alternative $q_n(v)$ against which to test the
hypothesis, a likelihood ratio test can be constructed. If we want to test against a very wide
class of alternatives, we could apply one of various nonparametric tests. For example, we could
base a test on the total number of runs of positive and negative elements in the sequence
$\{\bar{Q}_n(i)\}$. If the hypothesis is true, there should be a relatively large number of runs, but if the
hypothesis is false, neighboring $\bar{Q}_n(i)$'s would tend to have the same sign, decreasing the total
number of runs. Other tests for an analogous problem are developed in [7].
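
The runs statistic described above is simple to compute. The sketch below counts maximal blocks of consecutive elements with the same sign; the function name `count_sign_runs` and the choice to skip exact zeros are assumptions made here for illustration.

```python
def count_sign_runs(values):
    """Count maximal runs of consecutive same-sign elements, skipping exact zeros."""
    signs = [1 if v > 0 else -1 for v in values if v != 0]
    if not signs:
        return 0
    # A new run starts wherever the sign differs from the previous element.
    return 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)


alternating = [0.4, -1.1, 0.3, -0.2, 0.9, -0.5]
clustered = [0.4, 1.1, 0.3, -0.2, -0.9, -0.5]
print(count_sign_runs(alternating))  # 6
print(count_sign_runs(clustered))    # 2
```

A sequence whose neighbors tend to share a sign, like `clustered`, produces few runs, which is the behavior the text associates with a false hypothesis.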

j -£ 

In the case where g(x) — / — e 2 , all the conditions imposed above hold if we take p„ 

, _ x 1 1 Ai A 2 A 2 

= 1 - q„ - 0(« "), e = - - A,, 8 = — - — - A 2 , y = — - A 3 , p = — - A 3 - A 4 , 

where A b A 2 , A 3 , A 4 are very small positive values chosen so that e > 0, 8 > 0, y > 0, and p 

> 2A,. 


[1] Hensler, G.L., K.G. Mehrotra and J.E. Michalek, "A Goodness of Fit Test for Multivariate 
Normality," Communications in Statistics, A6, 33-41 (1977). 

[2] Jakobovits, R.H., "Goodness of Fit Tests for Composite Hypotheses Based on an Increasing 
Number of Order Statistics," Ph.D. Thesis, Cornell University (1977). 

[3] Shapiro, S.S. and M.B. Wilk, "An Analysis of Variance Test for Normality (Complete Sam- 
ples)," Biometrika, 52, 591-611 (1965). 

[4] Weiss, L., "Statistical Procedures Based on a Gradually Increasing Number of Order Statis- 
tics," Communications in Statistics, 2, 95-114 (1973). 

[5] Weiss, L., "The Asymptotic Sufficiency of a Relatively Small Number of Order Statistics in
Tests of Fit," Annals of Statistics, 2, 795-802 (1974).

[6] Weiss, L., "Testing Fit with Nuisance Location and Scale Parameters," Naval Research 
Logistics Quarterly, 22, 55-63 (1975). 

[7] Weiss, L., "Asymptotic Properties of Bayes Tests of Nonparametric Hypotheses," Statistical
Decision Theory and Related Topics II, Academic Press, 439-450 (1977).

[8] Weiss, L., "The Asymptotic Distribution of Order Statistics," Naval Research Logistics 
Quarterly, 26, 437-445 (1979). 


Karen Isaacson and C. B. Millham 

Washington State University 
Pullman, Washington 


This work is concerned with a particular class of bimatrix games, the set of 
equilibrium points of which games possess many of the properties of solutions 
to zero-sum games, including susceptibility to solution by linear programming. 
Results in a more general setting are also included. Some of the results are be- 
lieved to constitute interesting potential additions to elementary courses in 
game theory. 


A bimatrix game is defined by an ordered pair <A,B> of m x n matrices over an ordered
field F, together with the Cartesian product X x Y of all m-dimensional probability vectors
x ∈ X and all n-dimensional probability vectors y ∈ Y. If player 1 chooses a strategy (probability
vector) x and player 2 chooses a strategy y, the payoffs to the two players, respectively, are
xAy and xBy, where x and y are interpreted appropriately as row or column vectors. A pair
<x*,y*> in X x Y is an equilibrium point of the game <A,B> if x*Ay* ≥ xAy* and x*By* ≥
x*By, for all probability vectors x and y.
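
Because both payoffs are linear in each player's own strategy, the equilibrium inequalities need only be checked against pure-strategy deviations. The sketch below illustrates this under that observation; the function names and the example game are hypothetical, and exact rational arithmetic is used so that ties are compared without rounding.

```python
from fractions import Fraction as Fr


def payoff(x, M, y):
    """Bilinear form x M y for a row strategy x and a column strategy y."""
    return sum(x[i] * M[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))


def is_equilibrium(A, B, x, y):
    """True if neither player gains by deviating to any pure strategy."""
    m, n = len(A), len(A[0])
    a_star, b_star = payoff(x, A, y), payoff(x, B, y)
    rows_ok = all(sum(A[i][j] * y[j] for j in range(n)) <= a_star for i in range(m))
    cols_ok = all(sum(x[i] * B[i][j] for i in range(m)) <= b_star for j in range(n))
    return rows_ok and cols_ok


# Hypothetical example: matching pennies written as the bimatrix game <A, -A>.
A = [[1, -1], [-1, 1]]
B = [[-1, 1], [1, -1]]
half = [Fr(1, 2), Fr(1, 2)]
print(is_equilibrium(A, B, half, half))      # uniform mixing by both players
print(is_equilibrium(A, B, [1, 0], half))    # player 2 can profit by deviating
```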

A Nash-solvable bimatrix game is one in which, if <x*,y*> and <x',y'> are both equilibrium
points, then so are <x*,y'> and <x',y*>. It is well known that 0-sum bimatrix games
(a_ij + b_ij = 0, all i, j) are Nash-solvable, and that this property extends to constant-sum games
(a_ij + b_ij = k, all i, j, for some k ∈ F). It is also well known that in the constant-sum case all
equilibrium points are equivalent in that they provide the same payoffs to both players. This
work generalizes, slightly, that contained in such sources as Luce and Raiffa (9) and Burger
(2), and represents a very small step toward the solution of the open problem of characterizing
Nash-solvable games. In the following, A_r. will be the rth row of A and A._j the jth column of
A, and similarly for B. The inner product of 2 vectors u, v in E^n will be denoted by (u,v). The
ordered pair is <u,v>.


DEFINITION 1: An m x n bimatrix game <A,B> is row-constant-sum if, for each
i = 1, ..., m, there is a k_i ∈ F such that a_ij + b_ij = k_i, j = 1, ..., n.
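
As an illustration of Definition 1, the following sketch tests whether a pair of matrices is row-constant-sum and, if so, returns the row constants. The function name and the example matrices are hypothetical; the matrices are assumed to be given as lists of rows with exactly comparable entries (integers or fractions).

```python
def row_constants(A, B):
    """Return [k_1, ..., k_m] if a_ij + b_ij = k_i for every j, otherwise None."""
    constants = []
    for row_a, row_b in zip(A, B):
        sums = {a + b for a, b in zip(row_a, row_b)}
        if len(sums) != 1:
            return None          # some row has two different values of a_ij + b_ij
        constants.append(sums.pop())
    return constants


# Hypothetical example: rows sum to 0 and 2 respectively.
A = [[1, -1], [-1, 1]]
B = [[-1, 1], [3, 1]]
print(row_constants(A, B))
print(row_constants(A, [[-1, 1], [3, 2]]))   # second row is not constant-sum
```

A constant-sum game is the special case in which all returned constants coincide.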

THEOREM 1: Let <x*,y*> and <x',y'> be two equilibrium points for a row-constant-sum
game <A,B>. Then <x*,y*> and <x',y'> are interchangeable, and they are equivalent
for P1 (player 1). They are equivalent for P2 (player 2) if and only if
Σ_{i=1}^{m} x_i* k_i = Σ_{i=1}^{m} x_i' k_i.



PROOF: It is well known and easily proved that <x*,y*> is an equilibrium point for
<A,B> if and only if x_i* > 0 implies that (A_i., y*) = max_k (A_k., y*) and y_j* > 0 implies that
(x*,B._j) = max_k (x*,B._k), for all i, j. Accordingly, let β* = x*By*. Then y_j* > 0 implies
(x*,B._j) = β* = Σ_i x_i* k_i − (x*,A._j) ≥ Σ_i x_i* k_i − (x*,A._r) for all r, or (x*,A._j) ≤ (x*,A._r), and
x_i* > 0 implies (A_i.,y*) = α* = max_k (A_k.,y*). If <x',y'> is any equilibrium point, then we
have that x*Ay* ≥ x'Ay* (because x* is in equilibrium with y*) ≥ x'Ay' (because y' is in
equilibrium with x' and by the above argument) ≥ x*Ay' (because x' is in equilibrium with y')
≥ x*Ay* (because y* is in equilibrium with x* and by the above argument). Thus <x*,y*>
and <x',y'> are interchangeable for P1, and equivalent for P1. To show they are interchangeable
for P2, note that x'By' = Σ_i x_i' k_i − x'Ay' = Σ_i x_i' k_i − x'Ay*, or x'By' = x'By*. One can
similarly show that x*By* = x*By', completing this part of the proof.

Suppose now that Σ_i x_i' k_i = Σ_i x_i* k_i. Since x*Ay* = x'Ay*, we have that Σ_i x_i' k_i −
x'Ay* = Σ_i x_i* k_i − x*Ay*, or x'By* = x*By*, and equivalence follows.

On the other hand, suppose x*Ay* = x*Ay' = x'Ay* = x'Ay', x*By* = x*By' = x'By* =
x'By'. Then Σ_i x_i* k_i − x*Ay* = Σ_i x_i' k_i − x'Ay*. Since x'Ay* = x*Ay*, it follows that
Σ_i x_i* k_i = Σ_i x_i' k_i, and the proof is complete.


It is well known that, if A (= −B) is the payoff matrix for a zero-sum game, optimal strategies
<x*,y*> for the game satisfy the so-called "saddle-point" property: x*Ay ≥ x*Ay* ≥
xAy* for all probability vectors x and y, and that, conversely, if <x*,y*> is a saddle-point of
the function xAy, then <x*,y*> is a solution to the game A.

THEOREM 2: <x*,y*> is an equilibrium point of the row-constant-sum game <A,B>
if, and only if, <x*,y*> is a saddle-point of the function φ(x,y) = xAy.

PROOF: If <x*,y*> is an equilibrium point of <A,B>, then x*Ay* ≥ xAy* for all
x ∈ X, from which half of one implication follows. Now, let K be the m x n matrix

        | k_1  k_1  ...  k_1 |
    K = | k_2  k_2  ...  k_2 |
        | ...                |
        | k_m  k_m  ...  k_m |

of row constants k_i, a_ij + b_ij = k_i.

Since x*By* ≥ x*By for all y ∈ Y, we have x*(K − A)y* ≥ x*(K − A)y, from which
x*Ay ≥ x*Ay* [since x*Ky* = x*Ky = Σ_i x_i* k_i]. This completes one implication. Suppose now
that <x*,y*> is a saddle-point of φ. From x*Ay ≥ x*Ay* it follows that y_j* = 0 if (x*,A._j) >
α = min_k (x*,A._k), from which, if y_j* > 0, Σ_i x_i* k_i − (x*,A._j) ≥ Σ_i x_i* k_i − (x*,A._k) for all k, or
(x*,B._j) ≥ (x*,B._k) for all k. Finally, it follows from x*Ay* ≥ xAy* for all x that x_i* = 0 if
(A_i.,y*) < max_k (A_k.,y*), and the proof is complete.


The implication is that any solution of A as a 0-sum game is also an equilibrium point of
the row-constant-sum bimatrix game <A,B>, and conversely. Thus, a solution of A found by
linear programming will provide an equilibrium point <x*,y*> for <A,B> and the payoff α
for P1. The payoff β for P2 must be calculated via x*By*, or via Σ_i x_i* k_i − α.
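
The two ways of obtaining the payoff for P2 can be compared on a small case. The sketch below assumes a saddle point of A is already known (it does not run a linear program); the row constants k and the resulting matrix B are hypothetical, chosen only so that <A,B> is row-constant-sum.

```python
from fractions import Fraction as Fr


def bilinear(x, M, y):
    """Bilinear form x M y."""
    return sum(x[i] * M[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))


A = [[1, -1], [-1, 1]]                 # matching pennies, solved as a 0-sum game
k = [0, 2]                             # hypothetical row constants
B = [[k[i] - A[i][j] for j in range(2)] for i in range(2)]   # b_ij = k_i - a_ij
x = y = [Fr(1, 2), Fr(1, 2)]           # known saddle point of A

alpha = bilinear(x, A, y)              # payoff to P1
beta_direct = bilinear(x, B, y)        # payoff to P2 computed from B
beta_shortcut = sum(xi * ki for xi, ki in zip(x, k)) - alpha
print(alpha, beta_direct, beta_shortcut)
```

Both routes give the same value because x*By* = x*(K − A)y* and x*Ky* does not depend on y*.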



We now consider the m x n matrix A, we let B be m x n (not necessarily in row- 
constant-sum with A) and we henceforth let X x Y be the set of solutions to A as a 0-sum 
game. The following theorem then follows. 

THEOREM 3: Let <x*,y*> ∈ X x Y. In order for <x*,y*> to be an equilibrium point
of <A,B> regarded as a bimatrix game, it is necessary and sufficient that x*By* ≥ x*By for all
probability vectors y, or, equivalently, that x*(−B)y ≥ x*(−B)y*. It is clearly sufficient for <x*,y*> to also
be a solution to (−B), regarded as a 0-sum game.

The proof is omitted, as it follows immediately from the definition of equilibrium point.
The following comment is made, however: if <A,B> is row-constant-sum, a point <x*,y*>
that solves A as a 0-sum game and is an equilibrium point of <A,B> will not necessarily solve
(−B) as a 0-sum game, because the condition x*(−B)y* ≥ x(−B)y* holds if and only if

x*Ay* − Σ_{i=1}^{m} x_i* k_i ≥ xAy* − Σ_{i=1}^{m} x_i k_i,   or   x*Ay* − xAy* ≥ Σ_{i=1}^{m} k_i (x_i* − x_i).

Thus, the condition that <x*,y*> also solve (−B) as a 0-sum game is extremely strong. This illustrates a major
difference between the constant-sum case (in which the above condition will hold if <x*,y*>
solves A as a 0-sum game) and the row-constant-sum case. It is also logical to ask if there are
conditions on A and B which would cause an equilibrium point of <A,B> to also solve A and
−B as separate 0-sum games. The conditions are inescapable: y_j* > 0 must imply
(x*,A._j) = min_k (x*,A._k) and x_r* > 0 must imply (B_r.,y*) = min_k (B_k.,y*). Since, for example,
to be an equilibrium point of <A,B> it is necessary that y_j* > 0 imply
(x*,B._j) = max_k (x*,B._k), any game satisfying these conditions must be heavily restricted.

Finally, it is noted that if there are common saddle-points of A and (−B), which are therefore
equilibrium points of the bimatrix game <A,B>, each of these saddle-points will necessarily
provide the same payoffs α, β to the respective players (note the contrast of the row-constant-sum
case with the constant-sum case).

DEFINITION 2: A Nash Subset for a game <A,B> is a set S = {<x,y>} of equilibrium
points for <A,B> such that, if <x,y> and <x',y'> are in S, so are <x,y'> and
<x',y>. See (6) and (13) for related material.

THEOREM 4: Let A and B be m x n matrices over the ordered field F, and let X x Y be
the set of all solutions to A regarded as a 0-sum game. In order for X x Y to constitute a Nash
subset of equilibrium points for <A,B>, regarded as a bimatrix game, it is necessary and
sufficient that K(X) = {k | (x*,A._k) = min_i (x*,A._i), all x* ∈ X} ⊂ K'(X) = {k | (x*,B._k) =
max_i (x*,B._i), all x* ∈ X}.
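
The index sets K(X) and K'(X) can be computed directly once a finite list of strategies for player 1 is available. The sketch below is illustrative only: it assumes the caller supplies such a list (for example, the extreme optimal strategies of A), uses 0-based column indices, and all function names and example matrices are hypothetical.

```python
from fractions import Fraction as Fr


def column_values(x, M):
    """Inner products (x, M._k) for every column k of M."""
    return [sum(x[i] * M[i][k] for i in range(len(x))) for k in range(len(M[0]))]


def columns_attaining(strategies, M, pick):
    """Columns k where (x, M._k) equals pick over all columns, for every listed x."""
    keep = set(range(len(M[0])))
    for x in strategies:
        values = column_values(x, M)
        target = pick(values)
        keep &= {k for k, v in enumerate(values) if v == target}
    return keep


def inclusion_holds(strategies, A, B):
    """Check that K(X) is contained in K'(X) on the supplied strategies."""
    return columns_attaining(strategies, A, min) <= columns_attaining(strategies, B, max)


A = [[1, -1], [-1, 1]]
X_opt = [[Fr(1, 2), Fr(1, 2)]]                        # optimal strategy of A for player 1
print(inclusion_holds(X_opt, A, [[-1, 1], [3, 1]]))   # both columns of B tie at the maximum
print(inclusion_holds(X_opt, A, [[0, 1], [0, 0]]))    # only the second column of B is maximal
```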

PROOF: Write K = K(X), K' = K'(X), and let K ⊂ K'. Then because any <x*,y*> in
X x Y solves A as a 0-sum game, x*Ay ≥ x*Ay* ≥ xAy* for all <x*,y*> in X x Y and all
probability vectors x, y. Also, y_j* = 0 if (x*,A._j) > min_i (x*,A._i), or if j ∉ K ⊂ K'. Hence
y_j* = 0 if (x*,B._j) < max_i (x*,B._i) for all y* ∈ Y, any x* ∈ X, and <x*,y*> is an equilibrium
point for <A,B>, for all <x*,y*> ∈ X x Y. Suppose there exists k' ∈ K − K', so that for
some x* ∈ X, (x*,B._k') < max_i (x*,B._i) but (x*,A._k') = min_i (x*,A._i). Since it is known that
there exists y' ∈ Y (see (1), page 52) such that y'_k' > 0, it follows that y' cannot be in equilibrium
with x* for <A,B> regarded as a bimatrix game, a contradiction. This completes the proof.

COROLLARY 1: Let X* x Y* be any subset of X x Y, the set of all solutions to A
regarded as a 0-sum game. In order for X* x Y* to be a set of interchangeable equilibrium
points (a Nash subset) for <A,B> regarded as a bimatrix game, it is sufficient that K(X*) =
{k | (x*,A._k) = min_i (x*,A._i) for all x* ∈ X*} = K'(X*) = {k | (x*,B._k) = max_i (x*,B._i) for all x*
∈ X*}.

COROLLARY 2: Let X' ⊂ X, and let K'(X') be defined as above, and let
Y' = {y ∈ Y | y_j > 0 implies j ∈ K'(X')}. Then X' x Y' is a Nash subset for <A,B>.

Finally, we consider the construction of all matrices B such that X x Y, the set of solutions
to A as a 0-sum game, will also be a set of equilibrium points for <A,B> regarded as a
bimatrix game.

THEOREM 5: Let A be an m x n matrix over F, with X x Y its solutions as a 0-sum
game. Then a matrix B can be constructed such that X x Y is a Nash subset for <A,B>
regarded as a bimatrix game. The equilibrium points <x,y> in X x Y may or may not be
equivalent for P2, depending on construction. Further, all matrices B such that X x Y is a
Nash subset for <A,B> are constructed as described.

PROOF: Let x^1, x^2, ..., x^k be the extreme points of X, and assume that x^1, ..., x^r, r ≤ k,
are a maximal linearly-independent subset of x^1, ..., x^k. Let x̄ = [x^1; ...; x^r] (the matrix
with rows x^1, ..., x^r) be regarded as the matrix of a linear transformation from E^m to E^r,
taken with respect to a basis of unit vectors, and let c^1, c^2, ..., c^{m−r} be a basis for the
nullspace of x̄. Let β_1, β_2, ..., β_r be scalars. Let y^1, y^2, ..., y^t be the extreme points of the
set Y, and let K_Y^i = {j | y_j^i > 0}. Let K_Y = ∪_i K_Y^i.

Let D = {d | (x^j,d) = β_j, 1 ≤ j ≤ r}, and let d^1, ..., d^{m−r+1} be m − r + 1 (if some β_j ≠ 0)
linearly-independent solutions to the system of r equations in m variables. For j ∈ K_Y, let

B._j = Σ_{i=1}^{m−r+1} a_ij d^i + Σ_{i=1}^{m−r} λ_ij c^i,   where Σ_{i=1}^{m−r+1} a_ij = 1 (or at least, Σ_i a_ij = a for some a ≠ 0), all j.

Then, if x ∈ X, there are scalars γ_i, i = 1, ..., r, such that x = Σ_{i=1}^{r} γ_i x^i, and for j ∈ K_Y,

(x,B._j) = (Σ_{t=1}^{r} γ_t x^t, Σ_{i=1}^{m−r+1} a_ij d^i + Σ_{i=1}^{m−r} λ_ij c^i) = Σ_{t=1}^{r} γ_t β_t   (or a Σ_t γ_t β_t).

After all B._j, j ∈ K_Y, have been constructed, for j ∉ K_Y, let B._j be such that
(x^i,B._j) < (x^i,B._h), h ∈ K_Y, for all extreme points x^i, i = 1, ..., k. Then, for all y* ∈ Y, x* ∈ X
with x* = Σ_i γ_i x^i, x*By* = Σ_i γ_i β_i ≥ x*By for all probability vectors y. Hence,
X x Y is a set of interchangeable equilibrium points for <A,B> that would, for example, be
equivalent if β_i = β_j for all i, j.

Finally, suppose there is a matrix B such that X x Y is a Nash subset for <A,B> but
which does not have the above construction. Then there is a column B._j, j ∈ K_Y, such that
either B._j ≠ Σ_{i=1}^{m−r+1} a_ij d^i + Σ_{i=1}^{m−r} λ_ij c^i for any coefficients a_ij, λ_ij, or
B._j = Σ_{i=1}^{m−r+1} a_ij d^i + Σ_{i=1}^{m−r} λ_ij c^i but Σ_i a_ij = a_j ≠ a_k = 1, k ∈ K_Y, k ≠ j.
In the first instance we note (x^i,B._j) = β_i, i = 1, ..., r,
and we contradict the assumption that d^1, ..., d^{m−r+1} are a maximal linearly-independent set of
solutions to (x^j,d) = β_j, j = 1, ..., r. In the second instance, if Σ_i a_ij = a_j ≠ 1, let x =
Σ_{i=1}^{r} γ_i x^i. Then (x,B._j) = a_j Σ_{i=1}^{r} γ_i β_i ≠ (x,B._k) = Σ_{i=1}^{r} γ_i β_i for other k ∈ K_Y, so that any
equilibrium strategy y will either exclude j, or include j and exclude any k such that a_k = 1.
Either contradicts the definition of K_Y.

Note that the matrix A is used only to define X x Y. Given the set X x Y, it follows
that both A and B could be constructed as described, assuming the appropriate dimensionality.


It is hoped that this slight extension of previously published material regarding Nash- 
solvable bimatrix games will lend itself to inclusion in future texts in game theory and opera- 
tions research covering 2-person, 0-sum finite games (matrix games). Clearly, nearly any state- 
ment that can be made about solutions of matrix games can also be made about the somewhat 
more interesting row-constant-sum bimatrix case, and the usual methods for finding such solu- 
tions carry over with the minor modifications indicated. The reader is also referred to the 
excellent text by Vorobjev (21), and his discussion on "almost antagonistic" bimatrix games 
(pp. 103-115) for related interesting material. 


[1] Bohnenblust, H.F., S. Karlin, and L.S. Shapley, "Solutions of Discrete, Two-Person 

Games," Contributions to the Theory of Games, Annals of Mathematics, Studies 24, 

Princeton University Press (1950). 
[2] Burger, E. Theory of Games. Prentice-Hall, Englewood Cliffs, New Jersey (1963). 
[3] Gale, D. and S. Sherman, "Solutions of Finite Two-Person Games," Contributions to the 

Theory of Games, Annals of Mathematics, Studies 24, Princeton University Press 

[4] Heuer, G.A., "On Completely Mixed Strategies in Bimatrix Games," The Journal of the 

London Mathematical Society, 2, 17-20 (1975). 
[5] Heuer, G.A., "Uniqueness of Equilibrium Points in Bimatrix Games," International Jour- 
nal of Game Theory, 8, 13-25 (1979). 
[6] Heuer, G.A., and C.B. Millham, "On Nash Subsets and Mobility Chains in Bimatrix

Games," Naval Research Logistics Quarterly 23, 311-319 (1976).
[7] Kuhn, H.W., "An Algorithm for Equilibrium Points in Bimatrix Games," Proceedings of 

the National Academy of Sciences, 47, 1656-1662 (1961). 
[8] Lemke, C.E. and J.T. Howson, Jr., "Equilibrium Points of Bimatrix Games," Journal of the 

Society for Industrial and Applied Mathematics, 12, 413-423 (1964). 
[9] Luce, R.D., and H. Raiffa, Games and Decisions, John Wiley and Sons, New York (1957). 
[10] Mangasarian, O.L., "Equilibrium Points of Bimatrix Games," Journal of the Society for

Industrial and Applied Mathematics, 12, 778-780 (1964). 
[11] Millham, C.B., "On the Structure of Equilibrium Points in Bimatrix Games," SIAM Review 

10, 447-449 (1968). 


[12] Millham, C.B., "Constructing Bimatrix Games with Special Properties," Naval Research 

Logistics Quarterly 19, 709-714 (1972). 
[13] Millham, C.B., "On Nash Subsets of Bimatrix Games," Naval Research Logistics Quarterly 

21, 307-317 (1974). 
[14] Mills, H., "Equilibrium Points in Finite Games," Journal of the Society for Industrial and 

Applied Mathematics, 8, 397-402 (1960). 
[15] Nash, J.F. Jr., "Two-Person Cooperative Games," Econometrica, 21, 128-140 (1953). 
[16] Owen, G., "Optimal Threat Strategies in Bimatrix Games," International Journal of Game 

Theory, 1, 3-9 (1971). 
[17] Pugh, G.E. and J. P. Mayberry, "Theory of Measure of Effectiveness for General-Purpose 

Military Forces, Part I: A Zero-Sum Payoff Appropriate for Evaluating Combat Stra- 
tegies," Operations Research 21, 867-885 (1973). 
[18] Raghavan, T.E.S., "Completely Mixed Strategies in Bimatrix Games," The Journal of the 

London Mathematical Society, 2, 709-712 (1970). 
[19] von Neumann, J. and O. Morgenstern, Theory of Games and Economic Behavior, Princeton

University Press, Princeton, New Jersey (1953), 3rd Ed. 
[20] Vorobjev, N.N., "Equilibrium Points in Bimatrix Games," Theoriya Veroyatnostej i ee 

Primeneniya 3, 318-331 (1958). 
[21] Vorobjev, N.N., Game Theory Springer-Verlag, New York, Heidelberg, Berlin (1977). 


A. Ben-Tal 

Department of Computer Science 

Technion — Israel Institute of Technology 

Haifa, Israel 

L. Kerzner 

National Defence 
Ottawa, Canada 

S. Zlobec 

Department of Mathematics 

McGill University
Montreal, Quebec, Canada 


This paper gives characterizations of optimal solutions for convex semi-infinite
programming problems. These characterizations are free of a constraint
qualification assumption. Thus they overcome the deficiencies of the semi-infinite
versions of the Fritz John and the Kuhn-Tucker theories, which give
only necessary or sufficient conditions for optimality, but not both.


A mathematical programming problem with infinitely many constraints is termed a "semi- 
infinite programming problem." Such problems occur in many situations including production 
scheduling [10], air pollution problems [6], [7], approximation theory [5], statistics and proba- 
bility [9]. For a rather extensive bibliography on semi-infinite programming the reader is 
referred to [8]. 

The purpose of this paper is to give necessary and sufficient conditions of optimality for 
convex semi-infinite programming problems. It is well known that the semi-infinite versions of 
both the Fritz John and the Kuhn-Tucker theories fail to characterize optimality (even in the 
linear case) unless a certain hypothesis, known as a "constraint qualification," is imposed on the 
problem, e.g. [4], [12]. This paper gives a characterization of optimality without assuming a 
constraint qualification. 

*This research was partially supported by Project No. NR047-021, ONR Contract N00014-75-C0569 with the Center for
Cybernetic Studies, The University of Texas and by the National Research Council of Canada.



Characterization theorems without a constraint qualification for ordinary (i.e. with a finite 
number of constraints) mathematical programming problems have been obtained in [1]. It 
should be noted that the analysis of the semi-infinite case is significantly different; the special 
feature being here the topological properties of all constraint functions including the particular 
role played by the nonbinding constraints. 

The optimality conditions are given in Section 2 for differentiable convex semi-infinite
programming programs, whose constraint functions have the "uniform mean value property."
This class of programs is quite large and it includes programs with arbitrary convex objective 
functions and linear or strictly convex constraints. For a particular class of such programs, 
namely the programs with "uniformly decreasing" constraint functions, the optimality conditions 
can be strengthened, as shown in Section 4. A comparison with the semi-infinite analogs of the 
Fritz John and Kuhn-Tucker theories is presented in Section 5. An application to the problem 
of best linear Chebyshev approximation with constraints is demonstrated in Section 6. A linear 
semi-infinite problem taken from [4], for which the Kuhn-Tucker theory fails, is solved in this 
section using our results. 


Consider the convex semi-infinite programming problem

(P)   Min f^0(x)
      s.t.
      f^k(x,t) ≤ 0 for all t ∈ T_k, k ∈ P ≜ {1, ..., p}
      x ∈ R^n

where

f^0 is convex and differentiable,

f^k(x,t) is convex and differentiable in x for every t ∈ T_k and continuous in t for every x,

T_k is a compact subset of R^l (l ≥ 1).

The feasible set of problem (P) is

F = {x ∈ R^n : f^k(x,t) ≤ 0 for all t ∈ T_k, k ∈ P}.

Note that F is a convex set, being the intersection of convex sets.

For x* ∈ F,

T_k* ≜ {t ∈ T_k : f^k(x*,t) = 0},

P* ≜ {k ∈ P : T_k* ≠ ∅}.

A vector d ∈ R^n is called a feasible direction at x* if x* + d ∈ F. For a given function f^k(·,t),
k ∈ {0} ∪ P and for a fixed t ∈ T_k, we define

D_k(x*,t) ≜ {d ∈ R^n : ∃ ᾱ > 0 ∋ f^k(x* + αd,t) = f^k(x*,t) for all 0 ≤ α ≤ ᾱ}.

This set is called the cone of directions of constancy in [1], where it has been shown that, for a
differentiable convex function f^k(·,t), it is a convex cone contained in the subspace

{d : d'∇f^k(x*,t) = 0}.

Furthermore, if f^k(·,t) is an analytic convex function, then D_k(x*,t) is a subspace (not depending
on x*), see [1, Example 4]. In the sequel the derivative of f with respect to x, i.e.
∇_x f(x,t), is denoted by ∇f(x,t).

Optimality conditions will be given for problem (P) if the constraint functions have the 
"uniform mean value property" which is defined as follows. 

DEFINITION 1: Let T be a compact set in R^l. A function f : R^n x T → R has the uniform
mean value property at x ∈ R^n if, for every nonzero d ∈ R^n and every ᾱ > 0, there
exists α = α(d,ᾱ), 0 < α ≤ ᾱ such that

(MV) f(x + ᾱd,t) − f(x,t) ≥ ᾱ d'∇f(x + αd,t) for every t ∈ T.

If f(·,t) is a linear function in x for every t ∈ T, i.e. if f is of the form

f(x,t) = g_0(t) + Σ_{i=1}^{n} x_i g_i(t),

or if f(·,t) is a differentiable strictly convex function in x for every t ∈ T, i.e. if

f(λx + (1 − λ)y,t) < λf(x,t) + (1 − λ)f(y,t) for every t ∈ T

where y ∈ R^n is arbitrary, y ≠ x, 0 < λ < 1, and if f(x,·) is continuous in t for every x, then
f has the uniform mean value property. For a linear function f one finds d'∇f(x + αd,t) =
Σ_{i=1}^{n} d_i g_i(t) and (MV) is obviously satisfied. The mean value property for strictly convex
functions follows immediately from e.g. [14, Corollary 25.5.1 and Theorem 25.7].

EXAMPLE 1: Function

f^1(x,t) = t²[(x − t)² − t²] for every t ∈ T = [0, 1]

is neither linear nor strictly convex in x ∈ R for every t ∈ T. However f^1 has the uniform
mean value property. Function

f^2(x_1,x_2,t) = x_1² + t x_2 (x_2 − t)                            if x_2 < t/2
               = x_1² + [t³/(2 − t)²] (x_2 − t + 1)(x_2 − 1)      if x_2 ≥ t/2

for every t ∈ T = [0, 1] does not have the uniform mean value property at the origin. Note
that f^2 is convex and differentiable in x ∈ R² for every t ∈ T and continuous in t ∈ T for
every x. This function has provided counterexamples to some of our early conjectures.
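
For the first function of Example 1 the uniform mean value property can be verified by hand. The short derivation below is a sketch that reads (MV) as f(x + ᾱd,t) − f(x,t) ≥ ᾱ d'∇f(x + αd,t); that reading is an assumption consistent with inequality (3) in the proof of Theorem 1. The choice α = ᾱ/2 is the one used here; any smaller positive value would serve as well.

```latex
\begin{align*}
f^{1}(x,t) &= t^{2}\bigl[(x-t)^{2}-t^{2}\bigr],
\qquad \frac{\partial f^{1}}{\partial x}(x,t) = 2t^{2}(x-t),\\[4pt]
f^{1}(x+\bar\alpha d,t)-f^{1}(x,t)
  &= t^{2}\bigl[(x+\bar\alpha d-t)^{2}-(x-t)^{2}\bigr]
   = t^{2}\bigl[2\bar\alpha d\,(x-t)+\bar\alpha^{2}d^{2}\bigr],\\[4pt]
\bar\alpha\, d\,\frac{\partial f^{1}}{\partial x}(x+\alpha d,t)
  &= 2\bar\alpha\, d\, t^{2}(x+\alpha d-t)
   = t^{2}\bigl[2\bar\alpha d\,(x-t)+2\bar\alpha\alpha d^{2}\bigr],\\[4pt]
\text{difference}
  &= t^{2}\,\bar\alpha\, d^{2}\,(\bar\alpha-2\alpha)\;\ge\;0
  \quad\text{whenever } 0<\alpha\le \bar\alpha/2 .
\end{align*}
```

The admissible α does not depend on t, which is exactly the uniformity the definition asks for; at t = 0 both sides vanish, so the inequality holds trivially there.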

Optimality conditions will now be given for problem (P).

THEOREM 1: Let x* be a feasible solution of problem (P) where f^k, k ∈ P* have the
uniform mean value property. Then x* is an optimal solution of (P) if, and only if, for every
α* > 0 the system

(A) d'∇f^0(x*) < 0,

(B) d'∇f^k(x* + α*d,t) ≤ 0 for all t ∈ T_k*,

(C) d'∇f^k(x* + α*d,t) / f^k(x*,t) ≥ −1/α* for all t ∈ T_k\T_k*,

k ∈ P*

is inconsistent.

PROOF: We will show that x* is nonoptimal if, and only if, there exists α* > 0 such
that the system (A), (B), (C) is consistent. A feasible x* is nonoptimal if, and only if, there
exist ᾱ > 0 and d ∈ R^n, d ≠ 0, such that

(1) f^0(x* + ᾱd) < f^0(x*)

(2) f^k(x* + ᾱd,t) ≤ 0 for every t ∈ T_k, k ∈ P.

By the convexity of f^0 and the gradient inequality, the existence of ᾱ > 0 satisfying (1) is
equivalent to

d'∇f^0(x*) < 0.

By the continuity of f^k(·,t), k ∈ P, the constraints with k ∈ P\P* can be omitted from discussion.
We consider (2), for some given k ∈ P*, and discuss separately the two cases: t ∈ T_k*
and t ∈ T_k\T_k*. Thus (2) can be written

(2-a) f^k(x* + ᾱd,t) ≤ 0 for every t ∈ T_k*

(2-b) f^k(x* + ᾱd,t) ≤ 0 for every t ∈ T_k\T_k*.

Consider first (2-a) for some fixed k ∈ P*. By the convexity and uniform mean value property
of f^k,

(3) f^k(x* + ᾱd,t) ≥ f^k(x*,t) + ᾱ d'∇f^k(x* + α_k d,t) for all t ∈ T_k*

and for some

0 < α_k ≤ ᾱ.

Since t ∈ T_k* and ᾱ > 0, (2-a) implies

(4) d'∇f^k(x* + α_k d,t) ≤ 0.

Let

(5) α̃ = min_{k ∈ P*} {α_k}.

Clearly, α̃ always exists (since P is finite) and it is positive. By the convexity of f^k(·,t), (5)
and (4),

(6) d'∇f^k(x* + α̃d,t) ≤ d'∇f^k(x* + α_k d,t) ≤ 0.

On the other hand, the existence of α* > 0 such that, for all t ∈ T_k* and all k ∈ P*,

d'∇f^k(x* + α*d,t) ≤ 0

implies (2-a) with 0 < ᾱ ≤ α*.



It is left to show that the existence of ᾱ > 0, such that (2-b) holds, is equivalent to the
existence of α* > 0, such that (C) holds. Suppose that (2-b) holds for some ᾱ > 0. Then, by
the convexity and uniform mean value property, for k ∈ P*,

0 ≥ f^k(x* + ᾱd,t) ≥ f^k(x*,t) + ᾱ d'∇f^k(x* + α_k d,t) for all t ∈ T_k\T_k*

and for some

(7) 0 < α_k ≤ ᾱ.

Hence

(8) d'∇f^k(x* + α_k d,t) / f^k(x*,t) ≥ −1/ᾱ, since t ∈ T_k\T_k*
                                      ≥ −1/α_k, by (7).

Let

(9) α̂ = min_{k ∈ P*} {α_k} > 0.

Using the monotonicity of the gradient of the convex function f^k(·,t), one obtains here

(10) d'∇f^k(x* + αd,t) / f^k(x*,t) ≥ d'∇f^k(x* + α_k d,t) / f^k(x*,t) for every 0 < α ≤ α_k.

This gives

d'∇f^k(x* + α̂d,t) / f^k(x*,t) ≥ −1/α_k, by (10) and (8)
                               ≥ −1/α̂, by (9)

which is (C) with α* = α̂.

Suppose now that (C) is true for some α* > 0. Using again the monotonicity of the gradient
of the convex function f^k(·,t), and the fact that f^k(x*,t) < 0 for t ∈ T_k\T_k*, one easily
obtains

(11) f^k(x*,t) + α* d'∇f^k(x* + αd,t) ≤ 0, for every 0 < α ≤ α*.

Now

f^k(x* + α*d,t) = f^k(x*,t) + α* d'∇f^k(x* + α_k d,t),
                  for some particular 0 < α_k < α*, α_k = α_k(t), by the mean value theorem
                ≤ 0, by (11)

which is (2-b) with ᾱ = α*.

Summarizing the above results one derives the following conclusion: If x* is not optimal
then there exists α* = min{α̃, α̂} > 0 such that the system (A), (B) and (C) is consistent. If
there exists α* > 0 such that the system (A), (B) and (C) is consistent, then there exist
α_0 > 0 and ᾱ > 0 such that

(12) f^0(x* + α_0 d) < f^0(x*)
     f^k(x* + ᾱd,t) ≤ 0 for every t ∈ T_k*
     f^k(x* + ᾱd,t) ≤ 0 for every t ∈ T_k\T_k*
     k ∈ P*.

If one denotes

α = min{α_0, ᾱ} > 0

then, again by the convexity of f^k(·,t), k ∈ {0} ∪ P, (12) can be written

f^0(x* + αd) < f^0(x*)
f^k(x* + αd,t) ≤ 0 for every t ∈ T_k,
k ∈ P*

implying that x* is not optimal.

REMARK 1: Since ∇f^k(x,·) is continuous for every x in some neighbourhood of x*
(this follows from e.g. [14, Theorem 25.7]), condition (C) in Theorem 1 needs checking only
at the points t ∈ T_k which are in

N_k ≜ ∪_{t* ∈ T_k*} N(t*),

where N(t*) is a fixed open neighbourhood of t*. For the points t in T_k\N_k one can always find
α* which satisfies (C). This follows from the fact that for every ᾱ > 0,

(13) d'∇f^k(x* + ᾱd,t) / f^k(x*,t) ≥ −M

for some positive constant M, by the compactness of T_k\N_k. Choose M in (13) large enough,
so that

(14) α* ≜ 1/M ≤ ᾱ.

Then

d'∇f^k(x* + α*d,t) / f^k(x*,t) ≥ d'∇f^k(x* + ᾱd,t) / f^k(x*,t) ≥ −M = −1/α*, by (13) and (14).

EXAMPLE 2: The purpose of this example is to show that Theorem 1 fails if the constraint
functions do not have the uniform mean value property. Consider

Min −x_2

subject to

f(x_1,x_2,t) ≤ 0 for all t ∈ [0,1]

where

f(x_1,x_2,t) = x_1² + t x_2 (x_2 − t)                            if x_2 < t/2
             = x_1² + [t³/(2 − t)²] (x_2 − t + 1)(x_2 − 1)      if x_2 ≥ t/2.

Function f satisfies the assumptions of problem (P) but it does not enjoy the uniform mean
value property. The feasible set is

F = {(0, x_2)' ∈ R² : 0 ≤ x_2 ≤ 1}

and the optimal solution is x = (0, 1)'. However, for every α* > 0, the system (A), (B) and
(C) is inconsistent at x* = 0, a nonoptimal point. Since T* = [0,1], condition (C) is here
redundant, while (A) and (B) become, respectively,

−d_2 < 0

2α*d_1² + t(2α*d_2 − t)d_2 ≤ 0                     if 2α*d_2 < t

2α*d_1² + [t³/(2 − t)²](2α*d_2 − t)d_2 ≤ 0        if 2α*d_2 ≥ t.

The above system cannot be consistent for some α* > 0, because, if it were, the last inequality
would be absurd for small t ∈ [0,1].

When the constraint functions (but not necessarily the objective function) are linear, i.e.
when (P) is of the form

(L)   Min f^0(x)
      s.t.
      g_0^k(t) + Σ_{i=1}^{n} x_i g_i^k(t) ≤ 0, for all t ∈ T_k, k ∈ P

then Theorem 1 can be considerably simplified.

COROLLARY 1: Let x* be a feasible solution of problem (L). Then x* is optimal if, and
only if, the system

(A)   d'∇f^0(x*) < 0

(B_2) Σ_{i=1}^{n} d_i g_i^k(t) ≤ 0, for all t ∈ T_k*

(C_1) Σ_{i=1}^{n} d_i g_i^k(t) / [g_0^k(t) + Σ_{i=1}^{n} x_i* g_i^k(t)] ≥ −1, for all t ∈ T_k\T_k*,

k ∈ P*

is inconsistent.
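
To illustrate how the three conditions of the linear case interact, the sketch below tests a given direction d on a finite grid of T for one small problem. The problem (minimize −x subject to t·x − t² ≤ 0 for t in [0,1]), the function name, the grid and the tolerance are all hypothetical choices made for illustration; they do not come from the paper. The feasible set of this problem is x ≤ 0, so x* = 0 is optimal, and the constraint at t = 0 is binding at every x.

```python
def direction_satisfies_system(x, d, grid, tol=1e-12):
    """Check (A), (B_2), (C_1) at x for min -x s.t. t*x - t**2 <= 0, t in grid."""
    if not (-d < 0):                   # (A): d * f0'(x) < 0 with f0(x) = -x
        return False
    for t in grid:
        g0, g1 = -t * t, t             # constraint written as g0(t) + x * g1(t)
        value = g0 + x * g1
        if abs(value) <= tol:          # t belongs to T*: the constraint is binding
            if d * g1 > 0:             # (B_2) is violated
                return False
        elif d * g1 / value < -1:      # (C_1) is violated
            return False
    return True


grid = [i / 1000 for i in range(1001)]
print(direction_satisfies_system(0.0, 0.5, grid))    # at the optimum x* = 0
print(direction_satisfies_system(-1.0, 1.0, grid))   # at the nonoptimal point x* = -1
```

At x* = 0 condition (B_2) holds for every d, and it is (C_1) near t = 0 that rules out each d > 0. A finite grid can only approximate this: a direction shorter than the grid spacing would pass the check, so the sketch is suited to refuting a particular d, not to certifying that the system is inconsistent.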



PROOF: Recall that linear functions have the uniform mean value property. If f^k(·,t) is
linear, then for every t ∈ T_k

D_k(x,t) = {d ∈ R^n : d'∇f^k(x,t) = 0}.

Thus (B) reduces to (B_2). The left hand side of (C) reduces to the left hand side of (C_1),
which does not depend on α*. Moreover, α* on the right hand side of (C) can be taken
α* = 1, because whenever d satisfies (A) and (B_2), so does d̄ = (1/α*) d.



In many practical situations the sets T_k, k ∈ P are compact intervals and the sets T_k*,
k ∈ P* are finite. (This is always the case when f^k(x*,·) are analytic functions not identically
zero.) For such cases condition (B) can be replaced by a finite number of linear inequalities.

COROLLARY 2: Let x* be a feasible solution of problem (P), where f^k, k ∈ P* have
the uniform mean value property. Suppose that all the sets T_k*, k ∈ P* are finite. Then a
feasible solution x* of problem (P) is optimal if, and only if, for every α* > 0 and for every
subset Ω_k of T_k* the system

(A)   d'∇f^0(x*) < 0

(B_3) d'∇f^k(x*,t) < 0, t ∈ Ω_k
      d ∈ D_k(x*,t), t ∈ T_k*\Ω_k

(C)   d'∇f^k(x* + α*d,t) / f^k(x*,t) ≥ −1/α* for all t ∈ T_k\T_k*,

k ∈ P*

is inconsistent.

An important special case of Corollary 2 is when the sets T_k themselves are finite. Then
problem (P) can be reduced to a mathematical program of the form

(MP)  Min f^0(x)
      s.t.
      f^k(x) ≤ 0, k ∈ P.

This is obtained by setting T_k = {k_1, k_2, ..., k_{card T_k}} and identifying
{f^k(x,k_i) : k_i ∈ T_k, k = 1, 2, ..., p} with {f^k(x) : k ∈ P ≜ {1, 2, ..., Σ_{k=1}^{p} card T_k}},
and P* with {k ∈ P : f^k(x*) = 0}. Also {D_k(x*,k_i) : k_i ∈ T_k, k = 1, 2, ..., p} is denoted
by {D_k(x*) : k ∈ P}.

The major difference between the semi-infinite problem (P) and the mathematical problem
(MP) is that for the latter the condition (C) is redundant; Theorem 1 then reduces to the
following result obtained in [1, Theorem 1].


COROLLARY 3: Consider problem (MP), where {f^k : k ∈ {0} ∪ P} are differentiable
convex functions: R^n → R. A feasible solution x* of (MP) is optimal if, and only if, for every
subset Ω of P* the system

d'∇f^0(x*) < 0

d'∇f^k(x*) < 0, k ∈ Ω

d ∈ D_k(x*), k ∈ P*\Ω,

is inconsistent.

PROOF: Here condition (C) becomes

d'∇f^k(x* + α*d) / f^k(x*) ≥ −1/α*,

for some α* > 0. Since here the set P\P* is finite, and hence compact, the redundancy of
condition (C) is shown as in Remark 1.


The following result gives a characterization of a unique optimal solution of problem (P). 

THEOREM 2: Let x* be a feasible solution of problem (P), where f^k, k ∈ P* have the
uniform mean value property. Then x* is a unique optimal solution of problem (P) if, and
only if, for every α* > 0 there is no d satisfying conditions (B), (C) and

(A_1) d'∇f^0(x*) < 0 or d ∈ D_0(x*).

PROOF: Suppose that the system (A_1), (B), (C) is inconsistent. Then so is the system
(A), (B), (C). Hence, by Theorem 1, x* is an optimal solution. Suppose that x* is not a
unique optimal solution. Then there exist ᾱ > 0 and d ≠ 0 such that x̄ = x* + ᾱd is feasible,
which implies that d satisfies (B), (C) and f^0(x*) = f^0(x* + ᾱd). Since the set of all optimal
solutions of a convex program is convex, the latter implies f^0(x*) = f^0(x* + αd) for all
0 ≤ α ≤ ᾱ, i.e., d ∈ D_0(x*). Thus d satisfies (A_1), (B) and (C), which is impossible. Therefore
x* is the unique optimum. The necessity follows by a similar argument.



This section can be skipped without hindering the study of Section 4. 

In order to state our next result, which is a characterization of optimality for a subclass of 
convex functions, i.e. strictly convex functions in their "actual variables", we adopt some 
notions from [1]. 

For every k ∈ P and t ∈ T_k, denote by [k](t) (read "block k"), the following index subset
of P: j ∈ [k](t) if, and only if, γ_j^k : R → R, defined by

γ_j^k(·) ≜ f^k(x_1, ..., x_{j−1}, ·, x_{j+1}, ..., x_n, t)

is not a constant function for some fixed x_1, ..., x_{j−1}, x_{j+1}, ..., x_n. Thus, for a given t ∈ T_k,
[k](t) is the set of indices of those variables on which f^k(·,t) actually depends. These "actual
variables" determine the vector x_{[k](t)}, obtained from x = (x_1, ..., x_n)' by deleting the variables
{x_j : j ∉ [k](t)}, without changing the order of the remaining ones. Similarly, we denote
by f^{[k](t)} : R^{card [k](t)} → R the restriction of f^k to R^{card [k](t)}.

DEFINITION 2: A function f^k : R^n x T_k → R is strictly convex in its actual variables if
for every t ∈ T_k its restriction f^{[k](t)}(·,t) is strictly convex.

The above concept will be illustrated by an example. 

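EXAMPLE 3: Consider

$$f^1(x,t) = x_1^2 + t x_2^2, \quad t \in T = [0,1].$$

Note that function $f^1(\cdot,t)$ is not strictly convex for every $t \in T$. Here

$$[1](t) = \begin{cases} \{1\} & \text{if } t = 0 \\ \{1,2\} & \text{if } t \in (0,1], \end{cases}
\qquad
x_{[1](t)} = \begin{cases} (x_1) & \text{if } t = 0 \\ (x_1, x_2)' & \text{if } t \in (0,1] \end{cases}$$

and

$$f^{[1](t)} = \begin{cases} x_1^2 & \text{if } t = 0 \\ x_1^2 + t x_2^2 & \text{if } t \in (0,1], \end{cases}$$

clearly a strictly convex function in its actual variables for every $t \in T$. Hence, $f^1$ is a strictly convex function in its actual variables.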
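The block $[1](t)$ of Example 3 can be explored numerically. The following is a minimal sketch, not part of the original paper: the helper name `block`, the sample points and the tolerance are illustrative assumptions. Sampling can only detect dependence on a variable; it cannot prove independence, so the result is a heuristic check of the definition rather than a proof.

```python
import itertools

def block(f, n, t, samples=(-1.0, 0.5, 2.0), tol=1e-12):
    """Heuristically return the 1-based indices j on which f(., t) depends.

    Hypothetical helper: for each coordinate j, the other coordinates are
    fixed at sample values and x_j is varied; j is reported as an "actual
    variable" if the function value changes for some fixed base point.
    """
    actual = set()
    for j in range(n):
        for base in itertools.product(samples, repeat=n):
            values = []
            for s in samples:
                x = list(base)
                x[j] = s          # vary only the j-th coordinate
                values.append(f(x, t))
            if max(values) - min(values) > tol:
                actual.add(j + 1)
                break
    return actual

def f1(x, t):
    # The function of Example 3: f^1(x, t) = x_1^2 + t * x_2^2
    return x[0] ** 2 + t * x[1] ** 2

print(block(f1, 2, 0.0))   # block at t = 0
print(block(f1, 2, 0.5))   # block at a t in (0, 1]
```

At $t = 0$ only the first index is reported, while for $t \in (0,1]$ both indices are, in agreement with the example.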

COROLLARY 4: Let $x^*$ be a feasible solution of problem (P), where $f^k(\cdot,t)$, $k \in P^*$ are strictly convex in their actual variables and have the uniform mean value property. Then $x^*$ is an optimal solution of (P) if, and only if, for every $\alpha^* > 0$ and every subset $\Omega_k \subset T_k^*$ the system

(A) $\quad d'\nabla f^0(x^*) < 0$

(B, $\Omega$) $\quad d'\nabla f^k(x^* + \alpha^* d, t) < 0$ for all $t \in T_k^* \setminus \Omega_k$

(C) $\quad \dfrac{d'\nabla f^k(x^* + \alpha^* d, t)}{f^k(x^*,t)} \ge -\dfrac{1}{\alpha^*}$ for all $t \in T_k \setminus T_k^*$

(D, $\Omega$) $\quad d_{[k](t)} = 0$ for all $t \in \Omega_k$,

$k \in P^*$

is inconsistent.

PROOF: We know, by Theorem 1, that $x^*$ is nonoptimal if, and only if, there exists $\alpha^* > 0$ such that the system (A), (B), (C) is consistent. In order to prove Corollary 4, it is enough to show that (B) is consistent if, and only if, for some subsets $\Omega_k \subset T_k^*$, $k \in P^*$, the system (B, $\Omega$), (D, $\Omega$) is consistent. Suppose that (B) holds. For every $k \in P^*$ define

$$\Omega_k \triangleq \{t \in T_k^* : d'\nabla f^k(x^* + \alpha d, t) = 0 \text{ for all } 0 < \alpha < \alpha^*\}.$$

Hence, by the mean value theorem, for every $t \in \Omega_k$

$$f^k(x^* + \alpha d, t) = f^k(x^*,t) \quad \text{for all } 0 < \alpha < \alpha^*.$$

Since $f^k(\cdot,t)$ is strictly convex in its actual variables, this is equivalent to

$$d_{[k](t)} = 0 \quad \text{for all } t \in \Omega_k.$$

If $t \in T_k^* \setminus \Omega_k$, then obviously $d'\nabla f^k(x^* + \alpha d, t) < 0$ for some $0 < \alpha < \alpha^*$, by (B). Thus (B, $\Omega$), (D, $\Omega$) holds with the sets $\Omega_k$ defined above. (Note that some or all $\Omega_k$'s may be empty.) The reverse statement follows from the observation that $d_{[k](t)} = 0$ implies $d'\nabla f^k(x^* + \alpha^* d, t) = 0$.


If a function $f^k(\cdot,t)$ is strictly convex (in all variables $x_1, \ldots, x_n$) for every $t \in T_k$, $k \in P^*$, then $D_k(x^*,t) = \{0\}$. This implies that the system (A), (B, $\Omega$), (C), (D, $\Omega$) is inconsistent for every nonempty $\Omega_k$, $k \in P^*$. Thus condition (D, $\Omega$) is redundant. In fact, condition (C) is also redundant, which follows from the following lemma.

LEMMA 1: Let $f(x,t)$ be convex and differentiable in $x \in R^n$ for every $t$ in a compact set $T \subset R^1$ and continuous in $t$ for every $x$. If for some $d \in R^n$,

(15) $\quad d'\nabla f(x^*,t) < 0$ for all $t \in T^* = \{t : f(x^*,t) = 0\}$,

then there exists $\bar\alpha > 0$ such that

(16) $\quad f(x^* + \bar\alpha d, t) \le 0$ for all $t \in T \setminus T^*$.

PROOF: It is enough to show that the hypothesis (15) and the negation of the conclusion (16), which is

"For every $\alpha > 0$ there is $t = t(\alpha) \in T \setminus T^*$ such that $f(x^* + \alpha d, t(\alpha)) > 0$,"

are not simultaneously satisfied. If this were true one would have the following situation:

For every $\alpha_n$ of the sequence $\alpha_n = 2^{-n}$ there is a $t_n = t_n(\alpha_n) \in T \setminus T^*$ such that

(17) $\quad f(x^* + \alpha_n d, t_n(\alpha_n)) > 0, \quad n = 0, 1, 2, \ldots$

Since $T$ is compact, $\{t_n\}$ has an accumulation point $\bar t \in T$, i.e. there is a convergent subsequence $\{t_{n_i}\}$ with $\bar t$ as its limit point. We discuss separately two possibilities and arrive at contradictions in each case.

CASE I: $\bar t \in T^*$. Since $f(x^*,\bar t) = 0$ and $d'\nabla f(x^*,\bar t) < 0$, by (15), there exists $\bar\alpha > 0$ such that

(18) $\quad f(x^* + \bar\alpha d, \bar t) < 0.$

For all large values of index $i$, $\alpha_{n_i} < \bar\alpha$ and

(19) $\quad f(x^*, t_{n_i}) < 0,$

since $t_{n_i} \in T \setminus T^*$. This implies

(20) $\quad f(x^* + \bar\alpha d, t_{n_i}) > 0.$

(If (20) were not true, one would have, for some particular $n_i$,

(21) $\quad f(x^* + \bar\alpha d, t_{n_i}) \le 0.$

Now $\alpha_{n_i} < \bar\alpha$, (19), (21) and the convexity of $f$ imply

$$f(x^* + \alpha_{n_i} d, t_{n_i}) \le 0$$

which contradicts (17).) But (18) and (20) contradict the continuity of $f(x^* + \bar\alpha d, \cdot)$.

CASE II: $\bar t \in T \setminus T^*$. Since $f(x^*,\bar t) < 0$, there exists $\bar\alpha > 0$ such that (18) holds, by the continuity of $f(\cdot,\bar t)$. The rest of the proof is the same as in Case I.



A characterization of optimality for strictly convex constraints follows. 

COROLLARY 5: Let $x^*$ be a feasible solution of problem (P), where $f^k(\cdot,t)$ are strictly convex for every $t \in T_k$, $k \in P^*$. Then $x^*$ is an optimal solution of (P) if, and only if, for every $\alpha^* > 0$ the system

(A) $\quad d'\nabla f^0(x^*) < 0$

(B$_1$) $\quad d'\nabla f^k(x^*,t) < 0$ for all $t \in T_k^*$,

$k \in P^*$

is inconsistent.

PROOF: First we recall that $f^k$, $k \in P^*$, under the assumption of the corollary, have the uniform mean value property. If $x^*$ is not optimal, then the system (A), (B$_1$), (C) is consistent, by Corollary 4. This implies that the less restrictive system (A), (B$_1$) is consistent. Conversely, suppose that the system (A), (B$_1$) is consistent. Then for every $k \in P^*$ there is $\alpha_k > 0$ such that

$$f^k(x^* + \alpha_k d, t) \le 0 \quad \text{for all } t \in T_k \setminus T_k^*$$

by Lemma 1. Let

$$\alpha^* \triangleq \min\{\alpha_k : k \in P^*\}.$$

By the convexity of $f^k$, it follows that

$$f^k(x^* + \alpha^* d, t) \le 0 \quad \text{for all } t \in T_k \setminus T_k^* \text{ and } k \in P^*.$$

This is equivalent to (C) of Theorem 1 (see (2-b)). Therefore the system (A), (B$_1$), (C) is consistent. This implies that the system (A), (B), (C) is consistent. (The reader may verify this statement by the technique used in the proof of Lemma 2.) Hence $x^*$ is not optimal, by Corollary 4.


REMARK 2: Differentiable strictly convex (in all variables!) functions $f^k$ do have the uniform mean value property. However, this is not necessarily true in the case of convex functions with strictly convex restrictions. In particular, function

fix u x 2 ,t) = 

x, 2 + tx 2 ix 2 - t) lf * 2 < J* 

S . , 

(2- t) 2 

x 2 + -7r J -—(x 2 - t + l)(x 2 - 1) if x 2 ^ -t 

is differentiable and has strictly convex restrictions for every $t \in [0,1]$. Note that

$$[k](t) = \begin{cases} \{1\} & \text{if } t = 0 \\ \{1,2\} & \text{if } t \in (0,1]. \end{cases}$$

But function $f$ does not have the uniform mean value property. One can show, however, that a differentiable function which is strictly convex in its actual variables and such that $[k](t)$ is constant over the whole compact set $T$, does have the mean value property.


The applicability of Theorem 1 is, in general, obscured by the appearance of parameter $\alpha^*$ in conditions (B) and (C). The purpose of this section is to point out some of the topological difficulties which arise in removing $\alpha^*$ from condition (B). A class of convex functions for which the optimality conditions can be stated without reference to $\alpha^*$ in condition (B) will be called the uniformly decreasing functions.

In what follows we assume that $f : R^n \times T \to R$ is convex and differentiable in $x \in R^n$ for every $t$ of a compact set $T$ in $R^m$. Further, $\nabla f(x^*,t)$ denotes $\nabla_x f(x^*,t)$.

DEFINITION 3: Let $f : R^n \times T \to R$ and $x^* \in R^n$ be such that $T^* \ne \emptyset$. Then for a given $d \in R^n$, $d \ne 0$, the function $f$ is uniformly decreasing at $x^*$ in the direction $d$, if (i) the set

$$S(x^*,d) \triangleq \{t \in T^* : d'\nabla f(x^*,t) < 0\}$$

is compact and if (ii) there exists $\bar\alpha > 0$ such that $f(x^* + \bar\alpha d, t) = 0$ for all $t \in T^*$ for which $d \in D(x^*,t)$.

It is not easy to recognize whether a general convex function $f$ is uniformly decreasing.

EXAMPLE 4: Consider the following functions from $R \times R$ into $R$:

$$f^1(x,t) = t^2[(x - t)^2 - t^2], \quad t \in T \quad \text{(used in Example 1)}$$

$$f^2(x,t) = x^2 - tx, \quad t \in T$$

$$f^3(x,t) = -tx, \quad t \in T.$$

These functions are all convex, $f^2$ is actually strictly convex and $f^3$ linear in $x$ for every $t \in T$. If $T = [0,1]$, then none of these functions is uniformly decreasing at $x^* = 0$ in the direction $d = 1$. However, if $T = [1,2]$ then all three functions are uniformly decreasing at $x^* = 0$ in the same direction $d = 1$.

As suggested by the above example, a convex function $f$ is uniformly decreasing at $x^*$ in the direction $d \ne 0$, whenever $\nabla f(x^*,\cdot)$ is continuous and the set

$$E(x^*,d) \triangleq \{t \in T^* : d'\nabla f(x^*,t) = 0\}$$

is empty. Its complement

$$S(x^*,d) = T^* \setminus E(x^*,d) = T^*$$

is then compact. In particular, all analytic functions not identically zero are uniformly decreasing. However, a characterization of optimality for problem (P) with such constraint functions is already given by Corollary 4.

An important uniformity property of convex functions with compact S(x*,d) follows: 

LEMMA 2: Let $f(x,t)$ be convex and differentiable in $x$, for every $t$ in a compact set $T \subset R^m$, and continuous in $t$, for every $x \in R^n$. Suppose further that for some $x^*$ and $d \ne 0$ in $R^n$, the set $S(x^*,d)$ is nonempty and compact. Then there exists $\bar\alpha > 0$ such that

(22) $\quad f(x^* + \alpha d, t) < 0, \quad 0 < \alpha \le \bar\alpha$

for all $t \in S(x^*,d)$.

PROOF: Suppose that such $\bar\alpha > 0$ does not exist. Then there exists a sequence $\{t_j\} \subset S(x^*,d)$ and a sequence $\{\alpha_j\}$, $\alpha_j = \alpha_j(t_j) > 0$ such that

$$f(x^* + \alpha_j d, t_j) = 0,$$

$$f(x^* + \alpha d, t_j) < 0, \quad 0 < \alpha < \alpha_j$$

(23) $\quad f(x^* + \alpha d, t_j) > 0, \quad \alpha > \alpha_j$

with $\inf\{\alpha_j\} = 0$. Since $S(x^*,d)$ is compact, $\{t_j\}$ contains a convergent subsequence $\{t_{j_i}\}$. Let $\bar t \in S(x^*,d)$ be the limit point of $\{t_{j_i}\}$. Now

$$d'\nabla f(x^*,\bar t) < 0$$

implies that there exists $\bar\alpha > 0$ such that

$$f(x^* + \alpha d, \bar t) < 0, \quad 0 < \alpha \le \bar\alpha.$$

In particular,

(24) $\quad f(x^* + \bar\alpha d, \bar t) < 0.$

For any $\epsilon > 0$ there exists $j_0 = j_0(\epsilon)$ such that

(25) $\quad |t_j - \bar t| < \epsilon$ and $\alpha_j < \bar\alpha$ for all $j \ge j_0$.

Now (23) and (25) imply

(26) $\quad f(x^* + \bar\alpha d, t_j) > 0$ for all $j \ge j_0$.

But the inequalities (24) and (26) contradict the continuity of $f(x^* + \bar\alpha d, \cdot)$.


EXAMPLE 5: Consider again

$$f^2(x,t) = x^2 - tx, \quad t \in T = [1,2].$$

This function is uniformly decreasing at $x^* = 0$ in the direction $d = 1$. The inequality (22)

holds for every < a < 1, in particular a = — . If the above interval T is replaced by 

$T = [0,1]$, then $f^2$ is not uniformly decreasing at $x^* = 0$ with $d = 1$. An $\bar\alpha > 0$ satisfying (22) here does not exist.
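The contrast between $T = [1,2]$ and $T = [0,1]$ in Example 5 can be illustrated numerically. The sketch below is an illustration only: the helper names `first_zero` and `uniform_step`, the bisection bracket and the sample grids are assumptions, not part of the paper. For each $t$ it locates, by bisection, the step at which $f^2(x^* + \alpha d, t)$ stops being negative, and then takes the smallest such step over the sampled $t$.

```python
def first_zero(f, x_star, d, t, hi=10.0, tol=1e-12):
    """Approximate the largest a <= hi with f(x_star + s*d, t) < 0 on (0, a).

    Assumes f(., t) is convex, f(x_star, t) = 0 and the directional
    derivative at x_star is negative, so the negative region along the
    ray is a single interval starting at 0.
    """
    g = lambda a: f(x_star + a * d, t)
    if g(hi) < 0:
        return hi
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid      # still strictly negative: move the lower end up
        else:
            hi = mid      # nonnegative: the zero lies at or below mid
    return lo

def uniform_step(f, x_star, d, ts):
    """Smallest admissible step over the sampled parameter values ts."""
    return min(first_zero(f, x_star, d, t) for t in ts)

def f2(x, t):
    # f^2(x, t) = x^2 - t*x from Examples 4 and 5
    return x * x - t * x

grid_12 = [1.0 + i / 100.0 for i in range(101)]      # samples of T = [1, 2]
grid_01 = [10.0 ** (-k) for k in range(1, 7)]        # samples of (0, 1] approaching 0

step_12 = uniform_step(f2, 0.0, 1.0, grid_12)
step_01 = uniform_step(f2, 0.0, 1.0, grid_01)
print(step_12, step_01)
```

On the samples of $[1,2]$ the common step stays close to 1, while on samples of $(0,1]$ approaching 0 it shrinks with the smallest sampled $t$, which is the behaviour that rules out a single $\bar\alpha > 0$ on $T = [0,1]$.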

A characterization of optimality for programs (P), with constraint functions which have the uniform mean value property and are uniformly decreasing, follows.

THEOREM 3: Let $x^*$ be a feasible solution of problem (P), where $f^k$, $k \in P^*$ have the uniform mean value property. Suppose also that $f^k$, $k \in P^*$ are uniformly decreasing at $x^*$ in every feasible direction $d$. Then $x^*$ is an optimal solution of (P) if, and only if, for every $\alpha^* > 0$ the system

(A) $\quad d'\nabla f^0(x^*) < 0,$

(B$_4$) $\quad d'\nabla f^k(x^*,t) < 0$ or $d \in D_k(x^*,t)$ for all $t \in T_k^*$,

(C) $\quad \dfrac{d'\nabla f^k(x^* + \alpha^* d, t)}{f^k(x^*,t)} \ge -\dfrac{1}{\alpha^*}$ for all $t \in T_k \setminus T_k^*$,

$k \in P^*$

is inconsistent.

PROOF: Parts (A) and (C) are proved as in the case of Theorem 1. It is left to show that the existence of $\bar\alpha > 0$ satisfying (2-a) is equivalent to the consistency of (B$_4$). It is clear that (2-a) implies (B$_4$). In order to show that (B$_4$) implies (2-a) we use the assumption that the functions $\{f^k(x,t) : k \in P^*\}$ are uniformly decreasing at $x^*$ in the direction $d$. When (B$_4$) holds, then for every $k \in P^*$ there exist $\bar\alpha_k > 0$ and $\hat\alpha_k > 0$ such that

(27) $\quad f^k(x^* + \alpha d, t) < 0, \quad 0 < \alpha \le \bar\alpha_k$ for all $t \in S_k \triangleq \{t \in T_k^* : d'\nabla f^k(x^*,t) < 0\}$,

by Lemma 2, and

(28) $\quad f^k(x^* + \alpha d, t) = 0, \quad 0 < \alpha \le \hat\alpha_k$ for all $t \in T_k^* \setminus S_k$,

since $d \ne 0$. The latter follows by part (ii) of Definition 3 and the convexity of $f^k$. Let

(29) $\quad \bar\alpha \triangleq \min_{k \in P^*}\{\bar\alpha_k, \hat\alpha_k\} > 0.$

Clearly (27) and (28) can be written as the single statement (2-a) with $\bar\alpha$ chosen as in (29).


The following example shows that the assumption that $\{f^k(x,t) : k \in P^*\}$ be uniformly decreasing at $x^*$ cannot be omitted in Theorem 3.

EXAMPLE 6: Consider the program

$$\text{Min } f^0(x) = -x$$

s.t.

$$f(x,t) \le 0, \quad \text{for all } t \in T = [0,1]$$

where

$$f(x,t) = \begin{cases} t(x - t)^2 & \text{if } x > t \\ 0 & \text{if } x \le t. \end{cases}$$

The feasible set consists of the single point $x^* = 0$, which is therefore optimal. One can verify, after some manipulation, that the constraint function $f$ has the uniform mean value property at
x*. (For every a > there exists < a ^ — a such that (MV) holds.) However, /is not 

uniformly decreasing at $x^*$. In order to demonstrate that Theorem 3 fails here, first we note that $T^* = T = [0,1]$, so the condition (C) is redundant. Since $d = 1$ is in the cone of directions of constancy $D(x^*,t)$ for every $t \in [0,1]$, we conclude that the system (A), (B$_4$) is here consistent, contrary to the statement of the theorem. Therefore the assumption that the constraint functions be uniformly decreasing cannot be omitted in Theorem 3.
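The lack of uniformity in Example 6 can be seen on a few numbers. The following sketch is illustrative only: it takes the second branch of $f$ to be $0$ for $x \le t$ (which is what $T^* = T$ requires at $x^* = 0$), and the helper `violating_t` is a hypothetical name. For each fixed $t > 0$ the function stays at zero along $d = 1$ for steps up to $t$, yet for every step $\alpha > 0$ some $t$ makes $f(x^* + \alpha d, t)$ positive.

```python
def f(x, t):
    # Constraint function of Example 6 (second branch assumed to be 0)
    return t * (x - t) ** 2 if x > t else 0.0

def violating_t(alpha):
    """Hypothetical helper: a parameter t in (0, 1] with f(0 + alpha, t) > 0."""
    return min(alpha / 2.0, 1.0)

# For a fixed t, d = 1 is a direction of constancy for steps up to t ...
print([f(a, 0.5) for a in (0.0, 0.25, 0.5)])

# ... but no single step works for all t: each alpha is violated by some t.
for alpha in (1.0, 0.1, 1e-3):
    t = violating_t(alpha)
    print(alpha, t, f(alpha, t))
```

The step up to which $f(\cdot,t)$ stays at zero shrinks to $0$ as $t \to 0$, so part (ii) of Definition 3 cannot hold with one $\bar\alpha$ for all $t \in T^*$.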



In contrast to the characterizations of optimality stated in the preceding sections we will 
now recall the Fritz John and Kuhn-Tucker theories for semi-infinite programming. In the 
sequel we use the following concept from the duality theory of semi-infinite programming, e.g. 

DEFINITION 4: Let $I$ be an arbitrary index set, $\{p^i : i \in I\}$ a collection of vectors in $R^m$ and $\{c_i : i \in I\}$ a collection of scalars. The linear inequality system

$$u'p^i \le c_i, \quad \text{for all } i \in I$$

is canonically closed if the set of coefficients $\{((p^i)', c_i) : i \in I\}$ is compact in $R^{m+1}$ and there exists a point $u^0$ such that

$$(u^0)'p^i < c_i, \quad \text{for all } i \in I.$$

We will say that problem (P) is canonically closed at $x^*$ if the system

(B$_5$) $\quad d'\nabla f^k(x^*,t) \le 0$ for all $t \in T_k^*$, $k \in P^*$

is canonically closed.

REMARK 3: All constraint functions of problem (P) can have the uniform mean value 
property, or they can be uniformly decreasing, without problem (P) being canonically closed. 

Lemma 3 below is a specialized version of Theorem 3 from [3], adjusted to our need. It is related to the following pair of the semi-infinite linear programs:

(I) $\quad \text{Inf } u'p^0 \quad \text{s.t.} \quad u'p^i \ge c_i, \ \text{all } i \in I, \quad u \in R^m$

(II) $\quad \text{Sup } \sum_{i \in I} c_i \lambda_i \quad \text{s.t.} \quad \sum_{i \in I} p^i \lambda_i = p^0, \quad \lambda \in S, \ \lambda \ge 0,$

where $S$ is the vector space of all vectors $(\lambda_i : i \in I)$ with only finitely many nonzero entries. Denote by $V_I$ and $V_{II}$ the optimal values of (I) and (II), respectively.

LEMMA 3: Assume that the linear inequality system of problem (I) is canonically closed. If the feasible set of problem (I) is nonempty and $V_I$ is finite, then problem (II) is consistent and $V_{II} = V_I$. Moreover, $V_{II}$ is a maximum.

The concept of a canonically closed system is used in the proof of the dual statement of the following theorem.

THEOREM 4: ("The Fritz John Necessity Theorem") Let $x^*$ be an optimal solution of problem (P). Then the system

(A) $\quad d'\nabla f^0(x^*) < 0$

(B$_1$) $\quad d'\nabla f^k(x^*,t) < 0$ for all $t \in T_k^*$, $k \in P^*$

is inconsistent or, dually, the system

(FJ) $\quad \lambda^0 \nabla f^0(x^*) + \sum_{k \in P^*} \sum_{t \in T_k^*} \lambda_t^k \nabla f^k(x^*,t) = 0$

$\lambda^0$, $\{\lambda_t^k : t \in T_k^*, k \in P^*\}$ nonnegative scalars, not all zero and of which only finitely many are positive

is consistent.

PROOF: If $x^*$ is optimal, then the inconsistency of the system (A), (B$_1$) is well-known, e.g. [4, Lemma 1]. In order to prove the dual statement, we note that the inconsistency of (A), (B$_1$) is equivalent to $\mu^* = 0$ being the optimal value of the semi-infinite linear program

(I) $\quad \text{Min } \mu$

s.t.

$$d'\nabla f^0(x^*) + \mu \ge 0$$

$$d'\nabla f^k(x^*,t) + \mu \ge 0, \quad \text{all } t \in T_k^*, \ k \in P^*$$

$$(d, \mu) \in R^{n+1}.$$

The dual of (I) is

(II) $\quad \lambda^0 \nabla f^0(x^*) + \sum_{k \in P^*} \sum_{t \in T_k^*} \lambda_t^k \nabla f^k(x^*,t) = 0$

$\lambda_t^k \ge 0$, only finitely many are positive.

The feasible set of problem (I) is clearly nonempty and canonically closed ($d = 0$, $\mu = 1$ satisfy the constraints of (I) with strict inequalities). Lemma 3 is now readily applicable to the pair (I), (II), which proves (FJ).


The dual statement in Theorem 4 is the Fritz John optimality condition for semi-infinite pro- 
gramming. For an equivalent formulation the reader is referred to Gehner's paper [4]. 

Under various "constraint qualifications" such as Slater's condition:

$$\exists\, x \in R^n \ni f^k(x,t) < 0 \quad \text{for all } t \in T_k, \ k \in P$$

or the "Constraint Qualification II" of Gehner [4], one can set $\lambda^0 = 1$ in Theorem 4. In fact, the same is possible if problem (P) is canonically closed at $x^*$, i.e. if there exists $d$ such that

(30) $\quad d'\nabla f^k(x^*,t) < 0$ for all $t \in T_k^*$, $k \in P^*$.

This is easily seen by multiplying the equation in (FJ) by $d$ satisfying (30). Note that the canonical closedness assumption is a semi-infinite version of the Arrow-Hurwicz-Uzawa constraint qualification, e.g. [12]. The latter constraint qualification is implied by Slater's condition.

The Fritz John condition (FJ) with $\lambda^0 = 1$ is a semi-infinite version of the Kuhn-Tucker condition, e.g. [12]. While the Fritz John condition is necessary but not sufficient, the Kuhn-Tucker condition is sufficient but not necessary for optimality. If a constraint qualification is assumed, then the Kuhn-Tucker condition is both necessary and sufficient for optimality for problem (P). If a constraint qualification is not satisfied then the Fritz John condition fails to establish the optimality and the Kuhn-Tucker condition fails to establish the nonoptimality of a feasible point $x^*$. In contrast, our results are applicable. This will be demonstrated by two examples. (See also an example, taken from approximation theory, in Section 6.)

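EXAMPLE 7: Consider the semi-infinite convex problem

$$\text{Min } f^0 = x_1 - x_2$$

s.t.

$$f^1 = x_1^2 + t x_2 - t^2 \le 0 \quad \text{for all } t \in T_1 = [0,1]$$

$$f^2 = -x_1 - t x_2 - t \le 0 \quad \text{for all } t \in T_2 = [0,1].$$

The feasible set is

$$F = \left\{ \begin{pmatrix} 0 \\ x_2 \end{pmatrix} : -1 \le x_2 \le 0 \right\}$$

and the optimal solution is $x^* = (0,0)'$. For this point

$$T_1^* = T_2^* = \{0\}, \quad P^* = \{1,2\}.$$

The system (B$_5$) is

$$0 \le 0$$

$$-d_1 \le 0,$$

obviously not canonically closed. The Kuhn-Tucker condition is

$$\begin{pmatrix} 1 \\ -1 \end{pmatrix} + \lambda_1 \begin{pmatrix} 0 \\ 0 \end{pmatrix} + \lambda_2 \begin{pmatrix} -1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \quad \lambda_1 \ge 0, \ \lambda_2 \ge 0$$

which clearly fails.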

One can easily verify that the constraint functions $f^1$ and $f^2$ have the uniform mean value property. Also, these functions are uniformly decreasing at $x^* = 0$ in every direction $d \ne 0$. (The sets $T_1^*$ and $T_2^*$ are singletons!) Thus Theorem 3 is applicable. Conditions (A), (B$_4$) and (C) are here

(A) $\quad d_1 - d_2 < 0$

(B$_4$) $\quad 0 < 0$ or $d_1 = 0$, $d_2 \in R$

$\qquad\ \ -d_1 \le 0$

(C) $\quad \dfrac{2\alpha^* d_1^2 + t d_2}{-t^2} \ge -\dfrac{1}{\alpha^*}$ for all $t \in (0,1]$

$\qquad\ \ \dfrac{-d_1 - t d_2}{-t} \ge -\dfrac{1}{\alpha^*}$ for all $t \in (0,1]$.

This reduces to

$$d_1 = 0, \quad d_2 > 0$$

(31) $\quad -\dfrac{d_2}{t} \ge -\dfrac{1}{\alpha^*}$ for all $t \in (0,1]$

$$d_2 \ge -\frac{1}{\alpha^*}.$$

Since $d_2 > 0$, the inequality (31) cannot hold for any $\alpha^* > 0$. Hence, by Theorem 3, $x^* = (0,0)'$ is optimal. The optimality of a feasible point is thus established here using Theorem 3 and not by the Kuhn-Tucker condition, which here fails.

Consider now the point $x^* = (0,-1)'$. Here

$$T_1^* = \{0\}, \quad T_2^* = [0,1], \quad P^* = \{1,2\}.$$

It is easy to verify that the Fritz John condition is satisfied in spite of the fact that $x^*$ is not optimal. Conditions (A), (B) and (C) are here

(A) $\quad d_1 - d_2 < 0$

(B) $\quad -d_1 - t d_2 \le 0$ for all $t \in [0,1]$

(C) $\quad \dfrac{2\alpha^* d_1^2 + t d_2}{-t - t^2} \ge -\dfrac{1}{\alpha^*}$ for all $t \in (0,1]$.

For $\alpha^* = 1$, these conditions are satisfied by $d_1 = 0$, $d_2 = 1$. Hence, by Theorem 1, the point $x^* = (0,-1)'$ is not optimal. Both the Fritz John and the Kuhn-Tucker theories fail to characterize optimality in this example because a constraint qualification (or a regularization condition, e.g. [1]) is not here satisfied.
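Both conclusions of Example 7 can be checked on a grid. The sketch below is a numerical illustration under stated assumptions, not a proof: the grid over $[0,1]$, the tolerance and the helper names `feasible` and `violating_t` are illustrative choices. It confirms that moving from $(0,-1)'$ along $d = (0,1)'$ stays feasible while the objective decreases, and that for any $d_2 > 0$ and $\alpha^* > 0$ some $t \in (0,1]$ violates inequality (31).

```python
def f1(x1, x2, t):
    return x1 ** 2 + t * x2 - t ** 2

def f2(x1, x2, t):
    return -x1 - t * x2 - t

def objective(x1, x2):
    return x1 - x2

T = [i / 100.0 for i in range(101)]   # sample grid on [0, 1] (an assumption)

def feasible(x1, x2, tol=1e-12):
    """Grid check of both semi-infinite constraints."""
    return all(f1(x1, x2, t) <= tol and f2(x1, x2, t) <= tol for t in T)

def violating_t(d2, alpha_star):
    """Hypothetical helper: a t in (0, 1] with -d2/t < -1/alpha_star, for d2 > 0."""
    return min(1.0, d2 * alpha_star / 2.0)

# Point (0, -1)': the direction d = (0, 1)' keeps feasibility and lowers the objective.
for a in (0.25, 0.5, 1.0):
    print(a, feasible(0.0, -1.0 + a), objective(0.0, -1.0 + a) < objective(0.0, -1.0))

# Point (0, 0)': inequality (31) fails for every alpha_star once d2 > 0.
for d2, alpha_star in ((1.0, 1.0), (0.01, 0.5), (5.0, 10.0)):
    t = violating_t(d2, alpha_star)
    print(d2, alpha_star, t, -d2 / t < -1.0 / alpha_star)
```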

Although the Fritz John and Kuhn-Tucker theories fail to characterize optimality, they can be used to formulate, respectively, either the necessary or the sufficient conditions of optimality.

In the remainder of the section we will show that the ordinary Kuhn-Tucker condition (i.e. the (FJ) condition with $\lambda^0 = 1$) can be weakened by assuming an asymptotic form. For a related discussion in Banach spaces the reader is referred to [16].

THEOREM 5: ("The Kuhn-Tucker Sufficiency Theorem") Let $x^*$ be a feasible solution of problem (P). Then $x^*$ is optimal if the system

(A) $\quad d'\nabla f^0(x^*) < 0$

(B$_5$) $\quad d'\nabla f^k(x^*,t) \le 0$ for all $t \in T_k^*$,

$k \in P^*$

is inconsistent or, dually, if the system




V/°(x*) + £ £ X*V/*(xV) = 

kZP" t£T* k 

{X^: t € T*, k €P*} nonnegative scalars 
of which only finitely many are positive 

is consistent. 

PROOF: If the system (A), (B$_5$) is inconsistent, so is (A), (B). (Recall that $D_k(x^*,t) \subset \{d : d'\nabla f^k(x^*,t) = 0\}$.) Hence, in particular, the system (A), (B), (C) is inconsistent. Following the proof of Theorem 1, one concludes that $x^*$ is optimal. The inconsistency of (A), (B$_5$) is equivalent to the consistency of (K-T), by e.g. [11, Corollary 5].


REMARK 4: The "asymptotic" form of the Kuhn-Tucker conditions (K-T) gives a weaker sufficient condition for optimality than the familiar (i.e., without the closure) condition

$$\nabla f^0(x^*) + \sum_{k \in P^*} \sum_{t \in T_k^*} \lambda_t^k \nabla f^k(x^*,t) = 0$$

$\{\lambda_t^k : t \in T_k^*, k \in P^*\}$ nonnegative scalars of which only finitely many are positive.

In some situations the primal Kuhn-Tucker conditions (A), (B$_5$) may be easier to apply than (K-T). This will be illustrated on the following problem taken from [8, Example 2.4].

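EXAMPLE 8: Consider

$$\text{Min } f^0 = 4x_1 + \tfrac{2}{3}(x_4 + x_6)$$

s.t.

$$f^1 = -x_1 - t_1 x_2 - t_2 x_3 - t_1^2 x_4 - t_1 t_2 x_5 - t_2^2 x_6 + 3 - (t_1 - t_2)^2 (t_1 + t_2)^2 \le 0$$

$$\text{for all } t \in T_1 = \left\{ \begin{pmatrix} t_1 \\ t_2 \end{pmatrix} : -1 \le t_i \le 1, \ i = 1,2 \right\}.$$

We will show, using the Kuhn-Tucker theory, that $x^* = (3,0,0,0,0,0)'$ is an optimal solution. The optimality of $x^*$ has been established in [8] by a different approach.

First note that here

$$T_1^* = \left\{ \begin{pmatrix} t_1 \\ t_2 \end{pmatrix} \in T_1 : t_1 - t_2 = 0 \ \text{or} \ t_1 + t_2 = 0 \right\}.$$

The system (A), (B$_5$) becomes

(A) $\quad 4d_1 + \tfrac{2}{3} d_4 + \tfrac{2}{3} d_6 < 0$

(B$_5$) $\quad -d_1 - t_1 d_2 - t_2 d_3 - t_1^2 d_4 - t_1 t_2 d_5 - t_2^2 d_6 \le 0$ for all $t \in T_1^*$.

Substitute in (B$_5$) the following five points of $T_1^*$:

$$\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \ \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \ \begin{pmatrix} 1 \\ -1 \end{pmatrix}, \ \begin{pmatrix} -1 \\ 1 \end{pmatrix}, \ \begin{pmatrix} -1 \\ -1 \end{pmatrix}.$$

This gives

$$-d_1 \le 0$$

$$-d_1 - d_2 - d_3 - d_4 - d_5 - d_6 \le 0$$

$$-d_1 - d_2 + d_3 - d_4 + d_5 - d_6 \le 0$$

$$-d_1 + d_2 - d_3 - d_4 + d_5 - d_6 \le 0$$

$$-d_1 + d_2 + d_3 - d_4 - d_5 - d_6 \le 0.$$

Multiply the first inequality by ten thirds and each of the remaining four inequalities by one sixth, then add all five inequalities. We get

$$-4d_1 - \tfrac{2}{3} d_4 - \tfrac{2}{3} d_6 \le 0$$

which contradicts (A). Thus the system (A), (B$_5$) is inconsistent and $x^* = (3,0,0,0,0,0)'$ is optimal, by Theorem 5.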
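The arithmetic behind the weighted combination in Example 8 can be replayed exactly with rational numbers. The sketch below is only a check of that arithmetic; the helper name `row` and the variable names are illustrative. It builds the coefficient row of (B$_5$) at each of the five points, applies the weights ten thirds and one sixth, and compares the result with the negative of the objective gradient appearing in (A).

```python
from fractions import Fraction as F

def row(t1, t2):
    """Coefficients of (d_1, ..., d_6) in (B5) at the point t = (t1, t2)."""
    return [-1, -t1, -t2, -t1 * t1, -t1 * t2, -t2 * t2]

points = [(0, 0), (1, 1), (1, -1), (-1, 1), (-1, -1)]
weights = [F(10, 3)] + [F(1, 6)] * 4

rows = [row(t1, t2) for (t1, t2) in points]

# Weighted sum of the five inequalities, coefficient by coefficient.
combo = [sum(w * c for w, c in zip(weights, column)) for column in zip(*rows)]

# Gradient of f^0 = 4*x1 + (2/3)*(x4 + x6), i.e. the coefficients in (A).
grad_f0 = [F(4), F(0), F(0), F(2, 3), F(0), F(2, 3)]

print(combo)
print(combo == [-g for g in grad_f0])
```

Since the weighted sum equals $-\nabla f^0(x^*)'d \le 0$, no $d$ can satisfy (A) together with these five inequalities.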

Theorems 1 and 3 suggest that the presently used constraint qualifications for semi- 
infinite programming problems are too restrictive because they do not employ the topological 
properties of problem (P), such as the uniform mean value property or the uniformly decreas- 
ing constraints. 


It is well-known that there is a close connection between convex programming and approximation theory, e.g. [5], [13]. In fact, many approximation problems can be formulated as convex semi-infinite programming problems, in which case the results of this paper are readily applicable. In particular, the problem of linear Chebyshev approximation subject to side constraints

(MM) $\quad \text{Min}_x \ \max_{t \in T} \left| f(t) - \sum_{j=1}^{n} x_j g_j(t) \right|$

s.t.

$$l(t) \le \sum_{j=1}^{n} x_j g_j(t) \le u(t) \quad \text{for all } t \in T$$

is equivalent to the linear semi-infinite programming problem

(L) $\quad \text{Min } x_{n+1}$

s.t.

$$-x_{n+1} \le \sum_{j=1}^{n} x_j g_j(t) - f(t) \le x_{n+1}$$

$$l(t) \le \sum_{j=1}^{n} x_j g_j(t) \le u(t)$$

for all $t \in T$.

Corollary 3 of this paper can be applied to (L) and it gives a characterization of the best approximation for the problem (MM). Uniqueness of the best approximation can be checked using Theorem 2. Rather than going into details we will illustrate this application by an example.

EXAMPLE 9: The approximation problem stated in this example is taken from [4], see also [15]. It shows that there exist situations when the Kuhn-Tucker theory for semi-infinite programming fails to establish optimum even in the case of linear constraints. However, the optimality is established using the results of this paper.

The linear Chebyshev approximation problem is

$$\text{Min} \ \max_{t \in [0,1]} \left| t^4 - x_1 - x_2 t \right|$$

s.t.

$$-t \le x_1 + x_2 t \le t^2, \quad \text{for all } t \in [0,1].$$

An equivalent linear semi-infinite programming problem is

$$\text{Min } f^0 = x_3$$

s.t.

$$f^1 = t^4 - x_1 - x_2 t - x_3 \le 0$$

$$f^2 = -t^4 + x_1 + x_2 t - x_3 \le 0$$

$$f^3 = -t^2 + x_1 + x_2 t \le 0$$

$$f^4 = -t - x_1 - x_2 t \le 0$$

for all $t \in [0,1]$.

Is $x^* = (0,0,1)'$ optimal?

Here $T_1^* = \{1\}$, $T_2^* = \emptyset$, $T_3^* = \{0\}$, $T_4^* = \{0\}$ and $P^* = \{1,3,4\}$. The system (A), (B$_5$) is

(A) $\quad d_3 < 0$

(B$_5$) $\quad -d_1 - d_2 - d_3 \le 0$

$\qquad\ \ d_1 \le 0$

$\qquad\ \ -d_1 \le 0$

and it is clearly consistent (set e.g. $d_1 = 0$, $d_2 = 1$, $d_3 = -1$). Therefore, Theorem 5 cannot be applied. (Since the system (K-T) is inconsistent, $x^* = (0,0,1)'$ is not a "Kuhn-Tucker point".) But the system

(B 2 ) 



d 3 <0 

d 2 ~ d 3 < o 



dot- d 

t 4 - 1 
d { + d 2 t 

d x + d 2 t 

I > -1, for all t € [0,1) 

> -1, for all / € (0,1] 

> -1, for all t 6 (0,1] 


is inconsistent. (First, $d_1 = 0$, by the last two inequalities in (B$_2$). Now (A) and (B$_2$) imply $d_2 > 0$. This contradicts $d_2 \le 0$ obtained from the second inequality in (C$_1$).) Therefore $x^* = (0,0,1)'$ is optimal, by Corollary 1.
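The conclusion of Example 9 can be sanity-checked by brute force. The sketch below is a numerical illustration, not a proof: the grid on $[0,1]$, the candidate values of $x_2$ and the helper names are assumptions. It uses the fact that the side constraint at $t = 0$ reads $0 \le x_1 \le 0$, so only $x_1 = 0$ needs to be considered, and then compares the Chebyshev error over sampled feasible $x_2$.

```python
T = [i / 200.0 for i in range(201)]   # sample grid on [0, 1] (an assumption)

def cheb_error(x1, x2):
    """max over the grid of |t^4 - x1 - x2*t|."""
    return max(abs(t ** 4 - x1 - x2 * t) for t in T)

def satisfies_side_constraints(x1, x2, tol=1e-12):
    """Grid check of -t <= x1 + x2*t <= t^2."""
    return all(-t - tol <= x1 + x2 * t <= t * t + tol for t in T)

# Candidates with x1 = 0 and x2 sampled in [-1, 0].
candidates = [(0.0, -k / 20.0) for k in range(21)]
feasible = [c for c in candidates if satisfies_side_constraints(*c)]
best = min(feasible, key=lambda c: cheb_error(*c))

print(len(feasible), best, cheb_error(*best))
print(satisfies_side_constraints(0.0, 0.05))   # a positive x2 violates x2*t <= t^2 near t = 0
```

Among the sampled candidates the smallest error is attained at $x_1 = x_2 = 0$ with value 1, which matches $x^* = (0,0,1)'$.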


The authors are indebted to Professor G. Schmidt for providing some of the constraint 
functions used in Examples 1, 3 and 4, Mr. H. Wolkowicz for providing a counter-example to 
one of their earlier conjectures and the referee for his recommendations about organization of 
the paper and providing a correct version of Lemma 3. 


[1] Ben-Tal, A., A. Ben-Israel and S. Zlobec, "Characterization of Optimality in Convex Programming without a Constraint Qualification," Journal of Optimization Theory and Applications, 20, 417-437 (1976).

[2] Charnes, A., W.W. Cooper and K.O. Kortanek, "Duality, Haar Programs and Finite 
Sequence Spaces," Proceedings of the National Academy of Science, 48, 783-786 

[3] Charnes, A., W.W. Cooper and K.O. Kortanek, "On the Theory of Semi-Infinite Program- 
ming and a Generalization of the Kuhn-Tucker Saddle Point Theorem for Arbitrary 
Convex Functions," Naval Research Logistics Quarterly, 16, 41-51 (1969). 

[4] Gehner, K.R., "Necessary and Sufficient Conditions for the Fritz John Problem with 
Linear Equality Constraints," SIAM Journal on Control, 12, 140-149 (1974). 

[5] Gehner, K.R., "Characterization Theorems for Constrained Approximation Problems via 
Optimization Theory," Journal of Approximation Theory, 14, 51-76 (1975). 

[6] Gorr, W. and K.O. Kortanek, "Numerical Aspects of Pollution Abatement Problems: Con- 
strained Generalized Moment Techniques," Carnegie-Mellon University, School of 
Urban and Public Affairs, Institute of Physical Planning Research Report No. 12 

[7] Gustafson, S.A. and K.O. Kortanek, "Analytical Properties of Some Multiple-Source Urban 
Diffusion Models," Environment and Planning, 4, 31-41 (1972). 

[8] Gustafson, S.A. and K.O. Kortanek, "Numerical Treatment of a Class of Semi-Infinite Pro- 
gramming Problems," Naval Research Logistics Quarterly, 20, 477-504 (1973). 

[9] Gustafson, S.A. and J. Martna, "Numerical Treatment of Size Frequency Distributions with Computer Machine," Geologiska Foreningens Forhandlingar, 84, 372-389 (1962).

[10] Kantorovich, L.V. and G.Sh. Rubinshtein, "Concerning a Functional Space and Some Extremum Problems," Dokl. Akad. Nauk. SSSR 115, 1058-1061 (1957).

[11] Lehmann, R. and W. Oettli, "The Theorem of the Alternative, the Key-Theorem and the Vector-Maximum Problem," Mathematical Programming, 8, 332-344 (1975).

[12] Mangasarian, O.L., Nonlinear Programming, McGraw Hill, New York (1969).

[13] Rabinowitz, P., "Mathematical Programming and Approximation," Approximation Theory, A. Talbot (editor), Academic Press (1970).

[14] Rockafellar, R.T., Convex Analysis, Princeton University Press, Princeton, N.J. (1970).

[15] Taylor, G.D., "On Approximation by Polynomials Having Restricted Ranges," SIAM Journal on Numerical Analysis, 5, 258-268 (1968).
[16] Zlobec, S., "Extensions of Asymptotic Kuhn-Tucker Conditions in Mathematical Program- 
ming," SIAM Journal on Applied Mathematics, 21, 448-460 (1971). 


Patrick G. McKeown 

University of Georgia 
Athens, Georgia 


Logistics managers often encounter incremental quantity discounts when 
choosing the best transportation mode to use. This could occur when there is a 
choice of road, rail, or water modes to move freight from a set of supply points 
to various destinations. The selection of mode depends upon the amount to be 
moved and the costs, both continuous and fixed, associated with each mode. 
This can be modeled as a transportation problem with a piecewise-linear objec- 
tive function. In this paper, we present a vertex ranking algorithm to solve the 
incremental quantity discounted transportation problem. Computational results 
for various test problems are presented and discussed. 


Whenever a logistics manager is making a decision about the movement of freight, he is 
often faced with choosing from among different modes of transportation. Movement of freight 
by air or motor express may involve no fixed costs to the transporter, but will usually involve 
relatively higher variable costs than either rail or water. However, both rail and water can 
involve the investment of large sums for rail sidings or docking facilities. The problem of 
selecting freight modes can be modeled as a transportation problem with a piecewise-linear objective function. This problem has been termed the incremental quantity discounted transportation problem, since it is assumed that the variable costs decrease as the amount shipped increases. This comes about due to the lower variable costs for rail or water modes relative to
air or road freight costs. The presence of fixed costs for the use of rail or water determines the 
range of shipment levels over which each cost will be applicable. Figure 1 shows this type of 
objective function. 

In this paper we will present a vertex ranking algorithm to solve this type of problem, along with the computational results from various sizes and types of problems. Background material
is discussed in Section 2, while the details of the algorithm are given in Section 3. An example 
is worked out in Section 4 while Section 5 gives computational results. 


The incremental quantity discounted transportation problem is a member of a general class of mathematical programming problems, i.e., the piecewise-linear programming problem. Vogt and Evans [15] considered the case of the piecewise-linear transportation problem derived from U.S. freight rates. This problem is neither convex nor concave, and has sections of the objective function which are flat or "free." Figure 2 shows this case. Vogt and Evans used separable nonconvex programming to reach an approximately optimal solution to this problem. Balachandran and Perry [1] consider another version of this problem which they termed the all unit quantity discount transportation problem. The main difference between this and the previous case is the lack of the flat section of the objective function. The latter case is typical of some foreign freight rates, and is shown in Figure 3 below.

[Figure 1: objective function plotted against quantity shipped (flow $x_{ij}$), with a segment labeled "Rail"]

[Figure 2: objective function plotted against quantity shipped (flow $x_{ij}$), with "free" sections marked]

Problems similar to this one have been mentioned in the plant location literature, e.g., 
Townsend [14], and Efroymson and Ray [5]. In these cases, it is suggested that the problem be 
solved by considering multiple plants, one for each range of demand. 

[Figure 3: objective function plotted against quantity shipped (flow $x_{ij}$)]

Balachandran and Perry presented a branch and bound algorithm for the all unit quantity discount problem which, they show, will also work for the incremental quantity discount problem as well as fixed charge transportation problems. However, no computational results are given to demonstrate the efficiency of this algorithm. Here, we will consider a vertex ranking algorithm for only the incremental quantity discount problem for two reasons. First, fixed charge transportation problems have been handled in several other places in a manner that has been shown to be superior to vertex ranking [2,8]. Secondly, the incremental quantity discount transportation problem has a concave objective function, whereas neither the problem considered by Vogt and Evans nor the all unit quantity discount transportation problem has a concave objective function. This is crucial to the use of vertex ranking since this procedure will only consider vertices of the constraint set, and the optimal solution to problems with nonconcave objective functions need not occur at a vertex.

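The incremental quantity discount problem may be formulated as follows (following the model proposed by Balachandran and Perry [1]):

(1) $\quad \text{Min } Z = \sum_{i \in I} \sum_{j \in J} C_{ij}^k x_{ij} + \sum_{i \in I} \sum_{j \in J} f_{ij}^k y_{ij}^k$

subject to

(2) $\quad \sum_{j \in J} x_{ij} = a_i$ for $i \in I$

(3) $\quad \sum_{i \in I} x_{ij} = b_j$ for $j \in J$

(4) $\quad C_{ij}^k = \begin{cases} C_{ij}^1 & \text{if } \lambda_{ij}^0 \le x_{ij} < \lambda_{ij}^1 \\ C_{ij}^2 & \text{if } \lambda_{ij}^1 \le x_{ij} < \lambda_{ij}^2 \\ \ \vdots & \\ C_{ij}^r & \text{if } \lambda_{ij}^{r-1} \le x_{ij} < \infty, \end{cases}$

(5) $\quad y_{ij}^k = \begin{cases} 1 & \text{if } \lambda_{ij}^{k-1} \le x_{ij} < \lambda_{ij}^k \\ 0 & \text{otherwise,} \end{cases}$

(6) $\quad f_{ij}^k = f_{ij}^{k-1} + (C_{ij}^{k-1} - C_{ij}^k)\lambda_{ij}^{k-1}$ for $k = 2, 3, \ldots, r$

(7) $\quad f_{ij}^1 = 0, \quad x_{ij} \ge 0$ for all $i \in I$ and $j \in J$,

where

$J = \{1, \ldots, n\}$ = set of sinks,

$I = \{1, \ldots, m\}$ = set of sources,

$R = \{1, \ldots, r\}$ = set of cost intervals.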

As may be easily seen, this is a generalization of the fixed charge transportation problem
(see [1]), with a fixed charge, $f_{ij}^{k}$, and a continuous cost, $C_{ij}^{k}$, for each range of
shipment between source $i$ and destination $j$. Since the situation which we are attempting to
model, i.e., the choice of shipment mode, does involve various levels of fixed charge, (1) - (7)
is the proper formulation for this problem. It should be noted that we are implicitly assuming that

$$C_{ij}^{k} > C_{ij}^{k+1} \quad \text{for all } i \in I,\ j \in J.$$

This is necessary for the concavity of the objective function. The assumption is not restrictive,
since we would expect lower continuous costs to occur at higher shipment levels.
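
The role of the decreasing-cost assumption can be checked numerically. The following is a minimal sketch, not the authors' code: the function names and the single-arc data (costs 5, 4, 3 with breakpoints at 10 and 20) are illustrative assumptions, and the fixed charges are built with an incremental recursion that we take to be equivalent to (6). It confirms that, when costs decrease, the arc cost equals the lower envelope of its linear pieces, which is what makes it concave.

```python
from bisect import bisect_right

def fixed_charges(costs, breaks):
    """Return [f^1, ..., f^r] for one arc (hypothetical helper).

    costs[k]  : continuous cost on the (k+1)-th range, assumed strictly decreasing
    breaks[k] : breakpoint between range k+1 and range k+2 (len(costs) - 1 values)
    """
    f = [0.0]  # f^1 = 0, as in (7)
    for k in range(1, len(costs)):
        # Incremental form assumed equivalent to (6):
        # f^k = f^(k-1) + (C^(k-1) - C^k) * lambda^(k-1)
        f.append(f[-1] + (costs[k - 1] - costs[k]) * breaks[k - 1])
    return f

def arc_cost(x, costs, breaks):
    """F(x) = C^k x + f^k, where k is the cost range containing x."""
    k = bisect_right(breaks, x)  # 0-based index of the active range
    return costs[k] * x + fixed_charges(costs, breaks)[k]

costs, breaks = [5.0, 4.0, 3.0], [10.0, 20.0]  # illustrative arc data
f = fixed_charges(costs, breaks)

# With decreasing costs, F coincides with the minimum over all linear pieces,
# i.e., it is a concave piecewise-linear function.
for x in [0.0, 4.0, 10.0, 15.0, 20.0, 35.0, 70.0]:
    assert arc_cost(x, costs, breaks) == min(c * x + fk for c, fk in zip(costs, f))
```

If the costs were not decreasing, the envelope identity in the final loop would fail at some flow level, which is one way to see why the assumption is needed.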

Balachandran and Perry [1] suggested that (1) - (7) may be solved by a branch and bound
algorithm. Their procedure is similar to that used to solve travelling salesmen problems by
driving out subtours [13]. They solve the transportation problem with all costs set to their
lowest value, i.e., $C_{ij}^{r}$. If any routes have flow below $\lambda_{ij}^{r-1}$, branching is
done on one of these variables. Two branches are used. One branch forces the flow over the arc
above the lower limit for the cost level $C_{ij}^{r}$, i.e., $x_{ij} \ge \lambda_{ij}^{r-1}$. In the
other branch, the infeasible cost, $C_{ij}^{r}$, is replaced by the feasible cost, $C_{ij}^{k}$. This
continues until a solution is found where the arc flows match the costs used. This is the optimal
solution. However, the effectiveness of the procedure is unknown since the authors did not
provide any computational results.

It would also appear that the work of Kennington [8] on the fixed charge transportation
problem could possibly be modified to solve this problem by having multiple arcs between each
pair of nodes. Each arc would be bounded by $\lambda_{ij}^{k-1}$ and $\lambda_{ij}^{k}$, with multiple
continuous costs and fixed costs. However, this would lead to effectively larger problems, e.g.,
a problem with 60 arcs and five breakpoints would have 300 variables in the new problem.


Using the formulation of the incremental quantity discount transportation problem given 
in (1) - (7), along with the assumption of decreasing costs, we have a problem with linear con- 
straints and concave objective function. It is well known [7] that an optimal solution for prob- 
lems of this type will occur at a vertex of the constraint set. Examples of other problems that 
share this condition are the fixed charge problem, the quadratic transportation problem, and the 
quadratic assignment problem. Murty [12] was the first to suggest a vertex ranking scheme for
a problem of this category. He showed that the fixed charge problem could be solved by ranking
the vertices of the constraint set according to the objective value up to some upper bound. At
that point, the optimal solution would be found at one or more of the ranked vertices.


We may formulate any problem with a concave objective function and linear constraints as

(8) Min $f(x)$

(9) s.t. $x \in S$

(10) where $S = \{x \mid Ax = b,\ x \ge 0\}$.

Since no "direct" optimization techniques exist for the case where $f(x)$ is nonlinear, we
shall look at a procedure for searching the vertices of $S$. To do this, we will use a linear
underapproximation of $f(x)$, say $L(x)$, such that $L(x) \le f(x)$, $x \in S$. In this case, to
show that $x^*$ is an optimal solution to (8) - (10), we need only rank the vertices of $S$ until a
vertex $x^0$ is found such that $L(x^0) \ge f(x^*)$. At this point, all vertices that could possibly
be optimal have been ranked. This is proved by Cabot and Francis [3].
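
The stopping rule can be illustrated with a small sketch. It assumes, purely for illustration, that the vertices of $S$ are available as an explicit list and can simply be sorted by $L$; an actual ranking procedure generates them one at a time from adjacent vertices. The function name, the pentagon, and the functions `f` and `L` below are hypothetical.

```python
def rank_until_bound(vertices, f, L):
    """Scan vertices in nondecreasing order of L(x), keeping the best f(x) seen.

    Stops at the first vertex whose L-value is at least the incumbent f-value:
    since L(x) <= f(x) on S, no later vertex can improve on the incumbent.
    Returns (best vertex, best f-value, number of vertices evaluated).
    """
    best_x, best_f, ranked = None, float("inf"), 0
    for v in sorted(vertices, key=L):
        if L(v) >= best_f:
            break
        ranked += 1
        fv = f(v)
        if fv < best_f:
            best_x, best_f = v, fv
    return best_x, best_f, ranked

# Illustrative data: vertices of {x >= 0, x1 <= 2, x2 <= 2, x1 + x2 <= 3}.
verts = [(0, 0), (2, 0), (2, 1), (1, 2), (0, 2)]
f = lambda x: -(x[0] ** 2 + x[1] ** 2)   # concave objective
L = lambda x: -2 * x[0] - 2 * x[1]       # linear, and L <= f whenever 0 <= x1, x2 <= 2

best_x, best_f, ranked = rank_until_bound(verts, f, L)
print(best_x, best_f, ranked)
```

In this toy instance only two of the five vertices need to be evaluated before the bound closes the search.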

In order to rank the extreme points of $S$, we need to use a result also first proved by
Murty, given as Theorem 1 below:

THEOREM 1: If $E_1, E_2, \ldots, E_K$ are the first $K$ vertices of a linear underapproximation
problem which are ranked in nondecreasing order according to their objective value, then vertex
$E_{K+1}$ must be adjacent to one of $E_1, E_2, \ldots, E_K$.

Simply put, this says that vertex 2 will be adjacent to the optimal solution to the linear
underapproximation, vertex 3 will be adjacent to vertex 1 or vertex 2, and so on. This,
then, gives us a procedure for ranking the vertices if all adjacent vertices can be found. It is
this "if" that can cause problems. These problems arise due to the possibility of degeneracy in
$S$. If $S$ is degenerate, then there may exist multiple bases for the same vertex. This implies
that all such bases must be available before one can be sure that all adjacent vertices have been
found. Finding all such bases in order to find and "scan" all adjacent vertices can be quite
cumbersome. However, a recent application of Chernikova's work [4,9] has been shown to be
a way around the problem of degeneracy.

Vertex ranking has been used by McKeown [10] to solve fixed charge problems and by
Fluharty [6] to solve quadratic assignment problems. Cabot and Francis [3] also proposed the use
of vertex ranking to solve a certain class of nonconvex quadratic programming problems, e.g.,
quadratic transportation problems. For a survey of vertex ranking procedures, see [11].

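In our problem, we need to determine the linear underapproximation to the objective
function, (1). We may do this by first noting that

(11) $$u_{ij} = \min \{a_i, b_j\}$$

is an upper bound on $x_{ij}$. We may then note that if $F(x_{ij}) = C_{ij}^{k} x_{ij} + f_{ij}^{k} y_{ij}$, then

(12) $$l_{ij} x_{ij} = \frac{F(u_{ij}) - F(0)}{u_{ij}} \, x_{ij} = \frac{C_{ij}^{k} u_{ij} + f_{ij}^{k}}{u_{ij}} \, x_{ij}$$

for $\lambda_{ij}^{k-1} \le u_{ij} < \lambda_{ij}^{k}$, is a linear underapproximation to $F(x_{ij})$.

We may now form a problem to rank vertices, i.e.,

(13) $$\min Z = \sum_{i} \sum_{j} \frac{C_{ij}^{k} u_{ij} + f_{ij}^{k}}{u_{ij}} \, x_{ij}$$

subject to (2) - (7)

for $\lambda_{ij}^{k-1} \le u_{ij} < \lambda_{ij}^{k}$.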


Using (13) and (2) - (7), we may rank vertices as discussed earlier until some vertex $x^0$
is found such that $L(x^0) = \sum_{i} \sum_{j} l_{ij} x_{ij}^{0} \ge f(x^*)$, where $x^*$ is a candidate
for optimality. We may start with $x^*$ equal to the optimal solution to (13) and (2) - (7), and
then update it as new, possibly better solutions to (1) - (7) are found by the ranking procedure.
When all vertices $x$ such that $L(x) < f(x^*)$ have been found, the solution procedure terminates
with the present candidate being optimal.

EXAMPLE: As an example of our procedure, we will solve an incremental quantity
discount version of the example problem presented by Balachandran and Perry [1]. Table 1
below gives the supplies, demands, and costs for each range of shipment. Table 2 gives the
optimal solution to the linear underapproximation problem. The values of $l_{ij}$ are given in the
upper right-hand corner of each cell, with shipments being circled in the basic cells.


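Table 1 (unit cost [range of shipment]; supplies, demands, and some cell entries did not survive)

Source 1
  Destination 1: 4 [$10 \le x_{11} < 20$]
  Destination 2: 6 [$10 \le x_{12} < \infty$]; 7 [$5 \le x_{12} < 10$]; 8 [$0 \le x_{12} < 5$]
  Destination 3: 3 [$27 \le x_{13} < \infty$]; 4 [$15 \le x_{13} < 27$]; 5 [$5 \le x_{13} < 15$]
  Destination 4: one price bracket, 4

Source 2
  Destination 1: one price bracket, 6
  Destination 2: 6 [$20 \le x_{22} < 65$]; 8 [$0 \le x_{22} < 20$]
  Destination 3: 8 [$10 \le x_{23} < \infty$]; 9 [$5 \le x_{23} < 10$]; 10 [$0 \le x_{23} < 5$]
  Destination 4: one price bracket, 15

Source 3
  Destination 1: 1 [$27 \le x_{31} < \infty$]; 2 [$20 \le x_{31} < 27$]; 3 [$0 \le x_{31} < 20$]
  Destination 2: 3 [$60 \le x_{32} < \infty$]; 4 [$30 \le x_{32} < 60$]; 5 [$0 \le x_{32} < 30$]
  Destination 3: 10 [$20 \le x_{33} < \infty$]; 11 [$10 \le x_{33} < 20$]; 12 [$0 \le x_{33} < 10$]
  Destination 4: 5 [$30 \le x_{34} < \infty$]; 6 [$20 \le x_{34} < 30$]; 7 [$0 \le x_{34} < 20$]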






Table 2 (surviving $l_{ij}$ values in reading order; cell positions and shipments did not survive)

3.43, 6.25, 4.00, 6.00, 8.43, 1.85, 4.55, 5.91









As an example of the calculation of the $l_{ij}$ values, we will look at $l_{11}$. First, it is
necessary to calculate $f_{11}^{2}$ and $f_{11}^{3}$ using (6). We will do $f_{11}^{2}$.

$$f_{11}^{2} = C_{11}^{1} \left( \lambda_{11}^{1} - \lambda_{11}^{0} \right) - C_{11}^{2} \lambda_{11}^{1} = (5)(10) - 4(10) = 10.$$

Similarly, $f_{11}^{3} = 30$. $u_{11} = \min \{80, 70\} = 70$. Then, $l_{11} = \dfrac{(3)(70) + 30}{70} = 3.43$.
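
The arithmetic above can be reproduced with a short sketch. The function name is hypothetical, and the arc data (costs 5, 4, 3 with breakpoints 10 and 20, upper bound 70) are read off the worked calculation for cell (1,1); the fixed charge is accumulated breakpoint by breakpoint, which we assume agrees with (6).

```python
def underapprox_slope(costs, breaks, u):
    """Slope l = (C^k u + f^k) / u of the linear underapproximation on [0, u].

    costs  : continuous costs per range, assumed decreasing
    breaks : breakpoints between consecutive ranges
    u      : upper bound on the arc flow, u = min(supply, demand) > 0
    """
    f, k = 0.0, 0
    # Walk up to the range containing u, accumulating the fixed charge f^k.
    while k < len(breaks) and u >= breaks[k]:
        f += (costs[k] - costs[k + 1]) * breaks[k]
        k += 1
    return (costs[k] * u + f) / u

l11 = underapprox_slope([5.0, 4.0, 3.0], [10.0, 20.0], u=70.0)
print(round(l11, 2))  # 3.43
```

For an arc with a single price bracket the slope reduces to that price, e.g. `underapprox_slope([4.0], [], 80.0)` gives 4.0.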

Now, if we solve this continuous transportation problem, we get a value of Z = 1042.20
with the circled cells being basic. If we compute the feasible value for this solution, i.e., its
cost under the original objective (1), we get Z = 1067. Call this solution $X^1$.

Now, since this solution is nondegenerate, we may use simplex pivoting to look at each 
nonbasic cell. The values of these adjacent vertices are given below: 

Vertex Z-Value 













Since the Z-value for each vertex is greater than the present value of Z, we do not need to rank 
any other vertices, and Z= 1067.0 is the optimal solution value. 


To test the vertex ranking procedure discussed here, randomly generated problems were
run. These problems were generated by first generating supplies and demands uniformly
between lower and upper bounds, L and U. These supplies and demands were generated so
that they were all multiples of 5. This was done to ensure the presence of degeneracy in some
of the problems. All problems were set up to have discount ranges at 20, 50, 300, 1000, and
2000. By proper selection of L and U, various numbers of ranges could be tested.

The costs for each arc were generated by randomly generating mileages between each pair
of nodes, and then inputting discount cost-per-mile values for each range of flow, e.g., 10, 9,
8, etc. The final discount costs were found by multiplying the mileage of each arc by
the various costs. In this way, various supply-demand discount ranges and cost configurations
could be tested. These problems were generated and solved using a computer code in FORTRAN
run on the CYBER 70/74 using the FTN Compiler with OPT = 1.

The problem characteristics and test results are given in Table 3. The second column
gives the solution time in seconds, while the third column shows the number of vertices of the
linear underapproximation, other than the optimal solution, that were ranked to solve each
problem. The fourth column gives the size of the problem (m x n); the fifth column gives the
number of cost ranges that the arc flows would cover; the sixth column gives the cost per mile
for each range of flow, $p_{ij}^{k}$; the seventh column gives the lower and upper ranges used to
generate the supplies and demands; and finally, the last column gives the ranges used to generate
mileages. The $C_{ij}^{k}$ values were determined by $C_{ij}^{k} = p_{ij}^{k}$ (mileage). As can be seen,
the algorithm successfully solved all problems tested. The most difficult problems were those with
three ranges and supplies/demands between 5 and 100. Problems 6 and 13 are identical, except
that 6 is over only 3 ranges, while 13 is over 5; but problem 13 is solved in much less time. In
fact, the linear underapproximating transportation problem was found to be optimal and no
other extreme points were even ranked. This was also the case in problems 7, 9, 10, 11, and
12, even though the number of variables increased markedly. It is also interesting to note the
effect of costs in problems 5, 6, and 7. These are essentially the same problem, but with the
percent decrease in cost for increasing flow being less in each case. The results are as expected,
since in problem 7 the linear underapproximation will be closer to the actual objective function
than in problems 5 and 6.

TABLE 3 — Computational Results 





[Table entries not recoverable; surviving fragments: column heading "Number of", problem sizes 6 x 8 and 8 x 8]


























It would appear from these results that vertex ranking does hold promise as a solution
procedure for incremental quantity discount transportation problems. Neither size of problem nor
degeneracy appears to have any effect on solution time, but cost patterns and number of cost
ranges do seem to have a marked effect.

Extensions of this work could be used to solve other concave linear programming prob- 
lems. Walker [16] discusses the fact that these can be considered as generalizations of fixed 
charge problems. The main difference would be that the first linear portion would have a posi- 
tive fixed charge rather than zero, as in the problem discussed here. However, this would not 
change the approach to the solution used here. 


[1] Balachandran, V. and A. Perry, "Transportation Type Problems with Quantity Discounts,"
Naval Research Logistics Quarterly, 23, 195-209 (1976).

[2] Barr, R.L., "The Fixed Charge Transportation Problem," presented at the Joint National
Meeting of ORSA/TIMS in Puerto Rico (Nov. 1974).

[3] Cabot, A.V. and R.L. Francis, "Solving Certain Nonconvex Quadratic Minimization
Problems by Ranking the Extreme Points," Operations Research, 18, 82-86 (1970).

[4] Chernikova, N.V., "Algorithm for Finding a General Formula for the Non-negative
Solutions of a System of Linear Inequalities," U.S.S.R. Computational Mathematics and
Mathematical Physics.

[5] Efroymson, M.A. and T.L. Ray, "A Branch-Bound Algorithm for Plant Location,"
Operations Research, 14, 361-368 (1966).

[6] Fluharty, R., "Solving Quadratic Assignment Problems by Ranking the Assignments,"
unpublished Master's Thesis, Ohio State University (1970).

[7] Hirsch, W.M. and A.J. Hoffman, "Extreme Varieties, Concave Functions, and The Fixed
Charge Problem," Communications on Pure and Applied Mathematics, 14, 355-370

[8] Kennington, J.L., "The Fixed Charge Transportation Problem: A Computational Study
with a Branch and Bound Code," AIIE Transactions, 8 (1976).

[9] McKeown, P.G. and D.S. Rubin, "Adjacent Vertices on Transportation Polytopes," Naval
Research Logistics Quarterly, 22, 365-374 (1975).

[10] McKeown, P.G., "A Vertex Ranking Procedure for Solving the Linear Fixed Charge
Problem," Operations Research, 23, 1183-1191 (1975).

[11] McKeown, P.G., "Extreme Point Ranking Algorithms: A Computational Survey,"
Proceedings of Bicentennial Conference on Mathematical Programming (1976).

[12] Murty, K., "Solving the Fixed Charge Problem by Ranking the Extreme Points,"
Operations Research, 16, 268-279 (1968).

[13] Shapiro, D., "Algorithms for the Solution of the Optimal Cost Travelling Salesmen
Problem," Sc.D. Thesis, Washington University, St. Louis (1966).

[14] Townsend, W., "A Production Stocking Problem Analogous to Plant Location,"
Operations Research Quarterly, 26, 389-396 (1975).

[15] Vogt, L. and J. Evans, "Piecewise Linear Programming Solutions of Transportation Costs
as Obtained from Rate Traffic," AIIE Transactions, 4 (1972).

[16] Walker, Warren E., "A Heuristic Adjacent Extreme Point Algorithm for the Fixed Charge
Problem," Management Science, 22, 587-596 (1976).


J. Intrator and M. Berrebi 

Bar-Ilan University
Ramat-Gan, Israel 


An efficient auxiliary algorithm for solving transportation problems, based 
on a necessary but not sufficient condition for optimum, is presented. 

In this paper a necessary (but not sufficient) condition for a given feasible solution to a 
transportation problem to be optimal is established, and a special algorithm for finding solutions 
which satisfy this condition is adapted as an auxiliary procedure for the MODI method. 

Experimental results presented show that finding an initial solution which satisfies this
necessary condition for problems with $m \ll n$ eliminates 70%-90% of the MODI iterations
(see Table 1).

TABLE 1 — Matrix of Principal Results 

[Rows indexed by m, columns by n; table entries not recoverable]

Fraction of MODI iterations eliminated by using the method presented in this paper.

The case when our algorithm is used during the solution process (especially for m — n) is 
presently being examined. Our auxiliary procedure requires relatively little computational effort 
in finding the appropriate candidate for the basis, eliminating entirely the need to calculate the 
dual variables. It works with positive variables associated with one pair of rows at a time using 
only the prices of these rows. 



Once a loop for any given pair of rows is determined it may be used to insert numerous 
non-basic cells in these two rows to the basis. The result is a considerable time reduction in 
determining loops. 

The storage and time requirements for the special lists needed in our auxiliary algorithm
are fully discussed in [1]. A rigorous proof presented in [1] shows that updating these lists
requires no more than $O(m \log n)$ computer operations per MODI iteration.

A Linear Programming Transportation Problem is characterized by a cost matrix $C$ and
two positive requirement vectors $a$ and $b$ such that $\sum_i a_i = \sum_j b_j$. The problem is to
minimize $\sum_i \sum_j c_{ij} x_{ij}$ subject to

(A)
$$\sum_{i} x_{ij} = b_j, \quad j = 1, 2, \ldots, n$$
$$\sum_{j} x_{ij} = a_i, \quad i = 1, 2, \ldots, m$$
$$x_{ij} \ge 0 \quad \text{for all } (i,j).$$

A proper perturbation of our problem ensures that:

(1) each feasible basic solution of (A) contains exactly $m + n - 1$ positive variables $x_{ij}$,

(2) corresponding to each nonbasic cell $(i,j)$ ($x_{ij} = 0$) there exists a unique loop of
different cells, say

(B) $$L(i,j) = (i,j_1)\,(i_2,j_1)\,(i_2,j_2)\,(i_3,j_2) \cdots (i_r,j_{r-1})\,(i_r,j)\,(i,j)$$

which contains at most two cells in each row and column, where the cell $(i,j)$ is the unique
nonbasic cell,

(3) there are no loops which contain basic cells only.

Notation: For fixed $l, k$ ($1 \le l \ne k \le m$) we denote

$$V_l = \{j \mid x_{lj} > 0\}, \quad 1 \le j \le n$$

$$V_{lk} = V_l \cap V_k = \{j \mid x_{lj} > 0,\ x_{kj} > 0\}.$$

With no loss of generality it is assumed that for each $l$ ($1 \le l \le m$) there exists at least
one destination (column) $j \in V_l$ such that $(l,j)$ is the unique basic cell of column $j$.
Otherwise, an artificial destination, say $J$, with $x_{lJ} = b_J = \epsilon$, where $\epsilon$ is an
infinitely small positive number, will be introduced.

It is easy to see that the feasible solution of the augmented problem of dimension
$m \times (n + 1)$ satisfies (1), (2), (3) mentioned above.

DEFINITION 1: A destination with a unique basic cell will be called a fundamental
destination.

The unique nonbasic cell $(i,j)$ of $L(i,j)$ will be considered for convenience to be the last
cell of $L(i,j)$. For each loop $L(i,j)$, say loop (B), we introduce the notation:

(C) $$C_{L(i,j)} = c_{i j_1} - c_{i_2 j_1} + c_{i_2 j_2} - c_{i_3 j_2} + \cdots + c_{i_r j} - c_{ij}.$$

It is well known that
$C_{L(i,j)} = u_i + v_j - c_{ij}$, where $u_i$ and $v_j$ are the dual prices.

DEFINITION 2: A loop with $C_L > 0$ is called an improving loop.

DEFINITION 3: Let $l, k$ be a fixed pair of numbers so that $1 \le l \ne k \le m$; we define

$$A_{lk} = \{j \mid x_{lj} > 0,\ x_{kj} = 0\} = V_l - V_{lk}$$

$$D_{lk}(j) = c_{lj} - c_{kj}, \quad j = 1, 2, \ldots, n.$$

THEOREM 1: The number of elements in $V_{lk}$ is at most 1.

PROOF: Suppose that $J_1, J_2 \in V_{lk}$ ($1 \le J_1 \ne J_2 \le n$); then the loop
$(l,J_1)\,(k,J_1)\,(k,J_2)\,(l,J_2)$ consists of basic cells only, contradicting (3) above.

Let $J_1$ be a fundamental destination of $A_{lk}$. The purpose of Theorem 2 and Theorem 3 is
to show that all the simplex loops $L(i,J)$ and the numbers $C_{L(i,J)}$, $i = l, k$;
$J \in A_{lk} \cup A_{kl}$, are determined after the simplex loop $L(k,J_1)$ is found.

THEOREM 2: $C_{L(k,J_2)} - D_{lk}(J_2) = C_{L(k,J_1)} - D_{lk}(J_1)$ for all $J_2 \in A_{lk}$,
$J_1$ being the above fundamental destination of $A_{lk}$.


PROOF:

CASE (a) $V_{lk} \ne \emptyset$. Denote by $j$ the unique member of $V_{lk}$ (Th. 1). We have
$j \ne J_1$, $j \ne J_2$ ($J_1$ and $J_2 \notin V_k$) and

$$L(k,J_1) = (k,j)\,(l,j)\,(l,J_1)\,(k,J_1)$$

$$L(k,J_2) = (k,j)\,(l,j)\,(l,J_2)\,(k,J_2)$$

i.e.,

$$C_{L(k,J_2)} - D_{lk}(J_2) = C_{L(k,J_1)} - D_{lk}(J_1).$$

CASE (b) $V_{lk} = \emptyset$. Let $L(k,J_1)$ be the loop (B). Note that $i_r = l$ (since column
$J_1$ contains a basic cell in row $l$ exclusively) and $r > 2$. Otherwise, $i_r = i_2 = l$ and
$L(k,J_1) = (k,j_1)\,(l,j_1)\,(l,J_1)\,(k,J_1)$ means that $j_1 \in V_{lk}$, contradicting the fact
that $V_{lk} = \emptyset$.

Consider the loop:

$$L = (k,j_1)\,(i_2,j_1)\,(i_2,j_2)\,(i_3,j_2) \cdots (l,j_{r-1})\,(l,J_2)\,(k,J_2), \quad (i_1 = k)$$

obtained from $L(k,J_1)$ by substituting $J_2$ for $J_1$. Let us show that either $L(k,J_2) = L$ or
$L(k,J_2)$ can be obtained from $L$ by deleting two identical cells.

At first, observe that all rows and columns of $L$ (except perhaps $J_2$) contain exactly two
different cells of $L$. The column $J_2$ has not appeared previously (unless $J_2 = j_{r-1}$), because
if it equals one of the previous members $j_s$, $1 \le s \le r - 2$, then the loop
$(i_{s+1},j_s)\,(i_{s+1},j_{s+1}) \cdots (i_r,J_2)$ will be a loop of basic cells only, which
contradicts (3).

Thus, only two possibilities exist:

(1) $J_2 \ne j_{r-1}$ and $L(k,J_2) = L$, or

(2) $J_2 = j_{r-1}$ and $L(k,J_2)$ is obtained from $L$ by deleting the two identical cells
$(l,j_{r-1})$ and $(l,J_2)$.

Since this deletion does not affect the value of $C_{L(k,J_2)}$, we have for both possibilities

$$C_{L(k,J_2)} - D_{lk}(J_2) = C_{L(k,J_1)} - D_{lk}(J_1).$$

THEOREM 3: Let $J_1$ and $J_2$ be the destinations defined in Theorem 2 and $J_3 \in A_{kl}$. We
shall prove that

$$C_{L(l,J_3)} = -\left[ C_{L(k,J_2)} - D_{lk}(J_2) \right] + D_{kl}(J_3).$$

PROOF: Let $L(k,J_1)$ be the loop (B) with $i_r = l$ because $J_1$ is fundamental.
Consider the loop $L$ defined by

$$L = (i_r,j_{r-1})\,(i_{r-1},j_{r-1}) \cdots (i_2,j_1)\,(i_1,j_1)\,(k,J_3)\,(l,J_3).$$

CASE (a) $V_{lk} \ne \emptyset$. Same proof as in Theorem 2.

CASE (b) $V_{lk} = \emptyset$. By the same argument as in Theorem 2 we can show that there are
only two possibilities.

1) $J_3 \ne j_1$, which implies that $L(l,J_3) = L$.

2) $J_3 = j_1$. In this case ($r > 2$) $L(l,J_3)$ can be obtained from $L$ by deleting the two
identical cells $(i_1,j_1)$ and $(k,J_3)$.

In the two cases we have

$$C_{L(l,J_3)} = -\left[ C_{L(k,J_1)} - D_{lk}(J_1) \right] + D_{kl}(J_3)$$

and by Theorem 2 we have

$$C_{L(l,J_3)} = -\left[ C_{L(k,J_2)} - D_{lk}(J_2) \right] + D_{kl}(J_3).$$

THEOREM 4: If $D_{lk}(J_2) > D_{lk}(J_3)$ then either $L(k,J_2)$ or $L(l,J_3)$ is an improving
simplex loop.

PROOF: Since $D_{lk}(J_3) = -D_{kl}(J_3)$, it follows from Theorem 3 that

$$C_{L(l,J_3)} + C_{L(k,J_2)} = D_{lk}(J_2) - D_{lk}(J_3) > 0$$

($D_{lk}(J_2) > D_{lk}(J_3)$), and either $C_{L(l,J_3)}$ or $C_{L(k,J_2)}$ is a positive number, i.e.,
either $L(l,J_3)$ or $L(k,J_2)$ is an improving simplex loop (Definition 2).

COROLLARY: At optimum we have $D_{lk}(J_2) \le D_{lk}(J_3)$.

DEFINITION 4: Define $J_{lk}$ by

$$D_{lk}(J_{lk}) = \max_{j \in V_l} D_{lk}(j).$$

REMARK 1: We shall suppose that $D_{lk}(j_1) = D_{lk}(j_2)$ if and only if $j_1 = j_2$. Otherwise,
a cost perturbed problem with $\bar{c}_{ij} = c_{ij} + \epsilon^{mi+j}$ can be considered, and

$$\bar{D}_{lk}(j_1) - \bar{D}_{lk}(j_2) = c_{lj_1} - c_{kj_1} + \epsilon^{ml+j_1} - \epsilon^{mk+j_1} - c_{lj_2} + c_{kj_2} - \epsilon^{ml+j_2} + \epsilon^{mk+j_2},$$

which for sufficiently small $\epsilon > 0$ is equal to zero only for $j_1 = j_2$.


THEOREM 5: If at optimality $V_{lk} = \emptyset$, then $D_{lk}(J_{lk}) < D_{lk}(J_{kl})$
($J_{lk} \ne J_{kl}$); else $D_{lk}(J_{lk}) = D_{lk}(J_{kl})$ ($J_{lk} = J_{kl}$).

PROOF: If $V_{lk} = \emptyset$ then $D_{lk}(J_{lk}) \ne D_{lk}(J_{kl})$; otherwise (by
cost-perturbation) $J_{lk} = J_{kl}$ and $V_{lk} \ne \emptyset$.

The first part of Theorem 5 follows now immediately from the corollary of Theorem 4.

If $V_{lk} \ne \emptyset$ and $j$ is the unique element of $V_{lk}$, then by the definition of $J_{lk}$
and from $j \in V_l$ we have $D_{lk}(j) \le D_{lk}(J_{lk})$.

Let us show that $j = J_{lk}$. Suppose that $j \ne J_{lk}$; then we have $D_{lk}(j) < D_{lk}(J_{lk})$
(Definition 4) and the simplex loop

$L(k,J_{lk}) = (k,j)\,(l,j)\,(l,J_{lk})\,(k,J_{lk})$ will be an improving simplex loop since

$$C_{L(k,J_{lk})} = D_{kl}(j) + D_{lk}(J_{lk}) = D_{lk}(J_{lk}) - D_{lk}(j) > 0,$$

contradicting the fact that we have optimality.

Thus, $j = J_{lk}$. By the same argument we have $j = J_{kl}$.

A simple algorithm consists of

1) Computing the differences $D_{lk}(J_{lk})$,

2) Comparing $D_{lk}(J_{lk})$ to $D_{lk}(J_{kl})$.

If $D_{lk}(J_{lk}) > D_{lk}(J_{kl})$ (or if $J_{lk} \ne J_{kl}$ for non-empty $V_{lk}$) we improve our
solution, using all the nonbasic cells $(l,J)$ or $(k,J)$ where $J \in (V_l \cup V_k)$ such that
$D_{lk}(J_{kl}) \le D_{lk}(J) \le D_{lk}(J_{lk})$, by searching only the first loop involving the rows
$l$ and $k$.

The other loops will be obtained by changing the last two cells, keeping the first $2r - 2$ cells
in the same order or in the opposite order (Theorem 2 and Theorem 3).

REMARK: In order to assure that the first loop will not be a shortened loop, this loop
will be obtained by using a fundamental artificial destination $J$ with only one basic cell in
row $k$, with $x_{kJ} = \epsilon$.

The proposed technique was applied to each pair of rows $(l,k)$ until
$D_{lk}(J_{lk}) \le D_{lk}(J_{kl})$ for all $1 \le l \ne k \le m$. At that point the MODI method was
implemented. Performing a MODI iteration frequently caused $D_{lk}(J_{lk}) > D_{lk}(J_{kl})$ for some
$1 \le l \ne k \le m$, which would enable further utilization of the proposed technique. However,
for the purpose of the present experiment the proposed technique was not reactivated after the
initial processing. (See Table 1)
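
The test applied to each pair of rows can be sketched as follows. This is only an illustration of the necessary condition, not the authors' procedure: the function name, the tolerance, and the 2 x 2 data are assumptions, and the loop construction that actually improves the solution is omitted. It uses the fact that $D_{lk}(J_{lk})$ is the largest $D_{lk}(j)$ over $V_l$, while $D_{lk}(J_{kl})$ is the smallest $D_{lk}(j)$ over $V_k$.

```python
def violating_row_pairs(x, c, tol=1e-12):
    """Return the row pairs (l, k), l < k, that violate the necessary condition.

    x : current feasible solution as a list of rows (x[l][j] >= 0)
    c : cost matrix with the same shape
    The condition max_{j in V_l} D_lk(j) <= min_{j in V_k} D_lk(j), with
    D_lk(j) = c[l][j] - c[k][j], is symmetric in l and k, so only l < k is checked.
    Every row is assumed to contain at least one positive cell.
    """
    m, n = len(x), len(x[0])
    positive = [[j for j in range(n) if x[l][j] > tol] for l in range(m)]  # the sets V_l
    bad = []
    for l in range(m):
        for k in range(l + 1, m):
            d_max = max(c[l][j] - c[k][j] for j in positive[l])  # D_lk(J_lk)
            d_min = min(c[l][j] - c[k][j] for j in positive[k])  # D_lk(J_kl)
            if d_max > d_min:
                bad.append((l, k))
    return bad

c = [[1, 5], [4, 2]]                                # illustrative costs
print(violating_row_pairs([[3, 0], [0, 3]], c))     # the cheap diagonal solution
print(violating_row_pairs([[0, 3], [3, 0]], c))     # the expensive anti-diagonal solution
```

An empty result means the solution passes the necessary condition for every pair of rows; as the text stresses, this does not by itself guarantee optimality.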

The storage and time requirements of the lists $J_{lk}$ when updated at each MODI iteration
are fully discussed in [1].


One possible way to update this list may be described as follows: For each $l$ the
destinations $j \in V_l$ are ordered in $m - 1$ sequences $P_{lk}$ ($1 \le k \ne l \le m$) of increasing
$D_{lk}(j)$. Thus $P_{lk} = \{j_1, j_2, \ldots, j_{N_l}\}$ ($N_l$ = the number of elements in $V_l$),
with $D_{lk}(j_1) < D_{lk}(j_2) < \cdots < D_{lk}(j_{N_l})$ (equality excluded because of the supposed
cost-perturbation). These $P_{lk}$ sequences are organized in heaps. Adding or deleting an item
from a heap requires $O(\log N_l) \le O(\log n)$ computer operations. Since at each simplex
iteration only one basic cell, say $(\sigma,\tau)$, becomes nonbasic and one nonbasic cell, say
$(s,t)$, becomes basic, we have to update $2(m - 1)$ heaps ($P_{\sigma p}$ and $P_{sr}$ for all
$p \ne \sigma$, $r \ne s$), which amounts to $O(m \log n)$ computer operations per simplex
iteration (for heaps, see [2]).
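
One such heap might be maintained as in the sketch below. It is an assumption-laden illustration rather than the data structure of [1]: Python's `heapq` has no direct deletion, so deletion is done lazily (stale entries are discarded when they reach the top), which gives the logarithmic cost in an amortized sense. The class and method names are hypothetical.

```python
import heapq

class LazyMaxHeap:
    """Keeps the destination with the largest key D_lk(j) for one ordered pair (l, k)."""

    def __init__(self):
        self._heap = []   # entries (-key, item); heapq is a min-heap, so keys are negated
        self._live = {}   # item -> key for items currently in V_l

    def add(self, item, key):
        """A cell in column `item` became basic in row l."""
        self._live[item] = key
        heapq.heappush(self._heap, (-key, item))

    def remove(self, item):
        """A cell in column `item` left the basis; physical removal is deferred."""
        self._live.pop(item, None)

    def top(self):
        """Return the item with the largest key (the current J_lk), or None if empty."""
        while self._heap:
            neg_key, item = self._heap[0]
            if self._live.get(item) == -neg_key:
                return item
            heapq.heappop(self._heap)  # stale entry left over from an earlier removal
        return None

heap = LazyMaxHeap()
heap.add(0, 2.0)
heap.add(3, 5.0)
heap.add(1, -1.0)
first = heap.top()     # column with the largest difference
heap.remove(3)
second = heap.top()    # next largest after the removal
```

At a simplex iteration one would call `remove` on the heaps of the row losing a basic cell and `add` on the heaps of the row gaining one, i.e., on $2(m - 1)$ heaps in total.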


[1] Brandt, A. and J. Intrator, "Fast Algorithms for Long Transportation Problems," Computers
and Operations Research, 5, 263-271 (1978).

[2] Knuth, D.E., The Art of Computer Programming, 3, Sorting and Searching, Addison Wesley


Hanif D. Sherali and C. M. Shetty 

School of Industrial & Systems Engineering 

Georgia Institute of Technology 

Atlanta, Georgia 


In this paper we address the question of deriving deep cuts for nonconvex
disjunctive programs. These problems include logical constraints which restrict 
the variables to at least one of a finite number of constraint sets. Based on the 
works of Balas, Glover, and Jeroslow, we examine the set of valid inequalities 
or cuts which one may derive in this context, and defining reasonable criteria 
to measure depth of a cut we demonstrate how one may obtain the "deepest" 
cut. The analysis covers the case where each constraint set in the logical state- 
ment has only one constraint and is also extended for the case where each of 
these constraint sets may have more than one constraint. 


A Disjunctive Program is an optimization problem where the constraints represent logical 
conditions. In this study we are concerned with such conditions expressed as linear constraints. 
Several well-known problems can be posed as disjunctive programs, including the zero-one 
integer programs. The logical conditions may include conjunctive statements, disjunctive state- 
ments, negation and implication as discussed in detail by Balas [1,2]. However, an implication 
can be restated as a disjunction, and conjunctions and negations lead to a polyhedral constraint 
set. Thus, this study deals with the harder problem involving disjunctive restrictions which are 
essentially nonconvex problems. 

It is interesting to note that disjunctive programming provides a powerful unifying theory
for cutting plane methodologies. The approach taken by Balas [2] and Jeroslow [14] is to
characterize all valid cutting planes for disjunctive programs. As such, it naturally leads to a
statement which subsumes prior efforts at presenting a unified theory using convex sets, polar
sets and level sets of gauge functions [1,2,5,6,8,13,14]. On the other hand, the approach taken
by Glover [10] is to characterize all valid cutting planes through relaxations of the original
disjunctive program. Constraints are added sequentially, and when all the constraints are
considered Glover's result is equivalent to that of Balas and Jeroslow. Glover's approach is a
constructive procedure for generating valid cuts, and may prove useful algorithmically.

The principal thrust of the methodologies of disjunctive programming is the generation of
cutting planes based on the linear logical disjunctive conditions in order to solve the
corresponding nonconvex problem. Such methods have been discussed severally by Balas
[1,2,3], Glover [8], Glover, Klingman and Stutz [11], Jeroslow [14] and briefly by Owen [17].
But the most fundamental and important result of disjunctive programming has been stated by
Balas [1,2] and Jeroslow [14], and in a different context by Glover [10]. It unifies and
subsumes several earlier statements made by other authors and is restated below. This result not
only provides a basis for unifying cutting plane theory, but also provides a different perspective
for examining this theory. In order to state this result, we will need to use the following
notation and terminology.

*This paper is based upon work supported by the National Science Foundation under Grant No. ENG-77-23683.

Consider the linear inequality systems $S_h$, $h \in H$, given by

(1.1) $$S_h = \{x : A^h x \ge b^h,\ x \ge 0\}, \quad h \in H$$

where $H$ is an appropriate index set. We may state a disjunction in terms of the sets $S_h$,
$h \in H$, as a condition which asserts that a feasible point must satisfy at least one of the
constraint sets $S_h$, $h \in H$. Notationally, we imply by such a disjunction the restriction
$x \in \bigcup_{h \in H} S_h$. Based on this disjunction, an inequality $\pi^t x \ge \pi_0$ will be
considered a valid inequality or a valid disjunctive cut if it is satisfied for each
$x \in \bigcup_{h \in H} S_h$. (The superscript $t$ will throughout be taken to denote the transpose
operation.) Finally, for a set of vectors $\{v^h : h \in H\}$, where $v^h = (v_1^h, \ldots, v_n^h)$ for
each $h \in H$, we will denote by $\sup_{h \in H} (v^h)$ the pointwise supremum
$v = (v_1, \ldots, v_n)$ of the vectors $v^h$, $h \in H$, such that
$v_j = \sup_{h \in H} (v_j^h)$ for $j = 1, \ldots, n$.

Before proceeding, we note that a condition which asserts that a feasible point must satisfy
at least $p$ of some $q$ sets, $p \le q$, may be easily transformed into the above disjunctive
statement by letting each $S_h$ denote the conjunction of the $q$ original sets taken $p$ at a time.
Thus, $|H| = \binom{q}{p}$ in this case. Now consider the following result.

THEOREM 1: (Basic Disjunctive Cut Principle) - Balas [1,2], Glover [10], Jeroslow [14]

Suppose that we are given the linear inequality systems $S_h$, $h \in H$, of Equation (1.1), where
$|H|$ may or may not be finite. Further, suppose that a feasible point must satisfy at least one
of these systems. Then, for any choice of nonnegative vectors $\lambda^h$, $h \in H$, the inequality

(1.2) $$\left[ \sup_{h \in H} (\lambda^h)^t A^h \right] x \ge \inf_{h \in H} (\lambda^h)^t b^h$$

is a valid disjunctive cut. Furthermore, if every system $S_h$, $h \in H$, is consistent, and if
$|H| < \infty$, then for any valid inequality $\sum_{j=1}^{n} \pi_j x_j \ge \pi_0$, there exist
nonnegative vectors $\lambda^h$, $h \in H$, such that $\pi_0 \le \inf_{h \in H} (\lambda^h)^t b^h$ and,
for $j = 1, \ldots, n$, the $j$th component of $\sup_{h \in H} (\lambda^h)^t A^h$ does not exceed
$\pi_j$.
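
The forward part of the theorem is constructive, and a minimal sketch may make it concrete. The function name and the two one-constraint systems below are illustrative assumptions; exact rational arithmetic (`fractions.Fraction`) is used only to keep the example free of rounding. For each system the multipliers are applied to its rows, and the cut takes the componentwise maximum of the resulting coefficient vectors and the minimum of the resulting right-hand sides, as in (1.2).

```python
from fractions import Fraction

def disjunctive_cut(systems, multipliers):
    """Build the cut (1.2) for a finite disjunction.

    systems     : list of (A, b) with A a list of rows and b a list, meaning A x >= b, x >= 0
    multipliers : list of nonnegative vectors, one per system (one entry per row of A)
    Returns (pi, pi0) such that sum_j pi[j] * x[j] >= pi0 holds on the union of the systems.
    """
    n = len(systems[0][0][0])
    rows, rhs = [], []
    for (A, b), lam in zip(systems, multipliers):
        assert all(l >= 0 for l in lam), "multipliers must be nonnegative"
        rows.append([sum(l * A[i][j] for i, l in enumerate(lam)) for j in range(n)])
        rhs.append(sum(l * bi for l, bi in zip(lam, b)))
    pi = [max(row[j] for row in rows) for j in range(n)]  # pointwise supremum over h
    pi0 = min(rhs)                                        # infimum over h
    return pi, pi0

# Illustrative disjunction: 2 x1 + x2 >= 2  or  x1 + 3 x2 >= 3, with x >= 0.
systems = [([[2, 1]], [2]), ([[1, 3]], [3])]
multipliers = [[Fraction(1, 2)], [Fraction(1, 3)]]
pi, pi0 = disjunctive_cut(systems, multipliers)
print(pi, pi0)  # coefficients and right-hand side of the cut x1 + x2 >= 1
```

Scaling each multiplier so that every right-hand side equals one, as done here, is only one possible choice; other nonnegative multipliers give other valid cuts.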

The forward part of the above theorem was originally proved by Balas [2] and the converse
part by Jeroslow [14]. This theorem has also been independently proved by Glover [10]
in a somewhat different setting. The theorem merely states that given a disjunction
$x \in \bigcup_{h \in H} S_h$, one may generate a valid cut (1.2) by specifying any nonnegative values
for the vectors $\lambda^h$, $h \in H$. The versatility of the latter choice is apparent from the
converse, which asserts that so long as we can identify and delete any inconsistent systems
$S_h$, $h \in H$, then given any valid cut $\pi^t x \ge \pi_0$, we may generate a cut of the type (1.2)
by suitably selecting values for the parameters $\lambda^h$, $h \in H$, such that for any $x$ belonging
to the nonnegative orthant of $R^n$, if (1.2) holds then we must have $\pi^t x \ge \pi_0$. In other
words, we can make a cut of the type (1.2) uniformly dominate any given valid inequality or cut.
Thus, any valid inequality is either a special case of (1.2) or may be strictly dominated by a
cut of type (1.2). In this connection, we draw the reader's attention to the work of Balas [1] in
which several convexity/intersection cuts discussed in the literature are recovered from the
fundamental disjunctive cut. Note that since the inequality (1.2) defines a closed convex set,
then for it to be valid, it must necessarily contain the polyhedral set

(1.3) $$S = \text{convex hull of } \bigcup_{h \in H} S_h.$$

Hence, one may deduce that a desirable deep cut would be a facet of $S$, or at least would
support it. Indeed, Balas [3] has shown how one may generate, with some difficulty, cuts which
contain as a subset the facets of $S$ when $|H| < \infty$. Our approach to developing deep
disjunctive cuts will bear directly on Theorem 1. Specifically, we will be indicating how one may
specify values for parameters $\lambda^h$ to provide supports of $S$, and will discuss some specific
criteria for choosing among supports. We will be devoting our attention to the following two
disjunctions titled DC1 and DC2. We remark that most disjunctive statements can be cast in the
format of DC2. Disjunction DC1 is a special case of disjunction DC2, and is discussed first
because it facilitates our presentation.


DC1:

Suppose that each system $S_h$ is comprised of a single linear inequality, that is, let

(1.4) $S_h = \{x : \sum_{j=1}^{n} a^h_{1j} x_j \ge b^h_1,\ x \ge 0\}$ for $h \in H = \{1, \ldots, \hat h\}$

where we assume that $\hat h = |H| < \infty$ and that each inequality in $S_h$, $h \in H$, is stated with the
origin as the current point at which the disjunctive cut is being generated. Then, the disjunctive
statement DC1 is that at least one of the sets $S_h$, $h \in H$, must be satisfied. Since the current
point (origin) does not satisfy this disjunction, we must have $b^h_1 > 0$ for each $h \in H$. Further,
we will assume, without loss of generality, that for each $h \in H$, $a^h_{1j} > 0$ for some
$j \in \{1, \ldots, n\}$, or else $S_h$ is inconsistent and we may disregard it.


DC2:

Suppose each system $S_h$ is comprised of a set of linear inequalities, that is, let

(1.5) $S_h = \{x : \sum_{j=1}^{n} a^h_{ij} x_j \ge b^h_i \text{ for each } i \in Q_h,\ x \ge 0\}$ for $h \in H = \{1, \ldots, \hat h\}$

where $Q_h$, $h \in H$, are appropriate constraint index sets. Again, we assume that $\hat h = |H| < \infty$
and that the representation in (1.5) is with respect to the current point as the origin. Then, the
disjunctive statement DC2 is that at least one of the sets $S_h$, $h \in H$, must be satisfied. Although
it is not necessary here that $b^h_i > 0$ for all $i \in Q_h$, one may still state a valid disjunction by
deleting all constraints with $b^h_i \le 0$, $i \in Q_h$, from each set $S_h$, $h \in H$. Clearly, a valid cut for
the relaxed constraint set is valid for the original constraint set. We will thus obtain a cut which
is possibly not as strong as one derived from the original constraints. To aid in our development,
we will therefore assume henceforth that $b^h_i > 0$, $i \in Q_h$, $h \in H$.

Before proceeding with our analysis, let us briefly comment on the need for deep cuts.
Although intuitively desirable, it is not always necessary to seek a deepest cut. For example, if
one is using cutting planes to implicitly search a feasible region of discrete points, then all cuts
which delete the same subset of this discrete region may be equally attractive irrespective of
their depth relative to the convex hull of this discrete region. Such a situation arises, for
example, in the work of Majthay and Whinston [16]. On the other hand, if one is confronted with
the problem of iteratively exhausting a feasible region which is not finite, as in [20] for
example, then deep cuts are indeed meaningful and desirable.


In this section, we will lay the foundation for the concepts we propose to use in deriving 
deep cuts. Specifically, we will explore the following two criteria for deriving a deep cut: 

(i) Maximize the euclidean distance between the origin and the nonnegative region 
feasible to the cutting plane 

(ii) Maximize the rectilinear distance between the origin and the nonnegative region 
feasible to the cutting plane. 

Let us briefly discuss the choice of these criteria. Referring to Figure 1(a) and (b), one
may observe that simply attempting to maximize the euclidean distance from the origin to the
cut can favor weaker over strictly stronger cuts. However, since one is only interested in the
subset of the nonnegative orthant feasible to the cuts, the choice of criterion (i) above avoids
such anomalies. Of course, as Figure 1(b) indicates, it is possible for this criterion to be unable
to recognize dominance, and to treat two cuts as alternative optimal cuts even though one cut
dominates the other.

Let us now proceed to characterize the euclidean distance from the origin to the nonnegative
region feasible to a cut

(2.1) $\sum_{j=1}^{n} z_j x_j \ge z_0$, where $z_0 > 0$, $z_j > 0$ for some $j \in \{1, \ldots, n\}$.

The required distance is clearly given by

(2.2) $\theta_e = \text{minimum } \{\|x\| : \sum_{j=1}^{n} z_j x_j \ge z_0,\ x \ge 0\}$.


Consider the following result. 

LEMMA 1: Let $\theta_e$ be defined by Equations (2.1) and (2.2). Then

(2.3) $\theta_e = z_0 / \|y\|$

where

(2.4) $y = (y_1, \ldots, y_n)$, $y_j = \text{maximum } \{0, z_j\}$, $j = 1, \ldots, n$.

PROOF: Note that the solution $x^* = (z_0 / \|y\|^2)\, y$ is feasible to the problem in (2.2) with
$\|x^*\| = z_0 / \|y\|$. Moreover, for any $x$ feasible to (2.2), we have
$z_0 \le \sum_{j=1}^{n} z_j x_j \le \sum_{j=1}^{n} y_j x_j \le \|y\| \, \|x\|$, or that $\|x\| \ge z_0 / \|y\|$.
This completes the proof.

Now, let us consider the second criterion. The motivation for this criterion is similar to
that for the first criterion and moreover, as we shall see below, the use of this criterion has
intuitive appeal. First of all, given a cut (2.1), let us characterize the rectilinear distance from
the origin to the nonnegative region feasible to this cut. This distance is given by

(2.5) $\theta_r = \text{minimum } \{|x| : \sum_{j=1}^{n} z_j x_j \ge z_0,\ x \ge 0\}$, where $|x| = \sum_{j=1}^{n} |x_j|$.

Consider the following result.


[Figure 1. Recognition of dominance. Panel annotations: "Criterion values"; "Criterion value for either cut".]

LEMMA 2: Let $\theta_r$ be defined by Equations (2.1) and (2.5). Then,

(2.6) $\theta_r = z_0 / z_m$, where $z_m = \text{maximum}_{j = 1, \ldots, n}\ z_j$.

PROOF: Note that the solution $x^* = (0, \ldots, z_0 / z_m, \ldots, 0)$, with the $m$th component
being non-zero, is feasible to the problem in (2.5) with $|x^*| = z_0 / z_m$. Moreover, for any $x$
feasible to (2.5), we have

$z_0 / z_m \le \sum_{j=1}^{n} (z_j / z_m)\, x_j \le \sum_{j=1}^{n} x_j = |x|$.

This completes the proof.

Note from Equation (2.6) that the objective of maximizing $\theta_r$ is equivalent to finding a
cut which maximizes the smallest positive intercept made on any axis. Hence, the intuitive
appeal of this criterion.
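
The two criterion values of Lemmas 1 and 2 are simple to evaluate for a given cut. The following is a minimal sketch, assuming the cut is supplied as a coefficient list `z` and a right-hand side `z0` satisfying (2.1); the function names are hypothetical and chosen only for illustration.

```python
import math

def euclidean_depth(z, z0):
    """theta_e of Lemma 1: z0 / ||y|| with y_j = max(0, z_j)."""
    y = [max(0.0, zj) for zj in z]
    return z0 / math.sqrt(sum(v * v for v in y))

def rectilinear_depth(z, z0):
    """theta_r of Lemma 2: z0 / z_m with z_m = max_j z_j (assumed positive)."""
    return z0 / max(z)

# Illustrative cut x1 + 2*x2 - 2*x3 >= 1 (an assumed instance).
cut = [1.0, 2.0, -2.0]
theta_e = euclidean_depth(cut, 1.0)
theta_r = rectilinear_depth(cut, 1.0)
```

Negative coefficients play no role in either value, which is why both criteria may fail to distinguish two cuts that differ only in such coefficients.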


It is very encouraging to note that for the disjunction DC1 we are able to derive a cut
which not only simultaneously satisfies both the criteria of Section 2, but which is also a facet
of the set $S$ of Equation (1.3). This is a powerful statement since all valid inequalities are given
through (1.2) and none of these can strictly dominate a facet of $S$.


We will find it more convenient to state our results if we normalize the linear inequalities
(1.4) by dividing through by their respective, positive, right-hand sides. Hence, let us assume
without loss of generality that

(3.1) $S_h = \{x : \sum_{j=1}^{n} a^h_{1j} x_j \ge 1,\ x \ge 0\}$ for $h \in H = \{1, \ldots, \hat h\}$.

Then the application of Theorem 1 to the disjunction DC1 yields valid cuts of the form:

(3.2) $\sum_{j=1}^{n} \left[ \max_{h \in H} \lambda_1^h a^h_{1j} \right] x_j \ge \min_{h \in H} \{\lambda_1^h\}$

where $\lambda_1^h$, $h \in H$, are nonnegative scalars. Again, there is no loss of generality in assuming that

(3.3) $\sum_{h \in H} \lambda_1^h = 1$, $\lambda_1^h \ge 0$, $h \in H = \{1, \ldots, \hat h\}$

since we will not allow all $\lambda_1^h$, $h \in H$, to be zero. This is equivalent to normalizing (3.2) by
dividing through by $\sum_{h \in H} \lambda_1^h$.

Theorem 2 below derives two cuts of the type (3.2), both of which simultaneously
achieve the two criteria of the foregoing section. However, the second cut uniformly dominates
the first cut. In fact, no cut can strictly dominate the second cut since it is shown to be a facet
of $S$ defined by (1.3).

THEOREM 2: Consider the disjunctive statement DC1 where $S_h$ is defined by (3.1) and is
assumed to be consistent for each $h \in H$. Then the following results hold:

(a) Both the criteria of Section 2 are satisfied by letting $\lambda_1^h = \lambda_1^{h*}$, where

(3.4) $\lambda_1^{h*} = 1/\hat h$ for $h \in H$

in inequality (3.2) to obtain the cut

(3.5) $\sum_{j=1}^{n} a^*_{1j} x_j \ge 1$, where $a^*_{1j} = \max_{h \in H} a^h_{1j}$, for $j = 1, \ldots, n$.

(b) Further, defining

(3.6) $\gamma_1^h = \text{minimum}_{j:\, a^h_{1j} > 0} \{a^*_{1j} / a^h_{1j}\} > 0$, $h \in H$

and letting $\lambda_1^h = \lambda_1^{h**}$, where

(3.7) $\lambda_1^{h**} = \gamma_1^h / \sum_{p \in H} \gamma_1^p$ for $h \in H$

in inequality (3.2), we obtain a cut of the form

(3.8) $\sum_{j=1}^{n} a^{**}_{1j} x_j \ge 1$, where $a^{**}_{1j} = \max_{h \in H} a^h_{1j} \gamma_1^h$, for $j = 1, \ldots, n$

which again satisfies both the criteria of Section 2.

(c) The cut (3.8) uniformly dominates the cut (3.5); in fact,

(3.9) $a^{**}_{1j} = a^*_{1j}$ if $a^*_{1j} > 0$, and $a^{**}_{1j} \le a^*_{1j}$ if $a^*_{1j} \le 0$, for $j = 1, \ldots, n$.

(d) The cut (3.8) is a facet of the set $S$ of Equation (1.3).


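PROOF:

(a) Clearly, $\lambda_1^h = 1/\hat h$, $h \in H$, leads to the cut (3.5) from (3.2). Now consider the
euclidean distance criterion of maximizing $\theta_e$ (or $\theta_e^2$) of Equation (2.3). For cut (3.5), the
value $\theta_e^*$ of $\theta_e$ is given by

(3.10) $(\theta_e^*)^2 = 1 / \sum_{j=1}^{n} (y_j^*)^2$, where $y_j^* = \max \{0, a^*_{1j}\}$, $j = 1, \ldots, n$.

Now, for any choice $\lambda_1^h$, $h \in H$,

(3.11) $\theta_e^2 = \left[ \min_{h \in H} \lambda_1^h \right]^2 / \sum_{j=1}^{n} y_j^2 = (\lambda_1^p)^2 / \sum_{j=1}^{n} y_j^2$, say,

where $y_j = \max \{0, \max_{h \in H} \lambda_1^h a^h_{1j}\}$. If $\lambda_1^p = 0$, then $\theta_e = 0$ and, noting (3.10), such a
choice of parameters $\lambda_1^h$, $h \in H$, is suboptimal. Hence, $\lambda_1^p > 0$, whence (3.11) becomes
$\theta_e^2 = 1 / \sum_{j=1}^{n} (y_j / \lambda_1^p)^2$. But since $(\lambda_1^h / \lambda_1^p) \ge 1$ for each $h \in H$, we get

$y_j / \lambda_1^p = \max \{0, \max_{h \in H} (\lambda_1^h / \lambda_1^p)\, a^h_{1j}\} \ge \max \{0, \max_{h \in H} a^h_{1j}\} = y_j^*$.

Thus $\theta_e^2 \le (\theta_e^*)^2$, so that the first criterion is satisfied.

Now consider the maximization of $\theta_r$ of Equation (2.5), or equivalently Equation (2.6).
For the choice (3.4), the value $\theta_r^*$ of $\theta_r$ is given by

(3.12) $\theta_r^* = 1 / \max_{j} a^*_{1j} > 0$.

Now, for any choice $\lambda_1^h$, $h \in H$, from Equations (2.6), (3.2) we get

$\theta_r = \left[ \min_{h \in H} \lambda_1^h \right] / \left[ \max_{j} \max_{h \in H} \lambda_1^h a^h_{1j} \right] = \lambda_1^p / \max_{j} \max_{h \in H} \lambda_1^h a^h_{1j}$, say.

As before, $\lambda_1^p = 0$ implies a value of $\theta_r$ inferior to $\theta_r^*$. Thus, assume $\lambda_1^p > 0$. Then,
$\theta_r = 1 / \max_{j} \max_{h \in H} (\lambda_1^h / \lambda_1^p)\, a^h_{1j}$. But $(\lambda_1^h / \lambda_1^p) \ge 1$ for each $h \in H$ and, in evaluating
$\theta_r$, we are interested only in those $j \in \{1, \ldots, n\}$ for which $a^h_{1j} > 0$ for some $h \in H$. Thus,
$\theta_r \le 1 / \max_{j} \max_{h \in H} a^h_{1j} = \theta_r^*$, so that the second criterion is also satisfied. This proves part (a).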

(b) and (c). First of all, let us consider the values taken by $\gamma_1^h$, $h \in H$. Note from the
assumption of consistency that $\gamma_1^h$, $h \in H$, are well defined. From (3.5), (3.6), we must have
$\gamma_1^h \ge 1$ for each $h \in H$. Moreover, if we define from (3.5)

(3.13) $H^* = \{h \in H : a^h_{1k} = a^*_{1k} > 0 \text{ for some } k \in \{1, \ldots, n\}\}$

then clearly $H^* \ne \emptyset$ and for $h \in H^*$, Equation (3.6) implies $\gamma_1^h \le 1$. Thus,

(3.14) $\gamma_1^h = 1$ for $h \in H^*$, and $\gamma_1^h > 1$ for $h \notin H^*$.

Hence,

(3.15) $\min_{h \in H} \gamma_1^h = 1$

or that, using (3.7) in (3.2) yields a cut of the type (3.8), where

(3.16) $a^{**}_{1j} = \max_{h \in H} a^h_{1j} \gamma_1^h$, $j = 1, \ldots, n$.

Now, let us establish relationship (3.9). Note from (3.5) that if $a^*_{1j} \le 0$, then $a^h_{1j} \le 0$
for each $h \in H$ and hence, using (3.14), (3.16), we get that (3.9) holds. Next, consider $a^*_{1j} > 0$
for some $j \in \{1, \ldots, n\}$. From (3.13), (3.14), (3.16), we get

(3.17) $a^{**}_{1j} = \max \{ \max_{h \in H^*} a^h_{1j},\ \max_{h \notin H^*:\, a^h_{1j} > 0} a^h_{1j} \gamma_1^h \}$

where we have not considered $h \notin H^*$ with $a^h_{1j} \le 0$ since $a^{**}_{1j} > 0$. But for $h \notin H^*$ with
$a^h_{1j} > 0$, we get from (3.5), (3.6)

(3.18) $\gamma_1^h a^h_{1j} = a^h_{1j} \min_{k:\, a^h_{1k} > 0} \{a^*_{1k} / a^h_{1k}\} \le a^h_{1j} (a^*_{1j} / a^h_{1j}) = a^*_{1j} = \max_{h \in H} a^h_{1j}$.

Using (3.18) in (3.17) yields $a^{**}_{1j} = a^*_{1j}$, which establishes (3.9).

Finally, we show that (3.8) satisfies both the criteria of Section 2. This part follows
immediately from (3.9) by noting that the cut (3.5) yields $\theta_e = \theta_e^*$ of (3.10) and $\theta_r = \theta_r^*$ of
(3.12). This completes the proofs of parts (b) and (c).

(d) Note that since (3.8) is valid, any $x \in S$ satisfies (3.8). Hence, in order to show that
(3.8) defines a facet of $S$, it is sufficient to identify $n$ affinely independent points of $S$ which
satisfy (3.8) as an equality, since clearly, dim $S = n$. Define

(3.19) $J_1 = \{j \in \{1, \ldots, n\} : a^{**}_{1j} > 0\}$ and let $J_2 = \{1, \ldots, n\} - J_1$.

Consider any $p \in J_1$, and let

(3.20) $e_p = (0, \ldots, 1/a^{**}_{1p}, \ldots, 0)$, $p \in J_1$

have the non-zero term in the $p$th position. Now, since $p \in J_1$, (3.9) yields

$a^{**}_{1p} = a^*_{1p} = \max_{h \in H} a^h_{1p} = a^{h_p}_{1p}$, say.

Hence, $e_p \in S_{h_p}$ and so, $e_p \in S$ and moreover, $e_p$ satisfies (3.8) as an equality. Thus, $e_p$, $p \in J_1$,
qualify as $|J_1|$ of the $n$ affinely independent points we are seeking.

Now consider a $q \in J_2$. Let us show that there exists an $S_{h_q}$ satisfying

(3.21) $\gamma_1^{h_q} a^{h_q}_{1p} = a^{**}_{1p}$ for some $p \in J_1$, and $\gamma_1^{h_q} a^{h_q}_{1q} = a^{**}_{1q}$.

From Equation (3.16), we get $a^{**}_{1q} = \max_{h \in H} a^h_{1q} \gamma_1^h = a^{h_q}_{1q} \gamma_1^{h_q}$, say. Then for this $h_q \in H$,
Equation (3.6) yields $\gamma_1^{h_q} = \text{minimum}_{j:\, a^{h_q}_{1j} > 0} \{a^*_{1j} / a^{h_q}_{1j}\} = a^*_{1p} / a^{h_q}_{1p}$, say. Or, using (3.9),
$\gamma_1^{h_q} a^{h_q}_{1p} = a^*_{1p} = a^{**}_{1p} > 0$. Thus (3.21) holds. For convenience, let us rewrite the set $S_{h_q}$ below as

(3.22) $S_{h_q} = \{x : a^{h_q}_{1p} x_p + a^{h_q}_{1q} x_q + \sum_{j \ne p, q} a^{h_q}_{1j} x_j \ge 1,\ x \ge 0\}$.

Now, consider the direction

(3.23) $d_q = (0, \ldots, 1/a^{**}_{1p}, \ldots, -1/a^{**}_{1q}, \ldots, 0)$ if $a^{**}_{1q} < 0$, and
       $d_q = (0, \ldots, 0, \ldots, \Delta, \ldots, 0)$ if $a^{**}_{1q} = 0$

where $\Delta > 0$, the non-zero terms are at positions $p$ and $q$ in the first case, and the term $\Delta$ is at
position $q$ in the second case. Let us show that $d_q$ is a direction for $S_{h_q}$. Clearly, if $a^{**}_{1q} = 0$,
then from (3.21) $a^{h_q}_{1q} = 0$ and thus, from (3.22), $d_q$ of (3.23) is a direction for $S_{h_q}$. Further, if
$a^{**}_{1q} < 0$, then one may easily verify from (3.21), (3.22), (3.23) that

$\hat e_p = (0, \ldots, \gamma_1^{h_q} / a^{**}_{1p}, \ldots, 0) \in S_{h_q}$ and $\hat e_p + \delta [\gamma_1^{h_q} d_q] \in S_{h_q}$ for each $\delta \ge 0$

where $\hat e_p$ has the non-zero term at position $p$. Thus, $d_q$ is a direction for $S_{h_q}$. It can be easily
shown that this implies $d_q$ is a direction for $S$. Since $e_p = (0, \ldots, 1/a^{**}_{1p}, \ldots, 0)$ of Equation
(3.20) belongs to $S$, then so does $(e_p + d_q)$. But $(e_p + d_q)$ clearly satisfies (3.8) as an equality.
Hence, we have identified $n$ points of $S$, which satisfy the cut (3.8) as an equality, of the type

$e_p = (0, \ldots, 1/a^{**}_{1p}, \ldots, 0)$ for $p \in J_1$

$e_q = d_q + e_p$ for some $p \in J_1$, for each $q \in J_2$

where $d_q$ is given by (3.23). Since these $n$ points are clearly affinely independent, this
completes the proof.

It is interesting to note that the cut (3.5) has been derived by Balas [2] and by Glover [9,
Theorem 1]. Further, the cut (3.8) is precisely the strengthened negative edge extension cut of
Glover [9, Theorem 2]. The effect of replacing $\lambda_1^{h*}$ defined in (3.4) by $\lambda_1^{h**}$ defined in (3.7) is
equivalent to the translation of certain hyperplanes in Glover's theorem. We have hence
shown through Theorem 2 how the latter cut may be derived in the context of disjunctive
programming, and be shown to be a facet of the convex hull of feasible points. Further, both
(3.5) and (3.8) have been shown to be alternative optima to the two criteria of Section 2.

In generalizing this to disjunction DC2, we find that such an ideal situation no longer 
exists. Nevertheless, we are able to obtain some useful results. But before proceeding to DC2, 
let us illustrate the above concepts through an example. 


EXAMPLE: Let $H = \{1, 2\}$, $n = 3$ and let DC1 be formulated through the sets

$S_1 = \{x : x_1 + 2x_2 - 4x_3 \ge 1,\ x \ge 0\}$, $S_2 = \{x : x_1/2 + x_2/3 - 2x_3 \ge 1,\ x \ge 0\}$.

The cut (3.5), i.e., $\sum_j a^*_{1j} x_j \ge 1$, is $x_1 + 2x_2 - 2x_3 \ge 1$. From (3.6),

$\gamma_1^1 = \min \{1/1,\ 2/2\} = 1$ and $\gamma_1^2 = \min \{1/(1/2),\ 2/(1/3)\} = 2$.

Thus, through (3.7), or more directly, from (3.16), the cut (3.8), i.e., $\sum_j a^{**}_{1j} x_j \ge 1$, is
$x_1 + 2x_2 - 4x_3 \ge 1$. This cut strictly dominates the cut (3.5) in this example, though both
have the same values $1/\sqrt{5}$ and $1/2$ respectively for $\theta_e$ and $\theta_r$ of Equations (2.2) and (2.5).
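
The computations of this example can be reproduced mechanically from (3.5), (3.6) and (3.8). The sketch below assumes the normalized coefficients $a^h_{1j}$ are stored row by row in a list `A`; the function name `dc1_cuts` is hypothetical, and exact rational arithmetic is used only to avoid rounding.

```python
from fractions import Fraction as F

def dc1_cuts(A):
    """A[h][j] holds a^h_{1j}; every row is normalized to right-hand side 1."""
    n = len(A[0])
    # (3.5): coefficientwise maximum over the disjuncts.
    a_star = [max(row[j] for row in A) for j in range(n)]
    # (3.6): gamma^h = min over j with a^h_{1j} > 0 of a*_{1j} / a^h_{1j}.
    gamma = [min(a_star[j] / row[j] for j in range(n) if row[j] > 0)
             for row in A]
    # (3.8): coefficientwise maximum of the rows scaled by gamma^h.
    a_2star = [max(g * row[j] for g, row in zip(gamma, A)) for j in range(n)]
    return a_star, gamma, a_2star

# Data of the example: S_1 and S_2.
A = [[F(1), F(2), F(-4)],
     [F(1, 2), F(1, 3), F(-2)]]
a_star, gamma, a_2star = dc1_cuts(A)
```

Here `a_star` gives the cut (3.5) and `a_2star` the facet cut (3.8); the two differ only in the coefficient of $x_3$, as (3.9) requires.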


To begin with, let us make the following interesting observation. Suppose that for
convenience, we assume without loss of generality as before, that $b^h_i = 1$, $i \in Q_h$, $h \in H$, in
Equation (1.5). Thus, for each $h \in H$, we have the constraint set

(4.1) $S_h = \{x : \sum_{j=1}^{n} a^h_{ij} x_j \ge 1,\ i \in Q_h,\ x \ge 0\}$.

Now for each $h \in H$, let us multiply the constraints of $S_h$ by corresponding scalars $\delta_i^h \ge 0$,
$i \in Q_h$, and add them up to obtain the surrogate constraint

(4.2) $\sum_{j=1}^{n} \left[ \sum_{i \in Q_h} \delta_i^h a^h_{ij} \right] x_j \ge \sum_{i \in Q_h} \delta_i^h$, $h \in H$.

Further, assuming that not all $\delta_i^h$ are zero for $i \in Q_h$, (4.2) may be re-written as

(4.3) $\sum_{j=1}^{n} \left[ \sum_{i \in Q_h} \delta_i^h a^h_{ij} / \sum_{p \in Q_h} \delta_p^h \right] x_j \ge 1$, $h \in H$.

Finally, denoting $\delta_i^h / \sum_{p \in Q_h} \delta_p^h$ by $\lambda_i^h$ for $i \in Q_h$, $h \in H$, we may write (4.3) as

(4.4) $\sum_{j=1}^{n} \left[ \sum_{i \in Q_h} \lambda_i^h a^h_{ij} \right] x_j \ge 1$ for each $h \in H$

where

(4.5) $\sum_{i \in Q_h} \lambda_i^h = 1$ for each $h \in H$, $\lambda_i^h \ge 0$ for $i \in Q_h$, $h \in H$.

Observe that by surrogating the constraints of (4.1) using parameters $\lambda_i^h$, $i \in Q_h$, $h \in H$,
satisfying (4.5), we have essentially represented DC2 as DC1 through (4.4). In other words, since
$x \in S_h$ implies $x$ satisfies (4.4) for each $h \in H$, then given $\lambda_i^h$, $i \in Q_h$, $h \in H$, DC2 implies
that at least one of (4.4) must be satisfied. Now, whereas Theorem 1 would directly employ (4.2)
to derive a cut, since we have normalized (4.2) to obtain (4.4), we know from the previous section
that the optimal strategy is to derive a cut (3.8) using inequalities (4.4).

Now let us consider in turn the two criteria of Section 2. 



4.1. Euclidean Distance-Based Criterion 

Consider any selection of values for the parameters $\lambda_i^h$, $i \in Q_h$, $h \in H$, satisfying (4.5) and
let the corresponding disjunction DC1 derived from DC2 be that at least one of (4.4) must
hold. Then, Theorem 2 tells us through Equations (3.5), (3.10) that the euclidean distance
criterion value for the resulting cut (3.8) is

(4.6) $\theta_e(\lambda) = 1 / \left[ \sum_{j=1}^{n} y_j^2 \right]^{1/2}$

where

(4.7) $y_j = \max \{0, z_j\}$, $j = 1, \ldots, n$

and

(4.8) $z_j = \max_{h \in H} \sum_{i \in Q_h} \lambda_i^h a^h_{ij}$, $j = 1, \ldots, n$.

Thus, the criterion of Section 2 seeks to

(4.9) maximize $\{\theta_e(\lambda) : \lambda = (\lambda_i^h) \text{ satisfies } (4.5)\}$

or equivalently, to

(4.10) minimize $\{\sum_{j=1}^{n} y_j^2 : (4.5), (4.7), (4.8) \text{ are satisfied}\}$.

It may be easily verified that the problem of (4.10) may be written as

PD$_2$:

(4.11) minimize $\sum_{j=1}^{n} y_j^2$

(4.12) subject to $y_j \ge \sum_{i \in Q_h} \lambda_i^h a^h_{ij}$ for each $h \in H$, for each $j = 1, \ldots, n$

(4.13) $\sum_{i \in Q_h} \lambda_i^h = 1$ for each $h \in H$

(4.14) $\lambda_i^h \ge 0$, $i \in Q_h$, $h \in H$.

Note that we have deleted the constraints $y_j \ge 0$, $j = 1, \ldots, n$, since for any feasible $\lambda_i^h$,
$i \in Q_h$, $h \in H$, there exists a dominant solution with nonnegative $y_j$, $j = 1, \ldots, n$. This
relaxation is simply a matter of convenience in our solution strategy.

Before proposing a solution procedure for Problem PD$_2$, let us make some pertinent
remarks. Note that Problem PD$_2$ has the purpose of generating parameters $\lambda_i^h$, $i \in Q_h$, $h \in H$,
which are to be used to obtain the surrogate constraints (4.4). Thereafter, the cut that we
derive for the disjunction DC2 is the cut (3.8) obtained from the statement that at least one of
(4.4) must hold. Hence, Problem PD$_2$ attempts to find values for $\lambda_i^h$, $i \in Q_h$, $h \in H$, such that
this resulting cut achieves the euclidean distance criterion.

Problem PD$_2$ is a convex quadratic program for which the Kuhn-Tucker conditions are
both necessary and sufficient. Several efficient simplex-based quadratic programming
procedures are available to solve such a problem. However, these procedures require explicit
handling of the potentially large number of constraints in Problem PD$_2$. On the other hand, the
subgradient optimization procedure discussed below takes full advantage of the problem
structure. We are first able to write out an almost complete solution to the Kuhn-Tucker system.
We will refer to this as a partial solution. In case we are unable to either actually construct a
complete solution or to assert that a feasible completion exists, then through the construction
procedure itself, we have a subgradient direction available. Moreover, this latter direction is
very likely to be a direction of ascent. We therefore propose to move in the negative of this
direction and, if necessary, project back onto the feasible region. These iterative steps are now
repeated at this new point.

4.1.1 Kuhn-Tucker System for PD$_2$ and Its Implications

Letting $u_j^h$, $h \in H$, $j = 1, \ldots, n$, denote the lagrangian multipliers for constraints (4.12),
$t_h$, $h \in H$, those for constraints (4.13), and $w_i^h$, $i \in Q_h$, $h \in H$, those for constraints (4.14), we
may write the Kuhn-Tucker optimality conditions as

(4.15) $\sum_{h \in H} u_j^h = 2 y_j$, $j = 1, \ldots, n$

(4.16) $\sum_{j=1}^{n} u_j^h a^h_{ij} + t_h - w_i^h = 0$ for each $i \in Q_h$, and for each $h \in H$

(4.17) $u_j^h \left[ y_j - \sum_{i \in Q_h} \lambda_i^h a^h_{ij} \right] = 0$ for each $j = 1, \ldots, n$ and each $h \in H$

(4.18) $\lambda_i^h w_i^h = 0$ for $i \in Q_h$, $h \in H$

(4.19) $w_i^h \ge 0$, $i \in Q_h$, $h \in H$

(4.20) $u_j^h \ge 0$, $j = 1, \ldots, n$, $h \in H$.

Finally, Equations (4.12), (4.13), (4.14) must also hold. We will now consider the implications
of the above conditions. This will enable us to construct at least a partial solution to these
conditions, given particular values of $\lambda_i^h$, $i \in Q_h$, $h \in H$. First of all, note that Equations (4.7),
(4.10) and (4.20) imply that

(4.21) $y_j \ge 0$ for each $j = 1, \ldots, n$

(4.22) $y_j = \max \{0, \sum_{i \in Q_h} \lambda_i^h a^h_{ij},\ h \in H\}$ for $j = 1, \ldots, n$.

Now, having determined values for $y_j$, $j = 1, \ldots, n$, let us define the sets

(4.23) $H_j = \emptyset$ if $y_j = 0$, and $H_j = \{h \in H : y_j = \sum_{i \in Q_h} \lambda_i^h a^h_{ij} > 0\}$ otherwise, for $j = 1, \ldots, n$.

Now, consider the determination of $u_j^h$, $h \in H$, $j = 1, \ldots, n$. Clearly, Equations (4.15), (4.17)
and (4.20) along with the definition (4.23) imply that for each $j = 1, \ldots, n$

(4.24) $u_j^h = 0$ for $h \in H / H_j$ and that $\sum_{h \in H_j} u_j^h = 2 y_j$, $u_j^h \ge 0$ for each $h \in H_j$.

Thus, for any $j \in \{1, \ldots, n\}$, if $H_j$ is either empty or a singleton, the corresponding values for
$u_j^h$, $h \in H$, are uniquely determined. Hence, we have a choice in selecting values for $u_j^h$, $h \in H_j$,
only when $|H_j| \ge 2$ for any $j \in \{1, \ldots, n\}$. Next, multiplying (4.16) by $\lambda_i^h$ and using (4.18),
we obtain

(4.25) $\sum_{i \in Q_h} \sum_{j=1}^{n} u_j^h \lambda_i^h a^h_{ij} + t_h \sum_{i \in Q_h} \lambda_i^h = 0$ for each $h \in H$.

Using Equations (4.13), (4.17), this gives us

(4.26) $t_h = - \sum_{j=1}^{n} u_j^h y_j$ for each $h \in H$.

Finally, Equations (4.16), (4.26) yield

(4.27) $w_i^h = \sum_{j=1}^{n} u_j^h \left[ a^h_{ij} - y_j \right]$ for each $i \in Q_h$, $h \in H$.

Notice that once the variables $u_j^h$, $h \in H$, $j = 1, \ldots, n$, are fixed to satisfy (4.24), all the
variables are uniquely determined. We now show that if the variables $w_i^h$, $i \in Q_h$, $h \in H$, so
determined are nonnegative, we then have a Kuhn-Tucker solution. Since the objective function of
PD$_2$ is convex and the constraints are linear, this solution is also optimal.

LEMMA 2: Let a primal feasible set of $\lambda_i^h$, $i \in Q_h$, $h \in H$, be given. Determine values for
all variables $y_j$, $u_j^h$, $t_h$, $w_i^h$ using Equations (4.22) through (4.27), selecting an arbitrary solution
in the case described in Equation (4.24) if $|H_j| \ge 2$. If $w_i^h \ge 0$, $i \in Q_h$, $h \in H$, then $\lambda_i^h$, $i \in Q_h$,
$h \in H$, solves Problem PD$_2$.

PROOF: By construction, Equations (4.12) through (4.17), and (4.20) clearly hold.
Thus, noting that in our problem the Kuhn-Tucker conditions are sufficient for optimality, all
we need to show is that if $w = (w_i^h) \ge 0$ then (4.18) holds. But from (4.17) and (4.27), for
any $h \in H$, we have, using (4.13),

$\sum_{i \in Q_h} \lambda_i^h w_i^h = \sum_{i \in Q_h} \lambda_i^h \sum_{j=1}^{n} u_j^h \left[ a^h_{ij} - y_j \right] = \sum_{j=1}^{n} u_j^h \left[ \sum_{i \in Q_h} \lambda_i^h a^h_{ij} - y_j \right] = 0$

for each $h \in H$. Thus, $\lambda_i^h \ge 0$, $w_i^h \ge 0$, $i \in Q_h$, $h \in H$, imply that (4.18) holds and the proof is
complete.

The reader may note that in Section 4.1.4 we will propose another, stronger sufficient
condition for a set of variables $\lambda_i^h$, $i \in Q_h$, $h \in H$, to be optimal. The development of this condition
is based on a subgradient optimization procedure discussed below.

4.1.2 Subgradient Optimization Scheme for Problem PD$_2$

For the purpose of this development, let us use (4.22) to rewrite Problem PD$_2$ as follows.
First of all define

(4.28) $\Lambda = \{\lambda = (\lambda_i^h) : \text{constraints (4.13) and (4.14) are satisfied}\}$

and let $f: \Lambda \to R$ be defined by

(4.29) $f(\lambda) = \sum_{j=1}^{n} \left[ \max \{0, \sum_{i \in Q_h} \lambda_i^h a^h_{ij},\ h \in H\} \right]^2$.

Then, Problem PD$_2$ may be written as

minimize $\{f(\lambda) : \lambda \in \Lambda\}$.

Note that for each $j = 1, \ldots, n$, $g_j(\lambda) = \max \{0, \sum_{i \in Q_h} \lambda_i^h a^h_{ij},\ h \in H\}$ is convex and nonnegative.
Thus, $[g_j(\lambda)]^2$ is convex and so $f(\lambda) = \sum_{j=1}^{n} [g_j(\lambda)]^2$ is also convex.
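
To make the objective of (4.29) concrete, the following minimal sketch evaluates $f(\lambda)$ for a small DC2 instance. The nested-list layout (`A[h][i][j]` for $a^h_{ij}$, `lam[h][i]` for $\lambda_i^h$), the function names, and the numerical data are assumptions made for illustration only.

```python
def surrogate_rows(A, lam):
    """Coefficients of the surrogate constraints (4.4), one row per h."""
    return [[sum(l * row[j] for l, row in zip(lam_h, A_h))
             for j in range(len(A_h[0]))]
            for A_h, lam_h in zip(A, lam)]

def f(A, lam):
    """Objective (4.29): sum over j of max{0, max_h surrogate coefficient}^2."""
    rows = surrogate_rows(A, lam)
    n = len(rows[0])
    y = [max(0.0, max(r[j] for r in rows)) for j in range(n)]
    return sum(v * v for v in y)

# Assumed toy instance: S_1 has two constraints, S_2 has one (right-hand sides 1).
A = [[[1.0, 2.0], [3.0, -1.0]],
     [[2.0, 1.0]]]
lam = [[0.5, 0.5], [1.0]]
value = f(A, lam)   # surrogate rows are [2.0, 0.5] and [2.0, 1.0]
```

By (4.6), the euclidean criterion value of the resulting cut is then `1 / value ** 0.5`.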


The main thrust of the proposed algorithm is as follows. Having a solution $\bar\lambda$ at any stage,
we will attempt to construct a solution to the Kuhn-Tucker system using Equations (4.15)
through (4.20). If we obtain nonnegative values $\bar w_i^h$ for the corresponding variables $w_i^h$, $i \in Q_h$,
$h \in H$, then by Lemma 2 above, we terminate. Later in Section 4.1.7, we will also use another
sufficient condition to check for termination. If we obtain no indication of optimality, we
continue. Theorem 3 below establishes that in any case, the vector $w = \bar w$ constitutes a
subgradient of $f(\cdot)$ at the current point $\bar\lambda$. Following Poljak [18,19], we hence take a suitable
step in the negative subgradient direction and project back onto the feasible region $\Lambda$ of Equation
(4.28). This completes one iteration. Before presenting Theorem 3, consider the following
definition.

DEFINITION 1: Let $f: \Lambda \to R$ be a convex function and let $\bar\lambda \in \Lambda \subseteq R^m$. Then $\xi \in R^m$
is a subgradient of $f(\cdot)$ at $\bar\lambda$ if

$f(\lambda) \ge f(\bar\lambda) + \xi' (\lambda - \bar\lambda)$ for each $\lambda \in \Lambda$.

THEOREM 3: Let $\bar\lambda$ be a given point in $\Lambda$ defined by (4.28) and let $\bar w$ be obtained from
Equations (4.22) through (4.27), with an arbitrary selection of a solution to (4.24).
Then, $\bar w$ is a subgradient of $f(\cdot)$ at $\bar\lambda$, where $f: \Lambda \to R$ is defined in Equation (4.29).

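PROOF: Let $\bar y$ and $\hat y$ be obtained through Equation (4.22) from $\bar\lambda \in \Lambda$ and $\hat\lambda \in \Lambda$
respectively. Hence,

$f(\bar\lambda) = \sum_{j=1}^{n} \bar y_j^2$ and $f(\hat\lambda) = \sum_{j=1}^{n} \hat y_j^2$.

Thus, from Definition 1, we need to show that

(4.30) $\sum_{h \in H} \sum_{i \in Q_h} \bar w_i^h (\hat\lambda_i^h - \bar\lambda_i^h) \le \sum_{j=1}^{n} \hat y_j^2 - \sum_{j=1}^{n} \bar y_j^2$.

Noting from Equations (4.17), (4.27) that $\sum_{h \in H} \sum_{i \in Q_h} \bar w_i^h \bar\lambda_i^h = 0$, we have,

$\sum_{h \in H} \sum_{i \in Q_h} \bar w_i^h (\hat\lambda_i^h - \bar\lambda_i^h) = \sum_{h \in H} \sum_{i \in Q_h} \bar w_i^h \hat\lambda_i^h = \sum_{h \in H} \sum_{i \in Q_h} \sum_{j=1}^{n} \bar u_j^h (a^h_{ij} - \bar y_j) \hat\lambda_i^h$

$= \sum_{h \in H} \sum_{j=1}^{n} \bar u_j^h \left[ \sum_{i \in Q_h} \hat\lambda_i^h a^h_{ij} \right] - \sum_{h \in H} \sum_{j=1}^{n} \bar u_j^h \bar y_j \sum_{i \in Q_h} \hat\lambda_i^h$.

Using (4.13) and (4.15), this yields

$\sum_{h \in H} \sum_{i \in Q_h} \bar w_i^h (\hat\lambda_i^h - \bar\lambda_i^h) = \sum_{h \in H} \sum_{j=1}^{n} \bar u_j^h \left[ \sum_{i \in Q_h} \hat\lambda_i^h a^h_{ij} \right] - 2 \sum_{j=1}^{n} \bar y_j^2$.

Combining this with (4.30), we need to show that

(4.31) $\sum_{h \in H} \sum_{j=1}^{n} \bar u_j^h \left[ \sum_{i \in Q_h} \hat\lambda_i^h a^h_{ij} \right] \le \sum_{j=1}^{n} \hat y_j^2 + \sum_{j=1}^{n} \bar y_j^2$.

But Equations (4.15), (4.20), (4.22) imply that

$\sum_{h \in H} \sum_{j=1}^{n} \bar u_j^h \left[ \sum_{i \in Q_h} \hat\lambda_i^h a^h_{ij} \right] \le \sum_{h \in H} \sum_{j=1}^{n} \bar u_j^h \hat y_j = 2 \sum_{j=1}^{n} \bar y_j \hat y_j \le \|\hat y\|^2 + \|\bar y\|^2$

so that Equation (4.31) holds. This completes the proof.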
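
The construction of Theorem 3 can be checked numerically. The sketch below computes $\bar w$ from (4.22), (4.24) and (4.27), placing all of the weight $2 \bar y_j$ on the first index of $H_j$ (one arbitrary admissible choice), and then evaluates the subgradient inequality of Definition 1 at randomly sampled points of $\Lambda$. The data layout, the function names, the tolerance `tol`, and the toy instance are assumptions made for illustration.

```python
import random

def surrogates_and_y(A, lam):
    n = len(A[0][0])
    s = [[sum(l * row[j] for l, row in zip(lam_h, A_h)) for j in range(n)]
         for A_h, lam_h in zip(A, lam)]
    y = [max(0.0, max(s_h[j] for s_h in s)) for j in range(n)]   # (4.22)
    return s, y

def f(A, lam):
    return sum(v * v for v in surrogates_and_y(A, lam)[1])      # (4.29)

def subgradient(A, lam, tol=1e-12):
    s, y = surrogates_and_y(A, lam)
    n, H = len(y), range(len(A))
    u = [[0.0] * n for _ in H]
    for j in range(n):
        if y[j] > 0.0:
            H_j = [h for h in H if abs(s[h][j] - y[j]) <= tol]  # (4.23)
            u[H_j[0]][j] = 2.0 * y[j]                           # one choice in (4.24)
    # (4.27): w_i^h = sum_j u_j^h (a_ij^h - y_j)
    return [[sum(u[h][j] * (row[j] - y[j]) for j in range(n)) for row in A[h]]
            for h in H]

def random_point(A):
    """A random element of Lambda: one simplex point per h."""
    point = []
    for A_h in A:
        e = [random.expovariate(1.0) for _ in A_h]
        point.append([v / sum(e) for v in e])
    return point

A = [[[1.0, 2.0], [3.0, -1.0]], [[2.0, 1.0]]]
lam_bar = [[0.5, 0.5], [1.0]]
w_bar = subgradient(A, lam_bar)

random.seed(0)
worst_gap = float("inf")
for _ in range(200):
    lam_hat = random_point(A)
    linear = sum(w * (lh - lb)
                 for wh, lhh, lbh in zip(w_bar, lam_hat, lam_bar)
                 for w, lh, lb in zip(wh, lhh, lbh))
    worst_gap = min(worst_gap, f(A, lam_hat) - f(A, lam_bar) - linear)
```

A nonnegative `worst_gap` (up to rounding) is what Theorem 3 predicts; for this instance `w_bar` is `[[-4.0, 4.0], [0.0]]`.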

Although, given $\bar\lambda \in \Lambda$, any solution to Equations (4.22) through (4.27) will yield a
subgradient of $f(\cdot)$ at the current point $\bar\lambda$, we would like to generate, without expending much
effort, a subgradient which is hopefully a direction of ascent. This would accelerate the
cut generation process. Later in Section 4.1.6 we describe one such scheme to determine a
suitable subgradient direction. For the present moment, let us assume that we have generated
a subgradient $\bar w$ and have taken a suitable step size $\theta$ in the direction $-\bar w$ as prescribed by the
subgradient optimization scheme of Held, Wolfe, and Crowder [12]. Let

(4.32) $\tilde\lambda = \bar\lambda - \theta \bar w$

be the new point thus obtained. To complete the iteration, we must now project $\tilde\lambda$ onto $\Lambda$, that
is, we must determine a new $\lambda$ according to

(4.33) $\lambda_{new} = P_\Lambda(\tilde\lambda) \equiv \arg\min \{\|\lambda - \tilde\lambda\| : \lambda \in \Lambda\}$.

The method of accomplishing this efficiently is presented in the next subsection.

4.1.3 Projection Scheme

For convenience, let us define the following linear manifold

(4.34) $M_h = \{\lambda_i^h,\ i \in Q_h : \sum_{i \in Q_h} \lambda_i^h = 1\}$, $h \in H$

and let $\bar M_h$ be the intersection of $M_h$ with the nonnegative orthant, that is,

(4.35) $\bar M_h = \{\lambda_i^h,\ i \in Q_h : \sum_{i \in Q_h} \lambda_i^h = 1,\ \lambda_i^h \ge 0,\ i \in Q_h\}$.

Note from Equation (4.28) that

(4.36) $\Lambda = \bar M_1 \times \cdots \times \bar M_{|H|}$.

Now, given $\tilde\lambda$, we want to project it onto $\Lambda$, that is, determine $\lambda_{new}$ from Equation (4.33).
Towards this end, for any vector $\alpha = (\alpha_i,\ i \in I)$, where $I$ is a suitable index set for the $|I|$
components of $\alpha$, let $P(\alpha, I)$ denote the following problem:

(4.37) $P(\alpha, I)$: minimize $\{\sum_{i \in I} (\lambda_i - \alpha_i)^2 : \sum_{i \in I} \lambda_i = 1,\ \lambda_i \ge 0,\ i \in I\}$.

Then to determine $\lambda_{new}$, we need to find the solutions $(\lambda^h_{new})_i$, $i \in Q_h$, as projections onto
$\bar M_h$ of $\tilde\lambda^h = (\tilde\lambda_i^h,\ i \in Q_h)$ through each of the $|H|$ separable Problems $P(\tilde\lambda^h, Q_h)$.
Thus, henceforth in this section, we will consider only one such $h \in H$. Theorem 4 below is the
basis of a finitely convergent iterative scheme to solve Problem $P(\tilde\lambda^h, Q_h)$.


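THEOREM 4: Consider the solution of Problem $P(\beta^k, I_k)$, where $\beta^k = (\beta_i^k,\ i \in I_k)$, with
$|I_k| \ge 1$. Define

(4.38) $\rho_k = \left( 1 - \sum_{i \in I_k} \beta_i^k \right) / |I_k|$

and let

(4.39) $\bar\beta^k = \beta^k + (\rho_k) 1_k$

where $1_k$ denotes a vector of $|I_k|$ elements, each equal to unity. Further, define

(4.40) $I_{k+1} = \{i \in I_k : \bar\beta_i^k > 0\}$.

Finally, let $\beta^{k+1}$ defined below be a subvector of $\bar\beta^k$,

(4.41) $\beta^{k+1} = (\beta_i^{k+1},\ i \in I_{k+1})$

where $\beta_i^{k+1} = \bar\beta_i^k$, $i \in I_{k+1}$. Now suppose that $\hat\beta^{k+1}$ solves $P(\beta^{k+1}, I_{k+1})$.

(a) If $\bar\beta^k \ge 0$, then $\bar\beta^k$ solves $P(\beta^k, I_k)$.

(b) If $\bar\beta^k \not\ge 0$, then $\hat\beta$ solves $P(\beta^k, I_k)$, where $\hat\beta$ has components given by

(4.42) $\hat\beta_i = \hat\beta_i^{k+1}$ if $i \in I_{k+1}$, and $\hat\beta_i = 0$ otherwise, for each $i \in I_k$.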

PROOF: For the sake of convenience, let $RP(\alpha, I)$ denote the problem obtained by
relaxing the nonnegativity restrictions in $P(\alpha, I)$. That is, let

$RP(\alpha, I)$: minimize $\{\sum_{i \in I} (\lambda_i - \alpha_i)^2 : \sum_{i \in I} \lambda_i = 1\}$.

First of all, note from Equations (4.38), (4.39) that $\bar\beta^k$ solves $RP(\beta^k, I_k)$ since $\bar\beta^k$ is the
projection of $\beta^k$ onto the linear manifold

(4.43) $\{\lambda = (\lambda_i,\ i \in I_k) : \sum_{i \in I_k} \lambda_i = 1\}$

which is the feasible region of $RP(\beta^k, I_k)$. Thus, $\bar\beta^k \ge 0$ implies that $\bar\beta^k$ also solves $P(\beta^k, I_k)$.
This proves part (a).

Next, suppose that $\bar\beta^k \not\ge 0$. Observe that $\hat\beta$ is feasible to $P(\beta^k, I_k)$ since from (4.42), we
get $\hat\beta \ge 0$ and $\sum_{i \in I_k} \hat\beta_i = \sum_{i \in I_{k+1}} \hat\beta_i^{k+1} = 1$ as $\hat\beta^{k+1}$ solves $P(\beta^{k+1}, I_{k+1})$.

Now, consider any $\lambda = (\lambda_i,\ i \in I_k)$ feasible to $P(\beta^k, I_k)$. Then, by the Pythagorean
Theorem, since $\bar\beta^k$ is the projection of $\beta^k$ onto (4.43), we get

$\|\lambda - \beta^k\|^2 = \|\lambda - \bar\beta^k\|^2 + \|\bar\beta^k - \beta^k\|^2$.

Hence, the optimal solution to $P(\bar\beta^k, I_k)$ is also optimal to $P(\beta^k, I_k)$. Now, suppose that we
can show that the optimal solution to Problem $P(\bar\beta^k, I_k)$ must satisfy

(4.44) $\lambda_i = 0$ for $i \notin I_{k+1}$.

Then, noting (4.41), (4.42), and using the hypothesis that $\hat\beta^{k+1}$ solves $P(\beta^{k+1}, I_{k+1})$, we will
have established part (b). Hence, let us prove that (4.44) must hold. Towards this end,
consider the following Kuhn-Tucker equations for Problem $P(\bar\beta^k, I_k)$ with $t$ and $w_i$, $i \in I_k$, as the
appropriate lagrangian multipliers:

(4.45) $\sum_{i \in I_k} \lambda_i = 1$, $\lambda_i \ge 0$ for each $i \in I_k$

(4.46) $(\lambda_i - \bar\beta_i^k) + t - w_i = 0$ and $w_i \ge 0$ for each $i \in I_k$

(4.47) $\lambda_i w_i = 0$ for each $i \in I_k$.

Now, since $\sum_{i \in I_k} \bar\beta_i^k = 1$, we get from (4.45), (4.46) that

(4.48) $t = \sum_{i \in I_k} w_i / |I_k| \ge 0$.

But from (4.46), (4.47) we get for each $i \in I_k$,

$0 = w_i \lambda_i = \lambda_i (\lambda_i + t - \bar\beta_i^k)$

which implies that for each $i \in I_k$, we must have,

either $\lambda_i = 0$, whence from (4.46), $w_i = t - \bar\beta_i^k$ must be nonnegative

or $\lambda_i = \bar\beta_i^k - t$, whence from (4.46), $w_i = 0$.

In either case above, noting (4.45) and (4.48), if $\bar\beta_i^k \le 0$, that is, if $i \notin I_{k+1}$, we must have
$\lambda_i = 0$. This completes the proof.

Using Theorem 4, one may easily validate the following procedure for finding $\lambda_{new}$ of
Equation (4.33), given $\tilde\lambda$. This procedure has to be repeated separately for each $h \in H$.

Initialization

Set $k = 0$, $\beta^0 = \tilde\lambda^h$, $I_0 = Q_h$. Go to Step 1.

Step 1

Given $\beta^k$, $I_k$, determine $\rho_k$ and $\bar\beta^k$ from (4.38), (4.39). If $\bar\beta^k \ge 0$, then terminate with
$\lambda^h_{new}$ having components given by

$(\lambda^h_{new})_i = \bar\beta_i^k$ if $i \in I_k$, and $(\lambda^h_{new})_i = 0$ otherwise.

Otherwise, proceed to Step 2.

Step 2

Define $I_{k+1}$, $\beta^{k+1}$ as in Equations (4.40), (4.41), increment $k$ by one and return to Step 1.

Note that this procedure is finitely convergent as it results in a strictly decreasing, finite
sequence $|I_k|$ satisfying $|I_k| \ge 1$ for each $k$, since $\sum_{i \in I_k} \bar\beta_i^k = 1$ for each $k$.

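EXAMPLE: Suppose we want to project $\tilde\lambda = (-2, 3, 1, 2)$ onto $\Lambda \subset R^4$. Then the above
procedure yields the following results.

Initialization
$k = 0$, $\beta^0 = (-2, 3, 1, 2)$, $I_0 = \{1, 2, 3, 4\}$.

Step 1
$\rho_0 = -3/4$, $\bar\beta^0 = (-11/4,\ 9/4,\ 1/4,\ 5/4) \not\ge 0$.

Step 2
$k = 1$, $I_1 = \{2, 3, 4\}$, $\beta^1 = (9/4,\ 1/4,\ 5/4)$.

Step 1
$\rho_1 = -11/12$, $\bar\beta^1 = (4/3,\ -2/3,\ 1/3) \not\ge 0$.

Step 2
$k = 2$, $I_2 = \{2, 4\}$, $\beta^2 = (4/3,\ 1/3)$.

Step 1
$\rho_2 = -1/3$, $\bar\beta^2 = (1, 0) \ge 0$.

Thus, $\lambda_{new} = (0, 1, 0, 0)$.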
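
The projection procedure of this subsection translates directly into a short loop. The following is a minimal sketch for a single block $h$, assuming the point to be projected is given as a list; the function name `project_onto_simplex` is hypothetical, indices are 0-based, and rational arithmetic is used so that the iterates match the fractions of the example.

```python
from fractions import Fraction as F

def project_onto_simplex(alpha):
    """Solve P(alpha, I) of (4.37) by repeated use of Theorem 4."""
    beta = {i: F(a) for i, a in enumerate(alpha)}            # beta^0 on I_0
    while True:
        rho = (1 - sum(beta.values())) / len(beta)           # (4.38)
        beta_bar = {i: b + rho for i, b in beta.items()}     # (4.39)
        if all(b >= 0 for b in beta_bar.values()):           # Step 1 termination
            return [beta_bar.get(i, F(0)) for i in range(len(alpha))]
        # Step 2: keep only strictly positive components, (4.40) and (4.41).
        beta = {i: b for i, b in beta_bar.items() if b > 0}

projected = project_onto_simplex([-2, 3, 1, 2])
```

Since the components of each $\bar\beta^k$ sum to one, at least one of them is positive, so the index set never becomes empty and the loop stops after at most $|I_0|$ passes.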

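4.1.4 A Second Sufficient Condition for Termination

As indicated earlier in Section 4.1.2, we will now derive a second sufficient condition on $\bar w$
for $\bar\lambda$ to solve PD$_2$. For this purpose, consider the following lemma:

LEMMA 3: Let $\bar\lambda \in \Lambda$ be given and suppose we obtain $\bar w$ using Equations (4.22) through
(4.27). Let $\hat w$ solve the problem

PR$_h$: minimize $\{\sum_{i \in Q_h} (w_i^h - \bar w_i^h)^2 : \sum_{i \in Q_h} w_i^h = 0,\ w_i^h \le 0 \text{ for } i \in J_h\}$ for each $h \in H$

where

(4.49) $J_h = \{i \in Q_h : \bar\lambda_i^h = 0\}$, $h \in H$.

Then, if $\hat w = 0$, $\bar\lambda$ solves Problem PD$_2$.

PROOF: Since $\hat w = 0$ solves PR$_h$, $h \in H$, we have for each $h \in H$,

(4.50) $\sum_{i \in Q_h} (\bar w_i^h)^2 \le \sum_{i \in Q_h} (w_i^h - \bar w_i^h)^2$

for all $w_i^h$, $i \in Q_h$, satisfying $\sum_{i \in Q_h} w_i^h = 0$, $w_i^h \le 0$ for $i \in J_h$. Given any $\lambda \in \Lambda$ and given any
$\mu > 0$, define

(4.51) $w_i^h = (\bar\lambda_i^h - \lambda_i^h) / \mu$, $i \in Q_h$, $h \in H$.

Then, $\sum_{i \in Q_h} w_i^h = 0$ for each $h \in H$ and since $\bar\lambda_i^h = 0$ for $i \in J_h$, $h \in H$, we get $w_i^h \le 0$ for
$i \in J_h$, $h \in H$. Thus, for any $\lambda \in \Lambda$, by substituting (4.51) into (4.50), we have,

(4.52) $\mu^2 \sum_{i \in Q_h} (\bar w_i^h)^2 \le \sum_{i \in Q_h} (\lambda_i^h - \bar\lambda_i^h + \mu \bar w_i^h)^2$ for each $h \in H$.

But Equation (4.52) implies that for each $h \in H$, $\lambda^h = \bar\lambda^h$ solves the problem

minimize $\{\sum_{i \in Q_h} [\lambda_i^h - (\bar\lambda_i^h - \mu \bar w_i^h)]^2 : \sum_{i \in Q_h} \lambda_i^h = 1,\ \lambda_i^h \ge 0,\ i \in Q_h\}$ for each $h \in H$.

In other words, the projection $P_\Lambda(\bar\lambda - \mu \bar w)$ of $(\bar\lambda - \mu \bar w)$ onto $\Lambda$ is equal to $\bar\lambda$ for any $\mu \ge 0$.
In view of Poljak's result [18,19], since $\bar w$ is a subgradient of $f(\cdot)$ at $\bar\lambda$, then $\bar\lambda$ solves PD$_2$.
This completes the proof.

Note that Lemma 3 above states that if the "closest" feasible direction $-\hat w$ to $-\bar w$ is a zero
vector, then $\bar\lambda$ solves PD$_2$. Based on this result, we derive through Lemma 4 below a second
sufficient condition for $\bar\lambda$ to solve PD$_2$.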

LEMMA 4: Suppose vv = solves Problems PR,,, /;€// as in Lemma 3. Then for each 
h € H, we must have 

(4.53) (a) Wj 1 = t,,, a constant, for each i$J h 

(b) wj' < t h for each i£J h 
where J,, is given by Equation (4.49). 

PROOF: Let us write the Kuhn-Tucker conditions for Problem $PR_h$, for any $h \in H$. We have

$(\hat w_i^h - w_i^h) + t_h = 0$ for $i \notin J_h$

$(\hat w_i^h - w_i^h) + t_h - u_i^h = 0$ for $i \in J_h$

$u_i^h \ge 0$, $i \in J_h$; $u_i^h \hat w_i^h = 0$, $i \in J_h$; $t_h$ unrestricted

$\sum_{i \in Q_h} \hat w_i^h = 0$, $\hat w_i^h \le 0$ for $i \in J_h$.


If $\hat w = 0$ solves $PR_h$, $h \in H$, then since $PR_h$ has a convex objective function and linear
constraints, there must exist a solution to

$w_i^h = t_h$ for each $i \notin J_h$

$u_i^h = (t_h - w_i^h) \ge 0$ for each $i \in J_h$.

This completes the proof.

Thus Equation (4.53) gives us another sufficient condition for $\bar\lambda$ to solve PD_2. We
illustrate the use of this condition through an example in Section 4.1.7.

4.1.5 Schema of an Algorithm to Solve Problem PD 2 

The procedure is depicted schematically below. In block 1, an arbitrary or, preferably, a
good heuristic solution $\bar\lambda \in \Lambda$ is sought. For example, one may use $\bar\lambda_i^h = 1/|Q_h|$ for each
$i \in Q_h$, for $h \in H$. For blocks 4 and 6, we recommend the procedural steps proposed by Held,
Wolfe and Crowder [12] for the subgradient optimization scheme.



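[Flowchart of the algorithm. Block contents as far as they can be recovered from the source; bracketed words are inferred from the surrounding text:]

Block 1: [Select an initial solution $\bar\lambda \in \Lambda$.]
Block 2: For $j = 1, \ldots, n$ determine $y_j$ and $u_j^h$, $h \in H$, using Equations (4.22), (4.24). Hence, determine w from Equation (4.27).
Block 3: Is $w = 0$, or does w satisfy Equation (4.53)? If yes, terminate with $\bar\lambda$ as an optimal solution to PD_2.
Block 4: [Select a step size $\mu$] and let $\lambda = \bar\lambda - \mu w$.
Block 5: Replace $\bar\lambda$ by the projection $P_\Lambda(\lambda)$ [equation numbers illegible in the source].
Block 6: Is a suitable [termination criterion satisfied]? If yes, terminate with $\bar\lambda$ as an estimate of an optimal solution to PD_2; otherwise return to block 2.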

4.1.6 Derivation of a Good Subgradient Direction 

In our discussion in Section 4.1.1, we saw that given a $\bar\lambda \in \Lambda$ of Equation (4.28), we were
able to uniquely determine $y_j$, $j = 1, \ldots, n$, through Equation (4.22). Thereafter, once we
fixed values $\bar u_j^h$ for $u_j^h$, $j = 1, \ldots, n$, $h \in H$, satisfying Equation (4.24), we were able to uniquely
determine values for the other variables in the Kuhn-Tucker system using Equations (4.26),
(4.27). Moreover, the only choice in determining $\bar u_j^h$, $j = 1, \ldots, n$, $h \in H$, arose in case $|H_j| \ge 2$
for some $j \in \{1, \ldots, n\}$ in Equation (4.25). We also established that no matter what feasible
values we selected for $u_j^h$, $j \in \{1, \ldots, n\}$, $h \in H$, the corresponding vector w obtained was a
subgradient direction. In order to select the best such subgradient direction, we are interested
in finding a vector w which has the smallest Euclidean norm among all possible vectors
corresponding to the given solution $\bar\lambda \in \Lambda$. However, this problem is not easy to solve. Moreover,
since this step will merely be a subroutine at each iteration of the proposed scheme to
solve PD_2, we will present a heuristic approach to this problem.

Towards this end, let us define for convenience, mutually exclusive but not uniquely
determined sets $N_h$, $h \in H$, as follows:

(4.54) $N_h \subseteq \{ j \in \{1, \ldots, n\} : h \in H_j \text{ of Equation (4.23)} \}$

(4.55) $N_i \cap N_j = \emptyset$ for any $i, j \in H$, $i \ne j$, and $\bigcup_{h \in H} N_h = \{ j \in \{1, \ldots, n\} : y_j > 0 \}$.

In other words, we take each $j \in \{1, \ldots, n\}$ which has $y_j > 0$, and assign it to some $h \in H_j$,
that is, assign it to a set $N_h$, where $h \in H_j$. Having done this, we let

(4.56) $\bar u_j^h = \begin{cases} 2 y_j & \text{if } j \in N_h \\ 0 & \text{otherwise} \end{cases}$ for each $j \in \{1, \ldots, n\}$, $h \in H$.

Note that Equation (4.56) yields values $\bar u_j^h$ for $u_j^h$, $j \in \{1, \ldots, n\}$, $h \in H$, which are feasible to
(4.24). Hence, having defined sets $N_h$, $h \in H$, as in Equations (4.54), (4.55), we determine $\bar u_j^h$,
$j \in \{1, \ldots, n\}$, $h \in H$, through (4.56) and hence w through (4.27).

Thus, the proposed heuristic scheme commences with a vector w obtained through an
arbitrary selection of sets $N_h$, $h \in H$, satisfying Equations (4.54), (4.55). Thereafter, we attempt
to improve (decrease) the value of $w'w$ in the following manner. We consider in turn each
$j \in \{1, \ldots, n\}$ which satisfies $|H_j| \ge 2$ and move it from its current set $N_{h_j}$, say, to another set
$N_h$ with $h \in H_j$, $h \ne h_j$, if this results in a decrease in $w'w$. If no such single movement results in
a decrease in $w'w$, we terminate with the incumbent solution w as the sought subgradient
direction. This procedure is illustrated in the example given below.
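The single-move improvement loop described above can be sketched as follows. This is a hedged illustration, not the authors' code: the map from an assignment to $w'w$ depends on Equations (4.27) and (4.56), so it is passed in as a caller-supplied function `objective`, and the names `improve_assignment`, `assignment` and `candidates` are hypothetical. The toy objective at the end exists only to exercise the loop.

```python
def improve_assignment(assignment, candidates, objective):
    """Local search over single moves of an index j between sets N_h.

    assignment: dict j -> h, meaning j currently belongs to N_h
    candidates: dict j -> list of admissible h (the set H_j)
    objective:  callable returning w'w for a given assignment
    """
    assignment = dict(assignment)
    best = objective(assignment)
    improved = True
    while improved:
        improved = False
        for j, admissible in candidates.items():
            if len(admissible) < 2:
                continue  # only indices with |H_j| >= 2 can be moved
            for h in admissible:
                if h == assignment[j]:
                    continue
                trial = dict(assignment)
                trial[j] = h
                value = objective(trial)
                if value < best:  # accept strictly improving moves only
                    assignment, best = trial, value
                    improved = True
    return assignment, best


def toy_objective(a):
    # Stand-in for w'w: sum of squared set sizes (illustration only).
    sizes = {}
    for h in a.values():
        sizes[h] = sizes.get(h, 0) + 1
    return sum(s * s for s in sizes.values())


H_sets = {1: [1, 2], 2: [2], 3: [1, 2]}
final, value = improve_assignment({1: 2, 2: 2, 3: 2}, H_sets, toy_objective)
```

Because only strictly improving moves are accepted and the number of assignments is finite, the loop terminates, possibly at a local optimum, which matches the behavior the example in Section 4.1.7 points out.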

4.1.7 Illustrative Example 

The intention of this subsection is to illustrate the scheme of the foregoing section for 
determining a good subgradient direction as well as the termination criterion of Section 4.1.4. 

Thus, let $H = \{1,2\}$, $n = 3$, $|Q_1| = |Q_2| = 3$ and consider the constraint sets

$S_1 = \{ x : 2x_1 - 3x_2 + x_3 \ge 1,\ -x_1 + 2x_2 + 3x_3 \ge 1,\ 3x_1 - x_2 - x_3 \ge 1,\ x_1, x_2, x_3 \ge 0 \}$

and

$S_2 = \{ x : 3x_1 - x_2 - x_3 \ge 1,\ 2x_1 + x_2 - 2x_3 \ge 1,\ -x_1 + 3x_2 + 3x_3 \ge 1,\ x_1, x_2, x_3 \ge 0 \}$.

Further, suppose we are currently located at a point $\bar\lambda$ with

$\bar\lambda_1^1 = 0$, $\bar\lambda_2^1 = 5/12$, $\bar\lambda_3^1 = 7/12$; $\bar\lambda_1^2 = 7/12$, $\bar\lambda_2^2 = 0$, $\bar\lambda_3^2 = 5/12$.

Then the associated surrogate constraints are

$\frac{4}{3} x_1 + \frac{1}{4} x_2 + \frac{2}{3} x_3 \ge 1$ for $h = 1$

$\frac{4}{3} x_1 + \frac{2}{3} x_2 + \frac{2}{3} x_3 \ge 1$ for $h = 2$.

Using Equations (4.22), (4.25), we find

$y_1 = \frac{4}{3}$ with $H_1 = \{1,2\}$, $y_2 = \frac{2}{3}$ with $H_2 = \{2\}$ and $y_3 = \frac{2}{3}$ with $H_3 = \{1,2\}$.

Note that the possible combinations of $N_1$ and $N_2$ are as follows:
(i) $N_1 = \{1\}$, $N_2 = \{2,3\}$,
(ii) $N_1 = \emptyset$, $N_2 = \{1,2,3\}$,
(iii) $N_1 = \{1,3\}$, $N_2 = \{2\}$, and
(iv) $N_1 = \{3\}$, $N_2 = \{1,2\}$.

A total enumeration of the values of u obtained for these sets through (4.56) and the 
corresponding values for w are shown below. 

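[Table: one row for each of the four choices of ($N_1$, $N_2$), namely ({1}, {2,3}), (∅, {1,2,3}), ({1,3}, {2}) and ({3}, {1,2}); columns give $\bar u_j^h$, $j \in \{1, \ldots, n\}$, and $w_i^h$, $i \in Q_h$, $h \in H$. The numerical entries are not legible in the source.]
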

Thus, according to the proposed scheme, if we commence with $N_1 = \{1\}$, $N_2 = \{2,3\}$, then
picking $j = 1$, which has $|H_j| = 2$, we can move $j = 1$ into $N_2$ since $2 \in H_1$. This leads to an
improvement. As one can see from above, no further improvement is possible. In fact, the
best solution shown above can be reached by the proposed scheme from every starting case except the third,
which is a local optimum.

We now illustrate the sufficient termination condition of Section 4.1.4. The vector w
obtained above is $(0, 0, 0 \mid 0, -4/3, 0)$, where the first three components correspond to $h = 1$ and the last three to $h = 2$. Further, the vector $\bar\lambda$ is $(0, 5/12, 7/12 \mid 7/12, 0, 5/12)$.
Thus, even though $w \ne 0$, we see that the conditions (4.53) of Lemma 4 are satisfied for each
$h \in H = \{1,2\}$ and thus the given $\bar\lambda$ solves PD_2.

The disjunctive cut (3.8) derived with this optimal solution $\bar\lambda$ is obtained through (4.57):

(4.58) $\frac{4}{3} x_1 + \frac{2}{3} x_2 + \frac{2}{3} x_3 \ge 1$.

It is interesting to compare this cut with that obtained through the parameter values $\lambda_i^h = 1/|Q_h|$
for each $i \in Q_h$, as recommended by Balas [1,2]. This latter cut is

(4.59) $\frac{4}{3} x_1 + x_2 + x_3 \ge 1$.
Observe that (4.58) uniformly dominates (4.59).
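The arithmetic behind the two cuts can be checked with exact fractions. The sketch below assumes that the cut coefficient of $x_j$ is the largest of the surrogate-constraint coefficients of $x_j$ over $h \in H$ (the usual form of a disjunctive cut; Equation (3.8) itself is not reproduced in this section). The helper names `surrogate` and `disjunctive_cut` are hypothetical.

```python
from fractions import Fraction as F

# Rows of the ">= 1" constraints defining S_1 and S_2 in the example.
S1 = [[2, -3, 1], [-1, 2, 3], [3, -1, -1]]
S2 = [[3, -1, -1], [2, 1, -2], [-1, 3, 3]]
systems = [S1, S2]


def surrogate(rows, lam):
    """Coefficients of the surrogate constraint sum_i lam_i * row_i."""
    return [sum(l * row[j] for l, row in zip(lam, rows))
            for j in range(len(rows[0]))]


def disjunctive_cut(systems, multipliers):
    """Assumed cut form: coefficient of x_j is the max over h."""
    surrogates = [surrogate(rows, lam)
                  for rows, lam in zip(systems, multipliers)]
    return [max(column) for column in zip(*surrogates)]


lam_opt = [[0, F(5, 12), F(7, 12)], [F(7, 12), 0, F(5, 12)]]
lam_uniform = [[F(1, 3)] * 3, [F(1, 3)] * 3]

cut_458 = disjunctive_cut(systems, lam_opt)      # coefficients 4/3, 2/3, 2/3
cut_459 = disjunctive_cut(systems, lam_uniform)  # coefficients 4/3, 1, 1
dominates = all(a <= b for a, b in zip(cut_458, cut_459))
```

Since $x \ge 0$ and both right-hand sides equal 1, smaller coefficients give the deeper cut, which is the sense in which (4.58) dominates (4.59).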

4.2 Maximizing the Rectilinear Distance Between the Origin and the Disjunctive Cut 

In this section, we will briefly consider the case where one desires to use rectilinear
instead of Euclidean distances. Extending the developments of Sections 2, 3 and 4.1, one may
easily see that the relevant problem is

minimize $\{ \max_{j \in \{1, \ldots, n\}} y_j$ : constraints (4.12), (4.13), (4.14) are satisfied $\}$.

The reason why we consider this formulation is its intuitive appeal. To see this, note that the
above problem is separable in $h \in H$ and may be rewritten as

PD_1: minimize $\left\{ z_h : z_h \ge \sum_{i \in Q_h} \lambda_i^h a_{ij}^h \text{ for each } j = 1, \ldots, n,\ \sum_{i \in Q_h} \lambda_i^h = 1,\ \lambda_i^h \ge 0 \text{ for } i \in Q_h,\ z_h \ge 0 \right\}$

for each $h \in H$.

Thus, for each $h \in H$, PD_1 seeks $\lambda_i^h$, $i \in Q_h$, such that the largest of the surrogate constraint
coefficients is minimized. Once such surrogate constraints are obtained, the disjunctive cut
(3.8) is derived using the principles of Section 3.

As far as the solution of Problem PD] is concerned, we merely remark that one may 
either solve it as a linear program or rewrite it as the minimization of a piecewise linear convex 
function subject to linear constraints and use a subgradient optimization technique. 


[1] Balas, E., "Intersection Cuts from Disjunctive Constraints," Management Science Research
Report, No. 330, Carnegie-Mellon University (1974).
[2] Balas, E., "Disjunctive Programming: Cutting Planes from Logical Conditions in Nonlinear
Programming," O.L. Mangasarian, R.R. Meyer, and S.M. Robinson, editors, Academic
Press, New York (1975).
[3] Balas, E., "Disjunctive Programming: Facets of the Convex Hull of Feasible Points,"
Management Science Research Report, No. 348, Carnegie-Mellon University (1974).
[4] Bazaraa, M.S. and C.M. Shetty, "Nonlinear Programming: Theory and Algorithms," John
Wiley and Sons, New York (1979).
[5] Burdet, C., "Elements of a Theory in Non-Convex Programming," Naval Research Logistics
Quarterly, 24, 47-66 (1977).
[6] Burdet, C., "Convex and Polaroid Extensions," Naval Research Logistics Quarterly, 24,
67-82 (1977).
[7] Dem'janov, V.F., "Seeking a Minimax on a Bounded Set," Soviet Mathematics Doklady,
11, 517-521 (1970) (English Translation).
[8] Glover, F., "Convexity Cuts for Multiple Choice Problems," Discrete Mathematics, 6,
221-234 (1973).
[9] Glover, F., "Polyhedral Convexity Cuts and Negative Edge Extensions," Zeitschrift für
Operations Research, 18, 181-186 (1974).

[10] Glover, F., "Polyhedral Annexation in Mixed Integer and Combinatorial Programming," 
Mathematical Programming, 8, 161-188 (1975). See also MSRS Report 73-9, Univer- 
sity of Colorado (1973). 

[11] Glover, F., D. Klingman and J. Stutz, "The Disjunctive Facet Problem: Formulation and 
Solutions Techniques," Management Science Research Report, No. 72-10, University of 
Colorado (1972). 

[12] Held, M., P. Wolfe and H.D. Crowder, "Validation of Subgradient Optimization," 
Mathematical Programming, 6, 62-88 (1974). 

[13] Jeroslow, R.G., "The Principles of Cutting Plane Theory: Part I," (with an addendum), 
Graduate School of Industrial Administration, Carnegie-Mellon University (1974). 

[14] Jeroslow, R.G., "Cutting Plane Theory: Disjunctive Methods," Annals of Discrete 
Mathematics, 1, 293-330 (1977). 

[15] Karlin, S., "Mathematical Methods and Theory in Games, Programming and Economics," 
1, Addison-Wesley Publishing Company, Reading, Mass. (1959). 

[16] Majthay, A. and A. Whinston, "Quasi-Concave Minimization Subject to Linear Con- 
straints," Discrete Mathematics, 9, 35-59 (1974). 

[17] Owen, G., "Cutting Planes for Programs with Disjunctive Constraints," Optimization
Theory and Its Applications, 11, 49-55 (1973).

[18] Poljak, B.T., "A General Method of Solving Extremum Problems," Soviet Mathematics 
Doklady, 8, 593-597 (1967). (English Translation). 

[19] Poljak, B.T., "Minimization of Unsmooth Functionals," USSR Computational Mathematics 
and Mathematical Physics, 9, 14-29 (1969). (English Translation). 

[20] Vaish, H. and C.M. Shetty, "A Cutting Plane Algorithm for the Bilinear Programming
Problem," Naval Research Logistics Quarterly, 24, 83-94 (1977).


B. Lev* 

Temple University 
Philadelphia, Pennsylvania 

D. I. Toof 

Ernst & Ernst 
Washington, D.C. 


The reliability of a serial production line is optimized with respect to the lo- 
cation of a single buffer. The problem was earlier defined and solved by Soy- 
ster and Toof for the special case of an even number of machines all having 
equal probability of failure. In this paper we generalize the results for any
number of machines and remove the restriction of identical machine reliabilities.
In addition, an analysis of multibuffer systems is presented with a closed
form solution for the reliability when both the number of buffers and their
capacity are limited. For the general multibuffer system we present an approach
for determining system reliability.


Several types of production line models appear in the literature. Each one is a realization 
of a different real life situation. A summary of the various types and the differences in the 
mechanism of product flow among them appears in Buzacott [5], Koenigsberg [9], Toof [14] or 
Buxey et al [1]. Recently Soyster and Toof [13] defined a serial production line, which is the 
model analyzed in this paper. 

The mechanism of product flow in a serial production line is described via Figure 1. An
unlimited source of raw material exists before machine 1. If machine 1 is capable of working
(i.e., not failed), an operator takes a unit of raw material and processes it on machine 1, after
which he moves to machine 2 and processes it on machine 2, if machine 2 is capable of working.
He proceeds analogously until machine N, where a finished product is completed. Let $T_i$ be
the process time on machine i. Then the cycle time of the system is $T = \sum_{i=1}^{N} T_i$. Let $q_i$ be the
probability that at any cycle T machine i is capable of working and $p_i = 1 - q_i$ the probability of
failing. The serial production line with no buffer must stop working if any of the individual
machines on the line fails. The placement of a single buffer of capacity M after machine i
alleviates this situation. If any of the first i machines fail and the buffer is not empty, machines
*This study was done when the author was at the Department of Energy, Washington, D.C. under the provisions of the 
Intergovernmental Personnel Act. 





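[Diagram: machines $M_1, M_2, \ldots, M_i$, a buffer, and machines $M_{i+1}, M_{i+2}, \ldots, M_N$ in series, with the direction of product flow indicated.]

Figure 1. Serial production line with N machines and a single buffer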

$i+1, i+2, \ldots, N$ can still function. Conversely, if any of the machines $i+1, \ldots, N$ fail
and the buffer is not full, the first i machines may still work and produce a semifinished good to
be stored in the buffer. One obviously would like to identify the optimal placement of this
buffer. Soyster and Toof [13] proved that if there are an even number of machines, all identically
reliable ($q_i = q$ for all i), then the optimal placement of the buffer is exactly in the middle of
the line. In Section 2 we generalize these results for any number of machines, not necessarily
identically reliable. Specifically, we prove that the optimal placement of a single buffer is at a
place which minimizes the absolute value of the difference between the reliabilities of the two
parts of the line separated by the buffer.

The optimal location $i^*$ is determined from (1):

(1) $\left| \prod_{i=1}^{i^*} q_i - \prod_{i=i^*+1}^{N} q_i \right| = \min_{1 \le j \le N} \left| \prod_{i=1}^{j} q_i - \prod_{i=j+1}^{N} q_i \right|$.

A more difficult question is the optimal location of several buffers. In Section 3 we analyze a
special case of a two buffer system, each buffer having a capacity of one unit. In Section 4 we
present an approach that can be used for any number of buffers with any capacity. The
approach we suggest is efficient as long as the number of buffers and their capacities remain
relatively small.


Let a single buffer with capacity M be placed after machine i. Let $\alpha_i = \prod_{j=1}^{i} q_j$,
$\beta_i = \prod_{j=i+1}^{N} q_j$, $\rho_i = (\alpha_i - \alpha_i\beta_i)/(\beta_i - \alpha_i\beta_i)$, and let $X_n$ be the number of units in the buffer at
the beginning of cycle n. Soyster and Toof [13] have shown that $X_n$ defines a finite Markov
chain, presented its transition matrix, and found that the reliability $R(i)$ of the line is given by
(2) and (3):

(2) $R(i) = \beta_i\alpha_i + \beta_i(1 - \alpha_i)\,\dfrac{\rho_i^{M+1} - \rho_i}{\rho_i^{M+1} - 1}$ if $\alpha_i \ne \beta_i$

(3) $R(i) = \beta_i\alpha_i + \beta_i(1 - \alpha_i)\,\dfrac{M}{M+1}$ if $\alpha_i = \beta_i$.

One has to maximize $R(i)$ with respect to i, that is, to identify the optimal location of the
buffer within the line. Since $\alpha_i\beta_i = \prod_{j=1}^{i} q_j \prod_{j=i+1}^{N} q_j = \prod_{j=1}^{N} q_j$ is a constant and does not affect the
location of the buffer, one can simply ignore this term from (2) and (3) in the optimization
phase. Thus, we want to find $i^*$ that maximizes $R(i)$, or:

(4) $R(i^*) = \max_i R(i) = \max_i \begin{cases} \beta_i(1 - \alpha_i)\,\dfrac{\rho_i^{M+1} - \rho_i}{\rho_i^{M+1} - 1} & \text{if } \alpha_i \ne \beta_i \\ \beta_i(1 - \alpha_i)\,\dfrac{M}{M+1} & \text{if } \alpha_i = \beta_i. \end{cases}$

The approach we take to solve (4) for $i^*$ is to show that $R(i)$ is strictly increasing with $\alpha_i$ for
$\alpha_i < \beta_i$ and strictly decreasing with $\alpha_i$ for $\alpha_i > \beta_i$; that $\alpha_i = \beta_i$ occurs when $R(i)$ reaches its
maximum value; and that $R(i)$ is symmetric about the point $i^*$ where $\alpha_{i^*} = \beta_{i^*}$.



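(5) $R(i) = (\beta_i - \alpha_i\beta_i)\,\dfrac{\rho_i^{M+1} - \rho_i}{\rho_i^{M+1} - 1}$, $\alpha_i \ne \beta_i$;

when $\alpha_i = \beta_i$, $\rho_i = 1$ and (5) becomes (6)

(6) $R(i) = (\beta_i - \alpha_i\beta_i)\,\dfrac{M}{M+1}$.


Note in (6) that as M becomes large the total reliability of the line, which is equal to $\alpha_i\beta_i + R(i)$,
approaches $\beta_i$. That is, the two segments of the line become independent of each other.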
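A small numerical sketch of the reliability expression may help when experimenting with the formulas. It assumes every $q_j$ lies strictly between 0 and 1 (so that $\rho_i$ is well defined), uses a floating-point tolerance to decide when $\alpha_i = \beta_i$, and the function name `line_reliability` is hypothetical.

```python
from math import prod


def line_reliability(q, i, M):
    """Total reliability alpha_i*beta_i + R(i) of a serial line.

    q: list of machine reliabilities q_1..q_N (assumed 0 < q_j < 1)
    i: the buffer sits after machine i (1-based)
    M: buffer capacity
    """
    alpha = prod(q[:i])      # reliability of machines 1..i
    beta = prod(q[i:])       # reliability of machines i+1..N
    base = alpha * beta      # reliability of the line without a buffer
    scale = beta - base      # beta_i * (1 - alpha_i)
    if abs(alpha - beta) < 1e-12:
        return base + scale * M / (M + 1)
    rho = (alpha - base) / (beta - base)
    return base + scale * (rho ** (M + 1) - rho) / (rho ** (M + 1) - 1)


balanced = line_reliability([0.9] * 4, 2, 1)      # alpha = beta = 0.81
unbalanced = line_reliability([0.9] * 3, 1, 1)    # alpha = 0.9, beta = 0.81
large_buffer = line_reliability([0.9] * 3, 1, 200)
```

For a very large buffer the value approaches the smaller of $\alpha_i$ and $\beta_i$, in line with the remark that the two segments become independent of each other.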

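In this section the general strategy is to show that if $\alpha_i > \beta_i$ or $\alpha_i < \beta_i$ then the reliability
of (5) is smaller than the reliability of (6). Hence, we treat $\alpha_i$ as a continuous variable and
show that the derivative of (5) with respect to $\alpha_i$ is positive for $\alpha_i < \beta_i$ and negative for
$\alpha_i > \beta_i$.

The derivative of $R(i)$ with respect to $\alpha_i$ is (using the fact that $\alpha_i\beta_i$ is constant, so that $d\beta_i/d\alpha_i = -\beta_i/\alpha_i$):

$\dfrac{dR(i)}{d\alpha_i} = -\dfrac{\beta_i}{\alpha_i} \cdot \dfrac{\rho_i^{M+1} - \rho_i}{\rho_i^{M+1} - 1} + \dfrac{2 - \alpha_i - \beta_i}{1 - \alpha_i} \cdot \dfrac{M\rho_i^{M+1} - (M+1)\rho_i^{M} + 1}{(\rho_i^{M+1} - 1)^2}$.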


LEMMA 1: The additional reliability function $R(i)$ is strictly increasing with respect to
$\alpha_i$ over the range $[0, \beta_i)$, and strictly decreasing with respect to $\alpha_i$ over the range $(\beta_i, 1]$.
That is, if $0 \le \alpha_i < \beta_i$, then $\dfrac{dR(i)}{d\alpha_i} > 0$. Conversely, if $\beta_i < \alpha_i \le 1$, then
$\dfrac{dR(i)}{d\alpha_i} < 0$. The proof can be found in [14]. (The first range is closed from the left and open
from the right; the second range is open from the left and closed from the right.)

THEOREM 1: The optimal placement, $i^*$, of a single buffer of integer capacity M in an N
machine line is where $\alpha_{i^*} = \beta_{i^*}$.

PROOF: The proof of this theorem is essentially complete. We must only show that (5)
is continuous at the point where $\alpha_{i^*} = \beta_{i^*}$. By definition the additional reliability attributable to
the introduction of the buffer when $\alpha_{i^*} = \beta_{i^*}$ is:

$R(i^*) = (\beta_{i^*} - \alpha_{i^*}\beta_{i^*})\,\dfrac{M}{M+1}$.

As $\alpha_i \to \beta_i$, $\rho_i \to 1$, so that in (5) the limit of the steady state probability as $\alpha_i \to \beta_i$ is of the
indeterminate form 0/0. However, an application of L'Hospital's rule shows that:

$\lim_{\alpha_i \to \beta_i} \dfrac{\rho_i^{M+1} - \rho_i}{\rho_i^{M+1} - 1} = \dfrac{M}{M+1}$

and thus the continuity is proven.
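For completeness, the limit can be written out explicitly. Differentiating the numerator and the denominator with respect to $\rho_i$ (a worked step supplied here, not taken from the source) gives:

```latex
\lim_{\rho_i \to 1} \frac{\rho_i^{M+1} - \rho_i}{\rho_i^{M+1} - 1}
  = \lim_{\rho_i \to 1} \frac{(M+1)\rho_i^{M} - 1}{(M+1)\rho_i^{M}}
  = \frac{(M+1) - 1}{M+1}
  = \frac{M}{M+1}.
```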


Theorem 1 defines an optimal though not necessarily feasible solution to the problem of
buffer placement. The condition $\alpha_i = \beta_i$ may be impossible to satisfy. In the remainder of this
section we examine the symmetry of the reliability function defined by Equation (5), develop a
simple criterion that provides the best feasible solution and, lastly, examine the special case
of identical machine reliability, i.e., $q_i = q$ for all i.

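LEMMA 2: Let $K_1$ and $K_2$ be continuous variables such that $\alpha_{K_1} - \beta_{K_1} = \beta_{K_2} - \alpha_{K_2}$.
Then $\rho_{K_1} \cdot \rho_{K_2} = 1$.

PROOF: Recall that $\alpha_i\beta_i = \prod_{j=1}^{N} q_j = Q$, a constant for all i. Thus the condition
$\alpha_{K_1} - \beta_{K_1} = \beta_{K_2} - \alpha_{K_2}$ may be rewritten $\alpha_{K_1} - \dfrac{Q}{\alpha_{K_1}} = \dfrac{Q}{\alpha_{K_2}} - \alpha_{K_2}$. This implies that:

$\alpha_{K_1} + \alpha_{K_2} = \dfrac{Q(\alpha_{K_1} + \alpha_{K_2})}{\alpha_{K_1}\alpha_{K_2}}$, or that $\alpha_{K_1}\alpha_{K_2} = Q$.

Similarly, one obtains the result that $\beta_{K_1}\beta_{K_2} = Q$. We want to show that $\rho_{K_1} \cdot \rho_{K_2} = 1$.
Substituting for $\rho_{K_1}$ and $\rho_{K_2}$ in the definition of $\rho$ yields:

$\rho_{K_1}\rho_{K_2} = \dfrac{(\alpha_{K_1} - Q)(\alpha_{K_2} - Q)}{(\beta_{K_1} - Q)(\beta_{K_2} - Q)}$.

We then must show that:

$(\alpha_{K_1} - Q)(\alpha_{K_2} - Q) = (\beta_{K_1} - Q)(\beta_{K_2} - Q)$
or that:

$\alpha_{K_1}\alpha_{K_2} - Q(\alpha_{K_1} + \alpha_{K_2}) = \beta_{K_1}\beta_{K_2} - Q(\beta_{K_1} + \beta_{K_2})$.

The condition $\alpha_{K_1} - \beta_{K_1} = \beta_{K_2} - \alpha_{K_2}$ implies both that $\alpha_{K_1} + \alpha_{K_2} = \beta_{K_1} + \beta_{K_2}$ and that
$\alpha_{K_1}\alpha_{K_2} = \beta_{K_1}\beta_{K_2} = Q$, and thus the proof is complete.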

This leads directly to the following theorem: 

THEOREM 2: For a continuous argument i, $R(i)$ is symmetric about the point $i^*$
where $\alpha_{i^*} = \beta_{i^*}$.

The proof is in [14]. 

The placement of the buffer has been treated as a continuous variable. While this has led
to satisfying mathematical results, in reality one must develop an optimizing criterion which is
physically feasible. Unfortunately, the condition $\alpha_{i^*} = \beta_{i^*}$ does not satisfy the feasibility
requirements. Rarely will $i^*$ be integer, and what, for example, is the physical interpretation of
$i^* = 7.63$? To this end, it will be shown in this section that the steady state reliability of the
line is maximized by placing the buffer after machine $i^*$ ($i^*$ integer), where $i^*$ satisfies the
following condition:

$|\alpha_{i^*} - \beta_{i^*}| = \min_{1 \le i \le N} |\alpha_i - \beta_i|$.

Note that if an integer $i^*$ exists such that $\alpha_{i^*} = \prod_{i=1}^{i^*} q_i = \prod_{i=i^*+1}^{N} q_i = \beta_{i^*}$, it would satisfy the
above criterion and be consistent with Theorem 1.


To this end observe that $|\alpha_i - \beta_i|$ is a convex function of $\alpha_i$ that obtains its minimum
point at $\alpha_i = \beta_i = \sqrt{\alpha_i\beta_i} = \sqrt{Q}$. Thus, for

$\alpha_i < \alpha_j < \sqrt{Q}$, $|\alpha_j - \beta_j| < |\alpha_i - \beta_i|$, and for $\sqrt{Q} < \alpha_j < \alpha_i$, $|\alpha_j - \beta_j| < |\alpha_i - \beta_i|$.

THEOREM 3 (Fundamental): The optimal integer placement of a single buffer of capacity
M in an N machine line is where $|\alpha_i - \beta_i|$ is minimized.

PROOF: From Theorem 1 we know that by treating i as a continuous variable the optimal
placement $i^*$ satisfies $\alpha_{i^*} = \beta_{i^*}$. If $i^*$ is integer the theorem is evident. Assume that $i^*$ is not
integer. Examine the points $[i^*]$ and $[i^* + 1]$. From Lemma 1 and the convexity of $|\alpha_i - \beta_i|$
we know that $R([i^*]) > R(K_1)$ where $\alpha_{[i^*]} > \alpha_{K_1}$, and $R([i^* + 1]) > R(K_2)$ where
$\alpha_{[i^*+1]} > \alpha_{K_2}$. Thus, the only two candidate placements are $[i^*]$ and $[i^* + 1]$.

If $|\alpha_{[i^*]} - \beta_{[i^*]}| = |\alpha_{[i^*+1]} - \beta_{[i^*+1]}|$ then the theorem holds and either placement is
optimal. Therefore, assume that $|\alpha_{[i^*]} - \beta_{[i^*]}| < |\alpha_{[i^*+1]} - \beta_{[i^*+1]}|$. We want to show that
$R([i^*]) > R([i^*+1])$. Assume the contrary, i.e., that $R([i^*+1]) > R([i^*])$. From
Theorem 2 we know that there exists a point $K^*$ such that $R(K^*) = R([i^*+1])$ and that
$|\alpha_{K^*} - \beta_{K^*}| = |\alpha_{[i^*+1]} - \beta_{[i^*+1]}|$. This implies that $R(K^*) > R([i^*])$. We know that
$|\alpha_{K^*} - \beta_{K^*}| > |\alpha_{[i^*]} - \beta_{[i^*]}|$, and since both $\alpha_{K^*}$ and $\alpha_{[i^*]}$ must be greater than $\sqrt{\alpha_i\beta_i}$, this
implies that $\alpha_{K^*} > \alpha_{[i^*]}$. By Theorem 2 this would imply that $R([i^*]) > R(K^*)$, which is a
contradiction. Similar results may be obtained by assuming that $|\alpha_{[i^*]} - \beta_{[i^*]}| > |\alpha_{[i^*+1]} - \beta_{[i^*+1]}|$.

Theorem 3 details a simple, yet elegant criterion for the optimal placement of a single
buffer regardless of capacity so as to maximize the reliability of the system.
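The criterion of Theorem 3 is straightforward to apply. The sketch below scans the positions $i = 1, \ldots, N-1$ (an assumption: a buffer after the last machine is not considered, which cannot change the minimum when every $q_j < 1$) and returns the first position at which $|\alpha_i - \beta_i|$ is smallest. The function name `best_buffer_position` is hypothetical.

```python
from math import prod


def best_buffer_position(q):
    """Return (i, gap): the 1-based machine index after which the buffer
    should be placed, and the value of |alpha_i - beta_i| there."""
    best_i, best_gap = None, None
    for i in range(1, len(q)):
        alpha = prod(q[:i])   # reliability of machines 1..i
        beta = prod(q[i:])    # reliability of machines i+1..N
        gap = abs(alpha - beta)
        if best_gap is None or gap < best_gap:
            best_i, best_gap = i, gap
    return best_i, best_gap


even_line = best_buffer_position([0.9] * 4)
mixed_line = best_buffer_position([0.5, 0.9, 0.9, 0.9])
```

With four identical machines the buffer goes in the middle of the line; with one much less reliable first machine the buffer moves directly behind it.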

A Special Case: $q_i = q$ for all i.


Consider the case where $q_i = q$ for all i. In this case:

$\alpha_i = q^i$

$\beta_i = q^{N-i}$.

It follows from Theorems 1 and 3 that if N is even, the optimal placement would be where
$\alpha_i = \beta_i$, which in this case is where $q^i = q^{N-i}$, which is satisfied at $i = N/2$. This is consistent
with the results developed by Soyster and Toof [13].

Assume that N is odd. Then N is of the form $2K + 1$ where K is integer, and by
Theorem 3 the optimal placement is either after machine K or machine $K + 1$ since:

$|\alpha_K - \beta_K| = |q^K - q^{2K+1-K}| = |q^K - q^{K+1}|$

$|\alpha_{K+1} - \beta_{K+1}| = |q^{K+1} - q^{2K+1-K-1}| = |q^K - q^{K+1}|$.

We have just completed the proof for the optimal location of a single buffer on an N machine
serial line. The result holds for any N (even or odd) and for any $q_i$ (whether or not the machine
reliabilities are identical). In the next section we generalize the
model to include more than one buffer.


Consider a simpler case of the general model where $N = 3k$ and $q_i = q$ for all i. The
placement of two buffers separates the line into three segments. Since $N = 3k$, one may arbitrarily
place the first buffer immediately after machine k and the second immediately after
machine 2k. The placement of these two buffers has just defined the three stages of the system.
Each stage may be comprised of more than one machine; for a line of $N = 3k$, each stage
is comprised of k machines. The reliability of each stage is $Q_1 = Q_2 = Q_3 = q^k = Q$ and
$P = 1 - Q$.

The two buffer system operates analogously to the one buffer system described in section 
2. If all machines are up, then a unit of raw material is processed by stages one, two and three 
and a finished good is produced. If, for example, stage three is down, stages one and two are up 
and buffer two is not full, then both stages one and two operate and a semicompleted good 
would be stored in buffer two. If buffer two had been full and buffer one had not, then 
machine two would not operate; it would be blocked by the second buffer which is full. In this 
case only machine one would operate and a semiprocessed good would be stored in buffer one. 

Define an ordered pair $(X, Y)$ where X represents the quantity of semifinished goods in
buffer one at the start of cycle t, and Y the quantity in buffer two at the start of cycle t. If we
assume that the maximum capacity of both buffers one and two is one, then the pair $(X, Y)$
may take on the following four values: (0,0), (1,0), (0,1), and (1,1). The one cycle transition
probability from state $(X, Y) = (0,0)$ to all states is:

• Both are empty at the start of cycle $t + 1$ if either all stages are up, or if stage one is
down. Thus: $P[(X_{t+1}, Y_{t+1}) = (0,0) \mid (X_t, Y_t) = (0,0)] = Q^3 + P$.

• If stage one is up during cycle t but stage two is down, then a unit of raw material is
processed on stage one and the semicompleted good stored in buffer one. Thus:
$P[(X_{t+1}, Y_{t+1}) = (1,0) \mid (X_t, Y_t) = (0,0)] = QP$.

• If both stages one and two are up but stage three is down, then a unit of raw material is
processed on both stages one and two and the semicompleted good stored in buffer two.
Thus: $P[(X_{t+1}, Y_{t+1}) = (0,1) \mid (X_t, Y_t) = (0,0)] = Q^2P$.

• Lastly, note that it is impossible for $(X_{t+1}, Y_{t+1})$ to equal (1,1) given that
$(X_t, Y_t) = (0,0)$, as at most one unit may be added to storage during any cycle. Thus:
$P[(X_{t+1}, Y_{t+1}) = (1,1) \mid (X_t, Y_t) = (0,0)] = 0$.

One may compute the transition probabilities for all of the four possible states in an analogous
manner. The complete transition matrix is presented in Figure 2.

State in t \ State in t+1    (0,0)       (1,0)       (0,1)        (1,1)

(0,0)                        Q^3 + P     QP          Q^2 P        0
(1,0)                        Q^2 P       Q^3 + P     Q P^2        Q^2 P
(0,1)                        QP          Q^2 P       Q^3 + P^2    QP
(1,1)                        0           QP          Q^2 P        Q^3 + P

Figure 2. Transition matrix: two buffer system

Let $\pi_1, \pi_2, \pi_3, \pi_4$ be the steady state probabilities of buffer states (0,0), (1,0), (0,1) and
(1,1) respectively. If the system is in state (0,0), which occurs with probability $\pi_1$, then a good is produced if
and only if all three stages are up. This event has a probability of $Q^3\pi_1$. Similarly, with probability
$\pi_2$ the system is in state (1,0), and then only stages two and three must be up for a finished
good to be produced. This event has probability $Q^2\pi_2$. Lastly, in both state (0,1) and
(1,1), buffer two is not empty and thus the only condition for a successful cycle is that stage
three must be up. These events have probability $Q\pi_3$ and $Q\pi_4$, respectively. The steady state
reliability, R, of the two buffer system where the capacity of both buffer one and buffer two is
one unit is equal to:

(7) $R = Q^3\pi_1 + Q^2\pi_2 + Q\pi_3 + Q\pi_4$.

Thus, upon determining the steady state probabilities $\pi_1, \pi_2, \pi_3$ and $\pi_4$, one has an
exact formulation of the reliability of the three stage, two buffer system, where each buffer has
a capacity of one unit.

From the transition matrix presented as Figure 2 and basic finite Markov chain theory,
one can calculate $\pi_1, \pi_2, \pi_3$, and $\pi_4$ in the following manner.

First, we know that in the steady state $\pi B = \pi$, where B is the one step transition matrix
of the system (Figure 2) and

$\pi = (\pi_1, \pi_2, \pi_3, \pi_4)$.

This identity yields a system of four simultaneous equations of the form
(8) $\pi(B - I) = 0$

where B is of the form:

$B = \begin{pmatrix} Q^3 + P & QP & Q^2P & 0 \\ Q^2P & Q^3 + P & QP^2 & Q^2P \\ QP & Q^2P & Q^3 + P^2 & QP \\ 0 & QP & Q^2P & Q^3 + P \end{pmatrix}$

However, $(B - I)$ has no inverse as the rows are linearly dependent. The classical method of
solution to this problem is to drop one of the identity equations of $\pi$ and substitute the fact
that the sum of the steady state probabilities must equal one. That is, $\pi_1 + \pi_2 + \pi_3 + \pi_4 = 1$.
Making this substitution for column 3 of $B - I$ yields the following system of simultaneous
equations: $\pi A = (0,0,1,0)$, where:

$A = \begin{pmatrix} Q^3 + P - 1 & QP & 1 & 0 \\ Q^2P & Q^3 + P - 1 & 1 & Q^2P \\ QP & Q^2P & 1 & QP \\ 0 & QP & 1 & Q^3 + P - 1 \end{pmatrix}$

Thus, $\pi = (0,0,1,0)A^{-1}$, which reduces to $\pi = A_3^{-1}$, where $A_3^{-1}$ is the third row of the
inverse matrix $A^{-1}$. The solution to the last system of four equations and four variables is:



(9) $\pi_1 = (Q^2 + Q + 1)/(4Q^2 + 3Q + 5)$
$\pi_2 = (Q^2 + Q + 2)/(4Q^2 + 3Q + 5)$
$\pi_3 = (Q^2 + 1)/(4Q^2 + 3Q + 5)$
$\pi_4 = (Q^2 + Q + 1)/(4Q^2 + 3Q + 5)$

We are now able to directly compute the steady state reliability of a two buffer series system
where each stage has identical reliability, Q, distributed Bernoulli, and each buffer a capacity
of one unit. We have just proved Theorem 4, which results from (7) and (9).



THEOREM 4: For the series production system described above the steady state reliability
of the system, R, is equal to:

$R = \dfrac{Q^5 + 2Q^4 + 4Q^3 + 3Q^2 + 2Q}{4Q^2 + 3Q + 5}$.
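The steady state computation can be reproduced exactly with rational arithmetic. The sketch below is an illustration under stated assumptions: the matrix entries follow the transition rules described in this section (the QP and zero entries are our reading of those rules), the normalization equation replaces the last balance equation rather than the third, and the helper names `transition_matrix` and `steady_state` are hypothetical.

```python
from fractions import Fraction


def transition_matrix(Q):
    """One step transition matrix over states (0,0), (1,0), (0,1), (1,1)."""
    P = 1 - Q
    return [
        [Q**3 + P, Q * P, Q**2 * P, 0],
        [Q**2 * P, Q**3 + P, Q * P**2, Q**2 * P],
        [Q * P, Q**2 * P, Q**3 + P**2, Q * P],
        [0, Q * P, Q**2 * P, Q**3 + P],
    ]


def steady_state(B):
    """Solve pi B = pi, sum(pi) = 1 by Gauss-Jordan elimination."""
    n = len(B)
    # Balance equations for columns 0..n-2, then the normalization row.
    rows = [[B[i][j] - (1 if i == j else 0) for i in range(n)] + [Fraction(0)]
            for j in range(n - 1)]
    rows.append([Fraction(1)] * n + [Fraction(1)])
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pv = rows[col][col]
        rows[col] = [x / pv for x in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                f = rows[r][col]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][n] for i in range(n)]


Q = Fraction(9, 10)
pi = steady_state(transition_matrix(Q))
R = Q**3 * pi[0] + Q**2 * pi[1] + Q * pi[2] + Q * pi[3]
```

For any rational Q strictly between 0 and 1 the computed `pi` and `R` can be compared directly against the closed forms (9) and Theorem 4.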


The previous sections have laid the groundwork for our analysis of a general multistage,
multibuffer system such as the one depicted in Figure 3. For ease of analysis let us assume
that the reliability of each stage has the Bernoulli distribution with parameter Q and further that
buffer i has capacity $M_i$. For a general N stage system with m buffers, there are $\prod_{i=1}^{m} (M_i + 1)$
possible buffer states; i.e., each buffer may take on $M_i + 1$ values and there are m such buffers.
For example, if $M_i = 4$ for all i, and $m = 5$, there would be 3,125 possible buffer states ranging
in value from (0,0,0,0,0) to (4,4,4,4,4). The question arises as to the viability of this form of
analysis for systems with large buffer capacity ($M_i$), multiple buffers (m) or a combination of
the two. Clearly, the transition matrix for a large system would be relatively sparse (i.e., many
zero entries). For example, in a four stage (three buffer) system, where each buffer has a
capacity of three units, there would be $4^3 = (3 + 1)^3$ or 64 possible transition states. For the
starting state (1,1,1) there are 13 possible transitions (i.e., nonzero transition probabilities). The
feasible transitions from the state (1,1,1) are:

(0,1,1), (0,1,2), (0,2,1), (1,0,1), (1,0,2), (1,1,0), (1,1,1),

(1,1,2), (1,2,0), (1,2,1), (2,0,1), (2,1,0), and (2,1,1).
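The growth of the state space is easy to tabulate. The following sketch simply enumerates the buffer states for given capacities; it does not attempt to reproduce the transition rules, and the function name `buffer_states` is hypothetical.

```python
from itertools import product
from math import prod


def buffer_states(capacities):
    """All buffer states for buffers with capacities M_1, ..., M_m."""
    return list(product(*(range(M + 1) for M in capacities)))


five_buffers = buffer_states([4] * 5)    # M_i = 4 for all i, m = 5
three_buffers = buffer_states([3] * 3)   # four stages, capacity three each
state_count = prod(M + 1 for M in [4] * 5)
```

The number of linear equations to be solved for the steady state probabilities equals the length of these lists, which is what limits the approach to small systems.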









[Diagram: stages in series separated by buffers $B_1, \ldots, B_{N-1}$, with the direction of product flow indicated.]

Figure 3. General multistage, multibuffer system

While it is obvious that the method of analysis employed to this point is feasible, that is,
(1) definition of a one step transition matrix; (2) development of a reliability equation as a
function of stage reliability and the steady state transition probabilities; and (3) solving a system
of linear equations for the steady state transition probabilities, its application is, for the most
part, not practical.

Let us present the transition matrices for two or three buffer systems with capacity one or
two. For the system of two buffers of capacity two the transition matrix is given in Figure 4
and the steady state probabilities for various values of Q are given in Figure 5, where the
reliability R is:

$R = Q^3\pi_1 + Q\pi_2 + Q\pi_3 + Q^2\pi_4 + Q\pi_5 + Q\pi_6 + Q^2\pi_7 + Q\pi_8 + Q\pi_9$.

Figure 5 was calculated by a small computer program. For various values of Q, we solved for
the unique $\pi_i$ and calculated R, which appears in Figure 5. For the system of four stages and
three buffers with capacity one, the transition matrix is given in Figure 6.

Again, using a small computer program we solved for $\pi_i$ and calculated R. The steady
state probabilities and the system reliability R are given in Figure 7, where

$R = Q^4\pi_1 + Q\pi_2 + Q^2\pi_3 + Q\pi_4 + Q^3\pi_5 + Q\pi_6 + Q^2\pi_7 + Q\pi_8$.



[Transition matrix for the nine buffer states of the three stage system with two buffers of capacity two. The nonzero entries are of the forms $Q^3 + P$, $Q^3 + P^2$, $Q^3 + P^3$, $Q^2P$ and $QP^2$; their positions in the matrix are not recoverable from the source.]

Figure 4. Two buffers of maximum capacity two

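[Table, buffer capacity equals 2: steady state probabilities $\pi_1, \ldots, \pi_9$ and reliability R for several values of Q (including Q = .9 and Q = .3). The numerical entries are not legible in the source.]

Figure 5. Exact solutions to three stage, two buffer system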

[Transition matrix for the eight buffer states of the four stage system with three buffers, maximum buffer capacity equals one. The nonzero entries are of the forms $Q^4 + P$, $Q^4 + P^2$, $Q^3P$, $Q^2P$, $Q^2P^2$ and $QP^2$; their positions in the matrix are not recoverable from the source.]

Figure 6. Four stage, three buffer transition matrix



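[Table, buffer capacity of one unit: steady state probabilities $\pi_1, \ldots, \pi_8$ and reliability R for Q = .9, Q = .3 and Q = .1. The numerical entries are not legible in the source.]

Figure 7. Reliability of a four stage, three buffer system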

The approach we present here can be summarized as follows: for a given configuration of
a serial production line with multiple buffers and no restriction on their capacity, one can write
the one-step transition probability matrix and solve for its steady-state probabilities, which yield
the reliability of the line. The method is efficient for a small number of buffers and small
capacities. In general, the number of state variables and the number of linear equations are

$$\prod_{i=1}^{m} (M_i + 1)$$

for m buffers with capacities $M_i$.
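As a rough illustration of how quickly the state space grows, the count of states can be evaluated directly from the buffer capacities. This is only a minimal sketch; the function name `num_states` is an assumption introduced here, not something taken from the paper.

```python
from math import prod


def num_states(capacities):
    """Number of buffer-level states for buffers with the given capacities.

    A buffer of capacity M can hold 0, 1, ..., M units, so it contributes
    (M + 1) levels; the joint state space is the product over all buffers.
    """
    return prod(m + 1 for m in capacities)


print(num_states([2, 2]))     # two buffers of capacity two
print(num_states([1, 1, 1]))  # three buffers of capacity one
```

The two calls correspond to the two configurations discussed in the text, and the resulting counts are the sizes of the linear systems that would have to be solved.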






Buxey, G.M., N.D. Slack and R. Wild, "Production Flow Line Systems Design — A 

Review," AIIE Transactions (1973). 
Buxey, G.M. and D. Sadjadi, "Simulation Studies of Conveyor Paced Assembly Lines with 

Buffer Capacity," The International Journal of Production Research, 14 (1976). 
Buzacott, J. A., "Automatic Transfer Lines with Buffer Stocks," International Journal of 

Production Research, 5 (1967). 
Buzacott, J. A., "The Role of Inventory Banks in Flow-Line Production Systems," The 

International Journal of Production Research, 9 (1971). 
Buzacott, J. A., "Models of Automatic Transfer Lines with Inventory Banks, A Review and 

Comparison," AIIE Transactions, 10 (1978). 
Gershwin, S.B., "The Efficiency of Transfer Lines Consisting of Three Unreliable 

Machines and Finite Interstage Buffers," Presented at ORSA/TIMS Los Angeles Meet- 
ing (1978). 
Hatcher, J.M., "The Effect of Internal Storage on the Production Rate of a Series of Stages 

Having Exponential Service Times," AIIE Transactions, 2, 150-156 (1969). 
Ignall, E. and A. Silver, "The Output of a Two-Stage System with Unreliable Machines and 

Limited Storage," AIIE Transactions, 9 (1977). 
Koenigsberg, E., "Production Lines and Internal Storage — A Review," Management Sci- 
ence, 5(1959). 
Okamura, K. and H. Yamashina, "Analysis of the Effect of Buffer Storage Capacity in 

Transfer Line Systems," AIIE Transactions, 9 (1977). 
Rao, N.P., "Two Stage Production System with Intermediate Storage," AIIE Transactions, 

Sheskin, T.J., "Allocation of Interstage Storage Along an Automatic Production Line," 

AIIE Transactions, 8 (1976). 


[13] Soyster, A.L. and D.I. Toof, "Some Comparative and Design Aspects of Fixed Cycle Pro- 
duction Systems," Naval Research Logistics Quarterly, 23, 437-454 (1976). 

[14] Toof, D.I., "Output Maximization of a Series Assembly Facility Through the Optimal 
Placement of Buffer Capacity," Unpublished Ph.D. dissertation, Temple University 


Roy D. Shapiro 

Harvard University 

Graduate School of Business Administration 

Cambridge, Massachusetts 


Consider a set of task pairs coupled in time: a first (initial) and a second
(completion) task of known durations with a specified time between them. If
the operator or machine performing these tasks is able to process only one at a
time, scheduling is necessary to ensure that no overlap occurs. This problem
has particular application to production scheduling, transportation, and radar
operations (send-receive pulses are ideal examples of time-linked tasks requiring
scheduling). This article discusses several candidate techniques for schedule
determination, and these are evaluated in a specific radar scheduling application.

This article considers the problem of scheduling task pairs, i.e., jobs which consist of two
coupled tasks, an initial task and a completion task, separated by a known, fixed time interval. If
the operator or machine performing these tasks is able to process only one at a time, scheduling
is necessary to ensure that the completion task of one pair does not arrive for processing while
one part of another pair is being processed.

Consider, for example, a radar tracking aircraft approaching a large airport [1]. In order 
to track adequately, it is necessary to transmit pulses and receive the reflection once every 
specified update period. The radar cannot transmit a pulse at the same time that a reflected 
pulse is arriving nor can two reflected pulses overlap. A possible strategy is to transmit to one 
tracked object and wait for that pulse to return before another pulse is transmitted as shown in 
Figure 1(a), but unless the number of objects being tracked is small, this may not allow all 
objects to be tracked in each update period. A more efficient strategy is some form of
interlaced scheduling like that shown in Figure 1(b). Observe that the time between each pair of
transmit and receive pulses is the same in Figure 1(b) as in Figure 1(a), yet the total
transmission time is far less in 1(b).


Figure 1. Sample 4-pair schedules 





Our object is to generate a schedule for a given set of task pairs which allows that set to
be completed in the least possible time with no overlap between tasks (Figure 2). Formally, let

$t_i$ = the time of initiation of the $i$th task pair;

$S_i$ = the duration of the initial task of the $i$th pair, $i = 1, 2, \dots, N$;

$T_i$ = the duration of the completion task of the $i$th pair, $i = 1, 2, \dots, N$;

$d_i$ = the "inter-task" duration, i.e., the time between the initiation of
the initial task of the $i$th pair and the initiation of that pair's completion.


Figure 2. The $i$th task pair (time axis marked at $t_i$, $t_i + S_i$, $t_i + d_i$, $t_i + d_i + T_i$)

The time between the initiation of the first task pair and the completion of the final pair we
refer to as the frame time (or makespan, cf. [3,4]), denoted z. For convenience, we will set the
initiation time of the first pair to 0.

The scheduling problem may be stated as 

find $t_i \ge 0$, $i = 1, \dots, N$, to minimize

$$z = \max_i \, (t_i + d_i + T_i)$$

subject to the constraint that no member of the set of intervals

$$\{(t_i,\ t_i + S_i),\ (t_i + d_i,\ t_i + d_i + T_i)\}, \quad i = 1, \dots, N,$$

overlap with any other member.
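The statement above can be made concrete with a small checker that, for proposed start times, tests the no-overlap constraint and evaluates the frame time z. This is a sketch under stated assumptions: each task pair is written as a tuple (S, d, T), intervals are treated as half-open so that tasks may abut, and the helper names are hypothetical rather than taken from the article.

```python
def intervals(t, S, d, T):
    """The two busy intervals of one task pair started at time t."""
    return [(t, t + S), (t + d, t + d + T)]


def frame_time(starts, pairs):
    """z = max_i (t_i + d_i + T_i) for pairs given as (S_i, d_i, T_i)."""
    return max(t + d + T for t, (S, d, T) in zip(starts, pairs))


def is_feasible(starts, pairs):
    """True if every t_i >= 0 and no two busy intervals overlap."""
    if any(t < 0 for t in starts):
        return False
    busy = sorted(iv for t, (S, d, T) in zip(starts, pairs)
                  for iv in intervals(t, S, d, T))
    # Once sorted by start time, comparing neighbours is sufficient.
    return all(a_end <= b_start
               for (_, a_end), (b_start, _) in zip(busy, busy[1:]))


# Two hypothetical pairs (S, d, T), interlaced in the manner of Figure 1(b).
pairs = [(1, 5, 1), (1, 5, 1)]
print(is_feasible([0, 1], pairs), frame_time([0, 1], pairs))
# Starting the second pair at time 5 collides with the first completion.
print(is_feasible([0, 5], pairs))
```

Any of the scheduling procedures discussed later can be validated by passing its start times through `is_feasible`.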

To put this problem into context with much of the recent literature classifying scheduling
problems with regard to their computational complexity, we observe that the problem as stated
is equivalent to a job shop problem where N jobs are to be scheduled on two machines with the
following characteristics*:

1. Each job requires three operations: the first (of duration $S_i$) to be processed on
Machine 1; the second (of duration $d_i - S_i$) on Machine 2; the third (of duration $T_i$)
again on Machine 1.

2. Machine 1 may only process one operation at a time; Machine 2, however, has
infinite processing capacity.

3. No waiting between operations is permitted. That is, once a job is begun, it must
proceed from Machine 1 to Machine 2 and back again to Machine 1 with no delay.

*Under the classification scheme of Rinnooy Kan [9], this problem is $N \mid 2 \mid G$, no wait, $M_2$ non-bott $\mid C_{\max}$. See also [8].

The problem can then be shown to be NP-complete by Theorem 5.7, pg. 93 in [9] or by a 
reduction from KNAPSACK in [6]. NP-complete problems form an equivalence class of com- 
binatorial problems for which no nonenumerative algorithms are known. If an "efficient" algo- 
rithm were constructed which could solve any problem in this class, any other would also be 
solvable in polynomial time (cf. [2,4,6,7]). Members of this class include the chromatic 
number problem, the knapsack problem, and the traveling salesman problem. 

The fact that a polynomial-bounded optimal algorithm is not likely to exist motivates the
construction of several polynomial-bounded suboptimal algorithms, which are presented and
evaluated in Sections 2 and 3. An integer programming formulation leads to a straightforward
branch and bound procedure which makes use of the problem's special structure. (See [11].) In
view of the fact that this optimal procedure is likely to be tractable only for very small
problems, and not even then for radar-like applications requiring real time solution, we proceed
directly to consideration of three suboptimal algorithms.


This section considers scheduling procedures which can be shown to be polynomially 
bounded: Sequencing, Nesting and Fitting. After some discussion of their characteristics, they 
will be evaluated on realistic examples in Section 3. 


An ordered set of p task pairs is said to be sequenced when the completion tasks arrive
for processing in the same order as the initial tasks were scheduled. p pairs can be sequenced if

(1) $d_1 \ge \sum_{i=1}^{p} S_i$ and

(2) $d_i \ge d_{i-1} + T_{i-1} - S_{i-1}$, $i = 2, 3, \dots, p$.

If, as is the case for many applications, $S_i = T_i$ for each task pair, (2) becomes simply

(3) $d_i \ge d_{i-1}$,

and implementation of this procedure becomes quite easy.
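Conditions (1) and (2) suggest a simple single-pass construction, sketched below. The rule used to split the pairs into successive p-sets (open a new set as soon as a condition fails, and start it when the previous set's last completion ends) is our reading of the procedure, not a statement of the author's implementation; the function name `sequence` and the (S, d, T) tuple layout are likewise assumptions.

```python
def sequence(pairs):
    """Greedy sequencing of task pairs given as (S, d, T) tuples.

    Pairs are taken in nondecreasing d. A pair joins the current p-set while
    conditions (1) and (2) hold; otherwise a new p-set is opened, starting
    when the last completion of the current set has finished.
    Returns (p_sets, starts): index lists per set and a start time per pair.
    """
    order = sorted(range(len(pairs)), key=lambda i: pairs[i][1])
    p_sets, starts = [], {}
    set_start = 0   # time at which the current p-set begins
    current = []    # indices in the current p-set
    sum_S = 0       # total initial-task time in the current p-set
    for i in order:
        S, d, T = pairs[i]
        if current:
            _, d_first, _ = pairs[current[0]]
            S_prev, d_prev, T_prev = pairs[current[-1]]
            fits = (sum_S + S <= d_first) and (d >= d_prev + T_prev - S_prev)
            if not fits:
                # Close the set: it ends when its last completion ends.
                set_start = starts[current[-1]] + d_prev + T_prev
                p_sets.append(current)
                current, sum_S = [], 0
        starts[i] = set_start + sum_S   # initial tasks are jammed together
        current.append(i)
        sum_S += S
    if current:
        p_sets.append(current)
    return p_sets, starts


# Hypothetical usage with seven pairs (S, d, T) where S = T.
pairs = [(2, 9, 2), (1, 13, 1), (2, 15, 2), (3, 15, 3),
         (2, 19, 2), (4, 24, 4), (3, 25, 3)]
p_sets, starts = sequence(pairs)
print([[i + 1 for i in s] for s in p_sets])     # p-sets, 1-based pair labels
print([starts[i] for i in range(len(pairs))])   # start time of each pair
```

Apart from the initial sort, each pair is examined once, which is what makes the procedure attractive for real-time use.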

We may think of this procedure as "jamming" initial tasks together until they run into the
completion task corresponding to the first initial task. The completion tasks are guaranteed not
to overlap since each succeeding $d_i$ is at least as large as the one before. Also, since this is a
"single-pass" procedure (cf. [3]), computation time is linear in N.*

*Actually, computation time is $O(N \log N)$ since the $d_i$ have to be ordered.

In any sequenced p-set, dead time can occur in two ways, as is shown in Figure 3. It
occurs between the last initial task and the first completion, and it occurs between successive
completions. The former can be written as

$$d_1 - \sum_{i=1}^{p} S_i$$



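and the latter as

$$\sum_{i=1}^{p-1} (d_{i+1} - d_i) = d_p - d_1 ,$$

so that

$$z_{seq} = \sum_{i=1}^{p} (S_i + T_i) + \Big(d_1 - \sum_{i=1}^{p} S_i\Big) + (d_p - d_1) = \sum_{i=1}^{p} T_i + d_p .$$

Figure 3. Sequencing (initial tasks $S_1, \dots, S_7$ followed by completion tasks $T_1, \dots, T_7$)

Hence, if N task pairs are sequenced in P p-sets, the kth set having $p_k$ pairs, $k = 1, 2, \dots, P$,
the total frame time may be represented as

$$z_{seq} = \sum_{k=1}^{P} \Big( \sum_{i \in k\text{th set}} T_i + d_{p_k} \Big) = \sum_{i=1}^{N} T_i + \sum_{k=1}^{P} d_{p_k} ,$$

where $d_{p_k}$ is the inter-task duration of the last pair in the kth set.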

As an example, consider the following 7 task pairs with common durations for initial and
completion tasks, ordered by increasing $d_i$.

i = 1: $S_1 = T_1 = 2$, $d_1 = 9$

i = 2: $S_2 = T_2 = 1$, $d_2 = 13$

i = 3: $S_3 = T_3 = 2$, $d_3 = 15$

i = 4: $S_4 = T_4 = 3$, $d_4 = 15$

i = 5: $S_5 = T_5 = 2$, $d_5 = 19$

i = 6: $S_6 = T_6 = 4$, $d_6 = 24$

i = 7: $S_7 = T_7 = 3$, $d_7 = 25$.

Figure 4(a) shows their sequenced schedule. 

For comparison, Figure 4(b) shows the optimal schedule for this set of task pairs as
generated by the branching algorithm alluded to above. At the other extreme, if these pairs were
scheduled by waiting until each pair was completely processed before initiating the next, the
frame time would be

$$\sum_{i=1}^{7} (d_i + T_i) = 138.$$


An ordered set of p task pairs is said to be nested whenever the completion tasks arrive
for processing in the reverse of the order in which the initial tasks were scheduled. p pairs may
be nested if

(4) $d_i \ge d_{i+1} + T_{i+1} + S_i$, $i = 1, \dots, p-1$.

Figure 4. Sequencing and nesting: (a) sequenced schedule, z = 58; (b) optimal schedule, z = 37; (c) nested schedule, z = 70

Applying this procedure to the 7-pair example discussed above gives the schedule shown 
in Figure 4(c) with z = 70. 
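Condition (4) can be turned into a greedy construction, sketched below. The article does not spell out how pairs are grouped into nested sets, so the rule used here (scan in nonincreasing $d_i$, accept a pair whenever (4) holds against the last accepted pair, and place leftover pairs in further nested sets) is an assumption; it was chosen because it yields z = 70 on the 7-pair example. The name `nest` and the (S, d, T) tuple layout are hypothetical.

```python
def nest(pairs):
    """Greedy nesting of task pairs given as (S, d, T) tuples.

    Scans pairs in nonincreasing d; a pair is added to the current nested
    set if condition (4) holds against the last pair added. Leftover pairs
    form further nested sets, placed one after another.
    Returns (nested_sets, starts, z).
    """
    remaining = sorted(range(len(pairs)), key=lambda i: -pairs[i][1])
    nested_sets, starts, clock = [], {}, 0
    while remaining:
        chosen, skipped = [], []
        for i in remaining:
            S, d, T = pairs[i]
            if chosen:
                S_last, d_last, _ = pairs[chosen[-1]]
                if d_last < d + T + S_last:   # condition (4) fails
                    skipped.append(i)
                    continue
            chosen.append(i)
        t = clock
        for i in chosen:                      # initial tasks back to back
            starts[i] = t
            t += pairs[i][0]
        outer = chosen[0]                     # outermost pair finishes last
        clock = starts[outer] + pairs[outer][1] + pairs[outer][2]
        nested_sets.append(chosen)
        remaining = skipped
    return nested_sets, starts, clock


# The 7-pair example, as (S, d, T) with S = T.
pairs = [(2, 9, 2), (1, 13, 1), (2, 15, 2), (3, 15, 3),
         (2, 19, 2), (4, 24, 4), (3, 25, 3)]
nested_sets, starts, z = nest(pairs)
print([[i + 1 for i in s] for s in nested_sets], z)
```

On this data the sketch groups the pairs as {7, 5, 3, 1}, {6, 4}, {2}, which is the grouping visible in Figure 4(c).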


Fitting, unlike the two procedures discussed above, allows the user to specify a priority
ordering, and corresponds intuitively to the simple process which one might use when scheduling
task pairs by hand. After setting the desired order and scheduling the first task pair at time 0,
each successive pair is scheduled at the earliest possible time not involving any overlap with
pairs already scheduled.

Let us consider this procedure for the above example, taking an arbitrary ordering:
2, 6, 7, 4, 3, 1, 5. As shown in Figure 5(a), task pair 2 is scheduled at time 0, and pairs 6 and 7
can successively be scheduled with no overlap. If, however, we try to schedule pair 4 at the
first available time, its completion would overlap with pair 6's completion (Figure 5(b)), so this
is not possible. The first available time for scheduling task pair 4 without overlap is time 18
(Figure 5(c)). Pair 3, however, having task duration only 2, can be scheduled at time 8
(Figure 5(d)). Observe now that pair 1 can be scheduled nowhere in the existing schedule without
overlap, so it must be "tacked" onto the end, at time 36 (Figure 5(e)). Pair 5 is scheduled at
time 21, completing the schedule with z = 47 (Figure 5(f)).
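The walk-through above can be reproduced with a short sketch of fitting. The search strategy is an assumption made for illustration: the earliest feasible start is either 0 or a time at which one of the pair's two tasks begins exactly where an already-committed interval ends, so only those candidate times are tried. The function name `fit` and the (S, d, T) tuple layout are hypothetical.

```python
def fit(pairs, order):
    """Fitting: place each pair, in the given priority order, at the
    earliest start time that causes no overlap with pairs already placed.

    pairs: list of (S, d, T); order: list of indices into pairs.
    Returns (starts, z).
    """
    busy = []     # half-open (begin, end) intervals already committed
    starts = {}

    def free(a, b):
        return all(b <= lo or hi <= a for lo, hi in busy)

    for i in order:
        S, d, T = pairs[i]
        ends = [hi for _, hi in busy]
        # Candidate starts: 0, or a start that makes the initial task or
        # the completion task begin where some committed interval ends.
        candidates = sorted({0, *ends, *(e - d for e in ends if e >= d)})
        for t in candidates:
            if free(t, t + S) and free(t + d, t + d + T):
                break
        starts[i] = t
        busy += [(t, t + S), (t + d, t + d + T)]
    z = max(hi for _, hi in busy)
    return starts, z


# The 7-pair example, as (S, d, T), fitted in the order 2, 6, 7, 4, 3, 1, 5.
pairs = [(2, 9, 2), (1, 13, 1), (2, 15, 2), (3, 15, 3),
         (2, 19, 2), (4, 24, 4), (3, 25, 3)]
order = [1, 5, 6, 3, 2, 0, 4]          # zero-based indices
starts, z = fit(pairs, order)
print({i + 1: starts[i] for i in order}, z)
```

Because every new pair is tested against all committed intervals at every candidate time, this sketch does noticeably more work per pair than sequencing, which is consistent with the computational comparison made later in the article.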


In keeping with the radar application mentioned above, a simulation has been developed
to generate aircraft configurations suitable to radar operation. For each object, range,
cross-section, and velocity can be used to determine the necessary length of transmit and receive
pulses (of the order of 10-100 μsec) as well as the inter-pulse distance (of the order of
300-1300 μsec). Thus, a list of task pairs can be generated for evaluation of the procedures
outlined in the previous section. As an example, such a list is given in Table I for N = 20.

For values of N shown in Table II, the simulation generated 50 such task pair lists, and 
the average frame time and computation time were computed. Figure 6 presents this data 
graphically. Note that, as one would expect, frame time is linear in N. This is not surprising 
since in the best conceivable situation, that of no idle time between subtasks, 



Figure 5. Fitting (panels (a) through (f))

TABLE I - Sample Task Pair List (N = 20)

Rows: $i$; $S_i = T_i$ (μsec); $d_i$ (μsec)

TABLE II (a) - Average Simulated Frame Times

Frame times in msec.

Quantities in parentheses are standard deviations.

TABLE II (b) - Average Computation Times (msec)

Figure 6. Comparative frame times (mean frame time Z versus N, for N = 100 to 500; $Z \approx .232N$)

$$z = \sum_i (S_i + T_i) \approx k_1 N$$

and in the worst situation, that in which no task is performed until the previous task has been
completed,

$$z = \sum_i (d_i + T_i) \approx k_2 N.$$


An assumption made in the treatment of this example is that the radar operator knows the
values of $S_i$, $T_i$ and $d_i$ precisely. If there is any uncertainty, signals can overlap. A
straightforward way of avoiding this problem in a real situation where uncertainty would obviously be
present would be to "open a window" around the pulse. That is, if the object is such that
transmit and receive pulses are estimated to be of 60 μsec duration, an interval longer than 60
μsec can be allotted to these pulses to accommodate (1) the possibility that a pulse length
longer than 60 μsec might be necessary or, more important, (2) the possibility that the receive
signal might arrive sooner or later than expected. This procedure offers no conceptual difficulty
since the window around the pulses may be made large enough to guarantee that the probability
of overlap is as small as required. In order to retain frame times small enough to allow
updating every, say, 200 milliseconds, we must limit the size of the window somewhat. This does
not seem to be a severe restriction, however. For example, since frame time is linear in $\sum T_i$,
opening a window around each pulse of twice that pulse's estimated duration would cause the
frame time to be no more than doubled. The frame times of sequenced pulses in Table II(a)
indicate that even for large N, this is no problem.

A second possibly problematic characteristic of the example is that it is static, i.e., no 
explicit consideration is given to new "jobs" added to the system during the scheduling process. 
In job shop scheduling, this may present no problem if jobs are released to the shop at 
predetermined times. In radar tracking, however, one cannot hold enemy missiles, and the 
scheduler must be dynamic. This can be accomplished; the new targets may be inserted into 
the queue of jobs to be processed, or, since this is likely to be time-consuming when jobs are 
ordered (as in Sequencing and Nesting), all current jobs can be processed, followed by the 
newly-arrived entries. This procedure will be especially efficient for sequencing since the $d_i$'s
are proportional to the distance between radar and target, and new targets will tend to appear at
approximately the same range.

The necessity to allow for search and discrimination as well as the tracking activity and 
real-time schedule determination within a 200-millisecond period makes sequencing the only
viable alternative. Even when real-time processing is not required, one wonders whether the 
slight improvement in frame time allowed by fitting warrants the extra computational burden. 

A caveat is in order here: these results are somewhat application-dependent. It is quite 
possible that other applications which produce task pairs with different structures will lead to 
different conclusions. 


In the above discussion it has been assumed that the operator or machine can process 
only one task segment at a time. This is appropriate for the application being considered, but 
one might easily imagine instances in which there is some nonunit capacity constraint on the 
operator. For example, if trucks are being loaded and unloaded at some central depot, labor or 
space restrictions might limit the number of trucks being simultaneously processed. 

Fortunately, the suboptimal procedures described above may be extended without any
problem.* Figure 7 shows how the example given in Section 2 may be sequenced if the operator
is limited to two tasks at a time. Note that, due to the ordering of the inter-task durations,
sequencing guarantees that since no more than two initial tasks can overlap, no more than two
final tasks will overlap.

*The optimal enumerative procedure described in [11] is also easily extended.







Figure 7. Sequencing with operator capacity = 2

Another extension is to consider tasks which consist of more than two coupled segments.
The notation changes slightly: the $i$th task pair becomes a task set, the initial task of duration
$S_i$ followed by $n_i$ subtasks; the $j$th subtask is of duration $T_{ij}$ and the time at which it is
initiated is $d_{ij}$ after the initiation of the initial task (Figure 8).





Figure 8. Multiple coupled subtasks 

Fitting, as proposed above, works well in this case, but sequencing and nesting are
wasteful since they treat the subtasks as one long task of duration $d_{in_i} + T_{in_i} - d_{i1}$.


[1] Air Traffic Control Advisory Committee, Report of the Department of Transportation, 1

[2] Coffman, E.G., Jr. (editor), Computer and Jobshop Scheduling Theory, John Wiley, New
York, N.Y. (1976).

[3] Conway, R.W., W.L. Maxwell and L.W. Miller, Theory of Scheduling, Addison-Wesley,
Reading, Mass. (1967).

[4] Garey, M.R., D.S. Johnson and R. Sethi, "The Complexity of Flowshop and Jobshop
Scheduling," Mathematics of Operations Research, 1, 117-129 (1976).

[5] Heffes, J. and S. Horing, "Optimal Allocation of Tracking Pulses for an Array Radar,"
IEEE Transactions on Automatic Control, 15, 81-87 (1970).

[6] Karp, R.M., "Reducibility among Combinatorial Problems," R.E. Miller and J.W.
Thatcher (editors), Complexity of Computer Computations, Plenum Press, New York,
N.Y., 85-104 (1972).

[7] Karp, R.M., "On the Computational Complexity of Combinatorial Problems," Networks, 5,
45-68 (1975).

[8] Reddi, S.S. and C.V. Ramamoorthy, "On the Flowshop Sequencing Problem with No Wait
in Process," Operational Research Quarterly, 23, 323-331 (1972).

[9] Rinnooy Kan, A.H.G., Machine Scheduling Problems, Martinus Nijhoff, The Hague (1976).

[10] Schweppe, F.C. and D.L. Gray, "Radar Signal Design Subject to Simultaneous Peak and
Average Power Constraints," IEEE Transactions on Information Theory, 12, 13-26

[11] Shapiro, R.D., "Scheduling Coupled Tasks," Harvard Business School, Working Paper,
HBS 76-10.


Kenneth R. Baker 

Dartmouth College 
Hanover, New Hampshire 

Henry L. W. Nuttle 

North Carolina State University 
Raleigh, North Carolina 


This paper examines problems of sequencing n jobs for processing by a sin- 
gle resource to minimize a function of job completion times, when the availa- 
bility of the resource varies over time. A number of well-known results for 
single-machine problems which can be applied with little or no modification to 
the corresponding variable-resource problems are given. However, it is shown 
that the problem of minimizing the weighted sum of completion times provides 
an exception. 


We consider the problem of sequencing a set $N = \{1, 2, \dots, n\}$ of jobs to be processed
using a single homogeneous resource, where the availability of the resource varies over time.
If t represents time (measured from some origin t = 0) then we denote by r(t) the resource
available at time t and by R(t),

$$R(t) = \int_0^t r(u)\,du ,$$

the cumulative availability as of time t, i.e., the area under the curve r(u) over the interval
[0, t]. See Figure 1.

Let $p_j$, $j = 1, \dots, n$, denote the resource requirement of job j. Once $p_j$ units of
resource have been applied to job j, the job is considered complete. We denote the completion
time of job j by $C_j$. In all problems treated the objective is to minimize G, a function of the
completion times of the jobs, where G is assumed to be a regular measure (see [1], Chapter 2).

This model is a generalization of the single-machine sequencing model. The generaliza- 
tion to a resource capacity that varies over time allows for situations in which machine availabil- 
ity is interrupted for scheduled maintenance or temporarily reduced to conserve energy. It also 
allows for a situation in which processing requirements are stated in terms of man-hours and 
labor availability varies over time. 




Figure 1.


In the single-machine case the resource profile r(t) is constant (typically r(t) = 1), and
the cumulative profile R(t) is a straight line with slope r(t). Time is measured in some basic
unit such as hours; and completion times, ready times, due dates and tardiness are expressed in
the same units. Resource requirements (processing times) are simply requirements for
intervals on the time-axis.

In the variable-resource problem, the exact correspondence between the requirement for a
unit of resource and the requirement for a unit interval on the time axis is lost. This lack of
correspondence arises from the fact that there may be a number of units of resource available
during a particular unit of time and a different number during the next. In the single-machine
problem if a job j is sequenced to follow jobs in B (where B is any subset of N) then job j will
be complete at time $C_j$,

$$C_j = p(B) + p_j ,$$

where $p(B) = \sum_{i \in B} p_i$, and $p_i$ denotes the processing time for job i. In the
variable-resource problem it is appealing to analogously specify the completion time of job j by $C_j$,

(1) $C_j = t(p(B) + p_j)$

where $p_j$ is the resource requirement of job j and $t(Q)$ is the (smallest) point on the time axis
corresponding to $R(t) = Q$. See Figure 1. In effect, jobs are sequenced on the resource axis,
while their completion times are measured on the time axis. For the single-machine problem
the completion point of job j is the same on both axes, but such is not the case for the
variable-resource problem.
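To make the mapping from the resource axis to the time axis concrete, the following sketch evaluates $t(Q)$ and equation (1) for a piecewise-constant resource profile. The representation of the profile as a list of (length, rate) segments, the function names, and the job requirements in the example are all assumptions introduced for illustration.

```python
def t_of(Q, profile):
    """Smallest time t with R(t) = Q for a piecewise-constant profile.

    profile: list of (length, rate) segments starting at time 0.
    """
    if Q <= 0:
        return 0.0
    time, cum = 0.0, 0.0
    for length, rate in profile:
        if rate > 0 and cum + rate * length >= Q:
            return time + (Q - cum) / rate
        cum += rate * length
        time += length
    raise ValueError("requirement exceeds total availability")


def completion_times(sequence, p, profile):
    """C_j = t(p(B) + p_j), where B is the set of jobs sequenced before j."""
    done, C = 0.0, {}
    for j in sequence:
        done += p[j]
        C[j] = t_of(done, profile)
    return C


# Illustrative profile: one unit of resource for 4 periods, then four units
# for 3 periods. The job requirements p are hypothetical.
profile = [(4, 1), (3, 4)]
p = {1: 3, 2: 5, 3: 4}
print(completion_times([1, 2, 3], p, profile))
```

With a constant profile of rate 1 the function `t_of` reduces to the identity, which recovers the single-machine relation $C_j = p(B) + p_j$.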

Notice that this specification implicitly assumes that the resource available at any point in
time is devoted entirely to the processing of a single job. Thus, for example, if ten men were 
available in a particular hour, all ten would be assigned to work simultaneously on the same 
job. Also, if the available resource represents several machines, then this formulation permits 
each job to be processed simultaneously on more than one machine. Equivalently, this means 
that jobs must be divisible into portions that can be allocated equally to the number of 
machines available. Such a formulation will be called a continuous-time model. 

In order to allow for a wider range of applicability, we can re-formulate the model in 
discrete time as follows. 

(a) Unit intervals on the time axis (of Figure 1) are called periods, and job comple- 
tion times are measured in periods. 

(b) In a given period the resource availability is an integer number of units. 

(c) Each job requires an integer number of resource-periods. 

(d) Processing work is divisible only to the level of one resource-unit for one period. 

Under this formulation, for example, the time unit might be days, the resource availability
might be crew size, and the processing requirement might be man-days. Property (d) then
restricts the refinement of a schedule to the assignment of each crew member's task on a
day-by-day basis. Furthermore, a task requiring two man-days could be accomplished either by one
crew member working two days or by two members working one day each.

In the discrete-time context, we may regard sequencing as ordering jobs on the resource
scale in Figure 1, but taking the completion time of job j to be $\lceil C_j \rceil$, the smallest
integer greater than or equal to $C_j$, where $C_j$ itself is given by (1). In other words, we obtain a
sequence using the continuous-time framework, which assumes arbitrarily divisible jobs, but we
round up the resulting completion times when they are noninteger. Under this interpretation
of the model, due-dates are specific days and a job is "on time" as long as it is completed on or
before the specific day. Clearly, in the discrete-time model several jobs can have the same
completion time.

To verify that a job sequence can be interpreted consistently with requirement (d), note 
that the cumulative resource requirement and the cumulative resource availability by the end of 
any period are both integers. It follows that the workload implied by the continuous-time solu- 
tion can be shifted to meet the integer restrictions of the discrete-time model since the resource 
availability in any period can be treated as a set of unit-resource availabilities. Then any frac- 
tion of a day's work in the original solution can be rescheduled as a day's work for the same 
proportion of the total resource units available. This rescheduling will consume an integer 
number of resource-periods for each job. 

As an example, consider the three-job problem shown below. 




$r(t) = 1$, $0 \le t < 4$

$r(t) = 4$, $4 \le t < 7$

In Figure 2 we represent the sequence 1-2-3 assuming infinite divisibility. In Figure 3 we show 
how the work is rescheduled to meet the integrality requirement of the discrete-time model. 
As Figures 2 and 3 indicate, the discrete-time conditions can be incorporated by a minor adjust- 
ment of continuous-time job assignments that essentially involves replacing vertical portions of 
the schedule chart with horizontal portions whenever the available resource capacity is split 
among two or more jobs within a period. 


Figure 2. 

Our purpose in this paper is to note that certain well-known results for the single-machine 
model carry over with little or no modification to the variable-resource model. In fact, we 
found only one exception. (See Section 3.) 

A variable resource problem has also been examined by Gelders and Kleindorfer [6,7] in 
the context of coordinating aggregate and detailed scheduling decisions. In their model the 
variation in resource availability results from the explicit decision to schedule overtime. This 



decision leads to a cumulative resource availability function consisting of segments with identi- 
cal positive slope (corresponding to capacity available) separated by horizontal segments 
(corresponding to unused overtime.) Their objective is to determine when and how much 
overtime should be scheduled, and to determine the associated job sequence, so as to minimize 
the sum of overtime, tardiness and flow-time costs. They also note that for a given overtime 
schedule, shortest-first sequencing minimizes mean job completion time while nondecreasing 
processing time-to-weight ratio sequencing may not minimize mean weighted job completion 
time. These two results are encompassed in our general treatment of the variable-resource 
model in Sections 2 and 3. 


Figure 3. 


The following is a set of sequencing results for the variable-resource model that are ident- 
ical to or slight modifications of their single-machine counterparts. It is not difficult to establish 
that the results we give are valid for both the continuous-time and discrete-time models. How- 
ever, proofs are omitted, since they are typically direct extensions of the original arguments in 
the single-machine case. 

The results involve sequences of jobs, or at least partial sequences. We reiterate that
these sequences can be viewed as applying to the resource axis in Figure 1 but can be converted
to completion time schedules in either the continuous-time or discrete-time case by means of
the appropriate transformation. We use $C_j$ to denote the completion time of job j and $t(p(B))$
to denote the makespan for the jobs in B, recognizing that in the discrete-time case these
quantities must be interpreted in the appropriate way.

Minimizing the Maximum Cost 

One of the few efficient algorithms for a broad class of sequencing criteria is Lawler's
procedure [9] for minimizing the maximum cost in the sequence. Formally, the criterion is to
minimize

$$G = \max_j \{g_j(C_j)\}$$

where $g_j(C_j)$ is the cost incurred by job j when it completes at $C_j$ and where $g_j(t)$ is
nondecreasing in t. The solution procedure works by constructing a sequence from the back of the
schedule and the procedure is easily adapted to the variable-resource model, as shown below.


1. Initially let $A = \emptyset$. (A denotes the set of jobs at the end of the schedule and
$A' = N - A$ denotes its complement.)

2. Find $M = t(p(A'))$. (M is the makespan for unscheduled jobs.)

3. Identify job k satisfying $g_k(M) = \min_{j \in A'} \{g_j(M)\}$. (Considering only the
unscheduled jobs, job k is the one that achieves the minimum cost when
scheduled last.)

4. Schedule job k last among the jobs in A'. Then add job k to A and return to Step
2 until A = N.
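The four steps above can be sketched in a few lines. The sketch assumes a piecewise-constant resource profile given as (length, rate) segments and cost functions supplied as Python callables; these representations, the function names, and the example data (maximum tardiness with hypothetical due dates) are illustrative assumptions, not part of the paper.

```python
def t_of(Q, profile):
    """Smallest time t with R(t) = Q; profile is a list of (length, rate)."""
    if Q <= 0:
        return 0.0
    time, cum = 0.0, 0.0
    for length, rate in profile:
        if rate > 0 and cum + rate * length >= Q:
            return time + (Q - cum) / rate
        cum += rate * length
        time += length
    raise ValueError("requirement exceeds total availability")


def min_max_cost(p, g, profile):
    """Backward construction: repeatedly pick the job that is cheapest to
    finish last among the unscheduled jobs.

    p: {job: requirement}; g: {job: nondecreasing cost function of time}.
    Returns the sequence, first job to last job.
    """
    unscheduled = set(p)   # the set A'
    tail = []              # the set A, built from the back of the schedule
    while unscheduled:
        M = t_of(sum(p[j] for j in unscheduled), profile)      # Step 2
        k = min(sorted(unscheduled), key=lambda j: g[j](M))    # Step 3
        tail.append(k)                                         # Step 4
        unscheduled.remove(k)
    return tail[::-1]


# Hypothetical data: maximum tardiness, g_j(t) = max(t - d_j, 0).
profile = [(4, 1), (3, 4)]
p = {1: 3, 2: 5, 3: 4}
due = {1: 6, 2: 4, 3: 5}
g = {j: (lambda t, dj=due[j]: max(t - dj, 0)) for j in p}
print(min_max_cost(p, g, profile))
```

The only change from the single-machine version is Step 2, where the makespan of the unscheduled jobs is read off the time axis through `t_of` rather than taken to be the sum of the requirements.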

A noteworthy special case is the criterion of maximum tardiness. The procedure 
sequences jobs in nondecreasing order of due-dates in this case. Thus, as in the single machine 
problem, earliest due-date (EDD) scheduling will minimize the maximum tardiness. It will 
also find a schedule in which all jobs complete on time, if such a schedule exists. 

Minimizing the Sum of Tardiness Penalties 

Many problems of considerable interest for the single machine model may be regarded as
special cases of the problem of minimizing total tardiness penalty,

$$G = \sum_j w_j T_j ,$$

where $T_j = \max(C_j - d_j, 0)$ and $w_j > 0$.

Several dominance properties, in the spirit of Emmons [3], can be shown to hold for the
variable resource problem. These in turn imply similar dominance properties for the various
special cases and, in some instances, optimizing (ranking) procedures. Let:

$J$ = a set of jobs

$J'$ = the complement of $J$

$A_j$ = the set of jobs known to follow job j, by virtue of precedence conditions.

$B_j$ = the set of jobs known to precede job j, by virtue of precedence conditions.

$C_J$ = the time required to process the jobs in set J, defined by $R(C_J) = \sum_{i \in J} p_i$

$B_j' = B_j \cup \{j\}$ = the set containing job j and all jobs known to precede job j by virtue of
the precedence conditions.

$A_j^* = A_j' - \{j\}$ = the set containing the complement of $A_j$, but excluding job j.

THEOREM 1: If w k < w. and d k ^ max(rf,, C..) then j precedes k in an optimal 

THEOREM 2: If d k ^ C . then j precedes k in an optimal sequence. 

A i 

THEOREM 3: If $p_j \le p_k$, $w_j \ge w_k$ and $d_j \le \max(d_k, C_{B_k'})$, then j precedes k in an
optimal sequence.

COROLLARY (Theorem 3): If $w_j \ge w_k$, $p_j \le p_k$ and $d_j \le d_k$, then j precedes k in an
optimal sequence. The corollary immediately yields an optimal ranking procedure for problems
derived by making constant any two of the three parameters. For example, when $G = \sum_j w_j T_j$
with $w_j = w$ and $d_j = d$, an optimal sequence is determined by ordering the jobs by processing
requirement, smallest first ($p_1 \le p_2 \le \dots \le p_n$). When d = 0 we have $T_j = C_j$, i.e., the mean
flowtime problem, for which this sequence is called shortest processing time (SPT).

The problem of minimizing the total tardiness penalty when $p_j = p$ is also not difficult to
solve. Constant resource requirements imply a fixed sequence of completion times under any
sequence. In particular the first job completes at $t(p)$, the second job at $t(2p)$, etc.; and an
optimal schedule may be found by assigning jobs to positions, as in Lawler [10]:

$x_{ij} = 1$ if job $i$ appears in sequence position $j$

$x_{ij} = 0$ otherwise

$c_{ij}$ = the penalty for job $i$ when it appears in sequence position $j$, i.e. $\max\{0, t(jp) - d_i\}$.

The problem is to minimize $\sum_i \sum_j c_{ij} x_{ij}$

subject to

$\sum_i x_{ij} = 1$ for each position $j$,

$\sum_j x_{ij} = 1$ for each job $i$.

An assignment algorithm can produce the optimal solution.
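
As a small illustration of the assignment formulation, the sketch below enumerates all assignments of jobs to positions for a tiny hypothetical instance. Brute-force enumeration is used here only for brevity and is not the assignment algorithm referred to above; the helper `make_t`, the resource profile, and the due-dates are assumptions.

```python
from itertools import permutations


def make_t(profile):
    """t(q) for a piecewise-constant profile given as (start_time, rate) pairs."""
    def t(q):
        accumulated = 0.0
        for idx, (start, rate) in enumerate(profile):
            end = profile[idx + 1][0] if idx + 1 < len(profile) else float("inf")
            capacity = rate * (end - start)
            if accumulated + capacity >= q:
                return start + (q - accumulated) / rate
            accumulated += capacity
        raise ValueError("profile cannot supply the requested amount")
    return t


def best_assignment(due_dates, p, t):
    """Minimize total tardiness when every job requires p resource units."""
    n = len(due_dates)
    # Position j (1-based) always completes at t(j * p), whatever the sequence.
    finish = [t((pos + 1) * p) for pos in range(n)]
    best_cost, best_seq = None, None
    for seq in permutations(range(n)):          # seq[pos] = job in that position
        cost = sum(max(0.0, finish[pos] - due_dates[job])
                   for pos, job in enumerate(seq))
        if best_cost is None or cost < best_cost:
            best_cost, best_seq = cost, seq
    return best_cost, best_seq


# Hypothetical data: r(t) = 1 for 0 <= t < 4, then 4; three jobs with p = 2.
t = make_t([(0, 1), (4, 4)])
best_cost, best_seq = best_assignment([1, 2, 4], 2, t)
```

For larger instances a polynomial assignment method would replace the enumeration; the cost matrix $c_{ij}$ is unchanged.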

The most general version of the single-machine problem, with unequal due-dates, processing
times, and weights, is binary NP-complete. The computational complexity of the cases
in which $w_j = w$ or $d_j = d > 0$ is an open question. However, pseudo-polynomial algorithms
have been developed by Lawler [11] and Lawler and Moore [12]. The algorithms which have 
demonstrated the most effective computational power for the problems are those found in [14]. 
These and other enumerative algorithms can be modified in a straightforward manner to accom- 
modate the variable resource problem. 

Minimizing The Weighted Number of Tardy Jobs 

In this case we are interested in whether a job is tardy rather than the length of time
by which it is tardy. Let $\delta(T_j) = 1$ indicate that job $j$ is tardy and $\delta(T_j) = 0$ indicate that it is
completed on time. If each job has its own penalty for being tardy, i.e.,

$$G = \sum_j w_j \delta(T_j),$$

then the single-machine problem is binary NP-complete, although it can be solved by a
pseudo-polynomial dynamic programming algorithm due to Lawler and Moore [12]. The algorithm
can easily be adapted to the variable-resource problem with no impact on computational
complexity.

By restricting the data we obtain special cases that are solvable by ranking algorithms, just 
as in the single-machine case: 

THEOREM 4: When $d_j = d$ for all jobs, if the processing times and weights are agreeable
($p_i \le p_j$ whenever $w_i \ge w_j$) then an optimal sequence is obtained by scheduling the jobs in
order of processing requirement, shortest first (in order of weight, largest first).

COROLLARY (Theorem 4): When $d_j = d$ and $p_j = p$, an optimal sequence is obtained by
scheduling jobs in order of weight, largest first.



When $w_j = w$ for all jobs a sequence that minimizes the number of tardy jobs, i.e.,
$G = \sum_j \delta(T_j)$, can be determined by generalizing an efficient algorithm due to Moore [13].
Since maximum tardiness is minimized by sequencing the jobs in EDD order, it follows that if
sequence $S$ yields minimum $G$, then so will sequence $S'$, in which the on-time jobs in $S$ are
scheduled in EDD order followed by all the tardy jobs in $S$. Letting $S_n$ represent the largest
possible set of on-time jobs (so that $G = n - |S_n|$ is the minimum number of tardy jobs), $S_n$
can be determined as follows:

1. Order and index the jobs in $N$ such that $d_1 \le d_2 \le \cdots \le d_n$ (where ties are broken
arbitrarily). Set $S_0 = \emptyset$ and $k = 1$.

2. If $k = n + 1$ stop. $S_n$ is an optimal set.

3. If $t\left(\sum_{j \in S_{k-1}} p_j + p_k\right) \le d_k$ set $S_k = S_{k-1} \cup \{k\}$; otherwise let
$p_r = \max\{p_j \mid j \in S_{k-1} \cup \{k\}\}$ and set $S_k = S_{k-1} \cup \{k\} - \{r\}$.

4. Set $k = k + 1$ and return to step 2.
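
The four steps above can be sketched in a few lines. The function and variable names, the helper `make_t`, the resource profile, and the job data are all hypothetical; `t(q)` is assumed to give the time by which `q` resource units have accumulated.

```python
def make_t(profile):
    """t(q) for a piecewise-constant profile given as (start_time, rate) pairs."""
    def t(q):
        accumulated = 0.0
        for idx, (start, rate) in enumerate(profile):
            end = profile[idx + 1][0] if idx + 1 < len(profile) else float("inf")
            capacity = rate * (end - start)
            if accumulated + capacity >= q:
                return start + (q - accumulated) / rate
            accumulated += capacity
        raise ValueError("profile cannot supply the requested amount")
    return t


def largest_on_time_set(jobs, t):
    """jobs is a list of (requirement, due_date); returns on-time jobs in EDD order."""
    order = sorted(range(len(jobs)), key=lambda j: jobs[j][1])   # step 1
    on_time = []            # plays the role of S_{k-1}
    load = 0.0              # total requirement of the jobs in on_time
    for k in order:         # steps 2 and 4: loop over k = 1, ..., n
        on_time.append(k)
        load += jobs[k][0]
        if t(load) > jobs[k][1]:                                  # step 3
            r = max(on_time, key=lambda j: jobs[j][0])            # largest requirement
            on_time.remove(r)
            load -= jobs[r][0]
    return on_time


# Hypothetical data: r(t) = 1 for 0 <= t < 4, then 4.
t = make_t([(0, 1), (4, 4)])
jobs = [(3, 3), (7, 5), (6, 6), (8, 8)]
on_time = largest_on_time_set(jobs, t)
tardy_count = len(jobs) - len(on_time)
```

Here the second job is discarded when it is examined, and the remaining three jobs complete at times 3, 5.25 and 7.25.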

Constrained (Secondary Criterion) Problems 

Several authors have addressed the problem of sequencing n jobs on one machine so as to 
optimize one criterion while restricting the set of sequences so that all or some jobs also satisfy 
another. We include four such problems here. In particular, 

(a) Minimize total (mean) flow time given that a subset $E$ of the jobs are to be on time
(Burns and Noble [2] and Emmons [4]), i.e.,

min $G = \sum_j C_j$

s.t. $C_i \le d_i$, $i \in E$

(b) Minimize maximum tardiness given that a subset $E$ of the jobs are to be on time
(Burns and Noble [2]), i.e.,

min $G = \max_j T_j$

s.t. $C_i \le d_i$, $i \in E$

(c) Minimize mean flow time over all sequences which yield minimum maximum cost
(Emmons [5] and Heck and Roberts [8]), i.e.,

min $G = \sum_j C_j$

s.t. $g_i(C_i) \le G_m$, $i \in N$

where $g_i(C)$ is a non-decreasing function of $C$ and $G_m = \min\{\max_i g_i(C_i)\}$

(d) Minimize the number of tardy jobs given that a subset $E$ of the jobs is to be on time
(Sidney [15]), i.e.,

min $G = \sum_j \delta(T_j)$

s.t. $C_i \le d_i$, $i \in E$.


In all cases the algorithms originally developed for the single-machine problem can easily 
be adapted to the variable-resource problem. 

The first three problems can be solved by a one pass algorithm which sequences jobs one
at a time from last to first. Suppose that jobs have been assigned to positions $k + 1$ through $n$.
Let $N_k$ be the set of jobs as yet unsequenced and $L_k$ be the subset of $N_k$ that can be assigned
position $k$ without violating the constraint. A job from $L_k$, say job $j$, is then chosen according
to a certain rule and sequenced in position $k$. Then $N_{k-1} = N_k - \{j\}$, $L_{k-1}$ is generated, and a
job is sequenced in position $k - 1$, etc.

Letting $E_k = N_k \cap E$ and $p(N_k) = \sum_{j \in N_k} p_j$, then for problems (a) and (b)

$L_k = (N_k - E_k) \cup \{j \mid j \in E_k,\ d_j \ge t(p(N_k))\}$

while for problem (c)

$L_k = \{j \mid j \in N_k,\ g_j(t(p(N_k))) \le G_m\}$.

The rule for choosing the job for position $k$ in problems (a) and (c) is to choose $j$ such that

$p_j = \max_{i \in L_k} p_i$

while for problem (b), $j$ is chosen such that

$d_j = \max_{i \in L_k} d_i$.
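
The backward one-pass procedure for problem (a) might be sketched as follows. The names, the helper `make_t`, the resource profile, and the data are hypothetical; replacing the selection key by the due-date would give the rule stated for problem (b).

```python
def make_t(profile):
    """t(q) for a piecewise-constant profile given as (start_time, rate) pairs."""
    def t(q):
        accumulated = 0.0
        for idx, (start, rate) in enumerate(profile):
            end = profile[idx + 1][0] if idx + 1 < len(profile) else float("inf")
            capacity = rate * (end - start)
            if accumulated + capacity >= q:
                return start + (q - accumulated) / rate
            accumulated += capacity
        raise ValueError("profile cannot supply the requested amount")
    return t


def min_flowtime_with_on_time_subset(p, d, on_time_subset, t):
    """Sequence jobs from last position to first; returns the order first-to-last."""
    unsequenced = set(range(len(p)))            # N_k
    reversed_order = []
    while unsequenced:
        finish = t(sum(p[j] for j in unsequenced))   # completion time of position k
        # L_k: unconstrained jobs, plus constrained jobs that would still be on time.
        eligible = [j for j in sorted(unsequenced)
                    if j not in on_time_subset or d[j] >= finish]
        if not eligible:
            raise ValueError("no sequence satisfies the on-time constraints")
        j = max(eligible, key=lambda i: p[i])   # largest requirement goes last
        reversed_order.append(j)
        unsequenced.remove(j)
    return reversed_order[::-1]


# Hypothetical data: r(t) = 1 for 0 <= t < 4, then 4; job 0 must finish by time 5.
t = make_t([(0, 1), (4, 4)])
order = min_flowtime_with_on_time_subset([7, 3, 6], [5, 99, 99], {0}, t)
```

Without the constraint on job 0 the same procedure would place the largest job last; with it, job 0 is forced into the first position.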

Problem (d) may be solved by modifying the due-dates to reflect the fact that if
$d_i \le d_k$, $k \in E$, and job $i$ is to be on time in a feasible sequence then $i$ must be completed by
$t(R(d_k) - p_k)$. Then Moore's algorithm can be applied, with an adjustment to assure that jobs
in $E$ will be on time. This is essentially the procedure developed by Sidney [15].

Nonsimultaneous Arrivals 

In the preceding sections all jobs are assumed to be available for sequencing at time zero.
We now consider problems in which job $j$ is not available for processing until the beginning of
period $r_j$, where $r_j \ge 1$. If, in this situation, it is possible to interrupt the processing of a job
and resume it later without loss of progress toward completion of the job, we say that the system
operates in a "preempt-resume" mode.

For single-machine problems with criteria maximum tardiness ($G = \max T_j$) or total
(mean) completion time ($G = \sum_j C_j$) when preempt-resume applies, the static optimizing rules
EDD and SPT can be generalized in a straightforward manner to produce optimal sequences
when all jobs are not simultaneously available ([1] p. 82). The same generalizations apply when
resource availability varies with time, using the following procedure:

1. At time zero if one or more jobs are available assign the resource to process the 
available job with the smallest (most urgent) priority. Otherwise leave the 
resource idle until the first job is available. 

2. At each job arrival, compare the priority of the newly available job j with the 
priority of the job currently being processed. If the priority of job j is less, allow 
job j to preempt the job being processed; otherwise add job j to the list of avail- 
able jobs. 



3. At each job completion, examine the set of available jobs and assign the resource 
to process the one with the smallest priority. 

In order to minimize maximum tardiness, the priority of a job is taken to be its due-date, and 
to minimize mean flowtime the priority is its remaining resource requirement. 
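
A discrete-period sketch of the preempt-resume procedure is given below. It rests on several assumptions that are not in the text: resource amounts are integers supplied per period, resource left over in a period after a job completes passes to the next available job, and a job's completion time is the period in which it finishes. All names and data are hypothetical.

```python
def preempt_resume(jobs, capacity, rule):
    """Simulate the dispatching procedure period by period.

    jobs: dict name -> (release_period, requirement, due_date)
    capacity: resource available in periods 1, 2, ... (list of integers)
    rule: "edd" (priority = due-date) or "srpt" (priority = remaining requirement)
    """
    remaining = {name: req for name, (_, req, _) in jobs.items()}
    completion = {}
    for period, available in enumerate(capacity, start=1):
        while available > 0:
            ready = [name for name in remaining if jobs[name][0] <= period]
            if not ready:
                break                       # resource idles until the next arrival
            if rule == "edd":
                name = min(ready, key=lambda x: jobs[x][2])
            else:
                name = min(ready, key=lambda x: remaining[x])
            used = min(available, remaining[name])
            remaining[name] -= used
            available -= used
            if remaining[name] == 0:        # job completes within this period
                del remaining[name]
                completion[name] = period
    return completion


# Hypothetical data: one unit per period for four periods, then four units.
capacity = [1, 1, 1, 1, 4, 4, 4]
jobs = {"a": (1, 7, 6), "b": (3, 3, 4), "c": (5, 6, 7)}
completion = preempt_resume(jobs, capacity, "edd")
```

Because priorities are re-evaluated at the start of every period, a newly arrived job with a smaller priority value displaces the job in process, as in step 2; job "b" preempts job "a" in period 3 of this instance.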


One case for which the single-machine result does not generalize in a straightforward
manner to the corresponding variable-resource problem is the case of sequencing to minimize the
sum of weighted completion times, where

$$G = \sum_j w_j C_j$$

when all jobs are available at time zero.

Sequencing jobs in nondecreasing order of the ratio $p_j/w_j$, which will always minimize $G$
in the single-machine problem, need not yield an optimal sequence when the resource availability
varies with time. The following simple example demonstrates this fact.


















$r(t) = 1, \quad 0 \le t < 4$

$r(t) = 4, \quad 4 \le t \le 7$

$M = 7$

Sequencing the jobs in nondecreasing order of $p_j/w_j$ yields the order 1-2-3, for which the
completion times are 4.75, 5.5 and 7. Therefore, $G = 62.75$. For the sequence 2-1-3 the
completion times are 3, 5.5 and 7, with $G = 61.5$. (Under the discrete time framework $G = 65$ for
1-2-3 but $G = 64$ for 2-1-3.)
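
The figures quoted for this example can be checked numerically. The job data used below ($p = 7, 3, 6$ and $w = 5, 2, 4$) are an assumption: they are one set of values that reproduces the reported completion times and G-values under the stated resource profile, and they should not be read as the original table. The helper names are likewise hypothetical.

```python
import math


def make_t(profile):
    """t(q) for a piecewise-constant profile given as (start_time, rate) pairs."""
    def t(q):
        accumulated = 0.0
        for idx, (start, rate) in enumerate(profile):
            end = profile[idx + 1][0] if idx + 1 < len(profile) else float("inf")
            capacity = rate * (end - start)
            if accumulated + capacity >= q:
                return start + (q - accumulated) / rate
            accumulated += capacity
        raise ValueError("profile cannot supply the requested amount")
    return t


def weighted_completion(sequence, p, w, t, discrete=False):
    """Return (completion times, G) for the given job sequence."""
    used, times, total = 0.0, [], 0.0
    for j in sequence:
        used += p[j]
        c = t(used)
        if discrete:
            c = math.ceil(c)        # a job completes at the end of its last period
        times.append(c)
        total += w[j] * c
    return times, total


t = make_t([(0, 1), (4, 4)])        # r(t) = 1 on [0, 4), r(t) = 4 on [4, 7]
p = {1: 7, 2: 3, 3: 6}              # assumed requirements
w = {1: 5, 2: 2, 3: 4}              # assumed weights

ratio_order = sorted(p, key=lambda j: p[j] / w[j])      # 1-2-3
times_ratio, g_ratio = weighted_completion(ratio_order, p, w, t)
times_alt, g_alt = weighted_completion([2, 1, 3], p, w, t)
_, g_ratio_discrete = weighted_completion(ratio_order, p, w, t, discrete=True)
_, g_alt_discrete = weighted_completion([2, 1, 3], p, w, t, discrete=True)
```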

While the differences in G-values may seem almost insignificant, it is possible to construct
an example in which sequencing by increasing ratios $p_j/w_j$ will yield an arbitrarily bad solution.
Consider the data for a two-job problem (jobs 1 and 2, with table entries $5 \times 10^{2m}$ and
$5 \times 10^{m}$) in which

$r(t) = 5 \times 10^{2m}, \quad 0 \le t < 1$

$r(t) = 1, \quad 1 \le t \le 1 + 10^{m}$.

Letting $S$ represent the sequence 1-2 ($p_1/w_1 < p_2/w_2$) and $S'$ the sequence 2-1, for large $m$

$$\frac{G(S)}{G(S')} \approx \frac{1}{2}\,(10^{m}).$$

For the special case in which the processing times and weights are agreeable ($p_i \le p_j$
whenever $w_i \ge w_j$) sequencing by nondecreasing ratios of $p_j/w_j$ does produce an optimal
solution (see Theorem 4). Otherwise the two examples given in this section reinforce the notion
that the single-machine result cannot be extended to even the simplest versions of the
variable-resource model. In one example the resource profile $r(t)$ is nondecreasing, while in
the other example $r(t)$ is nonincreasing. In both cases there is only one change in $r(t)$. These
situations would appear to be among the least drastic ways of relaxing the constant resource
assumption; but, as we have demonstrated, the ratio rule still fails. At this point, we can
conclude only that the minimization of $\sum_j w_j C_j$ involves more than a simple extension of the
single-machine result. Obviously, any optimal ordering rule (if one exists) would have to
involve information about the resource profile as well as information about processing
requirements and weights. We conjecture that this problem is NP-complete.


Although it is not possible to extend all single-machine results directly to the variable- 
resource case, a few observations can be made. A look at Figure 1 indicates that the graph of 
R (t) transforms processing times (on the horizontal axis) into resource consumptions (on the 
vertical axis), and vice-versa. This transformation is at least order-preserving. In particular, 
the makespan for a set A of jobs is at least as large as the makespan for set B when the jobs in 
A have a total processing requirement that equals or exceeds the requirement of the jobs in B. 
This property is fundamental to the proof of many single-machine results as they carry over to 
variable-resource models. Moreover, the results for problems in which pj = p do not rely on 
the precise nature of the transformation, but depend only on the fact that all solutions share a 
common nondecreasing sequence of completion times. 

In the single-machine case, $R(t)$ is linear, implying that the mapping of resource
consumptions into processing times is proportionality-preserving as well as order-preserving. That
is, ratios of intervals on the resource axis convert to identical ratios on the time axis. This
property is not maintained in the variable-resource model, because the transformation distorts
proportionality. In particular, we have in the single-machine problem that $p_i/p_j < w_i/w_j$ implies
$\Delta C_i/\Delta C_j < w_i/w_j$, where $\Delta C_i$ and $\Delta C_j$ denote the magnitude changes in the completion times
of adjacent jobs $i$ and $j$ which are interchanged in sequence. This implication does not hold in
the variable-resource problem, so the pairwise interchange argument may fail.

These observations lead to the conclusion that single-machine results involving minimum 
weighted sum of completion times cannot be directly extended. An open question is therefore 
how to exploit the structure of this problem in the variable-resource case in order to find 
optimal solutions. 


[1] Baker, K.R., Introduction to Sequencing and Scheduling, Wiley (1974). 

[2] Burns, R.N. and K.J. Noble, "Single Machine Sequencing with a Subset of Jobs Completed 

on Time," Working Paper, University of Waterloo, Canada (1975). 
[3] Emmons, H., "One Machine Sequencing to Minimize Certain Functions of Job Tardiness,"
Operations Research, 17, 701-715 (1969).
[4] Emmons, H., "One Machine Sequencing to Minimize Mean Flow Time with Minimum 

Number Tardy," Naval Research Logistics Quarterly, 22, 585-592 (1975). 
[5] Emmons, H., "A Note on a Scheduling Program with Dual Criteria," Naval Research 

Logistics Quarterly, 22, 615-616 (1975). 


[6] Gelders, L. and P. Kleindorfer, "Coordinating Aggregate and Detailed Scheduling in the 

One Machine Job Shop: Part I," Operations Research, 22, 46-60 (1974). 
[7] Gelders, L. and P. Kleindorfer, "Coordinating Aggregate and Detailed Scheduling in the 

One-Machine Job Shop: Part II," Operations Research, 23, 312-324 (1975). 
[8] Heck, H. and S. Roberts, "A Note on the Extension of a Result on Scheduling with a 

Secondary Criteria," Naval Research Logistics Quarterly, 19, 403-405 (1972). 
[9] Lawler, E.L., "Optimal Sequencing of a Single Machine Subject to Precedence Constraints," 
Management Science, 19, 544-546 (1973). 

[10] Lawler, E.L., "On Scheduling Problems with Deferral Costs," Management Science, 11, 
280-288 (1964). 

[11] Lawler, E.L., "A Pseudopolynomial Algorithm for Sequencing Jobs to Minimize Total
Tardiness," Annals of Discrete Mathematics, 1, 331-342 (1977).

[12] Lawler, E.L. and J.M. Moore, "A Functional Equation and Its Application to Resource 
Allocation and Sequencing Problems," Management Science, 16, 77-84 (1969). 

[13] Moore, J.M. "An n Job, One Machine Sequencing Algorithm for Minimizing the Number 
of Late Jobs," Management Science, 15, 102-109 (1968). 

[14] Schrage, L.E. and K.R. Baker, "Dynamic Programming Solution of Sequencing Problems 
with Precedence Constraints," Operations Research, 26, 444-449 (1978). 

[15] Sidney, J.B., "An Extension of Moore's Due Date Algorithm," Symposium on the Theory 
of Scheduling and Its Application, (S.E. Elmaghraby, editor) Lecture Notes on 
Economics and Mathematical Systems 86, Springer-Verlag, Berlin, 393-398 (1973). 


Charles R. Johnson 

Department of Economics and Institute for 

Physical Science and Technology 

University of Maryland 

College Park, Maryland 

Edward P. Loane 

EPL Analysis 
Olney, Maryland 


A model, for assessing the effectiveness of alternative force structures in an 
uncertain future conflict, is presented and exemplified. The methodology is ap- 
propriate to forces (e.g., the attack submarine force) where alternative unit 
types may be employed, albeit at differing effectiveness, in the same set of mis- 
sions. Procurement trade-offs, and in particular the desirability of special pur- 
pose units in place of some (presumably more expensive) general purpose 
units, can be addressed by this model. Example calculations indicate an in- 
crease in the effectiveness of a force composed of general purpose units, rela- 
tive to various mixed forces, with increase in the uncertainty regarding future 


In planning the procurement of major weapons systems (submarines, aircraft, ships, etc.), 
an argument, based upon relative cost-effectiveness in certain uses, may be made for the 
development and purchase of some items which are less versatile and effective than the "best" 
available components of an overall force. Assuming all relative costs and effectivenesses 
known, such an argument is sound at least to the extent that the uses necessitated by a poten- 
tial conflict are anticipated. However, under uncertainty about the nature of potential conflicts, 
a question, in general more subtle, is raised regarding the optimal composition of forces. In 
this case, a model is developed here to analyze the utility of "mixed" force structures, and 
examples are given to support the intuitive notion that the less specific are the presumptions 
about needs in a future conflict, the more valuable are the most versatile forces. 

Our focus here is upon presenting a model able to capture the value, under uncertainty, 
of versatile forces and not upon the equally important problem of determination of cost and 
effectiveness parameters. The latter, as well as the mixture versus force level interaction, are 
touched upon tangentially in an example. The parameter estimation problem, in general, 
requires both large scale theoretical and empirical effort and has been addressed, in the subma- 
rine case, in Reference [1]. 



By general purpose forces we shall mean the most versatile, advanced or effective com- 
ponents which technology would currently allow in building a military force structure. Special 
purpose forces, on the other hand, might be competitive in effectiveness with general purpose, 
but only in some of the uses (which we shall call missions) which possible conflicts might 
require. Naturally, we presume that the general purpose are more expensive than the special 
purpose forces per item, and further that the special purpose forces are cost effective, in some 
missions. It is assumed also that all costs are accounted for, e.g., development, production, 
maintenance, operation, repair and logistical mobility, etc. 

Examples of general versus special purpose forces include the following. In the case of 
submarine forces, the general purpose would be the newest fully equipped nuclear submarine 
while a special purpose alternative would be the conventional diesel submarine found in many 
European navies. The former is presumed at least as effective in all missions (much more so in 
some) while the latter is much less expensive and nearly as effective in some missions requiring 
only low mobility. In the case of aircraft, a long-range fighter-bomber might be considered gen- 
eral purpose and a plane designed primarily for ground attack would be special purpose. 

The force planner must procure some mixture of forces, constrained, presumably, by a 
fixed budget. In general there may be several force types, ranging from the very general to the 
very special purpose, and we may think of the force structure as being a vector of inventories 
of each type purchased. We think of a conflict as simply a collection of mission opportunities, 
and the planner's problem is then to procure that force structure which permits the most 
effective deployment for a conflict. For a specified conflict, this poses a deterministic optimiza- 
tion problem which, if the conflict includes enough important mission opportunities in which 
the special purpose forces are cost effective, will surely suggest a mixed force structure includ- 
ing at least some special purpose units. 

However, procurement of weapons systems must generally be decided upon long in 
advance of potential conflicts. For a variety of additional reasons, there will likely be consider- 
able uncertainty as to the precise nature of an actual conflict. We consider this uncertainty to 
be characterized by a (known) distribution of potential conflicts, i.e., a distribution of mission 
opportunities. We note that there are other ways in which uncertainty might be treated. For 
example, if one's own force structure is known, a hostile adversary might be expected, to the 
extent that circumstances allow, to bias a conflict in a direction which would render one's own 
force least effective. This suggests a game theoretic approach. Although it is not pursued 
further here and although its information requirements might be great, this would naturally fit 
into the model context we outline below. It seems likely that such a treatment would value the 
versatility of general purpose forces more so than the one we pursue. Another alternative 
would be to treat the effectiveness of each unit type as unknown and characterize it by a proba- 
bility distribution. 

The planner's problem which we address is then to choose that affordable mixture of 
forces which, assuming optimal deployment in any conflict, yields the largest expected 
effectiveness in the uncertain conflict. It should be noted that, as stated, there is an implicit 
assumption that the planner is willing to take the risk that the solution mixture will produce 
unusually low effectiveness in some conflicts. (This is in contrast with the game theoretic 
approach mentioned above.) However, to the extent that the planner is risk-averse rather
than risk-neutral, other criteria may be substituted for "expected effectiveness" without concep- 
tual difficulty and probably without operational difficulty in the development below. It should 
also be mentioned that a measure of the value of the versatility of general purpose forces under 
uncertainty lies in comparing the solution mixture of the above problem to the optimal mixture 
when the expected conflict is assumed known (i.e., the case of certainty). In general the 
"expected effectiveness" solution will differ from the "expected conflict" solution. 



We imagine $n$ force types $T_j$, $j = 1, \ldots, n$ and $m$ different mission categories $U_i$,
$i = 1, \ldots, m$, in which a component of the force might be engaged. Each $T_j$ is more or less
effective in a given $U_i$ which, to the extent that total effectiveness is linear in the deployment
of force types to mission categories, suggests the definition of an m-by-n unit effectiveness matrix

$E = (e_{ij})$,

in which $e_{ij}$ indicates the effectiveness of a unit of $T_j$ employed in $U_i$ for a unit of conflict
(presuming opportunities available). We denote by a 1-by-n vector $s$ a particular force
composition in which $s_j$ is the number of $T_j$ available. At the time of a conflict, $s$ is fixed and,
therefore, provides a constraint on the total effectiveness attainable. A particular conflict is
characterized by the total opportunity for effectiveness which may be obtained from each mission category.
These bounds are summarized in an m-by-1 vector $b$ in which $b_i$ is the maximum opportunity
in $U_i$. This bound is expressed in effectiveness units rather than force units because the
"opportunities" are opportunities to damage the opponent and the force types vary in their
ability to do so in a given mission.

The m-by-n matrix $A = (a_{ij})$ summarizes the allocation (or deployment) of $T_j$ to $U_i$, i.e.,
$a_{ij}$ is the amount of force type $T_j$ allocated to mission category $U_i$ during a conflict. The $a_{ij}$ are
necessarily nonnegative but we do not assume them integral because of the possibility of
switching units among missions.

The problem of waging a given conflict is then to deploy the given force so as to maximize
total effectiveness within the constraint of the opportunities the conflict presents. In general
(no linearity assumption), total effectiveness is some function

$e = e(A)$

of the allocation, and, furthermore,

$e(A) = e_1(A) + \cdots + e_m(A)$,

where $e_i(A)$ is the effectiveness $A$ yields through the $i$th mission category. This means that
waging the known conflict $b$ amounts to the optimization problem:

maximize $e(A)$

subject to $\sum_{i=1}^{m} a_{ij} \le s_j$, $j = 1, \ldots, n$

$e_i(A) \le b_i$, $i = 1, \ldots, m$

$a_{ij} \ge 0$.

In case total effectiveness is linear in $A$, we have the linear programming problem:

maximize $\sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} e_{ij}$

subject to $\sum_{i=1}^{m} a_{ij} \le s_j$, $j = 1, \ldots, n$

$\sum_{j=1}^{n} a_{ij} e_{ij} \le b_i$, $i = 1, \ldots, m$

$a_{ij} \ge 0$.




In either case we denote the maximum achieved by $M(s,b)$. Then, equicost force compositions
$s$ may be compared, for a given conflict, by comparing the $M(s,b)$. A good general reference
for relevant concepts in the linear case is Reference [2].

Uncertainty as to the nature of the conflict is characterized by a probability distribution for
$b$. For a given $s$, there is an $M(s,b)$ for each possible value of $b$. These may then be averaged
according to the distribution of $b$ to obtain the expected value:

$M(s) = E_b(M(s,b))$.

Comparisons among force compositions may then be made by comparing the $M(s)$, and the
planner's problem is to

maximize $M(s)$

subject to his budget constraint governing the possible forces $s$ which may be purchased. In
general,

$\max_s E_b(M(s,b)) \ne \max_s M(s, E_b(b))$,

and in the case that effectiveness is linear in $A$,

$\max_s E_b(M(s,b)) \le \max_s M(s, E_b(b))$.

Thus, the maximum expected effectiveness problem has a different solution from the problem
of maximum effectiveness in an expected conflict, so that uncertainty makes a difference in
planning. We present examples which illustrate this, and in which the latter favors special
purpose forces while the former favors general purpose, presumably because of their greater ability
to defend against variation (uncertainty). The suggestion is that the more uncertainty there is,
the greater the value of general purpose forces.
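
The inequality for the linear case can be illustrated with the smallest possible instance. Assuming a single force type and a single mission category (an assumption made only for this sketch), the linear program reduces to $M(s,b) = \min(e\,s, b)$; the two equally likely conflicts and the force level below are hypothetical numbers.

```python
def max_effectiveness(s, b, e=1.0):
    """M(s, b) for one force type and one mission: maximize a*e with a <= s, a*e <= b."""
    return min(e * s, b)


# Hypothetical distribution of the opportunity b: (probability, value) pairs.
conflicts = [(0.5, 4.0), (0.5, 12.0)]
force = 8.0

# Expected effectiveness E_b(M(s, b)).
expected_effectiveness = sum(prob * max_effectiveness(force, b)
                             for prob, b in conflicts)

# Effectiveness in the expected conflict, M(s, E_b(b)).
expected_conflict = sum(prob * b for prob, b in conflicts)
effectiveness_in_expected_conflict = max_effectiveness(force, expected_conflict)
```

Planning against the expected conflict alone overstates what the force can achieve here (8 against 6), because the opportunity that is lost when $b$ is small is not recovered when $b$ is large.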


We conclude by giving two examples. The first is primarily to illustrate the evaluation 
model and some of the remarks made. The second includes a more thorough examination of 
the model and its assumptions in a detailed example intended to be suggestive of a realistic case 
which motivated this study. 

EXAMPLE 1: Here we imagine three force types. Type $T_1$ is the general purpose, and $T_2$
and $T_3$ are different special purpose forces. There are also three mission categories. Type $T_2$ is
cost effective relative to $T_1$ in mission $U_1$, while $T_3$ is cost effective relative to $T_1$ in $U_3$. Total
effectiveness is assumed linear in allocations and the unit effectiveness matrix is

E = 







. 1 



We consider seven equicost force compositions

$s^1 = (9, 0, 0)$
$s^2 = (6, 3, 3)$
$s^3 = (6, 6, 0)$
$s^4 = (6, 0, 6)$
$s^5 = (5, 4, 4)$
$s^6 = (5, 8, 0)$
$s^7 = (5, 0, 8)$.



Thus, the two special purpose forces cost half as much as the general purpose over the range of 
procurement considered. (Actually, the outcome will not differ qualitatively if more alterna- 
tives based upon the 2-for-l trade-off are considered.) 

There are six possible conflicts $b^1$, $b^2$, $b^3$, $b^4$, $b^5$ and $b^6$,
with the first three presumed to have probability 2/9 each and the last three probability 1/9
each. Thus, the expected conflict is

b = 

Straightforward calculations then yield

$M(s^1) = 9$

$M(s^2) = M(s^3) = M(s^4) = 8.6$, and

$M(s^5) = M(s^6) = M(s^7) = 8.47$

so that

$\max_i M(s^i) = 9$

is achieved at $s^1$, the all general purpose force. On the other hand,

$M(s^1, \bar{b}) = 9$

$M(s^2, \bar{b}) = 10.2$, $M(s^3, \bar{b}) = M(s^4, \bar{b}) = 10.1$,

$M(s^5, \bar{b}) = 10.6$, and $M(s^6, \bar{b}) = M(s^7, \bar{b}) = 9.3$.

Thus, a mixed force $s^5$ is optimal for the expected conflict. The conclusion, in this case, is that
general purpose forces are overall more cost efficient under uncertainty. It should be noted that
in calculating the $M(s^i)$, each other force had higher effectiveness than $s^1$ for some conflicts
(but not overall) and all were better than $s^1$ in the expected conflict. Thus, it is only the value
of versatility under uncertainty which makes $s^1$ preferred.

EXAMPLE 2: This example is taken from the problem of submarine procurement and 
again illustrates the effect of uncertainty on the attractiveness of special purpose forces. 

For simplicity, we consider only two types of forces, general purpose and special purpose 
units. In this setting, the distinction between new procurement general purpose or special pur- 
pose forces might well be that between nuclear or diesel-electric propulsion. Equipment and 
weapons could be identical, but the lower underwater mobility inherent in diesel-electric pro- 
pulsion would limit effective employment of such forces to particular ASW missions. In the 
actual planning process, the existing force structure must also be considered since in a future 
conflict, presently existing units might be restricted to low vulnerability missions (presumably 
being less capable than new procurement general purpose units) and thus constitute additional 
categories of special purpose forces. 

The present example considers four missions and measures unit effectiveness in each
mission by a kill rate defined by:

$e_{ij}$ = [Kills (of enemy submarines) per unit time by one on-station U.S. submarine
of type $T_j$ engaged in mission $U_i$] / [Number of surviving enemy submarines]

The above quantity is well defined for important submarine missions, being independent of
enemy force size and the number of U.S. submarines committed to $U_i$ over a substantial range
of values. For instance, considering a fixed barrier mission, the rate of enemy transits through
the barrier and thus the rate of opportunities for kill would be proportional to the number of
surviving units. Also, U.S. submarine probabilities of detection and kill given an opportunity
(here target passage through the barrier area assigned to the submarine) are, at least initially,
inversely proportional to the width of the barrier area assigned. In this circumstance, $e_{ij}$ is well
defined. Of course, nonlinear effects are present and become significant as the number of U.S.
units is increased. One could argue that, as returns diminish, no additional submarines should
be assigned to the fixed barrier; this then determines the mission opportunities, $b_i$. With units
of differing capabilities, $b_i$ is properly stated in terms of effectiveness obtained, not in some
fixed maximum number of units employed, since the onset of diminishing returns would occur
at different force levels for different unit effectivenesses. Finally, variations in $b_i$ (for the fixed
barrier mission) might arise from uncertainties in enemy basing, at-sea replenishment of
submarines, desirable barrier locations being untenable due to enemy ASW, etc.

Similar arguments apply for the direct support mission (submarines employed in the 
defense of surface formations) and similar conclusions are obtained in the area search mission. 

It should be noted that kill rates add, and that the summation

$\sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} e_{ij}$,

being an overall rate at which enemy submarines are being killed, is a sensible measure of
effectiveness for the entire U.S. submarine force. It is even plausible that the differing
submarine types would be assigned to missions so as to (approximately) maximize this sum. Finally,
to the extent that variations in $b_i$ reflect week-to-week changes within a single conflict (i.e., one
week large numbers of forces are required for direct support, the next week these same units
are used in a barrier) rather than uncertainty as to some long-term mix of missions that will be
required in an unspecified conflict, then the expected value

$E_b(M(s,b))$

can be interpreted as a time-average of force kill rate and again this is a preeminently sensible 

It is the authors' belief that the use of kill rates as measures of unit effectiveness and the
linear formulation of force effectiveness, while necessarily involving some approximation, do
capture the important aspects of evaluating alternative submarine force structures. Of course,
in realistic applications, the evaluation of effectiveness for alternative forces is a substantial 
effort. Reference [1] documents a major study effort which arrives at such estimates, although 
not expressed as kill rates. Evaluation of force effectiveness is not addressed here. Quantita- 
tive inputs to this second example, shown in the following tabulation, are completely hypotheti- 
cal; and, while of reasonable relative magnitudes, are chosen to illustrate the theses of this 




                 Unit Effectiveness, $e_{ij}$      Expected Total
                 (Kill rates)                      Opportunity for

    Mission 1
    Mission 2
    Mission 3
    Mission 4

    Alternative Force Compositions, $(s_1, s_2)$
    (Numbers of Units on-station)


Unit effectivenesses and force compositions are stated in terms of on-station submarines; actual
numbers of operational units would be higher than, and not necessarily in proportion to, the
numbers shown. The alternative forces shown might well be equal cost options if there were
some fixed cost associated with deploying any special purpose submarines. The fourth mission
is not limited in the number of forces which can be employed or the total effectiveness which
can be obtained. This might be thought of as undirected open-ocean search, which could
always be undertaken by any submarine not otherwise assigned.

The distributions of $b$, reflecting uncertainty, are represented by lists of 60 sample
vectors, each considered equally likely. The lists are not repeated here. Sample vectors were
generated by Monte Carlo methods, assuming each $b_j$ is an independent truncated* Gaussian
random variable with the above stated mean and a relative standard deviation of 35% or 60% in
the two cases considered. Effectiveness for the alternative force compositions is shown in
Table 3, following.
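As a rough illustration of the sampling scheme just described, the following Python sketch draws 60 equally likely opportunity vectors from independent Gaussians truncated symmetrically about the mean, and averages an effectiveness function over them. The mean values, the symmetric cutoff at twice the mean, the stand-in effectiveness function, and all function names are assumptions made for illustration; the paper's own sample lists and its function $M(s,b)$ are not reproduced here.

```python
import random


def sample_truncated_gaussian(mean, rel_sd, rng):
    """Draw from N(mean, (rel_sd * mean)^2), rejecting values outside [0, 2 * mean].

    The symmetric cutoff is an assumption: it leaves the mean unchanged
    and guarantees a non-negative result.
    """
    sd = rel_sd * mean
    while True:
        x = rng.gauss(mean, sd)
        if 0.0 <= x <= 2.0 * mean:
            return x


def sample_opportunity_vectors(means, rel_sd, n_samples=60, seed=0):
    """Return n_samples vectors b, each component drawn independently."""
    rng = random.Random(seed)
    return [
        [sample_truncated_gaussian(m, rel_sd, rng) for m in means]
        for _ in range(n_samples)
    ]


def expected_effectiveness(force, samples, effectiveness_fn):
    """Average effectiveness over equally likely sample vectors."""
    return sum(effectiveness_fn(force, b) for b in samples) / len(samples)


def toy_effectiveness(force, b):
    """Hypothetical stand-in for M(s, b): opportunities capped by force size."""
    return min(sum(force), sum(b))


means = [12.0, 8.0, 6.0]  # hypothetical mean opportunities, not the paper's values
samples = sample_opportunity_vectors(means, rel_sd=0.35)
value = expected_effectiveness((25, 10), samples, toy_effectiveness)
```

Replacing `toy_effectiveness` with an actual evaluation of $M(s,b)$ would give the sample-average estimate of $E_b(M(s,b))$ for a given force composition.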

The maximal effectiveness for each level of uncertainty is enclosed in dashes. Not
surprisingly, the example values show a change in preference, from a mixed force to an all
general purpose force, as variability in mission opportunities increases. What is surprising is
that the changes and differences are so small overall. This can be explained qualitatively, and
is a reflection of a real concern in procurement decisions.

*Both high and low values were discarded so as to preserve the mean value and assure that $b_j \ge 0$.




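Table 3: Force Effectiveness by Force Composition $s = (s_1, s_2)$

Force compositions evaluated: (35, 0), (25, 10), (20, 17).
Cases: no uncertainty (mean value of $b$ used); deviation of each $b_j$ = 35%; deviation of each $b_j$ = 60%.
[The effectiveness values are not legible in this copy.]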









In the present example, the attractiveness of special purpose units rests on the availability
of opportunities in mission 1; i.e., if $b_1 \ge 11.4$ then forces including some special purpose
units are preferred to an all general purpose force. But mission 1 is a substantial part (36%) of
the projected employment of submarines; if this were taken away, then the force is over-built and
any alternative composition is able to exploit the remaining attractive opportunities. That is, if
$b_1 = 0$ then all force compositions entertained give about the same effectiveness; and as noted
above, if $b_1 \ge 11.4$, compositions involving special purpose units are preferred. In this
circumstance, i.e., with the numeric inputs to this example calculation, one cannot expect to see
dramatic changes in preferences among force compositions with explicit consideration of
uncertainty.
As a final point, we note the suboptimality of separating questions of force composition 
from questions of force levels. Although this raises an issue worthy of further study, we only 
mention the issue here by extending the previous example. Using exactly the same unit 
effectiveness and mission opportunity values stated previously, but considering alternative force 
compositions which involve an additional 5 general purpose submarines, one obtains the follow- 
ing results: 


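Force Effectiveness by Force Composition $s = (s_1, s_2)$

Force compositions evaluated: (40, 0), (30, 10), (25, 17), (20, 24).
Cases: no uncertainty (mean value of $b$ used); deviation of each $b_j$ = 35%; deviation of each $b_j$ = 60%.
[The effectiveness values are not legible in this copy.]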










In this case, the uncertainty considered does not lead to a preference for an all general purpose 
force, although again the effects are very small. The tendency here is intuitively satisfying, i.e., 
special purpose units become more attractive as overall force levels are increased, relative to a 
fixed job to be done. Notice also that increased uncertainty decreases the incremental 
effectiveness of the additional five general purpose units, in every case. 



[1] Chief of Naval Operations, Future Submarine Employment Study (U), (29 December 

[2] Dantzig, G.B., Linear Programming and Extensions, Princeton University Press (1963). 




Dror Zuckerman 

The Hebrew University of Jerusalem 


Abdel Hameed and Shimi [1] in a recent paper considered a shock model
with additive damage. This note generalizes the work of Abdel Hameed and
Shimi by showing that the a priori restriction to replacement at a shock time
made in [1] is unnecessary.


A recent paper by Abdel Hameed and Shimi [1] was concerned with determining the
optimal replacement time for a breakdown model under the following assumptions: A device is
subject to a sequence of shocks occurring randomly according to a Poisson process with parameter
$\lambda$. Each shock causes a random amount of damage and these damages accumulate additively.
The successive shock magnitudes $Y_1, Y_2, \ldots$ are positive, independent, identically
distributed random variables having a known distribution function $F(\cdot)$. A breakdown can occur
only at the occurrence of a shock. Let $\delta$ denote the failure time of the device. For
$t < \delta$ let $X(t)$ be the accumulated damage over the time duration $[0,t]$. The device fails
when the accumulated damage $X(t)$ first exceeds $Z$. That is,

(1)  $\delta = \inf\{t \ge 0;\ X(t) \ge Z\}$,

where $Z$ is a random variable, independent of the accumulated damage process $X$, having a
known distribution function $G(\cdot)$ called the killing distribution. More explicitly, if $X(t) = x$
and a shock of magnitude $y$ occurs at time $t$, then the device fails with probability

$\dfrac{G(x+y) - G(x)}{1 - G(x)}$.

Upon failure the device is immediately replaced by a new identical one with a cost of $c$. When
the device is replaced before failure, a smaller replacement cost is incurred. That cost depends
on the accumulated damage at the time of replacement and is denoted by $c(x)$. That is to say,
$c(x)$ is the cost of replacement before failure when the accumulated damage equals $x$. It is
assumed that $c(0) = 0$ and $c(x)$ is bounded above by $c$. Thus there is an incentive to attempt
to replace the device before failure. The condition $c(0) = 0$ has to be interpreted as a policy of
no replacement if there is no damage.
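The failure mechanism described above can be sketched in Python as a small simulation. The exponential choices for the shock-size distribution $F$ and the killing distribution $G$, the parameter values, and the function names are hypothetical and serve only to make the sketch runnable; the note itself leaves $F$ and $G$ general.

```python
import math
import random


def failure_probability(G, x, y):
    """Probability that a shock of size y causes failure at damage level x."""
    return (G(x + y) - G(x)) / (1.0 - G(x))


def simulate_failure_time(lam, draw_shock, G, rng, max_shocks=100000):
    """Simulate one device with no preventive replacement.

    Returns (failure time, damage level reached at failure).
    """
    t, x = 0.0, 0.0
    for _ in range(max_shocks):
        t += rng.expovariate(lam)      # Poisson shocks: exponential gaps
        y = draw_shock(rng)            # shock magnitude drawn from F
        if 1.0 - G(x) <= 0.0:          # numerical guard: failure is certain
            return t, x + y
        if rng.random() < failure_probability(G, x, y):
            return t, x + y
        x += y                         # damage accumulates additively
    raise RuntimeError("no failure within max_shocks shocks")


# Hypothetical inputs: exponential shock sizes and an exponential killing distribution.
G = lambda x: 1.0 - math.exp(-x / 5.0)
draw_shock = lambda rng: rng.expovariate(1.0)
delta, damage = simulate_failure_time(2.0, draw_shock, G, random.Random(1))
```

With an exponential killing distribution the failure probability depends only on the shock size $y$, which makes this particular choice a convenient check on the formula.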

In their paper Abdel Hameed and Shimi [1] derived an optimal replacement policy that
minimizes the expected cost per unit time under the restriction that the device can be replaced
only at a shock point of time.




In the present article we consider a similar breakdown model without the above restriction 
made in [1]. We allow a controller to institute a replacement at any stopping time before failure 
time. He must replace upon device failure. Throughout, we restrict attention to replacement 
policies for which cost of replacement is solely a function of the accumulated damage. In some 
shock models, replacement at a scheduled time offers potential benefits relative to replacement 
at a random time. However, the problem of scheduled replacement in failure models with addi- 
tive damage is an open problem and it is beyond the scope of the present study. 

Let $T$ be the replacement time. At time $T$ the device is replaced by a new one having
statistical properties identical with the original, and the replacement cycles are repeated
indefinitely. The collection of all permissible replacement policies described above will be
denoted by $M$. Our objective is to prove that an optimal policy replaces the system at a shock
point of time. Thus the restriction on the class of permissible replacement policies made in
[1] can be omitted.

The following will be standard notation used throughout the paper: $E[Y;A]$, where $Y$ is a
random variable and $A$ is an event, refers to the expectation $E[I_A Y] = E[Y \mid I_A = 1]P(A)$,
where $I_A$ is the set characteristic function of $A$.


By applying a standard renewal argument, the long run average cost per unit time when a
replacement policy $T$ is employed can be expressed as follows:

$\psi_T = \dfrac{E[c(X(T));\ T < \delta] + E[c;\ T = \delta]}{E[T]}$.

Let $\psi^* = \inf_{T \in M} \psi_T$. Then

$\psi^* \le \dfrac{E[c(X(T));\ T < \delta] + E[c;\ T = \delta]}{E[T]}$

for every $T \in M$, and the optimal replacement policy that minimizes $\psi_T$ over the set $M$ is the
one that maximizes

(4)  $\theta_T = \psi^* E[T] + E[c - c(X(T));\ T < \delta]$.

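By applying Dynkin's formula (see Theorem 5.1 and its Corollary in Dynkin [2]) equation (4)
reduces to

(5)  $\theta_T = E\left[\int_0^T J(X(s))\,ds\right] + c$,

where

(6)  $J(x) = \psi^* - \lambda c \int_0^\infty \dfrac{G(x+y) - G(x)}{1 - G(x)}\,dF(y) + \lambda\left[c(x) - \int_0^\infty c(x+y)\,\dfrac{1 - G(x+y)}{1 - G(x)}\,dF(y)\right]$.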


The proof of the above result follows a procedure similar to that used by the author in (Section 
2 of [3]), and therefore is omitted. 
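To make the long run average cost concrete, the following Python sketch estimates it by simulation for one simple policy that replaces only at shock times: replace at the first shock that leaves the accumulated damage at or above a level `xi`, or at failure if that comes first. The estimate is the ratio of total cost to total elapsed time over many cycles, in line with the renewal argument. The control-limit form of the policy, the distributions, the cost function, and all names are assumptions chosen for illustration.

```python
import math
import random


def estimate_cost_rate(xi, lam, c_fail, cost_before_failure, G, draw_shock,
                       n_cycles=20000, seed=0):
    """Monte Carlo estimate of cost per unit time for a control-limit policy."""
    rng = random.Random(seed)
    total_cost = 0.0
    total_time = 0.0
    for _ in range(n_cycles):
        t, x = 0.0, 0.0
        while True:
            t += rng.expovariate(lam)          # wait for the next shock
            y = draw_shock(rng)
            survive = (1.0 - G(x + y)) / (1.0 - G(x))
            if rng.random() >= survive:        # device fails at this shock
                total_cost += c_fail
                break
            x += y
            if x >= xi:                        # preventive replacement
                total_cost += cost_before_failure(x)
                break
        total_time += t
    return total_cost / total_time


# Hypothetical inputs: c = 10, c(x) = min(10, 2x), exponential F and G.
G = lambda x: 1.0 - math.exp(-x / 5.0)
draw_shock = lambda rng: rng.expovariate(1.0)
cost = lambda x: min(10.0, 2.0 * x)
psi_hat = estimate_cost_rate(3.0, 1.0, 10.0, cost, G, draw_shock, n_cycles=2000)
```

Evaluating `estimate_cost_rate` over a grid of `xi` values gives a crude way to compare policies of this form; it does not search the full class $M$.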

In what follows we shall denote by $S$ the state space of the stochastic process
$\{X(t);\ t < \delta\}$. Define

$S_1 = \{x \in S;\ J(x) > 0\}$,
$S_2 = \{x \in S;\ J(x) < 0\}$.

Let $t_1, t_2, t_3, \ldots$ be the shock points of time and define

$W = \{t_i;\ i \ge 1\}$.

Let $L$ be the subclass of replacement policies in which a decision can be taken only over the set
$W$.
We proceed with the following result: 

THEOREM 1: For every replacement policy $T_1 \notin L$, there exists a replacement policy
$T_2 \in L$ such that $\theta_{T_2} \ge \theta_{T_1}$.

PROOF: Let $T_1$ be a replacement policy such that $T_1 \notin L$.

Let $T(S_2)$ be the hitting time of the set $S_2$. That is

(9)  $T(S_2) = \inf\{t > 0;\ X(t) \in S_2\}$.

(It is understood that when the set in braces is empty, then $T(S_2) = \infty$.)

Let

(10)  $\hat{T} = \inf\{t \ge T_1;\ t \in W\}$

and define

(11)  $T_2 = \min\{\hat{T},\ T(S_2)\}$.

Clearly $T_2 \in L$. Next we show that $\theta_{T_2} \ge \theta_{T_1}$.

Using (5) we obtain

(12)  $\theta_{T_2} - \theta_{T_1} = E\left[\int_0^{T_2} J(X(s))\,ds\right] - E\left[\int_0^{T_1} J(X(s))\,ds\right]$

      $= E\left[\int_{T_1}^{\hat{T}} J(X(s))\,ds;\ T_2 = \hat{T}\right] - E\left[\int_{T(S_2)}^{T_1} J(X(s))\,ds;\ \hat{T} > T(S_2)\right]$.

Note that

I. $\{T_2 = \hat{T}\}$ implies that $\{T(S_2) \ge \hat{T}\}$ and therefore

$E\left[\int_{T_1}^{\hat{T}} J(X(s))\,ds;\ T_2 = \hat{T}\right] \ge 0$.

II. $J(X(s))$, for $T(S_2) \le s < T_1$, is non-positive on the set $\{\hat{T} > T(S_2)\}$. Therefore

$E\left[\int_{T(S_2)}^{T_1} J(X(s))\,ds;\ \hat{T} > T(S_2)\right] \le 0$.

Therefore, (using (12)) we obtain

$\theta_{T_2} - \theta_{T_1} \ge 0$

as desired.
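The construction used in the proof can be traced on a single sample path with a short Python sketch. Given the shock times, the damage level after each shock, a function `J`, and a candidate replacement time `t1`, it computes the next shock time at or after `t1`, the hitting time of $S_2$, and the smaller of the two. The sample path, the particular `J`, and the function names are hypothetical; the sketch also assumes the initial state is not in $S_2$.

```python
import bisect


def next_shock_at_or_after(shock_times, t1):
    """First shock time that is >= t1 (infinity if there is none)."""
    i = bisect.bisect_left(shock_times, t1)
    return shock_times[i] if i < len(shock_times) else float("inf")


def hitting_time_of_S2(shock_times, damages, J):
    """First shock time at which the damage level x satisfies J(x) < 0."""
    for t, x in zip(shock_times, damages):
        if J(x) < 0:
            return t
    return float("inf")


def improved_replacement_time(shock_times, damages, J, t1):
    """Replacement time T_2 built from t1 as in the proof of Theorem 1."""
    t_hat = next_shock_at_or_after(shock_times, t1)
    return min(t_hat, hitting_time_of_S2(shock_times, damages, J))


# Hypothetical sample path: sorted shock times and damage after each shock.
shock_times = [1.0, 2.5, 4.0]
damages = [0.5, 1.2, 3.0]
J = lambda x: 2.0 - x   # hypothetical; negative once damage exceeds 2
t2 = improved_replacement_time(shock_times, damages, J, t1=1.7)
```

Because damage changes only at shocks, both candidate times fall on shock times (or are infinite), which is why the resulting time belongs to the restricted class.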

Recalling that an optimal replacement policy $T^*$ is the one that maximizes $\theta_T$ and using
Theorem 1 it follows that $T^* \in L$. Hence, the optimal policy derived in [1] is the optimal one
among all possible replacement policies for which cost of replacement is solely a function of the
accumulated damage.

Finally it should be pointed out that if the benefits of scheduled replacement were con- 
sidered, the conclusion reached, that an optimal policy replaces the device at a shock point of 
time, would no longer generally hold. 


[1] Abdel Hameed, M. and I.N. Shimi, "Optimal Replacement of Damaged Devices," Journal of
Applied Probability, 15, 153-161 (1978).
[2] Dynkin, E.G., Markov Processes I, Academic Press, New York (1965).
[3] Zuckerman, D., "Replacement Models under Additive Damage," Naval Research Logistics
Quarterly, 24, 549-558 (1977).



Les Cohen 

Government Services Division 

Kenneth Leventhal & Company 

Washington, D.C. 

Diane Erickson Reedy 

Mathtech, Inc. 
Rosslyn, Virginia 


Multiple regression analysis of first term reenlistment rates over the period 
1968-1977 confirms previous findings that reenlistment is highly sensitive to 
unemployment at the time of reenlistment and shortly after enlistment, almost 
four years earlier. Bonuses, particularly lump sum bonuses, were also shown to 
be a significant determinant of reenlistment. 

This note reports the results of cross-sectional multiple regression analysis of first term 
Navy reenlistment. Equations which were estimated represent the completion of research con- 
ducted by Cohen and Reedy [1] which analyzed the sensitivity of first term reenlistment to 
fluctuations in economic conditions at the time of reenlistment and about the time of enlist- 
ment, considering the effect of the latter on reenlistment behavior four years later. The princi- 
pal finding of that study was that unemployment rates, both at the time of reenlistment and 
about the time of enlistment four years earlier, were powerful predictors of reenlistment rates. 
By comparison, measures of private sector versus military wages entered in the same equations 
were generally found to be insignificant or, at best, relatively unimportant. That study did not, 
however, take into account the influence of reenlistment bonuses which this follow-up note 

This note describes the results of regression equations, replicating those which were the 
basis of the original Cohen-Reedy paper, which include reenlistment bonus variables to con- 
sider their influence upon Navy reenlistment over the ten year period, 1968-1977. 

Reenlistment rates were compiled from Navy Military Personnel Statistics ("The Green
Book"), quarterly by rating, separately for E-4's and E-5's. To help minimize spurious
fluctuations in the data, reenlistment rates were calculated only for those quarters which had an
average of at least 10 eligibles per month. In addition, due to definitional and mensurational
inconsistencies, ratings which include nuclear power and diver NEC's were eliminated and other
ratings which include 6 year obligors (6YO's) were analyzed separately. The resultant data base
consisted of 3110 observations for 4YO ratings, and 787 observations for 6YO ratings. Each
observation referred to a specific quarter, rating and pay grade, either E-4 or E-5.

*This research was supported by the Office of the Chief of Naval Operations, Systems Analysis
Division, under a contract with Information Spectrum, Inc., Arlington, Virginia.

Four multiple linear regression equations were estimated: one for 4YO ratings (including 
E-4's and E-5's); one for 6YO ratings (including E-4's and E-5's); one for 4YO E-4's; and one 
for 4YO E-5's. No attempt was made to estimate separate equations for each major occupational
category as was done in the previous study. Given observed variations in earlier equations,
collective treatment of ratings has probably resulted in depressed $R^2$ statistics.

The dependent variable, RATE3, is the percentage deviation of the current quarter
reenlistment rate from the mean reenlistment rate for that rating and pay rate over the 10 years
under study, 1968-1977.

$\text{RATE3} = \dfrac{\text{Quarterly Reenlistment Rate} - \text{Mean (10 Year) Reenlistment Rate}}{\text{Mean (10 Year) Reenlistment Rate}}$

This specification of the dependent variable was adopted to contend with wide variations in the 
level of reenlistment rates from rating to rating. RATE3 describes relative changes in reenlist- 
ment rates. 
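The definition of RATE3 can be sketched in a few lines of Python. The function name and the sample rates are hypothetical, and the sketch assumes the 10 year mean is the simple average of the quarterly rates for one rating and pay grade, which the note does not spell out.

```python
def rate3(quarterly_rates):
    """Relative deviation of each quarterly rate from the mean of the series."""
    mean_rate = sum(quarterly_rates) / len(quarterly_rates)
    return [(r - mean_rate) / mean_rate for r in quarterly_rates]


# Hypothetical reenlistment rates for one rating and pay grade.
deviations = rate3([0.10, 0.20, 0.30])
```

Expressing each observation relative to its own series mean is what allows ratings with very different typical reenlistment rates to be pooled in one equation.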

Independent variables included in the equations are listed and defined in Table 1. 

TABLE 1 — Independent Variables 

AUR       current national unemployment rate

ARAUR     average rate of change in unemployment (AUR) over the
          past 6 quarters preceding the reenlistment decision

AUR13     unemployment (AUR) 13 quarters prior to the reenlistment
          decision (NOTE: Virtually uncorrelated with AUR.)

RW        the ratio of military basic pay to private sector earnings

AWARD     bonus award multiple

LS        dummy variable indicating lump sum payment of bonuses
          (LS = 1 for 1968-1974; LS = 0 for 1975-1977)

ELIG      number of individuals eligible for reenlistment

PAYRATE   dummy variable indicating rate
          (PAYRATE = 1 for E-5's; PAYRATE = 0 for E-4's)

DRAFT     number of persons drafted (all services) 18 quarters prior
          to reenlistment decision

WAR       dummy variable for Viet Nam War
          (WAR = 1 for 1968-1972; WAR = 0 for 1973-1977)

QTR3      third quarter seasonal dummy
          (QTR3 = 1 for 3rd calendar quarter only)

TIME      time variable (TIME = Year - 67)
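The period-based variables in Table 1 follow directly from the calendar year and quarter of an observation, as the Python sketch below shows. The function name and the use of four-digit years (so that TIME is computed as the year minus 1967) are assumptions for illustration.

```python
def period_variables(year, quarter):
    """Build LS, WAR, QTR3 and TIME for an observation in 1968-1977.

    year is a four-digit calendar year; quarter is 1, 2, 3 or 4.
    """
    return {
        "LS": 1 if year <= 1974 else 0,    # lump sum bonus period
        "WAR": 1 if year <= 1972 else 0,   # Viet Nam War period
        "QTR3": 1 if quarter == 3 else 0,  # third quarter seasonal dummy
        "TIME": year - 1967,               # same as two-digit Year - 67
    }


example = period_variables(1974, 3)
```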



In the context of cross-sectional analysis, estimated coefficients do not pertain to the 
impact of a given variable over time for a specific rating, but represent the typical impact of that 
variable over the entire 10 years across all ratings which were included in the study. 

Results of the estimation procedures are summarized in Table 2. 

TABLE 2 — Reenlistment Equations: Coefficients, (t-statistics), and Means 


































[The entries of Table 2 (coefficients, t-statistics, and means for the four equations, together
with the $R^2$ statistics) are not legible in this copy.]










The three unemployment variables, AUR, ARAUR and AUR13, were specified precisely
as in the earlier Cohen-Reedy study. Consistent with those results, the significance of the
unemployment rate variables and the magnitude of their apparent effect upon reenlistment are
striking. Taken literally, coefficients in the 4YO equation, for example, show a one point
increase in AUR13 (+.01) indicating a 29 point (+.29) increase in RATE3. While it is realized
that these coefficients may overstate the real influence of unemployment, these equations,
like those which they are replicating, do indicate that reenlistment decisions may in fact be
sensitive to perceived costs of employment search and to the security of private sector
employment.
The first compensation variable, RW, representing the ratio of military to private sector
wages, was calculated separately for E-4's and E-5's using basic pay for E-5's and E-6's
respectively as proxies for next-term earnings. RW was not a significant variable in any of the
four equations.



The other two compensation variables, AWARD and LS, relate to bonuses. AWARD is 
the multiple for a particular rating in a given quarter, ranging from 0 to 6. This multiple is the
factor which the Navy applies against an individual's monthly pay to compute the dollar amount 
of his bonus payment. AWARD was significant in all three 4YO equations. LS is a dummy 
variable which assumes a value of 1 through calendar 1974 during the period when lump sum 
awards were paid to approximately 50% of those individuals who reenlisted. Beginning January 
1, 1975, a new policy was initiated which reduced the percentage of lump sum bonus payments 
to approximately 10% of those reenlisting. The coefficient of LS indicates that when bonuses 
were paid in lump sums, the percentage difference between actual reenlistment rates and mean 
(10 year) reenlistment rates was higher by .45 than when bonuses were paid in installments. 

The variable ELIG was included in the equations simply to capture the observed relation- 
ship between low numbers of eligibles and high reenlistment rates. 

PAYRATE is a dummy variable which distinguishes between E-4's and E-5's (PAYRATE 
= 1). TIME was included to capture the influence of factors which have changed steadily over 
time such as the quality of life improvements effected by the Navy over the past several years. 

These equations support the authors' earlier findings, notably that unemployment rates at 
the time of the reenlistment decision and shortly after enlistment are important determinants of 
reenlistment rates. Relative wages continue to appear unimportant. It appears, however, that 
reenlistment bonuses have had a significant positive effect on reenlistment, particularly when 
those bonuses have been awarded in lump sum payments. 

Although by no means conclusive, the equations summarized in Table 2 suggest the fol- 
lowing management initiatives: 

— Experimentation is warranted in the use of lump sum bonuses to mitigate the effects 
of low unemployment rates on reenlistment. 

— Opportunities to reenlist might be timed to coincide with low points (periods of high 
unemployment) in the business cycle. 

— AUR13 and predicted AUR should be used to augment current information used for 
projecting reenlistment rates. 

— Based on the continued performance of the AUR13 variable, serious consideration 
must be given to implementing new programs designed to effect enlistee career decision 
making very early during the first term of service. 


[1] Cohen, L. and D. Reedy, "The Sensitivity of Navy First Term Reenlistment Rates to 
Changes in Unemployment and Relative Wages," Naval Research Logistics Quarterly, 26, 
695-709 (1979). 



The NAVAL RESEARCH LOGISTICS QUARTERLY is devoted to the dissemination of 
scientific information in logistics and will publish research and expository papers, including those 
in certain areas of mathematics, statistics, and economics, relevant to the over-all effort to improve 
the efficiency and effectiveness of logistics operations. 

Manuscripts and other items for publication should be sent to The Managing Editor, NAVAL 
RESEARCH LOGISTICS QUARTERLY, Office of Naval Research, Arlington, Va. 22217. 
Each manuscript which is considered to be suitable material for the QUARTERLY is sent to one
or more referees. 

Manuscripts submitted for publication should be typewritten, double-spaced, and the author 
should retain a copy. Refereeing may be expedited if an extra copy of the manuscript is submitted 
with the original. 

A short abstract (not over 400 words) should accompany each manuscript. This will appear 
at the head of the published paper in the QUARTERLY. 

There is no authorization for compensation to authors for papers which have been accepted 
for publication. Authors will receive 250 reprints of their published papers. 

Readers are invited to submit to the Managing Editor items of general interest in the field 
of logistics, for possible publication in the NEWS AND MEMORANDA or NOTES sections 






On the Reliability, Availability and Bayes Confidence 
Intervals for Multicomponent Systems 

Optimal Replacement of Parts Having Observable 
Correlated Stages of Deterioration 

Statistical Analysis of a Conventional Fuze Timer 

The Asymptotic Sufficiency of Sparse Order Statistics 
in Tests of Fit with Nuisance Parameters 

On a Class of Nash-Solvable Bimatrix Games 
and Some Related Nash Subsets 

Optimality Conditions for Convex Semi-Infinite 
Programming Problems 











Solving Incremental Quantity Discounted Transportation P. G. MCKEOWN 
Problems by Vertex Ranking 

Auxiliary Procedures for Solving Long 
Transportation Problems 

On the Generation of Deep Disjunctive Cutting Planes 


The Role of Internal Storage Capacity in Fixed 
Cycle Production Systems 

Scheduling Coupled Tasks 

Sequencing Independent Jobs With a 
Single Resource 

Evaluation of Force Structures Under Uncertainty 

A Note on the Optimal Replacement Time 
of Damaged Devices 

A Note on the Sensitivity of Navy First Term 
Reenlistment to Bonuses, Unemployment 
and Relative Wages 








Arlington, Va. 22217