NAVAL RESEARCH LOGISTICS QUARTERLY

SEPTEMBER  1980 
VOL.  27,  NO.  3 


OFFICE     OF     NAVAL     RESEARCH 


NAVSO  P-1278 


NAVAL  RESEARCH  LOGISTICS  QUARTERLY 


EDITORIAL  BOARD 

Marvin Denicoff, Office of Naval Research, Chairman
Murray A. Geisler, Logistics Management Institute
W. H. Marlow, The George Washington University

Ex Officio Members

Thomas  C.  Varley,  Office  of  Naval  Research 
Program  Director 


Seymour  M.  Selig,  Office  of  Naval  Research 
Managing  Editor 


MANAGING  EDITOR 

Seymour  M.  Selig 

Office  of  Naval  Research 

Arlington,  Virginia  22217 


ASSOCIATE  EDITORS 


Frank  M.  Bass,  Purdue  University 

Jack  Borsting,  Naval  Postgraduate  School 

Leon  Cooper,  Southern  Methodist  University 

Eric  Denardo,  Yale  University 

Marco  Fiorello,  Logistics  Management  Institute 

Saul  I.  Gass,  University  of  Maryland 

Neal  D.  Glassman,  Office  of  Naval  Research 

Paul  Gray,  Southern  Methodist  University 

Carl  M.  Harris, Center  for  Management  and 

Policy  Research 
Arnoldo  Hax,  Massachusetts  Institute  of  Technology 
Alan  J.  Hoffman,  IBM  Corporation 
Uday  S.  Karmarkar,  University  of  Chicago 
Paul  R.  Kleindorfer,  University  of  Pennsylvania 
Darwin  Klingman,  University  of  Texas,  Austin 


Kenneth  O.  Kortanek,  Carnegie-Mellon  University 
Charles  Kriebel,  Carnegie-Mellon  University 
Jack  Laderman,  Bronx,  New  York 
Gerald  J.  Lieberman,  Stanford  University 
Clifford  Marshall,  Polytechnic  Institute  of  New  York 
John  A.  Muckstadt,  Cornell  University 
William  P.  Pierskalla,  University  of  Pennsylvania 
Thomas  L.  Saaty,  University  of  Pittsburgh 
Henry  Solomon,  The  George  Washington  University 
Wlodzimierz Szwarc, University of Wisconsin, Milwaukee
James  G.  Taylor,  Naval  Postgraduate  School 
Harvey  M.  Wagner,  The  University  of  North  Carolina 
John W. Wingate, Naval Surface Weapons Center, White Oak
Shelemyahu  Zacks,  Virginia  Polytechnic  Institute  and 
State  University 


The Naval Research Logistics Quarterly is devoted to the dissemination of scientific information in logistics and
will publish research and expository papers, including those in certain areas of mathematics, statistics, and economics,
relevant to the over-all effort to improve the efficiency and effectiveness of logistics operations.

Information  for  Contributors  is  indicated  on  inside  back  cover. 

The Naval Research Logistics Quarterly is published by the Office of Naval Research in the months of March, June,
September, and December and can be purchased from the Superintendent of Documents, U.S. Government Printing
Office, Washington, D.C. 20402. Subscription Price: $11.15 a year in the U.S. and Canada, $13.95 elsewhere. Cost of
individual issues may be obtained from the Superintendent of Documents.

The views and opinions expressed in this Journal are those of the authors and not necessarily those of the Office
of Naval Research.

Issuance of this periodical approved in accordance with Department of the Navy Publications and Printing Regulations,
P-35 (Revised 1-74).


ON  THE  RELIABILITY,  AVAILABILITY  AND  BAYES  CONFIDENCE 
INTERVALS  FOR  MULTICOMPONENT  SYSTEMS 

William  E.  Thompson 

Columbia  Research  Corporation 
Arlington,  Virginia 

Robert  D.  Haynes 

ARINC  Research  Corporation 
Annapolis,  Maryland 

ABSTRACT 

The  problem  of  computing  reliability  and  availability  and  their  associated 
confidence  limits  for  multi-component  systems  has  appeared  often  in  the  litera- 
ture. This  problem  arises  where  some  or  all  of  the  component  reliabilities  and 
availabilities  are  statistical  estimates  (random  variables)  from  test  and  other 
data.  The  problem  of  computing  confidence  limits  has  generally  been  con- 
sidered difficult  and  treated  only  on  a  case-by-case  basis.  This  paper  deals  with 
Bayes  confidence  limits  on  reliability  and  availability  for  a  more  general  class  of 
systems  than  previously  considered  including,  as  special  cases,  series-parallel 
and  standby  systems  applications.  The  posterior  distributions  obtained  are  ex- 
act in  theory  and  their  numerical  evaluation  is  limited  only  by  computing 
resources,  data  representation  and  round-off  in  calculations.  This  paper  collects 
and  generalizes  previous  results  of  the  authors  and  others. 

The  methods  presented  in  this  paper  apply  both  to  reliability  and  availability 
analysis.  The  conceptual  development  requires  only  that  system  reliability  or 
availability  be  probabilities  defined  in  terms  acceptable  for  a  particular  applica- 
tion. The  emphasis  is  on  Bayes  Analysis  and  the  determination  of  the  posterior 
distribution  functions.  Having  these,  the  calculation  of  point  estimates  and 
confidence  limits  is  routine. 

This  paper  includes  several  examples  of  estimating  system  reliability  and 
confidence  limits  based  on  observed  component  test  data.  Also  included  is  an 
example  of  the  numerical  procedure  for  computing  Bayes  confidence  limits  for 
the reliability of a system consisting of N failure independent components con-
nected in  series.  Both  an  exact  and  a  new  approximate  numerical  procedure  for 
computing  point  and  interval  estimates  of  reliability  are  presented.  A  compari- 
son is  made  of  the  results  obtained  from  the  two  procedures.  It  is  shown  that 
the  approximation  is  entirely  sufficient  for  most  reliability  engineering  analysis. 


INTRODUCTION 

The  problem  of  computing  reliability,  availability,  and  confidence  limits  for  multicom- 
ponent  systems  where  some  or  all  of  the  component  reliabilities  and  availabilities  are  statistical 
estimates  from  test  and  other  data  has  appeared  often  in  the  literature.  The  problem  of  com- 
puting these  confidence  limits  has  generally  been  considered  difficult  and  treated  only  on  a  case 
by  case  basis.  The  present  paper  deals  with  Bayes  confidence  limits  on  reliability  and  steady 
state  availability  for  a  general  class  of  fixed  mission  time,  two-state  systems  including,  as  special 
cases,  series-parallel,  stand-by  and  others  that  appear  in  the  applications.  Further,  a  fixed  mis- 
sion length  is  assumed.  It  is  also  assumed  that  neither  reliability  growth  nor  deterioration  occur 
during  the  life  of  the  system  and  the  system  becomes  as  good  as  new  after  each  repair.  Finally, 
we assume that no environmental changes which could affect reliability occur. The posterior




distributions  obtained  are  exact  in  theory  and  their  numerical  evaluation  is  limited  only  by  com- 
puting resources,  data  representation  and  round-off  in  calculation.  The  present  paper  collects 
and  generalizes  previous  results  of  the  authors  and  others. 

The methods obtained in the following apply both to reliability and steady state availability
analysis. To avoid repeated reference to "reliability or availability," the discussion references
only reliability, with the understanding that the terms system reliability R and component reliability
r_i can be replaced by system availability A and component availability a_i. The conceptual
development requires only that R and A be probabilities defined in terms acceptable for a particular
application. The emphasis is on the determination of the posterior distribution functions.
Having these, the calculation of point estimates and confidence intervals is routine.

BAYES  CONFIDENCE  INTERVALS 

In the Bayes inference model, the unknown probability, R, 0 < R ≤ 1, is considered a
random variable whose posterior density is the result of combining prior information with test
data to obtain a probability density function f(R) for R. If the posterior density of R is seen to
be  spread  out,  then  relatively  more  uncertainty  in  the  value  of  R  obtains  than  when  the  poste- 
rior density  is  concentrated  closely  about  some  particular  value.  The  posterior  density  function 
provides  the  most  complete  form  of  information  about  R,  but  sometimes  summary  information 
is  desired.  A  point  estimate  is  one  such  form  of  summary  information  and  this  can  be  selected 
in  various  ways  and  is  analogous  to  the  familiar  statistical  problem  of  characterizing  an  entire 
population  by  some  parameter  value.  Examples  are  mean,  mode,  median,  etc.  A  point  esti- 
mate has  the  disadvantage  of  ignoring  the  information  concerning  the  uncertainty  in  the  un- 
known reliability. Confidence intervals derived from f(R) provide such additional information.
The true but unknown (and unknowable except with infinite data) reliability R_0 is some specific
value of the random variable R, 0 < R < 1. Conceptually, R_0 can be considered a random
sample from 0 < R < 1 made when the system was built. We can never know what R_0 is, but
f(R) gives a measure of the likelihood that R_0 = R for each 0 < R < 1. If

F(R) = ∫_0^R f(R) dR

denotes the distribution function of R, then

Prob{R_1 ≤ R_0 ≤ R_2} = F(R_2) − F(R_1)

and [R_1, R_2] is an interval estimate of R of confidence c = F(R_2) − F(R_1). The interpretation
is simply that, based on the prior and current data, the probability is c that the unknown
system reliability lies between R_1 and R_2. The interval [R_1, R_2] has been called [25] a Bayes c
level confidence interval. For R_2 = 1, R_1 is called the lower c level confidence limit. For
R_1 = 0, R_2 is called the upper c level confidence limit. Given f(R) and F(R), Bayes
confidence limits for any c can be obtained by graphical or numerical methods and the procedure
is generally not difficult. Numerical examples and discussion of numerical methods are
given  in  [25,27,8,26,28,29]. 
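As a minimal illustration of this routine step (a sketch, not part of the paper's procedure), the following Python fragment computes one-sided and central Bayes limits from a posterior density tabulated on a grid; the 90% level, the grid size, and the Beta(19, 3) test density (which reappears as component A_1 in the numerical example later in the paper) are illustrative choices.

```python
# Sketch: Bayes confidence limits from a tabulated posterior density f(R).
import numpy as np

def bayes_limits(R, f, c=0.90):
    """Return (lower c-level limit, upper c-level limit, central c-level interval)."""
    # distribution function F(R) by the trapezoid rule, renormalized against round-off
    F = np.concatenate(([0.0], np.cumsum(0.5 * (f[1:] + f[:-1]) * np.diff(R))))
    F /= F[-1]
    lower = np.interp(1.0 - c, F, R)      # R1 with F(1) - F(R1) = c
    upper = np.interp(c, F, R)            # R2 with F(R2) - F(0) = c
    central = (np.interp((1.0 - c) / 2.0, F, R), np.interp((1.0 + c) / 2.0, F, R))
    return lower, upper, central

R = np.linspace(0.0, 1.0, 2001)
f = 3990.0 * R**18 * (1.0 - R)**2         # Beta(19, 3) posterior used as a test case
print(bayes_limits(R, f, c=0.90))
```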

DEFINITION  OF  STRUCTURE  FUNCTION 

To  establish  the  relationship  between  the  reliabilities  of  the  components  of  a  system  and 
the  reliability  of  the  entire  system,  the  way  in  which  performance  and  failure  of  the  com- 
ponents affects  performance  and  failure  of  the  system  must  be  specified.  For  this  purpose,  as 
in  [5,10,15],  the  state  of  any  component  is  coded  1  when  it  performs  and  0  when  it  fails.  The 
state of all N components of the system can then be coded by a vector of N coordinates

x = (x_1, x_2, ..., x_N)



where x_i = 0 means the i-th component fails and x_i = 1 means that it does not fail. All possible
states of the system are represented by the 2^N different values this vector can assume.

Where an explicit mission time dependence is required, a random process y(t) =
(y_1(t), ..., y_N(t)) can be defined as in [15] so that to each component trajectory a measure x_i
is assigned. Then, for example: x_i = 1 if y_i(t) is a failure-free process over some interval
0 ≤ T_{i1} ≤ t ≤ T_{i2}, and x_i = 0 if at least one failure occurs.

Some of the 2^N states cause the system to fail and the others cause the system to perform.
The response of the system as a whole is written as a function φ(x) of x such that φ(x) = 0
when the system is failed in state x, and φ(x) = 1 when the system performs in state x. This
function φ(x) is known in the literature [5,10,23] and has been called a structure function of
order N.

The  structure  function  can  be  written  in  a  systematic  way  for  any  series  parallel  system. 
When  the  system  is  not  too  large  the  structure  function  can  also  be  written  by  observation  for 
many  more  general  systems.  The  structure  function  can  always  be  written  for  a  system  of  N 
components  by  enumeration  of  its  2N  states.  For  large  systems  this  is  at  best  very  tedious,  but 
generally  short  cuts  can  be  found  which  simplify  the  process.  The  structure  function  is  con- 
venient for  conceptual  development  of  the  theory  and  provides  a  very  general  notation  which  is 
why  it  is  used  here.  What  is  required  in  the  application  of  the  present  results  is  the  formula  for 
system  reliability  in  terms  of  component  reliabilities  as  is  done  in  [25,27,  and  8].  The  structure 
function  provides  this  formula  in  a  general  form  but  other  methods  are  available.  Some  of 
these  methods  are  identified  and  referenced  in  [17]  along  with  a  new  and  useful  algorithm 
based  on  graph  theory. 

DEFINITION  OF  RELIABILITY  FUNCTION 

Assume that the components of the system are failure independent so that the elements
of the state vector x = (x_1, ..., x_N) are independent random variables with probability distributions

Pr{x_i = 1} = r_i
Pr{x_i = 0} = 1 − r_i

where r_i is the reliability of the i-th component.

The structure function φ(x) is also a random variable with

Pr{φ(x) = 1} = R
Pr{φ(x) = 0} = 1 − R

where R is the reliability of the system. R is the expected value of φ(x), so that

(1)    R = E{φ(x)} = Σ φ(x) r_1^{x_1} (1 − r_1)^{1−x_1} ⋯ r_N^{x_N} (1 − r_N)^{1−x_N}

where the summation is over all 2^N states of the system.

In a particular application, given the structure function and the values of all component
reliabilities, the system reliability, R, can be computed explicitly using (1). References [5],
[10], and [23] provide further discussion with examples of φ and R.
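The following short Python sketch illustrates Eq. (1) by brute-force enumeration of the 2^N states; the three-component series-parallel structure and its component reliabilities are illustrative assumptions, not an example from the text.

```python
# Sketch: system reliability from Eq. (1) by enumerating the 2^N component states.
from itertools import product

def system_reliability(phi, r):
    """phi: structure function mapping a state tuple x to 0 or 1; r: component reliabilities."""
    N = len(r)
    total = 0.0
    for x in product((0, 1), repeat=N):
        prob = 1.0
        for ri, xi in zip(r, x):
            prob *= ri if xi == 1 else (1.0 - ri)
        total += phi(x) * prob
    return total

# Illustrative structure: components 1 and 2 in series, in parallel with component 3.
phi = lambda x: 1 - (1 - x[0] * x[1]) * (1 - x[2])
print(system_reliability(phi, [0.9, 0.8, 0.7]))   # 0.916 = 1 - (1 - 0.9*0.8)*(1 - 0.7)
```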



RELIABILITY  ESTIMATION  FROM  TEST  DATA 

In  many  applications  the  system  structure  is  known  but  some  or  all  of  the  component  reli- 
abilities are  unknown  and  must  be  estimated  from  tests  and  other  data.  As  a  result,  statements 
concerning  these  component  and  system  reliabilities  are  subject  to  the  uncertainties  of  statistical 
estimation.  A  method  of  treating  this  uncertainty  is  provided  by  a  Bayes  analysis  which  consid- 
ers the  unknown  component  reliabilities  as  random  variables  and  leads  to  Bayes  confidence 
intervals  for  both  component  and  system  reliabilities.  The  following  is  an  extension  and  gen- 
eralization of  previous  analysis  of  this  kind  [25,27,8,7,14,23,29]. 

BAYES  MODEL 

Assume a system of N failure independent components has a known structure function
φ(x) and reliability function R(r), r = (r_1, ..., r_N), of the form (1). Suppose that among the
N separate components of the system some are known to have identical reliabilities, say components
i, j, and k; then, since r_i = r_j = r_k, the symbols r_j and r_k can be replaced by r_i everywhere
in (1). In this way there remain only N' ≤ N different r's, one for each reliability
value. In addition, suppose that among the N' different component reliabilities N' − n are
known constants, so that there remain n different types of components with unknown reliabilities.
By a simple change in notation these n different, unknown reliabilities are denoted by
p = (p_1, p_2, ..., p_n).

By multiplying out factors (1 − p_i) and collecting terms, the system reliability (1) can then
be written in the equivalent form

(2)    R(p) = Σ_j a_{0j} p_1^{a_{1j}} ⋯ p_n^{a_{nj}}

where the constants a_{ij} are integers for i ≥ 0.

Using a Bayes inference model, the unknown p_i are considered independent random variables
with known posterior density functions,

f_i(p_i),  0 < p_i < 1,  i = 1, ..., n.

The system reliability R(p) is then also a random variable, defined by (1), with unknown
distribution function H(R).

In applications, what is required is the calculation of H(R) given the f_i(p_i),
i = 1, 2, ..., n. Having obtained H(R), point estimates and confidence intervals on R can be
obtained directly. This result is also required for risk, cost and other analyses based on the
Bayes model. The method for an explicit numerical evaluation is presented in the following
section.

EVALUATION  OF  THE  POSTERIOR  DISTRIBUTION 

The  proposed  method  of  evaluating  the  posterior  distribution  function  H(R)  is  based  on 
an  expansion  of  H(R)  in  Chebyshev  polynomials  of  the  second  kind  [1,16].  The  main  advan- 
tages of  this  method  lie  in  the  rapid  convergence  properties  of  the  Chebyshev  expansion  and 
the  convenient  numerical  computation  for  its  evaluation.  Although  a  description  of  the  pro- 
cedure has  been  presented  in  [8]  and  [7],  for  the  sake  of  completeness,  we  shall  outline  the 
main  steps  below. 



Expansion  by  Chebyshev  Polynomials 

Let H(R) denote the posterior distribution,

H(R) = ∫_0^R h(R) dR,  0 ≤ R ≤ 1,

where h(R) is the posterior density of the reliability of the overall system. By definition,
H(R) satisfies the boundary conditions:

(3)    H(0) = 0;  H(1) = 1.

Let us introduce a new function Q(R) defined by

(4)    Q(R) = H(R) − R;

then Q(R) satisfies the boundary conditions

(5)    Q(0) = Q(1) = 0

and can be expanded in a Fourier sine series of the following form:

(6)    Q(R) = (4/π) sin θ [ b_0 + b_1 (sin 2θ / sin θ) + ⋯ + b_k (sin (k+1)θ / sin θ) + ⋯ ]

where the angular variable θ is related to R by the relation

(7)    R = cos²(θ/2).

The coefficients b_k of the expansion (6) can be determined by:

(8)    b_k = ∫_0^1 [H(R) − R] U*_k(R) dR

where U*_k(R) = sin (k+1)θ / sin θ is the shifted Chebyshev polynomial of the second kind [1,16],
which can be computed by the recursion relation:

(9)    U*_{k+1}(R) = (4R − 2) U*_k(R) − U*_{k−1}(R)

with

U*_0(R) = 1,  U*_1(R) = −2 + 4R,  U*_2(R) = 3 − 16R + 16R².

If we express U*_k(R) explicitly as a kth order polynomial

(10)    U*_k(R) = Σ_{i=0}^{k} C_{ik} R^i


then Equation (8) becomes

(11)    b_k = Σ_{i=0}^{k} C_{ik} [ ∫_0^1 R^i H(R) dR − ∫_0^1 R^{i+1} dR ].

It can be shown, integrating by parts, that

(12)    M_i[H(R)] = ∫_0^1 R^i H(R) dR = (1/(i+1)) {1 − M_{i+1}[h(R)]},

where M_{i+1}[h(R)] = ∫_0^1 R^{i+1} h(R) dR is the (i+1)-th moment of the posterior density.


Thus, Equation (11) becomes

(13)    b_k = Σ_{i=0}^{k} C_{ik} [ (1 − M_{i+1}[h(R)]) / (i + 1) − 1/(i + 2) ].


Note that the Chebyshev coefficients C_{ik} can be computed independently of the moments.
They may be stored in the form of a triangular matrix if sufficient storage space is available. A
simple algorithm for recursively calculating the coefficients is C_{i,k+1} = 4C_{i−1,k} − 2C_{i,k} − C_{i,k−1}.
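A minimal sketch of this bookkeeping step, assuming the recursion quoted above (entries with negative indices taken as zero):

```python
# Sketch: triangular matrix C[i, k] of Eq. (10) via C_{i,k+1} = 4C_{i-1,k} - 2C_{i,k} - C_{i,k-1}.
import numpy as np

def chebyshev_coeffs(kmax):
    C = np.zeros((kmax + 1, kmax + 1))
    C[0, 0] = 1.0
    if kmax >= 1:
        C[0, 1], C[1, 1] = -2.0, 4.0          # U*_1(R) = -2 + 4R
    for k in range(1, kmax):
        for i in range(k + 2):
            C[i, k + 1] = (4.0 * (C[i - 1, k] if i >= 1 else 0.0)
                           - 2.0 * C[i, k]
                           - C[i, k - 1])
    return C

print(chebyshev_coeffs(2)[:, 2])   # [3, -16, 16], i.e. U*_2(R) = 3 - 16R + 16R^2
```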

Computations  and  Results 

To complete the analysis it remains to compute the moments of h(R) given the density
functions f_i(p_i) and then use (13) to compute the b_k.

From (2), R^k(p), k = 1, 2, ..., can be written as a finite sum

(14)    R^k(p) = Σ_j a_{0jk} p_1^{a_{1jk}} ⋯ p_n^{a_{njk}}

where the a_{ijk} are independent of the p_i and also integers for i ≥ 0. Using this result and the
fact that the expected value of a sum is the sum of the expected values and the expected value
of a product of independent random variables is the product of the expected values, it follows
that

(15)    M_k{h} = Σ_j a_{0jk} M_{a_{1jk},1} ⋯ M_{a_{njk},n}

where M_{a_{ijk},i} denotes the a_{ijk}-th moment of p_i.

Having determined the coefficients b_k we can write down the final expression for H(R)
from Equations (4) and (6) as follows:

(16)    H(R) = R + (8/π) √(R(1−R)) { b_0 + b_1 U*_1(R) + ⋯ + b_k U*_k(R) + ⋯ }.

This result is exact in the sense that the error can be made arbitrarily small by taking a
sufficient number of terms. References [8] and [7] give a discussion of numerical considerations
and examples. Generally, (16) has been found very convenient for numerical calculation
using an electronic digital computer.
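The whole moment-to-distribution procedure of Eqs. (9)-(13) and (16) can be sketched in a few lines. The 8/π normalization follows the reconstruction of (16) above, the truncation level K and the Beta(19, 3) test posterior (component A_1 of the later numerical example) are illustrative choices, and exact rational arithmetic is used only to sidestep the round-off mentioned in the text.

```python
# Sketch: evaluate H(R) from the moments of h(R) via the Chebyshev expansion.
from fractions import Fraction
from math import pi, sqrt

K = 20                                     # truncation level (illustrative)

# Triangular coefficient matrix of Eq. (10) by the stated recursion (negative indices = 0).
C = [[Fraction(0)] * (K + 1) for _ in range(K + 1)]
C[0][0] = Fraction(1)
C[0][1], C[1][1] = Fraction(-2), Fraction(4)
for k in range(1, K):
    for i in range(k + 2):
        C[i][k + 1] = 4 * (C[i - 1][k] if i else Fraction(0)) - 2 * C[i][k] - C[i][k - 1]

# Moments of the test posterior h(R) = 3990 R^18 (1-R)^2: M_j = prod_{m=0}^{j-1} (19+m)/(22+m).
M = [Fraction(1)]
for j in range(1, K + 2):
    M.append(M[-1] * Fraction(18 + j, 21 + j))

# Eq. (13): expansion coefficients from the moments.
b = [sum(C[i][k] * ((1 - M[i + 1]) / Fraction(i + 1) - Fraction(1, i + 2))
         for i in range(k + 1)) for k in range(K + 1)]

def H(R):
    """Eq. (16) with U*_k(R) generated by the recursion (9)."""
    U = [1.0, 4.0 * R - 2.0]
    for k in range(2, K + 1):
        U.append((4.0 * R - 2.0) * U[-1] - U[-2])
    s = sum(float(b[k]) * U[k] for k in range(K + 1))
    return R + (8.0 / pi) * sqrt(R * (1.0 - R)) * s

print(H(0.8))   # compare with the exact Beta(19, 3) distribution function at R = 0.8
```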

MODELS  FOR  APPLICATION 

To evaluate H(R) the posterior density f_i(p_i) for each different component reliability
p_i is required. The derivation of these requires application of Bayes inference procedures on a
case by case basis. The theory can be found in [20,4,19,2,3,24,6,18] and some specific applications
in [25,27,8,7,14,1,16,12,23]. A tabulation for some familiar models of mathematical reliability
theory is presented in the following.

Component  With  Constant  Failure  Rate 

A single component has an unknown constant failure rate λ and fixed mission time t.
Component reliability p = exp(−λt) is regarded as a random variable. The natural conjugate
prior density function is

p(p) = C p^{b_0} [ln(1/p)]^{r_0}


with parameters b_0 and r_0. When the test data consist of T operating hours after r failures,

T = t_1 + t_2 + ⋯ + t_r + (m − r) t_r.

Here t_r is the time of the r-th failure among m initially on test. Failures are not replaced and
the test is terminated at the r-th failure. The resulting posterior density function of p is

f(p | a, b) = ((b + 1)^{a+1} / Γ(a + 1)) p^b [ln(1/p)]^a,

where a = r + r_0 and b = T/t + b_0. The k-th moment of f(p) is

M_k{f} = (b + 1)^{a+1} (k + b + 1)^{−a−1}.
The  above  results  are  from  Reference  [25]. 
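A short sketch of this model follows, with the noninformative choice r_0 = b_0 = 0 assumed for illustration and the test data of component A_5 from the numerical example below.

```python
# Sketch: posterior density and moments for a constant-failure-rate component.
from math import gamma, log

def exp_component_posterior(r, T, t, r0=0.0, b0=0.0):
    """Posterior of p = exp(-lambda*t) after r failures in T operating hours."""
    a, b = r + r0, T / t + b0
    def density(p):
        return (b + 1.0) ** (a + 1) / gamma(a + 1) * p ** b * log(1.0 / p) ** a
    def moment(k):
        return ((b + 1.0) / (k + b + 1.0)) ** (a + 1)
    return density, moment

f5, M5 = exp_component_posterior(r=3, T=38.0, t=6.0)   # component A5 of the numerical example
print(round(f5(0.5), 3), round(M5(1), 5))              # posterior mean = (22/25)^4 = 0.59970
```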

Component  Having  Fixed  Probability  of  Success 

A single component has an unknown fixed probability of success, p. In testing, there
were observed m successes in n trials. For the natural conjugate beta prior density function
with parameters m_0 and n_0, the posterior density function of p is

f(p | a, b) = p^a (1 − p)^b / B(a + 1, b + 1),

where

a = m + m_0,  b = n + n_0 − a  and  B(a + 1, b + 1) = ∫_0^1 p^a (1 − p)^b dp.

The k-th moment of f(p | a, b), k = 0, 1, 2, ..., is:

M_k{f} = ((a + b + 1)! (a + k)!) / (a! (a + b + k + 1)!) = (Γ(a + b + 2) Γ(a + k + 1)) / (Γ(a + 1) Γ(a + b + k + 2)).

This  result  is  from  [26]. 
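A corresponding sketch for this model, using the moment expression as reconstructed above and a uniform prior (m_0 = n_0 = 0) assumed for illustration:

```python
# Sketch: posterior moments for a component with m successes observed in n trials.
from math import gamma

def binomial_component_moment(m, n, k, m0=0, n0=0):
    a, b = m + m0, (n + n0) - (m + m0)
    return (gamma(a + b + 2) * gamma(a + k + 1)) / (gamma(a + 1) * gamma(a + b + k + 2))

# Component A1 of the numerical example: 18 successes in 20 trials.
print(round(binomial_component_moment(18, 20, 1), 5))   # posterior mean 19/22 = 0.86364
```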

Steady  State  Availability  of  Component  With  Repair 

A two state component has exponential distributions of life and of repair times. The
durations of intervals of operation and repair define two different statistically independent
sequences of identically distributed, mutually independent random variables. Both the mean up
time, 1/λ, and the mean repair time, 1/μ, are unknown parameters estimated from test and prior
data.

The long term availability of the component is a function of the random variables μ and λ,
i.e.:

a = μ/(λ + μ).

Assuming gamma priors for λ and μ with snapshot, life and repair time data, the posterior density
of availability a is the Euler density function:

f(a | r, w, δ) = ((1 − δ)^w a^{w−1} (1 − a)^{r−1}) / (B(w, r) (1 − δa)^{w+r}),   0 < a ≤ 1;  r > 0;  w > 0;  |δ| < 1.



The parameters r, w and δ are determined by test data and prior information as defined in
[25].

The moments of f(a) are given in [25] in terms of Gauss' hypergeometric function
₂F₁(w + r, w + k; w + r + k; δ). (Note the typographical error in [25], where k in ₂F₁ is
replaced by r.)

A special case of this availability model, treating only "snapshot" data, is given in [28].
Snapshot data, defined in [25,28], record only the state of the system (up or down) at random
instants of time.

RULES  OF  COMBINATION  FOR  SOME  BASIC  SYSTEM  ELEMENTS 

Components  are  often  combined  to  form  system  elements  which  are  special  in  some 
sense.  For  example,  the  same  multicomponent  element  may  appear  several  times  as  a  unit  in 
the  same  system.  In  this  case,  it  may  be  convenient  to  treat  the  element  as  a  single  system 
component.   Some  simple  multicomponent  system  elements  are  presented  in  the  following: 

N  Identical  Components  in  Series 

The reliability, p, of N identical components in series is p = p_1^N.

Component reliability p_1 is a random variable in the Bayes representation with known posterior
density, f_1(p_1). The moments M_{k,1}{f_1}, k = 0, 1, ..., of f_1(p_1) are then also known.
The moments M_k{f} of the posterior density f(p) of p are related to the moments of f_1 by

M_k{f} = M_{Nk,1}{f_1};  k = 0, 1, 2, ....

Using  this  result  one  can  write  the  moments  of  the  posterior  density  of  series  combinations  of 
any  of  the  special  components  treated  in  the  previous  section. 

N  Identical  Redundant  Components 

When only one is required to operate in order that the system operate, the reliability,
p, of N identical failure independent redundant components is p = 1 − (1 − p_1)^N, where p_1
is the Bayes representation of the component reliability. It is shown in [8] that the moments
M_k{f} of the posterior density f(p) of p are related to the moments M_{k,1}{f_1} of the posterior
density f_1(p_1) of p_1; expanding [1 − (1 − p_1)^N]^k binomially and taking expectations term by
term gives

M_k{f} = Σ_{j=0}^{k} (−1)^j \binom{k}{j} Σ_{m=0}^{Nj} (−1)^m \binom{Nj}{m} M_{m,1}{f_1}.


By  alternately  applying  this  result  and  the  previous  one  for  components  in  series,  the 
moments  of  the  posterior  density  of  any  series  parallel  system  of  components  can  be  obtained. 
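The two combination rules can be sketched as follows; moment1(j) stands for the j-th posterior moment M_{j,1}{f_1} of the component reliability, and the degenerate check value p_1 = 0.9 is an illustrative assumption.

```python
# Sketch: series and redundant (parallel) combination rules for posterior moments.
from math import comb

def series_moment(moment1, N, k):
    """k-th moment of p = p1^N for N identical components in series."""
    return moment1(N * k)

def parallel_moment(moment1, N, k):
    """k-th moment of p = 1 - (1 - p1)^N, expanding binomially and taking expectations."""
    total = 0.0
    for j in range(k + 1):
        inner = sum((-1) ** m * comb(N * j, m) * moment1(m) for m in range(N * j + 1))
        total += (-1) ** j * comb(k, j) * inner
    return total

m1 = lambda j: 0.9 ** j    # degenerate check: a component known to have reliability 0.9
print(round(series_moment(m1, 3, 1), 6), round(parallel_moment(m1, 3, 1), 6))   # 0.729, 0.999
```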

A  "2  out  of  3"  Element 

An  element  consisting  of  three  identical  failure  independent  components,  which  operates 
if  any  two  or  more  of  the  components  operate,  is  sometimes  called  a  "2  out  of  3  voter,"  [21]. 
The  structure  function  of  this  element  is 

φ(x_1, x_2, x_3) = 1  if x_1 + x_2 + x_3 ≥ 2

                 = 0  if x_1 + x_2 + x_3 < 2




and the reliability p is

p = 3p_1² − 2p_1³

where p_1 is the component reliability. If the posterior density f_1(p_1) of p_1 has moments
M_{k,1}{f_1} then the moments M_k{f} of the posterior density f(p) of p are:

M_k{f} = Σ_{j=0}^{k} \binom{k}{j} 3^{k−j} (−2)^j M_{2k+j,1}{f_1}.

This result follows using the fact that for p = p_1^N, M_k{f(p)} = M_{Nk,1}{f_1}, when applied term by
term to the expansion of (3p_1² − 2p_1³)^k.

Reference [21] gives the reliability function of the N-tuple Modular Redundant design
consisting of N replicated units feeding an (n + 1)-out-of-N voter. This case can also be treated
by the present methods.

Exactly  L  Out  of  N  Element 

An  element  consisting  of  N  identical  failure  independent  components  which  operates  only 
when  exactly  L  out  of  N  components  operate  is  a  rather  unusual  system.  If  L  +  1  out  of  N 
operate  the  system  fails.  Such  a  system  is  not  a  coherent  structure  in  the  sense  of  [5].  The 
reliability  p  of  this  element  is  given  by 


p = \binom{N}{L} p_1^L (1 − p_1)^{N−L}.


The moments of the posterior density f(p) of the Bayes representation p, in terms of the
moments M_{k,1}{f_1} of the posterior density f_1(p_1) of the component reliability p_1, can be shown
to be


M_k{f} = \binom{N}{L}^k Σ_{j=0}^{(N−L)k} (−1)^j \binom{(N−L)k}{j} M_{kL+j,1}{f_1}.


This  example  serves  to  illustrate  that  the  proposed  evaluation  is  not  restricted  to  coherent  sys- 
tems. 
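A sketch of the last two moment relations (as reconstructed above), again written in terms of the component moments:

```python
# Sketch: "2 out of 3" and "exactly L out of N" moment relations.
from math import comb

def two_of_three_moment(moment1, k):
    """k-th moment of p = 3*p1^2 - 2*p1^3, expanded term by term."""
    return sum(comb(k, j) * 3 ** (k - j) * (-2) ** j * moment1(2 * k + j) for j in range(k + 1))

def exactly_L_of_N_moment(moment1, N, L, k):
    """k-th moment of p = C(N, L) * p1^L * (1 - p1)^(N - L)."""
    return comb(N, L) ** k * sum((-1) ** j * comb((N - L) * k, j) * moment1(k * L + j)
                                 for j in range((N - L) * k + 1))

m1 = lambda j: 0.9 ** j    # degenerate check value p1 = 0.9
print(round(two_of_three_moment(m1, 1), 4))          # 3*0.81 - 2*0.729 = 0.972
print(round(exactly_L_of_N_moment(m1, 3, 2, 1), 4))  # 3*0.81*0.1 = 0.243
```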


DEVELOPMENT  OF  AN  APPROXIMATE  PRIOR  FOR  TESTING  AT  SYSTEM  LEVEL 

Section  9.4.4  of  NAVORD  OD  44622,  Reference  [22],  presents  a  procedure  for  develop- 
ing the  posterior  beta  distribution  of  system  reliability  for  system  level  TECHEVAL/OPEVAL 
testing. Reference [9] presents further discussion with an example. The observed system level
data are binomial, i.e., r failures in n trials. The system level natural conjugate prior is the beta
density. An exact prior for the system level tests is the posterior density function based on all
prior component tests and component priors and can be computed by the methods above. The
procedure recommended in OD 44622 is to approximate the exact system prior with a beta density
having the same first and second moments.

Equation  (15)  above  provides  a  tractable  tool  for  computing  the  required  first  and  second 
moments  for  extending  the  method  to  arbitrary  system  structures. 

Let M_1 and M_2 denote the first and second moments, computed as shown in this report,
for the posterior density f(R) of system reliability, R, based on prior component data. The



f(R) is considered the exact prior for determinations of a new posterior density based on binomial
system level data. What are required for the approximation are the parameters n' and r'
of the beta prior

g(R | n'; r') = R^{n'−r'} (1 − R)^{r'} / B(n' − r' + 1, r' + 1)

with the same first and second moments as f(R). Having computed M_1 and M_2 the answer is
direct using formulas on page 9.23 of NAVORD OD 44622, i.e.,

n' = [M_1(1 − M_1)/(M_2 − M_1²)] − 1

r' = (1 − M_1) n'.

The  gamma  prior  is  treated  in  a  similar  way  in  the  same  reference. 

The  beta  approximation  can  also  be  used  directly  as  an  approximation  to  the  exact  poste- 
rior density  function  for  complex  systems  based  on  component  test  data.  The  approximation 
has  been  very  good  when  compared  with  the  exact  result  in  examples  treated  by  the  authors. 
The  calculation  is  tractable  for  hand  computation  since  only  the  first  and  second  moments  of 
the  exact  posterior  density  function  are  required. 
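A minimal sketch of this moment-matching step; the moment values in the example call are illustrative, not from the paper.

```python
# Sketch: beta prior parameters n', r' matched to the first two moments M1, M2.
def beta_prior_parameters(M1, M2):
    n_prime = M1 * (1.0 - M1) / (M2 - M1 ** 2) - 1.0
    r_prime = (1.0 - M1) * n_prime
    return n_prime, r_prime

print(beta_prior_parameters(0.9, 0.815))   # illustrative moments: gives n' = 17, r' = 1.7
```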

Numerical  Example 

Consider a system consisting of five components, A_i (i = 1, ..., 5), connected in series.
Components A_1, A_2, A_3, and A_4 have unknown fixed probabilities of success, p_i, and in testing
there were observed m_i successes in n_i trials. The fifth component, A_5, has an unknown constant
failure rate λ and mission time t. In testing, component A_5 failed r times in T operating
hours. The following test data were observed:

n_1 = 20, m_1 = 18;  n_2 = 30, m_2 = 25;  n_3 = 20, m_3 = 20;  n_4 = 20, m_4 = 19;  T = 38, t = 6, r = 3.

The resulting posterior density functions are:

f_1(R_1) = 3990 R_1^18 (1 − R_1)^2
f_2(R_2) = 4417686 R_2^25 (1 − R_2)^5
f_3(R_3) = 21 R_3^20
f_4(R_4) = 420 R_4^19 (1 − R_4)
f_5(R_5) = 482.00823 R_5^{19/3} [ln(1/R_5)]^3.


We know [25,26] that the Mellin integral transform of the posterior density function
h(R) for the system is the product of the Mellin integral transforms of the density functions of
the components. At this point we can determine h(R) exactly by means of the inverse Mellin
integral transform, or we can approximate h(R) with a beta density function having the same
first and second moments as h(R).

The Mellin integral transforms of the density functions for the components of the system
are:

M[f_1(R_1) | S] = 21! Γ(S + 18) / (18! Γ(S + 21))

M[f_2(R_2) | S] = 31! Γ(S + 25) / (25! Γ(S + 31))

M[f_3(R_3) | S] = 21! Γ(S + 20) / (20! Γ(S + 21))

M[f_4(R_4) | S] = 21! Γ(S + 19) / (19! Γ(S + 21))

M[f_5(R_5) | S] = (22/3)^4 / (S + 19/3)^4.

The Mellin integral transform of h(R) is M[h(R) | S] = ∏_{i=1}^{5} M[f_i(R_i) | S].


From [26] we know that the Mellin inversion integral yields directly

h(R) = (1/(2πi)) ∫_{c−i∞}^{c+i∞} R^{−S} M[h(R) | S] dS

where the path of integration is any line parallel to the imaginary axis lying to the right of
the singularities of M[h(R) | S]. If b is greater than 1, the real part of c is greater than −p, and
p is any number, then [26]

(1/(2πi)) ∫_{c−i∞}^{c+i∞} R^{−S} (S + p)^{−b} dS = R^p [ln(1/R)]^{b−1} / Γ(b).


To find h(R) we simply write M[h(R) | S] as the sum of its partial fractions [13] and
integrate each term using the above equation. Thus the exact posterior density function, h(R),
for system reliability is

h(R) = 1094388844.948 R^18
 + 30505643166.29 R^19 − 12601708553.76 R^19 ln(1/R)
 + 19915799047.82 R^20 − 31650550963.66 R^20 ln(1/R) − 5114357474.61 R^20 [ln(1/R)]^2
 + 235122603.404 R^25 − 354959810.01 R^26 + 249501799.456 R^27 − 98389473.63 R^28
 + 21240815.37 R^29 − 1974044.939 R^30
 − 22937.221 R^{19/3} + 78073.717 R^{19/3} ln(1/R) − 42683.275 R^{19/3} [ln(1/R)]^2 − 95839.296 R^{19/3} [ln(1/R)]^3.

The exact distribution function, H(R), is found by integrating the density function.


To obtain the approximate solutions for the system reliability density and distribution
functions, we recall that the first and second moments of h(R) are given by M[h(R) | 2] and
M[h(R) | 3], respectively. The beta density function, which is used to approximate h(R), is

ĥ(R) = R^a (1 − R)^b / B(a + 1, b + 1)

where ĥ(R) denotes the approximate system density function, B(a + 1, b + 1) is the complete
beta function, and a and b are the parameters of the beta function. The first moment of
ĥ(R) is




(a + 1) / (a + b + 2)

and the second moment is

(a + 1)(a + 2) / ((a + b + 2)(a + b + 3)).

We require that the first and second moments of h(R) and ĥ(R) be equal. Thus we have

M[h(R) | 2] = (a + 1) / (a + b + 2)

M[h(R) | 3] = (a + 1)(a + 2) / ((a + b + 2)(a + b + 3)).


Solving simultaneously for a and b yields the parameters for the beta density function. Thus
we have a = 6.43596 and b = 11.92734. Therefore we can now write ĥ(R), the approximate
density function for system reliability:

ĥ(R) = R^{6.43596} (1 − R)^{11.92734} / B(7.43596, 12.92734).


To determine the approximate distribution function, Ĥ(R), for system reliability we simply
integrate ĥ(R).

Table  1  provides  the  comparison  between  the  results  obtained  by  the  exact  solution  and 
the  approximate  solution. 

TABLE 1 — Numerical Results Obtained from Exact and Approximate Solutions

           Density Function                Distribution Function
   R       Exact h(R)   Approx. ĥ(R)       Exact H(R)   Approx. Ĥ(R)
   .0        .0            .0                 .0            .0
   .10       .079          .057               .001          .001
   .20      1.213         1.208               .052          .048
   .30      3.243         3.339               .278          .281
   .40      3.429         3.382               .635          .641
   .50      1.667         1.616               .896          .895
   .60       .343          .365               .986          .984
   .70       .020          .032               .999          .999
   .80       .003          .001              1.000         1.000
   .90      0.000         0.000              1.000         1.000
  1.00      0.000         0.000              1.000         1.000
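For comparison, the approximate columns of Table 1 can be reproduced from the fitted parameters with a short sketch; scipy's beta distribution with shape parameters a + 1 and b + 1 corresponds to the density R^a (1 − R)^b / B(a + 1, b + 1).

```python
# Sketch: approximate density and distribution columns of Table 1.
from scipy.stats import beta

a, b = 6.43596, 11.92734
dist = beta(a + 1.0, b + 1.0)
for R in [0.1 * i for i in range(11)]:
    print(f"{R:4.2f}  {dist.pdf(R):6.3f}  {dist.cdf(R):6.3f}")
```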

REFERENCES 

[1]  Abramowitz,  M.  and  I. A.  Stegun  (Editors),  "Handbook  of  Mathematical  Functions," 
National  Bureau  of  Standards,  Applied  Mathematics  Series,  55,  782  (1964). 

[2] Aitchison, J., "Two Papers on the Comparison of Bayesian and Frequentist Approaches to
Statistical Problems of Prediction," Journal of the Royal Statistical Society, Series B, 26, 161-175 (1964).

[3] Bartholomew, D.J., "A Comparison of Some Bayesian and Frequentist Inferences," Biometrika, 52,
1 and 2, 19-35 (1965).

[4] Birnbaum, Z.W., "On the Probabilistic Theory of Complex Structures," Proceedings of the
Fourth Berkeley Symposium, 1, 49-55, University of California Press (1961).

[5] Birnbaum, Z.W., J.D. Esary and S.C. Saunders, "Multi-Component Systems and Structures
and Their Reliability," Technometrics, 3, 55-77 (1961).

[6] Brender, D.M., "Reliability Testing in a Bayesian Context," IEEE International Convention
Record, Part 9, 125-136 (1966).

[7] Chang, E.Y. and W.E. Thompson, "Bayes Analysis of Reliability of Complex Systems,"
Operations Research, 24, 156-168 (1976).

[8] Chang, E.Y. and W.E. Thompson, "Bayes Confidence Limits for Reliability of Redundant
Systems," Technometrics, 17 (1975).

[9] Cole, Peter Z.V., "A Bayesian Reliability Assessment of Complex Systems for Binomial
Sampling," IEEE Transactions on Reliability, R-24, 114-117 (1975).

[10] Esary, J.D. and F. Proschan, "Coherent Structures of Non-Identical Components," Technometrics,
5, 191-209 (1963).

[11] Esary, J.D. and F. Proschan, "The Reliability of Coherent Systems," Redundancy Techniques
for Computing Systems, Edited by R.H. Wilcox and W.C. Mann, Spartan Books, 47-61 (1962).

[12] Fox, B.L., "A Bayesian Approach to Reliability Assessment," NASA Memorandum
RM/5084 (1966).

[13] Gardner, M.F. and J.L. Barnes, Transients in Linear Systems, John Wiley and Sons, New
York, 152-163 (1942).

[14] Gaver, D.P. and M. Mazumdar, "Statistical Estimation in a Problem of System Reliability,"
Naval Research Logistics Quarterly, 14, 473-488 (1967).

[15] Gnedenko, B.V., Yu.K. Belyayev and A.D. Solovyev, Mathematical Methods of Reliability
Theory, Academic Press, New York, 76-77 (1969).

[16] Lanczos, C., Applied Analysis, Prentice-Hall, Inc., Chapters IV and VII.

[17] Lin, P.M., B.J. Leon and T.C. Huang, "A New Algorithm for Symbolic System Reliability
Analysis," IEEE Transactions on Reliability, R-25 (1976).

[18] Lindley, D.V., "The Robustness of Interval Estimates," Bulletin of the International Statistical
Institute, 38, 209-220 (1961).

[19] Lindley, D.V., "The Use of Prior Probability Distributions in Statistical Inference and
Decisions," Proceedings of the Fourth Berkeley Symposium, 1, 453-468, University of
California Press (1961).

[20] Maritz, J.S., "Empirical Bayes Methods," Methuen's Monographs on Applied Probability
and Statistics, Methuen and Co. Ltd., London (1970).

[21] Mathur, F.P. and Paulo T. deSousa, "Reliability Models of N-tuple Modular Redundancy
Systems," IEEE Transactions on Reliability, R-24, 108 (1975).

[22] NAVORD OD 44622, Reliability Guide Series, 4. The Superintendent of Documents,
U.S. Government Printing Office, Washington, D.C.

[23] Parker, J.B., "Bayesian Prior Distributions for Multi-Component Systems," Naval Research
Logistics Quarterly, 19 (1972).

[24] Savage, L.J., "The Foundations of Statistics Reconsidered," Proceedings of the Fourth
Berkeley Symposium, 1, 575-585, University of California Press (1961).

[25] Springer, M.D. and W.E. Thompson, "Bayesian Confidence Limits for the Reliability of
Cascade Exponential Subsystems," IEEE Transactions on Reliability, R-16, 86-89 (1967).

[26] Springer, M.D. and W.E. Thompson, "Bayesian Confidence Limits for the Product of N
Binomial Parameters," Biometrika, 53, 3 and 4, 611 (1966).

[27] Thompson, W.E. and P.A. Palicio, "Bayesian Confidence Limits for the Availability of Systems,"
IEEE Transactions on Reliability, R-24, 118-120 (1975).

[28] Thompson, W.E. and M.D. Springer, "A Bayes Analysis of Availability for a System Consisting
of Several Independent Subsystems," IEEE Transactions on Reliability, R-21, 212-214 (1972).

[29] Wolf, J.E., "Bayesian Reliability Assessment From Test Data," Proceedings 1976 Annual
Reliability and Maintainability Symposium, Las Vegas, Nevada, 20-22, 411-419 (1976).


OPTIMAL  REPLACEMENT  OF  PARTS  HAVING  OBSERVABLE 
CORRELATED  STAGES  OF  DETERIORATION* 

L. Shaw 

Polytechnic  Institute  of  New  York 
Brooklyn,  New  York 

C-L.  Hsu 

Minneapolis  Honeywell 
Minneapolis,  Minnesota 

S.  G.  Tyan 

M/A  COM  Laboratories 
Germantown,  Maryland 

ABSTRACT 

A single component system is assumed to progress through a finite number
of increasingly bad levels of deterioration. The system, with deterioration levels i (0 ≤ i ≤ n),
starts in state 0 when new, and is definitely replaced upon reaching the worthless
state n. It is assumed that the transition times are directly monitored and
the  admissible  class  of  strategies  allows  substitution  of  a  new  component  only 
at  such  transition  times.  The  durations  in  various  deterioration  levels  are 
dependent  random  variables  with  exponential  marginal  distributions  and  a  par- 
ticularly convenient  joint  distribution.  Strategies  are  chosen  to  maximize  the 
average  rewards  per  unit  time.  For  some  reward  functions  (with  the  reward 
rate  depending  on  the  state  and  the  duration  in  this  state)  the  knowledge  of 
previous  state  duration  provides  useful  information  about  the  rate  of  deteriora- 
tion. 


Many authors have studied optimal replacement rules for parts characterized by Markovian
deterioration, for example Kao [6] and Luss [9] and the many references found in those
papers. Kao minimized the expected average cost per unit time for a semi-Markovian deteriorating
system, and considered various combinations of state and age-dependent replacement rules.

Luss  examined  inspection  and  repair  models,  where  he  assumed  that  the  operating  costs 
occurring  during  the  system's  life  increase  with  the  increasing  deterioration.  The  holding  times 
in  the  various  states  were  independently,  identically,  and  exponentially  distributed.  The  policies 
examined include the scheduling of the next inspections (when an inspection reveals that the
state of the system is better than a certain critical state k) and preventive repairs (when an
inspection reveals the state of the system to be worse than or equal to k). The convenience of
a  Poisson-type  structure  for  the  number  of  events-per-unit-time  made  it  relatively  easy  to  allow 
general  freedom  in  the  selection  of  observation  times. 

The  work  studied  here  is  based  on  a  modification  of  the  model  used  by  Luss.  Our  model 
for  deterioration  is  more  general,  but  the  admissible  strategies  used  here  are  more  restricted. 
Here  we  allow  the  exponentially  distributed  durations  to  have  different  mean  values,  and  to  be 
positively  correlated. 


'This  work  was  partially  supported  by  Grant  No.  N00014-75-C-0858  from  the  Office  of  Naval  Research. 





The  introduction  here  of  correlation  between  interval  durations  permits  the  modeling  of  a 
rate  of  deterioration  which  can  be  estimated  from  a  particular  realization  of  the  past  durations. 
However,  the  lack  of  a  Poisson-type  of  structure  for  the  events-per-unit-time  makes  it  much 
more  difficult  here  to  allow  general  freedom  in  the  selection  of  observation  times.  At  present, 
only  the  simple  case  of  direct  and  instantaneous  observation  of  deterioration  jumps  has  been 
considered. 

This  model  would  be  appropriate,  for  example,  in  a  subsystem  which  functions,  but  with 
reduced  efficiency,  when  some  redundant  components  have  failed;  and  for  which  failure  of  one 
component  might  indicate  environmental  stresses  which  increase  the  probability  of  failure  for 
other  components.  In  addition,  deterioration  in  correlated  stages  might  be  used  as  a  simple 
approximation  for  a  continuously  varying  degradation  which  does  not  exhibit  discrete  stages. 

Figure 1 shows a typical time history of deterioration and replacement. The duration in
state (i − 1), prior to reaching state (i), is r_{i−1}. The intervals d_i in Figure 1 represent the time
required to replace a component when it has entered state i. The sequence {r_i} will be Markov,
characterized by a multivariate exponential distribution. Reward functions will be related to
the deterioration state and the time spent in each state. The decision rule specifies whether or
not to replace when entering each state i, on the basis of the history of r_{i−1}, r_{i−2}, .... The
Markov property simplifies the decision rule to be a collection of sets C_i such that we replace
on entering state i if and only if r_{i−1} ∈ C_i.


[Figure 1 plots the deterioration state (0 through 5) against time.]

Figure 1. History of deterioration and replacement (n = 5).


The objective is to maximize the average reward per unit time:

(1)    L = lim_{T→∞} (1/T) [Total reward in (0, T)]

(2)      = E[Reward per renewal] / E[Duration between renewals].

(See Ross [11] page 160 for equivalence of (1) and (2).) The mean reward per renewal is
defined here as:

(3)    ℛ = E[ Σ_{i=0}^{N−1} ∫_0^{r_i} c_i(t) dt − p_N ]




in which:

N = state at which replacement occurs (possibly random),
p_N = replacement cost if replaced on entering state N (possibly random),
c_i(t) = reward rate when in state i.

Figure 2 shows several reward rate time functions c(t) which have been considered.
When one of these c(t) functions is specified for a given problem, the c_i(t) in (3) are assigned
values β_i c(t) with:


(4)    β_0 > β_1 > β_2 > ⋯ > β_{n−1} > β_n = 0,

to assure greater reward rates in less deteriorated states. State n corresponds to a completely
failed or worthless component.


[Figure 2 shows three reward rate time functions c(t): (a) constant, (b) linear, (c) constant after set-up.]

Figure 2. Reward rate time functions.


The mean duration in (2) is defined as:

(5)    𝒟 = E[ Σ_{i=0}^{N−1} r_i + d_N ]

to include a possibly random time d_N for carrying out a replacement at state N.

While the ultimate objective is to choose the C_i to maximize the L defined in (1), it is well
known that a related problem of maximizing:

(6)    ℰ_0(α) = ℛ − α𝒟

is simpler [1]. Indeed, the C_i which maximize L will be identical to those which maximize
ℰ_0(α) for the α* such that:

(7)    ℰ_0^0(α*) = 0,  where  ℰ_0^0(α) ≜ max ℰ_0(α).



Section  1  considers  a  case  in  which  it  is  found  that  deterioration  rate  information  is  not 
useful  (e.g.,  the  optimal  policy  is  independent  of  the  amount  of  correlation  between  successive 
state  durations). 

Sections  2  and  3  consider  other  penalty  cost  structures,  e.g.,  assuming  that  more 
deteriorated  parts  are  rustier,  hotter,  or  more  brittle,  and  therefore  more  costly  to  replace.  In 
such  cases  the  optimal  policies  do  make  use  of  estimates  of  the  deterioration  rates  as  well  as  of 
observations  of  the  deterioration  level. 

The Appendix describes useful properties of the multivariate exponential {r_i} sequence
which is used to model the correlated residence times in a sequence of deterioration states.
These durations have marginal distributions which are exponential with mean values η_i, and
correlations ρ_{r_i r_j} = ρ^{|i−j|}.

1.   CONSTANT  REWARD  RATE-STATE  INDEPENDENT 
REPLACEMENT  PENALTIES 

The constant reward rate case with c_i(t) = β_i, and with state-independent replacement
penalties (p_i = p, d_i = d), is particularly simple to analyze. We will see that as long as
E[r_i | r_{i−1}, r_{i−2}, ...] ≥ 0 for all i, even if the r_i are not exponentially distributed, the optimal
rule will be to replace the deteriorating part upon entering some critical state k*, independent of
the observed durations r_i.

Based on the problem statement, the optimal decision on entering state j must maximize
the mean future reward until the next renewal, ℰ_j(α), for a suitable α. Here:

(8)    ℰ_j(α) = E[ Σ_{i=j}^{N−1} β_i r_i | r_{j−1} ] − αE[ Σ_{i=j}^{N−1} r_i | r_{j−1} ] − p − αd.

Immediately after a renewal, when j = 0, the expectations defining ℰ_0(α) are unconditional.
The optimal decisions for each state will be found in terms of α, and then the proper α* (for
producing decisions which maximize L) is the one for which the maximum vanishes:

(9)    max ℰ_0(α*) = ℰ_0^0(α*) = 0.

Optimization by dynamic programming begins by considering the decisions at the last
step, i.e., on entering state (n − 1). There are two choices, to replace (R) or not to replace
(R̄), with corresponding values:

(10)    ℰ_{n−1}(α; R) = −p − αd,

and:

(11)    ℰ_{n−1}(α; R̄) = E[β_{n−1} r_{n−1} | r_{n−2}] − αE[r_{n−1} | r_{n−2}] − p − αd
                      = E[(β_{n−1} − α) r_{n−1} | r_{n−2}] − p − αd.

Clearly, the best decision is not to replace, if and only if, the difference

(12)    Δ_{n−1}(α; r_{n−2}) ≜ ℰ_{n−1}(α; R̄) − ℰ_{n−1}(α; R) = (β_{n−1} − α) E[r_{n−1} | r_{n−2}]

is non-negative. The sign of (12) will be the sign of (β_{n−1} − α), due to the non-negativity of
all interval durations. Thus the best decision depends on α and the reward parameter β_{n−1}, but
not on the previously observed duration. Two cases will be considered separately.




If β_{n−1} ≥ α then the best decision at state (n − 1) is not to replace. We will now explain
why, under this condition, it is best not to replace at any state less than n. Consider the situation
on entering (n − 2). We have already shown that it is better not to replace on entering
(n − 1). Thus the choice will be based on a Δ_{n−2} of the form:

(13)    Δ_{n−2}(α; r_{n−3}) = E[(β_{n−2} − α) r_{n−2} + (β_{n−1} − α) r_{n−1} | r_{n−3}].

Here we have:

(14)    (β_{n−2} − α) > (β_{n−1} − α) ≥ 0,

by assumption, and:

(15)    E[r_{n−1} | r_{n−3}] > 0  and  E[r_{n−2} | r_{n−3}] > 0,

because all r_i > 0 with probability one. Thus Δ_{n−2}(α; r_{n−3}) > 0 for all r_{n−3} > 0, and it is also
better not to replace here. This argument can be repeated for states (n − 3),
(n − 4), ..., 1, 0.

The other case to consider is β_{n−1} < α, which requires replacement on entering state
(n − 1), if the system ever reaches that state. When we consider the decision on entering
(n − 2), the Δ_{n−2} is:

(16)    Δ_{n−2}(α; r_{n−3}) = E[(β_{n−2} − α) r_{n−2} | r_{n−3}],

which has the sign of (β_{n−2} − α). If (β_{n−2} − α) < 0, then replacement is optimal on entering
(n − 2) and (n − 3) is considered next. This iteration may eventually reach a state (k − 1)
where (β_{k−1} − α) ≥ 0 and it is better not to replace. Arguments similar to those for the
β_{n−1} − α ≥ 0 case show that nonreplacement is the optimal decision at all states preceding the
one which first arises as a nonreplacement state in this backward iteration.

In summary, in the constant reward rate, constant replacement penalty case, ℰ_0(α) is maximized
by a decision rule which says replace on entering some state k ≤ n which depends on
the reward parameters {β_i} and the α:

(17)    k = min{i : (α − β_i) ≥ 0}.

Finally, we must choose α* so that ℰ_0^0(α*) = 0, where:

(18)    ℰ_0^0(α) = −p − αd + Σ_{i=0}^{k−1} (β_i − α) E[r_i].


Figure 3 shows a typical plot of ℰ_0^0(α) as a continuous, piecewise linear curve whose zero
crossing (ℰ_0^0(α*) = 0) defines α* and the optimal replacement state k* for maximizing L.

EXAMPLE. Figure 3 shows that the optimal average reward per unit time is L = 2 5/7
when k* = 3, where β_0 = 5, β_1 = 4, β_2 = 3, β_3 = 2, β_4 = 1, β_5 = 0, p = 5, d = 1, η_i = 2
(i = 0,1,2,3,4) and n = 5. From Equation (18), the optimal k is a function of α, which
remains constant when α varies over each interval β_{i+1} ≤ α < β_i, as shown in the figure.
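The example can be checked with a short sketch of the search implied by Eqs. (17) and (18); the bisection tolerance is an implementation choice.

```python
# Sketch: find alpha* (= L_max) and k* for the constant reward rate example above.
betas = [5.0, 4.0, 3.0, 2.0, 1.0, 0.0]     # beta_0 .. beta_n with n = 5
p, d = 5.0, 1.0
eta = [2.0] * 5                             # E[r_i], i = 0..4

def k_of(alpha):                            # Eq. (17): first state with alpha >= beta_i
    return next(i for i, b in enumerate(betas) if alpha - b >= 0.0)

def E0(alpha):                              # Eq. (18)
    k = k_of(alpha)
    return -p - alpha * d + sum((betas[i] - alpha) * eta[i] for i in range(k))

# E0 is continuous and decreasing in alpha, so bisect for its zero on [0, beta_0].
lo, hi = 0.0, betas[0]
for _ in range(60):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if E0(mid) > 0.0 else (lo, mid)
print(round(lo, 4), k_of(lo))   # alpha* = L_max = 19/7 = 2.7143 and k* = 3
```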

2.   INCREASING  REPLACEMENT  PENALTIES-CONSTANT  REWARD  RATE 

Here we generalize the model of the previous section by allowing the replacement cost p_i
and replacement duration d_i to be functions of the replacement state (i), and to be random.
These parameters are assumed to have mean values E[p_j] and E[d_j] which are convex nondecreasing
sequences in j, corresponding to the increased difficulty in replacing more deteriorated
parts which may be, e.g., rustier, hotter or more brittle. We also assume that the mean durations
are ordered: η_0 ≥ η_1 ≥ ⋯ ≥ η_{n−1}, corresponding to faster transitions of more
deteriorated parts.




[Figure 3 plots ℰ_0^0(α) against α, together with the piecewise constant k*(α); the zero crossing gives α* = L_max.]

Figure 3. Optimal reward search: constant reward rate case.




The foregoing assumptions, together with properties of the assumed multivariate
exponential density for stage-durations (see Appendix), lead to an optimal decision policy with
a nice structure. That optimal policy prescribes replacement when entering state j, if and only
if r_{j−1} < r*_{j−1}, where the decision thresholds are ordered: 0 ≤ r*_0/η_0 ≤ r*_1/η_1 ≤ ⋯ ≤
r*_{n−1}/η_{n−1}.

The optimal decision on entering state j must maximize the mean future reward until the
next renewal, i.e., ℰ_j(α). For a suitable α, we have:

(19)    ℰ_j(α) = E[ Σ_{i=j}^{N−1} β_i r_i | r_{j−1} ] − αE[ Σ_{i=j}^{N−1} r_i | r_{j−1} ] − E[p_N + αd_N].


For notational simplicity we define e_i = E[p_i + αd_i] and note that e_i is also convex and nondecreasing
since we are only interested in α > 0. The optimal decisions for each state will be
found in terms of α, and then the proper α* (for producing decisions which maximize L) is the
one for which the maximum ℰ vanishes:


(20)    ℰ_0^0(α*) = −E[e_N(α*)] + E[ Σ_{j=0}^{N−1} β_j r_j ] − α* E[ Σ_{j=0}^{N−1} r_j ] = 0.


Optimization by dynamic programming begins by considering the decision at the last step.
Since state n represents a failed component, we definitely replace the component when it enters
state n. Next, we consider the decision to be made on entering state n − 1. There are two
choices: to replace (R) or not to replace (R̄), with corresponding values

(21)    ℰ_{n−1}(α; R) = −e_{n−1},

(22)    ℰ_{n−1}(α; R̄) = E[β_{n−1} r_{n−1} − α r_{n−1} | r_{n−2}] − e_n

for ℰ_{n−1}(α). Clearly, the best decision is not to replace, if and only if,

Δ_{n−1}(r_{n−2}) ≜ ℰ_{n−1}(α; R̄) − ℰ_{n−1}(α; R)

is non-negative, i.e.,

(23)    Δ_{n−1}(r_{n−2}) = (β_{n−1} − α) E[r_{n−1} | r_{n−2}] + (e_{n−1} − e_n) ≥ 0.

Referring to (A-6), Δ_{n−1}(r_{n−2}) is a linear function of r_{n−2}, with

Δ_{n−1}(0) = (β_{n−1} − α) η_{n−1}(1 − ρ) + (e_{n−1} − e_n).

Figure 4 shows the possible shapes for this function. There can be no downward zero-crossing
at an r_{n−2} > 0.

Thus,  depending  on  the  numerical  values  of  the  parameters,  there  are  three  possible  kinds 
of  optimal  decision  rules  when  entering  state  (n  —  1): 

(i)    replace  for  no  rfl_2  if  A„_i  ^  0  for  all  /-„_2  ^  0 

(ii)    replace  for  any  rn_2  if  A„_]  <  0  for  all  r„_2  ^  0 

(iii)    replace  if  and  only  if  r,*_2  >  r„_2  >  0,  where  A„_,  (r,*_2 )  =  0. 

In  other  words, 

(24)  £,_,(«)  =  {r„_2:  rn_2  <  C2], 
where  r„"_2  could  be  zero  (case  i)  or  infinite  (case  ii). 


366 


L.  SHAW,  C-L.  HSU  AND  S.G.  TYAN 


A„_,(r„_2) 


•     P„-\  >  a 


r„-2 


Figure  4.    Possible  shapes  for  &„_](rn_2). 


Next  we  consider  the  optimal  decision  when  entering  state  (n  —  2),  and  assuming  that  the 
optimal  decision  will  be  made  at  the  subsequent  stage.  We  consider  cases  of  (/3„_|  <  a)  and 
(/3„_i  ^  a)  separately. 

(a)  (/3„_i  <  a)  implies  replacement  on  entering  (n  —  1),  so 

A„_2(r„_3)  =  (/3„_2-a)  £[r„_2|r„_3]  +  (e„_2  -  e„_i), 
resulting  in  the  same  three  possibilities  listed  above  for  state  (n  —  1). 

(b)  for  (£„_!  >  a): 

(25)  A„_2(r„_3)  =  e„_2  +  (j3„_2-a)£[rn_2|r„_3] 

+  J  .     [(p„-i-a)  E[rn_x\rn_2]  -  e„]  /(r„_2|r„_3)rfr„_2 

C*n-l 

+  Jo       (-e„-i) /(r„_2|rw_3)^„_2 

Equation  (25)  can  be  simplified,  with  the  aid  of  the  notation  (x)+  =  maxOc,  0),  to  the  form 

(26)  A„_2(r„_3)  =  (e„_2  -  en_x)  +  (j8„_2-a)  E[r„-2\r„-3] 

+  £[(A„_1(rn_2))  +  |r„_3]. 
Useful  comparisons  can  be  formed  if  normalized  variables  are  introduced,  namely 
s,  =  rj-oc  hM-\)  =  M'V-i)lr,._,U,/_,s/_, 
We  now  prove 

(a)  8„_2(s„_3)  ^  8„_,(s„_3) 

(b)  8„_2(s„_3)  is  convex  with  at  most  one  upward  zero  crossing  at  an  5  >  0. 

There  is  no  harm  in  writing  8„_,(s„_3)  or  8„_,(s+)  instead  of  8„_,(s„_2)  for  purposes  of 
comparing  functions. 


OPTIMAL  REPLACEMENT  OF  PARTS  367 

To  prove  (a),  consider 

(27)  8„_2(s)  -  8„_iCs)  =  [(e„_2  -  e„-i)  -  (e„-i  -  <?„)]  +  £[(8„_1(s+))+|5] 

+  [(/3w_2-a!)i7„_2-  (^w_i-a)Tj„_i]  £"[s+|s]. 
where  5+  represents  the  normalized  duration  following  5. 

The  terms  on  the  right  side  of  (27)  are  nonnegative  due  to  the  convexity  of  the  eh  i  )+ 
^  0,  (A-6),  and  the  assumed  orderings  of  the  /3,  and  tj,. 

This  completes  the  proof  that  (a)  is  true.  It  follows  immediately  that  if  (i)  (preceding  Eq. 
(24))  applies  for  state  in  —  1),  then  it  is  also  optimal  not  to  replace  in  state  in  —  2)  or  any  ear- 
lier state.    (Recall  /3„_i  <  /3„_2  <  . . .  ,  and  we  are  now  considering  a  <  (3„-\). 

To  prove  (b),  which  is  only  of  interest  when  an  r*_2  >  0  exists,  we  refer  to  the  theorem 
in  the  appendix.   The  test  difference  8„_2(s)  can  be  written  as 

(28)  8„_2(s)  =  E[en_2  -  en_x  +  (fi „_2- a)r) „-2  s+  +  i8n^is+))+\s] 

in  which  the  integrand  has  the  properties  required  by  his)  in  the  theorem.  To  see  this,  we 
note  that  r*_2  >  0  implies  that  (8„_,(0))+  =  0,  so  the  integrand  is  nonpositive  at  5+  =  0. 
Thus,  8„_2(s)  has  the  shape  stated  in  (b),  implying  that 

(29)  e„_3={ra_3:r„_3<  r,;_3} 

where  r„*_3  may  be  zero,  infinity,  or  the  nonnegative  value  defined  by  8 „_2(r„*_3 / '-q „_3)  =  0. 

The  foregoing  arguments  can  be  repeated  for  /•„_4,  rn_5  ...  r0  to  prove  that  the  optimal 
replacement  policy  has  the  form: 

Replace  on  entering  state  /',  if  and  only  if,  r,  ^  r*  where 

0  <  rj/rjo  <  /-'/tj,  <  ...  <  d/i7„_,  =  oo. 

When  repeating  the  proof  for  earlier  stages,  the  (  )+  term  in  (27)  and  (28)  is  modified  to  the 
form,  e.g.,  [(8„_2(s+))+  -  (8 „_t (5+))+] .  This  term  is  generally  nonnegative,  due  to  (a)  at  the 
preceding  iteration  (next  time  step);  and  it  is  zero  for  s+  =  0  when  proving  (b),  since  then  r„*_3 
>  0.   Thus  the  basic  theorem  is  still  applicable. 

3.  Computational  Procedure 

The  preceding  section  derived  the  structure  of  the  optimal  decision  rule  for  the  case 
where  replacement  is  more  difficult  and  more  expensive  when  the  part  is  more  deteriorated. 
The  corresponding  optimal  decision  thresholds  can  be  formed  as  follows: 

(a)  choose  an  initial  a. 

(b)  Find  the  r*ia)  (;=  n—l,  n  —  2,  ...0)  recursively,  via  numerical  integration  of 
expressions  like  (26)  (where  r*_3  (a)  is  defined  by  the  condition  A„_2(v_3)  =  0). 

(c)  Compute 

n    [ifi0-a)  r0+  (A,(/-0))+]  fir0)dr0. 
o 

(d)  If  |£o(a)l  <  e^  f°r  sufficiently  small  e,  say  Lmax  =  a*  =  a:  otherwise  repeat  the 
computational  cycle  starting  with  a  new  a. 


368 


L.  SHAW,  C-L    HSU  AND  S.G    TYAN 


The  following  properties  of  £o  (a)  can  be  used  to  generate  an  a -sequence  which  con- 
verges to  a  *. 

1.  £q  (**)  's  monotone  decreasing,  since  £0(a)  nas  this  property  for  a  fixed  policy  (see 
Eq.  (19));  and  if  £o  ("2)  ^  £0  (ai)  f°r  a2  >  fli,  then  the  policy  used  to  achieve  £q  ^2)  could 
be  used  to  achieve  an  £0(«|)  >  £<?  («i)  —  a  contradiction. 

2.  When  p  =  0,  all  r*  are  zero  or  infinite:  replacement  always  occurs  on  arrival  at  a  criti- 
cal state  /*.  Use  of  that  policy  will  achieve  the  same  average  reward  for  durations  having  any 
value  of  p.  Thus,  a  useful  bound  on  a*(p)  is  a*(0)  ^  a*(p);  0  <  p  <  1*. 

3.  When  p  =  1,  future  r,  are  completely  predictable  ( Var (/-,| r,_j)  =  0  in  (A-7)),  so 
a*(l)  ^  a*(p).  In  this  case  there  is  essentially  a  single  random  variable  r0,  and  the  r,"  can  be 
calculated  without  the  need  for  numerical  integration  of  Bessel  functions. 


4.   NUMERICAL  EXAMPLE 


Table  I  lists  parameter  values  for  a  replacement  problem  which  fall  under  the  assumptions 


of  Section  2. 


TABLE  1  —  Numerical  Example  Parameters 


CASE  I    (p  =  0) 


i 

0 

1 

2 

3 

4 

5 

(Q, 

5 

4 

3 

2 

1 

17/ 

1 

0.9 

0.8 

0.7 

0.6 

E[Pl\ 

2 

2.2 

2.4 

2.6 

2.8 

E[d,] 

1 

1.1 

1.2 

1.3 

1.4 

Since  future  durations  are  independent  of  past  ones,  the  optimal  policy  replaces  when  a 
critical  state  i*  is  reached.   The  general  optimal  reward  expression 


<**(p)  = 


N-\ 

Z/V,  -  Pn 
0 


N-\ 
0 


becomes,  in  this  case 


a*(0)  =  max 


fcjB,  7,,  -  E[Pi] 


=  max  ^  (J) 


Direct  evaluation  shows 


1.5      2.13      2.205      2.085      1.89 


withy'*=  3  anda*(0)  =  2.205. 


OPTIMAL  REPLACEMENT  OF  PARTS 


369 


CASE  2    (p  =  1) 

Since  r,  =  r0  rj/rjo  in  this  case,  the  optimal  rule  specifies  a  replacement  state  j(r0)  as  a 
function  for  r0. 


For  any  such  policy 

£0(a,j(r0))  =  ErQ 


T70      o 


This  expectation  will  be  maximized  if  j(r0)  maximizes  the  bracketed  term  for  each  r0.    Making 
the  necessary  comparisons  for  a  sequence  of  a -values  leads  to  the  policy 

j*  =\,  if  ro  <  0.2698 

=  2,  if  0.2698  ^  r0<  0.7083 

=  3,  if  0.7083  ^  r0 

for  which  |£0I  <  0.003  and  a  *(1)  =  2.25. 


<  2.25.    A  pilot  calculation  along  the  lines  indicated  in  the 


CASE  3 

1 

We  know  that  2.205  <  a* 

1 
2 

<  2.25. 

previous  section  shows  that  rj 

1 
2 

=  o,  >■; 

1 

2 

.       9 

(<**- 

2) 

—    =  oo  for  j ^  2,  and 


r\  = 


1(3 -a*)' 

where  a  *  is  chosen  to  make  the  following  £0(a)  vanish. 


oo        .oo 


£0(a)  =  6.4  -  3a  +    f     J 


'-f 


+  j  (3-a)r, 


2rn  + 


0.45 


0.45 


The  known   bounds  on  the  optimal  reward  a 


bounded,  too:  0.290  <  n 


70(2.981  yf7^\)drxdrQ 


imply  that  the  optimal  threshold  r*  is 


<  0.375. 


Similar  study  of  other  values  of  the  correlation  parameter  p  lead  to  the  optimal  policy  pat- 
tern described  in  Table  II.  One  might  say  that  as  p  increases,  the  past  observations  are  more 
informative,  the  optimal  policy  makes  finer  distinctions,  and  the  optimal  reward  increases. 

5.  CONCLUSIONS 

A  multivariate  exponential  distribution  has  been  used  to  describe  successive  stages  of 
deterioration.  Optimal  replacement  strategies  have  been  found  for  the  class  of  decision  rules 
which  can  continuously  observe  the  deterioration  state,  and  which  may  make  replacements  only 


370 


L.  SHAW,  C-L.  HSU  AND  S.G.  TYAN 


TABLE  2  -  Optimal  Policy  Structure 


Correlation  Parameter  p 

0) 

c 

E 
o 
o 

o. 

a! 

0 

1/4 

1/2 

3/4 

1 

1 

r0  <  r0*(3/4) 

r0  <  r'Q  (1) 

2 

r,  <  rrfl/2) 

r,  <  rf(3/4) 

'o  <  — r\  (!) 

3 

always 

always 

r,  ^  r'  (1/2) 

/•,  ^  r,"  (3/4) 

r0  >  — r,  (1) 

at  the  times  of  state  transitions.    Similar  results  have  been  found  for  the  other  reward  rates 
shown  in  Figure  2  (linear;  and  constant  after  an  initial  set-up  interval  for  readjustment  to  the  i 
new  state)  [5]. 

The  optimal  replacement  policy  derived  in  Section  2  makes  use  of  observations  which 
allow  estimation  of  the  current  rate  of  deterioration  for  the  correlated  stages  of  deterioration. 
The  numerical  example  demonstrates  how  the  optimal  policy  and  reward  are  related  to  the 
amount  of  correlation  between  the  durations  in  successive  deterioration  states.    For  the  model 
used  here,  the  optimal  policy  for  p  =0  will  achieve  the  same  reward  (less  than  optimal)  for  j 
any  p.    Depending  on  the  application,  the  suboptimal  approach  may  be  satisfactory.    The  addi-  j 
tional  reward  achievable  by  the  actual  optimal  policy  is  bounded  by  the  easily  computed  optimal  , 
reward  for  p  =   1.    However,  it  is  possible  that  the  small  percentage  improvement  achievable 
for  the  p  =  1/2  case  in  the  example  could  represent  a  significant  gain  in  a  particular  applica- 
tion. 


The  ordering  of  state  dependent  rewards,  mean  durations,  etc.  assumed  here  are  physi- 
cally reasonable,  and  lead  to  nice  ordering  of  the  decision  regions.  However,  other 
/?,■,  7],,  ph  d,  orderings  might  be  more  appropriate  in  other  situations.  The  model  introduced 
here  for  dependent  stage  durations  could  be  used  in  those  cases,  together  with  dynamic  pro- 
gramming optimization,  although  the  solutions  may  not  have  comparably  neat  structures. 

We  anticipate  that  the  optimization  approach  and  policy  structure  described  here  will  also 
be  applicable  to  replacement  problems  having  similar  deterioration  models.  One  easy  extension 
would  be  to  change  the  correlation  structure  in  (A-3)  from  p''-/l  to  something  else,  e.g., 
p  \i-j\  +  p  |/-/l_    other  changes  could  permit  the  r,  to  have  non-exponential  distributions,  as 

long   as   similar   total-positivity    properties   exist    to    permit   analogous   simplifications   in   the 
dynamic  programming  arguments. 


Some  of  these  other  r,  distributions  are  being  studied  now  in  the  hope  of  finding  similar 
models  which  exhibit  large  percentage  differences  between  the  optimal  rewards  as  PVi+1  changes 
from  zero  to  one.  (Other  choices  of  the  numerical  values  in  Table  I  have  not  revealed  any 
such  cases  for  the  current  model). 

One  reasonable  generalization  would  allow  transitions  from  state  /to  any  state  j  >  /'.   This, 
would  not  change  the  form  of  the  solution  in  the  case  of  constant  replacement  penalties.   How- 


OPTIMAL  REPLACEMENT  OF  PARTS  371 

ever,  the  possibility  of  these  additional  transitions  does  ruin  the  structure  when  replacement 
penalties  increase  with  the  deterioration  state.  (The  8„_2(s)  >  8  „_)($)  argument  is  no  longer 
valid.) 

REFERENCES 

[1]  Barlow,  R.E.  and  F.  Proschan,  "Mathematical  Theory  of  Reliability,"  John  Wiley  and  Sons 

(1965). 
[2]  Barlow,  R.E.  and  F.  Proschan,  "Statistical  Theory  of  Reliability  and  Life  Testing,"  Holt, 

Rinehart,  and  Winston  (1975). 
[3]  Griffith,  R.C.,  "Infinitely  Divisible  Multivariate  Gamma  Distributions,"  Sankhya,  Series 

A, 52,  393-404  (1970). 
[4]  Gumbel,  E.J.,  "Bivariate  Exponential  Distributions,"  Journal  of  the  American  Statistical 

Association,  55,  678-707  (1960). 
[5]  Hsu,  C-L.,  L.  Shaw  and  S.G.  Tyan,  "Reliability  Applications  of  Multivariate  Exponential 

Distributions,"  Technical  Report,  Poly-EE-77-036,  Polytechnic  Institute  of  New  York 

(1977). 
[6]  Kao,  E.P.,  "Optimal  Replacement  Rules  when  Changes  of  States  are  Semi-Markovian," 

Operations  Research,  21,  1231-1249  (1973). 
[7]  Karlin,  S.,  "Total  Positivity,"  Stanford  University  Press  (1968). 

[8]  Kibble,  W.F.,  "A  Two-Variate  Gamma  Type  Distribution,"  Sankhya,  5,  137-150  (1941). 
[9]  Luss,  H.,  "Maintenance  Policies  when  Deterioration  Can  Be  Observed  by  Inspections," 

Operations  Research,  24,  359-366  (1976). 
[10]  Marshall,  A.W.  and  J.  Olkin,  "A  Multivariate  Exponential  Distribution,"  Journal  of  the 

American  Statistical  Association,  22,  30-44  (1967). 
[11]    Ross,    S.,   "Applied   Probability   Models   with   Optimization   Applications,"    Holden-Day 

(1970). 


APPENDIX 

Dependence  Relationships  Among  Multivariate 
Exponential  Variables 

Many  multivariate  distributions  have  been  described  and  applied  to  reliability  problems 
[4,8,10].  In  each  case  the  marginal  univariate  distributions  are  of  the  negative  exponential 
form.  Properties  of  the  distribution  used  here  are  most  easily  derived  by  exploiting  its  relation- 
ship to  multivariate  normal  distributions  [3.5]. 

The  multivariate  exponential  variables  rx,  r2  . . .  ,  rn  can  be  viewed  as  sums  of  squares: 

(A-l)  r,  =  w,2  +  z,2, 

where  w  and  z  are  independent,  zero  mean,  identically  distributed  normal  vectors,  each  with 
covariance  matrix  T.   It  follows  that  the  r,  have  exponential  marginal  distributions  with 

(A-2)  E[r,]  =  2r„ 

We  specialize  to  the  case  where  the  underlying  normal  sequences  {w,}  and  {z,}  are  Markovian 

(A-3)  yij  =  Jy77JjPU-Jl 

and  find  that  {r,}  is  also  Markov  with  the  joint  density 


372 
(A-4) 


L.  SHAW,  C-L.  HSU  AND  S.G.  TYAN 


/(ro,rlfr2,  •••  .  V-i)  - 


n-i     l-i 

/•-0       J 


n-2 

;=0 


exp 


1- 

1 


V    rfiVi+ 


nn+\ 


\-p 


I±  +   rJzl   +  y2iili±Pl 


170       TJn-l 


l»i 


;»  >  2, 


Equation   (A-4)    uses  the  modified  Bessel  function   70(  )   and  the  notations  E[r,]  =  tj,  and 
Pr,r     =  P-    (When  «  =  2,  the  summation  in  exp  (   )  vanishes.) 


(A-5) 


The  conditional  density  is  easily  shown  to  satisfy  the  Markov  property  and  [5] 

1 


/(r,|/7_,)  =  [tj,(/  -  p)]   'exp 


(1  -p) 


2         I  p  nn- 1 

,  1  ~P     V     V,Vi-\ 


with 
(A-6) 

(A-7) 


E[n\rHl]  =  17,  +  (/-,_!  -  T7,_!)p  Vi/Vi-\- 
Varlr/lr,.!]  =  t,,2[(1  -  p)2  +  2p(l  -  p),-,,,/-^,]. 


These  conditional  moments  shows,  e.g.,  that  the  conditional  mean  of  r,  exceeds  its  mean  in 
proportion  to  the  amount  by  which  r,_]  exceeds  its  mean,  and  that  conditional  mean  is  a 
linearly  increasing  function  of  rH\. 

The  dynamic  programming  arguments  used  here  required  calculations  of  conditional 
expectations  based  on  (A-5).  As  is  often  the  case  [2],  the  total  positivity  properties  of 
/0/|#/_i)  are  very  useful  for  determining  structural  properties  of  the  optimal  policy. 

It  is  straightforward  to  show  that  both  fir,,  rHi)  and  /(r/|r,-_i)  are  totally  positive  of  all 
orders  (TP^),  [5,7].  This  means,  for /(/•,,  rHi),  that  the  following  determinants  are  nonnega- 
tive  for  any  Wand  any  ai  <  a2  ...  <  aN;  ft{  <  /32  . . .  <  /3#. 

/(a,,/8,)        /(a,  -fl2)  ...  f(ah/3N) 

>  0. 

/(atf.Pi) A<*N,PN) 

THEOREM:  if  h(y)  is  continuous  and  convex,  and  satisfies  the  bounds 

(i)    h  (0)  <  0 

(ii)      \h(y)\  <  a  +  b  v2m;    a  >  0,  b  >  0,  y  >  0,  m    =    positive    integer.     g(x)     = 
I   ^(y)  /(yU)  dy,  and  f(y\x)  is  jHPoc,  then  g(x)  is  continuous,  convex,  bounded  in  the  sense 

\g(x)\  <  a'  +  b'  x2m;  a'  >  0,  b'  >  0,  x  >  0; 

and  belongs  to  one  of  the  three  following  categories: 


OPTIMAL  REPLACEMENT  OF  PARTS  373 

(I)  g(x)  >  0  for  all  x^  0, 

(II)  g(x)  <  0  for  all  x  ^  0  except  with  a  possible  zero  at  x  =  0, 

(III)  there  exists  a  unique  x*,  0  <  x*  <  °o,  such  that  g(x)  >  0  for  all  x  >  x*  ;  and 
g(x)  <  0  for  x  <  x*  except  for  a  possible  zero  at  x  =  0. 

This  theorem  is  used  to  define  optimal  decision  regions  according  to  the  sign  of  a  function  like 
g(x),  with  x*  corresponding  to  a  decision  threshold. 


I 


STATISTICAL  ANALYSIS  OF  A  CONVENTIONAL  FUZE  TIMER 

Edgar  A.  Cohen,  Jr. 

Naval  Surface  Weapons  Center 

White  Oak 

Silver  Spring,  Maryland 

ABSTRACT 

In  this  paper,  a  statistical  analytic  model  for  evaluation  of  the  performance 
of  a  standard  electric  bomb  fuze  timer  is  presented.  The  model  is  based  on 
what  is  called  a  selective  design  assembly,  where  one  item,  namely,  a  resistor, 
is  used  to  time  the  circuit.  In  such  an  assembly,  the  remaining  components  are 
chosen  a  priori  from  predetermined  distributions.  Based  on  the  analysis,  a  gen- 
eral numerical  integration  scheme  is  utilized  for  assessing  performance  of  the 
timer.  The  results  of  a  computer  simulation  are  also  given.  In  the  last  section 
of  the  paper,  a  theory  for  evaluation  of  the  yield  of  two  or  more  timers 
designed  to  operate  in  sequence  is  derived.  To  appraise  such  a  scheme,  a  nu- 
merical quadrature  routine  is  developed. 


1.   INTRODUCTION  AND  PHYSICAL  DESCRIPTION 

In  this  paper,  we  shall  be  concerned  with  the  statistical  analysis  of  the  bomb  fuze  timer 
shown  in  Figure  1.  As  is  common  in  practice,  a  standard,  or  precision,  resistor  is  used  to  time 
the  circuit  after  the  rest  of  the  components  have  been  assembled  in  a  random  fashion.  Then, 
to  meet  certain  timing  requirements  to  be  discussed  later,  a  resistor  is  selected  and  introduced 
into  the  circuit.  A  number  of  tests  must  afterwards  be  performed  in  sequence  to  check  the  per- 
formance of  the  product  under  differing  environmental  conditions.  Such  environmental 
influences  are,  for  example,  temperature  effects,  effect  of  packaging,  resistor  incrementation  (to 
be  discussed),  and  effect  of  vibration  and  moisture  uptake.  In  addition,  one  might  have  several 
timers  which  operate  sequentially,  all  fed  from  the  same  energy  storage  capacitor  CI  of  Figure 

1.  This  paper  is  devoted  to  an  analysis  of  such  a  timer  in  what  is  called  the  ambient  tempera- 
ture range,  whose  limits  are  70°F  and  80°F,  respectively.  We  will  also  indicate  the  procedure 
for  treating  analytically  the  assessment  of  performance  of  combinations  of  several  timers.  The 
author  has  been  involved  in  a  Monte  Carlo  study  for  the  Navy  of  such  timers.  Previous  work 
has  involved  reliability  studies  of  an  entire  fuze  assembly  using  these  timers  [2]. 

2.  RESISTOR  SELECTION  PROCESS 

The  timer  indicated  in  Figure  1  works  once  the  potential  difference  across  the  two  capaci- 
tors C2  and  C3  is  sufficient  to  fire  the  cold  cathode  diode  tube  VT.  Capacitors  CI  and  C3  ini- 
tially have  the  same  potential  across  them.  As  time  progresses,  CI  discharges  through  resistor 
RES  into  C2,  while  C3  serves  as  a  reference  capacitor.    Thus,  the  voltage  across  C2  builds  up 


375 


376 


E  A.  COHEN,  JR 


TIMER 

OUTPUT 


Figure  1.    Fuze  timer  configuration 

until  the  potential  across  tubes  C2  and  C3  is  adequate  to  fire  tube  VT.  The  relationship 
between  firing  time  and  the  values  of  the  circuit  components  can  be  derived  from  a  simple  first 
order  differential  equation  and  is  given  by 


(2.1) 


where 


RC,C 


t  = 


Q  +  c2 


In 


VC> 


VC,-  (VT-  V)  (C,+  C2) 


Cx  =  capacitance  of  capacitor  CI 

C2  =  capacitance  of  capacitor  C2 

V  —  supply  voltage  (potential  across  C3  and  potential  initially  across  CI) 

VT  =  firing  voltage  of  cold  cathode  diode  tube  VT 

R    =  resistance  of  resistor  RES. 

To  illustrate  the  pertinent  features  of  the  process,  write  (2.1),  for  brevity,  in  the  form 

(2.2)  t  =  RF(C,,  C2,    V,    Vj). 

Note  that  (2.2)  is  linear  and  homogeneous  in  R,  so  that  R  can  be  used  as  a  scaling  parameter. 
This  is  precisely  how  it  is  used  when  the  timer  is  first  assembled. 

In  practice,  the  resistors  are  supplied  in  large  numbers  by  the  manufacturer,  after  which 
they  are  tested  and  sorted  by  the  user  into  a  large  number  of  bins.  The  resistors  in  each  bin 
have  resistances,  at  a  standard  temperature,  which  fall  into  a  certain  interval.  These  intervals 
are  arranged  to  have  the  same  "percent  width",  to-be  described  in  more  detail  below.  The  timer 
is  to  be  designed  to  fire  at  a  nominal  time  tN.  Since  capacitors  CI  and  C2  are  chosen  at  ran- 
dom from  a  lot,  their  capacitances  C\  and  C2  may  be  treated  as  random  variables.  Likewise, 
tube  firing  voltage  VT  may  also  be  considered  as  a  random  variable.  In  general,  we  shall  also 
consider  the  supply  voltage  Fto  be  random. 


ANALYSIS  OF  FUZE  TIMER  377 

Let  us  agree  to  denote  by  R0  that  value  of  R  obtained  from  relation  (2.1)  when  t  =  tN 
and  Cb  C2,  V,  and  VT  are  given  their  expected  values  at  some  standard  temperature,  e.g., 
75°F.  For  convenience,  R0  may  be  used  as  a  reference  resistance,  and  the  bin  to  which  refer- 
ence resistor  RES0,  of  resistance  R0,  belongs  could  be  called  the  reference  resistor  bin.  The 
interval  corresponding  to  this  bin  is  to  contain  all  resistances  which  fall  between  R0  (1  —  e)  and 
R0(\  +  e),  where  e  is  a  preassigned  small  positive  number.  Our  second  bin  will  contain  all 
resistors  whose  resistances  fall  between  7?0(1  +  e)  and  R0(\  +  e)2/(l  —  e),  and  the  third  bin 
those  resistors  whose  resistances  lie  between  R0(\  —  e)2/(l  -I-  e)  and  R0(\  —  e).  In  general, 
our  intervals  are  to  be  so  constructed  that  the  ratio  of  right  endpoint  to  left  endpoint  is  always 
(1  +  «)/(l  —  e),  which,  to  first  order  accuracy,  is  just  1  +  2e.  Alternatively,  one  may  divide 
the  difference  of  the  two  endpoints  by  its  midpoint  to  obtain  precisely  2e.  We  shall,  therefore, 
say  that  each  such  interval  has  "percent  width"  2e.  In  setting  up  the  interval  division  scheme,  a 
percent  increment  e,  is  chosen  a  priori,  and  then  e  =  e,/100.  This  e,  is  typically  of  the  order  of 
1/2  to  1%.   Figure  2  is  a  diagram  of  this  scheme. 


-o- 


M-e)2        Ro(l-€)         R0(J-€)      R0       R0(l+€)  R0(l+€)  R0(l+€) 


2 


l  +  €  l+€  1-6  J-6 

Figure  2.    Resistance  interval  setup 

Once  again  referring  to  our  circuit  configuration,  where  C\,  C2,  K  and  VT  are  random 
variables,  let  us  define 

(2.3)  f0-  R0F(CU  C2,   V,   VT). 

Then,  to  achieve  the  nominal  time  tN,  we  define  our  nominal  resistance  to  be 

(2.4)  RN  -  R0tN/t0. 

Note  that,  since  tQ  is  a  random  variable  (being  a  function  of  the  random  variables  Cj,  C2,  V, 
and  VT),  RN  is  also  a  random  variable.  A  technician  may  use  relation  (2.4)  to  determine  RN. 
Then  he  picks  a  resistor  RESp  at  random  from  the  bin  to  which  resistor  RES/y  belongs  and 
integrates  such  resistor,  of  resistance  Rp,  into  the  circuit.  This  process  is  called,  in  fuze  tech- 
nology parlance,  "resistor  incrementation."  Note  that  Rp  is  a  random  variable  which  is  statisti- 
cally dependent  on  RN  inasmuch  as  Rp  and  RN  must  lie  in  the  same  interval.  However,  once 
attention  is  restricted  to  a  given  interval  of  the  scheme,  it  is  clear  that  the  value  of  RN  in  no 
way  influences  the  value  of  Rp,  since  one  is  free  to  select  any  resistor  in  the  bin  to  which  the 
nominal  resistor  belongs.  We  shall  reemphasize  this  fact  in  Section  3.  For  simplicity  we  index 
the  intervals  by  /',  letting  their  left  and  right  endpoints  be  r,  and  ri+\,  respectively.  To  achieve 
compatibility,  the  bins  should  initially  be  formed  and  kept  at  some  standard  temperature,  and 
the  timer  should  be  assembled  at  that  same  temperature.  In  practice,  this  will,  in  all  likelihood, 
not  be  the  case,  but  one  may  compensate  for  this  defect  by  studying  the  sensitivity  of  the  timer 
to  changes  in  bin  interval  width.  For  example,  if  by  doubling  the  interval  width,  the  overall 
change  in  performance  is  insignificant,  it  may  be  safely  assumed  that  such  a  discrepancy  was 
unimportant  (provided  the  distributions  due  to  ambient  temperature  variations  are  of  small 
variance). 

3.   PROBABILITY  INTERVALS  AT  THE  STANDARD  TEMPERATURE 

The  problem  of  determining  the  probability  of  operation  of  the  timer  within  two  given 
times,  say  tx  and  t2,  when  there  is  no  effect  other  than  resistor  selection  is  not  difficult.  (We 
also  ignore,  in  this  section,  the  effect  of  tube  firing  voltage  variation  from  one  firing  to  the 


378  E. A   COHEN,  JR. 

next.  This  phenomenon  will  be  discussed  in  some  detail  in  Section  4.)  The  reason  is  that  the 
time  is  linear  and  homogeneous  in  resistance  R.  In  fact,  the  bins  have  been  designed  to  take 
advantage  of  this  feature,  and  we  shall  show  that  the  probability  interval  is  independent  of  the 
bin  in  which  resistor  RESyv  falls. 

First  of  all,  let  r^'in  and  ^ax  De  the  minimum  and  maximum  times,  respectively,  obtain- 
able when  the  nominal  resistance  RN  and  the  picked  resistance  Rp  come  from  a  given  bin  /. 
Also,  let  F^'/n  and  F^lx  be  the  smallest  and  largest  values  of  F,  respectively,  given  only  tN  and 
knowing  that  RN  comes  from  that  bin.   It  follows  that 
(3.1)  fm'in  —  r,  Fm'in  =  rjtN/rl+\ 

and 

(3-2)  /'max    =   r,+  i   ^m'ax   =   ri+\   WO' 

Therefore,  given  that  RN  and  Rp  lie  in  interval  /, 

(3.3)  r,tN/r,+]  ^  t  ^  r/+1  tN/rr 
Since                   rJrM  =  (1  -  e)/(l  +  e), 

(3.4)  (1  -  «)/(l  +  e)  ^  tit*  <  (1  +  e)/(l  -  e), 
independent  of  bin  interval.   In  other  words,  (3.4)  is  true  with  probability  1. 

Generally,  suppose  that  one  is  interested  in  the  probability  that  firing  time  falls  between 
two  prescribed  limits  about  the  nominal  time.  Consider  once  more  a  given  bin  /.  Let  us  denote 
by  R^  and  Rp(l)  random  variables  derived  from  RN  and  Rp  respectively  under  the  condition 
that  RN  and,  therefore,  Rp  must  lie  in  interval  /.  From  our  discussion  in  section  2,  it  is  clear 
that  these  new  random  variables  must  be  independent.  Let  f]  and  t2  be  the  lower  and  upper 
limits,  respectively,  on  firing  time.  For  any  given  value  of  the  random  variable  Rh\  one  can  I 
determine  limits  on  the  random  variable  Rpn  so  that  the  firing  time  lies  between  t\  and  t-i.  j 
Since,  by  definition,  tN  =  Rfj']F,  it  follows  that  Rp    cannot  be  less  than 

(3.5)  r,/F=  tiR^/tn. 
Similarly,  Rp(,)  cannot  exceed 

(3.6)  t1R(Nl)/tN. 

One  must,  of  course,  realize  that  (3.5)  may  be  smaller  than  r,  and  (3.6)  larger  than  ri+\  for 
values  of  /?#  '  close  to  r,  and  ri+\,  respectively. 

If  we  let  g(R/v)  be  the  density  function  of  the  random  variable  RN  defined  by  (2.3)  and 
(2.4),  whose  range  is  a  function  of  the  domain  of  Cj,  C2,  V,  and  VT,  then  the  induced  random 
variable  /?#'  has  conditional  density 

(3.7)  g('W>)  =  g(RN)/P(r,  <  RN  ^  rl  +  ])  =  g(RN)/  f'M  g(RN)dRN. 

The  range  of  R^n  is  restricted  to  the  interval  [r„  r,+1].  Using  the  mean  value  theorem  of 
integral  calculus,  (3.7)  becomes 


(3.8)  gU)(R^)  =  g(RN)/git)(ri+]  -  r,),  r,  <  f  <  r, 
If  r,+1  —  r,  is  sufficiently  small,  one  sees  that 

(3.9)  gU)(R^n)  =  l/(r,+1-r,). 


/+!• 


ANALYSIS  OF  FUZE  TIMER 


379 


Similarly,  let  f{l)(RpU))  be  the  density  function  for  picked  resistance  Rp{'\  whose  range  is  like- 
wise restricted  to  [r,,  ri+l].  Then,  with  the  knowledge  that  R^l]  and  Rp(l)  are  independent  ran- 
dom variables,  and,  letting  Pj(t\  <  t  <  t2)  be  the  probability  that  firing  time  falls  between  /] 
and  t2  (given  that  RN  and  Rp  come  from  interval  /), 


(3.10) 


■',«#'/' 


(ri  <  r  <  h)  =  f  +'  f  2  "  '*  g<»Ullp)fi»Ull»)dRydR!P 

Jr,        JtlRfr)/tN 


We  take  the  liberty  of  defining  fU)(Rp(n)  =  0  in  (3.10)  whenever  /?„(/)  tf  [r„  r/+1].   This  is  done 
purely  for  the  sake  of  convenience  of  notation  even  though  the  range  of  RpU)  is  [r,,  ri+x\. 

The  probability  that  the  time  falls  between  tx  and  t2  is  expressed  by 

oo 

(3.11)  Pit,  <  t  <  t2)  =    X  ^(r,  <  r  <  t2), 

where  /?,•  is  the  probability  of  choosing  bin  /'. 


As  we  have  previously  indicated,  if  ri+l  —  r,  is  sufficiently  small,  we  can  assume,  for  all 
practical  purposes,  that  R^  is  a  uniformly  distributed  random  variable.  The  picked  resistance 
Rp{l)  should  also  be  a  uniformly  distributed  random  variable  if  all  resistors  in  bin  /  are  equally 
likely  to  appear.   In  other  words,  let  us  assume  that 


(3.12) 


.(/>i 


«■(  i  i 


(,), 


g{,,W>)  =  fn(Rpw)  =  l/(r/+1-r,). 


Suppose  then  that  one  asks  for  the  probability  that  tx  =  tN{\  —  8)  <  t  <  tN{\  +  8)  =  t2  for  a 
given  small  8.  We  proceed  to  derive  closed  form  expressions  for  this  probability.  Three  cases 
naturally  arise,  the  first  of  which  is  shown  in  Figure  3  below.  For  brevity,  we  shall  drop  the 
superscript  /  in  this  figure  and  the  two  following  figures.  In  this  diagram,  the  interior  of  the 
quadrilateral  formed  by  the  lines  RN  =  r,,  RN  =  ri+\,  Rp  =  t\RN/tN,  and  Rp  =  t2RN/tN  is  the 
region  of  integration.  Note  that,  in  the  two  hatched  regions,  f{l){Rp'])  =  0,  since  then  either 
Rp  <  r,  or  Rn  >  r, 
equivalent  to 


p  <  r,  or  Rp  >  r/+1.    After  a  small  computation,  one  sees  that  the  inequality  r^0)  <  ffi  is 


(3.13) 


0  <  8  <  €. 


We  also  note  that,  using  (3.12),  (3.10)  represents  the  normalized  area  of  the  interior  of  the 


hexagon    shown    in    Figure    3,    bounded    by    the    lines    RN  =  rh    RN 
Rp  =  t2RNltN,  Rp  =  rh  and  Rp  =  rl+].   Therefore, 


Rp  —    t\^-NltN-> 


(3.14) 


P,{tN{\  -8)  <  t  ^  tN(\  +8)) 


1 


+ 


C'i+1  "  r,)2 

r,/(l-5)  J( 


,/-,/(l-5) 


r-  J  r 


(\+h)R^ 


dR(,)  dR^ 


(\+6)R^') 


dRU)  dRil] 


(l-S)/?^' 

+    f"+]n    «    f',+1      M«(,)«#J 

Jr,+  I/(l+8)  J(\-8)Rk<)  N 


8 

86  2 


(1  +  e)2(2  +  8)         (l-e)2(2-8) 


1+8  1-8 

It  follows  that  P,  is  independent  of  /.    From  (3.11), 
(3.15)  PUi  <  t  <  t2)  =  P,{t\  <  t  <  h). 


0  <  8  <  e. 


380 


E.A.  COHEN,  JR. 


R^=t2RN/t 


Rp    Rp=ri+'~^\ 


ou 


R^N 


Rp2,=t.RN/tN 


%= 


ri     r(0)r(l] 
N         N 


r. 


i+l 


Figure  3.    Picked  resistance  versus  nominal  resistance  (Region  1) 


The  second  case  occurs  when  r,  ^  r,y  ^  rN  ^  ri+\-  This  situation  is  indicated  in  Fig- 
ure 4.  One  can  also  show  that  r^0)  =  r,  +  1  when  8  =  2e/(l  +  e)  and  that  r^l)  =  r,  when 
8  =  2e/(l  —  e).  Therefore,  the  situation  illustrated  in  Figure  4  occurs  when 
e  <  8  ^  2e/(l  +  e).  A  third  case  will  occur  when  2e/(l  +  e)  <  8  <  2e/(l  —  e),  as  illustrated 
in  Figure  5,  where  the  dotted  region  is  now  a  pentagon.  For  8  ^  2e/(l  —  e),  the  dotted  region 
becomes  the  interior  of  a  rectangle  completely  enclosed  in  the  sector,  so  that  the  probability 
becomes  unity.  In  the  third  case,  one  sees  that  r,  <  r^n  ^  r,+1  ^  r^0).  When  one  integrates 
over  the  interior  of  the  quadrilateral  outlined  in  Figure  4,  one  again  obtains  the  closed  form 
given  in  (3.14).  Therefore,  (3.14)  is  valid  whenever  0^8^  2e/(l  +  e).  The  case  illustrated 
in  Figure  5  is  different.  When  we  integrate  over  the  interior  of  the  pentagon,  which  is  that  por- 
tion of  the  region  of  integration  for  which  the  integrand  of  (3.10)  is  nonzero,  we  find  that 


(3.16) 


Pi(tN(\  -  8)  <  t  <  tN(\  +  8))  = 


4e2  +  4e(l  +e)8  -  (1  -  e)282 


8e2(l  +8) 


2e 


1   +€ 


<  8  < 


2e 


One  easily  shows  that  (3.16)  becomes  unity  when  8  =  2e/(l  —  e)  is  substituted. 

4.    PROBABILITY  INTERVALS  AT  AMBIENT  TEMPERATURE  BEFORE  POTTING 


The  analysis  of  the  timer  when  temperature  and  cold  cathode  diode  firing  voltage  varia-1 
tions  are  considered  is  different  from  that  of  the  previous  section,  since  all  components  except 
for  the  resistor  enter  the  time  nonlinearly.  It  would  then  be  necessary,  at  least  in  principle,  to 
take  into  consideration  the  probabilities  p,  of  picking  the  bins  as  well  as  the  probabilities  for1 
picked  resistance  once  a  bin  has  been  selected.  However,  if  the  variations  due  to  these  effects 
are  relatively  small,  one  should  again  see  probabilities  essentially  independent  of  the  bin 
selected.  Furthermore,  in  a  situation  like  this  wherein  certain  distributions  are  quite  tight,  i.e., 
are  of  small  variance,  some  simplifying  assumptions  can  be  made.  We  shall  get  to  these 
presently.  Again,  as  before,  we  assume  that  the  bin  intervals  are  so  small  that  we  may  reason- 
ably suppose  that  (3.12)  is  true.  Note  also  that  (2.3)  and  (2.4)  express  RN  in  terms  of  tN,  C\, 
C2,  K,  and  VT.  Assume  now  that  C,,  C2,  V,  and  VT  are  independent,  normally  distributed  ran-; 
dom    variables.     Suppose,    as    is    common    in    practice    when    coefficients    of   variation    are 


ANALYSIS  OF  FUZE  TIMER 


R(3,  =  R, 


381 


R  =  r.  .  . 

p 


•fHvu 


Figure  4.    Picked  resistance  versus 
nominal  resistance  (Region  2) 


Rl'^t2RN/t, 


rL3,  =  rn 


Ri2)=',fVtN 


>-R, 


Figure  5     Picked  resistance  versus 
nominal  resistance  (Region  3) 


small  [4,  pp.  246-251],  that  RN  is  linearized  about  the  expected  values  of  capacitances  C\  and 
C2,  supply  voltage  V,  and  tube  breakdown  voltage  VT.  Now  a  linear  function  of  independent, 
normally  distributed  random  variables  is  again  a  normally  distributed  random  variable,  and, 
from  (2.3)  and  (2.4),  it  follows  [4,  pg.    118]  that 

-l 


(4.1) 
and 

(4.2) 


Z7/-D      \  ~      Ov(Q,£    +    C2,£-) 

E{RN)  = 


C,FC 


l,£^2,£ 


in 


VfC 


£M,£ 


VEC\,E-  (Vt.e-  VE)(ChE  +  C2iE) 


var  (RN)  = 


dV 


\  2 


var  C]  + 


var  V  + 


ac2 

dVr 


var  C2 


var  VT. 


382 


E.A.  COHEN,  JR. 


Here  the  subscript  E  indicates  evaluation  at  expected  values  and  var  represents  the  variance 
operator.   Now,  clearly, 

dRN 


(4.3) 


dC, 


*n    dF     .      ,  . 

=  -^aq',  =  1'2' 


with  similar  expressions  for  dRN/d  V  and  dRN/d  VT.    The  relevant  partial  derivatives  of  F  are 
given  by 


(4.4) 


dF             Cl 

1                    VT-V 
X 

ec,      cx  +  c2 

C,  +  C2   ?              Y 

BF             C, 

[       C,                  C2(Kr- 

-  n 

dC2      c,  +  g 

c,  +  c2  'r  '          r 

dF        C,C2 

dVT           Y 

dF       C,C2VT 

dV            VY      ' 

and 


*  =  /« 


KG 


KG-  (Kr-  K)(G  +  C2) 

y=  vcx-{vT-  v){cx  +  c2). 

Now  /?,  represents  the  probability  of  choosing  bin  /',  and  that  is  precisely  the  probability  that  the 
random  variable  RN  belongs  to  bin  /.  Furthermore,  because  we  are  now  assuming  that  RN  is  a 
linear  function  of  the  independent,  normal  random  variables  G,  G,  K  anc*  Kr,  RN  is  likewise 
normal.   Therefore,  letting  £  =  E(RN)  and  cr2  =  \ar(RN),  one  has 

2 

1 


(4.5) 
where  v. 


Pi" 


1 


r 


2 

e 


r-i 


dr  = 


X. 


e    2      d\, 


(Ty/ln     Jr<  V27T 

(rj  —  £)/o-  and  v2  =  (r,+1  —  £)/o\  so  that  p,  may  be  readily  calculated  from  tables. 


Supposing  that  the  picked  resistor  and  the  other  components  are  subject  to  a  temperature 
change  from  the  standard  temperature,  we  must  compute  the  effect  of  such  a  change,  together 
with  the  resistor  incrementation  effect  of  Section  3,  in  order  to  obtain  the  probability  of  satisfy- 
ing the  specification.  It  will  be  assumed  in  our  analysis  that  the  ambient  temperature  is  a  uni- 
formly distributed  random  variable  whose  range  is  given  by  Tx  ^  T  ^  T2.  If 
P(t\  ^  t  ^  t2\T)  is  the  probability  of  meeting  the  time  limits  for  a  given  temperature  T,  then, 
clearly, 

(4.6)    P(t^t*:t2)  =  fT2PU]^t^t2\T)p(T)dT=        \        fT2  Pit,  <  t  ^  t2\T)dT. 


Let  us  give  an  example  of  the  computation  of  the  nominal  resistor  distribution.  Suppos- 
ing in  (4.1)  that  tN  =  2.6  seconds,  ChE  =  .44 /uf,  C2E  =  .15fif,  VE  =  177v.,  and  VTE  =  235v., 
one  finds  that  E{RN)  =  40.16  megohms.  Also,  one  finds  from  (4.3)  and  similar  expressions, 
upon  inserting  expected  values,  that  dR^/dC\  =  8.65,  dRN/dC2  =  —  293.12,  dR^/d  V=—  1.26, 
and  dRNldVT=  —  0.95.  Let  us  assume  the  following  standard  deviations:  <t(C\)  =  0.0073, 
o-(G)  =  0.0025,  a(V)  =  0.17,  and  a(Vf)  =  4.17,  where  Vf  is  used  to  denote  the  expected 
breakdown  voltage  of  a  diode  chosen  from  a  lot.  The  expected  values  of  the  breakdown  vol- 
tages of  all  the  tubes  are  themselves  assumed  to  follow  a  normal  distribution  with  expected 


ANALYSIS  OF  FUZE  TIMER  383 

value  235v.  and  with  the  above  o\  In  addition,  each  tube  has  a  firing  voltage  which  varies 
about  its  expected  value.  This  new  random  variable,  with  expected  value  0,  we  denote  by  A  VE, 
and  it  is  assumed  that  A  VE  is  also  normally  distributed.  The  random  variable  VT,  which 
represents  the  firing  voltage  of  a  tube  selected  from  a  lot,  is  actually  formed  as  a  sum 
VT  =  Vj  +  A  VE,  where  we  shall  suppose  that  A  VE  is  independent  of  Vf.  Also,  tests  per- 
formed by  fuze  specialists  indicate  that  the  random  variables  A  VE  have  the  same  distribution 
from  one  tube  to  the  next.  Assuming  that  <t(AVe)  =  0.24,  it  follows  that  a-(VT)  =  4.17. 
Then,  from  (4.2),  var  RN  =  16.3235,  or  a-(RN)  =  4.04.  Therefore,  the  coefficient  of  variation 
is  0.10,  which  is  reasonably  small. 

We  now  develop  a  general  method  for  evaluating  the  performance  of  the  timer  which  is 
based  on  a  linear  theory.  Hopefully,  this  theory  will  yield  at  least  conservative  estimates.  Our 
formula  is  a  generalization  of  that  given  in  paragraph  3.  First  of  all,  from  (2.3)  and  (2.4),  it 
follows  that 

(4.7)  RN=  tN/F(Ch  C2,    V,    VP), 

where  V±l)  =  Vf+kV^.  Therefore,  solving  (4.7)  for  Vf,  where  F(CU  C2,  V,  Vj-X))  is  given 
through  (2.1)  and  (2.2),  one  finds  that 

(4.8)  Vf  =  VCJ{CX  +  C2)  -  (AKi])  -  V)  -  VCX  ^'"^^/(C,  +  C2). 

Here  Ceff  =  \/C\  +  1/C2  is  the  effective  series  capacitance  of  C\  and  C2,  and  A  V^x)  denotes 
that  variation  in  tube  firing  voltage  from  its  expected  value  which  is  associated  with  determina- 
tion of  the  nominal  resistance  RN.  For  brevity,  we  let  g(C\,  C2,  V,  RN,  A  F^")  represent  the 
right  hand  side  of  (4.8).  There  is,  however,  a  second  variation,  which  we  shall  denote  by 
A  VE2),  that  occurs  once  a  resistor  has  been  selected  from  a  bin  and  the  timer  actually  operated. 
These  two  variations  must  be  taken  into  account  carefully  when  assessing  timer  performance. 
One  may  now  make  a  1-1  transformation  from  the  space  of  (C,,  C2,  V,  AK^n,  AK^2),  Vf)  to 
that  of  (Cb  C2,  V,  A  Kin,  A  V^2\  RN)  through  the  map 

(4.9)  C,  =  C,,  C2  =  C2,    V  =  V,  A  V(EX)  =  A  V(EX\  A  Vp]  =  A  V£\ 

Vf=  g(Clt  C2,   V,  RN,  AKi1}), 

whose  Jacobian  is  dVE/BRN.  It  follows  [3,  pp.  56-62]  that  the  density  function  for  the  state 
(C,  C2,  V,  AKin,  AV&2\  RN)  is 

(4.10)  f(Cu  C2,   V,  AKiu,  A^i2),  RN)  =  Px{Cx)p2{C2)pi{V)pMy(EX)) 

■  ps^V^p.igiC,,  C2,    V,  RN,  AV^} 

■  \dVE/dRN\, 

where  /?,(C,)  (/' =  1,2)  are  the  densities  for  C,,  p3  is  the  density  for  V,  p4  the  density  for 
^VE\  p5  the  density  for  &VE2\  and  p6  the  density  for  Vf.  These  random  variables  are  all 
assumed  to  be  independent.  In  addition,  AK^n  and  A  V^2)  are  identically  distributed.  Next 
account  must  be  taken  of  the  fact  that,  because  of  a  change  in  temperature,  the  capacitances  C, 
will  change  in  value.  In  fact,  we  assume  that  C,(T),  where  T  denotes  temperature,  is  of  the 
form 

(4.11)  C,(D  =  C,(l  +  Ki(T  -  TE)/\00), 

where  Kj  represents  a  random  percent  change  per  degree  from  the  expected  temperature  TE. 
Thus  Cj(T)  is  a  product  convolution  [3,  pp.  56-62]  of  C,  and  the  second  factor,  which  we 
denote  by  ACP,(T)  (representing  a  percentage  change  in  C,  due  to  a  temperature  change  from 
expected  value  TE  to  T).    We  then  form  the  joint  density  h(Cu  C2,  AC/VD,  ACP2(D,   K, 


384  E.A.  COHEN.  JR. 

A  vp\  A  Vp\  /?yy)  from  /and  the  densities  for  these  percent  changes.  Afterwards,  h  is  multi- 
plied by  p(Rp(T)),  the  convolution  density  of  picked  resistance  at  temperature,  where 

(4.12)  Rp(T)  =  Rp(\  +  C(T  -  TE)/\00) 

and  C  is  a  random  percent  change  per  degree.  Finally,  if  we  are  interested  in  the  conditional 
density  for  any  given  bin  i,  we  must  divide  by  ph  the  probability  of  choice  of  bin  /.  It  is  clear 
that,  in  order  for  the  time  output  of  the  timer  to  fall  between  two  chosen  values  t\  and  t2, 
Rp(T)  must  lie  between 

tx/F{Cx{T),  C2(T),    V,    V±2)) 
and 

t2/F(C{(T),  C2(T),    V,   Vj2)), 
where  V^  -  Kf  +  A  vp>  with  Kf  given  by  (4.8).   Also,  from  (4.11), 

(4.13)  C,(T)  =  CACPi(T). 

Now  let  XT=  (C,,  C2,  &CP;(T),  ACP2(T),  V,  kV^\  AVp]).  There  follows  the  general 
multiple  integration  formula,  which  expresses  the  probability  P,  that  the  time  falls  between  ^ 
and  t2  for  bin  i  and  conditioned  on  temperature  T: 

r  t    I F 

(4.14)  p,P,U,  <  t  <  t2\T)  =  X  '+1  L    Bv         ,  X  /V  p(Pp(T))h(XT,RN)dRp(T)dXTdR„, 

,"ri         **  X-ftR  '(-c»,oo)  •"|/r 

where  R1{—°ot  oo)  represents  the  seven-fold  Cartesian  product  of  the  real  line.   Finally, 

(4.15)  PO,  ^  r  <  r2)  =         *         £  A   f  ^  P,(/,  <  t  ^  t2\  T)dT, 

1  2    ~      l   1       -oo  I 

given  that  the  temperature  distribution  is  uniform.  This  integration  procedure  could  be  accom- 
plished on  a  digital  computer  through  use  of  numerical  Gaussian  quadrature  and  Gauss- 
Hermite  quadrature  [5,  pp.  130-132].  However,  instead  of  using  this  general  nonlinear 
approach,  we  find  it  convenient,  in  the  present  context,  to  linearize  the  products  given  by 
(4.11)  and  (4.12)  and  to  make  use  of  a  linearized  version  of  RN  given  by 

(4.16)  RN=RN(ClE,  C2lE.   VE,   V$)  +  AX{CX-  ChE)  +  A2(C2-  C2>£) 

+  A3(V-  VE)  +  A4(VT-  Vj$), 
where,  of  course, 

oR/v  oR/\i  o/?/v  oRn 

A  i  = ,  A-,  = ,  A i  =     ,  ..  ,  Aa  = 


6C,  '      l       bC2  '      J        bV  '      4       bVT 

are  evaluated  at  the  expected  values  for  the  components  and  VJ-X)E  represents  the  expected  value 
of  random  variable  VJ-X).    (4.11)  now  becomes 

(4.17)  C,m  =  ChE{K,  -  KiE){T-  r£)/100  +  C,(l  +  KlE(T-  TE)/\00). 
where  KiE  represents  the  expected  value  of  Kh  and  (4.12)  becomes 

(4.18)  Rp(T)  =  [1  +  CE(T-  TE)/\00]Rp  +  RC(C  -  CE)(T  -  TE)/100, 

where  Rc  is  the  center  of  the  bin  considered.  Note  that  the  effect  of  (4.17)  and  (4.18)  is  to 
replace  product  convolutions  by  convolutions  of  sums  of  random  variables  when  it  comes  to 
computing  densities.  Also,  supposing  that  t\  =  tN{\  —  8)  and  t2  =  tN{\  +  8),  the  limits  on  the 
innermost  integral  of  (4.14)  become  tx/F  =  (1  -  8)//v//7and  t2/F=  (1  +8)tN/F,  respectively,  i 


ANALYSIS  OF  FUZE  TIMER  385 

The  functional  form  tN/F  is  to  be  replaced  by  the  linearized  version  (4.16)  with  C\(T),  C2(T), 
and  Vt2)  substituted  for  Cb  C2,  and  VT,  respectively.  We  have,  therefore,  after  a  small  com- 
putation, 

(4.19)  txlF=  (l-8)[RN  +  A^CX{T)  +  A2bC2(T)  +  ^4(A  V^2)  -  A  V^)\ 
and,  likewise, 

(4.20)  t2/F  =  (1  +B)[RN  +  AlbCl{T)  +  A2AC2(T)  +  ^4(A^2)  -  A^")], 

where  AC,(D  =  C,(T)  —  C,.  When  C\,  C2,  V,  and  Fr  are  independent,  normally  distributed 
random  variables,  the  analysis  is  a  bit  simpler,  since  it  is  easily  seen  that,  in  this  case,  the  pair 
(RN,  A  K^n)  is  bivariate  normal  [3,  pg.  162].  In  addition,  one  notes  that  (4.19)  and  (4.20)  do 
not  depend  on  C\,  C2,  and  V,  in  the  linear  analysis.  In  Section  6,  we  present  a  numerical 
example  following  this  procedure.  It  may  be  noted,  by  analogy  with  the  development  in  para- 
graph 3,  that  the  condition  txlF  <  R(T)  ^  t-J F  is  equivalent  to  requiring  that  R(T)  lie 
between  two  hyperplanes  in  the  six-dimensional  (RN,  ACj,  AC2,  A  V^\  A  VP\  R(T))  space. 

5.   PROBABILITY  INTERVALS  AT  AMBIENT  TEMPERATURE  AFTER  POTTING 

When  the  timer  is  actually  packaged,  or  potted,  this  procedure  will  produce  statistical 
changes  in  the  component  values.  These  changes  are  known  in  the  trade  as  potting  shifts. 
Such  shifts  can  be  taken  into  account  by  convolutions  of  the  densities  previously  determined 
with  those  densities  evolving  from  the  operation  of  potting.  This  has  an  effect  on  such  items  as 
the  picked  resistor,  the  capacitors,  and  the  voltage  regulator.  Generally,  potting  shifts  are 
represented  as  percentage  changes  from  previous  values,  and,  therefore,  strictly  speaking,  we 
have  another  product  convolution  to  consider.  For  example,  we  represent  the  value  of  resis- 
tance due  to  temperature  and  potting  by 

(5.1)  Rpol(T)  =  RP(T)(\  +CHG,/100), 

where  the  subscript  pot  denotes  potting  and  CHGi  represents  a  random  per  cent  change  from 
the  value  of  picked  resistance  at  temperature.  If  we  linearize  Rpot(T)  about  nominal  values,  we 
find  that 

(5.2)  RPoST)  =  (1  +  CHG LE/ 100) R(T)  +  RE(T)(CHGX  -  CHG,  f)/100, 

where  Re(T)  is  the  expected  value  of  picked  resistance  at  temperature  for  the  given  bin  and 
CHG]  £  is  the  expected  value  of  CHGJ.   From  (4.18),  this  is  given,  to  a  first  approximation,  by 

(5.3)  RE(T)  =  [1  +  CN{T  -  TN)l\00]Rc, 

where,  as  before,  Rc  is  the  center  of  the  bin  interval.  As  for  the  capacitances,  we  assume  a 
form 

(5.4)  C,.p0l(r)  -C,(D(1  +  CFKVlOO), 

so  that  we  would  linearize  C,  pot(D  about  nominal  values  in  a  manner  analogous  to  that  for 
Rpol(T).   Lastly,  the  voltage  regulator  value  after  potting  is  representable  by 

(5.5)  Kpol  =  V  +  CHG3. 

Hence,  we  need  only  go  back  through  our  analysis  with  Rp(T)  replaced  by  flpot(r),  C,(D 
replaced  by  C,pot(D,  and  V  replaced  by  Kpot.  It  is  assumed  that  VT,  the  cold  cathode  diode 
tube  firing  voltage,  is  unaffected  by  potting.  One  more  integration,  corresponding  to  CHG3,  is 
introduced  in  order  to  take  account  of  the  change  in  regulator  voltage  due  to  potting. 


386  E.A.  COHEN,  JR. 

6.   NUMERICAL  RESULTS 

Using  a  CDC  6600  computer,  we  were  able  to  develop  a  computer  code  which  can  be 
used  to  predict  efficiently  the  performance  of  the  timer  under  the  linearity  assumptions  outlined 
in  the  two  previous  paragraphs.  The  integration  scheme  developed  will,  in  this  paragraph,  be 
discussed  in  some  detail.   A  listing  of  the  computer  code  used  can  be  provided  on  request. 

First  of  all,  in  (4.18),  we  assume  that  Rp  has  a  uniform  distribution  across  the  bin  which 
is  being  considered  and  that  C  is  normally  distributed.  Let  us  suppose,  as  an  example,  that 
CE  =  -0.0235,  TE  =  75°F,  and  a-(C)  =  0.0078.   Then,  of  course,  from  (4.18), 

(6.1)  RP(T)  =  [1  -  0.0235(7*  -  15)/\00)RP  +  RC(C  +  0.0235)(T  -  75)/100. 

Therefore,  Rp(T)  is  a  sum  of  two  independent  random  variables,  one  of  which  is  uniform  and 
the  other  of  which  is  normal  and  of  mean  0.   It  follows  that 

(6.2)  p(RAT))  = 


2tt  (0.000078)  |  T  -  15\Rc(ri+l  -  r,)[\  -  0.000235(7-  75)] 

2 


X 


( 1-0.000235(7--  75)  )r+)      ~j 
l-0.000235(r-75))r, 


Rp(T)-u 


R  JO.  000078)  (r-75) 


du. 


Letting 


v=  (u  -  /?„ (D)//?r (0.000078)  |  T  -  75|, 
(6.2)  is  converted  into 

(63>  P(R'm>=  V5F(„»,  -  „)[■  -0.000235(7-75)]   S^' ''^  "-■ 

where 

(6.4)  v,  =  [(1  -  0.000235(7-  75))r,  -  Rp (T)]/Rc (0.000078)1  T  -  75| 
and 

(6.5)  v2=  [(1  -  0.000235(7-  75))r,+1  -  Rp (T)]/Rc (0.000078)1  T  -  75 1 . 

Several  cases  now  arise  according  to  the  value  of  RP(T)  and  according  to  whether  or  not 
T  ^  75°F.  We  first  consider  the  case  when  T  ^  75°.  Let  us  develop  an  inequality  which 
allows  us  to  assert  that  Vj  <  —  3.    In  fact,  suppose  that 

(6.6)  RP(T)  -  t;  >  kx  r, (0.00023 5)(T  -  75), 

where  kx  is  to  be  so  determined  that  v,  <  —  3  is  valid.  Upon  substituting  (6.6)  into  (6.4),  one 
has 

(6.7)  v,  ^  -  {kx  +  l)r,(0.000235)//?f(0.000078). 

Remembering  that  rjRc  =  1  -  e,  we  find  that  fcn  =  e/(l  -  e)  will  yield  the  requisite  inequality. 
Next  let  us  obtain  an  inequality  which  will  permit  us  to  say  that  v2  ^  3.   Suppose  that 

(6.8)  rf+I  -  RP(T)  >  fc2r,+1(0.000235)(r-75). 
Then,  from  (6.5),  we  have 

(6.9)  v2>  Wc2-  l)(l  +  e). 

The  right  side  of  (6.9)  equals  3  when 
£2=  (2 +  €)/(!  +e). 


ANALYSIS  OF  FUZE  TIMER  387 

Thus,  if,  for  T  >  75, 

(6.10)  r,-(e)  =  r,(l  +  -r1—  (0.000235) (T  -  75))  ^  R(T) 

1  —  € 

<  r,+1(l  -  771  (0.000235)(T  -  75))  =  r/+1(e), 
it  follows  from  (6.3)  that 

(6U)  P(R"(T))  =    (r,+1-r,.)[l-  0.000235(7-75)]- 

Now  suppose  that  r  <  75.   Letting  Rp(T)  -  r,  >  k3  r,  (0.000235)  (T  -  75),  it  follows  that 

(6.12)  v,  <  3(/c3+  1)(1  -c). 

The  right  side  equals  -  3  when  /c3  =  -  (2  -  e)/(l  -  e).  Again,  assuming  that  rj+x  — 
RP(T)  >  k4  rl+x  (0.000235)(r  -  75),  we  have 

(6.13)  v2>-3(l+e)(Jfc4-l), 

which  equals  3  when  k4  =  e/(l  +  e).   Therefore,  when  T  <  IS  and 

(6.14)  r/(e)  =  r,(l  -  \:zJ-  (0.000235)(r  -  75))  <  R(T) 

1  —  e 

<  rl+1(l  -  — !—  (0.000235) (T  -  75))  =  r/+1(e), 

1    +  € 

(6.11)  is  again  satisfied.  Next  let  us  go  back  to  the  case  when  T  ^  75,  but  let  us  now  require 
that  v2  <  —  3.   We  find  that  such  is  true  when 

(6.15)  RAT)  >  r,+1  -  — f—  rM  (0.000235) (T  -  75). 

1    +  € 

Since  v2  ^  —  3  also  implies  that  vt  <  —  3,  we  can  assume  that  p(R(T))  =  0  in  this  case. 
Likewise,  one  finds  that  V)  ^  3  whenever 

(6.16)  RAT)  ^  r,  -  |— ^  r,  (0.000235)  (r  -  75), 

1   —   € 

so  that,  in  this  range,  p(Rp(T))  =  0,  also.   When  T  <  75,  v2  <  -  3  whenever 

(6.17)  RAT)  >  rM-  —^  r/+1  (0.000235) (T  -  75), 

1   +  € 

and  V!  ^  3  when 

(6.18)  /?„(D  ^  r,  +  — ^—  r.  (0.000235)  (T  -  75). 

1  —  e 

Again  it  follows  that  p(Rp(T))  =  0.  Now  there  remain  certain  intervals  in  which  p(Rp(T)) 
cannot  be  treated  as  constant  for  a  given  temperature.  For  example,  it  is  found  that,  when 
T  ^  75  and 

(6.19)  r,+1(e)  =  r/+1  (1  -  4^  (-000235)(r  -  75)) 

1  +e 

<  Rp(T)  ^  r,+1(l  -  — —  (.000235)  (T  -  75))  =  s,+1(e), 

1  +  e 

-  3  <  v2  <  3  while  v,  ^  -  3.   Also,  in  the  interval 


388 
(6.20) 


E.A   COHEN,  JR. 


S,-(e)  =  r,(l  - 


<   r,{\  + 


T — L  (.000235)(r-  75))  <  RAT) 

1   —  6 

(.000235) (T-  75))  =  r,(e). 


1  -  6 


3  <  v,  <  3  while  v2  >  3.   When  T  <  75,  p{RAT))  cannot  be  treated  as  constant  whenever 


(6.21)       s/(e)  -  r,(l  + 


1  -€ 


(.000235)(T-  75))  <  ^(D  <  r/(e) 


or 


(6.22)        r/+1(e)  <  RAT)  ^  ri+l  (1  - 


2  +  6 
1    +6 


(.000235) (T-  75))  =  s/+i(e). 


The  intervals  so  developed,  in  which  the  behavior  of  p(Rp(T))  is  examined,  are  very 
important  in  the  numerical  study  conducted  on  the  CDC  6600.  We  now  set  up  the  precise  pro- 
cedure used  in  the  computer  study,  first  of  all,  referring  to  (4.19)  and  (4.20),  we  find  it  a  little 
more  natural  to  integrate  with  respect  to  AC,(D  or  AC2(D  first  instead  of~Rp(T).  We  see 
then  that  our  region  of  integration  is  fully  specified  by 

(6.23)     fi(bC2.Rp(T),  R^.AV^.AV^)  ^  AC,  ^  /2(AC2>  Rp(T),  RN,  A  V^\  A  V^) 

— oo  <  AC2  <  +°° 

-oo  <  Rp(T)  <  +<*> 

— oo    <   A  VP}   <   oo 
-oo    <    A  VP]    <    oo 

r,  ^  RN  ^  r/+] 
Tx  <  T  <  T2, 

where,  for  A ,  >  0, 

1 


(6.24) 


/i  = 


b¥L-A4C2-RN-A4(AVF-*vP) 


^l  -A2AC2-RN-  A4(*  VP  -  A  Vp) 


and  the  inclusion  of  negative  values  for  Rp(T)  is  merely  a  mathematical  artifice.    The  density 
function  for  this  process  then  has  the  following  form: 


(6.25) 


MAC,,  AC2,  RN,  RP(T),  A  V^\  A  Vp}) 

=  pMC,)p2(AC2)pARp(T))p4(Rn,  AVE\x))ps(AV(E2))lpi{T2-  Tx). 


The  densities  /?,,  p2,  and  p5  are  all  normal  densities.  The  mass  function  p3  was  ascertained  in 
(6.3).  p4  is  a  bivariate  normal  density,  and  p,  is  the  probability  of  being  in  bin  /'.  It  is  easy  to 
determine  the  correlation  coefficient  p  for  p4.  Multiplying  A  V^l)  by  R^,  as  given  by  (4.16),  we 
have 

(6.26)  E(RNA  VP})  -  A4  £(A  F^")2 

=  ^4o-2(A^1)), 

and,  since  the  expected  value  of  A  V^  is  zero,  cov(RN,  AV^U)  =  £(/?/vAK^1)).  It  follows, 
using  (6.26),  that 

(6.27)  p  =  ^4o-(A^1))/o-(/?^). 


ANALYSIS  OF  FUZE  TIMER 


389 


The  factors  in  (6.25),  other  than  p3,  are  given  by 


pMCx)  = 


P2(AC2)  = 


1 


(27r),/2o-(AC1) 

1 

(2tt)1/2o-(AC2) 


exp 


exp 


AC,  -  £(AC.) 


cr(AC,) 
AC2-  £(AC2) 


(AC2) 


P4(RN,±vn 


Lih  = 


27ro-(i?^)o-(A^1))Vl  -p2 


exp 


-1 


2(1  -p2) 


^?v  —  E(RN) 


-2p 


R, 


(t(Rn) 
E(RN) 


+ 


[AKi» 

-  £(A^iu) 

cr 

(AKi») 

0"(/?yv) 


AFi" 


£(A^1)) 


o-(AKin) 


/>5(A^2))  = 


1 


(27r)1/2o-(AKi2)) 


exp 


f  A  V^ 

-£(A^2)) 

a 

(A  Vp]) 

where  p  is  given  by  (6.27)  and  p4  is  the  well-known  joint  normal  density  for  two  variates  [7, 
pp.  111-114]. 

Now  let  K,E  =  .04  for  /  =  1,2  in  (4.17)  and  trlK,)  =  .013,  /  =  1,2.  Recall  from  our  dis- 
cussion in  paragraph  IV  that  C\  E  =  .44p,/,  C2£=.15p,/,  E{RN)  =  40.16  megohms, 
,4,  =  8.65,  A2  =  -  293.12,  ^4=-0.95,  and  a(RN)  =  4.04.  In  addition,  suppose  that 
£(A^n)  =  £(AF^2))  =  0ando-(AKin)  =  cr(AKi2))  =  0.2357.   Then  it  is  seen  that 

£(AC,)  =  0.000176(7-75),  o-(AC,)  =  0.00005874|  T-  75|, 

£(AC2)  =  0.00006(7-75),  ando-(AC2)  =  0.00002034|r-75|. 

Next  we  make  several  changes  of  variable.    Let 

(6.28)  u  =  (AC,  -  £(AC!))/V2(7(AC,) 

w  =  (AC2  -  £(AC2))/V2o-(AC2) 

z  =  v/V2 

w,  =  (/?„  -  E(RN))/y/2(l  -  P2)<t(Rn) 

wx=  AKin/V2(l  -p2)o-(AKin) 

w2=  A^2)/V2o-(AKi2)). 

Then  (6.25)  becomes 


(6.29) 


*9- 


vr 


7r3(r,+1  -  /-,)[!  +  Cyv(r- 75)/100] 


-! ;.: 


2  e--'  dz 


-  ( « j |  —  2p  u  |  h>  |  +  >v  |  ) 


e        ■  e      '        '   '     '  /Pi(T2  -  T,). 
Now  one  finds,  by  completing  the  square,  that 

(6.30) 


-<U|2-2pW|  W,+  >V|2)  -(Hi|-pu,)2-(l-p2)u  ,2 


Next  we  let  w3  =  W[  -  pw,  and  u2  =  Vl  -  p2 w,.   Our  integrand  becomes 


390 
(6.31) 


h  = 


E  A   COHEN.  JR. 
1 


7rHrl+l  -  r,)[l  +  CN(T  -  75)/100] 

2  2  2 

e     3  •  e     7  ■  e     1lp\T2-  T}). 


dz 


For  brevity,  set  Y  =  (w,  h>2,  w3),  and  let  /?3(-oo,oo)  denote  the  usual  three-dimensional 
Euclidean  space.  Also,  put  u2i  =  (r,  -  E{RN))/Jl  v{RN)  and  «2./+i  -  C^+i  -  E(RN))/y/2 
a(RN).   Then  our  integration  scheme  becomes 


(6.32) 
where 


P,(^(l  -8)  <  f  <  f„(l  +8))  =  Ax  +  A2, 


J75     /•"2/+1     T  /'r/(e)     fF7(Y.uvR) 

r    J  J        ,  ^K^2c/r    j   .  .     f  sl    h{u,Y,R,u2)dudR 


,/■',(«)     j.FAY.u^.R 


F2(Y,urR) 
xL,(«)     /.FJY.UtR) 


+  J.M       L,v     ,»    h(u,Y,R,u2)dudR  +   I  ,    ,.     I    ,v     _,    h{u,Y,R,u2)dudR 

Jr^t)         J  F^kY.u^.Ri  Jrl+iU)     J  F\(Y.u2.R)  l 

and  /42  is  obtained  by  using  75  and  T2  for  limits  on  the  T  integration  in  place  of  Tx  and  75, 
respectively,  with  primed  quantities  replaced  by  unprimed  quantities.   In  addition,  we  have  set 

(6.34)  F1(r,«2,JR)=F1(w,w2,w2(w3)/?)=[/1(AC2//?(AK^),A^1),/?^)-£(AC1)]/V2o-(AC1) 
F2(Y,u2,R)=F2(w,u2,w2,wi,R)^[fMC2,R,AVP),AV^),RN)-E^C[)]/^2(r(^C0. 

Now  /,  and  f2  were  defined  in  (6.24),  and,  from  the  changes  of  variable  given  by  (6.28),  we 
have 

(6.35)  AC2=  £(AC2)  +  V2wo-(AC2) 

AFi2)  =  V2o-(AKi2,)w2 

A^n  =  >/2(l  -p2)o-(A^n)(»V3  +  pw2/Vl  -p2) 

/?;v  =  £(/?/v)  +  J2<r(RN)u2. 

Our  computer  code  is  just  the  implementation  of  a  nesting  procedure,  making  use  of  Gaussian 
and  Hermite-Gaussian  quadrature  routines,  together  with  routines  to  evaluate  the  error  integral 
[5,  pp.  130-132],  [6,  pp.  319-330],  [8],  [1,  pg.  924].  It  turned  out  to  be  convenient  and  numer- 
ically accurate  and  timewise  efficient  to  employ  three  Gauss  points  per  integration  step. 

The  effect  of  cold  cathode  diode  firing  voltage  variations  in  this  problem  is  more 
significant  than  that  of  ambient  temperature  departures  from  nominal.  In  our  case  study,  for 
example,  when  e  =  .01  and  8  =  .02,  P,  was  essentially  91%.  With  8  =  .03,  this  figure  was 
increased  to  almost  100%.   Results  for  six  bins  with  e  =  .01  and  8  =  .03  are  given  in  Table  1. 

TABLE  1  —  Performance  of  Fuze  Timer 
for  Representative  Bins 


Pi 

/■/ 

r,+\ 

Re 

.994848 

37.4435 

38.2000 

37.8218 

.995103 

38.2000 

38.9717 

38.5858 

.995221 

38.9717 

39.7590 

39.3653 

.995414 

39.7590 

40.5622 

40.1606 

.995452 

40.5622 

41.3817 

40.9719 

.995490 

41.3817 

42.2176 

41.7997 

ANALYSIS  OF  FUZE  TIMER  391 

It  is  seen  that  the  probability  is  essentially  the  same  independent  of  the  bin.  Running  time  for 
this  problem  was  approximately  four  seconds  per  bin.  Indeed  one  would  reason,  as  in  para- 
graph 3,  that,  at  least  approximately,  each  bin  should  yield  the  same  probability  for  firing  time, 
given  a  8  —  e  combination.  This  should  occur  if  the  nonlinearities  are  not  too  severe  and  the 
distributions  due  to  change  in  temperature  and  cold  cathode  diode  firing  voltage  variations  are 
fairly  compact.  This  would  then  mean  that  we  need  only  examine  one  bin  to  determine  the 
performance  of  the  timer,  and  our  integration  procedure  could  then  represent  a  substantial  time 
saving  over  a  Monte  Carlo  simulation. 

Going  back  to  (6.32),  we  can  also  give  an  error  bound  for  the  part  neglected  in  the  com- 
putation of  Ph   Let  us  illustrate  in  one  case  what  is  happening.   For  instance,  we  have  neglected 

(6.36)  L     J  J      ,  J        ,  \F(Y     -,    h{u,Y,R,u2)dudYdRdu2dT 
Clearly,  (6.36)  is  bounded  above  by 

(6.37)  f   2  f"2',+1   f      A  f°°  ,     h{R,Z,u2)dRdZdu2dT, 

Jis    Jti2j       ~'ze/?4(-°°,oo)  *,s,+i(e) 

where  Z  =  (u,  Y).   Noting  that  h{u,w,R,W3,u2,w2)  =  g{u,w,WT,,u2,w2)p(R)  and  that 
frf'f      *  g(Z,u2)dZdu2dT=  1, 

We  need  only  study  the  behavior  of  the  integration  with  respect  to  R.  Going  back  to  (6.37), 
when  si+\  <  R  <  °°,  we  know  that  V)  ^  v2  <  —  3.  Therefore,  it  is  easy  to  show  that 

(6.38)  p(RAT))  <  — L  e~vl/2/Rc<r(C)\T-  751/100. 

Vz7r 

It  follows  {2,  pg.  149]  that 

,  ,p(R(T))dR(T)  ^  -7==   I      e~x/2  dx  =  .00135. 

s,  +  l(e)  V27T     J3 

A  similar  result  is  obtained  when  R  is  restricted  to  the  interval  (— °°,  s,(e))  and  T  >  75°  or 
when  R  lies  in  either  (syV^e),00)  or  (-°o,  5/(e))  and  T  <  75°.  The  result  is  finally  that  the 
portion  neglected  is  bounded  above  by  .0027,  so  that  we  are  at  most  off  in  the  third  decimal 
place. 

7.  THE  CASE  OF  TWO  OR  MORE  TIMERS 

An  interesting  case  study  arises  when  there  are  two  or  more  timers  which  are  statistically 
dependent.  This  occurs,  for  example,  when,  after  the  first  timer  is  operated,  a  switch  closes 
and  a  second  timer  is  started,  the  second  one  being  fed  by  the  same  capacitor  which  fed  the 
first  timer.  Let  us  suppose,  for  instance,  that  capacitor  CI  in  Figure  1  feeds  the  second  fuze 
timer  indicated  in  Figure  6. 

At  the  end  of  operation  of  the  first  timer,  switch  S  in  Figure  6  is  thrown  into  the  position  indi- 
cated, thus  allowing  CI  to  begin  charging  up  C4.  C5  serves  as  the  reference  capacitor.  The 
second  timer  is  also  governed  by  a  simple  first  order  differential  equation,  and  one  can  show 
that  the  time  is  given  by 

(7.D       ,,^£i£ien 

Letting  4n  be  the  nominal  time  for  the  second  timer,  we  find  the  nominal  resistance  for  this 
timer  to  be 


C\V-  C2(VT-  V) 

cxv- 

-  C2(VT-  V)-  {VTA-  K)(C,  +  C4) 

392 


E.A.  COHEN,  JR. 


TIMER 
OUTPUT 


Figure  6.   Second  fuze  timer  configuraiion 


(7.2) 


D   (1)    _ 


(C,  +  c4)^" 


C\C* 


In 


VC]  -  (V}"~  V)C 


vcx  -  (V±l)-  V)C2 


ivff 


j        -    v  ;  ^  2    ~  \rt,\    ~    V)(C\  +  C4) 

where   K/n  =  Kf+AKJn  and   V$\  ==  Kf,  +A^',).    Then,  substituting  (4.8)  into  (7.2),  we 
derive  the  functional  relationship 


(7.3)  R$]  -  /?;"  (C,,  C2l  C4,  V,  RN,bV£",  V}}{). 
Solving  (7.3)  for  Vfti,  we  have 

(7.4)  VfA  =  h(CuC2,C4,  V,  RN.  R&1),  LVp\  LVgl). 

To  determine  the  joint  density  for  the  process,  we  must,  by  analogy  with  the  method  in  para- 
graph 4,  introduce  a  pair  of  diode  firing  variations  A  V^2)  and  A  Vj£\.  We  then  consider  the  fol- 
lowing transformation  of  variables: 


(7.5) 


yf.\  ■ 

=  h{C\,  C2,  C4 

c,= 

c, 

c2  = 

c2 

c4  = 

c4 

V  = 

V 

RN  = 

RN 

AVP 



A^» 

*Vfi 



A  Vft 

AKi2 

_ 

afj» 

a  vm 

_ 

Ayfl. 

K  /?„,  CA^'.A^1,') 


(1) 


(1) 


To  compute  the  density,  we  employ  (4.10)  and  the  Jacobian  of  the  transformation  (7.5)  to 
obtain 


ANALYSIS  OF  FUZE  TIMER 


393 


(7.6)       d5(Clt  C2,  C4,  V,  RN,  R$\  LVP,  AKft  AKJ2),  AK$) 

=  /(C„  C2,  K.A^.AKjP,/^)     rf,(C4)  -dMV{ExD  -dMV^) 


d,iVfA) 


bVfA 


9/?;n 


Also,  if  both  /?,v"  and  RN  are  linearized  about  nominal  values  of  capacitance,  tube  firing  vol- 
tages, and  regulator  voltage,  then  the  map 

(7.7)  R^  =  I,(C,,  C2,  C4,  Vf^VP,  V,  VluLVg\) 
RN=  I2(C„  C2,  V,  Vl  AKin) 

AK<V=A^'» 

A^"  =  AK^1' 

shows  that  (/?^l>,  ^/v,  A  V^\,  A  K^n)  is  a  quadrivariate  normal  random  vector  [2,  pg.  162].  The 
reason  is  that  all  random  variables  on  the  right  side  of  (7.7)  are  independent  and  normally  dis- 
tributed. At  the  nominal  temperature,  the  density  function  is  therefore  generally  representable 
by 

(7.8)  d6(CuC2,C4,  V,RN,R^,AV^,AV^\,  Rp,  R™  ,  AV£2\  A^j) 

=  </5(C„  C2,  C4,  V,  RN,  RJP.LVP.LVSlLVP.LVgi) 

■p(Rp)-p«HR}l)), 

where,  for  example, 

p{Rp)=  l/(r/+1-  r,) 

and 

//»  (/?;»)-  1/0$  -i>(l)) 

if  picked  resistance  is  equally  likely  across  the  bins.  (7.8),  also,  obviously  indicates  that  picked 
resistances  are  statistically  independent  of  the  other  component  values.  It  will  be  possible  to 
reduce  (7.8)  to  the  simpler  form 

(7.9)  d6(RN,  R$\  AKi»,  A  K^,  Rp,  R«\  A  V?\  A  Vg\) 

=  p{RN,Rkl\bVP,b,Vgl)     p(Rp)  VW*)  ■  piAV^)  -p(AV$) 

when  (7.7)  is  valid,  p(RN,  R^\  A  KJn,  A  v£\)  being  the  density  for  the  quadrivariate  normal 
distribution  [9,  pg.  88].  From  (7.7)  the  elements  of  the  covariance  matrix  [9,  pg.  88]  can  be 
easily  obtained. 

Next  account  must  be  taken  of  changes  in  component  values  due  to  temperature  changes 
from  the  nominal  value.  We  use  the  same  ideas  presented  in  paragraph  4,  together  with  the 
same  notation.   The  density  becomes 


I 


(7.10)  d{Cx,  C2,  C4,  V,  RN,  Rtf\  A  VP,  A  V${,  Rp(T),  Rpil)(T), 
ACP,(n,  ACP2(T),  ACP4(T)) 
=  d5(Cu  C2,  C4,V,  RN,  R#\  AV?\  AKif7,AKi2>,  AK^) 


AK«) 


.LVff. 


■  p{ACPx{T))  ■  p(ACP2(T))  ■  d(ACP4(T))  ■  p(Rp(T))  ■  p{l)(RpiU(T)), 


394  FA   COHEN.  JR. 

where  p{Rp(T))  and  p{l)(Rp{l)(T))  are  again  convolution  densities.  We  must  now  determine 
limits  of  integration.  One  requires  that  the  first  timer  fire  in  time  t,  where  tx  =  tN{\  -  8)  ^ 
/  ^  tN(\  +  8)  =  t2  and  that  the  second  timer  fire  in  time  tu\  where  r/1'  =  ^"(1  -  8(1)) 
^  r(,)  <  41}(1  +  8(l))  =  '2<n-   Therefore,  we  have 

(7-11)  r,/F<  RP(T)  ^  t-jF 

,m/Fu)  ^  Ru)(T)  <  ,(i)/F(i» 

—  °o  <  C/  <  «\  /  =  1,2,4 

-oo  <  ACP,(T)  <  oo,  /=  i;  2, 4 

-  oo  <  A  VP]  <   oo 

-  «3  <  A  Vg\  <    oo 

-  oo  <  A  VP]  <   oo 

-  oo  <  A  K^  <   oo 

''/  ^  FN  <  rl+] 

,.(1)   <-    p  (1)   ^    r(\) 

-  oo   <    J/  <   oo, 

where  F  =  FiC^T),  C2(D,  K  ^2>),  F(1)  =  F(,l(C,(r),  C2(D,  C4(D,  K,  F/2\  K/2/)  and 
yp)  =  |/f  +  a  ^2),  J//2/  =  VfA  +  A  J^2/,  as  before.  Kf  is  given  by  (4.8)  and  V$A  by  (7.5). 
Also,  (4.13)  holds  for  /  =  1,2,  and  4. 

An  integration  scheme  patterned  after  (4.14)  can  then  be  recorded  with  pu  =  prob  (RN  € 
bin  /and  R^])  6  bin  y)  in  place  of  /?,-.    (4.15)  would  then  be  replaced  by  a  double  sum: 

(7.12)  />(/,  ^  t  <  r2>  r,(,)  ^  f(1)  <  /2(n) 

1  oo        oo  _  7" 


r,-  r, 


Also,  in  the  case  where  (7.7)  is  valid,  C{  C2,  C4,  and  V  are  eliminated  and  ACP,(T)  is  to  be 
replaced  by  AC,(D.  In  addition,  /(I)/F(1)  and  f/F  become  linear  forms  in  AC^D,  AC2(D, 
AC4(r),  /?/v,  /^n,  A^'\  A^2),  Al^.y,  and  A^2|.  In  that  case  a  sixteen-fold  integral  is 
reduced  to  a  twelve-fold  integral. 

ACKNOWLEDGMENTS 

The  author  would  like  to  thank  personally  Messrs.  Larry  Burkhardt  and  John  Abell,  of  the 
fuze  group  at  this  laboratory,  for  their  generous  support  of  this  work  and  their  helpful  sugges- 
tions. Also,  he  would  like  to  thank  Mr.  William  McDonald  and  Mr.  Ted  Orlow,  also  of  this 
laboratory,  for  their  helpful  comments  and  guiding  ideas. 

REFERENCES 

[1]  Abramowitz,  Milton  and  Irene  A.  Stegun  (Editors),  Handbook  of  Mathematical  Functions, 
National  Bureau  of  Standards  Applied  Mathematics  Series,  No.  55,  June  (1964). 

[2]  Cohen,  Edgar  A.,  Jr.  and  Ronald  Goldstein,  "A  Component  Reliability  Model  for  Bomb 
Fuze  MK  344  Mod  1  and  MK  376  Mod  0",  NSWC/WOL/TR  75-123  (1975). 


ANALYSIS  OF  FUZE  TIMER  395 

[3]  Fisz,  Marek,  Probability  Theory  and  Mathematical  Statistics,  John  Wiley  &  Sons,  Inc.,  New 

York  (1963). 
[4]  Hald,  A.,  Statistical  Theory  with  Engineering  Applications,  John  Wiley  &  Sons,  Inc.,  New 

York,  Chapman  &  Hall,  Limited,  London  (1952). 
[5]  Hamming,  R.W.,  Numerical  Methods  for  Scientists  and  Engineers,  McGraw-Hill,  New  York 

(1962). 
[6]  Hildebrand,  F.B.,  Introduction  to  Numerical  Analysis,  McGraw-Hill,  New  York  (1956). 
[7]  Hogg,  Robert  V.  and  Allen  T.  Craig,  Introduction  to  Mathematical  Statistics,  3rd  Edition, 

MacMillan,  London  (1970). 
[8]  IBM  7094/7094  Operating  System  Version  13,  IBJOB  Processor,  Appendix  H:  FORTRAN 

IV  Mathematics  Subroutines,  International  Business  Machines  Corporation,  New  York 

(1965). 
[9]  Parzen,  Emanuel,  Stochastic  Processes,  Holden-Day,  San  Francisco,  London;  Amsterdam 

(1964). 


THE  ASYMPTOTIC  SUFFICIENCY  OF  SPARSE 

ORDER  STATISTICS  IN  TESTS  OF  FIT 

WITH  NUISANCE  PARAMETERS* 

Lionel  Weiss 

Cornell  University 
Ithaca,  New  York 

ABSTRACT 

In  an  earlier  paper,  il  was  shown  that  for  the  problem  of  testing  that  a  sam- 
ple comes  from  a  completely  specified  distribution,  a  relatively  small  number  of 
order  statistics  is  asymptotically  sufficient,  and  for  all  asymptotic  probability  cal- 
culations the  joint  distribution  of  these  order  statistics  can  be  assumed  to  be 
normal  In  the  present  paper,  these  results  are  extended  to  certain  cases  where 
the  problem  is  to  test  the  hypothesis  that  a  sample  comes  from  a  distribution 
which  is  a  member  of  a  specified  parametric  family  of  distributions,  with  the 
parameters  unspecified. 

1.  INTRODUCTION 

For  each  n,  the  random  variables  X\{n),  ...  ,  Xn{n)  are  independent,  identically  distri- 
buted, with  unknown  common  probability  density  function  and  cumulative  distribution  function 
f„(x),  Fn(x)  respectively.  An  m-parameter  family  of  distributions,  with  pdf  fQ(x\9u  ...  ,  9  m) 
and  cdf  F0(x;9{,  ....  9m),  is  specified,  and  the  problem  is  to  test  the  hypothesis  that  f„(x)  = 
/q{x\9\,  ...,  9  m)  for  all  x,  for  some  unspecified  values  of  9X,  ...,9m. 

In  [5]  the  simpler  problem  of  testing  the  hypothesis  that  f„{x)  =  fo(x),  where  fo(x)  is 
completely  specified,  was  discussed.  In  this  simpler  case,  the  familiar  probability  integral 
transformation  can  be  used  to  reduce  the  problem  to  that  of  testing  whether  a  sample  comes 
from  a  uniform  distribution  over  (0,1).  This  type  of  reduction  is  not  always  available  when  the 
hypothetical  density  is  not  completely  specified.  (See  [1]  for  some  cases  where  the  reduction  is 
available.) 


will 


Since  we  will  be  interested  in  large  sample  theory,  to  keep  the  alternatives  challenging  we 
assume  that  f„(x)  =  f0(x;9],  ...,9%)  (1  +  r„(x))  for  some  unknown  values 
. .  ,  9®„  and  some  unknown  function  rn(x  )  satisfying  the  conditions  sup  |/"„(x)|  <  n~*  and 


<  n  e  for  all  n  and  for  j  =  1,2,3,4,  where  e  is  a  fixed  value  in  the  open  interval 


d1 

r„(x) 

Su  jj 

X 

dx> 

\ 

1        1 

3' 

2 

'Research  supported  by  NSF  Grant  No.  MCS76-06340. 


397 


398 


L.  WEISS 


case.    That  is,  f0(x;9h92)    =   —  g 


x  -  0, 


The  case  where  m  =  2,  and  0,,  02  are  location  and  scale  parameters  respectively  is  rela- 
tively simple  to  analyze,  and  occurs  often  in  practice,  so  until  Section  5  we  will  discuss  only  this 

with  9 2  >   0,  and  the  pdf  g(x)  is  completely 

<  Ai  <  °°  for  j  = 


specified.    G(x)  denotes  g(t)dt.    We  assume  that  sup 

J -co  / 

1,2,3,4,  and  that  sup   g{x)  <  A2  <  °°. 


d'      t   \ 
^8M 


tions: 
(1.1) 


For  each  n,  we  choose  positive  quantities  /?„,  q„,  and  L„  satisfying  the  following  condi- 

p„  <  q„  <  1  -  «"e. 


(1.2;  npn,  nq„,  Ln,  and  K„  = are  all  integers. 


(1.3) 


(1.4) 


,  3 


lim  — - —  =  1  for  some  fixed  8  in  the  open  interval 


lim  p„  =  0,   lim  qn  =  1,    lim  npn  =  °o. 


°H 


(1.5) 


(1.6) 


(1.7) 


bn  =  inf 


g(x):G~' 


Pn 


l  +  n 


<x  <  G' 


Qn 

1 

—  ri 

-e 

>  n  y for  a  fixed 


positive  y  with  —  -  e  +  28  +  5y  <  0. 


lim 


°o,  lim 


//-«»    np„  '  n-oo    A7  (1  —  q„) 

g(G-HP„)) 


g(x) 

g(G-Hq„)) 

g(x) 


>  A3  >  0  for  all  x  <  G~l(p„),  and 

>  A4  >  0  for  all  x  >  G~x{qn). 


y,(n)  <    Y2{n)  <  <   Y„(n)  denote  the  ordered  values  of  *,(«),    •  ..,X„(n).    For 

typographical  simplicity,  we  denote    Y,(n)  by   7,.    For  j  —   1,  ...  ,  Kn,  let   ^(z?)  denote  — 

(Y*Pn+jL„     +     YnPn+<J-l)L),    and    let    D,(n)    denote    ( J^+yz.,,    -     YnPn+(i_x)L).     For    y    = 
1,  ...  ,  Kn  —  \,  let   W"(lj>),  ...  ,   W'(L„—l,j,n)  denote  the  values  of  the  L„  -  1  variables 

among   {X\(n),  ...  ,  Xn(n)}    which   fall   in   the   open   interval      Yj(n) 

D,{n)  ' 


^,  ?,<■>  * 


,  written  in  random  order:  that  is,  the  same  order  in  which  the  corresponding  elements 

W'iiJ.n)  -  Yj(n) 
of    \X\(n),  ....  Xn(n)}    are    written.      Define     W(i,j,n)    as „  ,    „  for    i 


Dt(n) 


SPARSE  ORDER  STATISTICS  399 

1,  ....  L„  -  1  and  j :  =  1,  ....  Kn,  so  -  —  <  W(i,j,n)  <  —  .    Let  Wij.n)  denote  the  (L„  - 

l)-dimensional  vector  { W{\,j,n),  . . .  ,  W{Ln  —  \J,n)}  for  j  —  1,  . ..,  K„.  Let 
Wi\,0,n),  ...  ,  Winp„  —  1,0, n)  denote  the  values  of  the  npn  —  1  variables  among 
\X\(n),  ...  ,  X„(n))  which  fall  in  the  open  interval  (— °°,  Ynp  )  written  in  random  order.    Let 

W(0,n)    denote    the    vector    [W(l,0,n) W(npn-l,0?n)}.     Let     W(\,Kn  +  \,n),  . . .  , 

Win  —  nq„,K„  +  \,n)  denote  the  values  of  the  n  —  nq„  variables  among  \X\in) X„in)} 

which  fall  in  the  open  interval  (Ynq  ,°°),  written  in  random  order.    Let   W(K„  +  \,n)  denote 

the  vector  { W(\,K„  +  \, «),...  ,  Win  -  nqn,K„  +  l,n)}.  Let  Tin)  denote  the  (K„  +  l)- 
dimensional  vector  { Ynp  +JL  ;  j  =  0, 1,  . . .  ,  Kn).   Note  that  if  we  are  given  the  Kn  +  3  vectors 

defined,  we  can  compute  the  n  order  statistics  Yx,  ■  ■  ■  ,  Y„,  so  that  any  test  procedure  based  on 
the  order  statistics  can  also  be  based  on  the  K„  +  3  vectors. 

Let  hn(j(n))  denote  the  joint  pdf  for  the  elements  of  the  vector  Tin),  and  let 
h?n(w(i,n)  \t(n))  denote  the  joint  conditional  pdf  for  the  elements  of  the  vector  W(i,n),  given 

that  Tin)  =jin).   Then  the  joint  pdf  for  all  n  elements  of  all  the  vectors  is  hnijin)) 

hi'n(w(i>n)  Li(«)),  which  we  denote  by  hn{1). 

Next  we  construct  two  different  "artificial"  joint  pdfs  for  the  n  elements  of  the  vectors. 

In  the  first  artificial  joint  pdf,  the  marginal  pdf  for  Tin)  and  the  conditional  pdfs  for 
Wi0,n)  and  WiK„  +  \,n)  are  the  same  as  above.  The  pdfs  for  the  elements  of  the  other  vec- 
tors are  constructed  as  follows. 


Let  a tin)  denote  G~ 


np„  + 


1 

7~T 


,  and  y ,in)  denote  ^—  — — — t   x^  ,  for  j  =  1,  . 


g' ia /in)) 

In    g2iaiin)) 

Kn.  Let  UiiJ)  (/=1,  ....  L„—\;j=\,  ...  ,  K„)  be  IID  random  variables,  independent  of 
Tin),  W_iQ,n),  W  iK„  +  \,n),  and  each  with  a  uniform  distribution  over  (0,1).  Then  the  dis- 
tribution of  Wii,j,n)  is  to  be  the  distribution  of  —  —  +  (1  +y/(n))  UiiJ)  -  y iin)  U2ii,j),  for 
/  =  1,  ....  Ln  —  1  and  /'  =  1 Kn.    Denote  the  resulting  joint  pdf  for  all  n  elements  by 

h(7) 
"n 

In  the  second  artificial  joint  distribution,  the  marginal  pdf  for  Tin)  and  the  conditional 
pdfs  for  Wi\,n),  ...  ,  WiKn,n)  given  Tin)  are  the  same  as  in  /?„(1).  Given  Tin),  the  np„  - 
1  elements  of  Wi0,n)  are  distributed  as  IID  random  variables,  each  with  pdf 
giix-9?)/9$)/6$G  (O%,n-0,o)/02°)  for  x  <  YnPn,  zero  if  x>  YnPn.  Given  T  in),  the  n-  nqn 
elements  of  WiKn  +  l,n)  are  distributed  as  IID  random  variables,  each  with  pdf 
(U/02°)  g  iix  -  0,o)/02°)/(l  -  G  HYnqn  -  9?)/9$))  for  x  >  YnQn,  zero  if  x  <  K„v  Denote  the 
resulting  joint  pdf  for  all  n  elements  by  /?„<3). 

If  Sn  is  any  measurable  region  in  ^-dimensional  space,  let  P  {l)iSn)  denote  the  probability 
assigned  to  S„  by  the  pdf  /z„(,).   The  next  two  sections  are  devoted  to  proving  the  following: 

THEOREM  1:  lim  sup  \P.i2)iS„)  -  P.U)  iS„)\  =  0. 
THEOREM  2:  lim  sup  \P.  („(5„)  -  PhW(S„)\  =  0. 


400 


L.  WEISS 


2.  PROOF  OF  THEOREM  1 


Let  h„     denote  the  joint  pdf  which  differs  from  h„  '  only  in  that  y;(n)  is  replaced  by 


(2) 


y;(n),  denned  as 


Ln    f'n(<Xj(n)) 


,  where  atj(n)  =  F'n 


=  f-1 


npn  + 


J~l 


It  was  shown  in 


[8]  that  lim  sup  \P  (4>(S„)  —  P  (\)(Sn)\  =  0,  and  thus  Theorem  1  will  be  proved  if  we  can  show 

n— *oo      Sn  "n  "n 

that  lim  sup  \P  (2)  (S„)  —  P  w(Sn)\  =  0.    By  the  reasoning  used  in  [8],  this  last  equality  will 

n— *°°       S  n  n 

n 

be  demonstrated  if  we  can  show  that 

hnl2)(T(n),  W(0,n) W(K„  +  \,n)) 


log 


h„w(T(n),  W{Q,n) W(K„  +  l,n)) 


=  Rn, 


say,  converges  stochastically  to  zero  as  n  increases,  when  the  joint  pdf  is  actually  h}2\    From  . 
the  definitions  above,  and  the  formula  in  [8],  for  all  sufficiently  large  n  we  can  write  Rn  as 


(2.1) 


7  I    I 

Z   7-1     '=1 


log[l+y}(n)-4yj(n)W(U,n)] 

-  \og[\+y  j(n)-4yj(n)W(iJ,n)] 


where  W(i,j,n)  have  the  same  distribution  as  -  —  +  (\+y  l(n))U(i,j)  -  y j(n)U2(ij).    We 

show  that  the  expression  (2.1)  converges  stochastically  to  zero  as  n  increases  by  means  of  three 
lemmas.    (The  order  symbol  0(  )  used  below  has  the  usual  interpretation.) 


LEMMA  2.1:    max    |y,(«)|=0U    3+*+2> 


PROOF:  Directly  from  the  assumptions  and  the  definition  of  y ,(n). 
LEMMA  2.2:     sup     |  F~]  (t)  -  {0i°  +  02oCrl(/)}|  -  0(/Te+?). 


n  =    =  ^  n 


1 


PROOF:  Since  /„(x)  =  ~  g 
x-0f 


e°2 

x-0? 


9°2 


x-0 


0 


0? 


(l  +  r„(x)),  with  \r„(x)\  <  n  €,  we  have  F„{x)  = 


—  —  Cx      1 

+  Rn(x),  where  R„(x)  =  J       -y  g 

y2 


t-v\ 


0  1 


rn(t)dt,  and  thus  \R„(x)\  <  n~eG 


.   Then  we  can  write  F„  (x)  =  G 


x-0 


(!  +  /?„  (x)),  where  \R„(x)\  <  n'1  for  all 


x.    Fix  any  value  /  in  the  closed  interval  [p„,q„].   Writing  Fn(x)=  t  =  G 


we  have  x  =  F„  Ht)  and  G 
(2.2)  G~l 

We  can  write  G~x 


(^1(r)-01°) 


x-0 


0  \ 


0,° 


(!  +  /?„  (x)), 


l  +  R„(F-lU)) 


,  so 


1  +  n 

t 


<'■-'<'>-"'  <G 


1    +    A?" 


0,° 


=   (T1*/)  - 


r/7" 


1  +  «" 


1 

■ 

t 

J 

1  -  n~€ 
1 

i 

'(G~Ht*)) 

where  r*  is  in  the  open 


interval 


i 


1  +  n 


J 


,    and    thus 


SPARSE  ORDER  STATISTICS 
1 


401 


t 


-  G~Kt) 

t 


g(G-Hf)) 


<    ny ,    by    assumption    (1.5).     Then      sup 


n  „■£'&„ 


1  +  n 

that      sup 

p,,^1^,,  [  1  —  « 

using  the  inequalities  (2.2). 


=  0(/7  €+y).    By  a  completely  analogous  argument,  it  can  be  shown 


G~Ht) 


=   0(/?  e+y).    Then  the  lemma  follows  immediately, 


LEMMA  2.3:  y  An)  -  y,(«)  +  8. ■  (/?),  where    max     \b  An)\  =  0U    3 

i</<^„ 


+8-e+3y 


PROOF:  By  lemma  2.2,  we  can  write  y ,(«)  as 

L^   f;,(9?  +  0$al(n)+8l(n)) 
In    ft(0?  +  e°a,(n)+8l(n))  ' 

where     max     |8,-(/i)|    =  0(n~€+y),  f'„(x)    =    —z  g 
\<j<Kn  6" 


x  -  e 


x  -  9 


o 


99 


r'„   (x)    +    (1  +  /-„(*)) 


(020)2 


,  so  we  can  write  f'n{9\   +  #2°  <*/(«)  +  8,(«))  as    fn        #'(<*,(«))  +  8  *(«),  where 


(02°)' 


1 


max    \8'(n)  |  =  0(n~e+y).    We  can  also  write  /„(6>,°  +  6>2V(«)  +  §,•(«))  as  — -  g(a  ,■(/»))  + 
il/'S*,,  #2 

8,  («),  and  thus  fj(9?  +  02°a/  («)   +  8y(«»  as  — - j-y  g2(ai(n))   +  8*(n),  where     max 

\9 2  )  1  </ ^ ATn 

|8.-(/i)|    =    0(n"e+y)    and     max      |8*(n)|    =    0(n~'+y).     Thus  we  can   write  y  An)    as    -^ 

i</<^n  In 

{((l/(02O)2)^(«/(«))+8*(«))/((l/(e2o)2)g2(a/(«))+8*(«))},  and  the  proof  of  the  lemma  fol- 
lows directly  from  assumptions  (1.3)  and  (1.5). 

Now  we  complete  the  proof  of  Theorem  1  by  applying  the  expansion  log  (1  +  x)  =  x  — 

X2  X3  X4 

~-  +  — for  |x|  <  1,  where  |a>|  <  1,  to  each  of  the  logarithms  in  the  expres- 

2  3         4(l+a>x)4 

sion  (2.1).  This  enables  us  to  write  the  expression  (2.1)  as  the  sum  of  a  finite  number  of 
expressions,  each  of  which  can  easily  be  shown  to  converge  stochastically  to  zero  as  n  increases, 
using  the  lemmas.   For  example,  two  of  these  expressions  are: 

K„    L.-l 


(2.3) 


1 


Z   Z  (y2(«)-y2(«))>  and 


/=!     /=! 


(2.4) 


2Z    Z    (yj(n)-yj(n))W(i,j,n). 


The  expression  (2.3)  is  the  sum  of  K„(Ln-\)  terms,  where  Kn{Ln-\)  <  n.    A  typical  term 
can    be    written    as    {y  f{n)-y  j{n))    (y j(n)  +y7(«)),    which    by    Lemmas    2.1    and    2.3    is 


-y-€  +  28+5y  ±-e  +  2s  +  5y 

0(«  ).    So  the  whole  expression  (2.3)  is  0(n 


)  and  converges  to  zero  as  n 

K„     Z...-1 


increases,  by  assumption   (1.5).    The  expected  value  of  the  expression   (2.4)   is  2    £  £ 

j=  i  /=  l 

1  ]                                                                                                                                  *n  l„-\ 

y z ( « ) ,    and    the    variance    of    the    expression    (2.4)    is    4     £  £ 

f-  i  /=  i 


iy jin)  -  y j{n)) 


402 


L.  WEISS 


(y ,(n)  - y ,{n)): 


1      yf(n) 

12         180 


This  mean  and  variance  can  both  be  seen  to  converge  to  zero 


as  n  increases  by  the  same  reasoning  as  in  the  analysis  of  the  expression  (2.3),  and  thus  the 
expression  (2.4)  converges  stochastically  to  zero  as  n  increases.  The  other  expressions  in  the 
sum  comprising  the  expression  (2.1)  can  be  handled  similarly,  completing  the  proof  of 
Theorem  1. 


3.  PROOF  OF  THEOREM  2 


In  Section  2  we  showed  that  we  can  write  F„(x)    =    G 


02° 


(\  +  R„(x))  where 


|/?„(x)|  <  n  e  for  all  x.    We  now  develop  an  analogous  expression  for  1  —  F„(x).    1  —  F„{x) 


-  £"  f"{,)dt  = 


1-G 


02° 


9? 


dt 


x-9 


<     n 


+ 


\-G 


r  '*« 


X 

-e?\ 

^2°       | 

*2° 


dt,    and    since 


we     can     write     1     —     /\,(x)     = 


(1  +  S„(x))  where  |S„(x)|  <  n  f  for  all  x. 


Theorem  2  will  be  proved  if  we  can  show  that 


log 


/?„u,(7>),    W(0,n), 


W{K„  +  \,n)) 


hp](T(n),  W(0,n), 


W(K„  +  \,n)) 


=  r:„ 


say,  converges  stochastically  to  zero  as  n  increases,  when  the  joint  pdf  is  actually  /?„(1).   Assum- 
ing /?„(1)  is  the  joint  pdf,  the  conditional  (given  Tin))  distribution  of  R'  is  the  same  as  the  dis- 

np,r  1  _ 

tribution  of  &,(!)    +   Q„ (2),  where  Q„(\)    =     £  log(l    +   r„{Vi))  -   (npn  -   Dlog  (1    + 


/-l 


" —  "in 


R„(Y„P)),    and    Qn{2)    =      £   log(l    +    r„(Z,))    -    (n  -    ngn)log(l    +    Sn(Yn(l)),    and 


"p„ 


I)  Z\> 


'n-nqn 


are  mutually  independent,  each  V,  with  pdf 


<  Ynp  ,  zero  for  v  >  Ynp  ,  each  Z,  with  pdf 


fn(z) 


Fn(Ynp) 


for  v 


"p,i 


F  )  for  2  >  YnQir  zero  for  z  <  J^ 


n^'nq„' 


LEMMA  3.1:  Qn  (1)  converges  stochastically  to  zero  as  n  increases. 


"/>„-i 


PROOF:    Define    (?„(!)    as      52      r»< K/)    ~    (wa-1)/?„(K,V().     By    assumption    1.6 


;-l 


|(?„(1)  -  Q„(l)  |  converges  stochastically  to  zero  as  n  increases.   Thus  the  lemma  will  be  provec 
if  we  show  that  Qn{\)  converges  stochastically  to  zero  as  n  increases. 


I 


At) 


't-9^ 


E{rn{V}\T{n))  =■ 


0? 


(\  +  r„(t))dt 


YnPn  ~  9 1 


99 


(l  +  Rn(Ynp)) 


SPARSE  ORDER  STATISTICS 


403 


"/>„ 


9? 


°! 


r.     M-2e, 


Rn(Ynp)   +  a 


Y"P„  ~  9  1 


e2° 


^-' 


0,° 


(i  +  /?„(yB„)) 


where  |uj  <  1.  From  this,  it  follows  that  |£{r„(F,)|  r(w)}  -  K„  ( 5^)  I  =  O^-2*).  This 
implies  that  E[Qn(\)\T(n)}  converges  to  zero  as  n  increases,  and  also  that  Variance 
{rn(Vi)\T(n)}  =  0pin~2e)  which  in  turn  implies  that  Variance  [Q„  (1)|  Tin))  converges  sto- 
chastically to  zero  as  n  increases.  These  facts  clearly  imply  that  Q„i\)  converges  stochastically 
to  zero  as  n  increases. 

LEMMA  3.2:  Q„  (2)  converges  stochastically  to  zero  as  n  increases. 

_  "~"q»        _ 

PROOF:  Define  £>„(2)  as    £    rniZ,)  ~  (n  ~  nQn^n (Ynq  ).    Just  as  in  Lemma  3.1,  all  we 

have  _to  do  is  to  prove  that  Q„(2)  converges  stochastically  to  zero  as  n  increases. 
£{r„(Z,)|7>)}  = 


f°°  1        '-0i° 


\-G 


^«<?„ —  ^  i 


[\  +  S„{Y„q)] 


S,AY„a) 


\-G 


02° 


+  (ii„n~ 


\-G 


Y"q~e\ 


0? 


\-G 


Ynq.,  ~  ^  1 


9? 


[l  +  S„{YnQ)) 


where  |a»„|  <  1.    From  this,  it  follows  that  \E{rn{Z)\T(n)}  -  S„(Ynq)\  =  Qp(n~2€).   The  rest  of 
the  proof  is  similar  to  the  proof  of  Lemma  3.1. 

Lemmas  3.1  and  3.2  imply  that  R*  converges  stochastically  to  zero  as  //  increases,  and  this 
proves  Theorem  2. 

4.  CONSEQUENCES  OF  THE  THEOREMS 


Theorem  1  implies  that  a  statistician  who  knows  only  the  vectors  Tin),  \V(0,n), 
W.{Kn  +  \,n)  is  asymptotically  as  well  off  as  a  statistician  who  knows  all  the  vectors  Tin), 
WiO.n),  Wi\,n),  ....  W(Kn  +  \,n).  This  is  so  because  given  Tin),  using  a  table  of  random 
numbers  it  is  possible  to  generate  additional  random  variables  so  the  joint  distribution  of  the 
additional  random  variables  and  the  elements  of  Tin),  WiQ.n),  W(K„  +  \,n)  is  the  joint  dis- 
tribution given  by  h}2).  But  Theorem  1  states  that  all  probabilities  computed  using  /?„(2)  are 
asymptotically  the  same  as  probabilities  computed  under  the  actual  pdf  /7„(1). 


404 


L.  WEISS 


Theorem  2  implies  that  asymptotically  the  order  statistics  {K1( 


nq„+\> 


Y,,}  contain  no  information  about  r„(x).   This  is  so  because  under  //„(3)  the  conditional  distribu- 
tion (given  T{n))  of  these  order  statistics  does  not  involve  r„(x). 


Taken  together,  the  two  theorems  imply  that  a  knowledge  of  T(n)  is  asymptotically  as 
good  as  a  knowledge  of  the  whole  sample,  for  the  purpose  of  testing  whether  rn(x)  =  0.  This 
assumes  that  we  have  to  deal  only  with  the  challenging  alternatives  described  in  Section  1,  but 
less  challenging  alternatives  do  not  pose  any  problem  asymptotically. 

5.  EXTENSION  TO  OTHER  CASES 

The  results  above  were  for  the  case  where  the  unknown  parameters  are  location  and  scale 
parameters.  In  other  cases,  it  may  not  be  possible  to  choose  p„  and  qn  that  will  guarantee  that 
assumptions  (1.5)  and  (1.6)  hold  for  all  9X,  ...  ,  0,„,  if  we  want  lim  p„  =  0  and  lim  q„  =  1. 

/;— >oo  11—°° 

But  if  we  fix  p  and  q  with  0  <  p  <  q~<  1,  an  analogue  of  Theorem  1  can  often  be  proved  with 
p„  replaced  by  /?,  q„  replaced  by  q,  and  a  j(n),  y ,-(«) 


defined  as  F n  ' 


np  + 


J- 


.  ,  9, 


L^   fo(ai(n)'Jl 9 J 

In    fi(a,(n)-9u   ...  ,  9 J 


respectively, 


where  9X,  ...  ,  9m  are  estimates  of  9®,  ...  ,  0°  based  on  { Ynp, Y„p+L  ,  ...  ,  Ynq).  Then,  if  we 
are  willing  to  ignore  departures  from  the  hypothesis  in  the  tails  of  the  distribution,  we  can  stil 
use  only  the  order  statistics  {  Y„p,  Ynp+L) Y„q). 

6.  APPLICATIONS 

For  the  case  where  m  =  2  and  0b  02  are  location  and  scale  parameters  respectively,  vari 
ous  tests  based  on  T(n)  have  been  investigated  in  [2]  and  [6].  In  particular,  [2]  contains  vari 
ous  analogues  of  the  familiar  Wilk-Shapiro  test,  first  proposed  in  [3].  The  tests  in  [2]  and  [6 
were  based  on  T(n)  because  it  made  the  analysis  easier.  The  present  paper  gives  a  theoretics 
justification  for  basing  tests  on  these  sparse  order  statistics  alone. 

For  the  location  and  scale  parameter  case,  we  can  construct  other  tests,  as  follows.    Fc 


j=  0,1,  ....  K„,  let    V,(n)  denote  yfn  f„ 


np„+jL„ 


let  Z.(n)  denote  Vn  — „  g 

'  02° 


nPn+JLn 


Y  p - 

"P„+  <Ln  " 


Y  p~ l 

1  np  +jL  rn 


npn+jL„ 


np„  +  jLn 


an 


It  was  shown  in  [4]  that  for  all  asymptotic  probability  calculations,  we  can  assume  that  the  joii 
distribution  of  {  ^o^)-  •  •  •  >    ^k  ("))  's  given  by  the  normal  pdf 


c„exp 


SPARSE  ORDER  STATISTICS 


405 


n(Ln-  1) 

2L2 


L„\q 


L„vK 


+ 


^T^tr  +  ,?(v'-v'-i): 


1  15 

Under  the  additional  condition  that  — - e   +  2y   <   0,  it  can  be  shown  that  for  all 

asymptotic     probability     calculations     we     can     assume     that     the    joint     distribution     of 
[Z0(n),  . . .  ,  ZK  («)}  is  given  by  the  normal  pdf  just  described.   Then,  if  we  define  Pi  as 

Qn 


1  + 


V     np,, 

Qo- 


1 


Vl  -  Qn 


,   p2   as 


V    np„ 


Pi,   and   the   observable   random   variables    Q0, 


V^yj^\^/—(g(G-lip„))  Ynp+pxg{G-\qn))Ynq 
L,2        I  V    np„ 


/n(L,-i: 


«G- 


J  np„+jL„ 


Yn„+lL+P2g(G^(q„)Yl 


'"'„ 


np„+ij-  \)Ln 


npn  +  (/-!)/.„ 


for  7  =  1,  . . .  ,  A',,,  a  straightforward  computation  shows  that  for  all  asymptotic  probability  cal- 
culations we  can  assume  that  Q0,  Qx,  ...  ,  QK  are  independent,  each  with  a  normal  distribu- 
tion with  standard  deviation  82,  and  with 


/n(L,-\)   1       /7~ 
V  L„2         I  V    npn 

V 


/»„(0)+P  ,/»„(/:„) 


£10/}  = 


n  L"    l    [h„(j)  -  hn{j-\)+p2h„{K„)\,  for  j=  1,  ....  *„, 


where  /?„(/)  =  £ 


L  oo 


0/C-l 


=  0;°  +  02UG 


:-1 


np„+jL„ 


np»  +  jL„ 


np„+jL„ 


np„+jLn 


.   If  the  hypothesis  is  true,  F„ 


,  and  in  this  case  we  can  write  E[Qt}  =  A„(j)9\  +  Bn(j)92,  where 


An(J),  Bn(J)  are  known,  for  j  =  0,  . . .  ,  A",,.  So  we  have  reduced  our  hypothesis  testing  prob- 
lem to  the  following:  we  observe  random  variables  Q0,  Q\,  ...  ,  QK  which  are  independent 
and  normal,  each  with  the  same  standard  deviation  02°,  which  is  unknown.  The  problem  is  to 
test  the  hypothesis  that  £{(?,■}  =  An(J)0?  +  B„(J)9^,  for  some  unknown  0,°,  where  A„(J)  and 
Bn(j)  are  known  values,  for  j  =  0,1,  ...  ,  K„,  against  alternatives  that  E{Qf)  =  An(j)9\  + 
Bn(j)92  +  A  „(/"),  where  A„(/)  is  unknown. 

The  formulation  of  the  problem  just  described  makes  it  easy  to  construct  various  tests. 
For  example,  suppose  for  convenience  that  Kn  +  1  is  a  multiple  of  4.    Then  it  is  possible  to 


find  —    (AT„  +  1)   sets  of  nonrandom  quantities 


'  =  0, 


K„-3 


X„(4/),  X„(4/  +  l),  X„(4/  +  2),  X„(4/  +  3); 


such  that  the*  —  (K,,  +  \)  quantities  £>„(/)  =  X„(4/)Q,  +X„(4/  + !)(?,+,+ 


L. 

WEISS 

K„- 

3 

1 

— 

0, 

1 

4 

406 

\„(4i  +  2)Qi+2  +  \n(4i  +  3)Ql+i    /' =  0,  1,  ...  ,  — 'Lj—      can  be  assumed  to  be  independent 

normal  random  variables,  each  with  unknown  standard  deviation  0°,  and  with  E[Q„(i)}  =   £ 

—  _  /=0 

X„(4/4\/)A„(4/  +j)  =  A„(/),  say,  where  A„(/)  is  unknown.   Then  the  hypothesis  to  be  tested 

is  that  A„(/)  =  0  for  all  /.    But  if  we  examine  the  development  above,  we  see  that  (A„(/)}  is 

4/ 


not  completely  arbitrary.    Instead,  A„(/)  =  q„ 


,  where  q„  (v)  is  a  continuous  function 


Kn-l 

of  v  for  0  <  v  <  1.  If  we  have  some  particular  alternative  qn  (v)  against  which  to  test  the 
hypothesis,  a  likelihood  ratio  test  can  be  constructed.  If  we  want  to  test  against  a  very  wide 
class  of  alternatives,  we  could  apply  one  of  various  nonparametric  tests.  For  example,  we  could 
base  a  test  on  the  total  number  of  runs  of  positive  and  negative  elements  in  the  sequence 
{(?„(/)}.  If  the  hypothesis  is  true_,  there  should  be  a  relatively  large  number  of  runs,  but  if  the 
hypothesis  is  false,  neighboring  Qn(iYs  would  tend  to  have  the  same  sign,  decreasing  the  total 
number  of  runs.   Other  tests  for  an  analogous  problem  are  developed  in  [7]. 

j       -£ 

In  the  case  where  g(x)  —     /—  e    2  ,  all  the  conditions  imposed  above  hold  if  we  take  p„ 


,  _  x  1  1  Ai  A2  A2 

=   1  -  q„  -  0(«  "),  e  =  -  -  A,,  8  =  —  -  —  -  A2,  y  =  —  -  A3,  p  =  —  -  A3  -  A4, 

where  Ab  A2,  A3,  A4  are  very  small  positive  values  chosen  so  that  e  >  0,  8  >  0,  y  >  0,  and  p 

>  2A,. 

REFERENCES 

[1]  Hensler,  G.L.,  K.G.  Mehrotra  and  J.E.  Michalek,  "A  Goodness  of  Fit  Test  for  Multivariate 
Normality,"  Communications  in  Statistics,  A6,  33-41  (1977). 

[2]  Jakobovits,  R.H.,  "Goodness  of  Fit  Tests  for  Composite  Hypotheses  Based  on  an  Increasing 
Number  of  Order  Statistics,"  Ph.D.  Thesis,  Cornell  University  (1977). 

[3]  Shapiro,  S.S.  and  M.B.  Wilk,  "An  Analysis  of  Variance  Test  for  Normality  (Complete  Sam- 
ples)," Biometrika,  52,  591-611  (1965). 

[4]  Weiss,  L.,  "Statistical  Procedures  Based  on  a  Gradually  Increasing  Number  of  Order  Statis- 
tics," Communications  in  Statistics,  2,  95-114  (1973). 

[5]  Weiss,  L.  "The  Asymptotic  Sufficiency  of  a  Relatively  Small  Number  of  Order  Statistics  in 
Tests  of  Fit,"  Annals  of  Statistics,  2,  795-802  (1974). 

[6]  Weiss,  L.,  "Testing  Fit  with  Nuisance  Location  and  Scale  Parameters,"  Naval  Research 
Logistics  Quarterly,  22,  55-63  (1975). 

[7]  Weiss,  L.,  "Asymptotic  Properties  of  Bayes  Tests  of  Nonparametric  Hypotheses,"  Statistical 
Decision  Theory  and  Related  Topics,  //Academic  Press,  439-450  (1977). 

[8]  Weiss,  L.,  "The  Asymptotic  Distribution  of  Order  Statistics,"  Naval  Research  Logistics 
Quarterly,  26,  437-445  (1979). 


ON  A  CLASS  OF  NASH-SOLVABLE  BIMATRIX  GAMES 
AND  SOME  RELATED  NASH  SUBSETS 

Karen  Isaacson  and  C.  B.  Millham 

Washington  State  University 
Pullman,  Washington 

ABSTRACT 

This  work  is  concerned  with  a  particular  class  of  bimatrix  games,  the  set  of 
equilibrium  points  of  which  games  possess  many  of  the  properties  of  solutions 
to  zero-sum  games,  including  susceptibility  to  solution  by  linear  programming. 
Results  in  a  more  general  setting  are  also  included.  Some  of  the  results  are  be- 
lieved to  constitute  interesting  potential  additions  to  elementary  courses  in 
game  theory. 


1.  INTRODUCTION 

A  bimatrix  game  is  defined  by  an  ordered  pair  <A,B>  of  m  x  n  matrices  over  an  ordered 
field  F,  together  with  the  Cartesian  product  X  x  Y  of  all  m-dimensional  probability  vectors 
x  €  X  and  all  n-dimensional  probability  vectors  y  6  Y.  If  player  1  chooses  a  strategy  (probabil- 
ity vector)  x  and  player  2  chooses  a  strategy  v,  the  payoffs  to  the  two  players,  respectively,  are 
xAy  and  xBy,  where  x  and  y  are  interpreted  appropriately  as  row  or  column  vectors.  A  pair 
<x*,y*>  in  A'  x  Y  is  an  equilibrium  point  of  the  game  <A,B>  if  x*Ay*  ^  xAy*  and  x*By*  ^ 
x*By,  for  all  probability  vectors  x  and  y. 

A  Nash-solvable  bimatrix  game  is  one  in  which,  if  <x*,y*>  and  <x',y'>  are  both  equili- 
brium points,  then  so  are  <x*,y'>  and  <x',y*>.  It  is  well  known  that  0-sum  bimatrix  games 
(fly  +  bjj  =  0,  all  i,j)  are  Nash-solvable,  and  that  this  property  extends  to  constant-sum  games 
(fly  -I-  b,j  =  k,  all  ij,  for  some  k  €  F).  It  is  also  well  known  that  in  the  constant-sum  case  all 
equilibrium  points  are  equivalent  in  that  they  provide  the  same  payoffs  to  both  players.  This 
work  generalizes,  slightly,  that  contained  in  such  sources  as  Luce  and  Raiffa  (9)  and  Burger 
(2),  and  represents  a  very  small  step  toward  the  solution  of  the  open  problem  of  characterizing 
Nash-solvable  games.  In  the  following,  Ar  will  be  the  rth  row  of  A  and  A.}  the  y'th  column  of 
A,  and  similarly  for  B.  The  inner  product  of  2  vectors  u,  v  in  E"  will  be  denoted  by  (w,v).  The 
ordered  pair  is  <a,v>. 

2.  ROW-CONSTANT-SUM  BIMATRIX  GAMES 

DEFINITION  1:  An  m  x  n  bimatrix  game  <A,B>  is  row-constant-sum  if,  for  each 
I  /  =  1,  ...  m,  there  is  a  k,  €  Fsuch  that  au  +  by  =  k,,  j  =  1,  . . .  n. 

THEOREM  1:  Let  <x*,y*>  and  <x',y'>  be  two  equilibrium  points  for  a  row-constant- 
sum  game  <A,B>.   Then  <x*,^*>  and  <x',y'>  are  interchangeable,  and  they  are  equivalent 

m  m 

for  PI  (player  1).   They  are  equivalent  for  P2  (player  2)  if  and  only  if  £  x*k,  =  £  x,'/c,. 

;=1  (=1 

407 


408  K.  ISAACSON  AND  C.B.  MILLHAM 

PROOF:  It  is  well  known  and  easily  proved  that  <x*,y*>  is  an  equilibrium  point  for 
<A,B>  if  and  only  if  x*  >  0  implies  that  (A/.,  y*)  =  max(Ak ,  y*)  and  y*  >  0  implies  that 

k 

(x*,B.i)  =  max(x*,B.k),  for  all  i,  j.  Accordingly,  let  /3  *  =  x*By*.  Then  v*  >  0  implies 
(x*,fl.y)  =/3*  -  £xffc,  -  (x*,  /*.,)  >  £  xft  -  (x*„4.r)  for  all  r,  or  (x*.  >*.,)  ^  (x*.  ^.7),  and 

x*  >  0  implies  (y4,.,v*)  =  a*  =  max(/^.,  v*).    If  <x',v'>  is  any  equilibrium  point,  then  we 

k 

have  that  xMv*  ^  x  '4y*  (because  x*  is  in  equilibrium  with  y*)  ^  x'Ay'  (because  y'  is  in 
equilibrium  with  x '  and  by  the  above  argument)  ^  x*Ay'  (because  x'is  in  equilibrium  with  y') 
^  x*Ay*  (because  y*  is  in  equilibrium  with  x*  and  by  the  above  argument).  Thus  <x*,y*> 
and  <x',y'>  are  interchangeable  for  PI,  and  equivalent  for  PI.  To  show  they  are  interchange- 
able for  P2,  note  that  xBy'  =  ]T  x^kj  —  x'Ay'  =  £  x,'/c,  —  x'Ay*,  or  x'By'  =  x'By*.   One  can  simi- 

larly  show  that  x*By*  =  x*By\  completing  this  part  of  the  proof. 

Suppose  now  that  £  x,'/c,  =  £  x*k,.  Since  x*Ay*  =  x'Ay*,  we  have  that  £  x//c,  — 
x'Ay*  =  £  x*  kj  -   x*/4v*,  or  x'By*  =  x*  Z?v*,  and  equivalence  follows. 

i 

On  the  other  hand,  suppose  x*Ay*  =  x*Ay'  =  x'Ay*  =  x'Ay',  x*By*  =  x*By'  =  x'By*  = 
x'By'.  Then  £   xfk,  —  x*Ay*  =  £   x,'/c,  —  x'/4y*.   Since  x'/ly*  =  x*Ay*,  it  follows  that  £  x*/c,  = 

/  i  i 

£  X/'A:/,  and  the  proof  is  complete. 

i 

It  is  well  known  that,  if  A  (=—B)  is  the  payoff  matrix  for  a  zero-sum  game,  optimal  stra- 
tegies <x*,y*>  for  the  game  satisfy  the  so-called  "saddle-point"  property:  x*Ay  >  x*Ay*  ^ 
xAy*  for  all  probability  vectors  x  and  y,  and  that,  conversely,  if  <x*,y*>  is  a  saddle-point  of 
the  function  xAy,  then  <x*,y*>  is  a  solution  to  the  game  A. 


THEOREM  2:    <x*,y*>  is  an  equilibrium  point  of  the  row-constant-sum  game  <A,B> 
if,  and  only  if,  <x*,y*>  is  a  saddle-point  of  the  function  <$>(x,y)  =  xAy. 


PROOF:    \[<x*,y*>    is  an  equilibrium   point  of  <A,B>,  then  x*Ay*  ^  xAy*  for  all 
x  6  J,  from  which  half  of  one  implication  follows.   Now,  let  K  be  the  m  x  n  matrix 


K 


*l 

*,. 

.A:, 

*2 

*2. 

•*2 

*„ 

*„. 

./c„ 

of  row  constants   £,,  au  +  by  =  /c,. 


Since    x*By*  >  x*By   for    all    y  e   K,    we    have    x*(A^  -  /D.y*  ^  x*(K  —  A)y,    from    which 
x*Ay  ^  x*^v*  [since  x*Aj>*  =  x*Ky  =  £  Jc^fe/l .  This  completes  one  implication.  Suppose  now 

that  <x*,y*>  is  a  saddle-point  of  <I>.    From  x*Ay  ^  x*/4v*  it  follows  that  v*=  0  if'  (x*,A.j)  > 

m 

a  =  m\n(x*,A.k),  from  which,  if  y*  >  0,   £  x*A:,  -  (x*,A.,)  ^  £  x*A:,  -  (x*  ,A.k)  for  all  k,  or 

*  ,=  i 

(x*,^.,-)  >(x*,B.k)  for  all  fe  Finally,  it  follows  from  x*Ay*  ^  xAy*  for  all  x  that  x*=  0    if 
(/4,.,v*)  <  ma\(Ak.,y*),  and  the  proof  is  complete. 

k 

The  implication  is  that  any  solution  of  A  as  a  0-sum  game  is  also  an  equilibrium  point  of 
the  row-constant-sum  bimatrix  game  <A,B>,  and  conversely.   Thus,  a  solution  of  A  found  by 


NASH-SOLVABLE  BIMATRIX  GAMES  409 

linear  programming  will  provide  an  equilibrium  point  <x*,y*>  for  <A,B>  and  the  payoff  a 
for  PI.  The  payoff  fi  for  P2  must  be  calculated  via  x*By*,  or  via  £  x*kt  —  a. 

i 

3.   A  SOMEWHAT  MORE  GENERAL  SETTING 

We  now  consider  the  m  x  n  matrix  A,  we  let  B  be  m  x  n  (not  necessarily  in  row- 
constant-sum  with  A)  and  we  henceforth  let  X  x  Y  be  the  set  of  solutions  to  A  as  a  0-sum 
game.   The  following  theorem  then  follows. 

THEOREM  3:  Let  <x*,y*>  €  X  x  Y.  In  order  for  <x*,y*>  to  be  an  equilibrium  point 
of  <A,B>  regarded  as  a  bimatrix  game,  it  is  necessary  and  sufficient  that  x*By*  ^  x*By  for  all 
probability  vectors  y,  or,  for  x*(—  B)y  >  x*(—B)y*.  It  is  clearly  sufficient  for  <x*,y*>  to  also 
be  a  solution  to  (-/?),  regarded  as  a  0-sum  game. 

The  proof  is  omitted,  as  it  follows  immediately  from  the  definition  of  equilibrium  point. 
The  following  comment  is  made,  however:  if  <A,B>  is  row-constant-sum,  a  point  <x*,y*> 
that  solves  A  as  a  0-sum  game  and  is  an  equilibrium  point  of  <A,B> ,  will  not  necessarily  solve 
(-B)  as  a  0-sum  game,  because  the  condition  x*(—  B)y*  ^  x(—B)y*  holds  if  and  only  if 

m  in 

x*Ay*  —  £  xfkj  ^  xAy*  —  £  Xjkj,  or  x*  Ay*  —  xAy*^  ]£    kj(x*  —  x,).    Thus,  the  condition 

/=  1  i  i- 1 

that  <x*,j>*>  also  solve  (—5)  as  a  0-sum  game  is  extremely  strong.  This  illustrates  a  major 
difference  between  the  constant-sum  case  (in  which  the  above  condition  will  hold  if  <x*,y*> 
solves  A  as  a  0-sum  game)  and  the  row-constant-sum  case.  It  is  also  logical  to  ask  if  there  are 
conditions  on  A  and  B  which  would  cause  an  equilibrium  point  of  <A,B>  to  also  solve  A  and 
-B  as  separate  0-sum  games.  The  conditions  are  inescapable:  yf  >  0  must  imply 
(x*,A.f)  =  min  (x*,A.k)  and  x*  >  0  must  imply  {Br,y*)  =    min  (Bk.,y*).  Since,  for  example, 

k  k 

to  be  an  equilibrium  point  of  <A,B>  it  is  necessary  that  y*  >  0  imply 
(x*,B.f)  =  max  (x*,B.k),   any   game   satisfying   these   conditions   must    be    heavily   restricted. 

k 

Finally,  it  is  noted  that  if  there  are  common  saddle-points  of  A  and  (—5),  which  are  therefore 
equilibrium  points  of  the  bimatrix  game  <A,B>,  each  of  these  saddle-points  will  necessarily 
provide  the  same  payoffs  a,  (3  to  the  respective  players  (note  the  contrast  of  the  row-constant- 
sum  case  with  the  constant-sum  case). 

DEFINITION  2:  A  Nash  Subset  for  a  game  <A,B>  is  a  set  S  =  [<x,y>}  of  equili- 
brium points  for  <A,B>  such  that,  if  <x,y>  and  <x',y'>  are  in  S,  so  are  <x,y'>  and 
<x',y>.   See  (6)  and  (13)  for  related  material. 

THEOREM  4:  Let  A  and  B  be  m  x  //  matrices  over  the  ordered  field  F,  and  let  X  x  Kbe 
the  set  of  all  solutions  to  A  regarded  as  a  0-sum  game.  In  order  for  X  x  Y  to  constitute  a  Nash 
subset  of  equilibrium  points  for  <A,B>,  regarded  as  a  bimatrix  game,  it  is  necessary  and 
sufficient  that  K(X)  =    [k\  {x,*A.k)  =  min  (x*,A.,),  all  x*  €  X)c  K'  (X)  =  [k\  (x*.B.k)  = 

max  (x*.B.i),  all  x*  <E  X  ). 

PROOF:  Write  K  =  K(X),  K'  =  K'(X),  and  let  KCK'.  Then  because  any  <x*,y*>  in 
X  x  Y  solves  A  as  a  0-sum  game,  x*Ay  ^  x*Ay*  ^  x4y*  for  all  <x*,y*>  in  J  x  Y  and  all 
probability  vectors  x,  y.    Also,  .y*=  0  if  {x*,A..)    >  min  (x*,A.k),  or  if  y  #  K  C  A".    Hence 

A: 

y*=  0  if  (x*,B.i)    <  max  (x*,5.,)  for  all  >>*€   X,  any  x*  €  A\  and  <x*,.y*>  is  an  equilibrium 

i 

point  for  </L5>,  for  all  <x*,y*>  €  J  x   Y.    Suppose  there  exists  k'  €  K  -  K',  so  that  for 


410 


K.  ISAACSON  AND  C.B  MILLHAM 


some  x*  6  X,  (x*,B.k)  <  max  (x*,5.,)  but  (x*,A.k)    =    m'm(x*,A.,).    Since  it  is  known  that 

there  exists  v'  6  Y  (see  (1),  page  52)  such  that  y£-  >  0,  it  follows  that  y' cannot  be  in  equili- 
brium with  x*  for  <A,B>  regarded  as  a  bimatrix  game,  a  contradiction.  This  completes  the 
proof. 

COROLLARY  1:  Let  X*  x  Y*  be  any  subset  of  X  x  K,  the  set  of  all  solutions  to  A 
regarded  as  a  0-sum  game.  In  order  for  X*  x  Y*  to  be  a  set  of  interchangeable  equilibrium 
points  (a  Nash  subset)  for  <A,B>  regarded  as  a  bimatrix  game,  it  is  sufficient  that  K(X*)  = 
[k\  (x*,A.k)  =  min  (x*,A.,)  for  all  x*  e  X*}  =  K'(X*)  =  {k\(x*,B.k)  =  max  (**,£.,)  for  all  x* 

€  X*}. 

COROLLARY  2:  Let  X'  C  A,  and  let  A" (A")  be  denned  as  above,  and  let 
Y'=[y  €   r|v,    >  0  implies 7  6  A'(A')).   Then  A"  x  r  is  a  Nash  subset  for  <A,B>. 

Finally,  we  consider  the  construction  of  all  matrices  B  such  that  A  x  Y,  the  set  of  solu- 
tions to  A  as  a  0-sum  game,  will  also  be  a  set  of  equilibrium  points  for  <A,B>  regarded  as  a 
bimatrix  game. 

THEOREM  5:  Let  A  be  an  m  x  n  matrix  over  f,  with  Ax  K  its  solutions  as  a  0-sum 
game.  Then  a  matrix  B  can  be  constructed  such  that  Ax  Y  is  a  Nash  subset  for  <A,B> 
regarded  as  a  bimatrix  game.  The  equilibrium  points  <x,y>  in  A  x  Y  may  or  may  not  be 
equivalent  for  P2,  depending  on  construction.  Further,  all  matrices  B  such  that  Ax  Y  is  a 
Nash  subset  for  <A,B>  are  constructed  as  described. 


PROOF:  Let  x  ,  x,  . . .  xk  be  the  extreme  points  of  A,  and  assume  that  x\  . . .  xr,  r  ^  k, 

xl 


are  a  maximal  linearly-independent  subset  of  x  , 


xk.    Let  x   ~ 


be  regarded  as  the 


matrix  of  a  linear  transformation  from  Em  to  E\  taken  with  respect  to  a  basis  of  unit  vectors, 
and  let  c\c2,  ...  cm~r  be  a  basis  for  the  nullspace  of  x-    Let  01.  /32,  ■■■ftr  be  scalars.  Let  , 

v\  y2,  ...  v'  be  the  extreme  points  of  the  set   Y,  and  let  KJ  =  {/| yj  >  0}.  Let  KY  =  U  A,,/. 

Let  D  =   UKx'.rf)  =/3/(l  ^  7  <  r},  and  let  d\  ...  <T_r+1  be  m  -  r  +  1   (if  some  B,  *  0) 
linearly-independent  solutions  to  the  system  of  r  equations  in  m  variables.    For  j  €  Kv,  let 

m  —  r+\  m  —  r  m—  r  + 1 

B.j  =     £    a,,^'  +  £    \/7c'  where     £     a „=  1   for  at  least,  £  a/7  =  a  for  some  a  ^  Oj,  all 

r 

y.     Then,    if   x  €  A,    there    are    scalars    y,,  /  =  1,  . . .  r,    such    that    x=    £  y,x',    and    for 


y  €  Ky,(x,  B.,)   = 


=    Iy/j8f  Of  «-  D- 


/•  r  m  —  r  +  \  m  —  r 

r=l  [/-i  />=i  i=l  '=i 

After     all      £.,,  y€ATK,      have     been     constructed,  for     j  $  KY,  let  B.,     be     such     that 

(x'.B.j)  <  (x',B.h),  h  €  ATK,  for  all  extreme  points  x',  i  =  1,  ...A:.    Then,  for  all  y*   €    K,  x* 


€    A 


with  x' 


Erfx' 


,   x*By*  =    ^y^3,    ^  x*£v  for   all   probability   vectors    v.     Hence. 


/=i 


Ax   Y  is  a  set  of  interchangeable  equilibrium  points  for  <A,B>  that  would,  for  example,  be 
equivalent  if /3,  =  |3,  for  all  /,  / 

Finally,  suppose  there  is  a  matrix  5  such  that  Ax  Y  is  a  Nash  subset  for  <A,B>  but 
which  does  not  have  the  above  construction.    Then  there  is  a  column  B.r  j  €  KY,  such  that  j 


NASH-SOLVABLE  BIMATRIX  GAMES  411 

m—r+\  m—t  •m—r+\  m—r 

either  B.t  ^     £     «,-,</'  +  £  \y/c'  for  any   coefficients  a „,  or  #,  =     ]£     «,,</'  +  £  \/Vc'  but 

i=i  1-1  f=i  /=i 

ffi-c+i 

£    a„  =  a,  ^  aA  =  1,  k€KY,  k^j.    In  the  first  instance  we  note  (x',B./)  =  8h  /  =  1,  .../•, 
and  we  contradict  the  assumption  that  d] ,  ...  a'"'"'  +  1  are  a  maximal  linearly-independent  set  of 

m-r+\ 

solutions  to  ix'.d)  =  Bjt  j  =  1,  ...  r.    In  the  second  instance,  if    ]£      a/7  =  a,  ^  1,  let  x  = 

]£   y,x'.    Then   (x.B.,)    =  a7£   y,  B,  ^    (x,5.A)=^   -y,/3,   for  other  k  €  /Tr,  so  that  any 
/=  i  '  /=  i  i=  l 

equilibrium  strategy  y  will  either  exclude  j,  or  include  y  and  exclude  any  k  such  that  ak  =  1. 
Either  contradicts  the  definition  of  Ky. 

Note  that  the  matrix  A  is  used  only  to  define  X  x  K  Given  the  set  of  Jf  x  y,  it  follows 
that  both  A  and  5  could  be  constructed  as  described,  assuming  the  appropriate  dimensionality 
conditions. 

4.  CONCLUSIONS 

It  is  hoped  that  this  slight  extension  of  previously  published  material  regarding  Nash- 
solvable  bimatrix  games  will  lend  itself  to  inclusion  in  future  texts  in  game  theory  and  opera- 
tions research  covering  2-person,  0-sum  finite  games  (matrix  games).  Clearly,  nearly  any  state- 
ment that  can  be  made  about  solutions  of  matrix  games  can  also  be  made  about  the  somewhat 
more  interesting  row-constant-sum  bimatrix  case,  and  the  usual  methods  for  finding  such  solu- 
tions carry over with the minor modifications indicated. The reader is also referred to the excellent text by Vorobjev [21], and to his discussion of "almost antagonistic" bimatrix games (pp. 103-115), for related interesting material.
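As a quick companion to the above (not part of the original paper), the defining property of such a set can be checked numerically. The following Python sketch, with helper names of our own choosing, tests whether every pair drawn from a product set of mixed strategies is an equilibrium point of <A,B>, which is what interchangeability requires:

import numpy as np

def is_equilibrium(A, B, x, y, tol=1e-9):
    # x must be a best response to y for payoff A, and y to x for payoff B,
    # checked against all pure-strategy deviations.
    return (x @ A @ y >= np.max(A @ y) - tol) and (x @ B @ y >= np.max(x @ B) - tol)

def is_interchangeable_nash_set(A, B, X_ext, Y_ext):
    # Every combination of an x in X_ext with a y in Y_ext must be an equilibrium.
    return all(is_equilibrium(A, B, x, y) for x in X_ext for y in Y_ext)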

BIBLIOGRAPHY 

[1]    Bohnenblust,    H.F.,   S.    Karlin,   and   L.S.   Shapley,  "Solutions  of  Discrete,   Two-Person 

Games,"    Contributions  to  the   Theory  of  Games,   Annals  of  Mathematics,   Studies  24, 

Princeton  University  Press  (1950). 
[2]  Burger,  E.  Theory  of  Games.     Prentice-Hall,  Englewood  Cliffs,  New  Jersey  (1963). 
[3]  Gale,  D.  and  S.  Sherman,  "Solutions  of  Finite  Two-Person  Games,"  Contributions  to  the 

Theory  of  Games,    Annals  of  Mathematics,   Studies   24,   Princeton   University   Press 

(1950). 
[4]  Heuer,  G.A.,  "On  Completely  Mixed  Strategies  in  Bimatrix  Games,"  The  Journal  of  the 

London  Mathematical  Society,  2,  17-20  (1975). 
[5]  Heuer,  G.A.,  "Uniqueness  of  Equilibrium  Points  in  Bimatrix  Games,"  International  Jour- 
nal of  Game  Theory,  8,  13-25  (1979). 
[6]   Heuer,  G.A.,  and  C.B.  Millham,  "  On  Nash  Subsets  and  Mobility  Chains  in  Bimatrix 

Games,"  Naval  Research  Logistics  Quarterly  23.  311-319  (1976). 
[7]  Kuhn,  H.W.,  "An  Algorithm  for  Equilibrium  Points  in  Bimatrix  Games,"  Proceedings  of 

the  National  Academy  of  Sciences,  47,  1656-1662  (1961). 
[8]  Lemke,  C.E.  and  J.T.  Howson,  Jr.,  "Equilibrium  Points  of  Bimatrix  Games,"  Journal  of  the 

Society  for  Industrial  and  Applied  Mathematics,  12,  413-423  (1964). 
[9]  Luce,  R.D.,  and  H.  Raiffa,  Games  and  Decisions,  John  Wiley  and  Sons,  New  York  (1957). 
,   [10]  Mangasarian,  O.L.,  "Equilibrium  Points  of  Bimatrix  Games,"  Journal  of  the  Society  for 

Industrial  and  Applied  Mathematics,  12,  778-780    (1964). 
[11]  Millham,  C.B.,  "On  the  Structure  of  Equilibrium  Points  in  Bimatrix  Games,"  SIAM  Review 

10,  447-449  (1968). 



[12]  Millham,  C.B.,  "Constructing  Bimatrix  Games  with  Special  Properties,"  Naval  Research 

Logistics  Quarterly  19,  709-714  (1972). 
[13]  Millham,  C.B.,  "On  Nash  Subsets  of  Bimatrix  Games,"  Naval  Research  Logistics  Quarterly 

21,  307-317  (1974). 
[14]  Mills,  H.,  "Equilibrium  Points  in  Finite  Games,"  Journal  of  the  Society  for  Industrial  and 

Applied  Mathematics,  8,  397-402  (1960). 
[15]  Nash,  J.F.  Jr.,  "Two-Person  Cooperative  Games,"  Econometrica,  21,  128-140  (1953). 
[16]  Owen,  G.,  "Optimal  Threat  Strategies  in  Bimatrix  Games,"  International  Journal  of  Game 

Theory,  1,  3-9  (1971). 
[17]  Pugh,  G.E.  and  J. P.  Mayberry,  "Theory  of  Measure  of  Effectiveness  for  General-Purpose 

Military  Forces,  Part  I:  A  Zero-Sum  Payoff  Appropriate  for  Evaluating  Combat  Stra- 
tegies," Operations  Research  21,  867-885  (1973). 
[18]  Raghavan,  T.E.S.,  "Completely  Mixed  Strategies  in  Bimatrix  Games,"  The  Journal  of  the 

London  Mathematical  Society,  2,  709-712  (1970). 
[19]  von  Neumann,  J.  and  O.  Morgenstern,  Theory  of  Games  and  Economic  Behavior,  Princeton

University  Press,  Princeton,  New  Jersey  (1953),  3rd  Ed. 
[20]   Vorobjev,  N.N.,  "Equilibrium  Points  in  Bimatrix  Games,"  Theoriya  Veroyatnostej  i  ee 

Primeneniya  3,  318-331  (1958). 
[21]  Vorobjev,  N.N.,    Game  Theory  Springer-Verlag,  New  York,  Heidelberg,  Berlin  (1977). 


OPTIMALITY  CONDITIONS  FOR  CONVEX 
SEMI-INFINITE  PROGRAMMING  PROBLEMS* 

A.  Ben-Tal 

Department  of  Computer  Science 

Technion  — Israel  Institute  of  Technology 

Haifa,  Israel 

L.  Kerzner 

National  Defence 
Ottawa,  Canada 

S.  Zlobec 

Department  of  Mathematics 

McGill  University
Montreal,  Quebec,  Canada 

ABSTRACT 

This paper gives characterizations of optimal solutions for convex semi-infinite programming problems. These characterizations are free of a constraint qualification assumption. Thus they overcome the deficiencies of the semi-infinite versions of the Fritz John and the Kuhn-Tucker theories, which give only necessary or only sufficient conditions for optimality, but not both.


1.   INTRODUCTION 

A  mathematical  programming  problem  with  infinitely  many  constraints  is  termed  a  "semi- 
infinite  programming  problem."  Such  problems  occur  in  many  situations  including  production 
scheduling  [10],  air  pollution  problems  [6], [7],  approximation  theory  [5],  statistics  and  proba- 
bility [9].  For  a  rather  extensive  bibliography  on  semi-infinite  programming  the  reader  is 
referred  to  [8]. 

The  purpose  of  this  paper  is  to  give  necessary  and  sufficient  conditions  of  optimality  for 
convex  semi-infinite  programming  problems.  It  is  well  known  that  the  semi-infinite  versions  of 
both  the  Fritz  John  and  the  Kuhn-Tucker  theories  fail  to  characterize  optimality  (even  in  the 
linear  case)  unless  a  certain  hypothesis,  known  as  a  "constraint  qualification,"  is  imposed  on  the 
problem,  e.g.  [4], [12].  This  paper  gives  a  characterization  of  optimality  without  assuming  a 
constraint  qualification. 


*This research was partially supported by Project No. NR047-021, ONR Contract N00014-75-C0569 with the Center for Cybernetic Studies, The University of Texas, and by the National Research Council of Canada.




Characterization  theorems  without  a  constraint  qualification  for  ordinary  (i.e.  with  a  finite 
number  of  constraints)  mathematical  programming  problems  have  been  obtained  in  [1].  It 
should  be  noted  that  the  analysis  of  the  semi-infinite  case  is  significantly  different;  the  special 
feature  being  here  the  topological  properties  of  all  constraint  functions  including  the  particular 
role  played  by  the  nonbinding  constraints. 

The optimality conditions are given in Section 2 for differentiable convex semi-infinite programming programs, whose constraint functions have the "uniform mean value property."
This  class  of  programs  is  quite  large  and  it  includes  programs  with  arbitrary  convex  objective 
functions  and  linear  or  strictly  convex  constraints.  For  a  particular  class  of  such  programs, 
namely  the  programs  with  "uniformly  decreasing"  constraint  functions,  the  optimality  conditions 
can  be  strengthened,  as  shown  in  Section  4.  A  comparison  with  the  semi-infinite  analogs  of  the 
Fritz  John  and  Kuhn-Tucker  theories  is  presented  in  Section  5.  An  application  to  the  problem 
of  best  linear  Chebyshev  approximation  with  constraints  is  demonstrated  in  Section  6.  A  linear 
semi-infinite  problem  taken  from  [4],  for  which  the  Kuhn-Tucker  theory  fails,  is  solved  in  this 
section  using  our  results. 

2.   OPTIMALITY  CONDITIONS  FOR  PROGRAMS  HAVING  UNIFORM 
MEAN  VALUE  PROPERTY 

Consider  the  convex  semi-infinite  programming  problem 
(P) 

Min  f°(x) 

s.t. 

f^k(x,t) ≤ 0 for all t ∈ T_k, k ∈ P ≜ {1, ..., p}

x ∈ R^n

where

f^0 is convex and differentiable,

f^k(x,t) is convex and differentiable in x for every t ∈ T_k and continuous in t for every x,

T_k is a compact subset of R^l (l ≥ 1).

The feasible set of problem (P) is

F = {x ∈ R^n : f^k(x,t) ≤ 0 for all t ∈ T_k, k ∈ P}.

Note that F is a convex set, being the intersection of convex sets.

For x* ∈ F,

T_k* ≜ {t ∈ T_k : f^k(x*,t) = 0},

P* ≜ {k ∈ P : T_k* ≠ ∅}.

A vector d ∈ R^n is called a feasible direction at x* if x* + d ∈ F. For a given function f^k(·,t), k ∈ {0} ∪ P, and for a fixed t ∈ T_k, we define

D_k(x*,t) ≜ {d ∈ R^n : there exists ᾱ > 0 such that f^k(x* + αd, t) = f^k(x*,t) for all 0 ≤ α ≤ ᾱ}.



This set is called the cone of directions of constancy in [1], where it has been shown that, for a differentiable convex function f^k(·,t), it is a convex cone contained in the subspace

{d : d'∇f^k(x*,t) = 0}.

Furthermore, if f^k(·,t) is an analytic convex function, then D_k(x*,t) is a subspace (not depending on x*); see [1, Example 4]. In the sequel the derivative of f with respect to x, i.e. ∇_x f(x,t), is denoted by ∇f(x,t).

Optimality  conditions  will  be  given  for  problem  (P)  if  the  constraint  functions  have  the 
"uniform  mean  value  property"  which  is  defined  as  follows. 

DEFINITION 1: Let T be a compact set in R^l. A function f : R^n × T → R has the uniform mean value property at x ∈ R^n if, for every nonzero d ∈ R^n and every ᾱ > 0, there exists α = α(d,ᾱ), 0 < α ≤ ᾱ, such that

(MV)    [f(x + ᾱd, t) − f(x,t)] / ᾱ  ≥  d'∇f(x + αd, t)   for every t ∈ T.

If f(·,t) is a linear function in x for every t ∈ T, i.e. if f is of the form

f(x,t) = g_0(t) + Σ_{i=1}^n x_i g_i(t),

or if f(·,t) is a differentiable strictly convex function in x for every t ∈ T, i.e. if

f(λx + (1 − λ)y, t) < λ f(x,t) + (1 − λ) f(y,t) for every t ∈ T,

where y ∈ R^n is arbitrary, y ≠ x, 0 < λ < 1, and if f(x,·) is continuous in t for every x, then f has the uniform mean value property. For a linear function f one finds d'∇f(x + αd, t) = Σ_{i=1}^n d_i g_i(t), and (MV) is obviously satisfied. The mean value property for strictly convex functions follows immediately from, e.g., [14, Corollary 25.5.1 and Theorem 25.7].
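For the linear case the claim can be verified in one line (this short check is ours, in the notation above): [f(x + ᾱd, t) − f(x,t)]/ᾱ = Σ_{i=1}^n d_i g_i(t) = d'∇f(x + αd, t) for every α and every t ∈ T, since the gradient of a linear function does not depend on the point at which it is evaluated; thus (MV) holds with equality for any choice of α ∈ (0, ᾱ].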

EXAMPLE 1: The function

f^1(x,t) = t^2[(x − t)^2 − t^2] for every t ∈ T = [0,1]

is neither linear nor strictly convex in x ∈ R for every t ∈ T. However, f^1 has the uniform mean value property. The function

f^2(x_1,x_2,t) =
    x_1^2 + t x_2 (x_2 − t)                            if x_2 ≤ (1/2) t
    x_1^2 + [t^3/(2 − t)^2] (x_2 − t + 1)(x_2 − 1)      if x_2 ≥ (1/2) t

for every t ∈ T = [0,1] does not have the uniform mean value property at the origin. Note that f^2 is convex and differentiable in x ∈ R^2 for every t ∈ T and continuous in t ∈ T for every x. This function has provided counterexamples to some of our early conjectures.
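A small numerical illustration of Definition 1 may help. The Python sketch below (ours; the grid sizes and test point are arbitrary choices) searches for a single α that makes (MV) hold uniformly in t for the function f^1 of Example 1:

import numpy as np

def f1(x, t):
    return t**2 * ((x - t)**2 - t**2)

def df1(x, t):                       # derivative of f1 with respect to x
    return 2 * t**2 * (x - t)

def mv_alpha(x, d, abar, t_grid, alphas):
    # Return the first alpha in (0, abar] for which (MV) holds for every t in t_grid.
    lhs = (f1(x + abar * d, t_grid) - f1(x, t_grid)) / abar
    for a in alphas:
        if np.all(lhs >= d * df1(x + a * d, t_grid) - 1e-12):
            return a
    return None

t_grid = np.linspace(0.0, 1.0, 201)
alphas = np.linspace(1e-3, 1.0, 200)
print(mv_alpha(x=0.0, d=1.0, abar=1.0, t_grid=t_grid, alphas=alphas))
# prints a small alpha; for this f1 any alpha <= abar/2 works uniformly in t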

Optimality conditions will now be given for problem (P).

THEOREM 1: Let x* be a feasible solution of problem (P), where f^k, k ∈ P*, have the uniform mean value property. Then x* is an optimal solution of (P) if, and only if, for every α* > 0 the system

(A)     d'∇f^0(x*) < 0,

(B)     d'∇f^k(x* + α*d, t) ≤ 0   for all t ∈ T_k*,

(C)     d'∇f^k(x* + α*d, t) / f^k(x*,t)  ≥  −1/α*   for all t ∈ T_k\T_k*,

                                                        k ∈ P*,

is inconsistent.

PROOF:  We  will  show  that  x*  is  nonoptimal  if,  and  only  if,  there  exists  a  *  >  0  such 
that  the  system  (A),  (B),  (C)  is  consistent.  A  feasible  x*  is  nonoptimal  if,  and  only  if,  there 
exist  a  >  0  and  d  £  R'\  d  ^  0,  such  that 

(1)  f°(x*  +  ad)  <  fix*) 

(2)  fk(x*  +  ad.t)  ^  0  for  every  t  6   Tk, 

k  €  P. 

By  the  convexity  of  f  and  the  gradient  inequality,  the  existence  of  a  >  0  satisfying  (1)  is 
equivalent  to 

d'Vfix*)  <  0. 

By  the  continuity  of  fi-.t),  k  €  P,  the  constraints  with  k  €  P\P*  can  be  omitted  from  discus- 
sion. We  consider  (2),  for  some  given  k  €  P*,  and  discuss  separately  the  two  cases:  t  €  T* 
and  /  €  Tk\T*.   Thus  (2)  can  be  written 

(2-a)    f^k(x* + αd, t) ≤ 0 for every t ∈ T_k*

(2-b)    f^k(x* + αd, t) < 0 for every t ∈ T_k\T_k*.

Consider first (2-a) for some fixed k ∈ P*. By the convexity and uniform mean value property of f^k,

(3)    f^k(x* + αd, t) ≥ f^k(x*,t) + α d'∇f^k(x* + α_k d, t) for all t ∈ T_k*

and for some

0 < α_k ≤ α.

Since t ∈ T_k* and α > 0, (2-a) implies

(4)    d'∇f^k(x* + α_k d, t) ≤ 0.

Denote

(5)    ᾱ = min_{k ∈ P*} {α_k}.

Clearly, ᾱ always exists (since P is finite) and it is positive. By the convexity of f^k(·,t), (5) and (4),

(6)    d'∇f^k(x* + ᾱd, t) ≤ d'∇f^k(x* + α_k d, t) ≤ 0.

On the other hand, the existence of α* > 0 such that, for all t ∈ T_k* and all k ∈ P*,

d'∇f^k(x* + α*d, t) ≤ 0

implies (2-a) with 0 < α ≤ α*.




It is left to show that the existence of α > 0 such that (2-b) holds is equivalent to the existence of α* > 0 such that (C) holds. Suppose that (2-b) holds for some α > 0. Then, by the convexity and uniform mean value property, for k ∈ P*,

0 > f^k(x* + αd, t) ≥ f^k(x*,t) + α d'∇f^k(x* + α_k d, t) for all t ∈ T_k\T_k*

and for some

(7)    0 < α_k ≤ α.

Hence,

(8)    d'∇f^k(x* + α_k d, t) / f^k(x*,t)  >  −1/α,   since t ∈ T_k\T_k* and by (7).

Denote

(9)    ᾱ = min_{k ∈ P*} {α_k} > 0.

Using the monotonicity of the gradient of the convex function f^k(·,t), one obtains here

(10)    d'∇f^k(x* + ᾱd, t) / f^k(x*,t)  ≥  d'∇f^k(x* + α_k d, t) / f^k(x*,t)   for every 0 < ᾱ ≤ α_k.

This gives

d'∇f^k(x* + ᾱd, t) / f^k(x*,t)  ≥  d'∇f^k(x* + α_k d, t) / f^k(x*,t), by (10)
                                >  −1/α, by (8)
                                ≥  −1/ᾱ, by (9),

which is (C) with α* = ᾱ.


Suppose now that (C) is true for some α* > 0. Using again the monotonicity of the gradient of the convex function f^k(·,t), and the fact that f^k(x*,t) < 0 for t ∈ T_k\T_k*, one easily obtains

(11)    f^k(x*,t) + α* d'∇f^k(x* + αd, t) ≤ 0, for every 0 < α ≤ α*.

But

f^k(x* + α*d, t) = f^k(x*,t) + α* d'∇f^k(x* + α_k d, t), for some particular 0 < α_k < α*, α_k = α_k(t),
                   by the mean value theorem,
                 ≤ 0, by (11),

which is (2-b) with α = α*.

Summarizing the above results one derives the following conclusion: If x* is not optimal then, taking α* to be the smaller of the two quantities ᾱ obtained in (5) and (9), the system (A), (B) and (C) is consistent. If there exists α* > 0 such that the system (A), (B) and (C) is consistent, then there exist α_0 > 0 and ᾱ > 0 such that

(12)    f^0(x* + α_0 d) < f^0(x*)
        f^k(x* + ᾱd, t) ≤ 0 for every t ∈ T_k*
        f^k(x* + ᾱd, t) ≤ 0 for every t ∈ T_k\T_k*,
                                                      k ∈ P*.

If one denotes

α = min{α_0, ᾱ} > 0

then, again by the convexity of f^k(·,t), k ∈ {0} ∪ P, (12) can be written

f^0(x* + αd) < f^0(x*)
f^k(x* + αd, t) ≤ 0 for every t ∈ T_k,
                                          k ∈ P*,

implying that x* is not optimal.


REMARK 1: Since ∇f^k(x, ·) is continuous for every x in some neighbourhood of x* (this follows from, e.g., [14, Theorem 25.7]), condition (C) in Theorem 1 needs checking only at the points t ∈ T_k which are in

N_k ≜ ∪_{t* ∈ T_k*} N(t*),

where N(t*) is a fixed open neighbourhood of t*. For the points t in T_k\N_k one can always find an α* which satisfies (C). This follows from the fact that, for every α,

(13)    d'∇f^k(x* + αd, t) / f^k(x*,t)  ≥  −M

for some positive constant M, by the compactness of T_k\N_k. Choose M in (13) large enough, so that

(14)    α* ≜ 1/M ≤ α.

Now,

d'∇f^k(x* + α*d, t) / f^k(x*,t)  ≥  d'∇f^k(x* + αd, t) / f^k(x*,t)  ≥  −M  =  −1/α*, by (13) and (14).


EXAMPLE 2: The purpose of this example is to show that Theorem 1 fails if the constraint functions do not have the uniform mean value property. Consider

Min −x_2

subject to

f(x_1,x_2,t) ≤ 0 for all t ∈ [0,1]

where

f(x_1,x_2,t) =
    x_1^2 + t x_2 (x_2 − t)                            if x_2 ≤ (1/2) t
    x_1^2 + [t^3/(2 − t)^2] (x_2 − t + 1)(x_2 − 1)      if x_2 ≥ (1/2) t.

The function f satisfies the assumptions of problem (P) but it does not enjoy the uniform mean value property. The feasible set is

F = {(0, x_2)' : 0 ≤ x_2 ≤ 1}

and the optimal solution is x = (0,1)'. However, for every α* > 0, the system (A), (B) and (C) is inconsistent at x* = 0, a nonoptimal point. Since T* = [0,1], condition (C) is here redundant, while (A) and (B) become, respectively,

−d_2 < 0

2α* d_1^2 + t (2α* d_2 − t) d_2 ≤ 0                            if 2α* d_2 ≤ t

2α* d_1^2 + [t^3/(2 − t)^2] (2α* d_2 − t) d_2 ≤ 0              if 2α* d_2 ≥ t.

The above system cannot be consistent for any α* > 0, because, if it were, the last inequality would be absurd for small t ∈ [0,1].
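To spell out that last step (using the expression for f reconstructed above, so the computation should be read with that caveat): the first inequality requires d_2 > 0, and for 2α*d_2 ≥ t the inequality 2α*d_1^2 + [t^3/(2 − t)^2](2α*d_2 − t)d_2 ≤ 0 must hold. Letting t → 0+ forces 2α*d_1^2 ≤ 0, i.e. d_1 = 0; but then the left-hand side equals [t^3/(2 − t)^2](2α*d_2 − t)d_2, which is strictly positive for all sufficiently small t > 0, so no α* > 0 can make the system consistent.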

When the constraint functions (but not necessarily the objective function) are linear, i.e. when (P) is of the form

(L)    Min f^0(x)

       s.t.

       g_0^k(t) + Σ_{i=1}^n x_i g_i^k(t) ≤ 0, for all t ∈ T_k, k ∈ P,

then Theorem 1 can be considerably simplified.

COROLLARY 1: Let x* be a feasible solution of problem (L). Then x* is optimal if, and only if, the system

(A)      d'∇f^0(x*) < 0

(B_2)    Σ_{i=1}^n d_i g_i^k(t) ≤ 0, for all t ∈ T_k*

(C_1)    Σ_{i=1}^n d_i g_i^k(t) / [g_0^k(t) + Σ_{i=1}^n x_i* g_i^k(t)]  ≥  −1, for all t ∈ T_k\T_k*,

                                                                            k ∈ P*,

is inconsistent.




PROOF: Recall that linear functions have the uniform mean value property. If f^k(·,t) is linear, then for every t ∈ T_k

D_k(x,t) = {d ∈ R^n : d'∇f^k(x,t) = 0}.

Thus (B) reduces to (B_2). The left hand side of (C) reduces to the left hand side of (C_1), which does not depend on α*. Moreover, α* on the right hand side of (C) can be taken as α* = 1, because whenever d satisfies (A) and (B_2), so does d̄ = (1/α*) d.

□


In  many  practical  situations  the  sets  Tk,  k  €  P  are  compact  intervals  and  the  sets  T*, 
k  €  P*  are  finite.  (This  is  always  the  case  when  fix*,  •)  are  analytic  functions  not  identically 
zero.)    For  such  cases  condition  (B)  can  be  replaced  by  a  finite  number  of  linear  inequalities. 

COROLLARY 2: Let x* be a feasible solution of problem (P), where f^k, k ∈ P*, have the uniform mean value property. Suppose that all the sets T_k*, k ∈ P*, are finite. Then a feasible solution x* of problem (P) is optimal if, and only if, for every α* > 0 and for every subset Ω_k of T_k* the system

(A)      d'∇f^0(x*) < 0

(B_3)    d'∇f^k(x*,t) < 0,  t ∈ Ω_k
         d ∈ D_k(x*,t),  t ∈ T_k*\Ω_k

(C)      d'∇f^k(x* + α*d, t) / f^k(x*,t)  ≥  −1/α*   for all t ∈ T_k\T_k*,

                                                         k ∈ P*,

is inconsistent.


An important special case of Corollary 2 is when the sets T_k themselves are finite. Then problem (P) can be reduced to a mathematical program of the form

(MP)    Min f^0(x)

        s.t.

        f^k(x) ≤ 0, k ∈ P.

This is obtained by setting T_k = {k_1, k_2, ..., k_{card T_k}} and identifying {f^k(x, k_i) : k_i ∈ T_k, k = 1, 2, ..., p} with {f^k(x) : k ∈ P ≜ {1, 2, ..., Σ_{k=1}^p card T_k}}. Here P* = {k ∈ P : f^k(x*) = 0}. Also, {D_k(x*, k_i) : k_i ∈ T_k, k = 1, 2, ..., p} is denoted by {D_k(x*) : k ∈ P}.


The  major  difference  between  the  semi-infinite  problem  (P)  and  the  mathematical  prob- 
lem (MP)  is  that  for  the  latter  the  condition  (C)  is  redundant;  Theorem  1  then  reduces  to  the 
following  result  obtained  in  [1,  Theorem  1]. 



COROLLARY 3: Consider problem (MP), where {f^k : k ∈ {0} ∪ P} are differentiable convex functions: R^n → R. A feasible solution x* of (MP) is optimal if, and only if, for every subset Ω of P* the system

d'∇f^0(x*) < 0

d'∇f^k(x*) < 0, k ∈ Ω

d ∈ D_k(x*), k ∈ P*\Ω,

is inconsistent.

PROOF: Here condition (C) becomes

d'∇f^k(x* + α*d) / f^k(x*)  ≥  −1/α*,

for some α* > 0. Since here the set P\P* is finite, and hence compact, the redundancy of condition (C) is shown as in Remark 1.

□ 

The  following  result  gives  a  characterization  of  a  unique  optimal  solution  of  problem  (P). 

THEOREM 2: Let x* be a feasible solution of problem (P), where f^k, k ∈ P*, have the uniform mean value property. Then x* is a unique optimal solution of problem (P) if, and only if, for every α* > 0 there is no d satisfying conditions (B), (C) and

(A_1)    d'∇f^0(x*) < 0 or d ∈ D_0(x*).

PROOF: Suppose that the system (A_1), (B), (C) is inconsistent. Then so is the system (A), (B), (C). Hence, by Theorem 1, x* is an optimal solution. Suppose that x* is not a unique optimal solution. Then there exist ᾱ > 0 and d ≠ 0 such that x̄ = x* + ᾱd is feasible, which implies that d satisfies (B), (C) and f^0(x*) = f^0(x* + ᾱd). Since the set of all optimal solutions of a convex program is convex, the latter implies f^0(x*) = f^0(x* + αd) for all 0 ≤ α ≤ ᾱ, i.e., d ∈ D_0(x*). Thus d satisfies (A_1), (B) and (C), which is impossible. Therefore x* is the unique optimum. The necessity follows by a similar argument.

□ 

3.   OPTIMALITY  CONDITIONS  FOR  STRICTLY  CONVEX  FUNCTIONS 
IN  THEIR  ACTUAL  VARIABLES 

This  section  can  be  skipped  without  hindering  the  study  of  Section  4. 

In  order  to  state  our  next  result,  which  is  a  characterization  of  optimality  for  a  subclass  of 
convex  functions,  i.e.  strictly  convex  functions  in  their  "actual  variables",  we  adopt  some 
notions  from  [1]. 

For every k ∈ P and t ∈ T_k, denote by [k](t) (read "block k") the following set of indices: j ∈ [k](t) if, and only if, y_j^k : R → R, defined by

y_j^k(·) ≜ f^k(x_1, ..., x_{j−1}, ·, x_{j+1}, ..., x_n, t),

is not a constant function for some fixed x_1, ..., x_{j−1}, x_{j+1}, ..., x_n. Thus, for a given t ∈ T_k, [k](t) is the set of indices of those variables on which f^k(·,t) actually depends. These "actual variables" determine the vector x_{[k](t)}, obtained from x = (x_1, ..., x_n)' by deleting the variables {x_j : j ∉ [k](t)}, without changing the order of the remaining ones. Similarly, we denote by f^k_{[k](t)} : R^{card [k](t)} → R the restriction of f^k to R^{card [k](t)}.

DEFINITION 2: A function f^k : R^n × T_k → R is strictly convex in its actual variables if for every t ∈ T_k its restriction f^k_{[k](t)}(·,t) is strictly convex.

The  above  concept  will  be  illustrated  by  an  example. 

EXAMPLE 3: Consider

f^1(x,t) = x_1^2 + t x_2^2,   t ∈ T = [0,1].

Note that the function f^1(·,t) is not strictly convex for every t ∈ T. Here

[1](t) = {1} if t = 0;   {1,2} if t ∈ (0,1],

x_{[1](t)} = (x_1) if t = 0;   (x_1, x_2)' if t ∈ (0,1],

and

f^1_{[1](t)} = x_1^2 if t = 0;   x_1^2 + t x_2^2 if t ∈ (0,1],

clearly a strictly convex function in its actual variables for every t ∈ T. Hence, f^1 is a strictly convex function in its actual variables.

COROLLARY 4: Let x* be a feasible solution of problem (P), where f^k(·,t), k ∈ P*, are strictly convex in their actual variables and have the uniform mean value property. Then x* is an optimal solution of (P) if, and only if, for every α* > 0 and every subset Ω_k ⊆ T_k* the system

(A)       d'∇f^0(x*) < 0

(B,Ω)     d'∇f^k(x* + α*d, t) < 0 for all t ∈ T_k*\Ω_k

(C)       d'∇f^k(x* + α*d, t) / f^k(x*,t)  ≥  −1/α*   for all t ∈ T_k\T_k*

(D,Ω)     d_{[k](t)} = 0 for all t ∈ Ω_k,

                                                          k ∈ P*,

is inconsistent.


PROOF: We know, by Theorem 1, that x* is nonoptimal if, and only if, there exists α* > 0 such that the system (A), (B), (C) is consistent. In order to prove Corollary 4, it is enough to show that (B) is consistent if, and only if, for some subsets Ω_k ⊆ T_k*, k ∈ P*, the system (B,Ω), (D,Ω) is consistent. Suppose that (B) holds. For every k ∈ P* define

Ω̄_k ≜ {t ∈ T_k* : d'∇f^k(x* + αd, t) = 0 for all 0 ≤ α ≤ α*}.

Hence, by the mean value theorem, for every t ∈ Ω̄_k

f^k(x* + αd, t) = f^k(x*,t) for all 0 ≤ α ≤ α*.

Since f^k(·,t) is strictly convex in its actual variables, this is equivalent to

d_{[k](t)} = 0 for all t ∈ Ω̄_k.

If t ∈ T_k*\Ω̄_k, then obviously d'∇f^k(x* + αd, t) < 0 for some 0 < α ≤ α*, by (B). Thus (B,Ω), (D,Ω) holds for Ω_k = Ω̄_k. (Note that some or all Ω_k's may be empty.) The reverse statement follows from the observation that d_{[k](t)} = 0 implies d'∇f^k(x* + α*d, t) = 0.

D 

If a function f^k(·,t) is strictly convex (in all variables x_1, ..., x_n) for every t ∈ T_k, k ∈ P*, then D_k(x*,t) = {0}. This implies that the system (A), (B,Ω), (C), (D,Ω) is inconsistent for every nonempty Ω_k, k ∈ P*. Thus condition (D,Ω) is redundant. In fact, condition (C) is also redundant, which follows from the following lemma.

LEMMA 1: Let f(x,t) be convex and differentiable in x ∈ R^n for every t in a compact set T ⊂ R^l, and continuous in t for every x. If for some d ∈ R^n,

(15)    d'∇f(x*,t) < 0 for all t ∈ T* = {t : f(x*,t) = 0},

then there exists ᾱ > 0 such that

(16)    f(x* + ᾱd, t) < 0 for all t ∈ T\T*.

PROOF:   It  is  enough  to  show  that  the  hypothesis  (15)  and  the  negation  of  the  conclusion 
!  (16),  which  is 

"For  every  a  >  0  there  is  t  -  ti.a)  €  T\r*such  that  fix*  +  ad.tia))  >  0," 
are  not  simultaneously  satisfied.   If  this  were  true  one  would  have  the  following  situation: 

For  every  a„  of  the  sequence  a„  =  2~"  there  is  a  t„  =  tn(an)  €  T\r*such  that 

(17)  f(x*  +  and,t„(a„))  >  0,   n  =  0,   \,  2,   ... 

Since  Tis  compact,  {tn}  has  an  accumulation  point  t  €  T,  i.e.  there  is  a  convergent  subsequence 
'  {/„}  with  /as  its  limit  point.    We  discuss  separately  two  possibilities  and  arrive  at  contradictions 
in  each  case. 

CASE  I:  t  €  T*.  Since  f(x*j)  =  0  and  d'Vf(x*,t)  <  0,  by  (15),  there  exists  a  >  0 
such  that 

!  (18)  fix*  +ad,h  <  0. 

For  all  large  values  of  index  /,  a„  <  a  and 

(19)  fix*,tn)  <  0, 
since  t„,  €  T\T*.   This  implies 

(20)  fix*  +  ad,t„)  >  0. 

(If  (20)  were  not  true,  one  would  have,  for  some  particular  «,, 

(21)  fix*  +ad,t„)  <  0. 

Nowa,,  <  a,  (19),  (21)  and  the  convexity  of /imply 
fix*  +a„d,t„)  ^  0 



which  contradicts  (17).)    But  (18)  and  (20)  contradict  the  continuity  of  fix*  +  ad,-)- 

CASE  II:  tJE  T\T*.  Since  fix*,t)  <  0,  there  exists  a  >  0  such  that  (18)  holds,  by  the 
continuity  of  fi-,t).   The  rest  of  the  proof  is  the  same  as  in  Case  I. 

□ 


A  characterization  of  optimality  for  strictly  convex  constraints  follows. 

COROLLARY  5:  Let  x*  be  a  feasible  solution  of  problem  (P),  where  fi-,t)  are  strictly 
convex  for  every  t  €  Tk,  k  €  P*.  Then  x*  is  an  optimal  solution  of  (P)  if,  and  only  if,  for 
every  a  *  >  0  the  system 

(A)  d'Vf°(x*)  <  0 

(B,)  d'Vfix*,t)  <  0  for  all  t  €  T% 

k  €  P* 

is  inconsistent. 

PROOF: First we recall that f^k, k ∈ P*, under the assumption of the corollary, have the uniform mean value property. If x* is not optimal, then the system (A), (B_1), (C) is consistent, by Corollary 4. This implies that the less restrictive system (A), (B_1) is consistent. Suppose that the system (A), (B_1) is consistent. Then for every k ∈ P* there is ᾱ_k > 0 such that

f^k(x* + ᾱ_k d, t) ≤ 0 for all t ∈ T_k\T_k*,

by Lemma 1. Let

α* ≜ min{ᾱ_k : k ∈ P*}.

By the convexity of f^k, it follows that

f^k(x* + α*d, t) ≤ 0 for all t ∈ T_k\T_k* and k ∈ P*.

This is equivalent to (C) of Theorem 1 (see (2-b)). Therefore the system (A), (B_1), (C) is consistent. This implies that the system (A), (B), (C) is consistent. (The reader may verify this statement by the technique used in the proof of Lemma 2.) Hence x* is not optimal, by Corollary 4.

D 

REMARK 2: Differentiable strictly convex (in all variables!) functions f^k do have the uniform mean value property. However, this is not necessarily true in the case of convex functions with strictly convex restrictions. In particular, the function

f(x_1,x_2,t) =
    x_1^2 + t x_2 (x_2 − t)                            if x_2 ≤ (1/2) t
    x_1^2 + [t^3/(2 − t)^2] (x_2 − t + 1)(x_2 − 1)      if x_2 ≥ (1/2) t

is differentiable and has strictly convex restrictions for every t ∈ [0,1]. Note that

[k](t) = {1} if t = 0;   {1,2} if t ∈ (0,1].



But the function f does not have the uniform mean value property. One can show, however, that a differentiable function which is strictly convex in its actual variables, and such that [k](t) is constant over the compact set T, does have the mean value property.

4.    PROGRAMS  WITH  UNIFORMLY  DECREASING  CONSTRAINTS 

The  applicability  of  Theorem  1  is,  in  general,  obscured  by  the  appearance  of  parameter  a  * 
in  conditions  (B)  and  (C).  The  purpose  of  this  section  is  to  point  out  some  of  the  topological 
difficulties  which  arise  in  the  removing  of  a  *  from  condition  (B).  A  class  of  convex  functions 
for  which  the  optimality  conditions  can  be  stated  without  reference  to  a  *  in  condition  (B)  will 
be  called  the  uniformly  decreasing  functions. 

In what follows we assume that f : R^n × T → R is convex and differentiable in x ∈ R^n for every t of a compact set T in R^m. Further, ∇f(x*,t) denotes ∇_x f(x*,t).

DEFINITION 3: Let f : R^n × T → R and x* ∈ R^n be such that T* ≠ ∅. Then, for a given d ∈ R^n, d ≠ 0, the function f is uniformly decreasing at x* in the direction d if (i) the set

S(x*,d) ≜ {t ∈ T* : d'∇f(x*,t) < 0}

is compact and if (ii) there exists ᾱ > 0 such that f(x* + ᾱd, t) = 0 for all t ∈ T* for which d ∈ D(x*,t).

It  is  not  easy  to  recognize  whether  a  general  convex  function  /is  uniformly  decreasing. 

EXAMPLE 4: Consider the following functions from R × R into R:

f^1(x,t) = t^2[(x − t)^2 − t^2],  t ∈ T (used in Example 1),

f^2(x,t) = x^2 − tx,  t ∈ T,

f^3(x,t) = −tx,  t ∈ T.

These functions are all convex; f^2 is actually strictly convex and f^3 linear in x for every t ∈ T. If T = [0,1], then none of the three functions is uniformly decreasing at x* = 0 in the direction d = 1. However, if T = [1,2] then all three functions are uniformly decreasing at x* = 0 in the same direction d = 1.

As suggested by the above example, a convex function f is uniformly decreasing at x* in the direction d ≠ 0 whenever ∇f(x*, ·) is continuous and the set

E(x*,d) ≜ {t ∈ T* : d'∇f(x*,t) = 0}

is empty. Its complement

S(x*,d) = T*\E(x*,d) = T*

is then compact. In particular, all analytic functions not identically zero are uniformly decreasing. However, a characterization of optimality for problem (P) with such constraint functions is already given by Corollary 4.

An  important  uniformity  property  of  convex  functions  with  compact  S(x*,d)  follows: 

LEMMA 2: Let f(x,t) be convex and differentiable in x, for every t in a compact set T ⊂ R^m, and continuous in t, for every x ∈ R^n. Suppose further that for some x* and d ≠ 0 in R^n, the set S(x*,d) is nonempty and compact. Then there exists ᾱ > 0 such that

(22)    f(x* + αd, t) < 0,   0 < α ≤ ᾱ,

for all t ∈ S(x*,d).

PROOF: Suppose that such an ᾱ > 0 does not exist. Then there exists a sequence {t_j} ⊂ S(x*,d) and a sequence {α_j}, α_j = α_j(t_j) > 0, such that

f(x* + α_j d, t_j) = 0,

f(x* + αd, t_j) < 0,   0 < α < α_j,

and

(23)    f(x* + αd, t_j) > 0,   α > α_j,

with inf{α_j} = 0. Since S(x*,d) is compact, {t_j} contains a convergent subsequence {t_j}. Let t̄ ∈ S(x*,d) be the limit point of {t_j}. Now

d'∇f(x*, t̄) < 0

implies that there exists ᾱ > 0 such that

f(x* + αd, t̄) < 0,   0 < α ≤ ᾱ.

In particular,

(24)    f(x* + ᾱd, t̄) < 0.

For any ε > 0 there exists j_0 = j_0(ε) such that

(25)    |t_j − t̄| < ε and α_j < ᾱ for all j > j_0.

Now (23) and (25) imply

(26)    f(x* + ᾱd, t_j) > 0 for all j > j_0.

But the inequalities (24) and (26) contradict the continuity of f(x* + ᾱd, ·).

D 

EXAMPLE 5: Consider again

f^2(x,t) = x^2 − tx,   t ∈ T = [1,2].

This function is uniformly decreasing at x* = 0 in the direction d = 1. The inequality (22) holds for every 0 < ᾱ < 1, in particular for ᾱ = 1/2. If the above interval T is replaced by T = [0,1], then f^2 is not uniformly decreasing at x* = 0 with d = 1. An ᾱ > 0 satisfying (22) here does not exist.
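Both cases can be checked numerically. A minimal Python sketch (ours; the grid sizes are arbitrary) tests inequality (22) for f^2 at x* = 0, d = 1 over the set S(x*,d):

import numpy as np

def f2(x, t):
    return x**2 - t * x

def holds_22(T, abar, n=400):
    # S(x*,d) = {t in T* : d * (df2/dx)(x*,t) < 0}; here x* = 0, d = 1, so S = {t in T : -t < 0}.
    ts = np.linspace(T[0], T[1], n)
    S = ts[-ts < 0]
    alphas = np.linspace(abar / n, abar, n)
    A, Tg = np.meshgrid(alphas, S)
    return bool(np.all(f2(A, Tg) < 0))

print(holds_22((1.0, 2.0), abar=0.5))   # True: (22) holds, f2 is uniformly decreasing here
print(holds_22((0.0, 1.0), abar=0.5))   # False: fails for t close to 0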

A characterization of optimality for programs (P), with constraint functions which have the uniform mean value property and are uniformly decreasing, follows.

THEOREM 3: Let x* be a feasible solution of problem (P), where f^k, k ∈ P*, have the uniform mean value property. Suppose also that f^k, k ∈ P*, are uniformly decreasing at x* in every feasible direction d. Then x* is an optimal solution of (P) if, and only if, for every α* > 0 the system

(A)      d'∇f^0(x*) < 0,

(B_4)    d'∇f^k(x*,t) < 0 or d ∈ D_k(x*,t), for all t ∈ T_k*,

(C)      d'∇f^k(x* + α*d, t) / f^k(x*,t)  ≥  −1/α*   for all t ∈ T_k\T_k*,

                                                         k ∈ P*,

is inconsistent.


PROOF: Parts (A) and (C) are proved as in the case of Theorem 1. It is left to show that the existence of α > 0 satisfying (2-a) is equivalent to the consistency of (B_4). It is clear that (2-a) implies (B_4). In order to show that (B_4) implies (2-a) we use the assumption that the functions {f^k(x,t) : k ∈ P*} are uniformly decreasing at x* in the direction d. When (B_4) holds, then for every k ∈ P* there exist ᾱ_k > 0 and α_k^0 > 0 such that

(27)    f^k(x* + αd, t) < 0,   0 < α ≤ ᾱ_k,
        for all t ∈ S_k ≜ {t ∈ T_k* : d'∇f^k(x*,t) < 0},

by Lemma 2, and

(28)    f^k(x* + αd, t) = 0,   0 ≤ α ≤ α_k^0,
        for all t ∈ T_k*\S_k,

since d ≠ 0. The latter follows by part (ii) of Definition 3 and the convexity of f^k. Let

(29)    α ≜ min_{k ∈ P*} {ᾱ_k, α_k^0} > 0.

Clearly, (27) and (28) can be written as the single statement (2-a) with α chosen as in (29).

□


The  following  example  shows  that  the  assumption  that  [fk{x,t):  k  €  P*)  be  uniformly 
decreasing  at  x*  cannot  be  omitted  in  Theorem  3. 

EXAMPLE  6:   Consider  the  program 
Min  f°(x)  =  -x 

s.t. 

f(x,t)  ^  0,  for  all  t  €  T  =  [0,1] 


where 


f(x,t)  = 


t{x  -  t)2   if  x  >  t 
0  if  x  ^  t. 


The feasible set consists of the single point x* = 0, which is therefore optimal. One can verify, after some manipulation, that the constraint function f has the uniform mean value property at x*. (For every ᾱ > 0 there exists 0 < α ≤ (1/2)ᾱ such that (MV) holds.) However, f is not uniformly decreasing at x*. In order to demonstrate that Theorem 3 here fails, first we note that T* = T = [0,1], so the condition (C) is redundant. Since d = 1 is in the cone of directions of constancy D(x*,t) for every t ∈ [0,1], we conclude that the system (A), (B_4) is here consistent, contrary to the statement of the theorem. Therefore the assumption that the constraint functions be uniformly decreasing cannot be omitted in Theorem 3.
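A short numerical check (ours; the step ᾱ = t is one convenient choice) confirms that d = 1 is a direction of constancy at x* = 0 for every t, which is what makes (A), (B_4) consistent here:

import numpy as np

def f(x, t):
    # Example 6: f(x,t) = t*(x-t)^2 if x > t, else 0.
    return t * (x - t)**2 if x > t else 0.0

for t in np.linspace(0.0, 1.0, 101):
    abar = t if t > 0 else 1.0           # any step works at t = 0, since f(., 0) = 0
    assert all(f(a, t) == 0.0 for a in np.linspace(0.0, abar, 20))
print("d = 1 is a direction of constancy at x* = 0 for every t in [0,1]")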



5.   THE  FRITZ  JOHN  AND  KUHN-TUCKER  THEORIES  FOR 
SEMI-INFINITE  PROGRAMMING 

In  contrast  to  the  characterizations  of  optimality  stated  in  the  preceding  sections  we  will 
now  recall  the  Fritz  John  and  Kuhn-Tucker  theories  for  semi-infinite  programming.  In  the 
sequel  we  use  the  following  concept  from  the  duality  theory  of  semi-infinite  programming,  e.g. 
[3]. 

DEFINITION 4: Let I be an arbitrary index set, {p^i : i ∈ I} a collection of vectors in R^m and {c_i : i ∈ I} a collection of scalars. The linear inequality system

u'p^i ≥ c_i, for all i ∈ I,

is canonically closed if the set of coefficients {((p^i)', c_i) : i ∈ I} is compact in R^{m+1} and there exists a point u^0 such that

(u^0)'p^i > c_i, for all i ∈ I.

We will say that problem (P) is canonically closed at x* if the system

(B_5)    d'∇f^k(x*,t) ≤ 0 for all t ∈ T_k*, k ∈ P*

is canonically closed.

REMARK  3:  All  constraint  functions  of  problem  (P)  can  have  the  uniform  mean  value 
property,  or  they  can  be  uniformly  decreasing,  without  problem  (P)  being  canonically  closed. 

Lemma 3 below is a specialized version of Theorem 3 from [3], adjusted to our need. It is related to the following pair of semi-infinite linear programs:

(I)     Inf u'p^0
        s.t.  u'p^i ≥ c_i, all i ∈ I,
              u ∈ R^m;

(II)    Sup Σ_{i ∈ I} c_i λ_i
        s.t.  Σ_{i ∈ I} p^i λ_i = p^0,
              λ ∈ S,  λ ≥ 0,

where S is the vector space of all vectors (λ_i : i ∈ I) with only finitely many nonzero entries. Denote by V_I and V_II the optimal values of (I) and (II), respectively.

LEMMA  3:  Assume  that  the  linear  inequality  system  of  problem  (I)  is  canonically  closed. 
If  the  feasible  set  of  problem  (I)  is  nonempty  and  Vx  is  finite,  then  problem  (II)  is  consistent 
and  Vu  =  Vx.   Moreover,  Vu  is  a  maximum. 

The  concept  of  a  canonically  closed  system'  is  used  in  the  proof  of  the  dual  statement  of 
the  following  theorem. 

THEOREM 4: ("The Fritz John Necessity Theorem") Let x* be an optimal solution of problem (P). Then the system

(A)      d'∇f^0(x*) < 0

(B_1)    d'∇f^k(x*,t) < 0 for all t ∈ T_k*,

                                              k ∈ P*,

is inconsistent or, dually, the system

(FJ)     λ^0 ∇f^0(x*) + Σ_{k ∈ P*} Σ_{t ∈ T_k*} λ_t^k ∇f^k(x*,t) = 0,

         λ^0, {λ_t^k : t ∈ T_k*, k ∈ P*} nonnegative scalars,
         not all zero and of which only finitely many are positive,

is consistent.


PROOF: If x* is optimal, then the inconsistency of the system (A), (B_1) is well-known, e.g. [4, Lemma 1]. In order to prove the dual statement, we note that the inconsistency of (A), (B_1) is equivalent to μ* = 0 being the optimal value of the semi-infinite linear program

(Ī)     Min μ

        s.t.

        d'∇f^0(x*) + μ ≥ 0

        d'∇f^k(x*,t) + μ ≥ 0, all t ∈ T_k*, k ∈ P*,

        (d, μ) ∈ R^{n+1}.

The dual of (Ī) is

(ĪĪ)    Max 0

        s.t.

        λ^0 ∇f^0(x*) + Σ_{k ∈ P*} Σ_{t ∈ T_k*} λ_t^k ∇f^k(x*,t) = 0

        λ^0 + Σ_{k ∈ P*} Σ_{t ∈ T_k*} λ_t^k = 1

        λ ≥ 0, only finitely many λ_t^k are positive.

The feasible set of problem (Ī) is clearly nonempty and canonically closed (d = 0, μ = 1 satisfy the constraints of (Ī) with strict inequalities). Lemma 3 is now readily applicable to the pair (Ī), (ĪĪ), which proves (FJ).

□ 


The  dual  statement  in  Theorem  4  is  the  Fritz  John  optimality  condition  for  semi-infinite  pro- 
gramming.   For  an  equivalent  formulation  the  reader  is  referred  to  Gehner's  paper  [4]. 

Under various "constraint qualifications," such as Slater's condition:

there exists x ∈ R^n such that f^k(x,t) < 0 for all t ∈ T_k, k ∈ P,

or the "Constraint Qualification II" of Gehner [4], one can set λ_0 = 1 in Theorem 4. In fact, the same is possible if problem (P) is canonically closed at x*, i.e. if there exists d such that



(30) 


d'Vfk(x*,t)  <  0  for  all  t  €  T*k,  k  €  P*. 


This  is  easily  seen  by  multiplying  the  equation  in  (FJ)  by  d  satisfying  (30).  Note  that  the 
canonical  closedness  assumption  is  a  semi-infinite  version  of  the  Arrow-Hurwicz-Uzawa  con- 
straint qualification,  e.g.  [12].   The  latter  constraint  qualification  is  implied  by  Slater's  condition. 

The  Fritz  John  condition  (FJ)  with  X0  =  1  is  a  semi-infinite  version  of  the  Kuhn-Tucker 
condition,  e.g.  [12].  While  the  Fritz  John  condition  is  necessary  but  not  sufficient,  the  Kuhn- 
Tucker  condition  is  sufficient  but  not  necessary  for  optimality.  If  a  constraint  qualification  is 
assumed,  then  the  Kuhn-Tucker  condition  is  both  necessary  and  sufficient  for  optimality  for 
problem  (P).  If  a  constraint  qualification  is  not  satisfied  then  the  Fritz  John  condition  fails  to 
establish  the  optimality  and  the  Kuhn-Tucker  condition  fails  to  establish  the  nonoptimality  of  a 
feasible  point  x*.  In  contrast,  our  results  are  applicable.  This  will  be  demonstrated  by  two 
examples.    (See  also  an  example,  taken  from  approximation  theory,  in  Section  6.) 

EXAMPLE 7: Consider the semi-infinite convex problem

Min f^0 = x_1 − x_2

s.t.

f^1 = x_1^2 + t x_2 − t^2 ≤ 0 for all t ∈ T_1 = [0,1]

f^2 = −x_1 − t x_2 − t ≤ 0 for all t ∈ T_2 = [0,1].

The feasible set is

F = {(0, x_2)' : −1 ≤ x_2 ≤ 0}

and the optimal solution is x* = (0,0)'. For this point

T_1* = T_2* = {0},   P* = {1,2}.
The system (B_5) is

0 ≤ 0

−d_1 ≤ 0,

obviously not canonically closed. The Kuhn-Tucker condition is

(1, −1)' + λ_1 (0, 0)' + λ_2 (−1, 0)' = (0, 0)',   λ_1 ≥ 0,  λ_2 ≥ 0,

which clearly fails.

One can easily verify that the constraint functions f^1 and f^2 have the uniform mean value property. Also, these functions are uniformly decreasing at x* = 0 in every direction d ≠ 0. (The sets T_1* and T_2* are singletons!) Thus Theorem 3 is applicable. Conditions (A), (B_4) and (C) are here

(A)      d_1 − d_2 < 0

(B_4)    0 < 0 or d_1 = 0, d_2 ∈ R
         −d_1 < 0 or d_1 = 0

(C)      (2α* d_1^2 + t d_2) / (−t^2)  ≥  −1/α*   for all t ∈ (0,1]

         (−d_1 − t d_2) / (−t)  ≥  −1/α*   for all t ∈ (0,1].

This reduces to

d_1 = 0,  d_2 > 0,

(31)     −d_2 / t  ≥  −1/α*   for all t ∈ (0,1],

         d_2  ≥  −1/α*.

Since d_2 > 0, the inequality (31) cannot hold for any α* > 0. Hence, by Theorem 3, x* = (0,0)' is optimal. The optimality of a feasible point is thus established here using Theorem 3 and not by the Kuhn-Tucker condition, which here fails.

Consider now the point x* = (0,−1)'. Here

T_1* = {0},   T_2* = [0,1],   P* = {1,2}.

It is easy to verify that the Fritz John condition is satisfied in spite of the fact that x* is not optimal. Conditions (A), (B) and (C) are here

(A)      d_1 − d_2 < 0

(B)      2α* d_1^2 ≤ 0
         −d_1 − t d_2 ≤ 0 for all t ∈ [0,1]

(C)      (2α* d_1^2 + t d_2) / (−t − t^2)  ≥  −1/α*   for all t ∈ (0,1].

For α* = 1, these conditions are satisfied by d_1 = 0, d_2 = 1. Hence, by Theorem 1, the point x* = (0,−1)' is not optimal. Both the Fritz John and the Kuhn-Tucker theories fail to characterize optimality in this example because a constraint qualification (or a regularization condition, e.g. [1]) is not satisfied here.
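The nonoptimality certificate found above can be confirmed numerically. A minimal Python sketch (the t-grid and step sizes are our choices) checks that moving from x* = (0,−1)' in the direction d = (0,1)' stays feasible and decreases the objective:

import numpy as np

t = np.linspace(0.0, 1.0, 1001)

def feasible(x1, x2):
    g1 = x1**2 + t * x2 - t**2        # f1(x,t) <= 0
    g2 = -x1 - t * x2 - t             # f2(x,t) <= 0
    return bool(np.all(g1 <= 1e-12) and np.all(g2 <= 1e-12))

x_star = np.array([0.0, -1.0])
d = np.array([0.0, 1.0])
for a in (0.25, 0.5, 1.0):
    x = x_star + a * d
    print(a, feasible(*x), x[0] - x[1])   # stays feasible; objective 1 - a decreases toward 0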

Although  the  Fritz  John  and  Kuhn-Tucker  theories  fail  to  characterize  optimality,  they 
can  be  used  to  formulate,  respectively,  either  the  necessary  or  the  sufficient  conditions  of 
optimality. 

In  the  remainder  of  the  section  we  will  show  that  the  ordinary  Kuhn-Tucker  condition 
(i.e.  the  (FJ)  condition  with  X0  =  1)  can  be  weakened  by  assuming  an  asymptotic  form.  For  a 
related  discussion  in  Banach  spaces  the  reader  is  referred  to  [16]. 

THEOREM 5: ("The Kuhn-Tucker Sufficiency Theorem") Let x* be a feasible solution of problem (P). Then x* is optimal if the system

(A)      d'∇f^0(x*) < 0

(B_5)    d'∇f^k(x*,t) ≤ 0 for all t ∈ T_k*,

                                              k ∈ P*,

is inconsistent or, dually, if the system

(K̄−T̄)    0 ∈ ∇f^0(x*) + cl { Σ_{k ∈ P*} Σ_{t ∈ T_k*} λ_t^k ∇f^k(x*,t) :
                              {λ_t^k : t ∈ T_k*, k ∈ P*} nonnegative scalars
                              of which only finitely many are positive }

is consistent.


PROOF: If the system (A), (B_5) is inconsistent, so is (A), (B). (Recall that D_k(x*,t) ⊆ {d : d'∇f^k(x*,t) = 0}.) Hence, in particular, the system (A), (B), (C) is inconsistent. Following the proof of Theorem 1, one concludes that x* is optimal. The inconsistency of (A), (B_5) is equivalent to the consistency of (K̄−T̄), by e.g. [11, Corollary 5].


□ 


REMARK 4: The "asymptotic" form (K̄−T̄) of the Kuhn-Tucker conditions gives a weaker sufficient condition for optimality than the familiar (i.e., without the closure) condition

(K−T)    ∇f^0(x*) + Σ_{k ∈ P*} Σ_{t ∈ T_k*} λ_t^k ∇f^k(x*,t) = 0,

         {λ_t^k : t ∈ T_k*, k ∈ P*} nonnegative scalars,
         of which only finitely many are positive.


In some situations the primal Kuhn-Tucker conditions (A), (B_5) may be easier to apply than (K−T). This will be illustrated on the following problem, taken from [8, Example 2.4].

EXAMPLE 8: Consider

Min f^0 = 4x_1 + (2/3)(x_4 + x_6)

s.t.

f^1 = −x_1 − t_1 x_2 − t_2 x_3 − t_1^2 x_4 − t_1 t_2 x_5 − t_2^2 x_6 + 3 − (t_1 − t_2)^2 (t_1 + t_2)^2 ≤ 0

for all t ∈ T_1 = {t = (t_1, t_2)' : −1 ≤ t_i ≤ 1, i = 1, 2}.

We will show, using the Kuhn-Tucker theory, that x* = (3,0,0,0,0,0)' is an optimal solution. The optimality of x* has been established in [8] by a different approach.

First note that here

T_1* = {t ∈ T_1 : t_1 − t_2 = 0 or t_1 + t_2 = 0}.

The system (A), (B_5) becomes

(A)      4d_1 + (2/3) d_4 + (2/3) d_6 < 0

(B_5)    −d_1 − t_1 d_2 − t_2 d_3 − t_1^2 d_4 − t_1 t_2 d_5 − t_2^2 d_6 ≤ 0   for all t ∈ T_1*.




Substitute in (B_5) the following five points of T_1*:

(0, 0)',  (1, 1)',  (1, −1)',  (−1, 1)',  (−1, −1)'.

This gives

−d_1 ≤ 0

−d_1 − d_2 − d_3 − d_4 − d_5 − d_6 ≤ 0

−d_1 − d_2 + d_3 − d_4 + d_5 − d_6 ≤ 0

−d_1 + d_2 − d_3 − d_4 + d_5 − d_6 ≤ 0

−d_1 + d_2 + d_3 − d_4 − d_5 − d_6 ≤ 0.

Multiply the first inequality by ten thirds and each of the remaining four inequalities by one sixth, then add all five inequalities. We get

−4d_1 − (2/3) d_4 − (2/3) d_6 ≤ 0,

which contradicts (A). Thus the system (A), (B_5) is inconsistent and x* = (3,0,0,0,0,0)' is optimal, by Theorem 5.
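The aggregation step can also be verified numerically. The following sketch (ours) recomputes the weighted combination of the five gradients and checks that it cancels ∇f^0(x*), i.e. that the same weights serve as Kuhn-Tucker multipliers in (K−T) at x*:

import numpy as np

def grad_f1(t1, t2):
    # f1 is linear in x, so its gradient does not depend on x.
    return np.array([-1.0, -t1, -t2, -t1**2, -t1 * t2, -t2**2])

points = [(0, 0), (1, 1), (1, -1), (-1, 1), (-1, -1)]
weights = np.array([10/3, 1/6, 1/6, 1/6, 1/6])
grad_f0 = np.array([4.0, 0.0, 0.0, 2/3, 0.0, 2/3])

combo = sum(w * grad_f1(*p) for w, p in zip(weights, points))
print(combo)                                # [-4, 0, 0, -2/3, 0, -2/3]
assert np.allclose(grad_f0 + combo, 0.0)    # grad f0 + sum of weighted constraint gradients = 0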

Theorems  1  and  3  suggest  that  the  presently  used  constraint  qualifications  for  semi- 
infinite  programming  problems  are  too  restrictive  because  they  do  not  employ  the  topological 
properties  of  problem  (P),  such  as  the  uniform  mean  value  property  or  the  uniformly  decreas- 
ing constraints. 

6.   AN  APPLICATION  TO  CHEBYSHEV  APPROXIMATION 

It  is  well-known  that  there  is  a  close  connection  between  convex  programming  and 
approximation  theory,  e.g.  [5], [13].  In  fact,  many  approximation  problems  can  be  formulated 
as  convex  semi-infinite  programming  problems  in  which  case  the  results  of  this  paper  are 
readily  applicable.  In  particular,  the  problem  of  linear  Chebyshev  approximation  subject  to  side 
constraints 


(MM)    Min_x  max_{t ∈ T} | f(t) − Σ_{i=1}^n x_i g_i(t) |

        s.t.

        l(t) ≤ Σ_{i=1}^n x_i g_i(t) ≤ u(t) for all t ∈ T,

is equivalent to the linear semi-infinite programming problem

(L)     Min x_{n+1}

        s.t.

        −x_{n+1} ≤ Σ_{i=1}^n x_i g_i(t) − f(t) ≤ x_{n+1}

        l(t) ≤ Σ_{i=1}^n x_i g_i(t) ≤ u(t)

        for all t ∈ T.




Corollary  3  of  this  paper  can  be  applied  to  (L)  and  it  gives  a  characterization  of  the  best  approx- 
imation for  the  problem  (MM).  Uniqueness  of  the  best  approximation  can  be  checked  using 
Theorem  2.   Rather  than  going  into  details  we  will  illustrate  this  application  by  an  example. 

EXAMPLE  9:  The  approximation  problem  stated  in  this  example  is  taken  from  [4],  see 
also  [15].  It  shows  that  there  exist  situations  when  the  Kuhn-Tucker  theory  for  semi-infinite 
programming  fails  to  establish  optimum  even  in  the  case  of  linear  constraints.  However,  the 
optimality  is  established  using  the  results  of  this  paper. 

The linear Chebyshev approximation problem is

Min_x  max_{t ∈ [0,1]} | t^4 − x_1 − x_2 t |

s.t.

−t ≤ x_1 + x_2 t ≤ t^2, for all t ∈ [0,1].

An equivalent linear semi-infinite programming problem is

Min f^0 = x_3

s.t.

f^1 = t^4 − x_1 − x_2 t − x_3 ≤ 0

f^2 = −t^4 + x_1 + x_2 t − x_3 ≤ 0

f^3 = −t^2 + x_1 + x_2 t ≤ 0

f^4 = −t − x_1 − x_2 t ≤ 0,

for all t ∈ [0,1]. Is x* = (0,0,1)' optimal?

Here T_1* = {1}, T_2* = ∅, T_3* = {0}, T_4* = {0} and P* = {1,3,4}. The system (A), (B_5) is

(A)      d_3 < 0

(B_5)    −d_1 − d_2 − d_3 ≤ 0

         d_1 ≤ 0

         −d_1 ≤ 0

and it is clearly consistent (set e.g. d_1 = 0, d_2 = 1, d_3 = −1). Therefore, Theorem 5 cannot be applied. (Since the system (K−T) is inconsistent, x* = (0,0,1)' is not a "Kuhn-Tucker point".) But the system


(A)      d_3 < 0

(B_2)    −d_1 − d_2 − d_3 ≤ 0

         d_1 ≤ 0

         −d_1 ≤ 0

(C_1)    (−d_1 − d_2 t − d_3) / (t^4 − 1)  ≥  −1, for all t ∈ [0,1)

         (d_1 + d_2 t) / (−t^2)  ≥  −1, for all t ∈ (0,1]

         (−d_1 − d_2 t) / (−t)  ≥  −1, for all t ∈ (0,1]

is inconsistent. (First, d_1 = 0, by the last two inequalities in (B_2). Now (A) and (B_2) imply d_2 > 0. This contradicts d_2 ≤ 0, obtained from the second inequality in (C_1).) Therefore x* = (0,0,1)' is optimal, by Corollary 1.
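As a numerical cross-check (not part of the original example; the grid size is our choice and the discretization only approximates the semi-infinite problem), one can discretize t and solve the resulting ordinary linear program with scipy; its optimal value approaches 1 as the grid is refined, consistent with x* = (0,0,1)':

import numpy as np
from scipy.optimize import linprog

t = np.linspace(0.0, 1.0, 201)
ones, zeros = np.ones_like(t), np.zeros_like(t)
A_ub = np.vstack([
    np.column_stack([-ones, -t, -ones]),   # t^4 - x1 - x2 t - x3 <= 0
    np.column_stack([ ones,  t, -ones]),   # -t^4 + x1 + x2 t - x3 <= 0
    np.column_stack([ ones,  t, zeros]),   # -t^2 + x1 + x2 t <= 0
    np.column_stack([-ones, -t, zeros]),   # -t - x1 - x2 t <= 0
])
b_ub = np.concatenate([-t**4, t**4, t**2, t])
res = linprog(c=[0, 0, 1], A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * 3)
print(res.x, res.fun)                      # close to (0, 0, 1) and 1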

ACKNOWLEDGMENT 

The  authors  are  indebted  to  Professor  G.  Schmidt  for  providing  some  of  the  constraint 
functions  used  in  Examples  1,  3  and  4,  Mr.  H.  Wolkowicz  for  providing  a  counter-example  to 
one  of  their  earlier  conjectures  and  the  referee  for  his  recommendations  about  organization  of 
the  paper  and  providing  a  correct  version  of  Lemma  3. 

REFERENCES 

[1]  Ben-Tal.  A.,  A.  Ben-Israel  and  S.  Zlobec,  "Characterization  of  Optimality  in  Convex  Pro- 
gramming without  a  Constraint  Qualification,"  Journal  of  Optimization  Theory  and 
Applications  20,  417-437  (1976). 

[2]  Charnes,  A.,  W.W.  Cooper  and  K.O.  Kortanek,  "Duality,  Haar  Programs  and  Finite 
Sequence  Spaces,"  Proceedings  of  the  National  Academy  of  Science,  48,  783-786 
(1962). 

[3]  Charnes,  A.,  W.W.  Cooper  and  K.O.  Kortanek,  "On  the  Theory  of  Semi-Infinite  Program- 
ming and  a  Generalization  of  the  Kuhn-Tucker  Saddle  Point  Theorem  for  Arbitrary 
Convex  Functions,"  Naval  Research  Logistics  Quarterly,  16,  41-51  (1969). 

[4]  Gehner,  K.R.,  "Necessary  and  Sufficient  Conditions  for  the  Fritz  John  Problem  with 
Linear  Equality  Constraints,"  SIAM  Journal  on  Control,  12,  140-149  (1974). 

[5]  Gehner,  K.R.,  "Characterization  Theorems  for  Constrained  Approximation  Problems  via 
Optimization  Theory,"  Journal  of  Approximation  Theory,  14,  51-76  (1975). 

[6]  Gorr,  W.  and  K.O.  Kortanek,  "Numerical  Aspects  of  Pollution  Abatement  Problems:  Con- 
strained Generalized  Moment  Techniques,"  Carnegie-Mellon  University,  School  of 
Urban  and  Public  Affairs,  Institute  of  Physical  Planning  Research  Report  No.  12 
(1970). 

[7]  Gustafson,  S.A.  and  K.O.  Kortanek,  "Analytical  Properties  of  Some  Multiple-Source  Urban 
Diffusion  Models,"  Environment  and  Planning,  4,  31-41  (1972). 

[8]  Gustafson,  S.A.  and  K.O.  Kortanek,  "Numerical  Treatment  of  a  Class  of  Semi-Infinite  Pro- 
gramming Problems,"  Naval  Research  Logistics  Quarterly,  20,  477-504  (1973). 

[9]  Gustafson,  S.A.  and  J.  Martna,  "Numerical  Treatment  of  Size  Frequency  Distributions 

with  Computer  Machine,"  Geologiska  Foreningens  Forhandlingar,  84,  372-389  (1962). 
[10]  Kantorovich,  L.V.  and  G.Sh.  Rubinshtein,  "Concerning  a  Functional  Space  and  Some 

Extremum  Problems,"  Dokl.  Akad.   Nauk.  SSSR  115,  1058-1061  (1957). 
[11]  Lehmann,  R.  and  W.  Oettli,  "The  Theorem  of  the  Alternative,  the  Key-Theorem  and  the 

Vector-Maximum  Problem,"  Mathematical  Programming,  8,  332-344  (1975). 
[12]  Mangasarian,  O,  Nonlinear  Programming,  McGraw  Hill,  New  York  (1969). 
[13]  Rabinowitz,  P.,  "Mathematical  Programming  and  Approximation,"  Approximation  Theory, 

A.  Talbot  (editor),  Academic  Press  (1970). 
[14]  Rockafellar,  R.T.,  Convex  Analysis,  Princeton  University  Press,  Princeton,  N.J.  (1970). 
[15]  Taylor,  G.D.,  "On  Approximation  by  Polynomials  Having  Restricted  Ranges,"  Journal  on 

Numerical  Analysis,  5,  258-268  (1968). 
[16]  Zlobec,  S.,  "Extensions  of  Asymptotic  Kuhn-Tucker  Conditions  in  Mathematical  Program- 
ming," SIAM  Journal  on  Applied  Mathematics,  21,  448-460  (1971). 


SOLVING  INCREMENTAL  QUANTITY  DISCOUNTED 
TRANSPORTATION  PROBLEMS  BY  VERTEX  RANKING 

Patrick  G.  McKeown 

University  of  Georgia 
Athens,  Georgia 

ABSTRACT 

Logistics  managers  often  encounter  incremental  quantity  discounts  when 
choosing  the  best  transportation  mode  to  use.  This  could  occur  when  there  is  a 
choice  of  road,  rail,  or  water  modes  to  move  freight  from  a  set  of  supply  points 
to  various  destinations.  The  selection  of  mode  depends  upon  the  amount  to  be 
moved  and  the  costs,  both  continuous  and  fixed,  associated  with  each  mode. 
This  can  be  modeled  as  a  transportation  problem  with  a  piecewise-linear  objec- 
tive function.  In  this  paper,  we  present  a  vertex  ranking  algorithm  to  solve  the 
incremental  quantity  discounted  transportation  problem.  Computational  results 
for  various  test  problems  are  presented  and  discussed. 


1.  INTRODUCTION 

Whenever  a  logistics  manager  is  making  a  decision  about  the  movement  of  freight,  he  is 
often  faced  with  choosing  from  among  different  modes  of  transportation.  Movement  of  freight 
by  air  or  motor  express  may  involve  no  fixed  costs  to  the  transporter,  but  will  usually  involve 
relatively  higher  variable  costs  than  either  rail  or  water.  However,  both  rail  and  water  can 
involve  the  investment  of  large  sums  for  rail  sidings  or  docking  facilities.  The  problem  of 
selecting  freight  modes  can  be  modeled  as  a  transportation  problem  with  a  piecewise-linear 
objective  function.  This  problem  has  been  termed  the  incremental  quantity  discounted  tran- 
sportation problem,  since  it  is  assumed  that  the  variable  costs  decrease  as  the  amount  shipped 
increases.  This  comes  about  due  to  the  lower  variable  costs  for  rail  or  water  modes  relative  to 
air  or  road  freight  costs.  The  presence  of  fixed  costs  for  the  use  of  rail  or  water  determines  the 
range  of  shipment  levels  over  which  each  cost  will  be  applicable.  Figure  1  shows  this  type  of 
objective  function. 

In this paper we will present a vertex ranking algorithm to solve this type of problem, along
with computational results from various sizes and types of problems.  Background material
is discussed in Section 2, the details of the algorithm and a worked example are given in Section 3,
and Section 4 gives computational results.

2.  BACKGROUND  MATERIAL 

The incremental quantity discounted transportation problem is a member of a general
class of mathematical programming problems, i.e., the piecewise-linear programming problem.  Vogt
and Evan [15] considered the case of the piecewise-linear transportation problem derived from
U.S. freight rates.  This problem is neither convex nor concave, and has sections of the objec-
tive function which are flat or "free."  Figure 2 shows this case.  Vogt and Evan used separable

Figure 1.  Total cost versus quantity shipped (flow x_ij): piecewise-linear segments for motor freight, rail, and water.


Figure 2.  Total cost versus quantity shipped (flow x_ij) for the U.S. freight-rate case, with flat or "free" sections of the objective function.


nonconvex  programming  to  reach  an  approximately  optimal  solution  to  this  problem.  Balachan- 
dran  and  Perry  [1]  consider  another  version  of  this  problem  which  they  termed  the  all  unit 
quantity  discount  transportation  problem.  The  main  difference  between  this  and  the  previous 
case  is  the  lack  of  the  flat  section  of  the  objective  function.  The  latter  case  is  typical  of  some 
foreign  freight  rates,  and  is  shown  in  Figure  3  below. 

Problems  similar  to  this  one  have  been  mentioned  in  the  plant  location  literature,  e.g., 
Townsend  [14],  and  Efroymson  and  Ray  [5].  In  these  cases,  it  is  suggested  that  the  problem  be 
solved  by  considering  multiple  plants,  one  for  each  range  of  demand. 

Balachandran  and  Perry  presented  a  branch  and  bound  algorithm  for  the  all  unit  quantity 
discount  problem,  which  they  show,  will  also  work  for  the  incremental  quantity  discount  prob- 
lem as  well  as  fixed  charge  transportation  problems.    However,  no  computational  results  are 


Figure 3.  Total cost versus quantity shipped (flow x_ij) for the all unit quantity discount case.


given  to  demonstrate  the  efficiency  of  this  algorithm.  Here,  we  will  consider  a  vertex  ranking 
algorithm  for  only  the  incremental  quantity  discount  problem  for  two  reasons.  First,  fixed 
charge  transportation  problems  have  been  handled  in  several  other  places  in  a  manner  that  has 
been shown to be superior to vertex ranking [2,8].  Second, the incremental quantity discount
transportation problem has a concave objective function, whereas neither the problem considered
by Vogt and Evan nor the all unit quantity discount transportation problem has a concave
objective function.  This is crucial to the use of vertex ranking, since the procedure considers only
vertices of the constraint set, and the optimal solution to a problem with a nonconcave
objective function need not occur at a vertex.

The  incremental  quantity  discount  problem  may  be  formulated  as  follows  (following  the 
model  proposed  by  Balachandran  and  Perry  [1]): 

(1)      Min Z = Σ_i Σ_j C^k_ij x_ij + Σ_i Σ_j f^k_ij y^k_ij

(2)      subject to  Σ_j x_ij = a_i   for i ∈ I,

(3)                  Σ_i x_ij = b_j   for j ∈ J,

(4)      C^k_ij is the continuous cost applied to x_ij when λ^{k-1}_ij ≤ x_ij < λ^k_ij, k ∈ R
         (with λ^0_ij = 0 and λ^r_ij = ∞),

(5)      y^k_ij = 1 if λ^{k-1}_ij ≤ x_ij < λ^k_ij, and y^k_ij = 0 otherwise,

(6)      f^k_ij = f^{k-1}_ij + (C^{k-1}_ij − C^k_ij) λ^{k-1}_ij   for k = 2, 3, ..., r,

and

(7)      f^1_ij = 0,  x_ij ≥ 0   for all i ∈ I and j ∈ J,

where

         J = {1, ..., n} = set of sinks,
         I = {1, ..., m} = set of sources,
         R = {1, ..., r} = set of cost intervals.

As may be easily seen, this is a generalization of the fixed charge transportation problem
(see [1]), with a fixed charge, f^k_ij, and a continuous cost, C^k_ij, for each range of shipment
between source i and destination j.  Since the situation which we are attempting to model, i.e.,
the choice of shipment mode, does involve various levels of fixed charge, (1)-(7) is the
proper formulation for this problem.  It should be noted that we are implicitly assuming that

         C^k_ij > C^{k+1}_ij   for all i ∈ I, j ∈ J.

This is necessary for the concavity of the objective function, and in any case we would expect
lower continuous costs to apply at higher shipment levels.
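To make the cost structure in (1)-(7) concrete, the following small sketch (written here in
Python purely for illustration; the function name and data layout are ours, not part of the
formulation) evaluates the total cost C^k_ij x_ij + f^k_ij on a single arc for the range k containing
x_ij, with the fixed charges accumulated exactly as in (6):

    def arc_cost(x, breaks, costs):
        # breaks: lambda^1 < lambda^2 < ... < lambda^(r-1)   (lambda^0 = 0, lambda^r = infinity)
        # costs:  C^1 > C^2 > ... > C^r, one continuous cost per range
        if x == 0:
            return 0.0
        f, k = 0.0, 0
        while k < len(breaks) and x >= breaks[k]:
            # f^(k+1) = f^k + (C^k - C^(k+1)) * lambda^k, as in Equation (6)
            f += (costs[k] - costs[k + 1]) * breaks[k]
            k += 1
        return costs[k] * x + f

Because f^1_ij = 0 and each later range adds a positive intercept, the resulting cost is continuous,
piecewise linear and concave in x_ij, which is the property exploited by the vertex ranking
procedure of Section 3.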

Balachandran and Perry [1] suggested that (1)-(7) may be solved by a branch and bound
algorithm.  Their procedure is similar to that used to solve travelling salesman problems by
driving out subtours [13].  They solve the transportation problem with all costs set to their
lowest values, i.e., C^r_ij.  If any route has flow below λ^{r-1}_ij, branching is done on one of these
variables.  Two branches are used.  One branch forces the flow over the arc above the lower
limit for the cost level C^r_ij, i.e., x_ij ≥ λ^{r-1}_ij.  In the other branch, the infeasible cost C^r_ij is
replaced by the cost that is feasible for the current flow level.  This continues until a solution is
found where the arc flows match the costs used; this is the optimal solution.  However, the
effectiveness of the procedure is unknown, since the authors did not provide any computational results.

It would also appear that the work of Kennington [8] on the fixed charge transportation
problem could be modified to solve this problem by introducing multiple arcs between each
pair of nodes.  Each arc would be bounded by λ^{k-1}_ij and λ^k_ij, with its own continuous and
fixed costs.  However, this would lead to effectively larger problems; e.g., a problem with 60
arcs and five breakpoints would have 300 variables in the new problem.

3.   SOLUTION  PROCEDURE 

Using  the  formulation  of  the  incremental  quantity  discount  transportation  problem  given 
in  (1)  -  (7),  along  with  the  assumption  of  decreasing  costs,  we  have  a  problem  with  linear  con- 
straints and  concave  objective  function.  It  is  well  known  [7]  that  an  optimal  solution  for  prob- 
lems of  this  type  will  occur  at  a  vertex  of  the  constraint  set.  Examples  of  other  problems  that 
share  this  condition  are  the  fixed  charge  problem,  the  quadratic  transportation  problem,  and  the 
quadratic assignment problem.  Murty [12] was the first to suggest a vertex ranking scheme for
a problem of this category.  He showed that the fixed charge problem could be solved by rank-
ing the vertices of the constraint set according to the objective value up to some upper bound.  At
that point, the optimal solution would be found at one or more of the ranked vertices.



We may formulate any problem with a concave objective function and linear constraints as
below:

(8)      Min f(x)

(9)      s.t.  x ∈ S,

(10)     where S = {x | Ax = b, x ≥ 0}.

Since no "direct" optimization techniques exist for the case where f(x) is nonlinear, we
shall look at a procedure for searching the vertices of S.  To do this, we will use a linear
underapproximation of f(x), say L(x), such that L(x) ≤ f(x) for x ∈ S.  In this case, to show
that x* is an optimal solution to (8)-(10), we need only rank the vertices of S until a vertex
x° is found such that L(x°) ≥ f(x*).  At this point, all vertices that could possibly be optimal
have been ranked.  This is proved by Cabot and Francis [3].

In order to rank the extreme points of S, we need a result also first proved by
Murty, stated as Theorem 1 below:

THEOREM 1:  If E_1, E_2, ..., E_K are the first K vertices of a linear underapproximation
problem, ranked in nondecreasing order of their objective value, then ver-
tex E_{K+1} must be adjacent to one of E_1, E_2, ..., E_K.

Simply put, this says that vertex 2 will be adjacent to the optimal solution of the linear
underapproximation, vertex 3 will be adjacent to vertex 1 or vertex 2, and so on.  This,
then, gives us a procedure for ranking the vertices if all adjacent vertices can be found.  It is
this "if" that can cause problems.  The difficulty arises from the possibility of degeneracy in
S.  If S is degenerate, then there may exist multiple bases for the same vertex.  This implies
that all such bases must be available before one can be sure that all adjacent vertices have been
found.  Finding all such bases in order to find and "scan" all adjacent vertices can be quite
cumbersome.  However, a recent application of Chernikova's work [4,9] has been shown to be
a way around the problem of degeneracy.

Vertex ranking has been used by McKeown [10] to solve fixed charge problems and by
Fluharty [6] to solve quadratic assignment problems.  Cabot and Francis [3] also proposed the use of
vertex ranking to solve a certain class of nonconvex quadratic programming problems, e.g.,
quadratic transportation problems.  For a survey of vertex ranking procedures, see [11].

In our problem, we need to determine the linear underapproximation to the objective
function (1).  We may do this by first noting that

(11)     u_ij = min {a_i, b_j}

is an upper bound on x_ij.  We may then note that, if F(x_ij) = C^k_ij x_ij + f^k_ij y^k_ij, then

(12)     l_ij = [F(u_ij) − F(0)] / u_ij = (C^k_ij u_ij + f^k_ij) / u_ij,

where λ^{k-1}_ij ≤ u_ij < λ^k_ij, gives a linear underapproximation l_ij x_ij to F(x_ij).

We may now form a problem to rank vertices, i.e.,

(13)     Min Z_u = Σ_i Σ_j l_ij x_ij = Σ_i Σ_j [(C^k_ij u_ij + f^k_ij) / u_ij] x_ij

         subject to (2)-(7),

where λ^{k-1}_ij ≤ u_ij < λ^k_ij.


Using (13) and (2)-(7), we may rank vertices as discussed earlier until some vertex x°
is found such that L(x°) = Σ_i Σ_j l_ij x°_ij ≥ f(x*), where x* is a candidate for optimality.  We
may start with x* equal to the optimal solution to (13) and (2)-(7), and then update it as new,
possibly better solutions to (1)-(7) are found by the ranking procedure.  When all vertices x
with L(x) < f(x*) have been ranked, the solution procedure terminates with the present candi-
date being optimal.
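The overall procedure can be summarized by the following sketch (Python-style pseudocode;
solve_underapprox, underapprox_cost, true_cost, adjacent_vertices and key are hypothetical
helpers standing in for the transportation solver for (13), the evaluations of L(x) and f(x),
the adjacent-vertex enumeration of [9], and a hashable vertex identifier):

    import heapq, itertools

    def rank_vertices(problem):
        # Murty-type ranking of the vertices of (13), (2)-(7) in nondecreasing L(x),
        # terminating as soon as the next ranked vertex has L(x) >= f(x*).
        x0 = solve_underapprox(problem)
        incumbent, best = x0, true_cost(problem, x0)      # candidate x* and f(x*)
        tie = itertools.count()                           # tie-breaker so the heap never compares vertices
        heap = [(underapprox_cost(problem, x0), next(tie), x0)]
        seen = set()
        while heap:
            L_val, _, x = heapq.heappop(heap)
            if L_val >= best:
                break                                     # every unranked vertex has L(x) >= f(x*)
            if key(x) in seen:
                continue
            seen.add(key(x))
            f_val = true_cost(problem, x)
            if f_val < best:
                incumbent, best = x, f_val                # better feasible value found
            for y in adjacent_vertices(problem, x):       # Theorem 1: the next vertex is adjacent
                heapq.heappush(heap, (underapprox_cost(problem, y), next(tie), y))
        return incumbent, best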

EXAMPLE:  As an example of our procedure, we will solve an incremental quantity
discount version of the example problem presented by Balachandran and Perry [1].  Table 1
below gives the supplies, demands, and costs for each range of shipment.  Table 2 gives the
optimal solution to the linear underapproximation problem; the value of l_ij is shown for each
cell, and the basic cells of that solution are starred.

TABLE 1 — Supplies, Demands, and Costs for Each Range of Shipment

Source 1 (warehouse capacity 80):
    to destination 1:  3 [20 ≤ x11 < ∞],   4 [10 ≤ x11 < 20],   5 [0 ≤ x11 < 10]
    to destination 2:  6 [10 ≤ x12 < ∞],   7 [5 ≤ x12 < 10],    8 [0 ≤ x12 < 5]
    to destination 3:  3 [27 ≤ x13 < ∞],   4 [15 ≤ x13 < 27],   5 [5 ≤ x13 < 15]
    to destination 4:  one price bracket, 4
Source 2 (warehouse capacity 90):
    to destination 1:  one price bracket, 6
    to destination 2:  5 [65 ≤ x22 < ∞],   6 [20 ≤ x22 < 65],   8 [0 ≤ x22 < 20]
    to destination 3:  8 [10 ≤ x23 < ∞],   9 [5 ≤ x23 < 10],    10 [0 ≤ x23 < 5]
    to destination 4:  one price bracket, 15
Source 3 (warehouse capacity 55):
    to destination 1:  1 [27 ≤ x31 < ∞],   2 [20 ≤ x31 < 27],   3 [0 ≤ x31 < 20]
    to destination 2:  3 [60 ≤ x32 < ∞],   4 [30 ≤ x32 < 60],   5 [0 ≤ x32 < 30]
    to destination 3:  10 [20 ≤ x33 < ∞],  11 [10 ≤ x33 < 20],  12 [0 ≤ x33 < 10]
    to destination 4:  5 [30 ≤ x34 < ∞],   6 [20 ≤ x34 < 30],   7 [0 ≤ x34 < 20]
Market demands:  destination 1: 70,  destination 2: 60,  destination 3: 35,  destination 4: 60

TABLE 2 — Optimal Solution to the Linear Underapproximation Problem (l_ij values; * marks a basic cell)

Source \ Destination      1         2         3         4      Warehouse Capacity
        1               3.43      6.25      4.20*     4.00*          80
        2               6.00*     6.67*     8.43*    15.00           90
        3               1.85*     4.55     10.86      5.91           55
Market demands            70        60        35        60

As an example of the calculation of the l_ij values, we will look at l_11.  First, it is necessary
to calculate f^2_11 and f^3_11 using (6).  We will do f^2_11:

         f^2_11 = C^1_11 (λ^1_11 − λ^0_11) − C^2_11 λ^1_11 = (5)(10) − (4)(10) = 10.

Similarly, f^3_11 = 30.  Also, u_11 = min {80, 70} = 70.  Then, l_11 = [(3)(70) + 30]/70 = 3.43.
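The arithmetic above is easy to verify directly; a minimal check (Python, for illustration only)
for arc (1,1) of Table 1 is:

    # Arc (1,1): cost 5 on [0,10), 4 on [10,20), 3 on [20, infinity); supply 80, demand 70.
    breaks, costs = [10, 20], [5, 4, 3]
    f2 = (costs[0] - costs[1]) * breaks[0]        # (5 - 4)*10  = 10
    f3 = f2 + (costs[1] - costs[2]) * breaks[1]   # 10 + (4 - 3)*20 = 30
    u11 = min(80, 70)                             # = 70, which lies in the third cost range
    l11 = (costs[2] * u11 + f3) / u11             # (3*70 + 30)/70 = 3.43 (rounded)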

Now, if we solve this continuous transportation problem, we get a value of Z = 1042.20,
with the starred cells being basic.  If we compute the feasible value of this solution under (1),
Z = 1067.  Call this solution x^1.

Now,  since  this  solution  is  nondegenerate,  we  may  use  simplex  pivoting  to  look  at  each 
nonbasic  cell.  The  values  of  these  adjacent  vertices  are  given  below: 

         Vertex      Z-Value
         (1,1)       1067.10
         (1,2)       1118.40
         (2,4)       1143.75
         (3,2)       1154.40
         (3,3)       1141.05
         (3,4)       1069.80

Since  the  Z-value  for  each  vertex  is  greater  than  the  present  value  of  Z,  we  do  not  need  to  rank 
any  other  vertices,  and  Z=  1067.0  is  the  optimal  solution  value. 

4.   COMPUTATIONAL  RESULTS 

To test the vertex ranking procedure discussed here, randomly-generated problems were
run.  These problems were generated by first generating supplies and demands uniformly
between upper and lower bounds, U and L.  These supplies and demands were generated so
that they were all multiples of 5.  This was done to insure the presence of degeneracy in some
of the problems.  All problems were set up to have discount ranges at 20, 50, 300, 1000, and
2000.  By proper selection of L and U, various numbers of ranges could be tested.

The costs for each arc were generated by randomly generating mileages between each pair
of nodes and then inputting discounted cost-per-mile values for each range of flow, e.g., 10, 9,
8, etc.  The final discounted costs were found by multiplying the mileage for each arc by
the various cost-per-mile values.  In this way, various supply-demand discount ranges and cost
configurations could be tested.  These problems were generated and solved using a computer code in FOR-
TRAN run on the CYBER 70/74 using the FTN Compiler with OPT = 1.

The problem characteristics and test results are given in Table 3.  The second column
shows the number of vertices of the linear underapproximation, other than the optimal solution,
that were ranked to solve each problem, while the third column gives the solution time in
seconds.  The fourth column gives the size of the problem (m x n); the fifth column gives the
number of cost ranges that the arc flows would cover; the sixth column gives the cost per mile
for each range of flow, p^k_ij; the seventh column gives the lower and upper ranges used to gen-
erate the supplies and demands; and finally, the last column gives the ranges used to generate
mileages.  The C^k_ij values were determined by C^k_ij = p^k_ij x (mileage).  As can be seen, the algo-
rithm successfully solved all problems tested.  The most difficult problems were those with
three ranges and supplies/demands between 5 and 100.  Problems 6 and 13 are identical, except
that 6 is over only 3 ranges, while 13 is over 5; but problem 13 is solved in much less time.  In
fact, the linear underapproximating transportation problem was found to be optimal and no
other extreme points were even ranked.  This was also the case in problems 7, 9, 10, 11, and




12, even though the number of variables increased markedly.  It is also interesting to note the
effect of costs in problems 5, 6, and 7.  These are essentially the same problem, but with the
percentage decrease in cost for increasing flow being smaller in each case.  The results are as
expected, since in problem 7 the linear underapproximation will be closer to the actual objective
function than in problems 5 and 6.

TABLE 3 — Computational Results

Problem   Vertices   Solution           Number of
Number    Ranked     Time (s)   m x n   Ranges      p^k_ij              L,U        Mileage Range
   1         13       2.604     6 x 8      3        10,9,8              1,50         100,200
   2         39      10.289     8 x 8      3        10,9,8              1,50         100,200
   3         39      34.103     9 x 9      3        10,9,8              1,50         100,200
   4        257      42.938     6 x 8      3        10,9,8              1,100        100,200
   5        247      39.799     6 x 8      3        20,18,17            1,100        100,200
   6         84      13.353     6 x 8      3        20,19,18            1,100        100,200
   7          0        .121     4 x 6      5        20,19,18,17,16      400,500       50,100
   8         18        .888     4 x 8      5        20,19,18,17,16      400,500       50,100
   9          0        .196     6 x 8      5        20,19,18,17,16      400,500       50,100
  10          0        .393     8 x 8      5        20,19,18,17,16      400,500       50,100
  11          0        .518     9 x 9      5        20,19,18,17,16      400,500       50,100
  12          0        .130     4 x 6      5        10,9,8,7,6          400,500       50,100
  13          0        .213     6 x 8      5        20,19,18,17,16      400,500       100,200

It  would  appear  from  these  results  that  vertex  ranking  does  hold  promise  as  a  solution 
procedure  for  incremental  cost  discount  transportation  problems.  Neither  size  of  problem  nor 
degeneracy  appears  to  have  any  effect  on  solution  time  but  cost  patterns  and  number  of  cost 
ranges  do  seem  to  have  a  marked  effect. 

Extensions  of  this  work  could  be  used  to  solve  other  concave  linear  programming  prob- 
lems. Walker  [16]  discusses  the  fact  that  these  can  be  considered  as  generalizations  of  fixed 
charge  problems.  The  main  difference  would  be  that  the  first  linear  portion  would  have  a  posi- 
tive fixed  charge  rather  than  zero,  as  in  the  problem  discussed  here.  However,  this  would  not 
change  the  approach  to  the  solution  used  here. 

REFERENCES 


[1]  Balachandran,  V.  and  A.  Perry,  "Transportation  Type  Problems  with  Quantity  Discounts," 
Naval  Research  Logistics  Quarterly,  23,  195-209  (1976). 

[2]  Barr,  R.L.,  "The  Fixed  Charge  Transportation  Problem,"  presented  at  the  Joint  National 
Meeting  of  ORSA/TIMS  in  Puerto  Rico  (Nov.  1974). 

[3]  Cabot,  A.V.  and  R.L.  Francis,  "Solving  Certain  Nonconvex  Quadratic  Minimization  Prob- 
lems by  Ranking  the  Extreme  Points,"  Operations  Research  18,  82-86  (1970). 

[4]  Chernikova,  N.V.,  "Algorithm  for  Finding  a  General  Formula  for  the  Non-negative  Solu-
tions of  a  System  of  Linear  Inequalities,"  U.S.S.R.  Computational  Mathematics  and
Mathematical  Physics.

,[5]  Efroymson,  M.A.  and  T.L.  Ray,  "A  Branch-Bound  Algorithm  for  Plant  Location,"  Opera- 
tions Research,  14,  361-368  (1966). 

[6]  Fluharty,  R.,  "Solving  Quadratic  Assignment  Problems  by  Ranking  the  Assignments," 
unpublished  Master's  Thesis,  Ohio  State  University  (1970). 

[7]  Hirsch,  W.M.  and  A.J.  Hoffman,  "Extreme  Varieties,  Concave  Functions,  and  The  Fixed 
Charge  Problem,"  Communications  on  Pure  and  Applied  Mathematics,  14,  355-370 
(1961). 



[8]    Kennington,  J.L.,  "The  Fixed  Charge  Transportation  Problem:  A  Computational  Study 

with  a  Branch  and  Bound  Code,"  AIIE  Transactions,  8  (1976). 
[9]    McKeown,  P.G.  and  D.S.  Rubin,  "Adjacent  Vertices  on  Transportation  Polytopes,"  Naval 
Research  Logistics  Quarterly,  22,  365-374  (1975). 

[10]   McKeown,  P.G.,  "A  Vertex  Ranking  Procedure  for  Solving  the  Linear  Fixed  Charge  Prob- 
lem," Operations  Research  23,  1183-1191    (1975). 

[11]     McKeown,    P.G.,   "Extreme   Point   Ranking   Algorithms:     A   Computational   Survey," 
Proceedings  of  Bicentennial  Conference  on  Mathematical  Programming  (1976). 

[12]    Murty,  K.,  "Solving  the  Fixed  Charge  Problem  by  Ranking  the  Extreme  Points,"  Opera- 
tions Research,  16,  268-279  (1968). 

[13]    Shapiro,  D.,  "Algorithms  for  the  Solution  of  the  Optimal  Cost  Travelling  Salesmen  Prob- 
lem," Sc.D.  Thesis,  Washington  University,  St.  Louis  (1966). 

[14]    Townsend,  W.,  "A  Production  Stocking  Problem  Analogous  to  Plant  Location,"  Opera- 
tions Research  Quarterly,  26,  389-396  (1975). 

[15]    Vogt,  L.  and  J.  Evan,  "Piecewise  Linear  Programming  Solutions  of  Transportation  Costs
as  Obtained  from  Rate  Traffic,"  AIIE  Transactions,  4  (1972).
[16]    Walker,  Warren  E.,  "A  Heuristic  Adjacent  Extreme  Point  Algorithm  for  the  Fixed  Charge
Problem,"  Management  Science,  22,  587-596  (1976).


AUXILIARY  PROCEDURES  FOR  SOLVING  LONG 
TRANSPORTATION  PROBLEMS 

J.  Intrator  and  M.  Berrebi 

Bar-Ilan  University
Ramat-Gan,  Israel 

ABSTRACT 

An  efficient  auxiliary  algorithm  for  solving  transportation  problems,  based 
on  a  necessary  but  not  sufficient  condition  for  optimum,  is  presented. 


In  this  paper  a  necessary  (but  not  sufficient)  condition  for  a  given  feasible  solution  to  a 
transportation  problem  to  be  optimal  is  established,  and  a  special  algorithm  for  finding  solutions 
which  satisfy  this  condition  is  adapted  as  an  auxiliary  procedure  for  the  MODI  method. 

Experimental results presented show that finding an initial solution which satisfies this
necessary condition for problems with m << n eliminates 70%-90% of the MODI iterations
(see Table 1).


TABLE 1 — Matrix of Principal Results

    m \ n      20      30      40      50      100     200     300
      4       0.65    0.69    0.72    0.74    0.88    0.91    0.93
      5       0.61    0.67    0.69    0.71    0.84    0.87    0.90
      6       0.59    0.65    0.66    0.68    0.80    0.82    0.85
      8       0.61    0.62    0.64    0.66    0.76    0.80    0.82
     10       0.57    0.65    0.66    0.69    0.73    0.77    0.80
     20       0.25    0.27    0.31    0.36    0.45    0.50    0.52

Fraction of MODI iterations eliminated by using the method presented in this paper.

The case where our algorithm is used during the solution process (especially for m ≈ n) is
presently being examined.  Our auxiliary procedure requires relatively little computational effort
in finding the appropriate candidate for the basis, eliminating entirely the need to calculate the
dual variables.  It works with the positive variables associated with one pair of rows at a time,
using only the costs of these rows.


Once  a  loop  for  any  given  pair  of  rows  is  determined  it  may  be  used  to  insert  numerous 
non-basic  cells  in  these  two  rows  to  the  basis.  The  result  is  a  considerable  time  reduction  in 
determining  loops. 

The storage and time requirements for the special lists needed in our auxiliary algorithm
are fully discussed in [1].  A rigorous proof presented in [1] shows that updating these lists
requires no more than O(m log n) computer operations per MODI iteration.

A Linear Programming Transportation Problem is characterized by a cost matrix C and
two positive requirement vectors a and b such that Σ_i a_i = Σ_j b_j.  The problem is to minimize
Σ_i Σ_j c_ij x_ij subject to

         Σ_i x_ij = b_j       j = 1, 2, ..., n,

(A)      Σ_j x_ij = a_i       i = 1, 2, ..., m,

         x_ij ≥ 0             for all (i,j).
A proper perturbation of our problem ensures that:

(1)  each feasible basic solution of (A) contains exactly m + n − 1 positive variables x_ij,

(2)  corresponding to each nonbasic cell (i,j) (x_ij = 0) there exists a unique loop of distinct cells, say

(B)      L(i,j) = (i,j_1) (i_2,j_1) (i_2,j_2) (i_3,j_2) ... (i_r,j_{r-1}) (i_r,j) (i,j),

which contains at most two cells in each row and column, where the cell (i,j) is the unique
nonbasic cell,

(3)  there are no loops which contain basic cells only.

NOTATION:  For fixed l, k (1 ≤ l ≠ k ≤ m) we denote

         V_l = {j | x_lj > 0},    1 ≤ j ≤ n,

         V_lk = V_l ∩ V_k = {j | x_lj > 0, x_kj > 0}.

With no loss of generality it is assumed that for each l (1 ≤ l ≤ m) there exists at least one
destination (column) j ∈ V_l such that (l,j) is the unique basic cell of column j.  Otherwise, an
artificial destination, say J, with x_lJ = b_J = ε, where ε is an infinitely small positive number,
will be introduced.

It is easy to see that the feasible solution of the augmented problem of dimension
m x (n + 1) satisfies (1), (2), (3) mentioned above.

DEFINITION 1:  A destination with a unique basic cell will be called a fundamental desti-
nation.

The unique nonbasic cell (i,j) of L(i,j) will be considered, for convenience, to be the last
cell of L(i,j).  For each loop L(i,j), say loop (B), we introduce the notation:

(C)      C_L(i,j) = c_{i,j_1} − c_{i_2,j_1} + c_{i_2,j_2} − c_{i_3,j_2} + ... + c_{i_r,j} − c_{i,j}.



It is worth noting that

         C_L(i,j) = u_i + v_j − c_ij,   where u_i and v_j are the dual prices.

DEFINITION 2:  A loop with C_L > 0 is called an improving loop.

DEFINITION 3:  Let l, k be a fixed pair of numbers with 1 ≤ l ≠ k ≤ m.  We define

         A_lk = {j | x_lj > 0, x_kj = 0} = V_l − V_lk,

         D_lk(j) = c_lj − c_kj,     j = 1, 2, ..., n.

THEOREM 1:  The number of elements in V_lk is at most 1.

PROOF:  Suppose that J_1, J_2 ∈ V_lk (1 ≤ J_1 ≠ J_2 ≤ n); then the loop (l,J_1) (k,J_1) (k,J_2)
(l,J_2) consists of basic cells only, contradicting (3) above.

Let J_1 be a fundamental destination of A_lk.  The purpose of Theorem 2 and Theorem 3 is
to show that all the simplex loops L(i,J) and the numbers C_L(i,J), i = l, k, J ∈ A_lk ∪ A_kl, are
determined once the simplex loop L(k,J_1) is found.

THEOREM 2:  C_L(k,J_2) − D_lk(J_2) = C_L(k,J_1) − D_lk(J_1)  for all J_2 ∈ A_lk,
J_1 being the above fundamental destination of A_lk.

PROOF:

CASE (a):  V_lk ≠ ∅.  Denote by j the unique member of V_lk (Theorem 1).  We have j ≠ J_1,
j ≠ J_2 (J_1 and J_2 ∉ V_k), and

         L(k,J_1) = (k,j) (l,j) (l,J_1) (k,J_1),

         L(k,J_2) = (k,j) (l,j) (l,J_2) (k,J_2),

so that

         C_L(k,J_2) − D_lk(J_2) = C_L(k,J_1) − D_lk(J_1).

CASE (b):  V_lk = ∅.  Let L(k,J_1) be the loop (B).  Note that i_r = l (since column J_1 con-
tains a basic cell in row l exclusively) and r > 2.  Otherwise, r = 2, i_2 = l, and L(k,J_1) = (k,j_1)
(l,j_1) (l,J_1) (k,J_1), which would mean that j_1 ∈ V_lk, contradicting the fact that V_lk = ∅.

Consider the loop

         L = (k,j_1) (i_2,j_1) (i_2,j_2) (i_3,j_2) ... (l,j_{r-1}) (l,J_2) (k,J_2),    (i_1 = k),

obtained from L(k,J_1) by substituting J_2 for J_1.  Let us show that either L(k,J_2) = L or
L(k,J_2) can be obtained from L by deleting two identical cells.

At first, observe that all rows and columns of L (except perhaps J_2) contain exactly two
different cells of L.  The column J_2 has not appeared previously (unless J_2 = j_{r-1}), because if it
equals one of the previous members j_s, 1 ≤ s ≤ r − 2, then the loop (i_{s+1},j_s)
(i_{s+1},j_{s+1}) ... (i_r,J_2) would be a loop of basic cells only, which contradicts (3).

Thus,  only  two  possibilities  exist: 

(1)  J_2 ≠ j_{r-1} and L(k,J_2) = L, or

(2)  J_2 = j_{r-1} and L(k,J_2) is obtained from L by deleting the two identical cells (l,j_{r-1})
and (l,J_2).

Since this deletion does not affect the value of C_L(k,J_2), we have for both possibilities

         C_L(k,J_2) − D_lk(J_2) = C_L(k,J_1) − D_lk(J_1).

THEOREM 3:  Let J_1 and J_2 be the destinations defined in Theorem 2 and let J_3 ∈ A_kl.  We
shall prove that

         C_L(l,J_3) = − [C_L(k,J_2) − D_lk(J_2)] + D_kl(J_3).

PROOF:  Let L(k,J_1) be the loop (B), with i_r = l because J_1 is fundamental.
Consider the loop L defined by

         L = (i_r,j_{r-1}) (i_{r-1},j_{r-1}) ... (i_2,j_1) (i_1,j_1) (k,J_3) (l,J_3),    (i_1 = k, i_r = l).

CASE (a):  V_lk ≠ ∅.  Same proof as in Theorem 2.

CASE (b):  V_lk = ∅.  By the same argument as in Theorem 2 we can show that there are
only two possibilities:

1)  J_3 ≠ j_1, which implies that L(l,J_3) = L.

2)  J_3 = j_1.  In this case (r > 2) L(l,J_3) can be obtained from L by deleting the two identi-
cal cells (i_1,j_1) and (k,J_3).

In the two cases we have

         C_L(l,J_3) = − [C_L(k,J_1) − D_lk(J_1)] + D_kl(J_3),

and by Theorem 2 we have

         C_L(l,J_3) = − [C_L(k,J_2) − D_lk(J_2)] + D_kl(J_3).

THEOREM 4:  If D_lk(J_2) > D_lk(J_3), then either L(k,J_2) or L(l,J_3) is an improving sim-
plex loop.

PROOF:  Since D_lk(J_3) = − D_kl(J_3), it follows from Theorem 3 that

         C_L(l,J_3) + C_L(k,J_2) = D_lk(J_2) − D_lk(J_3) > 0

(because D_lk(J_2) > D_lk(J_3)), and hence either C_L(l,J_3) or C_L(k,J_2) is a positive number; i.e.,
either L(l,J_3) or L(k,J_2) is an improving simplex loop (Definition 2).

COROLLARY:  At optimum we have D_lk(J_2) ≤ D_lk(J_3).

DEFINITION 4:  Define J_lk by

         D_lk(J_lk) = max_{j ∈ V_l} D_lk(j).




REMARK 1:  We shall suppose that D_lk(j_1) = D_lk(j_2) if and only if j_1 = j_2.  Otherwise,
a cost-perturbed problem with C'_ij = C_ij + ε^{mi+j} can be considered, and

         D'_lk(j_1) − D'_lk(j_2) = C_{l,j_1} − C_{k,j_1} + ε^{ml+j_1} − ε^{mk+j_1} − C_{l,j_2} + C_{k,j_2} − ε^{ml+j_2} + ε^{mk+j_2},

which for sufficiently small ε > 0 is equal to 0 only for j_1 = j_2.


THEOREM 5:  If at optimality V_lk = ∅, then D_lk(J_lk) < D_lk(J_kl) (J_lk ≠ J_kl); otherwise
D_lk(J_lk) = D_lk(J_kl) (J_lk = J_kl).

PROOF:  If V_lk = ∅ then D_lk(J_lk) ≠ D_lk(J_kl); otherwise (by cost perturbation) J_lk = J_kl
and V_lk ≠ ∅.

The first part of Theorem 5 follows now immediately from the corollary of Theorem 4.

If V_lk ≠ ∅ and j is the unique element of V_lk, then by the definition of J_lk and from j ∈ V_l we
have D_lk(j) ≤ D_lk(J_lk).

Let us show that j = J_lk.  Suppose that j ≠ J_lk; then we have D_lk(j) < D_lk(J_lk) (Definition 4),
and the simplex loop

         L(k,J_lk) = (k,j) (l,j) (l,J_lk) (k,J_lk)

will be an improving simplex loop, since

         C_L(k,J_lk) = D_kl(j) + D_lk(J_lk) = D_lk(J_lk) − D_lk(j) > 0,

contradicting the fact that we have optimality.

Thus, j = J_lk.  By the same argument we have j = J_kl.

A simple algorithm consists of

1)  computing the differences D_lk(J_lk),

2)  comparing D_lk(J_lk) to D_lk(J_kl).

If D_lk(J_lk) > D_lk(J_kl) (or if J_lk ≠ J_kl for non-empty V_lk) we improve our solution, using
all the nonbasic cells (l,J) or (k,J) with J ∈ (V_l ∪ V_k) such that D_lk(J_kl) < D_lk(J) ≤ D_lk(J_lk),
by searching only the first loop involving the rows l and k.
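A minimal sketch of steps 1) and 2) for a single pair of rows (Python, for illustration; x and c
stand for the current flow and cost matrices) is:

    def check_pair(l, k, x, c, n):
        # V_l, V_k: destinations carrying positive flow in rows l and k
        V_l = [j for j in range(n) if x[l][j] > 0]
        V_k = [j for j in range(n) if x[k][j] > 0]
        D_lk = lambda j: c[l][j] - c[k][j]
        J_lk = max(V_l, key=D_lk)                  # Definition 4
        J_kl = max(V_k, key=lambda j: -D_lk(j))    # argmax of D_kl(j) = -D_lk(j)
        improvable = D_lk(J_lk) > D_lk(J_kl)       # violation of the necessary condition
        return J_lk, J_kl, improvable

Only the costs of rows l and k are touched, and no dual variables are needed, which is the point
of the auxiliary procedure.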

The other loops will be obtained by changing the last two cells, keeping the 2r − 2 first cells
in the same order or in the opposite order (Theorem 2 and Theorem 3).

REMARK:  In order to assure that the first loop will not be a shortened loop, this loop
will be obtained by using a fundamental artificial destination J with only one basic cell, in row k,
with x_kJ = ε.

The proposed technique was applied to each pair of rows (l,k) until D_lk(J_lk) ≤ D_lk(J_kl) for
all 1 ≤ l ≠ k ≤ m.  At that point the MODI method was implemented.  Performing a MODI
iteration frequently caused D_lk(J_lk) > D_lk(J_kl) for some 1 ≤ l ≠ k ≤ m, which would enable
further utilization of the proposed technique.  However, for the purpose of the present experi-
ment the proposed technique was not reactivated after the initial processing (see Table 1).

The  storage  and  time  requirements  of  the  lists  Jlk  when  updated  at  each  MODI  iteration 
are  fully  discussed  in  [1]. 



One possible way to update this list may be described as follows.  For each l the destina-
tions j ∈ V_l are ordered in m − 1 sequences P_lk (1 ≤ k ≠ l ≤ m) of increasing D_lk(j).
Thus P_lk = (j_1; j_2; ...; j_{N_l}) (N_l = the number of elements in V_l), with D_lk(j_1) < D_lk(j_2) < ...
< D_lk(j_{N_l}) (equality excluded because of the supposed cost perturbation).  These P_lk sequences
are organized in heaps.  Adding or deleting an item from a heap requires O(log N_l) ≤ O(log n)
computer operations.  Since at each simplex iteration only one basic cell, say (σ,τ), becomes
nonbasic and one nonbasic cell, say (s,t), becomes basic, we have to update 2(m − 1) heaps
(P_{σp} and P_{sr} for all p ≠ σ, r ≠ s), which amounts to O(m log n) computer operations per
simplex iteration (on heaps, see [2]).
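As a sketch of this bookkeeping (Python, using the standard heapq module; the lazy discarding
of superseded entries is an implementation convenience and is not taken from [1]), each
sequence P_lk can be kept as a heap from which J_lk is read off the top:

    import heapq

    class PairHeap:
        """One P_lk: destinations j of V_l keyed by D_lk(j), largest first."""
        def __init__(self):
            self.heap = []        # entries (-D_lk(j), j)
            self.current = {}     # j -> current D_lk(j) for j in V_l
        def add(self, j, d):      # j enters V_l (or its key changes): O(log N_l)
            self.current[j] = d
            heapq.heappush(self.heap, (-d, j))
        def remove(self, j):      # j leaves V_l: O(1); stale heap entries are dropped later
            self.current.pop(j, None)
        def J_lk(self):           # destination attaining max D_lk(j) over V_l
            while self.heap:
                d_neg, j = self.heap[0]
                if self.current.get(j) == -d_neg:
                    return j
                heapq.heappop(self.heap)   # stale entry: j left V_l or its key changed
            return None

At each MODI iteration only the 2(m − 1) heaps touched by the leaving cell (σ,τ) and the
entering cell (s,t) need an add or a remove, consistent with the O(m log n) bound quoted above.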

REFERENCES 

[1]  Brandt,  A.  and  J.  Intrator,  "Fast  Algorithms  for  Long  Transportation  Problems,"  Computers 

and  Operations  Research  5,  263-271  (1978). 
[2]  Knuth,  D.E.,  The  Art  of  Computer  Programming,  3,  Sorting  and  Searching,  Addison  Wesley 

(1973). 


ON  THE  GENERATION  OF  DEEP  DISJUNCTIVE  CUTTING  PLANES* 

Hanif  D.  Sherali  and  C.  M.  Shetty 

School  of  Industrial  &  Systems  Engineering 

Georgia  Institute  of  Technology 

Atlanta,  Georgia 

ABSTRACT 

In this paper we address the question of deriving deep cuts for nonconvex
disjunctive  programs.  These  problems  include  logical  constraints  which  restrict 
the  variables  to  at  least  one  of  a  finite  number  of  constraint  sets.  Based  on  the 
works  of  Balas,  Glover,  and  Jeroslow,  we  examine  the  set  of  valid  inequalities 
or  cuts  which  one  may  derive  in  this  context,  and  defining  reasonable  criteria 
to  measure  depth  of  a  cut  we  demonstrate  how  one  may  obtain  the  "deepest" 
cut.  The  analysis  covers  the  case  where  each  constraint  set  in  the  logical  state- 
ment has  only  one  constraint  and  is  also  extended  for  the  case  where  each  of 
these  constraint  sets  may  have  more  than  one  constraint. 

1.  INTRODUCTION 

A  Disjunctive  Program  is  an  optimization  problem  where  the  constraints  represent  logical 
conditions.  In  this  study  we  are  concerned  with  such  conditions  expressed  as  linear  constraints. 
Several  well-known  problems  can  be  posed  as  disjunctive  programs,  including  the  zero-one 
integer  programs.  The  logical  conditions  may  include  conjunctive  statements,  disjunctive  state- 
ments, negation  and  implication  as  discussed  in  detail  by  Balas  [1,2].  However,  an  implication 
can  be  restated  as  a  disjunction,  and  conjunctions  and  negations  lead  to  a  polyhedral  constraint 
set.  Thus,  this  study  deals  with  the  harder  problem  involving  disjunctive  restrictions  which  are 
essentially  nonconvex  problems. 

It is interesting to note that disjunctive programming provides a powerful unifying theory
for cutting plane methodologies.  The approach taken by Balas [2] and Jeroslow [14] is to
characterize all valid cutting planes for disjunctive programs.  As such, it naturally leads to a
statement which subsumes prior efforts at presenting a unified theory using convex sets, polar
sets and level sets of gauge functions [1,2,5,6,8,13,14].  On the other hand, the approach taken
by Glover [10] is to characterize all valid cutting planes through relaxations of the original dis-
junctive program.  Constraints are added sequentially, and when all the constraints are con-
sidered, Glover's result is equivalent to that of Balas and Jeroslow.  Glover's approach is a con-
structive procedure for generating valid cuts, and may prove useful algorithmically.

The  principal  thrust  of  the  methodologies  of  disjunctive  programming  is  the  generation  of 
cutting  planes  based  on  the  linear  logical  disjunctive  conditions  in  order  to  solve  the 
corresponding  nonconvex  problem.  Such  methods  have  been  discussed  severally  by  Balas 
[1,2,3],  Glover  [8],  Glover,  Klingman  and  Stutz  [11],  Jeroslow  [14]  and  briefly  by  Owen  [17]. 
But  the  most  fundamental  and  important  result  of  disjunctive  programming  has  been  stated  by 


*This  paper  is  based  upon  work  supported  by  the  National  Science  Foundation  under  Grant  No.  ENG-77-23683. 


Balas  [1,2]  and  Jeroslow  [14],  and  in  a  different  context  by  Glover  [10].  It  unifies  and  sub- 
sumes several  earlier  statements  made  by  other  authors  and  is  restated  below.  This  result  not 
only  provides  a  basis  for  unifying  cutting  plane  theory,  but  also  provides  a  different  perspective 
for  examining  this  theory.  In  order  to  state  this  result,  we  will  need  to  use  the  following  nota- 
tion and  terminology. 

Consider the linear inequality systems S_h, h ∈ H, given by

(1.1)    S_h = {x: A^h x ≥ b^h, x ≥ 0},   h ∈ H,

where H is an appropriate index set.  We may state a disjunction in terms of the sets S_h, h ∈ H,
as a condition which asserts that a feasible point must satisfy at least one of the constraint sets S_h,
h ∈ H.  Notationally, we imply by such a disjunction the restriction x ∈ ∪_{h∈H} S_h.  Based on this
disjunction, an inequality π^t x ≥ π_0 will be considered a valid inequality or a valid disjunctive cut
if it is satisfied for each x ∈ ∪_{h∈H} S_h.  (The superscript t will throughout be taken to denote the
transpose operation.)  Finally, for a set of vectors {v^h: h ∈ H}, where v^h = (v^h_1, ..., v^h_n) for
each h ∈ H, we will denote by sup_{h∈H} (v^h) the pointwise supremum v = (v_1, ..., v_n) of the vec-
tors v^h, h ∈ H, such that v_j = sup_{h∈H} (v^h_j) for j = 1, ..., n.

Before proceeding, we note that a condition which asserts that a feasible point must satisfy
at least p of some q sets, p ≤ q, may be easily transformed into the above disjunctive statement
by letting each S_h denote the conjunction of the q original sets taken p at a time.  Thus, |H|
equals the binomial coefficient (q choose p) in this case.  Now consider the following result.


THEOREM 1 (Basic Disjunctive Cut Principle — Balas [1,2], Glover [10], Jeroslow [14]):

Suppose that we are given the linear inequality systems S_h, h ∈ H, of Equation (1.1), where
|H| may or may not be finite.  Further, suppose that a feasible point must satisfy at least one
of these systems.  Then, for any choice of nonnegative vectors λ^h, h ∈ H, the inequality

(1.2)    [sup_{h∈H} (λ^h)^t A^h] x ≥ inf_{h∈H} (λ^h)^t b^h

is a valid disjunctive cut.  Furthermore, if every system S_h, h ∈ H, is consistent, and if |H| < ∞,
then for any valid inequality Σ_{j=1}^n π_j x_j ≥ π_0, there exist nonnegative vectors λ^h, h ∈ H, such
that π_0 ≤ inf_{h∈H} (λ^h)^t b^h and, for j = 1, ..., n, the jth component of sup_{h∈H} (λ^h)^t A^h does not
exceed π_j.

The forward part of the above theorem was originally proved by Balas [2] and the con-
verse part by Jeroslow [14].  This theorem has also been independently proved by Glover [10]
in a somewhat different setting.  The theorem merely states that given a disjunction x ∈ ∪_{h∈H} S_h,
one may generate a valid cut (1.2) by specifying any nonnegative values for the vectors λ^h,
h ∈ H.  The versatility of the latter choice is apparent from the converse, which asserts that so
long as we can identify and delete any inconsistent systems S_h, h ∈ H, then given any valid cut
π^t x ≥ π_0, we may generate a cut of the type (1.2) by suitably selecting values for the parame-
ters λ^h, h ∈ H, such that for any x belonging to the nonnegative orthant of R^n, if (1.2) holds
then we must have π^t x ≥ π_0.  In other words, we can make a cut of the type (1.2) uniformly
dominate any given valid inequality or cut.  Thus, any valid inequality is either a special case of



(1.2) or may be strictly dominated by a cut of type (1.2).  In this connection, we draw the
reader's attention to the work of Balas [1], in which several convexity/intersection cuts dis-
cussed in the literature are recovered from the fundamental disjunctive cut.  Note that since the
inequality (1.2) defines a closed convex set, then for it to be valid, it must necessarily contain
the polyhedral set

(1.3)    S = convex hull of ∪_{h∈H} S_h.

Hence, one may deduce that a desirable deep cut would be a facet of S, or at least would sup-
port it.  Indeed, Balas [3] has shown how one may generate, with some difficulty, cuts which
contain as a subset the facets of S when |H| < ∞.  Our approach to developing deep disjunc-
tive cuts will bear directly on Theorem 1.  Specifically, we will indicate how one may
specify values for the parameters λ^h to provide supports of S, and will discuss some specific criteria
for choosing among supports.  We will devote our attention to the following two disjunc-
tions, titled DC1 and DC2.  We remark that most disjunctive statements can be cast in the for-
mat of DC2.  Disjunction DC1 is a special case of disjunction DC2, and is discussed first
because it facilitates our presentation.
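As a small illustration of Theorem 1 (a Python/NumPy sketch; the choice of multipliers is left
entirely to the caller and nothing here is specific to the cuts developed later), the inequality
(1.2) is produced by a componentwise supremum and an infimum over h ∈ H:

    import numpy as np

    def disjunctive_cut(systems, multipliers):
        # systems:     list of pairs (A_h, b_h) with S_h = {x : A_h x >= b_h, x >= 0}
        # multipliers: list of nonnegative vectors lambda_h, one per system
        rows = [np.asarray(lam) @ np.asarray(A) for (A, b), lam in zip(systems, multipliers)]
        rhss = [np.asarray(lam) @ np.asarray(b) for (A, b), lam in zip(systems, multipliers)]
        pi  = np.max(rows, axis=0)    # componentwise sup over h of (lambda_h)' A_h
        pi0 = min(rhss)               # inf over h of (lambda_h)' b_h
        return pi, pi0                # the valid cut is  pi' x >= pi0   (Equation (1.2))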


DC1:

Suppose that each system S_h is comprised of a single linear inequality, that is, let

(1.4)    S_h = {x: Σ_{j=1}^n a^h_{1j} x_j ≥ b^h_1, x ≥ 0}    for h ∈ H = {1, ..., h̄},

where we assume that h̄ = |H| < ∞ and that each inequality in S_h, h ∈ H, is stated with the ori-
gin as the current point at which the disjunctive cut is being generated.  Then, the disjunctive
statement DC1 is that at least one of the sets S_h, h ∈ H, must be satisfied.  Since the current
point (origin) does not satisfy this disjunction, we must have b^h_1 > 0 for each h ∈ H.  Further,
we will assume, without loss of generality, that for each h ∈ H, a^h_{1j} > 0 for some
j ∈ {1, ..., n}, or else S_h is inconsistent and we may disregard it.

DC2:

Suppose each system S_h is comprised of a set of linear inequalities, that is, let

(1.5)    S_h = {x: Σ_{j=1}^n a^h_{ij} x_j ≥ b^h_i for each i ∈ Q_h, x ≥ 0}    for h ∈ H = {1, ..., h̄},

where Q_h, h ∈ H, are appropriate constraint index sets.  Again, we assume that h̄ = |H| < ∞
and that the representation in (1.5) is with respect to the current point as the origin.  Then, the
disjunctive statement DC2 is that at least one of the sets S_h, h ∈ H, must be satisfied.  Although
it is not necessary here that b^h_i > 0 for all i ∈ Q_h, one may still state a valid disjunction by delet-
ing all constraints with b^h_i ≤ 0, i ∈ Q_h, from each set S_h, h ∈ H.  Clearly a valid cut for the
relaxed constraint set is valid for the original constraint set.  We will thus obtain a cut which
possibly is not as strong as may be derived from the original constraints.  To aid in our develop-
ment, we will therefore assume henceforth that b^h_i > 0, i ∈ Q_h, h ∈ H.

Before  proceeding  with  our  analysis,  let  us  briefly  comment  on  the  need  for  deep  cuts. 
Although  intuitively  desirable,  it  is  not  always  necessary  to  seek  a  deepest  cut.  For  example,  if 
one  is  using  cutting  planes  to  implicitly  search  a  feasible  region  of  discrete  points,  then  all  cuts 
which  delete  the  same  subset  of  this  discrete  region  may  be  equally  attractive  irrespective  of 
their  depth  relative  to  the  convex  bull  of  this  discrete  region.  Such  a  situation  arises,  for  exam- 
ple, in  the  work  of  Majthay  and  Whinston  [16].   On  the  other  hand,  if  one  is  confronted  with 



the  problem  of  iteratively  exhausting  a  feasible  region  which  is  not  finite,  as  in  [20]  for  exam- 
ple, then  indeed  deep  cuts  are  meaningful  and  desirable. 

2.   DEFINING  SUITABLE  CRITERIA  FOR  EVALUATING  THE  DEPTH  OF  A  CUT 

In  this  section,  we  will  lay  the  foundation  for  the  concepts  we  propose  to  use  in  deriving 
deep  cuts.   Specifically,  we  will  explore  the  following  two  criteria  for  deriving  a  deep  cut: 

(i)      Maximize  the  euclidean  distance  between  the  origin  and  the  nonnegative  region 
feasible  to  the  cutting  plane 

(ii)     Maximize  the  rectilinear  distance  between  the  origin  and  the  nonnegative  region 
feasible  to  the  cutting  plane. 

Let us briefly discuss the choice of these criteria.  Referring to Figures 1(a) and (b), one
may observe that simply attempting to maximize the euclidean distance from the origin to the
cut can favor weaker over strictly stronger cuts.  However, since one is only interested in the
subset of the nonnegative orthant feasible to the cuts, the choice of criterion (i) above avoids
such anomalies.  Of course, as Figure 1(b) indicates, it is possible for this criterion to be unable
to recognize dominance, and to treat two cuts as alternative optimal cuts even though one cut
dominates the other.

Let us now proceed to characterize the euclidean distance from the origin to the nonnega-
tive region feasible to a cut

(2.1)    Σ_{j=1}^n z_j x_j ≥ z_0,   where z_0 > 0 and z_j > 0 for some j ∈ {1, ..., n}.

The required distance is clearly given by

(2.2)    θ_e = minimum {||x||: Σ_{j=1}^n z_j x_j ≥ z_0, x ≥ 0}.

Consider  the  following  result. 

LEMMA 1:  Let θ_e be defined by Equations (2.1) and (2.2).  Then

(2.3)    θ_e = z_0 / ||y||,

where

(2.4)    y = (y_1, ..., y_n),   y_j = maximum {0, z_j},   j = 1, ..., n.

PROOF:  Note that the solution x* = (z_0 / ||y||^2) y is feasible to the problem in (2.2) with
||x*|| = z_0 / ||y||.  Moreover, for any x feasible to (2.2), we have z_0 ≤ Σ_{j=1}^n z_j x_j ≤ Σ_{j=1}^n y_j x_j ≤
||y|| ||x||, or that ||x|| ≥ z_0 / ||y||.  This completes the proof.

Now, let us consider the second criterion.  The motivation for this criterion is similar to
that for the first criterion and, moreover, as we shall see below, the use of this criterion has
intuitive appeal.  First of all, given a cut (2.1), let us characterize the rectilinear distance from
the origin to the nonnegative region feasible to this cut.  This distance is given by

(2.5)    θ_r = minimum {|x|: Σ_{j=1}^n z_j x_j ≥ z_0, x ≥ 0},   where |x| = Σ_{j=1}^n x_j.


Consider  the  following  result. 


Figure 1.  Recognition of dominance (criterion values for the cuts of panels (a) and (b)).

LEMMA 2:  Let θ_r be defined by Equations (2.1) and (2.5).  Then

(2.6)    θ_r = z_0 / z_m,   where z_m = maximum_{j=1,...,n} z_j.

PROOF:  Note that the solution x* = (0, ..., z_0/z_m, ..., 0), with the mth component
being non-zero, is feasible to the problem in (2.5) with |x*| = z_0/z_m.  Moreover, for any x feasi-
ble to (2.5), we have z_0 ≤ Σ_{j=1}^n z_j x_j ≤ z_m Σ_{j=1}^n x_j = z_m |x|, or that |x| ≥ z_0/z_m.
This completes the proof.

Note  from  Equation  (2.6)  that  the  objective  of  maximizing  Gr  is  equivalent  to  finding  a 
cut  which  maximizes  the  smallest  positive  intercept  made  on  any  axis.  Hence,  the  intuitive 
appeal  of  this  criterion. 
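A direct numerical rendering of Lemmas 1 and 2 (a Python/NumPy sketch, assuming z_0 > 0
and at least one positive z_j as required in (2.1)) is:

    import numpy as np

    def cut_depth(z, z0):
        # Distances from the origin to {x >= 0 : z'x >= z0}, per (2.3) and (2.6).
        z = np.asarray(z, dtype=float)
        y = np.maximum(z, 0.0)               # y_j = max{0, z_j}
        theta_e = z0 / np.linalg.norm(y)     # euclidean distance, Lemma 1
        theta_r = z0 / z.max()               # rectilinear distance, Lemma 2 (z_m = max_j z_j)
        return theta_e, theta_r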

3.  DERIVING  DEEP  CUTS  FOR  DC1 


It is very encouraging to note that for the disjunction DC1 we are able to derive a cut
which not only simultaneously satisfies both the criteria of Section 2, but which is also a facet
of the set S of Equation (1.3).  This is a powerful statement, since all valid inequalities are given
through (1.2) and none of these can strictly dominate a facet of S.



We will find it more convenient to state our results if we normalize the linear inequalities
(1.4) by dividing through by their respective, positive, right-hand sides.  Hence, let us assume
without loss of generality that

(3.1)    S_h = {x: Σ_{j=1}^n a^h_{1j} x_j ≥ 1, x ≥ 0}    for h ∈ H = {1, ..., h̄}.

Then the application of Theorem 1 to the disjunction DC1 yields valid cuts of the form

(3.2)    Σ_{j=1}^n [max_{h∈H} λ^h_1 a^h_{1j}] x_j ≥ min_{h∈H} {λ^h_1},

where λ^h_1, h ∈ H, are nonnegative scalars.  Again, there is no loss of generality in assuming that

(3.3)    Σ_{h∈H} λ^h_1 = 1,   λ^h_1 ≥ 0,   h ∈ H = {1, ..., h̄},

since we will not allow all λ^h_1, h ∈ H, to be zero.  This is equivalent to normalizing (3.2) by
dividing through by Σ_{h∈H} λ^h_1.

Theorem 2 below derives two cuts of the type (3.2), both of which simultaneously
achieve the two criteria of the foregoing section.  However, the second cut uniformly dominates
the first cut.  In fact, no cut can strictly dominate the second cut, since it is shown to be a facet
of S as defined by (1.3).

THEOREM 2:  Consider the disjunctive statement DC1, where S_h is defined by (3.1) and is
assumed to be consistent for each h ∈ H.  Then the following results hold:

(a)  Both the criteria of Section 2 are satisfied by letting λ^h_1 = λ^h*_1, where

(3.4)    λ^h*_1 = 1/h̄    for h ∈ H,

in inequality (3.2), to obtain the cut

(3.5)    Σ_{j=1}^n a*_{1j} x_j ≥ 1,   where a*_{1j} = max_{h∈H} a^h_{1j}, for j = 1, ..., n.

(b)  Further, defining

(3.6)    γ^h_1 = minimum_{j: a^h_{1j} > 0} {a*_{1j} / a^h_{1j}} > 0,   h ∈ H,

and letting λ^h_1 = λ^h**_1, where

(3.7)    λ^h**_1 = γ^h_1 / Σ_{p∈H} γ^p_1    for h ∈ H,

in inequality (3.2), we obtain a cut of the form

(3.8)    Σ_{j=1}^n a**_{1j} x_j ≥ 1,   where a**_{1j} = max_{h∈H} a^h_{1j} γ^h_1, for j = 1, ..., n,

which again satisfies both the criteria of Section 2.

(c)  The cut (3.8) uniformly dominates the cut (3.5); in fact,

(3.9)    a**_{1j} = a*_{1j} if a*_{1j} > 0,   and   a**_{1j} ≤ a*_{1j} if a*_{1j} ≤ 0,   j = 1, ..., n.


(d)   The  cut  (3.8)  is  a  facet  of  the  set  S  of  Equation  (1.3). 

PROOF:

(a)  Clearly, λ^h_1 = 1/h̄, h ∈ H, leads to the cut (3.5) from (3.2).  Now consider the
euclidean distance criterion of maximizing θ_e (or θ_e^2) of Equation (2.3).  For cut (3.5), the
value of θ_e^2 is given by

(3.10)   (θ_e*)^2 = 1 / Σ_j (y*_j)^2 > 0,   where y*_j = max {0, a*_{1j}},   j = 1, ..., n.

Now, for any choice λ^h_1, h ∈ H,

(3.11)   θ_e^2 = [min_{h∈H} λ^h_1]^2 / Σ_j ŷ_j^2 = λ̄^2 / Σ_j ŷ_j^2,   say,

where ŷ_j = max {0, max_{h∈H} λ^h_1 a^h_{1j}} and λ̄ = min_{h∈H} λ^h_1.  If λ̄ = 0, then θ_e = 0 and, noting
(3.10), such a choice of parameters λ^h_1, h ∈ H, is suboptimal.  Hence λ̄ > 0, whence (3.11)
becomes θ_e^2 = 1 / Σ_j (ŷ_j/λ̄)^2.  But since λ^h_1/λ̄ ≥ 1 for each h ∈ H, we get

         ŷ_j/λ̄ = max {0, max_{h∈H} (λ^h_1/λ̄) a^h_{1j}} ≥ max {0, max_{h∈H} a^h_{1j}} = y*_j.

Thus θ_e^2 ≤ (θ_e*)^2, so that the first criterion is satisfied.

Now consider the maximization of θ_r of Equation (2.5), or equivalently Equation (2.6).
For the choice (3.4), the value of θ_r is given by

(3.12)   θ_r* = 1 / max_j a*_{1j} > 0.

Now, for any choice λ^h_1, h ∈ H, from Equations (2.6) and (3.2) we get

         θ_r = [min_{h∈H} λ^h_1] / [max_j max_{h∈H} λ^h_1 a^h_{1j}] = λ̄ / max_j max_{h∈H} λ^h_1 a^h_{1j},   say.

As before, λ̄ = 0 implies a value of θ_r inferior to θ_r*.  Thus, assume λ̄ > 0.  Then θ_r =
1 / [max_j max_{h∈H} (λ^h_1/λ̄) a^h_{1j}].  But λ^h_1/λ̄ ≥ 1 for each h ∈ H, and in evaluating θ_r we are
interested only in those j ∈ {1, ..., n} for which a^h_{1j} > 0 for some h ∈ H.  Thus, θ_r ≤
1 / max_j max_{h∈H} a^h_{1j} = θ_r*, so that the second criterion is also satisfied.  This proves part (a).


(b) and (c):  First of all, let us consider the values taken by γ^h_1, h ∈ H.  Note from the
assumption of consistency that γ^h_1, h ∈ H, are well defined.  From (3.5), (3.6), we must have
γ^h_1 ≥ 1 for each h ∈ H.  Moreover, if we define from (3.5)

(3.13)   H* = {h ∈ H: a^h_{1k} = a*_{1k} > 0 for some k ∈ {1, ..., n}},

then clearly H* ≠ ∅ and for h ∈ H*, Equation (3.6) implies γ^h_1 ≤ 1.  Thus,

(3.14)   γ^h_1 = 1  for h ∈ H*,   and   γ^h_1 > 1  for h ∉ H*.

Hence,

(3.15)   min_{h∈H} γ^h_1 = 1,

so that using (3.7) in (3.2) yields a cut of the type (3.8), where

(3.16)   a**_{1j} = max_{h∈H} a^h_{1j} γ^h_1,   j = 1, ..., n.

Now, let us establish relationship (3.9).  Note from (3.5) that if a*_{1j} ≤ 0, then a^h_{1j} ≤ 0
for each h ∈ H and hence, using (3.14), (3.16), we get that (3.9) holds.  Next, consider a*_{1j} > 0
for some j ∈ {1, ..., n}.  From (3.13), (3.14), (3.16), we get

(3.17)   a**_{1j} = max {max_{h∈H*} a^h_{1j},   max_{h∉H*, a^h_{1j}>0} a^h_{1j} γ^h_1},

where we have not considered h ∉ H* with a^h_{1j} ≤ 0, since a*_{1j} > 0.  But for h ∉ H* with
a^h_{1j} > 0, we get from (3.5), (3.6)

(3.18)   a^h_{1j} γ^h_1 = a^h_{1j} min_{k: a^h_{1k}>0} {a*_{1k} / a^h_{1k}} ≤ a^h_{1j} (a*_{1j} / a^h_{1j}) = a*_{1j} = max_{h∈H} a^h_{1j}.

Using (3.18) in (3.17) yields a**_{1j} = a*_{1j}, which establishes (3.9).

Finally,  we  show  that  (3.8)  satisfies  both  the  criteria  of  Section  2.  This  part  follows 
immediately  from  (3.9)  by  noting  that  the  cut  (3.5)  yields  9e  =  9*  of  (3.10)  and  0,  =  9*  of 
(3.12).   This  completes  the  proofs  of  parts  (b)  and  (c). 

(d)  Note that since (3.8) is valid, any x∈S satisfies (3.8).  Hence, in order to show that
(3.8) defines a facet of S, it is sufficient to identify n affinely independent points of S which
satisfy (3.8) as an equality, since clearly, dim S = n.  Define

(3.19)    J_1 = {j∈{1, ..., n}: ā_j > 0}  and let  J_2 = {1, ..., n} − J_1.

Consider any p∈J_1, and let

(3.20)    e_p = (0, ..., 1/ā_p, ..., 0),   p∈J_1,

have the non-zero term in the p-th position.  Now, since p∈J_1, (3.9) yields

    ā_p = a*_p = max_{h∈H} a^h_p = a^{h_p}_p,  say.

Hence, e_p ∈ S_{h_p} and so e_p ∈ S and, moreover, e_p satisfies (3.8) as an equality.  Thus, e_p, p∈J_1
qualify as |J_1| of the n affinely independent points we are seeking.

Now consider a q∈J_2.  Let us show that there exists an S_{h_q} satisfying

    γ^{h_q} a^{h_q}_p = ā_p  for some p∈J_1



and

(3.21)    γ^{h_q} a^{h_q}_q = ā_q.

From Equation (3.16), we get ā_q = max_{h∈H} γ^h a^h_q = γ^{h_q} a^{h_q}_q, say.  Then, for this h_q∈H, Equation
(3.6) yields γ^{h_q} = min_{k: a^{h_q}_k>0} {a*_k / a^{h_q}_k} = a*_p / a^{h_q}_p, say.  Or, using (3.9), γ^{h_q} a^{h_q}_p = a*_p = ā_p >
0.  Thus (3.21) holds.  For convenience, let us rewrite the set S_{h_q} below as

(3.22)    S_{h_q} = {x: a^{h_q}_p x_p + a^{h_q}_q x_q + Σ_{j≠p,q} a^{h_q}_j x_j ≥ 1, x ≥ 0}.

Now, consider the direction

(3.23)    d_q = (0, ..., −Δ a^{h_q}_q / a^{h_q}_p, ..., Δ, ..., 0)   if a^{h_q}_q < 0
          d_q = (0, ..., 0, ..., Δ, ..., 0)                          if a^{h_q}_q = 0

where Δ > 0 and the displayed non-zero terms occupy positions p and q, respectively.  Let us show that d_q is a direction for S_{h_q}.  Clearly, if ā_q = 0, then from (3.21)
a^{h_q}_q = 0 and (3.22) then establishes (3.23).  Further, if a^{h_q}_q < 0, then one may easily verify
from (3.21), (3.22), (3.23) that

    ẽ_p = (0, ..., γ^{h_q}/ā_p, ..., 0) ∈ S_{h_q}   and   ẽ_p + δ d_q ∈ S_{h_q} for each δ ≥ 0,

where ẽ_p has the non-zero term at position p.  Thus, d_q is a direction for S_{h_q}.  It can be easily
shown that this implies d_q is a direction for S.  Since e_p = (0, ..., 1/ā_p, ..., 0) of Equation
(3.20) belongs to S, then so does (e_p + d_q).  But (e_p + d_q) clearly satisfies (3.8) as an equality.
Hence, we have identified n points of S, which satisfy the cut (3.8) as an equality, of the type

    e_p = (0, ..., 1/ā_p, ..., 0)   for p∈J_1
    e_q = d_q + e_p  for some p∈J_1, for each q∈J_2

where d_q is given by (3.23).  Since these n points are clearly affinely independent, this completes the proof.

It is interesting to note that the cut (3.5) has been derived by Balas [2] and by Glover [9,
Theorem 1].  Further, the cut (3.8) is precisely the strengthened negative edge extension cut of
Glover [9, Theorem 2].  The effect of replacing λ*^h defined in (3.4) by λ̄^h defined in (3.7) is
equivalent to the translation of certain hyperplanes in Glover's theorem.  We have hence
shown through Theorem 2 how the latter cut may be derived in the context of disjunctive programming, and be shown to be a facet of the convex hull of feasible points.  Further, both
(3.5) and (3.8) have been shown to be alternative optima to the two criteria of Section 2.

In  generalizing  this  to  disjunction  DC2,  we  find  that  such  an  ideal  situation  no  longer 
exists.  Nevertheless,  we  are  able  to  obtain  some  useful  results.  But  before  proceeding  to  DC2, 
let  us  illustrate  the  above  concepts  through  an  example. 



EXAMPLE:  Let H = {1,2}, n = 3 and let DC1 be formulated through the sets

    S_1 = {x: x_1 + 2x_2 − 4x_3 ≥ 1, x ≥ 0},    S_2 = {x: (1/2)x_1 + (1/3)x_2 − 2x_3 ≥ 1, x ≥ 0}.

The cut (3.5), i.e., Σ_j a*_j x_j ≥ 1, is x_1 + 2x_2 − 2x_3 ≥ 1.  From (3.6),

    γ^1 = min{1/1, 2/2} = 1   and   γ^2 = min{1/(1/2), 2/(1/3)} = 2.

Thus, through (3.7), or more directly from (3.16), the cut (3.8), i.e., Σ_j ā_j x_j ≥ 1, is
x_1 + 2x_2 − 4x_3 ≥ 1.  This cut strictly dominates the cut (3.5) in this example, though both
have the same values 1/√5 and 1/2, respectively, for θ_e and θ_r of Equations (2.2) and (2.5).
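A small computational sketch of this example (Python; a sketch following the stated formulas, not the authors' code) reproduces the cut (3.5), the multipliers γ^h of (3.6), and the strengthened cut (3.8).

    import numpy as np

    # DC1 example: one normalized constraint a^h x >= 1 per set S_h.
    a = {1: np.array([1.0, 2.0, -4.0]),
         2: np.array([0.5, 1.0/3.0, -2.0])}

    a_star = np.max(np.vstack(list(a.values())), axis=0)      # cut (3.5): x1 + 2x2 - 2x3 >= 1

    gamma = {}
    for h, ah in a.items():
        pos = ah > 0
        gamma[h] = np.min(a_star[pos] / ah[pos])               # Equation (3.6)

    a_bar = np.max(np.vstack([gamma[h] * a[h] for h in a]), axis=0)   # Equation (3.16), cut (3.8)

    print(a_star)   # [ 1.  2. -2.]
    print(gamma)    # {1: 1.0, 2: 2.0}
    print(a_bar)    # [ 1.  2. -4.]  ->  x1 + 2x2 - 4x3 >= 1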

4.    DERIVING  DEEP  CUTS  FOR  DC2 

To begin with, let us make the following interesting observation.  Suppose that, for convenience, we assume without loss of generality as before that b^h_i = 1, i∈Q_h, h∈H in Equation
(1.4).  Thus, for each h∈H, we have the constraint set

(4.1)    S_h = {x: Σ_{j=1}^n a^h_ij x_j ≥ 1, i∈Q_h, x ≥ 0}.

Now for each h∈H, let us multiply the constraints of S_h by corresponding scalars δ^h_i ≥ 0, i∈Q_h
and add them up to obtain the surrogate constraint

(4.2)    Σ_{j=1}^n [Σ_{i∈Q_h} δ^h_i a^h_ij] x_j ≥ Σ_{i∈Q_h} δ^h_i,    h∈H.

Further, assuming that not all δ^h_i are zero for i∈Q_h, (4.2) may be re-written as

(4.3)    Σ_{j=1}^n [Σ_{i∈Q_h} (δ^h_i / Σ_{k∈Q_h} δ^h_k) a^h_ij] x_j ≥ 1,    h∈H.

Finally, denoting δ^h_i / Σ_{k∈Q_h} δ^h_k by λ^h_i for i∈Q_h, h∈H, we may write (4.3) as

(4.4)    Σ_{j=1}^n [Σ_{i∈Q_h} λ^h_i a^h_ij] x_j ≥ 1 for each h∈H

where,

(4.5)    Σ_{i∈Q_h} λ^h_i = 1 for each h∈H,    λ^h_i ≥ 0 for i∈Q_h, h∈H.

Observe that by surrogating the constraints of (4.1) using parameters λ^h_i, i∈Q_h, h∈H satisfying
(4.5), we have essentially represented DC2 as DC1 through (4.4).  In other words, since x∈S_h
implies x satisfies (4.4) for each h∈H, then given λ^h_i, i∈Q_h, h∈H, DC2 implies that at least
one of (4.4) must be satisfied.  Now, whereas Theorem 1 would directly employ (4.2) to derive
a cut, since we have normalized (4.2) to obtain (4.4), we know from the previous section that
the optimal strategy is to derive a cut (3.8) using inequalities (4.4).
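As a concrete illustration of the normalization (4.2)–(4.4), the following minimal sketch (Python; the function and variable names are hypothetical, not from the paper) forms the normalized surrogate constraint for one h from nonnegative weights δ^h.

    import numpy as np

    def surrogate_row(A_h, delta_h):
        """Normalized surrogate constraint (4.4) for one h.
        A_h     : (|Q_h| x n) array, rows are the constraints of S_h with RHS 1.
        delta_h : nonnegative weights, not all zero.
        Returns (lambda^h, surrogate coefficients); the surrogate RHS is 1."""
        delta_h = np.asarray(delta_h, dtype=float)
        lam_h = delta_h / delta_h.sum()        # the lambda of Equations (4.3), (4.5)
        return lam_h, lam_h @ A_h              # weighted sum of the rows of A_h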


Now  let  us  consider  in  turn  the  two  criteria  of  Section  2. 




4.1.   Euclidean  Distance-Based  Criterion 

Consider any selection of values for the parameters λ^h_i, i∈Q_h, h∈H satisfying (4.5) and
let the corresponding disjunction DC1 derived from DC2 be that at least one of (4.4) must
hold.  Then, Theorem 2 tells us through Equations (3.5), (3.10) that the euclidean distance criterion value for the resulting cut (3.8) is

(4.6)    θ_e(λ) = 1 / [Σ_{j=1}^n y_j²]^{1/2}

where,

(4.7)    y_j = max{0, z_j},   j = 1, ..., n

and

(4.8)    z_j = max_{h∈H} Σ_{i∈Q_h} λ^h_i a^h_ij,   j = 1, ..., n.

Thus, the criterion of Section 2 seeks to

(4.9)    maximize {θ_e(λ): λ = (λ^h_i) satisfies (4.5)}

or equivalently, to

(4.10)    minimize {Σ_{j=1}^n y_j²: (4.5), (4.7), (4.8) are satisfied}.

It  may  be  easily  verified  that  the  problem  of  (4.10)  may  be  written  as 


(4.11)    PD2:  minimize  Σ_{j=1}^n y_j²

(4.12)    subject to  y_j ≥ Σ_{i∈Q_h} λ^h_i a^h_ij  for each h∈H and for each j = 1, ..., n

(4.13)    Σ_{i∈Q_h} λ^h_i = 1  for each h∈H

(4.14)    λ^h_i ≥ 0,  i∈Q_h, h∈H.

Note that we have deleted the constraints y_j ≥ 0, j = 1, ..., n since, for any feasible λ^h_i,
i∈Q_h, h∈H, there exists a dominant solution with nonnegative y_j, j = 1, ..., n.  This relaxation is simply a matter of convenience in our solution strategy.
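For reference, a minimal sketch (Python, hypothetical names) of the quantity being minimized: given a feasible λ it computes y_j of (4.7)–(4.8) and the value Σ_j y_j².

    import numpy as np

    def pd2_objective(A, lam):
        """Evaluate the PD2 objective for a feasible lambda.
        A   : dict h -> (|Q_h| x n) array of the constraint rows of S_h (RHS 1).
        lam : dict h -> nonnegative weights summing to one (Equations (4.13)-(4.14)).
        Returns (y, sum of squares), with y_j = max(0, max_h sum_i lam_i^h a_ij^h)."""
        z = np.vstack([lam[h] @ A[h] for h in A])   # surrogate coefficients, one row per h
        y = np.maximum(0.0, z.max(axis=0))          # Equations (4.7)-(4.8)
        return y, float(np.sum(y * y))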

Before proposing a solution procedure for Problem PD2, let us make some pertinent
remarks.  Note that Problem PD2 has the purpose of generating parameters λ^h_i, i∈Q_h, h∈H
which are to be used to obtain the surrogate constraints (4.4).  Thereafter, the cut that we
derive for the disjunction DC2 is the cut (3.8) obtained from the statement that at least one of
(4.4) must hold.  Hence, Problem PD2 attempts to find values for λ^h_i, i∈Q_h, h∈H such that
this resulting cut best meets the euclidean distance criterion.

Problem  PD2  is  a  convex  quadratic  program  for  which  the  Kuhn-Tucker  conditions  are 
both  necessary  and  sufficient.  Several  efficient  simplex-based  quadratic  programming  pro- 
cedures are  available  to  solve  such  a  problem.  However,  these  procedures  require  explicit  han- 
dling of  the  potentially  large  number  of  constraints  in  Problem  PD2.    On  the  other  hand,  the 




subgradient  optimization  procedure  discussed  below  takes  full  advantage  of  the  problem  struc- 
ture. We  are  first  able  to  write  out  an  almost  complete  solution  to  the  Kuhn-Tucker  system. 
We  will  refer  to  this  as  a  partial  solution.  In  case  we  are  unable  to  either  actually  construct  a 
complete  solution  or  to  assert  that  a  feasible  completion  exists,  then  through  the  construction 
procedure  itself,  we  have  a  subgradient  direction  available.  Moreover,  this  latter  direction  is 
very  likely  to  be  a  direction  of  ascent.  We  therefore  propose  to  move  in  the  negative  of  this 
direction  and  if  necessary,  project  back  onto  the  feasible  region.  These  iterative  steps  are  now 
repeated  at  this  new  point. 

4.1.1  Kuhn-Tucker System for PD2 and Its Implications

Letting u^h_j, h∈H, j = 1, ..., n denote the lagrangian multipliers for constraints (4.12),
t_h, h∈H those for constraints (4.13), and w^h_i, i∈Q_h, h∈H those for constraints (4.14), we may
write the Kuhn-Tucker optimality conditions as

(4.15)    Σ_{h∈H} u^h_j = 2y_j,   j = 1, ..., n

(4.16)    Σ_{j=1}^n u^h_j a^h_ij + t_h − w^h_i = 0  for each i∈Q_h, and for each h∈H

(4.17)    u^h_j [y_j − Σ_{i∈Q_h} λ^h_i a^h_ij] = 0  for each j = 1, ..., n and each h∈H

(4.18)    λ^h_i w^h_i = 0  for i∈Q_h, h∈H

(4.19)    w^h_i ≥ 0,  i∈Q_h, h∈H

(4.20)    u^h_j ≥ 0,  j = 1, ..., n, h∈H.

Finally, Equations (4.12), (4.13), (4.14) must also hold.  We will now consider the implications
of the above conditions.  This will enable us to construct at least a partial solution to these conditions, given particular values of λ^h_i, i∈Q_h, h∈H.  First of all, note that Equations (4.7),
(4.10) and (4.20) imply that


(4.21)    y_j ≥ 0 for each j = 1, ..., n

(4.22)    y_j = max{0, Σ_{i∈Q_h} λ^h_i a^h_ij, h∈H}   for j = 1, ..., n.

Now, having determined values for y_j, j = 1, ..., n, let us define the sets

(4.23)    H_j = {∅}  if y_j = 0;   H_j = {h∈H: y_j = Σ_{i∈Q_h} λ^h_i a^h_ij > 0}  otherwise,   for j = 1, ..., n.

Now, consider the determination of u^h_j, h∈H, j = 1, ..., n.  Clearly, Equations (4.15), (4.17)
and (4.20) along with the definition (4.23) imply that for each j = 1, ..., n

(4.24)    u^h_j = 0 for h∈H\H_j,   and that   Σ_{h∈H_j} u^h_j = 2y_j,  u^h_j ≥ 0 for each h∈H_j.

Thus, for any j∈{1, ..., n}, if H_j is either empty or a singleton, the corresponding values for
u^h_j, h∈H are uniquely determined.  Hence, we have a choice in selecting values for u^h_j, h∈H_j




only when |H_j| ≥ 2 for some j∈{1, ..., n}.  Next, multiplying (4.16) by λ^h_i and using (4.18),
we obtain

(4.25)    Σ_{i∈Q_h} λ^h_i Σ_{j=1}^n u^h_j a^h_ij + t_h Σ_{i∈Q_h} λ^h_i = 0 for each h∈H.

Using Equations (4.13), (4.17), this gives us

(4.26)    t_h = −Σ_{j=1}^n u^h_j y_j for each h∈H.

Finally, Equations (4.16), (4.26) yield

(4.27)    w^h_i = Σ_{j=1}^n u^h_j [a^h_ij − y_j] for each i∈Q_h, h∈H.

Notice that once the variables u^h_j, h∈H, j = 1, ..., n are fixed to satisfy (4.24), all the variables are uniquely determined.  We now show that if the variables w^h_i, i∈Q_h, h∈H so determined are nonnegative, we then have a Kuhn-Tucker solution.  Since the objective function of
PD2 is convex and the constraints are linear, this solution is also optimal.
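The construction of this partial Kuhn-Tucker solution is mechanical; a sketch (Python, hypothetical names; ties in (4.24) are broken here by arbitrarily picking one h∈H_j) follows.

    import numpy as np

    def partial_kt(A, lam, tol=1e-12):
        """Given lambda, compute y from (4.22), a feasible u from (4.24) putting all
        weight 2*y_j on one arbitrarily chosen h in H_j, and w from (4.27).
        By Lemma 2, lambda is optimal for PD2 if every component of w is nonnegative."""
        hs = list(A)
        z = np.vstack([lam[h] @ A[h] for h in hs])        # z[k, j] = sum_i lam_i^h a_ij^h
        y = np.maximum(0.0, z.max(axis=0))                 # Equation (4.22)
        u = {h: np.zeros(z.shape[1]) for h in hs}
        for j in range(z.shape[1]):
            if y[j] > tol:
                H_j = [k for k in range(len(hs)) if z[k, j] >= y[j] - tol]   # Equation (4.23)
                u[hs[H_j[0]]][j] = 2.0 * y[j]              # one arbitrary member of H_j
        # Equation (4.27): w_i^h = sum_j u_j^h (a_ij^h - y_j)
        w = {h: A[h] @ u[h] - (u[h] @ y) * np.ones(A[h].shape[0]) for h in hs}
        return y, u, w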

LEMMA 2:  Let a primal feasible set of λ^h_i, i∈Q_h, h∈H be given.  Determine values for
all variables y_j, u^h_j, t_h, w^h_i using Equations (4.22) through (4.27), selecting an arbitrary solution
in the case described in Equation (4.24) if |H_j| ≥ 2.  If w^h_i ≥ 0, i∈Q_h, h∈H, then λ^h_i, i∈Q_h,
h∈H solves Problem PD2.

PROOF:  By construction, Equations (4.12) through (4.17), and (4.20) clearly hold.
Thus, noting that in our problem the Kuhn-Tucker conditions are sufficient for optimality, all
we need to show is that if w = (w^h_i) ≥ 0 then (4.18) holds.  But from (4.17) and (4.27), for
any h∈H, we have

    Σ_{i∈Q_h} λ^h_i w^h_i = Σ_{i∈Q_h} λ^h_i Σ_{j=1}^n u^h_j [a^h_ij − y_j] = Σ_{j=1}^n u^h_j [Σ_{i∈Q_h} λ^h_i a^h_ij − y_j] = 0

for each h∈H.  Thus, λ^h_i ≥ 0, w^h_i ≥ 0, i∈Q_h, h∈H imply that (4.18) holds and the proof is
complete.

The reader may note that in Section 4.1.4 we will propose another, stronger sufficient condition for a set of variables λ^h_i, i∈Q_h, h∈H to be optimal.  The development of this condition
is based on the subgradient optimization procedure discussed below.

4.1.2  Subgradient Optimization Scheme for Problem PD2

For  the  purpose  of  this  development,  let  us  use  (4.22)  to  rewrite  Problem  PD2  as  follows. 
First  of  all  define 

(4.28)    Λ = {λ = (λ^h_i): constraints (4.13) and (4.14) are satisfied}

and let f: Λ → R be defined by

(4.29)    f(λ) = Σ_{j=1}^n [max{0, Σ_{i∈Q_h} λ^h_i a^h_ij, h∈H}]².

Then, Problem PD2 may be written as

    minimize {f(λ): λ ∈ Λ}.

Note that for each j = 1, ..., n, g_j(λ) = max{0, Σ_{i∈Q_h} λ^h_i a^h_ij, h∈H} is convex and nonnegative.
Thus, [g_j(λ)]² is convex and so f(λ) = Σ_{j=1}^n [g_j(λ)]² is also convex.

The main thrust of the proposed algorithm is as follows.  Having a solution λ̄ at any stage,
we will attempt to construct a solution to the Kuhn-Tucker system using Equations (4.15)
through (4.20).  If we obtain nonnegative values w̄^h_i for the corresponding variables w^h_i, i∈Q_h,
h∈H, then by Lemma 2 above, we terminate.  Later, in Section 4.1.7, we will also use another
sufficient condition to check for termination.  If we obtain no indication of optimality, we continue.  Theorem 3 below establishes that, in any case, the vector w = w̄ constitutes a subgradient of f(·) at the current point λ̄.  Following Poljak [18,19], we hence take a suitable step in
the negative subgradient direction and project back onto the feasible region Λ of Equation
(4.28).  This completes one iteration.  Before presenting Theorem 3, consider the following
definition.

DEFINITION 1:  Let f: Λ → R be a convex function and let λ̄ ∈ Λ ⊆ R^m.  Then ξ ∈ R^m
is a subgradient of f(·) at λ̄ if

    f(λ) ≥ f(λ̄) + ξ'(λ − λ̄)  for each λ ∈ Λ.

THEOREM 3:  Let λ̄ be a given point in Λ defined by (4.28) and let w̄ be obtained from
Equations (4.22) through (4.27), with an arbitrary selection of a solution to (4.24).

Then, w̄ is a subgradient of f(·) at λ̄, where f: Λ → R is defined in Equation (4.29).

PROOF:  Let y and ȳ be obtained through Equation (4.22) from λ ∈ Λ and λ̄ ∈ Λ, respectively.  Hence,

    f(λ) = Σ_{j=1}^n y_j²   and   f(λ̄) = Σ_{j=1}^n ȳ_j².

Thus, from Definition 1, we need to show that

(4.30)    Σ_{h∈H} Σ_{i∈Q_h} w̄^h_i (λ^h_i − λ̄^h_i) ≤ Σ_{j=1}^n y_j² − Σ_{j=1}^n ȳ_j².

Noting from Equations (4.17), (4.27) that Σ_{h∈H} Σ_{i∈Q_h} w̄^h_i λ̄^h_i = 0, we have

    Σ_{h∈H} Σ_{i∈Q_h} w̄^h_i (λ^h_i − λ̄^h_i) = Σ_{h∈H} Σ_{i∈Q_h} λ^h_i Σ_{j=1}^n ū^h_j [a^h_ij − ȳ_j]
        = Σ_{h∈H} Σ_{j=1}^n ū^h_j [Σ_{i∈Q_h} λ^h_i a^h_ij] − Σ_{h∈H} Σ_{j=1}^n ū^h_j ȳ_j [Σ_{i∈Q_h} λ^h_i].

Using (4.13) and (4.15), this yields

    Σ_{h∈H} Σ_{j=1}^n ū^h_j [Σ_{i∈Q_h} λ^h_i a^h_ij] − 2 Σ_{j=1}^n ȳ_j².

Combining this with (4.30), we need to show that

(4.31)    Σ_{h∈H} Σ_{j=1}^n ū^h_j [Σ_{i∈Q_h} λ^h_i a^h_ij] ≤ Σ_{j=1}^n y_j² + Σ_{j=1}^n ȳ_j².

But Equations (4.15), (4.20), (4.22) imply that

    Σ_{h∈H} Σ_{j=1}^n ū^h_j [Σ_{i∈Q_h} λ^h_i a^h_ij] ≤ Σ_{h∈H} Σ_{j=1}^n ū^h_j y_j = Σ_{j=1}^n 2ȳ_j y_j ≤ ‖y‖² + ‖ȳ‖²

so that Equation (4.31) holds.  This completes the proof.


Although, given λ̄ ∈ Λ, any solution to Equations (4.22) through (4.27) will yield a
subgradient of f(·) at the current point λ̄, we would like to generate, without expending much
effort, a subgradient which is hopefully a direction of ascent.  Hence, this would accelerate the
cut generation process.  Later, in Section 4.1.6, we describe one such scheme to determine a
suitable subgradient direction.  For the present, let us assume that we have generated
a subgradient w̄ and have taken a suitable step of size θ in the direction −w̄, as prescribed by the
subgradient optimization scheme of Held, Wolfe, and Crowder [12].  Let


(4.32)    λ̂ = λ̄ − θ w̄

be the new point thus obtained.  To complete the iteration, we must now project λ̂ onto Λ, that
is, we must determine a new λ according to

(4.33)    λ_new = P_Λ(λ̂), the point of Λ that minimizes {‖λ − λ̂‖: λ ∈ Λ}.


The  method  of  accomplishing  this  efficiently  is  presented  in  the  next  subsection. 
4.1.3  Projection  Scheme 

For convenience, let us define the following linear manifold

(4.34)    M̄_h = {λ^h_i, i∈Q_h: Σ_{i∈Q_h} λ^h_i = 1},   h∈H

and let M_h be the intersection of M̄_h with the nonnegative orthant, that is,

(4.35)    M_h = {λ^h_i, i∈Q_h: Σ_{i∈Q_h} λ^h_i = 1,  λ^h_i ≥ 0, i∈Q_h}.

Note from Equation (4.28) that

(4.36)    Λ = M_1 × ... × M_|H|.

Now, given λ̂, we want to project it onto Λ, that is, determine λ_new from Equation (4.33).
Towards this end, for any vector a = (a_i, i∈I), where I is a suitable index set for the |I| components of a, let P(a,I) denote the following problem:

(4.37)    P(a,I):  minimize {Σ_{i∈I} (λ_i − a_i)²: Σ_{i∈I} λ_i = 1,  λ_i ≥ 0, i∈I}.

Then, to determine λ_new, we need to find the solutions (λ^h_i)_new, i∈Q_h as projections onto M_h of
λ̂^h = (λ̂^h_i, i∈Q_h) through each of the |H| separable Problems P(λ̂^h, Q_h).  Thus, henceforth in
this section, we will consider only one such h∈H.  Theorem 4 below is the basis of a finitely
convergent iterative scheme to solve Problem P(λ̂^h, Q_h).



THEOREM 4:  Consider the solution of Problem P(β^k, I_k), where β^k = (β^k_i, i∈I_k), with
|I_k| ≥ 1.  Define

(4.38)    p_k = [1 − Σ_{i∈I_k} β^k_i] / |I_k|

and let

(4.39)    β̄^k = β^k + (p_k) 1_k

where 1_k denotes a vector of |I_k| elements, each equal to unity.  Further, define

(4.40)    I_{k+1} = {i∈I_k: β̄^k_i > 0}.

Finally, let β^{k+1} defined below be a subvector of β̄^k,

(4.41)    β^{k+1} = (β^{k+1}_i, i∈I_{k+1})

where β^{k+1}_i = β̄^k_i, i∈I_{k+1}.  Now suppose that β̂^{k+1} solves P(β^{k+1}, I_{k+1}).

(a)  If β̄^k ≥ 0, then β̄^k solves P(β^k, I_k).

(b)  If β̄^k ≱ 0, then β̂ solves P(β^k, I_k), where β̂ has components given by

(4.42)    β̂_i = β̂^{k+1}_i if i∈I_{k+1},   β̂_i = 0 otherwise,   for each i∈I_k.


PROOF:  For the sake of convenience, let RP(a,I) denote the problem obtained by
relaxing the nonnegativity restrictions in P(a,I).  That is, let

    RP(a,I):  minimize {Σ_{i∈I} (λ_i − a_i)²: Σ_{i∈I} λ_i = 1}.

First of all, note from Equations (4.38), (4.39) that β̄^k solves RP(β^k, I_k), since β̄^k is the projection of β^k onto the linear manifold

(4.43)    {λ = (λ_i, i∈I_k): Σ_{i∈I_k} λ_i = 1}

which is the feasible region of RP(β^k, I_k).  Thus, β̄^k ≥ 0 implies that β̄^k also solves P(β^k, I_k).
This proves part (a).

Next, suppose that β̄^k ≱ 0.  Observe that β̂ is feasible to P(β^k, I_k) since, from (4.42), we
get β̂ ≥ 0 and Σ_{i∈I_k} β̂_i = Σ_{i∈I_{k+1}} β̂^{k+1}_i = 1 as β̂^{k+1} solves P(β^{k+1}, I_{k+1}).

Now, consider any λ = (λ_i, i∈I_k) feasible to P(β^k, I_k).  Then, by the Pythagorean
Theorem, since β̄^k is the projection of β^k onto (4.43), we get

    ‖λ − β^k‖² = ‖λ − β̄^k‖² + ‖β̄^k − β^k‖².

Hence, the optimal solution to P(β̄^k, I_k) is also optimal to P(β^k, I_k).  Now, suppose that we
can show that the optimal solution to Problem P(β̄^k, I_k) must satisfy

(4.44)    λ_i = 0 for i∉I_{k+1}.

Then, noting (4.41), (4.42), and using the hypothesis that β̂^{k+1} solves P(β^{k+1}, I_{k+1}), we will
have established part (b).  Hence, let us prove that (4.44) must hold.  Towards this end, consider the following Kuhn-Tucker equations for Problem P(β̄^k, I_k), with t and w_i, i∈I_k as the
appropriate lagrangian multipliers:



(4.45)    (λ_i − β̄^k_i) + t − w_i = 0 for each i∈I_k

(4.46)    Σ_{i∈I_k} λ_i = 1,  λ_i ≥ 0 for each i∈I_k

(4.47)    w_i ≥ 0 for each i∈I_k

(4.48)    λ_i w_i = 0 for each i∈I_k.

Now, since Σ_{i∈I_k} β̄^k_i = 1, summing (4.45) over i∈I_k and using (4.46), (4.47), we get

    t = (1/|I_k|) Σ_{i∈I_k} w_i ≥ 0.

But from (4.45), (4.46), and (4.48) we get, for each i∈I_k,

    0 = w_i λ_i = λ_i (λ_i + t − β̄^k_i)

which implies that for each i∈I_k we must have

    either λ_i = 0, whence from (4.45), w_i = t − β̄^k_i must be nonnegative,

    or λ_i = β̄^k_i − t, whence from (4.45), w_i = 0.

In either case above, noting that t ≥ 0 and λ_i ≥ 0, if β̄^k_i ≤ 0, that is, if i∉I_{k+1}, we must have λ_i = 0.  This
completes the proof.

Using Theorem 4, one may easily validate the following procedure for finding λ^h_new of
Equation (4.33), given λ̂^h.  This procedure has to be repeated separately for each h∈H.

Initialization

Set k = 0, β^0 = λ̂^h, I_0 = Q_h.  Go to Step 1.

Step 1

Given β^k, I_k, determine p_k and β̄^k from (4.38), (4.39).  If β̄^k ≥ 0, then terminate with
λ^h_new having components given by

    (λ^h_new)_i = β̄^k_i if i∈I_k,   (λ^h_new)_i = 0 otherwise.

Otherwise, proceed to Step 2.

Step 2

Define I_{k+1}, β^{k+1} as in Equations (4.40), (4.41), increment k by one and return to Step 1.

Note that this procedure is finitely convergent as it results in a strictly decreasing, finite
sequence |I_k| satisfying |I_k| ≥ 1 for each k, since Σ_{i∈I_k} β̄^k_i = 1 for each k, so that I_{k+1} is never empty.

EXAMPLE:  Suppose we want to project λ̂ = (−2, 3, 1, 2) onto Λ ⊆ R⁴.  Then the above
procedure yields the following results.

Initialization

    k = 0,  β^0 = (−2, 3, 1, 2),  I_0 = {1, 2, 3, 4}.

Step 1

    p_0 = −3/4,  β̄^0 = (−11/4, 9/4, 1/4, 5/4).

Step 2

    k = 1,  I_1 = {2, 3, 4},  β^1 = (9/4, 1/4, 5/4).

Step 1

    p_1 = −11/12,  β̄^1 = (4/3, −2/3, 1/3).

Step 2

    k = 2,  I_2 = {2, 4},  β^2 = (4/3, 1/3).

Step 1

    p_2 = −1/3,  β̄^2 = (1, 0) ≥ 0.

Thus, λ_new = (0, 1, 0, 0).
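A direct transcription of this projection procedure (Python; a sketch, not the authors' code) follows; run on the data above it reproduces the example.

    import numpy as np

    def project_simplex(beta0):
        """Project beta0 onto {x >= 0, sum(x) = 1} by the iterative scheme of Theorem 4."""
        beta0 = np.asarray(beta0, dtype=float)
        idx = np.arange(len(beta0))      # I_k, as indices into the original vector
        beta = beta0.copy()              # beta^k, restricted to I_k
        while True:
            bar = beta + (1.0 - beta.sum()) / len(beta)   # Equations (4.38)-(4.39)
            if np.all(bar >= 0.0):                        # Theorem 4(a): terminate
                x = np.zeros(len(beta0))
                x[idx] = bar
                return x
            keep = bar > 0.0                              # Equation (4.40)
            idx, beta = idx[keep], bar[keep]              # Equation (4.41)

    print(project_simplex([-2.0, 3.0, 1.0, 2.0]))         # [0. 1. 0. 0.], as in the example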


4.1.4  A Second Sufficient Condition for Termination

As indicated earlier in Section 4.1.2, we will now derive a second sufficient condition on w̄
for λ̄ to solve PD2.  For this purpose, consider the following lemma:

LEMMA 3:  Let λ̄ ∈ Λ be given and suppose we obtain w̄ using Equations (4.22) through
(4.27).  Let ŵ solve the problem

    PR_h:  minimize {Σ_{i∈Q_h} (w^h_i − w̄^h_i)²: Σ_{i∈Q_h} w^h_i = 0,  w^h_i ≤ 0 for i∈J_h}   for each h∈H

where,

(4.49)    J_h = {i∈Q_h: λ̄^h_i = 0},   h∈H.

Then, if ŵ = 0, λ̄ solves Problem PD2.

PROOF:  Since ŵ = 0 solves PR_h, h∈H, we have, for each h∈H,

(4.50)    Σ_{i∈Q_h} (w̄^h_i)² ≤ Σ_{i∈Q_h} (w^h_i − w̄^h_i)²

for all w^h_i, i∈Q_h satisfying Σ_{i∈Q_h} w^h_i = 0, w^h_i ≤ 0 for i∈J_h.  Given any λ ∈ Λ and given any
μ > 0, define

(4.51)    w^h_i = (λ̄^h_i − λ^h_i)/μ,   i∈Q_h,  h∈H.



Then, Σ_{i∈Q_h} w^h_i = 0 for each h∈H and, since λ̄^h_i = 0 for i∈J_h, h∈H, we get w^h_i ≤ 0 for i∈J_h,
h∈H.  Thus, for any λ ∈ Λ, by substituting (4.51) into (4.50), we have

(4.52)    μ² Σ_{i∈Q_h} (w̄^h_i)² ≤ Σ_{i∈Q_h} (λ^h_i − λ̄^h_i + μ w̄^h_i)²   for each h∈H.

But Equation (4.52) implies that for each h∈H, λ^h = λ̄^h solves the problem

    minimize {Σ_{i∈Q_h} [λ^h_i − (λ̄^h_i − μ w̄^h_i)]²: Σ_{i∈Q_h} λ^h_i = 1,  λ^h_i ≥ 0, i∈Q_h}   for each h∈H.

In other words, the projection P_Λ(λ̄ − μ w̄) of (λ̄ − μ w̄) onto Λ is equal to λ̄ for any μ > 0.

In view of Poljak's result [18,19], since w̄ is a subgradient of f(·) at λ̄, then λ̄ solves PD2.
This completes the proof.

Note that Lemma 3 above states that if the "closest" feasible direction −ŵ to −w̄ is the zero
vector, then λ̄ solves PD2.  Based on this result, we derive through Lemma 4 below a second
sufficient condition for λ̄ to solve PD2.

LEMMA 4:  Suppose ŵ = 0 solves Problems PR_h, h∈H as in Lemma 3.  Then, for each
h∈H, we must have

(4.53)    (a)  w̄^h_i = t_h, a constant, for each i∉J_h

          (b)  w̄^h_i ≤ t_h for each i∈J_h

where J_h is given by Equation (4.49).

PROOF:  Let us write the Kuhn-Tucker conditions for Problem PR_h, for any h∈H.  We
obtain

    (w^h_i − w̄^h_i) + t_h = 0 for i∉J_h

    (w^h_i − w̄^h_i) + t_h − u^h_i = 0 for i∈J_h

    u^h_i ≥ 0, i∈J_h;   u^h_i w^h_i = 0, i∈J_h;   t_h unrestricted

    Σ_{i∈Q_h} w^h_i = 0,   w^h_i ≤ 0 for i∈J_h.

If ŵ = 0 solves PR_h, h∈H, then, since PR_h has a convex objective function and linear constraints, there must exist a solution to

    w̄^h_i = t_h for each i∉J_h

and

    u^h_i = (t_h − w̄^h_i) ≥ 0 for each i∈J_h.

This completes the proof.

Thus  Equation  (4.53)  gives  us  another  sufficient  condition  for  X  to  solve  PD2.  We  illus- 
trate the  use  of  this  condition  through  an  example  in  Section  4.1.7. 
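A small sketch of this second termination test (Python, hypothetical names): for each h it checks that w̄ is constant on the indices with λ̄^h_i > 0 and no larger on J_h.

    import numpy as np

    def satisfies_453(w_h, lam_h, tol=1e-9):
        """Condition (4.53) for one h: w_i = t_h on {i: lam_i > 0}, w_i <= t_h on J_h."""
        w_h, lam_h = np.asarray(w_h, float), np.asarray(lam_h, float)
        free = lam_h > tol                    # indices not in J_h
        t_h = w_h[free].mean()                # the common value, if one exists
        return (np.all(np.abs(w_h[free] - t_h) <= tol)
                and np.all(w_h[~free] <= t_h + tol))

For the example worked out in Section 4.1.7 below, w̄^1 = (0, 0, 0) with λ̄^1 = (0, 5/12, 7/12) and w̄^2 = (0, −4/3, 0) with λ̄^2 = (7/12, 0, 5/12) both pass this test.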

4.1.5  Schema  of  an  Algorithm  to  Solve  Problem  PD2 

The procedure is depicted schematically below.  In block 1, an arbitrary or, preferably, a
good heuristic solution λ ∈ Λ is sought.  For example, one may use λ^h_i = 1/|Q_h| for each
i∈Q_h, for h∈H.  For blocks 4 and 6, we recommend the procedural steps proposed by Held,
Wolfe and Crowder [12] for the subgradient optimization scheme.


[Schematic of the algorithm:

Block 1.  Find a starting solution λ ∈ Λ.

Block 2.  For j = 1, ..., n, determine y_j and u^h_j, h∈H, using Equations (4.22), (4.24); hence determine w̄ from Equation (4.27).

Block 3.  Is w̄ ≥ 0, or does w̄ satisfy Equation (4.53)?  If yes, terminate with λ as an optimal solution to PD2.

Block 4.  Otherwise, select a step size θ and let λ̂ = λ − θ w̄.

Block 5.  Replace λ by P_Λ(λ̂) of Equation (4.33).

Block 6.  Is a suitable subgradient optimization termination criterion satisfied?  If yes, terminate with λ as an estimate of an optimal solution to PD2; if no, return to Block 2.]

4.1.6  Derivation  of  a  Good  Subgradient  Direction 

In our discussion in Section 4.1.1, we saw that, given a λ̄ ∈ Λ of Equation (4.28), we were
able to uniquely determine y_j, j = 1, ..., n through Equation (4.22).  Thereafter, once we
fixed values ū^h_j for u^h_j, j = 1, ..., n, h∈H satisfying Equation (4.24), we were able to uniquely
determine values for the other variables in the Kuhn-Tucker system using Equations (4.26),
(4.27).  Moreover, the only choice in determining u^h_j, j = 1, ..., n, h∈H arose in case |H_j| ≥ 2
for some j∈{1, ..., n} in Equation (4.24).  We also established that, no matter what feasible
values we selected for u^h_j, j∈{1, ..., n}, h∈H, the corresponding vector w̄ obtained was a
subgradient direction.  In order to select the best such subgradient direction, we are interested
in finding a vector w̄ which has the smallest euclidean norm among all possible vectors
corresponding to the given solution λ̄ ∈ Λ.  However, this problem is not easy to solve.  Moreover, since this step will merely be a subroutine at each iteration of the proposed scheme to
solve PD2, we will present a heuristic approach to this problem.

Towards this end, let us define, for convenience, mutually exclusive but not uniquely
determined sets N_h, h∈H as follows:

(4.54)    N_h ⊆ {j∈{1, ..., n}: h∈H_j of Equation (4.23)}

(4.55)    N_i ∩ N_j = {∅} for any i, j∈H   and   ∪_{h∈H} N_h = {j∈{1, ..., n}: y_j > 0}.

In other words, we take each j∈{1, ..., n} which has y_j > 0 and assign it to some h∈H_j,
that is, assign it to a set N_h, where h∈H_j.  Having done this, we let

(4.56)    u^h_j = 2y_j if j∈N_h,   u^h_j = 0 otherwise,   for each j∈{1, ..., n}, h∈H.

Note that Equation (4.56) yields values ū^h_j for u^h_j, j∈{1, ..., n}, h∈H which are feasible to
(4.24).  Hence, having defined sets N_h, h∈H as in Equations (4.54), (4.55), we determine ū^h_j,
j∈{1, ..., n}, h∈H through (4.56) and hence w̄ through (4.27).

Thus, the proposed heuristic scheme commences with a vector w̄ obtained through an
arbitrary selection of sets N_h, h∈H satisfying Equations (4.54), (4.55).  Thereafter, we attempt
to improve (decrease) the value of w̄'w̄ in the following manner.  We consider in turn each
j∈{1, ..., n} which satisfies |H_j| ≥ 2 and move it from its current set N_{h_j}, say, to another set




N_h with h∈H_j, h ≠ h_j, if this results in a decrease in w̄'w̄.  If no such single movement results in
a decrease in w̄'w̄, we terminate with the incumbent solution w̄ as the sought subgradient direction.  This procedure is illustrated in the example given below.
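One possible transcription of this single-move improvement heuristic (Python, hypothetical names; the inputs y and H_j are assumed to have been computed from λ̄ via Equations (4.22)–(4.23), and u and w follow (4.56) and (4.27)):

    import numpy as np

    def w_and_norm(A, y, assign):
        """u from (4.56) for the assignment j -> h, then w from (4.27) and w'w."""
        u = {h: np.zeros(len(y)) for h in A}
        for j, h in assign.items():
            u[h][j] = 2.0 * y[j]
        w = {h: A[h] @ u[h] - (u[h] @ y) * np.ones(A[h].shape[0]) for h in A}
        return w, sum(float(w[h] @ w[h]) for h in A)

    def improve_assignment(A, y, H, assign):
        """Move one j (with |H_j| >= 2) to another h in H_j whenever w'w decreases."""
        w, norm = w_and_norm(A, y, assign)
        improved = True
        while improved:
            improved = False
            for j, alts in H.items():
                for h in set(alts) - {assign[j]}:
                    trial = {**assign, j: h}
                    w2, n2 = w_and_norm(A, y, trial)
                    if n2 < norm:
                        assign, w, norm, improved = trial, w2, n2, True
        return assign, w, norm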

4.1.7  Illustrative  Example 

The  intention  of  this  subsection  is  to  illustrate  the  scheme  of  the  foregoing  section  for 
determining  a  good  subgradient  direction  as  well  as  the  termination  criterion  of  Section  4.1.4. 

Thus, let H = {1,2}, n = 3, |Q_1| = |Q_2| = 3 and consider the constraint sets

    S_1 = {x:   2x_1 − 3x_2 + x_3 ≥ 1            S_2 = {x:   3x_1 − x_2 − x_3 ≥ 1
               −x_1 + 2x_2 + 3x_3 ≥ 1                        2x_1 + x_2 − 2x_3 ≥ 1
                3x_1 − x_2 − x_3 ≥ 1                        −x_1 + 3x_2 + 3x_3 ≥ 1
                x_1, x_2, x_3 ≥ 0}                           x_1, x_2, x_3 ≥ 0}.

Further, suppose we are currently located at a point λ̄ with

    λ̄^1_1 = 0,  λ̄^1_2 = 5/12,  λ̄^1_3 = 7/12;    λ̄^2_1 = 7/12,  λ̄^2_2 = 0,  λ̄^2_3 = 5/12.

Then the associated surrogate constraints are

(4.57)    (4/3)x_1 + (1/4)x_2 + (2/3)x_3 ≥ 1  for h = 1

          (4/3)x_1 + (2/3)x_2 + (2/3)x_3 ≥ 1  for h = 2.

Using Equations (4.22), (4.23), we find

    y_1 = 4/3 with H_1 = {1,2},   y_2 = 2/3 with H_2 = {2}   and   y_3 = 2/3 with H_3 = {1,2}.

Note that the possible combinations of N_1 and N_2 are as follows:
    (i)    N_1 = {1},    N_2 = {2,3},
    (ii)   N_1 = {∅},    N_2 = {1,2,3},
    (iii)  N_1 = {1,3},  N_2 = {2},  and
    (iv)   N_1 = {3},    N_2 = {1,2}.

A total enumeration of the values of u obtained for these sets through (4.56) and the
corresponding values for w̄ are shown below.

    N_1     N_2      | u^1_1  u^1_2  u^1_3 | u^2_1  u^2_2  u^2_3 | w^1_1   w^1_2   w^1_3 | w^2_1   w^2_2   w^2_3 |  w̄'w̄
    {1}     {2,3}    |  8/3     0      0   |   0     4/3    4/3  |  16/9  −56/9   40/9  | −40/9  −28/9   56/9  | 129.78
    {∅}     {1,2,3}  |   0      0      0   |  8/3    4/3    4/3  |    0      0      0   |    0    −4/3     0   |   1.78
    {1,3}   {2}      |  8/3     0     4/3  |   0     4/3     0   |  20/9  −28/9   20/9  | −20/9    4/9   28/9  |  34.37
    {3}     {1,2}    |   0      0     4/3  |  8/3    4/3     0   |   4/9   28/9  −20/9  |  20/9   20/9  −28/9  |  34.37

Thus, according to the proposed scheme, if we commence with N_1 = {1}, N_2 = {2,3}, then,
picking j = 1 which has |H_j| = 2, we can move j = 1 into N_2 since 2∈H_1.  This leads to an
improvement.  As one can see from above, no further improvement is possible.  In fact, the
best solution shown above is accessible by the proposed scheme from all except the third case,
which is a "local optimum".

We now illustrate the sufficient termination condition of Section 4.1.4.  The vector w̄
obtained above is (0, 0, 0 | 0, −4/3, 0), the first three components corresponding to h = 1 and the last three to h = 2.  Further, the vector λ̄ is (0, 5/12, 7/12 | 7/12, 0, 5/12).
Thus, even though w̄ ≱ 0, we see that the conditions (4.53) of Lemma 4 are satisfied for each
h∈H = {1,2}, and thus the given λ̄ solves PD2.

The disjunctive cut (3.8) derived with this optimal solution λ̄ is obtained through (4.57)
as

(4.58)    (4/3)x_1 + (2/3)x_2 + (2/3)x_3 ≥ 1.

It is interesting to compare this cut with that obtained through the parameter values λ^h_i =
1/|Q_h| for each i∈Q_h, as recommended by Balas [1,2].  This latter cut is

(4.59)    (4/3)x_1 + x_2 + x_3 ≥ 1.

Observe that (4.58) uniformly dominates (4.59).

4.2   Maximizing  the  Rectilinear  Distance  Between  the  Origin  and  the  Disjunctive  Cut 

In this section, we will briefly consider the case where one desires to use rectilinear
instead of euclidean distances.  Extending the developments of Sections 2, 3 and 4.1, one may
easily see that the relevant problem is

    minimize {max_{j∈{1, ..., n}} y_j: constraints (4.12), (4.13), (4.14) are satisfied}.

The reason why we consider this formulation is its intuitive appeal.  To see this, note that the
above problem is separable in h∈H and may be rewritten as

    PD_r:  minimize {z_h: z_h ≥ Σ_{i∈Q_h} λ^h_i a^h_ij for each j = 1, ..., n,  Σ_{i∈Q_h} λ^h_i = 1,
                           λ^h_i ≥ 0 for i∈Q_h,  z_h ≥ 0}   for each h∈H.

Thus, for each h∈H, PD_r seeks λ^h_i, i∈Q_h such that the largest of the surrogate constraint
coefficients is minimized.  Once such surrogate constraints are obtained, the disjunctive cut
(3.8) is derived using the principles of Section 3.

As far as the solution of Problem PD_r is concerned, we merely remark that one may
either solve it as a linear program or rewrite it as the minimization of a piecewise linear convex
function subject to linear constraints and use a subgradient optimization technique.
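For one h, PD_r is a small linear program; a sketch using scipy (an assumed dependency, not part of the paper) is given below. The variables are (λ^h, z_h), with constraints z_h ≥ Σ_i λ^h_i a^h_ij for each j, Σ_i λ^h_i = 1, and λ^h, z_h ≥ 0.

    import numpy as np
    from scipy.optimize import linprog

    def solve_pdr_h(A_h):
        """Minimize z_h s.t. lam @ A_h <= z_h componentwise, sum(lam) = 1, lam >= 0.
        A_h is the (|Q_h| x n) matrix of constraint rows of S_h with RHS 1."""
        m, n = A_h.shape
        c = np.r_[np.zeros(m), 1.0]                      # objective: z_h
        A_ub = np.c_[A_h.T, -np.ones(n)]                 # lam @ A_h[:, j] - z_h <= 0 for each j
        b_ub = np.zeros(n)
        A_eq = np.r_[np.ones(m), 0.0].reshape(1, -1)     # sum(lam) = 1
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                      bounds=[(0, None)] * (m + 1))
        return res.x[:m], res.x[m]                       # lam^h and the optimal z_h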

BIBLIOGRAPHY 

[1]  Balas,  E.,  "Intersection  Cuts  from  Disjunctive  Constraints,"  Management  Science  Research 

Report,  No.  330,  Carnegie-Mellon  University  (1974). 
[2]  Balas,  E.,  "Disjunctive  Programming:    Cutting  Planes  from  Logical  Conditions  in  Nonlinear 

Programming,"  O.L.  Mangasarian,  R.R.  Meyer,  and  S.M.  Robinson,  editors,  Academic 

Press,  New  York  (1975). 



[3]   Balas,  E.,  "Disjunctive  Programming:    Facets  of  the  Convex  Hull  of  Feasible  Points," 

Management  Science  Research  Report,  No.  348,  Carnegie-Mellon  University,  (1974). 
[4]  Bazaraa,  M.S.  and  CM.  Shetty,  "Nonlinear  Programming:    Theory  and  Algorithms,"  John 

Wiley  and  Sons,  New  York  (1979). 
[5]  Burdet,  C,  "Elements  of  a  Theory  in  Non-Convex  Programming,"  Naval  Research  Logis- 
tics Quarterly,  24,  47-66  (1977). 
[6]  Burdet,  C.  "Convex  and  Polaroid  Extensions,"  Naval  Research  Logistics  Quarterly,  24,  67- 

82  (1977). 
[7]  Dem'janov,  V.F.,  "Seeking  a  Minimax  on  a  Bounded  Set,"  Soviet  Mathematics  Doklady, 

11,  517-521  (1970)  (English  Translation). 
[8]  Glover,  F.,  "Convexity  Cuts  for  Multiple  Choice  Problems,"  Discrete  Mathematics,   6, 

221-234  (1973). 
[9]  Glover,  F.,  "Polyhedral  Convexity  Cuts  and  Negative  Edge  Extensions,"  Zeitschrift  fur 
Operations  Research,  18,  181-186  (1974). 

[10]  Glover,  F.,  "Polyhedral  Annexation  in  Mixed  Integer  and  Combinatorial  Programming," 
Mathematical  Programming,  8,  161-188  (1975).  See  also  MSRS  Report  73-9,  Univer- 
sity of  Colorado  (1973). 

[11]  Glover,  F.,  D.  Klingman  and  J.  Stutz,  "The  Disjunctive  Facet  Problem:  Formulation  and 
Solutions  Techniques,"  Management  Science  Research  Report,  No.  72-10,  University  of 
Colorado  (1972). 

[12]  Held,  M.,  P.  Wolfe  and  H.D.  Crowder,  "Validation  of  Subgradient  Optimization," 
Mathematical  Programming,  6,  62-88  (1974). 

[13]  Jeroslow,  R.G.,  "The  Principles  of  Cutting  Plane  Theory:  Part  I,"  (with  an  addendum), 
Graduate  School  of  Industrial  Administration,  Carnegie-Mellon  University  (1974). 

[14]  Jeroslow,  R.G.,  "Cutting  Plane  Theory:  Disjunctive  Methods,"  Annals  of  Discrete 
Mathematics,  1,  293-330  (1977). 

[15]  Karlin,  S.,  "Mathematical  Methods  and  Theory  in  Games,  Programming  and  Economics," 
1,  Addison-Wesley  Publishing  Company,  Reading,  Mass.  (1959). 

[16]  Majthay,  A.  and  A.  Whinston,  "Quasi-Concave  Minimization  Subject  to  Linear  Con- 
straints," Discrete  Mathematics,  9,  35-59  (1974). 

[17]  Owen, G., "Cutting Planes for Programs with Disjunctive Constraints," Optimization
Theory and Its Applications, 11, 49-55 (1973).

[18]  Poljak,  B.T.,  "A  General  Method  of  Solving  Extremum  Problems,"  Soviet  Mathematics 
Doklady,  8,  593-597  (1967).  (English  Translation). 

[19]  Poljak,  B.T.,  "Minimization  of  Unsmooth  Functionals,"  USSR  Computational  Mathematics 
and  Mathematical  Physics,  9,  14-29  (1969).  (English  Translation). 

[20]  Vaish,  H.  and  C.  M.  Shetty,  "A  Cutting  Plane  Algorithm  for  the  Bilinear  Programming 
Problem,"  Naval  Research  Logistics  Quarterly,  24,  83-94  (1975). 


THE  ROLE  OF  INTERNAL  STORAGE  CAPACITY 
IN  FIXED  CYCLE  PRODUCTION  SYSTEMS 

B.  Lev* 

Temple  University 
Philadelphia,  Pennsylvania 

D.  I.  Toof 

Ernst  &  Ernst 
Washington,  D.C. 

ABSTRACT 

The reliability of a serial production line is optimized with respect to the location of a single buffer.  The problem was earlier defined and solved by Soyster and Toof for the special case of an even number of machines all having
equal probability of failure.  In this paper we generalize the results for any
number of machines and remove the restriction of identical machine reliabilities.  In addition, an analysis of multibuffer systems is presented with a closed
form solution for the reliability when both the number of buffers and their
capacity is limited.  For the general multibuffer system we present an approach
for determining system reliability.


1.   INTRODUCTION 

Several  types  of  production  line  models  appear  in  the  literature.  Each  one  is  a  realization 
of  a  different  real  life  situation.  A  summary  of  the  various  types  and  the  differences  in  the 
mechanism  of  product  flow  among  them  appears  in  Buzacott  [5],  Koenigsberg  [9],  Toof  [14]  or 
Buxey  et  al  [1].  Recently  Soyster  and  Toof  [13]  defined  a  serial  production  line,  which  is  the 
model  analyzed  in  this  paper. 

The mechanism of product flow in a serial production line is described via Figure 1.  An
unlimited source of raw material exists before machine 1.  If machine 1 is capable of working
(i.e., not failed), an operator takes a unit of raw material and processes it on machine 1, after
which he moves to machine 2 and processes it on machine 2, if machine 2 is capable of working.  He proceeds analogously until machine N, where a finished product is completed.  Let T_i be
the process time on machine i.  Then the cycle time of the system is T = Σ_{i=1}^N T_i.  Let q_i be the
probability that at any cycle T machine i is capable of working and p_i = 1 − q_i the probability of
failing.  The serial production line with no buffer must stop working if any of the individual
machines on the line fails.  The placement of a single buffer of capacity M after machine i
alleviates this situation.  If any of the first i machines fail and the buffer is not empty, machines

*This  study  was  done  when  the  author  was  at  the  Department  of  Energy,  Washington,  D.C.  under  the  provisions  of  the 
Intergovernmental  Personnel  Act. 



Figure 1.  Serial production line with N machines M_1, ..., M_N and a single buffer placed after machine M_i; product flows from M_1 to M_N.

i+1, i+2, ..., N can still function.  Conversely, if any of the machines i+1, ..., N fail
and the buffer is not full, the first i machines may still work and produce a semifinished good to
be stored in the buffer.  One obviously would like to identify the optimal placement of this
buffer.  Soyster and Toof [13] proved that if there is an even number of machines, all identically reliable (q_i = q ∀ i), then the optimal placement of the buffer is exactly in the middle of
the line.  In Section 2 we generalize these results for any number of machines, not necessarily
identically reliable.  Specifically, we prove that the optimal placement of a single buffer is at a
place which minimizes the absolute value of the difference between the reliability of the two
parts of the line separated by the buffer.


The optimal location i* is determined from (1):

(1)    |Π_{l=1}^{i*} q_l − Π_{l=i*+1}^{N} q_l| = Min_{1≤i≤N} |Π_{l=1}^{i} q_l − Π_{l=i+1}^{N} q_l|.

A  more  difficult  question  is  the  optimal  locations  of  several  buffers.  In  section  3  we  analyze  a 
special  case  of  a  two  buffer  system,  each  buffer  having  a  capacity  of  one  unit.  In  section  4  we 
present  an  approach  that  can  be  used  for  any  number  of  buffers  with  any  capacity.  The 
approach  we  suggest  is  efficient  as  long  as  the  number  of  buffers  and  their  capacity  remains 
relatively  small. 

2.    OPTIMAL  LOCATION  OF  A  SINGLE  BUFFER 


Let a single buffer with capacity M be placed after machine i.  Let α_i = Π_{j=1}^{i} q_j,
β_i = Π_{j=i+1}^{N} q_j, ρ_i = (α_i − α_iβ_i)/(β_i − α_iβ_i), and let X_n be the number of units in the buffer at
the beginning of cycle n.  Soyster and Toof [13] have shown that X_n defines a finite Markov
Chain, presented its transition matrix, and found that the reliability R(i) of the line is given by
(2) and (3):


(2)    R(i) = β_iα_i + β_i(1 − α_i) [(ρ_i − ρ_i^{M+1}) / (1 − ρ_i^{M+1})]    if α_i ≠ β_i

(3)    R(i) = β_iα_i + β_i(1 − α_i) [M / (M + 1)]    if α_i = β_i.

One has to maximize R(i) with respect to i, that is, to identify the optimal location of the
buffer within the line.  Since α_iβ_i = Π_{l=1}^{i} q_l · Π_{l=i+1}^{N} q_l = Π_{l=1}^{N} q_l is a constant and does not affect the
location of the buffer, one can simply ignore this term from (2) and (3) in the optimization
phase.  Thus, we want to find i* that maximizes R(i), or:


(4)    R(i*) = Max_i R(i) = Max_i { β_i(1 − α_i) [(ρ_i − ρ_i^{M+1}) / (1 − ρ_i^{M+1})]   if α_i ≠ β_i;
                                     β_i(1 − α_i) [M / (M + 1)]   if α_i = β_i }.




The approach we take to solve (4) for i* is to show that R(i) is strictly increasing with α_i for
α_i < β_i and strictly decreasing with α_i for α_i > β_i; that α_i = β_i occurs when R(i) reaches its
maximum value; and that R(i) is symmetric about the point i* where α_{i*} = β_{i*}.


Let

(5)    R(i) = (β_i − α_iβ_i) [(ρ_i^{M+1} − ρ_i) / (ρ_i^{M+1} − 1)],    α_i ≠ β_i;

when α_i = β_i, ρ_i = 1 and (5) becomes (6)

(6)    R(i) = (β_i − α_iβ_i) [M / (M + 1)],    α_i = β_i.

Note in (6) that, as M becomes large, the total reliability of the line, which is equal to α_iβ_i + R(i),
approaches β_i.  That is, the two segments of the line become independent of each other.

In this section the general strategy is to show that if α_i > β_i or α_i < β_i then the reliability of (5) is smaller than the reliability of (6).  Hence, we treat α_i as a continuous variable and
show that the derivative of (5) with respect to α_i is positive for α_i < β_i and negative for
α_i > β_i.


The derivative of R(i) with respect to α_i is:

    dR(i)/dα_i = −β_i [(ρ_i^{M+1} − ρ_i) / (ρ_i^{M+1} − 1)]
                 + (β_i − α_iβ_i) [(Mρ_i^{M+1} − (M+1)ρ_i^{M} + 1) / (ρ_i^{M+1} − 1)²] (∂ρ_i/∂α_i),

where ∂ρ_i/∂α_i = (1 − β_i)/[β_i(1 − α_i)²].

LEMMA 1:  The additional reliability function R(i) is strictly increasing with respect to
α_i over the range [0, (Π_{l=1}^{N} q_l)^{1/2}), and strictly decreasing with respect to α_i over the range
((Π_{l=1}^{N} q_l)^{1/2}, 1].  That is, if 0 ≤ α_i < β_i then dR(i)/dα_i > 0.  Conversely, if β_i < α_i ≤ 1, then
dR(i)/dα_i < 0.  The proof can be found in [14].  (The first range is closed from the left and open
from the right; the second range is open from the left and closed from the right.)

THEOREM 1:  The optimal placement, i*, of a single buffer of integer capacity M in an N
machine line is where α_{i*} = β_{i*}.

PROOF:  The proof of this theorem is essentially complete.  We must only show that (5)
is continuous at the point where α_{i*} = β_{i*}.  By definition, the additional reliability attributable to
the introduction of the buffer when α_{i*} = β_{i*} is:

    (β_{i*} − α_{i*}β_{i*}) [M / (M + 1)].

As α_i → β_i, ρ_i → 1, so that in (5) the limit of the steady state probability as α_i → β_i is of the
indeterminate form 0/0.  However, an application of L'Hospital's rule shows that:

    lim_{α_i→β_i} (ρ_i^{M+1} − ρ_i) / (ρ_i^{M+1} − 1) = M / (M + 1)

and thus the continuity is proven.



Theorem 1 defines an optimal, though not necessarily feasible, solution to the problem of
buffer placement.  The condition α_i = β_i may be impossible to satisfy.  In the remainder of this
section we examine the symmetry of the reliability function defined by Equation (5), develop a
simple criterion that provides the best feasible solution and, lastly, we examine the special case
of identical machine reliability, i.e., q_i = q ∀ i.

LEMMA 2:  Given K_1 and K_2, continuous variables such that α_{K_1} − β_{K_1} = β_{K_2} − α_{K_2}.
Then ρ_{K_1} · ρ_{K_2} = 1.

PROOF:  Recall that α_iβ_i = Π_{l=1}^{N} q_l = Q, a constant for all i.  Thus the condition
α_{K_1} − β_{K_1} = β_{K_2} − α_{K_2} may be rewritten as α_{K_1} − Q/α_{K_1} = Q/α_{K_2} − α_{K_2}.  This implies that:

    α_{K_1} + α_{K_2} = Q(α_{K_1} + α_{K_2})/(α_{K_1} α_{K_2}),   or that   α_{K_1} α_{K_2} = Q.

Similarly, one obtains the result that β_{K_1} β_{K_2} = Q.  We want to show that ρ_{K_1} · ρ_{K_2} = 1.  Substituting for ρ_{K_1} and ρ_{K_2} in the definition of ρ yields:

    ρ_{K_1} ρ_{K_2} = (α_{K_1} − Q)(α_{K_2} − Q) / [(β_{K_1} − Q)(β_{K_2} − Q)].

We then must show that:

    (α_{K_1} − Q)(α_{K_2} − Q) = (β_{K_1} − Q)(β_{K_2} − Q)

or that:

    α_{K_1}α_{K_2} − Q(α_{K_1} + α_{K_2}) = β_{K_1}β_{K_2} − Q(β_{K_1} + β_{K_2}).

The condition α_{K_1} − β_{K_1} = β_{K_2} − α_{K_2} infers both that α_{K_1} + α_{K_2} = β_{K_1} + β_{K_2} and that
α_{K_1}α_{K_2} = β_{K_1}β_{K_2} = Q, and thus the proof is complete.

This leads directly to the following theorem:

THEOREM 2:  For a continuous argument i, R(i) is symmetric about the point i*
where α_{i*} = β_{i*}.

The proof is in [14].

The placement of the buffer has been treated as a continuous variable.  While this has led
to satisfying mathematical results, in reality one must develop an optimizing criterion which is
physically feasible.  Unfortunately, the condition α_{i*} = β_{i*} does not satisfy the feasibility
requirements.  Rarely will i* be integer and what, for example, is the physical interpretation of
i* = 7.63?  To this end, it will be shown in this section that the steady state reliability of the
line is maximized by placing the buffer after machine i* (i* integer), where i* satisfies the following condition:

    |α_{i*} − β_{i*}| = min_{1≤i≤N} |α_i − β_i|.

Note that if an integer i* exists such that α_{i*} = Π_{l=1}^{i*} q_l = Π_{l=i*+1}^{N} q_l = β_{i*}, it would satisfy the
above criterion and be consistent with Theorem 1.



To this end, observe that |α_i − β_i| is a convex function of α_i that attains its minimum
at α_i = β_i = √(α_iβ_i) = √Q.  Thus, for

    α_i < α_j ≤ √Q,  |α_j − β_j| ≤ |α_i − β_i|,   and for   √Q ≤ α_j < α_i,  |α_j − β_j| ≤ |α_i − β_i|.

THEOREM 3 (Fundamental):  The optimal integer placement of a single buffer of capacity M in an N machine line is where |α_i − β_i| is minimized.

PROOF:  From Theorem 1 we know that, by treating i as a continuous variable, the optimal
placement i* satisfies α_{i*} = β_{i*}.  If i* is integer, the theorem is evident.  Assume that i* is not
integer.  Examine the points [i*] and [i*+1].  From Lemma 1 and the convexity of |α_i − β_i|
we know that R([i*]) ≥ R(K_1) where α_{K_1} ≥ α_{[i*]}, and R([i*+1]) ≥ R(K_2) where
α_{K_2} ≤ α_{[i*+1]}.  Thus, the only two candidate placements are [i*] and [i*+1].

If |α_{[i*]} − β_{[i*]}| = |α_{[i*+1]} − β_{[i*+1]}|, then the theorem holds and either placement is
optimal.  Therefore, assume that |α_{[i*]} − β_{[i*]}| < |α_{[i*+1]} − β_{[i*+1]}|.  We want to show that
R([i*]) > R([i*+1]).  Assume the contrary, i.e., that R([i*+1]) ≥ R([i*]).  From
Theorem 2 we know that there exists a point K* such that R(K*) = R([i*+1]) and that
|α_{K*} − β_{K*}| = |α_{[i*+1]} − β_{[i*+1]}|.  This implies that R(K*) ≥ R([i*]).  We know that
|α_{K*} − β_{K*}| > |α_{[i*]} − β_{[i*]}| and, since both α_{K*} and α_{[i*]} must be greater than √(α_iβ_i), this
implies that α_{K*} > α_{[i*]}.  By Theorem 2 this would infer that R([i*]) > R(K*), which is a
contradiction.  Similar results may be obtained by assuming that |α_{[i*]} − β_{[i*]}| > |α_{[i*+1]}
− β_{[i*+1]}|.
Theorem  3  details  a  simple,  yet  elegant  criterion  for  the  optimal  placement  of  a  single 
buffer  regardless  of  capacity  so  as  to  maximize  the  reliability  of  the  system. 
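A short sketch (Python) of the criterion: it computes α_i, β_i and the additional reliability R(i) of (5)–(6) for every interior split i, and returns the placement minimizing |α_i − β_i|, which by Theorem 3 also maximizes R(i).

    import numpy as np

    def best_buffer_location(q, M):
        """q: machine reliabilities q_1..q_N; M: buffer capacity.
        Returns (i*, R(i*)) with i* minimizing |alpha_i - beta_i| over i = 1..N-1."""
        q = np.asarray(q, dtype=float)
        best = None
        for i in range(1, len(q)):                       # buffer after machine i
            alpha, beta = np.prod(q[:i]), np.prod(q[i:])
            if np.isclose(alpha, beta):
                R = (beta - alpha * beta) * M / (M + 1)                    # Equation (6)
            else:
                rho = (alpha - alpha * beta) / (beta - alpha * beta)
                R = (beta - alpha * beta) * (rho**(M + 1) - rho) / (rho**(M + 1) - 1)  # Eq. (5)
            if best is None or abs(alpha - beta) < best[1]:
                best = (i, abs(alpha - beta), R)
        return best[0], best[2]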

A Special Case:  q_i = q ∀ i.

Consider the case where q_i = q ∀ i.  In this case:

    α_i = q^i,    β_i = q^{N−i}.

It follows from Theorems 1 and 3 that, if N is even, the optimal placement would be where
α_i = β_i, which in this case is where q^i = q^{N−i}, which is satisfied at i = N/2.  This is consistent
with the results developed by Soyster and Toof [13].

Assume that N is odd.  Then N is of the form 2K + 1, where K is integer, and by
Theorem 3 the optimal placement is either after machine K or machine K + 1, since:

    |α_K − β_K| = |q^K − q^{2K+1−K}| = |q^K − q^{K+1}|

    |α_{K+1} − β_{K+1}| = |q^{K+1} − q^{2K+1−K−1}| = |q^K − q^{K+1}|.

We have just completed the proof for the optimal location of a single buffer on an N machine
serial line.  The result holds for any N (even or odd) and for any q_i (whether or not the machine
reliabilities are identical).  In the next section we generalize the model to include more than one buffer.

3.   TWO  BUFFERS  OF  CAPACITY  ONE  UNIT 

Consider  a  simpler  case  of  the  general  model  where  N  =  3k  and  q,  =  q  for  all  /.  The 
placement  of  two  buffers  separates  the  line  into  three  segments.  Since  N  =  3k,  one  may  arbi- 
trarily place  the  first  buffer  immediately  after  machine  k  and  the  second  immediately  after 




machine 2k.  The placement of these two buffers has just defined the three stages of the system.  Each stage may be comprised of more than one machine; for a line of N = 3k, each stage
is comprised of k machines.  The reliability of each stage is Q_1 = Q_2 = Q_3 = q^k = Q and
P = 1 − Q.

The  two  buffer  system  operates  analogously  to  the  one  buffer  system  described  in  section 
2.  If  all  machines  are  up,  then  a  unit  of  raw  material  is  processed  by  stages  one,  two  and  three 
and  a  finished  good  is  produced.  If,  for  example,  stage  three  is  down,  stages  one  and  two  are  up 
and  buffer  two  is  not  full,  then  both  stages  one  and  two  operate  and  a  semicompleted  good 
would  be  stored  in  buffer  two.  If  buffer  two  had  been  full  and  buffer  one  had  not,  then 
machine  two  would  not  operate;  it  would  be  blocked  by  the  second  buffer  which  is  full.  In  this 
case  only  machine  one  would  operate  and  a  semiprocessed  good  would  be  stored  in  buffer  one. 

Define an ordered pair (X, Y), where X represents the quantity of semifinished goods in
buffer one at the start of cycle t, and Y the quantity in buffer two at the start of cycle t.  If we
assume that the maximum capacity of both buffers one and two is one, then the pair (X, Y)
may take on the following four values: (0,0), (1,0), (0,1), and (1,1).  The one cycle transition
probability from state (X, Y) = (0,0) to all states is:

•  Both are empty at the start of cycle t+1 if either all stages are up, or if stage one is
down.  Thus:  P[(X_{t+1}, Y_{t+1}) = (0,0) | (X_t, Y_t) = (0,0)] = Q³ + P.

•  If stage one is up during cycle t but stage two is down, then a unit of raw material is
processed on stage one and the semicompleted good stored in buffer one.  Thus:
P[(X_{t+1}, Y_{t+1}) = (1,0) | (X_t, Y_t) = (0,0)] = QP.

•  If both stages one and two are up but stage three is down, then a unit of raw material is
processed on both stages one and two and the semicompleted good stored in buffer two.
Thus:  P[(X_{t+1}, Y_{t+1}) = (0,1) | (X_t, Y_t) = (0,0)] = Q²P.

•  Lastly, note that it is impossible for (X_{t+1}, Y_{t+1}) to equal (1,1) given that
(X_t, Y_t) = (0,0), as at most one unit may be added to storage during any cycle.  Thus:
P[(X_{t+1}, Y_{t+1}) = (1,1) | (X_t, Y_t) = (0,0)] = 0.

One may compute the transition probabilities for all of the four possible states in an analogous
manner.  The complete transition matrix is presented in Figure 2.


    State in t \ State in t+1 |  (0,0)     (1,0)     (0,1)     (1,1)
    (0,0)                     |  Q³+P      QP        Q²P       0
    (1,0)                     |  Q²P       Q³+P      QP²       Q²P
    (0,1)                     |  QP        Q²P       Q³+P²     QP
    (1,1)                     |  0         QP        Q²P       Q³+P

    Figure 2.  Transition matrix — two buffer system

Let π_1, π_2, π_3, π_4 be the steady state probabilities of buffer states (0,0), (1,0), (0,1) and (1,1), respectively. If the system is in state (0,0), which occurs with probability π_1, then a good is produced if and only if all three stages are up; this event has probability Q³π_1. Similarly, with probability π_2 the system is in state (1,0), and then only stages two and three must be up for a finished good to be produced; this event has probability Q²π_2. Lastly, in both states (0,1) and




(1,1),  buffer  two  is  not  empty  and  thus  the  only  condition  for  a  successful  cycle  is  that  stage 
three must be up. These events have probability Qπ_3 and Qπ_4, respectively. The steady state
reliability,  R,  of  the  two  buffer  system  where  the  capacity  of  both  buffer  one  and  buffer  two  is 
one  unit  is  equal  to: 

(7)    R = Q³π_1 + Q²π_2 + Qπ_3 + Qπ_4.


Thus, upon determining the steady state probabilities π_1, π_2, π_3 and π_4, one has an
exact  formulation  of  the  reliability  of  the  three  stage,  two  buffer  system,  where  each  buffer  has 
a  capacity  of  one  unit. 

From the transition matrix presented as Figure 2 and basic finite Markov chain theory, one can calculate π_1, π_2, π_3, and π_4 in the following manner.

First, we know that in the steady state πB = π, where B is the one step transition matrix of the system (Figure 2) and

    π = (π_1, π_2, π_3, π_4).

This identity yields a system of four simultaneous equations of the form

(8)    π(B - I) = 0,

where B is of the form:

        | Q³+P    QP      Q²P     0    |
    B = | Q²P     Q³+P    QP²     Q²P  |
        | QP      Q²P     Q³+P²   QP   |
        | 0       QP      Q²P     Q³+P |

However, (B - I) has no inverse, as its rows are linearly dependent. The classical method of solution to this problem is to drop one of the identity equations for π and substitute the fact that the sum of the steady state probabilities must equal one, that is, π_1 + π_2 + π_3 + π_4 = 1. Making this substitution for column 3 of B - I yields the system of simultaneous equations πA = (0, 0, 1, 0), where:

        | Q³+P-1   QP       1   0      |
    A = | Q²P      Q³+P-1   1   Q²P    |
        | QP       Q²P      1   QP     |
        | 0        QP       1   Q³+P-1 |

Thus, π = (0, 0, 1, 0)A⁻¹, which reduces to π = A₃⁻¹, where A₃⁻¹ denotes the third row of the inverse matrix A⁻¹. The solution to this system of four equations in four unknowns is:


(9)    π_1 = (Q² + Q + 1)/(4Q² + 3Q + 5)
       π_2 = (Q² + Q + 2)/(4Q² + 3Q + 5)
       π_3 = (Q² + 1)/(4Q² + 3Q + 5)
       π_4 = (Q² + Q + 1)/(4Q² + 3Q + 5)


We are now able to compute directly the steady state reliability of a two buffer series system where each stage has identical, Bernoulli-distributed reliability Q and each buffer has a capacity of one unit. We have just proved Theorem 4, which results from (7) and (9).




THEOREM 4: For the series production system described above, the steady state reliability of the system, R, is equal to:

    R = (Q⁵ + 2Q⁴ + 4Q³ + 3Q² + 2Q) / (4Q² + 3Q + 5).
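The result of Theorem 4 can be checked numerically. The following sketch (our illustration, not the authors' program; it assumes the numpy library is available) builds the Figure 2 transition matrix for a given Q, solves πB = π together with Σπ_i = 1, and evaluates the reliability from (7).

```python
# Minimal numerical check of Theorem 4 (illustrative sketch, not the authors' code).
import numpy as np

def reliability_two_buffer(Q):
    P = 1.0 - Q
    # rows/columns ordered (0,0), (1,0), (0,1), (1,1) as in Figure 2
    B = np.array([
        [Q**3 + P, Q*P,       Q**2 * P,    0.0     ],
        [Q**2 * P, Q**3 + P,  Q*P**2,      Q**2 * P],
        [Q*P,      Q**2 * P,  Q**3 + P**2, Q*P     ],
        [0.0,      Q*P,       Q**2 * P,    Q**3 + P],
    ])
    # steady state: pi (B - I) = 0 together with the normalization sum(pi) = 1
    A = np.vstack([(B - np.eye(4)).T, np.ones(4)])
    b = np.array([0.0, 0.0, 0.0, 0.0, 1.0])
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    R = Q**3 * pi[0] + Q**2 * pi[1] + Q * pi[2] + Q * pi[3]   # equation (7)
    return pi, R

Q = 0.9
pi, R = reliability_two_buffer(Q)
closed_form = (Q**5 + 2*Q**4 + 4*Q**3 + 3*Q**2 + 2*Q) / (4*Q**2 + 3*Q + 5)
print(pi, R, closed_form)   # R and the Theorem 4 expression agree
```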


4.    EXTENSION TO THE GENERAL MULTIBUFFER CASE

The previous sections have laid the groundwork for our analysis of a general multistage, multibuffer system such as the one depicted in Figure 3. For ease of analysis let us assume that the reliability of each stage has the Bernoulli distribution with parameter Q and further that buffer i has capacity M_i. For a general N stage system with m buffers, there are ∏_{i=1}^{m} (M_i + 1) possible buffer states; i.e., each buffer may take on M_i + 1 values and there are m such buffers. For example, if M_i = 4 for all i and m = 5, there would be 3,125 possible buffer states ranging in value from (0,0,0,0,0) to (4,4,4,4,4). The question arises as to the viability of this form of analysis for systems with large buffer capacities (M_i), multiple buffers (m), or a combination of the two. Clearly, the transition matrix for a large system would be relatively sparse (i.e., many zero entries). For example, in a four stage (three buffer) system, where each buffer has a capacity of three units, there would be 4³ = (3 + 1)³ = 64 possible transition states. For the starting state (1,1,1) there are 13 possible transitions (i.e., nonzero transition probabilities). The feasible transitions from the state (1,1,1) are:

(0,1,1),  (0,1,2),  (0,2,1),  (1,0,1),  (1,0,2),  (1,1,0),  (1,1,1), 

(1,1,2),  (1,2,0),  (1,2,1),  (2,0,1),  (2,1,0),  and  (2,1,1). 


[Figure 3 depicts the product flow from raw material through the N stages, with buffers of capacity M_1, ..., M_{N-1} between successive stages, ending in finished goods.]
Figure 3. General multistage, multibuffer system


The method of analysis employed to this point is clearly feasible: (1) define a one step transition matrix; (2) develop a reliability equation as a function of stage reliability and the steady state probabilities; and (3) solve a system of linear equations for the steady state probabilities. For large systems, however, its application is, for the most part, not practical.

Let  us  present  the  transition  matrices  for  two  or  three  buffer  systems  with  capacity  one  or 
two.  For  the  system  of  two  buffers  of  capacity  two  the  transition  matrix  is  given  in  Figure  4 
and  the  steady  state  probabilities  for  various  values  of  Q  are  given  in  Figure  5  where  the  relia- 
bility R  is: 

R = Q³π_1 + Qπ_2 + Qπ_3 + Q²π_4 + Qπ_5 + Qπ_6 + Q²π_7 + Qπ_8 + Qπ_9.

Figure 5 was calculated by a small computer program. For various values of Q, we solved for the unique π_i and calculated R, which appears in Figure 5. For the system of four stages and three buffers with capacity one, the transition matrix is given in Figure 6.

Again, using a small computer program, we solved for π_i and calculated R. The steady state probabilities and the system reliability R are given in Figure 7, where

    R = Q⁴π_1 + Qπ_2 + Q²π_3 + Qπ_4 + Q³π_5 + Qπ_6 + Q²π_7 + Qπ_8.




Transition Matrix

t \ t+1 |  (0,0)   (0,1)   (0,2)   (1,0)   (1,1)   (1,2)   (2,0)   (2,1)   (2,2)
(0,0)   |  Q³+P    Q²P     0       QP      0       0       0       0       0
(0,1)   |  QP      Q³+P²   Q²P     Q²P     QP²     0       0       0       0
(0,2)   |  0       QP      Q³+P²   0       Q²P     QP      0       0       0
(1,0)   |  Q²P     QP²     0       Q³+P²   Q²P     0       QP      0       0
(1,1)   |  0       Q²P     QP²     QP²     Q³+P³   Q²P     Q²P     QP²     0
(1,2)   |  0       0       Q²P     0       QP²     Q³+P²   0       Q²P     QP
(2,0)   |  0       0       0       Q²P     QP²     0       Q³+P    Q²P     0
(2,1)   |  0       0       0       0       Q²P     QP²     QP      Q³+P²   Q²P
(2,2)   |  0       0       0       0       0       Q²P     0       QP      Q³+P

Figure 4. Two buffers of maximum capacity two


Buffer Capacity Equals 2

        Q = .9    Q = .7    Q = .3    Q = .1
π_1     .108      .105      .097      .092
π_2     .092      .091      .090      .089
π_3     .060      .059      .060      .064
π_4     .126      .124      .118      .114
π_5     .106      .109      .121      .129
π_6     .092      .091      .090      .089
π_7     .183      .191      .209      .218
π_8     .126      .124      .118      .114
π_9     .108      .105      .097      .092
R       .855      .596      .205      .061

Figure 5. Exact solutions to three stage, two buffer system

Maximum Buffer Capacity Equals One

t \ t+1   |  (0,0,0)  (0,0,1)  (0,1,0)  (0,1,1)  (1,0,0)  (1,0,1)  (1,1,0)  (1,1,1)
(0,0,0)   |  Q⁴+P     Q³P      Q²P      0        QP       0        0        0
(0,0,1)   |  QP       Q⁴+P²    Q³P      Q²P      Q²P      QP²      0        0
(0,1,0)   |  Q²P      QP²      Q⁴+P²    Q³P      Q³P      Q²P²     QP       0
(0,1,1)   |  0        Q²P      QP²      Q⁴+P²    0        Q³P      Q²P      QP
(1,0,0)   |  Q³P      Q²P²     QP²      0        Q⁴+P     Q³P      Q²P      0
(1,0,1)   |  0        Q³P      Q²P²     QP²      QP       Q⁴+P²    Q³P      Q²P
(1,1,0)   |  0        0        Q³P      Q²P²     Q²P      QP²      Q⁴+P     Q³P
(1,1,1)   |  0        0        0        Q³P      0        Q²P      QP       Q⁴+P

Figure 6. Four stage, three buffer transition matrix




Buffer Capacity of One Unit

        Q = .9    Q = .7    Q = .3    Q = .1
π_1     .109      .100      .091      .080
π_2     .079      .075      .071      .073
π_3     .098      .098      .118      .138
π_4     .079      .080      .071      .021
π_5     .157      .156      .221      .270
π_6     .137      .147      .118      .193
π_7     .206      .213      .221      .187
π_8     .136      .131      .091      .038
R       .819      .533      .143      .036

Figure 7. Reliability of a four stage, three buffer system


The approach presented here can be summarized as follows: for a given configuration of a serial production line with multiple buffers and no restriction on their capacity, one can write the one step transition probability matrix and solve for its steady state probabilities, which yields the reliability of the line. The method is efficient for a small number of buffers and small capacities. In general, the number of state variables and the number of linear equations is ∏_{i=1}^{m} (M_i + 1) for m buffers with capacities M_i.




SCHEDULING  COUPLED  TASKS 

Roy  D.  Shapiro 

Harvard  University 

Graduate  School  of  Business  Administration 

Cambridge,  Massachusetts 

ABSTRACT 

Consider a set of task pairs coupled in time: a first (initial) and a second (completion) task of known durations, with a specified time between them. If the operator or machine performing these tasks is able to process only one at a time, scheduling is necessary to insure that no overlap occurs. This problem has particular application to production scheduling, transportation, and radar operations (send-receive pulses are ideal examples of time-linked tasks requiring scheduling). This article discusses several candidate techniques for schedule determination, and these are evaluated in a specific radar scheduling application.

This  article  considers  the  problem  of  scheduling  task  pairs,  i.e.,  tasks  which  consist  of  two 
coupled  tasks,  an  initial  task  and  a  completion,  separated  by  a  known,  fixed  time  interval.  If 
the  operator  or  machine  performing  these  tasks  is  only  able  to  process  one  at  a  time,  scheduling 
is  necessary  to  insure  that  a  completion  task  of  one  pair  does  not  arrive  for  processing  while 
one  part  of  another  task  is  being  processed. 

Consider,  for  example,  a  radar  tracking  aircraft  approaching  a  large  airport  [1].  In  order 
to  track  adequately,  it  is  necessary  to  transmit  pulses  and  receive  the  reflection  once  every 
specified  update  period.  The  radar  cannot  transmit  a  pulse  at  the  same  time  that  a  reflected 
pulse  is  arriving  nor  can  two  reflected  pulses  overlap.  A  possible  strategy  is  to  transmit  to  one 
tracked  object  and  wait  for  that  pulse  to  return  before  another  pulse  is  transmitted  as  shown  in 
Figure  1(a),  but  unless  the  number  of  objects  being  tracked  is  small,  this  may  not  allow  all 
objects to be tracked in each update period. A more efficient strategy is some form of interlaced scheduling like that shown in Figure 1(b). Observe that the time between each pair of transmit and receive pulses is the same in Figure 1(b) as in Figure 1(a), yet the total transmission time is far less in 1(b).


Figure 1. Sample 4-pair schedules




1.    NOTATION,  CLASSIFICATION,  AND  COMPLEXITY 

Our  object  is  to  generate  a  schedule  for  a  given  set  of  task  pairs  which  allows  that  set  to 
be  completed  in  the  least  possible  time  with  no  overlap  between  tasks  (Figure  2).   Formally,  let 

t_i  =  the time of initiation of the ith task pair;

S_i  =  the duration of the initial task of the ith pair, i = 1, 2, ..., N;

T_i  =  the duration of the completion task of the ith pair, i = 1, 2, ..., N;

d_i  =  the "inter-task" duration, i.e., the time between the initiation of the initial task of the ith pair and the initiation of that pair's completion.


[Figure 2 shows the ith task pair: the initial task occupying [t_i, t_i + S_i] and the completion task occupying [t_i + d_i, t_i + d_i + T_i].]
Figure 2. The ith task pair


The time between the initiation of the first task pair and the completion of the final pair we refer to as the frame time (or makespan, cf. [3,4]), denoted z. For convenience, we will set the initiation time of the first pair to 0.

The scheduling problem may be stated as:

    find t_i ≥ 0, i = 1, ..., N, to minimize

        z = max_i (t_i + d_i + T_i)

    subject to the constraint that no member of the set of intervals

        {(t_i, t_i + S_i), (t_i + d_i, t_i + d_i + T_i)},  i = 1, ..., N,

    overlaps with any other member.

To put this problem into context with much of the recent literature classifying scheduling problems with regard to their computational complexity, we observe that the problem as stated is equivalent to a job shop problem where N jobs are to be scheduled on two machines with the following characteristics*:

1.  Each job requires three operations: the first (of duration S_i) to be processed on Machine 1; the second (of duration d_i - S_i) on Machine 2; the third (of duration T_i) again on Machine 1.

2.  Machine  1  may  only  process  one  operation  at  a  time;  Machine  2,  however,  has 
infinite  processing  capacity. 


"Under  the  classification  scheme  of  Rinooy  Kan  [9],  this  problem  is  V 1 2 1  (7,  no  wan,  V/2  iwn-bon  |Cmax.    See  also  [8]. 



3.       No  waiting  between  operations  is  permitted.    That  is,  once  a  job  is  begun,  it  must 
proceed  from  Machine  1  to  Machine  2  and  back  again  to  Machine  1  with  no  delay. 

The  problem  can  then  be  shown  to  be  NP-complete  by  Theorem  5.7,  pg.  93  in  [9]  or  by  a 
reduction  from  KNAPSACK  in  [6].  NP-complete  problems  form  an  equivalence  class  of  com- 
binatorial problems  for  which  no  nonenumerative  algorithms  are  known.  If  an  "efficient"  algo- 
rithm were  constructed  which  could  solve  any  problem  in  this  class,  any  other  would  also  be 
solvable  in  polynomial  time  (cf.  [2,4,6,7]).  Members  of  this  class  include  the  chromatic 
number  problem,  the  knapsack  problem,  and  the  traveling  salesman  problem. 

The  fact  that  a  polynomial-bounded  algorithm  is  not  likely  to  exist  motivates  the  construc- 
tion of  several  polynomial-bounded  algorithms  which  are  presented  and  evaluated  in  Sections  2 
and  3.  An  integer  programming  formulation  leads  to  a  straightforward  branch  and  bound  pro- 
cedure which  makes  use  of  the  problem's  special  structure.  (See  [11].)  In  view  of  the  fact  that 
this  optimal  procedure  is  likely  to  be  tractable  only  for  very  small  problems,  and  not  even  then 
for  radar-like  applications  requiring  real  time  solution,  we  proceed  directly  to  consideration  of 
three  suboptimal  algorithms. 

2.    SUB-OPTIMAL  ALGORITHMS 

This  section  considers  scheduling  procedures  which  can  be  shown  to  be  polynomially 
bounded:  Sequencing,  Nesting  and  Fitting.  After  some  discussion  of  their  characteristics,  they 
will  be  evaluated  on  realistic  examples  in  Section  3. 

Sequencing 

An ordered set of p task pairs is said to be sequenced when the completion tasks arrive for processing in the same order as the initial tasks were scheduled. A set of p pairs can be sequenced whenever

(1)    d_1 ≥ Σ_{i=1}^{p} S_i    and

(2)    d_i ≥ d_{i-1} + T_{i-1} - S_{i-1},    i = 2, 3, ..., p.

If, as is the case for many applications, S_i = T_i for each task pair, (2) becomes simply

(3)    d_i ≥ d_{i-1},

and implementation of this procedure becomes quite easy.
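Conditions (1)-(3) are simple enough to check directly. The short sketch below (our illustration, not part of the original paper) tests whether an ordered list of pairs (S_i, T_i, d_i) can be sequenced; the two calls use the first few pairs of the example introduced later in this section.

```python
# Illustrative sketch: test the sequencing conditions (1) and (2) for an ordered list
# of task pairs given as (S, T, d) tuples.
def can_sequence(pairs):
    S = [p[0] for p in pairs]
    T = [p[1] for p in pairs]
    d = [p[2] for p in pairs]
    if d[0] < sum(S):                              # condition (1)
        return False
    for i in range(1, len(pairs)):                 # condition (2)
        if d[i] < d[i - 1] + T[i - 1] - S[i - 1]:
            return False
    return True

# With S_i = T_i, condition (2) reduces to nondecreasing d_i, as in (3).
print(can_sequence([(2, 2, 9), (1, 1, 13), (2, 2, 15), (3, 3, 15)]))                # True
print(can_sequence([(2, 2, 9), (1, 1, 13), (2, 2, 15), (3, 3, 15), (2, 2, 19)]))    # False
```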

We  may  think  of  this  procedure  as  "jamming"  initial  tasks  together  until  they  run  into  the 
completion  task  corresponding  to  the  first  initial  task.  The  completion  tasks  are  guaranteed  not 
to  overlap  since  each  succeeding  d,  is  at  least  as  large  as  the  one  before.  Also,  since  this  is  a 
"single-pass"  procedure  (cf.  [3]),  computation  time  is  linear  in  A/.* 

In any sequenced p-set, dead time can occur in two ways, as is shown in Figure 3. It occurs between the last initial task and the first completion, and it occurs between successive completions. The former can be written as

    d_1 - Σ_{i=1}^{p} S_i




[Figure 3 shows a sequenced p-set: the initial tasks S_1, ..., S_7 jammed together, followed by the completion tasks T_1, ..., T_7.]
Figure 3. Sequencing

and the latter as

    Σ_{i=2}^{p} (d_i - d_{i-1}) = d_p - d_1,

since, with S_i = T_i, successive completions of a sequenced set are separated by exactly d_i - d_{i-1}. Hence,

    z_seq = Σ_{i=1}^{p} (S_i + T_i) + (d_1 - Σ_{i=1}^{p} S_i) + (d_p - d_1) = Σ_{i=1}^{p} T_i + d_p.

Hence, if the N task pairs are sequenced in P p-sets, the kth set having p_k pairs, k = 1, 2, ..., P, the total frame time may be represented as

    z_seq = Σ_{i=1}^{N} T_i + Σ_{k=1}^{P} d_(p_k),

where d_(p_k) denotes the inter-task duration of the last pair in the kth set.

As an example, consider the following 7 task pairs with common durations for initial and completion tasks, ordered by increasing d_i:

    i = 1:  S_1 = T_1 = 2,  d_1 = 9
    i = 2:  S_2 = T_2 = 1,  d_2 = 13
    i = 3:  S_3 = T_3 = 2,  d_3 = 15
    i = 4:  S_4 = T_4 = 3,  d_4 = 15
    i = 5:  S_5 = T_5 = 2,  d_5 = 19
    i = 6:  S_6 = T_6 = 4,  d_6 = 24
    i = 7:  S_7 = T_7 = 3,  d_7 = 25.

Figure 4(a) shows their sequenced schedule.

For comparison, Figure 4(b) shows the optimal schedule for this set of task pairs as generated by the branching algorithm alluded to above. At the other extreme, if these pairs were scheduled by waiting until each pair was completely processed before initiating the next, the frame time would be

    Σ_{i=1}^{7} (d_i + T_i) = 138.


Nesting 

An ordered set of p task pairs is said to be nested whenever the completion tasks arrive for processing in the reverse of the order in which the initial tasks were scheduled. A set of p pairs may be nested if




[Figure 4 shows three schedules for the 7-pair example: (a) the sequenced schedule, z = 58; (b) the optimal schedule, z = 37; (c) the nested schedule, z = 70.]
Figure 4. Sequencing and nesting

(4)    d_i ≥ d_{i+1} + T_{i+1} + S_i,    i = 1, ..., p-1.

Applying  this  procedure  to  the  7-pair  example  discussed  above  gives  the  schedule  shown 
in  Figure  4(c)  with  z  =  70. 

Fitting 

This  procedure,  unlike  the  two  discussed  above,  allows  the  user  to  specify  a  priority  order- 
ing, and  corresponds  intuitively  to  the  simple  process  which  one  might  use  when  scheduling 
task  pairs  by  hand.  After  setting  the  desired  order  and  scheduling  the  first  task  pair  at  time  0, 
each  successive  pair  is  scheduled  at  the  earliest  possible  time  not  involving  any  overlap  with 
pairs  already  scheduled. 

Let us consider this procedure for the above example, taking an arbitrary ordering: 2, 6, 7, 4, 3, 1, 5. As shown in Figure 5(a), task pair 2 is scheduled at time 0, and pairs 6 and 7 can successively be scheduled with no overlap. If we, however, try to schedule pair 4 at the first available time, its completion would overlap with pair 6's completion (Figure 5(b)), so this is not possible. The first available time for scheduling task pair 4 without overlap is time 18 (Figure 5(c)). Pair 3, however, having task duration only 2, can be scheduled at time 8 (Figure 5(d)). Observe now that pair 1 can be scheduled nowhere in the existing schedule without overlap, so it must be "tacked" onto the end, at time 36 (Figure 5(e)). Pair 5 is scheduled at time 21, completing the schedule with z = 47 (Figure 5(f)).
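The fitting procedure is easy to state programmatically. The following sketch (our illustration, not the author's program) places each pair, in the given priority order, at the earliest start time whose initial and completion intervals avoid everything already scheduled; for the priority order 2, 6, 7, 4, 3, 1, 5 of the 7-pair example it reproduces the start times found above and z = 47.

```python
# Sketch of the Fitting heuristic.  Pair i occupies [t_i, t_i+S_i) and
# [t_i+d_i, t_i+d_i+T_i); each pair is placed at the earliest non-overlapping time.
def fit(pairs, order):
    """pairs: dict i -> (S, T, d); order: list of pair indices in priority order."""
    busy = []                                   # occupied intervals (a, b)
    def free(a, b):
        return all(b <= x or y <= a for (x, y) in busy)
    starts = {}
    for i in order:
        S, T, d = pairs[i]
        # the earliest feasible start is 0 or flush against the end of some interval,
        # either for the initial task (t = y) or for the completion task (t = y - d)
        candidates = sorted({0} | {y for (_, y) in busy}
                                | {y - d for (_, y) in busy if y - d >= 0})
        t = next(t for t in candidates if free(t, t + S) and free(t + d, t + d + T))
        busy += [(t, t + S), (t + d, t + d + T)]
        starts[i] = t
    z = max(t + pairs[i][2] + pairs[i][1] for i, t in starts.items())
    return starts, z

pairs = {1: (2, 2, 9), 2: (1, 1, 13), 3: (2, 2, 15), 4: (3, 3, 15),
         5: (2, 2, 19), 6: (4, 4, 24), 7: (3, 3, 25)}
starts, z = fit(pairs, [2, 6, 7, 4, 3, 1, 5])
print(starts, z)    # start times 0, 1, 5, 18, 8, 36, 21 and z = 47, as in Figure 5
```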

3.   TASK  PAIR  SIMULATION  AND  NUMERICAL  RESULTS 

In keeping with the radar application mentioned above, a simulation has been developed to generate aircraft configurations suitable for radar operation. For each object, range, cross-section, and velocity can be used to determine the necessary length of transmit and receive pulses (of the order of 10-100 μsec) as well as the inter-pulse distance (of the order of 300-1300 μsec). Thus, a list of task pairs can be generated for evaluation of the procedures outlined in the previous section. As an example, such a list is given in Table I for N = 20.

For  values  of  N  shown  in  Table  II,  the  simulation  generated  50  such  task  pair  lists,  and 
the  average  frame  time  and  computation  time  were  computed.  Figure  6  presents  this  data 
graphically.  Note  that,  as  one  would  expect,  frame  time  is  linear  in  N.  This  is  not  surprising 
since  in  the  best  conceivable  situation,  that  of  no  idle  time  between  subtasks, 




[Figure 5: panels (a) through (f) show the successive placement of task pairs 2, 6, 7, 4, 3, 1 and 5 under the fitting procedure, ending with z = 47.]
Figure 5. Fitting

TABLE I - Sample Task Pair List (N = 20)

  i    S_i = T_i (μsec)    d_i (μsec)
  1          70               1334
  2          70               1258
  3          80               1254
  4          50               1159
  5          70               1107
  6          80               1022
  7          70                954
  8          70                884
  9          70                791
 10          40                750
 11          50                709
 12          70                674
 13          60                631
 14          60                623
 15          60                621
 16          70                555
 17          50                513
 18          40                498
 19          50                465
 20          40                387

TABLE II(a) - Average Simulated Frame Times

   N      SEQUENCE        NEST            FIT
   20      4.4 (.414)      7.3 (1.387)     4.0 (.381)
   50      8.6 (.559)     15.0 (1.830)     7.6 (.452)
  100     15.5 (.759)     27.2 (2.747)    13.8 (.679)
  200     29.2 (1.197)    52.2 (4.683)    27.4 (.990)
  500     70.8 (2.188)   119.2 (8.391)    66.1 (1.590)

Frame times in msec. Quantities in parentheses are standard deviations.




TABLE II(b) - Average Computation Times (msec)

   N      SEQUENCE     NEST        FIT
   20        1.9          4.0        67.2
   50        4.4         16.3       440.1
  100        8.6         51.9      1742
  200       16.9        195.3      7091
  500       42.0       1064       44160

[Figure 6 plots mean frame time Z (msec) against N for the three procedures; the nesting curve grows roughly as Z ≈ .232 N.]
Figure 6. Comparative frame times


    z = Σ (S_i + T_i) ≈ k_1 N

and in the worst situation, that in which no task is performed until the previous task has been completed,

    z = Σ (d_i + T_i) ≈ k_2 N.


An assumption made in the treatment of this example is that the radar operator knows the values of S_i, T_i and d_i precisely. If there is any uncertainty, signals can overlap. A straightforward way of avoiding this problem in a real situation, where uncertainty would obviously be present, would be to "open a window" around the pulse. That is, if the object is such that transmit and receive pulses are estimated to be of 60 μsec duration, an interval longer than 60 μsec can be allotted to these pulses to accommodate (1) the possibility that a pulse length longer than 60 μsec might be necessary or, more important, (2) the possibility that the receive signal might arrive sooner or later than expected. This procedure offers no conceptual difficulty since the window around the pulses may be made large enough to guarantee that the probability of overlap is as small as required. In order to retain frame times small enough to allow updating every, say, 200 milliseconds, we must limit the size of the window somewhat. This does not seem to be a severe restriction, however. For example, since frame time is linear in Σ T_i, opening a window around each pulse of twice that pulse's estimated duration would cause the frame time to be no more than doubled. The frame times of sequenced pulses in Table II(a) indicate that even for large N, this is no problem.

A  second  possibly  problematic  characteristic  of  the  example  is  that  it  is  static,  i.e.,  no 
explicit  consideration  is  given  to  new  "jobs"  added  to  the  system  during  the  scheduling  process. 
In  job  shop  scheduling,  this  may  present  no  problem  if  jobs  are  released  to  the  shop  at 
predetermined  times.  In  radar  tracking,  however,  one  cannot  hold  enemy  missiles,  and  the 
scheduler  must  be  dynamic.  This  can  be  accomplished;  the  new  targets  may  be  inserted  into 
the  queue  of  jobs  to  be  processed,  or,  since  this  is  likely  to  be  time-consuming  when  jobs  are 
ordered  (as  in  Sequencing  and  Nesting),  all  current  jobs  can  be  processed,  followed  by  the 
newly-arrived entries. This procedure will be especially efficient for sequencing since the d_i's
are  proportional  to  the  distance  between  radar  and  target,  and  new  targets  will  tend  to  appear  at 
approximately  the  same  range. 

The  necessity  to  allow  for  search  and  discrimination  as  well  as  the  tracking  activity  and 
real-time  schedule  determination  within  a  200  milli-second  period  makes  sequencing  the  only 
viable  alternative.  Even  when  real-time  processing  is  not  required,  one  wonders  whether  the 
slight  improvement  in  frame  time  allowed  by  fitting  warrants  the  extra  computational  burden. 

A  caveat  is  in  order  here:  these  results  are  somewhat  application-dependent.  It  is  quite 
possible  that  other  applications  which  produce  task  pairs  with  different  structures  will  lead  to 
different  conclusions. 

CONCLUDING  REMARKS 

In  the  above  discussion  it  has  been  assumed  that  the  operator  or  machine  can  process 
only  one  task  segment  at  a  time.  This  is  appropriate  for  the  application  being  considered,  but 
one  might  easily  imagine  instances  in  which  there  is  some  nonunit  capacity  constraint  on  the 
operator.  For  example,  if  trucks  are  being  loaded  and  unloaded  at  some  central  depot,  labor  or 
space  restrictions  might  limit  the  number  of  trucks  being  simultaneously  processed. 

Fortunately,  the  suboptimal  procedures  described  above  may  be  extended  without  any 
problem.*  Figure  7  shows  how  the  example  given  in  Section  2  may  be  sequenced  if  the  operator 
is limited to two tasks at a time. Note that, due to the ordering of the inter-task durations, sequencing guarantees that, since no more than two initial tasks overlap, no more than two completion tasks will overlap.


*The optimal enumerative procedure described in [11] is also easily extended.




[Figure 7 shows the 7-pair example sequenced when the operator may process two tasks at a time.]
Figure 7. Sequencing with operator capacity = 2

Another extension is to consider tasks which consist of more than two coupled segments. The notation changes slightly: the ith task pair becomes a task set, the initial task of duration S_i followed by n_i subtasks; the jth subtask is of duration T_ij, and the time at which it is initiated is d_ij after the initiation of the initial task (Figure 8).


Figure 8. Multiple coupled subtasks

Fitting, as proposed above, works well in this case, but sequencing and nesting are wasteful since they treat the subtasks as one long task of duration d_{i,n_i} + T_{i,n_i} - d_{i,1}.


BIBLIOGRAPHY 

[1]    Air  Traffic  Control  Advisory  Committee,  Report  of  the  Department  of  Transportation,  1 

(1969). 
[2]    Coffman,  E.G.,  Jr.  (editor),  Computer  and  Jobshop  Scheduling  Theory,  John  Wiley,  New 

York,  N.Y.  (1976). 
[3]    Conway,  R.W.,  W.L.  Maxwell  and  L.W.  Miller,  Theory  of  Scheduling,  Addison-Wesley, 

Reading,  Mass.  (1967). 
[4]    Garey,  M.R.,  D.S.  Johnson  and  R.  Sethi,  "The  Complexity  of  Flowshop  and  Jobshop 

Scheduling,"  Mathematics  of  Operations  Research,  1,  117-129  (1976). 
[5]    Heffes,  J.  and  S.  Horing,  "Optimal  Allocation  of  Tracking  Pulses  for  an  Array  Radar," 

IEEE  Transactions  on  Automatic  Control,  15,  81-87  (1970). 
[6]     Karp,    R.M.,   "Reducibility   among   Combinatorial   Problems,"    R.E.    Miller   and   J.W. 

Thatcher,  (editors),  Complexity  of  Computer  Computations,  Plenum  Press,  New  York, 

N.Y.,  85-104  (1972). 
[7]    Karp,  R.M.,  "On  the  Computational  Complexity  of  Combinatorial  Problems,"  Networks,  5, 

45-68  (1975). 




[8]   Reddi, S.S. and C.V. Ramamoorthy, "On the Flowshop Sequencing Problem with No Wait in Process," Operational Research Quarterly, 23, 323-331 (1972).
[9]   Rinnooy Kan, A.H.G., Machine Scheduling Problems, Martinus Nijhoff, The Hague (1976).
[10]  Schweppe, F.C. and D.L. Gray, "Radar Signal Design Subject to Simultaneous Peak and Average Power Constraints," IEEE Transactions on Information Theory, 12, 13-26 (1966).
[11]  Shapiro, R.D., "Scheduling Coupled Tasks," Harvard Business School, Working Paper, HBS 76-10.


SEQUENCING  INDEPENDENT  JOBS 
WITH  A  SINGLE  RESOURCE 

Kenneth  R.  Baker 

Dartmouth  College 
Hanover,  New  Hampshire 

Henry  L.  W.  Nuttle 

North  Carolina  State  University 
Raleigh,  North  Carolina 

ABSTRACT 

This  paper  examines  problems  of  sequencing  n  jobs  for  processing  by  a  sin- 
gle resource  to  minimize  a  function  of  job  completion  times,  when  the  availa- 
bility of  the  resource  varies  over  time.  A  number  of  well-known  results  for 
single-machine  problems  which  can  be  applied  with  little  or  no  modification  to 
the  corresponding  variable-resource  problems  are  given.  However,  it  is  shown 
that  the  problem  of  minimizing  the  weighted  sum  of  completion  times  provides 
an  exception. 


1.   INTRODUCTION 

We consider the problem of sequencing a set N = {1, 2, ..., n} of jobs to be processed using a single homogeneous resource, where the availability of the resource varies over time. If t represents time (measured from some origin t = 0), then we denote by r(t) the resource available at time t and by R(t),

    R(t) = ∫_0^t r(u) du,

the cumulative availability as of time t, i.e., the area under the curve r(u) over the interval [0, t]. See Figure 1.

Let p_j, j = 1, ..., n, denote the resource requirement of job j. Once p_j units of resource have been applied to job j, the job is considered complete. We denote the completion time of job j by C_j. In all problems treated, the objective is to minimize G, a function of the completion times of the jobs, where G is assumed to be a regular measure (see [1], Chapter 2).

This  model  is  a  generalization  of  the  single-machine  sequencing  model.  The  generaliza- 
tion to  a  resource  capacity  that  varies  over  time  allows  for  situations  in  which  machine  availabil- 
ity is  interrupted  for  scheduled  maintenance  or  temporarily  reduced  to  conserve  energy.  It  also 
allows  for  a  situation  in  which  processing  requirements  are  stated  in  terms  of  man-hours  and 
labor  availability  varies  over  time. 



Figure 1. Resource availability r(t) and cumulative availability R(t)



In the single-machine case the resource profile r(t) is constant (typically r(t) = 1), and the cumulative profile R(t) is a straight line with slope r(t). Time is measured in some basic unit such as hours; and completion times, ready times, due dates and tardiness are expressed in the same units. Resource requirements (processing times) are simply requirements for intervals on the time-axis.

In the variable-resource problem, the exact correspondence between the requirement for a unit of resource and the requirement for a unit interval on the time axis is lost. This lack of correspondence arises from the fact that there may be a number of units of resource available during a particular unit of time and a different number during the next. In the single-machine problem, if a job j is sequenced to follow the jobs in B (where B is any subset of N), then job j will be complete at time C_j,

    C_j = p(B) + p_j,

where p(B) = Σ_{i∈B} p_i and p_i denotes the processing time for job i. In the variable-resource problem it is appealing to analogously specify the completion time of job j by

(1)    C_j = t(p(B) + p_j),

where p_j is the resource requirement of job j and t(Q) is the (smallest) point on the time axis corresponding to R(t) = Q. See Figure 1. In effect, jobs are sequenced on the resource axis, while their completion times are measured on the time axis. For the single-machine problem the completion point of job j is the same on both axes, but such is not the case for the variable-resource problem.

Notice that this specification implicitly assumes that the resource available at any point in
time  is  devoted  entirely  to  the  processing  of  a  single  job.  Thus,  for  example,  if  ten  men  were 
available  in  a  particular  hour,  all  ten  would  be  assigned  to  work  simultaneously  on  the  same 
job.  Also,  if  the  available  resource  represents  several  machines,  then  this  formulation  permits 
each  job  to  be  processed  simultaneously  on  more  than  one  machine.  Equivalently,  this  means 
that  jobs  must  be  divisible  into  portions  that  can  be  allocated  equally  to  the  number  of 
machines  available.   Such  a  formulation  will  be  called  a  continuous-time  model. 
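A small sketch (our illustration, under the assumption of a piecewise-constant resource profile) shows how t(Q) and the completion times of (1) can be computed; the profile in the call is the one used in the example of Section 3, and the printed completion times 4.75, 5.5 and 7 match that example.

```python
# Sketch of the completion-time calculation in (1) for a piecewise-constant r(t).
# `profile` is a list of (length, rate) segments; the last segment is assumed long
# enough to supply all the required resource.
def make_t_of(profile):
    """Return t(Q): the smallest time at which cumulative availability reaches Q."""
    def t_of(Q):
        elapsed, cum = 0.0, 0.0
        for length, rate in profile:
            if rate > 0 and cum + rate * length >= Q:
                return elapsed + (Q - cum) / rate
            elapsed += length
            cum += rate * length
        raise ValueError("profile exhausted before Q units are available")
    return t_of

def completion_times(p_sequence, t_of):
    """Continuous-time completion times C_j = t(p(B) + p_j) for a job sequence."""
    times, cum = [], 0.0
    for p in p_sequence:
        cum += p
        times.append(t_of(cum))
    return times

# r(t) = 1 on [0,4) and r(t) = 4 on [4,7], as in the example of Section 3
t_of = make_t_of([(4, 1), (3, 4)])
print(completion_times([7, 3, 6], t_of))   # [4.75, 5.5, 7.0]
```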


In  order  to  allow  for  a  wider  range  of  applicability,  we  can  re-formulate  the  model  in 
discrete  time  as  follows. 

(a)  Unit  intervals  on  the  time  axis  (of  Figure  1)  are  called  periods,  and  job  comple- 
tion times  are  measured  in  periods. 

(b)  In  a  given  period  the  resource   availability  is  an  integer  number  of  units. 

(c)  Each  job  requires  an  integer  number  of  resource-periods. 

(d)  Processing  work  is  divisible  only  to  the  level  of  one  resource-unit  for  one  period. 

Under  this  formulation,  for  example,  the  time  unit  might  be  days,  the  resource  availability 
might  be  crew  size,  and  the  processing  requirement  might  be  man-days.  Property  (d)  then  res- 
tricts the  refinement  of  a  schedule  to  the  assignment  of  each  crew  member's  task  on  a  day-by- 
clay  basis.  Furthermore,  a  task  requiring  two  man-days  could  be  accomplished  either  by  one 
crew  member  working  two  days  or  by  two  members  working  one  day  each. 

In the discrete-time context, we may regard sequencing as ordering jobs on the resource scale in Figure 1, but taking the completion time of job j to be ⌈C_j⌉ rather than C_j, i.e., the smallest




integer greater than or equal to C_j, where C_j itself is given by (1). In other words, we obtain a
sequence  using  the  continuous-time  framework,  which  assumes  arbitrarily  divisible  jobs,  but  we 
round  up  the  resulting  completion  times  when  they  are  noninteger.  Under  this  interpretation 
of  the  model,  due-dates  are  specific  days  and  a  job  is  "on  time"  as  long  as  it  is  completed  on  or 
before  the  specific  day.  Clearly,  in  the  discrete  time  model  several  jobs  can  have  the  same 
completion  time. 

To  verify  that  a  job  sequence  can  be  interpreted  consistently  with  requirement  (d),  note 
that  the  cumulative  resource  requirement  and  the  cumulative  resource  availability  by  the  end  of 
any  period  are  both  integers.  It  follows  that  the  workload  implied  by  the  continuous-time  solu- 
tion can  be  shifted  to  meet  the  integer  restrictions  of  the  discrete-time  model  since  the  resource 
availability  in  any  period  can  be  treated  as  a  set  of  unit-resource  availabilities.  Then  any  frac- 
tion of  a  day's  work  in  the  original  solution  can  be  rescheduled  as  a  day's  work  for  the  same 
proportion  of  the  total  resource  units  available.  This  rescheduling  will  consume  an  integer 
number  of  resource-periods  for  each  job. 

As  an  example,  consider  the  three-job  problem  shown  below. 


    j       1    2    3
    p_j

    r(t) = 1,   0 ≤ t < 4
    r(t) = 4,   4 ≤ t ≤ 7

In  Figure  2  we  represent  the  sequence  1-2-3  assuming  infinite  divisibility.  In  Figure  3  we  show 
how  the  work  is  rescheduled  to  meet  the  integrality  requirement  of  the  discrete-time  model. 
As  Figures  2  and  3  indicate,  the  discrete-time  conditions  can  be  incorporated  by  a  minor  adjust- 
ment of  continuous-time  job  assignments  that  essentially  involves  replacing  vertical  portions  of 
the  schedule  chart  with  horizontal  portions  whenever  the  available  resource  capacity  is  split 
among  two  or  more  jobs  within  a  period. 


Figure 2.


Our  purpose  in  this  paper  is  to  note  that  certain  well-known  results  for  the  single-machine 
model  carry  over  with  little  or  no  modification  to  the  variable-resource  model.  In  fact,  we 
found  only  one  exception.    (See  Section  3.) 

A  variable  resource  problem  has  also  been  examined  by  Gelders  and  Kleindorfer  [6,7]  in 
the  context  of  coordinating  aggregate  and  detailed  scheduling  decisions.  In  their  model  the 
variation  in  resource  availability  results  from  the  explicit  decision  to  schedule  overtime.    This 




decision  leads  to  a  cumulative  resource  availability  function  consisting  of  segments  with  identi- 
cal positive  slope  (corresponding  to  capacity  available)  separated  by  horizontal  segments 
(corresponding  to  unused  overtime.)  Their  objective  is  to  determine  when  and  how  much 
overtime  should  be  scheduled,  and  to  determine  the  associated  job  sequence,  so  as  to  minimize 
the  sum  of  overtime,  tardiness  and  flow-time  costs.  They  also  note  that  for  a  given  overtime 
schedule,  shortest-first  sequencing  minimizes  mean  job  completion  time  while  nondecreasing 
processing  time-to-weight  ratio  sequencing  may  not  minimize  mean  weighted  job  completion 
time.  These  two  results  are  encompassed  in  our  general  treatment  of  the  variable-resource 
model  in  Sections  2  and  3. 


Figure 3.


2.   RESULTS  THAT  GENERALIZE  TO  VARIABLE  RESOURCES 

The  following  is  a  set  of  sequencing  results  for  the  variable-resource  model  that  are  ident- 
ical to  or  slight  modifications  of  their  single-machine  counterparts.  It  is  not  difficult  to  establish 
that  the  results  we  give  are  valid  for  both  the  continuous-time  and  discrete-time  models.  How- 
ever, proofs  are  omitted,  since  they  are  typically  direct  extensions  of  the  original  arguments  in 
the  single-machine  case. 

The  results  involve  sequences  of  jobs,  or  at  least  partial  sequences.  We  reiterate  that 
these  sequences  can  be  viewed  as  applying  to  the  resource  axis  in  Figure  1  but  can  be  converted 
to  completion  time  schedules  in  either  the  continuous-time  or  discrete-time  case  by  means  of 
the appropriate transformation. We use C_j to denote the completion time of job j and t(p(B)) to denote the makespan for the jobs in B, recognizing that in the discrete-time case these quantities must be interpreted in the appropriate way.

Minimizing  the  Maximum  Cost 


One  of  the  few  efficient  algorithms  for  a  broad  class  of  sequencing  criteria  is  Lawler's  pro- 
cedure [9]  for  minimizing  the  maximum  cost  in  the  sequence.  Formally,  the  criterion  is  to 
minimize 

    G = max_j {g_j(C_j)},

where g_j(C_j) is the cost incurred by job j when it completes at C_j, and where g_j(t) is nondecreasing in t. The solution procedure works by constructing a sequence from the back of the schedule, and the procedure is easily adapted to the variable-resource model, as shown below.



1.  Initially let A = ∅. (A denotes the set of jobs at the end of the schedule and A' = N - A denotes its complement.)

2.  Find M = t(p(A')). (M is the makespan for the unscheduled jobs.)

3.  Identify a job k satisfying g_k(M) = min_{i∈A'} {g_i(M)}. (Considering only the unscheduled jobs, job k is the one that achieves the minimum cost when scheduled last.)

4.  Schedule job k last among the jobs in A'. Then add job k to A and return to Step 2 until A = N.

A  noteworthy  special  case  is  the  criterion  of  maximum  tardiness.  The  procedure 
sequences  jobs  in  nondecreasing  order  of  due-dates  in  this  case.  Thus,  as  in  the  single  machine 
problem,  earliest  due-date  (EDD)  scheduling  will  minimize  the  maximum  tardiness.  It  will 
also  find  a  schedule  in  which  all  jobs  complete  on  time,  if  such  a  schedule  exists. 
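The backward procedure is short enough to state directly in code. The following sketch (our illustration, not from the paper) takes a family of nondecreasing cost functions g_j and a makespan function playing the role of t(p(A')); the example uses a unit-rate resource and maximum tardiness costs, for which the procedure returns the EDD order, as noted above.

```python
# Sketch of the backward (Lawler-type) procedure for minimizing the maximum cost.
def min_max_cost_sequence(jobs, g, makespan):
    unscheduled = set(jobs)
    tail = []                                         # jobs fixed at the end, in order
    while unscheduled:
        M = makespan(unscheduled)                     # step 2: makespan of unscheduled jobs
        k = min(unscheduled, key=lambda j: g[j](M))   # step 3: cheapest when placed last
        tail.insert(0, k)                             # step 4: schedule k last among A'
        unscheduled.remove(k)
    return tail

# Hypothetical data; a unit-rate resource, so t(p(A')) is just the total requirement of A'.
p = {1: 3, 2: 2, 3: 4}
d = {1: 9, 2: 4, 3: 6}
g = {j: (lambda C, dj=d[j]: max(0.0, C - dj)) for j in p}   # g_j(C) = tardiness of job j
seq = min_max_cost_sequence(p, g, makespan=lambda A: sum(p[j] for j in A))
print(seq)   # [2, 3, 1]: earliest-due-date order, as expected for maximum tardiness
```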

Minimizing  the  Sum  of  Tardiness  Penalties 

Many problems of considerable interest for the single machine model may be regarded as special cases of the problem of minimizing the total tardiness penalty,

    G = Σ w_i T_i,

where T_i = max(C_i - d_i, 0) and w_i ≥ 0.

Several  dominance  properties,  in  the  spirit  of  Emmons  [3],  can  be  shown  to  hold  for  the 
variable  resource  problem.  These  in  turn  imply  similar  dominance  properties  for  the  various 
special  cases  and,  in  some  instances,  optimizing  (ranking)  procedures.    Let: 

J     =  a set of jobs

J'    =  the complement of J

A_j   =  the set of jobs known to follow job j, by virtue of precedence conditions

B_j   =  the set of jobs known to precede job j, by virtue of precedence conditions

C_J   =  the time required to process the jobs in set J, defined by R(C_J) = Σ_{i∈J} p_i

B_j'  =  B_j ∪ {j} = the set containing job j and all jobs known to precede job j by virtue of the precedence conditions

A_j'  =  the complement of A_j, excluding job j.

THEOREM 1: If w_k ≤ w_j and d_k ≥ max(d_j, C_{B_k'}), then j precedes k in an optimal sequence.

THEOREM 2: If d_k ≥ C_{A_j'}, then j precedes k in an optimal sequence.

THEOREM 3: If p_j ≤ p_k, w_j ≥ w_k and d_j ≤ max(d_k, C_{B_k'}), then j precedes k in an optimal sequence.

COROLLARY (Theorem 3): If w_j ≥ w_k, p_j ≤ p_k and d_j ≤ d_k, then j precedes k in an optimal sequence. The corollary immediately yields an optimal ranking procedure for problems derived by making constant any two of the three parameters. For example, when G = Σ T_i with w_i = w and d_i = d, an optimal sequence is determined by ordering the jobs by processing



requirement, smallest first (p_1 ≤ p_2 ≤ ... ≤ p_n). When d = 0 we have T_i = C_i, i.e., the mean flowtime problem, for which this sequence is called shortest processing time (SPT).

The problem of minimizing the total tardiness penalty when p_i = p is also not difficult to solve. Constant resource requirements imply a fixed set of completion times under any sequence. In particular, the first job completes at t(p), the second job at t(2p), etc.; and an optimal schedule may be found by assigning jobs to positions, as in Lawler [10]:

    x_ij = 1 if job i appears in sequence position j, and 0 otherwise;

    c_ij = the penalty for job i when it appears in sequence position j, i.e., max{0, t(jp) - d_i}.

The problem is to minimize

    Σ_i Σ_j c_ij x_ij

subject to

    Σ_i x_ij = 1 for each position j,    Σ_j x_ij = 1 for each job i,    x_ij ∈ {0, 1}.

An assignment algorithm can produce the optimal solution.
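A short sketch of this formulation is given below (our illustration, assuming scipy is available and using hypothetical data): the cost matrix c_ij = max{0, t(jp) - d_i} is built from the positional completion times t(jp), and scipy's linear_sum_assignment routine solves the resulting assignment problem.

```python
# Sketch of the assignment formulation for equal resource requirements p_i = p.
import numpy as np
from scipy.optimize import linear_sum_assignment

def equal_requirement_schedule(due_dates, p, t_of):
    n = len(due_dates)
    completion = [t_of((j + 1) * p) for j in range(n)]     # completion time of position j+1
    cost = np.array([[max(0.0, completion[j] - d) for j in range(n)] for d in due_dates])
    rows, cols = linear_sum_assignment(cost)               # rows are jobs, cols are positions
    total = cost[rows, cols].sum()
    order = [job for _, job in sorted(zip(cols, rows))]    # jobs listed in position order
    return order, total

# hypothetical data: p = 2 and a unit-rate resource, so t(Q) = Q
order, total = equal_requirement_schedule(due_dates=[3, 8, 2, 6], p=2, t_of=lambda Q: Q)
print(order, total)   # an optimal position order and the minimum total tardiness
```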

The most general version of the single-machine problem, with unequal due-dates, processing times, and weights, is binary NP-complete. The computational complexity of the cases in which w_i = w or d_i = d > 0 is an open question. However, pseudo-polynomial algorithms have been developed by Lawler [11] and Lawler and Moore [12]. The algorithms which have demonstrated the most effective computational power for these problems are those found in [14]. These and other enumerative algorithms can be modified in a straightforward manner to accommodate the variable resource problem.

Minimizing  The  Weighted  Number  of  Tardy  Jobs 

In this case we are interested in whether a job is tardy rather than the length of time by which it is tardy. Let δ(T_j) = 1 indicate that job j is tardy and δ(T_j) = 0 indicate that it is completed on time. If each job has its own penalty for being tardy, i.e.,

    G = Σ w_j δ(T_j),

then the single-machine problem is binary NP-complete, although it can be solved by a pseudo-polynomial dynamic programming algorithm due to Lawler and Moore [12]. The algorithm can easily be adapted to the variable-resource problem with no impact on computational efficiency.

By restricting the data we obtain special cases that are solvable by ranking algorithms, just as in the single-machine case:

THEOREM 4: When d_i = d for all jobs, if the processing times and weights are agreeable (p_i ≤ p_j whenever w_i ≥ w_j), then an optimal sequence is obtained by scheduling the jobs in order of processing requirement, shortest first (equivalently, in order of weight, largest first).

COROLLARY (Theorem 4): When d_i = d and p_i = p, an optimal sequence is obtained by scheduling jobs in order of weight, largest first.




When w_j = w for all jobs, a sequence that minimizes the number of tardy jobs, i.e., G = Σ δ(T_j), can be determined by generalizing an efficient algorithm due to Moore [13]. Since maximum tardiness is minimized by sequencing the jobs in EDD order, it follows that if sequence S yields minimum G, then so will sequence S', in which the on-time jobs in S are scheduled in EDD order followed by all the tardy jobs in S. Letting S_n represent the largest possible set of on-time jobs (so that G = n - |S_n| is the minimum number of tardy jobs), S_n can be determined as follows:

1.  Order and index the jobs in N such that d_1 ≤ d_2 ≤ ... ≤ d_n (where ties are broken arbitrarily). Set S_0 = ∅ and k = 1.

2.  If k = n + 1, stop. S_n is an optimal set.

3.  If t(Σ_{i∈S_{k-1}} p_i + p_k) ≤ d_k, set S_k = S_{k-1} ∪ {k}; otherwise let p_r = max {p_j | j ∈ S_{k-1} ∪ {k}} and set S_k = S_{k-1} ∪ {k} - {r}.

4.       Set  k  =  k  +  1  and  return  to  step  2. 
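The generalized procedure is easy to express in code. The sketch below (our illustration, with hypothetical data) carries out the four steps with an arbitrary t(Q) function; here the unit-rate profile t(Q) = Q stands in for the variable-resource case.

```python
# Sketch of the generalized Moore procedure: jobs are taken in EDD order and, whenever
# the running set cannot all be on time, the job with the largest requirement is dropped.
def max_on_time_set(p, d, t_of):
    jobs = sorted(p, key=lambda j: d[j])        # step 1: EDD order
    S, work = [], 0.0
    for k in jobs:                              # steps 2-4
        S.append(k)
        work += p[k]
        if t_of(work) > d[k]:                   # step 3: job k would finish late
            r = max(S, key=lambda j: p[j])      # drop the largest requirement in S_{k-1} U {k}
            S.remove(r)
            work -= p[r]
    return S                                    # largest possible set of on-time jobs

# hypothetical data with the unit-rate profile t(Q) = Q
p = {1: 4, 2: 3, 3: 2, 4: 5}
d = {1: 4, 2: 5, 3: 7, 4: 9}
on_time = max_on_time_set(p, d, t_of=lambda Q: Q)
print(on_time, len(p) - len(on_time))   # the on-time jobs and the number of tardy jobs
```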

Constrained  (Secondary  Criterion)  Problems 

Several authors have addressed the problem of sequencing n jobs on one machine so as to optimize one criterion while restricting the set of sequences so that all or some jobs also satisfy another. We include four such problems here. In particular,

(a)  Minimize total (mean) flow time given that a subset E of the jobs are to be on time (Burns and Noble [2] and Emmons [4]), i.e.,

        min G = Σ C_i    s.t.  C_i ≤ d_i,  i ∈ E

(b)  Minimize maximum tardiness given that a subset E of the jobs are to be on time (Burns and Noble [2]), i.e.,

        min G = max T_i    s.t.  C_i ≤ d_i,  i ∈ E

(c)  Minimize mean flow time over all sequences which yield minimum maximum cost (Emmons [5] and Heck and Roberts [8]), i.e.,

        min G = Σ C_i    s.t.  g_i(C_i) ≤ G_m,  i ∈ N,

     where g_i(C) is a nondecreasing function of C and G_m = min {max_i g_i(C_i)}

(d)  Minimize the number of tardy jobs given that a subset E of the jobs is to be on time (Sidney [15]), i.e.,

        min G = Σ δ(T_i)    subject to  C_i ≤ d_i,  i ∈ E.



In all cases the algorithms originally developed for the single-machine problem can easily be adapted to the variable-resource problem.

The first three problems can be solved by a one-pass algorithm which sequences jobs one at a time from last to first. Suppose that jobs have been assigned to positions k + 1 through n. Let N_k be the set of jobs as yet unsequenced and L_k be the subset of N_k that can be assigned position k without violating the constraint. A job from L_k, say job j, is then chosen according to a certain rule and sequenced in position k. Then N_{k-1} = N_k - {j}, L_{k-1} is generated, and a job is sequenced in position k - 1, etc.

Letting E_k = N_k ∩ E and p(N_k) = Σ_{i∈N_k} p_i, then for problems (a) and (b)

    L_k = (N_k - E_k) ∪ {j | j ∈ E_k:  d_j ≥ t(p(N_k))},

while for problem (c)

    L_k = {j | j ∈ N_k:  g_j(t(p(N_k))) ≤ G_m}.

The rule for choosing the job for position k in problems (a) and (c) is to choose j such that

    p_j = max_{i∈L_k} p_i,

while for problem (b), j is chosen such that

    d_j = max_{i∈L_k} d_i.

Problem (d) may be solved by modifying the due-dates to reflect the fact that if d_i ≥ d_k, k ∈ E, and job i is to be on time in a feasible sequence, then i must be completed by t(R(d_k) - p_k). Then Moore's algorithm can be applied, with an adjustment to assure that jobs in E will be on time. This is essentially the procedure developed by Sidney [15].

Nonsimultaneous  Arrivals 

In the preceding sections all jobs are assumed to be available for sequencing at time zero. We now consider problems in which job j is not available for processing until the beginning of period r_j, where r_j ≥ 1. If, in this situation, it is possible to interrupt the processing of a job and resume it later without loss of progress toward completion of the job, we say that the system operates in a "preempt-resume" mode.

For single-machine problems with the criteria maximum tardiness (G = max T_j) or total (mean) completion time (G = Σ C_j), when preempt-resume applies, the static optimizing rules EDD and SPT can be generalized in a straightforward manner to produce optimal sequences when all jobs are not simultaneously available ([1], p. 82). The same generalizations apply when resource availability varies with time, using the following procedure:

1.  At  time  zero  if  one  or  more  jobs  are  available  assign  the  resource  to  process  the 
available  job  with  the  smallest  (most  urgent)  priority.  Otherwise  leave  the 
resource  idle  until  the  first  job  is  available. 

2.  At  each  job  arrival,  compare  the  priority  of  the  newly  available  job  j  with  the 
priority  of  the  job  currently  being  processed.  If  the  priority  of  job  j  is  less,  allow 
job  j  to  preempt  the  job  being  processed;  otherwise  add  job  j  to  the  list  of  avail- 
able jobs. 


508 


KR.  BAKER  AND  H.L.W   NUTTLE 


3.       At  each  job  completion,  examine  the  set  of  available  jobs  and  assign  the  resource 
to  process  the  one  with  the  smallest  priority. 

In  order  to  minimize  maximum  tardiness,  the  priority  of  a  job  is  taken  to  be  its  due-date,  and 
to  minimize  mean  flowtime  the  priority  is  its  remaining  resource  requirement. 
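A discrete-period sketch of this dispatching procedure is given below (added for illustration, not taken from the paper; the function names and the period-by-period treatment of preemption are assumptions).  Each period, r(t) units of resource are applied to the most urgent available job, so a newly arrived job with a smaller priority value automatically preempts the job in process.

    # Sketch of preempt-resume dispatching with a time-varying resource.
    # jobs[j] = (release_period, requirement); priority(j, remaining) returns the
    # urgency value (smaller = more urgent): the due-date for EDD, or the remaining
    # resource requirement for SPT.
    def preempt_resume(jobs, r, priority):
        remaining = {j: p for j, (rel, p) in jobs.items()}
        completion = {}
        t = 0
        while remaining:
            t += 1                                      # work is done during period t
            avail = [j for j in remaining if jobs[j][0] <= t]
            if not avail:
                continue                                # resource idles until an arrival
            j = min(avail, key=lambda i: priority(i, remaining[i]))
            remaining[j] -= r(t)                        # apply this period's resource
            if remaining[j] <= 0:
                completion[j] = t
                del remaining[j]
        return completion

    # EDD example (hypothetical data): job 1 arrives in period 1, job 2 in period 3.
    due = {1: 4, 2: 6}
    print(preempt_resume({1: (1, 5), 2: (3, 2)}, r=lambda t: 1,
                         priority=lambda j, rem: due[j]))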

3.   MINIMIZING  THE  SUM  OF  WEIGHTED  COMPLETION  TIMES 

One case for which the single-machine result does not generalize in a straightforward manner to the corresponding variable-resource problem is the case of sequencing to minimize the sum of weighted completion times, where

G = \sum_j w_j C_j

when all jobs are available at time zero.

Sequencing jobs in nondecreasing order of the ratio $p_j/w_j$, which will always minimize G in the single-machine problem, need not yield an optimal sequence when the resource availability varies with time.  The following simple example demonstrates this fact.

EXAMPLE 


j          1     2     3
p_j        7     3     6
w_j        5     2     4
p_j/w_j    1.4   1.5   1.5

r(t) = 1, \quad 0 \le t < 4
r(t) = 4, \quad 4 \le t \le 7.

Sequencing the jobs in nondecreasing order of $p_j/w_j$ yields the order 1-2-3, for which the completion times are 4.75, 5.5 and 7.  Therefore, G = 62.75.  For the sequence 2-1-3 the completion times are 3, 5.5 and 7, with G = 61.5.  (Under the discrete time framework G = 65 for 1-2-3 but G = 64 for 2-1-3.)
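The two G-values can be checked directly.  The short routine below is added for illustration (the function name and the continuous-time treatment of the resource profile are assumptions, not part of the paper); it accumulates the piecewise-constant profile r(t) and evaluates the weighted sum of completion times for both sequences.

    # Completion times of a job sequence under a piecewise-constant resource rate,
    # then the weighted sum of completion times G.
    def completions(seq, p, pieces):
        # pieces: list of (start, end, rate) intervals covering the horizon
        done, times = 0.0, {}
        for j in seq:
            done += p[j]                      # cumulative resource needed through job j
            used, t = 0.0, 0.0
            for a, b, rate in pieces:
                if used + rate * (b - a) >= done:
                    t = a + (done - used) / rate
                    break
                used += rate * (b - a)
                t = b
            times[j] = t
        return times

    p = {1: 7, 2: 3, 3: 6}
    w = {1: 5, 2: 2, 3: 4}
    profile = [(0, 4, 1), (4, 7, 4)]
    for seq in ([1, 2, 3], [2, 1, 3]):
        C = completions(seq, p, profile)
        print(seq, C, sum(w[j] * C[j] for j in seq))
    # 1-2-3 gives C = 4.75, 5.5, 7 and G = 62.75; 2-1-3 gives C = 3, 5.5, 7 and G = 61.5.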


While the differences in G-values may seem almost insignificant, it is possible to construct an example in which sequencing by increasing ratios $p_j/w_j$ will yield an arbitrarily bad solution.  Consider the data for a two-job problem in which

j           1              2
p_j         10^m           5 \times 10^{2m}
w_j         1              10^m
p_j/w_j     10^m           5 \times 10^m

r(t) = 5 \times 10^{2m}, \quad 0 \le t < 1
r(t) = 1, \quad 1 \le t \le 1 + 10^m.

Letting S represent the sequence 1-2 ($p_1/w_1 < p_2/w_2$) and S' the sequence 2-1, for large m we have

\frac{G(S)}{G(S')} \approx \frac{1}{2}\, 10^m.


For the special case in which the processing times and weights are agreeable ($p_i \le p_j$ whenever $w_i \ge w_j$), sequencing by nondecreasing ratios of $p_j/w_j$ does produce an optimal solution (see Theorem 4).  Otherwise the two examples given in this section reinforce the notion that the single-machine result cannot be extended to even the simplest versions of the variable-resource model.  In one example the resource profile r(t) is nondecreasing, while in the other example r(t) is nonincreasing.  In both cases there is only one change in r(t).  These situations would appear to be among the least drastic ways of relaxing the constant resource assumption; but, as we have demonstrated, the ratio rule still fails.  At this point, we can conclude only that the minimization of $\sum w_j C_j$ involves more than a simple extension of the single-machine result.  Obviously, any optimal ordering rule (if one exists) would have to involve information about the resource profile as well as information about processing requirements and weights.  We conjecture that this problem is NP-complete.

4.   COMMENTS 

Although  it  is  not  possible  to  extend  all  single-machine  results  directly  to  the  variable- 
resource  case,  a  few  observations  can  be  made.  A  look  at  Figure  1  indicates  that  the  graph  of 
R  (t)  transforms  processing  times  (on  the  horizontal  axis)  into  resource  consumptions  (on  the 
vertical  axis),  and  vice-versa.  This  transformation  is  at  least  order-preserving.  In  particular, 
the  makespan  for  a  set  A  of  jobs  is  at  least  as  large  as  the  makespan  for  set  B  when  the  jobs  in 
A  have  a  total  processing  requirement  that  equals  or  exceeds  the  requirement  of  the  jobs  in  B. 
This  property  is  fundamental  to  the  proof  of  many  single-machine  results  as  they  carry  over  to 
variable-resource  models.  Moreover,  the  results  for  problems  in  which  pj  =  p  do  not  rely  on 
the  precise  nature  of  the  transformation,  but  depend  only  on  the  fact  that  all  solutions  share  a 
common  nondecreasing  sequence  of  completion  times. 

In the single-machine case, R(t) is linear, implying that the mapping of resource consumptions into processing times is proportionality-preserving as well as order-preserving.  That is, ratios of intervals on the resource axis convert to identical ratios on the time axis.  This property is not maintained in the variable-resource model, because the transformation distorts proportionality.  In particular, we have in the single-machine problem that $p_i/p_j < w_i/w_j$ implies $\Delta C_j / \Delta C_i < w_i/w_j$, where $\Delta C_i$ and $\Delta C_j$ denote the magnitudes of the changes in the completion times of adjacent jobs i and j which are interchanged in sequence.  This implication does not hold in the variable-resource problem, so the pairwise interchange argument may fail.

These  observations  lead  to  the  conclusion  that  single-machine  results  involving  minimum 
weighted  sum  of  completion  times  cannot  be  directly  extended.  An  open  question  is  therefore 
how  to  exploit  the  structure  of  this  problem  in  the  variable-resource  case  in  order  to  find 
optimal  solutions. 


REFERENCES 

[1] Baker, K.R., Introduction to Sequencing and Scheduling, Wiley (1974).
[2] Burns, R.N. and K.J. Noble, "Single Machine Sequencing with a Subset of Jobs Completed on Time," Working Paper, University of Waterloo, Canada (1975).
[3] Emmons, H., "One Machine Sequencing to Minimize Certain Functions of Job Tardiness," Operations Research, 17, 701-715 (1969).
[4] Emmons, H., "One Machine Sequencing to Minimize Mean Flow Time with Minimum Number Tardy," Naval Research Logistics Quarterly, 22, 585-592 (1975).
[5] Emmons, H., "A Note on a Scheduling Problem with Dual Criteria," Naval Research Logistics Quarterly, 22, 615-616 (1975).

[6] Gelders, L. and P. Kleindorfer, "Coordinating Aggregate and Detailed Scheduling in the One-Machine Job Shop: Part I," Operations Research, 22, 46-60 (1974).
[7] Gelders, L. and P. Kleindorfer, "Coordinating Aggregate and Detailed Scheduling in the One-Machine Job Shop: Part II," Operations Research, 23, 312-324 (1975).
[8] Heck, H. and S. Roberts, "A Note on the Extension of a Result on Scheduling with a Secondary Criteria," Naval Research Logistics Quarterly, 19, 403-405 (1972).
[9] Lawler, E.L., "Optimal Sequencing of a Single Machine Subject to Precedence Constraints," Management Science, 19, 544-546 (1973).
[10] Lawler, E.L., "On Scheduling Problems with Deferral Costs," Management Science, 11, 280-288 (1964).
[11] Lawler, E.L., "A Pseudopolynomial Algorithm for Sequencing Jobs to Minimize Total Tardiness," Annals of Discrete Mathematics, 1, 331-342 (1977).
[12] Lawler, E.L. and J.M. Moore, "A Functional Equation and Its Application to Resource Allocation and Sequencing Problems," Management Science, 16, 77-84 (1969).
[13] Moore, J.M., "An n Job, One Machine Sequencing Algorithm for Minimizing the Number of Late Jobs," Management Science, 15, 102-109 (1968).
[14] Schrage, L.E. and K.R. Baker, "Dynamic Programming Solution of Sequencing Problems with Precedence Constraints," Operations Research, 26, 444-449 (1978).
[15] Sidney, J.B., "An Extension of Moore's Due Date Algorithm," Symposium on the Theory of Scheduling and Its Applications (S.E. Elmaghraby, editor), Lecture Notes in Economics and Mathematical Systems 86, Springer-Verlag, Berlin, 393-398 (1973).


EVALUATION  OF  FORCE  STRUCTURES  UNDER  UNCERTAINTY 

Charles  R.  Johnson 

Department  of  Economics  and  Institute  for 

Physical  Science  and  Technology 

University  of  Maryland 

College  Park,  Maryland 

Edward  P.  Loane 

EPL  Analysis 
Olney,  Maryland 

ABSTRACT 

A  model,  for  assessing  the  effectiveness  of  alternative  force  structures  in  an 
uncertain  future  conflict,  is  presented  and  exemplified.  The  methodology  is  ap- 
propriate to  forces  (e.g.,  the  attack  submarine  force)  where  alternative  unit 
types  may  be  employed,  albeit  at  differing  effectiveness,  in  the  same  set  of  mis- 
sions. Procurement  trade-offs,  and  in  particular  the  desirability  of  special  pur- 
pose units  in  place  of  some  (presumably  more  expensive)  general  purpose 
units,  can  be  addressed  by  this  model.  Example  calculations  indicate  an  in- 
crease in  the  effectiveness  of  a  force  composed  of  general  purpose  units,  rela- 
tive to  various  mixed  forces,  with  increase  in  the  uncertainty  regarding  future 
conflicts. 

INTRODUCTION 

In  planning  the  procurement  of  major  weapons  systems  (submarines,  aircraft,  ships,  etc.), 
an  argument,  based  upon  relative  cost-effectiveness  in  certain  uses,  may  be  made  for  the 
development  and  purchase  of  some  items  which  are  less  versatile  and  effective  than  the  "best" 
available  components  of  an  overall  force.  Assuming  all  relative  costs  and  effectivenesses 
known,  such  an  argument  is  sound  at  least  to  the  extent  that  the  uses  necessitated  by  a  poten- 
tial conflict  are  anticipated.  However,  under  uncertainty  about  the  nature  of  potential  conflicts, 
a  question,  in  general  more  subtle,  is  raised  regarding  the  optimal  composition  of  forces.  In 
this  case,  a  model  is  developed  here  to  analyze  the  utility  of  "mixed"  force  structures,  and 
examples  are  given  to  support  the  intuitive  notion  that  the  less  specific  are  the  presumptions 
about  needs  in  a  future  conflict,  the  more  valuable  are  the  most  versatile  forces. 

Our  focus  here  is  upon  presenting  a  model  able  to  capture  the  value,  under  uncertainty, 
of  versatile  forces  and  not  upon  the  equally  important  problem  of  determination  of  cost  and 
effectiveness  parameters.  The  latter,  as  well  as  the  mixture  versus  force  level  interaction,  are 
touched  upon  tangentially  in  an  example.  The  parameter  estimation  problem,  in  general, 
requires  both  large  scale  theoretical  and  empirical  effort  and  has  been  addressed,  in  the  subma- 
rine case,  in  Reference  [1]. 



By  general  purpose  forces  we  shall  mean  the  most  versatile,  advanced  or  effective  com- 
ponents which  technology  would  currently  allow  in  building  a  military  force  structure.  Special 
purpose  forces,  on  the  other  hand,  might  be  competitive  in  effectiveness  with  general  purpose, 
but  only  in  some  of  the  uses  (which  we  shall  call  missions)  which  possible  conflicts  might 
require.  Naturally,  we  presume  that  the  general  purpose  are  more  expensive  than  the  special 
purpose  forces  per  item,  and  further  that  the  special  purpose  forces  are  cost  effective,  in  some 
missions.  It  is  assumed  also  that  all  costs  are  accounted  for,  e.g.,  development,  production, 
maintenance,  operation,  repair  and  logistical  mobility,  etc. 

Examples  of  general  versus  special  purpose  forces  include  the  following.  In  the  case  of 
submarine  forces,  the  general  purpose  would  be  the  newest  fully  equipped  nuclear  submarine 
while  a  special  purpose  alternative  would  be  the  conventional  diesel  submarine  found  in  many 
European  navies.  The  former  is  presumed  at  least  as  effective  in  all  missions  (much  more  so  in 
some)  while  the  latter  is  much  less  expensive  and  nearly  as  effective  in  some  missions  requiring 
only  low  mobility.  In  the  case  of  aircraft,  a  long-range  fighter-bomber  might  be  considered  gen- 
eral purpose  and  a  plane  designed  primarily  for  ground  attack  would  be  special  purpose. 

The  force  planner  must  procure  some  mixture  of  forces,  constrained,  presumably,  by  a 
fixed  budget.  In  general  there  may  be  several  force  types,  ranging  from  the  very  general  to  the 
very  special  purpose,  and  we  may  think  of  the  force  structure  as  being  a  vector  of  inventories 
of  each  type  purchased.  We  think  of  a  conflict  as  simply  a  collection  of  mission  opportunities, 
and  the  planner's  problem  is  then  to  procure  that  force  structure  which  permits  the  most 
effective  deployment  for  a  conflict.  For  a  specified  conflict,  this  poses  a  deterministic  optimiza- 
tion problem  which,  if  the  conflict  includes  enough  important  mission  opportunities  in  which 
the  special  purpose  forces  are  cost  effective,  will  surely  suggest  a  mixed  force  structure  includ- 
ing at  least  some  special  purpose  units. 

However,  procurement  of  weapons  systems  must  generally  be  decided  upon  long  in 
advance  of  potential  conflicts.  For  a  variety  of  additional  reasons,  there  will  likely  be  consider- 
able uncertainty  as  to  the  precise  nature  of  an  actual  conflict.  We  consider  this  uncertainty  to 
be  characterized  by  a  (known)  distribution  of  potential  conflicts,  i.e.,  a  distribution  of  mission 
opportunities.  We  note  that  there  are  other  ways  in  which  uncertainty  might  be  treated.  For 
example,  if  one's  own  force  structure  is  known,  a  hostile  adversary  might  be  expected,  to  the 
extent  that  circumstances  allow,  to  bias  a  conflict  in  a  direction  which  would  render  one's  own 
force  least  effective.  This  suggests  a  game  theoretic  approach.  Although  it  is  not  pursued 
further  here  and  although  its  information  requirements  might  be  great,  this  would  naturally  fit 
into  the  model  context  we  outline  below.  It  seems  likely  that  such  a  treatment  would  value  the 
versatility  of  general  purpose  forces  more  so  than  the  one  we  pursue.  Another  alternative 
would  be  to  treat  the  effectiveness  of  each  unit  type  as  unknown  and  characterize  it  by  a  proba- 
bility distribution. 

The  planner's  problem  which  we  address  is  then  to  choose  that  affordable  mixture  of 
forces  which,  assuming  optimal  deployment  in  any  conflict,  yields  the  largest  expected 
effectiveness  in  the  uncertain  conflict.  It  should  be  noted  that,  as  stated,  there  is  an  implicit 
assumption  that  the  planner  is  willing  to  take  the  risk  that  the  solution  mixture  will  produce 
unusually  low  effectiveness  in  some  conflicts.  (This  is  in  contrast  with  the  game  theoretic 
approach mentioned above.)  However, to the extent that the planner is risk-averse rather
than  risk-neutral,  other  criteria  may  be  substituted  for  "expected  effectiveness"  without  concep- 
tual difficulty  and  probably  without  operational  difficulty  in  the  development  below.  It  should 
also  be  mentioned  that  a  measure  of  the  value  of  the  versatility  of  general  purpose  forces  under 
uncertainty  lies  in  comparing  the  solution  mixture  of  the  above  problem  to  the  optimal  mixture 
when  the  expected  conflict  is  assumed  known  (i.e.,  the  case  of  certainty).  In  general  the 
"expected  effectiveness"  solution  will  differ  from  the  "expected  conflict"  solution. 



MODEL  DESCRIPTION 

We imagine n force types $T_j$, $j = 1, \ldots, n$, and m different mission categories $U_i$, $i = 1, \ldots, m$, in which a component of the force might be engaged.  Each $T_j$ is more or less effective in a given $U_i$ which, to the extent that total effectiveness is linear in the deployment of force types to mission categories, suggests the definition of an m-by-n unit effectiveness matrix

E = (e_{ij}),

in which $e_{ij}$ indicates the effectiveness of a unit of $T_j$ employed in $U_i$ for a unit of conflict (presuming opportunities available).  We denote by a 1-by-n vector s a particular force composition in which $s_j$ is the number of $T_j$ available.  At the time of a conflict, s is fixed and, therefore, provides a constraint on the total effectiveness attainable.  A particular conflict is characterized by the total opportunity for effectiveness which may be obtained from each mission category.  These bounds are summarized in an m-by-1 vector b in which $b_i$ is the maximum opportunity in $U_i$.  This bound is expressed in effectiveness units rather than force units because the "opportunities" are opportunities to damage the opponent and the force types vary in their ability to do so in a given mission.

The m-by-n matrix $A = (a_{ij})$ summarizes the allocation (or deployment) of $T_j$ to $U_i$, i.e., $a_{ij}$ is the amount of force type $T_j$ allocated to mission category $U_i$ during a conflict.  The $a_{ij}$ are necessarily nonnegative but we do not assume them integral because of the possibility of switching units among missions.

The  problem  of  waging  a  given  conflict  is  then  to  deploy  the  given  force  so  as  to  maxim- 
ize total  effectiveness  within  the  constraint  of  the  opportunities  the  conflict  presents.  In  general 
(no  linearity  assumption),  total  effectiveness  is  some  function 

e = e(A)

of the allocation, and, furthermore,

e(A) = e_1(A) + \cdots + e_m(A),

where $e_i(A)$ is the effectiveness A yields through the ith mission category.  This means that waging the known conflict b amounts to the optimization problem:

maximize   e(A)

subject to   \sum_{i=1}^{m} a_{ij} \le s_j, \quad j = 1, \ldots, n

             e_i(A) \le b_i, \quad i = 1, \ldots, m

             a_{ij} \ge 0.

In case total effectiveness is linear in A, we have the linear programming problem:

maximize   \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} e_{ij}

subject to   \sum_{i=1}^{m} a_{ij} \le s_j, \quad j = 1, \ldots, n

             \sum_{j=1}^{n} a_{ij} e_{ij} \le b_i, \quad i = 1, \ldots, m

             a_{ij} \ge 0.



In either case we denote the maximum achieved by M(s,b).  Then, equicost force compositions s may be compared, for a given conflict, by comparing the M(s,b).  A good general reference for relevant concepts in the linear case is Reference [2].
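In the linear case M(s,b) can be obtained from any linear programming routine.  The sketch below is an illustration added here (the use of scipy and the sign conventions are assumptions, not part of the paper); it maximizes the total effectiveness subject to the force-level and opportunity constraints given above.

    # Sketch of M(s, b) for the linear-effectiveness case, using scipy's LP solver.
    import numpy as np
    from scipy.optimize import linprog

    def M(E, s, b):
        # E: m-by-n unit-effectiveness matrix; s: length-n force vector; b: length-m opportunities.
        m, n = E.shape
        c = -E.flatten()                      # maximize sum a_ij e_ij  =  minimize its negative
        A_ub, b_ub = [], []
        for j in range(n):                    # force constraints: sum_i a_ij <= s_j
            row = np.zeros(m * n)
            row[[i * n + j for i in range(m)]] = 1.0
            A_ub.append(row); b_ub.append(s[j])
        for i in range(m):                    # opportunity constraints: sum_j a_ij e_ij <= b_i
            row = np.zeros(m * n)
            row[i * n:(i + 1) * n] = E[i]
            A_ub.append(row); b_ub.append(b[i])
        res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub), bounds=(0, None))
        return -res.fun

    # With the matrix of Example 1 below, s = (6, 3, 3) and b = (4, 4, 4) give M about 10.2.
    E1 = np.array([[1, .7, .1], [1, .1, .1], [1, .1, .7]])
    print(M(E1, [6, 3, 3], [4, 4, 4]))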

Uncertainty as to the nature of the conflict is characterized by a probability distribution for b.  For a given s, there is an M(s,b) for each possible value of b.  These may then be averaged according to the distribution of b to obtain the expected value:

M(s) = E_b[M(s,b)].

Comparisons among force compositions may then be made by comparing the M(s), and the planner's problem is to

maximize  M(s)

subject to his budget constraint governing the possible forces s which may be purchased.  In general,

\max_s E_b[M(s,b)] \ne \max_s M(s, E_b(b)),

and in the case that effectiveness is linear in A,

\max_s E_b[M(s,b)] \le \max_s M(s, E_b(b)).

Thus, the maximum expected effectiveness problem has a different solution from the problem of maximum effectiveness in an expected conflict, so that uncertainty makes a difference in planning.  We present examples which illustrate this, and in which the latter favors special purpose forces while the former favors general purpose, presumably because of their greater ability to defend against variation (uncertainty).  The suggestion is that the more uncertainty there is, the greater the value of general purpose forces.
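The inequality for the linear case can be seen as a Jensen-type bound; the short derivation below is added here for completeness and is not part of the original text.

For fixed $s$, the optimal value $M(s,\cdot)$ of the linear program is a concave function of the right-hand side $b$.  Hence, by Jensen's inequality,

E_b[M(s,b)] \le M\big(s, E_b(b)\big) \quad \text{for every } s,

and taking the maximum over the budget-feasible $s$ on both sides gives

\max_s E_b[M(s,b)] \le \max_s M\big(s, E_b(b)\big).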

EXAMPLES 

We  conclude  by  giving  two  examples.  The  first  is  primarily  to  illustrate  the  evaluation 
model  and  some  of  the  remarks  made.  The  second  includes  a  more  thorough  examination  of 
the  model  and  its  assumptions  in  a  detailed  example  intended  to  be  suggestive  of  a  realistic  case 
which  motivated  this  study. 

EXAMPLE 1:  Here we imagine three force types.  Type $T_1$ is the general purpose, and $T_2$ and $T_3$ are different special purpose forces.  There are also three mission categories.  Type $T_2$ is cost effective relative to $T_1$ in mission $U_1$, while $T_3$ is cost effective relative to $T_1$ in $U_3$.  Total effectiveness is assumed linear in allocations and the unit effectiveness matrix is

          1    .7   .1
  E  =    1    .1   .1
          1    .1   .7

We consider seven equicost force compositions:

s^1 = (9, 0, 0)      s^5 = (5, 4, 4)
s^2 = (6, 3, 3)      s^6 = (5, 8, 0)
s^3 = (6, 6, 0)      s^7 = (5, 0, 8).
s^4 = (6, 0, 6)




Thus, the two special purpose forces cost half as much as the general purpose over the range of procurement considered.  (Actually, the outcome will not differ qualitatively if more alternatives based upon the 2-for-1 trade-off are considered.)


There are six possible conflicts,

b^1 = (0, 6, 6)'     b^2 = (6, 6, 0)'     b^3 = (6, 0, 6)'
b^4 = (12, 0, 0)'    b^5 = (0, 12, 0)'    b^6 = (0, 0, 12)',

with the first three presumed to have probability 2/9 each and the last three probability 1/9 each.  Thus, the expected conflict is

\bar{b} = (4, 4, 4)'.
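As a quick arithmetic check (added here for illustration, not part of the original), the probability-weighted average of the six conflict vectors reproduces this expected conflict:

    import numpy as np
    B = np.array([[0, 6, 6], [6, 6, 0], [6, 0, 6],
                  [12, 0, 0], [0, 12, 0], [0, 0, 12]], dtype=float)
    p = np.array([2, 2, 2, 1, 1, 1]) / 9.0
    print(B.T @ p)      # -> [4. 4. 4.], the expected conflict used in the calculations below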


Straightforward calculations then yield

M(s^1) = 9,
M(s^2) = M(s^3) = M(s^4) = 8.6, and
M(s^5) = M(s^6) = M(s^7) = 8.47,

so that

\max_{1 \le i \le 7} M(s^i) = 9

is achieved at s^1, the all general purpose force.  On the other hand,

M(s^1, \bar{b}) = 9

while

M(s^2, \bar{b}) = 10.2,  M(s^3, \bar{b}) = M(s^4, \bar{b}) = 10.1,
M(s^5, \bar{b}) = 10.6,  and  M(s^6, \bar{b}) = M(s^7, \bar{b}) = 9.3.

Thus, a mixed force s^5 is optimal for the expected conflict.  The conclusion, in this case, is that general purpose forces are overall more cost efficient under uncertainty.  It should be noted that in calculating the M(s^i), each other force had higher effectiveness than s^1 for some conflicts (but not overall) and all were better than s^1 in the expected conflict.  Thus, it is only the value of versatility under uncertainty which makes s^1 preferred.

EXAMPLE  2:  This  example  is  taken  from  the  problem  of  submarine  procurement  and 
again  illustrates  the  effect  of  uncertainty  on  the  attractiveness  of  special  purpose  forces. 

For  simplicity,  we  consider  only  two  types  of  forces,  general  purpose  and  special  purpose 
units.  In  this  setting,  the  distinction  between  new  procurement  general  purpose  or  special  pur- 
pose forces  might  well  be  that  between  nuclear  or  diesel-electric  propulsion.  Equipment  and 
weapons  could  be  identical,  but  the  lower  underwater  mobility  inherent  in  diesel-electric  pro- 
pulsion would  limit  effective  employment  of  such  forces  to  particular  ASW  missions.  In  the 
actual  planning  process,  the  existing  force  structure  must  also  be  considered  since  in  a  future 
conflict,  presently  existing  units  might  be  restricted  to  low  vulnerability  missions  (presumably 
being  less  capable  than  new  procurement  general  purpose  units)  and  thus  constitute  additional 
categories  of  special  purpose  forces. 

The  present  example  considers  four  missions  and  measures  unit  effectiveness  in  each  mis- 
sion by  a  kill  rate  defined  by: 




e_{ij}  =  [Kills (of enemy submarines) per unit time by one on-station U.S. submarine of type $T_j$ engaged in mission $U_i$] / [Number of surviving enemy submarines].

The  above  quantity  is  well  defined  for  important  submarine  missions,  being  independent  of 
enemy  force  size  and  the  number  of  U.S.  submarines  committed  to  U,  over  a  substantial  range 
of  values.  For  instance,  considering  a  fixed  barrier  mission,  the  rate  of  enemy  transits  through 
the  barrier  and  thus  the  rate  of  opportunities  for  kill  would  be  proportional  to  the  number  of 
surviving  units.  Also,  U.S.  submarine  probabilities  of  detection  and  kill  given  an  opportunity 
(here  target  passage  through  the  barrier  area  assigned  to  the  submarine)  are,  at  least  initially, 
inversely proportional to the width of the barrier area assigned.  In this circumstance, $e_{ij}$ is well defined.  Of course, nonlinear effects are present and become significant as the number of U.S. units is increased.  One could argue that, as returns diminish, no additional submarines should be assigned to the fixed barrier; this then determines the mission opportunities, $b_i$.  With units of differing capabilities, $b_i$ is properly stated in terms of effectiveness obtained, not in some fixed maximum number of units employed, since the onset of diminishing returns would occur at different force levels for different unit effectivenesses.  Finally, variations in $b_i$ (for the fixed barrier mission) might arise from uncertainties in enemy basing, at-sea replenishment of submarines, desirable barrier locations being untenable due to enemy ASW, etc.

Similar  arguments  apply  for  the  direct  support  mission  (submarines  employed  in  the 
defense  of  surface  formations)  and  similar  conclusions  are  obtained  in  the  area  search  mission. 

It should be noted that kill rates add, and that the summation

\sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} e_{ij},

being an overall rate at which enemy submarines are being killed, is a sensible measure of effectiveness for the entire U.S. submarine force.  It is even plausible that the differing submarine types would be assigned to missions so as to (approximately) maximize this sum.  Finally,
to the extent that variations in $b_i$ reflect week-to-week changes within a single conflict (i.e., one
week  large  numbers  of  forces  are  required  for  direct  support,  the  next  week  these  same  units 
are  used  in  a  barrier)  rather  than  uncertainty  as  to  some  long-term  mix  of  missions  that  will  be 
required  in  an  unspecified  conflict,  then  the  expected  value 

Eb(M(s,b)) 

can  be  interpreted  as  a  time-average  of  force  kill  rate  and  again  this  is  a  preeminently  sensible 
measure. 

It  is  the  authors'  belief  that  the  use  of  kill  rates  as  measures  of  unit  effectiveness  and  the 
linear  formulation  of  force  effectiveness,  while  necessarily  involving  some  approximation,  does 
capture  the  important  aspects  of  evaluating  alternative  submarine  force  structures.  Of  course, 
in  realistic  applications,  the  evaluation  of  effectiveness  for  alternative  forces  is  a  substantial 
effort.  Reference  [1]  documents  a  major  study  effort  which  arrives  at  such  estimates,  although 
not  expressed  as  kill  rates.  Evaluation  of  force  effectiveness  is  not  addressed  here.  Quantita- 
tive inputs  to  this  second  example,  shown  in  the  following  tabulation,  are  completely  hypotheti- 
cal; and,  while  of  reasonable  relative  magnitudes,  are  chosen  to  illustrate  the  theses  of  this 
paper. 




TABLE 1.  Unit Effectiveness, e_{ij} (Kill rates)

               General Purpose    Special Purpose    Expected Total Opportunity
               Submarines         Submarines         for Effectiveness, E_b(b_i)
Mission 1          1.0                .95                     16
Mission 2          1.50               .50                     16
Mission 3          .75                .375                    12
Mission 4          .40                .20                  Unlimited

TABLE 2.  Alternative Force Compositions, (s_1, s_2) (Numbers of Units On-station)

     General Purpose      Special Purpose
       Submarines           Submarines
           35                    0
           25                   10
           20                   17
           15                   24

Unit effectivenesses and force compositions are stated in terms of on-station submarines; actual numbers of operational units would be higher than, and not necessarily in proportion to, the numbers shown.  The alternative forces shown might well be equal cost options if there were some fixed cost associated with deploying any special purpose submarines.  The fourth mission is not limited in the number of forces which can be employed or the total effectiveness which can be obtained.  This might be thought of as undirected open-ocean search, which could always be undertaken by any submarine not otherwise assigned.


The distributions of b, reflecting uncertainty, are represented by lists of 60 sample vectors, each considered equally likely.  The lists are not repeated here.  Sample vectors were generated by Monte-Carlo methods, assuming each $b_i$ is an independent truncated* Gaussian random variable with the above stated mean and relative standard deviations of 35% and 60% in the two cases considered.  Effectiveness for the alternative force compositions is shown in Table 3, following.
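A sketch of how such a sample-based evaluation might be organized is given below (added for illustration; the truncation rule, the sample size, and the function names are assumptions patterned on the description above, and the LP step mirrors the formulation given earlier).  The "Unlimited" mission can be represented by a suitably large opportunity value.

    # Monte Carlo estimate of M(s) = E_b[M(s,b)] with independent, symmetrically
    # truncated Gaussian b_i (illustrative, not the authors' code).
    import numpy as np
    from scipy.linalg import block_diag
    from scipy.optimize import linprog

    def M(E, s, b):
        m, n = E.shape
        A = np.vstack([np.tile(np.eye(n), m), block_diag(*E)])   # force rows, then opportunity rows
        rhs = np.concatenate([s, b])
        return -linprog(-E.flatten(), A_ub=A, b_ub=rhs, bounds=(0, None)).fun

    def expected_M(E, s, mean_b, rel_sd, n_samples=60, seed=0):
        rng = np.random.default_rng(seed)
        vals = []
        for _ in range(n_samples):
            b = rng.normal(mean_b, rel_sd * mean_b)
            # assumed symmetric truncation: redraw outside [0, 2*mean], preserving the mean
            while np.any((b < 0) | (b > 2 * mean_b)):
                bad = (b < 0) | (b > 2 * mean_b)
                b[bad] = rng.normal(mean_b[bad], rel_sd * mean_b[bad])
            vals.append(M(E, s, b))
        return np.mean(vals)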

The maximal effectiveness for each level of uncertainty is enclosed in dashes.  Not surprisingly, the example values show a change in preference, from a mixed force to an all general purpose force, as variability in mission opportunities increases.  What is surprising is that the changes and differences are so small overall.  This can be explained qualitatively, and is a reflection of a real concern in procurement decisions.


*Both high and low values were discarded so as to preserve the mean value and assure that $b_i \ge 0$.




TABLE 3.

                                   Force Effectiveness, M(s)
Force Compositions, s    No Uncertainty     Relative Standard     Relative Standard
(s_1, s_2)               (mean value        Deviation of          Deviation of
                         b used)            each b_i = 35%        each b_i = 60%
(35,  0)                    38.2                 37.4                 -36.5-
(25, 10)                    37.9                 36.9                  35.8
(20, 17)                   -39.1-               -37.6-                 36.2
(15, 24)                    37.9                 37.1                  36.0

In the present example, the attractiveness of special purpose units rests on the availability of opportunities in mission 1; i.e., if $b_1 \ge 11.4$ then forces including some special purpose units are preferred to an all general purpose force.  But mission 1 is a substantial part (36%) of the projected employment of submarines; if this were taken away, then the force is over-built and any alternative composition is able to exploit the remaining attractive opportunities.  That is, if $b_1 = 0$ then all force compositions entertained give about the same effectiveness; and as noted above, if $b_1 \ge 11.4$, compositions involving special purpose units are preferred.  In this circumstance, i.e., with the numeric inputs to this example calculation, one cannot expect to see dramatic changes in preferences among force compositions, with explicit consideration of uncertainty.


As  a  final  point,  we  note  the  suboptimality  of  separating  questions  of  force  composition 
from  questions  of  force  levels.  Although  this  raises  an  issue  worthy  of  further  study,  we  only 
mention  the  issue  here  by  extending  the  previous  example.  Using  exactly  the  same  unit 
effectiveness  and  mission  opportunity  values  stated  previously,  but  considering  alternative  force 
compositions  which  involve  an  additional  5  general  purpose  submarines,  one  obtains  the  follow- 
ing results: 


TABLE 4.

                                   Force Effectiveness, M(s)
Force Compositions, s    No Uncertainty     Relative Standard     Relative Standard
(s_1, s_2)               (mean value        Deviation of          Deviation of
                         b used)            each b_i = 35%        each b_i = 60%
(40,  0)                    42.0                 40.7                  39.6
(30, 10)                    41.6                 40.3                  39.0
(25, 17)                    42.8                 41.0                  39.7
(20, 24)                    41.7                 40.7                  39.6

In  this  case,  the  uncertainty  considered  does  not  lead  to  a  preference  for  an  all  general  purpose 
force,  although  again  the  effects  are  very  small.  The  tendency  here  is  intuitively  satisfying,  i.e., 
special  purpose  units  become  more  attractive  as  overall  force  levels  are  increased,  relative  to  a 
fixed  job  to  be  done.  Notice  also  that  increased  uncertainty  decreases  the  incremental 
effectiveness  of  the  additional  five  general  purpose  units,  in  every  case. 



REFERENCES 

[1] Chief of Naval Operations, Future Submarine Employment Study (U), (29 December 1972)-SECRET.
[2] Dantzig, G.B., Linear Programming and Extensions, Princeton University Press (1963).


A  NOTE 

ON  THE  OPTIMAL  REPLACEMENT 

TIME  OF  DAMAGED  DEVICES 

Dror  Zuckerman 

The  Hebrew  University  of  Jerusalem 
Israel 

ABSTRACT 

Abdel Hameed and Shimi [1] in a recent paper considered a shock model with additive damage.  This note generalizes the work of Abdel Hameed and Shimi by showing that the a priori restriction to replacement at a shock time made in [1] is unnecessary.

1.  INTRODUCTION 

A recent paper by Abdel Hameed and Shimi [1] was concerned with determining the optimal replacement time for a breakdown model under the following assumptions:  A device is subject to a sequence of shocks occurring randomly according to a Poisson process with parameter $\lambda$.  Each shock causes a random amount of damage and these damages accumulate additively.  The successive shock magnitudes $Y_1, Y_2, \ldots$, are positive, independent, identically distributed random variables having a known distribution function $F(\cdot)$.  A breakdown can occur only at the occurrence of a shock.  Let $\delta$ denote the failure time of the device.  For $t < \delta$ let $X(t)$ be the accumulated damage over the time duration $[0, t]$.  The device fails when the accumulated damage $X(t)$ first exceeds Z.  That is,

(1)    \delta = \inf\{t \ge 0;\ X(t) \ge Z\},

where Z is a random variable, independent of the accumulated damage process X, having a known distribution function $G(\cdot)$ called the killing distribution.  More explicitly, if $X(t) = x$ and a shock of magnitude y occurs at time t, then the device fails with probability

(2)    \frac{G(x+y) - G(x)}{1 - G(x)}.

Upon  failure  the  device  is  immediately  replaced  by  a  new  identical  one  with  a  cost  of  c.  When 
the  device  is  replaced  before  failure,  a  smaller  replacement  cost  is  incurred.  That  cost  depends 
on  the  accumulated  damage  at  the  time  of  replacement  and  is  denoted  by  c(x).  That  is  to  say 
c(x)  is  the  cost  of  replacement  before  failure  when  the  accumulated  damage  equals  x.  It  is 
assumed  that  c(0)  =  0  and  c(x)  is  bounded  above  by  c.  Thus  there  is  an  incentive  to  attempt 
to  replace  the  device  before  failure.  The  condition  c(0)  =0  has  to  be  interpreted  as  a  policy  of 
no  replacement  if  there  is  no  damage. 

In their paper Abdel Hameed and Shimi [1] derived an optimal replacement policy that minimizes the expected cost per unit time under the restriction that the device can be replaced only at shock points of time.



In  the  present  article  we  consider  a  similar  breakdown  model  without  the  above  restriction 
made  in  [1].  We  allow  a  controller  to  institute  a  replacement  at  any  stopping  time  before  failure 
time.  He  must  replace  upon  device  failure.  Throughout,  we  restrict  attention  to  replacement 
policies  for  which  cost  of  replacement  is  solely  a  function  of  the  accumulated  damage.  In  some 
shock  models,  replacement  at  a  scheduled  time  offers  potential  benefits  relative  to  replacement 
at  a  random  time.  However,  the  problem  of  scheduled  replacement  in  failure  models  with  addi- 
tive damage  is  an  open  problem  and  it  is  beyond  the  scope  of  the  present  study. 

Let T be the replacement time.  At time T the device is replaced by a new one having statistical properties identical with the original, and the replacement cycles are repeated indefinitely.  The collection of all permissible replacement policies described above will be denoted by M.  Our objective is to prove that an optimal policy replaces the system at a shock point of time.  Thus the restriction about the class of permissible replacement policies made in [1] can be omitted.

The following will be standard notation used throughout the paper:  $E[Y; A]$, where Y is a random variable and A is an event, refers to the expectation $E[1_A Y] = E[Y \mid 1_A = 1] P(A)$, where $1_A$ is the set characteristic function of A.

2.  THE  OPTIMAL  POLICY 

By applying a standard renewal argument, the long run average cost per unit time when a replacement policy T is employed can be expressed as follows:

(3)    \psi_T = \frac{E[c(X(T));\ T < \delta] + E[c;\ T = \delta]}{E[T]}.

Let $\psi^* = \inf_{T \in M} \psi_T$.  Clearly

\psi^* \le \frac{E[c(X(T));\ T < \delta] + E[c;\ T = \delta]}{E[T]}

for every $T \in M$, and the optimal replacement policy that minimizes $\psi_T$ over the set M is the one that maximizes

(4)    \theta_T = \psi^* E[T] + E[c - c(X(T));\ T < \delta].

By applying Dynkin's formula (see Theorem 5.1 and its Corollary in Dynkin [2]), equation (4) reduces to

(5)    \theta_T = E\left[ \int_0^T J(X(s))\, ds \right] + c,

where

(6)    J(x) = \psi^* - \lambda c \int_0^\infty \frac{G(x+y) - G(x)}{1 - G(x)}\, dF(y)
              + \lambda \left[ c(x) - \int_0^\infty c(x+y)\, \frac{1 - G(x+y)}{1 - G(x)}\, dF(y) \right].


The  proof  of  the  above  result  follows  a  procedure  similar  to  that  used  by  the  author  in  (Section 
2  of  [3]),  and  therefore  is  omitted. 
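As a purely illustrative aside (added here, not part of the original note), equation (6) can be evaluated numerically for particular choices of F, G and c(.).  The sketch below assumes exponential shock magnitudes, an exponential killing distribution, and a bounded linear before-failure cost; all parameter values and names are hypothetical and serve only to show how the sign of J(x) could be examined.

    import numpy as np

    lam, c_f = 1.0, 10.0                     # assumed shock rate and failure-replacement cost
    psi_star = 2.0                           # long-run optimal average cost, taken as given
    f = lambda y: np.exp(-y)                 # density of F (exponential, mean 1)
    G = lambda z: 1 - np.exp(-0.5 * z)       # killing distribution (exponential, mean 2)
    c = lambda x: np.minimum(0.8 * x, c_f)   # before-failure replacement cost, bounded by c_f

    def J(x, ymax=40.0, steps=4000):
        y = np.linspace(1e-6, ymax, steps)
        dy = y[1] - y[0]
        surv = (1 - G(x + y)) / (1 - G(x))   # probability the shock is survived
        kill = (G(x + y) - G(x)) / (1 - G(x))  # failure probability of the shock, as in (2)
        term1 = lam * c_f * np.sum(kill * f(y)) * dy
        term2 = lam * (c(x) - np.sum(c(x + y) * surv * f(y)) * dy)
        return psi_star - term1 + term2

    print([round(J(x), 3) for x in (0.0, 2.0, 5.0)])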

In what follows we shall denote by S the state space of the stochastic process $\{X(t);\ t < \delta\}$.




Let

(7)    S_1 = \{x \in S;\ J(x) > 0\},

and

(8)    S_2 = \{x \in S;\ J(x) \le 0\}.

Let $t_1, t_2, t_3, \ldots$, be the shock points of time and define

W = \{t_i;\ i \ge 1\}.

Let L be the subclass of replacement policies in which a decision can be taken only over the set W.

We  proceed  with  the  following  result: 

THEOREM 1:  For every replacement policy $T_1 \notin L$, there exists a replacement policy $T_2 \in L$ such that $\theta_{T_2} \ge \theta_{T_1}$.

PROOF:  Let $T_1$ be a replacement policy such that $T_1 \notin L$.

Let $T(S_2)$ be the hitting time of the set $S_2$.  That is,

(9)    T(S_2) = \inf\{t \ge 0;\ X(t) \in S_2\}.

(It is understood that when the set in braces is empty, then $T(S_2) = \infty$.)

Let

(10)    \hat{T} = \inf\{t > T_1;\ t \in W\}

and define

(11)    T_2 = \min\{\hat{T},\ T(S_2)\}.

Clearly $T_2 \in L$.  Next we show that $\theta_{T_2} \ge \theta_{T_1}$.


Using (5) we obtain

(12)    \theta_{T_2} - \theta_{T_1} = E\left[ \int_{T_1}^{T_2} J(X(s))\, ds \right]
        = E\left[ \int_{T_1}^{\hat{T}} J(X(s))\, ds;\ T_2 = \hat{T} \right]
        + E\left[ \int_{T_1}^{T(S_2)} J(X(s))\, ds;\ \hat{T} > T(S_2) \right].

Note that

I.  $\{T_2 = \hat{T}\}$ implies that $\{T(S_2) \ge \hat{T}\}$, and therefore $E\left[ \int_{T_1}^{\hat{T}} J(X(s))\, ds;\ T_2 = \hat{T} \right] \ge 0$.




II.  $J(X(s))$, for $T(S_2) \le s < \hat{T}$, is non-positive on the set $\{\hat{T} > T(S_2)\}$.  Therefore

E\left[ \int_{T_1}^{T(S_2)} J(X(s))\, ds;\ \hat{T} > T(S_2) \right] \ge 0.

Therefore, using (12), we obtain

\theta_{T_2} - \theta_{T_1} \ge 0,

as desired.

Recalling that an optimal replacement policy $T^*$ is the one that maximizes $\theta_T$ and using Theorem 1, it follows that $T^* \in L$.  Hence, the optimal policy derived by [1] is the optimal one among all possible replacement policies for which the cost of replacement is solely a function of the accumulated damage.

Finally  it  should  be  pointed  out  that  if  the  benefits  of  scheduled  replacement  were  con- 
sidered, the  conclusion  reached,  that  an  optimal  policy  replaces  the  device  at  a  shock  point  of 
time,  would  no  longer  generally  hold. 

REFERENCES 

[1] Abdel Hameed, M. and I.N. Shimi, "Optimal Replacement of Damaged Devices," Journal of Applied Probability, 15, 153-161 (1978).
[2] Dynkin, E.B., Markov Processes I, Academic Press, New York (1965).
[3] Zuckerman, D., "Replacement Models under Additive Damage," Naval Research Logistics Quarterly, 24, 549-558 (1977).




A  NOTE  ON  THE  SENSITIVITY  OF  NAVY  FIRST 
TERM  REENLISTMENT  TO  BONUSES,  UNEMPLOYMENT 
AND  RELATIVE  WAGES* 

Les  Cohen 

Government  Services  Division 

Kenneth  Leventhal  &  Company 

Washington,  D.C. 

Diane  Erickson  Reedy 

Mathtech,  Inc. 
Rosslyn,  Virginia 

ABSTRACT 

Multiple  regression  analysis  of  first  term  reenlistment  rates  over  the  period 
1968-1977  confirms  previous  findings  that  reenlistment  is  highly  sensitive  to 
unemployment  at  the  time  of  reenlistment  and  shortly  after  enlistment,  almost 
four  years  earlier.  Bonuses,  particularly  lump  sum  bonuses,  were  also  shown  to 
be  a  significant  determinant  of  reenlistment. 

This  note  reports  the  results  of  cross-sectional  multiple  regression  analysis  of  first  term 
Navy  reenlistment.  Equations  which  were  estimated  represent  the  completion  of  research  con- 
ducted by  Cohen  and  Reedy  [1]  which  analyzed  the  sensitivity  of  first  term  reenlistment  to 
fluctuations  in  economic  conditions  at  the  time  of  reenlistment  and  about  the  time  of  enlist- 
ment, considering  the  effect  of  the  latter  on  reenlistment  behavior  four  years  later.  The  princi- 
pal finding  of  that  study  was  that  unemployment  rates,  both  at  the  time  of  reenlistment  and 
about  the  time  of  enlistment  four  years  earlier,  were  powerful  predictors  of  reenlistment  rates. 
By  comparison,  measures  of  private  sector  versus  military  wages  entered  in  the  same  equations 
were  generally  found  to  be  insignificant  or,  at  best,  relatively  unimportant.  That  study  did  not, 
however,  take  into  account  the  influence  of  reenlistment  bonuses  which  this  follow-up  note 
addresses. 

This  note  describes  the  results  of  regression  equations,  replicating  those  which  were  the 
basis  of  the  original  Cohen-Reedy  paper,  which  include  reenlistment  bonus  variables  to  con- 
sider their  influence  upon  Navy  reenlistment  over  the  ten  year  period,  1968-1977. 

Reenlistment  rates  were  compiled  from  Navy  Military  Personnel  Statistics  ("The  Green 
Book"),  quarterly  by  rating,  separately  for  E-4's  and  E-5's.  To  help  minimize  spurious  fluctua- 
tions in  the  data,  reenlistment  rates  were  calculated  only  for  those  quarters  which  had  an  aver- 
age of  at  least  10  eligibles  per  month.  In  addition,  due  to  definitional  and  mensurational  incon- 
sistencies, ratings  which  include  nuclear  power  and  diver  NEC's  were  eliminated  and  other 


*This  research  was  supported  by  the  Office  of  the  Chief  of  Naval  Operations,  Systems  Analysis  Division,  under  a  con- 
tract with  Information  Spectrum,  Inc.,  Arlington,  Virginia. 


ratings  which  include  6  year  obligors  (6YO's)  were  analyzed  separately.  The  resultant  data  base 
consisted  of  3110  observations  for  4YO  ratings,  and  787  observations  for  6YO  ratings.  Each 
observation  referred  to  a  specific  quarter,  rating  and  pay  grade,  either  E-4  or  E-5. 

Four  multiple  linear  regression  equations  were  estimated:  one  for  4YO  ratings  (including 
E-4's  and  E-5's);  one  for  6YO  ratings  (including  E-4's  and  E-5's);  one  for  4YO  E-4's;  and  one 
for  4YO  E-5's.  No  attempt  was  made  to  estimate  separate  equations  for  each  major  occupa- 
tional category  as  was  done  in  the  previous  study.  Given  observed  variations  in  earlier  equa- 
tions, collective  treatment  of  ratings  has  probably  resulted  in  depressed  R2  statistics. 

The  dependent  variable,  RATE3,  is  the  percentage  deviation  of  the  current  quarter  reen- 
listment  rate  from  the  mean  reenlistment  rate  for  that  rating  and  pay  rate  over  the  10  years 
under  study,  1968-1977. 

RATE3 = (Quarterly Reenlistment Rate - Mean (10 Year) Reenlistment Rate) / Mean (10 Year) Reenlistment Rate

This  specification  of  the  dependent  variable  was  adopted  to  contend  with  wide  variations  in  the 
level  of  reenlistment  rates  from  rating  to  rating.  RATE3  describes  relative  changes  in  reenlist- 
ment rates. 

Independent  variables  included  in  the  equations  are  listed  and  defined  in  Table  1. 

TABLE  1  —  Independent  Variables 

AUR current  national  unemployment  rate 

ARAUR average  rate  of  change  in  unemployment  (AUR)  over  the 

past 6 quarters preceding the reenlistment decision

AUR13 unemployment  (AUR)  13  quarters  prior  to  the  reenlistment 

decision  (NOTE:  Virtually  uncorrelated  with  AUR.) 

RW the  ratio  of  military  basic  pay  to  private  sector  earnings 

AWARD bonus  award  multiple 

LS dummy  variable  indicating  lump  sum  payment  of  bonuses 

(LS  =  1  for  1968  -  1974;  LS  =  0  for  1975  -  1977) 

ELIG number  of  individuals  eligible  for  reenlistment 

PAYRATE dummy  variable  indicating  rate 

(PAYRATE  =  1  for  E  -  5's;  PAYRATE  =  0  for  E  -  4's) 

DRAFT number  of  persons  drafted  (all  services)  18  quarters  prior 

to  reenlistment  decision 

WAR dummy  variable  for  Viet  Nam  War 

(WAR  =  1  for  1968  -  1972;  WAR  =  0  for  1973  -  1977) 

QTR3 third  quarter  seasonal  dummy 

(QTR3  =  1  for  3rd  calendar  quarter  only) 

TIME time  variable  (TIME  =  Year  -  67) 
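For readers who want to reproduce this kind of specification, the following sketch is offered purely as an illustration (it is not the authors' code or data; the synthetic arrays, group identifiers, and subset of Table 1 regressors are assumptions).  It forms RATE3 by deviating each observation from its rating/pay-grade mean and fits a pooled least-squares equation.

    # Illustrative construction of RATE3 and a pooled OLS fit with a few Table 1 regressors.
    import numpy as np

    def rate3(quarterly_rate, group):
        # percentage deviation of each rate from its rating/pay-grade mean
        out = np.empty_like(quarterly_rate, dtype=float)
        for g in np.unique(group):
            idx = group == g
            mean = quarterly_rate[idx].mean()
            out[idx] = (quarterly_rate[idx] - mean) / mean
        return out

    rng = np.random.default_rng(1)
    n = 200                                         # hypothetical observations
    group = rng.integers(0, 10, n)                  # rating / pay-grade identifier
    AUR = rng.uniform(0.04, 0.09, n)                # current unemployment rate
    AUR13 = rng.uniform(0.03, 0.06, n)              # unemployment 13 quarters earlier
    AWARD = rng.integers(0, 7, n)                   # bonus award multiple
    LS = (rng.uniform(size=n) < 0.76).astype(float) # lump-sum dummy
    rate = 0.3 + 3 * AUR + 5 * AUR13 + 0.01 * AWARD + 0.05 * LS + rng.normal(0, 0.05, n)

    y = rate3(rate, group)
    X = np.column_stack([np.ones(n), AUR, AUR13, AWARD, LS])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    print(dict(zip(["const", "AUR", "AUR13", "AWARD", "LS"], np.round(beta, 3))))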




In  the  context  of  cross-sectional  analysis,  estimated  coefficients  do  not  pertain  to  the 
impact  of  a  given  variable  over  time  for  a  specific  rating,  but  represent  the  typical  impact  of  that 
variable  over  the  entire  10  years  across  all  ratings  which  were  included  in  the  study. 

Results  of  the  estimation  procedures  are  summarized  in  Table  2. 

TABLE 2 — Reenlistment Equations: Coefficients, (t-statistics), and Means

EQUATION            4YO                    6YO                    4YO/E-4                4YO/E-5
INDEPENDENT
VARIABLE        Coef. (t)      Mean    Coef. (t)      Mean    Coef. (t)      Mean    Coef. (t)      Mean

AUR             14.52 (4.96)    .06    18.46 (5.07)    .06    15.77 (3.56)    .06    12.49 (3.41)    .06
ARAUR            -.84 (1.61)    .02    -1.56 (2.34)    .02     -.70 ( .88)    .02    -1.38 (2.12)    .02
AUR13           29.38 (12.44)   .04    25.13 (8.00)    .05    24.21 (6.75)    .04    36.87 (12.50)   .04
RW               -.63 (1.12)    .77      .48 ( .64)    .77      .94 (1.03)    .72     -.68 (1.03)    .82
AWARD             .03 (4.06)   1.15    .0004 ( .05)   2.31      .03 (3.47)   1.15      .02 (2.43)   1.15
LS                .45 (6.32)    .76      .57 (5.81)    .67      .31 (2.90)    .76      .50 (5.62)    .76
ELIG          -.08E-2 (8.74) 103.76  -.03E-2 (3.83) 165.82  -.07E-2 (5.69) 123.71    -.001 (6.92)  83.74
PAYRATE           .03 ( .45)    .50     -.05 ( .66)    .50
DRAFT         -.04E-4 (6.69) 5.18E+4 -.04E-4 (4.82) 4.77E+4 -.04E-4 (5.03) 5.18E+4 -.04E-4 (5.51) 5.17E+4
WAR               .26 (4.10)    .57      .36 (4.46)    .50      .21 (2.12)    .57      .37 (4.61)    .57
QTR3             -.09 (3.60)    .26     -.08 (2.46)    .26     -.08 (2.17)    .26     -.07 (2.32)    .26
TIME              .08 (4.49)   5.05      .08 (3.86)   5.63      .08 (2.94)   5.05      .06 (2.85)   5.06
CONSTANT        -2.26 (5.27)           -3.03 (5.85)           -3.10 (4.47)           -2.39 (4.23)
R2                .34                    .48                    .40                    .35
OBSERVATIONS     3110                    787                   1556                   1554

The  three  unemployment  variables,  AUR,  ARAUR  and  AUR13,  were  specified  precisely 
as  in  the  earlier  Cohen-Reedy  study.  Consistent  with  those  results,  the  significance  of  the 
unemployment  rate  variables  and  the  magnitude  of  their  apparent  effect  upon  reenlistment  are 
striking.  Taken  literally,  coefficients  in  the  4YO  equation,  for  example,  show  a  one  point 
increase in AUR13 (+.01) indicating a 29 point (+.29) increase in RATE3.  While it is real-
ized that  these  coefficients  may  overstate  the  real  influence  of  unemployment,  their  equations, 
like  those  which  they  are  replicating,  do  indicate  that  reenlistment  decisions  may  in  fact  be  sen- 
sitive to  perceived  costs  of  employment  search  and  to  the  security  of  private  sector  employ- 
ment. 


The  first  compensation  variable,  RW,  representing  the  ratio  of  military  to  private  sector 
wages,  was  calculated  separately  for  E-4's  and  E-5's  using  basic  pay  for  E-5's  and  E-6's  respec- 
tively as  proxies  for  next-term  earnings.  RW  was  not  a  significant  variable  in  any  of  the  four 
equations. 




The  other  two  compensation  variables,  AWARD  and  LS,  relate  to  bonuses.  AWARD  is 
the  multiple  for  a  particular  rating  in  a  given  quarter,  ranging  from  0  to  6.  This  multiple  is  the 
factor  which  the  Navy  applies  against  an  individual's  monthly  pay  to  compute  the  dollar  amount 
of  his  bonus  payment.  AWARD  was  significant  in  all  three  4YO  equations.  LS  is  a  dummy 
variable  which  assumes  a  value  of  1  through  calendar  1974  during  the  period  when  lump  sum 
awards  were  paid  to  approximately  50%  of  those  individuals  who  reenlisted.  Beginning  January 
1,  1975,  a  new  policy  was  initiated  which  reduced  the  percentage  of  lump  sum  bonus  payments 
to  approximately  10%  of  those  reenlisting.  The  coefficient  of  LS  indicates  that  when  bonuses 
were  paid  in  lump  sums,  the  percentage  difference  between  actual  reenlistment  rates  and  mean 
(10  year)  reenlistment  rates  was  higher  by  .45  than  when  bonuses  were  paid  in  installments. 

The  variable  ELIG  was  included  in  the  equations  simply  to  capture  the  observed  relation- 
ship between  low  numbers  of  eligibles  and  high  reenlistment  rates. 

PAYRATE  is  a  dummy  variable  which  distinguishes  between  E-4's  and  E-5's  (PAYRATE 
=  1).  TIME  was  included  to  capture  the  influence  of  factors  which  have  changed  steadily  over 
time  such  as  the  quality  of  life  improvements  effected  by  the  Navy  over  the  past  several  years. 

These  equations  support  the  authors'  earlier  findings,  notably  that  unemployment  rates  at 
the  time  of  the  reenlistment  decision  and  shortly  after  enlistment  are  important  determinants  of 
reenlistment  rates.  Relative  wages  continue  to  appear  unimportant.  It  appears,  however,  that 
reenlistment  bonuses  have  had  a  significant  positive  effect  on  reenlistment,  particularly  when 
those  bonuses  have  been  awarded  in  lump  sum  payments. 

Although  by  no  means  conclusive,  the  equations  summarized  in  Table  2  suggest  the  fol- 
lowing management  initiatives: 

—  Experimentation  is  warranted  in  the  use  of  lump  sum  bonuses  to  mitigate  the  effects 
of  low  unemployment  rates  on  reenlistment. 


—  Opportunities  to  reenlist  might  be  timed  to  coincide  with  low  points  (periods  of  high 
unemployment)  in  the  business  cycle. 


—  AUR13  and  predicted  AUR  should  be  used  to  augment  current  information  used  for 
projecting  reenlistment  rates. 

—  Based  on  the  continued  performance  of  the  AUR13  variable,  serious  consideration 
must  be  given  to  implementing  new  programs  designed  to  effect  enlistee  career  decision 
making  very  early  during  the  first  term  of  service. 


REFERENCES 


[1]  Cohen,  L.  and  D.  Reedy,  "The  Sensitivity  of  Navy  First  Term  Reenlistment  Rates  to 
Changes  in  Unemployment  and  Relative  Wages,"  Naval  Research  Logistics  Quarterly,  26, 
695-709  (1979). 


U.S.  GOVERNMENT  PRINTING  OFFICE:  1980-311-493/3 


INFORMATION  FOR  CONTRIBUTORS 

The  NAVAL  RESEARCH  LOGISTICS  QUARTERLY  is  devoted  to  the  dissemination  of 
scientific  information  in  logistics  and  will  publish  research  and  expository  papers,  including  those 
in  certain  areas  of  mathematics,  statistics,  and  economics,  relevant  to  the  over-all  effort  to  improve 
the  efficiency  and  effectiveness  of  logistics  operations. 

Manuscripts  and  other  items  for  publication  should  be  sent  to  The  Managing  Editor,  NAVAL 
RESEARCH  LOGISTICS  QUARTERLY,  Office  of  Naval  Research,  Arlington,  Va.  22217. 
Each manuscript which is considered to be suitable material for the QUARTERLY is sent to one
or  more  referees. 

Manuscripts  submitted  for  publication  should  be  typewritten,  double-spaced,  and  the  author 
should  retain  a  copy.  Refereeing  may  be  expedited  if  an  extra  copy  of  the  manuscript  is  submitted 
with  the  original. 

A  short  abstract  (not  over  400  words)  should  accompany  each  manuscript.  This  will  appear 
at  the  head  of  the  published  paper  in  the  QUARTERLY. 

There  is  no  authorization  for  compensation  to  authors  for  papers  which  have  been  accepted 
for  publication.  Authors  will  receive  250  reprints  of  their  published  papers. 

Readers  are  invited  to  submit  to  the  Managing  Editor  items  of  general  interest  in  the  field 
of  logistics,  for  possible  publication  in  the  NEWS  AND  MEMORANDA  or  NOTES  sections 
of the QUARTERLY.


NAVAL RESEARCH
LOGISTICS
QUARTERLY

SEPTEMBER 1980
VOL. 27, NO. 3

NAVSO P-1278


CONTENTS

ARTICLES

On the Reliability, Availability and Bayes Confidence
  Intervals for Multicomponent Systems .................. W. E. THOMPSON and R. D. HAYNES

Optimal Replacement of Parts Having Observable
  Correlated Stages of Deterioration ................... L. SHAW, S. G. TYAN and C-L. HSU

Statistical Analysis of a Conventional Fuze Timer ........................ E. A. COHEN, JR.

The Asymptotic Sufficiency of Sparse Order Statistics
  in Tests of Fit with Nuisance Parameters ........................................ L. WEISS

On a Class of Nash-Solvable Bimatrix Games
  and Some Related Nash Subsets ........................... K. ISAACSON and C. B. MILLHAM

Optimality Conditions for Convex Semi-Infinite
  Programming Problems ........................... A. BEN-TAL, L. KERZNER and S. ZLOBEC

Solving Incremental Quantity Discounted Transportation
  Problems by Vertex Ranking ..................................................... P. G. MCKEOWN

Auxiliary Procedures for Solving Long
  Transportation Problems ................................... J. INTRATOR and M. BERREBI

On the Generation of Deep Disjunctive Cutting Planes ..... H. D. SHERALI and C. M. SHETTY

The Role of Internal Storage Capacity in Fixed
  Cycle Production Systems ........................................... B. LEV and D. I. TOOF

Scheduling Coupled Tasks ......................................................... R. D. SHAPIRO

Sequencing Independent Jobs With a
  Single Resource ....................................... K. R. BAKER and H. L. W. NUTTLE

Evaluation of Force Structures Under Uncertainty .......... C. R. JOHNSON and E. P. LOANE

A Note on the Optimal Replacement Time
  of Damaged Devices ............................................................ D. ZUCKERMAN

A Note on the Sensitivity of Navy First Term
  Reenlistment to Bonuses, Unemployment
  and Relative Wages .............................................. L. COHEN and D. E. REEDY


OFFICE  OF  NAVAL  RESEARCH 
Arlington,  Va.  22217