Skip to main content

Full text of "A preliminary report on CIMS PL/I"

See other formats


COO-3077-31 


^ 


Courant  Institute  of 
Mathematical  Sciences 

AEG  Computing  and  Applied  Mathematics  Center 


A  Preliminary  Report  on  CIMS  PL/I 

Paul  Abrahams 


AEC  Research  and  Development  Report 

Mathematics  and  Computers 
August  1973 


New  York  University 


^ 


UNCLASSIFIED 


AEC  Computing  and  Applied  Mathematics  Center 
Courant  Institute  of  Mathematical  Sciences 
New  York  University 


Mathematics  and  Computers  COO-3077-31 

A  PRELIMINARY  REPORT  ON  CIMS  PL/I 
Paul  Abrahams 


Contract  No.  AT (11-1) -3077 


UNCLASSIFIED 


Table  of  Contents  Page 

1 .  Introduction 1 

2 .  Implementation  Characteristics 2 

2  , 1    Language  accepted  by  CIMS  PL/I 2 

2  .  2    Features  Not  Yet  Implemented 4 

2 .  3   Ope  rating  Characteris  tics 6 

3.  Bootstrapping  Method 7 

4.  Structure  of  the  Compiler 11 

5 .  Implementation  Techniques 14 

5. 1  Parsing  Method 14 

5.2  Data  Structure  Design 17 

5.2.1  Choice  of  Representation  Method 17 

5.2.2  Examination  of  Trade-Offs  for 

Specific  Structures 19 

5 .  3    Run-Time  Storage  Allocation 22 

6  .  Conclusions 2  4 

6.1  Sources  of  Program  Complexity 24 

6.2  Operating  Systems  Support 27 

6 .  3   Virtual  Memory 29 

6.4  Reliability 30 

6.5  Project  Organization 31 

6.6  PL/I  as  a  Systems  Programming  Language 32 

6.7  Design  Flaws 33 

References 36 

Appendix  A   -  User's  Manual  for  CIMS  PL/I 37 

Appendix  B   -  Useful  Coding  Tricks 64 


-111- 


1 .   Introduction 

CIMS  PL/I  is  an  implementation  of  a  PL/I  subset  on  the 
CDC  6600  computer.   The  subset  is  oriented  towards  compiler 
writing,  and  is  sufficiently  large  so  that  the  compiler  itself 
could  be  conveniently  written  in  it.   The  objective  of  this 
report  is  to  set  forth  some  of  the  more  interesting  aspects  of 
the  compiler  and  some  of  the  conclusions  that  I  drew  from 
creating  it. 

I  had  several  initial  objectives  when  I  decided  upon  this 
project.   One  of  them  was  to  create  a  compiler  for  a  useful 
programming  language.   That  goal  has  been  achieved,  though 
it  is  now  less  significant  since  CDC  has  also  produced  a  PL/I 
compiler.   A  second  objective  was  to  learn  about  compiling 
techniques.   Compilation  of  PL/I  is  sufficiently  difficult 
so  that  if  one  can  compile  PL/I   one  can  probably  compile 
anything.   A  third  objective  was  to  gain  more  understanding  of 
the  characteristics  of  complex  programs.   A  fourth  objective 
was  to  stimulate  thinking  about  programming  language  design. 

CIMS  PL/I  has  been  operational  since  October  19  72  and 
consumed  about  three  and  a  half  calendar  years  and  one  and 
a  half  man  years  of  work.   Production  of  the  original  version 
of  the  compiler  was  essentially  a  one-person  effort.   Since 
the  original  version  appeared,  a  number  of  extensions  and 
improvements  have  been  made,  some  of  them  by  others.   The 
compiler  itself  is  written  in  PL/I,  and  was  bootstrapped  using 
the  IBM  360  and  the  IBM  F-compiler  for  PL/I. 

There  are  two  appendices  to  this  report.   Appendix  A  is  the 
current  user's  guide  to  CIMS  PL/I.   Appendix  B  describes  a 
collection  of  coding  tricks  that  I  found  useful  and  that  may 
be  useful  to  others. 


-1- 


2.    Implementation  Characteristics 

Two  aspects  of  the  implementation  are  worth  describing: 
the  language  accepted  by  the  compiler  and  the  operating 
characteristics  of  the  compiler.   The  three  following  sub- 
sections discuss  the  features  included  in  CIMS  PL/I,  the 
features  not  yet  implemented,  and  the  operating  characteristics. 

2.1    Language  Accepted  by  CIMS  PL/I 

In  this  section  I  shall  only  summarize  the  available 
language  features,  as  a  complete  description  is  given  in 
Appendix  A.   In  implementing  these  features ,  there  are  many 
possible  language  definitions  to  which  one  can  turn.   I  have 
used  as  my  guide  the  definition  of  PL/I   being  developed  as 
a  proposed  standard  by  committee  X3J1  of  the  American  National 
Standards  Institute  jointly  with  committee  TCIO  of  the  European 
Computer  Manufacturers  Association.   While  the  proposed  standard 
has  not  yet  been  issued,  drafts  of  it  are  available  [5] .  These 
drafts  do  not,  however.,  include  final  decisions  on  all  language 
questions. 

The  datatypes   that  have  been  implemented  so  far  are 
FIXED,  FLOAT,  CHARACTER,  BIT,  POINTER,  AREA,  LABEL,  FILE,  and 
ENTRY.   Labels,  files,  and  entries  are  presently  I'imited  to 
constants.   The  implementation  of  FLOAT  is  quite  weak,  since 
only  arithmetic  operations  are  provided  for  it;  in  particular, 
there  is  as  yet  no  floating  point  input  or  output.   The 
only  way  to  create  or  access  floating  point  constants  is 
through  conversion  from  FIXED. 

Various  other  attributes  are  available.   Arrays  and  structures 
have  been  implemented,  and  these  may  be  nested  within  each  other. 
The  storage  classes  other  than  CONTROLLED  are  available,  as 
well  as  the  INITIAL  attribute  (with  some  restrictions)  and  the 
LIKE  attribute.   Most  of  the  statement  types  have  been  imple- 
mented.  In  general,  though,  the  power  of  a  PL/I  subset  is  not 
accurately  reflected  by  the  range  of  statements  that  it 
recognizes . 

Transput  (usually  called  input-output)  is  rather  primitive 
at  present.   The  only  transput  operations  are  writing  a  character 

-2- 


string  to  a  dataset  or  reading  a  character  string  from  a  dataset. 
Datasets   can  be  rewound  by  opening  and  closing  them  with 
appropriate  options.   Since  the  initial  bootstrap,  provisions 
have  been  added  for  dataset  name  substitution,  so  that  the 
same  program  can  be  applied  to  datasets  with  different  names. 
The  usual  way  of  creating  output  information  is  to  build  it  up 
in  a  string  by  concatenation,  using  implicit  conversions  to 
convert  numbers  to  strings.   Then  the  string  is  written  out  as 
a  whole.   For  input,  a  string  is  brought  in  and  broken  down  into 
parts  using  the  SUBSTR  built-in  function.   Strings  representing 
numbers  can  then  be  implicitly  converted  to  the  numbers  them- 
selves.  Work  is  presently  under  way  on  providing  stream  transput 
facilities,  which  include  both  formatted  and  unformatted 
transput  of  lists  of  variables. 

A  limited  set  of  on-conditions  has   been  added  since  the 
initial  version  of  CIS  PL/I  became  operational.   These  were 
chosen  mostly  because  they  were  needed  by  the  compiler  itself. 
Perhaps  the  most  useful  of  these  is  ENDFILE,  which  allows  the 
programmer  to  take  some  specified  action  when  end-of-file  is 
encountered  on  input  from  a  specified  dataset.   Another, 
STORAGE,  allows  the  compiler  to  change  its  field  length 
dynamically. 

At  present  the  list  of  built-in  functions  is  quite  limited. 
Since  many  of  these  functions  can  be  implemented  using  existing 
Fortran  library  routines,  most  of  the  work  depends  only  upon 
providing  appropriate  linkages.   It  is  possible  to  access  the 
Fortran  library  even  under  the  present  implementation  by  calling 
a  Fortran  function,  since  calling  sequences  are  compatible  for 
scalar  numerical  arguments. 

A  number  of  implicit  and  explicit  type  conversions  are  in  the 
present  implementation;  conversion  to  and  from  character  strings 
has  been  emphasized.   These  conversions  take  place  both  on 
assignment  and  when  arguments  are  passed  to  procedures. 
Aggregate  arguments  can  be  passed  to  procedures ,  though  type 
conversion  for  such  arguments  has  yet  to  be  implemented. 


-3- 


2.2    Features  Not  Yet  Implemented 

The  major  features  that  are  lacking  are: 

1.  Stream  transput.   PL/I  has  three  varieties  of  stream 
transput,  none  of  which  are  currently  implemented.   Edit-directed 
transput  is  similar  to  Fortran  format-directed  transput  but 

is  far  more  dynamic.   PL/I  allows  arbitrary  expressions  in  most 
contexts  where  Fortran  requires   a  constant,  and  the  values  of 
these  expressions  can  be  influenced  by  data  being  read  in  under 
control  of  the  format  statement.   In  the  extreme  case,  an 
expression  in  a  format  statement  could  invoke  a  function  which 
in  turn  would  initiate  another  transput  operation,  perhaps  even 
on  the  same  dataset! 

List-directed  input  causes  data  to  be  read  in  a  free-field 
format,  and  among  other  things,  requires  that  type  conversions 
be  selected  dynamically  during  the  input  process.   (For 
instance,  a  data  item  in  floating-point  representation  might 
be  read  into  a  fixed  variable.)   List-directed  output  causes 
the  value  of  a  sequence  of  expressions  to  be  written  onto  a 
dataset  at  fixed  tab  positions.   Data-directed  input  and  out- 
put associate  the  name  of  a  variable  with  its  value;  in 
particular,  on  input  the  name  of  the  variable  being  read  in 
is  part  of  the  input.  Thus  a  run- time  symbol  table  must  be  kept. 

2.  Record  transput,   A  very  limited  form  of  record  transput 
is  the  only  form  of  transput  presently  in  CIMS  PL/I.   Record 
transput  is  used  to  achieve  random  access  to  secondary  storage 
by  means  of  keys,  and  requires  a  rather  extensive  interface 
with  the  operating  system.   Record  transput  also  interacts  with 
multitasking,  since  a  record  transput  operation  may  overlap 
computation  as  well  as  other  transput  operations. 

3.  Aggregates.   The  full  treatment  of   aggregate  data 
(structures  and  arrays)  is  one  of  the  thorniest  aspects  of  PL/I 
implementation.   Procedures  should  be  able  to  return  aggregates 
as  their  values ,  and  aggregates  should  be  able  to  be  passed  as 
arguments  to  built-in  functions.   Further,  type  conversion  must 


-4- 


be  implemented  for  aggregates,  and  this  conversion  may  require 
the  development  of  an  aggregate  from  a  scalar.   PL/I  provides 
the  ability  to  reference  cross-sections  of  arrays,  but  this 
ability  is  not  yet  in  CIMS  PL/I.   Further,  through  the  use 
of  the  DEFINED  attribute,  one  aggregate  may  be  defined  in  terms 
of  a  mapping  from  another  aggregate.   The  DEFINED  attribute 
has  not  yet  been  implemented  either. 

4.  Arithmetic  data  types.   PL/I  allows  three  independent 
arithmetic  attributes  to  be  applied:   base  (DECIMAL  or  BINARY) , 
scale  (FIXED  or  FLOAT)  and  mode  (REAL  or  COMPLEX) .   Furthermore, 
precisions  (number  of  digits  to  be  kept)  and  scale  factors  may 
be  specified.   Since  at  present  all  fixed  numbers  behave  like 
Fortran  integers,  a  great  deal  remains  to  be  done  in  this  area. 

5.  Multi- tasking.   PL/I  provides  extensive  facilities  for 
the  generation  of  parallel  tasks  and  for  task  synchronization. 
These  facilities  are  useful  even  on  a  machine  with  only  one 
central  processor,  since  tasks  may  be  dependent  upon  transput 
operations.   None  of  these  facilities  are  yet  implemented, 
though  certain  hooks  for  them  exist. 

6.  Pictures.   PL/I  has  a  PICTURE  facility  much  like  that 
of  COBOL.   Pictures  are  useful  for  describing  and  editing  data 
used  in  commerical  applications.   Pictures  are  not  presently 
implemented. 

7.  Data  alignment.   PL/I  provides  the  ALIGNED  and  UNALIGNED 
attributes  to  control  the  tradeoff  between  economy  of  storage 
and  speed  of  access.   Although  on  the  surface  it  might  appear 
that  a  correct  implementation  could  ignore  these  attributes , 

it  happens  that  they  have  certain  implications  with  regard  to 
the  storage  of  character  and  bit  strings.   In  particular, 
adjacent  character  strings  in  an  unaligned  structure  must  be 
stored  adjacently  even  if  that  requires  starting  one  of  them 
in  the  middle  of  a  word.   A  similar  situation  holds  for  bit 
strings.   One  implication  of  this  requirement  is  that  pointers 
must  be  able  to  point  to  bits  or  characters  within  a  word; 
pointers  as  presently  implemented  in  CIMS  PL/I  can  only  point 
to  words . 

-5- 


8.  Bit  strings.   The  implementation  of  bit  strings  is 
presently  quite  restrictive,  as  they  can  only  be  of  fixed 
length  and  that  length  must  not  exceed  60.   Varying  bit  strings 
are  not  implemented  at  all. 

9.  Built-in  functions.   PL/I  requires  a  rather  large  library 
of  built-in  functions,  and  in  addition  there  are  a  number  of 
implementation-defined  ones  that  would  be  desirable.   Only  a 

few  built-in  functions  have  been  implemented  so  far,  though 
many  of  the  remaining  ones  could  be  taken  directly  from  the 
Fortran  library. 


2.3    Operating  Characteristics 

The  compiler  is  designed  to  operate  under  the  CDC  Scope  3.2 
Operating  System  [6]   on  the  CDC  6000  computer  series,  though 
it  will  also  run  under  the  Scope  3.4  Operating  System  [7].  The 
input  to  the  compiler  is  a  sequence  of  PL/I  external  procedures, 
and  the  output  is  in  the  form  of  6600  assembly  language.   This 
assembly  language  must  then  be  assembled  in  a  separate  step, 
using  a  set  of  macro  definitions  provided  with  the  compiler. 
Finally,  the  assembled  program  is  loaded  together  with  a 
run-time  library  and  executed. 

The  compiler  itself  presently  operates  at  a  rate  of  about 
15  statements  per  second   and  requires  a  minimum  field  length 
of  lOlK  octal;  large  programs  require  considerably  more  space. 
Assembly  takes  about  30%  more  time  than  compilation,  so  that 
the  total  rate  of  compilation  is  about  6  statements  per  second. 

An  extensive  set  of  compile-time  diagnostics  has  been 
provided.   Run-time  diagnostics  are  much  sketchier,  though  a 
recently  added  feature  makes  it  easy  to  determine  the  point  at 
which  a  program  has  stopped.   Listings  are  unornamented  except 
for  the  inclusion  of  statement  numbers.   A  listing  of  variables 
and  a  cross-reference  map  can  be  obtained. 


-6- 


3.   Bootstrapping  Method 

The  CIMS  PL/I  compiler  is  written  in  PL/I  and  was  boot- 
strapped from  the  IBM  360.   In  Figure  1  we  see  the  process 
involved  in  obtaining  an  executable  program  from  a  source 
program.   The  source  program  is  processed  by  the  compiler  to 


macro 
definition 


PL/I 


source 
code 


COMPILER 
(written 
in  PL/I) 


6600  assembly 


I  language  t 
» 
) 

I 
.1 


run-time 
library 


COMPASS 


ASSEMBLER  nodules 


load 


>  LOADER 


execu- 
table 
pro-   ^ 
gram 


TAPE 
tTRANSFER 


Dotted  path  is  taken  during  bootstrapping. 


Figure  1.    Compilation  Process  and  Bootstrap  Procedure 


produce  a  version  in  COMPASS,  the  6600  assembly  language.   Since 
the  COMPASS  version  is  expressed  as  lines  of  code,  the  compiler 
can  be  viewed  as  transforming  character  strings  to  character 
strings.   (In  this  analysis  we  are  ignoring  such  matters  as 
output  listings.)   The  COMPASS  code  consists  mostly  of  calls 
on  macros,  so  that  the  assembler  must  be  provided  with  a  collec- 
tion of  macro  definitions.   The  output  of  the  assembler  is  a 
collection  of  load  modules,  one  for  each  external  procedure. 
The  compiled  code  will  call  upon  various  run-time  subroutines 
for  services  such  as  storage  allocation,  character  manipula- 
tion, and  evaluation  of  built-in  functions.  These  run- time 
subroutines  are  linked  to  the  compiled  code  by  the  loader,  so 
that  the  result  of  the  loading  process  is  an  executable 
program. 

Now  let's  return  to  the  compiler  itself,  which  we  shall 
call  CI.   CI  is  written  in  PL/I  and  transforms  character  strings 


-7- 


to  character  strings.   Therefore  it  will  run  on  any  machine  that 
has  an  operational  PL/I  compiler;  we  shall  call  this  second 
compiler  C2 .   Furthermore  CI  is  almost  entirely  machine- 
independent.   There  are  a  few  machine  dependencies  such  as  the 
character  tables  used  by  the  lexical  scanner,  the  transformation 
used  by  the  hashing  algorithm,  and  the  overlay  calling  conven- 
tions.  These  sections  must  be  changed  to  correspond  to  the 
machine  upon  which  CI  is  executing. 

When  this  project  began,  of  course,  PL/I  did  not  exist  on 
the  6600.   However,  we  can  consider  a  rather  awkward  way  of 
obtaining  executable  PL/I  programs  on  the  6600  under  these 
circumstances.   First,  CI  is  compiled  on  the  360  using  C2 . 
Since  CI  is  a  PL/I  program  and  C2  compiles  PL/I  programs  so 
that  they  will  execute  on  the  360,  there  are  no  difficulties 
here.   The  result  is  that  CI  now  executes  on  the  360,  and 
converts  PL/I  source  code  to  6600  assembly  language.   It  might 
seem  that  the  process  would  fail  since  C2  turns  out  360  machine 
language.   Remember  though  that  CI  is  just  a  PL/I  program  that 
transforms  character  strings  to  character  strings.   The  fact 
that  the  output  character  strings  are  6600  assembly  language 
is  irrelevant. 

We  now  have  our  awkward  method  of  compiling  PL/I  for  the  6600. 
We  take  our  PL/I  source  code,  pass  it  through  CI  on  the  360,  and 
write  the  results  on  tape.   We  then  read  the  tape  on  the  6600  and 
carry  out  the  rest  of  the  process,  as  shown  in  Figure  1. 

The  bootstrap  approach  is  based  on  the  observation  that  CI  is 
itself  written  in  PL/I,  and  tlierefore  this  awkward  method  of 
compilation  can  be  applied.   Thus,  we  pass  the  text  of  CI  through 
the  process  of  Figure  1,  including  the  tape  transfer  across 
machines.   The  result  is  CI  as  an  executable  program  on  the 
6600  --  in  other  words,  an  operational  PL/I  compiler  on  the 
6600.   From  this  point  on  the  360  is  not  used,  and  the  dotted 
connection  of  Figure  1  is  replaced  by  a  direct  connection. 

Bootstrapping  from  the  6600  to  the  6600  is  necessary 
whenever  improvements  are  made  to  the  compiler.   These  improve- 
ments are  of  two  sorts.   Language  extensions  may  enable  the 


•8- 


compiler  to  do  things  that  it  could  not  do  previously.   For 
instance,  the  original  version  had  no  means  of  detecting 
end-of-file,  and  so  the  compilation  of  each  external  procedure 
required  a  separate  job  step.   When  the  ENDFILE  on-condition 
was  added,  the  compiler  could  be  modified  to  allow  for  batched 
compilations.   Optimization  of  compiled  code  is  an  even  more 
important  advantage  of  repeated  bootstrapping.   As  the  code 
turned  out  by  the  compiler  becomes  more  efficient,  the 
performance  of  the  compiler  itself  becomes  improved  through 
repeated  bootstrapping. 

An  incidental  advantage  of  bootstrapping  is  that  it 
provides  a  severe  test  of  the  compiler  itself.   The  approach 
is  shown  in  Figure  2.   We  start  off  with  an  old  machine- 
language  version   M(l)   and  a  new  source-language  version  S(2), 


(1) 


(2) 


(3) 


S{2) 


S(2) 


S(2) 


>  M(l) 


^  M(2) 


^M(3] 


■^  C(2,l)   =  M(2) 


-^   C(2,2)   =  M(3) 


->   C(2,3)   =  M(4) 


M(i)  machine    language-version      i 

S(i)  source    language    version        i 

C(i,j)  machine- language    compiler 

produced  by  compiling  source 
version  i  under  machine  version  j 


Figure  2.    Iterated  Bootstrapping 


The  first  compilation  produces  M(2) ,  which  we  then  use  to 
compile  S(2)  again  and  produce  M(3).   The  process  is  iterated 
once  more  to  produce  M(4).   If  the  bootstrap  process  has 
succeeded,  then  M(3)  and  M(4)  will  be  bit-for-bit  identical. 


-9- 


a  condition  that  can  easily  be  checked  mechanically.   To  see 
this,  we  note  that  the  code  generated  by  S(2)  should  be 
independent  of  the  C  used  to  compile  S(2) ,  since  S(2)  as  we 
noted  before  is  just  transforming  character  strings  to 
character  strings.   Now  M(2)  and  M(3)  are  both  realizations 
of  S(2)  and  thus  should  transform  any  input,  including  S(2) 
itself,  in  the  same  way.   We  now  observe  that  if  M(3)  and  M(4) 
are  bit-for-bit  identical,  then  any  number  of  further  iterations 
will  produce  results  identical  to  M(3) .   Thus  we  know  that 
stability  has  been  achieved.   In  typical  practice,  M(l)  performs 
the  same  transformation  as  S(2)  and  hence  as  M(2) ;  in  this  case, 
M(2)  and  M(3)  will  be  identical  and  the  process  can  be  termi- 
nated one  step  earlier.   Since  S(2)  may  produce  rather  different 
code  than   S(l),  however,  we  cannot  count  on  this  early  termina- 
tion. 


•10- 


4.   Structure  of  the  Compiler 

The  major  components  of  the  compiler  itself  are  shown  in 
Figure  3.   The  input  to  the  compiler  is  PL/I  source  code; 
the  output  is  6600  assembly  language.   The  unit  of  processing 
is  the  external  procedure,  and  each  of  the  boxes  shown 
represents  a  pass  over  part  or  all  of  the  procedure. 


LIKE 
resolver 

declaration 
processor 

parser 

:ieclara- 

reference 
resolver 

-^ 

source 

tions 

code 

^ 

J 

I 

r 

lexical 
scan 

storage  planner 

> 

^ 

^^ 

CO 

language 

Figure  3.   Structure  of  the  Compiler 


The  first  pass  is  performed  by  the  parser,  which  is  described 
in  more  detail  in  Section  5.1.   It  communicates  with  a  lexical 
scanner  which  breaks  down  the  input  stream  into  tokens  such  as 
identifiers,  numbers,  and  punctuators.   The  output  of  the  parser 
is  a  collection  of  interconnected  data  structures.   Some  of  these 
data  structures  are  associated  with  declarations;  most  of  the 
rest  of  them  are  associated  with  statements.   The  data  structures 


-11- 


are  interconnected  through  pointers,  and  their   organization 
is  discussed  further  in  Section  5.2. 

During  parsing,  a  list  is  kept  of  all  declarations  that  use 
the  LIKE  attribute.   The  LIKE  attribute  allows  the  programmer 
to  copy  a  portion  of  a  declaration  into  another  declaration, 
and  considerably  simplifies  the  work  of  writing  long  but 
similar  declarations.   It  must  be  processed  at  a  very  early 
stage  because  until  it  is  processed  the  set  of  declarations 
is  not  complete.   The  parser  keeps  a  list  of  all  places  where 
the  LIKE  attribute  is  used,  so  that  it  is  not  necessary  to  scan 
the  entire  text  looking  for  occurrences  of  LIKE. 

The  task  of  the  reference  resolver  is  to  connect  uses  of 
variables  with  their  declarations.   The  reference  resolver 
must  scan  through  all  statements  as  well  as  through  all 
expressions  appearing  in  declarations.   Each  use  of  a  variable 
is  represented  by  a  pointer.   This  pointer  is  then  replaced  by 
a  pointer  to  the  appropriate  declaration.   The  task  of  resolving 
a  reference  can  be  lengthy  because  of  the  complications  intro- 
duced by  block  structuring  and  name  qualification.   Therefore 
different  references  to  the  same  variable  are  represented  by 
pointers  with  the  same  value,  and  the  full  work  of  reference 
resolution  need  be  carried  out  only  once  for  each  distinct 
reference. 

Reference  resolution  may  turn  up  references  to  variables  that 
are  nowhere   declared.   Implicit  declarations  are  generated  for 
such  variables.   These  declarations  may  require  the  application 
of  contextual  attributes,  which  are  determined  by  the  way  that 
a  variable  is  used.   Contextual   attributes  are  picked  up  during 
the  parse  and  applied  to  implicit  declarations  during  reference 
resolution. 

The  declaration  processor  generates  the  full  set  of  declara- 
tions for  the  procedure.   It  must  check  and  complete  the 
declarations  that  have  been  given  by  the  programmer,  and  it  must 
also  generate  certain  declarations  that  the  programmer  has  not 
(and  indeed,  cannot  have)  specified.   Completion  of  declarations 
is  necessary  because  the  attributes   that  the  programmer  has 


-12- 


written  down  may  not  be  sufficient  to  completely  characterize 
the  declared  variable.   Completion  of  declarations  also 
involves  the  application  of  any  DEFAULT  statements  that  the 
programmer  may  have  written.   When  completing  declarations 
we  must  also  associate  each  built-in  function  with  an  index 
number.   Each  internal  procedure  must  have  an  entry  declara- 
tion generated  for  it,  and  subscripted  labels  have  label  array 
declarations  generated  for  them. 

The  last  and  most  complex  pass  is  the  code  generator. 
The  code  generator  processes  the  statements  of  the  source 
program  in  the  sequence  in  which  they  were  written,  except 
for   DECLARE  statements.   The  type  of  statement  then  determines 
the  particular  statement  generator  that  must  be  called  upon. 
Furthermore,  whenever  a  BEGIN  statement  or  a  PROCEDURE  statement 
is  encountered,  the  storage  planner  is  invoked.   The  storage 
planner  is  a  major  part  of  the  code  generator;  its  task  is  to 
develop  the  arrangement  of  storage  that  will  exist  during 
program  execution. 

The  largest  single  component  of  the  code  generator  is  the 
procedure  COMPUTE.   COMPUTE  receives  an  expression  as  its  input. 
It  generates  code  for  evaluating  that   expression,  and  returns 
as  its  result  a  data  structure  that  specifies  the  attributes 
of  the  computed  value  as  well  as  the  information  needed  to 
access  that  value. 


-13- 


5 .    Implementation  Techniques 

In  this  section  I  shall  discuss  a  few  interesting  aspects 
of  the  implementation.   This  discussion  is  by  no  means  exhaus- 
tive.  It  is  intended  to  present  a  few  selected  issues  and  to 
show  the  approach  that  I  took  to  them. 

5.1   Parsing  Method 

The  parsing  method  used  in  the  compiler  is  detailed  in  [1], 
and  will  only  be  sketched  here.   It  is  based  on  a  program 
called  SYNDIPAR  (SYNtax  Directed  PARser) ,  whose  input  is  a 
specification  of  the  required  syntax  and  semantics  and  whose 
output  is  a  PL/I  parsing  procedure  together  with  the  tables 
that  drive  it.   SYNDIPAR  itself  is  written  in  SNOBOL.   In  my 
early  planning  I  assumed  that  SYNDIPAR  would  not  have  to  be 
run  very  often  and  therefore  its  efficiency  would  be  irrelevant. 
This  has  not  turned  out  to  be  the  case;  the  inef ficiences  of 
SNOBOL  have  caused  SYNDIPAR  to  consume  inordinate  amounts  of 
resources,  and  I  am  planning  to  rewrite  it  in  PL/I.   Neverthe- 
less, it  works.   (Of  course,  the  compiler  itself  is  in  no  way 
dependent  upon  SNOBOL.) 

The  first  input  to  SYNDIPAR  is  a  set  of  syntax  equations 
for  PL/I .   These  syntax  equations  are  written  in  an  extended 
form  of  BNF  which  includes  operators  for  repetition  both  with 
and  without  delimiters  between  the  items  being  repeated. 
For  instance,  an  argument  list  is  specified  as  a  sequence  of 
expressions  delimined  by,  i.e.,  separated  by,  commas. 
SYNDIPAR  starts  with  the  assumption  that  the  given  grammar 
belongs  to  the  class  LL(1).   The  essential  property  of  an  LL(1) 
grammar  is  that  at  any  point  in  the  parse  where  alternatives 
exist,  one  can  decide  which  way  to  go  by  looking  only  at  the 
next  symbol  in  the  input  string.   Unfortunately  PL/I  does  not 
have  cin  LL(1)  grammar.   So  the  parser  also  must  call  upon 
"oracles"  to  look  ahead  in  the  input  stream  far  enough  to  deter- 
mine the  correct  decision  in  cases  of  doubt.   We  thus  have  a 
form  of  programming  by  exception.   Since  the  situations  where 


-14- 


oracles  are  needed  are  very  few,  we  gain  the  efficiency  of 
LL(1)  parsing  most  of  the  time.   The  oracles  themselves  need 
not  carry  out  a  full  analysis,  but  need  only  make  a  binary 
decision.   They  can  do  this  by  very  simple  means,  such  as 
looking  for  adjacent  symbols  of  certain  types  or  counting 
parentheses.   An  oracle  need  not  concern  itself  with  incorrect 
input;  if  the  input  is  incorrect  the  oracle  simply  makes  the 
most  convenient  decision  and  leaves  it  to  other  mechanisms  to 
detect,  announce,  and  possibly  correct  the  error. 

The  syntax  equations  provided  to  SYNDIPAR  also  contain  calls 
on  semantic  routines .   The  semantic  routines  are  provided  in  the 
form  of  blocks  of  PL/I  code,  and  their  task  is  to  generate  the 
data  structures  that  are  the  output  of  the  parse.   SYNDIPAR 
generates  the  appropriate  linkages  so  that  these  blocks  are 
called  at  the  right  time.   The  semantic  routines  are  actually 
interspersed  with  the  syntax  equations ,  since  it  is  easier  to 
prepare  the  input  in  this  form.   The  semantic  routines  are  simply 
copied  into  the  parsing  procedure  produced  by  SYNDIPAR,  with 
some  linkage  material  added  front  and  back. 

The  first  output  of  SYNDIPAR  is  the  parsing  procedure  itself. 
Since  almost  all  of  this  procedure  is  just  a  transcription  of 
the  input,  there  is  not  much  to  say  about  it.   The  second  output 
is  a  sequence  of  instructions  for  a  "parsing  machine".   The  code 
for  this  machine  is  interpreted  when  an  actual  source  program 
is  being  parsed.   The  interpreter  itself  is  included  in  the  parser. 
Each  instruction  for  the  parsing  machine  has  an  operator  (there 
are  ten  of  them)  and  an  optional  operand.   Denoting  the  integer 
operand  by  x  ,  we  can  sketch  the  operations : 

1.  Jump  to  the  semantic  routine  at  SLAB (x) ,   where  the 
label  SLAB (x)  has  been  attached  by  SYNDIPAR  to  the  appropriate 
block  of  code. 

2.  Jump  to  the  syntax  routine  whose  code  starts  with 
instruction  x,    saving  a  return  address  on  a  pushdown  stack. 

3.  Unstack  a  return  address  and  jump  to  it. 

4.  Use  X      together  with  the  current  symbol  or  symbol  type 
to  determine  a  branch  address  and  then  branch  there.   The 


-15- 


(x, symbol)   pairs  are   used  as  input  to  a  hashing  algorithm. 
If  no  entry  exists  for  the   (x, symbol)   pair  then  execute  the 
next  instruction. 

5.  Advance  to  the  next  token  in  the  input  stream. 

6.  Jump  to  the  instruction  at  x. 

7.  Signal  an  error. 

8.  Plant  a  flag  indicating  that  the  currently  governing 
error  control  routine  is  the  block  of  code  at  ERR(x) . 

9.  Not  used. 

10.  There  is  an  array  of  tenninal  symbols  called  TERM. 

If   TERM(x)   is  equal  to  the  next  symbol,  then  skip  one  instruc- 
tion; otherwise  execute  the  next  instruction. 

11.  If  the  symbol  class  (integer,  string,  punctuator,  etc.) 
of  the  next  symbol  has  code  x,  skip  one  instruction;  otherwise 
execute  the  next  instruction. 

The  use  of  oracles  requires  some  special  techniques  in  the 
lexical  scanner.   An  oracle  works  by  looking  ahead  through  the 
input  stream,  searching  for  enough  information  to  make  its 
decision.   It  sees  the  input  stream,  as  a  sequence  of  tokens 
rather  than  as  a  sequence  of  characters.   These  tokens  will 
need  to  be  scanned  again  during  the  subsequent  syntactic 
analysis;  furthermore,  the  same  token  sequence  may  be  examined 
by  several  different  oracles  prior  to  analysis. 

These  requirements  are  met  by  providing  two  modes  of  lexical 
analysis.   In  the  lookahead  mode,  scanned  tokens  are  saved  on 
a  queue.   Upon  completion  of  lookahead,  the  current  token  is 
taken  to  be  the  one  that  was  current  when  the  lookahead  started. 
In  either  lookahead  or  ordinary  mode,  tokens  are  taken  from  the 
queue  until  it  is  exhausted;  thereafter  they  are  taken  from  the 
input  stream  itself.   In  ordinary  mode,  scanned  tokens  are 
discarded;  in  lookahead  mode,  they  are  saved. 

What  if  the  space  on  the  queue  is  exhausted?   This  situation 
will  occur  if  an  oracle  is  unable  to  make  a  decision  after  scan- 
ning some  .reasonable  number  of  tokens.   When  the  queue  space 
is  exhausted,  the  oracle  is  notified,  and  it  makes  an  arbitrary 


-16- 


decision.   If  this  decision  is  wrong,  a  syntactic  error  will 
be  introduced  artificially.   Such  errors  are  unavoidable  if 
we  don't  want  to  store  two  unbounded  alternative  parses,  since 
there  are  examples  where  the  syntactic  type  of  a  construct 
remains  indeterminate  for  an  unbounded  number  of  symbols. 

5.2    Data  Structure  Design 

The  design  of  the  data  structures  for  the  compiler  was  the 
most  difficult  and  critical  part  of  the  overall  design.   Once 
the  data  structures  were  selected,  most  of  the  programming 
became  straightforward.   It  is  worthwhile  examining  some  of 
the  ideas  underlying  the  data  organization,  and  some  of  the 
tradeoffs  that  had  to  be  considered. 

5.2.1    Choice  of  Representation  Method 

The  first  decision  concerning  data  structures  was  the 
choice  between  a  syllable-oriented   and  a  pointer-oriented 
representation.   In  a  syllable-oriented  representation,  the 
program  is  encoded  as  a  linear  sequence  of  syllables. 
Syllables  may  either  be  type  codes  which  determine  the  meaning 
of  the  following  information,  or  representatios  of  the  informa- 
tion itself  (including  pointers  and  indices) .   The  usual 
reverse-Polish  representation  is  syllable-oriented,  as  are 
representations  in  terms  of  triples  or  quadruples.   These 
representations  have  the  convenient  property  that  they  can  be 
written  onto  secondary  storage,  rewound,  and  reprocessed.  Even 
the  rewind  can  be  avoided  through  clever  tricks,  though  for 
disks  this  consideration  is  irrelevant. 

In  a  pointer-oriented  representation,  sequencing  is 
accomplished  by  using  chains  of  pointers  rather  than  by  placing 
items  one  after  the  other.   In  a  pointer-oriented  representa- 
tion the  entire  intermediate  representation  of  a  program  must 
be  directly  addressable.   For  entities  such  as  expressions  the 
complexity  of  the  two  representations  is  about  the  same. 
Pointers,  however,  have  the  advantage  that  information  need 
not  be  processed  in  the  same  order  as  it  is  generated. 


-17- 


It  would  seem  that  the  case  for  a  syllable-oriented  repre- 
sentation is  very  strong,  and  for  most  programming  languages 
that  would  be  the  natural  choice.   PL/I,  though,  poses  some 
special  problems,  most  of  which  arise  from  the  fact  that 
declarations  may  be  scattered  throughout  a  block  rather  than 
all  appearing  at  the  head  of  a  block.   Thus  the  code  associated 
with  declarations  must  somehow  be  moved  to  the  right  place,  and 
this  kind  of  code  motion  over  an  arbitrary  distance  becomes 
rather  complex  in  a  syllable-oriented  scheme.   For  instance, 
a  block  could  conclude  with  a  declaration  of  a  character  string 
whose  length  is  given  by  an  expression.   Somehow  the  length 
expression  must  be  moved  to  the  start  of  the  block,  since  it 
must   be  evaluated  before  execution  of  the  block  itself  can 
begin.   A  similar  problem  occurs  with  implicitly  declared 
variables.   Even  if  the  programmer  writes  all  his  declarations 
at  the  head  of  the  block,  there  still  is  the  problem  of  proce- 
dures internal  to  the  block,  for  which  declarations  must  be 
generated,  as  well  as  the  problem  of  detecting  and  dimensioning 
constant  label  arrays. 

It  is  apparent  that  the  syllable  appraoch  is  not  very  helpful 
for  representing  declarations.   Declarations  are  by  nature 
interlinked  in  many  ways,  and  simply  do  not  have  the  sort  of 
sequential  structure  necessary  for  representation  by  syllables. 

A   further  difficulty  with  the  syllable-oriented  approach 
arises  in  the  analysis  of  expressions.   A  simple  bottom-up 
analysis  is  not  sufficient  if  certain  kinds  of  local  optimization 
are  to  be  attempted,  since  one  wants  to  consider  an  operand  in 
conjunction  with  the  operator  being  applied  to  it.   In  a 
reverse-Polish  representation  it  is  awkward  to  locate  such  an 
operator  before  both  operands  have  been  analyzed. 

In  the  original  bootstrap  version  of  the  compiler  the 
objective  was  to  get  it  on  the  air   as  soon  as  possible. 
I  therefore  chose  to  use  the  pointer-oriented  approach 
primarily  because  it  was  easier  to  implement  and  did  not  require 
any  transput  facilities.  In  retrospect  the  decision  is  still  not 


-18- 


clear-cut,  though  there  are  places  where  syllables  could  have 
been  used  to  good  effect.   The  tradeoffs  are  influenced  by  the 
availability  of  memory,  either  virtual  or  real.   These  considera- 
tions are  discussed  in  Section  6.3. 

5.2.2    Examination  of  Tradeoffs  for  Specific  Structures 

As  an  example  of  tradeoffs,  consider  the  representation  of 
identifiers  within  the  data  structures.   An  occurrence  of  an 
identifier  might  be  represented  by  the  character  string  for 
the  identifier  or  by  a  pointer  to  some  other  data  structure 
that  contains  that  character  string.   In  examining  the  space 
requirements  we  see  that  if  most  identifiers  are  short  (as  we 
would  usually  expect  them  to  be) ,  then  it  is  more  economical 
to  store  the  character  strings  directly.   However,  there  is 
a  trap.   Since  identifiers  vary  in  length,  we  must  store  not 
only  the  character  string  but  also  its  length.   At  this  point 
the  economy  of  space  begins  to  favor  the  indirect  representa- 
tion.  The  indirect   representation  has  another  cost:   the 
hashed  table  search  necessary  to  locate  the  unique  structure 
associated  with  a  given  identifier.   However,  this  cost  must 
be  paid  eventually  anyway  since  it  is  necessary  to  hash 
identifiers  in  order  to  resolve  references.   The  compiler  uses 
the  indirect  approach.   Every  identifier  has  a  unique  structure 
that  represents  it.   The  identifier  is  characterized  by  a 
pointer  to  this  structure.   Associated  with  the  structure  is 
a  list  of  all  declarations  of  that  identifier.   The  list  is 
formed  by  pointing  to  the  first  such  declaration  and  then 
chaining  the   rest  of  them  through  a  pointer  component  of 
the  declaration  itself. 

As  another  example,  consider  the  structure  that  is  used  to 
describe  the  declaration  of  a  single  variable.   The  actual  PL/I 
specification  of  that  structure  is  given  by: 


-19- 


DECLARE    1    DECL    BASED  (DPTR)  , 

2    ATTRIBS    BIT  (60)  ,    /*    BINARY    ATTRIBUTES    */ 

2    COMPLEX_ATTRIBS    POINTER,     /*    POINTER   TO    LIST    OF    COMPLEX 

ATTRIBUTES    */ 
2    ACCESS_RULE    POINTER,    /*    POINTER   TO   STRUCTURE    FOR   ACCESSING 

LOCATION    OF    THIS    VARIABLE    */ 
2    PARENT    POINTER,    /*    CONTAINING    STRUCTURE    OR   BLOCK    */ 
2    SIBLING   POINTER,    /*    NEXT    VARIABLE    ON    SAME    LEVEL    */ 
2    LINK    POINTER,    /*    CONTINUATION    OF    LIST    OF    DECLARATIONS 

OF    SAME       IDENTIFIER    */ 
2    IDENTIFIER   POINTER,    /*    STRUCTURE    CONTAINING   NAME    OF 

IDENTIFIER   BEING    DECLARED    */ 
2    STMTNO    FIXED    BIN,    /*    NUMBER    OF    STATEMENT    DECLARING 

THIS    VARIABLE    */ 
2    NEXTLEVEL    FIXED    BIN;    /*    LEVEL    OF    STATIC    NESTING    OF 

THIS    VARIABLE    */ 


The  name  of  the  entire  structure  is  DECL,  and  its  components 
are   ATTRIBS,  COMPLEX_ATTRIBS ,  etc.   In  designing  this  data 
structure,  space  and  ease  of  access  must  be  traded  off. 
The  problem  is  that  while  most  declarations  of  variables  are 
quite  simple,  the  most  complicated  ones  are  very  complicated 
indeed.   To  represent  all  possibilities  directly  would  lead 
to  a  very  large  structure  for  even  the  simplest  sort  of 
declaration.   The  details  of  DECL  suggest  how  the  tradeoff 
was  handled. 

Consider,  first,  the  representation  of  attributes.   The 
declaration  of  a  variable  may  attach  to  that  variable  a  variety 
of  attributes.   Some  of  them  may  be  recorded  by  a  single  bit, 
e.g.,  EXTERNAL.   Others  may  require  one  or  two  integers,  e.g., 
PRECISION.   Others  require  an  expression,  e.g.,  CHARACTER, 
though  the  obvious  representation  of  an  expression  here  is  as 
a  pointer  to  something  else.   Yet  others  require  a  complex 
ad  hoc  structure,  e.g.,  ENTRY  or  INITIAL. 

The  DECL  structure  uses  two  different  devices  for  represent- 
ing attributes.   First  there  is  a  bit  string,  ATTRIBS,  in  which 
one  bit  is  devoted  to  representing  the  presence  or  absence  of 
each  attribute.   Second,  there  is  a  pointer  COMPLEX_ATTRIBS  to 
a  linked  list  of  complex  attributes.   Each  of  these  includes 
a  flag  to  indicate  its  type,  a  pointer  to  the  next  complex 
attribute,  and  further  material  determined  by  which  attribute 
is  being  described.   There  are,  of  course,  further  data 

-20- 


structures,  not  given  here,  for  describing  each  attribute  type. 
It  is  convenient  to  include  a  bit  in  ATTRIBS  for  attributes 
that  also  appear  in  the  list  of  complex  attributes. 

For  a  declaration  with  no  complex  attributes,  then,  the 
declaration  contains  only  one  extraneous  piece  of  information: 
a  null  pointer  as  the  value  of  COMPLEX_ATTRIBS .   For  more 
elaborate  declarations  we  will  have  correspondingly  elaborate 
lists  of  complex  attributes.   Note  that  the  complex  attributes 
are  not  necessarily  mutually  exclusive.   For  instance,  a  variable 
could  have  the  BASED,  DIMENSION,  INITIAL,  and  CHARACTER  attri- 
butes simultaneously. 

A  similar  economy  issue  is  raised  by  the  fact  that  most 
PL/I  variables  are  unstructured,  but  yet  we  must  be  able  to 
represent  structured  variables.   To  represent  an  element  of  a 
general  (i.e.,  not  necessarily  binary)  tree  we  need  to  know  the 
parent,  first  child,  and  next  sibling  of  that  element   as  well 
as  any  information  associated  with  the  element  itself.   Structured 
variables  do  indeed  have  the  organization  of  a  general  tree. 
Yet  an  unstructured  variable  needs  none  of  this  tree-oriented 
information . 

We  note,  however,  that  for  every  variable  we  must  be  able 
to  identify  the  block  in  which  this  variable  is  declared.   We 
achieve  this  by  treating  the  parent  of  a  level-one  variable  (i.e., 
one  with  no  parent)  as  being  the  descriptor  of  the  declaring 
block.   Thus  we  no  longer  have  orphan  variables;  every  variable 
has  a  functional  parent,  and  the  PARENT  component  of  DECL  will 
always  be  put  to  good  use.   We  can,  of  course,  find  the  block 
associated  with  any  variable  not  at  level-one  by  tracing  up  the 
chain  of  ancestors. 

We  also  note  that  for  every  block  we  will  want  a  list  of 
variables  declared  in  that  block.   We  can  form  that  list  by 
stringing  together  all  the  level-one  variables;  any  other 
variable  can  be  found  by  tracing  downwards  through  the  trees. 
Since  a  level-one  variable  has  no  sibling,  the  SIBLING  component 
of  DECL  can  be  used  for  this  purpose.   Not  all  variables  will 


-21- 


have  siblings  under  this  interpretation,  but  most  of  them  will; 
and  that  is  sufficient  to  justify  the  use  of  space. 

How   about  the  first  child?   Unlike  the  two  other  tree  links , 
the  first  child  has  no  obvious  alternate  use  for  variables  that 
don't  have  a  first  child.   And  most  variables  do  not  in  fact 
have  children.   So  we  do  not  set  aside  a  component  of  DECL  for 
the  first  child;  rather,  we  store  this  information  as  a  complex 
attribute,  and  spend  the  space  only  when  it  is  really  necessary. 

These  examples  may  illuminate  the  nature  of  the  decisions 
that  have  to  be  made.   We  are  not  worrying  about  the  detailed 
layout  of  bits  in  storage,  but  rather  about  the  more  general 
tradeoff  of  indirection  versus  space.   If  information  is  stored 
explicitly  wherever  it  is  needed,  then  it  will  take  more  space. 
If  it  is  never  stored  explicitly  when  a  path  to  it  exists,  then 
processing  it  will  take  more  time,  and  furthermore  will  lead  to 
more  complex  programming.   Some  implications  of  this  kind  of 
decision  for  programming  languages  are  discussed  in  Section  6.1. 

5.3        Run-Time   Storage   Allocation 

Run-time  storage  allocation  is  carried  out  through  the  use 
of  a  generalized  block  allocator.   The  allocator  has  the  ability 
to  obtain  a  block  of  storage  of  given  size,  or  to . free  a  given 
block.   An  allocated  block  is  never  moved.   If  a  block  is  freed 
and  either  of  its  neighbors  is  freed,  then  the  adjacent  free 
blocks  are  combined  to  form  a  larger  block. 

Storage  may  be  required  either  as  a  consequence  of  entering 
a  procedure,  evaluating  an  expression,  or  executing  an  ALLOCATE 
statement.   The  treatment  of  ALLOCATE  statements  is  straight- 
forward; the  necessary  amount  of  storage  is  determined  and  the 
allocator  is  then  asked  for  a  block  of  that  size.    The  block 
is   retained  xintil  a  FREE  statement  for  it  is  executed. 

The  remaining  cases  involve  storage  requests  that  are  not 
explicitly  made  by  the  programmer.   We  may  classify  storage 
requests  according  to  the  time  at  which  the  size  of  the  request 
becomes  known.   This  time  may  be  at  compile  time,  at  block 
entrance  time,  or  an  expression  evaluation  time.   For  the  moment. 


-22- 


assume  that  we  use  a  pushdown  stack.   Upon  procedure  entrance, 
we  set  aside  an  amount  of  stack  space  obtained  by  adding  the 
amount  of  header  information  to  the  amount  of  storage  required 
for  variables  and  temporaries  whose  size  is  known  at  compile 
time  (excluding  STATIC  variables,  which  are  allocated  separately). 
We  then  calculate  the  amount  of  storage  needed  for  variables  and 
temporaries  whose  size  is  known  at  block  entrance  time,  and  set 
aside  that  further  storage  on  the  stack.   When  evaluating  an 
expression  we  may  need  temporary  storage  (though  not  storage 
for  variables)  where  the  amount  of  storage  needed  cannot  be 
determined  until  expression  evaluation.   Such  storage  is 
presently  not  allocated  on  the  stack,  but  rather  is  obtained 
directly  from  the  block  allocator. 

Because  of  the  requirements  of  multitasking  and  the  ALLOCATE 
statement,  a  straight  pushdown  stack  storage  allocation  regime 
is  not  feasible.   A  modified  approach,  which  I  like  to  call 
"sticky  allocation,"  has  been  used.   Space  for  the  pushdown  stack 
is  obtained  in  large  but  noncontiguous  chunks.   These  chunks  are 
linked  together.   Within  a  chunk,  allocation  proceeds  in  pushdown 
fashion.   When  a  chunk  is  exhausted,  a  new  chunk  is  obtained  from 
the  allocator.   The  size  of  the  new  chunk  is  determined  by  adding 
a  constant  to  the  amount  of  pushdown  space  needed.   When  the  stack 
retracts,  however,   a  vacant  chunk  is  not  freed  immediately. 
Instead,  a  buffer  of  one  vacant  chunk  is  retained.   Thus  even 
if  one  encotinters  a  series  of  procedure  calls  at  a  point  where 
a  chunk  is  just  filled   up,  the  overhead  of  obtaining  a  new 
chunk  from  the  allocator  can  be  avoided. 

The  difficulty  with  storage  allocation  at  present  is  that  the 
internal  organization  of  the  stack  is  unduly  complicated.  Further- 
more, all  temporary  storage  really  should  be  placed  on  the  stack. 
The  consequence  of  the  present  arrangements  is  that  prologues  and 
epilogues,  which  must  be  performed  upon  procedure  entrance  and 
exit,  are  both  complicated  and  time-consuming. 


-23- 


6 .    Conclusions 

In  this  section  I  shall  present  some  of  the  conclusions  that 
I  drew  from  this  work. 

6.1   Sources  of  Program  Complexity 

The  source  language  version  of  the  compiler  occupies  about 
6000  card  images,  and  constitutes   a  large  and  complex  program. 
It  is  fruitful  to  inquire  as  to  the  causes  of  this  complexity. 

As  a  program  evolves  from  an  idea  to  a  reality,  conventions 
and  concepts  are  tentatively  set  forth  and  then  modified.  For 
instance,  data  structures  are  designed  with  algorithms  in  mind; 
specification  of  the  algorithms  then  shows  up  flaws  in  the  data 
structures.   Thus  the  program  production  process  involves 
continual  change.   Each  decision  is  bound  by  decisions  made 
in  the  past,  and  influenced  by  the  anticipation  of  decisions 
yet  to  be  made.   An  analogy  can  be  drawn  with  filling  in  a 
crossword  puzzle.   If  you  are  filling  in  a  line  and  are  not  sure 
which  of  two  words  fits  at  that  point,  you  will  look  at  the 
cross  definition  to  make  the  more  plausible  choice.   Even  if 
you  do  not  know  the  cross  words   you  will  have  some  information 
about  them,  e.g.,  that  certain  letter  sequences  are  unlikely 
to  occur. 

Present  decisions  can  also  influence  past  ones.   When 
conceptual  flaws  are  discovered,  past  decisions  may  need  to  be 
modified.   These  modifications,  in  turn,  will  propagate  and  so 
their  consequences  must  be  accounted  for.   Thus  the  case  for 
breadth-first  programming  versus  depth-first  programming  is  not 
clear  cut.   In  either  case  one  starts  with  an  overall  goal  and 
breaks  it  up  into  a  series  of  subgoals   G, ,G_,...,G  .  With 
depth-first  programming,  one  does  all  of  G,  before  proceeding 
to  G- .   In  breadth-first  programming  one  does  G,  through  G   to 
another  level  of  detail  without  fully  expanding  any  of  them. 
In  depth-first  programming  one  is  likely  to  program  G,  without 
having  designed  many  details  of  G_ .   This  practice  has  the  dis- 
advantage of  not  anticipating  relevant  characteristics  of 
G_,...,G  .   It  has  the  advantage,  though,  that  in  creating 

G_,...,G   one  knows  all  about  G, . 
2       n  1 

-24- 


One   cause  of  complexity,  then,  is  the  propagation  of 
decisions.   It  follows  that  automation  of  this  propagation 
will  lead  to  a  reduction  of  complexity.   I  will  give  an  example 
of  how  automatic  propagation  of  decisions  has  already  been 
used  to  reduce  complexity.   The  data  structures  used  by  the 
compiler  are  specified  through  PL/I  structure  declarations. 
These  declarations  are  kept  as  so-called  "common  decks" , 
recognized  by  the  CDC  UPDATE  program.   Only  one  copy  is  kept, 
and  each  procedure  references  those  declarations  that  it  needs. 
A  change  to  this  one  copy  will  then  propagate  to  all  procedures 
that  use  the  declaration. 

The  most  frequent  situation  is  that  we  need  to  add  a 
component  to  a  data  structure.   In  this  case  the  propagation 
is  quite  automatic.   In  the  case  where  the  meaning  of  a 
component  is  changed,  no  automatic  propagation  is  possible, 
though  automatic  warning  through  some  cross-referencing  scheme 
might  still  be  possible.   This  case  is  worth  examining  in 
more  detail. 

Automatic  propagation  of  changes  of  meaning  is  best  served 
by  factoring  out  the  definition  of  meaning  from  the  use  of  that 
meaning.   I  shall  give  some  examples.   Frequently  a  variable 
can  assume  some  set   of  meaningful  values,  e.g.,  "red",  "blue", 
"yellow".   If  we  encode  these  three  values  as  1,  2,  and  3,  then 
changes  in  the  encoding  will  force  changes  in  the  programs   that 
use  it.   If  the  names  of  the  colors  themselves  are  used,  then 
the  program  becomes  independent  of  changes  of  representation. 
So  what  we  want  is  a  declarative  facility  to  associate  names 
with  colors.   Such  a  facility  is  actually  provided  in  COBOL 
and  in  Algol  68,  but  not  in  PL/I.   Another  example  has  to  do 
with  the  association  of  a  property  with  an  object.  Such  an 
association  might  be  made  either  by  storing  the  property  as 
part  of  the  object  or  by  having  the  property  computed  by  some 
function.   It  would  be  desirable  to  factor  out  this  decision 
by  using  a  functional  notation  uniformly  and  then  letting  the 
function  definition  reduce  to  either  a  subroutine  call  or  an 
appropriately  indexed  address  reference. 


-25- 


There  is  a  view  currently  popular  among  the  advocates  of 
"structured  programming"  that  all  data  structures  should  be 
hidden  by  procedure  calls.   In  this  view,  access  to  a  data 
structure  should  be  obtained  by  calling  an  appropriate  proce- 
dure.  This  approach  does  have  advantages  vis-a-vis  access 
control,  but  it  transfers  rather  than  solves  the  problem  of 
changes  of  meaning.   Data  structures  in  an  application  such 
as  compilation  are  necessarily  highly  interlinked,  and  so  the 
value  returned  by  a  procedure  will  often  have  its  meaning 
determined  by  some  underlying  data  structure.   For  instance, 
a  binary  choice,  represented  by  a  predicate,  may  turn  into  a 
three-way  choice. 

A  releated  problem  stems  naturally  from  the  general  graph 
structure  of  typical  programs,  in  which  procedures  are  called 
from  many  different  places.   It  often  turns  out  that  the 
different  calling  points  each  requires  a  slightly  different 
service  or  piece  of  information.   This  gives  rise  to  a  proce- 
dure organization  that  resembles  a  "theme  with  variations". 
In  general,  the  necessary  variations  on  the  theme  will  not  all 
be  apparent  at  the  time  when  the  procedure  is  designed.  Often 
the  multiple-use  aspects  of  such  procedures   are  not  at  all 
apparent  at  the  time  the  procedures  are  written.   Top-down 
programming  does  not  solve  the  problem  because  recursion  may 
give  rise  to  cycles  in  the  program  graph. 

An  example  of  a  theme  with  variations  is  the  procedure  LOCN , 
which  is  part  of  the  code  generator  of  the  compiler.   The  task 
of  LOCN  is  to  calculate  the  location  associated  with  a  reference 
to  a  variable.   The  reference  may  or  may  not  be  subscripted  and 
may  or  may  not  be  based  on  some  pointer.   Subscripts  may  or  may 
not  be  implicit.   For  instance,  if   A  is  an  array   the  expres- 
sion  "ADDR(A)"   should  retrieve  the  location   of  the  array 
itself,  but  the  statement   "A=l;"   should  retrieve  individual 
elements  of  A  chosen  by  implicit  subscripts.   Thus  the  use  of 
LOCN  depends  on  the  context.   There  are  two  different  entry 
points  to  cover  the  two  cases,  plus  a  third  entry  point  to 
cover  the  case  where  an  offset  from  a  base  is  desired  rather 


-26- 


than  an  actual  location.  All  three  cases  share  most  of  the  same 
code;  but  there  are  various  internal  case-dependent  choices  that 
give  rise  to  the  organization  of  the  theme  with  variations. 


6.2    Operating  Systems  Support 

During  the  development  of  the  compiler  I  used  two  different 
operating  systems:   OS/360  on  the  IBM  360  and  Scope  3.2  on  the 
CDC  6600.   The  support  offered  by  these  systems  became  a 
critical  issue  during  this  development.   Two  quite  different 
kinds  of  support  were  involved:   support  of  the  programming 
process  itself,  and  run-time  facilities  available  to  the  compiler. 

The  PL/I  compiler  exists  in  three  different  forms:   as  a 
collection  of  symbolic  source  programs,  as  a  collection  of 
compiled  load  modules ,  and  as  a  collection  of  absolute  over- 
lays .   All  three  forms  require  maintenance,  but  the  symbolic 
one  is  the  most  critical.   File  maintenance  was  a  very  time- 
consuming  operation,  particularly  under  OS/360  where  the 
facilities  are  grossly  inadequate.   lEBUPDTE ,  the  symbolic 
updating  program  under  OS/360,  is  difficult  to  use  because  of 
unclear  specifications,  arbitrary  restrictions,  and  an  error- 
prone  input  form.   Under  Scope  3.2,  on  the  other  hand,  I  was 
able  to  carry  out  file  maintenance  expeditiously  and  with  a 
minimum  of  fuss.   My  success  was  due  to  the  combined  use  of 
the  permanent  file  system  for  saving  programs,   the  CDC 
symbolic  update  program  UPDATE,  and  the  NYU  interactive  editor 
designed  by  Richard  Reich.   Under  both  operating  systems  I  found 
that  it  took  some  time  to  develop  adequate  file  maintenance 
procedures,  but  that  once  these  procedures  were  developed  their 
use  saved  a  great  deal  of  time. 

Adequate  editing  facilities  are  essential  in  the  reduction 
of  program  complexity.   There  seems  to  be  no  substitute  for 
interactive  editing  in  this  matter.   The  evolutionary  nature 
of  complex  programs  assures  that  reorganizations  will  be  neces- 
sary, and  during  these  reorganizations  one  wants  to  be  able  to 
move  blocks  of  text  from  one  place  to  another  as  well  as  to 


-27- 


carry  out  various  global  substitutions.   The  ability  to  move 
large  text  blocks  is  a  facility  often  lacking  in  editors. 
Reformatting  is  also  important,  though  this  may  not  be  an 
operating  system  function  because  of  its  language  dependence. 

A  good  measure  of  the  adequacy  of  programming  support  has 
to  do  with  the  proportion  of  runs  that  are  lost  because  of 
errors  outside  the  program  being  debugged.   Such  errors  include 
job  control  language , problems ,  canpiler  errors,  updating 
problems,  and  machine  troubles.   As  I  kept  no  records,  my 
impressions  must  be  subjective;  but  I  would  estimate  that 
overall  I  lost  perhaps  two-thirds  of  my  runs  on  the  360  for 
system- related  reasons  and  perhaps  one-tenth  of  my  runs  on 
the  6600  for  system-related  reasons.   The  comparison  is  somewhat 
unfair  to  the  360  because  much  of  my  work  on  the  6600  was  done 
interactively;   nevertheless,  "battling  the  system"  was  a  major 
part  of  my  effort  in  getting  the  bootstrap  going. 

Neither  the  360  nor  the  6600  provided  adequate  debugging 
facilities.   On  the  360  certain  facilities  were  built  into 
the  compiler  itself,  e.g.,  tlie  CHECK  condition,  but  their  use 
required  modification  of  the  source  program;  and  lEBUPDTE  pro- 
vided no  systematic  way  of  removing  these  changes  once  made. 
Debugging  in  machine  language  on  the  360  was  impossible  since 
I  did  not  have  the  requisite  knowledge  of  the  IBM  compiler  or 
of  OS/360.   On  the  6600  I  used  a  machine-language  interactive 
debugger  developed  by  Peter  Maclean  [4] ,  which  was  the  best 
that  could  be  hoped  for  under  the  circumstances.   Some  PL/I 
source  language  debugging  facilities  that  don't  require 
recompilation  are  presently  being  developed  by  Janet  Fabri  as 
a  master's  thesis. 

The  compiler  required  run-time  support  for  overlaying.   The 
linkage  editor  on  the  360  provided  better  support  for  over- 
laying than  any  corresponding  facility  under  Scope  3.2  on  the 
6600.   The  compiler  is  now  organized  into  overlays,  but  this 
had  to  be  done  in  a  machine-dependent  manner.   The  loader  under 
the  Scope  3.4  operating  system  for  the  6600  does  have  facilities 
as  good  as  those  of  the  360  linkage  editor. 


-28- 


Another  area  where  run-time  support  is  needed  is  in  error 
recovery.   The  facilities  on  the  360  for  error  recovery  are 
superior  to  those  on  the  6600,  though  some  improvement  can 
be  expected  on  the  6600  when   Scope  3.4  comes  into  operation. 
On  the  360,  it  is  possible  for  the  program  to  regain  control 
from  machine  interrupts  such  as  overflow  or  reference  out  of 
bounds.   On  the  6600   control  is  taken  away  from  the  program 
under  such  circumstances.   Neither  machine  provides  program 
recovery  from  time  limit,  though  a  restricted  facility  for 
time  limit  recovery  is  provided  by  Scope  3.4  in  the  form  of 
"reprieves"  for  various  errors.   It  would  be  very  helpful  if 
the  programmer  could  specify  a  local  time  limit,  not  to  exceed 
his  global  time  limit,  and  be  interrupted  when  this  time  limit 
was  reached.   With  such  a  facility  the  programmer  could  use 
the  time  limit  for  resource  allocation,  and  could  perform  as 
much  processing  as  he  wished  to  following  the  exhaustion  of  a 
local  time  limit.   Since  the  global  time  limit  would  still  be 
inviolate  there  would  not  be  any  adverse  impact  on  job 
scheduling. 

Run-time  support  is  also  needed  for  transput  and  for  the 
mathematical  library.   As  these  facilities  in  CIMS  PL/I  are 
very  limited,  I  did  not  have  much  experience  with  the  adequacy 
of  Scope  3.2  in  this  regard.   The  issue  did  not  arise  in  my 
work  under  OS/360. 

6 . 3    Virtual  Memory 

My  experience  suggests  some  compelling  reasons  why  compilers 
can  function  more  effectively  in  a  large  address  space  such  as 
that  usually  provided  by  a  virtual  memory.   Various  decisions 
about  compiler  design  are  dependent  on  the  availability  of  address 
space.   The  situation  shows  up  in  two  areas:   program  structure 
and  data  structure.   A  small  address  space  dictates  the  use  of 
overlays  in  program  organization,  and  can  lead  to  compilers 
with  a  great  number  of  specialized  phases,  e.g.,  the  IBM 
F-compiler.   The  resulting  program  organization  is  quite 
complicated  and  furthermore  generates  a  great  deal  of  transput 


-29- 


activity.  With  a  large  address  space  one  can  ignore  the  over- 
laying problem.   If  only  a  small  physical  memory  is  available, 
then  the  paging  processes  of  the  operating  system  will  produce 
the  effects  of  overlaying  without  the  corresponding  complica- 
tions in  the  program.   Paging  will  still  require  transput 
activity,  but  that  activity  would  be  necessary  in  any  case. 
If  physical  memory  is  large,  then  the  transput  activity  will 
be  avoided  since  little  paging  will  be  necessary.   Thus  a 
single  program  organization  can  function  optimally  over  a  wide 
range  of  availability  of  physical  memory.   Good  programming 
practices  should  still  be  followed   to  preserve  locality  of 
reference.   Some  experiences  with  the  MULTICS  PL/I  compiler, 
though,  suggest  that  some  phase-oriented  organization  may  be 
desirable  even  in  a   virtual  memory  environment. 

The  question  of  syllable-oriented  intermediate  representa- 
tion versus  pointer-oriented  intermediate  representation  is 
much  less  vexing  in  a  large  address  space,  since  one  can  use 
that  representation  which  is  the  most  natural  at  each  point 
(cf.  Sec.  5.2.1).    The  entire  intermediate  representation 
becomes  addressable,  and  thus  the  problems  of  code  motion 
alluded  to  previously  are  no  problem  at  all.   The  virtual 
memory  lends  itself  naturally  to  coping  with  a  wide  range  of 
source  program  sizes;  the  amount  of  paging  activity  becomes 
proportional  to  the  program  size.   For  short  programs  the 
intermediate  representation  often  will  not  be  paged  out. 

6.4    Reliability 

In  order  to  maintain  the  reliability  of  the  compiler  I 
developed  a  series  of  acceptance  tests.   The  most  stringent 
test  is  to  have  the  compiler  compile  itself  as  described  in 
Section  3;  success  at  this  point  almost  certainly  implies  that 
any  later  errors  found  in  the  compiler  can  be  fixed.   There  is 
a  danger  that  must  be  avoided:  if  an  error  in  the  compiler  makes 
it  impossible  to  recompile  the  code  section  containing  the  error, 
then  there  is  no  way  to  escape  except  by  returning  to  a  previous 
safe  version.   Since  it  is  expensive  to  recompile  the  entire 


-30- 


compiler  I  have  done  it  only  about  every  two  months.   Because 
of  the  range  of  language  facilities  used  by  the  compiler  itself, 
the  self-compilation  test  has  turned  out  to  be  quite  effective. 

Whenever  any  change  at  all  is  made  to  the  compiler  or  to  its 
run-time  library,  I  check  it  by  running  a  standard  acceptance 
test.   The  first  version  of  the  acceptance  test  was  designed  to 
run  through  all  code  paths  in  the  compiler  except  for  those 
associated  with  errors.   Thereafter,  whenever  any  error  was 
discovered  either  in  the  compiler  or  in  its  library,  a  further 
test  case  was  added  to  cover  that  error.   Thus  the  acceptance 
test  provides  warning  if  any  previously  found  error  recurs. 
Not  surprisingly,  most  of  the  errors  that  have  shown  up  occurred 
when  the  compiler  aborted  as  a  result  of  a  programmer  error. 

In  the  design  of  an  acceptance  test,  the  output  should  be 
as  short  as  possible.  Only  incorrect  results  should  produce 
messages.  Thus  one  need  not  scan  a  long  output  to  see  if  it 
is  the  same  as  the  previous  one,  a  process  which  is  prone  to 
oversights.  The  present  acceptance  test  produces  only  one 
line  of  diagnostic  output  if  the  compiler  is  working  properly, 
though  it  does  produce  a  listing  of  the  testing  program.  Any 
malfunction  is  evident  at  a  glance. 

6.5    Project  Organization 

Up  to  the  completion  of  the  first  bootstrap,  the  compiler  was 
essentially  a  one-person  project.   The  obvious  disadvantage  of 
such  an  organization  is  the  limit  on  how  much  code  one  person 
can  produce.   A  more  subtle  disadvantage  is  the  lack  of  inter- 
action among  co-workers  that  often  leads  to  improvements  in 
de  s  i  gn . 

On  the  other  side,  one-person  programming  seems  to  avoid  most 
of  the  interface  difficulties  that  plague  multi-programmer 
projects.   It  eliminates  the  misunderstandings  that  often  make 
it  hard  for  modules  written  by  different  people  to  communicate 
with  one  another.   The  "chief  programmer  team"  approach  suggested 
by  Mills  and   described  by  Baker  [2]  seems  to  preserve  the  spirit 
of  one-person  programming  while  eliminating  many  of  its 


-31- 


disadvantages.   My  experience  suggests  that  additional  help 
can  be  best  utilized  when  two  criteria  are  satisfied: 

(1)  the  task  to  be  done  is  unambiguously  defined,   and 

(2)  the  task  to  be  done  is  not  on  the  critical  path. 

The  best  way  to  define  a  programming  task  unambiguously  is 
to  pose  it  in  the  form  of  an  addition  to  a  program  that  is 
already  working.   The  existing  program  then  defines  the  neces- 
sary interfaces.   Additional  help  is  often  useful,  too,  in 
gathering  information.   For  instance,  operating  systems 
facilities  are  often  poorly  documented,  and  obtaining  informa- 
tion about  them  is  a  task  that  can  usefully  be  farmed  out. 

6.6    PL/I  as  a  Systems  Programming  Language 

One  of  the  original  design  goals  of  PL/I  was  to  provide  an 
adequate  language  for  systems  programming.   It  has  been  used  as 
the  systems  programming  language  at  MULTICS  [3] .   Experience  with 
the  compiler  suggests  that  while  PL/I  is  probably  better  than 
most  other  languages  around  for  that  purpose,  it  is  far  from  ideal, 

The  strongest  argument  for  PL/I  is  the  wide  range  of 
facilities  that  it  provides.   The  description  of  CIMS  PL/I 
indicates  the  range  of  facilities  that  were  actually  used  in 
writing  it.   Some  examples  are:   different  forms  of  storage 
allocation,  pointer  manipulation,  bit  strings  and  character 
strings  and  the  associated  built-in  functions,  structured 
variables  (particularly  valuable) ,  block  structure  and  recur- 
sion, and  flexible  forms  of  iteration.   Another  argiament  for 
PL/I  is  that  usually  one  can   accomplish  much  by  writing  down 
little;  the  language  has  great  expressive  power. 

There  are,  of  course,  failings  of  PL/I.   The  pointer  structure 
is  such  that  it  is  impossible  to  write  any  reasonable  form  of 
garbage  collector;  the  list  processing  facilities  require  far 
too  much  of  the  user.   The  iteration  structure,  while  flexible, 
still  does  not  allow  some  more  recently  suggested  concepts  that 
lead  to  cleaner  programming,  e.g.,  specifying  a  loop  as  a  cycle 
and  placing  an  escape  clause  at  an  arbitrary  place  within  it. 


-32- 


structuring  should  be  associated  with  data  types  rather  than 
with  variables;  the  approach  of  Algol  68  in  this  area  seems  far 
more  sensible.   It  would  be  desirable,  in  fact,  to  use  a 
functional  representation  for  structures  and  then  to  be  able 
to  define  the  function  later  as  being  either  computed  by  some 
procedure  or  found   merely  by  address  manipulation.   The  PL/I 
on-conditions  seem  unnecessarily  general,  and  often  do  not 
provide  simple  but  desirable  effects.   For  instance,  storage 
allocation  in  PL/I  can  be  carried  out  in  specified  areas. 
When  an  area  is  exhausted,  an  on-condition  is  raised  to   signal 
this  fact.   However,  there  is  no  way  to  determine  which  area 
was  the  one  that  was  exhausted,  except  through  knowledge  of  the 
point  of  execution  of  the  program.   The  obvious  solution,  writing 
an  allocation  function,  does  not  work  since  although  one  can 
pass  allocated  storage   to  functions ,  one  cannot  pass  data 
descriptors  to  them  in  any  reasonable  way. 

Efficiency  is  certainly  an  issue,  but  the  significance  of 
it  is  not  yet  known.   It  seems  clear  that  PL/I  compilers  can  be 
far  more  optimized  than  they  have  been  so  far.   My  own  view  is 
the  inefficiency  in  generated  code  can  be  brought  to  a  suffici- 
ently low  level  so  that  it  is  not  a  major  consideration. 

I  do  not  regard  the  size  of  PL/I  as  a  disadvantage  for 
systems  programming.   A  reasonable  implementation  should  not 
include  run-time  support  for  a  given  program  beyond  that  which 
it  actually  requires.   Thus  a  program  is  not  penalized  for  those 
facilities  that  it  does  not  use. 

6.7    Design  Flaws 

Naturally,  experience  in  coding  and  using  the  compiler  has 
shwon  up  a  number  of  flaws  in  the  original  design.   It  was 
possible  to  correct  some  of  them  without  great  difficulty,  but 
others  will  require  extensive  rewriting. 

To  start  with  the  bright  side,  I  will  describe  some  of  the 
flaws  that  have  been  discovered  and  corrected.  The  inability 
to  detect  end-of-file  and  to  rewind  files  was  a  serious  lack  in 


-33- 


the  original  version.   This  lack  was  remedied  by  adding 
various  on-conditions ,  including  ENDFILE,  and  also  by  adding 
an  environment  option  that  allows  files  to  be  opened  or  closed 
with  or  without  rewind.   The  original  version  was  unable  to 
compile  more  than  one  external  procedure  on  a  single  call. 
Correction  of  this  flaw  required  the  insertion  of  some  control 
mechanisms  that  could  not  be  written  in  PL/I  because  of  their 
extralingual  manipulation  of  the  storage  allocation  state. 
In  the  original  version,  error  diagnostics  were  separated  from 
the  procedures  to  which  they  referred,  and  this  situation  made 
it  difficult  to  correlate  error  complaints  with  statements. 
Finally,  a  small  change  was  made  so  that  one  can  determine 
the  statement  being  executed  at  the  time  of  an  error.   This 
is  done  by  setting  machine  register  AO  to  the  current  statement 
number  at  the  beginning  of  execution  of  each  statement.  (The 
user  can  turn  off  this  mode  of  compilation  if  he  wishes.)  Since 
the  standard  dump  produced  on  an  error  gives  the  value  of  AO , 
it  is  not  difficult  to  isolate  the  offending  statement. 

Serious  flaws,  of  course,  remain.   The  worst  of  them  is  that 
the  code  generator  directly  produces  register-dependent  code. 
Keeping  track  of  registers  has  led  to  great  complexity. 
The  register-dependence  of  the  code  also  makes  optimization 
very  difficult  and  inhibits  adding  a  number  of  additional 
language  features.   An  unexpected  problem  was  that  the  heavy 
use  of  macros  and  micros  caused  the  6600  assembler  to  use  an 
excessive  amount  of  time.   It  was  surprising  to  discover  that 
a  compilation  actually  takes  less  time  than  assembly  of  the 
resulting  code.   The  right  approach  seems  to  be  to  produce 
some  intermediate  code  form  that  is  register-independent  and 
perhaps  even  machine-independent,  and  to  add  a  further  pass 
to  convert  this  code  to  register-dependent  instructions  and 
macro  instructions,  with  expansions  kept  as  simple  as  possible. 

Another  problem  is  the  constraints  imposed  upon  the  parsing 
procedure  by  SYNDIPAR.   Because  the  parse  must  be  written  as 
one  enormous  procedure,  it  is  not  possible  to  use  recursion 


-34- 


within  the  semantic  routines  of  the  parser.   The  result  is 
that  the  parser  is  unnecessarily  complex,  and  also  uses 
various  tables  whose  sizes  cannot  be  intelligently  determined 
in  advance.   The  difficulty  is  not  fundamental,  and  could  be 
cured  by  changing  the  conventions  for  calling  semantic  routines, 

There  are  also  problems  in  the  way  that  pushdown  stack 
storage  is  handled.  These  problems  are  discussed  briefly 
in  Section  5.3. 


-35- 


References 


1.  Abrahams,  P.,   A  syntax-directed  parser  for  recalcitrant 
arammars.   Intern.  J.  Computer  Math.,  Sec.  A,  Vol.  3,  1972, 
pp.   105-115. 

2.  Baker,  F.  T. ,   Chief  programmer  team  management  of 
production  programming.   IBM  Systems  J.,  Vol.  11,  No.  1, 
1972,  pp.  56-73. 

3.  Corbato,  F.  J.,  PL/I  as  a  tool  for  systems  programming. 
Datamation,  May  1969,  pp.  68-76. 

4.  Maclean,  Peter,   An  interactive  debugging  aid  for  the 

CDC  6600,  M.S.  Thesis,  Computer  Sci.  Dept.,  New  York  Univ., 
February  19  73. 

5.  PL/I    BASIS/I-9.      European   Computer  Manufacturers   Association 
and  American   National   Standards    Institute,    December   1972. 

6.  Control  Data  6400/6500/6600  Computer  Systems  -  SCOPE 
Reference  Manual,  Version  3.2.   Publication  No.  60189400, 
Control  Data  Corp.,   Palo  Alto,  Cal . 

7.  Control  Data   Cyber  70  Computer  Systems  -  SCOPE  Reference 
Manual,  Version  3.4.   Publication  No.  60307200,  Control  Data 
Corp.,  Palo  Alto,  Cal. 


-36- 


Appendix  A 
CIMS  PL/I  Users  Manual 


-37- 


PL/  I 
PL/1 
PL/  1 
PL/  1 
PL/  I 
PL/1 
PL/I 
PL/I 
PL/1 
PL/1 
PL/1 
PL/I 
PL/  1 
PL/1 
PL/1 
PL/1 
PL/1 
PL/1 
PL/I 
PL/1 
PL/I 
PL/  I 
PL/I 
PL/  1 
PL/1 
PL/I 
PL/  1 
PL/1 
PL/1 
PL/1 
PL/I 
PL/I 
PL/1 
PL/1 
PL/I 
PL/1 
PL/I 
PL/  I 
PL/1 
PL/I 
PL/I 
PL/  I 
PL/I 
PL/  1 
PL/I 
PL/  1 
PL/  1 
PL/I 
PL/I 
PL/I 
PL/1 
PL/i 
PL/i 
PL/1 
PL/  I 
PL/i 
PL/I 
PL/I 
PL/  I 
PL/i 


PI  /I  PL /I  PL /I  PL /I  PL /I  PL /I  PL /I  PL/ I  PL  I  PL/ I  Pi  /  I  PL/ 1  PL/ I  PL/ I  Pi.  /  I  PL/ T  PL/ 
P|/IPl/IPL/IPL/IPL/Ipl/IPL/IPL/IPLIPL/IPi/1PiVIPL/IPl/IPL/1PL/?PL/ 


xxxxxxxxxxx 

X  X 

X 

X 

X 

X 

X 

X 

X 

X  X 

xxxxxxxxxxx 


xxxxxxxxxxxx 

X  X 

X  X 

X  X 

X  X 

xxxxxxxxxxxx 

X 
X 
X 
X 
X 


VXXXXXXXX 
X 
X 
X 
X 
X 
X 
X 
X 

y 

XXXXXXXXX 


Y 
V 

y 

X 

y 
y 

X 
X 
X 
X 
XXXXXXXXXXXX 


X  X 

XX  XX 

XX  XX 

XX  XX 

X       X       y       X 

X  XX  X 

XXX 
X  X 

X  X 

X  X 

X  X 


xxxxxxxxxyy 

X  X 

X 

i 

y 
xxxxxxxxxxx 

X 
X 
X 

X  X 

xxxxxxxxxv 


xxxxxxxxy 

X 
X 
X 
X 
X 
X 
X 
X 
X 

xxxxxxxxy 


author:  PAUL  ARRAHAMS 


PL/IPL/IPL/IPL/IPL/IPL/IPL/IPL/IPLIPL/IPI/IPL/IPL/IPL/IPI 
PL/IPl/IPL/IPL/IPL/IpL/IPL/IPL/IPLIPL/IPI  /IPI  /IPL/IPL/IPI 

-38- 


/IPL/TPL/ 
/IPL/IPL/ 


IPL/IPL/I 
IPL/IPL/I 
PL/I 
PL/I 
PL/I 
PL/I 
PL/I 
PL/I 
PL/I 
PL/I 
PU/I 
PL/I 
PL/I 
PL/I 
PL/I 
PL/I 
PL/I 
PL/I 
PL/I 
PL/I 
PL/I 
PL/I 
PL/I 
PL/I 
PL/I 
PL/I 
PL/I 
PL/I 
PL/I 
PL/I 
PL/I 
Pl/I 
PL/I 
PL/I 
PL/I 
PL/I 
PL/I 
PL/I 
PL/I 
PL/I 
PL/I 
PL/I 
PL/I 
PL/I 
PL/I 
PL/I 
PL/I 
PL/I 
PL/I 
PL/I 
PL/I 
PL/I 
PL/I 
PL/I 
PL/I 
PL/I 
PL/I 
PL/I 
IPL/IPL/I 
IPL/IPL/I 


****************************** 

♦  « 

♦  CHANGES  AMD  MEw  FPATUr^ES   * 

♦  *♦♦***♦*••#♦*♦♦♦•♦♦*«♦♦**♦*•#♦♦ 

A,  4/2/7!^ 

1,  THE  INPUT-OUTPUT  FACILITIES  HAVE  BEEN  GPFATLY  IMPROVPD.   T^E 
CMANGfcS  INCLUDE  THC  ABILITY  TO  REWIND  FILES,  THE  RFMHVAL  OF  THF 
RtSTRICTlON  THAT  PERMANENT  FILES  CAKnQT  t^E  Rf^AD  illRErTLY.  anP 
THE  AVAILABILITY  Or  THF  FNDFILE  AND  ilNDEF  I  NE 'IF  I  Lf^  CQ^iDITlONS. 
StE  SfcCTIONS  2,1.7.  ?,3.5,  2.3,16,  anD  2.8  FIR  DRTAIlS. 

2,  THE  ON-STATEMENT  IS  MQw  AVAILABLE.   SEE  SEC.  2.3.14  FOR 
ntTAlLS, 

3,  THE  ALLOCATE  STATEMENT  NOW  ALLOWS  MULTIPLE  ALLOC  AT  I  O^iS . 

B,  4/23/74 

1,  THE  COMPILER  NOW  EVPFCTS  ITS  INPUT  ON  THE  FILE  -INPUT-  AND  PRODUCES 
THE  SOURCE  LISTING  ON  THE  FJLE  -OUTPUT-  (CF.  SEC5.  3.2»  3,!'). 

2,  THE  FILE  -ERRORS-  TS  NO  LONGER  USED. 

3,  FILE  SUbSTITUTICN  ^OW  RECOGNIZES  NULL  FILES  (CF,  SEC.  3.4,1). 
r,  6/15/7J 

1.  COMPILATIONS  MAY  Nnw  BE  PATCHED,  SO  THAT  ANY  NUM ^ER  ^F  EXTFRNAL  PRO- 

cfcCuRfcs  CAN  PE  Compiled  with  a  single  Control  ca^d.  as  thf  compiled 

COCE  is  now  in  a  SiNGLf'  RECORD,  THE  RECM^^G  STEP  IS  ALSO  ELIMINATED, 
THESE  CHANGES  MAkE  THE  CONTROL  CARDS  MUCH  SliPLEf!.   FURTHERMORE,  THE 
FILE  -TABLES-  IS  No  LOMQFR  NFEDED,  AS  IT  HAS  BEEN  INCORPORATED  INTO 
THE  COMPILER  PRCpER.  SEE  SEC.  3,2  poR  DETAIL^. 

2.  AUTOMATIC  FIELD  LEvGTH  MANAGFMENT  HAS  BEEN  ADDED  Tn  THE  COnPILER.   A 
COMPILATION  LISTING,  NOW  INDICATES  THE  FIFLD  LENGTH  RPQUIREr  FOR 
CUMPILATION. 

3.  A  DEFAULT  STaTEMFNT  HAS  REEN  ADDED.   SEE  APPENDIX  A  FOR  DETAIL^, 

4.  A  CROSSpREFERENCF  "aP  CAM  MOW  BE  OBTAINED.   ^EE  ^EC.  3.3  FTP  DETAILS, 

5.  DUE  TO  A  BUG  THAT  IS  RATHER  DIFFICULT  TO  FIX  AT  THE  '^OMENT.  ON-UNITS 
CANNOT  DECLARE  ANY  LOCAL  AUTOMATIC  VARIABLES.   T'1  RY^ASS  TUS 
RbSTRICTION,  WRITE  THE  ON-IJNiT  AS  A  PROCEDURE  AND  CALL  IT, 

D.  6/27/73 


1,  A  CE^UGmNG  OPTION  HAS  BEEN  ADDED  TO  ALLOW  DETERii  I  NAT  I  ON  OF  THE 

SUTEMENT  BEING  FXpcOTED  AT  THE  TjmE  OF  A  PR  IGf^A'^l  AB^RT,   ^EE  SecS, 
3.3,2  AND  4  FOR  DETAILS. 

-39- 


1,  INTROnuCTION 

2,  LANGUAGE  FEATURES 


TA«L(^    OF    CONTEf-iTS 


3, 


2.1  ATTRIBUTES 

2. 

1.1  SCALAR  DATA  TYPES 

2,1,1.1  FIXED 

2.1,1.2  FLOAT 

2,1,1,3  chARAcTPR 

i2,l,l,4  BIT 

2,1,1,5  POINTER 

2,1,1,6  AREA 

2,1.1,7  LABEL 

2,1.1.8  ENTRY 

?., 

1,2  AGGREGATICN 

?.. 

1,3  STORAGE  CLASSE*^ 

?.. 

1,4  SCOPE  ATTRIBUTES 

2. 

>1,5  INITIAL  ATTRIBUTE 

2. 

1,6  ENTRY  ATTRIBUTE 

2. 

>1,7  FILE  ATTRIBUTES 

2. 

1.8  BUILTIN  ATTRIBUTE 

2, 

.1.9  LIKE  ATTRIBUTE 

2,2  ATTRIBUTE  DETERMINATION 

2,3  STATEMENT  TYPES 

2, 

,3,1  ASSIGNMENT  STATEMENT 

2. 

,3,2  ALLOCATE  STATEMENT 

2. 

,3.3  BEGIN  STATEMENT 

2, 

.3,4  CALL  STATEMENT 

2 

.3,5  CLOSE  STaTfmpnT 

2 

.3,6  DECLARE  STATEMENT 

2 

.3,7  DEFAULT  STATEMENT 

2 

.3,0  DO  STATEMENT 

2 

.3,9  END  STATEMENT 

2 

.3,10  ENTRY  STATE^EMT 

2 

.3,11  FREE  STaTEMpnT 

2 

.3,12  GO  TO  STATE»^ENT 

2 

.3,13  |F  STATEh-FNT 

2 

.3,14  NULL  STATEMPNT 

2 

.3,15  ON  STATEMENT 

2 

.3,16  OPEN  STATEMPNT 

2 

.3,17  PROCEDURE  STATEMENT 

2 

.3,16  READ  STATEMENT 

2 

.3,19  RETURN  STATEMENT 

2 

.3.20  STOP  STATEMPNT 

2 

.3,21  WRITE  STATE'^ENT 

2.4 

tXPRESSIONS 

2.5 

rVPE  CONVERSION 

2.6 

KLNCTIUN  CALLS  ANn  PARAMETER 

2,7 

bllLTlN  FUNCTIONS 

2.8 

INPUT-OUTPUT 

2,9 

CHARACTER  SET 

2. in 

KEY  WORDS  AND  pHbaSES 

OPfcWATlNG  ENVIRONMENT 

PASSAGE 


3.1  INPUT  FORMAT 

3.2  vJCnTHql  card  SETUP 


-40- 


3.3  CCMPILER  OPTIONS 
3'3.1  XREF 

3.3.2  STiNST 

3.4  PROGRAM  CALLING  CarH 

3.4.1  FILE  NAMF  «;UPST  I  TUT  I  ON 

3.4.2  PASSING  AN  APQUMFNT  TO  A  MAIN  PROCEDURE 


4,  ERROR  MESSAGES  AND  CERUnCING  TECHNIQUES 


5,  AREAS  OF  I  NCQI^PA  T  I  B  IL  t  T  V 

5.1  tXTENSIONS  IN  CiMS  PL/I 

5.2  INCOMPATIBILITIPS  BETWEEN  CTMS  Pl/I  AND  I3M  PL/f 

6,  FUTURE  FLANS 


7,  ACKNOWLEDGMENTS 


A.  THE  DEMULT  STATEMENT 


APPENniCES 


-41- 


1,  INJTROnuCTION 

AN  EXFERIMbMAL  PL/I  rOwPlLFR  K^OWN  AS  CIMS  PL/I  IS  NOW  AVAILABLE  ON  THE 

66G0  AT  THE  AbC  COMPUTlNn  rE'vlTER.   I  WELCOME  ANY  QdEjTIO'iS  0°  CQMMFNTS  ON 
ThF  USE  Ot-  THt  COMPILER,  AvD  PARTlCULARLV  AKY  REPO'^To  OF  BUGS  OR  SlGPESTIONS 
FOR  IMPROVEMENTS, 

SINCfc  CIMS  PL/I  IS  STtlL  UNHFR  nEVELOPMENT,  YOU  S^O  JLD  'iQT  GIVE  UP  ON 

USING  IT  IF  YOUR  INITIAL  EXPERIENCE  WITH  IT  TS  LNFAVORABLEJ  i^AlT  A  FEW  WEEKS 
OH  hONTHS  AND  IT  WILL  BE  8FTTER.   CHECK  THE  WRITEUP  FREQiJENTLY  FOR  CHANGE 
NOTICES. 

THE  SUBSET  THAT  I  hiVc  IMPLEMENTED  IS  CRiEt'TED  TQWA'-JDS  rOMRlLFR 
WRITING!  IN  PARTICULAR,  IT  IS  THAT  SURSET  THAT  WaS  NECESSARY  IN  ORfER  TO 
WRITE  THE  COMPILER  IN  PL/I.   SOME  ESTIMATE  OF  THE  RELIABILITY  OF  ThE  COMPILER 
MAY  BE  GLtANEtJ  FROM  THE  FACT  fWAT  IT  IS  CAPABLE  OF  COMPILING  ITSELF;  THE 
COMPILER  ►'f^ESfcNTLY  REQUIRES  ABOUT  6000  PL/I  SOURcE  STaTE^E^JTS,   Thf  LACK  OF 
FLOATING  r-ClNT  INPUT-OUTPUT  IS  PRORARLV  THE  MAJOR  TIFFICJLTY  IN  USING  THE 
CURRENT  VtRSION  CF  THE  COMPILER  FOR  NUMERICAL  PROBLEMS,  THOUGH  FLOATING  POINT 
IS  AVAlLAdLE  INTERNALLY  FOR  COMPUTATION. 

THE  LANGUAGE  ACCEPTED  BY  CIMS  PL/I  IS  KoT  COMPLETELY  CO'^PATIBLE  WITH 
THAT  ACCEHTED  BY  ANY  OF  THc  IBM  COMPILERS.   nlFFPRENCES  ARE  OF  TWO  KINDS! 
CONSTRUCTS  THAT  ARE  IMPL^MPNTED  IN  ONE  BUT  NOT  TWE  OTHER,  ANO  CONSTRUCTS  THAT 
flEHAVE  DIFFERENTLY,  OR  REQUIRE  DIFFERENT  SYNTAX,  IN  THE  VARIOUS  COi'PlLPRS, 
THIS  LAST  SET  IS  NOT  LARGE.   INCOMPATIBILITIES  ARE  DETAILED  IN  SEC.  5, 


•42- 


?.  LANGUAGE  FEATURES 

IN  This  SECTION  I  SMALL  DESCRIBF  THAT  PORTION 
OF  PL/I  PHESENTLY  IMPLEKFNTEH  IN  CIH*!  PL/I.    A  KNOWLFDGF  OF  PL/I  IS  NFCFSSARY 
IN  ORDER  fC  UNDERSTAND  TmI9  MATERIAL.   YOU  SHOULH  ASSUME  THAT  ANYTUNG  NOT 
MfcNTIONEn  IS  NOT  IMPLEmEmTFD. 

2.1  ATTRIdLTES 


2,1.1  SCALAR  DATA  TYPES 
2,1.1.1  FIXED 


BASF  AND  PRECISION  ATTRIBUTES  Ai'E  ACCEPTED  ^^UT  IGiJORED, 
ALL  FIXED  CATA  riEHAVE  LIkE  FORTRAN  INTEGERS. 


2.1.1.2  FLCAT 


THE  PRECISION  ATTRlnUTE  IS  IGNORED.   NO  I/O  FACILITIES  ^xiST  FOP  FLOATING 
NUMBERS,  NCH  MAY  FLOATING  CONSTANTS  APPEAR  IN  PROGRAIS.   iMpiIClT  TYPE 
CONVERSIONS  HETikEEn  FiXEn  AND  FLOAT  HO  EXIST'.  ANH  ARE  THP  ONLY  WAY  OF  OBTAINING 

OK  RETRIEVING  Float  data. 


2.1.1-3  CHARACTER 

CHARACTER  UATA  IS  FULl.Y  IMPlEmEnTEDi  AND  MAY  9E  VAPYIMQ 
STRING  Coi\STAnTS  ArE  LIMT^D  TO  lOO  CHArACtErS  ImClU^ING  T^E 
VARIABLES  ARE  NUT  LiMlTEn, 

2.1.1.4  BIT 


or  not.  character 
apostrophFSi  But 


ONLY  NQNVARYING  BIT  STRINGS  ARb  AVAILAElE.   THE^E  MiJST  MAVE  CTNSTANT 
LENGTH  (I.E,,  NOT  GIVEN  BY  AN  FXPRESSION).   THE  LENGTH  OF  A  «IT  STRING  MUST 
ALWAYS  BE  SPECIFED  BY  A  DECIMAL  INTEGER  CONSTANT  NOT  EXCEEnjMG  60, 


2,1.1.5  POINTER 


POINTERS  MAY  ONLY  PO^'T  TO  WORDS,  NOT  TO  BITS, 


2.1.1.6  AREA 


ALLOCATIONS  MAY  BE  DOmE  IN  AREAS*  BUT  AREAS  MAY  NOT  BE  aSSIGNFD. 
NOR  CAN  FREEING  BE  DONE  IN  AN  AREA.   AN  AREA  MAY  BE  TOTALLY  FREED  FY  ASSIGNING 
EMPTY  TO  IT, 

2,1.1.7  LAEEL 

THE  ONLY  PERMITTED  FORM  OF  LABEL  DATA  IS  THE  COJSTA JT  LABEL  AFRAY,   SUCH 
AN  ARRAY  IS  ALWAYS  DECLAREH  IMPLICITLY  THROUGH  THE  APPEARANCF  OF  A  SUBSCRIPTED 

-43- 


LABEL  PREMXJ  PHESENTLY,  SUCW  ARRAYS  MUST  HE  ONE-D I  MENS  MNAL .   I8K  PL/I  REQUIRE 
SUCH  AN  AHRAY  TO  BE  DECLARED  JM  A  DECLARE  SJATEMFNTj  CIMS  PL/I  PROHBITS  SUCH  A 
DECLaRaTIUN,   constant  LaBpL  arrays  aHE  iJSED  TO  OBTAIN  The  EFFECT  fp  THE 
FORTRAN  CUfPUTEU  GO  TO  , 


2.1.1.8  ENTRY 

ONLY  ENTRY  CONSTANTS,  EITHER  INTERNAL 
(CF.  SEC.  2.1,6  AND  2,6) 

2.1.2  AGG'^EGATION 


CR  EXTERNAL.  ARF  4LL0WEr. 


STRUCTURfcS  AND  ARRAYS  APE  IMPLEMENTED.  AND  CAN  -^E  USED  !N  ASSIGNMENTS. 
THE  REFER  OPTION  IS  RECCGNIZFD.  WITH  THE  SAME  LIMITATIONS  AS  F-LEVEL  PL/Ii 
CROSS-SECnO'JS  OF  ARRAYS  (U'lTH  *  SUBSTITiJTED  FOR  A  DIMENSION)  ARE  NOT  RECOG- 
NIZED.  THE  ♦  NUTATION  MAY  BE  USED  FOR  STRING  LENGTHS  AN^  ARwAY  BOl NDS  OF 
pARAMETERb, 


2.1.3  STORAGE  CLASSES 


THE  ALTTMATIC,  STATIC,  AND  BASEO  STORAGE  CLASSE3  ARE  AVAlLARLF. 
RASED  VARIABLES  NEED  NOT  BE  DECLARED  WITH  a  POINTER. 

2.1.4  SCOH'E  ATTRIBUTES 

THE  SCOPE  ATTRIBUTES  INTERNAL  A^D  EXTERNAL  ARE  1EC01NIZED.  HTWEVER,  A 
GIVEN  EXTtRNAL  VARIABLE  TAVN^T  BE  DECLARED  f'.ORE  THAN  ONCE  IN  A  pROTEDURE)  AN 
ATTEMPT  TO  DO  SO  WILL  CREATE  AN  ASSEMBLY  ERROR.  EXTERMAL  VARIABLE  NAMES  AND 
EXTERNAL  tNTRY  NAMES  LONr,Eo  THAN  SEVPN  CHARACTERS  ARE  TRUNCATED  RY  USINQ  THE 
FJRST  FUIJH  AND  LAST  THREF  CHARACTERS.  EXTERNAL  VARIABLE"  ARC  IMPLEMENTED  AS 
LABELLED  CCM'ION  AREAS, 

2.1.5  INIT  lAL  ATTRIBUTE 

THE  INITIAL  ATTRIBUTE  IS  RECOGNIZED,  BUT  ITERaTIOI  FACTORS  Mu^T  BE  DECIMAL 
INTEGER  CONSTANTS.   FIVE  LEVELS  OF  NESTING  ARE  ALLOWED*  ANn  THE  LENGTH  OF  A 
LIST  IS  ESSENTIALLY  UNLIMITED,   ITERATED  EXPRESSIONS  MUST  HE  CONSTANTS.   STRING 
REPETITION  FACTORS  ARE  NOT  ALLOWED. 

2.1.6  ENTrtY  ATTRIBUTE 

THE  ENTRY  ATTRIBUTE  MUST  RE  DECLARED  FOR  EXTERNAL  E JTRY  POINTc,  AND 
MUST  NOT  mE  DECLARED  FOR  INTERNAL  ENTRY  POINTS.   THE  ASTERISK'  NUTATION  IS 
ALLOWED  FOR  STRING  LENGTmS.   ENTRY  DESCRIPTORS  t'UST  IE  OUTT^D  FOR  ANY 
PARAMETERS  THAT  ARE  AGQRfGaTPS.  AND  THE  ARGUMENTS  PARSED  TO  SUCH  PARAMETERS 
MUST  AGREc  EXACTLY  WITH  THF  PARAMETER  DECl  AFv  AT  I  OnJ  .   fOR  PARAMETERS.  STRING 
LENGTHS  AND  ARRAY  BOUNDS  MUST  f<E  GIVEN  EITHER  HY  DECIMAL  CONSTANTS  OR  8Y 
ASTERISKS  (SEE  ALSO  SEC,  2.6). 


2.1.7  FILE  ATTRIBUTES 


-44- 


THERh  ARE  FOUR  FILE  ATTRIRUTES  AVAILABLE:  FILE,  INPJT.  nUTPUT". 
AND  ENVIRUNMENT,   ENV  I  RCN)MPfMT  HAS  THP  FOLLOWING  HPTION^I 

REWIND  -  NOREWIND  -  U^'LOAn 
BINAHV  -  CODED 

THE  ENVIRONMENT  OPTION  ^AY  B^    ATTACHED  TO  A  FILE  DECLARAT I DN ,  AN  OPEN 
STATEMENT,  QR  A  CLOSE  STaTFM»^NT,   AN  EXAMPLE  OF  AN  EWIRJNME'IT 
OPTION  is: 

ENV(REWIND, CODED) 

THE  DEFAULT  ENVIRONMENTS  ARE  NORFWInD  A'-JD  BINARY.   IF  A  JY  C?1NFLICTING 
ENVIRONMENT  OPTIONS  ARE  PRPStNT  AT  OPEN  OR  CLOSE  TIM6,  T-tE  U'JDEFIKFD" 
FILE  CONDITION  IS  RAISED.  «;EE  SEC,  2.8  FOR  RELATED  DISCUSSIO^J. 

2.1.8  BUILTPJ  ATTRIBUTE 

THE  allLTlN  ATTRIBLTE  MAY  BE  DECLARED.   BUIlTIN  NAMES  THAT  ARF  DECLARED 
WITH  OTHER  ATTRIBUTES  MAY  OE  USED  WITHOUT  CONFLICT,  PRIVIDED  THAT  PUILTIN  IS 
DECLARED  ^CH  ALL  BLOCKS  COvfAlNlNG  USES  OF  THE  BiJlLTiN  FJNCTTON  IN  ITS  USUAL 
SENSE.   NOTE  THAT  NULL  f'UST  BE  DECLARED  BuiLTlN  IF  IT  IS  USEH  FQR  THF  NULL 
POINTER,  AND  SIMILARLY  FOR  ENTRY. 

2.1.9  LIKt  ATTRIBUTE 

THE  LIKE  ATTRIBUTE  IS  AVAILABLE. 

2.2  ATTRIBUTE  DETERM I N AT T Qm 

CONTfcXTUAL  CECLARATIO^'  IS  AVAIL&BLE  FOR  FILES,  F»0INTETS.  AREAc,  AND 
BUILTIN  FUNCTIONS  THAT  ExPPCT  ArqumEnTS.   UNDECLARED  VARIA^L'^S  NOT  AFFECTED 
BY  CONTEXILAL  DECLARATICV  ALL  BECOME  FIXED.   THIS  BFHAVI)R  IS  DIFFFRFNT  FROM 
THAT  OF  ItiN  F-LEVEL  PL/I,  w'hICH  DEFAULTS  VARIABLE  TY^^ES  ACCORDING  TO  THE  FIRST 
LETTER  OF  THE  VARIABLE  NAMF  AS  IN  FQRTRA^J. 

MISSING  ATTRIBUTES  ARF  FILLED  IN  BY  THE  COMPILE^^  USING  THE  STANDARD  PL/I 
RULES. 

2.3  STATEMENT  TYPES 

THE  I-CLLGWING  STATEMENT  TYPES  ARE  AVAILABLE,  WITH 
THE  RESTRICTIONS  NOTED; 

2.3.1  ASSIGNMENT  STATEMENT 

ASSIGNMENTS  MAY  BE  MAnE  TO  AGGREGATES  AS  WELL  Aj  TO  SCALARS, 
MULTIPLE  ASSIGNMENT  IS  PPRmiTTEQ.   HOWEVER.  ALL  "iTEMl  ON  THE  LEFT  ciDE  OF  AN 
ASSIGNMENT  STATEMENT  MUST  "AVE  IDENTICAL  AGGREGATE  TYPES,  OR  AN  UNTIAGNOSED 
ERROR  WILL  OCCUR,   THE  Ey  vAME  OPTION  IS  ACCEPTED  BUT  IG-IORE'^,   IMPLICIT  TYPE 
CONVERSION  IS  PERFORMED  ON  ASSIGNMENT  WHEN  NECESSARY,' 

2.3.2  ALLOCATE  STATEMENT 

THE  ALLOCATE  STATEMENT  WITH  OPTIONAL  IN  AND  SET  CLAUSES  IS  RECOGNIZED. 

-45- 


2.3.3  i^bGlN     STATEMENT 

THE    dEGIiN    STATEMENT    I  <?    ALLOWED. 

2.3.4  CALL  STATEMENT 

THE  CALL  STATEMENT  IS  ALLOWED.  THOuOH  THE  TASK,  EVEMT  A^JD  PRlTRlTV  OPTIONS 

ARE  NOT  RfcCOGNlZED. 

2.3.5  CLObE  STATEMENT 

THE  CLOSE  STATEMENT  I*;  ALLOWED.   LISTS  nF  FILES  ARE  ALL'^WED.  «S  IS 
THE  ENVIRONMENT  OPTION.   AT  CLHSF  TIMEi  ANY  DECLARED  ENVIRONMENT 
OPTIONS  AHE  COMblNFD  WITH  THE  ENVIRONMENT  OPTIONS  ATTACHED  T"i  THE 
CLOSE  STATEMENT, 

2.3.6  DECLARE  STATEMENT 

THE  UECLARE  STATEMENT  IS  ALLOWED,   FACTORING  OF  ATT'M9UTES,  INCLUDING 
LEVEL  NUMdERS  AND  DIMENSIOmSi  IS  ALLOWED, 

2.3.7  DEFALlT  STATEMENT 

THE  UEFAULT  STATEMENT  IS  AVAILARLE,   THIS  STATE  (ENT  IS  ^EFlNEr  BY 
ANSI  PL/I.  dUT  IS  NOT  IMPLFmFNTED  IN'  ANY  IHM  COM^^ILE^'^S  I'J  THIS  FORy  .   THE 
DEFAULT  SiATEmEnT  AllOwS  ThE  USER  TO  SPECIFY  ATTiJlBUTES  TO  BF  ASSIPNED 
TO  A  VARIABLE,  DFPENDInG  0"'  THE  ATTRIBUTES  THAT  IT  ALREADY  HAS  AS  uELL  AS  ON 
THE  LEADI.nG  CHARACTERS  CF  THE  NAME  QF  THE  VARIABLE.   THE  DEFAULT 
STATEMENT  IS  DESCRIBED  IM  nETAlL  IN  APPENDIX  A. 

2.3.8  DO  STATEMENT 

ALL  OF  THE  FORMS  OF  T^E  DO  STATEMENT  ARE  ALLOMEH,  BJT  T^'E  PARAMETERS  OF 
THE  DO  ARE  ALWAYS  CONVERTEn  TO  FIXED  VALUES.   EXPRESSIONS  MAV  RE  USEU  AS  DO 
PARAMETERS, 

2.3.9  END  STATEMENT 

THE  END  STATEMENT  IS  ALLOWED,  AND  MULTIPLE  END  CLOS^JRE  IS  RECCGNIZED, 

2.3.10  tNIRY  STATEMENT 

THE  bNTRY  STATEMENT  IS  ALLOWED,   MULTIPLE  ENTRY  POI  JTS  ARE  PERMITTED,   THE 
RETURNS  Ot^TlON  IS  RECOGMZFD. 

2.3.11  FRcE  STATEMENT 


THE  ^REE  STATEMENT  IS  ALLOwFD.  RUT  VARIABLES  ALLOCATED  IN  AREAS  CANNOT  BE 


FRFED. 


2,3.12  GO  TO  STATEMENT 


THE  GC  TO  STATEMENT  I «:  ALLOWED.   IT  IS  PERMITTED  TO  GO  tq  a  LABEL  IN  A 
RLOCK  CONTATJING  THE  CURREvT  ONE.   IT  IS  PERMITTED  T)  GO  TO  AN  -ELEMENT  OF  A 

-46- 


CONSTANT  LABEL  ARRAY,  USlNr?  A  SUHSCRlPTEH  LAREL  REFEf^E'JCR, 

2.3.13  IF  STATEMENT 

THE  IF  STATEMENT  IS  AlLOWFD.   HOWEVER.  aSSIGNMEJTS  TO  THE  IDENTIFIER  IF 
ARE  NOT  PHCPERLY  RECOGNlZEn.   IF  STAtEME'JTS  mAY  HAVE  ASSOCIATED  El^E  CLAUSES. 

2.3.14  NULL  STATEMENT 

THE    IMILL    STATEMENT     IS    ALLOWED. 

2.3.15  ON  STATEMENT 

THE  UN  STATEMENT  IS  ALLOWED.   CONDITIONS  PRESENTLY  fJECOHNIZED  ARE» 


AREA 

ENDFILE 

UNDEI-  InEDFILE 

ERROR 
F  INISh 
STORAGE 


MULTIPLE  CCNDITIONS  MAY  RE  SPECIFIED  IN  A  SINGLE  ON- STATEMENT,  E.G.: 

ON  ENCF  ILE(OUT),  UN OEr j NEHF  ILE ( OUT )  GO  TO  LaB  J 

'ME  BEHAVICR  OF  ENDFILE  WITH  RESPECT  TO  END-OF-RECORT  IS  DISrUSSEC  IN  SEC,  2.8. 
THE  STORAGE  CONDITION  IS  RAISED  WHEN  THE  MAIN  STORAGE  POOL  HAS  BEEN  EXHAUSTED. 
NORMALLY  fHERE  IS  NO  WAY  To  »ECOVER  FROM  THIS  CONDITION.   WQ-iEVER.  UNDER 
SOME  CIRCUMSTANCES  RECOVERY  mAy  HE  POSSIBLE  EITHER  BY  FRFEIN^'  LARGF  RlOCKS 
OF  ALLOCAIEO  STORAGE  OR  RY  EDITING  FROM  A  NEST  OF  PRICED  IRES  USING  AUTOMATIC 
STORAGE. 

BECAUSE  OF  A  BUG  ThAT  IS  RATHER  DIFFlcUlT  TO  FIX  PRESENTLY,  ON-UNITS 
THAT  ARE  BEGIN  BLOCKS  CANNOT  DECLARE  ANY  LOCAL  AUTOMATIC  VARTARLES.   TO  BYPASS 
THIS  RESTRICTION,  TURN  THE  BEGIN  BLOCK  I'-lTO  A  PA-^AMETERLESS  PROCEDl  RE  AND  CALL 
IT. 

2.3.16  OPtN  STATEMENT 

THE  OPEN  STATEMENT  IS  ALLOWED.  WiTH  OPTIONS  TITLE.  OUTPUT,  INPUT, 
AND  ENVIRONMENT  (CF.  SECs.  2.1.7  AND  2,8).   AT  OPEN  TIME.  DECLARED 
OPTIONS  ARE  COMBINED  WiTh  ThOSE  SPECIFIED  FOR  THE  OPEN  STaTe'^ENT. 
LISTS  OF  I-  ILES  ARE  ALLOWED. 

2.3.17  PROCEDURE  STATEMENT 

THE  PKOCfcDURE  STATEMENT  IS  ALLOWED,  WITH  OPT  I aNS( MA  IN )  USED  Tf  INDICATE 
A  MAIN  PROCEDURE.   PROCEOUPES  NEED  NOT  HAVE  LABELS  IF  THPY  CONTAIN  ENTRY 
STATEMENTS,   THfc  RECURSIVE  AND  RETURNS  OPTIONS  ARE  REC0GJI7En, 


2,3.18  READ  STATEMENT 


-47- 


ONLY  CnE  rORM  or  THP  sEAD  statement  is  RECOGNIZBDJ   RFa^  riLE(*FILE*)  INTO 
(*VARIABLfc*)   .   A  READ  STATEMENT  ON  AN  UNOPENED  FILE  CA JSES 
THAT  riLE  TO  BE  OPENED  AS  AN  IN'PUT  FILE, 

P.3.19  RETLHN  STATEMENT 

THE  HETURN  STATEMENT  IS  RECOGNlzED;  IT  HAS  AN  Qf^TlOJAL  ARGUMENT.   THIS 
ARGUMENT  MLST  BE  GIVEN  IF  THE  PROCEDURE  WAS  FNTEREH  AT  Ai  ENTRY  POINT  THAT 
RETURNS  A  VALUEi  AND  MUST  RE  Of^ITTFD  OTHERWISE, 

?.,Z.20    STUF  STATEMENT 

THE  ijTOP  STATEMENT  IS  RFCnGNl7En, 

2.3.21  WRITE  STATEMENT 

ONLY  CNE  FORM  OF  THE  '-JRITE  STATEMENT  IS  RECOGNIZEDl 
WRITE  FlLfc<*FlLt*)  FROM  (*v/ARIABlE*)  ,   A  WRITE  STATEMENT  ON  AN  UNrpENED 
FILE  CAUStS  THAT  FILE  TC  BE  npFNFD  As  An  OUTPUT  FILE, 

2.4  EXPREtJSlONS 

EXPRbSSIONS  ARE  IMPLEMENTED.  INCLUDING  IMPLICIT  CONVERSIONS  RESULTING 
FROM  THE  COMBINATION  OF  niFFFRENT  DATA  TYPES  THROUGH  OPE'^ATO'^S.   VARIABLE 
REFERENCES  MAY  BE  INCOMPLETELY  QUALIFIED  IF  NO  A'^BIGJITY  RFSi'LTS.  AND 
SUasCRIPTb  IN  SUBSCRlPTEn  OUALIFIED  NAMES  MAY  RE  INTERLEAVED  IN  ANv  POSITION 
PROVIDED  (FaT  THEY  ARE  GIVFN  IN  THE  RIGHT  SEQUENCE.   NOTE  THAT  CRO<:S-SECT  I ONS 
OF  ARRAYS  AHE  NOT  ALLOWEH, 

ANY  HCI'JTER'VALUED  FXPRFSSION  HaY  BE  USFD  AS  A  POINTER  TUALIFIER;  THUS 
POINTER  QUALIFIERS  MAY  BE  VESTED  TO  ANY  HEPTH,   THIS  GENERAL  I Z AT  I  OK  IS  NOT 
ALLOWED  BY  F-LEVEL  PL/1, 

ALL  OF  THE  PL/I  OPERATORS  ARE  ALLOWED.   IN  PERFOR'-IDG  COMPARISON 
OPFRATlONb  OiJ  CHARACTER  STRIMGS,  THE  BLANK  CHARACTER  IS  TREATED  AS  THOUGH  IT 
HAD  DISPLAY  CODE  0  (I.E.,  IT  IS  THE  LOWEST  CHARACTER  IN  THE  COLLATING 
SEQUENCE)-   OTHERWISE,  TmE  COC  COLLaTINq  SEQUEf^CF  iS  USe'I, 

THE  ALLOWABLE  CONSTANT  TYPES  ARE  SIGNED  DECIMAL  I'JTFGFRS,  BIT  STRINGS,  AND 
CHARACTER  STRINGS.   STRInG  REpFjITlOM  ^aCtoRS  ArF  mQT  RECOGNIZED. 

2.5  TYPE  CCNVERSION 

type  conversion  occurs  implicitly  i  j  three  contests! 

1.  arplication  of  cperators 

2,  assignment 


i,  MATCHING  OF  ARGLmEmTS  TO  ENTRY  PARANETERS 

-48- 


THE  UMV  FACILITY  PrE-^E^TLY  AVAILABLE  FOR  EXPLICIT  TYPE  CONVERSION  IS  THE 
CHAR  BUILIIN  FUNCTION,  ViHlTH  MAY  BE  CALLED  EITHER  WITH  0'}    WITHOUT  4  SECOND 
ARGUMfcNT, 

THE  I-CLLOWIN'G  TYPE  COmvFRSIONS  ARE  AVAlLARLP: 

1.  FIXED  TO  FLOAT. 

2.  FIXED  TO  CHARACTER.   THE  RESULT  IS  AN  E I  rJHT-CH  AR  ACTE"  STRIKG 
RIGHT-JUSTIFIED  WiTH  BLANK  FILL, 

3.  FIXED  TO  BIT.   THE  RESULT  IS  A17-BIT  STRING. 

4.  FlCaT  TO  FIXED. 

5.  CHARACTER  TO  FIXED.  THE  iNPUT  CHARACTER  STRING  UY  '-'AVE  BLANKS  ON 
EITHER  SIDE  OF  ThE  NUHRER  TO  BE  CONVERTED.  ALL  nLAN'<S  CONVERTS  TO 
ZtRO, 

6.  CHARACTER  TO  BIT.   THE  INPUT  CHARACTER  STRING  ^UST  B^^  OF  CTNSTANT 
LtNGTHi  AND  THAT  LENGTH  MUST  NOT  EXCEED  (SO. 

7.  RIT  TO  CHARACTER. 

8.  RIT  TO  FIXED. 

2,6  FUNCTION  CALLS  AND  PARAMETER  PASSAGE 


G 
FUNCTl 
DEFAUL 
CHARAC 
PASSED 
RECEIV 
PARAMt 
THEN  I 
IS  DON 
PARAME 
DECLAR 
IMPLIC 


ENEHA 
ON  Cli 
T  TO 
TER  S 
AS  A 
ING  F 
TER  C 
T  lb 
E,  1 
TER  T 
ATIUN 
ITLY, 


LLY 

LLS. 

FIXE 

TRIN 
RGUM 
ARAM 
ESCR 
ASSU 
F  A 
YPE 
WIT 


SPEAKIN 
HOWEV 
D,  A  F 
GS  MUST 
bN'TS.  B 
ETER. 
IPTORS, 
MED  THA 

paramET 

IF  NECE 
H  A  FUL 


G,  The  USUAL 
ER,  FNTRY  NA 
UNCTION  CAN 

BE  HECLaRED 

UT  They  must 

NC  PROVlSinN 

IF  A  PARAM 

t  thp  argume 
er  i«;  declar 

SSARY.  NOTE 
L  SET  OF  PAR 


PL/1  RULES  APPLY  TO  PARAMFTER  PASSAGE  AND 
MF-S  DECLARED  WITH  JQ  RETUR^^S  ATTRIBUTE  ALWAYS 
nNl Y  RETURN  A  SCALAR  AG  A  VALUEj  RETURNED 
WITH  CONSTANT  LENGTH,   AGGREGATRS  MAY  BE 
AGREF-  EXACTLY  WIT-I  THE  ATTRIBUTFS  OF  THE 
IS  MADE  FOR  THE  DC-CLARATinN  OF  AGGREGATE 
ETFR  IS  NCf  DECLARED  IN  AN  ENTRY  DECLARATION. 
NT  MATCHES  THE  PARAMETER,  AND  NO  CONVERSION 
ED  THE'J  ThE  ARGUMEJT  IS  CO'iVERTEr  TO  THE 

THAT  FOR  INTERNAL  PROCEnU"ES  AN  ENTRY 
AMETER  DESCRIPTORS  IS  ALWAYS  CREATED 


MULTIPLE  ENTRY  POINTS  ARE  ALLOWED.  AND  THEY  NEET  NOT  AG»EE  IN  THE  TYPE  OF 
VALUE  RETURNED,   THE  NECfSSA»Y  CONVEpSIO'IS  WILL  ALWAYS  BE  PE^FORMEr.   MOTE  THAT 
A  PROCEDUHE  INVQKED  BY  A  FUNCTION  REFERENCE  MUST  RETURN  A  VALUE,  WHLE  A 
PROCEDURE  INVOKED  BY  A  CALL  STATEMENT  MUST  NnT  RETURN  A  VAlUF.   VITLATION  OF 
THESE  HULtS  CAN  LEAD  TO  MYSTERIOUS  RUN-TlME  ERRC'^S. 

FUNCnON  CALLS  ARE  PERFORMEn  USING  THE  FTN  CONVENTIONS,   IN  PARTICULAR, 
Al  CONTAINS  A  POINTER  TC  THE  ARGUMENT  LIST  AND  THE  RESULT  IS  RETURNED  IN  X6, 
THUS  PL/I  PROCEDURES  CAN  CALL  FTN  ROUTINES.   HOWEVER,  PL/I  U>ES  DORE  VECTORS  TO 
DESCRIBE  CHARACTER  STRINGS  AND  AGGREGATES,  SO  THESE  CAN  JEIT^'iER  BE  PASSED 
NOR  RETURNED  TO  FTN  PRoCRDuRES  WITHOUT  A  MORE  DETAILED  KJOWLFDQE  OF  THE  PL/I 


-49- 


RUN-T[Mt  STRUCTURE. 
?,7  8U1LT1N  FUNCTIONS 

THE  I-CLLOWING  BUlLTiN  FUNCTIONS  ARE  AVAILABLE! 


ARS 

CHAR 

EMPTY 

HROIJNC 

INDEX 

LROlJi^C 

LENfif  h 

MAX 


MIN 

MOD 

NULL 

SIGN 

SUh«?TR 

unspfc 

VEKIFY 


THE  HCLLOWING  RESTRICTIONS  SHOULD  BE  NOTED: 

1,  HBCUND  AND  LBOUND  TAN  ONLY  HaVE  A  CCNSTA'JT  A3  TH"IR  <;ECOND  ARGUMENT, 

2,  EMPTY  CAN  ONLY  Bp  USED  ON  THE  Rl^HT  SIDE  OF  AN  ASSIG^'MENT  TO  AN  AREA. 
II-  USED,  IT  MUST  BP  EXPLICITLY  DECLARED  TO  BE  flUILTlM, 

3,  MOC  ALWAYS  CONVERTS  ITS  aRRUmFNTS  TC  FIXED. 

A,     SuESTH  AND  UNSPEC  CANNOT  BE  USED  ON  THE  LEFT  SIDE  OF  AN  ASSIGNMENT 
SI  ATEMEivjT. 

5,  II-  SUdSTR  IS  APPLIED  TO  A  RIT  STRING'.  THE  THIRD  ARGU^'ENT  MLST  EITHER  BE 

cunstanT  OR  MissiNjr,,  IF  IT  is  missing,  ^he  lenqTh  OF  the  ^esulTing 

RIT  string  is  ALwAvs  TAKEN  A!^  60, 

6,  iJi^SPEC  CANNOT  PE  APPLIED  TO  VARYING  STRINGS,  SINCE  T^E  RESiLT  WOULD 
PHCDUCE  A  VARYING  «IT  STRING.  A  DATA  TYPE  THAT  I^  NOT  YET  AVAILABLE, 
UimSPEC  of  a  FiXEn  HATUM  YlELHS  1^  BITS)  UNSPEC  OF  A  POINTER  YIELDS 
Id  BITS)  UNSPEC  OF  A  FLOaT  DaTUM  YiElDS  ^0  BITS. 

7,  THE  SECOND  ARGUMENT  OF  VERIFY  MUST  EE  A  SUGLE  C^^ARAC;TER.  AND  CANNOT 
Rt  ANY  OF  THE  CHARACTERS) 


<  >  -  I 


?,8  INPUT-CUTPUT 


THERb  ARE  FOUR  RELEVA^'T  STATEMENTS  FOR  T  NPUT-OUTPUT I  OPPN,  CLfSE, 
READ»  AND  URITE,   STREAM  iNpUT-OUTPUT  IS  NOT  YET  AVAILABLE.   FILES  CAN 
RE  OPENED  IMPLICITLY  BY  READ  QR  wRlTE  STATEMENTS.  RUT  DECLARPD  FlL^ 
ATTRIBUTES  ARE  NOT  APPLIED  WHEN  IMPLICIT  OPENING  OCCJRS.   IMPLICIT 
FILE  CLOSING  TA^ES  PLACE  Qv  PROGRAM  TF.RM  I  NAT  I  ON ,  EIT'<E'<  'JO^^MAL  OR 
ABNORMAL.   HOWEVER,  IF  THE  P«OfiRAM  TERMINATES  ON  A  MAC^I'lE  E^JROR  Si  CH 
AS  AD[)RESS  OUT  OE  RANGE,  FilES  WILL  NOT  ^E  ClOSEO  AN"),  C  INSEOUENTL  v , 
MUST  OUTPUT  WILL  BE  LOST.   NOTE  THAJ  PL/I  I NPUT --lUTP  ]T  I  :^  'iOT  CQNSIS' 
TENT  WITH  FORTRAN  INPUT-flUTPUT  BFCAU<?E  THE  FETS  ARE  ACCESSED  DIF- 
FERENTLY,  THUS,  ALTHOUGH  pl/I  PROGRAMS  CAN  CALL  FORTRAN  PROGRAMS, 
THE  FORTRAN  PROGRAMS  CANNOT  DO  ANY  I nPUT-OUTPUT . 


DECLARED  ENVIRONMENT  OPTIONS  ARE_APPLIED  ON  B0T'<  OPENIN^.  AND 


CUnSING.   THUS,  IF  A  FILF  TS  DECLAREH  WITH  THE  ATTi^lUTE  ENV  ( RFW I  \T  ) . 
IT  WILL  Rt  REWOUND  ON  BCTH  OPENING  AMD  CLOSING.   NOTE  THAT  IF  NEITl-ER 
THF  FILE  UECLARATION  NOR  TmE  OPEN  STATEMENT  FOR  A  FILE  SPECIFY  INPlT  OR 
OUTPUT,  THEN  INPUT  WILL  RE  ASSUMED.   THUS  WHFN  AN  OUTPUT  FILP  IS 
OPENED  EXPLICITLY,  THE  CuTouT  ATTRIBUTE  ;1UST  BE  ATTACHED  TO  FITHER 
THE  OPEN  STATEMENT  OR  THF  FILE  DECLARATION, 

FILES  CAN  BE  REWOUNn  WITHIN  A  PROGRAM  BY  CLnSINT  THEM  A'JD  THEN 

REOPENING  Them  with  the  FNVlRONMENT(REWrJD)  nPTION  O'J  THP  OPPN  STATE- 
MENT, 


SPECIAL  CONVENTIONS  EXIST, 
ENDS-OF-RtCORU  IN  INPUT  FllE'^, 


AS  IN  FORTRAN,  CONCE'^NINi]  FMnEDDED 
IF  THE  ACTUAL  INPUT  FILE  IS  "AMED 


jilNPUTj!,  THEN  SUCH  END-CF-RECQRD  MARKS  ARE  TREATED  LIKE  FNO-iF-FILF 
MARKS.   n>\    ANY  UTHER  FiLF  THFY  AmE  IGNORED.   THE  DECISIOI  AynUT  THP 
FILE  NAME  ARE  BASED  ON  THE  NAME  THAT  EXISTS  AFTER  MAKING  ALL  SUBST r TUT  I ONS 
DUE  EITHER  TO  THE  TITLE  OPTION  OR  TO  THE  CALl  CARD  SJB5TITUTI0N  FATILITY. 


THE  REAH  AND  WRITE  STaTFMFNTS  MUST  SPECIFY  F I  XEl-LE  )GTH  CHARATTFR  STRINGS 
AS  THEIR  VARIABLES.   THESE  STRINGS  ARE  THEN  WRITTEN  TR  RFAD  AS  THCl  GH  THEY  WERE 
IN  FORTRAN  A-FORMAT,  I.E.,  TRAILING  BLANKS  ARE  ELlMlJATEil,   VARYINT  CHARACTER 
STRINGS  CANNOT  tiE    USED  IM  THIS  CONTEXT, 


AL 
FACILIT 
INPUT-O 
BUILT-I 
CONCATfe 
CUNVERS 
'  FIXED 
CHARACT 
COULD  B 
MATiONJ 


THOUGH  INPUT-OUTPUT  FACILITIES  ARE  RATHER  PRIilTlVE.  CERTAIN  OTHER 

lEb  OF  THE  LANGUAGE  CAN  8E  USED  TO  GREAT  ADVA  ITAGF^  IN  PERFORMING 

UTHLT,   INPUT  DATA  TAN  PE  BROKEN  TOWN  USING  T  <E  S'IBST"  AND  INDEX 

N  t-LNCTIONS,  WHILE  OUTPUT  DATA  CA:-!  HE  BUIlT  U^  USING  'STRING 

NAIIO'-J,   SINCE  IMPLICIT  NUMBER-TO-TO-CH  AR  ACTE '^  AN')  CH  AR  ACTEr -TO-NUMBER 

IO'\S  ARE  AVAILABLF,  IT  IS  NOT  DIFFICULT  TO  OBTAIN  NUMFRICAL  OUTPUT, 

DATUM,  WHEN  CONVFRTED  TQ  A  CHARACTER  STRJNG,  PROlUCE'?  AN  EIGHT'- 
ER  REPRESENTATION  WITH  LEADING  BlAnkS.   A-?  AN  EXAlpLE,  THE  FOLLOWING 
E  OSED  TU  WRITE  CUT  THE  VALUE  OE  K  TOGETHER  WITH  IDEnTIFYINT  InFoR- 


DCL  C    CHAR(80  )  ; 
C  s  X  K'  JfttK; 

WRITt  FILE  (OUT) 


FROM  (C)j 


MORE  GENERALLY,  ONE  CAN  DEFINE  A  PROCEDURE  PRINT; 


PRINI  :  PROC(C)  ; 

DCL  C  CHAR(*)  I 
WRITE  FILE(CUT) 
END  PRINT  ; 


FROM  (C)  ) 


IF  PRINT  IS  PROVIDED  AS  AN  EXTERNAL  PROCEDURE,  IT  MU=?T  BE  HECLARED  AS: 

DCL  PRINT  ENTRv(CHAR(*))  j 
PRINT  CAN  EE  CALLED  WITH  A^'Y  ARGUMENT  THAT  EITHER  IS  A 

CHARACTER-STRINii  VARIABLF  OR  CAN  BE  CONVERTED  TO  A  C  URACTFR-STR I  NT  VARIABLE* 
AND  THAT  ARGUMENT  WILL  BF  PRINTED.   NOTICE  THAT  JF  PRINT  JS  PROVIDRD  WITH  AN 
ARGUMENT  THAT  IS  NOT  A  FIXED-LENGTH  CHARACTER  STRING,  THE  ARGUMENT  WILL 
IMPLICITLY  8E  CONVERTED  TO  A  FlXED-LENGTH  CHARACTER  STRI.JG. 

2,9  CHARACTER  SET 


-51- 


SINCt  THb  CDC  DISPLAY  CHARACTER  SET  DIFFERS  FRO'I  THF  STANDARD  PL/I 
CHARACTER  SET,  A  MAPPING  I «:  nECESSARV.   THOSE  CHARACTERS  COM'^ON  TO  THE  STANDARD 
PL/I  SET  **NU  THE  DISPLAY  S^T  ARE  UNCHANGED,   THREE  A'^EAS  OF  'MfTFERFNCE  EXIST: 

1.  THE  PL/I  APOSTROPHE  IS  REPRESENTED  BY  *     . 

2,  THE  NON^ALPHANUMFRIC  CHARACTERS  ALLOWED  IN  nENTlFIE^'S  ARE: 
<?  -  = 

6,    THE  OPERATORS  THaT  DIFFER  FROM  STANDARD  PL/I  AREJ 

t       OR 

tT      CONCATENATE 

A        AND 

?,10  KEY  wCRDS  AND  PHRaSFS 

THE  t-CLLOWJNG  KEYWORDS  AND  KEY  PHRASES  ARE  RECQlN  I  ZED,  JITH  T»-E  INDICATED 
ABPrEVIATICNSI 


ALLOCATE 

AUTOMATIC 

AREA 

BASEU 

BEGIN 

BIT 

BINARY 

BUILMN 

BY  NA^'E 

BY 

CALL 

CHARACTER 

CLOSt 

DECIMAL 

DECLARE 

UbFAULT 

UO 

ELSE 

END 

ENDFILE 

y  iNisp 

ENTRY 

fcNVIKCNMENT 

ERROR 

EXTERNAL 

f   ILE 

FIXED 

FLOAT 

(•WEE 

F  WOM 

GO  TO 

IF 

IMTl  At, 

IN 

INTERNAL 

INTO 

ON 


AUTO 


BIN 


CHAR 

DEC 
DCL 
DFT 


ENV 
EXT 


GOTO 
I  NIT 
INT 


-52- 


OPEN 
OPTl 

PRor; 

POIN 

READ 

RECU 

RbTlJ 

SET 

STAT 

STOP 

STOP 

TO 

UNDE 

VARY 

WHIL 

WRIT 


UNS 

bCuRfc 

(EW 

HSIVE 

HNS 

IC 

AGE 

K INEDFILE 

ING 

b 

b 


PROC 
PTR 


UNDFF 
VAR 


-53- 


3.  OPERATING  ENVIRONMENT 

COMPIIATJON  or  A  PL/I  PROGRAM  INVOLVES  TWO  STEPS:  THE  cnMPILATION  PROPER 
AND  THE  AbSEMBLV  OF  THE  RE<?ULTING  COMPASS  CCnE,   COMPILATION  AND  EvECUTION 
TOGETHFR  HEQUIRfc  THREE  SEPARATE  nATASETSI  THE  COMPILER  ITSEL^,  THE  SYSTEM 
TEXT  FOR  rhfc  ASSEMBLY,  AnD  T^E  RUN-TIME  LIBRARY.   TWT  OF  T^E^E  DATASEtS  ARE 
AVAILABLE  CN  SYbLiB: 


PLl 


THE  COMPILER  ITSELF 


PLlTfcXT    THE  SYSTEM  TEXT  REQUIRED  FOR  ASSEMBLY 
THE  THIRD  CATASfcT  IS  A  PERMANENT  FlLEt 
PLIRLIB    THE  RUN^TIMP  LIRRaRY 

THE  COMPILER  PRESENTLY  RPQllIRES  AN  ABSOLUTE  MINIMUM  'IF  lOlK  TO  LOAT  AND  EXE- 
CUTE,  A  i^IZE  OF  llOK  IS  RpcOMMgNDED.   IF  STORAGE  iS  TUteOUaTE,  ThE  MESSAGE 
^MEM  REOUbST  EXCEEDS  JOBCAPD  FL^^  WILL  APPEAR.   TmE  LARGER  THP  PROCFDURE  TO  BE 
COMPILED.  THE  MORE  SPACE  IT  WILL  REQUIRE,   COMPlLATITN  SPEED  CAN  BE  ESTIMATED 
AT  APPROXlf'ATfcLY  15  ST  A  TEMENTS/SECOND,  NOT  INCLUDING  ASSEMBLY  TIME.   ASSEMBLY 
TIME  IS  APPROXIMATELY  30  PERCENT  L^NoER  ThaN  COMPILATION  TiMP,  SO  THE  COMBINED 
RATE  IS  A^'FROXIMATELY  6  ST  ATEMENTS/SFCOND  . 


THE  SOURCE  LISTING  INCLUDES  A  MESSAGE  GIVING  THE  AM:)UNT  OF  FIFLD  LENGTH 
REQUIRED  »-CR  COMPILATION,   TO  DETERHINE  THE  CORRECT  FIELT  LE''GTH'  ISE  A 
GENEROUS  INITIAL  ESTIMATE  ANH  THEN  CUT  DOWN  ACCORDINGLY,   THF  COMPILER  WILL 
NOT  ATTEMPT  TO  INCREASE  iTS  FIELD  LENGTH  BEYOND  THAT  EXITING  AT  Tl-E  TIME 
IT  WAS  FIHST  CALLED.   hCwEwER#  EACH  EXTERNAL  PROCEDU'TE  IS  COMPILED  IN  THE 
MINIMUM  NtCESSARY  FIELD  lEnGTH, 


THE  CCMPILATION  PROCESS  USES  FIVE  LOCAL  FILESl 


INPUT 


SOURCE  PROGRAM  INPUT 


OUTPUT     SOURCE  PROGRAM  LISTING  WITH  STATEMENT  NUMIERS.  FOLLTWED  BY 
ERROR  LISTlMG 

CODE       IjENERATED  ASSEMBLY  COnE 

LGO        LOADABLE  FILE  PRODUCED  BY  THE  ASSEMBLER 

PLIEHR     TEMPORARY  FILF  USFD  FOR  LIST  DF  SOURCE  PROGRA'*  ERRORS 

3,1  INPUT  FORMAT 

INPUI  TO  THE  COMPILER  IS  ASSUMED  TO  BE  IN  COLUM  JS  2  T^R'^UGH  17.  THE 

PRINTED  LISTING  ACTUALLY  GOES  i)p    TO  COLU'IN  9",  SO  INPUT  OBTAINED  FROM  UPDATE 

WILL  SHOW  THE  UPDATE  SER T At  I 7AT I  ON .   CHARACTERS  APPEARlNl  IN  COLUMN  1  ARE  USED 
FOR  CARRIAGE  CONTROL  IN  THF  LlSTlNfi. 


EACH  EXTfcRNAL  PROCEPURE  MiJST  RE  FOLLOWEn  BY  A  CARD  HTH    A  *  Ik  COLUMN  1  TC 

INDICATE  fhE  END  OF  THE  PRncFDURR.   THIS  CONVENTION  IS  NREDE"  SINCF  COMMENTS 

MAY  PRECEOE  THE  INITIAL  PRoCFUURE  STATEMENT  OR  FOLLO>I  THE  FI'lAL  ENT  STATEMENT, 

-54- 

\ 


3,2  CONTROL  CARD  SETUP 

AS  AN  EXAMPLE,  COmSTDPR  A  PL/I  PROGRAM  WITH  THE  FOLLOWT'G 
CHARACTERISTICS! 

1,  THE  PROURAM  CONSTSTS  OF  TWO  EXTERNAL  PROCEDU'^ES , 

2,  THE  PROGRAM  REaCs  ITS  INPUT  FROM  FILF  IN  AND  PR0nuCE<5  ITS  ruTPUT  ON 
FILE  OUT, 

3,  THE  PROGRAM  DOES  N^T  RFQUIRE  MORE  THAN  5lK  TO  EXECUTE, 
THE  FOLLOWING  CONTROL  CARD  SEQUENCE  IS  THEN  aPPROPRIATEJ 

A123456,CM110000»T100.  VOUR  NAMP 

LOADfcR, 

NOREUtCE, 

CIMGtT(PLl,PLlTEXT) 

ATTACI-,RLltJ,PLlRLlB, 

PH. 

RFL, 30000, 

REWINC(C0DE> 

COMPASS (I =CODE,GsPLl TEXT, LsO.OsO) 

L0AD.LQ0»RLIB, 

EXECUTE,,  I\=INPUT,CUT  =  0UTPUT.       ;<OPTlONAL    DATA    F^R    MAIN    pROCEDLREj! 
EQR 


■FIRST    PL/I    PP^OCEnURfc. 


■SECOND  Pl/I  PROCEDURE- 


EQR 


INPUT  nATA- 


EOF 

ONOTE  THE  FOLLOWING  REMARKS  ON  THIS  SEQUENCES 

1,  THE  OUTPUT  WILL  APPEAR  IN  THE  SEiJUENCEl 

1  LISTING  OF  SOURCE  PROGRAM*  WITH  ERRORS 
,  RESULTS  OF  EXECUTING  THE  PPO'IRAM 

2,  EACH  PROCEDURE  ^'UST  BE  FOLLOWED  f^Y  A  STAR  IN  COL  IMN  l, 

3,  TO  CALL  THE  COMPILER  WITH  AN  ALTERNATE  OUTPUT  FILE  OF  -OUTi-  AND  AN 
ALTERNATE  INPUT  FIlE  OF  -INl-,  USE  THE  CALL  CARD 

PLl(INPUT=lNi,OUTPUT=OUTl) 


3,3  COMPILER  OPTIONS 


-55- 


COMPILER  OPTIONS  ARF  SPPCIFIED  AS  AN  ARnUMEMT  TT  THf=  CO'iPlLER'.  USING 
THE  CONVEi^jTlON  OF  SECi  3.4.2.   FOR  INSTANCE.  THE  CALLING  CARP 

PLl.  i<XREr* 

TURNS  ON  ThE  -XREF-  OPTION.   OPTIONS  ARE  SEPARATED  BY  COMMAS  IN  THF 
OPTIONS  LIST, 

3.3.1  XREK 

THE  -XREF-  OPTION  CaU<:ES  a  CROSS-REFERENCE  MAP  AND  ATTRIBUTE  |  ISTlNG 
TO  APPEAR  IN  THE  SOURCE  LISTING.   NOTE  THAT  EXTRA  CO^PILATIO'I  SPACF  IS 
REOUIRfcD  WHEN  THIS  OPTiCn  IS  USEP, 

3.3.2  ST,^ST 

WHEN  THE  -ST-  OPTION  IS  USED,  REGISTER  AO  IS  SET  TO  THE  CURRENT 
STATEMENT  NU^BEW  At  THE  START  OF  ExECUTInIG  EaCH  nON-JULL  STATEMENT.  THE  DEFAULT 
IS  -ST-,  Wl-ICH  TURNS  THE  OPTION  ON.   NOTE  THaT  TmIS  TPTDN  CAUSES  THE 
OBJECT  PROGRAM  TC  RUN  SLOWER  AND  TAKE  HQRE  SPACE.   IT  IS  USEFUL,  HrwEVER,  FOR 
nfcHUGGlNG.  tNST'  TURNS  TwE  OPTION  OFF. 

3,4  PROGRAf*  CALLING  CARD 

THE  CARD  THAT  CALLS  A  COMPILED  PL/I  PROGRAM  CAN  ALST  RE  USED  TO  ACHIEVE 
SUBSTITUTION  OF  FILE  NA^FS  AND  PASSAGE  OF  AN  ARGUMENT  TO  THE  CALLEf  PROGRAM, 
THESE  FACILITIES  ARE  DESCRIBED  IN  THE  FOLLOWING  TWO  SECTIONS. 


3.4.1  FILE  NAME  SUBSTITUTION 


THE  t-lLES  REFERENCED  INSIDE  A  PROGRAM  ARE  VIRTUAL  FILES.   AN  ACTUAL  NAME 
CAN  BE  StJBSTITUTED  FOR  A  VIRTUAL  NAME  AT  THE  TIME  THAT  A  PL/I  PROGRAM  IS 
CALLED.   11-6  PROGRAM  CALLJ 

L0AD(LG0»RLIB) 

EXECUTE,, Al=Bl.A2=B?, A3=B3. 

CAUSES  THt  ACTUAL  FILE  NaMFS  Bl,H2.  R3  TO  BE  SUBSTITUTED  FOR  THE  VIRTUAL  FILE 
NAMES  A1»a2i  and  A3  RESPFCT I VEL Y .   IF  THE  PROGRAM  REFERENCES  FILE  41  INTERNALLY 
FOR  OUTPUT,  THEN  THE  ACTUAL  OUTPUT  WILL  APPEaR  CN  FILE  61,   IF  NO  CUPSTITUTION 
IS  GIVEN  t-CR  A  VIRTUAL  ElL^  mAmE.  THE  ACTUAL  NA^F  IS  TAKEN  T'^  RE  T»-E  SAmE  AS 
THE  VIRTUAL  NAME.   NOTE  THAT  FILE  nAmE  SUBSTITUTION  ^Y  RE  U«?ED  FOR  CALLS  TO 
THE  COMPILER  JTSELF  IN  THE  SAME  FASHION,   NOTE  ALSO  THAT  IF  A  PROGRAM  IS 
INVOKED  RY  AN  EXECUTE  CARD.  AN  EXTRA  COMMA  IS  NEFDED  F0LLO»^I'^G  EXECUTE. 


IF  AN  ACTUAL  FILE  NAMF  IS  -0-.  THEN  THAT  FILE  1 1  TREATEH  AS 
FILE.  ANY  OUTPUT  SENT  TO  THAT  FILE  IS  DISCARDEDl  ANY  ATTEMPT  TO 
FILE  WILL  RAISE  THE  ENDFILF  CONDITION. 

3.4.2  PASblN^.  AN  ARGUMENT  TO  A  MAIN  PROCEDURE 


A  NULL 
RFAP  THE 


AN  AHGUMENT  MAY  BE  pA'^SFD  TO  TflE  HAiN  PROCEOURE  FRO  1  THF  PROGRAM  CALL 
CARD.   TO  ACCOMPLISH  THIS,  DECLARE  THE  MAIN  PROCFDURE  TO  HAVF  A  SINGLE 
PARAMETER.   THAT  PARAMETFR  MUST  THEN  BE  DECLARED  TO  'IE  CH  AR  AOTER  ( 8r  )  VARYING, 
ON  THE  PRUGRAM  CALL  OR  EXECUTE  CaRD,  THE  ARGUMENT  TC  BE  PaSS^^D  TO  THE  MAIN 
PROCEDURE  IS  bURRQUNDED  RY  THE  CHARACTER  *    ,  AND  WRITTEN  FOLLOWING  THE  CLOSING 

-56- 


PARENTHESIS  OR  PERIOD  OF  TmE  CALL.   FMBEDDED  *    CHARACTERS?  WITHIN  Tl-E  ARGUMENT 
ARE  WRITTEN  AS  **     ,     IF  NO  ARGUMENT  IS  PROVIDED,  A  NULL  STRING  WILL  BE  PASSED  AS 
ARGUMENT,   FOH  EXAMPLE,  THF  FOLLOWING  PROGRAM  CALL  CARD  JSES  BOTH  FILE  NAME 
SUPSTITUTICN  AND  PARAMETER  pASSAQEt 


LOAD.LGO.RLIB, 

EXECUTE,,  lNPUT  =  TApEt,       ^THls    uQf^/^l    FAIL;! 


LGOC  INPUTsTAPEl)       i<THTS    WON^;!T    FAlL»< 


-57- 


4,  ERROR  MESSAGtS  AND  DERUr^GING  TECHNIQUES 

IF  YOuR  PHOljRAM  DOES  Unj    SUCCESSrULY  COMPILE  AfvID  EXECUTE.  THE  FIRST 
THING  YOU  f'UST  UO  IS  TO  TRY  TO  ClAsSIFY  THE  CAUSE.   THE  rOLLHWlNG  KINDS  OF 
ERRORS  CAis  OCCUHJ 

1,  SOLRCE  PROGRAM  ERRORS  CORRECTLY  TlAGNOSEn  RY  THE  COMPILER. 

2,  SULRCfc  PROGRAM  ERRORS  INCORRECTLY  DIAGNOSED  lY  THE  COMPILER, 

3,  FAILURE  OF  THE  COMPILER  TO  COMPLETE  EXECUTICM. 

4,  EHfiORS  IN  THE  COmRaSS  ASSEMBLY, 

5,  I\CORHECT  RESULTS  of  EXECUTION  DUE  TO  ERRORS  Vi    THE  SOURCE  PROGRAM, 

6,  INCORRECT  RESULTS  of  EXECUTION  DUE  TO  ERRORS  IN  THE  COMPILFR 
OH  THE  RUN-TIME  LIBRARY. 

IF  THE  COMPILER  DIAGNOSES  AN  ERROR.  THEN  YOU  SHOULD  CHECK  WHETHER  YOU  HAVE 
VIOLATED  iiCME  LANGUAGE  RESTRICTION.   IF'  'JOT.  THEN  YQU  HAVE  A  BUG  Tr  REPORT, 

ASSUMING  THAT  THE  COMolLER  HAS  NOT  DIaGnOSEO  ANY  ERRORS.  THE  rOMPlLER 
MAY  FAIL  UlRING  EXECUTION,   IF  THIS  SITUATION  ARISES.  THE  FI^ST  CAI SE  TO 
CHECK  IS  YCUR  DECK  SETUP.   If    THE  DECK  SETUP  IS  CORRECT,  THE'O  REPORT  A  BUG, 
THE  SAME  HEmARK  ARRlIEs  TO  E«RORS  IN  THE  COMPASS  ASS^MRLY.   vqTE  T^-AT  WARNING 
MESSAGES  IN  The  COmraSS  ASSEMBLY  ARE  EXPECTED.   THERE  IS  ONE  LANGUAGE 
RESTRICTION  WHOSE  VI0LATr0\)S  WILL  LEaD  TO  ASSEMBLY  ERRORS:  DPCLARATION  OF  THE 
SAME  EXTERNAL  VARIABLE  IM  mqrE  THAN  ONE  PLACE  IN  A  SINGLE  EXTERNAL  PROCEDURE, 

ASSUME  THEN  THAT  YOuR  PROGRAM  HAS  COMPILED  AND  ASSEMBLE^  WITHPUT 
APPARENT  DIFFICULTY.   IT  MAY  FAIL  DURING  EXECUTION  EITHER  RY  FARING  TO 
COMPLETE  bXECUTlON  OR  BY  GIVING  WRONH  ANSWERS.   HUGS  OCCURRpiG  AT  THIS  STAGE 
MAY  WELL  HE  SOURCE  PROGRAM  ERRORS  WITHOUT  THAT  BFIMG  OBVIOUS.   IN  TRDER  TO 
ISOLATE  THEM,  IT  IS  RECCmmpnDEO  THAT  YOU  WHITE  OUT  SUFFICIENTLY  MANY 
INTERMEDIATE  RESULTS  TO  ENABLE  YOU  TO  ISOLATE  WHERE  THE  TROUBLE  IS  OCCURRING. 
IF  YOU  CA\  ISOLATE  THE  TROUBLE  AS  OCCURRING  IN  A  SINGLE  STAT^^MENT,  THEN  REPORT 
THE  BUG. 

PRESENTLY  THE  ONLY  RUM-TIME  MESSAGES  aRE  THOSE  ASSOCIaTFD  WITf-  PROGRAM 
ABORTS,   AMONG  THE  CAUSES  '^F  THESE  AREJ   EXHAUSTION  TF  THE  STORAGE  POOL, 
CONVERSION  ERRORS  IN  GOInG  FROM  CHARACTER  STRINGS  TO  OTHER  DATA  TYPES,  AND 
READING  PAST  AN  END  oF  FiLF. 

IF  THE  STATEMENT  OPTION  IS  NOT  TURNED  OFF  (CF .     3EC.  3.3.2).  T»-EN 
IT  IS  POSblBLb  TO  DETERMINF  THE  STATEMENT  NUMBER  OF  THE  STATEMENT  PEING 
EXECUTED  AT  THE  TIME  OF  AN  ERROR.   WITH  THIS  OPTION  JN,  REGISTER  A"  IS  SET 
to  THE  ClJHRE'JT  STATEMENT  NumRER  AT  THE  BEGINNING  OF  EXECUTING  EACH  STATE- 
MENT,  THt  DUMP  SHOULD  ThEm  INDICATE  THE  STATEMENT  NUMBER  THAT  CAUSED  THE 
DIFFICULTY,   HOWEVER,  ThiS  OPTION  CAUSES  THE  PROGRAM  TO  ^UN  LONGER  AND 
CONSUME  EXTRA  SPACE.  SO  IT  SHOULD  NOT  BE  USED  FOR  DETUQGED  PROGRAM*:, 


-58- 


5,  AREAS  OF  I NCQMPAT I B I L I  TV 

THE  AREAS  OF  INCOMPATIBILITY  BETWEEN  CIMS  PL/I  AND  IBM  PL/I  ARE  ALL  MEN- 
TIONED IN  THE  LANGUAGE  SPECIFICATION  OF  SECTION  ?.       SINCE  THP  LANGi  AGE  FEATURES 
OF  ClhS  PL/I  ARE  PRESENTLY  FAR  FEWER  THAJ  THOSE  IN  I'lM  PL/I,  THOSE  FEATURES 
THAT  ARE  LACKING  WILL  NOT  «E  DETAILED  HERE.   HOWEVER,  DIFFERENCES  IN  THE  OTHER 
DIRECTION  SHOULD  BE  NOTED, 

5,1  EXTENSIONS  IN  CIMS  Pl/I 

1,  A  PROCEDURE  STaTfMFNT  NEED  NOT  BE  LAPELLED  A^  LO  JG  a'=^  aT 
LtAST  O^E  ENTRY  POINT  OF  TME  PROCEDURE  I  <5  LABELLED. 

2.  A  POINTER-QUALIFYING  EXPRESSION  IS  NOT  RESTRICTED  TO  BE  A  SCALAR 
UinSu^SCHIPTED  VARI4BLE)  IT  MAY  BE  AN  ARBITRARY  P  1 1  NTPR'V  ALL  ED  EXPRES" 
SICN, 

3.  THE  BASED  ATTRIBuTF  NEED  NOT  HAVE  AN  ASSOCIATED  POINTER  EXPRESSION. 
IF  SUCH  AN  EXPRESSION  IS  GIVFN*  IT  IS  NCT  RESTRICTED  TO  RE  A  SIMPLE 
VARIABLE, 

4,  BASED  CHARACTER  STRINGS  MAY  oE  VARYING.   F-LEVEL  PL/I  DOES  NOT  ALLOW 

THIS, 


5,  THE  ON-gTATEMENT  May  SPECIFY  SEVERAL  CONHlTMNS.   F. 
PL/J  DOES  NOT  ALLOW  THIS- 


EVEL 


6,  ClfS  PL/I  HAS  A  DEFAULT  STATEMENT  THAT  DOES  JOT  RXIST  AT  AIL  IN  THE 
F-COMPILER.   THE  I^M  OPTIMIZING  COMPILER  D0E3  HAVE  A  DEFAUI  T  STATEMENT. 
BUT  IT  HAS  DIFFERENT  SPECIFICATIONS  THAN  THE  ONE  IN  CIMS  PL/I, 

7,  ClfS  PL/I  HAS  A  STORAGE  ON-CONDITlON  THAT  HA"?  NO  EOUTVALENT  IN  THE 
F-COMPILER,   THE  I^m  OPTIMIZING  COMPILER  DOES  HAVE  S^'CH  A  CONDITION* 
HOWEVER, 

5,?     INCOMPATIBILITIES  BETWEEN  CIMS  PL/I  AND  IBM  PL/I 

1.  IN  CIMS  PL/I,  INTERNAL  PROCEDURES  MUST  NOT  BE  DECLARED, 
Ii\  SOME  VERSIONS  OF  IBM  PL/I.  ThEY  H^JST  f\E    DECLA-'^ED, 

2.  IN  CIMS  PL/I,  CONSTANT  LABEL  ARRAYS  MUST  NOT  BE  DECLAREDI  IN 
IB^  PL/I  THEY  MIST  BE  DECLARED, 

3.  IN  CIMS  PL/I,  ALL  UNDECLARED  VARIABLES  WITHOUT  CONTEXTUAL  ATTRIBUTES 
DEFAULT  TO  FIXED  (pINARY).   1N  I^M  Pl/I  THEY  WOULD  DEFAULT  TO  FLOAT. 

4.  THERE  IS  NO  CONNECTION  BETWEEN  THE  ENVIRONMEJT  OPTIO^J  AS  U^ED 
IN  IBM  PL/I  AND  THF  ENVIRONMENT  OPTION  AS  USED  I  J  CI^S  PL/I. 


-59- 


6,  FUTURE  FLANS 


EVENaALLY  I  HOPE  Tn  lMPLf=MENT  THE  FULL  PL/I  LA  JQUAIE  A«!  OEFUED  RY 
COMMITTE  X3J1  OF  THE  AMEpICAN  i^ATIOMaL  STANDARDS  IMSTITUTE.   THE  LANGUAGE  AS 
PRESENTLY  IMPLEMENTED  IS  A  LONG  WAY  FROM  THAT  GOAL.   THE  FEATURES  MENTIONED 
IN  REVISION  G  OF  THIS  MAmUAL  HAVE  ALL  BEEN  IMPLEMENTED,   STREAM  INFUT-OUTPUT 
IS  ON  THE  ^-AY,  dUT  NO  OThEP  IMPROVEMENTS  ARE  PLANNED  FOR  THE  IMMEDIATE  FUTURE, 


-60- 


7,  ACKNOWLEDGMENTS 

LINDA  SUSKIND  CONSTRUCTED  THE  FACILITIES  FOR  SUB>TITJTI0N  OF  FTLF 
NAMES- 

ABRAHAM  QETZLER  CONSTRUCTED  THE  CROSS-REFERENCE  MAP  AND  ATTRIPUTE  LISTING. 
AS  WELL  AS  THfc  OVERLAY  ROUTINES  USED  WITHIN  THE  COMPILER. 

IRA  KAYE  BUILT  THE  PNTIRE  MECHANISM  FOR  THE  DEFAULT  STATEMENT'.  AND  ALSO 
WROTE  APPtNDIX  A  OF  THIS  MANUAL. 

EMILY  EISENbERG  CONStRucTEH  THE  ORIGINAL  VERSION  OF  THE  LEXICAI  SCANNER, 

PETER  fACLEAN  PROVJCED  AN  INTERACTIVE  DERUGGING  FACILITY  WHOSE  USE  WAS 
INVALUABLt  IN  DEBUGGING  THE  COMPILER. 


-61- 


APPENDIX  A 


A.l  DESCRIPTION 


A.l  .1 


DEFAULT/DFT   (PPEHICATE)   DTT-ATTR-SET  t ,  ")FT- aJTR-SET  1  I 


IF  A  VARlAhiLE  MEETS  T^E  CONDlTinNS  SPECIFED  BY  THE  PREDICATE,  GlVE  THE 

VARIABLE  ThE  ATTRIBUTES  SPPCIFIED  IN  THE  DFT-A  TTR-SETS .   ATTf^MPT  Tr  APPLY  THE 
SETS  IN  THE  ORDER  LISTED,  anH  apply  EACH  ONE  WHICH  D3ES  iOT  CAUSE  ANY  CONFLIC- 
TING ATTRIBUTES  TO  BE  GIvEv. 

A, 1.2   DE^AULT/DFT  (PRFClCATE)  EHRORj 

IF  THE  VARIABLE  MEETS  THE  CONnlTlONl  SPFCIFIED  IN  PREDICATE.  ^IVE  IT  THE 
ERROR  ATTHlBUTEi  AND  PRImT  A  MESSAGE.   PROGRAM  WILL  JOT  RE  E^ECUTAFLE  IF  ANY 
VARIABLES  ARE  GIVEN  THE  ERPOR  ATTRIBUTE, 

A, 1.3   DEt-  AULT/DFT  SYSTEM; 

APPLV  THE  STANDARD  PL/1  SYSTEM  DEFAULTS  TO  THE  VARIABLE.   ANY  DEFAULT 
STATEMENTS  APPEARING  AFTER  T^IS  TYPE  WILL  RE  IQr.ORED, 

A, 1.4   DEFAULT/DFT  NONE  J 

APPLY  NO  FURTHER  DEFAULTS,  !NcLUDlN«l  THE  SYSTEM  DEFAULT.   IF  4  VARIABLE  IS 
NOT  CQMPLbTE,  IT  WILL  ^E  Glv^N  THE  ERROR  ATTRIBUTE.  AnH  An  EpROR  MESSAGE  WILL 
RE  PRINTEU, 

A,?  ORDER  CF  PROCESSING  ANn  SCOPING. 

DEFAULT  STATEMENTS  aR=  PROCESStD  IN  THE  ORDER  I')  WHICH  THEY  APPEAR.  THE 
DEFAULT  SIATEMENTS  APPEARlMG  IN  A  BLOCK  ARE  APPLIED  IN  SEQUENCE  TO  EAC^ 
VARIABLE  UECLARED  IN  THE  BLOCK, 

DEFAULT  STATEMENTS  FOLLOW  THE  SAME  ;JAME  SCOPING  CONVENTIONS  ^<:    VARIABLES. 
THEY  ARE  KNOWN  JN  THE  PLOCkS  IN  WHICH  THEY  APPEAR  AM  ALL  ENCLOSED  BLOCKS, 
THEY  ARE  PROCbSSED  FROM  THP  INNERMOST  BL^CK  OUT. 

EX,  1, 

AJ  PROC   OPTiOmS  (MAIN)  I 
DFT  (PI)  SFT1  » 
DCL  Al  ; 

•  •  • 

Bl  PROC  J 

DFT  (P2)  «;ET2  J 
DCL  Bl  J 

•  fl  • 

END  B    i 
END  AJ 

Rl  WILL  HAVE  PREDICATES  p2  AND  PI  AfPLlED  TO  IT  IN  T <AT  nRHE",  Al  UiLL  HAVE 
PREDICATE  Pi  APPLIED  TO  iT.   jr  iT  IS  DESlREn  ThaT  O^JLY  ^ReDTCaTE  P2  BE  APPLIED 
TO  Bl,  THbN  IfJ    SYSTEM;  SHOULD  BE  PLACED  AFYFR  DFT(P^)  SET?; 


EX,  2, 


A!  PROC  OPTICnS^MAIN)  ; 
DFT  (P3)  SETT  I 
DFT  none; 


-62- 


VARIABLES 
DEFAULTS, 
VARIABLES 
VARIABLES 
DEFAULTS. 


Bl  PROC  } 
DFT  SYSTEI" 


pRnc  J 

DFT<P4) 

•  «  • 

c; 


SET4  I 


END 

End  b; 

END  Aj 

CECLAHED  in  BLOCi<  a  WILL  HaVE 

Including  systfm  defaults, 
declared  in  blnc^  r  will  have 
chclared  in  block  c  will  have 


PREDICATE  P3  APPLIFD.  AND  NO  OTHER 

THE  STAKHARD  SYSTEfl  "lEFAULTS  APPLIED 
P4  APPLIED  T^  THEM,  AND  THFN  SYSTEM 


A, 3   LESCHIPTION  OF  PRFClCATES  AND  DFFAULT  ATTRp^UTE  SETi, 
A. 3.1      PREDICATES 

PREDICATES  ARE  MADE  Up  BY  CDMRlMlNG  ANY  OF  THE  ALLO  MBLf^  ATTRTBUTE  KEY- 
WORDS IN  dCOLEAN  EXPRESSIONS  USING  a.  t,  AND  -.   TO  !?EFE'-)  TO  DIMEN'cIONED 
ARRAYS,  USE  THE  KEYWORD  oImeNSION  OF-  OlM.   IN  ADhITMN  T^E  FOLLOWING  ELEMENTS 
ARE  ALLOWtC  IN  PREDICATESl 

RANGE(STRING)  -  A  VARIABLE  f'FFTS  THIS  CCNDITIOU  IF  IT  STARTS  WITH  THE 
CHARACTER  STRING, 

RANGE(CHAR1  :  CMAr2)  -  A  VARIABLE  MEttS  TMIS  C0?>1D  I T I  O'^S  IF  ITS  FIRST 
LETTER  IS  IN  THE  LEXICAL  ImtERVAL  SPECIFIED  BY  CHARl  AND  CMA'^2. 
RaNGEC*)  -  ALL  VARIABLE*?  mEET  ThiS  cCNDlTjO^^. 


M.3.2 


DEFAULT  ATTRIBUTF  SETS. 


ATTRIBUTE  SETS  ARE  mAHE  UP  OF  LISTS  OF  ATTRlBDTf^S  W'UCH  ARE  Tr  RE  APPLIED 
TO  A  VARIABLE  IF  IT  MEETS  THE  CONDITIONS  OF  THE  PREDICATE.   AS  MANY  SETS  AS 
DESIRED  MAY  Hfc  PLACED  IN  A  DEFAULT  STATEMENT.   THE  ATTRIBUTES  GIVEN  IN  THE  SET 
ARE  APPLItC  TO  THE  VARIA«LF,  PRQVlntn  THAT  THEY  DO  N')T  CTNFLICT  WITH 
ATTRIBUTES  THAT  HAVE  AlREAHY  BEEN  ASSIGNED  TO  THE  VA'^^IABlE.   IF  SUCH 
CONFLICTS  EXIST,  THEN  T^P  SET  IS  IGNORED.   ANY  ATTRIBUTE  OTHPR  THAK  THE 
LIKE  ATTRIBUTE  CAN  APPEAR  IN  A  DEFAULT  ATTRIBUTE  SET, 


-63- 


Appendix  B.    Useful  Coding  Tricks 

In  this  appendix  I  shall  describe  some  coding  tricks  that 
I  used  in  the  compiler  and  in  its  run-time  library.   This 
section  assumes  a  knowledge  of  6600  machine  language.   In  the 
following  descriptions,  the  assumption  is  made  that  any  register 
not  defined  by  the  code  specification  can  be  used  as  a  temporary. 

B.l    Absolute  Value,  Maximum,  and  Minimum 

The  method  for  taking  absolute  values  without  a  branch 
instruction  has  been  passed  down  by  oral  tradition,  but  I  have 
never  seen  it  in  print.   The  following  code  takes  the  absolute 
value  of  the  contents  of  X7  and  leaves  the  result  in  X7. 


BX6  X7 

AX6  60  FILL    X6    VJITH    SIGN    BIT    OF    X7 

BX7  X7-X6  INVERT    X7    IF   NEGATIVE 

A  variation  of  this  device  will  take  the  maximum  of  X6  and  X7 , 
leaving  the  result  in  X7 : 

FX5  X7-X6 

BX4  X5 

AX4  60 

BX5  X5*X4  X7-X6    IF   NEGATIVE,    ELSE    0 

FX7  X7-X5  IF    X7    EXCEEDS    X6 ,    SUBTRACT    X7-X6 

If  60  happens  to  be  available  in  a  B-register,  these  sequences 
can  each  be  shortened  by  one  instruction.   The  code  for  the 
minimum  is  analogous. 

B.2    Least  Integer  in  X 

If  we  have  a  floating  number  and  wish  to  find  the  least 
integer  not  exceeding  that  number,  the  code  is  tricky  because 
float-to-integer  conversion  truncates  towards  zero  and  gives 
the  wrong  result  for  negative  numbers.   Assuming  input  in  X7 , 
the  following  code  puts  the  desired  least  integer  into  X7. 


-64- 


MXl 

1 

BX2 

X7*X1 

LX2 

1 

1X7 

X7+X2 

UX7 

X7,B2 

LX7 

X7,B2 

1X7 

X7-X2 

1  TO  X2  IF  X7  NEGATIVE,  ELSE  0 
IF  X7  NONNEGATIVE,  LEAVE  IT  UNCHANGED 
IF  X7  AN  EXACT  INTEGER,  BUMP   IT  UP 
IF  X7  NOT  EXACT  INTEGER,  KEEP  SAME 
INTEGER  VALUE 

CONVERT  FLOAT  TO  INTEGER 
SUBTRACT  1  IF  NEGATIVE 

To  understand  this  code,  note  that  if  X7  is  nonnegative  the 
addition  and  subtraction  of  X2  has  no  effect,  since  X2  contains 
zero.   Otheanvise  we  have  two  cases.   If  X7  contained  the  floating 
representation  of  a  negative  integer,  then  the  addition  will 
give  us  a  value  that  will,  when  truncated,  give  the  next  higher 
integer.   (E.g.,  we  will  go  from  -3  to  -2.99999999.)  After 
conversion  to  float  we  have  a  number  one  too  big  (e.g.  ,  -2) 
and  the  following  subtraction  corrects  this  situation.   Suppose, 
on  the  contrary,  that  X7  contained  the  floating  representation 
of  a  negative  number  that  was  not  an  exact  integer.  Then  we 
would  go,  for  instance,  from  -3.3  to  -3.299999999  to  -3 
(after  conversion  to  integer)  to  -4  (after  subtraction) . 

The  same  device  works  for  finding  the  greatest  integer. 
In  order  to  treat  zero  correctly  the  correct  procedure  is  to 
complement  X7 ,  use  the  code  given  above,  and  recomplement  X7. 

B.3   Division  by  Ten 

Performing  integer  division  by  10  is  necessary  when  finding 
the  number  of  words  needed  to  represent  a  packed  character  string, 
since  each  character  occupies  6  bits  and  each  word  has  60  bits. 
Integer  division  on  the  6600  can  only  be  carried  out  by  using 
the  floating  division  instruction;  the  whole  process  requires 
five  instructions.   The  division  instruction  itself  takes  20 
minor  cycles ,  twice  as  long  as  any  other  non-branching  instruc- 
tion.  Thus  it  is  desirable  to  avoid  use  of  the  division  instruc- 
tion in  dividing  by  10.   This  can  be  done  by  multiplying  by  a 
suitably  imnormalized  one-tenth.   Assuming  input  and  result 
in  X7 ,  we  have : 


■65- 


SAl  -17170631463146314632B    1/10+EPSILON 

PX6  X7  FLOAT  X7  INTO  X6 

FX6  X1*X6  QUOTIENT  TO  X6 

SX7  X6  CLEAR  JUNK  BITS  FROM  EXPONENT 

The  exponent  is  chosen  so  that  the  desired  quotient  appears  in 
the  right  end  of  the  upper  half  of  the  product.   A  similar  trick 
can  be  used  for  division  by  other  integer  constants. 


B.4    Changing  the  Representation  of  Blank 

The  internal  character  code  for  blank  is  55B,  which  gives 
the  wrong  result  in  character  comparisons  since  it  places  blank 
after  the  letters  or  digits.   Before  doing  a  string  comparison 
other  than  equality  or  inequality  it  is  desirable  to  map  all 
blanks  to  the  code   OOB,  thus  redefining  the  collating  sequence 
to  make  it  more  reasonable.   It  is  possible  to  perform  this 
operation  on  an  entire  word  without  looping.   The  process  proceeds 
in  five  stages.   In  the  first  stage  we  map  the  input  word  so 
that  55B  characters  become  77B  (all  ones)  and  no  other  characters 
become  77B.   In  the  second  stage  we  convert  each  7  7B  to  a  charac- 
ter such  that  the  leftmost  bit  is  1,  and  each  other  character  to 
a  character  such  that  the  leftmost  bit  is  0.   This  is  done  through 
shifting  and  "and"-ing.   In  the  third  stage  we  convert  each  77B 
to  40B  and  each  other  character  to  0 .   In  the  fourth  stage  we 
convert  each  40B  to  77B   and  each  other  character  to  0.   In  the 
fifth  stage  we  use  the  mask  we  have  just  developed  to  change 
the  55B's  to  OOB's.   Assuming  input  and  output  in  X7 ,  the  code  is: 


-66- 


SAl 

=  10H 

BX6 

-X1-X7 

BX5 

X6 

LX5 

3 

BX6 

X5*X6 

SBl 

1 

LX5 

X6,B1 

BX6 

X5*X6 

LX5 

X6,B1 

BX6 

X5*X6 

LX5 

X6,B1 

BX6 

X5*X6 

SA2 

=40404 

BX6 

X2*X6 

AX5 

X6,B1 

BX6 

X5+X6 

AX5 

X6,B1 

BX6 

X5+X6 

AX5 

X6,B1 

BX6 

X5+X6 

BX5 

X6 

AX5 

3 

BX6 

X5+X6 

BX7 

-X6*X7 

CHANGE  55B  to  77B  in  X6 

AND  3  LEFT  BITS  WITH  3  RIGHT  BITS 


ALL  6  BITS  NOW  ADDED  INTO  LEFT  BIT 
)404040B 

ELIMINATE    EXTRA   BITS 
CHANGE    4 OB    TO    77B 


NOW  ALL  CHARS  ARE  77B  or  0 
MASK  55B  TO  0 


The  same  approach  can  be  used,  for  instance,  to  count  null 
characters  in   a  word;  in  this  case  one  produces  40B  for  each 
null  character  and  then  uses  the  population  count  instruction. 
Other  related  tasks  can  also  be  accomplished  with  this  approach. 

B.5    Searching   for  a  Set  of  Characters  in  a  String 

The  trick  for  searching  for  a  set  of  characters  in  a  string 
is  needed  in  connection  with  the  VERIFY  built-in  function  in  PL/I. 
In  this  situation  we  are  given  a  set  of  characters ,  and  we  wish 
to  find  the  first  character  in  the  string  that  is  not  in  the  set. 
(The  problem  of  finding  the  first  character  that  is  in  the  set 
is  similar.)   Rather  than  give  the  actual  code,  I  will  just 
describe  the  method.   First,  form  a  bit  mask  such  that  bit  _i 
is  one  if  and  only  if  the  character  with  representation  i.  is  in 
the  set.   Then  loop  through  the  characters  in  the  string.  For 
each  character,  use  the  character  representation  as  a  shift 
count  and  shift  a  single  one  bit  left  by  that  count.   Form  the 
"and"  of  the  shifted  bit  and  the  bit  mask  constructed  above. 
If  the  result  is  nonzero,  then  the  character  is  in  the  set. 

There  is  a  difficulty  with  this  method.   If  the  characters 
include  any  whose  octal  representations  exceed  60 ,  then  there 

will  be  an  ambiguity,  and  two  bit  masks    are  needed  rather  than  one. 

-67- 


This  report  was  prepared  as  an  account  of 
Government  sponsored  work.   Neither  the 
United  States,  nor  the  Commission,  nor  any 
person  acting  on  behalf  of  the  Commission: 

A.  Makes  any  warranty  or  representation, 
express  or  implied,  with  respect  to  the 
accuracy,  completeness,  or  usefulness  of 
the  information  contained  in  this  report, 
or  that  the  use  of  any  information, 
apparatus,  method,  or  process  disclosed 
In  this  report  may  not  infringe  privately 
owned  rights;  or 

B.  Assumes  any  liabilities  with  respect  to 
the  use  of,  or  for  damages  resulting  from 
the  use  of  any  information,  apparatus, 
method,  or  process  disclosed  in  this 
report. 

As  used  in  the  above,  "person  acting  on  behalf 
of  the  Commission"  includes  any  employee  or 
contractor  of  the  Commission,  or  employee  of 
such  contractor,  to  the  extent  that  such  em- 
ployee or  contractor  of  the  Commission,  or 
employee  of  such  contractor  prepares,  dis- 
seminates, or  provides  access  to,  any  infor- 
mation pursuant  to  his  employment  or  contract 
with  the  Commission,  or  his  employment  with 
such  contractor. 


-68-