COO-3077-31
^
Courant Institute of
Mathematical Sciences
AEG Computing and Applied Mathematics Center
A Preliminary Report on CIMS PL/I
Paul Abrahams
AEC Research and Development Report
Mathematics and Computers
August 1973
New York University
^
UNCLASSIFIED
AEC Computing and Applied Mathematics Center
Courant Institute of Mathematical Sciences
New York University
Mathematics and Computers COO-3077-31
A PRELIMINARY REPORT ON CIMS PL/I
Paul Abrahams
Contract No. AT (11-1) -3077
UNCLASSIFIED
Table of Contents Page
1 . Introduction 1
2 . Implementation Characteristics 2
2 , 1 Language accepted by CIMS PL/I 2
2 . 2 Features Not Yet Implemented 4
2 . 3 Ope rating Characteris tics 6
3. Bootstrapping Method 7
4. Structure of the Compiler 11
5 . Implementation Techniques 14
5. 1 Parsing Method 14
5.2 Data Structure Design 17
5.2.1 Choice of Representation Method 17
5.2.2 Examination of Trade-Offs for
Specific Structures 19
5 . 3 Run-Time Storage Allocation 22
6 . Conclusions 2 4
6.1 Sources of Program Complexity 24
6.2 Operating Systems Support 27
6 . 3 Virtual Memory 29
6.4 Reliability 30
6.5 Project Organization 31
6.6 PL/I as a Systems Programming Language 32
6.7 Design Flaws 33
References 36
Appendix A - User's Manual for CIMS PL/I 37
Appendix B - Useful Coding Tricks 64
-111-
1 . Introduction
CIMS PL/I is an implementation of a PL/I subset on the
CDC 6600 computer. The subset is oriented towards compiler
writing, and is sufficiently large so that the compiler itself
could be conveniently written in it. The objective of this
report is to set forth some of the more interesting aspects of
the compiler and some of the conclusions that I drew from
creating it.
I had several initial objectives when I decided upon this
project. One of them was to create a compiler for a useful
programming language. That goal has been achieved, though
it is now less significant since CDC has also produced a PL/I
compiler. A second objective was to learn about compiling
techniques. Compilation of PL/I is sufficiently difficult
so that if one can compile PL/I one can probably compile
anything. A third objective was to gain more understanding of
the characteristics of complex programs. A fourth objective
was to stimulate thinking about programming language design.
CIMS PL/I has been operational since October 19 72 and
consumed about three and a half calendar years and one and
a half man years of work. Production of the original version
of the compiler was essentially a one-person effort. Since
the original version appeared, a number of extensions and
improvements have been made, some of them by others. The
compiler itself is written in PL/I, and was bootstrapped using
the IBM 360 and the IBM F-compiler for PL/I.
There are two appendices to this report. Appendix A is the
current user's guide to CIMS PL/I. Appendix B describes a
collection of coding tricks that I found useful and that may
be useful to others.
-1-
2. Implementation Characteristics
Two aspects of the implementation are worth describing:
the language accepted by the compiler and the operating
characteristics of the compiler. The three following sub-
sections discuss the features included in CIMS PL/I, the
features not yet implemented, and the operating characteristics.
2.1 Language Accepted by CIMS PL/I
In this section I shall only summarize the available
language features, as a complete description is given in
Appendix A. In implementing these features , there are many
possible language definitions to which one can turn. I have
used as my guide the definition of PL/I being developed as
a proposed standard by committee X3J1 of the American National
Standards Institute jointly with committee TCIO of the European
Computer Manufacturers Association. While the proposed standard
has not yet been issued, drafts of it are available [5] . These
drafts do not, however., include final decisions on all language
questions.
The datatypes that have been implemented so far are
FIXED, FLOAT, CHARACTER, BIT, POINTER, AREA, LABEL, FILE, and
ENTRY. Labels, files, and entries are presently I'imited to
constants. The implementation of FLOAT is quite weak, since
only arithmetic operations are provided for it; in particular,
there is as yet no floating point input or output. The
only way to create or access floating point constants is
through conversion from FIXED.
Various other attributes are available. Arrays and structures
have been implemented, and these may be nested within each other.
The storage classes other than CONTROLLED are available, as
well as the INITIAL attribute (with some restrictions) and the
LIKE attribute. Most of the statement types have been imple-
mented. In general, though, the power of a PL/I subset is not
accurately reflected by the range of statements that it
recognizes .
Transput (usually called input-output) is rather primitive
at present. The only transput operations are writing a character
-2-
string to a dataset or reading a character string from a dataset.
Datasets can be rewound by opening and closing them with
appropriate options. Since the initial bootstrap, provisions
have been added for dataset name substitution, so that the
same program can be applied to datasets with different names.
The usual way of creating output information is to build it up
in a string by concatenation, using implicit conversions to
convert numbers to strings. Then the string is written out as
a whole. For input, a string is brought in and broken down into
parts using the SUBSTR built-in function. Strings representing
numbers can then be implicitly converted to the numbers them-
selves. Work is presently under way on providing stream transput
facilities, which include both formatted and unformatted
transput of lists of variables.
A limited set of on-conditions has been added since the
initial version of CIS PL/I became operational. These were
chosen mostly because they were needed by the compiler itself.
Perhaps the most useful of these is ENDFILE, which allows the
programmer to take some specified action when end-of-file is
encountered on input from a specified dataset. Another,
STORAGE, allows the compiler to change its field length
dynamically.
At present the list of built-in functions is quite limited.
Since many of these functions can be implemented using existing
Fortran library routines, most of the work depends only upon
providing appropriate linkages. It is possible to access the
Fortran library even under the present implementation by calling
a Fortran function, since calling sequences are compatible for
scalar numerical arguments.
A number of implicit and explicit type conversions are in the
present implementation; conversion to and from character strings
has been emphasized. These conversions take place both on
assignment and when arguments are passed to procedures.
Aggregate arguments can be passed to procedures , though type
conversion for such arguments has yet to be implemented.
-3-
2.2 Features Not Yet Implemented
The major features that are lacking are:
1. Stream transput. PL/I has three varieties of stream
transput, none of which are currently implemented. Edit-directed
transput is similar to Fortran format-directed transput but
is far more dynamic. PL/I allows arbitrary expressions in most
contexts where Fortran requires a constant, and the values of
these expressions can be influenced by data being read in under
control of the format statement. In the extreme case, an
expression in a format statement could invoke a function which
in turn would initiate another transput operation, perhaps even
on the same dataset!
List-directed input causes data to be read in a free-field
format, and among other things, requires that type conversions
be selected dynamically during the input process. (For
instance, a data item in floating-point representation might
be read into a fixed variable.) List-directed output causes
the value of a sequence of expressions to be written onto a
dataset at fixed tab positions. Data-directed input and out-
put associate the name of a variable with its value; in
particular, on input the name of the variable being read in
is part of the input. Thus a run- time symbol table must be kept.
2. Record transput, A very limited form of record transput
is the only form of transput presently in CIMS PL/I. Record
transput is used to achieve random access to secondary storage
by means of keys, and requires a rather extensive interface
with the operating system. Record transput also interacts with
multitasking, since a record transput operation may overlap
computation as well as other transput operations.
3. Aggregates. The full treatment of aggregate data
(structures and arrays) is one of the thorniest aspects of PL/I
implementation. Procedures should be able to return aggregates
as their values , and aggregates should be able to be passed as
arguments to built-in functions. Further, type conversion must
-4-
be implemented for aggregates, and this conversion may require
the development of an aggregate from a scalar. PL/I provides
the ability to reference cross-sections of arrays, but this
ability is not yet in CIMS PL/I. Further, through the use
of the DEFINED attribute, one aggregate may be defined in terms
of a mapping from another aggregate. The DEFINED attribute
has not yet been implemented either.
4. Arithmetic data types. PL/I allows three independent
arithmetic attributes to be applied: base (DECIMAL or BINARY) ,
scale (FIXED or FLOAT) and mode (REAL or COMPLEX) . Furthermore,
precisions (number of digits to be kept) and scale factors may
be specified. Since at present all fixed numbers behave like
Fortran integers, a great deal remains to be done in this area.
5. Multi- tasking. PL/I provides extensive facilities for
the generation of parallel tasks and for task synchronization.
These facilities are useful even on a machine with only one
central processor, since tasks may be dependent upon transput
operations. None of these facilities are yet implemented,
though certain hooks for them exist.
6. Pictures. PL/I has a PICTURE facility much like that
of COBOL. Pictures are useful for describing and editing data
used in commerical applications. Pictures are not presently
implemented.
7. Data alignment. PL/I provides the ALIGNED and UNALIGNED
attributes to control the tradeoff between economy of storage
and speed of access. Although on the surface it might appear
that a correct implementation could ignore these attributes ,
it happens that they have certain implications with regard to
the storage of character and bit strings. In particular,
adjacent character strings in an unaligned structure must be
stored adjacently even if that requires starting one of them
in the middle of a word. A similar situation holds for bit
strings. One implication of this requirement is that pointers
must be able to point to bits or characters within a word;
pointers as presently implemented in CIMS PL/I can only point
to words .
-5-
8. Bit strings. The implementation of bit strings is
presently quite restrictive, as they can only be of fixed
length and that length must not exceed 60. Varying bit strings
are not implemented at all.
9. Built-in functions. PL/I requires a rather large library
of built-in functions, and in addition there are a number of
implementation-defined ones that would be desirable. Only a
few built-in functions have been implemented so far, though
many of the remaining ones could be taken directly from the
Fortran library.
2.3 Operating Characteristics
The compiler is designed to operate under the CDC Scope 3.2
Operating System [6] on the CDC 6000 computer series, though
it will also run under the Scope 3.4 Operating System [7]. The
input to the compiler is a sequence of PL/I external procedures,
and the output is in the form of 6600 assembly language. This
assembly language must then be assembled in a separate step,
using a set of macro definitions provided with the compiler.
Finally, the assembled program is loaded together with a
run-time library and executed.
The compiler itself presently operates at a rate of about
15 statements per second and requires a minimum field length
of lOlK octal; large programs require considerably more space.
Assembly takes about 30% more time than compilation, so that
the total rate of compilation is about 6 statements per second.
An extensive set of compile-time diagnostics has been
provided. Run-time diagnostics are much sketchier, though a
recently added feature makes it easy to determine the point at
which a program has stopped. Listings are unornamented except
for the inclusion of statement numbers. A listing of variables
and a cross-reference map can be obtained.
-6-
3. Bootstrapping Method
The CIMS PL/I compiler is written in PL/I and was boot-
strapped from the IBM 360. In Figure 1 we see the process
involved in obtaining an executable program from a source
program. The source program is processed by the compiler to
macro
definition
PL/I
source
code
COMPILER
(written
in PL/I)
6600 assembly
I language t
»
)
I
.1
run-time
library
COMPASS
ASSEMBLER nodules
load
> LOADER
execu-
table
pro- ^
gram
TAPE
tTRANSFER
Dotted path is taken during bootstrapping.
Figure 1. Compilation Process and Bootstrap Procedure
produce a version in COMPASS, the 6600 assembly language. Since
the COMPASS version is expressed as lines of code, the compiler
can be viewed as transforming character strings to character
strings. (In this analysis we are ignoring such matters as
output listings.) The COMPASS code consists mostly of calls
on macros, so that the assembler must be provided with a collec-
tion of macro definitions. The output of the assembler is a
collection of load modules, one for each external procedure.
The compiled code will call upon various run-time subroutines
for services such as storage allocation, character manipula-
tion, and evaluation of built-in functions. These run- time
subroutines are linked to the compiled code by the loader, so
that the result of the loading process is an executable
program.
Now let's return to the compiler itself, which we shall
call CI. CI is written in PL/I and transforms character strings
-7-
to character strings. Therefore it will run on any machine that
has an operational PL/I compiler; we shall call this second
compiler C2 . Furthermore CI is almost entirely machine-
independent. There are a few machine dependencies such as the
character tables used by the lexical scanner, the transformation
used by the hashing algorithm, and the overlay calling conven-
tions. These sections must be changed to correspond to the
machine upon which CI is executing.
When this project began, of course, PL/I did not exist on
the 6600. However, we can consider a rather awkward way of
obtaining executable PL/I programs on the 6600 under these
circumstances. First, CI is compiled on the 360 using C2 .
Since CI is a PL/I program and C2 compiles PL/I programs so
that they will execute on the 360, there are no difficulties
here. The result is that CI now executes on the 360, and
converts PL/I source code to 6600 assembly language. It might
seem that the process would fail since C2 turns out 360 machine
language. Remember though that CI is just a PL/I program that
transforms character strings to character strings. The fact
that the output character strings are 6600 assembly language
is irrelevant.
We now have our awkward method of compiling PL/I for the 6600.
We take our PL/I source code, pass it through CI on the 360, and
write the results on tape. We then read the tape on the 6600 and
carry out the rest of the process, as shown in Figure 1.
The bootstrap approach is based on the observation that CI is
itself written in PL/I, and tlierefore this awkward method of
compilation can be applied. Thus, we pass the text of CI through
the process of Figure 1, including the tape transfer across
machines. The result is CI as an executable program on the
6600 -- in other words, an operational PL/I compiler on the
6600. From this point on the 360 is not used, and the dotted
connection of Figure 1 is replaced by a direct connection.
Bootstrapping from the 6600 to the 6600 is necessary
whenever improvements are made to the compiler. These improve-
ments are of two sorts. Language extensions may enable the
•8-
compiler to do things that it could not do previously. For
instance, the original version had no means of detecting
end-of-file, and so the compilation of each external procedure
required a separate job step. When the ENDFILE on-condition
was added, the compiler could be modified to allow for batched
compilations. Optimization of compiled code is an even more
important advantage of repeated bootstrapping. As the code
turned out by the compiler becomes more efficient, the
performance of the compiler itself becomes improved through
repeated bootstrapping.
An incidental advantage of bootstrapping is that it
provides a severe test of the compiler itself. The approach
is shown in Figure 2. We start off with an old machine-
language version M(l) and a new source-language version S(2),
(1)
(2)
(3)
S{2)
S(2)
S(2)
> M(l)
^ M(2)
^M(3]
■^ C(2,l) = M(2)
-^ C(2,2) = M(3)
-> C(2,3) = M(4)
M(i) machine language-version i
S(i) source language version i
C(i,j) machine- language compiler
produced by compiling source
version i under machine version j
Figure 2. Iterated Bootstrapping
The first compilation produces M(2) , which we then use to
compile S(2) again and produce M(3). The process is iterated
once more to produce M(4). If the bootstrap process has
succeeded, then M(3) and M(4) will be bit-for-bit identical.
-9-
a condition that can easily be checked mechanically. To see
this, we note that the code generated by S(2) should be
independent of the C used to compile S(2) , since S(2) as we
noted before is just transforming character strings to
character strings. Now M(2) and M(3) are both realizations
of S(2) and thus should transform any input, including S(2)
itself, in the same way. We now observe that if M(3) and M(4)
are bit-for-bit identical, then any number of further iterations
will produce results identical to M(3) . Thus we know that
stability has been achieved. In typical practice, M(l) performs
the same transformation as S(2) and hence as M(2) ; in this case,
M(2) and M(3) will be identical and the process can be termi-
nated one step earlier. Since S(2) may produce rather different
code than S(l), however, we cannot count on this early termina-
tion.
•10-
4. Structure of the Compiler
The major components of the compiler itself are shown in
Figure 3. The input to the compiler is PL/I source code;
the output is 6600 assembly language. The unit of processing
is the external procedure, and each of the boxes shown
represents a pass over part or all of the procedure.
LIKE
resolver
declaration
processor
parser
:ieclara-
reference
resolver
-^
source
tions
code
^
J
I
r
lexical
scan
storage planner
>
^
^^
CO
language
Figure 3. Structure of the Compiler
The first pass is performed by the parser, which is described
in more detail in Section 5.1. It communicates with a lexical
scanner which breaks down the input stream into tokens such as
identifiers, numbers, and punctuators. The output of the parser
is a collection of interconnected data structures. Some of these
data structures are associated with declarations; most of the
rest of them are associated with statements. The data structures
-11-
are interconnected through pointers, and their organization
is discussed further in Section 5.2.
During parsing, a list is kept of all declarations that use
the LIKE attribute. The LIKE attribute allows the programmer
to copy a portion of a declaration into another declaration,
and considerably simplifies the work of writing long but
similar declarations. It must be processed at a very early
stage because until it is processed the set of declarations
is not complete. The parser keeps a list of all places where
the LIKE attribute is used, so that it is not necessary to scan
the entire text looking for occurrences of LIKE.
The task of the reference resolver is to connect uses of
variables with their declarations. The reference resolver
must scan through all statements as well as through all
expressions appearing in declarations. Each use of a variable
is represented by a pointer. This pointer is then replaced by
a pointer to the appropriate declaration. The task of resolving
a reference can be lengthy because of the complications intro-
duced by block structuring and name qualification. Therefore
different references to the same variable are represented by
pointers with the same value, and the full work of reference
resolution need be carried out only once for each distinct
reference.
Reference resolution may turn up references to variables that
are nowhere declared. Implicit declarations are generated for
such variables. These declarations may require the application
of contextual attributes, which are determined by the way that
a variable is used. Contextual attributes are picked up during
the parse and applied to implicit declarations during reference
resolution.
The declaration processor generates the full set of declara-
tions for the procedure. It must check and complete the
declarations that have been given by the programmer, and it must
also generate certain declarations that the programmer has not
(and indeed, cannot have) specified. Completion of declarations
is necessary because the attributes that the programmer has
-12-
written down may not be sufficient to completely characterize
the declared variable. Completion of declarations also
involves the application of any DEFAULT statements that the
programmer may have written. When completing declarations
we must also associate each built-in function with an index
number. Each internal procedure must have an entry declara-
tion generated for it, and subscripted labels have label array
declarations generated for them.
The last and most complex pass is the code generator.
The code generator processes the statements of the source
program in the sequence in which they were written, except
for DECLARE statements. The type of statement then determines
the particular statement generator that must be called upon.
Furthermore, whenever a BEGIN statement or a PROCEDURE statement
is encountered, the storage planner is invoked. The storage
planner is a major part of the code generator; its task is to
develop the arrangement of storage that will exist during
program execution.
The largest single component of the code generator is the
procedure COMPUTE. COMPUTE receives an expression as its input.
It generates code for evaluating that expression, and returns
as its result a data structure that specifies the attributes
of the computed value as well as the information needed to
access that value.
-13-
5 . Implementation Techniques
In this section I shall discuss a few interesting aspects
of the implementation. This discussion is by no means exhaus-
tive. It is intended to present a few selected issues and to
show the approach that I took to them.
5.1 Parsing Method
The parsing method used in the compiler is detailed in [1],
and will only be sketched here. It is based on a program
called SYNDIPAR (SYNtax Directed PARser) , whose input is a
specification of the required syntax and semantics and whose
output is a PL/I parsing procedure together with the tables
that drive it. SYNDIPAR itself is written in SNOBOL. In my
early planning I assumed that SYNDIPAR would not have to be
run very often and therefore its efficiency would be irrelevant.
This has not turned out to be the case; the inef ficiences of
SNOBOL have caused SYNDIPAR to consume inordinate amounts of
resources, and I am planning to rewrite it in PL/I. Neverthe-
less, it works. (Of course, the compiler itself is in no way
dependent upon SNOBOL.)
The first input to SYNDIPAR is a set of syntax equations
for PL/I . These syntax equations are written in an extended
form of BNF which includes operators for repetition both with
and without delimiters between the items being repeated.
For instance, an argument list is specified as a sequence of
expressions delimined by, i.e., separated by, commas.
SYNDIPAR starts with the assumption that the given grammar
belongs to the class LL(1). The essential property of an LL(1)
grammar is that at any point in the parse where alternatives
exist, one can decide which way to go by looking only at the
next symbol in the input string. Unfortunately PL/I does not
have cin LL(1) grammar. So the parser also must call upon
"oracles" to look ahead in the input stream far enough to deter-
mine the correct decision in cases of doubt. We thus have a
form of programming by exception. Since the situations where
-14-
oracles are needed are very few, we gain the efficiency of
LL(1) parsing most of the time. The oracles themselves need
not carry out a full analysis, but need only make a binary
decision. They can do this by very simple means, such as
looking for adjacent symbols of certain types or counting
parentheses. An oracle need not concern itself with incorrect
input; if the input is incorrect the oracle simply makes the
most convenient decision and leaves it to other mechanisms to
detect, announce, and possibly correct the error.
The syntax equations provided to SYNDIPAR also contain calls
on semantic routines . The semantic routines are provided in the
form of blocks of PL/I code, and their task is to generate the
data structures that are the output of the parse. SYNDIPAR
generates the appropriate linkages so that these blocks are
called at the right time. The semantic routines are actually
interspersed with the syntax equations , since it is easier to
prepare the input in this form. The semantic routines are simply
copied into the parsing procedure produced by SYNDIPAR, with
some linkage material added front and back.
The first output of SYNDIPAR is the parsing procedure itself.
Since almost all of this procedure is just a transcription of
the input, there is not much to say about it. The second output
is a sequence of instructions for a "parsing machine". The code
for this machine is interpreted when an actual source program
is being parsed. The interpreter itself is included in the parser.
Each instruction for the parsing machine has an operator (there
are ten of them) and an optional operand. Denoting the integer
operand by x , we can sketch the operations :
1. Jump to the semantic routine at SLAB (x) , where the
label SLAB (x) has been attached by SYNDIPAR to the appropriate
block of code.
2. Jump to the syntax routine whose code starts with
instruction x, saving a return address on a pushdown stack.
3. Unstack a return address and jump to it.
4. Use X together with the current symbol or symbol type
to determine a branch address and then branch there. The
-15-
(x, symbol) pairs are used as input to a hashing algorithm.
If no entry exists for the (x, symbol) pair then execute the
next instruction.
5. Advance to the next token in the input stream.
6. Jump to the instruction at x.
7. Signal an error.
8. Plant a flag indicating that the currently governing
error control routine is the block of code at ERR(x) .
9. Not used.
10. There is an array of tenninal symbols called TERM.
If TERM(x) is equal to the next symbol, then skip one instruc-
tion; otherwise execute the next instruction.
11. If the symbol class (integer, string, punctuator, etc.)
of the next symbol has code x, skip one instruction; otherwise
execute the next instruction.
The use of oracles requires some special techniques in the
lexical scanner. An oracle works by looking ahead through the
input stream, searching for enough information to make its
decision. It sees the input stream, as a sequence of tokens
rather than as a sequence of characters. These tokens will
need to be scanned again during the subsequent syntactic
analysis; furthermore, the same token sequence may be examined
by several different oracles prior to analysis.
These requirements are met by providing two modes of lexical
analysis. In the lookahead mode, scanned tokens are saved on
a queue. Upon completion of lookahead, the current token is
taken to be the one that was current when the lookahead started.
In either lookahead or ordinary mode, tokens are taken from the
queue until it is exhausted; thereafter they are taken from the
input stream itself. In ordinary mode, scanned tokens are
discarded; in lookahead mode, they are saved.
What if the space on the queue is exhausted? This situation
will occur if an oracle is unable to make a decision after scan-
ning some .reasonable number of tokens. When the queue space
is exhausted, the oracle is notified, and it makes an arbitrary
-16-
decision. If this decision is wrong, a syntactic error will
be introduced artificially. Such errors are unavoidable if
we don't want to store two unbounded alternative parses, since
there are examples where the syntactic type of a construct
remains indeterminate for an unbounded number of symbols.
5.2 Data Structure Design
The design of the data structures for the compiler was the
most difficult and critical part of the overall design. Once
the data structures were selected, most of the programming
became straightforward. It is worthwhile examining some of
the ideas underlying the data organization, and some of the
tradeoffs that had to be considered.
5.2.1 Choice of Representation Method
The first decision concerning data structures was the
choice between a syllable-oriented and a pointer-oriented
representation. In a syllable-oriented representation, the
program is encoded as a linear sequence of syllables.
Syllables may either be type codes which determine the meaning
of the following information, or representatios of the informa-
tion itself (including pointers and indices) . The usual
reverse-Polish representation is syllable-oriented, as are
representations in terms of triples or quadruples. These
representations have the convenient property that they can be
written onto secondary storage, rewound, and reprocessed. Even
the rewind can be avoided through clever tricks, though for
disks this consideration is irrelevant.
In a pointer-oriented representation, sequencing is
accomplished by using chains of pointers rather than by placing
items one after the other. In a pointer-oriented representa-
tion the entire intermediate representation of a program must
be directly addressable. For entities such as expressions the
complexity of the two representations is about the same.
Pointers, however, have the advantage that information need
not be processed in the same order as it is generated.
-17-
It would seem that the case for a syllable-oriented repre-
sentation is very strong, and for most programming languages
that would be the natural choice. PL/I, though, poses some
special problems, most of which arise from the fact that
declarations may be scattered throughout a block rather than
all appearing at the head of a block. Thus the code associated
with declarations must somehow be moved to the right place, and
this kind of code motion over an arbitrary distance becomes
rather complex in a syllable-oriented scheme. For instance,
a block could conclude with a declaration of a character string
whose length is given by an expression. Somehow the length
expression must be moved to the start of the block, since it
must be evaluated before execution of the block itself can
begin. A similar problem occurs with implicitly declared
variables. Even if the programmer writes all his declarations
at the head of the block, there still is the problem of proce-
dures internal to the block, for which declarations must be
generated, as well as the problem of detecting and dimensioning
constant label arrays.
It is apparent that the syllable appraoch is not very helpful
for representing declarations. Declarations are by nature
interlinked in many ways, and simply do not have the sort of
sequential structure necessary for representation by syllables.
A further difficulty with the syllable-oriented approach
arises in the analysis of expressions. A simple bottom-up
analysis is not sufficient if certain kinds of local optimization
are to be attempted, since one wants to consider an operand in
conjunction with the operator being applied to it. In a
reverse-Polish representation it is awkward to locate such an
operator before both operands have been analyzed.
In the original bootstrap version of the compiler the
objective was to get it on the air as soon as possible.
I therefore chose to use the pointer-oriented approach
primarily because it was easier to implement and did not require
any transput facilities. In retrospect the decision is still not
-18-
clear-cut, though there are places where syllables could have
been used to good effect. The tradeoffs are influenced by the
availability of memory, either virtual or real. These considera-
tions are discussed in Section 6.3.
5.2.2 Examination of Tradeoffs for Specific Structures
As an example of tradeoffs, consider the representation of
identifiers within the data structures. An occurrence of an
identifier might be represented by the character string for
the identifier or by a pointer to some other data structure
that contains that character string. In examining the space
requirements we see that if most identifiers are short (as we
would usually expect them to be) , then it is more economical
to store the character strings directly. However, there is
a trap. Since identifiers vary in length, we must store not
only the character string but also its length. At this point
the economy of space begins to favor the indirect representa-
tion. The indirect representation has another cost: the
hashed table search necessary to locate the unique structure
associated with a given identifier. However, this cost must
be paid eventually anyway since it is necessary to hash
identifiers in order to resolve references. The compiler uses
the indirect approach. Every identifier has a unique structure
that represents it. The identifier is characterized by a
pointer to this structure. Associated with the structure is
a list of all declarations of that identifier. The list is
formed by pointing to the first such declaration and then
chaining the rest of them through a pointer component of
the declaration itself.
As another example, consider the structure that is used to
describe the declaration of a single variable. The actual PL/I
specification of that structure is given by:
-19-
DECLARE 1 DECL BASED (DPTR) ,
2 ATTRIBS BIT (60) , /* BINARY ATTRIBUTES */
2 COMPLEX_ATTRIBS POINTER, /* POINTER TO LIST OF COMPLEX
ATTRIBUTES */
2 ACCESS_RULE POINTER, /* POINTER TO STRUCTURE FOR ACCESSING
LOCATION OF THIS VARIABLE */
2 PARENT POINTER, /* CONTAINING STRUCTURE OR BLOCK */
2 SIBLING POINTER, /* NEXT VARIABLE ON SAME LEVEL */
2 LINK POINTER, /* CONTINUATION OF LIST OF DECLARATIONS
OF SAME IDENTIFIER */
2 IDENTIFIER POINTER, /* STRUCTURE CONTAINING NAME OF
IDENTIFIER BEING DECLARED */
2 STMTNO FIXED BIN, /* NUMBER OF STATEMENT DECLARING
THIS VARIABLE */
2 NEXTLEVEL FIXED BIN; /* LEVEL OF STATIC NESTING OF
THIS VARIABLE */
The name of the entire structure is DECL, and its components
are ATTRIBS, COMPLEX_ATTRIBS , etc. In designing this data
structure, space and ease of access must be traded off.
The problem is that while most declarations of variables are
quite simple, the most complicated ones are very complicated
indeed. To represent all possibilities directly would lead
to a very large structure for even the simplest sort of
declaration. The details of DECL suggest how the tradeoff
was handled.
Consider, first, the representation of attributes. The
declaration of a variable may attach to that variable a variety
of attributes. Some of them may be recorded by a single bit,
e.g., EXTERNAL. Others may require one or two integers, e.g.,
PRECISION. Others require an expression, e.g., CHARACTER,
though the obvious representation of an expression here is as
a pointer to something else. Yet others require a complex
ad hoc structure, e.g., ENTRY or INITIAL.
The DECL structure uses two different devices for represent-
ing attributes. First there is a bit string, ATTRIBS, in which
one bit is devoted to representing the presence or absence of
each attribute. Second, there is a pointer COMPLEX_ATTRIBS to
a linked list of complex attributes. Each of these includes
a flag to indicate its type, a pointer to the next complex
attribute, and further material determined by which attribute
is being described. There are, of course, further data
-20-
structures, not given here, for describing each attribute type.
It is convenient to include a bit in ATTRIBS for attributes
that also appear in the list of complex attributes.
For a declaration with no complex attributes, then, the
declaration contains only one extraneous piece of information:
a null pointer as the value of COMPLEX_ATTRIBS . For more
elaborate declarations we will have correspondingly elaborate
lists of complex attributes. Note that the complex attributes
are not necessarily mutually exclusive. For instance, a variable
could have the BASED, DIMENSION, INITIAL, and CHARACTER attri-
butes simultaneously.
A similar economy issue is raised by the fact that most
PL/I variables are unstructured, but yet we must be able to
represent structured variables. To represent an element of a
general (i.e., not necessarily binary) tree we need to know the
parent, first child, and next sibling of that element as well
as any information associated with the element itself. Structured
variables do indeed have the organization of a general tree.
Yet an unstructured variable needs none of this tree-oriented
information .
We note, however, that for every variable we must be able
to identify the block in which this variable is declared. We
achieve this by treating the parent of a level-one variable (i.e.,
one with no parent) as being the descriptor of the declaring
block. Thus we no longer have orphan variables; every variable
has a functional parent, and the PARENT component of DECL will
always be put to good use. We can, of course, find the block
associated with any variable not at level-one by tracing up the
chain of ancestors.
We also note that for every block we will want a list of
variables declared in that block. We can form that list by
stringing together all the level-one variables; any other
variable can be found by tracing downwards through the trees.
Since a level-one variable has no sibling, the SIBLING component
of DECL can be used for this purpose. Not all variables will
-21-
have siblings under this interpretation, but most of them will;
and that is sufficient to justify the use of space.
How about the first child? Unlike the two other tree links ,
the first child has no obvious alternate use for variables that
don't have a first child. And most variables do not in fact
have children. So we do not set aside a component of DECL for
the first child; rather, we store this information as a complex
attribute, and spend the space only when it is really necessary.
These examples may illuminate the nature of the decisions
that have to be made. We are not worrying about the detailed
layout of bits in storage, but rather about the more general
tradeoff of indirection versus space. If information is stored
explicitly wherever it is needed, then it will take more space.
If it is never stored explicitly when a path to it exists, then
processing it will take more time, and furthermore will lead to
more complex programming. Some implications of this kind of
decision for programming languages are discussed in Section 6.1.
5.3 Run-Time Storage Allocation
Run-time storage allocation is carried out through the use
of a generalized block allocator. The allocator has the ability
to obtain a block of storage of given size, or to . free a given
block. An allocated block is never moved. If a block is freed
and either of its neighbors is freed, then the adjacent free
blocks are combined to form a larger block.
Storage may be required either as a consequence of entering
a procedure, evaluating an expression, or executing an ALLOCATE
statement. The treatment of ALLOCATE statements is straight-
forward; the necessary amount of storage is determined and the
allocator is then asked for a block of that size. The block
is retained xintil a FREE statement for it is executed.
The remaining cases involve storage requests that are not
explicitly made by the programmer. We may classify storage
requests according to the time at which the size of the request
becomes known. This time may be at compile time, at block
entrance time, or an expression evaluation time. For the moment.
-22-
assume that we use a pushdown stack. Upon procedure entrance,
we set aside an amount of stack space obtained by adding the
amount of header information to the amount of storage required
for variables and temporaries whose size is known at compile
time (excluding STATIC variables, which are allocated separately).
We then calculate the amount of storage needed for variables and
temporaries whose size is known at block entrance time, and set
aside that further storage on the stack. When evaluating an
expression we may need temporary storage (though not storage
for variables) where the amount of storage needed cannot be
determined until expression evaluation. Such storage is
presently not allocated on the stack, but rather is obtained
directly from the block allocator.
Because of the requirements of multitasking and the ALLOCATE
statement, a straight pushdown stack storage allocation regime
is not feasible. A modified approach, which I like to call
"sticky allocation," has been used. Space for the pushdown stack
is obtained in large but noncontiguous chunks. These chunks are
linked together. Within a chunk, allocation proceeds in pushdown
fashion. When a chunk is exhausted, a new chunk is obtained from
the allocator. The size of the new chunk is determined by adding
a constant to the amount of pushdown space needed. When the stack
retracts, however, a vacant chunk is not freed immediately.
Instead, a buffer of one vacant chunk is retained. Thus even
if one encotinters a series of procedure calls at a point where
a chunk is just filled up, the overhead of obtaining a new
chunk from the allocator can be avoided.
The difficulty with storage allocation at present is that the
internal organization of the stack is unduly complicated. Further-
more, all temporary storage really should be placed on the stack.
The consequence of the present arrangements is that prologues and
epilogues, which must be performed upon procedure entrance and
exit, are both complicated and time-consuming.
-23-
6 . Conclusions
In this section I shall present some of the conclusions that
I drew from this work.
6.1 Sources of Program Complexity
The source language version of the compiler occupies about
6000 card images, and constitutes a large and complex program.
It is fruitful to inquire as to the causes of this complexity.
As a program evolves from an idea to a reality, conventions
and concepts are tentatively set forth and then modified. For
instance, data structures are designed with algorithms in mind;
specification of the algorithms then shows up flaws in the data
structures. Thus the program production process involves
continual change. Each decision is bound by decisions made
in the past, and influenced by the anticipation of decisions
yet to be made. An analogy can be drawn with filling in a
crossword puzzle. If you are filling in a line and are not sure
which of two words fits at that point, you will look at the
cross definition to make the more plausible choice. Even if
you do not know the cross words you will have some information
about them, e.g., that certain letter sequences are unlikely
to occur.
Present decisions can also influence past ones. When
conceptual flaws are discovered, past decisions may need to be
modified. These modifications, in turn, will propagate and so
their consequences must be accounted for. Thus the case for
breadth-first programming versus depth-first programming is not
clear cut. In either case one starts with an overall goal and
breaks it up into a series of subgoals G, ,G_,...,G . With
depth-first programming, one does all of G, before proceeding
to G- . In breadth-first programming one does G, through G to
another level of detail without fully expanding any of them.
In depth-first programming one is likely to program G, without
having designed many details of G_ . This practice has the dis-
advantage of not anticipating relevant characteristics of
G_,...,G . It has the advantage, though, that in creating
G_,...,G one knows all about G, .
2 n 1
-24-
One cause of complexity, then, is the propagation of
decisions. It follows that automation of this propagation
will lead to a reduction of complexity. I will give an example
of how automatic propagation of decisions has already been
used to reduce complexity. The data structures used by the
compiler are specified through PL/I structure declarations.
These declarations are kept as so-called "common decks" ,
recognized by the CDC UPDATE program. Only one copy is kept,
and each procedure references those declarations that it needs.
A change to this one copy will then propagate to all procedures
that use the declaration.
The most frequent situation is that we need to add a
component to a data structure. In this case the propagation
is quite automatic. In the case where the meaning of a
component is changed, no automatic propagation is possible,
though automatic warning through some cross-referencing scheme
might still be possible. This case is worth examining in
more detail.
Automatic propagation of changes of meaning is best served
by factoring out the definition of meaning from the use of that
meaning. I shall give some examples. Frequently a variable
can assume some set of meaningful values, e.g., "red", "blue",
"yellow". If we encode these three values as 1, 2, and 3, then
changes in the encoding will force changes in the programs that
use it. If the names of the colors themselves are used, then
the program becomes independent of changes of representation.
So what we want is a declarative facility to associate names
with colors. Such a facility is actually provided in COBOL
and in Algol 68, but not in PL/I. Another example has to do
with the association of a property with an object. Such an
association might be made either by storing the property as
part of the object or by having the property computed by some
function. It would be desirable to factor out this decision
by using a functional notation uniformly and then letting the
function definition reduce to either a subroutine call or an
appropriately indexed address reference.
-25-
There is a view currently popular among the advocates of
"structured programming" that all data structures should be
hidden by procedure calls. In this view, access to a data
structure should be obtained by calling an appropriate proce-
dure. This approach does have advantages vis-a-vis access
control, but it transfers rather than solves the problem of
changes of meaning. Data structures in an application such
as compilation are necessarily highly interlinked, and so the
value returned by a procedure will often have its meaning
determined by some underlying data structure. For instance,
a binary choice, represented by a predicate, may turn into a
three-way choice.
A releated problem stems naturally from the general graph
structure of typical programs, in which procedures are called
from many different places. It often turns out that the
different calling points each requires a slightly different
service or piece of information. This gives rise to a proce-
dure organization that resembles a "theme with variations".
In general, the necessary variations on the theme will not all
be apparent at the time when the procedure is designed. Often
the multiple-use aspects of such procedures are not at all
apparent at the time the procedures are written. Top-down
programming does not solve the problem because recursion may
give rise to cycles in the program graph.
An example of a theme with variations is the procedure LOCN ,
which is part of the code generator of the compiler. The task
of LOCN is to calculate the location associated with a reference
to a variable. The reference may or may not be subscripted and
may or may not be based on some pointer. Subscripts may or may
not be implicit. For instance, if A is an array the expres-
sion "ADDR(A)" should retrieve the location of the array
itself, but the statement "A=l;" should retrieve individual
elements of A chosen by implicit subscripts. Thus the use of
LOCN depends on the context. There are two different entry
points to cover the two cases, plus a third entry point to
cover the case where an offset from a base is desired rather
-26-
than an actual location. All three cases share most of the same
code; but there are various internal case-dependent choices that
give rise to the organization of the theme with variations.
6.2 Operating Systems Support
During the development of the compiler I used two different
operating systems: OS/360 on the IBM 360 and Scope 3.2 on the
CDC 6600. The support offered by these systems became a
critical issue during this development. Two quite different
kinds of support were involved: support of the programming
process itself, and run-time facilities available to the compiler.
The PL/I compiler exists in three different forms: as a
collection of symbolic source programs, as a collection of
compiled load modules , and as a collection of absolute over-
lays . All three forms require maintenance, but the symbolic
one is the most critical. File maintenance was a very time-
consuming operation, particularly under OS/360 where the
facilities are grossly inadequate. lEBUPDTE , the symbolic
updating program under OS/360, is difficult to use because of
unclear specifications, arbitrary restrictions, and an error-
prone input form. Under Scope 3.2, on the other hand, I was
able to carry out file maintenance expeditiously and with a
minimum of fuss. My success was due to the combined use of
the permanent file system for saving programs, the CDC
symbolic update program UPDATE, and the NYU interactive editor
designed by Richard Reich. Under both operating systems I found
that it took some time to develop adequate file maintenance
procedures, but that once these procedures were developed their
use saved a great deal of time.
Adequate editing facilities are essential in the reduction
of program complexity. There seems to be no substitute for
interactive editing in this matter. The evolutionary nature
of complex programs assures that reorganizations will be neces-
sary, and during these reorganizations one wants to be able to
move blocks of text from one place to another as well as to
-27-
carry out various global substitutions. The ability to move
large text blocks is a facility often lacking in editors.
Reformatting is also important, though this may not be an
operating system function because of its language dependence.
A good measure of the adequacy of programming support has
to do with the proportion of runs that are lost because of
errors outside the program being debugged. Such errors include
job control language , problems , canpiler errors, updating
problems, and machine troubles. As I kept no records, my
impressions must be subjective; but I would estimate that
overall I lost perhaps two-thirds of my runs on the 360 for
system- related reasons and perhaps one-tenth of my runs on
the 6600 for system-related reasons. The comparison is somewhat
unfair to the 360 because much of my work on the 6600 was done
interactively; nevertheless, "battling the system" was a major
part of my effort in getting the bootstrap going.
Neither the 360 nor the 6600 provided adequate debugging
facilities. On the 360 certain facilities were built into
the compiler itself, e.g., tlie CHECK condition, but their use
required modification of the source program; and lEBUPDTE pro-
vided no systematic way of removing these changes once made.
Debugging in machine language on the 360 was impossible since
I did not have the requisite knowledge of the IBM compiler or
of OS/360. On the 6600 I used a machine-language interactive
debugger developed by Peter Maclean [4] , which was the best
that could be hoped for under the circumstances. Some PL/I
source language debugging facilities that don't require
recompilation are presently being developed by Janet Fabri as
a master's thesis.
The compiler required run-time support for overlaying. The
linkage editor on the 360 provided better support for over-
laying than any corresponding facility under Scope 3.2 on the
6600. The compiler is now organized into overlays, but this
had to be done in a machine-dependent manner. The loader under
the Scope 3.4 operating system for the 6600 does have facilities
as good as those of the 360 linkage editor.
-28-
Another area where run-time support is needed is in error
recovery. The facilities on the 360 for error recovery are
superior to those on the 6600, though some improvement can
be expected on the 6600 when Scope 3.4 comes into operation.
On the 360, it is possible for the program to regain control
from machine interrupts such as overflow or reference out of
bounds. On the 6600 control is taken away from the program
under such circumstances. Neither machine provides program
recovery from time limit, though a restricted facility for
time limit recovery is provided by Scope 3.4 in the form of
"reprieves" for various errors. It would be very helpful if
the programmer could specify a local time limit, not to exceed
his global time limit, and be interrupted when this time limit
was reached. With such a facility the programmer could use
the time limit for resource allocation, and could perform as
much processing as he wished to following the exhaustion of a
local time limit. Since the global time limit would still be
inviolate there would not be any adverse impact on job
scheduling.
Run-time support is also needed for transput and for the
mathematical library. As these facilities in CIMS PL/I are
very limited, I did not have much experience with the adequacy
of Scope 3.2 in this regard. The issue did not arise in my
work under OS/360.
6 . 3 Virtual Memory
My experience suggests some compelling reasons why compilers
can function more effectively in a large address space such as
that usually provided by a virtual memory. Various decisions
about compiler design are dependent on the availability of address
space. The situation shows up in two areas: program structure
and data structure. A small address space dictates the use of
overlays in program organization, and can lead to compilers
with a great number of specialized phases, e.g., the IBM
F-compiler. The resulting program organization is quite
complicated and furthermore generates a great deal of transput
-29-
activity. With a large address space one can ignore the over-
laying problem. If only a small physical memory is available,
then the paging processes of the operating system will produce
the effects of overlaying without the corresponding complica-
tions in the program. Paging will still require transput
activity, but that activity would be necessary in any case.
If physical memory is large, then the transput activity will
be avoided since little paging will be necessary. Thus a
single program organization can function optimally over a wide
range of availability of physical memory. Good programming
practices should still be followed to preserve locality of
reference. Some experiences with the MULTICS PL/I compiler,
though, suggest that some phase-oriented organization may be
desirable even in a virtual memory environment.
The question of syllable-oriented intermediate representa-
tion versus pointer-oriented intermediate representation is
much less vexing in a large address space, since one can use
that representation which is the most natural at each point
(cf. Sec. 5.2.1). The entire intermediate representation
becomes addressable, and thus the problems of code motion
alluded to previously are no problem at all. The virtual
memory lends itself naturally to coping with a wide range of
source program sizes; the amount of paging activity becomes
proportional to the program size. For short programs the
intermediate representation often will not be paged out.
6.4 Reliability
In order to maintain the reliability of the compiler I
developed a series of acceptance tests. The most stringent
test is to have the compiler compile itself as described in
Section 3; success at this point almost certainly implies that
any later errors found in the compiler can be fixed. There is
a danger that must be avoided: if an error in the compiler makes
it impossible to recompile the code section containing the error,
then there is no way to escape except by returning to a previous
safe version. Since it is expensive to recompile the entire
-30-
compiler I have done it only about every two months. Because
of the range of language facilities used by the compiler itself,
the self-compilation test has turned out to be quite effective.
Whenever any change at all is made to the compiler or to its
run-time library, I check it by running a standard acceptance
test. The first version of the acceptance test was designed to
run through all code paths in the compiler except for those
associated with errors. Thereafter, whenever any error was
discovered either in the compiler or in its library, a further
test case was added to cover that error. Thus the acceptance
test provides warning if any previously found error recurs.
Not surprisingly, most of the errors that have shown up occurred
when the compiler aborted as a result of a programmer error.
In the design of an acceptance test, the output should be
as short as possible. Only incorrect results should produce
messages. Thus one need not scan a long output to see if it
is the same as the previous one, a process which is prone to
oversights. The present acceptance test produces only one
line of diagnostic output if the compiler is working properly,
though it does produce a listing of the testing program. Any
malfunction is evident at a glance.
6.5 Project Organization
Up to the completion of the first bootstrap, the compiler was
essentially a one-person project. The obvious disadvantage of
such an organization is the limit on how much code one person
can produce. A more subtle disadvantage is the lack of inter-
action among co-workers that often leads to improvements in
de s i gn .
On the other side, one-person programming seems to avoid most
of the interface difficulties that plague multi-programmer
projects. It eliminates the misunderstandings that often make
it hard for modules written by different people to communicate
with one another. The "chief programmer team" approach suggested
by Mills and described by Baker [2] seems to preserve the spirit
of one-person programming while eliminating many of its
-31-
disadvantages. My experience suggests that additional help
can be best utilized when two criteria are satisfied:
(1) the task to be done is unambiguously defined, and
(2) the task to be done is not on the critical path.
The best way to define a programming task unambiguously is
to pose it in the form of an addition to a program that is
already working. The existing program then defines the neces-
sary interfaces. Additional help is often useful, too, in
gathering information. For instance, operating systems
facilities are often poorly documented, and obtaining informa-
tion about them is a task that can usefully be farmed out.
6.6 PL/I as a Systems Programming Language
One of the original design goals of PL/I was to provide an
adequate language for systems programming. It has been used as
the systems programming language at MULTICS [3] . Experience with
the compiler suggests that while PL/I is probably better than
most other languages around for that purpose, it is far from ideal,
The strongest argument for PL/I is the wide range of
facilities that it provides. The description of CIMS PL/I
indicates the range of facilities that were actually used in
writing it. Some examples are: different forms of storage
allocation, pointer manipulation, bit strings and character
strings and the associated built-in functions, structured
variables (particularly valuable) , block structure and recur-
sion, and flexible forms of iteration. Another argiament for
PL/I is that usually one can accomplish much by writing down
little; the language has great expressive power.
There are, of course, failings of PL/I. The pointer structure
is such that it is impossible to write any reasonable form of
garbage collector; the list processing facilities require far
too much of the user. The iteration structure, while flexible,
still does not allow some more recently suggested concepts that
lead to cleaner programming, e.g., specifying a loop as a cycle
and placing an escape clause at an arbitrary place within it.
-32-
structuring should be associated with data types rather than
with variables; the approach of Algol 68 in this area seems far
more sensible. It would be desirable, in fact, to use a
functional representation for structures and then to be able
to define the function later as being either computed by some
procedure or found merely by address manipulation. The PL/I
on-conditions seem unnecessarily general, and often do not
provide simple but desirable effects. For instance, storage
allocation in PL/I can be carried out in specified areas.
When an area is exhausted, an on-condition is raised to signal
this fact. However, there is no way to determine which area
was the one that was exhausted, except through knowledge of the
point of execution of the program. The obvious solution, writing
an allocation function, does not work since although one can
pass allocated storage to functions , one cannot pass data
descriptors to them in any reasonable way.
Efficiency is certainly an issue, but the significance of
it is not yet known. It seems clear that PL/I compilers can be
far more optimized than they have been so far. My own view is
the inefficiency in generated code can be brought to a suffici-
ently low level so that it is not a major consideration.
I do not regard the size of PL/I as a disadvantage for
systems programming. A reasonable implementation should not
include run-time support for a given program beyond that which
it actually requires. Thus a program is not penalized for those
facilities that it does not use.
6.7 Design Flaws
Naturally, experience in coding and using the compiler has
shwon up a number of flaws in the original design. It was
possible to correct some of them without great difficulty, but
others will require extensive rewriting.
To start with the bright side, I will describe some of the
flaws that have been discovered and corrected. The inability
to detect end-of-file and to rewind files was a serious lack in
-33-
the original version. This lack was remedied by adding
various on-conditions , including ENDFILE, and also by adding
an environment option that allows files to be opened or closed
with or without rewind. The original version was unable to
compile more than one external procedure on a single call.
Correction of this flaw required the insertion of some control
mechanisms that could not be written in PL/I because of their
extralingual manipulation of the storage allocation state.
In the original version, error diagnostics were separated from
the procedures to which they referred, and this situation made
it difficult to correlate error complaints with statements.
Finally, a small change was made so that one can determine
the statement being executed at the time of an error. This
is done by setting machine register AO to the current statement
number at the beginning of execution of each statement. (The
user can turn off this mode of compilation if he wishes.) Since
the standard dump produced on an error gives the value of AO ,
it is not difficult to isolate the offending statement.
Serious flaws, of course, remain. The worst of them is that
the code generator directly produces register-dependent code.
Keeping track of registers has led to great complexity.
The register-dependence of the code also makes optimization
very difficult and inhibits adding a number of additional
language features. An unexpected problem was that the heavy
use of macros and micros caused the 6600 assembler to use an
excessive amount of time. It was surprising to discover that
a compilation actually takes less time than assembly of the
resulting code. The right approach seems to be to produce
some intermediate code form that is register-independent and
perhaps even machine-independent, and to add a further pass
to convert this code to register-dependent instructions and
macro instructions, with expansions kept as simple as possible.
Another problem is the constraints imposed upon the parsing
procedure by SYNDIPAR. Because the parse must be written as
one enormous procedure, it is not possible to use recursion
-34-
within the semantic routines of the parser. The result is
that the parser is unnecessarily complex, and also uses
various tables whose sizes cannot be intelligently determined
in advance. The difficulty is not fundamental, and could be
cured by changing the conventions for calling semantic routines,
There are also problems in the way that pushdown stack
storage is handled. These problems are discussed briefly
in Section 5.3.
-35-
References
1. Abrahams, P., A syntax-directed parser for recalcitrant
arammars. Intern. J. Computer Math., Sec. A, Vol. 3, 1972,
pp. 105-115.
2. Baker, F. T. , Chief programmer team management of
production programming. IBM Systems J., Vol. 11, No. 1,
1972, pp. 56-73.
3. Corbato, F. J., PL/I as a tool for systems programming.
Datamation, May 1969, pp. 68-76.
4. Maclean, Peter, An interactive debugging aid for the
CDC 6600, M.S. Thesis, Computer Sci. Dept., New York Univ.,
February 19 73.
5. PL/I BASIS/I-9. European Computer Manufacturers Association
and American National Standards Institute, December 1972.
6. Control Data 6400/6500/6600 Computer Systems - SCOPE
Reference Manual, Version 3.2. Publication No. 60189400,
Control Data Corp., Palo Alto, Cal .
7. Control Data Cyber 70 Computer Systems - SCOPE Reference
Manual, Version 3.4. Publication No. 60307200, Control Data
Corp., Palo Alto, Cal.
-36-
Appendix A
CIMS PL/I Users Manual
-37-
PL/ I
PL/1
PL/ 1
PL/ 1
PL/ I
PL/1
PL/I
PL/I
PL/1
PL/1
PL/1
PL/I
PL/ 1
PL/1
PL/1
PL/1
PL/1
PL/1
PL/I
PL/1
PL/I
PL/ I
PL/I
PL/ 1
PL/1
PL/I
PL/ 1
PL/1
PL/1
PL/1
PL/I
PL/I
PL/1
PL/1
PL/I
PL/1
PL/I
PL/ I
PL/1
PL/I
PL/I
PL/ I
PL/I
PL/ 1
PL/I
PL/ 1
PL/ 1
PL/I
PL/I
PL/I
PL/1
PL/i
PL/i
PL/1
PL/ I
PL/i
PL/I
PL/I
PL/ I
PL/i
PI /I PL /I PL /I PL /I PL /I PL /I PL /I PL/ I PL I PL/ I Pi / I PL/ 1 PL/ I PL/ I Pi. / I PL/ T PL/
P|/IPl/IPL/IPL/IPL/Ipl/IPL/IPL/IPLIPL/IPi/1PiVIPL/IPl/IPL/1PL/?PL/
xxxxxxxxxxx
X X
X
X
X
X
X
X
X
X X
xxxxxxxxxxx
xxxxxxxxxxxx
X X
X X
X X
X X
xxxxxxxxxxxx
X
X
X
X
X
VXXXXXXXX
X
X
X
X
X
X
X
X
y
XXXXXXXXX
Y
V
y
X
y
y
X
X
X
X
XXXXXXXXXXXX
X X
XX XX
XX XX
XX XX
X X y X
X XX X
XXX
X X
X X
X X
X X
xxxxxxxxxyy
X X
X
i
y
xxxxxxxxxxx
X
X
X
X X
xxxxxxxxxv
xxxxxxxxy
X
X
X
X
X
X
X
X
X
xxxxxxxxy
author: PAUL ARRAHAMS
PL/IPL/IPL/IPL/IPL/IPL/IPL/IPL/IPLIPL/IPI/IPL/IPL/IPL/IPI
PL/IPl/IPL/IPL/IPL/IpL/IPL/IPL/IPLIPL/IPI /IPI /IPL/IPL/IPI
-38-
/IPL/TPL/
/IPL/IPL/
IPL/IPL/I
IPL/IPL/I
PL/I
PL/I
PL/I
PL/I
PL/I
PL/I
PL/I
PL/I
PU/I
PL/I
PL/I
PL/I
PL/I
PL/I
PL/I
PL/I
PL/I
PL/I
PL/I
PL/I
PL/I
PL/I
PL/I
PL/I
PL/I
PL/I
PL/I
PL/I
PL/I
Pl/I
PL/I
PL/I
PL/I
PL/I
PL/I
PL/I
PL/I
PL/I
PL/I
PL/I
PL/I
PL/I
PL/I
PL/I
PL/I
PL/I
PL/I
PL/I
PL/I
PL/I
PL/I
PL/I
PL/I
PL/I
PL/I
PL/I
IPL/IPL/I
IPL/IPL/I
******************************
♦ «
♦ CHANGES AMD MEw FPATUr^ES *
♦ *♦♦***♦*••#♦*♦♦♦•♦♦*«♦♦**♦*•#♦♦
A, 4/2/7!^
1, THE INPUT-OUTPUT FACILITIES HAVE BEEN GPFATLY IMPROVPD. T^E
CMANGfcS INCLUDE THC ABILITY TO REWIND FILES, THE RFMHVAL OF THF
RtSTRICTlON THAT PERMANENT FILES CAKnQT t^E Rf^AD illRErTLY. anP
THE AVAILABILITY Or THF FNDFILE AND ilNDEF I NE 'IF I Lf^ CQ^iDITlONS.
StE SfcCTIONS 2,1.7. ?,3.5, 2.3,16, anD 2.8 FIR DRTAIlS.
2, THE ON-STATEMENT IS MQw AVAILABLE. SEE SEC. 2.3.14 FOR
ntTAlLS,
3, THE ALLOCATE STATEMENT NOW ALLOWS MULTIPLE ALLOC AT I O^iS .
B, 4/23/74
1, THE COMPILER NOW EVPFCTS ITS INPUT ON THE FILE -INPUT- AND PRODUCES
THE SOURCE LISTING ON THE FJLE -OUTPUT- (CF. SEC5. 3.2» 3,!').
2, THE FILE -ERRORS- TS NO LONGER USED.
3, FILE SUbSTITUTICN ^OW RECOGNIZES NULL FILES (CF, SEC. 3.4,1).
r, 6/15/7J
1. COMPILATIONS MAY Nnw BE PATCHED, SO THAT ANY NUM ^ER ^F EXTFRNAL PRO-
cfcCuRfcs CAN PE Compiled with a single Control ca^d. as thf compiled
COCE is now in a SiNGLf' RECORD, THE RECM^^G STEP IS ALSO ELIMINATED,
THESE CHANGES MAkE THE CONTROL CARDS MUCH SliPLEf!. FURTHERMORE, THE
FILE -TABLES- IS No LOMQFR NFEDED, AS IT HAS BEEN INCORPORATED INTO
THE COMPILER PRCpER. SEE SEC. 3,2 poR DETAIL^.
2. AUTOMATIC FIELD LEvGTH MANAGFMENT HAS BEEN ADDED Tn THE COnPILER. A
COMPILATION LISTING, NOW INDICATES THE FIFLD LENGTH RPQUIREr FOR
CUMPILATION.
3. A DEFAULT STaTEMFNT HAS REEN ADDED. SEE APPENDIX A FOR DETAIL^,
4. A CROSSpREFERENCF "aP CAM MOW BE OBTAINED. ^EE ^EC. 3.3 FTP DETAILS,
5. DUE TO A BUG THAT IS RATHER DIFFICULT TO FIX AT THE '^OMENT. ON-UNITS
CANNOT DECLARE ANY LOCAL AUTOMATIC VARIABLES. T'1 RY^ASS TUS
RbSTRICTION, WRITE THE ON-IJNiT AS A PROCEDURE AND CALL IT,
D. 6/27/73
1, A CE^UGmNG OPTION HAS BEEN ADDED TO ALLOW DETERii I NAT I ON OF THE
SUTEMENT BEING FXpcOTED AT THE TjmE OF A PR IGf^A'^l AB^RT, ^EE SecS,
3.3,2 AND 4 FOR DETAILS.
-39-
1, INTROnuCTION
2, LANGUAGE FEATURES
TA«L(^ OF CONTEf-iTS
3,
2.1 ATTRIBUTES
2.
1.1 SCALAR DATA TYPES
2,1,1.1 FIXED
2.1,1.2 FLOAT
2,1,1,3 chARAcTPR
i2,l,l,4 BIT
2,1,1,5 POINTER
2,1,1,6 AREA
2,1.1,7 LABEL
2,1.1.8 ENTRY
?.,
1,2 AGGREGATICN
?..
1,3 STORAGE CLASSE*^
?..
1,4 SCOPE ATTRIBUTES
2.
>1,5 INITIAL ATTRIBUTE
2.
1,6 ENTRY ATTRIBUTE
2.
>1,7 FILE ATTRIBUTES
2.
1.8 BUILTIN ATTRIBUTE
2,
.1.9 LIKE ATTRIBUTE
2,2 ATTRIBUTE DETERMINATION
2,3 STATEMENT TYPES
2,
,3,1 ASSIGNMENT STATEMENT
2.
,3,2 ALLOCATE STATEMENT
2.
,3.3 BEGIN STATEMENT
2,
.3,4 CALL STATEMENT
2
.3,5 CLOSE STaTfmpnT
2
.3,6 DECLARE STATEMENT
2
.3,7 DEFAULT STATEMENT
2
.3,0 DO STATEMENT
2
.3,9 END STATEMENT
2
.3,10 ENTRY STATE^EMT
2
.3,11 FREE STaTEMpnT
2
.3,12 GO TO STATE»^ENT
2
.3,13 |F STATEh-FNT
2
.3,14 NULL STATEMPNT
2
.3,15 ON STATEMENT
2
.3,16 OPEN STATEMPNT
2
.3,17 PROCEDURE STATEMENT
2
.3,16 READ STATEMENT
2
.3,19 RETURN STATEMENT
2
.3.20 STOP STATEMPNT
2
.3,21 WRITE STATE'^ENT
2.4
tXPRESSIONS
2.5
rVPE CONVERSION
2.6
KLNCTIUN CALLS ANn PARAMETER
2,7
bllLTlN FUNCTIONS
2.8
INPUT-OUTPUT
2,9
CHARACTER SET
2. in
KEY WORDS AND pHbaSES
OPfcWATlNG ENVIRONMENT
PASSAGE
3.1 INPUT FORMAT
3.2 vJCnTHql card SETUP
-40-
3.3 CCMPILER OPTIONS
3'3.1 XREF
3.3.2 STiNST
3.4 PROGRAM CALLING CarH
3.4.1 FILE NAMF «;UPST I TUT I ON
3.4.2 PASSING AN APQUMFNT TO A MAIN PROCEDURE
4, ERROR MESSAGES AND CERUnCING TECHNIQUES
5, AREAS OF I NCQI^PA T I B IL t T V
5.1 tXTENSIONS IN CiMS PL/I
5.2 INCOMPATIBILITIPS BETWEEN CTMS Pl/I AND I3M PL/f
6, FUTURE FLANS
7, ACKNOWLEDGMENTS
A. THE DEMULT STATEMENT
APPENniCES
-41-
1, INJTROnuCTION
AN EXFERIMbMAL PL/I rOwPlLFR K^OWN AS CIMS PL/I IS NOW AVAILABLE ON THE
66G0 AT THE AbC COMPUTlNn rE'vlTER. I WELCOME ANY QdEjTIO'iS 0° CQMMFNTS ON
ThF USE Ot- THt COMPILER, AvD PARTlCULARLV AKY REPO'^To OF BUGS OR SlGPESTIONS
FOR IMPROVEMENTS,
SINCfc CIMS PL/I IS STtlL UNHFR nEVELOPMENT, YOU S^O JLD 'iQT GIVE UP ON
USING IT IF YOUR INITIAL EXPERIENCE WITH IT TS LNFAVORABLEJ i^AlT A FEW WEEKS
OH hONTHS AND IT WILL BE 8FTTER. CHECK THE WRITEUP FREQiJENTLY FOR CHANGE
NOTICES.
THE SUBSET THAT I hiVc IMPLEMENTED IS CRiEt'TED TQWA'-JDS rOMRlLFR
WRITING! IN PARTICULAR, IT IS THAT SURSET THAT WaS NECESSARY IN ORfER TO
WRITE THE COMPILER IN PL/I. SOME ESTIMATE OF THE RELIABILITY OF ThE COMPILER
MAY BE GLtANEtJ FROM THE FACT fWAT IT IS CAPABLE OF COMPILING ITSELF; THE
COMPILER ►'f^ESfcNTLY REQUIRES ABOUT 6000 PL/I SOURcE STaTE^E^JTS, Thf LACK OF
FLOATING r-ClNT INPUT-OUTPUT IS PRORARLV THE MAJOR TIFFICJLTY IN USING THE
CURRENT VtRSION CF THE COMPILER FOR NUMERICAL PROBLEMS, THOUGH FLOATING POINT
IS AVAlLAdLE INTERNALLY FOR COMPUTATION.
THE LANGUAGE ACCEPTED BY CIMS PL/I IS KoT COMPLETELY CO'^PATIBLE WITH
THAT ACCEHTED BY ANY OF THc IBM COMPILERS. nlFFPRENCES ARE OF TWO KINDS!
CONSTRUCTS THAT ARE IMPL^MPNTED IN ONE BUT NOT TWE OTHER, ANO CONSTRUCTS THAT
flEHAVE DIFFERENTLY, OR REQUIRE DIFFERENT SYNTAX, IN THE VARIOUS COi'PlLPRS,
THIS LAST SET IS NOT LARGE. INCOMPATIBILITIES ARE DETAILED IN SEC. 5,
•42-
?. LANGUAGE FEATURES
IN This SECTION I SMALL DESCRIBF THAT PORTION
OF PL/I PHESENTLY IMPLEKFNTEH IN CIH*! PL/I. A KNOWLFDGF OF PL/I IS NFCFSSARY
IN ORDER fC UNDERSTAND TmI9 MATERIAL. YOU SHOULH ASSUME THAT ANYTUNG NOT
MfcNTIONEn IS NOT IMPLEmEmTFD.
2.1 ATTRIdLTES
2,1.1 SCALAR DATA TYPES
2,1.1.1 FIXED
BASF AND PRECISION ATTRIBUTES Ai'E ACCEPTED ^^UT IGiJORED,
ALL FIXED CATA riEHAVE LIkE FORTRAN INTEGERS.
2.1.1.2 FLCAT
THE PRECISION ATTRlnUTE IS IGNORED. NO I/O FACILITIES ^xiST FOP FLOATING
NUMBERS, NCH MAY FLOATING CONSTANTS APPEAR IN PROGRAIS. iMpiIClT TYPE
CONVERSIONS HETikEEn FiXEn AND FLOAT HO EXIST'. ANH ARE THP ONLY WAY OF OBTAINING
OK RETRIEVING Float data.
2.1.1-3 CHARACTER
CHARACTER UATA IS FULl.Y IMPlEmEnTEDi AND MAY 9E VAPYIMQ
STRING Coi\STAnTS ArE LIMT^D TO lOO CHArACtErS ImClU^ING T^E
VARIABLES ARE NUT LiMlTEn,
2.1.1.4 BIT
or not. character
apostrophFSi But
ONLY NQNVARYING BIT STRINGS ARb AVAILAElE. THE^E MiJST MAVE CTNSTANT
LENGTH (I.E,, NOT GIVEN BY AN FXPRESSION). THE LENGTH OF A «IT STRING MUST
ALWAYS BE SPECIFED BY A DECIMAL INTEGER CONSTANT NOT EXCEEnjMG 60,
2,1.1.5 POINTER
POINTERS MAY ONLY PO^'T TO WORDS, NOT TO BITS,
2.1.1.6 AREA
ALLOCATIONS MAY BE DOmE IN AREAS* BUT AREAS MAY NOT BE aSSIGNFD.
NOR CAN FREEING BE DONE IN AN AREA. AN AREA MAY BE TOTALLY FREED FY ASSIGNING
EMPTY TO IT,
2,1.1.7 LAEEL
THE ONLY PERMITTED FORM OF LABEL DATA IS THE COJSTA JT LABEL AFRAY, SUCH
AN ARRAY IS ALWAYS DECLAREH IMPLICITLY THROUGH THE APPEARANCF OF A SUBSCRIPTED
-43-
LABEL PREMXJ PHESENTLY, SUCW ARRAYS MUST HE ONE-D I MENS MNAL . I8K PL/I REQUIRE
SUCH AN AHRAY TO BE DECLARED JM A DECLARE SJATEMFNTj CIMS PL/I PROHBITS SUCH A
DECLaRaTIUN, constant LaBpL arrays aHE iJSED TO OBTAIN The EFFECT fp THE
FORTRAN CUfPUTEU GO TO ,
2.1.1.8 ENTRY
ONLY ENTRY CONSTANTS, EITHER INTERNAL
(CF. SEC. 2.1,6 AND 2,6)
2.1.2 AGG'^EGATION
CR EXTERNAL. ARF 4LL0WEr.
STRUCTURfcS AND ARRAYS APE IMPLEMENTED. AND CAN -^E USED !N ASSIGNMENTS.
THE REFER OPTION IS RECCGNIZFD. WITH THE SAME LIMITATIONS AS F-LEVEL PL/Ii
CROSS-SECnO'JS OF ARRAYS (U'lTH * SUBSTITiJTED FOR A DIMENSION) ARE NOT RECOG-
NIZED. THE ♦ NUTATION MAY BE USED FOR STRING LENGTHS AN^ ARwAY BOl NDS OF
pARAMETERb,
2.1.3 STORAGE CLASSES
THE ALTTMATIC, STATIC, AND BASEO STORAGE CLASSE3 ARE AVAlLARLF.
RASED VARIABLES NEED NOT BE DECLARED WITH a POINTER.
2.1.4 SCOH'E ATTRIBUTES
THE SCOPE ATTRIBUTES INTERNAL A^D EXTERNAL ARE 1EC01NIZED. HTWEVER, A
GIVEN EXTtRNAL VARIABLE TAVN^T BE DECLARED f'.ORE THAN ONCE IN A pROTEDURE) AN
ATTEMPT TO DO SO WILL CREATE AN ASSEMBLY ERROR. EXTERMAL VARIABLE NAMES AND
EXTERNAL tNTRY NAMES LONr,Eo THAN SEVPN CHARACTERS ARE TRUNCATED RY USINQ THE
FJRST FUIJH AND LAST THREF CHARACTERS. EXTERNAL VARIABLE" ARC IMPLEMENTED AS
LABELLED CCM'ION AREAS,
2.1.5 INIT lAL ATTRIBUTE
THE INITIAL ATTRIBUTE IS RECOGNIZED, BUT ITERaTIOI FACTORS Mu^T BE DECIMAL
INTEGER CONSTANTS. FIVE LEVELS OF NESTING ARE ALLOWED* ANn THE LENGTH OF A
LIST IS ESSENTIALLY UNLIMITED, ITERATED EXPRESSIONS MUST HE CONSTANTS. STRING
REPETITION FACTORS ARE NOT ALLOWED.
2.1.6 ENTrtY ATTRIBUTE
THE ENTRY ATTRIBUTE MUST RE DECLARED FOR EXTERNAL E JTRY POINTc, AND
MUST NOT mE DECLARED FOR INTERNAL ENTRY POINTS. THE ASTERISK' NUTATION IS
ALLOWED FOR STRING LENGTmS. ENTRY DESCRIPTORS t'UST IE OUTT^D FOR ANY
PARAMETERS THAT ARE AGQRfGaTPS. AND THE ARGUMENTS PARSED TO SUCH PARAMETERS
MUST AGREc EXACTLY WITH THF PARAMETER DECl AFv AT I OnJ . fOR PARAMETERS. STRING
LENGTHS AND ARRAY BOUNDS MUST f<E GIVEN EITHER HY DECIMAL CONSTANTS OR 8Y
ASTERISKS (SEE ALSO SEC, 2.6).
2.1.7 FILE ATTRIBUTES
-44-
THERh ARE FOUR FILE ATTRIRUTES AVAILABLE: FILE, INPJT. nUTPUT".
AND ENVIRUNMENT, ENV I RCN)MPfMT HAS THP FOLLOWING HPTION^I
REWIND - NOREWIND - U^'LOAn
BINAHV - CODED
THE ENVIRONMENT OPTION ^AY B^ ATTACHED TO A FILE DECLARAT I DN , AN OPEN
STATEMENT, QR A CLOSE STaTFM»^NT, AN EXAMPLE OF AN EWIRJNME'IT
OPTION is:
ENV(REWIND, CODED)
THE DEFAULT ENVIRONMENTS ARE NORFWInD A'-JD BINARY. IF A JY C?1NFLICTING
ENVIRONMENT OPTIONS ARE PRPStNT AT OPEN OR CLOSE TIM6, T-tE U'JDEFIKFD"
FILE CONDITION IS RAISED. «;EE SEC, 2.8 FOR RELATED DISCUSSIO^J.
2.1.8 BUILTPJ ATTRIBUTE
THE allLTlN ATTRIBLTE MAY BE DECLARED. BUIlTIN NAMES THAT ARF DECLARED
WITH OTHER ATTRIBUTES MAY OE USED WITHOUT CONFLICT, PRIVIDED THAT PUILTIN IS
DECLARED ^CH ALL BLOCKS COvfAlNlNG USES OF THE BiJlLTiN FJNCTTON IN ITS USUAL
SENSE. NOTE THAT NULL f'UST BE DECLARED BuiLTlN IF IT IS USEH FQR THF NULL
POINTER, AND SIMILARLY FOR ENTRY.
2.1.9 LIKt ATTRIBUTE
THE LIKE ATTRIBUTE IS AVAILABLE.
2.2 ATTRIBUTE DETERM I N AT T Qm
CONTfcXTUAL CECLARATIO^' IS AVAIL&BLE FOR FILES, F»0INTETS. AREAc, AND
BUILTIN FUNCTIONS THAT ExPPCT ArqumEnTS. UNDECLARED VARIA^L'^S NOT AFFECTED
BY CONTEXILAL DECLARATICV ALL BECOME FIXED. THIS BFHAVI)R IS DIFFFRFNT FROM
THAT OF ItiN F-LEVEL PL/I, w'hICH DEFAULTS VARIABLE TY^^ES ACCORDING TO THE FIRST
LETTER OF THE VARIABLE NAMF AS IN FQRTRA^J.
MISSING ATTRIBUTES ARF FILLED IN BY THE COMPILE^^ USING THE STANDARD PL/I
RULES.
2.3 STATEMENT TYPES
THE I-CLLGWING STATEMENT TYPES ARE AVAILABLE, WITH
THE RESTRICTIONS NOTED;
2.3.1 ASSIGNMENT STATEMENT
ASSIGNMENTS MAY BE MAnE TO AGGREGATES AS WELL Aj TO SCALARS,
MULTIPLE ASSIGNMENT IS PPRmiTTEQ. HOWEVER. ALL "iTEMl ON THE LEFT ciDE OF AN
ASSIGNMENT STATEMENT MUST "AVE IDENTICAL AGGREGATE TYPES, OR AN UNTIAGNOSED
ERROR WILL OCCUR, THE Ey vAME OPTION IS ACCEPTED BUT IG-IORE'^, IMPLICIT TYPE
CONVERSION IS PERFORMED ON ASSIGNMENT WHEN NECESSARY,'
2.3.2 ALLOCATE STATEMENT
THE ALLOCATE STATEMENT WITH OPTIONAL IN AND SET CLAUSES IS RECOGNIZED.
-45-
2.3.3 i^bGlN STATEMENT
THE dEGIiN STATEMENT I <? ALLOWED.
2.3.4 CALL STATEMENT
THE CALL STATEMENT IS ALLOWED. THOuOH THE TASK, EVEMT A^JD PRlTRlTV OPTIONS
ARE NOT RfcCOGNlZED.
2.3.5 CLObE STATEMENT
THE CLOSE STATEMENT I*; ALLOWED. LISTS nF FILES ARE ALL'^WED. «S IS
THE ENVIRONMENT OPTION. AT CLHSF TIMEi ANY DECLARED ENVIRONMENT
OPTIONS AHE COMblNFD WITH THE ENVIRONMENT OPTIONS ATTACHED T"i THE
CLOSE STATEMENT,
2.3.6 DECLARE STATEMENT
THE UECLARE STATEMENT IS ALLOWED, FACTORING OF ATT'M9UTES, INCLUDING
LEVEL NUMdERS AND DIMENSIOmSi IS ALLOWED,
2.3.7 DEFALlT STATEMENT
THE UEFAULT STATEMENT IS AVAILARLE, THIS STATE (ENT IS ^EFlNEr BY
ANSI PL/I. dUT IS NOT IMPLFmFNTED IN' ANY IHM COM^^ILE^'^S I'J THIS FORy . THE
DEFAULT SiATEmEnT AllOwS ThE USER TO SPECIFY ATTiJlBUTES TO BF ASSIPNED
TO A VARIABLE, DFPENDInG 0"' THE ATTRIBUTES THAT IT ALREADY HAS AS uELL AS ON
THE LEADI.nG CHARACTERS CF THE NAME QF THE VARIABLE. THE DEFAULT
STATEMENT IS DESCRIBED IM nETAlL IN APPENDIX A.
2.3.8 DO STATEMENT
ALL OF THE FORMS OF T^E DO STATEMENT ARE ALLOMEH, BJT T^'E PARAMETERS OF
THE DO ARE ALWAYS CONVERTEn TO FIXED VALUES. EXPRESSIONS MAV RE USEU AS DO
PARAMETERS,
2.3.9 END STATEMENT
THE END STATEMENT IS ALLOWED, AND MULTIPLE END CLOS^JRE IS RECCGNIZED,
2.3.10 tNIRY STATEMENT
THE bNTRY STATEMENT IS ALLOWED, MULTIPLE ENTRY POI JTS ARE PERMITTED, THE
RETURNS Ot^TlON IS RECOGMZFD.
2.3.11 FRcE STATEMENT
THE ^REE STATEMENT IS ALLOwFD. RUT VARIABLES ALLOCATED IN AREAS CANNOT BE
FRFED.
2,3.12 GO TO STATEMENT
THE GC TO STATEMENT I «: ALLOWED. IT IS PERMITTED TO GO tq a LABEL IN A
RLOCK CONTATJING THE CURREvT ONE. IT IS PERMITTED T) GO TO AN -ELEMENT OF A
-46-
CONSTANT LABEL ARRAY, USlNr? A SUHSCRlPTEH LAREL REFEf^E'JCR,
2.3.13 IF STATEMENT
THE IF STATEMENT IS AlLOWFD. HOWEVER. aSSIGNMEJTS TO THE IDENTIFIER IF
ARE NOT PHCPERLY RECOGNlZEn. IF STAtEME'JTS mAY HAVE ASSOCIATED El^E CLAUSES.
2.3.14 NULL STATEMENT
THE IMILL STATEMENT IS ALLOWED.
2.3.15 ON STATEMENT
THE UN STATEMENT IS ALLOWED. CONDITIONS PRESENTLY fJECOHNIZED ARE»
AREA
ENDFILE
UNDEI- InEDFILE
ERROR
F INISh
STORAGE
MULTIPLE CCNDITIONS MAY RE SPECIFIED IN A SINGLE ON- STATEMENT, E.G.:
ON ENCF ILE(OUT), UN OEr j NEHF ILE ( OUT ) GO TO LaB J
'ME BEHAVICR OF ENDFILE WITH RESPECT TO END-OF-RECORT IS DISrUSSEC IN SEC, 2.8.
THE STORAGE CONDITION IS RAISED WHEN THE MAIN STORAGE POOL HAS BEEN EXHAUSTED.
NORMALLY fHERE IS NO WAY To »ECOVER FROM THIS CONDITION. WQ-iEVER. UNDER
SOME CIRCUMSTANCES RECOVERY mAy HE POSSIBLE EITHER BY FRFEIN^' LARGF RlOCKS
OF ALLOCAIEO STORAGE OR RY EDITING FROM A NEST OF PRICED IRES USING AUTOMATIC
STORAGE.
BECAUSE OF A BUG ThAT IS RATHER DIFFlcUlT TO FIX PRESENTLY, ON-UNITS
THAT ARE BEGIN BLOCKS CANNOT DECLARE ANY LOCAL AUTOMATIC VARTARLES. TO BYPASS
THIS RESTRICTION, TURN THE BEGIN BLOCK I'-lTO A PA-^AMETERLESS PROCEDl RE AND CALL
IT.
2.3.16 OPtN STATEMENT
THE OPEN STATEMENT IS ALLOWED. WiTH OPTIONS TITLE. OUTPUT, INPUT,
AND ENVIRONMENT (CF. SECs. 2.1.7 AND 2,8). AT OPEN TIME. DECLARED
OPTIONS ARE COMBINED WiTh ThOSE SPECIFIED FOR THE OPEN STaTe'^ENT.
LISTS OF I- ILES ARE ALLOWED.
2.3.17 PROCEDURE STATEMENT
THE PKOCfcDURE STATEMENT IS ALLOWED, WITH OPT I aNS( MA IN ) USED Tf INDICATE
A MAIN PROCEDURE. PROCEOUPES NEED NOT HAVE LABELS IF THPY CONTAIN ENTRY
STATEMENTS, THfc RECURSIVE AND RETURNS OPTIONS ARE REC0GJI7En,
2,3.18 READ STATEMENT
-47-
ONLY CnE rORM or THP sEAD statement is RECOGNIZBDJ RFa^ riLE(*FILE*) INTO
(*VARIABLfc*) . A READ STATEMENT ON AN UNOPENED FILE CA JSES
THAT riLE TO BE OPENED AS AN IN'PUT FILE,
P.3.19 RETLHN STATEMENT
THE HETURN STATEMENT IS RECOGNlzED; IT HAS AN Qf^TlOJAL ARGUMENT. THIS
ARGUMENT MLST BE GIVEN IF THE PROCEDURE WAS FNTEREH AT Ai ENTRY POINT THAT
RETURNS A VALUEi AND MUST RE Of^ITTFD OTHERWISE,
?.,Z.20 STUF STATEMENT
THE ijTOP STATEMENT IS RFCnGNl7En,
2.3.21 WRITE STATEMENT
ONLY CNE FORM OF THE '-JRITE STATEMENT IS RECOGNIZEDl
WRITE FlLfc<*FlLt*) FROM (*v/ARIABlE*) , A WRITE STATEMENT ON AN UNrpENED
FILE CAUStS THAT FILE TC BE npFNFD As An OUTPUT FILE,
2.4 EXPREtJSlONS
EXPRbSSIONS ARE IMPLEMENTED. INCLUDING IMPLICIT CONVERSIONS RESULTING
FROM THE COMBINATION OF niFFFRENT DATA TYPES THROUGH OPE'^ATO'^S. VARIABLE
REFERENCES MAY BE INCOMPLETELY QUALIFIED IF NO A'^BIGJITY RFSi'LTS. AND
SUasCRIPTb IN SUBSCRlPTEn OUALIFIED NAMES MAY RE INTERLEAVED IN ANv POSITION
PROVIDED (FaT THEY ARE GIVFN IN THE RIGHT SEQUENCE. NOTE THAT CRO<:S-SECT I ONS
OF ARRAYS AHE NOT ALLOWEH,
ANY HCI'JTER'VALUED FXPRFSSION HaY BE USFD AS A POINTER TUALIFIER; THUS
POINTER QUALIFIERS MAY BE VESTED TO ANY HEPTH, THIS GENERAL I Z AT I OK IS NOT
ALLOWED BY F-LEVEL PL/1,
ALL OF THE PL/I OPERATORS ARE ALLOWED. IN PERFOR'-IDG COMPARISON
OPFRATlONb OiJ CHARACTER STRIMGS, THE BLANK CHARACTER IS TREATED AS THOUGH IT
HAD DISPLAY CODE 0 (I.E., IT IS THE LOWEST CHARACTER IN THE COLLATING
SEQUENCE)- OTHERWISE, TmE COC COLLaTINq SEQUEf^CF iS USe'I,
THE ALLOWABLE CONSTANT TYPES ARE SIGNED DECIMAL I'JTFGFRS, BIT STRINGS, AND
CHARACTER STRINGS. STRInG REpFjITlOM ^aCtoRS ArF mQT RECOGNIZED.
2.5 TYPE CCNVERSION
type conversion occurs implicitly i j three contests!
1. arplication of cperators
2, assignment
i, MATCHING OF ARGLmEmTS TO ENTRY PARANETERS
-48-
THE UMV FACILITY PrE-^E^TLY AVAILABLE FOR EXPLICIT TYPE CONVERSION IS THE
CHAR BUILIIN FUNCTION, ViHlTH MAY BE CALLED EITHER WITH 0'} WITHOUT 4 SECOND
ARGUMfcNT,
THE I-CLLOWIN'G TYPE COmvFRSIONS ARE AVAlLARLP:
1. FIXED TO FLOAT.
2. FIXED TO CHARACTER. THE RESULT IS AN E I rJHT-CH AR ACTE" STRIKG
RIGHT-JUSTIFIED WiTH BLANK FILL,
3. FIXED TO BIT. THE RESULT IS A17-BIT STRING.
4. FlCaT TO FIXED.
5. CHARACTER TO FIXED. THE iNPUT CHARACTER STRING UY '-'AVE BLANKS ON
EITHER SIDE OF ThE NUHRER TO BE CONVERTED. ALL nLAN'<S CONVERTS TO
ZtRO,
6. CHARACTER TO BIT. THE INPUT CHARACTER STRING ^UST B^^ OF CTNSTANT
LtNGTHi AND THAT LENGTH MUST NOT EXCEED (SO.
7. RIT TO CHARACTER.
8. RIT TO FIXED.
2,6 FUNCTION CALLS AND PARAMETER PASSAGE
G
FUNCTl
DEFAUL
CHARAC
PASSED
RECEIV
PARAMt
THEN I
IS DON
PARAME
DECLAR
IMPLIC
ENEHA
ON Cli
T TO
TER S
AS A
ING F
TER C
T lb
E, 1
TER T
ATIUN
ITLY,
LLY
LLS.
FIXE
TRIN
RGUM
ARAM
ESCR
ASSU
F A
YPE
WIT
SPEAKIN
HOWEV
D, A F
GS MUST
bN'TS. B
ETER.
IPTORS,
MED THA
paramET
IF NECE
H A FUL
G, The USUAL
ER, FNTRY NA
UNCTION CAN
BE HECLaRED
UT They must
NC PROVlSinN
IF A PARAM
t thp argume
er i«; declar
SSARY. NOTE
L SET OF PAR
PL/1 RULES APPLY TO PARAMFTER PASSAGE AND
MF-S DECLARED WITH JQ RETUR^^S ATTRIBUTE ALWAYS
nNl Y RETURN A SCALAR AG A VALUEj RETURNED
WITH CONSTANT LENGTH, AGGREGATRS MAY BE
AGREF- EXACTLY WIT-I THE ATTRIBUTFS OF THE
IS MADE FOR THE DC-CLARATinN OF AGGREGATE
ETFR IS NCf DECLARED IN AN ENTRY DECLARATION.
NT MATCHES THE PARAMETER, AND NO CONVERSION
ED THE'J ThE ARGUMEJT IS CO'iVERTEr TO THE
THAT FOR INTERNAL PROCEnU"ES AN ENTRY
AMETER DESCRIPTORS IS ALWAYS CREATED
MULTIPLE ENTRY POINTS ARE ALLOWED. AND THEY NEET NOT AG»EE IN THE TYPE OF
VALUE RETURNED, THE NECfSSA»Y CONVEpSIO'IS WILL ALWAYS BE PE^FORMEr. MOTE THAT
A PROCEDUHE INVQKED BY A FUNCTION REFERENCE MUST RETURN A VALUE, WHLE A
PROCEDURE INVOKED BY A CALL STATEMENT MUST NnT RETURN A VAlUF. VITLATION OF
THESE HULtS CAN LEAD TO MYSTERIOUS RUN-TlME ERRC'^S.
FUNCnON CALLS ARE PERFORMEn USING THE FTN CONVENTIONS, IN PARTICULAR,
Al CONTAINS A POINTER TC THE ARGUMENT LIST AND THE RESULT IS RETURNED IN X6,
THUS PL/I PROCEDURES CAN CALL FTN ROUTINES. HOWEVER, PL/I U>ES DORE VECTORS TO
DESCRIBE CHARACTER STRINGS AND AGGREGATES, SO THESE CAN JEIT^'iER BE PASSED
NOR RETURNED TO FTN PRoCRDuRES WITHOUT A MORE DETAILED KJOWLFDQE OF THE PL/I
-49-
RUN-T[Mt STRUCTURE.
?,7 8U1LT1N FUNCTIONS
THE I-CLLOWING BUlLTiN FUNCTIONS ARE AVAILABLE!
ARS
CHAR
EMPTY
HROIJNC
INDEX
LROlJi^C
LENfif h
MAX
MIN
MOD
NULL
SIGN
SUh«?TR
unspfc
VEKIFY
THE HCLLOWING RESTRICTIONS SHOULD BE NOTED:
1, HBCUND AND LBOUND TAN ONLY HaVE A CCNSTA'JT A3 TH"IR <;ECOND ARGUMENT,
2, EMPTY CAN ONLY Bp USED ON THE Rl^HT SIDE OF AN ASSIG^'MENT TO AN AREA.
II- USED, IT MUST BP EXPLICITLY DECLARED TO BE flUILTlM,
3, MOC ALWAYS CONVERTS ITS aRRUmFNTS TC FIXED.
A, SuESTH AND UNSPEC CANNOT BE USED ON THE LEFT SIDE OF AN ASSIGNMENT
SI ATEMEivjT.
5, II- SUdSTR IS APPLIED TO A RIT STRING'. THE THIRD ARGU^'ENT MLST EITHER BE
cunstanT OR MissiNjr,, IF IT is missing, ^he lenqTh OF the ^esulTing
RIT string is ALwAvs TAKEN A!^ 60,
6, iJi^SPEC CANNOT PE APPLIED TO VARYING STRINGS, SINCE T^E RESiLT WOULD
PHCDUCE A VARYING «IT STRING. A DATA TYPE THAT I^ NOT YET AVAILABLE,
UimSPEC of a FiXEn HATUM YlELHS 1^ BITS) UNSPEC OF A POINTER YIELDS
Id BITS) UNSPEC OF A FLOaT DaTUM YiElDS ^0 BITS.
7, THE SECOND ARGUMENT OF VERIFY MUST EE A SUGLE C^^ARAC;TER. AND CANNOT
Rt ANY OF THE CHARACTERS)
< > - I
?,8 INPUT-CUTPUT
THERb ARE FOUR RELEVA^'T STATEMENTS FOR T NPUT-OUTPUT I OPPN, CLfSE,
READ» AND URITE, STREAM iNpUT-OUTPUT IS NOT YET AVAILABLE. FILES CAN
RE OPENED IMPLICITLY BY READ QR wRlTE STATEMENTS. RUT DECLARPD FlL^
ATTRIBUTES ARE NOT APPLIED WHEN IMPLICIT OPENING OCCJRS. IMPLICIT
FILE CLOSING TA^ES PLACE Qv PROGRAM TF.RM I NAT I ON , EIT'<E'< 'JO^^MAL OR
ABNORMAL. HOWEVER, IF THE P«OfiRAM TERMINATES ON A MAC^I'lE E^JROR Si CH
AS AD[)RESS OUT OE RANGE, FilES WILL NOT ^E ClOSEO AN"), C INSEOUENTL v ,
MUST OUTPUT WILL BE LOST. NOTE THAJ PL/I I NPUT --lUTP ]T I :^ 'iOT CQNSIS'
TENT WITH FORTRAN INPUT-flUTPUT BFCAU<?E THE FETS ARE ACCESSED DIF-
FERENTLY, THUS, ALTHOUGH pl/I PROGRAMS CAN CALL FORTRAN PROGRAMS,
THE FORTRAN PROGRAMS CANNOT DO ANY I nPUT-OUTPUT .
DECLARED ENVIRONMENT OPTIONS ARE_APPLIED ON B0T'< OPENIN^. AND
CUnSING. THUS, IF A FILF TS DECLAREH WITH THE ATTi^lUTE ENV ( RFW I \T ) .
IT WILL Rt REWOUND ON BCTH OPENING AMD CLOSING. NOTE THAT IF NEITl-ER
THF FILE UECLARATION NOR TmE OPEN STATEMENT FOR A FILE SPECIFY INPlT OR
OUTPUT, THEN INPUT WILL RE ASSUMED. THUS WHFN AN OUTPUT FILP IS
OPENED EXPLICITLY, THE CuTouT ATTRIBUTE ;1UST BE ATTACHED TO FITHER
THE OPEN STATEMENT OR THF FILE DECLARATION,
FILES CAN BE REWOUNn WITHIN A PROGRAM BY CLnSINT THEM A'JD THEN
REOPENING Them with the FNVlRONMENT(REWrJD) nPTION O'J THP OPPN STATE-
MENT,
SPECIAL CONVENTIONS EXIST,
ENDS-OF-RtCORU IN INPUT FllE'^,
AS IN FORTRAN, CONCE'^NINi] FMnEDDED
IF THE ACTUAL INPUT FILE IS "AMED
jilNPUTj!, THEN SUCH END-CF-RECQRD MARKS ARE TREATED LIKE FNO-iF-FILF
MARKS. n>\ ANY UTHER FiLF THFY AmE IGNORED. THE DECISIOI AynUT THP
FILE NAME ARE BASED ON THE NAME THAT EXISTS AFTER MAKING ALL SUBST r TUT I ONS
DUE EITHER TO THE TITLE OPTION OR TO THE CALl CARD SJB5TITUTI0N FATILITY.
THE REAH AND WRITE STaTFMFNTS MUST SPECIFY F I XEl-LE )GTH CHARATTFR STRINGS
AS THEIR VARIABLES. THESE STRINGS ARE THEN WRITTEN TR RFAD AS THCl GH THEY WERE
IN FORTRAN A-FORMAT, I.E., TRAILING BLANKS ARE ELlMlJATEil, VARYINT CHARACTER
STRINGS CANNOT tiE USED IM THIS CONTEXT,
AL
FACILIT
INPUT-O
BUILT-I
CONCATfe
CUNVERS
' FIXED
CHARACT
COULD B
MATiONJ
THOUGH INPUT-OUTPUT FACILITIES ARE RATHER PRIilTlVE. CERTAIN OTHER
lEb OF THE LANGUAGE CAN 8E USED TO GREAT ADVA ITAGF^ IN PERFORMING
UTHLT, INPUT DATA TAN PE BROKEN TOWN USING T <E S'IBST" AND INDEX
N t-LNCTIONS, WHILE OUTPUT DATA CA:-! HE BUIlT U^ USING 'STRING
NAIIO'-J, SINCE IMPLICIT NUMBER-TO-TO-CH AR ACTE '^ AN') CH AR ACTEr -TO-NUMBER
IO'\S ARE AVAILABLF, IT IS NOT DIFFICULT TO OBTAIN NUMFRICAL OUTPUT,
DATUM, WHEN CONVFRTED TQ A CHARACTER STRJNG, PROlUCE'? AN EIGHT'-
ER REPRESENTATION WITH LEADING BlAnkS. A-? AN EXAlpLE, THE FOLLOWING
E OSED TU WRITE CUT THE VALUE OE K TOGETHER WITH IDEnTIFYINT InFoR-
DCL C CHAR(80 ) ;
C s X K' JfttK;
WRITt FILE (OUT)
FROM (C)j
MORE GENERALLY, ONE CAN DEFINE A PROCEDURE PRINT;
PRINI : PROC(C) ;
DCL C CHAR(*) I
WRITE FILE(CUT)
END PRINT ;
FROM (C) )
IF PRINT IS PROVIDED AS AN EXTERNAL PROCEDURE, IT MU=?T BE HECLARED AS:
DCL PRINT ENTRv(CHAR(*)) j
PRINT CAN EE CALLED WITH A^'Y ARGUMENT THAT EITHER IS A
CHARACTER-STRINii VARIABLF OR CAN BE CONVERTED TO A C URACTFR-STR I NT VARIABLE*
AND THAT ARGUMENT WILL BF PRINTED. NOTICE THAT JF PRINT JS PROVIDRD WITH AN
ARGUMENT THAT IS NOT A FIXED-LENGTH CHARACTER STRING, THE ARGUMENT WILL
IMPLICITLY 8E CONVERTED TO A FlXED-LENGTH CHARACTER STRI.JG.
2,9 CHARACTER SET
-51-
SINCt THb CDC DISPLAY CHARACTER SET DIFFERS FRO'I THF STANDARD PL/I
CHARACTER SET, A MAPPING I «: nECESSARV. THOSE CHARACTERS COM'^ON TO THE STANDARD
PL/I SET **NU THE DISPLAY S^T ARE UNCHANGED, THREE A'^EAS OF 'MfTFERFNCE EXIST:
1. THE PL/I APOSTROPHE IS REPRESENTED BY * .
2, THE NON^ALPHANUMFRIC CHARACTERS ALLOWED IN nENTlFIE^'S ARE:
<? - =
6, THE OPERATORS THaT DIFFER FROM STANDARD PL/I AREJ
t OR
tT CONCATENATE
A AND
?,10 KEY wCRDS AND PHRaSFS
THE t-CLLOWJNG KEYWORDS AND KEY PHRASES ARE RECQlN I ZED, JITH T»-E INDICATED
ABPrEVIATICNSI
ALLOCATE
AUTOMATIC
AREA
BASEU
BEGIN
BIT
BINARY
BUILMN
BY NA^'E
BY
CALL
CHARACTER
CLOSt
DECIMAL
DECLARE
UbFAULT
UO
ELSE
END
ENDFILE
y iNisp
ENTRY
fcNVIKCNMENT
ERROR
EXTERNAL
f ILE
FIXED
FLOAT
(•WEE
F WOM
GO TO
IF
IMTl At,
IN
INTERNAL
INTO
ON
AUTO
BIN
CHAR
DEC
DCL
DFT
ENV
EXT
GOTO
I NIT
INT
-52-
OPEN
OPTl
PRor;
POIN
READ
RECU
RbTlJ
SET
STAT
STOP
STOP
TO
UNDE
VARY
WHIL
WRIT
UNS
bCuRfc
(EW
HSIVE
HNS
IC
AGE
K INEDFILE
ING
b
b
PROC
PTR
UNDFF
VAR
-53-
3. OPERATING ENVIRONMENT
COMPIIATJON or A PL/I PROGRAM INVOLVES TWO STEPS: THE cnMPILATION PROPER
AND THE AbSEMBLV OF THE RE<?ULTING COMPASS CCnE, COMPILATION AND EvECUTION
TOGETHFR HEQUIRfc THREE SEPARATE nATASETSI THE COMPILER ITSEL^, THE SYSTEM
TEXT FOR rhfc ASSEMBLY, AnD T^E RUN-TIME LIBRARY. TWT OF T^E^E DATASEtS ARE
AVAILABLE CN SYbLiB:
PLl
THE COMPILER ITSELF
PLlTfcXT THE SYSTEM TEXT REQUIRED FOR ASSEMBLY
THE THIRD CATASfcT IS A PERMANENT FlLEt
PLIRLIB THE RUN^TIMP LIRRaRY
THE COMPILER PRESENTLY RPQllIRES AN ABSOLUTE MINIMUM 'IF lOlK TO LOAT AND EXE-
CUTE, A i^IZE OF llOK IS RpcOMMgNDED. IF STORAGE iS TUteOUaTE, ThE MESSAGE
^MEM REOUbST EXCEEDS JOBCAPD FL^^ WILL APPEAR. TmE LARGER THP PROCFDURE TO BE
COMPILED. THE MORE SPACE IT WILL REQUIRE, COMPlLATITN SPEED CAN BE ESTIMATED
AT APPROXlf'ATfcLY 15 ST A TEMENTS/SECOND, NOT INCLUDING ASSEMBLY TIME. ASSEMBLY
TIME IS APPROXIMATELY 30 PERCENT L^NoER ThaN COMPILATION TiMP, SO THE COMBINED
RATE IS A^'FROXIMATELY 6 ST ATEMENTS/SFCOND .
THE SOURCE LISTING INCLUDES A MESSAGE GIVING THE AM:)UNT OF FIFLD LENGTH
REQUIRED »-CR COMPILATION, TO DETERHINE THE CORRECT FIELT LE''GTH' ISE A
GENEROUS INITIAL ESTIMATE ANH THEN CUT DOWN ACCORDINGLY, THF COMPILER WILL
NOT ATTEMPT TO INCREASE iTS FIELD LENGTH BEYOND THAT EXITING AT Tl-E TIME
IT WAS FIHST CALLED. hCwEwER# EACH EXTERNAL PROCEDU'TE IS COMPILED IN THE
MINIMUM NtCESSARY FIELD lEnGTH,
THE CCMPILATION PROCESS USES FIVE LOCAL FILESl
INPUT
SOURCE PROGRAM INPUT
OUTPUT SOURCE PROGRAM LISTING WITH STATEMENT NUMIERS. FOLLTWED BY
ERROR LISTlMG
CODE IjENERATED ASSEMBLY COnE
LGO LOADABLE FILE PRODUCED BY THE ASSEMBLER
PLIEHR TEMPORARY FILF USFD FOR LIST DF SOURCE PROGRA'* ERRORS
3,1 INPUT FORMAT
INPUI TO THE COMPILER IS ASSUMED TO BE IN COLUM JS 2 T^R'^UGH 17. THE
PRINTED LISTING ACTUALLY GOES i)p TO COLU'IN 9", SO INPUT OBTAINED FROM UPDATE
WILL SHOW THE UPDATE SER T At I 7AT I ON . CHARACTERS APPEARlNl IN COLUMN 1 ARE USED
FOR CARRIAGE CONTROL IN THF LlSTlNfi.
EACH EXTfcRNAL PROCEPURE MiJST RE FOLLOWEn BY A CARD HTH A * Ik COLUMN 1 TC
INDICATE fhE END OF THE PRncFDURR. THIS CONVENTION IS NREDE" SINCF COMMENTS
MAY PRECEOE THE INITIAL PRoCFUURE STATEMENT OR FOLLO>I THE FI'lAL ENT STATEMENT,
-54-
\
3,2 CONTROL CARD SETUP
AS AN EXAMPLE, COmSTDPR A PL/I PROGRAM WITH THE FOLLOWT'G
CHARACTERISTICS!
1, THE PROURAM CONSTSTS OF TWO EXTERNAL PROCEDU'^ES ,
2, THE PROGRAM REaCs ITS INPUT FROM FILF IN AND PR0nuCE<5 ITS ruTPUT ON
FILE OUT,
3, THE PROGRAM DOES N^T RFQUIRE MORE THAN 5lK TO EXECUTE,
THE FOLLOWING CONTROL CARD SEQUENCE IS THEN aPPROPRIATEJ
A123456,CM110000»T100. VOUR NAMP
LOADfcR,
NOREUtCE,
CIMGtT(PLl,PLlTEXT)
ATTACI-,RLltJ,PLlRLlB,
PH.
RFL, 30000,
REWINC(C0DE>
COMPASS (I =CODE,GsPLl TEXT, LsO.OsO)
L0AD.LQ0»RLIB,
EXECUTE,, I\=INPUT,CUT = 0UTPUT. ;<OPTlONAL DATA F^R MAIN pROCEDLREj!
EQR
■FIRST PL/I PP^OCEnURfc.
■SECOND Pl/I PROCEDURE-
EQR
INPUT nATA-
EOF
ONOTE THE FOLLOWING REMARKS ON THIS SEQUENCES
1, THE OUTPUT WILL APPEAR IN THE SEiJUENCEl
1 LISTING OF SOURCE PROGRAM* WITH ERRORS
, RESULTS OF EXECUTING THE PPO'IRAM
2, EACH PROCEDURE ^'UST BE FOLLOWED f^Y A STAR IN COL IMN l,
3, TO CALL THE COMPILER WITH AN ALTERNATE OUTPUT FILE OF -OUTi- AND AN
ALTERNATE INPUT FIlE OF -INl-, USE THE CALL CARD
PLl(INPUT=lNi,OUTPUT=OUTl)
3,3 COMPILER OPTIONS
-55-
COMPILER OPTIONS ARF SPPCIFIED AS AN ARnUMEMT TT THf= CO'iPlLER'. USING
THE CONVEi^jTlON OF SECi 3.4.2. FOR INSTANCE. THE CALLING CARP
PLl. i<XREr*
TURNS ON ThE -XREF- OPTION. OPTIONS ARE SEPARATED BY COMMAS IN THF
OPTIONS LIST,
3.3.1 XREK
THE -XREF- OPTION CaU<:ES a CROSS-REFERENCE MAP AND ATTRIBUTE | ISTlNG
TO APPEAR IN THE SOURCE LISTING. NOTE THAT EXTRA CO^PILATIO'I SPACF IS
REOUIRfcD WHEN THIS OPTiCn IS USEP,
3.3.2 ST,^ST
WHEN THE -ST- OPTION IS USED, REGISTER AO IS SET TO THE CURRENT
STATEMENT NU^BEW At THE START OF ExECUTInIG EaCH nON-JULL STATEMENT. THE DEFAULT
IS -ST-, Wl-ICH TURNS THE OPTION ON. NOTE THaT TmIS TPTDN CAUSES THE
OBJECT PROGRAM TC RUN SLOWER AND TAKE HQRE SPACE. IT IS USEFUL, HrwEVER, FOR
nfcHUGGlNG. tNST' TURNS TwE OPTION OFF.
3,4 PROGRAf* CALLING CARD
THE CARD THAT CALLS A COMPILED PL/I PROGRAM CAN ALST RE USED TO ACHIEVE
SUBSTITUTION OF FILE NA^FS AND PASSAGE OF AN ARGUMENT TO THE CALLEf PROGRAM,
THESE FACILITIES ARE DESCRIBED IN THE FOLLOWING TWO SECTIONS.
3.4.1 FILE NAME SUBSTITUTION
THE t-lLES REFERENCED INSIDE A PROGRAM ARE VIRTUAL FILES. AN ACTUAL NAME
CAN BE StJBSTITUTED FOR A VIRTUAL NAME AT THE TIME THAT A PL/I PROGRAM IS
CALLED. 11-6 PROGRAM CALLJ
L0AD(LG0»RLIB)
EXECUTE,, Al=Bl.A2=B?, A3=B3.
CAUSES THt ACTUAL FILE NaMFS Bl,H2. R3 TO BE SUBSTITUTED FOR THE VIRTUAL FILE
NAMES A1»a2i and A3 RESPFCT I VEL Y . IF THE PROGRAM REFERENCES FILE 41 INTERNALLY
FOR OUTPUT, THEN THE ACTUAL OUTPUT WILL APPEaR CN FILE 61, IF NO CUPSTITUTION
IS GIVEN t-CR A VIRTUAL ElL^ mAmE. THE ACTUAL NA^F IS TAKEN T'^ RE T»-E SAmE AS
THE VIRTUAL NAME. NOTE THAT FILE nAmE SUBSTITUTION ^Y RE U«?ED FOR CALLS TO
THE COMPILER JTSELF IN THE SAME FASHION, NOTE ALSO THAT IF A PROGRAM IS
INVOKED RY AN EXECUTE CARD. AN EXTRA COMMA IS NEFDED F0LLO»^I'^G EXECUTE.
IF AN ACTUAL FILE NAMF IS -0-. THEN THAT FILE 1 1 TREATEH AS
FILE. ANY OUTPUT SENT TO THAT FILE IS DISCARDEDl ANY ATTEMPT TO
FILE WILL RAISE THE ENDFILF CONDITION.
3.4.2 PASblN^. AN ARGUMENT TO A MAIN PROCEDURE
A NULL
RFAP THE
AN AHGUMENT MAY BE pA'^SFD TO TflE HAiN PROCEOURE FRO 1 THF PROGRAM CALL
CARD. TO ACCOMPLISH THIS, DECLARE THE MAIN PROCFDURE TO HAVF A SINGLE
PARAMETER. THAT PARAMETFR MUST THEN BE DECLARED TO 'IE CH AR AOTER ( 8r ) VARYING,
ON THE PRUGRAM CALL OR EXECUTE CaRD, THE ARGUMENT TC BE PaSS^^D TO THE MAIN
PROCEDURE IS bURRQUNDED RY THE CHARACTER * , AND WRITTEN FOLLOWING THE CLOSING
-56-
PARENTHESIS OR PERIOD OF TmE CALL. FMBEDDED * CHARACTERS? WITHIN Tl-E ARGUMENT
ARE WRITTEN AS ** , IF NO ARGUMENT IS PROVIDED, A NULL STRING WILL BE PASSED AS
ARGUMENT, FOH EXAMPLE, THF FOLLOWING PROGRAM CALL CARD JSES BOTH FILE NAME
SUPSTITUTICN AND PARAMETER pASSAQEt
LOAD.LGO.RLIB,
EXECUTE,, lNPUT = TApEt, ^THls uQf^/^l FAIL;!
LGOC INPUTsTAPEl) i<THTS WON^;!T FAlL»<
-57-
4, ERROR MESSAGtS AND DERUr^GING TECHNIQUES
IF YOuR PHOljRAM DOES Unj SUCCESSrULY COMPILE AfvID EXECUTE. THE FIRST
THING YOU f'UST UO IS TO TRY TO ClAsSIFY THE CAUSE. THE rOLLHWlNG KINDS OF
ERRORS CAis OCCUHJ
1, SOLRCE PROGRAM ERRORS CORRECTLY TlAGNOSEn RY THE COMPILER.
2, SULRCfc PROGRAM ERRORS INCORRECTLY DIAGNOSED lY THE COMPILER,
3, FAILURE OF THE COMPILER TO COMPLETE EXECUTICM.
4, EHfiORS IN THE COmRaSS ASSEMBLY,
5, I\CORHECT RESULTS of EXECUTION DUE TO ERRORS Vi THE SOURCE PROGRAM,
6, INCORRECT RESULTS of EXECUTION DUE TO ERRORS IN THE COMPILFR
OH THE RUN-TIME LIBRARY.
IF THE COMPILER DIAGNOSES AN ERROR. THEN YOU SHOULD CHECK WHETHER YOU HAVE
VIOLATED iiCME LANGUAGE RESTRICTION. IF' 'JOT. THEN YQU HAVE A BUG Tr REPORT,
ASSUMING THAT THE COMolLER HAS NOT DIaGnOSEO ANY ERRORS. THE rOMPlLER
MAY FAIL UlRING EXECUTION, IF THIS SITUATION ARISES. THE FI^ST CAI SE TO
CHECK IS YCUR DECK SETUP. If THE DECK SETUP IS CORRECT, THE'O REPORT A BUG,
THE SAME HEmARK ARRlIEs TO E«RORS IN THE COMPASS ASS^MRLY. vqTE T^-AT WARNING
MESSAGES IN The COmraSS ASSEMBLY ARE EXPECTED. THERE IS ONE LANGUAGE
RESTRICTION WHOSE VI0LATr0\)S WILL LEaD TO ASSEMBLY ERRORS: DPCLARATION OF THE
SAME EXTERNAL VARIABLE IM mqrE THAN ONE PLACE IN A SINGLE EXTERNAL PROCEDURE,
ASSUME THEN THAT YOuR PROGRAM HAS COMPILED AND ASSEMBLE^ WITHPUT
APPARENT DIFFICULTY. IT MAY FAIL DURING EXECUTION EITHER RY FARING TO
COMPLETE bXECUTlON OR BY GIVING WRONH ANSWERS. HUGS OCCURRpiG AT THIS STAGE
MAY WELL HE SOURCE PROGRAM ERRORS WITHOUT THAT BFIMG OBVIOUS. IN TRDER TO
ISOLATE THEM, IT IS RECCmmpnDEO THAT YOU WHITE OUT SUFFICIENTLY MANY
INTERMEDIATE RESULTS TO ENABLE YOU TO ISOLATE WHERE THE TROUBLE IS OCCURRING.
IF YOU CA\ ISOLATE THE TROUBLE AS OCCURRING IN A SINGLE STAT^^MENT, THEN REPORT
THE BUG.
PRESENTLY THE ONLY RUM-TIME MESSAGES aRE THOSE ASSOCIaTFD WITf- PROGRAM
ABORTS, AMONG THE CAUSES '^F THESE AREJ EXHAUSTION TF THE STORAGE POOL,
CONVERSION ERRORS IN GOInG FROM CHARACTER STRINGS TO OTHER DATA TYPES, AND
READING PAST AN END oF FiLF.
IF THE STATEMENT OPTION IS NOT TURNED OFF (CF . 3EC. 3.3.2). T»-EN
IT IS POSblBLb TO DETERMINF THE STATEMENT NUMBER OF THE STATEMENT PEING
EXECUTED AT THE TIME OF AN ERROR. WITH THIS OPTION JN, REGISTER A" IS SET
to THE ClJHRE'JT STATEMENT NumRER AT THE BEGINNING OF EXECUTING EACH STATE-
MENT, THt DUMP SHOULD ThEm INDICATE THE STATEMENT NUMBER THAT CAUSED THE
DIFFICULTY, HOWEVER, ThiS OPTION CAUSES THE PROGRAM TO ^UN LONGER AND
CONSUME EXTRA SPACE. SO IT SHOULD NOT BE USED FOR DETUQGED PROGRAM*:,
-58-
5, AREAS OF I NCQMPAT I B I L I TV
THE AREAS OF INCOMPATIBILITY BETWEEN CIMS PL/I AND IBM PL/I ARE ALL MEN-
TIONED IN THE LANGUAGE SPECIFICATION OF SECTION ?. SINCE THP LANGi AGE FEATURES
OF ClhS PL/I ARE PRESENTLY FAR FEWER THAJ THOSE IN I'lM PL/I, THOSE FEATURES
THAT ARE LACKING WILL NOT «E DETAILED HERE. HOWEVER, DIFFERENCES IN THE OTHER
DIRECTION SHOULD BE NOTED,
5,1 EXTENSIONS IN CIMS Pl/I
1, A PROCEDURE STaTfMFNT NEED NOT BE LAPELLED A^ LO JG a'=^ aT
LtAST O^E ENTRY POINT OF TME PROCEDURE I <5 LABELLED.
2. A POINTER-QUALIFYING EXPRESSION IS NOT RESTRICTED TO BE A SCALAR
UinSu^SCHIPTED VARI4BLE) IT MAY BE AN ARBITRARY P 1 1 NTPR'V ALL ED EXPRES"
SICN,
3. THE BASED ATTRIBuTF NEED NOT HAVE AN ASSOCIATED POINTER EXPRESSION.
IF SUCH AN EXPRESSION IS GIVFN* IT IS NCT RESTRICTED TO RE A SIMPLE
VARIABLE,
4, BASED CHARACTER STRINGS MAY oE VARYING. F-LEVEL PL/I DOES NOT ALLOW
THIS,
5, THE ON-gTATEMENT May SPECIFY SEVERAL CONHlTMNS. F.
PL/J DOES NOT ALLOW THIS-
EVEL
6, ClfS PL/I HAS A DEFAULT STATEMENT THAT DOES JOT RXIST AT AIL IN THE
F-COMPILER. THE I^M OPTIMIZING COMPILER D0E3 HAVE A DEFAUI T STATEMENT.
BUT IT HAS DIFFERENT SPECIFICATIONS THAN THE ONE IN CIMS PL/I,
7, ClfS PL/I HAS A STORAGE ON-CONDITlON THAT HA"? NO EOUTVALENT IN THE
F-COMPILER, THE I^m OPTIMIZING COMPILER DOES HAVE S^'CH A CONDITION*
HOWEVER,
5,? INCOMPATIBILITIES BETWEEN CIMS PL/I AND IBM PL/I
1. IN CIMS PL/I, INTERNAL PROCEDURES MUST NOT BE DECLARED,
Ii\ SOME VERSIONS OF IBM PL/I. ThEY H^JST f\E DECLA-'^ED,
2. IN CIMS PL/I, CONSTANT LABEL ARRAYS MUST NOT BE DECLAREDI IN
IB^ PL/I THEY MIST BE DECLARED,
3. IN CIMS PL/I, ALL UNDECLARED VARIABLES WITHOUT CONTEXTUAL ATTRIBUTES
DEFAULT TO FIXED (pINARY). 1N I^M Pl/I THEY WOULD DEFAULT TO FLOAT.
4. THERE IS NO CONNECTION BETWEEN THE ENVIRONMEJT OPTIO^J AS U^ED
IN IBM PL/I AND THF ENVIRONMENT OPTION AS USED I J CI^S PL/I.
-59-
6, FUTURE FLANS
EVENaALLY I HOPE Tn lMPLf=MENT THE FULL PL/I LA JQUAIE A«! OEFUED RY
COMMITTE X3J1 OF THE AMEpICAN i^ATIOMaL STANDARDS IMSTITUTE. THE LANGUAGE AS
PRESENTLY IMPLEMENTED IS A LONG WAY FROM THAT GOAL. THE FEATURES MENTIONED
IN REVISION G OF THIS MAmUAL HAVE ALL BEEN IMPLEMENTED, STREAM INFUT-OUTPUT
IS ON THE ^-AY, dUT NO OThEP IMPROVEMENTS ARE PLANNED FOR THE IMMEDIATE FUTURE,
-60-
7, ACKNOWLEDGMENTS
LINDA SUSKIND CONSTRUCTED THE FACILITIES FOR SUB>TITJTI0N OF FTLF
NAMES-
ABRAHAM QETZLER CONSTRUCTED THE CROSS-REFERENCE MAP AND ATTRIPUTE LISTING.
AS WELL AS THfc OVERLAY ROUTINES USED WITHIN THE COMPILER.
IRA KAYE BUILT THE PNTIRE MECHANISM FOR THE DEFAULT STATEMENT'. AND ALSO
WROTE APPtNDIX A OF THIS MANUAL.
EMILY EISENbERG CONStRucTEH THE ORIGINAL VERSION OF THE LEXICAI SCANNER,
PETER fACLEAN PROVJCED AN INTERACTIVE DERUGGING FACILITY WHOSE USE WAS
INVALUABLt IN DEBUGGING THE COMPILER.
-61-
APPENDIX A
A.l DESCRIPTION
A.l .1
DEFAULT/DFT (PPEHICATE) DTT-ATTR-SET t , ")FT- aJTR-SET 1 I
IF A VARlAhiLE MEETS T^E CONDlTinNS SPECIFED BY THE PREDICATE, GlVE THE
VARIABLE ThE ATTRIBUTES SPPCIFIED IN THE DFT-A TTR-SETS . ATTf^MPT Tr APPLY THE
SETS IN THE ORDER LISTED, anH apply EACH ONE WHICH D3ES iOT CAUSE ANY CONFLIC-
TING ATTRIBUTES TO BE GIvEv.
A, 1.2 DE^AULT/DFT (PRFClCATE) EHRORj
IF THE VARIABLE MEETS THE CONnlTlONl SPFCIFIED IN PREDICATE. ^IVE IT THE
ERROR ATTHlBUTEi AND PRImT A MESSAGE. PROGRAM WILL JOT RE E^ECUTAFLE IF ANY
VARIABLES ARE GIVEN THE ERPOR ATTRIBUTE,
A, 1.3 DEt- AULT/DFT SYSTEM;
APPLV THE STANDARD PL/1 SYSTEM DEFAULTS TO THE VARIABLE. ANY DEFAULT
STATEMENTS APPEARING AFTER T^IS TYPE WILL RE IQr.ORED,
A, 1.4 DEFAULT/DFT NONE J
APPLY NO FURTHER DEFAULTS, !NcLUDlN«l THE SYSTEM DEFAULT. IF 4 VARIABLE IS
NOT CQMPLbTE, IT WILL ^E Glv^N THE ERROR ATTRIBUTE. AnH An EpROR MESSAGE WILL
RE PRINTEU,
A,? ORDER CF PROCESSING ANn SCOPING.
DEFAULT STATEMENTS aR= PROCESStD IN THE ORDER I') WHICH THEY APPEAR. THE
DEFAULT SIATEMENTS APPEARlMG IN A BLOCK ARE APPLIED IN SEQUENCE TO EAC^
VARIABLE UECLARED IN THE BLOCK,
DEFAULT STATEMENTS FOLLOW THE SAME ;JAME SCOPING CONVENTIONS ^<: VARIABLES.
THEY ARE KNOWN JN THE PLOCkS IN WHICH THEY APPEAR AM ALL ENCLOSED BLOCKS,
THEY ARE PROCbSSED FROM THP INNERMOST BL^CK OUT.
EX, 1,
AJ PROC OPTiOmS (MAIN) I
DFT (PI) SFT1 »
DCL Al ;
• • •
Bl PROC J
DFT (P2) «;ET2 J
DCL Bl J
• fl •
END B i
END AJ
Rl WILL HAVE PREDICATES p2 AND PI AfPLlED TO IT IN T <AT nRHE", Al UiLL HAVE
PREDICATE Pi APPLIED TO iT. jr iT IS DESlREn ThaT O^JLY ^ReDTCaTE P2 BE APPLIED
TO Bl, THbN IfJ SYSTEM; SHOULD BE PLACED AFYFR DFT(P^) SET?;
EX, 2,
A! PROC OPTICnS^MAIN) ;
DFT (P3) SETT I
DFT none;
-62-
VARIABLES
DEFAULTS,
VARIABLES
VARIABLES
DEFAULTS.
Bl PROC }
DFT SYSTEI"
pRnc J
DFT<P4)
• « •
c;
SET4 I
END
End b;
END Aj
CECLAHED in BLOCi< a WILL HaVE
Including systfm defaults,
declared in blnc^ r will have
chclared in block c will have
PREDICATE P3 APPLIFD. AND NO OTHER
THE STAKHARD SYSTEfl "lEFAULTS APPLIED
P4 APPLIED T^ THEM, AND THFN SYSTEM
A, 3 LESCHIPTION OF PRFClCATES AND DFFAULT ATTRp^UTE SETi,
A. 3.1 PREDICATES
PREDICATES ARE MADE Up BY CDMRlMlNG ANY OF THE ALLO MBLf^ ATTRTBUTE KEY-
WORDS IN dCOLEAN EXPRESSIONS USING a. t, AND -. TO !?EFE'-) TO DIMEN'cIONED
ARRAYS, USE THE KEYWORD oImeNSION OF- OlM. IN ADhITMN T^E FOLLOWING ELEMENTS
ARE ALLOWtC IN PREDICATESl
RANGE(STRING) - A VARIABLE f'FFTS THIS CCNDITIOU IF IT STARTS WITH THE
CHARACTER STRING,
RANGE(CHAR1 : CMAr2) - A VARIABLE MEttS TMIS C0?>1D I T I O'^S IF ITS FIRST
LETTER IS IN THE LEXICAL ImtERVAL SPECIFIED BY CHARl AND CMA'^2.
RaNGEC*) - ALL VARIABLE*? mEET ThiS cCNDlTjO^^.
M.3.2
DEFAULT ATTRIBUTF SETS.
ATTRIBUTE SETS ARE mAHE UP OF LISTS OF ATTRlBDTf^S W'UCH ARE Tr RE APPLIED
TO A VARIABLE IF IT MEETS THE CONDITIONS OF THE PREDICATE. AS MANY SETS AS
DESIRED MAY Hfc PLACED IN A DEFAULT STATEMENT. THE ATTRIBUTES GIVEN IN THE SET
ARE APPLItC TO THE VARIA«LF, PRQVlntn THAT THEY DO N')T CTNFLICT WITH
ATTRIBUTES THAT HAVE AlREAHY BEEN ASSIGNED TO THE VA'^^IABlE. IF SUCH
CONFLICTS EXIST, THEN T^P SET IS IGNORED. ANY ATTRIBUTE OTHPR THAK THE
LIKE ATTRIBUTE CAN APPEAR IN A DEFAULT ATTRIBUTE SET,
-63-
Appendix B. Useful Coding Tricks
In this appendix I shall describe some coding tricks that
I used in the compiler and in its run-time library. This
section assumes a knowledge of 6600 machine language. In the
following descriptions, the assumption is made that any register
not defined by the code specification can be used as a temporary.
B.l Absolute Value, Maximum, and Minimum
The method for taking absolute values without a branch
instruction has been passed down by oral tradition, but I have
never seen it in print. The following code takes the absolute
value of the contents of X7 and leaves the result in X7.
BX6 X7
AX6 60 FILL X6 VJITH SIGN BIT OF X7
BX7 X7-X6 INVERT X7 IF NEGATIVE
A variation of this device will take the maximum of X6 and X7 ,
leaving the result in X7 :
FX5 X7-X6
BX4 X5
AX4 60
BX5 X5*X4 X7-X6 IF NEGATIVE, ELSE 0
FX7 X7-X5 IF X7 EXCEEDS X6 , SUBTRACT X7-X6
If 60 happens to be available in a B-register, these sequences
can each be shortened by one instruction. The code for the
minimum is analogous.
B.2 Least Integer in X
If we have a floating number and wish to find the least
integer not exceeding that number, the code is tricky because
float-to-integer conversion truncates towards zero and gives
the wrong result for negative numbers. Assuming input in X7 ,
the following code puts the desired least integer into X7.
-64-
MXl
1
BX2
X7*X1
LX2
1
1X7
X7+X2
UX7
X7,B2
LX7
X7,B2
1X7
X7-X2
1 TO X2 IF X7 NEGATIVE, ELSE 0
IF X7 NONNEGATIVE, LEAVE IT UNCHANGED
IF X7 AN EXACT INTEGER, BUMP IT UP
IF X7 NOT EXACT INTEGER, KEEP SAME
INTEGER VALUE
CONVERT FLOAT TO INTEGER
SUBTRACT 1 IF NEGATIVE
To understand this code, note that if X7 is nonnegative the
addition and subtraction of X2 has no effect, since X2 contains
zero. Otheanvise we have two cases. If X7 contained the floating
representation of a negative integer, then the addition will
give us a value that will, when truncated, give the next higher
integer. (E.g., we will go from -3 to -2.99999999.) After
conversion to float we have a number one too big (e.g. , -2)
and the following subtraction corrects this situation. Suppose,
on the contrary, that X7 contained the floating representation
of a negative number that was not an exact integer. Then we
would go, for instance, from -3.3 to -3.299999999 to -3
(after conversion to integer) to -4 (after subtraction) .
The same device works for finding the greatest integer.
In order to treat zero correctly the correct procedure is to
complement X7 , use the code given above, and recomplement X7.
B.3 Division by Ten
Performing integer division by 10 is necessary when finding
the number of words needed to represent a packed character string,
since each character occupies 6 bits and each word has 60 bits.
Integer division on the 6600 can only be carried out by using
the floating division instruction; the whole process requires
five instructions. The division instruction itself takes 20
minor cycles , twice as long as any other non-branching instruc-
tion. Thus it is desirable to avoid use of the division instruc-
tion in dividing by 10. This can be done by multiplying by a
suitably imnormalized one-tenth. Assuming input and result
in X7 , we have :
■65-
SAl -17170631463146314632B 1/10+EPSILON
PX6 X7 FLOAT X7 INTO X6
FX6 X1*X6 QUOTIENT TO X6
SX7 X6 CLEAR JUNK BITS FROM EXPONENT
The exponent is chosen so that the desired quotient appears in
the right end of the upper half of the product. A similar trick
can be used for division by other integer constants.
B.4 Changing the Representation of Blank
The internal character code for blank is 55B, which gives
the wrong result in character comparisons since it places blank
after the letters or digits. Before doing a string comparison
other than equality or inequality it is desirable to map all
blanks to the code OOB, thus redefining the collating sequence
to make it more reasonable. It is possible to perform this
operation on an entire word without looping. The process proceeds
in five stages. In the first stage we map the input word so
that 55B characters become 77B (all ones) and no other characters
become 77B. In the second stage we convert each 7 7B to a charac-
ter such that the leftmost bit is 1, and each other character to
a character such that the leftmost bit is 0. This is done through
shifting and "and"-ing. In the third stage we convert each 77B
to 40B and each other character to 0 . In the fourth stage we
convert each 40B to 77B and each other character to 0. In the
fifth stage we use the mask we have just developed to change
the 55B's to OOB's. Assuming input and output in X7 , the code is:
-66-
SAl
= 10H
BX6
-X1-X7
BX5
X6
LX5
3
BX6
X5*X6
SBl
1
LX5
X6,B1
BX6
X5*X6
LX5
X6,B1
BX6
X5*X6
LX5
X6,B1
BX6
X5*X6
SA2
=40404
BX6
X2*X6
AX5
X6,B1
BX6
X5+X6
AX5
X6,B1
BX6
X5+X6
AX5
X6,B1
BX6
X5+X6
BX5
X6
AX5
3
BX6
X5+X6
BX7
-X6*X7
CHANGE 55B to 77B in X6
AND 3 LEFT BITS WITH 3 RIGHT BITS
ALL 6 BITS NOW ADDED INTO LEFT BIT
)404040B
ELIMINATE EXTRA BITS
CHANGE 4 OB TO 77B
NOW ALL CHARS ARE 77B or 0
MASK 55B TO 0
The same approach can be used, for instance, to count null
characters in a word; in this case one produces 40B for each
null character and then uses the population count instruction.
Other related tasks can also be accomplished with this approach.
B.5 Searching for a Set of Characters in a String
The trick for searching for a set of characters in a string
is needed in connection with the VERIFY built-in function in PL/I.
In this situation we are given a set of characters , and we wish
to find the first character in the string that is not in the set.
(The problem of finding the first character that is in the set
is similar.) Rather than give the actual code, I will just
describe the method. First, form a bit mask such that bit _i
is one if and only if the character with representation i. is in
the set. Then loop through the characters in the string. For
each character, use the character representation as a shift
count and shift a single one bit left by that count. Form the
"and" of the shifted bit and the bit mask constructed above.
If the result is nonzero, then the character is in the set.
There is a difficulty with this method. If the characters
include any whose octal representations exceed 60 , then there
will be an ambiguity, and two bit masks are needed rather than one.
-67-
This report was prepared as an account of
Government sponsored work. Neither the
United States, nor the Commission, nor any
person acting on behalf of the Commission:
A. Makes any warranty or representation,
express or implied, with respect to the
accuracy, completeness, or usefulness of
the information contained in this report,
or that the use of any information,
apparatus, method, or process disclosed
In this report may not infringe privately
owned rights; or
B. Assumes any liabilities with respect to
the use of, or for damages resulting from
the use of any information, apparatus,
method, or process disclosed in this
report.
As used in the above, "person acting on behalf
of the Commission" includes any employee or
contractor of the Commission, or employee of
such contractor, to the extent that such em-
ployee or contractor of the Commission, or
employee of such contractor prepares, dis-
seminates, or provides access to, any infor-
mation pursuant to his employment or contract
with the Commission, or his employment with
such contractor.
-68-