CAC Document No. 30
Center for Advanced Computation
University of Illinois at Urbana-Champaign
Urbana, Illinois 61801
A 10-page Description of the ILLIAC IV System
Stewart A. Denenberg
October 12, 1971
The architecture or hardware structure of the ILLIAC IV System
is discussed. The ILLIAC IV System comprises the ILLIAC IV Array plus the
ILLIAC IV Subsystem. The ILLIAC IV Array is a Vector or Array Processor
with a specialized Control Unit that can be viewed as a small stand-alone
computer by itself. The text has been revised and condensed from ILLIAC IV
Document No. 225.
A. ILLIAC IV in Brief
The original design of ILLIAC IV contained four Control Units,
each of which controlled a 64 Arithmetic and Logic Unit (ALU) Array
Processor. The version being built by the Burroughs Corporation will have
only one Control Unit which drives 64 ALUs as shown in Figure 1. It is
for this reason that ILLIAC IV is sometimes referred to as a Quadrant
(one-fourth of the original machine) and it is this abbreviated version of
ILLIAC IV that will be discussed for the remainder of this document.
Figure 1. Functional Block Diagram of ILLIAC IV
One difference between ILLIAC IV and a general Array Processor is
that the Control Unit (CU) has been decoupled from the rest of the Array
Processor so that certain instructions can be executed completely within
the resources of the CU at the same time that the ALU is performing its
vector operations. In this way another degree of parallelism is exploited
in addition to the inherent parallelism of 64 ALUs being driven
simultaneously. What we have is two computers inside ILLIAC IV, one that
operates on scalars and one that operates on vectors. All of the
instructions, however, emanate from the computer that operates on scalars:
the CU.
Each element of the ALU Array is not called by its generic name
(ALU) but is called a Processing Element or PE. There are 64 PEs and they
are numbered from 0 to 63. Each PE responds to appropriate instructions if
the PE is in an active mode. (There exist instructions in the repertoire
which can activate or de-activate a PE.) Each PE performs the same
operation under command from the CU in the lock-stepped manner of an Array
Processor. That is, since there is only one Control Unit, there is only
one instruction stream and all of the ALUs respond together or are
lock-stepped to the current instruction. If the current instruction is ADD
for example, then all of the ALUs will Add; there can be no instruction
which will cause some of the ALUs to be adding while others are multiplying.
Every ALU in the Array performs the instruction operation in this lock-
stepped fashion, but the operands are vectors whose components can be and
usually are different.
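The lock-stepped behavior described above can be illustrated with a minimal
sketch. The function name and the operand values are hypothetical, chosen
only for illustration; the sketch assumes one vector component per PE.

```python
# Hypothetical model of lock-step execution: one instruction, 64 operand pairs.
NUM_PES = 64

def broadcast_instruction(op, a, b):
    """Apply the same operation in every PE; only the operands differ per PE."""
    return [op(a[i], b[i]) for i in range(NUM_PES)]

# Component i of each vector lives in PE i (illustrative data).
a = list(range(NUM_PES))
b = [10 * i for i in range(NUM_PES)]

# A single ADD from the one instruction stream: every PE adds.
sums = broadcast_instruction(lambda x, y: x + y, a, b)
print(sums[:4])  # [0, 11, 22, 33]
```

There is no way in this model to ask some PEs to add while others multiply,
which mirrors the single instruction stream of the CU.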
Each PE has a full complement of arithmetic and logical circuitry
and under command from the CU will perform an instruction "at-a-crack" as
an Array Processor. Each PE has its own 2048-word, 64-bit memory called a
Processing Element Memory (PEM) which can be accessed in about 350 ns.
Special routing instructions can be used to move data from PEM to PEM.
Additionally, operands can be sent to the PEs from the CU via a full-word
(64 bit) one-way communication line and the CU has eight-word one-way
communication with the PEM array (for instruction and data fetching).
An ILLIAC IV word is 64 bits and data numbers can be represented
in either 64-bit floating point, 64-bit logical, 48-bit fixed point, 32-bit
floating point, 24-bit fixed point, or 8-bit fixed point (character) mode.
By utilizing the 64-bit, 32-bit and 8-bit data formats the 64 PEs can hold a
vector of operands with either 64, 128, or 512 components. Since ILLIAC IV
can add 512 operands in the 8-bit integer mode in about 66 nanoseconds, it
is capable of performing almost 10^10 of these "short" additions per second.
ILLIAC IV can perform approximately 150 million 64-bit, rounded, normalized
floating-point additions per second.
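The vector lengths and the short-addition rate quoted above follow from
simple arithmetic, which can be checked with a few lines. The variable names
are illustrative; the figures (64 PEs, 64-bit words, 66 nanoseconds) are the
ones given in the text.

```python
WORD_BITS = 64   # one ILLIAC IV word
NUM_PES = 64     # one word per PE

def vector_length(operand_bits):
    """Number of operands the 64 PEs hold when each word is split evenly."""
    return NUM_PES * (WORD_BITS // operand_bits)

lengths = {bits: vector_length(bits) for bits in (64, 32, 8)}
print(lengths)  # {64: 64, 32: 128, 8: 512}

# 512 eight-bit additions complete together in about 66 nanoseconds.
ADD_TIME_8BIT_S = 66e-9
short_adds_per_second = lengths[8] / ADD_TIME_8BIT_S
print(f"{short_adds_per_second:.2e}")  # roughly 7.8e9, i.e. almost 10^10
```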
The I/O is handled by a B6500 Computer System. The Operating System,
including the assemblers and compilers, also resides in the B6500.
B. The ILLIAC IV System
The ILLIAC IV System can be organized as in Figure 2. The ILLIAC
IV System consists of the ILLIAC IV Array plus the ILLIAC IV I/O System.
Figure 2. ILLIAC IV System Organization
The ILLIAC IV Array consists of the Array Processor and the Control Unit.
In turn, the Array Processor is made up of 64 Processing Elements (PEs) and
their 64 associated memories, the Processing Element Memories (PEMs). The
ILLIAC IV I/O System comprises the I/O Subsystem, the Disk File
System and the B6500 control computer. The I/O Subsystem is broken down
further into the CDC, BIOM and IOS. The B6500 is actually a medium-scale
computer system by itself and supervises the Laser Memory and the ARPA
Network Link.
The ILLIAC IV Array will be discussed first, in a general manner,
followed by a brief description of the ILLIAC IV I/O System.
1. The ILLIAC IV Array
Figure 3 represents the ILLIAC IV Array — the Control Unit plus
the Array Processor.
Figure 3. ILLIAC IV Array
a. Control Unit (CU)
The Control Unit is not just the Control Unit that we're used to
thinking of on a conventional computer but can be viewed as a small
unsophisticated computer in its own right. Not only does it cause the 64
Processing Elements to respond to instructions, there is a repertoire of
instructions that can be completely executed within the resources of the
Control Unit, and the execution of these instructions is overlapped with
the execution of the instructions which drive the Processing Element Array.
Again, it is worthwhile to view ILLIAC IV as being two computers, one
which operates on scalars and one which operates on vectors.
The Control Unit contains 64 integrated circuit registers called
the ADVAST Data Buffer (ADB) which can be used as a high speed scratch-pad
memory. ADVAST is an acronym for Advanced Station and is one of the five
functional sections of the CU. Each register of the ADB (D0 through D63)
is 64 bits long. The CU also has 4 Accumulator Registers called ACAR0,
ACAR1, ACAR2, and ACAR3, each of which is also 64 bits long. The ACARs can
be used as accumulators for integer addition, shifting, Boolean operations
and holding loop control information in conjunction with the simple ALU.
In addition, the ACARs can be used as index registers to modify storage
references within the memory section (PEM).
b. Processing Element (PE)
Each Processing Element (PE) is a sophisticated ALU capable of a
wide range of arithmetic and logical operations. There are 64 PEs numbered
0 through 63. Each PE in the array has 6 programmable registers: the A
register (RGA) or Accumulator, the B register (RGB) which holds the second
operand in a binary operation (such as Add, Subtract, Multiply or Divide),
the R or routing register (RGR) which transmits information from one PE to
another, the S register (RGS) which can be used as temporary storage by the
programmer, the X register (RGX) or index register to modify the address
field of an instruction, and the D or mode register (RGD) which controls
the active or nonactive status of each PE independently. The mode register
determines whether a PE will be active or passive during instruction
execution. Since this register is under the programmer's control, individual
PEs within the array of 64 PEs may be set to enabled (active) or disabled
(passive) status based on the contents of one of the other PE registers.
For example, there are instructions which disable all PEs whose RGR contents
are greater than their RGA contents. Only those PEs in an enabled state are
able to execute the current instruction. All registers are 64 bits except
RGX which is 16 bits and RGD which is 8 bits.
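The effect of the mode register on execution can be sketched as follows.
This is an illustrative model only: the class, the function names and the
register contents are hypothetical, and a single Boolean stands in for the
enable bit of RGD.

```python
NUM_PES = 64

class PE:
    """Toy processing element with three of the registers named in the text."""
    def __init__(self, rga, rgb, rgr):
        self.rga = rga        # accumulator
        self.rgb = rgb        # second operand
        self.rgr = rgr        # routing register
        self.enabled = True   # stands in for the enable bit of RGD

def disable_where_rgr_gt_rga(pes):
    """Model of an instruction that disables PEs whose RGR exceeds RGA."""
    for pe in pes:
        if pe.rgr > pe.rga:
            pe.enabled = False

def add(pes):
    """Lock-stepped ADD: only enabled PEs execute RGA <- RGA + RGB."""
    for pe in pes:
        if pe.enabled:
            pe.rga += pe.rgb

# Illustrative data: RGA holds the PE number, RGR holds 32 everywhere.
pes = [PE(rga=i, rgb=100, rgr=32) for i in range(NUM_PES)]
disable_where_rgr_gt_rga(pes)   # PEs 0..31 become passive
add(pes)                        # only PEs 32..63 add
print(pes[0].rga, pes[40].rga)  # 0 140
```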
c. Processing Element Memory (PEM)
Each PE has its own 2048 word, 64-bits per word, random access
memory. Each memory is called a Processing Element Memory or PEM and they
are numbered 0 through 63 also. A PE and PEM taken together is called a
Processing Unit or PU. PE_i may only access PEM_i, so that one PU cannot
modify the memory of another PU. Information can, however, be passed from
one PU to another via the Routing Network which is one of the four paths by
which data flows through the ILLIAC IV Array.
d. Data Paths
Besides the Instruction Control Path which drives the 64 PEs
during the execution of an instruction there are four paths by which data
flows through the ILLIAC IV Array. These paths are called the Control Unit
Bus (CU Bus), the Common Data Bus (CDB), the Routing Network, and the Mode
Bit Line.
i. Control Unit Bus (CU Bus)
Operands or data from the PEMs in blocks of eight words can be
sent to the CU via the Control Unit Bus (CU Bus). The instructions to be
executed are distributed throughout the PEMs and are fetched in blocks of
eight words to the CU via the CU Bus as necessary. Although the Operating
System takes care of fetching and executing instructions, data can also be
fetched in blocks of 8 words under program control using the CU Bus.
ii. Common Data Bus (CDB)
Information stored in the Control Unit can be "broadcast" to the
entire 64 PE Array simultaneously via the Common Data Bus (CDB). A value
such as a constant to be used as a multiplier need not be stored 64 times,
once in each PEM; instead this value can be stored within a CU register and then
in each PEM; instead this value can be stored within a CU register and then
broadcast to each enabled PE in the array. In addition the operand or
address portion of an instruction is sent to the PE array via the CDB.
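A minimal sketch of the broadcast idea follows, assuming a single value held
in a CU register and a list of per-PE enable flags. The function name and the
data are hypothetical.

```python
NUM_PES = 64

def broadcast_multiply(cu_register, pe_values, enabled):
    """Multiply each enabled PE's value by one constant sent from the CU.

    The constant exists once, in the CU, rather than 64 times in the PEMs.
    Disabled PEs keep their old value.
    """
    return [v * cu_register if on else v
            for v, on in zip(pe_values, enabled)]

values = list(range(NUM_PES))                 # illustrative per-PE operands
enabled = [i % 2 == 0 for i in range(NUM_PES)]  # even-numbered PEs active
scaled = broadcast_multiply(3, values, enabled)
print(scaled[:4])  # [0, 1, 6, 3]
```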
iii. Routing Network
Information in one PE register can be sent to another PE register
by special routing instructions. (Information can be transferred from PE
register to PEM by standard LOAD or STORE instructions.) High speed
routing lines run between every RGR of every PE and its nearest left and right
neighbor (distances of -1 and +1 respectively) and its neighbor 8 positions
to the left and 8 positions to the right (-8 and +8 respectively). Other
routing distances are effected by combinations of routing -1, +1, -8, or +8
PEMs; that is, if a route of 5 to the right is desired, the software will
figure out that the fastest way to do this is by a right route of 8
followed by three left routes of 1. Figure 4 shows one way to view the
connectivity which exists between PEs. As can be seen from the figure, PE0
is connected to PE63, PE1, PE56, and PE8.
Figure 4. PE Routing Connections
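One way the software might choose a sequence of hardware routes is a
breadth-first search over the four available distances. The sketch below is
not the actual ILLIAC IV software; the function name is hypothetical, and it
assumes the PEs are connected end-around (PE 63 adjacent to PE 0), which is
stated here as an assumption.

```python
from collections import deque

NUM_PES = 64
HARDWARE_ROUTES = (+8, -8, +1, -1)  # the four routing distances in the text

def route_plan(distance, n=NUM_PES):
    """Return a shortest list of hardware routes that totals `distance`.

    Assumes an end-around ring of n PEs (an assumption of this sketch).
    Breadth-first search guarantees the fewest routing steps.
    """
    target = distance % n
    paths = {0: []}          # offset reached -> routes used to reach it
    queue = deque([0])
    while queue:
        pos = queue.popleft()
        if pos == target:
            return paths[pos]
        for step in HARDWARE_ROUTES:
            nxt = (pos + step) % n
            if nxt not in paths:
                paths[nxt] = paths[pos] + [step]
                queue.append(nxt)
    raise ValueError("distance not reachable")

plan = route_plan(5)
print(len(plan), sorted(plan))  # 4 [-1, -1, -1, 8]
```

For a route of 5 to the right the search finds four steps, one right route
of 8 and three left routes of 1, which is shorter than five right routes of 1.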
iv. Mode Bit Line
The Mode Bit Line consists of one line coming from the RGD of
each PE in the Array. The Mode Bit Line can transmit one of the eight mode
bits of each RGD in the array up to an ACAR in the Control Unit. If this
bit is the bit which indicates whether or not a PE is on or off, we can
transmit a "mode pattern" to an ACAR. This mode pattern reflects the
status or on-offness of each PE in the array; then there are instructions
which are executed completely within the Control Unit that can test this
mode pattern and branch on a zero or non-zero condition. In this way
branching in the instruction stream can occur based on the mode pattern of
the entire 64 PE array.
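The mode pattern and the branch that tests it can be sketched as below. The
bit ordering (PE i mapped to bit i of the word) and the function names are
assumptions made for illustration; the text does not specify them.

```python
NUM_PES = 64

def mode_pattern(enabled_bits):
    """Pack one mode bit per PE into a 64-bit word, as it might sit in an ACAR.

    Assumption of this sketch: PE i contributes bit i of the word.
    """
    word = 0
    for pe_index, bit in enumerate(enabled_bits):
        if bit:
            word |= 1 << pe_index
    return word

def branch_target(acar, if_zero, if_nonzero):
    """Model of a CU-only instruction that branches on zero or non-zero."""
    return if_zero if acar == 0 else if_nonzero

# Illustrative use: keep looping while any PE is still enabled.
some_on = mode_pattern([i >= 60 for i in range(NUM_PES)])
all_off = mode_pattern([False] * NUM_PES)
print(branch_target(some_on, "exit", "loop"))  # loop
print(branch_target(all_off, "exit", "loop"))  # exit
```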
2. ILLIAC IV Input/Output (I/O) System
The ILLIAC IV Array is an extremely powerful information
processor, but it has of itself no I/O capability. The I/O capability along
with the supervisory system (including compilers and utilities) resides
within the ILLIAC IV I/O System. The ILLIAC IV I/O System (see Figure 5)
consists of the I/O Subsystem, a Disk File System (DFS) and a B6500 Control
Computer (which in turn supervises a large Laser Memory and the ARPA Network
Link). The total ILLIAC IV System consisting of the ILLIAC IV I/O System
and the ILLIAC IV Array is shown in Figure 6. All system configurations
shown are transitory, and more than likely will have changed several times
in the next year or so.
Figure 5. ILLIAC IV I/O System
Figure 6. ILLIAC IV System
a. I/O Subsystem
The I/O Subsystem consists of the Control Descriptor Controller
(CDC), the Buffer Input/Output Memory (BIOM) and the Input/Output Switch
(IOS).
i. Control Descriptor Controller (CDC)
The CDC monitors a section of the CU waiting for an I/O request
to appear. The CDC can then interrupt the B6500 Control Computer which can,
in turn, try to honor the request and place a response code back in that
section of the CU via the CDC. This response code indicates the status of
the I/O request to the program in the ILLIAC IV Array.
The CDC causes the B6500 to initiate the loading of the PE Memory
Array with programs and data from the ILLIAC IV Disk (also called the Disk
File System or DFS). After PE Memory has been loaded, the CDC can then
pass control to the CU to begin execution of the ILLIAC IV Program.
ii. Buffer Input/Output Memory (BIOM)
The B6500 Control Computer can transfer information from its
memory through its CPU at the rate of 80 x 10^6 bits/second. The ILLIAC IV
Disk (DFS) accepts information at the rate of 500 x 10^6 bits/second. This
factor of over six in information transfer rates between the two systems
necessitates the placing of a rate-smoothing buffer between them. The BIOM
is that buffer. A buffer is also necessary for the conversion of 48-bit
B6500 words to 64-bit ILLIAC IV words which can come out of the BIOM two
at a time via the 128 bit wide path to the Disk File System. The BIOM is
actually four PE memories providing 8192 words of 64-bit storage.
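The BIOM figures can be cross-checked with a short calculation. The rates
used are 80 million and 500 million bits per second, consistent with the
"factor of over six" quoted above; the variable names are illustrative.

```python
B6500_RATE_BPS = 80e6    # B6500 memory-to-CPU transfer rate, bits/second
DISK_RATE_BPS = 500e6    # rate at which the ILLIAC IV Disk accepts data

rate_ratio = DISK_RATE_BPS / B6500_RATE_BPS
print(rate_ratio)  # 6.25, the "factor of over six"

PEM_WORDS = 2048
WORD_BITS = 64
biom_words = 4 * PEM_WORDS             # four PE memories
words_per_transfer = 128 // WORD_BITS  # 128-bit path to the Disk File System
print(biom_words, words_per_transfer)  # 8192 2
```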
iii. Input/Output Switch (IOS)
The IOS performs two functions. As its name implies, it is a
switch and is responsible for switching information from either the Disk
File System or from a port which can accept input from a real time device.
All bulk data transfers to and from the PE Memory Array are via IOS. As a
switch it must ensure that only one input is sending to the Array at a
given time. In addition, the IOS acts as a buffer between the Disk File
System and the Array, since each channel from the ILLIAC IV Disk to the IOS
is 256 bits wide and the bus from the IOS to the PE Memory Array is 1024
bits wide.
b. Disk File System (DFS)
The Disk File System (DFS) consists of two Storage Units, two
Electronics Units and two Disk File Controllers. The DFS is also called
the ILLIAC IV Disk or simply, the Disk. The Disk is of 10^9-bit capacity,
having 128 heads, with one head per track. The DFS has two channels, each
of which can transmit or receive data at a rate of 0.5 x 10^9 bits/second
over a path 256 bits wide; however, if both channels are sending or
receiving simultaneously the transfer rate is 10^9 bits/second.
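As a rough back-of-the-envelope check, the Disk rates can be combined with
the PE Memory size given earlier (64 memories of 2048 words of 64 bits). The
load time computed below is an idealized estimate that ignores disk latency
and control overhead; it is not a figure from the text.

```python
CHANNEL_RATE_BPS = 0.5e9   # one DFS channel, bits/second
NUM_CHANNELS = 2
combined_rate_bps = NUM_CHANNELS * CHANNEL_RATE_BPS   # 1e9 bits/second

# Each disk channel is 256 bits wide; the IOS-to-array bus is 1024 bits wide.
channel_transfers_per_bus_word = 1024 // 256

# Whole PE Memory Array: 64 PEMs x 2048 words x 64 bits.
pem_array_bits = 64 * 2048 * 64
ideal_load_seconds = pem_array_bits / combined_rate_bps

print(channel_transfers_per_bus_word)         # 4
print(pem_array_bits)                         # 8388608
print(f"{ideal_load_seconds * 1e3:.1f} ms")   # 8.4 ms
```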
c. B6500 Control Computer
The B6500 Control Computer consists of a Central Processing Unit
(CPU), Memory, a Multiplexor and a set of Peripheral Devices (Card Reader,
Card Punch, Line Printer, 4 Magnetic Tape Units, 2 Disk Files and Console
Printer and Keyboard). It is the function of the B6500 to manage all
programmers' requests for system resources. This means that the Operating
System will reside on the B6500. All compiling and assembling of programs
is also performed on the B6500. Utilities, such as Card-to-Disk,
Card-to-Tape, etc. are also executed on the B6500. From a total System
standpoint, the ILLIAC IV Array can be considered as a special-purpose
peripheral device of the B6500 capable of solving certain classes of
problems with
extremely high speed.
i. Laser Memory
The B6500 supervises a 10^12-bit read-only Laser Memory developed
by the Precision Instrument Company. The beam from an argon laser records
binary data by burning microscopic holes in a thin film of metal coated on
a strip of polyester sheet, which is carried by a rotating drum. Each data
strip can store some 2.9 billion bits. A "strip file" provides storage for
400 data strips containing more than a trillion bits. The time to locate
data stored on any one of the 400 strips is five seconds. Within the same
strip data can be located in 200 milliseconds. The read and record rate
is four million bits a second on each of two channels. A projected use of
this memory will allow the user to "dump" large quantities of programs and
data into this storage medium for leisurely review at a later time; hard
copy output can optionally be made from files within the Laser Memory.
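The Laser Memory capacity follows directly from the strip figures, and the
quoted channel rate gives a feel for how long a full strip takes to read.
The read time below is a derived estimate for one channel, not a number
stated in the text.

```python
STRIPS = 400
BITS_PER_STRIP = 2.9e9
CHANNEL_RATE_BPS = 4e6   # read/record rate on each of two channels

total_bits = STRIPS * BITS_PER_STRIP
seconds_per_strip = BITS_PER_STRIP / CHANNEL_RATE_BPS  # one channel

print(f"{total_bits:.2e}")   # 1.16e+12, more than a trillion bits
print(seconds_per_strip)     # 725.0 seconds, about 12 minutes
```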
ii. ARPA Network Link
The ARPA Network is a group of computer installations separated
geographically but connected by high speed (50,000 bits/second) data
communication lines. On these lines, the members of the "Net" can transmit
information, usually in the form of programs, data, or messages. The link
performs an information switching function and is handled by an IMP
(Interface Message Processor) and a Network Control Program stored within each
member installation's "host" computer. Each IMP operates in a "store and
forward mode", that is, information in one IMP is not lost until the
receiving IMP has signalled complete reception and retention of the message.
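The store-and-forward rule can be sketched with a toy model in which the
sender keeps its copy until the receiver confirms retention. The class, its
methods and the `link_ok` flag are hypothetical and are not the actual IMP
protocol.

```python
class IMP:
    """Toy Interface Message Processor that retains messages until acknowledged."""

    def __init__(self, name):
        self.name = name
        self.held = {}   # message id -> payload still retained here

    def receive(self, msg_id, payload):
        """Store the message, then signal complete reception and retention."""
        self.held[msg_id] = payload
        return True

    def send(self, msg_id, payload, receiver, link_ok=True):
        """Forward a message; discard the local copy only after an acknowledgement."""
        self.held[msg_id] = payload
        if link_ok and receiver.receive(msg_id, payload):
            del self.held[msg_id]
            return True
        return False     # no acknowledgement: the copy stays in this IMP

a, b = IMP("A"), IMP("B")
first_try = a.send(1, "data", b, link_ok=False)   # link fails, A keeps the copy
kept_after_failure = 1 in a.held
second_try = a.send(1, "data", b)                 # retry succeeds
print(first_try, kept_after_failure, second_try)  # False True True
```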
The IMP interfaces with each member's computer system and converts
information into standard format for transmission to the rest of the Net.
Conversely, the IMP accepts information in a standard format and converts
it to the particular data format of the member installation. In this way,
the ARPA Network is a form of a computer utility with each contributing
member offering its unique resources to all of the other members.
1. G. H. Barnes, et al., "The ILLIAC IV Computer", IEEE Transactions on
Computers, Vol. C-17, No. 8, August 1968, pp. 746-757.
2. S. A. Denenberg, "An Introductory Description of the ILLIAC IV System",
ILLIAC IV Document No. 225, Department of Computer Science File
No. 850, Urbana, Ill.: Center for Advanced Computation, University
of Illinois, July 15, 1971.
3. D. L. Slotnick, "The Fastest Computer", Scientific American, Vol. 224,
No. 2, February 1971, pp. 76-87.
ARPA Order No. 1899
Sponsoring Military Activity: U.S. Army Research Office-Durham,
Duke Station, Durham, North Carolina